Iranian Water Research Journal

Modeling Streamflow During the Snowmelt-Dominated Period Using LSTM and ERA5-LAND Reanalysis Data (A Case Study: The Snow-Covered Bazoft Basin)

Document Type: Original Article

Authors
1 Ph.D. Student, Faculty of Water and Environmental Engineering, Shahid Chamran University of Ahvaz, Ahvaz, Iran
2 Associate Professor, Faculty of Water and Environmental Engineering, Shahid Chamran University of Ahvaz, Ahvaz, Iran
3 Assistant Professor, Faculty of Water and Environmental Engineering, Shahid Chamran University of Ahvaz, Ahvaz, Iran
10.22034/iwrj.2026.14932.2634
Abstract
Introduction

Mountain snow reserves are a crucial component of the hydrological cycle in high-altitude basins, supplying a substantial portion of river flow during the warm season. Forecasting discharge during the snowmelt-dominated period is therefore essential for water resource management and flood mitigation, a challenge intensified by climate change impacts on melt timing. While physically based and conceptual temperature-index models are common, they are limited by data scarcity or by an inability to capture complex, non-linear melt-runoff processes. Consequently, this study employs a data-driven approach using a Long Short-Term Memory (LSTM) deep learning network to model the non-linear temporal dynamics of snowmelt. Specifically, an LSTM model was developed and evaluated to forecast daily discharge during the snowmelt period in the data-scarce, snow-fed Bazoft Basin, Iran. To address the lack of in-situ data, meteorological time series from the ERA5-LAND reanalysis product were used as model inputs. The results demonstrate the LSTM model's effectiveness in learning the long-term dependencies between snowmelt processes and streamflow, offering a robust framework for seasonal runoff prediction in snow-dominated regions to support sustainable water management.

Materials and Methods

This study was conducted in the snow-covered Bazoft Basin, located in the northern Karun River catchment, Iran (31.6°–32.65° N, 49.56°–50.46° E). The basin experiences snow accumulation from December to March, with the snowmelt-dominated period occurring from mid-February to April, during which snowmelt is the primary contributor to streamflow. Daily streamflow data from the Landi hydrometric station were used as the target variable.

Due to the scarcity of ground-based meteorological stations in this mountainous region, this study utilized the ERA5-LAND reanalysis dataset from the European Centre for Medium-Range Weather Forecasts (ECMWF). ERA5-LAND provides improved spatial resolution (9 km) compared to ERA5 (31 km) and has been extensively validated in previous studies over Iran, demonstrating high correlation with ground observations, particularly for temperature (R² > 0.95). The following daily variables were extracted: air temperature (T), precipitation (P), and snow water equivalent (SWE). The study period covered the water years 2014–2023.

The Gamma Test (GT), a non-parametric method for identifying the optimal input combination, was employed to select the most relevant input variables. Four different input combinations were evaluated: M1 (Q and SWE), M2 (Q, T, and SWE), M3 (Q, P, and SWE), and M4 (Q, P, T, and SWE). The combination with the minimum Gamma value and V-ratio was selected as the optimal input set.
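As a rough illustration of how the Gamma value and V-ratio can be obtained, the sketch below follows the standard near-neighbour formulation of the Gamma Test: the intercept of a linear fit of half the mean squared output differences against the mean squared input distances, taken over the first `p` nearest neighbours. The function name `gamma_test`, the default `p=10`, and the brute-force distance computation are assumptions made for illustration; the study does not report these implementation details.

```python
import numpy as np


def gamma_test(X, y, p=10):
    """Return (Gamma, slope, V-ratio) for inputs X (m, d) and outputs y (m,).

    Illustrative sketch only: p and the brute-force neighbour search are
    assumptions, not settings reported in the study.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    m = len(y)

    # Pairwise squared Euclidean distances between input vectors.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour

    # Indices of the p nearest neighbours of every point, shape (m, p).
    order = np.argsort(d2, axis=1)[:, :p]
    rows = np.arange(m)[:, None]

    # delta(k): mean squared input distance to the k-th neighbour.
    delta = d2[rows, order].mean(axis=0)
    # gamma(k): half the mean squared output difference to the k-th neighbour.
    gamma = 0.5 * ((y[order] - y[:, None]) ** 2).mean(axis=0)

    # Linear fit gamma = slope * delta + intercept; the intercept is Gamma.
    slope, intercept = np.polyfit(delta, gamma, 1)
    v_ratio = intercept / y.var()
    return intercept, slope, v_ratio
```

Calling such a function once per candidate input matrix (M1 to M4) and keeping the combination with the smallest Gamma value and V-ratio would mirror the selection rule described above.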

A Long Short-Term Memory (LSTM) neural network was developed to model the non-linear, long-term temporal dependencies inherent in the snowmelt-runoff process. The LSTM architecture includes forget, input, and output gates that regulate information flow, enabling the network to retain relevant information over extended time steps. The model was implemented using the Keras framework. The dataset was randomly divided into training (80%) and testing (20%) subsets. Various hyperparameters were optimized, including the number of hidden layers (1–3), number of hidden units (10–40), and seven optimization algorithms (Adam, Adamax, SGD, RMSprop, Nadam, Adagrad, and Adadelta). Model performance was evaluated using the Nash-Sutcliffe Efficiency (NSE), Root Mean Square Error (RMSE), and Coefficient of Determination (R²).
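For reference, the three evaluation metrics can be written in a few lines of NumPy. This is a minimal sketch with hypothetical function names; R² is taken here as the squared Pearson correlation between observed and simulated flow, which is a common convention in hydrology but an assumption about the definition used in the study.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus error variance over observed variance."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def rmse(obs, sim):
    """Root Mean Square Error, in the units of the discharge series."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((obs - sim) ** 2)))


def r_squared(obs, sim):
    """Coefficient of determination as squared Pearson correlation (assumed)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.corrcoef(obs, sim)[0, 1] ** 2)
```

Under these definitions NSE equals 1 for a perfect simulation and 0 for a simulation no better than the observed mean, while R² is insensitive to a constant bias, so the two can diverge.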

Results and Discussion

The Gamma Test results identified the optimal input combination for LSTM modeling. Among the four evaluated combinations, M2 (discharge, temperature, and snow water equivalent) exhibited the minimum Gamma value (0.00151) and V-ratio (0.00412), indicating the smoothest input-output relationship with the lowest irreducible noise. Consequently, M2 was selected as the most appropriate input set for subsequent LSTM modeling.

Evaluation of LSTM hyperparameters revealed that a single hidden layer architecture outperformed deeper configurations (2–3 layers), achieving the lowest RMSE (0.220) and MAE (0.125). Regarding hidden units, 20 neurons yielded optimal performance. Among the seven optimization algorithms compared, Adamax gave the best results, with RMSE and MAE values of 0.37 and 0.18, respectively, and was therefore selected as the optimizer for the final model.
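The selected configuration (one LSTM layer, 20 hidden units, Adamax) could be set up in Keras roughly as follows. This is a sketch under stated assumptions rather than the authors' code: the helper names, the `lookback` window length, the mean squared error loss, and the fixed random seed are all hypothetical, since the study does not report them.

```python
import numpy as np


def make_windows(features, target, lookback):
    """Stack `lookback` consecutive days of inputs to predict the next day's flow.

    features: array (n_days, n_features), e.g. columns Q, T, SWE (M2).
    target:   array (n_days,), daily discharge.
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(target) - lookback
    X = np.stack([features[i:i + lookback] for i in range(n)])
    y = target[lookback:]
    return X, y


def random_split(X, y, test_fraction=0.2, seed=0):
    """Random 80/20 split of the windows (the seed is an assumption)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    n_test = int(round(test_fraction * len(y)))
    test, train = idx[:n_test], idx[n_test:]
    return X[train], y[train], X[test], y[test]


def build_lstm(lookback, n_features, units=20):
    """Single-layer LSTM with a linear output, compiled with Adamax."""
    # Imported lazily so the helpers above can be used without TensorFlow.
    from tensorflow import keras

    model = keras.Sequential([
        keras.Input(shape=(lookback, n_features)),
        keras.layers.LSTM(units),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer=keras.optimizers.Adamax(), loss="mse")
    return model
```

With windows built from the M2 inputs, training would then amount to `build_lstm(lookback, 3).fit(X_train, y_train, ...)`, where the number of epochs and the batch size would be further tuning choices not specified here.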

The LSTM model with M2 inputs (Q, T, SWE) demonstrated exceptional performance in simulating daily streamflow during the snowmelt-dominated period. During the training phase, the model achieved NSE=0.994, R²=0.991, and RMSE=0.08. In the testing phase, performance remained excellent with NSE=0.994, R²=0.991, and RMSE=0.174, indicating robust generalization capability. Visual comparison of observed and simulated hydrographs for the 2021-2022 and 2022-2023 water years showed that the LSTM accurately captured both low-flow conditions and peak discharges during snowmelt events. Scatter plots further confirmed this accuracy, with points closely clustered around the 1:1 line and R² values of 0.997, demonstrating that the model effectively reproduced the variance in observed streamflow across the entire flow regime.

The superior performance of the LSTM model can be attributed to its inherent capability to capture long-term temporal dependencies through memory cells and gating mechanisms, which is particularly advantageous for snowmelt-runoff processes characterized by lagged responses and memory effects. The inclusion of SWE as an input variable proved crucial, as it directly represents the accumulated snowpack available for melt, while temperature controls the melt energy. Notably, the M2 combination (Q, T, SWE) outperformed combinations including precipitation (M3 and M4), suggesting that during the snowmelt-dominated period, the hydrological signal is primarily governed by snowmelt dynamics rather than concurrent rainfall.

Comparison with previous studies demonstrates the competitiveness of the proposed approach. The obtained NSE (0.994) substantially exceeds values reported for traditional temperature-index models (NSE~0.80, Pradhananga et al., 2014) and is higher than ANN-based snowmelt runoff predictions (NSE~0.93, Uysal et al., 2016). The results are comparable with other LSTM applications in hydrological modeling (NSE~0.99, Le et al., 2019; Adib et al., 2024). However, this study specifically contributes to the limited literature on LSTM application for snowmelt-dominated period streamflow modeling in data-scarce mountainous regions.

The successful integration of ERA5-LAND reanalysis data addresses the critical challenge of data scarcity in mountainous basins. The high accuracy achieved demonstrates that reanalysis products can effectively substitute for ground-based meteorological observations when the latter are unavailable, provided they are appropriately validated for the study region.

In conclusion, the LSTM model, with the optimal input combination of discharge, temperature, and SWE from ERA5-LAND, provides a robust and accurate framework for simulating snowmelt-dominated streamflow in the Bazoft Basin, offering valuable implications for water resources management in similar snow-fed mountainous catchments.

Conclusion

This study demonstrated that the LSTM model, forced with ERA5-LAND reanalysis data (temperature and snow water equivalent) together with antecedent discharge, accurately simulates daily streamflow during the snowmelt-dominated period in the data-scarce Bazoft Basin (NSE=0.994, RMSE=0.174). The optimal performance was achieved with a single hidden layer, 20 hidden units, and the Adamax optimizer. The superior performance of the temperature-SWE-discharge combination over precipitation-inclusive inputs confirms the dominance of snowmelt processes during the study period. This framework offers an effective solution for seasonal runoff prediction in snow-dominated mountainous regions lacking dense ground-based observations, supporting informed water resource management and planning.

Articles in Press, Accepted Manuscript
Available Online from 14 March 2026

  • Receive Date 16 February 2026
  • Revise Date 14 March 2026
  • Accept Date 14 March 2026
  • Publish Date 14 March 2026