Article type: Research article
Authors
Abstract
Keywords
Article title [English]
Authors [English]
Detailed, accurate information on precipitation plays an important role in the hydrological and climate studies of a region, including the estimation of floods, drought, runoff and sediment, as well as river basin management, agriculture and irrigation scheduling. Precipitation is a highly non-linear phenomenon that varies in both time and space. Many factors influence its variation; they can be broadly divided into climatic and geographical groups. Despite technological developments, predicting precipitation over time and space is possible but remains complicated. Although many conceptual and statistical models have been proposed to predict climatic variables, tools such as artificial neural networks, decision trees and kernel-based methods are now used to model hydrological processes and water engineering problems. In the current study, the efficiency of support vector regression (SVR) and Gaussian process regression (GPR) in predicting monthly precipitation in Mashhad was investigated. The sensitivity of precipitation to other meteorological parameters was also analyzed.
In this study, we use several meteorological parameters at a monthly time scale for the Mashhad region, located in Razavi Khorasan Province, Iran. Different combinations of these parameters were used as inputs to support vector regression and Gaussian process regression, our chosen data mining methods. The support vector machine (SVM), introduced by Vapnik in 1995 and based on statistical learning theory, is a supervised learning method; support vector machines fall into two groups, support vector regression and support vector classification. In this method, complicated, non-linear structures are sometimes required to separate the data. Gaussian process regression is a useful method for defining prior distributions for flexible regression and classification models in which the regression or class probability functions are not limited to simple parametric forms. The concept of Gaussian processes is based on the normal distribution, also called the Gaussian distribution after Carl Friedrich Gauss. A Gaussian process can be regarded as an infinite-dimensional generalization of the multivariate normal distribution. Gaussian processes are important and widely used in statistical modelling because they inherit the properties of the normal distribution (Neal, 1997). Both support vector regression and Gaussian process regression rely on the concept of the kernel function. With a non-linear transformation from the input space to a feature space of higher, possibly infinite, dimension, a problem can be made linearly separable. The most important kernel functions are the linear, polynomial, normalized polynomial, radial basis function and Pearson kernels. These kernel functions were used in this study.
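As a rough illustration of the kernel idea described above, the following sketch implements the Pearson VII universal kernel (PUK) in its commonly cited form, K(x, y) = 1 / [1 + (2‖x − y‖·sqrt(2^(1/ω) − 1) / σ)²]^ω. The abstract does not state the kernel formula or the parameter values used in the study, so the function names, the default `sigma` and `omega`, and the sample input rows are all assumptions made for illustration only.

```python
import math


def pearson_vii_kernel(x, y, sigma=1.0, omega=1.0):
    """Pearson VII universal kernel between two equal-length vectors.

    sigma controls the width of the peak and omega its tailing;
    both defaults are placeholders, not values from the study.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    # Euclidean distance between the two input vectors.
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    scaled = 2.0 * dist * math.sqrt(2.0 ** (1.0 / omega) - 1.0) / sigma
    return 1.0 / (1.0 + scaled ** 2) ** omega


def gram_matrix(rows, sigma=1.0, omega=1.0):
    """Kernel (Gram) matrix for a list of input vectors."""
    return [[pearson_vii_kernel(r, s, sigma, omega) for s in rows] for r in rows]


# Hypothetical, already-scaled inputs: (monthly index, mean relative humidity).
sample_rows = [[0.0, 0.0], [0.5, 0.0], [1.0, 1.0]]
K = gram_matrix(sample_rows)
```

The kernel equals 1 for identical inputs and decays with distance; a matrix such as `K` is what an SVR or GPR solver would work with in place of explicit feature-space coordinates.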
After investigating different kernel functions, we observed that the best results were obtained with the Pearson kernel function in both support vector regression and Gaussian process regression. Accuracy was highest and errors were lowest when the inputs were the monthly index, the mean monthly relative humidity, the mean maximum monthly relative humidity, the difference between the mean minimum and mean maximum monthly temperatures, and the previous month's precipitation. This indicates that these parameters have the greatest impact on precipitation. The results also showed that modern data mining methods such as support vector regression and Gaussian process regression are efficient in predicting monthly precipitation. Gaussian process regression achieved a correlation coefficient of 0.870, a Nash-Sutcliffe coefficient of 0.736, a root-mean-square error of 12.37 mm and a mean absolute error of 7.85 mm. It is therefore proposed as the best method for predicting monthly precipitation in similar cases. Gaussian process regression was also more accurate in predicting maximum monthly precipitation, which is important and applicable in flood prediction. The sensitivity analysis of the models to their input variables indicated that monthly precipitation was mostly influenced by the previous month's precipitation, the monthly index and the minimum monthly temperature. Both support vector regression and Gaussian process regression performed well in predicting monthly rainfall. The two methods performed almost equally, but in this case Gaussian process regression provided more accurate predictions, especially for maximum precipitation. This method can therefore be considered an efficient and practical tool for rainfall prediction.
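The four evaluation statistics reported above can be computed from paired observed and predicted values as in the following minimal sketch. It uses the standard textbook definitions of the correlation coefficient, Nash-Sutcliffe coefficient, root-mean-square error and mean absolute error; the function name `evaluate` and the sample numbers are hypothetical and are not data from the study.

```python
import math


def evaluate(observed, predicted):
    """Return R, NSE, RMSE and MAE for paired observed/predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]
    sse = sum(e * e for e in errors)  # sum of squared errors
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    return {
        "r": cov / math.sqrt(ss_o * ss_p),  # Pearson correlation coefficient
        "nse": 1.0 - sse / ss_o,            # Nash-Sutcliffe coefficient
        "rmse": math.sqrt(sse / n),         # same unit as the data (mm)
        "mae": sum(abs(e) for e in errors) / n,
    }


# Hypothetical monthly precipitation values in mm, for illustration only.
obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 33.0, 37.0]
scores = evaluate(obs, pred)
```

A Nash-Sutcliffe coefficient of 1 corresponds to a perfect fit, while a value of 0 means the model is no better than predicting the observed mean.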
Keywords [English]