Comparison of Genetic Programming Models and the M5 Model Tree in Drought Forecasting

Authors

Abstract

Drought, as an adverse natural event, directly affects societies through changes in access to water resources. Numerous indices exist for rating drought severity, among which the Effective Drought Index (EDI) and the Standardized Precipitation Index (SPI) are widely used. In this study, precipitation data from two basins located in Hamedan and Lorestan Provinces were used to calculate the SPI and the EDI, and two models, genetic programming and the M5 model tree, were used to forecast drought. The investigations showed that these models were well able to forecast drought and are suitably accurate for prediction problems. Another advantage of these models is that they provide simple, explicit formulas for predicting the phenomenon of interest. In the best case, the coefficient of determination for the EDI was ??/? with the M5 model and ??/? with the genetic programming model. Likewise, the best-case coefficient of determination for the SPI was ??/? with the M5 model and ??/? with the genetic programming model. This indicates that the M5 model tree is more accurate than the genetic programming model and, owing to its simplicity and interpretability, has a relative advantage over it.

Keywords


Article Title [English]

Comparison of the Genetic Programming Model and the M5 Model Tree in Drought Forecasting

Authors [English]

  • Mehdi Komasi
  • Soudeh Ghobadi Khosro
  • Mohammadreza Hashemi
  • Mohammad Reza Goodarzi
Abstract [English]

Drought is a temporary and recurring meteorological event that originates from a lack of precipitation over an extended period. The success of drought preparedness and mitigation depends on timely information about drought onset and on forecasting. This information may be obtained through continuous drought monitoring, which is normally performed using drought indices. Drought is an adverse, naturally occurring event caused by climate change that directly affects societies by changing their access to water resources. Among the numerous indices for rating drought intensity, the Effective Drought Index (EDI) and the Standardized Precipitation Index (SPI) are widely applied. The SPI is computed by fitting a probability density function to the frequency distribution of the monthly precipitation records of each station. The computation for any location is based on a long-term precipitation record (at least ?? years) accumulated over a selected time scale. This long-term precipitation time series is fitted to a gamma distribution, which in most cases is the probability distribution that best models observed precipitation data, and is then transformed through an equal-probability transformation into a normal distribution. Positive SPI values indicate wet conditions (greater than median precipitation) and negative values indicate dry conditions (lower than median precipitation). A drought event is considered to occur while the SPI is continuously negative and to end when the SPI becomes positive. Unlike most other drought indices, the EDI in its original form is calculated with a daily time step. The resulting EDI value is a standardized measure of the currently utilizable water resources that takes the continued dry period into account. If a negative DEP (the deviation of effective precipitation from its mean) continues for more than ? day, the summation period of the EDI is extended by the number of days over which it continues. This variable summation period has no upper limit.
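The SPI computation described above (a gamma fit followed by an equal-probability transformation to the standard normal distribution) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `gamma_cdf` and `spi`, the use of Thom's approximation for the gamma shape parameter, the handling of zero totals through a mixed distribution, and the sample values are all assumptions introduced here.

```python
import math
from statistics import NormalDist, fmean


def gamma_cdf(x, shape, scale):
    """Gamma CDF, evaluated with the power series of the regularized
    lower incomplete gamma function P(shape, x / scale)."""
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape          # n = 0 term of the series
    total = term
    n = 1
    while term > total * 1e-12 and n < 10_000:
        term *= z / (shape + n)
        total += term
        n += 1
    return total * math.exp(shape * math.log(z) - z - math.lgamma(shape))


def spi(precip):
    """Return one SPI value per entry of `precip`, a list of precipitation
    totals already accumulated over the chosen time scale.
    Assumes at least two distinct positive totals."""
    positive = [p for p in precip if p > 0]
    q = 1.0 - len(positive) / len(precip)      # share of zero totals
    mean = fmean(positive)
    a = math.log(mean) - fmean([math.log(p) for p in positive])
    shape = (1.0 + math.sqrt(1.0 + 4.0 * a / 3.0)) / (4.0 * a)  # Thom's approximation
    scale = mean / shape
    std_normal = NormalDist()
    values = []
    for p in precip:
        h = q + (1.0 - q) * gamma_cdf(p, shape, scale)   # mixed CDF with zeros
        h = min(max(h, 1e-6), 1.0 - 1e-6)                # keep inv_cdf finite
        values.append(std_normal.inv_cdf(h))             # equal-probability transform
    return values


# Hypothetical monthly totals in millimetres, for illustration only.
sample = [12.0, 30.5, 45.2, 8.1, 60.3, 22.4, 35.0, 15.7, 50.9, 27.3, 40.1, 18.6]
for total, value in zip(sample, spi(sample)):
    print(f"{total:6.1f} mm -> SPI {value:+.2f}")
```

In practice the fit would use the long record mentioned above and would be done separately for each calendar month; the short sample here only shows the mechanics.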
The nature of genetic programming (GP) allows the user to gain additional information on how the system performs, i.e., it gives insight into the relationship between input and output data. GP is similar to the genetic algorithm (GA), but its solution is a computer program or an equation rather than the set of numbers produced by a GA. GP is therefore more attractive than the traditional GA for problems that require the construction of explicit models. GP transforms one population of individuals into another iteratively by applying operators. In evolutionary computation, three types of operators can be distinguished: crossover, reproduction, and mutation. The M5 model tree approach is based on a principle of information theory that makes it possible to split the multi-dimensional parameter space and generate the models automatically according to an overall quality criterion. It allows the number of models created to vary. Splitting in the M5 model tree approach follows the idea of a decision tree, but instead of class labels the leaves hold linear regression functions, which can predict continuous numerical attributes. Model trees generalize the concept of regression trees, which have constant values at their leaves, and are therefore analogous to piecewise linear functions (and hence nonlinear). Although computational requirements grow with the dimensionality of the data set, model trees learn efficiently and can tackle tasks with very high dimensionality. The major advantage of model trees over regression trees is that model trees are much smaller and their regression functions do not normally involve many variables.
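The idea of a tree with linear regression functions at its leaves can be illustrated with a single split on one input variable. The sketch below is not the full M5 algorithm (it has no recursion, pruning, or smoothing) and is not the authors' code. It assumes the standard deviation reduction (SDR) criterion commonly associated with M5 for choosing the split, and the names `fit_line`, `best_split`, `fit_stump`, and `predict` are hypothetical.

```python
from statistics import fmean, pstdev


def fit_line(xs, ys):
    """Least-squares line y = a + b * x; constant model if x has no spread."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def best_split(xs, ys, min_leaf=2):
    """Threshold on x that maximizes the standard deviation reduction of y,
    or None if no admissible split reduces it."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    sd_all = pstdev(ys)
    best_threshold, best_sdr = None, 0.0
    for i in range(min_leaf, n - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        sdr = sd_all - len(left) / n * pstdev(left) - len(right) / n * pstdev(right)
        if sdr > best_sdr:
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_sdr = sdr
    return best_threshold


def fit_stump(xs, ys):
    """One-split model tree: (threshold, left line, right line)."""
    threshold = best_split(xs, ys)
    if threshold is None:
        line = fit_line(xs, ys)
        return None, line, line
    left = [(x, y) for x, y in zip(xs, ys) if x <= threshold]
    right = [(x, y) for x, y in zip(xs, ys) if x > threshold]
    return threshold, fit_line(*zip(*left)), fit_line(*zip(*right))


def predict(model, x):
    """Evaluate the linear function of the leaf that x falls into."""
    threshold, (a_left, b_left), (a_right, b_right) = model
    if threshold is None or x <= threshold:
        return a_left + b_left * x
    return a_right + b_right * x


# Hypothetical piecewise-linear data: y = 2x for x <= 4, y = 20 - x for x >= 5.
xs = list(range(10))
ys = [0, 2, 4, 6, 8, 15, 14, 13, 12, 11]
model = fit_stump(xs, ys)
print(model[0], predict(model, 3), predict(model, 7))
```

Each leaf yields an explicit equation of the form y = a + b * x, which is the kind of simple, readable output that makes model trees easy to interpret.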
This research used precipitation data from two basins in Hamedan and Lorestan Provinces to calculate the SPI and the EDI for drought monitoring. The genetic programming model and the M5 model tree were used to predict the occurrence of drought in these two basins. The models were found to predict drought well and to be highly accurate on prediction problems. Another advantage of these models is that they yield simple, explicit equations for predicting the phenomenon under study. In the best case, the coefficients of determination for the EDI were ?.?? with the M5 model tree and ?.?? with the genetic programming model. Likewise, the best-case coefficients of determination for the SPI were ?.?? with the M5 model tree and ?.?? with the genetic programming model. This suggests that the M5 model tree is more accurate than the genetic programming model and has a relative advantage because it is simpler and easier to understand.
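For reference, a coefficient of determination of the kind used in this comparison can be computed as sketched below. The abstract does not state which definition the study used; this sketch assumes the common form 1 - SS_res / SS_tot, and the function name `r_squared` and the sample values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot.
    Requires that the observed values are not all identical."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical index values, for illustration only.
observed = [-1.2, -0.4, 0.3, 0.9, 1.5]
predicted = [-1.0, -0.5, 0.2, 1.1, 1.3]
print(round(r_squared(observed, predicted), 3))
```

A value of 1 means the predictions reproduce the observed index exactly, while a value of 0 means they do no better than always predicting the observed mean.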

Keywords [English]

  • Data-driven
  • EDI Index
  • Modeling
  • SPI Index
  • Silakhor Plain