# Comparison of Artificial Neural Networks (ANN) and STATISTICA in daily flow forecasting

Introduction

The foundation of Support Vector Machines (SVM) was laid by Vapnik, a Russian mathematician, in the early 1960s, based on the Structural Risk Minimisation principle from statistical learning theory. SVM gained popularity due to its many attractive features and promising empirical performance. Many researchers have shown SVM to be effective for classification in fields such as electrical, civil and mechanical engineering, medicine and finance. More recently, it has been extended to regression problems. In river flow modelling, Liong & Sivapragasam compared SVM with Artificial Neural Networks (ANN) and concluded that SVM's inherent properties give it an edge in overcoming some of the major problems in the application of ANN. Because methods such as ANN and SVM are complex, simpler and more efficient methods can be used in some initial studies. In this study, the STATISTICA software was also used, for the first time, to predict daily discharge. This paper compares two expert models in daily flow forecasting: the support vector machine (SVM) and the STATISTICA model are used to forecast daily river flow in the north of Iran, and the results of these models are compared with observed daily values. The Ghara-soo River is the case study, and Ghara-soo River data are used for this article.

Case study area and data:

The Ghara-soo River basin is in Golestan province, in the northeast of Iran. The basin is located between 54°00' and 54°45' east longitude and between 36°36' and 36°59' north latitude. The basin area is 1678.1 km². The maximum elevation of the basin is 3200 m above sea level, and the length of the main river is 108.005 km. Fig. 1 shows the natural plan and location of the Ghara-soo River.

[FIGURE 1 OMITTED]

More than 4 rain measurement stations exist on this river, but because records are not available for all of them, 4 stations are used in this research: Gharasoo station as the outlet discharge of the basin, and Ziarat, Shastkalateh and Kordkooy as inputs to the basin at three different locations (Table 1).

Preprocessing data:

Preprocessing of data includes selection of effective variables, selection of training and test patterns, and normalization of the patterns. The goal of normalization is that all values in a pattern lie in one range. Normalization maps all values to a specified interval such as [0, 1] or [-1, 1].
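The normalization step can be illustrated with a minimal sketch. The article does not state which formula was used, so linear min-max scaling is an assumption here, and the function name `min_max_normalize` and the sample discharges are hypothetical.

```python
def min_max_normalize(values, low=0.0, high=1.0):
    """Linearly rescale values so the minimum maps to low and the maximum to high."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant series has no spread; map every value to the lower bound.
        return [low for _ in values]
    scale = (high - low) / (v_max - v_min)
    return [low + (v - v_min) * scale for v in values]


# Hypothetical daily discharges, for illustration only (not data from the study).
discharge = [2.0, 4.0, 10.0, 6.0]
unit_range = min_max_normalize(discharge)              # values in [0, 1]
symmetric_range = min_max_normalize(discharge, -1, 1)  # values in [-1, 1]
```

The interval [0, 1] matches the output range of a sigmoid function and [-1, 1] matches that of a hyperbolic tangent, which is one common reason for choosing between the two.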

[FIGURE 2 OMITTED]

After normalizing all patterns, the study period was selected as 1989 to 2007 (18 years). For this period there are 6550 daily patterns for every station; 75% of these data are used for support vector machine training and 25% for testing. Fig. 2 shows the daily flow hydrograph of the Gharasoo River for the training period, and Fig. 3 shows the daily flow hydrograph of the Gharasoo River for the test period.
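A minimal sketch of the 75%/25% division follows. It assumes the split is chronological (the first part of the record for training, the remainder for testing), which the separate training-period and test-period hydrographs suggest but the article does not state explicitly. The function name `chronological_split` is hypothetical.

```python
def chronological_split(patterns, train_fraction=0.75):
    """Split an ordered sequence into training and test parts without shuffling."""
    n_train = int(len(patterns) * train_fraction)
    return patterns[:n_train], patterns[n_train:]


# Placeholder indices standing in for the 6550 daily patterns of one station.
patterns = list(range(6550))
train, test = chronological_split(patterns)
print(len(train), len(test))  # 4912 1638
```

Keeping the order intact means the test patterns all come later in time than the training patterns, so the test error reflects forecasting on unseen days.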

[FIGURE 3 OMITTED]

Design and development of simulation models using artificial neural networks:

To determine the neural network architecture, the default settings of the Qnet2000 software were taken into account, and a process of trial and error was also used to determine the number of hidden layer neurons and the threshold function.
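The trial-and-error choice of hidden layer size, using the lowest root mean square error as the criterion applied in this study, can be sketched as follows. This is only an outline: `select_hidden_neurons` and `evaluate_rmse` are hypothetical names, and the RMSE values are invented placeholders standing in for "train a network with n hidden neurons in Qnet2000 and measure its error".

```python
def select_hidden_neurons(candidates, evaluate_rmse):
    """Return the candidate hidden-layer size with the lowest RMSE, plus all scores."""
    scores = {n: evaluate_rmse(n) for n in candidates}
    best = min(scores, key=scores.get)
    return best, scores


# Invented RMSE values for illustration; they are not results from the study.
illustrative_rmse = {2: 0.031, 3: 0.028, 4: 0.026, 5: 0.024, 6: 0.025}
best_n, all_scores = select_hidden_neurons(illustrative_rmse.keys(),
                                           illustrative_rmse.get)
```

Plotting `all_scores` against the number of neurons gives the kind of RMSE curve from which the lowest point is read off.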

Moreover, to avoid overfitting, the number of training cycles was determined by trial and error. Overfitting can have two causes: too many training cycles, and an overly large neural network structure. In both cases, trial and error was used to avoid it. In this study, the sigmoid and hyperbolic tangent functions, which have shown acceptable performance in similar applications, were used as transfer functions in the MLP, and their results were compared. To determine the best number of hidden layer neurons, the root mean square error (RMSE) of the network output was computed for each candidate number of neurons and plotted; the number of neurons with the lowest RMSE was selected for the hidden layer. The number of hidden layers was determined in a similar way. The learning rate and momentum coefficient were also found by trial and error. In supervised learning methods, the number and type of model input parameters are important. Since there is no fixed and uniform rule for the structure of the neural network input, results presented in published articles were used as a guide. Accordingly, the following five input patterns were studied:

1) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

2) $Q(t) = f\{Q_n(t),\ Q_n(t-1),\ Q_p(t),\ Q(t-1),\ Q(t-1),\ Q(t-2)\}$

3) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

4) $Q(t) = f\{Q(t-1),\ Q(t-2)\}$

5) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

In these equations:

$Q$: Daily average discharge of Gharasoo station

$Q_n$: Daily average discharge of Naharkhoran station

$Q_p$: Daily average discharge of Polejadde station

$P_n$: Daily average rainfall of Naharkhoran station

$P_{sh}$: Daily average rainfall of Shastkalateh station

$P_p$: Daily average rainfall of Polejadde station

$P_g$: Daily average rainfall of Gharasoo station
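As an illustration of how such an input pattern turns a discharge series into training examples, the sketch below builds the inputs and target for the fourth pattern, where Q(t) depends on Q(t-1) and Q(t-2). The function name `build_pattern_4` and the sample series are hypothetical.

```python
def build_pattern_4(q):
    """Build (inputs, targets) for Q(t) = f{Q(t-1), Q(t-2)} from a daily series q."""
    inputs, targets = [], []
    # The first two days have no complete lag history, so start at t = 2.
    for t in range(2, len(q)):
        inputs.append([q[t - 1], q[t - 2]])
        targets.append(q[t])
    return inputs, targets


# Hypothetical normalized discharges, for illustration only.
q = [1.0, 2.0, 3.0, 5.0, 4.0]
X, y = build_pattern_4(q)
```

Each row of `X` has two values, which corresponds to the two input neurons reported for this pattern in the results tables. The other patterns would add lagged rainfall and upstream discharge columns in the same way.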

Other patterns were also built, but their results were close to those of one of the patterns above and none of them performed better, so they are not presented here. The question is which of the five models performs best in determining the daily discharge. To answer it, each input pattern was evaluated with both the sigmoid and hyperbolic tangent functions. In the present study, the correlation coefficient R² is used to compare the performance of the models.
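The two performance measures reported in the results tables can be sketched as below. The article does not give its formulas, so two assumptions are made: RMSE is the usual root of the mean squared difference, and R² is the squared Pearson correlation between observed and predicted values (consistent with the article calling it a correlation coefficient). The function names are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


# Hypothetical values: the prediction is offset by a constant of 1.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
error = rmse(observed, predicted)
fit = r_squared(observed, predicted)
```

In this example the correlation is perfect while the RMSE is not zero, which shows why the two measures are reported together: R² captures how well the shape of the hydrograph is followed, and RMSE captures the size of the error.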

From Tables (2) and (3) it can be concluded that the model trained with the first input pattern and the sigmoid function, with 14 input parameters and a 14-5-1 structure (14 input neurons, 5 hidden neurons and 1 output neuron), gave better results and lower error than the model trained with the hyperbolic tangent threshold function, and that it was successful in predicting peak discharge.
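To make the 14-5-1 structure concrete, the sketch below runs one forward pass through a network with 14 inputs, 5 hidden neurons and 1 output, once with the sigmoid and once with the hyperbolic tangent as transfer function. The weights are random placeholders, not the trained Qnet2000 weights, and applying the same transfer function at the output neuron is an assumption. All function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Logistic sigmoid, with outputs in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def random_network(n_in, n_hidden, seed=0):
    """Create placeholder weights and biases for an n_in / n_hidden / 1 network."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b_hidden = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_out = rng.uniform(-0.5, 0.5)
    return w_hidden, b_hidden, w_out, b_out


def forward(inputs, w_hidden, b_hidden, w_out, b_out, activation):
    """Compute the single network output for one input pattern."""
    hidden = [
        activation(sum(w * x for w, x in zip(weights, inputs)) + b)
        for weights, b in zip(w_hidden, b_hidden)
    ]
    return activation(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


network = random_network(n_in=14, n_hidden=5)
x = [0.5] * 14  # one hypothetical normalized input pattern
y_sigmoid = forward(x, *network, activation=sigmoid)
y_tanh = forward(x, *network, activation=math.tanh)
```

The only difference between the two runs is the transfer function, which is the comparison the two ANN results tables make for each input pattern.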

Predicting river discharge using the STATISTICA software:

[FIGURE 4 OMITTED]

[FIGURE 5 OMITTED]

Using the data for the discharge at the basin outlet, predictions were also made with the STATISTICA software, and the results were compared with those of the proposed structure. As the table of STATISTICA results (Table 2) shows, the proposed structure performs better than the STATISTICA software.

Figures 3 and 4 show the observed discharge and the discharge predicted with the STATISTICA software. Figure 5 shows the observed and predicted discharge together; the predicted maximum discharge is lower than the observed discharge [2].

[FIGURE 6 OMITTED]

[FIGURE 3 OMITTED]

[FIGURE 4 OMITTED]

[FIGURE 5 OMITTED]

Conclusion:

1. Five input patterns were introduced and evaluated with two transfer functions, sigmoid and hyperbolic tangent, both of which provided acceptable results. With both functions, the first pattern had lower error than the other four patterns and was the most successful of the five models.

2. The sigmoid function gave better results than the hyperbolic tangent, with a higher correlation coefficient and a lower RMSE.

3. In predicting river flow, the discharge and rainfall at Gharasoo station one day and two days before play a fundamental role in the model, as the results show.

4. If both minimum error and a minimum number of parameters are considered, model 4 is the best model.

5. The learning rate and momentum coefficient had an important effect on network performance.

6. The proposed structure, an MLP network using the first input pattern with 14 input parameters and the sigmoid function (14-5-1, meaning 14 input neurons, 5 hidden neurons and 1 output neuron), performed much better than the STATISTICA software.

According to the above, using the sigmoid function as the network transfer function is reasonable, because it predicts better than both the hyperbolic tangent function and the STATISTICA software.

References

[1.] Moharrampour, M., 2008. "Flood forecasting using artificial neural networks", MS Thesis, Islamic Azad University Central Tehran.

[2.] Kisi, O., 2004. "River flow modeling using artificial neural networks", Journal of Hydrologic Engineering, 9(1): 60-63.

[3.] Dolling, O.R. and E.A. Varas, 2002. "Artificial neural networks for stream flow prediction", Journal of Hydrologic Research, 40(5): 247-254.

[4.] Sudheer, K.P., P.C. Nayak and K.S. Ramasastri, 2003. "Improving peak flow estimates in artificial neural network river flow models", Hydrological Processes, 17: 677-686.

(1) Mahdi Moharrampour, (2) Mohammad Kherad Ranjbar, (3) Nazi Abachi, (4) Mehrnush Zoghi, (1) Mohammad Reza Asadi Asad Abad

(1) Young Researchers Club, Islamic Azad University, Buinzahra Branch, Buinzahra, Iran.

(2) Same Technical and Vocational Training College, Karaj Branch, Islamic Azad University, Karaj, Iran.

(3) Department of Mathematical, Buinzahra Branch, Islamic Azad University, Buinzahra, Iran.

(4) Department of Architecture, Buinzahra Branch, Islamic Azad University, Buinzahra, Iran.

Corresponding Author

Mahdi Moharrampour, Young Researchers Club, Islamic Azad University, Buinzahra Branch, Buinzahra, Iran.

E-mail: m62.mahdi@yahoo.com

Table 1: Specification of Gharasoo basin stations.

| Province | Code | Location | River | Longitude | Latitude |
|---|---|---|---|---|---|
| Golestan | 12-050 | Gharasoo | Gharasoo | 54-03-00 | 36-50-00 |
| Golestan | 12-043 | Naharkhoran | Ziarat | 54-28-00 | 36-46-00 |
| Golestan | 12-045 | Shastkalate | Shastkalate | 54-20-00 | 36-45-00 |
| Golestan | 12-049 | Ghaz mahalle (pole jadde) | Kordkooy | 54-05-00 | 36-47-00 |

Table 2: Review of ANN performance results (sigmoid).

| Input pattern | Pattern 1 | Pattern 2 | Pattern 3 | Pattern 4 | Pattern 5 |
|---|---|---|---|---|---|
| Neurons in input layer | 14 | 6 | 10 | 2 | 8 |
| Neurons in hidden layer | 5 | 4 | 4 | 2 | 7 |
| Neurons in output layer | 1 | 1 | 1 | 1 | 1 |
| R² | 0.9314 | 0.9152 | 0.9246 | 0.9074 | 0.3049 |
| RMSE | 0.02423 | 0.02646 | 0.02492 | 0.02749 | 0.06227 |

Table 2: Results of STATISTICA function.

| Method | RMSE |
|---|---|
| STATISTICA | 0.095025 |

Table 3: Review of ANN performance results (tangent hyperbolic).

| Input pattern | Pattern 1 | Pattern 2 | Pattern 3 | Pattern 4 | Pattern 5 |
|---|---|---|---|---|---|
| Neurons in input layer | 14 | 6 | 10 | 2 | 8 |
| Neurons in hidden layer | 5 | 4 | 4 | 2 | 7 |
| Neurons in output layer | 1 | 1 | 1 | 1 | 1 |
| R² | 0.9287 | 0.9148 | 0.9247 | 0.9091 | 0.3045 |
| RMSE | 0.02466 | 0.02651 | 0.02489 | 0.02725 | 0.06233 |

Title Annotation: Original Article

Author: Moharrampour, Mahdi; Ranjbar, Mohammad Kherad; Abachi, Nazi; Zoghi, Mehrnush; Abad, Mohammad Reza As

Publication: Advances in Environmental Biology

Article Type: Report

Geographic Code: 7IRAN

Date: Feb 1, 2012

Words: 1729
