Reconstruction of dynamical forecasting model between western pacific subtropical high area index and its summer monsoon impact factors based on the improved self-memorization principle.
The western pacific subtropical high (WPSH) is one of the most important components of the East Asian Summer Monsoon (EASM) system . The intensity and position of WPSH show complex seasonal evolutions and the changes in them greatly affect the occurrence of rainy season in China, including floods, droughts, and heavy rains . For example, when WPSH reaches the northernmost position, especially in summer, it significantly influences rainfall over China and Japan .
Owing to its dominance on the East Asian climate, WPSH has become one of the leading topics of interest in atmospheric sciences [3-5]. Over the past decades, much effort has gone into uncovering the forecast of the WPSH , especially the forecast of abnormal WPSH [7, 8]. Current forecasts for the WPSH can be divided into two categories: numerical forecasts and statistical forecasts. Numerical forecasts are widely used throughout the world; examples include the numerical forecast products of the European Centre for Medium-Range Weather Forecasts Model  and the Japanese FUFE502 numerical forecast products. However, numerical forecasts require field boundaries and the complex computations and low efficiency mean that the results are very unstable. Numerical forecast products have better results for large-scale weather systems, but for mesoscale weather systems, such as the WPSH, the results are less good and the forecast time is short.
Statistical methods, in contrast, have achieved some success in forecasting the WPSH. These methods can use historical data and the computation is simple. However, there are some inherent flaws in statistical methods. Using neural networks as an example, it is difficult to objectively determine the number of hidden layer neurons and the training process tends to predict a local optimum, which will limit the forecast accuracy . Moreover, the reliability of all these methods is gradually reduced with increasing forecast time, so the forecast results and credibility become very low after two weeks . Statistical forecasting products and numerical forecasting products both have some degrees of bias. In particular, error is more obvious in WPSH anomalies and long-term forecasting . So the prediction of unusual activities of WPSH within season and the long-term trend forecast of WPSH has become difficult problems in recent years.
A statistical-dynamical model of a weather system is reconstructed from actual data and can be used to describe the physical mechanisms of a complex weather system. Concerning the questions of local convergence of errors, Zhang et al.  introduced genetic algorithms (GA), which is widely used in a lot of fields [13,14], to improve the determination of root efficiency of model parameters. On that basis, Bai et al. [15,16] carried out research on the reconstruction of a nonlinear statistical-dynamical forecast model of the WPSH and achieved good results.
However, the dynamical prediction equations derived by Zhang et al.  and Bai et al. [15,16] greatly depend on the initial value, so the long-term forecast over 15 days diverged significantly and the results were not satisfactory. For the long-term forecast, the model should be improved. Cao  proposed the self-memorization principle, transforming the dynamical equation into memorization equation in a broader sense, named a differential-integral equation, wherein the memory coefficients can also be determined by actual data. This method has been widely used in prediction problems in meteorology, hydrology, and environmental field [18-20]. Because this method avoids the problem of initial condition in differential equations, it can be introduced to improve the proposed dynamical forecast model.
The set of self-memory function is relatively simple  and is suitable for cyclical and linear systems. For nonlinear systems, especially chaotic systems, forecast results are unsatisfactory . As the atmosphere and ocean are nonlinear systems, the self-memory function is needed to be modified for nonlinear system modeling. The largest Lyapunov exponent is introduced to improve the traditional self-memorization function. Finally, the improved dynamical forecasting model of WPSH with a new self-memorization function is developed. The improved function not only takes into account the chaotic characteristics of the nonlinear system, but also absorbs the information of past observations.
In our study, we firstly define the WPSH area index (SI) as a measurement of the scope and form of the WPSH. The rest of this paper is organized as follows. Section 2 introduces data and factors. Three summer Monsoon factors which have higher correlation with SI are chosen. The dynamical model of SI and three factors is reconstructed in Section 3. In Section 4, self-memorization dynamics is introduced to improve the reconstructed model. The improved self-memorization functions for chaotic systems are investigated and discussed in Section 5. Forecast experiment of 2010 and further forecast tests of an additional nine WPSH abnormal years are described in Section 6, and results are summarized in Section 7.
2. Research Data and Factors
2.1. Data. The daily data from May to October of the past 30 years from 1982 to 2011 were obtained from the National Centers for Environmental Prediction (NECP) Climate Forecast System Reanalysis (CFSR), including horizontal wind field and geopotential height field at 850 hPa and 200 hPa; geopotential height field at 500 hPa; sea level pressure field with a resolution of 0.5[degrees] x 0.5[degrees]; and sensible heat flux, latent heat flux, and precipitation rate in the Gaussian grid.
Observed long-wave radiation (OLR) from May to October of the 30 years from 1982 to 2011 of National Oceanic and Atmospheric Administration (NOAA) satellites, with a resolution of 0.5[degrees] x 0.5[degrees]. Unit is W [m.sup.-2].
These data are primarily used to calculate the SI and summer Monsoon impact factors in Section 2.3.
2.2. The Basic Facts of WPSH Activity in the Summer of 2010. Change of WPSH within seasons in various years is very different compared with the average, especially the "abnormal" activities of the WPSH, which often lead to East Asia subtropical circulation anomalies and extreme weather events in China in some years, such as 1998, 2003, 2006, and 2010 [21, 22].
In order to better describe the changes in the WPSH, SI is used, which has been defined by the Central Meteorological Observatory  to characterize the WPSH range and intensity, that is, the average grid points of 500 hPa geopotential height greater than 588 gpm in the range [10[degrees]-90[degrees]N, 110[degrees]E-180[degrees]E]. The higher the value is, the wider the range represented is, or the greater the intensity is.
SI of summer half-year (from May to October) from 1982 to 2011 can be calculated, and changes of SI in different years are shown in Figure 1. In Figure 1, the straight line represents the average SI in the observed years. SI varied widely in different years, representing irregular activities of the WPSH; particularly, in some years, SI showed dramatic changes. For example, in 2010 SI is 32, while in 1984 SI is just 6, which indicates that WPSH activities behaved abnormally in these particular years. If we use the average data of several years, these exceptions will not be apparent. So we should choose an abnormal SI year to study.
From Figure 1, 2010 is the most prominent year of WPSH anomalies as shown by the arrow. From May to October in 2010, SI is above the mean and it is the peak of nearly 30 years. The anomaly of WPSH strength resulted in a very unusual climate in China in 2010. Extreme heat and heavy precipitation events occurred frequently. The strength and wide range of these events are rarely seen in history. In particular, the most powerful WPSH happened from June to August and is to date the strongest one in the meteorological records (Figure 2). This most powerful WPSH directly resulted in rare flood disasters in the eastern part of south China, the Yangtze River, and northeast and northwest of China. Therefore, we selected the summer of 2010 as a typical case to analyze the relevance between the enhanced WPSH and the members of the Monsoon system.
2.3. Selection of Three Factors. According to previous studies [21, 22], there are many members of the summer Monsoon system, 21 of which are closely related to the WPSH. If all are used for modeling, the equation would be too complex. Therefore, the correlation analysis method is used to analyze the correlation between these factors and SI. (Specific definition of each factor can be seen in Xue et al.  and Yu et al. . Values of these factors can be calculated from CFSR data as in Section 2 and are not described in detail here.)
Based on the above correlation analysis, three factors with the best correlation with the SI are filtered out for further study, which are
(1) Mascarene cold high strength index (MH): the average grid points of sea level pressure within [40-60[degrees]E, 20-10[degrees]S] region;
(2) Tibetan high (eastern type) activity index (TH): the average grid points of 200 hPa geopotential height within [95-105[degrees]E, 25-30[degrees]N] range;
(3) Monsoon circulation index at Bay of Bengal (J1V): the average grid point J1V = V850 - V200 within [80100[degrees]E, 0-20[degrees]N] region.
The correlation coefficients between the above three factors and SI are 0.77, -0.80, and 0.86, all of which can reach above 0.75.
The Mascarene high in the southern hemisphere enhances the WPSH early in the year, which is a positive correlation and a very close relationship, consistent with previous research . The close relationships among TH, J1V, and SI are basically consistent with previous research [22, 25].
3. Reconstruction of the Dynamical Model Based on GA
Takens  set forth and tested the basic idea of reconstructing the dynamical system from the time series of observed data in his phase space reconstruction theory. Hence, nonlinear dynamics research has entered a new stage, where the evolution information of the dynamical system can be extracted from the time series, such as the calculation of fractal dimension and Lyapunov exponent . However, these studies help understand what a chaotic system is and what the periodic system is. Dynamical models for practical problems have not yet been developed fully. Huang and Yi  proposed a method of nonlinear dynamic model reconstruction from the actual data and tested it with the Lorenz system, the result of which was satisfactory. Therefore, we introduce this idea and improve it for reconstructing a dynamical model of the SI and three factors.
The principle of dynamical model reconstruction has been introduced in the previous studies [12,15,16] and is not described in detail here but can be found in Appendix A.
The existing parameter estimation methods (such as neighborhood search method and least square estimation) are mostly one-way search which needs to travel the entire parameter space, so the searching efficiency is low. Because of the limitation of the error gradient convergence and its dependence on initial solution, parameter estimation is prone to fall into local optimum, rather than global optimum. GA is a method extensively used as a global optimization method which has developed in recent years. GA has been found to be excellent in global search and parallel computing, and error convergence rate is greatly improved, so it is helpful in parameter optimization.
We can use a simplified second-order nonlinear dynamic model to depict the basic characteristics of the atmosphere and the ocean interactions . In our paper, [x.sub.1], [x.sub.2], [x.sub.3], and [x.sub.4] are used to represent time coefficient series of SI, MH, TH, and J1V; then GA is introduced to reconstruct the dynamical model and for parameter optimization.
With S = [(D - GP).sup.T](D - GP) minimum as a restriction, an optimization solution searching is made in the model parameter space by GA.
Suppose that the following nonlinear second-order ordinary differential equations are taken as the dynamical model of reconstruction and reconstruction; we choose the time coefficient series of SI, MH, TH, and J1V from May 1 to July 31 in 2010 as the expected data to optimize and retrieve model parameters:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (1)
Suppose that the parameter matrix P = [[a.sub.1], [a.sub.2], ..., [a.sub.9];[b.sub.1], [b.sub.2], ..., [b.sub.9]; [c.sub.1],[c.sub.2], ..., [c.sub.9]] in the above equation is the population, the minimal residual sum of squares S = [(D- GP).sup.T] (D - GP) is the objective function, the individual fitness value is [l.sub.i] = 1/S, and the total fitness value is L = [[summation].sup.n.sub.i=1] [l.sub.i]. The idiographic operating steps include coding and creating of initial population, calculation of fitness, male individual choice, crossover, and variation. For a complete theoretical explanation, one can refer to the literature . During calculation, the step length is one day. After about 35 times of genetic operations and optimization search, it is possible to converge to the target adaptive value rapidly and retrieve each optimized parameter of the dynamical equations. After eliminating weak items with little dimension coefficient, the nonlinear dynamic model of SI and three factors are inverted as follows:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (2)
The dynamical reconstruction model should be tested. So we chose the data of August 1 in 2010 of SI, MH, TH, and J1V which do not participate in the modeling as the forecast initial data. Then the Runge-Kutta method is used to carry out numerical integration of the above equations and every step of integration can be regarded as a day forecasting result. The forecast results of four time series within 25 days can be obtained. The correlation coefficients between forecast results and actual results of SI, MH, TH, and J1V within 25 days are, respectively, 0.5659, 0.5746, 0.6091, and 0.5023 and root mean square errors (RMSE) are 36.22%, 34.54%, 32.71%, and 36.78%. This indicates that the results of long-term forecast within 25 days are unsatisfactory for the dynamical reconstruction model. That is because the integration results may diverge significantly with time, as well as our initial data for integration being relatively simple. Therefore, we should improve the model to get better long-term forecast results.
Following these analyses, correlation coefficients (r) were evaluated using the t-test ([alpha] = 0.05). All resulting coefficients were found to be statistically significant at the 95% confidence level and were thus deemed to be statistically acceptable.
4. Introduction of Self-Memorization Dynamics to Improve the Reconstructed Model
From the above discussion, we know that the accuracy of forecast results of the dynamical reconstruction model within 25 days is unsatisfactory. Literature suggests that introducing the principle of self-memorization into the mature model can improve long-term forecasting results [18, 20]. So the self-memorization principle is introduced to improve the model.
Following the study of Cao  and Feng et al. , the mathematical principle of self-memorization dynamics of systems is shown in Appendix B.
Using (2) as the dynamical kernel, the improved model based on the self-memorization equation (B.12) in Appendix B can be expressed as
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (3)
Because we predict the value of SI, the first formula in (3) is used to get results of our final prediction. That is,
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (4)
If we can get the value of a, [theta], (4) can be used for prediction. Values of a, [theta] are associated with the self-memorization function [beta], which is defined in the next section.
We make a prediction with the self-memorization equation (4); the model uses the p values before [t.sub.0], therefore [beta](t) in the self-memorization equation (4) in memorizing the p+1 values of x. This explains the reason why we call [beta](t) the self-memorization function. This is the mathematical basis for introducing the self-memorization principle.
The physical basis for introducing the self-memorization principle is that the thermodynamic equation is one of atmospheric motion equations, which implies that the atmospheric motion is an irreversible process. An important contribution to the study of irreversible process is to introduce the memory concept to physics. So the atmospheric development in the future is not only related to the state at the initial time, but also related to states in the past, which means the atmosphere does not forget its past.
5. Improved Self-Memorization Functions for Chaotic Systems
Cao  referred to the problem of self-memorization function assignment. Based on experience and tests, he found the memory level gradually decreased with the distance of initial prediction time, so the memory function can be expressed as
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (5)
It shows that the self-memorization function (5) indicated by Cao only considers the characteristics of memory level gradually decreasing with the distance of initial prediction time and ignores nonlinear characteristics of the self-memorization function. So we modify the self-memorization function as
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (6)
Term [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] represents the characteristics of memory level gradually decreasing with the distance of initial prediction time, while [t.sup.(1-r).sub.i] reflects the nonlinear characteristics of self-memorization function. r and a are the parameters to be determined, and they are important for the specification of the assigned self-memorization function [beta](i). In Caos study he found no good way to determine parameters r and a, only believing that they had powerful connection with the past observational data.
The Lyapunov exponent (LE) has long been used to study atmospheric and oceanic predictability . In chaotic dynamical systems theory, the largest Lyapunov exponent can be portrayed as a whole (long-term) average forecast error growth rate, which generally describes the divergence of nonlinear chaotic systems. So it is often introduced in the forecast study of chaotic systems such as atmospheric and oceanic systems [32, 33]. The traditional self-memorization function is useful in the linear periodic system forecast, but not appropriate in the forecast of nonlinear chaotic systems. The largest Lyapunov exponent (LLE) is one of the indexes which can represent the characteristics of chaotic systems. So here r can be considered as LLE of the system, which can be calculated from the reconstructed dynamical equation in Section 3. And a can be obtained through GA from historical data.
5.1. Determination of Parameter r. Based on the above analysis, the solution of r requires the solution of the LLE in dynamical equation (2).
5.1.1. Definition of LLE. The definition of the Lyapunov spectrum and the largest Lyapunov exponent (LLE) can be referred to the literature 34] and are not described in detail here. The LLE determines the notion of predictability for a dynamical system. A positive LLE is usually taken as an indication that the system is chaotic (provided some other conditions are met, e.g., phase space compactness).
5.1.2. The Principle of Lyapunov Exponent Spectrum Calculations. If the dynamic differential equations are known, through theoretical derivation or using a numerical iterative algorithm to discretize differential equations, the exact Lyapunov exponents of known dynamical systems can be obtained. First, we work out the solution of the ordinary differential equations and get the Jacobi matrix. Then, we decompose the Jacobi matrix by QR while necessary orthogonal reformings in a multiple of small time intervals are performed. Finally, the Lyapunov spectrum of the system can be obtained by iterative calculations. The specific calculation principle of the Lyapunov exponents spectrum is not shown here, which can be referred to von Bremen et al. .
5.1.3. Lyapunov Spectrum of Dynamical Systems and Determination of r. According to the above calculation method, the Lyapunov spectrum of the reconstruction dynamical model (see Figure 3) of SI and three factors can be obtained as in Figure 2 which shows that convergence speed and volatility are not too large and they are relatively stable. Final Lyapunov exponents are [0.3816, 0.0016, -0.5715], containing both negative Lyapunov exponent and two positive Lyapunov exponents, which demonstrate our dynamical system is indeed a chaotic system.
Finally, r is taken as the LLE, and r = 0.3816.
5.2. Determination of Parameters a. Not only should the nonlinear characteristics of chaotic system be considered, but also the impact of the past actual data on self-memorization function should be taken into account. So parameter a should be optimized.
Equation (4) is written as
x(t) = f(t,x,[beta]). (7)
The initial value is x([t.sub.i]) = [x.sub.i], i = N, N - 1, ..., N - p.
Our goal is to make the smallest error between the fitting values and the observed values of p periods before; that is, secondary quality index reaches a minimum:
J(u)= [N.summation over (i=p+1)] [([??]([t.sub.i]) - [??]([t.sub.i])).sup.2] [right arrow] min, (8)
where [??] is the estimated value of the prediction model. [??] is the actual value. When further constraint is added, the self-memorization function should be taken as
[absolute value of [beta]([t.sub.i])] [less than or equal to] 1. (9)
Parameter r can be obtained by solving for the optimal solution of (8) in the constraint condition of (9). Setting J(u) = [[summation].sup.N.sub.i=p+1] ([??][([t.sub.i]) - [??]([t.sub.i])).sup.2] as the objective function of GA, putting [absolute value of [beta]([t.sub.i])] [less than or equal to] 1 as a constraint, the specific process of GA is discussed in Section 3.
We can see that the value of parameter a is related with the p data before forecasting time t. The forecasting value is preserved as the early data for the next prediction. With the change of data, the value of parameter a is varied. The first calculation is [a.sub.1] = 0.244 through 39 iterations of GA, and then in next prediction step, parameter a is varied, which we do not list one by one here. In fact, the continuous adjustment of parameter a is more accurate.
Self-memorization function [beta] is determined, and then it is substituted into (4), which can be used to do prediction.
6. Model Prediction Experiments
6.1. Effective Steps and Maximum Effective Computation Time. In order to further study the improved model, the algorithm to solve for effective steps and maximum effective computation time should be given, based on the dynamic core (2) of the improved model. Currently the optimal search method is used, which is achieved based on the size of the differences among many numerical solutions. Its principles can be seen in Kirkpatrick .
In step interval [[h.sub.min], [h.sub.max]], ([h.sub.min] is a very small number), n steps are selected, [h.sub.i] = [h.sub.min] + i * [dt.sub.t] < [h.sub.max] (i = 1,2, ..., n), and dtt is known as the time increment. The n steps are integral to t at the same time, getting the n number of numerical solutions, denoted by [[??].sub.t](n). If their differences [V.sub.s](t) are less than the given tolerance limit e, it means that they are relatively close to the true solution at t time. At that time the effective time step interval is [[h.sub.min], [h.sub.max]] and the width is [W.sub.h](t) = lg [h.sub.max] - lg [h.sub.min]. Otherwise, if [V.sub.s](t) > [epsilon], that means values between some steps show larger deviations, so removing these steps from [[??].sub.t](n) will make the difference of the remaining [n.sub.1] ([n.sub.1] < n) solutions less than [epsilon]. The optimum is to exclude the least number.
The effective step intervals which meet such conditions can be gotten. Given integral step, we continue to do integral calculation to t + [Nd.sub.t]; here, N is an integer. Integral operation continues until there are only two adjacent steps [h.sub.j]([t.sub.1]), [h.sub.j+1]([t.sub.1]) left. Then step interval [[h.sub.j]([t.sub.1]),[h.sub.j+1]([t.sub.1])] is divided into m parts for a another round of integral operation, until the difference of the solution in m+1 step is smaller than [epsilon]. In this case, the maximum effective computation time of the dynamical kernel of the improved model (2) can be obtained, as shown in Figure 4.
From Figure 4, when [epsilon] is certain, for our system, the maximum effective computation time increases with the increment of [dt.sub.t] and finally remains at 28.7, which is the maximum effective computation time (Figure 4(a)). When [dt.sub.t] remains constant, the maximum effective computation time oscillates with [epsilon] (Figure 4(b)).
6.2. Prediction Experiment of 2010. The period from August 2 to September 5 in 2010 is selected to do predictions (35 days in total). August contains abnormal activities of WPSH. Here, the forecasting result is obtained by the sum of (4), which is called as step-by-step forecast. The specific procedure is shown as follows.
Step-by-step forecasts are made after retrospective order p is fixed. That means when we forecast the SI value of August 2, we must get yi based on the previous p+1 time SI data and F([x.sub.1i], [x.sub.2i], [x.sub.3i], [x.sub.4i]) based on the previous p time SI, MH, TH, and J1V data (because [x.sub.2], [x.sub.3], and [x.sub.4] are MH, TH, and J1V). Then using them in (4) can obtain the SI value of August 2. Then, taking the SI value of August 2 as the initial value for the next prediction step, the SI value of August 3 can be generated, and so on.
6.2.1. Determination of p. When the principle of self-memorization is introduced, the retrospective order p is related with self-memorization of the system . If the system "forgets" slowly, which means parameters a and r are smaller, a high level of p should be used. WPSH abnormal activities are on ten-day scale [22, 24], which is a slow process in contrast to the large-scale atmospheric motion. So parameters a and r are smaller, and generally p is in the range of 5 to 15.
The correlation coefficients and root mean square errors (RMSE) between forecast value and real value can be calculated with different retrospective order, as shown in Table 1.
From the table, we can see when p = 8, the correlation coefficient is the largest, and the root mean square error is the smallest. So we choose p = 8.
After p is determined, the improved self-memorization equation (4) can be used for prediction experiments. Short-, medium-, and long-term integration forecasts of 15 days, 25 days, and 35 days are carried out. Retrospective order p = 8 means that earlier observational data (p + 1 = 9) are used for integration to begin. The integral result per day is preserved as preliminary information and will be used for integration in the next period.
6.2.2. Prediction Results of 2010. The forecast results within 35 days are shown in Figure 5, which show that the forecast result of SI is better.
As can be seen in Figure 5, forecast performance of the first 15 days is better. The correlation coefficient has reached 0.9542 and the root mean square errors (RMSE) are 2.45%. Two peaks and one valley of SI are also forecast very accurately. The forecast time series from 15 days to 25 days gradually diverges, but the trend is accurate. The correlation coefficient reaches 0.9254 and the RMSE is 6.37%. Forecast rate is still accurate. The peak of SI in August 18 is also forecast accurately.
After nearly 25 days, the RMSE begins to increase: the forecast trend is still accurate with correlation coefficient of 0.8136. Forecast curve from 25 days to 35 days is accurate compared with the actual situation. But the peaks and valleys are not forecast accurately. In particular after 28 days, the error has started to increase significantly. For example, the forecast value of August 28 is nearly 100 more than the actual value, resulting in a false peak. The root mean square error increases to 19.18%, but it is still controlled within 20%. Figure 5 shows that the forecast results after 28 days are obviously unsatisfactory with the occurrence of greater oscillations. The effective forecast period (within 28 days) happens to be in accordance with the maximum integration time (28.7 units) calculated in Section 6.1.
In brief, from Figure 5, we can see that the short-term forecast within 28 days is accurate, and the average RMSEs are less than 8%, which indicates that the reconstruction model can perform accurate predictions of the changing trends of indices.
6.3. More Forecasting Experiments of WPSH Abnormal Years. To further test the forecasting performance of the improved model, cross testing of more experiments is performed. From Figure 2 discussed in Section 2.2, we choose another four years (1998, 2006, 1987, and 1983) in which WPSH intensity is abnormally strong (bigger SI) and five years (1984, 2000, 1994, 1999, and 1985) in which WPSH intensity is abnormally weak (smaller SI) to carry out integral forecast experiments of SI. In accordance with the previous idea, the common 2010 model which is reconstructed in Section 4 is used to carry out the cross testing. Forecast results of different time periods (1-15 days as short term, 16-25 days as medium term, and 26-35 days as long term) are compared with the actual situation. Cross test results are displayed in Table 2. From the table, we can see the forecast results of the short term and the medium term are accurate, and those of the long term (>26 days) can also be accepted. The results of MH, FLH, and TH are similar to those of SI.
7. Summary and Discussion
7.1. Summary. Although in recent years the WPSH has become one of the leading topics of atmospheric sciences, it is still difficult to be forecast. Combining dynamical system reconstruction idea with the improved self-memorization principle, a new dynamical forecasting model of SI index is developed. The approach of this paper consists of the following steps.
(1) We use correlation analysis method to discuss contributions and impact of the key members of EASM to the abnormal WPSH in 2010 and identify the three most significant factors: the Mascarene high, the Tibetan high (eastern type), and Monsoon circulation index at Bay of Bengal.
(2) Take the SI, MH, XZ, and J1V time series and consider them as trajectories of a set of 4 coupled quadratic differential equations based on the dynamic system reconstruction idea. Parameters of this dynamical model are estimated by a genetic algorithm (GA).
(3) Apply the "principle of self-memorization" in order to improve the forecasting results of the above dynamical model. This involves a manipulation of the time series using an exponentially decaying (in time) function [beta](r), which also depends on the largest Lyapunov exponent of the above dynamical system. A second free parameter of [beta](a) is determined by minimizing the distance of the modified reconstructed time series.
(4) The improved model is used to forecast the SI index. Through 2010 experiments and other 9 experiments of WPSH abnormal years, we find that forecast results within 25 days are good, which proves that our improved model has better long-term forecasting results. Given the complexity of the mechanism of the WPSH 3], the new dynamical forecasting model has some scientific significance and practical value.
Based on the discussion in Section 1, we know that the forecast results and credibility of the previous methods are very low after two weeks . So our improved statistical-dynamical model represents an exploration of and supplement to the traditional numerical forecasting and statistical forecasting methods and also extends the prediction time.
7.2. Discussion. Why are the forecasting results of our improved model especially good? (1) Previous studies have showed that the long-term forecasting results of dynamical system reconstructed model are unsatisfactory; thus, we introduce the self-memorization principle to improve the long-term forecasting results. Previous studies have proved that this approach is feasible. For example, Gu  used the self-memorization principle to improve the traditional T42 model. And Wang et al.  also carried out dynamical prediction of building subsidence deformation with self-memorization model. Long-term forecasting results of their models were very good. Our study also shows that introducing the self-memorization principle to improve a mature model can bring better long-term forecasting results. (2) The largest Lyapunov exponent is firstly introduced to improve the traditional self-memorization function, making it more appropriate to describe the chaotic systems, such as WPSH. (3) In developing the improved model, the parameters are obtained from the historical data, which contain sufficient information of WPSH abnormal process. Statistical data combining with improved dynamical model makes forecasting results more reliable.
Although the forecast results of improved model are better, there are still some issues which are needed for further research.
(1) The physical meaning of the factors in the model is not clear, so its dynamical characteristics should be further analyzed.
(2) The forecast accuracy has a great relationship with self-memorization functions. Maybe we can find a better self-memorization function to improve forecast accuracy for long term in the future.
These two items are the focus of our next work.
A. Principle of Dynamical Model Reconstruction
Suppose that the finite difference form of the physical law of any nonlinear system evolving with time can be expressed as follows:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (A.1)
where fi is the generalized nonlinear function of [q.sub.1],[q.sub.2], ..., [q.sub.i], ..., [q.sub.N], N is the number of state variables, and M is the length of time series of observed data. We assume that [f.sub.i]([q.sup.j[DELTA]t.sub.1], [q.sup.j[DELTA]t.sub.2], ..., [q.sup.j[DELTA]t.sub.i] ..., [q.sup.j[DELTA]t.sub.N]) contains two parts: [G.sub.jk] representing the expanding items containing variable [q.sub.i] and [P.sub.ik] just representing the corresponding parameters which are real numbers (i = 1,2, ..., N, j = 1,2, ..., M, k = 1,2, ..., K).
It can be supposed that
[f.sub.i] ([q.sub.1], [q.sup.2], ..., [q.sub.n]) = [K.summation over (k=1)] [G.sub.jk][P.sub.ik]. (A.2)
The matrix form of (A.2) is D = GP, in which
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (A.3)
Coefficients of the above generalized unknown equation can be identified through inverting the observed data. Given a vector D, vector P can be solved to satisfy the above equation. It is a nonlinear system with respect to q; however it is a linear system with respect to P (assume P is unknown). So the classical least square method can be introduced to estimate the equation and the regular equation [G.sup.T]GP = [G.sup.T]D can be derived by making the residual sum of squares S = [(D - GP).sup.T] (D - GP) minimum.
Basedonthe above approach, coefficients of the nonlinear dynamical systems can be determined and the nonlinear dynamical equations of observed data can be established.
B. Mathematical Principle of
Self-Memorization Dynamics of Systems
In general, dynamical equations of a system can be written as
[partial derivative][x.sub.i]/[partial derivative]t = [F.sub.i] (x, [lambda], t), i = 1, 2, ..., J, (B.1)
where J is an integer, [x.sub.i] the ith variable of the system state, and [lambda] the parameter. Equation (B.1) expresses the relationship between a local change of x and a source function F. Obviously, x is scalar function at the space [r.sub.0] and the time t. Consider a set of time T = [[t.sub.-p] ... [t.sub.0] ... [t.sub.q]], where [t.sub.0] is an initial time, and a set of space R = [[r.sub.a] ... [r.sub.i] ... [r.sub.[beta]]],where [r.sub.i] is a spatial point considered. An inner product in space [L.sup.2] : T x R is defined by
(f, g) = [[integral].sup.b.sub.a] f ([xi])g([xi])d([xi]), f, g [member of] [L.sup.2]. (B.2)
Accordingly, define a norm
[parallel]f[parallel] = [[[[integral].sup.b.sub.a] f([[xi].sup.2]d[xi]].sup.1/2]. (B.3)
Making a completion for [L.sup.2], it becomes a Hilbert space H. A solution of the multitime model can be regarded as a generalized one in H. Dropping i in (B.1) and applying an operation of the inner product defined in (B.2), by introducing a memorial function [beta](r, t), we obtain
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], (B.4)
where r in [beta](r, t) is dropped because of fixing on the spatial point [r.sub.0]. Supposing variable x and function [beta](r,t), and so forth, are all continuous, differentiable, and integrable, following the calculus, making an integration by parts for the left of (B.4) yields
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], (B.5)
where [beta]'(t) = [partial derivative][beta](t)/[partial derivative]t. Apply the median theorem in calculus to the third term in the right-hand side of (B.5). That is, the following is obtained:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], (B.6)
where the median [x.sup.m](t0) = x([t.sub.m]), [t.sub.0] < [t.sub.m] < t. Substituting (B.5) and (B.6) for (B.4) and performing algebraic operation, we get
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (B.7)
As the first and second terms in (B.7) relate to the x value only on the fixed point [r.sub.0] itself at the initial time to and the middle time [t.sub.m], they are called a self-memory term. Naturally, call the third term an exogenous effect, that is, total effect contributed by other spatial points to point [r.sub.0] in an interval [[t.sub.0], t].
For multitime [t.sub.i], i = -p, -p + 1, ..., [t.sub.0], t, similar to (B.5), we have
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (B.8)
By eliminating the same term [beta]([t.sub.i])x([t.sub.i]), i = -p + 1, -p + 2, ..., 0, it gives
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (B.9)
For simplicity, setting [[beta].sub.t] = [beta](t), [[beta].sub.0] = [beta]([t.sub.0]), [x.sub.t] [equivalent to] x(t), [x.sub.0] = x([t.sub.0]), similar notations are used in the following context. Then, (B.9) can be written as
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (B.10)
Setting [x.sub.-p] [not equal to] [x.sup.m.sub.p-1], [[beta].sub.-p-1] = 0, we rewrite (B.10) as
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (B.11)
We call [S.sub.1] a self-memory term and [S.sub.2] an exogenous effect term.
For the convenience of calculations, to discretize self-memorization equation, the integration in (B.11) in Appendix B is replaced by summation and the differential by difference, and the median [x.sup.m.sub.i] is simply replaced by the mean of two values at adjoining times; that is, [x.sup.m.sub.i] [approximately equal to](1/2)([x.sub.i+1] + [x.sub.i]) [equivalent to] [y.sub.i].
By sampling an equal time interval [DELTA]t, that is, [t.sub.i] = [t.sub.0] + i[DELTA]t, i = 1,0,-1, -2, ..., -p, taking an equal time interval [DELTA][t.sub.i] = [t.sub.i+1] - [t.sub.i] = 1, where [t.sub.0] is initial time, [t.sub.0] + [DELTA]t is forecast time, p is retrospective order, and, incorporating [[beta].sub.i] and [[beta].sub.t], a discretized self-memorization equation is obtained:
[x.sub.t] = [-1.summation over (i=-p-1)] [[alpha].sub.i][y.sub.i] + [0.summation over (i=-p)] [[theta].sub.i]F (x,i), (B.12)
where F is the dynamic core of the self-memorization equation; [[alpha].sub.i] = [[beta].sub.i+1] - [[beta].sub.i])/[[beta].sub.t]); [[theta].sub.i] = [[beta].sub.i]/ [[beta].sub.t].
We call a technique with which one makes forecast and computation based on (B.12) a self-memorization principle.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
This study is supported by the Chinese National Natural Science Fund for young scholars (41005025/D0505), the Chinese National Natural Science Fund (41375002; 41075045), and the Chinese National Natural Science Fund (BK2011123) of Jiangsu Province.
 T. Miyasaka and H. Nakamura, "Structure and formation mechanisms of the northern hemisphere summertime subtropical highs," Journal of Climate, vol. 18, no. 23, pp. 5046-5065, 2005.
 K. Kurihara, "A climatological study on the relationship between the Japanese summer weather and the subtropical high in the western North Pacific," Geophysical Magazine, vol. 43, no. 2, pp. 45-104, 1989.
 S. Y. Tao, Y. Q. Ni, S. X. Zhao et al., The Study on Formation Mechanism and Forecast of the 1998 Summer Rainstorm in China, Meteorological Press, Beijing, China, 2001.
 H. J. Wang and F. Xue, "Interannual variability of somali jet and its influences on the inter-hemispheric water vapor transport and on the east Asian summer rainfall," Chinese Journal of Geophysics, vol. 1, pp. 230-239, 2003.
 X. L. Xu, H. M. Xu, and D. Si, "Relationship between June sustained torrential rain in South China and anomalous convection over the bay of Bengal," Journal of Nanjing Institute of Meteorology, vol. 30, pp. 463-471, 2007
 J. Cao, R. H. Huang, Y. Q. Xie, and Y. Tao, "The study of the physical mechanism of West Pacific subtropical high evolution," Science in China Series D, vol. 8, pp. 659-666, 2002.
 R. Zhang, J. G. Wang, and Z. H. Yu, "Teleconnection of Rossby inertial gravity waves and East-West Pacific subtropical," Applied Mathematics and Mechanics, vol. 23, pp. 707-714, 2002.
 A. Grinsted, J. C. Moore, and S. Jevrejeva, "Application of the cross wavelet transform and wavelet coherence to geophysical times series," Nonlinear Processes in Geophysics, vol. 11, no. 5-6, pp. 561-566, 2004.
 M. L. Wu, H. Liang, Y. Wang et al., "Contrast tests of precipitation products between T213 and Germany numerical prediction in 2008," Journal of Meteorology and Environment, vol. 25, pp. 4-12, 2009.
 R. Lu, H. Ding, C.-S. Ryu, Z. Lin, and H. Dong, "Mid-latitude westward propagating disturbances preceding intraseasonal oscillations of convection over the subtropical western North Pacific during summer," Geophysical Research Letters, vol. 34, no. 21, Article ID L21702, 2007.
 L. W. Zou, T. J. Zhou, B. Wu et al., "The interannual variability of summertime western Pacific subtropical high hindcasted by GAMIL CliPAS experiments," Chinese Journal of Atmospheric Science, vol. 33, pp. 959-970, 2009.
 R. Zhang, M. Hong, Z.-B. Sun et al., "Non-linear dynamic model retrieval of subtropical high based on empirical orthogonal function and genetic algorithm," Applied Mathematics and Mechanics, vol. 27, no. 12, pp. 1645-1653, 2006.
 M. Hong, R. Zhang, J. X. Li, J. J. Ge, and K. F. Liu, "Inversion of the western Pacific subtropical high dynamic model and analysis of dynamic characteristics for its abnormality," Nonlinear Processes in Geophysics, vol. 20, no. 1, pp. 131-142, 2013.
 M. Hong, R. Zhang, and K. F. Liu, "Retrieving dynamic forecast model of the western pacific subtropical high in abnormal years based on GA," Acta Physica Sinica, vol. 62, no. 7, pp. 6430-6442, 2013.
 C. Z. Bai, M. Hong, R. Zhang, D. Wang, and L. Qian, "Evolving an information diffusion model using a genetic algorithm for monthly river discharge time series interpolation and forecasting," Journal of Hydrometeorology, vol. 15, no. 6, pp. 2236-2249, 2014.
 C. Z. Bai, R. Zhang, M. Hong, L. X. Qian, and Z. X. Wang, "A new information diffusion modeling technique based on vibrating string equation and its application in natural disaster risk assessment," International Journal of General Systems. In press.
 H. X. Cao, "Self-memorization equation in atmospheric motion," Science in China. Series B, vol. 36, no. 7, pp. 845-855, 1993.
 X. Gu, "A spectral model based on atmospheric self-memorization principle," Chinese Science Bulletin, vol. 43, no. 20, pp. 1692-1702, 1998.
 G. Feng, H. Cao, X. Gao, W. Dong, and J. Chou, "Prediction of precipitation during summer monsoon with self-memorial model," Advances in Atmospheric Sciences, vol. 18, no. 5, pp. 701-709, 2001.
 X. Chen, J. Xia, and Q. Xu, "Differential Hydrological Grey Model (DHGM) with self-memory function and its application to flood forecasting," Science in China Series E: Technological Sciences, vol. 52, no. 4, pp. 1039-1049, 2009.
 S. Y. Tao and J. Wei, "The westward and northward advance of the subtropical high over the west Pacific in summer," Journal of Applied Meteorological Science, vol. 17, pp. 513-524, 2006.
 D. D. Yu, R. Zhang, and M. Hong, "A charateristic correlation analysis between the Asia summer monsoon memberships and west Pacific subtropical high," Journal of Tropical Meteorology, vol. 1, pp. 58-67, 2007.
 The Central Meteorological Station Forecast Group, The Long-Range Weather Forecast Technology Experience, pp. 5-6, The Central Meteorological Station Press, Beijing, China, 1976.
 F. Xue, H. J. Wang, and J. H. He, "The influence of Mask Lin high and Australia high interannual variation to the East Asian summer monsoon precipitation," Chinese Science Bulletin, vol. 3, pp. 19-27, 2003.
 W. Wang, J. Su, B. Hou, J. Tian, and D. Ma, "Dynamic prediction of building subsidence deformation with data-based mechanistic self-memory model," Chinese Science Bulletin, vol. 57, no. 26, pp. 3430-3435, 2012.
 F. Takens, "Detecting strange attractors in fluid turbulence," in Dynamical Systems and Turbulence, Warwick 1980, vol. 898 of Lecture Notes in Mathematics, pp. 361-381, Springer, New York, NY, USA, 1981.
 T. N. Palmer, A. Alessandri, U. Andersen et al., "Development of a European multi-model ensemble system forseasonal to interannual prediction (DEMETER)," Bulletin of the American Meteorological Society, vol. 85, pp. 853-872, 2004.
 J. P. Huang and Y. H. Yi, "A non-linear dynamic system reconstructing of actual data," Science in China, vol. 3, pp. 331-336, 1991.
 S. W Yeh, S. J. Kug, and B. Dewitte, "El Nino in a changing climate," Nature, vol. 461, pp. 511-514, 2009.
 L. Wang, Intelligent Optimization Algorithms and Its Application, pp. 23-24, Tsinghua University Press, Chendu, China, 2001.
 K. Fraedrich, "Estimating weather and climate predictability on attractors," Journal of the Atmospheric Sciences, vol. 44, no. 4, pp. 722-728, 1987.
 S. Yoden and M. Nomura, "Finite-time Lyapunov stability analysis and its application to atmospheric predictability," Journal of the Atmospheric Sciences, vol. 50, no. 11, pp. 1531-1543, 1993.
 E. Kazantsev, "Local Lyapunov exponents of the quasi-geostrophic ocean dynamics," Applied Mathematics and Computation, vol. 104, no. 2-3, pp. 217-257, 1999.
 E. Firdaus and H. F. von Bremen, "Computation of Lyapunov characteristic exponents for continuous dynamical systems," Zeitschrift fur Angewandte Mathematik und Physik, vol. 53, pp. 123-128, 2001.
 H. F. von Bremen, F. E. Udwadia, and W. Proskurowski, "An efficient QR based method for the computation of Lyapunov exponents," Physica D. Nonlinear Phenomena, vol. 101, no. 1-2, pp. 1-16, 1997.
 D. Kirkpatrick, "Optimal search in planar subdivisions," SIAM Journal on Computing, vol. 12, no. 1, pp. 28-35, 1983.
Mei Hong, (1) Ren Zhang, (1) Xi Chen, (1) Shanshan Ge, (1) Chengzu Bai, (1) and Vijay P. Singh (2,3)
(1) Research Center of Ocean Environment Numerical Simulation, Institute of Meteorology and Oceanography,
PLA University of Science and Technology, P.O. Box 003, No. 60, Shuanglong Road, Nanjing 211101, China
(2) Department of Biological and Agricultural Engineering, Texas A&M University, College Station, TX 77843, USA
(3) Zachry Department of Civil Engineering, Texas A &M University, College Station, TX 77843, USA
Correspondence should be addressed to Mei Hong; firstname.lastname@example.org
Received 19 June 2014; Accepted 15 November 2014; Published 10 December 2014
Academic Editor: Stefan Balint
Table 1: The correlation coefficient (C.C.) and root mean square errors (RMSE) when the retrospective order p is different. P 1 5 6 7 8 9 10 C.C. 0.38 0.72 0.78 0.81 0.94 0.92 0.87 RMSE 27.53% 14.33% 13.76% 13.06% 3.72% 4.63% 5.01% P 11 12 13 14 15 20 30 C.C. 0.86 0.84 0.83 0.71 0.67 0.44 0.36 RMSE 6.77% 7.01% 10.98% 13.24% 15.50% 24.13% 28.91% P 40 50 60 70 C.C. 0.32 0.28 0.21 0.17 RMSE 30.22% 39.11% 42.18% 50.24% P 1 5 6 7 8 9 10 C.C. 0.38 0.72 0.78 0.81 0.94 0.92 0.87 RMSE 27.53% 14.33% 13.76% 13.06% 3.72% 4.63% 5.01% P 11 12 13 14 15 20 30 C.C. 0.86 0.84 0.83 0.71 0.67 0.44 0.36 RMSE 6.77% 7.01% 10.98% 13.24% 15.50% 24.13% 28.91% P 40 50 60 70 C.C. 0.32 0.28 0.21 0.17 RMSE 30.22% 39.11% 42.18% 50.24% Table 2: The correlation coefficients (C.C.) and root mean square errors (RMSE) between forecast value and real value of different events of WPSH abnormal years. Statistical tests Short term Medium term Forecast events (1~15 days) (16~25 days) C.C. RMSE C.C. RMSE SI bigger event 1 (1998.06.21 as initial values to forecast) 0.957 2.92% 0.812 4.51% SI bigger event 2 (2006.07.18 as initial values to forecast) 0.936 2.88% 0.877 4.72% SI bigger event 3 (1987.07.08 as initial values to forecast) 0.942 3.16% 0.881 3.97% SI bigger event 4 (1983.08.05 as initial values to forecast) 0.958 3.40% 0.820 4.06% SI smaller event 1 (1984.07.28 as initial values to forecast) 0.951 3.12% 0.815 3.43% SI smaller event 2 (2000.06.29 as initial values to forecast) 0.892 3.47% 0.825 4.97% SI smaller event 3 (1994.08.17 as initial values to forecast) 0.914 4.52% 0.755 3.83% SI smaller event 4 (1999.06.12 as initial values to forecast) 0.809 3.97% 0.873 3.89% SI smaller event 5 (1985.07.11 as initial values to forecast) 0.926 2.07% 0.882 4.85% The average 0.921 3.28% 0.838 4.25% Statistical tests Long term Forecast events (26~35 days) C.C. RMSE SI bigger event 1 (1998.06.21 as initial values to forecast) 0.723 11.91% SI bigger event 2 (2006.07.18 as initial values to forecast) 0.776 10.98% SI bigger event 3 (1987.07.08 as initial values to forecast) 0.718 11.48% SI bigger event 4 (1983.08.05 as initial values to forecast) 0.729 10.90% SI smaller event 1 (1984.07.28 as initial values to forecast) 0.789 11.21% SI smaller event 2 (2000.06.29 as initial values to forecast) 0.708 10.76% SI smaller event 3 (1994.08.17 as initial values to forecast) 0.698 11.87% SI smaller event 4 (1999.06.12 as initial values to forecast) 0.777 11.70% SI smaller event 5 (1985.07.11 as initial values to forecast) 0.737 10.17% The average 0.739 11.22%
|Printer friendly Cite/link Email Feedback|
|Title Annotation:||Research Article|
|Author:||Hong, Mei; Zhang, Ren; Chen, Xi; Ge, Shanshan; Bai, Chengzu; Singh, Vijay P.|
|Publication:||Discrete Dynamics in Nature and Society|
|Date:||Jan 1, 2014|
|Previous Article:||Optimal time-consistent investment strategy for a DC pension plan with the return of premiums clauses and annuity contracts.|
|Next Article:||Transit station congestion index research based on pedestrian simulation and gray clustering evaluation.|