Rural electric power requirements forecasts: detecting and correcting for weaknesses and bias.
The first weakness is an over-reliance on trending, even though the trends are often hidden in econometric equations. The second is a significant statistical bias that goes unreported and undetected. Since both affect policy decisions about system needs, these problems reach beyond the arcane world of econometrics: they can lead to very costly mistakes, even affecting a system's ability to compete. And even when detected, if uncorrected, they leave managers with unappealing options. If managers follow their power requirements study (PRS) forecasts, they may reach erroneous conclusions about need. If they ignore their PRSs, they have little more than intuition as a guide. Fortunately, these problems can often be corrected, at least partially. In this paper we use the actual consumption and economic data of a cooperative distribution system to illustrate these problems and their effects.
In rural electric systems, identifying the factors that influence peak load and energy usage almost inevitably centers on the availability of data. Economic, demographic, and even weather data for rural areas are limited, more so than for most metropolitan areas. That limitation often leads to selecting data elements for a forecast because they are available rather than for their theoretical relationship to electric consumption. Consequently, the equations used to forecast kilowatt (kW) and kilowatt-hour (kWh) requirements often are conceptually weak. (In econometrics this is known as weak model specification.)
As a substitute for a more thoroughly specified model, analysts often use a trend variable for time to forecast prospective growth. Even when such forecasts conform to realistic growth expectations and appear reasonable, the models fail to explain explicitly the underlying causes of growth in consumption. Because they provide little insight into the economic and demographic factors that affect consumption, trend models create two difficulties: first, they will miss significant structural changes, and second, they provide no means to assess program effects.
Structural changes will be masked by over-reliance on trending forecasts, whether or not the trend is developed econometrically. Unforeseen changes are identifiable only after the fact. For example, as shown in Figure 1, the impact of fuel costs in the late 1970s was not foreseeable in forecasts based on trend. As Figure 1 shows, a decline in electric consumption followed the increase in fuel costs in the 1979-81 period, but, of course, population continued to grow.
Models based on trending also have limited use in measuring program effects unless the models specifically include program-related variables. That is, the effects of rate changes, population growth, or income shocks can be measured effectively only if those variables are used to estimate consumption. For example, the residential kWh consumption model in Table 1 includes variables for rate levels, population growth, and cooling degree days. Since it is not logical to expect electric consumption to increase as rates increase, many analysts would delete the mills/kWh variable, which carries a positive sign. The resulting model, shown in the second column, is the one likely to be used to forecast consumption. Note that this model relies only on population growth and a weather adjustment to predict residential consumption.
Is there anything that managers should do under these circumstances, given the data limitations? We think that there is. First, remember that the best model for prediction is not necessarily the one that fits the historical data best. (In econometric jargon, the word fit refers to the R-square, which measures the percentage of the variation in the historical data explained by the model.) A model that predicts accurately over the long term may be the one that captures the major influences on consumption in the service territory, rather than the one that best fits past consumption. Second, ask about the structure of the model and make sure that it conforms to its intended purpose; that is, are the predictors that drive the model appropriate for its intended use? A model that effectively estimates the impact of structural changes in a future electrical system may not be the one that best estimates the consumption of the present system. Alternatively, a model that accurately predicts the near-term impact of sharp increases in fuel costs may not predict long-term consumption very well at all.
Stated differently, users should remember that behind the statistics, forecasting is still an art form. When a forecast is restricted by the data available, which is common in rural systems, remember that identifying the factors that influence consumption still may be the most important contribution of the model.
TABLE 1. Residential Usage Forecast, OLS Estimates

Variable                       Forecast w/ Mills/kWh    Forecast w/out Mills/kWh
Cooling Degree Days Squared    8.0496320E-07 (5.952)    7.6300000E-07 (5.68)
Natural Log of Mills/kWh       0.125877400 (1.454)
Natural Log of Population      0.671319900 (35.204)     0.6384587 (397.388)
Durbin-Watson                  0.7707                   0.694
R2                             0.9995                   0.9995

Note: The values in parentheses are the t-statistics.
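As an illustrative sketch, the kind of OLS regression behind Table 1 can be reproduced with ordinary least squares in a few lines. The variable names below mirror Table 1, but every number is synthetic and invented for illustration; this is not the cooperative's data or the authors' actual procedure:

```python
import numpy as np

# Synthetic stand-in data: names mirror Table 1, but the values
# are made up for illustration -- not the cooperative's history.
rng = np.random.default_rng(1)
years = 20
cdd_sq = rng.uniform(2.0e5, 6.0e5, years)            # cooling degree days, squared
log_mills = np.log(rng.uniform(30.0, 55.0, years))   # rate level, mills/kWh
log_pop = np.log(np.linspace(40_000.0, 55_000.0, years))
log_kwh = (2.0 + 8.0e-7 * cdd_sq - 0.10 * log_mills
           + 0.65 * log_pop + rng.normal(0.0, 0.01, years))

# OLS: regress log consumption on an intercept and the predictors.
X = np.column_stack([np.ones(years), cdd_sq, log_mills, log_pop])
coefs, *_ = np.linalg.lstsq(X, log_kwh, rcond=None)
residuals = log_kwh - X @ coefs
print(coefs)
```

With well-behaved data the fitted coefficients land near the values used to generate the series; the article's point is that a tight fit of this kind says nothing by itself about whether the specification is sound.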
Beyond the data limitation problems and their implications, we believe that statistical bias in PRS forecasts is even more significant. In our review of regression equations, it is clear that PRS forecasts often rely on the so-called "Ordinary Least Squares" (OLS) regression technique even when it is statistically inappropriate. When applied to time-series data, this technique may suffer from a statistical bias known as "autocorrelation." A regression equation may, in fact, fit the historical data very well, yet the coefficients of the equation may be biased. As a result, a PRS forecast using these coefficients can produce misleading consumption estimates.
Autocorrelation. Perhaps we should apologize for lapsing into econometric jargon, but unreported or undetected autocorrelation occurs frequently, and its impact is often sizeable. In load forecasting, autocorrelation is present when the level of consumption is correlated with itself over time, i.e., when the estimated consumption in one year is not independent of the level of consumption in the previous year. Viewed differently, it occurs when, after taking into account the factors used to estimate consumption (e.g., weather, number of customers, income levels, rate levels, and the like), one year's consumption is still correlated with previous levels of consumption. That causes the coefficients in the model to take on biased values. If the model is used for forecasting, biased coefficients mean that the forecasts will be misleading. If the model is used to estimate the impact of a rate change, those estimates will also be misleading.
TABLE 2. Residential Usage Forecast, MLE Estimates

Variable                       MLE Forecast w/ Mills/kWh
Cooling Degree Days Squared    4.7703160E-07 (3.303)
Natural Log of Mills/kWh       -0.116278600 (-1.641)
Natural Log of Population      0.602294000 (38.697)
Autoregressive Variable        -0.693360200 (-10.298)
Durbin-Watson                  1.6980
R2                             0.9997

Note: The values in parentheses are the t-statistics.
The presence of autocorrelation is usually detected with the so-called Durbin-Watson statistic (D-W). This statistic takes a value between 0 and 4, where a value close to 2 indicates that there is no autocorrelation.
An Example. To illustrate how autocorrelation can produce a deceptive PRS, we have compared two kWh forecasts for the test cooperative distribution system using the same actual consumption data. Despite their extremely good "fit" to the historical data, as shown by their high R2 values, the models in Table 1 have autocorrelation, as shown by the low D-W statistics.
Table 2 illustrates a new model estimated with the "Maximum Likelihood Estimation" (MLE) method, which adjusts for autocorrelation. (MLE is a technique that maximizes the logarithm of a likelihood function through computer iteration.) The D-W statistic improves to 1.6980. The impact of autocorrelation on these equations is apparent in Figure 2, which compares the forecasts and shows the potential misjudgment that statistical bias can generate. Notably, the forecast using the biased coefficients of the OLS equation is much higher than the forecast from the MLE equation. The total growth over the first ten years using the OLS equation is 23.015%, or a growth rate of 2.093% per year. The MLE equation, after adjusting for autocorrelation, produces a lower forecast of 15.648% over ten years, or a rate of 1.468% per year, for this distribution system. (For this system, a forecasted decline in population growth accounts for the "bend" in the forecast and the reduced growth after 1994.)
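MLE is one way to adjust for first-order autocorrelation; a simpler, closely related procedure is the Cochrane-Orcutt iteration sketched below. This is our own illustration on synthetic data, not the authors' method: it alternately estimates the residuals' AR(1) coefficient rho and re-fits OLS on rho-differenced ("quasi-differenced") data:

```python
import numpy as np

def ols(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def cochrane_orcutt(X, y, iterations=25):
    """Estimate rho from the residuals, quasi-difference the data
    (z_t - rho * z_{t-1}), re-fit OLS, and repeat until stable."""
    beta = ols(X, y)
    rho = 0.0
    for _ in range(iterations):
        e = y - X @ beta
        rho = float(e[1:] @ e[:-1] / (e[:-1] @ e[:-1]))
        beta = ols(X[1:] - rho * X[:-1], y[1:] - rho * y[:-1])
    return beta, rho

# Demonstration on synthetic data with strongly autocorrelated errors.
rng = np.random.default_rng(2)
n = 300
x = rng.uniform(0.0, 10.0, n)
errors = np.zeros(n)
for t in range(1, n):
    errors[t] = 0.7 * errors[t - 1] + rng.normal(0.0, 0.5)
y = 1.0 + 0.5 * x + errors
X = np.column_stack([np.ones(n), x])

beta, rho = cochrane_orcutt(X, y)
print(beta, rho)  # slope near 0.5, rho near 0.7
```

The estimated rho plays the same role as the "Autoregressive Variable" reported in Table 2: it soaks up the serial correlation so the remaining coefficients are no longer contaminated by it.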
Managers need not be deceived by biased regression equations, nor do they need to learn econometric techniques to prepare unbiased forecasts. It is important, nevertheless, to know that significant bias can occur and to ask pointed questions. Analysts can identify and correct for autocorrelation.
We have detected two commonly occurring problems in the forecasts underlying many PRS studies. Each can lead to misleading interpretations and costly misjudgments. First, there is an over-reliance on trending, which may fit historical data well but is likely to miss forthcoming structural changes and is of limited use in evaluating program changes. Second, there is a frequently undetected statistical bias in forecasts. Each of these problems can be mitigated, but only if managers and forecast users ask the right questions.
Messrs. Murry, Nan, and Harrington are with the consulting engineering firm of C. H. Guernsey & Co. in Oklahoma City, OK.
Donald A. Murry, Ph.D., is an economist specializing in rate of return analysis, economic forecasting, and economic regulatory policy analysis. Dr. Murry has testified before state and federal regulatory commissions, federal court, and legislative committees on numerous regulatory issues. He has a B.S. in business administration and an M.A. and a Ph.D. in economics from the University of Missouri-Columbia, and is a professor of economics at the University of Oklahoma.
G. David Nan, Ph.D., is an econometrician specializing in the areas of rate of return analysis, econometric forecasting, energy demand modeling, and environmental-energy-economic issues. Dr. Nan has conducted several energy research projects relating to demand forecasting, power allocation, and rate of return. Dr. Nan has a B.S. in transportation engineering and management from the National Chiao Tung University, Taiwan, and an M.A. and a Ph.D. in economics from the University of Oklahoma.
Bryan Harrington is an economist specializing in rate of return analysis, economic/rate forecasting, and economic regulatory policy analysis. Mr. Harrington has prepared testimony presented before state and federal regulatory commissions and federal court on issues relating to rate of return, demand forecasting, power allocations, and energy economic issues. He has a B.S. degree in business administration and an M.A. in economics from the University of Oklahoma, and is a Ph.D. candidate at the University of Oklahoma.
Date: Sep 22, 1993