# A single-equation study of U.S. petroleum consumption: the role of model specification

I. Introduction

The price responsiveness of U.S. petroleum consumption began to attract a great deal of attention following the unexpected and substantial oil price increases of 1973-74. There have been a number of large, multi-equation econometric studies of U.S. energy demand since then, which have focused primarily on estimating short run and long run price and income elasticities of individual energy resources (coal, oil, natural gas, and electricity) for various consumer sectors (residential, industrial, commercial).(1) Following these early multi-equation studies there have been several single-equation studies of aggregate U.S. petroleum consumption [4; 3; 9; 5; 6]. These single-equation studies have found that U.S. demand for petroleum products can be quite price inelastic in the short run (estimates range from -0.04 to -0.08) but exhibits long lags of anywhere from 6 to 10 years that yield long run price elasticities ranging from -0.25 to -0.56. Income or GNP elasticities are usually found to be much higher (0.69 to 1.13), with the complete response often assumed to occur in the current period.

Single-equation studies are often justified as efficient shortcuts, or reduced forms, for identifying the central behavioral aspects of a particular market. The main issue addressed in this paper is the extent to which the existing empirical results from single-equation studies of aggregate U.S. petroleum consumption have been influenced by the researchers' choice of dynamic model specification. This question is relevant because the econometric methodology usually followed in such studies is to start with a relatively simple specification, such as a partial adjustment model or some kind of distributed lag structure on price alone. This initial specification choice is rarely justified, except possibly to cite its previous success in similar work. Given that models which produce insignificant or perverse coefficient estimates are not usually publishable, the reported specification is almost certain to provide "reasonable," statistically significant estimates with the correct signs.

The primary diagnostic test of the chosen specification is the Durbin-Watson test statistic, or perhaps Durbin's "h" statistic in the presence of lagged dependent variables. When this test gives evidence of serial correlation in the residuals, the usual response is to "correct" for this by estimating a first or second-order autoregressive model (AR(1) or AR(2)). Should this easily-implemented correction not solve the problem, there is a tendency to chalk it up to some kind of unknown specification error or deficiency in the data, leaving any further investigation as a suggestion for future research.

Hendry [12, 223] argues that this type of specification search (from simple to general) characterizes much of applied work, and is "a reasonably certain path to concluding with a mis-specified relationship" if it is not accompanied by rigorous diagnostic testing.(2) Additionally, he believes that a simple-to-general modeling approach often involves "excessive presimplification with inadequate testing" [12, 222] or what Leamer has called a "prejudiced search for an acceptable model" [13, 126].

Mis-specification of the dynamic adjustment process for U.S. petroleum consumption could lead to biased elasticity estimates with corresponding inaccurate forecasts. A mis-specified model might appear to suffer structural shifts or "changes in regimes" when in reality the true underlying structural economic relationship has remained unchanged while one or more of the exogenous variables have somehow changed their behavior [12, 219]. Similarly, a mis-specified model may provide biased evidence of parameter symmetry, leading to further inaccuracies in forecasting. Structural change and/or parameter symmetry have been the focus of many single-equation studies of U.S. petroleum consumption [3; 9; 5; 6].

Given the importance of oil to the U.S. economy and the high degree of uncertainty associated with the future path of world oil prices, it seems appropriate to stop and closely investigate the issue of possible model mis-specification in the U.S. demand for petroleum products. Following Hendry's suggestions [12], we first pursue a general-to-simple dynamic specification search, sequentially testing various restrictions on a deliberately overparameterized general model. The aim is to obtain a data-based simplification of a general model that provides a parsimonious representation of the underlying data generation process. Among other topics [7; 15], Hendry's modeling approach has been used to analyze aggregate OECD energy demand [1], yielding price elasticity estimates comparable to existing studies but a long run income elasticity twice as large as the consensus estimate of unity.

Next, rather than just reporting "one more set" of elasticity estimates, we use the same data set to estimate the parameters of several typical alternative single-equation models, including the widely-used partial adjustment model, a simple static model, and the popular polynomial distributed lag (PDL) on price. These alternative estimates over the same sample period provide some indication of how our results differ from those of other researchers who might have chosen a different specification by following a simple-to-general approach.

Besides revealing the impact of model specification on our elasticities, all but one of these alternative models (the PDL) can be seen as nested or restricted versions of a more general dynamic specification, and therefore can be tested for their acceptability using standard statistical tests. In each case, we find that the restrictions implied by the alternative models are not supported by the data, suggesting that their selection and use to analyze U.S. petroleum consumption is inappropriate. Furthermore, a comparison of forecast errors (ex post and historical) shows substantial variation in forecasting performance across the various models. Hendry's general-to-simple modeling approach is seen to provide a data-acceptable restricted model that outperforms the alternatives and is free of the prior subjective prejudices of the researcher for a particular specification.

II. Data

The data set used for the following estimations contained annual observations over the period 1947-89. U.S. petroleum consumption (in thousand barrels per day) was measured by "petroleum products supplied" as reported in Table 50 of Annual Energy Review 1989 by the Energy Information Administration of the U.S. Department of Energy (DOE/EIA). The price of oil (in current dollars per barrel) was measured by refiner acquisition cost (composite) as calculated by DOE/EIA for 1968-89, published in Table 68 of Annual Energy Review 1989. Oil prices prior to 1968 were generously supplied by Dermot Gately, calculated as the domestic average wellhead price plus a 10 percent markup for transportation costs. All prices were converted to 1982 dollars using the implicit GNP deflator (1982 = 100) published in Survey of Current Business by the U.S. Department of Commerce. U.S. GNP (in billions of 1982 dollars) was also obtained from the Survey of Current Business.

III. General-to-Simple Modeling Approach

An application of Hendry's general-to-simple modeling approach to the study of U.S. petroleum consumption would involve first estimating an unrestricted autoregressive distributed lag model (denoted ADL(m_0, m_1, m_2)) of the form:

$$q_t = \alpha + \sum_{j=1}^{m_0} \delta_j q_{t-j} + \sum_{j=0}^{m_1} \beta_j p_{t-j} + \sum_{j=0}^{m_2} \gamma_j y_{t-j} + u_t, \qquad t = 1, 2, \ldots, T \quad (1)$$

where the three lag lengths (m_0, m_1, m_2) are initially set at the same maximum value (in this case, 4). All variables are measured in natural logarithms, with q_t denoting U.S. petroleum consumption in year t, p_t denoting the real price of oil, and y_t denoting real GNP. As others have noted [1], theory often does not provide us with any strict guidelines on the length of these three lags. However, a maximum of 4 lags on each variable will allow for a large number of different possible lag distributions.(3)
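As a concrete illustration of what estimating equation (1) involves, the sketch below builds the regressor matrix for an unrestricted ADL(4, 4, 4) and fits it by OLS. The data here are synthetic stand-ins (the paper's actual series come from DOE/EIA and the Survey of Current Business); the variable names follow the text.

```python
import numpy as np

def adl_design(q, p, y, m0=4, m1=4, m2=4):
    """Build dependent vector and regressors for equation (1):
    a constant, q lagged 1..m0, and p and y at lags 0..m1 and 0..m2."""
    start = max(m0, m1, m2)
    rows, dep = [], []
    for t in range(start, len(q)):
        row = [1.0]
        row += [q[t - j] for j in range(1, m0 + 1)]   # delta terms
        row += [p[t - j] for j in range(m1 + 1)]      # beta terms
        row += [y[t - j] for j in range(m2 + 1)]      # gamma terms
        rows.append(row)
        dep.append(q[t])
    return np.array(dep), np.array(rows)

# Synthetic log consumption, price, and GNP series for illustration only.
rng = np.random.default_rng(0)
T = 300
p = 0.05 * rng.normal(size=T).cumsum()
y = 0.05 * rng.normal(size=T).cumsum()
q = np.zeros(T)
for t in range(1, T):
    q[t] = 0.5 * q[t - 1] - 0.05 * p[t] + 0.4 * y[t] + 0.01 * rng.normal()

dep, X = adl_design(q, p, y)
coef, *_ = np.linalg.lstsq(X, dep, rcond=None)
print(X.shape[1])  # 15 coefficients, matching k = 15 in the unrestricted model
```

The column count (1 constant + 4 lags of q + 5 terms each for p and y) reproduces the k = 15 of the unrestricted model discussed below.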

This unrestricted general model is then progressively simplified by sequential testing of individual parameter restrictions suggested by the data. Potential restrictions are usually identified by looking at the magnitudes of individual parameter estimates, as well as their standard errors. Parameters whose estimates involve large standard errors can usually be set to zero. Often the estimated coefficient for one lagged value of a variable will be close in magnitude to the estimate for a neighboring lag, suggesting the use of a single common parameter. Should the estimates be of similar magnitude yet opposite in sign, then perhaps differencing may be acceptable. Restrictions are tested one at a time, and if accepted, all subsequent testing will include those restrictions in the model.

The criteria for accepting any particular restriction are: (i) it must lower the estimated standard error of the regression (SE); (ii) it cannot induce non-randomness in the residuals (as measured by the Lagrange multiplier (LM) test for serial correlation proposed by Godfrey [11]); and (iii) it cannot cause "predictive failure" outside the estimation period (as measured by the Chow test for predictive stability or parameter constancy [12, 222]). In order to implement this third test, we hold back the last four observations (1986-89), basing all subsequent results on the estimation period 1961-85.(4)

[Tabular data omitted.]
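Criterion (ii) can be made concrete with a sketch of Godfrey's LM statistic in its common T·R² form: regress the OLS residuals on the original regressors plus s lags of the residuals, and compare T·R² with a χ²(s) critical value. The data below are synthetic; this is a minimal illustration, not the paper's implementation.

```python
import numpy as np

def godfrey_lm(resid, X, s=4):
    """LM test for serial correlation of order s: T * R^2 from regressing
    the residuals on X plus s of their own lags; chi^2(s) under the null."""
    T = len(resid)
    lagged = np.column_stack(
        [np.concatenate([np.zeros(j), resid[:T - j]]) for j in range(1, s + 1)]
    )
    Z = np.column_stack([X, lagged])
    coef, *_ = np.linalg.lstsq(Z, resid, rcond=None)
    fitted = Z @ coef
    ss_res = np.sum((resid - fitted) ** 2)
    ss_tot = np.sum((resid - resid.mean()) ** 2)
    return T * (1.0 - ss_res / ss_tot)

# With white-noise residuals the statistic should sit well below the
# chi^2(4) critical value of 9.49 at the 5 percent level (usually).
rng = np.random.default_rng(1)
e = rng.normal(size=100)
X = np.column_stack([np.ones(100), rng.normal(size=100)])
b, *_ = np.linalg.lstsq(X, e, rcond=None)
resid = e - X @ b          # residuals orthogonal to X, as in practice
stat = godfrey_lm(resid, X, s=4)
```

A significant statistic signals remaining serial dependence, which under criterion (ii) would disqualify the candidate restriction.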

Results from the OLS estimation of the unrestricted general model of equation (1) are presented in Table I, along with the relevant diagnostic test statistics. LM(s) is Godfrey's LM test for autocorrelation of order s against a null hypothesis of serial independence, which is asymptotically distributed as χ²(s), and F(4, 25 - k) is the Chow test for predictive stability, where k is the number of estimated coefficients in the model (k = 15 in the unrestricted model). Sums of the estimated coefficients for each variable (q, p, y) are also shown for easy calculation of the long run elasticities. In particular, the short run elasticities for p and y are given by the estimated coefficients on the zeroth lag term ($\hat{\beta}_0$ and $\hat{\gamma}_0$), while the long run elasticities are calculated as:

$$\frac{\sum_{j=0}^{m_1} \hat{\beta}_j}{1 - \sum_{j=1}^{m_0} \hat{\delta}_j}$$

and

$$\frac{\sum_{j=0}^{m_2} \hat{\gamma}_j}{1 - \sum_{j=1}^{m_0} \hat{\delta}_j}.$$

These unrestricted estimates serve at least two purposes: (i) they provide a statistical benchmark for evaluating the acceptability of restricted variants of the general model of equation (1); (ii) they give us unrestricted elasticity estimates that will serve as a guide against imposing invalid restrictions in the paring-down process of model selection. The estimated price elasticities are comparable to existing single-equation estimates for the short run, yet are considerably smaller for the long run, reflecting a somewhat shorter lag in the full price response. Both the short run and long run GNP elasticity estimates are well within the range of existing results.
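The elasticity arithmetic above is simple enough to sketch directly. The coefficient values below are illustrative placeholders, not the paper's estimates; the point is only the mechanics of going from coefficient sums to long run elasticities.

```python
# Illustrative numbers only (not the paper's estimates). Given the sums of the
# estimated ADL coefficients, the long run elasticities are
# sum(beta) / (1 - sum(delta)) for price and sum(gamma) / (1 - sum(delta)) for GNP.
delta = [0.60, 0.10, 0.05, 0.0]            # coefficients on q lagged 1..4
beta = [-0.05, -0.02, -0.01, 0.0, 0.0]     # coefficients on p lagged 0..4
gamma = [0.40, 0.10, 0.0, 0.0, 0.0]        # coefficients on y lagged 0..4

sr_price, sr_gnp = beta[0], gamma[0]       # zeroth-lag (short run) elasticities
lr_price = sum(beta) / (1 - sum(delta))    # -0.08 / 0.25 = -0.32
lr_gnp = sum(gamma) / (1 - sum(delta))     # 0.50 / 0.25 = 2.0
```

Note how a lagged-q coefficient sum near one magnifies even small price coefficient sums into much larger long run responses.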

The two diagnostic tests for serial correlation and post-sample parameter constancy reveal a stable model that probably suffers from "over-fitting" or over-parameterization. The progressive reduction of the LM test statistic in the restricted models which follow confirms this suspicion.

[Tabular data omitted.]

We now proceed to simplify the general model by imposing data-instigated restrictions on the set of coefficients. Table II provides a sequential list of acceptable parameter restrictions and the corresponding values of SE, LM (4) and F(4, 25 - k). The final restricted model is (with t-statistics in parentheses):

[Estimated equation (2) omitted in source.]

T = 25, SE = 0.0089098, Adj. R² = 0.999252, LM(4) = 7.217, F(4, 18) = 2.775.

This restricted ADL model displays no evidence of serial correlation in its residuals, is stable over the post-sample period, and involves a reduction in SE of 20 percent over the general model.

The unscrambled coefficients from the restricted ADL model in (2) are given in Table III. From these, the short run and long run elasticities are easily seen to be:

[Short run and long run elasticity estimates omitted in source.]

which are very close to those found in the unrestricted model, suggesting that the restrictions inherent in the final model are valid.

As a point of interest, the restricted model was re-estimated over the longer sample period of 1961-89. The summary statistics over this longer sample period were T = 29, SE = 0.0102473, adjusted R² = 0.998872 and LM(4) = 4.823. The resulting unscrambled coefficients are also shown in Table III, yielding the following implied elasticities:

[Short run and long run elasticity estimates omitted in source.]

[Tabular data omitted.]

which show a slightly more responsive price effect in the long run but otherwise are essentially the same as those obtained over 1961-85.

In conclusion, the general-to-simple model selection procedure has produced a rather parsimonious (only 7 parameters to be estimated) representation of the data generation process that appears to adequately capture the dynamics of U.S. petroleum consumption. Rather than continuing to compare these estimates to those obtained in other studies using different specifications and different data sets, in the next section we pursue the simple-to-general modeling approach over the same sample period. This will isolate the role that model specification, apart from data differences, plays in estimating price and GNP elasticities for U.S. petroleum consumption.

IV. Simple-to-General Modeling Approach

Given the same set of annual observations on U.S. petroleum consumption, the real price of oil and real GNP described in the data section above, most applied researchers seeking a parsimonious, single-equation econometric model would not have followed the general-to-simple selection process of the previous section. Instead, they would have chosen some familiar specification they have been trained to use or find easy to implement. Some would begin to determine an "optimal" PDL structure for the price term, assuming away any lags on the GNP response [4; 5; 6; 9; 17]. Others might appeal to other researchers' success with a popular specification, such as the partial adjustment model [3]. Still others might want to start out by estimating a simple static model, with the strong expectation of having to upgrade it to at least an AR(1) model.

PDL Model

We begin by estimating a simple model of U.S. petroleum consumption with a PDL (n, r) structure on price but an instantaneous response to GNP:

$$q_t = \alpha + \gamma y_t + \sum_{i=0}^{n} \beta_i p_{t-i} + u_t \quad (3)$$

where the lag weights are constrained to follow a polynomial in the lag index,

$$\beta_i = \sum_{k=0}^{r} a_k i^k,$$

and n = lag length and r = order of the polynomial distributed lag structure on price are two values to be determined. The instantaneous GNP elasticity is given by γ and the price elasticity is given by β_0 for the short run and Σβ_i for the long run.

As before, we held back the last 4 observations (1986-89) for post-sample testing. This left a maximum possible estimation sample of 1947-85 (T = 39), or a maximum unconstrained lag on price of 17 years. The optimal lag length (n ≤ 17) was determined by finding the number of lags which gave the highest value of adjusted R² [14, 357]. In the present case, an unconstrained lag of 12 years gave the highest value of adjusted R² = 0.982517, with an inconclusive value for the DW statistic of 1.182. Next, the optimal order of the polynomial lag structure (r ≤ n - 1) was determined by starting with an 11th order polynomial (with no endpoint restrictions) and progressively reducing the order by one until we were unable to reject the hypothesis that the last term in the polynomial was zero [14, 357]. This procedure was followed using the previously determined optimal 12 year lag length over the longest possible sample period of 1959-85 and resulted in an optimal first-degree polynomial (r = 1).
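The mechanics of the PDL (Almon lag) restriction can be sketched on synthetic data: the n + 1 lag weights collapse to r + 1 polynomial parameters, estimated by regressing on transformed variables z_{k,t} = Σ_i i^k p_{t-i}. The series and weights below are made up for the illustration.

```python
import numpy as np

def pdl_design(p, n, r):
    """Almon transformation: column k is z_{k,t} = sum_i i^k * p_{t-i},
    so OLS on Z recovers the r+1 polynomial coefficients a_0..a_r."""
    T = len(p)
    return np.array([[sum((i ** k) * p[t - i] for i in range(n + 1))
                      for k in range(r + 1)] for t in range(n, T)])

# Synthetic check: build q from linearly declining lag weights
# beta_i = 0.10 - 0.01 * i (a PDL(12, 1) truth) and recover a_0, a_1 by OLS.
rng = np.random.default_rng(2)
n, r = 12, 1
p = rng.normal(size=120)
beta = [0.10 - 0.01 * i for i in range(n + 1)]
q = np.array([sum(beta[i] * p[t - i] for i in range(n + 1))
              for t in range(n, len(p))])
Z = pdl_design(p, n, r)
a, *_ = np.linalg.lstsq(Z, q, rcond=None)   # a ≈ [0.10, -0.01]
```

With r = 1, as selected in the text, each β_i = a_0 + a_1·i, so only two free price parameters are estimated despite the 12 year lag.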

Unfortunately, this PDL(12, 1) model showed clear evidence of serial correlation in its residuals: the DW statistic was only 0.472 and the Ljung-Box (Q*(s)) portmanteau test statistics for serial independence (distributed as χ²(s)) were significant from the very first lag (s = 1).(5) Subsequent re-estimation using a Cochrane-Orcutt iterative AR(2) procedure seemed to correct this problem, yielding insignificant Q* statistics out to the fourth lag and a respectable DW statistic of 2.140.

All parameter estimates were of reasonable magnitudes, had the "correct" or expected signs and were statistically significant at the 1 percent level. The short run price elasticity was very small (-0.024), but with a lengthy 12 year lag was able to rise to -0.487 in the long run. The instantaneous GNP elasticity was once again close to unity (1.029).(6)

By assuming no change in the autoregressive parameters ρ_1 and ρ_2 over 1986-89, a post-sample Chow test of parameter constancy was also performed.(7) The test statistic of 1.609 is distributed as F(4, 21) and is not significant at the 5 percent level, showing that the model does not appear to suffer from "predictive failure" as defined by Hendry. Thus, from the perspective of the simple-to-general approach, we seem to have found an acceptable model.

Partial Adjustment Model

However, not everyone wanting to analyze U.S. petroleum consumption with a single-equation model would have chosen to fit a simple PDL structure on price. The partial adjustment model is another very popular dynamic specification that we could have used. A simple partial adjustment model would take the form:

$$q_t = \phi_0 + \phi_1 p_t + \phi_2 y_t + \phi_3 q_{t-1} + u_t \quad (4)$$

where 0 < φ_3 < 1. In this well-known specification, the short run price elasticity is φ_1, the short run GNP elasticity is φ_2, and the long run elasticities are φ_1/(1 - φ_3) and φ_2/(1 - φ_3), respectively. Unlike the PDL model, changes in both price and GNP cause lagged responses in consumption.
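These long run formulas can be checked arithmetically against the AR(1) partial adjustment estimates quoted later in this section (short run elasticities of -0.114 and 0.416, with the lagged-consumption coefficient taken as φ_3 = 0.590):

```python
# Long run elasticities implied by the partial adjustment formulas, using the
# AR(1) estimates quoted in the text: phi_1 = -0.114, phi_2 = 0.416, phi_3 = 0.590.
phi_1, phi_2, phi_3 = -0.114, 0.416, 0.590

lr_price = phi_1 / (1 - phi_3)   # long run price elasticity
lr_gnp = phi_2 / (1 - phi_3)     # long run GNP elasticity

print(round(lr_price, 3), round(lr_gnp, 3))  # -0.278 1.015
```

These reproduce the long run values of -0.278 and 1.015 reported for the AR(1) partial adjustment model.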

OLS estimation of (4) over 1961-85 yielded the following results:

[Estimated equation (5) omitted in source.]

T = 25, SE = 0.0247197, Adj. R² = 0.983455, LM(1) = 4.029, F(4, 21) = 1.469.

While the parameter estimates seem reasonable and are all statistically significant, the residuals show some evidence of first order autocorrelation since Godfrey's LM test statistic exceeds the critical value of 3.84 for a χ²(1) variate at the 5 percent level.

"Correcting" for first-order autocorrelation using the Cochrane-Orcutt iterative procedure produced a value of ρ_1 = 0.896. With this transformation the LM test statistic is now insignificant, so the problem seems to be solved. The AR(1) partial adjustment model is:

[Estimated equation (6) omitted in source.]

T = 25, SE = 0.0212301, Adj. R² = 0.686825, LM(1) = 2.366, F(4, 21) = 1.355

where the post-sample F-test of parameter constancy assumes no change in ρ_1 over 1986-89.

The final parameter estimates seem acceptable, although the speed of adjustment parameter (0.590) suggests a much faster adjustment process than that of the OLS model. The short run price elasticity is -0.114, somewhat larger than other estimates, with a comparatively low long run price elasticity of -0.278. On the other hand, the short run GNP elasticity is low at 0.416 although the long run estimate is very close to the consensus value of unity at 1.015. Finally, the parameters appear to be constant through the post-sample period, giving the model predictive stability. Once again, we seem to have obtained an acceptable model by an entirely different approach.
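The Cochrane-Orcutt corrections used throughout this section follow a simple iteration: alternate between OLS on ρ-quasi-differenced data and re-estimating ρ from the implied residuals until ρ converges. A minimal sketch on synthetic data, assuming a single AR(1) error process:

```python
import numpy as np

def cochrane_orcutt(y, X, tol=1e-8, max_iter=100):
    """Iterate OLS on quasi-differenced data and re-estimation of rho."""
    rho = 0.0
    b = np.zeros(X.shape[1])
    for _ in range(max_iter):
        y_star = y[1:] - rho * y[:-1]
        X_star = X[1:] - rho * X[:-1]
        b, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
        u = y - X @ b                                   # structural residuals
        new_rho = (u[:-1] @ u[1:]) / (u[:-1] @ u[:-1])  # AR(1) coefficient of u
        if abs(new_rho - rho) < tol:
            rho = new_rho
            break
        rho = new_rho
    return b, rho

# Synthetic check: regression with AR(1) errors, true rho = 0.7.
rng = np.random.default_rng(3)
T = 500
x = rng.normal(size=T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.7 * u[t - 1] + rng.normal(scale=0.5)
X = np.column_stack([np.ones(T), x])
y_obs = X @ np.array([1.0, 2.0]) + u
b, rho = cochrane_orcutt(y_obs, X)   # rho should land near 0.7
```

As the static-model results below illustrate, the iteration can converge to a ρ at or beyond the unit circle, which is itself a warning sign of mis-specification rather than a fixable nuisance.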

Static Model

The last simple-to-general modeling alternative we follow begins with OLS estimation of a purely static specification, yielding the following results:

[Estimated equation (7) omitted in source.]

T = 25, SE = 0.0884227, Adj. R² = 0.788308, DW = 0.142, F(4, 22) = 3.781.

Although the parameter estimates seem reasonable and are significant, the extremely low DW statistic suggests serially correlated residuals if not outright mis-specification, and the significant F-statistic for post-sample stability indicates predictive failure.

In an attempt to correct this problem we again used the Cochrane-Orcutt iterative procedure to determine the value of the first order autoregressive parameter ρ_1. However, this yielded a value for ρ_1 in excess of one (ρ_1 = 1.112), which violates the stationarity condition that ρ_1 lie inside the unit circle. A first differences model was then estimated as an alternative:

[Estimated equation (8) omitted in source.]

T = 25, SE = 0.0295252, Adj. R² = 0.510185, DW = 0.454, F(4, 22) = 0.158

which represents some improvement over the static model (7) but still appears to suffer from serial correlation. Adjusting this first differences model for first-order autocorrelation gives ρ_1 = 0.800 and the following results:

[Estimated equation (9) omitted in source.]

T = 25, SE = 0.0341131, Adj. R² = 0.704120, DW = 2.186, F(4, 22) = 0.628

where the post-sample test again assumes a constant value of ρ_1 over the 1986-89 period. By all appearances, this AR(1) first differences model is thus also an acceptable specification.

V. Model Comparisons

At this point, we have selected one restricted ADL model (equation (2)) using the general-to-simple approach and three very different dynamic specifications ((i) the AR(2) PDL(12, 1) model; (ii) the AR(1) partial adjustment model (equation (6)); and (iii) the AR(1) first differences model (equation (9))) by following different paths in a simple-to-general approach. All four chosen models appear to be free of serial correlation and exhibit post-sample stability. All four models yielded statistically significant parameter estimates with the correct signs and reasonable magnitudes. However, the implied elasticities and lengths of lag responses differed substantially across the four models. Which model should we use?

[Tabular data omitted.]

As an initial comparison we looked at the ex post forecasting properties of the four chosen models. Table V reports five commonly used measures of forecast error for ex post forecasts over the post-sample period 1986-89: (1) root mean squared error (RMSE); (2) root mean squared percent error (RMSPE); (3) mean absolute error (MAE); (4) mean absolute percent error (MAPE); and (5) Theil's inequality coefficient (U).(8) Over this brief period of only 4 observations, the mean forecast errors were only about 0.2 to 0.3 percent for all but the partial adjustment model, which produced forecast errors of over 9 percent as well as a substantially higher value for Theil's U. On this basis we might prefer any one of the first three models, with either the PDL or first differences model having a slight edge over the restricted ADL model.
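The five error measures are straightforward to compute; a sketch follows, taking Theil's U in the common form RMSE / (RMS(actual) + RMS(predicted)) — the paper defers to Pindyck and Rubinfeld [16] for exact definitions, so treat this form as an assumption.

```python
import math

def forecast_error_measures(actual, predicted):
    """RMSE, RMSPE, MAE, MAPE (percent terms), and Theil's inequality U."""
    n = len(actual)
    errors = [a - f for a, f in zip(actual, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    rmspe = 100 * math.sqrt(sum((e / a) ** 2 for e, a in zip(errors, actual)) / n)
    mae = sum(abs(e) for e in errors) / n
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    rms = lambda s: math.sqrt(sum(v * v for v in s) / n)
    theil_u = rmse / (rms(actual) + rms(predicted))   # 0 = perfect forecast
    return {"RMSE": rmse, "RMSPE": rmspe, "MAE": mae, "MAPE": mape, "U": theil_u}
```

The percent-based measures (RMSPE, MAPE) are the natural ones for comparing models of a trending series like petroleum consumption, since they are scale-free.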

To further discriminate between these four models, we performed historical simulations over the estimation sample period of 1961-85 and report the same five error measures as before in Table VI. Over this longer period the partial adjustment model still has the poorest performance, with mean errors almost identical to those over 1986-89. However, the other three models' performances are no longer so similar. The restricted ADL model now yields mean errors roughly half the size of those for the PDL model and one-third the size of those for the first differences model, as well as the lowest value for Theil's U. The restricted ADL model now clearly outperforms all other specifications considered here.

As a final comparison, it is possible to recognize the partial adjustment and first differences models as special cases of the general ADL(4, 4, 4) model of equation (1) and test whether or not the restrictions implicit in each specification are supported by the data. The long 12 year lag on price in the PDL model makes it impossible to test in this way, since it is not a nested case of the general model. Nevertheless, for the same number of observations (25) the chosen PDL model has a SE of 0.0155485, which is almost 40 percent higher than the SE of the unrestricted general model of 0.0111443, suggesting that it would not be a data-acceptable simplification.

The AR(1) partial adjustment model of equation (6) is observationally equivalent to a nested case of the general ADL(4, 4, 4) model, in particular the ADL(2, 1, 1) model:

$$q_t = \alpha + \delta_1 q_{t-1} + \delta_2 q_{t-2} + \beta_0 p_t + \beta_1 p_{t-1} + \gamma_0 y_t + \gamma_1 y_{t-1} + u_t. \quad (10)$$

This equivalence can be shown by taking quasi-first differences of the basic partial adjustment model of equation (4):

$$q_t = \phi_0(1 - \rho_1) + (\phi_3 + \rho_1) q_{t-1} - \phi_3 \rho_1 q_{t-2} + \phi_1 p_t - \phi_1 \rho_1 p_{t-1} + \phi_2 y_t - \phi_2 \rho_1 y_{t-1} + u_t \quad (11)$$

which has 5 parameters (φ_0, φ_1, φ_2, φ_3, ρ_1) to be estimated against 7 parameters in equation (10). This implies that the AR(1) partial adjustment model imposes 2 nonlinear restrictions on the ADL(2, 1, 1) model of equation (10). To test these 2 nonlinear restrictions, we use a likelihood ratio test statistic (LR), which is asymptotically distributed as χ²(2). The calculated value of LR = 17.06 exceeds the critical value of 9.21 at the 1 percent level, so the restrictions implied by the AR(1) partial adjustment model are rejected.
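For nested linear models estimated on the same T observations, the LR statistic can be computed from the two residual sums of squares as LR = T·ln(SSR_restricted / SSR_unrestricted), asymptotically χ² with one degree of freedom per restriction. The SSR values below are illustrative, not the paper's:

```python
import math

def lr_statistic(ssr_restricted, ssr_unrestricted, T):
    """Likelihood ratio statistic for nested regressions (Gaussian errors)."""
    return T * math.log(ssr_restricted / ssr_unrestricted)

# Illustrative: a restriction that doubles the SSR with T = 25 observations
# gives LR = 25 * ln(2) ≈ 17.33, well past the chi^2(2) critical value of 9.21.
stat = lr_statistic(0.0132, 0.0066, 25)
```

This makes clear why the restrictions are rejected in the text: the restricted model's fit deteriorates far more than two degrees of freedom can excuse.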

Of course, this test is immaterial if the underlying restrictions implied by the ADL(2, 1, 1) model of equation (10) are not first found to be data-acceptable. This set of 8 linear restrictions (β_2 = β_3 = β_4 = γ_2 = γ_3 = γ_4 = δ_3 = δ_4 = 0) is easily tested with a standard F-test, the statistic being distributed as F(8, 10). The calculated value is 3.57, which exceeds the critical value of 3.07 at the 5 percent level. Furthermore, the ADL(2, 1, 1) model has a SE = 0.0163026, which is almost 50 percent greater than that for the general ADL(4, 4, 4) model, so equation (10) is not itself a data-acceptable simplification. We conclude that the AR(1) partial adjustment model is a data-rejectable specification for this sample.
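The F-test of m linear restrictions takes the usual sum-of-squares form F = ((SSR_r - SSR_u)/m) / (SSR_u/(T - k)); here T = 25 and k = 15 in the unrestricted ADL(4, 4, 4), so 8 restrictions are judged against F(8, 10). The SSR values below are illustrative stand-ins:

```python
def f_statistic(ssr_r, ssr_u, m, T, k):
    """Standard F-test of m linear restrictions, F(m, T - k) under the null."""
    return ((ssr_r - ssr_u) / m) / (ssr_u / (T - k))

# Illustrative SSR values only; with m = 8, T = 25, k = 15 the comparison is
# against the F(8, 10) critical value of 3.07 at the 5 percent level.
stat = f_statistic(0.0050, 0.0012, m=8, T=25, k=15)
```

Values above 3.07 would reject the 8 zero restrictions, as happens in the text.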

Similarly, the AR(1) first differences model of equation (9) can be seen as observationally equivalent to an ADL(2, 2, 2) model. This nested case would have 9 separate parameters while the AR(1) first differences model has only 4, implying 5 nonlinear restrictions that can again be tested by a LR test statistic (assuming the ADL(2, 2, 2) model is itself data-acceptable). The calculated value of LR is 14.02, distributed as χ²(5), which exceeds the critical value of 11.1 at the 5 percent level, so these 5 restrictions can be rejected. However, the nested ADL(2, 2, 2) is not an acceptable simplification, as its SE of 0.0164392 is again roughly 50 percent above the SE for the general model, and an F-test of the 6 linear restrictions implied by an ADL(2, 2, 2) model (β_3 = β_4 = γ_3 = γ_4 = δ_3 = δ_4 = 0) gives a statistic of F(6, 10) = 4.14, surpassing the critical value of 3.22 at the 5 percent level. Therefore, the AR(1) first differences model must also be data-rejectable, comprising a set of invalid parameter restrictions.(9)

VI. Conclusions

When choosing an econometric model specification for a single-equation study of aggregate U.S. petroleum consumption, the researcher seeks a parsimonious, easily estimated model that will provide unbiased price and income elasticity estimates and yield accurate forecasts. In contrast to existing studies, we have used Hendry's general-to-simple specification search technique and annual data (1961-89) to obtain a restricted, data-acceptable simplification of a general ADL model. This restricted ADL model yielded GNP and short run price elasticities near the consensus estimates, but a long run price elasticity (-0.17) that is substantially smaller than existing estimates.

Comparisons with three other seemingly acceptable alternative models that were chosen via the simple-to-general modeling approach showed that popular model specifications often involve untested parameter restrictions that cannot be accepted. In addition, such models may also have poorer forecasting performance, with the widely used partial adjustment model found to have the largest mean forecast errors (nearly 10 percent) of all those considered here.

Therefore, the untested acceptance of a popular dynamic specification which yields "reasonable" and significant estimates can lead to the selection of a model which is not consistent with the data (i.e., is data-rejectable). Parameter estimates from such data-rejectable models may give very misleading indications of the dynamic nature of the behavioral relationships being modeled. In the present case, the long run price elasticity appears to have been over-estimated by such models, while the speed of the adjustment process may have been under-estimated. Energy policies which assumed a lingering, substantial consumption response to an oil price change could be confounded by an actual response that was much smaller and largely complete in just a few years.

Finally, forecasts of future U.S. petroleum consumption levels are often required for policy planning purposes. While the accuracy of any petroleum consumption forecast will depend heavily upon the assumptions made about future world oil prices and GNP levels, using a forecasting model that is not data-acceptable can lead to sizable errors. Selection of a data-acceptable model that does not suffer from predictive failure over existing out-of-sample data points can help to minimize forecasting errors for policy makers and other analysts. Based on the results presented here, the general-to-simple approach appears to offer a promising methodology for generating superior forecasting models of petroleum consumption and other energy use patterns.

Appendix

[Tabular data omitted.]

1. An excellent survey of these results may be found in Bohi [2, 159].

2. For a comparison of Hendry's econometric methodology with the traditional, simple-to-general ("North American") approach, see Gilbert [10].

3. A formal F-test of H_0: ADL(4, 4, 4) vs. H_A: ADL(5, 5, 5) yielded a test statistic of 0.701, which is distributed as F(3, 7), falling well short of the critical value of 4.35 at the 5 percent level.

4. With four lags, this implies using observations reaching back to 1957. Obviously, with data going as far back as 1947, we could have begun the estimation period in 1951. We started in 1961 so that these results would be comparable to those in the next section using alternative specifications requiring longer lags, in particular the AR(2) PDL model.

5. We can use the familiar DW and Ljung-Box portmanteau test statistics here rather than Godfrey's LM test since the PDL model has no lagged dependent variables. Otherwise, we report the LM test statistic, since the Ljung-Box statistics have been shown to be inappropriate for models with lagged dependent variables [8].

6. For comparison, Gately and Rappoport [9] estimated an AR(2) PDL(10, 3) model from annual data over 1949-85 and found a much larger (although still small) short run price elasticity of -0.068, a smaller long run price elasticity of -0.364, and a smaller GNP elasticity of 0.689. However, our elasticities are very similar to those reported by Walls [17], who estimated a nearly identical AR(2) PDL(10, 1) model from annual data over 1946-87.

7. Complete re-estimation over 1961-89 yielded almost identical values for these two parameters ($\rho_1 = 1.182$ and $\rho_2 = -0.597$), so this necessary assumption seemed reasonable.

8. An explanation of each of these forecast error measures may be found in Pindyck and Rubinfeld [16, 338-40].

9. For completeness, the OLS partial adjustment model and the OLS first differences model were also found to be data-rejectable.

References

1. Beenstock, Michael and Patrick Willcocks, "Energy Consumption and Economic Activity in Industrialized Countries: The Dynamic Aggregate Time Series Relationship." Energy Economics, October 1981, 225-32.

2. Bohi, Douglas R. Analyzing Demand Behavior: A Study of Energy Elasticities. Baltimore: Johns Hopkins University Press, 1981.

3. Bopp, Anthony E., "Tests for Structural Change in U.S. Oil Consumption, 1967-82." Energy Economics, October 1984, 223-30.

4. Brown, Scott B., "An Aggregate Petroleum Consumption Model." Energy Economics, January 1983, 27-30.

5. Brown, Stephen P. A. and Keith R. Phillips, "Oil Demand and Prices in the 1990s." Federal Reserve Bank of Dallas Economic Review, January 1989, 1-8.

6. ----- and -----, "U.S. Oil Demand and Conservation." Contemporary Policy Issues, January 1991, 67-72.

7. Cuthbertson, Keith and Paul Richards, "An Econometric Study of the Demand for First and Second Class Inland Letter Services." Review of Economics and Statistics, November 1990, 640-48.

8. Dezhbakhsh, Hashem, "The Inappropriate Use of Serial Correlation Tests in Dynamic Linear Models." Review of Economics and Statistics, February 1990, 126-32.

9. Gately, Dermot and Peter Rappoport, "The Adjustment of U.S. Oil Demand to the Price Increases of the 1970s." The Energy Journal, April 1988, 93-107.

10. Gilbert, Christopher L., "Professor Hendry's Econometric Methodology." Oxford Bulletin of Economics and Statistics, August 1986, 283-307.

11. Godfrey, Leslie G., "Testing for Higher Order Serial Correlation in Regression Equations When the Regressors Include Lagged Dependent Variables." Econometrica, November 1978, 1303-10.

12. Hendry, David F. "Predictive Failure and Econometric Modelling in Macroeconomics: The Transactions Demand for Money," in London Business School Conference on Economic Modelling, edited by P. Ormerod. London: Heinemann, 1979, 217-42.

13. Leamer, Edward E., "False Models and Post-Data Model Construction." Journal of the American Statistical Association, March 1974, 122-31.

14. Maddala, G. S. Introduction to Econometrics. New York: Macmillan, 1988.

15. Mizon, Grayham E. and David F. Hendry, "An Empirical Application and Monte Carlo Analysis of Tests of Dynamic Specification." Review of Economic Studies, 47(1980), 21-45.

16. Pindyck, Robert S. and Daniel L. Rubinfeld. Econometric Models and Economic Forecasts. New York: McGraw-Hill, 1991.

17. Walls, Margaret A., "Dynamic Firm Behavior and Regional Deadweight Losses from a U.S. Oil Import Fee." Southern Economic Journal, October 1990, 772-88.


The primary diagnostic test of the chosen specification is the Durbin-Watson test statistic, or perhaps Durbin's "h" statistic in the presence of lagged dependent variables. When this test gives evidence of serial correlation in the residuals, the usual response is to "correct" for it by estimating a first- or second-order autoregressive model (AR(1) or AR(2)). Should this easily implemented correction not solve the problem, there is a tendency to chalk it up to some unknown specification error or deficiency in the data, leaving any further investigation as a suggestion for future research.

Hendry [12, 223] argues that this type of specification search (from simple to general) characterizes much of applied work, and is "a reasonably certain path to concluding with a mis-specified relationship" if it is not accompanied by rigorous diagnostic testing.(2) Additionally, he believes that a simple-to-general modeling approach often involves "excessive presimplification with inadequate testing" [12, 222] or what Leamer has called a "prejudiced search for an acceptable model" [13, 126].

Mis-specification of the dynamic adjustment process for U.S. petroleum consumption could lead to biased elasticity estimates with corresponding inaccurate forecasts. A mis-specified model might appear to suffer structural shifts or "changes in regimes" when in reality the true underlying structural economic relationship has remained unchanged while one or more of the exogenous variables have somehow changed their behavior [12, 219]. Similarly, a mis-specified model may provide biased evidence of parameter symmetry, leading to further inaccuracies in forecasting. Structural change and/or parameter symmetry have been the focus of many single-equation studies of U.S. petroleum consumption [3; 9; 5; 6].

Given the importance of oil to the U.S. economy and the high degree of uncertainty associated with the future path of world oil prices, it seems appropriate to stop and closely investigate the issue of possible model mis-specification in the U.S. demand for petroleum products. Following Hendry's suggestions [12], we first pursue a general-to-simple dynamic specification search, sequentially testing various restrictions on a deliberately overparameterized general model. The aim is to obtain a data-based simplification of a general model that provides a parsimonious representation of the underlying data generation process. Among other topics [7; 15], Hendry's modeling approach has been used to analyze aggregate OECD energy demand [1], yielding price elasticity estimates comparable to existing studies but a long run income elasticity twice as large as the consensus estimate of unity.

Next, rather than just reporting "one more set" of elasticity estimates, we use the same data set to estimate the parameters of several typical alternative single-equation models, including the widely-used partial adjustment model, a simple static model, and the popular polynomial distributed lag (PDL) on price. These alternative estimates over the same sample period provide some indication of how our results differ from those of other researchers who might have chosen a different specification by following a simple-to-general approach.

Besides revealing the impact of model specification on our elasticities, all but one of these alternative models (the PDL) can be seen as nested or restricted versions of a more general dynamic specification, and therefore can be tested for their acceptability using standard statistical tests. In each case, we find that the restrictions implied by the alternative models are not supported by the data, suggesting that their selection and use to analyze U.S. petroleum consumption is inappropriate. Furthermore, a comparison of forecast errors (ex post and historical) shows substantial variation in forecasting performance across the various models. Hendry's general-to-simple modeling approach is seen to provide a data-acceptable restricted model that outperforms the alternatives and is free of the prior subjective prejudices of the researcher for a particular specification.

II. Data

The data set used for the following estimations contained annual observations over the period 1947-89. U.S. petroleum consumption (in thousand barrels per day) was measured by "petroleum products supplied" as reported in Table 50 of Annual Energy Review 1989 by the Energy Information Administration of the U.S. Department of Energy (DOE/EIA). The price of oil (in current dollars per barrel) was measured by refiner acquisition cost (composite) as calculated by DOE/EIA for 1968-89, published in Table 68 of Annual Energy Review 1989. Oil prices prior to 1968 were generously supplied by Dermot Gately, calculated as the domestic average wellhead price plus a 10 percent markup for transportation costs. All prices were converted to 1982 dollars using the implicit GNP deflator (1982 = 100) published in Survey of Current Business by the U.S. Department of Commerce. U.S. GNP (in billions of 1982 dollars) was also obtained from the Survey of Current Business.

III. General-to-Simple Modeling Approach

An application of Hendry's general-to-simple modeling approach to the study of U.S. petroleum consumption would involve first estimating an unrestricted autoregressive distributed lag model (denoted ADL($m_0$, $m_1$, $m_2$)) of the form:

$$q_t = \alpha + \sum_{j=1}^{m_0} \delta_j q_{t-j} + \sum_{j=0}^{m_1} \beta_j p_{t-j} + \sum_{j=0}^{m_2} \gamma_j y_{t-j} + u_t, \qquad t = 1, 2, \ldots, T \qquad (1)$$

where the three lag lengths ($m_0$, $m_1$, $m_2$) are initially set at the same maximum value (in this case, 4). All variables are measured in natural logarithms, with $q_t$ denoting U.S. petroleum consumption in year $t$, $p_t$ denoting the real price of oil, and $y_t$ denoting real GNP. As others have noted [1], theory often does not provide us with any strict guidelines on the length of these three lags. However, a maximum of 4 lags on each variable will allow for a large number of different possible lag distributions.(3)
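As a sketch of how the unrestricted ADL(4, 4, 4) regression of equation (1) can be set up, the Python fragment below builds the lag matrix and fits it by OLS. The series are synthetic stand-ins (the paper's 1947-89 data are not reproduced here), so only the mechanics, not the estimates, carry over:

```python
import numpy as np

def adl_design(q, p, y, m0=4, m1=4, m2=4):
    """Build the ADL(m0, m1, m2) regressor matrix of equation (1):
    an intercept, q lagged 1..m0, and current plus lagged p and y."""
    m = max(m0, m1, m2)
    rows, target = [], []
    for t in range(m, len(q)):
        row = [1.0]
        row += [q[t - j] for j in range(1, m0 + 1)]   # lagged consumption
        row += [p[t - j] for j in range(m1 + 1)]      # current + lagged price
        row += [y[t - j] for j in range(m2 + 1)]      # current + lagged GNP
        rows.append(row)
        target.append(q[t])
    return np.array(rows), np.array(target)

# Synthetic stand-in series of the same length as the 1947-89 sample
rng = np.random.default_rng(0)
T = 43
p = np.cumsum(rng.normal(0.0, 0.1, T))                # log real oil price
y = np.linspace(7.0, 8.0, T) + rng.normal(0.0, 0.02, T)  # log real GNP
q = np.zeros(T)                                       # log consumption
for t in range(1, T):
    q[t] = 0.2 + 0.5 * q[t - 1] - 0.05 * p[t] + 0.4 * y[t] + rng.normal(0.0, 0.01)

X, qt = adl_design(q, p, y)
coef, *_ = np.linalg.lstsq(X, qt, rcond=None)         # OLS estimates
```

With four lags on each variable the design matrix has the paper's k = 15 columns, and the 43 synthetic observations leave 39 usable rows after lagging.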

This unrestricted general model is then progressively simplified by sequential testing of individual parameter restrictions suggested by the data. Potential restrictions are usually identified by looking at the magnitudes of individual parameter estimates, as well as their standard errors. Parameters whose estimates involve large standard errors can usually be set to zero. Often the estimated coefficient for one lagged value of a variable will be close in magnitude to the estimate for a neighboring lag, suggesting the use of a single common parameter. Should the estimates be of similar magnitude yet opposite in sign, then perhaps differencing may be acceptable. Restrictions are tested one at a time, and if accepted, all subsequent testing will include those restrictions in the model.

The criteria for accepting any particular restriction are: (i) it must lower the estimated standard error of the regression (SE); (ii) it cannot induce non-randomness in the residuals (as measured by the Lagrange multiplier (LM) test for serial correlation proposed by Godfrey [11]); and (iii) it cannot cause "predictive failure" outside the estimation period (as measured by the Chow test for predictive stability or parameter constancy [12, 222]). In order to implement this third test, we hold back the last four observations (1986-89), basing all subsequent results on the estimation period 1961-85.(4)

[Tabular data omitted in source.]
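The predictive-failure test in criterion (iii) can be sketched as follows, assuming the standard form of the Chow constancy statistic (pooled-sample versus estimation-period sums of squared residuals); the usage data are synthetic:

```python
import numpy as np

def ssr(X, y):
    """Sum of squared OLS residuals."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return float(e @ e)

def chow_predictive(X, y, holdout, k=None):
    """Chow test of post-sample parameter constancy: estimate on the first
    T1 = T - holdout observations, then compare the pooled-sample SSR with
    the estimation-period SSR.  Under the null of constancy the statistic
    is distributed F(holdout, T1 - k)."""
    T = len(y)
    T1 = T - holdout
    k = X.shape[1] if k is None else k
    ssr_est = ssr(X[:T1], y[:T1])
    ssr_pooled = ssr(X, y)
    return ((ssr_pooled - ssr_est) / holdout) / (ssr_est / (T1 - k))

# Usage on a stable synthetic relationship: 29 obs, hold back the last 4
rng = np.random.default_rng(2)
x = rng.normal(size=29)
y_obs = 1.0 + 0.5 * x + rng.normal(scale=0.1, size=29)
F = chow_predictive(np.column_stack([np.ones(29), x]), y_obs, holdout=4)
```

Because the pooled SSR can never fall below the estimation-period SSR, the statistic is non-negative; a value above the F critical point signals predictive failure.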

Results from the OLS estimation of the unrestricted general model of equation (1) are presented in Table I, along with the relevant diagnostic test statistics. LM(s) is Godfrey's LM test for autocorrelation of order s against a null hypothesis of serial independence, which is asymptotically distributed as $\chi^2(s)$, and F(4, 25 - k) is the Chow test for predictive stability, where k is the number of estimated coefficients in the model (k = 15 in the unrestricted model). Sums of the estimated coefficients for each variable (q, p, y) are also shown for easy calculation of the long run elasticities. In particular, the short run elasticities for p and y are given by the estimated coefficients on the zeroth lag terms ($\hat{\beta}_0$ and $\hat{\gamma}_0$), while the long run elasticities are calculated as:

$$\eta_p^{LR} = \frac{\sum_{j=0}^{m_1} \hat{\beta}_j}{1 - \sum_{j=1}^{m_0} \hat{\delta}_j}$$

and

$$\eta_y^{LR} = \frac{\sum_{j=0}^{m_2} \hat{\gamma}_j}{1 - \sum_{j=1}^{m_0} \hat{\delta}_j}.$$
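The long run elasticities just described divide each coefficient sum by one minus the sum of the lagged-consumption coefficients. The sums below are hypothetical placeholders (Table I is omitted in the source), chosen only so that the price calculation lands near the restricted model's long run value of about -0.17 reported in the conclusions:

```python
# Hypothetical coefficient sums, for illustration only (not the paper's)
sum_delta = 0.50     # sum of coefficients on lagged consumption
sum_beta = -0.085    # sum of price coefficients
sum_gamma = 0.45     # sum of GNP coefficients

# Long run elasticity = coefficient sum / (1 - sum of lagged-q coefficients)
lr_price = sum_beta / (1 - sum_delta)   # -0.17
lr_gnp = sum_gamma / (1 - sum_delta)    # 0.90
```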

These unrestricted estimates serve at least two purposes: (i) they provide a statistical benchmark for evaluating the acceptability of restricted variants of the general model of equation (1); (ii) they give us unrestricted elasticity estimates that will serve as a guide against imposing invalid restrictions in the paring-down process of model selection. The estimated price elasticities are comparable to existing single-equation estimates for the short run, yet are considerably smaller for the long run, reflecting a somewhat shorter lag in the full price response. Both the short run and long run GNP elasticity estimates are well within the range of existing results.

The two diagnostic tests for serial correlation and post-sample parameter constancy reveal a stable model that probably suffers from "over-fitting" or over-parameterization. Progressive reduction of the LM test statistic in the restricted models which follow confirms this suspicion.

[Tabular data omitted in source.]
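A minimal sketch of the LM statistic tracked here, assuming the standard auxiliary-regression form of Godfrey's test (OLS residuals regressed on the original regressors plus s lagged residuals, with T times the auxiliary R-squared asymptotically chi-squared(s) under serial independence); the usage series are synthetic:

```python
import numpy as np

def godfrey_lm(X, y, s):
    """Godfrey LM test for serial correlation up to order s, in the
    standard auxiliary-regression form: T * R^2 ~ chi^2(s) under the null."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b                       # OLS residuals
    T = len(e)
    lagged = np.column_stack(            # s lagged residuals, zero-padded
        [np.concatenate([np.zeros(j), e[:-j]]) for j in range(1, s + 1)]
    )
    Z = np.column_stack([X, lagged])     # auxiliary regressors
    g, *_ = np.linalg.lstsq(Z, e, rcond=None)
    resid = e - Z @ g
    r2 = 1.0 - (resid @ resid) / ((e - e.mean()) @ (e - e.mean()))
    return T * r2

# Usage: serially independent errors should give a small statistic
rng = np.random.default_rng(3)
x = rng.normal(size=50)
y_obs = 2.0 + 1.5 * x + rng.normal(size=50)
stat = godfrey_lm(np.column_stack([np.ones(50), x]), y_obs, s=4)
```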

We now proceed to simplify the general model by imposing data-instigated restrictions on the set of coefficients. Table II provides a sequential list of acceptable parameter restrictions and the corresponding values of SE, LM(4) and F(4, 25 - k). The final restricted model is (with t-statistics in parentheses):

[Mathematical expression omitted.]

T = 25, SE = 0.0089098, Adj. $R^2$ = 0.999252, LM(4) = 7.217, F(4, 18) = 2.775.

This restricted ADL model displays no evidence of serial correlation in its residuals, is stable over the post-sample period, and involves a reduction in SE of 20 percent over the general model.

The unscrambled coefficients from the restricted ADL model in (2) are given in Table III. From these, the short run and long run elasticities are easily seen to be:

[Mathematical expressions omitted.]

which are very close to those found in the unrestricted model, suggesting that the restrictions inherent in the final model are valid.

As a point of interest, the restricted model was re-estimated over the longer sample period of 1961-89. The summary statistics over this longer sample period were T = 29, SE = 0.0102473, adjusted $R^2$ = 0.998872 and LM(4) = 4.823. The resulting unscrambled coefficients are also shown in Table III, yielding the following implied elasticities:

[Mathematical expressions and tabular data omitted in source.]

which show a slightly more responsive price effect in the long run but otherwise are essentially the same as those obtained over 1961-85.

In conclusion, the general-to-simple model selection procedure has produced a rather parsimonious (only 7 parameters to be estimated) representation of the data generation process that appears to adequately capture the dynamics of U.S. petroleum consumption. Rather than continuing to compare these estimates to those obtained in other studies using different specifications and different data sets, in the next section we pursue the simple-to-general modeling approach over the same sample period. This will isolate the role that model specification, apart from data differences, plays in estimating price and GNP elasticities for U.S. petroleum consumption.

IV. Simple-to-General Modeling Approach

Given the same set of annual observations on U.S. petroleum consumption, the real price of oil and real GNP described in the data section above, most applied researchers seeking a parsimonious, single-equation econometric model would not have followed the general-to-simple selection process of the previous section. Instead, they would have chosen some familiar specification they have been trained to use or find easy to implement. Some would begin to determine an "optimal" PDL structure for the price term, assuming away any lags on the GNP response [4; 5; 6; 9; 17]. Others might appeal to other researchers' success with a popular specification, such as the partial adjustment model [3]. Still others might want to start out by estimating a simple static model, with the strong expectation of having to upgrade it to at least an AR(1) model.

PDL Model

We begin by estimating a simple model of U.S. petroleum consumption with a PDL($n$, $r$) structure on price but an instantaneous response to GNP:

$$q_t = \alpha + \gamma y_t + \sum_{i=0}^{n} \beta_i p_{t-i} + u_t \qquad (3)$$

where the $\beta_i$ are constrained to lie on a polynomial of degree $r$ in the lag index $i$, and where $n$ (the lag length) and $r$ (the order of the polynomial distributed lag structure on price) are two values to be determined. The instantaneous GNP elasticity is given by $\gamma$ and the price elasticity is given by $\beta_0$ for the short run and $\sum_i \beta_i$ for the long run.
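The polynomial constraint is what makes the PDL estimable with few parameters. Below is a sketch of the textbook Almon construction; the paper's own polynomial parameterization is not reproduced in the source (its reported theta values look like endpoint weights), so the direct power basis used here is an assumption for illustration:

```python
import numpy as np

def almon_betas(theta, n):
    """Recover the n+1 lag weights from polynomial coefficients theta under
    the textbook Almon constraint beta_i = sum_j theta_j * i**j."""
    return np.array(
        [sum(th * i ** j for j, th in enumerate(theta)) for i in range(n + 1)]
    )

def almon_regressors(p, n, r):
    """Collapse the n+1 price lags into r+1 constructed regressors
    z_{t,j} = sum_i i**j * p_{t-i}; regressing on these (plus the other
    variables) imposes the polynomial constraint on the betas."""
    p = np.asarray(p, float)
    return np.array(
        [[sum(i ** j * p[t - i] for i in range(n + 1)) for j in range(r + 1)]
         for t in range(n, len(p))]
    )

# A first-degree polynomial through the endpoint estimates reported in
# Table IV (beta_0 = -0.024, beta_12 = -0.051) has slope -0.00225 per lag:
betas = almon_betas([-0.024, -0.00225], n=12)
```

Under this linear constraint the thirteen lag coefficients collapse to two free parameters, matching the PDL(12, 1) choice discussed below.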

Table IV. AR(2) PDL(12, 1) Model Results, 1961-1985

| Parameter | Estimate | t-statistic |
|---|---|---|
| $\alpha$ | 1.141 | (6.25) |
| $\gamma$ | 1.029 | (19.20) |
| $\beta_0$ | -0.024 | (3.48) |
| $\beta_1$ | -0.027 | (4.62) |
| $\beta_2$ | -0.029 | (6.25) |
| $\beta_3$ | -0.031 | (8.50) |
| $\beta_4$ | -0.033 | (10.87) |
| $\beta_5$ | -0.035 | (11.24) |
| $\beta_6$ | -0.037 | (10.24) |
| $\beta_7$ | -0.040 | (8.56) |
| $\beta_8$ | -0.042 | (7.22) |
| $\beta_9$ | -0.044 | (6.25) |
| $\beta_{10}$ | -0.046 | (5.53) |
| $\beta_{11}$ | -0.048 | (4.99) |
| $\beta_{12}$ | -0.051 | (4.58) |
| $\sum_i \beta_i$ | -0.487 | -- |
| $\theta_0$ | -0.024 | (3.48) |
| $\theta_1$ | -0.051 | (4.58) |
| $\rho_1$ | 1.195 | -- |
| $\rho_2$ | -0.622 | -- |

T = 25, SE = 0.0155485, Adj. $R^2$ = 0.964346, DW = 2.140, Q*(1) = 0.18, Q*(4) = 3.24, F(4, 21) = 1.609.

As before, we held back the last 4 observations (1986-89) for post-sample testing. This left a maximum possible estimation sample of 1947-85 (T = 39), or a maximum unconstrained lag on price of 17 years. The optimal lag length ($n \le 17$) was determined by finding the number of lags which gave the highest value of adjusted $R^2$ [14, 357]. In the present case, an unconstrained lag of 12 years gave the highest value of adjusted $R^2$ = 0.982517, with an inconclusive value for the DW statistic of 1.182. Next, the optimal order of the polynomial lag structure ($r \le n - 1$) was determined by starting with an 11th order polynomial (with no endpoint restrictions) and progressively reducing the order by one until we were unable to reject the hypothesis that the last term in the polynomial was zero [14, 357]. This procedure was followed using the previously determined optimal 12 year lag length over the longest possible sample period of 1959-85 and resulted in an optimal first-degree polynomial (r = 1).

Unfortunately, this PDL(12, 1) model showed clear evidence of serial correlation in its residuals: the DW statistic was only 0.472 and the Ljung-Box (Q*(s)) portmanteau test statistics for serial independence (distributed as $\chi^2(s)$) were significant from the very first lag (s = 1).(5) Subsequent re-estimation using a Cochrane-Orcutt iterative AR(2) procedure seemed to correct this problem, yielding insignificant Q* statistics out to the fourth lag and a respectable DW statistic of 2.140.
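The Cochrane-Orcutt iteration referred to here alternates between estimating the regression coefficients on quasi-differenced data and re-estimating the autoregressive parameter from the residuals. A minimal AR(1) sketch follows (the AR(2) case used above generalizes in the obvious way); the usage data are synthetic:

```python
import numpy as np

def cochrane_orcutt(X, y, tol=1e-8, max_iter=200):
    """Iterative Cochrane-Orcutt AR(1) correction: estimate rho from the
    current residuals, quasi-difference the data, re-fit by OLS, repeat."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)    # start from plain OLS
    rho = 0.0
    for _ in range(max_iter):
        e = y - X @ b                            # residuals at current b
        rho_new = (e[1:] @ e[:-1]) / (e[:-1] @ e[:-1])
        Xs = X[1:] - rho_new * X[:-1]            # quasi-differenced data
        ys = y[1:] - rho_new * y[:-1]
        b, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
        if abs(rho_new - rho) < tol:
            break
        rho = rho_new
    return b, rho_new

# Usage: regression with AR(1) errors, true rho = 0.7 and slope = 2.0
rng = np.random.default_rng(1)
T = 200
x = rng.normal(size=T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.7 * u[t - 1] + rng.normal(scale=0.1)
y_obs = 1.0 + 2.0 * x + u
b, rho = cochrane_orcutt(np.column_stack([np.ones(T), x]), y_obs)
```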

All parameter estimates were of reasonable magnitudes, had the "correct" or expected signs and were statistically significant at the 1 percent level. The short run price elasticity was very small (-0.024), but with a lengthy 12 year lag was able to rise to -0.487 in the long run. The instantaneous GNP elasticity was once again close to unity (1.029).(6)

By assuming no change in the autoregressive parameters $\rho_1$ and $\rho_2$ over 1986-89, a post-sample Chow test of parameter constancy was also performed.(7) The test statistic of 1.609 is distributed as F(4, 21) and is not significant at the 5 percent level, showing that the model does not appear to suffer from "predictive failure" as defined by Hendry. Thus, from the perspective of the simple-to-general approach, we seem to have found an acceptable model.

Partial Adjustment Model

However, not everyone wanting to analyze U.S. petroleum consumption with a single-equation model would have chosen to fit a simple PDL structure on price. The partial adjustment model is another very popular dynamic specification that we could have used. A simple partial adjustment model would take the form:

$$q_t = \phi_0 + \phi_1 p_t + \phi_2 y_t + \phi_3 q_{t-1} + u_t \qquad (4)$$

where $0 < \phi_3 < 1$. In this well-known specification, the short run price elasticity is $\phi_1$, the short run GNP elasticity is $\phi_2$ and the long run elasticities are $\phi_1/(1 - \phi_3)$ and $\phi_2/(1 - \phi_3)$, respectively. Unlike the PDL model, changes in both price and GNP cause lagged responses in consumption.

OLS estimation of (4) over 1961-85 yielded the following results:

[Mathematical expression omitted.]

T = 25, SE = 0.0247197, Adj. $R^2$ = 0.983455, LM(1) = 4.029, F(4, 21) = 1.469.

While the parameter estimates seem reasonable and are all statistically significant, the residuals show some evidence of first order autocorrelation, since Godfrey's LM test statistic exceeds the critical value of 3.84 for a $\chi^2(1)$ variate at the 5 percent level.

"Correcting" for first-order autocorrelation using the Cochrane-Orcutt iterative procedure produced a value for ||Rho~.sub.1~ = 0.896. With this transformation the LM test statistic is now insignificant, so the problem seems to be solved. The AR(1) partial adjustment model is:

[Mathematical expression omitted.]

T = 25, SE = 0.0212301, Adj. $R^2$ = 0.686825, LM(1) = 2.366, F(4, 21) = 1.355

where the post-sample F-test of parameter constancy assumes no change in $\rho_1$ over 1986-89.

The final parameter estimates seem acceptable, although the speed of adjustment parameter (0.590) suggests a much faster adjustment process than that of the OLS model. The short run price elasticity is -0.114, somewhat larger than other estimates, with a comparatively low long run price elasticity of -0.278. On the other hand, the short run GNP elasticity is low at 0.416 although the long run estimate is very close to the consensus value of unity at 1.015. Finally, the parameters appear to be constant through the post-sample period, giving the model predictive stability. Once again, we seem to have obtained an acceptable model by an entirely different approach.
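As an arithmetic check on the reported AR(1) partial adjustment estimates (short run elasticities -0.114 and 0.416, with a coefficient on lagged consumption of 0.590), the long run values follow directly from the formulas of equation (4):

```python
# Reported AR(1) partial adjustment estimates
phi1 = -0.114   # short run price elasticity
phi2 = 0.416    # short run GNP elasticity
phi3 = 0.590    # coefficient on lagged consumption

# Long run elasticities: divide by one minus the adjustment coefficient
lr_price = phi1 / (1 - phi3)   # about -0.278, as reported
lr_gnp = phi2 / (1 - phi3)     # about 1.015, as reported
```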

Static Model

The last simple-to-general modeling alternative we follow begins with OLS estimation of a purely static specification, yielding the following results:

[Mathematical expression omitted.]

T = 25, SE = 0.0884227, Adj. $R^2$ = 0.788308, DW = 0.142, F(4, 22) = 3.781.

Although the parameter estimates seem reasonable and are significant, the extremely low DW statistic suggests serially correlated residuals if not outright mis-specification, and the significant F-statistic for post-sample stability indicates predictive failure.

In an attempt to correct this problem we again used the Cochrane-Orcutt iterative procedure to determine the value of the first order autoregressive parameter $\rho_1$. However, this yielded a value of $\rho_1$ in excess of one ($\rho_1$ = 1.112), which violates the stationarity condition that $\rho_1$ lie inside the unit circle. A first differences model was then estimated as an alternative:

[Mathematical expression omitted.]

T = 25, SE = 0.0295252, Adj. $R^2$ = 0.510185, DW = 0.454, F(4, 22) = 0.158

which represents some improvement over the static model (7) but still appears to suffer from serial correlation. Adjusting this first differences model for first-order autocorrelation gives $\rho_1$ = 0.800 and the following results:

[Mathematical expression omitted.]

T = 25, SE = 0.0341131, Adj. $R^2$ = 0.704120, DW = 2.186, F(4, 22) = 0.628

where the post-sample test again assumes a constant value of $\rho_1$ over the 1986-89 period. By all appearances, this AR(1) first differences model is thus also an acceptable specification.

V. Model Comparisons

At this point, we have selected one restricted ADL model (equation (2)) using the general-to-simple approach, and three very different dynamic specifications ((i) the AR(2) PDL(12, 1) model; (ii) the AR(1) partial adjustment model (equation (6)); and (iii) the AR(1) first differences model (equation (9))) by following different paths in a simple-to-general approach. All four chosen models appear to be free of serial correlation and exhibit post-sample stability. All four models yielded statistically significant parameter estimates with the correct signs and reasonable magnitudes. However, the implied elasticities and lengths of lag responses differed substantially across the four models. Which model should we use?

[Tabular data omitted in source.]

As an initial comparison we looked at the ex post forecasting properties of the four chosen models. Table V reports five commonly used measures of forecast error for ex post forecasts over the post-sample period 1986-89: (1) root mean squared error (RMSE); (2) root mean squared percent error (RMSPE); (3) mean absolute error (MAE); (4) mean absolute percent error (MAPE); and (5) Theil's inequality coefficient (U).(8) Over this brief period of only 4 observations, the mean forecast errors were only about 0.2 to 0.3 percent for all but the partial adjustment model, which produced forecast errors of over 9 percent as well as a substantially higher value for Theil's U. On this basis we might prefer any one of the first three models, with either the PDL or first differences model having a slight edge over the restricted ADL model.
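The five error measures can be computed as below, taking Theil's U in the Pindyck-Rubinfeld form (RMSE scaled by the sum of the root mean squares of the actual and forecast series), which keeps U between 0 and 1; the two-point usage example is purely illustrative:

```python
import numpy as np

def forecast_errors(actual, forecast):
    """The five forecast error measures compared in Tables V and VI.
    Theil's U here is the Pindyck-Rubinfeld inequality coefficient."""
    a = np.asarray(actual, float)
    f = np.asarray(forecast, float)
    rmse = np.sqrt(np.mean((f - a) ** 2))
    rmspe = np.sqrt(np.mean(((f - a) / a) ** 2)) * 100
    mae = np.mean(np.abs(f - a))
    mape = np.mean(np.abs((f - a) / a)) * 100
    u = rmse / (np.sqrt(np.mean(a ** 2)) + np.sqrt(np.mean(f ** 2)))
    return {"RMSE": rmse, "RMSPE": rmspe, "MAE": mae, "MAPE": mape, "U": u}

# Illustrative two-point example: errors of 2.0 and 0.0
m = forecast_errors([100.0, 110.0], [102.0, 110.0])
```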

To further discriminate between these four models, we performed historical simulations over the estimation sample period of 1961-85 and report the same five error measures as before in Table VI. Over this longer period the partial adjustment model still has the poorest performance, with mean errors almost identical to those over 1986-89. However, the other three models' performances are no longer so similar. The restricted ADL model now yields mean errors roughly half the size of those for the PDL model and one-third the size of those for the first differences model, as well as the lowest value for Theil's U. The restricted ADL model now clearly outperforms all other specifications considered here.

As a final comparison, it is possible to recognize the partial adjustment and first differences models as special cases of the general ADL(4, 4, 4) model of equation (1) and test whether or not the restrictions implicit in each specification are supported by the data. The long 12 year lag on price in the PDL model makes it impossible to test in this way, since it is not a nested case of the general model. Nevertheless, for the same number of observations (25) the chosen PDL model has a SE of 0.0155485, which is almost 40 percent higher than the SE of the unrestricted general model of 0.0111443, suggesting that it would not be a data-acceptable simplification.

The AR(1) partial adjustment model of equation (6) is observationally equivalent to a nested case of the general ADL(4, 4, 4) model, in particular the ADL(2, 1, 1) model:

$$q_t = \alpha + \delta_1 q_{t-1} + \delta_2 q_{t-2} + \beta_0 p_t + \beta_1 p_{t-1} + \gamma_0 y_t + \gamma_1 y_{t-1} + u_t. \qquad (10)$$

This equivalence can be shown by taking quasi-first differences of the basic partial adjustment model of equation (4):

$$q_t = \phi_0 (1 - \rho_1) + (\phi_3 + \rho_1) q_{t-1} - \phi_3 \rho_1 q_{t-2} + \phi_1 p_t - \phi_1 \rho_1 p_{t-1} + \phi_2 y_t - \phi_2 \rho_1 y_{t-1} + u_t \qquad (11)$$

which has 5 parameters ($\phi_0$, $\phi_1$, $\phi_2$, $\phi_3$, $\rho_1$) to be estimated against 7 parameters in equation (10). This implies that the AR(1) partial adjustment model imposes 2 nonlinear restrictions on the ADL(2, 1, 1) model of equation (10). To test these 2 nonlinear restrictions, we use a likelihood ratio test statistic (LR), which is asymptotically distributed as $\chi^2(2)$. The calculated value of LR = 17.06 exceeds the critical value of 9.21 at the 1 percent level, so the restrictions implied by the AR(1) partial adjustment model are rejected.
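One common form of the LR statistic for nested linear models with normal errors is T times the log of the ratio of restricted to unrestricted sums of squared residuals. The SSR values below are illustrative (the actual values are not reported in the source), chosen to land near the paper's LR = 17.06:

```python
import math

def lr_test(ssr_restricted, ssr_unrestricted, T):
    """Likelihood ratio statistic for nested linear models with normal
    errors: LR = T * ln(SSR_r / SSR_u), asymptotically chi^2 with degrees
    of freedom equal to the number of restrictions."""
    return T * math.log(ssr_restricted / ssr_unrestricted)

# Illustrative: with T = 25, an SSR ratio near 1.98 gives a statistic
# close to the reported LR = 17.06
stat = lr_test(ssr_restricted=1.98, ssr_unrestricted=1.0, T=25)
```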

Of course, this test is immaterial if the underlying restrictions implied by the ADL(2, 1, 1) model of equation (10) are not first found to be data-acceptable. This set of 8 linear restrictions ($\beta_2 = \beta_3 = \beta_4 = \gamma_2 = \gamma_3 = \gamma_4 = \delta_3 = \delta_4 = 0$) is easily tested with a standard F-test, being distributed as F(8, 10). The calculated value is 3.57, which exceeds the critical value of 3.07 at the 5 percent level. Furthermore, the ADL(2, 1, 1) model has a SE = 0.0163026, which is almost 50 percent greater than that for the general ADL(4, 4, 4) model, so equation (10) is not itself a data-acceptable simplification. We conclude that the AR(1) partial adjustment model is a data-rejectable specification for this sample.
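The F-test of linear restrictions takes the familiar sum-of-squared-residuals form. The degrees of freedom below match the test in the text (8 restrictions on the ADL(4, 4, 4) with k = 15 and T = 25, giving F(8, 10)), while the SSR values themselves are illustrative stand-ins:

```python
def f_test_restrictions(ssr_r, ssr_u, m, T, k):
    """F-test of m linear restrictions: F = ((SSR_r - SSR_u)/m) /
    (SSR_u/(T - k)), distributed F(m, T - k) under the null, where k is
    the number of coefficients in the unrestricted model."""
    return ((ssr_r - ssr_u) / m) / (ssr_u / (T - k))

# Illustrative SSR ratio; degrees of freedom as in the text: F(8, 10)
stat = f_test_restrictions(ssr_r=3.86, ssr_u=1.0, m=8, T=25, k=15)
```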

Similarly, the AR(1) first differences model of equation (9) can be seen as observationally equivalent to an ADL(2, 2, 2) model. This nested case would have 9 separate parameters while the AR(1) first differences model has only 4, implying 5 nonlinear restrictions that can again be tested by a LR test statistic (assuming the ADL(2, 2, 2) model is itself data-acceptable). The calculated value of LR is 14.02, being distributed as $\chi^2(5)$, which exceeds the critical value of 11.1 at the 5 percent level, so these 5 restrictions can be rejected. However, the nested ADL(2, 2, 2) is not an acceptable simplification, as its SE of 0.0164392 is again roughly 50 percent above the SE for the general model, and an F-test of the 6 linear restrictions implied by an ADL(2, 2, 2) model ($\beta_3 = \beta_4 = \gamma_3 = \gamma_4 = \delta_3 = \delta_4 = 0$) gives a statistic of F(6, 10) = 4.14, surpassing the critical value of 3.22 at the 5 percent level. Therefore, the AR(1) first differences model must also be data-rejectable, comprising a set of invalid parameter restrictions.(9)

VI. Conclusions

When choosing an econometric model specification for a single-equation study of aggregate U.S. petroleum consumption, the researcher seeks a parsimonious, easily estimated model that will provide unbiased price and income elasticity estimates and yield accurate forecasts. In contrast to existing studies, we have used Hendry's general-to-simple specification search technique and annual data (1961-89) to obtain a restricted, data-acceptable simplification of a general ADL model. This restricted ADL model yielded GNP and short run price elasticities near the consensus estimates, but a long run price elasticity (-0.17) that is substantially smaller than existing estimates.

Comparisons with three other seemingly acceptable alternative models that were chosen via the simple-to-general modeling approach showed that popular model specifications often involve untested parameter restrictions that cannot be accepted. In addition, such models may also have poorer forecasting performance, with the widely used partial adjustment model found to have the largest mean forecast errors (nearly 10 percent) of all those considered here.

Therefore, the untested acceptance of a popular dynamic specification which yields "reasonable" and significant estimates can lead to the selection of a model which is not consistent with the data (i.e., is data-rejectable). Parameter estimates from such data-rejectable models may give very misleading indications of the dynamic nature of the behavioral relationships being modeled. In the present case, the long run price elasticity appears to have been over-estimated by such models, while the speed of the adjustment process may have been under-estimated. Energy policies which assumed a lingering, substantial consumption response to an oil price change could be confounded by an actual response that was much smaller and largely complete in just a few years.

Finally, forecasts of future U.S. petroleum consumption levels are often required for policy planning purposes. While the accuracy of any petroleum consumption forecast will depend heavily upon the assumptions made about future world oil prices and GNP levels, using a forecasting model that is not data-acceptable can lead to sizable errors. Selection of a data-acceptable model that does not suffer from predictive failure over existing out-of-sample data points can help to minimize forecasting errors for policy makers and other analysts. Based on the results presented here, the general-to-simple approach appears to offer a promising methodology for generating superior forecasting models of petroleum consumption and other energy use patterns.

Appendix

TABULAR DATA OMITTED

1. An excellent survey of these results may be found in Bohi [2, 159].

2. For a comparison of Hendry's econometric methodology with the traditional, simple-to-general ("North American") approach, see Gilbert [10].

3. A formal F-test of H₀: ADL(4, 4, 4) vs. Hₐ: ADL(5, 5, 5) yielded a test statistic of 0.701, which is distributed as F(3, 7) and falls well short of the critical value of 4.35 at the 5 percent level.

4. With four lags, this implies using observations reaching back to 1957. Obviously, with data going as far back as 1947, we could have begun the estimation period in 1951. We started in 1961 so that these results would be comparable to those in the next section using alternative specifications requiring longer lags, in particular the AR(2) PDL model.

5. We can use the familiar DW and Ljung-Box portmanteau test statistics here rather than Godfrey's LM test since the PDL model has no lagged dependent variables. Otherwise, we report the LM test statistic, since the Ljung-Box statistics have been shown to be inappropriate for models with lagged dependent variables [8].

6. For comparison, Gately and Rappoport [9] estimated an AR(2) PDL(10, 3) model from annual data over 1949-85 and found a much larger (although still small) short run price elasticity of -0.068, a smaller long run price elasticity of -0.364, and a smaller GNP elasticity of 0.689. However, our elasticities are very similar to those reported by Walls [17], who estimated a nearly identical AR(2) PDL(10, 1) model from annual data over 1946-87.

7. Complete re-estimation over 1961-89 yielded almost identical values for these two parameters (ρ₁ = 1.182 and ρ₂ = -0.597), so this necessary assumption seemed reasonable.

8. An explanation of each of these forecast error measures may be found in Pindyck and Rubinfeld [16, 338-40].

9. As a matter of curiosity, the OLS partial adjustment model and the OLS first differences model were also found to be data-rejectable.

References

1. Beenstock, Michael and Patrick Willcocks, "Energy Consumption and Economic Activity in Industrialized Countries: The Dynamic Aggregate Time Series Relationship." Energy Economics, October 1981, 225-32.

2. Bohi, Douglas R. Analyzing Demand Behavior: A Study of Energy Elasticities. Baltimore: Johns Hopkins University Press, 1981.

3. Bopp, Anthony E., "Tests for Structural Change in U.S. Oil Consumption, 1967-82." Energy Economics, October 1984, 223-30.

4. Brown, Scott B., "An Aggregate Petroleum Consumption Model." Energy Economics, January 1983, 27-30.

5. Brown, Stephen P. A. and Keith R. Phillips, "Oil Demand and Prices in the 1990s." Federal Reserve Bank of Dallas Economic Review, January 1989, 1-8.

6. ----- and -----, "U.S. Oil Demand and Conservation." Contemporary Policy Issues, January 1991, 67-72.

7. Cuthbertson, Keith and Paul Richards, "An Econometric Study of the Demand for First and Second Class Inland Letter Services." Review of Economics and Statistics, November 1990, 640-48.

8. Dezhbakhsh, Hashem, "The Inappropriate Use of Serial Correlation Tests in Dynamic Linear Models." Review of Economics and Statistics, February 1990, 126-32.

9. Gately, Dermot and Peter Rappoport, "The Adjustment of U.S. Oil Demand to the Price Increases of the 1970s." The Energy Journal, April 1988, 93-107.

10. Gilbert, Christopher L., "Professor Hendry's Econometric Methodology." Oxford Bulletin of Economics and Statistics, August 1986, 283-307.

11. Godfrey, Leslie G., "Testing for Higher Order Serial Correlation in Regression Equations When the Regressors Include Lagged Dependent Variables." Econometrica, November 1978, 1303-10.

12. Hendry, David F. "Predictive Failure and Econometric Modelling in Macroeconomics: The Transactions Demand for Money," in London Business School Conference on Economic Modelling, edited by P. Ormerod. London: Heinemann, 1979, 217-42.

13. Leamer, Edward E., "False Models and Post-Data Model Construction." Journal of the American Statistical Association, March 1974, 122-31.

14. Maddala, G. S. Introduction to Econometrics. New York: Macmillan, 1988.

15. Mizon, Grayham E. and David F. Hendry, "An Empirical Application and Monte Carlo Analysis of Tests of Dynamic Specification." Review of Economic Studies, 47(1980), 21-45.

16. Pindyck, Robert S. and Daniel L. Rubinfeld. Econometric Models and Economic Forecasts. New York: McGraw-Hill, 1991.

17. Walls, Margaret A., "Dynamic Firm Behavior and Regional Deadweight Losses from a U.S. Oil Import Fee." Southern Economic Journal, October 1990, 772-88.

Author: Clifton T. Jones

Publication: Southern Economic Journal, April 1, 1993