Bayesian VAR Forecasts Fail to Live Up to Their Promise.
Macroeconomic forecasts from vector autoregression (VAR) models, in which the data are combined with Bayesian prior distributions on the coefficients, gained great popularity in the 1980s. The prior distribution known as the "Minnesota Prior" was used to make forecasts starting in 1980. These forecasts seemed, for a time, to not only equal but even to surpass those of the consulting groups selling forecasts based on large, judgmentally adjusted econometric models. Using actual forecasts made by the group then called DRI between 1981 and mid-1996, we find the forecasts based on the "Minnesota Prior" did not continue their early success, even when they are averaged with the DRI forecasts.
Why should business economists care about comparative forecasting performance? This paper addresses the forecasting needs of a firm too small to have its own in-house forecasting staff, but with at least one employee capable of applying the unconditional "Bayesian" vector autoregressive (BVAR) forecasting methods described by Litterman (1986c). This employee might typically make forecasts using the program RATS. Alternatively, this employee, perhaps in concert with others, could subscribe to services provided by an econometric consulting firm. Such services, in addition to giving access to an extensive machine-readable database, typically publish a monthly document with quarterly forecasts of many time series for horizons generally extending at least two years ahead. The in-house economist could use the database to build his or her own forecasting model, use the service's forecasts directly, or both.
The choice is not a trivial one--for at least twenty years leading economists and econometricians in business, government and academia have argued both sides of this issue. It is important because no business decision maker can avoid making forecasts. Decisions about stocks of raw materials, goods in process and finished goods, among other things, must be based on forecasts. The advent of the Internet does not change this necessity: at best the process is speeded up.
To address this choice, imagine it is 1981 and Litterman's method has just become known. Suppose it had been used for the next sixteen years, along with the DRI model, as a representative of commercially available large econometric models. Which would have had the better forecasting performance? As a third choice, we consider an average of the two forecast methods for each variable, for each base period and for each horizon.
Previous Forecast Comparison Studies
Francis X. Diebold (1998a) declared that large "structural" econometric models combined with judgmental adjustments are the "past" of macroeconomic forecasting, while ever more complex non-linear vector auto-regression (VAR) models are the "future." He asserted that "[t]he reports of the death of large-scale macroeconomic forecasting models are not exaggerated." In addition, Robert Litterman (1984, 1986a) declared that:
"... a statistical time series model has been developed that, for the first time, appears to generate forecasts that compare favorably in terms of accuracy with those generated by the best judgment of economic forecasters." (l986a, p.1)
This paper tests both of these assertions. Litterman's model, developed after intensive data mining, was asserted to be the first model to be suitable for forecasting with no intervention from its manager. When the model's inflation forecasts turned out to be seriously deficient, however, Litterman began to modify the specification in 1984 and soon afterward abandoned the exercise entirely (Sims, 1993, p.179).
Christopher A. Sims, who popularized VAR models starting in 1980 (Sims, 1980), took over Litterman's work. He found that after 1987 the model "had been making a sequence of same-signed errors in forecasting real GNP..." and so he "... decided ... to complicate the specification of the model in several ways" (Sims, 1993, p. 180). In doing so, he made a model so complex that one of the original goals of VARs, which was to have models so simple that uninitiated practitioners could build and use them on their own, was lost. Sims published ex-ante forecasts with this model for thirteen quarters.
John C. Robertson and Ellis W. Tallman of the Federal Reserve Bank of Atlanta, however, carried on and extended Litterman's original program (see Robertson and Tallman, 1999). They made monthly ex-ante forecasts using six VAR models, including two versions of Litterman's, for the period 1986-1997.
For several important variables of interest to business decision-makers, the different methods produce quite different results, as will be seen. The time-series methods that seemed so promising in 1985 did not produce superior results for the next ten years. This led Sims and other theorists to more complex methods, but these too have yet to produce better results. Perhaps ten years from now this will not be the case, but business decision-makers must make a choice today. 
Why Yet Another Study?
While it has been suggested that large, structural, judgmentally adjusted models are obsolete, it has also been suggested that VARs and BVARs may be combined with a large, judgmentally adjusted model to produce superior forecasts. (Lupoletti and Webb, 1986, p.284) On the other hand, Stephen K. McNees (1990) finds that when Sims' forecasts for 1986-89 are added to Litterman's from 1980:2 to 1985:4,
"[f]or four of the seven variables, the BVAR forecasts are distinctly inferior to the others (DRI, Georgia State, Michigan and WEFA). The forecasts of the narrow definitions of money are roughly as accurate as the adjusted forecasts. For the, other two variables, the relative performance of the BVAR model forecast depends on the forecast horizon" (p. 43).
McNees also writes: "The evidence presented here broadly confirms the conclusion that individuals adjust their models to compensate in part for their models' deficiencies, thereby improving the accuracy" (p. 52). Of course, model adjustment is considered by many to be unscientific, suggesting that forecasting is an art rather than a science.
Ray C. Fair and Robert J. Shiller (1990) attempt a scientific comparison of Fair's fully automatic model, as it existed in 1976, with four different BVAR specifications, a simple VAR, and a method they refer to as "autoregressive components" (AC). They forecast only real GNP, for one quarter ahead (actually the current quarter) and four quarters ahead, for the period 1976:3 through 1986:2. The method they use is the "encompassing" method, and any model with a significant coefficient in the encompassing equation is considered to carry "independent information."
"The fact that the forecasts from the Fair model are significant shows that they are not collinear ... and that differences between the Fair model and the other models are meaningful" (p. 386).
Fair and Shiller also conclude that "... it appears that the VAR and AC models do not contain a lot of information" (p. 386). They caution however, that their results are for only one period of ten years and one variable and that the structure of the economy may be changing.
Why another study of BVAR and big model forecasts?
For one thing, unrealistic claims for the accuracy of VARs as opposed to large structural models are still being made (see, for example, Diebold, 1998a). Second, in order to use a VAR or BVAR in forecasting, the estimation, model selection and forecasting must be "recursive" or "sequential", with the sample size changing each time a new observation is added (see Diebold, 1998b, p.111). Third, several more years of data have accumulated since the last test (we will use almost six more years).
Our purpose in this paper is very modest. We take Litterman's precise original six-equation specification, and generate "simulated ex-ante" quarterly forecasts for the sixty-two quarter period from 1981:1 to 1996:2. Each forecast is for the current quarter and seven "future" quarters. We then compare the mean absolute errors and root mean square errors for each variable and horizon, with similar errors from the Standard and Poor's/DRI model and also for an equally weighted combination of the two.
We use only simulated ex-ante data, which are required for actual combination forecasts. Although revised data may change economic history greatly and may be based on more accurate data (see Robertson and Tallman, 1998), policy decisions based on forecasts must use preliminary data. That is a major reason we use preliminary data to judge accuracy as well. Finally, we report for horizons of up to two years, like McNees but unlike others.
Robertson and Tallman (1999) provide an interesting comparison to our study. They use six different kinds of VARs, using ex-ante or simulated ex-ante data, for the period 1986-1997. One of these is the same unchanged Litterman model we use. But they provide no comparison to big models.
Victor Zarnowitz and Phillip Braun (1993) performed the most comprehensive study on forecast accuracy. But the only big model they use is the Michigan model, and their VAR forecasts suffer from ambiguities about the information set. Their conclusions about the value of consensus forecasts from many forecasters must be dealt with in another paper. 
Charles W. Bischoff (1989) outlines the procedure for making simulated ex-ante forecasts. All data are compiled so that they represent what was available when the historical "mid quarter" (also known as "45-day") GNP (or GDP) data were released. Monthly data (the money supply (M1), the unemployment rate, and the Treasury bill rate) are defined to include quantities reported in two different quarters in order to avoid giving an unfair advantage to the large model with judgmental forecasts to which Litterman's model is being compared.
Zarnowitz and Braun (1993) grappled with the problem of monthly, weekly and daily data available when a forecast is made on other than the first day of a quarter. They recorded VAR forecasts assuming either zero or all within-quarter data. There is a problem in discussing the definition of "error," especially when data are revised or aggregated over time. We try to alleviate the problem by including one month's data for the unemployment, M1 and Treasury bill data. This was less than ideal for two reasons.
First, the "45-day" data have become more like "57-day" data. Also, the bill rate is available daily and is never revised, while M1 is available with only a weekly lag. This gives a slight advantage to DRI over BVAR for the bill rate, unemployment, and M1. Fortunately, for all of these variables over most horizons the two sets of forecasts are not even close. However, for some horizons this is not true for the unemployment rate.
The second reason is even more critical. When we went to combine data, we were comparing apples to oranges for unemployment, M1 and the bill rate. We tried to overcome this by computing growth rates from heterogeneous bases. However, for short horizons there are still problems of unknown magnitude.
In our exercise Litterman's model is re-estimated every quarter, but the specification is never changed. Of course, Litterman probably never believed that this model could truly be used forever without any judgmental intervention in modifying the specification to react to events occurring after the original model was written down. In this sense, we are surely being unfair to him. But, because he did not prescribe any rules for future modification, the only way we can provide a truly rigorous test of his model as a scientific, non-judgmental specification is to adhere to the original concept. In any case, since our ultimate goal is to provide combination forecasts from two models developed using disparate philosophies, it is not necessary to have the best possible VAR model available at any given moment.
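The recursive design just described can be sketched in a few lines. The AR(1) stand-in below is purely illustrative (Litterman's model has six equations and longer lags, with a Bayesian prior); the point is only the mechanics: each base period's forecast is fitted on data through that quarter and nothing later, and the fitted equation is iterated forward over the horizons.

```python
import numpy as np

def fit_ar1(y):
    """OLS fit of y_t = a + b * y_{t-1}; returns (a, b)."""
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    return coef

def recursive_forecasts(y, first_base, horizons=8):
    """Re-estimate on an expanding window at each base period and
    iterate the fitted equation forward; future data are never used."""
    out = {}
    for base in range(first_base, len(y)):
        a, b = fit_ar1(y[: base + 1])      # data through the base quarter only
        path, last = [], y[base]
        for _ in range(horizons):
            last = a + b * last
            path.append(last)
        out[base] = path
    return out

# toy persistent growth-rate series (made up for illustration)
rng = np.random.default_rng(0)
y = np.empty(80)
y[0] = 0.0
for t in range(1, 80):
    y[t] = 0.5 + 0.6 * y[t - 1] + rng.normal(scale=0.3)

fcs = recursive_forecasts(y, first_base=40)
print(len(fcs), len(fcs[40]))   # number of base periods, steps per path
```

In the paper's exercise the analogue is sixty-two base periods, each with the full six-variable BVAR re-estimated and eight quarterly horizons forecast.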
The "other" model in our comparison is the Standard and Poor's/DRI model (hereafter referred to only as DRI). We obtained a complete set of true ex-ante "mid-quarter" forecasts beginning in 1981:1 and going through 1996:2 (forecasts made near the end of May 1996). We report the ratios of the mean absolute errors and root mean square errors of DRI to BVAR, of DRI to an equally weighted combination of the two forecasts and of BVAR to the same equally weighted combination. These are averaged over the sixty-two forecast bases, for eight quarterly horizons up to almost two years ahead, for the six original variables forecast by Litterman. Thus, the forecasts extend up to and include the first quarter of 1998. As a standard for the "actual" value for each data point, we use the first available mid-quarter reported datum (see Bischoff (1989) for a defense of this practice). Zarnowitz and Braun (1993) use essentially the same standard, while McNees uses "final" or "almost final" data.
J.M. Bates and Clive W.J. Granger (1969), the original developers of combination forecasting, suggested many combination methods involving variances, covariances and time-varying weights. Many other methods have been suggested, including encompassing methods and sequential regression methods; but we use simple equal weights. The experience using covariances has been primarily negative; and, as the two sets of forecasts are asserted by both camps to be at least as good as those at the other camp, fifty-fifty weights seem appropriate. Diebold (1998b) weakly endorses equal weights as a second-best solution. In addition, we are trying to provide practical methods of forecasting which are as simple as possible.
Thus, our paper should be useful to business and academic forecasters operating on small budgets. Should they buy the RATS software package for approximately $600 and do their own BVARs, using someone else's list of easily available variables? Or should they buy a subscription to an econometric service, for a minimum of several thousand dollars per year? Or should they buy both and produce combination forecasts? The large econometric models were developed at great expense and are adjusted judgmentally by many analysts. They also forecast thousands of variables. Here we focus on just a few variables, and there is no guarantee our results will carry over to other situations, especially if the prior is changed (see Fair and Shiller (1990) and Wi (2000)). Also, the best set of assumptions for the BVAR is not obvious. Nonetheless, the combination method used here is easy and sensible. In the remainder of the paper we describe our methods and data sources, and then present the results.
The Method of Comparison
In each case for real GNP (or GDP) and GNP (or GDP) Deflator, BVAR estimates are made of eight future logarithmic first differences. For each of the sixty-two base periods for the forecasts, data available contemporaneously are used. Also, the corresponding level forecasts from the DRI model are converted to growth rates by taking logarithmic first differences. The combination forecasts are made by converting all the logarithmic first differences back to levels, with the appropriate base levels (which are almost always the same) subtracted off. The level forecasts are then combined arithmetically with equal weights, and the combined forecasts are once again converted to logarithmic first differences.
In all three cases (BVAR, DRI and the combination) errors were computed by comparing each logarithmic first difference forecast (at each horizon) with a logarithmic first difference computed from the appropriate "45-day" estimate published by the Department of Commerce in either February, May, August or November.
Finally, absolute values of the logarithmic first difference errors were computed, and the sixty-two errors at each of the eight horizons were averaged to produce a mean absolute error (MAE) statistic. Similarly, the algebraic logarithmic first difference errors were squared and averaged, and the square root of this mean was taken to form a root mean square error (RMSQE). Note that because the growth rates were usually small, logarithmic first differences were usually close to growth rates and were treated as such.
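The two summary statistics are straightforward to compute; the error vector below is made up for illustration (the paper's vectors have sixty-two entries, one per base period, at each horizon).

```python
import numpy as np

def mae(e):
    """Mean absolute error of a vector of forecast errors."""
    return np.mean(np.abs(e))

def rmsqe(e):
    """Root mean square error of a vector of forecast errors."""
    return np.sqrt(np.mean(np.square(e)))

# made-up log first-difference forecast errors at one horizon
errors = np.array([0.004, -0.006, 0.002, -0.003, 0.005, -0.001])
print(mae(errors), rmsqe(errors))
```

The ratios reported in the tables are then simply, for example, `mae(dri_errors) / mae(bvar_errors)` at each horizon.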
Litterman (1986b) reports that he used actual growth rates (rather than logarithmic first differences) on both the left-hand side and right hand side of his forecasts, which were based on (interpolated and/or distributed) monthly data.
We used quarterly data and logarithmic first differences. Nevertheless, we feel our forecasts are close to what Litterman would have gotten if he had continued to use his 1980-1983 model (he changed it in 1984) and had continued to forecast past 1985. Sims continued using Litterman's model in 1986 and 1987, but reports that as he worked with the model to improve the inflation forecast, the output forecasts became less reliable (Sims, 1993). He then changed the model drastically in 1989, making it more difficult for non-experts to use. Since Litterman (1986b) and Todd (1984) make it clear that the choice of Bayesian priors and the choice of variables, as well as whether to forecast in levels or first differences, etc., require extensive research, we limited ourselves to Litterman's original prior (described in Litterman 1986b) and original specification and list of variables (described in Litterman (1984, 1986b)).
Non-residential fixed investment and the money supply (M1) were forecast in logarithmic levels, again following Litterman (1986b). Net investment may be thought of as a first difference of capital stock, which perhaps explains why Litterman used logarithmic levels of gross investment. The reason he used logarithmic levels for M1 is not clear to us, but we followed his usage. With logarithmic levels, the task of computing errors and combining them was considerably eased, as we only had to switch to levels, combine, and switch back again. Of course, we still had to worry about the appropriate base.
Litterman's estimation technique is "Bayesian" in the sense that prior means, variances and covariances are specified for the coefficients. Thus the parameter estimates are based both on these priors and on the data. A practical result is that more variables can be included in the model with non-zero coefficients than if no priors were specified. We fitted sixty-two sets of BVAR forecasts, for six variables, one to eight periods ahead.
Obviously the results depend partly on the prior. Litterman used the so-called "Minnesota prior," in which the first lag of each variable had a prior mean of unity and all other variables in that equation had prior means of zero. The prior variances declined with the lag. Other priors that have been suggested and tested to some extent include those of Beth F. Ingram and Charles H. Whiteman (1994) and Christopher A. Sims and Tao Zha (1998).
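The structure of that prior can be sketched as follows. The hyperparameter values and the simple 1/lag decay rule are illustrative defaults of the common textbook variant, not Litterman's exact settings; `sigmas` would in practice be residual standard errors from univariate autoregressions.

```python
import numpy as np

def minnesota_prior(n_vars, n_lags, lam=0.2, theta=0.5, sigmas=None):
    """Prior mean and standard deviation for the coefficient on
    variable j at lag l in equation i, Minnesota-prior style:
      mean = 1 on the own first lag (random walk), 0 elsewhere;
      std  = lam / l for own lags,
             lam * theta / l * s_i / s_j for other variables' lags."""
    sigmas = np.ones(n_vars) if sigmas is None else np.asarray(sigmas)
    mean = np.zeros((n_vars, n_vars, n_lags))
    std = np.zeros_like(mean)
    for i in range(n_vars):
        for j in range(n_vars):
            for l in range(1, n_lags + 1):
                if i == j:
                    if l == 1:
                        mean[i, j, 0] = 1.0
                    std[i, j, l - 1] = lam / l
                else:
                    std[i, j, l - 1] = lam * theta / l * sigmas[i] / sigmas[j]
    return mean, std

mean, std = minnesota_prior(n_vars=6, n_lags=4)
print(mean[0, 0, 0], std[0, 0, 0], std[0, 0, 3])
```

The shrinkage toward a random walk, tightening with the lag and with cross-variable terms, is what lets a small sample support many coefficients.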
Our results are summarized in Tables 1-9. As McNees (1988, 1990) and others have amply documented, some variables are harder to forecast than others, some are harder to forecast for short horizons, and some are harder for long horizons. For these reasons, it is meaningless to call a forecast "good" or "bad" without comparison to some other forecast. Furthermore, some "structural" forecasts combine better with time series forecasts than others (see Bischoff, 1989).
Table 1 reports the ratios of mean absolute errors (MAEs) or root mean squared errors (RMSQEs) of one forecast model to another for each forecast horizon, one to eight quarters ahead. It is important to keep in mind that Tables 1-6 are comparative only. Thus, for example, ratios equal to one may mean only that the forecasts are equally good or equally poor.
Table 1 shows that in forecasting real GNP (or GDP) for horizons of one to six quarters DRI had a smaller MAE than BVAR and that for one to four quarters ahead DRI had a smaller RMSQE. The forecasts for a horizon of one quarter ahead are not too meaningful, because they use data for as much as two months of the quarter being "forecast." Let us focus, then, on the two-quarter horizon. On average, for the sixty-two quarters 1981:2 through 1996:3 the forecast error of the growth rate of real GNP (or GDP) based on data from the middle of the previous quarter was about one-third lower for DRI than for BVAR.
Whether the reader chooses to look at ratios of mean absolute errors or root mean square errors depends on his or her loss function. In any case, both sets of statistics tell the same story. Richard Ashley (1988) defines a concept he calls "maximum useful forecasting horizon," based on the accuracy of forecasts for all of the variables we study except the interest rate (but including nominal GNP). He concludes that, using error data reported by McNees (1986),
"....most of the forecasts of most of the variables are so inaccurate beyond a couple of quarters as to be essentially useless as inputs to forecasting models ...." (p. 374).
The focus of Ashley's study is to see how well national forecasts can be used as inputs for corporate or regional forecasting models. Although he speculates that his results do not depend on the sample period, he uses the only sample period for which BVAR methods were really successful (1980:2--1985:1; see Litterman (1986) and McNees (1986)). Nevertheless, his point, based on seventeen models (some made early, middle, and late in the relevant quarter) and six variables, is well taken. It is very hard to forecast more than a year ahead. Both models were equally good (or bad) over long horizons. Although the BVAR method was accurate for horizons of two to four quarters for a while, its accuracy declined about the time Litterman stopped managing it.
The simplest but perhaps most robust method of combining forecasts, a simple average, improved MAE and RMSQE when combined with BVAR (see the middle column labeled Combination/BVAR of Table 1), but is worse than DRI for all horizons up to a year ahead (see the last column of Table 1).
The BVAR forecasts of GNP (or GDP) deflator (Table 2) were worse than those of DRI and the combinations by wide margins, for all horizons, and for all sub-periods. Litterman's original six-equation BVAR did a very bad job of forecasting inflation. As McNees (1996, pp. 11-13) points out, if Litterman had been allowed to manage his model the monetary turmoil between 1980 and 1983 could have been at least partially offset by judgmental adjustments. Instead, Litterman added three variables (a stock price index, commodity prices, and the trade-weighted exchange rate) to his model. As Sims (1993, p. 18) reports, forecasts of inflation improved, but those for real variables deteriorated. Then "a sequence of same-signed errors in forecasting real GNP" appeared.
The BVAR forecasts of the unemployment rate were relatively good (Table 3), but still always seventeen to twenty-four percent worse in MAE for one-to-four quarter horizons and twelve to twenty-two percent in RMSQE. Ashley (1988, p. 373) reports that the unemployment rate is relatively easy to forecast, when measured against its own variance.
BVAR forecasts of real nonresidential fixed investment had been the most accurate among nine models over all horizons for forecasts made in 1980:2 through 1985:1 (McNees 1986, p. 10), when compared to DRI, WEFA, RSQE (University of Michigan) and GSU (Georgia State University). They became the worst by a wide margin, however, for virtually all horizons between 1985:1 and 1989:4 (see McNees 1990, pp. 43-45).
In the table on nonresidential fixed investment (Table 4), ratios are close to one, but this fact does not necessarily mean that this variable is easy to forecast. It may simply mean that the forecasts are equally inaccurate across methods. For forecasts one to four quarters ahead DRI's MAE was fifteen to eighteen percent better, and its RMSQE was fourteen to nineteen percent better, compared to BVAR.
DRI was much better in forecasting the Treasury bill rate in the current quarter, but this may represent high-frequency information (see Table 5), as discussed above.
Finally, the one variable for which the combination of the two forecasts was the most accurate is M1. Still, for short horizons, DRI beat BVAR (see Table 6).
Table 7 shows the results of pairwise comparison of the three models. Each entry in the column labeled DRI over BVAR represents the number of variables (out of a total of six) for which the DRI model beat the BVAR model in pairwise comparison of mean absolute errors (MAE) or root mean square errors (RMSQE) for forecast horizon one to eight quarters ahead. For example, six (i.e. the first entry) means that for one-quarter-ahead forecast horizon, the DRI model beat the BVAR model in MAE for all six variables. Similarly, columns labeled DRI over combination and combination over BVAR each report the number of variables (out of a total of six) for which the DRI model beat the Combination and the Combination beat the BVAR model in pairwise comparisons of MAE or RMSQE. As can be seen in Table 7, BVAR never had the smallest MAE or RMSQE for any variable at a horizon less than two quarters.
Table 8 reports average MAE (and RMSQE) ratios across all eight forecast horizons for each of the six variables under study. For example, an average MAE ratio of 0.86 for real GNP (or GDP) forecasts from the DRI model versus the BVAR model means that on average across all eight forecast horizons, the mean absolute errors of BVAR forecasts were fourteen percent larger than those of DRI forecasts. What Table 8 shows is that across all horizons, BVAR was the worst, but the combination and DRI were roughly equal except for the price deflator, which was dragged down by BVAR's performance.
Finally, Tables 9 and 10 show that BVAR was never "best" for any variable, horizon, or measure of accuracy.
Conclusion--Large Macroeconomic Models Are More Accurate Forecasting Tools Than BVAR Models
Throughout this paper, we have used one big model, and implicitly acted as if all are the same. They are not, as McNees' many studies have shown. But each big model succeeds for some variables and some time periods, and these advantages change through time.
We have tried to provide a rigorous test of Litterman's original model forecasts against corresponding DRI forecasts and an equal-weight combination of the two. The BVAR lost, but the combination and DRI were about the same. In future work we plan to use more sophisticated combinations, perhaps as suggested in Diebold (1998b, Chapter 10). Also, we will stratify by time period, including the 1970s. More variables may also be used. New priors have been proposed by Ingram and Whiteman (1994), Zha (1998), Leeper, Sims and Zha (1996), and others. Doan, Litterman and Sims (1984) still contains some untapped material.
We have surely been unfair to Litterman. However, Robertson and Tallman (1999) show that even the best VAR techniques available in 1998 improved results by only five or ten percent over Litterman's original method. Our Table 8 shows that this might make the best VAR (chosen ex-post) competitive in root mean square error with DRI's forecasts of real GDP, the unemployment rate and the narrow money supply. These methods could be combined with big model forecasts, and the promise of William M. Lupoletti and Roy H. Webb might be fulfilled, if only by a few percent.
Our own hypothesis is that nothing is really stable; all structures are changing slowly. Perhaps the nonlinear methods mentioned by Diebold (1998a) may indeed be the future of macroeconomic forecasting. But the present, for forecasting purposes, still belongs to big models in combination with judgment.
Several proprietary and university groups as mentioned above are still forecasting successfully with large models. None of these forecast turning points well, but neither do the BVARs (see Wi, 2000). We believe the equations that describe the economy are slowly changing--in specification and functional form as well as coefficients. Either a large model or a BVAR can reflect this change; but over the 1981-1996 period the DRI model appears to have adapted better than Litterman's BVAR.
Does this mean that the BVAR technique is worthless for macroeconomic forecasting or is not as good as the DRI model? Certainly not worthless--different types of models can be used for different reasons. If one only wants to forecast (i.e. not do "structural" analysis), a small VAR, Bayesian or otherwise, can be useful. It is the opposite conclusion that stands out. The evidence presented here suggests that big macroeconometric models are more accurate forecasting tools than BVAR models.
We are grateful to David Wyss for access to ex-ante forecasts from the Standard and Poor's/DRI model, and Steven Hine and Kenneth Lynch for help in copying the data. We are also grateful to Robert Crow and an anonymous referee for their constructive comments and suggestions. All remaining errors are the responsibility of the authors.
Charles W. Bischoff and In-Bong Kang are at Binghamton University in Binghamton, NY. Halefom Belay is at Whitman College in Walla Walla, WA.
(1.) See Doan, Thomas, RATS Users Manual, Version 4, Evanston, IL, 1996.
(2.) The choice is not really so distinct. In 1988, for example, the staff which was developing and using the DRI model reported that its forecasts were based sixty-five percent on the model, twenty-five percent on judgment, and ten percent on current high frequency data. See McNees (1988, p. 21).
(3.) Note that in this paper we focus only on "unconditional" forecasts, and not on "structural" VARs, "conditional" VARs or policy analysis.
(4.) McNees (1992, p. 33) concludes, "... it seems clear that a major factor in forecast accuracy is the time period to be forecast. Errors were large in the severe 1973-75 and 1981-82 recessions, much smaller in the 1980 and 1990-91 recessions, and generally quite minimal apart from business cycle turning points."
(5.) Since Litterman (1986b) developed his prior (The "Minnesota Prior") using simulated "out-of-sample" forecasts for 1976 -1979, it is likely that his prior would lead to good results for this period. In fact, as Wi (2000) shows, some Bayesian VARs combined with Chase Econometrics, Inc., ex-ante forecasts, helped forecast the 1974 downturn, and some Bayesian VARs combined with DRI ex-ante forecasts helped forecast the 1982 downturn.
(6.) Only Robertson and Tallman's results for unemployment forecasts can be compared to ours. Our sample includes the 1981-1983 period when unemployment was hard to forecast, but this was true for both DRI and the Minnesota prior BVAR. In fact Litterman (1986a, Table 4, p. 35) reports he did better in RMSQE than DRI with the judgmental adjustment in forecasting the unemployment rate for the period 1980:2-1985:1 for all horizons except one-period ahead (actually the current period). Yet, we find that for a longer period that includes all but the first three quarters of Litterman's sample, and more than ten additional years, DRI is better in RMSQE for all horizons up to four quarters and for all eight horizons averaged together by over five percent. For horizons of one (current) quarter through three, the ratios of DRI's RMSQE to Litterman's BVAR are 0.86, 0.79 and 0.81. Robertson and Tallman (1999, Table 1, p. 14), for a period starting five years later and extending three quarters later (in base period but not horizon), find that the model they apparently prefer (ZVAR), based on Zha (1998) and an unpublished manuscript Zha co-authored with Daniel Waggoner in 1998, did better than the Litterman prior in forecasting the unemployment rate. The Litterman root mean square error was five percent larger for the current period (first quarter in our notation), ten percent larger for the next period and nine percent larger for the third. Since DRI beats Litterman by sixteen, twenty-six, and twenty-three percent (putting the better fit in the denominator as Robertson and Tallman do), this at least suggests that "simulated ex-ante" forecasts with the best VAR models forecasters could devise by 1998 produced considerably worse short-run unemployment forecasts than DRI.
(7.) There is no reason a BVAR need be small. Litterman (1986a) mentions a forty-seven equation BVAR developed at the Federal Reserve Bank of Minneapolis. Although there are now fewer large structural models, several with over one thousand equations still exist. Other large models, in addition to the ones mentioned above, are maintained at the WEFA Group, Macroeconomic Advisers, UCLA, Kent State and Washington University in St. Louis. However, because large structural models are more expensive to develop and maintain than VARs or BVARs, the rise of VAR forecasting, along with questions about using Keynesian models for policy purposes, has definitely led to a decline in the "industry" of producing large Keynesian-style models.
Ashley, Richard, "On the Relative Worth of Recent Macroeconomic Forecasts," International Journal of Forecasting, 1988, pp. 363-376.
Bates, J. M. and Clive W.J. Granger, "The Combination of Forecasts," Operational Research Quarterly, 1969, pp. 451-468.
Bischoff, Charles W., "The Combination of Macroeconomic Forecasts," The Journal of Forecasting, July-September 1989, pp. 293-314.
Cho, Dong W., "Forecast Accuracy: Are Some Business Economists Consistently Better Than Others?" Business Economics, October 1996, pp. 45-49.
Diebold, Francis X., "The Past, Present, and Future of Macroeconomic Forecasting," The Journal of Economic Perspectives, Spring 1998a, pp. 175-192.
Diebold, Francis X., Elements of Forecasting, Cincinnati OH, South-Western College Publishing, 1998b.
Doan, Thomas, Robert B. Litterman and Christopher A. Sims, "Forecasting and Conditional Projection Using Realistic Prior Distributions," Econometric Reviews, 1984, pp. 1-100.
Fair, Ray C., Testing Macroeconomic Models, Cambridge MA, Harvard University Press, 1994.
Fair, Ray C. and Robert J. Shiller, "Comparing Information in Forecasts from Econometric Models," American Economic Review, Vol. 80 June 1990, pp. 375-389.
Ingram, Beth F. and Charles H. Whiteman, "Supplanting the 'Minnesota' Prior: Forecasting Macroeconomic Time Series Using Real Business Cycle Model Priors," Journal of Monetary Economics, 1994, pp. 497-510.
Leeper, Eric M., Christopher A. Sims, and Tao A. Zha, "What Does Monetary Policy Do?" Brookings Papers on Economic Activity, (1996), Vol. 2, pp. 1-63.
Litterman, Robert B., "Forecasting With Bayesian Vector Autoregressions--Four Years of Experience," Proceedings of the Business Economic Statistics Sector, American Statistical Association, 1984, pp. 21-29.
Litterman, Robert B., "A Statistical Approach to Economic Forecasting," The Journal of Business and Economic Statistics, January 1986a, pp. 1-4.
Litterman, Robert B., "Comment," The Journal of Business and Economic Statistics, January 1986b, pp. 17-19.
Litterman, Robert B., "Forecasting With Bayesian Vector Autoregressions--Five Years of Experience," The Journal of Business and Economic Statistics, January 1986c, pp. 25-30.
Lupoletti, William M. and Roy H. Webb, "Defining and Improving the Accuracy of Macroeconomic Forecasts," The Journal Of Business, April 1986, pp. 263-285.
McLaughlin, Robert L., "The Real Record of Economic Forecasters," Business Economics, May 1975, pp. 28-36.
McNees, Stephen K., "Which Forecast Should You Use?" New England Economic Review, July/August 1985, pp. 36-42.
McNees, Stephen K., "Forecasting Accuracy of Alternative Techniques: A Comparison of U.S. Macroeconomic Forecasts," Journal of Business and Economic Statistics, January 1986, pp. 5-15.
McNees, Stephen K., "How Accurate Are Macroeconomic Forecasts?" New England Economic Review, July/August 1988, pp. 15-36.
McNees, Stephen K., "Why Do Forecasts Differ?" New England Economic Review, January/February 1989, pp. 42-54.
McNees, Stephen K., "Man Versus Model? The Role of Judgment in Forecasting," New England Economic Review, July-August 1990, pp. 41-52.
McNees, Stephen K., "How Large Are Economic Forecast Errors?" New England Economic Review, July/August 1992, p. 25.
Robertson, John C. and Ellis W. Tallman, "Data Vintages and Measuring Forecast Model Performance," Federal Reserve Bank of Atlanta Economic Review, Vol. 83 Fourth Quarter 1998, pp. 4-20.
Robertson, John C. and Ellis W. Tallman, "Vector Autoregressions: Forecasting and Reality," Federal Reserve Bank of Atlanta Economic Review, First Quarter 1999, pp. 4-19.
Sims, Christopher A., "Macroeconomics and Reality," Econometrica, January 1980, pp. 1-48.
Sims, Christopher A., "A Nine-Variable Probabilistic Macroeconomic Forecasting Model," Business Cycles, Indicators and Forecasting, edited by James H. Stock and Mark W. Watson, Chicago IL: University of Chicago Press, 1993, pp. 170-212.
Sims, Christopher A. and Tao A. Zha, "Bayesian Methods for Dynamic Multivariate Models," International Economic Review, November 1998, pp. 949-968.
Todd, Richard M., "Improving Economic Forecasting With Bayesian Vector Autoregressions," Federal Reserve Bank of Minneapolis Quarterly Review, Fall 1984, pp. 18-29.
Wi, Seongbak, An Evaluation of Combining Forecasts and a Strategy for an Optimum VAR Prior to Forecast Business Cycle Turning Points, Unpublished Ph.D. Dissertation, State University of New York at Binghamton, January 2000.
Zarnowitz, Victor and Phillip Braun, "Twenty-Two Years of the NBER-ASA Quarterly Economic Outlook Surveys: Aspects and Comparisons of Forecasting Performance," Business Cycles, Indicators and Forecasting, edited by James H. Stock and Mark W. Watson, Chicago IL: University of Chicago Press, 1993, pp. 11-93.
Zha, Tao A., "A Dynamic Multivariate Model for Use in Formulating Policy," Federal Reserve Bank of Minneapolis Quarterly Review, First Quarter 1998, pp. 16-29.
TABLE 1
RATIOS OF FORECAST ERRORS OF REAL GNP OR GDP

Horizon   DRI/BVAR   Combination/BVAR   Combination/DRI
MAE
  1         0.70          0.79              1.13
  2         0.66          0.76              1.15
  3         0.75          0.80              1.06
  4         0.86          0.86              1.00
  5         0.91          0.88              0.97
  6         0.95          0.92              0.96
  7         1.02          0.95              0.93
  8         1.01          0.90              0.89
RMSQE
  1         0.72          0.79              1.10
  2         0.69          0.75              1.10
  3         0.79          0.79              1.01
  4         0.90          0.86              0.95
  5         1.01          0.91              0.91
  6         1.11          0.97              0.87
  7         1.13          0.97              0.87
  8         1.07          0.95              0.88
Notes: In Tables 1-6, each element in the column labeled DRI/BVAR represents the ratio of the mean absolute error (MAE) or root mean square error (RMSQE) of DRI forecasts of the growth rate of real GNP (or GDP) to that of BVAR forecasts for each forecast horizon, 1 to 8 quarters ahead. Thus, a ratio of 0.70 (the first element) means that the average error for the DRI model is about 70% that of the BVAR model, at an annual rate, for one-quarter-ahead forecasts. Similarly, columns labeled Combination/BVAR and Combination/DRI report the ratios of MAEs and RMSQEs of an equally weighted combination of the two forecasts to BVAR forecasts and DRI forecasts, respectively.
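The error statistics behind these ratios are simple to reproduce. The sketch below, written in Python with made-up illustrative numbers (none of the series or values are from the paper's data), shows how MAE, RMSQE, the equally weighted combination, and the reported ratios might be computed:

```python
import numpy as np

def mae(errors):
    """Mean absolute error of a vector of forecast errors."""
    return np.mean(np.abs(errors))

def rmsqe(errors):
    """Root mean square error (RMSQE in the paper's notation)."""
    return np.sqrt(np.mean(np.square(errors)))

# Hypothetical actual growth rates and two competing forecasts for one
# horizon; these numbers are invented for illustration only.
actual = np.array([3.1, 2.4, -0.5, 1.8, 2.9])
dri    = np.array([2.8, 2.0,  0.2, 1.5, 3.3])   # stand-in for DRI forecasts
bvar   = np.array([3.6, 1.2, -1.8, 2.9, 2.0])   # stand-in for BVAR forecasts

# Equally weighted combination, as in the Combination columns.
combo = 0.5 * (dri + bvar)

# Ratios as reported in the tables: a value below 1.0 means the
# numerator model has the smaller average error.
ratio_dri_bvar   = mae(actual - dri)   / mae(actual - bvar)
ratio_combo_bvar = mae(actual - combo) / mae(actual - bvar)
ratio_combo_dri  = mae(actual - combo) / mae(actual - dri)

print(round(ratio_dri_bvar, 2), round(ratio_combo_bvar, 2),
      round(ratio_combo_dri, 2))
```

In a full replication, the same calculation would be repeated for each variable and each of the eight horizons, with the RMSQE-based ratios obtained by substituting `rmsqe` for `mae`.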
TABLE 2
RATIOS OF FORECAST ERRORS OF GNP (OR GDP) DEFLATOR

Horizon   DRI/BVAR   Combination/BVAR   Combination/DRI
MAE
  1         0.62          0.77              1.23
  2         0.51          0.71              1.38
  3         0.46          0.67              1.48
  4         0.39          0.66              1.67
  5         0.39          0.65              1.68
  6         0.38          0.66              1.73
  7         0.38          0.67              1.78
  8         0.38          0.67              1.77
RMSQE
  1         0.58          0.73              1.25
  2         0.50          0.64              1.27
  3         0.43          0.66              1.55
  4         0.40          0.66              1.67
  5         0.41          0.67              1.63
  6         0.42          0.68              1.63
  7         0.42          0.69              1.63
  8         0.43          0.69              1.62

TABLE 3
RATIOS OF FORECAST ERRORS OF UNEMPLOYMENT

Horizon   DRI/BVAR   Combination/BVAR   Combination/DRI
MAE
  1         0.78          0.76              0.97
  2         0.72          0.77              1.07
  3         0.77          0.79              1.03
  4         0.82          0.83              1.02
  5         0.85          0.87              1.03
  6         0.88          0.90              1.02
  7         0.87          0.91              1.04
  8         0.85          0.88              1.04
RMSQE
  1         0.86          0.78              0.91
  2         0.79          0.79              1.00
  3         0.81          0.81              1.00
  4         0.94          0.88              0.93
  5         1.02          0.92              0.91
  6         1.07          0.96              0.89
  7         1.07          0.96              0.90
  8         1.02          0.93              0.91

TABLE 4
RATIOS OF FORECAST ERRORS OF REAL NONRESIDENTIAL FIXED INVESTMENT

Horizon   DRI/BVAR   Combination/BVAR   Combination/DRI
MAE
  1         0.84          0.82              0.97
  2         0.82          0.81              0.98
  3         0.82          0.81              0.99
  4         0.85          0.83              0.97
  5         0.94          0.90              0.96
  6         1.02          0.94              0.92
  7         1.06          0.96              0.91
  8         1.07          0.95              0.89
RMSQE
  1         0.83          0.83              0.99
  2         0.84          0.79              0.94
  3         0.81          0.78              0.95
  4         0.86          0.81              0.94
  5         0.91          0.85              0.93
  6         0.98          0.88              0.90
  7         1.01          0.90              0.89
  8         0.97          0.90              0.93

TABLE 5
RATIOS OF FORECAST ERRORS OF TREASURY BILL RATE

Horizon   DRI/BVAR   Combination/BVAR   Combination/DRI
MAE
  1         0.45          0.59              1.32
  2         0.59          0.71              1.20
  3         0.68          0.77              1.13
  4         0.73          0.81              1.11
  5         0.75          0.82              1.10
  6         0.77          0.85              1.10
  7         0.77          0.85              1.10
  8         0.78          0.85              1.09
RMSQE
  1         0.44          0.56              1.27
  2         0.68          0.72              1.06
  3         0.68          0.77              1.12
  4         0.73          0.81              1.11
  5         0.79          0.84              1.07
  6         0.79          0.85              1.07
  7         0.79          0.85              1.07
  8         0.78          0.84              1.07

TABLE 6
RATIOS OF FORECAST ERRORS OF NARROW MONEY SUPPLY (M1)

Horizon   DRI/BVAR   Combination/BVAR   Combination/DRI
MAE
  1         0.70          0.77              1.10
  2         0.88          0.87              1.00
  3         0.99          0.95              0.96
  4         1.01          0.97              0.96
  5         1.03          0.99              0.96
  6         1.03          0.98              0.95
  7         1.01          0.97              0.96
  8         1.00          0.96              0.96
RMSQE
  1         0.75          0.80              1.07
  2         0.90          0.89              0.99
  3         0.98          0.93              0.96
  4         1.01          0.95              0.94
  5         1.01          0.95              0.95
  6         0.99          0.95              0.95
  7         0.97          0.94              0.97
  8         0.95          0.93              0.98

TABLE 7
PAIRWISE COMPARISONS OF THE THREE FORECASTING MODELS: FREQUENCY OF SUPERIORITY

Horizon   DRI over BVAR   DRI over Combination   Combination over BVAR
MAE
  1            6                  4                      6
  2            6                  4                      6
  3            6                  4                      6
  4            5                  3                      6
  5            5                  3                      6
  6            4                  3                      6
  7            3                  3                      6
  8            4                  3                      6
Total         39                 27                     48
RMSQE
  1            6                  4                      6
  2            6                  4                      6
  3            6                  4                      6
  4            5                  2                      6
  5            3                  2                      6
  6            4                  2                      6
  7            3                  2                      6
  8            4                  2                      6
Total         37                 26                     48

TABLE 8
MAE AND RMSQE RATIOS AVERAGED ACROSS ALL FORECAST HORIZONS

                                    MAE    RMSQE
Ratio of DRI to BVAR
  Real GNP or GDP                   0.86    0.93
  GNP (or GDP) Deflator             0.44    0.45
  Unemployment Rate                 0.82    0.95
  Real Non-Res. Fixed Investment    0.93    0.90
  Treasury Bill Rate                0.69    0.71
  Narrow Money Supply (M1)          0.95    0.94
Ratio of Combination to BVAR
  Real GNP or GDP                   0.86    0.87
  GNP (or GDP) Deflator             0.68    0.68
  Unemployment Rate                 0.84    0.88
  Real Non-Res. Fixed Investment    0.88    0.84
  Treasury Bill Rate                0.78    0.78
  Narrow Money Supply (M1)          0.93    0.92
Ratio of Combination to DRI
  Real GNP or GDP                   1.01    0.96
  GNP (or GDP) Deflator             1.59    1.53
  Unemployment Rate                 1.03    0.93
  Real Non-Res. Fixed Investment    0.95    0.93
  Treasury Bill Rate                1.14    1.11
  Narrow Money Supply (M1)          0.98    0.98

TABLE 9
WHICH FORECAST MODEL PRODUCES MINIMUM MAE FOR EACH VARIABLE AND FOR EACH HORIZON?

                                          Horizon
Variables                         1  2  3  4  5  6  7  8
Real GNP or GDP                   D  D  D  C  C  C  C  C
GNP (or GDP) Deflator             D  D  D  D  D  D  D  D
Unemployment Rate                 C  D  D  D  D  D  D  D
Real Non-Res. Fixed Investment    C  C  C  C  C  C  C  C
Treasury Bill Rate                D  D  D  D  D  D  D  D
Narrow Money Supply (M1)          D  C  C  C  C  C  C  C

Notes: D, B, and C denote the DRI model, the BVAR model, and the Combination, respectively.

TABLE 10
WHICH FORECAST MODEL PRODUCES MINIMUM RMSQE FOR EACH VARIABLE AND FOR EACH HORIZON?

                                          Horizon
Variables                         1  2  3  4  5  6  7  8
Real GNP or GDP                   D  D  D  C  C  C  C  C
GNP (or GDP) Deflator             D  D  D  D  D  D  D  D
Unemployment Rate                 C  D  D  C  C  C  C  C
Real Non-Res. Fixed Investment    C  C  C  C  C  C  C  C
Treasury Bill Rate                D  D  D  D  D  D  D  D
Narrow Money Supply (M1)          D  C  C  C  C  C  C  C

Notes: D, B, and C denote the DRI model, the BVAR model, and the Combination, respectively.
Authors: Bischoff, Charles W.; Belay, Halefom; Kang, In-Bong
Date: July 1, 2000