# Habit formation as a resolution to the equity premium puzzle: what is in the data, what is not.

I. Introduction

The consumption-based asset pricing model of Lucas [14] defines a theoretical relationship between streams of consumption and equilibrium asset prices. Since data on both aggregate consumption and asset prices are available, the theory can be tested. Empirical tests of the Lucas model using standard time-separable utility functions indicate mismatches between the theory and the data. For example, in the GMM (Generalized Method of Moments) estimation of Hansen and Singleton [9], the overidentifying restrictions implied by the model were rejected. Mehra and Prescott [16] demonstrated the difficulty of explaining a particular statistic: the theoretical expected equity premium (the yield differential between stocks and risk-free bonds) is much lower than the observed one if a standard utility function is used. They called the mismatch "the equity premium puzzle".

To improve the performance of the model, several authors have relaxed the time-separability of preferences. In a stimulating paper, Constantinides [1] argued that the equity premium puzzle can be resolved through the assumption of "habit formation". The idea is that consumption in the past reduces utility in the present because it establishes habits. His model can match the observed mean and variance of both the equity premium and the consumption growth rate. The match of these low-order moments is consistent with the bound tests of Heaton [11], Gallant, Hansen, and Tauchen [7], and Hansen and Jagannathan [10].

On the other hand, GMM estimations using monthly aggregate consumption of nondurable goods and leisure by Eichenbaum, Hansen, and Singleton [3], and those using consumption of nondurable and durable goods by Dunn and Singleton [2] and Eichenbaum and Hansen [4], showed that if current utility depends on current and past consumption, then current utility increases in past consumption. That is, consumption has "local durability," the opposite of what habit formation implies. A particularly strong result was obtained by Gallant and Tauchen [6]. They estimated a general form of utility function with a general law of motion for the data and found that the source of the time-non-separability is local durability. Since local durability is known to produce a smaller equilibrium equity premium, these results imply that in order to match other moments well, the match for the equity premium has to be sacrificed.

However, the GMM estimations on the nature of time-non-separability are not yet conclusive. Ferson and Constantinides [5] illustrated that the fit of the model with multi-asset portfolios can be improved by introducing either habit formation or durability, and that the chi-square statistics are not dramatically different in the two cases. They also found that the sample estimates are influenced by the choice of instrumental variables. Using the instruments they considered plausible, the GMM estimate indicates that the utility function displays habit persistence.

There are many factors that affect the outcomes of finite sample GMM estimations. One of them is the scaling factor. The level of consumption exhibits an upward trend and is therefore not stationary. As a result, the marginal utility of consumption is also nonstationary. Since the asymptotic theory can be expected to give a reasonable approximation under stationarity, researchers use a scaling factor to offset the trend in marginal utility if the utility function to be estimated is homogeneous. Scaling factors are similar to ordinary instrumental variables except that, by construction, scaling factors depend explicitly on the parameters to be estimated, whereas ordinary instrumental variables do not. According to the consistency theorem of Hansen [8], the instruments and scaling factors should not matter asymptotically. But like instruments, the scaling factors can affect finite sample estimates. The various scaling factors used by different authors may have a nontrivial impact, but that impact has not been documented.

The contribution of the present paper is twofold. First, GMM estimations are conducted using different scaling factors. The paper shows that scaling factors are important to finite sample estimates. Using a plausible scaling factor, our estimated utility function is locally durable. This implies that the habit formation assumption cannot explain some moments other than the equity premium. Second, the paper demonstrates that even the equity premium puzzle is not solved by the introduction of habit formation, because the correlation between aggregate consumption and the equity premium is too small. The model equity premium is large when the detrended marginal utility is volatile and when the correlation between marginal utility and the equity premium is large. The calculation in the paper shows that, compared with locally durable and time-separable preferences, habit persistent preferences yield larger volatility in marginal utility, so the habit formation assumption implies a larger equity premium. But the paper also finds that the correlation embodied in the data is so small that, in order to match the observed equity premium, the volatility of marginal utility has to be much larger than that implied by strongly habit persistent utility functions.

The remainder of the paper is organized as follows. Section II specifies the class of utility functions to be studied. Section III reports the results of GMM estimation using the equilibrium conditions on the asset returns and the marginal utility. Section IV demonstrates that in actual data, the correlation between the marginal utility and the equity premium is small under time separable preferences and is even smaller with habit persistent utility functions. Finally, section V concludes.

II. The Utility Function

The representative agent's lifetime utility is assumed to be

$$E \sum_{t=1}^{\infty} \beta^{t} \frac{(c_t + s_t)^{1-\gamma}}{1-\gamma}, \qquad \gamma > 0, \tag{1}$$

where $\beta$ is the discount factor, $c_t$ is consumption at $t$, and

$$s_t = \theta_1 c_{t-1} \tag{2}$$

(see note 1).

In (2), if $\theta_1 = 0$, the utility function becomes time-separable; if $\theta_1 > 0$, the utility function shows local durability, which means that consumption in the previous period and consumption in the present period are substitutes; and finally, if $\theta_1 < 0$, the utility function shows habit formation, where a high level of consumption in the previous period changes the agent's habits, and the satisfaction the agent gains from current consumption depends on the difference between present consumption and the habit. The marginal utility of consumption $c_t$ divided by $\beta^t$ is given by

$$mru_t = (c_t + s_t)^{-\gamma} + E_t\, \beta \theta_1 (c_{t+1} + s_{t+1})^{-\gamma}. \tag{3}$$
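As an illustration of (3), the following minimal sketch computes the realized version of the marginal utility, in which the conditional expectation is replaced by the realized next-period term. The function name and the list-based data layout are assumptions made here for illustration; they are not taken from the paper.

```python
def realized_mru(c, beta, gamma, theta1):
    """Realized marginal utility based on equation (3).

    c is a list of consumption levels c_0, ..., c_T. Values are returned
    for t = 1, ..., T-1, because each one needs c_{t-1}, c_t and c_{t+1}.
    (Hypothetical helper, for illustration only.)
    """
    # Service flow c_t + s_t with s_t = theta1 * c_{t-1}, defined for t >= 1.
    flow = [c[t] + theta1 * c[t - 1] for t in range(1, len(c))]
    if any(x <= 0 for x in flow):
        raise ValueError("c_t + theta1 * c_{t-1} must be positive for all t")
    # flow[i] is the service flow at date t = i + 1; pair it with date t + 1.
    return [flow[i] ** (-gamma) + beta * theta1 * flow[i + 1] ** (-gamma)
            for i in range(len(flow) - 1)]
```

With `theta1 = 0` the function reduces to the time-separable case, where marginal utility is simply `c_t ** (-gamma)`.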

Equilibrium conditions in the market for assets imply that

|Mathematical Expression Omitted~

and

|Mathematical Expression Omitted~

where |Mathematical Expression Omitted~ is the realized return of the risky asset from period $t$ to $t + 1$, and |Mathematical Expression Omitted~ is the return of the risk-free bond from period $t$ to $t + 1$. The latter is assumed known with certainty at time $t$, whereas the former is not. The parameters we wish to estimate are $b_0 = \{\beta, \gamma, \theta_1\}$.

III. GMM Estimation

Define |Mathematical Expression Omitted~ as $mru_t$ without the conditional expectation operator on the second component of (3), i.e., |Mathematical Expression Omitted~ is the realized marginal utility. Denote the vector of parameters $\{\beta, \gamma, \theta_1\}$ by $b$. Given a scaling factor $SF_t(b)$, expressions similar to (4) and (5) can be rewritten as

|Mathematical Expression Omitted~,

where |Mathematical Expression Omitted~; and

|Mathematical Expression Omitted~,

where |Mathematical Expression Omitted~.

Equations (4) and (5) also imply that in (4') and (5'), |Mathematical Expression Omitted~. The scaling factor $SF_t$ is introduced because aggregate consumption $c_t$ grows with $t$, so that $\eta_t$, the vector (|Mathematical Expression Omitted~), is nonstationary. Assuming that $\{c_{t+1}/c_t\}$ is stationary and ergodic, the residuals $\eta_{t+1}$ of equations (4') and (5') are scaled by the factor $SF_t$ before being used as disturbances in GMM estimation. $SF_t$ contains data observable at period $t$ and also contains a trend, so that $\eta_{t+1} SF_t$ is stationary. There are many ways to construct scaling factors, and choosing among them is an important issue that is discussed later in this section.

The method of GMM estimation can be described as follows. Suppose that one desires to estimate a parameter vector $b_0$. The orthogonality conditions for a sample of $T$ observations imply that the following vector $g_T(b)$, evaluated at $b_0$, must be close to zero if $T$ is large.

|Mathematical Expression Omitted~

where |Mathematical Expression Omitted~ is the Kronecker product and $IV_t$ is a vector of instrumental variables. The GMM estimator |Mathematical Expression Omitted~ for a given weighting matrix $W$ is

|Mathematical Expression Omitted~

where $g_T'(b)$ is the transpose of the vector $g_T(b)$.
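The construction just described can be sketched in a few lines. The sketch assumes that $g_T(b)$ is the sample average of the Kronecker product of the scaled residual vector and the instrument vector, and that the estimator minimizes the quadratic form $g_T'(b) W g_T(b)$, as the surrounding text indicates; the exact expressions are not recoverable here, and all function names are hypothetical.

```python
def kron(u, v):
    """Kronecker product of two vectors, returned as a flat list."""
    return [ui * vj for ui in u for vj in v]

def sample_moments(scaled_residuals, instruments):
    """g_T(b): average over t of kron(scaled residual at t+1, IV_t).

    scaled_residuals[t] is the residual vector already multiplied by SF_t;
    instruments[t] is the instrument vector observed at t.
    """
    T = len(scaled_residuals)
    terms = [kron(e, z) for e, z in zip(scaled_residuals, instruments)]
    return [sum(column) / T for column in zip(*terms)]

def gmm_objective(g, W):
    """Quadratic form g' W g for a moment vector g and weighting matrix W."""
    n = len(g)
    return sum(g[i] * W[i][j] * g[j] for i in range(n) for j in range(n))
```

In an actual estimation the residuals would be recomputed for each candidate parameter vector $b$, and the objective would be passed to a numerical minimizer.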

In practice, GMM estimation proceeds by iterating the following step. At step $k$, for given |Mathematical Expression Omitted~, calculate the weighting matrix |Mathematical Expression Omitted~ (the matrix $\Omega(b)$ is defined in Appendix A; the usual initial choice is the identity matrix), then search for |Mathematical Expression Omitted~ so that

|Mathematical Expression Omitted~

The convergence criteria are also discussed in Appendix A.

As emphasized previously, there are many ways to detrend the residuals $\eta_{t+1}$ in (4') and (5'). Like instrumental variables, scaling factors can affect the finite sample estimates. But there is a difference between scaling factors and ordinary instrumental variables: instruments are usually unrelated to the parameters to be estimated, whereas scaling factors depend on the parameters by construction. With the marginal utility given by (3), two convenient choices of $SF_t$ are

(A) |Mathematical Expression Omitted~;

(B) $SF_t = (c_t + \theta_1 c_{t-1})^{\gamma}$, with $\theta_1 \le 1$.

Both scaling factors are natural candidates because of their simplicity and because, if the growth rate of consumption is stationary, then $SF_t \eta_{t+1}$ is stationary. The constraint $\theta_1 \le 1$ means that the consumption of the previous period cannot carry a larger weight than the consumption of the current period. Such constraints are particularly plausible when consumption stands for nondurable goods.
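The detrending role of scaling factor (B) can be checked numerically: the product of next-period realized marginal utility and $SF_t$ depends only on consumption ratios, so multiplying every consumption level by a constant leaves it unchanged. The sketch below is illustrative only, and the function name is an assumption.

```python
def scaled_mru_next(c, beta, gamma, theta1):
    """Realized mru_{t+1} times SF_t, with SF_t = (c_t + theta1*c_{t-1})**gamma.

    c is a list c_0, ..., c_T; values are returned for t = 1, ..., T-2.
    (Hypothetical helper, for illustration only.)
    """
    # flow[i] is c_t + theta1 * c_{t-1} at date t = i + 1.
    flow = [c[t] + theta1 * c[t - 1] for t in range(1, len(c))]
    out = []
    for i in range(len(flow) - 2):
        sf_t = flow[i] ** gamma  # scaling factor (B) at date t
        mru_next = (flow[i + 1] ** (-gamma)
                    + beta * theta1 * flow[i + 2] ** (-gamma))
        out.append(mru_next * sf_t)
    return out
```

For example, calling the function on a consumption series and on the same series multiplied by 100 returns the same values up to rounding, which is the homogeneity property that makes the scaled residuals trend-free when consumption growth is stationary.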

The data set consists of monthly consumption, stock returns, and bond returns from 1959:1 to 1988:12. The measure of consumption is real per capita spending on nondurable goods.(2) The stock return is the value-weighted average of ex-post returns of stocks listed on the New York Stock Exchange. The bond return is the one-month T-bill rate. The asset returns are obtained from the CRSP tape and converted to real terms by the consumption deflator.

The set of instrumental variables associated with both equations (4') and (5') is |Mathematical Expression Omitted~.(3) The GMM estimation result is reported in Table I.

TABULAR DATA OMITTED

Panel A of Table I reports the result with |Mathematical Expression Omitted~ under the constraints $0 \le \gamma \le 20$, $\theta_1 \le 1$, and |Mathematical Expression Omitted~ for all $t$. The estimation results in very large $\gamma$ and $\theta_1$. $\theta_1 = 1$ means that the consumption of nondurable goods in the previous month has the same effect as the consumption of the current month. This result appears implausible. $\gamma$ is also much larger than the estimates obtained by other authors.(4) Finally, the chi-square statistic strongly rejects the overidentifying restrictions.

Panel B of Table I reports the estimates with $SF_t = (c_t + \theta_1 c_{t-1})^{\gamma}$, $\theta_1 \le 1$, and |Mathematical Expression Omitted~ for all $t$. The parameters to be estimated are still $b = \{\beta, \gamma, \theta_1\}$. As in Panel A, the estimate for $\beta$ is larger than 1, which supports the conjecture of a negative discount rate by Kocherlakota [12]. But the estimates for $\gamma$ and $\theta_1$ are much smaller than those obtained with the previous scaling factor. The chi-square statistic for the overidentifying restrictions in Panel B is much smaller than that in Panel A. Compared with the estimates from the first scaling factor, the results generated by the second scaling factor are more reasonable.

In principle, scaling factors affect finite sample estimates in much the same way as instrumental variables do. So long as the data sample is finite, the sample estimates are influenced by instruments and scaling factors. The importance of instruments is discussed in detail by many authors (e.g., Ferson and Constantinides [5]). The comparison of the scaling factors above shows that in finite sample estimation the choice of scaling factor can also be very important.

The estimation result obtained by using scaling factor (B) indicates that one dollar of nondurable goods consumed in the previous month generates the same utility in the current month as 46 cents of consumption in the current month. The effect of local durability is therefore nontrivial. The fact that the GMM estimation results in a locally durable utility function instead of a habit persistent one indicates that the estimates may have problems matching the observed equity premium. The next section examines the equity premium implied by the estimates.

IV. Matching the Equity Premium

In this section we study some sample statistics of the actual data. Denote |Mathematical Expression Omitted~, the ex-post equity premium, by $RP_{t+1}$. From (4) and (5) we have

|Mathematical Expression Omitted~ (see note 5)

As in the previous section, let $SF_t = (c_t + \theta_1 c_{t-1})^{\gamma}$. Denote $E(\cdot)$ as the mean, $SD(\cdot)$ as the standard deviation, and |Mathematical Expression Omitted~ as the correlation between |Mathematical Expression Omitted~ and $RP_{t+1}$. From (9)

|Mathematical Expression Omitted~
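The step from (9) to (10) can be sketched as follows, under the assumption (suggested by note 5 and by the numerical values quoted later in this section, but not recoverable from the omitted expressions themselves) that (9) is the unconditional moment condition $E[x_{t+1} RP_{t+1}] = 0$. Here $x_{t+1}$ is shorthand introduced only for this sketch for the scaled realized marginal utility $mru_{t+1} SF_t$, and $\rho$ denotes its correlation with $RP_{t+1}$.

```latex
\begin{align*}
0 = E[x_{t+1} RP_{t+1}]
  &= \mathrm{Cov}(x_{t+1}, RP_{t+1}) + E(x_{t+1})\, E(RP_{t+1}) \\
\Rightarrow \quad E(RP_{t+1})
  &= -\frac{\mathrm{Cov}(x_{t+1}, RP_{t+1})}{E(x_{t+1})}
   = -\rho \, \frac{SD(x_{t+1})}{E(x_{t+1})} \, SD(RP_{t+1}).
\end{align*}
```

On this reading, the implied premium is large only when the correlation is strongly negative and the ratio of the standard deviation of scaled marginal utility to its mean is large.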

The difference between the observed average equity premium and the value of the RHS of (10) can be used as a measure of the match to the equity premium. Since $mru_{t+1} SF_t$ depends on the parameters $\{\beta, \gamma, \theta_1\}$, if we fix $\beta$ and $\gamma$ then the sample consumption and the parameter $\theta_1$ will determine |Mathematical Expression Omitted~ as well as the implied equity premium (as expressed by the RHS of (10)). The primary task of this section is to study how the sample statistics of marginal utility and the implied equity premium change with $\theta_1$. Let $\beta = 1.001$ and $\gamma = 4.838$, as suggested by the GMM estimates; also let |Mathematical Expression Omitted~ stand for the standard error, |Mathematical Expression Omitted~ for the sample correlation, |Mathematical Expression Omitted~ for the sample average, and DRP, defined below, for the match of the equity premium:

|Mathematical Expression Omitted~

where |Mathematical Expression Omitted~ and |Mathematical Expression Omitted~ are given by the sample equity premium; |Mathematical Expression Omitted~, |Mathematical Expression Omitted~, and |Mathematical Expression Omitted~ can be derived from the sample data and given values of $\theta_1$. With the same set of monthly data used in the GMM estimation, |Mathematical Expression Omitted~ and |Mathematical Expression Omitted~.

By definition, the smaller DRP($\theta_1$) is, the higher is the theoretical equity premium. If DRP($\theta_1$) is larger than zero, then the sample average of the equity premium implied by the theory is smaller than the observed one. For given |Mathematical Expression Omitted~, equation (11) suggests that the difficulty of generating a small DRP may stem from low volatility of |Mathematical Expression Omitted~, measured by the ratio |Mathematical Expression Omitted~, or from a small negative correlation |Mathematical Expression Omitted~. The bound tests of Gallant, Hansen, and Tauchen [7] and Hansen and Jagannathan [10] suggest that the volatility problem can be solved by introducing habit formation, but what is unknown is the behavior of |Mathematical Expression Omitted~ as $\theta_1$ changes. Figure 1 plots |Mathematical Expression Omitted~, |Mathematical Expression Omitted~, and DRP as functions of $\theta_1$. DRP is in percentage terms. The set of monthly data is the same as the one used in the estimation. The lowest value for $\theta_1$ is set to $-0.62$ because if $\theta_1$ is smaller than that value, the marginal utility becomes negative.
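The three statistics plotted against $\theta_1$ can be computed from a series of scaled marginal utilities and a series of ex-post equity premia. The sketch below assumes that DRP in (11) is the sample mean of the premium plus the product of the sample correlation, the volatility ratio, and the sample standard deviation of the premium. That form is inferred from the description in the text (DRP is zero when the model matches the observed premium, and the lower bound arises at a correlation of $-1$); it is not taken from the omitted expression. Function and variable names are hypothetical.

```python
import math

def drp_components(x, rp):
    """Return (correlation, volatility ratio, DRP) for two equal-length samples.

    x  : scaled realized marginal utility, mru_{t+1} * SF_t
    rp : ex-post equity premium RP_{t+1}
    Sample moments use divisor n. The DRP formula is an assumed form of (11).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_rp = sum(rp) / n
    sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / n)
    sd_rp = math.sqrt(sum((v - mean_rp) ** 2 for v in rp) / n)
    cov = sum((a - mean_x) * (b - mean_rp) for a, b in zip(x, rp)) / n
    corr = cov / (sd_x * sd_rp)
    ratio = sd_x / mean_x  # standard deviation relative to the mean
    drp = mean_rp + corr * ratio * sd_rp
    return corr, ratio, drp
```

Under this assumed form, DRP equals the sample mean of `x * rp` divided by the sample mean of `x`, so it is zero exactly when the unconditional sample moment condition holds.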

From Figure 1, the following features can be observed:

(i) Habit formation ($\theta_1 < 0$) implies large volatility in marginal utility.

|Mathematical Expression Omitted~ reaches its highest level, 0.32, at $\theta_1 = -0.62$. It is decreasing in $\theta_1$ until it reaches its lowest level, 0.025, at $\theta_1 = 0.64$. So there are considerable changes in |Mathematical Expression Omitted~ as $\theta_1$ varies. Figure 1 verifies the intuition that the habit formation assumption may imply a higher equity premium through its impact on the volatility of marginal utility.

(ii) Habit persistent utility functions generate larger equilibrium equity premia compared with time-separable and locally durable utility functions.

DRP is minimized at $\theta_1 = -0.62$, is increasing in $\theta_1$, and reaches its maximum at $\theta_1 = 1.0$. This means that the discrepancy between the mean of the equity premium implied by equation (10) and the observed one is minimized by a habit persistent utility function. Figure 1 also illustrates the model's ability to match the observed equity premium. DRP is about 0.39% when $\theta_1 = 0.46$ (the GMM estimate), which means that the model equity premium is less than one percent of the size actually observed. DRP is 0.38% at $\theta_1 = 0$ (which means that the model equity premium is less than six percent of the size actually observed, the puzzle posed by Mehra and Prescott), and 0.25% at $\theta_1 = -0.62$ (which means that the model equity premium is about forty percent of that observed if we assume strong habit persistence). Therefore the habit formation assumption increases the model equity premium but still leaves a substantial part of the observed equity premium unexplained.

(iii) The equity premium puzzle cannot be solved by introducing habit formation because of the small correlation |Mathematical Expression Omitted~.

In the figure, the maximum of |Mathematical Expression Omitted~ is less than 0.15. Given this maximum correlation and the fact that the monthly ex-post equity premium has a sample average of 0.41% and a standard error of 4.4%, the ratio |Mathematical Expression Omitted~ that lets DRP be zero is at least 0.62. But the largest ratio shown in the figure is about 0.32. Moreover, Figure 1 shows that, for $\theta_1 < 0$, the increase in |Mathematical Expression Omitted~ due to a decrease in $\theta_1$ is partially offset by the decrease in |Mathematical Expression Omitted~. This property of the actual time series data reduces the power of the habit formation theory in matching the equity premium. The key difficulty in matching the observed equity premium, according to the figure, lies in the fact that the correlation between the marginal utility and the equity premium is too small. The figure also shows that habit persistent utility functions result in even smaller correlations |Mathematical Expression Omitted~ than the time-separable utility function does.
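The figure of 0.62 can be reproduced with a one-line calculation. The sketch assumes that setting DRP to zero requires the volatility ratio to equal the mean premium divided by the product of its standard deviation and the magnitude of the correlation, which is an inferred reading of (11); the function name is hypothetical.

```python
def required_volatility_ratio(mean_rp, sd_rp, corr_magnitude):
    """Volatility ratio SD/E of scaled marginal utility needed for DRP = 0.

    Assumes DRP = mean_rp - corr_magnitude * ratio * sd_rp (inferred form).
    """
    return mean_rp / (sd_rp * corr_magnitude)

# Sample values quoted in the text: mean 0.41%, standard error 4.4%,
# and a correlation magnitude of at most 0.15.
ratio = required_volatility_ratio(0.41, 4.4, 0.15)
print(round(ratio, 2))  # 0.62
```

Because 0.15 is an upper limit on the correlation magnitude, 0.62 is a lower limit on the required ratio, roughly twice the largest ratio of 0.32 shown in the figure.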

The reason that GMM estimation implies local durability instead of habit formation is that although habit formation explains the equity premium better, it does not fit other moments well. This point is made by Gallant et al. [7]. They found the Euler equation residuals predictable when a habit persistent utility function is tested.

Figure 1 implies that matching the equity premium with the class of utility functions studied in this paper is more difficult when actual data are used than when artificial data generated from equilibrium conditions are used. We make the same observation when different utility functions are employed. Calculations of the sample statistics using the utility functions studied by Eichenbaum, Hansen, and Singleton [3], and Dunn and Singleton [2] are reported in Appendix B.

The approach presented above is related to the bound constraint proposed by Heaton [11], Gallant, Hansen, and Tauchen [7], and Hansen and Jagannathan [10]. They used asset return data to obtain the admissible region of the mean and the standard deviation of IMSR (Intertemporal Marginal Rate of Substitution) by the Euler equations, an approach that does not rest on the use of a particular utility function. With the same utility function used in this paper, they showed that the assumption of habit formation coupled with a high risk aversion coefficient can push the mean and the standard deviation of IMSR into the admissible region. The finding means that the introduction of habit formation makes IMSR volatile enough to solve the equity premium puzzle if the correlation between IMSR and the equity premium is large.

We now use (11) to do a simpler bound test.(6) Although the test is less general than those conducted by the authors mentioned above, it illustrates more directly the key to the equity premium puzzle. The variable we examine is |Mathematical Expression Omitted~ instead of IMSR. Figure 1 shows that strong habit formation does make |Mathematical Expression Omitted~ more volatile. The lower bound for the admissible volatility of marginal utility |Mathematical Expression Omitted~ can be obtained by letting |Mathematical Expression Omitted~ be $-1$ and DRP in (11) be zero. The lower bound for |Mathematical Expression Omitted~ is about 0.093. In Figure 1 the ratio |Mathematical Expression Omitted~ is higher than the lower bound as long as $\theta_1$ is smaller than $-0.35$. In other words, when habit formation is strong enough, the ratio |Mathematical Expression Omitted~ falls into the admissible region. Apparently, the small correlation |Mathematical Expression Omitted~ is the main cause of the equity premium puzzle. Although low volatility of |Mathematical Expression Omitted~ or of IMSR can also contribute to the puzzle, it can be overcome by the habit formation assumption. However, the problem of the small correlation |Mathematical Expression Omitted~ does not go away in the presence of habit formation.
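The lower bound of about 0.093 can be checked by hand. The calculation below assumes that (11) has the form $DRP = \hat{E}(RP) + \hat{\rho}\,[SE(x)/\hat{E}(x)]\,SE(RP)$, where $x$ is shorthand introduced here for the scaled marginal utility; this form is inferred from the surrounding text rather than taken from the omitted expression. Setting $DRP = 0$ and $\hat{\rho} = -1$, and using the sample mean of 0.41% and standard error of 4.4% reported for the monthly equity premium:

```latex
\[
\frac{SE(x)}{\hat{E}(x)} = \frac{\hat{E}(RP)}{SE(RP)} = \frac{0.41}{4.4} \approx 0.093 .
\]
```

Any correlation weaker than $-1$ raises the required ratio above this value, which is why the bound is a necessary condition and not a sufficient one.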

V. Concluding Remarks

In this paper, the GMM estimates are shown to be sensitive to scaling factors. When a plausible scaling factor is used, the estimated utility function exhibits local durability. The estimated utility function is unable to explain the equity premium puzzle.

The equity premium puzzle can be resolved if the normalized marginal utility is volatile and is strongly negatively correlated with the equity premium. Habit persistence in the preference yields volatile marginal utility. And if the artificial consumption is solved from the Euler equations, the negative correlation is produced automatically. In the actual data, the correlation between marginal utility and equity premium is small. Thus when time series data are used instead of several low moments, the test is more difficult to pass. The habit formation assumption does reproduce the admissible volatility of normalized marginal utility. But if we claim that habit formation resolves the equity premium puzzle, we have to assume a large correlation between the marginal utility and equity premium that does not exist in the data.

What makes the model unsatisfactory may be the use of aggregate consumption data. Using household panel data, Mankiw and Zeldes [15] found that the consumption of stockholders is more volatile and more highly correlated with the equity premium than aggregate consumption is. Since only about one-fourth of U.S. families own stock, tests based on aggregate consumption tend to reject the model even if the theory is correct for stockholders. However, although it now seems clear that empirical tests using disaggregated consumption data will be more fruitful, an immediate obstacle is the difficulty of approximating the consumption of stockholders as a long time series.

Appendix A

Hansen [8] showed that the smallest asymptotic covariance matrix of the estimator |Mathematical Expression Omitted~ can be obtained by letting the weighting matrix $W$ be |Mathematical Expression Omitted~. Because we assume that the consumption of the previous period affects the current period utility, $S_0$ is defined by

$$S_0 = \sum_{k=-1}^{1} E\{F_t(b_0) F_{t-k}'(b_0)\},$$

where |Mathematical Expression Omitted~

Following Newey and West [17], a consistent estimator of $S_0$ that is positive semidefinite in a finite sample is

$$\Omega(b) = \frac{1}{T}\left\{ \sum_{t=1}^{T} F_t(b) F_t'(b) + \frac{1}{2} \sum_{t=2}^{T} \left[ F_t(b) F_{t-1}'(b) + F_{t-1}(b) F_t'(b) \right] \right\}.$$
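A minimal sketch of the estimator $\Omega(b)$ above, written with plain Python lists so that it has no dependencies. It takes the sequence of moment vectors $F_t(b)$ as given; the function name is hypothetical.

```python
def omega_one_lag(F):
    """One-lag Newey-West estimator with weight 1/2 on the first autocovariance.

    F is a list of T moment vectors F_t(b), each a list of equal length n.
    Returns the n x n matrix Omega(b) as a list of rows.
    """
    T = len(F)
    n = len(F[0])
    S = [[0.0] * n for _ in range(n)]
    for t in range(T):
        for i in range(n):
            for j in range(n):
                # Contemporaneous term F_t F_t'.
                S[i][j] += F[t][i] * F[t][j]
                if t >= 1:
                    # Symmetrized first-order term, weighted by 1/2.
                    S[i][j] += 0.5 * (F[t][i] * F[t - 1][j]
                                      + F[t - 1][i] * F[t][j])
    return [[s / T for s in row] for row in S]
```

The result is symmetric by construction, and its inverse would serve as the weighting matrix in the next iteration of the estimation.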

In practice there are two commonly used procedures to obtain |Mathematical Expression Omitted~. Both procedures involve steps of iteration of the following: For given |Mathematical Expression Omitted~, calculate |Mathematical Expression Omitted~, (usually the initial choice |Mathematical Expression Omitted~), then search for |Mathematical Expression Omitted~ such that for g(b) given by (6),

|Mathematical Expression Omitted~.

One approach is the two-step procedure described by Hansen and Singleton [9]. Under that procedure, |Mathematical Expression Omitted~ is obtained as $\arg\min_b g_T'(b) g_T(b)$, and is then substituted into (A1) to calculate |Mathematical Expression Omitted~. By Theorem 2.1 of Hansen [8], |Mathematical Expression Omitted~ is a consistent estimator of $b_0$ under some regularity conditions, which implies that |Mathematical Expression Omitted~ is the asymptotically optimal weighting matrix. An alternative approach is to keep iterating by (A1) until |Mathematical Expression Omitted~ converges. This procedure is described by Dunn and Singleton [2] and is also adopted in this paper. The criterion for stopping is to treat |Mathematical Expression Omitted~ as |Mathematical Expression Omitted~ if |Mathematical Expression Omitted~, where $\delta$ is a very small number. The sum of the squares of the elements of $b$ divided by the dimension of $b$ is used as the norm, and $\delta$ is chosen to be $10^{-6}$. The initial |Mathematical Expression Omitted~ is arbitrary and |Mathematical Expression Omitted~ is picked to be the identity matrix. The variance-covariance matrix of the asymptotically normally distributed estimator is

|Mathematical Expression Omitted~

In practice, |Mathematical Expression Omitted~ can be obtained from sample data. In this paper, the derivatives are calculated numerically.
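The iterated procedure and its stopping rule can be sketched as a generic loop. The sketch assumes that the omitted criterion compares the norm of the change in the parameter vector between successive iterations with the tolerance, using the norm described above (sum of squared elements divided by the dimension). The `update` callable, which would recompute the weighting matrix at the current estimate and re-minimize the GMM objective, is a hypothetical placeholder supplied by the user.

```python
def step_norm(b_new, b_old):
    """Sum of squared elements of the change in b, divided by the dimension."""
    return sum((x - y) ** 2 for x, y in zip(b_new, b_old)) / len(b_new)

def iterate_until_converged(b0, update, delta=1e-6, max_iter=1000):
    """Apply `update` repeatedly until successive estimates are close.

    b0     : initial parameter vector (list of floats)
    update : callable mapping the current estimate to the next one
             (hypothetical stand-in for one GMM iteration)
    Returns the final estimate and the number of iterations used.
    """
    b = list(b0)
    for k in range(1, max_iter + 1):
        b_next = update(b)
        if step_norm(b_next, b) < delta:
            return b_next, k
        b = b_next
    raise RuntimeError("no convergence within max_iter iterations")
```

The two-step procedure corresponds to stopping after the second update regardless of the norm, whereas the iterated procedure adopted in the paper continues until the rule is met.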

Appendix B

In Eichenbaum, Hansen, and Singleton [3], the utility function, which includes leisure ($L$), is assumed to be

|Mathematical Expression Omitted~

where $c_t^* = c_t + \theta_1 c_{t-1}$ and $L_t^* = L_t + \lambda L_{t-1}$.

The realized marginal utility of consumption |c.sub.t~ is

|Mathematical Expression Omitted~

We choose the scaling factor |Mathematical Expression Omitted~.

Eichenbaum, Hansen, and Singleton estimated $\delta$ to be .14. They also estimated $\lambda$ to be .7 when only the asset holding equation was tested, and $\lambda$ to be $-.7$ when both the asset holding equation and the intratemporal Euler equation on the choice of leisure were tested. If we fix the parameters at the estimates obtained by Eichenbaum, Hansen, and Singleton and change $\theta_1$ from $-1$ to 1, the variation in the negative correlation between |Mathematical Expression Omitted~ and $RP_{t+1}$ (denoted by |Mathematical Expression Omitted~) and the ratio of the standard error of |Mathematical Expression Omitted~ to its mean, |Mathematical Expression Omitted~, can be derived. The variables used here are the same as those used by Eichenbaum, Hansen, and Singleton. Consumption is monthly real per capita consumption of nondurable goods and services, dated from 1959:1 to 1988:12. Leisure of the representative agent is calculated as the time endowment of 112 hours per week minus the average hours worked.

The behavior of |Mathematical Expression Omitted~ and |Mathematical Expression Omitted~ is roughly the same as that of their counterparts in Figure 1: the correlation between |Mathematical Expression Omitted~ and $RP_{t+1}$ is low, and may move in the opposite direction from the ratio |Mathematical Expression Omitted~. In particular, |Mathematical Expression Omitted~ is around .067 when $\lambda = .7$ and ranges from .020 to .025 when $\lambda = -.7$. So in this economy |Mathematical Expression Omitted~ is higher when the consumption of leisure shows durability, but the correlation is still insignificant. In this economy |Mathematical Expression Omitted~ is between .012 and 0.16 when $\lambda = -.7$ and between .010 and .011 when $\lambda = .7$. Therefore when the consumption of leisure shows habit persistence, |Mathematical Expression Omitted~ is higher, but its value is too low to explain the equity premium puzzle. This observation raises another problem related to real business cycle theory (see Kydland and Prescott [13]). One labor market phenomenon that troubles the competitive equilibrium theory is the large volatility of employment relative to output fluctuations. The assumption of durability in the consumption of leisure will push the volatility of equilibrium employment closer to that implied by the real data. But this assumption will imply a smaller |Mathematical Expression Omitted~, therefore making the equity premium more puzzling.

Now consider the utility function used by Dunn and Singleton, which includes durable goods (d), and is given by

|Mathematical Expression Omitted~,

where the service from consumption is $c_t^* = c_t + \theta_1 c_{t-1}$, the service from durable goods is $d_t^* = \omega (k_{t-1} + d_t)$, $k_{t-1}$ is the stock of durable goods at the beginning of period $t$, and $d_t$ is the purchase of durable goods in period $t$. Let the scaling factor $SF_t$ be |Mathematical Expression Omitted~. Dunn and Singleton estimated $\delta$ to be .9, $\omega$ to be .01, and $\gamma$ close to $-1$. These numbers are used here. The consumption data are similar to those used in the estimation reported above. Monthly durable goods purchases from 1959:1 to 1988:12 are obtained from the CITIBASE tape, and the stock of durable goods as of December 1958 is taken to be the same value used by Dunn and Singleton.

The correlation between |Mathematical Expression Omitted~ and $RP_{t+1}$ is shown to be in the range of .07 to .18 as $\theta_1$ changes from $-1$ to 1. The ratio of the standard deviation of |Mathematical Expression Omitted~ to its mean is between .008 and .03. The co-movement of these two statistics shows the same pattern as that in Figure 1. Thus it can be concluded that with the utility functions studied by Eichenbaum, Hansen, and Singleton [3], and Dunn and Singleton [2], the observed equity premium cannot be explained by habit persistence.

I am grateful to Wayne Ferson, Lars Hansen, Kiseok Lee, Peter Mueser, Ron Ratti, Chris Sims, and the referee for their valuable comments on the earlier versions of the paper. The responsibility for any remaining errors is my own.

1. More generally, we may assume that $s_t$ depends on consumption over the past N periods, namely |Mathematical Expression Omitted~ and |Mathematical Expression Omitted~, for N $\geq$ 1. But evidence from previous empirical studies (Dunn and Singleton |2~, and Gallant and Tauchen |6~) suggests that one lag, i.e., N = 1, suffices to fit the data.

2. This definition of consumption was used by Ferson and Constantinides.

3. Tauchen |18~ showed that a large set of instrumental variables is likely to lead to biased estimates in finite sample GMM estimation. The set of instruments here is commonly used, although Ferson and Constantinides argued against using it. Since the main point of the estimation is to show the sensitivity of the results to scaling factors when the same set of instruments is used, we use this convenient set of instruments.

4. If $\theta_1$ is fixed at 0, then the scaling factor is not problematic. In this case, the estimates are $\beta$ = 1.0001 (with a standard error of 0.008) and $\gamma$ = 2.700 (with a standard error of 2.446). These numbers (which are not reported in Table I) are similar to those obtained by Hansen and Singleton |9~.

5. We use the unconditional expectation here to avoid calculating the conditional expectation. This simplification causes a loss of information, but the loss is not essential to the main point of the section.

6. This test is less general than those of Gallant, Hansen, and Tauchen |7~, and Hansen and Jagannathan |10~ in two respects. First, this bound is parametric (i.e., it depends on the specification of the utility function) whereas theirs are non-parametric. Second, this bound is constructed from the equity premium data only, while theirs are generally based on a vector of asset returns.

References

1. Constantinides, George M., "Habit Formation: A Resolution of the Equity Premium Puzzle." Journal of Political Economy, June 1990, 519-43.

2. Dunn, Kenneth B. and Kenneth J. Singleton, "Modeling the Term Structure of Interest Rates Under Non-Separable Utility and Durability of Goods." Journal of Financial Economics, September 1986, 27-55.

3. Eichenbaum, Martin S., Lars P. Hansen, and Kenneth J. Singleton, "A Time Series Analysis of Representative Agent Models of Consumption and Leisure Choice Under Uncertainty." Quarterly Journal of Economics, February 1988, 51-78.

4. ----- and -----, "Estimating Models With Intertemporal Substitution Using Aggregate Time Series Data." Journal of Business and Economic Statistics, January 1990, 53-69.

5. Ferson, Wayne E. and George M. Constantinides, "Habit Formation and Durability in Aggregate Consumption: Empirical Tests." Journal of Financial Economics, October 1991, 199-240.

6. Gallant, A. Ronald and George Tauchen, "Seminonparametric Estimation of Conditionally Constrained Heterogeneous Processes: Asset Pricing Applications." Econometrica, September 1989, 1091-120.

7. -----, Lars P. Hansen and George Tauchen, "Using Conditional Moments of Asset Payoffs To Infer the Volatility of Intertemporal Marginal Rates of Substitution." Journal of Econometrics, January 1990, 141-79.

8. Hansen, Lars P., "Large Sample Properties of Generalized Method of Moments Estimators." Econometrica, July 1982, 1029-54.

9. ----- and Kenneth J. Singleton, "Generalized Instrumental Variable Estimation of Nonlinear Rational Expectations Models." Econometrica, September 1982, 1269-86.

10. ----- and Ravi Jagannathan, "Implications of Security Market Data for Models of Dynamic Economies." Journal of Political Economy, April 1991, 225-62.

11. Heaton, John, "Notes on an Empirical Investigation of Asset Pricing with Nonseparable Preference Specifications." Manuscript, U. of Chicago, 1988.

12. Kocherlakota, Narayana R., "On the 'Discount' Factor in Growth Economies." Journal of Monetary Economics, January 1990, 43-47.

13. Kydland, Finn E. and Edward C. Prescott, "Time to Build and Aggregate Fluctuations." Econometrica, November 1982, 1345-70.

14. Lucas, Robert E., Jr., "Asset Prices in an Exchange Economy." Econometrica, November 1978, 1429-45.

15. Mankiw, N. Gregory and Stephen P. Zeldes, "The Consumption of Stockholders and Nonstockholders." Journal of Financial Economics, March 1991, 97-112.

16. Mehra, Rajnish and Edward C. Prescott, "The Equity Premium: A Puzzle." Journal of Monetary Economics, March 1985, 145-61.

17. Newey, Whitney K. and Kenneth D. West, "A Simple, Positive Semi-Definite, Heteroskedasticity and Auto-correlation Consistent Covariance Matrix." Econometrica, May 1987, 703-08.

18. Tauchen, George, "Statistical Properties of GMM Estimators of Structural Parameters Obtained From Financial Market Data." Journal of Business and Economic Statistics, October 1985, 397-415.

The consumption based asset pricing model of Lucas |14~ defines a theoretical relationship between streams of consumption and equilibrium asset prices. Since data of both aggregate consumption and asset prices are available, the theory can be tested. Empirical tests of the Lucas model using standard time separable utility functions indicate mismatches between the theory and the data. For example, in the GMM (Generalized Method of Moments) estimation of Hansen and Singleton |9~, overidentifying constraints implied by the model were rejected. Mehra and Prescott |16~ demonstrated the difficulty of explaining a particular statistic: the theoretical expected equity premium (the yield differential between stocks and risk-free bonds) is much higher than the observed one if a standard utility function is used. They called the mismatch "the equity premium puzzle".

To improve the performance of the model, several authors have relaxed the time-separability of preferences. In a stimulating paper, Constantinides |1~ argued that the equity premium puzzle can be resolved through the assumption of "habit formation". The idea is that consumption in the past reduces utility in the present because it establishes habits. His model can match the observed mean and the variance of both the equity premium and the consumption growth rate. The match of the low moments is consistent with the bound tests of Heaton |11~, Gallant, Hansen, and Tauchen |7~, and Hansen and Jagannathan |10~.

On the other hand, GMM estimations using monthly aggregate consumption of nondurable goods and leisure by Eichenbaum, Hansen, and Singleton |3~, and that using consumption of nondurable goods and durable goods by Dunn and Singleton |2~ and Eichenbaum and Hansen |4~, showed that if current utility depends on current and past consumption, then current utility increases in past consumption, i.e., consumption has "local durability," the opposite of that implied by habit formation. A particularly strong result was obtained by Gallant and Tauchen |6~. They estimated a general form of utility functions with a general law of motion of data and found that the source of the time-non-separability is local durability. Since it is known that local durability produces a smaller equilibrium equity premium, these results imply that in order to match other moments well, the match for the equity premium has to be sacrificed.

However, the GMM estimations on the nature of time-non-separability are not yet conclusive. Ferson and Constantinides |5~ illustrated that the fit of the model with multi-asset portfolios can be improved by either introducing habit formation or durability, and the chi-square statistics are not dramatically different in these two cases. They also found that the sample estimates are influenced by the choice of instrumental variables. Using the instruments they considered plausible, the GMM estimate indicates that the utility function displays habit persistence.

There are many factors that affect outcomes of finite sample GMM estimations. One of them is the scaling factor. The actual level of consumption exhibits an upward trend, therefore it is not stationary. As a result, marginal utility of consumption is also non-stationary. Since the asymptotic theory can be expected to give a reasonable approximation under stationarity, researchers use a scaling factor to offset the trend of the marginal utility if the utility function to be estimated is homogeneous. The scaling factors are similar to ordinary instrumental variables except that by construction scaling factors explicitly depend on the parameters to be estimated, whereas ordinary instrumental variables do not. According to the consistency theorem of Hansen |8~, the instruments and scaling factors should not matter asymptotically. But like instruments, the scaling factors can affect finite sample estimates. The various scaling factors used by different authors may have a nontrivial impact, but such an impact has not been drawn out.

The contribution of the present paper is two fold. First, GMM estimations are conducted using different scaling factors. The paper shows that scaling factors are important to finite sample estimates. Using a plausible scaling factor, our estimated utility function is locally durable. This implies that habit formation assumption cannot explain some moments other than the equity premium. Secondly, it demonstrates that even the equity premium puzzle is not solved by the introduction of habit formation because the correlation between aggregate consumption and the equity premium is too small. It is held that the model equity premium is large when the detrended marginal utility is volatile and when the correlation between marginal utility and equity premium is large. The calculation in the paper shows that comparing with the locally durable and time-separable preferences, the habit persistent preferences yield larger volatility in the marginal utility. So the habit formation assumption implies larger equity premium. But the paper also finds the correlation embodied in the data is so small that in order to match the observed equity premium, the volatility of marginal utility has to be much larger than that implied by strongly habit persistent utility functions.

The remainder of the paper is organized as follows. Section II specifies the class of utility functions to be studied. Section III reports the results of GMM estimation using the equilibrium conditions on the asset returns and the marginal utility. Section IV demonstrates that in actual data, the correlation between the marginal utility and the equity premium is small under time separable preferences and is even smaller with habit persistent utility functions. Finally, section V concludes.

II. The Utility Function

The representative agent's lifetime utility is assumed to be

E |summation of~ ||Beta~.sup.t~ |(|c.sub.t~ + |s.sub.t~).sup.1-|Gamma~~ where t = 1 to |infinity~/(1 - |Gamma~) |Gamma~ |is greater than~ 0, (1)

where |Beta~ is the discount rate, |c.sub.t~ is consumption at t and

|s.sub.t~ = ||Theta~.sub.1~|c.sub.t-1~. (2)(1)

In (2), if ||Theta~.sub.1~ = 0, the utility function becomes time-separable; if ||Theta~.sub.1~ |is greater than~ 0, the utility function shows local durability, which means consumption in the previous period and the present period are substitutes; and finally if ||Theta~.sub.1~ |is less than~ 0, the utility function shows habit formation, where a high level of consumption in the previous period changes the agent's habits, and the satisfaction the agent gains from the consumption of the current period depends on the difference between the present consumption and the habit. The marginal utility of consumption |c.sub.t~ divided by ||Beta~.sup.t~ is given by

mr|u.sub.t~ = |(|c.sub.t~ + |s.sub.t~).sup.-|Gamma~~ + |E.sub.t~|Beta~||Theta~.sub.1~|(|c.sub.t+1~ + |s.sub.t+1~).sup.-|Gamma~~. (3)

Equilibrium conditions in the market for assets imply that

|Mathematical Expression Omitted~

and

|Mathematical Expression Omitted~

where |Mathematical Expression Omitted~ is the realized return of the risky asset from period t to t + 1, and |Mathematical Expression Omitted~ is the return of the risk-free bond from period t to t + 1. The latter rate is assumed known with certainty at time t, whereas the former one is not. The parameters we wish to estimate are |b.sub.0~ = {|Beta~, |Gamma~, ||Theta~.sub.1~}.

III. GMM Estimation

Define |Mathematical Expression Omitted~ as mr|u.sub.t~ without the conditional expectation operator on the second component of (3), i.e., |Mathematical Expression Omitted~ is the realized marginal utility. Denote the vector of parameters {|Beta~, |Gamma~, ||Theta~.sub.1~} by b. Given a scaling factor S|F.sub.t~(b), expressions similar to (4) and (5) can be rewritten as

|Mathematical Expression Omitted~,

where |Mathematical Expression Omitted~; and

|Mathematical Expression Omitted~,

where |Mathematical Expression Omitted~.

Equations (4) and (5) also imply that in (4|prime~) and (5|prime~), |Mathematical Expression Omitted~. The scaling factor S|F.sub.t~ is introduced because the aggregate consumption |c.sub.t~ grows with t, thus ||Eta~.sub.t~, the vector (|Mathematical Expression Omitted~), is nonstationary. Assuming that {|c.sub.t+1~/|c.sub.t~} is stationary and ergodic, the residuals ||Eta~.sub.t+1~ of equations (4|prime~) and (5|prime~) are scaled by the factor S|F.sub.t~ before being used as disturbances in GMM estimation. S|F.sub.t~ contains data observable at period t and also contains a trend so that ||Eta~.sub.t+1~S|F.sub.t~ is stationary. There are many ways to construct scaling factors, and choosing among them is an important issue that will be discussed later in this section.

The method of GMM estimation can be described in the following way: Suppose that one desires to estimate a parameter vector |b.sub.0~. The orthogonality conditions for a sample of T observations imply that the following vector |g.sub.T~(b), evaluated at |b.sub.0~, must be close to zero if T is large.

|Mathematical Expression Omitted~

where |Mathematical Expression Omitted~ is the Kronecker product and |IV.sub.t~ is a vector of instrumental variables. The GMM estimator |Mathematical Expression Omitted~ for a given weighting matrix W is

|Mathematical Expression Omitted~

where |g|prime~.sub.T~(b) is the transpose of the vector |g.sub.T~(b).

In practice, procedures of GMM estimation involve steps of iteration of the following: At step k, for given |Mathematical Expression Omitted~, calculate the weighting matrix |Mathematical Expression Omitted~ (the matrix |Omega~(b) is defined in Appendix A, usually the initial choice is the identity matrix), then search for |Mathematical Expression Omitted~ so that

|Mathematical Expression Omitted~

The convergence criteria are also discussed in Appendix A.

As emphasized previously, there are many ways to detrend the residuals ||Eta~.sub.t+1~ in (4|prime~) and (5|prime~). Similar to instrumental variables, scaling factors can affect the finite sample estimates. But there is a difference between scaling factors and ordinary instrumental variables: Usually instruments are unrelated to the parameters to be estimated, but by construction scaling factors specifically depend on the parameters. With the marginal utility given by (3), two convenient choices of S|F.sub.t~ are

(A) |Mathematical Expression Omitted~;

(B) S|F.sub.t~ = |(|c.sub.t~ + ||Theta~.sub.1~|c.sub.t-1~).sup.|Gamma~~, with ||Theta~.sub.1~ |is less than or equal to~ 1.

Both of the scaling factors are natural candidates for their simplicity and because if the growth rate of consumption is stationary then S|F.sub.t~||Eta~.sub.t+1~ is stationary. The constraint ||Theta~.sub.1~ |is less than or equal to~ 1 means that the consumption of the previous period cannot carry a larger weight than the consumption of the current period. These constraints are particularly plausible when the consumption stands for nondurable goods.

The data set is monthly consumption, stock returns and bond returns from 59:1-88:12. The measure of consumption is real per capita spending on nondurable goods.(2) The stock return is the value weighted average of ex-post returns of stocks listed on the New York Stock Exchange. The bond return is the one-month T-Bill rate. The asset returns are obtained from the CRSP tape and converted to real terms by the consumption deflator.

The set of instrumental variables associated with both equations (4|prime~) and (5|prime~) is |Mathematical Expression Omitted~.(3) The GMM estimation result is reported in Table I.

TABULAR DATA OMITTED

Panel A of Table I reports the result with |Mathematical Expression Omitted~ under constraints 0 |is less than or equal to~ |Gamma~ |is less than or equal to~ 20, ||Theta~.sub.1~ |is less than or equal to~ 1, and |Mathematical Expression Omitted~ for all t. The estimation results in very large |Gamma~ and ||Theta~.sub.1~. ||Theta~.sub.1~ = 1 means that the consumption of nondurable goods of the previous month has the same effect as the consumption of the current month. This result appears implausible. |Gamma~ is also much larger than the estimates obtained by other authors.(4) Finally, the chi-square statistic strongly rejected the overidentifying restrictions.

Panel B of Table I reports the estimates with S|F.sub.t~ = |(|c.sub.t~ + ||Theta~.sub.1~|c.sub.t-1~).sup.|Gamma~~, ||Theta~.sub.1~ |is less than or equal to~ 1 and |Mathematical Expression Omitted~ for all t. The parameters to be estimated are still b = {|Beta~, |Gamma~, ||Theta~.sub.1~}. As in Panel A, the estimate for |Beta~ is larger than 1, which supports the conjecture of the negative discount factor by Kocherlakota |12~. But the estimates for |Gamma~ and ||Theta~.sub.1~ are much smaller than that with the previous scaling factor. The chi-square statistic for the overidentifying constraints of Panel B is much smaller than that of Panel A. Compared with the estimate with the first scaling factor, the results generated by the second scaling factor are more reasonable.

In principle, scaling factors affect finite sample estimates in a similar way as instrumental variables do. So long as the data sample is finite, the sample estimates are under the influence of instruments and scaling factors. The importance of instruments are discussed in detail by many authors (e.g., Ferson and Constantinides |5~). The comparison of the scaling factors above shows that in finite sample estimation the choice of scaling factors can also be very important.

The estimation result obtained by using scaling factor (B) indicates that a consumption of one dollar nondurable goods in the previous month generates the same utility in the current month as a consumption of 46 cents of the current month. The effect of local durability is therefore nontrivial. The fact that the GMM estimation results in a locally durable utility function instead of a habit persistent one indicates that the estimates may have problem matching the observed equity premium. The next section examines the equity premium implied by the estimates.

IV. Matching the Equity Premium

In this section we will study some sample statistics of the actual data. Denote |Mathematical Expression Omitted~, the ex-post equity premium, by R|P.sub.t+1~. From (4) and (5) we have

|Mathematical Expression Omitted~(5)

As in the previous section, let S|F.sub.t~ = |(|c.sub.t~ + ||Theta~.sub.1~|c.sub.t-1~).sup.|Gamma~~. Denote E(|center dot~) as mean, SD (|center dot~) as standard deviation, and |Mathematical Expression Omitted~ as the correlation between |Mathematical Expression Omitted~ and R|P.sub.t+1~. From (9)

|Mathematical Expression Omitted~

The difference between the observed average equity premium and the value of the RHS of (10) can be used as a measure of match to the equity premium. Since mr|u.sub.t+1~S|F.sub.t~ depends on the parameters {|Beta~, |Gamma~, ||Theta~.sub.1~}, if we fix |Beta~ and |Gamma~ then the sample consumption and the parameter ||Theta~.sub.1~ will determine |Mathematical Expression Omitted~ as well as the implied equity premium (as expressed by the RHS of (10)). The primary task of this section is to study how the sample statistics of marginal utility and the implied equity premium change with ||Theta~.sub.1~. Let |Beta~ = 1.001, and |Gamma~ = 4.838, as suggested by the GMM estimates; also let |Mathematical Expression Omitted~ stand for standard error, |Mathematical Expression Omitted~ for the sample correlation, |Mathematical Expression Omitted~ for the sample average, and DRP of the following for the match of the equity premium:

|Mathematical Expression Omitted~

where |Mathematical Expression Omitted~ and |Mathematical Expression Omitted~ are given by the sample equity premium; |Mathematical Expression Omitted~, |Mathematical Expression Omitted~, and |Mathematical Expression Omitted~ can be derived from the sample data and given values of ||Theta~.sub.1~. With the same set of monthly data used in the GMM estimation, |Mathematical Expression Omitted~ and |Mathematical Expression Omitted~.

By definition, the smaller DRP(||Theta~.sub.1~) is, the higher is the theoretical equity premium. If DRP(||Theta~.sub.1~) is larger than zero, then the sample average of equity premium implied by the theory is smaller than the observed one. For given |Mathematical Expression Omitted~, equation (11) suggests that the difficulty of generating a small DRP may stem from low volatility of |Mathematical Expression Omitted~, measured by the ratio |Mathematical Expression Omitted~; or from a small negative correlation |Mathematical Expression Omitted~. It has been suggested by the bound tests of Gallant, Hansen, and Tauchen |7~, and Hansen and Jagannathan |10~ that the volatility problem can be solved by introducing habit formation, but what is unknown is the behavior of |Mathematical Expression Omitted~ as ||Theta~.sub.1~ changes. Figure 1 plots |Mathematical Expression Omitted~, |Mathematical Expression Omitted~, and DRP as functions of ||Theta~.sub.1~. DRP is in percentage terms. The set of monthly data is the same as the one used in the estimation. The lowest value for ||Theta~.sub.1~ is set to be -.62 because if ||Theta~.sub.1~ is smaller than that value, the marginal utility becomes negative.

From Figure 1, the following features can be observed:

(i) Habit formation (||Theta~.sub.1~ |is less than~ 0) implies large volatility in marginal utility.

|Mathematical Expression Omitted~ reaches its highest level, 0.32, at ||Theta~.sub.1~ = -0.62. It is decreasing in ||Theta~.sub.1~ until it reaches the lowest level, 0.025, at ||Theta~.sub.1~ = 0.64. So there are considerable changes in |Mathematical Expression Omitted~ as ||Theta~.sub.1~ varies. Figure 1 verifies the intuition that the habit formation assumption may imply a higher equity premium through its impact on the volatility of marginal utility.

(ii) Habit persistent utility functions generate larger equilibrium equity premia compared with time-separable and locally durable utility functions.

DRP is minimized at ||Theta~.sub.1~ = -0.62, it is increasing in ||Theta~.sub.1~, and it reaches the maximum level at ||Theta~.sub.1~ = 1.0. This means the discrepancy between the mean of the equity premium implied by equation (10) and the observed one is minimized by a habit persistent utility function. Figure 1 also illustrates the model's ability to match the observed equity premium. DRP is about 0.39% when ||Theta~.sub.1~ = 0.46 (the GMM estimate), which means that the model equity premium is less than one percent of size actually observed. DRP is 0.38% at ||Theta~.sub.1~ = 0 (that means the model equity premium is less than six percent of size actually observed, which is the puzzle proposed by Mehra and Prescott); and 0.25% at ||Theta~.sub.1~ = -0.62 (which means that the model equity premium is about forty percent of that observed if we assume strong habit persistence). Therefore the habit formation assumption increases the model equity premium but still leaves a substantial part of the observed equity premium unexplained.

(iii) The equity premium puzzle cannot be solved by introducing habit formation because of the small correlation |Mathematical Expression Omitted~.

In the figure, the maximum of |Mathematical Expression Omitted~ is less than 0.15. Given this maximum correlation and the fact that the monthly ex-post equity premium has a sample average of 0.41% and a standard error of 4.4%, the ratio |Mathematical Expression Omitted~ that lets DRP be zero is at least 0.62. But the largest ratio shown in the figure is about 0.32. Moreover, Figure 1 shows that the increase in |Mathematical Expression Omitted~ due to the decrease in ||Theta~.sub.1~ as ||Theta~.sub.1~ |is less than~ 0 is partially offset by the decrease in |Mathematical Expression Omitted~. This property of the actual time series data reduces the power of the habit formation theory in matching the equity premium. The key difficulty of matching the observed equity premium, according to the figure, lies in the fact that the correlation between the marginal utility and equity premium is too small. The figure also shows that habit persistent utility functions result in even smaller correlations |Mathematical Expression Omitted~ compared with that associated with the time separable utility function.

The reason that GMM estimation implies local durability instead of habit formation is that although habit formation explains the equity premium better, it does not fit other moments well. This point is made by Gallant et al. |7~. They found the Euler equation residuals predictable when a habit persistent utility function is tested.

Figure 1 implies that matching the equity premium with the class of utility functions studied in this paper is more difficult when actual data are used than when artificial data generated from equilibrium conditions are used. We make the same observation when different utility functions are employed. Calculations of the sample statistics using the utility functions studied by Eichenbaum, Hansen, and Singleton |3~, and Dunn and Singleton |2~ are reported in Appendix B.

The approach presented above is related to the bound constraint proposed by Heaton |11~, Gallant, Hansen, and Tauchen |7~, and Hansen and Jagannathan |10~. They used asset return data to obtain the admissible region of the mean and the standard deviation of IMSR (Intertemporal Marginal Rate of Substitution) by the Euler equations, an approach that does not rest on the use of a particular utility function. With the same utility function used in this paper, they showed that the assumption of habit formation coupled with a high risk aversion coefficient can push the mean and the standard deviation of IMSR into the admissible region. The finding means that the introduction of habit formation makes IMSR volatile enough to solve the equity premium puzzle if the correlation between IMSR and the equity premium is large.

We now use (11) to do a simpler bound test.(6) Although the test is less general than that conducted by the authors mentioned above, it does illustrate more directly the key to the equity premium puzzle. The variable we examine is |Mathematical Expression Omitted~ instead of IMSR. Figure 1 shows that strong habit formation does make |Mathematical Expression Omitted~ more volatile. The lower bound for the admissible volatility of marginal utility |Mathematical Expression Omitted~ can be obtained by letting |Mathematical Expression Omitted~ be -1 and DRP in (11) be zero. The lower bound for |Mathematical Expression Omitted~ is about 0.093. In Figure 1 the ratio |Mathematical Expression Omitted~ is higher than the lower bound as long as ||Theta~.sub.1~ is smaller than -0.35. In other words, when the habit formation is strong enough, the ratio |Mathematical Expression Omitted~ can fall into the admissible region. Apparently, the small correlation |Mathematical Expression Omitted~ is the main cause of the equity premium puzzle. Although low volatility of |Mathematical Expression Omitted~ or that of IMSR can also contribute to the puzzle, it can be overcome by the habit formation assumption. However, the problem of small correlation |Mathematical Expression Omitted~ does not go away in the presence of habit formation.

V. Concluding Remarks

In this paper, the GMM estimates are shown to be sensitive to scaling factors. When a plausible scaling factor is used, the estimated utility function exhibits local durability. The estimated utility function is unable to explain the equity premium puzzle.

The equity premium puzzle can be resolved if the normalized marginal utility is volatile and is strongly negatively correlated with the equity premium. Habit persistence in the preference yields volatile marginal utility. And if the artificial consumption is solved from the Euler equations, the negative correlation is produced automatically. In the actual data, the correlation between marginal utility and equity premium is small. Thus when time series data are used instead of several low moments, the test is more difficult to pass. The habit formation assumption does reproduce the admissible volatility of normalized marginal utility. But if we claim that habit formation resolves the equity premium puzzle, we have to assume a large correlation between the marginal utility and equity premium that does not exist in the data.

What makes the model unsatisfactory may be the use of aggregate consumption data. Using household panel data, Mankiw and Zeldes |15~ found that the consumption of stockholders is more volatile and more highly correlated with the equity premium than aggregate consumption is. Since only about one-fourth of U.S. families own stock, tests based on the aggregate consumption tend to reject the model even if the theory is correct for the stock holders. However, although it is obvious now that empirical tests using disaggregated consumption data will be more fruitful, an immediate obstacle is the difficulty of approximating the consumption of stockholders as a long time series.

Appendix A

Hansen |8~ showed that the smallest asymptotic covariance matrix of the estimator |Mathematical Expression Omitted~ can be obtained by letting the weighting matrix W be |Mathematical Expression Omitted~. Because we assume that the consumption of the previous period affects the current period utility, |S.sub.0~ is defined by

|S.sub.0~ = |summation of~ E{|F.sub.t~ (|b.sub.0~)|F|prime~.sub.t-k~(|b.sub.0~)} where k = -1 to 1,

where |Mathematical Expression Omitted~

Following Newey and West |17~, a consistent estimator of |S.sub.0~ which is positive semidefinite in a finite sample, is

|Omega~(b) = {|summation of~ |F.sub.t~(b)|F|prime~.sub.t~(b) where t = 1 to T + (1/2) |summation of~ ||F.sub.t~(b)|F|prime~.sub.t-1~(b) + |F.sub.t-1~(b)|F|prime~.sub.t~(b)~ where t = 2 to T}/T.

In practice there are two commonly used procedures to obtain |Mathematical Expression Omitted~. Both procedures involve steps of iteration of the following: For given |Mathematical Expression Omitted~, calculate |Mathematical Expression Omitted~, (usually the initial choice |Mathematical Expression Omitted~), then search for |Mathematical Expression Omitted~ such that for g(b) given by (6),

|Mathematical Expression Omitted~.

One approach is the two-step procedure described by Hansen and Singleton |9~. Under that procedure, |Mathematical Expression Omitted~ is obtained as |arg.sub.b~ min |g|prime~.sub.T~ (b)|g.sub.T(b), and then is substituted into (A1) to calculate |Mathematical Expression Omitted~. By Theorem 2.1 of Hansen |8~, |Mathematical Expression Omitted~ is a consistent estimator of |b.sub.0~ under some regularity conditions, which implies that |Mathematical Expression Omitted~ is the asymptotically optimal weighting matrix. An alternative approach is to keep iterating by (A1) until |Mathematical Expression Omitted~ converges. This procedure is described by Dunn and Singleton |2~ and is also adopted in this paper. The criterion for stopping is to treat |Mathematical Expression Omitted~ as |Mathematical Expression Omitted~ if |Mathematical Expression Omitted~, where |Delta~ is a very small number. The sum of the squares of elements of b divided by the dimension of b is used as the norm and |Delta~ is chosen to be |10.sup.-6~. The initial |Mathematical Expression Omitted~ is arbitrary and |Mathematical Expression Omitted~ is picked to be the identity matrix. The variance-covariance matrix of the asymptotically normally distributed estimator is

|Mathematical Expression Omitted~

In practice, |Mathematical Expression Omitted~ can be obtained from sample data. In this paper, the derivatives are calculated numerically.
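A sketch of the numerical step is given below. The paper does not say which differencing scheme it uses, so central differences are an assumption here. The covariance function implements the standard form for GMM with an optimal weighting matrix, (D' S^{-1} D)^{-1} / T, where D is the Jacobian of the sample moments. That it matches the omitted expression above is likewise assumed. The function names are hypothetical.

```python
import numpy as np

def numerical_jacobian(g, b, h=1e-6):
    """Central-difference Jacobian of a vector function g at the point b.

    g maps a parameter vector to the vector of sample moments g_T(b).
    The step size rule is an assumption of this sketch.
    """
    b = np.asarray(b, dtype=float)
    g0 = np.asarray(g(b), dtype=float)
    D = np.empty((g0.size, b.size))
    for j in range(b.size):
        step = np.zeros_like(b)
        step[j] = h * max(1.0, abs(b[j]))   # scale the step with the parameter
        diff = np.asarray(g(b + step)) - np.asarray(g(b - step))
        D[:, j] = diff / (2.0 * step[j])
    return D

def gmm_covariance(D, S, T):
    """Standard GMM covariance with optimal weighting: (D' S^{-1} D)^{-1} / T."""
    return np.linalg.inv(D.T @ np.linalg.solve(S, D)) / T
```

Standard errors are then the square roots of the diagonal of the returned matrix.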

Appendix B

In Eichenbaum, Hansen, and Singleton |3~, the utility function, which includes leisure (L), is assumed to be

|Mathematical Expression Omitted~

where |c*.sub.t~ = |c.sub.t~ + ||Theta~.sub.1~|c.sub.t-1~ and |L*.sub.t~ = |L.sub.t~ + |Lambda~|L.sub.t-1~.

The realized marginal utility of consumption |c.sub.t~ is

|Mathematical Expression Omitted~

We choose the scaling factor |Mathematical Expression Omitted~.

Eichenbaum, Hansen, and Singleton estimated |Delta~ to be .14. They also estimated |Lambda~ to be .7 when only the asset holding equation was tested, and |Lambda~ to be -.7 when both the asset holding and the intratemporal Euler equations on the choice of leisure were tested. If we fix the parameters at the estimates obtained by Eichenbaum, Hansen, and Singleton and change ||Theta~.sub.1~ from -1 to 1, the variation in the negative correlation between |Mathematical Expression Omitted~ and R|P.sub.t+1~ (denoted by |Mathematical Expression Omitted~) and in the ratio of the standard error of |Mathematical Expression Omitted~ to its mean, |Mathematical Expression Omitted~, can be derived. The variables used here are the same as those used by Eichenbaum, Hansen, and Singleton. Consumption is monthly real per capita consumption of nondurable goods and services, dated from 1959:1 to 1988:12. Leisure of the representative agent is calculated as the time endowment of 112 hours per week minus the average hours worked.
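The data construction and the two summary statistics can be sketched as follows. The marginal-utility expression itself is omitted in the text, so the sketch takes that series as an input rather than guessing its form. The function names, and the choice of the population standard deviation, are assumptions.

```python
import numpy as np

def service_flow(x, weight):
    """Service flow x*_t = x_t + weight * x_{t-1}; the first observation is lost."""
    x = np.asarray(x, dtype=float)
    return x[1:] + weight * x[:-1]

def leisure_from_hours(hours_per_week, endowment=112.0):
    """Leisure as the weekly time endowment minus average hours worked."""
    return endowment - np.asarray(hours_per_week, dtype=float)

def correlation_and_ratio(m, rp):
    """Correlation of m_t with RP_{t+1}, and std(m) / mean(m).

    m and rp are equal-length series indexed by t. The caller supplies m,
    for example the scaled marginal utility; its formula is not reproduced here.
    """
    m = np.asarray(m, dtype=float)
    rp = np.asarray(rp, dtype=float)
    corr = np.corrcoef(m[:-1], rp[1:])[0, 1]   # pair m_t with RP_{t+1}
    ratio = np.std(m) / np.mean(m)             # population std (ddof = 0)
    return corr, ratio
```

Sweeping the habit or durability weight from -1 to 1 then amounts to rebuilding the service flow with `service_flow(c, theta1)` for each value on a grid and recomputing the two statistics. Since the text reports the negative of the correlation, the sign of the first returned value would be flipped to match.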

The behavior of |Mathematical Expression Omitted~ and |Mathematical Expression Omitted~ is roughly the same as that of their counterparts in Figure 1: the correlation between |Mathematical Expression Omitted~ and R|P.sub.t+1~ is low, and may move in the opposite direction from the ratio |Mathematical Expression Omitted~. In particular, |Mathematical Expression Omitted~ is around .067 when |Lambda~ = .7 and ranges from .020 to .025 when |Lambda~ = -.7. So in this economy |Mathematical Expression Omitted~ is higher when the consumption of leisure shows durability, but the correlation is still insignificant. In this economy |Mathematical Expression Omitted~ is between .012 and 0.16 when |Lambda~ = -.7 and between .010 and .011 when |Lambda~ = .7. Therefore, when consumption of leisure shows habit persistence, |Mathematical Expression Omitted~ is higher, but its value is too low to explain the equity premium puzzle. This observation raises another problem related to real business cycle theory (see Kydland and Prescott |13~). One labor market phenomenon that troubles competitive equilibrium theory is the large volatility of employment relative to output fluctuations. The assumption of durability in the consumption of leisure pushes the volatility of equilibrium employment closer to that found in the actual data. But this assumption implies a smaller |Mathematical Expression Omitted~, and therefore makes the equity premium more puzzling.

Now consider the utility function used by Dunn and Singleton, which includes durable goods (d), and is given by

|Mathematical Expression Omitted~,

where service from consumption is |c*.sub.t~ = |c.sub.t~ + ||Theta~.sub.1~|c.sub.t-1~, service from durable goods is |d*.sub.t~ = |Omega~ (|k.sub.t-1~ + |d.sub.t~), |k.sub.t-1~ is the stock of durable goods at the beginning of period t, and |d.sub.t~ is the purchase of durable goods in period t. Let the scaling factor S|F.sub.t~ be |Mathematical Expression Omitted~. Dunn and Singleton estimated |Delta~ to be .9, |Omega~ to be .01, and |Gamma~ to be close to -1. These numbers are used here. The consumption data are similar to those used in the estimation reported above. Monthly durable goods purchases from 1959:1 to 1988:12 are obtained from the CITIBASE tape, and the stock of durable goods as of December 1958 is taken to be the same value used by Dunn and Singleton.

The correlation between |Mathematical Expression Omitted~ and R|P.sub.t+1~ is shown to be in the range of .07 to .18 as ||Theta~.sub.1~ changes from -1 to 1. The ratio of the standard deviation of |Mathematical Expression Omitted~ to its mean is between .008 and .03. The co-movement of these two statistics shows the same pattern as that in Figure 1. Thus it can be concluded that with utility functions studied by Eichenbaum, Hansen, and Singleton |3~, and Dunn and Singleton |2~, the observed equity premium cannot be explained by habit persistence.

I am grateful to Wayne Ferson, Lars Hansen, Kiseok Lee, Peter Mueser, Ron Ratti, Chris Sims, and the referee for their valuable comments on the earlier versions of the paper. The responsibility for any remaining errors is my own.

1. More generally, we may assume that |s.sub.t~ depends on consumption over the past N periods, namely |Mathematical Expression Omitted~ and |Mathematical Expression Omitted~, for N |is greater than or equal to~ 1. But evidence from previous empirical studies (Dunn and Singleton |2~, and Gallant and Tauchen |6~) suggests that one lag, i.e., N = 1, suffices to fit the data.

2. This definition of consumption was used by Ferson and Constantinides.

3. Tauchen |18~ showed that a large set of instrumental variables is likely to lead to biased estimates in finite-sample GMM estimation. The set of instruments here is commonly used, although Ferson and Constantinides argued against using it. Since the main point of the estimation is to show the sensitivity of the results to scaling factors when the same set of instruments is used, we use this convenient set of instruments.

4. If ||Theta~.sub.1~ is fixed at 0, then the scaling factor is not problematic. In this case, the estimates {|Beta~, |Gamma~} are |Beta~ = 1.0001 (with a standard error of 0.008) and |Gamma~ = 2.700 (with a standard error of 2.446). These numbers (which are not reported in Table I) are similar to those obtained by Hansen and Singleton |9~.

5. We use unconditional expectation here to avoid calculating conditional expectation. This simplification causes a loss of information. But the loss is not essential to the main point of the section.

6. This test is less general than those of Gallant, Hansen, and Tauchen |7~, and Hansen and Jagannathan |10~ in two respects. First, this bound is parametric (i.e., it depends on the specification of the utility function), whereas theirs are non-parametric. Second, this bound is constructed from the equity premium data alone, while theirs are generally based on a vector of asset returns.

References

1. Constantinides, George M., "Habit Formation: A Resolution of the Equity Premium Puzzle." Journal of Political Economy, June 1990, 519-43.

2. Dunn, Kenneth B. and Kenneth J. Singleton, "Modeling the Term Structure of Interest Rates Under Non-Separable Utility and Durability of Goods." Journal of Financial Economics, September 1986, 27-55.

3. Eichenbaum, Martin S., Lars P. Hansen, and Kenneth J. Singleton, "A Time Series Analysis of Representative Agent Models of Consumption and Leisure Choice Under Uncertainty." Quarterly Journal of Economics, February 1988, 51-78.

4. ----- and -----, "Estimating Models With Intertemporal Substitution Using Aggregate Time Series Data." Journal of Business and Economic Statistics, January 1990, 53-69.

5. Ferson, Wayne E. and George M. Constantinides, "Habit Formation and Durability in Aggregate Consumption: Empirical Tests." Journal of Financial Economics, October 1991, 199-240.

6. Gallant, Ronald A. and George Tauchen, "Seminonparametric Estimation of Conditionally Constrained Heterogeneous Process: Asset Pricing Applications." Econometrica, September 1989, 1091-120.

7. -----, Lars P. Hansen and George Tauchen, "Using Conditional Moments of Asset Payoffs To Infer the Volatility of Intertemporal Marginal Rates of Substitution." Journal of Econometrics, January 1990, 141-79.

8. Hansen, Lars P., "Large Sample Properties of Generalized Method of Moments Estimators." Econometrica, July 1982, 1029-54.

9. ----- and Kenneth J. Singleton, "Generalized Instrumental Variable Estimation of Nonlinear Rational Expectations Models." Econometrica, September 1982, 1269-86.

10. ----- and Ravi Jagannathan, "Implications of Security Market Data for Models of Dynamic Economics." Journal of Political Economy, April 1991, 225-62.

11. Heaton, John, "Notes on an Empirical Investigation of Asset Pricing with Nonseparable Preference Specifications." Manuscript, U. of Chicago, 1988.

12. Kocherlakota, Narayana R., "On the 'Discount' Factor in Growth Economies." Journal of Monetary Economics, January 1990, 43-47.

13. Kydland, Finn E. and Edward C. Prescott, "Time to Build and Aggregate Fluctuations." Econometrica, November 1982, 1345-70.

14. Lucas, Robert E., Jr., "Asset Prices in an Exchange Economy." Econometrica, November 1978, 1429-45.

15. Mankiw, Gregory N. and Stephen P. Zeldes, "The Consumption of Stockholders and Nonstockholders." Journal of Financial Economics, March 1991, 97-112.

16. Mehra, Rajnish and Edward C. Prescott, "Equity Premium: A Puzzle." Journal of Monetary Economics, March 1985, 145-61.

17. Newey, Whitney K. and Kenneth D. West, "A Simple, Positive Semi-Definite, Heteroskedasticity and Auto-correlation Consistent Covariance Matrix." Econometrica, May 1987, 703-08.

18. Tauchen, George, "Statistical Properties of GMM Estimators of Structural Parameters Obtained From Financial Market Data." Journal of Business and Economic Statistics, October 1985, 397-415.


Publication: Southern Economic Journal

Date: Apr 1, 1993
