# Errors of Truncation in Approximations to Expected Consumer Surplus

I. INTRODUCTION

Consumer surplus and other welfare measures calculated from demand curves are random variables. This is so because these measures are functions of estimated (as opposed to known) demand parameters, which, of course, are random variables. Increasingly, this realization has been incorporated into studies which assess such benefit measures. It is typical in these studies that the expected value of the benefit measure is employed, in keeping with the common practice of assuming that benefit-cost decision-makers are risk neutral. (1)

Unfortunately, consumer surplus measures usually involve the ratio of random variables, and the expected value of a ratio of random variables is not equal to the ratio of the expected values. The use of the ratio of the expected values leads to a biased estimate of the true measure, although it is a consistent estimator. Indeed, in small samples the expectation of consumer surplus often does not have a closed-form representation. In this case, one must resort to an approximation or a cumbersome Monte Carlo analysis. The former is the tack most often taken; for example, Bockstael and Strand (1987) and Kealy and Bishop (1986) have used a second-order Taylor series approximation to expected consumer surplus in their investigations.
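The gap between the expected ratio and the ratio of expected values can be seen in a small worked example. The sketch below is purely illustrative: the two-point distribution for the denominator and the variable names are assumptions chosen to make the arithmetic exact, and they are not taken from the data used in this paper.

```python
from fractions import Fraction

# Hypothetical example: nonrandom numerator x = 1, and a denominator y that
# takes the values 1 and 3 with equal probability.
x = Fraction(1)
y_values = [Fraction(1), Fraction(3)]

# Expected value of the ratio: average of x / y over the two outcomes.
expected_ratio = sum(x / y for y in y_values) / len(y_values)

# Ratio of the expected values: x divided by the mean of y.
ratio_of_expectations = x / (sum(y_values) / len(y_values))

print(expected_ratio, ratio_of_expectations)  # prints: 2/3 1/2
```

Here the ratio of expected values understates the expected ratio, which is the direction of bias one would anticipate when a positive denominator is random and the numerator is fixed.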

Use of a second-order approximation will improve the estimation of expected consumer surplus relative to the use of the ratio of expected values. However, this second-order approximation still may not be accurate, especially when the variance of the denominator of the ratio (typically an estimated demand parameter on a price variable) is relatively large and/or its mean is relatively small. This is because these statistics figure prominently in higher order terms of the approximation. We investigate this issue via a Monte Carlo analysis which (with a large enough number of trials) is able to give a direct estimate of the mean of the consumer surplus measure. Comparing this to the second-order approximation shows that the impact on estimated consumer surplus from omitting higher order terms can be substantial. In our example, the magnitude of the error due to truncation of the approximation varies markedly across functional forms for the demand function; the error is especially large for the linear functional form, is evident to a lesser extent in the semi-log form, and is almost nonexistent in the double-log form.

Our illustration of this error is conducted using consumer surplus estimates for a travel cost model of recreation demand. We investigate three functional forms for a model with the number of visits in a season regressed on a constant and travel cost. The exact model used in our analysis is quite simple; we do not intend to offer a convincing model of recreation demand as much as we wish to illustrate the size of the potential errors that can be made by truncating an approximation to expected consumer surplus.

II. EXPECTED CONSUMER SURPLUS

The most commonly used functional forms for recreation demand functions are the linear, the semi-log, and the double-log (Smith 1988). For the simple, one-variable demand equations we employ, these forms are:

Linear: $Q = a + bP$

Semi-log: $\ln Q = a + bP$

Double-log: $\ln Q = a + b \ln P$

As Bockstael and Strand (1987) point out, the derivation of the consumer surplus functions for these forms requires an assumption regarding the source of error in the demand equation. They contrast two cases, one in which all of the error term is due to observation-specific omitted variables, and one in which the error is due entirely to errors in measurement of the dependent variable. They note that in the former case consumer surplus should be computed for each observation using a demand curve going through the actual price/quantity pair, whereas for errors in measurement it would be appropriate to use the predicted quantities along the estimated demand curve. Here, we use only the omitted variables assumption.

The consumer surplus functions are

Linear: $CS = Q^2/(-2b)$

Semi-log: $CS = Q/(-b)$

Double-log: [Mathematical Expression Omitted]

The quantity used in these equations is the actual quantity, while the price P in the double-log expression is the price corresponding to the sample average quantity. The double-log demand curve does not integrate to a finite number. Consequently, those employing this form have devised methods to render it finite. In the double-log expression one common strategy is used: Max(P) and Min(Q) are the maximum price observed over the sample and the corresponding quantity along the estimated demand curve (see, e.g., Dwyer, Kelly, and Bowes 1977). (2)

As can be seen in the three consumer surplus functions, the consumer surplus for each of these functional forms involves a demand parameter in the denominator. If the analyst wishes to use the expected value of the consumer surplus functions, a closed-form representation of the expected value is not available. As noted in the introduction, other investigators (e.g., Bockstael and Strand 1987; Kealy and Bishop 1986) have proposed using a Taylor series approximation to expected surplus. This requires that the random variables are continuous and their ratio has finite moments of all orders. The second-order approximation employed in the literature is given by [Mathematical Expression Omitted]
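For orientation, the textbook second-order Taylor expansion of the expectation of a ratio is sketched below. It is offered as a reference form under the assumption that x denotes the numerator and y the denominator, with means $\mu_x$ and $\mu_y$; the expression omitted from the source may have been arranged differently, although the ordering here (ratio of means, covariance term, variance term) agrees with the way the terms are described in the text.

```latex
% Assumed notation: x = numerator, y = denominator (e.g., -b),
% \mu_x = E[x], \mu_y = E[y].
\begin{align*}
\frac{x}{y} &\approx \frac{\mu_x}{\mu_y}
  + \frac{1}{\mu_y}(x-\mu_x)
  - \frac{\mu_x}{\mu_y^{2}}(y-\mu_y)
  - \frac{1}{\mu_y^{2}}(x-\mu_x)(y-\mu_y)
  + \frac{\mu_x}{\mu_y^{3}}(y-\mu_y)^{2}, \\[4pt]
E\!\left[\frac{x}{y}\right] &\approx \frac{\mu_x}{\mu_y}
  - \frac{\operatorname{Cov}(x,y)}{\mu_y^{2}}
  + \frac{\mu_x \operatorname{Var}(y)}{\mu_y^{3}} .
\end{align*}
```

Taking expectations removes the two first-order terms, since the deviations have mean zero. When the numerator is nonrandom the covariance term is zero as well, and only the first and third terms remain.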

The second-order expression involves error due to truncation of the approximation at the second order. For example, the third-order term is [Mathematical Expression Omitted]. It is apparent that, if the price parameter in the demand function is fairly small, ceteris paribus, the third and higher order terms can be fairly large. It would be useful if the contribution of these terms to the approximation could be related to a summary statistic. However, the relative magnitude of the higher order terms does not seem to depend on any readily observable statistic, such as the t-statistic for a test of the null hypothesis of zero price effect. While the fit of the equation in terms of the t-statistic on the coefficient of the price variable, or some other statistical measure, may be related to the higher order terms, we do not know from such statistics when these terms reasonably may be ignored.

The second-order approximation is stated for general random variables x and y. In economic contexts, some restrictions on these variables may be available. For consumer surplus measures most analysts have assumed that the numerator is nonrandom, in which case terms involving the covariances of x and y and of x and $y^2$ drop out. (3) If the error terms are due to measurement error, then it is reasonable to use the predicted quantity, which is not a random variable, in the numerator of the ratio (Bockstael and Strand 1987). In this case the second-order approximation consists only of the first and third terms on the right-hand side of the approximation, as exhibited in the literature. When equation errors are due to omitted variables, the relevance of the covariance terms depends on additional considerations.
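The nonrandom-numerator case can be sketched in a few lines. The code below is an illustration under stated assumptions, not a reproduction of any published routine: the function names and the numerical inputs are hypothetical, the denominator is taken to be -b, and the second function applies the "expansion factor" (1 + 1/t^2) described in footnote 3. The two calculations coincide algebraically, because 1/t^2 equals the variance of the price coefficient divided by its squared value.

```python
import math

def second_order_cs(numerator, b_hat, se_b):
    """Second-order approximation to E[numerator / (-b)], numerator nonrandom.

    Keeps only the ratio-of-means term and the variance term; the covariance
    term is zero because the numerator is treated as fixed.
    """
    mu = -b_hat          # mean of the denominator
    var = se_b ** 2      # variance of the denominator
    return numerator / mu + numerator * var / mu ** 3

def expansion_factor_cs(numerator, b_hat, se_b):
    """Ratio of expected values scaled by the expansion factor (1 + 1/t^2)."""
    t = b_hat / se_b
    return (numerator / -b_hat) * (1.0 + 1.0 / t ** 2)

# Hypothetical inputs: fixed numerator 10, price coefficient -0.5, standard error 0.1.
approx = second_order_cs(10.0, -0.5, 0.1)
scaled = expansion_factor_cs(10.0, -0.5, 0.1)
print(approx, scaled)  # both are approximately 20.8
```

With a t-statistic of 5 in absolute value, the adjustment raises the point estimate of 20 by 4 percent. For the linear form the fixed numerator would be half the squared quantity rather than the quantity itself.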

Examination of the consumer surplus functions reveals that the variable Q, the individual number of trips, is in the role of x in the approximation. In the omitted variables case, Bockstael and Strand recommend using the actual quantity, which, by hypothesis, is a random variable for each individual. In this instance, the covariance term in the approximation is not zero in general. The relevance of the covariance term depends on the functional form for the demand equation.

Suppose first that the demand equation is of the semi-log form. Then, the consumer surplus function is given by CS = Q/(-b) and the covariance between x and y is a linear function of y. By Jensen's inequality, we can replace the mean (over the sample) of the covariances by the covariance of the sample mean quantity and the price coefficient. Since the sample mean of the (estimated) error terms is zero, the sample mean of the quantity variable is nonstochastic, and the covariance is zero. This will not necessarily be the case for functional forms, such as the linear demand curve, which yield a non-linear covariance term. (4)

In order to investigate the potential magnitude of the error from truncating at the second term and how this might vary across functional forms for the demand curve, we undertake a Monte Carlo analysis which yields a direct estimate of the expected value of consumer surplus. The difference between this direct estimate and the second-order approximation reveals the magnitude of the approximation error. The methods used and our results are reported in the next section of the paper.

III. ESTIMATES AND RESULTS

We estimated the three functional forms given above for a travel cost demand function for recreation. The model employed was very simple; the number of trips over a season was the measure of quantity and travel cost was used as the price variable. The travel costs were measured as the round-trip out-of-pocket expense of the trip not including an opportunity cost of travel time. The data, collected by mail survey from 132 respondents, concern hunting trips for Bighorn sheep in Alberta, Canada.(5)

[Tabular Data Omitted]

Estimated coefficients are reported in Table 1.(6) Inspection of the results indicates that the linear and semi-log perform reasonably well in terms of t-statistics on the price variable and F-tests of the overall equation. The double-log did not perform as well by these criteria.

A Monte Carlo analysis was performed in order to estimate directly the expected value of consumer surplus using a technique described by Freedman and Peters (1984). We derived an empirical distribution for the price parameter and used this distribution to describe the consumer surplus measure for each model. This was done in the following fashion. For each model we generated a new set of values for the dependent variable using the nonrandom set of independent variables and an error term that was generated from a normal distribution with zero mean and variance equal to the variance of the error for the regression equation. This new set of dependent variables was then used to estimate a new set of regression parameters, which in turn were used to compute a new value of the consumer surplus. This was repeated 5,000 times. The mean of this distribution is used as a direct estimate of the true mean for the welfare measure.
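The resampling procedure described above can be sketched as follows. This is a minimal illustration and not the code used to produce the results in Table 1: the data are synthetic (the survey data are not reproduced here), only the semi-log form is shown, and the function names `ols` and `monte_carlo_cs`, the seeds, and the parameter values are assumptions made for the example. We also assume that surplus in each trial is evaluated at the mean regenerated quantity; the text does not pin down this detail.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and the unbiased residual variance estimate."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return coef, sigma2

def monte_carlo_cs(price, log_q, n_trials=5000, seed=0):
    """Simulate the distribution of semi-log surplus, CS = Q / (-b)."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones_like(price), price])
    coef, sigma2 = ols(X, log_q)
    fitted = X @ coef
    draws = np.empty(n_trials)
    for i in range(n_trials):
        # Regenerate the dependent variable from the fixed regressors and
        # normal errors with the estimated error variance.
        new_log_q = fitted + rng.normal(0.0, np.sqrt(sigma2), size=len(price))
        # Re-estimate the parameters and recompute surplus for this trial.
        new_coef, _ = ols(X, new_log_q)
        draws[i] = np.mean(np.exp(new_log_q)) / (-new_coef[1])
    point = np.mean(np.exp(log_q)) / (-coef[1])
    return point, draws

# Synthetic stand-in for the survey data (hypothetical values).
data_rng = np.random.default_rng(42)
price = data_rng.uniform(50.0, 500.0, size=132)
log_q = 1.5 - 0.004 * price + data_rng.normal(0.0, 0.5, size=132)

point, draws = monte_carlo_cs(price, log_q)
print(point, draws.mean())
```

The mean of `draws` plays the role of the direct estimate of expected surplus, to be compared with the point estimate and with a second-order approximation computed from the same regression.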

Table 1 also contains the point estimates of consumer surplus computed from the consumer surplus functions of Section II, the second-order approximation to expected surplus, and the expected surplus generated by the Monte Carlo procedure.

It is interesting that the magnitudes of the error due to truncation of the approximation at the second term can be as large as the differences between consumer surplus estimates across functional forms; much of the apparent similarity between the linear and double-log surplus estimates based on the approximation can be attributed to truncation error. As well, the truncation error is lowest for that model (the double-log) with the lowest t-statistic on the price variable and the worst overall fit to the data.

As discussed above, our model is a relatively simple one and it may be that improved specifications would mitigate the effect of the third and higher order terms on the approximation. However, it is difficult to discern the impact on our results of alternative formulations of the model. Estimation of a more elaborate travel cost model by including the prices of substitute sites or altering the measurement of the travel cost variable to include the opportunity cost of time may or may not mitigate the influence of third and higher order terms. This will depend on changes in the magnitude of the price coefficient and its covariance with powers of the quantity variable. The absence of a negative relationship between the overall fit of the model and the influence of the higher order terms is evidenced by our results concerning the double-log specification.

IV. DISCUSSION

Our analysis suggests that expected consumer surplus estimates obtained using a Taylor-series approximation truncated at the second order may be misleading. The reason for our results is clear: over repeated trials the Monte Carlo procedure will result in some estimates of the coefficient on the price variable that are very close to zero. Since this coefficient appears in the denominator of the consumer surplus function for the linear and semi-log forms, estimates close to zero can have a dramatic effect on the surplus estimate. The resulting inflation of the variability of the surplus function potentially leads to poor performance for a second-order approximation procedure. This effect is not as significant for the double-log form, although the double-log form has other theoretical and practical difficulties. The effects of the truncation are not necessarily signaled by the statistical significance of the price parameter or other statistical measures of goodness of fit: in our example one functional form fit the data well, but approximation errors were large for this form.

Our results constitute another example of the impact of the choice of functional form on welfare measures. It has long been recognized that functional form choice may significantly affect the magnitude of expected consumer surplus (Zeimer et al. 1980). More recently it has been recognized that functional form may also affect the statistical properties of welfare measures (Adamowicz et al. 1989a, 1989b; Kling 1988). Overall, the literature contains substantial empirical support for the semi-log form (McConnell 1985; Smith 1988). The results reported here seem to suggest that the semi-log form does not perform badly regarding the errors of truncating approximations to expected surplus at the second-order term, at least in the limited example we employ. However, significant questions remain regarding the appropriate functional form to use in empirical analyses.

Having noted the potential for a problem with use of second-order approximations, one might inquire as to what one can do about it. A full treatment of this is beyond the scope of this paper, but we offer a few suggestions here. One possibility is to truncate the approximation at a higher order. Unfortunately, this is not easily done since the third-order term involves quantities that are not readily calculable. Of course, one always could carry out a Monte Carlo analysis of the sort reported here. In the case where the numerator of the ratio is nonrandom, the distribution of 1/y can be integrated numerically, as long as the distribution of y is known, as it will be for a regression parameter (e.g., Kaylen and Preckel 1987). This will not work for the case of a random numerator, since the joint distribution of x and y is not generally known. Another possibility is to truncate the consumer surplus function even in the case when it is finite (as with the linear form). For example, it may be reasonable to impose that willingness to pay for each observation, as approximated by consumer surplus, cannot be negative and cannot exceed income. A full investigation of the imposition of these sorts of constraints is a topic for further research (see Adamowicz et al. 1989b).
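The last suggestion, bounding each observation's surplus between zero and income, can be illustrated with a short sketch. The function name `truncated_surplus` and the numbers are hypothetical, and this simple clipping is only one way such a constraint might be imposed; it is not the inequality constrained estimator of Adamowicz et al. (1989b).

```python
import numpy as np

def truncated_surplus(cs, income):
    """Clip each observation's surplus to the interval [0, income]."""
    return np.clip(cs, 0.0, income)

# Hypothetical surplus estimates and incomes for three observations.
cs = np.array([-12.0, 85.0, 40000.0])
income = np.array([30000.0, 30000.0, 30000.0])

print(truncated_surplus(cs, income))  # negative value set to 0, excess capped at income
```

Applied inside each Monte Carlo trial, a bound of this kind would limit the influence of price coefficients that fall very close to zero.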

(1) It is not necessarily the case that benefit-cost decision-makers act as if they are risk neutral. If they are not, then the variance of the welfare measure matters as well as its mean. On the relationship between functional form and the variance of the welfare measure see Adamowicz et al. (1989a).

(2) Another functional form of interest is the linear-log, i.e., Q = a + b ln P. For this form the consumer surplus function is CS = Max(P)(Max(Q) - b) - P(Q - b). Since this is a linear function of random variables, no approximation is needed. Hence, we do not consider this form here. The use of the double-log form can be questioned on the grounds that it is not integrable, which implies that recreation is an essential good (Bockstael et al. 1987), and that the manner in which it is truncated is ad hoc. We do not necessarily recommend it in applied work; it is included here for illustrative purposes.

(3) Both Bockstael and Strand (1987) and Kealy and Bishop (1986) adjust the ratio of expected values by an "expansion factor" $(1 + 1/t^2)$ to account for the other terms in the second-order approximation, where t is the t-statistic for the price parameter estimate. Note that [Mathematical Expression Omitted] This is the second-order approximation only if its second term is zero.

(4) Let A be the k×n matrix $(x'x)^{-1}x'$. Then the covariance of the square of the dependent variable vector and the vector of parameters is $2\,\mathrm{diag}[\sigma^2 x_i b]\,A'$, where $\sigma^2$ is the variance of the error terms, $x_i$ is the vector of explanatory variable values for observation i, and b is the estimated parameter vector.

(5) Further details regarding the data are available from the authors upon request.

(6) For simplicity we have addressed neither the sample selection problem, that only users of the site are included in the sample, nor the problem of censoring, i.e., that only nonnegative quantities can be observed, while the assumed normality of our error term implies that trips can be negative. While the basic insight and spirit of our investigation applies to models which correct for sample selection and censoring, properly accounting for these effects would significantly complicate our analysis; this is left for further research.

References

Adamowicz, W., J. Fletcher, and T. Graham-Tomasi. 1989a. "Functional Form and the Statistical Properties of Welfare Measures." American Journal of Agricultural Economics 71:414-21.

Adamowicz, W., T. Graham-Tomasi, and J. Fletcher. 1989b. "Inequality Constrained Estimation of Consumer Surplus." Staff Paper No. 89-04, Department of Rural Economy, University of Alberta, Edmonton.

Bockstael, N., and I. Strand. 1987. "The Effect of Common Sources of Regression Error on Benefit Estimates." Land Economics 63:11-20.

Bockstael, N., W. M. Hanemann, and I. Strand. 1987. Measuring the Benefits of Water Quality Improvements Using Recreation Demand Models. Vol. 3, Report to U.S. Environmental Protection Agency, Department of Agricultural and Resource Economics, University of Maryland.

Dwyer, J., J. Kelly, and M. Bowes. 1977. "Improved Procedures for Valuation of the Contribution of Recreation to National Economic Development." Report No. 128, Water Resources Center, University of Illinois, Urbana.

Freedman, D., and S. Peters. 1984. "Bootstrapping a Regression Equation: Some Empirical Results." Journal of the American Statistical Association 79:97-106.

Kaylen, M., and P. Preckel. 1987. "MINTDF: A Fortran Subroutine for Computing Parametric Integrals." Station Bulletin No. 519, Agricultural Experiment Station, Purdue University.

Kealy, M., and R. Bishop. 1986. "Theoretical and Empirical Specifications in Travel Cost Demand Studies." American Journal of Agricultural Economics 69:660-67.

Kling, C. 1988. "Comparing Welfare Estimates of Environmental Quality Changes from Recreation Demand Models." Journal of Environmental Economics and Management 15:331-40.

McConnell, K. E. 1985. "The Economics of Outdoor Recreation." In Handbook of Natural Resource and Energy Economics, eds. A. Kneese and J. Sweeney. Amsterdam: North Holland.

Smith, V. K. 1988. "Travel Cost Recreation Demand Methods: Theory and Implementation." Discussion Paper QE89-03. Washington, DC: Resources for the Future.

Zeimer, R., W. Musser, and R. Hill. 1980. "Recreation Demand Equations: Functional Form and Consumer Surplus." American Journal of Agricultural Economics 62:136-41.

Graham-Tomasi is assistant professor, Agricultural and Applied Economics Department, University of Minnesota, St. Paul; Adamowicz is associate professor, Rural Economy Department, University of Alberta, Edmonton; and Fletcher is assistant professor, Resource Management Division, West Virginia University, Morgantown.

The authors thank Yacov Tsur and two anonymous referees for helpful comments on an earlier draft of the paper; remaining errors are the authors' responsibility. Minnesota Agricultural Experiment Station Publication No. 16476.
COPYRIGHT 1990 University of Wisconsin Press
No portion of this article can be reproduced without the express written permission from the copyright holder.