# Stein Rule Estimation in Real Estate Appraisal

Multiple regression analysis (MRA) is used by appraisers and assessors as an input to value estimation.(1) The approach is economical and makes use of the growing amounts of data available to an appraiser. Ordinary least squares (OLS) is the usual method for carrying out regression analysis, and it is optimal in the sense that it summarizes the information in the data. When information of good quality is available but not contained in the data, however, OLS is not the best technique.

In this article, we introduce "Stein-like" estimation as an alternative to OLS in the appraisal context. Originally proposed by Charles Stein in 1956, Stein estimators combine sample information with non-sample information in a way that improves the precision of the estimation process and the quality of subsequent predictions. The method is potentially useful to appraisers, who frequently work with data of poor quality but who have highly pertinent experience and knowledge about the marketplace and the factors that determine property value. To illustrate the use of Stein rules we consider the valuation of individual residential properties and demonstrate the superiority of this technique to OLS.

ALTERNATIVE ESTIMATION TECHNIQUES

OLS and Stein rule estimators are based on the linear regression model

$$y = \beta_1 + \beta_2 x_2 + \cdots + \beta_k x_k + e$$

where $y$ is the dependent variable, in this case the sale price of a property; the $x_i$'s are independent explanatory variables (i.e., individual housing characteristics) from 1 to $k$; the $\beta$'s are regression coefficients representing the marginal contributions of the housing characteristics to the sale price of a house; and $e$ is a random error representing factors not appearing explicitly in the model.

OLS is the simplest and most commonly encountered form of estimation when MRA is used in appraisal. This method chooses estimates $b_i$ of the parameter coefficients $\beta_i$ so as to minimize the sum of squared errors,

$$\sum_{t=1}^{T} \left(y_t - b_1 - b_2 x_{t2} - \cdots - b_k x_{tk}\right)^2$$

for all the sample observations of sale price and corresponding property characteristics. This technique provides the best unbiased estimates if the errors are normally distributed.
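As a minimal sketch of the least squares step, the following Python function computes the OLS coefficients and the sum of squared errors with NumPy. The function name `ols_fit` and the toy data are hypothetical and are not taken from the article; the article's own calculations were done in a spreadsheet.

```python
import numpy as np

def ols_fit(X, y):
    """Return OLS coefficients b and the sum of squared errors (SSE).

    X: (T, K) design matrix whose first column is all ones (the intercept).
    y: (T,) vector of sale prices.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # lstsq chooses b to minimize the sum of squared residuals ||y - Xb||^2.
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return b, float(resid @ resid)

# Hypothetical toy data: price = 10 + 2 * size exactly, so the SSE is ~0.
size = np.array([1.0, 2.0, 3.0, 4.0])
X_demo = np.column_stack([np.ones_like(size), size])
y_demo = 10.0 + 2.0 * size
b_demo, sse_demo = ols_fit(X_demo, y_demo)
```

Returning the SSE alongside the coefficients is a convenience, since the F-test statistic used later is built from the sums of squared errors of two regressions.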

Appraisers, however, often have information about the regression coefficients $\beta_i$. For example, an appraiser may know that changes in certain characteristics add to property value (i.e., $\beta_i > 0$), or that other changes may reduce values ($\beta_i < 0$). Alternatively, an appraiser may be skeptical that a certain characteristic has any influence on price ($\beta_i = 0$), or may have a prior guess as to the marginal price of a characteristic ($\beta_i$ = \$c). Each of these prior, subjective, or personal assumptions can be called non-sample information.

The problem facing an appraiser is how to use this additional information in the optimum way, or at least in a way that will yield predicted valuations superior to the traditional least squares approach. The solution to this dilemma is the use of Stein rules to incorporate the non-sample information into the estimation process.

Stein rules are a way to use prior information that can be written as an exact equation involving the coefficients, such as $\beta_i$ = \$c or $\beta_i = 0$.(2) Let $b_i^*$ be least squares estimates of the coefficients $\beta_i$ from a regression model that incorporates the exact prior information.(3) Using the $b_i^*$ coefficients themselves as a basis for valuation may be risky, as they are based on assumptions (the prior information) that may be faulty. The Stein rule is to use both sets of estimates, but weighted to reflect how well the OLS estimates $b_i$ agree with the estimates $b_i^*$.
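For the simplest kind of exact prior information, a hypothesis that some coefficients are zero, the restricted estimates can be obtained by dropping the corresponding columns and refitting. The sketch below illustrates that special case only; the function name `rls_zero_restrictions`, the `keep` argument, and the toy data are hypothetical.

```python
import numpy as np

def rls_zero_restrictions(X, y, keep):
    """Restricted least squares when the restriction is beta_i = 0
    for every column of X that is NOT listed in `keep`.

    Returns a full-length coefficient vector (zeros in the restricted
    positions) and the restricted sum of squared errors.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = list(keep)
    # Fit using only the retained columns.
    b_kept, *_ = np.linalg.lstsq(X[:, keep], y, rcond=None)
    b_star = np.zeros(X.shape[1])
    b_star[keep] = b_kept
    resid = y - X @ b_star
    return b_star, float(resid @ resid)

# Hypothetical toy data: y depends on x2 only; x3 is irrelevant.
x2 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x3 = np.array([0.0, 1.0, 0.0, 1.0, 0.0])
X_demo = np.column_stack([np.ones(5), x2, x3])
y_demo = 5.0 + 3.0 * x2
b_star, sse_r = rls_zero_restrictions(X_demo, y_demo, keep=[0, 1])
```

Padding the restricted vector with zeros keeps it the same length as the OLS vector, so the two sets of estimates can be combined coefficient by coefficient.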

Specifically, the Stein rule estimate of the marginal effect ||Beta~.sub.i~ is

$$b_i^{S} = \frac{c}{F}\, b_i^* + \left(1 - \frac{c}{F}\right) b_i$$

where $c = (J - 2)/(T - K + 2)$.

In this expression, J is the number of coefficient constraints, T is the number of data points, K is the number of coefficients in the regression model, and F is the "F-test" statistic that is commonly used to test joint constraints.(4)

The beauty of the Stein rule estimator of $\beta_i$ is that the data, by agreeing or disagreeing with an appraiser's guesses, indicate how much weight to put on each estimate. If the data agree with the appraiser, then F is small and a relatively large weight is placed on $b_i^*$. Conversely, if the data do not support an appraiser's non-sample input, then F is large and a relatively large weight is placed on the usual least squares estimate $b_i$. This rule can be shown, via a complex mathematical proof, to predict property values, on average, more accurately than the least squares estimator.(5)
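The weighting can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: it assumes the weight on the restricted estimates is c/F and the weight on the OLS estimates is 1 - c/F, which is consistent with the behavior described above, and it uses the constant c = (J - 2)/(T - K + 2) and the F-test statistic from the sums of squared errors. The function name `stein_rule` and the numbers in the example are hypothetical.

```python
import numpy as np

def stein_rule(b_ols, b_rls, sse_u, sse_r, J, T, K):
    """Weighted combination of OLS and restricted estimates.

    b_ols, b_rls: coefficient vectors of equal length.
    sse_u, sse_r: SSE of the unrestricted and restricted regressions.
    J: number of restrictions, T: observations, K: coefficients.
    Returns (stein_estimates, weight_on_restricted, F).
    """
    b_ols = np.asarray(b_ols, dtype=float)
    b_rls = np.asarray(b_rls, dtype=float)
    F = ((sse_r - sse_u) / J) / (sse_u / (T - K))
    if F <= 0:
        raise ValueError("F must be positive to form the weights")
    c = (J - 2) / (T - K + 2)
    w = c / F  # assumed weight on the restricted estimates
    return w * b_rls + (1.0 - w) * b_ols, w, F

# Hypothetical inputs, chosen so the arithmetic is easy to follow.
b_stein, w_demo, F_demo = stein_rule(
    b_ols=[2.0, 4.0], b_rls=[3.0, 0.0],
    sse_u=100.0, sse_r=130.0, J=3, T=20, K=6,
)
```

With these inputs F is 1.4 and c is 1/16, so only about 4.5% of the weight goes to the restricted estimates. A smaller F would shift more weight toward them.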

AN ILLUSTRATIVE EXAMPLE

Implementing Stein rule estimation for property appraisal is not difficult, nor does it require sophisticated software. The regression capabilities of most spreadsheet applications are adequate for operationalizing the technique. We demonstrate this with an example using Lotus 1-2-3.

Suppose an appraiser must estimate the value of a 10-year-old, single-family house with the following characteristics: 2,000 square feet of living area, a fireplace, four bedrooms, and two-and-a-half baths. The appraiser has collected information on 20 comparables as shown in Table 1, and knows from experience that square feet of living area and age are important in determining the sale price of a home. The appraiser is not so sure, however, about the appeal of the other three variables; he doubts that a fireplace would add much to the selling price of a house and thinks some potential buyers may even dislike a fireplace for maintenance and safety reasons. As for the number of bedrooms and bathrooms, the appraiser may feel that the information to be conveyed by these variables is already accounted for by the size of the home. Also, having extra bedrooms and bathrooms in a house reduces the space available for other parts of the house where extra room may be preferred, such as in the kitchen. For these reasons, the appraiser is unsure whether the last three variables in his comparable data would have any power to explain the selling price of his subject property. In other words, he hypothesizes that the coefficients of these variables in MRA would be zero. Rather than simply omitting the information from his analysis, which might be costly in terms of the precision of his subject property value estimate, he can use Stein rule estimation to let the data determine the extent to which his hypothesis is correct and to correspondingly adjust the estimates of attribute values.

Table 2 shows the results of two regressions performed using Lotus 1-2-3. The first regression (panel A) uses all five variables. The second regression (panel B) uses only the data regarding living area and age, in effect restricting the value estimates of the presence of a fireplace or an additional bedroom or bathroom to be zero. As expected, square feet of living area and age are highly significant in both regressions. When included, the other three variables are positive in sign but statistically insignificant.

One option would be to ignore the three statistically insignificant variables in the first regression and use the parameter estimates of the second regression. This would be a bad decision, in a statistical sense, because it totally ignores the information available in those three variables. Using Stein rule estimation, we select a compromise between the two regressions based on an F-statistic measuring the extra explanatory power offered by the three variables in question. We express the F-statistic in terms of $R^2$,(6) since the Lotus 1-2-3 output does not include the sum of squared errors, and use it to determine how much weight to assign to each regression.

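$$F = \frac{(R_U^2 - R_R^2)/J}{(1 - R_U^2)/(T - K)}$$

where $R_U^2$ is the $R^2$ of the unrestricted regression (panel A) and $R_R^2$ is the $R^2$ of the restricted regression (panel B).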
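A short Python sketch of this calculation follows. It assumes the standard identity that, when both regressions share the same dependent variable and include an intercept, the F-test statistic can be written with $R^2$ values in place of sums of squared errors. The function names and the $R^2$ values in the example are hypothetical; they are not the values in Table 2.

```python
def f_from_r2(r2_u, r2_r, J, T, K):
    """F-test statistic from the R-squared of the unrestricted (r2_u)
    and restricted (r2_r) regressions."""
    return ((r2_u - r2_r) / J) / ((1.0 - r2_u) / (T - K))

def f_from_sse(sse_u, sse_r, J, T, K):
    """The same statistic written with sums of squared errors."""
    return ((sse_r - sse_u) / J) / (sse_u / (T - K))

# Hypothetical values: 20 observations, 6 coefficients, 3 restrictions.
f_demo = f_from_r2(0.90, 0.88, J=3, T=20, K=6)

# With a total sum of squares of 1000, SSE = (1 - R^2) * 1000,
# so the two forms should agree.
f_check = f_from_sse(100.0, 120.0, J=3, T=20, K=6)
```

Both forms give the same number because the total sum of squares cancels from the numerator and denominator.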

The F-statistic thus computed is then incorporated into calculating parameter estimates as follows:

[Mathematical Expression Omitted]

Thus, the Stein rule estimates fall between the OLS estimates, $b_i$, in the first regression, and the restricted least squares (RLS) estimates, $b_i^*$, of the second regression as shown in Table 3.

Multiplying the subject property attributes by the Stein rule parameter estimates and then summing, the appraiser obtains a value estimate of \$80,309. This compares with an estimate of \$80,556 using OLS and \$79,135 using RLS. When the number of restrictions imposed is three or more, Stein rule estimation will result in smaller errors, on average, in value estimates. The amount of improvement depends on the quality of the information added in the form of the appraiser's expert knowledge and experience. This theoretical result has been examined via simulations in several contexts,(7) and we provide some empirical evidence in the next section.
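Because a predicted value is a linear function of the coefficients, a Stein rule that applies one common weight to every coefficient produces a value estimate that is the same weighted average of the OLS and RLS value estimates. Under that assumption, the three dollar figures reported above imply the weight that was placed on the restricted regression. The variable names below are hypothetical, and the result is approximate because the reported values are rounded to the dollar.

```python
# Value estimates reported in the text, in dollars.
ols_value = 80_556.0
rls_value = 79_135.0
stein_value = 80_309.0

# Solve stein = w * rls + (1 - w) * ols for the weight w on RLS.
weight_on_rls = (ols_value - stein_value) / (ols_value - rls_value)

# Recombining with that weight reproduces the Stein value estimate.
recombined = weight_on_rls * rls_value + (1.0 - weight_on_rls) * ols_value
```

The implied weight is 247/1421, or roughly 0.17, so in this example the Stein estimate stays much closer to OLS than to RLS.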

COMPARISON OF PREDICTIVE ABILITY

As an illustration of its superior properties, we compared the Stein estimator with OLS using multiple listing service (MLS) data from Baton Rouge, Louisiana, during the period from October 1984 through June 1989. The data used for the study consist of sales of residences in the various neighborhoods and subdivisions along a major traffic artery in Baton Rouge running south through the Louisiana State University campus. Homes along this road are heterogeneous in terms of size, age, and structural quality.

We segmented these data, 975 observations in all, into five nine-month time periods to provide five opportunities to compare the estimators. For each time period, the immediately ensuing three months of observations were held out to test the predictive performance of the estimators. In using a holdout sample, we departed in a realistic way from the methodology of many previous comparative studies.(8) Those studies tested the ability of alternative estimators to predict sale prices of houses in the sample from which the estimates were developed. In effect, the estimators were compared based on their ability to predict a sale price that is already known with perfect accuracy.

For our comparison, we used data from the five nine-month periods to develop parameter estimates for the property characteristics. Then we applied these estimates to the property characteristics of homes in the immediately following three-month holdout sample to predict the sale price of the holdout homes. The estimators were then compared based on their ability to predict the unknown prices in this separate sample.
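The estimation and holdout windows can be sketched as a pair of boolean masks over the sale dates. This is only an illustration of the nine-month and three-month scheme: the function name, the month-index convention, and the toy data are hypothetical, and the text does not state how the five periods line up on the calendar.

```python
import numpy as np

def estimation_and_holdout(month_index, start, est_months=9, holdout_months=3):
    """Masks for an estimation window and the holdout window that follows it.

    month_index: integer month number of each sale (0 = first month of data).
    start: first month of the estimation window.
    """
    m = np.asarray(month_index)
    est_end = start + est_months
    est = (m >= start) & (m < est_end)
    hold = (m >= est_end) & (m < est_end + holdout_months)
    return est, hold

# Hypothetical data: one sale per month for two years.
months = np.arange(24)
est_mask, hold_mask = estimation_and_holdout(months, start=0)
```

Coefficients would be estimated on the rows selected by `est_mask` and then used to predict prices for the rows selected by `hold_mask`, so no holdout sale influences the estimates that are used to predict it.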

Table 4 shows the model that we used to explain and predict housing prices. It includes structural characteristics, details of financing, and neighborhood location variables thought to influence the sale price of houses. The number of variables used in this model, 17, is typical of hedonic models used to explain housing prices. Of course, the amount of variation explained by the model (model $R^2$) increases with the addition of each new variable. From a predictive standpoint, however, using too many variables creates difficulties. Adding variables increases the collinearity problem and decreases the precision with which the parameters can be estimated. High model $R^2$ is no guarantee that the model will predict well. Further, good predictive ability is the primary concern of appraisers and assessors who would use a model to predict housing prices.

This feeling that the model might be overspecified (i.e., with too many variables) is precisely the non-sample information that can be added to the estimation process via the Stein rule. In addition to estimating the full model of 17 variables, a model that includes only the intercept and first five variables is estimated. Our judgment is that this smaller model may be better, in a predictive context, than the larger one. The degree to which the data support our judgment determines whether the Stein rule estimates will be from this smaller model, from OLS estimation, or somewhere between the two.

RESULTS

Table 5 compares the predictive ability of the two estimators. To develop this table, actual sale prices of residences in the holdout sample are compared with the sale prices predicted using each of the estimators. These differences are squared and averaged, providing a mean square error of prediction. The square root of this value, the root mean square error (RMSE), is reported for OLS in Table 5. The column for the Stein rule estimator contains a performance index that is the ratio of the Stein rule's RMSE to that of OLS. Numbers in this column smaller than one represent improvement over OLS.
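The two measures just described can be written as follows. The function names and the toy prices are hypothetical and serve only to show the calculation.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean square error of prediction."""
    err = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))

def performance_index(actual, pred_stein, pred_ols):
    """Ratio of Stein RMSE to OLS RMSE; below one favors the Stein rule."""
    return rmse(actual, pred_stein) / rmse(actual, pred_ols)

# Hypothetical holdout prices and predictions (in thousands of dollars).
actual = [100.0, 120.0, 90.0]
pred_ols = [110.0, 110.0, 100.0]    # every error is 10, so RMSE = 10
pred_stein = [105.0, 115.0, 95.0]   # every error is 5, so RMSE = 5
index_demo = performance_index(actual, pred_stein, pred_ols)
```

Here the index is 0.5, meaning the hypothetical Stein predictions have half the RMSE of the hypothetical OLS predictions.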

By this measure, the Stein rule estimator is superior to OLS in four of the five time periods. It provides an average 4% improvement. Putting the comparison in dollar terms, the Stein rule estimator, on average, provided predictions approximately \$960 closer than those of OLS.
```
TABLE 5. Summary of Prediction Performance: Root Mean Squared Error
of Prediction as Compared with Ordinary Least Squares (OLS)(*)

               Estimator
Period     OLS        Stein

1          16352.7    1.0641
2          21509.3    0.9038
3          24485.4    0.9538
4          22547.2    0.9507
5          34764.0    0.9226

* OLS column reports root mean squared error (RMSE) for the OLS
estimator. The Stein column reports the ratio of Stein RMSE to
OLS RMSE. Values less than one in the Stein column indicate
favorable performance.
```
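The summary figures for the performance index can be checked directly against the Stein column of Table 5. The short script below only restates the table's ratios; the variable names are hypothetical.

```python
# Ratio of Stein RMSE to OLS RMSE for periods 1 through 5 (Table 5).
stein_ratio = [1.0641, 0.9038, 0.9538, 0.9507, 0.9226]

# Number of periods in which the Stein rule beat OLS (ratio below one).
periods_better = sum(r < 1.0 for r in stein_ratio)

# Average ratio across the five periods and the implied improvement.
mean_ratio = sum(stein_ratio) / len(stein_ratio)
avg_improvement_pct = (1.0 - mean_ratio) * 100.0
```

The ratios average 0.959, which corresponds to an improvement of about 4%, and four of the five ratios are below one.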

CONCLUSIONS

Stein rule estimation is especially useful for the valuation problem encountered by appraisers and assessors. The sample data available for valuing residences are not very good; this is not because data are unavailable, but because the available data are collinear. At the same time, non-sample information about values of housing characteristics is often quite accurate. For example, an appraiser is often aware of the cost of adding a bathroom, or of the value a swimming pool has for a prospective buyer. Stein-like rules permit a professional to bring this high-quality information (and other expert opinions) into the estimation process in an objective manner.

In this article, we demonstrate that the performance of Stein rule estimation is impressive in a predictive context, even when the only non-sample information added is the fact that the predictive model contains too many unimportant explanatory variables. That it performs so well under these circumstances conveys two messages. First, when using MRA, an appraiser should be wary about incorporating all the possible variables available to him or her. It may very well be that an appraiser can predict most probable sale price better using a much smaller model. The second message is the potential promise of this method for improving an appraiser's accuracy of prediction. If Stein rule performance is impressive with relatively unexciting non-sample information, what are the possibilities when the high-quality information that is at an appraiser's disposal is incorporated into the estimation process?

1. See Jonathan Mark and Michael A. Goldberg, "Multiple Regression Analysis and Mass Assessment: A Review of the Issues," The Appraisal Journal (January 1988): 89-109 for a review of the application of MRA to appraisal and assessment and a discussion of problems associated with its use in this context. For a treatment of MRA, see Peter Kennedy, A Guide to Econometrics, 2d ed. (Cambridge: The MIT Press, 1985).

2. More general exact linear relations can be used.

3. A good discussion of restricted least squares estimation can be found in Kennedy, chapter 11.

4. $F = \dfrac{(SSE_R - SSE_U)/J}{SSE_U/(T - K)}$, where $SSE_R$ and $SSE_U$ are the sums of squared errors of the restricted and unrestricted regressions.

5. A proof is provided in Stephen M. Stigler, "A Galtonian Perspective on Shrinkage Estimators," Statistical Science (February 1990): 147-155.

6. Kennedy, 63.

7. For example, see John R. Knight, R. Carter Hill, and C. F. Sirmans, "Estimation of Hedonic Housing Price Models Using Non-Sample Information: A Monte Carlo Study," Journal of Urban Economics, forthcoming November 1993.

8. For example, see Graeme J. Newell, "The Application of Ridge Regression to Real Estate Appraisal," The Appraisal Journal (January 1982): 116-119, or Eurico J. Ferreira and G. Stacy Sirmans, "Ridge Regression in Real Estate Analysis," The Appraisal Journal (July 1988): 311-319.

John R. Knight, PhD, is assistant professor of finance and real estate at the University of Connecticut. He has published in AREUEA Journal and Journal of Urban Economics.

R. Carter Hill, PhD, is professor of economics at Louisiana State University. He has published numerous econometrics articles and textbooks, and his work has appeared in AREUEA Journal and Journal of Urban Economics.

C. F. Sirmans, SRPA, PhD, is professor of finance and real estate and director of the Center for Real Estate and Urban Economic Studies at the University of Connecticut. He is the author of numerous real estate textbooks and has published extensively in finance, economics, and real estate journals, including The Appraisal Journal.