# Errors-in-Variables Bounds in a Tobit Model of Endogenous Protection.

Kishore Gawande [*] and Alok K. Bohara [+]

The errors-in-variables (EIV) problem is pervasive in econometrics but has not received the attention it deserves, perhaps because it is difficult to resolve. The first objective of this paper is to demonstrate the effectiveness of recently developed methods for dealing with the EIV problem in models with censoring. The second objective is to examine empirically, in light of the EIV problem, theories of endogenous protection that have become important in trade theory for their ability to explain why nations do not follow the traditional economic maxim of free trade. These theories, which emphasize political-economic factors, have gained momentum from a set of empirical studies that have sought to establish their validity. Whether inferences about the theories of endogenous protection are gravely affected by errors in variables is examined using data on U.S. nontariff barriers with respect to nine developed countries. The theoretical developments in Klepper (1988) and Klepper and Leamer (1984) are combined with a result from Levine (1986), which usefully extends the use of EIV diagnostics to a model with censoring.

1. Introduction

The predominantly nonexperimental nature of economic data compels the use of proxies, imperfectly measured variables, and dirty data. This paper is motivated by the cogent arguments in favor of sensitivity analyses made by Leamer (1983, 1985). In this paper the recent theoretical advances in the errors-in-variables (EIV) literature by Klepper and Leamer (1984) and Klepper (1988), which have focused on the linear regression model, are applied to a Tobit model via a result in Levine (1986).

The empirical literature on endogenous protection provides a rich context within which to study the sensitivity of inferences to the EIV problem. Studies of endogenous protection based on the seminal empirical work of Pincus (1975), Caves (1976), Ray (1981), and Baldwin (1985), among others, have significantly influenced traditional thinking in the area of trade. This literature is a prime example of empirical work that has led theoretical development and continues to influence it. But inferences from econometric studies of endogenous protection are suspected to be fragile because there is widespread use of proxies and variables that are poorly measured. The EIV problem is not confined to the variables measured with error: the extensive use of mismeasured variables and proxies may lead to spurious estimates even on well-measured variables.

This paper seeks to make two contributions. First, using cross-country and cross-industry data on nontariff barriers, the sensitivity of inferences about the validity of theories of endogenous protection to classical errors in variables is investigated. Second, the applicability of the EIV methodology to limited-dependent-variable models is demonstrated. The paper proceeds as follows. In section 2, the choice of regressors is motivated, an endogenous protection equation is estimated, and inferences are made under the presumption that there are no errors in variables. In section 3, the EIV methodology and its extension to the Tobit model are described. In section 4, two kinds of EIV analyses are performed, one that leads to bounds on estimated coefficients and another that exposes those inferences that are fragile on account of errors in variables. Section 5 concludes.

2. Inferences About Endogenous Protection Theories

Empirical Specification

Trade protection in the United States has been modeled in the empirical literature as including four components: (i) a self-interested political component that is a response to protectionist pressures, which is substantially influenced by the lobbying efforts of private agents, (ii) an altruistic political component influenced by welfare-oriented motives of the government, (iii) a retaliatory component that serves as a strategic deterrent against undesirable protectionist policies of its partners, and (iv) a component motivated by comparative advantage. Their empirical relevance has been demonstrated by Caves (1976), among others, using tariff data from the Kennedy round, by Ray (1981) and Baldwin (1985) using tariff data during the Tokyo round of cuts, and by Trefler (1993) using aggregate U.S. NTB (non-tariff barrier) data from 1983.

A feature of this study is the use of bilateral cross-industry NTB data between the United States and nine developed partner countries. NTBs include all trade barriers other than ad valorem tariffs. Prominent examples of NTBs are antidumping duties to thwart dumping below fair price, countervailing duties to counter partner's export subsidies, quotas whose licences may be distributed to domestic agents, and voluntary export restraints where the partner country voluntarily restricts its exports. Leamer (1990) provides an exhaustive taxonomy of nearly 50 NTBs. After the Tokyo round tariff cuts, the new protectionism in developed countries took the form of NTBs. In the United States, their use sharply escalated after 1979 and continued to rise through the 1980s. Data from 1983 are used in this study and capture a period in which the use of NTBs was widespread.

NTBs are measured in this study as coverage ratios; that is, the fraction of imports covered by some NTB or other. Following Baldwin's framework, the specification employed in the econometric analysis is

$$
\begin{aligned}
N_{ij} &= X1_{ij}\alpha_1 + X2_{ij}\alpha_2 + X3_{ij}\alpha_3 + \beta N^*_{ij} + D^*_j\gamma_j + \varepsilon_{ij}, \\
\varepsilon_{ij} &\sim N(0, \sigma^2), \quad i = 1, \ldots, 435, \quad j = 1, \ldots, 9.
\end{aligned}
\tag{1}
$$

United States NTBs on good $i$ against country $j$, $N_{ij}$, are determined by a self-interested political component, whose variables are represented by the vector $X1_{ij}$; an altruistic political component, represented by $X2_{ij}$; the theory of comparative costs, represented by $X3_{ij}$; and an offensive component, $\beta N^*_{ij}$, designed to thwart foreign NTBs. Country-effect dummy variables are included in $D^*_j$. The parameters $\alpha_1$, $\alpha_2$, $\alpha_3$, and $\beta$ are assumed stable across industries and countries, and Equation 1 is estimated by pooling industry and country data. Here, cross-industry data at the four-digit SIC level of disaggregation are pooled across nine countries: Belgium, Finland, France, Germany, Italy, Japan, the Netherlands, Norway, and the United Kingdom. The errors $\varepsilon_{ij}$ are assumed to be homoskedastic across countries and goods.

The choice of regressors in Equation 1 is influenced by Baldwin's study but also includes newly constructed variables, and an additional model, namely, the strategically retaliatory model. Table 1 shows the association of regressors with the underlying theory and the expected sign on each coefficient. The Appendix details the construction of the variables.

(i) Strategic retaliation: The aim of using retaliatory NTBs, as opposed to employing them for purely protectionist purposes, is to deter undesirable foreign trade policy at minimum domestic cost. In a game-theoretic political economy model based on fairly real-world assumptions, Baldwin (1990) shows the existence of optimal nonnegative retaliatory trade barriers. His argument is motivated by the ability of retaliatory measures to discourage special-interest pressures in the foreign country that led to the formation of the foreign trade barrier in the first place. Grossman and Helpman (1995) describe a model with a noncooperative trade-war equilibrium and a bargaining equilibrium resulting from trade talks. In Baron's (1997) case study of the Kodak-Fujifilm trade war, the use of nonmarket strategies such as applying pressure on the government to impose sanctions is highlighted using a bargaining model. Although these theories about retaliation and bargaining are more properly tested using relative levels of protection in the two countries, foreign NTBs can be used as a regressor to infer whether the United States retaliates against high NTBs abroad (positive sign on $N^*_{ij}$) or whether high NTBs abroad are an indication of greater bargaining strength in the partner country (negative sign on $N^*_{ij}$).

(ii) Special-interest or pressure group model: The special-interest group model associated with Olson (1965) and Pincus (1975), and subsequently formalized by Brock and Magee (1978) and Findlay and Wellisz (1982), suggests measures of special-interest pressure. The concentration ratio (CONC4) and measures of scale economies (SCALE) have traditionally been used as proxies for special-interest pressures because the stakes from protection are highest in industries with a high degree of concentration or scale economies. In addition to these proxies, a more direct measure of pressure--corporate PAC (Political Action Committee) campaign contributions scaled by industry value added (PACCVA83)--is employed and presumed to be positively related to the level of protection. More recently, protection has been modeled by Grossman and Helpman (1994) as the outcome of a menu auction, in which industry lobbies each bid on a menu of trade tax vectors. The government then sets a specific trade tax vector and collects from each lobby its bid on that specific vector. In their model, which abstracts from market structure issues, the prediction is that the inverse import penetration ratio ($\mathrm{CONS}/M_{ij}$) should be positively related to protection. [1]

(iii) Adding machine model: The adding machine model due to Caves (1976) focuses on the voting strength of the industry and suggests that number of employees (NE82) and degree of unionization (UNION) in that industry and the industry's labor intensity (LABINT82) are all positively related to the level of NTBs. The number of states in which production is located (REPRST) is another measure of the spread and consequently of voting power. Further, this model predicts that industries with a large number of unconcentrated firms are more likely to receive protection than a concentrated industry so that the level of protection is expected to be negatively associated with the concentration ratio (CONC4). The special-interest group model and the adding machine model fall under the category of political models motivated by self-interest, and variables that represent them are collected in X1.

(iv) Public interest: The set of models emphasizing the public interest range from the status quo model based on Corden's (1974) conservative social welfare function, to models of government altruism (e.g., Lavergne 1983) emphasizing equity issues. The status quo model, used by Baldwin with some success in explaining tariff cuts during the Tokyo round, suggests that the proportion of unskilled workers (P_UNSK) and tariff protection (TAR) will be positively related to the level of NTB protection. The equity model has been offered as an explanation as to why industries without much political clout such as apparel and textiles have been successful in obtaining protection. It suggests that industries with low average earnings (AVEARN) or high labor intensity (LABINT82) will likely obtain high levels of protection. The variables associated with the status quo and equity models are contained in X2.

(v) Comparative cost-comparative advantage: The traditional comparative advantage model of trade suggests that NTBs are positively related to import penetration ($M_{ij}/\mathrm{CONS}$) and negatively related to exports ($X_{ij}/\mathrm{CONS}$). Because the United States has been demonstrated to have a comparative advantage in skill-intensive industries, industries with a high proportion of scientists (P_SCI) and managers (P_MAN) are expected to require less protection. A large change in import penetration (DPEN7982) may be a sign of an industry that has lost its comparative advantage and hence is a candidate for protection. Other control variables included in X3, in addition to the comparative cost variables, address the concern of incorporating the effects of real exchange rates into the cross-sectional analysis. The absolute values of both the real-exchange-rate (RER) elasticity of imports (MELAST) and that of exports (XELAST) are expected to be positively related to the level of protection during this period. [2] The extended period of RER appreciation between 1981 and 1984 led to a rise in import penetration and a lowering of exports, fueling protectionist pressures in industries with high (absolute) RER elasticities.

Inference from ML Estimates

The dependent variable, $N_{ij}$, is measured only when it takes a positive value. Foreign export subsidies, if not countervailed in the United States, act like negative NTBs, and the absence of any export subsidy data leads to censoring in the left-hand-side variable. Also, theoretically there exist arguments for direct import subsidies (see Vousden 1990; Grossman and Helpman 1994). With intra-industry trade in intermediate goods, which characterizes a large part of the trade among the countries in the present analysis, there is certainly the possibility of subsidizing imports. Hence, the dependent variable is censored at zero, requiring the use of a Tobit specification.
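To make the censoring argument concrete, the following is a minimal sketch of the log-likelihood of a Tobit model censored from below at zero. It is an illustration under stated assumptions, not the estimation code behind Table 2: the helper names (`norm_cdf`, `tobit_loglik`), the two-regressor design, and all simulated numbers are hypothetical.

```python
import math
import random

def norm_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_loglik(beta, sigma, X, y):
    """Log-likelihood of a Tobit model censored from below at zero.

    X is a list of regressor rows; y holds the observed (censored) outcomes.
    """
    ll = 0.0
    for row, yi in zip(X, y):
        xb = sum(b * x for b, x in zip(beta, row))
        if yi > 0.0:
            # Uncensored observation: normal density of the residual.
            z = (yi - xb) / sigma
            ll += -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z
        else:
            # Censored observation: probability that the latent value is <= 0.
            ll += math.log(max(norm_cdf(-xb / sigma), 1e-300))
    return ll

# Simulated illustration; every number below is made up.
rng = random.Random(0)
true_beta, true_sigma = [0.2, 1.0], 1.0
X = [[1.0, rng.gauss(0.0, 1.0)] for _ in range(2000)]
y = []
for row in X:
    latent = sum(b * x for b, x in zip(true_beta, row)) + rng.gauss(0.0, true_sigma)
    y.append(max(0.0, latent))  # only nonnegative values are observed

ll_true = tobit_loglik(true_beta, true_sigma, X, y)
ll_wrong = tobit_loglik([0.2, 0.0], true_sigma, X, y)
print(ll_true > ll_wrong)
```

Maximizing `tobit_loglik` over `beta` and `sigma` with any numerical optimizer would give ML estimates for the simulated data. The comparison at the end only checks that the likelihood ranks the data-generating parameters above a clearly wrong alternative.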

The ML estimates from two models in Table 2 indicate that every political economy model finds some support from the data. In the first model, country-effect dummies are included, while in the second model both country dummies and industry dummies for four industry groups--food processing, resource-based manufacturing, general manufacturing, and capital-intensive industries--are included. Both models lead to similar estimates. Of the group of variables representing any theory, at least one variable has the expected sign and a large t-value. The model of retaliation is clearly supported, with an estimate for the retaliation coefficient that is both statistically and economically significant. The pressure group model finds strong support from the significant estimate on corporate PAC spending (PACCVA83). This variable provides clear-cut and direct inference about special interest groups rather than the indirect inference through the coefficient on the industry concentration ratio (CONC4). The adding machine model finds support from the estimate on number employed (NE82) and on the geographic spread of firms within an industry (REPRST), validating Caves' (1976) theory that voting power in terms of numbers is an important determinant of whether an industry receives protection. However, the unexpected sign on labor intensity (LABINT82) is contrary to the model, and this finding is hard to rationalize.

The data provide mixed inferences about models of political altruism. The positive ML estimate on average earnings does not support the status quo model of Corden (1974), but the positive coefficient on ad valorem tariffs (TAR) indicates that the same industries that were earlier protected by tariffs now receive NTB protection, thus undermining the multilateral cuts from the Tokyo round. The status quo model predicts that, to prevent damage from a sudden removal of protection, the industries most highly protected before the Tokyo round implementations would be supported in some alternative way. The results show that NTBs filled the gap in protection left by the Tokyo round tariff cuts. The equity model does not receive support from the data. Although the coefficient on the proportion unskilled (P_UNSK) is positive, it has a low t-value. Perhaps the labor-intensity variable (LABINT82), earlier attributed to the adding machine model, is more representative of the equity model, and the negative coefficient is evidence against that model. The model of comparative costs--comparative advantages is strongly supported by the data. Bilateral imports and exports ($M_{ij}/\mathrm{CONS}$, $X_{ij}/\mathrm{CONS}$) have the expected signs and high t-values. Industries with high skill levels, measured by the proportion of scientists and engineers (P_SCI), do not receive protection, largely because, as past empirical studies have shown, the United States has a comparative advantage in the production of skill-intensive goods.

A strong criticism of the empirical model is its tenuous connection with any formal underlying theory. Due to the ad hoc nature of these models, the variables are at best proxies for the theoretically correct measure of special interest pressure, voting strength, public interest, or government altruism. In addition to errors in variables arising from their use as proxies, the variables are imperfectly measured. It is therefore relevant to question whether the inferences in Table 2 are robust to errors in variables. CONC4 and SCALE seek to measure the stakes to firms from obtaining protection, as well as the ability to solve the free-riding problem in organizing lobbying activities. REPRST seeks to measure congressional representation of industries. P_SCI and P_MAN seek to measure human capital across industries. P_UNSK is a proxy for those workers for whom protection is the only form of insurance against unemployment. The trade measures (NTB, import penetration, export-to-consumption ratio, and tariffs) are subject to measurement errors because they are concorded from disparate international systems of data-keeping. PACCVA83, which is constructed from Federal Election Commission tapes, is also subject to measurement error due to concordance problems. Still other variables, namely P_SCI, P_MAN, P_UNSK, MELAST, and XELAST, are measured at different levels of aggregation. An EIV case can thus be made against many variables employed in the Tobit model. The remainder of this paper is devoted to a sensitivity analysis of the ML estimates to the EIV problem.

3. Errors-in-Variables Diagnostics in the Tobit Model

Errors-in-Variables Model

Consider the classical EIV model in which the observed variable y is generated by

$$
y = \beta' x^* + \mu, \tag{2}
$$

where $x^*$ is a $K \times 1$ vector of true regressors with mean 0 and covariance matrix $\Sigma$, $\mu$ is a classical disturbance with mean 0 and variance $\sigma^2$, and $\beta$ is a $K \times 1$ vector of coefficients, whose estimation is the focus of interest. A $K \times 1$ vector of proxy variables $x$ is observed, which measures $x^*$ with error as

$$
x = x^* + \epsilon, \tag{3}
$$

where $\epsilon$ is a $K \times 1$ vector of measurement errors with mean 0 and covariance $V = \mathrm{diag}(\nu_1, \nu_2, \ldots, \nu_K)$, which is assumed to be distributed independently of $x^*$ and $\mu$. [3]

The EIV analysis of a model with a single regressor, $K = 1$, is well known. The set of feasible values of the single coefficient $\beta$ lies between the direct-regression estimate from the regression of $y$ on $x$, and the reverse-regression estimate computed by regressing $x$ on $y$ and then expressing $y$ as a function of $x$. For example, if the regression of $x$ on $y$ yields the fitted equation $x = a + by$, the reverse-regression estimates of the intercept and slope are, respectively, $-a/b$ and $1/b$. With many regressors, $K > 1$, the generalization of this result involves the direct-regression estimate plus the $K$ reverse regressions, in which the $K$ regressors $x_i$, $i = 1, \ldots, K$, are each regressed on $y$ and their reverse-regression estimates are then computed.
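The single-regressor result can be illustrated with a short simulation. The sketch below generates data from Equations 2 and 3 with $K = 1$ and compares the direct and reverse slope estimates with the true coefficient. The function name `slope_bounds`, the sample size, and the variances are assumptions chosen only for illustration.

```python
import random

def slope_bounds(x, y):
    """Return (direct, reverse) slope estimates for a single regressor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    direct = sxy / sxx   # slope from regressing y on x
    reverse = syy / sxy  # regress x on y (slope sxy / syy), then invert
    return direct, reverse

# Simulated data: true slope 2, unit-variance true regressor and disturbance,
# and measurement error with standard deviation 0.7 (all values hypothetical).
rng = random.Random(1)
beta = 2.0
x_star = [rng.gauss(0.0, 1.0) for _ in range(5000)]
y = [beta * xs + rng.gauss(0.0, 1.0) for xs in x_star]
x = [xs + rng.gauss(0.0, 0.7) for xs in x_star]

direct, reverse = slope_bounds(x, y)
print(direct, beta, reverse)
```

With these settings the direct estimate is attenuated toward zero and the reverse estimate overshoots, so the true slope falls between the two.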

Klepper and Leamer (1984) show that if every coefficient has the same sign in the direct and all $K$ reverse regressions, then the set of feasible values of $\beta$ can be bounded. However, if the signs of any coefficient differ across the direct and reverse regressions, none of the coefficients can be bounded. [4] It is then necessary to invoke additional prior information to bound the feasible set. Klepper (1988) describes how to use reasonable prior information to bound the feasible values of $\beta$ and, once this is accomplished, how to further tighten the bounds on individual coefficients. The steps involved in Klepper's method are explained below.

The limitations of the Klepper-Leamer methodology should be noted. First, it is assumed that response errors are additive white noise and uncorrelated with all other latent and measured variables in the model. Even though economic data may not satisfy these assumptions, relaxing them makes the EIV problem even more intractable. Krasker and Pratt (1986), Bekker, Kapteyn, and Wansbeek (1987), and Erickson (1989) show that bounds will not generally exist if the no-correlation assumption between measurement errors and the (true) equation errors is dropped. More results with weaker assumptions are called for to provide general solutions to the EIV problem with economic data (e.g., Erickson 1993; Iwata 1992). Second, applying the Klepper-Leamer results to the Tobit model requires the use of Levine's (1986) result (see the Appendix). Levine's result is based on the assumption of joint normality of the explanatory variables, which we also assume in our application. A more fundamental problem here is that Levine and Klepper both work with population moments in deriving bounds. Because there is uncertainty about the population Tobit model for the mismeasured variables, there is uncertainty about the bounds. Hence, standard errors for the Klepper-Leamer bounds are also required. This concern is addressed here by computing not just the EIV bounds but also their standard errors.

Errors-in-Variables Diagnostics

We begin our EIV analysis of model 1 by computing the direct and reverse regressions. [5] The upper and lower bounds from the set of direct and reverse regressions are reported in Table 3. For the set of feasible values of $\beta$ to be bounded, the coefficient on each regressor must have the same sign within its interval; that is, no interval may contain zero. Otherwise the feasible set of values for $\beta$ cannot be bounded, and the EIV problem prevents any inference about $\beta$. Clearly, based on the intervals from the direct- and reverse-regression estimates presented in Table 3, the set of political economy coefficients cannot be bounded in the presence of errors in variables.
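The sign check described above is mechanical once the direct and reverse regressions are in hand. The following sketch takes coefficient vectors as given and forms, for each regressor, the interval spanned by the direct and reverse estimates. The function names and the two-regressor coefficients are hypothetical and are not taken from Table 3.

```python
def eiv_intervals(direct, reverses):
    """Per-coefficient (min, max) over the direct and all reverse regressions."""
    all_fits = [direct] + list(reverses)
    return [(min(col), max(col)) for col in zip(*all_fits)]

def is_bounded(intervals):
    """True if no interval contains zero, i.e. signs agree across regressions."""
    return all(lo > 0.0 or hi < 0.0 for lo, hi in intervals)

# Hypothetical coefficients for K = 2: one direct and two reverse regressions.
direct = [0.8, -0.3]
reverses = [[1.9, -0.5], [1.1, 0.4]]

intervals = eiv_intervals(direct, reverses)
print(intervals)              # [(0.8, 1.9), (-0.5, 0.4)]
print(is_bounded(intervals))  # False: the second interval contains zero
```

In this made-up case the second coefficient changes sign in one reverse regression, so, following Klepper and Leamer (1984), neither coefficient could be bounded without further prior information.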

To bound the feasible set of coefficients, prior information must be introduced. Klepper (1988) focuses on two types of prior information: (i) a prior bound on $R^{*2}$, the (hypothetical) true R-squared of the regression of $y$ on $x$ if all the measurement error in the $x_i$'s were completely removed, and (ii) prior bounds on $f_i$, the fraction of the variation in each regressor that is attributable to measurement error. These two sets of bounds are not independent of each other, and their connection is made evident below. Klepper's method includes the computation of two sets of diagnostics. The first is an upper bound $M$, such that if the true $R^2$ of the regression can be constrained below $M$ (using prior information), the feasible set is bounded. Given the constraint $R^{*2} < M$, the set of feasible values of $\beta$ is bounded by the direct regression and $K$ constrained reverse regressions. The EIV intervals have the same signs as the coefficients from the direct regression. The intuition behind this result is as follows. If the true $R^2$ is left unrestricted, then we cannot rule out some combinations of measurement error variances that imply the true regressors are collinear. If the true $R^2$ can be bounded below $M$, all the combinations of the measurement error variances that imply the true regressors are collinear become infeasible (see Klepper 1988). Hence, if prior information in the form of an upper bound on the true $R^2$ is applied to the problem, then the resulting EIV intervals for $\beta$ do not contain zero for any coefficient.

The second set of diagnostics involves the identification of two key variables and the computation of two values for them, denoted $d_1$ and $d_2$. These values may be used to bound the proportions of the total variances of the key variables that are attributable to measurement error variances. If either of the two measurement error variance proportions can be bounded below its respective $d$ value on the basis of prior information, then a further relaxation of the $M$ bound on the true $R^2$ is possible. That is, the feasible set of estimates can be bounded by restricting the true $R^2$ below an even larger $M$ value than permitted earlier. Beliefs about $f_i$ and $R^{*2}$ are related. [6] To bound $R^{*2}$ below $M$, the $f_i$ need to be correctly bounded. Klepper's (1988) formula computes the maximum value, $d_i$, that each $f_i$ can take and still not violate the upper bound on $R^{*2}$.
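The link between beliefs about the error share and the true R-squared is easiest to see in the single-regressor case of Equations 2 and 3. The derivation below is only that special case, offered for intuition; it is not Klepper's general formula for the $d_i$. It writes $\Sigma$ and $\nu$ for the scalar variances of $x^*$ and $\epsilon$ and $R^2$ for the R-squared of the observed regression of $y$ on $x$.

```latex
\begin{aligned}
\operatorname{cov}(x, y) &= \beta\Sigma, \qquad
\operatorname{var}(x) = \Sigma + \nu, \qquad
\operatorname{var}(y) = \beta^2\Sigma + \sigma^2, \\
R^2 &= \frac{\operatorname{cov}(x, y)^2}{\operatorname{var}(x)\operatorname{var}(y)}
     = \frac{\Sigma}{\Sigma + \nu}\cdot\frac{\beta^2\Sigma}{\beta^2\Sigma + \sigma^2}
     = (1 - f)\,R^{*2}, \qquad f = \frac{\nu}{\Sigma + \nu}, \\
R^{*2} \le M \;&\Longleftrightarrow\; f \le 1 - \frac{R^2}{M}.
\end{aligned}
```

In this special case an upper bound $M$ on the true R-squared is equivalent to an upper bound on the measurement error share $f$, which is the sense in which the two kinds of prior information constrain each other.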

Corresponding to this new upper bound on the true $R^2$ is another set of two key variables for which $d$ values may be computed. Again, by applying prior information to bound either of the measurement error variance proportions below its respective $d$ value, yet further relaxation of the $M$ bound on the true $R^2$ is possible. [7] This process can then be repeated to tighten the EIV intervals. In some cases, very reasonable bounds on the $f_i$ and $R^{*2}$ can lead to fairly tight EIV intervals for $\beta$. In other cases, reasonable bounds on $f_i$ and $R^{*2}$ may lead to wide bounds on some important coefficients. Klepper shows how to tighten the intervals for those individual coefficients selectively by bounding a key $f_i$.

The key variables along the path of iterations are casualties of the EIV problem, for they cannot be bounded and no inference about their size and signs is possible.

Priors

Because the unconstrained direct and reverse regressions in Table 3 yield EIV intervals containing zero for all coefficients, we apply prior information about (i) an upper bound on the true $R^2$ and (ii) the maximum $f_i$ values, that is, the proportion of variation in the explanatory variables accounted for by measurement error.

Our prior on the upper $R^2$ bound is based on a set of cross-industry political economy studies. Leamer's (1990) NTB study pools across countries and three-digit SITC industries with the intention of measuring the impact of NTBs on trade. His explanatory variables are mainly partner-country and industry-group dummies. He reports $R^2$ values of around 0.30. Trefler's (1993) NTB study employs a simultaneous Tobit model of NTBs and imports for cross-industry U.S. data at the four-digit SIC level, aggregated (not pooled) across partners. The log-likelihood ratio (LLR) reported could be translated into a pseudo-$R^2$ measure if the LLR for the null model were also reported. Baldwin's tariff study (1985, pp. 162-3) finds a value of $R^2$ close to 0.40. Other cross-industry studies of tariffs with similar measures of fit are Ray (1981; U.S. Kennedy round tariff levels), Pincus (1975; U.S. tariff levels in 1820), and Caves (1976; Canadian Kennedy round tariffs). Based on these studies, we are prepared to accept a true $R^2$ of between 0.30 and 0.40 for our study. Our regression contains many new variables that have not been considered before, including PAC spending, partner NTBs, RER elasticities, and bilateral imports and exports. Further, country-fixed effects are included. Hence, our prior value of the true $R^2$ of the regression lies in the interval [0.30, 0.40] with a uniform distribution over values in this interval. Although a more precise statement may be required in other situations, it is adequate here.

Our priors on $f_i$, the measurement error variances as a fraction of sample variance, are displayed in Table 4. The four political economy variables NE82, LABINT82, AVEARN, and NEGR82, taken from the Census of Manufacturing, are presumed to be precisely measured. So are the partner-country dummies. The Census does not report standard errors for these variables, which is taken as evidence of the precision with which they are estimated. However, the Census does report that the relative standard error--that is, the standard error divided by the estimate--on the four-firm concentration ratio (CONC4) and on the number of firms (used to construct SCALE) is greater than 0.15 for over 80% of the four-digit SIC industries. For these two variables, the prior value of $f$ is set at 0.40. We illustrate the rationale for this with CONC4. Suppose the relative standard error for CONC4 averages 0.30 for the sample. Then, since its sample mean is approximately 0.40, the actual (average) standard error is $0.40 \times 0.30 = 0.12$, which corresponds to a measurement error variance of $0.12^2 = 0.0144$. Because the sample variance of CONC4 is 0.0425, this implies a value of $f_{\mathrm{CONC4}} = 0.0144/0.0425 = 0.34$. The prior value for $f_{\mathrm{CONC4}}$ is therefore put conservatively at 0.40. Similarly, the prior value for $f_{\mathrm{SCALE}}$ equals 0.40.
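The CONC4 arithmetic can be reproduced in a few lines. The helper name `error_share` is hypothetical; the inputs are the figures quoted in the text, including the supposed average relative standard error of 0.30.

```python
def error_share(mean, relative_se, sample_variance):
    """Measurement error variance as a fraction of the sample variance."""
    standard_error = mean * relative_se  # 0.40 * 0.30 = 0.12 for CONC4
    return standard_error ** 2 / sample_variance

f_conc4 = error_share(mean=0.40, relative_se=0.30, sample_variance=0.0425)
print(round(f_conc4, 2))  # 0.34
```

The computed share of about 0.34 sits below the prior of 0.40 adopted for CONC4, which is the sense in which that prior is conservative.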

If there is a mismatch in the degree of aggregation for any variable, its prior $f$ value is set at a minimum of 0.40. Because PACCVA83, UNION, REPRST, P_SCI, P_MAN, and P_UNSK are all measured at the three-digit SIC level and are replicated at the four-digit level, they are assigned a prior $f$ value of 0.40. MELAST and XELAST are estimated at the higher two-digit level of aggregation and replicated at the four-digit level. Because their measurement error is greater than that for the variables measured at the three-digit level, both their prior $f$ values are set at 0.50. Conversion from disparate systems of data keeping induces measurement error. Whenever a variable requires concordance between the trade system of data keeping (TSUSA, SITC, etc.) and the industry system (SIC), it is considered to be moderately measured with error and its prior $f$ value is set at 0.25. While we have employed reliable converters among these disparate systems, the mappings are not exact. For example, in going from the SITC system to SIC, there are many cases with many-to-one and one-to-many mappings. For these cases, the mappings are simply proportioned equally, which is an approximation. Hence, the variables $N^*_{ij}$, M/CON, X/CON, DPEN7982, and TAR each have prior $f$ values equal to 0.25.

If anything, the set of priors on $f$ is conservative. The priors therefore lead to wider bounds on coefficient estimates than if the measurement errors were believed to be smaller. Another important reason to err on the conservative side is that statements about prior beliefs are usually approximate judgements, and eliciting exact prior information is a difficult, if not impossible, task (Leamer 1978).

4. Inferences with Errors in Variables: EIV Diagnostics and Bounds

Stage I Iterations

Table 5 shows the values of $M$ and $d_i$ along the path of iterations. Consider the first row (iteration $J = 0$) of the table. $M = 0.3375$ signifies that if the true R-squared of the regression ($R^{*2}$) is bounded below 0.3375, it would permit the coefficients to be bounded and hence resolve the EIV problem. This value for $R^{*2}$ is well within the prior interval of [0.30, 0.40]. However, for $R^{*2}$ to be bounded below 0.3375, all the $f_i$'s must also be bounded below their respective $d_i$ values. But that requires restricting the $f_i$ values of some variables far below their admissible prior values (set out in Table 4). The first row of Table 5 shows that the $f_i$ bounds for these variables are in conflict with their prior bounds: $f_{\mathrm{PACCVA83}} \leq 0.04$, $f_{\mathrm{REPRST}} \leq 0.095$, $f_{\mathrm{MELAST}} \leq 0.02$, and $f_{\mathrm{XELAST}} \leq 0.14$.
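The conflict at iteration 0 can be tabulated directly from the numbers quoted in the text. In the sketch below, the prior values follow the rules stated for Table 4 (0.40 for variables replicated from the three-digit level, 0.50 for the two elasticities) and the $d_i$ values are the four bounds cited above; the dictionary names are illustrative.

```python
# Prior f values for the four variables, as described in the text for Table 4.
prior_f = {"PACCVA83": 0.40, "REPRST": 0.40, "MELAST": 0.50, "XELAST": 0.50}

# d_i values quoted in the text for the first row (iteration J = 0) of Table 5.
d_iter0 = {"PACCVA83": 0.04, "REPRST": 0.095, "MELAST": 0.02, "XELAST": 0.14}

# A variable conflicts when its required bound d_i lies below its prior f_i.
conflicts = sorted(name for name in d_iter0 if d_iter0[name] < prior_f[name])
for name in conflicts:
    print(name, "requires f <=", d_iter0[name], "but the prior is", prior_f[name])
```

All four variables are flagged, which is why the bound $M = 0.3375$ cannot be accepted without further iterations.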

To proceed, two key variables must be identified (only two such variables exist), one of them chosen and the value of its $f_i$ bounded below its computed $d_i$ value. These two key variables are identified in Table 5 as CONC4 and P_MAN (bold) in the first row. There are now two courses of action: (i) choose the variable for which $f_i < d_i$ is satisfied in Table 4, or choose either variable if both their $d_i$ values exceed their prior $f_i$ values, or (ii) choose neither if both their $d_i$ values are unacceptably low. In the latter case, the iterations terminate. Because CONC4 and P_MAN both have admissibly high $d_i$ values, we randomly chose CONC4 and imposed the prior constraint $f_{\mathrm{CONC4}} \leq 0.40$. Recomputing the $M$ bound on $R^{*2}$ yielded a higher value of $M = 0.3381$. This now became the sufficient upper bound on $R^{*2}$ required to relax the $d(\cdot)$ bounds on all the remaining variables that were measured with error. [8] The key variable CONC4 now had upper and lower bounds of opposite signs and became the EIV problem's first casualty. It could no longer support inferences about the special-interest models of Olson (1965) and Brock and Magee (1978), and the adding machine model of Caves (1976).
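The selection rule for key variables can be written as a small function. This is a sketch of the decision logic only: it does not compute $M$ or the $d_i$, the function name is hypothetical, and the $d$ values below are placeholders chosen for illustration (the two low values reuse the iteration-0 bounds quoted earlier, and the two high values are invented). Where the authors chose at random between two admissible variables, the sketch simply takes the first.

```python
def choose_key_variable(key_vars, d, prior_f):
    """Return a key variable whose prior f does not exceed its d value.

    Returns None when neither key variable is admissible, which ends the
    stage I iterations.
    """
    admissible = [v for v in key_vars if prior_f[v] <= d[v]]
    return admissible[0] if admissible else None

prior_f = {"CONC4": 0.40, "P_MAN": 0.40, "REPRST": 0.40, "MELAST": 0.50}
# Placeholder d values, not the entries of Table 5.
d_hyp = {"CONC4": 0.60, "P_MAN": 0.55, "REPRST": 0.095, "MELAST": 0.02}

first_choice = choose_key_variable(["CONC4", "P_MAN"], d_hyp, prior_f)
last_choice = choose_key_variable(["REPRST", "MELAST"], d_hyp, prior_f)
print(first_choice)  # CONC4
print(last_choice)   # None: both d values are too low, so iterations stop
```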

This process of iterating to find acceptable M bounds and d([cdotp]) bounds is termed "stage I". [9] After six stage I iterations, the M bound was still satisfactory, but the two key variables at this step (REPRST and MELAST) both had unacceptably low [d.sub.i] values, thus concluding the stage I iterations. The six iterations were based on the following sequence of bounds on the fraction of variation attributable to measurement error in key variables: [f.sub.CONC4] [leq] 0.40, [f.sub.SCALE] [leq] 0.40, [f.sub.P_MAN] [leq] 0.40, [f.sub.UNION] [leq] 0.40, [f.sub.DPEN7982] [leq] 0.25, [f.sub.P_UNSK] [leq] 0.40. The M bound on [R.sup.*2] at the end of the stage I iterations was 0.3706, which produced intervals for all coefficient estimates except those corresponding to a bounded key [f.sub.i].

Stage I Intervals

The stage I intervals are presented in Table 6 under the column labeled "path 1". For comparison, we also include the a priori sign on each variable, the maximum likelihood estimates, their standard errors, and the connection of the variable with an underlying theory. The stage I intervals are constructed after imposing the constraint [R.sup.*2] [less than] 0.3706. [10] The blanks denote the key variables selected during the stage I iterations, for which no inferences are possible. The bounds on the remaining variables can be used to make inferences about the models of endogenous protection they each represent.

The computation of their standard errors is not straightforward. Because the iterations trace a specific path, the EIV bounds are conditional on the set of variables that define the path, or the path set. There is no analytic formula for the bounds even were the path set known in advance, therefore rendering analytic techniques such as the delta method practically useless for our purpose. Hence, we use a simulation method for computing standard errors on the bounds similar to the method used by Krinsky and Robb (1986). The details of the simulation method are provided in the Appendix.
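The idea behind a Krinsky and Robb (1986) style calculation can be sketched as follows. This is only a minimal illustration under stated assumptions, not the procedure in the Appendix: it draws coefficient vectors from a normal distribution centered on the estimates with the estimated covariance matrix, evaluates a function of each draw, and reports the standard deviation of the results. The function `g` is a hypothetical placeholder for whatever maps a coefficient draw to an EIV bound; the names `cholesky` and `simulated_se` are ours.

```python
import math
import random


def cholesky(V):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(V)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(V[i][i] - s)
            else:
                L[i][j] = (V[i][j] - s) / L[j][j]
    return L


def simulated_se(b_hat, V, g, draws=5000, seed=0):
    """Standard deviation of g(b) over draws b ~ N(b_hat, V).

    g is a hypothetical stand-in for the bound computation.
    """
    rng = random.Random(seed)
    L = cholesky(V)
    n = len(b_hat)
    values = []
    for _ in range(draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # b = b_hat + L z has covariance L L' = V
        b = [b_hat[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        values.append(g(b))
    mean = sum(values) / draws
    var = sum((v - mean) ** 2 for v in values) / (draws - 1)
    return math.sqrt(var)
```

In the application the analogue of `g` would have to rerun the iterations for each draw, which is why the bounds are conditional on the path set.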

The retaliation model of Baldwin (1990) is supported by the positive interval on [[N.sup.*].sub.ij]. From the point of view of the model of Grossman and Helpman (1995) and Baron (1997), this result may be used to infer that the United States has bargaining strength in the sense that an increase in partner NTBs is met by an increase in U.S. NTBs. A negative coefficient would imply that a foreign NTB increase deters U.S. NTBs. The standard errors on the lower and upper bounds of the stage I interval for [[N.sup.*].sub.ij] show that both bounds are measured quite precisely and are statistically significantly greater than zero at the 5% level.

The special-interest model finds support in the positive, though wide, interval for PACCVA83 (industry PAC spending). Both the lower and upper bounds are more than two standard errors greater than zero, which demonstrates that they are precisely estimated. The estimates support the prior belief that the greater is special-interest pressure in the form of congressional campaign contributions by industry lobbies, the higher the protection they receive. The traditionally used proxies for special-interest motives, namely CONC4 (four-firm concentration) and SCALE (firm scale of output), do not allow any inference about the special-interest model because of measurement error in regressors that are correlated with these variables. This reiterates the point made by Klepper, Kamlet, and Frank (1993), which is that the use of proxies that are correlated with other variables that are poorly measured may lead to spurious estimates on the well-measured proxy variables. [11] SCALE and CONC4 fall into this category of proxies whose usefulness is suspect due to their correlation with other poorly measured variables. The results reiterate the fact that it may be well worth the construction of variables that more directly represent the theory, a point made by Baldwin (1985) in assessing the empirical literature on the special-interest model. PACCVA83 is constructed to do just that, and it is reassuring that it is unaffected by the EIV problem. Its positive interval unambiguously supports the special-interest model: Politically active industries succeed in buying protection. However, perfect measurement is not by itself sufficient to avoid the EIV problem, as will be seen in the case of the adding machine model.

The one variable most representing the adding machine model of Caves (1976), namely NE82 (number of employees), is presumed to be perfectly measured because the data are from an accurately conducted census. Importantly, it is not a proxy and precisely represents what it is designed to measure, namely, voting strength. The bounds on perfectly measured variables can be computed after determining the bounds on the mismeasured variables using the method in Bollinger (1996). [12] Even though NE82 is measured without error, its correlation with other regressors that are measured with error fatally affects its usefulness for inferential purposes. The EIV interval on the coefficient on NE82 contains zero. Hence, the variable that is the mainstay of the adding machine model does not allow unambiguous inference due to the presence of other mismeasured variables.

The negative interval on LABINT82 (labor intensity) runs counter to the theory's prediction and is a puzzle. [13] We can only speculate that the contrary sign is due to specification error other than measurement error. The positive interval for REPRST (geographic spread of an industry) provides support for the adding machine model. REPRST proxies the geographic spread of an industry, and its positive interval indicates that industries that are geographically concentrated are less successful in obtaining protection than those that are dispersed and therefore more widely represented in Congress. Two factors, however, diminish the case REPRST makes for the adding machine model. First, the EIV interval for REPRST is too wide to infer whether support for the adding machine model is significant in magnitude. Second, the standard error on the lower bound of 1.776 makes the lower bound statistically indistinguishable from zero. This implies that under some measurement error schemes, it is possible that the estimate on REPRST, were all measurement error removed, could be zero or even negative. Even though further iterating in the (subsequent) stage II runs can narrow the interval further, it cannot be expected to solve the imprecision with which the lower bound is estimated.

The comparative cost-comparative advantage model finds support from the positive interval for [M.sub.ij]/CONS (bilateral import penetration) and negative interval for [X.sub.ij]/CONS (bilateral exports scaled by consumption) just as the theory predicts. [14] Even though the intervals are wide, the lower bound of 2.870 on import penetration and the upper bound of -11.90 on export to consumption are economically large numbers. They imply that if import penetration were to increase by 0.05, then NTB coverage would increase by 0.05 X 2.87 = 0.143, or if the export-to-consumption ratio were to increase by 0.05, the coverage ratio would decline by 0.05 X 11.90 = 0.60. Hence, the support for the theory of protection according to comparative disadvantage is quite strong. The standard errors indicate that both the upper and lower bounds of [M.sub.ij]/CONS and [X.sub.ij]/CONS are precisely measured.
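The magnitudes quoted above follow from multiplying the conservative end of each interval by the assumed change in the regressor. A minimal sketch of that arithmetic, with a helper name (`implied_change`) that is ours, is:

```python
def implied_change(coefficient_bound, regressor_change):
    """Change in the NTB coverage ratio implied by a coefficient bound."""
    return coefficient_bound * regressor_change


# Lower bound on import penetration and upper bound on exports/consumption,
# each evaluated for a 0.05 increase in the regressor.
import_effect = implied_change(2.870, 0.05)
export_effect = implied_change(-11.90, 0.05)
print(round(import_effect, 4), round(export_effect, 4))
```

Because each figure uses the bound closest to zero, any other point in the corresponding interval implies a larger effect in absolute value.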

The empirical literature surrounding the Leontief paradox has established that the United States is human-capital abundant. The negative interval on P_SCI (proportion of employees who are scientists) confirms this. However, the interval is too wide to judge whether the data strongly support the theory of comparative costs and advantages. Further, the upper bound of -0.357 is statistically no different from zero at the 5% level of statistical significance. There remains the possibility that under some measurement error configurations, the true coefficient on P_SCI is statistically no different from zero. Because it is widely accepted that management is the source of comparative advantage in U.S. manufacturing, it is unfortunate that the EIV problem precludes any inference about the human capital variable P_MAN.

The evidence in favor of the models of public interest is mixed. Even though AVEARN (average earnings) is a perfectly measured variable, due to its correlation with mismeasured variables, its interval contains zero. Hence, AVEARN cannot be used to refute the public interest model. The positive interval on TAR (tariffs) is reassuring for Corden's status quo theory, for it indicates that the tariff cuts in the Tokyo Round were made up by nontariff protection. It validates Corden's conservative social welfare function, based on the notion that government prefers the status quo to sudden and drastic income changes. Industries with high tariffs (which suffered the highest Tokyo Round cuts) continued to be protected with NTBs. However, this support for the Corden model is weakened due to the large standard error on the lower bound for TAR, which makes it possible for the true coefficient to be statistically no different from zero. P_UNSK (proportion unskilled) cannot be used to make inferences about the public interest models because it is chosen as a key variable in the stage I iterations. The positive interval on NEGR82 (employment growth) is evidence against the theory of protection based on equity considerations and altruism, but both the upper and lower bounds are measured quite imprecisely, as is evident from their standard errors. Neither bound is statistically different from zero.

It is natural to question whether the EIV bounds are robust to the path of iterations. There are [2.sup.6] = 64 possible paths with 6 iterations (nodes) and a choice of one out of two key variables at each node. However, the set of key variables through which the paths can be routed is a smaller subset of the set of mismeasured variables, and hence the number of free paths is limited. This is fortunate because our final results are robust to the choice of path. We experimented with three alternative stage I paths. Of the two key variables identified by Klepper's algorithm at each node, path 1 chooses one at random, provided its d value is acceptable according to the priors in Table 4. Path 2 goes through the key variable that has the higher d value. Path 3 goes through the key variable that has the lower d value, provided of course that it is not unacceptably low; otherwise, it is routed through the other key variable. As the intervals in Table 6 show, all three paths have similar progressions, and although the sequence may be different (see notes to the table), the path set making up the set of key variables chosen along any path is the same for all three paths. Since the M bound on the true [R.sup.*2] depends only on the path set, not on the sequence, the stage I EIV intervals are the same for all three paths. [15]
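The three routing rules can be sketched as a small selection function. This is an illustrative sketch only: the d values in the usage note are hypothetical (Table 5 is not reproduced here), the function and rule names are ours, and a key variable is treated as admissible when its computed d value is at least as large as its prior upper bound on f, which is our reading of the rule described for stage I.

```python
import random


def choose_key_variable(d_values, prior_f_bounds, rule="random", rng=None):
    """Pick one of the key variables at a node, or None to stop iterating.

    d_values: computed d value for each key variable (hypothetical inputs).
    prior_f_bounds: prior upper bound on f for each variable.
    rule: "random" (path 1), "highest_d" (path 2), or "lowest_d" (path 3).
    """
    # Bounding f below d is acceptable only if it does not contradict the prior.
    admissible = {
        name: d for name, d in d_values.items() if d >= prior_f_bounds[name]
    }
    if not admissible:
        return None  # both d values unacceptably low: iterations terminate
    if rule == "highest_d":
        return max(admissible, key=admissible.get)
    if rule == "lowest_d":
        # Falls back to the other variable when the lower d is inadmissible.
        return min(admissible, key=admissible.get)
    rng = rng or random.Random(0)
    return rng.choice(sorted(admissible))
```

For example, with hypothetical values `{"CONC4": 0.55, "P_MAN": 0.47}` and prior bounds of 0.40 on both, path 2 would route through CONC4 and path 3 through P_MAN, while path 1 could pick either.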

Stage II Intervals

The stage I bounds may be too wide for policy purposes or policy simulations, and it may be necessary to narrow them further. Klepper's method for tightening the bounds on individual coefficients is as follows. To tighten an individual bound (say, the lower bound), the [f.sub.i] value of one key variable (there is only one such variable corresponding to that bound) must be bounded below its [d.sub.i] value. In the next iteration, another key variable can be similarly used to tighten the bound further. These iterations may be continued until one of two events occurs: either the bound is generated by a direct regression, in which case the coefficient bound cannot be tightened any further, or the required [d.sub.i] bound on the key variable is unacceptably low. This set of iterations is termed stage II iterations. At their conclusion, the final EIV bounds are produced.

The final EIV intervals are reported in the last two columns of Table 6. In the first of those columns, the lower and upper bounds for variable i are expressed in the original units as [[[b.sup.LB].sub.i], [[b.sup.UB].sub.i]]; in the next column, the interval estimates are expressed as beta coefficients by standardizing by the sample standard deviations. The beta coefficient on an rhs variable [x.sub.i] in a linear regression is the ordinary least squares estimate [[b.sup.OLS].sub.i] times sd([x.sub.i])/sd(y), where sd(.) denotes sample standard deviation. Hence, the beta coefficients corresponding to the interval bounds for variable i are [[[b.sup.LB].sub.i] X sd([x.sub.i])/sd(y), [[b.sup.UB].sub.i] X sd([x.sub.i])/sd(y)].
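The conversion to beta coefficients can be sketched as follows. This is a minimal illustration with made-up data; the function name `beta_interval` is ours, and the sample standard deviation is taken from Python's `statistics.stdev`.

```python
from statistics import stdev


def beta_interval(b_lb, b_ub, x, y):
    """Rescale an interval [b_lb, b_ub] in original units to beta coefficients.

    x: observations on the rhs variable; y: observations on the dependent variable.
    """
    scale = stdev(x) / stdev(y)  # sd(x_i) / sd(y), always positive
    return (b_lb * scale, b_ub * scale)


# Illustrative data only, not the data used in the paper.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
print(beta_interval(2.0, 4.0, x, y))
```

Because the scale factor is positive, the lower bound remains the lower bound after standardization and the sign of the interval is unchanged.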

Consider the stage II bounds expressed in original units. A comparison with the stage I bounds shows that in many cases the bounds remain unchanged: This is true for both bounds for REPRST and MELAST, the lower bound for PACCVA83, and the lower bound for TAR. In all these cases, the key variables corresponding to those bounds at the last stage I iteration (J = 6) all have unacceptably low [d.sub.i] values. Hence, the stage II iterations terminate at iteration J = 6 without tightening those bounds. In fact, all the stage II bounds in Table 6 terminated because the key variable had an unacceptably low [d.sub.i] value. In no case did the stage II iterations progress beyond J = 7, and usually they terminated at J = 6. Had they been allowed to progress beyond that, say, by allowing the key [f.sub.i] value to be bounded below the required [d.sub.i] value, a further tightening would be possible. For example, if the [f.sub.i] value for the key variable for REPRST, which was also REPRST, were allowed to be bounded below 0.15, then the upper bound on REPRST could be tightened down to 44.7.

Because the stage II iterations did not progress far, if at all, beyond the stage I iterations, the EIV bounds did not, in general, narrow significantly beyond the stage I bounds. The inferences are therefore largely unchanged. The implications of the EIV problem for the theories of political economy are not very damaging if the focus is entirely on the sign on the coefficients, but they are damaging if the size of the intervals is a matter of concern.

The beta coefficients allow a comparison of the width of the final EIV bounds across variables. They indicate the amount of change in standard deviation units of NTBs induced by a change of one standard deviation in the rhs variable. Note that because sd(y) is smaller than the standard deviation of the latent uncensored dependent variable, the beta coefficients reported here are overstated. If it is supposed that an absolute beta value of 0.5 is an economically significant magnitude, because all the beta coefficient intervals for the mismeasured variables contain 0.5, the possibility that all these variables are economically significant cannot be precluded; that is, their true coefficients may all be of economically significant magnitude. Beyond this, it is difficult to make inferences about the likelihood of an absolute beta value exceeding 0.5. At best, informal judgements may be made, such as PACCVA83 is highly likely to have a true beta value greater than 0.5.

In sum, the EIV problem does not damage the case for any of the political economy models. Some variables do fall victim to the EIV problem, but there is always at least one variable which does allow inference about the underlying theory. The models of retaliation, pressure groups, voting strength, and comparative advantage all find support from the signs on the variables that do allow inference. The models of public interest get ambiguous or no support. The status quo model is refuted by the contrary sign on AVEARN while the equity model is refuted by the contrary signs on AVEARN and NEGR82. On the other hand, the imprecision with which the lower bound for AVEARN is measured opens the possibility that the true coefficient on AVEARN is statistically no different from zero. The high standard errors on both bounds for NEGR82 actually make this highly likely.

More optimistic priors about the extent of measurement error are required if narrow bounds are desired. While this has the potential to strikingly narrow the final bounds, far greater confidence in the priors would be required. The importance of prior information cannot be overstated. The analysis began with unbounded EIV intervals that required prior information to narrow the bounds. The other side of the coin is, of course, that none of the results are valid unless the prior information is correct. We have taken care to elaborate how our priors are formed. Based on those priors, we must accept the resulting EIV bounds, however wide they are. If the intervals are too wide to be useful for policy purposes, the problem cannot be overcome via our priors. Improved prior information calls for the empirical measurement of measurement error.

5. Conclusion

The literature on endogenous protection contains a variety of disparate theories that address the motivation for the observed cross-industry structure of protection. They range from models of political self-interest to models of government altruism. With the possible exception of Grossman and Helpman (1994, 1995), endogenous protection theory has not yielded tight predictions against which the theory can be tested using precisely measured variables. Hence, a number of proxy variables, which probably span the range in terms of quality, have been used in their empirical testing. This is justifiable because it is costly to construct precisely measured variables. For example, it is a daunting task to compute the tariff-equivalents of NTBs, which are probably better measures of the restrictive effect of trade barriers than the import coverage ratios employed in this paper.

Sturdy inference in this setting would seem to require a sensitivity analysis of the estimates to the EIV problem. In this paper, the methodology of Klepper and Leamer (1984) and Klepper (1988) is employed to perform a sensitivity analysis to classical errors in variables of estimates from the nonlinear Tobit model. Their methodology focuses on diagnostics that take the form of imposing restrictions on the fraction of variation in regressors that is due to measurement error. Some variables fall victim to the EIV problem and do not allow any inferences. But reasonable prior restrictions can bound the coefficients for the remaining variables and allow useful inferences that are robust to classical errors in variables. We hope that sensitivity to errors in variables becomes a regular component of applied econometric studies, not because it is yet another set of diagnostics, but because economic data are fundamentally prone to measurement error.

Finally, the EIV problem in the area of the political economy of protection benefits from recent work by Anderson and Neary (1996), who provide a theoretical basis for the computation of tariff equivalents of NTBs. Their measures are less prone to measurement errors than the ad hoc coverage ratios we have used. While their general equilibrium method requires far greater information than is available for a study of this scope and hence introduces additional sources of measurement error, their short-cut partial equilibrium method can potentially be used in the next generation of empirical work in this area using newer data.

We are grateful to two anonymous referees for insightful comments that improved this paper considerably. We acknowledge helpful suggestions by Ed Bedrick. Responsibility for any remaining errors is ours.

(*.) Department of Economics, University of New Mexico, Albuquerque, NM 87131, USA; E-mail gawande@unm.edu; corresponding author.

(+.) Department of Economics, University of New Mexico, Albuquerque, NM 87131, USA; E-mail bohara@unm.edu.

Received September 1997; accepted May 1999.

(1.) More precisely, the inverse import penetration ratio divided by the absolute price elasticity of imports should be positively related to protection.

(2.) MELAST and XELAST are taken from Ceglowski (1989). The variable MELAST is measured as a negative number. Hence, the coefficient on MELAST is expected to be negative.

(3.) The model can easily be generalized to include the case where the dependent variable is also measured with error, but to keep the exposition simple, we presume that y is measured without error. All the results hold if the true dependent variable is [y.sup.*], which is measured as y, where y = [y.sup.*] + [delta] and [delta] is normally distributed but not correlated with either [y.sup.*] or [epsilon]. Here, we assume that [delta] is absorbed into the error term [mu] in Equation 2.

(4.) The reason for this is that with more than one variable measured with error, it may be possible for the measurement error variances to take on values that imply that the true regressors are collinear.

(5.) The analysis of model 2 arrives at similar conclusions as the analysis presented in the paper.

(6.) The example in Klepper, Kamlet, and Frank (1993, p. 196) is instructive. Suppose all of the residual variation in the regression of y on x is due to the measurement error in one variable [x.sub.i]. Then removing the measurement error in [x.sub.i] would raise the R-squared of the regression of y on x to 1, implying [R.sup.*2] equals one. But for this to happen, [f.sub.i] would have to equal its largest possible value. Thus, if it is believed that the true [f.sub.i] equaled its maximum possible value, then it would have to be believed that [R.sup.*2] equaled one. And if it were believed that [R.sup.*2] were less than one, then it would have to also be believed that the true [f.sub.i] was less than its maximum value.

(7.) Since a Tobit model is analyzed in the paper, the [R.sup.*2] measure is a pseudo-[R.sup.2] computed as 1 - [[[hat{[sigma]}].sup.2]/([[hat{[sigma]}].sup.2] + b'Nb)], where [[hat{[sigma]}].sup.2] is the MLE of the error variance in the Tobit model, b is the MLE of the Tobit coefficients, and N is the variance matrix of the matrix of rhs variables x. Levine's (1986) result stated earlier motivates its use.
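The pseudo-R-squared in footnote 7 is one minus the share of the error variance in the total of error variance plus the explained variation b'Nb. A minimal sketch with illustrative inputs (the function name and the numbers are ours, not estimates from the paper) is:

```python
def pseudo_r2(sigma2_hat, b, N):
    """Pseudo R-squared: 1 - sigma2 / (sigma2 + b'Nb).

    sigma2_hat: estimated error variance; b: coefficient vector;
    N: variance matrix of the rhs variables, as a list of rows.
    """
    k = len(b)
    # Quadratic form b'Nb: variation in the latent index explained by x.
    explained = sum(b[i] * N[i][j] * b[j] for i in range(k) for j in range(k))
    return 1.0 - sigma2_hat / (sigma2_hat + explained)


# Illustrative values only.
print(pseudo_r2(2.0, [1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]]))
```

The measure lies between zero and one whenever N is positive semidefinite and the error variance is positive.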

(8.) See footnote 17, p. 198 of Klepper, Kamlet, and Frank (1993) for an intuitive explanation for why the M bound is relaxed through the [d.sub.i] bounding of one of the key variables. The new upper bound on [R.sup.*2] of 0.3381 is well within our admissible priors for M.

(9.) For example, the next iteration proceeds in the following manner. The d([cdotp]) bounds on the remaining variables required to support the new M bound on [R.sup.*2] at iteration 1 are unacceptably low, specifically for [[N.sup.*].sub.ij], PACCVA83, REPRST, TAR, M/CON, X/CON, P_SCI, and MELAST. The new key variables identified at iteration 1 were SCALE and P_MAN. Because their [d.sub.i] bounds were both admissible, we randomly chose SCALE and constrained [f.sub.SCALE] [leq] 0.40. Hence, SCALE no longer supported unambiguous inference about the validity of the special interest model. Imposing the constraints [f.sub.CONC4] [leq] 0.40 (from iteration J = 0) and [f.sub.SCALE] [leq] 0.40 (iteration J = 1) yielded a new M bound on [R.sup.*2] with a marginally higher value of M = 0.3383. This now became the sufficient upper bound on [R.sup.*2] in order to relax the [d.sub.i] bounds on the remaining coefficients.

(10.) If [R.sup.*2] is bounded below 0.3706, by Klepper's method the extreme points of the feasible set are composed of [2.sup.J](K + 1 - J) regressions, where J is the number of stage I iterations before the EIV coefficient bounds are computed and K is the number of regressors measured with error. According to this method, the EIV intervals for the nonkey variables will be of the same sign as the direct regression while the EIV intervals for the key variables will all contain 0. In the model considered here, J = 6 and K = 15 (the four variables NE82, LABINT82, NEGR82, and AVEARN and the nine dummies are presumed to be perfectly measured), so the stage I coefficient bounds are constructed as (elementwise) extreme values from [2.sup.6] X 10 = 640 regressions. Hence, the perfectly measured variables reduce the computational load and also the width of the intervals on the mismeasured variables. For the computation of the EIV intervals on the perfectly measured variables themselves, see footnote 12.
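The count of regressions in footnote 10 can be reproduced directly from the formula. The function name below is ours; the values J = 6 and K = 15 are those stated in the footnote.

```python
def n_extreme_regressions(K, J):
    """Number of regressions forming the extreme points: 2^J * (K + 1 - J)."""
    return 2 ** J * (K + 1 - J)


# K = 15 mismeasured regressors, J = 6 stage I iterations.
print(n_extreme_regressions(K=15, J=6))
```

Each additional iteration doubles the first factor and reduces the second by one, so for K = 15 the count grows with J over the range considered here.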

(11.) The extent of asymptotic bias in a coefficient will be greater: (i) the larger the correlation between it and the mismeasured variable(s), (ii) the smaller the independent explanatory power of the variable relative to the mismeasured variable(s), and (iii) the larger the variation in the mismeasured variable(s) due to measurement error (see Klepper, Kamlet, and Frank (1993, p. 204)).

(12.) The bounds on the variables not measured with error are computed using Bollinger's (1996) method. Let the regression model be written by partitioning the regressors into the variables measured with error, x1: n X k1, and the perfectly measured variables, x2: n X k2. Let [x1.sup.*] be the true variables related to x1, with e1 = x1 - [x1.sup.*] being the measurement error. Then y = [x1.sup.*][beta]1 + x2[beta]2 + [mu] is the model of interest with [beta]1: k1 X 1, and [beta]2: k2 X 1, that is, Equation 2 rewritten after partitioning [x.sup.*]. Let the residuals from the linear projection of y on x2 be denoted w, the matrix of residuals from the (hypothetical, since [x1.sup.*] is unmeasurable) regression of [x1.sup.*] on x2 be denoted [W.sup.*]: n X k1, and the matrix of residuals from the regression of x1 on x2 be denoted W: n X k1. Then the regression model w = [W.sup.*][beta]1 + [mu], together with the measurement error model W = [W.sup.*] + e1, involves only variables measured with error. However, we know from the standard omitted variables bias formula that the coefficients b2 from the regression of y on only x2 are b2 = [beta]2 + G[beta]1, where G is the k2 X k1 matrix of coefficients from the regression of [x1.sup.*] on x2. Since the measurement error is white noise, the regression of x1 on x2 gives consistent estimates of G. Thus, the bounds for [beta]2, the coefficients on variables not measured with error, are derived, given any bounds on [beta]1, from the formula [beta]2 = b2 - G[beta]1. In the application, b2 is the Tobit MLE of the regression of y on x2.
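The final step of footnote 12, [beta]2 = b2 - G[beta]1, can be sketched as below. The first function applies the formula to a single candidate [beta]1. The second is only a simplification of ours: it assumes the bounds on [beta]1 form an elementwise box, which need not coincide with the feasible set produced by Klepper's method, so the resulting interval should be read as an outer illustration. All names and numbers are hypothetical.

```python
def beta2_given_beta1(b2, G, beta1):
    """beta2 = b2 - G beta1 for one candidate beta1 (G has k2 rows, k1 columns)."""
    return [
        b2[r] - sum(G[r][c] * beta1[c] for c in range(len(beta1)))
        for r in range(len(b2))
    ]


def beta2_box_bounds(b2, G, beta1_lb, beta1_ub):
    """Elementwise bounds on beta2 when each beta1[c] lies in [lb, ub] (assumed box)."""
    lower, upper = [], []
    for r in range(len(b2)):
        terms = [
            (G[r][c] * beta1_lb[c], G[r][c] * beta1_ub[c])
            for c in range(len(beta1_lb))
        ]
        smallest_shift = sum(min(t) for t in terms)  # smallest value of (G beta1)[r]
        largest_shift = sum(max(t) for t in terms)   # largest value of (G beta1)[r]
        lower.append(b2[r] - largest_shift)
        upper.append(b2[r] - smallest_shift)
    return lower, upper
```

With one perfectly measured variable, `b2 = [1.0]`, `G = [[0.5, -1.0]]`, and [beta]1 bounded in `[0, 2]` and `[1, 3]`, the implied interval for [beta]2 is `[1.0, 4.0]`.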

(13.) The labor intensity variable may also be argued to represent the pressure group model. The specific-factors model (see, e.g., Mussa 1974) predicts that, because the returns to the specific factor that benefits from protection increase with the industry's labor intensity, lobbying by specific factors is an increasing function of labor intensity. Hence, protection should rise with labor intensity. The evidence in Table 5 runs counter to this prediction as well.

(14.) The positive estimate on [M.sub.ij]/CONS runs counter to the Grossman-Helpman (1994) prediction. The main reason is that their prediction requires the scaling of [M.sub.ij]/CONS by import elasticities, which is not undertaken here (see footnote 1). Goldberg and Maggi (1997) find support for the Grossman-Helpman hypothesis using Trefler's (1993) NTB data. There are a number of reasons why their results may be correct for their data. First, they aggregate across partners rather than pool across partners. Hence, they have around 400 observations, and we have nearly 4000. Second, they estimate a system with a rather different specification than our single-equation Tobit model. The literature on errors-in-variables bounds for systems of equations is still in its infancy. Regardless, it will be useful to use the Goldberg-Maggi data and their specification of the (single) protection equation, and subject it to an EIV sensitivity analysis as we have done here.

(15.) The data simulations required to compute the standard errors on the stage I bounds permitted another view of the robustness of the EIV bounds to possible paths. We performed approximately 900 data simulations, of which 300 resulted in the same path set as in Table 6 and 600 with different path sets. If we compute the means across all the 900 simulated bounds and compare them with the corresponding means from the 300 bounds with our path set, they are qualitatively the same and quantitatively statistically no different in most cases. This striking result demonstrates the robustness of the results across paths. The main reason is that there is a core set of path variables that are constant in most of the 900 simulated path sets. Only a few path variables change.

(16.) The model can easily be generalized to include the case in which the dependent variable is also measured with error, but to keep the exposition simple, we presume that y is measured without error. All the results hold if the true dependent variable is [y.sup.*], which is measured as y, where y = [y.sup.*] + [delta] and [delta] is normally distributed but not correlated with either [y.sup.*] or [varepsilon]. Here, we assume that [delta] is absorbed into the error term [mu] in Equation 2.

(17.) The reason for this is that with more than one variable measured with error, it may be possible for the measurement error variances to take on values that imply that the true regressors are collinear.

References

Anderson, J. E., and J. P. Neary. 1996. A new approach to evaluating trade policy. Review of Economic Studies 63:107-25.

Baldwin, Richard E. 1990. Optimal tariff retaliation rules. In The political economy of international trade: Essays in honor of Robert E. Baldwin, edited by R. W. Jones and A. Krueger. Cambridge, MA: Basil Blackwell, pp. 108-21.

Baldwin, Robert E. 1985. The political economy of U.S. import policy. Cambridge, MA: MIT Press.

Baron, D. P. 1997. Integrated strategy and international trade disputes: The Kodak-Fujifilm case. Journal of Economics and Management Strategy 6:291-346.

Bekker, P., A. Kapteyn, and T. Wansbeek. 1987. Consistent sets of estimates for regressions with correlated or uncorrelated measurement errors in arbitrary subsets of all variables. Econometrica 55:1223-30.

Bollinger, C. R. 1996. Bounding mean regressions when a binary regressor is mismeasured. Journal of Econometrics 73:387-99.

Brock, W. P., and S. P. Magee. 1978. The economics of special interest politics: The case of tariffs. American Economic Review 68:246-50.

Caves, R. E. 1976. Economic models of political choice: Canada's tariff structure. Canadian Journal of Economics 9:278-300.

Ceglowski, J. 1989. Dollar depreciation and U.S. industry performance. Journal of International Money and Finance 8:233-51.

Corden, W. M. 1974. Trade policy and welfare. Oxford, UK: Oxford University Press.

Erickson, T. 1989. Proper posteriors from improper priors for an unidentified errors-in-variables model. Econometrica 57:1299-316.

Erickson, T. 1993. Restricting regression slopes in the errors-in-variables model by bounding the error correlation. Econometrica 61:959-69.

Findlay, R., and S. Wellisz. 1982. Endogenous tariffs and the political economy of trade restrictions and welfare. In Import competition and response, edited by J. Bhagwati. Chicago: University of Chicago Press.

Goldberg, P., and G. Maggi. 1997. Protection for sale: An empirical investigation. Princeton University. Mimeographed.

Grossman, G. M., and E. Helpman. 1994. Protection for sale. American Economic Review 84:833-50.

Grossman, G. M., and E. Helpman. 1995. Trade wars and trade talks. Journal of Political Economy 103:675-708.

Iwata, S. 1992. Instrumental variables estimation in errors-in-variables models when instruments are correlated. Journal of Econometrics 53:297-322.

Klepper, S. 1988. Regressor diagnostics for the classical errors-in-variables model. Journal of Econometrics 37:225-50.

Klepper, S., M. S. Kamlet, and R. G. Frank. 1993. Regressor diagnostics for the errors-in-variables model-An application to the health effects of pollution. Journal of Environmental Economics and Management 24:190-211.

Klepper, S., and E. E. Leamer. 1984. Consistent sets of estimates for regressions with errors in all variables. Econometrica 52:163-83.

Kokkelenberg, B. C., and D. R. Sockell. 1985. Union membership in the United States, 1973-1981. Industrial and Labor Relations Review 38:497-543.

Krasker, W. S., and J. W. Pratt. 1986. Bounding the effects of proxy variables on regression coefficients. Econometrica 54:641-55.

Krinsky, I., and A. L. Robb. 1986. On approximating the statistical properties of elasticities. Review of Economics and Statistics 68:715-9.

Lavergne, R. 1983. The political economy of U.S. tariffs. Toronto: Academic Press.

Leamer, E. E. 1978. Specification searches. New York: Wiley.

Leamer, E. E. 1983. Let's take the con out of econometrics. American Economic Review 73:31-43.

Leamer, E. E. 1985. Sensitivity analysis would help. American Economic Review 75:308-13.

Leamer, E. E. 1990. The structure and effects of tariff and nontariff barriers in 1983. In The political economy of international trade: Essays in honor of Robert E. Baldwin, edited by R. W. Jones and A. Krueger. Cambridge, MA: Basil Blackwell, pp. 224-60.

Levine, D. K. 1986. Reverse regressions for latent-variable models. Journal of Econometrics 32:291-2.

Mussa, M. 1974. Tariffs and the distribution of income: The importance of factor specificity, substitutability, and intensity in the short and long run. Journal of Political Economy 82:1191-1203.

Olson, M. 1965. The logic of collective action. Cambridge, MA: Harvard University Press.

Pincus, J. J. 1975. Pressure groups and the pattern of tariffs. Journal of Political Economy 83:775-8.

Ray, E. J. 1981. The determinants of tariff and nontariff trade restrictions in the United States. Journal of Political Economy 89:105-21.

Trefler, D. 1993. Trade liberalization and the theory of endogenous protection: An econometric study of U.S. import policy. Journal of Political Economy 101:138-60.

Vousden, N. 1990. The economics of trade protection. Cambridge, UK: Cambridge University Press.

Weinberger, M. I., and D. U. Greavey. 1984. The PAC directory: A complete guide to political action committees. Cambridge, MA: Ballinger.

Variable Definitions, Political Economy Theories, and Expected Signs

| Theory | Variable | Sign | Description |
|---|---|---|---|
| Dependent variable | $N_{ij}$ | | U.S. all NTB (nontariff barrier) coverage of imports of good i from partner j (ratio) |
| Retaliation, strategic policy | $N^{*}_{ij}$ | + | Partner j's all NTB coverage of its imports of good i from the United States (ratio) |
| Pressure groups | PACCVA83 | + | Corporate PAC spending by the industry, 1977-1984, scaled by value added ($100 Mn/$Bn) |
| | SCALE | + | Measure of industry scale: Value added per firm, 1982 ($Bn/firm) |
| | CONC4 | +, - | Four-firm concentration ratio, 1982 |
| Adding machine | NE82 | + | Number of employees, 1982 (Mn. persons) |
| | UNION | + | Fraction of employees unionized, 1981 |
| | LABINT82 | + | Labor intensity: Share of labor in value added, 1982 |
| | REPRST | + | Number of states in which production is located, 1982 (scaled by 100) |
| Comparative costs-comparative advantage | $M_{ij}$/CONS | + | Penetration of U.S. consumption of good i by imports from partner j |
| | $X_{ij}$/CONS | + | U.S. exports of good i to partner j, scaled by consumption |
| | DPEN7982 | + | IMP/CONS(1979) - IMP/CONS(1982), IMP total industry imports |
| | P_SCI | - | Fraction of employees classified as scientists and engineers, 1982 |
| | P_MAN | - | Fraction of employees classified as managerial, 1982 |
| Public interest (status quo, equity) | AVEARN | - | Average earnings per employee, 1982 ($Mn/year) |
| | TAR | +, - | Ad valorem tariff rate |
| | P_UNSK | + | Fraction of employees classified as unskilled, 1982 |
| | NEGR82 | - | Growth in employment, 1981-1982 |
| Other control variables | MELAST | - | Real exchange rate elasticity of imports |
| | XELAST | + | Real exchange rate elasticity of exports |
| | $D\_C_{j}$, j = 1, ..., 9 | | Nine country dummies |
| | $D\_I_{j}$, j = 1, ..., 4 | | Four industry group dummies: Food, resource-based, general manufacturing, and capital-intensive |

Cross-industry four-digit SIC level data pooled across the nine partners j: Belgium, Finland, France, Germany, Italy, Japan, the Netherlands, Norway, and the United Kingdom. Number of observations = 3915.

Pooled Runs: Tobit MLEs [a]; Dependent Variable: All U.S. NTBs ($N_{ij}$)

| Theory | Rhs Variable | Model 1 | Model 2 |
|---|---|---|---|
| Retaliation, strategic policy | $N^{*}_{ij}$ | 0.337 [**] (0.047) | 0.234 [**] (0.050) |
| Pressure groups | PACCVA83 | 1.332 [**] (0.267) | 1.077 [**] (0.267) |
| | SCALE | 0.079 (0.191) | -0.061 (0.196) |
| | CONC4 | -0.037 (0.084) | -0.041 (0.085) |
| Adding machine | NE82 | 1.047 [**] (0.346) | 0.916 [**] (0.340) |
| | UNION | -0.136 (0.087) | -0.103 (0.086) |
| | LABINT82 | -1.286 [**] (0.157) | -0.685 [**] (0.168) |
| | REPRST | 6.110 [*] (3.212) | 6.089 [*] (3.15) |
| Comparative costs, comparative advantage | $M_{ij}$/CONS | 4.427 [**] (1.508) | 5.072 [**] (1.486) |
| | $X_{ij}$/CONS | -19.44 [**] (4.732) | -13.598 [**] (4.653) |
| | DPEN7982 | 0.039 (0.058) | 0.033 (0.584) |
| | P_SCI | -1.056 [**] (0.376) | -0.691 (0.411) |
| | P_MAN | 0.249 (0.436) | -0.327 (0.440) |
| Public interest (status quo, equity) | AVEARN | 20.42 [**] (4.040) | 22.95 [**] (4.045) |
| | TAR | 1.909 [**] (0.273) | 1.923 [**] (0.271) |
| | P_UNSK | 0.417 (0.357) | -1.165 (0.425) |
| | NEGR82 | 0.141 (0.105) | 0.156 (0.103) |
| Control variables | MELAST | 0.190 [**] (0.028) | 0.245 [**] (0.030) |
| | XELAST | 0.055 [**] (0.022) | 0.032 (0.022) |
| | $D\_C_{j}$, j = 1, ..., 9 | See note [b] | See note [b] |
| | $D\_I_{j}$, j = 1, ..., 4 | -- | See note [c] |

N = 3915, k = 28, degree of truncation = 84.3%. Four-digit SIC cross-industry data pooled across nine countries: Belgium, Finland, France, Germany, Italy, Japan, the Netherlands, Norway, and the United Kingdom. Goodness of fit: Model 1: likelihood-ratio statistic = 685.6, Maddala's $R^{2}$ = 0.160, McFadden's $R^{2}$ = 0.213, Cragg-Uhler's $R^{2}$ = 0.286. Model 2: likelihood-ratio statistic = 741.2, Maddala's $R^{2}$ = 0.173, McFadden's $R^{2}$ = 0.230, Cragg-Uhler's $R^{2}$ = 0.309.

Standard errors in parentheses.

[*] and [**] indicate, respectively, that |t| > 1.66 and |t| > 1.98.

[a] MLEs, maximum likelihood estimates.

[b] All country dummies have negative MLEs with t-values in excess of 2.

[c] Of the four industry group dummies (food, resources, manufacturing, and capital intensive), the food and the manufacturing dummies are positive and statistically significant.

Bounds from Direct and Reverse Regressions; Dependent Variable: All U.S. NTBs ($N_{ij}$)

| Rhs Variable | Bounds |
|---|---|
| $N^{*}_{ij}$ | [-19.30, 13.37] |
| PACCVA83 | [-73.5, 203.8] |
| SCALE | [-22.1, 1225.2] |
| CONC4 | [-279.5, 34.3] |
| NE82 | [-292.1, 242.8] |
| UNION | [-90.1, 45.8] |
| LABINT82 | [-69.91, 29.87] |
| REPRST | [-1563.5, 3033.1] |
| $M_{ij}$/CONS | [-30.2, 1213.3] |
| $X_{ij}$/CONS | [-1673.0, 1276.6] |
| DPEN7982 | [-11.1, 5.32] |
| P_SCI | [-161.1, 100.3] |
| P_MAN | [-229.3, 1016.0] |
| AVEARN | [-2349, 5607] |
| TAR | [-243.8, 110.1] |
| P_UNSK | [-71.1, 439.4] |
| NEGR82 | [-15.04, 112.6] |
| MELAST | [-22.7, 7.06] |
| XELAST | [-3.89, 11.44] |
| $D\_C_{j}$, j = 1, ..., 9 | See note [a] |

NE82, LABINT82, AVEARN, and NEGR82 are presumed to be accurately measured. See Appendix for construction of their bounds. The 20 two-digit SIC dummies are presumed to be accurately measured.

[a] All country dummies have negative maximum likelihood estimates with t-values in excess of 2.

f-value: The Fraction of Variation in Each Variable Attributable to Measurement Error

| Variable | Degree of Mismeasurement | f | Cause/Source |
|---|---|---|---|
| $N^{*}_{ij}$ | Moderate | 0.25 | Trade-industry mismatch; SITC to ISIC to SIC |
| PACCVA83 | Serious | 0.40 | Higher aggregation level (three-digit SIC) |
| SCALE | Serious | 0.40 | Use industry-level data to proxy firm-level data |
| CONC4 | Serious | 0.40 | Census but from firm-level data. Relative standard error mostly reported to be > 0.15 |
| NE82 | None | 0 | Census of manufactures |
| UNION | Serious | 0.40 | Higher aggregation level (three-digit SIC); Estimates from Kokkelenberg and Sockell (1985) |
| LABINT82 | None | 0 | Census of manufactures |
| REPRST | Serious | 0.40 | State-level data at three-digit SIC |
| $M_{ij}$/CONS | Moderate | 0.25 | Trade-industry mismatch: SITC-SIC |
| $X_{ij}$/CONS | Moderate | 0.25 | Trade-industry mismatch: SITC-SIC |
| DPEN7982 | Moderate | 0.25 | Trade-industry mismatch: SITC-SIC |
| P_SCI | Serious | 0.40 | Higher aggregation level (three-digit SIC) |
| P_MAN | Serious | 0.40 | Higher aggregation level (three-digit SIC) |
| AVEARN | None | 0 | Census of manufactures |
| TAR | Moderate | 0.25 | Trade-industry mismatch: TSUSA-SIC |
| P_UNSK | Serious | 0.40 | Higher aggregation level (three-digit SIC) |
| NEGR82 | None | 0 | Census of manufactures |
| MELAST | Serious | 0.50 | Higher aggregation level (two-digit SIC); Estimates from Ceglowski (1989) |
| XELAST | Serious | 0.50 | Higher aggregation level (two-digit SIC); Estimates from Ceglowski (1989) |
| $D_{j}$, j = 1, ..., 9 | None | 0 | -- |

Errors-in-Variables Diagnostics for Stage I Iterations: M(·) and d(·) Values (All NTBs ($N_{ij}$) as Dependent Variable)

| Key Variable Chosen | M(·) | Iteration | d: $N^{*}_{ij}$ | d: PAC | d: SCA | d: CON | d: UNI | d: REP | d: TAR |
|---|---|---|---|---|---|---|---|---|---|
| -- | 0.3375 | 0 | 0.03 | 0.04 | 0.77 | 0.55 | 0.30 | 0.09 | 0.02 |
| CONC4 | 0.3381 | 1 | 0.04 | 0.05 | 0.74 | -- | 0.37 | 0.13 | 0.03 |
| SCALE | 0.3383 | 2 | 0.05 | 0.06 | -- | -- | 0.39 | 0.13 | 0.04 |
| P_MAN | 0.3402 | 3 | 0.12 | 0.14 | -- | -- | 0.55 | 0.18 | 0.09 |
| UNION | 0.3428 | 4 | 0.16 | 0.18 | -- | -- | -- | 0.20 | 0.13 |
| DPEN7982 | 0.3435 | 5 | 0.20 | 0.22 | -- | -- | -- | 0.22 | 0.16 |
| P_UNSK | 0.3706 | 6 | 0.47 | 0.43 | -- | -- | -- | 0.24 | 0.40 |

| Key Variable Chosen | d: P_UN | d: MCON | d: XCON | d: DPEN | d: P_SC | d: P_MA | d: MEL | d: XEL |
|---|---|---|---|---|---|---|---|---|
| -- | 0.36 | 0.15 | 0.05 | 0.78 | 0.09 | 0.56 | 0.02 | 0.14 |
| CONC4 | 0.44 | 0.19 | 0.07 | 0.81 | 0.13 | 0.60 | 0.04 | 0.20 |
| SCALE | 0.46 | 0.21 | 0.08 | 0.82 | 0.14 | 0.60 | 0.04 | 0.22 |
| P_MAN | 0.51 | 0.36 | 0.18 | 0.90 | 0.28 | -- | 0.10 | 0.37 |
| UNION | 0.54 | 0.42 | 0.21 | 0.93 | 0.33 | -- | 0.12 | 0.44 |
| DPEN7982 | 0.62 | 0.46 | 0.28 | -- | 0.39 | -- | 0.17 | 0.53 |
| P_UNSK | -- | 0.55 | 0.59 | -- | 0.50 | -- | 0.42 | 0.69 |

M(·) is the upper bound that must hold on $R^{*2}$ (the R-squared of the regression if measurement error were completely removed from all variables) in order to bound the coefficients. The R-squared (and therefore M) in the Tobit model is computed as the pseudo R-squared given by $1 - \hat{\sigma}^{2}/(\hat{\sigma}^{2} + b'Nb)$.

d(·) is the upper bound on f(·), the fraction of variation in each variable attributable to measurement error. f(·) < d(·) elementwise is a necessary condition for $R^{*2} < M$.

The bold $d_i$ values denote the key variables (at this initial iteration). If either of their $f_i$ is bounded below its corresponding $d_i$ value, then M increases and relaxes the upper bound on $R^{*2}$.

Perfectly measured variables NE82, LABINT82, AVEARN, and NEGR82 do not appear here. Variable names are abbreviated but are in the same order as in Tables 1-4.

In stage I, iterations continue until the M bound becomes acceptable (conditional on all $d_i$ bounds being acceptable). If the M bound is already acceptable at an early step, say, at the first iteration, iterations continue until all $d_i$ bounds are acceptable. Here, after six iterations the d bounds on REPRST and MELAST were both unacceptably low, and hence, the stage I iterations were terminated.

Appendix

I. Data

Data on NTBs (nontariff barriers) were taken from the United Nations Conference on Trade and Development database on trade control measures. Data were aggregated to the four-digit SIC level, which required concordance among disparate systems of data keeping. The sample accounts for over 98% of manufacturing sales. In the following, COMTAP refers to the Compatible Trade and Production Database, 1968-1986, CM refers to the 1982 Census of Manufactures, ASM refers to the 1983 Annual Survey of Manufactures, and CPS refers to the 1983 Current Population Survey. Bilateral trade and production (the latter required to obtain domestic consumption) were constructed using 1983 figures from COMTAP. These data were at the ISIC level and were concorded into the SITC (r1) level and then into the four-digit SIC level. However, aggregate (across partners) trade data for the United States were aggregated up from the tariff-line (TSUSA) data (as is TAR).

Political Action Committee (PAC) campaign contribution data are from Federal Election Commission (FEC) tapes for the four election cycles between 1977-1984. Since PACs are associated with individual firms, PACCVA83 was constructed as follows. Using COMPUSTAT tapes, firms were classified into three- or four-digit SIC industries. Where firm coverage was incomplete in COMPUSTAT, PACs were classified into two-digit SIC industries using Weinberger and Greavey (1984) and replicated at the four-digit level. Because classification of PACs to SIC industries is one to many, we use per-firm contributions as our measure of PAC spending. This is scaled by value added to get PACCVA83. Geographic concentration (GEOG) is defined as $\sum_{j=1}^{50} \left| VA_{ij}/\sum_{j=1}^{50} VA_{ij} - POP_{j}/\sum_{j=1}^{50} POP_{j} \right|$, where $VA_{ij}$ is value added in industry i and state j and $POP_{j}$ is population in state j. Value-added data are from ASM. REPRST is constructed from the county data in the Geographic Area Series of the CM. Earnings and employment (AVEARN, SH_L) are also from ASM, as are capital stock figures. Number of firms (used in SCALE) and CONC4 are taken from CM. SCHOOL, P_SCI, P_MAN, P_UNSK are from CPS. UNION is from Kokkelenberg and Sockell (1985). MELAST and XELAST are replicated at the four-digit level from the two-digit estimates of Ceglowski (1989).
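The GEOG definition above can be illustrated with a minimal sketch. The function name and the three-state numbers are hypothetical and chosen only for illustration; the paper's measure sums over 50 states.

```python
import numpy as np

def geographic_concentration(va_by_state, pop_by_state):
    """Sum over states of |industry value-added share - population share|."""
    va = np.asarray(va_by_state, dtype=float)
    pop = np.asarray(pop_by_state, dtype=float)
    va_share = va / va.sum()      # VA_ij / sum_j VA_ij
    pop_share = pop / pop.sum()   # POP_j / sum_j POP_j
    return float(np.abs(va_share - pop_share).sum())

# Hypothetical industry located entirely in the first of three states.
geog = geographic_concentration([10.0, 0.0, 0.0], [1.0, 1.0, 2.0])
```

An industry spread across states in proportion to population yields a value of zero under this definition, and the value rises as production concentrates in fewer states.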

II. Technical Appendix: EIV Bounds Using Klepper's (1988) Diagnostics

Following Klepper (1988), consider the classical errors-in-variables model in which the observed variable y is generated by

$$y = \beta' x^{*} + \mu \tag{A1}$$

where $x^{*}$ is a $K \times 1$ vector of true unobservable regressors with mean 0 and covariance matrix $\Sigma$, $\mu$ is a classical disturbance with mean 0 and variance $\sigma^{2}$, and $\beta$ is a $K \times 1$ vector of coefficients on which interest centers. A $K \times 1$ vector of proxy variables $x$ is observed, which is related to $x^{*}$ by

$$x = x^{*} + \varepsilon \tag{A2}$$

where $\varepsilon$ is a $K \times 1$ vector of measurement errors with mean 0 and covariance matrix $V = \mathrm{diag}(\nu_{1}, \nu_{2}, \ldots, \nu_{K})$, and which is assumed to be distributed independently of $x^{*}$ and $\mu$. [16] Without any further distributional assumptions about the unobservables, the parameters of the model are not identified. Klepper and Leamer (1984) show that it is possible to bound the parameters using the fact that the second-moment matrix of the observables, $\sigma^{2}$, $\Sigma$, and $V$ must be positive semi-definite (p.s.d.). Let $s^{2}$ denote the sample variance of y, r the vector of sample covariances between y and x, and N the sample covariance matrix of x. Klepper and Leamer derive the following equations that solve for $\sigma^{2}$, $\Sigma$, and $V$ in terms of the sample moments of (y, x')':

$$\Sigma = N - V, \tag{A3}$$

$$\beta = (N - V)^{-1} r, \tag{A4}$$

$$\sigma^{2} = s^{2} - r'(N - V)^{-1} r, \tag{A5}$$

where the solution is unique if (N - V) is nonsingular. If x were perfectly measured, V = 0 and $\beta$ would be the usual least squares (LS) estimator. Where V is not a zero matrix, it is used to adjust N so that the solution for $\beta$ is an adjusted LS estimator. V is not usually known, which is the source of the identification problem. Even so, a set of values for $\beta$ that consistently bounds the true value of $\beta$ can be constructed by appeal to the fact that V and the solutions for $\Sigma$ and $\sigma^{2}$ in Equations A3 and A5 must be p.s.d. A result by Levine (1986) allows an extension of Klepper and Leamer (1984) and Klepper (1988) to a Tobit model. The main result is that if the matrix

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

is used in place of the sample moments of (y, x')', the results from the linear model apply.
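As a rough illustration of Equations A3 to A5 in the linear case, the sketch below solves for the adjusted LS coefficients given a candidate vector of measurement error variances and checks the p.s.d. conditions. The function name, the numerical tolerance, and the one-regressor example moments are assumptions made for illustration; the sketch does not implement the Tobit moment matrix from Levine's result.

```python
import numpy as np

def eiv_solution(s2, r, N, v, tol=1e-12):
    """Solve A3-A5 for candidate measurement error variances v (hypothetical helper)."""
    r = np.asarray(r, dtype=float)
    N = np.asarray(N, dtype=float)
    v = np.asarray(v, dtype=float)
    Sigma = N - np.diag(v)                 # (A3) covariance of the true regressors
    beta = np.linalg.solve(Sigma, r)       # (A4) adjusted LS coefficients
    sigma2 = s2 - r @ beta                 # (A5) implied disturbance variance
    # Feasibility: V, Sigma, and sigma^2 must all be positive semi-definite.
    feasible = bool(
        np.all(v >= 0)
        and np.all(np.linalg.eigvalsh(Sigma) >= -tol)
        and sigma2 >= -tol
    )
    return beta, sigma2, feasible

# Illustrative one-regressor moments: s^2 = 4, r = 1, N = 1.
beta_ls, sigma2_ls, ok_ls = eiv_solution(4.0, [1.0], [[1.0]], [0.0])    # V = 0 gives LS
beta_adj, sigma2_adj, ok_adj = eiv_solution(4.0, [1.0], [[1.0]], [0.5])
beta_bad, sigma2_bad, ok_bad = eiv_solution(4.0, [1.0], [[1.0]], [0.8])  # implies sigma^2 < 0
```

In this toy case the feasible coefficient values run from the LS estimate upward until the implied disturbance variance reaches zero, which is the one-regressor version of the bounding argument.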

Consider the direct regression estimate plus the K reverse regressions, where the K regressors $x_{i}$, i = 1, ..., K, are each regressed on y and the reverse regression estimates are then computed (by solving for y in terms of the x's). Klepper and Leamer show that if, for every coefficient, the sign of the direct and all K reverse regressions is the same, then the set of feasible values of $\beta$ can be bounded. However, if the signs of any coefficient differ across the direct and reverse regressions, none of the coefficients can be bounded. [17] It is then necessary to invoke additional prior information to bound the feasible set. Klepper (1988) describes the use of reasonable prior information to bound the feasible values of $\beta$, and even narrow these bounds for individual coefficients. Two types of information (they are not independent of one another) used by Klepper (1988) in constructing EIV diagnostics are as follows: (i) bounds on $f_{i}$, the fraction of the variation in each regressor that is attributable to measurement error, and (ii) a bound on $R^{*2}$, the (hypothetical) R squared of the regression of y on x if all the measurement error in the $x_{i}$'s were completely removed.
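The direct and reverse regression comparison can be sketched from second moments alone. The sketch assumes each reverse regression includes y and the remaining regressors on the right-hand side, and it uses the standard identity that the rows of the inverse moment matrix carry the coefficients of each variable regressed on all the others. The function names and the example moments are hypothetical.

```python
import numpy as np

def direct_and_reverse(s2, r, N):
    """Return a (K+1) x K array: row 0 is the direct regression of y on x,
    row j is the j-th reverse regression solved for y (hypothetical helper)."""
    r = np.asarray(r, dtype=float)
    N = np.asarray(N, dtype=float)
    K = r.size
    A = np.empty((K + 1, K + 1))           # moment matrix of (y, x')'
    A[0, 0] = s2
    A[0, 1:] = r
    A[1:, 0] = r
    A[1:, 1:] = N
    P = np.linalg.inv(A)
    # Row j of P, divided by minus its first element, normalises the
    # coefficient on y to one. Undefined if P[j, 0] is exactly zero.
    return -P[:, 1:] / P[:, [0]]

def signs_agree(estimates):
    """True if every coefficient has the same sign in all K+1 regressions."""
    signs = np.sign(estimates)
    return bool(np.all(signs == signs[0]))

# Illustrative one-regressor moments: direct slope r/N, reverse slope s^2/r.
est = direct_and_reverse(4.0, [1.0], [[1.0]])
bounded = signs_agree(est)
lower, upper = est.min(axis=0), est.max(axis=0)   # range of the K+1 estimates
```

When `signs_agree` returns False, the range computed in the last line is not informative, which corresponds to the case in which additional prior information is needed.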

A value, denoted M, can be computed such that if the true $R^{2}$ of the regression can be bounded below this value, the feasible set is bounded. Bounding the true $R^{2}$ below M renders infeasible all the combinations of the measurement error variances that imply the true regressors are collinear (Klepper 1988). Klepper shows that the combination of measurement error variances that imply a true $R^{2}$ equal to M is one for which exactly two of the measurement error variances are nonzero. If either of these two measurement error variances can be bounded below its respective value on the basis of prior information, then this combination is rendered infeasible and it is no longer necessary to bound the true $R^{2}$ below M to bound the feasible set. Instead, the feasible set of estimates can be bounded by bounding the true $R^{2}$ below a larger M value, which is computed from the diagnostics in Klepper (1988). Now, corresponding to this new upper bound on the true $R^{2}$ is yet another two-measurement-error-variance combination. If this combination can be rendered infeasible as above, then once again the upper bound on the true $R^{2}$ required to bound the feasible set is further relaxed. This process can then be repeated. Because a Tobit model is analyzed in this paper, the $R^{2}$ measure is a pseudo-$R^{2}$ computed as $1 - \hat{\sigma}^{2}/(\hat{\sigma}^{2} + b'Nb)$, where $\hat{\sigma}^{2}$ is the maximum likelihood estimate (MLE) of the error variance in the Tobit model, b is the MLE of the Tobit coefficients, and N is the variance matrix of the rhs variables x. Levine's (1986) result stated earlier motivates its use.
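The pseudo-R-squared formula is simple enough to state as a short sketch. The function name and the numerical inputs are hypothetical; in practice the inputs would be the Tobit ML estimates and the sample covariance matrix of the regressors.

```python
import numpy as np

def tobit_pseudo_r2(sigma2_hat, b, N):
    """Pseudo R-squared: 1 - sigma2_hat / (sigma2_hat + b' N b)."""
    b = np.asarray(b, dtype=float)
    N = np.asarray(N, dtype=float)
    explained = b @ N @ b          # variance of the latent index x'b
    return 1.0 - sigma2_hat / (sigma2_hat + explained)

# Illustrative values only: equal explained and unexplained variance.
r2 = tobit_pseudo_r2(1.0, [1.0], [[1.0]])
```

The measure compares the estimated disturbance variance with the implied variance of the latent dependent variable, so it stays between 0 and 1 whenever N is positive semi-definite.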

III. Technical Appendix: Simulated Standard Errors on EIV Bounds

The standard errors on the EIV bounds reported in Table 6 are computed by simulation (see, e.g., Krinsky and Robb 1986) as follows. The Tobit ML estimates and their covariance matrix are used to generate m samples of [beta] assuming a multivariate normal distribution as

$$\tilde{\beta}_{i} = \hat{\beta} + C'\,\mathrm{rndn}_{i}(k, 1), \quad i = 1, \ldots, m, \tag{A7}$$

where C is the Cholesky decomposition of the covariance matrix of the Tobit MLE $\hat{\beta}$, and $\mathrm{rndn}_{i}$ is a randomly drawn standard normal vector of dimension k. Hence, $\tilde{\beta}_{i}$ is a sample value of $\beta$ generated according to the Tobit estimates of the coefficients and their covariance matrix. Each sample i then defines a new sample moment matrix by replacing $\hat{\beta}$ with $\tilde{\beta}_{i}$ in Equation A6. This matrix is the main input into producing the EIV intervals. Thus, m sets of EIV intervals are computed. The standard errors reported in Table 6 are estimated as the standard deviations across these m upper and lower bounds, respectively.
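The simulation step in Equation A7 can be sketched as follows. This is a minimal illustration, not the authors' code: `bounds_fn` is a hypothetical placeholder for whatever routine maps a coefficient draw to its EIV intervals, and the seed, covariance matrix, and function names are assumptions. NumPy's `cholesky` returns the lower-triangular factor, which plays the role of C' in A7.

```python
import numpy as np

def simulate_coefficients(beta_hat, cov, m, seed=0):
    """Draw m coefficient vectors as beta_hat + C' z with z standard normal (A7)."""
    rng = np.random.default_rng(seed)
    beta_hat = np.asarray(beta_hat, dtype=float)
    L = np.linalg.cholesky(np.asarray(cov, dtype=float))  # L @ L.T == cov, so L is C'
    z = rng.standard_normal((m, beta_hat.size))
    return beta_hat + z @ L.T                             # row i equals beta_hat + L z_i

def bound_standard_errors(draws, bounds_fn):
    """Standard deviation of lower and upper bounds across draws.
    bounds_fn is a hypothetical callable returning a (k, 2) array per draw."""
    bounds = np.array([bounds_fn(d) for d in draws])      # shape (m, k, 2)
    return bounds.std(axis=0, ddof=1)

# Illustrative inputs only.
cov = np.array([[1.0, 0.5], [0.5, 2.0]])
draws = simulate_coefficients([0.3, -1.2], cov, m=200_000)
toy_se = bound_standard_errors(draws[:300], lambda d: np.column_stack([d - 1.0, d + 1.0]))
```

The toy `bounds_fn` simply shifts each draw by a constant, so its simulated standard errors equal the standard deviations of the draws; a real application would replace it with the EIV interval computation.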

To generate the m samples, we discard all generated samples where the path set does not coincide with our path set. In Table 6, the path set for the stage I bounds is {CONC4, SCALE, P_MAN, UNION, DPEN7982, P_UNSK}. Note that the bounds depend only on the path set, not on the sequence in which its members are chosen. The standard errors are computed conditionally on this path set. The standard errors reported in Table 6 are based on m = 300. Generating m samples with this specific path set required about 3 × m unconditional samples. That is, of the unconditional sample, around 33% contained this specific path set. Regardless, the means across the full unconditional sample were not qualitatively or quantitatively different from what we have reported in the tables. This is significant because it demonstrates that the EIV bounds are robust to the choice of paths, on average. This is mainly due to the fact that across all samples only one or two variables in the path set are usually different from our path set. Hence, the path set itself is not very volatile.

The standard errors on the stage II bounds are based on one iteration beyond the stage I iterations, that is, for stage I, J = 6, and for stage II, J = 7. The standard errors on the stage II bounds are based on the unconditional sample, not just those samples that follow the exact path for each stage II bound. This is done mainly for computational convenience so that a separate set of simulations for each individual coefficient is not required. If anything, this overstates their standard errors, but we do not believe it qualitatively alters any of the conclusions.

Finally, for the perfectly measured variables, NE82, LABINT82, AVEARN, and NEGR82 (plus the country dummies), the computation of standard errors uses Bollinger's (1996) formula. Write the regression model as a partitioned model, $y = x_{1}\beta_{1} + x_{2}\beta_{2} + \mu$ with $\beta_{1}$: $k_{1} \times 1$, and $\beta_{2}$: $k_{2} \times 1$, where $x_{1}$ is the set of mismeasured variables and $x_{2}$ is the set of perfectly measured variables. Let G be the $k_{2} \times k_{1}$ matrix of coefficients from the regression of $x_{1}$ on $x_{2}$. The EIV bounds for $\beta_{2}$ are derived, given any bounds on $\beta_{1}$, from the formula $\beta_{2} = b_{2} - G\beta_{1}$, where we use the Tobit MLE of the regression of y on $x_{2}$ in place of $b_{2}$. The computation of standard errors now follows easily. G is fixed in all simulations. The simulations proceed along the lines described above, generating m sets of upper and lower bounds on $\beta_{1}$ (as well as m values for the vector $b_{2}$). From this, we compute m upper and lower bounds on $\beta_{2}$ from the formula. The standard deviations across the m bounds are estimates of the corresponding standard errors. We use m = 300.
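One way to turn interval bounds on the mismeasured coefficients into bounds on the perfectly measured ones is sketched below. It assumes the bounds on $\beta_{1}$ are treated as independent intervals (a box), which the text does not specify and which may give wider intervals than the joint feasible set would. The function name and the numbers are hypothetical.

```python
import numpy as np

def beta2_interval(b2, G, beta1_lo, beta1_hi):
    """Elementwise bounds on beta2 = b2 - G @ beta1 when each beta1[j]
    lies in [beta1_lo[j], beta1_hi[j]] (box assumption, for illustration)."""
    b2 = np.asarray(b2, dtype=float)
    G = np.asarray(G, dtype=float)              # shape (k2, k1)
    lo = np.asarray(beta1_lo, dtype=float)
    hi = np.asarray(beta1_hi, dtype=float)
    # Each product G[k, j] * beta1[j] attains its extremes at an interval endpoint.
    term_min = np.minimum(G * lo, G * hi).sum(axis=1)
    term_max = np.maximum(G * lo, G * hi).sum(axis=1)
    return b2 - term_max, b2 - term_min

# Illustrative values: one perfectly measured and two mismeasured variables.
lower2, upper2 = beta2_interval([1.0], [[2.0, -1.0]], [0.0, 0.0], [1.0, 1.0])
```

Repeating this calculation for each simulated set of bounds on $\beta_{1}$ and each simulated $b_{2}$, with G held fixed, would give the m upper and lower bounds whose standard deviations serve as standard errors.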

Note that in Table 6 we report the bounds as determined by the original data and the standard errors from the simulations described here. Alternatively, we could have reported the means from the simulations in place of the bounds. Although they are not reported here for brevity, the means are qualitatively similar to what we report.

Author: Bohara, Alok K.

Publication: Southern Economic Journal

Date: Apr 1, 2000
