Inequality and growth in the United States: evidence from a new state-level panel of income inequality measures.
I. INTRODUCTION
Rising income inequality in the United States over the past quarter century is well documented (see, e.g., Gottschalk 1997; Krueger 2003; Levy and Murnane 1992; Piketty and Saez 2003). Whether and by how much this change in inequality is associated with a change in economic performance is an important question, yet recent empirical work has been largely inconclusive. Positive relationships between income inequality and economic growth have been found by Partridge (1997, 2005) using a panel of states and by Forbes (2000) using a panel of countries. Empirical work by Panizza (2002) and Quah (2001), however, has found little or no stable relationship between inequality and growth; the results appear to be extremely sensitive to the econometric specification.
Additionally, Barro (2000) has found evidence that the relationship is nonlinear, with inequality being positively related to growth among wealthier countries like the United States but negatively related to growth among low-income countries.
Much of this recent empirical literature was initiated by the important work of Deininger and Squire (1996), who constructed a large cross-national panel of inequality measures containing several time-series observations for each nation spaced over multiple decades. A parallel empirical literature also emerged using U.S. state-level data, with income inequality measures typically spaced at 10-yr intervals (see, e.g., Panizza 2002; Partridge 1997, 2005). Both the state-level and the cross-national empirical literatures benefited by exploiting the more advanced panel data econometrics afforded by the large-N, small-T panel dimensions.
In this paper, we offer a new, more comprehensive panel of state-level income inequality measures and use this panel to reevaluate the empirical inequality-growth relationship. (1) This panel has the unique feature of being large in both N and T, with annual observations of the 48 contiguous states for the entirety of the postwar period 1945-2004. For nearly all the states in the panel, the share of income held by the top decile experienced a prolonged period of stability after World War II, followed by a substantial increase during the 1980s and 1990s. These state-level trends appear to closely replicate the overall trends in aggregate U.S. inequality found by Piketty and Saez (2003).
Exploiting the large and balanced size of our inequality panel, we explore the long-run relationship between inequality and growth via three alternative dynamic panel error-correction estimators: the fixed effects (FE) estimator, the mean group (MG) estimator of Pesaran and Smith (1995), and the pooled MG estimator of Pesaran, Shin, and Smith (1999). The greater homogeneity of state-level data helps mitigate the difficulty in adequately capturing structural differences across international panels of earlier studies such as Forbes (2000) and Barro (2000). Corruption levels, labor market flexibility, tax neutrality, tradition of entrepreneurship, and many other factors are only poorly measured, if at all (Barro 2000, 1011), and these sources of heterogeneity are much more likely to contribute to omitted variable bias across countries than across states. The results from our analysis indicate that the long-run relationship between the top decile share of income and economic growth is positive in nature. Moreover, an evaluation of several alternative income inequality measures suggests that this positive relationship is driven primarily by the concentration of income in the upper end of the income distribution.
The structure of the paper is as follows. Section II presents the new panel of annual state-level inequality measures and includes an important discussion of its key limitations. Section III then offers an empirical investigation on the impact of income inequality on the growth rate of real income per capita. Finally, Section IV offers a brief set of conclusions.
II. TRENDS IN STATE-LEVEL INEQUALITY
This paper offers a rich new panel of annual income inequality measures for the 48 states over the period 1945-2004 (N = 48, T = 60). Descriptive statistics for all the variables used in the analysis are presented in Table 1. Our inequality measures are derived from tax data reported in Statistics of Income published by the Internal Revenue Service (IRS). The pretax adjusted gross income reported by the IRS is a broad measure of income. In addition to wages and salaries, it also includes capital income (dividends, interest, rents, and royalties) and entrepreneurial income (self-employment, small businesses, and partnerships). (2) Notable income exclusions include interest on state and local bonds and transfer income from federal and state governments. Further details on the construction of the inequality measures are provided in Appendix A.
Figure 1 presents the annual trends in real income per capita and income inequality averaged over the 48 states. Shaded areas show periods of recession as defined by the National Bureau of Economic Research (NBER). The solid line (left scale) shows the yearly trend in the logarithm of real income per capita. Average state-level real income per capita in 2004 ($31,908) was over three times greater than that in 1949 ($10,320 in 2004 constant dollars), the lowest year in the period. The thick dashed line (right scale) shows the yearly trend in the share of income held by the top 10% of the population. The top decile share of income has grown substantially over this 60-yr period, from a low of 28% in 1953 to a high of 43% in 2000.
Aggregate U.S. trends in income inequality from IRS income data have been explored before, most notably by Piketty and Saez (2003, 2006), who construct several time-series measures of U.S. top income shares spanning the period 1913-1998. Piketty and Saez find that income inequality in the United States has displayed a distinct U-shaped pattern. In the early part of the century, inequality declined substantially, particularly during the Great Depression and World War II (see also Goldin and Margo 1992). After three decades of post-World War II stability, large increases in inequality began in the 1980s, with a significant part of this increase occurring after the Tax Reform Act of 1986, and continued throughout the 1990s (see also Gottschalk 1997; Krueger 2003; Levy and Murnane 1992).
[FIGURE 1 OMITTED]
The new state-level inequality panel we present appears quite consistent with the aggregate U.S. data of Piketty and Saez. Comparing our measure of the top 10% share of income averaged across the 48 states (shown in Figure 1) to the total U.S. share presented in Piketty and Saez (2003, 11), the mean share of income for the period of commonality (1945-1998) is 32.7% in our panel and 34.0% in the time-series data of Piketty and Saez. The minimum annual share of income is 28.2% in our sample and 31.4% in Piketty and Saez (both occurring in 1953), while the maximum annual share is 41.9% in our panel and 41.4% in Piketty and Saez (both occurring in 1998). Moreover, the Pearson correlation coefficient between the two series is 0.980, while the Theil U statistic is 0.044. (3)
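The two agreement statistics quoted above can be illustrated with a minimal sketch. The exact Theil U definition is given in a note not reproduced here, so the U1 form below (root-mean-square difference scaled by the sum of the two root-mean-square levels) is an assumption, and the two short series are hypothetical values rather than the paper's data.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def theil_u(a, b):
    """Theil U in the assumed U1 form: 0 means the two series are identical."""
    n = len(a)
    rmse = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / n)
    scale = math.sqrt(sum(x * x for x in a) / n) + math.sqrt(sum(y * y for y in b) / n)
    return rmse / scale

# Hypothetical top-decile shares, NOT the paper's data.
state_avg = [0.30, 0.29, 0.31, 0.35, 0.41]
national = [0.32, 0.31, 0.33, 0.36, 0.41]
r = pearson(state_avg, national)
u = theil_u(state_avg, national)
```

A correlation near 1 together with a U near 0 indicates that the two series share both their movements and their levels, which is the sense in which the state panel is compared with the aggregate series.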
The distinguishing feature of our panel is the construction of annual inequality measures for each of the states. State-level inequality panels have been used before, notably by Panizza (2002) and Partridge (1997, 2005), though these panels are spaced at 10-yr or longer intervals. Figure 2 shows the individual state-level trends in the top 10% share of income and the log of real income per capita. Overall, many of the individual states appear to replicate the general trend and level of U.S. inequality discussed above. The lowest level of income inequality over the 60-yr period occurred in West Virginia (with an average top decile share of income of 30.5%), while the highest level of inequality occurred in Florida and New York (37.7% and 37.5%, respectively). The largest state outlier is Delaware, with a Pearson correlation with the top decile share from Piketty and Saez (2003) of only 0.11. The remaining 47 states are highly correlated with the Piketty and Saez data, with Oklahoma having the next-lowest correlation at 0.89.
One significant limitation of IRS income data, however, is the omission of some individuals earning less than a threshold level of gross income. This threshold varies by age and marital status, as well as the tax filing year. For this reason, we follow Piketty and Saez (2003) in using the top decile share of income as our primary measure of inequality. (4) Other non-IRS data sources have the clear advantage of not omitting these low-income individuals, but these sources are either not available annually, such as the decennial Census, or, in the case of the March Current Population Survey (CPS), only available annually for more recent years. Akhand and Liu (2002), moreover, provide evidence that these survey-based alternatives suffer additional bias resulting from an "over-reporting of earnings by individuals in the lower tail of the income distribution and under-reporting by individuals in the upper tail of the income distribution" (258). The IRS, unlike the March CPS or Bureau of the Census, will penalize respondents for income reporting errors.
III. THE RELATIONSHIP BETWEEN INCOME INEQUALITY AND ECONOMIC GROWTH
This section explores the relationship between income inequality and economic growth using a methodology that fully exploits the unusually large and balanced size of our panel. We begin with the common autoregressive distributed lag (ARDL) (p, q, ..., q) dynamic panel specification:
(1) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
where i = 1, 2, ..., N indexes the states, t = 1, 2, ..., T indexes the time periods, [y.sub.i,t] is the log of real income per capita, [X.sub.i,t] is a vector of explanatory variables that includes a measure of income inequality, [[mu].sub.i] is the time-invariant fixed effect for state i, [[tau].sub.t] is the state-invariant time effect for time t, and [[epsilon].sub.i,t] is the idiosyncratic, time- and state-varying error term. Adding [y.sub.i,t-1] to each side of Equation (1) yields
(2) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
where [[lambda].sub.i,j] = [[alpha].sub.i,j] [for all] j [not equal to] 1 and [[lambda].sub.i,1] = [[alpha].sub.i,1] + 1.
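Since Equations (1) and (2) are not reproduced in this copy, the following is a sketch of one form that is consistent with the surrounding definitions. It assumes that the dependent variable of Equation (1) is the growth rate of income, so that adding the lagged level to both sides gives Equation (2). The coefficient vector written here as \delta_{i,j} is assumed notation; the remaining symbols follow the text.

```latex
\begin{align*}
\Delta y_{i,t} &= \sum_{j=1}^{p} \alpha_{i,j}\, y_{i,t-j}
  + \sum_{j=0}^{q} \delta'_{i,j} X_{i,t-j}
  + \mu_i + \tau_t + \varepsilon_{i,t}, \\
y_{i,t} &= \sum_{j=1}^{p} \lambda_{i,j}\, y_{i,t-j}
  + \sum_{j=0}^{q} \delta'_{i,j} X_{i,t-j}
  + \mu_i + \tau_t + \varepsilon_{i,t}.
\end{align*}
```

Under this reading, only the coefficient on the first lag changes between the two lines, which matches the stated relation between the \lambda and \alpha coefficients.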
[FIGURE 2 OMITTED]
Real state income per capita ([y.sub.i,t]) is taken from the Regional Accounts Data available at the Web site of the Bureau of Economic Analysis (BEA) and deflated using the consumer price index (2004 = 100). (5) We also include two measures of human capital attainment in vector [X.sub.i,t]: the proportion of the population with at least a high school degree and the proportion with at least a college degree. The inclusion of these measures is consistent with much of the relevant theoretical literature (Aghion and Bolton 1997; Galor and Moav 2004; Galor and Zeira 1993; Perotti 1993). Human capital attainment information is unavailable on an annual state-level basis for much of our early sample period, however. We therefore constructed measures of human capital attainment using the perpetual inventory method proposed by Barro and Lee (1993, 1996, 2000). Appendix B describes this construction and provides tests of its accuracy.
Estimation of the model is problematic, however, if the log of real income per capita and income inequality are nonstationary. For [[epsilon].sub.i,t] to be stationary, it must be the case that any I(1) variables are cointegrated. Nonstationarity appears likely from Figures 1 and 2, where each series appears to meander and shows no tendency to return to a constant mean over the 60-yr period. We formally test for nonstationarity using several Hadri (2000) panel stationarity tests. In each test, the null hypothesis of stationarity is rejected at the 1% significance level. (6) This finding is consistent with the nonstationarity we found using the U.S. time-series inequality data of Piketty and Saez (2003) for the period 1913-1998. (7)
To evaluate whether the variables are cointegrated, we employ the Kao (1999) test, an augmented Dickey-Fuller-type test applicable to panel data, as well as the Pedroni (1995, 2004) test and a pooled Phillips and Perron-type test for panel data. Note that cointegration is implied if a long-run relationship between real income per capita and income inequality exists. With each of these cointegration tests, we reject the null hypothesis of no cointegration at the 1% significance level. (8)
Equation (2) can be reparameterized into the error-correction equation:
(3) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
The parameter [[phi].sub.i] is the error-correcting speed of adjustment term; the vector [[theta]'.sub.i] captures the long-run relationships between the variables, while [[beta]'.sub.ij] captures the short-run relationships. One would expect the parameter [[phi].sub.i] to be significantly negative if the variables show a return to long-run equilibrium. If [[phi].sub.i] = 0, however, there would be no evidence for a long-run relationship. Since we are primarily interested in the nature of the long-run relationship between growth and income inequality, the long-run vector of coefficients [[theta]'.sub.i] will be of particular importance.
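For intuition, the reparameterization can be worked through for the simplest case, an ARDL (1,1) with a single regressor x and the time effects omitted. This is a sketch under assumptions: the level coefficients \delta_{i,0} and \delta_{i,1} are assumed notation not given in the text, while \phi_i and \theta_i play the roles described above.

```latex
\begin{align*}
y_{i,t} &= \lambda_i y_{i,t-1} + \delta_{i,0} x_{i,t} + \delta_{i,1} x_{i,t-1}
  + \mu_i + \varepsilon_{i,t} \\
\Delta y_{i,t} &= -(1-\lambda_i)\, y_{i,t-1} + (\delta_{i,0}+\delta_{i,1})\, x_{i,t}
  - \delta_{i,1}\, \Delta x_{i,t} + \mu_i + \varepsilon_{i,t} \\
&= \phi_i \left( y_{i,t-1} - \theta_i x_{i,t} \right)
  - \delta_{i,1}\, \Delta x_{i,t} + \mu_i + \varepsilon_{i,t}, \\
\phi_i &= -(1-\lambda_i), \qquad
\theta_i = \frac{\delta_{i,0}+\delta_{i,1}}{1-\lambda_i}.
\end{align*}
```

The second line uses x_{i,t-1} = x_{i,t} - \Delta x_{i,t}. With 0 < \lambda_i < 1 the adjustment term \phi_i lies between -1 and 0, which is why a negative estimate is read as a return toward the long-run relationship.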
From Equation (3), the fixed time effects ([[tau].sub.t]) can be eliminated by mean-differencing the variables from their cross-section means. Eliminating the time effects is important given the long span of the sample and the year-to-year incremental changes in tax laws associated with IRS income data. Additionally, the fixed state effects ([[mu].sub.i]) can be eliminated by estimating Equation (3) with the FE estimator. With the FE estimator, the time-series data for each state are pooled and only the intercepts are allowed to differ across states. If the slope coefficients are not identical, however, the FE estimator could produce inconsistent and potentially misleading results. Moreover, Li, Squire, and Zou (1998), Barro (2000), and Quah (2001) have argued against the use of FE, at least in the international context, since much of the variation in international inequality is cross-sectional. Their contention is that the FE estimator, by discarding this cross-sectional variation, would incorrectly lead to the conclusion of an insignificant relationship. Partridge (2005) implies that a similar variation may exist in state-level inequality panels, though we do not find evidence of a cross-sectional weight in our inequality panel. (9)
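Returning to the first step of this paragraph, the cross-section demeaning that removes the time effects can be sketched as follows. The function name and the three-state panel are hypothetical illustrations; the point is only that a shock common to all states in a given year drops out once each year's cross-state mean is subtracted.

```python
def demean_by_year(panel):
    """Subtract each year's cross-state mean from every state's value.

    panel: dict mapping state -> list of values, one per year (balanced panel).
    """
    states = list(panel)
    n_years = len(panel[states[0]])
    year_means = [sum(panel[s][t] for s in states) / len(states) for t in range(n_years)]
    return {s: [panel[s][t] - year_means[t] for t in range(n_years)] for s in states}

# Hypothetical log incomes for three states over four years.
base = {"A": [1.0, 1.2, 1.1, 1.4], "B": [2.0, 2.1, 2.3, 2.2], "C": [0.5, 0.4, 0.9, 1.0]}
tau = [0.3, -0.1, 0.7, 0.2]  # common shock hitting every state in year t
shocked = {s: [v + tau[t] for t, v in enumerate(vals)] for s, vals in base.items()}

# The demeaned panels agree, so the common time effect has been removed.
clean = demean_by_year(base)
clean_shocked = demean_by_year(shocked)
```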
Alternatively, Equation (3) could be estimated for each state separately, and an average of the coefficients could then be calculated. This is the MG estimator proposed by Pesaran and Smith (1995). The MG estimator exploits the unusually large number of time-series observations available for each state and provides an important alternative to the FE estimator. With the MG estimator, for example, the error-correction speed of adjustment term is
(4) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
with the variance
(5) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
Since the intercepts, slope coefficients, and error variances are all allowed to differ across states, the cross-sectional information of the data will be retained. This is a useful feature given the aforementioned concerns of Li, Squire, and Zou (1998), Barro (2000), and Quah (2001).
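The averaging step of the MG estimator can be sketched in a few lines. Equations (4) and (5) are not reproduced in this copy, so the sketch assumes the usual Pesaran and Smith form: an unweighted mean of the state-level estimates, with a variance computed from their dispersion across states. The input values are hypothetical.

```python
import math

def mean_group(estimates):
    """Return the MG point estimate and its standard error.

    estimates: one coefficient per state (e.g., each state's estimated phi_i).
    Assumed form: mean of the estimates, variance = sum of squared
    deviations from that mean divided by N * (N - 1).
    """
    n = len(estimates)
    mg = sum(estimates) / n
    variance = sum((e - mg) ** 2 for e in estimates) / (n * (n - 1))
    return mg, math.sqrt(variance)

phi_hat = [-0.30, -0.20, -0.40, -0.10]  # hypothetical state-level estimates
phi_mg, phi_se = mean_group(phi_hat)
```

Because each state's regression is run separately, nothing is imposed on the state-level coefficients; the cross-state spread enters only through the standard error.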
More recently, Pesaran, Shin, and Smith (1999) have proposed a third alternative, the pooled MG estimator, which combines both pooling and averaging. This intermediate estimator allows the intercepts, short-run coefficients, and error variances to differ across states (as would the MG estimator) but pools the data and constrains the long-run coefficients to be the same across states (as would an FE estimator). Since Equation (3) is nonlinear in the parameters, Pesaran, Shin, and Smith (1999) develop a maximum likelihood method to estimate the parameters. Expressing the likelihood as the product of each cross-section's likelihood and taking the log yields:
(6) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
for i = 1, 2, ..., N, where [[xi].sub.i]([theta]) = [y.sub.i,t-1] - [X.sub.i][[theta].sub.i], [H.sub.i] = [I.sub.T] - [W.sub.i][([W'.sub.i][W.sub.i]).sup.-1][W'.sub.i], [I.sub.T] is the identity matrix of order T, and [W.sub.i] = ([DELTA][y.sub.i,t-1], ..., [DELTA][y.sub.i,t-p+1], [DELTA][X.sub.i], [DELTA][X.sub.i,t-1], ..., [DELTA][X.sub.i,t-q+1]). The likelihood is maximized iteratively via back-substitution until convergence is achieved. (10)
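Equation (6) is not reproduced in this copy. As a hedged sketch, a concentrated log-likelihood consistent with the definitions of \xi_i(\theta) and H_i above, and with the pooled MG setup of Pesaran, Shin, and Smith (1999), would take the following form. It assumes Gaussian errors with a state-specific variance, written here as \sigma_i^2, which is assumed notation.

```latex
\begin{equation*}
\ell_T(\theta, \phi, \sigma) =
  -\frac{T}{2} \sum_{i=1}^{N} \ln\!\left( 2\pi\sigma_i^2 \right)
  -\frac{1}{2} \sum_{i=1}^{N} \frac{1}{\sigma_i^2}
   \left( \Delta y_i - \phi_i \xi_i(\theta) \right)' H_i
   \left( \Delta y_i - \phi_i \xi_i(\theta) \right)
\end{equation*}
```

In this form the matrix H_i projects out the short-run regressors collected in W_i, so that only the long-run coefficients, the adjustment terms, and the variances remain to be estimated.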
Table 2 presents our primary estimates of the relationship of income inequality and human capital attainment to economic growth. Ten additional industry wage and salary variables are also included in each estimation as short-run control variables. (11) In the first three models, the ARDL (1,1) form of Equation (3) is estimated via the dynamic FE, pooled MG, and MG estimators. In the final two models, the Schwarz Bayesian Criterion (SBC) is used to select the lag lengths for the pooled MG and MG estimators. Across all the models, the long-run relationship between the top decile share of income and economic growth is positive and statistically significant. Notice that the magnitude of the relationship is similar across the estimators (dynamic FE, pooled MG, and MG), suggesting that U.S. inequality panels are not as sensitive to the use of an FE estimator as international panels (Barro 2000; Li, Squire, and Zou 1998; Quah 2001). The magnitude is impacted by the selection of lags, however, with the more parsimonious ARDL (1,1) specification (Columns 1-3) indicating a relationship of larger magnitude than the SBC models (Columns 4 and 5).
In the estimations in Table 2, the speed of adjustment parameter, [[phi].sub.i], is consistently negative and significant but does vary in magnitude. While the MG and pooled MG [[phi].sub.i] indicate similar returns to long-run equilibrium, the dynamic FE [[phi].sub.i] implies a much slower return. The two long-run human capital variables are always positive in sign, as expected, though not always statistically significant. (12) This lack of significance is well known in the human capital literature (Krueger and Lindahl 2001) and is the continuing subject of recent research (see, e.g., Vandenbussche, Aghion, and Meghir 2006).
To evaluate the differences between the MG and the pooled MG estimations, note that the long-run coefficients from the pooled MG estimator are restricted to be the same for all states, while the MG long-run coefficients are unrestricted. To compare these two estimators, a Hausman test can be performed to test the additional restrictions of the pooled MG estimator (Pesaran, Shin, and Smith 1999). Under the null hypothesis of the Hausman test, there are no differences in the estimators; thus, the pooled estimator is consistent and efficient. In the ARDL estimations of Table 2 (Columns 2 and 3), the Hausman test statistic is a marginally insignificant 6.76 (p value = .08). Similarly, in the SBC estimations (Columns 4 and 5), the Hausman test statistic is 6.87 (p value = .08). Hence, the pooled MG estimates appear consistent and efficient in comparison to the unrestricted MG estimation.
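The reported p values can be checked with a short calculation. This is a sketch under one assumption that the text does not state: that the Hausman statistic is compared with a chi-square distribution with 3 degrees of freedom, one for each restricted long-run coefficient (top decile share, high school, and college). The closed-form tail probability below is specific to 3 degrees of freedom.

```python
import math

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_ardl = chi2_sf_3df(6.76)  # ARDL (1,1) comparison, Columns 2 and 3
p_sbc = chi2_sf_3df(6.87)   # SBC comparison, Columns 4 and 5
print(round(p_ardl, 2), round(p_sbc, 2))  # prints 0.08 0.08
```

Under that assumption both statistics fall short of the 5% critical value, which is the basis for not rejecting the pooling restriction.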
To further interpret this relationship between inequality and economic growth, notice from the pooled MG ARDL parameter estimates (Column 2) that a two-standard deviation increase in the top decile share of income (about 0.09, Table 1) would be related to an increase in the long-run growth rate of real income per capita of 0.072% ceteris paribus.
Table 3 continues this evaluation by replicating the pooled MG estimation from Column 2 of Table 2 with six alternative measures of inequality. The first two columns divide the top decile into two groups: the income share of the top 1% (Column 1) and the income share of the top 90%-99% (Column 2). An interesting relationship emerges; the positive long-run association between inequality and growth is found to reside only within the income share of the top 1%, while the income share of the top 90%-99% appears to have no significant relation to economic growth. The parameter estimates from Column 1 of Table 3 imply that a two-standard deviation increase in the share of income of the top 1% would increase the long-run growth rate of real income per capita by 0.066%.
Column 3 of Table 3 replaces the percentile share of income measures with the gini coefficient, a measure of inequality which encompasses the entire income distribution. Again the long-run relationship between inequality and economic growth is positive and statistically significant, though the magnitude of the relationship appears considerably smaller than the relationship found using the top 10% and top 1% measures of inequality. The estimates from Column 3 imply that a two-standard deviation increase in the gini coefficient would be related to a 0.008% increase in the long-run growth rate of real income per capita. One possible explanation for this difference in magnitude is that the true relationship between inequality and growth is narrowly related to only upper end income inequality. Hence, the smaller magnitude from the gini coefficient would be a meaningful difference driven by the broad nature of inequality captured by the gini coefficient. Alternatively, it is plausible that the gini coefficient is an inefficient measure of inequality in the context of IRS income data, since IRS data are truncated below a threshold level of income, as discussed in the previous section.
The next two columns use the Atkinson index of inequality, a measure of inequality bounded between 0 and 1, with higher values indicating greater inequality. Column 4 employs an inequality aversion parameter ([epsilon]) of 0.5, meaning the index is more sensitive to transfers at the high end of the income distribution. Column 5, by contrast, uses an inequality aversion parameter of 1.5, meaning the index is more sensitive to transfers at the low end of the distribution. Both measures indicate that income inequality is positively related to economic growth. Moreover, the parameter estimates in Column 4 imply that a two-standard deviation increase in inequality from the [epsilon] = 0.5 Atkinson index will be related to an increase in the long-run growth rate of real income per capita of 0.063%, a magnitude very similar to the top decile and top 1% income share measures discussed above. By contrast, the [epsilon] = 1.5 Atkinson index in Column 5 indicates that a two-standard deviation increase in inequality will be related to only a 0.023% increase in the long-run growth rate.
Combined with the top 1% and top 90%-99% income share findings (Columns 1 and 2), these Atkinson index results provide further evidence that the positive relationship between inequality and growth is driven principally by the concentration of income in the upper end of the income distribution. The relationship between bottom-end inequality and economic growth remains an open question, however. It is noteworthy that Voitchovsky (2005) has found evidence that while top-end inequality is positively associated with growth, bottom-end inequality may be negatively related to growth. IRS income data, however, are not well suited for the construction of bottom-end inequality measures given the truncation of low-income individuals. (13)
The final column of Table 3 uses the Theil entropy index, a measure derived from statistical information theory in which larger values indicate greater income inequality. Unlike the top decile, top 1%, gini coefficient, and Atkinson indexes, the Theil index is an unbounded measure of inequality, a useful property for nonstationary data. The parameter estimates in this case imply that a two-standard deviation increase in the Theil index would be related to an increase in the long-run growth rate of real income per capita of 0.081%, a rate consistent with the top decile measure of inequality, the top 1%, and the Atkinson index with [epsilon] = 0.5 (Column 4).
IV. CONCLUSION
This paper is motivated by the desire to provide a comprehensive state-level panel of income inequality measures covering the entirety of the postwar period. Existing state-level and cross-national inequality panels are often restricted to only recent years or contain only a very limited number of time-series observations for each cross-section. Panels that cover the entirety of the postwar period are typically spaced at 10-yr intervals, meaning that only five or six time-series observations are available for each cross section. This paper, by contrast, offers a new comprehensive panel of annual income inequality measures for the 48 states over the 60-yr period 1945-2004. These measures of inequality are constructed from individual tax filing data available from the IRS. Although IRS income data have several important limitations, including the censoring of individuals below a threshold level of income, they have the unique feature of being available annually for each state throughout the postwar period.
Individual state-level trends in income inequality from this panel appear to closely replicate the overall trends in aggregate U.S. inequality found by Piketty and Saez (2003). For the vast majority of states, the share of income held by the top decile experienced a prolonged period of stability after World War II, followed by a substantial increase in inequality during the 1980s and 1990s.
This paper also offers an exploration of the long-run relationship between income inequality and economic growth. Given the unusually large and balanced size of our panel, we have employed three alternative dynamic panel error-correction estimators: the FE estimator, the MG estimator of Pesaran and Smith (1995), and the pooled MG estimator of Pesaran, Shin, and Smith (1999). From this analysis, we find the long-run relationship between inequality and growth to be positive in nature. The results imply that a two-standard deviation increase in the top 10% share of income is related to an increase in the long-run growth rate of real income per capita of 0.072% ceteris paribus. Additionally, using several alternative measures of income inequality, we find this positive relationship to be driven primarily by income concentration within the upper end of the income distribution. IRS data limitations, however, have prevented an analysis of bottom-end inequality effects, from which one might expect the opposite effect (see, e.g., Voitchovsky 2005).
Though suggestive, our analysis is only one step in the empirical investigation of the inequality-growth relationship using high-frequency panel data. Our analysis does not, for example, consider the impact of structural breaks in the state-level time series, nor does it consider potential nonlinearities in the relationship between inequality and growth, a point of emphasis in parts of the recent theoretical literature (see, e.g., Banerjee and Duflo 2003; Galor and Moav 2004; Sylwester 2000). Also neglected is an investigation of the appropriate lag interval between measures of inequality and economic growth. (14) Much of the current inequality-growth literature, for example, uses a 10- or 20-yr lag between inequality and subsequent economic growth rates (Barro 2000; Forbes 2000; Panizza 2002; Partridge 1997, 2005). This practice reflects the desire to isolate the long-run relationship between inequality and growth but is also an imposed artifact of prior data limitations. The new inequality panel this paper introduces, unlike the low-frequency panels of prior research, has both the comprehensiveness and the flexibility to further the empirical evaluation of each of these concerns.
APPENDIX A: CONSTRUCTION OF THE INEQUALITY MEASURES
The income inequality measures are constructed using data published by the IRS on the number of returns and adjusted gross income (before taxes) by state and size of the adjusted gross income. (15) Percentile rankings are used to construct the top decile share of income. This construction is based on the split-histogram interpolation method suggested by Cowell (1995), whereby the proportion of the population with income less than or equal to income y is defined as:
(A.1) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
where [a.sub.i] is the lower bound of group i and [F.sub.i] is the cumulative frequency of the number of individuals before group i. The proportion of the total income received by those with income less than or equal to y is given by
(A.2) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
where [mu] is mean income. The density within each interval i is defined by the split-histogram density:
(A.3) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
where [f.sub.i] is the relative frequency of [n.sub.i] within group i, and [a.sub.i+1] is the upper bound of group i.
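To show how a top income share is read off grouped tax data, the sketch below uses plain linear interpolation of the Lorenz curve between group boundaries. This is a deliberately cruder stand-in for the split-histogram method, whose Equations (A.1) to (A.3) are not reproduced in this copy, and it will generally understate inequality within the group that contains the cutoff. The function name and the input figures are hypothetical.

```python
def top_share(counts, totals, top=0.10):
    """Income share of the top `top` fraction, from grouped data sorted by income.

    counts: number of returns in each income group (lowest group first).
    totals: total income reported in each group.
    Uses linear interpolation of the Lorenz curve, not the split-histogram density.
    """
    n, y = sum(counts), sum(totals)
    target = 1.0 - top       # population share below the cutoff
    pop, inc = 0.0, 0.0      # cumulative shares at the lower edge of the group
    for c, s in zip(counts, totals):
        next_pop, next_inc = pop + c / n, inc + s / y
        if next_pop >= target:
            frac = (target - pop) / (next_pop - pop)
            return 1.0 - (inc + frac * (next_inc - inc))
        pop, inc = next_pop, next_inc
    return 0.0

# Hypothetical grouped data: three brackets holding 50, 30, and 20 returns.
share = top_share([50, 30, 20], [20.0, 30.0, 50.0])
```

In this example the 90th percentile falls halfway through the top bracket, so the top decile is assigned half of that bracket's income, or 25% of the total.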
Since the IRS income data are in group form, the gini coefficient we construct is the compromise gini coefficient proposed by Cowell and Mehta (1982) and Cowell (1995). Accordingly, the lower limit of the gini can be derived based on the assumption that all individuals in a group receive exactly the mean income of the group:
(A.4) [G.sub.L] = 1/2 [k.summation over (i=1)] [k.summation over (j=1)] [n.sub.i][n.sub.j]/[n.sup.2][mu] [absolute value of [[mu].sub.i] - [[mu].sub.j]],
where n is the number of individuals and subscripts i and j denote within-group values. The upper limit gini can be constructed based on the assumption that individuals within the group receive income equal to either the lower or the upper bound of the group interval:
(A.5) [G.sub.U] = [G.sub.L] + [k.summation over (i=1)] [n.sup.2.sub.i]([a.sub.i+1] - [[mu].sub.i])([[mu].sub.i] - [a.sub.i])/[n.sup.2][mu]([a.sub.i+1] - [a.sub.i])
The compromise gini coefficient proposed by Cowell and Mehta (1982) is simply (2/3)[G.sub.U] + (1/3)[G.sub.L].
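The lower, upper, and compromise gini coefficients can be computed directly from grouped data. The sketch below follows Equations (A.4) and (A.5) and the two-thirds, one-third weighting, reading the denominator of (A.4) as the squared population times mean income, as in (A.5). It assumes every group has finite bounds, so an open-ended top bracket would need separate handling; the function name and the two-group example are hypothetical.

```python
def gini_bounds(counts, means, lower, upper):
    """Lower, upper, and compromise gini from grouped data.

    counts[i]: individuals in group i; means[i]: mean income in group i;
    lower[i], upper[i]: income bounds of group i (both assumed finite).
    """
    n = sum(counts)
    mu = sum(c * m for c, m in zip(counts, means)) / n
    # Lower limit: everyone in a group receives the group mean.
    g_low = 0.5 * sum(
        ci * cj * abs(mi - mj)
        for ci, mi in zip(counts, means)
        for cj, mj in zip(counts, means)
    ) / (n * n * mu)
    # Upper limit: incomes sit at the group bounds, preserving the group mean.
    spread = sum(
        c * c * (hi - m) * (m - lo) / (hi - lo)
        for c, m, lo, hi in zip(counts, means, lower, upper)
    ) / (n * n * mu)
    g_up = g_low + spread
    return g_low, g_up, (2.0 / 3.0) * g_up + (1.0 / 3.0) * g_low

# Two hypothetical groups of one person each, with means 1 and 3.
g_low, g_up, g_mid = gini_bounds([1, 1], [1.0, 3.0], [0.0, 2.0], [2.0, 4.0])
```

For this example the lower limit is 0.25, the upper limit is 0.375, and the compromise value is one-third.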
The Atkinson index and Theil entropy index can be derived using the basic form:
(A.6) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],
where [[phi].sub.i] is the split-histogram density and h(y) is an evaluation function. To construct the Atkinson index, the evaluation function is defined as:
(A.7) h(y) = [(y/[mu]).sup.1-[epsilon]],
where the Atkinson index equals 1 - [J.sup.1/(1-[epsilon])]. Note that [epsilon] is the Atkinson inequality aversion parameter. To construct the Theil entropy index, the evaluation function is given by:
(A.8) h(y) = y/[mu] ln (y/[mu]).
Unlike the percentile rankings or gini index, the Atkinson index and Theil index are undefined for negative incomes (Cowell 1995). Hence, to construct the Atkinson and Theil inequality measures, negative IRS income data must be truncated, meaning the lowest possible income, [a.sub.1], is $0.
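The two evaluation functions in Equations (A.7) and (A.8) can be illustrated on a list of individual incomes. This is a simplification: the paper integrates over the split-histogram density of grouped data, whereas the sketch averages over a hypothetical sample. It also requires strictly positive incomes, a stricter condition than the truncation at zero described above, and it does not cover the case of an aversion parameter equal to 1.

```python
import math

def atkinson(incomes, epsilon):
    """Atkinson index 1 - J^(1/(1-epsilon)), with J the mean of (y/mu)^(1-epsilon)."""
    if epsilon == 1:
        raise ValueError("epsilon = 1 needs the geometric-mean form, not covered here")
    mu = sum(incomes) / len(incomes)
    j = sum((y / mu) ** (1 - epsilon) for y in incomes) / len(incomes)
    return 1 - j ** (1 / (1 - epsilon))

def theil(incomes):
    """Theil entropy index: mean of (y/mu) * ln(y/mu)."""
    mu = sum(incomes) / len(incomes)
    return sum((y / mu) * math.log(y / mu) for y in incomes) / len(incomes)

sample = [1.0, 3.0]  # hypothetical incomes
a_low = atkinson(sample, 0.5)
a_high = atkinson(sample, 1.5)
t = theil(sample)
```

Both indexes equal zero when all incomes are identical, and for a given sample the Atkinson index rises with the aversion parameter.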
APPENDIX B: CONSTRUCTION OF THE HUMAN CAPITAL MEASURES
Annual human capital attainment measures for each state are not available for the entire period 1945-2004. The Census Bureau provides state-level attainment measures for each state, but these are available only at 10-yr increments (1940, 1950, 1960, 1970, 1980, 1990, and 2000). The March supplement to the CPS provides full state-level attainment information only for the years 1989, 1991, and 1993-2004. Attainment information for the largest 15 states is available from the March CPS for the years 1979, 1981, 1983, 1985, 1987, and 1988.
To build an annual state-level measure of human capital attainment, we follow the spirit of the perpetual inventory method proposed by Barro and Lee (1993, 1996, 2000). Attainment information from the Census and March CPS is used as benchmark human capital stocks, while the number of new graduates each year is used as flows added to the current stock of human capital. Additionally, each year's stock is adjusted for mortality and net migration. Accordingly, we construct two human capital attainment-to-population ratios for each state: (16)
(B.1) $\text{high school}_{i,t} = \dfrac{h_{i,t}}{n_{i,t}} = \dfrac{(n_{i,t-1} - d_{i,t} + m_{i,t})\,\text{high school}_{i,t-1} + \tilde{h}_{i,t}}{n_{i,t}}$
(B.2) $\text{college}_{i,t} = \dfrac{c_{i,t}}{n_{i,t}} = \dfrac{(n_{i,t-1} - d_{i,t} + m_{i,t})\,\text{college}_{i,t-1} + \tilde{c}_{i,t}}{n_{i,t}}$,
where $h_{i,t}$ is the total number of individuals with at least a high school diploma in state $i$ for year $t$, $c_{i,t}$ is the total number of individuals with at least a baccalaureate or first professional degree, $n_{i,t}$ is the total population, $\tilde{h}_{i,t}$ is the number of new high school graduates, $\tilde{c}_{i,t}$ is the number of new bachelor or first professional degrees conferred, $d_{i,t}$ is the number of deaths, and $m_{i,t}$ is net migration (the number of new arrivals into a state minus the number that have left the state).
The assumption in Equations (B.1) and (B.2) is that the number of deaths and net migration are independent of the level of schooling attained. Though not entirely accurate, this assumption is necessary given data limitations and is similar to the assumption made by Barro and Lee (1993, 1996, 2000).
Net migration ($m_{i,t}$) is not known on an annual basis for each state but may be inferred, since the change in population from period $t-1$ to period $t$ must equal the number of new births ($b_{i,t}$) minus the number of deaths plus net migration:
(B.3) $n_{i,t} - n_{i,t-1} = b_{i,t} - d_{i,t} + m_{i,t}$.
Rearranging Equation (B.3) gives $n_{i,t-1} - d_{i,t} + m_{i,t} = n_{i,t} - b_{i,t}$; substituting into Equations (B.1) and (B.2),
(B.4) $\text{high school}_{i,t} = \dfrac{(n_{i,t} - b_{i,t})\,\text{high school}_{i,t-1} + \tilde{h}_{i,t}}{n_{i,t}}$
(B.5) $\text{college}_{i,t} = \dfrac{(n_{i,t} - b_{i,t})\,\text{college}_{i,t-1} + \tilde{c}_{i,t}}{n_{i,t}}$.
Equations (B.4) and (B.5) may then be used to construct forward-flow and backward-flow estimates of human capital attainment for the missing cells (65.2% of the sample). (17) As a rule, we choose the flow estimate that starts from the nearest Census or March CPS benchmark. For the year 1967, for example, the backward-flow estimate is used since the 1970 Census benchmark is closer than the 1960 benchmark. In years that are equidistant from the two surrounding benchmarks (e.g., 1965), an average of the two estimates is used.
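The forward-flow and backward-flow rule can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code used to build the panel: the function names and the dictionary-based inputs (population `n`, live births `b`, and new graduates `grads`, each keyed by year) are hypothetical, and the backward flow is taken to be Equation (B.4) solved for the previous year's ratio.

```python
def forward_flow(prev_ratio, n_t, b_t, grads_t):
    """Equation (B.4): attainment ratio in year t from the ratio in year t-1."""
    return ((n_t - b_t) * prev_ratio + grads_t) / n_t


def backward_flow(ratio_t, n_t, b_t, grads_t):
    """Equation (B.4) solved for the attainment ratio in year t-1."""
    return (n_t * ratio_t - grads_t) / (n_t - b_t)


def fill_between(start_year, start_ratio, end_year, end_ratio, n, b, grads):
    """Estimate the ratio for every year strictly between two benchmarks.

    n, b, and grads are dicts keyed by year (population, live births,
    and new graduates); they must cover start_year + 1 through end_year.
    """
    # Forward flow: roll the earlier benchmark ahead one year at a time.
    fwd = {start_year: start_ratio}
    for t in range(start_year + 1, end_year):
        fwd[t] = forward_flow(fwd[t - 1], n[t], b[t], grads[t])

    # Backward flow: roll the later benchmark back one year at a time.
    bwd = {end_year: end_ratio}
    for t in range(end_year, start_year + 1, -1):
        bwd[t - 1] = backward_flow(bwd[t], n[t], b[t], grads[t])

    filled = {}
    for t in range(start_year + 1, end_year):
        to_start, to_end = t - start_year, end_year - t
        if to_start < to_end:
            filled[t] = fwd[t]          # earlier benchmark is closer
        elif to_end < to_start:
            filled[t] = bwd[t]          # later benchmark is closer
        else:
            filled[t] = 0.5 * (fwd[t] + bwd[t])  # equidistant: average
    return filled


if __name__ == "__main__":
    years = range(1961, 1971)
    n = {t: 1000.0 for t in years}      # made-up population
    b = {t: 20.0 for t in years}        # made-up live births
    grads = {t: 30.0 for t in years}    # made-up new graduates
    print(fill_between(1960, 0.40, 1970, 0.60, n, b, grads))
```

With benchmarks in 1960 and 1970, the sketch uses the forward flow for 1961-1964, the backward flow for 1966-1969, and the average of the two for 1965, which matches the 1967 and 1965 examples in the text.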
To evaluate the accuracy of our perpetual inventory method, we estimate attainment over the period 1979-2004 using only the Census benchmark information (1980, 1990, and 2000) and compare these values to the actual attainment information provided in the March CPS. The root mean square error between the actual and estimated values is 0.022 for high school attainment and 0.013 for college attainment. Following Barro and Lee (1993), we can further assess this accuracy using the Theil U statistic, a measure bounded between 0 and 1, with larger values indicating poorer forecasting performance. Over this period, the Theil U statistic for high school attainment is 0.043, substantially lower than the values for the secondary attainment measures of Barro and Lee (0.14-0.36). Similarly, the Theil U statistic for college attainment is 0.087, lower than the values for the higher attainment measures of Barro and Lee (0.1043.25). Both values indicate that the two attainment measures provide a good fit for the period sampled, though the high school attainment measure performs better than the college attainment measure. It is plausible that this difference in relative performance reflects the greater mobility of college graduates vis-a-vis high school graduates, a tendency we are unable to account for.
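For readers who want to reproduce this kind of accuracy check, the Python sketch below computes a root mean square error and a Theil U statistic for paired actual and estimated values. The text does not spell out the formula, so the sketch assumes the common bounded form of Theil's inequality coefficient (the RMSE divided by the sum of the root mean squares of the two series), which lies between 0 and 1; the function names are hypothetical.

```python
import math


def rmse(actual, estimated):
    """Root mean square error between paired actual and estimated values."""
    if len(actual) != len(estimated) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(a - e) ** 2 for a, e in zip(actual, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def theil_u(actual, estimated):
    """Theil U (assumed bounded form): 0 is a perfect fit, 1 is the worst case."""
    root_mean_sq_actual = math.sqrt(sum(a * a for a in actual) / len(actual))
    root_mean_sq_est = math.sqrt(sum(e * e for e in estimated) / len(estimated))
    denominator = root_mean_sq_actual + root_mean_sq_est
    if denominator == 0:
        raise ValueError("both series are identically zero")
    return rmse(actual, estimated) / denominator


if __name__ == "__main__":
    cps_actual = [0.50, 0.52, 0.55]      # made-up attainment ratios
    inventory_est = [0.49, 0.53, 0.54]   # made-up estimates
    print(rmse(cps_actual, inventory_est), theil_u(cps_actual, inventory_est))
```

Because the U statistic scales the error by the size of the series, a series with a smaller RMSE can still have a larger U when its levels are lower, which is the pattern reported above for college attainment relative to high school attainment.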
ABBREVIATIONS
ARDL: Autoregressive Distributed Lag
BEA: Bureau of Economic Analysis
CPS: Current Population Survey
FE: Fixed Effects
IRS: Internal Revenue Service
MG: Mean Group
NBER: National Bureau of Economic Research
SBC: Schwarz Bayesian Criterion
REFERENCES
Aghion, P., and P. Bolton. "A Theory of Trickle-Down Growth and Development." Review of Economic Studies, 64, 1997, 151-72.
Akhand, H., and H. Liu. "Income Inequality in the United States: What the Individual Tax Files Say." Applied Economics Letters, 9, 2002, 255-59.
Banerjee, A. V., and E. Duflo. "Inequality and Growth: What Can the Data Say?" Journal of Economic Growth, 8, 2003, 267-99.
Barro, R. J. "Inequality and Growth in a Panel of Countries." Journal of Economic Growth, 5, 2000, 5-32.
Barro, R. J., and J.-W. Lee. "International Comparisons of Educational Attainment." Journal of Monetary Economics, 32, 1993, 363-94.
--. International Data on Educational Attainment: Updates and Implications. Center for International Development Working Paper No. 42, Center for International Development at Harvard University, 2000.
--. "International Measures of Schooling Years and Schooling Quality." American Economic Review, 86, 1996, 218-23.
Blackburne, E. F., and M. W. Frank. "Estimation of Nonstationary Heterogeneous Panels." The Stata Journal, 7, 2007, 197-208.
Cowell, F. A. Measuring Inequality. London School of Economics Handbooks in Economics Series. 2nd ed. New York: Prentice Hall, Harvester Wheatsheaf, 1995.
Cowell, F. A., and F. Mehta. "The Estimation and Interpolation of Inequality Measures." Review of Economic Studies, 49, 1982, 273-90.
Deininger, K., and L. Squire. "A New Data Set Measuring Income Inequality." World Bank Economic Review, 10, 1996, 565-91.
Forbes, K. J. "A Reassessment of the Relationship between Inequality and Economic Growth." American Economic Review, 90, 2000, 869-87.
Galor, O., and O. Moav. "From Physical to Human Capital Accumulation: Inequality and the Process of Development." Review of Economic Studies, 71, 2004, 1001-26.
Galor, O., and J. Zeira. "Income Distribution and Macroeconomics." Review of Economic Studies, 60, 1993, 35-52.
Goldin, C., and R. A. Margo. "The Great Compression: The Wage Structure in the United States at Mid-Century." Quarterly Journal of Economics, 107, 1992, 1-34.
Gottschalk, P. "Inequality, Income Growth, and Mobility: The Basic Facts." Journal of Economic Perspectives, 11, 1997, 21-40.
Hadri, K. "Testing for Stationarity in Heterogeneous Panel Data." Econometrics Journal, 3, 2000, 148-61.
Kao, C. "Spurious Regression and Residual-Based Tests for Cointegration in Panel Data." Journal of Econometrics, 90, 1999, 1-44.
Krueger, A. B. "Inequality, Too Much of a Good Thing," in Inequality in America: What Role for Human Capital Policies?, edited by J. J. Heckman and A. B. Krueger. Cambridge: MIT Press, 2003, 1-75.
Krueger, A. B., and M. Lindahl. "Education for Growth: Why and For Whom?" Journal of Economic Literature, 39, 2001, 1101-36.
Levy, F., and R. J. Murnane. "U.S. Earnings Levels and Earnings Inequality: A Review of Recent Trends and Proposed Explanations." Journal of Economic Literature, 30, 1992, 1333-81.
Li, H., L. Squire, and H.-F. Zou. "Explaining International and Intertemporal Variations in Income Inequality." Economic Journal, 108, 1998, 26-43.
Panizza, U. "Income Inequality and Economic Growth: Evidence from American Data." Journal of Economic Growth, 7, 2002, 25-41.
Partridge, M. D. "Is Inequality Harmful for Growth? Comment." American Economic Review, 87, 1997, 1019-32.
--. "Does Income Distribution Affect U.S. State Economic Growth?" Journal of Regional Science, 45, 2005, 336-94.
Pedroni, P. "Panel Cointegration: Asymptotic and Finite Sample Properties of Pooled Time Series Tests, with an Application to the PPP Hypothesis." Indiana University Working Papers in Economics, 1995.
--. "Panel Cointegration: Asymptotic and Finite Sample Properties of Pooled Time Series Tests with an Application to the PPP Hypothesis." Econometric Theory, 20, 2004, 597-625.
Perotti, R. "Political Equilibrium, Income, Distribution, and Growth." Review of Economic Studies, 60, 1993, 755-76.
Pesaran, M. H., Y. Shin, and R. P. Smith. "Pooled Mean Group Estimation of Dynamic Heterogeneous Panels." Journal of the American Statistical Association, 94, 1999, 621-34.
Pesaran, M. H., and R. Smith. "Estimating Long-Run Relationships from Dynamic Heterogeneous Panels." Journal of Econometrics, 68, 1995, 79-113.
Piketty, T., and E. Saez. "Income Inequality in the United States, 1913-1998." Quarterly Journal of Economics, 118, 2003, 1-39.
--. "The Evolution of Top Incomes: A Historical and International Perspective." American Economic Review, 96, 2006, 200-05.
--. "Response by Thomas Piketty and Emmanuel Saez to: The Top 1% ... of What? By Alan Reynolds." Accessed December 20, 2006. Available online at http://elsa.berkeley.edu/~saez/.
Quah, D. "Some Simple Arithmetic on How Income Inequality and Economic Growth Matter." LSE Economics Working Paper, 2001.
Reynolds, A. "The Top 1% ... of What?" The Wall Street Journal, December 14, 2006, A20.
--. "Has U.S. Income Inequality Really Increased?" Policy Analysis No. 586. CATO Institute, 2007.
Sylwester, K. "Income Inequality, Education Expenditures, and Growth." Journal of Development Economics, 63, 2000, 379-98.
Vandenbussche, J., P. Aghion, and C. Meghir. "Growth, Distance to Frontier and Composition of Human Capital." Journal of Economic Growth, 11, 2006, 97-127.
Voitchovsky, S. "Does the Profile of Income Inequality Matter for Economic Growth?: Distinguishing between the Effects of Inequality in Different Parts of the Income Distribution." Journal of Economic Growth, 10, 2005, 273-96.
MARK W. FRANK
I am grateful for the insights offered by Edward F. Blackburne, Donald G. Freeman, and Hiranya K. Nath, and the comments of two anonymous referees. I also thank David Mathews, Jennifer George, Swapna Easwar, Gabi Eissa, and Sadaf Monam for their excellent assistance in the construction of the inequality and human capital data. Thanks also to Charles Hicks at the Internal Revenue Service for graciously providing state-level individual tax filing data for the years 1982-1987. I am grateful for the financial support provided by the Faculty Research Grant from the College of Business Administration at Sam Houston State University. All errors remain my responsibility.
Frank: Associate Professor, Department of Economics and International Business, Sam Houston State University, Box 2118, Huntsville, TX 77341. Phone 1-936-294-4890, Fax 936-294-3488, E-mail firstname.lastname@example.org.
Online Early Publication April 8, 2008
(1.) This new panel of annual state-level income inequality measures may be obtained online at http://www.shsu.edu/~eco_mwf/inequality.html.
(2.) The IRS does not, however, provide a meaningful separation of these income sources for each income group at the state level. Hence, unlike Piketty and Saez (2003), we are unable to assess the relative impact of changes in each income source (wages and salaries, capital, or entrepreneurial) on income inequality.
(3.) The Theil U statistic varies between 0 and 1 and is analogous to an [R.sup.2] measure, though large values indicate poor performance.
(4.) Additionally, Reynolds (2006, 2007) has argued that the IRS-based income share measure for the top 1% reported in Piketty and Saez (2003) is seriously flawed, an assessment strongly disputed by Piketty and Saez (2006).
(5.) We use the BEA's calculation of per capita state income instead of an IRS-based measure of state income per capita because IRS data are based on tax units, not individuals. Under current tax law, for example, a tax unit can be defined as a married couple living together or as a single adult. Moreover, each tax unit may or may not have dependents.
(6.) We tested the null hypothesis of stationarity with three Hadri (2000) residual-based Lagrange multiplier tests applicable to panel data with homoskedastic, heteroskedastic, or serially dependent error processes. The three test statistics for log of real income per capita are 116.4, 109.3, and 12.2, respectively. The three test statistics for the top decile share of income are 167.4, 161.5, and 18.6, respectively. For high school attainment, the test statistics are 108.6, 109.1, and 9.7. For college attainment, the test statistics are 160.8, 154.6, and 17.1. Each of the above tests is statistically significant at the 1% level.
(7.) We evaluated each of the data series presented in Figures I and II of Piketty and Saez (2003): income shares for the top 90%-100% (P90-100), the top 90%-95% (P90-95), the top 95%-99% (P95-99), and the top 99%-100% (P99-100). Under the null hypothesis of a unit root, the augmented Dickey-Fuller test statistic for P90-100 is -1.096, for P90-95 the test statistic is -2.814, for P95-99 the test statistic is -1.765, and for P99-100 the test statistic is -1.810. Each of these test statistics is insignificant at the 5% level (5% critical value = -2.908), indicating the presence of a unit root in each series. Phillips-Perron unit root tests also indicate nonstationarity in each of the four series.
(8.) The Kao (1999) test statistic for the null hypothesis of no cointegration between log real income per capita and the top decile share of income is -6.44, while the Pedroni (1995, 2004) test statistic is -16.97. The Kao test statistic for no cointegration between log real income per capita, top decile share of income, high school attainment, and college attainment is -3.36, while the Pedroni test statistic is -15.38. Each of the above tests is statistically significant at the 1% level.
(9.) Li, Squire, and Zou (1998) show that about 91% of the total variation in the Deininger and Squire (1996) panel of international gini coefficients can be explained by variations across countries. Perhaps unsurprisingly given the prominent rise in income inequality within each state over the last half century (Figure 2), the opposite appears true with U.S. state-level data: about 78% of the variation in the top income decile can be explained by variations over time, while only 12% is explained by cross-sectional variations.
(10.) To estimate the FE, MG, and pooled MG estimators using Stata, see Blackburne and Frank (2007).
(11.) These data are taken from the Regional Economic Accounts data available at the Web site of the BEA.
(12.) Notice also that the two short-run human capital coefficients are always negatively signed, as one would expect given the temporal trade-off inherent in educational investment.
(13.) The source of income (wages and salaries, capital, or entrepreneurial) might also be a contributing factor in this estimated relationship. Piketty and Saez (2003) show, for example, that those in the upper end of the distribution derive their income disproportionately from capital. However, since the IRS does not separate these income sources for each income group at the state level, we are unable to assess this further.
(14.) Banerjee and Duflo (2003), for example, argue the political-economy mechanism (whereby voters respond to increased inequality with harmful redistribution) is limited by important time lags, leading to substantial differences between the short-run and long-run inequality-growth relationship.
(15.) For the years 1945-1973 and 1975-1981, the data are available in the Statistics of Income, Individual Income Tax Returns annual series. The 1974 volume of this series was never published, but the data are available from the 1974 edition of Statistics of Income: Small Area Data. Data for the years 1982-1987 were tabulated by the IRS but never included in any of the publicly available IRS publications. Upon our request, however, Charles Hicks with the IRS graciously provided the data. For the years 1988-2004, the data are available in the Statistics of Income Bulletin quarterly series.
(16.) The population 25 and older, a more intuitive denominator, is not available annually at the state level for the entirety of the sample period.
(17.) For the years 1963-2004, the annual number of college graduates (bachelor's and first professional degrees) and public high school graduates are available from annual issues of the Digest of Educational Statistics and the Statistical Abstract of the United States. For the years 1945-1962, the Biennial Survey of Education and the Statistical Abstract of the United States are used. The years 1953, 1955, and 1961 were undocumented and thus had to be linearly interpolated. The number of live births and total population are available from annual issues of the Statistical Abstract of the United States.
TABLE 1
Descriptive Statistics of the Variables (2004 = 100)

Variable | Mean | Standard Deviation | Minimum Annual Mean (Year) | Maximum Annual Mean (Year)
Top decile share of income | 0.336 | 0.046 | 0.282 (1956) | 0.43 (2000)
Top 1% share of income | 0.096 | 0.036 | 0.047 (1974) | 0.172 (2000)
Gini coefficient | 0.493 | 0.062 | 0.408 (1956) | 0.631 (2004)
Atkinson index, $\epsilon$ = 0.5 | 0.197 | 0.037 | 0.156 (1953) | 0.280 (2000)
Atkinson index, $\epsilon$ = 1.5 | 0.514 | 0.054 | 0.429 (1947) | 0.618 (2000)
Theil entropy index | 0.478 | 0.145 | 0.346 (1953) | 0.760 (2000)
Real state income per capita | 20,260 | 7,411 | 10,320 (1949) | 31,908 (2004)
High school attainment | 0.351 | 0.142 | 0.095 (1945) | 0.564 (2004)
College attainment | 0.084 | 0.051 | 0.029 (1945) | 0.175 (2004)
Farming (x1,000) | 386.4 | 537.3 | 285.5 (1987) | 511.0 (1946)
Agriculture services (x1,000) | 239.2 | 505.3 | 45.5 (1945) | 689.7 (2000)
Mining (x1,000) | 694.9 | 1,444.4 | 475.1 (1945) | 1,275.6 (1971)
Construction (x1,000) | 3,228.5 | 4,088.1 | 640.6 (1945) | 6,031.4 (1971)
Manufacturing (x1,000) | 14,450.2 | 17,822.3 | 354.5 (1946) | 18,920.0 (2000)
Transportation (x1,000) | 4,153.3 | 5,136.1 | 2,166.0 (1947) | 7,047.8 (2000)
Trade (x1,000) | 9,723.8 | 12,733.1 | 3,178.6 (1945) | 17,671.8 (2000)
Finance, Insurance, and Real Estate (x1,000) | 4,019.0 | 8,095.8 | 708.9 (1945) | 10,385.6 (2004)
Services (x1,000) | 11,946.0 | 21,644.3 | 1,885.3 (1945) | 37,100.6 (2004)
Government (x1,000) | 10,550.8 | 13,866.8 | 2,722.4 (1947) | 18,682.3 (2004)

TABLE 2
Dynamic Panel Error-Correction Estimates of Economic Growth and Income Inequality

Lag structure | ARDL (1,1) | ARDL (1,1) | ARDL (1,1) | SBC (3,3) | SBC (3,3)
Estimator | Dynamic FE | Pooled MG | MG | Pooled MG | MG
Column | 1 | 2 | 3 | 4 | 5
Adjustment coefficient ($\phi_i$) | -0.159** (0.010) | -0.593* (0.039) | -0.633** (0.039) | -0.650* (0.046) | -0.682** (0.046)
Long-run coefficients ($\theta_i$)
Top decile | 0.763** (0.248) | 0.785** (0.089) | 0.652** (0.233) | 0.484** (0.067) | 0.431* (0.178)
High school | 1.516** (0.206) | 0.058 (0.076) | 0.166 (0.161) | 0.058 (0.055) | 0.241 (0.168)
College | 0.439 (0.318) | 0.166 (0.123) | 0.564* (0.255) | 0.144 (0.082) | 0.479* (0.208)
Short-run coefficients ($\beta_{ij}$)
Top decile | 0.153* (0.060) | -0.217* (0.055) | -0.188** (0.062) | -0.036 (0.039) | -0.113 (0.075)
High school | -0.015 (0.063) | -0.066 (0.067) | -0.117 (0.084) | -0.026 (0.024) | -0.079 (0.054)
College | -0.127** (0.092) | -0.049 (0.050) | -0.114 (0.085) | -0.024 (0.023) | -0.032 (0.058)
Farming | -0.007** (0.002) | 0.004 (0.006) | 0.010 (0.006) | 0.004 (0.006) | 0.007 (0.006)
Agriculture | -0.006** (0.001) | 0.000 (0.004) | -0.003 (0.004) | 0.001 (0.004) | 0.000 (0.004)
Mining | -0.004* (0.002) | 0.010 (0.006) | 0.008 (0.006) | 0.008 (0.007) | 0.009 (0.007)
Construction | 0.019** (0.003) | 0.031** (0.005) | 0.031** (0.006) | 0.034** (0.006) | 0.033** (0.006)
Manufacturing | 0.008** (0.003) | 0.056** (0.016) | 0.058** (0.016) | 0.061** (0.015) | 0.063** (0.014)
Transportation | 0.011 (0.005) | -0.029 (0.015) | -0.029 (0.017) | -0.042* (0.014) | -0.046* (0.018)
Trade | 0.005 (0.008) | -0.005 (0.033) | 0.005 (0.039) | 0.003 (0.033) | 0.011 (0.041)
Finance, Insurance, and Real Estate | -0.009 (0.005) | -0.040* (0.019) | -0.041* (0.021) | -0.043* (0.020) | -0.042 (0.021)
Services | 0.006 (0.003) | 0.029 (0.015) | 0.026 (0.015) | 0.033 (0.016) | 0.027 (0.015)
Government | -0.023** (0.004) | -0.023* (0.009) | -0.025* (0.010) | -0.027* (0.010) | -0.025* (0.011)

Notes: Standard errors are in parentheses. All variables are mean-differenced. * and ** indicate significance at the 5% and 1% levels, respectively.

TABLE 3
Alternative Inequality Measures of Economic Growth and Income Inequality

Inequality measure | Top 1% | Top 90%-99% | Gini Coefficient | Atkinson Index, $\epsilon$ = 0.5 | Atkinson Index, $\epsilon$ = 1.5 | Theil Entropy Index
Column | 1 | 2 | 3 | 4 | 5 | 6
Adjustment coefficient ($\phi_i$) | -0.588** (0.038) | -0.582** (0.036) | -0.581** (0.036) | -0.585* (0.036) | -0.582** (0.036) | -0.593* (0.037)
Long-run coefficients ($\theta_i$)
Income inequality | 0.703** (0.094) | 0.089 (0.091) | 0.061** (0.071) | 0.854** (0.108) | 0.217** (0.053) | 0.278** (0.028)
High school | 0.036 (0.078) | 0.000 (0.079) | -0.009 (0.077) | 0.044 (0.077) | 0.033 (0.078) | 0.055 (0.074)
College | 0.224 (0.126) | 0.188 (0.128) | 0.228 (0.126) | 0.248* (0.125) | 0.196 (0.128) | 0.260* (0.120)

Notes: Standard errors are in parentheses. All variables are mean-differenced. Short-run coefficients are included in each estimation but not reported in the table. Each model is estimated using the dynamic panel error-correction pooled MG estimator. * and ** indicate significance at the 5% and 1% levels, respectively.
|Author:||Frank, Mark W.|
|Article Type:||Statistical data|
|Date:||Jan 1, 2009|