# Time spent unemployed: a new look at data from the CPS.

In July 1983, 8 months after the unemployment rate peaked during the 1981-82 recession, the published mean duration of unemployment figure reached 21.2 weeks. In October 1986, 47 months into the recovery, unemployment duration had fallen to 15.2 weeks. Although these numbers move in the expected direction, do they really provide an accurate portrayal of the time individuals spend unemployed? In a 1970 article in the Review, Hyman B. Kaitz considered the question of how long a person remains unemployed "on average." He concluded that it was "a simple question, yet one that cannot be easily answered despite the wealth of data available."1 Some reasons for this difficulty are tied to the choices of data and statistical techniques used for estimating unemployment duration. Other reasons reflect basic disagreements among economists as to what constitutes the best measure of the average time individuals remain unemployed.

For example, many of the earliest articles written on unemployment duration concentrated on the fact that the published statistics measure the average age of unemployment spells among the currently unemployed; that is, the survey interrupts spells which are in progress. As a result, the statistics do not show the average completed length of spells or average total time unemployed for these individuals. A consensus emerged from these studies that the average total time spent unemployed should be measured. What has yet to emerge is a consensus as to how to accomplish this goal.

This article examines the conceptual and empirical problems encountered in selecting the most appropriate measure of the average total time an individual remains unemployed, or the duration of a completed spell of unemployment. The discussion of these problems helps set the stage for the focus of our analysis: a comparison of different methods of using data from the Current Population Survey (CPS) to construct estimates of unemployment duration. Two sources of data are considered: published cross-sections of time unemployed and unpublished listings of time unemployed by single weeks of unemployment. In conducting this comparison, we find that the unpublished data permit development of new and robust estimates of average time spent unemployed; in particular, cyclically sensitive estimates can be developed monthly.

Conceptual and empirical problems

In any study of the duration of unemployment, two questions are either implicitly or explicitly addressed. First, which group of individuals is to be used to construct an estimate of the average length of time members of the group remain unemployed? Is it the group of currently unemployed individuals? Or perhaps individuals who only recently became unemployed? Numerous other choices exist, each providing a different picture of the dynamics of the labor market.

Second, how can the available data be used to construct estimates of the average total time unemployed among members of the chosen group? As noted previously, the CPS does not measure the total time individuals remain unemployed before finding employment or withdrawing from the labor force. Rather, the survey records the amount of time individuals have remained unemployed up to the survey reference week. Hence, the measure we desire to estimate, average total time unemployed, must be inferred from the type of information available, current age of unemployment spells. Since available data differ from the required data, what assumptions are needed that permit such inferences to be made?

Most studies of duration have used published CPS cross-sectional data and have attempted to measure the average total time unemployed for newly unemployed individuals. These studies often required the assumption of a constant or steady state level of unemployment. Studies by Hyman Kaitz in 1970 and Stephen Salant in 1977(2) are largely representative of work in this area. Both studies assume steady state flows to generate empirical estimates.

Since publication of the Kaitz and Salant articles, a wide range of approaches have been adopted for estimating duration.3 These approaches include relaxing the assumption of steady state flows, constructing duration measures for groups other than newly unemployed individuals, and using alternative sources of data. Through these efforts, a consensus has emerged, although held in varying degrees. First, the average total time unemployed should be measured without assuming steady state flows. Second, in reporting and interpreting monthly duration statistics, no single group of individuals is the preferred group for analysis; each choice simply reflects a different aspect of the underlying dynamics of the labor market.

The justification for the first point would appear obvious; the assumption that the unemployment rate is constant over time is simply too much at variance with the real world of cyclical unemployment rates. In terms of conventional economic methodology, however, differences that arise between estimates derived from methods requiring such an assumption and those that do not must be carefully examined. For this reason, steady state methods are included in our study and we compare the results with those generated by our nonsteady state methods.

The second point is potentially harder to justify. The choice of any group of individuals leads to the exclusion of other individuals; the impact of current economic conditions on the group of newly unemployed individuals may be far different than the impact on the currently unemployed. Could not one group of individuals be more "representative" of labor market conditions than any other? Our contention is that, at any given survey date, the answer to this question is no. An example may help to clarify this issue.

Suppose that every week, 10 individuals become unemployed and remain unemployed for exactly 1 week. Also suppose that every week, 1 individual becomes unemployed and remains unemployed for exactly 10 weeks. Assuming this process has been continuing indefinitely, in any given week you would find 20 individuals currently unemployed, 10 of whom will each experience a total time unemployed of 1 week, and 10 of whom will each experience a total time unemployed of 10 weeks. The average total time unemployed experienced by these 20 individuals is 5.5 weeks. At the same time, for any given week, there are 11 individuals who have become newly unemployed. The average total time unemployed experienced by these individuals is 1.8 weeks.
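The arithmetic in this example can be checked with a short sketch. The flows (10 one-week spells and 1 ten-week spell beginning each week) come from the example itself; the function and variable names are illustrative assumptions, not part of the original analysis.

```python
from fractions import Fraction

# Weekly inflows from the example: (number entering each week, completed spell length in weeks).
inflows = [(10, 1), (1, 10)]

def mean_duration_newly_unemployed(flows):
    """Average completed spell length among one week's entrants."""
    entrants = sum(n for n, _ in flows)
    total_weeks = sum(n * length for n, length in flows)
    return Fraction(total_weeks, entrants)

def mean_duration_currently_unemployed(flows):
    """Average completed spell length among the steady state stock.

    In a steady state, n weekly entrants with spells of `length` weeks
    contribute n * length people to the stock observed in any given week.
    """
    stock = sum(n * length for n, length in flows)
    total_weeks = sum(n * length * length for n, length in flows)
    return Fraction(total_weeks, stock)

stock_size = sum(n * length for n, length in inflows)
new_mean = mean_duration_newly_unemployed(inflows)
current_mean = mean_duration_currently_unemployed(inflows)

print(stock_size)                 # 20 people unemployed in any week
print(round(float(new_mean), 1))  # 1.8 weeks
print(float(current_mean))        # 5.5 weeks
```

The stock weights each spell by its length, which is why the average for the currently unemployed is so much higher than the average for the newly unemployed even though both describe the same underlying process.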

Currently versus newly unemployed

Which statistic is more indicative of labor market conditions? Each group offers a partial glimpse of the dynamics underlying the labor market, and understanding the differences between various choices is critical. The currently unemployed are the remaining members of previous newly unemployed groups. By definition, the currently unemployed do not include members of previous inflow groups who have either found employment or withdrawn from the labor force by the time of the survey. The unemployment experience of these individuals will not be captured by calculating the duration of unemployment of individuals currently unemployed at a survey date.

The previous discussion indicates that at each survey date, no single group of individuals is preferred for measuring duration. However, the analysis does not provide an indication as to the consequences of choosing a particular group concept and measuring duration for that choice each month over an extended period of time. Because one focus of this study is the behavior of duration statistics over business cycles, serious consideration must be given to our choice. In particular, we chose to measure average total time unemployed for groups of newly unemployed individuals. Our justification of this selection is quite simple. Over time, this choice theoretically captures all individuals who become unemployed, and, unlike examining the currently unemployed each month, examining newly unemployed individuals does not result in groups that include the same individuals over successive periods. That is, if we were to measure the average total time unemployed for individuals currently unemployed in January and then repeat the exercise for the currently unemployed in February, many of the same individuals would be included in both measures. Using the newly unemployed group in each month, however, does not present this problem.

Given this choice, we make comparisons between different methods of using CPS data to estimate unemployment duration. Attention in this study is limited to constructing monthly estimates of the average duration of completed spells for newly unemployed groups in a nonsteady state environment. Within the selected framework, alternative estimating techniques are considered, and criteria for judging their efficacy are developed.

The next several sections of this article are confined to measuring the duration of unemployment for the group of newly unemployed individuals in January 1979. First, we discuss the two techniques (Kaitz and Salant) regarded as representative of work using cross-sectional data. Next, we examine the use of combined cross-sectional data using a parametric estimating technique.

One of the more difficult tasks in this research area is the selection among competing choices of techniques for estimating the average length of time an individual remains unemployed. The estimation of duration has been termed as much an "art as science,"4 and as such, serious consideration must be given to the development of criteria within which reasonable choices of techniques can be made. Using such a framework, both cross-sectional and combined cross-sectional methods are applied in the last section to the business cycles of the 1967-82 period. Monthly duration estimates are constructed, permitting an examination of the sensitivity of the chosen techniques to business cycle turning points.

Cross-sectional data

The duration concept is based on answers to the CPS survey question: "How many weeks has . . . been looking for work?" The resulting statistic measures the average age of unemployment spells among the currently unemployed. The answers to the duration question are published using the following seasonally adjusted groupings of weeks unemployed: [less than 5 weeks], [5 to 14 weeks], [15 to 26 weeks], and [27 weeks and over]. Two features of this statistic are important: first, the current length in weeks of any spell is an underestimate of eventual completed spell length; and second, the group under consideration is the currently unemployed. This group is made up of the remaining members of all previous newly unemployed groups. To the extent that the composition of a group changes as some of its original members leave unemployment, measurable differences may be observed between newly and currently unemployed groups at any point in time. Kaitz and Salant demonstrated that published CPS duration statistics which provide the average length in weeks of currently unemployed individuals overestimate the average completed length of unemployment spells of newly unemployed groups.

The Kaitz and Salant studies adopted the assumption of steady state flows; a constant level of total unemployment which is accompanied by a constant level of inflow into and exit out of unemployment. Although the steady state assumption runs counter to the objectives of this present research, we consider these two methods in detail because these pioneering studies framed the context within which current discussions take place, and because these studies can be applied directly and easily to a nonsteady state environment.

In a steady state world, the intersection of a survey and an unemployment spell is random so that, on average, spells are halfway through their complete length when caught by the survey. However, a survey is more likely to capture longer spells, so that relative to the average completed spell lengths of the newly unemployed, the average spell age of currently unemployed individuals may be longer. The empirical work of Kaitz and Salant demonstrated the latter effect dominates the former.

Kaitz's method. In using cross-sectional data, methods are needed which allow the inference of the total time newly unemployed individuals will remain unemployed from the cross-sectional data on the current age of spells. The steady state is an attractive assumption in this regard, because it provides direct and easily calculated relationships between point-in-time information on spell ages and longitudinal estimates of completed spell lengths. In particular, in generating his duration estimates, Kaitz relied on the following result from the steady state model:5

D = U/F

where D is the expected duration of unemployment for the newly unemployed group; U is the level of unemployment; and F is the size of the newly unemployed group.

This steady state method is attractive because the two components, unemployment and inflow levels, are easily measured from cross-sectional data. Another benefit from this procedure is that it provides a theoretical justification that the newly unemployed is the proper group for analysis: namely, the expected completed spell duration of a group of newly unemployed individuals can be derived from the steady state level of unemployment and inflows.6

In January 1979, the unemployment rate equaled 5.9 percent, seasonally adjusted, and the average spell age of currently unemployed individuals from the cross-section sample was reported to be 11.1 weeks (or 2.8 survey periods), seasonally adjusted. To apply Kaitz's method, the size of the newly unemployed group in January 1979 was estimated as the seasonally adjusted number of individuals (2,791,000) reporting spell ages in the interval of [less than 5 weeks] from the published statistics.7 Given the level of total unemployment (6,109,000), this yields an expected completed spell length of 2.2 survey periods for the newly unemployed. Hence, according to Kaitz's method, in January 1979, the average spell age of currently unemployed individuals exceeded the expected completed spell length for the newly unemployed. Although age or current spell length is an obvious underestimate of completed spell length for a single individual, the overselection of longer spells in the currently unemployed group relative to the newly unemployed appears to dominate.
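As a check on the January 1979 figure, Kaitz's steady state ratio D = U/F can be evaluated directly from the two published levels quoted above. This is a minimal sketch; the function name is an illustrative assumption.

```python
# Published January 1979 figures quoted in the text (seasonally adjusted).
total_unemployed = 6_109_000  # U: level of unemployment
newly_unemployed = 2_791_000  # F: individuals reporting less than 5 weeks

def kaitz_duration(u, f):
    """Steady state expected completed spell length, D = U / F, in survey periods."""
    if f <= 0:
        raise ValueError("inflow must be positive")
    return u / f

d = kaitz_duration(total_unemployed, newly_unemployed)
print(round(d, 1))  # 2.2 survey periods
```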

Salant's method. Salant assumed that although each individual in a newly unemployed group has a constant probability of escaping unemployment each period, these probability values differ across individuals. This assumption permits development of the concept of a sorting process: individuals with the highest escape probabilities will tend to leave unemployment more quickly than those with lower probabilities. The average probability of exiting unemployment tends to fall over time as the group is increasingly made up of the lower probability individuals.

Salant uses this sorting concept to develop a precise mathematical relationship between published information on the age of spells of currently unemployed individuals and the total time spent unemployed by newly unemployed individuals.8 Salant's method requires maximizing the likelihood of observing the published breakdown of spell ages which are listed below:9

| Weeks unemployed | Number of individuals |
| --- | --- |
| [less than 5 weeks] | 2,791,000 |
| [5 to 14 weeks] | 2,003,000 |
| [15 to 26 weeks] | 717,000 |
| [27 weeks and over] | 533,000 |

Maximizing the likelihood of observing this particular pattern of spell ages yields an estimate of duration of 1.6 survey periods for the newly unemployed.10

How accurate is this estimate? To answer this question the published breakdowns of time unemployed given above can be compared with the breakdowns predicted by Salant's method. As described in footnote 11, the predicted and actual breakdowns are extremely close, providing confidence in Salant's description of the process that gives rise to the published figures.11

Comparing Kaitz and Salant

The duration estimate using Salant's method (1.6 survey periods) is much lower than the estimate using Kaitz's method (2.2 survey periods). One reason for the discrepancy is the implicit difference in the determination of the attrition rates affecting the newly unemployed group. In Kaitz's world, duration simply equals the ratio of the level of total unemployment to the level of inflow of newly unemployed individuals. Given the steady state assumption that entry and exit levels are always constant and equal, the level of total unemployment reflects the assumption of a constant attrition pattern affecting all previous newly unemployed groups:12 the greater the level of total unemployment relative to that of new unemployment, the higher the implicit proportion of individuals remaining unemployed each month after entrance. For example, if unemployment rates have passed a turning point so that inflow levels are fairly low despite slowly adjusting high total unemployment rates, Kaitz's method will produce a high average continuation rate. Conversely, if the unemployment rate is low despite a high inflow rate, Kaitz's method will result in a relatively low average continuation rate. We would expect this method of using cross-sectional data to produce lags in the response of duration statistics to business turning points.

Salant's method provides more direct information about the attrition process. By maximizing the likelihood of the observed breakdown of current spell ages, his method captures how changing business conditions affect the current sizes of spell age groups. A low inflow rate accompanying a high total unemployment rate would also be accompanied by changing attrition patterns of all previous inflow groups. Salant's method captures the reflection of these changes as they affect the cross-sectional view of the unemployed.

One way to examine the difference in the estimate of duration between these two methods is to compare the average monthly probabilities of remaining unemployed implied by the two procedures. Kaitz's method results in a constant probability value using the procedure described in footnote 5. Because Kaitz's implied probability refers to monthly attrition behavior, we used the parameter estimates resulting from Salant's method to generate estimates of the monthly average probabilities of remaining unemployed. The exact procedure is detailed in footnote 13.(13) The following tabulation displays the average monthly probabilities of remaining unemployed implied by the Kaitz and Salant procedures:

The estimate of duration using Kaitz's method is consistent with an average monthly continuation rate of .5431. Salant's probability of remaining unemployed increases over time from .4241 to .6519. Hence, Salant's method implies a much faster rate of escape over the first period of unemployment than Kaitz's. The relatively more sluggish behavior of Salant's newly unemployed group in later periods is not strong enough to cause its associated duration figure to exceed Kaitz's figure.
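The constant continuation rate attributed to Kaitz's method can be reproduced under one simple assumption. The sketch below assumes that spell lengths follow a geometric distribution, so that D = 1/(1 - p); this relationship is an assumption made here for illustration and is not a quotation of the procedure in footnote 5, although it does return the .5431 figure when applied to the published levels.

```python
def implied_continuation_rate(duration_periods):
    """Constant per-period probability of remaining unemployed implied by D.

    Assumes a geometric spell-length distribution, so that D = 1 / (1 - p).
    This is an illustrative assumption, not the article's stated procedure.
    """
    if duration_periods < 1:
        raise ValueError("duration must be at least one period")
    return 1.0 - 1.0 / duration_periods

d_kaitz = 6_109_000 / 2_791_000  # U / F from the published January 1979 levels
p_kaitz = implied_continuation_rate(d_kaitz)
print(round(p_kaitz, 4))  # 0.5431
```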

Although these comparisons are informative, a source of data permitting the construction of the original size of the January 1979 newly unemployed group and tracing its remaining sizes over time is needed to judge the efficacy of these methods. In this way, comparisons can be made between the actual attrition process and the ones implied by the cross-sectional methods. Although Salant's method indicates a good distributional fit to the observed cross-sectional data, this fit does not necessarily imply that his method provides an accurate measure of the attrition pattern over time for the January 1979 newly unemployed group. The data we have in mind are the raw data underlying the published intervals of time unemployed; these data are the focus of the next section.

Combined cross-sectional data

Underlying the published seasonally adjusted cross-sectional data are seasonally unadjusted unpublished numbers, which provide a monthly breakdown of the distribution of current spell ages by single weeks of unemployment. These data permit the construction of intervals of single weeks of unemployment, which are roughly consistent with the periodicity of the CPS survey. Therefore, seasonally adjusted estimates of the original size of a newly unemployed group as well as estimates of its remaining sizes in successive survey periods can be constructed by combining several cross-sectional data sets. The interval population values can then be used to construct the average probabilities of remaining unemployed over time for newly unemployed individuals, which in turn can be used for constructing nonsteady state estimates of the average time it takes a newly unemployed individual to leave unemployment.

Here, we use a parametric approach for deriving duration estimates using interval population values from combined cross-sectional data; this approach requires the choice of parametric form to represent the nature of attrition in the sample. We are sensitive to the criticism that such choices are often made arbitrarily and without independent verification. One of the important features of this section is the development of criteria for evaluating alternative choices of parametric forms.

Table 1 provides a selected subset of the seasonally unadjusted single-week duration data for the January-February 1979 period. As the data indicate, and as noted by Kaitz, in responding to the survey, participants tend to round their estimates of time unemployed to the nearest monthly, biannual, or annual figure creating local modes most notably at 4, 8, 12, 16, 21, 26, and 52 weeks. In using these data to construct average transition probabilities of remaining unemployed from one survey period to the next, it is necessary to choose the intervals of single weeks of unemployment for use in the construction of average probabilities of remaining unemployed.

Local mode biases. In choosing the intervals of single weeks of unemployment, the biases introduced by the local modes must be considered carefully. Consider the group of newly unemployed individuals in January 1979. The original size of this group was chosen to be the number of individuals with anywhere from 0 to 5 weeks of unemployment: this choice surrounds both sides of the local mode occurring at 4 weeks, thus capturing individuals who round either up or down to that modal point. To estimate the remaining size of this group as of the February survey, it is assumed that the number of individuals in the [5 to 9 weeks] interval in February provides a robust estimate. Using seasonally adjusted data, the average probability of remaining unemployed from January to February for this group is then calculated as the ratio of the size of the [5 to 9 weeks] group in February to the size of the [0 to 5 weeks] group in January. The other interval choices are listed in table 2, and the implied number of exits between successive survey dates are given in table 3.

Although the interval choices are an attempt to minimize the possible bias introduced by the local modes, the extent to which individuals round their estimates of time unemployed up or down to the nearest local mode cannot be determined. This fact, more than anything else, contributes to the uncertainty associated with measuring the average time individuals remain unemployed. These modal influences affect the accuracy of both cross-sectional and combined cross-sectional data. Norman Bowers and Francis Horvath14 suggest that not only do individuals tend to round their estimates of time unemployed to the nearest month, but that estimates over consecutive months are also inconsistent with the time between surveys. The authors do point out, however, that the net bias is small, because the average time unemployed between successive survey dates increases by only slightly more than the time frame of the survey. However, substantial local modes exist in the data, and judgment on the validity of the results of this study depends on one's view of the efficacy of the methods adopted to account for these modes.

Toward this end, one approach adopted in this study was to examine alternative choices of intervals of single weeks of unemployment. In examining the choices, two considerations concerning interval selection emerged as most important and deserve comment. First, the local mode at 26 weeks makes the process of determining transition probabilities near that milestone problematic. One might assume, for example, that an individual with anywhere from 17 to 21 weeks of unemployment in one month might be in the interval of [21 to 25 weeks] in the next month. To the extent that members of this group round up their estimates of time unemployed to 26 weeks (or half a year), the size of this latter interval may be biased downward, creating a downward bias in the associated transition probability of remaining unemployed. To include 26 weeks in the definition of this interval could arguably produce biases in the opposite direction.

A second consideration is that the strength of the mode at 52 weeks makes the division of current spell ages near that milestone into intervals of 4 or 5 single weeks meaningless. Therefore, the estimates of the remaining sizes of the January 1979 newly unemployed group are cut off at 33 weeks as of the August 1979 survey (the number of individuals with 29 to 33 weeks of unemployment in August 1979 is used as an estimate of the number of individuals who were newly unemployed in January 1979 and remained unemployed up to August 1979). This introduces the problem of truncation or the right-censoring of information on the completed length of unemployment spells; that is, by making the [29 to 33 weeks] interval in August 1979 the final interval, we know that individuals in this interval experienced at least 8 months of unemployment, but we do not know the lengths of their completed spells of unemployment. However, as of the August 1979 survey, the data indicate that only 2.8 percent of the January 1979 newly unemployed group remained unemployed.

One way in which the influence of these modes was examined involved constructing the range of duration estimates corresponding to a selection of alternative interval specifications; these included selections which assumed that different proportions of individuals responding 26 weeks actually belonged to the [21 to 25 weeks] interval in one month and the [25 to 29 weeks] interval in the next month. For the January 1979 newly unemployed group, varying the choices of intervals around the 26-week mark had little effect on the resulting estimates of expected duration; the impact was solely on the goodness of fit associated with each particular distributional form. The results based on alternative interval selections are available from the author on request.15

Testing Salant's sorting hypothesis

Although the data suffer from certain limitations, they also exhibit qualities which, when taken as a whole, point to a general usefulness in determining the average length of an unemployment spell. In dealing with social science survey data, the preponderance of evidence provided by the data must be considered, taking care to clearly state the criterion on which these judgments rest. In the case of repeated cross-sectional data, a natural starting point is the support these data provide of the notion that the average probability of remaining unemployed tends to rise over time. This observation was first made by Salant regarding the general trend indicated by cross-sectional data and is also a logical implication of his theory of sorting.16

In examining the transition probabilities of remaining unemployed over the length of an unemployment spell, both impressionistic and formal pieces of evidence were examined for the January 1979 newly unemployed group. On the impressionistic side, the general pattern of the transition probabilities is one smoothly rising over time. This can be seen from the following tabulation showing the average probability of remaining unemployed:

| Months | Average probability |
| --- | --- |
| January to February | P1 = .4316 |
| February to March | P2 = .5415 |
| March to April | P3 = .6003 |
| April to May | P4 = .6216 |
| May to June | P5 = .6759 |
| June to July | P6 = .8363 |
| July to August | P7 = .5734 |

The exception to this pattern is found in the behavior of P6 and P7. Transition probability P6, the average probability that a member of the January newly unemployed group will remain unemployed from June to July, jumps to a much higher value than would be expected from the trend set by P1 to P5. The value of the next transition probability, P7, is actually lower than the value of P6. To conclude that the true process is one of slowly rising average probabilities of remaining unemployed requires discounting the behavior of P6 and P7.
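The tabulated transition probabilities can be chained to show how much of the January 1979 newly unemployed group remains at each later survey. The sketch below multiplies P1 through P7 cumulatively; the function name and month labels are illustrative assumptions. The final share agrees with the 2.8 percent reported earlier for the August 1979 survey.

```python
# Observed average probabilities of remaining unemployed, P1..P7, from the tabulation.
transition_probs = [0.4316, 0.5415, 0.6003, 0.6216, 0.6759, 0.8363, 0.5734]

def surviving_shares(probs):
    """Share of the original group still unemployed after each survey interval."""
    shares = []
    remaining = 1.0
    for p in probs:
        remaining *= p
        shares.append(remaining)
    return shares

shares = surviving_shares(transition_probs)
for month, share in zip(["Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug"], shares):
    print(f"{month}: {share:.3f}")

# Share of the January 1979 group still unemployed at the August survey.
print(round(100 * shares[-1], 1))  # 2.8 percent
```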

A formal test was conducted of the null hypothesis that the probabilities of remaining unemployed are constant over time against the alternatives that the probabilities exhibit either consistent rising or falling patterns.17 Despite the behavior of transition probabilities P6 and P7, this test provided a strong indication that the average probabilities of remaining unemployed tend to rise consistently over this newly unemployed group's spell length.

Acceptance of this conclusion is important because of the guidance it provides as to the acceptable class of density functions for describing the attrition behavior of newly unemployed groups. One of the prior beliefs which influenced this judgment was the existence of a local mode at 26 weeks. However, despite this belief, one option we did not pursue was to reassign individuals around the 26 week mode and smooth transition probabilities P6 and P7 prior to estimating duration. Our reluctance in this regard was based on the notion that a better approach would be to assume that these interval populations are governed by some attrition process and the observed values are simply random draws from this process measured with error. By comparing the predicted with the actual attrition values, it is then possible to measure the influence of local modes on the goodness of fit of the chosen parametric form.

A common criticism of duration studies has been the practice of arbitrarily specifying parametric forms to describe the attrition behavior of groups, especially if the parameters of the distribution are estimated using truncated data. Combined cross-sectional data permit the systematic examination of the efficacy of our choices. Besides lending support to the choice of a class of functions for which the average probability of remaining unemployed rises over time, these data also permit independent testing of the appropriateness of each specific functional form within that class. The data permit estimating the original and remaining sizes of a newly unemployed group for eight successive periods. The idea is to truncate the data at some point, say 4 months, and test the fit of the chosen parametric form on the known excluded observations at the tail of the distribution of spell lengths. This procedure is then repeated for truncation points at the fifth and up to the eighth month of data. This procedure permits construction of a measure of the closeness of fit of the chosen parametric form to the observed data.
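The truncation test described above can be sketched as a simple holdout exercise. The version below is only an illustration under stated assumptions: it takes h(t) to be the exit probability 1 - P(t), uses a log-linear form ln h(t) = a + b ln t as the example parametric form, and measures fit by the sum of squared errors on the excluded periods. None of these choices is confirmed by the text as the procedure actually used, and the helper names are hypothetical.

```python
import math

# Observed P1..P7 for the January 1979 newly unemployed group.
observed = [0.4316, 0.5415, 0.6003, 0.6216, 0.6759, 0.8363, 0.5734]

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def holdout_pairs(probs, n_fit):
    """Fit on the first n_fit periods, then predict the excluded later periods.

    Assumption: h(t) = 1 - P(t), with the log-linear form ln h(t) = a + b ln t.
    Returns (predicted, actual) probabilities for the held-out periods.
    """
    ts = range(1, n_fit + 1)
    a, b = fit_line([math.log(t) for t in ts],
                    [math.log(1.0 - probs[t - 1]) for t in ts])
    pairs = []
    for t in range(n_fit + 1, len(probs) + 1):
        predicted = 1.0 - math.exp(a + b * math.log(t))
        pairs.append((predicted, probs[t - 1]))
    return pairs

# Repeat the exercise for successively later truncation points.
for n_fit in range(4, len(observed)):
    pairs = holdout_pairs(observed, n_fit)
    sse = sum((pred - act) ** 2 for pred, act in pairs)
    print(n_fit, round(sse, 4))
```

A form that tracks the excluded tail closely at every truncation point is less likely to be an artifact of where the data happen to be cut off.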

Estimating duration

The idea behind our method is to specify a parametric form describing the attrition of individuals out of unemployment and use the observed attrition rates to estimate the underlying true ones. We then use the estimated attrition rates to construct a measure of the expected value of completed spell lengths. The formula we employ is a discrete approximation of unemployment duration. The key to estimating duration using this formula lies in the assumption as to when individuals enter and leave unemployment between survey dates. One common assumption in the literature is that, on average, individuals enter and exit unemployment halfway between survey dates: individuals leaving unemployment between January and February are expected to experience, on average, one full period of unemployment.18 Using the term P(i) to mean the average probability of remaining unemployed the ith survey period after entrance, the expected length of a completed spell of unemployment, E(S), can be written as:

$$E(S) = 1\,(1-P_1) + 2\,P_1(1-P_2) + 3\,P_1 P_2(1-P_3) + \cdots$$
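As an illustration only, the discrete formula can be evaluated as in the following Python sketch. The function name, the example probabilities, and the closing rule for spells still in progress after the last observed period (`tail_stay_prob`) are assumptions made for this sketch and are not taken from the article.

```python
def expected_spell_length(stay_probs, tail_stay_prob=0.0):
    """Discrete expected completed spell length, in survey periods.

    stay_probs[i-1] plays the role of P(i), the average probability of
    remaining unemployed in the i-th survey period after entrance.
    tail_stay_prob is a hypothetical closing rule: the constant probability
    of remaining unemployed in every period after the last observed one
    (0.0 means every remaining spell ends in the next period).
    """
    if not 0.0 <= tail_stay_prob < 1.0:
        raise ValueError("tail_stay_prob must lie in [0, 1)")
    expected = 0.0
    still_unemployed = 1.0  # running product P(1) * ... * P(k-1)
    for k, p in enumerate(stay_probs, start=1):
        # Share of the group whose spell ends in period k, weighted by k.
        expected += k * still_unemployed * (1.0 - p)
        still_unemployed *= p
    n = len(stay_probs)
    # Geometric tail for the share still unemployed after n periods.
    expected += still_unemployed * (n + 1.0 / (1.0 - tail_stay_prob))
    return expected


# Illustrative (made-up) probabilities that rise with spell length.
example = expected_spell_length([0.55, 0.65, 0.70], tail_stay_prob=0.70)
```

With a constant probability in every period, the function reduces to 1/(1-P), the steady state result derived in footnote 5.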

In applying combined cross-sectional data to this formula, it is assumed that the observed transition probabilities reflect an underlying attrition process measured with random error. To estimate the true transition probabilities, it is necessary to regress a linear version of the chosen attrition process against time and use the fitted coefficients to predict the true transition probabilities. The closeness of the fitted to the observed transition probabilities helps to discriminate between alternative choices.

Six parametric forms were chosen to describe the attrition process out of unemployment for members of the January 1979 newly unemployed group: the Weibull, Salant, and Gompertz distributional forms; a linear and log-linear probability function; and the functional form utilized by Clark and Summers in their 1979 Brookings Papers article. Each of these chosen forms allows for rising average probabilities of remaining unemployed over time. The choices of forms of the likelihood function were based on the belief that the average probability of remaining unemployed tends to rise with any group's spell length. Another structure which is often employed is the exponential. This latter form is restrictive, because it assumes that the average probability of remaining unemployed is constant over time for any group. Although restrictive, it is instructive to include this form for comparison purposes. In each case, a linear or log-linear version of the parametric process was regressed against a function of time. Using the estimated parameters, fitted transition probabilities were calculated and used to estimate duration. The parametric forms chosen are given in the following tabulation:

Functional form: Statistical expression

Weibull: $\ln(-\ln S(t)) = a + b \ln(t)$

Clark-Summers: $h(t) = a + b \ln(t)$

Gompertz: $\ln(h(t)) = a + b\,t$

Salant: $1/h(t) = a + b\,t$

Linear form: $h(t) = a + b\,t$

Log-linear: $\ln(h(t)) = a + b \ln(t)$

Exponential: $-\ln(S(t)) = b\,t$

where t is 1, 2, 3, and so on months of unemployment; S(t) is the average probability that a member of a newly unemployed cohort remains unemployed at least t survey periods; and h(t) is the average probability that a member of a newly unemployed group remains unemployed (t-1) periods and then leaves unemployment by the t-th period.
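To make the regression step concrete, the following Python sketch fits the Weibull form by ordinary least squares, regressing ln(-ln S(t)) on ln(t). It is a minimal illustration under stated assumptions: the function names and the survival values are hypothetical, and the article does not specify the estimation details beyond the linearized forms in the tabulation.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def fit_weibull(survival):
    """Fit ln(-ln S(t)) = a + b ln(t).

    survival[t-1] is the observed S(t) for t = 1, 2, ..., each strictly
    between 0 and 1. Returns the fitted (a, b).
    """
    xs = [math.log(t) for t in range(1, len(survival) + 1)]
    ys = [math.log(-math.log(s)) for s in survival]
    return fit_line(xs, ys)


def weibull_survival(a, b, t):
    """Fitted S(t) implied by the linearized Weibull form."""
    return math.exp(-math.exp(a) * t ** b)


# Illustrative (made-up) survival shares for months 1 through 4.
observed = [0.60, 0.42, 0.31, 0.24]
a_hat, b_hat = fit_weibull(observed)
fitted = [weibull_survival(a_hat, b_hat, t) for t in range(1, 9)]
```

The fitted values for months 5 through 8 extrapolate beyond the four observations used in the fit, which is the kind of out-of-sample prediction that the truncation tests compare with the observed tail.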

Two goodness of fit statistics were constructed to judge the efficacy of these alternative forms. The first is a chi-squared measure of the squared differences between the number of actual and fitted exits and survivors applied to within-sample observations only (the within-sample chi-squared statistic); that is, if the fourth month is picked as the truncation date, a comparison is made between the predicted and actual number of exits in the first 3 months as well as a comparison of predicted and actual survivors (the number of individuals still unemployed) in the fourth month.

The second chi-squared statistic measures the fit between the observed and predicted observations before and after the selected point of truncation (the full-sample chi-squared statistic). Suppose that the parameter estimates are based on exits between the first and fourth months, with those remaining unemployed in the fourth month treated as survivors. The full-sample chi-squared statistic is based on using the resulting parameter estimates to predict the number of exits up to the eighth month and the number of survivors at that date. For both chi-squared statistics, the point of truncation is varied between 4 and 8 months.
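The two statistics differ only in which cells are compared. The Python sketch below shows one way the cells might be assembled and scored. It assumes the usual Pearson form, the sum of (observed - fitted)^2 / fitted, which the article does not spell out; the function names and data layout are likewise hypothetical.

```python
def chi_squared(observed, fitted):
    """Pearson-type statistic: sum of (observed - fitted)^2 / fitted."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    return sum((o - f) ** 2 / f for o, f in zip(observed, fitted))


def truncated_cells(exits, survivors, cutoff):
    """Cells compared when the data are truncated at month `cutoff`.

    exits[k] is the number leaving unemployment between months k+1 and k+2;
    survivors[k] is the number still unemployed in month k+1.
    The cells are the exits before the cutoff month plus the survivors
    at the cutoff month.
    """
    return list(exits[:cutoff - 1]) + [survivors[cutoff - 1]]


# Illustrative (made-up) counts for a newly unemployed group.
actual_exits = [40, 20, 10, 5]
actual_survivors = [100, 60, 40, 30, 25]
within_sample_cells = truncated_cells(actual_exits, actual_survivors, 4)
```

For the within-sample statistic, both the parameter estimates and the cells use the same cutoff. For the full-sample statistic, the parameters estimated at the cutoff would be used to generate fitted cells through the eighth month, which are then compared with the actual cells for that longer window.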

The results are given in table 4. As both the point of truncation and the choice of parametric form were varied, the unemployment duration estimates remained within a small range of each other. In addition, on the basis of both the within- and full-sample chi-squared statistics, the Weibull form was joined by the Clark-Summers form in generating the closest fit between the actual and predicted numbers of exits and survivors. It should be noted, however, that, except for poor fit when the truncation point was chosen to be July, the log-linear function also performed very well. The Salant function also generated fairly close fits except for a highly unstable performance with the July truncation date. The linear and Gompertz forms had goodness of fit statistics which were uniformly higher than the Weibull and Clark-Summers forms but more consistent than the Salant or the log-linear. Finally, as expected, the exponential function had the poorest overall fit.

The behavior of the exponential is consistent with the observation that the average probability of remaining unemployed tends to increase over time. Notice the relative improvement in the full-sample chi-squared statistics as the truncation date is advanced from the fourth to eighth periods. When observations are truncated at four months, the exponential form requires that monthly attrition rates out of unemployment after the truncation date resemble those observed in the first four periods. Given the declining average probability of escape, this will result in an overestimate of actual attrition rates. It is not surprising, therefore, to see the improvement we do in the full-sample chi-squared statistic and the increase in the estimates of duration as the truncation point is advanced and more information about the behavior of the tail becomes known. The chi-squared statistics are in general, however, well beyond any acceptable range for concluding that the exponential density is an appropriate description of the attrition pattern of the January 1979 newly unemployed group.

Examining business cycles

In order to estimate expected completed spell duration for newly unemployed groups during the business cycles of the 1967-82 period, we used the knowledge gained from examining the January 1979 newly unemployed group to limit the analysis to a more selective choice of techniques.

The a priori convictions generated by examining January 1979 data which are being brought to bear on business cycle data include a conviction that the average probability of remaining unemployed tends to rise over time for newly unemployed groups; a conviction that among the functional forms underlying the probability smoothing techniques, the Weibull and the Clark-Summers forms perform the best; and a conviction that the change in estimated duration induced by truncating the data is fairly small.

For each of the newly unemployed groups considered, the following steps were taken: first, estimates of duration were made by applying the probability smoothing procedure to the discrete formula for duration; second, among the variety of fitted procedures, only the Weibull and the Clark-Summers parametric forms were employed; third, the empirical procedure was applied to truncation points of 3, 4, and 5 months; and finally, in addition to constructing duration estimates using combined cross-sectional data, we also report published duration figures and estimates constructed using Kaitz's steady state method. This process permits a comparison of the sensitivity of each statistic to turning points in the business cycle.

We constructed the duration estimates on a monthly basis over the January 1967-June 1982 period. We used seasonally adjusted numbers and report quarterly average duration figures. For the combined cross-sectional data, we only report results from using the Weibull form because it proved consistently superior to the Clark-Summers form. Also, as there were only minor differences in the results using truncation dates of 3, 4, and 5 months, we report the 5-month truncation estimates here.

As is well known, and as indicated by chart 1, published duration statistics tend to lag business cycle turning points. The duration statistic based on Kaitz's steady state formula also lags business cycle turning points but not as strongly as the published statistic. Our nonsteady state duration measure based on combined cross-sectional data tends to be coincident with turning points.

Both Kaitz's steady state and our nonsteady state duration measures are nearly always less than the published duration estimates. The difference tends to widen during the initial phase of a recovery period and to decline after the published statistic reaches its lagged peak. Once the recovery has run its course and the economy enters and progresses through the ensuing recessionary period, the difference reaches its smallest level.

Although it lags business turning points slightly, the Kaitz steady state measure is cyclically sensitive and is within the same general range as our nonsteady state estimates. Given its cyclical sensitivity, it is tempting to suggest that the Kaitz measure be used to track the cyclical behavior of the average time spent unemployed for newly unemployed groups. There would be numerous advantages to such a suggestion. First, unlike our nonsteady state measure which requires forward-looking data, the Kaitz measure only requires the current level of unemployment and new inflows. Hence, it could be produced in a timely fashion.

Second, although it implicitly assumes steady state flows, it is not as restrictive a measure as it might first seem. Most studies of duration use the steady state assumption applied to annual average data--for example, Kaitz, Salant, and George Akerlof and Brian Main.19 In each of these studies, the use of annual average data either presupposes stable conditions over the period or attempts to smooth the underlying fluctuations. Instead of using annual average data to approximate steady state conditions, our procedure implicitly limits the time over which the steady state assumption is interpreted to hold. In particular, given the periodicity of the CPS survey, this steady state technique is used to develop monthly duration statistics. In this way, each month's estimate can be interpreted as that which would have been observed had the current levels of inflow and unemployment remained the same over time. Thus, changes in the duration statistic over time reflect changes either in inflow levels or changes in the current level of total unemployment, or both.

However, significant differences between Kaitz's steady state measure and our nonsteady state measure exist, and although the former may be easier to calculate, these differences cast doubt on its usefulness. A close comparison of the formulae on which these statistics are based tells us why. In Kaitz's analysis, the steady state duration measure, $D_t^K$, is the ratio of the level of current unemployment to the size of the current month's level of new unemployment. The numerator can be thought of as the sum of the number of currently unemployed individuals in their first, second, third, and so on month of unemployment. This formula is:

$$D_t^K = \frac{N_t(1) + N_t(2) + N_t(3) + \cdots}{N_t(1)}$$

where $N_t(i)$ is the number of individuals who are in their $i$th month of unemployment at time (t).

The nonsteady state duration statistic, $D_t^N$, for survey date (t) can also be written in terms of the original size of the newly unemployed group at survey date (t) and the remaining sizes of the group in subsequent survey periods. This formula is:

$$D_t^N = \frac{N_t(1) + N_{t+1}(2) + N_{t+2}(3) + \cdots}{N_t(1)}$$

where $N_{t+j}(i)$ is the number of individuals who are in their $i$th month of unemployment at time (t+j).

The ratio of Kaitz's steady state formula and our nonsteady state discrete formula is given by:

$$\frac{D_t^K}{D_t^N} = \frac{N_t(1) + N_t(2) + N_t(3) + \cdots}{N_t(1) + N_{t+1}(2) + N_{t+2}(3) + \cdots}$$

Careful inspection of this ratio reveals that the fundamental difference between the numerator and denominator is a function of how the size of the group of currently unemployed individuals in their, say, third month of unemployment at date (t), $N_t(3)$, compares with the remaining size of the newly unemployed group in their third month of unemployment at date (t+2), $N_{t+2}(3)$. Thus, to the extent that the current size of unemployment groups lags business cycle turning points, so too will Kaitz's measure in the numerator. However, since our nonsteady state measure in the denominator is forward looking, it incorporates the lagged reactions of unemployed groups much more quickly into its estimates.
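The contrast between the two measures can be sketched in Python as follows. The data layout (`counts[t][i-1]` holding the number of individuals in their i-th month of unemployment at survey date t), the function names, and the example figures are assumptions for illustration; the sums are also cut off at the number of months available rather than carried out indefinitely.

```python
def kaitz_duration(counts, t):
    """Steady state measure: current unemployment over current inflow."""
    row = counts[t]
    return sum(row) / row[0]


def nonsteady_duration(counts, t):
    """Nonsteady state measure: follow the date-t entrants forward.

    Uses the group in month 1 at date t, month 2 at date t+1, and so on,
    so it needs survey dates after t (forward-looking data).
    """
    months = len(counts[t])
    if t + months > len(counts):
        raise ValueError("not enough later survey dates to follow the group")
    followed = [counts[t + j][j] for j in range(months)]
    return sum(followed) / counts[t][0]


# Illustrative (made-up) counts with rising inflows over three survey dates.
rising = [
    [100, 40, 10],
    [120, 50, 20],
    [140, 60, 25],
]
ratio = kaitz_duration(rising, 0) / nonsteady_duration(rising, 0)
```

In this made-up example the older groups observed at date 0 are smaller than the remaining sizes of the date-0 entrants in later months, so the Kaitz measure falls below the nonsteady state measure and the ratio is less than 1. When every survey date shows the same counts, the two measures coincide.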

The cyclical behavior of the ratio of Kaitz's to the nonsteady state measure is presented in chart 2. As can be seen, during periods of rising unemployment rates, the ratio falls until just before the business cycle trough when it begins to increase. This increase precedes the peak unemployment rate after which the ratio maintains a consistently high level as unemployment rates fall during the ensuing recovery period.

As unemployment rates begin to rise, the size of newly unemployed groups begins to swell. Over time, as these groups enter their second, third, and so on, period of unemployment, we may observe the size of the newly unemployed groups becoming successively larger. For example, the size of a currently unemployed group in its third month of unemployment at date (t), $N_t(3)$, may be less than the remaining size of the newly unemployed group at date (t+2), $N_{t+2}(3)$. As a result, each term in the numerator may be dominated by its counterpart in the denominator until, nearing the end of the recession, the sizes of younger groups at each date (t) may dominate the sizes of their counterparts at date (t) and beyond; hence, the ratio will rise.

Although Kaitz's steady state measure has obvious advantages in terms of its ease of calculation, by its nature of construction, it lags our nonsteady state measure in responding to business turning points. Due to its forward-looking nature, the nonsteady state measure is more coincident with the peaks and troughs of the business cycle.

Which data are more promising?

This article examined two types of data from the CPS survey for estimating the average time individuals remain unemployed. We argued that published cross-sectional data from the CPS are inappropriate for this task because they require imposing the restrictive assumption of steady state flows. In addition, by their nature, we found that such estimates tend to lag turning points in the business cycle.

The second type of data, based on combined cross-sectional data from the CPS, promises more for the development of duration statistics. These data permit defining the original and remaining sizes of unemployed groups. Because this method traces the actual sizes of groups, it is no longer necessary to assume steady state flows. Rather, the attrition patterns observed reflect the impact of current economic conditions on flows in and out of unemployment.

How good are these data? Do they permit the development of statistics which provide an accurate picture of the average amount of time a newly unemployed individual can expect to remain unemployed? Although the data suffer from local modes due to the tendency of individuals to round off their estimates of time unemployed to the nearest monthly interval, the CPS data are useful in determining the duration of unemployment spells.

This conclusion is justified by several reasons. First, the consistency of the pattern of local modes in the data supports our procedure of surrounding both sides of each mode to minimize the possible biases. Second, the data consistently agree with the theoretical argument that the average probability of remaining unemployed tends to increase over time for any given group. Third, combined cross-sectional data permit independent verification of the choice of parametric form for describing attrition patterns, thus dispelling the argument that the choices of such forms using CPS data are arbitrary and unjustified. We consistently found that although the fit of estimated to actual patterns of exit is very sensitive to the choice of parametric form, the estimates of duration were not. Finally, these data have the advantage of permitting the construction of duration estimates for several unemployed groups, such as currently unemployed individuals. Also, because single-week duration data exist for groups stratified by race and sex, it is possible to greatly expand the number of currently and newly unemployed groups for which duration estimates can be constructed. Once a group is chosen, the actual mechanics of estimating duration are very straightforward and easy to calculate.

Several robustness tests need to be performed on these data before final judgments can be made. However, this study concludes that combined cross-sectional data permit the production of duration estimates which would provide an accurate portrayal of average time spent unemployed for various economic groups. Because of their forward-looking nature, these monthly duration statistics could not be produced in a manner that is coincident with other statistics for the same survey month. This drawback is offset by the quality of information that combined cross-sectional data provide for measuring the average total length of time unemployed.

1 Hyman B. Kaitz, "Analyzing the length of spells of unemployment," Monthly Labor Review, November 1970, p. 11.

2 Stephen W. Salant, "Search Theory and Duration Data: A Theory of Sorts," Quarterly Journal of Economics, February 1977, pp. 39-57.

3 Numerous articles have appeared utilizing these methods on cross-sectional data: George Perry, "Unemployment Flows in the U.S. Labor Market," Brookings Papers on Economic Activity, vol. 2, 1972, pp. 245-78; Stephen T. Marston, "The Impact of Unemployment Insurance on Job Search," Brookings Papers on Economic Activity, vol. 1, 1976, pp. 169-203; R. Frank, "How Long is a Spell of Unemployment," Econometrica, March 1978, pp. 285-301; Norman Bowers, "Probing the issues of unemployment duration," Monthly Labor Review, July 1980, pp. 23-32; Robert Warren, "A method to measure flow and duration as unemployment rate components," Monthly Labor Review, March 1977, pp. 71-72; George Akerlof and Brian Main, "An Experience-Weighted Measure of Employment and Unemployment Durations," American Economic Review, December 1981, pp. 1003-11. Other cross-sectional studies have taken the much needed step of relaxing the assumption of steady state flows: Kim Clark and Lawrence Summers, "Labor Market Dynamics and Unemployment: A Reconsideration," Brookings Papers on Economic Activity, vol. 1, 1979, pp. 13-60. There has also been much debate as to whether individuals who are newly or currently unemployed form the appropriate group for analysis of duration. See Clark and Summers, "Labor Market Dynamics and Unemployment"; Akerlof and Main, "An Experience-Weighted Measure"; and John A. Carlson and Michael W. Horrigan, "Measures of Unemployment Duration as Guides to Research and Policy: Comment," American Economic Review, December 1983, pp. 1143-50. In contrast, one recent study utilizes combined cross-sectional data to construct a nonsteady state estimate of duration among the group of individuals exiting unemployment between consecutive survey dates: Hal Sider, "Unemployment Duration and Incidence: 1968-1982," American Economic Review, June 1985, pp. 461-72.

4 Sider, "Unemployment Duration," p. 464.

5 To derive this result, assume that the constant inflow into unemployment between dates (t-1) and (t) is represented by:

$$F_t = F \quad \text{for all time periods } t$$

Next, it is necessary to specify an attrition process for each member of a newly unemployed group at survey date (t); in particular, it is often assumed that each individual has the same constant probability, P, of remaining unemployed between any dates (j-1) and (j). That is,

$$P_j = P \quad \text{for all time periods } j$$

The expected completed spell duration for a group of newly unemployed individuals at time (t) becomes:

$$D = 1\,(1-P) + 2\,P(1-P) + 3\,P^2(1-P) + \cdots = \frac{1}{1-P}$$

To estimate this concept it is only necessary to observe that in a steady state world, the constant level of unemployment is equal to the product of the constant rate of inflow and duration as measured by D. Since the level of unemployment is equal to this month's inflow and the remaining members of all previous inflow groups, it follows that:

$$U = F + FP + FP^2 + FP^3 + \cdots = F\,\frac{1}{1-P} = F\,D$$

Hence, the formula for duration becomes the ratio of the constant level of unemployment to the constant level of inflows; that is,

$$D = U/F$$

6 This justification no longer holds when the assumption of steady state flows is relaxed.

7 The interval [less than 5 weeks] is defined to include any individual with a current spell age of less than 4.5 weeks.

8 The way in which this is accomplished is to assume that the spell ages of the currently unemployed are governed by a particular distribution which is deterministically related to the distribution of completed spell lengths among the newly unemployed. In other words, once you know the parameters of the distribution of spell ages, you automatically know the parameters of the distribution of completed spell lengths.

Hence, all that is needed to generate estimates of completed spell duration is published information on spell ages. In particular, by maximizing the likelihood of observing the published breakdown of spell ages by time unemployed, Salant generates estimates of the parameters of the distribution of spell ages for the currently unemployed. These parameter estimates are in turn used to generate an estimate of average completed spell length for the newly unemployed.

9 These figures are all seasonally adjusted numbers. In addition, in setting up the likelihood specification, Salant noted that the endpoints of the published groupings corresponded to the following:

[0 to 4.5 weeks]

[4.5 to 14.5 weeks]

[14.5 to 26.5 weeks]

[26.5 to 99 weeks]

10 Salant used the following density function to describe the probability of observing an unemployment spell of current length (T) at a survey date:

$$g(T) = (r-1)\,a^{r-1}\,(a+T)^{-r}$$

The density function of completed spell lengths turns out to equal:

$$f(x) = r\,a^{r}\,(a+x)^{-(r+1)}$$

11 Comparing the fitted with the published breakdowns of time unemployed yields a chi-squared statistic of 3.199. Because this is below the critical chi-squared value of 5.99, the chosen density structure cannot be rejected at the 95-percent level of confidence.

12 As is pointed out in the final section, by constructing it on a monthly basis, Kaitz's statistic responds to business cycle turning points. The assumption of a constant level of unemployment is applied anew each time the statistic is constructed.

13 The procedure for generating the monthly average probability values given in the text tabulations on p. 6 was as follows:

First, we used the data underlying the published cross-sectional statistics to divide time unemployed into the following single-week intervals:

[0 to 4 weeks], [5 to 8 weeks], [9 to 12 weeks], [13 to 16 weeks], [17 to 20 weeks]

Second, we used the parameter estimates from the maximum likelihood procedure to generate the probability of observing a spell in each of those intervals. We then multiplied these probabilities by the size of the currently unemployed to predict the population of these intervals.

Finally, we constructed the following transition probabilities:

P1 = [5 to 8 weeks]/[0 to 4 weeks]

P2 = [9 to 12 weeks]/[5 to 8 weeks]

P3 = [13 to 16 weeks]/[9 to 12 weeks]

P4 = [17 to 20 weeks]/[13 to 16 weeks]

Strictly speaking, these are not transition probabilities because they are based on interval populations using January 1979 data; however, they can be interpreted as the transition probabilities implied by the attrition process embodied in Salant's maximum likelihood procedure.

14 Norman Bowers and Francis Horvath, "Keeping Time: An Analysis of Errors in the Measurement of Unemployment Duration," Journal of Business and Economic Statistics, April 1984.

15 These duration estimates derived using alternative interval selections were based on seasonally unadjusted single-week duration data.

16 This "fact" of nature turns out to be a critical assumption in Salant's empirical method; it is interesting that he does not perform any formal statistical tests of his maintained hypothesis.

17 This test requires construction of the G-statistic which is defined as follows:

Assume that the observations on the remaining sizes of a group occur at scaled intervals given by $t_i$ $(i = 0, 1, \ldots, n)$. Let the number of exits which occur between $t_{i-1}$ and $t_i$ be given by (i). Let $W_i$ be defined as:

$$W_i = (n - i + 1)\,(t_i - t_{i-1})$$

As it turns out, the following statistic is distributed standard normal:

$$Z = [12(n-1)]^{.5}\,(G_{r,n} - .5)$$

where:

$$G_{r,n} = \frac{\sum_{i=1}^{r-1} i\,W_{i+1}}{(r-1)\sum_{i=1}^{r} W_i}$$

The maintained hypothesis is that the probability of exiting unemployment is constant over the group's spell length. Large negative values of the observed Z are supportive of Salant's sorting process.

The sample value of Z associated with the January 1979 newly unemployed group is -4.026, which is sufficiently greater than 1.96 in absolute value to reject the maintained hypothesis at the 97.5-percent level of confidence.

18 This assumption requires that the entries and exits occurring between survey dates are governed by a uniform distribution.

19 Kaitz, "Analyzing the length"; Salant, "Search Theory"; and Akerlof and Main, "An Experience-Weighted Measure."

Table 1. Duration of unemployment by selected single weeks, unadjusted data for January and February, 1979

Table 2. Single-week intervals defining the original and remaining sizes of the newly unemployed cohort in January 1979

Table 3. Number of the unemployed who found work between successive survey periods, 1979

Table 4. Duration estimates applying parametric smoothing techniques to generate fitted transition probabilities

Chart 1. Cyclical behavior of various unemployment duration measures, 1967-82

Chart 2. Cyclical behavior of the ratio of the Kaitz steady state measure to the Weibull nonsteady state measure, 1967-82

Conceptual and empirical problems

In any study of the duration of unemployment, two questions are either implicitly or explicitly addressed. First, which group of individuals is to be used to construct an estimate of the average length of time members of the group remain unemployed? Is it the group of currently unemployed individuals? Or perhaps individuals who only recently became unemployed? Numerous other choices exist, each providing a different picture of the dynamics of the labor market.

Second, how can the available data be used to construct estimates of the average total time unemployed among members of the chosen group? As noted previously, the CPS does not measure the total time individuals remain unemployed before finding employment or withdrawing from the labor force. Rather, the survey records the amount of time individuals have remained unemployed up to the survey reference week. Hence, the measure we desire to estimate, average total time unemployed, must be inferred from the type of information available, current age of unemployment spells. Since available data differ from the required data, what assumptions are needed that permit such inferences to be made?

Most studies of duration have used published CPS cross-sectional data and have attempted to measure the average total time unemployed for newly unemployed individuals. These studies often required the assumption of a constant or steady state level of unemployment. Studies by Hyman Kaitz in 1970 and Stephen Salant in 1977(2) are largely representative of work in this area. Both studies assume steady state flows to generate empirical estimates.

Since publication of the Kaitz and Salant articles, a wide range of approaches has been adopted for estimating duration.3 These approaches include relaxing the assumption of steady state flows, constructing duration measures for groups other than newly unemployed individuals, and using alternative sources of data. Through these efforts, a consensus has emerged, although held in varying degrees. First, the average total time unemployed should be measured without assuming steady state flows. Second, in reporting and interpreting monthly duration statistics, no single group of individuals is the preferred group for analysis; each choice simply reflects a different aspect of the underlying dynamics of the labor market.

The justification for the first point would appear obvious; the assumption that the unemployment rate is constant over time is simply too much at variance with the real world of cyclical unemployment rates. In terms of conventional economic methodology, however, differences that arise between estimates derived from methods requiring such an assumption and those that do not must be carefully examined. For this reason, steady state methods are included in our study and we compare the results with those generated by our nonsteady state methods.

The second point is potentially harder to justify. The choice of any group of individuals leads to the exclusion of other individuals; the impact of current economic conditions on the group of newly unemployed individuals may be far different than the impact on the currently unemployed. Could not one group of individuals be more "representative" of labor market conditions than any other? Our contention is that, at any given survey date, the answer to this question is no. An example may help to clarify this issue.

Suppose that every week, 10 individuals become unemployed and remain unemployed for exactly 1 week. Also suppose that every week, 1 individual becomes unemployed and remains unemployed for exactly 10 weeks. Assuming this process has been continuing indefinitely, in any given week you would find 20 individuals currently unemployed, 10 of whom will each experience a total time unemployed of 1 week, and 10 of whom will each experience a total time unemployed of 10 weeks. The average total time unemployed experienced by these 20 individuals is 5.5 weeks. At the same time, for any given week, there are 11 individuals who have become newly unemployed. The average total time unemployed experienced by these individuals is 1.8 weeks.
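The arithmetic of this example can be checked with a short Python sketch. The function name and data layout are hypothetical; each group is described by its weekly inflow and its fixed spell length in weeks, and the inflow process is assumed to have continued indefinitely, as in the example.

```python
from fractions import Fraction


def mean_total_time(groups):
    """Average total time unemployed for two reference populations.

    groups is a list of (weekly_inflow, spell_length_in_weeks) pairs.
    Returns (mean among the currently unemployed in any week,
             mean among those newly unemployed in any week).
    """
    # In a steady state, a group's stock in any week is inflow * length.
    stocks = [(inflow * length, length) for inflow, length in groups]
    current_total = sum(stock for stock, _ in stocks)
    current_mean = Fraction(
        sum(stock * length for stock, length in stocks), current_total
    )
    new_total = sum(inflow for inflow, _ in groups)
    new_mean = Fraction(
        sum(inflow * length for inflow, length in groups), new_total
    )
    return current_mean, new_mean


# Ten people a week with 1-week spells, one person a week with a 10-week spell.
current_mean, new_mean = mean_total_time([(10, 1), (1, 10)])
```

The currently unemployed average is 11/2, or 5.5 weeks, and the newly unemployed average is 20/11, or about 1.8 weeks. The gap arises because long spells are over-represented among those observed as unemployed at any one survey date.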

Currently versus newly unemployed

Which statistic is more indicative of labor market conditions? Each group offers a partial glimpse of the dynamics underlying the labor market, and understanding the differences between various choices is critical. The currently unemployed are the remaining members of previous newly unemployed groups. By definition, the currently unemployed do not include members of previous inflow groups who have either found employment or withdrawn from the labor force by the time of the survey. The unemployment experience of these individuals will not be captured by calculating the duration of unemployment of individuals currently unemployed at a survey date.

The previous discussion indicates that at each survey date, no single group of individuals is preferred for measuring duration. However, the analysis does not provide an indication as to the consequences of choosing a particular group concept and measuring duration for that choice each month over an extended period of time. Because one focus of this study is the behavior of duration statistics over business cycles, serious consideration must be given to our choice. In particular, we chose to measure average total time unemployed for groups of newly unemployed individuals. Our justification of this selection is quite simple. Over time, this choice theoretically captures all individuals who become unemployed, and, unlike examining the currently unemployed each month, examining newly unemployed individuals does not result in groups that include the same individuals over successive periods. That is, if we were to measure the average total time unemployed for individuals currently unemployed in January and then repeat the exercise for the currently unemployed in February, many of the same individuals would be included in both measures. Using the newly unemployed group in each month, however, does not present this problem.

Given this choice, we make comparisons between different methods of using CPS data to estimate unemployment duration. Attention in this study is limited to constructing monthly estimates of the average duration of completed spells for newly unemployed groups in a nonsteady state environment. Within the selected framework, alternative estimating techniques are considered, and criteria for judging their efficacy are developed.

The next several sections of this article are confined to measuring the duration of unemployment for the group of newly unemployed individuals in January 1979. First, we discuss the two techniques (Kaitz and Salant) regarded as representative of work using cross-sectional data. Next, we examine the use of combined cross-sectional data using a parametric estimating technique.

One of the more difficult tasks in this research area is the selection among competing choices of techniques for estimating the average length of time an individual remains unemployed. The estimation of duration has been termed as much an "art as science,"4 and as such, serious consideration must be given to the development of criteria within which reasonable choices of techniques can be made. Using such a framework, both cross-sectional and combined cross-sectional methods are applied in the last section to the business cycles of the 1967-82 period. Monthly duration estimates are constructed, permitting an examination of the sensitivity of the chosen techniques to business cycle turning points.

Cross-sectional data

The duration concept is based on answers to the CPS survey question: "How many weeks has . . . been looking for work?" The resulting statistic measures the average age of unemployment spells among the currently unemployed. The answers to the duration question are published using the following seasonally adjusted groupings of weeks unemployed: [less than 5 weeks], [5 to 14 weeks], [15 to 26 weeks], and [27 weeks and over]. Two features of this statistic are important: first, the current length in weeks of any spell is an underestimate of eventual completed spell length; and second, the group under consideration is the currently unemployed. This group is made up of the remaining members of all previous newly unemployed groups. To the extent that the composition of a group changes as some of its original members leave unemployment, measurable differences may be observed between newly and currently unemployed groups at any point in time. Kaitz and Salant demonstrated that published CPS duration statistics which provide the average length in weeks of currently unemployed individuals overestimate the average completed length of unemployment spells of newly unemployed groups.

The Kaitz and Salant studies adopted the assumption of steady state flows: a constant level of total unemployment accompanied by constant levels of inflow into and exit out of unemployment. Although the steady state assumption runs counter to the objectives of this present research, we consider these two methods in detail because these pioneering studies framed the context within which current discussions take place, and because these studies can be applied directly and easily to a nonsteady state environment.

In a steady state world, the intersection of a survey and an unemployment spell is random so that, on average, spells are halfway through their complete length when caught by the survey. However, a survey is more likely to capture longer spells, so that relative to the average completed spell lengths of the newly unemployed, the average spell age of currently unemployed individuals may be longer. The empirical work of Kaitz and Salant demonstrated the latter effect dominates the former.

Kaitz's method. In using cross-sectional data, methods are needed which allow the inference of the total time newly unemployed individuals will remain unemployed from the cross-sectional data on the current age of spells. The steady state is an attractive assumption in this regard, because it provides direct and easily calculated relationships between point-in-time information on spell ages and longitudinal estimates of completed spell lengths. In particular, in generating his duration estimates, Kaitz relied on the following result from the steady state model:5

D = U/F

where D is the expected duration of unemployment for the newly unemployed group; U is the level of unemployment; and F is the size of the newly unemployed group.

This steady state method is attractive because the two components, unemployment and inflow levels, are easily measured from cross-sectional data. Another benefit from this procedure is that it provides a theoretical justification that the newly unemployed is the proper group for analysis: namely, the expected completed spell duration of a group of newly unemployed individuals can be derived from the steady state level of unemployment and inflows.6

In January 1979, the unemployment rate equaled 5.9 percent, seasonally adjusted, and the average spell age of currently unemployed individuals from the cross-section sample was reported to be 11.1 weeks (or 2.8 survey periods), seasonally adjusted. To apply Kaitz's method, the size of the newly unemployed group in January 1979 was estimated as the seasonally adjusted number of individuals (2,791,000) reporting spell ages in the interval of [less than 5 weeks] from the published statistics.7 Given the level of total unemployment (6,109,000), this yields an expected completed spell length of 2.2 survey periods for the newly unemployed. Hence, according to Kaitz's method, in January 1979, the average spell age of currently unemployed individuals exceeded the expected completed spell length for the newly unemployed. Although age or current spell length is an obvious underestimate of completed spell length for a single individual, the overselection of longer spells in the currently unemployed group relative to the newly unemployed appears to dominate.
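The Kaitz calculation for January 1979 is a single ratio, sketched here with the two seasonally adjusted levels quoted in the text. The function name is illustrative only.

```python
def kaitz_duration(unemployed, inflow):
    """Steady-state expected completed spell length, D = U / F (survey periods)."""
    if inflow <= 0:
        raise ValueError("inflow must be positive")
    return unemployed / inflow

# January 1979, seasonally adjusted levels quoted in the text.
U = 6_109_000  # total unemployment
F = 2_791_000  # [less than 5 weeks] group, taken as the newly unemployed

D = kaitz_duration(U, F)
print(round(D, 1))  # 2.2
```

The result, 2.2 survey periods, lies below the 2.8 survey periods reported as the average spell age of the currently unemployed.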

Salant's method. Salant assumed that although each individual in a newly unemployed group has a constant probability of escaping unemployment each period, these probability values differ across individuals. This assumption permits development of the concept of a sorting process: individuals with the highest escape probabilities will tend to leave unemployment more quickly than those with lower probabilities. The average probability of exiting unemployment tends to fall over time as the group is increasingly made up of the lower probability individuals.

Salant uses this sorting concept to develop a precise mathematical relationship between published information on the age of spells of currently unemployed individuals and the total time spent unemployed by newly unemployed individuals.8 Salant's method requires maximizing the likelihood of observing the published breakdown of spell ages which are listed below:9

[less than 5 weeks] 2,791,000

[5 to 14 weeks] 2,003,000

[15 to 26 weeks] 717,000

[27 weeks and over] 533,000

Maximizing the likelihood of observing this particular pattern of spell ages yields an estimate of duration of 1.6 survey periods for the newly unemployed.10

How accurate is this estimate? To answer this question the published breakdowns of time unemployed given above can be compared with the breakdowns predicted by Salant's method. As described in footnote 11, the predicted and actual breakdowns are extremely close, providing confidence in Salant's description of the process that gives rise to the published figures.11
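A grouped-data likelihood of this kind can be sketched as follows. The sketch is not Salant's exact specification, which is given in footnotes not reproduced here. It assumes continuous time in weeks and a survivor function S(t) = (a / (a + t)) ** (k + 1), the form that results when individual escape rates follow a gamma distribution. It also assumes a steady state, so that the share of current spells aged t or more is (a / (a + t)) ** k. The parameter names, the grid, and the use of a grid search in place of a numerical optimizer are all illustrative choices, and the result should not be expected to match the 1.6 survey periods reported above.

```python
import math

# Published January 1979 breakdown of current spell ages (thousands).
counts = [2791, 2003, 717, 533]
# Interval edges in weeks; the last interval is open-ended.
edges = [0.0, 5.0, 15.0, 27.0, math.inf]

def tail(t, a, k):
    """Assumed share of current spells with age >= t weeks."""
    return 0.0 if math.isinf(t) else (a / (a + t)) ** k

def shares(a, k):
    """Predicted share of the currently unemployed in each interval."""
    return [tail(lo, a, k) - tail(hi, a, k) for lo, hi in zip(edges, edges[1:])]

def log_likelihood(a, k):
    """Multinomial log likelihood of the published breakdown (up to a constant)."""
    return sum(n * math.log(p) for n, p in zip(counts, shares(a, k)))

# A coarse grid search stands in for a proper optimizer.
grid_a = [1.0 + 0.5 * i for i in range(120)]   # 1.0 to 60.5 weeks
grid_k = [0.5 + 0.05 * j for j in range(150)]  # 0.5 to 7.95
a_hat, k_hat = max(((a, k) for a in grid_a for k in grid_k),
                   key=lambda ak: log_likelihood(*ak))

mean_weeks = a_hat / k_hat  # mean completed spell length under the assumed form
total = sum(counts)
for n, p in zip(counts, shares(a_hat, k_hat)):
    print(f"actual {n / total:.3f}  fitted {p:.3f}")
print(f"a = {a_hat}, k = {k_hat:.2f}, mean = {mean_weeks:.1f} weeks")
```

Comparing the actual and fitted shares line by line is the same kind of check as the comparison of predicted and actual breakdowns described in the text.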

Comparing Kaitz and Salant

The duration estimate using Salant's method (1.6 survey periods) is much lower than the estimate using Kaitz's method (2.2 survey periods). One reason for the discrepancy is the implicit difference in the determination of the attrition rates affecting the newly unemployed group. In Kaitz's world, duration simply equals the ratio of the level of total unemployment to the level of inflow of newly unemployed individuals. Given the steady state assumption that entry and exit levels are always constant and equal, the level of total unemployment reflects the assumption of a constant attrition pattern affecting all previous newly unemployed groups:12 the greater the level of total unemployment relative to that of new unemployment, the higher the implicit proportion of individuals remaining unemployed each month after entrance. For example, if unemployment rates have passed a turning point so that inflow levels are fairly low despite slowly adjusting high total unemployment rates, Kaitz's method will produce a high average continuation rate. Conversely, if the unemployment rate is low despite a high inflow rate, Kaitz's method will result in a relatively low average continuation rate. We would expect this method of using cross-sectional data to produce lags in the response of duration statistics to business turning points.

Salant's method provides more direct information about the attrition process. By maximizing the likelihood of the observed breakdown of current spell ages, his method captures how changing business conditions affect the current sizes of spell age groups. A low inflow rate accompanying a high total unemployment rate would also be accompanied by changing attrition patterns of all previous inflow groups. Salant's method captures the reflection of these changes as they affect the cross-sectional view of the unemployed.

One way to examine the difference in the estimate of duration between these two methods is to compare the average monthly probabilities of remaining unemployed implied by the two procedures. Kaitz's method results in a constant probability value using the procedure described in footnote 5. Because Kaitz's implied probability refers to monthly attrition behavior, we used the parameter estimates resulting from Salant's method to generate estimates of the monthly average probabilities of remaining unemployed. The exact procedure is detailed in footnote 13.(13) The following tabulation displays the average monthly probabilities of remaining unemployed implied by the Kaitz and Salant procedures:

The estimate of duration using Kaitz's method is consistent with an average monthly continuation rate of .5431. Salant's probability of remaining unemployed increases over time from .4241 to .6519. Hence, Salant's method implies a much faster rate of escape over the first period of unemployment than Kaitz's. The relatively more sluggish behavior of Salant's newly unemployed group in later periods is not strong enough to cause its associated duration figure to exceed Kaitz's figure.
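One reading that reproduces the .5431 figure is sketched below. The procedure in footnote 5 is not reproduced here, so the relationship is an assumption: a constant monthly continuation probability P implies an expected duration of 1/(1 - P), and setting that equal to U/F gives P = 1 - F/U.

```python
U = 6_109_000  # total unemployment, January 1979
F = 2_791_000  # newly unemployed ([less than 5 weeks])

# With a constant continuation probability P, expected duration is
# 1 + P + P**2 + ... = 1 / (1 - P). Equating this to U / F gives P.
kaitz_p = 1 - F / U
print(round(kaitz_p, 4))  # 0.5431

# Endpoints of Salant's rising continuation probabilities quoted in the text.
salant_first, salant_last = 0.4241, 0.6519
print(salant_first < kaitz_p < salant_last)  # True
```

Kaitz's single value falls between Salant's first-period and last-period probabilities, which is why Salant's method implies faster escape early in a spell and slower escape later.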

Although these comparisons are informative, a source of data permitting the construction of the original size of the January 1979 newly unemployed group and tracing its remaining sizes over time is needed to judge the efficacy of these methods. In this way, comparisons can be made between the actual attrition process and the ones implied by the cross-sectional methods. Although Salant's method indicates a good distributional fit to the observed cross-sectional data, this fit does not necessarily imply that his method provides an accurate measure of the attrition pattern over time for the January 1979 newly unemployed group. The data we have in mind are the raw data underlying the published intervals of time unemployed; these data are the focus of the next section.

Combined cross-sectional data

Underlying the published seasonally adjusted cross-sectional data are seasonally unadjusted unpublished numbers, which provide a monthly breakdown of the distribution of current spell ages by single weeks of unemployment. These data permit the construction of intervals of single weeks of unemployment, which are roughly consistent with the periodicity of the CPS survey. Therefore, seasonally adjusted estimates of the original size of a newly unemployed group as well as estimates of its remaining sizes in successive survey periods can be constructed by combining several cross-sectional data sets. The interval population values can then be used to construct the average probabilities of remaining unemployed over time for newly unemployed individuals, which in turn can be used for constructing nonsteady state estimates of the average time it takes a newly unemployed individual to leave unemployment.

Here, we use a parametric approach for deriving duration estimates using interval population values from combined cross-sectional data; this approach requires the choice of parametric form to represent the nature of attrition in the sample. We are sensitive to the criticism that such choices are often made arbitrarily and without independent verification. One of the important features of this section is the development of criteria for evaluating alternative choices of parametric forms.

Table 1 provides a selected subset of the seasonally unadjusted single-week duration data for the January-February 1979 period. As the data indicate, and as noted by Kaitz, in responding to the survey, participants tend to round their estimates of time unemployed to the nearest monthly, biannual, or annual figure creating local modes most notably at 4, 8, 12, 16, 21, 26, and 52 weeks. In using these data to construct average transition probabilities of remaining unemployed from one survey period to the next, it is necessary to choose the intervals of single weeks of unemployment for use in the construction of average probabilities of remaining unemployed.

Local mode biases. In choosing the intervals of single weeks of unemployment, the biases introduced by the local modes must be considered carefully. Consider the group of newly unemployed individuals in January 1979. The original size of this group was chosen to be the number of individuals with anywhere from 0 to 5 weeks of unemployment: this choice surrounds both sides of the local mode occurring at 4 weeks, thus capturing individuals who round either up or down to that modal point. To estimate the remaining size of this group as of the February survey, it is assumed that the number of individuals in the [5 to 9 weeks] interval in February provides a robust estimate. Using seasonally adjusted data, the average probability of remaining unemployed from January to February for this group is then calculated as the ratio of the size of the [5 to 9 weeks] group in February to the size of the [0 to 5 weeks] group in January. The other interval choices are listed in table 2, and the implied number of exits between successive survey dates are given in table 3.

Although the interval choices are an attempt to minimize the possible bias introduced by the local modes, the extent to which individuals round their estimates of time unemployed up or down to the nearest local mode cannot be determined. This fact, more than anything else, contributes to the uncertainty associated with measuring the average time individuals remain unemployed. These modal influences affect the accuracy of both cross-sectional and combined cross-sectional data. Norman Bowers and Francis Horvath14 suggest that not only do individuals tend to round their estimates of time unemployed to the nearest month, but that estimates over consecutive months are also inconsistent with the time between surveys. The authors do point out, however, that the net bias is small, because the average time unemployed between successive survey dates increases by only slightly more than the time frame of the survey. However, substantial local modes exist in the data, and judgment on the validity of the results of this study depends on one's view of the efficacy of the methods adopted to account for these modes.

Toward this end, one approach adopted in this study was to examine alternative choices of intervals of single weeks of unemployment. In examining the choices, two considerations concerning interval selection emerged as most important and deserve comment. First, the local mode at 26 weeks makes the process of determining transition probabilities near that milestone problematic. One might assume, for example, that an individual with anywhere from 17 to 21 weeks of unemployment in one month might be in the interval of [21 to 25 weeks] in the next month. To the extent that members of this group round up their estimates of time unemployed to 26 weeks (or half a year), the size of this latter interval may be biased downward, creating a downward bias in the associated transition probability of remaining unemployed. To include 26 weeks in the definition of this interval could arguably produce biases in the opposite direction.

A second consideration is that the strength of the mode at 52 weeks makes the division of current spell ages near that milestone into intervals of 4 or 5 single weeks meaningless. Therefore, the estimates of the remaining sizes of the January 1979 newly unemployed group are cut off at 33 weeks as of the August 1979 survey (the number of individuals with 29 to 33 weeks of unemployment in August 1979 is used as an estimate of the number of individuals who were newly unemployed in January 1979 and remained unemployed up to August 1979). This introduces the problem of truncation or the right-censoring of information on the completed length of unemployment spells; that is, by making the [29 to 33 weeks] interval in August 1979 the final interval, we know that individuals in this interval experienced at least 8 months of unemployment, but we do not know the lengths of their completed spells of unemployment. However, as of the August 1979 survey, the data indicate that only 2.8 percent of the January 1979 newly unemployed group remained unemployed.

One way in which the influence of these modes was examined involved constructing the range of duration estimates corresponding to a selection of alternative interval specifications; these included selections which assumed that different proportions of individuals responding 26 weeks actually belonged to the [21 to 25 weeks] interval in one month and the [25 to 29 weeks] interval in the next month. For the January 1979 newly unemployed group, varying the choices of intervals around the 26-week mark had little effect on the resulting estimates of expected duration; the impact was solely on the goodness of fit associated with each particular distributional form. The results based on alternative interval selections are available from the author on request.15

Testing Salant's sorting hypothesis

Although the data suffer from certain limitations, they also exhibit qualities which, when taken as a whole, point to a general usefulness in determining the average length of an unemployment spell. In dealing with social science survey data, the preponderance of evidence provided by the data must be considered, taking care to clearly state the criterion on which these judgments rest. In the case of repeated cross-sectional data, a natural starting point is the support these data provide of the notion that the average probability of remaining unemployed tends to rise over time. This observation was first made by Salant regarding the general trend indicated by cross-sectional data and is also a logical implication of his theory of sorting.16

In examining the transition probabilities of remaining unemployed over the length of an unemployment spell, both impressionistic and formal pieces of evidence were examined for the January 1979 newly unemployed group. On the impressionistic side, the general pattern of the transition probabilities is one smoothly rising over time. This can be seen from the following tabulation showing the average probability of remaining unemployed:

Months & Average probability

January to February P1 = .4316

February to March P2 = .5415

March to April P3 = .6003

April to May P4 = .6216

May to June P5 = .6759

June to July P6 = .8363

July to August P7 = .5734

The exception to this pattern is found in the behavior of P6 and P7. Transition probability P6, the average probability that a member of the January newly unemployed group will remain unemployed from June to July, jumps to a much higher value than would be expected from the trend set by P1 to P5. The value of the next transition probability, P7, is actually lower than the value of P6. To conclude that the true process is one of slowly rising average probabilities of remaining unemployed requires discounting the behavior of P6 and P7.
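Multiplying the tabulated probabilities together gives the share of the January 1979 group still unemployed at each later survey. The sketch below uses only the seven values from the tabulation; the variable names are illustrative.

```python
# Observed average probabilities of remaining unemployed, P1..P7,
# for the January 1979 newly unemployed group.
P = [0.4316, 0.5415, 0.6003, 0.6216, 0.6759, 0.8363, 0.5734]

# Share of the original group still unemployed at each later survey.
surviving = []
share = 1.0
for p in P:
    share *= p
    surviving.append(share)

for month, s in zip(["Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug"], surviving):
    print(f"{month}: {s:.3f}")

# Subscripts of probabilities that fall below their predecessor.
drops = [i + 1 for i in range(1, len(P)) if P[i] < P[i - 1]]
print(drops)  # [7]
```

The final product is about .028, which agrees with the 2.8 percent of the group reported as still unemployed at the August 1979 survey. P7 is the only probability that falls below its predecessor.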

A formal test was conducted of the null hypothesis that the probabilities of remaining unemployed are constant over time against the alternatives that the probabilities exhibit either consistent rising or falling patterns.17 Despite the behavior of transition probabilities P6 and P7, this test provided a strong indication that the average probabilities of remaining unemployed tend to rise consistently over this newly unemployed group's spell length.

Acceptance of this conclusion is important because of the guidance it provides as to the acceptable class of density functions for describing the attrition behavior of newly unemployed groups. One of the prior beliefs which influenced this judgment was the existence of a local mode at 26 weeks. However, despite this belief, one option we did not pursue was to reassign individuals around the 26 week mode and smooth transition probabilities P6 and P7 prior to estimating duration. Our reluctance in this regard was based on the notion that a better approach would be to assume that these interval populations are governed by some attrition process and the observed values are simply random draws from this process measured with error. By comparing the predicted with the actual attrition values, it is then possible to measure the influence of local modes on the goodness of fit of the chosen parametric form.

A common criticism of duration studies has been the practice of arbitrarily specifying parametric forms to describe the attrition behavior of groups, especially if the parameters of the distribution are estimated using truncated data. Combined cross-sectional data permit the systematic examination of the efficacy of our choices. Besides lending support to the choice of a class of functions for which the average probability of remaining unemployed rises over time, these data also permit independent testing of the appropriateness of each specific functional form within that class. The data permit estimating the original and remaining sizes of a newly unemployed group for eight successive periods. The idea is to truncate the data at some point, say 4 months, and test the fit of the chosen parametric form on the known excluded observations at the tail of the distribution of spell lengths. This procedure is then repeated for truncation points at the fifth and up to the eighth month of data. This procedure permits construction of a measure of the closeness of fit of the chosen parametric form to the observed data.

Estimating duration

The idea behind our method is to specify a parametric form describing the attrition of individuals out of unemployment and use the observed attrition rates to estimate the underlying true ones. We then use the estimated attrition rates to construct a measure of the expected value of completed spell lengths. The formula we employ is a discrete approximation of unemployment duration. The key to estimating duration using this formula lies in the assumption as to when individuals enter and leave unemployment between survey dates. One common assumption in the literature is that, on average, individuals enter and exit unemployment halfway between survey dates: individuals leaving unemployment between January and February are expected to experience, on average, one full period of unemployment.18 Using the term P(i) to mean the average probability of remaining unemployed the ith survey period after entrance, the expected length of a completed spell of unemployment, E(S), can be written as:

E(S) = 1*(1-P1) + 2*P1*(1-P2) + 3*P1*P2*(1-P3) + . . .
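A minimal sketch of this discrete formula follows. Because only a finite number of probabilities is observed, the series has to be closed off in some way. The `tail_p` argument is a hypothetical device for doing so and is not part of the formula itself. The example applies the function to the observed January 1979 probabilities purely for illustration, whereas the estimates in this article use fitted probabilities.

```python
def expected_spell_length(P, tail_p=None):
    """E(S) = 1*(1-P1) + 2*P1*(1-P2) + 3*P1*P2*(1-P3) + ...

    P holds the probabilities of remaining unemployed in periods 1..n.
    If tail_p is given, survivors of period n are assumed (hypothetically)
    to continue with that constant probability. Otherwise they are counted
    as leaving in period n + 1, which gives a lower bound.
    """
    total, survive = 0.0, 1.0
    for i, p in enumerate(P, start=1):
        total += i * survive * (1.0 - p)  # those who leave during period i
        survive *= p                      # those who continue past period i
    n = len(P)
    if tail_p is None:
        total += (n + 1) * survive
    else:
        total += survive * (n + 1.0 / (1.0 - tail_p))
    return total

P = [0.4316, 0.5415, 0.6003, 0.6216, 0.6759, 0.8363, 0.5734]
lower_bound = expected_spell_length(P)
with_tail = expected_spell_length(P, tail_p=P[-1])
print(round(lower_bound, 2), round(with_tail, 2))
```

With so few individuals surviving past the last observed period, the two treatments of the tail differ only slightly.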

In applying combined cross-sectional data to this formula, it is assumed that the observed transition probabilities reflect an underlying attrition process measured with random error. To estimate the true transition probabilities, it is necessary to regress a linear version of the chosen attrition process against time and use the fitted coefficients to predict the true transition probabilities. The closeness of the fitted to the observed transition probabilities helps to discriminate between alternative choices.

Six parametric forms were chosen to describe the attrition process out of unemployment for members of the January 1979 newly unemployed group: the Weibull, Salant, and Gompertz distributional forms; a linear and log-linear probability function; and the functional form utilized by Clark and Summers in their 1979 Brookings Papers article. Each of these chosen forms allows for rising average probabilities of remaining unemployed over time. The choices of forms of the likelihood function were based on the belief that the average probability of remaining unemployed tends to rise with any group's spell length. Another structure which is often employed is the exponential. This latter form is restrictive, because it assumes that the average probability of remaining unemployed is constant over time for any group. Although restrictive, it is instructive to include this form for comparison purposes. In each case, a linear or log-linear version of the parametric process was regressed against a function of time. Using the estimated parameters, fitted transition probabilities were calculated and used to estimate duration. The parametric forms chosen are given in the following tabulation:

Functional form & Statistical expression

Weibull ln(-ln S(t)) = a + b ln(t)

Clark-Summers h(t) = a + b ln(t)

Gompertz ln(h(t)) = a + b t

Salant (1/h(t)) = a + b t

Linear form h(t) = a + b t

Log-linear ln(h(t)) = a + b ln(t)

Exponential -ln(S(t)) = b t

where t is 1, 2, 3, and so on months of unemployment; S(t) is the average probability that a member of a newly unemployed cohort remains unemployed at least t survey periods; and h(t) is the average probability that a member of a newly unemployed group remains unemployed (t-1) periods and then leaves unemployment by the t-th period.
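The probability smoothing step can be sketched for the Weibull form as an ordinary least squares regression of ln(-ln S(t)) on ln(t). The sketch makes several assumptions that are not taken from the article: S(t) is measured as the cumulative product of the observed P1 through Pt, all seven observations are used without truncation, and duration is computed as the sum of the fitted S(t), which is the discrete duration formula regrouped term by term. Results from a sketch like this should not be read as the table 4 estimates.

```python
import math

# Observed probabilities of remaining unemployed, January 1979 group.
P = [0.4316, 0.5415, 0.6003, 0.6216, 0.6759, 0.8363, 0.5734]

# Assumed measure of S(t): share still unemployed t surveys after entry.
S, share = [], 1.0
for p in P:
    share *= p
    S.append(share)

x = [math.log(t) for t in range(1, len(S) + 1)]
y = [math.log(-math.log(s)) for s in S]

def ols(x, y):
    """Intercept and slope of a simple least squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b

a, b = ols(x, y)

def fitted_S(t):
    """Weibull survivor function implied by ln(-ln S(t)) = a + b ln(t)."""
    return math.exp(-math.exp(a) * t ** b)

# Fitted transition probabilities P(t) = S(t) / S(t-1), with S(0) = 1.
fitted_P = [fitted_S(1)] + [fitted_S(t) / fitted_S(t - 1) for t in range(2, 8)]

# Expected completed spell length: 1 + S(1) + S(2) + ... over a long horizon.
duration = 1.0 + sum(fitted_S(t) for t in range(1, 201))

print(f"a = {a:.3f}, b = {b:.3f}")
print([round(p, 3) for p in fitted_P])
print(f"duration = {duration:.2f} survey periods")
```

A slope b below 1 means the fitted probabilities of remaining unemployed rise with spell length, which is the pattern the chosen class of forms is meant to allow.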

Two goodness of fit statistics were constructed to judge the efficacy of these alternative forms. The first is a chi-squared measure of the squared differences between the number of actual and fitted exits and survivors applied to within-sample observations only (the within-sample chi-squared statistic); that is, if the fourth month is picked as the truncation date, a comparison is made between the predicted and actual number of exits in the first 3 months as well as a comparison of predicted and actual survivors (the number of individuals still unemployed) in the fourth month.

The second chi-squared statistic measures the fit between the observed and predicted observations before and after the selected point of truncation (the full-sample chi-squared statistic). Suppose that the parameter estimates are based on exits between the first and fourth months, with those remaining unemployed in the fourth month treated as survivors. The full-sample chi-squared statistic is based on using the resulting parameter estimates to predict the number of exits up to the eighth month and the number of survivors at that date. For both chi-squared statistics, the point of truncation is varied between 4 and 8 months.
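The two statistics can be sketched as follows. Several elements are assumptions made for illustration: the Pearson form of the statistic, a hypothetical cohort of 1,000 individuals (the actual group size comes from unpublished data, so the scale is not comparable with table 4), and a constant-probability fit chosen to match the observed survivors at a 4-month truncation point.

```python
import math

def exits_and_survivors(P, cohort, months):
    """Exits in each of the first months - 1 transitions, then survivors."""
    counts, remaining = [], float(cohort)
    for p in P[:months - 1]:
        counts.append(remaining * (1.0 - p))  # leave before the next survey
        remaining *= p
    counts.append(remaining)  # still unemployed in the final month
    return counts

def chi_squared(actual, fitted):
    """Pearson-type statistic: sum of (actual - fitted)^2 / fitted."""
    return sum((a - f) ** 2 / f for a, f in zip(actual, fitted))

P_actual = [0.4316, 0.5415, 0.6003, 0.6216, 0.6759, 0.8363, 0.5734]
cohort = 1000      # hypothetical group size
truncation = 4     # parameters "estimated" on the first 4 months only

# Illustrative constant-probability fit matching survivors at truncation.
q = math.prod(P_actual[:truncation - 1]) ** (1.0 / (truncation - 1))
P_fitted = [q] * len(P_actual)

within = chi_squared(exits_and_survivors(P_actual, cohort, truncation),
                     exits_and_survivors(P_fitted, cohort, truncation))
full = chi_squared(exits_and_survivors(P_actual, cohort, 8),
                   exits_and_survivors(P_fitted, cohort, 8))
print(f"within-sample = {within:.1f}, full-sample = {full:.1f}")
```

In this illustration the full-sample statistic exceeds the within-sample one because a constant probability fitted to the early months predicts too few survivors in the tail.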

The results are given in table 4. As both the point of truncation and the choice of parametric form were varied, the unemployment duration estimates remained within a small range of each other. In addition, on the basis of both the within- and full-sample chi-squared statistics, the Weibull form was joined by the Clark-Summers form in generating the closest fit between the actual and predicted numbers of exits and survivors. It should be noted, however, that, except for poor fit when the truncation point was chosen to be July, the log-linear function also performed very well. The Salant function also generated fairly close fits except for a highly unstable performance with the July truncation date. The linear and Gompertz forms had goodness of fit statistics which were uniformly higher than the Weibull and Clark-Summers forms but more consistent than the Salant or the log-linear. Finally, as expected, the exponential function had the poorest overall fit.

The behavior of the exponential is consistent with the observation that the average probability of remaining unemployed tends to increase over time. Notice the relative improvement in the full-sample chi-squared statistics as the truncation date is advanced from the fourth to the eighth period. When observations are truncated at 4 months, the exponential form requires that monthly attrition rates out of unemployment after the truncation date resemble those observed in the first four periods. Given the declining average probability of escape, this will result in an overestimate of actual attrition rates. It is not surprising, therefore, that the full-sample chi-squared statistic improves and the estimates of duration increase as the truncation point is advanced and more information about the behavior of the tail becomes known. The chi-squared statistics are in general, however, well beyond any acceptable range for concluding that the exponential density is an appropriate description of the attrition pattern of the January 1979 newly unemployed group.

Examining business cycles

In order to estimate expected completed spell duration for newly unemployed groups during the business cycles of the 1967-82 period, we used the knowledge gained from examining the January 1979 newly unemployed group to limit the analysis to a more selective choice of techniques.

The a priori convictions generated by examining January 1979 data which are being brought to bear on business cycle data include a conviction that the average probability of remaining unemployed tends to rise over time for newly unemployed groups; a conviction that among the functional forms underlying the probability smoothing techniques, the Weibull and the Clark-Summers forms perform the best; and a conviction that the change in estimated duration induced by truncating the data is fairly small.

For each of the newly unemployed groups considered, the following steps were taken: first, estimates of duration were made by applying the probability smoothing procedure to the discrete formula for duration; second, among the variety of fitted procedures, only the Weibull and the Clark-Summers parametric forms were employed; third, the empirical procedure was applied to truncation points of 3, 4, and 5 months; and finally, in addition to constructing duration estimates using combined cross-sectional data, we also report published duration figures and estimates constructed using Kaitz's steady state method. This process permits a comparison of the sensitivity of each statistic to turning points in the business cycle.
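The following is a minimal sketch of one way the first three steps could be carried out; it is not necessarily the estimation procedure used in this study. It assumes a Weibull survivor function S(j) = exp(-(j/scale)^shape), fits it by ordinary least squares to the shares of a group still unemployed up to the truncation point, and then sums the fitted shares to obtain a discrete duration. The function names, the least-squares fit, and the input shares are all assumptions made for illustration.

```python
import math


def fit_weibull(survival):
    """Least-squares fit of S(j) = exp(-(j / scale) ** shape).

    `survival[j - 1]` is the share of the group still unemployed j months
    after entry, for j = 1..truncation. Uses the linearization
    ln(-ln S(j)) = shape * ln(j) - shape * ln(scale).
    """
    xs = [math.log(j) for j in range(1, len(survival) + 1)]
    ys = [math.log(-math.log(s)) for s in survival]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    shape = sxy / sxx
    scale = math.exp(mx - my / shape)
    return shape, scale


def expected_duration(shape, scale, horizon=600):
    """Discrete duration in months: 1 + S(1) + S(2) + ..., cut off at `horizon`."""
    return 1.0 + sum(math.exp(-((j / scale) ** shape)) for j in range(1, horizon))


# Hypothetical shares still unemployed after 1..5 months (a 5-month truncation),
# generated from a known Weibull so that the fit can be checked.
observed = [math.exp(-((j / 2.0) ** 0.8)) for j in range(1, 6)]
shape, scale = fit_weibull(observed)
print(round(shape, 3), round(scale, 3), round(expected_duration(shape, scale), 2))
```

A fitted shape below 1 corresponds to an escape probability that declines with time unemployed. Changing the length of `observed` mimics moving the truncation point among 3, 4, and 5 months.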

We constructed the duration estimates on a monthly basis over the January 1967-June 1982 period. We used seasonally adjusted numbers and report quarterly average duration figures. For the combined cross-sectional data, we only report results from using the Weibull form because it proved consistently superior to the Clark-Summers form. Also, as there were only minor differences in the results using truncation dates of 3, 4, and 5 months, we report the 5-month truncation estimates here.

As is well known, and as indicated by chart 1, published duration statistics tend to lag business cycle turning points. The duration statistic based on Kaitz's steady state formula also lags business cycle turning points but not as strongly as the published statistic. Our nonsteady state duration measure based on combined cross-sectional data tends to be coincident with turning points.

Both Kaitz's steady state and our nonsteady state duration measures are nearly always less than the published duration estimates. The difference tends to widen during the initial phase of a recovery period and to decline after the published statistic reaches its lagged peak. Once the recovery has run its course and the economy enters and progresses through the ensuing recessionary period, the difference reaches its smallest level.

Although it lags business turning points slightly, the Kaitz steady state measure is cyclically sensitive and is within the same general range as our nonsteady state estimates. Given its cyclical sensitivity, it is tempting to suggest that the Kaitz measure be used to track the cyclical behavior of the average time spent unemployed for newly unemployed groups. There would be numerous advantages to such a suggestion. First, unlike our nonsteady state measure which requires forward-looking data, the Kaitz measure only requires the current level of unemployment and new inflows. Hence, it could be produced in a timely fashion.

Second, although it implicitly assumes steady state flows, it is not as restrictive a measure as it might first seem. Most studies of duration use the steady state assumption applied to annual average data--for example, Kaitz, Salant, and George Akerlof and Brian Main.19 In each of these studies, the use of annual average data either presupposes stable conditions over the period or attempts to smooth the underlying fluctuations. Instead of using annual average data to approximate steady state conditions, our procedure implicitly limits the time over which the steady state assumption is interpreted to hold. In particular, given the periodicity of the CPS survey, this steady state technique is used to develop monthly duration statistics. In this way, each month's estimate can be interpreted as that which would have been observed had the current levels of inflow and unemployment remained the same over time. Thus, changes in the duration statistic over time reflect changes in inflow levels, in the current level of total unemployment, or in both.

However, significant differences between Kaitz's steady state measure and our nonsteady state measure exist, and although the former may be easier to calculate, these differences cast doubt on its usefulness. A close comparison of the formulae on which these statistics are based tells us why. In Kaitz's analysis, the steady state duration measure, $D_t^{K}$, is the ratio of the level of current unemployment to the size of the current month's level of new unemployment. The numerator can be thought of as the sum of the number of currently unemployed individuals in their first, second, third, and so on month of unemployment. This formula is:

$$D_t^{K} = \frac{N_t^{(1)} + N_t^{(2)} + N_t^{(3)} + \cdots}{N_t^{(1)}}$$

where $N_t^{(i)}$ is the number of individuals who are in their $i$th month of unemployment at time (t).
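As a small illustration, the steady state ratio can be computed from a single survey date's counts. The function name and the cross section below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def kaitz_duration(unemployed_by_month):
    """Steady state duration: total unemployment divided by new unemployment.

    `unemployed_by_month[i]` is the number of people in month i + 1 of their
    spell at a single survey date.
    """
    new_entrants = unemployed_by_month[0]
    if new_entrants <= 0:
        raise ValueError("need a positive number of newly unemployed")
    return sum(unemployed_by_month) / new_entrants


# Hypothetical cross section at one survey date (thousands of persons).
cross_section = [3000, 1500, 900, 600]
print(kaitz_duration(cross_section))
```

Only current-month data enter the calculation, which is why a measure of this kind can be produced as soon as the survey month is tabulated.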

The nonsteady state duration statistic, $D_t^{N}$, for survey date (t) can also be written in terms of the original size of the newly unemployed group at survey date (t) and the remaining sizes of the group in subsequent survey periods. This formula is:

$$D_t^{N} = \frac{N_t^{(1)} + N_{t+1}^{(2)} + N_{t+2}^{(3)} + \cdots}{N_t^{(1)}}$$

where $N_{t+j}^{(i)}$ is the number of individuals who are in their $i$th month of unemployment at time (t+j).
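The forward-looking character of this statistic can be sketched by following one entering group along the diagonal of a table of counts. The table layout, the function name, and the numbers are assumptions for illustration; the sum simply stops where the hypothetical data run out, whereas the study's estimates rely on fitted values for the tail.

```python
def nonsteady_duration(counts, t):
    """Duration for the group that became unemployed at survey date t.

    `counts[s][i]` is the number of people in month i + 1 of their spell at
    survey date s. The group entering at t is observed at counts[t][0],
    counts[t + 1][1], counts[t + 2][2], and so on.
    """
    cohort = []
    i = 0
    while t + i < len(counts) and i < len(counts[t + i]):
        cohort.append(counts[t + i][i])
        i += 1
    return sum(cohort) / cohort[0]


# Hypothetical counts for four consecutive survey dates (thousands of persons).
counts = [
    [3000, 1500, 900, 600],
    [3300, 1800, 950, 620],
    [3600, 2000, 1100, 650],
    [3500, 2200, 1300, 800],
]
print(round(nonsteady_duration(counts, 0), 3))
```

Because the later terms come from later survey dates, the statistic for date t cannot be completed until those surveys are available.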

The ratio of Kaitz's steady state formula and our nonsteady state discrete formula is given by:

$$\frac{D_t^{K}}{D_t^{N}} = \frac{N_t^{(1)} + N_t^{(2)} + N_t^{(3)} + \cdots}{N_t^{(1)} + N_{t+1}^{(2)} + N_{t+2}^{(3)} + \cdots}$$

Careful inspection of this ratio reveals that the fundamental difference between the numerator and denominator is a function of how the size of the group of currently unemployed individuals in their, say, third month of unemployment at date (t), $N_t^{(3)}$, compares with the remaining size of the newly unemployed group in their third month of unemployment at date (t+2). Thus, to the extent that the current size of unemployment groups lags business cycle turning points, so too will Kaitz's measure in the numerator. However, since our nonsteady state measure in the denominator is forward looking, it incorporates the lagged reactions of unemployed groups much more quickly into its estimates.

The cyclical behavior of the ratio of Kaitz's to the nonsteady state measure is presented in chart 2. As can be seen, during periods of rising unemployment rates, the ratio falls until just before the business cycle trough when it begins to increase. This increase precedes the peak unemployment rate after which the ratio maintains a consistently high level as unemployment rates fall during the ensuing recovery period.

As unemployment rates begin to rise, the size of newly unemployed groups begins to swell. Over time, as these groups enter their second, third, and so on, period of unemployment, we may observe the size of the newly unemployed groups becoming successively larger. For example, the size of a currently unemployed group in its third month of unemployment at date (t), $N_t^{(3)}$, may be less than the remaining size of the newly unemployed group at date (t+2), $N_{t+2}^{(3)}$. As a result, each term in the numerator may be dominated by its counterpart in the denominator until, nearing the end of the recession, the sizes of younger groups at each date (t) may dominate the sizes of their counterparts at date (t) and beyond; hence, the ratio will rise.
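A stylized sketch can isolate the role of inflows in this argument. It assumes, purely for illustration, that every entering group follows the same attrition pattern, so that only the size of successive inflows differs; the survival shares, inflow series, and function name are hypothetical.

```python
# Hypothetical share of an entering group still unemployed in months 1..5.
SURVIVAL = [1.0, 0.6, 0.4, 0.25, 0.1]


def ratio_kaitz_to_nonsteady(inflows, t):
    """Ratio of the steady state to the nonsteady state measure at date t.

    Assumes every entering group follows SURVIVAL, so that the number in
    month i + 1 of a spell at date s equals inflows[s - i] * SURVIVAL[i].
    Requires t >= len(SURVIVAL) - 1.
    """
    # Steady state numerator: groups that entered at t, t-1, ... seen at date t.
    kaitz_num = sum(inflows[t - i] * s for i, s in enumerate(SURVIVAL))
    # Nonsteady numerator: the group entering at t, seen at t, t+1, ...
    nonsteady_num = sum(inflows[t] * s for s in SURVIVAL)
    # Both measures share the denominator inflows[t], which cancels.
    return kaitz_num / nonsteady_num


rising = [1000 * 1.1 ** s for s in range(10)]
falling = list(reversed(rising))
print(ratio_kaitz_to_nonsteady(rising, 6), ratio_kaitz_to_nonsteady(falling, 6))
```

Under this assumption the ratio is below 1 while inflows are growing, because each older group in the numerator is smaller than the current entering group, and above 1 once inflows shrink. Changes in attrition patterns over the cycle, which the sketch holds fixed, would add to or offset this effect.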

Although Kaitz's steady state measure has obvious advantages in terms of its ease of calculation, by its nature of construction, it lags our nonsteady state measure in responding to business turning points. Due to its forward-looking nature, the nonsteady state measure is more coincident with the peaks and troughs of the business cycle.

Which data are more promising?

This article examined two types of data from the CPS survey for estimating the average time individuals remain unemployed. We argued that published cross-sectional data from the CPS are inappropriate for this task because they require imposing the restrictive assumption of steady state flows. In addition, by their nature, we found that such estimates tend to lag turning points in the business cycle.

The second type of data, based on combined cross-sectional data from the CPS, promises more for the development of duration statistics. These data permit defining the original and remaining sizes of unemployed groups. Because this method traces the actual sizes of groups, it is no longer necessary to assume steady state flows. Rather, the attrition patterns observed reflect the impact of current economic conditions on flows in and out of unemployment.

How good are these data? Do they permit the development of statistics which provide an accurate picture of the average amount of time a newly unemployed individual can expect to remain unemployed? Although the data suffer from local modes due to the tendency of individuals to round off their estimates of time unemployed to the nearest monthly interval, the CPS data are useful in determining the duration of unemployment spells.

Several reasons justify this conclusion. First, the consistency of the pattern of local modes in the data supports our procedure of surrounding both sides of each mode to minimize the possible biases. Second, the data consistently agree with the theoretical argument that the average probability of remaining unemployed tends to increase over time for any given group. Third, combined cross-sectional data permit independent verification of the choice of parametric form for describing attrition patterns, thus dispelling the argument that the choices of such forms using CPS data are arbitrary and unjustified. We consistently found that although the fit of estimated to actual patterns of exit is very sensitive to the choice of parametric form, the estimates of duration were not. Finally, these data have the advantage of permitting the construction of duration estimates for several unemployed groups, such as currently unemployed individuals. Also, because single-week duration data exist for groups stratified by race and sex, it is possible to greatly expand the number of currently and newly unemployed groups for which duration estimates can be constructed. Once a group is chosen, the actual mechanics of estimating duration are very straightforward and easy to calculate.

Several robustness tests need to be performed on these data before final judgments can be made. However, this study concludes that combined cross-sectional data permit the production of duration estimates which would provide an accurate portrayal of average time spent unemployed for various economic groups. Because of their forward-looking nature, these monthly duration statistics could not be produced in a manner that is coincident with other statistics for the same survey month. This drawback is offset by the quality of information that combined cross-sectional data provide for measuring the average total length of time unemployed.

1 Hyman B. Kaitz, "Analyzing the length of spells of unemployment," Monthly Labor Review, November 1970, p. 11.

2 Steven W. Salant, "Search Theory and Duration Data: A Theory of Sorts," Quarterly Journal of Economics, February 1977, pp. 39-57.

3 Numerous articles have appeared utilizing these methods on cross-sectional data: George Perry, "Unemployment Flows in the U.S. Labor Market," Brookings Papers on Economic Activity, vol. 2, 1972, pp. 245-78; Stephen T. Marston, "The Impact of Unemployment Insurance on Job Search," Brookings Papers on Economic Activity, vol. 1, 1976, pp. 169-203; R. Frank, "How Long is a Spell of Unemployment," Econometrica, March 1978, pp. 285-301; Norman Bowers, "Probing the issues of unemployment duration," Monthly Labor Review, July 1980, pp. 23-32; Robert Warren, "A method to measure flow and duration as unemployment rate components," Monthly Labor Review, March 1977, pp. 71-72; George Akerlof and Brian Main, "An Experience-Weighted Measure of Employment and Unemployment Durations," American Economic Review, December 1981, pp. 1003-11. Other cross-sectional studies have taken the much needed step of relaxing the assumption of steady state flows: Kim Clark and Lawrence Summers, "Labor Market Dynamics and Unemployment: A Reconsideration," Brookings Papers on Economic Activity, vol. 1, 1979, pp. 13-60. There has also been much debate as to whether individuals who are newly or currently unemployed form the appropriate group for analysis of duration. See Clark and Summers, "Labor Market Dynamics and Unemployment"; Akerlof and Main, "An Experience-Weighted Measure"; and John A. Carlson and Michael W. Horrigan, "Measures of Unemployment Duration as Guides to Research and Policy: Comment," American Economic Review, December 1983, pp. 1143-50. In contrast, one recent study utilizes combined cross-sectional data to construct a nonsteady state estimate of duration among the group of individuals exiting unemployment between consecutive survey dates: Hal Sider, "Unemployment Duration and Incidence: 1968-1982," American Economic Review, June 1985, pp. 461-72.

4 Sider, "Unemployment Duration," p. 464.

5 To derive this result, assume that the constant inflow into unemployment between dates (t-1) and (t) is represented by:

F(t) = F for all time periods (t)

Next, it is necessary to specify an attrition process for each member of a newly unemployed group at survey date (t); in particular, it is often assumed that each individual has the same constant probability, P, of remaining unemployed between any dates (j-1) and (j), so that (1-P) is the constant probability of exiting unemployment. That is,

P(j) = P for all time periods (j)

The expected completed spell duration for a group of newly unemployed individuals at time (t) becomes:

$$D = 1(1-P) + 2P(1-P) + 3P^{2}(1-P) + \cdots = \frac{1}{1-P}$$

To estimate this concept it is only necessary to observe that in a steady state world, the constant level of unemployment is equal to the product of the constant rate of inflow and duration as measured by D. Since the level of unemployment is equal to this month's inflow and the remaining members of all previous inflow groups, it follows that:

$$U = F + FP + FP^{2} + FP^{3} + \cdots = F\left(\frac{1}{1-P}\right) = FD$$

Hence, the formula for duration becomes the ratio of the constant level of unemployment to the constant level of inflows; that is,

D = U/F
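The two series in this note can be checked numerically. The sketch below uses hypothetical function names and illustrative values of F and P, with `p` in the role P plays in the series, so that (1 - p) is the share of a group leaving unemployment each period.

```python
def duration_from_series(p, terms=2000):
    """Sum of k * p**(k-1) * (1-p) over k = 1..terms (expected spell length)."""
    return sum(k * p ** (k - 1) * (1 - p) for k in range(1, terms + 1))


def stock_from_series(f, p, terms=2000):
    """Sum of f * p**k over k = 0..terms-1 (steady state unemployment level)."""
    return sum(f * p ** k for k in range(terms))


p = 0.6      # illustrative value, not an estimate
f = 1000.0   # illustrative constant inflow
d = duration_from_series(p)
u = stock_from_series(f, p)
print(round(d, 6), round(u, 3), round(u / f, 6))
```

Both sums are truncated at a large number of terms, which is adequate here because the omitted tail is negligible for the value of `p` used.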

6 This justification no longer holds when the assumption of steady state flows is relaxed.

7 The interval [less than 5 weeks] is defined to include any individual with a current spell age of less than 4.5 weeks.

8 The way in which this is accomplished is to assume that the spell ages of the currently unemployed are governed by a particular distribution which is deterministically related to the distribution of completed spell lengths among the newly unemployed. In other words, once you know the parameters of the distribution of spell ages, you automatically know the parameters of the distribution of completed spell lengths.

Hence, all that is needed to generate estimates of completed spell duration is published information on spell ages. In particular, by maximizing the likelihood of observing the published breakdown of spell ages by time unemployed, Salant generates estimates of the parameters of the distribution of spell ages for the currently unemployed. These parameter estimates are in turn used to generate an estimate of average completed spell length for the newly unemployed.

9 These figures are all seasonally adjusted numbers. In addition, in setting up the likelihood specification, Salant noted that the endpoints of the published groupings corresponded to the following:

[0 to 4.5 weeks]

[4.5 to 14.5 weeks]

[14.5 to 26.5 weeks]

[26.5 to 99 weeks]

10 Salant used the following density function to describe the probability of observing an unemployment spell of current length (T) at a survey date:

$$g(T) = (r-1)\,a^{r-1}\,(a+T)^{-r}$$

The density function of completed spell lengths turns out to equal:

$$f(x) = r\,a^{r}\,(a+x)^{-(r+1)}$$
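As a consistency check on the two densities, offered here as a sketch that assumes r > 1 so that the mean completed spell length is finite, the density of spell ages equals the survivor function of completed spells divided by their mean. The symbol S(x), the share of completed spells lasting longer than x, is introduced only for this check.

```latex
\begin{aligned}
S(x) &= \int_x^{\infty} f(u)\,du = \left(\frac{a}{a+x}\right)^{r}, \\
E[x] &= \int_0^{\infty} S(x)\,dx = \frac{a}{r-1}, \\
g(T) &= \frac{S(T)}{E[x]} = (r-1)\,a^{r-1}\,(a+T)^{-r}.
\end{aligned}
```

Under this assumption, the mean completed spell length implied by estimates of a and r is a/(r-1).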

11 Comparing the fitted with the published breakdowns of time unemployed yields a chi-squared statistic of (3.199) which, given the critical chi-squared value of (5.99), provides strong evidence in favor of the chosen density structure as appropriate at the 95-percent level of confidence.

12 As is pointed out in the final section, when it is constructed on a monthly basis, Kaitz's statistic responds to business cycle turning points. The assumption of a constant level of unemployment is applied anew each time the statistic is constructed.

13 The procedure for generating the monthly average probability values given in the text tabulations on p. 6 was as follows:

First, we used the data underlying the published cross-sectional statistics to divide time unemployed into the following single-week intervals:

[0 to 4 weeks], [5 to 8 weeks], [9 to 12 weeks], [13 to 16 weeks], [17 to 20 weeks]

Second, we used the parameter estimates from the maximum likelihood procedure to generate the probability of observing a spell in each of those intervals. We then multiplied these probabilities by the size of the currently unemployed to predict the population of these intervals.

Finally, we constructed the following transition probabilities:

P1 = [5 to 8 weeks]/[0 to 4 weeks]

P2 = [9 to 12 weeks]/[5 to 8 weeks]

P3 = [13 to 16 weeks]/[9 to 12 weeks]

P4 = [17 to 20 weeks]/[13 to 16 weeks]

Strictly speaking, these are not transition probabilities because they are based on interval populations using January 1979 data; however, they can be interpreted as the transition probabilities implied by the attrition process embodied in Salant's maximum likelihood procedure.
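The steps in this note can be sketched as follows. The sketch assumes the spell-age density g(T) = (r-1) a^(r-1) (a+T)^(-r), whose cumulative form is 1 - (a/(a+T))^(r-1), and it assumes half-week interval endpoints in the manner of note 9. The parameter values, the endpoints, the size of the currently unemployed, and the function names are all hypothetical; they are not the estimates behind the text tabulations.

```python
def age_cdf(weeks, a, r):
    """Share of in-progress spells younger than `weeks` under the assumed g(T)."""
    return 1.0 - (a / (a + weeks)) ** (r - 1.0)


def implied_transition_ratios(edges, a, r, unemployed):
    """Ratios of predicted populations in successive spell-age intervals."""
    populations = [
        unemployed * (age_cdf(hi, a, r) - age_cdf(lo, a, r))
        for lo, hi in zip(edges, edges[1:])
    ]
    return [nxt / cur for cur, nxt in zip(populations, populations[1:])]


# Assumed endpoints for [0 to 4], [5 to 8], [9 to 12], [13 to 16], [17 to 20] weeks.
edges = [0.0, 4.5, 8.5, 12.5, 16.5, 20.5]
ratios = implied_transition_ratios(edges, a=10.0, r=3.0, unemployed=6000.0)
print([round(p, 3) for p in ratios])
```

The size of the currently unemployed cancels in each ratio; it is kept in the sketch only to mirror the step of predicting interval populations before forming P1 through P4.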

14 Norman Bowers and Francis Horvath, "Keeping Time: An Analysis of Errors in the Measurement of Unemployment Duration," Journal of Business and Economic Statistics, April 1984.

15 These duration estimates derived using alternative interval selections were based on seasonally unadjusted single-week duration data.

16 This "fact" of nature turns out to be a critical assumption in Salant's empirical method; it is interesting that he does not perform any formal statistical tests of his maintained hypothesis.

17 This test requires construction of the G-statistic which is defined as follows:

Assume that the observations on the remaining sizes of a group occur at scaled intervals given by t(i), (i=0, 1, . . ., n). Let the number of exits which occur between t(i-1) and t(i) be given by (i). Let Wi be defined as:

$$W_i = (n-i+1)\,(t_i - t_{i+1})$$

As it turns out, the following statistic is distributed standard normal:

$$Z = [12(n-1)]^{.5}\,(G_{r,n} - .5)$$

where:

$$G_{r,n} = \frac{\sum_{i=1}^{r-1} i\,W_{i+1}}{(r-1)\sum_{i=1}^{r} W_i}$$

The maintained hypothesis is that the probability of exiting unemployment is constant over the group's spell length. Large negative values of the observed Z are supportive of Salant's sorting process.

The sample value of Z associated with the January 1979 newly unemployed group is (-4.026) which is sufficiently greater than (1.96) in absolute value to reject the maintained hypothesis at the 97.5-percent level of confidence.

18 This assumption requires that the entries and exits occurring between survey dates are governed by a uniform distribution.

19 Kaitz, "Analyzing the length"; Salant, "Search Theory"; and Akerlof and Main, "An Experience-Weighted Measure."

Table 1. Duration of unemployment by selected single weeks, unadjusted data for January and February, 1979

Table 2. Single-week intervals defining the original and remaining sizes of the newly unemployed cohort in January 1979

Table 3. Number of the unemployed who found work between successive survey periods, 1979

Table 4. Duration estimates applying parametric smoothing techniques to generate fitted transition probabilities

Chart 1. Cyclical behavior of various unemployment duration measures, 1967-82

Chart 2. Cyclical behavior of the ratio of the Kaitz steady state measure to the Weibull nonsteady state measure, 1967-82

Title Annotation: Current Population Survey

Author: Horrigan, Michael W.

Publication: Monthly Labor Review

Date: Jul 1, 1987
