CRIME AND THE BUSINESS CYCLE IN POST-WAR BRITAIN REVISITED.
The notion that crime is related to economic hardship has a long and distinguished history within both criminology and economics. A recent example of the importance of the debate may be gleaned from papers emanating from a criminological colloquium organized by the Council of Europe (1995). One stimulus for the renewed interest was the publication of Field's study on the relationship between crime and the economy (Field 1990). He used annual data for England and Wales in the post-war period and concluded that ` ... economic factors have a major influence on trends in both property and personal crime'. According to Wells (1994) Field has ` ... done more than any other researcher in modern times to establish that property crime, measured from police records, is sensitive to the cycle in economic activity'. Anglo-centrism aside there is no doubt that the work has had significant impact upon both political and academic debate. Field found that property crimes (burglary, theft and robbery) were counter-cyclically related to economic activity. In particular he used personal consumption as his indicator of the business cycle and found no role for unemployment. Field further interpreted this effect as a short-run phenomenon ` ... this effect is short term rather than one which might explain the growth of property crime in the long term ... the full relation between long-run economic growth and growth in property crime is as yet unclear, it seems that the effects identified in this study have only a limited bearing on the issue.' (Field 1990: 5). The role of consumption in Field's work is to explain fluctuations around the upward trend in crime but not the trend itself which is left as an essentially exogenous fact of life.
In an interesting contribution to the debate, Pyle and Deadman (1994) (see also Deadman and Pyle 1997) suggest that Field's work might be improved to allow explicit modelling of long-run trends. They do this by presenting what are known to economists as error correction models (ECM) for property crime in post-war England and Wales. The modelling approach they adopt uses the ideas of integration and co-integration developed by econometricians and statisticians to allow long-run relationships between variables to be incorporated into short-run dynamic models. Pyle and Deadman argue that crime variables have long-run relationships with each of personal consumption, gross domestic product (GDP) and unemployment. This paper will adopt a similar methodology to Pyle and Deadman but argues that their empirical results are incorrect since they depend critically upon their finding that the crime variables need to be differenced twice for stationarity, or formally that they are integrated of order 2, I(2). On the contrary it will be shown that, as argued by both Hale and Sabbagh (1991) and Osborn (1995), the crime variables are in fact I(1), and hence only need differencing once for stationarity. From this it follows that the co-integrating regression used by Pyle and Deadman both to test for co-integration between the crime and the set of explanatory variables and to obtain the error correction term for the second stage of their modelling is mis-specified. Consequently their finding of a co-integrating relationship between the differenced crime data is wrong since their dependent variables are I(0) and hence can have no relationship with their set of I(1) explanatory factors. It will be argued that it is precisely this lack of relationship which leads to their incorrect conclusion of co-integration. 
The results for the correctly specified co-integrating equation will then be presented and it will be shown that the evidence for such relationships is less strong than suggested by Pyle and Deadman. In particular it will be shown that burglary and theft are co-integrated with consumption alone and that there is no co-integrating relationship between robbery and any of the explanatory variables either singly or together. Finally based on these results explanatory models of the trends in property crime will be presented. Before doing this, however, it is necessary to review the methodology adopted and to consider Pyle and Deadman's criticisms of earlier work in the area.
Co-integration and Error Correction Models
The approach to modelling the long-run crime data adopted here is that developed from the work of Engle and Granger (1987). This methodology has been used in earlier work analysing British crime data. For example, Hale (1989) utilizes it to examine imprisonment rates in England and Wales; Hale and Sabbagh (1991), Pyle and Deadman (1994) and Osborn (1995) have employed it to consider the relationship between post-war crime trends and the economy; while Hale and Caddy (1995) have used it to look at crime trends since the middle of the nineteenth century.
The Engle-Granger approach begins by recognizing that for the standard results of multiple regression analysis to be valid, the variables used must be stationary, which (roughly) means their properties should be constant over time. In particular they should not exhibit any tendency to drift upwards or downwards (constant mean) and should have constant variance. The typical realistic case in practice is one of trend stationarity where the series is stationary around an underlying deterministic (usually linear) trend. Variables which are not stationary should be differenced until they are. Variables which need differencing once for stationarity are referred to as being integrated of order one, I(1), those which need differencing twice as I(2), and so on. In this schema stationary variables are I(0). The first step in any analysis is therefore to test all the data for levels of integration in order to identify the appropriate difference operator to apply to each series to achieve stationarity. It is the appropriately differenced data which should be used in modelling relationships.
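The link between differencing and the order of integration can be illustrated with a short simulation. The sketch below is not part of the original analysis: the series are artificial, and the seed and sample length are arbitrary assumptions. It builds I(1) and I(2) series by cumulating white noise and checks that differencing the appropriate number of times recovers the stationary series.

```python
import numpy as np

rng = np.random.default_rng(0)       # arbitrary seed, for reproducibility only
eps = rng.standard_normal(200)       # artificial stationary white noise, I(0)

i1 = np.cumsum(eps)                  # cumulating once gives an I(1) series
i2 = np.cumsum(i1)                   # cumulating twice gives an I(2) series

# Differencing d times undoes d rounds of cumulation (the first d points are lost).
once_differenced = np.diff(i1)       # should equal eps from the second point on
twice_differenced = np.diff(i2, n=2) # should equal eps from the third point on

print(np.allclose(once_differenced, eps[1:]))
print(np.allclose(twice_differenced, eps[2:]))
```

With real data the order of integration is of course unknown and has to be inferred from unit root tests rather than read off by construction.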
There is, however, a further stage to be considered. Much economic (and sociological?) theory is in terms of long-run equilibrium relationships between variables. Differencing data has the effect of purging it of long-run information (the trend) and hence ignoring these possible long-run relationships. Excluding this information when such relationships exist does not seem sensible. The way to restore it is to use error correction models (ECM), which incorporate both long and short-run aspects of the data. This is only possible when there exists a linear combination of non-stationary variables which is itself stationary. If such a linear combination of variables exists, the variables are said to be co-integrated.(1) If the variables are not co-integrated then it is only possible to model the short-run dynamic relationship between them. However, if the variables are co-integrated, models which ignore this fact will be mis-specified.
In summary conventional regression analysis faces two threats from non-stationary variables. The first arises because regressing two unrelated I(1) variables against each other results in more high values of t and F statistics than would be predicted by a statistical theory based upon stationary variables. Consequently the use of standard distributions when the variables are unrelated and non-stationary will lead to too frequent rejection of the null hypothesis that there is no relationship between the variables. It is to avoid such spurious regressions that Granger and Newbold (1974) propose that in such cases the regressions be conducted in terms of first differences rather than levels of the variables. Co-integration analysis on the other hand avoids such problems by using the correct rather than incorrect statistical distributions.
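The spurious regression problem is easy to reproduce by simulation. The following is a minimal sketch, not a replication of Granger and Newbold (1974): the sample length of 46 (chosen to mirror the number of annual observations for 1946-91), the number of replications and the seed are all assumptions made for illustration. It regresses one random walk on another, independent, random walk and counts how often the conventional t-ratio exceeds 1.96 in absolute value, first in levels and then in first differences.

```python
import numpy as np

def ols_t_stat(y, x):
    """Conventional t-ratio on the slope from regressing y on a constant and x."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

def rejection_rate(n_obs=46, n_reps=2000, differenced=False, seed=0):
    """Share of replications with |t| > 1.96 for two unrelated random walks."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_reps):
        y = np.cumsum(rng.standard_normal(n_obs))  # I(1) series
        x = np.cumsum(rng.standard_normal(n_obs))  # independent I(1) series
        if differenced:
            y, x = np.diff(y), np.diff(x)          # both become I(0)
        if abs(ols_t_stat(y, x)) > 1.96:
            rejections += 1
    return rejections / n_reps

levels_rate = rejection_rate(differenced=False)
diff_rate = rejection_rate(differenced=True)
print(f"levels: {levels_rate:.2f}, first differences: {diff_rate:.2f}")
```

In levels the nominal 5 per cent test should reject far more often than 5 per cent of the time, while in first differences the rejection rate should be close to its nominal level.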
The second problem arises with non-stationary variables that are truly related. As noted above, using a model in first differences in this situation results in a mis-specified regression equation since the long-run relationship has been ignored. The error correction term connecting the co-integrated variables is missing from the estimated equation. (For an interesting and amusing discussion of these issues see Murray 1994).
More formally let us consider the simplest bivariate case of a possible relationship between two variables X and Y. Suppose X and Y are both I(1) and that there exists a long-run equilibrium relationship between them which may be expressed as
(1) [Y.sub.t] = [Alpha] + [Beta][X.sub.t] + [u.sub.t]
then short-run deviations
(2) [u.sub.t] = [Y.sub.t] - [Alpha] - [Beta][X.sub.t]
should not show any tendency to increase over time. That is to say they should be stationary.
Looked at in another way, there are two variables which are I(1) but a linear combination, [Y.sub.t]-[Alpha]-[Beta][X.sub.t], of which is stationary, I(0). In this case, as noted earlier, Y and X are said to be co-integrated.
To determine whether the relationship in (1) actually holds, the model is fitted and the regression residuals, the estimates of [u.sub.t], are tested for stationarity. If Y and X are co-integrated then the error correction model of the form
(3) [Delta][Y.sub.t]=[[Theta].sub.0]+[[Theta].sub.1][Delta][X.sub.t]+ [[Theta].sub.2]([Y.sub.t-1] - [Alpha] - [Beta][X.sub.t-1])+[[Epsilon].sub.t]
can be estimated. The term in brackets is the error correction term which, if [[Theta].sub.2] is negative, corrects short-run deviations from the equilibrium level implied by (1), hence the name error correction. If the variables are not co-integrated then
(4) [Delta][Y.sub.t]=[[Theta].sub.0]+[[Theta].sub.1][Delta][X.sub.t]+ [[Epsilon].sub.t]
is the appropriate form to consider.
When co-integrating relationships between variables are found then the problem which immediately arises is how to estimate the error correction term ([Y.sub.t-1]-[Alpha]-[Beta][X.sub.t-1]) which appears in (3). Engle and Granger (1987) suggest a two-stage procedure using the lagged residuals from Ordinary Least Squares (OLS) estimation of the long-run regression (1) as estimates of the error correction term in (3). Since all the terms in (3) are by definition stationary this may be done by OLS and Engle and Granger show that the resulting parameter estimates are not only consistent but also asymptotically efficient.
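The two-stage procedure can be sketched on artificial data. Everything in the example below is an assumption made for illustration: the data generating process, the hypothetical parameter values (a long-run intercept of 2.0, a long-run slope of 1.5, and deviations from equilibrium that follow an AR(1) with coefficient 0.5), the sample size and the seed. It is not the specification estimated for the crime data.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients and residuals for y regressed on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

rng = np.random.default_rng(1)                     # arbitrary seed
n = 500
x = np.cumsum(rng.standard_normal(n))              # I(1) regressor
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.5 * u[t - 1] + rng.standard_normal()  # stationary deviations
y = 2.0 + 1.5 * x + u                              # hypothetical alpha and beta

# Stage 1: long-run regression (1) in levels; the residuals estimate u_t.
(alpha_hat, beta_hat), u_hat = ols(y, np.column_stack([np.ones(n), x]))

# Stage 2: ECM (3), using the lagged stage-1 residuals as the error correction term.
dy, dx = np.diff(y), np.diff(x)
X2 = np.column_stack([np.ones(n - 1), dx, u_hat[:-1]])
(theta0, theta1, theta2), _ = ols(dy, X2)

print("long-run slope:", beta_hat)
print("short-run slope:", theta1)
print("error correction coefficient:", theta2)
```

Under this artificial process the error correction coefficient should come out negative, which is the sign required for deviations from the long-run relationship to be corrected. In applied work the stage-1 residuals would first be tested for stationarity before the second stage is estimated.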
This two-stage procedure (hereafter the EG procedure) has, however, been criticized on the grounds of the small sample bias present in the OLS estimation of the co-integrating equation. Banerjee et al. (1986) propose overcoming this bias by estimating the long-run and short-run parameters in a single step. If (3) is rewritten by multiplying out the brackets the result is
(5a) [Delta][Y.sub.t]=[[Theta].sub.0]-[[Theta].sub.2][Alpha]+ [[Theta].sub.1][Delta][X.sub.t]+[[Theta].sub.2] [Y.sub.t-1]-[[Theta].sub.2] [Beta][X.sub.t-1]+[[Epsilon].sub.t]
or in unrestricted form
(5b) [Delta][Y.sub.t]=[[Delta].sub.0]+[[Delta].sub.1][Delta][X.sub.t] +[[Delta].sub.2][Y.sub.t-1]+[[Delta].sub.3][X.sub.t-1]+[[Epsilon].sub.t]
Although [Y.sub.t-1] and [X.sub.t-1] are I(1) variables, OLS can still be applied, since there is a linear combination which is I(0). There is evidence from small sample simulation studies that the small sample properties of estimates of (5b) are superior to those from the Engle-Granger two-step procedure. For further discussion of the asymptotic properties of the estimates reference should be made to Sims, Stock and Watson (1990). Using their results it can be shown that the individual OLS parameter estimates for the coefficients [[Delta].sub.0] to [[Delta].sub.3] in (5b) have asymptotic normal distributions and hence the standard procedures can be applied (see for example Banerjee et al. 1993: 188-89; Davidson and MacKinnon 1992: 724-25). Hereafter this unrestricted procedure will be referred to as the Sims-Stock-Watson or SSW approach to estimating the ECM. In the results section estimates from both the EG and SSW approaches will be presented for comparative purposes but where these conflict the SSW estimates will be preferred.
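Comparing (5a) with (5b) term by term shows that the coefficient on [Y.sub.t-1] in (5b) equals [[Theta].sub.2] and the coefficient on [X.sub.t-1] equals -[[Theta].sub.2][Beta], so the long-run slope can be recovered as minus the ratio of the latter to the former. The sketch below illustrates this on artificial data; the data generating process, the hypothetical long-run slope of 1.5, the sample size and the seed are assumptions, and the names d0 to d3 simply stand for the four coefficients of (5b).

```python
import numpy as np

rng = np.random.default_rng(2)                     # arbitrary seed
n = 500
x = np.cumsum(rng.standard_normal(n))              # I(1) regressor
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.5 * u[t - 1] + rng.standard_normal()  # stationary deviations
y = 2.0 + 1.5 * x + u                              # hypothetical long-run relation

# Unrestricted ECM (5b): regress dY_t on a constant, dX_t, Y_{t-1} and X_{t-1}.
dy, dx = np.diff(y), np.diff(x)
X = np.column_stack([np.ones(n - 1), dx, y[:-1], x[:-1]])
d0, d1, d2, d3 = np.linalg.lstsq(X, dy, rcond=None)[0]

# From (5a): d2 = theta2 and d3 = -theta2 * beta, so beta = -d3 / d2.
beta_implied = -d3 / d2
print("coefficient on lagged Y:", d2)
print("implied long-run slope:", beta_implied)
```

The single-step regression delivers the short-run and long-run parameters together, which is why no separate first-stage regression in levels is needed.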
Clearly then an important first step in any regression analysis using time series data is to establish the level of integration of the variables to be used. Once this has been done it is possible to move to the second stage and look for possible co-integrating relationships between any set of I(1) variables in order to establish whether our modelling strategy should be based upon a generalization of equation (3) or whether (4) is the correct specification. It is, however, important to emphasize that tests for co-integration are critically dependent upon the first stage being correct, that is to say that the variables included in the co-integrating regression are indeed all I(1). It is the contention of this paper that crime variables are indeed I(1) and not, as claimed by Pyle and Deadman, I(2). Hence rather than using, as they do, first differences of the crime data and levels of the explanatory variables in the co-integrating regression, all variables should be in levels.
To recap, Pyle and Deadman argue that in terms of the simple bivariate model above, if [Y.sub.t] is a crime variable then it is I(2) not I(1). Hence any long term co-integrating relationship which exists between I(1) variables will involve [Delta][Y.sub.t] not [Y.sub.t] and will be of the form
(6) [Delta][Y.sub.t] = [Alpha] + [Beta][X.sub.t] + [u.sub.t]
rather than that suggested in (1). What, however, if a mistake has been made at the first stage when testing [Y.sub.t] for its level of integration and it is indeed I(1)? The estimation of a regression equation based upon (6) now involves regressing a stationary I(0) variable, [Delta][Y.sub.t], on one which is I(1) and this is clearly spurious. Stationary variables are not associated with those that are non-stationary. As Stock and Watson (1988) note, if a stationary variable is regressed against a non-stationary variable in large samples, the observed association will tend to zero as the variation in the stationary variable grows ever smaller in relation to the variation in the non-stationary variable. Or as Banerjee et al. (1993) note:
... when an I(0) series is regressed on an I(1) series, the only way in which ... the regression (can be made) consistent ... is to drive the coefficient on the I(1) variable to zero. (Banerjee et al. 1993: 80)
Hence were this strategy to be mistakenly pursued and the regression of [Delta][Y.sub.t] on [X.sub.t] used as the basis for the co-integrating regression, the estimate of [Beta] would tend to be relatively small and the model [R.sup.2] would be expected to be low. In essence in large samples we would be regressing [Delta]Y on a constant. The residuals from the regression would be of the form
(7) [u.sub.t] = [Delta][Y.sub.t] - [Alpha]
and hence would necessarily be stationary and so it would be inferred that [Delta]Y and X are co-integrated. As can be seen from the above discussion this conclusion would be wrong and would be entirely an artifact of the mistaken assumption that Y was I(2).
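The behaviour described here can be checked by simulation. The following minimal sketch uses artificial, independently generated random walks (an assumption, as are the seed and the sample sizes) and regresses the first difference of one I(1) series on the level of another. The slope and the R-squared should both shrink towards zero as the sample grows, leaving residuals that are little more than the stationary dependent variable less a constant.

```python
import numpy as np

def slope_and_r2(y, x):
    """Slope and R-squared from regressing y on a constant and x."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return beta[1], r2

rng = np.random.default_rng(3)                 # arbitrary seed
for n in (50, 500, 5000):
    y = np.cumsum(rng.standard_normal(n))      # Y is I(1), so its difference is I(0)
    x = np.cumsum(rng.standard_normal(n))      # X is I(1)
    slope, r2 = slope_and_r2(np.diff(y), x[1:])
    print(n, round(slope, 4), round(r2, 4))
```

A residual-based stationarity test applied to such a regression would be expected to reject non-stationarity, but only because the dependent variable was stationary to begin with.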
Testing the Level of Integration for the Crime Variables
Pyle and Deadman (1994: 353) are critical of the approach to testing integration adopted by Hale and Sabbagh (1991).(2) They propose an alternative approach using the Dickey-Pantula (1987) procedure which starts with the highest entertained level of integration and decreases the order of differencing each time the current null hypothesis is rejected. Normally, given that the data would be expected to be at most I(1), it would be sufficient to begin with the assumption that it was integrated of order 2, but given the argument of Pyle and Deadman that crime variables are in fact I(2), it seems more appropriate to base the description that follows upon the initial assumption that the variables may be I(3).
Step One: Test the maintained hypothesis [H.sub.3]: Data is I(3) against the alternative [H.sub.2]: Data is I(2).
If [H.sub.3] is rejected proceed to
Step Two: Test the maintained hypothesis [H.sub.2]: Data is I(2) against the alternative [H.sub.1]: Data is I(1).
If [H.sub.2] is rejected proceed to
Step Three: Test the maintained hypothesis [H.sub.1]: Data is I(1) against the alternative [H.sub.0]: Data is stationary I(0).
The procedure is based upon estimating the following general regression equation under various restrictions on the parameters
(8) [[Delta].sup.3][Y.sub.t]=[[Mu].sub.[Mu]]+[Gamma]t+[[Lambda].sub.1] [Y.sub.t-1]+[[Lambda].sub.2][Delta][Y.sub.t-1]+[[Lambda].sub.3][[Delta].sup.2] [Y.sub.t-1]+[[Epsilon].sub.t]
The hypotheses are tested(3) sequentially using Dickey-Fuller (DF) statistics. As the tests are only valid when the disturbance terms are serially independent, lagged values of [[Delta].sup.3]Y should be added to the right hand side of the equations until this is achieved.
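The sequence of tests can be sketched in code. The function below is an illustrative reading of the three steps, not the routine used to produce the results reported here: it always includes a constant and a trend, adds no augmentation lags (so it corresponds to DP(0)), and uses as its default the 5 per cent MacKinnon (1991) critical value of -3.52 for the constant/trend case given in Table 1. The function names and the artificial random walk in the usage lines are assumptions.

```python
import numpy as np

def t_stat(y, X, col):
    """Conventional OLS t-ratio on column `col` of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[col] / np.sqrt(cov[col, col])

def dickey_pantula(y, crit=-3.52):
    """Sequential DF-type tests of I(3), I(2) and I(1) from restricted forms of (8).

    Returns the statistics computed before the sequence stops, that is,
    up to and including the first null hypothesis that cannot be rejected.
    """
    y = np.asarray(y, dtype=float)
    d1, d2, d3 = np.diff(y), np.diff(y, 2), np.diff(y, 3)
    dep = d3                 # third difference of Y at time t
    lag_y = y[2:-1]          # Y at t-1
    lag_d1 = d1[1:-1]        # first difference of Y at t-1
    lag_d2 = d2[:-1]         # second difference of Y at t-1
    m = len(dep)
    det = np.column_stack([np.ones(m), np.arange(m)])  # constant and trend
    steps = [
        ("H3 vs H2", [lag_d2]),                 # lambda1 = lambda2 = 0 imposed
        ("H2 vs H1", [lag_d1, lag_d2]),         # lambda1 = 0 imposed
        ("H1 vs H0", [lag_y, lag_d1, lag_d2]),  # unrestricted
    ]
    results = {}
    for name, regressors in steps:
        X = np.column_stack([det] + regressors)
        results[name] = t_stat(dep, X, 2)  # first stochastic regressor is tested
        if results[name] > crit:           # null not rejected: stop the sequence
            break
    return results

rng = np.random.default_rng(4)                     # arbitrary seed
random_walk = np.cumsum(rng.standard_normal(500))  # artificial I(1) series
results = dickey_pantula(random_walk)
for step, stat in results.items():
    print(step, round(stat, 2))
```

For a series that is truly I(1) the first two nulls should be rejected decisively and the sequence should reach the final step. In practice lagged values of the dependent variable would be added until the residuals are serially independent, as described above.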
Osborn (1995) has in turn criticized Pyle and Deadman for their over-reliance on results based upon equations without a constant term and, notwithstanding their comments on Hale and Sabbagh, the fact that they arbitrarily use augmentation of order 1. Results are presented below based upon the Dickey-Pantula approach for equations containing both trend and constant terms, constant term alone and finally with neither trend nor constant. In practice one should never consider applying a unit root test without either a trend or constant.(4) However in order to discuss the Pyle and Deadman results, tests based on the more restrictive models will be reported even where not strictly necessary. For similar reasons, even though the preferred strategy would be to report only results with the minimum augmentation necessary for serially independent residuals, results obtained from 0, 1 and 2 degrees of augmentation are presented for each case.
Using Dickey-Pantula tests (hereafter DP(x) where x indicates the degree of augmentation) involves using the same critical values as for the ADF statistics. As Pyle and Deadman (1994: 353) note, these vary widely across investigators since all are derived using computer simulations and hence will be sensitive to the data generating process.
There is general agreement that crime variables in England and Wales since the Second World War have been non-stationary and that the highest level of integration they will have is I(2). Although the main interest of the analysis presented here is whether crime variables are I(2) or I(1) and hence the discussion will focus on the statistics for testing [H.sub.2] against [H.sub.1], in line with the methodology set out above, tests were first carried out to test whether crime variables might be I(3). In the interests of space the results are not reported but for each of the three property crimes (using the same data as Pyle and Deadman (1994), to which reference should be made for full definitions and sources) the maintained hypothesis [H.sub.3] was rejected in favour of [H.sub.2].
Turning now to the results for [H.sub.2] against [H.sub.1], note that in Table 1 the statistics for each of the three specifications for burglary indicate that the null hypothesis [H.sub.2] be rejected at the 5 per cent significance level. On the basis of these results burglary is clearly at most I(1). However as Deadman and Pyle (p. 353) correctly point out, the burglary series was subject to a step change upward in 1969 following the Theft Act of 1968. They therefore suggest using the Perron test (Perron 1990; Perron and Wolfgang 1992), which was designed to test for levels of integration in data with structural breaks. Pyle and Deadman accept that this test indicates that burglary is I(1) and these results were replicated here. However they also conduct tests on the split samples 1946-68 and 1969-91 and find for both sub-samples that it is not possible to reject the null hypothesis [H.sub.2]. Again this is confirmed by the present work. Hence since they have already claimed that both theft and robbery are I(2) they argue that as there is no reason for crime variables to have different orders of integration burglary must also be I(2). There are several problems with this line of reasoning. First, as will become clear below, the conclusion that other crime variables are I(2) is itself in doubt.
TABLE 1 Tests for level of integration for crime variables 1946-91 ([H.sub.2]: data is I(2) vs [H.sub.1]: data is I(1))
Degree of augmentation for Dickey-Pantula (DP) statistics
Additional parameters     DP(0)                    DP(1)              DP(2)
Burglary
No constant/no trend      -2.53(*)                 -2.50              -2.00([dagger])
Constant/no trend         -3.47                    -4.01(*)           -3.74
Constant/trend            -4.10([dagger])          -4.99(*)           -5.30
Theft
No constant/no trend      -2.04(*)                 -0.73              -0.29([dagger])
Constant/no trend         -3.55(*)                 -2.46              -1.92([dagger])
Constant/trend            -5.00([double dagger])   -4.40(*)           -4.38
Robbery
No constant/no trend      -1.37(*)                 -0.57([dagger])    0.007([dagger])
Constant/no trend         -2.40(*)                 -1.63              -0.95([dagger])
Constant/trend            -4.77([dagger])          -5.01(*)           -4.47
5% and 1% critical values for the Dickey-Pantula statistics (sample size 40)
Source                        Additional parameters     5%              1%
Charemza and Deadman (1992)   No constant/no trend      -2.00           -2.76
Charemza and Deadman (1992)   Constant/no trend         -2.18           -3.20
Charemza and Deadman (1992)   Constant/trend            not reported    not reported
MacKinnon (1991)              No constant/no trend      -1.95           -2.62
MacKinnon (1991)              Constant/no trend         -2.94           -3.60
MacKinnon (1991)              Constant/trend            -3.52           -4.20
(*) Minimum level of augmentation required for serially independent residuals using the Lagrange Multiplier test for serial correlation.
([dagger]) Residuals serially correlated at the 5% level of significance.
([double dagger]) Residuals serially correlated at the 10% level of significance.
Secondly, it is not clear why all crime series should be integrated of the same order. Thirdly, the reason Perron develops his test is that small numbers of observations in the split samples result in standard tests having low power to reject the null hypothesis when it is false. This last reason alone would strongly suggest accepting the conclusion of the Perron procedure rather than the split samples and means rejecting the hypothesis that burglary is I(2) in favour of accepting that it is I(1).
Turning to the results for the theft series, note first that for testing [H.sub.2], either with neither constant nor trend or with just the constant included, the DP(0) statistics may be used since there is no evidence of residual serial correlation. Tests based on either statistic lead to a rejection of [H.sub.2] at the 5 per cent level. Pyle and Deadman however use DP(1) and hence do not reject [H.sub.2]. If the preferred variant including both trend and constant is used then the appropriate statistic is DP(1) and at both the 5 per cent and 1 per cent level of significance, [H.sub.2] is rejected. The evidence from Table 1 points firmly to the conclusion that theft is at most I(1) and hence only needs differencing once for stationarity.
The final crime series considered by Pyle and Deadman is robbery which they argue is `... unambiguously I(2)'. Certainly if the model with no additional parameters is used, that is with no constant or trend term included, then the appropriate statistic for testing [H.sub.2] v [H.sub.1] is DP(0) and this is not significant. However it has been argued earlier that the equation used for the test should include at least a constant term and preferably both a constant and trend. In the constant only case DP(0) may be used since there is no evidence of serial dependence in the residuals. The test statistic has a value of -2.40 and hence, using the Charemza and Deadman critical values, [H.sub.2] would be rejected at the 5 per cent level. Using the MacKinnon values would on the other hand lead to the acceptance of [H.sub.2]. Finally if the preferred approach of estimating the test statistic from the model with both constant and trend is used then the appropriate statistic is DP(1) and this is significant at the 1 per cent level. While the results for the robbery series are less clear cut than for burglary or theft, on balance the evidence does seem to indicate that it, like the other crime series, is indeed at most I(1).
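The dependence of the robbery result on the choice of critical values can be made explicit with a few lines of code. The sketch below uses only figures reported in Table 1 (the DP(0) statistic for the constant/no trend case and the two sets of critical values for sample size 40); the function and variable names are introduced here for illustration.

```python
# Critical values for the constant/no trend case, sample size 40, from Table 1.
CRITICAL_VALUES = {
    "Charemza and Deadman (1992)": {"5%": -2.18, "1%": -3.20},
    "MacKinnon (1991)": {"5%": -2.94, "1%": -3.60},
}

def rejects_null(statistic, critical_value):
    """A DF-type test rejects the null when the statistic lies below the critical value."""
    return statistic < critical_value

robbery_dp0 = -2.40  # DP(0) for robbery, constant/no trend, from Table 1

decisions = {
    source: {level: rejects_null(robbery_dp0, cv) for level, cv in levels.items()}
    for source, levels in CRITICAL_VALUES.items()
}
for source, result in decisions.items():
    print(source, result)
```

The same statistic rejects [H.sub.2] at the 5 per cent level under one set of simulated critical values and fails to do so under the other, which is why the constant-only result for robbery is treated as ambiguous.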
In summary the evidence presented in Table 1 suggests that the conclusions of Pyle and Deadman concerning the level of integration of the crime series may be wrong and that they are in fact at most I(1). That they are indeed I(1) and not I(0) is confirmed by the results in Table 2 where only the results from regressions including both constant and trend terms are reported.
TABLE 2 Tests for level of integration for crime variables 1946-91 ([H.sub.1]: data is I(1) vs [H.sub.0]: data is I(0))
Crime variable     DP statistic     Degree of augmentation
Burglary           -1.76            (2)
Theft              -1.01            (2)
Robbery            -2.23            (1)
Results from regressions including both constant and trend terms 5% and 1% critical values for the Dickey-Pantula statistics as for Table 1. Number in brackets indicates the degree of augmentation used in the test to produce serially independent residuals.
This means of course that the first differences of the crime series are stationary and that the co-integrating regression used by Pyle and Deadman (equation (1) p. 349), and reproduced below, is mis-specified. The equation they used was of the form
(9) [Delta][Crime.sub.t] = [[Gamma].sub.0] + [[Gamma].sub.1] [E.sub.t] + [[Gamma].sub.2][Con.sub.t] + [[Gamma].sub.3][Pol.sub.t] + [[Mu].sub.t]
where [E.sub.t] is a measure of the economy, either Gross Domestic Product (GDP), unemployment or consumers' expenditure, [Con.sub.t] is the relevant conviction rate and [Pol.sub.t] is the number of police officers and Pyle and Deadman present results which indicate that the variables on the right hand side are all I(1). The equation was estimated separately using each of the economic variables with broadly similar results. For the purposes of discussion here only the results for consumption are presented in Table 3.
TABLE 3 Co-integrating regressions for first difference(d) crime variables 1946-91
Dependent variable   Constant      Consumers' expenditure   Conviction rate   Police force numbers   [R.sup.2]
Burglary(*)          -2.93         0.13                     -48.87            0.27                   0.12
                     (14.65)       (0.62)                   (152.7)           (13.5)
Theft                232.68        -0.15                    -1,027.3          0.62                   0.27
                     (162.7)       (1.25)                   (583.7)           (2.48)
Robbery              3187.0        4.81                     6,614.30          -6.01                  0.34
                     (3,705.8)     (17.2)                   (4,186.2)         (40.1)
The upper figure in each cell is the estimated coefficient. Corresponding standard errors are given in brackets. Standard errors are reported in this table only since in this model the conventional t-ratio has a non-standard distribution.
(*) A dummy variable is included in the burglary model to allow for the structural break in the series after 1969.
Although in this case, as the independent variables are non-stationary, standard distributions no longer apply, the results nevertheless support the conclusion that [Delta]crime is stationary since all the reported [R.sup.2]s are very low, with results in line with those discussed by Banerjee et al. (1993) and Stock and Watson (1988) (see above) for situations where the dependent variable is I(0) and the independent variables are I(1). Pyle and Deadman's conclusion that the variables are co-integrated would be an artefact of their mis-specified long-run equation rather than signalling the existence of any equilibrium relationship.(5) But the sensitivity of the results must again be emphasized. The ADF and DP tests employed to test the level of integration, upon which the second stage investigation of co-integration critically depends, often result in different outcomes with respect to rejecting the null hypothesis of non-stationarity when different lag lengths are used. Blough (1992) has argued that the tests either have poor size, that is they over-reject the null of non-stationarity when it is in fact true, or they have low power, in that they fail to reject it when it is false. This means that any results depending on these tests must be treated with caution and those presented here are no exception.
Crime levels: Are There Any Co-integrating Relationships ?
If this argument, that the crime variables like the explanatory variables are all I(1), is correct then the next stage in the model building process is to investigate whether or not there exist co-integrating relationships between the levels of the crime variables and the levels of the explanatory variables. The strategy employed was first to regress each crime variable on each of the independent explanatory variables(6) in turn and to test the residuals from the regression for stationarity. Where the tests indicated that a set of independent variables was not co-integrated individually, a second stage investigation was conducted to determine whether a linear combination of them was co-integrated with the crime variables. To do this meant estimating large numbers of regression equations and it would be tedious to report every single test statistic. Since the main interest of the Deadman and Pyle work was to focus on long-run relationships between crime and economic variables, and in any case all other results were negative indicating no co-integrating relationship, only results for unemployment, consumption and GDP are presented in Table 4. In essence, equations of the following form were estimated, with [E.sub.t] the economic variable, and their residuals tested for stationarity:
(10) [Crime.sub.t] = [Alpha] + [Beta][E.sub.t] + [u.sub.t]
TABLE 4 Testing the long-run relationship of crime with economic variables
Statistic            Cons.         GDP           Unemp.
Burglary(a)
Coefficient          4.80          3.83          0.18
[R.sup.2]            0.93          0.91          0.89
ADF                  -4.25(b)      -2.98(c)      1.29
(augmentation)       (1)           (1)           (0)
Theft
Coefficient          12.05         8.97          0.57
[R.sup.2]            0.95          0.94          0.75
ADF                  -4.53(b)      -2.74(c)      -1.81(c)
(augmentation)       (1)           (1)           (1)
Robbery
Coefficient          209.28        151.37        10.33
[R.sup.2]            0.89          0.83          0.77
ADF                  -1.52(b)      -0.17(c)      -2.62(c)
(augmentation)       (1)           (1)           (2)
(a) The regression for burglary also included a dummy variable to allow for the change in definition in 1969.
(b) The approximate critical values for burglary are (MacKinnon 1991) -4.63 (1%) and -3.94 (5%) or (Charemza and Deadman 1992) -4.15 (1%) and -3.33 (5%).
(c) The approximate critical values for theft and robbery are (MacKinnon 1991) -4.15 (1%) and -3.37 (5%) or (Charemza and Deadman 1992) -3.66 (1%) and -2.87 (5%).
The results showed that both burglary and theft were co-integrated with consumer expenditure at the 5 per cent significance level at least but were not co-integrated with unemployment or GDP either singly or in combination. No co-integrating relationships were found for robbery. Osborn (1995) using quarterly data also found property crime and consumption to be co-integrated but again was unable to find any long-run relationship with unemployment. These results contrast with those of Pyle and Deadman who found that the economic variables were essentially interchangeable. A similar result on the existence of long-run equilibrium relationships between property crime and consumption has also been found in a study using longer series of data for England and Wales from 1857 onwards (Hale and Caddy 1995).
The results in Table 4 add weight to Field's argument that it is consumer expenditure which is the important economic variable.(7) However it is important to emphasize that these conclusions are based upon long-run results and indicate that recorded levels of burglary and theft have grown with increasing levels of consumer expenditure, in other words the relationship is a positive one. Field's results however, as noted above, were short-run effects and indicated that changes in property crime and personal consumption had a negative relationship. The two effects, long run and short run, work in opposite directions.
One possible explanation of the long-run relationship would make use of opportunity/routine activity theory which emphasizes that for a crime to take place three things are needed--a motivated offender, a suitable target and a lack of guardianship (for a recent work in this tradition see Felson 1994). If this argument is followed then the long-run positive relationship reflects the increased availability of suitable targets for property crime which has accompanied the long consumer boom of the last 40 years. This boom brought one of the biggest improvements in the standard of living in Britain since the middle ages (Obelkevich 1994). The beginning of the move to mass consumerism can be identified with the final lifting of rationing and the end of post-war austerity in the mid 1950s, the point at which recorded crime also began to increase after fluctuating around a relatively stable level since 1945. More generally it might be argued that rising affluence and the increased consumerism which has accompanied it have had a major impact on lifestyles which has decreased the level of `guardianship'.
In terms of opportunity/routine activity theory the short-run negative relationship reported by Field between recorded property crime and personal consumption may be interpreted as a motivational effect. Personal consumption here acts as an indicator of the business cycle increasing as the economy expands relative to the trend and decreasing when it contracts. During downturns in the cycle more offenders might be expected to be motivated to commit property offences. The short-run determinants of property crime will be discussed in more detail in the next section where the results of modelling the dynamic error correction models are presented.
Short-Run Determinants of Property Crime
The short-run dynamic models for burglary, theft and robbery are presented in Tables 5, 6 and 7 respectively. For burglary and theft the existence of co-integrating relationships with consumption means that an ECM is the correct specification for the short-run models while for robbery we were unable to find any long-run equilibrium relationship and so when modelling its dynamic behaviour no ECM term is included.
TABLE 5 Final models for burglary: 1948-91 (dependent variable: first differenced burglary)
Explanatory variable     Engle-Granger two-step procedure(a)   Sims-Stock-Watson unrestricted ECM estimator(b)
                         Coefficient (t-ratio)                 Coefficient (t-ratio)
Error correction term    -0.223 (-3.029)                       (*)
Burglary (-1)            (*)                                   -0.120 (-1.990)
Consumption (-1)         (*)                                   1.181 (3.682)
ΔUnemployment            0.190 (6.592)                         0.052 (2.095)
ΔConsumption             -3.626 (-2.248)                       -7.037 (-4.727)
ΔConsumption (-1)        6.294 (3.856)                         n.s.
ΔPolice                  -12.712 (-3.903)                      -5.246 (-1.825)
ΔPolice (-1)             -6.781 (-2.147)                       n.s.
ΔBurglary conviction     n.s.                                  -988.7 (-2.49)
Constant                 34.528 (3.085)                        -81.946 (-2.688)
(*) not included in model; n.s. not statistically significant.
(a) $R^2$ = 0.744; F statistic F(6,37) = 17.90 (p = 0.00); Lagrange multiplier test for serial correlation, $\chi^2(1)$ = 0.77 (p = 0.38); heteroscedasticity test, $\chi^2(1)$ = 1.40 (p = 0.24).
(b) $R^2$ = 0.794; F statistic F(6,38) = 24.43 (p = 0.00); Lagrange multiplier test for serial correlation, $\chi^2(1)$ = 0.48 (p = 0.49); heteroscedasticity test, $\chi^2(1)$ = 0.69 (p = 0.40).
TABLE 6 Final models for theft: 1948-91 (dependent variable: first differenced theft)
Explanatory variable     Engle-Granger two-step procedure(a)   Sims-Stock-Watson unrestricted ECM estimator(b)
                         Coefficient (t-ratio)                 Coefficient (t-ratio)
Error correction term    -0.208 (-2.656)                       (*)
Theft (-1)               (*)                                   -0.190 (-3.160)
Consumption (-1)         (*)                                   3.228 (4.759)
ΔTheft (-1)              0.541 (4.815)                         0.245 (2.323)
ΔUnemployment            0.159 (3.292)                         n.s.
ΔConsumption             -9.879 (-3.748)                       -12.538 (-7.268)
ΔConsumption (-1)        13.587 (4.805)                        n.s.
ΔPolice                  -18.521 (-3.555)                      -9.288 (-2.009)
ΔTheft conviction        n.s.                                  -1485.3 (-2.373)
Constant                 31.589 (1.845)                        -219.2 (-4.057)
(*) not included in model; n.s. not statistically significant.
(a) $R^2$ = 0.760; F statistic F(6,37) = 19.58 (p = 0.00); Lagrange multiplier test for serial correlation, $\chi^2(1)$ = 1.02 (p = 0.38); heteroscedasticity test, $\chi^2(1)$ = 0.44 (p = 0.51).
(b) $R^2$ = 0.813; F statistic F(6,37) = 26.87 (p = 0.00); Lagrange multiplier test for serial correlation, $\chi^2(1)$ = 1.01 (p = 0.75); heteroscedasticity test, $\chi^2(1)$ = 0.12 (p = 0.73).
TABLE 7 Final model for robbery: 1949-91 (dependent variable: first differenced robbery)
Explanatory variable     Coefficient (t-ratio)
ΔRobbery (-1)            0.261 (2.534)
ΔUnemployment            7.413 (9.891)
ΔUnemployment (-2)       -3.508 (-6.039)
ΔConsumption (-1)        254.806 (6.313)
ΔPolice                  -402.224 (-4.421)
ΔPolice (-1)             -333.361 (-3.726)
Constant                 669.939 (2.142)
$R^2$ = 0.824; F statistic F(6,36) = 28.18 (p = 0.00); Lagrange multiplier test for serial correlation, $\chi^2(1)$ = 2.38 (p = 0.123).
The first columns in both Tables 5 and 6 present ECM results based on the EG two-step procedure while those from the unrestricted SSW procedure appear in the second columns. In both tables there are marked differences between the two estimated models. As indicated above where there is conflict the SSW estimates are preferred.
Before commenting more generally upon the results note that it is possible to use the unrestricted SSW estimates to obtain a second estimate of the long-run parameter $\beta$ in equation (9) to compare with those from the Engle-Granger long-run regression given in Table 4. For burglary the SSW based estimate is 9.84 compared to 4.80 for EG and for theft 16.99 compared to 12.05.(8) These differences are quite large confirming the view that the estimates from the co-integrating regression, although super-consistent, may be biased.
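The SSW based figures can be recovered from the second columns of Tables 5 and 6. The sketch below assumes that the long-run parameter is obtained as minus the ratio of the coefficient on lagged consumption to the coefficient on the lagged crime variable in the unrestricted ECM; the function name is illustrative only and is not taken from the original analysis.

```python
def long_run_beta(coef_lagged_crime, coef_lagged_consumption):
    """Implied long-run coefficient from an unrestricted ECM.

    Assumption: the levels terms a*Y(-1) + b*X(-1) can be rewritten as
    a*(Y(-1) - beta*X(-1)), which gives beta = -b / a.
    """
    return -coef_lagged_consumption / coef_lagged_crime


# SSW coefficients on the lagged levels, copied from Tables 5 and 6
burglary_beta = long_run_beta(-0.120, 1.181)
theft_beta = long_run_beta(-0.190, 3.228)

print(round(burglary_beta, 2), round(theft_beta, 2))  # 9.84 16.99
```

Under this assumption the ratios reproduce the values of 9.84 and 16.99 quoted in the text.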
The ECM terms are statistically significant and of the correct sign in both models.(9) As discussed above it is these error correction terms which capture the long-run equilibrium relationships between the crime variables and personal consumption. The negative coefficient estimates associated with them indicate their roles in correcting for any deviation from this long-run equilibrium.
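The role of a negative error correction coefficient can be illustrated with a minimal sketch of the Engle-Granger two-step procedure on simulated data. Everything here is hypothetical: the series, the parameter values (a long-run slope of 5 and a deviation that decays at rate 0.5) and the helper `ols` are assumptions made for illustration, and no augmentation or diagnostic testing is attempted.

```python
import numpy as np


def ols(y, X):
    """Return OLS coefficients from regressing y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


rng = np.random.default_rng(0)
n = 500

# Hypothetical data: x is a random walk, so it is I(1)
x = np.cumsum(rng.normal(size=n))

# u is a stationary AR(1) deviation from the long-run relationship
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.5 * u[t - 1] + rng.normal()

# y shares the stochastic trend in x, so y and x are co-integrated
y = 2.0 + 5.0 * x + u

# Step one: co-integrating regression in levels
a_hat, b_hat = ols(y, np.column_stack([np.ones(n), x]))
ect = y - a_hat - b_hat * x  # residuals serve as the error correction term

# Step two: regress the change in y on the lagged residual and the change in x
dy, dx = np.diff(y), np.diff(x)
const, adjustment, short_run = ols(
    dy, np.column_stack([np.ones(n - 1), ect[:-1], dx])
)

print(b_hat, adjustment, short_run)
```

In this constructed example the adjustment coefficient should be close to -0.5: when y lies above its long-run level in one period, part of the gap is closed in the next.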
Unlike Field (1990) significant relationships were found between unemployment and crime. In particular changes in the level of unemployment were significantly and positively related to changes in all three property offences, although for theft unemployment was no longer significant when the preferred SSW estimation procedure was used. Whilst unemployment does not have a long-run equilibrium relationship with crime it does nevertheless play some role in the short-run fluctuations of recorded property crime. If unemployment increases year to year then other things being equal the models show that burglary and robbery will increase.
It has been suggested (Dickinson 1994) that the effect of unemployment changes may be asymmetric. It is argued that the increase in crime when unemployment increases is larger than the fall when unemployment decreases--crime responds relatively stickily to falls as compared to rises in unemployment. There is some strength in this argument and it has resonances with Hagan's notions of the social embeddedness of crime and unemployment (Hagan 1993). At a very simple level the suggestion here is that if unemployment causes individuals to become criminally active they may not stop offending when conditions in the labour market improve. Various reasons may be hypothesized for this, including `habit persistence' and the importance of the type of opportunities available as employment expands--whether they are perceived as `real' as opposed to `hamburger' jobs (see Allen and Steffensmeier 1989 for the importance of job quality on crime levels in the USA). However no empirical support for `stickiness' was found when the models were re-estimated to allow the estimated effect of unemployment to vary. This result confirms that of Osborn (1995) who, using quarterly data, was also unable to find support for an asymmetric unemployment effect.
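One common way of allowing the effect of unemployment to vary, assumed here for illustration and not necessarily the specification used in the re-estimation, is to split the change in unemployment into its rises and its falls and to enter the two as separate regressors. The series and the function name below are hypothetical.

```python
import numpy as np


def split_changes(series):
    """Split first differences into non-negative rises and non-positive falls."""
    d = np.diff(np.asarray(series, dtype=float))
    rises = np.where(d > 0, d, 0.0)
    falls = np.where(d < 0, d, 0.0)
    return rises, falls


# Hypothetical unemployment series, for illustration only
unemployment = [1.0, 1.4, 1.3, 1.9, 1.5]
rises, falls = split_changes(unemployment)

# The two parts always sum back to the original change
print(rises + falls)
```

If the two regressors attract the same coefficient the response is symmetric; `stickiness' would show up as a smaller coefficient on the falls than on the rises.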
Field's finding that changes in levels of personal consumption are important is confirmed but with important modifications. He found that both current and lagged changes were statistically significant, but with opposite signs, implying that if consumption rises property crime will fall initially, but `bounce' back in the following period. Tables 5 and 6 show that in an ECM formulation this lagged effect was not statistically significant for either burglary or theft when the model was estimated using the preferred SSW approach.
The main difference in the results provided by the EG and SSW estimations comes when considering the effect of the deterrence variables, that is the numbers of police and the crime specific conviction rates. For both burglary and theft, while there is a negative relationship with changes in police numbers, using the EG results would indicate that changes in conviction rates have no significant effects. However using the superior SSW technique we see that conviction rates are indeed statistically significant.
For robbery the empirical results are generally less straightforward to interpret and the effect of changes in personal consumption is no exception. In the estimated model reported in Table 7 the role of capturing the business cycle effect falls to changes in unemployment, both current and lagged two periods. Lagged changes in personal consumption are positively related to changes in recorded robbery. This might be interpreted in terms of increased target availability but this is a less straightforward task than in an ECM formulation.
This paper has presented a re-analysis of data on post-war trends in recorded property crime in England and Wales. Whilst confirming the result of Field (1990) that changes in personal consumption play an important role in explaining fluctuations in levels of property crime, unlike him it also found a role for changes in unemployment in explaining some short-run changes. It further questioned the findings of Pyle and Deadman (1994) that crime variables needed differencing twice for stationarity, concluding that on balance the evidence suggested that once was sufficient. This finding was then used to explain the results of Pyle and Deadman that property crime had long-run equilibrium relationships interchangeably with each of consumption, GDP or unemployment. It was argued that their results were the outcomes of over-differencing the crime data. Rather it was found that burglary and theft had long-run equilibrium relationships with personal consumption only and that for robbery no co-integrating relationships existed. This result underlines Field's discussion of the significance of personal consumption in explaining property crime. However it extends his results, which deal only with fluctuations around the trend, and shows how consumption also has an important bearing in clarifying the full relation between long-run economic growth and the growth in property crime, something which Field himself acknowledges to be a weakness of his work (Field 1990: 5).
Thus consumption has a dual role in explaining both trends and changes in property crime. If we accept the routine activities and opportunity theorists' arguments (see for example Felson 1994) that for a crime to occur three elements are required: a motivated offender, a suitable target and a lack of guardianship for that target, then the level of personal consumption measures the increasing availability of targets in the long term, the opportunity effect, whilst changes in the level of consumption capture the impact of the business cycle upon the numbers of offenders, the motivation effect.
In the short-run dynamic models, changes in consumption along with changes in unemployment, the numbers of police and conviction rates were all statistically significant. However the sensitivity of these results to the methods used was clearly illustrated, since reliance on the results based on the two-stage Engle-Granger approach would have led to the conclusion that changes in conviction rates had a statistically insignificant impact on recorded property crime.
The main conclusion to be drawn from the results presented here is to underline once again the findings of both Field (1990) and Pyle and Deadman (1994) on the importance of the economy in determining the level of crime and to emphasize the importance, when explaining long-run trends, of the availability of targets. The results may be seen as giving further support to those arguing for more resources to be given to crime prevention strategies, both physical and social, at all levels since potential targets will continue to increase in number with continued long-run economic growth.
(1) Formally variables are said to be co-integrated if they are all I(d), d > 0, and there exists some linear combination of them which is I(b) where b < d, that is the combination needs differencing fewer times than the original variables for stationarity. In practical terms the case of interest is where d = 1 and b = 0: the original variables need differencing once and the combination is stationary.
(2) Without wishing to labour the point, Pyle and Deadman's (PD) main complaint seems to be that Hale and Sabbagh paid insufficient attention to the requirement that the error terms in the estimated regressions for the Dickey-Fuller test be serially uncorrelated. Where necessary this is achieved by augmenting the right hand side of the model by adding lagged dependent variables and the test is then referred to as the augmented Dickey-Fuller test. Hale and Sabbagh did, however, follow this procedure and used the degree of augmentation necessary to achieve serially uncorrelated error terms. Moreover the comment of Davidson and MacKinnon (1992) which PD (p. 353) use in part to justify their criticism refers to the inclusion of constant and trend terms in the model and not to the issue of serially independent residuals. Further they cite Banerjee et al. (1993) as supporting erring on the side of including more degrees of augmentation than may be necessary. However Banerjee et al. go on (p. 107) to say that `One can, of course, perform tests for autocorrelation on the estimated residuals from (the estimated equation) to check the acceptability of the premise that these residuals be white noise'. This was precisely the approach adopted by Hale and Sabbagh.
(3) It can be shown that: under $H_3$ (three unit roots) $\lambda_1 = \lambda_2 = \lambda_3 = 0$; under $H_2$ (two unit roots) $\lambda_1 = \lambda_2 = 0$, $\lambda_3 < 0$; under $H_1$ (one unit root) $\lambda_1 = 0$, $\lambda_2, \lambda_3 < 0$; under $H_0$ (no unit roots) $\lambda_1, \lambda_2, \lambda_3 < 0$. Hence the following sequential testing procedure may be adopted:
Step One: Since under both $H_3$ and $H_2$ $\lambda_1 = \lambda_2 = 0$, estimate
$$\Delta^3 Y_t = \mu + \gamma t + \lambda_3 \Delta^2 Y_{t-1} + \varepsilon_t$$
and test $H_3$: $\lambda_3 = 0$. If $\lambda_3 < 0$ reject $H_3$ in favour of $H_2$ and proceed to Step Two, otherwise accept $H_3$ that there are three unit roots.
Step Two: Since under both $H_2$ and $H_1$ $\lambda_1 = 0$, estimate
$$\Delta^3 Y_t = \mu + \gamma t + \lambda_2 \Delta Y_{t-1} + \lambda_3 \Delta^2 Y_{t-1} + \varepsilon_t$$
and test $H_2$: $\lambda_2 = 0$. If $\lambda_2 < 0$ reject $H_2$ in favour of $H_1$ and proceed to Step Three, otherwise accept $H_2$ that there are two unit roots.
Step Three: Estimate
$$\Delta^3 Y_t = \mu + \gamma t + \lambda_1 Y_{t-1} + \lambda_2 \Delta Y_{t-1} + \lambda_3 \Delta^2 Y_{t-1} + \varepsilon_t$$
and test $H_1$: $\lambda_1 = 0$. If $\lambda_1 < 0$ reject $H_1$ in favour of $H_0$ and assume the original data are stationary. Otherwise accept $H_1$ that there is one unit root. The hypotheses at each stage are tested using (augmented) Dickey-Fuller statistics.
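The three steps can be sketched as follows. This is an illustration under stated assumptions, not the code used for the results reported here: the regressions include a constant and trend at every step, no augmentation lags are added, the simulated random walk is hypothetical, and the function names are invented. The resulting t-ratios would still have to be compared with Dickey-Fuller critical values, which are not reproduced.

```python
import numpy as np


def t_ratios(y, X):
    """OLS t-ratios for the coefficients on each column of X."""
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ y
    resid = y - X @ coef
    s2 = resid @ resid / (len(y) - X.shape[1])
    return coef / np.sqrt(s2 * np.diag(xtx_inv))


def dickey_pantula_t(series):
    """t-ratios on lambda_3, lambda_2 and lambda_1 from Steps One to Three."""
    y = np.asarray(series, dtype=float)
    d1, d2, d3 = np.diff(y), np.diff(y, n=2), np.diff(y, n=3)
    m = len(d3)
    const, trend = np.ones(m), np.arange(1.0, m + 1)

    # Align the lagged regressors with the third difference at time t
    level_lag = y[2:-1]   # Y at t-1
    d1_lag = d1[1:-1]     # first difference at t-1
    d2_lag = d2[:-1]      # second difference at t-1

    step_one = np.column_stack([const, trend, d2_lag])
    step_two = np.column_stack([const, trend, d1_lag, d2_lag])
    step_three = np.column_stack([const, trend, level_lag, d1_lag, d2_lag])

    # Index 2 is the first regressor after the constant and trend
    return (
        t_ratios(d3, step_one)[2],    # tests lambda_3 = 0
        t_ratios(d3, step_two)[2],    # tests lambda_2 = 0
        t_ratios(d3, step_three)[2],  # tests lambda_1 = 0
    )


# Hypothetical I(1) series: a pure random walk
rng = np.random.default_rng(1)
walk = np.cumsum(rng.normal(size=300))
t_l3, t_l2, t_l1 = dickey_pantula_t(walk)
print(t_l3, t_l2, t_l1)
```

For a series with a single unit root the first two t-ratios should be strongly negative, so that $H_3$ and $H_2$ are rejected, while the third should typically be too close to zero to reject $H_1$.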
(4) The author is grateful to one of the readers for clarification of this point. The same reader also noted that for the I(1) vs I(0) case not including trend and constant amounts to an imposition of the assumption that the unknown `starting value' $Y_0$ is zero. Also when testing the I(1) null a trend should be included if the variable Y could be trending over time. The role of the trend and constant terms is to make the critical values used in unit root tests invariant to the unknown values of `nuisance' parameters where these nuisance parameters include $Y_0$ and the constant term (see Banerjee et al. 1993: 104-5 for a discussion of these issues).
(5) Osborn is also critical of Pyle and Deadman's testing procedure for the degree of integration for the crime variables and notes further that `... their error-correction mechanisms ... indicate that the dependent variable may be overdifferenced since the estimates of the error correction coefficient are always close to minus one, which can be interpreted as the model attempting to reduce the order of differencing for the relevant crime variable' (Osborn 1995).
(6) As well as the independent variables discussed in equation (8) the numbers of males aged 15-19, numbers of males aged 20-24, numbers of live births and the total population were also considered.
(7) These results were confirmed by using the two tests (not reported here) for co-integration proposed by Johansen (1988) based upon an alternative approach to modelling involving maximum likelihood estimation of a Vector Autoregressive Regression (VAR).
(8) We also used Johansen's (1988) maximum likelihood procedure to estimate $\beta$ and, assuming a maximum lag of 2 in the VAR, obtained estimates of 8.51 for burglary and 15.10 for theft, values which are in line with those from the SSW approach.
(9) Following the arguments of Kremers et al. (1992) and Banerjee et al. (1993) we also checked for co-integration by using the SSW estimates of $\alpha$ and $\beta$ in the EG error correction model and testing whether the estimated parameter on the error correction term was significantly different from zero. These tests again confirmed the co-integration between crime and personal consumption.
ALLEN, E. A. and STEFFENSMEIER, D.J. (1989), `Youth, Underemployment and Property Crime: Differential Aspects of Job Availability and Job Quality on Juvenile and Young Adult Arrest Rates', American Sociological Review, 54: 107-23.
BANERJEE, A., DOLADO, J., GALBRAITH, J. W. and HENDRY, D. F. (1993), Co-integration, Error-Correction, and the Econometric Analysis of Non-Stationary Data. Oxford: Oxford University Press.
BLOUGH, S. R. (1992), `The Relationship between Power and Level for Generic Unit Root Tests in Finite Samples', Journal of Applied Econometrics, 7: 295-308.
CHAREMZA, W. W. and DEADMAN, D. F. (1997), New Directions in Econometric Practice. Aldershot: Edward Elgar.
DAVIDSON, R. and MACKINNON, J. G. (1992), Estimation and Inference in Econometrics. Oxford University Press.
DEADMAN, D. F. and PYLE, D. J. (1997), `Forecasting Recorded Property Crime Using a Time-Series Econometric Model', British Journal of Criminology, 37/3: 437-45.
DICKEY, D. A. and PANTULA, S. S. (1987), `Determining the Order of Differencing in Autoregressive Processes', Journal of Business and Economic Statistics, 5: 455-61.
DICKINSON, D. (1995), `Crime and Unemployment', New Economy, 2.
DOLADO, J., JENKINSON, T. and SOSVILLA-RIVERO, S. (1990), `Co-integration and Unit Roots', Journal of Economic Perspectives, 4: 249-73.
ENGLE, R.F. and GRANGER, C. W.J. (1987), `Co-Integration and Error Correction: Representation, Estimation and Testing', Econometrica, 55: 251-76.
FELSON, M. (1994), Crime and Everyday Life. London: Pine Forge Press.
FIELD, S. (1990), Trends in Crime and their Interpretation: A Study of Recorded Crime in Post-War England and Wales, Home Office Research Study 119. London: HMSO.
GRANGER, C. W. J. and NEWBOLD, P. (1974), `Spurious Regressions in Econometrics', Journal of Econometrics, 2: 111-20.
HAGAN, J. (1993), `The Social Embeddedness of Crime and Unemployment', Criminology, 32.
HALE, C. (1989), `Unemployment, Imprisonment and the Stability of Punishment Hypothesis: Some Results using Co-integration and Error Correction Models', Journal of Quantitative Criminology, 5: 169-91.
HALE, C. and CADDY, M. (1995), `Long Run Crime Trends in England and Wales: 1857-1993', paper presented to the 1995 Annual Conference of the American Criminological Society.
HALE, C. and SABBAGH, D. (1991), `Testing the Relationship between Unemployment and Crime: A Methodological Comment and Empirical Analysis using Time Series Data from England and Wales', Journal of Research in Crime and Delinquency, 28: 400-17.
JOHANSEN, S. (1988), `Statistical Analysis of Co-integrated Vectors', Journal of Economic Dynamics and Control, 12: 231-54.
KREMERS, J. J. M., ERICSSON, N. R. and DOLADO, J. (1992), `The Power of Co-integration Tests', Oxford Bulletin of Economics and Statistics, 52: 325-48.
MACKINNON, J. G. (1991), `Critical Values for Co-Integration Tests', in R. F. Engle and C. W. J. Granger, eds., Long Run Economic Relationships, ch. 13. Oxford: Oxford University Press.
MURRAY, M. P. (1994), `A Drunk and Her Dog: An Illustration of Cointegration and Error Correction', The American Statistician, 48: 37-9.
OBELKEVICH, J. (1994), `Consumption', in J. Obelkevich and P. Catterall, eds., Understanding Post-War British Society, ch. 11. London: Routledge.
OSBORN, D. R. (1995), Crime and the UK Economy, Robert Schuman Centre Working Paper 95/15. European University Institute.
PERRON, P. (1989), `The Great Crash, the Oil Price Shock and the Unit Root Hypothesis', Econometrica, 57: 1361-401.
PERRON, P. and VOGELSANG, T. (1992), `Nonstationarity and Level Shifts with an Application to Purchasing Power Parity', Journal of Business and Economic Statistics, 10: 301-20.
PYLE, D.J. and DEADMAN, D. F. (1994), `Crime and the Business Cycle in Post-War Britain', British Journal of Criminology, 34: 339-57.
SIMS, C. A., STOCK, J. H. and WATSON, M. W. (1990), `Inference in Linear Time Series with some Unit Roots', Econometrica, 55: 1035-56.
STOCK, J. H. and WATSON, M. W. (1988), `Variable Trends in Economic Time Series', Journal of Economic Perspectives, 2: 147-74.
WELLS, J. (1994), `Mitigating the Effects of Unemployment: Crime and Unemployment', Report for the House of Commons Select Committee on Employment.
CHRIS HALE, University of Kent at Canterbury. The author gratefully acknowledges the financial support of the UK Economic and Social Research Council award L210252012 `Structural and Cultural Determinants of Crime and Punishment' granted under the ESRC Research Programme. An earlier version of this paper appeared as `The Structural and Cultural Determinants of Crime and Punishment', Working Paper 3. He also wishes to thank David Pyle and Derek Deadman for generously making their data set available for re-analysis. Thanks are also due to the anonymous referees for their comments which have led to improvements on an earlier draft of this paper. The usual caveats apply.
Publication: British Journal of Criminology
Article Type: Statistical Data Included
Date: Sep 22, 1998