# Analyzing comovements in housing prices using vine copulas.

I. INTRODUCTION

When housing prices were rapidly appreciating from 1999 to 2006, investment firms created structured securities by combining mortgages from houses located in different parts of the country. Traders then bought and sold those securities, and also pieces of those securities called "tranches," in secondary markets. The most familiar of those structured securities was the collateralized debt obligation (CDO). The main appeal of CDOs rested in the belief that, due to the localized nature of housing markets, houses in separate geographic markets would be unlikely to simultaneously experience large decreases in prices. Credit rating agencies offered support to this thinking by awarding many CDOs the highest possible safety rating.

However, it soon became clear that CDOs offered less diversified protection than originally thought when, starting in 2006, housing prices in different geographic areas, even those located far apart, simultaneously plummeted in value. As a consequence, CDOs lost much of their value, with the global CDO market shrinking from $482 billion in 2007 to only $8 billion in 2010. (1)

In the wake of the housing crisis, financial analysts and policy makers have questioned why, compared to pre-crisis expectations, housing prices showed strong correlations across different geographic areas. In addition, a branch of research has emerged that attempts to identify sources of, and quantify magnitudes of, housing price comovements (Apergis and Payne 2012; Barros, Gil-Alana, and Payne 2012). The popular press quickly assigned blame to the statistical method used to analyze linkages between housing markets: the Gaussian copula (Li 2000). Notably, the March 2009 issue of the technology magazine Wired featured an article on the Gaussian copula entitled "Recipe for Disaster: The Formula that Killed Wall Street." Similar ideas have reached the general public through the works of Nassim Taleb (Taleb 2007).

The Gaussian copula became popular due, in part, to its link to the familiar multivariate normal distribution. But the multivariate normal distribution has asymptotic independence, such that events, regardless of the strength of their correlation, become independent if one pushes far enough into the tails (Embrechts, McNeil, and Straumann 2002). Thus, in the midst of the housing crisis, which might be thought of as a lower tail event, the Gaussian copula predicted near independence in price movements across different areas, when in fact, prices plummeted simultaneously throughout most of the United States.

But the Gaussian copula's link to the normal distribution was not its only appeal. Perhaps a bigger reason for its popularity was that the Gaussian copula, much like the related normal distribution, readily extends to higher dimensions. Certainly, credit rating agencies were not considering simple bivariate movements between two locations, but rather multivariate movements across many locations. Recent studies argue that alternative specifications, especially copulas that depart from normality, more accurately reflect correlations in housing price movements during extreme market swings (Ho, Huynh, and Jacho-Chavez 2014; Zimmer 2012). However, those improved fits have been achieved only in bivariate models that compare housing price movements between two locations. It remains an open question whether non-Gaussian copulas can accurately reflect housing price movements in higher dimensional settings.

Unfortunately, copulas other than the Gaussian do not readily extend to higher-than-bivariate dimensions (Nelsen 2006, 105). Attempts to develop higher dimensional copulas, some of which are discussed below, either impose unrealistic restrictions or present difficulties when applied to data. As an alternative, this article develops multivariate models of housing price comovements based on vine copulas. The approach requires marginal distributions, which financial analysts should know with some certainty, and bivariate copulas, which have well-understood statistical properties. However, the approach does not require knowledge of higher-dimensional linkages. The approach accommodates lower and upper tail dependence of different magnitudes, and importantly, its estimation relies on standard maximum likelihood methods, thus eliminating the need for time-consuming simulation-based estimators.

The empirical application uses quarterly housing price indices for four U.S. census divisions. The main finding is that, compared to the multivariate Gaussian copula, multivariate vine copulas assembled from non-Gaussian distributions more realistically capture comovements in housing prices. Furthermore, while Gaussian and vine models find similar magnitudes of comovements in housing prices during "normal" times, the vine model uncovers far stronger comovements between housing prices in the tails, especially the lower tails. These findings imply that mortgage-based structured securities offered less diversified protection than assumed before the crisis, and consequently, such financial instruments should not have received such high credit ratings.

II. BIVARIATE COPULA BASICS

A bivariate copula is a bivariate distribution function with both univariate margins distributed as U(0,1). Consider a continuous distribution function F([y.sub.1],[y.sub.2]) with univariate marginal distributions [F.sub.1]([y.sub.1]) and [F.sub.2]([y.sub.2]) and inverse probability transforms (quantile functions) [F.sup.-1.sub.1] and [F.sup.-1.sub.2]. Then [y.sub.1] = [F.sup.-1.sub.1]([u.sub.1]) ~ [F.sub.1] and [y.sub.2] = [F.sup.-1.sub.2]([u.sub.2]) ~ [F.sub.2], where [u.sub.1] and [u.sub.2] are uniformly distributed variates. That is, the transforms of the uniform variates are distributed as [F.sub.1] and [F.sub.2]. Hence

(1) F([y.sub.1], [y.sub.2]) = F([F.sup.-1.sub.1] ([u.sub.1]), [F.sup.-1.sub.2]([u.sub.2])) = C ([u.sub.1], [u.sub.2])

is the unique copula associated with the distribution function. By Sklar's (1973) theorem, the copula parameterizes a multivariate distribution in terms of its marginals. For bivariate distribution F, the copula satisfies

(2) F([y.sub.1],[y.sub.2])= C([F.sub.1]([y.sub.1]), [F.sub.2]([y.sub.2]); [theta]),

where [theta] measures dependence. If the marginals are continuous, then the corresponding copula in Equation (2) is unique.

As indicated by the capital letters, the marginal distributions appearing in the copula, as well as the copula itself, are cumulative distributions rather than probability densities. The bivariate density, useful for maximum likelihood estimation, comes from differentiating,

(3) f ([y.sub.1],[y.sub.2]) = c ([F.sub.1] ([y.sub.1]), [F.sub.2] ([y.sub.2]) ; [theta]) x [f.sub.1] ([y.sub.1]) x [f.sub.2] ([y.sub.2])

where c(.), called the "copula density," is the mixed partial derivative [[delta].sup.2]C([F.sub.1]([y.sub.1]), [F.sub.2]([y.sub.2]); [theta])/[delta][F.sub.1][delta][F.sub.2], and [f.sub.1] and [f.sub.2] denote the densities of [F.sub.1] and [F.sub.2].

The copula approach provides a simple recipe for forming bivariate distributions. By plugging the known marginals into a copula function C(.), the right-hand side of Equation (2) gives a parametric representation of the unknown joint distribution on the left-hand side. This is useful because financial analysts and researchers usually know more about marginal behaviors of individual variables than about multivariate linkages. As noted above, credit rating agencies, in assessing riskiness of CDOs, joined the marginals using the Gaussian copula.

For analyzing comovements in housing prices, the most important number is the dependence parameter [theta], but dependence parameters do not translate conveniently between different copulas. To facilitate comparison, copula dependence parameters are converted to measures of concordance such as Kendall's [tau] (Nelsen 2006), which then can be compared across copulas. Kendall's [tau] is bounded on the interval [-1, 1], with -1, 0, and 1 corresponding to perfect negative dependence, independence, and perfect positive dependence, respectively.

This article's main focus centers on a copula's ability to capture comovements during times of extreme market fluctuations, such as the large decreases in prices witnessed during the housing crisis. Statistically speaking, some, but not all, copulas accommodate tail dependence in one or both tails, formally defined as

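$\lambda_L = \lim_{u \to 0^+} \Pr[U_1 \le u \mid U_2 \le u] = \lim_{u \to 0^+} \frac{C(u,u)}{u}, \qquad \lambda_U = \lim_{u \to 1^-} \Pr[U_1 > u \mid U_2 > u] = \lim_{u \to 1^-} \frac{1 - 2u + C(u,u)}{1 - u}.$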

As shown below, for copulas that accommodate dependence in one or both tails, the tail dependence measures are functions of the underlying copula dependence parameters.

A. Gaussian (Normal) Copula

The Gaussian copula takes the form

(4) $C_{\text{Gaussian}}(u_1, u_2; \theta) = \Phi_2\left(\Phi^{-1}(u_1), \Phi^{-1}(u_2); \theta\right)$

where $\Phi^{-1}$ denotes the quantile function of the standard normal distribution, and $\Phi_2(\cdot,\cdot)$ is the standard bivariate normal distribution function, with the dependence parameter [theta] restricted to the interval (-1, 1). The Gaussian copula allows for equal degrees of positive and negative dependence, but it does not accommodate tail dependence. As Embrechts, McNeil, and Straumann (2002) remark: "Regardless of how high a correlation we choose, if we go far enough into the tail, extreme events appear to occur independently in each margin." Thus, the Gaussian copula might fail to capture dependence during extreme market movements.
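As a hedged numerical illustration of Equation (3), the sketch below multiplies a Gaussian copula density by two marginal densities and compares the product with the textbook bivariate normal density. Standard normal marginals and the closed-form Gaussian copula density are assumptions made for the illustration, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumption: both marginals are standard normal


def gaussian_copula_density(u1, u2, theta):
    """Density of the bivariate Gaussian copula with correlation theta."""
    x1 = STD_NORMAL.inv_cdf(u1)
    x2 = STD_NORMAL.inv_cdf(u2)
    det = 1.0 - theta ** 2
    expo = -(theta ** 2 * (x1 ** 2 + x2 ** 2) - 2.0 * theta * x1 * x2) / (2.0 * det)
    return math.exp(expo) / math.sqrt(det)


def joint_density_via_copula(y1, y2, theta):
    """Equation (3): copula density times the two marginal densities."""
    u1, u2 = STD_NORMAL.cdf(y1), STD_NORMAL.cdf(y2)
    return gaussian_copula_density(u1, u2, theta) * STD_NORMAL.pdf(y1) * STD_NORMAL.pdf(y2)


def bivariate_normal_density(y1, y2, theta):
    """Standard bivariate normal density, used only as a cross-check."""
    det = 1.0 - theta ** 2
    quad = (y1 ** 2 - 2.0 * theta * y1 * y2 + y2 ** 2) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))
```

With normal marginals the two densities should agree up to floating-point error, which is the sense in which a Gaussian copula with normal marginals reproduces the multivariate normal distribution.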

B. Clayton Copula

The Clayton (1978) copula takes the form

(5) $C_{\text{Clayton}}(u_1, u_2; \theta) = \left(u_1^{-\theta} + u_2^{-\theta} - 1\right)^{-1/\theta}$

with the dependence parameter [theta] restricted to the region (0, [infinity]). As [theta] approaches zero, the marginals become independent, and larger values of [theta] indicate stronger dependence. The Clayton copula exhibits asymmetric dependence, with dependence concentrated in the lower tail. Thus, the Clayton might appropriately capture comovements during extreme market downswings. On the other hand, the Clayton does not allow for upper tail dependence. The Clayton also cannot account for negative dependence (at least not in the form presented here), a less-pressing issue in the current context considering overwhelming evidence that housing prices tend to move in the same direction.
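The asymmetry can be seen numerically. The following sketch, which assumes the standard limit definitions of tail dependence (C(u,u)/u as u approaches 0, and (1 - 2u + C(u,u))/(1 - u) as u approaches 1), evaluates both ratios for the Clayton copula in Equation (5). Function names are illustrative only.

```python
def clayton_cdf(u1, u2, theta):
    """Clayton copula of Equation (5), defined for theta > 0."""
    return (u1 ** -theta + u2 ** -theta - 1.0) ** (-1.0 / theta)


def lower_tail_ratio(u, theta):
    """C(u, u) / u, which approaches lower tail dependence as u -> 0."""
    return clayton_cdf(u, u, theta) / u


def upper_tail_ratio(u, theta):
    """(1 - 2u + C(u, u)) / (1 - u), which approaches upper tail dependence as u -> 1."""
    return (1.0 - 2.0 * u + clayton_cdf(u, u, theta)) / (1.0 - u)
```

For theta = 2, `lower_tail_ratio(1e-6, 2.0)` sits close to 2 ** (-1/2), while `upper_tail_ratio(1 - 1e-6, 2.0)` is close to zero.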

C. Rotated Clayton Copula

The rotated (or survival) Clayton copula takes the form

(6) $C_{\text{RClayton}}(u_1, u_2; \theta) = u_1 + u_2 - 1 + \left((1 - u_1)^{-\theta} + (1 - u_2)^{-\theta} - 1\right)^{-1/\theta}$

with [theta] [member of] (0, [infinity]). In contrast to the Clayton, the rotated Clayton displays upper tail dependence, suggesting that the rotated Clayton should appropriately capture comovements during extreme market upswings.
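A rotated copula can be built from the original by the usual survival transformation. The sketch below assumes that construction, C_rot(u1, u2) = u1 + u2 - 1 + C(1 - u1, 1 - u2), and uses the standard limit definition of upper tail dependence to show the dependence moving to the upper tail. Names are hypothetical.

```python
def clayton_cdf(u1, u2, theta):
    """Clayton copula, theta > 0."""
    return (u1 ** -theta + u2 ** -theta - 1.0) ** (-1.0 / theta)


def rotated_clayton_cdf(u1, u2, theta):
    """Survival (180-degree rotated) Clayton copula."""
    return u1 + u2 - 1.0 + clayton_cdf(1.0 - u1, 1.0 - u2, theta)


def rotated_upper_tail_ratio(u, theta):
    """(1 - 2u + C_rot(u, u)) / (1 - u), evaluated near u = 1."""
    return (1.0 - 2.0 * u + rotated_clayton_cdf(u, u, theta)) / (1.0 - u)


def rotated_lower_tail_ratio(u, theta):
    """C_rot(u, u) / u, evaluated near u = 0."""
    return rotated_clayton_cdf(u, u, theta) / u
```

With theta = 2 the upper ratio near u = 1 is close to 2 ** (-1/2), mirroring the Clayton's lower tail, and the lower ratio near u = 0 is close to zero.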

D. Joe-Clayton Copula

The Joe-Clayton copula, often called the BB7 copula in the statistics literature, assumes the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where [[theta].sub.1] [member of] (0, [infinity]) and [[theta].sub.2] [member of] (0, [infinity]). In contrast to the three aforementioned copulas, the Joe-Clayton has two dependence parameters, one for each tail of the distribution. As demonstrated next, this allows the Joe-Clayton to accommodate asymmetric dependence in each tail. The Joe-Clayton cannot accommodate negative dependence. (See Patton 2006 for an application of the Joe-Clayton copula to exchange rate comovements.)

E. Comparison of the Four Copulas

The following table provides formulas for Kendall's [tau] and tail dependence.

| Copula | [theta] Domain | Kendall's [tau] | Lower Tail Dependence | Upper Tail Dependence |
|---|---|---|---|---|
| Gaussian | $\theta \in (-1, 1)$ | $(2/\pi) \arcsin(\theta)$ | None | None |
| Clayton | $\theta \in (0, \infty)$ | $\theta/(\theta + 2)$ | $2^{-1/\theta}$ | None |
| Rotated Clayton | $\theta \in (0, \infty)$ | $\theta/(\theta + 2)$ | None | $2^{-1/\theta}$ |
| Joe-Clayton | $\theta_1 \in (0, \infty)$, $\theta_2 \in (0, \infty)$ | $1 - \frac{4}{\theta_1^2 \theta_2}\left(B(2, 2/\theta_1 - 1) - B(\theta_2 + 2, 2/\theta_1 - 1)\right)$ | $2^{-1/\theta_2}$ | $2^{-1/\theta_1}$ |

Note: B(x,y) denotes the Beta function, given by $B(x, y) = \int_0^1 t^{x-1}(1 - t)^{y-1}\,dt$.

To emphasize the different shapes, Figure 1 shows contours of the four copulas, each with Kendall's [tau] set to .50. All marginal distributions are standard normal. The contours illustrate how copulas with equal dependence magnitudes can display markedly different dependence patterns. The Gaussian copula displays the familiar elliptical shape of the bivariate normal distribution, with the circular shape in each tail highlighting its inability to accommodate tail dependence. By contrast, the Clayton (rotated Clayton) shows strong lower tail (upper tail) dependence, with no dependence in the opposite tail. Finally, the Joe-Clayton shows more "pinching" in the tails, compared to the Gaussian, which highlights the Joe-Clayton's accommodation of dependence in each tail.
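To make the comparison at a common Kendall's [tau] concrete, the following sketch inverts the closed forms for the Gaussian copula, [tau] = (2/[pi]) arcsin([theta]), and the Clayton copula, [tau] = [theta]/([theta] + 2). The helper names are hypothetical, and the Joe-Clayton is omitted because its [tau] has no simple inverse.

```python
import math


def kendall_tau_gaussian(theta):
    """Kendall's tau implied by a Gaussian copula parameter in (-1, 1)."""
    return (2.0 / math.pi) * math.asin(theta)


def kendall_tau_clayton(theta):
    """Kendall's tau implied by a Clayton (or rotated Clayton) parameter > 0."""
    return theta / (theta + 2.0)


def gaussian_theta_from_tau(tau):
    """Invert tau = (2/pi) * arcsin(theta)."""
    return math.sin(math.pi * tau / 2.0)


def clayton_theta_from_tau(tau):
    """Invert tau = theta / (theta + 2), valid for 0 < tau < 1."""
    return 2.0 * tau / (1.0 - tau)
```

Setting [tau] = .50 gives a Gaussian parameter of about 0.71 and a Clayton parameter of 2, so parameters of quite different size describe the same overall concordance.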

To further demonstrate the flexibility of the Joe-Clayton, Figure 2 shows contours for select values of dependence in each tail. The first panel shows contours for (nearly) unrelated variables, while the second panel displays contours for strongly related variables. The bottom two panels highlight the copula's ability to handle asymmetric dependence patterns. (2)

III. MULTIVARIATE VINE COPULAS

Analyzing comovements for more than two locations requires forming an m-dimensional copula, C([F.sub.1]([y.sub.1]), ..., [F.sub.m] ([y.sub.m]);[theta]) where the dependence term [theta] contains, at least, m(m - 1)/2 elements, one for each pairwise combination ([y.sub.j], [y.sub.k])[for all] j [not equal to] k. The most obvious approach, and the one preferred by credit rating agencies before the housing crisis, uses the Gaussian copula, which readily extends to higher dimensions as

(7) $C_{\text{Gaussian}}(u_1, \ldots, u_m; \theta) = \Phi_m\left(\Phi^{-1}(u_1), \ldots, \Phi^{-1}(u_m); \theta\right)$

where $\Phi_m(\cdot)$ is the m-variate standard normal cdf, and [theta] is an (m x m) matrix with dependence terms in the off-diagonal. But as noted above, the multivariate normal does not accommodate tail dependence, which unrealistically imposes independence during extreme market swings. Another obvious solution is the multivariate Student's t distribution, which includes a degrees of freedom parameter that captures tail dependence. However, because the multivariate t distribution contains only one degrees of freedom parameter, it imposes equal tail dependence for all pairs ([y.sub.j], [y.sub.k]), which does not seem realistic.

Earlier attempts to construct flexible higher dimensional copulas either failed to perform in applications or required complex simulation-based estimators, which negates the main advantage of copula estimation (Husler and Reiss 1989; Joe 1990, 1994). An alternative, known as the "mixtures of powers" approach, does not require simulation-based estimation, but the method allows only m - 1 dependence terms, a restriction that becomes especially unrealistic as m grows (Zimmer and Trivedi 2006). More recent approaches based on the simulated method of moments allow flexible dependence structures at very high dimensionality (Patton 2012a, 2012b), but because those approaches remain in their infancy, this article does not pursue them further.

Instead, this article exploits recent advances on vine copulas (Aas et al. 2009), which, in general, do not require simulation-based methods. Although somewhat slow to gain an audience among economists, statisticians recently have applied vine methods to a diverse range of topics including energy (Czado, Gartner, and Min 2011), insurance (Erhardt and Czado 2012; Kramer, Brechmann, Silvestrini, and Czado 2013), and finance (Brechmann and Czado 2013; Brechmann, Czado, and Paterlini 2014). The web page vine-copula.org, maintained by Claudia Klüppelberg and Claudia Czado, provides links to recent empirical and theoretical studies, as well as to software resources.

A. Vine Structures

The idea begins by decomposing a multivariate density into a cascade of marginal and conditional densities,

$f(y_1, \ldots, y_m) = f(y_m) \times f(y_{m-1} \mid y_m) \times f(y_{m-2} \mid y_{m-1}, y_m) \times \cdots \times f(y_1 \mid y_2, \ldots, y_m)$

where the decomposition is unique up to a relabeling of the variables. Consider, for example, a trivariate decomposition

(8) f ([y.sub.1], [y.sub.2], [y.sub.3]) = f ([y.sub.3]) x f ([y.sub.2]| [y.sub.3]) x f ([y.sub.1]| [y.sub.2], [y.sub.3]).

Assuming one knows the marginal density f([y.sub.3]), forming the trivariate density requires specifying the remaining two conditional densities f([y.sub.2]|[y.sub.3]) and f([y.sub.1]|[y.sub.2], [y.sub.3]). The following bullet points describe how to form those conditional densities from bivariate copulas.

* f([y.sub.2]|[y.sub.3]): Using the equation for a bivariate copula density given in Equation (3), and also using the standard formula for conditional distributions, this conditional density can be expressed as the product of a bivariate copula density and a marginal density,

$f(y_2 \mid y_3) = c_{23}\left(F_2(y_2), F_3(y_3); \theta_{23}\right) \times f_2(y_2)$

where the subscripts on the copula density emphasize that its arguments are the marginal distributions for [y.sub.2] and [y.sub.3]. Thus, the conditional density f([y.sub.2]|[y.sub.3]) can be formed from the product of a bivariate copula density, which has marginal distributions as arguments, and a marginal density.

* f([y.sub.1]|[y.sub.2], [y.sub.3]): Following similar reasoning, this conditional density can be expressed as

$f(y_1 \mid y_2, y_3) = c_{13;2}\left(F(y_1 \mid y_2), F(y_3 \mid y_2); \theta_{13;2}\right) \times f(y_1 \mid y_2)$.

In this expression, the conditional density f([y.sub.1]|[y.sub.2]) can be expressed according to the derivation in the first bullet point, such that

$f(y_1 \mid y_2, y_3) = c_{13;2}\left(F(y_1 \mid y_2), F(y_3 \mid y_2); \theta_{13;2}\right) \times c_{12}\left(F_1(y_1), F_2(y_2); \theta_{12}\right) \times f_1(y_1)$.

Thus, the conditional density f([y.sub.1]|[y.sub.2], [y.sub.3]) can be formed from the product of two bivariate copula densities and a marginal density.

* f([y.sub.1], [y.sub.2], [y.sub.3]): Substituting the expressions in the previous two bullet points into Equation (8), the full trivariate density can be expressed as

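(9) $f(y_1, y_2, y_3) = f_1(y_1) \times f_2(y_2) \times f_3(y_3) \times c_{12}\left(F_1(y_1), F_2(y_2); \theta_{12}\right) \times c_{23}\left(F_2(y_2), F_3(y_3); \theta_{23}\right) \times c_{13;2}\left(F(y_1 \mid y_2), F(y_3 \mid y_2); \theta_{13;2}\right)$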

Note that two of the marginal distributions, F([y.sub.1]|[y.sub.2]) and F([y.sub.3]|[y.sub.2]), are "marginal conditional" distributions. As shown below, those, too, can be formed from bivariate copulas and (unconditional) marginal distributions. Consequently, using the vine copula approach, any m-dimensional density can be expressed as a function of (1) bivariate copula densities, (2) marginal distributions, and (3) marginal densities. This approach provides important advantages in applied settings, because researchers usually know marginal distributions and densities, and bivariate copulas are much better understood than their higher-dimensional counterparts. Adding further flexibility to vine structures, the functional forms of the bivariate copula densities need not be identical, and likewise, each marginal distribution may assume a different form. Furthermore, if one desires to avoid parametric assumptions, various nonparametric approaches exist that allow the bivariate copulas to remain unspecified (Chen and Huang 2009; Ho, Huynh, and Jacho-Chavez 2014; Racine 2013), although such methods inevitably sacrifice precision in favor of protection against misspecification.

Calculating the natural logarithm of Equation (9) and summing over all observations gives the log likelihood function, which is maximized with respect to the unknown parameters [[theta].sub.12], [[theta].sub.23], and [[theta].sub.13;2] to arrive at estimates of those parameters. As shown below, the marginal distributions may contain additional estimable parameters, including coefficients attached to explanatory variables if the marginals assume regression structures.

One drawback of the vine approach is that, among the dependence parameters, some will capture conditional dependence. For example, in the trivariate vine representation in Equation (9), the two dependence terms [[theta].sub.12] and [[theta].sub.23] reflect familiar pairwise measures of dependence. In contrast, the other dependence term, [[theta].sub.13;2], captures dependence between [y.sub.1] and [y.sub.3] conditional on [y.sub.2] (Joe, Li, and Nikoloulopoulos 2010). As discussed in more detail below, the easier-to-interpret unconditional measure [[theta].sub.13] can be calculated by drawing simulated values from the estimated vine copula, finding the appropriate bivariate unconditional copula for those simulated values, and then calculating [[theta].sub.13] from the appropriate unconditional copula.

B. Vine Types

A multivariate density f([y.sub.1], ..., [y.sub.m]) has many possible vine representations. For example, for the trivariate density in Equation (8), the term f([y.sub.1]|[y.sub.2], [y.sub.3]) could instead be expressed as

$f(y_1 \mid y_2, y_3) = c_{12;3}\left(F(y_1 \mid y_3), F(y_2 \mid y_3); \theta_{12;3}\right) \times f(y_1 \mid y_3)$

which leads to a different vine specification for f([y.sub.1], [y.sub.2], [y.sub.3]). The number of unique vine structures expands rapidly with the dimension of the density. For example, a trivariate density has three unique vine representations, a four-variate density has 24 possible vine representations, and a five-variate density has 240 possible vine structures (Aas et al. 2009). Because the number of possible structures becomes unwieldy in higher dimensions, the statistics literature emphasizes graphical representations of those structures, known as R- (or regular) vines. But even graphical R-vine representations become overwhelming in higher dimensions, so an R-vine structure often is summarized in the form of an "R-vine matrix." Dissmann et al. (2013) provide a detailed explanation of R-vine graphs and their associated matrix representations.

Two subcategories of R-vines deserve special mention: D-vines and canonical (or C-) vines (Kurowicka and Cooke 2004). D-vines are useful for variables that have a temporal order known a priori, whereas C-vines are useful when there is a natural order of importance to the variables. Figure 3 gives an example of a four-variate D-vine, with the temporal ordering listed on the first row. The first level consists of three copula densities, [c.sub.12], [c.sub.23], and [c.sub.34]. The second level consists of two conditional copula densities, [c.sub.13;2] and [c.sub.24;3]. Finally, the third level consists of the copula density [c.sub.14;23]. The product of those six copula densities and the four marginal densities provides a D-vine specification,

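(10) $f(y_1, y_2, y_3, y_4) = \prod_{j=1}^{4} f_j(y_j) \times c_{12}\left(F_1(y_1), F_2(y_2); \theta_{12}\right) \times c_{23}\left(F_2(y_2), F_3(y_3); \theta_{23}\right) \times c_{34}\left(F_3(y_3), F_4(y_4); \theta_{34}\right) \times c_{13;2}\left(F(y_1 \mid y_2), F(y_3 \mid y_2); \theta_{13;2}\right) \times c_{24;3}\left(F(y_2 \mid y_3), F(y_4 \mid y_3); \theta_{24;3}\right) \times c_{14;23}\left(F(y_1 \mid y_2, y_3), F(y_4 \mid y_2, y_3); \theta_{14;23}\right)$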

Permuting the order of the variables in the first row yields different D-vine specifications. (Note that Equation (9) is a trivariate D-vine.)

Figure 4 gives an example of a four-variate C-vine. The first level consists of the copula densities [c.sub.12], [c.sub.13], and [c.sub.14], the second level contains [c.sub.23;1] and [c.sub.24;1], and the third level consists of [c.sub.34;12]. The product of those copula densities and the marginal densities gives a C-vine specification,

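(11) $f(y_1, y_2, y_3, y_4) = \prod_{j=1}^{4} f_j(y_j) \times c_{12}\left(F_1(y_1), F_2(y_2); \theta_{12}\right) \times c_{13}\left(F_1(y_1), F_3(y_3); \theta_{13}\right) \times c_{14}\left(F_1(y_1), F_4(y_4); \theta_{14}\right) \times c_{23;1}\left(F(y_2 \mid y_1), F(y_3 \mid y_1); \theta_{23;1}\right) \times c_{24;1}\left(F(y_2 \mid y_1), F(y_4 \mid y_1); \theta_{24;1}\right) \times c_{34;12}\left(F(y_3 \mid y_1, y_2), F(y_4 \mid y_1, y_2); \theta_{34;12}\right)$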

Note that in this C-vine specification, [y.sub.1] pairs with the other three variables in the first level, and then [y.sub.1] appears as the conditioning variable in the second and third levels. So in this setup, [y.sub.1] can be thought of as "most important," while [y.sub.2] is "second-most important." See Aas et al. (2009) for general formulas for D- and C-vines in higher dimensions.

It is worth noting that, because D- and C-vines are subcategories of R-vines, all D- and C-vines are R-vines, but not all R-vines are D- or C-vines. Nonetheless, in four-variate applications, such as the one presented in this article, all R-vine structures can, in fact, be classified as either D- or C-vines.

C. Marginal Conditional Distributions

The vine approach requires marginal conditional distributions of the form F(y|v). Aas et al. (2009), using a result from Joe (1996), show that for every j

$F(y \mid \mathbf{v}) = \frac{\partial C_{y, v_j \mid \mathbf{v}_{-j}}\left(F(y \mid \mathbf{v}_{-j}), F(v_j \mid \mathbf{v}_{-j})\right)}{\partial F(v_j \mid \mathbf{v}_{-j})}$

where [v.sub.-j] denotes the vector v excluding element [v.sub.j].

As an example, the D-vine specification in Equation (10) requires F([y.sub.1]|[y.sub.2]), which can be calculated as

$F(y_1 \mid y_2) = \frac{\partial C_{12}\left(F_1(y_1), F_2(y_2); \theta_{12}\right)}{\partial F_2(y_2)}$

which is a function of a bivariate copula and marginal distributions. For brevity denote this expression, called the "h-function," as [h.sub.12], which is defined on the copula scale only; that is, its arguments must assume values on [0,1]. Equation (10) also requires F([y.sub.1]|[y.sub.2],[y.sub.3]), which can be calculated as

$F(y_1 \mid y_2, y_3) = \frac{\partial C_{13;2}\left(F(y_1 \mid y_2), F(y_3 \mid y_2); \theta_{13;2}\right)}{\partial F(y_3 \mid y_2)}$

which can be expressed as a nesting of the h-function, $h_{13;2}\left(h_{12}(F_1(y_1), F_2(y_2)),\, h_{32}(F_3(y_3), F_2(y_2))\right)$. Consequently, each of the "marginal conditional" distributions can be calculated from bivariate copulas and marginal distributions. (3)
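The h-function and its role in a vine can be sketched for the Clayton copula, whose partial derivative has a closed form. The code below is a minimal illustration on the copula scale, assuming all three pair copulas of a trivariate D-vine (pairs 12, 23, and 13 given 2) are Clayton; the function names and parameter values are hypothetical and are not the specification estimated in this article.

```python
def clayton_cdf(u, v, theta):
    """Clayton copula, theta > 0."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def clayton_density(u, v, theta):
    """Clayton copula density (mixed partial derivative of the cdf)."""
    s = u ** -theta + v ** -theta - 1.0
    return (1.0 + theta) * (u * v) ** (-1.0 - theta) * s ** (-2.0 - 1.0 / theta)


def clayton_h(u, v, theta):
    """h-function: conditional cdf of u given v, i.e. dC(u, v)/dv."""
    s = u ** -theta + v ** -theta - 1.0
    return v ** (-theta - 1.0) * s ** (-1.0 - 1.0 / theta)


def dvine3_copula_density(u1, u2, u3, th12, th23, th13_2):
    """Copula part of a trivariate D-vine: c12 * c23 * c13;2.

    The arguments of c13;2 are the marginal conditional distributions
    F(y1 | y2) and F(y3 | y2), each obtained from an h-function.
    """
    f1_given_2 = clayton_h(u1, u2, th12)
    f3_given_2 = clayton_h(u3, u2, th23)
    return (clayton_density(u1, u2, th12)
            * clayton_density(u2, u3, th23)
            * clayton_density(f1_given_2, f3_given_2, th13_2))
```

Multiplying the returned value by the three marginal densities gives the full trivariate density, and summing its logarithm over observations gives the log likelihood. When the conditional parameter is near zero, the third factor is close to one and the vine collapses to the product of the two unconditional pair copulas.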

IV. DATA

Data on housing prices for U.S. census divisions come from the Housing Price Index (HPI), collected and published by the U.S. Federal Housing Finance Agency, and based on transactions of single-family properties involving conforming, conventional mortgages purchased or securitized by Fannie Mae or Freddie Mac. The HPI calculates price indices based on a modified version of the weighted-repeat sales technique due to Case and Shiller (1989). The HPI covers far more transactions than the Commerce Department's Constant Quality House Price Index, and it also covers more geographic areas, and stretches further back in time, than the widely reported Standard and Poor/Case-Shiller Index. See Calhoun (1996) for a technical description of the HPI.

The HPI provides quarterly data for census divisions for the period 1975:Q2 to 2012:Q2, for a total of 149 observations for each census division. Each data point gives the percentage change from the previous quarter. This article considers data for four census divisions located in the Western and Midwestern United States and with significant land area sitting north of the 36°30' parallel north, a line with historical importance due to its role in the Missouri Compromise. The four divisions are: Pacific, Mountain, West North Central, and East North Central. (4) Because the analysis focuses on four census divisions, the models estimated in the following sections are four-variate. Although it is possible to include additional census divisions in order to estimate higher-dimensional models, with only 149 observations, adding census divisions confronts degrees of freedom problems. For example, a vine copula assembled from bivariate Joe-Clayton copulas has 2 x (m(m - 1)/2) dependence parameters to be estimated, which for a five-dimensional model would be 20 parameters, plus another 25 parameters in the marginal distributions (discussed in the following section), which in preliminary estimations proved too taxing for a sample of only 149 observations. The preferred four-variate model estimated below, by contrast, has 8 copula parameters plus 20 marginal distribution parameters, which proved manageable. (It should be noted, however, that estimating several dozen parameters from only 149 data points could lead to relatively nonrobust findings.)
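The parameter arithmetic can be reproduced with a short sketch. It assumes two parameters per pair copula, as for an all-Joe-Clayton vine, and five parameters per marginal, a figure inferred here from the 25 and 20 marginal parameters quoted for five and four divisions. The function name is hypothetical.

```python
def vine_parameter_counts(m, params_per_pair=2, params_per_margin=5):
    """Return (copula parameters, marginal parameters) for an m-variate vine.

    A vine on m variables contains m(m - 1)/2 pair copulas. The defaults
    assume two-parameter pair copulas and five parameters per marginal.
    """
    n_pairs = m * (m - 1) // 2
    return n_pairs * params_per_pair, m * params_per_margin
```

Calling `vine_parameter_counts(5)` returns `(20, 25)`, matching the five-dimensional counts in the text. The same call with `m=4` gives 12 copula parameters for an all-Joe-Clayton vine, whereas the preferred model mixes in one-parameter copulas and so has fewer.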

Figure 5 shows percentage changes from the previous quarter in housing prices in the four divisions. Following some volatility in the 1970s, price changes appeared to moderate during the 1980s and 1990s. As widely reported, prices climbed rapidly during the mid 2000s only to plummet toward the end of the decade. Price changes have only recently returned to positive territory. Throughout the series, the Pacific division, represented by the solid line, appears to have experienced the most volatility. This feature highlights the unique nature of real estate markets in the Pacific division (Shiller 2007).

V. MARGINAL DISTRIBUTIONS

In addition to the functional forms of the bivariate copulas, the vine specifications in Equations (10) and (11) require functional forms for the marginal distributions, denoted F(.), and for the marginal densities, denoted f(.). This section describes those marginals.

Changes in house prices exhibit autocorrelation and autoregressive conditional heteroskedasticity, both of which can produce spurious findings of dependence if not properly addressed (Deb, Trivedi, and Varangis 1996; Granger and Newbold 1974). Following a model developed by Chen and Fan (2006) for estimating bivariate nonlinear time series models, let [y.sub.j,t] denote the percentage change in housing prices between quarters t - 1 and t for census division j. Then price changes in each division follow a univariate AR(1)-GARCH(1,1) specification,

(12) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] for j= 1,2,3,4. The error terms [[epsilon].sub.j,t] follow independent normal distributions with conditional variances given by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

After estimation, new series [[??].sub.j,t] are calculated as

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The series [[??].sub.j,t], hereafter referred to as "filtered price changes," filter out the AR(1) and GARCH(1,1) components of [y.sub.j,t].
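The filtering step can be sketched as follows. This is a minimal illustration under an assumed textbook parameterization, with mean equation y_t = mu + rho * y_{t-1} + eps_t and variance equation sigma2_t = omega + alpha * eps_{t-1}^2 + beta * sigma2_{t-1}; the symbols, the function name, and the choice to start the recursion at the unconditional variance are assumptions rather than the article's exact specification. Parameters are taken as given instead of being estimated.

```python
import math


def ar1_garch11_filter(y, mu, rho, omega, alpha, beta):
    """Return standardized residuals eps_t / sigma_t for t = 1, ..., T - 1.

    Assumes alpha + beta < 1 so that the unconditional variance
    omega / (1 - alpha - beta) exists and can start the recursion.
    """
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1")
    sigma2 = omega / (1.0 - alpha - beta)
    filtered = []
    prev_eps = None
    for t in range(1, len(y)):
        eps = y[t] - mu - rho * y[t - 1]           # AR(1) residual
        if prev_eps is not None:                   # GARCH(1,1) variance update
            sigma2 = omega + alpha * prev_eps ** 2 + beta * sigma2
        filtered.append(eps / math.sqrt(sigma2))
        prev_eps = eps
    return filtered
```

In practice the five parameters per division would come from a maximum likelihood routine, and the returned series would play the role of the filtered price changes passed to the copula stage.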

Breusch-Godfrey tests (Breusch 1978; Godfrey 1978) of zero autocorrelation for the filtered price changes, not reported but available upon request, failed to reject the null hypothesis of zero autocorrelation for each filtered series, indicating that the AR(1) process appears to sufficiently remove autocorrelation. Likewise, Bollerslev tests (1986) failed to reject the null hypothesis of zero autoregressive conditional heteroskedasticity, indicating that the GARCH(1,1) process appears sufficient to remove ARCH effects. However, GARCH models can produce filtered series with heavy-tailed marginal distributions. Therefore, the series were checked for normality using Jarque-Bera (1987) tests; those tests failed to reject the null hypothesis of normality for three of the four series, the exception being the Mountain division.

After the filtered price series are calculated, they are plugged into F([[??].sub.j,t]) and f([[??].sub.j,t]), which, for each census division, are the cdf and pdf, respectively, of the standard normal distribution. An advantage of assuming normal marginals is that, for any vine copula specification, if all marginals are normal, and if all bivariate copulas within the vine are Gaussian, then the resulting distribution is identical to a multivariate non-copula normal distribution (Czado 2010). A vine constructed from Gaussian pairs and normal marginals is an example of a "simplified vine" (Stober, Joe, and Czado 2013). Such simplified models, although certainly restrictive, enjoyed widespread use before the housing crisis (Li 2000). Therefore, by changing the bivariate copulas in the vine structure, but not the marginals, this article explores whether non-Gaussian dependence structures reveal stronger links in comovements of housing prices.

VI. MODEL SELECTION AND IMPLEMENTATION

The first step in implementation is to estimate univariate GARCH models and form the filtered price series [[??].sub.j,t], as outlined in the previous section. This is accomplished using the GARCH estimators available in Stata version 12.1. All subsequent steps treat the filtered price series as given.

The next step is to choose the appropriate vine structure, as well as the six appropriate copula functions to be included in the vine structure. The command RVineStructureSelect available in the R package VineCopula finds an appropriate specification using a sequential algorithm described by Dissmann et al. (2013). (5) Based on that sequential selection method, the preferred specification is a D-vine with the variables ordered: [y.sub.1] = PAC, [y.sub.2] = MNT, [y.sub.3] = ENC, [y.sub.4] = WNC. The selection method also gives appropriate bivariate copulas, according to whichever produce the smallest Bayes Information Criteria (BIC) measures, summarized in the following table.

Copula           Selected family
[C.sub.12]       Joe-Clayton
[C.sub.23]       Joe-Clayton
[C.sub.34]       Joe-Clayton
[C.sub.13;2]     Rotated Clayton
[C.sub.24;3]     Gaussian
[C.sub.14;23]    Rotated Clayton
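The BIC comparison behind each entry in the table can be sketched in a few lines. The Python code below is an illustration only: the log-likelihood values and the sample size are hypothetical placeholders, not the article's estimates, and the helper names are invented here.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayes Information Criterion: smaller values are preferred."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def select_by_bic(candidates, n_obs):
    """candidates maps family name -> (log-likelihood, number of parameters)."""
    scores = {name: bic(ll, k, n_obs) for name, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores

# Hypothetical fitted log-likelihoods for one variable pair.
candidates = {
    "Gaussian": (20.0, 1),
    "Clayton": (22.0, 1),
    "Rotated Clayton": (15.0, 1),
    "Joe-Clayton": (26.0, 2),
}
best, scores = select_by_bic(candidates, n_obs=150)
```

Because the Joe-Clayton copula carries two dependence parameters, it pays a larger penalty than the one-parameter families and is chosen only when its likelihood gain outweighs that penalty.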

For brevity, the remainder of the article refers to this D-vine specification assembled from these six copulas as the "vine" model. The model is estimated by maximum likelihood using the command CDVineMLE in the R package CDVine. For comparison, the article also estimates the same D-vine structure, but with all copulas assuming the bivariate Gaussian form. This simplified vine structure is identical to a four-variate non-vine Gaussian copula, and was the workhorse model used by credit rating agencies prior to the housing crisis. For brevity, the remainder of the article refers to this as the "Gaussian" model. A Vuong Test (1989) comparing the two approaches produced a test statistic (p value) of 3.11 (.002), indicating that the simple Gaussian model provides an inferior fit to its more flexible vine counterpart.
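The Vuong comparison can be illustrated with a short Python sketch. It assumes that pointwise log-likelihood contributions from the two fitted models are available, and it applies no correction for differing numbers of parameters; the function names and the toy inputs are hypothetical. The last line checks that a statistic of 3.11 corresponds to a two-sided p value of about .002 under the standard normal reference distribution.

```python
import math
from statistics import NormalDist, mean, stdev

def two_sided_p(stat):
    """Two-sided p-value under a standard normal reference distribution."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(stat)))

def vuong_test(loglik_a, loglik_b):
    """Vuong statistic for model A versus model B.

    Inputs are per-observation log-likelihood contributions. Positive values
    of the statistic favor model A, negative values favor model B.
    """
    m = [a - b for a, b in zip(loglik_a, loglik_b)]
    stat = math.sqrt(len(m)) * mean(m) / stdev(m)
    return stat, two_sided_p(stat)

# Toy contributions, for illustration only.
stat, p = vuong_test([1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 1.0, 1.0])

# Consistency check against the reported statistic.
reported_p = two_sided_p(3.11)
```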

For reporting estimates, dependence parameters for the first three copulas in the above table are converted directly to Kendall's [tau]. Moreover, because those three copulas have measures of tail dependence, those are calculated directly from the corresponding dependence parameters. For the other three copulas, which provide conditional measures of dependence, the easier-to-interpret unconditional estimates of Kendall's [tau] and tail dependence are calculated by first drawing 10,000 simulated values from the estimated vine copula using the command CDVineSim in the R package CDVine. Those simulated values then are used to select the appropriate unconditional bivariate copulas, from among the four under consideration, according to whichever yield the smallest BIC values. (For the "Gaussian" model, bivariate Gaussian copulas are used throughout.) The selected bivariate copula then is estimated by maximum likelihood using the simulated data to arrive at unconditional estimates of Kendall's [tau] and tail dependence. (6)
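The conversions from dependence parameters to Kendall's [tau] and tail dependence follow standard closed forms, sketched below in Python. The Gaussian and Clayton formulas are the textbook ones. The Joe-Clayton function assumes the BB7 parameterization (theta at least 1, delta positive), which may differ from the parameterization used in the article; no closed form for the Joe-Clayton Kendall's [tau] is attempted here.

```python
import math

def gaussian_tau(rho):
    """Kendall's tau for a Gaussian copula with correlation rho."""
    return 2.0 / math.pi * math.asin(rho)

def clayton_tau(theta):
    """Kendall's tau for a Clayton copula with parameter theta > 0."""
    return theta / (theta + 2.0)

def clayton_lower_tail(theta):
    """Lower tail dependence of the Clayton copula (upper tail is zero)."""
    return 2.0 ** (-1.0 / theta)

def bb7_tail_dependence(theta, delta):
    """(lower, upper) tail dependence, assuming the BB7 form of Joe-Clayton."""
    lower = 2.0 ** (-1.0 / delta)
    upper = 2.0 - 2.0 ** (1.0 / theta)
    return lower, upper

# The Gaussian copula has zero tail dependence whenever |rho| < 1.
example = bb7_tail_dependence(theta=1.0, delta=1.0)
```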

A bootstrap procedure is used to calculate standard errors for Kendall's [tau] and tail dependence. For each replication, a bootstrapped sample is drawn from the parent sample (treating the first-stage AR-GARCH models as given), and after re-estimation using the bootstrapped sample, the steps outlined in the previous paragraph are repeated. That process is replicated 200 times, with standard deviations of the Kendall's [tau] and tail dependence measures providing approximated standard errors.
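The resampling logic can be sketched as follows. This Python illustration substitutes the empirical Kendall's [tau] for the article's copula re-estimation step, and it assumes that observation pairs are resampled jointly with replacement; both are simplifications made here, and the function names are hypothetical.

```python
import random
from statistics import stdev

def kendall_tau(x, y):
    """Empirical Kendall's tau (tau-a): tied pairs contribute zero."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

def bootstrap_se(x, y, statistic, replications=200, seed=0):
    """Standard deviation of the statistic across bootstrap resamples."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(replications):
        # Resample observation pairs jointly to preserve dependence.
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(statistic([x[i] for i in idx], [y[i] for i in idx]))
    return stdev(draws)
```

Calling `bootstrap_se(x, y, kendall_tau)` on two filtered series returns an approximate standard error based on the default 200 replications.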

VII. RESULTS

Table 1 shows estimates of Kendall's [tau] for the Gaussian and vine models. The two specifications appear to uncover similar magnitudes of dependence in housing price comovements. Consequently, a researcher might incorrectly conclude that the Gaussian model performs satisfactorily. But the results in Table 1 hide the degree of dependence during extreme market swings. To that end, Table 2 shows estimates of tail dependence. The estimates show that, in the vine model, housing prices do exhibit strong dependence in both tails. Furthermore, five of the six pairs (PAC/ENC being the exception) exhibit asymmetric dependence, with larger dependence appearing in the lower tails. By contrast, the Gaussian specification allows no tail dependence for any of the pairs.

Although the Gaussian and vine models produce similar estimates of Kendall's [tau], the two specifications differ in their accommodation (or lack thereof) of tail dependence. To illustrate the importance of this issue, conditional probabilities are calculated based upon the estimated bivariate copulas,

(13) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(14) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where [[??].sub.ij] denotes the estimated copula dependence parameter. For all six pairs, the bivariate copulas appearing in Equations (13) and (14) are unconditional copulas. (Recall that the Joe-Clayton copula has two dependence terms.)

These formulas give the probabilities that prices in census division i decrease (increase) by more than k given that prices also decrease (increase) by more than k in division j. Such linkages between simultaneous movements in housing prices lie at the center of the debate over whether mortgage-based securities had appropriate diversified protection prior to the housing crisis. Recalling that filtered housing prices represent standard deviation price changes, estimated probabilities are calculated for k = -3.0, -2.9, ..., 2.9, 3.0. Values near the ends of that range are of particular interest, as those might be interpreted as "tail" events.
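In standard copula notation, conditional probabilities of this kind can be written as below. This is a sketch rather than the article's own typesetting: it assumes that both margins share the standard normal cdf F, that C is the unconditional bivariate copula for divisions i and j, and that the symbol \hat{\theta}_{ij} stands for the estimated dependence parameter(s).

```latex
\Pr\left(y_i \le k \mid y_j \le k\right)
  = \frac{C\left(F(k), F(k); \hat{\theta}_{ij}\right)}{F(k)},
\qquad
\Pr\left(y_i > k \mid y_j > k\right)
  = \frac{1 - 2F(k) + C\left(F(k), F(k); \hat{\theta}_{ij}\right)}{1 - F(k)}.
```

The second expression follows from the first by inclusion-exclusion, since the probability that both variables exceed k equals 1 - 2F(k) + C(F(k), F(k)).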

Figure 6 graphically presents estimates of the conditional probabilities. For the Gaussian models, probabilities are the same in each tail, which reflects the symmetry of the normal distribution. Furthermore, the probabilities for the Gaussian models approach zero in both tails, which reflects the inability of the Gaussian model to capture tail dependence. By contrast, most of the estimates for the vine models show asymmetry and nonzero tail probabilities.

Taking the top-left picture as an example, the Gaussian model reveals that, given a price decrease in either the Pacific or Mountain division larger than 2 (i.e., k [less than or equal to] -2), the probability that the other location experiences a similar price decrease is approximately .21, indicated by the hollow circle in the graph. By contrast, the vine model estimates that same probability to be more than twice as large (.54), as indicated by the solid circle in the graph.
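The Gaussian figure in this example can be approximated with a short calculation. The Python sketch below assumes standard normal margins, converts the Table 1 Gaussian estimate of Kendall's [tau] for PAC/MNT (.37) into a correlation via rho = sin(pi * tau / 2), and evaluates the joint lower-tail probability by Simpson's rule. The function names, integration bounds, and step count are choices made here, not details from the article.

```python
import math
from statistics import NormalDist

PHI = NormalDist()

def gaussian_joint_lower(k, rho, steps=4000, lower=-10.0):
    """P(X <= k, Y <= k) for a standard bivariate normal with |rho| < 1."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = (k - lower) / steps
    scale = math.sqrt(1.0 - rho * rho)

    def integrand(x):
        # Density of X at x times P(Y <= k | X = x).
        return PHI.pdf(x) * PHI.cdf((k - rho * x) / scale)

    total = integrand(lower) + integrand(k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lower + i * h)
    return total * h / 3.0

def conditional_lower(k, rho):
    """P(Y <= k | X <= k) under the Gaussian model."""
    return gaussian_joint_lower(k, rho) / PHI.cdf(k)

tau = 0.37  # Table 1, Gaussian model, PAC/MNT
rho = math.sin(math.pi * tau / 2.0)
prob = conditional_lower(-2.0, rho)
```

Under these assumptions the result lands close to the .21 reported for the Gaussian model. Repeating the calculation for more extreme k shows the probability shrinking toward zero, which is the absence of tail dependence described above.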

Compared to the Gaussian model, the vine model uncovers much larger probabilities of simultaneous price decreases for all six pairs. Probabilities of simultaneous price increases also are larger than their Gaussian counterparts, although not to the extent seen in the lower tails. Overall, the vine model suggests stronger comovements in housing prices during extreme market events. This finding offers evidence that structured securities assembled from mortgages for houses in different locations did not contain the degree of diversification that investors assumed.

VIII. CONCLUSION

Prior to the housing crisis, the Gaussian copula provided the basis for estimates of the degree of diversification of mortgage-based securities. The Gaussian copula's popularity stemmed not only from its link to the familiar normal distribution, but also from the fact that, unlike other copula-based models, it readily extends to higher dimensions. Unfortunately, the Gaussian framework cannot accommodate dependence during extreme market swings, such as the housing crisis.

This article forms multivariate models of housing price comovements using vine copulas. The models are simple to estimate, as they require only bivariate copulas, marginal distributions, and standard maximum likelihood methods. The vine model not only fits the data better, but it also uncovers far stronger correlations between housing prices, especially during extreme market swings. The implication is that structured securities that pooled mortgages from different geographic locations did not offer the diversified protection that led those financial instruments to secure high credit grades before the crisis.

This article does not claim that the vine specification estimated here constitutes the best model. Indeed, different marginal distributions and even nonparametric approaches might provide superior fits. More importantly, credit rating agencies certainly need to estimate higher-than-four dimensional models, and they have sufficient data to do so efficiently. Rather, this article advocates a flexible, easy-to-estimate approach.

ABBREVIATIONS

BIC: Bayes Information Criteria
CDO: Collateralized Debt Obligation
HPI: Housing Price Index

doi: 10.1111/ecin.12156

REFERENCES

Aas, K., C. Czado, A. Frigessi, and H. Bakken. "Pair-Copula Constructions of Multiple Dependence." Insurance: Mathematics and Economics, 44, 2009, 182-98.

Apergis, N., and J. E. Payne. "Convergence in U.S. Housing Prices by State: Evidence from the Club Convergence and Clustering Procedure." Letters in Spatial and Resource Sciences, 5, 2012, 103-11.

Barros, C. P., L. A. Gil-Alana, and J. E. Payne. "Comovements among U.S. State Housing Prices: Evidence from Fractional Cointegration." Economic Modelling, 29, 2012, 936-42.

Bollerslev, T. "Generalized Autoregressive Conditional Heteroscedasticity." Journal of Econometrics, 31, 1986, 307-27.

Brechmann, E., and C. Czado. "Risk Management with High-Dimensional Vine Copulas: An Analysis of the Euro Stoxx 50." Statistics & Risk Modeling, 30, 2013, 307-42.

Brechmann, E., C. Czado, and S. Paterlini. "Flexible Dependence Modeling of Operational Risk Losses and Its Impact on Total Capital Requirements." Journal of Banking & Finance, 40, 2014, 271-85.

Breusch, T. "Testing for Autocorrelation in Dynamic Linear Models." Australian Economic Papers, 17, 1978, 334-55.

Calhoun, C. "OFHEO House Price Indexes: HPI Technical Description." Federal Housing Finance Agency Technical Report, 1996.

Case, K., and R. Shiller. "The Efficiency of the Market for Single-Family Homes." American Economic Review, 79, 1989, 125-37.

Chen, S., and T.-M. Huang. "Nonparametric Estimation of Copula Functions for Dependence Modelling." Canadian Journal of Statistics, 35, 2009, 265-82.

Chen, X., and Y. Fan. "Estimation and Model Selection of Semiparametric Copula-based Multivariate Dynamic Models under Copula Misspecification." Journal of Econometrics, 135, 2006, 125-54.

Clayton, D. "A Model for Association in Bivariate Life Tables and Its Application in Epidemiological Studies of Familial Tendency in Chronic Disease Incidence." Biometrika, 65, 1978, 141-51.

Czado, C. "Pair-Copula Constructions of Multivariate Copulas," in Copula Theory and Its Applications, edited by P. Jaworski, F. Durante, W. K. Härdle, and T. Rychlik. New York: Springer, 2010.

Czado, C., F. Gartner, and A. Min. "Analysis of Australian Electricity Loads Using Joint Bayesian Inference of D-Vines with Autoregressive Margins," in Handbook on Vines, edited by D. Kurowicka and H. Joe. Singapore: World Scientific, 2011.

Czado, C., U. Schepsmeier, and A. Min. "Maximum Likelihood Estimation of Mixed C-vines with Application to Exchange Rates." Statistical Modelling, 12, 2012, 229-55.

Czado, C., E. Brechmann, and L. Gruber. "Selection of Vine Copulas," in Copulae in Mathematical and Quantitative Finance, edited by P. Jaworski, F. Durante, and W. Härdle. New York: Springer, 2013, 17-37.

Deb, P., P. Trivedi, and P. Varangis. "Excess Co-movement in Commodity Prices Reconsidered." Journal of Applied Econometrics, 11, 1996, 275-91.

Dißmann, J., E. C. Brechmann, C. Czado, and D. Kurowicka. "Selecting and Estimating Regular Vine Copulae and Application to Financial Returns." Computational Statistics & Data Analysis, 59, 2013, 52-69.

Embrechts, P., A. McNeil, and D. Straumann. "Correlation and Dependence in Risk Management: Properties and Pitfalls," in Risk Management: Value at Risk and Beyond, edited by M. Dempster. Cambridge, UK: Cambridge University Press, 2002.

Erhardt, V., and C. Czado. "Modeling Dependent Yearly Claim Totals Including Zero Claims in Private Health Insurance." Scandinavian Actuarial Journal, 2, 2012, 106-29.

Godfrey, L. "Testing Against General Autoregressive and Moving Average Error Models When the Regressors Include Lagged Dependent Variables." Econometrica, 46, 1978, 1293-302.

Granger, C., and P. Newbold. "Spurious Regressions in Econometrics." Journal of Econometrics, 2, 1974, 111-20.

Ho, A., K. Huynh, and D. Jacho-Chavez. "Nonparametric Estimation of Copulas: Application to Housing Crisis," Working Paper, Emory University, 2014.

Husler, J., and R. Reiss. "Maxima of Normal Random Vectors: Between Independence and Complete Dependence." Statistics and Probability Letters, 7, 1989, 283-6.

Jarque, C., and A. Bera. "A Test for Normality of Observations and Regression Residuals." International Statistical Review, 55, 1987, 163-72.

Joe, H. "Families of Min-Stable Multivariate Exponential and Multivariate Extreme Value Distributions." Statistics and Probability Letters, 9, 1990, 75-81.

--. "Multivariate Extreme Value Distributions with Applications to Environmental Data." Canadian Journal of Statistics, 22, 1994, 47-64.

--. "Families of m-Variate Distributions with Given Margins and m(m - 1)/2 Bivariate Dependence Parameters," in Distributions with Fixed Marginals and Related Topics, edited by L. Ruschendorf, B. Schweizer, and M. Taylor. Hayward, CA: Institute of Mathematical Statistics, 1996.

Joe, H., H. Li, and A. Nikoloulopoulos. "Tail Dependence Functions and Vine Copulas." Journal of Multivariate Analysis, 101, 2010, 252-70.

Kramer, N., E. Brechmann, D. Silvestrini, and C. Czado. "Total Loss Estimation Using Copula-based Regression Models." Insurance: Mathematics and Economics, 53, 2013, 829-39.

Kurowicka, D., and R. Cooke. "Distribution-Free Continuous Bayesian Belief Nets," in Fourth International Conference on Mathematical Methods in Reliability Methodology and Practice. Santa Fe, NM, 2004.

Li, D. "On Default Correlation: A Copula Function Approach." Journal of Fixed Income, 9, 2000, 43-54.

Nelsen, R. An Introduction to Copulas. 2nd ed. New York: Springer, 2006.

Patton, A. "Modelling Asymmetric Exchange Rate Dependence." International Economic Review, 47, 2006, 527-56.

--. "Simulated Method of Moments Estimation for Copula-Based Multivariate Models." Unpublished Manuscript, Duke University, 2012a.

--. "Modelling Dependence in High Dimensions with Factor Copulas." Unpublished Manuscript, Duke University, 2012b.

Racine, J. "Mixed Data Kernel Copulas." Working Paper No. 46_13, The Rimini Centre for Economic Analysis (RCEA), Italy, 2013.

Shiller, R. "Understanding Recent Trends in House Prices and Home Ownership." NBER Working Paper #13553. 2007.

Sklar, A. "Random Variables, Joint Distributions, and Copulas." Kybernetika, 9, 1973, 449-60.

Stober, J., H. Joe, and C. Czado. "Simplified Pair Copula Constructions: Limitations and Extensions." Journal of Multivariate Analysis, 119, 2013, 101-18.

Taleb, N. The Black Swan: The Impact of the Highly Improbable. New York: Random House, 2007.

Trivedi, P., and D. Zimmer. Copula Modeling: An Introduction for Practitioners. Hanover, MA: Now Publishers, 2007.

Vuong, Q. "Likelihood Ratio Tests for Model Selection and Non-nested Hypotheses." Econometrica, 57, 1989, 307-33.

Zimmer, D. "The Role of Copulas in the Housing Crisis." Review of Economics and Statistics, 94, 2012, 607-20.

Zimmer, D., and P. Trivedi. "Using Trivariate Copulas to Model Sample Selection and Treatment Effects: Application to Family Health Care Demand." Journal of Business and Economic Statistics, 24, 2006, 63-76.

(1.) Global CDO volume numbers come from a Securities Industry and Financial Markets Association press release, dated November 21, 2011.

(2.) The functional form for the copula density for the Joe-Clayton is available on Andrew Patton's webpage (http://public.econ.duke.edu/~ap172/research.html). The functional forms for the other three copulas used in this paper are available in many places, including Trivedi and Zimmer (2007).

(3.) Functional forms for the h-functions for the Gaussian and Clayton copulas are given in Aas et al. (2009). The h-function for the Joe-Clayton copula is given in Czado, Schepsmeier, and Min (2012).

(4.) The Pacific division includes California, Oregon, Washington, Alaska, and Hawaii. The Mountain division includes Nevada, Arizona, Utah, Colorado, New Mexico, Wyoming, Idaho, and Montana. The West North Central division includes North Dakota, South Dakota, Nebraska, Kansas, Missouri, Iowa, and Minnesota. The East North Central division includes Wisconsin, Illinois, Indiana, Ohio, and Michigan.

(5.) The method first calculates empirical Kendall's [tau] measures of all variable pairs, and then it selects a spanning tree that maximizes the sum of the absolute empirical Kendall's [tau]s. Bivariate copulas for the edges of the selected spanning tree are selected according to whichever produce the smallest Bayes Information Criteria (BIC) measures. This gives the first "tree" of the R-vine. Then, using the selected bivariate copulas, pseudo-observations are created using the appropriate h-functions, discussed above. Using the pseudo-observations, the process repeats to find the appropriate second tree in the vine, and so on for each successive tree. (See Algorithm 3.1 in Dißmann et al. 2013 for a more detailed description of the selection method. Also, see Czado, Brechmann, and Gruber 2013 for an overview of copula selection methods.)

(6.) The three unconditional copulas approximated via this simulation approach need not mirror their conditional counterparts. In fact, whereas the three conditional copulas are (1) Rotated Clayton, (2) Gaussian, and (3) Rotated Clayton, their unconditional counterparts are all Joe-Clayton. Thus, this paper is able to obtain estimates of lower and upper tail dependence for all six pairs.

DAVID M. ZIMMER *

* The author wishes to thank Editor Bruce McGough and anonymous referees for comments that improved the article. They are not responsible for any remaining errors.

Zimmer: Department of Economics, Western Kentucky University, Grise Hall 426, Bowling Green, KY 42101. Phone 270-745-2880, Fax 270-745-3190, E-mail david.zimmer@wku.edu

TABLE 1
Estimates of Kendall's [tau]

                        Gaussian Model               Vine Model
                        Estimates  Standard Error    Estimates  Standard Error
[[tau].sub.PAC,MNT]     .37        (.05)             .39        (.04)
[[tau].sub.MNT,WNC]     .37        (.08)             .38        (.05)
[[tau].sub.WNC,ENC]     .37        (.07)             .40        (.05)
[[tau].sub.PAC,WNC]     .25        (.07)             .28        (.05)
[[tau].sub.MNT,ENC]     .38        (.05)             .35        (.05)
[[tau].sub.PAC,ENC]     .34        (.05)             .33        (.04)

TABLE 2
Estimates of Tail Dependence

                        Gaussian Model               Vine Model
                        Estimates  Standard Error    Estimates  Standard Error
[lower.sub.PAC,MNT]     --         --                .54        (.06)
[upper.sub.PAC,MNT]     --         --                .19        (.12)
[lower.sub.MNT,WNC]     --         --                .50        (.08)
[upper.sub.MNT,WNC]     --         --                .26        (.14)
[lower.sub.WNC,ENC]     --         --                .50        (.07)
[upper.sub.WNC,ENC]     --         --                .33        (.12)
[lower.sub.PAC,WNC]     --         --                .32        (.06)
[upper.sub.PAC,WNC]     --         --                .17        (.12)
[lower.sub.MNT,ENC]     --         --                .43        (.11)
[upper.sub.MNT,ENC]     --         --                .28        (.11)
[lower.sub.PAC,ENC]     --         --                .32        (.07)
[upper.sub.PAC,ENC]     --         --                .34        (.07)


Author: Zimmer, David M.
Publication: Economic Inquiry
Date: Apr 1, 2015
