# Identifying financial distress in the insurance industry: a synthesis of methodological and empirical issues.

12 The likelihood ratio test used in these results may be too
conservative for testing hypotheses on the boundary of the parameter
space. The problem may arise when comparing the EGB2 model with the
probit model.

13 The NPDM slightly outperforms the MDA and most qualitative response models in the classification of insolvent firms, especially three years prior to insolvency, when the minimum number of misclassification criterion is used.

The Importance of Insolvency Prediction

This study presents a methodological approach for identifying insolvent insurance companies. In this article, financial distress and insolvency are used interchangeably to describe insurers experiencing liquidation, receivership, conservatorship, restraining orders, rehabilitation, etc. Previous models for predicting financially distressed insurers are summarized and evaluated. More robust models for classifying and predicting financial distress in the insurance industry are presented, and an attempt is made to address methodological issues that previous studies have sometimes ignored. The problem of insolvency in the property-liability insurance industry merits special attention in view of the large number of failures. Since 1961, about 350 property-liability insurers have failed, more than 240 insurers have voluntarily retired, and over 500 companies have merged into other companies, resulting in more than 1,100 property-liability company retirements (compiled from A. M. Best Company, 1961-1990).

Insolvency prediction models can help insurance commissioners determine whether an insurer is in danger of failing and can also help auditors decide whether an insurer is a "going concern." The ability to classify and identify financial distress is important to regulators, legislators, policyholders, auditors, owners, bondholders, and even the general public. Statistical models of insolvency prediction can be constructed to help determine what accounting, financial, and other information could be employed by regulators in making decisions on the financial solidity of insurers.

A number of empirical studies have compared statistical models that use insurers' financial data to predict insolvencies in the property-liability insurance industry (Trieschmann and Pinches, 1973; Pinches and Trieschmann, 1974, 1977; Harmelink, 1974; Cooley, 1975; Eck, 1982; Hershbarger and Miller, 1986; Harrington and Nelson, 1986; BarNiv and Raveh, 1986; BarNiv and Smith, 1987; Ambrose and Seward, 1988; BarNiv, 1989; and McDonald, 1992). The models have impressive ability to predict insolvencies in the insurance industry. For example, Trieschmann and Pinches (1973) report that their multiple discriminant analysis (MDA) model correctly classifies 92 percent of insolvent insurers and 96 percent of solvent firms two years prior to the determination of insolvency or solvency; later studies report correct classifications ranging from 62 to 100 percent. Despite the classification success of previous studies, we should be concerned with the accuracy, reliability, and levels of significance for models and coefficients obtained by these studies.

This article will review several of the important methodological issues that have been raised about models used to identify financial distress. This study's objectives are (1) to establish a general framework for multivariate prediction models that is applicable to the insurance industry; (2) to enumerate some of the methodological problems associated with insolvency prediction models for the insurance industry (most of which are relevant to binary state prediction models in general); and (3) to use the multivariate models to predict insolvencies with a high degree of accuracy and reliability by overcoming the methodological limitations encountered in previous studies. We present the current state of knowledge and illustrate the methodological considerations through the use of robust novel models and empirical applications.

Previous Research and Methodological Concerns

Early studies on financial distress in the property-liability industry lacked methodological and statistical verification and were mostly descriptive (see Denenberg, 1967; Evans, 1968; Nelson, 1971). Also, Kenny's (1967) tests, such as the surplus ratio ("2 for 1" rule) and other measures of performance, were criticized as "rules of thumb" (see Hofflander, 1968). Table 1 summarizes the methodologies and the main characteristics of previous studies.

Studies Based on Multivariate Analyses

Most previous studies on predicting financial distress concerned not insurers but other types of corporations. A large number of studies since Altman (1968) have used MDA.(1) However, Joy and Tollefson (1975), Eisenbeis (1977), and Altman and Eisenbeis (1978) warn against potential misinterpretation and possible misapplication of MDA in the classification and prediction of financial distress. More recent studies have discouraged the use of MDA in the prediction of bankruptcy (see, e.g., Ohlson, 1980; Richardson and Davidson, 1984; Zmijewski, 1984; Zavgren, 1985). MDA, which develops a composite score for observations by a procedure of maximizing the ratio of between-group to within-group variance, requires strong assumptions (see Pinches and Trieschmann, 1977, and BarNiv and Raveh, 1989, for discussions and elaborations). Similarly, the use of a binary-state zero-one regression model violates the assumptions of ordinary least squares (OLS) regression. Ignoring the assumptions may result in inefficient coefficient estimators and raise questions about classification accuracy. The use of small samples can result in inaccurate classification reports that do not represent the classifications in the population. Also, prior state probabilities of solvent and insolvent insurers are often not considered, resulting in incorrect classification percentages, unless the entire population is sampled. For the years 1961 to 1988 the expected annual prior probability of insolvency in the property-liability industry was less than 1.3 percent. Many studies on insolvency problems in the insurance industry have ignored these issues.

An additional issue is that misclassification costs (state-payoff matrix) are usually not considered. Misclassifying insolvent insurers as solvent ones (type 1 error) would cost much more than misclassifying solvent insurers as insolvent ones (type 2 error). Also, the choice of an arbitrary cutoff score may not be relevant to the research design and thus may bias the correct classification. The use of choice-based (state-based) samples--that is, oversampling insolvent firms, which actually have a low frequency rate in the population--leads to biased and inconsistent estimates of the model's parameters and classification power.

Another methodological issue is that several studies have not used holdout samples and hence are unable to test the predictive ability of the proposed models. In addition, the use of choice-based holdout samples for predictions fails to represent the predictive ability in the population. Using arbitrary cutoff points in predictions without specifying a decision context makes the reported predictions difficult to interpret (see Palepu, 1986). Incomplete data for some observations in the population (usually insolvent companies) present a challenging problem. The parameters of the models are biased in studies which assume that such firms do not exist and therefore eliminate them from analysis. Although in most cases such firms must be omitted, the model's parameters can be modified if incomplete data are available for several observations (see Zmijewski, 1984). Finally, the models' ability to identify insolvency is substantially reduced as the length of time prior to insolvency increases. Although this problem is well documented in many studies on financial distress, several studies in the property-liability industry have used data for only one year before insolvency.

Trieschmann and Pinches (1973), using MDA, performed the first study on predicting financially distressed property-liability insurers. Pinches and Trieschmann (1974) used the same sample to examine the efficiency of univariate versus multivariate financial ratio models for solvency surveillance.(2) The MDA outperformed the univariate models in identifying financially distressed insurers. Cooley (1975), using prior probabilities for populations of solvent and insolvent firms, as well as the relative misclassification costs in prediction, found the impact of both to be substantial. In a subsequent study, Pinches and Trieschmann (1977) illustrated that many different results are possible from the same MDA model. Univariate and multivariate statistical tests indicated that the data were neither univariate normal nor multivariate normal. The Best's ratings were also viewed as surrogates for degrees of solvency. Harmelink (1974) used MDA to predict the degree of insolvency among property-liability firms as measured by a decline in Best's policyholder's ratings. Most studies in the property-liability industry have used MDA while ignoring its potential problems, which include violation of the normal distribution assumptions on the variables, unequal covariance matrices, and the lack of a screening-out procedure for insignificant variables through significance tests on the single-univariate coefficients (thus, standard t-tests of significance are not applicable).

The insurance regulatory information system (IRIS), developed by the National Association of Insurance Commissioners during the 1970s, classifies insurers with four or more of eleven financial ratios outside of specified ranges as priority firms for immediate regulatory scrutiny. Thornton and Meador (1977) concluded that the IRIS tests were not reliable indicators for insolvency prediction. Hershbarger and Miller (1986) used MDA to examine the ability of the IRIS ratios to discriminate between sound, priority, and insolvent insurers. They concluded that the IRIS test includes a number of ratios that have very little ability to distinguish between solvent and insolvent companies. Ambrose and Seward (1988) incorporated Best's ratings into MDA through a system of dummy variables, and introduced a two-stage discriminant technique for improving predictive accuracy.

Other models have been suggested to supplement or substitute for MDA. Subsequent studies on predicting financial distress used linear regression models to classify insurers. A few studies have used this methodology to classify and predict financial distress among banks and industrial corporations (see Meyer and Pifer, 1970; Collins, 1980). Amemiya (1981) discussed the relationship between the regression and MDA coefficients and indicated that the assumptions and interpretation of the coefficients are different. Applications of the regression models violate the Gauss-Markov assumptions. Eck (1982) presented a zero-one regression model that incorporated variables designed to detect dishonesty and to identify financially distressed property-liability insurers. Harrington and Nelson (1986) used regression analysis, in which the dependent variable was the premium-to-surplus ratio, for solvency surveillance in the property-liability industry, thus avoiding most problems associated with the zero-one regression analysis. Harrington and Nelson also provided an extensive rationale for their study. BarNiv (1990) used logit analysis to identify insolvencies in the property-liability insurance industry and emphasized the impact of alternative accounting practices as well as cash flows and market value of assets.

Other Studies

Hammond and Shilling (1978) used the average and the standard deviation of insurers' combined underwriting trade ratio to measure the solidity of insurers. This ranking method introduced various iso-ruin lines for measuring the solvency of an insurer but did not consider investment performance. BarNiv and Smith (1987) used a mean-variance ranking based on a one-year operating ratio that considered both underwriting and investment performance.

Bachman (1978) developed a model for determining the minimum capitalization requirements of an insurer. The model indicated that the minimum capital required to maintain solvency with a constrained ruin probability(3) varies among companies because of the risk associated with the underwriting profit margin of each insurer. Kahane (1978, 1979) included investment and underwriting performance in the mean-variance dimension. He demonstrated that the ruin constraint can be translated into practical criteria, such as the premium-to-surplus ratio. Venezian (1983) developed a model of risk and return for rate regulation, which is also useful for evaluating the effect of profit performance on insolvency. Kahane, Tapiero, and Jacque (1986) provided a comprehensive literature review and pointed out that "insolvency" has not been clearly defined and is used interchangeably with terms such as "bankruptcy," "illiquidity," and "ruin."

Shaked (1985) measured the probability of insolvency for life insurers by an approximation of the option pricing model. He indicated that life insurers are reasonably safe, but the distribution of insolvencies is skewed to the right. Gustavson and Lee (1986) used the capital asset pricing model to examine the risk and return for life insurers. BarNiv and Hershbarger (1990) identified variables useful for monitoring solvency in the life insurance industry and examined the applicability and efficiency of alternative multivariate models.

Limited Dependent Variable Models

This article investigates the relative efficiency and applicability of alternative multivariate models in the property-liability insurance industry. MDA and zero-one OLS regression models (also known as the linear probability model, or LPM) are fairly robust models and have been investigated extensively in the statistical and econometrics literature. The validity of the assumptions underlying these methods often has been ignored in previous studies.

It has been suggested that qualitative response models might reduce some limitations of MDA and zero-one OLS regression. Martin (1977) and Ohlson (1980) used a logit model to identify bankruptcies among commercial banks and industrial firms, respectively. Comprehensive reviews of many of these models are presented by Zavgren (1983), Altman (1984), and others. McFadden (1976) and Amemiya (1981) suggest that the logit model is more robust than MDA.(4) However, Lo (1986) found that MDA may be superior to a logit model if distributions are approximately normal.

The present study considers qualitative response models based on the exponential generalized beta distribution of the second kind (EGB2), which include logit and probit models as special cases. McDonald (1984) and Bookstaber and McDonald (1987) investigate the generalized beta distribution of the second kind (GB2) along with other distributions which are included as special cases.(5) The GB2 provides the basis for the EGB2 that is used in this article to generalize the probit, logit, and other models.

The limited dependent variable distributions are parametric by definition. Breiman et al. (1984) expanded a recursive partitioning algorithm (RPA), and Marais, Patell, and Wolfson (1984) and Frydman, Altman, and Kao (1985) employed the RPA for classification of commercial bank loans and bankruptcy prediction, respectively. The RPA is nonparametric, but it cannot be used for scoring observations within the same group. By contrast, multiple discriminant analysis, nonparametric discriminant models, and logit and probit models assign a score (probability) to each observation on a continuous scale. The development and use of limited dependent variable models for classification of solvent and distressed firms are summarized in Table 2.

TABULAR DATA OMITTED

Multiple Discriminant Analysis and the Linear Probability Model

Multiple discriminant analysis is based on the assumption that the vector of characteristics (X) is distributed as a multivariate normal with unequal means for each group ($\mu_0$, $\mu_1$) but with a common and known covariance matrix ($\Sigma$). The details of MDA are well known and will not be given here. The classification rule for the case of two groups is given by

|Mathematical Expression Omitted~.

The estimator of |Mathematical Expression Omitted~ in equation (1) is the coefficient vector used to obtain the scores for MDA. The assumptions of normality, symmetry, and equal covariance matrices are often violated, especially where financial data on binary variables are employed (see, e.g., Pinches and Trieschmann, 1977).
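Equation (1) is not reproduced in this copy. As a point of reference only, the textbook two-group linear discriminant rule under the stated assumptions (multivariate normality, a common covariance matrix $\Sigma$, and, as an added assumption here, equal priors and equal misclassification costs) can be written as follows. The article's own equation (1) may differ in notation or in its constant term.

```latex
\text{Assign } x \text{ to group 1 if}\quad
x'\Sigma^{-1}(\mu_1 - \mu_0)
\;-\; \tfrac{1}{2}\,(\mu_1 + \mu_0)'\,\Sigma^{-1}(\mu_1 - \mu_0) \;\ge\; 0,
\quad\text{otherwise assign it to group 0.}
```

In this form the coefficient vector is $\Sigma^{-1}(\mu_1 - \mu_0)$, which in practice is estimated by substituting the group sample means and the pooled sample covariance matrix.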

The linear probability model (LPM) or zero-one regression model is defined by

$Y_j = X_j b + \varepsilon_j$, (2)

where $Y_j$ denotes a zero or one for the jth firm, b is the unknown vector of coefficients, and $\varepsilon$ is the unobservable error term. It is well known that $\varepsilon$ in equation (2) is neither normally distributed nor homoskedastic. Consequently, least squares estimators are neither efficient nor normally distributed, t-tests and other diagnostic statistics such as $R^2$ are meaningless, and predicted probabilities obtained from equation (2) need not lie in the unit interval.
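The last point can be seen on a very small example. The sketch below fits equation (2) with a constant and a single regressor by least squares; the data are hypothetical and chosen only for illustration, and the helper name `fit_simple_ols` is not from the article.

```python
# Toy data (hypothetical): one regressor and a zero-one outcome.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 0, 0, 0, 1, 1, 1, 1]


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_simple_ols(x, y)

# Fitted "probabilities" from the linear probability model.
fitted = [intercept + slope * xi for xi in x]

# Fitted values that fall outside the unit interval.
outside = [p for p in fitted if p < 0 or p > 1]

print([round(p, 3) for p in fitted])
print([round(p, 3) for p in outside])
```

With these data the fitted value is negative at the smallest regressor value and exceeds one at the largest, so neither can be read as a probability.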

Nonparametric Discriminant Model

The nonparametric discriminant model (NPDM), which uses a new separation rule, was developed by BarNiv and Raveh (1986, 1989) and was used by BarNiv and Hershbarger (1990) to classify financial distress in the life insurance industry. The authors suggest choosing the coefficients so that scores $Z_{1i}$ given to group 1 will be greater than (or less than) the scores $Z_{0j}$ of group 0. The method searches for an optimal linear combination that yields minimum overlap between the two groups of scores. The zone of overlap between the two groups of scores obtained by the NPDM is always smaller than or equal to the overlap obtained by the MDA or other models such as the logit or probit. The measure to be maximized is based on the inequalities

$Z_{1i} \geq Z_{0j}$, (3)

where $Z_{1i}$ are the scores of group 1, and $Z_{0j}$ are the scores of group 0; $i = 1, \ldots, n_1$, and $j = 1, \ldots, n_2$. The following index of separation is obtained:

|Mathematical Expression Omitted~,

where |Mathematical Expression Omitted~, and |Mathematical Expression Omitted~, b is the vector of coefficients obtained by maximizing equation (4), and where |Mathematical Expression Omitted~ and |Mathematical Expression Omitted~ are mean scores of group 0 and group 1, respectively. The condition $-1 \leq IS(b) \leq 1$ is always maintained. IS(b) = 1 implies no overlapping of scores of the two groups, $Z_i < Z_j$; the two populations are degenerate (i.e., there is no overlapping between the two distributions of scores and there is maximum separation). Another extreme case is IS(b) = 0, which occurs when both means are equal, $\bar{Z}_1 = \bar{Z}_0$. The maximization index, IS(b), is solved by the Zangwill (1967/1968) algorithm, which requires an initial guess of the weights and is restricted to local maxima. Other initial guesses of the weight vector b might also be the uniform vector b = (1,..., 1) or might be based on the data properties. A cutoff point, cp, may be chosen so that the total number of misclassifications is minimized. The classification rule is: Assign an observation to group 1 if $x'b \geq cp$, otherwise assign it to group 0.

A similar classification rule can be used with the MDA, logit, or probit scores (probabilities), but the number of misclassifications with NPDM will always be less than, or equal to, the number of misclassifications obtained by the MDA or logit for an estimation sample.(6)
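The cutoff selection step described above applies to any set of scores, whichever model produced them. The following is a minimal sketch of that step only; it does not implement the NPDM maximization of IS(b), and the function name and sample scores are hypothetical.

```python
def best_cutoff(scores_group1, scores_group0):
    """Return (cutoff, errors) minimizing total misclassifications.

    Rule assumed: assign to group 1 if score >= cutoff, otherwise to group 0.
    The error count only changes at observed scores, so those are the candidates.
    """
    candidates = sorted(set(scores_group1) | set(scores_group0))
    candidates.append(float("inf"))  # cutoff above every score: all go to group 0
    best = None
    for cp in candidates:
        missed_1 = sum(1 for s in scores_group1 if s < cp)   # group 1 sent to group 0
        missed_0 = sum(1 for s in scores_group0 if s >= cp)  # group 0 sent to group 1
        errors = missed_1 + missed_0
        if best is None or errors < best[1]:
            best = (cp, errors)  # ties keep the smallest cutoff
    return best


# Hypothetical scores x'b for the two groups.
cutoff, errors = best_cutoff([2.0, 3.0, 1.2, 4.0], [0.5, 1.0, 1.5, 0.2])
print(cutoff, errors)
```

Because this rule counts both kinds of error equally, it corresponds to the minimum-number-of-misclassifications criterion rather than to a cost-weighted criterion.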

Qualitative Response Models

In qualitative response models such as the logit or probit, the distribution of the dependent variable or score (z) is conditional on the vector of explanatory variables x, and z/x is logistic. In contrast, MDA assumes that the distribution of x is conditional on z, and x/z is normal. Lo (1986) developed the relation between logit and MDA and proved that the required assumption of normality for MDA assures that the conditional distribution z/x is logistic. However, the converse is not true, and therefore the logit model is a more robust theoretical procedure.

Qualitative response models take the following form:

|Mathematical Expression Omitted~

where F is the cumulative distribution function (CDF), and f is the density function. The probit and logit models, the most widely used qualitative response models, are based, respectively, on the normal and logistic density functions.
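Equation (5) is omitted in this copy. Assuming the usual binary response form, in which the probability of belonging to group 1 is F(x'b), the logit and probit cases differ only in the choice of F. The sketch below evaluates both CDFs with the Python standard library; the function names are illustrative.

```python
import math


def logistic_cdf(z):
    """CDF of the standard logistic distribution (logit model)."""
    return 1.0 / (1.0 + math.exp(-z))


def normal_cdf(z):
    """CDF of the standard normal distribution (probit model)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Both functions map any score onto the unit interval.
for z in (-2.0, 0.0, 2.0):
    print(z, round(logistic_cdf(z), 4), round(normal_cdf(z), 4))
```

Both functions satisfy F(-z) = 1 - F(z), which is the symmetry about the origin that the EGB2-based models relax.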

The logit and probit models can be generalized by selecting f(z) to be the exponentiated generalized beta of the second kind (EGB2), defined by

|Mathematical Expression Omitted~

for $-\infty < z < \infty$, where a, p, and q are positive shape parameters, d is a positive scale parameter, and B(p,q) denotes the beta function. The density of EGB2 is symmetric about the origin if p = q and d = 1. If we denote the cumulative distribution function (CDF) by |Mathematical Expression Omitted~, then the logit model can be expressed as

|Mathematical Expression Omitted~.
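The EGB2 density in equation (6) is omitted in this copy. One parameterization consistent with the description above, obtained by taking the logarithm of a GB2 variate, is f(z) = a e^{apz} / (d^{ap} B(p,q) (1 + (e^z/d)^a)^{p+q}). This form is an assumption and may differ from the article's own notation. The sketch checks two properties stated in the text: the density reduces to the logistic density when a = d = p = q = 1, and it is symmetric about the origin when p = q and d = 1.

```python
import math


def beta_fn(p, q):
    """Beta function B(p, q) computed from gamma functions."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)


def egb2_pdf(z, a=1.0, d=1.0, p=1.0, q=1.0):
    """Assumed EGB2 density: a * u**p / (B(p, q) * (1 + u)**(p + q)), u = (e**z / d)**a."""
    u = (math.exp(z) / d) ** a
    return a * u ** p / (beta_fn(p, q) * (1.0 + u) ** (p + q))


def logistic_pdf(z):
    """Density of the standard logistic distribution."""
    return math.exp(z) / (1.0 + math.exp(z)) ** 2


# Logistic special case: a = d = p = q = 1.
print(egb2_pdf(0.7), logistic_pdf(0.7))

# Symmetric when p = q and d = 1; skewed when p differs from q.
print(egb2_pdf(1.3, p=2.5, q=2.5), egb2_pdf(-1.3, p=2.5, q=2.5))
print(egb2_pdf(1.0, p=0.5, q=3.0), egb2_pdf(-1.0, p=0.5, q=3.0))
```

The third line of output illustrates how unequal p and q shift mass to one side of the origin, which is the flexibility the generalized models add over the logit and probit.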

The relationship between the probit model and equation (6) involves the following limit (see also McDonald, 1984, for preliminary discussion):

|Mathematical Expression Omitted~

The probit and logit models are symmetric about the origin. Two special cases of the EGB2 which allow, but do not impose, symmetry will be considered. These will be referred to as the Burrit and Lomit models, because they can be presented as being based on the Burr and Lomax distributions, just as the logit model is based on the logistic distribution.(7) Before defining these models, note from equation (9) that, except for limiting cases as in the probit model, we can, without loss of generality, let a and d equal one.

The Lomit qualitative response model is defined as

|Mathematical Expression Omitted~,

and the Burrit model is defined by

|Mathematical Expression Omitted~.

Equations (9) and (10) can be shown to follow from equation (6). Note that equations (9) and (10) both encompass the logit model. Thus, the EGB2 includes the probit, logit, Burrit, and Lomit as special cases. Maximum likelihood estimations are used to obtain the vector of coefficients (b).

The relative log-likelihood values for the EGB2 and the Lomit or Burrit models are expected to be better than, or equal to, the log-likelihood values obtained for the logit or probit model. However, the predictive ability of the various models cannot be determined a priori. It is assumed implicitly that the scores $z_j$ are symmetrically distributed for the logit and probit models. However, the Lomit and Burrit models do not assume that the $z_j$ scores are symmetrically distributed.
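The log-likelihood being compared across these models has the same structure for every choice of F. The sketch below evaluates it for a generic CDF, assuming the usual binary response form P(y = 1 | x) = F(x'b); it does not perform the maximization, and the data and coefficient values are hypothetical.

```python
import math


def logistic_cdf(z):
    """CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(b, X, y, cdf=logistic_cdf):
    """Sum over firms of y*ln F(x'b) + (1 - y)*ln(1 - F(x'b))."""
    total = 0.0
    for row, yi in zip(X, y):
        z = sum(bk * xk for bk, xk in zip(b, row))
        prob = cdf(z)
        total += math.log(prob) if yi == 1 else math.log(1.0 - prob)
    return total


# Hypothetical data: the first column is a constant term.
X = [[1.0, -1.5], [1.0, -0.5], [1.0, 0.3], [1.0, 1.2]]
y = [0, 0, 1, 1]

ll_null = log_likelihood([0.0, 0.0], X, y)  # every probability equals 0.5
ll_alt = log_likelihood([0.0, 2.0], X, y)   # a coefficient vector that fits better
print(ll_null, ll_alt)
```

Passing a more general CDF as `cdf` leaves the rest unchanged, which is why a model that nests the logit can do no worse than the logit at its own maximum on the estimation sample.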

Data and Statistical Methodology

Over 40 percent of the property-liability insurers (about 1,130 companies) retired during the period 1961 through 1988. Of these, 531 companies merged or were dissolved into other insurers; most of these companies were acquired by other insurers or groups of insurers, but a few also merged into the parent insurance companies. Not all mergers or acquisitions of insurers, which cause the companies to disappear, should be regarded as distress or insolvency. Property-liability insurer retirements for this period are summarized in Table 3.

Sample Selection

The population of property-liability insurers ranged from 2,700 to over 3,000 during the period from 1974 through 1988. Of these, approximately 1,950 insurers reported data to A. M. Best in 1989. This rating company provides data for property-liability insurers that meet minimum premium and total assets volume requirements. Because none of the insolvent insurers reported total assets exceeding $750 million, the population of solvent insurers was also selected from insurers of the same size. Groups of companies with total consolidated assets over $1 billion were also excluded.

Insolvent property-liability insurers in the sample were selected from Best's Insurance Reports: Property-Casualty and Best's Key Rating Guide. An additional source of data for a few insurers was financial statements obtained from an insurance department. Of these insurers, 182 became insolvent between April 1, 1974, and March 31, 1988. Of those 182 insurers, data are available for 155, 14 of which were dropped because of incomplete or missing data. This left an insolvent insurer sample of 141 firms that had at least four years of data one year before insolvency. Of these firms, 138 had at least four years of data two years prior to becoming insolvent, and 126 had at least four years of data three years prior to insolvency. The insolvent firms were matched with 160 solvent companies. Matching was based on approximate size and time for which data were available. Seven solvent insurers were dropped because of missing data. The final sample comprises 294 property-liability insurers--153 solvent and 141 insolvent--one year prior to insolvency.(8)

The solvent and insolvent samples were each split into an estimation sample and prediction (holdout) sample; a time-series (inter-temporal) holdout sample was selected. The models were estimated by using companies from 1974 through 1983 for the estimation sample and different firms from 1984 through 1988 as an independent holdout sample. This partition ensured adequate estimation and holdout sample sizes (see Table 4 for a summary of the number of insurers in each sample). We used a time-series holdout sample instead of a random cross-sectional sample for three reasons. First, the random cross-sectional holdout sample is more likely to exaggerate the model's predictive ability; a time-series holdout is more likely to indicate poor predictive ability, because different time periods and, as a result, different costs are involved. Second, the time-series holdout sample is independent of the estimation sample, and spurious correlations between decisions of regulators or owners and independent variables are eliminated. Third, time-series holdout samples are relevant if regulators want to know whether models estimated on historical data can be used to make predictions.(9)
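The inter-temporal partition described above can be sketched as a simple split on year. The record layout and field names below are hypothetical; the only detail taken from the text is that firms from 1974 through 1983 form the estimation sample and different firms from 1984 through 1988 form the holdout sample.

```python
def split_by_year(firms, last_estimation_year=1983):
    """Split firm records into (estimation, holdout) samples by year."""
    estimation = [f for f in firms if f["year"] <= last_estimation_year]
    holdout = [f for f in firms if f["year"] > last_estimation_year]
    return estimation, holdout


# Hypothetical records: "year" is the year the firm enters the sample.
firms = [
    {"name": "A", "year": 1976, "insolvent": 1},
    {"name": "B", "year": 1980, "insolvent": 0},
    {"name": "C", "year": 1983, "insolvent": 1},
    {"name": "D", "year": 1984, "insolvent": 0},
    {"name": "E", "year": 1987, "insolvent": 1},
]

estimation, holdout = split_by_year(firms)
print([f["name"] for f in estimation])
print([f["name"] for f in holdout])
```

Unlike a random cross-sectional split, no holdout firm is drawn from the same period as the estimation firms, so the holdout results reflect how a model estimated on historical data performs on later cases.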

Variable Selection

Previous studies on predicting financial distress in the property-liability insurance industry used stepwise or similar techniques to select the financial variables shown to be useful in predicting insolvency. Trieschmann and Pinches (1973) used a six-variable MDA model. Harmelink (1974) and Eck (1982) used two different seven-variable models. Ambrose and Seward (1988) employed different MDA models with between two and six variables. Only three of those variables coincided in two or more studies. Harrington and Nelson (1986) used a seven-variable model for the regression analysis, only one ratio of which coincided with previous studies. BarNiv and Raveh (1986) presented models with the three most important variables, which were obtained by forward stepwise analysis for one and three years prior to insolvency. These variables measure the variability and stability of balance sheet items, as well as the mean/standard deviation of the profit margin, over time. BarNiv and Smith (1987) used a similar mean/variance ranking based on the overall operating ratio. Here, the variable selection process focuses on various aspects of insurer underwriting and investment operations, as well as other measures of performance outlined by the previous studies. The analyses use 45 financial variables, most of which were included in previous studies.

The variable selection process is influenced by the previous studies. First, we investigate the classification power of the 45 variables. Second, the significant variables obtained by forward stepwise analyses are selected for one, two, and three years prior to insolvency. Lag length was found to have a significant impact on predictive accuracy. Predictive ability deteriorates as one moves away from the insolvency date. For example, the models show a lower percentage of correct classifications three years prior to insolvency, compared with correct classification one and two years prior to insolvency. The seven variables selected for this study were identified by a forward stepwise procedure as significant for one, two, and three years prior to insolvency. Third, several combinations of other subsets of variables postulated to disclose information on insolvency are also considered. However, the seven-variable model statistically dominates these other combinations in terms of the highest percentage of correct classification and/or lower expected cost of misclassification.

The following seven significant variables are included in the model:

$X_{10}$ Net income/total assets. This overall operating ratio comprises underwriting profits, net investment income, and other investment gains (or losses) in the numerator.

$X_{20}$ Surplus (equity).

$X_{29}$ Net income/surplus. The numerator is identical to net income in ratio $X_{10}$.

$X_{35}$ Mean/standard deviation of an overall operating ratio for a nine-year period. This overall operating ratio is identified as 1 - CTR + (NII + OIG)/NPW, where CTR = combined trade ratio (the loss ratios plus the expense ratio), NII = net investment income, OIG = other investment gains (or losses), and NPW = net premiums written.

$X_{37}$ Mean/standard deviation of another overall operating ratio (essentially similar to the ratio, $X_{29}$) for a nine-year period. The ratio is 1 - [(UE + LE - NII - OIG)/surplus], where UE = underwriting expenses, and LE = loss expenses. This variable differs from $X_{35}$ in both the numerator and the denominator.(10)

$X_{42}$ Liability decomposition defined as |Mathematical Expression Omitted~, where i indexes the types of liabilities (including surplus, loss reserves, unearned premium reserves, and all other liabilities in one item), i = 1,..., k; and in this study k = 4. $Q_i$ is the relative share (proportion) of liability i to total balance sheet for the year of data, and $p_i$ is the share (proportion) of liability i to total balance sheet for a previous year (one year in this study), and $0 \leq Q_i, p_i \leq 1$.

$X_{43}$ Liability decomposition measure, which uses the absolute value of $\ln(Q_i/p_i)$, defined as |Mathematical Expression Omitted~.

These variables are hypothesized to be associated with insurers that are likely to become insolvent. The variables $X_{35}$ and $X_{37}$ measure profitability versus earning stability (see BarNiv and Smith, 1987). A high mean over time is a sign of high profitability; a high standard deviation is a sign of instability. Both a low mean and high standard deviation are signs of financial distress. The lower the value of these variables, the greater the probability of insolvency. The standard deviation and standard error of financial ratios over periods of about ten years have also been used in a few previous studies (Altman, Haldeman, and Narayanan, 1977; Dambolena and Khoury, 1980). Lev (1971, 1974), Booth (1983), BarNiv and Raveh (1986), and BarNiv and Hershbarger (1990) have provided evidence that the decomposition measures are useful in insolvency prediction models and have also demonstrated remarkable results in classifying failed and nonfailed firms. The hypothesis implies that large decomposition measures are a sign of financial distress. The variables $X_{10}$ and $X_{29}$ are overall profitability ratios indicating management efficiency. A low ratio is a sign of financial distress. We expect large companies to be less susceptible to financial distress so that the likelihood of insolvency decreases with size of surplus ($X_{20}$).
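Two of the variable types above can be sketched in a few lines. The operating ratio follows the definition given for the mean/standard deviation variable based on net premiums written. The decomposition formula is omitted in this copy, so the information-type form sum(q ln(q/p)) used below is an assumption, as is the choice of the sample (n - 1) standard deviation. All figures are hypothetical, and three years are used instead of nine to keep the example short.

```python
import math
import statistics


def operating_ratio(ctr, nii, oig, npw):
    """Overall operating ratio: 1 - CTR + (NII + OIG) / NPW."""
    return 1.0 - ctr + (nii + oig) / npw


def mean_over_std(values):
    """Mean divided by the sample standard deviation (n - 1 denominator, assumed)."""
    return statistics.mean(values) / statistics.stdev(values)


def decomposition_measure(current_shares, previous_shares):
    """Assumed form sum(q * ln(q / p)); all shares must be strictly positive."""
    return sum(q * math.log(q / p) for q, p in zip(current_shares, previous_shares))


# One hypothetical year: combined trade ratio 1.05, investment results 8 + 2 on premiums of 100.
ratio = operating_ratio(ctr=1.05, nii=8.0, oig=2.0, npw=100.0)

# Hypothetical yearly ratios: higher mean and lower variability raise the score.
stability = mean_over_std([0.05, 0.02, 0.08])

# Hypothetical liability shares (k = 4) for the current and the previous year.
shift = decomposition_measure([0.30, 0.40, 0.20, 0.10], [0.25, 0.45, 0.20, 0.10])
print(ratio, stability, shift)
```

Under the assumed form, the decomposition measure is zero when the liability shares are unchanged from the previous year and grows as the balance sheet structure shifts.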

Estimation Models and Methodological Issues

Different procedures to classify and predict insolvencies are used in this study. The profile of the univariate variables is tested empirically. Univariate and multivariate tests for normality are then conducted. Other assumptions of the MDA and LPM are also tested. Multivariate models also applicable to the property-liability industry are used. Since the LPM provides results similar to those of the MDA, its results are not presented. Univariate ranking methods are compared with the multivariate models. Before the multivariate models are used, their assumptions are examined. The empirical results in the following sections are analyzed for various prior probabilities and misclassification costs. The population prior probabilities are $\pi_0$ and $\pi_1$ for the solvent and insolvent firms, respectively; $\pi_0$ and $\pi_1$ are known. The relative cost of misclassifying an insolvent firm as a solvent one (type 1 error) is denoted by $c_1$, and the cost of misclassifying a solvent insurer as an insolvent one (type 2 error) is denoted by $c_0$. It is expected that $c_1 \ge c_0$. The classification rule for MDA is to assign a firm with a profile vector x' to group 1 if

[Mathematical Expression Omitted] (11)

Thus, the adjustment in the classification rule is made by incorporating the population prior probability rates and the misclassification costs. Therefore, only the constant term is affected in the MDA function. The same rule may be employed for the NPDM. BarNiv and Raveh (1989) developed a generalization for equation (4) that takes into consideration misclassification costs and prior probabilities.
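The effect of priors and costs on the constant term can be sketched as follows. Because equation (11) is omitted in this text, the sketch assumes the textbook two-group linear discriminant rule: with the score scaled as a log-likelihood ratio and larger scores indicating the insolvent group, a firm is assigned to group 1 when its score exceeds the midpoint by at least ln(pi0 c0 / (pi1 c1)). The function names and the example score are hypothetical.

```python
import math

def cutoff_shift(pi0, pi1, c0, c1):
    """Shift of the cutoff away from the midpoint (assumed textbook form)."""
    return math.log((pi0 * c0) / (pi1 * c1))

def classify(score, midpoint, pi0, pi1, c0, c1):
    """Return 1 (insolvent) or 0 (solvent); only the constant term moves."""
    return 1 if score - midpoint >= cutoff_shift(pi0, pi1, c0, c1) else 0

# A hypothetical firm scoring slightly above the midpoint.
equal_costs = classify(0.5, 0.0, pi0=0.99, pi1=0.01, c0=1.0, c1=1.0)
high_type1_cost = classify(0.5, 0.0, pi0=0.99, pi1=0.01, c0=1.0, c1=99.0)
```

With a prior of 0.01 for insolvency and equal costs, the firm is called solvent; once the type 1 cost is large enough that pi1 c1 equals pi0 c0, the cutoff returns to the midpoint and the same firm is called insolvent.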

A cutoff score may also be selected to minimize the expected cost of misclassification (ECM), also termed the "resubstitution risk" by Breiman et al. (1984). The expected cost of misclassification is defined by

$$\mathrm{ECM} = \pi_0 c_0 P(I \mid S) + \pi_1 c_1 P(S \mid I), \tag{12}$$

where $P(S \mid I)$ is the conditional probability that a firm is predicted solvent given that it is insolvent, and $P(I \mid S)$ is the conditional probability that a firm is predicted insolvent given that it is solvent.

For a sample size of N observations, the ECM is approximated as follows:

$$\mathrm{ECM} = \pi_0 c_0 \frac{n_0}{N_0} + \pi_1 c_1 \frac{n_1}{N_1}, \tag{13}$$

where $n_i$ is the number of misclassified firms in group i (so $n_0$ counts solvent firms classified as insolvent and $n_1$ counts insolvent firms classified as solvent), $N_i$ is the sample size of the ith group, $N = N_0 + N_1$, and i = 0, 1 (the solvent and insolvent groups, respectively).
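As a small worked instance of equation (13), the sketch below evaluates the sample ECM for one set of misclassification counts. The counts and the cost of 20 for a type 1 error are hypothetical and chosen only to show the arithmetic; the function name is likewise an assumption.

```python
def sample_ecm(n0, N0, n1, N1, pi0, pi1, c0=1.0, c1=1.0):
    """Sample approximation of the expected cost of misclassification.

    n0: solvent firms classified as insolvent, out of N0 solvent firms
    n1: insolvent firms classified as solvent, out of N1 insolvent firms
    """
    return pi0 * c0 * n0 / N0 + pi1 * c1 * n1 / N1

# Hypothetical counts: 10 of 100 solvent and 5 of 50 insolvent firms misclassified.
example = sample_ecm(n0=10, N0=100, n1=5, N1=50, pi0=0.99, pi1=0.01, c0=1.0, c1=20.0)
# 0.99 * 1 * 0.10 + 0.01 * 20 * 0.10 = 0.099 + 0.020 = 0.119
```

Because the prior for solvent firms is so large, the type 2 term dominates unless the type 1 cost is high, which is why the choice of costs matters so much in the results that follow.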

Cutoff scores are selected to minimize the expected cost of misclassification for each $c_1$ in the sample. They are applicable for MDA, NPDM, LPM, logit, probit, Lomit, etc. Also, when $\pi_1 c_1 = \pi_0 c_0$, the expected cost of misclassification is approximated for the multiple discriminant analysis and the nonparametric discriminant model by their classification rules. For the qualitative response models the conditional probabilities are as follows:

[Mathematical Expression Omitted] (14)

and

[Mathematical Expression Omitted]

The prior probabilities of the insolvent and solvent groups are first taken as $\pi_1 = 0.01$ and $\pi_0 = 0.99$, and then as $\pi_1 = 0.02$ and $\pi_0 = 0.98$. The misclassification costs for type 2 errors are fixed at $c_0 = 1$, while misclassification costs for type 1 errors range from 1 to 100. Cutoff scores relevant to the research design are used. All the models are corrected for prior probabilities, misclassification costs, and the effect of choice-based samples for model estimation.(11)
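The selection of an ECM-minimizing cutoff for each type 1 cost can be sketched as a simple grid search. The estimated probabilities, the labels, and the grid of candidate cutoffs below are hypothetical; the sketch only illustrates how the chosen cutoff moves as the type 1 cost rises from 1 to 100 with the type 2 cost fixed at 1.

```python
def ecm_at_cutoff(probs, labels, cutoff, pi0, pi1, c0, c1):
    """ECM when a firm is called insolvent if its probability >= cutoff."""
    n_solvent = labels.count(0)
    n_insolvent = labels.count(1)
    false_alarms = sum(1 for p, y in zip(probs, labels) if y == 0 and p >= cutoff)
    misses = sum(1 for p, y in zip(probs, labels) if y == 1 and p < cutoff)
    return pi0 * c0 * false_alarms / n_solvent + pi1 * c1 * misses / n_insolvent

def best_cutoff(probs, labels, pi0, pi1, c0, c1):
    """Lowest cutoff on a 0.01 grid that attains the minimum ECM."""
    grid = [k / 100 for k in range(101)]
    return min(grid, key=lambda t: ecm_at_cutoff(probs, labels, t, pi0, pi1, c0, c1))

# Hypothetical estimated probabilities of insolvency (0 = solvent, 1 = insolvent).
probs = [0.05, 0.10, 0.20, 0.35, 0.60, 0.30, 0.55, 0.70, 0.90]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1]

cutoffs = {c1: best_cutoff(probs, labels, 0.99, 0.01, 1.0, c1) for c1 in (1, 10, 100)}
```

In this toy sample the cutoff stays high while the type 1 cost is small, because false alarms among the many solvent firms dominate the ECM, and it drops once missing an insolvent firm becomes expensive enough.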

Empirical Results

Beaver (1966), Ohlson (1980), Zavgren (1985), and others found that the accuracy of classification results increased as the firms moved toward the year prior to insolvency. Thus, the firms' financial variables deteriorated from three to two years prior to insolvency, and further deterioration occurred as the firms reported the last financial statement one year before insolvency. The following analyses show the results for the three different base years.

Independent Variables: Univariate Analysis

Table 5 presents the summary statistics of the seven independent variables and the premium-to-surplus ratio. The statistics indicate significant differences between the solvent and insolvent samples for all seven variables one, two, and three years prior to insolvency. Insignificant differences between the solvent and insolvent samples are also indicated for a few other variables, which were used in previous studies; these results are not reported here. The premium-to-surplus ratio included in most previous studies generally has no significant ability to classify solvent and distressed property-liability insurers in a univariate analysis. Contrary to common belief in the industry, this traditional measure of capacity and underwriting risk fails to classify solvent and insolvent property-liability insurers. Distressed firms had about the same median ratio as solvent firms three and two years prior to insolvency; the difference in means between solvent and insolvent firms is only slightly significant. The empirical findings present insignificant zones of overlap for the mean/standard deviation ranking variable $X_{35}$ and the decomposition measures $X_{42}$ and $X_{43}$; these results are consistent with earlier findings of BarNiv and Raveh (1986) and BarNiv and Smith (1987). Apparently, these measures of profitability and stability over time are effective variables for detecting financial distress in the property-liability insurance industry. Univariate classification results reveal that most of the seven variables appear to distinguish between solvent and insolvent firms even three years before failure. For example, the mean/standard deviation ranking variable $X_{35}$ correctly classifies 82 percent of property-liability insurers one year before insolvency, while the premium-to-surplus ratio correctly classifies only 53 percent of insurers. Additional results are available upon request from the authors.

Multivariate Models: Assumptions and Estimated Coefficients

In order to examine the reliability of the various multivariate models, the univariate and multivariate distributions of the seven-variable vector and additional variables were examined. The equality of the variance/covariance matrices was also examined. The univariate coefficients of skewness and kurtosis and the related z statistics for the entire estimation sample for both solvent and insolvent companies were tested for the three different base years. Several variables were skewed to the right or to the left and were more peaked, with heavier tails, than a normal distribution (see Pinches and Trieschmann, 1977, for specifications of the z test).
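A rough illustration of such skewness and kurtosis z statistics is given below. It assumes the common large-sample approximations, in which the sample skewness is divided by sqrt(6/n) and the excess kurtosis by sqrt(24/n); the specification in Pinches and Trieschmann (1977) may differ in detail, and the data values are hypothetical.

```python
import math

def skew_kurt_z(x):
    """Return (z_skewness, z_kurtosis) under large-sample normal approximations."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness / math.sqrt(6.0 / n), excess_kurtosis / math.sqrt(24.0 / n)

# Hypothetical ratio values: one symmetric sample and one skewed to the right.
z_sym = skew_kurt_z([1.0, 2.0, 3.0, 4.0, 5.0])
z_right = skew_kurt_z([1.0, 1.0, 1.0, 2.0, 10.0])
```

A z value far from zero for either coefficient is evidence against univariate normality; with samples as small as these toy lists the approximation is crude and serves only to show the calculation.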

Several statistics of the Kolmogorov-Smirnov test of normal approximation for the estimation sample were significant, an indication that variables were not univariate normal and could not be assumed multivariate normal. F-tests of the variance/covariance matrices of the seven-variable models for one, two, and three years before insolvency were significant. The matrices are therefore unequal, indicating that linear multiple discriminant analysis should not be used. In summary, the main assumptions of MDA are violated and, although MDA is considered fairly robust for prediction purposes, other multivariate analyses should also be employed.

We compared our classification results with those of previous studies for various MDA functions. Classification results are substantially reduced when our large data base is employed for functions used in previous studies. For example, Ambrose and Seward's (1988) four-variable MDA function correctly classified 86 percent of the 29 solvent and 29 insolvent firms. Only 74 percent of the 159 solvent and insolvent firms used in the estimation sample of this study one year before insolvency were correctly classified, and this is reduced to 58 percent of the firms three years prior to insolvency. The hypothesis that our data base and the 58 insurers used by Ambrose and Seward have the same predictive ability is rejected; approximate $\chi^2$ tests for differences in predictive ability (see Conover, 1971) are highly significant, and the null hypotheses are rejected for the three base years. Similar empirical results are obtained for comparisons with other studies (with the exception of Harmelink, 1974). In addition to the significant reduction in classification ability, the coefficients of the functions are changed; for a few variables, even the signs of the coefficients are changed.

Although the differences in performance of the functions for the different data bases may be due to the time period analyzed, the evidence indicates that some time-series stability exists. Because this study and Ambrose and Seward used essentially the same time period, it appears that the differences in performance are the result of the small samples used in previous studies and the selection of variables. Table 6 also shows the classification ability of the stepwise MDA functions. The results indicate that the selection of variables from the large data base used here and the elimination of the small sample problem significantly improve the classification.

TABULAR DATA OMITTED

TABULAR DATA OMITTED

Table 7 shows the estimated coefficients and their relative contribution across the seven-variable models based on our data. Significance tests on individual coefficients are not available with MDA or NPDM, but the relative contribution can be approximated by standardized adjusted coefficients. Significance tests on coefficients are available with the qualitative response models. An analysis of t-statistics suggests that $X_{43}$ is the most highly significant variable, followed by $X_{37}$ and $X_{20}$. The expected direction and impact of the variables on the probability of insolvency are also indicated in the table. A positive coefficient indicates that the larger the variable, the greater the expected probability of insolvency; a negative coefficient indicates that the larger the variable, the smaller the expected probability of insolvency. The estimated coefficients for the absolute values of liabilities decomposition measures ($X_{43}$) are significantly positive in the expected direction for one and three years prior to insolvency. Estimates for surplus size ($X_{20}$) are also significant in the expected direction for one and three years prior to insolvency. The mean-standard deviation ratio for overall profitability ($X_{37}$) is significant in the negative direction, but only for one year prior to insolvency. The separation indices IS(b) improve slightly when the NPDM models are applied. The eigenvalues and log-likelihood values are highly significant and decline substantially across the base years; however, they are still highly significant three years before insolvency. We also used $\chi^2$ tests of significance to test the differences in log-likelihood values. The EGB2, Lomit, and Burrit models had significantly more explanatory power than the logit or probit models for the year prior to insolvency. However, the differences in the log-likelihood values were statistically insignificant (at p < .05) two and three years before insolvency.(12)
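A likelihood ratio comparison of this kind can be sketched as follows. The log-likelihood values are hypothetical, not taken from Table 7, and the sketch assumes that the logit is obtained from the EGB2 by fixing two shape parameters, so that the statistic is referred to a chi-square distribution with two degrees of freedom. Closed forms for the chi-square tail are used for one and two degrees of freedom only.

```python
import math

def lr_statistic(loglik_restricted, loglik_general):
    """Twice the gain in log-likelihood from relaxing the restrictions."""
    return 2.0 * (loglik_general - loglik_restricted)

def chi2_tail(x, df):
    """Upper-tail chi-square probability for df = 1 or df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("closed form implemented only for df = 1 or 2")

# Hypothetical log-likelihoods for a logit and an EGB2 fit on the same sample.
stat = lr_statistic(loglik_restricted=-60.0, loglik_general=-55.0)
p_value = chi2_tail(stat, df=2)
```

Here the statistic is 10 and the tail probability is exp(-5), roughly 0.007, so the more general model would be preferred at conventional levels in this made-up case.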

Classification and Prediction

The classification analysis is confined to the seven-variable function. Several results for 12-variable to 18-variable functions are reported for the MDA at the bottom of Table 6. However, a substantial computational burden is involved in calculating the NPDM and qualitative response models with so many independent variables. In any event, the classification accuracy for the seven-variable models does not differ significantly from classifications obtained by the 12- to 18-variable functions.

Table 8 reports the classification ability for the estimation sample and the predictive ability of the holdout sample. The cutoff point (threshold value) selected for the analysis has a major effect on the empirical results. The midpoint of the Z scale between solvent and insolvent groups implies that $\pi_0 c_0 = \pi_1 c_1$. Classifications vary across the models and base years (years prior to insolvency). For the estimation sample, MDA correctly classifies 92, 89, and 84 percent of the firms one, two, and three years, respectively, prior to insolvency. The NPDM correctly classifies a few more firms for the three base years, but the improvement is insignificant. A cutoff point of 0.5 is used for the qualitative response models, which correctly classify similar percentages for the three base years. The logit function slightly outperforms the other models for the estimation sample one year prior to insolvency; the Lomit and Burrit models outperform the other models two years before insolvency; and the NPDM slightly outperforms other models three years prior to insolvency. The differences, however, are insignificant; no model consistently outperforms all of the other models.

TABULAR DATA OMITTED

TABULAR DATA OMITTED

The predictive abilities of the models are estimated by the use of the holdout sample. Firms are classified with cutoff scores used in the estimation sample. The different models yield similar results for all years prior to insolvency. The effect of the base year on the classification accuracy is even more substantial.

We applied another criterion that minimized the number of misclassifications. Both the NPDM and MDA provide similar results two years prior to insolvency, but the NPDM slightly outperforms MDA one and three years prior to insolvency.(13) Other initial guesses for the NPDM are possible and more efficient coefficients might be produced, which will increase the classification and prediction ability of the NPDM. However, in this study the MDA coefficients are used as the initial guesses for the NPDM. The qualitative response models provide similar results and are available upon request. In conclusion, the differences between the classification abilities of the various models are generally insignificant. The base year, however, has a substantial effect.

Choice-Based Samples and Resubstitution Risks

The empirical results are corrected for the effect of choice-based samples for model estimation. The adjusted probabilities (see the Appendix) are estimated for the population. The t-test and the two-sample Mann-Whitney U or Wilcoxon rank-sum approximated Z scores are used to test for differences in the means (medians) of the probability values for the solvent and insolvent firms. All t or z values are highly significant, indicating the ability of the various models to discriminate between solvent and insolvent firms. Both t and z values are somewhat higher for the EGB2, Lomit, and Burrit models, especially one year prior to insolvency.
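The normal approximation to the Mann-Whitney U statistic can be sketched in a few lines. The probabilities below are hypothetical, and the variance used here omits the correction for ties, which is a simplification rather than the procedure the study necessarily followed.

```python
import math

def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def mann_whitney_z(group1, group0):
    """Approximate z score for U of group1 (no tie correction in the variance)."""
    n1, n0 = len(group1), len(group0)
    ranks = midranks(list(group1) + list(group0))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n0 / 2
    sd_u = math.sqrt(n1 * n0 * (n1 + n0 + 1) / 12)
    return (u1 - mean_u) / sd_u

# Hypothetical adjusted probabilities of insolvency for the two groups.
z = mann_whitney_z([0.7, 0.8, 0.9], [0.1, 0.2, 0.3, 0.4])
```

A large positive z indicates that the insolvent firms tend to receive higher estimated probabilities than the solvent firms, which is the sense in which the models are said to discriminate.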

The resubstitution risks (ECMs) of the models for the estimation sample and the holdout sample are computed one, two, and three years prior to insolvency. Cutoff scores are selected to minimize ECMs in the estimation sample. The qualitative response models generate similar results in the estimation sample, but the risks of the MDA are significantly higher. Nonparametric test statistics indicate several significant differences in resubstitution risks among the models. The ECMs with logit or probit models are significantly higher (more resubstitution risk) than those with the Lomit model for both estimation and holdout samples one year prior to insolvency. The ECMs with the Lomit model are significantly lower than those with logit and probit models for three years prior to insolvency, but only in the estimation sample. The Lomit, Burrit, and EGB2 models yield similar results for both samples. The ECMs with all models provide insignificant differences for both samples at p < .05 two years prior to insolvency. In conclusion, the Lomit model provides significantly lower ECMs relative to the logit and probit models for one and three years before a firm becomes insolvent. The MDA provides significantly higher ECMs. Other differences in resubstitution risks among the models are insignificant.

We also used a weighted exogenous sample maximum likelihood procedure to account for choice-based sampling (see Manski and Lerman, 1977; Zmijewski, 1984). The EGB2, Lomit, and Burrit models slightly outperform the other models for all three base years, especially for the estimation sample. However, except for the year before insolvency, the differences are insignificant.
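The weighting idea behind such a procedure can be sketched as follows. Each observation is weighted by the ratio of its group's population share to its sample share, in the spirit of Manski and Lerman (1977), and the weights enter a logit log-likelihood. The sample sizes, the prior of 0.01, and the function names are hypothetical and are not the values used in this study.

```python
import math

def wesml_weights(pi1, n_insolvent, n_solvent):
    """Return (w0, w1): population share divided by sample share per group."""
    n = n_insolvent + n_solvent
    w1 = pi1 / (n_insolvent / n)
    w0 = (1.0 - pi1) / (n_solvent / n)
    return w0, w1

def weighted_logit_loglik(scores, labels, w0, w1):
    """Weighted log-likelihood of a logit model at linear scores z."""
    total = 0.0
    for z, y in zip(scores, labels):
        p = 1.0 / (1.0 + math.exp(-z))
        total += w1 * math.log(p) if y == 1 else w0 * math.log(1.0 - p)
    return total

# Hypothetical choice-based sample: 50 insolvent and 150 solvent firms.
w0, w1 = wesml_weights(pi1=0.01, n_insolvent=50, n_solvent=150)
loglik = weighted_logit_loglik([0.0, 0.0], [1, 0], w0, w1)
```

Insolvent firms make up a quarter of this sample but only one percent of the assumed population, so each receives a weight of 0.04, while each solvent firm receives 1.32; maximizing the weighted likelihood then targets the population rather than the oversampled mix.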

Summary and Conclusions

This article's methodology overcomes some of the problems of traditional analyses for identifying financial distress in the insurance industry. This issue is important in view of the large number of failures in the property-liability insurance industry since 1961. The models and empirical results described in this article are intended to present the current state of knowledge and correct the methodological problems of previous studies.

The theoretical framework emphasizes the nexus among the different models discussed in the study--univariate models, qualitative response models, multiple discriminant analysis, and the nonparametric discriminant model. A new model, the exponential general beta 2 (EGB2), is defined and presented. The logit, probit, Lomit, and Burrit models are special cases of the EGB2. The Lomit or Burrit and the NPDM do not assume that the scores (given to the firms) are symmetrically distributed, and therefore they may better fit the data than MDA. In addition, the use of the NPDM eliminates violations of the basic assumptions which underlie the MDA model.

Large samples of solvent and insolvent firms are used to illustrate the application of the models. Each sample is split into an estimation sample and a holdout (prediction) sample. A seven-variable model is employed, with variables selected by a stepwise procedure for all three base years. The variables include measures of profitability, profitability versus earning stability, stability of balance sheet liabilities (decomposition measures), and surplus size. The seven-variable models perform quite well for classification and prediction and statistically outperform models based on other combinations of variables.

The MDA model seems robust for classification and prediction of insolvent and solvent firms. However, the NPDM, the logit, and other qualitative response models often correctly classify more cases than MDA when the criterion of minimizing the number of misclassifications is employed. In addition, the NPDM outperforms both the logit and the MDA model in terms of prediction or validation results. The EGB2, Lomit, and Burrit models have significantly more powerful log-likelihood values compared to the logit or probit models one year prior to insolvency, whereas the expected cost of misclassification with the Lomit model is significantly lower for both samples one and three years prior to insolvency. The criteria for selecting the cutoff points and the base year prior to insolvency have a substantial effect on the empirical results.

If classification and predictive ability are the only objectives of the model, then multiple discriminant analysis may be robust because the coefficient estimates, the assumptions of the model, and the significance of the results are less important. However, if the purpose of the research is related to evaluation of the firms, selecting the appropriate accounting method, or regulation, the researcher also must be concerned with the assumptions of the models (which are violated if MDA is applied), the selection of proper variables and models, and the minimization of resubstitution risks (ECM). Most multivariate models perform quite well for predicting insolvency in the property-liability industry, but statistical, industrial, and risk considerations support the use of the Lomit and the nonparametric discriminant models and to some extent the exponential general beta 2 and the Burrit models.

Appendix

The choice-based sample problem results from nonrandom sampling (see Manski and Lerman, 1977; Zmijewski, 1984; Palepu, 1986). One possible adjustment for oversampling the insolvent firms is:

$$p' = \frac{1 \cdot p}{1 \cdot p + \alpha (1 - p)},$$

where p = the estimated probability of being an insolvent firm in the population,

$p'$ = the Bayesian probability that a firm j in the sample is insolvent; thus, $p'$ = probability (j is insolvent | j is in the sample), and

$\alpha$ = the probability that a solvent firm in the population is in the sample.

The estimation of $\alpha$ is determined by the ratio of the solvent firms in the sample to the total solvent firms in the population. Hence, the probability that a firm in the population is in the sample is 1.0 (for firms with available data) if it is insolvent, and only 0.169 if it is solvent. If p is a logistic distribution (CDF), $p = 1/(1 + e^{-z})$, then $p'$ (see Palepu, 1986) is

$$p' = \frac{1}{1 + \alpha e^{-z}} = \frac{1}{1 + e^{\ln \alpha - z}}.$$

Because we derived the coefficients by maximizing the likelihood function based on $p'$, the parameters for p should be determined. Palepu demonstrated that only the constant term is affected when the logistic function is employed. In general, p can also be recovered by the following equation (see BarNiv, 1990):

$$p = \frac{1}{\dfrac{1}{\alpha p'} + 1 - \dfrac{1}{\alpha}} = \frac{\alpha p'}{1 + \alpha p' - p'}.$$
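The two adjustment formulas can be checked numerically with a short sketch. It uses the sampling rate of 0.169 for solvent firms stated above; the function names and the example score are assumptions introduced only for illustration.

```python
import math

ALPHA = 0.169  # probability that a solvent firm in the population is sampled

def population_to_sample(p, alpha=ALPHA):
    """Probability of insolvency in the sample given the population value."""
    return p / (p + alpha * (1.0 - p))

def sample_to_population(p_sample, alpha=ALPHA):
    """Recover the population probability from the sample-based one."""
    return alpha * p_sample / (1.0 + alpha * p_sample - p_sample)

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

z = -1.5  # hypothetical linear score
p_pop = logistic(z)
p_samp = population_to_sample(p_pop)
shifted = logistic(z - math.log(ALPHA))  # only the constant term changes
recovered = sample_to_population(p_samp)
```

A sample-based probability of 0.5 corresponds to a population probability of about 0.145 under this sampling rate, which shows how strongly oversampling the insolvent firms inflates unadjusted estimates.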

This adjustment is used for estimating the probability of being an insolvent firm j in the population. The incomplete data available for several firms in the population (of which 14 were insolvent) and the unavailability of some data for many other firms force this study to ignore the incomplete data problem because no alternative exists (for discussions, see Heckman, 1979, and Zmijewski, 1984). The problems of using choice-based samples for prediction and choosing arbitrary cutoff points in prediction are solved as follows. The estimated probability of an insurer being an insolvent firm j in the prediction is adjusted from p' to p in the holdout sample. A cutoff point is selected, with the specific objective of minimizing the expected cost of misclassification in the estimation sample. Firms in the holdout sample are then predicted as solvent or insolvent based on this cutoff point. Thus, the cutoff points in predictions are used within a specific decision context, which minimizes the expected cost of misclassification. Other criteria, such as minimizing the number of misclassifications, are also used for comparison.

1 Edmister (1972), Deakin (1972), and others used MDA to identify financial distress among industrial corporations. Sinkey (1975) and Santomero and Vinso (1977), among others, used MDA to identify bankruptcy in banks, while Altman (1973) used it for identifying distressed railroad companies. MDA has also been used for bond rating, classification of loan applications, and other classification problems.

2 Beaver (1966) provided the foundation for dichotomous classification of financially distressed firms based on univariate financial ratios.

3 Borch (1974) and Hofflander and Duval (1967) defined ruin as zero equity (i.e., the equity is completely eliminated).

4 For an early reference to MDA see Fisher (1936) and Welch (1939). Biometricians used the qualitative response models during the 1940s and 1950s (see Berkson, 1944, 1951 for the logit, and Finney, 1952, 1971 for the probit model). McFadden (1974) presented an analysis of the logit model and its maximum likelihood estimation (see also Maddala, 1983).

5 Special general cases of the GB2 are the beta of the second kind, the Singh-Maddala (or Burr type 12), and the Burr type 3. The generalized gamma distribution can also be included as a limiting case. Other distributions such as the Lomax, Fisk ($\mathrm{sech}^2$), F, $\chi^2$, Weibull, etc., can be presented as specific cases of the special general cases.

6 The general properties of the NPDM are that (1) no assumptions of specific parametric distributions are needed for the independent variables X or for the scores, and both qualitative and quantitative variables can be treated; (2) neither symmetric distributions (such as the normal) nor equality of the variance/covariance matrices are required; (3) the treatment of cost of misclassification and prior probabilities is straightforward; (4) the model can be generalized for more than two groups; and (5) many different discriminant functions may be used (in contrast, MDA, logit, and probit models provide unique explicit solutions).

7 Both the Burr and Lomax distributions are discussed in Johnson and Kotz (1970, pp. 31, 234).

8 In addition, 158 mergers were identified during the research period; most of these firms merged or dissolved into other property-liability insurers, and a few retired through affiliate (or similar) mergers. Data are available for many of these insurers and empirical analysis might be employed in future research.

9 See Dopuch et al. (1987) for similar arguments regarding prediction of audit qualifications.

10 $X_{37}$ includes the statutory underwriting expenses and investment income (the numerator of variable $X_{29}$) divided by the surplus. The variable $X_{35}$ includes the combined trade ratio (UE/NPW + LE/NPE) and the investment income and other investment gains divided by NPW, where NPE is the net premiums earned.

11 The use of the midpoint score between the two groups implies equal expected cost of misclassification or $\pi_1 c_1 = \pi_0 c_0$. However, two other criteria are also used in this study: one minimizes the number of errors; the other minimizes the ECM for each $c_1$ in the estimation sample. The second criterion considers the effect of prior probabilities, alternative costs, and the relative number of misclassifications.

13 The NPDM slightly outperforms the MDA and most qualitative response models in the classification of insolvent firms, especially three years prior to insolvency, when the minimum number of misclassification criterion is used.

The Importance of Insolvency Prediction

This study presents a methodological approach for identifying insolvent insurance companies. In this article, financial distress and insolvency are used interchangeably to describe insurers experiencing liquidation, receivership, conservatorship, restraining orders, rehabilitation, etc. Previous models for predicting financially distressed insurers are summarized and evaluated. More robust models for classifying and predicting financial distress in the insurance industry are presented, and an attempt is made to address methodological issues that previous studies have sometimes ignored. The problem of insolvency in the property-liability insurance industry merits special attention in view of the large number of failures. Since 1961, about 350 property-liability insurers have failed, more than 240 insurers have voluntarily retired, and over 500 companies have merged into other companies, resulting in more than 1,100 property-liability company retirements (compiled from A. M. Best Company, 1961-1990).

Insolvency prediction models can help insurance commissioners determine whether an insurer is in danger of failing and can also help auditors decide whether an insurer is a "going concern." The ability to classify and identify financial distress is important to regulators, legislators, policyholders, auditors, owners, bondholders, and even the general public. Statistical models of insolvency prediction can be constructed to help determine what accounting, financial, and other information could be employed by regulators in making decisions on the financial solidity of insurers.

A number of empirical studies have compared statistical models that use insurers' financial data to predict insolvencies in the property-liability insurance industry (Trieschmann and Pinches, 1973; Pinches and Trieschmann, 1974, 1977; Harmelink, 1974; Cooley, 1975; Eck, 1982; Hershbarger and Miller, 1986; Harrington and Nelson, 1986; BarNiv and Raveh, 1986; BarNiv and Smith, 1987; Ambrose and Seward, 1988; Barniv, 1989; and McDonald, 1992). The models have impressive ability to predict insolvencies in the insurance industry. For example, Trieschmann and Pinches (1973) report that their multiple discriminant analysis (MDA) model correctly classifies 92 percent of insolvent insurers and 96 percent of solvent firms two years prior to the determination of insolvency or solvency; later studies report correct classifications ranging from 62 to 100 percent. Despite the classification success of previous studies, we should be concerned with the accuracy, reliability, and levels of significance for models and coefficients obtained by these studies.

This article will review several of the important methodological issues that have been raised about models used to identify financial distress. This study's objectives are (1) to establish a general framework for multivariate prediction models that is applicable to the insurance industry; (2) to enumerate some of the methodological problems associated with insolvency prediction models for the insurance industry (most of which are relevant to binary state prediction models in general); and (3) to use the multivariate models to predict insolvencies with a high degree of accuracy and reliability by overcoming the methodological limitations encountered in previous studies. We present the current state of knowledge and illustrate the methodological considerations through the use of robust novel models and empirical applications.

Previous Research and Methodological Concerns

Early studies on financial distress in the property-liability industry lacked methodological and statistical verification and were mostly descriptive (see Denenberg, 1967; Evans, 1968; Nelson, 1971). Also, Kenny's (1967) tests, such as the surplus ratio ("2 for 1" rule) and other measures of performance, were criticized as "rules of thumb" (see Hofflander, 1968). Table 1 summarizes the methodologies and the main characteristics of previous studies.

Studies Based on Multivariate Analyses

Most previous studies on predicting financial distress concerned not insurers but other types of corporations. A large number of studies since Altman (1968) have used MDA.(1) However, Joy and Tollefson (1975), Eisenbeis (1977), and Altman and Eisenbeis (1978) warn against potential misinterpretation and possible misapplication of MDA in the classification and prediction of financial distress. More recent studies have discouraged the use of MDA in the prediction of bankruptcy (see, e.g., Ohlson, 1980; Richardson and Davidson, 1984; Zmijewski, 1984; Zavgren, 1985). MDA, which develops a composite score for observations by a procedure of maximizing the ratio of between-group to within-group variance, requires strong assumptions (see Pinches and Trieschmann, 1977, and BarNiv and Raveh, 1989, for discussions and elaborations). Similarly, the use of a binary-state (zero-one) regression model violates the assumptions of ordinary least squares (OLS) regression. Ignoring the assumptions may result in inefficient coefficient estimators and raise questions about classification accuracy. The use of small samples can result in inaccurate classification reports that do not represent the classifications in the population. Also, prior state probabilities of solvent and insolvent insurers are often not considered, resulting in incorrect classification percentages, unless the entire population is sampled. For the years 1961 to 1988 the expected annual prior probability of insolvency in the property-liability industry was less than 1.3 percent. Many studies on insolvency problems in the insurance industry have ignored these issues.

An additional issue is that misclassification costs (state-payoff matrix) are usually not considered. Misclassifying insolvent insurers as solvent ones (type 1 error) would cost much more than misclassifying solvent insurers as insolvent ones (type 2 error). Also, the choice of an arbitrary cutoff score may not be relevant to the research design and thus may bias the correct classification. The use of choice-based (state-based) samples--that is, oversampling insolvent firms, which actually have a low frequency rate in the population--leads to biased and inconsistent estimates of the model's parameters and classification power.

Another methodological issue is that several studies have not used holdout samples and hence are unable to test the predictive ability of the proposed models. In addition, the use of choice-based holdout samples for predictions fails to represent the predictive ability in the population. Using arbitrary cutoff points in predictions without specifying a decision context makes the reported predictions difficult to interpret (see Palepu, 1986). Incomplete data for some observations in the population (usually insolvent companies) present a challenging problem. The parameters of the models are biased in studies which assume that such firms do not exist and therefore eliminate them from analysis. Although in most cases such firms must be omitted, the model's parameters can be modified if incomplete data are available for several observations (see Zmijewski, 1984). Finally, the models' ability to identify insolvency is substantially reduced as the length of time prior to insolvency increases. Although this problem is well documented in many studies on financial distress, several studies in the property-liability industry have used data for only one year before insolvency.

TABULAR DATA OMITTED

Trieschmann and Pinches (1973), using MDA, performed the first study on predicting financially distressed property-liability insurers. Pinches and Trieschmann (1974) used the same sample to examine the efficiency of univariate versus multivariate financial ratio models for solvency surveillance.(2) The MDA outperformed the univariate models in identifying financially distressed insurers. Cooley (1975), using prior probabilities for populations of solvent and insolvent firms, as well as the relative misclassification costs in prediction, found the impact of both to be substantial. In a subsequent study, Pinches and Trieschmann (1977) illustrated that many different results are possible from the same MDA model. Univariate and multivariate statistical tests indicated that the data were neither univariate normal nor multivariate normal. The Best's ratings were also viewed as surrogates for degrees of solvency. Harmelink (1974) used MDA to predict the degree of insolvency among property-liability firms as measured by a decline in Best's policyholder's ratings. Most studies in the property-liability industry have used MDA while ignoring its potential problems, which include violation of the normal distribution assumptions on the variables, unequal covariance matrices, and the lack of a procedure for screening out insignificant variables through significance tests on the individual coefficients (standard t-tests of significance are not applicable).

The insurance regulatory information system (IRIS), developed by the National Association of Insurance Commissioners during the 1970s, classifies insurers with four or more of eleven financial ratios outside of specified ranges as priority firms for immediate regulatory scrutiny. Thornton and Meador (1977) concluded that the IRIS tests were not reliable indicators for insolvency prediction. Hershbarger and Miller (1986) used MDA to examine the ability of the IRIS ratios to discriminate between sound, priority, and insolvent insurers. They concluded that the IRIS test includes a number of ratios that have very little ability to distinguish between solvent and insolvent companies. Ambrose and Seward (1988) incorporated Best's ratings into MDA through a system of dummy variables, and introduced a two-stage discriminant technique for improving predictive accuracy.
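The IRIS screening rule described above amounts to a simple count, which can be sketched as follows. This is only an illustration: the ratio names, values, and acceptable ranges are hypothetical placeholders, not the NAIC's actual ratios or ranges. Only the threshold of four or more out of eleven comes from the text.

```python
# Hypothetical acceptable (low, high) ranges for eleven ratios; placeholders only.
RANGES = {f"ratio_{k}": (0.0, 1.0) for k in range(1, 12)}

def count_outside(ratios, ranges):
    """Count the ratios that fall outside their acceptable (low, high) range."""
    return sum(1 for name, value in ratios.items()
               if not (ranges[name][0] <= value <= ranges[name][1]))

def is_priority(ratios, ranges, threshold=4):
    """Flag an insurer as a priority firm if `threshold` or more ratios are out of range."""
    return count_outside(ratios, ranges) >= threshold

# Made-up insurer: ratios 1-4 are out of range, the remaining seven are inside.
insurer = {f"ratio_{k}": (1.5 if k <= 4 else 0.5) for k in range(1, 12)}
print(count_outside(insurer, RANGES), is_priority(insurer, RANGES))
```

A count of this kind weights every ratio equally, which is one reason the studies cited above examined whether individual IRIS ratios actually discriminate between solvent and insolvent companies.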

Other models have been suggested to supplement or substitute for MDA. Subsequent studies on predicting financial distress used linear regression models to classify insurers. A few studies have used this methodology to classify and predict financial distress among banks and industrial corporations (see Meyer and Pifer, 1970; Collins, 1980). Amemiya (1981) discussed the relationship between the regression and MDA coefficients and indicated that the assumptions and interpretation of the coefficients are different. Applications of the regression models violate the Gauss-Markov assumptions. Eck (1982) presented a zero-one regression model that incorporated variables designed to detect dishonesty and to identify financially distressed property-liability insurers. Harrington and Nelson (1986) used regression analysis, in which the dependent variable was the premium-to-surplus ratio, for solvency surveillance in the property-liability industry, thus avoiding most problems associated with the zero-one regression analysis. Harrington and Nelson also provided an extensive rationale for their study. BarNiv (1990) used logit analysis to identify insolvencies in the property-liability insurance industry and emphasized the impact of alternative accounting practices as well as cash flows and market value of assets.

Other Studies

Hammond and Shilling (1978) used the average and the standard deviation of insurers' combined underwriting trade ratio to measure the solidity of insurers. This ranking method introduced various iso-ruin lines for measuring the solvency of an insurer but did not consider investment performance. BarNiv and Smith (1987) used a mean-variance ranking based on a one-year operating ratio that considered both underwriting and investment performance.

Bachman (1978) developed a model for determining the minimum capitalization requirements of an insurer. The model indicated that the minimum capital required to maintain solvency with a constrained ruin probability(3) varies among companies because of the risk associated with the underwriting profit margin of each insurer. Kahane (1978, 1979) included investment and underwriting performance in the mean-variance dimension. He demonstrated that the ruin constraint can be translated into practical criteria, such as the premium-to-surplus ratio. Venezian (1983) developed a model of risk and return for rate regulation, which is also useful for evaluating the effect of profit performance on insolvency. Kahane, Tapiero, and Jacque (1986) provided a comprehensive literature review and pointed out that "insolvency" has not been clearly defined and is used interchangeably with terms such as "bankruptcy," "illiquidity," and "ruin."

Shaked (1985) measured the probability of insolvency for life insurers by an approximation of the option pricing model. He indicated that life insurers are reasonably safe, but the distribution of insolvencies is skewed to the right. Gustavson and Lee (1986) used the capital asset pricing model to examine the risk and return for life insurers. BarNiv and Hershbarger (1990) identified variables useful for monitoring solvency in the life insurance industry and examined the applicability and efficiency of alternative multivariate models.

Limited Dependent Variable Models

This article investigates the relative efficiency and applicability of alternative multivariate models in the property-liability insurance industry. MDA and zero-one OLS regression models (also known as the linear probability model, or LPM) are fairly robust models and have been investigated extensively in the statistical and econometrics literature. The validity of the assumptions underlying these methods often has been ignored in previous studies.

It has been suggested that qualitative response models might reduce some limitations of MDA and zero-one OLS regression. Martin (1977) and Ohlson (1980) used a logit model to identify bankruptcies among commercial banks and industrial firms, respectively. Comprehensive reviews of many of these models are presented by Zavgren (1983), Altman (1984), and others. McFadden (1976) and Amemiya (1981) suggest that the logit model is more robust than MDA.(4) However, Lo (1986) found that MDA may be superior to a logit model if distributions are approximately normal.

The present study considers qualitative response models based on the exponential generalized beta distribution of the second kind (EGB2), which include logit and probit models as special cases. McDonald (1984) and Bookstaber and McDonald (1987) investigate the generalized beta distribution of the second kind (GB2) along with other distributions which are included as special cases.(5) The GB2 provides the basis for the EGB2 that is used in this article to generalize the probit, logit, and other models.

The limited dependent variable distributions are parametric by definition. Breiman et al. (1984) expanded a recursive partitioning algorithm (RPA), and Marais, Patell, and Wolfson (1984) and Frydman, Altman, and Kao (1985) employed the RPA for classification of commercial bank loans and bankruptcy prediction, respectively. The RPA is nonparametric, but it cannot be used for scoring observations within the same group. By contrast, multiple discriminant analysis, nonparametric discriminant models, and logit and probit models assign a score (probability) to each observation on a continuous scale. The development and use of limited dependent variable models for classification of solvent and distressed firms are summarized in Table 2.

TABULAR DATA OMITTED

Multiple Discriminant Analysis and the Linear Probability Model

Multiple discriminant analysis is based on the assumption that the vector of characteristics (X) is distributed as a multivariate normal with unequal means for each group ($\mu_0$, $\mu_1$) but with a common and known covariance matrix ($\Sigma$). The details of MDA are well known and will not be given here. The classification rule for the case of two groups is given by

|Mathematical Expression Omitted~.

The estimator of |Mathematical Expression Omitted~ in equation (1) is the coefficient vector used to obtain the scores for MDA. The assumptions of normality, symmetry, and equal covariance matrices are often violated, especially where financial data on binary variables are employed (see, e.g., Pinches and Trieschmann, 1977).

The linear probability model (LPM) or zero-one regression model is defined by

$Y_j = X_j b + \varepsilon_j$, (2)

where $Y_j$ denotes a zero or one for the jth firm, b is the unknown vector of coefficients, and $\varepsilon$ is the unobservable error term. It is well known that $\varepsilon$ in equation (2) is neither normally distributed nor homoskedastic. Consequently, least squares estimators are neither efficient nor normally distributed, t-tests and other diagnostic statistics such as $R^2$ are meaningless, and predicted probabilities obtained from equation (2) need not lie in the unit interval.
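The last point, that fitted values from a zero-one regression can fall outside the unit interval, can be seen in a small sketch. The data are made up for illustration, and a single regressor is assumed so that the least squares fit can be written in closed form.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of a one-regressor least squares fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Made-up data: one financial ratio x and a zero-one status indicator y.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

intercept, slope = ols_fit(x, y)
fitted = [intercept + slope * xi for xi in x]
# The smallest fitted value is negative and the largest exceeds one.
print(min(fitted), max(fitted))
```

Because the fitted line is unbounded, observations with extreme values of the regressor receive "probabilities" below zero or above one, which is the difficulty the qualitative response models avoid by passing the score through a distribution function.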

Nonparametric Discriminant Model

The nonparametric discriminant model (NPDM), which uses a new separation rule, was developed by BarNiv and Raveh (1986, 1989) and was used by BarNiv and Hershbarger (1990) to classify financial distress in the life insurance industry. The authors suggest choosing the coefficients so that the scores $Z_{1i}$ given to group 1 will be greater than (or less than) the scores $Z_{0j}$ of group 0. The method searches for an optimal linear combination that yields minimum overlap between the two groups of scores. The zone of overlap between the two groups of scores obtained by the NPDM is always smaller than or equal to the overlap obtained by the MDA or other models such as the logit or probit. The measure to be maximized is based on the inequalities

$Z_{1i} \ge Z_{0j}$, (3)

where $Z_{1i}$ are the scores of group 1, and $Z_{0j}$ are the scores of group 0; $i = 1, \ldots, n_1$, and $j = 1, \ldots, n_2$. The following index of separation is obtained:

|Mathematical Expression Omitted~,

where |Mathematical Expression Omitted~, and |Mathematical Expression Omitted~, b is the vector of coefficients obtained by maximizing equation (4), and where |Mathematical Expression Omitted~ and |Mathematical Expression Omitted~ are mean scores of group 0 and group 1, respectively. The condition $-1 \le IS(b) \le 1$ is always maintained. IS(b) = 1 implies no overlapping of scores of the two groups ($Z_i < Z_j$); the two populations are degenerate (i.e., there is no overlapping between the two distributions of scores and there is maximum separation). Another extreme case is IS(b) = 0, which occurs when both mean scores are equal, $\bar{Z}_1 = \bar{Z}_0$. The maximization of IS(b) is carried out by the Zangwill (1967/1968) algorithm, which requires an initial guess of the weights and is restricted to local maxima. Other initial guesses of the weight vector b might also be the uniform vector b = (1,..., 1) or might be based on the data properties. A cutoff point, cp, may be chosen so that the total number of misclassifications is minimized. The classification rule is: Assign an observation to group 1 if $x'b \ge cp$; otherwise assign it to group 0.

A similar classification rule can be used with the MDA, logit, or probit scores (probabilities), but for an estimation sample the number of misclassifications with the NPDM will always be less than or equal to the number obtained by the MDA or logit.(6)
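The choice of a cutoff point that minimizes the total number of misclassifications can be sketched as a search over the observed scores. The sketch assumes the scores have already been computed by some model; the function name and the data are illustrative, and the maximization of the index of separation itself is not reproduced here.

```python
def best_cutoff(scores_1, scores_0):
    """Return (cp, errors) minimizing total misclassifications under the rule
    'assign to group 1 if score >= cp, otherwise to group 0'."""
    candidates = sorted(set(scores_1) | set(scores_0))
    # A cutoff above every score assigns all observations to group 0.
    candidates.append(max(candidates) + 1.0)
    best = None
    for cp in candidates:
        errors = (sum(1 for s in scores_1 if s < cp)      # group 1 sent to group 0
                  + sum(1 for s in scores_0 if s >= cp))  # group 0 sent to group 1
        if best is None or errors < best[1]:
            best = (cp, errors)
    return best

# Made-up scores for the two groups, with some overlap.
scores_1 = [2.1, 3.4, 1.2, 4.0, 2.8]
scores_0 = [0.3, 1.5, 0.9, 2.3, 0.1]
cp, errors = best_cutoff(scores_1, scores_0)
print(cp, errors)
```

Only the observed scores need to be tried as cutoffs, because the number of misclassifications changes only when the cutoff passes an observed score. When several cutoffs tie, this sketch keeps the smallest one.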

Qualitative Response Models

In qualitative response models such as the logit or probit, the distribution of the dependent variable or score (z) is conditional on the vector of explanatory variables x; in the logit model, z|x is logistic. In contrast, MDA assumes that the distribution of x is conditional on z, and x|z is normal. Lo (1986) developed the relation between logit and MDA and proved that the normality assumption required for MDA ensures that the conditional distribution z|x is logistic. However, the converse is not true, and therefore the logit model is a more robust theoretical procedure.

Qualitative response models take the following form:

|Mathematical Expression Omitted~

where F is the cumulative distribution function (CDF), and f is the density function. The probit and logit models, the most widely used qualitative response models, are based, respectively, on the normal and logistic density functions.
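As a small illustration of the two distribution functions named above, the sketch below evaluates the standard logistic and standard normal CDFs at a few scores. It assumes the standardized textbook forms of both distributions; the function names are illustrative.

```python
import math

def logistic_cdf(z):
    """Standard logistic CDF, the distribution underlying the logit model."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """Standard normal CDF, the distribution underlying the probit model."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for z in (-2.0, 0.0, 2.0):
    print(z, round(logistic_cdf(z), 4), round(normal_cdf(z), 4))
```

Both functions map any real score into the unit interval and satisfy F(-z) = 1 - F(z), so each implies a response curve that is symmetric about the origin.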

The logit and probit models can be generalized by selecting f(z) to be the exponentiated generalized beta of the second kind (EGB2), defined by

|Mathematical Expression Omitted~

for $-\infty < z < \infty$, where a, p, and q are positive shape parameters, d is a positive scale parameter, and B(p,q) denotes the beta function. The density of the EGB2 is symmetric about the origin if p = q and d = 1. If we denote the cumulative distribution function (CDF) by |Mathematical Expression Omitted~, then the logit model can be expressed as

|Mathematical Expression Omitted~.

The relationship between the probit model and equation (6) involves the following limit (see also McDonald, 1984, for preliminary discussion):

|Mathematical Expression Omitted~

The probit and logit models are symmetric about the origin. Two special cases of the EGB2 that allow, but do not impose, symmetry will be considered. These will be referred to as the Burrit and Lomit models, because they can be presented as being based on the Burr and Lomax distributions, just as the logit model is based on the logistic distribution.(7) Before defining these models, note from equation (9) that, except for limiting cases as in the probit model, we can, without loss of generality, let a and d equal one.

The Lomit qualitative response model is defined as

|Mathematical Expression Omitted~,

and the Burrit model is defined by

|Mathematical Expression Omitted~.

Equations (9) and (10) can be shown to follow from equation (6). Note that equations (9) and (10) both encompass the logit model. Thus, the EGB2 includes the probit, logit, Burrit, and Lomit as special cases. Maximum likelihood estimations are used to obtain the vector of coefficients (b).

The relative log-likelihood values for the EGB2 and the Lomit or Burrit models are expected to be better than, or equal to, the log-likelihood values obtained for the logit or probit model. However, the predictive ability of the various models cannot be determined a priori. It is assumed implicitly that the scores |z.sub.j~ are symmetrically distributed for the logit and probit models. However, the Lomit and Burrit models do not assume that the |z.sub.j~ scores are symmetrically distributed.

Data and Statistical Methodology

Over 40 percent of the property-liability insurers (about 1,130 companies) retired during the period 1961 through 1988. Of these, 531 companies merged or were dissolved into other insurers; most of these companies were acquired by other insurers or groups of insurers, but a few also merged into their parent insurance companies. Not all mergers or acquisitions of insurers, which cause the companies to disappear, should be regarded as distress or insolvency. Property-liability insurer retirements for this period are summarized in Table 3.

Table 3
Property-Liability Company Retirement, 1961-1988

Years          Insolvencies   Voluntary Retirements   Mergers   Total
1961-1970      148            142                     305       595
1971-1980      90             60                      149       299
1981-1988(a)   112            45                      77        234
Total          350            247                     531       1128

Source: Best's Review (1961-1990).
(a) Data for retired insurers between January 1, 1981, and March 31, 1988.

Sample Selection

The population of property-liability insurers ranged from 2,700 to over 3,000 during the period from 1974 through 1988. Of these, approximately 1,950 insurers reported data to A. M. Best in 1989. This rating company provides data for property-liability insurers that meet minimum premium and total assets volume requirements. Because none of the insolvent insurers reported total assets exceeding $750 million, the population of solvent insurers was also selected from insurers of the same size. Groups of companies with total consolidated assets over $1 billion were also excluded.

Insolvent property-liability insurers in the sample were selected from Best's Insurance Reports: Property-Casualty and Best's Key Rating Guide. An additional source of data for a few insurers was financial statements obtained from an insurance department. Of these insurers, 182 became insolvent between April 1, 1974, and March 31, 1988. Of those 182 insurers, data are available for 155, 14 of which were dropped because of incomplete or missing data. This left an insolvent insurer sample of 141 firms that had at least four years of data one year before insolvency. Of these firms, 138 had at least four years of data two years prior to becoming insolvent, and 126 had at least four years of data three years prior to insolvency. The insolvent firms were matched with 160 solvent companies. Matching was based on approximate size and time for which data were available. Seven solvent insurers were dropped because of missing data. The final sample comprises 294 property-liability insurers--153 solvent and 141 insolvent--one year prior to insolvency.(8)

The solvent and insolvent samples were each split into an estimation sample and prediction (holdout) sample; a time-series (inter-temporal) holdout sample was selected. The models were estimated by using companies from 1974 through 1983 for the estimation sample and different firms from 1984 through 1988 as an independent holdout sample. This partition ensured adequate estimation and holdout sample sizes (see Table 4 for a summary of the number of insurers in each sample). We used a time-series holdout sample instead of a random cross-sectional sample for three reasons. First, the random cross-sectional holdout sample is more likely to exaggerate the model's predictive ability; a time-series holdout is more likely to indicate poor predictive ability, because different time periods and, as a result, different costs are involved. Second, the time-series holdout sample is independent of the estimation sample, and spurious correlations between decisions of regulators or owners and independent variables are eliminated. Third, time-series holdout samples are relevant if regulators want to know whether models estimated on historical data can be used to make predictions.(9)
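The time-series partition described above can be sketched as a split on the base year. The records and field names below are hypothetical; only the cut between 1983 and 1984 comes from the text.

```python
def split_by_period(firms, last_estimation_year=1983):
    """Split firm records into estimation and holdout samples by base year."""
    estimation = [f for f in firms if f["year"] <= last_estimation_year]
    holdout = [f for f in firms if f["year"] > last_estimation_year]
    return estimation, holdout

# Made-up records; 'insolvent' is 1 for a failed insurer and 0 otherwise.
firms = [
    {"id": "A", "year": 1976, "insolvent": 1},
    {"id": "B", "year": 1983, "insolvent": 0},
    {"id": "C", "year": 1984, "insolvent": 1},
    {"id": "D", "year": 1987, "insolvent": 0},
]
estimation, holdout = split_by_period(firms)
print([f["id"] for f in estimation], [f["id"] for f in holdout])
```

Because every holdout firm comes from a later period than every estimation firm, the holdout results mimic the situation of a regulator applying a model fitted on historical data to later years.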

Table 4
Partition of Insurers by Samples

Sample       Period      Solvent   Insolvent   Total
Estimation   1974-1983   83        76          159
Holdout      1984-1988   70        65          135
Total        1974-1988   153       141         294

Variable Selection

Previous studies on predicting financial distress in the property-liability insurance industry used stepwise or similar techniques to select the financial variables shown to be useful in predicting insolvency. Trieschmann and Pinches (1973) used a six-variable MDA model. Harmelink (1974) and Eck (1982) used two different seven-variable models. Ambrose and Seward (1988) employed different MDA models with between two and six variables. Only three of those variables coincided in two or more studies. Harrington and Nelson (1986) used a seven-variable model for the regression analysis, only one ratio of which coincided with previous studies. BarNiv and Raveh (1986) presented models with the three most important variables, which were obtained by forward stepwise analysis for one and three years prior to insolvency. These variables measure the variability and stability of balance sheet items, as well as the mean/standard deviation of the profit margin, over time. BarNiv and Smith (1987) used a similar mean/variance ranking based on the overall operating ratio. Here, the variable selection process focuses on various aspects of insurer underwriting and investment operations, as well as other measures of performance outlined by the previous studies. The analyses use 45 financial variables, most of which were included in previous studies.

The variable selection process is influenced by the previous studies. First, we investigate the classification power of the 45 variables. Second, the significant variables obtained by forward stepwise analyses are selected for one, two, and three years prior to insolvency. Lag length was found to have a significant impact on predictive accuracy. Predictive ability deteriorates as one moves away from the insolvency date. For example, the models show a lower percentage of correct classifications three years prior to insolvency, compared with correct classification one and two years prior to insolvency. The seven variables selected for this study were identified by a forward stepwise procedure as significant for one, two, and three years prior to insolvency. Third, several combinations of other subsets of variables postulated to disclose information on insolvency are also considered. However, the seven-variable model statistically dominates these other combinations in terms of the highest percentage of correct classification and/or lower expected cost of misclassification.

The following seven significant variables are included in the model:

$X_{10}$ Net income/total assets. This overall operating ratio comprises underwriting profits, net investment income, and other investment gains (or losses) in the numerator.

$X_{20}$ Surplus (equity).

$X_{29}$ Net income/surplus. The numerator is identical to net income in ratio $X_{10}$.

$X_{35}$ Mean/standard deviation of an overall operating ratio for a nine-year period. This overall operating ratio is identified as 1 - CTR + (NII + OIG)/NPW, where CTR = combined trade ratio (the loss ratios plus the expense ratio), NII = net investment income, OIG = other investment gains (or losses), and NPW = net premiums written.

$X_{37}$ Mean/standard deviation of another overall operating ratio (essentially similar to the ratio $X_{29}$) for a nine-year period. The ratio is 1 - [(UE + LE - NII - OIG)/surplus], where UE = underwriting expenses, and LE = loss expenses. This variable differs from $X_{35}$ in both the numerator and the denominator.(10)

$X_{42}$ Liability decomposition defined as |Mathematical Expression Omitted~, where i indexes the types of liabilities (including surplus, loss reserves, unearned premium reserves, and all other liabilities in one item), i = 1,..., k; and in this study k = 4. $Q_i$ is the relative share (proportion) of liability i to total balance sheet for the year of data, and $p_i$ is the share (proportion) of liability i to total balance sheet for a previous year (one year in this study), and $0 \le Q_i, p_i \le 1$.

$X_{43}$ Liability decomposition measure, which uses the absolute value of $\ln(Q_i/p_i)$, defined as |Mathematical Expression Omitted~.

These variables are hypothesized to be associated with insurers that are likely to become insolvent. The variables $X_{35}$ and $X_{37}$ measure profitability versus earning stability (see BarNiv and Smith, 1987). A high mean over time is a sign of high profitability; a high standard deviation is a sign of instability. Both a low mean and high standard deviation are signs of financial distress. The lower the value of these variables, the greater the probability of insolvency. The standard deviation and standard error of financial ratios over periods of about ten years have also been used in a few previous studies (Altman, Haldeman, and Narayanan, 1977; Dambolena and Khoury, 1980). Lev (1971, 1974), Booth (1983), BarNiv and Raveh (1986), and BarNiv and Hershbarger (1990) have provided evidence that the decomposition measures are useful in insolvency prediction models and have also demonstrated remarkable results in classifying failed and nonfailed firms. The hypothesis implies that large decomposition measures are a sign of financial distress. The variables $X_{10}$ and $X_{29}$ are overall profitability ratios indicating management efficiency. A low ratio is a sign of financial distress. We expect large companies to be less susceptible to financial distress so that the likelihood of insolvency decreases with size of surplus ($X_{20}$).
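The mean/standard deviation ranking behind the variable built on the ratio 1 - CTR + (NII + OIG)/NPW can be illustrated with a short sketch. The nine-year histories are made up, and the use of the sample standard deviation (n - 1 divisor) is an assumption, since the text does not state which divisor is used.

```python
import statistics

def operating_ratio(ctr, nii, oig, npw):
    """Overall operating ratio: 1 - CTR + (NII + OIG) / NPW."""
    return 1.0 - ctr + (nii + oig) / npw

def mean_over_std(ratios):
    """Mean divided by the sample standard deviation (n - 1 divisor, an assumption)."""
    return statistics.mean(ratios) / statistics.stdev(ratios)

# Made-up nine-year histories of the operating ratio for two insurers.
stable = [0.05, 0.06, 0.04, 0.05, 0.07, 0.05, 0.06, 0.04, 0.05]
volatile = [0.20, -0.15, 0.10, -0.25, 0.30, -0.10, 0.05, -0.20, 0.15]
print(mean_over_std(stable), mean_over_std(volatile))
```

The insurer with steady results receives a much higher value than the one with a low mean and large swings, which matches the hypothesis that low values of this measure signal distress.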

Estimation Models and Methodological Issues

Different procedures to classify and predict insolvencies are used in this study. The profile of the univariate variables is tested empirically. Univariate and multivariate tests for normality are then conducted. Other assumptions of the MDA and LPM are also tested. Multivariate models also applicable to the property-liability industry are used. Since the LPM provides results similar to those of the MDA, its results are not presented. Univariate ranking methods are compared with the multivariate models. Before using the multivariate models, the assumptions of various models are examined. The empirical results in the following sections are analyzed for various prior probabilities and misclassification costs. The population prior probabilities are $\pi_0$ and $\pi_1$ for the solvent and insolvent firms, respectively; $\pi_0$ and $\pi_1$ are known. The relative cost of misclassifying an insolvent firm as a solvent one (type 1 error) is denoted by $c_1$, and the cost of misclassifying a solvent insurer as an insolvent one (type 2 error) is denoted by $c_0$. It is expected that $c_1 \ge c_0$. The classification rule for MDA is to assign a firm with a profile vector x' to group 1 if

|Mathematical Expression Omitted~ (11)

Thus, the adjustment in the classification rule is made by incorporating the population prior probability rates and the misclassification cost. Therefore, only the constant term is affected in the MDA function. The same rule may be employed for the NPDM. BarNiv and Raveh (1989) developed a generalization for equation (4) that takes into consideration misclassification costs and prior probabilities.

A cutoff score may also be selected to minimize the expected cost of misclassification (ECM), also termed the "resubstitution risk" by Breiman et al. (1984). The expected cost of misclassification is defined by

$ECM = \pi_0 c_0 P(I \mid S) + \pi_1 c_1 P(S \mid I)$, (12)

where $P(S \mid I)$ is the conditional probability P(predicted solvent | the firm is insolvent), and $P(I \mid S)$ is the conditional probability P(predicted insolvent | the firm is solvent).

For a sample size of N observations, the ECM is approximated as follows:

$ECM = \pi_0 c_0 n_0/N_0 + \pi_1 c_1 n_1/N_1$, (13)

where $n_i$ is the total number of type i misclassifications, $N_i$ is the sample size of the ith group, $N = N_0 + N_1$, and i = 0, 1 (the solvent and insolvent groups, respectively).
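Equation (13) is a direct computation, sketched below. The error counts are made up for illustration; the group sizes match the estimation sample (83 solvent, 76 insolvent), and the priors and cost are example values.

```python
def expected_cost(pi0, c0, n0, N0, pi1, c1, n1, N1):
    """Sample approximation of the expected cost of misclassification, equation (13).

    n0: solvent firms classified as insolvent, out of N0 solvent firms.
    n1: insolvent firms classified as solvent, out of N1 insolvent firms.
    """
    return pi0 * c0 * n0 / N0 + pi1 * c1 * n1 / N1

# Made-up error counts; priors of 0.99 and 0.01 and a type 1 cost of 20 are examples.
ecm = expected_cost(pi0=0.99, c0=1, n0=8, N0=83, pi1=0.01, c1=20, n1=10, N1=76)
print(round(ecm, 4))
```

Weighting each group's error rate by its population prior rather than by its sample share is what corrects for the oversampling of insolvent firms in a matched sample.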

Cutoff scores are selected to minimize the expected cost of misclassification for each $c_1$ in the sample. They are applicable for MDA, NPDM, LPM, logit, probit, Lomit, etc. Also, as $\pi_1 c_1 = \pi_0 c_0$, the expected cost of misclassification is approximated for the multiple discriminant analysis and the nonparametric discriminant model by their classification rules. For the qualitative response models the conditional probabilities are as follows:

|Mathematical Expression Omitted~ (14)

and

|Mathematical Expression Omitted~

The prior probabilities of the solvent and insolvent groups are first taken as $\pi_1 = 0.01$ and $\pi_0 = 0.99$, and then as $\pi_1 = 0.02$ and $\pi_0 = 0.98$. The misclassification costs for type 2 errors are fixed at $c_0 = 1$, while misclassification costs for type 1 errors ranged from 1 to 100. Cutoff scores relevant to the research design are used. All the models are corrected for prior probabilities, misclassification costs, and the effect of choice-based samples for model estimation.(11)
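How the cost-minimizing cutoff moves with the type 1 cost can be sketched with a simple search. The scores are made up, the convention that a higher score means a higher likelihood of insolvency is an assumption, and the search over observed scores is an illustrative choice rather than the procedure used in the study.

```python
def ecm_at_cutoff(cp, scores_solvent, scores_insolvent, pi0, pi1, c0, c1):
    """ECM of the rule 'classify as insolvent if score >= cp' (sample form of the ECM)."""
    n0 = sum(1 for s in scores_solvent if s >= cp)    # solvent flagged as insolvent
    n1 = sum(1 for s in scores_insolvent if s < cp)   # insolvent passed as solvent
    return (pi0 * c0 * n0 / len(scores_solvent)
            + pi1 * c1 * n1 / len(scores_insolvent))

def min_ecm_cutoff(scores_solvent, scores_insolvent, pi0, pi1, c0, c1):
    """Search the observed scores (plus one value above them) for the lowest-ECM cutoff."""
    grid = sorted(set(scores_solvent) | set(scores_insolvent))
    grid.append(max(grid) + 1.0)
    return min(grid, key=lambda cp: ecm_at_cutoff(
        cp, scores_solvent, scores_insolvent, pi0, pi1, c0, c1))

# Made-up scores; higher means more likely insolvent (an assumed orientation).
solvent = [0.05, 0.10, 0.20, 0.30, 0.60]
insolvent = [0.25, 0.55, 0.70, 0.80, 0.90]
for c1 in (1, 100):
    print(c1, min_ecm_cutoff(solvent, insolvent, 0.99, 0.01, 1, c1))
```

With a prior of 0.01 for insolvency, a low type 1 cost pushes the cutoff up so that few solvent firms are flagged, while a type 1 cost of 100 pulls it down so that no insolvent firm in this toy sample is missed.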

Empirical Results

Beaver (1966), Ohlson (1980), Zavgren (1985), and others found that the accuracy of classification results increased as the firms moved toward the year prior to insolvency. Thus, the firm's financial variables deteriorated from three to two years prior to insolvency, and further deterioration occurred as the firms reported the last financial statement one year before insolvency. The following analyses show the results for the three different base years.

Independent Variables: Univariate Analysis

Table 5 presents the summary statistics of the seven independent variables and the premium-to-surplus ratio. The statistics indicate significant differences between the solvent and insolvent samples for all seven variables one, two, and three years prior to insolvency. Insignificant differences between the solvent and insolvent samples are also indicated for a few other variables, which were used in previous studies; these results are not reported here. The premium-to-surplus ratio included in most previous studies has generally no significant ability to classify solvent and distressed property-liability insurers in a univariate analysis. Contrary to common belief in the industry, this traditional measure of capacity and underwriting risk fails to classify solvent and insolvent property-liability insurers. Distressed firms had about the same median ratio as solvent firms three and two years prior to insolvency; the difference in means between solvent and insolvent firms is only slightly significant. The empirical findings present insignificant zones of overlap for the mean/standard deviation ranking variable $X_{35}$ and the decomposition measures $X_{42}$ and $X_{43}$; these results are consistent with earlier findings of BarNiv and Raveh (1986) and BarNiv and Smith (1987). Apparently, these measures of profitability and stability over time are effective variables for detecting financial distress in the property-liability insurance industry. Univariate classification results reveal that most of the seven variables appear to distinguish between solvent and insolvent firms even three years before failure. For example, the mean/standard deviation ranking variable $X_{35}$ correctly classifies 82 percent of property-liability insurers one year before insolvency, while the premium-to-surplus ratio correctly classifies only 53 percent of insurers. Additional results are available upon request from the authors.

Multivariate Models: Assumptions and Estimated Coefficients

In order to examine the reliability of the various multivariate models, the univariate and multivariate distributions of the seven-variable vector and additional variables were examined. The equality of the variance/covariance matrices was also examined. The univariate coefficients of skewness and kurtosis and the related z statistics for the entire estimation sample for both solvent and insolvent companies were tested for the three different base years. Several variables were skewed to the right or to the left and were more peaked, with heavier tails, than a normal distribution (see Pinches and Trieschmann, 1977, for specifications of the z test).

Several statistics of the Kolmogorov-Smirnov test of normal approximation for the estimation sample were significant, an indication that variables were not univariate normal and could not be assumed multivariate normal. F-tests of the variance/covariance matrices of the seven-variable models for one, two, and three years before insolvency were significant. The matrices are therefore unequal, indicating that linear multiple discriminant analysis should not be used. In summary, the main assumptions of MDA are violated and, although MDA is considered fairly robust for prediction purposes, other multivariate analyses should also be employed.
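The univariate skewness and kurtosis checks described above can be illustrated with a minimal sketch. It uses the common large-sample standard errors sqrt(6/n) and sqrt(24/n), which is an assumption here and may differ from the exact specification in Pinches and Trieschmann (1977); the function name and the sample data are hypothetical.

```python
import math

def skew_kurtosis_z(values):
    """Return (skewness, excess kurtosis, z_skew, z_kurt) for one variable.

    Assumption: large-sample standard errors sqrt(6/n) and sqrt(24/n);
    this is an illustrative approximation, not the article's exact test.
    """
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3, and 4
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0  # zero for a normal distribution
    z_skew = skew / math.sqrt(6.0 / n)
    z_kurt = excess_kurt / math.sqrt(24.0 / n)
    return skew, excess_kurt, z_skew, z_kurt

# Hypothetical ratio values for one financial variable
skew, kurt, z_s, z_k = skew_kurtosis_z([1.0, 1.0, 1.0, 1.0, 10.0])
print(round(skew, 3), round(z_s, 3))
```

Large absolute z values would point to departures from normality of the kind reported for several of the variables.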

We compared our classification results with those of previous studies for various MDA functions. Classification results are substantially reduced when our large data base is employed for functions used in previous studies. For example, Ambrose and Seward's (1988) four-variable MDA function correctly classified 86 percent of the 29 solvent and 29 insolvent firms. Only 74 percent of the 159 solvent and insolvent firms used in the estimation sample of this study one year before insolvency were correctly classified, and this is reduced to 58 percent of the firms three years prior to insolvency. The hypothesis that our data base and the 58 insurers used by Ambrose and Seward have the same predictive ability is rejected; approximate $\chi^2$ tests for differences in predictive ability (see Conover, 1971) are highly significant, and the null hypotheses are rejected for the three base years. Similar empirical results are obtained for comparisons with other studies (with the exception of Harmelink, 1974). In addition to the significant reduction in classification ability, the coefficients of the functions are changed; for a few variables, even the signs of the coefficients are changed.

Although the differences in performance of the functions for the different data bases may be due to the time period analyzed, the evidence indicates that some time-series stability exists. Because this study and Ambrose and Seward used essentially the same time period, it appears that the differences in performance are the result of the small samples used in previous studies and the selection of variables. Table 6 also shows the classification ability of the stepwise MDA functions. The results indicate that the selection of variables from the large data base used here and the elimination of the small sample problem significantly improve the classification.

TABULAR DATA OMITTED

TABULAR DATA OMITTED

Table 7 shows the estimated coefficients and their relative contribution across the seven-variable models based on our data. Significance tests on individual coefficients are not available with MDA or NPDM, but the relative contribution can be approximated by standardized adjusted coefficients. Significance tests on coefficients are available with the qualitative response models. An analysis of t-statistics suggests that $X_{43}$ is the most highly significant variable, followed by $X_{37}$ and $X_{20}$. The expected direction and impact of the variables on the probability of insolvency are also indicated in the table. A positive coefficient indicates that the larger the variable, the greater the expected probability of insolvency; a negative coefficient indicates that the larger the variable, the smaller the expected probability of insolvency. The estimated coefficients for the absolute values of liabilities decomposition measures ($X_{43}$) are significantly positive in the expected direction for one and three years prior to insolvency. Estimates for surplus size ($X_{10}$) are also significant in the expected direction for one and three years prior to insolvency. The mean-standard deviation ratio for overall profitability ($X_{37}$) is significant in the negative direction, but only for one year prior to insolvency. The separation indices IS(b) improve slightly when the NPDM models are applied. The eigenvalues and log-likelihood values are highly significant; they decline substantially across the base years but are still highly significant three years before insolvency. We also used $\chi^2$ tests of significance to test the differences in log-likelihood values. The EGB2, Lomit, and Burrit models had significantly more explanatory power than the logit or probit models for the year prior to insolvency. However, the differences in the log-likelihood values were statistically insignificant (at p < .05) two and three years before insolvency.(12)
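A $\chi^2$ comparison of log-likelihood values for nested models can be sketched as a likelihood ratio test. The sketch below is illustrative only: the log-likelihood values and the number of extra parameters are hypothetical and are not taken from Table 7, and the helper names are assumptions. The upper-tail probability uses the standard closed-form series for integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2-1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= x / (2 * r)
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: two-sided normal tail plus a finite correction series
    chi = math.sqrt(x)
    tail = math.erfc(chi / math.sqrt(2.0))
    density = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return tail + 2.0 * density * total

def likelihood_ratio_test(loglik_restricted, loglik_general, extra_params):
    """Return (LR statistic, p-value) for a restricted model nested in a general one."""
    stat = 2.0 * (loglik_general - loglik_restricted)
    return stat, chi2_sf(stat, extra_params)

# Hypothetical values: a restricted model at -60.0 and a general model at -55.0
# with two additional parameters (an assumption made for illustration).
stat, p_value = likelihood_ratio_test(-60.0, -55.0, 2)
print(stat, round(p_value, 4))
```

As footnote 12 cautions, the nominal p-value from such a test may be too conservative when the restricted model lies on the boundary of the parameter space.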

Classification and Prediction

The classification analysis is confined to the seven-variable function. Several results for 12-variable to 18-variable functions are reported for the MDA at the bottom of Table 6. However, a substantial computational burden is involved in calculating the NPDM and qualitative response models with so many independent variables. In any event, the classification accuracy for the seven-variable models does not differ significantly from classifications obtained by the 12- to 18-variable functions.

TABULAR DATA OMITTED

TABULAR DATA OMITTED

Table 8 reports the classification ability for the estimation sample and the predictive ability of the holdout sample. The cutoff point (threshold value) selected for the analysis has a major effect on the empirical results. The midpoint of the Z scale between solvent and insolvent groups implies that $\pi_0 c_0 = \pi_1 c_1$. Classifications vary across the models and base years (years prior to insolvency). For the estimation sample, MDA correctly classifies 92, 89, and 84 percent of the firms one, two, and three years, respectively, prior to insolvency. The NPDM correctly classifies a few more firms for the three base years, but the improvement is insignificant. A cutoff point of 0.5 is used for the qualitative response models, which correctly classify similar percentages for the three base years. The logit function slightly outperforms the other models for the estimation sample one year prior to insolvency; the Lomit and Burrit models outperform the other models two years before insolvency; and the NPDM slightly outperforms other models three years prior to insolvency. The differences, however, are insignificant; no model consistently outperforms all of the other models.
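The midpoint rule can be illustrated with a short sketch. It assumes that the midpoint is taken between the two group mean scores and that higher scores indicate insolvency; both are assumptions made for illustration, and the scores below are hypothetical.

```python
def midpoint_cutoff(solvent_scores, insolvent_scores):
    """Midpoint between the two group mean scores (assumed definition)."""
    mean_solvent = sum(solvent_scores) / len(solvent_scores)
    mean_insolvent = sum(insolvent_scores) / len(insolvent_scores)
    return (mean_solvent + mean_insolvent) / 2.0

def classification_accuracy(solvent_scores, insolvent_scores, cutoff):
    """Share of firms classified correctly; score >= cutoff means 'insolvent'."""
    correct = sum(1 for s in solvent_scores if s < cutoff)
    correct += sum(1 for s in insolvent_scores if s >= cutoff)
    return correct / (len(solvent_scores) + len(insolvent_scores))

# Hypothetical discriminant scores
solvent = [-2.0, -1.0, 1.0]
insolvent = [0.0, 2.0, 3.0]
cutoff = midpoint_cutoff(solvent, insolvent)
print(cutoff, classification_accuracy(solvent, insolvent, cutoff))
```

The same accuracy function applies to the qualitative response models by passing estimated probabilities as scores and 0.5 as the cutoff.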

The predictive abilities of the models are estimated by the use of the holdout sample. Firms are classified with cutoff scores used in the estimation sample. The different models yield similar results for all years prior to insolvency. The effect of the base year on the classification accuracy is even more substantial.

We applied another criterion that minimized the number of misclassifications. Both the NPDM and MDA provide similar results two years prior to insolvency, but the NPDM slightly outperforms MDA one and three years prior to insolvency.(13) Other initial guesses for the NPDM are possible and more efficient coefficients might be produced, which will increase the classification and prediction ability of the NPDM. However, in this study the MDA coefficients are used as the initial guesses for the NPDM. The qualitative response models provide similar results and are available upon request. In conclusion, the differences between the classification abilities of the various models are generally insignificant. The base year, however, has a substantial effect.
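The criterion of minimizing the number of misclassifications can be sketched as a simple search over candidate cutoffs. This is an illustrative sketch, not the procedure used in the study: it assumes a single score per firm, with higher scores indicating insolvency, and the data are hypothetical.

```python
def min_error_cutoff(scores, labels):
    """Return (cutoff, errors) minimizing the number of misclassified firms.

    labels: 1 = insolvent, 0 = solvent. A firm is classified as insolvent
    when its score is at or above the cutoff (assumed orientation).
    """
    distinct = sorted(set(scores))
    # Candidate cutoffs: below all scores, between adjacent scores, above all scores
    candidates = [distinct[0] - 1.0]
    candidates += [(a + b) / 2.0 for a, b in zip(distinct, distinct[1:])]
    candidates.append(distinct[-1] + 1.0)
    best_cutoff, best_errors = None, None
    for cutoff in candidates:
        errors = sum(1 for s, y in zip(scores, labels) if (s >= cutoff) != (y == 1))
        if best_errors is None or errors < best_errors:
            best_cutoff, best_errors = cutoff, errors
    return best_cutoff, best_errors

# Hypothetical scores for three solvent and three insolvent firms
scores = [0.1, 0.2, 0.6, 0.4, 0.8, 0.9]
labels = [0, 0, 0, 1, 1, 1]
print(min_error_cutoff(scores, labels))
```

When several cutoffs tie, the sketch keeps the lowest one; a different tie-breaking rule would be equally defensible.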

Choice-Based Samples and Resubstitution Risks

The empirical results are corrected for the effect of choice-based samples for model estimation. The adjusted probabilities (see the Appendix) are estimated for the population. The t-test and the two-sample Mann-Whitney U (Wilcoxon rank-sum) approximate z scores are used to test for differences in the means (medians) of the probability values for the solvent and insolvent firms. All t or z values are highly significant, indicating the ability of the various models to discriminate between solvent and insolvent firms. Both t and z values are somewhat higher for the EGB2, Lomit, and Burrit models, especially one year prior to insolvency.
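The approximate z score of the Mann-Whitney U test can be sketched as follows. The sketch uses the usual normal approximation with average ranks for ties and no tie correction to the variance, which is an assumption; the probability values are hypothetical.

```python
import math

def mann_whitney_z(sample_a, sample_b):
    """Return (U, z) for sample_a versus sample_b using the normal approximation."""
    n1, n2 = len(sample_a), len(sample_b)
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average rank
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_a - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)  # no tie correction (assumption)
    return u, (u - mean_u) / sd_u

# Hypothetical adjusted insolvency probabilities
insolvent_probs = [0.7, 0.8, 0.9]
solvent_probs = [0.1, 0.2, 0.3]
print(mann_whitney_z(insolvent_probs, solvent_probs))
```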

The resubstitution risks (ECMs) of the models for the estimation sample and the holdout sample are computed one, two, and three years prior to insolvency. Cutoff scores are selected to minimize ECMs in the estimation sample. The qualitative response models generate similar results in the estimation sample, but the risks of the MDA are significantly higher. Nonparametric test statistics indicate several significant differences in resubstitution risks among the models. The ECMs with logit or probit models are significantly higher (more resubstitution risk) than those with the Lomit model for both estimation and holdout samples one year prior to insolvency. The ECMs with the Lomit model are significantly lower than those with logit and probit models for three years prior to insolvency, but only in the estimation sample. The Lomit, Burrit, and EGB2 models yield similar results for both samples. Two years prior to insolvency, the differences in ECMs among all models are insignificant (at p < .05) for both samples. In conclusion, the Lomit model yields significantly lower ECMs than the logit and probit models one and three years before a firm becomes insolvent. The MDA provides significantly higher ECMs. Other differences in resubstitution risks among the models are insignificant.
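The selection of a cutoff that minimizes the expected cost of misclassification can be sketched as below. The sketch assumes the conventional form ECM = pi_1 c_1 P(classified solvent | insolvent) + pi_0 c_0 P(classified insolvent | solvent), with group 1 taken to be the insolvent firms; that labeling, the prior, the costs, and the scores are all hypothetical.

```python
def expected_cost(scores, labels, cutoff, prior_insolvent, cost_miss, cost_false_alarm):
    """ECM for one cutoff; score >= cutoff means 'classified insolvent'."""
    insolvent = [s for s, y in zip(scores, labels) if y == 1]
    solvent = [s for s, y in zip(scores, labels) if y == 0]
    # Insolvent firms classified as solvent
    miss_rate = sum(1 for s in insolvent if s < cutoff) / len(insolvent)
    # Solvent firms classified as insolvent
    false_alarm_rate = sum(1 for s in solvent if s >= cutoff) / len(solvent)
    return (prior_insolvent * cost_miss * miss_rate
            + (1.0 - prior_insolvent) * cost_false_alarm * false_alarm_rate)

def min_ecm_cutoff(scores, labels, prior_insolvent, cost_miss, cost_false_alarm, grid):
    """Return (cutoff, ECM) for the grid point with the lowest ECM."""
    best = min(grid, key=lambda c: expected_cost(
        scores, labels, c, prior_insolvent, cost_miss, cost_false_alarm))
    return best, expected_cost(
        scores, labels, best, prior_insolvent, cost_miss, cost_false_alarm)

# Hypothetical estimation-sample scores, prior, and relative costs
scores = [0.1, 0.2, 0.6, 0.4, 0.8, 0.9]
labels = [0, 0, 0, 1, 1, 1]
print(min_ecm_cutoff(scores, labels, 0.1, 10.0, 1.0, [0.3, 0.5, 0.7]))
```

The cutoff chosen on the estimation sample would then be applied unchanged to the holdout sample, as the text describes.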

We also used a weighted exogenous sample maximum likelihood procedure to account for choice-based sampling (see Manski and Lerman, 1977; Zmijewski, 1984). The EGB2, Lomit, and Burrit models slightly outperform the other models for all three base years, especially for the estimation sample. However, except for the year before insolvency, the differences are insignificant.
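The weighting idea behind a weighted exogenous sample maximum likelihood procedure can be sketched for a logit model. Each observation's log-likelihood contribution is weighted by the ratio of its group's population share to its sample share. The shares, data, and function names below are hypothetical, and the sketch evaluates the weighted log-likelihood only; it does not perform the maximization.

```python
import math

def wesml_weights(population_share_insolvent, sample_share_insolvent):
    """Return (weight for solvent firms, weight for insolvent firms)."""
    w_insolvent = population_share_insolvent / sample_share_insolvent
    w_solvent = (1.0 - population_share_insolvent) / (1.0 - sample_share_insolvent)
    return w_solvent, w_insolvent

def weighted_logit_loglik(beta, rows, labels, w_solvent, w_insolvent):
    """Weighted logit log-likelihood; labels: 1 = insolvent, 0 = solvent."""
    total = 0.0
    for x, y in zip(rows, labels):
        z = sum(b * xi for b, xi in zip(beta, x))
        p = 1.0 / (1.0 + math.exp(-z))  # logistic probability of insolvency
        if y == 1:
            total += w_insolvent * math.log(p)
        else:
            total += w_solvent * math.log(1.0 - p)
    return total

# Hypothetical shares: 10 percent insolvent in the population, 50 percent in the sample
w_sol, w_ins = wesml_weights(0.1, 0.5)
rows = [[1.0], [1.0]]   # intercept-only design, for illustration
labels = [1, 0]
print(w_sol, w_ins, weighted_logit_loglik([0.0], rows, labels, w_sol, w_ins))
```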

Summary and Conclusions

This article's methodology overcomes some of the problems of traditional analyses for identifying financial distress in the insurance industry. This issue is important in view of the large number of failures in the property-liability insurance industry since 1961. The models and empirical results described in this article are intended to present the current state of knowledge and correct the methodological problems of previous studies.

The theoretical framework emphasizes the nexus among the different models discussed in the study--univariate models, qualitative response models, multiple discriminant analysis, and the nonparametric discriminant model. A new model, the exponential general beta 2 (EGB2), is defined and presented. The logit, probit, Lomit, and Burrit models are special cases of the EGB2. The Lomit or Burrit and the NPDM do not assume that the scores (given to the firms) are symmetrically distributed, and therefore they may better fit the data than MDA. In addition, the use of the NPDM eliminates violations of the basic assumptions which underlie the MDA model.

Large samples of solvent and insolvent firms are used to illustrate the application of the models. Each sample is split into an estimation sample and a holdout (prediction) sample. A seven-variable model is employed, with variables selected by a stepwise procedure for all three base years. The variables include measures of profitability, profitability versus earning stability, stability of balance sheet liabilities (decomposition measures), and surplus size. The seven-variable models perform quite well for classification and prediction and statistically outperform models based on other combinations of variables.

The MDA model seems robust for classification and prediction of insolvent and solvent firms. However, the NPDM, the logit, and other qualitative response models often correctly classify more cases than MDA where minimizing the number of misclassifications is employed. In addition, the NPDM outperforms both the logit and the MDA model in terms of prediction or validation results. The EGB2, Lomit, and Burrit models have significantly more powerful log-likelihood values compared to the logit or probit models one year prior to insolvency, whereas the expected cost of misclassification with the Lomit model is significantly lower for both samples one and three years prior to insolvency. The criteria for selecting the cutoff points and the base year prior to insolvency have a substantial effect on the empirical results.

If classification and predictive ability are the only objectives of the model, then multiple discriminant analysis may be robust because the coefficient estimates, the assumptions of the model, and the significance of the results are less important. However, if the purpose of the research is related to evaluation of the firms, selecting the appropriate accounting method, or regulation, the researcher also must be concerned with the assumptions of the models (which are violated if MDA is applied), the selection of proper variables and models, and the minimization of resubstitution risks (ECM). Most multivariate models perform quite well for predicting insolvency in the property-liability industry, but statistical, industrial, and risk considerations support the use of the Lomit and the nonparametric discriminant models and to some extent the exponential general beta 2 and the Burrit models.

Appendix

The choice-based sample problem results from nonrandom sampling (see Manski and Lerman, 1977; Zmijewski, 1984; Palepu, 1986). One possible adjustment for oversampling the insolvent firms is:

$$p' = \frac{p}{p + \alpha(1 - p)},$$

where $p$ = the estimated probability of being an insolvent firm in the population,

$p'$ = the Bayesian probability that a firm j in the sample is insolvent; thus, $p' = \Pr(j \text{ is insolvent} \mid j \text{ is in the sample})$, and

$\alpha$ = the probability that a solvent firm in the population is in the sample.

The estimation of $\alpha$ is determined by the ratio for the solvent firms in the sample to the total solvent firms in the population. Hence, the probability that a firm in the population is in the sample is 1.0 (for firms with available data) if it is insolvent, and only 0.169 if it is solvent. If $p$ is a logistic distribution (CDF), $p = 1/(1 + e^{-z})$, then $p'$ (see Palepu, 1986) is

$$p' = \frac{1}{1 + \alpha e^{-z}} = \frac{1}{1 + e^{\ln\alpha - z}}.$$

Because we derived the coefficients by maximizing the likelihood function based on $p'$, the parameters for $p$ should be determined. Palepu demonstrated that only the constant term is affected when the logistic function is employed. In general, $p$ can also be recovered by the following equation (see BarNiv, 1990):

$$p = \frac{1}{1/(\alpha p') + 1 - 1/\alpha} = \frac{\alpha p'}{1 + \alpha p' - p'}.$$

This adjustment is used for estimating the probability of being an insolvent firm j in the population. The incomplete data available for several firms in the population (of which 14 were insolvent) and the unavailability of some data for many other firms force this study to ignore the incomplete data problem because no alternative exists (for discussions, see Heckman, 1979, and Zmijewski, 1984). The problems of using choice-based samples for prediction and choosing arbitrary cutoff points in prediction are solved as follows. The estimated probability of an insurer being an insolvent firm j in the prediction is adjusted from p' to p in the holdout sample. A cutoff point is selected, with the specific objective of minimizing the expected cost of misclassification in the estimation sample. Firms in the holdout sample are then predicted as solvent or insolvent based on this cutoff point. Thus, the cutoff points in predictions are used within a specific decision context, which minimizes the expected cost of misclassification. Other criteria, such as minimizing the number of misclassifications, are also used for comparison.
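The adjustment between sample and population probabilities can be checked numerically with a short sketch. The function names are hypothetical; the value 0.169 is the sampling probability for solvent firms reported above. The sketch also verifies that, for the logistic case, the sample probability equals a logistic function whose index is shifted by the log of the sampling probability, which is why only the constant term is affected.

```python
import math

ALPHA = 0.169  # probability that a solvent firm in the population is in the sample

def sample_prob(p, alpha=ALPHA):
    """Probability of insolvency within the choice-based sample, given population p."""
    return p / (p + alpha * (1.0 - p))

def population_prob(p_prime, alpha=ALPHA):
    """Recover the population probability from the sample probability."""
    return alpha * p_prime / (1.0 + alpha * p_prime - p_prime)

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

# Round trip for a hypothetical population probability
p = 0.05
p_prime = sample_prob(p)
print(round(p_prime, 4), round(population_prob(p_prime), 4))

# Logistic case: the sample index is z - ln(alpha), i.e. a shifted constant term
z = 0.3
print(abs(sample_prob(logistic(z)) - logistic(z - math.log(ALPHA))) < 1e-12)
```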

1 Edmister (1972), Deakin (1972), and others used MDA to identify financial distress among industrial corporations. Sinkey (1975) and Santomero and Vinso (1977), among others, used MDA to identify bankruptcy in banks, while Altman (1973) used it for identifying distressed railroad companies. MDA has also been used for bond rating, classification of loan applications, and other classification problems.

2 Beaver (1966) provided the foundation for dichotomous classification of financially distressed firms based on univariate financial ratios.

3 Borch (1974) and Hofflander and Duval (1967) defined ruin as zero equity (i.e., the equity is completely eliminated).

4 For an early reference to MDA see Fisher (1936) and Welch (1939). Biometricians used the qualitative response models during the 1940s and 1950s (see Berkson, 1944, 1951 for the logit, and Finney, 1952, 1971 for the probit model). McFadden (1974) presented an analysis of the logit model and its maximum likelihood estimation (see also Maddala, 1983).

5 Special general cases of the GB2 are the beta of the second kind, the Singh-Maddala (or Burr type 12), and the Burr type 3. The generalized gamma distribution can also be included as a limiting case. Other distributions such as the Lomax, Fisk ($\mathrm{sech}^2$), F, $\chi^2$, Weibull, etc., can be presented as specific cases of the special general cases.

6 The general properties of the NPDM are that (1) no assumptions of specific parametric distributions are needed for the independent variables X or for the scores, and both qualitative and quantitative variables can be treated; (2) neither symmetric distributions (such as the normal) nor equality of the variance/covariance matrices are required; (3) the treatment of cost of misclassification and prior probabilities is straightforward; (4) the model can be generalized for more than two groups; and (5) many different discriminant functions may be used (in contrast, MDA, logit, and probit models provide unique explicit solutions).

7 Both the Burr and Lomax distributions are discussed in Johnson and Kotz (1970, pp. 31, 234).

8 In addition, 158 mergers were identified during the research period; most of these firms merged or dissolved into other property-liability insurers, and a few retired through affiliate (or similar) mergers. Data are available for many of these insurers and empirical analysis might be employed in future research.

9 See Dopuch et al. (1987) for similar arguments regarding prediction of audit qualifications.

10 $X_{37}$ includes the statutory underwriting expenses and investment income (the numerator of variable $X_{29}$) divided by the surplus. The variable $X_{35}$ includes the combined trade ratio (UE/NPW + LE/NPE) and the investment income and other investment gains divided by NPW, where NPE is the net premiums earned.

11 The use of the midpoint score between the two groups implies equal expected cost of misclassification or $\pi_1 c_1 = \pi_0 c_0$. However, two other criteria are also used in this study: one minimizes the number of errors; the other minimizes the ECM for each $C_1$ in the estimation sample. The second criterion considers the effect of prior probabilities, alternative costs, and the relative number of misclassifications.


Author: BarNiv, Ran; McDonald, James B.

Publication: Journal of Risk and Insurance

Date: Dec 1, 1992

Words: 8939
