Printer Friendly

MCDM approach to evaluating bank loan default models.


Loans are often the major type of credit risk for banks. Over the years, many quantitative default risk models have been developed to assess the default risk of individual and corporate obligors and help banks to better manage credit risks and allocate economic capital (Bastos 2010).

According to the underlying techniques, default risk models and algorithms can be broadly classified into traditional statistical category and intelligence or machine-learning category (Kumar, Ravi 2007). Examples of traditional statistical techniques include simple univariate analysis (Beaver 1966), multiple discriminant analysis (MDA) (Altman 1968), multiple regression (Meyer, Pifer 1970), linear discriminant analysis (Altman 1973), logit model (Martin 1977; Ohlson 1980), goal programming (Srinivasan, Kim 1987), logistic regression (Wiginton 1980; Gilbert et al. 1990), quadratic discriminant analysis (QDA) (Banks, Prakash 1994), Bayesian analysis (Li 1999), Bayesian network models (Shumway 2001), and quadratic programming (Tseng, Lin 2005). Machine-learning category includes techniques like recursive partitioning algorithm (Marais et al. 1984; Frydman et al. 1985), neural network models (NN) (Tam, Kiang 1992), case-based reasoning (Buta 1994; Bryant 1997), decision trees (Henley 1995), k-nearest-neighbour (KNN) (Henley, Hand 1996), genetic programming (Varetto 1998; McKee, Lensberg 2002), rough sets (McKee 1998), neuro-fuzzy techniques (Gorzalczany, Piasta 1999), data envelopment analysis (Cielen et al. 2004), ensembles (Doumpos, Zopounidis 2007), soft computing (Ravi, Pramodh 2008), multi-criteria convex quadratic programming (Peng et al. 2008a), classification and regression trees (CART) (Khandani et al. 2010). Reviews and comparative studies on credit risk models can be found at Altman (1984), Rosenberg and Gleit (1994), Hand and Henley (1997), Galindo and Tamayo (2000), Kumar and Ravi (2007), Crook et al. (2007), and Verikas et al. (2010).

Although comparative analysis of default classification models has been conducted intensively, the findings of these studies do not always agree with each other. A default classification model that outperforms another using one set of dataset and performance measures may be disadvantaged under different circumstances. For example, Fernandez and Olmeda (1995) reported that the neural network model provided generally better results than CART-based decision tree algorithms in default prediction, whereas Galindo and Tamayo (2000) concluded that CART decision tree model was superior to neural networks in mortgage loan default prediction.

While important advances have been made in developing default classification models, less progress has been made in default model evaluation and selection. Lopez and Saidenberg (2000) proposed evaluation methods for credit risk models based on cross-sectional simulation. Sobehart et al. (2001) developed several performance metrics and a validation framework for quantitative default risk models. Stein (2002) discussed validation strategies for default prediction models. Medema et al. (2009) implemented a practical methodology to validate credit risk models. In addition to the inconclusiveness associated with default classification models, the limited availability of bank loan data presents another major challenge in assessing and selecting default classification models.

The objective of this study is to propose an approach to evaluate default classification models and generate a ranking of them using a combination of multiple performance metrics (Peng et al. 2011b). Though the correct-classification rate is a prominent measurement in evaluating default risk models, it is not a sufficient measure of a model's predictive power due to the highly unbalanced nature of default data (Khandani et al. 2010). Thus it is a common practice in default classification model evaluation to use several performance measures. Based on this observation, this study treats the model evaluation and selection problem as a multiple criteria decision making (MCDM) problem (Peng et al. 2011a; Zavadskas, Turskis 2011; Brauers, Zavadskas 2011; Antucheviciene et al. 2011; Fernando et al. 2012). Each performance measure is considered as a criterion and the quality of default classification model is determined by combining multiple performance metrics (e.g. accuracy, type-I error, and type-II error). TOPSIS, a multiple criteria decision making (MCDM) method, is used to rank the default prediction models.

The bank loan default prediction process includes data pre-processing, modelling, and model assessment and selection (Verikas et al. 2010). Two challenges in data pre-processing are high dimensionality and class unbalancedness. A large number of features may be collected in bank loan data and some of them are redundant or irrelevant. Discarding such features can improve the performance of prediction models (Fukunaga 1972). This paper examines the performances of two feature selection techniques (i.e. PCA and ICA) in the context of the proposed procedure and a real-life large corporate bank loan data. Bank loan datasets often have highly unequal frequencies of the normal and default records. Default models trained on random samples from unbalanced datasets may perform poorly on future bank loan default data (Zhang, Zhou 2004; Kou et al. 2005; Shi et al. 2005; Kou et al. 2014). The Synthetic Minority Oversampling Technique (SMOT) (Chawla et al. 2002) is utilized in this study to handle the unbalancedness by creating synthetic default records. The proposed approach is applied to a proprietary Chinese commercial bank loan dataset, which contains over 10,000 records of corporate loans with 120 variables.

The paper is structured as follows. Section 1 explains the proposed approach. Section 2 presents the design and the results of the experimental study. The final section concludes the paper with summaries and future research directions.

1. Research methodology

This section describes the proposed approach that this paper used to pre-processing, classifying, and evaluating default classification models. The major components of this approach are explained in sequence.

1.1. Proposed approach for bank loan default prediction

Figure 1 describes the proposed approach for bank loan default prediction that includes feature selection, sampling, classification, and model evaluation.

Real world data may be incorrect, duplicate, incomplete, missing, and dispersed. Thus they need to be cleaned, integrated, and transformed. A comprehensive description of methods and techniques for data preparation can be found at Han and Kamber (2006). Data preparation steps applied to the bank loan default data will be discussed in Section 2.1.


The following subsections present the four major components of the procedure: feature selection, sampling, classification algorithms, and the MCDM methods for model evaluation.

1.2. Feature selection techniques

Feature selection is an essential step in bank loan default prediction. It can not only reduce the computation complexity, but also improve the performance of prediction models by removing irrelevant variables. Many studies have been conducted in the area of corporate bankruptcy prediction to compare feature selection techniques (e.g. Ryu, Yue 2005; Tsai 2009), combine feature selection methods with classifiers (e.g. Back et al. 1996), or design systems integrating feature selection and prediction models (e.g. Lin, McClean 2001).

Principal component analysis (PCA) (Pearson 1901) is one of the most popular feature selection techniques used in bankruptcy prediction. Independent component analysis (ICA) (Jutten, Herault 1991; Comon 1994), a relatively new technique, has also been adopted in feature selection with increasing frequency and has shown promise in some circumstances (e.g. Cao, Chong 2002; Kao et al. 2011). While PCA is suitable for datasets with Gaussian distributions, ICA can decompose non-Gaussian datasets into independent components. The effects of PCA and ICA on pattern recognition problems have been examined in several studies and the observations were mixed: Fortuna and Capson (2004) have shown that ICA outperformed PCA in its generalization ability. Baek et al. (2002), on the other hand, claimed that PCA outperformed ICA for face recognition and the difference in performance between the two techniques is statistically significant. Meanwhile, Moghaddam (2002), as well as Ekenel and Sankur (2004), observed that there are no significant performance differences between ICA and PCA.

In the bank loan default prediction field, few works have been conducted to compare the prediction performances of features selected by PCA and ICA. In this paper, both techniques are applied to the original bank loan data to compute principal and independent components. The two new datasets are then sampled and classified using the same sampling technique and classifiers.

1.2.1. Principal component analysis (PCA)

Principal component analysis (PCA) is a multivariate statistical technique that uses an orthogonal transformation to convert a set of often inter-correlated variables into a new set of orthogonal variables (Abdi, Williams 2010). PCA has a lot of applications and its goal in feature extraction is to find a mapping that simplifies the description of the dataset.

Let X = [([x.sub.1],[x.sub.2] ,...,[x.sub.m]).sup.T] be the original dataset of size m x n, m [less than or equal to] n, consisting of observed variable [x.sub.i] of size l x n,i = 1,2,...,m. Let Y be another m x n matrix related by a linear transformation P:

PX = Y, (1)

where the rows of P, {[p.sub.1] ,...,[p.sub.m]}, are a set of new basis vectors, which are also called principal components, for representing the columns of X. Principal components can be obtained by a three-step algorithm (Shlens 2005): the first step selects a normalized direction in m-dimensional space along which the variance in X is maximized and save it as [p.sub.l]; the second step finds another direction along which variance is maximized under the constraint that it is orthogonal to the preceding vector and save it as [p.sub.i]; the third step repeats this procedure until m vectors are selected. The output is a set of ordered principal components p. The dimensionality of the original data can be reduced by choosing the strongest principal components.

1.2.2. Independent component analysis (ICA)

ICA searches the linear transformation that minimizes the statistical dependence between its components and may be considered as an extension of PCA (Comon 1994). The ICA model is defined as (Hyvarinen, Oja 2000):

x = As = [n.summation over (i=1)] [a.sub.i][s.sub.i], (2)

where: x is the observed random vector with elements [x.sub.1], ..., [x.sub.n]; A is the mixing matrix with elements [a.sub.ij]; and s is the random vector with elements [s.sub.1], ..., [s.sub.n]. Both s and A are assumed to be unknown and must be estimated using the observed vector x under several assumptions. First, the components s. are assumed to be statistically independent. Second, the independent component (IC) is assumed to have non-Gaussian distributions. Third, the unknown mixing matrix is assumed to be square. Then the ICs can be obtained by compute W, the inverse of A:

s = Wx. (3)

Non-Gaussianity is the key to estimating the ICA model and can be measured by kurtosis, negentropy, and approximations of negentropy (Hyvarinen, Oja 2000). Kurtosis is the classical measure of non-Gaussianity and is adopted in this paper. The kurtosis of y is defined by:

kurt(y) = E{[y.sup.4]} - 3[(E{[y.sup.2]}).sup.2]. (4)

If y is assumed to zero mean and unit variance, the right-hand side becomes E{[y.sup.4]}-3. Thus, kurtosis is zero for a Gaussian random variable and nonzero for most non-Gaussian random variables.

Several algorithms have been developed to solve the ICA problem (Bell, Sejnowski 1995; Amari et al. 1996; Karhunen, Oja 1997; Hyvarinen 1999). This paper adopts the FastICA algorithm proposed by Hyvarinen and Oja (2000).

1.3. Sampling approach: SMOTE

Credit-default data are normally highly unbalanced because the proportion of the number of default loans to the population is normally below 10% (Paleologo et al. 2010). Unbalanced datasets cause at least two types of problems in default prediction (Zhang, Zhou 2004). The first problem is that the performance of classification models for the default class may be poor due to the overwhelming number of normal loan records in the samples. The second problem is related to classifier evaluation. Predictive accuracy, the most widely used measure for classifiers, can be misleading with unbalanced data. For example, suppose a dataset has 95% non-default firms and 5% default firms. An accuracy rate of 95% may be inadequate since the classifier could be correctly classified only the non-default firms. Possible ways to address this problem are sampling techniques and multiple performance measures. The performance measures will be discussed in Section 1.5.

Many sampling techniques have been developed to deal with unbalanced datasets. The fundamental idea of these techniques is to oversampling the minority class and/or under-sampling the majority class (Chawla et al. 2002). Chawla et al. (2002) proposed the Synthetic Minority Oversampling TEchnique (SMOTE), an over-sampling approach in which the minority class is over-sampled by creating synthetic minority class records. The synthetic minority samples are generated along the line segments joining any or all of the k minority class nearest neighbours. SMOTE achieves better performance than over-sampling with replacement techniques because it creates larger decision regions, which leads to more coverage of the minority class. The experiment implements SMOTE using the SMOTE class (weka.filters.supervised.instance.SMOTE) of WEKA 3.7 (Witten, Frank 2005).

1.4. Classification algorithms and performance measures

Default risk prediction can be considered as a binary classification problem, which means bank loans are assigned to one of the two classes (default or non-default) after data analysis. In the classification framework, a model or classifier attempts to accurately predict class labels (default or non-default) of future individual loans based on two steps. In the first step, a classifier learns a function that maps independent variables or input variables to a dependent or output variable (class label). The second step tests the predictive accuracy of the classifier learned in the first step (Han, Kamber 2006).

1.4.1. Classification algorithms

Data mining and knowledge discovery (DMKD) has made great progress during the last twenty years (Peng et al. 2008b). As one of the major tasks of data mining and an important problem in research and practice, classification has wide business and scientific applications, such as credit risk management, default prediction, marketing promotion, financial loan evaluation, insurance premium calculation, and medical clinic decision. Five classification algorithms: Naive Bayes, linear logistic regression, k-nearest-neighbour (k-NN), C4.5, and Classification and Regression Tree (CART), are chosen for the experimental study.

Naive Bayes (Domingos, Pazzani 1997) models probabilistic relationships between input variables and the output variable. It estimates the class-conditional probability based on Bayes theorem and can represent simple distributions. Linear logistic regression (Cessie, Houwelingen 1992) models the probability of occurrence of an event as a linear function of a set of input variables and generalizes well to multiclass problems. The k-nearest-neighbour (Cover, Hart 1967) is an instance-based learning method. It classifies a new instance to the same class as the closest existing instance, and the closeness is measured by a distance metric. In the experimental study, k-NN is implemented by the IBK algorithm in WEKA (Witten, Frank 2005). C4.5 is a decision tree algorithm that constructs decision trees in a top-down recursive divide-and-conquer manner (Quinlan 1993) and is one of the most extensively studied machine learning algorithms. Due to its outstanding performance and easy-to-interpret decision rules, C4.5 became a benchmark of classification algorithms. Classification and Regression Tree (CART) (Breiman et al. 1984) is a greedy algorithm for learning multivariate decision trees. It can be used to predict both continuous and categorical variables.

1.4.2. Performance measures

Widely used performance measures of default prediction systems are accuracy, Type-I error, Type-II error, and AUC (Verikas et al. 2010). Bank loan default prediction is normally a twoclass problem. Suppose default and non-default loans are labelled as negative and positive, respectively. These measures can be defined as follows.

--Overall accuracy: Accuracy is the percentage of correctly classified instances. It is the most widely used classification performance metrics.


Overall Accuracy = [[TN + TP]/[TP + FP + FN + TN]], (5)

where TN, TP, FN, and FP stand for true negative, true positive, false negative, and false positive, respectively.

--True Positive (TP): TP is the number of correctly classified non-default instances. TP rate is also called sensitivity measure.

True Positive rate/Sensitivity = [TP/[TP + FN]]. (6)

--False Positive (FP): FP is the number of default firms that is misclassified as non-default class. FP rate is called Type-I error. In default prediction, Type-I error is more important than Type-II error because it represents potential credit loss and Type-II error implies potential loss of business. Actually, Altman et al. (1977) indicated that the costs of Type-I error are 35 times higher for banks than Type-II error costs.

False Positive rate/Type-I error = [FP/[FP + TN]]. (7)

--True Negative (TN): TN is the number of correctly classified default instances. TN rate is also called specificity measure.

True Negative rate/Specificity = [TN/[TN + FP]]. (8)

--False Negative (FN): FN is the number of non-default instances that is misclassified as default class. FN rate is also called Type-II error.

False Negative rate/Type-II error = [FN/[FN + TP]]. (9)

--Area under the curve (AUC): ROC stands for Receiver Operating Characteristic, which shows the tradeoff between TP rate and FP rate. AUC represents the accuracy of a classifier. The larger the area, the better the classifier.

1.5. MCDM method: TOPSIS

There are many MCDM methods that can be used to rank alternatives. This study chose TOPSIS due to its good performance in previous studies and ease of use (Agrawal et al. 1991; Zanakis et al. 1998). Hwang and Yoon (1981) proposed the Technique for order preference by similarity to ideal solution (TOPSIS) method to rank alternatives over multiple criteria. It finds the best alternatives by minimizing the distance to the idea solution and maximizing the distance to the nadir or negative-ideal solution (Olson 2004).

A number of extensions and variations of TOPSIS have been developed over the years. The following TOPSIS procedure adopted from Opricovic and Tzeng (2004: 448-449) and Olson (2004: 722) is used in the experimental study.

Step 1: calculate the normalized decision matrix. The normalized value v.. is calculated as:

[r.sub.ij] = [x.sub.ij] / [square root of [J.summation over (j=1)] [x.sup.2.sub.ij]], j = 1, ..., J; i = l, ..., n, (10)

where: J and n denote the number of alternatives and the number of criteria, respectively. For alternative [A.sub.j], the performance measure of the ith criterion [C.sub.i] is represented by [x.sub.ij]

Step 2: develop a set of weights w. for each criterion and calculate the weighted normalized decision matrix. The weighted normalized value [v.sub.ij] is calculated as:

[v.sub.ij] = [w.sub.i][r.sub.ij] = j = 1,..., J; i = 1, ...,n, (11)

where [w.sub.i] is the weight of the ith criterion, and [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Step 3: find the ideal alternative solution [S.sup.+], which is calculated as:


where I' is associated with benefit criteria and I" is associated with cost criteria. For the evaluation of default prediction models, accuracy and AUC are benefit criteria and have to be maximized, and the Type-I error and Type-II error are cost criteria and have to be minimized.

Step 4: find the negative-ideal alternative solution [S.sup.-], which is calculated as:


Step 5: Calculate the separation measures, using the n-dimensional Euclidean distance. The separation of each alternative from the ideal solution is calculated as:

[D.sup.+.sub.j] = [square root of [n.summation over (i=1)] [([v.sub.ij]] - [v.sup.+.sub.i]).sup.2]] j = 1, ..., J (14)

The separation of each alternative from the negative-ideal solution is calculated as:

[D.sup.-.sub.j] = [square root of [n.summation over (i=1)] [([v.sub.ij]] - [v.sup.-.sub.i]).sup.2]] j = 1, ..., J (15)

Step 6: Calculate a ratio [R.sup.+.sub.j] that measures the relative closeness to the ideal solution and is calculated as:

[R.sup.+.sub.j] = [D.sup.-.sub.j]/([D.sup.+.sub.j] + [D.sup.-.sub.j], j = 1,..., J. (16)

Step 7: Rank alternatives by maximizing the ratio [R.sup.+.sub.j].

2. Experimental study

The experiment is designed to test the proposed approach using a large real-life bank loan default dataset provided by a Chinese commercial bank. The following subsections describe the data, the experimental design, and the results.

2.1. Bank loan data

The dataset which was provided by a Chinese commercial bank contains 10,868 short-term, medium-term, and long-term loans granted to various sizes of companies over the period January 2002-June 2004. Companies included in the data cover a wide range of industries, such as manufacturing, information technology, pharmacy, telecommunication, foreign trading, energy, agriculture, public utility, and real estate.

The raw dataset was pre-processed to remove irrelevant, sensitive, and incorrect data. Originally, there are 120 variables and some of them are sensitive and irrelevant to the default prediction, such as company identification codes, company names, and company addresses. Second, instances with more than 50% missing values were considered as incomplete or incorrect and deleted from the dataset. Pre-processing step removed 224 instances and 16 variables.

Different definitions of 'defaulted' or 'bad' loans are available in the literature (e.g. Marais et al. 1984; Hurt, Felsovalyi 1998; Basel Committee... 2004; Dermine, Carvalho 2006). Because the dataset used in this study was provided by a Chinese commercial bank, the following five-category loan classification system used by China's state-owned commercial banks (SCB) and joint shareholding commercial banks (JSCB) is adopted. The five-category criteria classify bank loans as pass, special-mention, substandard, doubtful, and loss. The last three categories are treated as defaulted loans by SCB and JSCB. A loan is classified as 'substandard' when the obligor's ability to pay its credit obligations is in question depending on its normal business revenues. A loan is classified as 'doubtful' when the obligor cannot pay interest and/or principal. A loan is classified as 'loss' when principal and interest cannot be recovered or only a small portion can be recovered after taking all possible measure or legal procedure is started (Bank loan classification regulation 2002; Banks to adopt five-category loan classification system 2003).

According to this definition, the sample data has 9,928 normal loans and 716 defaulted loans. The 104 variables describe five important financial aspects of companies as pointed out by Altman (1968), including liquidity, profitability, leverage, solvency, and activity. This is a typical bank loan data with two characteristics: a large number of variables and highly unbalanced (only 6.7% default loans). Table 1 compares the normal and defaulted loans in the dataset from four aspects (McKee, Lensberg 2002). RMB is the abbreviation of "renminbi", the official currency of China.

The two subsamples have some different characteristics. For example, the normal loans have larger mean total assets and mean total revenues than the defaulted loans. The Net income/Total assets ratio and the Cash/Current liabilities ratio of the normal loan category are also higher than the defaulted loan category. On the other hand, the normal and defaulted loans exhibit some similar features. For instance, both normal and defaulted loans have wide range values for total assets and revenues. The defaulted subsample has a negative Net income/Total assets ratio and has 45% companies with nonpositive net incomes. Although the normal group has a positive Net income/Total assets ratio, 3,425 companies (or 34%) in the subsample have nonpositive net incomes. This feature is similar to the one reported by McKee and Lensberg (2002). The mixed subsample characteristics reflect the real life problem in loan default prediction.

2.2. Experimental design

The experiment was carried out according to the following process:

Input: The bank loan default dataset

Output: Rankings of classifiers

Step 1. Prepare the dataset: remove irrelevant variables; handle missing values by filling in with the variable average; eliminate incorrect data.

Step 2. Select relevant variables using PCA and ICA to get two sets of data. PCA is implemented using the PrincipalComponents class of WEKA 3.7 (Witten, Frank 2005; Hall et al. 2009) and ICA is implemented using the FastICA operator of RapidMiner 5.0 (Mierswa et al. 2006).

Step 3. Applying SMOTE approach to the PCA and ICA datasets to generate synthetic default instances, which is implemented using the SMOTE class of WEKA 3.7 (Witten, Frank 2005; Hall et al. 2009).

Step 4. Train and test the selected classification algorithms using WEKA 3.7 on randomly sampled partitions (i.e. 10-fold cross-validation) of the sampled PCA and ICA datasets, respectively.

Step 5. Evaluate classification algorithms using TOPSIS, which is implemented using MATLAB 7.0 (Matlab 2005).

Step 6. Generate the ranking of classification algorithms provided by TOPSIS for the PCA and ICA datasets, respectively.


After removing irrelevant variables, there are 104 variables in the dataset. Non-Gaussianity is the key to estimating the ICA model and can be verified by kurtosis. Figure 2 illustrates that all variables satisfy non-Gaussianity since none of them has zero kurtosis.

PCA and ICA methods are applied to the pre-processed data to reduce the dimensionality. A threshold of 85% is set for both PCA and ICA methods. The resulting datasets after applying PCA and ICA have 31 and 32 variables, respectively.

The SMOTE approach is used to create synthetic default instances for the PCA and ICA datasets. Figure 3 compares the different class sizes of default before and after applying the SMOTE. The default data increase from 716 records to 3365 records.



2.3. Results and discussions

2.3.1. Classification results

Table 2 summarizes classification results on the PCA test dataset. The best accuracy, Type-I error rate, Type-II error rate, and AUC are highlighted in boldface and italic. K-NN has the highest accuracy of 96.67%, followed by C4.5 and CART. Naive Bayes and logistic have the lowest Type-I and Type-II error rate, respectively. However, the low Type-I or Type-II error rate is achieved by sacrificing the other type of error rate. The 6.4% Type-I error rate of naive Bayes is accompanied with a 69.7% Type-II error rate and the 1.2% Type-II error rate of logistic is accompanied with a 48.18% Type-I error rate. For the measure of AUC, the performances of K-NN, C4.5 and CART are close and the average rate of the five classifiers reaches 90.2%.

Table 3 compares the five classifiers using the same set of measures on the ICA test dataset. K-NN is still the best classifier in terms of the accuracy and AUC. Similarly, Naive Bayes and logistic have the lowest Type-I and Type-II error rate, respectively. At the same time, these two classifiers also have the highest Type-II and Type-I error rates.

2.3.2. PCA vs. ICA

To compare the effects of PCA and ICA on the classification results, the experiment applied the selected classification algorithms to the PCA and ICA datasets. The averaged results of accuracy, AUC, Type-I error rate, and Type-II error rate for the PCA and ICA data are summarized in Table 4. It can be seen that PCA outperforms ICA on the four measurements.

2.3.3. TOPSIS results

The rankings of classifiers generated by TOPSIS for the PCA data and the ICA data are summarized in Table 5. The rankings of the five classifiers using the PCA and ICA datasets are identical and K-NN is ranked the best classifier.

In this specific dataset, it chanced that one classifier outperformed other classifiers in most performance measures. In reality, it is normal that classifiers with the best scores on some measures may perform poorly on other measures. In those circumstances, the MCDM methods are helpful in providing a final ranking of classification algorithms. It is worth to mention that the performance measures used by this study are mainly accuracy-related. In real application in commercial banks, there are other issues need to be considered when selecting classifiers. For instance, probability of default (PD) is a commonly used indicator in bankruptcy prediction and thus a classifier which can provide PD information may be preferred over a classifier which cannot provide PD information by commercial banks.


This paper concentrated on feature selection, unbalanced data, and model assessment in bank loan default prediction. It developed a procedure to address the three problems. Firstly, ICA and PCA were used to select relevant features. Secondly, the SMOT approach was utilized to deal with the unbalanced data by creating synthetic default examples. Thirdly, TOPSIS, a multiple criteria decision making method, was utilized to rank a selection of default prediction models.

An experimental study using a large real-life bank loan default dataset provided by a Chinese commercial bank was conducted to validate the proposed process. PCA and ICA generated two separate datasets and tested in the sampling, classification, and model evaluation steps. After feature selection, PCA and ICA reduced the dimensionality from 104 to 31 and 32, respectively. The reduced datasets were then balanced using the SMOTE approach and classified with five selected classification algorithms. The performances of classifiers were measured using accuracy, Type-I error rate, Type-II error rate, and AUC. Finally, TOPSIS was applied to the PCA and ICA datasets to evaluate the classifiers based on the values of performance metrics. The experimental results showed that K-NN has good potential in default prediction. In addition, the outcome indicated that there is no significant difference between PCA and ICA on default prediction of the specific dataset.

Future research issues include introducing more feature selection techniques, sampling approaches, classification algorithms, and MCDM methods to the process. Ranking results generated by MCDM methods may be very different. Another research direction is to resolve this disagreement and help decision-makers pick the most suitable classifier(s). Experimental study with more default datasets may also be conducted to test the process.


This research has been partially supported by grants from the National Natural Science Foundation of China (#71325001 for Yi Peng, #71222108 for Gang Kou), the Fundamental Research Funds for the Central Universities, the Research Fund for the Doctoral Program of Higher Education (#20120185110031) and Program for New Century Excellent Talents in University (NCET-12-0086 and NCET-10-0293).

doi: 10.3846/20294913.2014.913275


Abdi, H.; Williams, L. J. 2010. Principal component analysis, Wiley Interdisciplinary Reviews: Computational Statistics 2(4): 433-459.

Agrawal, V. P.; Kohli, V.; Gupta, S. 1991. Computer aided robot selection: the multiple attribute decision making approach, International Journal of Production Research 29(8): 1629-1644.

Altman, E. I. 1968. Financial ratios, discriminant analysis and the prediction of corporate bankruptcy, Journal of Finance 23(1): 589-609.

Altman, E. I. 1973. Predicting railroad bankruptcies in America, Bell Journal of Economics 4(1): 184-211.

Altman, E. I.; Haldeman, R. G.; Narayanan, P. 1977. ZETA analysis, Journal of BanMng and Fmance 1(1): 29-54.

Altman, E. I. 1984. The success of business failure prediction models, Journal of BanMng and Fmance 8(2): 171-198.

Amari, S.; Chichocki, A.; Yang, H. 1996. A new learning algorithm for blind signal separation, Advances in Neural Information Processing Systems 8: 757-763.

Antucheviciene, J.; Zakarevicius, A.; Zavadskas, E. K. 2011. Measuring congruence of ranking results applying particular MCDM methods, Infovmatica 22(3): 319-338.

Back, B.; Laitinen, T.; Sere, K. 1996. Neural network and genetic algorithm for bankruptcy prediction, Expert Systems with Applications 11(4): 407-413.

Baek, K.; Draper, B. A.; Beveridge, J. R.; She, K. 2002. PCA vs. ICA: a comparison on the FERET data set, In JCIS, 824-827.

Bank loan classification regulation. 2002. The People's Bank of China.

Banks to adopt five-category loan classification system [online], [cited 08 October 2011]. 2003. People's Daily. Available from Internet: shtml

Banks, W. J.; Prakash, L. A. 1994. On the performance of linear programming heuristics applied on a quadratic transformation in the classification problem, European Journal of Operational Research 74(23): 23-28.

Basel Committee on Banking Supervision, 2004. International Convergence of Capital Measurement and Capital Standards. Basel.

Bastos, J. A. 2010. Forecasting bank loans loss-given-default, Journal of Banking and Finance 34(10): 2510-2517.

Beaver, W. H. 1966. Financial ratios as predictors of failure, Journal of Accounting Research 4: 71-111.

Bell, A. J.; Sejnowski, T. J. 1995. An information maximization approach to blind separation and blind deconvolution, Neural Computation 7(6): 1129-1159.

Brauers, W. K. M.; Zavadskas, E. K. 2011. Multimoora optimization used to decide on a bank loan to buy property, Technological and Economic Development of Economy 17(1): 174-188.

Breiman, L.; Friedman, J. H.; Olshen, R. A.; Stone, C. J. 1984. Classification and regression trees. Wadsworth International Group, Belmont, California.

Bryant, S. M. 1997. A case-based reasoning approach to bankruptcy prediction modeling, International Journal of Intelligent Systems in Accounting, Finance and Management 6(3): 195-214.<195::AID-ISAF132>3.3.CO;2-6

Buta, P. 1994. Mining for financial knowledge with CBR, AI Expert 9(2): 34-41.

Cao, L. J.; Chong, W. K. 2002. Feature extraction in support vector machine: a comparison of PCA, XPCA and ICA, in Proceedings of the 9th International Conference on Neural Information, 18-22 November, 2002, Singapore, 1001-1005.

Cessie, S. le; Houwelingen, J. C. 1992. Ridge estimators in logistic regression, Applied Statistics 41(1): 191-201.

Chawla, N. V.; Bowyer, K. W.; Hall, L. O.; Kegelmeyer, W. P. 2002. SMOTE: Synthetic Minority Oversampling Technique, Journal of Artificial Intelligence Research 16: 341-378.

Cielen, A.; Peeters, L.; Vanhoof, K. 2004. Bankruptcy prediction using a data envelopment analysis, European Journal of Operational Research 154(2): 526-532.

Comon, P. 1994. Independent component analysis - a new concept?, Signal Processing 36(3): 287-314.

Cover, T. M.; Hart, P. E. 1967. Nearest neighbor pattern classification, IEEE Transactions on Information Theory 13(1): 21-27.

Crook, J. N.; Edelman, D. B.; Thomas, L. C. 2007. Recent developments in consumer credit risk assessment, European Journal of Operational Research 183(3): 1447-1465.

Dermine, J.; Neto De Carvalho, C. 2006. Bank loan losses-given-default: a case study, Journal of Banking and Finance 30(4): 1219-1243.

Domingos, P.; Pazzani, M. 1997. On the optimality of the simple Bayesian classifier under zero-one loss, Machine Learning 29(203): 103-130.

Doumpos, M.; Zopounidis, C. 2007. Model combination for credit risk assessment: a stacked generalization approach, Annals of Operations Research 151(1): 289-306.

Ekenel, H. K.; Sankur, B. 2004. Feature selection in the independent component subspace for face recognition, Pattern Recognition Letters 25(12): 1377-1388.

Fernandez, E.; Olmeda, I. 1995. Bankruptcy prediction with artificial neural networks, Lecture Notes in Computer Science 930: 1142-1146.

Fernando, A. F.; Spahr, R.; Santos, P.; Rodrigues, P. M. M. 2012. A multiple criteria framework to evaluate bank branch potential attractiveness, International Journal of Strategic Property Management 16(3): 254-276.

Fortuna, J.; Capson, D. 2004. Improved support vector classification using PCA and ICA feature space modification, Pattern Recognition 37(6): 1117-1129.

Frydman, H.; Altman, E. I.; Kao, D. 1985. Introducing recursive partitioning for financial classification: the case of financial distress, Journal of Finance 40(1): 269-291.

Fukunaga, K. 1972. Introduction to statistical pattern recognition. New York: Academic Press. 369 p.

Galindo, J.; Tamayo, P. 2000. Credit risk assessment using statistical and machine learning: basic methodology and risk modeling applications, Computational Economics 15(1-2): 107-143.

Gilbert, L. R.; Menon, K.; Schwartz, K. B. 1990. Predicting bankruptcy for firms in financial distress, Journal of Business Finance and Accounting 17(1): 161-171.

Gorzalczany, M. B.; Piasta, Z. 1999. Neuro-fuzzy approach versus rough-set inspired methodology for intelligent decision support, Information Sciences 120(1-4): 45-68.

Hall, M.; Frank, E.; Holmes, G.; Pfahringer, B.; Reutemann, P.; Witten, I. H. 2009. The WEKA data mining software: an update, SIGKDD Explorations 11(1): 10-18.

Han, J.; Kamber, M. 2006. Data mining: concepts and techniques. 2nd edition. San Francisco: Morgan Kaufmann Publishers. 770 p.

Hand, D. J.; Henley, W. E. 1997. Statistical classification methods in consumer credit scoring: a review, Journal of the Royal Statistical Society. Series A (Statistics in Society) 160(3): 523-541.

Henley, W. E. 1995. Statistical aspects of credit scoring: PhD Dissertation. Department of Statistics, The Open University, Milton Keynes.

Henley, W. E.; Hand, D. J. 1996. A k-nearest-neighbour classifier for assessing consumer credit risk, Journal of the Royal Statistical Society. Series D (The Statistician) 45(1): 77-95.

Hurt, L.; Felsovalyi, A. 1998. Measuring loss on Latin American defaulted bank loans, a 27-year study of 27 countries, Journal of Lending and Credit Risk Management 80: 41-46.

Hwang, C. L.; Yoon, K. 1981. Multiple attribute decision making methods and applications. Berlin Heidelberg: Springer. 259 p.

Hyvarinen, A. 1999. Survey on independent component analysis, Neural Computing Surveys 2: 94-128.

Hyvarinen, A.; Oja, E. 2000. Independent component analysis: algorithms and applications, Neural Networks 13(4-5): 411-430.

Jutten, C.; Herault, J. 1991. Blind separation of sources, Part I: an adaptive algorithm based on neuromimetic architecture, Signal Processing 24(1): 1-10.

Kao, L. J.; Lu, C. J.; Chiu, C. C. 2011. Efficiency measurement using independent component analysis and data envelopment analysis, European Journal of Operational Research 210(2): 310-317.

Karhunen, J.; Oja, E. 1997. A class of neural networks for independent component analysis, IEEE Transactions on Neural Networks 8(3): 486-503.

Khandani, A. E.; Kim, A. J.; Lo, A. W 2010. Consumer credit-risk models via machine-learning algorithms, Journal of Banking and Finance 34(11): 2767-2787.

Kou, G.; Peng, Y.; Shi, Y.; Wise M.; Xu, W. 2005. Discovering credit cardholders' behavior by multiple criteria linear programming, Annals of Operations Research 135(1): 261-274.

Kou, G; Peng, Y.; Wang, G. 2014. Evaluation of clustering algorithms for financial risk analysis using MCDM methods, Information Sciences (in Press).

Kumar, P. R.; Ravi, V. 2007. Bankruptcy prediction in banks and firms via statistical and intelligent techniques--a review, European Journal of Operational Research 180(1): 1-28.

Li, K. 1999. Bayesian analysis of duration models: an application to Chapter 11 bankruptcy, Economics Letters 63(3): 305-312.

Lin, F. Y.; McClean, S. 2001. A datamining approach to the prediction of corporate failure, Knowledge Based Systems 14(3-4): 189-195.

Lopez, J. A.; Saidenberg, M. R. 2000. Evluating credit risk models, Journal of Banking and Finance 24(1-2): 151-165.

Marais, M. L.; Patel, J.; Wolfson, M. 1984. The experimental design of classification models: an application of recursive partitioning and bootstrapping to commercial bank loan classifications, Journal of Accounting Research 22: 87-113.

Martin, D. 1977. Early warning of bank failure: a logit regression approach, Journal of Banking and Finance 1(3): 249-276.

MATLAB 2005. The MathWorks, Inc., Natick, MA 01760 [online], [cited 12 December 2012]. Available from Internet:

McKee, T. E. 1998. A mathematically derived rough set model for bankruptcy prediction, in Brown, C. E. (Ed.). Collected Papers of the Seventh Annual Research Workshop on Artificial Intelligence and Emerging Technologies in Accounting, Auditing and Tax. Artificial Intelligence/Emerging Technologies Section of the American Accounting Association, Sarasota, Florida.

McKee, T. E.; Lensberg, T. 2002. Genetic programming and rough sets: a hybrid approach to bankruptcy classification, European Journal of Operational Research 138(2): 436-451.

Medema, L.; Koning, R. H.; Lensink, R. 2009. A practical approach to validating a PD model, Journal of Banking and Finance 33(4): 701-708.

Meyer, P. A.; Pifer, H. 1970. Prediction of bank failures, The Journal of Finance 25(4): 853-868.

Mierswa, I.; Wurst, M.; Klinkenberg, R.; Scholz, M.; Euler, T. 2006. YALE: rapid prototyping for complex data mining tasks, in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-06), 20-23 August, 2006, Philadelphia, USA, 935-940.

Moghaddam, B. 2002. Principal manifolds and probabilistic subspaces for visual recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence 24(6): 780-788.

Ohlson, J. A. 1980. Financial ratios and the probabilistic prediction of bankruptcy, Journal of Accounting Research 18(1): 109-131.

Olson, D. L. 2004. Comparison of weights in TOPSIS models, Mathematical and Computer Modelling 40(7-8): 721-727.

Opricovic, S.; Tzeng, G. H. 2004. Compromise solution by MCDM methods: a comparative analysis of VIKOR and TOPSIS, European Journal of Operational Research 156(2): 445-455.

Paleologo, G.; Elisseeff, A.; Antonini, G. 2010. Subagging for credit scoring models, European Journal of Operational Research 201(2): 490-499.

Pearson, K. 1901. On lines and planes of closest fit to systems of points in space, Philosophical Magazine 2(6): 559-572.

Peng, Y.; Kou, G.; Shi, Y.; Chen, Z. 2008a. A multi-criteria convex quadratic programming model for credit data analysis, Decision Support Systems 44(4): 1016-1030.

Peng, Y.; Kou, G.; Shi, Y.; Chen, Z. 2008b. A descriptive framework for the field of data mining and knowledge discovery, International Journal of Information Technology & Decision Making 7(4): 639-682.

Peng, Y.; Kou, G.; Wang, G.; Shi, Y. 2011a. FAMCDM: a fusion approach of MCDM methods to rank multiclass classification algorithms, Omega 39(6): 677-689.

Peng, Y.; Kou, G.; Wang, G.; Wu, W.; Shi, Y. 2011b. Ensemble of software defect predictors: an AHPbased evaluation method, International Journal of Information Technology & Decision Making 10(1): 187-206.

Quinlan, J. R. 1993. C4.5: programs for machine learning. San Francisco: Morgan Kaufmann. 302 p.

Ravi, V.; Pramodh, C. 2008. Threshold accepting trained principal component neural network and feature subset selection: application to bankruptcy prediction in banks, Applied Soft Computing 8(4): 1539-1548.

Rosenberg, E.; Gleit, A. 1994. Quantitative methods in credit management: a survey, Operations Research 42(4): 589-613.

Ryu, Y. U.; Yue, W. T. 2005. Firm bankruptcy prediction: experimental comparison of isotonic separation and other classification approaches, IEEE Transactions On Systems, Management and Cybernetics Part A: Systems and Humans 35(5): 727-737.

Shi, Y.; Peng, Y.; Kou, G.; Chen, Z. 2005 Classifying credit card accounts for business intelligence and decision making: a multiple-criteria quadratic programming approach, International Journal of Information Technology & Decision Making 4(4): 1-19.

Shlens, J. 2005. A tutorial on principal component analysis. Institute for Nonlinear Science. 13 p.

Shumway, T. 2001. Forecasting bankruptcy more accurately: a simple hazard model, The Journal of Business 74(1): 101-124.

Sobehart, J.; Keenan, S.; Stein, R. 2001. Benchmarking quantitative default risk models: a validation methodology, ALGO Research Quarterly 4(1/2): 57-72.

Srinivasan, V.; Kim, Y. H. 1987. Credit granting: a comparative analysis of classification procedures, The Journal of Finance XLII(3): 665-681.

Stein, R. M. 2002. Benchmarking default prediction models: pitfalls and remedies in model validation, Journal of Risk Model Validation 1(1): 77-113.

Tam, K. Y.; Kiang, M. Y. 1992. Managerial applications of neural networks: the case of bank failure predictions, Management Science 38(7): 926-947.

Tsai, C. 2009. Feature selection in bankruptcy prediction, Knowledge-Based Systems 22(2): 120-127.

Tseng, F.-M.; Lin, L. 2005. A quadratic interval logit model for forecasting bankruptcy, Omega 33: 85-91.

Zavadskas, E. K.; Turskis, Z. 2011. Multiple criteria decision making (MCDM) methods in economics: an overview, Technological and Economic Development of Economy 17(2): 397-427.

Varetto, F. 1998. Genetic algorithm applications in the analysis of insolvency risk, Journal of Banking and Finance 22(10-11): 1421-1439.

Verikas, A.; Kalsyte, Z.; Bacauskiene, M.; Gelzinis, A. 2010. Hybrid and ensemble-based soft computing techniques in bankruptcy prediction: a survey, Soft Computing 14(9): 995-1010.

Wiginton, J. C. 1980. A note on the comparison of logit and discriminant models of consumer credit behavior, Journal of Financial and Quantities Analysis 15(3): 757-770.

Witten, I. H.; Frank, E. 2005. Data mining: practical machine learning tools and techniques. 2nd edition. San Francisco: Morgan Kaufmann. 560 p.

Zanakis, S. H.; Solomon, A.; Wishart, N.; Dublish, S. 1998. Multi-attribute decision making: a simulation comparison of select methods, European Journal of Operational Research 107(3): 507-529.

Zhang, D.; Zhou, L. 2004. Discovering golden nuggets: data mining in financial application, IEEE Transactions on Systems, Man, and Cybernetics 34(4): 513-522.

Gang KOU (a), Yi PENG (b), Chen LU (b)

(a) School of Business Administration, Southwestern University of Finance and Economics, Chengdu, China

(b) School of Management and Economics, University of Electronic Science and Technology of China, Chengdu, 610054, P. R. China

Received 25 March 2013; accepted 10 November 2013

Corresponding author Yi Peng


Gang KOU is a Professor and Executive Dean of School of Business Administration, Southwestern University of Finance and Economics. He is the Managing Editor of International Journal of Information Technology & Decision Making and series Editor of Quantitative Management (Springer). Previously, he was a Professor ofSchool of Management and Economics, University of Electronic Science and Technology of China, and a Research Scientist in Thomson Co., R&D. He received his PhD in Information Technology from the College of Information Science & Technology, University of Nebraska at Omaha; got his Master's degree in Department of Computer Science, University of Nebraska at Omaha; and BS degree in Department of Physics, Tsinghua University, Beijing, China. He has participated in various data mining projects, including data mining for software engineering, network intrusion detection, health insurance fraud detection and credit card portfolio analysis. He has published more than eighty papers in various peer-reviewed journals and conferences. He has been Keynote speaker/workshop chair in several international conferences. He co-chaired Data Mining contest on The Seventh IEEE International Conference on Data Mining 2007 and he is the Program Committee Co-Chair of the 20th International Conference on Multiple Criteria Decision Making (2009) and NCM 2009: 5th International Joint Conference on INC, ICM and IDC. He is also co-editor of special issues of several journals, such as Journal of Multi Criteria Decision Analysis, Decision Support Systems, Journal of Supercomputing and Information Sciences.

Yi PENG is a Professor of School of Management and Economics, University of Electronic Science and Technology of China. She received her PhD in Information Technology from the College of Information Science & Technology, University of Nebraska at Omaha and got her Master's degree in Department of Info. Science & Quality Assurance, University of Nebraska at Omaha and BS degree in Department of Management Information Systems, Sichuan University, China. Her research interests cover knowledge discover in database and data mining, multi-criteria decision making, data mining methods and modelling, knowledge discovery in real-life applications.

Chen LU got his Master's degree in Management Science from School of Management and Economics, University of Electronic Science and Technology of China.
Table 1. Comparison of subsample characteristics

Variable or ratio        Normal loans           Defaulted loans

Total assets

maximum                  RMB.26,854,906,969     RMB.1,792,017,755.50
Mean                     RMB.559,394,670.58     RMB.87,387,234.37
minimum                  RMB.640.1              RMB.1827.6
S.D.                     RMB.2,447,286,504.76   RMB.183,269,496.1
Total revenues
maximum                  RMB.16,993,642,023     RMB.1,620,347,617
Mean                     RMB.333,414,922.10     RMB.44,486,523.13
minimum                  RMB.-1,631,589.70      RMB.-335,266.87
S.D.                     RMB.1,104,160,852      RMB.159,020,842.9
Net mcome/Total assets
Mean                     2.915901569            -2.104
S.D.                     43.47841237            47.665
Cash/Current Ha^Hties
Mean                     0.354                  0.102
S.D.                     4.117                  0.184

Table 2. Classification results on the PCA test dataset

              Accuracy   Type-I errors   Type-II errors   AUC

Naive Bayes   0.4674     0.0642          0.6977           0.75
Logistic      0.8655     0.4818          0.012            0.9
K-NN          0.9667     0.0771          0.0178           0.96
C4.5          0.9564     0.1017          0.0231           0.95
CART          0.9522     0.1176          0.0231           0.95

Table 3. Classification results on the ICA test dataset

              Accuracy   Type-I errors   Type-II errors   AUC

Naive Bayes   0.3897     0.0536          0.8066           0.64
Logistic      0.7633     0.8791          0.0101           0.79
K-NN          0.9665     0.0902          0.0135           0.97
C4.5          0.9342     0.164           0.0311           0.92
CART          0.9266     0.1922          0.0315           0.91

Table 4. Comparison of classification results using
the PCA and ICA datasets

      Accuracy   Type-I errors   Type-II errors   AUC

PCA   0.836933   0.256317        0.130133         0.86
ICA   0.786617   0.396417        0.148817         0.788333

Table 5. Ranking of classifiers by TOPSIS

              TOPSIS(PCA)   TOPSIS(ICA)

Naive Bayes     0.4169        0.3992
Logistic        0.681         0.6204
K-NN            0.9831        0.9752
C4.5            0.9609        0.9264
CART            0.9478        0.9111
COPYRIGHT 2014 Vilnius Gediminas Technical University
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2014 Gale, Cengage Learning. All rights reserved.

Article Details
Printer friendly Cite/link Email Feedback
Author:Kou, Gang; Peng, Yi; Lu, Chen
Publication:Technological and Economic Development of Economy
Article Type:Report
Geographic Code:9CHIN
Date:Jun 1, 2014
Previous Article:Applying fuzzy MCDM for financial performance evaluation of Iranian companies.
Next Article:Criticality assessment of energy infrastructure.

Terms of use | Privacy policy | Copyright © 2020 Farlex, Inc. | Feedback | For webmasters