Data mining and customer relationship marketing in the banking industry.

Advances in computer hardware and data mining software have made data mining accessible and affordable to many businesses. Hence, it is no surprise that data mining has gained widespread attention and increasing popularity in the commercial world in recent years. Data mining provides the technology to analyse large volumes of data and/or detect hidden patterns in data to convert raw data into valuable information. This paper discusses the potential usefulness of data mining for customer relationship management (CRM) in the banking industry. First, the paper introduces the CRM concept and summarises the data mining methodology and tools. Second, it discusses the data mining literature, particularly its applications in banks. Third, it illustrates a possible CRM application of data mining in banking. Finally, it suggests other potential data mining banking applications and highlights some of the limitations of data mining.

Introduction

Since the mid-1990s, three new interrelated areas that emphasise obtaining more information from data have emerged strongly in information systems and information technology. They are data warehousing, knowledge management, and data mining. Coupled with advances in both computer hardware and software, many applications are more accessible and affordable to businesses than before. This paper focuses on data mining, which aims to identify valid, novel, potentially useful and understandable correlations and patterns in data (Chung and Gray, 1999).
In particular, the paper explores the potential usefulness of data mining in banks in the area of customer relationship management (CRM). Although the paper focuses mainly on the banking industry, the issues and applications discussed are applicable to other industries as well. See, for example, Koh and Low (2001) on data mining applications in the insurance industry and Koh and Leong (2001) on data mining applications in the healthcare industry.

The remainder of the paper is organised into five sections. The first section introduces customer relationship management in general. The second section discusses the data mining methodology and tools, and the data mining literature. The third section illustrates possible banking applications and examples of data mining in the literature, both in the context of customer relationship management. The fourth section gives an illustration of how data mining can be applied to churn modelling (that is, the prediction of customer turnover) in banks. Finally, the concluding section highlights the limitations of data mining and suggests other possible CRM applications of data mining in the banking industry.

Customer Relationship Management

CRM can be defined as the process of predicting customer behaviour and selecting actions to influence that behaviour to benefit the company (Jenkins, 1999), usually leveraging information technology and database-related tools. This important concept has been given a new lease of life because of the growth of the Internet and E-businesses. CRM is crucial in the on-line business environment because face-to-face contact is impossible on the net and customer loyalty can waver easily. As gaining customer loyalty becomes the focus in an E-business environment, it is not surprising that analysts have referred to CRM services as one of the hottest enterprise services today.
Statistics from the International Data Corporation predict that worldwide revenues in this market will explode at a compound annual growth rate of 29 per cent from US$34.4 billion in 1999 to US$125.2 billion in 2004 (Chin, 2000). This projected phenomenal growth of CRM illustrates its increasing popularity among businesses.

CRM initiatives usually seek to fulfil several objectives. One of the objectives is to get closer to the customer by utilising the data "hidden" in scattered enterprise databases. Examining and analysing the data can turn raw data into valuable information about customers' needs. By predicting customer needs in advance, businesses can then market the right products to the right segments at the right time through the right delivery channels. Customer satisfaction can also be improved through more effective marketing.

Another objective of the CRM initiative is to transform the company into a customer-centric organisation with a greater focus on customer profitability as compared to line profitability. The insights gained from CRM enable companies to calculate or estimate the profitability of individual accounts. Businesses are then able to differentiate their customers correctly with respect to their profitability. From such insights, companies can build predictive churn models to retain their best customers by identifying telltale symptoms of dissatisfaction and churning.
As for less profitable accounts, efforts can be directed to switch them to lower cost/service delivery channels. Other CRM objectives include increased cross-selling possibilities, better lead management, better customer response and improved customer loyalty (Chin, 2000).

CRM and Data Mining

While the tremendous business value of customer-centric marketing and management strategies is intuitive, the implementation of CRM initiatives has only been popularised by recent developments in technology, particularly in data storage capabilities, data warehousing applications, and data mining techniques (Berry and Linoff, 1997).

Although a large part of CRM is technologically driven, it is not just about computer software and hardware. For most small businesses, CRM occurs naturally (Coyle, 1999). Customer loyalty and profitability are derived from the closely knit relationships that small community businesses have with their customers. As businesses expand, however, that degree of intimacy is no longer available. As it is not realistic or cost-effective for big corporations to know each customer individually, CRM must be achieved in an indirect manner for such organisations. They must predict the behaviour of individual customers through the available transactional, operational and other customer information they have. Data mining uses sophisticated statistical processing or artificial intelligence algorithms to discover useful trends and patterns from the extracted data.
Data mining can yield important insights including prediction models and associations that can help companies understand their customers better. Front office applications can give marketing personnel dynamic access to decision support models from different delivery channels.

Data Mining Methodology and Tools

Data mining can be considered a recently developed methodology and technology, coming into prominence in 1994 (Trybula, 1997). The SAS Institute defines data mining as the process of selecting, exploring and modelling large amounts of data to uncover previously unknown patterns of data (SAS Institute, 1998). Accordingly, data mining can be considered a process and a technology to detect the previously unknown in order to gain competitive advantage.

The SAS data mining methodology comprises the following five stages: Sample, Explore, Modify, Model, and Assess (SEMMA). Sampling is desirable if the data for analysis are too voluminous for reasonable processing time or if it is desirable to avoid problems of generalisation
by dividing the data into different sets for model construction and model validation. Exploration and modification refer to the review of data to enhance understanding of it (for example, by examining the summary measures) and the transformation of data (for example, to induce a linear relationship or a normal distribution), respectively. It is noted that not every data mining project needs sampling or modification of the data. However, exploration is usually useful and done as a form of preliminary analysis. The modelling stage is the actual data analysis. Most data mining software includes traditional statistical methods (for example, cluster analysis, discriminant analysis, and regression analysis) as well as non-traditional statistical analysis such as neural networks, decision trees, link analysis, and association analysis.
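As a rough illustration of the Sample and Modify stages described above, the following Python sketch partitions a data set into construction and validation sets and applies a log transformation to a skewed variable. The function names, the 70/30 split, the fixed seed and the income figures are all hypothetical choices made for this sketch; they are not part of the SEMMA methodology or of any SAS product.

```python
import math
import random


def partition(records, validation_fraction=0.3, seed=0):
    """Shuffle a copy of the records and split it into
    (construction, validation) sets. The fraction and seed are
    illustrative assumptions."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * (1 - validation_fraction)))
    return shuffled[:cut], shuffled[cut:]


def log_transform(values):
    """Apply log(1 + x) to non-negative values, a common way to
    reduce right skew before modelling."""
    return [math.log1p(v) for v in values]


# Hypothetical annual incomes in $'000, with a long right tail.
incomes = [18.0, 22.5, 25.0, 31.0, 40.0, 55.0, 72.0, 90.0, 150.0, 400.0]

construction, validation = partition(incomes)
transformed = log_transform(construction)
```

Keeping the validation set aside until the model has been built is what allows the later assessment to indicate how well the model generalises to unseen data.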
Finally, the assessment stage allows the comparison of models and results from any data mining model by using a common yardstick (for example, lift charts, profit charts or diagnostic classification charts).

Classification of Data Mining Tools

Data mining tools can be broadly classified based on what they can do:
* description and visualisation;
* association and clustering; and
* classification and estimation (prediction).

Description and Visualisation

Description and visualisation can contribute greatly towards understanding a data set and detecting hidden patterns in data, especially complicated data containing complex and non-linear interactions. They are usually performed before modelling is attempted and represent exploration in the SEMMA methodology. Standard description tools include summary statistics (for example, measures of central tendency and measures of dispersion) and graphical representations (for example, distributions and plots). Visualisation can be considered an enhanced graphical approach that allows user input and interaction. An example is a rotating multidimensional plot that permits the user to define the multiple dimensions (multiple variables) in the plot as well as the direction and angle of rotation to facilitate viewing complex relationships. Colours can also enhance visualisation tools. In the data mining context, description and visualisation tools can be used to understand people, products and processes and study the relationships among variables. The results from such analyses are seldom an end in themselves but are usually used as a means to construct better data mining models (to predict certain target variables).
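The summary statistics mentioned above (measures of central tendency and dispersion) can be sketched with Python's standard `statistics` module. The `describe` helper and the customer ages below are hypothetical and serve only to show the kind of output a description tool produces.

```python
import statistics


def describe(values):
    """Return basic measures of central tendency and dispersion."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),      # central tendency
        "median": statistics.median(values),  # central tendency
        "stdev": statistics.stdev(values),    # dispersion (sample)
        "min": min(values),
        "max": max(values),
    }


# Hypothetical customer ages, for illustration only.
ages = [34, 41, 45, 52, 58, 60, 63, 67]
summary = describe(ages)
```

In practice such a summary would be produced for every variable before any modelling is attempted, in line with the exploration stage of SEMMA.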

Association and Clustering

In association, the objective is to determine which variables go together. For example, market basket analysis refers to a technique that generates probabilistic statements such as: if customers purchase coffee, there is a 0.35 probability that they also purchase bread. Such information can be useful for store layout, item bundling, discount and promotion decisions, etc. Market basket analysis can be applied not only to items purchased concurrently but also to items purchased sequentially. Another data mining tool, link analysis, can also be considered an association technique. It looks at connection relationships: how people, places and things are connected (for example, call patterns in telecommunications).

In clustering, the objective is to group objects in such a way that objects belonging to the same cluster are similar and objects belonging to different clusters are dissimilar. The two most common data mining tools for clustering are cluster analysis and self-organising map (or Kohonen network). As an application, clustering can be used for market segmentation
to group consumers and customers. Clustering can also be used to generate cluster membership, which in turn can be used as an input variable in a prediction model: consumers belonging to particular clusters may be more inclined to respond to a particular mailing campaign favourably.

Classification and Estimation (Prediction)

The most common and important applications in data mining probably involve prediction. Classification refers to the prediction of a target variable that is categorical in nature (for example, predicting fraud versus non-fraud, high-risk versus low-risk, or purchaser versus non-purchaser). Estimation, on the other hand, refers to the prediction of a target variable that is metric in nature (for example, predicting the amount spent, duration of a call, or the account balance).

To construct prediction models, at least one of the following data mining tools is usually used: multiple or logistic regression, neural networks and decision trees. Logistic regression is a traditional statistical method similar to regression, except that it handles categorical target variables. Neural networks are useful for recognising patterns in the data and are modelled after the human brain, which can be perceived as a highly connected network of neurons. The objective of decision trees is estimation and/or classification by dividing observations into mutually exclusive
and exhaustive subgroups. The end product can be graphically represented by a tree-like structure. In applying the decision tree methodology, each observation is eventually assigned to a node that has a predicted value or classification. Of the three prediction models, decision trees are the most interpretable in that they can be translated into decision rules. Further, as in the case of neural networks, decision trees can be used to model complex non-linear and interaction relationships.

Data Mining and CRM Literature

According to the professional and trade literature, more companies are using data mining as the foundation for strategies that help them outsmart competitors, identify new customers and lower costs (Davis, 1999). In particular, data mining is widely used in marketing, risk management and fraud control (Kuykendall, 1999). For example, the Farmers Insurance Group mines customer information to develop competitive rates, Foote Cone & Belding analyses data mined from operational and transactional systems to refine clients' direct mailing and advertising campaigns to improve catalogue sales, and Axios Data Analysis Systems uses data mining to help identify what could be fraudulent health insurance claims for one of its clients. Other successful users of data mining include Fingerhut, American Century Investments, Charles Schwab,
Bank of America, US West, Bell Atlantic, Alltel, Wal-Mart and Boots PLC (Lach, 1999; Scholber, 1999; Stedman, 1998; Brabazon, 1997). Total spending by US banks on CRM, including technology and non-technology outlays, has been estimated to grow at a compound rate of 11 per cent (Kiesnoski, 1999).

One possible data mining application in banks is risk management, such as credit risk assessment or credit scoring. In the past, assessing credit risk (for example, in loan approval and overdraft facilities) was mostly a rule-based affair, with the rules usually derived from long-established industry norms. In the last couple of years, more accessible and easier-to-use data mining software has made it possible for powerful data mining techniques to be applied to risk assessment and other banking-related business problems (Berger, 1999). For example, a decision tree solution for credit risk assessment produces credit-scoring rules for all the accounts in a bank database, and credit jeopardy lists can be drawn up by the use of multiple database queries. This scoring or classification of high/low risk is based on the attributes of each customer account such as overdraft records, outstanding loans, history of derogatory
credit reports, account type, income levels, and other information.

Real-world examples include Corestates Bank, whose Retail Credit Information System (RCRIS) allows the bank to analyse its customer and credit portfolio accurately to reduce its credit risk and monitor high-risk accounts (Varney, 1996). Another example is the Bank of Montreal, which analyses mortgage customers' transactional history in checking, saving and other accounts for insight into customers' risk of default (Fabris, 1998). Similarly, the Bank of America's mortgage division has used data mining on customer behaviour data to estimate bad loans, so that credit risk managers can allocate optimal loan loss reserves, which affect profitability directly (Fabris, 1998).

Another possible banking application of data mining techniques is in customer acquisition. Traditionally, database marketers have made important marketing decisions based on simple one-dimensional queries (that underutilise the available data), or even on pure gut and intuition (Berger, 1999). Today, exploratory data mining methods, such as automatic cluster detection and market basket analysis, can be used to discover attributes in customer databases that predict response rates to the bank's marketing campaigns (Peacock, 1998). Attributes that are identified as campaign friendly can then be matched to new lists of non-customers in order to increase the effectiveness of the marketing campaign.
Coyle (1999) reported that, before data mining caught on several years ago, a direct mail campaign was thought to be successful if it achieved a response rate of 6 to 7 per cent. In 1998, the Canadian Imperial Bank of Commerce utilised CRM and data mining to achieve a phenomenal response rate of 47 per cent. Much of the success was attributed to targeting the right customers and being able to predict their responses. Fleet Bank also used data mining to identify the best prospects for marketing its mutual funds based on customer demographics and account data (Fabris, 1998). Another data mining application in the Bank of America's west coast customer service call centre focuses on marketing and cross-selling opportunities (Fabris, 1998). Instead of mass pitching a certain "hot" product, the bank's customer service representatives are equipped with customer profiles enriched by data mining that help them to identify which products and services are most relevant to callers.

Data mining can also be used in customer retention applications (for example, by employing churn modelling). In a typical application, data mining identifies customers who are profitable and who are likely to leave or churn. With the information, the bank can target these valuable but vulnerable customers for extra value-added customer services, special offers and loyalty incentives (Peacock, 1998). The Chase Manhattan Bank in New York
uses data mining to model customer churning (Fabris, 1998). From the data mining efforts, the Chase Manhattan Bank implemented the unusual step of reducing the required minimum balance in customers' checking accounts for two consecutive years. The result was that the proportion of profitable customers among all customers improved.

An Illustrative Data Mining Application in Banking: Churn Modelling

To set the context to illustrate a possible data mining application in banking, consider a customer retention application or churn modelling for a fictitious bank, ZBANK, which is facing increasing competitive challenges from other financial institutions. ZBANK has been encountering customer defections in its home loans, which is one of its most highly valued customer bases. As a marketing strategy, ZBANK gives its new customers in home loans lots of incentives (such as free electrical appliances and furniture vouchers). Thus, it has a comparatively higher initial cost of acquisition than its competitors. However, market dominance in this type of loan has given ZBANK a lower risk exposure due to the home mortgages and a strong strategic positioning for cross-selling other services such as future home loans or home insurance.
Besides maintaining its strategic market dominance, predicting churn likelihood is also important to ZBANK for reducing the number of new customers who defect soon after being acquired. It is noted that ZBANK has a customer database that consists of transactional and demographic information pertaining to its home loan customers.

Data and Data Mining Tools

Assume that ZBANK captures the following data: (1) customer identification [cust_num], (2) balance in the savings account [savg_acc {$'000}], (3) balance in the current account [curr_acc {$'000}], (4) balance in the investment account [invt_acc {$'000}], (5) average number of transactions per day [trans_dy], (6) mode of credit card payment [card_pay {giro, cheque, other accounts}], (7) whether there are other mortgage loans [mortg_ln], (8) whether there are credit lines [cdt_line], (9) customer age [cust_age], (10) customer gender [cust_sex], (11) customer marital status [cust_mar], (12) customer's number of children [cust_chd], (13) customer income per annum [cust_inc], (14) whether customer has more than one car [cust_car], and (15) customer churn status [cust_chn].

Assume further that the objective of the illustrative data mining application is to construct a churn prediction model to predict the probability that a current customer will churn in the next six months. The prediction will be made based on thirteen of the variables listed above (that is, from savg_acc to cust_car). The target variable (cust_chn) is captured as a multichotomous variable as follows: current customer, involuntary churn, and voluntary churn.
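One possible way to represent a ZBANK customer record in code is sketched below in Python. The variable names come from the list above; the field types, the class names `HomeLoanCustomer` and `ChurnStatus`, and the sample values are assumptions made for this sketch, since the text does not specify how each variable is stored.

```python
from dataclasses import dataclass, fields
from enum import Enum


class ChurnStatus(Enum):
    """The three levels of the target variable cust_chn."""
    CURRENT = "current customer"
    INVOLUNTARY = "involuntary churn"
    VOLUNTARY = "voluntary churn"


@dataclass
class HomeLoanCustomer:
    # Field types are assumptions; only the names come from the text.
    cust_num: str          # customer identification
    savg_acc: float        # savings balance, $'000
    curr_acc: float        # current account balance, $'000
    invt_acc: float        # investment balance, $'000
    trans_dy: float        # average transactions per day
    card_pay: str          # "giro", "cheque" or "other accounts"
    mortg_ln: bool         # other mortgage loans?
    cdt_line: bool         # credit lines?
    cust_age: int
    cust_sex: str
    cust_mar: str
    cust_chd: int          # number of children
    cust_inc: float        # income per annum
    cust_car: bool         # more than one car?
    cust_chn: ChurnStatus  # target variable


# The thirteen inputs: every field except the identifier and the target.
INPUT_VARIABLES = [
    f.name for f in fields(HomeLoanCustomer)
    if f.name not in ("cust_num", "cust_chn")
]

# A made-up record, for illustration only.
example = HomeLoanCustomer(
    "C0001", 12.5, 3.2, 4.1, 1.8, "giro", False, True,
    45, "F", "married", 2, 60.0, False, ChurnStatus.CURRENT,
)
```

Separating the identifier and the target from the input list makes it harder to feed either of them into a model by accident.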
Involuntary churn is probably the least interesting to ZBANK as it reflects mostly customers who have sold their homes within the loan period and who therefore no longer require the home loans. Voluntary churn refers to customers who defect to ZBANK's competitors and is the primary concern of the bank. Prior to developing this application, ZBANK has categorised all its existing customers into the above three groups. Also, as a routine practice, all demographic information (that is, from cust_age to cust_car) is updated every six months while transactional information (that is, from savg_acc to cdt_line) is updated in real time.

To enable the prediction model to provide early indicators so that remedial actions can be taken, a lag of six months between the target (that is, dependent) variable and input (that is, independent) variables is decided upon. That is, the input variables are collected six months prior to categorising the customers' churn status; thus, the model predicts churn six months in advance.

For predictive modelling, three data mining tools are usually appropriate; namely, logistic regression, neural network and decision tree. SPSS
Clementine, a data mining software package, is used in this illustration. The data mining diagram associated with the illustration is given in Figure 1. It is noted that description and visualisation, association and clustering, and predictive modelling are incorporated into the illustration. A snapshot of the sample data is shown in Figure 2.

Description and Visualisation Results

As mentioned earlier, description and visualisation are useful for understanding the data and in the initial modelling stage to explore patterns, trends and relationships. Several description and visualisation tools are used in the illustration. For example, descriptive statistics are derived using the Statistics and Distribution nodes in Clementine. Some results are shown in Figure 3 (for example, the mean age of home loan customers is 57.4 years, see left panel of Figure 3, and 720 or 50.7 per cent are females, see top right panel of Figure 3). Such description aids in understanding the data. To visualise the data using the Plot and Histogram nodes, a plot of customer income and customer age and a histogram showing average number of transactions per day are generated (see centre panel and middle right panel, respectively). Further, to relate the visualisation to the target variable, customer churn status is overlaid in the different graphs.
For example, the dispersion of customer, invol_chn and vol_churn among female and male customers and for each level of trans_dy is incorporated into the graphs. This preliminary assessment of the relationships can be useful for modelling purposes. In particular, the results suggest that voluntary churn is proportionately more common among female customers as well as less active customers (as measured by trans_dy). Finally, a Web graph (via the Web node in Clementine) is drawn showing the links among cust_sex, cust_mar, card_pay and cust_chn (see bottom right panel in Figure 3). Stronger relationships are shown by stronger lines. Links below a threshold level (as defined by the user) are not included in the web graph (for example, between invol_chn and the selected input variables). The web graph suggests that existing customers (that is, non-churners) tend to be those who are married and male and those who make their credit card payments with other accounts. It is noted that the customer churn status lags the input variables by six months, as discussed earlier.

Association and Clustering Results

To further understand the home loan customers, clustering can be performed. The results obtained from running the TwoStep clustering node are summarised in Figure 4. As shown, the customers seem to fall into seven natural clusters. The cluster profile and characteristics generated can help to define and understand each cluster as well as the differences among clusters. For example, comparing Cluster 1 and Cluster 4, Cluster 1 consists only of female customers, who are relatively younger, mostly married (92.2 per cent) and have relatively higher annual income.
In comparison, Cluster 4 consists only of male customers, who are relatively older (by about 5 years on average), of whom about 59.8 per cent are married, and who have relatively lower annual income (by almost $4,000 on average). Clustering results are very useful for market profiling and segmentation studies but are less relevant for predictive modelling.

In this illustration, association analysis is used to generate rules that indicate the relationship between the input variables and the target variable. These rules are important not only for discovering patterns, relationships and trends but also for predictive modelling (for example, deciding which input variables to include in or exclude from the model). The GRI (generalised rule induction) node in Clementine is used to perform association analysis and the results are summarised in Figure 5. To interpret the results, the first association rule indicates that there are 156 (or 11.0 per cent of) home loan customers whose balance in their investment account is less than $4,988; among these, 81.0 per cent are involuntary churners.
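The figures quoted for the first rule can be checked with simple arithmetic. The sketch below is illustrative only: the sample size of 1,420 is an assumption inferred from the earlier statement that 720 female customers make up 50.7 per cent of the sample, and the function name `rule_summary` is hypothetical rather than anything provided by Clementine.

```python
def rule_summary(n_total, n_matching_condition, accuracy):
    """Return (coverage, approximate count of customers the rule classifies correctly).

    coverage = share of all customers who satisfy the rule's condition.
    """
    coverage = n_matching_condition / n_total
    n_correct = round(n_matching_condition * accuracy)
    return coverage, n_correct


# Assumption: total sample size inferred from 720 / 0.507, i.e. about 1,420.
N_CUSTOMERS = 1420

# First rule: 156 customers with an investment balance below $4,988,
# of whom 81.0 per cent are involuntary churners.
coverage, n_churners = rule_summary(N_CUSTOMERS, 156, 0.81)
print(f"coverage = {coverage:.1%}, approx. involuntary churners = {n_churners}")
# coverage = 11.0%, approx. involuntary churners = 126
```

Under that assumed sample size, the computed coverage agrees with the 11.0 per cent reported for the rule.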
Similarly, the third association rule indicates that there are 198 (or 13.9 per cent of) home loan customers whose balance in their current account is more than $1,017; among these, 81.0 per cent are voluntary churners. The other association rules can be interpreted similarly. The association rules show how the transactional and demographic information is associated with customer churn status. It is noted that customer churn status lags the input variables by six months.

Predictive Modelling Results

In this illustrative data mining application for ZBANK, predictive modelling is the most important analysis. In particular, logistic regression, neural network and decision tree can be used to model customer churn in home loans. Before performing predictive modelling, the sample data is partitioned into a construction/training sample (approximately 75 per cent) and a validation/test sample (approximately 25 per cent). Extracts from the two samples are shown in Figure 6.

Figures 7 and 8 show portions of the logistic regression, neural network and decision tree results derived from the Logistic Regression, Train Net and Build C5.0 nodes in Clementine. As can be seen, the logistic regression model is statistically significant and has a Chi-square goodness-of-fit p-value of 1.000, indicating a good fit to the data (see Figure 7). In addition, the following input variables are statistically significant in predicting customer churn status at the 0.05 level of significance: savg_acc (p-value = 0.000), curr_acc (p-value = 0.000), cust_age (p-value = 0.002), cust_inc (p-value = 0.033) and cust_sex (p-value = 0.000). Figure 8 shows that the neural network model has 15 neurons in the input layer, five neurons in the hidden layer, and three neurons in the output layer. In addition, the five most important input variables are (in descending order of importance): curr_acc, cust_chd, savg_acc, invt_acc and cust_mar.
Finally, the decision tree model shows a relatively simple decision tree with four terminal nodes and only three important input variables (in descending order of importance): invt_acc, cust_sex and cust_age. A graphical representation of the decision tree model is given in Figure 9.

That each prediction model is significant can be seen from the lift charts generated with the Evaluation node in Figure 10 (for logistic regression, decision tree and neural network, from left to right). The lift charts plot the cumulative lift value against percentiles of the sample (in this case, the construction/training sample). The benchmark (that is, the threshold for evaluating each model) is a lift of 1, which corresponds to the success rate of "hitting" existing customers when records are selected at random from the sample. The lift value measures how much more successful (that is, accurate) the prediction model is in "hitting" existing customers when the records are instead ranked in descending order of the predicted probability that a record is an existing customer. As can be seen in Figure 10, the lift value for each model is above the benchmark of 1, converging to 1 at the 100th percentile. Hence, it can be concluded that each of the prediction models is significant in that it can predict the target variable (at least existing customers versus non-existing customers) with significant accuracy.

It can be noted that the prediction models obtained from logistic regression, neural network and decision tree are not identical. Hence, it is important to compare the performance of the three different models not only on the construction/training sample but also (and more importantly) on the validation/test sample.
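The cumulative lift described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the computation performed by Clementine's Evaluation node: the function name `cumulative_lift`, the chosen fractions and the ten toy records are all invented for the example.

```python
def cumulative_lift(scores, is_hit, fractions=(0.2, 0.5, 1.0)):
    """Cumulative lift at each fraction of records, ranked by descending score.

    lift = (hit rate among the top-ranked records) / (hit rate in the whole sample)
    """
    ranked = sorted(zip(scores, is_hit), key=lambda pair: pair[0], reverse=True)
    base_rate = sum(is_hit) / len(is_hit)
    lifts = {}
    for frac in fractions:
        k = max(1, round(frac * len(ranked)))  # number of top-ranked records
        top_rate = sum(hit for _, hit in ranked[:k]) / k
        lifts[frac] = top_rate / base_rate
    return lifts


# Toy data (made up): predicted probability of being an existing customer,
# and whether the record really is one (1) or not (0).
scores = [0.95, 0.90, 0.85, 0.80, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
is_hit = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]

lifts = cumulative_lift(scores, is_hit)
print(lifts)  # {0.2: 2.0, 0.5: 1.6, 1.0: 1.0}
```

As in Figure 10, the lift exceeds the benchmark of 1 for the top-ranked records and falls to exactly 1 once every record is included.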
For these prediction models, the best way to evaluate their comparative performance is probably to look at the accuracy rates of the models in predicting the target variable (that is, customer churn status). For the purpose of this illustration and for simplicity, it is assumed that the overall accuracy rate is the evaluation criterion for comparing the performance of the different prediction models. The results (that is, classification tables) are captured in Figure 11. As shown in the left panel of Figure 11, predictions of the decision tree model ($C-cust_chn) are most accurate at an overall accuracy rate of 81.6 per cent, followed by those of the logistic regression model ($L-cust_chn; overall accuracy rate = 80.0 per cent) and the neural network model ($N-cust_chn; overall accuracy rate = 77.9 per cent). Hence, based on the evaluation criterion, the decision tree model is the best (or champion) prediction model and should be used for predicting churn in the home loans of ZBANK.

It is also noted that a decision tree model is easy to interpret, as evidenced by the simple rules reflected in Figure 9. In particular, the results indicate that home loan customers in ZBANK who churn voluntarily are likely to be female customers above the age of 39 who have more than $4,976 in their investment accounts. Note that the target variable lags the input variables by six months.

From the results presented so far, it is expected that the decision tree churn prediction model can add value by more accurately identifying churners and non-churners given their transactional and demographic information. This being the case, the decision tree model can be used to help ZBANK identify which customers are inclined to churn voluntarily. ZBANK can then offer them incentive packages or take other preventive actions.
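To make the idea of flagging at-risk customers concrete, the voluntary-churn rule quoted above can be written as a small function. This is only a sketch: it encodes the single rule stated in the text, not the full four-leaf tree of Figure 9, and the coding of cust_sex as "F"/"M", the function name and the three customer records are assumptions made for illustration.

```python
def flags_voluntary_churn(cust_sex, cust_age, invt_acc):
    """Return True if a customer matches the voluntary-churn rule quoted in the text.

    Assumptions (not from the source): cust_sex is coded "F"/"M" and
    invt_acc is the investment account balance in dollars.
    """
    return cust_sex == "F" and cust_age > 39 and invt_acc > 4976


# Hypothetical customers, invented purely for illustration.
customers = [
    {"id": "A", "cust_sex": "F", "cust_age": 45, "invt_acc": 6200},
    {"id": "B", "cust_sex": "M", "cust_age": 52, "invt_acc": 9000},
    {"id": "C", "cust_sex": "F", "cust_age": 35, "invt_acc": 7500},
]

at_risk = [
    c["id"]
    for c in customers
    if flags_voluntary_churn(c["cust_sex"], c["cust_age"], c["invt_acc"])
]
print(at_risk)  # ['A']
```

Because the inputs are collected six months before churn status is observed, a customer flagged in this way would be one expected to churn voluntarily about six months later.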
Similarly, the churn model can help identify which low-churn-risk home loan applicants to acquire. Using data mining terminology, the decision tree model can be deployed by using it to score existing customers and new home loan applicants.

Finally, it should be pointed out that for this illustration, the model's overall classification accuracy rate is computed in a relatively simple manner. In practice, it is appropriate to consider also the misclassifications and their relative costs as well as the relative proportion of churners and non-churners in both the sample and the population; see, for example, Koh (1992).

Conclusion

In recent years, data mining has gained widespread attention and increasing popularity in the commercial world. Successful data mining applications in the Farmers Insurance Group, Foote Cone & Belding, Axios Data Analysis Systems, Fingerhut, American Century Investments, Charles Schwab & Company, Chase Manhattan Bank, Bank of America, US West, Bell Atlantic, Alltel, Wal-Mart and Boots PLC have been reported. It is thus not surprising that recent surveys found that data mining had grown in usage and effectiveness. Professional bodies (see Freedman, 1997; AICPA, 1999) have also identified data mining as an important technology for the twenty-first century. This paper looks at the potential usefulness of data mining in the banking industry and presents an illustrative application focused on churn modelling. The data mining methodology and its tools are also discussed and the data mining and CRM literature summarised.
Other Potential Data Mining Applications in the Banking Industry

Besides churn modelling, there are other potential data mining applications for banks. For example, data mining can be used to: (1) construct credit scoring models to assess the credit risk of loan applicants or credit card applicants, or (2) construct fraud detection models to give early warning signals of possible fraudulent transactions. Further, it can be used to: (3) understand consumers and customers better (for example, via market basket analysis), or (4) segment customers (for example, via clustering). The findings can then be used, say, to prepare mail catalogues and to target advertisement and promotion campaigns. Finally, data mining can also be used to (5) construct models to predict the probability of purchasing certain products or services in order to facilitate cross-selling or up-selling.

Limitations of Data Mining

It is appropriate in this concluding section to highlight some limitations of data mining. First, a sufficiently exhaustive mining of data will certainly throw up patterns of some kind that are a product of random fluctuations (Hand, 1998). This is especially so for large data sets with many variables. Hence, many interesting and/or significant patterns and relationships found in data mining may not be useful. Second, from a statistical perspective, while data mining is well developed for modelling, it is not as well developed for effect assessment. Murray (1997) and Hand (1998) have warned against using data mining for data dredging or fishing (that is, trawling
through data in the hope of identifying patterns) because of the statistical problems involved. Third, successful application of data mining requires the user to be knowledgeable in the domain area of application as well as in the data mining methodology and tools. Without sufficient knowledge of data mining, the user may not be aware of, or be able to avoid, the pitfalls of data mining (see, for example, McQueen and Thorley, 1999). Collectively, the data mining team should possess the following: domain knowledge, statistical and research expertise, and IT and data mining knowledge and skills. Finally, businesses developing data mining applications also need to make a substantial investment of their resources (that is, time and effort) in data mining. It should be borne in mind that data mining projects can fail for a variety of reasons (for example, lack of management support, unrealistic user expectations, poor project management and inadequate data mining expertise).

To conclude, there is no doubt that data mining is potentially useful in the banking industry. It is envisaged that the bank that can realise the potential usefulness of data mining in transforming raw data into valuable information will gain an important strategic advantage and competitive edge over its rivals.

References

American Institute of Certified Public Accountants (AICPA) (1999). "Top 10 technologies--plus 5 for tomorrow". Journal of Accountancy, 187(5): 16-17.
Berger C (1999). "Data mining to reduce churn". Target Marketing, 22(8): 26-28.
Berry MJA and GS Linoff (1997). Data Mining Techniques: For Marketing, Sales, and Customer Support. John Wiley.
Brabazon T (1997). "Data mining: A new source of competitive advantage?" Accountancy Ireland, 29(3): 30-31.
Chin J (2000). "It's important to do it well". Straits Times--Computer Times, 8 Nov, 2000: 14-16.
Chung HM and P Gray (1999). "Data mining". Journal of Management Information Systems, 16(1): 11-13.
Coyle T (1999). "Finding your best customers". America's Community Banker, 8(9): 26-29.
Davis B (1999). "Data mining transformed". Informationweek, 751: 86-88.
Decker P (1998). "Data mining's hidden dangers". Banking Strategies, 74(2): 6-14.
Fabris P (1998). "Advanced Navigation". CIO, 11(15): 50-55.
Freedman J (1997). "IIA announces 1997 research priorities". Management Accounting, 78(1): 65-66.
Hand DJ (1998). "Data mining: Statistics and more?" The American Statistician, 52(2): 112-118.
Jenkins D (1999). "Customer relationship management and the data warehouse". Call Center Solutions, 18(2): 88-92.
Kiesnoski K (1999). "Customer relationship management". Bank Systems & Technology, 36(2): 30-34.
Koh HC (1992). "The sensitivity of optimal cut-off points to misclassification costs of Type I and Type II errors in the going-concern prediction context". Journal of Business Finance & Accounting, 19(2): 187-197.
Koh HC and SK Leong (2001). "Data Mining Applications in the Context of Casemix". Annals, Academy of Medicine (Singapore), 30(4, Supplement): 41-49.
Koh HC and CK Low (2001). "Using data mining in insurance companies". Singapore International Insurance and Actuarial Journal, 4(2): 51-62.
Kuykendall L (1999). "The data-mining toolbox". Credit Card Management, 12(6): 30-40.
Lach J (1999). "Data mining digs in". American Demographics, 21(7): 38-45.
McQueen G and S Thorley (1999). "Mining fool's gold". Financial Analysts Journal, 55(2): 61-72.
Murray LR (1997). "Lies, damned lies and more statistics: The neglected issue of multiplicity in accounting research". Accounting and Business Research, 27(3): 243-258.
Peacock PR (1998). "Data mining in marketing: Part 1". Marketing Management, 6(4): 8-18.
SAS Institute (1998). From Data to Business Advantages: Data Mining, The SEMMA Methodology and SAS Software. SAS Institute: Cary, North Carolina.
Sargeant A and J McKenzie (1999). "The lifetime value of donors: Gaining insight through CHAID". Fund Raising Management, 30(1): 22-27.
Schober D (1999). "Data detectives". Telephony, 237(9): 20-24.
Stedman C (1998). "Data mining despite the dangers". Computerworld, 32(1): 61-62.
Trybula WJ (1997). "Data mining and knowledge discovery". Annual Review of Information Science and Technology, 32: 197-229.