
Data mining in telecommunications: case study of cluster analysis.


Many industries would not have flourished without the support of information and communication technology (ICT). The telecommunication industry uses ICT both to provide telecommunication services and to support its business processes. This support takes the form of: (1) transaction information systems, which follow regular business activities and generate standardized reports, and (2) decision support systems, which enable intelligent use of the data stored in databases with the aim of making quality decisions. Data mining is a part of decision support and has many applications in the field of telecommunications. The most frequent ones are the following: telecommunication market analysis (Costea, 2006), preventing clients from switching to other companies (Lejeune, 2001; Hung et al., 2006), sale of additional services to existing customers (Malabocchia et al., 1998), assessment of client value (Daskalaki et al., 2003), and market segmentation.

When telecommunication companies segment the industrial market, the most frequently used variables are the location of the user and the amount of revenue realized from the sale of telecommunication services. The aim of this paper is to present a case study on the segmentation of the industrial market in a telecommunication company by means of cluster analysis. Data on business users were used as a sample, and an approach of dynamic market microsegmentation based on the data for each individual client is suggested.


The objective of a cluster analysis is to discover whether the members of a dataset can be classified as belonging to one of a small number of types. This is especially important for marketing managers who want to discover what constitutes a market segment in a telecommunication company.

Cluster analysis is conducted with the aim of assigning data points (sequences) to reasonably homogeneous groups (clusters). The main task in cluster analysis is to determine how many clusters are to be used (Cattrell, 1998). If the number of clusters is too high, the dissimilarity within each cluster will be low, but the clusters might be very specific, so the result of such an analysis cannot be easily interpreted and generalized. If the number of clusters is too low, the dissimilarity within each cluster will be high and such clusters cannot produce new and useful information. Nevertheless, a decision has to be made on how many clusters will be used.

To describe the discovery of market segments in databases, a case study involving a telecommunication operator is used. This allows us to present the segmentation modalities used so far as well as the proposed modality, which is based on the discovery of market segments in databases. We analyse the segmentation of the industrial market. The telecommunication operator from the case study uses a basic market segmentation with two demographic criteria: location and the size of the user (the total annual revenue from the user). The market of the Republic of Croatia is divided into four geographic regions. The industrial market is divided into five market segments based on the users' size, measured by the total annual revenue gained. The market segmentation is carried out once a year.

A calendar year is too long a period for static segments to remain valid. In the course of a year, a large number of legal subjects register with the company, which means a large number of new users of telecommunication services in both the private and the business sector. Additionally, the market for new services is very dynamic: new services are offered and some existing ones lose their importance. Users buy new services and new solutions and thus change their position towards the telecommunication operator. An approach to industrial market segmentation that changes only once per calendar year is not dynamic enough to capture either the changes in the business activities of business subjects or the changes in the telecommunications market.

An analysis that draws on variables beyond the location and the size of the user (measured by total revenue) will be presented. The analysis is based on the following variables: (1) total telecommunications revenue from the users, (2) coefficient of revenue size from users, (3) potential of the user's branch of economic activity, (4) ICT potential, (5) compactness of the relationship between a user and the telecommunication operator and (6) loyalty coefficient. A database of 2000 business users was analysed.
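The trade-off in choosing the number of clusters described above can be illustrated with a small sketch. The source does not name the clustering algorithm or software that was used, so plain k-means is swapped in here as an assumption, and the data are synthetic two-variable points rather than the operator's users. The names `k_means`, `within_cluster_ss` and `users` are hypothetical. The sketch prints the within-cluster sum of squares (a simple measure of dissimilarity within clusters) for several values of k.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
        for p in points
    ]


def k_means(points, k, iterations=100, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in, not the paper's method)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                # New centroid is the coordinate-wise mean of the members.
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


def within_cluster_ss(points, centroids, labels):
    """Total squared distance of points to their assigned centroid."""
    return sum(squared_distance(p, centroids[label]) for p, label in zip(points, labels))


# Synthetic data (assumption): 100 "users" around four made-up profile centres.
rng = random.Random(42)
centers = [(1.0, 1.0), (1.0, 4.0), (4.0, 1.0), (4.0, 4.0)]
users = [(rng.gauss(cx, 0.3), rng.gauss(cy, 0.3)) for cx, cy in centers for _ in range(25)]

wcss_by_k = {}
for k in range(1, 7):
    centroids, labels = k_means(users, k)
    wcss_by_k[k] = within_cluster_ss(users, centroids, labels)
    print(k, round(wcss_by_k[k], 2))
```

With few clusters the within-cluster dissimilarity is high; adding clusters lowers it, but each extra cluster describes a narrower group. Because k-means depends on its starting points, the values for a given k can differ between seeds.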


A cluster analysis with four clusters was performed, whereby the two previously mentioned variables were omitted.

Cluster 1 contains companies that have an average compactness of the relationship, very low revenue and low ICT potential. Cluster 2 represents companies with high compactness of the relationship, high revenue and average ICT potential. Cluster 3 includes companies with low ICT potential, low compactness of the relationship and low revenue. Cluster 4 contains companies with the highest revenue but low ICT potential and low compactness of the relationship (Table 1).
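To make the cluster profiles easier to compare, the averages reported in Table 1 can be ranked per variable. The following is a minimal sketch: the numbers are copied from Table 1, while the names `table_1` and `rank_clusters` are hypothetical and not part of the original study.

```python
# Average values per cluster, copied from Table 1 (Cluster 1 .. Cluster 4).
table_1 = {
    "Coefficient of the revenue size": [0.90, 3.93, 1.51, 4.10],
    "ICT potential": [2.04, 2.78, 1.27, 1.67],
    "Compactness of the relationship": [3.13, 3.86, 0.38, 2.36],
}


def rank_clusters(values):
    """Return 1-based cluster numbers ordered from highest to lowest average."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return [i + 1 for i in order]


rankings = {name: rank_clusters(values) for name, values in table_1.items()}

for name, ranking in rankings.items():
    print(f"{name}: " + " > ".join(f"Cluster {c}" for c in ranking))
```

Such a ranking only orders the cluster averages; labels such as "low" or "average" in the text remain an interpretation by the analysts.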

To determine further how the identified clusters differ from each other, descriptive statistics are presented: median values and standard deviations were calculated for the Internet revenue and the fixed telephony revenue of the companies in each cluster. The data showed that clusters with higher median values of the variables used in the cluster analysis also have higher average values of Internet revenue and fixed telephony revenue, and vice versa. Thus, the companies from Cluster 2, which have the highest average values of the variables (coefficient of the revenue size, ICT potential, compactness of the relationship), also have the highest average Internet revenue and fixed telephony revenue. The analysis of variance (ANOVA) showed that the differences in average values between clusters are statistically significant for both Internet revenue (p-value = 0.000) and fixed telephony revenue (p-value = 0.000). This holds for both groups of revenue at the 0.1 probability level. To determine between which clusters a statistically significant difference exists, a post-hoc analysis by means of the Scheffe test was performed. For Internet revenue, there is a statistically significant difference for all pairs of Cluster 1 and the other clusters at the 0.1 probability level. For fixed telephony revenue, a statistically significant difference exists for all pairs at the 0.1 probability level except for Cluster 3 and Cluster 4. The analysis of variance and the Scheffe post-hoc analysis showed that the cluster analysis is acceptable and that it resulted in determining market segments of the analysed telecommunication operator.
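The two tests mentioned above can be sketched in a few lines. This is a minimal illustration using only the standard library: the revenue values are made up (the original data are not published in the paper), and the names `one_way_anova` and `scheffe_statistic` are hypothetical. The sketch computes the one-way ANOVA F statistic and, for each pair of clusters, the Scheffe statistic, which would then be compared with a critical value of the F distribution with (k - 1, N - k) degrees of freedom taken from a table or a statistics library.

```python
from itertools import combinations


def mean(values):
    return sum(values) / len(values)


def one_way_anova(groups):
    """Return (F, df_between, df_within, ms_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_within = ss_within / df_within
    f_statistic = (ss_between / df_between) / ms_within
    return f_statistic, df_between, df_within, ms_within


def scheffe_statistic(group_i, group_j, ms_within, df_between):
    """Scheffe F statistic for the difference between two group means."""
    difference = mean(group_i) - mean(group_j)
    scale = ms_within * (1 / len(group_i) + 1 / len(group_j)) * df_between
    return difference ** 2 / scale


# Made-up revenue values per cluster (assumption, arbitrary units).
revenue_by_cluster = {
    1: [1.0, 2.0, 3.0],
    2: [2.0, 3.0, 4.0],
    3: [5.0, 6.0, 7.0],
    4: [5.0, 6.0, 7.0],
}

groups = list(revenue_by_cluster.values())
f_statistic, df_between, df_within, ms_within = one_way_anova(groups)
print("ANOVA F:", f_statistic, "df:", (df_between, df_within))

scheffe = {}
for a, b in combinations(revenue_by_cluster, 2):
    scheffe[(a, b)] = scheffe_statistic(
        revenue_by_cluster[a], revenue_by_cluster[b], ms_within, df_between
    )
    print(f"Cluster {a} vs Cluster {b}:", scheffe[(a, b)])
```

In this toy data the last two groups have identical means, so their Scheffe statistic is zero. A pair that does not differ significantly in the study, such as Cluster 3 and Cluster 4 for fixed telephony revenue, would similarly show a statistic below the critical value.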

The experts in the telecommunication company interpreted the determined segments in the following way:

Cluster 1 represents companies with a very low coefficient of the revenue size. These companies spend less than KN 10,000.00 annually on telecommunication services. The data suggest that these companies have low ICT potential. ICT potential points to companies that might need additional telecommunication solutions in the future. Cluster 1 companies also have an average level of compactness of the relationship with the telecommunication operator, and they have been clients of this operator for quite some time. This cluster might therefore be named SOHO (small office, home office).

Cluster 2 includes companies with a high level of compactness of the relationship and of ICT potential and a somewhat lower level of revenue. It is undoubtedly the most profitable market segment and the one to which the most attention should be paid. These companies are steady clients that will most probably need to expand their business, and they can be named LA (large account).

Cluster 3 represents companies with extremely low ICT potential and compactness of the relationship, and with revenue only slightly above the lowest. It is the least rewarding market segment and tends to transfer to the competition. These companies have not been the operator's clients for long and do not need to develop their own ICT. The best name for this market segment could be SI (Silver).

Cluster 4 represents companies with the highest revenue but, at the same time, low ICT potential and low compactness of the relationship. This group could be named SME (small and medium enterprises).


Modern information and communication systems enable the storage of large volumes of transaction data. By mining transaction data, it is possible to gain new knowledge about the users of a company's products, services and solutions. This knowledge should be applied to determine users' habits and to form effective market segments characterized by similar consumer habits. A particular value of this case study lies in the elaboration of a segmentation model based on knowledge gained from the databases of a Croatian telecommunication operator. It is a leading regional information and communication company which, at the moment, does not implement market segmentation using information from its own and external databases but uses the common approach to segmentation based on location and the amount of revenue from telecommunication services invoiced to individual users. The study has shown that market segmentation has to be based on thorough knowledge of users and their habits and on recording all interactions with a user. The stored data can be used for data mining, which yields new knowledge about users' habits and inclinations and enables the forming of effective market segments. A targeted approach to individual market segments results in a significant competitive advantage. By using cluster analysis as the proposed market segmentation model, exceptionally attractive market segments were created. The model enables the company to manage the profitability and loyalty of each user. We have to be aware of a limitation of this research: there is no single correct number of clusters, and a decision had to be made on how many clusters to use. This model of market segmentation shows the importance of effective and interactive market segmentation, which will increase the company's competitiveness in the conditions of ever-growing globalization.

Future studies should aim at applying other statistical methods and techniques, as well as methods of artificial intelligence, in the field of market segmentation.


Cattrell, R.B. (1998). The Scientific Use of Factor Analysis in the Behavioural and Life Sciences, Plenum Press, ISBN: 0306309394, New York, USA

Costea, A. (2006). The Analysis of the Telecommunication Sector by the Means of Data Mining Techniques. Journal of Applied Quantitative Methods, Vol. 1, No. 2, (December, 2006) pp. 144-150, ISSN: 1842-4562

Daskalaki, S.; Kopanas, I.; Goudara, M. & Avouris, N. (2003). Data mining for decision support on customer insolvency in telecommunications business. European Journal of Operational Research, Vol. 145, No. 2, (March, 2003) pp. 239-255, ISSN: 0377-2217

Hung, S.; Yen, D.C. & Wang, H. Y. (2006). Applying data mining to telecom churn management. Expert Systems with Applications, Vol. 31, No. 3, (October, 2006) pp. 515-524, ISSN: 0957-4174

Lejeune, M.A.P.M. (2001). Measuring the impact of data mining on churn management. Internet Research, Vol. 11, No. 5, (December, 2001) pp. 375-387, ISSN: 1066-2243

Malabocchia, G.; Buriano, L.; Mollo, M.J.; Richeldi, M. & Rossotto, M. (1998). Mining telecommunications data bases: an approach to support the business management, Available from: Network Operations and Management Symposium, 1998. NOMS 98., IEEE, Accessed:1998-02-15

Tab. 1. Average values of the variables from individual clusters

Variable                          Cluster 1  Cluster 2  Cluster 3  Cluster 4
Coefficient of the revenue size        0.90       3.93       1.51       4.10
ICT potential                          2.04       2.78       1.27       1.67
Compactness of the relationship        3.13       3.86       0.38       2.36

COPYRIGHT 2009 DAAAM International Vienna
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2009 Gale, Cengage Learning. All rights reserved.

Article Details
Author:Bach, Mirjana Pejic; Simicevic, Vanja; Leskovic, Darko
Publication:Annals of DAAAM & Proceedings
Article Type:Case study
Geographic Code:4EUAU
Date:Jan 1, 2009