Printer Friendly
The Free Library
19,604,530 articles and books
Member login
User name  
Password 
 
Join us Forgot password?

Artificial immune system for support vector machine model selection.


I. INTRODUCTION

The support vector machines SVM SVM Support Vector Machines
SVM School of Veterinary Medicine
SVM Solaris Volume Manager
SVM Space Vector Modulation
SVM Storage Virtualization Manager (StoreAge)
SVM Service Module (also abbreviated as S/M) 
 were proposed originally in the context of machine learning, for classification problems on typically large sets of data which have an unknown dependency on possibly many variables. SVM introduced by Vapnik et al, SVM based on statistical learning theory Statistical learning theory is an ambiguous term.
  1. It may refer to computational learning theory, which is a sub-field of theoretical computer science that studies how algorithms can learn from data.
 that can get a good generalization performance with a given finite number of training examples for a given learning task, so it attracts more attention [6].

SVMs are special kind of supervised learners that construct a model from available training data with known classification. In order to construct an accurate classifier, a number of parameters have to be tuned to suit the needs of the problem of interest; our main target is to obtain the optimal set of parameters (model) that can achieve higher classification accuracy. In other words Adv. 1. in other words - otherwise stated; "in other words, we are broke"
put differently
, the target is to find an optimal set of SVM parameters which will give minimum prediction error when applied to unseen samples [1].

In this work, we propose a new evolutionary algorithm evolutionary algorithm - (EA) An algorithm which incorporates aspects of natural selection or survival of the fittest. An evolutionary algorithm maintains a population of structures (usually randomly generated initially), that evolves according to rules of selection, recombination,  based on the human immune system immune system

Cells, cell products, organs, and structures of the body involved in the detection and destruction of foreign invaders, such as bacteria, viruses, and cancer cells. Immunity is based on the system's ability to launch a defense against such invaders.
 called the Artificial Immune System--AIS, we employed this algorithm to find the optimal values that can be used to obtain an optimal SVM classifier in a data driven manner. We compared our work against previously proposed work [1, 4], that target the same problem, but using Genetic Algorithm genetic algorithm - (GA) An evolutionary algorithm which generates each individual from some encoded form known as a "chromosome" or "genome". Chromosomes are combined or mutated to breed new individuals.  rather than AIS, our test data sets are four well known benchmark data sets.

The remainder of the paper is organized as follows. An overview of SVM, overview on AIS, our proposed technique, obtained results and finally the conclusion.

II. SUPPORT VECTOR MACHINE This article or section may be confusing or unclear for some readers.
Please [improve the article] or discuss this issue on the talk page.
 

SVM is a universal learning machine for its application not only in pattern recognition but also in regression estimation. Although SVM is applicable to a learning task with training examples, all the training examples do not play an important role in the learning task, but a few ones called support vectors (SVs) are essential for the classification task.

In the SVMs when input data can not be lineally lin·e·al  
adj.
1. Belonging to or being in the direct line of descent from an ancestor.

2. Derived from or relating to a particular line of descent; hereditary.

3. Linear.
 separated, they should be mapped into high-dimensional feature spaces, where a linear decision surface discriminating two classes can be designed. The computation needn't actually be performed in feature spaces since SVM rely upon the direct application of the kernel function over the input data.

[FIGURE 1 OMITTED]

As illustrated in Fig.1, our main objective is the construction of a hyper plane which provides maximal

distance 2/[||w||.sup.2] between points [X.sub.i] belonging to the two classes is: [w.sup.2] . x + b = 0 This gives rise to an optimization problem In computer science, an optimization problem is the problem of finding the best solution from all feasible solutions. More formally, an optimization problem is a quadruple  of the form:

min w, b [PHI] (W)= 1/2 [W.sup.T] W

subject to [d.sub.i] ([W.sup.T] [X.sub.i] + b) [greater than or equal to] 1

Where [PHI] represents a cost function to be minimized in order to maximize separation. To solve this 'primal' minimization problem, the Lagrange function is conduct[1].

L (w, b, [alpha])= 1/2 [W.sup.T] W - [n.summation over (i=1)] [[alpha].sub.i] [[d.sub.i] ([W.sup.T] [X.sub.i] + b ) - 1]

Where [[alpha].sub.i] are the Lagrange multipliers In mathematical optimization problems, Lagrange multipliers, named after Joseph Louis Lagrange, is a method for finding the extrema of a function of several variables subject to one or more constraints: it is the basic tool in nonlinear constrained optimization. . A key feature of Support Vector Machines is the ability to replace the input data by a (non-linear) function [phi](x) operating on the input data. At this time the support vector machines require the solution of the following optimization problem:

min w, b, [xi] [PHI](W, [xi])= 1/2 [W.sup.T] W + C [n.summation over (i=1)] [[xi].sub.i]

subject to [d.sub.i]([W.sup.T] [phi]([X.sub.i])+b) [greater than or equal to] 1 - [[xi].sub.i] [[xi].sub.i] [greater than or equal to] 0

This may be viewed as mapping the input data to a higher (possibly infinite) dimensional space, to enable classification of data that is not linearly separable In geometry, when two sets of points in a two-dimensional graph can be completely separated by a single line, they are said to be linearly separable. In general, two groups are linearly separable in n-dimensional space if they can be separated by an n  in the original input space. The previous optimization problem is a convex quadratic quadratic, mathematical expression of the second degree in one or more unknowns (see polynomial). The general quadratic in one unknown has the form ax2+bx+c, where a, b, and c are constants and x is the variable.  program which can be solved by using the well-known Lagrange multiplier method.

L(w, b, [[xi].sub.i]; [[alpha].sub.i], [[beta].sub.i])= 1/2 [W.sup.T] W + C [N.summation over (i=1)] [[xi].sub.i]

-[N.summation over (i=1)] [[alpha].sub.i] [[y.sub.i]([W.sup.T] W[phi]([X.sub.i] + b)- 1 + [[xi].sub.i]] - [N.summation over (i=1)] [[beta].sub.i][[xi].sub.i]

Accordingly, the coefficients [[alpha].sub.i] can be found by solving the following convex quadratic programming Quadratic programming

Variant of linear programming in which the objective function is quadratic rather than linear. In portfolio selection, we often minimize the variance of the portfolio (which is a quadratic function) subject to constraints on the mean return of the portfolio.
 problem:

Q [alpha] = [N.summation over (i=1)][[alpha].sub.i]- 1/2 [N.summation over (i=1)] [N.summation over (j=1)] [[alpha].sub.i], [[alpha].sub.j] [y.sub.i] [y.sub.j] [phi][([X.sub.i]).sup.T][phi]([X.sub.j])

Subject [N.summation over (i=1)] [[alpha].sub.i] [y.sub.i]= 0,0 [less than or equal to] [[alpha].sub.i] [less than or equal to] C

And K(x, [x.sub.i])= [[phi].sup.T](X)[phi][X.sub.i] is called the kernel function, which is motivated by Mercer's theorem. So the problem is changed to:

Q ([alpha]) = [N.summation over (i=1)][[alpha].sub.i] - 1/2 [N.summation over (i=1)] [N.summation over (j=1)] [[alpha].sub.i] [[alpha].sub.j] [y.sub.i] [y.sub.j] K ([x.sub.i], [x.sub.j]

III. ARTIFCIAL IMMUNE SYSTEMS

Artificial Immune System An artificial immune system (AIS) is a type of optimisation algorithm inspired by the principles and processes of the vertebrate immune system. The algorithms typically exploit the immune system's characteristics of learning and memory to solve a problem.  (AIS) is an evolutionary algorithm inspired from the biological immune system [8, 10]. It performs multi resolution search [10] that tries to search the fitted region of the search space more than other regions. It is based on the Clonal-Selection principal, the general AIS steps are shown in figure 2.

This approach implements AIS in an iterative manner; each iteration is divided into seven sub-processes, described below :

1-Initialization

Creating the initial immune system antibodies, each antibody contains an SVN SVN Subversion (version control system)
SVN Slovenia (international traffic code)
SVN Social Venture Network
SVN South Vietnam
SVN Secure Virtual Network
SVN Supervised Visitation Network
 representation.

2-Measuring Affinities

In this step, each SVM antibody is examined against the unseen data.

3-Clonal Selection

Antibodies that have better affinities are selected for reproduction. A number of antibodies are cloned from the selected antibodies and added to the AIS.

4-Clonal Expansion

Selected antibodies are cloned with a strategy that allows the best antibodies to produce more clones.

5-Hyper Mutation

We alter the SVM parameters in table [1] to some other values.

6-Meta-dynamics

A set of new randomly generated antibodies are added to the AIS to introduce significant diversity of antibodies.

7-Replacement

Based on a specific strategy, a set of antibodies are chosen to be replaced by the new cloned expanded antibodies.

Fig. 2, shows a flow chart that describes AIS steps.

[FIGURE 2 OMITTED]

IV. SVM MODEL PARAMETERS

We implemented our approach on top of the famous SVM library LIBSVM LIBSVM Library for Support Vector Machines  [7] software, SVMLIBM has a number of parameters (see table 1) that can be optimized.

V. AIS APPROACH FOR SVM MODEL SELECTION

To understand how the algorithm works, and how we used AIS to optimize the SVM, we will highlight steps in the AIS that we have modified to implement this algorithm.

First, we will describe the antibody representation that is used to train the SVM. Our antibody is as simple as representing the previous table in a simple data structure with fields (normal C struct), this choice is preferred over representing it as a floating point array since will not use operations like crossover, which mostly work faster on arrays, however, if we are going to try a hybrid approach between genetic and AIS, we might need to use the array representation.

The AIS algorithm has number of tuning parameters that can affect its performance and behavior, and it can be used to guide the AIS to achieve faster and better solution if well adjusted, we illustrate these tuning parameters below in table 2.

A. Population Initialization in·i·tial·ize  
tr.v. in·i·tial·ized, in·i·tial·iz·ing, in·i·tial·iz·es Computer Science
1. To set (a starting value of a variable).

2. To prepare (a computer or a printer) for use; boot.

3.
 

In this step, the AIS is initialized with randomly generated antibodies, each antibody differs from other antibodies in the parameters mentioned in the previous table, for example, the Kernel Type parameter, can take one value of LINER, POLY, SIGMOID sigmoid /sig·moid/ (sig´moid)
1. shaped like the letter C or S.

2. sigmoid colon.


sig·moid or sig·moi·dal
adj.
1. Having the shape of the letter S.
, and each of these kernels has in turn parameters to be configured. For example if we select the POLY kernel, a random value is also assigned to gamma and degree and coef0 parameters. The range of the parameters is quiet subjective for our configuration. In table 3 we list the initialized values for all LIBSVM parameter we chosen.

B. Affinity Measurement

This is equivalent to the fitness function in the Genetic Algorithms, in AIS it means how strong the antibody when matching the antigens. In our implementation "strong" means how much accurate is our SVM when classifying the unseen data, the more number of correctly classified data, the more accurate our antibody, subsequently, SVM.

For our implementation, our affinity is measured by balanced classification accuracy (bca). The bca is calculated as:

bca = 1/2([[pi].sup.-]/[m.sup.-]+ [[pi].sup.+]/[m.sup.+])

Where [m.sup.-] denotes the number of class -1 records in the data set and [[pi].sup.-] the number of class -1 records that have been classified correctly with similar meanings for [[pi].sup.-] and [m.sup.+] [1].

C. Clonal Selection The clonal selection theory has become a widely accepted model for how the immune system responds to infection and how certain types of B and T lymphocytes are selected for destruction of specific antigens invading the body.  and Expansion

Antibodies that have better affinities (i.e. which can classify with better accuracy) are selected for reproduction. A number of antibodies are cloned from the selected antibodies and added to the AIS, and then these selected antibodies are cloned (i.e. copied) with a strategy that allows the best antibodies to produce more clones, the reason behind this is to limit the possibility of the AIS to stuck in a local minimum.

D. Hyper Mutation

This is equivalent to the normal genetic mutation Noun 1. genetic mutation - (genetics) any event that changes genetic structure; any alteration in the inherited nucleic acid sequence of the genotype of an organism
chromosomal mutation, mutation
 operator, except that the mutation ratio and values are chosen proportional to the antibody affinity, i.e. antibodies with higher affinity have mutation rate less than antibodies with lower affinity, because antibodies with higher affinity are more likely to reach a good solution than those with lower affinity.

For the mutation process it self, we alter the value of one or more (depends on the affinity and the mutation ratio) of the parameters mentioned in table 1, newer values are calculated randomly.

VI. RESULTS

We run AIS on the same data sets used in [1], they are four data sets from the Stat log project and the UCI UCI University of California, Irvine
UCI Union Cycliste Internationale (International Cycling Union)
UCI Unidad de Cuidados Intensivos
UCI United Cinemas International (UK) 
 machine learning library. The data sets are Australian credit (ac) and German credit (gc) exemplify a case of corporate credit scoring Credit scoring

A statistical technique that combines several financial characteristics to form a single score to represent a customer's creditworthiness.
, e.g. classifying if an applicant is a good/bad credit risk. As examples for medical diagnosis we consider the data sets heart-disease (hrt), and Wisconsin breast cancer (wbc) each of which require a classification if a patient suffers from a certain disease or not. All data sets are cases of binary classification so that examples either belong to class +1 or class -1 respectively. A brief description of each data set's characteristic is given in Table 4. For details see [1].

The data sets have been partitioned into 2/3 training set for model building and 1/3 test set for out-of-sample evaluation.

After running AIS on the previous datasets and using parameters descried in table 2, we obtained the following results which is depicted in table 5.

VII. CONCLUSION

In this paper we developed an enhanced algorithm that is based on Artificial Immune System--AIS that can be ideal to select and optimize the process of model selection for the SVM, for the sake of achieving more accurate classifications, the generated model is appropriate to choose a classifier that will generalize well to unknown data.

Our results suggest that using the AIS for SVM model selection will in most cases generate more accurate models for SVM than the GA based approach.

REFERENCES

[1] Stefan Lessmann, Robert Stahlbock, Sven F. Crone crone

see crock.
. "Genetic Algorithms for Support Vector Machine Model Selection", 2006 International Joint Conference on Neural Networks Sheraton Vancouver Wall Centre Hotel, Vancouver, BC, Canada, July 16-21, 2006.

[2] Huan-Jun Lui, Yao-Nan Wang, Xiao-Fen Lu. "A Method to Choose Kernel Function and Its Parameters for Support Vector Machines", Proceedings of The Fourth International Conference on Machine Learning and Cybernetics, Guangzhou, August 18-21, 2005.

[3] John C. Platt, " Fast Training of Support Vector Machines using Sequential Minimal Optimization ",Microsoft Research, Technical Report MSR-TR-98-14, April 21, 1998.

[4] Stefan Lessmann, Robert Stahlbock, Sven F. Crone. "Optimizing Hyper-parameters of Support Vector Machines by Genetic Algorithms",2006. url: https://iwi.econ.unihamburg.de /IWIWeb/uploads/team/sl/ICAI05_OptHypSVM.pdf

[5] Tom Howley & Michael G. Madden. "The Genetic Evolution of Kernels for Support Vector Machine Classifiers", Department of Information Technology, National University of Ireland ,Galway, url : http://www.it.nuigalway.ie/m_madden/profile/pubs/aics-04b.pdf

[6] J. P. Lewis. "A Short SVM (Support Vector Machine) Tutorial", U. Southern California version 0.zz, 2004.

[7] Chih-Chung Chang and Chih-Jen Lin, "LIBSVM : a library for support vector machines", 2001. Software Available at http://www.csie.ntu.edu.tw/~cjlin/libsvm

[8] D. Dasgupta. "Artificial Immune Systems and Their Applications", Springer 1999.

[9] L.N. de Castro, J. Timmis. "Artificial Immune Systems: A New Computational Intelligence Approach", Springer 2002.

[10] S. A. Hofmeyr "An Interpretative Introduction to The Immune System" Dept. of Computer Science University of New Mexico The University of New Mexico (UNM) is a public university in Albuquerque, New Mexico. It was founded in 1889. It also offers multiple bachelor's, master's, doctoral, and professional degree programs in all areas of the arts, sciences, and engineering. , 2000.

Amr Badr

Faculty of Computers and Information, Dept of Computer Science, Cairo Unive rsity

Email: ruaab@rusys.eg.net

Amr Ali Faculty of Computers and Information, Dept of Computer Science, Cairo University Email: amr@egjug.org
Table 1: SVMLIB parameters

LIB-SVM         Description
parameter

* SVM Type      Type of the SVM to be used
 * C_SVC        Classification SVM
 * NU_SVC       v-Support Vector Classification
 * ONE_CLA      Distribution Estimation (One-class SVM)
   SS
  * EPSILON_    Epsilon Support Vector Regression
    SVR
  * NU_SVR      v-Support Vector Regression
* Kernel Type   Type of the kernel to be used
  * LINER       Liner Kernel
  * POLY        K (X, X') = (X . X' + 1) (d)
  * RBF         K (X, X') = exp (-[gamma][||X - X'||.sup.2])
  * SIGMOID     K (X, X') = tanh (K X . X' + C)
  * PRECOMP     n/a
    UTED
* C             for C_SVC, EPSILON_SVR and
                NU_SVR
* degree        For Poly Kernels
* gamma         For Poly/RBF/Sigmoid kernels
* cofe0         For Poly/Sigmoid
* epsilon       Stopping criteria

Table 2: Parameters initialization values and randomization ranges

Parameter            value   Description
Maximum Generation     25    Number of AIS iterations
Population Size        50    Number of Antibodies/iteration
Best N                 5     Number of antibodies selected for Clonal
                              selection
Meta Dynamics          10    Number of randomly added antibodies
Num Clones             2     Maximum number of clones/antibody

Table 3: Parameters initialization values and randomization ranges

Parameter        Initialization value      Mutation value/ranges

svm_type                Random             One of SVMLIBM SVMs types

Kernel_type             Random             POLY/RBF/LINER/SIGMOID

C               Random [0.01 to 5.000]     +/- [0.01, 0.5] step 0.01

degree                    3                +/- 1.0

gamma                     1                +/- 1.0

cofe0                     0                +/- [0,9], step = 1

epsilon                  1e-4              n/a

Table 4: Data sets used, from [1]

Dataset   #Cases    #Features   #class     #class
                                  -1         +1

Ac          690        14         307        383
Gc         1000        20         700        300
hrt         270        13         150        120
wbc         683        10         239        444

Table 5: Results obtained against GA algorithm

Dataset    GA-SVM (bca      AIS-SVM (bca
           on test) [1]       on test)

ac            0.8761           0.8757
gc            0.6794           0.7430
wbc           0.9750           0.9904
hrt           0.8611           0.8730
COPYRIGHT 2009 University of the West of Scotland, School of Computing
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2009 Gale, Cengage Learning. All rights reserved.

 Reader Opinion

Title:

Comment:



 

Article Details
Printer friendly Cite/link Email Feedback
Author:Badr, Amr; Ali, Amr
Publication:Computing and Information Systems
Date:Feb 1, 2009
Words:2525
Previous Article:A dialogue act modelling approach to web-based system modelling.
Next Article:Bees collective dynamics.

Terms of use | Copyright © 2012 Farlex, Inc. | Feedback | For webmasters | Submit articles