
Quaternion based fuzzy neural network classifier for MPIK dataset's view-invariant color face image recognition.

This paper presents an effective color image processing system for view-invariant face image recognition on the Max Planck Institute Kybernetik (MPIK) dataset. The proposed system recognizes view-invariant face images by correlating the input face images with reference face images and classifying them according to the correct person's name/ID. It is built from a complex quaternion correlator and a max-product fuzzy neural network classifier. Two classification parameters, namely the real-to-complex ratio of the discrete quaternion correlator output (p-value) and the peak-to-sidelobe ratio (PSR), are used to classify the input face images into either the authentic class or the non-authentic class. In addition, a new parameter called the G-value is introduced into the proposed view-invariant color face image recognition system for better classification. Experimental results show that the proposed system outperforms the conventional NMF, BDNMF and hypercomplex Gabor filter methods in terms of enrollment time, recognition time and accuracy in classifying MPIK color face images that are view-invariant, noise influenced and scale invariant.

Keywords: image processing, face recognition, fuzzy neural network classifier, quaternion correlation

Povzetek (Slovenian abstract): A face recognition method is presented, tested on the Max Planck Institute Kybernetik dataset.

1 Introduction

Face recognition has been applied in many areas such as face search in databases, authentication in security systems, smart user interfaces and robotics. Conventional face recognition methods normally focus on grayscale face image recognition. Recently, however, many researchers have focused on the color information of face images to improve the performance of recognition algorithms, because color face images offer more information for the face recognition task than grayscale face images.

A simple color face recognition system was first proposed by Torres et al. in [1] based on the PCA (principal component analysis) method. The method represents the facial images using eigenfaces. The information of the three channels (R, G, B) of a color face image is first represented in the form of eigenvectors, and recognition is implemented separately on each color channel. However, it was found that utilizing the information of the different color channels separately destroys the structure of the color information and makes it hard to learn the facial features (variations in expression, pose and illumination). Rajapakse et al. [2] presented a parallel work based on NMF (non-negative matrix factorization) for color face recognition. In their work, the color information of the different channels was also processed separately. Observed advantages of the NMF method for face recognition are its robustness to occlusion and to variations in expression and pose. However, since the NMF method also treats the information of the different color channels separately, just like the PCA method, it likewise destroys the structural integrity of the color information and the correlation among the color channels.

In order to preserve the integrity of the color information across different channels in a color face recognition system, Wang et al. [3, 4] proposed a method superseding NMF, the block diagonal non-negative matrix factorization (BDNMF). Inspired by the NMF method, BDNMF also separates color information into different color channels, but it uses a block diagonal matrix to simultaneously encode the color information of the different channels, hence preserving the integrity of the color information. However, the BDNMF method has the demerit of a complex enrollment/training stage. In BDNMF, unsupervised multiplicative learning rules are used iteratively to update parameters such as the basis image matrix (W) and the encoding image (H). Therefore, a longer enrollment time is required for this method. Another demerit of BDNMF is that an additional block diagonal constraint is imposed on the factorization to construct the BDNMF algorithm. This makes the computation more complex compared to the conventional NMF method.

Another recently developed color face recognition method is the use of hypercomplex Gabor filters [5]. Conventional Gabor filters are used in many face recognition applications [6-8] and are proven to obtain good recognition performance due to their inherent insensitivity to illumination and pose variation. In [5], the authors further extended the conventional Gabor filter into the hypercomplex (quaternion) domain to perform color based feature extraction. Experimental results in [5] show that the hypercomplex Gabor feature extraction achieved a significant improvement in face matching accuracy over the monochromatic case. However, the hypercomplex Gabor filter requires a large number of different kernels, and hence the length of the feature vectors in the quaternion domain increases dramatically. Also, the hypercomplex Gabor filter has a filter structure twice the size of that used in the conventional Gabor filter.

Most of the proposed algorithms for color face recognition treat the three color channels (R, G, B) separately, apply grayscale face recognition methods [9, 10] to each of the channels, and then combine the results. Quaternion correlation techniques [11], in contrast, process all color channels jointly by using quaternion numbers. Quaternion numbers are a generalization of complex numbers: a quaternion consists of one real part and three orthogonal imaginary parts. An RGB color face image can be represented as a quaternion by inserting the values of the three color channels into the three imaginary parts of the quaternion number respectively. Therefore, in this paper, the concept of quaternions is proposed for a view-invariant color face image recognition system.
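As a concrete illustration of this encoding, the sketch below maps an RGB image to a pure quaternion array in NumPy. The (real, i, j, k) channel layout is our own illustrative choice, not prescribed by the paper:

```python
import numpy as np

def rgb_to_quaternion(img_rgb):
    """Encode an RGB image as a pure quaternion array.

    Each pixel becomes a quaternion q = 0 + R*i + G*j + B*k, stored as
    a 4-channel array in (real, i, j, k) order with the real part zero.
    """
    h, w, _ = img_rgb.shape
    q = np.zeros((h, w, 4), dtype=np.float64)
    q[..., 1] = img_rgb[..., 0]  # i <- R
    q[..., 2] = img_rgb[..., 1]  # j <- G
    q[..., 3] = img_rgb[..., 2]  # k <- B
    return q
```

This keeps all three color channels in a single algebraic object, which is what lets the later correlation steps act on color jointly rather than channel by channel.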

An advanced correlation filter, the unconstrained optimal trade-off synthetic discriminant filter (UOTSDF) [12, 13], is also applied in the proposed view-invariant color face image recognition system. The goal of the filter is to produce a sharp peak resembling a 2-D delta-type correlation output when the input face image belongs to the class of the reference face images used to train the filter; this provides automatic shift-invariance. A strong and sharp peak can be observed in the output correlation plane when the input face image comes from the authentic class (the input face image matches a particular training/reference face image stored in the database), and no discernible peak appears if the input face image comes from the imposter class (the input face image does not match the particular reference face image).

Three classification parameters are considered in classifying whether an input face image belongs to the authentic class or not: the real-to-complex ratio of the discrete quaternion correlator output (p-value) [11], the peak-to-sidelobe ratio (PSR) [14] and the max-product fuzzy neural network classifier value (G-value). The p-value, introduced in [11], measures the correlation between the color, shape, size and brightness of the input image and a particular reference image. The PSR is another parameter, introduced in [14] for better recognition, because it is more accurate to consider the peak value together with the region around it rather than a single peak point. The higher the PSR, the more likely the input face image belongs to the referenced image class. In this paper, the p-value and the PSR are combined, normalized and passed through a Gaussian distribution function in the max-product fuzzy neural network classifier. This technique generates a parameter, the so-called G-value. This parameter, as well as the algorithm, is applied in the view-invariant color face image recognition system for better classification. The same technique was applied in machine condition monitoring [15], where it yielded a high success rate in classifying machine conditions, which motivates its implementation for color face image recognition.

In this paper, a quaternion based fuzzy neural network classifier is proposed for view-invariant color face image recognition on the MPIK dataset. 10,000 repeated images generated/collected from the 7 different-position color face images of 200 people in the MPIK dataset were used to evaluate the system performance. Among the 10,000 repeated color face images, 5000 are normal MPIK color face images; 2500 are normal MPIK color face images embedded with noise such as "salt and pepper", "poisson" and "speckle" noise as provided in the Matlab image processing toolbox; and 2500 are normal MPIK color face images with scale variation (shrinking or dilation). The performance of the proposed quaternion based fuzzy neural network classifier is compared to NMF, BDNMF and the hypercomplex Gabor filter. Experimental results show that the quaternion based fuzzy neural network classifier outperforms the conventional NMF, BDNMF and hypercomplex Gabor filter in terms of enrollment time, recognition time and accuracy in classifying view-invariant, noise influenced and scale-variant MPIK color face images.

The paper is organized as follows: Section 2 briefly comments on the proposed view-invariant color face image recognition model and the quaternion based color face image correlator. Section 3 describes the enrollment stage and recognition stage of the proposed quaternion based color face image correlator. Section 4 describes the structure of the max-product fuzzy neural network classifier. Section 5 contains the experimental results. Finally, Section 6 summarizes the work and outlines future work.

2 View-invariant color face image recognition system model

The proposed view-invariant face recognition system model considered in this paper is shown in Figure 1.

The view-invariant input color face image is first supplied to the quaternion based color face image correlator. The correlator obtains a correlation plane for each input face image correlated with the reference face images stored in a database, from which classification characteristics are computed: the real-to-complex ratio of the discrete quaternion correlator (DQCR) output, the p-value, and the peak-to-sidelobe ratio, PSR. These classification characteristics are later input to the max-product fuzzy neural network to perform classification. The quaternion based color face image correlation is discussed in detail below.

The reference face image is first encoded in quaternion form before the discrete quaternion Fourier transform (DQFT) is applied [11]:

I(m, n) = I_R(m, n)·i + I_G(m, n)·j + I_B(m, n)·k (1)

where m, n are the pixel coordinates of the reference face image. The R, G, B parts of the reference face image are represented by I_R(m, n), I_G(m, n) and I_B(m, n) respectively; i, j, k are the imaginary units of the quaternion number [15], and the real part is set to zero. Similarly, h_i(m, n) is used to represent the input face image. We can then produce an output b(m, n) to conclude whether the input face image matches the reference face image or not. If h_i(m, n) is a space-shifted version of the reference face image:

h_i(m, n) = I(m − m_0, n − n_0) (2)

Then after some mathematical manipulation,

max(b_r(m, n)) = b_r(−m_0, n_0) (3)

where b_r(m, n) denotes the real part of b(m, n), and

b_r(−m_0, n_0) = Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} |I(m, n)|² (4)

where M, N are the image dimensions along the x-axis and y-axis. At the location (−m_0, n_0), the i-, j-, k- imaginary parts of b(−m_0, n_0) are equal to zero:

b_i(−m_0, n_0) = b_j(−m_0, n_0) = b_k(−m_0, n_0) = 0 (5)

Thus, the following process, adapted from [11], can be used for face image correlation:

1.) Calculate the energy of the reference face image I(m, n):

E_I = Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} |I(m, n)|² (6)

I_a(m, n) = I(m, n) / √E_I (7)

H_a(m, n) = h_i(m, n) / √E_I (8)

2.) Calculate the output of discrete quaternion correlation (DQCR):

g_a(m, n) = Σ_{τ=0}^{M−1} Σ_{η=0}^{N−1} I_a(τ, η) · H̄_a(τ − m, η − n) (9)

where the overbar denotes the quaternion conjugation operation. Then perform the space reverse operation:

g(m, n) = g_a(−m, −n) (10)

3.) Perform inverse discrete quaternion Fourier Transform (IDQFT) on (10) to obtain the correlation plane P(m, n).

4.) Search for all the local peaks on the correlation plane and record the locations of the local peaks as (m_s, n_s).

5.) Then, at each local peak location (m_s, n_s) found in step 4, calculate the real-to-complex ratio of the DQCR output:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (11)

where P_r(m_s, n_s) is the real part of P(m_s, n_s), and P_i(m_s, n_s), P_j(m_s, n_s) and P_k(m_s, n_s) are the i-, j-, k- parts of P(m_s, n_s) respectively. If p ≥ d_1 and c_1 < |P(m_s, n_s)| < c_2, then we can conclude that at location (m_s, n_s) there is a face image with the same shape, size, color and brightness as the reference face image, where d_1 < 1, c_1 < 1 < c_2, and d_1, c_1, c_2 all take values near 1. The value of p decays quickly as the color difference between the input face image and the reference face image grows.
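The decision in step 5 can be sketched in code. Equation (11) did not survive the text extraction, so the snippet below assumes the real-to-complex ratio is the real-part magnitude over the full quaternion magnitude of P(m_s, n_s) (so p approaches 1 when the output is almost purely real); this form and the threshold values are illustrative assumptions, not the paper's exact equation:

```python
import numpy as np

def p_value(P_peak, eps=1e-12):
    """Assumed real-to-complex ratio of the DQCR output at a peak.

    P_peak is the quaternion (real, i, j, k) at a local peak (m_s, n_s).
    NOTE: this ratio is an assumption standing in for eq. (11).
    """
    r = P_peak[0]
    return abs(r) / (np.linalg.norm(P_peak) + eps)

def is_authentic(P_peak, d1=0.95, c1=0.9, c2=1.1):
    """Step 5 decision: p >= d_1 and c_1 < |P(m_s, n_s)| < c_2.

    Thresholds d_1, c_1, c_2 are all near 1, as the text requires;
    the exact values used here are illustrative.
    """
    mag = np.linalg.norm(P_peak)
    return p_value(P_peak) >= d1 and c1 < mag < c2
```

A nearly pure-real peak of magnitude close to 1 passes the check, while a peak with large imaginary parts (a color mismatch) fails, matching the text's observation that p decays with color difference.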


Another classification characteristic that can be applied in quaternion based color face image correlation is the peak-to-sidelobe ratio (PSR). A strong peak can be observed in the correlation output if the input face image comes from the authentic class. A method of measuring the peak sharpness is the peak-to-sidelobe ratio (PSR), defined as [14, 17]:

PSR = (peak − mean(sidelobe)) / σ(sidelobe) (12)

where peak is the value of the peak on the correlation output plane, and sidelobe refers to a fixed-size region surrounding, but excluding, the peak. mean is the average value of the sidelobe region and σ is the standard deviation of the sidelobe region. Large PSR values indicate a better match between the input face image and the corresponding reference face image.
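A minimal sketch of the PSR computation in (12), assuming a square sidelobe window with the area immediately around the peak masked out; the window sizes are illustrative choices, since the paper does not fix them here:

```python
import numpy as np

def psr(corr_plane, sidelobe_radius=20, mask_radius=5):
    """Peak-to-sidelobe ratio (eq. 12) of a real-valued correlation plane.

    The sidelobe is a fixed-size square region around the peak, with a
    small central area (the peak itself) excluded.  Region sizes are
    illustrative assumptions, not values from the paper.
    """
    peak_idx = np.unravel_index(np.argmax(corr_plane), corr_plane.shape)
    peak = corr_plane[peak_idx]
    r, c = peak_idx
    r0, r1 = max(r - sidelobe_radius, 0), r + sidelobe_radius + 1
    c0, c1 = max(c - sidelobe_radius, 0), c + sidelobe_radius + 1
    region = corr_plane[r0:r1, c0:c1].astype(float)
    # Mask out the small area around the peak itself.
    pr, pc = r - r0, c - c0
    region[max(pr - mask_radius, 0):pr + mask_radius + 1,
           max(pc - mask_radius, 0):pc + mask_radius + 1] = np.nan
    sidelobe = region[~np.isnan(region)]
    return (peak - sidelobe.mean()) / sidelobe.std()
```

A delta-like correlation plane (authentic case) yields a large PSR, while a flat noisy plane (imposter case) yields a small one.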

The quaternion based color face image correlator involves two stages: 1. the enrollment stage and 2. the recognition stage. During the enrollment stage, one or multiple face images of each person in the database are acquired. These multiple reference face images vary in the angle of the turning face (e.g. 90° to the left, 60° to the left, 30° to the left, 0° facing front, 30° to the right, 60° to the right, 90° to the right). The DQFTs of the reference face images are used to train the fuzzy neural network and to determine the correlation filter coefficients for each person's set. During the recognition stage, a sample face image is input; its DQFT is correlated with the DQFT forms of the reference face images stored in the database together with their corresponding filter coefficients, and the inverse DQFT of this product yields the correlation output for that filter. The enrollment stage and recognition stage are discussed in detail in the following section.

3 Enrollment stage and recognition stage for quaternion based color face image correlator

This section describes the enrollment stage and the recognition stage of the proposed quaternion based color face image correlator.

3.1 Enrollment Stage

The schematic of the enrollment stage is shown in Figure 2. During the enrollment stage, the reference face images for each person set in the database are partitioned according to the S different view angles. These partitioned reference face images are then encoded into a two-dimensional quaternion array (QA) as follows:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (13)

where t_1 = 1, 2, ..., T indexes the persons subscribed to the database, [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] represents the real part of the quaternion array of the s-th face image for person set t_1, and s = 1, 2, ..., S indexes the face images at different angles for a particular person. [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] represent the i-, j-, k- imaginary parts of the s-th face image for person t_1 respectively.

The quaternion array in (13) then undergoes the discrete quaternion Fourier transform (DQFT), which transforms the quaternion image into the quaternion frequency domain. A two-sided form of the DQFT has been proposed by Ell [18, 19] as follows:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (14)

where e is the exponential term, and μ_1 and μ_2 are two pure quaternion units (quaternion units with real part equal to zero) that are orthogonal to each other [20]:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (15)

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (16)

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (17)

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (18)

The output of the DQFT, [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], is used to train the max-product fuzzy neural network classifier and to design the correlation filter.

3.1.1 Quaternion Correlator (QC)

To train the max-product fuzzy neural network classifier, the output of the DQFT is first passed to a quaternion correlator (QC) as shown in Figure 3. The function of the QC is summarized as follows: for the DQFT output of the s-th face image, perform discrete quaternion correlation (DQCR) [21, 22] on reference face image [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] with reference face image [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] and multiply by the corresponding filter coefficients (filt_(t2)):

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (19)

where t_1, t_2 = 1, 2, ..., T index the persons subscribed to the database. After that, the inverse DQFT of (19) is taken to obtain the correlation plane function:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (20)

The correlation plane is a collection of correlation values, each obtained by performing a pixel-by-pixel comparison (inner product) of the two images [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. A sharp peak in the correlation plane indicates the similarity of [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], while the absence of such a peak, or a low peak, indicates the dissimilarity of [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Calculate [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] from the correlation plane in (20) using (11) and (12) respectively. [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] denotes the p-value of reference face image [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] correlated with reference face image [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] at the s-th angle, while [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] denotes the PSR value of reference face image [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] correlated with reference face image [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] at the s-th angle. These values are then fed into the max-product fuzzy neural network classifier to perform training and calculate the weights, as discussed in Section 4.
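The DQCR above rests on quaternion multiplication and conjugation. A minimal NumPy sketch of these primitives, with quaternions represented as (real, i, j, k) 4-vectors (our own layout choice):

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions p = (a, b, c, d), q = (e, f, g, h)."""
    a, b, c, d = p
    e, f, g, h = q
    return np.array([
        a * e - b * f - c * g - d * h,   # real part
        a * f + b * e + c * h - d * g,   # i part
        a * g - b * h + c * e + d * f,   # j part
        a * h + b * g - c * f + d * e,   # k part
    ])

def qconj(q):
    """Quaternion conjugate: negate the i, j, k parts."""
    a, b, c, d = q
    return np.array([a, -b, -c, -d])

# Multiplying q by its conjugate gives |q|^2 as a pure real quaternion.
q = np.array([1.0, 2.0, 3.0, 4.0])
print(qmul(q, qconj(q)))  # -> [30.  0.  0.  0.]
```

Note that the Hamilton product is non-commutative (i·j = k but j·i = −k), which is why the correlation formulas keep careful track of operand order.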

3.1.2 Correlation Filter

Conventional filtering methods [23] emphasize matched filters. Matched filters are optimal for detecting a known reference image in an additive white Gaussian noise environment. If the input image deviates slightly from the known reference image (scale, rotation and pose variations), the detection performance of the matched filter degrades rapidly. Correlation filter designs [24] have emerged to handle such distortions. The minimum average correlation energy (MACE) filters [25] are one such design and show good results in the field of automatic target recognition and in biometric verification applications [14, 26]. MACE filters differ from conventional matched filters in that more than one reference image is used to synthesize a single filter template, making their classification performance invariant to shifts of the input image [24].

There are two types of MACE filters in general: 1.) the conventional MACE filter [25] and 2.) the unconstrained MACE (UMACE) filter [27]. Both aim to produce sharp peaks resembling two-dimensional delta-type correlation outputs when the input image belongs to the authentic class, and low peaks for the imposter class. The conventional MACE filter [25] minimizes the average correlation energy of the reference images while constraining the correlation output at the origin to a specific value (usually 1) for each of the reference images. A Lagrange multiplier is used for the optimization, yielding:

filt_MACE = D^−1 X (X′ D^−1 X)^−1 c (21)

This equation is the closed-form solution to the linearly constrained quadratic minimization. D is a diagonal matrix with the average power spectrum of the reference images placed as elements along the diagonal of the matrix. X contains the Fourier transforms of the reference images, lexicographically reordered and placed along each column. As an example, if there are T sets of reference face images, each of size 256 × 1,792 (= 458,752 pixels), then X will be a 458,752 × T matrix. X′ is the matrix transpose of X. c is a column vector of length T with all entries equal to 1.

The second type of MACE filter is the unconstrained MACE (UMACE) filter [27]. Just like the conventional MACE filter, the UMACE filter minimizes the average correlation energy of the reference images and maximizes the correlation output at the origin. The difference between the conventional MACE filter and the UMACE filter is the optimization scheme: the conventional MACE filter uses a Lagrange multiplier, whereas the UMACE filter uses the Rayleigh quotient, which leads to the following equation:

filt_UMACE = D^−1 m (22)

where D is the same diagonal matrix as in the conventional MACE filter, and m is a column vector containing the mean values of the Fourier transforms of the reference images.

Besides MACE filters, there is another type of correlation filter, the unconstrained optimal trade-off synthetic discriminant filter (UOTSDF), shown by Refregier [28] and Kumar et al. [12] to yield good verification performance. The UOTSDF is given by:

filt_UOTSDF = (αD + √(1 − α²) C)^−1 m (23)

where D is a diagonal matrix with the average power spectrum of the training images placed along the diagonal elements, m is a column vector containing the mean values of the Fourier transforms of the reference images, and C is the power spectral density of the noise. For most applications, a white noise power spectral density is assumed, since white noise is dominant in images; C therefore reduces to the identity matrix. The α term is typically set close to 1 to achieve good performance even in the presence of noise; this also helps improve generalization to distortions outside the reference images.
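Because D and C are diagonal, (23) reduces to an element-wise division in the frequency domain. A minimal sketch, assuming 2-D training images and a white-noise C (function and variable names are ours, not the paper's):

```python
import numpy as np

def uotsdf(ref_images, alpha=0.99):
    """UOTSDF coefficients per eq. (23): (alpha*D + sqrt(1-alpha^2)*C)^-1 m.

    ref_images: list of equally sized 2-D training images.
    D (average power spectrum) and C (white-noise PSD, i.e. identity)
    are diagonal, so the matrix inversion is element-wise.
    """
    ffts = [np.fft.fft2(im) for im in ref_images]
    D = np.mean([np.abs(F) ** 2 for F in ffts], axis=0)  # diagonal of D
    m = np.mean(ffts, axis=0)                            # mean FFT vector
    C = np.ones_like(D)                                  # identity (white noise)
    return m / (alpha * D + np.sqrt(1.0 - alpha ** 2) * C)
```

With alpha = 1 the noise term vanishes and the expression reduces to the UMACE form D^−1 m of eq. (22), illustrating how the trade-off parameter interpolates between designs.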

Comparing the three correlation filters listed above: the conventional MACE filter is complicated to implement, as it requires many inversions of T × T matrices. The UMACE filter is simpler to implement from a computational viewpoint, since it involves inverting only a diagonal matrix; its performance is close to the conventional MACE filter but poorer than the UOTSDF. Therefore, we extend the UOTSDF in our quaternion based face image correlator for the recognition of view-invariant faces, since it is computationally less complicated than the conventional MACE filter and achieves good performance.

3.2 Recognition stage

The schematic of the recognition stage for the classification of color face images by quaternion correlation is shown in Figure 4. During the recognition stage, an input view-invariant face image is first encoded into a two-dimensional quaternion array (QA) as follows:

h_(i) = h_r(i) + h_R(i)·i + h_G(i)·j + h_B(i)·k (24)

where i denotes the input face image, and h_r(i) represents the real part of the quaternion array for input face image i. h_R(i), h_G(i) and h_B(i) represent the i-, j-, k- imaginary parts for input face image i respectively.

The DQFT is then performed to transform the quaternion image into the quaternion frequency domain. The two-sided form of the DQFT is used:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (25)

where e is the exponential term, and μ_1 and μ_2 are the two pure quaternion units shown in (15) and (16) respectively. The DQFT output of h_(i) is cross-correlated with every quaternion correlation filter in the database using the quaternion correlator (QC), just like the one shown in Figure 3, but with the DQFT output now being h_(i). In the QC, quaternion correlation is performed on h_(i) with the reference face images [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] from the database, and multiplied by the corresponding filter coefficients (filt_(t2)):

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (26)

After that, the inverse DQFT of (26) is taken to obtain the correlation plane function:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (27)

Calculate [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] from the correlation plane in (27) using (11) and (12) respectively. [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] denotes the p-value of input face image h_(i) correlated with the s-th reference face image in [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], while [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] denotes the PSR value of input face image h_(i) correlated with the s-th reference face image in [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. These values are then fed into the max-product fuzzy neural network classifier to perform classification of the view-invariant face images, as discussed in the next section.

4 Max-product fuzzy neural network classifier

Fuzzy logic is a type of multi-valued logic derived from fuzzy set theory to deal with approximate reasoning. Fuzzy logic provides a high-level framework for approximate reasoning that can appropriately handle both uncertainty and imprecision in linguistic semantics, model expert heuristics and provide the requisite high-level organizing principles [13]. A neural network, in the engineering field, refers to a mathematical/computational model based on biological neural networks. Neural networks provide self-organizing substrates for low-level representation of information with adaptation capabilities. Fuzzy logic and neural networks are complementary technologies; therefore, it is plausible and justified to combine both approaches in the design of classification systems. Such an integrated system is referred to as a fuzzy neural network classifier [13].

Various fuzzy neural network classifiers have been proposed in the literature [29-32], and much interest has centered on fuzzy neural networks applying max-min composition as a functional basis [33-35]. However, in [36], Leotamonphong and Fang mention that the max-min composition is "suitable only when a system allows no compensability among the elements of a solution vector", and propose using the max-product composition in fuzzy neural networks instead. Bourke and Fisher [37] also comment that the max-product composition gives better results than the traditional max-min operator. Efficient learning algorithms using the max-product composition have since been studied by others [38, 39].

In this paper, a fuzzy neural network classifier using max-product composition is proposed for the view-invariant color face image classification system. The max-product composition is the same as a single perceptron except that summation is replaced by maximization and, in the max-min threshold unit, min is replaced by product.
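A small numeric sketch of the difference between the two compositions; the inputs and weights are arbitrary illustrative values:

```python
import numpy as np

def max_min(x, W):
    """Traditional max-min composition: y_j = max_i min(x_i, W_ij)."""
    return np.max(np.minimum(x[:, None], W), axis=0)

def max_product(x, W):
    """Max-product composition: y_j = max_i (x_i * W_ij)."""
    return np.max(x[:, None] * W, axis=0)

x = np.array([0.2, 0.9])
W = np.array([[0.5, 1.0],
              [0.8, 0.3]])
print(max_min(x, W))      # -> [0.8 0.3]
print(max_product(x, W))  # -> [0.72 0.27]
```

Unlike min, the product lets a strong input compensate gradually through the weight magnitude, which is the compensability property cited in [36] as the reason for preferring max-product.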

4.1 Define T classes for T persons' sets of view-invariant face images

The reference face images for all T persons in the database are assigned a unique number from 1 to (T × S), where S is the number of different view angles specified in Section 3. Class numbers are assigned from 1 to T. The same person's face images at different viewing angles are arranged in sequence according to the assigned unique numbers and classified under the same class number.

4.2 Training Max-Product Fuzzy Neural Network Classifier

The max-product fuzzy neural network classifier is trained in the 4 steps listed below:

1.) The PSRs [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] output from the quaternion correlator of the enrollment stage are fuzzified through the activation functions (Gaussian membership functions):

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII](28)

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (29)

where α is the smoothing factor, i.e. the standard deviation of the Gaussian functions.

2.) Calculate the G-value, which is the product value for the s-th reference face image of the fuzzy neural network classifier at each correlated image:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (30)

3.) Gather and store the product values in an array:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (31)

4.) The output is set to 1 for the authentic class and 0 for the imposter class; the targets are stored in an array Y_identity, which is an identity matrix of dimension T × T. The weight w for the s-th angle face image is calculated as:

w_s = (X_s,training)^−1 Y_identity (32)
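The four training steps can be sketched as follows. Since the exact forms of (28)-(31) did not survive extraction, the Gaussian memberships and the product G-value below are plausible reconstructions (assumptions on our part), and the inverse in (32) is computed with a pseudo-inverse for numerical safety:

```python
import numpy as np

def gaussian_membership(value, center, alpha):
    """Gaussian activation with smoothing factor alpha (std. deviation)."""
    return np.exp(-((value - center) ** 2) / (2.0 * alpha ** 2))

def train_weights(p_vals, psr_vals, p_ref, psr_ref, alpha=1.0):
    """Sketch of training steps 1-4 for one view angle s.

    p_vals, psr_vals: T x T arrays of p and PSR values from correlating
    every person's reference image against every other's.  p_ref and
    psr_ref are assumed membership centers; the G-value is assumed to be
    the product of the two memberships (the paper's eqs. 28-30 are not
    reproducible from the extracted text).
    """
    mu_p = gaussian_membership(p_vals, p_ref, alpha)        # step 1
    mu_psr = gaussian_membership(psr_vals, psr_ref, alpha)  # step 1
    X_training = mu_p * mu_psr              # steps 2-3: G-values in an array
    Y_identity = np.eye(X_training.shape[0])  # step 4: 1 authentic, 0 imposter
    return np.linalg.pinv(X_training) @ Y_identity  # eq. (32), w_s
```

The pseudo-inverse coincides with the ordinary inverse when X is square and well-conditioned, but also tolerates the near-singular cases that can arise with strongly overlapping memberships.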

4.3 Max-Product Fuzzy Neural Network Classification

The max-product fuzzy neural network classification proceeds in 7 steps:

1.) The [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] outputs from the quaternion correlator of the recognition stage are fuzzified through the activation functions (Gaussian membership functions):

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (33)

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (34)

2.) Calculate the product value of the fuzzy neural network classifier for the input face image against the training face images in the database:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (35)

3.) Gather and store the product values in an array:

X_s,classification = [G_s(i,1) G_s(i,2) ... G_s(i,T)] (36)

4.) Obtain the classification outcomes by multiplying (36) with the weights trained in (32):

Y_classification = X_s,classification × w_s (37)

5.) Classify the input face image to the person it belongs to by using max composition:

Output = max{Y_classification} (38)

6.) Determine whether the face image is in the database or not:

If normalized Output [less than or equal to] [Thres.sub.output]

Then conclude: "The face is not in the database".

Else determine which element in [Y.sub.classification] matrix match with Output:

[psi] = the position number of element in [Y.sub.classification] matrix which has the equal value with Output. (39)

[Thres.sub.output] is the threshold value of an output to indicate that a face is not in the database.

ψ corresponds to the assigned number of the reference image in the database.

7.) Based on T sets of fuzzy IF-THEN rules, perform defuzzification:

[R.sup.l]: IF ψ matches the unique number stored in Class l, THEN display the name of the person corresponding to Class l. (40)

where l = 1, 2, ... T.
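
The seven steps above can be sketched end to end as follows. This is a hedged illustration: the Gaussian membership parameters mu and sigma, and the handling of the normalized Output, are assumptions, since the paper does not fix them at this point:

```python
import numpy as np

def classify(psr, p, w_s, thres_output=0.6, mu=1.0, sigma=0.1):
    """Sketch of the max-product classification, Eqs. (33)-(40).

    psr, p : length-T arrays of normalized PSR and p-values of the
             input image against each reference person.
    w_s    : (T, T) weight matrix from Eq. (32).
    mu, sigma : assumed Gaussian-membership parameters (illustrative,
                not values from the paper).
    """
    gauss = lambda x: np.exp(-((np.asarray(x) - mu) ** 2) / (2 * sigma ** 2))
    X_class = gauss(psr) * gauss(p)   # Steps 1-3: fuzzify, take product (G-values)
    Y_class = X_class @ w_s           # Step 4: classification outcomes
    output = Y_class.max()            # Step 5: max composition
    if output <= thres_output:        # Step 6: "the face is not in the database"
        return None
    return int(np.argmax(Y_class))    # psi: index of the matching reference

# toy run: identity weights; person 1 has PSR and p-value closest to mu = 1
print(classify([0.5, 1.0, 0.4], [0.6, 0.98, 0.5], np.eye(3)))  # 1
```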

5 Experimental results

In this section, the application of the quaternion based face image correlator together with the max-product fuzzy neural network classifier to view-invariant face recognition is briefly illustrated. Experimental results are used to demonstrate the efficiency of the algorithms introduced in Sections 3 and 4.

5.1 Database of reference face images for 200 persons

A database of view-invariant color face images provided by the Max Planck Institute for Biological Cybernetics in Tuebingen, Germany [40] is used to test the proposed view-invariant color face image recognition system. The database contains color face images of 7 views of 200 laser-scanned (Cyberware™) heads without hair, giving 200 person sets of color face images, each at a different view/angle: facing 90° to the left, 60° to the left, 30° to the left, 0° (frontal), 30° to the right, 60° to the right and 90° to the right. Hence S = 7, since there are 7 view-invariant images per person set. An example of a person set with view-invariant face images is shown in Figure 5. The dimension of each image is 256 x 256 pixels.

5.2 Quaternion based face image correlation using unconstrained optimal trade-off synthetic discriminant filter (UOTSDF)

In the evaluation experiment, T = 180 MPIK persons' faces are used to train the system during the enrollment stage. T x S = 1260 reference face images in the database are used to synthesize a single UOTSDF using (23). D and m are calculated from the reference images, C is an identity matrix of dimension 1260 x 1260, and α is set to 1. These values are substituted into (23) to compute the filter coefficients. In the enrollment stage, for each filter line as in Figure 3, cross-correlations are performed between the DQFT forms of all reference face images in the database [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] and the DQFT forms of the reference face images themselves [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], and each output value is multiplied by the corresponding filter coefficient, where [t.sub.1], [t.sub.2] = 1, 2, ..., 180; s = 1, 2, ..., 7. In the recognition stage, for each filter line, the DQFT form of the input face image ([h.sub.(i)]) is cross-correlated with the DQFT forms of the reference face images in the database [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], and each output value is multiplied by the corresponding filter coefficient. For the authentic case (a good match between the two face images), the correlation plane should exhibit a sharp peak; it should not exhibit such a strong peak for the imposter case (a bad match between the two face images). These two cases are investigated below:
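
The cross-correlation carried out in each filter line can be illustrated on a single channel with an ordinary 2-D FFT; the paper's DQFT version correlates all three color channels jointly, which this sketch does not capture:

```python
import numpy as np

def correlation_plane(img, ref):
    """Cross-correlation plane via the FFT (single channel only).

    A good match yields a tall, sharp peak in the plane; a mismatch
    yields a low, rounded one -- the same mechanics the quaternion
    (DQFT) correlator applies to color images.
    """
    F = np.fft.fft2(img)
    G = np.fft.fft2(ref)
    return np.real(np.fft.ifft2(F * np.conj(G)))

rng = np.random.default_rng(0)
face = rng.random((64, 64))
plane_auth = correlation_plane(face, face)                  # authentic: self-match
plane_imp = correlation_plane(face, rng.random((64, 64)))   # imposter
print(plane_auth.max() > plane_imp.max())  # True: matched pair peaks higher
```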

Authentic case: Figure 6 shows the sample correlation plane for an input face image matched against the exact reference face image of the same person in the database. Since both face images are in good match, the observed correlation plane has a smooth, sharp peak.

Imposter case: Figure 7 shows the sample correlation plane for an input face image matched against a reference face image of a different person in the database. Since the two face images are not in good match, the observed correlation plane has a lower, rounder peak compared to that of a good match.

Table 1 shows the PSR and p-value for both the authentic and imposter cases of Fig. 6 and Fig. 7. Note that the sharp correlation peak results in a large normalized PSR and p-value in the authentic case, whereas the imposter case exhibits a small PSR and p-value.
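
The PSR itself can be computed with the common correlation-filter definition: peak minus sidelobe mean, over sidelobe standard deviation. The window sizes below are assumptions, not the paper's values:

```python
import numpy as np

def psr(plane, exclude=5, region=20):
    """Peak-to-sidelobe ratio of a correlation plane.

    Sidelobe statistics are taken in a (2*region+1)-wide window around
    the peak, minus a small central exclusion zone around the peak
    itself. Window sizes are illustrative assumptions.
    """
    r, c = np.unravel_index(np.argmax(plane), plane.shape)
    peak = plane[r, c]
    lo_r, hi_r = max(r - region, 0), min(r + region + 1, plane.shape[0])
    lo_c, hi_c = max(c - region, 0), min(c + region + 1, plane.shape[1])
    win = plane[lo_r:hi_r, lo_c:hi_c].astype(float)
    pr, pc = r - lo_r, c - lo_c
    win[max(pr - exclude, 0):pr + exclude + 1,
        max(pc - exclude, 0):pc + exclude + 1] = np.nan  # mask out the peak area
    side = win[~np.isnan(win)]
    return (peak - side.mean()) / side.std()

rng = np.random.default_rng(1)
noise = rng.normal(0.0, 0.1, (64, 64))
sharp = noise.copy(); sharp[32, 32] = 10.0   # authentic-like: tall, narrow peak
blunt = noise.copy(); blunt[32, 32] = 1.0    # imposter-like: low peak
print(psr(sharp) > psr(blunt))  # True
```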

To indicate that a face is not in the database, a threshold [Thres.sub.output] is applied to the normalized Output value in Step 6 of section 4.3. The 20 persons' face samples excluded from the training database are input to the trained system to run an accuracy test over normalized Output values ranging from 0.05 to 1.0. The plot is shown in Figure 8. From the plot, the optimum [Thres.sub.output] is 0.6.

5.3 Efficiency of the view-invariant color face image recognition system

The view-invariant color face image recognition system was evaluated by randomly picking 10,000 input face images (with repetition) from the database, mixing the trained T = 180 persons' faces with 20 additional persons' faces excluded from the training database, and inputting them to the system. The graph of accuracy versus number of person sets in the database is plotted in Figure 9.

From the plot, it can be observed that as the number of persons in the database increases, the performance drop is small when the G-value is applied in the proposed view-invariant color face image recognition system. Among the 10,000 input face images, 9,973 were tracked perfectly (output human names/ID agreed with the input images) in a database of 10 persons, i.e. an accuracy of 99.73%, while 9,821 were tracked perfectly in a database of 200 persons, i.e. an accuracy of 98.21%. The performance drop increases if the proposed face recognition system applies only the PSR value (no G-value and p-value): the accuracy is 99.62% in a database of 10 persons, but 97.73% in a database of 200 persons, almost a 0.24-fold larger performance drop compared to the system applying the G-value. The performance drop in the face recognition system applying only the p-value (no G-value and PSR value) is rather significant: the accuracy is 99.50% in a database of 10 persons, but 97.04% in a database of 200 persons, almost a 0.62-fold larger performance drop compared to the system applying the G-value. From the experimental results, it can be concluded that the implementation of the G-value together with the fuzzy neural network classifier helps boost the accuracy of view-invariant color face image recognition.

5.4 Comparative study with parallel method

For the comparative study, the proposed quaternion based fuzzy neural network classifier is compared with conventional NMF, BDNMF and the hypercomplex Gabor filter. For conventional NMF, the reference face images of the section 5.1 database are extracted and used. Seven training sets are formed, each exclusively containing the color face images of every person in one position (facing 90° to the left, 60° to the left, 30° to the left, 0° frontal, 30° to the right, 60° to the right and 90° to the right). For each training set, three different basis matrices and encodings were extracted, one for each color channel in the RGB scheme; [F.sup.l], where l ∈ {R, G, B}, is constructed so that each color channel l of the training color face images occupies the columns of the [F.sup.l] matrix. The rank r of the factorization is generally chosen so that [41]:

r < nm/[n+m] (41)

In this case, n = 7 and m = 180, so r < 6.74; hence r is set to 6. The experiment measured the enrollment stage time consumption and the classification stage time consumption; the recorded times are normalized and listed in Table 2. For the recognition accuracy, a total of 10,000 randomly selected (with repetition) MPIK color face images, mixing the trained T = 180 persons' faces with 20 additional persons' faces excluded from the training database, are tested. These are distributed as follows: 5,000 normal MPIK color face images; 2,500 normal MPIK color face images embedded with noise features such as "salt and pepper", "Poisson" and "speckle" noise as in the Matlab image processing toolbox; and 2,500 normal MPIK color face images with scale variation (shrinking or dilation). Some examples are shown in Figure 10. The recognition accuracy (percentage of correctly recognized images out of the 10,000 tested images) is recorded in Table 2.
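
The rank rule in (41) and the resulting choice r = 6 can be verified with a quick computation:

```python
# Rank rule r < nm/(n+m) from Eq. (41), with the experiment's values:
# n = 7 views per person, m = 180 training persons.
n, m = 7, 180
bound = n * m / (n + m)
print(round(bound, 2))   # 6.74 -> largest admissible integer rank is 6
r = int(bound)           # 6, the value used in the experiments
```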

For the BDNMF method, to evaluate the performance on different color spaces, the color faces are separated into RGB channels and a face recognition experiment using the BDNMF algorithm is conducted. The rank of factorization r is likewise set to 6. In the experiment, all seven color face images of each person in different positions are used to constitute the training/enrollment set. For the testing/classification set, the same 10,000 face images used in testing the conventional NMF method are used. The results of the identification test, including the enrollment stage time consumption (normalized), classification stage time consumption (normalized) and matching accuracy, are shown in Table 2.

To evaluate the effectiveness of the hypercomplex Gabor filter proposed in [5] for feature extraction, the filter is applied to all the MPIK RGB color face images of section 5.1. Each color face image was analyzed at a total of 24 landmark locations, determined by statistical analysis of the MPIK face image population according to [5]. Since the Mahalanobis distance applied in [5] yields higher accuracies compared to the ordinary Euclidean distance approach, matching in this comparative study was performed using Mahalanobis distance classification. The jets extracted at the chosen face landmark locations were used for face matching, and the Mahalanobis distance was computed using the global covariance matrix over all the color face landmarks. During classification, jets derived from the color face images in the database were matched against models consisting of jets extracted from the same set of 10,000 color face images used in testing both conventional NMF and BDNMF above, for accuracy measurement. The results of the identification test, including the normalized enrollment stage time consumption, normalized classification stage time consumption and matching accuracy, are shown in Table 2.

From the experimental results in Table 2, it is observed that the quaternion based fuzzy neural network classifier has the fastest enrollment and classification times, followed by the hypercomplex Gabor filter with Mahalanobis distance classification, conventional NMF, and finally the slowest, BDNMF. Conventional NMF and BDNMF are slow because they require an iterative training stage for enrollment, which is time consuming compared to the proposed quaternion based fuzzy neural network classifier and the hypercomplex Gabor filter. Comparing conventional NMF and BDNMF, the BDNMF algorithm imposes an additional block diagonal constraint on the base image matrix and coefficient matrix, slowing down the enrollment process. However, with the block diagonal constraint, BDNMF better preserves the integrity of color information across channels for color face representation, and hence achieves higher accuracy in color face recognition. Compared to the fuzzy neural network, the Gabor filter requires a large number of different kernels, so the length of the feature vectors in the quaternion domain increases dramatically; the Gabor filter therefore requires more time for enrollment and classification. In terms of recognition accuracy, the proposed quaternion based fuzzy neural network outperforms the hypercomplex Gabor filter, conventional NMF and BDNMF in recognizing view-invariant, noise-influenced and scale-invariant MPIK color face images.

6 Conclusion

This paper has presented a system capable of recognizing view-invariant color face images from the MPIK dataset, using a quaternion based color face image correlator and a max-product fuzzy neural network classifier. One advantage of the quaternion correlator over conventional correlation methods is that it deals with color images directly, without converting them into grayscale, so important color information is preserved. The proposed max-product fuzzy neural network also provides a high-level framework for approximate reasoning, making it well suited to face image classification. Our experimental results show that the proposed face recognition system performs well, with a very high accuracy of about 98% on a dataset of 200 persons, each with 7 view-invariant images. In a comparative study with parallel work, experimental results also show that the proposed face recognition system outperforms conventional NMF, BDNMF and the hypercomplex Gabor filter in terms of enrollment time, recognition time and accuracy in classifying view-invariant, noise-influenced and scale-invariant color face images from MPIK. Since an artificial dataset (MPIK) was used in the experiments, which might be impractical, this work opens a number of avenues for further work, falling into three main directions. Firstly, more rigorous work is needed to investigate the system's performance in realistic environments, and the system should be extended to handle variations including translation, facial expression and illumination; real face images such as the FERET dataset might be employed in training as well as empirical tests. Secondly, facial image pre-processing mechanisms, mainly eye detection and geometric and illumination normalization, might be employed to ease image acquisition; large-scale acquisition and storage of facial data might raise security concerns in terms of identity theft. A third extension might be the employment of cancellable face data as a step to reinforce the system's security.

References

[1] L. Torres, J. Y. Reutter and L. Lorento, (1999). "The importance of the color information in face recognition", Proc. Int. Conf. on Systems, Man and Cybernetics, Vol. 3, p.p. 627-631.

[2] M. Rajapakse, J. Tan, J. Rajapakse (2004). "Color Channel Encoding With NMF for Face Recognition", 2004 Int. Conf. on Image Processing (ICIP 2004), p.p. 2007-2010.

[3] C. Wang, X. Bai (2009). "Color Face Recognition Based on Revised NMF Algorithm", 2nd Int. Conf. on Future Information Technology and Management Engineering, p.p. 455-458.

[4] X. Bai, C. Wang (2009). "Fisher diagonal NMF based color face recognition", 2010 Chinese Control and Decision Conference (CCDC), p.p. 4158-4162.

[5] C. Jones III, A. L. Abbott (2006)."Color Face Recognition by Hypercomplex Gabor Analysis", Proc. Of the 7th Int. conf. on Automatic Face and Gesture Recognition (FGR' 06), p.p. 1-6.

[6] L. Wiskott, J. Fellous, N. Kruger and C. von der Malsburg (1999). "Face recognition by elastic bunch graph matching", Intelligent Biometric Techniques in Fingerprint and Face Recognition, CRC Press, p.p. 355-396.

[7] B. Duc, S. Fischer, and J. Bigun (1999). "Face authentication with Gabor information on deformable graphs", IEEE Trans. On Image Processing, Vol. 8, No. 4, p.p. 504-516.

[8] C. Liu and H. Wechsler (2001). "A Gabor feature classifier for face recognition", Proc. Eight IEEE Int. Conf. on Computer Vision, Vol. 2, p.p. 270-275.

[9] S. Lawrence, C. L. Giles, A. C. Tsoi and A. Back (1997). "Face Recognition: A Convolutional Neural Network Approach", IEEE Trans. On Neural Networks, Vol. 8, No. 1, p.p. 98-113.

[10] I. Paily, A. Sachenko, V. Koval, Y. Kurylyak (2005). "Approach to Face Recognition Using Neural Networks", IEEE Workshop on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications, Sofia, Bulgaria, p.p. 112-115.

[11] S. C. Pei, J. J. Ding and J. Chang (2001). "Color pattern recognition by quaternion correlation", Proc. of Int. Conf. on image Processing, Vol. 1, p.p. 894-897.

[12] B.V.K. Kumar, D.W. Carlson, and A. Mahalanobis (1994). "Optimal trade-off synthetic discriminant function filters for arbitrary devices", Optics Letters, Vol. 19, No. 19, p.p. 1556-1558.

[13] S. Kumar (2004). Neural Networks: A Classroom Approach, McGraw Hill, Int. Ed.

[14] B.V.K. Vijaya Kumar, M. Savvides, K. Venkataramani and C. Xie (2002). "Spatial frequency domain image processing for biometric recognition", Proc. Of Int. Conf. on Image Processing, Vol. 1, p.p. 153-156.

[15] W. K. Wong, C. K. Loo, W.S. Lim and P. N. Tan (2009). "Quaternion based thermal condition monitoring system", Fourth International Workshop on Natural Computing (IWNC 2009), Himeji, Japan, p.p. 317-327.

[16] W. R. Hamilton (1866). Elements of Quaternions, London, U.K.: Longmans, Green.

[17] C. Xie, M. Savvides and B.V.K. Vijaya Kumar (2005). "Quaternion correlation filters for face recognition in wavelet domain", Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP 2005), p.p. 1185-1188.

[18] T.A. Ell (1993). "Quaternion-Fourier transforms for analysis of two-dimensional linear time-invariant partial differential systems", Proc. of 32nd Conf. Decision Contr., p.p. 1830-1841.

[19] T.A. Ell (1992). "Hypercomplex spectral transforms", PhD dissertation, Univ. Minnesota, Minneapolis.

[20] S.C. Pei, J.J. Ding and J.H. Chang (2001). "Efficient implementation of quaternion Fourier transform convolution and correlation by 2-D Complex FFT", IEEE Trans. on Signal Processing, Vol. 49, No. 11, p.p. 2783-2797.

[21] S.J. Sangwine and T.A. Ell (1999). "Hypercomplex auto- and cross-correlation of colour images", Proc. of Int. Conf. on Image Processing, (ICIP 1999) p.p. 319-323.

[22] T.A. Ell and S.J. Sangwine (2000). "Colour--sensitive edge detection using hypercomplex filters", (EUSIPCO 2000) p.p. 151-154.

[23] A Vanderlugt (1964). "Signal detection by complex spatial filtering", IEEE Trans. Inf. Theory, Vol. 10, p.p.139-145.

[24] M. Savvides, K. Venkataramani and B.V.K. Vijaya Kumar (2003). "Incremental updating of advanced correlation filters for biometric authentication systems", Proc. of Int. Conf. on Multimedia and Expo, Vol. 3 (ICME 2003) p.p. 229-232.

[25] A. Mahalanobis, B.V.K. Vijaya Kumar and D. Casasent (1987). "Minimum average correlation energy filters", Applied Optics, Vol. 26, p.p. 3633-3640.

[26] M. Savvides, B.V.K. Vijaya Kumar and P. Khosla (2002). "Face verification using correlations filters", Procs of 3rd IEEE Automatic Identification Advanced Technologies, Tarrytown, N.Y., p.p. 56-61.

[27] A. Mahalanobis, B.V.K. Vijaya Kumar, S.R.F. Sims and J.F. Epperson (1994). "Unconstrained correlation filters", Applied Optics, Vol. 33, p.p. 3751-3759.

[28] P. Refreiger (1990). "Filter design for optical pattern recognition: multi-criteria optimization approach", Optics Letters, Vol. 15, p.p. 854-856.

[29] J.J. Buckley and Y. Hayashi (1994). "Fuzzy neural networks: A survey", Fuzzy Sets and Systems, 66, p.p. 1-13.

[30] C.T. Lin and C.S.G. Lee (1996). Neural Fuzzy Systems: A Neuro-Fuzzy Synergism to Intelligent Systems, Prentice Hall, Upper Saddle River, N.J.

[31] D. Nauck, F. Klawonn and R. Kurse (1997). Foundations of Neuro-Fuzzy Systems, Wiley, Chichester, U.K..

[32] S.K. Pal and S. Mitra (1999). Neuro-Fuzzy Pattern Recognition: Methods in Soft Computing, Wiley, Chichester, U.K.

[33] R. Ostermark (1999). "A Fuzzy Neural Network Algorithm for Multigroup Classification", Elsevier Science, Fuzzy Sets and Systems, 105, p.p. 113-122.

[34] H.K. Kwan and Y. Cai (1997). "A Fuzzy Neural Network and its Application to Pattern Recognition", IEEE Trans. on Fuzzy Systems, 2(3), p.p. 185-193.

[35] G.Z. Li and S.C. Fang (1998). "Solving interval-valued fuzzy relation equations", IEEE Trans. on Fuzzy Systems, Vol. 6, No. 2, p.p. 321-324.

[36] J. Leotamonphong and S. Fang (1999). "An efficient solution procedure for fuzzy relation equations with max product composition", IEEE Trans. on Fuzzy Systems, Vol. 7, No. 4, p.p. 441-445.

[37] M.M. Bourke and D.G. Fisher (1996). "A predictive fuzzy relational controller", Proc. of the Fifth Int. Conf. on Fuzzy Systems, p.p. 1464-1470.

[38] M.M. Bourke and D.G. Fisher (1998). "Solution algorithms for fuzzy relational equations with max-product composition", Fuzzy Sets Systems, Vol. 94, p.p. 61-69.

[39] P. Xiao and Y. Yu (1997). "Efficient learning algorithm for fuzzy max-product associative memory networks", SPIE, Vol. 3077, p.p. 388-395.

[40] N. Troje and H. H. Bulthoff (1996). "Face recognition under varying poses: The role of texture and shape." Vision Research 36, p.p. 1761-1771. Redirected from http://faces.kyb.tuebingen.mpg.de/

[41] D.D. Lee, H. S. Seung (1999). "Learning the parts of objects by non-negative matrix factorization", Nature 401, p.p. 788-791.

Wai Kit Wong and Gin Chong Lee

Faculty of Engineering and Technology, Multimedia University

75450 JLN Ayer Keroh Lama, Melaka, Malaysia.

E-mail: wkwong@mmu.edu.my, gclee@mmu.edu.my

Chu Kiong Loo, and Raymond Lock

Faculty of Computer Science and Information Technology, University of Malaya

50603 Lembah Pantai, Kuala Lumpur, Malaysia.

E-mail: ckloo.um@gmail.com

Received: January 12, 2013

Table 1: Normalized PSR and p-value for both the authentic
and imposter cases.

Case              Normalized PSR   Normalized p-value

Authentic case        1.0000             0.7905
Imposter case         0.7894             0.5083

Table 2: Normalized enrollment stage time consumption,
normalized classification stage time consumption and
matching accuracy for the different color face
classification methods.

Color face                     Enrollment stage    Classification stage   Accuracy (output
classification method          normalized time     normalized time        human names/ID
                               (training all       (matching 10,000       match the
                               datasets in         tested images)         corresponding
                               database)                                  input images)

Conventional NMF                    2.76                 1.39                 80.18%
BDNMF                               3.51                 1.55                 83.37%
Hypercomplex Gabor Filter           1.36                 1.20                 86.13%
  (Mahalanobis Distance
  Classification)
Quaternion based Fuzzy              1.00                 1.00                 92.06%
  Neural Network Classifier
COPYRIGHT 2013 Slovenian Society Informatika
