# Cross-Modality 2D-3D Face Recognition via Multiview Smooth Discriminant Analysis Based on ELM

1. Introduction

During the past several decades, face recognition (FR) has gained widespread attention due to its potential application value as well as its theoretical challenges compared with other biometrics [1]. 2D face recognition has achieved good results under controlled conditions. However, 2D FR is usually affected by pose, expression, illumination, and occlusion variations, and its recognition accuracy still needs improvement in many real-world applications. With the development of 3D scanners and 3D information acquisition technologies, 3D face recognition (3D FR) has proven quite robust to variations in illumination and pose, achieving high recognition accuracy [2].

Up to now, a number of 3D FR methods have been proposed to address FR problems such as illumination, pose, and expression variations. Generally, 3D FR algorithms consist of three main steps: 3D data preprocessing and normalization, 3D facial feature extraction, and pattern classification. 3D facial feature extraction plays a vital role in the 3D FR process. Common 3D feature representation methods can be naturally grouped into three categories: (1) holistic feature-based methods [3-8], (2) local feature-based methods [9-13], and (3) hybrid methods [14-17].

Though current 3D face recognition algorithms perform well under illumination and pose variations, their performance declines in many new applications. For instance, only visible 2D photographs or low-resolution video images of subjects may be available, while the gallery database may consist only of 3D face models. Most FR systems and algorithms are not specifically developed for cross-modality 2D-3D face matching, and little work has been presented on 2D-3D feature representation. Furthermore, 3D facial feature extraction methods are generally time-consuming; this high computational complexity limits their application to high-dimensional face features and large face databases.

To address this problem, a novel approach named Multiview Smooth Discriminant Analysis based on the recent Extreme Learning Machine (ELM) (Figure 1) is presented for cross-modality 2D-3D FR in this paper. In this new approach, Multiview Smooth Discriminant Analysis is first formulated to solve the multiple view-specific linear projections by adding a Laplacian smoothing constraint and to learn a common feature space for the cross-modality 2D-3D face features. Then, ELM is utilized as the feature classifier by mapping the extracted features to a high-dimensional vector and treating the classification task as a regression problem.

The rest of this paper is organized as follows. Section 2 reviews related work, and Section 3 describes the Multiview Smooth Discriminant Analysis based on ELM method. Section 4 presents experimental results and discussion. Section 5 concludes the paper.

2. Related Work

In this section, we briefly review some recent work related to our approach, namely 3D face recognition and Extreme Learning Machines (ELM).

2.1. 3D Face Recognition. 3D FR can be described in three steps: 3D data preprocessing and normalization, 3D facial feature extraction, and pattern classification. In general, 3D face feature representation methods are classified into (1) holistic feature-based methods [3-8], (2) local feature-based methods [9-13], and (3) hybrid methods [14-17].

The holistic feature-based methods usually use the whole face region as the input for facial feature extraction. Typical statistical methods, such as PCA [3] and LDA [4], which are popular 2D FR techniques, have also been extended to 3D FR. In [5], PCA is used to extract intrinsic discriminant feature vectors from 2D intensity images and 3D depth images, respectively, and a fusion method is then utilized to obtain the final results. ICP-based 3D face recognition methods [6, 7], which utilize the entire facial surface directly as the holistic feature, have been applied to 3D face registration [8], where recognition is achieved by separating the rigid facial parts from the nonrigid ones. However, this kind of 3D FR method fails to consider the local geometry, which contains the intrinsic structure of the 3D data distribution. Thus, these methods are sensitive to expression, illumination, and pose variations.

The local feature-based methods utilize the 3D face appearance or regional geometric features to represent 3D faces. Shapes, curvatures, and 3D facial landmarks, as well as other feature descriptors, are used as intrinsic 3D local structures to solve FR problems. For instance, Queirolo et al. [9] used a simulated annealing-based approach (SA) and the surface interpenetration measure (SIM) to quantify the differences between two face images. Berretti et al. [10] explored the complete geometrical information of the 3D face model and used iso-geodesic stripes to distinguish facial feature differences. Tang et al. [11, 12] proposed an expression-insensitive 3D FR algorithm based on local binary patterns (LBP). Wang and Chua [13] proposed the invariant 3D spherical Gabor filter (3D SGF) and the least trimmed square Hausdorff distance (LTS-HD) to handle occlusion problems in 3D FR.

The hybrid methods jointly utilize holistic and local features for 3D FR [16]. Compared with the other two categories, hybrid methods take advantage of both the 3D spatial information and the global statistical characteristics and have thus been demonstrated to be more robust in 3D FR. Spreeuwers [14] proposed to tackle face expression variations by dividing the face surface into partially overlapping small regions and then applying a decision-level fusion approach to these regions. Ter Haar and Veltkamp [15] performed 3D face matching and evaluation using profile and contour curves of the facial surface. Ming [17] proposed a 3D FR framework that utilizes curvature information and orthogonal spectral regression for efficient 3D discriminant feature extraction. However, most of these hybrid methods utilize two or more feature descriptors and thus have high computational complexity and cost.

Though new schemes have been proposed and have achieved remarkable recognition performance in 3D face recognition, some unsolved problems still affect FR performance. The performance of conventional 3D FR algorithms declines sharply when only 2D images are available as input test data. Besides, due to expensive equipment, computational complexity, and time-consuming 3D face preprocessing, it is difficult to perform online 3D FR in some real applications, such as airport or other security access control. As far as we know, very little work has been done on cross-modality 2D-3D face recognition [18-20]. Yang et al. [18] proposed a regularized kernel CCA method to learn the feature differences between 2D photos and 3D depth images. Jelsovka et al. [20] proposed a method for matching 3D range images with 2D-3D face images using facial curves and CCA. Some similar approaches, such as deep CCA [21] and dictionary learning [22, 23], have also been proposed to solve the cross-modality matching problem. CCA and its kernelized variant are the typical approaches for learning a common subspace for two-modality matching. However, CCA only learns linear mappings by maximizing the total correlation between the two views; it ignores the intra-view and inter-view correlations. In other words, it does not take discriminative information into account. Recently, Kan et al. [24] proposed Multiview Discriminant Analysis (MvDA) for cross-modality matching. Motivated by their work, we propose a Multiview Smooth Discriminant Analysis based on ELM. This new approach first uses a Laplacian smoothing constraint to make the projection basis spatially smooth and then takes advantage of ELM as a highly effective and efficient classifier.

2.2. Extreme Learning Machines (ELM). In this subsection, we briefly review ELM and its applications to pattern classification [25, 26]. ELM was recently proposed for efficiently training single-hidden-layer feedforward neural networks (SLFNs). ELM performs classification by mapping data to a high-dimensional vector and turning the classification task into a multioutput functional regression problem [27]. ELM provides good classification performance with a much shorter training time and minimal human interference [26]. In [27], a voting-based ELM (V-ELM) was developed to enhance multiclass classification performance: by employing multiple independent ELMs and making the final prediction by majority voting, V-ELM achieves a higher classification rate than plain ELM, and simulations on real-world classification databases demonstrate that it generally outperforms several recent comparable methods with a fast training speed. Huang et al. [28] extended ELM to least squares SVM (LS-SVM) [29] and proximal SVM (PSVM) [30] and provided a unified solution for multiclass classification. Kasun et al. [31] proposed an ELM-based autoencoder (ELM-AE) for big data applications.

3. The Proposed Cross-Modality 2D-3D FR

In this section, we first introduce the basic idea and formulation of our proposed approach. Illustration of the Multiview Smooth Discriminant Analysis is shown in Figure 2. Then, we explain how to solve the optimization problem. Finally, the proposed 2D-3D FR approach is presented with data processing and ELM classification.

3.1. Multiview Smooth Discriminant Analysis. As shown in Figure 2, MSDA aims at finding $n$ linear transforms $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_n]$, which represent the projection matrices from the $n$ views to the common discriminative space. Denote the $v$th-view data samples by $X^{(v)} = \{x^{(v)}_{ij} \mid i = 1, 2, \ldots, C;\ j = 1, 2, \ldots, \rho^{(v)}_i\}$, where $x^{(v)}_{ij} \in \mathbb{R}^D$ is the $j$th sample of the $i$th class from the $v$th view, $C$ is the number of classes, and $\rho^{(v)}_i$ is the number of class-$i$ samples in the $v$th view. The projected data in the common space are denoted by $Y = \{y^{(v)}_{ij} = \alpha^T_v x^{(v)}_{ij} \mid i = 1, 2, \ldots, C;\ j = 1, 2, \ldots, \rho^{(v)}_i;\ v = 1, 2, \ldots, n\}$. MSDA smooths the basis vectors of the face data from different views by applying the Laplacian smoothing functional. The objective function of MSDA is defined as follows:

$$\alpha^{*} = \arg\max_{\alpha} \frac{\alpha^T S^X_B \alpha}{\alpha^T \left[(1-\lambda)\, S^X_W + \lambda\, \Lambda^T \Lambda\right] \alpha}, \qquad (1)$$

where $S^X_B$ and $S^X_W$ are the multiview between-class scatter matrix and within-class scatter matrix, respectively, $J(\alpha) = \alpha^T \Lambda^T \Lambda \alpha$ is the discretized Laplacian regularization functional, and $\lambda$ is the smoothness-controlling parameter with $0 \le \lambda \le 1$.

(1) Laplacian Smoothing. The Laplacian operator is defined as follows [32]:

$$L\varphi(\tau) = \sum_{k=1}^{d} \frac{\partial^2 \varphi}{\partial \tau_k^2}, \qquad (2)$$

where $\varphi$ is a function defined on the $d$-dimensional region of interest $\Omega$. The Laplacian penalty functional $J$, which measures the smoothness of the function $\varphi$, is defined as

$$J(\varphi) = \int_{\Omega} \left[L\varphi\right]^2 d\tau. \qquad (3)$$

In our proposed method we mainly work with face images, so we adopt the discretized Laplacian smoothing method [33]. Let the $n_1 \times n_2$ face images be represented as vectors in $\mathbb{R}^m$, and let $\beta_i \in \mathbb{R}^m$ be the basis vectors to be smoothed. For an image, whose region of interest $\Omega$ is a two-dimensional lattice, let $\epsilon = [\epsilon_1, \epsilon_2]$ with $\epsilon_1 = 1/n_1$ and $\epsilon_2 = 1/n_2$, and define the two-dimensional grid points $\tau_l = (\tau_{l_1}, \tau_{l_2})$, where $\tau_{l_k} = (l_k - 0.5)\,\epsilon_k$, $1 \le l_k \le n_k$, $1 \le k \le 2$. The total number of grid points in the lattice is $m = n_1 \times n_2$. Suppose $\mu = [\mu(\tau_1), \mu(\tau_2), \ldots, \mu(\tau_{n_k})]^T$ is an $n_k$-dimensional vector discretizing the function $\mu(\tau)$ along the $k$th coordinate; then $D_k$, an $n_k \times n_k$ matrix that yields a discrete approximation to (2), has the following property:

$$\left[D_k \mu\right]_l \approx \frac{\partial^2 \mu(\tau_l)}{\partial \tau_k^2}, \qquad (4)$$

where $l = 1, \ldots, n_k$. In our case, we choose $D_k$ to be the modified Neumann discretization [33, 34].

Given $D_k$, a discrete approximation to the 2D Laplacian $L$ is the $m \times m$ matrix defined as

$$\Lambda = D_1 \otimes I_2 + D_2 \otimes I_1, \qquad (5)$$

where $I_k$ is the $n_k \times n_k$ identity matrix for $k = 1, 2$. It is not difficult to show that $\|\Lambda \alpha\|^2$ is directly related to the sum of the squared differences between nearby entries of $\alpha$, viewed as an $n_1 \times n_2$ image. Hence (5) measures the smoothness of $\alpha$ on the $n_1 \times n_2$ lattice.
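The construction of $\Lambda$ can be sketched in NumPy. Here `second_diff` is a hypothetical stand-in for the modified Neumann discretization $D_k$ (a plain second-difference matrix with reflecting boundaries), and `laplacian_2d` assembles (5) literally:

```python
import numpy as np

def second_diff(n):
    # Stand-in for the modified Neumann discretization D_k:
    # second-difference matrix with reflecting (Neumann-style) boundaries.
    D = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
    D[0, 0] = D[-1, -1] = -1.0
    return D

def laplacian_2d(n1, n2):
    # Assemble (5): Lambda = D1 (x) I2 + D2 (x) I1.
    D1, D2 = second_diff(n1), second_diff(n2)
    return np.kron(D1, np.eye(n2)) + np.kron(D2, np.eye(n1))
```

A quick sanity check of the smoothness interpretation: a constant image incurs zero penalty under $\|\Lambda \alpha\|^2$, while a noisy one is penalized.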

(2) The Algorithm and Solution. In this part, we present the MSDA algorithm and its solution. According to the formulation of MSDA, the between-class and within-class scatter matrices are calculated from the samples of all $n$ views. Hence, the between-class and within-class scatter matrices in (1) are formulated as follows:

$$S^X_B = \begin{bmatrix} S^{(11)}_B & \cdots & S^{(1n)}_B \\ \vdots & \ddots & \vdots \\ S^{(n1)}_B & \cdots & S^{(nn)}_B \end{bmatrix}, \qquad S^X_W = \begin{bmatrix} S^{(11)}_W & \cdots & S^{(1n)}_W \\ \vdots & \ddots & \vdots \\ S^{(n1)}_W & \cdots & S^{(nn)}_W \end{bmatrix}. \qquad (6)$$

The blocks $S^{(vr)}_B$ and $S^{(vr)}_W$ are given, following the multiview discriminant analysis formulation [24], by

$$S^{(vr)}_W = \sum_{i=1}^{C}\left[\delta_{vr}\sum_{j=1}^{\rho^{(v)}_i} x^{(v)}_{ij} x^{(v)T}_{ij} - \frac{\rho^{(v)}_i \rho^{(r)}_i}{\rho_i}\, m^{(v)}_i m^{(r)T}_i\right],$$
$$S^{(vr)}_B = \sum_{i=1}^{C} \frac{\rho^{(v)}_i \rho^{(r)}_i}{\rho_i}\, m^{(v)}_i m^{(r)T}_i - \frac{1}{N}\left(\sum_{i=1}^{C}\rho^{(v)}_i m^{(v)}_i\right)\left(\sum_{i=1}^{C}\rho^{(r)}_i m^{(r)}_i\right)^T, \qquad (7)$$

where $m^{(v)}_i = (1/\rho^{(v)}_i)\sum_{j=1}^{\rho^{(v)}_i} x^{(v)}_{ij}$ is the mean of the class-$i$ samples in the $v$th view, $\rho_i = \sum_{v=1}^{n}\rho^{(v)}_i$, $N$ is the total number of samples over all views, and $\delta_{vr}$ is the Kronecker delta. The discretized Laplacian regularization function is denoted as

$$J(\alpha) = \|\Lambda \alpha\|^2 = \alpha^T \Lambda^T \Lambda \alpha, \qquad (8)$$

where $\Lambda = \mathrm{diag}(\Lambda_1, \Lambda_2, \ldots, \Lambda_n)$ stacks the per-view Laplacian matrices.
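For concreteness, the block scatter computation of (6) and (7) might be sketched as follows. This follows the MvDA-style formulas of [24]; the function name and data layout are our own assumptions, not the authors' implementation:

```python
import numpy as np

def mvda_scatter(views, labels):
    # views: list of n arrays, each D x N_v (columns are samples)
    # labels: list of n integer label arrays, one per view
    n, D = len(views), views[0].shape[0]
    classes = np.unique(np.concatenate(labels))
    N = sum(v.shape[1] for v in views)
    Sb = np.zeros((n * D, n * D))
    Sw = np.zeros((n * D, n * D))
    for v in range(n):
        for r in range(n):
            blk_b = np.zeros((D, D))
            blk_w = np.zeros((D, D))
            sum_v, sum_r = np.zeros(D), np.zeros(D)
            for c in classes:
                Xv = views[v][:, labels[v] == c]
                Xr = views[r][:, labels[r] == c]
                rho = sum(int((labels[u] == c).sum()) for u in range(n))
                mv, mr = Xv.mean(axis=1), Xr.mean(axis=1)
                w = Xv.shape[1] * Xr.shape[1] / rho   # rho_i^(v) rho_i^(r) / rho_i
                blk_b += w * np.outer(mv, mr)
                blk_w -= w * np.outer(mv, mr)
                if v == r:                            # delta_vr sum_j x x^T
                    blk_w += Xv @ Xv.T
                sum_v += Xv.shape[1] * mv
                sum_r += Xr.shape[1] * mr
            blk_b -= np.outer(sum_v, sum_r) / N
            Sb[v*D:(v+1)*D, r*D:(r+1)*D] = blk_b
            Sw[v*D:(v+1)*D, r*D:(r+1)*D] = blk_w
    return Sb, Sw
```

Both block matrices come out symmetric and positive semidefinite, as expected of scatter matrices.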

Finally, the objective function (1) can be optimized by solving the following generalized eigenvalue problem and keeping the eigenvectors associated with its leading eigenvalues:

$$S^X_B \alpha = \eta\left[(1-\lambda)\, S^X_W + \lambda\, \Lambda^T \Lambda\right]\alpha. \qquad (9)$$
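A minimal sketch of solving (9) in plain NumPy follows; the small ridge term is our own addition for numerical stability and is not part of the formulation:

```python
import numpy as np

def msda_projections(Sb, Sw, LtL, lam=0.5, dim=2):
    # Generalized eigenproblem (9): Sb a = eta ((1 - lam) Sw + lam L^T L) a.
    M = (1.0 - lam) * Sw + lam * LtL
    M += 1e-6 * np.eye(M.shape[0])          # small ridge (our addition)
    eta, A = np.linalg.eig(np.linalg.solve(M, Sb))
    order = np.argsort(-eta.real)           # leading eigenvalues first
    return A[:, order[:dim]].real
```

The leading eigenvector maximizes the generalized Rayleigh quotient of (1), which is easy to verify against a random direction.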

3.2. ELM-Based Classification. In order to speed up the training phase of the classifier while obtaining reasonable recognition performance, the recently popular extreme learning machine [25, 28] is employed in our FR framework. Based on an SLFN, the ELM classifier utilized in the proposed FR system can be described as follows.

Assume that the available training feature dataset is $A = \{(x_i, t_i)\}_{i=1}^{N}$, where $x_i$, $t_i$, and $N$ represent the feature vector of the $i$th face image, its corresponding category index, and the number of images, respectively. The SLFN with $\zeta$ nodes in the hidden layer can be expressed as

$$o_i = \sum_{j=1}^{\zeta} w_j\, g(a_j, b_j, x_i), \qquad i = 1, 2, \ldots, N, \qquad (10)$$

where $o_i$ is the output of the SLFN for the $i$th input face feature, and $a_j \in \mathbb{R}^d$ and $b_j \in \mathbb{R}$ ($j = 1, 2, \ldots, \zeta$) are the parameters of the $j$th hidden node. The vector $w_j \in \mathbb{R}^C$ links the $j$th hidden node to the output layer, and $g(\cdot)$ is the hidden node activation function. Over all training samples, (10) can be written in the compact form

$$O = HW, \qquad (11)$$

where $W = (w_1, w_2, \ldots, w_\zeta)^T$ and $O$ are the output weight matrix and the network outputs, respectively, and $H$ denotes the hidden layer output matrix with entries $H_{ij} = g(a_j, b_j, x_i)$. To perform multiclass classification, the ELM classifier generally uses the one-against-all (OAA) method to transform the classification task into a multioutput regression problem. That is, for a $C$-category classification task, the output label $t_i$ of the face feature $x_i$ is encoded as a $C$-dimensional vector $t_i = (t_{i1}, t_{i2}, \ldots, t_{iC})^T$ with $t_{ic} \in \{1, -1\}$ ($c = 1, 2, \ldots, C$). If the category index of $x_i$ is $c$, then $t_{ic}$ is set to 1 while the remaining entries of $t_i$ are set to $-1$. Hence, the training objective for the SLFN in (10) becomes finding the best network parameter set $\Delta = \{(a_j, b_j, w_j)\}_{j=1,\ldots,\zeta}$ that minimizes the following error cost function:

$$E(\Delta) = \|O - T\|^2 = \sum_{i=1}^{N} \|o_i - t_i\|^2, \qquad (12)$$

where $T = (t_1, t_2, \ldots, t_N)^T$ is the target output matrix.
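The one-against-all target encoding described above is straightforward; a sketch (the function name is ours):

```python
import numpy as np

def one_against_all(labels, C):
    # Encode category indices 1..C as {+1, -1} target vectors t_i.
    T = -np.ones((len(labels), C))
    for i, c in enumerate(labels):
        T[i, c - 1] = 1.0
    return T
```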

ELM theory shows that random hidden node parameters can be used for SLFNs and need not be tuned. In this case, system (11) becomes a linear model, and the network parameter matrix $W$ can be solved analytically by the least-squares method. That is,

$$W = H^{\dagger} T, \qquad (13)$$

where $H^{\dagger}$ is the Moore-Penrose generalized inverse of the hidden layer output matrix $H$ [35]. The universal approximation property of the ELM algorithm is presented in [25].
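Putting (10)-(13) together, a minimal ELM classifier might look like the following sketch (sigmoid additive nodes; the hyperparameters and names are illustrative, not the authors' implementation):

```python
import numpy as np

def elm_train(X, T, zeta=100, seed=0):
    # X: N x d features; T: N x C one-against-all targets.
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((X.shape[1], zeta))   # random input weights a_j
    b = rng.standard_normal(zeta)                 # random biases b_j
    H = 1.0 / (1.0 + np.exp(-(X @ a + b)))        # sigmoid hidden outputs
    W = np.linalg.pinv(H) @ T                     # W = H^dagger T, as in (13)
    return a, b, W

def elm_predict(X, a, b, W):
    H = 1.0 / (1.0 + np.exp(-(X @ a + b)))
    return np.argmax(H @ W, axis=1) + 1           # category index in 1..C
```

Note that only the readout $W$ is learned; the hidden layer stays random, which is what makes training a single least-squares solve.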

4. Experiments

In this section, we investigate the performance of our approach for cross-modality 2D-3D face recognition on FRGC v2.0. The new approach is compared with some state-of-the-art cross-modality learning methods, such as PLS [36], CDFE [37], PCA + CCA [38], and MvDA [39], as well as some neural network-based methods. The description of the face database and the 3D face preprocessing are presented in the following subsection. Then, the experimental results are presented, along with discussion and analysis.

4.1. Database Description and Experimental Setting. We evaluate our approach on the FRGC v2.0 [40] 2D versus 3D face database. FRGC v2.0, which contains 4007 2D photos and 4007 3D faces of 466 persons, is utilized to evaluate the performance of the new method. The images of FRGC are acquired with a Minolta Vivid 910 scanner, which utilizes triangulation with a laser stripe projector to build a 3D face model. The FRGC face database consists of frontal views from above the shoulders, with varying facial expressions and both male and female subjects. Some subjects have facial hair, but none is occluded by glasses. In FRGC v2.0, 57% of the subjects are male and 43% are female. Our previous work [41] is utilized for 3D data preprocessing, and the 2D photos correspond to their respective 3D faces.

(1) 3D Data Preprocessing [41]. The 3D data preprocessing consists of four main steps: face region detection, nose detection, face smoothing, and the generation of 2D and 3D face images.

Firstly, a 3 x 3 Gaussian filter is applied to remove spikes and noise, and the range data are then subsampled at a 1:4 ratio. The AdaBoost face detection method [42] is applied to the 2D texture image to assist 3D facial region extraction. Secondly, we compute the central stripe to detect the nose region, assuming the nose tip lies on the central stripe. ICP is utilized to align the stripe of Person A to the stripe of Person B; the nose tip then lies at the highest point within a cropped sphere. Once the nose tip is confirmed, a region of interest, defined by a sphere of radius 90 mm centered at the nose tip, is cropped and used in the following experiments. Figure 3 shows how the nose tips are found from the central stripe.
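The smoothing-and-subsampling step might be sketched as follows; this is a plain NumPy stand-in, and both the kernel weights and the reading of the 1:4 ratio as every-fourth-sample decimation are our assumptions:

```python
import numpy as np

def smooth_and_subsample(depth):
    # 3x3 Gaussian kernel (normalized), applied with edge padding,
    # then 1:4 subsampling of the range data.
    k = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], float) / 16.0
    padded = np.pad(depth, 1, mode="edge")
    out = np.zeros_like(depth, dtype=float)
    for i in range(3):
        for j in range(3):
            out += k[i, j] * padded[i:i + depth.shape[0], j:j + depth.shape[1]]
    return out[::4, ::4]
```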

(2) Experimental Setting. In order to evaluate the robustness of our method, we divide the database into a training set and a test set. We pick out 285 subjects with more than 6 samples each and select 5 samples per person for training and the rest for testing. All 2D photos and 3D range images are scaled, transformed, and cropped in the same way to 100 x 100 pixels according to the eye positions. Cropped examples from the FRGC database are shown in Figure 4.

4.2. Experimental Results. In the following experiments, we use the five frontal images per person as the training set, and the remaining images as the testing set. In the testing phase, the 2D photos are utilized as the gallery set and their corresponding 3D range images as the probe set. Firstly, we compare our method with several cross-modality learning methods, with the number of ELM hidden nodes set to 1000. The rank-1 recognition performance at different feature dimensions is reported in Figures 5 and 6. Figure 5 shows the results of the MSDA based on ELM method compared with cross-modality learning methods such as PLS [36], CDFE [37], PCA + CCA [38], and MvDA [39]. Figure 6 shows the comparison with some feature extraction-based methods. It is clear that our method achieves the highest recognition rate (96.8%) among the compared subspace learning methods. Furthermore, still using 1000 hidden nodes, we compare the new method under different ELM activation functions, namely sigmoid, sine, and hardlim. Figure 7 shows the results: the sigmoid activation function achieves the highest accuracy of the three, while the sine activation function performs worst.

Secondly, the recognition performance and training time of the proposed approach are compared with BP neural networks [43]. The comparison uses a fixed feature dimension of 40, with results obtained for different numbers of hidden nodes, [zeta] = 100, 200, 500, 1000. Due to the huge computational cost of BP neural networks, training is practical only when the number of BP hidden nodes is at most 1000. The recognition results and training times are reported in Table 1. From Table 1, we can see that the ELM-based method achieves higher recognition accuracy with a much faster training speed than BP neural networks.

Overall, the rank-1 recognition rate obtained by the new method is 96.8%, higher than that of the other compared algorithms. The ELM-based MSDA method clearly achieves better performance while training much faster.

5. Conclusion

In this paper, a Multiview Smooth Discriminant Analysis based on ELM method is proposed for cross-modality 2D-3D face recognition. 2D-3D face recognition is an alternative and feasible approach to the traditional 3D FR systems, and it is much more convenient to acquire a 2D image than building a 3D face model. In this new approach, the Multiview Smooth Discriminant Analysis (MSDA) is firstly performed to get the cross-modality face features by using the Laplacian smoothing functional. The Laplacian penalized functional considers the image spatial relationship in the feature level and therefore obtains a much smoother linear projection subspace than those without smooth subspace learning. Furthermore, the ELM, which reduces the computational cost, is utilized for face feature classification. Experimental results show that the proposed method consistently outperforms the other cross-modality matching algorithms and achieves good recognition performance in both accuracy and speed.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the National Program on Key Basic Research Project (Grant no. 2012CB316304), the National Natural Science Foundation of China (Grant no. 61172128), the Specialized Research Fund for the Doctoral Program of Higher Education (Grant no. 20120009120009), and the Fundamental Research Funds for the Central Universities (Grant no. 2012JBZ017).

http://dx.doi.org/10.1155/2014/584241

Correspondence should be addressed to Yi Jin; yjin@bjtu.edu.cn

Received 12 January 2014; Accepted 16 March 2014; Published 23 April 2014

Academic Editor: Jun Cheng

References

[1] S. Z. Li and A. K. Jain, Encyclopedia of Biometrics: I-Z., vol. 2, Springer, New York, NY, USA, 2009.

[2] A. F. Abate, M. Nappi, D. Riccio, and G. Sabatino, "2D and 3D face recognition: a survey," Pattern Recognition Letters, vol. 28, no. 14, pp. 1885-1906, 2007.

[3] M. Turk and A. Pentland, "Eigenfaces for recognition," Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.

[4] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, "Eigenfaces vs. fisherfaces: recognition using class specific linear projection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711-720, 1997.

[5] K. I. Chang, K. W. Bowyer, and P. J. Flynn, "An evaluation of multimodal 2D+3D face biometrics," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 4, pp. 619-624, 2005.

[6] T. Maurer, D. Guigonis, I. Maslov et al., "Performance of Geometrix ActiveID 3D face recognition engine on the FRGC data," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '05), p. 154, San Diego, Calif, USA, 2005.

[7] K. I. Chang, K. W. Bowyer, and P. J. Flynn, "Multiple nose region matching for 3D face recognition under varying facial expression," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 10, pp. 1695-1700, 2006.

[8] C. S. Chua, F. Han, and Y. K. Ho, "3D human face recognition using point signature," in Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition (FG '00), pp. 233-238, 2000.

[9] C. C. Queirolo, L. Silva, O. R. P. Bellon, and M. Pamplona Segundo, "3D face recognition using simulated annealing and the surface interpenetration measure," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 2, pp. 206-219, 2010.

[10] S. Berretti, A. Del Bimbo, and P. Pala, "3D face recognition using isogeodesic stripes," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 12, pp. 2162-2177, 2010.

[11] H. Tang, Y. Sun, B. Yin, and Y. Ge, "Expression-robust 3D face recognition using LBP representation," in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '10), pp. 334-339, July 2010.

[12] H. Tang, B. Yin, Y. Sun, and Y. Hu, "3D face recognition using local binary patterns," Signal Processing, vol. 93, no. 8, pp. 2190-2198, 2013.

[13] Y. Wang and C.-S. Chua, "Face recognition from 2D and 3D images using 3D Gabor filters," Image and Vision Computing, vol. 23, no. 11, pp. 1018-1028, 2005.

[14] L. Spreeuwers, "Fast and accurate 3D face recognition: using registration to an intrinsic coordinate system and fusion of multiple region classifiers," International Journal of Computer Vision, vol. 93, no. 3, pp. 389-414, 2011.

[15] F. B. ter Haar and R. C. Veltkamp, "A 3D face matching framework for facial curves," Graphical Models, vol. 71, no. 2, pp. 77-91, 2009.

[16] S. Jahanbin, H. Choi, and A. C. Bovik, "Passive multimodal 2-D+3-D face recognition using gabor features and landmark distances," IEEE Transactions on Information Forensics and Security, vol. 6, no. 4, pp. 1287-1304, 2011.

[17] Y. Ming, "Rigid-area orthogonal spectral regression for efficient 3D face recognition," Neurocomputing, vol. 129, pp. 445-457, 2013.

[18] W. Yang, D. Yi, Z. Lei, J. Sang, and S. Z. Li, "2D-3D face matching using CCA," in Proceedings of the 8th IEEE International Conference on Automatic Face and Gesture Recognition (FG '08), pp. 1-6, September 2008.

[19] D. Huang, M. Ardabilian, Y. Wang, and L. Chen, "Automatic asymmetric 3D-2D face recognition," in Proceedings of the 20th International Conference on Pattern Recognition (ICPR '10), pp. 1225-1228, August 2010.

[20] D. Jelsovka, R. Hudec, M. Breznan, and P. Kamencay, "2D-3D face recognition using shapes of facial curves based on modified CCA method," in Proceedings of the 22nd International Conference on Radioelektronika (RADIOELEKTRONIKA '12), pp. 1-4, 2012.

[21] G. Andrew, R. Arora, J. Bilmes, and K. Livescu, "Deep canonical correlation analysis," in Proceedings of the 30th International Conference on Machine Learning, pp. 1247-1255, 2013.

[22] S. Wang, D. Zhang, Y. Liang, and Q. Pan, "Semi-coupled dictionary learning with applications to image super-resolution and photo-sketch synthesis," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '12), pp. 2216-2223, 2012.

[23] D.-A. Huang and Y.-C. F. Wang, "Coupled dictionary and feature space learning with applications to cross-domain image synthesis and recognition," in Proceedings of the IEEE International Conference on Computer Vision (ICCV '13), December 2013.

[24] M. Kan, S. Shan, H. Zhang, S. Lao, and X. Chen, "Multi-view discriminant analysis," in Computer Vision (ECCV '12), vol. 7572 of Lecture Notes in Computer Science, pp. 808-821, 2012.

[25] G.-B. Huang, Q.-Y. Zhu, and C.-K. Siew, "Extreme learning machine: theory and applications," Neurocomputing, vol. 70, no. 1-3, pp. 489-501, 2006.

[26] G.-B. Huang, D. H. Wang, and Y. Lan, "Extreme learning machines: a survey," International Journal of Machine Learning and Cybernetics, vol. 2, no. 2, pp. 107-122, 2011.

[27] J. Cao, Z. Lin, G.-B. Huang, and N. Liu, "Voting based extreme learning machine," Information Sciences, vol. 185, no. 1, pp. 66-77, 2012.

[28] G.-B. Huang, H. Zhou, X. Ding, and R. Zhang, "Extreme learning machine for regression and multiclass classification," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 42, no. 2, pp. 513-529, 2012.

[29] J. A. K. Suykens and J. Vandewalle, "Least squares support vector machine classifiers," Neural Processing Letters, vol. 9, no. 3, pp. 293-300, 1999.

[30] G. Fung and O. L. Mangasarian, "Proximal support vector machine classifiers," in Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '01), pp. 77-86, August 2001.

[31] L. L. C. Kasun, H. Zhou, G.-B. Huang, and C. M. Vong, "Representational learning with extreme learning machine for big data," IEEE Intelligent Systems. In press.

[32] J. Jost, Riemannian Geometry and Geometric Analysis, Springer, Berlin, Germany, 1995.

[33] D. Cai, X. He, Y. Hu, J. Han, and T. Huang, "Learning a spatially smooth subspace for face recognition," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '07), pp. 1-7, June 2007.

[34] F. O'sullivan, "Discretized laplacian smoothing by fourier methods," Journal of the American Statistical Association, vol. 86, no. 415, pp. 634-642, 1991.

[35] D. Serre, Matrices: Theory and Applications, Springer, New York, NY, USA, 2001.

[36] A. Sharma and D. W. Jacobs, "Bypassing synthesis: PLS for face recognition with pose, low-resolution and sketch," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 593-600, June 2011.

[37] D. Lin and X. Tang, "Inter-modality face recognition," in Proceedings of the European Conference on Computer Vision (ECCV '06), pp. 13-26, 2006.

[38] D. R. Hardoon, S. Szedmak, and J. Shawe-Taylor, "Canonical correlation analysis: an overview with application to learning methods," Neural Computation, vol. 16, no. 12, pp. 2639-2664, 2004.

[39] M. Kan, S. Shan, H. Zhang, S. Lao, and X. Chen, "Multiview discriminant analysis," in Proceedings of the 12th European conference on Computer Vision (ECCV '12), pp. 808-821, 2012.

[40] P. J. Phillips, P. J. Flynn, T. Scruggs et al., "Overview of the face recognition grand challenge," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '05), vol. 1, pp. 947-954, June 2005.

[41] X. Wang, Q. Ruan, Y. Jin, and G. An, "Expression robust three-dimensional face recognition based on Gaussian filter and dual-tree complex wavelet transform," Journal of Intelligent and Fuzzy Systems, vol. 26, no. 1, pp. 139-201, 2014.

[42] P. Viola and M. Jones, "Robust real-time object detection," International Journal of Computer Vision, vol. 4, pp. 34-47, 2001.

[43] R. Hecht-Nielsen, "Theory of the backpropagation neural network," in Proceedings of the International Joint Conference on Neural Networks (IJCNN '89), pp. 593-605, June 1989.

Yi Jin, (1) Jiuwen Cao, (2) Qiuqi Ruan, (1) and Xueqiao Wang (1)

(1) Beijing Key Lab of Traffic Data Analysis and Mining, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China

(2) Institute of Information and Control, Hangzhou Dianzi University, Zhejiang 310018, China

TABLE 1: Performance comparison of ELM and BP in terms of rank-1 recognition rate (in %) and training time (in s).

| Nodes | ELM R.R. (%) | ELM Time (s) | BP R.R. (%) | BP Time (s) |
|-------|--------------|--------------|-------------|-------------|
| 100   | 42.5         | 0.0001       | 37.5        | 1214.6      |
| 200   | 70.0         | 0.0001       | 62.4        | 2209.8      |
| 500   | 85.5         | 0.0624       | 82.7        | 4055.6      |
| 1000  | 96.0         | 0.1872       | 94.0        | 6180.7      |

Publication: Journal of Electrical and Computer Engineering, 2014.