
Position-Based Feature Selection for Body Sensors regarding Daily Living Activity Recognition.

1. Introduction

Daily living activity recognition is one of the most important research topics that can provide valuable information for various applications such as health monitoring, security, intelligent assistance in daily life, fall detection, and surveillance [1, 2]. It is obvious that the quantity and placement of wearable sensors on the human body are factors that affect the accuracy of activity classification [3-5]. Each kind of activity has different characteristics which are specific to the particular location on the human body.

The development of embedded sensor technology in wearable devices such as smartphones, smart bands, and smart shoes has spurred a variety of research on daily activity recognition [6-8]. A major issue of these studies is that experimental results obtained while activities are conducted in a controlled environment are difficult to carry over to real-world applications [9]. Some recent publications [6, 7, 10], which utilized a single sensor module attached to different body positions, achieve a high classification accuracy. However, a single-sensor platform is not sufficient to recognize complex daily living activities. In particular, one sensor may not truly characterize the activities even though a high classification accuracy is achieved [11]. To overcome this problem, researchers have tended to attach many sensors to different parts of the human body [4, 11-14]. These approaches can recognize both dynamic and transitional activities. Furthermore, a multiple-sensor system, which contains up to three sensors, can provide good recognition accuracy [11]. The fact remains that researchers have had to make efforts to deal with the fusion of multiple-sensor data. In addition, the variety of activities that can be monitored and the precision of classification are directly affected by the number and placement of inertial sensors attached to the human body [15]. On-body device position was investigated to determine its influence on the performance of activity recognition [16, 17]. Figueira et al. [18] explored the sensor-position awareness of monitoring activities. Atallah et al. [19] discussed the sensor positions and significant features for each activity group according to the compendium of physical activities. However, a recognition system with many sensors can contain numerous redundant features, which degrade the classification performance and may even confuse the classifier. Recent studies [19-22] have adopted feature selection techniques as a solution to deal with redundant features. Various activity recognition topologies have been designed with feature selection algorithms. Accordingly, a novel scheme based on feature selection is introduced here to achieve a high activity classification performance.

In this paper, we propose a new scheme for human daily living activity recognition based on sensor-placement feature selection and evaluate it on the DaLiAc dataset. We first apply a low-pass filter to the raw data. The filtered data are segmented by a 10-second sliding window with no overlap to obtain the samples. Then, we extract valuable features from the filtered data. Feature selection techniques are applied separately to each sensor location to obtain the best feature set for each body position. We optimize the feature set by investigating the correlation of the features. Finally, a k-nearest neighbor algorithm is applied to an optimized feature set containing features from four body positions to classify thirteen activities. The key of our framework lies in the feature selection layer. In this work, we implement feature selection on an individual set of sensor features. This approach optimizes the features for specific sensors and therefore facilitates correct selection of the features specific to each activity. The contributions of the paper are as follows:

(i) It proposes a new method of daily living activity recognition that achieves a higher accuracy rate in a benchmark dataset in comparison with recent publications

(ii) It discovers the sensor positions that most affect the recognition rate in a benchmark dataset

(iii) It explores meaningful features by utilizing three feature selection methods, that is, Laplacian score, SFFS, and Relief-F, as well as correlation-based feature optimization for four sensor locations in the ankle, chest, hip, and wrist

The rest of the paper is organized as follows. In Section 2, we present some related works on the recognition of daily living activities with one or more sensors. The proposed method is described in Section 3, where we go into the details of data preprocessing, feature extraction, feature selection, feature optimization, choice of classifier, and evaluation criteria. All comparisons of the selected sets of features and the performance of the proposed method are shown in Section 4. Finally, Section 5 presents the conclusion of this paper and future works.

2. Related Works

Many authors have reported activity recognition issues when using one sensor to monitor activities of daily life [6, 7, 10]. In our previous work [6], a multiclass support vector machine (SVM) algorithm demonstrated 97.3% accuracy when we used a single smart band module with a triaxial accelerometer worn on the wrist to collect data during the performance of five wrist activities: texting, calling, hand in pocket, hand swinging, and suitcase carrying. Nguyen et al. [7] proposed three novel plantar-pressure sensor-based features to classify five ambulatory activities using smart insoles. By implementing a k-nearest neighbor classifier, these walking activities can be recognized with an accuracy rate of 97.84% at the sixth step. Zheng [10] presented a hierarchical framework based on the least squares support vector machine and naive Bayes algorithm to classify ten activities using an inertial module attached to the hip. The implemented method, which recognized activities in four groups (bidimensional walking, tridimensional walking, plane motion, and static activities), obtained a high accuracy rate of 95%. These studies achieve high activity classification rates using small feature vectors and simple classifiers. However, this approach, which uses data collected from one sensor location, attains its best recognition performance only for activities that relate directly to the sensor's location. This limits the number of activities that can be considered.

To address the drawback of a single-sensor configuration, various studies have utilized multiple inertial sensors attached to different body locations to classify human activities [4, 9, 11-14]. Sztyler et al. [9] improved activity recognition performance by being aware of the device position. Their method was implemented on a dataset of eight activities collected from seven on-body locations. In the experimental results, they obtained an F-measure of 89%. In addition, an awareness of sensor locations can enhance the performance of activity recognition compared with subject-specific and cross-subject approaches. Khan et al. [11] presented a cloud-based system to recognize thirteen activities using the WARD dataset [23]. Effects of on-body sensor position were explored by implementing three classifiers for each activity. Their analysis showed that activity recognition is significantly dependent on the positions of on-body devices. Furthermore, up to three sensor modules are needed to recognize more dynamic and transient activities. Schuldhaus et al. [12] proposed a decision level fusion-based method to classify seven daily living activities. Four sensor nodes, which were located on the chest, hip, wrist, and ankle, separately performed activity classification. Then, an activity was predicted by majority voting over the classifier decisions of the sensor nodes. The designed system achieved an overall recognition error rate of 6.1%. Banos et al. [13] presented a metaclassifier framework which included an insertion and rejection algorithm of weights for base (classifier) and source (activity) levels, respectively. A voting algorithm is designed to fuse the decisions of individual classifiers to predict activities. Compared with other fusion models, this scheme can deal with misclassification caused by a failure of one or more sensors. The method, which was evaluated on the REALDISP benchmark dataset [24], can improve the recognition performance compared with a traditional feature fusion, hierarchical-weighted classifier [25] approach to classify thirty-three activities. Gao et al. [14] developed a framework consisting of a sensor selection module and a hierarchical classifier module to address the dynamic multisensor fusion problem. They investigated the recognition performance and energy consumption of the method on a dataset of eight activities. The experimental results demonstrated that the system can save 37.5% of the energy consumption while reducing the classification rate by 2.8% compared with an original four-sensor classifier approach. Bao and Intille [4] used features extracted from data collected using five biaxial accelerometers attached to the left elbow, right wrist, hip, left thigh, and right ankle of twenty different subjects to recognize twenty different activities. The overall accuracy using a decision tree was 84% without the utilization of a feature selection technique. The main advantage of multiple sensors is that they can recognize many daily living activities with high precision. Moreover, they are more suitable for recognizing complex activities, that is, dynamic and transitional activities [26]. However, the main issue of this configuration is finding an optimal sensor combination that achieves a high classification performance and robustness [25]. In addition, multiple-sensor systems can become uncomfortable for users who wear them for a long time [13, 25].

The wearable sensor community recently investigated feature selection techniques, which proved to be an efficient solution to the problem of redundant features and computational expense [27] in recognizing human activities [19-22]. Feature selection algorithms were implemented in many ways, that is, on the total set of features extracted from various activities, on each activity group, and on each individual activity. Atallah et al. [19] employed six wearable accelerometers positioned at six different body locations in the ear, chest, wrist, ankle, knee, arm, and waist. They used filter method-based feature selection techniques, that is, Relief [28], Simba [29], and minimum redundancy maximum relevance (MRMR) [30], to evaluate and rank the features. The k-nearest neighbor with k = 5 and k = 7 and the Bayesian classifier were used to classify fifteen daily activities which were combined into five groups of activities labeled as very low level, low level, medium level, high level, and transitional activities. The performance of both algorithms was similar at approximately 82%. Zhang and Sawchuk [20] acquired data using a 6DOF inertial measurement unit (IMU), which integrated a triaxial accelerometer, gyroscope, and magnetometer. This study utilized three selection techniques, that is, Relief [28], a wrapper method based on a single feature (SFC) [31], and a wrapper method based on sequential forward selection (SFS) [32], to select the most important features. The SVM algorithm was then deployed to recognize twelve activities including walking forward, walking left, walking right, going upstairs, going downstairs, jumping up, running, standing, and sitting. The accuracy of the single-layer classifier was up to 90%. Pirttikangas et al. [21] configured four triaxial accelerometers attached to the right wrist, left wrist, right thigh, and neck and one heart rate sensor attached to the right wrist of thirteen different users. The collected data from 17 different activities were characterized by five features, that is, acceleration mean, standard deviation, correlation, mean crossing, and heart rate mean. The forward-backward search was utilized to select a subset of the best features. The authors used the k-nearest neighbor as a classifier, and the algorithm achieved 90.61% activity recognition accuracy. Huynh and Schiele [22] showed that the performance can be improved by implementing feature selection techniques on individual features for each activity such as walking, standing, jogging, skipping, hopping, and riding a bus. Data were collected using an integrated sensor board containing sensors for 3D acceleration, audio, temperature, IR/visible/high-frequency light, humidity, barometric pressure, and a digital compass. The board was placed on the shoulder strap of a backpack, and the subjects were required to wear the backpack while performing the measured activities. The authors applied cluster precision to evaluate the features, and the best recognition accuracy was about 90% when the best features and window length for jogging and walking were considered. For the other activities, the recognition accuracy was about 85%. The authors applied feature selection to individual activities, but the results did not show improved efficiency when the best sets of features for all activities were combined into one global set as the input for the classifier.

Some state-of-the-art methods evaluated on the DaLiAc dataset [33], which obtained acceptable classification accuracy rates in recognizing thirteen activities, can be found in [33, 34]. Leutheuser et al. [33] achieved an average classification rate of 85.8% utilizing a hierarchical multisensor-based activity recognition system. Furthermore, their proposed method can distinguish between complex daily living activities, that is, treadmill running and rope jumping. Zdravevski et al. [34] proposed a systematic feature engineering method that uses a two-phase feature selection layer. The method achieved an accuracy rate of 91.2% with 60 selected features to classify thirteen activities.

3. Activity Recognition Algorithm for Daily Living Activities

In this section, we first present a comparison between the proposed and conventional algorithms. Then, data preprocessing, feature extraction, selection, optimization, choice of classifier, and evaluation criteria are discussed.

3.1. Feature Selection Based on Body-Sensor Placement. Figures 1(b) and 1(a) respectively illustrate the proposed and standard approaches to recognize human activity using wearable devices. In the proposed architecture, the feature selection layer is different from that of the standard approach. In this layer, the proposed method separately performs feature selection on each sensor position, whereas the standard method considers all extracted features at once. The main advantage of the proposed method is that each sensor has a different set of features which is better for depicting the activities with regard to the position of the sensors. Moreover, the computation requirement of feature selection is significantly decreased compared with the standard approach.

In addition, the standard approach deployed in most studies [19, 21, 33] achieves medium classification accuracy (below 95%). The disadvantage of the standard approach is that it becomes computationally expensive as the number of features increases. We also perform optimization based on the Pearson correlation coefficient to take high-correlation features into account in the proposed method.

3.2. Data Preprocessing. In this study, inertial data were uniformly resampled with a sampling frequency of 200 Hz. We cut out a segment of the inertial signal using a 10 s sliding window with no overlap to obtain the activity samples. A low-pass filter was then applied to these samples to remove noise. Table 1 shows the distribution of the human activity samples. Examples of the acceleration and gyroscope signals for sitting are shown in Figure 2.
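The segmentation step can be illustrated with a minimal Python sketch. It only assumes the figures stated above (200 Hz, 10 s windows, no overlap); the function name `segment` and the choice to drop a trailing partial window are illustrative assumptions, not details taken from the study.

```python
def segment(signal, fs_hz=200, window_s=10):
    """Split a 1-D sequence into consecutive non-overlapping windows.

    Trailing samples that do not fill a whole window are dropped
    (an assumption; the handling of remainders is not specified).
    """
    size = fs_hz * window_s  # 200 Hz x 10 s = 2000 samples per window
    n_windows = len(signal) // size
    return [signal[i * size:(i + 1) * size] for i in range(n_windows)]


# 4500 samples yield two full windows; the last 500 samples are discarded.
samples = segment(list(range(4500)))
print(len(samples), len(samples[0]))  # prints "2 2000"
```

Each returned window corresponds to one activity sample in the sense used in this section.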

3.3. Feature Extraction. Feature extraction transforms high-dimensional raw data into a meaningful representation that can depict the characteristics of different activities. This technique enables machine learning algorithms to deal with high-dimensional data. We obtain a large set of features calculated using the filtered data for each activity. Eight features are computed on smaller windows: each 10 s inertial signal is divided into 5 windows of 2 s with no overlap, and the signal magnitude area, intensity of movement, mean trend, windowed mean difference, variance trend, windowed variance difference, standard deviation trend, and windowed standard deviation difference are computed from these windows. The remaining features are extracted directly without breaking the samples into smaller windows.

The details of the features are shown in Table 2.

The standard deviation trend and the windowed standard deviation difference are distinct features, computed similarly to the mean trend and windowed mean difference, and are given below:

(i) Standard deviation trend

$$\sigma T = \sum_{i=2}^{5} \left| \sigma_i - \sigma_{i-1} \right|. \tag{1}$$

(ii) Windowed standard deviation difference

$$\sigma D = \sum_{i=1}^{5} \left| \sigma - \sigma_i \right|. \tag{2}$$
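As a worked illustration of (1) and (2), the following Python sketch computes both quantities for one sample. It assumes that the subscripted sigma is the standard deviation of the ith 2 s window and that the unsubscripted sigma is the standard deviation of the whole 10 s sample; the use of the population standard deviation and the function name `std_trend_features` are also assumptions made for the example.

```python
import statistics


def std_trend_features(sample, n_windows=5):
    """Return (standard deviation trend, windowed standard deviation difference).

    sample is one segmented signal; it is split into n_windows equal,
    non-overlapping windows (any leftover samples are ignored for the windows).
    """
    size = len(sample) // n_windows
    sigma = statistics.pstdev(sample)  # standard deviation of the whole sample
    sigmas = [statistics.pstdev(sample[i * size:(i + 1) * size])
              for i in range(n_windows)]
    # Equation (1): sum of absolute changes between consecutive windows.
    sigma_t = sum(abs(sigmas[i] - sigmas[i - 1]) for i in range(1, n_windows))
    # Equation (2): sum of absolute deviations from the whole-sample value.
    sigma_d = sum(abs(sigma - s) for s in sigmas)
    return sigma_t, sigma_d


# Toy sample of 10 values, i.e. 5 windows of 2 values each.
sigma_t, sigma_d = std_trend_features([0, 2, 0, 4, 0, 0, 0, 0, 0, 0])
print(sigma_t)  # window deviations are 1, 2, 0, 0, 0, so the trend is 3.0
```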

The extracted features are not distributed normally and are sparse which can affect the performance of estimators. Therefore, feature normalization is a common requirement for many machine learning algorithms. In our study, we normalized all the features to zero mean and unit variance.

$$x_{\text{normalized}} = \frac{x_{\text{raw}} - \mu}{\sigma}, \tag{3}$$

where $\mu$ and $\sigma$ are the mean and standard deviation of a particular feature, respectively.
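The normalization in (3) can be sketched per feature column as follows. This is a minimal example in plain Python; mapping a constant feature (zero standard deviation) to 0 is an assumption added to avoid division by zero, and the name `zscore_columns` is illustrative.

```python
import statistics


def zscore_columns(rows):
    """Normalize each feature (column) to zero mean and unit variance.

    rows is a list of samples, each a list of feature values.
    """
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    return [
        # Constant features carry no information; map them to 0.0 (assumption).
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


normalized = zscore_columns([[1.0, 5.0], [3.0, 5.0]])
print(normalized)  # prints [[-1.0, 0.0], [1.0, 0.0]]
```

In a cross-validation setting, the mean and standard deviation would normally be estimated on the training folds only and then applied to the test fold.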

3.4. Feature Selection. After performing feature extraction, we obtain a total of 536 features including 192 statistical features, 120 magnitude-based features, 24 correlation-based features, 56 frequency-domain features, and 144 trend-variance features. In order to improve the classification accuracy and reduce the computation time, the redundant (closely related) and irrelevant features should be removed [27, 35]. This task can be performed with feature selection techniques. There are three common groups of feature selection techniques: filter methods, wrapper methods, and embedded methods [36]. Filter methods, which are computationally efficient, select features independently of any machine learning algorithm. In contrast, wrapper methods evaluate the quality of the selected features based on the performance of a predefined learning algorithm. Embedded methods embed the feature selection in model learning. In this paper, we use two filter methods (Relief-F and Laplacian Score) and one wrapper method (Sequential Forward Floating Selection, SFFS) to rank the features for each sensor location. These methods are briefly described below.

Relief-F [37] is a popular filter method that extends the original feature selection method Relief [28]. Whereas Relief finds only one nearest hit and one nearest miss using the Euclidean norm, Relief-F finds the k nearest instances from the same class and from the different classes (k nearest hits/misses) using the Manhattan norm and then updates the weight of each feature. Relief-F can also handle incomplete data and multiclass problems, which the Relief algorithm cannot. If feature A is considered, then the weight W(A) is updated following [37]:

[mathematical expression not reproducible], (4)

where $\mathrm{diff}(A, R, H_i)$ calculates the difference in the value of feature A between two instances R and $H_i$ in the near-hit class (same class as R), and $\mathrm{diff}(A, R, M_i(C))$ calculates the difference in the value of feature A between two instances R and $M_i(C)$ in the near-miss class C (different class from R).

The Laplacian Score algorithm is another feature selection technique based on the filter method [38]. Suppose we have m data points, $x_1, x_2, \ldots, x_m$, and the ith node corresponds to $x_i$. In order to select the significant features, the Laplacian Score algorithm first constructs a k-nearest neighbor graph G with m nodes and defines an edge between two nodes if one of them is among the k nearest neighbors of the other. Second, it calculates the weight matrix S with the element $S_{ij} = e^{-\|x_i - x_j\|^2 / t}$ (t is a constant) if the two nodes $x_i$ and $x_j$ are connected and $S_{ij} = 0$ otherwise. In [38], the Laplacian score of the rth feature was defined as

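$$L_r = \frac{\tilde{f}_r^{T} L \tilde{f}_r}{\tilde{f}_r^{T} D \tilde{f}_r}, \qquad \tilde{f}_r = f_r - \frac{f_r^{T} D \mathbf{1}}{\mathbf{1}^{T} D \mathbf{1}} \mathbf{1}, \tag{5}$$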

where $f_r = (f_{r1}, f_{r2}, \ldots, f_{rm})^{T}$ denotes the vector of m samples for the rth feature. L is a Laplacian matrix defined as $L = D - S$ [39], and D is a diagonal matrix defined as $D_{ii} = \sum_{j} S_{ij}$.

The third feature selection technique used in the paper is Sequential Forward Floating Selection [40]. This technique extends the Sequential Forward Selection (SFS) method. Let us assume that we have a set of n features $Y = \{y_1, \ldots, y_n\}$. SFS outputs a subset of m features $X = \{x_1, \ldots, x_m\}$ with $m < n$, assessed by a criterion function $J(x_i)$ with $x_i \in X$. First, SFS takes all n features from the original feature set as its input and starts by adding the first feature $x_i$, which has the highest significance, subject to

[mathematical expression not reproducible]. (6)

From the initial subset $X_1$, SFS adds further features according to

[mathematical expression not reproducible]. (7)
ALGORITHM 1: Model accuracy procedure used to select the important features.

for i = 1:3  % We run three feature selection algorithms
    CVO = LeaveOneOutCV();  % Leave-one-out cross-validation
    for j = 1:CVO.NumOfTestSets
        % Get training and test sets of the jth fold
        X_train = Features{i}(trainingIndex, :);
        Y_train = Activities(trainingIndex);
        X_test = Features{i}(testIndex, :);
        Y_test = Activities(testIndex);
        for k = 1:CVO.MaxNumOfFeature(i)
            % Get the k top-ranked features
            fs_index = FeatureRanking{i}(1:k);
            % Evaluate the feature importance by the k-NN algorithm
            Prediction = knnModel(X_train(:, fs_index), Y_train, ...
                                  X_test(:, fs_index), 1);
            % Store the fold accuracy in column k (one column per subset size)
            cvAccuracy(j, k) = sum(double(Prediction == Y_test)) ...
                               / TestSize(j);
        end
    end
    % Average the accuracy over all folds for each subset size
    Accuracy{1, i}(1:size(cvAccuracy, 2)) = mean(cvAccuracy);
end



Then, the new subset of features becomes

$$X_{i+1} = X_i + x_{i+1}, \qquad i = i + 1. \tag{8}$$

The disadvantage of SFS is that once a feature is selected, it cannot be removed from the subset. SFFS can solve this problem by checking and removing the least significant feature $x_w$ in the subset for each iteration, that is,

[mathematical expression not reproducible]. (9)

This procedure is repeated until the number of selected features equals the desired number of features or i = n.
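The forward and floating steps described above can be sketched in Python as follows. This is a simplified illustration of the general SFFS idea rather than the exact procedure of [40]: the criterion function is passed in as a callable (in a wrapper setting it would typically be a cross-validated classifier accuracy), and the names `sffs` and `toy_scores`, as well as the toy criterion values, are hypothetical.

```python
def sffs(features, criterion, target_size):
    """Sequential forward floating selection (simplified sketch).

    features: list of feature identifiers.
    criterion: callable mapping a list of features to a score (higher is better).
    """
    selected = []
    best_score = {}  # best criterion value seen so far for each subset size
    target = min(target_size, len(features))
    while len(selected) < target:
        # Forward step: add the remaining feature that maximizes the criterion.
        remaining = [f for f in features if f not in selected]
        added = max(remaining, key=lambda f: criterion(selected + [f]))
        selected = selected + [added]
        size = len(selected)
        best_score[size] = max(criterion(selected),
                               best_score.get(size, float("-inf")))
        # Floating step: remove the least significant feature while doing so
        # beats the best subset previously found at the smaller size.
        while len(selected) > 2:
            worst = max(selected,
                        key=lambda f: criterion([g for g in selected if g != f]))
            reduced = [g for g in selected if g != worst]
            if criterion(reduced) <= best_score[len(reduced)]:
                break
            selected = reduced
            best_score[len(selected)] = criterion(selected)
    return selected


# Hypothetical criterion values for every subset of four features a, b, c, d.
toy_scores = {"a": .5, "b": .4, "c": .3, "d": .1,
              "ab": .6, "ac": .55, "ad": .52, "bc": .9, "bd": .45, "cd": .35,
              "abc": .92, "abd": .65, "acd": .6, "bcd": .95, "abcd": 1.0}
result = sffs(list("abcd"), lambda s: toy_scores["".join(sorted(s))], 3)
print(sorted(result))  # prints ['b', 'c', 'd']
```

In this toy case, plain SFS would keep feature a once it has been chosen and end with {a, b, c}; the floating step discards a after c is added, which lets the search reach the better subset {b, c, d}.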

After ranking the features for each sensor location, we use k-NN to evaluate the features from high to low rank based on the model accuracy of each feature selection technique. Algorithm 1 presents the model accuracy procedure used to select the important features. A leave-one-out cross-validation and a k-NN classifier are utilized in the algorithm.

3.5. Feature Set Optimization. Relief-F and Laplacian Score select features by evaluating the capability of the features to preserve sample similarity. Therefore, they are unable to handle redundancy among features that are highly correlated with each other [41]. In addition, SFFS, which cannot eliminate redundant features generated in the searching process, does not consider the feature correlation [42]. To address the redundancy caused by highly correlated features, we use the Pearson correlation coefficient [4, 43] with a threshold equal to 1 in our study. Let us assume that we have two data segments $X = x_1, x_2, \ldots, x_n$ and $Y = y_1, y_2, \ldots, y_n$. Then, the Pearson correlation is defined as

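$$\rho_{X,Y} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}, \tag{10}$$

where $\bar{x}$ and $\bar{y}$ are the means of X and Y, respectively.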
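A correlation-based pruning step of this kind could look like the following Python sketch. It computes the standard Pearson coefficient and keeps a feature only if its absolute correlation with every already kept feature stays below the threshold. The greedy keep-in-rank-order strategy, the numerical tolerance `tol`, and the function names are assumptions made for illustration; the exact pruning rule is not spelled out in the text.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # correlation is undefined for a constant sequence
    return cov / (norm_x * norm_y)


def drop_correlated(columns, threshold=1.0, tol=1e-9):
    """Return indices of the features to keep.

    columns holds one list of values per feature, ordered from the
    highest-ranked feature to the lowest, so that the higher-ranked
    member of a correlated pair is the one that survives.
    """
    kept = []
    for idx, column in enumerate(columns):
        if all(abs(pearson(column, columns[k])) < threshold - tol for k in kept):
            kept.append(idx)
    return kept


features = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 0, 1, 0]]
print(drop_correlated(features))  # prints [0, 2]: feature 1 duplicates feature 0
```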

3.6. Classifier. K-nearest neighbor (k-NN) and support vector machine (SVM) are popular supervised classification algorithms in machine learning problems. SVM is robust and delivers a unique solution because of convex optimization. However, this technique is more computationally expensive when dealing with high-dimensional data [44, 45]. In contrast, k-NN, which is a simple and robust algorithm, does not require a learning process [15, 46]. It uses the principle of similarity (Euclidean distance) between the training set and the new observation to be classified. Therefore, it is easy to implement and less computationally intensive than SVM. The problem with k-NN is that it is very sensitive to redundant features [45]. In this study, we use the k-NN classifier both to choose the number of features and to perform the classification.
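The similarity principle behind k-NN can be shown with a minimal Python sketch: the query is assigned the majority label among its k nearest training samples under the Euclidean distance. The function name `knn_predict` and the toy data are illustrative assumptions; ties in the vote are resolved in favor of the label met first among the nearest neighbors.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=1):
    """Predict the label of query by majority vote among its k nearest neighbors."""
    # Sort the training indices by Euclidean distance to the query.
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy two-feature training set with two activity labels.
train_x = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
train_y = ["SI", "SI", "WK", "WK"]
print(knn_predict(train_x, train_y, [0.2, 0.1], k=3))  # prints "SI"
```

No training step is needed: the whole training set is stored and searched at prediction time, which is why redundant features directly distort the distances and hence the decision.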

3.7. Evaluation Criteria. We evaluate the performance of our proposed method by utilizing four metrics, that is, accuracy, precision, recall, and F1-score. The accuracy can be expressed for binary classification as follows:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \tag{11}$$

where a true positive (TP) means that an instance labeled as positive is predicted as positive. A true negative (TN) occurs when the algorithm predicts a negative result for a label that is indeed negative. When the label is negative but is predicted as positive, the prediction is a false positive (FP). The reverse situation, in which the label is positive but is predicted as negative, is a false negative (FN).

$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}. \tag{12}$$

F1-score which is the combination of two metrics, that is, the precision and recall, is defined as follows:

$$\text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}.$$
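For a multiclass problem such as the thirteen activities considered here, these binary definitions are commonly applied per class in a one-versus-rest fashion. The Python sketch below follows that convention, which is an assumption about how the metrics are applied; the function name `class_metrics` and the toy labels are illustrative, and undefined ratios are reported as 0.

```python
def class_metrics(y_true, y_pred, positive):
    """Accuracy, precision, recall, and F1-score with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


y_true = ["SI", "SI", "ST", "ST", "WK"]
y_pred = ["SI", "ST", "ST", "ST", "WK"]
metrics = class_metrics(y_true, y_pred, positive="ST")
# For class ST: TP = 2, FP = 1, FN = 0, TN = 2.
print(metrics["recall"])  # prints 1.0
```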

4. Experimental Results

In this section, a brief description of the used dataset is given. Then, we present the results of the feature selection and feature optimization procedure. The performance of our proposed method and a comparison with the reference papers are also presented. In addition, three metrics, that is, precision, recall, and F1-score are utilized to evaluate the proposed method's performance. The comparison highlights the performance of the different methods in terms of the F1-score.

4.1. DaLiAc Dataset. In this study, we use the DaLiAc (Daily Living Activities) dataset, which is a benchmark dataset for the classification of daily life activities based on inertial data as described in [33]. Nineteen healthy subjects (8 women and 11 men, age 26 ± 8 years, height 177 ± 11 cm, weight 75.2 ± 14.2 kg) performed thirteen activities, that is, sitting (labeled as SI), lying down (LY), standing (ST), washing dishes (WD), vacuuming (VC), sweeping (SW), walking (WK), ascending stairs (AS), descending stairs (DS), treadmill running (RU), bicycling on an ergometer (50 W) (BC50), bicycling on an ergometer (100 W) (BC100), and rope jumping (RJ). Inertial data were collected from four Shimmer sensor nodes [26] placed on the right hip, chest, right wrist, and left ankle. Figure 3 illustrates the positions and coordinates of each sensor.

Each sensor node consisted of a 3D accelerometer and a 3D gyroscope. The range of the accelerometer was ±6 g. The range of the gyroscopes was ±500 deg/s for the sensor nodes at the wrist, chest, and hip and ±2000 deg/s for the sensor node on the ankle. The sampling rate was set to 204.8 Hz.

4.2. Results of Feature Selection and Feature Optimization. The experimental results of feature selection are given in Figure 4 and Table 3. Figure 4 illustrates the classification accuracy according to the rank of the selected features for the four sensor locations. In detail, the obtained results are summarized in Table 3.

In Table 4, it is clear that the Laplacian Score outperforms SFFS and Relief-F in terms of the number of the selected features and accuracy. Among the sensor positions, the ankle and hip, which have the same number of selected features (29 features), give a high overall accuracy, that is, 96.85% and 96.91%, respectively. The accuracy rates of the three feature selection methods are very close, but the number of features selected by Laplacian Score is the lowest number, that is, 36 features in comparison with 87 features (SFFS) and 77 features (Relief-F). Considering the wrist sensor, Laplacian Score achieves an accuracy rate of 95.24% with 44 features, whereas SFFS and Relief-F attain classification rates of 91.60% (124 features) and 92.75% (73 features), respectively.

Figure 5 presents in detail the distribution of selected features for each sensor position after optimization. As shown in Figure 5(a), the wrist location contributes the largest share of features (32%) for activity classification. According to Figure 5(b), the mean absolute deviation (MAD), range (r), correlation ($\rho$), and standard deviation ($\sigma$) features for one or more axes dominate at all four sensor positions, that is, the hip (71.43% in total), ankle (66.66%), chest (58.82%), and wrist (52.38%).

4.3. Performance of the Proposed Method and a Comparison with Reference Methods. Table 5 presents the confusion matrix for the recognition of thirteen daily living activities.

Table 3 shows the performance evaluation of our proposed method in detail. The proposed method recognizes washing dishes, descending stairs, treadmill running, and rope jumping with the highest accuracy rate of 100%. The method also achieves high accuracy for walking (99.62%), lying (99.47%), and ascending stairs (99.47%). Differences in intensity cause confusion for the algorithm when distinguishing bicycling on an ergometer (100 W) (98.93%) and bicycling on an ergometer (50 W) (98.72%). The confusion matrix in Table 5 clearly shows that these activities are misclassified as each other. The accuracies for sweeping (98.27%) and vacuuming (98.06%) are quite similar. Sitting and standing, which have the same orientation with respect to gravity, are misclassified at rates of 2.97% and 3.16%, respectively.

Table 6 gives a comparison of the proposed method with two reference papers, that is, Leutheuser et al. [33] and Zdravevski et al. [34]. As shown in Table 6, our proposed method compares well against the reference papers in terms of recognition of sitting, standing, washing dishes, walking, ascending stairs, descending stairs, treadmill running, and rope jumping. Overall, our proposed method achieved an accuracy rate of 99.13% in classifying 13 daily living activities, whereas the two reference methods of Leutheuser et al. and Zdravevski et al. achieved accuracy rates of 89.00% and 93.00%, respectively.

5. Conclusion

We proposed a novel approach that applies a feature selection algorithm on different sensor locations to recognize thirteen daily living activities using the DaLiAc benchmark dataset. To perform the recognition, we extracted features from a preprocessed dataset. Then, the features were ranked according to feature importance using a feature selection technique. In addition, we performed this process separately for each sensor-location dataset. Before feeding the features to a classifier, we optimized the selected features by using the Pearson correlation coefficient with a threshold equal to 1. We evaluated the performance of our proposed method with three metrics, that is, precision, recall, and F1-score. The results showed that the proposed method achieved a high precision rate of 100% for five activities, that is, lying, walking, ascending stairs, treadmill running, and rope jumping. The lowest precision rate was 95.73% in recognizing vacuuming activity. Furthermore, we compared the F1-score metric with the two reference papers. Overall, our proposed method outperformed the two reference methods with an accuracy rate of 99.13% in classifying thirteen activities. In contrast, Leutheuser et al. and Zdravevski et al. only achieved accuracy rates of 89.00% and 93.00%, respectively. The results are promising and show that the proposed method can be useful for a multiple-sensor configuration in human activity recognition.

In the future, we plan to improve the accuracy rate and reduce the feature dimension for classifying activities of daily living. In addition, we will extend our work by taking into consideration the feature correlation and the feature rank among the sensor positions.

https://doi.org/10.1155/2018/9762098

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF2015R1D1A1A01060917). This research was also supported by the National Research Foundation of Korea (NRF) funded by the Korean Government (MSIP) (NRF2016R1A5A1012966).

References

[1] K. Rajesh Kanna, V. Sugumaran, T. Vijayaram, and C. Karthikeyan, "Activities of daily life (ADL) recognition using wrist-worn accelerometer," International Journal of Engineering and Technology, vol. 8, no. 3, pp. 1406-1413, 2016.

[2] G. Koshmak, A. Loutfi, and M. Linden, "Challenges and issues in multisensor fusion approach for fall detection: review paper," Journal of Sensors, vol. 2016, Article ID 6931789, 12 pages, 2016.

[3] A. Ozdemir, "An analysis on sensor locations of the human body for wearable fall detection devices: principles and practice," Sensors, vol. 16, no. 8, 2016.

[4] L. Bao and S. S. Intille, "Activity recognition from user-annotated acceleration data," Lecture Notes in Computer Science, vol. 3001, pp. 1-17, 2004.

[5] A. M. Khan, Y.-K. Lee, S. Y. Lee, and T.-S. Kim, "A triaxial accelerometer-based physical-activity recognition via augmented-signal features and a hierarchical recognizer," IEEE Transactions on Information Technology in Biomedicine, vol. 14, no. 5, pp. 1166-1172, 2010.

[6] N. D. Nguyen, P. H. Truong, and G.-M. Jeong, "Daily wrist activity classification using a smart band," Physiological Measurement, vol. 38, no. 9, pp. L10-L16, 2017.

[7] N. D. Nguyen, D. T. Bui, P. H. Truong, and G. M. Jeong, "Classification of five ambulatory activities regarding stair and incline walking using smart shoes," IEEE Sensors Journal, vol. 18, no. 13, pp. 5422-5428, 2018.

[8] D. Trong Bui, N. Nguyen, and G.-M. Jeong, "A robust step detection algorithm and walking distance estimation based on daily wrist activity recognition using a smart band," Sensors, vol. 18, no. 7, 2018.

[9] T. Sztyler, H. Stuckenschmidt, and W. Petrich, "Position-aware activity recognition with wearable devices," Pervasive and Mobile Computing, vol. 38, pp. 281-295, 2017.

[10] Y. Zheng, "Human activity recognition based on the hierarchical feature selection and classification framework," Journal of Electrical and Computer Engineering, vol. 2015, Article ID 140820, 9 pages, 2015.

[11] M. U. S. Khan, A. Abbas, M. Ali et al., "On the correlation of sensor location and human activity recognition in body area networks (BANs)," IEEE Systems Journal, vol. 12, no. 1, pp. 82-91, 2018.

[12] D. Schuldhaus, H. Leutheuser, and B. M. Eskofier, "Classification of daily life activities by decision level fusion of inertial sensor data," in Proceedings of the 8th International Conference on Body Area Networks, pp. 77-82, Brussels, Belgium, 2013.

[13] O. Banos, M. Damas, H. Pomares, and I. Rojas, "Activity recognition based on a multi-sensor meta-classifier," in Advances in Computational Intelligence, I. Rojas, G. Joya, and J. Cabestany, Eds., pp. 208-215, Springer, Berlin, Heidelberg, 2013.

[14] L. Gao, A. K. Bourke, and J. Nelson, "Activity recognition using dynamic multiple sensor fusion in body sensor networks," in 2012 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 1077-1080, San Diego, CA, USA, September 2012.

[15] F. Attal, S. Mohammed, M. Dedabrishvili, F. Chamroukhi, L. Oukhellou, and Y. Amirat, "Physical human activity recognition using wearable sensors," Sensors, vol. 15, no. 12, pp. 31314-31338, 2015.

[16] K. Kunze, P. Lukowicz, H. Junker, and G. Troster, "Where am I: recognizing on-body positions of wearable sensors," in Location- and Context-Awareness, T. Strang and C. Linnhoff-Popien, Eds., pp. 264-275, Springer, Berlin, Heidelberg, 2005.

[17] A. Vahdatpour, N. Amini, and M. Sarrafzadeh, "On-body device localization for health and medical monitoring applications," in 2011 IEEE International Conference on Pervasive Computing and Communications (PerCom), pp. 37-44, Seattle, WA, USA, March 2011.

[18] C. Figueira, R. Matias, and H. Gamboa, "Body location independent activity monitoring," in Proceedings of the 9th International Joint Conference on Biomedical Engineering Systems and Technologies, pp. 190-197, Rome, Italy, 2016.

[19] L. Atallah, B. Lo, R. King, and G.-Z. Yang, "Sensor positioning for activity recognition using wearable accelerometers," IEEE Transactions on Biomedical Circuits and Systems, vol. 5, no. 4, pp. 320-329, 2011.

[20] M. Zhang and A. A. Sawchuk, "A feature selection-based framework for human activity recognition using wearable multimodal sensors," in Proceedings of the 6th International ICST Conference on Body Area Networks, vol. 6, pp. 92-98, Beijing, China, June 2012.

[21] S. Pirttikangas, K. Fujinami, and T. Nakajima, "Feature selection and activity recognition from wearable sensors," in Ubiquitous Computing Systems, pp. 516-527, Springer, Berlin, Heidelberg, 2006.

[22] T. Huynh and B. Schiele, "Analyzing features for activity recognition," in Proceedings of the 2005 Joint Conference on Smart Objects and Ambient Intelligence Innovative Context-Aware Services: Usages and Technologies--sOc-EUSAI '05, pp. 159-163, Grenoble, France, October 2005.

[23] A. Y. Yang, R. Jafari, S. S. Sastry, and R. Bajcsy, "Distributed recognition of human actions using wearable motion sensor networks," Journal of Ambient Intelligence and Smart Environments, vol. 1, no. 2, pp. 103-115, 2009.

[24] O. Banos, M. Damas, H. Pomares, A. Prieto, and I. Rojas, "Daily living activity recognition based on statistical feature quality group selection," Expert Systems with Applications, vol. 39, no. 9, pp. 8013-8021, 2012.

[25] O. Banos, M. Damas, H. Pomares, F. Rojas, B. Delgado-Marquez, and O. Valenzuela, "Human activity recognition based on a sensor weighting hierarchical classifier," Soft Computing, vol. 17, no. 2, pp. 333-343, 2013.

[26] L. Gao, A. K. Bourke, and J. Nelson, "Evaluation of accelerometer based multi-sensor versus single-sensor activity recognition systems," Medical Engineering & Physics, vol. 36, no. 6, pp. 779-785, 2014.

[27] J. Li, K. Cheng, S. Wang et al., "Feature selection: a data perspective," ACM Computing Surveys, vol. 50, no. 6, pp. 1-45, 2018.

[28] K. Kira and L. A. Rendell, "A practical approach to feature selection," in Machine Learning Proceedings 1992, Morgan Kaufmann Publishers Inc., 1992.

[29] R. Gilad-Bachrach, A. Navot, and N. Tishby, "Margin based feature selection-theory and algorithms," in ICML '04 Proceedings of the Twenty-First International Conference on Machine Learning, vol. 21, pp. 43-50, Banff, Alberta, Canada, July 2004.

[30] H. Peng, F. Long, and C. Ding, "Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 8, pp. 1226-1238, 2005.

[31] I. Guyon and A. Elisseeff, "An introduction to variable and feature selection," Journal of Machine Learning Research, vol. 3, pp. 1157-1182, 2003.

[32] A. W. Whitney, "A direct method of nonparametric measurement selection," IEEE Transactions on Computers, vol. C-20, no. 9, pp. 1100-1103, 1971.

[33] H. Leutheuser, D. Schuldhaus, and B. M. Eskofier, "Hierarchical, multi-sensor based classification of daily life activities: comparison with state-of-the-art algorithms using a benchmark dataset," PLoS One, vol. 8, no. 10, article e75196, 2013.

[34] E. Zdravevski, P. Lameski, V. Trajkovik et al., "Improving activity recognition accuracy in ambient-assisted living systems by automated feature engineering," IEEE Access, vol. 5, pp. 5262-5280, 2017.

[35] L. Yu and H. Liu, "Feature selection for high-dimensional data: a fast correlation-based filter solution," in Proceedings of the 20th International Conference on Machine Learning (ICML-03), pp. 856-863, Washington, DC, USA, 2003.

[36] Y. Saeys, I. Inza, and P. Larranaga, "A review of feature selection techniques in bioinformatics," Bioinformatics, vol. 23, no. 19, pp. 2507-2517, 2007.

[37] I. Kononenko, E. Simec, and M. Robnik-Sikonja, "Overcoming the myopia of inductive learning algorithms with relief," Applied Intelligence, vol. 7, no. 1, pp. 39-55, 1997.

[38] X. He, D. Cai, and P. Niyogi, "Laplacian score for feature selection," Advances in Neural Information Processing Systems, vol. 18, 2006.

[39] F. R. K. Chung, "Spectral graph theory," in Regional Conference Series in Mathematics, vol. 92, American Mathematical Society, 1996.

[40] M. Kudo and J. Sklansky, "Comparison of algorithms that select features for pattern classifiers," Pattern Recognition, vol. 33, no. 1, pp. 25-41, 2000.

[41] Z. Zhao, L. Wang, H. Liu, and J. Ye, "On similarity preserving feature selection," IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 3, pp. 619-632, 2013.

[42] A. M. De Silva and P. H. W. Leong, Grammar-Based Feature Generation for Time-Series Prediction, Springer, 2015.

[43] J.-Y. Yang, J.-S. Wang, and Y.-P. Chen, "Using acceleration measurements for activity recognition: an effective learning algorithm for constructing neural classifiers," Pattern Recognition Letters, vol. 29, no. 16, pp. 2213-2220, 2008.

[44] C.-W. Hsu and C.-J. Lin, "A comparison of methods for multiclass support vector machines," IEEE Transactions on Neural Networks, vol. 13, no. 2, pp. 415-425, 2002.

[45] P. Cunningham and S. J. Delaney, "K-nearest neighbor classifiers," Technical Report UCD-CSI-2007-4, School of Computer Science and Informatics, Ireland, 2007.

[46] H. Bhaskar, D. C. Hoyle, and S. Singh, "Machine learning in bioinformatics: a brief survey and recommendations for practitioners," Computers in Biology and Medicine, vol. 36, no. 10, pp. 1104-1125, 2006.

Nhan Duc Nguyen (1), Duong Trong Bui (1), Phuc Huu Truong (2), and Gu-Min Jeong (1)

(1) Department of Electrical Engineering, Kookmin University, Seoul, Republic of Korea

(2) Korea Institute of Industrial Technology, Ansan, Republic of Korea

Correspondence should be addressed to Gu-Min Jeong; gm1004@kookmin.ac.kr

Received 27 February 2018; Accepted 13 August 2018; Published 13 September 2018

Academic Editor: Eduard Llobet

Caption: Figure 1: Human activity recognition approaches: (a) standard approach and (b) proposed approach.

Caption: Figure 2: Sample signals of sitting activity of ankle position (blue: raw data; red: filtered data).

Caption: Figure 3: Body-sensor placement with each sensor-node coordinate ($A_x$, $A_y$, and $A_z$: x-, y-, and z-axes of an accelerometer; $G_x$, $G_y$, and $G_z$: x-, y-, and z-axes of a gyroscope).

Caption: Figure 4: Classification accuracy rates depending on the number of features selected by the three feature selection methods for: (a) ankle, (b) chest, (c) hip, and (d) wrist.
Table 1: Statistics of preprocessed dataset.

Number                Classes                 Subjects   Sample size

1                     Sitting                    19          237
2                      Lying                     19          189
3                     Standing                   19          188
4                  Washing dishes                19          891
5                    Vacuuming                   19          211
6                     Sweeping                   19          515
7                     Walking                    19          791
8                 Ascending stairs               19          187
9                Descending stairs               19          113
10               Treadmill running               19          438
11        Bicycling on an ergometer (50 W)       19          470
12       Bicycling on an ergometer (100 W)       19          656
13                  Rope jumping                 19          160

Table 2: Derived features with symbols and implemented data.

Feature                                  Notation                 Applied data

Mean                                     $\mu$                    (a)
Standard deviation                       $\sigma$                 (a)
Variance                                 $\sigma^2$               (a)
Mean absolute deviation                  MAD                      (a)
Range                                    r                        (a)
Skewness                                 $\gamma_1$               (a)
Kurtosis                                 $\gamma_2$               (a)
4th moment                               $\mu_4$                  (a)
5th moment                               $\mu_5$                  (a)
Root mean square                         RMS                      (a)
Sum of absolute values                   SA                       (a)
Signal magnitude area                    SMA                      (a)
Intensity of movement                    IM                       (a)
Dominant frequency                       $f_d$                    (a)
Amplitude                                [mathematical expression (a)
                                         not reproducible]
Mean trend                               $\mu T$                  (a)
Windowed mean difference                 $\mu D$                  (a)
Variance trend                           $\sigma T^2$             (a)
Windowed variance difference             $\sigma D^2$             (a)
Standard deviation trend                 $\sigma T$               (a)
Windowed standard deviation difference   $\sigma D$               (a)
Correlation                              $\rho$                   [mathematical expression not reproducible]
Average energy                           AE                       [mathematical expression not reproducible]

(a) $a^w_x$, $a^w_y$, $a^w_z$, $w^w_x$, $w^w_y$, $w^w_z$;
    $a^c_x$, $a^c_y$, $a^c_z$, $w^c_x$, $w^c_y$, $w^c_z$;
    $a^h_x$, $a^h_y$, $a^h_z$, $w^h_x$, $w^h_y$, $w^h_z$;
    $a^a_x$, $a^a_y$, $a^a_z$, $w^a_x$, $w^a_y$, $w^a_z$

Table 3: Performance evaluation.

Activity                             Precision   Recall   F1-score

Sitting                               96.62%     97.03%    96.83%
Lying                                  100%      99.47%    99.74%
Standing                              97.87%     96.84%    97.35%
Washing dishes                        99.66%      100%     99.83%
Vacuuming                             95.73%     98.06%    96.88%
Sweeping                              99.22%     98.27%    98.74%
Walking                                100%      99.62%    99.81%
Ascending stairs                       100%      99.47%    99.73%
Descending stairs                     97.35%      100%     98.65%
Treadmill running (8.3 km/h)           100%       100%      100%
Bicycling on an ergometer (50 W)      98.51%     98.72%    98.62%
Bicycling on an ergometer (100 W)     99.09%     98.93%    99.01%
Rope jumping                           100%       100%      100%

Table 4: Number of selected features and accuracy rates depending on
the sensor positions and feature selection methods.

Number   Sensor     Feature selection       Number of        Accuracy
         position         method         selected features

  1       Ankle            SFFS                 56            91.46%
                         Relief-F               32            93.16%
                     Laplacian Score            29            96.85%

  2       Chest            SFFS                 87            95.62%
                         Relief-F               77            95.11%
                     Laplacian Score            36            95.62%

  3        Hip             SFFS                 32            96.08%
                         Relief-F               52            95.01%
                     Laplacian Score            29            96.91%

  4       Wrist            SFFS                 124           91.60%
                         Relief-F               73            92.75%
                     Laplacian Score            44            95.24%

Table 5: Confusion matrix of our proposed method.

           SI    LY    ST    WD    VC    SW    WK    AS    DS    RU  BC50 BC100    RJ

SI        229     0     4     3     0     0     0     0     0     0     0     0     0
LY          1   189     0     0     0     0     0     0     0     0     0     0     0
ST          6     0   184     0     0     0     0     0     0     0     0     0     0
WD          0     0     0   888     0     0     0     0     0     0     0     0     0
VC          0     0     0     0   202     4     0     0     0     0     0     0     0
SW          1     0     0     0     8   511     0     0     0     0     0     0     0
WK          0     0     0     0     1     0   791     0     2     0     0     0     0
AS          0     0     0     0     0     0     0   187     1     0     0     0     0
DS          0     0     0     0     0     0     0     0   110     0     0     0     0
RU          0     0     0     0     0     0     0     0     0   438     0     0     0
BC50        0     0     0     0     0     0     0     0     0     0   463     6     0
BC100       0     0     0     0     0     0     0     0     0     0     7   650     0
RJ          0     0     0     0     0     0     0     0     0     0     0     0   160
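The per-class values in Table 3 and the overall accuracy can be recomputed from the confusion matrix in Table 5. The sketch below is illustrative only (the variable names are hypothetical). It makes no assumption about which axis holds the predicted classes; it simply divides the diagonal by the column totals and by the row totals. With the matrix as printed, the column-based ratio reproduces the Precision column of Table 3, the row-based ratio reproduces the Recall column, and the column totals equal the sample sizes in Table 1.

```python
import numpy as np

LABELS = ["SI", "LY", "ST", "WD", "VC", "SW", "WK",
          "AS", "DS", "RU", "BC50", "BC100", "RJ"]

# Confusion matrix copied from Table 5 (rows and columns in LABELS order).
CM = np.array([
    [229,   0,   4,   3,   0,   0,   0,   0,   0,   0,   0,   0,   0],
    [  1, 189,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
    [  6,   0, 184,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
    [  0,   0,   0, 888,   0,   0,   0,   0,   0,   0,   0,   0,   0],
    [  0,   0,   0,   0, 202,   4,   0,   0,   0,   0,   0,   0,   0],
    [  1,   0,   0,   0,   8, 511,   0,   0,   0,   0,   0,   0,   0],
    [  0,   0,   0,   0,   1,   0, 791,   0,   2,   0,   0,   0,   0],
    [  0,   0,   0,   0,   0,   0,   0, 187,   1,   0,   0,   0,   0],
    [  0,   0,   0,   0,   0,   0,   0,   0, 110,   0,   0,   0,   0],
    [  0,   0,   0,   0,   0,   0,   0,   0,   0, 438,   0,   0,   0],
    [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0, 463,   6,   0],
    [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   7, 650,   0],
    [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0, 160],
])

diagonal = np.diag(CM)
column_ratio = diagonal / CM.sum(axis=0)  # matches the Precision column of Table 3
row_ratio = diagonal / CM.sum(axis=1)     # matches the Recall column of Table 3
f1 = 2 * column_ratio * row_ratio / (column_ratio + row_ratio)
accuracy = diagonal.sum() / CM.sum()

for label, p, r, f in zip(LABELS, column_ratio, row_ratio, f1):
    print(f"{label:6s} {100 * p:6.2f}% {100 * r:6.2f}% {100 * f:6.2f}%")
print(f"Overall accuracy: {100 * accuracy:.2f}%")
```

Because the F1-score is the harmonic mean of the two ratios, it is unaffected by which of them is labeled precision and which recall.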

Table 6: Recognition performance for thirteen activities in terms of F1-score metric.

Activity                             Leutheuser et al.   Zdravevski et al.   Proposed method

Sitting                                   94.10%              93.07%             96.83%
Lying                                      100%                100%              99.74%
Standing                                  90.93%              74.40%             97.35%
Washing dishes                            96.50%              79.21%             99.83%
Vacuuming                                 85.13%               100%              96.88%
Sweeping                                  88.57%               100%              98.74%
Walking                                   98.89%              94.48%             99.81%
Ascending stairs                          95.55%              98.50%             99.73%
Descending stairs                         94.80%              91.50%             98.65%
Treadmill running (8.3 km/h)               100%               94.80%              100%
Bicycling on an ergometer (50 W)          63.50%              99.60%             98.62%
Bicycling on an ergometer (100 W)         58.75%              99.55%             99.01%
Rope jumping                               100%               98.43%              100%
Overall accuracy                          89.00%              93.00%             99.13%

Figure 5: Statistics of selected features according to sensor
positions: (a) distribution of number of features and
(b) distribution of overwhelming features.

(a)

Ankle   21%
Chest   26%
Hip     21%
Wrist   32%

(b)

Ankle   27%
Chest   23%
Hip     29%
Wrist   21%

Note: Table made from pie chart.
COPYRIGHT 2018 Hindawi Limited
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2018 Gale, Cengage Learning. All rights reserved.

Article Details
Title Annotation:Research Article
Author:Nguyen, Nhan Duc; Bui, Duong Trong; Truong, Phuc Huu; Jeong, Gu-Min
Publication:Journal of Sensors
Date:Jan 1, 2018
Words:7645