
Detection of Lane-Changing Behavior Using Collaborative Representation Classifier-Based Sensor Fusion.

1. Introduction

In recent years, there has been significant effort in developing ego-vehicles capable of autonomous driving. Vehicle control now spans a continuum from no assistance through semi-assistive and semiautomated operation to complete automation. While great strides in autonomous driving continue, greater understanding and research are needed in driver modeling as control transitions from full driver control to this graduated scale of assistive-through-automated self-driving. Understanding driver expectations, experience, and capabilities across the manual-through-automated driving context is therefore a major challenge. Maneuvers are the basic units in building a comprehensive driving session, although their definitions may vary considerably depending on the underlying application [1]. Understanding how these maneuvers are performed can provide information on how the driver controls the vehicle and how driving performance varies over time, and it is therefore essential in driver assistance and safety systems.

For most of the time during driving, the driver is required to maneuver the steering wheel, and the lane change maneuver is one of the main causes of road traffic accidents. Detecting a driver's lane-changing maneuver is a challenging problem due to the highly dynamic nature of driving. Drivers can change the driving direction in an instant or start/stop the vehicle abruptly. There is also the possibility of an unwanted lane change against the driver's intention, which may lead to a situation that endangers the safety of the driver's own vehicle and the surrounding ones.

In this article, the lane change detection problem is addressed by a fusion approach using a collaborative representation classifier (CRC). The front-view video camera and onboard diagnostics (OBD) sensor are used to achieve improved detection performance. While each of these sensors has its own limitations when operating under realistic conditions, utilizing them together provides synergy. Both feature-level fusion and decision-level fusion are considered. In the feature-level fusion, features generated from the two differing modality sensors are merged before classification. In the decision-level fusion, the Dempster-Shafer (D-S) theory is used to combine the classification outcomes from two classifiers, each corresponding to one sensor. The effectiveness of the proposed methods was verified through real driving data experiments, and the experimental results showed that the introduced fusion approach using a CRC achieved the best performance in detecting the driver's lane change behavior when compared with other state-of-the-art models.

The remainder of the article is organized as follows. Section 2 presents a brief survey of the latest technologies for lane change detection. Section 3 presents the mathematical approach used. In Section 4, the proposed sensor fusion framework is explained in detail. Section 5 describes the lane change model and experimental dataset. Section 6 presents the experimental results generated from real driving data, and Section 7 concludes the article.

2. Related Work

The lane change maneuver is a common but significant maneuver performed by a driver in response to his or her driving needs, traffic flow and nearby traffic, and/or environmental factors. In recent decades, most lane change detection systems have been based on front-view camera video/image processing [2, 3, 4, 5]. McCall et al. [2] and Kasper et al. [3] both focused their studies on computer vision processing for tracking lane markings on the road, demonstrating vision-based lane change detection to be a potentially effective approach. In addition to finding landmarks purely from color images/video, several other studies have introduced LiDAR and map information for sensor fusion. With the fusion of color images and LiDAR, Huang et al. [4] established models for curve estimation, and Gu et al. [5] applied a convolutional neural network (CNN) architecture for lane type classification. Both studies contributed to a better contextual understanding of the road. However, the problem with computer vision-based approaches is that they require a line of sight, that is, the lanes and surrounding objects need to be in the direct field of view of the vehicle's camera system and not be blocked. Furthermore, these approaches depend heavily on the performance of computer vision algorithms for detecting and tracking lanes, which, in turn, is affected by various lighting conditions, environmental conditions such as snow, rain, and fog, and image processing errors in, for example, image localization and vehicle ego-motion estimation.

For identification of the driver's intention for lane change in particular, many researchers have investigated machine learning classification techniques, such as the hidden Markov model (HMM) [6, 7, 8], support vector machine (SVM) [9, 10, 11], Bayesian network [12, 13, 14, 15, 16], artificial neural network (ANN) [17, 18, 19], and deep neural network (DNN) [20, 21, 22, 23]. Zheng et al. [8] developed an HMM-based lane change detection model using vehicle dynamics signals. They reported that the classification accuracy of the model reached 80.36% for the lane change left event (LCL) and 83.22% for the lane change right event (LCR) on a real driving dataset. Hou et al. [12] applied Bayesian network and decision tree (DT) methods to model lane changes. The model predicts whether or not the driver decides to merge into an adjacent lane. The best results were obtained when the Bayes and DT classifiers were combined into a single classifier using a majority voting principle. Li et al. [20] developed two kinds of combined DNN to detect lane boundaries in traffic scenes. The multitask CNN provides auxiliary geometric information to help the subsequent modeling of the given lane structures, and a recurrent neural network then automatically detects the lane boundaries.

Most of the studies mentioned above employed rather expensive sensors to measure various vehicle states, such as lateral velocity, side slip angle, and lateral position, to identify the driver's intention for lane change. Recently, many commercial vehicles have been equipped with OBD sensors that provide basic measurements, such as steering wheel angle, yaw rate, longitudinal and lateral accelerations, and vehicle speed, at a reasonable cost [24, 25, 26, 27]. Vehicle dynamics signals can be considered a supplemental or verification resource to assist lane change detection or vehicle localization. Woo et al. [24] established models to predict a vehicle's trajectory for detecting lane changes of surrounding vehicles, and Nilsson et al. [26] proposed a pragmatic approach to select an appropriate inter-vehicle traffic gap and time instance to perform the lane change maneuver. Both studies contributed to a better understanding of vehicle states by applying trajectory prediction when a surrounding vehicle attempts to change lanes. Another reason to be concerned with vehicle dynamics signals is that they reflect the driver's control response to his/her surroundings and hence provide insight for driving performance assessment.

More recently, with the fast development of sensor fusion technologies, lane change detection using information from many different onboard sensors has emerged as a promising technology [28, 29, 30]. Cao et al. [29] used digital map data to render virtual images and align this information with the camera view, so as to obtain lane-level vehicle localization. Satzoda et al. [30] demonstrated an overall naturalistic driving study hierarchy, which combined lower-level sensor fusion of all these data with higher-level driving event recognition and driver behavior analysis. The sensor fusion technique can relax the assumptions of traditional lane-changing models regarding mathematical forms and variable distributions.

The purpose of this study is to explore the effectiveness of a sensor fusion approach for detecting the driver's lane change behavior using a CRC based on two differing modality sensors (a front-view video camera and an OBD sensor).

3. Mathematical Approach

In this section, the mathematical techniques used in the proposed fusion approach for lane change detection, namely the CRC and D-S theory, are described.

3.1. Collaborative Representation Classifier

For C distinct classes, let X = [x_1, x_2, ..., x_n] ∈ ℝ^{d×n} be the matrix formed by n training samples arranged column-wise to form the overcomplete dictionary, where each column x_i is a d-dimensional sample. A test sample y ∈ ℝ^d can then be expressed as a collaborative representation in terms of the matrix X as y = Xα, where α is an n × 1 vector of coefficients corresponding to all training samples from the C classes.

As suggested by Zhang et al. [31], it is the collaborative representation, that is, the use of all the training samples as a dictionary, and not the ℓ1-norm sparsity constraint, that improves classification accuracy. The ℓ2 regularization generates comparable results but with significantly lower computational complexity. The CRC therefore replaces the ℓ1 penalty with an ℓ2 penalty, that is,

α = argmin_α ||y − Xα||_2^2 + θ||α||_2^2 Eq. (1)

where θ denotes a regularization parameter. According to the class labels of the training samples, α can be partitioned into C subsets α = [α_1, α_2, ..., α_C], with α_j (j ∈ {1, 2, ..., C}) denoting the subset of the coefficients associated with the training samples from the jth class.

The ℓ2-regularized minimization of Equation 1 is in the form of the Tikhonov regularization, leading to the following closed-form solution:

α = (X^T X + θI)^{-1} X^T y Eq. (2)

where I ∈ ℝ^{n×n} denotes an identity matrix. The general form of the Tikhonov regularization involves a Tikhonov regularization matrix Γ. As a result, Equation 1 can be expressed as

α = argmin_α ||y − Xα||_2^2 + θ||Γα||_2^2 Eq. (3)

The term Γ allows the imposition of prior knowledge on the solution using the approach presented in [32], where the training samples that are most dissimilar from a test sample are given less weight than the training samples that are most similar. Specifically, the following diagonal matrix Γ ∈ ℝ^{n×n} is considered:

Γ = diag(||y − x_1||_2, ||y − x_2||_2, ..., ||y − x_n||_2) Eq. (4)

The coefficient vector α is then calculated as follows:

α = (X^T X + θΓ^T Γ)^{-1} X^T y Eq. (5)
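As a rough illustration of Equations 4 and 5, the following Python sketch (assuming NumPy; the paper's experiments were run in MATLAB, and the function and variable names here are purely illustrative) computes the distance-weighted Tikhonov solution and classifies a test sample by the smallest class-wise reconstruction residual, as is standard for a CRC.

```python
import numpy as np

def crc_classify(X, labels, y, theta=1e-2):
    """Collaborative representation classification with distance-weighted
    Tikhonov regularization (a sketch of Equations 4 and 5).

    X      : d x n matrix of training samples (columns, unit-normalized)
    labels : length-n array of class labels for the columns of X
    y      : length-d test sample (unit-normalized)
    theta  : regularization parameter
    """
    # Diagonal Tikhonov matrix: samples far from y are penalized more heavily
    gamma = np.diag(np.linalg.norm(X - y[:, None], axis=0))

    # Closed-form coefficients, Eq. (5): (X^T X + theta * Gamma^T Gamma)^-1 X^T y
    alpha = np.linalg.solve(X.T @ X + theta * gamma.T @ gamma, X.T @ y)

    # Class-wise residuals: reconstruct y using only the coefficients of class j
    classes = np.unique(labels)
    residuals = []
    for c in classes:
        mask = (labels == c)
        residuals.append(np.linalg.norm(y - X[:, mask] @ alpha[mask]))
    return classes[int(np.argmin(residuals))], np.asarray(residuals)
```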

3.2. Dempster-Shafer Theory

D-S theory, introduced by Dempster and later extended by Shafer [33], is able to represent uncertainty and imprecision, can effectively deal with any union of classes, and has been applied to many data fusion applications [34, 35].

Let Θ be a finite universal set of mutually exclusive and exhaustive hypotheses, called a frame of discernment. In classification applications, Θ corresponds to the set of classes. The power set, 2^Θ, is the set of all possible subsets of Θ. A mass function or basic probability assignment (BPA) is a function m : 2^Θ → [0, 1] which satisfies the following properties:

m(∅) = 0 Eq. (6)

Σ_{A⊆Θ} m(A) = 1 Eq. (7)

where ∅ is the empty set. A subset A with nonzero BPA is called a focal element. The value of m(A) is a measure of the belief that is assigned to set A, not to subsets of A. Two common evidential measures, the belief and plausibility functions, are defined, respectively, as follows (A ⊆ Θ, B ⊆ Θ):

Bel(A) = Σ_{B⊆A} m(B) Eq. (8)

Pl(A) = Σ_{B∩A≠∅} m(B) Eq. (9)

These two measures have the following properties:

Bel(A) ≤ Pl(A) Eq. (10)

Pl(A) = 1 − Bel(Ā) Eq. (11)

where Ā is the complementary set of A: Ā = Θ − A.

For combining the measures of evidence from two independent sources, Dempster's rule for combining two BPAs, m_1 and m_2 (C ⊆ Θ), is given by

m_{1,2}(∅) = 0 Eq. (12)

m_{1,2}(C) = (1 / (1 − K)) Σ_{A∩B=C} m_1(A) m_2(B), C ≠ ∅ Eq. (13)

K = Σ_{A∩B=∅} m_1(A) m_2(B) Eq. (14)

The normalization factor K provides a measure of conflict between the two sources to be combined. This rule is commutative and associative. If there are more than two sources, the combination rule can be generalized by iteration. A joint decision is made based on the combined BPA by choosing the class with the maximum Bel or Pl.
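The following minimal Python sketch (illustrative names only) implements Dempster's rule of Equations 12-14 for BPAs represented as dictionaries from focal elements (frozensets of class labels) to masses.

```python
def dempster_combine(m1, m2):
    """Combine two BPAs with Dempster's rule (Eqs. 12-14).

    m1, m2 : dicts mapping focal elements (frozensets of class labels) to masses.
    Returns the combined BPA; raises if the sources are in total conflict (K = 1).
    """
    combined = {}
    conflict = 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb          # mass falling on the empty set, Eq. (14)
    if conflict >= 1.0:
        raise ValueError("Total conflict: Dempster's rule is undefined")
    return {s: v / (1.0 - conflict) for s, v in combined.items()}   # Eq. (13)

# Example: two hypothetical sensors voting over the classes {LK, LCL, LCR}
m_video = {frozenset({"LCL"}): 0.6, frozenset({"LK", "LCL", "LCR"}): 0.4}
m_obd = {frozenset({"LCL"}): 0.5, frozenset({"LK"}): 0.2,
         frozenset({"LK", "LCL", "LCR"}): 0.3}
print(dempster_combine(m_video, m_obd))
```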

4. Sensor Fusion Framework

Lane changes are dynamic events made by drivers. These events may require a higher workload than staying in the same lane and following free-flow traffic. This study explores the robustness and limits of a solution that fuses video and OBD data for lane change detection.

The limits can be caused by the curvature of the route, vibration due to the road surface, as well as variations in operating time. As shown in Figure 1, case (a) represents two actual lane changes: one typical LCL from S1 to S2 and one LCR from S2 to S3. Case (b) happens on a curved road; although the vehicle is moving straight forward, this event could be annotated as a lane change. Without prior road information, it is not possible to detect this situation exclusively from the OBD data. Case (c) is defined as a lane drift, which usually appears like lane keeping (LK) in the early stage but ends with a sharp swerve (to avoid hitting the curb). The diversity of these situations makes it challenging to detect lane changes in naturalistic driving.

4.1. Feature Extraction from Video Data

In this section, features are extracted from the captured front-view videos for lane change detection based on video image processing. First, a video data pre-processing stage is applied before lane boundary detection. Second, a lane boundary detection algorithm is applied to each video frame. Finally, for each frame at time t, that is, X_t, the distance from the relative center position to each of the detected left and right lane boundaries is calculated. The distances to the left and right lane boundaries are taken as a feature vector. These distance features correspond closely to how the driver maneuvers the vehicle and are therefore the most significant features for detecting lane changes.

Video Data Pre-processing Real cameras usually use curved lenses to form an image, and light rays often bend a little too much or too little at the edges of these lenses. This creates an effect that distorts the edges of images, so that lines or objects appear more or less curved than they actually are; this is the most common type of distortion, known as radial distortion. In order to overcome this issue, distortion correction is used to correct the raw video image's distortion based on the camera parameters. Next is the image binarization stage, during which multiple transformations (e.g., saturation thresholding, histogram-equalized thresholding, gradient thresholding, and binarization) are applied and combined to generate the best binary image, with clear lane boundaries, for lane boundary detection. Then the region of interest (ROI) is set in the perspective transform stage, using source and destination vertices, to concentrate on the essential part of the image: the lanes. The perspective transform allows the lane to be viewed from above, which is useful for later calculating the distances from the relative center position to the detected left and right lane boundaries. However, some lane candidates exist outside the ROI in situations where pitch motions are caused by an unstable road surface or speed bumps, obscure lane painting, or an illumination change. The random sample consensus (RANSAC) algorithm is used to deal with noisy lane candidates because of its good and fast performance when selecting inliers. After extracting the lane candidates from the ROI, the outliers are removed using the RANSAC algorithm.
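A condensed sketch of such a pre-processing chain is shown below, assuming OpenCV and NumPy; the camera matrix, distortion coefficients, ROI vertices, and threshold values are placeholders, not the calibration actually used in this study.

```python
import cv2
import numpy as np

def preprocess_frame(frame, K, dist, src_pts, dst_pts, out_size):
    """Sketch of the pre-processing pipeline: distortion correction,
    binarization, and perspective (bird's-eye) transform.
    K, dist     : camera matrix / distortion coefficients from calibration
    src_pts     : 4x2 ROI corners in the undistorted image
    dst_pts     : 4x2 corresponding corners in the top-down view
    out_size    : (width, height) of the warped output
    """
    # 1. Distortion correction
    undist = cv2.undistort(frame, K, dist)

    # 2. Binarization: combine a saturation threshold with a gradient threshold
    #    (threshold values are illustrative only)
    hls = cv2.cvtColor(undist, cv2.COLOR_BGR2HLS)
    s_bin = (hls[:, :, 2] > 120).astype(np.uint8)
    gray = cv2.cvtColor(undist, cv2.COLOR_BGR2GRAY)
    sobelx = np.abs(cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3))
    g_bin = (sobelx > 40).astype(np.uint8)
    binary = cv2.bitwise_or(s_bin, g_bin)

    # 3. Perspective transform onto the ROI so the lanes are viewed from above
    M = cv2.getPerspectiveTransform(src_pts.astype(np.float32),
                                    dst_pts.astype(np.float32))
    warped = cv2.warpPerspective((binary * 255).astype(np.uint8), M, out_size)
    return warped
```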

Lane Boundary Detection After applying pre-processing to the video image, a binary image in which the lane lines stand out clearly is obtained. The lane boundary detection algorithm is then applied to find the lane boundaries in each video frame.

For lane boundary detection, the histogram along all the columns in the lower half of the binarized perspective image is first calculated. Peak ranges in the left and right halves of the histogram are then searched from left to right, and the most prominent peaks on the left and right sides are selected. These peak values are taken as the x-positions of the starting points for the detected left and right lane boundaries. From the starting points of the left and right lane lines, a sliding window of (150, 80) in width and height is used to search for the lane-line pixels. Finally, a polynomial fitting method is applied to connect each detected lane line. When this pipeline is applied frame by frame, it produces considerable jitter between frames. Smoothing/averaging over the ten previous frames is therefore implemented to obtain jitter-free lane detection. The averaged polynomial fits from previous frames also handle scenarios in which the polynomial fit in the current frame is not reasonable.
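The histogram seeding, sliding-window search, and polynomial fitting steps can be sketched as follows (NumPy only; the window counts, margins, and pixel thresholds are illustrative rather than the exact values used in the paper):

```python
import numpy as np

def _track_line(nz_x, nz_y, x_start, img_h, n_windows=9, margin=75, minpix=50):
    """Slide a window up the image from x_start, collecting lane-line pixels."""
    win_h = img_h // n_windows
    x_cur, idx = x_start, []
    for i in range(n_windows):
        y_lo, y_hi = img_h - (i + 1) * win_h, img_h - i * win_h
        good = np.where((nz_y >= y_lo) & (nz_y < y_hi) &
                        (nz_x >= x_cur - margin) & (nz_x < x_cur + margin))[0]
        idx.append(good)
        if len(good) > minpix:              # recentre the next window
            x_cur = int(nz_x[good].mean())
    return np.concatenate(idx)

def detect_lane_boundaries(binary_warped):
    """Histogram peaks seed a sliding-window search; a second-order polynomial
    x = f(y) is then fitted to each boundary."""
    h, w = binary_warped.shape
    hist = np.sum(binary_warped[h // 2:, :], axis=0)     # lower-half histogram
    left_x = int(np.argmax(hist[:w // 2]))
    right_x = int(np.argmax(hist[w // 2:])) + w // 2

    nz_y, nz_x = binary_warped.nonzero()
    left_idx = _track_line(nz_x, nz_y, left_x, h)
    right_idx = _track_line(nz_x, nz_y, right_x, h)
    left_fit = np.polyfit(nz_y[left_idx], nz_x[left_idx], 2)
    right_fit = np.polyfit(nz_y[right_idx], nz_x[right_idx], 2)
    return left_fit, right_fit
```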

Distance Calculation After detecting the lane boundaries in each video frame, the distances from the relative center position to the detected left and right lane boundaries can be calculated. First, the x-positions of the detected left and right lane boundary points A(x_1, y) and B(x_2, y) and of the relative center position C(x_3, y) are mapped to the real-world coordinates A(X_1, Y), B(X_2, Y), and C(X_3, Y) using camera calibration, where x_3 = w/2 and w is the image width. The real-world distances from the relative center position to the detected left and right lane boundaries are then obtained. For each frame at time t, that is, X_t, the distances to the detected left and right lane boundaries are denoted as Left_dis = |X_1 − X_3| and Right_dis = |X_2 − X_3|. By appending Left_dis and Right_dis, these two-dimensional features are arranged as the final distance features. Figure 2 shows some examples of lane boundary detection results on captured video frames.
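A minimal sketch of the distance computation is given below; the polynomial fits come from the previous step, and the meters-per-pixel scale is an assumed calibration constant, not a value reported in the paper.

```python
import numpy as np

def lane_distances(left_fit, right_fit, img_w, img_h, m_per_px=3.7 / 700):
    """Distances from the (assumed) vehicle center to the detected boundaries,
    evaluated at the bottom of the frame. m_per_px is an illustrative scale."""
    y = img_h - 1                                   # evaluate at the frame bottom
    x_left = np.polyval(left_fit, y)
    x_right = np.polyval(right_fit, y)
    x_center = img_w / 2.0                          # relative center position
    left_dis = abs(x_left - x_center) * m_per_px    # Left_dis in meters
    right_dis = abs(x_right - x_center) * m_per_px  # Right_dis in meters
    return left_dis, right_dis
```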

For lane change detection specifically, a computer vision-based method built on distance features can detect lane change events on the road, demonstrating that a vision-based algorithm is a potentially effective approach. However, when no lane boundary can be detected in the video image, the vision-based lane change detection algorithm fails. Therefore, adding other sources of knowledge, for example, vehicle position and dynamics information from OBD data, would help improve the robustness of the lane change detection system.

4.2. Feature Extraction from OBD Data

Sideswipe accidents occur primarily when drivers attempt an improper lane change, drift out of the lane, or lose lateral traction. Recently, studies of lane change detection have relied on vehicle dynamics data collected from the OBD device, such as vehicle speed, steering wheel angle, vehicle heading, and vehicle GPS latitude and longitude, which correspond closely to how the driver orientates the vehicle and thus provide the most dominating signals for detecting lane changes.

In this study, the vehicle dynamics features extracted from the OBD data are described as follows. Without loss of generality, it is assumed that a trip is recorded as a time sequence γ(t), t = 0, ..., T, where γ(t) is a vector of measurements taken at time index t. In this study, γ(t) contains five elements, namely, lat(t), lon(t), v(t), steer(t), and heading(t), representing, respectively, the latitude, longitude, speed, steering wheel angle, and heading of the vehicle at time t. Firstly, the GPS coordinates are converted to the Universal Transverse Mercator (UTM) coordinate system, which represents the earth as a grid with 60 zones. Each zone spans 6° of longitude and ranges from 80° S latitude to 84° N latitude. The UTM coordinates at time t are denoted as u1(t), u2(t). In order to make vehicle position detection insensitive to the trip starting point, the vehicle's position is normalized by using u1(0), u2(0) as the origin of the UTM coordinate system for each trip. So the position of the vehicle at time t is transformed to n_u1(t) = u1(t) − u1(0) and n_u2(t) = u2(t) − u2(0).

For convenience of description, let x(i) = n_u1(i) and y(i) = n_u2(i), i = 0, ..., T. The position detection at time t is based on a window of the vehicle's past trajectory, that is, u(t, w_z) = [(x(t), y(t)), (x(t − 1), y(t − 1)), ..., (x(t − w_z + 1), y(t − w_z + 1))], where w_z, a positive number, is the window size. u(t, w_z) will be referred to as the position vector throughout this article. Figure 3 illustrates this process for w_z = 10. In Figure 3(a), the current time is t = 10 and the window size is 10, so the vehicle locations at t = 1, 2, ..., 10 are all used to form an input feature vector, with the target value highlighted at t = 10. Figure 3(b) illustrates the detection made at t = 11, with the target value highlighted at t = 11.

For lane change detection specifically, from the position vector u(t, w), the vehicle's past locations are further normalized with respect to the vehicle's current location at t: x(t − i) = x(t − i) − x(t), y(t − i) = y(t − i) − y(t), for i = 1, ..., w. The feature vectors of steering wheel angle and vehicle heading are generated by mapping them from their original ranges to the range [−π, π] within the window of size w. By appending the GPS coordinates with the steering wheel angle, vehicle heading, and speed, these five-dimensional features are then arranged as the final vehicle dynamics features. In order to reduce the impact of GPS noise, the following smoothing operation is used to extract smoothed features:

f_v(t, S, w) = {h_k(t) | k = 1, ..., K} Eq. (15)

where h_k(t) is the kth smoothed feature (the exact expression is not reproducible here), S is the size of the smoothing filter, and S < w. In the experiments described in Section 5, S = 3 is used. For a window size w = 10, f_v(t, S, w) = {h_k(t) | k = 1, 2, 3}.
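The UTM conversion and window-based position normalization can be sketched as follows (the third-party utm package and all function names are assumptions; the smoothing step of Equation 15 is omitted because its exact form is not reproduced above):

```python
import numpy as np
import utm   # third-party 'utm' package, assumed available for the conversion

def position_vector(lat, lon, t, w_z=10):
    """Build the normalized position vector u(t, w_z) from the GPS trace of one
    trip. lat/lon are full-trip arrays; t is the current index (t >= w_z - 1)."""
    # GPS -> UTM easting/northing, shifted so the trip starts at the origin
    east, north = zip(*[utm.from_latlon(a, b)[:2] for a, b in zip(lat, lon)])
    x = np.asarray(east) - east[0]
    y = np.asarray(north) - north[0]

    # Window of the past w_z positions, re-normalized to the current location
    xs = x[t - w_z + 1:t + 1] - x[t]
    ys = y[t - w_z + 1:t + 1] - y[t]
    return np.column_stack([xs[::-1], ys[::-1]])   # most recent position first
```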

Lane change detection using a machine learning method based on vehicle dynamics features can effectively detect typical lane changes, but when lane changes happen at corners or on curved roads, the vehicle dynamics signal characteristics for these kinds of situations are not obvious. Therefore, adding other sources of knowledge, for example, road information from the front-view video, would help improve the overall performance.

4.3. Feature-Level Fusion

Feature-level fusion involves fusing the feature sets of different modality sensors. Let U = {u_l}_{l=1}^n ⊂ ℝ^{d_1} (a d_1-dimensional feature space) and V = {v_l}_{l=1}^n ⊂ ℝ^{d_2} (a d_2-dimensional feature space) represent the feature sets generated, respectively, from the video data and the OBD data for n positive training samples. The column vectors u_l and v_l are normalized to have unit length. The fused feature set is then represented by F = {f_l}_{l=1}^n ⊂ ℝ^{d_1+d_2}, with each column vector being f_l = [u_l^T, v_l^T]^T. The fused feature set is then fed into a classifier.
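A minimal sketch of this concatenation-based fusion, assuming NumPy and column-wise samples, is:

```python
import numpy as np

def fuse_features(U, V):
    """Feature-level fusion: unit-normalize each column, then stack the video
    features U (d1 x n) on top of the OBD features V (d2 x n)."""
    Un = U / np.linalg.norm(U, axis=0, keepdims=True)
    Vn = V / np.linalg.norm(V, axis=0, keepdims=True)
    return np.vstack([Un, Vn])            # fused dictionary F, (d1 + d2) x n
```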

4.4. Decision-Level Fusion

For the C target classes and a test sample y, the frame of discernment is given by Θ = {H_1, H_2, ..., H_C}, where H_j : class(y) = j, j ∈ {1, 2, ..., C}. The classification decision of the CRC is based on the residual error with respect to class j, r_j(y). Each class-specific representation y_j and its corresponding class label j constitute a distinct item of evidence regarding the class membership of y. If y is close to y_j according to the Euclidean distance, that is, r_j(y) is small, it is most likely that H_j is true. If r_j(y) is large, the class of y_j provides little or no information about the class of y.

In the decision-level fusion here, the CRC is first applied to the distance feature set U from the video data and the vehicle dynamics feature set V from the OBD data, respectively. Two corresponding BPAs, m_1 and m_2, are thereby generated. The combined BPA from m_1 and m_2 is then obtained via Equation 13. The class label of a new test sample is determined as the one corresponding to the maximum value of Bel(H_j), that is, max_j Bel(H_j).
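The sketch below illustrates one possible realization of this decision-level fusion, reusing the dempster_combine sketch from Section 3.2; since the article does not spell out how the CRC residuals are turned into masses, the softmax-style mass construction here is only an assumption.

```python
import numpy as np

def residuals_to_bpa(residuals, classes):
    """One simple (assumed) way to turn CRC residuals into a BPA: smaller
    residuals receive larger singleton masses."""
    scores = np.exp(-np.asarray(residuals))
    masses = scores / scores.sum()
    return {frozenset({c}): float(m) for c, m in zip(classes, masses)}

def fuse_decisions(res_video, res_obd, classes):
    """Combine the video-based and OBD-based CRC outputs with D-S theory."""
    m1 = residuals_to_bpa(res_video, classes)
    m2 = residuals_to_bpa(res_obd, classes)
    m12 = dempster_combine(m1, m2)          # see the sketch in Section 3.2
    # With singleton focal elements, Bel({c}) = m12({c}); pick the largest
    return max(m12, key=m12.get)
```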

5. Lane Change Model and Experiment Dataset

5.1. Lane Change Model

The lane change detection problem can be formulated as follows. At each time t, one begins with a set of signals:

ξ(t) = [x_1(t), x_2(t), ..., x_n(t)] Eq. (16)

The task is to classify ξ(t) as representing a lane change event (LC) or a non-lane change event (NLC). For this study, the signal set ξ(t) consists of signals collected from the front-view camera and the OBD sensor. In order to train and test the system, in addition to acquiring the vehicle OBD data, a video of the driver's view of the road is also acquired. A schematic of the lane change maneuver is shown in Figure 4. Each frame of the video is hand-classified as either LC (target 1 or 2) or NLC (target 0). To do this, all lane change events in the video for the duration of the drive are first identified. If there are m such events in the video, the time stamp t_c^i, i = 1, ..., m, at which the driver crosses the center line is identified for each event.

The time t_c is the critical time within which a lane change event is considered to be in progress, that is, beginning before and ending after the center line is crossed. The target for each time sample in the range [t_c^i − t_c, t_c^i + t_c] is set to 1 or 2 to represent an LCL or LCR, where t_c^i is the crossing time for the ith lane change. For all samples outside these regions, the target is set to 0 to represent an NLC.
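A short sketch of this frame-labeling rule (Python/NumPy, with illustrative names; crossing times are given as frame indices at the 10 Hz rate used in Section 5.2):

```python
import numpy as np

def label_frames(n_frames, crossings, directions, fs=10, t_c=1.0):
    """Assign per-frame targets: 1 (LCL) or 2 (LCR) within +/- t_c seconds of
    each annotated center-line crossing, 0 (NLC) elsewhere.
    crossings  : frame indices of the m crossings
    directions : 1 or 2 for each crossing"""
    half = int(round(t_c * fs))
    targets = np.zeros(n_frames, dtype=int)
    for frame, d in zip(crossings, directions):
        lo, hi = max(0, frame - half), min(n_frames, frame + half + 1)
        targets[lo:hi] = d
    return targets
```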

5.2. Experiment Dataset

The experimental dataset was collected from ten different drivers' real driving on both city streets and highways around the University of Michigan Ann Arbor and Dearborn campuses in Michigan. As described in Table 1, drivers were selected with a balanced distribution of gender (5 males, 5 females), age (18-40), and driving experience level (novice to expert). Within the OBD data, the decoded context includes time stamp, vehicle speed, steering wheel angle, vehicle heading, vehicle GPS latitude and longitude, etc. The abovementioned five vehicle dynamics signals correspond closely to how the driver orientates the vehicle and thus provide the most dominating signals for detecting lane changes.

The raw signals captured from the OBD device are typically sampled at different rates. Moreover, they may lose frames or be affected by system, sensor, or vehicle-related noise. In order to synchronize the data, all the signals were preprocessed so that they have a common sample rate, which was taken to be 10 Hz in this study. Signals sampled at a rate higher than 10 Hz were downsampled, and those with a rate lower than 10 Hz were upsampled using linear interpolation. In addition, the first and last few seconds of data in each driving trip were removed, because the vehicle is driving out of or into a parking lot during these periods, which is beyond the scope of the forward-moving scenarios this study focuses on.
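A minimal resampling sketch using NumPy's linear interpolation is shown below; for simplicity it treats down- and upsampling the same way, which is a slight simplification of the procedure described above.

```python
import numpy as np

def resample_signal(t_src, values, fs=10.0):
    """Resample one OBD channel onto a common 10 Hz time base using linear
    interpolation; t_src and values are the original (possibly irregular)
    time stamps and measurements."""
    t_uniform = np.arange(t_src[0], t_src[-1], 1.0 / fs)
    return t_uniform, np.interp(t_uniform, t_src, values)
```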

6. Performance Evaluation and Experimental Results

This section presents the experiments conducted to evaluate the proposed feature-level fusion and decision-level fusion methods presented in Section 4 using 20 real driving trips collected from 10 different drivers' naturalistic driving. The raw data consist of front-view video captured by the onboard camera and data such as GPS, CAN, and accelerometer signals from the OBD DataLogger. The useful OBD data, such as vehicle speed, steering wheel angle, heading, and GPS latitude and longitude signals, are then extracted from the raw data file using Race Technology Software. As described in Section 4, the extracted data were preprocessed to reduce noise and pre-filtered to set aside pure left-/right-turning events. Figure 5 shows an example of a recorded driving trip. This trip route included sharp turns, curved paths, and freeways.

The following experiments are constructed to evaluate the performance of the proposed fusion approach with respect to its generalization on trips taken by different drivers. For the experiments discussed below, the critical time is set to t_c = 1 s. A sliding window of length t_w seconds is also applied to each vehicle dynamics signal. This means that, if t_w = 1 second is chosen, then, since the sampling rate was 10 Hz, each feature extraction window contains n_w = 10 sample points. According to Murphey et al. [36], the feature extraction window size is set to t_w = 1 s here. The experiments were all conducted on a computer with an Intel i7-3960X quad-core processor at 3.3 GHz, 64 GB memory, Microsoft Windows 7, and MATLAB 2016b.

The system performance comparisons are conducted based on the detection confidence and accuracy:

Confidence = (# of correctly detected LC frames) / (# of ground truth frames) Eq. (17)

Accuracy = (# of correctly detected LC events) / (# of ground truth LC events) Eq. (18)

where the detection confidence is defined as the fraction of frames overlapping with the ground truth of each lane change event. If the detection confidence of the lane change detection system is larger than 75%, the entire detected frame sequence is considered a lane change event.
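The two measures can be sketched as follows (NumPy; pred and truth are per-frame label arrays, and events is a list of frame-index arrays for the annotated lane change events, all illustrative names):

```python
import numpy as np

def detection_confidence(pred, truth, event_frames):
    """Per-event confidence (Eq. 17): fraction of an event's ground-truth
    frames whose predicted label matches the ground truth."""
    idx = np.asarray(event_frames)
    return np.mean(pred[idx] == truth[idx])

def detection_accuracy(pred, truth, events, threshold=0.75):
    """Event-level accuracy (Eq. 18): an event counts as detected when its
    confidence exceeds the 75% threshold used in the paper."""
    detected = sum(detection_confidence(pred, truth, ev) > threshold
                   for ev in events)
    return detected / len(events)
```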

6.1. Performance of Cross-Trained Systems for Lane Change Detection

In this set of experiments, the effectiveness and robustness of the proposed fusion approach in detecting drivers' lane change behavior are evaluated on trips that were never used for training, using several different classifiers. Four other state-of-the-art classifiers, namely k-nearest neighbor (k-NN), DT, HMM, and multilayer NN, were employed to evaluate the effectiveness of the proposed approach compared with the CRC on both feature-level fusion and decision-level fusion. The NN that produced the best results contained 15 nodes, so the NN architecture is fixed at 15 hidden nodes in subsequent experiments. The parameter k = 3 was used in k-NN as it generated the best outcome among different values of k. A left-to-right topology with eight states was used for the HMM. The parameters of the NN and CRC were set to the values that maximized the training accuracy via tenfold cross-validation. The lane change detection performance of feature-level fusion is compared with that of decision-level fusion, as shown in Tables 2-5. In this article, ten trips are first randomly extracted from the experiment dataset for training; the remaining ten trips are then used for testing and evaluation.

Feature-Level Fusion In this section, the effectiveness of the feature-level fusion method was tested first, through the following two experiments. In the first experiment, the above four classifiers and the CRC were trained on the selected 10 trips, that is, Data1_Tr_i, i = 1, 2, ..., 10, and the performances were evaluated on the remaining 10 trips from the experiment dataset, that is, Data1_Te_i, i = 11, 12, ..., 20.

Table 2 summarizes the detection performance on Data1_Te_i based on the different models. The highest accuracies are from the CRC model, which obtained 81.96% for the 61 LCLs and 82.26% for the 62 LCRs. At the same time, the k-NN model has the highest error detection rate, in which 27.87% of LCLs are detected as LK or LCR and 25.81% of LCRs are detected as LK or LCL. As shown in Figure 6, the CRC model gives the highest average detection confidences on each testing trip for both LCLs and LCRs. This observation suggests that the CRC model could provide better performance for lane change detection when the video and OBD data are used together, due to the complementary nature of the data from these two differing modality sensors.

In the second experiment, the above four classifiers and the CRC were trained on the remaining 10 trips, that is, Data2_Tr_i, i = 11, 12, ..., 20, and the performances were evaluated on the selected 10 trips from the experiment dataset, that is, Data2_Te_i, i = 1, 2, ..., 10.

Table 3 summarizes the detection performance on Data2_Te_i based on the different models. The highest accuracies are from the CRC model, which obtained 81.03% for the 58 LCLs and 81.58% for the 76 LCRs. The experimental results of the CRC model also indicated a greater balance between LCLs and LCRs when the test samples are not equally balanced. At the same time, the k-NN model has the highest error detection rate, in which 25.86% of LCLs are detected as LK or LCR and 27.63% of LCRs are detected as LK or LCL. As shown in Figure 7, the CRC model also gives the highest average detection confidences on each testing trip for both LCLs and LCRs. This observation suggests that the CRC model could provide better performance and a more balanced treatment of the two lane change classes.

Decision-Level Fusion The effectiveness of the decision-level fusion method was also tested. In the experiments, results of the CRC+D-S model were compared with those from the four other state-of-the-art classifiers combined with D-S theory using the same dataset. The decision-level fusion approach first requires the CRC and the other classifiers to be applied to the distance features and the vehicle dynamics features, respectively; D-S theory is then used to combine the classification outcomes from the two classifiers. Tables 2-5 show that the feature-level fusion outperformed the decision-level fusion in most cases. However, the decision-level fusion involving the CRC+D-S model still achieved better performance than the other four models. One disadvantage of the decision-level fusion is that the CRC and the other four classifiers need to be applied to the two different feature sets separately; in other words, each classifier has to be run twice.

This set of experiments compares the two fusion methods, and the results show that feature-level fusion of the two differing modality sensors' data (video and OBD data) together with the CRC model achieves more robust performance for detecting the driver's lane-changing behavior on different trips than decision-level fusion.

6.2. Performance of Systems Trained on the Data from Differing Modality Sensors

In this section, the lane change detection performance of the feature-level fusion method is compared with the performance of each individual modality sensor. In order to examine the detection performance of the five classifiers, the data were partitioned using a random stratified sampling process for tenfold cross-validation. Here, 9 folds of data are used for training the model and the remaining fold for testing, which means that 8 additional trips from the 10 testing trips were used to train the 5 models; that is, the training data increased to 18 trips, Data3_Tr_i, i = 1, 2, ..., 18. The remaining 2 trips are used to evaluate the system performance, that is, Data3_Te_i, i = 19, 20.

Table 6 summarizes the detection performance on Data3_Te_i with and without the data fusion method based on the different models. By combining the features from the two differing modality sensors, the overall detection accuracy was improved over either the video camera or the OBD sensor alone. This improved performance was consistent for all five classifiers. In general, the best performance is obtained from the CRC model, which shows an improvement of 3-4% over the NN/HMM models and 6-9% over the k-NN/DT models. The overall detection accuracy based on OBD data was also found to be higher than that based on video camera data.

To summarize, the performance of the CRC with feature-level fusion is superior to the other four state-of-the-art classifiers on a real driving dataset of 20 trips. Moreover, the proposed fusion approach using the CRC is quite robust, as reflected in Table 6 by the detection accuracies of 82.35% for the 119 LCLs and 82.61% for the 138 LCRs, even though the test samples are not equally balanced. This performance is achieved based on five vehicle dynamics signals and video data only. When the system incorporates more vehicle signals together with environmental signals and/or the driver's physiological signals, and the study is extended to a larger training dataset (>20 trips), the detection performance is expected to improve.

6.3. Failure Detection Issue Analysis

In naturalistic driving scenarios, lane change events may happen in a variety of situations. It is interesting to discuss what kinds of lane changes can be detected accurately and in what situations lane changes are difficult to detect. Figure 8 displays the spatial distribution of true positives (TP) and false negatives (FN) in three domains using the CRC model with the feature-level fusion approach trained on the larger trip set, Data3_Tr_i.

In Figure 8(a) and (b), the domain of time duration vs. moving distance, the TP and FN distributions for both LCL and LCR are almost the same. It can be inferred that the effectiveness of lane change detection is not determined by the duration of the maneuver or the vehicle's moving distance. Figure 8(c) and (d) shows the vehicle speed change vs. steering angle change during a lane change maneuver. Here, 46.82% of LCL FNs and 42.73% of LCR FNs are clustered in the area where the speed change is ≤7 m/s and the steering change is ≤10 degrees. These situations usually occur when a vehicle is approaching an intersection, changing lanes slowly, and preparing to make a turn. In the domain of vehicle speed change vs. vehicle heading change in Figure 8(e) and (f), 32.38% of LCL FNs and 35.21% of LCR FNs are located in the area where the vehicle heading change is >20 degrees. Referring to Figure 1, case (c) falls almost entirely in the large heading change area, because these are swerve shifts that deviate substantially from LK characteristics.

Overall, the fusion approach proposed in this study is clearly able to detect typical lane changes effectively, but failures still occurred when lane changes happened at corners or on curved roads.

7. Conclusion

In this article, a fusion approach is introduced that utilizes data from two differing modality sensors (a front-view camera and an OBD sensor) for the purpose of detecting the driver's lane-changing behavior. In experiments with a real driving dataset, it is demonstrated that (1) the feature-level fusion outperformed the decision-level fusion in most cases; (2) the performance of the CRC model is superior to the other four state-of-the-art classifiers in both feature-level fusion and decision-level fusion, by at least 3% over the NN model, 4% over the HMM model, 6% over the DT model, and 9% over the k-NN model; and (3) the proposed fusion approach involving the two differing modality sensors' data together with the CRC model achieves the most robust performance for detecting the driver's lane-changing behavior on the different trips conducted in this study.

The proposed fusion approach can be utilized as a pre-processing algorithm for automatic driver trip segmentation to accurately and effectively detect the driver's lane change behavior, as well as for driving pattern analysis. Future research will focus on improving the detection accuracy using deep network models under different weather conditions, such as rainy and snowy days. The effect of the feature extraction window size t_w on detection accuracy will also be studied, and the work will be extended to larger training data collected from more drivers on different trips.

Acknowledgments

This research is supported in part by Research Grants from Toyota Research Institute (TRI).

References

(1.) D'Agostino, C., Saidi, A., Scouarnec, G., and Chen, L., "Learning-Based Driving Events Recognition and Its Application to Digital Roads," IEEE Transactions on Intelligent Transportation Systems 16(4):2155-2166, 2015.

(2.) McCall, J. and Trivedi, M., "Video-Based Lane Estimation and Tracking for Driver Assistance: Survey, System, and Evaluation," IEEE Transactions on Intelligent Transportation Systems 7(1):20-37, 2006.

(3.) Kasper, D., Weidl, G., Dang, T. et al., "Object-Oriented Bayesian Networks for Detection of Lane Change Maneuvers," Intelligent Vehicles Symposium, IEEE, 2012, 673-678.

(4.) Huang, A.S., Moore, D., Antone, M., Olson, E. et al., "Finding Multiple Lanes in Urban Road Networks with Vision and Lidar," Autonomous Robots 26(2):103-122, 2009.

(5.) Gu, X., Zang, A., Huang, X. et al., "Fusion of Color Images and LiDAR Data for Lane Classification," 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM, 2015, 69.

(6.) Tran, D., Sheng, W., Liu, L. et al., "A Hidden Markov Model Based Driver Intention Prediction System," IEEE International Conference on Cyber Technology in Automation, Control, and Intelligent Systems, 2015, 115-120.

(7.) Jin, L., Hou, H., and Jiang, Y., "Driver Intention Recognition Based on Continuous Hidden Markov Model," International Conference on Transportation, Mechanical, and Electrical Engineering, IEEE, 2012, 739-742.

(8.) Zheng, Y. and Hansen, J.H.L., "Lane-Change Detection from Steering Signal Using Spectral Segmentation and Learning-Based Classification," IEEE Transactions on Intelligent Vehicles 2(1):14-24, 2017.

(9.) Morris, B., Doshi, A., and Trivedi, M., "Lane Change Intent Prediction for Driver Assistance: On-Road Design and Evaluation," Intelligent Vehicles Symposium, IEEE, 2011, 895-901.

(10.) Kim, I.H., Bong, J.H., Park, J. et al., "Prediction of Driver's Intention of Lane Change by Augmenting Sensor Information Using Machine Learning Techniques," Sensors 17(6):1350, 2017.

(11.) Ramyar, S., Homaifar, A., Karimoddini, A. et al., "Identification of Anomalies in Lane Change Behavior Using One-Class SVM," IEEE International Conference on Systems, Man, and Cybernetics, 2016, 4405-4410.

(12.) Hou, Y., Edara, P., and Sun, C., "Modeling Mandatory Lane Changing Using Bayes Classifier and Decision Trees," IEEE Transactions on Intelligent Transportation Systems 15(2):647-655, 2014.

(13.) Schubert, R., Schulze, K., and Wanielik, G., "Situation Assessment for Automatic Lane-Change Maneuvers," IEEE Transactions on Intelligent Transportation Systems 11(3):607-616, 2010.

(14.) Kasper, D., Weidl, G., Dang, T. et al., "Object-Oriented Bayesian Networks for Detection of Lane Change Maneuvers," Intelligent Vehicles Symposium, IEEE, 2012, 673-678.

(15.) Ulbrich, S. and Maurer, M., "Situation Assessment in Tactical Lane Change Behavior Planning for Automated Vehicles," IEEE International Conference on Intelligent Transportation Systems, 2015, 975-981.

(16.) Yan, F., Eilers, M., Ludtke, A. et al., "Developing a Model of Driver's Uncertainty in Lane Change Situations for Trustworthy Lane Change Decision Aid Systems," Intelligent Vehicles Symposium, IEEE, 2016, 406-411.

(17.) Peng, J., Guo, Y., Fu, R., Yuan, W. et al., "Multiparameter Prediction of Drivers' Lane-Changing Behaviour with Neural Network Model," Applied Ergonomics 50:207-217, 2015.

(18.) Tomar, R.S. and Verma, S., "Lane Change Trajectory Prediction Using Artificial Neural Network," International Journal of Vehicle Safety 6(3):213-234, 2013.

(19.) Leonhardt, V. and Wanielik, G., "Feature Evaluation for Lane Change Prediction Based on Driving Situation and Driver Behavior," IEEE 20th International Conference on Information Fusion, 2017, 1-7.

(20.) Li, J., Mei, X., Prokhorov, D. et al., "Deep Neural Network for Structural Prediction and Lane Detection in Traffic Scene," IEEE Transactions on Neural Networks & Learning Systems 28(3):690-703, 2017.

(21.) Olabiyi, O., Martinson, E., Chintalapudi, V. et al., "Driver Action Prediction Using Deep (Bidirectional) Recurrent Neural Network," 2017, arXiv:1706.02257.

(22.) Gurghian, A., Koduri, T., Bailur, S.V. et al., "DeepLanes: End-to-End Lane Position Estimation Using Deep Neural Networks," IEEE Conference on Computer Vision and Pattern Recognition, 2016, 38-45.

(23.) Kim, J. and Lee, M., "Robust Lane Detection Based on Convolutional Neural Network and Random Sample Consensus," International Conference on Neural Information Processing, 2014, 454-461.

(24.) Woo, H. et al., "Lane-Change Detection Based on Vehicle-Trajectory Prediction," IEEE Robotics and Automation Letters 2(2):1109-1116, 2017.

(25.) Yao, W., Zhao, H., Bonnifait, P., and Zha, H., "Lane Change Trajectory Prediction by Using Recorded Human Driving Data," Intelligent Vehicles Symposium, IEEE, June 2013, 430-436.

(26.) Nilsson, J., Brannstrom, M., Coelingh, E. et al., "Lane Change Maneuvers for Automated Vehicles," IEEE Transactions on Intelligent Transportation Systems 18(5):1087-1096, 2017.

(27.) Xu, G., Liu, L., Ou, Y. et al., "Dynamic Modeling of Driver Control Strategy of Lane-Change Behavior and Trajectory Planning for Collision Prediction," IEEE Transactions on Intelligent Transportation Systems 13(3):1138-1155, 2012.

(28.) Kim, H.T., Song, B., Lee, H. et al., "Multiple Vehicle Recognition Based on Radar and Vision Sensor Fusion for Lane Change Assistance," Journal of Institute of Control 21(2):121-129, 2015.

(29.) Cao, G., Damerow, F., Flade, B. et al., "Camera to Map Alignment for Accurate Low-Cost Lane-Level Scene Interpretation," 19th International Conference on Intelligent Transportation Systems, IEEE, 2016, 498-504.

(30.) Satzoda, R.K. and Trivedi, M.M., "Drive Analysis Using Vehicle Dynamics and Vision-Based Lane Semantics," IEEE Transactions on Intelligent Transportation Systems 16(1):9-18, 2015.

(31.) Zhang, L., Yang, M., and Feng, X., "Sparse Representation or Collaborative Representation: Which Helps Face Recognition?," IEEE International Conference on Computer Vision, 2011, 471-478.

(32.) Chen, C., Li, W., Tramel, E.W. et al., "Reconstruction of Hyperspectral Imagery from Random Projections Using Multihypothesis Prediction," IEEE Transactions on Geoscience and Remote Sensing 52(1):365-374, 2014.

(33.) Shafer, G., A Mathematical Theory of Evidence (Princeton, NJ: Princeton University Press, 1976).

(34.) Rombaut, M. and Zhu, Y.M., "Study of Dempster-Shafer Theory for Image Segmentation Applications," Image and Vision Computing 20(1):15-23, 2002.

(35.) Basir, O., Karray, F., and Zhu, H., "Connectionist-Based Dempster-Shafer Evidential Reasoning for Data Fusion," IEEE Transactions on Neural Networks 16(6):1513-1530, 2005.

(36.) Murphey, Y.L., Liu, C., Tayyab, M., and Narayan, D., "Accurate Pedestrian Path Prediction Using Neural Networks," IEEE Symposium Series on Computational Intelligence, 2017, 1-7.

Jun Gao and Yi Lu Murphey, University of Michigan-Dearborn, USA

Honghui Zhu, Wuhan University of Technology, China

History

Received: 04 Jan 2018

Revised: 02 Sep 2018

Accepted: 12 Sep 2018

e-Available: 29 Oct 2018

Citation

Gao, J., Murphey, Y., and Zhu, H., "Detection of Lane-Changing Behavior Using Collaborative Representation Classifier-Based Sensor Fusion," SAE Int. J. Trans. Safety 6(2):147-162, 2018, doi:10.4271/09-06-02-0010

doi:10.4271/09-06-02-0010
TABLE 1 Description of the dataset.

Participating subjects   5 females and 5 males
Driving route            2-3 lanes, local and freeway
Driving time             10-40 minutes each trip
Driving environment      Day time, sunny and cloudy
# of driving trips       20 trips
# of lane changes        119 left and 138 right
# of turns               87 left and 124 right

[c] SAE International

TABLE 2 Accuracy of lane change detection using different classifiers
with feature-level fusion on Data1_Te_i.

                  Detection classifier and accuracy (%)
Ground truth(#)   k-NN    DT      HMM     NN      CRC

LCL (61)          72.13   73.77   78.68   77.05   81.96
LCR (62)          74.19   79.03   80.64   77.42   82.26

[c] SAE International

TABLE 3 Accuracy of lane change detection using different classifiers
with feature-level fusion on Data2_Te_i.

                  Detection classifier and accuracy (%)
Ground truth(#)   k-NN    DT      HMM     NN      CRC

LCL (58)          74.14   75.86   77.58   79.31   81.03
LCR (76)          72.37   73.68   77.63   76.32   81.58

[c] SAE International

TABLE 4 Accuracy of lane change detection using different models with
decision-level fusion on Data1_Te_i.

Ground     Detection model and accuracy (%)
truth(#)   k-NN+D-S   DT+D-S   HMM+D-S   NN+D-S   CRC+D-S

LCL (61)   70.49      72.13    77.05     75.41    78.68
LCR (62)   72.58      77.42    74.19     75.81    80.64

[c] SAE International

TABLE 5 Accuracy of lane change detection using different models with
decision-level fusion on Data2_Te_i.

Ground     Detection model and accuracy (%)
truth(#)   k-NN+D-S   DT+D-S   HMM+D-S   NN+D-S   CRC+D-S

LCL (58)   72.41      74.14    75.86     77.58    79.31
LCR (76)   71.05      75       78.95     77.63    80.26

[c] SAE International

TABLE 6 Accuracy of lane change detection using the data from differing
modality sensors with feature-level fusion on Data3_Te_i.

             Accuracy (%)
             LCL (119)                   LCR(138)
Detection    Video   OBD     Video+OBD   Video   OBD     Video+OBD
classifier

k-NN         67.23   70.58   74.78       65.22   69.56   73.91
DT           68.91   73.1    76.47       68.12   71.01   77.54
HMM          70.58   74.78   79.84       70.29   74.64   78.98
NN           72.27   77.31   80.67       71.74   76.08   79.71
CRC          73.95   78.99   82.35       72.46   78.26   82.61

[c] SAE International