# Data-Based Control for Humanoid Robots Using Support Vector Regression, Fuzzy Logic, and Cubature Kalman Filter.

1. IntroductionIn general, lots of experiences are needed to turn the parameters of the controller for humanoid walking robots [1, 2]. At the same time, the turned parameters could be out of operation once external disturbances occurred [3, 4]. It is still a giant challenge for humanoid robots to walk autonomously in disturbed environments. How to improve the antidisturbance ability of humanoid robots using the data online and offline is an interesting problem to be settled.

Traditionally, an accurate dynamic model of the considered robot should be built in order to implement the desired high-quality control [5, 6]. The dynamic model of humanoid robots is a set of strong coupling nonlinear ordinary differential equations about the joint variables. When humanoid robots walk slowly, the coupling among the joints can be treated as disturbances. In this case, proportion integration differentiation (PID) control for each independent joint can be adopted. Additionally, inverse dynamics feed-forward control, decoupling control, and feedback linearization control also can be considered. All the abovementioned methods have a common feature that they are dependent on the established mathematical model strongly.

The control effect could be great if the system model is known exactly. However, these methods need to be improved in practical applications because there always exist external disturbances and the system model cannot be known exactly [7, 8]. Due to the cumulative errors in disturbances, when the robot is tracking the walking pattern planned in advance, the difference between actual motion states and target values will increase rapidly. As a consequence, humanoid robots always fall down after a few steps. To keep stable and sustained humanoid walking, the control systems must be improved to cope with the disturbances.

To realize the stable walking of humanoid robots, researchers proposed some effective methods, such as stability controls based on linear inverted pendulum model, stability controls based on ZMP theory, and attitude controls for the upper part of robot. By summarizing the existing humanoid prototypes, ZMP-based methods are the most popular and practical [9-13]. When the humanoid dynamics (the masses of each module, the center of mass, and the moment of inertia) is known exactly, the gait of a humanoid robot can be obtained by solving the ZMP equation. However, it is really difficult to collect all the precise parameters from a physical robot. As a result, several simplified methods are proposed to guarantee the stability criterion of ZMP. For example, Liu et al. designed an effective fuzzy logic (FL) controller for humanoid walking robots using ZMP as one of the antecedents [14]. Inspired by this literature, we try to realize a kind of data-based control with implicit constraints of ZMP.

Considering the time-varying external disturbances, it is necessary to deduce the control torques referring to the time-varying states of the humanoid joints. The support vector machine (SVM) is supervised learning methods with associated learning algorithms that analyze data [15-17]. SVMs can be used to solve the classification problems [18, 19] and also the regression problems [20-22]. In this paper, support vector regression (SVR) is used to expresses the scene of solving the regression problems using SVM. The formulation of SVM employs the structural risk minimization (SRM) principle, which has been shown to be superior to the empirical risk minimization (ERM) based on infinite samples. SRM is a technique where nested sets of functions of different complexity, controlled by the regularization term, are considered. One could select then the one which is minimizing the upper bound on the generalization error. This feature makes SVM more efficient in resolving the learning problems with limited training data. To cope with time-varying external disturbances, a time-based fuzzy SVR will be proposed to learn the dynamics between the states and the control torques of each joint in this work.

On the other hand, the effectiveness of time-based fuzzy SVR depends on the design of the SVR and the parameters of the fuzzy system. Kalman filter has been used in algorithm studies on the training of neural networks and fuzzy systems. Singhal and Wu [23] demonstrated that the EKF could serve as the basis for training MLP networks by treating the weights of the network as a nonlinear dynamical system. Inspired by the successful application of the Kalman filter for training neural networks [24] and for defuzzification strategies [25], Simon [26] built a nonlinear system to train fuzzy systems using an extended Kalman filter. Recently, there is a research [27] that embedded a third-degree spherical-radial cubature rule into the Bayesian filter to build a kind of new filter named CKF. CKF has demonstrated excellent performance in solving nonlinear filtering problems with minimal computational effort [28-31]. Therefore, we will explore approaches to train the proposed time-based fuzzy SVR using CKF.

The contributions of this work can be summarized as follows:

(i) For the first time, a trapezoidal fuzzy SVR is proposed to cope with time-varying external disturbances imposed on humanoid robots.

(ii) For the first time, a novel approach for training the trapezoidal fuzzy SVR using CKF is presented.

The organization of this paper is as follows. In Section 2, the backgrounds of humanoid robots, SVR, and CKF are presented. The details for proposed framework are presented in Section 3. Simulation results are provided in Section 4, followed by the conclusions in Section 5.

2. Background

2.1. Dynamic Balance of Humanoid Robots. The dynamic equations of the single support phase (SSP) can be written as

[mathematical expression not reproducible], (1)

where [theta] [member of] [R.sup.n] is the generalized coordinate and M([theta]) [member of] [R.sup.nxn] is the inertia matrix. C([theta],[??]) [member of] [R.sup.nxn] denotes the matrix of centripetal acceleration and Coriolis terms, G([theta]) [member of] [R.sup.nxn] is the gravity vector, and [tau] [member of] [R.sup.n] denotes the input torque vector during the SSP. The external disturbances are represented by [[tau].sub.d] [member of] [R.sup.n].

The dynamic equations of the double support phase (DSP) can be written as

[mathematical expression not reproducible], (2)

where [J.sup.T]([theta]) [member of] [R.sup.lxn] is a Jacobian matrix and F [member of] [R.sup.l] is the force vector of constraints caused by the contact with ground.

To analyze the stability of the humanoid motions, the ZMP theory is used as the criterion of dynamic humanoid balance in this work. The concept of zero moment point (ZMP) has been applied to many famous humanoid robots successfully, such as ASIMO [11] of Honda. ZMP is a well-known concept introduced in 1990 [32], which is a point on the ground at which the net moment of the inertial forces and the gravity forces has no component along the horizontal axes. At a given time instant, dynamic balance of legged systems is ensured if the ZMP is inside the support area.

2.2. Support Vector Regression. In this section, we briefly review the basis of the theory of SVR.

Given a labeled training data set [[x.sub.i], [y.sub.i]} (i = 1,2, ..., N), [x.sub.i] [member of] [R.sup.n] is the input vector of the system and [y.sub.i] [member of] R is the output vector. The basic idea of SVR is mapping the input space into a higher dimension feature space using the nonlinear mapping function and searching the optimal linear regression function in this feature space. Objective function of the least squares SVR (LS-SVR) [17] is

[mathematical expression not reproducible], (3)

where C is a positive real constant for tuning. The error of the regression becomes smaller when the value of C is smaller. [omega] is a weight vector and b is a bias; [[xi].sub.i] is the positive slack variable enabling dealing with permitted errors. To solve this optimization problem we construct the Lagrangian

[mathematical expression not reproducible], (4)

and we find the saddle point of L([omega], b, [xi], a), where [x.sub.i] [member of] [R.sup.n] is the input vector of the system and [y.sub.i] [member of] R is the output vector. [alpha] = [[[[alpha].sub.1], [[alpha].sub.2], ..., [[alpha].sub.N]].sup.T] is a vector of the Lagrange multipliers. The parameters must satisfy the following conditions:

[mathematical expression not reproducible]. (5)

Eliminating [omega] and [xi], problem (3) can be transformed into

[mathematical expression not reproducible]. (6)

where y = [[[y.sub.1], [y.sub.2], ..., [y.sub.N]].sup.T], A = [[1,1,..., 1].sup.T], [alpha] = [[[alpha].sub.1], [[alpha].sub.2], ... [[alpha].sub.N]].sup.T], [OMEGA] is a square matrix, which has elements of [[OMEGA].sub.i] = K([x.sub.i], x) = [phi][([x.sub.i]).sup.T] [phi](x), and K([x.sub.i], x) is a kernel function. For example, K([x.sub.i], x) = exp[-[([x.sub.i] - x).sup.2] / 2[[sigma].sup.2]] denotes a radial basis function (RBF) kernel. Submitting the optimal [alpha] and b, the regression function of the least squares SVR [17] is

y(x) = [n.summation over (i=1)] [[alpha].sub.i]K([x.sub.i], x) + b. (7)

2.3. Cubature Kalman Filter. To describe the CKF, we consider the filtering problem of a nonlinear dynamic system with additive noise, whose state space model is defined by a process equation and a measurement equation in discrete time:

[z.sub.k] = f([x.sub.k-1], [u.sub.k-1]) + [w.sb.k-1], [z.sub.k] = g ([x.sub.k], [u.sub.k]) + [v.sub.k], (8)

where [x.sub.k] is the state of the dynamic system at discrete time k; f(x) and g(x) are some known functions; [u.sub.k] is the control input; [z.sub.k] is the measurement; {[w.sub.k-1]} and [[v.sub.k]} are independent process and measurement Gaussian noise sequences with zero means and covariances [Q.sub.k-1] and [R.sub.k], respectively.

When we deal with a problem of state estimation using a nonlinear filtering, the integrals for the means and variances of the states can be expressed as the form of a Gaussianweighted integral. Consider a Gaussian-weighted integral of the following form:

[mathematical expression not reproducible], (9)

where f(x) is an arbitrary function and [U.sup.n] is the region of integration. There are different integration methods to derive different nonlinear filters.

For a CKF, the spherical-radial cubature rule is adopted to implement the integration. Let x = rt([t.sup.T]t = 1, r [member of] [0, +[infinity])), and integration (9) can be separated into radial integration and spherical integration. That is,

[mathematical expression not reproducible], (10)

where [S.sub.n] is the surface of the sphere defined by [rho] [member of] [S.sub.n]. Using Lagrange integration, the radial integration can be rewritten as

[mathematical expression not reproducible], (11)

where [PI](n) = [[integral].sup.[infinity].sub.0] [x.sup.n-1] [e.sup.-x] dx. Using a cubature rule of degree three, the spherical integration can be rewritten as

[mathematical expression not reproducible], (12)

where [(1).sub.i] denotes the ith column of set (1). For example, (1) = ([(1,0).sup.T], [(-1, 0).sup.T], [(0,1).sup.T], (0,1}T) when n = 2.

Combining (11) and (12), the spherical-radial cubature rule is as follows [27]:

[mathematical expression not reproducible]. (13)

For standard Gaussian distribution, (13) can be rewritten as

[mathematical expression not reproducible], (14)

where N(x; 0, I) denotes the Gaussian density of x with mean 0 and covariance I. Combining (9) and (14), we get

[mathematical expression not reproducible], (15)

where n is the dimension of the state vector. The point ([[eta].sub.i], [[beta].sub.i]) is called cubature point here. [[eta].sub.i] and [[beta].sub.i] can be calculated as follows:

[mathematical expression not reproducible]. (16),

This means that, for the third-degree spherical-radial rule, it entails a total of 2n cubature points. After calculating the cubature points, we use the cubature-point set to numerically compute the integrals and obtain the CKF algorithm; details of time update and measurement update can be found in the literature [27].

3. Data-Based Control for Humanoid Walking Robots

External disturbance is one of the key issues which influence the stability of humanoid walking robots. On the other side, it is difficult to measure the external disturbances directly. Based on these facts, we turn away to focus attention on the data of the humanoid states and the corresponding control torques because the pattern of the disturbances can be disclosed using the varying data collected from the humanoid robots to some extent.

First of all, the states and the control torques of the humanoid joints are collected from a simulated stable walking robot. Then, a data-based controller considering the varying states of the humanoid robot is designed using SVR and fuzzy theory. To optimize the controller, a CKF is designed to train the parameters of the SVR and the fuzzy system. The complete framework for data-based humanoid control using SVR, FL, and CKF is shown in Figure 1.

3.1. Data Collecting from the Simulations. Data for training the controller is collected by implementing simulation experiments. Two kinds of data are collected from the simulated humanoid robot, including the joint angles and the driving torques. The way we get the training data is described in detail next.

First of all, data of the joint angles are generated from reference trajectory planned offline. The trajectories can be represented as follows:

[mathematical expression not reproducible], (17)

where [x.sub.h], [z.sub.h] represent the position of the hip and [x.sub.a], [z.sub.a] represent the position of the swinging ankle joint. s denotes the walking step length, and d denotes the height of swinging ankle. N denotes the total number of samples for a step, i denotes the index of the samples, and [l.sub.th], [l.sub.sh] represent the length of lower limbs.

Secondly, a proportion integration differentiation (PID) controller is used to obtain the driving torques. In this work, the initial driving torques of all the joints are obtained using this PID controller. Then the key driving torques, including driving torques for the support hip and support ankle in the SSP and driving torques for the knees in the DSP, are improved using SVR, FL, and CKF. The PID controller is as follows:

[mathematical expression not reproducible], (18)

where [[tau].sub.j] (j = 1, ..., n) is the torque of the joints. denotes the offset of the desired reference trajectories and the actual trajectories. T is the integral period. The proportional gains [P.sub.j], integral gains [I.sub.j], and differential gains [D.sub.j] are slightly modified by the trial-error method.

3.2. Designing the Controller Using the Collected Data. A TFLSSVR is proposed in this section to design the controller using the collected data.

3.2.1. Humanoid Controller to Be Built Using the TF-LSSVR. Based on the existing literature [14], when the ZMP criterion is satisfied, the dynamics between the torque control inputs and the joint angles of the DSP can be presented as

[[tau].sub.left_knee] = h ([[theta].sub.left_knee], [[theta].sub.right_knee]), [[tau].sub.right_knee] = l ([[theta].sub.left_knee], [[theta].sub.right_knee]), (19)

where [[tau].sub.left_knee] and [[tau].sub.right_knee] are the driven torques of the left knee and the right knee. [[theta].sub.left_knee] and [[theta].sub.right_knee] are the joint angles of the left knee and the right knee. h(x) and l(x) are the nonlinear dynamics in the DSP to be learned using the proposed TF-LSSVR.

On the other side, when the ZMP criterion is satisfied, the dynamics between the torque control inputs and the joint angles of the SSP can be presented as [14]

[[tau].sub.sup_hip] = f ([[theta].sub.sup_hip], [[theta].sub.sup_ankle]), [[tau].sub.sup_ankle] = g ([[theta].sub.sup_hip], [[theta].sub.sup_ankle]), (20)

where [[tau].sub.sup_hip] and [[tau].sub.sup_ankle] are the driven torques of the supporting hip and the supporting ankle. [[theta].sub.sup_hip] and [[theta].sub.sup_ankle] are the joint angles of the supporting hip and the supporting ankle. f(x) and g(x) are the nonlinear dynamics in the SSP to be learned using the proposed TF-LSSVR.

For illustration purposes, we will expound the situation in the SSP mainly. The solution for the DSP can be easily deduced in the same way. Then, the humanoid controller to be built can be simply presented as follows:

[tau] = g([THETA]), (21)

where G(x) is the nonlinear dynamics that the TF-LSSVR tries to build. t is the driving torque, and [THETA] = ([[theta].sub.sup_hip], [[theta].sub.sup_ankle]) is a vector of joint angles.

3.2.2. Objective Functions of the TF-LSSVR. When the timeliness of the training data is considered, the collected data from current steps are "more important," and those from past steps are "less important." Based on this, in the proposed objective functions of the TF-LSSVR, a TFMF is used to design learning weights of the collected data. Taking the learning of driving torque of the supporting hip as an example and designing a learning weight [r.sub.k] for each of the training samples, the training sample set can be denoted by {[[theta].sup.(k).sub.sup_hip], [[theta].sup.(k).sub.sup_ankle], [[tau].sup.(k).sub.sup_hip], [r.sub.k]}, 0 [less than or equal to] [r.sub.k] [less than or equal to] 1, k = 1,2, ..., N. Then the objective function of the TF-LSSVR for training the driving torque of the supporting hip is

[mathematical expression not reproducible], (22)

where [omega] is a weight vector. [phi](x) is a nonlinear mapping function. C is a penalty coefficient, [[xi].sub.k] is a positive slack variable, and [b.sub.sup_hip] is the corresponding bias. {[[[theta].sup.(k).sub.sup_hip], [[theta].sup.(k).sub.sup_ankle], [[tau].sup.(k)], [r.sub.k]} is the kth sample, and N is the number of the samples. The [r.sub.k] is the fuzzy learning weight for the training samples, which can be calculated using the fuzzy membership functions designed in the next section. It is noted that in literature [17] also a weighting scheme similar to (22) has been proposed but it is based on robust statistics. In this paper, a different weighting scheme is proposed.

3.2.3. Designing the Learning Weights Using TFMF. The walking data from different time can be evaluated using some linguistic terms such as "more important" or "less important." When the timeliness of the training data is considered, the collected data from current steps are "more important," and those from past steps are "less important."

For this reason, a TFMF, which is the left half of a trapezoidal function, is proposed as the membership function of humanoid walking samples. The proposed time-related trapezoidal fuzzy membership function is shown in Figure 2. The formula for the trapezoidal fuzzy membership function is as follows:

[mathematical expression not reproducible], (23)

where [r.sub.k] is the trapezoidal fuzzy membership function of time [t.sub.k] (k = 1,2, ..., N), [r.sub.k] [member of] [h, 1], and h [member of] (0,1) is a tuning parameter. [t.sub.k-m-n] and [t.sub.k] are the beginning and the ending of the time window, respectively, and [t.sub.k-m] is a time point between the beginning and the ending.

From Figure 2 and (23), it can be seen that the time window [[t.sub.k-m-n],[t.sub.k]] is divided into two parts. The first part [[t.sub.k-m],[t.sub.k]] contains the newest walking samples, which are designed to have a full fuzzy membership grade. The second part [[t.sub.k-m-n], [t.sub.k-m]] crosses through the relative old samples, which are assigned descending fuzzy membership grades. As shown in Figure 2, the values of the learning weight at points A and B are larger than that of point C because the states A and B are closer to the current situation. Besides, as the sampling time window moves, the trapezoidal learning-weight function moves along the time axis accordingly. Therefore, reasonable and adaptable learning weights are assigned for all samples in the whole walking process. The deduced learning weights are then used in the learning algorithm of the TF-LSSVR.

The nonlinear dynamic model G(x) in (21) can be built when the learning process of the TF-LSSVR is completed. After this, considering the parameters of the TF-LSSVR, controller (21) can be written as

[tau] = G([THETA],[psi]), (24)

where [tau] is the driving torque, [THETA] is the joint angle, and [psi] = [[C, [sigma], m, n, h].sup.T] is a parameter vector. Here, we denote the plenty coefficient by C, the width of the RBF kernel by [sigma], and the width of the moving time window by m and n, and we denote the minimum value of the membership function by h. To obtain optimized parameters for the controller, a CKF-based training method is proposed in the next section.

3.3. Training the Parameters of the Controller Using CKF

3.3.1. Parameters Optimization Problem in a Form Suitable for CKF. In order to cast the parameters optimization problem of the TF-LSSVR in a form suitable for CKF, we let the parameters of the TF-LSSVR constitute the state of a nonlinear system, and we let the output of the TF-LSSVR constitute the output of the nonlinear system to which the CKF is applied.

As mentioned above we denote the state of the nonlinear system by

[psi] = [[C, [sigma], m, n, h].sup.T]. (25)

The vector [psi] thus consists of all of the parameters for the TFLSSVR arranged in a linear array. Let [tau] = G([THETA], [psi]) be the transfer function of the TF-LSSVR, where [tau] is the output, [THETA] is the input, and [psi] = [[C, [sigma], m, n, h].sup.T] is a parameter vector. The training of the TF-LSSVR can be formulated as a filtering problem. The model for updating the parameters using CKF can be written as

[[psi].sub.k] = [[psi].sub.k-1] [[tau].sub.k] = G([[THETA].sub.k], [[psi].sub.k]). (26)

Similar to the approach of training fuzzy systems with the extended Kalman filter [26], in order to execute a stable Kalman filter algorithm, we add some process noise and measurement noise to the above model. That is,

[[psi].sub.k] = [[psi].sub.k-1] + [[psi].sub.k-1], [[tau].sub.k] = G([[THETA.sub.k], [[psi].sub.k]) + [v.sub.k], (27)

where [w.sub.k-1] and [v.sub.k] are artificially added Gaussian noise sequences with zero means and covariances [Q.sub.k-1] and [R.sub.k], respectively.

3.3.2. The CKF Algorithm for Updating the Parameters of the Controller. To update the parameters of the controller, the CKF algorithm [27] is implemented as follows.

Step 1 (cubature-points calculation). The point ([[eta].sub.i], [[beta].sub.i]) is called cubature point here. For the third-degree spherical-radial rule, it entails a total of 2u cubature points. [[eta].sub.i] and [[beta].sub.i] can be calculated as follows:

[[eta].sub.i] = [square root of u] [(1).sub.i], [[beta].sub.i] = 1/2u, (28)

where i = 1,2, ..., 2u and [(1).sub.i] denotes the ith column of the set (1}. For example, (1} = ([(1, 0).sup.T], [(-1, 0).sup.T], [(0,- 1).sup.T], [(0,1).sup.T]) when u = 2. u is the dimension of the state vector.

Step 2 (time update). After calculating the cubature points, we use the cubature-point set to numerically compute the integrals and obtain the time update:

[mathematical expression not reproducible]. (29)

Step 3 (measurement update). Consider

[mathematical expression not reproducible], (30)

where [W.sub.k] is the Kalman gain, [[??].sub.k|k] is the updated state, and [P.sub.k|k] is the corresponding error covariance. [P.sub.[psi][tau],k|k-1] is the cross-covariance matrix and [P.sup.-1.sub.[tau][tau],k|k-1] is the inverse matrix of the innovation covariance matrix.

After time update and measurement update, an estimation of the parameters [psi] for the TF-LSSVR is obtained. That is to say, the initial controller with random parameters is updated to the proposed data-based controller.

4. Simulation Research

In this section, we test our proposed learning control method on the control of a seven-link robot by simulation experiments. Matlab 7.0 is used to model the humanoid robot and the controller.

4.1. Simplified Model of a Seven-Link Humanoid Robot. The simplified model of the robot has two legs and a trunk. Each leg is composed of a thigh, a shank, and a foot. There is one degree of freedom (DOF) in the trunk, one for each hip, one for each knee, and one for each ankle. In this paper, we focus on the sagittal dynamics of the humanoid robot. The simple model of the humanoid is shown in Figure 3. The details of the humanoid can be referenced in Table 1.

4.2. Sampling for the TF-LSSVR Learning. The whole walking cycle is denoted by [T.sub.c], and let [T.sub.c] = 1 s with 0.2 s for the DSP and 0.8 s for the SSP. The trunk of the humanoid robot is assumed to be upright during walking. Three different step lengths are implemented (0.16 m, 0.18 m, and 0.20 m) and the step height is 0.02 m for all the three step lengths. The sampling interval is 0.025 s. That is to say, there are 40 groups of samples that are collected in a single walking cycle.

To validate the advantage of the proposed method, a simulated time-varying external perturbation [[tau].sub.d] = 0.08 sin(5t) (Nm) is considered here for the tests. It is a horizontal external force with duration of 0.1 s that is applied on the hip at t = 0.15 s in the DSP and t = 0.65 s in the SSP, respectively.

4.3. Parameters and the Learning Results

4.3.1. Experimental Conditions and the Parameters. Here, we use the universal RBF as the kernel for the TF-LSSVR.

The penalty coefficient C = 1000 and width of the RBF kernels [sigma] = 0.9 are determined by a 10-fold cross-validation strategy. The other initial parameters for the proposed framework are identified according to experiences, which include the number of the steps for a full weight m = 1, the number of the steps for a gradient weight n = 1, and the lower bound of the fuzzy membership function h = 0.5. Then the initial parameters are updated using CKF.

Parameters for the CKF are as follows: process noise [w.sub.k-1] ~ (0,[10.sup.-3]) and the measurement noise [v.sub.k] ~ (0,[10.sup.- 3]), where [w.sub.k-1] and [v.sub.k] are both white, zero-mean, uncorrelated noises. The initial covariance of the humanoid state is [P.sub.0] = diag[{[10.sup.-3], ..., [10.sup.-3]}.sub.7x7]. The initial radians of the humanoid joint angles are shown in Table 2.

4.3.2. Learning Results of the TF-LSSVR. The learning results of the proposed TF-LSSVR are compared with two other intelligent methods including fuzzy and the traditional LSSVR. The designing details of the fuzzy control system can be found in the literature [14]. Nonlinear dynamics of the submodels formulated in (19)-(20) are learned using the three different methods (see Figures 4-15).

Define control error [member of] = [??] - [tau], and the integral square error (ISE) is defined as the measure index:

ISE = 1/N [N.summation over (i=1)] [([[??].sup.(i) - [[tau].sup.(i)]).sup.2], (31)

where i denotes the sampling index and N = 40 is the number of all the samples. [[??].sup.(i)] is the output of the learning methods and [[tau].sup.(i)] is the desired output. The integrated square errors (ISEs) criterion is listed in Table 3, which indicates that the proposed TF-LSSVR achieves a better learning result than the other two existing methods.

4.4. Performance Comparisons. ZMP trajectory with disturbance in the SSP can be found in Figures 16-18, which indicate that the proposed learning control method produces a larger stability margin relative to the traditional ones. Similar situations appeared in the DSP and the comparisons of humanoid walking when disturbances occur in the DSP are shown in Figures 19-21. As we can see, the ZMP trajectories corresponding to the proposed method have bigger stability margins for the humanoid compared to the other two intelligent methods. That is to say, using the data online and offline with different weights, the proposed learning control method is more effective in learning the external disturbances and increasing the stability margin of humanoid robots.

5. Conclusions

In this work, a TF-LSSVR-based control system is proposed to learn the external disturbances and increase the stability margin of humanoid robots. First, data for training the controller is collected by implementing simulation experiments. Secondly, a TF-LSSVR with a time-related TFMF is proposed to train the controller using the simulated data. Thirdly, the parameters of the proposed TF-LSSVR are updated using a CKF. Simulation results are provided.

The proposed method is shown to be effective in learning and adapting occasional external disturbances and ensuring the stability margin of the robot. We believe that the proposed method will be very promising for the development of autonomous humanoid robots.

http://dx.doi.org/10.1155/2016/1984634

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Project 61403264 and Project 61305098 and the Natural Science Foundation of Guangdong Province under Project 2016A030310018.

References

[1] J.-W. Heo and J.-H. Oh, "Biped walking pattern generation using an analytic method for a unit step with a stationary time interval between steps," IEEE Transactions on Industrial Electronics, vol. 62, no. 2, pp. 1091-1100, 2015.

[2] T.-H. S. Li, P.-H. Kuo, Y.-F. Ho, M.-C. Kao, and L.-H. Tai, "A biped gait learning algorithm for humanoid robots based on environmental impact assessed artificial bee colony," IEEE Access, vol. 3, pp. 13-26, 2015.

[3] P. Henaff, V. Scesa, F. Ben Ouezdou, and O. Bruneau, "Real time implementation of CTRNN and BPTT algorithm to learn online biped robot balance: experiments on the standing posture," Control Engineering Practice, vol. 19, no. 1, pp. 89-99, 2011.

[4] C. Sabourin and O. Bruneau, "Robustness of the dynamic walk of a biped robot subjected to disturbing external forces by using CMAC neural networks," Robotics and Autonomous Systems, vol. 51, no. 2-3, pp. 81-99, 2005.

[5] H.-K. Shin and B. K. Kim, "Energy-efficient gait planning and control for biped robots utilizing vertical body motion and allowable zmp region," IEEE Transactions on Industrial Electronics, vol. 62, no. 4, pp. 2277-2286, 2015.

[6] M. Rakovic, B. Borovac, M. Nikolic, and S. Savic, "Realization of biped walking in unstructured environment using motion primitives," IEEE Transactions on Robotics, vol. 30, no. 6, pp. 1318-1332, 2014.

[7] J. P. Ferreira, M. M. Crisostomo, and A. P. Coimbra, "Adaptive PD controller modeled via support vector regression for a biped robot," IEEE Transactions on Control Systems Technology, vol. 21, no. 3, pp. 941-949, 2013.

[8] K. Galloway, K. Sreenath, A. D. Ames, and J. W. Grizzle, "Torque saturation in bipedal robotic walking through control lyapunov function-based quadratic programs," IEEE Access, vol. 3, no. 1, pp. 323-332, 2015.

[9] M. Vukobratovic and B. Borovac, "Zero-moment point--thirty five years of its life," International Journal of Humanoid Robotics, vol. 1, no. 1, pp. 157-173, 2004.

[10] M. Vukobratovic, B. Borovac, and V. Potkonjak, "ZMP: a review of some basic misunderstandings," International Journal of Humanoid Robotics, vol. 3, no. 2, pp. 153-175, 2006.

[11] K. Hirai, M. Hirose, Y. Haikawa, and T. Takenaka, "The development of Honda humanoid robot," in Proceednigs of the IEEE International Conference on Robotics and Automation, pp. 1321-1326, Leuven, Belgium, May 1998.

[12] T.-H. S. Li, Y.-T. Su, S.-H. Liu, J.-J. Hu, and C.-C. Chen, "Dynamic balance control for biped robot walking using sensor fusion, Kalman filter, and fuzzy logic," IEEE Transactions on Industrial Electronics, vol. 59, no. 11, pp. 4394-4408, 2012.

[13] H.-K. Shin and B. K. Kim, "Energy-efficient gait planning and control for biped robots utilizing the allowable ZMP region," IEEE Transactions on Robotics, vol. 30, no. 4, pp. 986-993,2014.

[14] Z. Liu, Y. Zhang, and Y. Wang, "A type-2 fuzzy switching control system for biped robots," IEEE Transactions on Systems, Man and Cybernetics Part C: Applications and Reviews, vol. 37, no. 6, pp. 1202-1213, 2007.

[15] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273-297, 1995.

[16] H. Drucker, C. J. C. Surges, L. Kaufman, A. Smola, and V. Vapnik, "Support vector regression machines," in Proceedings of the 10th Annual Conference on Neural Information Processing Systems (NIPS '96), pp. 155-161, December 1996.

[17] J. A. K. Suykens, J. De Brabanter, L. Lukas, and J. Vandewalle, "Weighted least squares support vector machines: robustness and sparse approximation," Neurocomputing, vol. 48, no. 1-4, pp. 85-105, 2002.

[18] S. Chatterjee, "Vision-based rock-type classification of limestone using multi-class support vector machine," Applied Intelligence, vol. 39, no. 1, pp. 14-27, 2013.

[19] L. Guo, Y. Wu, L. Zhao, T. Cao, W. Yan, and X. Shen, "Classification of mental task from EEG signals using immune feature weighted support vector machines," IEEE Transactions on Magnetics, vol. 47, no. 5, pp. 866-869, 2011.

[20] J. P. Ferreira, M. M. Crisostomo, andA. P. Coimbra, "SVRversus neural-fuzzy network controllers for the sagittal balance of a biped robot," IEEE Transactions on Neural Networks, vol. 20, no. 12, pp. 1885-1897, 2009.

[21] E. E. Elattar, J. Goulermas, and Q. H. Wu, "Electric load forecasting based on locally weighted support vector regression," IEEE Transactions on Systems, Man and Cybernetics Part C: Applications and Reviews, vol. 40, no. 4, pp. 438-447, 2010.

[22] L. Wang, Z. Liu, C. L. P. Chen, Y. Zhang, S. Lee, and X. Chen, "Energy-efficient SVM learning control system for biped walking robots," IEEE Transactions on Neural Networks and Learning Systems, vol. 24, no. 5, pp. 831-837, 2013.

[23] S. Singhal and L. Wu, "Training multilayer perceptrons with the extended Kalman algorithm," in Advances in Neural Information Processing Systems I. Denver 1988, D. S. Touretzky, Ed., pp. 133140, Morgan Kaufmann, San Mateo, Calif, USA, 1989.

[24] G. V. Puskorius and L. A. Feldkamp, "Neurocontrol of nonlinear dynamical systems with Kalman filter trained recurrent networks," IEEE Transactions on Neural Networks, vol. 5, no. 2, pp. 279-297, 1994.

[25] T. Jiang and Y. Li, "Generalized defuzzification strategies and their parameter learning procedures," IEEE Transactions on Fuzzy Systems, vol. 4, no. 1, pp. 64-71,1996.

[26] D. Simon, "Training fuzzy systems with the extended Kalman filter," Fuzzy Sets and Systems, vol. 132, no. 2, pp. 189-199, 2002.

[27] I. Arasaratnam and S. Haykin, "Cubature Kalman filters," IEEE Transactions on Automatic Control, vol. 54, no. 6, pp. 1254-1269, 2009.

[28] I. Arasaratnam, S. Haykin, and T. R. Hurd, "Cubature Kalman filtering for continuous-discrete systems: theory and simulations," IEEE Transactions on Signal Processing, vol. 58, no. 10, pp. 4977-4993, 2010.

[29] W. Li and Y. Jia, "Location of mobile station with maneuvers using an IMM-based cubature Kalman filter," IEEE Transactions on Industrial Electronics, vol. 59, no. 11, pp. 4338-4348, 2012.

[30] I. Arasaratnam and S. Haykin, "Cubature Kalman smoothers," Automatica, vol. 47, no. 10, pp. 2245-2250, 2011.

[31] S. Haykin, A. Zia, Y. Xue, and I. Arasaratnam, "Control theoretic approach to tracking radar: first step towards cognition," Digital Signal Processing: A Review Journal, vol. 21, no. 5, pp. 576-585, 2011.

[32] M. Vukobratovic, B. Borovac, D. Surla, and D. Stokic, Humanoid Locomotion: Dynamic, Stability, Control and Application, Springer, Berlin, Germany, 1990.

Liyang Wang, (1) Ming Chen, (1) Gai Li, (1) and Yongqing Fan (2)

(1) Department of Electronic Engineering, Shunde Polytechnic, Foshan, Guangdong 528300, China

(2) School of Automation, Xi'an University of Posts and Telecommunications, Xi'an 710121, China

Correspondence should be addressed to Liyang Wang; ddd0wwl@sohu.com

Received 28 March 2016; Accepted 29 May 2016

Academic Editor: Mohammad D. Aliyu

Caption: Figure 1: Data-based control for humanoid walking robots using support vector regression, fuzzy logic, and cubature Kalman filter.

Caption: Figure 2: The trapezoidal fuzzy membership function for humanoid walking samples.

Caption: Figure 3: Simplified model of the simulated humanoid robot.

Caption: Figure 4: Learning results for Submodel 1 of the SSP (the step length is 0.16 m). (a) Fuzzy; (b) LS-SVR; (c) the proposed TF-LSSVR.

Caption: Figure 5: Learning results for Submodel 1 of the SSP (the step length is 0.18 m). (a) Fuzzy; (b) LS-SVR; (c) the proposed TF-LSSVR.

Caption: Figure 6: Learning results for Submodel 1 of the SSP (the step length is 0.20 m). (a) Fuzzy; (b) LS-SVR; (c) the proposed TF-LSSVR.

Caption: Figure 7: Learning results for Submodel 2 of the SSP (the step length is 0.16 m). (a) Fuzzy; (b) LS-SVR; (c) the proposed TF-LSSVR.

Caption: Figure 8: Learning results for Submodel 2 of the SSP (the step length is 0.18 m). (a) Fuzzy; (b) LS-SVR; (c) the proposed TF-LSSVR.

Caption: Figure 9: Learning results for Submodel 2 of the SSP (the step length is 0.20 m). (a) Fuzzy; (b) LS-SVR; (c) the proposed TF-LSSVR.

Caption: Figure 10: Learning results for Submodel 1 of the DSP (the step length is 0.16 m). (a) Fuzzy; (b) LS-SVR; (c) the proposed TF-LSSVR.

Caption: Figure 11: Learning results for Submodel 1 of the DSP (the step length is 0.18 m). (a) Fuzzy; (b) LS-SVR; (c) the proposed TF-LSSVR.

Caption: Figure 12: Learning results for Submodel 1 of the DSP (the step length is 0.20 m). (a) Fuzzy; (b) LS-SVR; (c) the proposed TF-LSSVR.

Caption: Figure 13: Learning results for Submodel 2 of the DSP (the step length is 0.16 m). (a) Fuzzy; (b) LS-SVR; (c) the proposed TF-LSSVR.

Caption: Figure 14: Learning results for Submodel 2 of the DSP (the step length is 0.18 m). (a) Fuzzy; (b) LS-SVR; (c) the proposed TF-LSSVR.

Caption: Figure 15: Learning results for Submodel 2 of the DSP (the step length is 0.20 m). (a) Fuzzy; (b) LS-SVR; (c) the proposed TF-LSSVR.

Caption: Figure 16: ZMP trajectory with disturbance in the SSP when the step length is 0.16 m. Fuzzy controller (green); LS-SVR controller (magenta); the proposed TF-LSSVR controller (black).

Caption: Figure 17: ZMP trajectory with disturbance in the SSP when the step length is 0.18 m. Fuzzy controller (green); LS-SVR controller (magenta); the proposed TF-LSSVR controller (black).

Caption: Figure 18: ZMP trajectory with disturbance in the SSP when the step length is 0.20 m. Fuzzy controller (green); LS-SVR controller (magenta); the proposed TF-LSSVR controller (black).

Caption: Figure 19: ZMP trajectory with disturbance in the DSP when the step length is 0.16 m. Fuzzy controller (green); LS-SVR controller (magenta); the proposed TF-LSSVR controller (black).

Caption: Figure 20: ZMP trajectory with disturbance in the DSP when the step length is 0.18 m. Fuzzy controller (green); LS-SVR controller (magenta); the proposed TF-LSSVR controller (black).

Caption: Figure 21: ZMP trajectory with disturbance in the DSP when the step length is 0.20 m. Fuzzy controller (green); LS-SVR controller (magenta); the proposed TF-LSSVR controller (black).

Table 1: Parameters of the humanoid robot. Link Length (m) Mass (kg) Inertia moment (kg x [m.sup.2]) Trunk 0.175 1.060 0.01623 Thigh 0.130 0.075 (*2) 0.00037(*2) Shank 0.130 0.075 (*2) 0.00037(*2) Foot 0.020 0.060(*2) 0.00004(*2) Table 2: Initial radians of the humanoid joint angles. Humanoid joint angles ([[theta].sub.1] [[theta].sub.2] Initial radians when 0.0698 0.2418 step length is 0.16 m Initial radians when 0.0785 0.1819 step length is 0.18 m Initial radians when 0.0873 0.1183 step length is 0.20 m Humanoid joint angles ([[theta].sub.3] [[theta].sub.4] Initial radians when 0.6811 0 step length is 0.16 m Initial radians when 0.7036 0 step length is 0.18 m Initial radians when 0.7210 0 step length is 0.20 m Humanoid joint angles ([[theta].sub.5] [[theta].sub.6] Initial radians when 0 0.3550 step length is 0.16 m Initial radians when 0 0.3550 step length is 0.18 m Initial radians when 0 0.3550 step length is 0.20 m Humanoid joint angles ([[theta].sub.7] Initial radians when 0.5989 step length is 0.16 m Initial radians when 0.5989 step length is 0.18 m Initial radians when 0.5989 step length is 0.20 m Table 3: Comparisons of the ISE performances. [e.sub.sup_hip] [e.sub.sup_ankle] [e.sub.left-knee] Fuzzy 8.90E-02 9.60E-02 9.2e-001 LS-SVR 7.90E-02 5.40E-02 2.8e-001 TF-LSSVR 6.1e-006 7.20E-06 1.00E-04 [e.sub.right_knee] Fuzzy 9.90E-02 LS-SVR 7.30E-02 TF-LSSVR 2.1e-005

Printer friendly Cite/link Email Feedback | |

Title Annotation: | Research Article |
---|---|

Author: | Wang, Liyang; Chen, Ming; Li, Gai; Fan, Yongqing |

Publication: | Mathematical Problems in Engineering |

Article Type: | Report |

Date: | Jan 1, 2016 |

Words: | 6850 |

Previous Article: | A Dynamic Linear Hashing Method for Redundancy Management in Train Ethernet Consist Network. |

Next Article: | The Distribution-Free Newsboy Problem with Multiple Discounts and Upgrades. |

Topics: |