# Techniques and analysis of adaptive least squares problem.

Introduction

Adaptive filtering can be considered as a process in which the parameters used for the processing of signals changes according to some criterion, these criterion are the estimated mean squared error or correlation .

The QR decomposition prevents the inexact solution to the RLS problem and allow easy monitoring of the positive definiteness of a transformed information matrix in ill-condition situation . The Householder transformation is an approach that is numerically robust and produces sparsity in data compared to the Givens rotation approach which is used to zero one element at a time, while the Householder transformation zeros multiple element on one application of the Householder reflection.

The main features that attract  the use of the Least mean square algorithm are its low computational complexity, proof of convergence in stationary environment, unbiased convergence in the mean of the Wiener solution and stable behavior, robust performance against different signal conditions. The price paid due to its simplicity in implementation is its low convergence when the input process is highly coloured . Its convergence speed depends on the eigenvalue spread of the input signal correlation matrix.

This note is organized as follows. In section 2,we formulate the least squares problem as an adaptive least squares problem. The QR-RLS algorithm using the Givens rotations and the Householder transformation are presented in section

3.Section 4 is devoted to the LMS and NLMS algorithms and their convergence analysis. Finally, conclusions are drawn in section 5.

Formulation of the Problem

In the context of adaptive filtering, recursive formulation of the least squares problem that update the filter coefficient after the arrival of every sample inputs are preferred. Suppose that the over-determined system of equation is given by

Ax = b 1

and the least squares problem and is given by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] 2

Where A[member of][[Real part].sup.MxN], x[member of][[Real part].sup.N] and b[member of][[Real part].sup.M], M [greater than or equal to] N, rankA = N

Definition.1

Filtering is a signal processing operations whose objective is to process a signal in order to manipulate the information contained in the signal.

Definition.2

A filter is a device that maps its input signal to another output signal facilitating the extraction of the desired information contained in the input signal.

Consider the filter structure

[FIGURE 1 OMITTED]

Where k is the iteration number, x(k) denote the input signal, y(k) is the adaptive filter output and d (k) defines the desired signal. The error signal can be computed as:

e(k) = d(k) - y(k) 3

Then, the error is used to form the performance function that is required by the adaptation algorithm in order to determine the appropriate updating of the filter coefficients. The minimization of the adaptive filter output signal corresponds in some ways to the desired signal .

y(k) = [N.summation over (i=0)][w.sub.i](k)x(k) 4

(4) Can be rewritten as

y(k) = [W.sup.T](k)X(k) = [X.sup.T](k)W(k) 5

By (3) we have

[xi](k) = [N.summation over (i=0)][rho](k)[e.sup.2](k) 6

at instant k > 0, w(k)is chosen so that the summation in (6) is minimized. If [rho](k) = 1 implies that (6) is equivalent to

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] 7

where [phi](k) = [x.sup.T](k)x(k) and [phi](k) = x(k)d(k)

By [10,chapter 3] taking the gradient of (7) w.r.t. w(k) and equating to zero we obtain

[phi][(k).sup.-.sub.w](k) = [phi](k) 8

which is the normal equation, where [??](k) is the estimate of the filter tap weight vector in the least squares sense.

[??](k) = [[phi].sup.-1](k)[phi](k) 9

is the least squares solution. By substituting (9) into (7) we obtain

[[xi].sub.min](k) = [d.sup.T](k)d(k) - [[phi].sup.T][(k).sup.-.sub.w](k) 10

From (6) and assume that [rho](k) = [[lambda].sup.n-1], such that

[[lambda].sup.n-i][e.sup.2](k) = [n.summation over (i=0)][[lambda].sup.n-i][(d(k) - [w.sup.T](k)x(k)).sup.2] 11

is minimized. By  we can rewrite (11)

as [xi](k) = [n.summation over (i=0)][[lambda].sup.n-i][(d(k) - [w.sup.T](k)x(k)).sup.2] 12

0 [less than or equal to] [lambda] [less than or equal to] 1 is the adaptive least squares problem, where [lambda] is the forgetting factor. Where [lambda] = 1 the adaptive least squares problem are equivalent to the total least squares problem, i.e.,

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] 13

If [lambda] = 0 the performance function becomes very simple and its only depends on the current error square i.e.,

[xi](k) = [(d(k) - [w.sup.T](k)x(k)).sup.2] = [e.sup.T](k)e(k) 14

QR-RLS Algorithm

The QR decomposition is used to solve equation (1) such that we can find the least squares solution [??] by using the factorization A = QR such that

QRx = b [??] [Q.sup.T] QRx = [Q.sup.T]b [??] Rx = [Q.sup.T]b [??] Rx = [sub.c] [??] [sub.c] = [Q.sup.T]b

By (6) and fig.1 we define the following:

X(k) = [[x(k)x(k-1) ... x(k -N)].sup.T] and

W(k) = [[[w.sub.0](k)[w.sub.1](k)...[w.sub.N-1](k)].sup.T] are the input signal vector and adaptive filter tap weight coefficient vector at instant k. [xi](k) is the a posteriori error and [lambda] as defined above. The optimal solution to the LSP at instant k can be found by solving the normal equation.

[[n.summation over (i=0)][[lambda].sup.n-i][R.sub.D](k)]w(k) = [n.summation over (i=0)][[lambda].sup.n-i][P.sub.D](k) 15

However solving (15) by using the convectional RLS algorithm can be a problem when the autocorrelation matrix R and its inverse estimate becomes ill-conditioned due to quantization effect .During the initialization process i.e., from k = 0 to k = N the solution to equation(15) can be solved by using back substitution.

[W.sub.i](k) = [i.summation over (j=1)]x(j)[w.sub.i-j](k) + d(i)/x(0) 16

We observed that after the instant k = N equation (16) becomes invalid and the inverse of [R.sub.D](k) becomes necessary to obtain the optimal solution for the filter tap weight vector (coefficient).

The triangularization approach can be applied to generate QR-RLS algorithm that avoid using the matrix inversion lemma of the conventional RLS algorithm. The weighted a posteriori error vector (14) can be written as a function of the input data matrix, i.e.,

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] 17

By pre multiplying (17) by Q(k)yields

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] 18

Since Q(k) is an orthogonal matrix (14) can be written as

[xi](k) = [[xi].sup.T.sub.q](k)[[xi].sub.q](k) 19

The weighted error square can be minimized in (19) by calculating w(k) such that [[xi].sub.qk-N+1](k) to [[xi].sub.qk+1](k) are made zero using back-substitution algorithm 

[W.sub.i](k) = -[i.summation over (j=1)][U.sub.N+1-i,i+1](k)[w.sub.i-j](k)[d.sub.qk+1-i](k) 20

i= 0,1,.., N. Note that the Givens rotation element can be calculated as

[cos.sub.[theta]i](k) = [[[U'.sub.i](k)].sub.i+1,N+1-i]/[c.sub.i] 21

[sin.sub.[theta]i](k) = [x.sub.i](k - N - i)/[c.sub.i] 22

Where [c.sub.i] = [square root of [[[U'.sub.i](k)].sup.2.sub.i+1,N+1-i] + [x'.sub.i.sup.2](k - N - i)] and [.] is the

element of the matrix. By [4,9.30]and removing the [I.sub.k-N-1] of Q(k) generate a reduced matrix [Q.sub.[theta]](k)

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] 23

[Q.sub.[theta]i](k) is derived from [Q.sub.i](k) by removing the increasing [I.sub.k-N-1] section of [Q.sub.i](k). By [4 9.30] yields

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] 24

Where [[xi].sub.q1](k) is the first element of the rotated error vector of [[xi].sub.q](k) and [d.sub.q2](k) is a vector with the last N + 1 element of the rotated desired signal vector. The transformed weighted error vector can be updated as

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] 25

Where [d.sub.i](k), [[xi].sub.q1](k)and [d.sub.qi](k) are intermediate quantities generated during rotations [[[xi].sub.qN+1](k) = [[[xi].sub.q2](k), ..., [[xi].sub.qk-N](k)].sup.T], [d.sub.N+1](k) = [[xi].sub.q1](k) [d.sub.qN+1](k) = [d.sub.q2](k), the vector [[xi].sub.qN+1](k) ([I.sub.k-N-1]) of Q(k) are always increasing and by eliminating it in(25) such that in practice the updating is performed through

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] 26

From the definition of Q(k) in [4,(9.28) and (9.29)] yield the following

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] 27

Analyzing the Process

* (27) Shows that the a posteriori error can be computed without the coefficient vector.

* The only information needed is the Givens rotation cosines.

* After initialization, the Givens rotation elements (cosine and sine) are computed by equations (21) and (22).

* The rotations are applied to the information matrix and desired signal vector i.e., equations (23) and (26).

* Computation of the error signal is done by using (3)i.e., a prior error.

* As in (9) and (16) the computation of the tap weight coefficient can be done also by (20).

Stability and Convergence

When the QR-RLS algorithm was implemented in finite precision its stability condition was verified to be stable when it was applied to bounded input bounded output(BIBO)[4,10].

* Convergence on average of the QR-RLS algorithm can be guaranteed if the following inequality(ies) are satisfied :

[[lambda].sup.1/2][[parallel][bar.Q][sub.Q](k)[parallel].sub.2] [less than or equal to] 1 28

Where [[parallel].[parallel].sub.2] is the two norm of a matrix used which is the square root of the largest eigenvalue. Therefore

[[parallel][bar.Q][sub.Q](k)[parallel].sub.2] = [Max.sub.i][square root of [cos.sup.2.sub.Q][[theta].sub.i](k) + [sin.sup.2.sub.Q][[theta].sub.i](k)] 29

Where [Max.sub.i][.] is the maximum value of [.].

* The stability condition can be rewritten as

[lambda] [less than or equal to] 1/[Max.sub.i][[cos.sup.2.sub.Q][[theta].sub.i](k) + [sin.sup.2.sub.Q][[theta].sub.i](k)

Keeping the product of the forgetting factor and the maximum eigenvalue of the Givens rotations smaller than unity is a enough condition to guarantee stability .

Least Mean Square (LMS) And Normalized (LMS) Algorithm.

From fig. 1 we have

y(k) = [N.summation over (i=0)][w.sub.0](k)x(k - 1) 31

which is the filter output assumed to be a real valued sequence. The tap weight [w.sub.0](k), [w.sub.1](k), ..., [w.sub.N](k) are selected so that the error term (3)is minimized. The filter tap weight are explicitly function of time index k which implies that in an adaptive filter the tap weight are time varying since they are continuously being adapted so that any variation in the signal statistics could be tracked. The Least mean square algorithm changes the filter tap weight so that the error e(k) is minimized in the mean square sense. When the processes x(k) and d(k) are jointly stationary, this algorithm converges to a set of tap weights which on average are equal to the Wiener Hopf solution.

w(k + 1) = w(k) + 2[mu]e(k) x(k) 32

is the LMS recursion. [mu] is the parameter step size or the convergence factor which is chosen within a range to guarantee convergence. Equations (3), (31) and (32) are the steps required to compute each iterations of the least mean squares algorithm.

Convergence Behavior

The convergence behavior of the LMS algorithm is directly related to the eigenvalue spread of the correlation matrix [4,10].The parameter step size [mu] selected must be small enough to ensure that the algorithm converges to the optimum point, this small step size causes slow convergence. The convergence condition is satisfied by selecting 0 [less than or equal to] [mu] [less than or equal to] 1/[[lambda].sub.max], where [[lambda].sub.max] is the largest eigenvalue of the autocorrelation matrix R of the output signal vectors [1,4,10]. Under the independence assumption the steady state mean square error(MSE) is defined as

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] 33

where [[rho].sub.0] is the power of observation noise [[rho].sub.0] = E[[v.sup.2](k)] and [eta] = tr(R[(I - [mu]R).sup.-1]) The convergence of both error and filter tap weight coefficient, the time required for the convergence depends on the ratio of the eigenvalue of the input signal correlation matrix R. Hence we conclude that the convergence behavior of the LMS algorithm is application dependent.

Normalized Least Mean Square Algorithm (NLMS)

w(k + 1) = w(k) + [[mu]/[x.sup.T](k)x(k) + [tau]]e(k)x(k) 34

Equations (3),(31) and (34) are the complete steps required to compute the NLMS recursion. The convergence factor [mu] and [tau] are positive constant. The addition of [tau] in the conventional LMS (NLMS) is to prevent division by a small value when the squared Euclidean norm [x.sup.T](k)[sub.x](k) is small.

Convergence Analysis

The parameter step size [mu] controls the rate of convergence of the algorithm and its misadjustment. The convergence of this algorithm is governed by the parameter step size [mu] whose value determines a trade off between convergence speed and steady state misadjustment. Generally, a large convergence factor or parameter step-size leads to faster convergence but result in large misadjustment .

Conclusion

The condition number c = [[lambda].sub.max]/[[lambda].sub.min] = [parallel]R[parallel][parallel][R.sup.-1][parallel] influences the convergence of adaptive filtering algorithms. Large value of c implies that the matrix R is ill-conditioned while c = 1 implies that the matrix R is well-conditioned.

The QR-RLS algorithm is useful in applications where the input signal vector does not consist of time delay elements. The drawback of this algorithm (QR-RLS) is the backsubstitution technique used in computing the weight vector(coefficient) see for details. We observed also that the Givens rotations and the Householder transformation performs maximally in different applications irrespective of their stability and numerically robustness. The convergence analysis presented so far shows that the convergence factor play an important role in controlling the performance of these algorithms.

Reference

 Gao Qiang and Zhang Jun:LMS algorithms for noise offset in medium-voltage power line communication, IEEE, pp1-4, 2009.

 Yilun.C, Yuantao. Gu, Alfred. O. Hero 111:sparse LMS for system identification. ICASSP, pp3125-3128, IEEE, 2009.

 Ng. Y. N, H. Mohamad and T. C. Chuah: Block based fuzzy step size NLMS algorithms forsubband adaptive channelequalization.IET signal processing .vol.3,pp 23-32,2009.

 Vijaykumar. V.R, Vanathi.P.T and Kanagasapabathy.P:Modified adaptive filtering algorithm for noise cancellation in speech signals, EEE, ISSN1392 1215, no.2(74),2007.

 Chan. S. C and Yang. X. X: Improved approximate QR-LS algorithms for adaptive filtering, IEEE, transactions on circuit and syatem-11, express brief, vol.51, no.1, jan.2004.

 Steven. J. L: Linear algebra with application, Prentice Hall, Upper Saddle River, NJ, 2002.

 Darald J. Hartfiel: Matrix Theory and Applications with Matlab. CRCpress, New York, 2000.

 Chen. Q. C, T. Bose and G. F. Xu. A direction set based algorithm for adaptive filtering, IEEE, Trans.signal processing, vol.47, pp535-539, no.20, feb.1999.

 Farhang-Borougeny. B: Adaptive filters theory and applications. John Wiley &sons, 1999.

 Diniz. P. S. R and Siqueira. M. G. Fixed pointerror analysis of QR-RLS algorithm. IEEE trans. on circuit and system 11.Analog and digital signal processing, vol.43, pp334-348, may, 1995.

F.Z. Okwonu and Atinah Noor

Department Of Mathematics, Universiti Sains Malaysia, Pinang, Malaysia