
On error estimation in the conjugate gradient method and why it works in finite precision computations.

Abstract. In their paper published in 1952, Hestenes and Stiefel considered the conjugate gradient (CG) method as an iterative method which terminates in at most n steps if no rounding errors are encountered [24, p. 410]. They also proved identities for the A-norm and the Euclidean norm of the error which could justify stopping criteria [24, Theorems 6:1 and 6:3, p. 416]. The idea of estimating errors in iterative methods, and in the CG method in particular, was independently (of these results) promoted by Golub; the problem was linked to Gauss quadrature and to its modifications [7], [8]. A comprehensive summary of this approach was given in [15], [16]. During the last decade several papers developed error bounds algebraically without using Gauss quadrature. However, we have not found any reference to the corresponding results in [24]. All the existing bounds assume exact arithmetic. Still, they seem to be in striking agreement with finite precision numerical experiments, though in finite precision computations they estimate quantities which can be orders of magnitude different from their exact precision counterparts! For the lower bounds obtained from Gauss quadrature formulas this nontrivial phenomenon was explained, with some limitations, in [17].

In our paper we show that the lower bound for the A-norm of the error based on Gauss quadrature ([15], [17], [16]) is mathematically equivalent to the original formula of Hestenes and Stiefel [24]. We compare the existing bounds and demonstrate the necessity of a proper rounding error analysis: we present an example of a well-known bound which can fail in finite precision arithmetic. We analyse the simplest bound, based on [24, Theorem 6:1], and prove that it is numerically stable. Though we concentrate mostly on the lower bound for the A-norm of the error, we also describe an estimate for the Euclidean norm of the error based on [24, Theorem 6:3]. Our results are illustrated by numerical experiments.

Key words. conjugate gradient method, Gauss quadrature, evaluation of convergence, error bounds, finite precision arithmetic, rounding errors, loss of orthogonality.

AMS subject classifications. 15A06, 65F10, 65F25, 65G50.

1. Introduction. Consider a symmetric positive definite matrix $A \in \mathbb{R}^{n \times n}$ and a right-hand side vector $b \in \mathbb{R}^n$ (for simplicity of notation we will assume $A$, $b$ real; the generalization to complex data will be obvious). This paper investigates numerical estimation of errors in iterative methods for solving linear systems

(1.1) Ax = b.

In particular, we focus on the conjugate gradient method (CG) of Hestenes and Stiefel [24] and on the lower estimates of the A-norm (also called the energy norm) of the error, which has important meaning in physics and quantum chemistry, and plays a fundamental role in evaluating convergence [1], [2].

Starting with the initial approximation $x_0$, the conjugate gradient approximations $x_j$ are determined by the condition

(1.2) $\|x - x_j\|_A = \min \left\{ \|x - u\|_A : u \in x_0 + K_j(A, r_0) \right\},$

i.e. they minimize the A-norm of the error

$\|x - x_j\|_A = \left( (x - x_j), A(x - x_j) \right)^{1/2}$

over all methods generating approximations in the manifold $x_0 + K_j(A, r_0)$. Here

$K_j(A, r_0) = \mathrm{span}\{ r_0, A r_0, \ldots, A^{j-1} r_0 \}$

is the j-th Krylov subspace generated by $A$ with the initial residual $r_0$, $r_0 = b - A x_0$, and $x$ is the solution of (1.1). The standard implementation of the CG method was given in [24, (3:1a)-(3:1f)]:

Given $x_0$, $r_0 = b - A x_0$, $p_0 = r_0$, and for $j = 1, 2, \ldots$, let

(1.3)
$\gamma_{j-1} = (r_{j-1}, r_{j-1}) / (p_{j-1}, A p_{j-1}),$
$x_j = x_{j-1} + \gamma_{j-1}\, p_{j-1},$
$r_j = r_{j-1} - \gamma_{j-1}\, A p_{j-1},$
$\delta_j = (r_j, r_j) / (r_{j-1}, r_{j-1}),$
$p_j = r_j + \delta_j\, p_{j-1}.$

The residual vectors $\{r_0, r_1, \ldots, r_{j-1}\}$ form an orthogonal basis and the direction vectors $\{p_0, p_1, \ldots, p_{j-1}\}$ an A-orthogonal basis of the j-th Krylov subspace $K_j(A, r_0)$.
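The recurrence (1.3) translates into code line by line. The following sketch (in Python with NumPy; the function name cg_hs and its arguments are our illustrative choices, not taken from [24]) is a minimal transcription:

import numpy as np

def cg_hs(A, b, x0, maxit):
    # Conjugate gradient method in the Hestenes-Stiefel form (1.3).
    x = x0.copy()
    r = b - A @ x                # r_0 = b - A x_0
    p = r.copy()                 # p_0 = r_0
    rr = r @ r
    for _ in range(maxit):
        Ap = A @ p
        gamma = rr / (p @ Ap)    # gamma_{j-1} = (r_{j-1}, r_{j-1}) / (p_{j-1}, A p_{j-1})
        x = x + gamma * p        # x_j = x_{j-1} + gamma_{j-1} p_{j-1}
        r = r - gamma * Ap       # r_j = r_{j-1} - gamma_{j-1} A p_{j-1}
        rr_new = r @ r
        p = r + (rr_new / rr) * p    # p_j = r_j + delta_j p_{j-1}
        rr = rr_new
    return x

Run in ordinary double precision, this implementation exhibits exactly the finite precision effects analysed later in this paper.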

In [24] Hestenes and Stiefel considered CG as an iterative procedure. They presented relations [24, (6:1)-(6:3) and (6:5), Theorems 6:1 and 6:3] as justifications of a possible stopping criterion for the algorithm. In our notation these relations become

(1.4) $\|x - x_{j-1}\|_A^2 - \|x - x_j\|_A^2 = \gamma_{j-1}\, \|r_{j-1}\|^2,$

(1.5) $\|x - x_j\|_A^2 - \|x - x_k\|_A^2 = \sum_{i=j}^{k-1} \gamma_i\, \|r_i\|^2, \qquad 0 \le j < k \le n,$

(1.6) $x_j = x_0 + \sum_{l=0}^{j-1} \gamma_l\, p_l = x_0 + \sum_{l=0}^{j-1} \frac{\|x - x_l\|_A^2 - \|x - x_j\|_A^2}{\|r_l\|^2}\; r_l,$

(1.7) $\|x - x_{j-1}\|^2 - \|x - x_j\|^2 = \frac{\|x - x_{j-1}\|_A^2 + \|x - x_j\|_A^2}{\mu(p_{j-1})}, \qquad \mu(p_{j-1}) = \frac{(p_{j-1}, A p_{j-1})}{\|p_{j-1}\|^2}.$

Please note that (1.5) represents an identity describing the decrease of the A-norm of the error in terms of quantities available in the algorithm, while (1.7) describes the decrease of the Euclidean norm of the error in terms of the A-norm of the error in the given steps.

Hestenes and Stiefel did not give any particular stopping criterion. They emphasized, however, that while the A-norm of the error and the Euclidean norm of the error had to decrease monotonically at each step, the residual norm oscillated and might even increase in each but the last step. An example of this behaviour was used in [23].

The paper [24] is frequently referenced, but some of its results have received little attention. Residual norms have been (and still are) commonly used for evaluating convergence of CG. The possibility of using (1.4)-(1.7) to construct a stopping criterion has not been, to our knowledge, considered.

An interest in estimating error norms in the CG method reappeared with works of Golub and his collaborators. Using some older results [7], Dahlquist, Golub and Nash [8] related error bounds to Gauss quadrature (and to its modifications). The approach presented in that paper became a basis for later developments. It is interesting to note that the relationship of the CG method to the Riemann-Stieltjes integral and Gauss quadrature was described in detail in [24, Section 14], but without any link to error estimation. The work of Golub and his collaborators was independent of [24].

The paper [8] also brought attention to the important issue of rounding errors. The authors noted that in order to guarantee the numerical stability of the computed Gauss quadrature nodes and weights, the computed basis vectors had to be reorthogonalized. That means that the authors of that paper were from the very beginning aware of the fact that rounding errors might play a significant role in the application of their bounds to practical computations. In the numerical experiments used in [8] the effects of rounding errors were, however, not noticeable. This can be explained using the results of Paige ([33], [34], [35] and [36]). Due to the distribution of the eigenvalues of the matrix used in [8], Ritz values do not converge to the eigenvalues until the last few steps. Before this convergence takes place there is no significant loss of orthogonality, and the effects of rounding errors are not visible.

Error bounds in iterative methods were intensively studied or used in many later papers and in several books, see, e.g. [9], [10], [12], [15], [17], [16], [11], [21], [28], [29], [30], [4], [6]. Except for [17], effects of rounding errors were not analysed in these publications.

Frommer and Weinberg [13] pointed out the problem of applying exact precision formulas to finite precision computations, and proposed to use interval arithmetic for computing verified error bounds. As stated in [13, p. 201], this approach has serious practical limitations. Axelsson and Kaporin [3] considered preconditioned conjugate gradients and presented (1.5) independently of [24]. Their derivation used (global) mutual A-orthogonality among the direction vectors $p_j$, $j = 0, 1, \ldots, n - 1$. They noticed that the numerical values found from the resulting estimate were identical to those obtained from Gauss quadrature, but did not prove this coincidence. They also noticed the potential difficulty due to rounding errors, and presented an observation that loss of orthogonality did not destroy the applicability of their estimate. Calvetti et al. [5] presented several bounds and estimates for the A-norm of the error, and addressed the problem of cancellation in their computations [5, relation (46)].

In our paper we briefly recall some error estimates published after (and independently of) (1.4)-(1.7) in [24]. For simplicity of exposition we will concentrate mostly on the A-norm of the error $\|x - x_j\|_A$. We will show that the simplest possible estimate for $\|x - x_j\|_A$, which follows from the relation (1.4) published in the original paper [24], is mathematically (in exact arithmetic) equivalent to the corresponding bounds developed later. In finite precision arithmetic, rounding errors in the whole computation, not only in the computation of the convergence bounds, must be taken into account. We emphasize that the rounding error analysis of the formulas for computing the convergence bounds represents in almost all cases a simple and unimportant part of the problem. Almost all published convergence bounds (including those given in [5]) can be computed accurately (i.e. the computation of the bounds from the given formulas is not significantly affected by rounding errors). But this does not prove that these bounds give anything reasonable when they are applied to finite precision CG computations. We will see in Section 6 an example of an accurately computed bound which gives no useful information about the convergence of CG in finite precision arithmetic.

An example of rounding error analysis for the bounds based on Gauss quadrature was presented in [17]. The results from [17] rely on the work of Paige and Greenbaum ([36], [19] and [22]). Though [17] gives a strong qualitative justification of the bounds in finite precision arithmetic, this justification is applicable only until $\|x - x_j\|_A$ reaches the square root of the machine precision. Moreover, the quantitative expressions for the rounding error terms are very complicated. They contain factors which are not tightly estimated (see [19], [22]). Here we complement the analysis from [17] by substantially stronger results. We prove that the simplest possible lower bound for $\|x - x_j\|_A$ based on (1.4) works for numerically computed quantities until $\|x - x_j\|_A$ reaches its ultimate attainable accuracy.

The paper is organized as follows. In Section 2 we briefly describe the relations between the CG and Lanczos methods. Using the orthogonality of the residuals, these algorithms are related to sequences of orthogonal polynomials, where the inner product is defined by a Riemann-Stieltjes integral with some particular distribution function $\omega(\lambda)$. The value of the j-th Gauss quadrature approximation to this Riemann-Stieltjes integral for the function $1/\lambda$ is the complement to the error in the j-th iteration of the CG method measured by $\|x - x_j\|_A^2 / \|r_0\|^2$. In Section 3 we reformulate the result of the Gauss quadrature using quantities that are at our disposal during the CG iterations. In Section 4 we use the identities from Section 3 for estimation of the A-norm of the error in the CG method, and we compare the main existing bounds. Section 5 describes the delay of convergence due to rounding errors. Section 6 explains why applying exact precision convergence estimates to finite precision CG computations represents a serious problem which must be properly addressed. Though exact precision CG and finite precision CG can differ dramatically, some exact precision bounds seem to be in good agreement with the finite precision computations. Sections 7-10 explain this paradox. The individual terms in the identities on which the convergence estimates are based can be strongly affected by rounding errors. The identities as a whole, however, hold true (with small perturbations) also in finite precision arithmetic. Numerical experiments are presented in Section 11.

When helpful, we will use the word "ideally" (or "mathematically") to refer to a result that would hold using exact arithmetic, and "computationally" or "numerically" for a result of a finite precision computation.

2. Method of conjugate gradients and Gauss quadrature. For $A$ and $r_0$ the Lanczos method [27] ideally generates a sequence of orthonormal vectors $v_1, v_2, \ldots$ via the recurrence

Given $v_1 = r_0 / \|r_0\|$, $\beta_1 \equiv 0$, and for $j = 1, 2, \ldots$, let

(2.1)
$\alpha_j = (A v_j - \beta_j v_{j-1},\, v_j),$
$w_j = A v_j - \alpha_j v_j - \beta_j v_{j-1},$
$\beta_{j+1} = \|w_j\|,$
$v_{j+1} = w_j / \beta_{j+1}.$

Denoting by $V_j = [v_1, \ldots, v_j]$ the $n$ by $j$ matrix having the Lanczos vectors $v_1, \ldots, v_j$ as its columns, and by $T_j$ the symmetric tridiagonal matrix with positive subdiagonal,

(2.2) $T_j = \begin{pmatrix} \alpha_1 & \beta_2 & & \\ \beta_2 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \beta_j \\ & & \beta_j & \alpha_j \end{pmatrix},$

the formulas (2.1) are written in the matrix form

(2.3) $A V_j = V_j T_j + \beta_{j+1}\, v_{j+1}\, e_j^T,$

where $e_j$ is the j-th column of the identity matrix of appropriate dimension. Comparing (1.3) with (2.1) gives

(2.4) $v_{j+1} = (-1)^j\, \frac{r_j}{\|r_j\|},$

and also relations between the recurrence coefficients:

(2.5) $\alpha_j = \frac{1}{\gamma_{j-1}} + \frac{\delta_{j-1}}{\gamma_{j-2}}, \qquad \delta_0 \equiv 0, \quad \gamma_{-1} \equiv 1,$

$\beta_{j+1} = \frac{\sqrt{\delta_j}}{\gamma_{j-1}}.$

Finally, using the change of variables

(2.6) $x_j = x_0 + V_j\, y_j,$

and the orthogonality of $r_j$ to the basis $\{v_1, v_2, \ldots, v_j\}$ of $K_j(A, r_0)$, we see that

$0 = V_j^T r_j = V_j^T (b - A x_j) = V_j^T (r_0 - A V_j y_j) = e_1 \|r_0\| - V_j^T A V_j\, y_j = e_1 \|r_0\| - T_j\, y_j.$

Ideally, the CG approximate solution $x_j$ can therefore be determined by solving

(2.7) $T_j\, y_j = e_1 \|r_0\|$

and subsequently using (2.6).
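The relation (2.5) allows one to recover $T_j$ directly from the CG coefficients. A sketch (our own illustration; the name tridiag_from_cg and the convention that gammas holds $\gamma_0, \ldots, \gamma_{j-1}$ and deltas holds $\delta_0 \equiv 0, \delta_1, \ldots, \delta_{j-1}$ from a CG run are our assumptions):

import numpy as np

def tridiag_from_cg(gammas, deltas):
    # Build T_j from the CG coefficients via (2.5):
    #   alpha_j    = 1/gamma_{j-1} + delta_{j-1}/gamma_{j-2}   (delta_0 = 0, gamma_{-1} = 1),
    #   beta_{j+1} = sqrt(delta_j) / gamma_{j-1}.
    j = len(gammas)
    alpha = np.array([1.0 / gammas[i] + (deltas[i] / gammas[i - 1] if i > 0 else 0.0)
                      for i in range(j)])
    beta = np.array([np.sqrt(deltas[i + 1]) / gammas[i] for i in range(j - 1)])
    return np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

# Usage sketch: solving (2.7) for the coordinates y_j of x_j - x_0 in V_j,
#   T = tridiag_from_cg(gammas, deltas)
#   e1 = np.zeros(len(gammas)); e1[0] = 1.0
#   y = np.linalg.solve(T, np.linalg.norm(r0) * e1)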

Orthogonality of the CG residuals creates the elegance of the CG method, which is represented by its link to the world of classical orthogonal polynomials. Using (1.3), the j-th error resp. residual can be written as a polynomial in the matrix $A$ applied to the initial error resp. residual,

(2.8) $x - x_j = \varphi_j(A)\, (x - x_0), \qquad r_j = \varphi_j(A)\, r_0, \qquad \varphi_j \in \Pi_j,$

where $\Pi_j$ denotes the class of polynomials of degree at most $j$ having the property $\varphi(0) = 1$ (that is, the constant term equal to one). Consider the eigendecomposition of the symmetric matrix $A$ in the form

(2.9) $A = U \Lambda U^T, \qquad U U^T = U^T U = I,$

where $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ and $U = [u_1, \ldots, u_n]$ is the matrix having the normalized eigenvectors of $A$ as its columns. Substituting (2.9) and (2.8) into (1.2) gives

(2.10) $\|x - x_j\|_A^2 = \|r_0\|^2 \sum_{i=1}^n \omega_i\, \frac{[\varphi_j(\lambda_i)]^2}{\lambda_i}, \qquad \omega_i = (v_1, u_i)^2, \quad v_1 = r_0 / \|r_0\|.$

Consequently, for $A$ symmetric positive definite the rate of convergence of CG is determined by the distribution of the eigenvalues of $A$ and by the size of the components of $r_0$ in the direction of the individual eigenvectors.

Similarly to (2.8), $v_{j+1}$ is linked with a monic polynomial $\psi_j$,

(2.11) $v_{j+1} = \frac{1}{\beta_2 \beta_3 \cdots \beta_{j+1}}\, \psi_j(A)\, v_1.$

Using the orthogonality of $v_{j+1}$ to $v_1, \ldots, v_j$, the polynomial $\psi_j$ is determined by the minimizing condition

(2.12) $\|\psi_j(A)\, v_1\| = \min_{\psi \in M_j} \|\psi(A)\, v_1\|,$

where $M_j$ denotes the class of monic polynomials of degree $j$.

We will explain what we consider the essence of the CG and Lanczos methods.

Whenever the CG or the Lanczos method (defined by (1.3) resp. by (2.1)) is considered, there is a sequence $1, \psi_1, \psi_2, \ldots$ of monic orthogonal polynomials determined by (2.12). These polynomials are orthogonal with respect to the discrete inner product

(2.13) $(f, g) = \sum_{i=1}^n \omega_i\, f(\lambda_i)\, g(\lambda_i),$

where the weights $\omega_i$ are determined as

(2.14) $\omega_i = (v_1, u_i)^2, \qquad \sum_{i=1}^n \omega_i = 1$

($v_1 = r_0 / \|r_0\|$). For simplicity of notation we assume that all the eigenvalues of $A$ are distinct and increasingly ordered (an extension to the case of multiple eigenvalues will be obvious). Let $\zeta$, $\xi$ be such that $\zeta \le \lambda_1 < \lambda_2 < \cdots < \lambda_n \le \xi$. Consider the distribution function $\omega(\lambda)$ with the finite points of increase $\lambda_1, \lambda_2, \ldots, \lambda_n$,

(2.15)
$\omega(\lambda) = 0$ for $\lambda < \lambda_1,$
$\omega(\lambda) = \sum_{l=1}^i \omega_l$ for $\lambda_i \le \lambda < \lambda_{i+1},$
$\omega(\lambda) = 1$ for $\lambda_n \le \lambda,$

see Fig. 2.1, and the corresponding Riemann-Stieltjes integral

(2.16) $\int_\zeta^\xi f(\lambda)\, d\omega(\lambda) = \sum_{i=1}^n \omega_i\, f(\lambda_i).$

Then (2.12) can be rewritten as

(2.17) $\left( \int_\zeta^\xi \psi_j^2(\lambda)\, d\omega(\lambda) \right)^{1/2} = \min_{\psi \in M_j} \left( \int_\zeta^\xi \psi^2(\lambda)\, d\omega(\lambda) \right)^{1/2}.$

The $j$ steps of the CG resp. the Lanczos method starting with $\|r_0\|\, v_1$ resp. $v_1$ determine a symmetric tridiagonal matrix $T_j$ (2.2) (with a positive subdiagonal). Consider, analogously to (2.9), the eigendecomposition of $T_j$ in the form

(2.18) $T_j = S_j\, \Theta_j\, S_j^T, \qquad S_j^T S_j = S_j S_j^T = I,$

$\Theta_j = \mathrm{diag}(\theta_1^{(j)}, \ldots, \theta_j^{(j)})$, $S_j = [s_1^{(j)}, \ldots, s_j^{(j)}]$. Please note that we can look at $T_j$ also as determined by the CG or the Lanczos method applied to the $j$-dimensional problem $T_j y_j = e_1 \|r_0\|$ resp. to $T_j$, with the initial residual $e_1 \|r_0\|$ resp. the starting vector $e_1$. Clearly, we can construct a Riemann-Stieltjes integral for this $j$-dimensional problem similarly as above. Let $\zeta \le \theta_1^{(j)} < \theta_2^{(j)} < \cdots < \theta_j^{(j)} \le \xi$ be the eigenvalues of $T_j$ (Ritz values; they must be distinct, see, e.g. [38, Chapter 7]). Let


(2.19) $\omega_i^{(j)} = \left( e_1, s_i^{(j)} \right)^2, \qquad \sum_{i=1}^j \omega_i^{(j)} = 1,$

be the weights determined by the squared size of the components of $e_1$ in the direction of $T_j$'s eigenvectors, and

$\omega^{(j)}(\lambda) = 0$ for $\lambda < \theta_1^{(j)},$
$\omega^{(j)}(\lambda) = \sum_{l=1}^i \omega_l^{(j)}$ for $\theta_i^{(j)} \le \lambda < \theta_{i+1}^{(j)},$
$\omega^{(j)}(\lambda) = 1$ for $\theta_j^{(j)} \le \lambda.$

Then the first $j$ polynomials from the set $\{1, \psi_1, \ldots, \psi_n\}$ determined by (2.17) are also determined by the condition based on the Riemann-Stieltjes integral with the distribution function $\omega^{(j)}(\lambda)$,

(2.20) $\left( \int_\zeta^\xi \psi_i^2(\lambda)\, d\omega^{(j)}(\lambda) \right)^{1/2} = \min_{\psi \in M_i} \left( \int_\zeta^\xi \psi^2(\lambda)\, d\omega^{(j)}(\lambda) \right)^{1/2}, \qquad i = 0, 1, \ldots, j - 1$

(we can look at the subsequence $\{1, \psi_1, \ldots, \psi_j\}$ as determined by the CG or the Lanczos method applied to the $j$-dimensional problem described above). The integral

(2.21) $\int_\zeta^\xi f(\lambda)\, d\omega^{(j)}(\lambda) = \sum_{i=1}^j \omega_i^{(j)}\, f(\theta_i^{(j)})$

is the well-known $j$-th Gauss quadrature approximation of the integral (2.16), see, e.g., [14]. Thus, the CG and Lanczos methods determine a sequence of distribution functions $\omega^{(1)}(\lambda), \omega^{(2)}(\lambda), \ldots, \omega^{(j)}(\lambda), \ldots$ approximating in an optimal way (in the sense of Gauss quadrature, i.e. $\omega^{(l)}(\lambda)$ ensures that for any polynomial of degree less than or equal to $2l - 1$ the value of the original integral (2.16) is given by (2.21) exactly) the original distribution function $\omega(\lambda)$, cf. [26], [46, Chapter XV], [45].

All this is well known. Gauss quadrature represents classical textbook material, and the connection of CG to Gauss quadrature was pointed out in the original paper [24]. This connection is, however, a key to understanding both the mathematical properties and the finite precision behaviour of the CG method.

Given $A$ and $r_0$, (2.16) and its Gauss quadrature approximations (2.21) are for $j = 1, 2, \ldots, n$ uniquely determined (remember we assumed that the eigenvalues of $A$ are positive and distinct). Conversely, the distribution function $\omega^{(j)}(\lambda)$ uniquely determines the symmetric tridiagonal matrix $T_j$, and, through (2.7) and (2.6), the CG approximation $x_j$. With $f(\lambda) = \lambda^{-1}$ we have from (2.10) (taking $j = 0$, $\varphi_0 \equiv 1$)

(2.22) $\|x - x_0\|_A^2 = \|r_0\|^2 \sum_{i=1}^n \frac{\omega_i}{\lambda_i} = \|r_0\|^2 \int_\zeta^\xi \lambda^{-1}\, d\omega(\lambda),$

and, using (2.3) with $j = n$,

$\|x - x_0\|_A^2 = (r_0, A^{-1} r_0) = \|r_0\|^2\, (e_1, T_n^{-1} e_1) \equiv \|r_0\|^2\, (T_n^{-1})_{11}.$

Consequently,

(2.23) $\int_\zeta^\xi \lambda^{-1}\, d\omega(\lambda) = (T_n^{-1})_{11}.$

Repeating the same considerations using the CG method for $T_j$ with the initial residual $\|r_0\|\, e_1$, or the Lanczos method for $T_j$ with $e_1$, gives

(2.24) $\int_\zeta^\xi \lambda^{-1}\, d\omega^{(j)}(\lambda) = (T_j^{-1})_{11}.$

Finally, applying the $j$-point Gauss quadrature to (2.16) gives

(2.25) $\int_\zeta^\xi f(\lambda)\, d\omega(\lambda) = \int_\zeta^\xi f(\lambda)\, d\omega^{(j)}(\lambda) + R_j(f),$

where $R_j(f)$ stands for the (truncation) error in the Gauss quadrature. In the next section we present several different ways of expressing (2.25) with $f(\lambda) = \lambda^{-1}$.
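The chain (2.22)-(2.25) can be checked numerically on a small example. The following sketch (our own illustration; all names are ours, and for this small, well-conditioned problem the rounding errors are assumed to be negligible) runs the Lanczos recurrence (2.1), forms the weights (2.14), and verifies that the integral (2.16) for $f(\lambda) = \lambda^{-1}$ equals the $j$-point Gauss quadrature value $(T_j^{-1})_{11}$ plus the scaled error term $\|x - x_j\|_A^2 / \|r_0\|^2$:

import numpy as np

rng = np.random.default_rng(0)
n, j = 12, 5
lam = np.linspace(1.0, 10.0, n)                    # distinct positive eigenvalues
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(lam) @ Q.T                         # A = U Lambda U^T as in (2.9)
b = rng.standard_normal(n)
x0 = np.zeros(n)
x = np.linalg.solve(A, b)
r0 = b - A @ x0
v1 = r0 / np.linalg.norm(r0)

omega = (Q.T @ v1) ** 2                            # weights (2.14): omega_i = (v_1, u_i)^2
integral = np.sum(omega / lam)                     # (2.16) with f(lambda) = 1/lambda

# Lanczos recurrence (2.1), building V_j and T_j
V = np.zeros((n, j))
alpha, beta = np.zeros(j), np.zeros(j - 1)
V[:, 0] = v1
w = A @ v1
alpha[0] = v1 @ w
w = w - alpha[0] * v1
for k in range(1, j):
    beta[k - 1] = np.linalg.norm(w)
    V[:, k] = w / beta[k - 1]
    w = A @ V[:, k] - beta[k - 1] * V[:, k - 1]
    alpha[k] = V[:, k] @ w
    w = w - alpha[k] * V[:, k]
T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

e1 = np.zeros(j); e1[0] = 1.0
gauss = np.linalg.solve(T, e1)[0]                  # (2.24): j-point Gauss rule for 1/lambda
y = np.linalg.solve(T, np.linalg.norm(r0) * e1)    # (2.7)
xj = x0 + V @ y                                    # (2.6)
errj = (x - xj) @ (A @ (x - xj))                   # ||x - x_j||_A^2
print(integral, gauss + errj / (r0 @ r0))          # the two values agree, cf. (2.25), (3.2)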

3. Basic Identities. Multiplying the identity (2.25) by $\|r_0\|^2$ gives

(3.1) $\|r_0\|^2 \int_\zeta^\xi f(\lambda)\, d\omega(\lambda) = \|r_0\|^2 \int_\zeta^\xi f(\lambda)\, d\omega^{(j)}(\lambda) + \|r_0\|^2\, R_j(f).$

Using (2.22), (2.23) and (2.24), (3.1) can for $f(\lambda) = \lambda^{-1}$ be written as

$\|x - x_0\|_A^2 = \|r_0\|^2\, (T_n^{-1})_{11} = \|r_0\|^2\, (T_j^{-1})_{11} + \|r_0\|^2\, R_j(\lambda^{-1}).$

In [17, pp. 253-254] it was proved that for $f(\lambda) = \lambda^{-1}$ the truncation error in the Gauss quadrature is equal to

$R_j(\lambda^{-1}) = \frac{\|x - x_j\|_A^2}{\|r_0\|^2},$

which gives

(3.2) $\|x - x_0\|_A^2 = \|r_0\|^2\, (T_j^{-1})_{11} + \|x - x_j\|_A^2.$

Summarizing, the value of the $j$-th Gauss quadrature approximation to the integral (2.23) is the complement of the error in the $j$-th CG iteration measured by $\|x - x_j\|_A^2 / \|r_0\|^2$,

(3.3) $\frac{\|x - x_0\|_A^2}{\|r_0\|^2} = \text{$j$-point Gauss quadrature} + \frac{\|x - x_j\|_A^2}{\|r_0\|^2}.$

This relation was developed in [8] in the context of moments; it was a subject of extensive work motivated by estimation of the error norms in CG in the papers [12], [15] and [17]. Work in this direction continued and led to the papers [16], [28], [30], [5].

An interesting form of (3.2) was noticed by Warnick in [47]. In the papers mentioned above the values of $(T_n^{-1})_{11}$ and $(T_j^{-1})_{11}$ were approximated from the actual Gauss quadrature calculations (or from the related recurrence relations). Using (2.7) and (2.6), the identities

$\|r_0\|^2\, (T_j^{-1})_{11} = \|r_0\|\, \left( e_1, T_j^{-1} e_1 \|r_0\| \right) = \|r_0\|\, (e_1, y_j) = \|r_0\|\, (v_1, V_j y_j) = r_0^T (x_j - x_0)$

show that $\|r_0\|^2\, (T_j^{-1})_{11}$ is given by a simple inner product. Indeed,

(3.4) $\|x - x_0\|_A^2 = r_0^T (x_j - x_0) + \|x - x_j\|_A^2.$

This remarkable identity was pointed out to us by Saylor [41], [40]. Please note that the derivation of the identity (3.4) from the Gauss quadrature-based (3.2) uses the orthogonality relation $v_1^T V_j = e_1^T$. In finite precision computations this orthogonality relation does not hold. Consequently, (3.4) does not hold in finite precision arithmetic. We will return to this point in Section 6.

A mathematically equivalent identity can be derived by simple algebraic manipulations without using Gauss quadrature. Since $A(x - x_0) = r_0$ and $A(x - x_j) = r_j$,

$\|x - x_0\|_A^2 - \|x - x_j\|_A^2 = \left( (x - x_0) + (x - x_j) \right)^T A\, (x_j - x_0) = r_j^T (x_j - x_0) + r_0^T (x_j - x_0),$

i.e.,

(3.5) $\|x - x_0\|_A^2 = r_j^T (x_j - x_0) + r_0^T (x_j - x_0) + \|x - x_j\|_A^2.$

The right-hand side of (3.5) contains, in comparison with (3.4), the additional term $r_j^T (x_j - x_0)$. This term is in exact arithmetic equal to zero, but it has an important correction effect in finite precision computations (see Section 6). Relations (3.2), (3.4) and (3.5) represent various mathematically equivalent forms of (3.1). While in (3.2) the j-point Gauss quadrature is evaluated as $\|r_0\|^2\, (T_j^{-1})_{11}$, in (3.4) and (3.5) this quantity is computed using inner products of the vectors that are at our disposal during the iteration process. But, as mentioned in the Introduction, there is a much simpler identity, (1.5), mathematically equivalent to (3.1). It is very surprising that, though (1.5) is present in the Hestenes and Stiefel paper [24, Theorem 6:1, relation (6:2), p. 416], this identity has (at least to our knowledge) never been related to Gauss quadrature. Its derivation is very simple. Using (1.3),


(3.6) $\|x - x_i\|_A^2 - \|x - x_{i+1}\|_A^2 = 2 \gamma_i\, p_i^T A (x - x_{i+1}) + \gamma_i^2\, (p_i, A p_i) = \gamma_i\, \|r_i\|^2,$

since $p_i^T A (x - x_{i+1}) = p_i^T r_{i+1} = 0$ and $\gamma_i\, (p_i, A p_i) = \|r_i\|^2$. Consequently, for $0 \le l < j \le n$,

(3.7) $\|x - x_l\|_A^2 - \|x - x_j\|_A^2 = \sum_{i=l}^{j-1} \left( \|x - x_i\|_A^2 - \|x - x_{i+1}\|_A^2 \right) = \sum_{i=l}^{j-1} \gamma_i\, \|r_i\|^2,$

and (3.1) can be written in the form

(3.8) $\|x - x_0\|_A^2 = \sum_{i=0}^{j-1} \gamma_i\, \|r_i\|^2 + \|x - x_j\|_A^2.$

The numbers $\gamma_i\, \|r_i\|^2$ are trivially computable; both $\gamma_i$ and $\|r_i\|^2$ are available at every iteration step. Please note that in the derivation of (3.7) we used only the local orthogonality among consecutive residuals and direction vectors. We avoided using mutual orthogonality among vectors with generally different indices. This fact will be very important in the rounding error analysis of the finite precision counterparts of (3.7) in Sections 7-10.

4. Estimating the A-norm of the error. Using $\|x - x_0\|_A^2 = \|r_0\|^2\, (T_n^{-1})_{11}$, the identity (3.2) can be written in the form

$\|x - x_j\|_A^2 = \|r_0\|^2 \left[ (T_n^{-1})_{11} - (T_j^{-1})_{11} \right].$

As suggested in [17, pp. 28-29], the unknown value $(T_n^{-1})_{11}$ can be replaced, at the price of $m - j$ extra steps, by a computable value $(T_m^{-1})_{11}$ for some $m > j$. The paper [17], however, did not properly use this idea and did not give a proper formula for computing the difference $(T_m^{-1})_{11} - (T_j^{-1})_{11}$ without cancellation, which limited the applicability of the proposed result. Golub and Meurant cleverly resolved this trouble in [16] and proposed an algorithm, called CGQL, for estimating the A-norm of the error in the CG method. This section will briefly summarize several important estimates.

Consider, in general, (3.1) for $j$ and $j + d$, where $d$ is some positive integer. The idea is simply to eliminate the unknown term $\int_\zeta^\xi f(\lambda)\, d\omega(\lambda)$ by subtracting the identities for $j$ and $j + d$, which results in

$\|r_0\|^2\, R_j(f) = \|r_0\|^2 \left( \int_\zeta^\xi f(\lambda)\, d\omega^{(j+d)}(\lambda) - \int_\zeta^\xi f(\lambda)\, d\omega^{(j)}(\lambda) \right) + \|r_0\|^2\, R_{j+d}(f).$

In particular, using (3.2), (3.4), (3.5), and (3.8) we obtain the mathematically equivalent identities

(4.1) $\|x - x_j\|_A^2 = \|r_0\|^2 \left[ (T_{j+d}^{-1})_{11} - (T_j^{-1})_{11} \right] + \|x - x_{j+d}\|_A^2,$

(4.2) $\|x - x_j\|_A^2 = r_0^T (x_{j+d} - x_j) + \|x - x_{j+d}\|_A^2,$

(4.3) $\|x - x_j\|_A^2 = r_0^T (x_{j+d} - x_j) - r_j^T (x_j - x_0) + r_{j+d}^T (x_{j+d} - x_0) + \|x - x_{j+d}\|_A^2,$

(4.4) $\|x - x_j\|_A^2 = \sum_{i=j}^{j+d-1} \gamma_i\, \|r_i\|^2 + \|x - x_{j+d}\|_A^2.$

Now recall that the A-norm of the error is strictly decreasing in the CG method. If $d$ is chosen such that

(4.5) $\|x - x_j\|_A^2 \gg \|x - x_{j+d}\|_A^2,$

then neglecting $\|x - x_{j+d}\|_A^2$ on the right-hand sides of (4.1), (4.2), (4.3) and (4.4) gives lower bounds (all mathematically equal) for the squared A-norm of the error in the $j$-th step. Under the assumption (4.5) these bounds are reasonably tight (their inaccuracy is given by $\|x - x_{j+d}\|_A^2$). We denote them

(4.6) $\eta_{j,d} = \|r_0\|^2 \left[ (T_{j+d}^{-1})_{11} - (T_j^{-1})_{11} \right],$

where the difference $(T_{j+d}^{-1})_{11} - (T_j^{-1})_{11}$ is computed by the algorithm CGQL from [16],

(4.7) $\mu_{j,d} = r_0^T (x_{j+d} - x_j),$

which refers to the original bound due to Warnick,

(4.8) $\upsilon_{j,d} = r_0^T (x_{j+d} - x_j) - r_j^T (x_j - x_0) + r_{j+d}^T (x_{j+d} - x_0),$

which is the previous bound modified by the correction terms, and

(4.9) $\nu_{j,d} = \sum_{i=j}^{j+d-1} \gamma_i\, \|r_i\|^2.$

Clearly, the last bound, which is a direct consequence of [24, Theorem 6:1], see (1.5), is much simpler than the others.

Mathematically (in exact arithmetic)

(4.10) $\eta_{j,d} = \mu_{j,d} = \upsilon_{j,d} = \nu_{j,d}.$

In finite precision computations (4.10) does not hold in general, and the different bounds may give substantially different results. Does any of the identities (4.1)-(4.4) have any relevance for the quantities computed in finite precision arithmetic? The work described in this section and the papers published on this subject would be of little practical use without answering this question.
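A minimal sketch of how the simplest bound is used in practice (our own illustration; the windowing scheme and all names are ours): during the CG iteration one accumulates the last $d$ values of $\gamma_i \|r_i\|^2$ and, at step $j + d$, reports $\nu_{j,d}^{1/2}$ as a lower estimate of $\|x - x_j\|_A$; the estimate is thus always available with a delay of $d$ steps.

import numpy as np
from collections import deque

def cg_with_estimate(A, b, x0, d=4, maxit=200):
    # CG (1.3) with the lower estimate (4.9):
    #   nu_{j,d} = sum_{i=j}^{j+d-1} gamma_i ||r_i||^2  <=  ||x - x_j||_A^2,
    # cf. (4.4) and (4.5); the estimate of step j becomes available d steps later.
    x = x0.copy()
    r = b - A @ x
    p = r.copy()
    rr = r @ r
    window = deque()             # the last d values of gamma_i ||r_i||^2
    estimates = []               # estimates[l] ~ lower estimate of ||x - x_l||_A
    for j in range(maxit):
        Ap = A @ p
        gamma = rr / (p @ Ap)
        window.append(gamma * rr)
        if len(window) > d:
            window.popleft()
        x = x + gamma * p
        r = r - gamma * Ap
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
        if j + 1 >= d:           # nu_{j+1-d, d} is now complete
            estimates.append(np.sqrt(sum(window)))
    return x, estimates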

5. Delay of convergence. For more than 20 years the effects of rounding errors on the Lanczos and CG methods seemed devastating. Orthogonality among the computed vectors $v_1, v_2, \ldots$ was usually lost very quickly, with a subsequent loss of linear independence. Consequently, the finite termination property was lost. Still, despite a total loss of orthogonality among the vectors in the Lanczos sequence $v_1, v_2, \ldots$, and despite a possible regular appearance of Lanczos vectors which were linearly dependent on the vectors computed in preceding iterations, the Lanczos and CG methods produced reasonable results.

A fundamental work which brought light into this darkness was done by Paige. He proved that loss of orthogonality among the computed Lanczos vectors $v_1, v_2, \ldots$ was possible only in the directions of the converged Ritz vectors $z_l^{(j)} \equiv V_j\, s_l^{(j)}$. For more details see [33], [34], [35], [36], the review paper [44, Section 3.1] and the works quoted there (in particular [38], [39], [32] and [45]). Little was known about rounding errors in the Krylov subspace methods before the Ph.D. thesis of Paige [33], and almost all results (with the exception of works on ultimate attainable accuracy) published on the subject after this thesis and the papers [34], [35], [36] were based on them.

Another step, which can compete in originality with that of Paige, was made by Greenbaum in [19]. If CG is used to solve a linear symmetric positive definite system $Ax = b$ on a computer with machine precision $\epsilon$, then [19] shows that the A-norms of the errors $\|x - x_l\|_A$, $l = 1, 2, \ldots, j$, are very close to the A-norms of the errors $\|\bar{x} - \bar{x}_l\|_{\bar{A}}$, $l = 1, 2, \ldots, j$, determined by exact CG applied to some particular symmetric positive definite system $\bar{A}(j)\, \bar{x}(j) = \bar{b}(j)$ (see [19, Theorem 3, pp. 26-27]). This system and the initial approximation $\bar{x}_0(j)$ depend on the iteration step $j$. The matrix $\bar{A}(j)$ is larger than the matrix $A$. Its eigenvalues must lie in tiny intervals about the eigenvalues of $A$, and there must be at least one eigenvalue of $\bar{A}(j)$ close to each eigenvalue of $A$ (the last result was proved in [43]). Moreover, for each eigenvalue $\lambda_i$ of $A$, $i = 1, \ldots, n$ (similarly to Section 2 we assume, with no loss of generality, that the eigenvalues of $A$ are distinct), the weight $\omega_i = (v_1, u_i)^2$ closely approximates the sum of the weights corresponding to the eigenvalues of $\bar{A}(j)$ clustered around $\lambda_i$ (see [19, relation (8:21) on p. 60]).

The quantitative formulations of the relationships between $A$, $b$, $x_0$ and $\bar{A}(j)$, $\bar{b}(j)$, $\bar{x}_0(j)$ contain some terms related in various complicated ways to the machine precision $\epsilon$ (see [19], [43] and [17, Theorems 5.1-5.3 and the related discussion on pp. 257-260]). The actual size of the terms given in the quoted papers says more about the difficulty of accurately handling peculiar technical problems of rounding error analysis than about the accuracy of the described relationships. The fundamental message to which the (very often weak) rounding error bounds lead should be read: the first $j$ steps of a finite precision CG computation for $Ax = b$ can be viewed as the first $j$ steps of an exact CG computation for some particular $\bar{A}(j)\, \bar{x}(j) = \bar{b}(j)$. This relationship was developed and proved theoretically. Numerical experiments show that its tightness is much better than the technically complicated theoretical calculations in [19] would suggest. We will not continue with describing the results of the subsequent work [22]; we do not need it here. Moreover, a rigorous theoretical description of the model from [22] in the language of the Riemann-Stieltjes integral and Gauss quadrature still needs some clarification. We hope to return to that subject elsewhere.

As a consequence of the loss of orthogonality caused by rounding errors, convergence of the CG method is delayed. In order to illustrate this important point numerically, we plot in Fig. 5.1 results of the CG method (1.3) for the matrix $A = Q \Lambda Q^T$, where $Q$ is the orthogonal matrix obtained from the Matlab QR-decomposition of a randomly generated matrix (computed by the Matlab command randn(n)), and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ is a diagonal matrix with the eigenvalues

(5.1) $\lambda_i = \lambda_1 + \frac{i - 1}{n - 1}\, (\lambda_n - \lambda_1)\, \rho^{\,n-i}, \qquad i = 2, \ldots, n,$

see [43]. We have used $n = 48$, $\lambda_1 = 0.1$, $\lambda_n = 1000$, $\rho = 0.9$, $x = (1, \ldots, 1)^T$, $b = Ax$, and $x_0 = (0, \ldots, 0)^T$. We have simulated the exact arithmetic values by double reorthogonalization of the residual vectors (see [22]). The quantities obtained from the CG implementation with the double reorthogonalized residuals will be denoted by (E). Fig. 5.1 shows that when the double reorthogonalization is applied, the corresponding A-norm of the error (dash-dotted line) can be very different from the A-norm of the error of the ordinary finite precision (FP) CG implementation (solid line). Without reorthogonalization, the orthogonality among the (FP) Lanczos vectors, measured by the Frobenius norm $\|I - V_j^T V_j\|_F$ (dotted line), is lost after a few iterations. With double reorthogonalization the orthogonality is kept close to machine precision (dots). Experiments were performed using Matlab 5.1 on a personal computer with machine precision $\epsilon \approx 10^{-16}$.
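For reproducibility, the test problem can be generated as follows (a sketch in Python rather than the Matlab actually used for the experiments; the function name is ours):

import numpy as np

def test_problem(n=48, lam1=0.1, lamn=1000.0, rho=0.9, seed=0):
    # Eigenvalues (5.1): lambda_1 = lam1 and, for i = 2, ..., n,
    #   lambda_i = lam1 + ((i-1)/(n-1)) * (lamn - lam1) * rho**(n-i).
    i = np.arange(2, n + 1)
    lam = np.empty(n)
    lam[0] = lam1
    lam[1:] = lam1 + (i - 1) / (n - 1) * (lamn - lam1) * rho ** (n - i)
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))  # random orthogonal factor
    A = Q @ np.diag(lam) @ Q.T                        # A = Q Lambda Q^T
    x = np.ones(n)                                    # x = (1, ..., 1)^T
    return A, A @ x, np.zeros(n)                      # A, b = A x, x_0 = 0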


We see that the delay of convergence due to the loss of orthogonality can be very substantial. Consider now the application of the estimates (4.6)-(4.9) to finite precision computations. In the derivation of all these estimates we assumed exact arithmetic. Consequently, in these derivations we did not account for any loss of orthogonality or delay of convergence. For the example presented above, the bounds can therefore be expected to give good results for the double reorthogonalized CG (dash-dotted convergence curve). Should they give anything reasonable also for the ordinary (FP) CG implementation (solid convergence curve)? If yes, then why? The following section explains that this question is of fundamental importance.

6. Examples. Indeed, without a proper rounding error analysis of the identities (4.1)-(4.4) there is no justification that the estimates derived assuming exact arithmetic will work in finite precision arithmetic. For example, when a significant loss of orthogonality occurs, the bound $\mu_{j,d}$ given by (4.7) does not work!

This fact is demonstrated in Fig. 6.1, which presents experimental results for the problem described in the previous section (see Fig. 5.1). It plots the computed estimate $\mu_{j,d}^{1/2}$ (dashed line) and demonstrates the importance of the correction term

(6.1) $c_{j,d} = -r_j^T (x_j - x_0) + r_{j+d}^T (x_{j+d} - x_0)$

($|c_{j,d}|^{1/2}$ is plotted by dots). Fig. 6.1 shows clearly that when the global loss of orthogonality (measured by $\|I - V_j^T V_j\|_F$ and plotted by a dotted line) grows greater than $\|x - x_j\|_A$ (solid line), the bound $\mu_{j,d}^{1/2}$, which is based on global orthogonality, ceases to give any useful information about convergence ($\mu_{j,d}$ may even become negative; we therefore plot the square root of its absolute value). Adding the correction term $c_{j,d}$ to $\mu_{j,d}$ gives $\upsilon_{j,d}$, see (4.8), which gives estimates comparable to $\eta_{j,d}$ and $\nu_{j,d}$ (see Section 11). In this experiment we used $d = 4$.
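In code, the failing bound (4.7) and its corrected form (4.8) differ only by the term (6.1). A sketch (ours; r_list and x_list are assumed to store, as NumPy arrays, the computed residuals $r_i$ and approximations $x_i$ of a finite precision CG run):

def mu_and_upsilon(r_list, x_list, j, d):
    # mu_{j,d} from (4.7) and the correction c_{j,d} from (6.1);
    # mu + c equals upsilon_{j,d} of (4.8).
    r0, x0 = r_list[0], x_list[0]
    mu = r0 @ (x_list[j + d] - x_list[j])
    c = -(r_list[j] @ (x_list[j] - x0)) + r_list[j + d] @ (x_list[j + d] - x0)
    return mu, mu + c

Once the global loss of orthogonality exceeds $\|x - x_j\|_A$, the first returned value becomes useless, while the second remains a meaningful estimate.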


It is important to understand that the additional rounding errors in computing $\eta_{j,d}$, $\mu_{j,d}$, $\upsilon_{j,d}$ and $\nu_{j,d}$ from the given formulas (the algorithm CGQL and (4.7)-(4.9)) do not significantly affect the values of the computed bounds and do not represent a problem. The problem is the fact that, when the orthogonality is significantly lost, the input quantities used in the algorithm CGQL and in the formulas (4.7)-(4.9) are significantly different from their exact precision counterparts. These quantities, affected by the loss of orthogonality, are plugged into formulas which assume, in their derivation, exact orthogonality.

In order to stress the previous point and to underline the necessity of rounding error analysis of the identities (4.7)-(4.9), we present the following analogous example. In the Lanczos method the eigenvalues $\theta_1^{(j)} < \theta_2^{(j)} < \cdots < \theta_j^{(j)}$ of $T_j$ (Ritz values) are considered approximations to the eigenvalues of the matrix $A$ (see Section 2). The pair $\theta_l^{(j)}$, $z_l^{(j)} = V_j\, s_l^{(j)}$ (where $s_l^{(j)}$ is the normalized eigenvector of $T_j$ corresponding to $\theta_l^{(j)}$) represents an approximate eigenpair of $A$. In exact arithmetic we have the following bound for the distance of $\theta_l^{(j)}$ to the nearest eigenvalue of $A$,

(6.2) $\min_i \left| \lambda_i - \theta_l^{(j)} \right| \le \left\| A z_l^{(j)} - \theta_l^{(j)} z_l^{(j)} \right\|,$

where $\|z_l^{(j)}\| = 1$ due to the orthonormality of the Lanczos vectors $v_1, \ldots, v_j$. Using (2.3), $\| A z_l^{(j)} - \theta_l^{(j)} z_l^{(j)} \| = \beta_{j+1}\, |(e_j, s_l^{(j)})| \equiv \delta_{lj}$, which gives

(6.3) $\min_i \left| \lambda_i - \theta_l^{(j)} \right| \le \delta_{lj},$

see, e.g., [38], [36]. Consequently, in exact arithmetic, if $\delta_{lj}$ is small, then $\theta_l^{(j)}$ must be close to some $\lambda_i$. In finite precision arithmetic the loss of orthogonality has, among other things, a very unpleasant effect: we cannot guarantee, in general, that $z_l^{(j)}$, which is a linear combination of $v_1, \ldots, v_j$, has a nonvanishing norm. We can still compute $\delta_{lj}$ from $\beta_{j+1}$ and $T_j$; the effect of rounding errors in this additional computation is negligible. We can therefore say, similarly to the analogous statements published about computation of the convergence estimates in the CG method, that $\delta_{lj}$ is in the presence of rounding errors computed "accurately". Does $\delta_{lj}$ computed in finite precision arithmetic tell us anything about convergence of $\theta_l^{(j)}$ to some $\lambda_i$? Yes, it does! But this affirmative answer is based neither on the exact precision formulas (6.2) and (6.3), nor on the fact that $\delta_{lj}$ is computed "accurately". It is based on an ingenious analysis due to Paige, who has shown that orthogonality can be lost only in the directions of the well approximated eigenvectors. For the complicated details of this difficult result we refer to [33], [37] and to the summary given in [44, Theorem 2]. We see that even in finite precision computations a small $\delta_{lj}$ guarantees that $\theta_l^{(j)}$ approximates some $\lambda_i$ to high accuracy. It is very clear, however, that this conclusion is the result of the rounding error analysis of the Lanczos method given by Paige, and no similar statement could be made without this analysis.

In the following three sections we present a rounding error analysis of the bound $\nu_{j,d}$ given by (4.4) and (4.9). We concentrate on $\nu_{j,d}$ because it is the simplest of all the bounds. If $\nu_{j,d}$ is proved numerically stable, then there is little reason for using the other bounds $\eta_{j,d}$ or $\upsilon_{j,d}$ in practical computations.

7. Finite precision CG computations. In the analysis we assume the standard model of floating point arithmetic with machine precision $\epsilon$, see, e.g. [25, (2.4)],

(7.1) $\mathrm{fl}[a \circ b] = (a \circ b)(1 + \delta), \qquad |\delta| \le \epsilon,$

where $a$ and $b$ stand for floating-point numbers and the symbol $\circ$ stands for the operations addition, subtraction, multiplication and division. We assume that this model holds also for the square root operation. Under this model, we have for the operations involving vectors $v$, $w$, a scalar $\alpha$ and the matrix $A$ the following standard results [18], see also [20], [35]:

(7.2) $\|\alpha v - \mathrm{fl}[\alpha v]\| \le \epsilon\, \|\alpha v\|,$

(7.3) $\|v + w - \mathrm{fl}[v + w]\| \le \epsilon\, (\|v\| + \|w\|),$

(7.4) $|(v, w) - \mathrm{fl}[(v, w)]| \le \epsilon\, n\, (1 + O(\epsilon))\, \|v\|\, \|w\|,$

(7.5) $\|A v - \mathrm{fl}[A v]\| \le \epsilon\, c\, \|A\|\, \|v\|.$

When $A$ is a matrix with at most $h$ nonzeros in any row and the matrix-vector product is computed in the standard way, $c = h n^{1/2}$. In the following analysis we keep only the terms linear in the machine precision $\epsilon$ and express the higher order terms as $O(\epsilon^2)$. By $O(\mathrm{const})$, where const differs from $\epsilon^2$, we denote const multiplied by a bounded positive term of insignificant size which is independent of const and of any other variables present in the bounds.

Numerically, the CG iterates satisfy

(7.6) $x_{j+1} = x_j + \gamma_j\, p_j + \epsilon z_x^j,$

(7.7) $r_{j+1} = r_j - \gamma_j\, A p_j + \epsilon z_r^j,$

(7.8) $p_{j+1} = r_{j+1} + \delta_{j+1}\, p_j + \epsilon z_p^j,$

where $\epsilon z_x^j$, $\epsilon z_r^j$ and $\epsilon z_p^j$ account for the local roundoff ($r_0 = b - A x_0 - \epsilon f_0$, $\epsilon \|f_0\| \le \epsilon \{ \|b\| + \|A x_0\| + c\, \|A\|\, \|x_0\| \} + O(\epsilon^2)$). The local roundoff can be bounded according to the standard results (7.2)-(7.5) in the following way:

(7.9) $\epsilon \|z_x^j\| \le \epsilon \left\{ \|x_j\| + 2 \|\gamma_j p_j\| \right\} + O(\epsilon^2) \le \epsilon \left\{ 3 \|x_j\| + 2 \|x_{j+1}\| \right\} + O(\epsilon^2),$

(7.10) $\epsilon \|z_r^j\| \le \epsilon \left\{ \|r_j\| + 2 \|\gamma_j A p_j\| + c\, \|A\|\, \|\gamma_j p_j\| \right\} + O(\epsilon^2),$

(7.11) $\epsilon \|z_p^j\| \le \epsilon \left\{ \|r_{j+1}\| + 2 \|\delta_{j+1} p_j\| \right\} + O(\epsilon^2) \le \epsilon \left\{ 3 \|r_{j+1}\| + 2 \|p_{j+1}\| \right\} + O(\epsilon^2).$

Similarly, the computed coefficients $\gamma_j$ and $\delta_j$ satisfy

(7.12) $\gamma_j = \frac{\|r_j\|^2}{p_j^T A p_j} + \epsilon \zeta_\gamma^j, \qquad \delta_j = \frac{\|r_j\|^2}{\|r_{j-1}\|^2} + \epsilon \zeta_\delta^j.$

Assuming $n \epsilon \ll 1$, the local roundoff $\epsilon \zeta_\delta^j$ is bounded, according to (7.1) and (7.4), by

(7.13) $\epsilon\, |\zeta_\delta^j| \le \epsilon\, \frac{\|r_j\|^2}{\|r_{j-1}\|^2}\, O(n) + O(\epsilon^2).$

Using (7.2)-(7.5) and $\|A\|\, \|p_j\|^2 / (p_j, A p_j) \le \kappa(A)$,

$\mathrm{fl}[(p_j, A p_j)] = (p_j, A p_j) + \epsilon\, \|A p_j\|\, \|p_j\|\, O(n) + \epsilon\, \|A\|\, \|p_j\|^2\, O(c) + O(\epsilon^2) = (p_j, A p_j) \left( 1 + \epsilon\, \kappa(A)\, O(n + c) \right) + O(\epsilon^2).$

Assuming $\epsilon\, (n + c)\, \kappa(A) \ll 1$, the local roundoff $\epsilon \zeta_\gamma^j$ is bounded by

(7.14) $\epsilon\, |\zeta_\gamma^j| \le \epsilon\, \kappa(A)\, \frac{\|r_j\|^2}{(p_j, A p_j)}\, O(n + c) + O(\epsilon^2).$

It is well known that in finite precision arithmetic the true residual $b - A x_j$ differs from the recursively updated residual vector $r_j$,

(7.15) $r_j = b - A x_j - \epsilon f_j.$

This topic was studied in [42] and [20]. In particular, the results give


(7.17) $\|r_j\| = \|b - A x_j\|\, (1 + \epsilon F_j),$

where $\epsilon F_j$ is bounded by

(7.18) $|\epsilon F_j| = \frac{\left| \|r_j\| - \|b - A x_j\| \right|}{\|b - A x_j\|} \le \frac{\|r_j - (b - A x_j)\|}{\|b - A x_j\|} = \frac{\epsilon\, \|f_j\|}{\|b - A x_j\|}.$

Rounding errors affect the results of CG computations in two main ways: they delay convergence (see Section 5) and limit the ultimate attainable accuracy. Here we are primarily interested in estimating the convergence rate. We therefore assume that the final accuracy level has not been reached yet, and that $\epsilon f_j$ is, in comparison to the size of the true and iterative residuals, small. In the subsequent text we will relate the numerical inaccuracies to the A-norm of the error $\|x - x_j\|_A$. The following inequalities derived from (7.18) will prove useful,

(7.19) $\lambda_1^{1/2}\, \|x - x_j\|_A\, (1 + \epsilon F_j) \le \|r_j\| \le \lambda_n^{1/2}\, \|x - x_j\|_A\, (1 + \epsilon F_j).$

The monotonicity of the A-norm and of the Euclidean norm of the error is preserved in CG (with small additional inaccuracy) also in finite precision computations (see [19], [22]). Using this fact we get for $j \ge i$

(7.20) $\epsilon\, \frac{\|r_j\|}{\|r_i\|} \le \epsilon\, \frac{\lambda_n^{1/2}}{\lambda_1^{1/2}} \cdot \frac{\|x - x_j\|_A}{\|x - x_i\|_A} \cdot \frac{1 + \epsilon F_j}{1 + \epsilon F_i} \le \epsilon\, \kappa(A)^{1/2} + O(\epsilon^2).$

This bound will be used later.

8. Finite precision analysis--basic identity. The bounds (4.6)-(4.9) are mathematically equivalent. We will concentrate on the simplest one, $\nu_{j,d}$ (4.9), and prove that it gives (up to a small term) correct estimates also in finite precision computations. In particular, we prove that the ideal (exact precision) identity (4.4) changes numerically to

(8.1) $\|x - x_j\|_A^2 = \nu_{j,d} + \|x - x_{j+d}\|_A^2 + \tilde{\nu}_{j,d},$

where the perturbation term $\tilde{\nu}_{j,d}$ is as small as it can be (the analysis here will lead to much stronger results than the analysis of the finite precision counterpart of (4.1) given in [17]). Please note that the difference between (4.4) and (8.1) is not trivial. The ideal and numerical counterparts of each individual term in these identities may be orders of magnitude different! Due to the facts that rounding errors in computing $\nu_{j,d}$ numerically from the quantities $\gamma_i$, $r_i$ are negligible and that $\tilde{\nu}_{j,d}$ will be related to $\epsilon \|x - x_j\|_A$, (8.1) will justify the estimate $\nu_{j,d}$ in finite precision computations.

From the identity for the numerically computed approximate solution

$\|x - x_j\|_A^2 = \|x - x_{j+1} + x_{j+1} - x_j\|_A^2 = \|x - x_{j+1}\|_A^2 + 2\, (x - x_{j+1})^T A (x_{j+1} - x_j) + \|x_{j+1} - x_j\|_A^2,$

we easily obtain

(8.2) $\|x - x_j\|_A^2 - \|x - x_{j+1}\|_A^2 = \|x_{j+1} - x_j\|_A^2 + 2\, (x - x_{j+1})^T A (x_{j+1} - x_j).$

Please note that (8.2) represents an identity for the computed quantities. In order to get the desired form leading to (8.1), we will develop the right hand side of (8.2). In this derivation we will rely on local properties of the finite precision CG recurrences (7.6)-(7.8) and (7.12).

Using (7.6), the first term on the right-hand side of (8.2) can be written as

(8.3) $\|x_{j+1} - x_j\|_A^2 = \|\gamma_j\, p_j + \epsilon z_x^j\|_A^2 = \gamma_j^2\, p_j^T A p_j + 2 \epsilon \gamma_j\, p_j^T A z_x^j + O(\epsilon^2).$

Similarly, the second term on the right-hand side of (8.2) transforms, using (7.15), to the form

(8.4) $2\, (x - x_{j+1})^T A (x_{j+1} - x_j) = 2\, (r_{j+1} + \epsilon f_{j+1})^T (x_{j+1} - x_j) = 2\, r_{j+1}^T (x_{j+1} - x_j) + 2 \epsilon\, f_{j+1}^T (x_{j+1} - x_j).$

Combining (8.2), (8.3) and (8.4),

(8.5) $\|x - x_j\|_A^2 - \|x - x_{j+1}\|_A^2 = \gamma_j^2\, p_j^T A p_j + 2\, r_{j+1}^T (x_{j+1} - x_j) + 2 \epsilon \gamma_j\, p_j^T A z_x^j + 2 \epsilon\, f_{j+1}^T (x_{j+1} - x_j) + O(\epsilon^2).$

Substituting for $\gamma_j$ from (7.12), the first term in (8.5) can be written as

$\gamma_j^2\, p_j^T A p_j = \gamma_j \left( \|r_j\|^2 + \epsilon \zeta_\gamma^j\, p_j^T A p_j \right) = \gamma_j\, \|r_j\|^2 + \epsilon \gamma_j \zeta_\gamma^j\, p_j^T A p_j.$

Consequently, the difference between the squared A-norms of the error in the consecutive steps can be written in the form convenient for the further analysis

(8.6) $\|x - x_j\|_A^2 - \|x - x_{j+1}\|_A^2 = \gamma_j\, \|r_j\|^2 + \epsilon \gamma_j \zeta_\gamma^j\, p_j^T A p_j + 2\, r_{j+1}^T (x_{j+1} - x_j) + 2 \epsilon \left\{ \gamma_j\, p_j^T A z_x^j + f_{j+1}^T (x_{j+1} - x_j) \right\} + O(\epsilon^2).$
The goal of the following analysis is to show that until $\|x - x_j\|_A$ reaches its ultimate attainable accuracy level, the terms on the right-hand side of (8.6) are, except for $\gamma_j \|r_j\|^2$, insignificant. Bounding the second term will not represent a problem. The norm of the difference $x_{j+1} - x_j = (x - x_j) - (x - x_{j+1})$ is bounded by $2 \|x - x_j\|_A / \lambda_1^{1/2}$. Therefore the size of the fourth term is proportional to $\epsilon \|x - x_j\|_A$. The third term is related to the line-search principle. Ideally (in exact arithmetic), the $(j+1)$-th residual is orthogonal to the difference between the $(j+1)$-th and $j$-th approximation (which is a multiple of the $j$-th direction vector). This is equivalent to the line-search: ideally, the $(j+1)$-th CG approximation minimizes the A-norm of the error along the line determined by the $j$-th approximation and the $j$-th direction vector. Here the term $r_{j+1}^T (x_{j+1} - x_j)$, with $r_{j+1}$, $x_j$ and $x_{j+1}$ computed numerically, examines how closely the line-search holds in finite precision arithmetic. In fact, bounding the local orthogonality term $r_{j+1}^T (x_{j+1} - x_j)$ represents the technically most difficult part of the remaining analysis.

9. Local orthogonality in the Hestenes and Stiefel implementation. Since the classical work of Paige it has been well known that in the three-term Lanczos recurrence local orthogonality is preserved close to machine precision (see [35]). We will derive an analogous result for the CG algorithm, and state it as an independent result.
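This claim is easy to observe experimentally. The following sketch (our own illustration) tracks the normalized local orthogonality $|(r_{j+1}, p_j)| / (\|r_{j+1}\|\, \|p_j\|)$ along an ordinary finite precision CG run:

import numpy as np

def local_orthogonality(A, b, x0, maxit):
    # Returns |(r_{j+1}, p_j)| / (||r_{j+1}|| ||p_j||) for j = 0, 1, ...;
    # in contrast to the global orthogonality measured by ||I - V_j^T V_j||_F,
    # these values stay close to machine precision throughout the iteration.
    x = x0.copy()
    r = b - A @ x
    p = r.copy()
    rr = r @ r
    out = []
    for _ in range(maxit):
        Ap = A @ p
        gamma = rr / (p @ Ap)
        x = x + gamma * p
        r = r - gamma * Ap
        out.append(abs(r @ p) / (np.linalg.norm(r) * np.linalg.norm(p)))
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
    return out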

The local orthogonality term $r_{j+1}^T (x_{j+1} - x_j)$ can be written in the form

(9.1) $r_{j+1}^T (x_{j+1} - x_j) = r_{j+1}^T (\gamma_j\, p_j + \varepsilon\, z_j^x) = \gamma_j\, r_{j+1}^T p_j + \varepsilon\, r_{j+1}^T z_j^x.$

Using the bound $\|r_{j+1}\| \le \lambda_n^{1/2}\, \|x - x_{j+1}\|_A\, (1 + \varepsilon F_{j+1}) \le \lambda_n^{1/2}\, \|x - x_j\|_A\, (1 + \varepsilon F_{j+1})$, see (7.19), the size of the second term in (9.1) is proportional to $\varepsilon\, \|x - x_j\|_A$. The main step consists of showing that the term $r_{j+1}^T p_j$ is sufficiently small. Multiplying the recurrence (7.7) for $r_{j+1}$ by $p_j^T$ gives (using (7.8) and (7.12))

(9.2) $p_j^T r_{j+1} = \delta_j\, p_{j-1}^T r_j + \varepsilon \big\{ r_j^T z_{j-1}^p - \zeta_\gamma^j\, p_j^T A p_j + p_j^T z_j^r \big\}.$

Denoting

(9.3) $M_j \equiv r_j^T z_{j-1}^p - \zeta_\gamma^j\, p_j^T A p_j + p_j^T z_j^r,$

the identity (9.2) is

(9.4) $p_j^T r_{j+1} = \delta_j\, p_{j-1}^T r_j + \varepsilon\, M_j.$

Recursive application of (9.4) for $p_{j-1}^T r_j, \ldots, p_1^T r_2$, with $p_0^T r_1 = \|r_0\|^2 - \gamma_0\, p_0^T A p_0 + \varepsilon\, p_0^T z_0^r = \varepsilon \big\{ -\zeta_\gamma^0\, r_0^T A r_0 + p_0^T z_0^r \big\} \equiv \varepsilon\, M_0$, gives

(9.5) $p_j^T r_{j+1} = \varepsilon \sum_{i=0}^{j} \Big( \prod_{k=i+1}^{j} \delta_k \Big) M_i + O(\varepsilon^2).$

Since the computed coefficients satisfy $\delta_k = \|r_k\|^2 / \|r_{k-1}\|^2 + O(\varepsilon)$, and therefore

$\prod_{k=i+1}^{j} \delta_k = \frac{\|r_j\|^2}{\|r_i\|^2} + O(\varepsilon),$
we can express (9.5) as

(9.6) $p_j^T r_{j+1} = \varepsilon\, \|r_j\|^2 \sum_{i=0}^{j} \frac{M_i}{\|r_i\|^2} + O(\varepsilon^2).$

Using (9.3),

(9.7) $\frac{|M_i|}{\|r_i\|^2} \le \frac{\|z_{i-1}^p\|}{\|r_i\|} + \frac{|\zeta_\gamma^i|\, p_i^T A p_i}{\|r_i\|^2} + \frac{\|p_i\|\, \|z_i^r\|}{\|r_i\|^2}.$

From (7.11) it follows that

(9.8) $\varepsilon\, \frac{\|z_{i-1}^p\|}{\|r_i\|} \le \varepsilon \Big\{ 3 + 2\, \frac{\|p_i\|}{\|r_i\|} \Big\} + O(\varepsilon^2).$

Using (7.14),

(9.9) $\varepsilon\, \frac{|\zeta_\gamma^i|\, p_i^T A p_i}{\|r_i\|^2} \le \varepsilon\, \kappa(A)\, O(n + c) + O(\varepsilon^2).$

The last part of (9.7) is bounded using (7.10) and (7.12)



(9.11) $\varepsilon\, \frac{\|p_i\|}{\|r_i\|} \le \varepsilon\, \frac{\|r_i\| + \delta_i \|p_{i-1}\|}{\|r_i\|} + O(\varepsilon^2) \le \varepsilon \Big\{ 1 + \frac{\|r_i\|}{\|r_{i-1}\|}\, \frac{\|p_{i-1}\|}{\|r_{i-1}\|} \Big\} + O(\varepsilon^2).$

Recursive application of (9.11) for $\|p_{i-1}\|/\|r_{i-1}\|,\ \|p_{i-2}\|/\|r_{i-2}\|,\ \ldots,\ \|p_1\|/\|r_1\|$ with $\|p_0\|/\|r_0\| = 1$ gives

(9.12) $\varepsilon\, \frac{\|p_i\|}{\|r_i\|} \le \varepsilon \Big\{ 1 + \frac{\|r_i\|}{\|r_{i-1}\|} + \frac{\|r_i\|}{\|r_{i-2}\|} + \cdots + \frac{\|r_i\|}{\|r_0\|} \Big\} + O(\varepsilon^2).$

The size of $\varepsilon\, \|r_i\| / \|r_k\|$, $i \ge k$, is, according to (7.20), less than or equal to $\varepsilon\, \kappa(A)^{1/2} + O(\varepsilon^2)$. Consequently,

(9.13) $\varepsilon\, \frac{\|p_i\|}{\|r_i\|} \le \varepsilon \big\{ 1 + i\, \kappa(A)^{1/2} \big\} + O(\varepsilon^2).$

Summarizing (9.8), (9.9), (9.10) and (9.13), the ratio $\varepsilon\, |M_i| / \|r_i\|^2$ is bounded as

(9.14) $\varepsilon\, \frac{|M_i|}{\|r_i\|^2} \le \varepsilon\, \kappa(A)\, O(8 + 2c + n + 3i) + O(\varepsilon^2).$

Combining this result with (9.6) proves the following theorem.

THEOREM 9.1. Using the previous notation, let $\varepsilon (n + c)\, \kappa(A) \ll 1$. Then, in the finite precision implementation (7.6)-(7.8) and (7.12) of the conjugate gradient method, the local orthogonality between the direction vectors and the iteratively computed residuals is bounded by

(9.15) $|p_j^T r_{j+1}| \le \varepsilon\, \|r_j\|^2\, \kappa(A)\, O\big((j + 1)(8 + 2c + n + 3j)\big) + O(\varepsilon^2).$
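The size of the local orthogonality can also be observed experimentally. The following fragment (a minimal sketch in our words, not code from the paper; the test matrix is a hypothetical example) runs the Hestenes and Stiefel recurrences in double precision and records the normalized local orthogonality $|p_j^T r_{j+1}| / \|r_j\|^2$, which by (9.15) should remain at a modest multiple of machine epsilon even long after global orthogonality is lost:

```python
import numpy as np

# Numerical check of Theorem 9.1 (our illustration): record the local
# orthogonality |p_j^T r_{j+1}| / ||r_j||^2 along the CG iteration.

def local_orthogonality_levels(A, b, x0, maxit):
    x, r = x0.copy(), b - A @ x0
    p = r.copy()
    levels = []
    for _ in range(maxit):
        Ap = A @ p
        rr = r @ r
        gamma = rr / (p @ Ap)                   # step length, cf. (7.12)
        x = x + gamma * p                       # iterate update, cf. (7.6)
        r_new = r - gamma * Ap                  # recursive residual, cf. (7.7)
        levels.append(abs(p @ r_new) / rr)      # |p_j^T r_{j+1}| / ||r_j||^2
        p = r_new + ((r_new @ r_new) / rr) * p  # new direction, cf. (7.8)
        r = r_new
    return levels

# Hypothetical SPD test problem with kappa(A) ~ 1e8.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((48, 48)))
A = Q @ np.diag(np.logspace(0, 8, 48)) @ Q.T
b = rng.standard_normal(48)
print(max(local_orthogonality_levels(A, b, np.zeros(48), 40)))
```

The recorded levels should stay many orders of magnitude below the global loss of orthogonality, in agreement with the spirit of (9.15).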

10. Finite precision analysis - conclusions. We now return to (8.6) and finalize our discussion. Using (9.1) and (9.6),

(10.1) $\|x - x_j\|_A^2 - \|x - x_{j+1}\|_A^2 = \gamma_j\, \|r_j\|^2 \Big( 1 + \varepsilon \Big\{ \frac{\zeta_\gamma^j\, p_j^T A p_j}{\|r_j\|^2} + 2 \sum_{i=0}^{j} \frac{M_i}{\|r_i\|^2} \Big\} \Big) + 2\varepsilon \big\{ (f_{j+1} + A z_j^x)^T (x_{j+1} - x_j) + r_{j+1}^T z_j^x \big\} + O(\varepsilon^2).$
The term

$E_{(1)}^j \equiv \varepsilon \Big\{ \frac{\zeta_\gamma^j\, p_j^T A p_j}{\|r_j\|^2} + 2 \sum_{i=0}^{j} \frac{M_i}{\|r_i\|^2} \Big\}$

is bounded, using (7.14) and (9.14), by

(10.2) $|E_{(1)}^j| \le \varepsilon\, \kappa(A)\, O\big(n + c + 2 (j + 1)(8 + 2c + n + 3j)\big) + O(\varepsilon^2).$

We write the remaining term on the right hand side of (10.1) proportional to $\varepsilon$ as

(10.3) $2\varepsilon \big\{ (f_{j+1} + A z_j^x)^T (x_{j+1} - x_j) + r_{j+1}^T z_j^x \big\} \equiv \|x - x_j\|_A\, E_{(2)}^j,$



With (7.16) and (7.9),


Finally, using the fact that the monotonicity of the A-norm and the Euclidean norm of the error is preserved also in finite precision CG computations (with small additional inaccuracy, see [19], [22]), we obtain the finite precision analogue of (4.4), which is formulated as a theorem.

THEOREM 10.1. With the notation defined above, let $\varepsilon (n + c)\, \kappa(A) \ll 1$. Then the CG approximate solutions computed in finite precision arithmetic satisfy

(10.6) $\|x - x_j\|_A^2 - \|x - x_{j+d}\|_A^2 = \nu_{j,d}\, \big( 1 + E_{(1)}^{j,d} \big) + \|x - x_j\|_A\, E_{(2)}^{j,d} + O(\varepsilon^2),$

where

(10.7) $\nu_{j,d} = \sum_{i=j}^{j+d-1} \gamma_i\, \|r_i\|^2,$

and the terms due to rounding errors are bounded by



$O(t^{(1)}(n))$ and $O(t^{(2)}(n))$ represent terms bounded by a polynomial of small degree in $n$, independent of any other variables.

Please note that the value $\nu_{j,d}$ is in Theorem 10.1 computed exactly using (10.7). Errors in computing $\nu_{j,d}$ numerically (i.e. in computing $fl\big(\sum_{i=j}^{j+d-1} \gamma_i \|r_i\|^2\big)$) are negligible in comparison to $\nu_{j,d}$ multiplied by the bound for the term $|E_{(1)}^{j,d}|$, and need not be considered here. Theorem 10.1 therefore says that for the numerically computed approximate solutions

(10.10) $\|x - x_j\|_A^2 - \|x - x_{j+d}\|_A^2 = fl(\nu_{j,d}) + \hat{\varepsilon}_{j,d},$

where the term $\hat{\varepsilon}_{j,d}$ "perturbs" the ideal identity (4.4) in the finite precision case. Here $\hat{\varepsilon}_{j,d}$ denotes a quantity insignificantly different from $\tilde{\varepsilon}_{j,d}$ in (8.1). Consequently, the numerically computed value $\nu_{j,d}$ can be trusted until it reaches the level of $\hat{\varepsilon}_{j,d}$. Based on the assumption $\varepsilon (n + c)\, \kappa(A) \ll 1$ and (10.8), we consider $|E_{(1)}^{j,d}| \ll 1$. Then, assuming (4.5), the numerically computed value $\nu_{j,d}$ gives a good estimate for the squared A-norm of the error $\|x - x_j\|_A^2$ until

$\|x - x_j\|_A\, |E_{(2)}^{j,d}| \ll \|x - x_j\|_A^2,$

which is equivalent to

(10.11) $\|x - x_j\|_A \gg |E_{(2)}^{j,d}|.$


The value $E_{(2)}^{j,d}$ represents various terms. Its upper bound is, apart from $\kappa(A)^{1/2}$, which comes into play as an effect of the worst-case rounding error analysis, linearly dependent on an upper bound for $\|x - x_0\|_A$. The value of $E_{(2)}^{j,d}$ is (as similar terms or constants in any other rounding error analysis) not important. What is important is the following possible interpretation of (10.11): until $\|x - x_j\|_A$ reaches a level close to $\varepsilon\, \|x - x_0\|_A$, the computed estimate $\nu_{j,d}$ must work.
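To make the use of Theorem 10.1 concrete, the following fragment (a minimal sketch under our naming conventions, not the authors' implementation) carries the scalars $\gamma_i \|r_i\|^2$ along the iteration and accepts $fl(\nu_{j,d})^{1/2}$ as an estimate of $\|x - x_j\|_A$ with a delay of $d$ steps, cf. (10.7) and (10.10):

```python
import numpy as np

# Sketch (our illustration) of the delayed lower estimate (10.7)/(10.10):
# nu_{j,d} = sum_{i=j}^{j+d-1} gamma_i ||r_i||^2 estimates ||x - x_j||_A^2
# and becomes computable d steps after step j.

def cg_with_energy_error_estimate(A, b, x0, maxit, d=4):
    x = x0.copy()
    r = b - A @ x
    p = r.copy()
    gamma_r2 = []                  # stored terms gamma_i * ||r_i||^2
    estimates = []                 # pairs (j, estimate of ||x - x_j||_A)
    for j in range(maxit):
        Ap = A @ p
        rr = r @ r
        gamma = rr / (p @ Ap)
        gamma_r2.append(gamma * rr)
        x = x + gamma * p
        r = r - gamma * Ap
        p = r + ((r @ r) / rr) * p
        if j + 1 >= d:             # the estimate for step j+1-d is now complete
            jj = j + 1 - d
            estimates.append((jj, sum(gamma_r2[jj:jj + d]) ** 0.5))
    return x, estimates
```

The extra cost is one scalar per step plus the $d$-step delay; the delay is the price for the validity of an assumption of the type (4.5).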

11. Numerical Experiments. We present illustrative experimental results for the system Ax = b described in Section 5. We set d = 4.

Fig. 11.1 demonstrates that the estimates $\eta_{j,d}^{1/2}$ (computed by the algorithm CGQL [16], dotted line), $\upsilon_{j,d}^{1/2}$ (dashed line) and $\nu_{j,d}^{1/2}$ (dash-dotted line) give similar results in the presence of rounding errors; all the lines essentially coincide until $\|x - x_j\|_A$ (solid line) reaches its ultimate attainable accuracy level. Loss of orthogonality, measured by $\|I - V_j^T V_j\|_F$, is plotted by the strictly increasing dotted line. We see that the orthogonality of the computed Lanczos basis is completely lost at $j \sim 22$. The term $\|\varepsilon f_j\|$, measuring the difference between the directly and iteratively computed residuals (horizontal dotted line), remains close to machine precision $\varepsilon \sim 10^{-16}$ throughout the whole computation.
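The quantities plotted in Fig. 11.1 can be recomputed from stored CG vectors. A sketch under our assumptions (the columns of $V_j$ are taken as the normalized computed residuals, which agree with the computed Lanczos vectors up to signs; the signs do not change the plotted norm):

```python
import numpy as np

# Diagnostics of Fig. 11.1 (our illustration): xs[j], rs[j] hold the
# computed approximations x_j and recursive residuals r_j.

def convergence_diagnostics(A, b, xs, rs):
    # Loss of orthogonality ||I - V_j^T V_j||_F of the computed basis.
    V = np.column_stack([r / np.linalg.norm(r) for r in rs])
    loss = np.linalg.norm(np.eye(V.shape[1]) - V.T @ V, 'fro')
    # ||eps f_j||: gap between directly and iteratively computed residuals.
    gaps = [np.linalg.norm((b - A @ x) - r) for x, r in zip(xs, rs)]
    return loss, gaps
```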

Fig. 11.2 shows, in addition to the loss of orthogonality (dotted line) and the Euclidean norm of the error $\|x - x_j\|$, the bound for the latter, derived in the following way from (1.7). Using the identity

(11.1) $\|x - x_j\|^2 = \sum_{i=j}^{j+d-1} \frac{\|p_i\|^2}{(p_i, A p_i)} \big( \|x - x_i\|_A^2 + \|x - x_{i+1}\|_A^2 \big) + \|x - x_{j+d}\|^2,$


and replacing the unknown squares of the A-norms of the errors

$\|x - x_j\|_A^2,\ \|x - x_{j+1}\|_A^2,\ \ldots,\ \|x - x_{j+d}\|_A^2$

by their estimates

$\sum_{i=j}^{j+2d-1} \gamma_i \|r_i\|^2,\quad \sum_{i=j+1}^{j+2d-1} \gamma_i \|r_i\|^2,\quad \ldots,\quad \sum_{i=j+d}^{j+2d-1} \gamma_i \|r_i\|^2,$

gives ideally

(11.2) $\|x - x_j\|^2 \ge \sum_{i=j}^{j+d-1} \frac{\|p_i\|^2}{(p_i, A p_i)} \Big( \gamma_i \|r_i\|^2 + 2 \sum_{k=i+1}^{j+2d-1} \gamma_k \|r_k\|^2 \Big) + \|x - x_{j+d}\|^2.$

Similarly as above, if d is chosen such that

$\|x - x_j\|^2 \gg \|x - x_{j+d}\|^2 \quad \text{and} \quad \|x - x_{j+d}\|_A^2 \gg \|x - x_{j+2d}\|_A^2,$


(11.3) $\tau_{j,d} \equiv \sum_{i=j}^{j+d-1} \frac{\|p_i\|^2}{(p_i, A p_i)} \Big( \gamma_i \|r_i\|^2 + 2 \sum_{k=i+1}^{j+2d-1} \gamma_k \|r_k\|^2 \Big)$

represents ideally a tight lower bound for the squared Euclidean norm of the CG error $\|x - x_j\|^2$. Please note that evaluating (11.3) requires $2d$ extra steps.
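In code, (11.3) combines scalars the iteration produces anyway; a sketch under our naming conventions (`gamma_r2` and `p_terms` are our hypothetical containers):

```python
# Sketch (our illustration) of the Euclidean-norm estimate (11.3).
# gamma_r2[i] = gamma_i * ||r_i||^2, p_terms[i] = ||p_i||^2 / (p_i, A p_i);
# both must be available up to step j + 2d - 1 (the 2d extra steps).

def tau_estimate(gamma_r2, p_terms, j, d):
    total = 0.0
    for i in range(j, j + d):
        inner = sum(gamma_r2[i + 1 : j + 2 * d])   # sum_{k=i+1}^{j+2d-1}
        total += p_terms[i] * (gamma_r2[i] + 2.0 * inner)
    return total   # ideally a tight lower bound for ||x - x_j||^2
```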

In the experiments shown in Fig. 11.1 and Fig. 6.1 we used a fixed value $d = 4$. It would be interesting to design an adaptive error estimator which would use some heuristics for adjusting $d$ according to the desired accuracy of the estimate and the convergence behaviour. A similar approach can be used for eliminating the disadvantage of the $2d$ extra steps related to (11.3). We hope to report results of our work on that subject elsewhere.

12. Conclusions. Based on the results presented above we believe that the estimate $\nu_{j,d}^{1/2}$ for the A-norm of the error should be incorporated into any software realization of the CG method. It is simple and numerically stable. It is worth considering the estimate $\tau_{j,d}^{1/2}$ for the Euclidean norm of the error, and comparing it (including complexity and numerical stability) with other existing approaches not discussed here (e.g. [6], [31]). The choice of $d$ remains a subject of further work.

With this paper we wish to pay tribute to the truly seminal paper of Hestenes and Stiefel [24] and to the work of Golub, who shaped the whole field.

Acknowledgments. Many people have contributed to the presentation of this paper by their advice, helpful objections, and remarks. We wish to especially thank Martin Gutknecht, Gerard Meurant, Chris Paige, Beresford Parlett, Lothar Reichel, Miroslav Rozloznik, and the anonymous referee for their help in revising the text.

REFERENCES
[1] M. ARIOLI, Stopping criterion for the Conjugate Gradient algorithm in a Finite Element method framework, submitted to Numer. Math., (2001).

[2] M. ARIOLI AND L. BALDINI, Backward error analysis of a null space algorithm in sparse quadratic programming, SIAM J. Matrix Anal. Appl., 23 (2001), pp. 425-442.

[3] O. AXELSSON AND I. KAPORIN, Error norm estimation and stopping criteria in preconditioned Conjugate Gradient iterations, Numer. Linear Algebra Appl., 8 (2001), pp. 265-286.

[4] D. CALVETTI, G. H. GOLUB, AND L. REICHEL, Estimation of the L-curve via Lanczos bidiagonalization, BIT, 39 (1999), pp. 603-609.

[5] D. CALVETTI, S. MORIGI, L. REICHEL, AND F. SGALLARI, Computable error bounds and estimates for the Conjugate Gradient Method, Numer. Algorithms, 25 (2000), pp. 79-88.

[6] --, An iterative method with error estimators, J. Comput. Appl. Math., 127 (2001), pp. 93-119.

[7] G. DAHLQUIST, S. EISENSTAT, AND G. H. GOLUB, Bounds for the error of linear systems of equations using the theory of moments, J. Math. Anal. Appl., 37 (1972), pp. 151-166.

[8] G. DAHLQUIST, G. H. GOLUB, AND S. G. NASH, Bounds for the error in linear systems, in Proc. Workshop on Semi-Infinite Programming, R. Hettich, ed., Springer, Berlin, 1978, pp. 154-172.

[9] P. DEUFLHARD, Cascadic conjugate gradient methods for elliptic partial differential equations I: Algorithm and numerical results, preprint SC 93-23, Konrad-Zuse-Zentrum fur Informationstechnik Berlin, Heilbronner Str., D-10711 Berlin, October 1993.

[10] --, Cascadic conjugate gradient methods for elliptic partial differential equations: Algorithm and numerical results, Contemp. Math., 180 (1994), pp. 29-42.

[11] B. FISCHER, Polynomial Based Iteration Methods for Symmetric Linear Systems, Wiley Teubner Advances in Numerical Mathematics, Wiley Teubner, 1996.

[12] B. FISCHER AND G. H. GOLUB, On the error computation for polynomial based iteration methods, in Recent Advances in Iterative Methods, G. H. Golub, A. Greenbaum, and M. Luskin, eds., Springer, N.Y., 1994, pp. 59-67.

[13] A. FROMMER AND A. WEINBERG, Verified error bounds for linear systems through the Lanczos process, Reliable Computing, 5 (1999), pp. 255-267.

[14] W. GAUTSCHI, A survey of Gauss-Christoffel quadrature formulae, in E.B. Christoffel. The Influence of His Work on Mathematics and the Physical Sciences, P. Butzer and F. Feher, eds., Birkhauser, Boston, 1981, pp. 73-157.

[15] G. H. GOLUB AND G. MEURANT, Matrices, moments and quadrature, in Numerical Analysis 1993, vol. 303 of Pitman Research Notes in Mathematics Series, D. Griffiths and G. Watson, eds., Longman Sci. Tech. Publ., 1994, pp. 105-156.

[16] --, Matrices, moments and quadrature II: How to compute the norm of the error in iterative methods, BIT, 37 (1997), pp. 687-705.

[17] G. H. GOLUB AND Z. STRAKOS, Estimates in quadratic formulas, Numer. Algorithms, 8 (1994), pp. 241-268.

[18] G. H. GOLUB AND C. VAN LOAN, Matrix Computations, The Johns Hopkins University Press, Baltimore MD, third ed., 1996.

[19] A. GREENBAUM, Behavior of slightly perturbed Lanczos and Conjugate Gradient recurrences, Linear Algebra Appl., 113 (1989), pp. 7-63.

[20] --, Estimating the attainable accuracy of recursively computed Residual methods, SIAM J. Matrix Anal. Appl., 18 (3) (1997), pp. 535-551.

[21] --, Iterative Methods for Solving Linear Systems, SIAM, Philadelphia, 1997.

[22] A. GREENBAUM AND Z. STRAKOS, Predicting the behavior of finite precision Lanczos and Conjugate Gradient computations, SIAM J. Matrix Anal. Appl., 13 (1992), pp. 121-137.

[23] M. H. GUTKNECHT AND Z. STRAKOS, Accuracy of two three-term and three two-term recurrences for Krylov space solvers, SIAM J. Matrix Anal. Appl., 22 (2001), pp. 213-229.

[24] M. R. HESTENES AND E. STIEFEL, Methods of conjugate gradients for solving linear systems, J. Research Nat. Bur. Standards, 49 (1952), pp. 409-435.

[25] N. J. HIGHAM, Accuracy and stability of numerical algorithms, SIAM, Philadelphia, PA, 1996.

[26] S. KARLIN AND L. S. SHAPLEY, Geometry of moment spaces, Memoirs of the American Mathematical Society 12, Providence, (1953).

[27] C. LANCZOS, Solution of systems of linear equations by minimized iterations, J. Research Nat. Bur. Standards, 49 (1952), pp. 33-53.

[28] G. MEURANT, The computation of bounds for the norm of the error in the Conjugate Gradient algorithm, Numer. Algorithms, 16 (1997), pp. 77-87.

[29] --, Computer Solution of Large Linear Systems, vol. 28 of Studies in Mathematics and Its Applications, Elsevier, 1999.

[30] --, Numerical experiments in computing bounds for the norm of the error in the preconditioned Conjugate Gradient algorithm, Numer. Algorithms 22, 3-4 (1999), pp. 353-365.

[31] --, Towards a reliable implementation of the conjugate gradient method. Invited plenary lecture at the Latsis Symposium: Iterative Solvers for Large Linear Systems, Zurich, February 2002.

[32] Y. NOTAY, On the convergence rate of the Conjugate Gradients in the presence of rounding errors, Numer. Math., 65 (1993), pp. 301-317.

[33] C. C. PAIGE, The Computation of Eigenvalues and Eigenvectors of Very Large Sparse Matrices, PhD thesis, Institute of Computer Science, University of London, London, U.K., 1971.

[34] --, Computational variants of the Lanczos method for the eigenproblem, J. Inst. Math. Appl., 10 (1972), pp. 373-381.

[35] --, Error analysis of the Lanczos algorithm for tridiagonalizing a symmetric matrix, J. Inst. Math. Appl., 18 (1976), pp. 341-349.

[36] --, Accuracy and effectiveness of the Lanczos algorithm for the symmetric eigenproblem, Linear Algebra Appl., 34 (1980), pp. 235-258.

[37] C. C. PAIGE AND Z. STRAKOS, Correspondence between exact arithmetic and finite precision behaviour of Krylov space methods, in XIV. Householder Symposium, J. Varah, ed., University of British Columbia, 1999, pp. 250-253.

[38] B. N. PARLETT, The Symmetric Eigenvalue Problem, Prentice Hall, Englewood Cliffs, 1980.

[39] --, Do we fully understand the symmetric Lanczos algorithm yet?, in Proceedings of the Lanczos Centenary Conference, Philadelphia, 1994, SIAM, pp. 93-107.

[40] P. E. SAYLOR AND D. C. SMOLARSKI, Addendum to: Why Gaussian quadrature in the complex plane?, Numer. Algorithms, 27 (2001), pp. 215-217.

[41] --, Why Gaussian quadrature in the complex plane?, Numer. Algorithms, 26 (2001), pp. 251-280.

[42] G. L. SLEIJPEN, H. A. VAN DER VORST, AND D. R. FOKKEMA, BiCGstab(l) and other hybrid BiCG methods, Numer. Algorithms, 7 (1994), pp. 75-109.

[43] Z. STRAKOS, On the real convergence rate of the Conjugate Gradient method, Linear Algebra Appl., 154-156 (1991), pp. 535-549.

[44] --, Convergence and numerical behaviour of the Krylov space methods, in Algorithms for Large Sparse Linear Algebraic Systems: The State of the Art and Applications in Science and Engineering, G. W. Althaus and E. Spedicato, eds., NATO ASI Institute, Kluwer Academic, 1998, pp. 175-197.

[45] Z. STRAKOS AND A. GREENBAUM, Open questions in the convergence analysis of the Lanczos process for the real symmetric eigenvalue problem, IMA preprint series 934, University of Minnesota, 1992.

[46] G. SZEGO, Orthogonal Polynomials, AMS Colloq. Publ. 23, AMS, Providence, 1939.

[47] K. F. WARNICK, Nonincreasing error bound for the Biconjugate Gradient method, report, University of Illinois, 2000.

* Received June 7, 2001. Accepted for publication August 18, 2002. Recommended by L. Reichel.


([dagger]) Institute of Computer Science, Academy of Sciences of the Czech Republic, Pod Vodarenskou vezi 2, 182 07 Praha 8, Czech Republic. This research was supported by the Grant Agency of the Czech Republic under grant No. 201/02/0595.