# The block preconditioned steepest descent iteration for elliptic operator eigenvalue problems.

1. Introduction. Let [OMEGA] be a bounded polygonal region in [R.sup.d], often d = 2 or d = 3, and let the boundary [partial derivative][OMEGA] be the disjoint union of [partial derivative][[OMEGA].sub.1] and [partial derivative][[OMEGA].sub.2] We consider the self-adjoint elliptic eigenvalue problem(1.1) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Here c(x) is a symmetric and positive definite matrix valued function, g(x) a nonnegative real function and [n.bar] denotes the normal vector on the boundary [partial derivative][[OMEGA].sub.2]. The coefficient functions are assumed piecewise continuous. We are interested in the numerical approximation of some of the smallest eigenvalues [lambda] and the corresponding eigenfunctions u.

The finite element discretization of (1.1) yields the generalized matrix eigenvalue problem

(1.2) A[x.sub.i] = [[lambda].sub.i]M[x.sub.i].

The discretization matrix A and the mass matrix M are symmetric, positive definite n x n matrices. Typically, these matrices are very large and sparse.

Our aim is to compute a fixed number of the smallest eigenvalues of (1.2) together with the associated eigenvectors. The numerical algorithm should exploit the structure of the mesh eigenproblem, and its computational costs should increase almost linearly in dimension . The demand for a near-optimal-complexity method rules out all eigensolvers which are usually used for small and dense matrices. See [2, 4, 21] for getting an overview on the wide variety of numerical eigensolvers primarily for small and dense matrices.

A conceptually easy approach to the desired near-optimal-complexity eigensolvers is based on gradient iterations for the Rayleigh quotient; cf. Dyakonov's monograph on optimization for elliptic problems [3] and its Chapter 9 on the Rayleigh-Ritz method for spectral problems. The starting point is the generalized Rayleigh quotient for (1.2)

(1.3) [rho](x) = (x, Ax)/(x, Mx), x [not equal to] 0.

As [rho](x) attains its minimum Ai at an associated eigenvector, the minimization of (1.3) can be implemented by means of a gradient iteration. The negative gradient of (1.3) reads

-[nabla][rho](x) = -2(Ax - [rho](x)Mx)/(x, Mx)

and allows us to construct a new iterate x' = x - [omega][nabla][rho](x) with [rho](x') < [rho](x), whenever is not an eigenvector and [omega] [member of] R is a proper step length. The optimal step length minimizes [rho](x') with respect to [omega] [member of] R. These optimal x' and [rho](x') can be computed by applying the Rayleigh-Ritz procedure to the 2D space spanned by x and [nabla][rho](x). The gradient iteration does not change A and M and does not need these matrices explicitly. It is a so-called matrix-free method in the sense that its implementation only requires routines z [??] Az and z [???] Mz. The sparsity of A and M allows us to implement these matrix-vector products with computational cost which only increases linearly in the number of unknowns.

1.1. Preconditioning of gradient iterations. Gradient iterations for the Rayleigh quotient which use the Euclidean gradient [nabla][rho](x) are well known to converge poorly if the condition number of A is large [5, 6, 7, 8, 12, 23, 28]. This is particularly the case for a finite element discretization of (1.1) since the condition number of A increases as O([h.sup.-2]) in the discretization parameter h.

The key ingredient to make a gradient iteration an efficient solver for the operator eigenvalue problem (1.1) is multigrid preconditioning. If a symmetric and positive definite matrix T is an approximate inverse of A, then T is called a preconditioner for A. The preconditioned gradient iteration uses as -T[nabla][rho](x) the descent direction for the Rayleigh quotient. The us age of the T-gradient can be interpreted as a change of the underlying geometry which makes ellipsoidal level sets of [rho](x) more spherical [3, 25]. Proper preconditioning, for instance by multigrid or multilevel preconditioning, can accelerate the convergence considerably. In the best case this can result in grid-independent convergence behavior [10, 11].

The T-gradient steepest descent iteration optimizes the step length [omega]

(1.4) x' = x - [omega]T[nabla][rho](x) = x - 2[omega]T(Ax - [rho](x)Mx)/(x, Mx)

in a way that x' attains the smallest possible Rayleigh quotient for all [omega] [member of] R. If T(Ax - [rho](x)Mx) is a Ritz vector, then u may be infinite. Computationally x' is determined by the Rayleigh-Ritz procedure since x' is a Ritz vector (A, M) of in the two-dimensional subspace span{x, T(Ax - [rho](x)Mx)} corresponding to the smaller Ritz value. The resulting iteration has some similarities with the Davidson method [20, 24] if the iteration is restarted after each step so that the dimension of the truncated search subspace always equals 2. Stathopoulos [27] calls such an iteration with constant memory requirement a generalized Davidson method.

The eigenpairs of (1.2) are denoted by ([[lambda].sub.i], [x.sub.i]). The enumeration is 0 < [[lambda].sub.1] < [[lambda].sub.2] < ... < [[lambda].sub.n]. We assume simple eigenvalues. The case of multiple eigenvalues can simply be reduced to that of only simple eigenvalues by a proper projection of the eigenvalue problem. Alternatively, a continuity argument can be used to show that colliding eigenvalues do not change the structure of the convergence estimates; cf. [ 17, Lemma 2.1].

For the vectorial Preconditioned Steepest Descent (PSD) iteration (1.4) sharp convergence estimates have recently been proved in [17]. The central theorem reads as follows:

THEOREM 1.1. Let x [member of] [R.sup.n] be such that the Rayleigh quotient [rho](x) satisfies [[lambda].sub.i] < [rho](x) < [[lambda].sub.i+1] for some i [member of] {1, ..., n - 1}. Let x' be the Ritz vector of (A, M) in span{x, T(Ax - [rho](x)Mx)} which corresponds to the smaller Ritz value. The Ritz value is [rho](x'). The preconditioner satisfies

(1.5) [[gamma].sub.1](x, [T.sup.-1]x) [less than or equal to] (x, Ax) [less than or equal to] [[gamma].sub.2](x, [T.sup.-1]x) [for all] x [member of] [R.sup.n]

with constants [[gamma].sub.1], [[gamma].sub.2] [greater than or equal to] 0. Let [gamma] := ([[gamma].sub.2].

Then either [rho](x') [less than or equal to] [[lambda].sub.i] or [rho](x') [member of] ([[lambda].sub.i], [[lambda].sub.i+1]) is bounded from above as follows

(1.6) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Here [[DELTA].sub.p,q]([theta]) := ([theta] - [[lambda].sub.p])/([lambda]q - [theta]).

The estimate (1.6) cannot be improved in terms of the eigenvalues [[lambda].sub.i], [[lambda].sub.i+1] and [[lambda].sub.n]. The bound can be attained for [lambda] [right arrow] [[lambda].sub.i] in the three-dimensional invariant subspace [[epsilon].sub.i,i+1,n], which is spanned by the eigenvectors corresponding to [[lambda].sub.i], [[lambda].sub.i+1] and [[lambda].sub.n].

Theorem 1.1 guarantees monotone convergence of the Ritz values [rho](x') towards an eigenvalue. If the Rayleigh quotient p(x) has reached the final interval [[[lambda].sub.1], [[lambda].sub.2]), then (1.6) proves linear convergence of the Ritz values towards the smallest eigenvalue [[lambda].sub.1].

1.2. Aim of this paper. This paper aims at generalizing the preconditioned gradient iteration (1.4) to a subspace iteration for the simultaneous computation of the s smallest eigenvalues of (1.2). New and sharp convergence estimates for the Ritz values of (A, M) in the iteration subspaces are presented, which generalize Theorem 1.1.

The effectiveness of the method is demonstrated for the Laplacian eigenvalue problem on the sliced unit circle with mixed homogeneous Dirichlet and Neumann boundary conditions. The operator eigenvalue problem is discretized with linear and quadratic (only for error estimation) finite elements. Our finite element code AMPE, see Section 4, is demonstrated to work with up to 5 x [10.sup.7] degrees of freedom. AMPE uses a multigrid preconditioner with Jacobi smoothing.

1.3. Notation and overview. Subspaces are denoted by capital calligraphic letters, e.g., the column space of a matrix Z is Z = span{Z}. Similarly, spand{Z,Y} is the smallest vector space which contains the column spaces of z and of Y. All eigenvalues and Ritz values are enumerated in order of increasing magnitude. Further, [[epsilon].sub.index-set] denotes the invariant subspace of (A, M) which is spanned by the eigenvectors of (A, M) whose indexes are contained in the index set.

This paper is structured as follows: The preconditioned gradient subspace iteration is introduced in Section 2. The new convergence estimates are presented in Section 3; Theorem 3.4 is the central new estimate on the convergence of the Ritz values. In Section 4 numerical experiments illustrate the sharpness of the estimates and the cluster-robustness of the preconditioned gradient subspace iteration.

2. The block preconditioned iteration with Rayleigh-Ritz projections. The reconditioned subspace/block iteration is a straightforward generalization of vector iteration (1.4). The iteration (1.4) is just applied to each column of the matrix which columnwise contains the Ritz vectors of the current subspace. Each subspace correction step is followed by the Rayleigh-Ritz procedure in order to extract the Ritz approximations.

The starting point is an -dimensional subspace V of [R.sup.n] which is given by the column space of V [member of] [R.sup.n x s]. The columns of V are assumed to be the M-normalized Ritz vectors of (A, M) in V. Further, [THETA] = diag([[theta].sub.1], ..., [[theta].sub.s]) is the s x s diagonal matrix of the corresponding Ritz values. The matrix residual

AV - MV[THETA] [member of] [R.sup.n x s]

contains columnwise the residuals of the Ritz vectors. Left multiplication with the preconditioner T [approximately equal to] [A.sup.-1] coincides with the approximate solution of s linear systems in A. This yields the correction-subspace span{T(AV - MV[THETA])} of the preconditioned subspace iteration. The Rayleigh-Ritz procedure is used to extract the smallest Ritz values and the associated Ritz vectors of in (A, M) in span{V, T(AV - MV[THETA])}: see Algorithm 1.

Algorithm 1 Block preconditioned steepest descent iteration. Require: Symmetric and positive definite matrices A, M [member of] [R.sup.nxn], a preconditioned satisfying (1.5) and an initial (random) s-dimensional subspace [V.sup.(0)] with [[angle].sub.M]([V.sup.(0)], [[epsilon].sub.1:s]) < [pi]/2 where [[epsilon].sub.1:s] is the invariant subspace of (A, M) associated with the s smallest eigen values. 1. Initialization: Apply the Rayleigh-Ritz procedure to [V.sup.(0)]. The n x s matrix [V.sup.(0)] = [[v.sup.(0).sub.1], ... [v.sup.(0).sub.s]] contains the Ritz vectors of (A, M) corresponding to the s Ritz values [[theta].sup.(0).sub.1], ..., [[theta].sup.(0).sub.s]. Let [[THETA].sup.(0)] = diag([[theta].sup.(0).sub.1], ..., [[theta].sup.(0).sub.s]) and R(0) =[ AV.sup.(0)] - [MV.sup.(0)][[THETA].sup.(0)]. 2. Iteration, i [greater than or equal to] 0: The Rayleigh-Ritz procedure is applied to span{[V.sup.(i)], T [R.sub.(i)]}. The columns of [V.sup.(i+1)] = [[v.sup.(i+1).sub.1], ..., [v.sup.(i+1).sub.s]] are the Ritz vectors of (A, M) corresponding to the s smallest Ritz values [[theta].sup.(i+1).sub.1], ..., [[theta].sup.(i+1).sub.s]. Let [[THETA].sup.(i+1)] = diag([[theta].sup.(i+1).sub.1], ..., [[theta].sup.(i+1).sub.s] and [R.sup.(i+1)] = [AV.sup.(i+1)] - [MV.sup.(i+1)][[THETA].sup.(i+1)]. 3. Termination: The iteration is stopped if [parallel][R.sup.(i+1)][parallel] drops below a specified accuracy.

The block steepest descent iteration as given in Algorithm 1 has already been analyzed for the special case T = [A.sup.-1] (preconditioning with the exact inverse of the discretization matrix A) in [19]. If in the central convergence estimates of this paper, see Corollary 3.3 and Theorem 3.4, [gamma] = 0 is inserted, then the convergence factor

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

simplifies to [[kappa].sub.i]/(2 - [[kappa].sub.i]). This bound has been proved in case 1b of [19, Theorem 1.1].

However, the convergence analysis for the block steepest descent iteration with a general preconditioner does not rest upon the analysis in [19]. In fact, the analysis of the current paper uses the convergence analysis for the vectorial preconditioned steepest descent iteration from [17]. By starting with the result from [17] some of the proof techniques from [19] can be adapted to the block iteration with general preconditioning. One important modification is that the argument using the subspace dimensions from Theorem 3.2 of [19] cannot be transferred to the case of general preconditioning. Instead, we use here Sion's theorem for the proof of Theorem 3.2. The proof structure of the present paper has also some similarities with the chain of arguments in [ 14] wherein a convergence analysis of the subspace variant of the preconditioned inverse iteration is contained. References to similar results and similar proof techniques are given prior to Lemma 3.1, Theorem 3.2 and Theorem 3.4.

3. Convergence analysis. The convergence e timate on the convergence of the Ritz value [[theta].sup.(i+1).sub.j] with j = 1, ..., s and i [greater than or equal to] 0, see Algorithm 1, toward the eigenvalue of (A, M) are proved in this section. In Theorem 3.2 together with Corollary 3.3 the convergence of the largest of these Ritz values, namely [[theta]'.sub.s] := [[theta].sup.(i+1).sub.s], is analyzed. It is shown that the Ritz value [[theta]'.sub.s] of (A, M) in span{V, T(AV - MV[THETA])} is bounded from above by the largest Ritz value which can be attained by the vector iteration PSD (1.4) if applied elementwise to the column space of V. Theorem 3.4 proves sharp estimates for the remaining Ritz values by induction.

In the following analysis we assume that the preconditioner satisfies the inequality [[parallel]I - TA[parallel].sub.A] [less than or equal to] [gamma] < 1. This assumption is equivalent to (1.5) with [gamma] := ([[gamma].sub.2] - [[gamma].sub.1])/([[gamma].sub.2] + [[gamma].sub.1]) if the preconditioner is optimally scaled. The PSD iteration implicitly determines this optimal scaling by the Rayleigh-Ritz procedure.

The next lemma generalizes Lemma 3.1 from [14].

Lemma 3.1. Let the columns of V [member of] [R.sup.n x s] form a Ritz basis of span{V}, and let T be a symmetric and positive definite matrix with [[parallel]I - TA[parallel].sub.A] [less than or equal to] [gamma] < 1.

i) For any a [member of] [R.sup.s] \ {0} and any c [member of] [R.sup.s] it holds that

(3.1) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

ii) The matrix [omega]V[THETA] + (V - T(AV - MV[THETA])) preserves for all [omega] [member of] R the full rank of the matrix V.

iii) Let x be given as in Theorem 1.1, and let [??] [member of] [R.sup.n] satisfy

(3.2) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

If

(3.3) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

then the PSD estimate (1.6) applies to the Ritz vector x' given by (3.3) and its Rayleigh quotient [rho](x').

Proof. i) The A-orthogonality, see Fig. 3.1,

[(Vz, ([A.sup.-1]MV[THETA] - V)a).sub.A] = [z.sup.T][V.sup.T](MV[THETA] - AV)a = [z.sup.T]([THETA] - [THETA])a = 0 [for all]z [member of] [R.sup.s]

shows that Va is the A-orthogonal projection of [A.sup.-1]MV[THETA]a on span{V}. This guarantees that

[[parallel][A.sup.-1]MV[THETA]a - Va[parallel].sub.A] [less than or equal to] [[A.sup.-1] MV[THETA]a - [V.sub.c][parallel].sub.A]

for all c [member of] [R.sup.s].

ii) Direct computation shows that for a [member of] [R.sup.s] \ {0}

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

The last inequality holds with c = -[omega][THETA]a due to (3.1). The fact [[parallel][omega]V[THETA]a + (V - T(AV - MV[THETA]))a[parallel].sub.A] > 0 for all a [not equal to] 0 proves that the matrix [omega]V[THETA] + (V - T(AV - MV[THETA])) has the full rank.

iii) Inequality (3.2) is equivalent to [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. This means that [rho](x)[??] is contained in a ball with the center [rho](x)[A.sup.-1]Mx and the radius [gamma][[parallel][rho](x)[A.sup.-1] Mx -x[parallel].sub.A]. The geometry of the fixed-step-length preconditioned gradient iteration (see [10, 17]) shows that p(x)x is representable by a symmetric and positive definite [??] with [[parallel]I - [??]A[parallel].sub.A] [less than or equal to] [gamma] in the form

[rho](x)[??] = x - [??](Ax - [rho](x)Mx).

The preconditioned steepest descent iteration accelerates the convergence of this fixed-steplength iteration by applying the Rayleigh-Ritz procedure to

span{x, [??](Ax - [rho](x)Mx)} -- span{x, [??]}.

Thus the smallest Ritz value in this space is bounded from above by the estimate (1.6). []

The next theorem proves the desired convergence behavior for the largest Ritz value [[theta]'.sub.s]. Comparable estimates for the largest Ritz values, but with respect to different spaces, have been used in [14, Theorem 3.2] for the subspace analysis of the preconditioned inverse iteration and in [19, Theorem 3.2] for the block steepest descent iteration.

THEOREM 3.2. Let span{V} contain no eigenvector of A, which otherwise could easily be split off within the Rayleigh-Ritz procedure. Then the s-th Ritz value [[theta]'.sub.s] of (A, M) in W := span(V,T(AV - MV[THETA])}, which is given by

(3.4) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

satisfies

(3.5) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Proof. Let [a.sup.*] [member of] [R.sup.s] \ {0} and [[omega].sup.*] [member of] R be given in a way so that the max- min in (3.5) is attained

(3.6) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Hence ([a.sup.*], [[omega].sup.*]) is a saddle point of the Rayleigh quotient. In a sufficiently small neighborhood Q x [OMEGA] of ([a.sup.*], [[omega].sup.*]) the Rayleigh quotient [rho](x) is a smooth function which is concave in and convex in [omega]. Sion's-minimax theorem [22, 26] provides the justification to swap the order of minimization and maximization, i.e.,

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Thus [rho]([omega][W.sub.1]a + [W.sub.2]a) is the Rayleigh quotient of a with respect to the projected matrix-pair ([W.sup.T] AW, [W.sup.T] MW) with W = [omega][W.sub.1] + [W.sub.2]. The local maximum in a [member of] Q coincides with the global maximum of in the column space of (because the maximum is taken in the interior of and because the Hessian of the Rayleigh quotient shows that there is only one local extremum which is a maximum). We conclude that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

We denote the minimum point with respect to [omega] [member of] [OMEGA] by [bar.[omega]] so that [[??].sub.s] = [[theta].sub.max]([bar.[omega]][W.sub.1] + [W.sub.2]). Therefore [[??].sub.s] is the largest Ritz value of (A, M) in the column space [bar.Z] of [bar.[omega]][W.sub.1] + [W.sub.2] [member of] [R.sup.n x s]. Lemma 3.1 guarantees that the dimension of [bar.Z] equals s. Since [bar.Z] is a subspace of W one gets with (3.4) that [[theta].sub.'s] [less than or equal to] [max.sub.z [member of] [bar.Z]\{0}] [rho](z) = [[??].sub.s]. []

COROLLARY 3.3. If the s-th Ritz value [[theta].sub.s] of (A, M) in V is contained in ([[lambda].sub.q], [[lambda].sub.q+1]), q [member of] {1, ..., n - 1}, then for the s-th Ritz value [[theta]'.sub.s] of (A, M) in span{V,T(AV - MV[theta])} it holds that either or [[theta]'.sub.s] [less than or equal to] [[lambda].sub.q] or

(3.7) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Proof. If q = n - 1, then the first alternative [[theta]'.sub.s] [less than or equal to] [[lambda].sub.n-1] applies since the s-th Ritz value of (A, M) in the at least (s + 1)-dimensional subspace span{V,T(AV - MV[THETA])} is smaller than or equal to [[lambda].sub.n-1] (due to the min-max principles).

Next let q < n - 1 and assume [[lambda].sub.q] < [[theta]'.sub.s] for which (3.7) is to be proved. For [[lambda].sub.q] [less than or equal to] [theta] < [[lambda].sub.r] the function is monotone increasing in . Theorem 3.2 shows that [[theta]'.sub.s][less than or equal to][[??].sub.s] so that

(3.8)[[DELTA].sub.q,q+1]([[theta]'.sub.s])[less than or equal to][[DELTA].sub.q,q+1]([[theta]'.sub.s]).

In order to apply Lemma 3.1, Case iii, to x = V[THETA][a.sup.*] and [??] = (V - T(AV - MV[THETA]))[a.sup.*] we show that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

The last inequality is proved with (3.1). Thus Case iii in Lemma 3.1 guarantees that the PSD estimate (1.6) can be applied to the smallest Ritz value of (A,M)in span {x,[??]}. By (3.6) this Ritz vector is [[omega].sup.*]x + [??] with the Ritz value [[theta]'.sub.s]. Hence the vectorial PSD estimate (1.6) results in

(3.9) [[DELTA].sub.q,q+1]([[??]'.sub.s])[less than or equal to][([kappa] + [gamma](2 - [kappa])/(2 - [kappa]) + [gama][kappa]).sup.2][[DELTA].sub.q,q+1]([rho](x)).

This concludes the proof since [rho](x), x [member of] span{V}, is bounded by the largest Ritz value [[theta].sub.s] of in , i.e.,

(3.10) [[DELTA].sub.q,q+1]([rho](x))[less than or equal to][[DELTA].sub.q,q+1]([[theta].sub.s]).

The chain of inequalities (3.8)-(3.10) proves the proposition.

The convergence estimates for the remaining Ritz values [[theta]'.sub.i], i = 1, ..., s - 1, follow from Corollary 3.3 by induction. Comparable estimates for the remaining Ritz values, but with respect to different spaces, have been used in [14, Theorem 3.3] for the subspace analysis of the preconditioned inverse iteration and in [19, Theorem 3.1] for the block steepest descent iteration.

THEOREM 3.4. Let V be an s-dimensional subspace of the [R.sup.n]. The Ritz vectors of (A,M) in V are denoted by [v.sub.1], ..., [v.sub.s] and let V := [[v.sub.1], ..., [v.sub.s]] [member of][R.sup.nxs]. The associated Ritz values are with [[theta].sub.i] = [rho]([v.sub.i]) with [[theta].sub.1][less than or equal to] ... [less than or equal to] [[theta].sub.s] Indices [k.sub.i] are given so that [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

The s smallest Ritz values of (A,M) in spam{V,T(AV - MV[THETA])} are denoted by [[theta]'.sub.i] with [[theta]'.sub.1] [less than or equal to] [[theta]'.sub.s]. Then for each i [member of] {1, ..., s} it holds that [[theta]'.sub.i] and either or [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] or

(3.11) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Here [[DELTA].sub.p,q]([theta]):= ([theta] - [[lambda].sub.p])/([[lambda].sub.q] - [theta]).

The bound (3.11) cannot be improved in terms of the eigenvalues of (A, M) as for each i the upper bound is attained for [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] in the 3D invariant subspace associated with the eigenvalues [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] and [[lambda].sub.n]

Proof. The proof is given by induction over the subspace dimension . For a one-dimensional subspace V = span{x}, Theorem 1.1 proves the Ritz value estimate (3.11) with [rho](x') = [[theta]'.sub.1].

For an s-dimensional subspace the Ritz vectors [v.sub.i] of (A,M) are arranged in the columns of V = [[v.sub.1], ..., [v.sub.s]] =: [[V.sup.(s-1)], [v.sub.s]][member of][R.sup.nxs]. Let [[theta]'.sub.i] ([V.sup.(s-1)]) be the s - 1 smallest Ritz values of (A,M) in

span{[V.sup.(s-1)], T([AV.sup.(s-1)] - [MV.sup.(s-1)] [[THETA].sup.(s-1)])}, [[THETA].sup.(s-1)]:= diag([[theta].sub.1], ..., [[theta].sub.s-1]),

with [[theta]'.sub.1]([V.sup.(s-1)])[less than or equal to] ... [less than or equal to][[theta]'.sub.s-1] ([V.sup.(s- 1)]). The induction hypothesis with respect to the (s - 1) - dimensional space reads

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

The Courant-Fischer variational principles show that these s - 1 smallest Ritz values [[theta]'.sub.1]([V.sup.(s-1)]) decrease when span {[V.sup.(s-1)]} is enlarged to v, i.e.,

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Finally, [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] is a monotone increasing function in [theta] so that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

which proves the proposition for the first s - 1 smallest Ritz values. For the Ritz value [[theta].sub.s] Corollary 3.3 completes the proof.

The sharpness of the estimate (3.11) is a consequence of Theorem 1.1 since for [[theta].sub.i] [member of] ([MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]) the columns of can be formed by the vector of poorest convergence in [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] and all other columns are taken as eigenvectors with indexes different form [k.sub.i], [k.sub.i] + 1 and n. Then the subspace iteration behaves like the vectorial preconditioned steepest descent iteration due to stationarity of the iteration in the eigenvectors.

COROLLARY 3.5. The Ritz value convergence estimate (3.11) cannot be improved in terms of the eigenvalues [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] and [[lambda].sub.n] and is attainable for [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Proof. For [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] let [v.sub.i] be an M-normalized vector in the invariant subspace [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] with [rho]([v.sub.i]) = [[theta].sub.i]. The remaining columns of V are filled with pairwise different eigenvectors of which are M-orthogonal to [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. Then a step of the block steepest descent iteration, Algorithm 1, leaves all the eigenvectors invariant. The convergence of the ith column [v.sub.i] is exactly that of the vectorial preconditioned steepest descent iteration as treated in Theorem 1.1 because the iteration is stationary in all eigenvectors. The nonimprovability of the convergence estimate in terms of the eigenvalues has already been treated in Theorem 1.1.

Corollary 3.5 shows that the convergence estimate (3.11) cannot be improved in terms of the eigenvalues without further assumptions on the subspace V. Hence cluster robust convergence estimates, which should depend in some way on the ratio [[lambda].sub.i]/[[lambda].sub.s+1], are not provable in terms of the convergence measure [[DELTA].sub.q,q+1] as used in Theorem 3.4. The numerical experiments in Section 4 illustrate that the block preconditioned steepest descent iteration behaves as a cluster robust iteration. In order to derive cluster robust convergence estimates, additional assumptions have to be made: For instance the angle between the iteration subspace and the invariant subspace has to be bounded.

4. Numerical experiments. The block preconditioned steepest descent iteration is applied to the Laplacian eigenvalue problem

(4.1) -[DELTA]u(x) = [lambda]u(x), x [member of][OMEGA]: = {(r cos ([phi]), r sin ([phi])): r [member of][0,1], [phi][member of][0, 2[pi]]}

on the unit circle with a slit along the positive abscissa. Homogeneous Dirichlet boundary conditions are given on the boundary for r = 1 and on the upper side of the slit {r, [phi]): r [member of][0,1], [phi] = 0}. Homogeneous Neumann boundary conditions are used on the lower side of the slit. The numerical approximations of the eigenvalues and eigenfunctions can be compared with the exact solutions of (4.1). The eigenfunctions are sin [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]; see Fig. 4.1. Here [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] is a Bessel function of first kind of fractional order [1] and [[alpha].sub.k]:= [1/2] k + [1/4]. The eigenvalues are the squares of the positive zeros [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

The operator eigenvalue problem is discretized by linear finite elements on a triangle mesh. Our program code AMPE (Adaptive Multigrid Preconditioned Eigensolver) in FORTRAN is an adaptive multigrid finite element code with an edge oriented error estimator which uses linear and quadratic finite elements. All test computations have been executed on a personal computer with an Intel Xeon 3.2GHz CPU and with a RAM of 31.4GiB. Our finite element code on this computer can solve eigenvalue problems that exceed 50 x [10.sup.6] degrees of freedom. The program includes multigrid preconditioning with Jacobi smoothing. The FORTRAN code is embedded in a Matlab GUI which allows easy and convenient usage of the program and has graphical presentation of its output.

Experiment I. In this first experiment the block preconditioned steepest descent iteration is applied in a 3-dimensional subspace; we refer to this algorithm as PSD(s = 3). Further, the eigenvalue problem is discretized on a series of uniform triangle meshes to which nested iteration is applied. The first grid level comprises 21 nodes from which 15 are located on the Dirichlet boundary. This gives initial degrees of freedom. On this coarsest level the eigenvalue problem is solved exactly (aside from rounding errors). Piecewise linear interpolation is used to prolongate the approximate eigenfunctions from one grid level to the next refined level. The multigrid preconditioner is a V-cycle multigrid solver with damped Jacobi-smoothing; the damping constant is [omega] = 2/3 and two pre-smoothing and post-smoothing iterations are applied on each level.

The quality of the preconditioner is controlled by a stopping condition for the linear system solver. For each Ritz vector the bound [parallel] A(Tr) - r [[parallel].sub.2] < 10/n is tested. Here r denotes the residual of a Ritz vector, Tr is the preconditioned residual and n is the dimension of the discrete problem on the current level. In most cases only one V-cycle is required to reach this accuracy since the initial solution on a refined grid is the prolongation of the solution from the coarser grid.

The stopping criterion for PSD(s = 3) is [parallel] r [[parallel].sub.T] = [([r.sup.T]Tr).sup.1/2] < [10.sup.-10] where r runs through the residuals [Av.sub.i] - [rho]([v.sub.i])[Mv.sub.i] for the Ritz vectors [v.sub.1], [v.sub.2] and [v.sub.3]. This stopping criterion is justified by the generalized Temple inequality (see [3, Lemma 3 in Chapter 11]), which is the first inequality in

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

The second inequality follows with [parallel] I - TA [[parallel].sub.A][less than or equal to][gamma] < 1. Hence [parallel] r [[parallel].sup.2.sub.T] is an upper bound for the product of the relative distances of [rho](x) to the enclosing eigenvalues [[lambda].sub.i] and [[lambda].sub.i+1].

The nested iteration is stopped on the level l = 12 with 50348033 nodes and 50319360 degrees of freedom. Figure 4.2 (left) shows that the computational costs increase more or less linearly in the dimension of the problem which indicates the near optimal complexity of the PSD(s = 3) solver. Figure 4.2 also shows the errors [[theta].sup.(k).sub.i] - [[lambda].sub.i], i = 1,2,3, for the three smallest eigenvalues

[[lambda].sub.1] = [[xi].sup.2.sub.0,1][approximately equal to] 7.733337, [[lambda].sub.2] = [[xi].sup.2.sub.1,1][approximately equal to], [[lambda].sub.3] = [[xi].sup.2.sub.021][approximately equal to] 17.35078.

The error [[theta].sup.(k).sub.1] - [[lambda].sub.1] is relatively large since the associated eigenfunction has an unbounded derivative at the origin. The next experiment shows that the approximation of this eigenfunction clearly profits from an adaptively generated grid.

Experiment II. Next we show that adaptive mesh generation with a posteriori edge oriented error estimation similar to that in [15] results in much better approximations. The error estimator computes the eigenvector residuals with respect to quadratic finite elements for Ritz vectors which are represented by linear finite elements. The largest (in modulus) components of the residual are used to select those edges which belong to triangles that are to be refined.

Once again, the PSD(s = 3) solver is used. In order to compute a grid which allows an optimal approximation of the eigenfunction associated with the smallest eigenvalue, the error estimation and grid refinement aim at an optimal approximation of just this eigenfunction. Figure 4.3 shows a relatively coarse triangulation of the unit circle and sectional enlargements of finer triangulations around the origin, where the depth of the triangulations takes its largest values due to the unbounded derivative [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] around r = 0. The adaptive process generates highly non-uniform triangulations. Further details of the adaptive process and its more or less linearly increasing costs (as a function of the d.o.f) are shown in Figure 4.4. The resulting smallest Ritz values [[theta].sub.1], which approximate the smallest eigenvalue [[lambda].sub.1][approximately equal to] 7.733337, are as follows.

Depth of triang. 1 30 43 57 68 73 Nodes 21 10709 108693 1185777 10961756 34157092 D.o.f. 6 10409 107630 1182184 10951337 34137627 [[theta]] 12.95561 7.738704 7.733789 7.733379 7.733341 7.733338 .sub.1

In contrast to this, a uniform refinement yields a final depth 12 with 50348033 nodes only in a poor approximation [[theta].sub.1] a comparable quality of approximation can be gained by the adaptive process on level 19 with 2377 nodes and [[theta].sub.1] = 7.762841.

Finally, the results of the PSD iteration to approximate the 15 smallest eigenvalues in a 20-dimensional subspace are listed in Table 4.1.

Experiment III. Next the case of the poorest convergence of the block preconditioned steepest descent iteration is explored. Therefore, we use the final grid from Experiment II with about 3.41 x [10.sup.7] d.o.f. and apply the PSD(s = 3) iteration. According to Theorem 1.1, the poorest convergence of the vectorial iteration PSD(s = 1) is attained in the invariant subspace [[epsilon].sub.i,i+1,n]. The subspace iteration behaves similarly. To show this we consider subspaces which are spanned by a single nonzero vector from [[epsilon].sub.i,i+1,n] whereas all the other basis vectors are eigenvectors of (A,M )with indexes different from i,i + 1 and n. Then block-PSD method behaves like a vectorial iteration due to the stationarity in the eigenvectors. Theorem 1.1 provides the convergence estimate for the single vector from . Figure 4.5 shows in the intervals ([[lambda].sub.i], [[lambda].sub.i+1]) the upper bounds ([kappa] + [gamma][(2 - [kappa]) + [gamma][kappa]).sup.2] (dashed lines) and the largest ratios [[DELTA].sub.i,i+1]([[theta]'.sub.i])/[[DELTA].sub.i,i+1]([[theta].sub.i]) for 1000 equispaced normalized test vectors in [[epsilon].sub.i,i+1,n] whose Rayleigh quotients equal [[theta].sub.i]. All this is done for equidistant [[theta].sub.i][member of] ([[lambda].sub.i], [[lambda].sub.i+1]). In each interval [[lambda].sub.i], [[lambda].sub.i+1]) the estimate (3.11) is sharp, cf. Theorem 3.4, and can be attained for [[theta].sub.i] [right arrow] [[lambda].sub.i].

Experiment IV. In this experiment the sharp single-step estimates (3.11) are compared with multi-step estimates for the PSD(s = 3) iteration. For each grid level with the initial subspace [V.sup.(0,l)], which is the prolongation of the final subspace from the level l - 1, is of sufficient quality so that the three Ritz values of ([A.sub.l], [M.sub.l]) in [V.sup.(0,l)] have reached their "destination interval", i.e.,

[[theta].sub.i]([V.sup.(0,l)]) [member of]([[lambda].sup.(l).sub.i], [[lambda].sup.(l).sub.i+1], i = 1,2,3,

so that the Ritz values [[theta].sup.(k,l).sub.i](k is the iteration index on level l) do not leave this interval. Here, [[lambda].sup.(l).sub.i] are the eigenvalues of ([A.sub.l], [M.sub.l]) with respect to the grid level l. Theorem 3.4 guarantees that [limit.sub.k[right arrow][infinity]][[theta].sup.(k,l).sub.i] = [[lambda].sup.(l).sub.i].

All this allows us to apply the 1-step estimates

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

recursively, which yields the multistep estimate

(4.3) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where [[kappa].sup.(l)] is given by (3.11) after substitution of [[lambda].sub.i] by [[lambda].sup.(l).sub.i] for the relevant indexes i. The parameter [gamma] = [parallel]I - TA[[parallel].sub.A] for the quality of the preconditioner is approximated on each grid level by computing the spectral radius of I - TA with the power method. For this experiment we use again two steps of pre-/post-smoothing on each level with damped ([omega] = 2/3 Jacobi-smoothing. This results in [gamma][approximately equal to] 0.78. The discrete eigenvalues [[lambda].sup.(l).sub.i] are estimated by extrapolation from the computed Ritz values.

Figure 4.6 shows the multistep bound (4.3) as a bold line, the 1-step bound as a dotted line and the numerical result as a thin solid line. The 1-step estimate (4.2) is a very good upper estimate. In all cases the multistep estimate (4.3) is a relatively rough estimate. It accumulates the over-estimation of the error from step to step and suffers from its inability to use the current [[theta].sup.(k,l).sub.i] on the right-hand side of the estimate in order to improve the quality of the upper bound.

The [DELTA]([theta])-ratio depends on the discrete eigenvalues [[lambda].sup.(l)] and decreases monotonically for the iteration on each grid level; the ratio may increase after prolongation to a refined grid level. The somewhat oscillating behavior of the [DELTA][theta]-ratio for [[lambda].sub.2] in contrast to the smoother behavior for [[lambda].sub.1] reflects the fact that the error estimation and grid generation is controlled by error estimates for the first eigenfunction. The second eigenfunction also profits from the grid refinement (cf. Figure 4.4) but the [DELTA]([theta])-ratio shows a stronger variation for changing level index l.

5. Conclusion. This paper concludes our efforts of analyzing preconditioned gradient iterations and their subspace variants with either fixed step length (case of inverse iteration and preconditioned inverse iteration) or with optimal step-length (case of steepest descent and preconditioned steepest descent). Within the hierarchy of preconditioned gradient iterations, as suggested in [13], these solvers are denoted as INVIT(k,s) and PINVIT(k,s) with = 1,2 and subspace dimensions s [member of] N.

For all these iterative eigensolvers sharp convergence estimates have been derived which have the common form

[[DELTA].sub.i,i+1]([rho](x'))[less than or equal to][[sigma].sup.2][[DELTA].sub.i,i+1]([rho](x))

with [[DELTA].sub.i,i+1]([xi]) = ([xi] - [[lambda].sub.i])/[[lambda].sub.i+1 - [xi]] and convergence factors [sigma]. The following sharp convergence estimates have been gained:

Interactive Eigensolver Convergence Ref. factor Inverse iteration [sigma] = [[lambda].sub.1] [16] Preconditioned inverse [sigma] = [gamma] + (1 - [gamma]) [10] iteration [[lambda].sub.1]/[[lambda].sub.1] Block preconditioned + 1 inverse iteration Steepest descent [sigma] = [kappa]2-[kappa], [kappa] [18] = [[lambda].sub.i]([[lambda].sub.n] -[[lambda].sub.i+1])/[[lambda]. sub.i+1] ([[lambda].sub.n] - [[lambda].sub.i] Precond. steepest descent [sigma] = [kappa] + [gamma] [17] Block precond. steepest (2-kappa])/(2-[kappa]) + [gamma] descent [kappa], [kappa] = [[lambda].sub.i] [[lambda].sub.n1] - [[lambda]. sub.i+1] ([[lambda].sub.n] - [[lambda].sub.i]

Scientific efforts for the future are aimed at a convergence analysis of the important locally optimal preconditioned conjugate gradient (LOPCG) iteration [9]. As the convergence behavior of the LOPCG eigensolver has been observed to behave similarly to the conjugate gradient iteration for linear systems, sharp convergence estimates are highly desired.

Acknowledgement. The authors would like to express their gratitude to the anonymous referee for his very positive feedback and constructive remarks!

REFERENCES

[1] G. ARFKEN, Mathematical Methods for Physicists, Academic Press, San Diego, 1985.

[2] Z. BAI, J. DEMMEL, J. DONGARRA, A. RUHE, AND H. VAN DER VORST (eds.), Templates for the Solution of Algebraic Aigenvalue Problems: A Practical Guide, SIAM, Philadelphia, 2000.

[3] E. G. D'YAKONOV, Optimization in Solving Elliptic Problems, CRC Press, Boca Raton, 1996.

[4] G. H. GOLUB AND C. F. VAN LOAN, Matrix Computations, 3rd ed., Johns Hopkins University Press, Baltimore, 1996.

[5] M. R. HESTENES AND W. KARUSH, A method of gradients for the calculation of the characteristic roots and vectors of a real symmetric matrix, J. Research Nat. Bur. Standards, 47 (1951), pp. 45-61.

[6] L. V. KANTOROVICH, Functional Analysis and Applied Mathematics, National Bureau of Standards, Los Angeles, 1952.

[7] L. V. KANTOROVICH AND G. P. AKILOV, Functional Analysis in Normed Spaces, Pergamon, Oxford, 1964.

[8] A. V. KNYAZEV, Convergence rate estimates for iterative methods for a mesh symmetric eigenvalue problem, Soviet J. Numer. Anal. Math. Modelling, 2 (1987), pp. 371-396.

[9] --, Preconditioned eigensolvers--an oxymoron?, Electron. Trans. Numer. Anal., 7 (1998), pp. 104-123. http://etna.math.kent.edu/vol.7.1998/pp104-123.dir

[10] A. V. Knyazev and K. Neymeyr, A geometric theory for preconditioned inverse iteration. III: A short and sharp convergence estimate for generalized eigenvalue problems, Linear Algebra Appl., 358 (2003), pp. 95-114.

[11] --, Gradient flow approach to geometric convergence analysis of preconditioned eigensolvers, SIAM J. Matrix Anal. Appl., 31 (2009), pp. 621-628.

[12] A. V. KNYAZEV AND A. L. SKOROKHODOV, On exact estimates of the convergence rate of the steepest ascent method in the symmetric eigenvalue problem, Linear Algebra Appl., 154/156 (1991), pp. 245-257.

[13] K. NEYMEYR, A Hierarchy of Preconditioned Eigensolvers for Elliptic Differential Operators, Habilitation Thesis, Mathematische Fakultat, Universitat Tubingen, 2001.

[14] --, A geometric theory for preconditioned inverse iteration applied to a subspace, Math. Comp., 71 (2002), pp. 197-216.

[15] --, A posteriori error estimation for elliptic eigenproblems, Numer. Linear Algebra Appl., 9 (2002), pp. 263-279.

[16] --, A note on inverse iteration, Numer. Linear Algebra Appl., 12 (2005), pp. 1-8.

[17] --, A geometric convergence theory for the preconditioned steepest descent iteration, SIAM J. Numer. Anal., 50 (2012), pp. 3188-3207.

[18] K. NEYMEYR, E. OVTCHINNIKOV, AND M. ZHOU, Convergence analysis of gradient iterations for the symmetric eigenvalue problem, SIAM J. Matrix Anal. Appl., 32 (2011), pp. 443-456.

[19] K. NEYMEYR AND M. ZHOU, Iterative minimization of the Rayleigh quotient by the block steepest descent iteration, Numer. Linear Algebra Appl., in press, doi:10.1002/nla.1915.

[20] E. OVTCHINNIKOV, Convergence estimates for the generalized Davidson method for symmetric eigenvalue problems II: The subspace acceleration, SIAM J. Numer. Anal., 41 (2003), pp. 272-286.

[21] B. N. PARLETT, The Symmetric Eigenvalue Problem, Prentice Hall, Englewood Cliffs, 1980.

[22] J. PONSTEIN, An extension of the min-max theorem, SIAM Rev., 7 (1965), pp. 181-188.

[23] V. G. PRIKAZCHIKOV, Strict estimates of the rate of convergence of an iterative method of computing eigenvalues, U.S.S.R. Comput. Math. Math. Phys., 15 (1975), pp. 235-239.

[24] Y. SAAD, Numerical Methods for Large Eigenvalue Problems, Manchester University Press, Manchester, 1992.

[25] B. A. SAMOKIS, The steepest descent method for an eigenvalue problem with semi-bounded operators, Izv. Vyssh. Uchebn. Zaved. Mat., 5 (1958), pp. 105-114.

[26] M. SION, On general minimax theorems, Pacific J. Math., 8 (1958), pp. 171-176.

[27] A. STATHOPOULOS, Nearly optimal preconditioned methods for Hermitian eigenproblems under limited memory. I. Seeking one eigenvalue, SIAM J. Sci. Comput., 29 (2007), pp. 481-514.

[28] P. F. ZHUK AND L. N. BONDARENKO, Sharp estimates for the rate of convergence of the s-step method of steepest descent in eigenvalue problems, Ukrain. Mat. Zh., 49 (1998), pp. 1694-1699.

KLAUS NEYMEYR ([dagger]) AND MING ZHOU ([dagger])

* Received January 23, 2014. Accepted February 24, 2014. Published online on May 23, 2014. Recommended by M. Hochstenbach.

([dagger]) Universitat Rostock, Institut fur Mathematik, UlmenstraBe 69, 18055 Rostock, Germany ({klaus.neymeyr, ming.zhou}@uni-rostock.de).

TABLE 4.1 The 15 smallest eigenvalues [[xi].sup.2.sub.k, l] of (4.1). Left: Ritz values in a 1047534-dimensional linear finite element space. Center: Ritz values in a 10052735-dimensional linear finite element space. Right: "Exact" eigenvalues of (4.1) k\l 1 2 0 7.733389 34.88339 1 12.18725 44.25893 2 17.35102 54.36164 3 23.19983 65.17971 4 29.71530 76.70204 5 36.88311 6 44.69164 7 53.13167 8 62.19503 9 71.87493 k\l 1 2 0 7.733342 34.88260 1 12.18715 44.25768 2 17.35080 54.35978 3 23.19943 65.17704 4 29.71460 76.69829 5 36.88199 6 44.69009 7 53.12939 8 62.19189 9 71.87073 k\l 1 2 0 7.733337 34.88252 1 12.18714 44.25756 2 17.35078 54.35960 3 23.19939 65.17677 4 29.71453 76.69790 5 36.88189 6 44.68994 7 53.12918 8 62.19159 9 71.87033

Printer friendly Cite/link Email Feedback | |

Author: | Neymeyr, Klaus; Zhou, Ming |
---|---|

Publication: | Electronic Transactions on Numerical Analysis |

Article Type: | Report |

Geographic Code: | 4EUGE |

Date: | Apr 1, 2014 |

Words: | 7623 |

Previous Article: | On convergence rates for quasi-solutions of ill-posed problems. |

Next Article: | Special volume: dedicated to Lothar Reichel on the occasion of his sixtieth birthday. |

Topics: |