Printer Friendly

Efficient preconditioning for sequences of parametric complex symmetric linear systems.

Abstract. Solution of sequences of complex symmetric linear systems of the form [A.sub.j][x.sub.j] = [b.sub.j], j = 0, ..., s, [A.sub.j] = A + [[alpha].sub.j][E.sub.j], A Hermitian, [E.sub.0], ..., [E.sub.s] complex diagonal matrices and [[alpha].sub.0], ..., [[alpha].sub.s] scalar complex parameters arise in a variety of challenging problems. This is the case of time dependent PDEs; lattice gauge computations in quantum chromodynamics; the Helmholtz equation; shift-and-invert and Jacobi-Davidson algorithms for large-scale eigenvalue calculations; problems in control theory and many others. If A is symmetric and has real entries then [A.sub.j] is complex symmetric.

The case A Hermitian positive semidefinite, Re([[alpha].sub.j]) [greater than or equal to] 0 and such that the diagonal entries of [E.sub.j], j = 0, ..., s have nonnegative real part is considered here.

Some strategies based on the update of incomplete factorizations of the matrix A and [A.sup.-1] are introduced and analyzed. The numerical solution of sequences of algebraic linear systems from the discretization of the real and complex Helmholtz equation and of the diffusion equation in a rectangle illustrate the performance of the proposed approaches.

Key words. Complex symmetric linear systems; preconditioning; parametric algebraic linear systems; incomplete factorizations; sparse approximate inverses.

AMS subject classifications. 65F10, 65N22, 15A18.

1. Introduction. The numerical solution of several problems in scientific computing requires the solution of sequences of parametrized large and sparse linear systems of the form

[A.sub.j][x.sub.j] = [b.sub.j], [A.sub.j] = A + [[alpha].sub.j] [E.sub.j], j = 0, ..., s (1.1)

where [[alpha].sub.j] [member of] C are scalar quantities and [E.sub.0], ..., [E.sub.s] are complex symmetric matrices. Here we will consider the case where [E.sub.j], j = 0, ..., s are diagonal matrices. Among these problems we recall partial differential equations (PDEs) ([l, 6, 7, 16] (time-dependent PDEs using implicit difference schemes; the Helmholtz equation; the Schrodinger equation, etc.); large and sparse eigenvalues computation (see, e.g., [12]) and model trust region, nonlinear least squares problems and globalized Newton algorithms in optimization; see, e.g., [11].

In the above mentioned frameworks, sequences of linear systems with matrices as in (1.1), each one with a possibly different right hand side and initial guess, have to be solved. Therefore, it would be desirable to have a strategy for modifying an existing preconditioner at a cost much lower than recomputing a preconditioner from scratch, even if the resulting preconditioner can be expected to be less effective than a brand new one in terms of iteration count.

In most of the above mentioned applications, the matrix A is large and sparse, and preconditioned Krylov subspace methods are used to solve the linear systems. Hence, there are several situations where it is desirable to use cost-effective modifications of an initial preconditioner for solving such a sequence of linear systems.

In this paper, we propose to solve (1.1) by Krylov methods using preconditioning strategies based on operators [P.sub.j] approximating [A.sub.j] such that

[P.sub.j] = [??] + [[alpha].sub.j][K.sub.j], j = 0, ..., s, (1.2)

where [??] is an operator which approximates A in (1.1), [[alpha].sub.0], ..., [[alpha].sub.s] are complex scalar parameters and [K.sub.j], j = 0, .., s, are suitable corrections related to [E.sub.j], j = 0, .., s. In particular, we propose strategies to update incomplete factorizations with threshold for A and [A.sup.-1]. The correction matrices [K.sub.j] are computed by balancing two main requirements: the updated preconditioners (1.2) must be a cheap approximation for [A.sub.j] and [parallel][A.sub.j] - [P.sub.j][parallel] must be small.

In the paper [5], we proposed an updated preconditioner for sequences of shifted linear systems

(A + [[alpha].sub.j]I)[x.sub.j] = [b.sub.j], [[alpha].sub.j] [member of] C, j = 0, ..., s, (1.3)

based on approximate inverse factorization which is effective when A is symmetric positive definite and [alpha] is real and positive. Here, these results will be extended to more general perturbations of the originating (or seed) matrix A (see (1.1)), i.e., matrices [A.sub.j] which are complex symmetric, and in the use of incomplete factorization as a basis of the updates. We stress that an update for the incomplete Cholesky factorization for shifted linear systems using a different technique was proposed in [17]. Note, however, that the standard incomplete Cholesky factorization is not entirely reliable even when applied to general positive definite matrices. On the other hand, our algorithms can be based on more robust preconditioners.

1.1. Krylov methods and shifted linear systems. Let us consider now the special case where [E.sub.j] - I, j = 0, ..., s, the initial guesses [x.sup.(0).sub.j] are all equal to the zero vector and [b.sub.j] = b, j = 0, ..., s. Then, the family of linear systems (1.1) can be written as

(A + [[alpha].sub.j]I)[x.sub.j] = b, [[alpha].sub.j] [member of] C, j = 0, ..., s, (1.4)

and each system in (1.4) generates the same Krylov subspace [K.sub.m] ([A.sub.j], [r.sup.(0).sub.j]) because

[K.sub.m]([A.sub.j], [r.sup(0).sub.j]) = [K.sub.m](A + [[alpha].sub.j]I, [r.sup(0).sub.j]) = [K.sub.m](A, [r.sup(0)]), (1.5)

[r.sup.(0)] = b - A[x.sup.(0)], [[alpha].sub.j] [member of] C, j = 0, ..., s,

see, e.g., [15]. Therefore, the most expensive part of the Krylov method for solving simultaneously the s+1 linear systems (1.4), the computation of the Krylov basis, can be computed only once. Note that the corresponding iterates and the residuals for each of the underlying linear systems can be computed without further matrix-vector multiplications; this kind of approach is the most popular in the literature. Another possibility is restating the problem as the solution of a linear system AX = B where X = [[x.sub.0] ... [x.sub.s]] and B = [[b.sub.0] ... [b.sub.s]]. This approach can solve multiple linear systems whose right hand side differ; see [19].

1.2. Why not preserve the Krylov subspace. There are many problems where the approaches in section 1.1 cannot be used. First, when [E.sub.j] [not equal to] I for at least a j [member of] {0, ..., s} in (1.1). Second, problems whose initial residuals [r.sup.(0).sub.j] are not collinear (example: right hand sides and/or the initial guesses are nonzero and/or differ) or [x.sup.(0).sub.j] and/or the right hand sides are not all available at the same time. Moreover, the linear systems in (1.1) (and, in particular, in (1.4)), can be ill-conditioned. Therefore, the underlying Krylov solver can converge very slowly. Unfortunately, the only preconditioners which preserve the same Krylov subspace (1.5) for all j, j = 0, ..., s (i.e., after the shift) are polynomial preconditioners, but they are not competitive with the approaches based on incomplete factorization. In particular, performing our numerical experiments for (1.1), we experienced that often it is more convenient to use the same preconditioner computed for A, which does not take care of the perturbations [E.sub.j], instead of a more sophisticated polynomial preconditioner. Note that polynomial preconditioning for shifted linear systems as in (1.4) has been considered in [13]. On the other hand, if the matrix [A.sub.j] in (1.1) is highly indefinite (i.e., there are more than few eigenvalues on both half complex plane), the approach proposed here using the standard ILUT or incomplete [LDL.sup.H]-threshold factorizations can fail and the use of Krylov methods for shifted linear systems can work better (in the framework (1.4), same initial guesses). However, an incomplete factorization for indefinite linear systems can be used to define an updated preconditioner for particular problems using our strategies. This will be considered in a future work.

Finally, we stress that the approaches considered in section 1.1 must be used in conjunction with (non-restarted) Krylov methods only. On the other hand, our algorithms do not have such restrictions. In particular, they can be used with restarted or oblique projection methods (e.g., restarted GMRES or any BICGSTAB) or with some other methods as well.

The paper is organized as follows. In Section 2 we give more comments on the solution of linear systems (1.1); Section 3 outlines our updated incomplete factorizations and their use as a preconditioner for the sequences of linear systems (1.1); in Section 4 we propose an analysis of the preconditioners and in Section 5 some numerical experiments. Finally, conclusions are given in Section 6.

2. Complex symmetric preconditioning. Consider the sequence of linear systems (1.1) where A is a Hermitian large, sparse, possibly ill-conditioned matrix, [b.sub.j] are given right-hand side vectors, [[alpha].sub.j] are (scalar) shifts, [E.sub.j] are complex symmetric matrices and [x.sub.j] are the corresponding solution vectors, j = 0, ..., s. The linear systems (1.1) may be given simultaneously or sequentially; the latter case occurs, for instance, when the right-hand side [b.sub.j] depends on the previous solution [x.sub.j-1], as in the case of time-dependent PDEs.

If [absolute value of [[alpha].sub.j]] in (1.1) is large, then an effective preconditioning strategy is based on

[P.sub.j] = [[alpha].sub.j][E.sub.j].

Indeed, we have that

[([[alpha].sub.j][E.sub.j]).sup.-1][A.sub.j] = I + [DELTA], [parallel][DELTA][parallel]/[parallel][A.sub.j][parallel] [less than or equal to] [delta]

and [delta] is small, [delta] [right arrow] for [absolute value of [[alpha].sub.j]] [right arrow] [infinity]. Moreover, preconditioning for the shifted problems in [5], where [alpha] [greater than or equal to] 0, [E.sub.j] = I for all j and the matrices A were all symmetric a positive definite, was found no longer beneficial as soon as [[alpha].sub.j] was of the order of [10.sup.-1]. On the other hand, we found cases where reusing a preconditioner computed for A for [[alpha].sub.j] of the order of [10.sup.-5] was ineffective. Therefore, by considering the above arguments and recalling that generating an approximation directly from the complex symmetric matrix [A.sub.j] = A + [[alpha].sub.j][E.sub.j] requires complex arithmetic, suggests that algorithms computing approximations for [A.sub.j] or [A.sup.-1.sub.j] working in real arithmetic can reduce significantly the overall computational cost.

3. The updated incomplete factorizations used as preconditioners. For simplicity of notation, we will consider a generic shift a and parametric matrix E dropping the subscripts.

We propose updates for generating approximations for [A.sub.j] as in (1.1) using approximations initially computed for the seed matrix A. The proposed preconditioners are based on incomplete factorizations (see, e.g., [18] and references therein) and approximate inverse factorizations of the type described in [2].

Let us suppose that A is Hermitian positive definite and that there exists an incomplete [LDL.sup.H] factorization defined as follows

P = [[??][??][??].sup.H] (3.1)

such that

A = [LDL.sup.H] [??] [[??][??][??].sup.H], (3.2)

where L, [??] are unit lower triangular and D, [??] are diagonal matrices, as usual. For later use, let us consider the (exact) inverse factorization of A such that

[A.sup.-1] = Z[D.sup.-1][Z.sup.H] [??] A = [Z.sup.-H]D[Z.sup.-1] [??] L = [Z.sup.-H], [L.sup.H] = [Z.sup.-1]. (3.3)

Moreover, let us suppose that [alpha]E has diagonal entries with positive real part. The proposed preconditioner based on (3.2) for [A.sub.[alpha],E] = A + [alpha]E is given by

[P.sub.[alpha],E] = [??] ([??] + [alpha]B) [[??].sup.H]. (3.4)

Note that it is customary to look for good preconditioners in the set of matrices approximating [A.sub.[alpha],E] which minimize some norm of [P.sub.[alpha],E] - [A[alpha],E]; see, e.g, [6, 7, 8].

Analogously, with a slight abuse of notation, we define the preconditioner for (1.1) based on the approximate inverse preconditioner for A

Q [equivalent to] [P.sup.-1] = [??][[??].sup.-1][[??].sup.H] (3.5)


[Q.sub.[alpha],E] [equivalent to] [P.sup.-1.sub.[alpha],B] = [??] [([??] + [alpha]B).sup.-1] [[??].sup.H], (3.6)

where the component matrices in (3.5) can be computed by the algorithm in [2] ,which is robust if A is a Hermitian positive (or negative) definite matrix. Of course, D + [alpha]B must be invertible.

The following result is a generalization of (4.5) in [5].

PROPOSITION 3.1. Let us consider the exact [LDL.sup.H]-factorization of A. By defining [P.sub.[alpha],E] = L (D + [alpha]B)[L.sup.H] and choosing B = [Z.sup.H]EZ we have [P.sub.[alpha],E] [equivalent to] [A.sub.[alpha],E], independently of [alpha].

Proof. Defining [P.sub.[alpha],E] := L (D + [alpha]B)[L.sup.H], i.e., using the exact [LDL.sup.H] factorization of A, we have:

[P.sub.[alpha],E] - [A.sub.[alpha],E] = [Z.sup.-H] (D + [alpha]B)[Z.sup.-1] - ([Z.sup.-H]D[Z.sup.-1] + [alpha]E) = [alpha] ([LBL.sup.H] - E) , (3.7)

and therefore

B = [L.sup.-1]E[L.sup.-H] = [Z.sup.H]EZ.

Unfortunately, we cannot use the form of B suggested in Proposition 3.1 in practice. Indeed, in general, only an incomplete [LDL.sup.H]-factorization of A (or an incomplete Z[D.sup.-1][Z.sup.H]-factorization of [A.sup.-1]) is available, i.e., we don't have L to generate [L.sup.-H] = Z or directly Z but only (sparse) approximations. On the other hand, if the exact Z was available, we would have to solve, at each iteration step of the Krylov subspace solver, a linear system whose matrix D + [alpha][Z.sup.H]EZ is usually full.

Similar to the approach in [5], for k [greater than or equal to] 1, we define the order k modified preconditioner (or the order k updated preconditioner) as

[P.sup.(k).sub.[alpha],E] := [??]([??] + [alpha][B.sub.k])[[??].sup.H] (3.8)

with the same notation as in (3.4) and where [B.sub.k] is the symmetric positive definite band matrix given by

[B.sub.k] = [[??].sup.T.sub.k] E [[??].sub.k]. (3.9)

[[??].sub.k] is obtained by extracting the k - 1 upper diagonals from [??] if k > 1 or [B.sub.1] = diag([[??].sup.H]E[??]) if k = 1.

Similarly, we define the order k (inverse) modified preconditioner (or the order k (inverse) updated preconditioner) as

[Q.sup.(k).sub.[alpha],E] := [??][([??] + [alpha][B.sub.k]).sup.-1][[??].sup.H]. (3.10)

Note that:

* [B.sub.k] is always nonsingular since [[??].sub.k] is a unit upper triangular matrix and therefore nonsingular and [alpha]E is diagonal whose entries have positive real part.

* From the previous item, [??] + [alpha][B.sub.k] is nonsingular. Indeed, [alpha]E = Re([alpha]E) + iIm([alpha]E) = ER + I[E.sub.I], where [E.sub.R] is a diagonal matrix whose real entries are non negative by hypothesis. Therefore, we can write

[??] + [alpha][B.sub.k] = ([??] + [[??].sup.H.sub.k][E.sub.R][[??].sub.k]) + i[[??].sup.H.sub.k][E.sub.I][[??].sub.k].

Let us consider the matrix in brackets. Recalling that [[??].sub.k] is nonsingular, we have that [[??].sup.H.sub.k][E.sub.R][[??].sub.k] is positive semidefinite and [??] has real positive entries. By using Gershgorin circles, we have the thesis.

* The order k preconditioners [P.sup.(k).sub.[alpha],E] based on incomplete [LDL.sup.H]-threshold factorization with k [greater than or equal to] 1 require the computation of the matrix [??] or of an approximation of Z by using [??]. Note that the computation of an approximation of Z for computing [B.sub.k] can be done just once for all scalars [alpha] and matrices E in (1.1). Note that the algorithm described in [3] generates [??] and [??] at the same time.

Whenever E = I, the choice B = I in (3.4), (3.6) is certainly effective if L [L.sup.H] - I is a small norm perturbation of the null matrix. In practice, the above condition occurs if the entries along the rows of Z in (3.3) decay rapidly away from the main diagonal. This happens if, e.g, A is strictly diagonally dominant; see Corollary 4.3 below and Figure 3.1. On the other hand, the matrix related to Figure 3.2 is only weakly diagonally dominant and the entries of the inverse of A decay much more slowly with respect to those of the matrix in Figure 3.1. However, we observed fast convergence of preconditioned iterations even for matrices that are not diagonally dominant. This is the case of most of the test problems considered in [5].



Let Re(M) = (M + [bar.M])/2 and Im(M) = (M - [bar.M])/(2i).

PROPOSITION 3.2. Let the incomplete factorization [[??][??][??].sup.H] in (3.1) (the approximate inverse factorization [[??][??].sup.-1][[??].sup.H] in (3.5)) be positive definite. If E ([E.sub.j] in (1.1)) is such that Re([alpha]E) is positive definite, then [P.sup.(k).sub.[alpha],E] in (3.8) ([Q.sup.(k).sub.[alpha],E] in (3.10)) is nonsingular. Moreover, Re([P.sup.(k).sub.[alpha],E]) (Re([Q.sup.(k).sub.[alpha],E])) is positive definite.

Proof. The result follows from (3.8), (3.10) by observing that Re([??] + [alpha][B.sub.k]) is positive definite for k [greater than or equal to] 0, the incomplete factorization (3.2) for the seed matrix A (the preconditioner based on the approximate inverse preconditioner (3.5)) is positive definite and therefore [P.sup.(k).sub.[alpha],E] ([Q.sup.(k).sub.[alpha],E]) is nonsingular.

4. Convergence of iterations. In the following analysis we consider the (exact) [LDL.sup.H] factorization of A and the (exact) Z[D.sup.-1][Z.sup.H] factorization of the inverse for the matrix A in (1.1), where Z = [L.sup.-H]. The results below extend those formerly given in [5] for symmetric positive definite matrices, [[alpha].sub.0], ... [[alpha].sub.s] real and positive and [E.sub.j] = I for all j.

THEOREM 4.1. Let us consider the sequence of algebraic linear systems in (1.1). Let A be Hermitian positive definite, [alpha] [member of] C. Moreover, let [delta] > 0 be a constant such that the singular values of the matrix E - L[B.sub.k][L.sup.H], [B.sub.k] as in (3.9), are as follows

[[sigma].sub.1] [greater than or equal to] [[sigma].sub.2] [greater than or equal to] ... [greater than or equal to] [[sigma].sub.t] [greater than or equal to] [delta] [greater than or equal to] [[sigma].sub.t+1] [greater than or equal to] ... [greater than or equal to] [[sigma].sub.n] [greater than or equal to] 0,

and t [much less than] n. Then, if


there exist matrices F, [DELTA] and a constant [c.sub.[alpha]] such that


rank([DELTA]) [less than or equal to] t [much less than] n, the rank of [DELTA] does not depend on [alpha], [c.sub.[alpha]] is a constant such that [lim.sub.[absolute value of [alpha]][right arrow]0] [c.sub.[alpha]] = 1, [c.sub.[alpha]] of the order of unity. The same properties hold true for

[Q.sup.(k).sub.[alpha],E] x (A + [alpha]E),

with [Q.sup.(k).sub.[alpha],E] defined as in (3.10).

Proof. The matrix

E - L[B.sub.k][L.sup.H] = 1/[alpha] ([P.sup.(k).sub.[alpha],E] - [A[alpha],E])

(see (3.7)) can be decomposed as [F.sub.1] + [[DELTA].sub.1], where, if U[summation][V.sup.H], [summation] = diag([[sigma].sub.1], ..., [[sigma].sub.n]), is its singular value decomposition, we have:

[[DELTA].sub.1] = Udiag at ([[sigma].sub.1], ..., [[sigma].sub.t], 0, ..., 0)[V.sup.H], [F.sub.1] = Udiag (0, ..., 0, [[sigma].sub.t+1], ..., [[sigma].sub.n])[V.sup.H].

To simplify the notation, from here on we remove the superscript denoting the order of the preconditioner [P[alpha],E]. We have that

rank([[DELTA].sub.1]) [less than or equal to] t, [[parallel] [F.sub.1][parallel].sub.2] [less than or equal to] [delta].

Therefore, by observing that

[P.sup.-1.sub.[alpha],E] (A + [alpha]E) = I + [alpha] [P.sup.-1.sub.[alpha],E][F.sub.1] + [alpha] [P.sup.-1.sub.[alpha],E][[DELTA].sub.1],

and defining

F = [alpha][P.sup.-1.sub.[alpha],E][F.sub.1], [DELTA] = [alpha][P.sup.-1.sub.[alpha],E][[DELTA].sub.1],

we have (4.2), where rank([DELTA]) = rank([[DELTA].sub.1]) [less than or equal to] t [much less than] n.

The matrix D + [alpha]B is nonsingular because

[absolute value of [alpha] [[parallel][D.sup.-1]B[parallel].sub.2] [less than or equal to] 1/2.


D + [alpha]B = D(I + [[alpha]D.sup.-1]B) = D(I + [[alpha]D.sup.-1][Z.sup.H.sub.k]E[Z.sub.k]) = D(I + Y)

and [[parallel]Y[parallel].sub.2] [less than or equal to] 1/2.

By observing that

[[parallel][P.sup.-1.sub.[alpha],E][parallel].sub.2] = [[parallel]Z[(D + [alpha]B).sup.-1][Z.sup.H][parallel].sub.2] [less than or equal to] [[parallel]Z[parallel].sup.2.sub.2] [[parallel][(D + [alpha]B).sup.-1][parallel].sub.2], (4.3)

[parallel][P.sup.-1.sub.[alpha],E][parallel] can be bounded by


where [c.sub.[alpha]] is a parameter of the order of unity such that [lim.sub.[absolute value of [alpha]][right arrow]0] [c.sub.[alpha]] = 1. Therefore, we can write


Hence, denoting by [[lambda].sub.min] (A) the smallest eigenvalue of A, by

[[absolute value of [d.sub.r]].sup.-1] [less than or equal to] [([[lambda].sub.min](A)[[parallel][z.sub.r] [parallel].sup.2.sub.2]).sup.-1],

(see [2]), we have

[[parallel]F[parallel].sub.2] [less than or equal to] 2[absolute value of [alpha]][c.sub.[alpha]][delta]/[[lambda].sub.min](A) [([[parallel]Z[parallel].sub.2]/[min.sub.r] [[parallel][z.sub.r][parallel].sub.2]).sup.2],

where Z = [[z.sub.1] ... [z.sub.n]], [z.sub.r] [member of] [C.sup.n].

Note that the norm of F can be bounded by a constant which does not depend on [alpha]. Moreover, if the condition (4.1) is not satisfied because [absolute value of [alpha]] x [parallel]E[parallel] is large, thus we can use [P.sub.[alpha],E] = [alpha]E, as suggested in Section 2.

Similarly to what observed in [5], the results of Theorem 4.1, without further assumptions, have a limited role in practice in order to explain the convergence of preconditioned iterations. For example, the norm of [??] and of [[??].sup.-1] (Z = [L.sup.-H]) can be large and therefore the spectrum of the preconditioned matrix cannot be considered clustered at all. Note that [parallel][??][parallel], [parallel][Z][parallel] (and therefore [parallel] [L.sup.-1][parallel], [parallel][[??].sup.-1][parallel]) can be large if, e.g., the entries of [A.sup.-1] do not decay or decay very slowly away from the main diagonal. To this end, we give the (straightforward) extensions of [5, Theorem 4.2] and of [5, Corollary 4.3] for (1.1).

THEOREM 4.2. Let A be Hermitian, positive (or negative) definite and normalized, [A.sup.-1] = Z[D.sup.-1][Z.sup.H], D = diag([d.sub.1], ..., [d.sub.n]), Z = ([z.sub.i,j]). Then, for j > i, we have:


When the bandwidth and the condition number of A are not too large, the entries of Z (i.e., of [L.sup.-H]) are bounded in a rapidly decaying fashion away from the main diagonal along rows. In this case, we can find a constant such that the inverse of L (and thus [P.sup.(k).sub.[alpha],E])) has uniformly bounded norm. Thus, by Theorems 4.1 and 4.2, we have the following consequence.

COROLLARY 4.3. Let A be a normalized symmetric positive definite diagonally dominant matrix, and let [alpha]E, [alpha] [member of] [C.sup.+] = {z [member of] C : Re(z) [greater than or equal to] 0}, be a diagonal matrix whose entries have positive real part. Then, [P.sup.-1.sub.[alpha],E][A.sub.[alpha],E] ([Q.sub.[alpha],E][A.sub.[alpha],E]) has a clustered spectrum.

We recall that if [[kappa].sub.2]([V.sub.[alpha],E]) is moderate, the eigenvalue distribution is relevant for the convergence of GMRES, see [18] and, e.g., [9], and this is the case for all test problems considered here. In practice, in our tests, [[kappa].sub.2]([V.sub.[alpha],E]) is always less than 100 and diminishes for increasing [[sigma].sub.1] for Problems 1 and 2 in section 5.

The hypotheses in Theorem 4.1 and Corollary 4.3 are restrictive for several classes of problems. However, we experienced that preconditioned iterations can converge fast even when A is not diagonally dominant and there is no decay of the entries of Z = [L.sup.-H] far away from the main diagonal at all.

5. Numerical experiments. We implemented a Matlab version of our algorithms to solve the linear systems arising in two classes of problems: the diffusion equation in a rectangle and the Helmholtz equation. Here the updated preconditioners (3.8) and (3.10) have order k = -1, 0, 1, 2. The iterations stop when the relative residual is less than [10.sup.-6]. The estimated computational costs are reported in the columns "Mf" including the number of floating point operations for both the computation of the preconditioner and the iteration phase (for the proposed updated incomplete factorizations the former is negligible).

5.1. Using updated incomplete factorizations. Let us consider the application of the proposed algorithms based on (3.8), (3.10) at each iteration step of a Krylov subspace method. In particular, we show results using (full) GMRES.

Updated factorization (3.8) used as a preconditioner for (1.1) requires the solution of the sparse linear systems whose matrices are as follows:

* [??] (sparse lower triangular);

* [??] + [alpha]B (with band k);

* [[??].sup.H] (sparse upper triangular).

Therefore, the computational cost per iteration is of the order of O(n) provided that solving

(D + [alpha]B) x = y (5.1)

requires O(n) flops as well. In particular, the order k updated factorization requires O([k.sup.2] n) for the solution of the banded linear system (5.1) and O(dn) flops for forward and backward substitution per iteration, where d is the number of nonzero diagonals of [??]. On the other hand, the updated incomplete inverse factorization (3.10) used as a preconditioner for (1.1) does not require the solution of the two (sparse) triangular linear systems because (approximations for) their inverses are already available. This can be much more effective than the iterations using (3.8) in a parallel implementation. However, we observed that for some problems, more flops are required for the additional fill-in of [??] with respect to [??] for the latter algorithm. Note that if A in (1.1) is positive (or negative) definite with real entries, the updated preconditioners (3.8), (3.10) can be particularly convenient with respect to the "full" ones (i.e., those which compute an approximation for [A.sub.j] in (1.1) by applying the [ILDL.sup.H] or AINV algorithms for each j to [A.sub.j] from scratch). Indeed, the application of the latter algorithms requires

* generating s incomplete factorizations which have to be done in complex arithmetic even if A, the seed matrix, has real coefficients;

* performing matrix-vector multiplications (with [??] and [[??].sup.H]) or solving triangular systems (i.e., those with matrices [??] and [[??].sup.H]) in complex arithmetic.

On the other hand, the strategies we propose here require the generation of just one incomplete factorization performed in real arithmetic. Moreover, the most expensive part of the matrix-vector multiplications for the underlying Krylov accelerator can be performed in real arithmetic as well. Complex arithmetic is used just for solving diagonal linear systems and performing operations with vectors.

5.2. Time-dependent PDEs. Consider a linear (or linearized) boundary value problem for a time-dependent PDE discretized in space written as


where y(t), g(t) : R [right arrow] [R.sup.m], J [member of] [R.sup.mxm], [n.sub.j] [member of] [R.sup.m] j = 1,2. By applying linear multistep formulas in boundary value form as in [6, 8], we obtain the linear system


where [e.sub.i] [member of] [R.sup.s+1], i = 1, ..., s + 1, is the i-th column of the identity matrix and A, B [member of] [R.sup.(s+1)x(s+1)] are small rank perturbations of Toeplitz matrices. In particular, the preconditioners introduced in [6, 7] for the linear systems are given by

P = c(A) [cross product] I - hc(B) [cross product] J, (5.4)

where c(A), c(B) are {[omega]}-circulant approximations of the matrices A, B, which are related to the discretization in time and [??] is an approximation of the Jacobian matrix. Other possible choices for c(A), c(B) are proposed in [6, 7]. As observed in [6, 8], P can be written as

P = ([U.sup.*] [cross product] I)G(U [cross product] I), (5.5)

U is an unitary matrix given by U = F[OMEGA], F is the Fourier matrix

F = ([F.sub.j,r]), [F.sub.j,r] = exp(2[pi]ijr/(s + 1)), [OMEGA] = diag(1, [[omega].sup.-1/(s+1)], ..., [[omega] .sup.-s/(s+1)]),

[omega] = exp(I[theta]), -[pi] < [theta] [less than or equal to] [pi],

[omega] is the parameter which determines the {[omega]}-circulant approximations (for the best choices of [omega], see [8]), and G is a block diagonal matrix given by

G = diag([G.sub.0], ..., [G.sub.s]), [G.sub.j] = [[phi].sub.j]I - h[[psi].sub.j][??], j = 0, ..., s. (5.6)

The parameter h is related to the step size of the discretization in time, [[phi].sub.0], ..., [[phi].sub.s], and [[psi].sub.0], ..., [[psi].sub.s] are the (complex) eigenvalues of c(A), c(B) in (5.4), respectively, see [8] for more details.

We stress that in [6, 7, 8] we considered [??] = J in practice. Now, we can use the tools developed in the previous sections to solve the s + 1 linear systems with matrices [G.sub.j] as in (5.6) by a Krylov accelerator. In particular, we apply our updated preconditioners (3.8), (3.10). Therefore, we need to compute just an incomplete factorization which approximates J or an approximate inverse factorization which approximates [J.sup.-1]. We update the underlying factorization (s) to solve the s + 1 m x m complex symmetric linear systems with matrices [G.sub.0], ..., [G.sub.s]. Indeed, the matrices [G.sub.j] can be written as the matrix J shifted by a complex parameter times the identity matrix:

[G.sub.j] = h[[psi].sub.j] ((-[??]) + [[phi].sub.j]h[[psi].sub.j]I), j = 0, ..., s,

where the ratios [[phi].sub.j]/(h[[psi].sub.j]), j = 0, ..., s have nonnegative real part; see [6].

Therefore, we can solve the block linear system Gy = c by solving the component linear systems

[G.sub.j][y.sub.j] = [c.sub.j] = 0, ..., s,

by a Krylov subspace solver suitable for complex symmetric linear systems. The matrix-vector products with the unitary matrices U and [U.sup.*] are performed by using FFTs.

To test the performance of our updated incomplete factorizations in the above mentioned strategy, we consider the diffusion equation in a rectangular domain with variable diffusion coefficient


where c(x, y) = exp(-[x.sup.[beta]] - [y.sup.[beta]]), [beta] [greater than or equal to] 0. Using centered differences to approximate the spatial derivatives with stepsize [delta]x = 3/(m + 1), we obtain a system of [m.sup.2] ordinary differential equations whose [m.sup.2] x [m.sup.2] Jacobian matrix [J.sub.m] is block tridiagonal. The underlying initial value problem is thus solved by the linear multistep formulas in boundary value form used in [8]. In Table 5.1 we report the outer preconditioned GMRES iterations and the average inner iterations (i.e., for the solution of the m x m linear systems as in (5.6)) with the related global cost using (full) GMRES; GMRES preconditioned by the standard "complete" approximate inverse factorization for each [G.sub.j]; GMRES preconditioned by (3.10) (order 0) and GMRES preconditioned by the same approximate inverse factorization computed for J reused for all j = 0,..., s (i.e., order "-1"). Both the outer and the inner iterations are considered converged if the initial residuals are reduced by [10.sup.-6].

Results in Table 5.1 confirm that GMRES preconditioned with (3.10) is a little bit slower in term of average inner iterations with respect to the standard preconditioning strategy based on "full" approximate inverse factorization. However, the global computational cost of our strategy is greatly reduced with respect to the others. For this example, we found that the updated inverse factorizations (3.10) of order greater than 0 do not improve the overall performance.

5.3. Helmholtz equation. An example of a problem whose discretization produces complex symmetric linear systems is the Helmholtz equation with complex wave numbers

-[nabla] x (C[nabla]u) + [[sigma].sub.1](j)u + i[[sigma].sub.2](j)u = [f.sub.j], j = 0, ..., s, (5.7)

where [[sigma].sub.1](j), [[sigma].sub.2](j) are real coefficient functions and c is the diffusion coefficient. The above equation describes the propagation of damped time-harmonic waves. We consider (5.7) on the domain D = [0,1] x [0,1] with [[sigma].sub.1] constant, [[sigma].sub.2] a real coefficient function and c(x, y) = 1 or c(x, y) = [e.sup.-x-y]. As in [14], we consider two cases.

* [Problem 1] Complex Helmholtz equation, u satisfies Dirichlet boundary conditions in D. We discretize the problem with finite differences on a n x n grid, N = [n.sup.2], and mesh size h = 1/(n + 1). We obtain the s + 1 linear systems (j = 0, ..., s):

[A.sub.j] [x.sub.j] = [b.sub.j], [A.sub.j] = H + [h.sup.2][[sigma].sub.1]I + i[h.sup.2][D.sub.j], [D.sub.j] = diag([d.sub.1], ..., [d.sub.N]), (5.8)

where H is the discretization of -[nabla] x (c[nabla]u) by means of centered differences. The [d.sub.r] = [d.sub.r](j), r = 1, ..., N, j = 0, ..., s, are the values of [[sigma].sub.2](j) at the grid points.

* [Problem 2] Real Helmholtz equation with complex boundary condition

[partial derivative]u/[partial derivative]n = i[[sigma].sub.2](j)u, {(1, y) | 0 < y < 1}

discretized with forward differences and Dirichlet boundary conditions on the remaining three sides gives again (5.8). The diagonal entries of [D.sub.j] are given by [d.sub.r] = [d.sub.r](j) = 1000/h if r/m is an integer, 0 otherwise.

All test problems are based on a 31 x 31 mesh, the right hand sides are vectors with random components in [-1,1] + i[-1,1] and the initial guess is a random vector. Notice that the methods based on (1.5) cannot be used here because the right hand side and the initial guess [x.sub.j] change for each j = 0, ..., s. Moreover, the seed matrix is not just shifted because the [E.sub.j] are diagonal and not just the identity matrices (they have complex, non-constant diagonal entries) and thus the Krylov subspaces (1.5) do not coincide.

We consider the solution of the related 2D model problem in the unit square by using GMRES without preconditioning, with the approximate inverse preconditioner described in [2], the order 0, 1 and 2 updated approximate inverse preconditioner (3.10) and the incomplete [LDL.sup.H]-threshold preconditioner (3.8). The drop tolerance for the incomplete factorizations is set to preserve the sparsity of the originating matrices. The results for the Problems 1 and 2 are shown in tables 5.2, 5.3 and 5.4, 5.5, respectively, confirming that our approach is effective. In particular, even if the global number of iterations using the update preconditioners can increase with respect to "full preconditioned" iterations, the flop count is lower with respect to the other methods. However, if [[sigma].sub.1] is greater than a suitable value, our update preconditioners should be used with updated triangular factors [??] and [??] as well otherwise our preconditioners can be not efficient; see table 5.2 for [[sigma].sub.1] = 800. However, we recall that usually in these cases preconditioning is not necessary at all; see section 2. Strategies for updating [??] were described for positive definite matrices in [5, section 6].

For the particular seed matrix A considered in these examples, we can observe that incomplete [LDL.sup.H]-based preconditioners perform slightly better. This is caused by the particular properties of the Laplacian discretized by the usual five-point formula. Indeed, we observed that for other problems updated approximate inverse factorization preconditioners (3.10) perform better. In particular, this happens for positive definite seed matrices A whose incomplete factorization generated by incomplete Cholesky factorization is ill conditioned. On the other hand, we stress that the (stabilized) approximate inverse preconditioner described in [2] is well defined for all positive definite matrices. The same is true for incomplete [LDL.sup.H] factorization generated by the algorithm proposed in [3]. Therefore, the updated approximate inverse (3.10) and incomplete factorization (3.8) based on those are well defined as well.

We stress that updates of order k greater than one are sometimes not efficient for the considered problems. On the other hand, recall that the matrices related to problems 1 and 2 can be written as [A.sub.j] in (5.8). Therefore, if [[sigma].sub.1] is positive and not negligible with respect to the entries of H, say, the inverse of (5.8) has entries which exhibit a fast decay away from the main diagonal; see, e.g., figure 3.1. Therefore, it seems reasonable that diagonal corrections (i.e., updates of order 0 and 1) give good approximations with the minimum computational cost. Moreover, order k > 1 approximations require the solution of a k-banded linear system in complex arithmetic per iteration, which can represent a not negligible computational cost if we have convergence in almost the same number of iterations of the underlying diagonal corrections. This is the case of problem 2; see tables 5.4 and 5.5.

In tables 5.2-5.5 different strategies seems to give the same number of Mflops and/or iterations. This effect is caused by different reasons.

* Tables 5.2-5.5. Order 0 and order 1 updates have a computational cost which is almost the same, but the latter approach requires obviously slightly more operations than the former. Therefore, rounding the (estimated) flops gives sometimes the same numbers.

* Table 5.2, [[sigma].sub.1] = 100, 200,400 and table 5.3, [[sigma].sub.1] = 400. In this case, for order 0 and 1 we have the same number of Mflops and iterations for some [[sigma].sub.1] and, in three runs, the same (rounded) number of Mflops but a different number of iterations for order 2 (necessarily lower because more expensive than order 0 and 1).

Finally, from the above tables, we could argue that the computational cost for the application of the standard approximate inverse preconditioner (the columns "full AINV" in the tables) is higher with respect to the standard incomplete factorization (3.2) based on the Incomplete Cholesky algorithm provided by Matlab (the columns "full [ILDL.sup.H]" in the tables). However, this is partly a consequence of the use of our rough Matlab implementation for the approximate inverse factorizations, while the incomplete [LDL.sup.H] preconditioner uses a built-in function.

6. Conclusions. We proposed a general framework for the solution of sequences of complex symmetric linear systems of the form (1.1) which is based on our algorithms for updating preconditioners generated from a seed matrix. The seed preconditioner can be based on incomplete [LDL.sup.H] factorizations for A or for [A.sup.-1] and the sequence of the iterations is well defined provided that the seed matrix A is definite and its preconditioner is well defined. Sufficient conditions for the fast convergence of the underlying iterations are proposed.

The updated preconditioners (3.8) and (3.10) can be improved by applying several techniques. In particular, the strategies proposed in [5, Section 6] can be generalized to the underlying algorithms.

Numerical experiments with Helmholtz equations and boundary value methods for a diffusion problem show that the proposed framework can give reasonably fast convergence. The seed preconditioner is computed just once and therefore the solution of several linear systems of the type (1.1) can be globally much cheaper than recomputing a new preconditioner from scratch and with respect to nonpreconditioned iterations, especially if the matrix [A.sub.j] is ill-conditioned (if [A.sub.j] is Hermitian) or has non clustered and/or close to the origin eigenvalues (if [A.sub.j] is non-Hermitian and its basis of eigenvectors is not ill-conditioned). On the other hand, if the problems are not ill-conditioned (have clustered spectra), using the seed preconditioner and the updated preconditioners can give similar performance.

Acknowledgments. The author would like to thank the editor and two anonymous referees for helpful comments which have improved this presentation.

This work is dedicated to Federico who was born during the preparation of the bulk of this paper.

* Received December 12, 2003. Accepted for publication March 3, 2004. Recommended by M. Benzi.


[1] U. R. ASCHER, R. M. M. MATTEIJ, AND R. D. RUSSELL, Numerical Solution of Boundary Value Problems for Ordinary Differential Equations, SIAM, Philadelphia, PA, 1995.

[2] M. BENZI, J. K. CULLUM, AND M. TUMA, Robust approximate inverse preconditioning for the conjugate gradient method, SIAM J. Sci. Comput. 22 (2000), pp. 1318-1332.

[3] M. BENZI AND M. TUMA, A robust incomplete factorization preconditioner for positive definite matrices, Numer. Linear Algebra Appl., 10 (2003), pp. 385-400.

[4] --Orderings for factorized sparse approximate inverse preconditioners, SIAM J. Sci. Comput., 21 (2000), pp. 1851-1868.

[5] M. BENZI AND D. BERTACCINI, Approximate inverse preconditioning for shifted linear systems, BIT, 43 (2003), pp. 231-244.

[6] D. BERTACCINI, A circulant preconditioner for the systems of LMF-based ODE codes, SIAM J. Sci. Comput., 22 (2000), pp. 767-786.

[7] --Reliable preconditioned iterative linear solvers for some numerical integrators, Numer. Linear Algebra Appl., 8 (2001), pp. 111-125.

[8] --Block {[omega]}-circulant preconditioners for the systems of differential equations, Calcolo, 40-2 (2003), pp. 71-90.

[9] D. BERTACCINI AND M. K. NG, Band-Toeplitz Preconditioned GMRES Iterations for time-dependent PDEs, BIT, 43 (2003), pp. 901-914.

[10] S. DEMKO, W. F. MOSS, AND P. W. SMITH, Decay rates for inverses of band matrices, Math. Comp., 43 (1984), pp. 491-499.

[11] J. E. DENNIS, JR. AND R. B. SCHNABEL, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, SIAM, Philadelphia, PA, 1996.

[12] J. J. DONGARRA, I. S. DUFF, D. C. SORENSEN, AND H. A. VAN DER VORST, Numerical Linear Algebra for High-Performance Computers, SIAM, Philadelphia, PA, 1998.

[13] R. FREUND, On Conjugate Gradient Type methods and Polynomial Preconditioners for a Class of Complex Non-Hermitian Matrices, Numer. Math., 57 (1990), pp. 285-312.

[14] -- Conjugate gradient-type methods for, linear systems with complex symmetric matrices, SIAM J. Sci. Stat. Comput., 13 (1992), pp. 425-448.

[15] -- Solution of shifted linear systems by quasi-minimal residual iterations, in Numerical Linear Algebra, L. Reichel, A. Ruttan and R. S. Varga, eds., de Gruyter, Berlin, 1993.

[16] E. HAIRER AND G. WANNER, Solving Ordinary Differential Equations II. Stiff and Differential-Algebraic Problems, Springer-Verlag, Berlin-Heidelberg, 1991.

[17] G. MEURANT, On the incomplete Cholesky decomposition of a class of perturbed matrices, SIAM J. Sci. Comput., 23 (2001), pp. 419-429.

[18] Y. SAAD, Iterative methods for sparse linear systems, PWS Publishing Company, 1996.

[19] V. SIMONCINI AND E. GALLOPOULOS, An iterative method for nonsymmetric systems with multiple right hand sides, SIAM J. Sci. Comput., 16 (1995), pp. 917-933.

D. BERTACCINI [dagger]

([dagger]) Universita di Roma "La Sapienza", Dipartimento di Matematica "Istituto Guido Castelnuovo", P.le A. Moro 2, 00185 Roma, Italy. E-mail:, web: This work was supported in part by the COFIN2002 project "Analisi di strutture di matrici: metodi numerici e applicazioni" and by Center for Modeling Integrated Metabolic Systems (MIMS), Cleveland (
GMRES (average inner and outer)iterations for the diffusion equation in
a rectangle [0,3] x [0,3], t [member of] [0,6], c = [e.sup.-x-y] (LMF
in bv form). The "*" means that more than 0.5 (Matlab counter)
Teraflops are required. Note that the preconditioner based on (3.6)
(order 0, i.e., B = I) gives the best performance.

 Not prec full AINV

 m s out avg Mf avg Mf

 8 8 9 21.6 44 10.5 58
 16 8 17.3 58 10.3 103
 24 8 15.2 74 10.2 155
 32 8 13.9 84 10.2 204

16 8 9 51.8 789 15.7 2845
 16 9 40.2 1054 14.4 5663
 24 9 35.3 1139 14.1 7633
 32 9 31.6 1276 13.7 10153

24 8 9 84.3 4328 21.7 31282
 16 9 65.9 5783 19.4 62453
 24 9 56.3 6698 19.1 93845
 32 8 51.2 6753 18.7 124899

32 8 10 117 15874 * *
 16 9 92.7 19445 * *
 24 9 79.3 22500 * *
 32 9 70.6 24709 * *

 AIN[V.sub.0] AINV(J)

 m s avg Mf avg Mf

 8 8 14.8 18 30.8 53
 16 15.7 35 35.9 122
 24 16.0 55 38.6 206
 32 16.2 72 40.2 291

16 8 19.3 105 36.9 288
 16 19.2 210 45.0 815
 24 19.6 292 50.8 1334
 32 19.6 385 54.5 1997

24 8 24.1 330 40.8 754
 16 23.3 635 47.8 2001
 24 22.9 942 53.1 3629
 32 23.1 1114 57.5 4896

32 8 28.8 868 43.9 1685
 16 28.2 1512 51.2 3933
 24 27.2 2187 55.7 6909
 32 26.6 2785 59.0 10257

Order k, k = 0, 1, 2 modified and incomplete LD[L.sup.H] (i.e.,
recomputed at each step) preconditioners. Results for the complex
Helmholtz equation and Dirichlet boundary conditions as in Problem 1.
The diagonal entries of D are chosen randomly in [0, 1000].

 Not prec sub.0]

[[sigma].sub.1] it Mf it Mf

 50 38 13.9 22 7.1
 100 36 12.7 20 6.2
 200 32 10.2 18 5.3
 400 26 7.2 16 4.5
 800 20 4.6 15 4.1

 ILD[L.sup.H. ILD[L.sup.H.
 sub.1] sub.2]

[[sigma].sub.1] it Mf it Mf

 50 22 7.1 18 7.0
 100 20 6.2 17 6.5
 200 18 5.3 15 5.3
 400 16 4.5 13 4.5
 800 15 4.1 12 4.1

 full ILD[L.sup.H]

[[sigma].sub.1] it Mf

 50 19 9.0
 100 17 7.7
 200 15 6.6
 400 12 5.1
 800 9 3.7

Order k, k = 0, 1, 2 inverse modified and AINV (i.e., recomputed at
each step) preconditioners. Results for the complex Helmholtz equation
and Dirichlet boundary conditions as in Problem 1. The diagonal entries
of D are chosen randomly in [0, 1000].

 AIN[V.sub.0] AIN[V.sub.1]

[[sigma].sub.1] it Mf it Mf

 50 26 8.8 26 8.8
 100 25 8.2 25 8.3
 200 22 6.7 22 6.8
 400 19 5.4 19 5.5
 800 17 4.6 17 4.7

 AIN[V.sub.2] full AINV

[[sigma].sub.1] it Mf it Mf

 50 20 7.8 15 1793
 100 19 7.3 14 1793
 200 17 6.3 13 1793
 400 15 5.4 11 1792
 800 13 4.5 8 1791

Order k, k = 0, 1, 2 modified and incomplete LD[L.sup.H] (i.e.,
recomputed at each. step) preconditioners. Results for the real
Helmholtz equation and complex boundary conditions as in Problem 2

 Not prec ILD[L.sup.H.sub.0]
[[sigma].sub.1] it Mf it Mf

.5 146 175 34 14.0
 1 145 173 33 13.0
 2 143 168 33 13.0
 4 137 155 31 12.1
 8 127 134 28 10.3

 ILD[L.sup.H.sub.1] ILD[L.sup.H.sub.2]
[[sigma].sub.1] it Mf it Mf

.5 34 14.0 34 17.2
1 33 13.0 33 16.5
2 33 13.0 33 16.5
4 31 12.1 31 15.0
8 28 10.3 29 13.6

 full ILD[L.sup.H]
[[sigma].sub.1] it Mf

.5 29 17.0
1 28 15.7
2 28 15.7
4 27 14.8
8 24 12.5

Order k, k = 0, 1, 2 inverse modified and AINV (i.e., recomputed at
each step) preconditioners. Results for the real Helmholtz equation
and complex boundary conditions as in Problem 2.

 AIN[V.sub.0] AIN[V.sub.1]

[[sigma].sub.1] it Mf it Mf

 .5 47 23.4 47 23.5
 1 46 22.6 46 22.6
 2 46 22.6 45 21.8
 4 44 20.0 44 20.1
 8 41 18.5 40 17.9

 AIN[V.sub.2] full AINV

[[sigma].sub.1] it Mf it Mf

 .5 47 27.8 46 1812
 1 46 26.9 45 1811
 2 45 26.0 45 1811
 4 44 25.1 42 1809
 8 40 21.6 40 1807
COPYRIGHT 2004 Institute of Computational Mathematics
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2004 Gale, Cengage Learning. All rights reserved.

Article Details
Printer friendly Cite/link Email Feedback
Author:Bertaccini, D.
Publication:Electronic Transactions on Numerical Analysis
Date:Jul 1, 2004
Previous Article:Discrete Sobolev and Poincare inequalities for piecewise polynomial functions.
Next Article:On Hermite interpolation in [R.sub.d].

Terms of use | Privacy policy | Copyright © 2021 Farlex, Inc. | Feedback | For webmasters |