# Dual variable methods for mixed-hybrid finite element approximation of the potential fluid flow problem in porous media.

Abstract. Mixed-hybrid finite element discretization of Darcy's law and the continuity equation that describe the potential fluid flow problem in porous media leads to symmetric indefinite saddle-point problems. In this paper we consider solution techniques based on the computation of a null-space basis of the whole or of a part of the left lower off-diagonal block in the system matrix and on the subsequent iterative solution of a projected system. This approach is mainly motivated by the need to solve a sequence of such systems with the same mesh but different material properties. A fundamental cycle null-space basis of the whole off-diagonal block is constructed using the spanning tree of an associated graph. It is shown that such a basis may be theoretically rather ill-conditioned. Alternatively, the orthogonal null-space basis of the sub-block used to enforce continuity over faces can be easily constructed. In the former case, the resulting projected system is symmetric positive definite and so the conjugate gradient method can be applied. The projected system in the latter case remains indefinite and the preconditioned minimal residual method (or the smoothed conjugate gradient method) should be used. The theoretical rate of convergence for both algorithms is discussed and their efficiency is compared in numerical experiments.Key words. saddle-point problem, preconditioned iterative methods, sparse matrices, finite element method

AMS subject classifications. 65F05, 65F50

1. Introduction. Let us consider a set of porous media occupying the bounded connected domain [OMEGA] [subset] [R.sup.3] with boundary [partial derivative][OMEGA] = [bar.[partial derivative][OMEGA].sub.D] [union] [bar.[partial derivative][OMEGA].sub.N]. We assume that [[partial derivative][OMEGA].sub.D] [not equal to] 0, [[partial derivative][OMEGA].sub.D] [intersection] [[partial derivative][OMEGA].sub.N] = 0 and that the area of [[partial derivative][OMEGA].sub.D] is strictly positive.

The steady state equations for the potential fluid flow in [OMEGA] combine Darcy's law for the velocity u and the piezometric potential (fluid pressure) p, and the continuity equation with Dirichlet and Neumann boundary conditions on [partial derivative][OMEGA] as follows

A(x)u = -[nabla]p, [nabla] x u = q, x [member of]Q (1.1)

p = [p.sub.D] on [[partial derivative][OMEGA].sub.D], u x n = [u.sub.N] on [[partial derivative][OMEGA].sub.N], (1.2)

where A(x) is the symmetric and uniformly positive definite second rank tensor of hydraulic permeability of the media and n is the outward normal vector defined (almost everywhere) on the boundary [partial derivative][OMEGA]. We approximate the weak form of (1.1-1.2) by a mixed-hybrid finite-element method that uses the low order Raviart-Thomas finite elements RT0 (for details we refer to [30, 33]). The family of meshes is computed by dividing the domain [bar.[OMEGA]] into trilateral prisms with vertical faces and general nonparallel bases (see, e.g., [30, 33, 34]) and with each prisma diameter bounded by h. All our results can be directly generalized to tetrahedra or other three-dimensional elements since our analysis can be applied to almost any matrix arising from an RT0 based mixed-hybrid discretization of (1.1-1.2).

Mixed finite elements yield very accurate approximations to fluid pressure and velocity components. However, the mixed matrix system becomes ill-conditioned for steady-flow problems [8] and the hybridization seems to be one of the possible strategies able to avoid this problem. Hybridization of the mixed formulation was introduced in [13]. The local conservation property of mixed and hybrid finite element models fairly well transport phenomena. Moreover, from the algebraic point of view, the systems resulting from hybridization have a rather transparent and simple sparsity structure. In particular, the hybridization can be considered as a specific matrix stretching technique [22], [1].

A mixed-hybrid discretization technique requires the solution of the following symmetric indefinite system of linear algebraic equations

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (1.3)

where

u = [([u.sub.1], ..., [u.sub.5*NE]).sup.T], p = [([p.sub.1], ..., [p.sub.NE]).sup.T], [lambda] = [([lambda], ..., [[lambda].sub.NIF+NNC]).sup.T]

represent, respectively, the unknown values of the velocity momentum through the faces, the pressure values in the prisms, and the pressure values on the faces. We denote by:

* NE the number of elements,

* NIF the number of interior inter-element faces,

* NNC the number of faces with the prescribed Neumann boundary condition, and

* NEC the number of faces with the prescribed Dirichlet boundary condition (NEC [not equal to] 0).

The total number of faces is 5 * NE = 2NIF + NEC + NNC.

We assume that the elements in the mesh have been enumerated such that the global position of every face and its corresponding entries in the matrices is given by the position of the element in the enumeration, and by its local position on the element. The matrix block A [member of] [R.sup.5*NE,5*NE] is symmetric positive definite and from the analysis in [34] it follows that its spectrum lies in the interval

[sigma](A) [subset] [[C.sub.1]/h, [C.sub.2]/h], (1.4)

where [c.sub.1] and [c.sub.2] are positive constants independent of the discretization parameter h. The off-diagonal block B [member of] [R.sup.5*NE,NE] is the face-element incidence matrix (with weights equal to -1) and, therefore, 5-1/2B is an orthogonal matrix. The matrix block C has the form C = ([C.sub.1] [C.sub.2]) [member of] [R.sup.5*NE,NIF+NNC] where the matrix block [C.sup.T.sub.1] represents the discrete continuity equation for the fluid velocity across interior inter-element faces and where [C.sub.T.sub.2] stands for fulfilment of the Neumann boundary conditions (for details we refer to [33], [34]). Both matrix blocks [2.sup.-1/2][C.sub.1] and [C.sub.2] are orthogonal and [C.sup.T.sub.1] [C.sub.2] = 0. Thus, after scaling, the matrix C is an orthogonal matrix. The normalization coefficients do not play an important role here and eventually may be circumvented by a proper scaling of the columns and corresponding rows in the system matrix (1.3) (or later in (1.6)). The condition number of the whole off-diagonal matrix block (B C) (the ratio between the largest and the smallest singular values) is, however, dependent on the mesh size h [34] and for its singular values we have

sv(BC) [subset] [[c.sub.3]h, [c.sub.4]]; (1.5)

here [C.sub.3] and [C.sub.4] are again positive constants independent of the discretization parameter h [34]. Let D = diag([h.sup.1/2][I.sub.5*NE], [h.sup.-1/2][I.sub.NE], [h.sup.-1/2][I.sub.NIF+NNC]) a block diagonal matrix. If we consider the symmetric diagonal scaling of the whole indefinite system (1.3) by D we have

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (1.6)

Then the inclusion set for the spectrum of the positive definite matrix block A becomes independent of the parameter h with [sigma], (A) [subset] [[c.sub.1], [c.sub.2]]. The matrix block (B C) remains untouched and it is now the only part of the system matrix (1.6) that depends on the mesh size h. We note here again that its sub-blocks B and C are matrices with an orthogonal set of columns. In addition to this, when the conditioning of the matrix A itself is rather significant, scaling of the matrix with its diagonal may lead to substantial improvements.

Linear systems similar to (1.3) have recently attracted a lot of attention in a number of applications e.g. Navier-Stokes problems [47], magneto-static problems [40], quadratic and nonlinear programming ([4], [32]) or porous media problems ([30],[7]). Several approaches for a solution of such systems have been considered. They range from the Uzawa-type and other splitting iteration methods [17], [6] , nonstationary conjugate gradient-type methods like the MINRES method [39] applied to the whole indefinite system (see e.g. [47], [34] or [43]) or the conjugate gradient method applied to the Schur complement systems ([30], [35]). Other possible techniques are the geometric multigrid approach ([16], [50]) or the direct solution based on the Bunch-Parlett or the [LDL.sup.T]-factorization ([15], [49]). An approach based on the null-space method (using the sparse QR decomposition) combined with the iterative solver was presented in [4].

In this paper we consider an approach based on the computation of a null space basis of some off-diagonal block in the system matrix (1.6) and on the use of an iterative method in order to solve the remaining part of a system projected onto the computed null-space. At the continuous level this is equivalent to a procedure based on divergence-free finite elements. In the two-dimensional case such finite elements correspond to stream functions. In three dimensions, which is our case of interest, the divergence-free finite elements can be characterized as curls of appropriate vector potentials [38]. The problem of finding an explicit divergence-free basis in the three-dimensional case is open even for the lowest-order Raviart-Thomas discretization. A partial solution to this problem was proposed in [53], see also [46].

Our approach is purely algebraic and it allows interesting insight into the problem. First, we consider the off-diagonal block [(B C).sup.T] and its fundamental cycle null-space basis which is computed using a spanning tree of a directed incidence graph related to the block [(B C).sup.T]. The resulting projected system is then symmetric positive definite and a conjugate gradient or a smoothed conjugate gradient method (minimal residual method) can be applied. Unfortunately, as we will show later, the computed null-space basis may be ill-conditioned and, therefore, the convergence rate of an iterative solver applied to the projected systems may be rather slow for a mesh with a large number of elements.

Alternatively, we can take advantage of the structure of the submatrix

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (1.7)

which is permutable in a block diagonal form where each of the NE diagonal blocks is of order 6 and has the structure of an augmented system. Therefore, we consider the approach based on a null-space basis of the block [C.sup.T]. Since the matrix block C is orthogonal, one can very easily construct a null-space basis of [C.sup.T], which is orthogonal as well. The projected system is now symmetric indefinite, and it is equivalent to the system obtained approximating the problem (1.1) with the boundary conditions (1.2) by the Raviart-Thomas mixed finite element method [9]. For this symmetric indefinite problem, instead of the pure conjugate gradient method its smoothed variant or, in other words, the minimal residual method is used. Its rate of convergence is estimated and linear asymptotic dependence on the mesh size h is shown. Thus this approach is asymptotically as efficient as other approaches like the Schur complement reduction ([30, 35]) or the solution using some indefinite iterative solvers on the whole system (1.3) ([47, 34]).

Moreover, for nonlinear schemes modelling the transport of chemicals and/or saturation, a sequence of problems with the same topology, i.e. with the same off-diagonal matrix blocks B and C, must be solved. Therefore, the dual variable methods can compute once at the starting the null space of [(B C).sup.T] (or the null space of [C.sup.T]) and use it to project the gradient of the nonlinear function during an outer iteration of a Newton like method. On the contrary, Schur complement methods need to compute a new block matrix at each step and then they must recombine all the blocks.

Both the Schur complement and dual variable approaches can be naturally coupled with multilevel procedures to avoid deterioration of convergence with decreasing h (see [28] where instead of construction of an explicit basis the kernel of the curl operator is eliminated in a multilevel way). In our case, however, the convergence deterioration is principally related to the actual size of constants than to the asymptotic dependence on mesh discretization [33].

The outline of this paper is as follows. In Section 2 we focus on the approach based on the computation of a null-space basis of the whole block [(B C).sup.T]. We study the structural and spectral properties of a fundamental cycle null-space basis and based on these results, the theoretical convergence rate of the conjugate gradient method applied to the resulting projected system is estimated. In Section 3 we describe an approach based on a null-space basis of the block [C.sup.T] and analyze the spectrum of a resulting indefinite matrix projected onto the orthogonal null-space basis. Section 4 describes some numerical experiments which compares these two approaches. In Section 5 we give some conclusions and point out directions for the future research.

2. Approach based on a null-space basis of the matrix block [(B C).sup.T]. The dual variable method [24] for computing the unknowns u, p and [lambda] in the system (1.3) is given in the following Algorithm.

ALGORITHM 2.1. The dual variable method for a solution of the system (1.3)--an approach based on a null-space of [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Step 1. Compute a null space basis Z of the matrix [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] so that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Step 2. Find some solution [u.sub.1] of the underdetermined system

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Step 3. Compute (iteratively) [u.sub.2] from the projected system

[Z.sup.T] AZ[u.sub.2] = [Z.sup.T] ([q.sub.1] - A[u.sub.1]).

Step 4. Set u = [u.sub.1] + Z[u.sub.2].

Step 5. Find the unknown vectors p and [lambda] such that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

2.1. Step 1. The most critical component of Algorithm 2.1 is Step 1. There exist several approaches how to compute a null space basis Z. Some of them are tightly coupled with particular applications. An extensive overview of null space basis algorithms based on sparse decompositions is given in [25]. A possible way to compute a null space basis of an equilibrium matrix in structural optimization is based on looking for a set of cycles in a suitably defined graph, see e.g. [26, 42]. The cycle null space basis can be found efficiently using various techniques (see, e.g., [41, 14, 11, 31]). Special attention should be paid to the approach used for solving two-dimensional problems in computational fluid dynamics (see [2, 24, 10]). These techniques use network algorithms to find a suitable cycle null space basis for a discrete divergence matrix which comes from certain finite difference discretizations.

First we briefly recall the basic terminology used in the following text. In our description we will use a slightly generalized concept of a graph by allowing more edges between a pair of vertices. This generalization is commonly called a multigraph, but since all the standard tools for graphs which we use can be trivially extended to multigraphs we will not emphasize this difference later.

DEFINITION 2.1. Let G = (V, E) be a connected directed graph with [absolute value of V] vertices and [absolute value of E] edges such that [absolute value of E] - [absolute value of V] + 1 > 0. Then the vertex-edgee incidence matrix of the graph is [absolute value of V] x [absolute value of E] matrix with a row associated to each vertex and and a column associated to each edge. The column associated with edge (i, j) has only two nonzero entries, a "1 " entry in the row associated to vertex i and a "-1 " entry in the row associated with vertex j.

We start with a definition of a cycle null space basis of a graph.

DEFINITION 2.2. Let G = (V, E) be a connected directed graph such that [absolute value of E] - [absolute value of V] + 1 > 0. Then the columns of the cycle basis are given by a set of [absolute value of E] - [absolute value of V] + 1 linearly independent edge incidence vectors that correspond to some cycles in the graph G. These incidence vectors have the i-th component equal to +1 if [e.sub.i] is an edge in the cycle and the orientations of the cycle and e2 agree, equal to -1 if [e.sub.i] is an edge in the cycle and the orientations disagree, and equal to 0 if [e.sub.i] is not an edge in the cycle. Since the cycle basis is formally defined for a graph we will not distinguish between the basis of the graph and the basis formed from the columns of its incidence matrix. The concept of fundamental cycle basis is based on the notion of a spanning tree defined as follows.

DEFINITION 2.3. A spanning tree of a connected directed graph G = (V, E) is a connected subgraph of G with [absolute value of V] vertices and [absolute value of V] - 1 edges. Note that in the previous definition we did not consider the fact that the edges are oriented. In the following we define the fundamental cycle basis.

DEFINITION 2.4. A cycle basis is fundamental if it is obtained from a spanning tree T of the graph in such a way that each cycle in the basis has exactly one non-tree edge e and its other edges lie on the unique path in T connecting the vertices of the edge e. The following lemma introduces a graph which will be used for enumeration of the cycle null space basis vectors in our application.

LEMMA 2.1. Denote by S the matrix obtained from [(B C).sup.T] by removing the rows corresponding to Neumann boundary conditions, removing the columns corresponding to faces with Neumann boundary conditions and adding a row which has ones in all the positions corresponding to faces with Dirichlet boundary condition. Then S is an incidence matrix of some directed graph [G.sub.S] = ([V.sub.S], [E.sub.S]).

[FIGURE 2.1 OMITTED]

Proof. The columns and rows of the matrix [(B C).sup.T] can be reordered to an upper block triangular form with the unit diagonal block formed from the rows corresponding to Neumann boundary conditions and the columns corresponding to faces with Neumann boundary conditions. This means that the components of null space vectors corresponding to faces with Neumann boundary conditions must be zero. Therefore we do not need to consider their columns and rows in the matrix [(B C).sup.T]. Denote by [??] the resulting matrix and let [s.sup.T] be the row vector with components corresponding to faces with Dirichlet boundary condition equal to one and remaining components equal to zero. Then define [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. It is clear that S is an incidence matrix (with the column sum equal to zero) of some directed graph which we denote from now by [G.sub.S].

An example of an off-diagonal block [(B C).sup.T] and the corresponding matrix S is shown in Figure 2.1 and Figure 2.2, respectively. Figure 2.3 depicts the corresponding graph [G.sub.S].

If we find a fundamental cycle basis to the graph of the incidence matrix S we can easily extend it to the null space basis of [(B C).sup.T]. We border [Z.sub.S] with rows of zero in correspondence of the columns in [(B C).sup.T] relative to the edges of the Neumann boundary conditions. Therefore, we can pay our attention to the matrix S only. For easier reference we will formulate it as a proposition.

PROPOSITION 2.1. Let [Z.sub.S] be a null space basis of S. Then the null space basis Z of [(B C).sup.T] can be obtained from [Z.sub.S] by adding zero rows to positions of faces with Neumann boundary condition.

For large and sparse problems it is important to keep sparsity of the null space basis as much as possible. The problem to find the sparsest null-space basis for a given matrix is NP-hard ([18, 12]). The sparsest null-space basis, however, may not be the most efficient way when solving our problem. Namely, it may be rather ill-conditioned. Therefore, an effort was devoted to computation of orthogonal null-space bases (see [4]). On the other hand, the sparse QR-decomposition may lead to rather dense and in practice infeasible factors. In this section we attempt to find a compromise between these two extreme cases. In particular, we would like to compute a relatively sparse null-space basis and, at the same time, to keep it sufficiently linearly independent.

[FIGURE 2.2 OMITTED]

We specify now more precisely how our fundamental cycle null space basis is constructed. The cycles in [G.sub.S] here are determined using some spanning tree. By its choice one can influence the conditioning of the basis in a substantial way. We assume that the spanning tree is constructed using the Algorithm 2.2. In its description we use the technique of partitioning the graph nodes into n node sets ([L.sub.0], [L.sub.1], ..., [L.sub.n-1]) which are called level sets. Starting with some initial node, which forms the initial level set [L.sub.0], the level set [L.sub.k] is defined recursively as the set of all unmarked neighbouring nodes of all the nodes of a previous level set [L.sub.k-1]. This technique is intensively used, e.g., for graph partitioning or in heuristics to find graph pseudoperipheral vertices (see [19, 45]).

ALGORITHM 2.2. Algorithm to construct the spanning tree T = ([V.sub.S], [E.sub.T]) of the graph [G.sub.S] = ([V.sub.S], [E.sub.S]).

Step 1. Find a level set partitioning ([L.sub.0], [L.sub.1], ..., [L.sub.n-1]) of [G.sub.S] starting from an arbitrary node x [member of] [V.sub.S].

Step 2. For all components of subgraphs induced by a level set partitioning construct an arbitrary spanning tree. Add all these edges of every spanning tree into [E.sub.T].

Step 3. Connect the set of edges [E.sub.T] into a spanning tree of the whole graph [G.sub.S].

This construction guarantees that there are no cycles in the graph GS which would use nodes from more than two levels of the partitioning. The whole process of construction is schematically depicted in Figure 2.4. The situation after Step 2 in Algorithm 2.2 is illustrated on the left-hand side and the spanning tree of [G.sub.S] after Step 3 is depicted on the right-hand side. The edges of the spanning tree are denoted by double lines.

In the following we study the conditioning of the null-space basis constructed using the spanning tree from Algorithm 2.2. We give bounds on the extreme singular values of the matrix [Z.sub.S], i.e. the smallest and the largest singular value. In particular, we are interested in their asymptotic behaviour with respect to the discretization parameter h under uniformly regular refinement of the mesh.

[FIGURE 2.3 OMITTED]

[FIGURE 2.4 OMITTED]

THEOREM 2.2. Let ZS be a matrix with fundamental cycle null-space basis vectors induced by the spanning tree from Algorithm 2.2. Let [[sigma].sub.ma]x ([Z.sub.S]) [greater than or equal to] [[sigma].sub.2] ([Z.sub.S]) [greater than or equal to] ... [greater than or equal to] [[sigma].sub.min]([Z.sub.S]) > 0 be the singular values of [Z.sub.S]. Then there exist a constant [C.sub.5] such that [[sigma].sub.max]([Z.sub.S]) [less than or equal to] [C.sub.5][h.sup.-2].

Proof. In a uniform mesh the ratio between the internal and the external diameters of any element is independent of h and both diameters are of order O(h). Then, the number of elements in each direction is independent of the direction. Algorithm 2.2 computes a "Shortest Path Spanning Tree" for the graph [G.sub.S] where each arc has length 1. Therefore, the value of a level set is also the value of the minimum distance of any of its nodes from the root. Such a distance is equal to the number of elements in the mesh that we cross going from a node in the level set to a node corresponding to a boundary element directly connected to root. Because of the uniformity of the mesh this number is in the worst case of order O([h.sup.-1]). The nodes in a level set map into a wavefront in the mesh, therefore, the number of nodes in a level set is in the worst case of order O([h.sup.-2]). Since [Z.sub.S] is a cycle null-space basis, its Frobenius norm is determined by the count of its nonzero entries. Each column of [Z.sub.S] corresponds to an arc which is not in the tree and the number of non zeros in the column is the length of the shortest cycle formed using the nodes on the tree and the arc. Because the max distance of a node in the tree from the root is of order O([h.sup.-1]) the maximum length of the cycle is O([h.sup.-1]). The total number of arcs out-of-tree is O([h.sup.-3]). Then, the number of nonzeros in [Z.sub.S] is of order O([h.sup.-4]). Hence there exists a positive constant [c.sub.5] such that [[sigma].sub.max]([Z.sub.S]) = [parallel][Z.sub.S][parallel] [less than or equal to] [[parallel][Z.sub.S][parallel].sub.F] [less than or equal to] [c.sub.5][h.sup.-2].

THEOREM 2.3. Let [[sigma].sub.1]([Z.sub.S]) [greater than or equal to] [[sigma].sub.2] ([Z.sub.S]) [greater than or equal to] ... [greater than or equal to] [[sigma].sub.min] ([Z.sub.S]) > 0 be the singular values of the matrix [Z.sub.S] given by the fundamental cycle null-space basis vectors [Z.sub.S]. Then [[sigma].sub.min] ([Z.sub.S]) [greater than or equal to] 1.

Proof. From the Courant-Fischer theorem we have

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Because S is the incidence matrix of the graph [G.sub.S], there exist [P.sub.1] and [P.sub.2] permutation matrices [36] such that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

with [L.sub.1] non singular lower triangular matrix. Then

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Since the matrix [Z.sub.S] has a unit submatrix embedded, it always satisfies [parallel][Z.sub.Sx[parallel] [greater than or equal to] [parallel]x[parallel]. From this observation we obtain the desired result.

The approach which we adopted is based on the concept of the fundamental cycle null space basis Z for which one could simply bound the smallest singular value of Z from below but then some growth in the norm of the matrix Z with the bound in Theorem 2.2 should be expected. Another approach which uses cycles of small lengths for the basis can fall into a different trap. While the norm of Z can be simply bounded by a constant times a maximum degree in the graph [G.sub.S], it is not easy to give a reasonable lower bound for the minimum singular value of Z in the case of general domain. Nevertheless, we do not exclude that such ill-conditioned null-space basis vectors may appear frequently in practical computations.

2.2. Step 2. The construction of a particular solution u1 in Step 2 of Algorithm 2.1 is considerably simpler than the construction of the null space basis (cf. [24]). Compute the uniquely determined components of the particular solution corresponding to the faces with the Neumann boundary conditions. Denote by F the matrix obtained from [(B C).sup.T] after elimination of these components and after removal of all columns corresponding to faces with a Dirichlet boundary condition. Construct a spanning tree [T.sub.F] of its incidence graph [G.sub.F] rooted in a vertex which corresponds to some element with a Dirichlet boundary condition. Then remove all non-tree columns (columns corresponding to non-tree edges) from F. The resulting matrix [??] is then the incidence matrix of [T.sub.F]. Therefore, the rows and columns of [??] can be reordered into upper Hessenberg form such that the row corresponding to the root will be numbered first. Adding a linearly independent Dirichlet column related to the root we get a nonsingular upper triangular system. By solving this system and setting all the other non-tree and Dirichlet components to zero we get the desired particular solution [u.sub.1]. Note that the construction of the particular solution based on the incidence matrix can be done in a stable fashion. Indeed, it is clear that the norm of [u.sub.1] is uniformly bounded with respect to the norm of the right-hand side vector.

2.3. Step 3. For a solution of the projected system in Step 3 one may use the iterative conjugate gradient [27] or the minimal residual method [48]. The theoretical rate of convergence has been throughly studied and the bounds for their error and/or residual norm has been given (see e.g. [27, 45]). Here we consider the conjugate gradient method smoothed by the minimal residual smoothing, which is mathematically equivalent to the minimal residual method [23]. If we apply this method to the symmetric and positive definite projected system, the residual norm of the n-th approximate solution [u.sup.n.sub.2] can be bounded as follows

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (2.1)

The bound (2.1) indicates that its rate depends strongly on the spectrum of the projected matrix [Z.sup.T] AZ. Using the bounds on the singular values of the null-space basis matrix Z constructed in Step 1 and using the bound for the eigenvalues of the positive definite matrix block A (1.4) with scaling (1.6) then we can obtain the following simple result on the eigenvalues of the matrix [Z.sup.T]AZ.

LEMMA 2.4. Let [Z.sub.S] be the fundamental null-space basis matrix induced by the spanning tree from Algorithm 2.2 and let Z be the null-space basis matrix of the block [(B C).sup.T] obtained from [Z.sub.S] by adding zero rows corresponding to faces with Neumann boundary condition. Then for the eigenvalues of the matrix [Z.sup.T] AZ we have

[sigma]([Z.sup.T]AZ) [subset] [[C.sub.1], [C.sub.2] [C.sup.2.sub.5]/[h.sup.4]]. (2.2)

Proof. The statement of lemma follows from (1.4) and (1.6), from results given in the subsection Step 1 and from the inequality

[C.sub.1] (Zx, Zx) [less than or equal to] ([Z.sup.T] AZx, x) [less than or equal to] [C.sub.2] (Zx, Zx),

which gives the relation between the spectrum of [Z.sup.T]AZ and the singular values of Z. 0

Considering the bound (2.1) and Lemma 2.4 we have

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (2.3)

For the asymptotic convergence factor then it follows from (2.3) that there exists a positive constant [C.sub.6] independent of the discretization such that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (2.4)

Preconditioning of projected matrices arising in optimization was studied in [37], see also [21].

2.4. Step 5. The vector [([p.sup.T], [[lambda].sup.T]).sup.T] in Step 5 of Algorithm 2.1 can be found as follows. Consider the spanning tree [T.sub.F] of the matrix F and the upper triangular system constructed in Step 2 (see Subsection 2.2). The unknowns p and A are then a solution of the system with a nonsingular lower triangular matrix obtained by transposing the matrix from Step 2. The components of the unknown vector [lambda] corresponding to Neumann boundary conditions are determined accordingly from remaining rows of (B C). The right hand-side vector is given as [q.sub.1] - Au substituting for the vector u computed in Step 4.

3. Approach based on a null-space basis of the matrix block [C.sup.T]. Since the off-diagonal matrix block C has orthogonal columns it is much easier to construct a null-space basis for the block [C.sup.T] rather than for the whole block [(B C).sup.T]. In contrast to the previous approach, this basis can be chosen orthogonal and thus the condition number of the basis matrix is not dependent on the discretization parameter. Although we are splitting the potentially ill-conditioned matrix block (B C) into two matrix blocks with orthogonal columns, the spectrum of the remaining part of the indefinite system is dependent on the discretization parameter. Consequently, the rate of convergence of the minimal residual method applied to the projected system can be bounded in terms of the mesh size and it depends linearly on the uniform mesh refinement. The algorithm is given as follows.

ALGORITHM 3.1. The dual variable method for a solution of the system (1.3)--approach based on a null-space of [C.sup.T].

Step 1. Determine the null space basis Z of the matrix block [C.sup.T] such that

[C.sup.T]Z = 0.

Step 2. Find some solution [u.sub.1] of the underdetermined system

[C.sup.T][u.sub.1] = [q.sub.3].

Step 3. Compute iteratively [u.sub.2] and p from the projected system

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Step 4. Set u = [u.sub.1] + Z[u.sub.2].

Step 5. Find the unknown [lambda] such that C[lambda] = [q.sub.1] - Au - Bp.

3.1. Step 1. The matrix block C has orthogonal columns and it has the form C = ([C.sub.1] [C.sub.2]) [member of] [R.sup.5*NE,NIF+NNC], where the block [C.sub.1] has two nonzeros per column, corresponding to the interior inter-element faces between neighbouring elements in the mesh. The block C2 is just the face-Neumann boundary condition incidence matrix. Therefore it is easy to construct the null-space matrix Z such that [C.sup.T]Z = 0. The resulting matrix Z = ([Z.sub.1] [Z.sub.2]) [member of] [R.sup.5*NE,NIF+NDC] can be chosen in the following way. The block [Z.sub.1] [member of] [R.sup.5*NE,NIF] will have two nonzeros per column (1 and -1) exactly in the same position as in the corresponding block [C.sub.1]; the block [Z.sub.2] is the face-Dirichlet boundary condition incidence matrix. It is obvious that such matrix Z has an orthogonal set of columns with [Z.sup.T] Z = diag(2, ..., 2,1, ..., 1) (which can be also orthonormalized). The null-space basis matrix Z for our example is given in Figure 3.1.

[FIGURE 3.1 OMITTED]

3.2. Step 2. The matrix block C has one entry per row, so the system [C.sup.T] [u.sub.1] = [q.sub.3] can be immediately solved by permuting its rows and columns to an upper trapezoidal form. In other words, we immediately get the unknowns that correspond to faces with the Neumann condition, and setting one of the two unknowns that stand for the interior inter-element faces, we can recompute the other. The remaining unknowns corresponding to Dirichlet faces are then set to zero. Another possible approximate solution is the least squares solution [u.sub.1] = C[[C.sup.T] C).sup.-1][q.sub.3] which is clearly stable since C is orthogonal up to normalization coefficients [square root of 2].

3.3. Step 3. The projected system from Step 3 is symmetric but indefinite. On the other hand, the null-space basis matrix Z is orthogonal. Therefore, this approach can be very efficient. The projected system can be written as a result of an orthogonal projection applied to the remaining part of the indefinite system matrix in (1.6) in the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (3.1)

The structural pattern of the resulting system for our example is depicted in Figure 3.2. The projected system (3.1) is still rather sparse, so its iterative solution may be a reasonable option. Moreover, the expression given by (3.1) shows that we can implement the matrix-vector product quite efficiently. The product Zv is equivalent to a permutation of the vector v. The product [Z.sup.T] w can be implemented in parallel because the rows of the matrix [Z.sup.T] are structurally orthogonal. Furthermore, the matrix

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

can be symetrically permuted in a block diagonal form with diagonal blocks of size 6. Here we consider the conjugate gradient method smoothed by the minimal residual smoothing [23]. It is well known that the rate of convergence of symmetric iterative methods depends strongly on the eigenvalue distribution of the system matrix ([45, 23]). In the following we analyze the spectrum of the matrix in the projected system (3.1).

[FIGURE 3.2 OMITTED]

LEMMA 3.1. Let Z be the null-space basis of the off-diagonal block [C.sup.T] constructed in Step 1 of Algorithm 3.1. Then for the spectrum of the projected matrix block [Z.sup.T] AZ it follows [sigma]([Z.sup.T]AZ) [subset] [[C.sub.1], 2[C.sub.2]].

Proof. The proof of the lemma is similar to the proof of Lemma 2.4 provided that [Z.sup.T] Z = diag(2, ..., 2,1, ..., 1).

LEMMA 3.2. Let Z be the null-space basis of the off-diagonal block [C.sup.T] constructed in Step 1 of Algorithm 3.1. Then there exist positive constants [c.sub.7] and [c.sub.8] such that for the singular values of the matrix block [Z.sup.T] B it follows sv([Z.sup.T] B) [subset] [[c.sub.7] h, [c.sub.8]].

Proof. Define the graph [G.sub.B] = ([V.sub.B], [E.sub.B]) as follows. Let [V.sub.B] = {0,1, ..., NE}. Let (i, j) be an edge in [E.sub.B] whenever elements i and j are connected by an interior inter-element face. Furthermore, let (0, i) [member of] [E.sub.B] be an edge for each Dirichlet boundary condition defined on some element i. Note that there can be more edges between the node 0 and some node i [not equal to] 0. Moreover, introduce the mapping d : [V.sub.B] [right arrow] R such that [d.sub.0] = 0 and [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] and the induced mapping [w.sub.d] : [E.sub.B] [right arrow] R satisfying the formula [w.sub.d](e) = [absolute value of d(j) - d(i)] for e = (i, j) [member of] [E.sub.B].

Consider a tree T = ([V.sub.B], [E.sub.T]) rooted in the node 0 such that [absolute value of [E.sub.T]] = [absolute value of [V.sub.B]] - 1. Let k be its arbitrary node. Using the Schwarz inequality we get

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (3.2)

where P(0, k) is a unique path between the nodes 0 and k in T (where we do not take into account the orientation of the edges) and l(k) is its length. Summing the inequalities in (3.2) for all k [member of] [V.sub.B] we get

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (3.3)

where [l.sub.max] is the length of the path of maximum length from the node 0 to some node i [member of] [V.sub.B]. This implies that

[summation over e[member of][E.sub.B]] [w.sup.2.sub.d] (e) [greater than or equal to] [l.sup.-2.sub.max]. (3.4)

Consider now the matrix [Z.sup.T] B [member of] [R.sup.NIF+NDC,NE]. Its rows correspond to Dirichlet boundary conditions and interior inter-element faces. There is only one nonzero in the rows corresponding to Dirichlet boundary conditions (either +1 or -1) placed in the column of the element where this condition is imposed. In the rows which correspond to the interior faces, there are exactly two nonzeros, equal to +1 and -1, respectively. Consider a vector d = [([d.sub.0], [d.sub.1], ..., [d.sub.NE]).sup.T] such that d(0) = 0. Clearly, from the definition of [G.sub.B] we have

[summation over e[member of][E.sub.B]] [w.sup.2.sub.d] (e) = [[parallel]([Z.sup.T]B)[??][parallel].sup.2], (3.5)

where [??] = [([d.sub.1], ..., [d.sub.NE]).sup.T]. Consequently, using the Courant-Fischer theorem and (3.4) we have

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (3.6)

The uniformly regular mesh refinement provides that [l.sub.max] = O([NE.sup.-1/3]) = O(h). Therefore, there is a positive constant [c.sub.7] such that

[[sigma].sub.min] ([Z.sup.T] B) [greater than or equal to] [c.sub.7]h. (3.7)

Since [parallel][Z.sup.T] B[parallel] [less than or equal to] [parallel]Z[parallel] [parallel]B[parallel] [less than or equal to] [square root of 2] [square root of 5], the singular values of [Z.sup.T] B are bounded by a positive constant [c.sub.8] = [square root of 10] and this completes the proof.

LEMMA 3.3. Let Z be the null-space basis of the off-diagonal block C constructed in Step 1 of Algorithm 3.1. Then for the spectrum of the projected matrix (3.1) it follows

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Proof. The proof of the lemma follows from [44], Lemma 2.1 and from the statements of Lemma 3.1 and Lemma 3.2.

It is well-known that applying the minimal residual method to the projected system (3.1) the relative residual norm of the n-th approximate solutions [u.sup.n.sub.2] and [p.sup.n], n = 0,1, ... can be bounded (see also [23, pag. 54], [52, pag. 234]) as follows

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (3.8)

where a = 1/2([square root of [c.sup.2.sub.1] + 4[c.sup.2.sub.8]] - [c.sub.1]), b = [c.sup.2.sub.7]/[c.sub.2], c = [c.sub.1] and d = [c.sub.3] + [square root of [c.sup.2.sub.2] + [c.sup.2.sub.8]. From (3.8) we obtain the bound for the asymptotic convergence factor in the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

[FIGURE 3.3 OMITTED]

Clearly, the bounds for the rate of convergence of the minimal residual method applied to the indefinite projected system depend linearly on the discretization parameter h. Moreover, since we have used, in fact, the assumption on the symmetric spectrum for the projected matrix, this bound may be an overestimate of the actual rate of convergence of the unpreconditioned minimal residual method [52]. The preconditioning of the projected matrix (3.1) can be incorporated as well and many other approaches are possible [47], [44], [51].

3.4. Step 5. Since the matrix block C has an orthogonal set of columns, the unknown vector [lambda] is given as [lambda] = [D.sup.-1][C.sup.T]([g.sub.1] - Au - Bp) which is easy to solve owing to the fact that D = [C.sup.t]C = diag(2, ..., 2,1, ..., 1) is a diagonal matrix.

4. Numerical experiments. In this section we give the results from numerical experiments. Two sets of matrices have been considered.

The first set corresponds to a model potential fluid flow problem in a rectangular domain with homogeneous Neumann on the top and bottom and Dirichlet conditions prescribed on the rest of the boundary. The tensor of hydraulic permeability is constant in the whole domain. Uniform prismatic discretization with the varying mesh size h was used. In Table 4.1, we give the values of discretization parameters NE = 2/[h.sup.3], NIF, NNC and NDC for different values of h. The dimension of the resulting indefinite system matrix (1.3) can be computed as N = 6 * NE + NIF + NNC and the number of columns of the off-diagonal block (B C) is given by NBC = NE + NIF + NNC. In Table 4.1 we report the dimension NZl of the null-space of the whole block [(B C).sup.T] and the dimension NZ2 of the null-space of the block [C.sup.T] for all values of mesh size h.

Table 4.2 reports the inclusion sets of the spectrum of matrix blocks A and (B C) as well as of the whole symmetric indefinite matrix from (1.3). The extreme singular values of the block (B C) (square roots of the extreme eigenvalues of the matrix [(B C).sup.T] (B C)) and the extreme positive and negative eigenvalues of the whole indefinite matrix were approximated by the eigenvalues of the symmetric tridiagonal matrix obtained from 2000 steps of the symmetric Lanczos algorithm [20]. The eigenvalue computation of the resulting tridiagonal matrix was done using the LAPACK double precision subroutine DSYEV [3]. The extreme eigenvalues of the diagonal matrix block A were computed directly by the LAPACK symmetric eigenvalue solver element by element. It is clear from Table 4.2 that the computed eigenvalues of the block A are in a good agreement with the result (1.4) and after scaling (1.6) the spectrum of the diagonal block A becomes independent of h. Similarly the computed extreme singular values of (B C) agree well with (1.5).

Approaches based on the computation of the null-space basis of the whole off-diagonal block [(B C).sup.T] are discussed first. In Table 4.3, we compare the memory requirement (denoted as NNZ(Z1)) and the computational cost of constructing the null-space basis and iteration counts for the (smoothed) conjugate gradient method applied to the projected positive definite system in Algorithm 2.1, Step 3. For computation of the null space basis Z (such that [(B C).sup.T] Z = 0) we use the sparse QR factorization (for details see [4]) and the fundamental cycle null space basis. Sparse QR decomposition was computed with the code MA49 from the Harwell Subroutine Library [29]. Fundamental cycle null space basis is based on the shortest path spanning tree of Gs, SDS algorithm from [14]. In Table 4.3 we further give the number of nonzero elements (denoted as NNZ(QR)) necessary for storing the orthogonal and upper triangular factors of (B C) and the time of computation in seconds (in brackets). All experiments were performed on the SGI Origin 200 with processor R10000. Our results from Table 4.3 indicate that the use of sparse QR factorization becomes prohibitive for last two values of h and the ratio NNZ(R)/NNZ(QR) tends to approach the value 1/2 with the decrease of h. Note that the number of nonzeros in the fundamental cycle null-space basis NNZ(Zl) is significantly smaller than the number of nonzeros in the factors Q and R. This is even more pronounced for the computation time. In the iterative part the initial approximation of [u.sub.2] was set to zero, the relative residual norm [parallel][r.sub.n][parallel]/[parallel][r.sub.0][parallel] = [10.sup.-8] was used as the stopping criterion. Only the unpreconditioned case is considered in this case. In the case of the QR approach we included the number of iterations and timing in seconds for two possible approaches using either both factors Q and R (denoted in Table 4.3 as QR, see also [4]) or solution via seminormal equations (SN) (for details we refer to [40]) which uses only the upper triangular factor R from the QR factorization. The latter then necessarily leads to approximately double cost of matrix-vector multiplications in the iterative solver. For the case of fundamental cycle basis we report the number of iterations and timings when the matrix [Z.sup.T]AZ is unpreconditioned and kept in factorized form (UN). We have noticed that simple preconditioning strategies like Jacobi (note that the system matrix was initially scaled) or IC (using explicit matrix assembling) do not help to improve the results. It is clear from iteration counts in Table 4.3 that the number of iterations in the case of the QR factorization remains independent of the mesh size h while the number iterations in the approach based on the fundamental cycle basis increases more than linearly with h, which leads to higher timings also in the iterative part of the process.

In Table 4.4 we compare the approaches based on the null-space basis of the off-diagonal block [C.sub.T]. The iteration counts and times of the preconditioned conjugate gradient method applied to the projected indefinite system in Algorithm 3.1, Step 3 are discussed for positive definite block diagonal preconditioner (IP) and indefinite (constraint) preconditioner (IQ), where the inverses of corresponding matrices are approximated by the incomplete Cholesky decomposition IC(0) (see e.g. numerical experiments in [40] and references therein). For comparison we also give results for the preconditioner based on the approximate factorization of the indefinite system (NS) developed originally by Nash and Sofer [37, pag. 52, formula (3.2)]. It is clear from Table 4.4 that the computed results are in a good agreement with the theoretical result (3.8) developed in Section 3. Indeed, the number of iterations required for reducing the relative residual norm to [10.sup.-8] increases linearly with the decrease of h. The results with the IQ and IP preconditioners are reasonably good, better than the results for the NS preconditioner which has, on the other hand, more potential for parallel implementation. We note that the stopping criterion and the level [10.sup.-8] used throughout the paper leads usually to much higher accuracy of the approximate solution than that required in practice in a finite element method framework. For a thorough discussion we refer to [5].

The iterative solution of the projected indefinite system (Algorithm 3.1, Step 3) is compared with the approach based on the sparse QR of the off-diagonal block [B.sup.T] Z. We report the memory requirement NNZ(QR) and the timings for the computation of the factors together with the number of nonzeros in the null-space basis Z (denoted as NNZ(Z2) here). We note that since the latter is equal to 2 * NIF + NEC the time for the construction of Z is negligible and it is not included in Table 4.4. Similarly to Table 4.3 in Table 4.4 we also included iteration counts and times for the iterative part of the QR approach that uses either both Q and R factors (QR) or only the factor R (SN).

The first set of matrices was obtained from a discretization of a model potential fluid flow problem with a constant tensor of permeability in a rectangular domain. Theoretical analysis and numerical experiments for the first set clearly indicate that the conditioning of the positive definite block A does not dramatically affect the behaviour of the conjugate gradient method used in the iterative part of the whole solution process. In addition, the linear dependence (or independence in the case of the QR approach) in the iteration counts of the conjugate gradient method on mesh size does not represent a serious difficulty in terms of the computational complexity, especially owing to the fact that in the three-dimensional case even large values of mesh size (h < 140) lead to a rather large problems, so a further decrease of h would lead to a practically infeasible system anyway. The second set of matrices comes from a real-world application of underground water flow modelling in the area of Straz pod Ralskem in northern Bohemia. Realistic values of hydraulic permeability lead to the positive definite diagonal block A with the condition number which may become a dominating factor for the behaviour of the iterative solver applied onto a projected system. This is illustrated in the following experiments. In Table 4.5 we give a description of the problems together with the inclusion sets for the extreme eigenvalues of A and extreme singular values of (B C) computed as for the model problem in Table 4.2.

Similarly as before, in Tables 4.6 and 4.7 we report the same quantities for the second set of matrices. It follows from Table 4.6 that also here the memory requirements and the times for computing the (sparse) QR decomposition are substantially larger than in the case of construction of the fundamental cycle null-space basis. For realistic examples, however, the iteration counts and timings for the conjugate gradient method applied on the system with [Z.sup.T]AZ (UN) dramatically increase and for last two examples exceed 9999 iterations. The iteration counts and timings for both QR approaches (QR and SN), on the other hand, remain comparable to the results in Table 4.3. Iterations counts and timings for the positive definite block-diagonal preconditioner (IP) and indefinite (constraint) preconditioner (IQ) in Table 4.7 are comparable to results in Table 4.4 and show that this approach is very efficient even for realistic problems. The Nash-Sofer preconditioning is, however, substantially worse for problems with the dominant tensor of hydraulic permeability. The QR approach applied to the projected indefinite system seems to be a useful approach. Nevertheless, it may fail in some cases.

Finally, we report a comparison of the dual variable approach from Section 3 with the primal approach based on the construction of the Schur complement matrix, and its subsequent solution by the conjugate gradient method [35]. Instead of considering the model potential flow problem (1.1) and (1.2) in a rectangular domain with a uniform mesh refinement [33], [34], where the primal approach typically outperforms our null-space based variants, we present results on our real-world problems where realistic values of hydraulic permeability tensor lead to the positive definite diagonal block A with a large condition number which significantly affects efficiency of iterative solvers applied to systems (1.3). In Table 4.8 we present the number of nonzeros of the corresponding Schur complement matrix, and projected matrix (3.1), iteration counts and total time for solving the linear system (1.3) including time for all initial transformations and substitutions. Here we considered both unpreconditioned and preconditioned variants. In the preconditioned case we applied IC(0) preconditioning to the Schur complement system, and the indefinite constraint preconditioning [21], [32], [43] to the indefinite projected system with block inverses approximated by IC(0). The results in Table 4.8 show that the dual variable variant based on the null-space basis of [C.sup.T] is significantly faster for the chosen set of real-world problems.

5. Conclusions. In this paper we have compared the computational efficiency of several dual methods for the solution of augmented linear systems coming from the mixed-hybrid finite element approximation of the potential fluid flow problem in porous media. We have discussed the approach based on the computation of a null-space basis either of the whole off-diagonal block [(B C).sup.T] or its orthogonal part [C.sup.T]. We have shown that although the sparse QR decomposition of the off-diagonal block is prohibitive for large problems in terms of memory requirements for storing the factors, its iterative part is very efficient (although the cost of iteration is rather high) and not dependent on the mesh size. On the other hand, the construction of the fundamental cycle null space basis is very fast, but the iteration counts are much worse. In addition, since the basis is non-orthogonal the number of iterations in the iterative part is no longer independent of the mesh size and in the case of more difficult tensors of hydraulic permeability may become very large. The cost of iteration is, however, owing to higher sparsity of the basis lower than for the QR approach. Good preconditioning of the projected matrix [Z.sup.T]AZ may be of help especially for realistic examples and in general it is an open question. For examples with moderate values of hydraulic permeability it seems useful to keep the projected matrix in factorized form.

The approach based on the null space of the off-diagonal block [C.sup.T] seems to be more efficient both in terms of the memory requirements and computational cost. The null-space basis of [C.sup.T] can be explicitly given and the construction of the resulting projected (mixed) system is cheap. Again, the sparse QR decompostition of [Z.sup.T] B (if it is not prohibitive) leads to lower iteration counts and times in the iterative part. Numerical experiments on all examples indicate that the pure iterative solution of the projected and still indefinite system is a very promising approach especially together with some efficient preconditioning techniques like the indefinite (constraint) or block-diagonal positive definite preconditioner. Moreover, following the discussion of Section 3.3, we can take advantage of (3.1) for an efficient parallel implementation of the matrix by vector product.

Acknowledgement. Authors would like to thank Marco Manzini for helping us with experiments and useful comments on the implementation of our codes. We are also indebted to the Dept. of Mathematical Modelling in DIAMO, s.e., Straz pod Ralskem for providing us with realistic numerical examples for the experimental part of this paper.

REFERENCES

[1] F. ALVARADO, matrix enlarging methods and their application. Direct methods, linear algebra in optimization, iterative methods , BIT, 37 (1997), pp. 473-505.

[2] R. A MIT, C. HALL, AND T. PORSCHING, An application ofnetwork theory to the solution of implicit Navier-Stokes difference equations, J. Comp. Phys., 40 (1981), pp. 183-201.

[3] E. ANDERSON, Z. BAI, C. BISCHOF, J. DEMMEL, J. DONGARRA, J. D. CROZ, A. GREENBAUM, S. HAM- MARLING, A. MCKENNEY, S. OSTROUCHOV, AND D. SORENSEN, LAPACK User's Guide, SIAM, Philadelphia, 1992.

[4] M. ARIOLI, The use of QR factorization in sparse quadratic programming and backward error issues, SIAM J. Matrix Anal. and Applies., 21 (2000), pp. 825-839.

[5] --, A stopping criterion for the conjugate gradient algorithm in a finite element method framework, Numer. Math., 97 (2004), pp. 1-24.

[6] M. BENZI AND G. GOLUB, A preconditioner for generalized saddle point problems, SIAM Journal Matrix Anal. Appl., 26 (2004), pp. 20-41.

[7] L. BERGAMASCHI, S. MANTICA, AND F. SALERI, Mixed finite element approximation of Darcy's law in porous media, tech. report, CRS4, Cagliari, Italy, 1994.

[8] L. BERGAMASCHI AND M.PUTTI, Mixed finite elements and Newton-like linearizations for the solution of Richard's equation, Internat. J. Numer. Methods Engrg., 45 (1999), pp. 1025-1046.

[9] F. BREZZI AND M. FORTIN, Mixed and Hybrid Finite Element Methods, Springer-Verlag, Berlin, 1992.

[10] J. BURKARDT, C. HALL, AND T. PORSCHING, The dual variable method for the solution of compressible fluid flow problems, SIAM J. Alg. Disc. Meth., 7 (1986), pp. 476-483.

[11] A. CASSELL, J. DE C. HENDERSON, AND A. KAVEH, Cycle bases for the flexibility analysis of structures, Int. J. Num. Meth. Engng., 8 (1974), pp. 521-528.

[12] T. COLEMAN AND A. POTHEN, The null space problem L Complexity, SIAM J. Alg. Disc. Meth., 7 (1986), pp. 527-537.

[13] B. F. DE VEUBEKE, Displacement and equilibrium models in the finite element method in: Stress Analysis, C. Zienkiewicz, G. Holister, eds., Wiley, Chichester, 1965.

[14] N. DEO, G. PRABHU, AND M. KRISHNAMOORTHY, Algorithms for generating fundamental cycles in a graph, ACM Trans. Math. Software, 8 (1982), pp. 26-42.

[15] I. DUFF, N. GOULD, J. REID, J. SCOTT, AND K. TURNER, Factorization of sparse symmetric indefinite matrices, IMA J. Numer. Anal., 11 (1991), pp. 181-204.

[16] H. ELMAN, Multigrid and Krylov subspace methods for the discrete Stokes equations, Int. J. Numerical Methods in Fluids, 22 (1996), pp. 755-770.

[17] H. ELMAN AND G. GOLUB, Inexact and preconditioned Uzawa algorithms for saddle point problems, SIAM J. Numer. Anal., 31 (1994), pp. 1645-1661.

[18] M. GAREY AND D. JOHNSON, Computers and Intractability: A Guide to the Theory of NP-Completeness: A Series of Books in the Mathematical Sciences., W. H. Freeman and Co., San Francisco, Calif., 1979.

[19] N. GIBBS, W. P. JR., AND P. STOCKMEYER, An algorithm for reducing the bandwidth and profile of a sparse matrix, SIAM J. Numer. Anal., 13 (1976), pp. 236-250.

[20] G. GOLUB AND C. VAN LOAN, Matrix Computations, The Johns Hopkins University Press, Baltimore, 3 ed., 1996.

[21] N. GOULD, M. HRIBAR, AND J. NOCEDAL, On the solution of equality constrained quadratic programming problems arising in optimization, SIAM J. Sci. Comput., 23 (2001), pp. 1375-1394.

[22] J. GRCAR, Matrix stretching for linear equations, Tech. Report Technical Report SAND90-8723, Sandia National Laboratories, 1990.

[23] A. GREENBAUM, Iterative Methods for Solving Linear Systems, SIAM, Philadelphia, 1997.

[24] C. HALL, Numerical solution of Navier-Stokes problems by the dual variable method, SIAM J. Algebraic Discrete Methods 6 (1985), pp. 220-236.

[25] M. HEATH, R. PLEMMONS, AND R. WARD, Sparse orthogonal schemes for structural optimization using the force method, SIAM J. Sci. Stat. Comput., 5 (1984), pp. 514-532.

[26] J. HENDERSON AND E. MAUNDER, A problem in applied topology: on the selection of cycles for the flexibility analysis of skeletal structures, J. Inst. Maths. Applies., 5 (1969), pp. 254-269.

[27] M. HESTENES AND E. STIEFEL, Method of conjugate gradients for solving linear systems, J. Res. Nat. Bureau Standards, 49 (1952), pp. 409-435.

[28] R. HIPTMAIR AND R. HOPPE, Multilevel preconditioning for mixed problems in three dimensions, Numerische Mathematik, 82 (1999), pp. 253-279.

[29] HSL, A collection of Fortran codes for large scale scientific computation, 2000. http://www.ese.elre.ac.uk/Activity/HSL.

[30] E. KAASSCHIETER AND A. HUIJBEN, Mixed-hybrid finite elements and streamline computation for the potential flow problem, Numer. Methods Partial Differential Equations 8 (1992), pp. 221-266.

[31] A. KAVEH, A combinatorial optimization problem: optimal generalized cycle basis, Comput. Methods Appl. Mech. Engrg., 20 (1984), pp. 983-998.

[32] L. LUKSAN AND J. VLCEK, Indefinitely preconditioned inexact Newton method for large sparse equality constrained non-linear programming problems, Numer. Linear Algebra with Appl., 5 (1998), pp. 219-247.

[33] J. MARY 9KA, M. ROZLOZNfK, AND M. T6MA, Mixed-hybrid finite element approximation of the potential fluid flow problem, J. Comput. Appl. Math., 63 (1995), pp. 383-392.

[34] --, The potential fluid flow problem and the convergence rate of the minimal residual method, Numer. Linear Algebra with Applications, 3 (1996), pp. 525-542.

[35] --, Schur complement systems in the mixed-hybrid finite element approximation of the potential fluid flow problem, SIAM J. Sci. Comput., 22 (2000), pp. 704-723.

[36] K. G. MURTHY, Network Programming, Prentice Hall, Englewood Cliffs, NJ, 1992.

[37] S. NASH AND A. SOFER, Preconditioning reduced matrices, SIAM J. Matrix Anal. Appl., 17 (1996), pp. 47-68.

[38] J. NEDELEC, Mixed finite elements in IR3, Numerische Mathematik, 35 (1980), pp. 315-341.

[39] C. C. PAIGE AND M. A. SAUNDERS, Solution of sparse indefinite systems of linear equations, SIAM Journal on Numerical Analysis, 12 (1975), pp. 617-629.

[40] I. PERUGIA, V. SIMONCINI, AND M. ARIOLI, Linear algebra methods in a mixed approximation of magnetostatic problems, SIAM J. Sci. Comput., 21 (1999), pp. 1085-1101.

[41] R. PLEMMONS AND R. WHITE, Substructuring methods for computing the nullspace of equilibrium matrices, SIAM J. Matrix Anal. Appl., 11 (1990), pp. 1-22.

[42] A. POTHEN, Sparse null basis computations in structural optimization, Numer. Math., 55 (1989), pp. 501-519.

[43] M. ROZLOZNIK AND V. SIMONCINI, Krylov subspace methods for saddle point problems with indefinite preconditioning, SIAM J. Math. Anal. Appl., 24 (2002), pp. 368-391.

[44] T. RUSTEN AND R. WINTHER, A preconditioned iterative method for saddle point problems, SIAM J. Math. Anal. Appl., 13 (1992), pp. 887-904.

[45] Y. SAAR, Iterative Methods for Sparse Linear Systems, PWS Publishing Company, ITP, 1996.

[46] R. SCHEICHL, Iterative solution of saddle-point problems using divergence free finite elements with applications to groundwater flow, PhD thesis, University of Bath, Great Britain, 2000.

[47] D. SILVESTER AND A. WATHEN, Fast iterative solution of stabilized Stokes systems, part L Using simple diagonal preconditioners, SIAM J. Numer. Anal., 30 (1993), pp. 630-649.

[48] E. STIEFEL, Relaxationsmethoden besterr Strategic zur Losung linearer Gleichungssysteme, Commentaxii Mathematici Helvetici, 29 (1955), pp. 157-179.

[49] M. TUMA, A note on LDLT-decomposition of matrices from saddle-point problems, SIAM J. Matrix Anal. Appl., 23 (2002), pp. 903-915.

[50] C. WAGNER, W. KINZELBACH, AND G. WITTUM, Schur-complement multigrid: A robust method for groundwater flow and transport problems, Num. Math., 75 (1997), pp. 523-545.

[51] Y. WANG, Preconditioning for the mixed formulation of linear plane elasticity, PhD thesis, Department of Mathematics, A&M University, Texas, 2004.

[52] A. WATHEN, B. FiSCHER, AND D. SILVESTER, The convergence of iterative solution methods for symmetric and indefinite linear systems, in: Numerical Analysis 1997, D.E Griffiths and D.J. Higham and G.A. Watson, eds., Pitman Research Notes in Mathematics Series, Longman, 1997.

[53] T. R. Z. C At, R. R. PAR AS HKEVGV AND X. YE, Domain decomposition for a mixed finite element method in three dimensions, SIAM J. Numer. Anal., 41 (2003), pp. 181-194.

* Received January 3, 2005. Accepted for publication September 15, 2005. Recommended by M. Benzi. This work was supported by the project IET400300415 within the National Program of Research "Information Society" and by the project No. IAA1030405 by GA AS CR. The work of the first author was supported in part by EPSRC grants GR/R46641/01 and GR/S42170.

M. ARIOLI ([dagger]), J. MARYSKA ([double dagger]), M. ROZLOZNIK ([double dagger]), AND M. TUMA ([double dagger])

([dagger]) Rutherford Appleton Laboratory, Chilton, Didcot, Oxon, OX11 0QX, UK (M.Arioli@rl.ac.uk).

([double dagger]) Institute of Computer Science, Academy of Sciences of the Czech Republic, Pod vodarenskou vezi 2, 182 07 Prague 8, Czech Republic and Technical University of Liberec, Department of Modelling of Processes, Halkova 6, CZ-461 17 Liberec, Czech Republic (jiri.maryska@vslib.cz, miro@cs.cas.cz, tuma@cs.cas.cz).

TABLE 4.1 Model potential fluid flow problem on a rectangular domain with a constant tensor of hydraulic permeability. The quantity NE denotes the number of elements, NIF stands for the number of interior inter-element faces, NLIC and NNC denotes the number of Dirichlet and Neumann boundary conditions, respectively. The dimension of the null-space of [(B C).sup.T] is given as NZl = 4 * NE - NIF - NNC and the dimension of the null-space of [C.sup.T] is given as NM =5*NE-NIF-NNC. Discretization parameters Dimension of null- spaces h NE NIF NDC NNC NZl NZ2 1/5 250 525 100 100 375 625 1/10 2000 4600 400 400 3000 5000 1/15 6750 15975 900 900 10125 16875 1/20 16000 38400 1600 1600 24000 40000 1/25 31250 75625 2500 2500 46875 78125 1/30 54000 131400 3600 3600 81000 135000 1/35 87750 209475 4900 4900 138625 226375 1/40 128000 313600 6400 6400 192000 320000 TABLE 4.2 Model potential fluid flow problem on a rectangular domain with a constant tensor of hydraulic permeability. Spectral properties of the matrix blocks and the whole indefinite system for different values of mesh size h. The extreme eigenvalues and singular values were approximated using the symmetric Lanczos process and subsequent computation of the eigenvalues of resulting tridiagonal form. spectrum of matrix blocks h spectrum of A s.v. of (B C) 1/5 [0.0016, 0.01] [0.1810, 2.63] 1/10 [0.0033, 0.02] [0.0927, 2.64] 1/15 [0.0050, 0.03] [0.0622, 2.64] 1/20 [0.0066, 0.04] [0.0467, 2.64] 1/25 [0.0083, 0.05] [0.0374, 2.65] 1/30 [0.0099, 0.06] [0.0312, 2.65] 1/35 [0.0110, 0.07] [0.0268, 2.65] 1/40 [0.0130, 0.08] [0.0234, 2.65] whole indefinite system h negative part positive part 1/5 [-2.63,-0.1800] [0.00166, 2.63] 1/10 [-2.64,-0.08981 [0.00335, 2.64] 1/15 [-2.64,-0.0354] [0.00509, 2.65] 1/20 [-2.64,-0.0413] [0.00679, 2.65] 1/25 [-2.64,-0.0311] [0.00861, 2.65] 1/30 [-2.64,-0.0241] [0.01040, 2.65] 1/35 [-2.64,-0.0190] [0.01200, 2.65] 1/40 [-2.64,-0.0152] [0.01360, 2.65] TABLE 4.3 Model potential fluid flow problem on a rectangular domain with a constant tensor of hydraulic permeability. Memory requirements (the number of nonzero entries NNZ(QR) and NNZ(Z1))of the approaches using the null-space basis of the whole block [(B C).sup.T], iteration counts and timings (in brackets for both approaches) of the conjugate gradient method applied to the projected positive definite system. memory requirements h QR approach fund. cycles NNZ(QR) NNZ(Z1) 1/5 28360 3360 (3e-2) (7e-3) 1/10 410466 47120 (0.97) (0.07) 1/15 1979203 226780 (9.73) (0.30) 1/20 7120947 697840 (59.6) (0.93) 1/25 18105131 1675800 (237) (2.21) 1/30 40837823 3436160 (980) (4.60) 1/35 -- 6314420 (8.64) 1/40 -- 10706080 (14.8) iteration counts h QR approach fund. cycles QR SN UN 1/5 22 20 71 (0.17) (0.44) (0.08) 1/10 22 21 163 (1.87) (4.23) (1.57) 1/15 22 21 252 (8.48) (17.1) (19.9) 1/20 22 21 346 (25.0) (48.6) (75.9) 1/25 22 21 438 (57.2) (107) (222) 1/30 21 21 523 (110) (214) (510) 1/35 -- -- 596 (1009) 1/40 -- -- 670 (1900) TABLE 4.4 Model potential fluid flow problem on a rectangular domain with a constant tensor of hydraulic permeability. Number of nonzeros of the projected matrix onto the null-space basis Z of the block [C.sup.T] (see Algorithm 3.1, Step 3), iteration counts and timings of the preconditioned conjugate gradient method applied to the orthogonally projected indefinite system compared to the memory requirements and iteration counts for the solution of the same system based on the sparse QR decomposition of its off-diagonal block [Z.sup.T] B. pure iteration h NNZ(Z2) IP IQ NS 1/5 14375 62 35 55 (0.05) (0.03) (0.10) 1/10 123000 103 64 108 (0.68) (0.48) (1.6) 1/15 424125 144 93 160 (5.17) (3.79) (13.6) 1/20 1016000 186 118 212 (20.2) (14.2) (49.6) 1/25 1996875 225 145.00 265 (50.8) (37.4) (122) 1/30 3465000 260 174 311 (111) (84.2) (268) 1/35 5518625 295 204 362 (224) (173) (520) 1/40 8256000 331 230 412 (383) (295) (941) sparse QR h NNZ(QR) QR SN 1/5 20834 18 14 (0.02) (0.09) (0.09) 1/10 356267 19 16 (0.35) (1.11) (0.89) 1/15 1840670 21 15 (3.14) (6.09) (4.63) 1/20 6322468 21 15 (17.97) (18.3) (14.94) 1/25 16661544 23 15 (86.6) (47.0) (27.8) 1/30 40669978 22 15 (584.0) (96.7) (85.5) 1/35 -- -- -- 1/40 -- -- -- TABLE 4.5 Realistic problems from undergound water flow modelling in Straz pod Ralskem. The name of the problem, the number of elements NE and the dimension of the whole indefinite system N = 6 * NE + NIF + NNC. The spectral properties of the matrix blocks A and (BC) for all matrices. The extreme eigenvalues and singular values were approximated using the symmetric Lanczos process and subsequent computation of the eigenvalues of the resulting tridiagonal form discretization parameters name NE N k1san 14700 126980 olesnik0 24300 210060 dpretok 36300 313940 turon 50700 438620 spectrum of matrix blocks name spectrum of A s.v. of (B C) k1san [0.21e-4,0.80e2] [0.026,2.64] olesnik0 [0.74e-4,0.91e3] [0.020,2.64] dpretok [0.77e-3,0.12e5] [0.017,2.64] turon [0.19e-4,0.96e2] [0.014,2.64] TABLE 4.6 Realistic problems from underground water flow modelling in Straz pod Ralskem. Memory requirements of the approaches using the null-space basis of the whole block (B C), iteration counts and timings of the conjugate gradient method applied to the projected positive definite system. memory requirements Name QR approach fund. cycles NNZ(QR) NNZ(Z1) k1san 3674914 983640 (38.1) (0.95) olesnik0 6626296 2057880 (102) (2.03) dpretok 10453556 3719320 (224) (3.73) turon 15398104 6095960 (434) (6.62) iteration counts Name QR approach fund. cycles QR SN UN k1san 44 44 2635 (34.4) (78.4) (703) olesnik0 58 58 4544 (79.1) (181) (2397) dpretok 37 37 >9999 (78.6) (187) (--) turon 36 36 >9999 (116) (265) (--) TABLE 4.7 Realistic problems from underground water flow modelling in Straz pod Ralskem. Number of nonzeros of the projected matrix onto the null-space basis Z of the block [C.sup.T] (see Algorithm 3.1, Step 3), iteration counts and timings of the preconditioned conjugate gradient method applied to the orthogonally projected indefinite system compared to the memory requirements and iteration counts for the solution of the same system based on the sparse QR decomposition of its off-diagonal block [Z.sup.T] B. pure iteration Name NNZ(Z2) IP IQ NS k1san 862820 184.00 76.00 3156 (17.1) (7.86) (629) olesnik0 1426140 287.00 103 5582 (44.9) (18.4) (1846) dpretok 2130260 112.00 51 1705 (26.3) (14.1) (865) turon 2975180 155.00 80 442 (56.0) (32.7) (325) sparse QR Name NNZ(QR) QR SN k1san 3284826 93 93 (5.80) (51.1) (59.1) olesnik0 6007628 >9999 >9999 (13.7) (--) (--) dpretok 9495418 23 23 (26.1) (35.6) (42.3) turon 14426491 26 26 (49.6) (59.0) (72.1) TABLE 4.8 Real application problems from the underground water flow modelling in Straz pod Ralskem. Memory requirements, iteration counts and total timings of the pure and preconditioned conjugate gradient method applied to the Schur complement system with ((-A/A)/[A.sub.11])/ [B.sub.22] compared to the memory requirements, iteration counts and total timings for the solution of the projected system (3.1) using the pure and preconditioned MINRES method. Matrix Schur complement approach NNZ(S) unprec prec k1san 33880 6632 221 (244) (13.3) olesnik0 56160 > 9999 25 (--) (92.7) dpretok 84040 > 9999 407 (--) (65.9) turon 117520 1843 376 (302) (91.8) Matrix dual variable approach NNZ(Z2) unprec prec k1san 49420 650 76 (36.2) (10.3) olesnik0 81540 727 103 (65.7) (20.5) dpretok 121660 784 51 (117) (22.5) turon 169780 722 80 (161) (40.3)

Printer friendly Cite/link Email Feedback | |

Author: | Arioli, M.; Maryska, J.; Rozloznik, M.; Tuma, M. |
---|---|

Publication: | Electronic Transactions on Numerical Analysis |

Date: | Jan 1, 2006 |

Words: | 12431 |

Previous Article: | Probing methods for saddle-point problems. |

Next Article: | A network programming approach in solving Darcy's equations by mixed finite-element methods. |