Computing matrix inverses by divide-and-conquer method with floating point calculations.

AMS Subject Classification: 15A09 Introduction

The standard method for solving a system of equations

Ax = b (1)

Where A [member of] [R.sup.nxn] is a general matrix, is based on the LU- factorization algorithm

A = LU (2)

Where L [member of] [R.sup.nxn] is a lower triangular matrix and U [member of] [R.sup.nxn] is upper triangular, as follows: first solve Lb'= b by forward substitution, and then solve Ux = b' by backward substitution to obtain the solution . Another way of solving (1) is to form the inverse [A.sup.-1] of the matrix A and simply compute x = [A.sup.-l]b.

There is much theory in linear algebra about inverses of matrices. For example Lay  list twelve necessary conditions for the existence of an inverse of an inverse of a square n x n matrix building up his so called The Invertible Matrix Theorem . However; explicit inversion of matrices as a mean for solving linear systems of equations is more or less considered as a sin among numerical analysts from the following reasons: It costs three times more floating point operations than the ordinary LU factorization approach, and it's less stable, especially for close-to-singular and ill-conditioned matrices .

On the other hand, there are other situations where explicit inversion is required, e.g., some situations in statistics and some matrix iterations in certain eigenvalue problems and for computing the solution of some matrix equation using Newton- style iterations on the matrix sign functions. [1, 2, 3]. Since explicit matrix iteration is costly and unstable, it is also challenging and interesting task in itself.

There exist a number of blocked and unblocked algorithms for explicit computation of the inverse of a general square matrix, which are more or less based on the LU factorization above .

There are not much work done on recursive blocked algorithms for computing matrix inverses. Recursion has benefit of being a divide-and-conquer approach that often can improve on performance by automatically matching the matrix blocks with the different cache in modern computers [5, 6, 8].

Relevant comparisons will be made relevant to existing algorithms concerning computational complexity, performance, accuracy, which can be measured by the size of the relative residual norm

[res.sub.inv] max([[parallel]I - AA-1[parallel].sub.F],[[paralle]I - [A.sup.-1] A[paralel].sub.F]),[[parallel]A[parallel].sub.F]

Where I is the identity matrix of order n and [parallel] [[parallel].sub.F] is the Frobenius norm

In this paper we study a different approach, based on divide-and conquer algorithm, for finding the inverse of matrices, the scalar and vector multiplications are computed by floating point error free algorithms.

In Section 2 we review the suggested algorithm for finding the inverse of the lower triangular matrices. The algorithms and a brief study of the computational methods are also described. In Section 3 we analyze the performance of the algorithm. Concluding remarks are given in Section 4.

Divide-and-Conquer Recursion Method

The divide-and-conquer strategy solves a problem by breaking it into sub- problems that are themselves smaller instances of the same type of problem, recursively solving these problems and appropriately combining their answers. The real work is done piecemeal, in three different places: in the partitioning of the problems into sub-problems, at the very tail of the recursion, when the sub-problems are so small that they are solved outright; and in the gluing together of partial answers. These are held together and coordinated by the algorithm's core recursive structure.

Algorithm 1. Divide-and-Conquer Algorithm For [A.sup.-1]

Assume a n x n matrix A

Use the permutation matrices to reduce the matrix into a lower triangular matrix. Use the error-free vector transformation subroutine to evaluate all matrix products,

Partition A in 4 n/2xn/2 blocks: A = [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Then we have [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Hence computing the matrix [A.sup.-1] is reduced to computing [A.sup.-1] and [[A.sup.-1].sub.3], each being lower triangular of order n / 2x n / 2,

The inverse of the lower triangular [A.sub.1] and [A.sub.3] matrices can be computed in a similar way

The recursion should stop at a certain basecase where a small ( say 2 x 2) explicit inverse is carefully computed and returned to a higher level in the recursion tree Compute x = [A.sup.-l] b

In order to minimize the error, the matrix product--[A.sup.-1.sub.3] [A.sub.2] [[A.sup.-1.sub.3] can be computed very efficiently using a good implementation of the unexpected feature of floating point format [9, 12]. We start by clarifying the notations used. We denote by fl(*) the result of floating-point computation, where all operations within the parentheses are executed in working precision. Consider a set of floating-point numbers F, for instance double precision floating numbers. Let a matrix A [member of] [F.sup.nxn] be given. The only requirement for the following algorithms are floating-point operations in the double precision format. Any two vectors x, y [member of] [F.sup.n] can be transformed with the error-free algorithms in Dekker  into a vector z [member of] [F.sup.2n] such that [x.sup.T] y = [[summation].sup.2n.sub.i=1] [z.sub.i]. Thus we concentrate on computing a sum [summation] P for P [member of] [F.sup.n] calculated in J-fold. For two matrices A, B [member of] [F.sup.nxn] the ordinary product in working precision is denoted by fl(A.B) . If the product is executed in J-fold precision and stored in J-fold precision, we write [fl.sub.j,j] (A.B) comprises of j matrices in [F.sup.nxn]. If the product is executed in J-fold precision and stored in working precision, we write [fl.sub.j,1] (A.B).

In 1969 Knuth  developed an error-free transformation of the sum of two floating-point numbers. Knuth 1969  algorithm transforms any pair of floating-point numbers (a, b) into a new pair (x, y) with x = fl(a + b) and x + y = a + b.

Algorithm 2. Error-free transformation for the sum of two floating-point numbers

function [x, y] = TwoSum(a, b) x = fl(a + b) z = fl( x - a)

y = fl((a - (x - z)) + (b - z))

Algorithm 3. Error-free vector transformation

function q = VecSum( p)

[q.sub.1] = [p.sub.1]

for i = 2 : n

[[q.sub.i], [q.sub.i-1]]= TwoSum([p.sub.i], [q.sub.i-1])

Algorithm 4. Summation computed in J-fold precision with J results function {res} = SumKL([P.sup.(0)], J)

for j = 0: J - 2

[P.sup.(j+1)] = VecSum([P.sub.1...n-j])

[res.sub.j+1] = [P.sup.j-1.sub.n-j]

[res.sub.j] = fl([summation].sup.n-J+1.sub.i=1] [P.sup.(j-1).sub.i]

Numerical Computations

All computational results in this section are performed in Matlab  using double precision as working precision.

Example I. Hilbert matrix

Hilbert matrix is considered to be a classical example of an ill-conditioned matrix, the ij-th components are given by [h.sub.ij] = 1/(i + j -1), the dimension for which the matrix considered is n=21 with cond([H.sub.21]) = 8.2 x [10.sup.29]. The results obtained are shown in table 3.1.

Example II

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

This is extremely ill-conditioned matrix with cond(A)=6.4e64. The MATLAB inverse inv(A) of the above matrix gives completely incorrect result compared to the true inverse due to the round errors.

inv(A) = [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

While the true inverse of A is given by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The floating point algorithms are used to reduce the matrix into an upper triangular matrix, then partitioning it into four 2 x 2 matrices. The computational result of algorithm 1.1 gives

Concluding Remarks

We have studied a divide-and conquer algorithm together with floating point numbers transform for finding the inverse of ill-conditioned matrix A, our method, based on reducing the matrix into a lower triangular matrix, partitioning it into n / 2 x n / 2 sub-matrices, the inverse of the matrix is reduced in computing the inverse of sub-matrices. Using the error free algorithms seems to be a great tools in reducing the errors and of course to get good approximation for the computations of the inverse of ill-conditioned systems.

REFERENCES

 P. Benner, E.S. Quintana, and G. Quintana. Solving Stable Sylvester Equations via Rational Iterative Schemes. Preprint sfb393/04-08, TU Chemnitz, 2004.

 P. Benner, E.S. Quintana, and G. Quintana. Numerical Solution of Discrete Stable Linear Matrix Equations on Multicomputers. Parallel Algorithms and Applications, 17(1):127{146, 2002.

 P. Benner and E. S. Quintana. Solving Stable Generalized Lyanpunov Equations with the matrix sign functions. Numerical Algorithms, 20(1):75{100, 1999.

 T. Dekker. A Floating-Point Technique for Extending the Available Precision. Numerische Mathematik, 18 (1971), pp. 224-242

 N. J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM, Philadelphia, PA, second edition, 2002.

 I. Jonsson and B. K[degrees]agstrAom. Recursive Blocked Algorithms for Solving Triangular Systems. I. One-Sided and Coupled Sylvester-Type Matrix Equations. ACM Trans. Math. Software, 28(4):392{415, 2002.

 I. Jonsson and B. K[degrees]agstrAom. Recursive Blocked Algorithms for Solving Triangular Systems. II. Two-Sided and Generalized Sylvester and Lyapunov Matrix Equations. ACM Trans. Math. Software, 28(4):416{435, 2002.

 G. F. Jonsson I. Elmroth, E. and B. K[degrees]agstrAom. Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software. SIAM Review, 46(1):3{45, 2004.

 D. Knuth. The Art of Computer Programming: Seminumerical Algorithms. vol. 2, Addison Wesley, Reading, Massachusetts, 1969.

 D. C. Lay. Linear Algebra and its Applications, 2nd edition. Addison- Wesley, 1997.

 MATLAB User's Guide, Version 7.3, The Math Works Inc., 2006.

 T. Ogita, S. Rump, and S. Oishi. Accurate Sum and Dot Product. SIAM Journal on Scientific Computing (SISC), 26 (2005), pp. 1955-1988.

Awni M. Abu-Saman

Associate Professor of Applied Mathematics Mathematics Department, Alazhar University-Gaza P.O.Box 1277, Gaza-Gaza Strip, Palestinian Authority
```Table 3.1. Computational results of algorithm for Hilbert 21 x 21
matrix.

J      cond([H.sup.-1])   [parralle]I - [H.sup.-1]   [res.sub.inv]
H[parralle].sub.F]

2      7.14 e15           2.03 e03                   1.06 e+03
3      5.02E+21           815                        425
4      9.13E+29           5.21E-14                   2.72E-14

Table 3.2. Computational results of algorithm 1 for Example II
matrix.

J      cond([A.sup.-1])   [parallel]I - [A.sup.-1]   [res.sub.inv]
A[parallel].sub.F]

2      3.56 e19           6.33 e02                   1.9491E-14
3      9.72E+28           437                        1.3456E-14
4      7.91E+58           3.79E-11                   0
```
Author: Printer friendly Cite/link Email Feedback Abu-Saman, Awni M. International Journal of Computational and Applied Mathematics Report Jul 1, 2012 1797 On the law of large numbers for trigonometric series. Strongly multiplicative labeling in the context of some graph operations. Algorithms Functions, Inverse Inverse functions Matrices Matrices (Mathematics) Systems of equations