Algorithm 786: Multiple-Precision Complex Arithmetic and Functions.
Categories and! Subject Descriptors: G.1.0 [Numerical Analysis]: General--computer arithmetic; G.1.2 [Numerical Analysis]: Approximation--elementary function approximation; G.4 [Mathematics of Computing]: Mathematical Software--algorithm analysis; efficiency, portability
General Terms: Algorithms, Performance, Reliability
Additional Key Words and Phrases: Accuracy, complex arithmetic, floating point, function evaluation, mathematical library, multiple precision, portable software
The ZM package is a collection of Fortran subroutines that performs floating-point multiple-precision evaluation of complex arithmetic and elementary functions. These routines use the FM package [Smith 1991] for real multiple-precision arithmetic, constants, and elementary functions.
Brent's MP package [Brent 1978] did not support complex arithmetic, and Bailey's more recent MP package [Bailey 1993; 1995] provides complex arithmetic and some complex elementary functions and contains a Fortran 90 module defining multiple-precision derived types.
ZM also provides a Fortran 90 module that defines three multiple-precision data types and provides the interface routines for overriding arithmetic Operators and intrinsic Fortran 90 functions. This allows a program to declare variables as multiple precision real, integer, or complex and then to describe operations on these variables using the normal syntax for arithmetic expressions.
ZM versions are available for the numerical Fortran 90 intrinsic functions, and they provide good speed, rounding, and exception handling. They support flexible input and output conversion and have the capability for automatic tracing of arithmetic and functions.
There are three multiple-precision data types:
FM (multiple-precision real) IM (multiple-precision integer) ZM (multiple-precision complex)
Table I lists the operations defined in the interface module for the three derived types. For each of the operations =, +, -, *,/, **, .EQ., .NE., .GT., .GE., .LT., and .LE., the interface module defines all mixed-mode variations involving one of the three multiple-precision-derived types and another argument having one of the types integer, real, double, complex, complex double, FM, IM, ZM.
Table I. Available Operations for Multiple Precision Derived Types = real integer complex HUGE + real integer complex INT - real integer complex LOG * real integer complex LOG10 / real integer complex MATMUL ** real integer complex MAX .EQ. real integer complex MAXEXPONENT .NE. real integer complex MIN .GT. real integer complex MINEXPONENT .GE. real integer complex MOD .LT. real integer complex MODULO .LE. real integer complex NEAREST ABS real integer complex NINT ACOS real complex PRECISION AIMAG complex RADIX AINT real complex RANGE ANINT real complex REAL ASIN real complex RRSPACING ATAN real complex SCALE ATAN2 real SETEXPONENT BTEST integer SIGN CEILING real complex SIN CMPLX real integer SINH CONJ complex SPACING COS real complex SQRT COSH real complex TAN DBLE real integer complex TANH DIGITS real integer complex TINY DIM real integer TO_FM DINT real complex TO_IM DOTPRODUCT real integer complex TO_ZM EPSILON real TO_INT EXP real complex TO_SP EXPONENT real TO_DP FLOOR real integer complex TO_SPZ FRACTION real complex TO_DPZ = real integer complex + real integer complex - real complex * real complex / real integer complex ** real integer .EQ. real .NE. real integer .GT. real .GE. real integer .LT. real integer .LE. real ABS real integer complex ACOS real complex AIMAG real integer complex AINT real integer complex ANINT real integer complex ASIN real ATAN real complex ATAN2 real BTEST real integer CEILING real complex CMPLX real complex CONJ real COS real complex COSH real complex DBLE real complex DIGITS real integer complex DIM real integer complex DINT real integer complex DOTPRODUCT real integer complex EPSILON real integer complex EXP real integer complex EXPONENT real integer complex FLOOR real integer complex FRACTION real integer complex A=12 A = A + 1 IF (ABS(A).LT.1.0D-23) THEN
are handled correctly.
Not all the named functions are defined for all three multiple precision derived types, so Table I shows which can be used. The labels "real," "integer," and "complex" refer to types FM, IM, and ZM respectively. For functions that accept two or more arguments, like ATAN2 or MAX, all the arguments must be of the same type.
In addition to these types, the three conversion functions TO_FM, TO_IM, and TO_ZM accept character strings for input arguments, (e.g., TO_FM(`3.45')), and those functions also accept any of the machine precision data types integer, real, double, complex, or complex double. For converting a multiple-precision number to one of the machine precision types, there are five conversion functions: TO_INT, TO_SP, TO_DP, TO_SPZ, and TO_DPZ.
Prior to Bailey's package, most multiple-precision packages stored the numbers in integer arrays. This was faster for machines of the time, but Bailey reported that for some supercomputers and scientific workstations it was better to use double precision. His routines stored the numbers in single-precision arrays and used double-precision arithmetic to accumulate the multiple-precision results.
The advantage of using double-precision arithmetic is that a larger base can be used for the multiple-precision arithmetic. This reduces the number of array elements required to store a number with a given precision. The base is usually restricted to be less than the square root of the largest representable integer. Using integer arithmetic on a 32-bit computer, each word holds about four decimal digits, compared to seven for double precision.
Multiplication, division, and functions have running times no faster than O([t.sup.2]) for t words of precision unless t is large. These times may be faster by a factor of about three if double-precision arithmetic is as fast as integer arithmetic. At commonly used lower precisions (20 to 60 decimal digits), the overhead involved in a general multiple-precision package reduces this potential speedup factor.
However, some machines are faster using integer arithmetic. To increase the flexibility of both the FM and ZM packages, they have been written so that the type of arithmetic used for the basic operations can be easily changed. The user can select the type of arithmetic for the packages to use internally. The default is to use double precision. To produce an equivalent integer version, the user duplicates the source code file and uses a text editor to make two (global) string replacements in the code. The details are given in the documentation at the beginning of the new FM package.
The new version of the FM package incorporates several changes that support the ZM routines. There are some new routines, and most of the old ones are faster. The division routine uses a new algorithm [Smith 1996] that is about twice as fast as the original. In addition, the package now includes a collection of routines for integer multiple-precision arithmetic.
Here are some timing comparisons at 50 significant digits and 1000 significant digits that illustrate the effect of using different arithmetic on different machines (Tables II and III). For a precision level of 50 digits, the integer runs used 14 digits in base [10.sup.4], while double precision used 8 digits in base [10.sup.7]. The reason 14 digits are needed in base [10.sup.4] is that normalization can cause the first word to have only one significant digit, so only about 53 significant digits can be assumed to be present.
Table II. Complex Arithmetic Timing (seconds per call on a 604 Macintosh) 50 S.D. 1000 S.D. ZM Integer ZM D.P. ZM Integer ZM D.P. add 0.000 008 0.000 008 0.000 053 0.000 037 subtract 0.000 009 0.000 008 0.000 052 0.000 038 multiply 0.000 055 0.000 047 0.003 130 0.001 200 divide 0.000 142 0.000 074 0.011 500 0.001 690 sqrt(z) 0.000 228 0.000 186 0.006 330 0.002 380 exp(z) 0.000 858 0.000 842 0.062 0.027 ln(z) 0.002 130 0.002 190 0.102 0.049 sin(z) 0.000 908 0.000 861 0.062 0.027 arctan(z) 0.002 650 0.002 620 0.121 0.053 Table III. Complex Arithmetic Timing (seconds per call on a 68030 Macintosh) 50 S.D. 1000 S.D. ZM Integer ZM D.P. ZM Integer ZM D.P. add 0.000 280 0.000 480 0.001 620 0.003 380 subtract 0.000 303 0.000 523 0.001 720 0.003 500 multiply 0.002 750 0.003 600 0.209 0.160 divide 0.005 790 0.006 120 0.450 0.307 sqrt(z) 0.009 500 0.011 600 0.360 0.279 exp(z) 0.032 700 0.045 200 3.680 2.900 ln(z) 0.073 100 0.107 000 5.780 4.810 sin(z) 0.033 900 0.045 800 3.680 2.860 arctan(z) 0.088 900 0.127 000 6.620 5.440
These results are average times based on many calls to each routine. The same compiler and optimization settings were used in all cases, as well as the same set of input arguments. These tables are meant to compare the double-precision and the integer versions on different machines and to give a rough idea of the timing for various routines. However, the times can be sensitive to changes in compilers or hardware.
With the more recent (604) machine, double precision is slightly better than integer at 50 digits. For 1000 digits, multiplication and division are much faster in double precision, and this causes the elementary functions to be much faster as well.
The older (68030) machine has slower floating-point performance compared to its integer arithmetic. At 50 digits the integer version is much better. The two versions are about equal around 800 digits, and the double-precision version is faster for precision much higher than that.
These timing tests were also done using a Sun workstation, 68040 and 601 Macs, and a Pentium PC. The results were similar to Table II. Although most current machines run the double-precision version faster than the integer version, for a machine with double-length integer arithmetic that is at least as fast as its double-precision floating-point arithmetic the integer version would be better.
The complex arithmetic and function routines return results that are nearly always correctly rounded. Precision is raised upon entry to each routine so that the predicted error in the result before the final rounding operation is less than 0.001 units in the last place (measured in terms of the original precision).
The philosophy used for measuring errors is to assume all input arguments are exact. When zero digits are appended to the input arguments as precision is raised at the start of a routine, the values can still be assumed to be exact. This assumption is also used by Hull, Fairgrieve, and Tang [Hull et al. 1994]. For cases where the input values really are exact, making this assumption assures that accuracy is not lost needlessly.
Except for cancellation error in addition or subtraction, most algorithms lose accuracy slowly and predictably. A simple guess for how many guard digits are needed will usually prove correct. In rare cases, unexpected cancellation error will mean that the goal of 0.001 ulps of error cannot be achieved using the number of guard digits originally chosen.
With complex arithmetic, even multiplication or division can suffer from cancellation error. The following example using 5 digits in base 10 shows how ZM routines handle this problem, although the actual routines would initially use more than two guard digits in this case.
(0.63287 + 0.52498 i) * (0.69301 + 0.83542 i).
If precision is raised to 7 digits and the simple formula for multiplication is used, the result is
6.4000E-6 + 8.9253E-1 i.
The real part is correct to only two significant digits. The multiplication routine detects this loss of significance, raises the precision to a higher level, and tries again. Using at least 10 digits gives the result
6.4471E-6 + 8.9253E-1 i.
This is the correctly rounded answer. Similar cancellation can occur in functions such as
ln(0.77266 + 0.63483 i) = 6.3022E-6 + 0.68778 i.
4. SOME ALGORITHMS USED IN THE PACKAGE
Many of the basic formulas are the ones used in Wynn . Hull, Fairgrieve, and Tang [Hull et al. 1994] remarked that the formulas for complex arithmetic and elementary functions appear deceptively simple. Many special cases must be handled, including cases where one or more parts of the input arguments are zero, where exceptions (underflow, overflow, or unknown) are generated as output, and where exceptions are encountered as intermediate results but the final results are normal numbers.
As noted by Hull, Fairgrieve, and Tang, a good exception-handling facility makes designing the routines much easier. The treatment of exceptions in the FM package simplifies exception-handling for the complex functions.
Because the basic FM real arithmetic is faster than the equivalent complex arithmetic, the elementary complex functions use real functions whenever possible. Some of the complex functions must check for too much cancellation error and on rare occasions must raise the precision and do the calculation a second time.
For relatively low precision the multiply routine uses the simple formula
(a + bi)(c + di) = (ac - bd) + (ad + bc)i.
At higher precision the routine first computes P = (a + b)(c + d), then applies the formula
(a + bi)(c + di) -- (ac - bd) + (P - ac - bd)i.
This replaces one O([t.sup.2]) multiplication by three O(t) additions.
At low precision, a real multiplication takes about twice as long as a real addition. Because complex multiplication and division each require several real operations, precision must be raised to provide guard digits. The result is that complex multiplication is about seven times slower than real multiplication at low precision, and between three and four times slower at high precision.
This method is due to Smith . If |c| [is less than] |d|, let Q = c/d, and then use
a + bi / c + di = aQ+b / cQ + d + bQ - a / cQ + d i.
Otherwise, let Q = d/c, and use
a + bi / c + di = bQ + a / dQ + c + b - aQ / dQ + c i.
This requires three multiplications, three divisions, and three additions. Making|Q| [is less than] 1 prevents overflow in the intermediate results.
Complex division is about seven times slower than real division at low precision, and about five times slower at high precision.
4.3 Square Root
The algorithm used is due to Friedland . To find c + di = [square root of a + bi], let
t = [[square root of |a| + ][square root of [a.sup.2] + [b.sup.2] / 2].
Then if a [is greater than or equal to] 0, we have c = t and d = b/(2t). Otherwise, we have c = |b|/2t and d = Sign(b)t.
This is about three times slower than a real square root.
4.4 Exponential and Logarithmic Functions
The exponential function uses the formula
[e.sup.a+bi] = [e.sup.a]cos(b) + [e.sup.a]sin(b)i.
One of the trigonometric functions is computed, and the other can be obtained quickly using an identity. The real sine function is about as fast as the exponential, so the complex exponential takes slightly more than twice as long as the real exponential.
For the logarithm, with |a + bi| = [square root of [a.sup.2]+ [b.sup.2]] and arg(a + bi) = atan2(b, a), then
ln(a + bi)= ln(|a + bi|) + arg(a + bi)i.
The complex logarithm takes slightly more than twice as long as the real logarithm.
The general exponential, [y.sup.x], uses three complex operations: logarithm, multiplication, exponential. The integer power function, [y.sup.n], uses the binary method with complex multiplication. The rational power function, [y.sup.k/n], uses Newton iteration to get [y.sup.1/n], followed by a call to the integer power function.
4.5 Trigonometric Functions
cos(a + bi) = cos(a)cosh(b) - sin(a)sinh(b)i
sin(a + bi) = sin(a)cosh(b) + cos(a)sinh(b)i
tan(a + bi) = sin(2a) / cos(2a) + cosh(2b) + sinh(2b) / cos(2a) + cosh(2b) i
These functions are slower than their real counterparts by factors ranging from 2.5 or 3 at low precision to slightly more than 2 at high precision.
The inverse functions call the complex logarithm and complex square root. The two equivalent forms are used to avoid cases where one or the other suffers from cancellation.
[cos.sup.-1](z) = - i ln(z +i [square root of l - [z.sup.2]]) = i ln(z - i [square root of 1 - [z.sup.2]])
[sin.sup.-1](z) = -i ln([square root of 1 - [z.sup.2]] + iz) = i ln([square root of 1 - [z.sup.2]] - iz)
[tan.sup.-1](z) = i / 2 ln (i + z / i - z)
These are two or three times slower than the real versions.
4.6 Hyperbolic Functions
cosh(a + bi) = cosh(a)cos(b) + sinh(a)sin(b)i
sinh(a + bi) = sinh(a)cos(b) + cosh(a)sin(b)i
tanh(a + bi) = sinh(2a) / cosh(2a) + cos(2b) + sin(2b) / cosh(2a) + cos(2b) i
Timing is similar to the trigonometric functions.
4.7 Conversion and Utility Functions
Routines are provided for complex absolute value, argument, conjugate, real or imaginary parts, integer or nearest integer values, and conversion between real and complex FM formats.
The input conversion routines do free-format conversion from character format to ZM format. Input strings can be in "(1.23, -4.56)" form or "1.23 - 4.56 i" form. The two numeric parts can be in any integer, fixed, or exponential format. If one part is omitted, input of the form "3.45" or "-5i" is correctly converted to ZM format with the missing part set to zero.
The output conversion routines can be set for the desired complex format, the number of digits to display, and the real format to use for the individual parts of the complex number.
To test the accuracy of the functions in this package, many random arguments were generated and results compared at a sequence of increasing precisions.
As independent tests for the functions, ZM results were compared to values tabulated in Abramowitz and Stegun  and to values generated by the Mathematica computer algebra system [Wolfram 1991]. Tests were also made comparing ZM results at low precision with values generated by the complex Fortran intrinsic functions.
ABRAMOWITZ, M. AND STEGUN, I. 1965. Handbook of Mathematical Functions. Dover, New York.
BAILEY, D. 1993. Multiprecision translation and execution of Fortran programs. ACM Trans. Math. Softw. 19, 288-319.
BAILEY, D. 1995. A Fortran 90-based multiprecision system. ACM Trans. Math. Softw. 21, 379 -387.
BRENT, R. 1978. A Fortran multiple-precision arithmetic package. ACM Trans. Math. Softw. 4, 57-70.
FRIEDLAND, P. 1967. Absolute value and square root of a complex number. Commun. ACM 10, 665.
HULL, T., FAIRGRIEVE, T., AND TANG, P. 1994. Implementing complex elementary functions using exception handling. ACM Trans. Math. Softw. 20, 215-244.
SMITH, D. 1991. A Fortran package for floating-point multiple-precision arithmetic. ACM Trans. Math. Softw. 17, 273-283.
SMITH, D. 1996. A multiple-precision division algorithm. Math. Comput. 66, 157-163. SMITH, R. 1962. Complex division. Commun. ACM 5, 435.
WOLFRAM, S. 1991. Mathematica: A System for doing Mathematics by Computer. Addison-Wesley, Redwood City, Calif.
WYNN, P. 1962. An arsenal of Algol procedures for complex arithmetic. Bit 2, 232-255.
Received June 1996; revised November 1996; accepted October 1997
DAVID M. SMITH Loyola Marymount University
Author's address: Loyola Marymount University, Mathematics Department, 7900 Loyola Blvd., Los Angeles, CA 90045-8130; email: email@example.com.
Permission to make digital/hard copy of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee.
|Printer friendly Cite/link Email Feedback|
|Author:||SMITH, DAVID M.|
|Publication:||ACM Transactions on Mathematical Software|
|Date:||Dec 1, 1998|
|Previous Article:||Algorithm 782: Codes for Rank-Revealing QR Factorizations of Dense Matrices.|
|Next Article:||Stiffness Detection and Estimation of Dominant Spectra with Explicit Runge-Kutta Methods.|