# Beware of Linear Congruential Generators with Multipliers of the Form [Alpha] = [+ or -] [2.sup.q] [+ or -] [2.sup.r].

1. INTRODUCTIONWu [1997] proposed a clever-looking way to select the parameters and implement a linear congruential generator (LCG): take a Mersenne prime modulus m, i.e., a prime of the form m = [2.sup.e] - 1 (see Knuth [1997] for more on Mersenne primes), and a multiplier of the form

(1) a = [+ or -] [2.sup.q][+ or -] [2.sup.r].

The usual LCG recurrence

(2) [x.sub.n] = [ax.sub.-1] mod m; [u.sub.n] = [x.sub.n]/m

(2) is then easily implemented by exploiting the fact that multiplying x by [2.sup.q] modulo [2.sup.e] = 1 is equivalent to permuting the two blocks of bits comprised of the q most significant bits and the e - q least significant bits (i.e., rotating the bits). This technique can in fact be generalized to moduli of the form m = [2.sup.e] - h where h is small, as explained in Section 2.

However, with multipliers of the form (1), the numbers of 1's in the binary representations of [x.sub.n-1] and of [x.sub.n] (i.e., their Hamming weights) are likely to be strongly dependent, for permuting two blocks of bits in [X.sub.n-1] does not change its Hamming weight, and [x.sub.n] is obtained by simply adding or subtracting two such bit-rotated versions of [x.sub.n-1]. As it turns out, there is indeed a strong dependence, and this dependence is also found between the bits of [u.sub.n-1] and [u.sub.n]. In Section 3, we define a test of independence between the number of 1's in the binary representations of [u.sub.n-1] and of [u.sub.n]. In Section 4, we show that the specific LCGs proposed by Wu [1997], which otherwise perform well in the spectral test, fail this independence test in a decisive way. LCGs with multipliers that have a more complicated binary representation pass this test. Extensive experiments confirm the bad statistical properties of the multipliers of the form (1).

2. IMPLEMENTATION FOR m = [2.sup.e] - h WHEN h IS SMALL

Let m = [2.sup.e] = h and 0 [is less than] x [is less than] m. To compute y = [2.sub.q]x mod m, decompose x as x = [x.sub.0] + [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], where [x.sub.0] = x mod [2.sup.e-q]. We have

(3) y = [2.sup.q]([x.sub.0] + [2.sup.e-q][x.sub.1])mod(2.sub.e] - h) = ([2.sup.q][x.sub.0] + [hx.sub.1])mod([2.sub.e] - h).

Assume that

(4) h [is less than] 2q and h(2q - (h + 1)[2.sup.-e+q]) [is less than] m.

Then [2.sup.q][x.sub.0] [is less than] [2.sup.e] - [2.sup.q] [is less than] m and [hx.sub.1] [is less than or equal to] h(m - 1)[2.sup.e-q] = h([2.sup.e] - h - 1)/[2.sub.e-q] = h([2.sup.q] - (h + 1)[2.sup.-e+q]) [is less than] m, so each of the two terms of the last expression in (3) is less than m. To compute y, compute [2.sup.q][x.sub.0] by shifting [x.sub.0] by q positions to the left, then add h times [x.sub.1], and subtract m if the result exceeds m - 1. Note, that, if h is a power of 2, this requires no multiplication, only shifts, additions, and subtractions. To multiply x by a = [+ or -] [2.sup.q] [+ or -] [2.sup.r], repeat the above operation with r instead of q, and add (or subtract) the results modulo m. Figure 1 gives an example of how this technique can be coded in the C language, for m = [2.sup.30] - 35 and a = [2.sup.15] + [2.sup.13].

Fig. 1. Implementation of an LCG with [Alpha] = [2.sup.15] + [2.sup.13] and m = [2.sup.30] - 35.

#define m 1073741789 /* 2'30 - 35 */ #define h 35 #define q 15 #define emq 15 /* e - q */ #define mask1 32767 /* 2^(e-q) - 1 */ #define r 13 #define emr 17 /* e - r */ #define mask2 131071 /* 2^(e-r) - 1 */ #define norm 1.0 / m long x; double axmodm () { unsigned long k, x0, x1; x0 = x & mask1; x1 = x >> emq; k = (x0 << q) + h * x1; x0 = x & mask2; x1 = x >> emr; k += (x0 << r) + h * x1; if (k < m) x = k; else if (k < m * 2) x = k - m; else x = k - m * 2; return x * norm; }

3. A SIMPLE INDEPENDENCE TEST

We want to test the independence between the Hamming weights of the leading bits of [u.sub.n] and those of [u.sub.n-1], for LCGs with multipliers of the form (1). We write the binary representation of [u.sub.n] as [u.sub.n] = [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], and define [Y.sub.n] as the number of 1's among the first l bits, [u.sub.n], 1, ... , [u.sub.n], l, where l is a constant. Under the null hypothesis [H.sub.0], "The [u.sub.n]'s are i.i.d. U(0,1) random variables," and the [Y.sub.n]'s are i.i.d, binomials with parameters (l, 1/2).

Take a large positive integer N, and let [C.sub.i,j] be the number of values of n for which ([Y.sub.2n-1], [Y.sub.2n]) = (i,j), for 1 [is less than or equal to] n [is less than or equal to] N and 0 [is less than or equal to] i,j [is less than or equal to] l. Under [H.sub.0], [C.sub.i,j] has the binomial distribution with parameters (N, [p.sub.i,j]) where

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

The standardized variable [Z.sub.i,j] = [C.sub.i,j] - [Np.sub.i,j]/[square root of ][Np.sub.i,j](1 - [p.sub.i,j])

has mean 0 and variance 1, and is approximately N(0,1) (standard normal) when [Np.sub.i,j] is large enough. If several of the |[Z.sub.i,j]| are larger than 5 (say), this certainly indicates that something is wrong.

For a formal test, let [Psi] = {(i, j) : [Np.sub.i,j] [is greater than or equal to] 5}, [p.sub.0] = 1 - [[Sigma].sub. (i,j)[element of][Psi] j)*][p.sub.i,j, [C.sub.0] = N - [Sigma](i,j)[element of][Psi][C.sub.i,j], and consider the chi-square test statistic

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

This is a standard chi-square test where we have regrouped all the boxes for which [Np.sub.i,j] [is less than] 5 into a single cell. Under [H.sub.0], Q has approximately the chi-square distribution with k = |[Psi]| degrees of freedom, so we can easily compute the p-value of the test, defined as p = P[Q [is greater than] x | [H.sub.0]] where x is the value taken by Q. The null hypothesis is rejected when p is deemed too close to 0. (It would also be possible to base the test on the overlapping pairs ([Y.sub.n], [Y.sub.n+1]), 1 [is less than or equal to] n [is less than or equal to] N, in which case the results of Wegenkittl [1998] could be used to find the asymptotic distribution of Q.)

4. TEST RESULTS

We applied the above independence test, with l = 30, to the two LCGs with modulus m [= 2.sup.31] - proposed by Wu [1997]. The multipliers are a = [2.sup.15] = [2.sup.10] and a = - [2.sup.16] - [2.sup.11], respectively. For comparison, we applied the same tests to the well-known LCGs with the same modulus and with multipliers a = 16807, 630360016, and 742938285 (see Fishman [1996], L'Ecuyer [1998], and Wu [1997] for details and references about these multipliers). We also applied the test with l = 50, to the two other LCGs with modulus m = [2.sup.61] - 1 proposed by Wu [1997]. Their multipliers are a = [2.sup.30] - [2.sup.19] and a = [2.sup.42] - [2.sup.31].

Figures 2 to 5 give pictorial representations of the [Z.sub.i,j]'s for four of these LCGs, with l = 30 and sample size N = [2.sup.20] for the first three, whose modulus is m = [2.sup.31] - 1, and with l = 50 and sample size N = [2.sup.26] for the last one, whose modulus i[s m = 2.sub.61] - 1. Each little square in the picture represents one pair (i, j). White means [Z.sub.i,j] [is less than or equal to] - 5; black means [Z.sub.i,j] [is greater than or equal to] 5; and gray shades represent the values in between according to the scale displayed at the top of each figure (darker is larger). Note that the black and white values are 5 standard deviations or more away from the mean. It is quite clear from the figures that the multipliers a =[ 2.sup.15] - [2.sup.10], a = = [2.sup.16] - [2.sup.11] and a = [2.sup.30] - [2.sup.19] are unacceptable. The multiplier a 16807 behaves well (although we do not recommend such a small LCG, whatever the multiplier might be). We tried the other LCGs mentioned above, as well as several other recommendable random-number generators (such as those proposed by L'Ecuyer [1999a], for example), and their displays were similar to that of Figure 4, i.e., they behaved in agreement with [H.sub.0].

[Figures 2-5 ILLUSTRATION OMITTED]

For the chi-square test based on Q, Table I gives the p-values smaller than 0.01 for different sample sizes N, for the two generators with modulus m = [2.sup.31] - 1 proposed by Wu. The p-values for N = 2v and v [is greater than or equal to] 17 are all smaller than [10.sup.-15]. A rather small sample size suffices for rejecting [H.sub.0]. For comparison, the LCGs with modulus m = [2.sup.31] - 1 and multipliers a = 16807, 630360016, and 742938285 had no p-value less than 0.01 for N = 2v and 15 [is less than or equal to] v [is less than or equal to] 24. They started to fail at a sample size around N = [2.sup.26]. Table II gives the p-values smaller than 0.01 for the two generators with modulus m = [2.sup.61] - 1 proposed by Wu. The p-values for N [is greater than or equal to] [2.sup.22] are all less than [10.sup.-15].

Table I. The p-Values for Q for the Two LCGs with m = [2.sup.31] - 1 Proposed by Wu (Np [is greater than or equal to] 5)

N a = [2.sup.15] - a = - [2.sup.16] - [2.sup.10] [2.sup.11] [2.sup.12] 3.8 x [10.sup.-3] [2.sup.13] 1.8 x [10.sup.-3] [2.sup.14] 1.8 x [10.sup.-15] [2.sup.15] 1.7 x [10.sup.-3] < [10.sup.-15] [2.sup.16] 3.5 x [10.sup.-11] < [10.sup.-15] [2.sup.17] < [10.sup.-15] < [10.sup.-15]

Table II. The p-Values for Q for the Two LCGs with m = [2.sup.61] - 1 Proposed by Wu (Np [is greater than or equal to] 5)

N a = [2.sup.23] - a = - [2.sup.42] - [2.sup.19] [2.sup.31] [2.sup.17] [2.sup.18] [2.sup.19] 1 x [10.sup.-7] [2.sup.20] 2 x [10.sup.-12] 6.4 x [10.sup.-7] [2.sup.21] < [10.sup.-15] < [10.sup.-15] [2.sup.22] < [10.sup.-15] < [10.sup.-15]

We made additional tests on LCGs of the form (1)-(2) as follows. For l = 12, ... , 40, we took m equal to the largest prime less than [2.sup.e], and found a good multiplier a of the form (1) based on the spectral test up to 8 dimensions, as was done by L'Ecuyer [1999b] for general LCGs without conditions on a. We then applied the Q test above to the two families of LCGs: (a) those satisfying Eq. (1) and (b) those listed in Table II of L'Ecuyer [1999b]. In each case we took l = e - 1 and N = [2.sup.v] for 11 [is less than or equal to] v [is less than or equal to] e. The LCGs in (b) had no difficulty with these tests for N nearly as large as m, whereas those in (a) failed the tests typically with p-values p [is less than] [10.sup.-15] for N << m.

5. CONCLUSION

The spectral test is recognized as the most powerful test for analyzing the structure of LCGs and MRGs [Knuth 1997]. It tests a very important aspect of the point set produced by the generator: its lattice structure. However, it is clear that, like any other test, the spectral test cannot test every statistical property of the generator. It has been pointed out on several occasions (e.g., Compagner [1995] and L'Ecuyer [1998]) that generators should not be based on recurrences whose structure is too simple. In particular, the bits should be well mixed. Otherwise, one gets into problems such as those illustrated by the present note. The MRGs proposed by L'Ecuyer [1996; 1999a], selected on the basis of the spectral test, do not have this problem because the binary representation of their multipliers contains a good mixture of 0's and 1's.

The idea of choosing multipliers of the form (1) to speed up implementation can be trivially extended to multiple recursive generators (MRGs), i.e., linear congruential generators based on recurrences of higher order [Knuth 1997; L'Ecuyer 1998; Niederreiter 1992]. However, the resulting LCGs or MRGs should not be used in their plain form. On the other hand, the idea, and the generalization that we developed in Section 2, can be used fruitfully for implementing combined LCGs or MRGs, e.g., with the combination methods proposed by L'Ecuyer [1996]. Work in that direction is under way.

REFERENCES

COMPAGNER, A. 1995. Operational conditions for random number generation. Phys. Rev. E52, 5-B, 5634-5645.

FISHMAN, G. 1996. Monte Carlo: Concepts, Algorithms, and Applications. In Operations Research Springer Series on Operations Research, vol. 1. Springer-Verlag, New York, NY.

KNUTH, D. E. 1997. The Art of Computer Programming. Vol. 2, Seminumerical Algorithms. 3rd ed. Addison-Wesley, Reading, MA.

L'ECUYER, P. 1996. Combined multiple recursive random number generators. Oper. Res. 44, 5, 816-822.

L'ECUYER, P. 1998. Random number generation. In Handbook of Simulation, J. Banks, Ed. John Wiley & Sons, Inc., New York, NY, 93-137.

L'ECUYER, P. 1999a. Good parameters and implementations for combined multiple recursive random number generators. Oper. Res. 47, 1, 159-164.

L'ECUYER, P. 1999b. Tables of linear congruential generators of different sizes and good lattice structure. Math. Comput. 68, 225, 249-260.

NIEDEREITER, H. 1992. Random Number Generation and Quasi-Monte Carlo Methods. CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 63. SIAM, Philadelphia, PA.

WEGENKITTL, S. 1998. Generalized &-divergence and frequency analysis in Markov chains. Ph.D. Dissertation. Univ. of Salzburg. Available via http://random.mat.sbg.ac.at/ team/.

Wu, P.-C. 1997. Multiplicative, congruential random-number generators with multiplier [+ or -] [2.sub.k1] [+ or -] [2.sub.k2] and modulus [2.sup.p] - 1. ACM Trans. Math. Softw. 23, 2, 255-265.

Received: May 1999; revised: July 1999; accepted: August 1999

PIERRE L'ECUYER and RICHARD SIMARD Univesite de Montreal

This work has been supported by the National Science and Engineering Research Council of Canada grants #ODGP0110050 and SMF0169893, by FCAR-Quebec grant #93ER1654.

Authors' address: Departement d'Informatique et de Recherche Operationnelle, Universite de Montreal, C.P. 6128, Succ. Centre-Ville, Montreal, PQ H3C 3J7, Canada; email: lecuyer@ro.umontreal.ca.

Permission to make digital/hard copy of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee.

Printer friendly Cite/link Email Feedback | |

Author: | L'ECUYER, PIERRE; SIMARD, RICHARD |
---|---|

Publication: | ACM Transactions on Mathematical Software |

Article Type: | Statistical Data Included |

Geographic Code: | 1CANA |

Date: | Sep 1, 1999 |

Words: | 2709 |

Previous Article: | Algorithm 798: High-Dimensional Interpolation Using the Modified Shepard Method. |

Next Article: | C++ Implementations of Numerical Methods for Solving Differential-Algebraic Equations: Design and Optimization Considerations. |

Topics: |