# Massively distributed computing and factoring large integers.

Over the last 15 years the increased availability of computers and the introduction of the RSA cryptosystem has led to a number of new and remarkable algorithms for finding the prime factors of large integers. Factoring numbers is an arithmetic problem so simple to understand that school children are asked to do it. While multiplying or adding two very large numbers is simple and can be done quite quickly, the age-old problem of trying to find a number that divides another number still has no simple solution. Computer science has reached a point where it is starting to custom tailor the design of computers toward solving specific problems. This pracnique will discuss some of the more recent algorithms for factoring large numbers and how networks of computers can be used to run these algorithms quickly. Since this is a general exposition, we do not give detailed mathematical descriptions of the algorithms. We also allow ourselves to be somewhat casual with mathematical notation in places and hope that the mathematically sophisticated will forgive the looseness.

The problem of factoring integers has drawn considerable attention since Ron Rivest at MIT, Adi Shamir at the Weizman Institute, and Len Adleman at the University of Southern California presented a cryptographic scheme (RSA) which depends, for its security, upon the difficulty of factoring large numbers. It is a public-key system, whereby a user desiring to decrypt messages posts a public key composed of the product of two large primes. Anyone can use the key to send messages, but only the person who knows the factorization can decrypt them. See Martin Gardner's article in the August 1977 issue of Scientific American for a complete review of this system. The authors originally suggested that factoring 80-digit numbers was sufficiently difficult to make the code secure. That milestone has now been passed by this author, as well as several others, using ordinary microcomputers. Furthermore, Gardner also publicized a challenge issued by the RSA inventors. They offered a \$100 prize to the first person to decrypt a message that had been encrypted with a 129-digit key. In order to decrypt the message, one must factor the key. Improvements in algorithms, coupled with an increase in the speed of computer workstations has brought this number (barely) within reach. I hope shortly to undertake its factorization in a joint effort with Sun Microsystems. We will utilize approximately 3,000 workstations on one of Sun's internal networks to achieve this factorization.

Some Factoring Lore

A prime number is a positive integer exceeding 1 which is only divisible by 1 and itself. Primes are the fundamental building blocks of arithmetic because, according to the fundamental theorem of arithmetic, all integers except for the primes themselves can be uniquely broken down into the product of prime numbers. All positive integers, with the exception of 1, can therefore be placed into one of two classes: the primes, or the composites which are the products of primes. The number 1 is not considered to be a prime because the fundamental theorem of arithmetic would not be true if it were. One could simply insert as many factors of 1 as desired into the factorization of a number and it would no longer be unique.

Factoring plays a role in many areas of mathematics and often shows up in surprising places. Mathematicians love to generalize--it is one of their favorite pastimes. The most famour unsolved problem in all of mathematics is Fermat's Last Theorem which states that the equation [A.sup.n] + [B.sup.n] = [C.sup.n] has no solution in positive integers A, B, C for n > 2. It has withstood the assault of mathematicians for over 300 years. Oddly enough, factoring has played a major role in our failure to prove the theorem. Certain mathematical systems, known technically as cyclotomic fields, can be constructed as generalizations of the set of rational numbers. The fields are constructed by appending complex roots of unity to the rational numbers. Amazingly, the fundamental theorem of arithmetic no longer applies in many of these fields. That is to say, numbers can be broken down as the product of primes in more than one way. This fact has been the major stumbling block in the proof of Fermat's Last Theorem.

Indeed, it is easy to construct a number system in which unique factorization fails. Consider the set of all integers that leave a remainder of 1 when divided by 4:

5, 9, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, . . .

In this system the number 9 is prime because its only 'true' factor 3 is not among our set. 21 is also prime as is 49. Now consider the number 441. It is in our set because it leaves a remainder of 1 when divided by 4. Furthermore, 441 = 21 . 21 = 9 . 49. But 9, 21, and 49 are all 'primes' in our system. We thus come to the bizarre result that a number can be factored as the product of primes in two different ways!

The fact that composite numbers can, in principle be broken into prime factors does not mean that it is easy to do so. On the contrary, the arithmetic required to break even a moderate-sized number into its prime factors is staggering. On the other hand, that large numbers are, on average, difficult to factor does not mean that this is true in all cases. For example, consider [10.sup.100] and 100!. These numbers are very large but they are also trivial to factor! However, if we add 1 to either of these numbers, although we have not increased their number of digits, we will have made them much more difficult to factor.

Traditionally, numbers of the form [b.sup.n] [+ or -] 1 have been used as a benchmark for factoring methods. Mathematicians have been trying to factor numbers of this form ever since the time of Euclid. Early in this century Colonel A. J. C. Cunningham and H. J. Woodall collected all of the known factors of numbers of this form and published them in a small book. In fact, numbers of this form [b.sup.n] [+ or -] 1 have come to be known as Cunningham numbers as a result of their work. The advent of computers and new algorithms has allowed us to make great progress in completing these tables. Many mathematicians all over the world are even now producing factorizations for the 'Cunningham Project', abouth which the American Mathematical Society has published a book and Samuel S. Wagstaff Jr. of Purdue University publishes a period newsletter. There are several reasons for this:

(1) These number are mathematically interesting. For example the largest known primes are of the form [2.sup.n] - 1. Perfect numbers are those whose factors add up to the number itself. Even ones can only be constructed from the number [2.sub.n] - 1 when it is prime, and no odd perfect numbers are known. Numbers of the form [2.sup.2n] + 1, called Fermat numbers, determine when a regular polygon can be inscribed in a circle using ruler and compass. They have historical importance because of a famous conjecture by Fermat that numbers of this form are always prime. Now it is believed that the exact opposite is true: that they are almost always composite and that the only time they are prime is for n = 0,1,2,3 or 4. Interest in numbers of the form [2.sup.n] [+ or -] 1 therefore arose naturally. The Cunningham Project extends this interest to bases other than 2. Furthermore, the factorizations of [a.sup.n] - 1 are useful in determining the structure of finite fields.

(2) It is an easy and compact notation for describing very large numbers. One does not need to specify just a string of digits.

(3) They can have interesting digit patterns. For example, numbers of the form ([10.sup.n] - 1)/9 consist entirely of the digit 1 and are called repunits. Repunits show up frequently in recreational mathematics and it was quite interesting when, in 1986, H. Williams and H. Dubner proved that the repunit consisting of 1031 1s is a prime number.

(4) Numerical examples of factorizations can sometimes lead to the discovery of new algebraic identities.

Arithmetic Used in Factoring

Methods

The simplest, most obvious way to factor a number is to try all possible prime divisors of that number up to its square root. This method, usually called trial division, has been described as 'mostly trial and very little division'. Consider a typical number of about 50 digits with two or more large prime factors. It has, by the prime number theorem, about [10.sup.25]/ln([10.sup.25]) potential prime divisors, or about [10.sup.23]. If one could do divisions with 50-digit numbers at the rate of [10.sup.7]/sec., it would take roughly [10.sup.16] sec., of about 317 billion years to factor it if it had two prime factors near its square root. This estimate is unduly optimistic, however, because computers cannot do arithmetic that fast on 50-digit numbers. This is due to word-size limitations of machines.

A typical modern computer has anywhere from 16 to 64 bits of storage in each machine word. In binary representation, a word size of K bits can store numbers in the range [-2.sup.K] to [2.sup.K] - 1 on most computers. The largest number we can store in a 64-bit word is at most [2.sup.64] - 1 or about [10.sup.19]. In order to do arithmetic on larger numbers, one must store them in multiple words. One then performs division and multiplication in software, just as one would with pencil and paper. Thus one must do many single-precision divisions, multiplications, and subtractions or additions to do arithmetic on multiple-precision numbers. An algorithm is said to be O(f(N)) for some function f(N) if its run time is less than a constant times ~f(N)~ where N is the number of bits of information it takes to represent the problem.

For factoring, the size of the problem is either taken to be the number of digits, expressing the number in base 10, or the number of bits, expressing it in binary. Multiprecise division and multiplications are known as O([N.sup.2]) algorithms because the amount of work it takes to multiply 2 N-bit numbers or divide a 2N-bit number by an N-bit number is proportional to [N.sup.2] using classical arithmetic. Thus, doubling the number of digits quadruples the work. The use of Fast Fourier Transforms can reduce this to O(N log N log log N), but these methods are not effective until the numbers involved exceed several hundred digits. On the other hand, multiple precision addition and subtraction are O(N) algorithms because it only takes work proportional to N to add or subtract N bit numbers together. One can see that the time it takes to do multiplication and divisions is going to dominate any computations we do unless the number of additions and subtractions is very much greater.

How large a number one can store in a single machine word greatly affects how fast one can factor large integers. Simply doubling the word size of a computer without even making it faster (other things being equal) can speed up factoring algorithms by a factor of 4. In fact, we shall later describe efforts by Wagstaff and Jeffrey W. Smith at the University of Georgia, and at Louisiana State University by Duncan A. Buell, Don Chiarulli, and Walter G. Rudd, to build special-purpose factoring machines which have very large word sizes.

In trial division of any other method which removes small factors first, the amount of work needed during the computations will depend on the size of the second largest factor, because once we remove that one we are done: the remaining factor, which is prime, is just what is left over. The statistical distribution of the largest, second largest, . . . [n.sup.th] largest factors of arbitrary integers is very well known and is described by Dickman's function depicted in Figure 1. It is used to express the frequency with which these factors will exceed given powers of the integers they divide. The distribution of the number of prime factors of an integer drawn at random between 1 and x is normal with mean and variance ln(ln(x)).

The graph in Figure 1 shows the distribution of both the largest and the second largest factors of arbitrary integers. For example, Dickman's function tells us that the largest factor of a number will be greater than its square root about 31% of the time, and that the second largest will exceed its cube root about 25% of the time. This means that an integer drawn at random will usually have one or more very large factors. It will split into two nearly equal factors much less than 1% of the time. However, RSA encrypton keys are artificially constructed to be the product of two nearly equal primes and thus represent the class of numbers that are currently most difficult to factor. Some of the methods we will describe, like trial division, remove smaller factors first. These generally do not work well in the case of two nearly equal factors. Other algorithms that shall be described are insensitive to the size of the factors and only depend on the size of the number being factored.

In order to describe modern factoring algorithms it is first necessary to briefly describe some notation and some key mathematical concepts. The concept of 'smooth' numbers is vital to all of these methods. Smoothness is a term recently coined by Rivest and is easy to describe: a number N is smooth if all of its prime factors are small. 100! is an example of a smooth number. All of its prime factors are less than 100, whereas 100! has 158 decimal digits. Smooth numbers are very easy to factor by trial division because all of their factors are tiny. Modern algorithms attempt to transform the problem of factoring a large nonsmooth number into one of factoring one or many smooth numbers. We shall use the notation (A, B) to mean the greatest common divisor of A and B, we shall designate the multiplicative group of integers taken mod p by [(Z/pZ).sup.*], and shall use the symbol [unkeyable] to designate the equivalence relation between two integers modulo a third, e.g., A [unkeyable] B mod C. One of the more interesting and fundamental theorems concerning groups is Lagrange's Theorem, formulated by the French mathematician Lagrange in 1770. One way of stating it says that if you take an element of a group and raise it to a power equal to the number of elements in the group, you will get the group identity. A special case of this theorem, known as Fermat's 'little' theorem, will be useful later. This theorem plays a major role in many of our factoring methods.

Special-Purpose Factoring

Methods

Special-purpose factoring methods are those that depend, for their success, upon some special property of the number being factored. The usual property is that the factors of the number are small. However, other special properties might include the number having a special mathematical form, or that its factors possess some mathematical form. For example, if p is a prime number, then all of the factors of [2.sup.p] - must be congruent to 1 mod 2p. Certain factoring algorithms can take advantage of this special form of the factors. Special-purpose methods do not always succeed, but they are usually tried first, before resorting to more powerful, general methods.

There is a strong connection between modular arithmetic, groups, and prime numbers that makes the concepts very important for factoring numbers. In one way or another, modern factoring algorithms use the concepts of smoothness and groups together. They differ in their choice of group, and how smoothness is computed, but all have these underlying concepts. In some methods we exponentiate an element of some group and hope the size of that group will be smooth. In others we hope we can find a lot of individual elements within a group that are smooth.

It is fairly easy to show that if one has a prime number p, then the set of integers less than p (excluding 0) forms a group under multiplication mod p. Since there are p - 1 numbers less than p, we have [a.sup.p-1] [unkeyable] 1 mod p. This is Fermat's little theorem and says essentially that [a.sup.p-1] - 1 will be a multiple of p. We can use this fact to construct an algorithm for factoring integers. The method was first described by John Pollard in 1974. We refer the interested reader to the appendix for a complete explanation of the method.

Let N be a number we wish to factor and assume that it has a prime factor P. Select some small integer a and let M be the product of all the primes up to some convenient limit. We now compute [a.sup.M] mod N and hope that P - 1 divides M. For example, select a = 2 and set M = 4. Then [2.sup.4] [unkeyable] 16 mod 35 and p = (16 - 1, 35) = 5. If we select M = 6, then [2.sup.6] [unkeyable] 29 mod 35 and p = (29 - 1, 35) = 7. This works in the first case because 5-1 divides the exponent 4, and in the second case because 7-1 divides the exponent 6.

This method works anytime the factor P that we find has P - 1 smooth. This is because M itself consists only of small prime factors and P - 1 must divide M for the method to work. Thus, if the order of the group [(Z/pZ).sup.*] is smooth, then the factor p is easy to find. The number [2.sup.257] - 1 has an interesting history. It is one of a set of numbers, known as Mersenne numbers, that were originally conjectured to be prime by the French mathematician/priest Marin Mersenne in 1644. Much controversey about whether this number actually was prime arose among several mathematicians until the mid-20th century when it was finally determined to be composite, although its factors were not revealed.

The converse of Fermat's little theorem can be used to quickly tell us whether a number is composite without telling us anything about the factors. In 1979 M. Penk found a 14-digit factor of the number using a method, now more or less discarde, known as the RHO method. What remained however, was still a 63-digit composite factor. Using the P - 1 method Robert Baillie of the University of Illinois found the factor P = 1155685395246619182673033 of [2.sup.257] - 1 and thus finished its factorization. The computations only took about an hour on a CDC 6600 computer because of the lucky factorization of P - 1 = [2.sup.3] . [3.sup.2] . [19.sup.2] . 47 . 67. 439 . 119173 . 1050151. Thus one can find this factorization by selecting M as the product of all the primes up to 1050151. A slightly more sophisticated version of the algorithm will find a factor P when P - 1 is the product of small primes up to some limit M times a single, larger prime. It is this version that Baillie used. We have thus used some fairly simple computations to factor N anytime a prime factor P of N has P - 1 comprised of only small factors. That is to say P - 1 is smooth. Baillie has factored hundreds of numbers using the P - 1 method. The largest factor yet found using P - 1 is the 34-digit factor 7146831801094929757704917464134401 of the 575th Fibonacci number. It was found by this author.

The computation of [a.sup.M] mod N can be carried out swiftly and efficiently using simple multiplications mod N. One does not precompute M and then do the exponentiation. Rather, one does the exponentiation for each prime factor of M, one at a time. To compute [a.sup.6] mod N, for example, one first computes A = [a.sup.2] mod N, then computes [A.sup.3] mod N. This process is repeated for as many prime exponents as desired. Fast methods for performing the exponentiation are described in [2]. The only difficulty is that the length of N requires that all computations are multiprecision. If one could do all the calculations in single precision the above calculations would take only several million multiplications and only a few seconds of computer time. But on a computer with a 32-bit word size it takes about 9 words to store [2.sup.257] - 1. To do a single multiplication mod [2.sup.257] - 1 means essentially doing [9.sup.2] = 81 separate multiplications in place of just one. In multiprecise multiplication one must also sum the partial products that are formed and also handle the looping and control mechanisms. These can easily add another 20% to the computation time. The number [2.sup.247] - 1 has 78 decimal digits, and it takes about 100 times as long to multiply a number of this size as it does to multiply simple integers. This example easily shows the effect of multiprecision computations in doing factoring.

A group of researchers at Louisiana State University have constructed a special processor with a word size of 256 bits to alleviate the O([N.sup.2]) problem in doing multiplications [5]. The group consisted of Walter Rudd, Don Chiarulli, and Duncan Buell. Called the DRAFT, for Dynamically Reconfigurable Architecture for Factoring Things, it allows one to modify the usable word size in software in order to use anywhere from 32 to 256 bits to do a multiplication. The number given above could be stored in just two words on such a machine and would take only 4 multiplications in place of 81. The machine is treated as an arithmetic coprocessor to its PDP-11 host and runs the P - 1 algorithm as well as the Elliptic Curve Method [to be described] very quickly.

There is a similar calculation one can do, using a different group, that will factor N if P + 1 is smooth. This group is the twisted group of the finite field with [p.sup.2] elements. This method was first presented several years after Pollard's paper by Hugh Williams at the University of Manitoba and the English mathematician John Conway.

The difficulty with P - 1 and P + 1 is that you get just one shot: either P - 1 or P + 1 must be smooth or you are out of luck. In 1985 Hendrik W. Lenstra of the University of Amsterdam came up with a remarkable algorithm that replaces the multiplicative group mod p we have been discussing with a very different and interesting group.

An elliptic curve is a curve which can be described by the equation [y.sup.2] = [Ax.sup.3] + Bx + C. The remarkable thing about this curve is that the set of rational points on the curve form a group via an operation which looks very muck like addition. Figure 2 depicts graphically the general shape of an elliptic curve and how the group operation works. Rational points are those where both the x and y coordinates can be expressed as the ratio of integers. If one takes two rational points on the curve and forms a line, that line will intersect the curve in a third point which is also rational. Under the group operation, these three colinear points 'add up to zero'. The group operation [P.sub.1]*[P.sub.2] therefore means to draw a line including [P.sub.1] and [P.sub.2] and project it until it hits the curve at a third point. If one now reflects that third point across the x-axis, the corresponding point is the result of [P.sub.1]*[P.sub.2]. By doing some fairly simple calculations using this group one can try to factor numbers in an analogous way to the way P - 1 method works. The key to elliptic curves is that one can select the coefficients randomly! One therefore has a large number of separate curves to work with. With P - 1 one must hope that the order of the group of multiplications mod P is smooth. With the elliptic curve method (ECM) one has many groups from which to choose. Thus, if one does not work, try another. The largest factor found to date using ECM is the 38-digit factor 73330281865487664068079924400662001093 of the 467th Fibonacci number and was also found by this author. Mark Manasse of the DEC System Research Center, and Arjen Lenstra of the University of Chicago have found a slightly smaller 38-digit factor of one of the Cunningham numbers.

Since we have many curves with which to work the parallel implementation of ECM is quite simple: simply give each processor its own separate curve and quit upon the first success. Using just such a parallel implementation on distributed networks of workstations, Lenstra, Manasse, Peter Montgomery, this author and Wagstaff have found many hundreds of large factors for the Cunningham project.

The P - 1, P + 1, and ECM methods work when the order of some group is smooth. The size of the group depends on the factor P we are trying to find. The smaller that factor P, the more likely it is that the group order will be smooth. Thus, these methods find small factors first, and are generally not very good in factoring large numbers which are the product of two nearly equal primes. For numbers in this category we must use even more powerful methods.

General-Purpose Factoring

Methods

By a general-purpose method we mean one which will factor any integer of a given size in about the same time as any other of that size. The method will take as long, or example, to split a 100-digit number into the product of a 1-digit and a 99-digit prime, as it will to split a different number into the product of two 50-digit primes. These methods do not depend upon any special properties of the number or its factors.

The fastest method known today is an enhancement of a method due to Pomerance. Known as the multiple polynomial quadratic sieve (MPQS), this algorithm can factor any 70-digit number in a day's time on a typical modern mainframe computer. It is quite similar in many ways to an older method due to Michael Morrison and John Brillhart called the Continued Fraction Algorithm (CFRAC). Both algorithms have the same underlying basis.

Suppose one could construct a congruence [A.sup.2] [unkeyable] [B.sup.2] mod N with A [unkeyable] B mod N and A [unkeyable] - B mod N. This says that [A.sup.2] - [B.sup.2] is a multiple of N. The left side factors algebraically as (A + B)(A - B), but since A [unkeyable] B and A [unkeyable] - B then neither (A - B) nor (A + B) divides N. Therefore, if N = PQ then one of P or Q will divide (A + B) and the other will divide (A - B). By then computing (A + B, N) and (A - B, N), you will find the factors P and Q. For example, 91 = 7 . 13 and [20.sup.2] [unkeyable] [6.sup.2] mod 91. Hence (20 + 6, 91) = 13, and (20 - 6, 91) = 7 are the factors of 91.

The way one constructs this equation is to construct a very large number of relations of the form [A.sup.2] [unkeyable] Q mod N in such a way that Q is small. The algorithm then proceeds to attempt to factor these Q's. The vast majority of them will not factor. However, enough will so that they will be useful. It is here that smoothness again plays a part. We hope to find a sufficient number of Qs that are smooth so that they can be easily factored using a limited set of small primes. It can also be shown that the set of Qs for which there exists an A such that [A.sup.2] [unkeyable] Q mod N form a group under multiplication. Therefore this method involves trying to find many individual elements of a group that are smooth rather than trying to have the order of the group be smooth. The Qs are referred to as quadratic residues because they are remainders or residues of squares of integers.

For each of these relationships [A.sup.2] [unkeyable] Q mod N, the left-hand side is already a square. If we could find a set of Qs whose product is a square, we could multiply the corresponding left and right hand sides together to obtain squares on both sides. How one does this is actually quite elegant. Consider all of the Qs that we have managed to factor.

Each factorization is in the form of the product of small primes [p.sub.i] each raised to some power [a.sub.i]. To get a set whose product is a square would mean that all the exponents in that product would be even. Since multiplication corresponds to addition of the exponents, what we would like to do is find a set of Qs whose exponents all sum to an even number. This is equivalent to saying that they sum to 0 mod 2 and this in turn corresponds to solving a system of linear equations. Very efficient methods exist for solving such systems. One then uses the solution to determine which Qs when multiplied together will yield a square.

CFRAC and MPQS differ only in the way [A.sup.2] [unkeyable] Q is generated and in the way the Qs are factored. CFRAC uses the continued fraction expansion of [square root of] N to generate Qs that are less than 2 [square root of] N and tries to factor the Qs by trial division. A continued fraction of [square root of] N is an expression in the form

[Mathematical Expression Omitted]

Because [square root of] N is irrational, the expansion is infinite; but a theorem tells us that somewhere the pattern of [a.sub.0], [a.sub.1], [a.sub.2], . . . eventually repeats itself. If one stops the expansion at any point and collapses the expression, one gets a rational approximation to the irrational number which in a mathematical sense is the best approximation possible. These rational approximations are called convergents, and one can use their numerator and denominator to construct [A.sup.2] [unkeyable] Q mod N. It takes considerable multiple precision arithmetic to generate the convergents, and even more to factor the Qs by division. Wagstaff and Smith have constructed a special-purpose computer just to run CFRAC [6]. Called the EPOC for Extended Precision Operand Computer, and nicknamed the Georgia Cracker, it has a word size of 128 bits, so they can compute the Qs on numbers up to about 76 digits in single precision. Furthermore, the machine has a set of auxiliary processors whose sole purpose is to compute the remainder when a 128-bit number is divided by a small prime. Thus, for each Q which the machine attempts to factor, it can actually divide by many primes simultaneously instead of doing them one after another. The effect of the specialized design is to produce a machine which is optimized to run CFRAC, and in fact does so very quickly. It can factor 60-digit numbers in about a day.

One can try to improve upon the CFRAC algorithm in three ways:

(1) Find a way of generating Qs that avoids the large amount of multiprecise arithmetic required by CFRAC.

(2) Find a way to factor the Qs without using division.

(3) Find a way to reduce the size of the Qs.

The Quadratic Sieve solves the first two problems. The newly discovered Number Field Sieve solves all three. The latest improvements to the Quadratic Sieve algorithm include using quadratic polynomials of the form Q(x) = [Ax.sup.2] + Bx + C. By choosing the coefficients carefully we can use the polynomials to generate the Qs in a simple way and to keep their value small. The polynomials generate Qs that are somewhat larger than those from CFRAC, but because we used a polynomial to generate them we can try to factor them much more quickly. In fact, it can be done without using any division. Suppose we have some prime p and a known value of x for which p~Q(x). This corresponds to the equation Q(x) [unkeyable] 0 mod p. It is easy to see that p~Q(x + p), p~Q(x + 2p), . . . p~Q(x + kp) for any k. This is because Q(x + kp) = Q(x) + 2Axkp + [Ak.sup.2.p.sup.2] + Bkp;

The last three terms are clearly divisible by p. This means that we can factor many values of Q(x) at once using a sieve. One starts by selecting an appropriate set of small prime numbers, called a factor base, with which we will try to factor the Qs. One then computes the coefficients A, B, C, and solves the equation Q(x) [unkeyable] 0 mod p for each prime p. There will generally be two roots [r.sub.1] and [r.sub.2] for each prime. For each prime [p.sub.i] add the value of [log([p.sub.i])] to the sieve array at locations

[Mathematical Expressions Omitted]

We now examine each byte in the sieve array looking for values that are close to log (N)/2. Anytime we find such a location it means that the sum of the logs of the primes is about equal to the log of the value of Q, because Q is about 1/2 the length of N. This says that that particular Q has been fully factored. One can, in fact, use scaled integer approximations to the logarithms in all these calculations if one allows a little leeway for approximations. We therefore see that it is possible to try to factor a large number of Qs, all at once, by doing simple additions on single byte quantities. This is a very fast operation, even on microcomputers.

The Number Field Sieve does not start with a square on the left side of the congruence. Rather, it generates independent sets of numbers on each side of the congruence and requires that both sides must be factored. What makes this method faster is that the numbers can be made much smaller than with QS or CFRAC. The algorithm involves some fairly intricate mathematics and is beyond the scope of this pracnique. Briefly, on one side of the congruence, ordinary integers are generated by a set of linear polynomials. On the other side, algebraic integers lying in an extension field of the rationals are generated with the same linear polynomials. A map exists from the algebraic integers to the ordinary integers. One then factors both sides of the congruences and subsequently obtains squares on both sides in a manner similar to that of QS and CFRAC. The key to the method is that the numbers to be factored are much smaller than [square root of] N. In fact, they grow more slowly than any fixed fractional power of N.

The Number Field Sieve currently works well only on numbers of the form [a.sup.n] [+/i] s for very small s. A generalization to all integers is known, but it is believed that the generalization is not effective until the numbers to be factored exceed about 140 digits. The difficulty is that one must represent N as a polynomial and arbitrary N generate polynomials with large coefficients. This makes the numbers to be factored in the algebraic field much larger.

All three methods are amenable to parallel computation. With CFRAC one can start at different points of the continued fraction expansion and let many separate machines generate and factor Qs simultaneously. One can, using this approach, chain multiple EPOCs together. Each works on a different part of the expansion, totally independently of all the others. In this way one can have M machines factor a number M times as fast as one machine. Marvin Wunderlich has used the SIMD Massively Parallel Processor (MPP) at NASA to implement CFRAC. It has 16,000 separate processors, which allows one to do the trial division needed by CFRAC quite quickly. This machine will factor 60-digit numbers in under one hour. However, the CFRAC algorithm itself is less efficient than MPQS, which is why several Sun workstations can, running MPQS, factor 60-digit numbers faster than the MPP despite its many thousands of processors.

With MPQS or NFS one can give each of M separate machines their own set of polynomials on each with. The computations on each polynomial are independent of the others, so the machines themselves run independently. A star configuration is a type of parallel architecture in which one central processor, called the host, is connected to many independent processors called satellites. The configuration resembles spokes on a wheel. By using such a configuration with multiple Sun-3 microcomputers the author has produced some remarkable factorizations. A typical 70-digit number can be factored in about 10 hours of CPU time on each of 10 Sun-3s. The author has, in fact, achieved having M machines run M times as fast as one machine on a local area network of Suns.

The largest factorization of a number by a general-purpose algorithm was achieved by distributing the problem over a worldwide network. Each site was given its own set of polynomials with which to work, and sent successfully factored (A, Q) pairs in to a central location by electronic mail. when enouh pairs were sent in, the factorization was then constructed at the central site. The number factored was the 116-digit number ([10.sup.142] + 1)/(569 x 7669 x 380623849488714809). It took approximately 275 MIP-YEARS to perform the calculations. A MIP-YEAR is defined as the amount of work performed by a 1-MIP machine running for 1 year. The factors found were:

77169265188335087768689508504941, 93611382287513950329431625811490669, and 82519882659061966708762483486719446639288430446081.

This factor-by-mail effort was organized by Manasse and Lenstra. For numbers in this range, adding 3 digits approximately doubles the run time. This factor of 2 drops very slowly to 1 as the numbers we try to factor approach infinity. To do 100-digit numbers would take about 10 times as long as 90-digit numbers and to do the 129-digit challenge number would take 1000 to 2000 times as long. Thus, one can assert that with current algorithms, even massively distributed networks of computers are not a real threat to the security of RSA. Encryption with a 200-digit key takes only 8 times as long as with a 100-digit key, yet factoring a 200-digit key will take about [10.sup.9]. long. No foreseeable increase in the speed or the number of computers is going to overcome a factor of [10.sup.9].

In a similar effort, Lenstra and Manasse also organized a massive factor-by-mail project using the Number Field Sieve in order to factor the 9th Fermat number, [2.sup.512] + 1, which as the number whose factorization was 'Most Wanted' by mathematicians. What made this 155-digit number feasible was the fact that it has a very special form. General 155-digit numbers are still way out of reach.

Modern vector supercomputers are very well suited to running the QS and NFS algorithms because of their architecture. They are designed to pass a large amount of data past a set of processors each of which performs a single fixed operation on that data stream at positions which are a fixed distance from one another. This is exactly what is needed to do sieving quickly. Since MPQS spends 80% of its time doing sieving, a vector supercomputer will run MPQS very well. In 1983 a group at Sandia Laboratories successfully implemented a primitive version of MPQS on a CRAY X-MP and factored several numbers from the Cunningham project. Among them was ([10.sup.71] - 1)/9 which, at the time, set a record for the largest number ever factored as the product of nearly equal primes. H. J. teRiele has recently used a single processor on a NEC-SX/2 to factor a 101-digit number. It was the first time that a difficult 100+ digit number was ever factored using just a single computer for the calculations.

It should also be possible to build special hardware to run sieving algorithms. There are two possible ways of doing so, one of which is already under way at the University of Georgia. Pomerance, Smith, and Randy Tuler are building a specialized processor which will do sieving even more quickly than a vector supercomputer [3]. They are essentially building multiple vector units, daisy chained together, which will work simultaneously in doing the sieving. Their processor should sieve at least 100 times faster than the CRAY X-MP and they predict that it should be possible to factor a 100-digit number in several days with their machine.

A second approach would be to build many, small, simple computers and hook them together, letting them run independently. Furthermore, one could perhaps add a simple vectorizing siever to each one along with some special hardware that would make finding the roots of the polynomial easier. Such a machine has not yet been designed, however.

For the time being, even with much faster computers, 120 digits promises to be the limit of practical factoring. The challenge number is at the very edge of what can be done with massive amounts of computing, and 200-digit numbers will take perhaps a billion times as long as 100-digit numbers. The current 'Most Wanted' number is the 10th Fermat number [2.sup.1024] + 1. It has the known small factors 45592577 and 6487031809 but the remaining cofactor is 291 digits and well beyond our reach unless someone gets lucky with ECM and finds another small factor. It is impossible to predict when new algorithms may arise, but until they do such numbers will remain safely out of reach. Thus, for the foreseeable future the RSA encryption scheme will still be safe. According to Brillhart, until RSA came along factoring was only a hobby for a few eccentric mathematicians. RSA brought new interest to the problem, and with more mathematicians working on it, who is to say what methods may yet be invented?

References

[1] Bressoud, D.M. Factorization and Primality Testing. Springer-Verlag, N.Y., Berlin, London, Paris, Tokyo, 1989.

[2] Knuth, d.E. The Art of Computer Programming, Vol 2., Addison Wesley, Reading, Mass., 1981.

[3] Pomerance, C., Smith, J.W., and Tuler, R. A pipe-line architecture for factoring large integers with the quadratic sieve algorithm. Siam J. Comput. 17 (1988), 387-403.

[4] Riesel, H. Prime Numbers and Computer Methods for Factorization. Birkhauser, Boston, Stuttgart, 1985.

[5] Rudd, W.G., Buell, A. and Chiarulli, D.M. A high performance factoring machine. In Proceedings of the Eleventh International Symposium on Computer Architecture (1984).

[6] Smith, J.W., and Wagstaff Jr., S.S. An extended precision operand computer. In Proceedings of the Twenty-First Southeast Region ACM Conference (1983), 209-216.
COPYRIGHT 1991 Association for Computing Machinery, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.