Finding the Prime Factors of a Number

Introduction

This write-up is primarily concerned with factoring numbers by computer, rather than by a person with pencil and paper. There are several different methods of factoring by computer, and there are many programs available from various web sites that implement one or more of these methods. Many tables of known factors of numbers of various forms are also available on the web. I have not found much on the web about how these various factoring methods work. This write-up is an attempt to provide some of this information, when I am able to figure it out.

I have been using a program called factor.exe, which can find the factors of any number up to about 100 digits, using several different factoring methods. This program uses its author’s MIRACL library for doing arithmetic with large numbers.

Another program that I have found useful is PPSIQS, available here. (Scroll down a couple of screenfuls to find it.) It uses the Quadratic Sieve factoring method, to factor numbers up to about 100 digits. That site also has PPMPQS, an earlier program that also uses the Quadratic Sieve.

msieve is another implementation of the quadratic sieve, and seems to be faster than either of the above programs.


Definitions

A prime number is an integer greater than 1 (one) that is divisible only by 1 and itself.
A composite number is an integer greater than 1 that is not prime.

The Fundamental Theorem

The Fundamental Theorem of Arithmetic states that every positive integer greater than 1 (one) can be written as the product of prime numbers in exactly one way, except for the order of the factors. Here are some examples of numbers written as the product of prime numbers:

12 = 2 * 2 * 3
30 = 2 * 3 * 5
119 = 7 * 17
245 = 5 * 7 * 7
257 = 257 (a prime number)

Note that the asterisk (“*”) is used to mean multiply, since the usual multiplication symbol looks too much like the letter “x”.

Fermat’s Little Theorem

This theorem is called Fermat’s Little Theorem, to distinguish it from the more famous Fermat’s Last Theorem. The Little Theorem states that:

If p is a prime number, then for any integer a:

a^p - a is divisible by p.

If a is not divisible by p, then we can divide the above by a, and get:

a^(p-1) - 1 is divisible by p.

For example, for the prime number 7:

    2^7 - 2 =   126 = 7 * 18
    3^7 - 3 =  2184 = 7 * 312
    4^7 - 4 = 16380 = 7 * 2340

    2^6 - 1 =   63 = 7 * 9
    3^6 - 1 =  728 = 7 * 104
    4^6 - 1 = 4095 = 7 * 585
If we call the number we are trying to factor N, then we pick a value for a, such as 3, and compute the remainder when dividing 3^(N - 1) by N. If N is prime (and not equal to 3), then the result will be 1. If the result is not 1, then N cannot be prime.

Note that if the result of this calculation is 1, this does not prove that N is prime. It means that N is very likely to be prime, but it could be composite, since some composite numbers will “look prime” by this test. Proving that a large number is prime takes more work than just this test. The “Prime Pages” has an introduction to proving primality.


Problem Statement

So the problem is this: if you are given a number, how do you find its prime factors?

There are two basic classes of factoring algorithms. In one class, the amount of work (or computer time) that is needed to find a factor depends mostly on the size of the factor that is found (or looked for). In the other class, the amount of work depends primarily on the size of the number being factored. The first several methods described below belong to the first class.

The program “factor.exe” uses all of the following methods except for the Number Field Sieve.


Trial Division

Usually, the first thing to do when trying to factor a number is “trial division”. This is simply dividing the number by small prime numbers, (starting with 2, then 3 and 5 and 7, and so on) and checking for a remainder of zero. The “Prime Pages” has a glossary entry about trial division.

As an example, suppose you want to find the factors of:

24595184394

First, we divide by 2 and find that the remainder is zero. So we have:

24595184394 = 2 * 12297592197

But we are not done yet. That last number is not prime. So we continue (with the new number), by trying 2 again. After trying 2, 3, 5, 7, and 11, we will have:

24595184394 = 2 * 3 * 3 * 11 * 124218103

A person with pencil and paper might stop here, without being able to determine whether that last factor is prime or not. But the computer would continue and find that:

24595184394 = 2 * 3 * 3 * 11 * 97 * 103 * 12433

So how do we know that the computer is done? Imagine that the computer has just divided by 103, and got a quotient of 12433 and a remainder of 0. Now it will divide 12433 by 103, and get quotient 120, remainder 73. After dividing by the primes 107 and 109, it will divide 12433 by 113 and get quotient 110, remainder 3. So if 12433 has a prime divisor greater than 113, then it is also divisible by the quotient (12433/prime), which will be less than 113. In this case, we would have already found this smaller factor. So 12433 must be prime, and the factorization is complete.

Trial division is suitable for finding factors up to about 8 or 9 digits. But what if the number to be factored is:

174224571863520493293247799005065324265471

For this number, the smallest prime factor is 20 digits long. Finding this factor by trial division would take years, even on the fastest computer. In contrast, it takes “factor.exe” a minute or less to find that this number is:

32032215596496435569 * 5439042183600204290159 (both prime)

“factor.exe” finds these factors using the quadratic sieve method, after trying the other methods first.


Brent’s Method

This method is a variation on J. M. Pollard’s “rho” method, which works as follows. We choose a polynomial, such as

x^2 + 1

and an initial value of “x”, such as 1, and compute the value of the polynomial. The polynomial is then evaluated using the computed value as the new value of “x”. The values of “x” are reduced “modulo N” (use the remainder when divided by N), where “N” is the number being factored. Eventually, the sequence of values of “x” will repeat, except for possibly the first several values of “x”. However, we are hoping that the sequence will repeat “modulo p”, where “p” is a factor of N; and that the length of the repeating sequence is not too big.

So, how do we test if the sequence repeats “modulo p”, when we don’t know what “p” is? Suppose, for example, that the 3rd and 15th term of the sequence are the same modulo p, for some factor “p” of “N”. Then, if we subtract the two values (modulo N), the result will be divisible by “p”. If we then compute the Greatest Common Divisor of “this difference” and “N”, we will get “p” (or a multiple of “p”). In any case, the result will be a factor of “N” (and we hope this factor is not “N” itself).

In the example above, note that if the 3rd and 15th terms are the same modulo p, then the 4th and 16th terms will be the same, the 5th and 17th terms will be same (all modulo p), and so on.

An implementation of Pollard’s “rho” method is:

Initialize:  A = first term of sequence, B = second term
Step 1: set A = next term of sequence (that is:  A <-- (A^2 + 1) mod N)
Step 2: compute two terms of the sequence starting at B
        and set B to the second of these terms.
Step 3: Compute GCD(B-A,N)
        If the GCD is 1, go to Step 1
        If the GCD is greater than 1 and less than N,
            then it is a factor (not necessarily prime) of N.
        Otherwise, the GCD is N.  Either give up,
            or start over with a different polynomial.
The first time we reach Step 3, A is the second term and B is the fourth term. The next time, A is the third term and B is the sixth term. Thus we test for a repetition period of length 2, then 3, then 4, and so on. In the example above, where the 3rd and 15th terms are the same modulo “p”, we will find the factor “p” (or larger) when A is the 12th term and B is the 24th.

Since computing a GCD (Step 3) takes longer than computing three terms in the sequence (Steps 1 and 2), we can speed things up by changing Step 3 so that it computes the GCD only some of the time (perhaps every 10th time, for example). The other times, Step 3 will accumulate values of (B-A), by multiplying them together and taking the remainder when divided by N.

Brent’s method is a modification of the “rho” method, discovered by Richard P. Brent, that requires the sequence to be computed only once, instead of twice.

In the program “factor.exe”, Brent’s method finds factors from 5 digits up to about 10 or 11 digits.

As an example, suppose we want to find the factors of N=10573. Using the polynomial X^2 + 1 and an initial value of 1 for “x”, the sequence is (modulo 10573):

1, 2, 5, 26, 95, 677, 3691, 5458, 5624, 5534, ...

In Step 3, we will compute GCD(2-1, 10573), GCD(26-2, 10573) GCD(677-5, 10573) and GCD( 5458-26, 10573). The first three GCD’s are 1, and the last is 97. Thus, we have found the factor 97. Dividing 10573 by 97 gives the quotient 109, so 10573 = 97*109.


Pollard’s “p-1” Method

Let n be the number we are trying to factor. The (p-1) method can find a factor p if p-1 has only “small” factors, where “small” means “less than or equal to some limit”, which we will call B1.

Let p be a prime factor of n. Then, by Fermat’s Little Theorem:

3p-1 = 1 mod p

Note that this implies, for any positive integer k

3k(p-1) = 1 mod p

This is because

3k(p-1) = (3(p-1))^k = 1^k = 1 mod p

The (p-1) method raises 3 (or some other value for a) to some power t, where t has lots of small factors. If (p-1) has only small factors (such that it is a factor of the exponent t), then we will have:

3t = 1 mod p

This last statement is equivalent to:

3t - 1 is a multiple of p

Since we cannot calculate 3t modulo p, because we don’t know what p is, we do the calculation modulo n instead. We then compute

gcd(3t - 1, n)

The result will be p, or possibly a multiple of p.

For example, let n = 18446744073709551617, and let us choose the limit B1 to be 19. Then one possible value for t is 19! (factorial). [Note: choosing the exponent t this way results in “too many” occurrences of small factors (such as 2 and 3), so in an actual implementation of the (p-1) method, t is chosen a little differently.] We then compute:

319! = 12392788485487941828 mod n

and then

gcd( 12392788485487941827, n ) = 274177

so we have found a factor. Note that

19! = 216 * 38 * 53 * 72 * 11 * 13 * 17 * 19

274176 = 28 * 32 * 7 * 17

so that 274176 is indeed a factor of t = 19!, which is required in order for the (p-1) method to find this factor.

Phase 2

If the above calculation (“Phase 1”) fails to find a factor, a “Phase 2” may be performed. For phase 2, we choose a second limit, called B2, which is larger than B1 (perhaps 50 times B1 or so). Phase 2 will find a factor p if p-1 has one factor between B1 and B2, and if all of the other factors of p-1 are less than or equal to B1. B2 is sometimes chosen so that the computation times for phase 1 and phase 2 are approximately the same. The total computation time is much less than would be required to perform phase 1 up to the limit B2. Phase 2 will often find factors that are missed by phase 1.

William’s “p+1” Method

The “(p+1)” method can find a factor p if p+1 has only “small” factors. This is in contrast to the (p-1) method, above, where p-1 needs to have only small factors. This method can have a “phase 2”, similar to the “phase 2” for the (p-1) method.

More details to be added.


Elliptic Curve Method

The basic strategy of this method is similar to the (p-1) and (p+1) methods. However, the arithmetic here is done using elliptic curves, instead of “ordinary” numbers.

As with the (p-1) and (p+1) methods, limits B1 and B2 are chosen. (Like those other methods, larger values of B1 and B2 can result in finding larger factors, but the computation takes longer as well.) B1 can be anywhere from about 11,000 (to find small factors) to as high as 44,000,000 or more (if you are really ambitious). B2 is often 100*B1.

To run one “curve”, a starting point, called sigma, is chosen. I am not sure if sigma determines which elliptic curve is used, or if it determines a starting point on some “built-in” curve. (Or possibly it is something else entirely.) The actual calculation is similar in concept to the (p-1) or (p+1) calculation, in that we are hoping that a particular number, related to a prime factor p, has only “small” factors. If no factor is found after doing Phase 1 and (if necessary) Phase 2, a different value is chosen for sigma and the calculation is repeated.

The Elliptic Curve Method finds a factor p if p+d has only “small” factors, where d is a number between -sqrt(p) and sqrt(p). The value of d for a particular calculation depends on the value of sigma. So, if one value of sigma fails to find a factor, we try another value of sigma, hoping that eventually we pick a sigma that succeeds. Typically, an ECM program will either choose a “random” value of sigma each time it starts a new curve, or it will run through a predetermined series of values for sigma.

If we run lots of curves and fail to find a factor, we increase the limits B1 and B2, and try some more curves.


In all of the above factoring algorithms, the amount of work needed to find a factor depends, at least roughly, on the size of the factor that is found. For the following two algorithms, the amount of work needed depends mostly on the size of the number being factored.

The following two factoring methods work by trying to find integers A and B (with both of them between 1 and N) such that A^2 - B^2 is a multiple of N (the number to be factored). Note that A^2 - B^2 = (A + B)*(A - B). If A is not equal to B and A+B is not equal to N, then GCD(A+B,N) and GCD(A-B,N) will be factors of N.

As above, “GCD” means “Greatest Common Divisor”.

Quadratic Sieve

factor.exe calls this the “multiple polynomial quadratic sieve”. More to be added.

Probably, most quadratic sieve programs are written so that one computer does all the work. However, it can be written so that the sieving can take place on several computers simultaneously, with each computer using a different range. The results from each computer are then combined to find factors.

This method is suitable for numbers up to about 100 digits, or perhaps a few more. For numbers larger than this, the Number Field Sieve is more efficient.


Number Field Sieve

This method is usually used for numbers of about 100 to 200 digits, when the “class 1” methods have failed to find a factor. The term “class 1” refers to all of the above methods except the quadratic sieve.

In contrast to the quadratic sieve, implementations of the Number Field Sieve are (probably) almost always written so that sieving can take place on several computers, with the results being combined later.

The following is a portion of a “read me” file about a project that used the Number Field Sieve (NFS). In this write-up, “C” refers (I think) to the set of “complex numbers”, which are numbers of the form a+bi, where a and b are real numbers (or perhaps integers?) and i is sqrt(-1).

            Special Number Field Sieve
            ---------------------------

A brief introduction for those unfamiliar with the algorithm.
I have simplified some things.

The version of the Number Field Sieve implemented in this code uses
"special" polynomials to factor numbers of the form a^n +/- 1.
It can do others (e.g. k*A^N + k1*B^M) as well.

The two polynomials will generally be a linear one and a quartic
or quintic, but can vary somewhat. These two polynomials share a
common root mod N, where N is the number we are trying to factor.
Call this M.  The linear polynomial is usually just 'x - M'.

Consider a root (in C) of the quartic/quintic. Call this alpha.
There is a map (a homomorphism) which sends alpha to M mod N
(because the polynomials share a common root). Thus:

        M = phi(alpha)  mod N,  where phi is the homomorphism.

We then have, for sets of integers (a,b):

        a + bM  = phi(a + b alpha) mod N.

What NFS does is hope to find *many* sets (a,b) such that a + bM
and norm(a + b alpha)  are both factorisable with a fixed set of primes.
(actually one is with prime ideals of index 1; a minor nitpick).

This is done by sieving, which is the time consuming part of factoring N.
After enough factorizations have been collected, we perform some "magic"
and come up with a set of (a,b)'s such that the product over all values
of (a + bM) and over all values of norm(a + b alpha) are squares. Thus,
we get:
            A^2 = B^2 mod N
and:
            GCD(A+B,N) is a factor of N if A != +/- B.

That may give you a very general idea of what is going on with NFS factoring, depending on how much of that you understand.

A “typical” factoring project using NFS works as follows. The number to be factored is one which is composite, but for which the other factoring methods have not produced any factors. Usually this number will be between about 100 and 200 digits. The “sieving” part of the factoring is done first. Often this is done by several volunteers using PC-style computers, with each computer working on a different range of numbers. When enough data is obtained (taking perhaps 2 weeks to 3 or 4 months), it is sent to a single collection point, where the project leader (for example) shuffles it all together, and perhaps “tweaks” the data a little, such as removing duplicate data. The data is then sent to a super computer, which will spend several hours digesting the data, to come up with the factors.

A supercomputer was used because it was the only type of computer with enough memory and compute power to do the job. Today’s PCs may be powerful enough to do the required number crunching.


Go to my math index page.
Go to my home page.

Last Update: March 14, 2015.