End of Project Report

Ellen Preiss

Cryptography

William McCallum

# Goals

This summer’s research in cryptography was aimed at finding ways to keep the public key of the RSA cryptosystem secure.  There were four students working on different ways to break to public key, and therefore find what needs to be done to make it stronger against attacks.  One topic studied was producing large primes, another was attacking the exponents in the RSA algorithm, and another student and I looked at factoring algorithms.

# The RSA Alogrithm

RSA deals with picking two large prime numbers and multiplying them together to get a number n.  Then you pick a small encryption exponent a.  So you take your message to the encryption exponent, in modulus n, to get a coded message.  The public key includes n and a.  One ideal way to break the code would be to factor n.  This, however, is not easy with the large-scale numbers we are dealing with.  So I looked at some algorithms which make factoring a bit more efficient.  As a result of this work, we hoped to see what kind of prime numbers are the best to choose to make a safe public key.

# Learning Cryptography

Before we could begin any of the research, we learned some Modular Arithmetic and other aspects of number theory.  Then we learned about public key cryptosystems, particularly the RSA encryption system.  We sent coded messages to each other and practiced decoding messages we received, to get an idea of what this type of cryptography is all about. I then read parts of a textbook about cryptography called “Making, Breaking Codes,” by Paul Garrett, where I was introduced to a number of factoring algorithms.  Pollard’s Rho and p-1 methods are simple examples of algorithms which find a random mapping of numbers and test them for factors.  I studied these factoring algorithms, and then implemented them to see what types of numbers need to be avoided so the key would still be hard to factor.

# Procedure

After I had a basic understanding of each factoring algorithm, I would program it in Mathematica and use that to factor numbers from one to thirty digits, or as far as I could go in a reasonable amount of time.  Then I would look for a function of n for the running time and also for the number of steps through each loop.  This gave an idea of how useful the methods would before for larger numbers.  I made tables of the data I found, then used excel to make a function to fit the data.  For example, I used the formula r(n) = a*nb, and then took the log of that to get log(r(n)) = log(a) + b*log(n).  This is in the form of y=mx+b, so the slope of log(r(n)) versus log(n) was b, and the y-intercept was log(a).  This method gave some pretty close formulas, though it did not always work because of the type of data I had.

For Pollard’s Rho Method, I changed the algorithm 13 times to see how the running time and number of steps needed to factor would be altered, so there are 13 tables for this.  For Pollard’s P-1 Method, all I changed was the value of b, so there are two tables.  (Each group of tables is in one file, with separate sheets)

As you can see when you view the Rho tables, the functions r(n) and a(n) are fairly close to the real values, but as the value of n grows larger, the functions are less accurate.   The functions in the p-1 tables are the same way, but r(n) and c(n) are close to the correct values.

# Pollard’s Rho Method

Pollard’s Rho Method generates two sequences of random numbers, and takes the differences of those sequences to generate a quadratic map.  Then it tests the differences against n for a greatest common divisor (GCD).  This is better than trial division because instead of just dividing one number into n to see if it is a factor, it looks for a greatest common divisor between the numbers, which allows it to test a larger amount of numbers at once.  Instead of just trying to divide the number, it tries to divide the number and its factors.  This method starts with one small number, and uses it to generate a random map, and with this map it tests for the greatest common divisor with respect to the number being factored.  So say you start with x=2, and y=x2+1=5.  If the GCD of x-y and n is between one and n, the GCD is the factor.  If it’s one, you have to take x=x2+1 and y=(y2+1)2+1, until the GCD isn’t one anymore.  If the GCD is n, you have to start over, and change the formula for x to something of the form x2+c, where c is not 0 or -2.  I decided to study Pollard’s Rho Method in more detail by writing the algorithm in a computer program and looking at how long it took to find a factor, and how many steps it required.  The following is a program I wrote in Mathematica:

n= (composite number);

f[v_]:={Mod[v[]^2+1, n], Mod[(v[]^2+1)^2+1, n]}

g[v_]:=GCD[v[]-v[], n]

w=f[{2, 5}];

a=1;

Timing[Do[t=g[w]; ++a; If[t>1, Break[], ]; w=f[w], {i, 10}]]

Print[t]

Print[a]

This prints the running time, the factor (t), and the number of steps (a) to find the factor.  The number 10 is changed according to how many steps the algorithm will go through (Mathematica will not go to one billion steps), which is said to be usually about n1/4.  I used this program for n up to 31 digits (too much CPU time is required to look at numbers much larger), and then looked for formulas relating the running time to n, and the number of steps to n.   After this I started changing the formula in the algorithm, to see why Pollard wrote it the way he did.  Looking at how many steps you could go through before the run time got to the 100 seconds, linear functions went 13 steps, cubic functions went 16 steps, and Rho functions went 24 steps.  This shows that the Rho formulas take the least amount of time to run through.  Formulas of the form x=x2+c worked the quickest, also with the least amount of steps needed, overall.  So his random mapping appears to be just right.  I also looked at how the method works with prime numbers that are very far apart, making one much smaller, and then multiplied together to get a different type of n.  Of course, this was a much faster algorithm, because Pollard’s Rho Method is fastest for small prime factors up to about 7 digits.  This means that it is not very efficient for the RSA cryptosystem, where we could easily be working with factors with up to 100 digits.  The research with this method is useful, however, in finding ways to make the public key stronger.  Pollard’s Rho Method shows that the prime factors need to be about the same size, but other factoring algorithms start searching numbers that are about the same size.  So there is a balance in there somewhere.

# Rho Findings

When I changed the algorithm to a linear function (y=x+c), I saw that it was simply trial division.  This is because the differences of the two random sequences found generated a map which was actually the sequence of integers in order from c and up.  This means it was testing for a GCD the same way as you test for a factor in trial division, and the map was not random.  So the factor was equal to the number of steps it took to find a factor.  When I found a formula for the number of steps with regards to n, I saw that this was close to the actual factor.  It was about the square root of n, which is what is said to be close to the factor of n when the prime numbers are about the same size.  Of course, this only worked because I was using prime numbers close together, so the formula would change for different numbers.

Pollard’s P-1 Method

This method works well to find prime factors p so that p-1 is divisible by only relatively small factors.  This tells us that another way to make a RSA public key secure is to make sure the factors of the prime factor, of n, minus one are large.  In the p-1 method, you choose b where 2 ≤ b ≤ n-1.  Then you start off by taking the greatest common divisor of b and n, and if that is greater than or equal to 2, you’re done and the GCD is the proper factor.  If not, you take the first possible prime factor p=2, and calculate l=Floor(ln(n)/ln(p)), and b=bq^l, and then take g=GCD(b-1,n).  While g=1, you continue going back through this loop, making p the next greater prime number, until 1<g<n.  If g=n, there’s a failure and b will have to be changed.  The following is the p-1 method in Mathematica:

n= (composite number);

Timing[r=0; b=2; g=GCD[b, n]; c=1;

If[g>=2, {Print[g]}, {t=1,

While[g<2, {p=1,

While[p<2, {++t,

If[PrimeQ[t], p=t, p=1], r++}],

l=Mod[Floor[Log[n]/Log[p]], n],

b=PowerMod[b, p^l, n],

g=GCD[b-1,n],

c++}]}]]

Print[g]

Print[c]

Print[r]

This program gives the time used and prints the factor (g), the number of times through the GCD loop (c), and the number of times through the prime loop (r).  I used this program for up to 28 digits and waited for a couple hours on the 29th before I stopped.

# P-1 Findings

An interesting thing I found when experimenting with different values of b was that when b=6√n, it went the same number of times through each of the loops, but the time used was at least cut in half.  I tried a number of other values for b and this was the fastest.  This was an interesting find, but for our purposes it only adds to the fact that the value of n should be large.  There could be other tricks to make this method faster, but until they’re found, it is also not very useful when trying to factor very large RSA numbers.

# Conclusion

The two main methods I studied, Pollard’s Rho and p-1 methods, were good tools to study factorization and some cryptography.  No huge breakthroughs were found, but they were interesting to look at, and they gave us guidelines to follow in order to choose a strong public key, which was our goal.  Pollard’s Rho method shows us that p and q should be large, and his p-1 method points out how the factors of p-1 and q-1 should also be large, when p*q=n.  These two tips protect the public key from two easy attacks on the RSA Cryptosystem.  This is a good beginning, and combined with the efforts of the other students and their work this summer, our team could make a fairly secure public key.

According to the RSA lab, the best factoring method for large numbers is the general number field sieve, although it is very complex.  Some other good ones are also the quadratic sieve algorithm and the elliptic curve method.  These would take more knowledge of number theory and a lot more time to look over.  So this project has been a good time of getting introduced to the basics of cryptology and number theory, and the way research works.