Chris Dalton and Dr. Dan Ventura, Computer Science
Prime factorization is the breaking down of a composite number into prime numbers that, when multiplied together, equal the number. There is not a known algorithm that accomplishes this task in sub-exponential time. For my ORCA project, I worked with Dr. Dan Ventura and sought to develop a polynomial-time prime factorization algorithm. The development of such an algorithm would have revolutionized computer security and oered new insight to even more dicult problems. I rst approached the task as a boolean constraint satisfaction problem, but after running out of ideas, I tried coming at it from a more number theoretic standpoint. I studied multiplication and division algorithms immensely (prime factorization is just these in reverse), found a closed form solution to prime factorization that requires a modular factorial, and tried to develop a polynomial-time modular factorial algorithm. In the end I was not successful in creating a fast prime factorization algorithm, but did have a lot of success in other areas.
My rst attempts at prime factorization were through boolean constraint satisfaction. Prime factorization is multiplication in reverse, but I initially focused on reversing addition. I represented the numbers in base-2 (binary) and analyzed the equations to perform addition in this form. Two numbers A and B can be represented in binary form with the boolean variables an … a2 a1 and bn … b2 b1. The solution, S = s1+n … s2 s1, can then be represented by the following equations:
A circuit that performs these operations can easily be run in reverse and generate a pair of numbers that add into a given sum. My next task was to build a multiplication circuit and reverse it, thereby obtaining a pair of numbers that multiply into a given product. This concept appled recursively would accomplish prime factorization. Multiplication is mostly just a 2-dimensional add. Two boolean numbers of length 4, A = a4 a3 a2 a1 and B = b4 b3 b2 b1, can be multiplied by performing the following addition, where pij = aj ^ bi:
I spent about a month trying to reverse this circuit. I successfully did reverse the 2-dimensional sum, but requiring that pij = aj ^bi signicantly complicates the task. I wrote a program and found that as a number grows in length, the amount of valid values for pij that add into into it grows intractibly large, while the amount of those that also satisy the pij = aj ^ bi constraint become vanishingly small.
After hitting a dead end with the constraint satisfaction approach, I studied the numeric algorithms of multiplication and division in hopes of gaining an insight on how to reverse them. I studied and implemented an ecient algorithm for multiplication of large numbers that recursively performs a fast fourier transform over a Fermat number ring. I also spread it over multiple cores and implemented the base case in assembly for maximum eciency. After nishing multiplication, I searched for an ecient division algorithm but couldn’t nd anything. I ended up having to come up with my own that uses a divide-and-conquer approach with a slight twist. Its eciency as compared to other standard division algorithms is shown below:
The asymptotic performance is superior by far. Once a number gets to be millions of bits long, my algorithm is several thousand times faster than other standard, well-known algorithms.
I didn’t get any great ideas on how to reverse these multiplication and division algorithms, but did add other functions, including a square root, power, factorial, and modular power that all run very quickly. I have them packaged up in a template-driven C++ library that can outperfrom GMP, arguably the world’s fastest big-number library, in many instances. I hope to release my library to open source in the near future.
After playing with big numbers, I noticed a closed-form solution to prime factorization. Namely, if N is the product of two prime numbers, the following formula will produce the smaller of the two factors:
The length of (pN)! will grow exponentially, but luckily, instead of calculating that entire number, we can compute the factorial modulo N and get the same result. I tried to nd a clever method of computing modular factorials, but had no success.
I’m not surprised that I was unable to nd a sub-exponential prime factorization algorithm during the short time I had to work on this project. The problem has eluded the world’s greatest minds for centuries. It’s such a hard problem that we’ve gambled our digital security that it can’t be done in polynomial time. Despite the difficulty of the project, I feel like I had a lot of success. I gained new insight to constraint satisfaction problems, learned a lot about number theory, and developed an extremely fast, useful big-number library that I hope to release soon to open source.