|
Thu, 08 Feb 2007
(Warning: Extreme nerdiness follows. You are not expected to understand this if you are a family member whose last name is not "Sambol". Feel free to move along.) So instead of doing anything useful, I spent all of my free time today designing and implementing and measuring and optimizing and refining an algorithm for determining the prime factors of an integer. Here's the final result (written in Ruby):
def factor(n)
def primes_up_to (n)
r = (2..n).to_a
primes = []
while r.first**2 < n
primes.push(r.first)
r.reject! { |i| i % primes.last == 0 }
end
primes + r
end
factors = {}
for p in primes_up_to((n**0.5).ceil)
while n % p == 0
factors[p] ? factors[p] += 1 : factors[p] = 1
n /= p
end
end
factors[n] = 1 if n != 1
factors
end
I got the urge to do this after incidentally running across a largish number in the middle of a random sysadmin task, and idly wondering whether or not it was a power of 2. The actual factorization algorithm was fairly easy to come up with and not all that interesting: just test whether every eligible prime number divides the integer evenly and, if so, count how many times it does so. The interesting part is generating the list of prime numbers (which is the job of the "primes_up_to" subroutine). I vaguely remembered that some ancient Greek guy had come up with a simple method for finding all prime numbers smaller than a given number, and I preferred to look this up rather than reinvent that particular wheel. Some googling revealed that the "sieve of Eratosthenes" was what I had in mind. It goes a little like this: start with the list of all the integers from 2 to whatever; note that the first member of this list is prime and remove it and all its multiples from the list; keep repeating that last step until the list is empty. The code for doing exactly that is actually a bit different from the function mentioned above:
def primes_up_to (n)
r = (2..n).to_a
primes = []
while not r.empty?
primes.push(r.first)
r.reject! { |i| i % primes.last == 0 }
end
primes
end
The final version differs from this one primarily in the test for terminating the iteration. By tracing out the way that the list changed during each iteration of the sieve, I discovered that this initial implementation did quite a bit of unnecessary filtering. After a certain point relatively early on, there is nothing but prime numbers left in the list, so there's no point in wasting energy trying to weed out their multiples. A serendipitous consultation of my senior-level algorithms textbook unearthed the (unproved) assertion that, "Trial division by all integers up to B is guaranteed to factor completely any number up to B2." After ruminating on why this must be a true theorem [1], it quickly became obvious that a corollary is that an array of the integers from 2 to B2 will contain only primes if you eliminate multiples of the primes from 2 to B [2]. This meant that I could terminate the while loop in primes_up_to as soon as the first member in the list is larger than the square root of the largest integer in the list. At that point, all the numbers left in the list can be dumped into the list of primes that's been built up; no further testing needed. My curiosity led me to another succinct and elegant implementation of Eratosthenes' sieve:
def primes_up_to (n)
(2..(n**0.5).ceil).inject((2..n).to_a) { |r, i|
r.select { |j| j == i || j % i != 0 }
}
end
I'm charmed by the functional style exhibited here. I've yet to see any code that uses Ruby's inject properly that doesn't exude a fanciful cleverness. Unfortunately, measurements clearly showed that the running time for this version is much worse than that of my imperative version. Intractably so, as far as I can tell. It wastes time performing a filtering run for each of the integers up to square root of the maximum integer, regardless of whether they're actually prime. Perhaps I'll more formally analyze the running time of these two implementations of Eratosthenes' sieve, but that will have to wait for another day. [1] If B2 has a prime factor greater than B (let's call it A), then A can only form B2 by multiplying with a number smaller than B (let's call it C), else A*C would be too large. Since C is less than B, it's already been tested as a factor, which simultaneously reveals A as a factor (B2/C = A). [2] If the list contained a composite number at this point (let's call it X), then that composite number would have only factors larger than B. Since the factors are larger than B, then the result of multiplying the factors (i.e. X) would have to be larger than B2. But the list doesn't contain numbers larger than B2, by definition, so X can't be in the list. Comments
Posted by Phil! at Fri Feb 9 19:26:44 2007 Note that because you're dividing the number by its factors you don't necessarily need all the primes up to the square root of the original number. The approach I've taken in the past requires a primality-testing function and is basically
def prime-factorize (n):
d = 2
while n > 1:
if n % d = 0:
add-divisor(d)
else:
d = next-prime(d)
def next-prime (n):
if n < 2:
2
else if n = 2:
3
else if even?(n):
next-prime(n - 1)
else:
candidate = n + 2
while true:
if prime?(candidate):
return candidate
else:
candidate = candidate + 2
This approach mostly depends on having a good prime testing function. (Actually, with a really good function, the 'while n > 1' can be replaced with 'until prime?(n)', but the simplest version of 'prime?' would involve testing divisibility against all primes below sqrt(n), and if you're doing that, you might as well just build the list once and use it in your algorithm.) My primality-testing can be seen at http://aperiodic.net/phil/lisp/primes.lisp (yes, Common Lisp) and it's probably a bit heavyweight for one-offs like your question. I have it lying around for Project Euler things. Posted by Daniel Pearson at Sat Feb 10 20:35:15 2007 I gotta pick on your syntax errors: you seem to have forgotten that "?" and "-" aren't legal characters for Python identifiers. Posted by Steve at Mon Feb 12 07:29:50 2007 Hurray old dead Greek guys! Eratosthenes, rah rah rah! Anaximenes, bis boom bah! Posted by Daniel Pearson at Mon Feb 12 07:54:42 2007 "But they're like the people chained up in the cave in the allegory of the people in the cave by the Greek guy." Posted by Phil! at Mon Feb 12 14:57:34 2007 It's not Python. It's pseudocode (based on Lisp, but with Scheme naming conventions) that just happens to look like Python. So there. :P Posted by Phil! at Wed Feb 14 01:58:21 2007 And today I saw this link: Add your own comment:
|
Site Index
Diary Categories
Previous Entries
Entries by Date
Entries by Month
2007-Sep2007-Aug 2007-Jun 2007-May 2007-Apr 2007-Mar 2007-Feb 2007-Jan 2006-Dec 2006-Nov 2006-Oct 2006-Aug 2006-Jul 2006-Jun 2006-May 2006-Apr 2006-Mar 2006-Feb 2006-Jan 2005-Dec 2005-Nov 2005-Oct 2005-Sep 2005-Aug 2005-Jul Here is a version of this page more appropriate for printing. You can also read this diary via RSS 0.91 or RSS 2.0. If you ask me, I can also add you to the list of people who will automatically receive new entries at their email address. |
|||||||||||||||||||||||||||||||||||||||||||||||||