How to Find Prime Numbers in Python

To mathematicians prime numbers represent a certain Nobel challenge. They have been studying prime numbers for a while, trying to learn their properties. Some mathematicians actually made progress. Others have gone insane trying to prove a pattern or how to generate the nth prime. Some are actively looking for the largest prime they can find. For computer scientists and programmers prime numbers have properties that can be utilized in many areas. Cryptography and pseudo-random number generators are 2 examples applications of prime numbers. Let us take a dive into the world of prime numbers. We will look at how we can generate and validate prime numbers in python.

Any valid math discussion must begin with a definition. Here is a definition of prime numbers I like to use:

“A natural number p > 2 is called prime if and only if the only natural numbers that divide p [without reminder] are 1 and p. A natural n > 1 that is not prime is called composite. Thus,, n > 1 is composite if n = ab, where a and b are natural numbers with a > 1 and b < n. “ Discrete Mathematics with Graph Theory

What does this means? It means that a number, for example 17, is prime if and only if it can be divided only by 17 and 1. On the other hand, 18 is not prime. 18 is a composite number that can be written as 2 x 9 or 3 x 6. This will prove to be a very useful property.

So how do we find prime number? Well, we do not. Or not exactly. Since the natural number are infinite and a prime number can be any number greater then 2, by definition there are infinite prime numbers. Many mathematicians have tried to prove a formula or correlation among prime numbers. None to date have succeeded. The best we can do is to generate a number and check if it is prime. However, there are some tricks we can do to speed things up, but we will get to that.

Here is our basic problem. We want to generate all the prime numbers between x and y. For starts, let us think about the numbers between 2 and 100. The straight forward, brute force, approach will be to check every number in that range and see if any number up to it can divide it without a reminder. If we find such number then it is not prime. Let us take a look at a simple program that can do this for us:


import sys

def is_prime(num):
	for j in range(2,num):
		if (num % j) == 0:
			return False
	return True

def main(argv):

	if (len(sys.argv) != 3):
		sys.exit('Usage: prime_numbers.py <lowest_bound> <upper_bound>')

	low = int(sys.argv[1])
	high = int(sys.argv[2])

	for i in range(low,high):
		if is_prime(i):
			print i,

if __name__ == "__main__":
	main(sys.argv[1:])

And the output for prime numbers between 2 and 100:

wpid-pythonfirst100primes-2013-01-6-09-56.png

Now for some tricks. First, we don’t have to check every second number. All the even numbers will be divisible by 2 by definition of even numbers. So our first loop can be rewritten to jump by 2 and start at an odd number. That way we are just checking the odd numbers. By this hypothesis, all odd number are potentially a prime number. This takes away half the numbers we need to check. If we were to ask for all the prime numbers between 100 and 10,000 it might take a while. Lets look at how long it will take my computer (Intel core duo 2 2.53GHz with 8GB ram) to run this program using the unix time command.

The original script from before:

wpid-pythontimetocalculateprimes1-2013-01-6-09-56.png

Modified script to take out the evens numbers:


import sys
import math

def is_prime(num):
	for j in range(2,num):
		if (num % j) == 0:
			return False
	return True

def main(argv):

	if (len(sys.argv) != 3):
		sys.exit('Usage: prime_numbers2.py <lowest_bound> <upper_bound>')

	low = int(sys.argv[1])
	high = int(sys.argv[2])

	if (low == 2):
		print 2,

	if (low % 2 == 0):
		low += 1

	for i in range(low,high,2):
		if is_prime(i):
			print i,

if __name__ == "__main__":
	main(sys.argv[1:])

wpid-pythontimetocalculateprimes2-2013-01-6-09-56.png

This modification cut the program time by nearly half, but we can do better. According to the Sieve of Eratosthenes we do not need to check if all the numbers will divide our potential prime, we only need to check up to the floor of the square root of the number. So our next version of the code will look like this:


import sys
import math

def is_prime(num):
	for j in range(2,math.sqrt(num)):
		if (num % j) == 0:
			return False
	return True

def main(argv):

	if (len(sys.argv) != 3):
		sys.exit('Usage: prime_numbers3.py <lowest_bound> <upper_bound>')

	low = int(sys.argv[1])
	high = int(sys.argv[2])

	if (low % 2 == 0):
		low += 1

	for i in range(low,high,2):
		if is_prime(i):
			print i,

if __name__ == "__main__":
	main(sys.argv[1:])

And the time to find all the primes between 100 to 10,000:

wpid-pythontimetocalculateprimes3-2013-01-6-09-56.png

This is really great. We have improved our program by another half and nearly a quarter than the original run. Let us examine 2 (2 things) variations of this program. First we will calculate the nth prime number. Second we will try to get a random prime number for a given length.

In order to calculate the nth prime we will have to modify them main() loop into a while loop. We will also modify the program to accept only 1 number, the nth prime we want to calculate. Once we reach the nth prime we will print it out.


import sys
import math

def is_prime(num):
	for j in range(2,int(math.sqrt(num)+1)):
		if (num % j) == 0:
			return False
	return True

def main(argv):

	if (len(sys.argv) != 2):
		sys.exit('Usage: prime_numbers4.py <nth_prime>')

	i = 0
	num = 2
	nth = int(sys.argv[1])

	while i < nth:
		if is_prime(num):
			i += 1
			if i == nth:
				print 'The ' + str(nth) + ' prime number is: ' + str(num)
		num += 1

if __name__ == "__main__":
	main(sys.argv[1:])

Sample output:

wpid-pythontimetocalculateprimes4-2013-01-6-09-56.png

For many application computing a prime number is not enough. For some we want a specific length of prime number. Lets say we need a 5 digit prime number. How are we to calculate this? The simple approach is to generate all 5 digits prime numbers and choose 1 of them randomly. Let us look at how to do this:


import sys
import math
import random

def is_prime(num):
	for j in range(2,int(math.sqrt(num)+1)):
		if (num % j) == 0:
			return False
	return True

def main(argv):

	if (len(sys.argv) != 2):
		sys.exit('Usage: prime_numbers5.py <num_digits>')

	digits = int(sys.argv[1])
	low = int('1' + '0' * (digits-1))
	high = int('9' * digits)
	prime_number_list = []

	if (low == 1):
		prime_number_list.append(2)
		low = 3

	if (low % 2 == 0):
		low += 1

	for i in range(low,high,2):
		if is_prime(i):
			prime_number_list.append(i)

	print 'A random ' + str(digits) + ' digits prime number is: ' + str(random.choice(prime_number_list))

if __name__ == "__main__":
	main(sys.argv[1:])

Output:

wpid-pythontimetocalculateprimes57-2013-01-6-09-56.png

If you were to run the program, you will notice that the it will take longer and loner as n gets larger. I found that for any n grater than 6 digits, it is very demanding on the hardware. I do not recommend letting this program try to calculate a high digit prime. You will also notice if you try to ask for 10 digits, python will raise an error in the range function. So how do we compute n digit prime for a large n? One way I found is by using a random number and checking if it is prime. This seems to work reasonably well with numbers less than 15, but then again it is random.


import sys
import math
import random

def is_prime(num):
	for j in range(2,int(math.sqrt(num)+1)):
		if (num % j) == 0:
			return False
	return True

def main(argv):

	if (len(sys.argv) != 2):
		sys.exit('Usage: prime_numbers6.py <num_digits>')

	digits = int(sys.argv[1])
	low = int('1' + '0' * (digits-1))
	high = int('9' * digits)
	done = False

	if (low == 1):
		low = 2

	while not done:
		num = random.randint(low,high)
		if is_prime(num):
			print 'A random ' + str(digits) + ' digits prime number is: ' + str(num)
			done = True

if __name__ == "__main__":
	main(sys.argv[1:])

wpid-pythontimetocalculateprimes58-2013-01-6-09-56.png

So how about really large primes? That is where time becomes an enemy and computational complexity goes up. I found a few modules that can help compute large primes in python but I have yet to try to run anything. Once I get a chance I will try to run them and put a post up. If you really need prime numbers I would advise you to either download a list of them or generate a list of primes and save them to a text file. Prime numbers will show up every once in a while. They come up in cryptography and other encryption methods. Most usage of prime numbers is due to the nature of the numbers. For more information about prime numbers I would suggest seeking out a mathematician, which I am clearly not.

If you enjoyed this post, please consider leaving a comment or subscribing to the RSS feed to have future articles delivered to your feed reader.
  • http://twitter.com/therealprologic James Mills

    Nice article! I find your computation of low, high a bit odd though. Here's a better approach I think:

    >>> low, high = 10**(digits-1), (10**digits - 1)
    >>> low, high
    (10000, 99999)
    >>>

    cheers
    James

    • CptDeadBones

      Thanks. You are right. That is another good way to calculate the limits.

      • http://twitter.com/therealprologic James Mills

        Nothing personal; but I think math should never be done with strings :)

        • CptDeadBones

          True.

  • avi

    Wow, I enjoyed so much reading this article.

    Please post more math / python articles. Thank you !

    • CptDeadBones

      Thank you very much. Is there any math area in particular you want me to write about?

  • Ashwin

    Your blog is great!!, Thank you so much for putting up these articles and breaking them down to much detailed explanations
    keep them coming :D

    • CptDeadBones

      Thank you very much! I am glad you enjoy it.

  • Perkin

    Just starting to learn python, this has helped, thanks.
    Would is_prime be twice as quick if you change the loop to

    for j in range(3,int(math.sqrt(num)+1),2):

    As your not checking even numbers, and any odd number is only divisible by an odd number.

    • CptDeadBones

      I do not know if it will be half as fast, that might be something to time. You are correct in your point. I was sure that the code I was running was skipping the even numbers. Maybe I posted the wrong code, or missed it all togther. I will need to double check. Either way, you are correct and it would be more efficient than checking all numbers.

  • swagmastag

    fuck you

  • SwagMastahJ

    @swagmastag No, fuck you

  • Dummy

    For really large primes (e.g. 4000 bits) one can use probabilistic primality tests (and indeed they are used in cryptography. The best one is Miller-Rabin test. Here's it's python code http://rosettacode.org/wiki/Miller-Rabin_primality_test#Python
    Though it's too verbose and may seem complicated, the algorithm itself is easy (though it's understanding needs some number theory knowledge).