Jump to content

Finding Primes Faster (Sieve of Eratosthenes)

- - - - -

  • Please log in to reply
5 replies to this topic

#1
fayyazlodhi

fayyazlodhi

    Programming Expert

  • Members
  • PipPipPipPipPipPip
  • 403 posts
Find all prime numbers less than 1 million

Generating large prime numbers or even testing huge numbers for being prime or not has been a daunting task.

To clarify the basics, a number is called prime if it can be divided by only 1 and itself and no other. This means a prime can't have more than two divisors. For e.g. 2,3,5,7 etc. are all primes.

A general or common approach to test if a number is prime or not starts with testing from 2 onwards up to half of the number trivially, though actually up to square root is sufficient.

This means that if given a number e.g. 9999 whose square root is roughly 99.99 (obviously not perfect otherwise it is already not a prime), we can test if any number between 2 and 99 divides it completely. If not than no other greater number would divide either and hence this would be a prime.

Now in practice if being asked to generate say the first 1000 prime numbers, it is going to be very in efficient that you start from 2 and keep going in a loop. Then for each value you take the number’s square root and check if it divides any number between 2 and square root. If not, it is a prime. This will be a O(n)^2 algorithm.

There has been a well known efficient algorithm for doing the above task with two constraints:

1. An upper limit needs to be known in advance i.e. 1, 10 million etc.
2. One should be able to allocate array of that size in a computer program.

The idea is basically that you create an array of the given size (1 million) initialized with all 0’s. Now start with number 2 at index 2 of array. Since the value at this index is 0, mark it as a prime. Then run a loop up to 1 million marking all multiples of 2 in array as non primes.

Then keep incrementing after 2 up to a million. If the current index is found to be 0 or unmarked, it means it is not divisible by any of the numbers we have encountered so far. Otherwise it would have been marked as non prime and wouldn’t be zero.

This is known as Sieve of Eratosthenes and is considered a very efficient algorithm for finding primes. Though it still runs in two nested loops, but the performance gains are highly significant and it’s run time complexity amounts to be O(n(logn)(loglogn)). For details see
Sieve of Eratosthenes - Wikipedia, the free encyclopedia

I have used a separate print function which takes a little fancy formatting for 1st 2nd and 3rd primes into account. The core function is gen_sieve_primes which runs two nested loops. The outer loop runs from 2 to max number. If value at current index is 0 meaning unmarked, it marks that as a prime by assigning 1. Then it proceeds to second loop in which it marks all multiples of current number (for 2, those would be 4,6,8, 10…. Up to 1 million) as non primes by assigning them -1.

#define MAX_NUM 1000000 // 200


// array will be initialized to 0 being global

int primes[MAX_NUM];


void gen_sieve_primes(void)

{

    for (int p=2; p<MAX_NUM; p++) // for all elements in array

    {

        if(primes[p] == 0) // it is not multiple of any other prime

            primes[p] = 1; // mark it as prime


        // mark all multiples of prime selected above as non primes

        int c=2;

        int mul = p * c;

        for(; mul < MAX_NUM;)

        {

            primes[mul] = -1;

            c++;

            mul = p*c;

        }

    }

}


void print_all_primes()

{

    int c = 0;

    for(int i=0; i<MAX_NUM; i++)

    {

        if(primes[i] == 1)

        {

            c++;


            if(c < 4)

            {

                switch(c){

                case 1:

                    cout << c << "st prime is: " << i << endl;

                    break;

                case 2:

                    cout << c << "nd prime is: " << i << endl;

                    break;

                case 3:

                    cout << c << "rd prime is: " << i << endl;

                    break;


                default:

                    break;

                }

            }


            else

                cout << c << "th prime is: " << i << endl;

        }

    }

}


int main()

{

    gen_sieve_primes();


    print_all_primes();


    return 0;

}


Following is the run of above algorithms using 200 as the max value.

Attached File  Sieve_Primes1.jpg   70.03K   167 downloads


Then I ran this on 1 million numbers size array using Visual Studio on Vista 32 bit machine. The reason for mentioning this is, you may or may not be able to create an array as large depending upon the available memory and type of OS. On a 64 bit machine, one might be able to go even further.

It took roughly a couple of minutes to show all primes less than a million. Following is the end of prints showing that there are some 78,498 primes in 1 million numbers.

Attached File  Sieve_Primes2.jpg   55.92K   93 downloads

There can be a few optimizations to improve this algorithm as suggested in the link I referred above. For e.g. instead of incrementing by 1 while going from 2 to 1 million in the outer loop, we can treat 2 as a special case, start from 3 and do increments of 2. This is because we know no even number can be a prime and all of those would be marked while running the loop for 2.

From space perspective, instead of an integer which might be 32bit or even 64 bit on some machines, we could use a one byte variable i.e. int8_t for array, since we only need 3 states (unexplored, prime, and not prime) represented by 0, 1 and -1 in the current program. This is going to reduce the amount of memory required to between 1/4th to 1/8th.
Today is the first day of the rest of my life

#2
Axel

Axel

    Newbie

  • Members
  • PipPip
  • 27 posts
(Tell me if this is a bump for you guys, I'm new to this forum so don't be angry please)
Wouldn't it be even faster if you would first of all ignore the numbers that end with '0', '2','5' ?

#3
fayyazlodhi

fayyazlodhi

    Programming Expert

  • Members
  • PipPipPipPipPipPip
  • 403 posts
Sure No problem! Please feel free to pass any opinion. That is the objective of this forum.

Towards the end of tutorial, an improvement is mentioned regarding iterating through odd numbers only. That cover those ending in 0 or 2. As soon as you mark 5 as a prime, all numbers that are multiples of 5 will already be marked as non primes according to the algorithm.

Did that answer your question?
Today is the first day of the rest of my life

#4
Axel

Axel

    Newbie

  • Members
  • PipPip
  • 27 posts
yes :P
What is the biggest value you can get ? 12 million digits ? :D
I would use long long but that still wouldn't beat the largest discovered prime number which is around 12m digits

#5
fayyazlodhi

fayyazlodhi

    Programming Expert

  • Members
  • PipPipPipPipPipPip
  • 403 posts
Certainly not 12 million digits. :)

Point 2 in the start mentions the constraint of array size. I guess it works fine up to 10 million numbers (number of primes in that range would be significantly less).

This algo as stated on Wikipedia too is only regarding small primes i.e. those under 10 million.

The primary advantage of this algorithm is, say you have a program which randomly inputs numbers between 1 - 10 million. Now you have to compute whether each number is a prime or not.

It works faster than any other algorithm (probably) and is known for the same efficiency.
Today is the first day of the rest of my life

#6
john coner

john coner

    Newbie

  • Members
  • Pip
  • 2 posts
can any 1 give me the tutorials of c programing, complete i want please rply soon.....:)




1 user(s) are reading this topic

0 members, 1 guests, 0 anonymous users