[erlang-questions] beginner: Generating a list of prime numbers

11 views
Skip to first unread message

Mark Wagner

unread,
Feb 5, 2009, 9:50:22 PM2/5/09
to erlang-q...@erlang.org
As a way of learning the basics of Erlang, I'm solving various
problems from Project Euler (http://projecteuler.net/). Problem #10
involves generating a list of prime numbers. The naive solution to
this is to generate a list of numbers and test each of them
individually to see if it's prime; a more sophisticated solution uses
the list of primes already found to speed up the testing. My code for
the sophisticated solution is:

is_prime(PrimesSoFar, Candidate) -> not lists:any(fun(X) -> Candidate
rem X =:= 0 end, PrimesSoFar).

list_primes(PrimesSoFar, Max, Candidate) when (Candidate > Max) -> PrimesSoFar;
list_primes(PrimesSoFar, Max, Candidate) ->
case is_prime(PrimesSoFar, Candidate) of
true -> list_primes([Candidate|PrimesSoFar], Max, Candidate + 2);
false -> list_primes(PrimesSoFar, Max, Candidate + 2)
end.

list_primes(N) when N < 2 -> [];
list_primes(2) -> [2];
list_primes(N) -> list_primes([2], N, 3).

The problem with this is that the list of prime numbers is ordered
largest-first, so that the small primes (the ones most likely to be
factors of a given number) are tested last. This makes the
sophisticated solution slower than the naive one, which tests small
factors first. Any suggestions on how to fix this?

--
Mark Wagner

Matthew Palmer

unread,
Feb 6, 2009, 1:18:08 AM2/6/09
to erlang-q...@erlang.org
On Thu, Feb 05, 2009 at 06:50:22PM -0800, Mark Wagner wrote:
> As a way of learning the basics of Erlang, I'm solving various
> problems from Project Euler (http://projecteuler.net/).

Well, that's kinda spooky -- so am I. <grin>

> Problem #10
> involves generating a list of prime numbers. The naive solution to
> this is to generate a list of numbers and test each of them
> individually to see if it's prime; a more sophisticated solution uses
> the list of primes already found to speed up the testing. My code for
> the sophisticated solution is:
>
> is_prime(PrimesSoFar, Candidate) -> not lists:any(fun(X) -> Candidate
> rem X =:= 0 end, PrimesSoFar).
>
> list_primes(PrimesSoFar, Max, Candidate) when (Candidate > Max) -> PrimesSoFar;
> list_primes(PrimesSoFar, Max, Candidate) ->
> case is_prime(PrimesSoFar, Candidate) of
> true -> list_primes([Candidate|PrimesSoFar], Max, Candidate + 2);
> false -> list_primes(PrimesSoFar, Max, Candidate + 2)
> end.
>
> list_primes(N) when N < 2 -> [];
> list_primes(2) -> [2];
> list_primes(N) -> list_primes([2], N, 3).
>
> The problem with this is that the list of prime numbers is ordered
> largest-first, so that the small primes (the ones most likely to be
> factors of a given number) are tested last. This makes the
> sophisticated solution slower than the naive one, which tests small
> factors first. Any suggestions on how to fix this?

You could use lists:reverse a couple of times to append the new value to the end
of the list, like so:

list_primes(lists:reverse([Candidate|lists:reverse(PrimesSoFar)])), ...);

You can also optimise a little further by only doing primality checking for
primes less than or equal to sqrt(Candidate), since if there's a factor F
greater than sqrt(Candidate) there must also be one less than it (Candidate
div F). For that to work, of course, you'd need to re-order your list to be
smallest first, and not use lists:any, but rather your own list-walking
function, to do the testing.

BTW, isn't Erlang a *good* language to do this sort of thing in? I'm loving
it, myself -- trivial to debug things, and the problems solve themselves so
quick (I'm fairly sure that I'm writing these Euler problems quicker than I
would in Ruby, my current favourite language).

- Matt

Imre Palik

unread,
Feb 6, 2009, 6:23:39 AM2/6/09
to erlang-q...@erlang.org

As for the primes:

sieve([H|T], M) when H=< M -> [H|sieve([X || X<- T, X rem H /= 0])];
sieve(L, _) -> L.

call it as:

Primes = [2|sieve(lists:seq(3, N, 2), math:sqrt(N))],

You could also do it with tail recursion, but if N>10^6 the speedup is negligible. The real morale of the thing is that you want to use list comprehension whenever you can. It tends to be much faster than recursion.

BTW for such things the project euler forum si your friend.


From: Matthew Palmer <mpa...@hezmatt.org>

> BTW, isn't Erlang a *good* language to do this sort of thing in? I'm loving
> it, myself -- trivial to debug things, and the problems solve themselves so
> quick (I'm fairly sure that I'm writing these Euler problems quicker than I
> would in Ruby, my current favourite language).

It depends on your definition of good. You won't be able to brute force anything. Not even those that you could in TCL. Sometimes even the optimal algorithm will run for several minutes. Also, if you try to use gb_trees for memo in dynamic programming, programming, it becomes really damn slow after a few million entries.

I.

Hynek Vychodil

unread,
Feb 6, 2009, 9:39:46 AM2/6/09
to Imre Palik, erlang-q...@erlang.org
I should like notice that your algorithm is not The Sieve of Eratosthenes. For example in sieve algorithm should not be 'rem' operation but just '+'. For more details see http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf

_______________________________________________
erlang-questions mailing list
erlang-q...@erlang.org
http://www.erlang.org/mailman/listinfo/erlang-questions



--
--Hynek (Pichi) Vychodil

Analyze your data in minutes. Share your insights instantly. Thrill your boss.  Be a data hero!
Try Good Data now for free: www.gooddata.com

Bjorn Gustavsson

unread,
Feb 6, 2009, 11:11:08 AM2/6/09
to Imre Palik, erlang-q...@erlang.org
On Fri, Feb 6, 2009 at 12:23 PM, Imre Palik <im...@u-tx.com> wrote:

> The real morale of the thing is that you want to use list comprehension whenever you can. It tends to be much faster than recursion.

That surprises me, because I happen to know how list comprehensions
are implemented.

Do you have any measurements to back up your statement?

/Bjorn
--
Björn Gustavsson, Erlang/OTP, Ericsson AB

Imre Palik

unread,
Feb 6, 2009, 11:49:07 AM2/6/09
to Bjorn Gustavsson, erlang-q...@erlang.org
Feladó: Bjorn Gustavsson [bgust...@gmail.com]
> On Fri, Feb 6, 2009 at 12:23 PM, Imre Palik <im...@u-tx.com> wrote:
>
>> The real morale of the thing is that you want to use list comprehension whenever you can. It tends to be much faster than recursion.
>
> That surprises me, because I happen to know how list comprehensions
> are implemented.
>
> Do you have any measurements to back up your statement?


Here it is:

% code start here

-module(prime_test).
-export([run/1]).

sieve1([H|T], S) when H =< S -> [H|sieve1([X||X<-T, X rem H /= 0], S)];
sieve1(L, _) -> L.

filter(_, [], A) -> lists:reverse(A);
filter(P, [H|T], A) when H rem P == 0 -> filter(P, T, A);
filter(P, [H|T], A) -> filter(P, T, [H|A]).

sieve2([H|T], S) when H =< S -> [H|sieve2(filter(H, T, []), S)];
sieve2(L, _) -> L.

run(N) ->
erlang:display({start, now()}),
P1 = sieve1(lists:seq(3, N), math:sqrt(N)),
erlang:display({p1, now()}),
P2 = sieve2(lists:seq(3, N), math:sqrt(N)),
erlang:display({p2, now()}).

% code ends here

15> prime_test:run(10000000).
{start,{1233,938342,764000}}
{p1,{1233,938385,702000}}
{p2,{1233,938441,827000}}
true
16> prime_test:run(10000000).
{start,{1233,938718,296000}}
{p1,{1233,938760,608000}}
{p2,{1233,938816,702000}}
true
17> prime_test:run(10000000).
{start,{1233,938821,827000}}
{p1,{1233,938864,30000}}
{p2,{1233,938919,952000}}
true
18>


Ι.

I.

Matthias Lang

unread,
Feb 6, 2009, 4:51:14 PM2/6/09
to Imre Palik, erlang-q...@erlang.org
On Friday, February 06, Imre Palik wrote:

Imre> >> The real morale of the thing is that you want to use
Imre> >> list comprehension whenever you can. It tends to be
Imre> >> much faster than recursion.
...

I see no big difference between code variants when running on my
machine:

Erlang (BEAM) emulator version 5.6.5 [source] [64-bit] [smp:2] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.6.5 (abort with ^G)
1> c(prime_test).
./prime_test.erl:16: Warning: variable 'P1' is unused
./prime_test.erl:18: Warning: variable 'P2' is unused
{ok,prime_test}
2> prime_test:run(10000000).
{start,{1233,954836,637538}}
{p1,{1233,954948,700238}}
{p2,{1233,955063,218981}}
true

I tried adding a plain recursive version of filter (see code below),
but didn't get any significant differences there either:

19> prime_test:run(10000000).
{start,{1233,955375,697367}}
{p1,{1233,955488,314905}}
{p2,{1233,955603,458047}}
{p3,{1233,955717,964138}}

Repeating on a 32-bit version of R11B-5 on a different machine didn't
produce anything spectacular either:

3> prime_test:run(10000000).
{start,{1233,955391,95580}}
{p1,{1233,955426,572019}}
{p2,{1233,955465,160014}}
{p3,{1233,955503,541756}}

Aside: that call to lists:seq(3, N) looks suspicious.

Matt

----------------------------------------------------------------------
-module(prime_test).
-export([run/1]).

sieve1([H|T], S) when H =< S -> [H|sieve1([X||X<-T, X rem H /= 0], S)];
sieve1(L, _) -> L.

filter(_, [], A) -> lists:reverse(A);
filter(P, [H|T], A) when H rem P == 0 -> filter(P, T, A);
filter(P, [H|T], A) -> filter(P, T, [H|A]).

rfilter(P, [H|T]) when H rem P == 0 -> rfilter(P, T);
rfilter(P, [H|T]) -> [H|rfilter(P,T)];
rfilter(_, []) -> [].

sieve2([H|T], S) when H =< S -> [H|sieve2(filter(H, T, []), S)];
sieve2(L, _) -> L.

sieve3([H|T], S) when H =< S -> [H|sieve3(rfilter(H, T), S)];
sieve3(L, _) -> L.

run(N) ->
erlang:display({start, now()}),
P1 = sieve1(lists:seq(3, N), math:sqrt(N)),
erlang:display({p1, now()}),
P2 = sieve2(lists:seq(3, N), math:sqrt(N)),

erlang:display({p2, now()}),
P3 = sieve2(lists:seq(3, N), math:sqrt(N)),
erlang:display({p3, now()}).


Richard O'Keefe

unread,
Feb 8, 2009, 7:55:46 PM2/8/09
to Mark Wagner, erlang-q...@erlang.org

On 6 Feb 2009, at 3:50 pm, Mark Wagner wrote:
> The problem with this is that the list of prime numbers is ordered
> largest-first, so that the small primes (the ones most likely to be
> factors of a given number) are tested last. This makes the
> sophisticated solution slower than the naive one, which tests small
> factors first. Any suggestions on how to fix this?

Brute force. You test O(N) numbers, but find only O(log N) primes.
Using Prime_List ++ [New_Prime] will thus add O((log N)**2)) time
to your search, which must already be taking O(N) at least.

This is one of the few cases where List ++ [Element] makes sense.
>


Alpár Jüttner

unread,
Feb 9, 2009, 1:49:40 AM2/9/09
to Richard O'Keefe, erlang-q...@erlang.org
On Mon, 2009-02-09 at 13:55 +1300, Richard O'Keefe wrote:
> On 6 Feb 2009, at 3:50 pm, Mark Wagner wrote:
> > The problem with this is that the list of prime numbers is ordered
> > largest-first, so that the small primes (the ones most likely to be
> > factors of a given number) are tested last. This makes the
> > sophisticated solution slower than the naive one, which tests small
> > factors first. Any suggestions on how to fix this?
>
> Brute force. You test O(N) numbers, but find only O(log N) primes.

There are much more primes than that.
(According to the "Prime number theorem", there are around N/lnN primes
less then N.)

Regards,
Alpar

Richard O'Keefe

unread,
Feb 9, 2009, 7:43:34 PM2/9/09
to Alpár Jüttner, erlang-q...@erlang.org

On 9 Feb 2009, at 7:49 pm, Alpár Jüttner wrote:
>> Brute force. You test O(N) numbers, but find only O(log N) primes.
>
> There are much more primes than that.

Yes, typo. I meant O(N/log N), and it's precisely the Prime
number theorem I was referring to. The list of numbers to
be checked against only needs to grow to O(sqrt(N)).

In any case, the bulk of the checks are going to be against the
first few primes. If you step through numbers in, say, blocks
of 30 (= 2*3*5), there are only 8 candidates in each block (the
others all being a multiple of 2, 3, or 5).


Reply all
Reply to author
Forward
0 new messages