Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

How fast is your favorite Scheme?

45 views
Skip to first unread message

W. James

unread,
Jan 6, 2010, 8:33:41 PM1/6/10
to
Bigloo runs this in 2.8 seconds. Maybe there's another Scheme that
can better that.

(module prime
(option (set! *genericity* #f))
(main main))

(define *size* 59888777)

(define (prime-sieve N)
(let* [(max-index::int (quotient (- N 3) 2))
(v (make-vector (+ 1 max-index) #t))]
(let loop [(i 0) (primes '(2))]
(cond
((> i max-index) (reverse primes))
((vector-ref v i)
(let ((prime::int (+ i i 3)))
(do [(j (+ 3 (* 3 i)) (+ j prime))]
[(> j max-index)]
(vector-set! v j #f))
(loop (+ 1 i) (cons prime primes))))
(else
(loop (+ 1 i) primes))))))

(define (main args)
(print (length (prime-sieve *size*))))

--

Aaron W. Hsu

unread,
Jan 6, 2010, 9:24:34 PM1/6/10
to

That code is quite Bigloo specific. I would like to see how Bigloo does
withou tthe annotations. On the other hand, you know I can't resist an
useless benchmark, don't you? On Chez Scheme, in R6RS Top-level program
mode. I use fxvectors, which are non-standard, but everything else is
standard.

#! /usr/bin/env scheme-script
(import (chezscheme))

(define *size* 59888777)

(define (prime-sieve/fast N)
(let* [(max-index (fxquotient (fx- N 3) 2))
(v (make-fxvector (fx+ 1 max-index) 0))]


(let loop [(i 0) (primes '(2))]
(cond

((fx>? i max-index) (reverse primes))
((fxzero? (fxvector-ref v i))
(let ((prime (fx+ i i 3)))
(do [(j (fx+ 3 (fx* 3 i)) (fx+ j prime))]
[(fx>? j max-index)]
(fxvector-set! v j 1))
(loop (fx+ 1 i) (cons prime primes))))
(else
(loop (fx+ 1 i) primes))))))

(printf "~d\n" (length (prime-sieve/fast *size*)))

$ time ./test.so
3555931

real 0m3.44s
user 0m2.97s
sys 0m0.46s

What processor did you use. ;-)

Aaron W. Hsu

--
A professor is one who talks in someone else's sleep.

William D Clinger

unread,
Jan 6, 2010, 10:08:13 PM1/6/10
to
Aaron W. Hsu wrote:
> That code is quite Bigloo specific. I would like to see how Bigloo does
> withou tthe annotations. On the other hand, you know I can't resist an
> useless benchmark, don't you? On Chez Scheme, in R6RS Top-level program
> mode. I use fxvectors, which are non-standard, but everything else is
> standard.

Nope. Scheme scripts aren't part of R6RS. (They were
described only in an unratified non-binding appendix.)

printf and fxquotient aren't R6RS either, and R6RS fx+ is
restricted to two arguments; all conforming implementations
of the R6RS are required to raise an exception at run time
when the fx+ of (rnrs arithmetic fixnums (6)) is applied to
three arguments. And you might as well use bytevectors
instead of Chez-specific fxvectors. Here's a true R6RS
version of the benchmark:

(import (rnrs))

(define *size* 59888777)

(define (prime-sieve/fast N)
(let* [(max-index (fxdiv (fx- N 3) 2))
(v (make-bytevector (fx+ 1 max-index) 0))]


(let loop [(i 0) (primes '(2))]
(cond
((fx>? i max-index) (reverse primes))

((fxzero? (bytevector-u8-ref v i))
(let ((prime (fx+ (fx+ i i) 3)))


(do [(j (fx+ 3 (fx* 3 i)) (fx+ j prime))]
[(fx>? j max-index)]

(bytevector-u8-set! v j 1))


(loop (fx+ 1 i) (cons prime primes))))
(else
(loop (fx+ 1 i) primes))))))

(write (length (prime-sieve/fast *size*)))
(newline)

Aaron W. Hsu

unread,
Jan 7, 2010, 12:10:09 AM1/7/10
to
On Wed, 06 Jan 2010 22:08:13 -0500, William D Clinger <cesu...@yahoo.com>
wrote:

> Aaron W. Hsu wrote:
>> That code is quite Bigloo specific. I would like to see how Bigloo does
>> withou tthe annotations. On the other hand, you know I can't resist an
>> useless benchmark, don't you? On Chez Scheme, in R6RS Top-level program
>> mode. I use fxvectors, which are non-standard, but everything else is
>> standard.
>
> Nope. Scheme scripts aren't part of R6RS. (They were
> described only in an unratified non-binding appendix.)
>
> printf and fxquotient aren't R6RS either, and R6RS fx+ is
> restricted to two arguments; all conforming implementations
> of the R6RS are required to raise an exception at run time
> when the fx+ of (rnrs arithmetic fixnums (6)) is applied to
> three arguments. And you might as well use bytevectors
> instead of Chez-specific fxvectors. Here's a true R6RS
> version of the benchmark:

Whoops! Yes, I've overlooked some things. And interestingly enough...

> (import (rnrs))
>
> (define *size* 59888777)
>
> (define (prime-sieve/fast N)
> (let* [(max-index (fxdiv (fx- N 3) 2))
> (v (make-bytevector (fx+ 1 max-index) 0))]
> (let loop [(i 0) (primes '(2))]
> (cond
> ((fx>? i max-index) (reverse primes))
> ((fxzero? (bytevector-u8-ref v i))
> (let ((prime (fx+ (fx+ i i) 3)))
> (do [(j (fx+ 3 (fx* 3 i)) (fx+ j prime))]
> [(fx>? j max-index)]
> (bytevector-u8-set! v j 1))
> (loop (fx+ 1 i) (cons prime primes))))
> (else
> (loop (fx+ 1 i) primes))))))
>
> (write (length (prime-sieve/fast *size*)))
> (newline)

The above program, runs faster than my solution:

time petite -q test.so
3555931

real 0m2.06s
user 0m1.95s
sys 0m0.10s

There we go.

w_a_x_man

unread,
Jan 7, 2010, 3:05:54 AM1/7/10
to
On Jan 6, 8:24 pm, "Aaron W. Hsu" <arcf...@sacrideo.us> wrote:

> What processor did you use. ;-)

2.4 GHz Core2 Duo.

w_a_x_man

unread,
Jan 7, 2010, 3:08:07 AM1/7/10
to
On Jan 6, 11:10 pm, "Aaron W. Hsu" <arcf...@sacrideo.us> wrote:

> The above program, runs faster than my solution:
>
> time petite -q test.so
> 3555931
>
> real    0m2.06s
> user    0m1.95s
> sys     0m0.10s

So this is with Chez Petite Scheme, which doesn't even compile
to machine-lanuage? Impressive.

What processor?

w_a_x_man

unread,
Jan 7, 2010, 3:11:54 AM1/7/10
to

Using u8vectors (SRFI-4):

(module prime
(option (set! *genericity* #f))
(main main))

(define *size* 59888777)

(define (prime-sieve N)
(let* [(max-index::int (quotient (- N 3) 2))

(v (make-u8vector (+ 1 max-index) 0))]


(let loop [(i 0) (primes '(2))]
(cond

((> i max-index) (reverse primes))

((zerofx? (u8vector-ref v i))


(let ((prime::int (+ i i 3)))
(do [(j (+ 3 (* 3 i)) (+ j prime))]
[(> j max-index)]

(u8vector-set! v j 1))
(loop (+ 1 i) (cons prime primes))))
(else


(loop (+ 1 i) primes))))))

(define (main args)
(display (length (prime-sieve *size*)))
(newline))

The time is reduced to
2.265 seconds

Eduardo Cavazos

unread,
Jan 7, 2010, 3:22:02 AM1/7/10
to

Well, notice that he's running 'petite' with the argument 'test.so';
he probably created 'test.so' with Chez Scheme (which has the native
code compiler).

Ed

Unknown

unread,
Jan 7, 2010, 2:22:24 PM1/7/10
to
Eduardo Cavazos wrote:

You must be right. However, it's still impressive. I didn't think that
Chez could beat Bigloo.

Sam TH

unread,
Jan 7, 2010, 5:28:56 PM1/7/10
to

You can't conclude that anyone beat anyone else here - you'd have to
run the programs on the same computer to determine anything useful at
all.

sam th

w_a_x_man

unread,
Jan 7, 2010, 7:08:57 PM1/7/10
to

No, you wouldn't. You merely would have to run the programs
on equivalent computers. Since my laptop is almost new and it
cost a pretty penny, I'm optimistically assuming that it is at least
as fast as Aaron's. We'll have to wait till he gives us
the specs of his processor.

leppie

unread,
Jan 8, 2010, 6:28:59 AM1/8/10
to

I did a test on my work PC. Intel E5200.
Petite - 16 seconds
IronScheme - 48 seconds

As you can see, IronScheme is very patient :)

Aaron W. Hsu

unread,
Jan 8, 2010, 7:49:17 PM1/8/10
to
On Thu, 07 Jan 2010 03:08:07 -0500, w_a_x_man <w_a_...@yahoo.com> wrote:

> So this is with Chez Petite Scheme, which doesn't even compile
> to machine-lanuage? Impressive.

As others have pointed out, this is using petite as the runtime
environment (one of the primary applications of Petite Chez), with the
code compiled from Chez Scheme 7.9.4 x86_64 Linux machine type using
optimize-level set to 3 and without inspector information in the binary.

Aaron W. Hsu

Aaron W. Hsu

unread,
Jan 8, 2010, 7:51:29 PM1/8/10
to
On Thu, 07 Jan 2010 17:28:56 -0500, Sam TH <sam...@gmail.com> wrote:

> You can't conclude that anyone beat anyone else here - you'd have to
> run the programs on the same computer to determine anything useful at
> all.

In fact, even were these benchmarks to be run on the same machine, you can
conclude almost nothing from those benchmarks, because the different
machines are different in so many ways now that these sorts of micro
benchmarks are fun toys, at best. I'll admit that I like running them, but
not because I think they have any real merit. Additionally, it's important
to reiterate in case anyone thinks to choose one Scheme over another based
on these, that real applications are almost never so easily defined or
classified in these terms, and it is important to investigate
performance-required applications based on the application itself, and not
with these sorts of micro benchmarks.

Aaron W. Hsu

Aaron W. Hsu

unread,
Jan 8, 2010, 7:54:13 PM1/8/10
to
On Thu, 07 Jan 2010 19:08:57 -0500, w_a_x_man <w_a_...@yahoo.com> wrote:

> No, you wouldn't. You merely would have to run the programs
> on equivalent computers. Since my laptop is almost new and it
> cost a pretty penny, I'm optimistically assuming that it is at least
> as fast as Aaron's. We'll have to wait till he gives us
> the specs of his processor.

Unfortunately, even similar machines can result in a lot of differences.
In fact, even similar hardware can result in very different code based on
the Operating system and features that you have enabled. For example, my
benchmark results are based on a Slackware64 13.0 Linux machine with the
Kernel version at 2.6.29.6 and with SMP enabled, along with 4 GB of
memory. I am using the ta6le machine type for Chez, which means that I am
using the threaded, x86_64, linux binary of Chez Scheme, and this itself
is important. The 32-bit Chez Scheme actually runs faster in many cases
because of memory references on the x86_64 architecture than does the
64-bit version, though this depends on the application. Moreover, the
threaded version of Chez Scheme is slower than the unthreaded version
because of additional thread safety checks that occur. All this means that
even running on Chez Scheme, others can't guarantee that they are going to
get the same results unless they have the exact same Operating system and
machine type and hardware.

Aaron W. Hsu

Aaron W. Hsu

unread,
Jan 8, 2010, 7:54:56 PM1/8/10
to
On Fri, 08 Jan 2010 06:28:59 -0500, leppie <xacc...@gmail.com> wrote:

> As you can see, IronScheme is very patient

Well, we should count maturity into this matter, and of course, IronScheme
has different goals than Chez Scheme does. ;-)

Aaron W. Hsu

Aaron W. Hsu

unread,
Jan 8, 2010, 8:00:59 PM1/8/10
to

To give the hardware and other machine and system details, the results
from my version were done with the following:

Lenovo T500
Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz
4GB of RAM
I believe a 7200 RPM hard drive
Slackware64 13.0 with patches running Linux 2.6.29.6, ksh, and XFCE
Chez Scheme 7.9.4 Linux x86_64 threaded

As you can see, it may be reasonable to expect that Bigloo runs a little
slower on this benchmark since you have a slightly slower processor. Of
course, there are other things I would do to really get the most speed out
of Chez Scheme for these benchmarks, but I'm not inclined to worry that
much about it, since I don't do those tweaks in real life either (I always
uses theaded over unthreaded, for example, because often enough I use
networking, and I like threads for networked programs).

I think we can conclude though, that for a micro benchmark like this,
Bigloo and Chez both produce very good code. :-)

frankenstein

unread,
Jan 9, 2010, 6:14:07 PM1/9/10
to

You may speed up your Bigloo code by up to 30% if you use either of
the following codes:

==
(module prime
(main main))


(define *size* 59888777)


(define (prime-sieve2 N::bint)::pair
(let* [(max-index::bint (quotient (-fx N 3) 2))
(v::vector (make-vector (+fx 1 max-index) #t))]


(let loop [(i 0) (primes '(2))]
(cond

((>fx i max-index) (reverse primes))
((vector-ref v i)
(let ((prime::int (+fx (+fx i i) 3)))
(do [(j (+fx 3 (*fx 3 i)) (+fx j prime))]
[(>fx j max-index)]
(vector-set! v j #f))
(loop (+fx 1 i) (cons prime primes))))
(else
(loop (+fx 1 i) primes))))))

(define (main args)
(print (length (prime-sieve2 *size*))))
==

or

==
(module prime
(main main))


(define *size* 59888777)


(define (prime-sieve3 N::bint)::pair
(let* [(max-index::bint (quotient (-fx N 3) 2))
(v::u8vector (make-u8vector (+fx 1 max-index) 0))]


(let loop [(i 0) (primes '(2))]
(cond

((>fx i max-index) (reverse primes))
((<fx (u8vector-ref v i) 1)
(let ((prime::int (+fx (+fx i i) 3)))
(do [(j (+fx 3 (*fx 3 i)) (+fx j prime))]
[(>fx j max-index)]
(u8vector-set! v j 1))
(loop (+fx 1 i) (cons prime primes))))
(else
(loop (+fx 1 i) primes))))))


(define (main args)
(print (length (prime-sieve3 *size*))))
==

frankenstein

unread,
Jan 9, 2010, 6:19:15 PM1/9/10
to

sorry for my earlier post haven't seen yours where you already found
out how to speed up things.

w_a_x_man

unread,
Jan 10, 2010, 12:53:13 AM1/10/10
to
On Jan 9, 5:14 pm, frankenstein <klohmusc...@yahoo.de> wrote:

> You may speed up your Bigloo code by up to 30% if you use either of
> the following codes:
>

> or
>
> ==
> (module prime
>    (main main))
>
> (define *size* 59888777)
>
> (define (prime-sieve3 N::bint)::pair
>   (let* [(max-index::bint (quotient (-fx N 3) 2))
>          (v::u8vector (make-u8vector (+fx 1 max-index) 0))]
>     (let loop [(i 0) (primes '(2))]
>       (cond
>         ((>fx i max-index) (reverse primes))
>         ((<fx (u8vector-ref v i) 1)
>           (let ((prime::int (+fx (+fx i i)  3)))
>             (do [(j (+fx 3 (*fx 3 i)) (+fx j prime))]
>                 [(>fx j max-index)]
>                 (u8vector-set! v j 1))
>             (loop (+fx 1 i) (cons prime primes))))
>         (else
>           (loop (+fx 1 i) primes))))))
>
> (define (main args)
>    (print (length (prime-sieve3 *size*))))
> ==

This runs in about 1.71 seconds on my laptop, so this seems to be the
fastest one so far.

Apparently, when using Bigloo
(option (set! *genericity* #f))
doesn't completely eliminate the need for +fx, -fx, etc.

w_a_x_man

unread,
Jan 10, 2010, 12:56:29 AM1/10/10
to
On Jan 9, 5:14 pm, frankenstein <klohmusc...@yahoo.de> wrote:

> You may speed up your Bigloo code by up to 30% if you use either of
> the following codes:
>

> or
>
> ==
> (module prime
>    (main main))
>
> (define *size* 59888777)
>
> (define (prime-sieve3 N::bint)::pair
>   (let* [(max-index::bint (quotient (-fx N 3) 2))
>          (v::u8vector (make-u8vector (+fx 1 max-index) 0))]
>     (let loop [(i 0) (primes '(2))]
>       (cond
>         ((>fx i max-index) (reverse primes))
>         ((<fx (u8vector-ref v i) 1)
>           (let ((prime::int (+fx (+fx i i)  3)))
>             (do [(j (+fx 3 (*fx 3 i)) (+fx j prime))]
>                 [(>fx j max-index)]
>                 (u8vector-set! v j 1))
>             (loop (+fx 1 i) (cons prime primes))))
>         (else
>           (loop (+fx 1 i) primes))))))
>
> (define (main args)
>    (print (length (prime-sieve3 *size*))))
> ==

This runs in about 1.71 seconds on my laptop, so this seems to be the

w_a_x_man

unread,
Jan 10, 2010, 5:29:39 AM1/10/10
to
On Jan 9, 5:14 pm, frankenstein <klohmusc...@yahoo.de> wrote:

> (define (prime-sieve3 N::bint)::pair
>   (let* [(max-index::bint (quotient (-fx N 3) 2))

What's the difference between ::bint and ::int ?

frankenstein

unread,
Jan 10, 2010, 5:38:47 AM1/10/10
to

How do you do


Honestly I have never figured this out: bint vs int. I think int is
related to the external C int type (e.g. when you write an external
binding to a C library) whereas bint is the Bigloo native integer.
Sure ints in Bigloo are C ints (except for long int which needs a
special conversion from 31 to 32 bits).

However, either way bint and int work.

As you rigthly observed: setting genericity to false or true is not
the best method to achieve speed; it is always better to use *fx and
types explicitely.


w_a_x_man

unread,
Jan 10, 2010, 6:00:53 AM1/10/10
to
On Jan 10, 4:38 am, frankenstein <klohmusc...@yahoo.de> wrote:

> As you rigthly observed: setting genericity to false or true is not
> the best method to achieve speed; it is always better to use *fx and
> types explicitely.

I think I figured out what was keeping *genericity* from working
perfectly, at least in this case. Since "+fx", unlike "+",
accepts only 2 arguments, when you say (+ i i 3), the "+" isn't
converted to "+fx". So I changed it to (+ (+ i i) 3). The code
below seems to be as fast as yours.

;; Bigloo Scheme.
(module prime
;; Arithmetic operators are for integers only.


(option (set! *genericity* #f))
(main main))

(define *size* 59888777)

(define (prime-sieve N)
(let* [(max-index::int (quotient (- N 3) 2))

(v (make-u8vector (+ 1 max-index) 0))]


(let loop [(i 0) (primes '(2))]
(cond

((> i max-index) (reverse primes))
((zerofx? (u8vector-ref v i))

(let ((prime::int (+ (+ i i) 3)))


(do [(j (+ 3 (* 3 i)) (+ j prime))]
[(> j max-index)]

(u8vector-set! v j 1))


(loop (+ 1 i) (cons prime primes))))
(else
(loop (+ 1 i) primes))))))

(define (main args)

frankenstein

unread,
Jan 10, 2010, 7:14:34 AM1/10/10
to

Good afternoon Ladies and Gentleman

I think a rule thumb could be: if you wanna Bigloo to perform: use
*fx, ... and types throughout your code.

I would imagine that using genericity will work but could make it much
harder to get it right, speed wise, in the first place. Once you start
using types (e.g. v::vector) and operators *fx, *fl, etc. you will
notice that the Bigloo compiler can turn against you and will become
very picky.

However, using types in Bigloo is quite straightforward and
predictable and the learning curve is very shallow. This is to the
contrary what you might expect from Common Lisp: speeding up Lisp code
is a random process.

Basic Bigloo is okay but is as slow as all the other shitty Sheme and
Common Lisp implementations out there. I wouldn't use Bigloo if it
hadn't got the facility to use explicit types, though, very rough and
C basic, but still. Programming in pure Scheme without types, which
help document effectively your own code is a nightmare (I have a
colleague who uses Python in numerical simulations and the typical
Python code where each variable could have any type not becoming
obvious whilst reading the code is simply a piece of crap).

Ray

unread,
Jan 14, 2010, 1:58:18 PM1/14/10
to
Aaron W. Hsu wrote:

> On Thu, 07 Jan 2010 17:28:56 -0500, Sam TH <sam...@gmail.com> wrote:
>
>> You can't conclude that anyone beat anyone else here - you'd have to
>> run the programs on the same computer to determine anything useful at
>> all.
>
> In fact, even were these benchmarks to be run on the same machine, you can
> conclude almost nothing from those benchmarks, because the different
> machines are different in so many ways now that these sorts of micro
> benchmarks are fun toys, at best.


Developing good benchmarks is hard. And as computers have gotten
more complex (relationships between CPU, Cache, and Memory,
runtime libraries, environment loading times, protocol stacks,
OS scheduling quirks and caching algorithms, and competing
processes) it's gotten harder.

The Language Benchmarking Game (which used to be the Language
shootout) has collected no end of grief about this. It seems that
they can do *NOTHING* with benchmarks of *ANY* kind that satisfies
the language partisans that the test was or is fair.

Still, I think that benchmarks are important; for all that any
single instance may be flawed for any number of reasons, collectively
they reveal something important about the languages and their
implementations.

Does anybody know of any good methodology for developing relevant,
useful benchmarks?

Bear

Aaron W. Hsu

unread,
Jan 14, 2010, 6:35:41 PM1/14/10
to
On Thu, 14 Jan 2010 13:58:18 -0500, Ray <be...@sonic.net> wrote:

> Does anybody know of any good methodology for developing relevant,
> useful benchmarks?

It depends on what you want to test. There are techniques for creating
microbenchmarks to test a few specific things in particular within the
language, which is useful when you, for example, want to improve the
performance of your FFI handling. For creating good general purpose
benchmarks, the wisdom I have always heard is, "test your application on
multiple implementations." This works well even if you don't care about
other implementations. Chez Scheme, for example, often uses itself as its
own benchmark, and the developer(s) usually don't add optimizations that
result in a net loss.

jra/pdx

unread,
Jan 15, 2010, 3:10:41 PM1/15/10
to

As I often say, "the problem with measurement is knowing exactly what
we're measuring". Put it another way, "No matter how clear or
distinct things appear to be, observed closely enough, everything is
fuzzy".

Experience suggests we can apply such ideas to virtually any domain of
human endeavor. No doubt "benchmarking" included. Likely enough,
means that there really is no "answer" to questions of "how to
benchmark" any particular system. Too many variations/variables
influence judgment of "best" or even "adequate" testing to have
"rules" about what to do.

Maybe the way we have to look at it is that benchmarking is an art
form, a creative exercise that does not readily generalize across
instances. Thus one programmer's benchmark is another's anathema.

As others have said above, and said in many ways over the years, tests
have to be individualized to situations, and are not really comparable
across diverse circumstances.

Alas, we come the core of the problem: are we able to tolerate the
uncertainty implicit in everything we do? Indeed, implicit in the
reality of life itself.

Take care,
Jules.

0 new messages