References about multiple values in Scheme

Michele Simionato

unread,

Feb 12, 2009, 12:41:53 AM2/12/09

to

In a future episode of my Adventures (http://www.artima.com/weblogs/
index.jsp?blogger=micheles)
I am going to talk about multiple values. I am very much against them
and I will certainly write any kind of bad things against them, but
for sake of fairness I would like to add a few references about their
introduction in the Scheme standard, the corresponding discussion, the
arguments pro and against, etc. However I am totally ignorant of the
relevant references, since multiple values were introduced so long
ago. A quick search on c.l.s. finds a few posts (for instance this one
http://groups.google.com/group/comp.lang.scheme/msg/7335da47820deff4?hl=en)
but I would like some more "official" reference than a newsgroup
post.
Thanks for your help,

Michele Simionato

Grant Rettke

unread,

Feb 12, 2009, 12:47:13 AM2/12/09

to

On Feb 11, 11:41 pm, Michele Simionato <michele.simion...@gmail.com>
wrote:

> I am going to talk about multiple values. I am very much against them

Why?

Michele Simionato

unread,

Feb 12, 2009, 1:24:50 AM2/12/09

to

For the usual reasons, look for instance at the post by Jeffrey
Siskind I referenced: http://groups.google.com/group/comp.lang.scheme/msg/7335da47820deff4?hl=en

Notice that I do not want to beat a dead horse here, I am just asking
for references.

Michael Sperber

unread,

Feb 12, 2009, 2:13:41 AM2/12/09

to

Michele Simionato <michele....@gmail.com> writes:

> I would like to add a few references about their
> introduction in the Scheme standard, the corresponding discussion, the
> arguments pro and against, etc. However I am totally ignorant of the
> relevant references, since multiple values were introduced so long
> ago.

http://mumble.net/~jar/pubs/scheme-of-things/june-92-meeting.ps

(I haven't been able to find a publically available electronic version
of the original description referenced in that paper.)

--
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla

Aaron W. Hsu

unread,

Feb 12, 2009, 8:54:54 AM2/12/09

to

Michele Simionato <michele....@gmail.com> writes:

>for sake of fairness I would like to add a few references about their
>introduction in the Scheme standard, the corresponding discussion, the
>arguments pro and against, etc. However I am totally ignorant of the
>relevant references, since multiple values were introduced so long
>ago.

Have you seen Ashley and Dybvig paper on their implementation?

<http://www.cs.indiana.edu/~dyb/pubs/mrvs-abstract.html>

The above includes a link to the full text.

--
Aaron W. Hsu <arc...@sacrideo.us> | <http://www.sacrideo.us>
"Government is the great fiction, through which everybody endeavors to
live at the expense of everybody else." -- Frederic Bastiat
+++++++++++++++ ((lambda (x) (x x)) (lambda (x) (x x))) ++++++++++++++

leppie

unread,

Feb 12, 2009, 10:22:54 AM2/12/09

to

On Feb 12, 7:41 am, Michele Simionato <michele.simion...@gmail.com>
wrote:

> In a future episode of my Adventures (http://www.artima.com/weblogs/
> index.jsp?blogger=micheles)
> I am going to talk about multiple values. I am very much against them
> and I will certainly write any kind of bad things against them, but
> for sake of fairness I would like to add a few references about their
> introduction in the Scheme standard, the corresponding discussion, the
> arguments pro and against, etc. However I am totally ignorant of the
> relevant references, since multiple values were introduced so long

> ago. A quick search on c.l.s. finds a few posts (for instance this onehttp://groups.google.com/group/comp.lang.scheme/msg/7335da47820deff4?...)

> but I would like some more "official" reference than a newsgroup
> post.
> Thanks for your help,
>
> Michele Simionato

Interesting link you you posted.

I feel however, that if you think about multiple values as a
container, you will always have trouble using it. Implementing it as a
container will also cause issues (or else you would have to a very
inefficient call-with-values, if you allow the 'values' container to
be passed around).

Personally, I feel they have a place and time, but I only really use
it if needed (when 1 operation would suffice for 2 results, instead of
doing it twice for each required result).

Cheers

leppie

Michele Simionato

unread,

Feb 12, 2009, 11:45:45 PM2/12/09

to

On Feb 12, 4:22 pm, leppie <xacc....@gmail.com> wrote:
> I feel however, that if you think about multiple values as a
> container, you will always have trouble using it. Implementing it as a
> container will also cause issues (or else you would have to a very
> inefficient call-with-values, if you allow the 'values' container to
> be passed around).
>
> Personally, I feel they have a place and time, but I only really use
> it if needed (when 1 operation would suffice for 2 results, instead of
> doing it twice for each required result).

I wonder how big is the performance difference between returning a
list or returning values. In practice, I am interested only in the
case of returning two or three values: if more, I would use a record
or some smarter data structure.
This experiment in Ikarus (with the obvious definition of repeat)

(define A (random 10))
(define B (random 10))
(define C (random 10))

(define (get-values)
(values A B C))

(define (get-list)
(list A B C))

(time
(repeat 10000000
(let-values (((a b c) (get-values)))
#f)))

(time
(repeat 10000000
(let* ((ls (get-list)) (a (car ls)) (b (cadr ls)) (c (caddr ls)))
#f)))

gives me the following (on my MacBook):

running stats for (repeat 10000000 (let-values (((a b c) (get-
values))) #f)):
no collections
288 ms elapsed cpu time, including 0 ms collecting
289 ms elapsed real time, including 0 ms collecting
0 bytes allocated
running stats for (repeat 10000000 (let* ((ls (get-list)) (a (car ls))
(b (cadr ls)) (c (caddr ls))) #f)):
58 collections
200 ms elapsed cpu time, including 52 ms collecting
201 ms elapsed real time, including 45 ms collecting
240008192 bytes allocated

i.e. returning a list of three elements is 44% faster than returning
three values (and even better for two elements). This is the opposite
of what I expected. What about other implementations? Care to perform
some benchmark for me?

leppie

unread,

Feb 13, 2009, 7:01:12 AM2/13/09

to

On 13 Feb, 06:45, Michele Simionato <michele.simion...@gmail.com>
wrote:

> i.e. returning a list of three elements is 44% faster than returning
> three values (and even better for two elements). This is the opposite
> of what I expected. What about other implementations? Care to perform
> some benchmark for me?

I know Ypsilon uses a container for multiple values, would be nice to
see how it compares to Ikarus, in terms of speed and memory.

It should be noted that multiple values is a lot more memory
friendly :) (which IMO is a more important factor in such a micro
benchmark)

Cheers

leppie

PS: I'll try run some tests on my side tonite too.

Aaron W. Hsu

unread,

Feb 13, 2009, 11:03:01 AM2/13/09

to

Michele Simionato <michele....@gmail.com> writes:

>I wonder how big is the performance difference between returning a
>list or returning values. In practice, I am interested only in the
>case of returning two or three values: if more, I would use a record
>or some smarter data structure.
>This experiment in Ikarus (with the obvious definition of repeat)

I think these tests are faulty, but I'll have to spend more time on them
to know why.

These are my results using the test you gave above with slightly larger
repeat values, and running the tests in order LIST, VALUES, VALUES,
LIST.

Values
(time (do ((...)) ...))
no collections
400 ms elapsed cpu time
393 ms elapsed real time
0 bytes allocated
List
(time (do ((...)) ...))
no collections
390 ms elapsed cpu time
393 ms elapsed real time
0 bytes allocated
List
(time (do ((...)) ...))
no collections
390 ms elapsed cpu time
392 ms elapsed real time
0 bytes allocated
Values
(time (do ((...)) ...))
no collections
490 ms elapsed cpu time
496 ms elapsed real time
0 bytes allocated

Using the following code:

(eval-when (compile)
(generate-inspector-information #f)
(optimize-level 3))

(let ()

(define A (random 10))
(define B (random 10))
(define C (random 10))

(define (get-values)
(values A B C))

(define (get-list)
(list A B C))

(printf "Values~n")
(time
(do ((i 0 (fx1+ i))) ((fx= i 100000000))

(let-values (((a b c) (get-values)))
#f)))

(printf "List~n")
(time
(do ((i 0 (fx1+ i))) ((fx= i 100000000))

(let* ((ls (get-list)) (a (car ls)) (b (cadr ls)) (c (caddr ls)))
#f)))

(printf "List~n")
(time
(do ((i 0 (fx1+ i))) ((fx= i 100000000))

(let* ((ls (get-list)) (a (car ls)) (b (cadr ls)) (c (caddr ls)))
#f)))

(printf "Values~n")
(time
(do ((i 0 (fx1+ i))) ((fx= i 100000000))

(let-values (((a b c) (get-values)))
#f)))

)

Matthias Blume

unread,

Feb 13, 2009, 11:04:39 AM2/13/09

to

leppie <xacc...@gmail.com> writes:

> On 13 Feb, 06:45, Michele Simionato <michele.simion...@gmail.com>
> wrote:
>
>> i.e. returning a list of three elements is 44% faster than returning
>> three values (and even better for two elements). This is the opposite
>> of what I expected. What about other implementations? Care to perform
>> some benchmark for me?
>
> I know Ypsilon uses a container for multiple values, would be nice to
> see how it compares to Ikarus, in terms of speed and memory.
>
> It should be noted that multiple values is a lot more memory
> friendly :) (which IMO is a more important factor in such a micro
> benchmark)

Why?

I say it again and again: multiple values is where Scheme turned its
back on its original principles. There is no reason for having this
wart in the language except... well, except for what?

Matthias

Jens Axel Soegaard

unread,

Feb 13, 2009, 11:46:35 AM2/13/09

to

Michele Simionato skrev:

Have you looked at the rrrs mailing list? (I can't remember whether
multiple values are mentioned, but I'd be surprised if they weren't).

A few other relevant comp.lang.scheme threads:

http://groups.google.com/group/comp.lang.scheme/browse_thread/thread/2942207879dd9949?q=comp.lang.scheme+multiple+values+blume
http://groups.google.com/group/comp.lang.scheme/browse_thread/thread/f6d71bda574b7680?q=comp.lang.scheme+multiple+values+blume
http://groups.google.com/group/comp.lang.scheme/browse_thread/thread/3078b33a2dc69449?q=comp.lang.scheme+multiple+values+blume

--
Jens Axel Søgaard

Michele Simionato

unread,

Feb 13, 2009, 11:50:39 AM2/13/09

to

On Feb 13, 5:03 pm, Aaron W. Hsu <arcf...@sacrideo.us> wrote:

Basically you get the same speed for values and list (400ms ~ 390ms).
The last execution of
values takes 490ms instead of 400ms, but perhaps it is an accidental
fluctuation since
the code looks the same. It is strange the you get "0 bytes allocated"
both for list and values,
perhaps there is some optimization of the compiler there. I guess this
is Chez Scheme, right?
I tried ypsilon and I get the same times both for list and values.

leppie

unread,

Feb 13, 2009, 12:06:46 PM2/13/09

to

On 13 Feb, 18:03, Aaron W. Hsu <arcf...@sacrideo.us> wrote:

> Values
> (time (do ((...)) ...))
> no collections
> 490 ms elapsed cpu time
> 496 ms elapsed real time
> 0 bytes allocated

Holy MOSES, that compiler is fast! (I noted you added an extra 0 to
the repeat).

I will refrain from posting IronScheme's numbers to save
embarresment :)

Values is slightly slower (less than 5%), and has about 10% less GC's.

Cheers

leppie

Aaron W. Hsu

unread,

Feb 13, 2009, 12:28:23 PM2/13/09

to

Michele Simionato <michele....@gmail.com> writes:

>I wonder how big is the performance difference between returning a
>list or returning values. In practice, I am interested only in the
>case of returning two or three values: if more, I would use a record
>or some smarter data structure.
>This experiment in Ikarus (with the obvious definition of repeat)

I spent a bit of thinking, and realized that your test assumes that
containers are just broken apart containers. I think this is flawed.
The only case where it makes sense to use VALUES is when you are dealing
with things that really are separate and treated separately. This means
that in the parent code, you're going to have to destructor them
anyways. Your benchmark does no destructuring of the LIST form, so
unless the compiler is going to do a lot of clean up, the code is unfair
since they aren't really doing the same thing.

In fact, when I ran this code in Chez, this did happen, and the code was
very close. However, I don't consider it an accurate benchmark, so I'm
going to attempt to improve it here:

#! /usr/local/bin/petite --script

(eval-when (compile)
(generate-inspector-information #f)
(optimize-level 3))

(let ()

(define (get-values)
(values (random 5) (random 5) (random 5)))

(define (get-list)
(list (random 5) (random 5) (random 5)))

(define compute-with-values
(lambda ()
(let-values ([(a b c) (get-values)])
(fx+ (expt a 4) (expt b 2) c))))

(define compute-with-lists
(lambda ()
(let ([vals (get-list)])
(let ([a (car vals)]
[b (cadr vals)]
[c (caddr vals)])
(fx+ (expt a 4) (expt b 2) c)))))

(define run-test
(lambda (compute n)
(do ((i 0 (fx1+ i))
(res 0 (fx+ (compute) res)))
((fx= i n) (fx/ res n)))))

(define time-tests
(lambda (n)
(printf "V Result: ~d~n" (time (run-test compute-with-values n)))
(printf "V Result: ~d~n" (time (run-test compute-with-values n)))
(collect (collect-maximum-generation))
(printf "L Result: ~d~n" (time (run-test compute-with-lists n)))
(printf "L Result: ~d~n" (time (run-test compute-with-lists n)))))

(when (= 1 (length (command-line-arguments)))
(time-tests (string->number (car (command-line-arguments)))))

)

I have tried to make it a little more difficult for compilers to
optimize here, so that maybe it's a little more faithful to the general
use case. Notice that I destructor the list.

My timings:

$ ./values.so 100000000
(time (run-test compute-with-values ...))
no collections
21960 ms elapsed cpu time
21990 ms elapsed real time
0 bytes allocated
V Result: 3
(time (run-test compute-with-values ...))
no collections
21930 ms elapsed cpu time
21996 ms elapsed real time
0 bytes allocated
V Result: 3
(time (run-test compute-with-lists ...))
569 collections
24080 ms elapsed cpu time, including 210 ms collecting
24204 ms elapsed real time, including 207 ms collecting
2400080920 bytes allocated, including 2398801704 bytes reclaimed
L Result: 3
(time (run-test compute-with-lists ...))
569 collections
24840 ms elapsed cpu time, including 210 ms collecting
24848 ms elapsed real time, including 203 ms collecting
2400082896 bytes allocated, including 2398802792 bytes reclaimed
L Result: 3

Notice a few things. In this case, the VALUES version is marignally
faster, but not by a great deal. The real killer here is just how much
allocation is happening with the LIST version. The big benefit with
VALUES is that ZERO overhead is incurred in its use compared to using
containers which really don't serve a purpose (since we don't want the
container, but just whats inside as unique pieces). In that sense, the
VALUES version is much more efficient, and at least, no more
inefficient.

I'd be interested in improvements over this simple and probably still
incorrect benchmark.

Aaron W. Hsu

unread,

Feb 13, 2009, 12:31:39 PM2/13/09

to

Michele Simionato <michele....@gmail.com> writes:

>Basically you get the same speed for values and list (400ms ~ 390ms).
>The last execution of values takes 490ms instead of 400ms, but perhaps
>it is an accidental fluctuation since the code looks the same. It is
>strange the you get "0 bytes allocated" both for list and values,
>perhaps there is some optimization of the compiler there. I guess this
>is Chez Scheme, right? I tried ypsilon and I get the same times both
>for list and values.

Yes, the benchmark doesn't fluctuate the random values, which means that
we are dealing with the same values every time. Plus, there are plenty
of other areas for optimization which makes the benchmark kind of
useless for testing, since all the factors that need to be tested are
largely optimized away.

Michele Simionato

unread,

Feb 13, 2009, 1:30:15 PM2/13/09

to

On Feb 13, 6:28 pm, Aaron W. Hsu <arcf...@sacrideo.us> wrote:

> Michele Simionato <michele.simion...@gmail.com> writes:
> >I wonder how big is the performance difference between returning a
> >list or returning values. In practice, I am interested only in the
> >case of returning two or three values: if more, I would use a record
> >or some smarter data structure.
> >This experiment in Ikarus (with the obvious definition of repeat)
>
> I spent a bit of thinking, and realized that your test assumes that
> containers are just broken apart containers. I think this is flawed.
> The only case where it makes sense to use VALUES is when you are dealing
> with things that really are separate and treated separately. This means
> that in the parent code, you're going to have to destructor them
> anyways. Your benchmark does no destructuring of the LIST form, so
> unless the compiler is going to do a lot of clean up, the code is unfair
> since they aren't really doing the same thing.

Uh? My code is performing destructuring here

(let* ((ls (get-list)) (a (car ls)) (b (cadr ls)) (c (caddr ls)))

which is the same as your code

(let ([vals (get-list)])
(let ([a (car vals)]
[b (cadr vals)]
[c (caddr vals)])

I guess I do not understand your point.

> The real killer here is just how much
> allocation is happening with the LIST version. The big benefit with
> VALUES is that ZERO overhead is incurred in its use compared to using
> containers which really don't serve a purpose (since we don't want the
> container, but just whats inside as unique pieces). In that sense, the
> VALUES version is much more efficient, and at least, no more
> inefficient.

Of course the list version is performing a lot of allocations,
but the garbage collector seems to manage that pretty well, since
there is no performance penalty, so I am unconvinced by this argument.
I am pretty sure there are better arguments (since people decided to
introduce multiple values in the standard) this is why I asked for
references.

Aaron W. Hsu

unread,

Feb 13, 2009, 2:35:00 PM2/13/09

to

Matthias Blume <bl...@hana.uchicago.edu> writes:

>I say it again and again: multiple values is where Scheme turned its
>back on its original principles. There is no reason for having this
>wart in the language except... well, except for what?

This has been argued to death, and I don't think that people are going
to agree. I happen to think that multiple values let us express what we
want to express, improving the clarity of the code, even if they can be
misused. Just because we can do without, doesn't mean it doesn't belong.
Eliminating the need for containers that don't serve an useful purpose
in the code is a plus in my book.

Aaron W. Hsu

unread,

Feb 13, 2009, 3:36:20 PM2/13/09

to

Michele Simionato <michele....@gmail.com> writes:

>On Feb 13, 6:28=A0pm, Aaron W. Hsu <arcf...@sacrideo.us> wrote:
>> Michele Simionato <michele.simion...@gmail.com> writes:
>> >I wonder how big is the performance difference between returning a
>> >list or returning values. In practice, I am interested only in the
>> >case of returning two or three values: if more, I would use a record
>> >or some smarter data structure.
>> >This experiment in Ikarus (with the obvious definition of repeat)
>>
>> I spent a bit of thinking, and realized that your test assumes that

>> containers are just broken apart containers. =A0I think this is flawed.

>> The only case where it makes sense to use VALUES is when you are dealing
>> with things that really are separate and treated separately. This means
>> that in the parent code, you're going to have to destructor them

>> anyways. =A0Your benchmark does no destructuring of the LIST form, so

>> unless the compiler is going to do a lot of clean up, the code is unfair
>> since they aren't really doing the same thing.

>Uh? My code is performing destructuring here

>(let* ((ls (get-list)) (a (car ls)) (b (cadr ls)) (c (caddr ls)))

>which is the same as your code

> (let ([vals (get-list)])
> (let ([a (car vals)]
> [b (cadr vals)]
> [c (caddr vals)])

>I guess I do not understand your point.

Sorry, I missed that part. What happens is that your code gets
optimized away because the values bound in the lets are not used at all.
The destructuring that you do isn't used [though I'm sorry I missed that
you *did* do some destructuring].

>> =A0The real killer here is just how much
>> allocation is happening with the LIST version. =A0The big benefit with

>> VALUES is that ZERO overhead is incurred in its use compared to using
>> containers which really don't serve a purpose (since we don't want the
>> container, but just whats inside as unique pieces). In that sense, the
>> VALUES version is much more efficient, and at least, no more
>> inefficient.

>Of course the list version is performing a lot of allocations,
>but the garbage collector seems to manage that pretty well, since
>there is no performance penalty, so I am unconvinced by this argument.
>I am pretty sure there are better arguments (since people decided to
>introduce multiple values in the standard) this is why I asked for
>references.

The argument I'm most convinced by is that it's the right tool for the
"problem" it solves, regardless of effeciency. When I use multiple
values, I usually am not thinking about effeciency of the executed code,
but rather, whether the code makes more sense with multiple values in
it. I rely on a compiler for the speed, and rely on VALUES to make my
code clearer and less redundant.

Others will be convinced by the efficiency argument, because allocation
does make a difference in many programs where this level of efficiency
is important.

leppie

unread,

Feb 13, 2009, 3:52:39 PM2/13/09

to

On Feb 13, 8:30 pm, Michele Simionato <michele.simion...@gmail.com>
wrote:
>

> Uh? My code is performing destructuring here
>
> (let* ((ls (get-list)) (a (car ls)) (b (cadr ls)) (c (caddr ls)))
>

Simply CSE will (or should) eliminate the overhead.

The other thing to notice is compared to a list, multiple values have
much stricter semantics, in the sense it can only be used as a return
value of a function/procedure. This eliminates confusion, and
ambiguity.

Cheers

leppie

unread,

Feb 13, 2009, 3:57:05 PM2/13/09

to

On Feb 13, 10:52 pm, leppie <xacc....@gmail.com> wrote:

> This eliminates confusion, and ambiguity.

And this is a part that I miss in Scheme, like visibility modifiers in
most OO languages. R6RS's libraries (and other module systems, I
suppose) does solve the problem partially, but it still requires a
fair bit of planning/plumbing to get the desired effect.

Cheers

leppie

Matthias Blume

unread,

Feb 13, 2009, 5:07:04 PM2/13/09

to

Aaron W. Hsu <arc...@sacrideo.us> writes:

> Matthias Blume <bl...@hana.uchicago.edu> writes:
>
>>I say it again and again: multiple values is where Scheme turned its
>>back on its original principles. There is no reason for having this
>>wart in the language except... well, except for what?
>
> This has been argued to death, and I don't think that people are going
> to agree.

I know. And I am unlikely to ever agree with them on this issue.

> I happen to think that multiple values let us express what we
> want to express, improving the clarity of the code, even if they can be
> misused.

If the syntax is all you want, you can always declare it as a macro.
(To me, the syntax is one the most hideous aspects of the whole affair.
Returning a data structure is actually much clearer to me in expressing
what I want to express.)

> Just because we can do without, doesn't mean it doesn't belong.

Yes it does mean that! (Remember these words from the Scheme report
(paraphrasing): "...should be designed not by piling features on top of
features..."?)

> Eliminating the need for containers that don't serve an useful purpose
> in the code is a plus in my book.

These containers do serve a useful purpose: They remove clutter from the
language.

Matthias

Aaron W. Hsu

unread,

Feb 13, 2009, 8:18:20 PM2/13/09

to

leppie <xacc...@gmail.com> writes:

>On 13 Feb, 18:03, Aaron W. Hsu <arcf...@sacrideo.us> wrote:

>> Values
>> (time (do ((...)) ...))

>> =A0 =A0 no collections
>> =A0 =A0 490 ms elapsed cpu time
>> =A0 =A0 496 ms elapsed real time
>> =A0 =A0 0 bytes allocated

>Holy MOSES, that compiler is fast! (I noted you added an extra 0 to
>the repeat).

>I will refrain from posting IronScheme's numbers to save
>embarresment :)

>Values is slightly slower (less than 5%), and has about 10% less GC's.

Sorry, this was just . . . LOL.

William D Clinger

unread,

Feb 13, 2009, 11:03:54 PM2/13/09

to

Aaron Hsu wrote:
> I'd be interested in improvements over this simple and probably
> still incorrect benchmark.

The obvious problems with your benchmark are its
gratuitously non-portabilities. Even if we assume
the code is to be run in an implementation of the
R6RS, the following things are still non-portable:

#! /usr/local/bin/petite --script
(eval-when (compile)
(generate-inspector-information #f)
(optimize-level 3))

random

fx+ with more than two arguments

fx/ instead of fxdiv

fx= instead of fx=?

fx1+

printf

(collect (collect-maximum-generation))

(car (command-line-arguments)) instead of (cadr (command-line-
arguments))

its reliance on unsafe code to ignore the fixnum overflows
that are likely to occur on 32-bit implementations

That last one is a killer. Many (most?) implementations
of Scheme do not provide an unsafe mode that ignores
overflows. (The R6RS, for example, absolutely forbids
ignoring of overflows.) So your timings cannot fairly be
compared to those on most other systems.

There was no real need for any non-portable code here.
If your goal is to gather data for the widest possible
range of implementations, then you should write the
benchmark as portable R5RS code, with timing separated
from the benchmark itself.

Will

William D Clinger

unread,

Feb 14, 2009, 1:08:50 AM2/14/09

to

In addition to the portability problems, I should
have mentioned that, after making the many changes
to the benchmark that were necessary to make it
run in Larceny, the benchmark spent about 95% of
its time in the random and expt procedures. Those
procedures vary greatly in quality and performance
across implementations; furthermore, some compilers
may generate inline code for things like (expt a 4),
but most won't.

If the goal is to measure the performance of
multiple values, then the benchmark's timing
shouldn't depend so much on whether calls to expt
are inlined, and the benchmark shouldn't spend most
of its time in procedures that have nothing to do
with multiple values.

As written, the benchmark would make you think
it hardly matters whether you use multiple values
or lists. That's probably true for most programs,
just because few programs spend much of their time
returning multiple values, but it would be a mistake
to conclude from this benchmark that returning
multiple values is about as fast as returning a
list.

(I suspect that very few implementors of Scheme---
including me---have put in the work needed to make
multiple values faster than returning a list, but
that suspicion is based on my own benchmarking, not
on this benchmark. Besides, I might be wrong.)

Will

Michele Simionato

unread,

Feb 14, 2009, 2:22:27 AM2/14/09

to

On Feb 14, 7:08 am, William D Clinger <cesur...@yahoo.com> wrote:
> If the goal is to measure the performance of
> multiple values, then the benchmark's timing
> shouldn't depend so much on whether calls to expt
> are inlined, and the benchmark shouldn't spend most
> of its time in procedures that have nothing to do
> with multiple values.

Indeed, but I am ignorant of how to write a significant
benchmark valid for many implementations. Perhaps you can
suggest one.

> As written, the benchmark would make you think
> it hardly matters whether you use multiple values
> or lists. That's probably true for most programs,
> just because few programs spend much of their time
> returning multiple values, but it would be a mistake
> to conclude from this benchmark that returning
> multiple values is about as fast as returning a
> list.
>
> (I suspect that very few implementors of Scheme---
> including me---have put in the work needed to make
> multiple values faster than returning a list, but
> that suspicion is based on my own benchmarking, not
> on this benchmark. Besides, I might be wrong.)
>
> Will

You are effectively saying that in practice, for most
programs in most implementations, using lists or
returning multiple values does not make any difference,
performance-wise. I have no trouble in believing that.

Michele Simionato

William D Clinger

unread,

Feb 14, 2009, 11:12:13 AM2/14/09

to

Michele Simionato wrote:
> Indeed, but I am ignorant of how to write a significant
> benchmark valid for many implementations. Perhaps you can
> suggest one.

How about eight? See
http://www.ccs.neu.edu/home/will/Temp/Values/

A man with one watch knows what time it is; a man with
two watches is never quite sure. With benchmarks, it's
better to be not quite sure.

> You are effectively saying that in practice, for most
> programs in most implementations, using lists or
> returning multiple values does not make any difference,
> performance-wise.

Had you said "much" instead of "any", I'd agree with you.

Existing implementations of Scheme seem to fall into one
of four categories:

1. fast multiple values and fast generational garbage collection
2. fast multiple values and relatively slow garbage collection
3. slow multiple values and fast generational garbage collection
4. slow multiple values and slow garbage collection

For example, Chez Scheme is in category 1, PLT Scheme
is in category 2, I would have expected Larceny to be
in category 3 (but see below), and most implementations
are in category 4.

In most implementations (category 4), returning multiple
values will be slow no matter how you do it.

For category 3, I would have expected lists to be faster
than multiple values, but that wasn't true in Larceny.

For categories 1 and 2, returning multiple values will
be faster than returning a list.

In practice, however, it hardly matters because few real

programs spend much of their time returning multiple

values. Returning a single value is the common case,
and that's the case implementors should optimize.

Will

Sam TH

unread,

Feb 14, 2009, 11:45:02 AM2/14/09

to

On Feb 14, 11:12 am, William D Clinger <cesur...@yahoo.com> wrote:
> Michele Simionato wrote:
> > Indeed, but I am ignorant of how to write a significant
> > benchmark valid for many implementations. Perhaps you can
> > suggest one.
>
> How about eight? Seehttp://www.ccs.neu.edu/home/will/Temp/Values/

I'm curious - if you use the builtin PLT Scheme implementation of `let-
values', does it change the timings for the fourth benchmark?

Thanks,
sam th

Aaron W. Hsu

unread,

Feb 14, 2009, 12:08:14 PM2/14/09

to

William D Clinger <cesu...@yahoo.com> writes:

>If the goal is to measure the performance of multiple values, then the
>benchmark's timing shouldn't depend so much on whether calls to expt
>are inlined, and the benchmark shouldn't spend most of its time in
>procedures that have nothing to do with multiple values.

>As written, the benchmark would make you think it hardly matters
>whether you use multiple values or lists. That's probably true for
>most programs, just because few programs spend much of their time
>returning multiple values, but it would be a mistake to conclude from
>this benchmark that returning multiple values is about as fast as
>returning a list.

Okay, that's probably true. I was trying to get something that would
not be optimized away by the compiler. I would like to see a better

William D Clinger

unread,

Feb 14, 2009, 12:09:16 PM2/14/09

to

Sam TH wrote:
> I'm curious - if you use the builtin PLT Scheme implementation of `let-
> values', does it change the timings for the fourth benchmark?

No.

Will

Aaron W. Hsu

unread,

Feb 14, 2009, 12:09:29 PM2/14/09

to

William D Clinger <cesu...@yahoo.com> writes:

>Aaron Hsu wrote:
>> I'd be interested in improvements over this simple and probably
>> still incorrect benchmark.

>The obvious problems with your benchmark are its gratuitously
>non-portabilities. Even if we assume the code is to be run in an
>implementation of the R6RS, the following things are still non-portable:

Alright, I'll grant that, and since you've pointed out other issues with
this benchmarking attempt, I'll leave off trying to improve this, and
hope for a better version from somewhere.

Aaron W. Hsu

unread,

Feb 14, 2009, 3:32:32 PM2/14/09

to

William D Clinger <cesu...@yahoo.com> writes:

>http://www.ccs.neu.edu/home/will/Temp/Values/

>A man with one watch knows what time it is; a man with
>two watches is never quite sure. With benchmarks, it's
>better to be not quite sure.

I agree with you there. I decided to give these benchmarks a try, for
the fun of it:

$ ./values.so 100000000
Test 1 Values
(time (main1 n))
no collections
3290 ms elapsed cpu time
3293 ms elapsed real time
0 bytes allocated
Test 1 List(time (main1a n))
758 collections
12370 ms elapsed cpu time, including 890 ms collecting
12405 ms elapsed real time, including 973 ms collecting
3200108136 bytes allocated, including 3198675576 bytes reclaimed
Test 2 Values
(time (main2 n))
no collections
2780 ms elapsed cpu time
2782 ms elapsed real time
0 bytes allocated
Test 2 List(time (main2a n))
758 collections
10910 ms elapsed cpu time, including 1080 ms collecting
10935 ms elapsed real time, including 1090 ms collecting
3200113144 bytes allocated, including 3198710672 bytes reclaimed
Test 3 Values
(time (main3 n))
no collections
3840 ms elapsed cpu time
3832 ms elapsed real time
0 bytes allocated
Test 3 List(time (main3a n))
759 collections
11730 ms elapsed cpu time, including 1050 ms collecting
11726 ms elapsed real time, including 1069 ms collecting
3200115368 bytes allocated, including 3202933312 bytes reclaimed
Test 4 Values
(time (main4 n))
no collections
1390 ms elapsed cpu time
1392 ms elapsed real time
0 bytes allocated
Test 4 List(time (main4a n))
758 collections
7200 ms elapsed cpu time, including 1000 ms collecting
7208 ms elapsed real time, including 1072 ms collecting
3200115216 bytes allocated, including 3198712648 bytes reclaimed

Abdulaziz Ghuloum

unread,

Feb 14, 2009, 5:40:45 PM2/14/09

to

Michele Simionato wrote:

> I wonder how big is the performance difference between returning a
> list or returning values.

This thread so far has focused on two ways of returning multiple values
(one via values and call-with-values and one via an explicit data
structure) and ignored a third way (via CPS) which is IMHO very important.

Simply put, to return multiple values, pass an explicit continuation
(procedure) that receives the desired number of values.

> This experiment in Ikarus (with the obvious definition of repeat):

Add to your experiment:

(define (get-cont k)
(k A B C))

and

(time
(repeat 10000000
(get-cont (lambda (a b c) #f))))

> gives me the following (on my MacBook):

running stats for (repeat 10000000 (let-values (((a b c) (get-values)))
#f)):
no collections

286 ms elapsed cpu time, including 0 ms collecting

289 ms elapsed real time, including 0 ms collecting
0 bytes allocated
running stats for (repeat 10000000 (let* ((ls (get-list)) (a (car ls))
(b (cadr ls)) (c (caddr ls))) #f)):

57 collections
198 ms elapsed cpu time, including 48 ms collecting
199 ms elapsed real time, including 50 ms collecting
240008192 bytes allocated
running stats for (repeat 10000000 (get-cont (lambda (a b c) #f))):
no collections
95 ms elapsed cpu time, including 0 ms collecting
96 ms elapsed real time, including 0 ms collecting
0 bytes allocated

With ikarus -O2, the results for the last test is:

running stats for (repeat 10000000 (get-cont (lambda (a b c) #f))):
no collections
35 ms elapsed cpu time, including 0 ms collecting
35 ms elapsed real time, including 0 ms collecting
0 bytes allocated

That is exactly the same timings as you'd get for:

(time
(repeat 10000000 #f))

> i.e. returning a list of three elements is 44% faster than returning
> three values (and even better for two elements). This is the opposite
> of what I expected.

Me too! Even ignoring the -O2, I'm very surprised and cannot really
explain the reason for why returning multiple values is much worse than
the other two approaches in Ikarus. (I can hypothesize but cannot say
anything for sure.)

Anyways, I wouldn't focus that much on the performance difference here.
I have personally used all three styles at different times in Ikarus
depending on the task. I personally prefer to not use container data
structures for multiple values (just like how I don't usually pass
multiple arguments as a list), and would keep separate arguments/values
unpacked using either direct-style or continuation-passing-style
multiple value return. The choice of style is something that you
develop over time.

Aziz,,,

Michele Simionato

unread,

Feb 15, 2009, 12:25:36 AM2/15/09

to

On Feb 14, 11:40 pm, Abdulaziz Ghuloum <aghul...@cee.ess.indiana.edu>
wrote:

> This thread so far has focused on two ways of returning multiple values
> (one via values and call-with-values and one via an explicit data
> structure) and ignored a third way (via CPS) which is IMHO very important.
>
> Simply put, to return multiple values, pass an explicit continuation
> (procedure) that receives the desired number of values.
>

> (define (get-cont k)
> (k A B C))
>
> and
>
> (time
> (repeat 10000000
> (get-cont (lambda (a b c) #f))))

The CPS solution does not qualify: my gripe is against
values, call-with-values, let-values & friends which I see
as as an unworthy addition to the standard. OTOH,
the standard lacks a destructuring bind form which I would
have liked :-(

> I personally prefer to not use container data
> structures for multiple values (just like how I don't usually pass
> multiple arguments as a list), and would keep separate arguments/values
> unpacked using either direct-style or continuation-passing-style
> multiple value return. The choice of style is something that you
> develop over time.
>
> Aziz,,,

That's good advice for working with current Scheme. Still, I never was
really happy with current Scheme and in many ways I would prefer it to
be more like SML. For instance I like
single argument functions with single return value, lists
which are really immutable, lack of improper lists, SML records,
references, etc. About the issue of static typing vs dynamic typing I
have not made my mind up yet.

Michele Simionato