multiple return values

William D Clinger

unread,

May 26, 1998, 3:00:00 AM5/26/98

to

[Richard: Please add this to the list of known bugs in the R5RS. Mea
culpa.]

In another thread, be...@damek.kth.se wrote:

> I belive that Shriram Krishnamurthi has posted that multiple return values
> are a performance bottle neck, and adds complexity, for all the cases
> where they are not needed.

I don't know whether Shriram really said this, but it isn't true.
Although it is possible to implement multiple return values in a way
that makes all code run slower, the usual techniques for implementing
multiple return values have absolutely no impact on the efficiency of
code that does not use multiple return values. The usual reference
for this sort of thing is available at
ftp://ftp.cs.indiana.edu/pub/scheme-repository/doc/pubs/mrvs.ps.gz

Speaking of multiple return values, and getting to the real point of
this message, R5RS Section 6.4 says

Except for continuations created by the CALL-WITH-VALUES procedure,
all continuations take exactly one value.

Thus (BEGIN (VALUES 1 2 3) 4) is an error (or has unspecified semantics,
which amounts to the same thing). The formal semantics in Section 7.2.3
says that (BEGIN (VALUES 1 2 3) 4) evaluates to 4. I assume Section 6.4
is right, and Section 7.2.3 is wrong.

Will

Jeffrey Mark Siskind

unread,

May 26, 1998, 3:00:00 AM5/26/98

to

> > I belive that Shriram Krishnamurthi has posted that multiple return values
> > are a performance bottle neck, and adds complexity, for all the cases
> > where they are not needed.
>
> I don't know whether Shriram really said this, but it isn't true.
> Although it is possible to implement multiple return values in a way
> that makes all code run slower, the usual techniques for implementing
> multiple return values have absolutely no impact on the efficiency of
> code that does not use multiple return values.

It is meaningless to say that Shriram's proported statement is true or false
because, as stated (above), it doesn't specify whether multiple return values
are a performance bottleneck in SOME or ALL implementations. The statement is
likely true under the SOME interpretation and likely false under the ALL
interpretation. Will asserted that there is no performance penalty for the
`usual implementation techniques'. While that may be true, it is irrelevant.
The usual implementation techniques might not be the optimial implementation
techniques. And it may turn out to be the case that some implementation
technique may be discovered in the future that dominates the current usual
implementation technique and has the property that implementing multiple
return values incurs a performance penalty. I don't believe that the reference
cited by Will precludes such a possibility.

Language designers choose to exclude language features for potential but
as-yet unencountered performance bottlenecks all of the time. For example, the
designers of SML chose not to include callcc in the language spec even though
the dominant existant implementation SML/NJ included that feature at no cost.
Precisely because the designers felt that callcc might incur a cost with
future implementation techniques.

Jeff (http://www.neci.nj.nec.com/homepages/qobi)

Niels Möller

unread,

May 27, 1998, 3:00:00 AM5/27/98

to

William D Clinger <wi...@ccs.neu.edu> writes:

> Speaking of multiple return values, and getting to the real point of
> this message, R5RS Section 6.4 says
>
> Except for continuations created by the CALL-WITH-VALUES procedure,
> all continuations take exactly one value.
>
> Thus (BEGIN (VALUES 1 2 3) 4) is an error (or has unspecified semantics,
> which amounts to the same thing). The formal semantics in Section 7.2.3
> says that (BEGIN (VALUES 1 2 3) 4) evaluates to 4. I assume Section 6.4
> is right, and Section 7.2.3 is wrong.

What is the motivation for not allowing (BEGIN (VALUES 1 2 3) 4)? As
the "meaning" of (BEGIN e_1 ... e_n) is to evaluate a list of
expressions, and throw away all the returned values except for the
value of the last expression, e_n, it would make a lot of sense to
have it not care about the number of values returned from e_1 ...
e_{n-1}, wouldn't it?

I don't have the mrvs paper you refer to handy, but unless I'm
confusing it with some other paper, I think the implementation
strategy described there handles begin as follows: (begin e_1 ... e_n)
first evaluates e_1, ... e_{n-1}, and each of those evaluations are
passed a continuation which in all cases ignores the returned
value(s), and then continues with the next expression. Finally, e_n is
evaluated, and it is passed the continuation of the entire
begin-expression. Furthermore, with this implementation, ignoring
multiple values doesn't even cost anything extra. (I haven't been able
to read the formal semantics, but I would suspect that it handles the
continuations involved in evaluating a begin in about the same way,
and that's why it disagrees with section 6.4).

I guess one could always define a macro to expand

(my-begin e_1 ... e_n)

into

(begin (call-with-values (lambda () e_1)
(lambda ignored #f)) ... e_n)

to get a begin which can ignore any number of values, but why not have
the standard begin do that, when the cost of doing so is practically
nil?

If the sentence from section 6.4 you cite is to be interpreted as
implying that not even the continuations for _implicit_ begins can
handle (i.e. ignore) multiple values, that is even worse. Consider a
function foo which is in most cases called for side effects only, but
which also returns a value which is occasionally useful for the caller
(I like the style "if there's anything sensible at all to return, do
return that"). The strict interpretation would strongly discourage the
use of multiple return values for that type of functions.

(lambda () (foo ...) 17)

would be invalid whenever foo returns some multiple values I want to
ignore, and as would

(for-each foo some-list).

Of course, one could write macros that inserts dummy consumers for
these expressions as well, just as for my-begin above, but that seems
clumsy and inefficient.

I think it is quite important that it be convenient to ignore values.
Not being able to do that makes me think of C compiler that requires
an explicit casts to void for any value a want to ignore.

/Niels

Matthias Blume

unread,

May 27, 1998, 3:00:00 AM5/27/98

to

In article <356B332C...@ccs.neu.edu> William D Clinger <wi...@ccs.neu.edu> writes:

Although it is possible to implement multiple return values in a way
that makes all code run slower, the usual techniques for implementing
multiple return values have absolutely no impact on the efficiency of
code that does not use multiple return values.

Jeff Siskind has already written a good reply to this "usual
implementation technique" argument. But even if what Clinger says were
true, there is a much more serious issue in here: The mere presence of
MV in the language will force cautious programmers to distort their
programming style.

My often-repeated example for this is simple function composition. In
a sane world without MV, I simply write:

(define (compose f g) (lambda (x) (f (g x))))

Now, we are already in a slightly insane world -- we have multiple
arguments. So I have to write:

(define (compose f g) (lambda args (f (apply g args))))

With MV, I then have to distort this further to

(define (compose f g)
(lambda args
(call-with-values
(lambda () (apply g args))
f)))

Now, _that_'s what I call ugly! Try explaining this to a beginner of
this oh-so-great teaching language...

Aside from the ugliness -- multiple values are likely to introduce
inefficiencies. Perhaps it may be true that the "common case" will be
optimized in the kind of implementation that Clinger foresees -- the
case where a suitable receiver for MV is handily available at the time
those MV pop up. However, one often must capture the result of some
sub-computation to be able to _later_ send it off to some other
computation. The sad truth is that with MV we cannot simply say

(let ((tmp (do-this ...)))
(do-something-else)
(do-that tmp))

but instead must use CALL-WITH-VALUES:

(call-with-values
(lambda (do-this ...))
(lambda tmp
(do-something-else)
(apply do-that tmp)))

Thus, the second-class MV will often be converted explicitly to a
first-class compound data structure (e.g., a list). Only very clever
compilers will be able to streamline this and avoid the consing. If
the MV package passes through many such TMPs before finally being
consumed at its destination, then those explicit conversions are
likely to outweigh any benefits.

--
-Matthias

Clifford Beshers

unread,

May 27, 1998, 3:00:00 AM5/27/98

to

bl...@ux01.cs.princeton.edu (Matthias Blume) writes:

> --
> -Matthias

An interesting argument. Assuming your ``sane'' model of single
argument, single value functions only, what does the world look like?

I assume I would pack and unpack arguments? If so, would I have to
use lists as the packing structure? It seems that would get us right
back to the cons'ing.

Also, I don't think of Scheme as *just* a teaching language. Do you,
and if so, what bearing does it have on your opinion here?

--
Clifford Beshers Computer Graphics and User Interfaces Lab
bes...@cs.columbia.edu Department of Computer Science
http://www.cs.columbia.edu/~beshers Columbia University

William D Clinger

unread,

May 27, 1998, 3:00:00 AM5/27/98

to

ni...@lysator.liu.se (Niels Möller) wrote:

> What is the motivation for not allowing (BEGIN (VALUES 1 2 3) 4)?

The multiple return values proposal was accepted at the June 1992
meeting of the R*RS authors. Based on my minutes of that meeting,
Jonathan Rees wrote:

> Except for continuations created by the {\tt call-with-values}
> procedure, all continuations take exactly one value, as now; the
> effect of passing no value or more than one value to continuations
> that were not created by {\tt call-with-values} is unspecified (as
> indeed it is unspecified now)....
>
> Because the behavior of a number-of-values mismatch between a
> continuation and its invoker is unspecified, some implementations may
> assign some specific meaning to such situations; for example, extra
> values might be ignored, or defaults might be supplied for missing
> values....
> The behavior of programs in such situations was a point of contention
> among the authors, which is why only the least common denominator
> behavior was specified.

It was my impression that a consensus had emerged following the meeting
to require command continuations to accept (and to ignore) any number
of return values, which is why I changed the formal semantics to allow
this. On the other hand, this change was never formally approved by the
authors. I presume that Richard Kelsey felt that making this change
without a unanimous formal vote of the authors would be to exceed his
authority as editor of the R5RS.

I apologize for having failed to notice this discrepancy when proofreading
the R5RS.

bl...@ux01.cs.princeton.edu (Matthias Blume) suggested that multiple
return values might result in inefficiency because useful routines
such as FOR-EACH and COMPOSE might be coded in an ugly style to
accomodate the possibility of multiple return values. This is a
sophisticated argument, and I think it had a lot to do with the authors'
conservatism in June 1992.

The R5RS semantics (in Section 6.4) implies that returning no values
or more than one value is safe only when you know that the continuation
was created by CALL-WITH-VALUES. If the designer of FOR-EACH or COMPOSE
chooses to guarantee this, by calling CALL-WITH-VALUES, then there is
likely to be some overhead associated with that call. If the designer of
FOR-EACH or COMPOSE does not choose to guarantee this, then there will be
no call to CALL-WITH-VALUES, and there will be no overhead associated
with the mere existence of CALL-WITH-VALUES in the language.

Jeffrey Mark Siskind <qo...@research.nj.nec.com> pointed out that the
mere existence of multiple return values in the language might entail
some kind of overhead when combined with future implementation techniques.

I agree with Siskind, but I do not foresee any technological developments
that would entail such an overhead. In current implementations, excluding
perhaps a few interpreters that would be slow anyway, the existence of
multiple return values in R5RS Scheme has absolutely no impact on the
efficiency of programs that do not use them.

Will

Matthias Blume

unread,

May 27, 1998, 3:00:00 AM5/27/98

to

bl...@ux01.cs.princeton.edu (Matthias Blume) writes:

Well, perhaps. A dumb compiler would certainly cons a lot. I was
just trying to show that MV does not actually solve the consing
problem in a general way.

There are compilers for languages that use the one-arg one-result
model where a great deal of (type) analysis is done to avoid
unnecessary consing. This works quite well in strongly-typed
languages like ML.

Now, in a lot of cases similar success could perhaps be achieved by
using soft type analysis. If that turns out not to be the case, then
I would count this as an argument in favor of strong typing. (Just my
personal view...)

Also, I don't think of Scheme as *just* a teaching language. Do you,
and if so, what bearing does it have on your opinion here?

I am not sure.

--
-Matthias

Jeffrey Mark Siskind

unread,

May 27, 1998, 3:00:00 AM5/27/98

to wi...@ccs.neu.edu

> Jeffrey Mark Siskind <qo...@research.nj.nec.com> pointed out that the
> mere existence of multiple return values in the language might entail

> some kind of overhead when combined with future implementation techniques=
> =2E
>
> I agree with Siskind, but I do not foresee any technological developments=
>
> that would entail such an overhead. In current implementations, excludin=

> g
> perhaps a few interpreters that would be slow anyway, the existence of
> multiple return values in R5RS Scheme has absolutely no impact on the
> efficiency of programs that do not use them.

Suppose we added an exception mechanism to Scheme. It would be natural to have
an INVALID-CALL exception when a procedure was called with an inappropriate
number of arguments. Because call sites can be higher-order in Scheme, in
general, and especially without global flow analysis, it is not always
possible to prove that a given call site will not raise an INVALID-CALL
exception. So some form of run-time checking is necessary. Now (with the
exception of APPLY) the number of arguments of a call site is statically and
locally determinable by simple inspection of the call site. And the (range of)
allowed number of parameters of a procedure is statically and locally
determinable by simple inspection of the formal parameter list of the lambda
expression. So the (range of) allowed number of parameters of a procedure can
be encoded as a constant in the procedure object itself. And the code to check
and possibly raise the INVALID-CALL exception can simply compare a constant
that is dependent on the call site with a field of the procedure object.

If Scheme also had multiple values it would also be natural to raise an
exception when a procedure returned an inappropriate number of values. But the
number of values returned by a procedure is not statically and locally
determinable. For example, the following procedure might return one value and
might return values values.

(lambda (x) (if (p) (values 1) (values 2 3)))

And determining the number of values returned by following procedure:

(lambda (x) (f x))

requires knowing what F is. Furthermore, the number of values accepted by a
call site cannot be statically and locally determined. (I don't remember the
syntax of CALL-WITH-VALUES and I don't have a copy of R5RS at hand so please
excuse me if the following is slightly inaccurate. The spirit is true
nonetheless.) For example, in the following:

(call-with-values (lambda () (f)) g)

the number of values that F is allowed to return depends on the number of
arguments that G can accept.

The procedure call sequence for a language without multiple values is
inherently simpler than one with multiple values. For no other reason than
fewer checks need be done. And because of the higher-order nature of Scheme,
without global flow analysis, it is not possible to remove those checks. So
just the checks alone imply that the mere presence of multiple values in the
language incurs a cost penalty at every call site, even in programs that don't
use them.

Jeff (http://www.neci.nj.nec.com/homepages/qobi)

William D Clinger

unread,

Jun 9, 1998, 3:00:00 AM6/9/98

to Qo...@research.nj.nec.com

For the third time I say:

The existence of multiple return values in R5RS Scheme has absolutely

no impact on the efficiency of programs that do not use them.

There seems to be considerable confusion about this. For example,

a well-known and highly respected implementor of Scheme wrote:

> The procedure call sequence for a language without multiple values is
> inherently simpler than one with multiple values. For no other reason than
> fewer checks need be done. And because of the higher-order nature of Scheme,
> without global flow analysis, it is not possible to remove those checks. So
> just the checks alone imply that the mere presence of multiple values in the
> language incurs a cost penalty at every call site, even in programs that don't
> use them.

The first sentence above is true, provided you regard multiple return values
as part of the procedure call sequence. The other three sentences are false.

Instead of merely citing the standard references that describe the technologies
that are currently used to implement multiple return values in Lisp and Scheme,
I will briefly sketch those technologies. I will use the phrase "zero overhead"
to describe technologies that have no effect on the efficiency of programs that
do not use multiple return values. There seem to be three basic techniques:

1. VALUES detects whether its continuation was created by CALL-WITH-VALUE.
This technique combines zero overhead with the best possible error
detection.

2. Multiple return values are represented, or at least flagged, by some
first class value. This technique has zero overhead, but does not
detect all errors. Of these three techniques, this is the only one
that can be implemented by portable R4RS Scheme code.

3. Each continuation performs a dynamic check to ensure that it was
passed a correct number of return values. This technique has
unacceptably high overhead, but does detect all errors.

I am not aware of any implementations that use the third technique, but
there might be a few slow interpreters that use it.

Some implementations do not yet support multiple return values directly.
In practice, this implies that they use the second technique, implemented
by user code.

Implementations that directly support multiple return values use some
variation of the first technique. This technique was described in a
paper by Ashley and Dybvig [1], and is a practical application of a technique
that was later described in a tongue-in-cheek paper by Andrew Appel [2], who
might be amazed to learn that the Lisp and Scheme communities regard it as a
standard implementation technique. The technique works as follows.

(CALL-WITH-VALUES thunk receiver) creates a specially marked continuation.
The return address within CALL-WITH-VALUES can serve as the mark. Then it
performs a normal call to the thunk, with no arguments. When the thunk
returns a single value, then the receiver is called with that value as its
argument. In other words, the code for CALL-WITH-VALUES is basically

(define (call-with-values thunk receiver)
(receiver (thunk)))

All the magic occurs within VALUES. When VALUES is called with one argument,
it just returns that argument normally. When VALUES is called with zero
arguments or with more than argument, however, then it inspects the current
continuation to see whether that continuation was created by CALL-WITH-VALUES.
If it wasn't, then this is an error case. If it was, then VALUES fetches the
receiver out of the continuation frame that was created by CALL-WITH-VALUES,
pops that frame off the continuation, and calls the receiver with the same
arguments that were passed to VALUES. In other words, the code for VALUES
is basically

(define (values . args)
(if (= 1 (length args))
(car args)
(if (current-continuation-accepts-multiple-values?) ;magic
(let ((receiver (current-continuation-receiver))) ;magic
(current-continuation-pop!) ;magic
(apply receiver args))
(error (string-append
(number->string (length args))
" results were passed"
" to a continuation that requires"
" exactly one result.")))))

where the three lines that are marked as magic cannot be coded in Scheme,
but are easy for an implementor of Scheme to implement.

There is one kind of implementation that cannot easily implement those
three magic lines of code. This kind of implementation

is compiled rather than interpreted;
compiles to C, Java, or some similarly high-level language instead
of to assembly or machine language; and
does not generate properly tail-recursive target code

Such implementations may have a bit of a problem with recognizing whether
the current continuation was created by CALL-WITH-VALUES, since the current
continuation may not be accessible from C, Java, or whatever without playing
the kind of tricks described by Appel [2], and the lack of proper tail
recursion means that the conceptually current continuation may be buried
within an unbounded number of useless continuation frames.

Implementations of this sort can still use the second implementation
technique described above. Although that technique does not provide the
best possible error detection, this is not as serious a deficiency as the
improper tail recursion that creates the problem in the first place.
Furthermore Jeffrey Mark Siskind has discovered and has described to me a
fourth technique that such implementations can use, which combines near-zero
overhead with the best possible error detection, but he has not yet
published this result.

Implementors that wish to go beyond zero overhead by optimizing the multiple
value cases as well as the single-value case should consult the paper by
Ashley and Dybvig [1].

If you are a Scheme programmer rather than an implementor of Scheme, then
the only thing you need to remember from this entire thread is that

The existence of multiple return values in R5RS Scheme has absolutely

no impact on the efficiency of programs that do not use them.

Will

[1] J Michael Ashley and R Kent Dybvig. An efficient implementation of
multiple return values in Scheme. 1994 ACM Conference on LISP and
Functional Programming, pages 140-149. Available at
ftp://ftp.cs.indiana.edu/pub/scheme-repository/doc/pubs/mrvs.ps.gz

[2] Andrew W Appel. Intensional equality ;-) for continuations. ACM
SIGPLAN Notices 31(2), February 1996, pages 55-57. Available at
http://www.cs.princeton.edu/faculty/appel/papers/