Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

multiple values

192 views
Skip to first unread message

Jeffrey Mark Siskind

unread,
Jul 24, 1996, 3:00:00 AM7/24/96
to

A number of people have asked me to repost my news article on multiple values.
I've enclosed it below.
Jeff (home page http://tochna.technion.ac.il/~qobi)
-------------------------------------------------------------------------------
In-reply-to: shi...@clark.lcs.mit.EDU's message of 31 Jan 1995 02:37:40 -0500
Newsgroups: comp.lang.scheme
Subject: Re: multiple-value return & optimising compilers
Reply-to: Qo...@CS.Toronto.EDU
References: <dig-Sch...@mc.lcs.mit.edu> <950131074...@clark.lcs.mit.edu>
Distribution: world
--text follows this line--
In article <950131074...@clark.lcs.mit.edu> shi...@clark.lcs.mit.EDU (Olin Shivers) writes:

Matthias Blume makes the claim that multiple-value return produces objects
that aren't first-class, in the sense that the "container" holding the
multiple values isn't accessible as other values are.

Multiple return values are not a data-structure; this is like saying
variables
aren't first class or something.

Let me stand up to Matthias' defense here. Matthias is not claiming that the
the current multiple-value spec *does* create containers. Of course it doesn't.
He is claiming that:

a) (with the possible exception of CALL/CC) the semantics is equivalent to
what is provided by a trivial implementation based on containers.
b) This trivial implementation requires no new language features.
c) The proposed extension exists solely for the purpose of performance.

And most importantly:

d) multiple-values look very similar to containers, so similar that people
might view them as containers until they get bitten by the fact that they
are really not containers.

I think that the last point is very important. It is not that they are
containers. It is that they are analogous to containers.

I think a well-designed programming language can have two features that share
a common subfeature. It can also have two features, one of which is totally
subsumed by the other, the first of which exists solely for reasons of
performance. But then the syntax for specifying which feature to use should be
orthogonal to how it is used. That way I can write my code uniformly and make
local changes to tune performance.

Multiple return values are completely symmetric with multiple parameters to
procedures. The latter allows you to pass multiple values to a procedure;
the
former allows you to pass multiple values to an implicit continuation.
CALL-WITH-VALUES essentially allows you to specify the arity of a
continuation.

While such symmetry might be elegant from a language design/semantics point of
view I think that multiple values is an exceedingly bad idea from the point of
view of writing clean/understandable/maintainable code. One of the hallmarks
of Lisp style is writing nested expressions. Like it or not, expressions are
trees. They potentially have multiple children yet have only a single parent.
With multiple values, programs become directed graphs. The problem with this
is that you can't extract a subexpression and talk about its meaning or place
it somewhere else.

Very often, perhaps almost always, the multiple values are semantically
related. They *belong* in a container. Let me give you an example. I once
wrote a very large piece of code that did complicated computational geometry
calculations. It was originally written for a Symbolics machine and made
extensive use of multiple values for reasons of efficiency. Points, lines,
rays, circles, and so forth were all passed around as multiple single-float
values to avoid consing. The code did many geometric constructions. These
constructions were something like the following: Given four points p1, p2, p3,
and p4, construct a ray r from p1 in the direction of p2, construct a line l
that intersects p3 and p4, and return the intersection point of r with l.
This might be done with something like the following:

(intersection (ray p1 p2) (line p3 p4)

Now suppose lines are represented as the coefficients of ax+by=c and that rays
are represented as a point <x,y> and an angle. With (Common Lisp style)
multiple values this becomes:

(multiple-value-bind (la lb lc) (line p3x p3y p4x p4y)
(multiple-value-bind (rx ry rtheta) (ray p1x p1y p2x p2y)
(intersection-ray-line rx ry rtheta la lb lc)))

First of all, the later is much less clear than the former. Second, the
representation of lines and rays is hardcoded. With the later approach, if I
want to change the representation of lines to <point,angle> I will have to
change every fragment of code that uses lines. With the former, the bulk of
the code remains unchanged. Third, the former allows using a generic
`intersection' function. The latter does not. Fourth, suppose that I latter
discover that my algorithm is flawed. And the line l must be rotated by the
angle theta about the point p5 before it is intersected with the ray r.
I can simply modify my code as follows:

(intersection (ray p1 p2) (rotate (line p3 p4) p5 theta))

With the latter one needs to do:

(multiple-value-bind (la lb lc) (line p3x p3y p4x p4y)
(multiple-value-bind (la1 lb1 lc1) (rotate la lb lc p5x p5y theta)
(multiple-value-bind (rx ry rtheta) (ray p1x p1y p2x p2y)
(intersection-ray-line rx ry rtheta la1 lb1 lc1))))

Making the latter type of change is very error prone.

I never was able to fully debug my multiple-value code. Even after months of
debugging, there were dozens of latent bugs. I decided to forgo performance
and reimplemented the code using containers instead of multiple values. I was
able to eliminate all of the known bugs in a few days. (I ended up spending
several months writing a partial evaluator to get back the performance, but
that is a different story).

The key here is that while I was able to use multiple values and did in fact
use them, the objects that they represented WERE MORE PROPERLY VIEWED AS
*CONTAINERS*. I think that Matthias is dead right on this one.

Encapsulating the base functionality of m-v in a procedural form (with the
VALUES and CALL-WITH-VALUES procedures) is the way you do things in Scheme.
It
is analogous to encapsulating the base functionality of continuations in a
procedural form (with the CALL/CC procedure and the reified continuation
procedures it produces). This is a fine thing to do, and very much in the
"spirit of Scheme" to which Matthias keeps referring.

The spirit of Scheme is as I stated (quoting from the first paragraph of R4RS:
It was designed to have an exceptionally clear and simple semantics and
few different ways to form expressions.
^^^^^^^^^^^^^^^^^^
IMHO multiple values adds an unnecessary (and even undesirable) way to express
something that is already expressible in a different way.

I frequently see people on this list claim that it's a simple matter of
global
program analysis to handle all of the horrible inefficiencies introduced by
various proposals -- such as Matthias' one parameter/one return value
proposal.

I have noted that people who have actually implemented aggressive
native-code
Scheme compilers -- such as Orbit, Gambit, or Chez Scheme -- are usually not
the people who make these claims.

Not true. I have implemented a native-code compiler for Scheme called Stalin
that does aggressive global analysis. I believe that Stalin does more
extensive global analysis than any existing compiler for any programming
language. Stalin does not yet do the optimization necessary to convert
container-based returns into multiple-value returns but it is high on my
agenda. The infrastructure is there to support automatic linearity analysis.
(I agree with Henry Baker that linearity is important to support
`immediatization' and in-place update but prefer to have the compiler
determine this by global static analysis than have the programmer declare such
information.)

It is very difficult to do these compilers,
and global analysis of higher-order languages is quite tricky and does not
always pay off.

Sometimes yes and sometimes no.

If you believe that a magic global analysis will make the implementation
problems go away, then I invite you to design and implement such an
analysis.
Not only will your results be worth serious academic acclaim, they will
significantly impact real-world programming.

I hope so.

It's a double win. So, please. We
anxiously await you.

I don't want to dampen anyone's enthusiasm. Sarcasm aside, I really do
encourage anyone who wants to make advanced programming languages go fast --
go for it. It's a fun problem, if you are into that kind of thing (I am).
But
people have been working on this one for a while. If it was easy, it would
have been done by now.

There are many reasons why innovations happen later rather than earlier.
Sometimes it is because they are hard. Sometimes it is because some prior
enabling innovation is necessary. Sometimes it is just because no one thought
of it (perhaps because people were biased by some prevailing way of thinking).

Unlike others on this list, I don't program in Scheme because it is a simple
pedagogical programming language, useful for expressing ideas to students,
or
because it is elegant and therefore useful in applications where efficiency
is
of no concern at all. I want to use powerful languages to do real systems
programming. I don't want to have to use C. Efficiency matters. For a large
set of programming tasks -- graphics, spreadsheets, databases engines,
operating systems, word processors, text editors, file servers, network
applications, and so forth -- powerful notation is a great boon, but
efficiency is a requirement, and these are the tasks to which I would like
to
apply the technology of advanced languages. So arguments of the form, "Of
course it makes the system slow, but it's minimal and elegant, so it's OK"
don't work for me.

My primary research area is not programming languages or compilers. Stalin is
a low-priority part-time effort for me. I am building Stalin precisely because
I believe that Lisp is the best language for most of my other research efforts
and I need a better implementation than those that are avaiable.

It's easy to appeal to hypothetical static analyses to justify introducing
inefficient features into a language. It's a lot harder to do it.

Agreed.
-Olin

Jeff (home page http://www.cdf.toronto.edu/DCS/Personal/Siskind.html)
--

Jeff (home page http://tochna.technion.ac.il/~qobi)

Greg Morrisett

unread,
Jul 24, 1996, 3:00:00 AM7/24/96
to

Olin Shivers wrote:

> I frequently see people on this list claim that it's a simple matter of
> global
> program analysis to handle all of the horrible inefficiencies introduced by
> various proposals -- such as Matthias' one parameter/one return value
> proposal.
> I have noted that people who have actually implemented aggressive
> native-code
> Scheme compilers -- such as Orbit, Gambit, or Chez Scheme -- are usually not
> the people who make these claims.

However, it is interesting how the people who implement native code compilers
for ML (which does have the one parameter/one return value proposal)
make these claims. For example, both NJ, TIL, and CSL compilers
unbox function arguments (among other things.) Since SML/NJ uses
CPS, there's no need for unboxing results. TIL doesn't unbox
results, in part because we found no performance gain for the
added complexity. (There are classes of programs that do benefit,
but it adds a lot of complexity to the intermediate form of
a direct-style compiler to support multiple return values properly.
Adding it to the source language just shifts the burden to the programmer.)



> If you believe that a magic global analysis will make the implementation
> problems go away, then I invite you to design and implement such an
> analysis.
> Not only will your results be worth serious academic acclaim, they will
> significantly impact real-world programming.

Of course, both SML/NJ and TIL take advantage of a very simple, global
program analysis called, erm, algorithm W. Fortunately, Milner has
already received his Turing award, so no further acclaim is due :-)


> Unlike others on this list, I don't program in Scheme because it is a simple
> pedagogical programming language, useful for expressing ideas to students,
> or
> because it is elegant and therefore useful in applications where efficiency
> is
> of no concern at all. I want to use powerful languages to do real systems
> programming. I don't want to have to use C. Efficiency matters. For a large
> set of programming tasks -- graphics, spreadsheets, databases engines,
> operating systems, word processors, text editors, file servers, network
> applications, and so forth -- powerful notation is a great boon, but
> efficiency is a requirement, and these are the tasks to which I would like
> to
> apply the technology of advanced languages. So arguments of the form, "Of
> course it makes the system slow, but it's minimal and elegant, so it's OK"
> don't work for me.

If you're going to add something to Scheme to support efficient multiple
return values, I humbly suggest adding types. You get the same benefits
in terms of optimization, as witnessed by the NJ and TIL compilers, without
the loss of expressiveness or asymmetry. On top of this, you get a wonderful
static debugging tool, powerful semantic approaches to proving correctness,
etc. Furthermore, a good compiler can use the information to optimize not
only calling conventions, but also data representations within data structures
(e.g., unboxed floating point arrays, re-associated sums, flattened records
with packed fields, etc.) as well as garbage collection.

> It's easy to appeal to hypothetical static analyses to justify introducing
> inefficient features into a language. It's a lot harder to do it.

It's also easy to throw in the first hack that appeals to today's notion of
efficiency, knowing full well that these decisions, more often than not, come
back to bite us in the future.

-Greg Morrisett
j...@cs.cornell.edu

Guillermo (Bill) J. Rozas

unread,
Jul 26, 1996, 3:00:00 AM7/26/96
to

In article <QOBI.96Ju...@eesun.technion.ac.il> qo...@eesun.technion.ac.il (Jeffrey Mark Siskind) writes:

| d) multiple-values look very similar to containers, so similar that people
| might view them as containers until they get bitten by the fact that they
| are really not containers.
|
| I think that the last point is very important. It is not that they are
| containers. It is that they are analogous to containers.

But they can be. For example, a trick similar to boxing/unboxing
fixnums and flonums in MacLisp can be used.

Assume that you give multiple-value containers (almost) first class status.
By almost first-class status I mean that they can be passed around,
stored, etc., but _just like fixnums_ have no guarantee of EQ? ness.

A simple implementation would always box multiple values and
destructure in the caller when necessary.

A more complex (and presumably higher performance) implementation can
do things differently: Every procedure can have two entry points,
namely one used by the routines expecting multiple values and one used
by the routines expecting one (boxed) value.

Since there is no guarantee of EQ?ness, the caller can re-box at will
if the value returned is then used in a context that cannot be
flattened by the compiler (e.g. the aggregate is passed along to
someone else).

A simple analysis in the caller determines whether this is the case or
not.

Yes, this adds complexity to the compiler, but the additional
complexity (as in the fixnum/flonum case) is to increase performance
while keeping the inherent dynamism of Scheme (as opposed to
statically typed languages which have an easier time with such
things).

|
| (multiple-value-bind (la lb lc) (line p3x p3y p4x p4y)
| (multiple-value-bind (rx ry rtheta) (ray p1x p1y p2x p2y)
| (intersection-ray-line rx ry rtheta la lb lc)))

This is bad coding practice _whether_ multiple values are containers
or not.

Even if multiple values are ordinary containers, I would expect people
to use some destructuring sugar. Whether you call this MULTIPLE-VALUE-BIND
or destructuring-let (or extend LET's syntax) the issue is the same.

When you wrote this code, you chose a complicated interface with
insufficient abstraction, and this came back to bite you.

I do agree however, that things such as multiple values are misused.
Lisp programmers often micro-manage the efficiency of their programs.
Sometimes it is necessary. More often than not it is not.

Sean Case

unread,
Jul 28, 1996, 3:00:00 AM7/28/96
to

In article <QOBI.96Ju...@eesun.technion.ac.il>,

qo...@eesun.technion.ac.il (Jeffrey Mark Siskind) wrote:

>While such symmetry might be elegant from a language design/semantics point of
>view I think that multiple values is an exceedingly bad idea from the point of
>view of writing clean/understandable/maintainable code. One of the hallmarks
>of Lisp style is writing nested expressions. Like it or not, expressions are
>trees. They potentially have multiple children yet have only a single parent.

This is true in Lisp, because of the notation used for composition. Forth
programmers
don't have any trouble with multiple values, because Forth uses a more
general (if
less readable) notation for expressions. There are other possible syntaxes,
such as
Beta or CMS-2. Of course, if you change Scheme's expression syntax, you
definitely
aren't in Kansas any more.

A modest proposal:

Scheme already pretends that function arguments are placed in a list;
suppose that
we replace multiple-value-bind with list-bind, taking a (syntactic) list of
variables and a (genuine) list of values. If you like, we can allow all the
usual
kinds of parameter syntax in the list of variables to be bound. The
advantage of
this over "normal" list accessors is that using list-bind explicitly tells
the
compiler that the values argument will be destructured immediately on
arrival. The
"sufficiently smart compiler" can then try to recognize idioms
corresponding to
multiple return values.

Or is this a stupid idea?

Sean Case


---------------------------------------------
Sean Case gsc...@tpgi.com.au

Code is an illusion. Only assertions are real.

Jeffrey Mark Siskind

unread,
Jul 31, 1996, 3:00:00 AM7/31/96
to

In article <GJR.96Ju...@hplgr2.hpl.hp.com> g...@hplgr2.hpl.hp.com (Guillermo (Bill) J. Rozas) writes:

| I think that the last point is very important. It is not that they are
| containers. It is that they are analogous to containers.

But they can be.

So we are in agreement. Your proposal makes multiple values first class in
everyway except EQ?-ness. And you propose eliminating EQ?-ness as a means for
allowing unboxing. Such unboxing can be done even without eliminating EQ?-ness
by doing linearity analysis.

Independent of how you do unboxing, the issues of unboxing, EQ?-ness, and
linearity are more general, pervasive, and orthogonal to the issue of multiple
values. They apply to aggregate objects in general. Since the efficiency
benefits of multiple values are better handled in a more general way, and
because of the dubious value of the multiple value return constructs
themselves, I believe that there is no place for multiple value return
contructs in a properly designed language, except as syntact sugar that is
part of a more general structuring/destructuring facility.

|
| (multiple-value-bind (la lb lc) (line p3x p3y p4x p4y)
| (multiple-value-bind (rx ry rtheta) (ray p1x p1y p2x p2y)
| (intersection-ray-line rx ry rtheta la lb lc)))

This is bad coding practice _whether_ multiple values are containers
or not.

This was precisely my point.

Even if multiple values are ordinary containers, I would expect people
to use some destructuring sugar. Whether you call this MULTIPLE-VALUE-BIND
or destructuring-let (or extend LET's syntax) the issue is the same.

When you wrote this code, you chose a complicated interface with
insufficient abstraction, and this came back to bite you.

What better kind of interface would you suggest that I use? I contend that
with the conventional semantics for multiple values, there is no better
interface possible (short of writing your own compiler for conventional
expression notation and having that compiler do the boxing/unboxing
automatically). And I contend that the proper interface/abstraction is the
conventional expression-oriented formulation that I included in my original
post which treated the semantically-related multiple values as aggregate
objects.

Paul Wilson

unread,
Aug 6, 1996, 3:00:00 AM8/6/96
to

In article <AE21572E...@cbr-ts1-212.tpgi.com.au>,

Sean Case <gsc...@tpgi.com.au> wrote:
>In article <QOBI.96Ju...@eesun.technion.ac.il>,
>qo...@eesun.technion.ac.il (Jeffrey Mark Siskind) wrote:
>
>Forth programmers don't have any trouble with multiple values, because
> Forth uses a more general (if less readable) notation for expressions.

Maybe I'm misunderstanding your point, but in my experience Forth programmers
have tremendous problems with multiple return values.

I programmed professionally in Forth for a miserable few months, and
I spent a large amount of time tracking down bugs due to mishandled return
values. I also had the impression that my more experienced Forth-programming
colleagues had serious problems of the same sort. They either had the
bugs, or invested heavily in paranoid programming to avoid them.

Forth's exposed-stack do-it-yourself handling of arguments and return
values is a disaster coming and going.

It was quite common to have bugs where the wrong number of values would
be returned, the top one would be eaten, the program would run merrily
along for a few million instructions, and then it would return down
to the level of the erronously unconsumed return values, and blow up.

Worse, it might not blow up at that point, but go off and do seemingly
legitimate things for another million instructions or so, corrupting
things in horrible ways.

(This is a quibble about Forth, and has no bearing on your general
point.)
--
| Paul R. Wilson, Comp. Sci. Dept., U of Texas @ Austin (wil...@cs.utexas.edu)
| Papers on memory allocators, garbage collection, memory hierarchies,
| persistence and Scheme interpreters and compilers available via ftp from
| ftp.cs.utexas.edu, in pub/garbage (or http://www.cs.utexas.edu/users/wilson/)

Paul E. Bennett

unread,
Aug 12, 1996, 3:00:00 AM8/12/96
to

In article <40.106...@channel1.com>
yuval....@channel1.com "Yuval Peduel" writes:

> ************* Original From: PAUL WILSON
> * FORWARDED * To: ALL
> * MESSAGE * Date/Number: 08/06/96 - 0006011
> ************* On: CHANNEL1 - 1637 - comp.lang.scheme
> -----------------------------------------------------------------------
>
> *@FROM :wil...@cs.utexas.edu
> Message-ID: <4u7qte$i...@roar.cs.utexas.edu>
> Path: news.channel1.com!news1.channel1.com!wizard.pn.com!news-in.tiac.net
> news.kei.com!newsfeed.internetmci.com!swrinde!cs.utexas.edu!not-for-mail
> From: wil...@cs.utexas.edu (Paul Wilson)
> Newsgroups: comp.lang.scheme
> Subject: Re: multiple values
> Date: 6 Aug 1996 11:12:30 -0500
> Organization: CS Dept, University of Texas at Austin
> Lines: 36
> Message-ID: <4u7qte$i...@roar.cs.utexas.edu>
> References: <QOBI.96Ju...@eesun.technion.ac.il>
> <AE21572E...@cbr-ts1-212.tpgi.com.au>
> NNTP-Posting-Host: roar.cs.utexas.edu
> Ident-User: wilson


>
>
> In article <AE21572E...@cbr-ts1-212.tpgi.com.au>,
> Sean Case <gsc...@tpgi.com.au> wrote:
> >In article <QOBI.96Ju...@eesun.technion.ac.il>,
> >qo...@eesun.technion.ac.il (Jeffrey Mark Siskind) wrote:
> >
> >Forth programmers don't have any trouble with multiple values, because
> > Forth uses a more general (if less readable) notation for expressions.
>
> Maybe I'm misunderstanding your point, but in my experience Forth programmers
> have tremendous problems with multiple return values.
>
> I programmed professionally in Forth for a miserable few months, and
> I spent a large amount of time tracking down bugs due to mishandled return
> values. I also had the impression that my more experienced Forth-programming
> colleagues had serious problems of the same sort. They either had the
> bugs, or invested heavily in paranoid programming to avoid them.
>
> Forth's exposed-stack do-it-yourself handling of arguments and return
> values is a disaster coming and going.
>
> It was quite common to have bugs where the wrong number of values would
> be returned, the top one would be eaten, the program would run merrily
> along for a few million instructions, and then it would return down
> to the level of the erronously unconsumed return values, and blow up.
>
> Worse, it might not blow up at that point, but go off and do seemingly
> legitimate things for another million instructions or so, corrupting
> things in horrible ways.
>
> (This is a quibble about Forth, and has no bearing on your general
> point.)

Responses from comp.lang.forth:

Path: tcontec.demon.co.uk!news.demon.co.uk!dispatch.news.demon.net!demon!arclight.uoregon.edu!usenet.eel.ufl.edu!spool.mu.edu!howland.erols.net!vixen.cso.uiuc.edu!newsfeed.internetmci.com!in2.uu.net!wizard.pn.com!news1.channel1.com!news.channel1.com!channel1!yuval.peduel
Distribution: world
Newsgroups: comp.lang.forth
Subject: multiple values
From: yuval....@channel1.com (Yuval Peduel)
Message-ID: <40.106...@channel1.com>
Date: Sat, 10 Aug 1996 21:56:00 -0640
Organization: Channel 1(R) 617-864-0100 Info
Lines: 64

************* Original From: PAUL WILSON
* FORWARDED * To: ALL
* MESSAGE * Date/Number: 08/06/96 - 0006011
************* On: CHANNEL1 - 1637 - comp.lang.scheme
-----------------------------------------------------------------------

*@FROM :wil...@cs.utexas.edu
Message-ID: <4u7qte$i...@roar.cs.utexas.edu>
Path: news.channel1.com!news1.channel1.com!wizard.pn.com!news-in.tiac.net
news.kei.com!newsfeed.internetmci.com!swrinde!cs.utexas.edu!not-for-mail
From: wil...@cs.utexas.edu (Paul Wilson)
Newsgroups: comp.lang.scheme
Subject: Re: multiple values
Date: 6 Aug 1996 11:12:30 -0500
Organization: CS Dept, University of Texas at Austin
Lines: 36
Message-ID: <4u7qte$i...@roar.cs.utexas.edu>
References: <QOBI.96Ju...@eesun.technion.ac.il>
<AE21572E...@cbr-ts1-212.tpgi.com.au>
NNTP-Posting-Host: roar.cs.utexas.edu
Ident-User: wilson


* RM 1.31 2216 *


Path: tcontec.demon.co.uk!news.demon.co.uk!dispatch.news.demon.net!demon!arclight.uoregon.edu!usenet.eel.ufl.edu!spool.mu.edu!howland.erols.net!vixen.cso.uiuc.edu!newsfeed.internetmci.com!in2.uu.net!news.goodnet.com!news
From: Brad Eckert <di4...@goodnet.com>
Newsgroups: comp.lang.forth
Subject: Re: multiple values
Date: 11 Aug 1996 16:15:20 GMT
Organization: GoodNet
Lines: 34
Message-ID: <4ul0uo$10...@news.goodnet.com>
References: <40.106...@channel1.com>
NNTP-Posting-Host: phx-ts10-2.goodnet.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 1.22 (Windows; I; 16bit)

yuval....@channel1.com (Yuval Peduel) wrote:
[...]


>Maybe I'm misunderstanding your point, but in my experience Forth programmers
>have tremendous problems with multiple return values.
>
>I programmed professionally in Forth for a miserable few months, and
>I spent a large amount of time tracking down bugs due to mishandled return
>values. I also had the impression that my more experienced Forth-programming
>colleagues had serious problems of the same sort. They either had the
>bugs, or invested heavily in paranoid programming to avoid them.
>
>Forth's exposed-stack do-it-yourself handling of arguments and return
>values is a disaster coming and going.
>
>It was quite common to have bugs where the wrong number of values would
>be returned, the top one would be eaten, the program would run merrily
>along for a few million instructions, and then it would return down
>to the level of the erronously unconsumed return values, and blow up.
>
>Worse, it might not blow up at that point, but go off and do seemingly
>legitimate things for another million instructions or so, corrupting
>things in horrible ways.
>

[...]
A graphic illustration of what can go wrong if you're not real careful
when using the return stack for stuff. It's probably a good idea to
check the depths of both stacks at some point in the program where they
are supposed to be particular values, and report an error if they are not
the correct depth.

-- Brad

"Will work for code"


Path: tcontec.demon.co.uk!news.demon.co.uk!dispatch.news.demon.net!demon!arclight.uoregon.edu!usenet.eel.ufl.edu!psgrain!nntp.teleport.com!znmeb
From: zn...@teleport.com ()
Newsgroups: comp.lang.forth
Subject: Re: multiple values
Date: 11 Aug 1996 22:18:52 GMT
Organization: Teleport - Portland's Public Access (503) 220-1016
Lines: 59
Distribution: world
Message-ID: <4ulm8c$b...@nadine.teleport.com>
References: <40.106...@channel1.com> <4ul0uo$10...@news.goodnet.com>
NNTP-Posting-Host: kelly.teleport.com
X-Newsreader: TIN [version 1.2 PL2]

Brad Eckert (di4...@goodnet.com) wrote:
: yuval....@channel1.com (Yuval Peduel) wrote:
: [...]
: >Maybe I'm misunderstanding your point, but in my experience Forth programmers


: >have tremendous problems with multiple return values.
: >
: >I programmed professionally in Forth for a miserable few months, and
: >I spent a large amount of time tracking down bugs due to mishandled return
: >values. I also had the impression that my more experienced Forth-programming
: >colleagues had serious problems of the same sort. They either had the
: >bugs, or invested heavily in paranoid programming to avoid them.
: >
: >Forth's exposed-stack do-it-yourself handling of arguments and return
: >values is a disaster coming and going.
: >
: >It was quite common to have bugs where the wrong number of values would
: >be returned, the top one would be eaten, the program would run merrily
: >along for a few million instructions, and then it would return down
: >to the level of the erronously unconsumed return values, and blow up.
: >
: >Worse, it might not blow up at that point, but go off and do seemingly
: >legitimate things for another million instructions or so, corrupting
: >things in horrible ways.

: >
: [...]
: A graphic illustration of what can go wrong if you're not real careful
: when using the return stack for stuff. It's probably a good idea to
: check the depths of both stacks at some point in the program where they
: are supposed to be particular values, and report an error if they are not
: the correct depth.

: -- Brad

: "Will work for code"

Only languages which have access to the *entire* program text at compile
time can detect this common programming error, mismatch of type or
number of arguments between a calling program and the subroutine or
function called. Since the *majority* of languages, including Forth,
"C", FORTRAN, assembler and many implementations of Pascal, allow
different programmers to write different parts of an application at
different times, there are only two ways to deal with this:

1. Check at run time, as Brad Eckert suggests. Many compilers insert
such checks when compiling in debug mode.

2. Enforce the discipline adminstratively on the programmers through
documentation standards, hopefully augmented by CASE tools.

In any event, it should be emphasized that this is *not* a weakness
inherent to Forth in relation to other languages! BTW, there are other
things to be gained from a compiler that has access to the entire
program text, among them many more speed and memory optimizations.

--
zn...@teleport.com (M. Edward Borasky) http://www.teleport.com/~znmeb

What phrase will you *never* hear Candice Bergen use?
"My daddy didn't raise no dummies!"

Path: tcontec.demon.co.uk!news.demon.co.uk!dispatch.news.demon.net!demon!arclight.uoregon.edu!usenet.eel.ufl.edu!news-res.gsl.net!news.gsl.net!swrinde!cs.utexas.edu!howland.erols.net!vixen.cso.uiuc.edu!newsfeed.internetmci.com!act.news.telstra.net!newshost.telstra.net!news.ci.com.au!wabbit.its.uow.edu.au!seagoon.newcastle.edu.au!usenet
From: "Bruce R. McFarling" <ec...@cc.newcastle.edu.au>
Newsgroups: comp.lang.forth
Subject: Re: multiple values
Date: 12 Aug 1996 00:59:26 GMT
Organization: Department of Economics, University of Newcastle
Lines: 48
Message-ID: <4ulvle$o...@seagoon.newcastle.edu.au>
References: <40.106...@channel1.com>
NNTP-Posting-Host: econ70.newcastle.edu.au
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 1.2N (Windows; I; 16bit)

yuval....@channel1.com (Yuval Peduel) wrote:

>In article <QOBI.96Ju...@eesun.technion.ac.il>,
>>qo...@eesun.technion.ac.il (Jeffrey Mark Siskind) wrote:
>>Forth programmers don't have any trouble with multiple values,
>>because Forth uses a more general (if less readable) notation for
>> expressions.
>
> Maybe I'm misunderstanding your point, but in my experience Forth
> programmers have tremendous problems with multiple return values.

> I programmed professionally in Forth for a miserable few months,
> and I spent a large amount of time tracking down bugs due to
> mishandled return values. I also had the impression that my
> more experienced Forth-programming colleagues had serious problems
> of the same sort. They either had the bugs, or invested heavily in > paranoid programming to avoid them.

> Forth's exposed-stack do-it-yourself handling of arguments and
> return values is a disaster coming and going.

(1) The original assertion was pointing out that Forth
programmers can return multiple return values by direct extension
of the way they return a single return value -- unlike, for example,
C, where multiple values have to be returned in a different way.

(2) The issue addressed in the reply is a direct consequence
of the first point. You can't hope to provide the programmer freedom
to do as they choose without giving them additional freedom to shoot
themselves in the foot. Programming *in* Forth (as opposed to simply
programming Forth) involves the verification that the word you have
just defined works as expected in normal and boundary conditions.
I think that's an effective way to program in general, and Forth
lets me do it without writing andf rewriting (and debugging!) stubs:
if you test by commenting the behavior you expect, and then verifying
the behavior you expect, the comments get written when the purpose
is still fresh in the mind; and testing immediately after writing
provides the most immediate positive and negative feedback for
effective and ineffective programming. OTOH, if you don't like that
programming style, there is probably much more support in C for
debugging in the "build it broken then fix it" approach.

--
Virtually,

Bruce R. McFarling, Newcastle, NSW
ec...@cc.newcastle.edu.au

Path: tcontec.demon.co.uk!news.demon.co.uk!dispatch.news.demon.net!demon!arclight.uoregon.edu!enews.sgi.com!news.sgi.com!news-res.gsl.net!news.gsl.net!news.mathworks.com!uunet!in3.uu.net!news.compuserve.com!ix.netcom.com!news
From: JETh...@ix.netcom.com (Jonah Thomas)
Newsgroups: comp.lang.forth
Subject: Re: multiple values
Date: 12 Aug 1996 05:37:53 GMT
Organization: Netcom
Lines: 23
Message-ID: <4umfvh$i...@dfw-ixnews9.ix.netcom.com>
References: <40.106...@channel1.com> <4ulvle$o...@seagoon.newcastle.edu.au>
NNTP-Posting-Host: lax-ca18-17.ix.netcom.com
X-NETCOM-Date: Mon Aug 12 12:37:53 AM CDT 1996

In <4ulvle$o...@seagoon.newcastle.edu.au> "Bruce R. McFarling"
<ec...@cc.newcastle.edu.au> writes:
>yuval....@channel1.com (Yuval Peduel) wrote:

>>In article <QOBI.96Ju...@eesun.technion.ac.il>,
>>>qo...@eesun.technion.ac.il (Jeffrey Mark Siskind) wrote:
>>>Forth programmers don't have any trouble with multiple values,
>>>because Forth uses a more general (if less readable) notation for
>>> expressions.

>> Maybe I'm misunderstanding your point, but in my experience Forth
>> programmers have tremendous problems with multiple return values.

>> I programmed professionally in Forth for a miserable few months,

I looked at where the original post in this thread came from, and I
figured I wouldn't respond unless I figured out how to post back to the
original newsgroup too. We do a whole lot of preaching to the choir
here, no point doing more.

I never went back to get that straight, but I hope somebody does. Let's
get these responses out to the people who read the original.


****************
end of cross-post


--
Paul E. Bennett <p...@transcontech.co.uk>
Transport Control Technology Ltd.
Tel: +44 (0)117-9499861
Going Forth Safely

Mitchell Wand

unread,
Aug 13, 1996, 3:00:00 AM8/13/96
to

Paul Bennett kindly forwarded the following message

> From: zn...@teleport.com ()
> Newsgroups: comp.lang.forth
> Subject: Re: multiple values

> Only languages which have access to the *entire* program text at compile
> time can detect this common programming error, mismatch of type or
> number of arguments between a calling program and the subroutine or
> function called. Since the *majority* of languages, including Forth,
> "C", FORTRAN, assembler and many implementations of Pascal, allow
> different programmers to write different parts of an application at
> different times, there are only two ways to deal with this:

> 1. Check at run time, as Brad Eckert suggests. Many compilers insert
> such checks when compiling in debug mode.

> 2. Enforce the discipline adminstratively on the programmers through
> documentation standards, hopefully augmented by CASE tools.

So this person evidently does not understand/know about strongly typed
languages, even after 19+ years (dating from Milner's 1977 article).

Maybe this is an example of what Bob means by not getting the word out.

--Mitch

Eric W. Biederman

unread,
Aug 14, 1996, 3:00:00 AM8/14/96
to

>>>>> "Mitchell" == Mitchell Wand <wa...@delphi.ccs.neu.edu> writes:

Mitchell> Paul Bennett kindly forwarded the following message


>> From: zn...@teleport.com ()
>> Newsgroups: comp.lang.forth
>> Subject: Re: multiple values

>> Only languages which have access to the *entire* program text at compile


>> time can detect this common programming error, mismatch of type or
>> number of arguments between a calling program and the subroutine or
>> function called. Since the *majority* of languages, including Forth,
>> "C", FORTRAN, assembler and many implementations of Pascal, allow
>> different programmers to write different parts of an application at
>> different times, there are only two ways to deal with this:

>> 1. Check at run time, as Brad Eckert suggests. Many compilers insert
>> such checks when compiling in debug mode.

>> 2. Enforce the discipline adminstratively on the programmers through
>> documentation standards, hopefully augmented by CASE tools.

Mitch> So this person evidently does not understand/know about strongly typed
Mitch> languages, even after 19+ years (dating from Milner's 1977 article).

Mitch> Maybe this is an example of what Bob means by not getting the word out.

Perhaps I am clueless on this but the point seems a good _practical_
point. But even more then that if are not talking _types_ but
something else that a type system doesn't/can't catch, I see the point
as quite valid.

Since I haven't read this 1977 article how could types be exteneded to
general catch this kind of error. Which happens to show up in forth
because it has an explicit stack for arguments, and doesn't use
functions/procedures. Heck I'd like to hear how you could use strong
types to catch this kind of error with any kind of add-on stack!

Eric


Stephen J Bevan

unread,
Aug 14, 1996, 3:00:00 AM8/14/96
to

In article <m191biz...@flinx.home> ebie...@cse.unl.edu (Eric W. Biederman) writes:
Since I haven't read this 1977 article how could types be exteneded to
general catch this kind of error. Which happens to show up in forth
because it has an explicit stack for arguments, and doesn't use
functions/procedures. ...

Indeed, it uses "words" but it isn't clear to me that they are
significantly different from functions/procedures, it is just that
there is an open stack on which arguments are passed and returned.


... Heck I'd like to hear how you could use strong


types to catch this kind of error with any kind of add-on stack!

Some work done by Stoddart and Knaggs may be of interest
[Stoddart:Knaggs:fac:1993] and I believe that Robin Popplestone has
done some work on this area and has published some techreports on it
which are available via the Glasgow FP page
<URL:http://www.dcs.gla.ac.uk/fp/>.

@article
{ Stoddart:Knaggs:fac:1993
, author= "Bill Stoddart and Peter J. Knaggs"
, title= "Type Inference in Stack Based Languages"
, journal= "Formal Aspects of Computing"
, year= 1993
, volume= 5
, number= 4
, pages= "289--298"
, checked= 19940317
, keywords= "type inference, type checking, Forth, stack based language"
, abstract= "We consider a language of operations which pass
parameters by means of a stack. An algebra over the set of type
signatures is introduced, which allows the type signature of a program
to be obtained from the type signatures of its constituent operations.

Although the theories apply in principle to any stack based language,
they have been evolved with particular regard to the proposed ANSI
Standard Forth language, which is currently implemented in a type free
manner. We hope this work will stimulate an interest in Forth amongst
those applying algebraic techniques in software engineering, and we
hope to lay the theoretical foundations for implementing practical
type checkers to support Forth."
}

Anton Ertl

unread,
Aug 14, 1996, 3:00:00 AM8/14/96
to Eric W. Biederman

In article <m191biz...@flinx.home>, ebie...@cse.unl.edu (Eric W. Biederman) writes:
|> Perhaps I am clueless on this but the point seems a good _practical_
|> point. But even more then that if are not talking _types_ but
|> something else that a type system doesn't/can't catch, I see the point
|> as quite valid.
|>
|> Since I haven't read this 1977 article how could types be exteneded to
|> general catch this kind of error. Which happens to show up in forth
|> because it has an explicit stack for arguments, and doesn't use
|> functions/procedures. Heck I'd like to hear how you could use strong

|> types to catch this kind of error with any kind of add-on stack!

IIRC, the problem discussed was that not all return values are consumed
or that more than the return values are consumed (from my experience
with Forth, this is not a frequent or hard problem, in contrast to the
claims made).

This kind of error would be easy to catch statically, if we restrict
the legal programs such that all words (functions for you Schemers)
take and return a statically determined number of values on the stack
(a restriction satisfied by ~99% of the Forth words written, but most
applications contain a few words that do not satisfy it).

You just have to count the stack depth statically; At a control-flow
join, if there are different stack depths at the joining paths,
there's something wrong in one of them. This does not catch some
multiple errors and does not point out the error exactly; we would
need stack effect declarations for that. [hoffmann93] describes this
in more detail.

Of course, there are many problems we cannot catch at compile time
(e.g., lift the restriction above; then it's even hard to catch this
problem at run-time automatically). I always wonder why compile-time
type-checking is so religiously defended by many, who don't give a
damn about, e.g., compile-time assertion checking (not to mention that
most don't use assertions at all).

@InProceedings{hoffmann93,
author = "Ulrich Hoffmann",
title = "Static Stack Effect Analysis",
booktitle = "EuroFORTH '93 conference proceedings",
year = "1993",
address = "Mari\'ansk\'e L\'azn\`e (Marienbad)"
}

- anton

Posted to comp.lang.forth and comp.lang.scheme. Set followup as
appropriate.
--
M. Anton Ertl Some things have to be seen to be believed
an...@mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html

Anton Ertl

unread,
Aug 16, 1996, 3:00:00 AM8/16/96
to

In article <4usi07$s...@news.tuwien.ac.at>, an...@a0.complang.tuwien.ac.at (Anton Ertl) writes:
|> This kind of error would be easy to catch statically, if we restrict
|> the legal programs such that all words (functions for you Schemers)
|> take and return a statically determined number of values on the stack
|> (a restriction satisfied by ~99% of the Forth words written, but most
|> applications contain a few words that do not satisfy it).

Silly me, I forgot EXECUTE (i.e., indirect calls). I should not post
when I am tired.

Anyway, EXECUTE is somewhat more frequent than words that break the
restrictions above. To make words using it statically checkable, we
would need unrealistically good data flow analysis to ensure that all
words executed by a specific EXECUTE have the same stack effect. We
could also use declarations at the places where the data flow analysis
would have problems.

0 new messages