Why is (eqv? g g) unspecified in R6RS when g is a procedure?

John Cowan

unread,

May 8, 2012, 8:24:21 AM5/8/12

to

One of the changes between R5RS and R6RS is that (eqv? x x), where
x is a variable whose value is a procedure, is allowed to return #f.
The R6RS Rationale, section 11.5.1, says in its entirety:

The definition of eqv? allows implementations latitude in their
treatment of procedures: implementations are free either to detect or
to fail to detect that two procedures are equivalent to each other,
and can decide whether or not to merge representations of equivalent
procedures by using the same pointer or bit pattern to represent
both. Moreover, they can use implementation techniques such as
inlining and beta reduction that duplicate otherwise equivalent
procedures.

The first sentence of the rationale is only partly true: the R6RS
gen-counter example shows that if procedures are not operationally
equivalent, eqv? must treat them as distinct, as was true in R5RS.
The second sentence does not seem to me to constitute a sufficient
explanation for results like this:

(let ((x (lambda (x) (+ x 1)))) (eqv? x x)) => #f rather than #t
(memv cdr (list car cdr cons)) => #f rather than a list of two elements

In essence, that freedom makes procedures not really first-class:
they can't be placed into data structures and taken out again reliably.
Furthermore, none of the existing R6RS implementations (Racket, Guile,
Chez, Vicare, Larceny, Ypsilon, Mosh) actually takes advantage of this
freedom, at least at the REPL.

Can someone explain the thinking of the R6RS committee here?
Adv-thanks-ance.

Barry Margolin

unread,

May 8, 2012, 11:07:57 AM5/8/12

to

In article
<17724820.57.1336479861732.JavaMail.geo-discussion-forums@ynjn15>,

John Cowan <johnw...@gmail.com> wrote:

> One of the changes between R5RS and R6RS is that (eqv? x x), where
> x is a variable whose value is a procedure, is allowed to return #f.
> The R6RS Rationale, section 11.5.1, says in its entirety:
>
> The definition of eqv? allows implementations latitude in their
> treatment of procedures: implementations are free either to detect or
> to fail to detect that two procedures are equivalent to each other,
> and can decide whether or not to merge representations of equivalent
> procedures by using the same pointer or bit pattern to represent
> both. Moreover, they can use implementation techniques such as
> inlining and beta reduction that duplicate otherwise equivalent
> procedures.
>
> The first sentence of the rationale is only partly true: the R6RS
> gen-counter example shows that if procedures are not operationally
> equivalent, eqv? must treat them as distinct, as was true in R5RS.

I think they intentionally said that the latitude is in detecting
equivalence, NOT in detecting difference. And there's no conceivable
implementation where operationally different procedures could be
mistaken as being equivalent.

> The second sentence does not seem to me to constitute a sufficient
> explanation for results like this:
>
> (let ((x (lambda (x) (+ x 1)))) (eqv? x x)) => #f rather than #t
> (memv cdr (list car cdr cons)) => #f rather than a list of two elements
>
> In essence, that freedom makes procedures not really first-class:
> they can't be placed into data structures and taken out again reliably.

Sure they can. Since the only thing that matters about a procedure is
what happens when you call it, it doesn't matter if some copying takes
place.

> Furthermore, none of the existing R6RS implementations (Racket, Guile,
> Chez, Vicare, Larceny, Ypsilon, Mosh) actually takes advantage of this
> freedom, at least at the REPL.
>
> Can someone explain the thinking of the R6RS committee here?
> Adv-thanks-ance.

Suppose the implementation has metadata that maps a procedure back to
its name, and you do something like this:

(define proc1 (lambda ...))
(define proc2 proc1)

The second expression might make a copy of the procedure (or just its
header block) so that it could record that the name has changed to proc2.

--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

John Cowan

unread,

May 8, 2012, 12:40:50 PM5/8/12

to

On Tuesday, May 8, 2012 11:07:57 AM UTC-4, Barry Margolin wrote:

> I think they intentionally said that the latitude is in detecting
> equivalence, NOT in detecting difference.

No, the situation is less symmetrical than that. The examples give
the well-known (to those who read Scheme standards) procedures
GEN-COUNTER and GEN-LOSER: the former returns an operationally distinct
procedure every time it is called, the latter returns an operationally
identical one. R6RS Schemes, like older Schemes, require that calls
to GEN-COUNTER produce objects that are distinct in the sense of EQV?,
and is open about the results of calls to GEN-LOSER: they may be
distinct or may not be.

However, it explicitly says (and this is new) that (let ((x (gen-counter)))
(eqv? x x)) is indeterminate, and so is (let ((x (gen-loser))) (eqv? x x))!

> And there's no conceivable
> implementation where operationally different procedures could be
> mistaken as being equivalent.

No, there isn't. But implementations are permitted in which procedures lack
object identity.

> Sure they can. Since the only thing that matters about a procedure is
> what happens when you call it, it doesn't matter if some copying takes
> place.

It matters, for example, if you try to use a procedure as a key to a hash
table: the implementation is permitted, when asked if that same procedure
is in the hash table, to reply that it is not. Before R6RS, procedures
had object identity. Now there is no guarantee that a procedure is EQV?
even to itself.

Michael Sperber

unread,

May 9, 2012, 2:29:47 AM5/9/12

to John Cowan

John Cowan <johnw...@gmail.com> writes:

> One of the changes between R5RS and R6RS is that (eqv? x x), where
> x is a variable whose value is a procedure, is allowed to return #f.
> The R6RS Rationale, section 11.5.1, says in its entirety:
>
> The definition of eqv? allows implementations latitude in their
> treatment of procedures: implementations are free either to detect or
> to fail to detect that two procedures are equivalent to each other,
> and can decide whether or not to merge representations of equivalent
> procedures by using the same pointer or bit pattern to represent
> both. Moreover, they can use implementation techniques such as
> inlining and beta reduction that duplicate otherwise equivalent
> procedures.
>
> The first sentence of the rationale is only partly true: the R6RS
> gen-counter example shows that if procedures are not operationally
> equivalent, eqv? must treat them as distinct, as was true in R5RS.
> The second sentence does not seem to me to constitute a sufficient
> explanation for results like this:
>
> (let ((x (lambda (x) (+ x 1)))) (eqv? x x)) => #f rather than #t

For most implementations, the only reasonable test of procedure
equivalence is just checking if the two references point to the same
object - essentially using eq?. Also, (simplifying greatly), most
implementations usually allocate a fresh object when evaluating a
`lambda' expression. This means that two different evaluations of
`lambda' expressions usually give non-equivalent procedures in the sense
of `eqv?'.

Now, in the example above, beta-reducing the `let' gives you:

(eqv? (lambda (x) (+ x 1)) (lambda (x) (+ x 1)))

This is the kind of program transformation we wanted to allow in R6RS.
Since one expression has become two, you also generally get two
evaluations of `lambda' expressions, and, hence, two objects that are
not `eqv?'.

> (memv cdr (list car cdr cons)) => #f rather than a list of two elements

This example, I believe, specifically used simple primitives, which can
be compiled to very short open instruction sequences when applied
directly. Converting them into first-class procedures may mean
something similar to eta-expanding them. Again, you get separate `lambda'
expressions, and therefore non-equivalence.

Does this help?

Historical aside: Prior to R6RS, the ML people had ridiculed us for
years for the old behavior of `eqv?'.

--
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla

John Cowan

unread,

May 9, 2012, 3:02:39 PM5/9/12

to John Cowan

On Wednesday, May 9, 2012 2:29:47 AM UTC-4, Michael Sperber wrote:

> (eqv? (lambda (x) (+ x 1)) (lambda (x) (+ x 1)))
>
> This is the kind of program transformation we wanted to allow in R6RS.
> Since one expression has become two, you also generally get two
> evaluations of `lambda' expressions, and, hence, two objects that are
> not `eqv?'.

As I pointed out, this bites both ways: given this expression, a compiler
may notice that the two expressions are both pure and identical, and reduce
the whole thing to just #t.

> > (memv cdr (list car cdr cons)) => #f rather than a list of two elements
>
> This example, I believe, specifically used simple primitives,

I actually invented this example myself, so I used primitives only to keep
it short. But perhaps I chose better than I knew.

> Does this help?

Well, it explains the motivation well enough. Unfortunately, we are still
stuck with the issue: procedures are supposed to be first-class in Scheme,
but when you put them into a data structure, you can't be sure of getting
them out again. In the Lambda Order, that's not supposed to happen.

> Historical aside: Prior to R6RS, the ML people had ridiculed us for
> years for the old behavior of `eqv?'.

Ah, language design by competitive insults! If I'd known that was the
proper procedure, it would have made my R7RS efforts so much easier....

Michael Sperber

unread,

May 15, 2012, 2:27:17 AM5/15/12

to

John Cowan <johnw...@gmail.com> writes:

> Well, it explains the motivation well enough. Unfortunately, we are still
> stuck with the issue: procedures are supposed to be first-class in Scheme,
> but when you put them into a data structure, you can't be sure of getting
> them out again. In the Lambda Order, that's not supposed to happen.

I'm not sure I'm with your definition of first class, which seems to be
intertwingled with the definition of `eqv?'. (Why `eqv?', and not
`eq?'. With regard to that, you can't be sure of putting a number into
a data structure and getting it out again.) That's a reasonable view,
but I didn't see this in my membership charter when I signed up for the
Lambda Order.

John Cowan

unread,

May 15, 2012, 2:13:35 PM5/15/12

to

On Tue, May 15, 2012 at 2:26 AM, Michael Sperber
<sperber at deinprogramm.de> wrote:

> I'm not sure I'm with your definition of first class, which seems to be
> intertwingled with the definition of `eqv?'. (Why `eqv?', and not
> `eq?'. With regard to that, you can't be sure of putting a number into
> a data structure and getting it out again.)

Because of R5RS section 3.4 and R6RS section 5.10:

# An object fetched from a location, by a variable reference or
# by a procedure such as car, vector-ref, or string-ref,
# is equivalent in the sense of eqv? (section 6.1) to the object
# last stored in the location before the fetch.

That is why eqv? is Scheme's identity predicate. Well, as Quine says,
no entity without identity: since procedures don't have identity, they
are not really entities. With the annoying but trivial exception of
+nan.0, there are no other Scheme objects that behave like this. (I
got onto all this in the first place when memoizing functions of one
argument on Chicken: if the argument is +nan.0, you are out of luck
with hash tables, and have to handle it specially.)

emmanuel....@gmail.com

unread,

May 16, 2012, 2:37:44 AM5/16/12

to

Yes that's true and should be true for functions as well, even with the proposed change. I tend to think that (eqv? x x) should be true for all objects x. I think the problem is not well worded, I will try to explain it as I understand it. Please correct me.

Please notice the emphasis "An object _fetched from a location_". Again, the issue is not functions being first class, there are and should be first-class in our "Lambda Order": if you store a function in a location and take it back twice then these two objects must be eqv? "by contract". The real issue is:
"Do we want to enforce lambda to allocate a location for every closures it creates or not ?"

As far as I understand it, the problem is with following R5RS statement enforces lambda to provide a location for all procedures:

"obj1 and obj2 are procedures whose location tags are equal"

But this may be too demanding, on one hand this disallow functions inlining, for instance you may want to implement 'car','cdr', etc. as primitives. On the other hand if you have a named let you may want to use registers for passing arguments instead of allocating a closure in memory for it. Of course if your named let escapes you will have to allocate it on memory, the fact is that you are not forced to do it every time with the proposed change. Surely there are other examples and I would really like to hear about them.

So what the standard is about is to allow some implementation to do these optimizations and accept it.

I would be really glad to have more details about this from implementers. Does some of you have implemented the above ?

--
Emmanuel

William D Clinger

unread,

Jul 7, 2012, 11:13:42 AM7/7/12

to

On Wednesday, May 16, 2012 2:37:44 AM UTC-4, emmanuel....@gmail.com wrote:

> I would be really glad to have more details about this from
> implementers. Does some of you have implemented the above ?

Yes. This optimization was implemented a long time ago in
PC Scheme, and would have been implemented in most of the
more performant implementations of Scheme had it not been
for the R5RS semantics. Although this optimization has not
been implemented in Larceny (because Larceny still supports
the R5RS), I'm going to implement it later this summer.
It's an easy change to the single assignment analysis that
Larceny's compiler has been performing for years and years:

http://www.larcenists.org/twobit.html
http://www.larcenists.org/Twobit/p2saa.html

If you want to look at the source code for Twobit, the change
that needs to be made is in the single-assignment-analysis
procedure, which is in src/Compiler/pass2p1.sch. The need
for this optimization occurs when the following test fails:

(if (= (length (calls R I)) (length (references R I)))
...
...)

At present, failure of that test disables optimization of
the procedure named I along with all other local procedures
that are defined in that same group of local definitions
following the definition of I. The optimization consists
of rewriting the escaping references to I as new lambda
expressions with I in call position. Following those
rewrites, the test shown above will succeed and the local
procedure I will be optimized.

Will