How to check if two objects are equal

bobu...@yahoo.com

unread,

Jan 15, 2006, 12:15:42 PM1/15/06

to

This is a short summary of the thread "what is eq? used for". I tried
to distil the essence but there could be misunderstandings on my part
and essentials that are left out. The imaginary conversation below is
between a newbie and a hacker. Observe that this conversation is
towards a newbie with a newbie's understanding of things. I would very
much appreciate comments on both the correctness and the pedagogical
presentation.

Bob

newbie: If you want to determine if two objects are equal what
procedure should you use?

hacker: You can use equal?, eq?, eqv? or =.

newbie: Why is there four of them, is not one enough?

hacker: Do you remember what we talked about when we discussed the
model of the memory?

newbie: You mean that most objects in Scheme were in the heap and had
pointers pointing to them.

hacker: Yes, that's right. We called such objects boxed, and the
objects not in the heap we called unboxed. Now suppose I have a boxed
object pointed by the pointer p and whose contents is a. Suppose also
that I have another boxed object pointed by the pointer q and whose
contents is b. What equality questions could you then ask?

newbie: I could ask if the pointer p and q are equal, that is if they
are in fact pointing at the same boxed object. If the answer is yes
then I know that a and b are also equal. But if p and q are different
then I could ask if a and b are equal.

hacker: That's right, to check for the first kind of equality you use
the procedure eq? and to check for the second kind of equality you use
the procedure equal?. Here's an example
> (define p (cons 1 2))
> p
(1 . 2)
> (define p (cons 1 2))
> q
(1 . 2)
> (eq? p q)
#f
> (equal? p q)
#t

newbie: I see, but what about =, what's that used for?

hacker: In mathematics you know that 3 and 3.0 are equal. But in Scheme
3 is an integer and is implemented in a different way then 3.0 which is
a float. So
> (equal? 3 3.0)
#f
But you might still want to think about 3 and 3.0 as equal. Using =
let's you check for this. Thus
> (= 3 3.0)
#t

newbie: OK. What is the eqv? used for?

hacker: You can think of eqv? as almost the same as eq? You can use
only eq? and forget about eqv? or the other way around.

newbie: Is there no difference between them?

hacker: Yes, there is but as a newbie you don't need to be aware of the
difference, since it will only confuse your mental model of equality.
However if you really want to know the difference you can read about it
at www.schemers.org/Documents/Standards/R5RS/HTML/r5rs-Z-H-9.html.

newbie: Can't you at least hint about the difference?

hacker: As you probably know there are mutable objects (like vectors
strings and pairs) and immutable objects (like numbers and characters).
Roughly eqv? behaves like eq? for mutable objects, but like equal? for
immutable objects.

newbie: You are talking about different kind of objects like mutable,
immutable and so on. What kind of objects are there in Scheme?

hacker: There are four kinds of types:
1. Immutable: numbers, characters.
2. Atomic: booleans, the empty list, symbols. These are consisting of
explicitly named immutable values.
3. Mutable: vectors, strings, pairs and ports. These consist of
compound values with mutable fields.
4. Procedures.

newbie: All right, if we return to the original discussion; Which one
should I choose, eq? or eqv?

hacker: For you as a newbie it doesn't matter, choose whichever you
want.

newbie: Could you show me one example where there is a difference?

hacker: Sure
> (define x 5/7)
> (define y 5/7)
> (eq? x y)
#f
> (eqv? x y)
#t
Rational numbers are implemented in Scheme as boxed objects. We have
two boxed objects with the same contents but pointed by different
pointers. As you can see eq? checks for equality for pointers but eqv?
checks for equality of contents. This difference is true for immutable
objects like numbers and characters but not for other kind of objects.
In your case you should always use = when checking for equality of
numbers
> (define x 5/7)
> (define y 5/7)
> (= x y)
#t
So the lesson is that if you want to check for equality of numbers
always use =.

newbie: What about strings?

hacker: Strings are boxed objects, so
> (define s "hi")
> (define t "hi")
> (eq? s t)
#f
> (equal? s t)
#t
just as you would expect.

newbie: How about symbols?

hacker: Symbols are little special. All symbols are kept in a table. If
you make a symbol that already exist in the system, a pointer to the
same box as last is returned. Hence
> (eq? 'a (string->symbol "a"))
#t
> (eqv? 'a (string->symbol "a"))
#t
> (equal? 'a (string->symbol "a"))
#t
What I said here about symbols also applies to other atomic objects
like booleans
> (define x #t)
> (define y #t)
> (eq? x y)
#t
> (equal? x y)
#t

and the empty list
> (define x '())
> (define y '())
> (eq? x y)
#t
> (equal? x y)
#t

newbie: What about characters?

hacker: The simplest is to use char=?
> (define x #\a)
> (define y #\a)
> (char=? x y)
#t

newbie: What about pairs, lists and vectors?

hacker: They are all boxed objects so use eq? to check if the pointers
are equal and equal? to see if the contents is equal
> (define x (list 1 2 3))
> (define y (list 1 2 3))
> (eq? x y)
#f
> (equal? x y)
#t

newbie: What about procedures?

hacker: These are tricky. They are usually different even if we humans
think of them as same. Consider
(let ((f (lambda (x) x)) (g (lambda (y) y)))
(display (eq? f g))
(display (eqv? f g))
(display (equal? f g)))

produces #f #f #f. So even if we humans think of them as the same
procedure Scheme considers them as different.

newbie: That's strange, I would have expected that at least (equal? f
g) would have evaluated to true, since the syntax is the same.

hacker: Consider
(let ((f (lambda (x) (+ x x))) (g (lambda (y) (* 2 y))))
(display (equal? f g)))

which returns false. We humans know that these procedures are equal but
they have different syntax. So what Scheme does is that it says that if
two procedures are pointed by different pointers then they are not
equal in any sense.

Brian Harvey

unread,

Jan 15, 2006, 12:43:41 PM1/15/06

to

bobu...@yahoo.com writes:
>This is a short summary of the thread "what is eq? used for".

[...]

>hacker: Consider
>(let ((f (lambda (x) (+ x x))) (g (lambda (y) (* 2 y))))
> (display (equal? f g)))
>
>which returns false. We humans know that these procedures are equal but
>they have different syntax. So what Scheme does is that it says that if
>two procedures are pointed by different pointers then they are not
>equal in any sense.

I liked everything except the discussion of procedures. I think it's really
important to maintain the distinction between procedures and functions. As
procedures, these are not the least bit equal. (Imagine, as used to be true
when computers were more CPU-limited than memory-limited, that multiplication
is a lot slower than addition.) *The functions modeled by these procedures*
are equal, but the procedures aren't. That's not a Scheme foible, it's the
Right Thing.

The example of (lambda (x) x) vs. (lambda (y) y) is a very special case
because (as discussed in another recent article) these two procedures happen
to be closed, so we can tell by careful analysis of their texts that they
are equivalent -- they compile into the same machine instructions and don't
depend on any external data. It's a mistake to encourage people to generalize
from this example.

In general, the Really Right Thing would be for EQV? and EQUAL? to return #T
for two procedures if they're EQ?, and to return #[unspecified] otherwise!
I guess Scheme does what it does because #F is a conservative choice that
prevents programs from blowing up if they compare two objects that might or
might not be procedures.

bobu...@yahoo.com

unread,

Jan 15, 2006, 1:24:48 PM1/15/06

to

Very good comments Brian, as always :)

I will change the discussion of procedures to take into account the
distinction between functions and procedures. (One minor trouble with
the word functions is that some programming languages, which our
imaginary newbie might have been exposed to, use the term functions for
procedures).

Brian Harvey wrote:
*The functions modeled by these procedures* are equal, but the
procedures aren't.

I like this way of expressing it. It's to the point.

Brian Harvey wrote

>In general, the Really Right Thing would be for EQV? and EQUAL? to return #T
>for two procedures if they're EQ?, and to return #[unspecified] otherwise!

As I understand it #unspecified means that it's up to implemention to
decide what to return. I've not seen DrScheme return the string
"#unspecified" or something similar, is there such a thing in other
implementations?

Abdulaziz Ghuloum

unread,

Jan 15, 2006, 2:01:27 PM1/15/06

to

bobu...@yahoo.com wrote:

> newbie: What about strings?
>
> hacker: Strings are boxed objects, so
>
>>(define s "hi")
>>(define t "hi")
>>(eq? s t)
>
> #f
>
>>(equal? s t)
>
> #t
> just as you would expect.

A Scheme implementation is allowed to commonize constants. The string
"hi" is a constant in scheme (constants are immutable). In the example
above, the implementation is allowed to make s and t be the exact same
objects (or the same box as you put it), thus eq? may return #t on some
implementations. This is not commonly done however.

Another example is the following:
(let ([x '(a b c)] [y '(b c)])
(eq? (cdr x) y))
Here, x and y are both constants and the implementation is free to share
the common parts.

You can fix your example as follows:
> (define s (string #\h #\i))
> (define t (string #\h #\i))

> (eq? s t)
#f
> (equal? s t)
#t

This is because the procedure string is required to return a fresh
object everytime it is invoked with arguments. If invoked with no
arguments, the implementation may return the same empty string everytime
it is invoked since the empty string is immutable. Same goes for
calling (vector), (make-string 0), and (make-vector 0).

Aziz,,,

H.

unread,

Jan 15, 2006, 6:59:09 PM1/15/06

to

>>This is a short summary of the thread "what is eq? used for"

and an excellent one at that. I bookmarked it for future reference.

Brian Harvey

unread,

Jan 16, 2006, 1:08:08 AM1/16/06

to

bobu...@yahoo.com writes:
>distinction between functions and procedures. (One minor trouble with
>the word functions is that some programming languages, which our
>imaginary newbie might have been exposed to, use the term functions for
>procedures).

Okay, "mathematical functions." :-)

>As I understand it #unspecified means that it's up to implemention to
>decide what to return. I've not seen DrScheme return the string
>"#unspecified" or something similar, is there such a thing in other
>implementations?

Yeah, STk does this iirc; it's a literal reading of the standard. :-)
It has the advantage that it dissuades you from writing code that depends
on the particular return value that some particular implementation gives
you in one of those situations.

bobu...@yahoo.com

unread,

Jan 16, 2006, 5:36:51 AM1/16/06

to

Here's an update of the last part concerning procedures

newbie: What about procedures?

hacker: These are tricky because it involves the question of when two
procedures are equal. The problem is that we humans think in terms of
mathematical functions when we create procedures. For instance the two
mathematical functions f(x)=x+x and g(x)=2*x are equal but if we
implement them as procedures we do it by
(define (f x) (+ x x))
(define (g x) (* 2 x))

Now even though the mathematical functions f(x) and g(x) are equal the
procedures f and g aren't. A procedure is a number of steps that the
computer performs. And the steps that the computer performs when adding
is different from the steps it performs when it multiplies.
newbie: Couldn't you say that two procedures are equal if they return
the same output for the same input.
hacker: It would be extremely hard thing to do, even for simple
functions and in fact there are theorems that say that it is
*impossible* in majority of cases.
newbie: So what answer does Scheme give when you compare two functions
for equality.
hacker: Procedures are boxed. So Scheme looks if the pointers to the
two procedures are equal or not. If the pointers are equal Scheme says
that the functions are equal otherwise it says that the functions are
not equal. Here are some examples:
> (define f cons)
> (define g cons)
> (eq? f g)
#t
> (define (f x) (+ x x))
> (define (g x) (* 2 x))
> (eq? f g)
#f
> (define (f x) (+ x x))
> (define (g x) (+ x x))
> (eq? f g)
#f
> (define (f x) (+ x x))
> (define g f)
> (eq? f g)
#t

Bob

bobu...@yahoo.com

unread,

Jan 16, 2006, 5:48:25 AM1/16/06

to

Abdulaziz wrote:
>A Scheme implementation is allowed to commonize constants. The string
>"hi" is a constant in scheme (constants are immutable).

Are you saying that "hi" is a constant in some implementations are in
all implementations. What's the purpose of making some strings into
constants and what possible rules does an implementation have for
making some strings into constants and not others.

Bob

Ulrich Hobelmann

unread,

Jan 16, 2006, 6:57:24 AM1/16/06

to

I suppose that strings read at compile-time can be made constant, but
probably there are some problems should you ever modify those strings
destructively. Like in C, it's best not to do that :)

--
The problems of the real world are primarily those you are left with
when you refuse to apply their effective solutions.
Edsger W. Dijkstra

Jens Axel Søgaard

unread,

Jan 16, 2006, 9:59:42 AM1/16/06

to

Ulrich Hobelmann wrote:
> bobu...@yahoo.com wrote:
>
>> Abdulaziz wrote:
>>
>>> A Scheme implementation is allowed to commonize constants. The string
>>> "hi" is a constant in scheme (constants are immutable).
>>
>> Are you saying that "hi" is a constant in some implementations are in
>> all implementations. What's the purpose of making some strings into
>> constants and what possible rules does an implementation have for
>> making some strings into constants and not others.
>
> I suppose that strings read at compile-time can be made constant, but
> probably there are some problems should you ever modify those strings
> destructively. Like in C, it's best not to do that :)

Bob, the question is whether this

(define a "foo")
(define b "foo")
(string-set! a 0 b)
(eq? a b)

is allowed to returned #t.

An optimizing compiler can notice that the same string constant
occurs twice, and let a and b denote the same string. This
saves space, when the program is run.

Luckily section 3.4 of R5RS says:

In many systems it is desirable for constants (i.e. the values of
literal expressions) to reside in read-only-memory. To express this,
it is convenient to imagine that every object that denotes locations
is associated with a flag telling whether that object is mutable or
immutable. In such systems literal constants and the strings returned
by symbol->string are immutable objects, while all objects created by
the other procedures listed in this report are mutable. It is an error
to attempt to store a new value into a location that is denoted by an
immutable object.

and section 4.1.2 in literal expressions say:

As noted in section 3.4, it is an error to alter a constant (i.e. the
value of a literal expression) using a mutation procedure like
set-car! or string-set!.

Thus the result of my snippet above is unspecified. So Ulrich
is spot on, when he says "don't do that".

--
Jens Axel Søgaard

bobu...@yahoo.com

unread,

Jan 16, 2006, 2:01:01 PM1/16/06

to

Brian Harvey wrote:
> The example of (lambda (x) x) vs. (lambda (y) y) is a very special case
> because (as discussed in another recent article) these two procedures happen
> to be closed, so we can tell by careful analysis of their texts that they
> are equivalent -- they compile into the same machine instructions and don't
> depend on any external data. It's a mistake to encourage people to generalize
> from this example.
>

OK, they are closed and we humans can tell that they are same, and if I
understand you correctly Scheme could in theory also tell that they are
same (since they are closed). But still Scheme (at least DrScheme) says
that they are different

> (define (f x) (+ x x))
> (define (g x) (+ x x))
> (eq? f g)
#f

> (equal? f g)
#f

Is it a mistake to say that:

"Although in theory Scheme could analyze closed procedures, it doesn't
do it (at least not yet). Scheme makes it very easy for itself.

Procedures are boxed. So Scheme looks if the pointers to the two
procedures are equal or not. If the pointers are equal Scheme says
that the functions are equal otherwise it says that the functions are
not equal."

Does it encourage peope to make the wrong generalizations?

Bob

Brian Harvey

unread,

Jan 16, 2006, 3:40:17 PM1/16/06

to

bobu...@yahoo.com writes:
>OK, they are closed and we humans can tell that they are same, and if I
>understand you correctly Scheme could in theory also tell that they are
>same (since they are closed). But still Scheme (at least DrScheme) says
>that they are different
>
>> (define (f x) (+ x x))
>> (define (g x) (+ x x))
>> (eq? f g)
>#f
>> (equal? f g)
>#f

Ah, but *those* procedures aren't closed! They contain the free variable +.

It's really quite rare in practice (as opposed to lambda calculus exercises)
to see an interesting closed procedure. That's one reason why I don't think
it's worth talking about this, especially in a text intended for newbies!

For example, here's a well-known nontrivial closed procedure:
(lambda (f) (lambda (x) (f f x)))
Do you really want to use that as an example for beginners? It's useless
in practice unless you happen to be using a Scheme without DEFINE! :-)

>Is it a mistake to say that:
>
>"Although in theory Scheme could analyze closed procedures, it doesn't
>do it (at least not yet). Scheme makes it very easy for itself.
>Procedures are boxed. So Scheme looks if the pointers to the two
>procedures are equal or not. If the pointers are equal Scheme says
>that the functions are equal otherwise it says that the functions are
>not equal."
>
>Does it encourage peope to make the wrong generalizations?

I didn't say "Scheme could analyze closed procedures"; I said it could analyze
trivial cases such as (lambda (x) x) versus (lambda (y) y). And the reason I
said it was to convince *you* that this special case isn't worth talking to
beginners about. Scheme (or any language) could actually do better than this;
it could compile the two candidate procedures into machine language, and if
both the sequence of machine instructions and the lexical environment are the
same, then the procedures could be declared equal. But I wouldn't want to say
*that* to beginners, either!

What I would be willing to say to beginners is this:

Scheme has three kinds of equality for general data (as opposed to specific
types such as numbers):

The very same place in memory (EQ?)
Equal enough so that a Scheme program can't tell the difference (EQV?)
Looks the same when printed (EQUAL?)

The first of these trivially applies to procedures just as to any other data;
if we have two pointers to the one procedure produced by one evaluation of a
lambda expression, those two pointers are EQ?.

The third kind of equality doesn't make much sense for procedures altogether,
because they're not really printable in any interesting sense.

As for the second, what it should mean for procedures is that there is no way
to distinguish the result (both the return value and any changes made to the
environment) of calling one from the result of calling the other. This is, in
general, not a solvable problem; we can't even determine, in general, whether
a given procedure call will return a value at all, let alone whether it will
return the same value as a call to a different procedure. (If you want to
know more about this claim, read SICP exercise 4.15 or look up the Halting
Problem in any automata theory textbook.) Scheme could in principle perform
this analysis in certain special cases, and if it had a three-valued logic
with #T, #F, and #MAYBE, then perhaps it would make sense for EQV? to try,
returning #MAYBE when it can't tell. But in what situation would that be
useful to anyone in practice? So instead EQV? (and also EQUAL?) perform an
easier test that gives a conservative result: return #T if and only if EQ?
would return #T for these two procedures.

------------

That's much more long-winded than what I suggested before, and I wouldn't
bother, myself -- after all, how often do you really want to compare two
procedures anyway? We don't even have ARITY, and you want EQV?? But if
you have to go into it in this depth, that's how to do it.

P.S. Your proposed test says "the functions are equal" instead of
"the procedures are equal." :-(

Adrian Kubala

unread,

Jan 16, 2006, 4:16:04 PM1/16/06

to

On 2006-01-16, bobu...@yahoo.com <bobu...@yahoo.com> wrote:
> Is it a mistake to say that:
> "Although in theory Scheme could analyze closed procedures, it doesn't
> do it (at least not yet).

I think it's safe to say it will NEVER do it. Comparing procedures for
anything other than identity-equality simply doesn't make sense. The
only reason it's not a type error to call equal? on procedures is that
there's some convenience to having equal? be a total function.

Lauri Alanko

unread,

Jan 16, 2006, 5:20:47 PM1/16/06

to

In article <dqh0bh$7et$1...@abbenay.CS.Berkeley.EDU>,

Brian Harvey <b...@cs.berkeley.edu> wrote:
> Ah, but *those* procedures aren't closed! They contain the free variable +.
>
> It's really quite rare in practice (as opposed to lambda calculus exercises)
> to see an interesting closed procedure.

Only because the interesting ones are just a bit too tedious to use in
Scheme. In Haskell, combinators are used _all_ the time.

Lauri

bobu...@yahoo.com

unread,

Jan 16, 2006, 5:57:39 PM1/16/06

to

Brian Harvey wrote:
> I wouldn't bother, myself -- after all, how often do you really want to compare two
> procedures anyway?

Right. So The new much shortened proposal is:

---
newbie: What about procedures?

hacker: There's not much point in asking if two procedures are equal,
apart from the trivial case when their pointers point to the same box.

> (define (f x) (+ x x))

> (define g f)
> (eq? f g)
#t

---

Marcin 'Qrczak' Kowalczyk

unread,

Jan 16, 2006, 6:13:22 PM1/16/06

to

Adrian Kubala <adria...@sixfingeredman.net> writes:

> The only reason it's not a type error to call equal? on procedures
> is that there's some convenience to having equal? be a total
> function.

EQUAL? is not a total function: it may loop on cyclic structures.

--
__("< Marcin Kowalczyk
\__/ qrc...@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/

Adrian Kubala

unread,

Jan 16, 2006, 7:06:41 PM1/16/06

to

On 2006-01-16, Marcin 'Qrczak' Kowalczyk <qrc...@knm.org.pl> wrote:
> Adrian Kubala <adria...@sixfingeredman.net> writes:
>> The only reason it's not a type error to call equal? on procedures
>> is that there's some convenience to having equal? be a total
>> function.
>
> EQUAL? is not a total function: it may loop on cyclic structures.

Hm, I guess we could argue about procedure versus function, but I what I
meant is that it's useful to know that you can pass anything to equal?
and expect to at least get an #f. (Barring cycles.)