# Rethinking equality

487 views

### martin odersky

May 7, 2011, 5:56:52 AM5/7/11
to scala-internals
Now that 2.9 is almost out the door, we have the luxury to think of what could come next. One thing I would like to address is equality. The current version did not age well; the longer one looks at it, the uglier it gets. In particular it is a great impediment for DSL design. Witness the recent popularity of === as an alternative equality operator. I previously thought we were stuck with Java-like universal equality for backwards compatibility reasons. But there might be a workable way out of that morass. It works in three steps, which would coincide with the next three major revisions of Scala (yes, sometimes progress has to be slow!)

Step 1:
=====

Introduce the standard Equals type class:

class Equals[T] {
def eql(x: T, y: T)
}

And put the following method in Predef:

@inline def areEqual[T](x: T, y: T)(implicit eq: Equals[T]) = eq.eql(x, y)

Define equality as follows:

X == Y  is equivalent to areEqual(X, Y), if that typechecks, and otherwise equivalent to what it was until now (i.e. universal equality)

[Aside: I note that the spec still defines Any.== to be

if (null eq this) null eq that else this equals that

We should update that to reflect the realities wrt boxed numbers (on the other hand, if we continue down the path I outline, those realities will also change, see below).]

Furthermore, issue a warning if there is an Equals[T] class for one of the types of X and Y, but not the other (in that case we fall back to the default behavior).

The effect of step 1 is just that any implicit Equals definitions override default behavior, so we are backwards compatible.

Step 2:
======

Define equality as follows:

X == Y  is equivalent to areEqual(X, Y), if there is an implicit Equals[T] value for at least one of the types T of X or Y.

Otherwise it is equivalent to universal equality, but in that case a warning is issued.

Step 3:
======

Define equality as follows:

X == Y  is equivalent to areEqual(X, Y).

Here's an example: Let's say we have a class Person with an implicit Equals[Person].
Let's assume:

p, q: Person
a, b: Any

Then we have

Step1           Step2              Step3
p == q   implicit        implicit           implicit

p == a      universal       error              error
a == p   + warning

a == b   universal       universal          error
+ warning

Notes:

1. Of course I assume everywhere that X != Y is !(X == Y).
2. Once we have arrived at step 3, the universal equality logic for boxed numbers would go into
the implicit equalities for these numbers.
3. All logic we have now in the compiler that warns of equalities that are always true or false is no longer needed.

### martin odersky

May 7, 2011, 6:51:44 AM5/7/11
to scala-internals
I just noted that we might want to add a step 0 before the others:

Step 0:

======

Define equality as follows:

X == Y  is equivalent to areEqual(X, Y), if that typechecks, and otherwise equivalent to what it was until now (i.e. universal equality)

No new warnings are given for mixed types.

That step would be strictly backwards compatible, and would let DSLs use == as they need it. So we can introduce this as early as 2.9.1.

Cheers

-- Martin

### √iktor Ҡlang

May 7, 2011, 7:01:56 AM5/7/11
Sounds great.

For collections, methods etc that want to use equality checking will need to have their signatures changed to include : Equals?

Cheers,
--
Viktor Klang,
Director of Research and Development
Scalable Solutions

Code:   github.com/viktorklang

### Matthew Pocock

May 7, 2011, 7:02:14 AM5/7/11
Hi Martin,

This sounds very good to me. Each time I've talked with people about == in scala, the conversation has become derailed by  discussing the Java equals/hashCode contract. Perhaps there would be an argument for a parallel process of shifting hashing into a type class also, so that it is possible to pass around consistent equals/hashcode functions.

trait Hashcode[T] {
def hashcodeOf(x: T): Int
}

trait Equality[T]  extends Hashcode[T] {
def eql(x: T, y: T): Boolean
}

Inescapably, this raises the possibility of shuffling the ordering/ordered/ord/comparable/comparator type class in:

trait Ordering[T] extends Equality[T] {
def lt(x: T, y: T): Boolean
}

I'm not suggesting we pull an entire scalaz-esque type class hierarchy into the core, but hashing, equality and ordering are so intimately related (given the contracts inherited from the JVM and core Java libs) that this looks to me like a golden opportunity to tackle all 3 at once.

Matthew

ps I like your 2.9.1 idea for early introduction
--
Matthew Pocock
(0191) 2566550

### √iktor Ҡlang

May 7, 2011, 7:14:23 AM5/7/11
On Sat, May 7, 2011 at 1:02 PM, Matthew Pocock wrote:
Hi Martin,

This sounds very good to me. Each time I've talked with people about == in scala, the conversation has become derailed by  discussing the Java equals/hashCode contract. Perhaps there would be an argument for a parallel process of shifting hashing into a type class also, so that it is possible to pass around consistent equals/hashcode functions.

trait Hashcode[T] {
def hashcodeOf(x: T): Int
}

trait Equality[T]  extends Hashcode[T] {
def eql(x: T, y: T): Boolean
}

That's a big no-no from me, there's no reason why Equality has anything to do with a hash.
I do, however like to have a similar approach as for Equals for hashcode, so that one can use multiple context bounds for HashMap etc:

class HashMap[A : Equals : HashCode] ...

Inescapably, this raises the possibility of shuffling the ordering/ordered/ord/comparable/comparator type class in:

trait Ordering[T] extends Equality[T] {
def lt(x: T, y: T): Boolean
}

Depends on the intention of "Ordering", if it's absolute ordering or relative ordering.

I'm not suggesting we pull an entire scalaz-esque type class hierarchy into the core, but hashing, equality and ordering are so intimately related (given the contracts inherited from the JVM and core Java libs) that this looks to me like a golden opportunity to tackle all 3 at once.

I agree with the first part, but I think we should solve that and untangle them.

Cheers,

### Jason Zaugg

May 7, 2011, 7:30:24 AM5/7/11
On Sat, May 7, 2011 at 12:51 PM, martin odersky <martin....@epfl.ch> wrote:

I'm very skeptical that we can reconcile the world of nominal
subtyping/polymorphic dispatch with with world of statically bound
ad-hoc polymorphism. Furthermore, shoe-horning this onto the existing

Furthermore, using compiler magic to make it possible ("if there is an
Equals[T] class for one of the types of X and Y, but not the other")
seems like cheating -- perhaps there is a useful tool (exact types?
alternative notions of contravariant specificity?) that should be
exposed in the type system for authors of other type classes that have
the same sort of problems (e.g. Ordering).

Which instances of scala.Equals (BTW, there is already scala.Equiv),
for types in the standard library? For example, how would you defined
Equal[Iterable[A]]?

def iterableEqual[A: Equal]: Equal[Iterable[A]] = ...
def setEqual[A: Equal]: Equal[Iterable[A]] = ...

It's hard to get the same answer for the two calls to areEqual below.
It's not actually clear to me if they should be the same or different.
But if `areEqual` is used to implement '==', I think people will
expect it to work as today.

val s1 = Set(1, 2)
val s1 = Set(2, 1)
val (i1: Iterable[Int], i2: Iterable[Int]) = (s1, s2)
areEqual(s1, s2)
areEqual(i1, i2)

We've had a fair crack at this in Scalaz, and the presence of
subtyping always rains on the type class parade.

-jason

### Matthew Pocock

May 7, 2011, 8:09:07 AM5/7/11
That's a big no-no from me, there's no reason why Equality has anything to do with a hash.

The reason they are linked is that if a.equals(b) then a.hashcode == b.hashcode. I think in my haste I got the inheritance the wrong way about:

trait Equality[T] ...
trait Hashcode[T] extends Equality[T]
trait Ordering[T] extends Equality[T]

I do, however like to have a similar approach as for Equals for hashcode, so that one can use multiple context bounds for HashMap etc:

class HashMap[A : Equals : HashCode] ...

I'm not suggesting we pull an entire scalaz-esque type class hierarchy into the core, but hashing, equality and ordering are so intimately related (given the contracts inherited from the JVM and core Java libs) that this looks to me like a golden opportunity to tackle all 3 at once.

I agree with the first part, but I think we should solve that and untangle them.

Cheers,

--
Viktor Klang,
Director of Research and Development
Scalable Solutions

Code:   github.com/viktorklang

### martin odersky

May 7, 2011, 8:09:08 AM5/7/11
On Sat, May 7, 2011 at 1:30 PM, Jason Zaugg wrote:
On Sat, May 7, 2011 at 12:51 PM, martin odersky <martin....@epfl.ch> wrote:
> I just noted that we might want to add a step 0 before the others:
>
> Step 0:
> ======
>
> Define equality as follows:
>
>   X == Y  is equivalent to areEqual(X, Y), if that typechecks, and otherwise
> equivalent to what it was until now (i.e. universal equality)
>
> No new warnings are given for mixed types.
>
> That step would be strictly backwards compatible, and would let DSLs use ==
> as they need it. So we can introduce this as early as 2.9.1.

I'm very skeptical that we can reconcile the world of nominal
subtyping/polymorphic dispatch with with world of statically bound
ad-hoc polymorphism. Furthermore, shoe-horning this onto the existing

Furthermore, using compiler magic to make it possible ("if there is an
Equals[T] class for one of the types of X and Y, but not the other")
seems like cheating -- perhaps there is a useful tool (exact types?
alternative notions of contravariant specificity?) that should be
exposed in the type system for authors of other type classes that have
the same sort of problems (e.g. Ordering).

I agree it's not pretty, but neither is the status quo. Equality is in a sense special because it is the only universal binary method in Java as well as in Scala. Note that most of the cheating is temporary. After step 3, we can give a nice explanation of == as implicit method injection:

def equalsDecorator[T](x: T)(implicit eqv: Equiv[T]) {
def == (y: T) = eqv.equiv(x, y)
}

Basically, same as is done for === now.

Which instances of scala.Equals (BTW, there is already scala.Equiv),

Right, we should use that. Equals is already taken for something else.

for types in the standard library? For example, how would you defined
Equal[Iterable[A]]?

def iterableEqual[A: Equal]: Equal[Iterable[A]] = ...
def setEqual[A: Equal]: Equal[Iterable[A]] = ...

I see two possibilities that both keep the current run-time behavior of equality on collections:

1. Either define Equiv on the level of Traversable and use dynamic dispatch to the left collection's equals method.

2. Or define Equiv individually on Seq, Map, Set. (so Iterables could not be compared with == after Step 3).

I'd probably use (1) because it is has the best backwards compatibility.

Cheers

-- Martin

### Jason Zaugg

May 7, 2011, 8:21:20 AM5/7/11
On Sat, May 7, 2011 at 2:09 PM, martin odersky <martin....@epfl.ch> wrote:
> I see two possibilities that both keep the current run-time behavior of
> equality on collections:
>
> 1. Either define Equiv on the level of Traversable and use dynamic dispatch
> to the left collection's equals method.
>
> 2. Or define Equiv individually on Seq, Map, Set. (so Iterables could not be
> compared with == after Step 3).
>
> I'd probably use (1) because it is has the best backwards compatibility.

Parameterized by an Equal instance for the element type?

-jason

### Matthew Pocock

May 7, 2011, 9:16:26 AM5/7/11
For e.g. a set, the same element equivalence must be used throughout its life. Otherwise crazy things will happen. For comparing two sets themselves, It only has an obvious meaning to me if both sets are using the same equivalence relation for their members.

case class NameAge(name: String, age: Int)
val byName = Equals.by(_.name)
val byAge = Equals.by(_.age)

val ppl1 = Set(NameAge("matt", 35), NameAge("mark", 35), NameAge("matt", 30))(byName) // probably drops matt,30
val ppl2 = Set(NameAge("matt", 35), NameAge("mark", 35), NameAge("matt", 30))(byAge) // probably drops mark,35

ppl1 == ppl2 // what element-wise eq are we using to check that every member in each set is present in the other?

So, for Set, I think == is only clearly defined if they use the same equivalence relation.

For the more general case of traversable, I think you're looking at something like:

Equals.sameElements[T](x: Traversable[T], y: Traversable[T])(implicit equ: Equals[T]): Equals[Traversable[T]]

However, different applications will probably want to use different notions of equality, and some may want to use different equalities for the *same types* at different places. Set equivalence (they contain only members that are equal to a member in the other set) is different from traversable equivalence (if you zip the two traversables, there are no elements left over, and each pair is equal).

Matthew

-jason

### Tiark Rompf

May 7, 2011, 11:45:25 AM5/7/11
How does this relate to the current implementation in scala-virtualized, which translates a == b to __equal(a,b) ? Is areEqual just an identifier like __equal or is it tied to the symbol Predef.areEqual -- in other words, can DSLs override equality by a mechanism other than the Equiv type class?

A common situation in DSLs is the following:

val a: Rep[Int] = ...
if (a == 0) ....

Or in general, each side of the equation being one of {T, Var[T], Rep[T]} with the desired result to first apply an implicit conversion from T or Var[T]  to Rep[T] and then applying the DSL-defined equality operator. Only if both sides are just T, the regular non-DSL equality should be used.

How will this be achieved with the proposed changes?

Currently, we use overloaded __equals for each combination.

- Tiark

### Paul Phillips

May 7, 2011, 12:14:10 PM5/7/11
On 5/7/11 2:56 AM, martin odersky wrote:
> [Aside: I note that the spec still defines Any.== to be
>
> if (null eq this) null eq that else this equals that
>
> We should update that to reflect the realities wrt boxed numbers (on the
> other hand, if we continue down the path I outline, those realities will
> also change, see below).]

FYI attached is the email I sent in Apr 2010 trying to initiate the
process of getting it specified. I'm afraid to keep looking or I might
find one of my 2009 proposals for using an implicit Equals[T] and
falling back on universal equality.

spec for == and ##.eml

### Paul Phillips

May 7, 2011, 12:15:18 PM5/7/11
Attaching didn't go so well. These computer things are complicated.
Let me try that again.

Subject:
spec for == and ##
From:
Paul Phillips <pa...@improving.org>
Date:
Tue, 13 Apr 2010 12:29:33 -0700
To:
martin odersky <martin....@epfl.ch>
CC:
scala-i...@listes.epfl.ch

Here is a kind of off the top of my head attempt to spec out equality
and hash codes. I don't really speak spec-ese but this is written in
the pidgin spec-ese within my grasp. Does this look approximately
correct? (Anyone else feel free to chime in on that point.) What if
anything would you like me to do with it?

Resolution of x == y
====================

1) Null values will not cause NPEs.

2) Nothing is == to null except null.

3) All objects must be == to themselves.

The first three conditions are summarized in this initial expansion of
'x == y', which the compiler may or may not inline. All user-defined
equals methods are responsible for preserving invariants 2 and 3.

if (x eq y) true
else if (x eq null) false
else // remainder of algorithm

4) If the static type of the left hand side allows for the possibility
that it is a boxed or unboxed primitive numeric type (any of Byte,
Short, Int, Long, Float, Double, or Char) then: go to step 5.

If the static type definitively excludes those types, then: the result
is x.equals(y).

5) If the static types of both operands are primitive types, then: the
result is that of the primitive comparison, exactly as performed in java.

If the static types are identical final types (for instance, both are
java.lang.Longs) then the result is x.equals(y).

In all other cases, both operands are boxed if necessary and a method in
BoxesRunTime is called. (The method will be semantically equivalent to
BoxesRunTime.equals, but a different method may be chosen to avoid
repeating the above tests.)

BoxesRuntime.equals
===================

All of the preceding logic is preserved, and then it proceeds as
follows, where 'x' remains the left hand side operand and 'y' the right.

1) Runtime instance checks will be done to determine the types of the
operands, with the following resolutions. (Resolutions represent the
semantics, not necessarily the implementation.)

1a) If both sides of the comparison are boxed primitives, then they are
unboxed and the primitive comparison is performed as in java.

1b) If 'x' is a class implementing the scala.math.ScalaNumber trait,
then the result is x.equals(y).

1c) If 'x' is a boxed primitive and 'y' is a class implementing the
scala.math.ScalaNumber trait, then the result is y.equals(x).

1d) Otherwise, the result is x.equals(y).

hashCode and ##
===============

The unification of primitives and boxed types in scala necessitates
measures to preserve the equality contract: equal objects must have
equal hash codes. To accomplish this a new method is introduced on Any:

def ##: Int

This method should be called in preference to hashCode by all scala
software which consumes hashCodes. (One need not use or even be aware
of it unless implementing something which depends on hashCodes -- to
define an object's hashCode, overridding hashCode remains the mechanism.)

The default implementation of ## is simply to call hashCode:

def ##: Int = this.hashCode()

In the case of numeric types however, it selectively alters hash codes
to support the == algorithm given above. The guarantees provided by ##
are as follows. "Numbers" are the aforementioned primitives (boxed or
unboxed) and any standard scala library class implementing ScalaNumber.

1) If x and y are whole Numbers in the range Int.MinValue to
Int.MaxValue, then (x == y) implies (x.## == y.##). The value of ## for
all Numbers in that range is equal to the result of .toInt on that Number.

2) If x and y are Numbers and either or both is fractional, then the
guarantee in 1) applies if both are in the range Short.MinValue to
Short.MaxValue.

3) If x and y are Numbers and neither 1) nor 2) applies, the implication
is preserved on a best-effort basis, but cannot be preserved generally
given the fuzziness introduced in primitive equality at the borders.
(For instance given a large Float, a java primitive Float/Double
comparison may return true for 2^10 different Double values, and similar
issues arise with Longs and Doubles.)

-- Paul Phillips | Giving every man a vote has no more made men wise
Moral Alien | and free than Christianity has made them good. Empiricist
| -- H. L. Mencken slap pi uphill! |----------*
http://www.improving.org/paulp/ *----------

### martin odersky

May 7, 2011, 12:35:30 PM5/7/11
On Sat, May 7, 2011 at 5:45 PM, Tiark Rompf wrote:
How does this relate to the current implementation in scala-virtualized, which translates a == b to __equal(a,b) ? Is areEqual just an identifier like __equal or is it tied to the symbol Predef.areEqual -- in other words, can DSLs override equality by a mechanism other than the Equiv type class?

I think areEqual can replace __equal. I.e. it need not be tied to Predef. One thing less to virtualize.

Cheers

-- Martin

### martin odersky

May 7, 2011, 12:37:07 PM5/7/11
Yes, exactly.

-- Martin

### Rüdiger Keller

May 7, 2011, 1:13:36 PM5/7/11
While a type safe equals accompanied by the flexibility of using a
type class sounds very intriguing, I'm not experienced enough with
using type classes to know the negative aspects this may bring. That
said I will just ask what came to my mind:

- What will be the performance implications of this, if any?

- Does it make sense, or is it even possible to have a "fallback"
Equals[Any] that replicates the current behavior, where anything can
be compared? Couldn't that be supplied in Predef during the first
steps as a fallback?

Also there's something I don't really get. Martin writes:

-----

@inline def areEqual[T](x: T, y: T)(implicit eq: Equals[T]) = eq.eql(x, y)

Define equality as follows:

X == Y is equivalent to areEqual(X, Y), if that typechecks, and
otherwise equivalent to what it was until now (i.e. universal
equality)

[snip]

Furthermore, issue a warning if there is an Equals[T] class for one of

the types of X and Y, but not the other (in that case we fall back to
the default behavior).

-----

Regarding the last sentence, how do we separately look at the types of
X and Y when we are using above definition of the areEqual method?
Won't T be inferred as the most specific super type of the two?
Wouldn't that mean we're only looking at whether there is an Equal for
the super type instead of looking at the types of X and Y separately?
Or is it meant that the compiler first searches for Equal instances
for X and Y before rewriting the == using areEqual? But what would
that be good for?

Sorry if I'm not making any sense because of my lack of type class foo.

Regards,
Rüdiger

### martin odersky

May 7, 2011, 1:40:25 PM5/7/11
On Sat, May 7, 2011 at 7:13 PM, Rüdiger Keller wrote:
While a type safe equals accompanied by the flexibility of using a
type class sounds very intriguing, I'm not experienced enough with
using type classes to know the negative aspects this may bring. That
said I will just ask what came to my mind:

- What will be the performance implications of this, if any?

Depends on the details, but I see no great changes either way wrt the current model.

- Does it make sense, or is it even possible to have a "fallback"
Equals[Any] that replicates the current behavior, where anything can
be compared? Couldn't that be supplied in Predef during the first
steps as a fallback?

That's a way to model universal equality, yes.

Also there's something I don't really get. Martin writes:

-----
@inline def areEqual[T](x: T, y: T)(implicit eq: Equals[T]) = eq.eql(x, y)

Define equality as follows:

X == Y  is equivalent to areEqual(X, Y), if that typechecks, and
otherwise equivalent to what it was until now (i.e. universal
equality)

[snip]

Furthermore, issue a warning if there is an Equals[T] class for one of
the types of X and Y, but not the other (in that case we fall back to
the default behavior).
-----

Regarding the last sentence, how do we separately look at the types of
X and Y when we are using above definition of the areEqual method?
Won't T be inferred as the most specific super type of the two?
Wouldn't that mean we're only looking at whether there is an Equal for
the super type instead of looking at the types of X and Y separately?
Or is it meant that the compiler first searches for Equal instances
for X and Y before rewriting the == using areEqual? But what would
that be good for?

I mean that the warning will be issued in Step 1 for those cases there the implicit does not apply, but one of the types of X and Y would have an implicit Equals instance. The warning will be turned into an error in Step 2.

-- Martin

### Stefan Zeiger

May 7, 2011, 4:45:14 PM5/7/11
On 2011-05-07 11:56, martin odersky wrote:
> Introduce the standard Equals type class:
>
> class Equals[T] {
> def eql(x: T, y: T)
> }

What would the return type be? The main reason for not being able to use
== in ScalaQuery is its type (Any, Any) => Boolean. In order to describe
the expression on a basic type T by evaluating a similar expression on a
type Column[T], I need an equality operator with a type (Column[T],
Column[T]) => Column[Boolean].

> And put the following method in Predef:
>
> @inline def areEqual[T](x: T, y: T)(implicit eq: Equals[T]) =
> eq.eql(x, y)

This would also pose problems for DSLs like ScalaQuery which operate on
wrapped values because you couldn't implicitly lift basic values. With
===, I can make these expressions work:

(a:Column[T]) === (b:T) -> a === ConstColumn(b)

(a:T) === (b:Column[T]) -> ConstColumn(a) === b

AFAICT, this wouldn't work with areEqual[T] because T would be inferred
as Any or AnyRef before an implicit conversion could kick in.

To further complicate things, here is the real definition of === from
ScalaQuery:

trait ColumnOps[B1, P1] {
...
def === [P2, R](e: Column[P2])(implicit om: OptionMapper2[B1, B1,
Boolean, P1, P2, R]): Column[R] = ...
}

It depends on another implicit value, OptionMapper2, which also
determines the return type: In a:Column[A] === b:Column[B], if neither A
nor B is an Option type, the result is a Column[Boolean]. If at least
one of A and B is an Option, the result is Column[Option[Boolean]]. In
either case, A and B must have the same base type.

-sz

### Tiark Rompf

May 7, 2011, 7:59:16 PM5/7/11
On May 7, 2011, at 10:45 PM, Stefan Zeiger wrote:

> On 2011-05-07 11:56, martin odersky wrote:
>> Introduce the standard Equals type class:
>>
>> class Equals[T] {
>> def eql(x: T, y: T)
>> }
>
> What would the return type be? The main reason for not being able to use == in ScalaQuery is its type (Any, Any) => Boolean. In order to describe the expression on a basic type T by evaluating a similar expression on a type Column[T], I need an equality operator with a type (Column[T], Column[T]) => Column[Boolean].

I think this could be handled by providing an appropriate overloaded areEqual method, taking an implicit LiftedEquals instance for example that would define eql on Column[T].

Another question: How are case classes going to know how to compare their arguments? Would they need to take one Equiv instance per argument?

- Tiark

### Ismael Juma

Jul 31, 2011, 8:27:13 PM7/31/11
On Sat, May 7, 2011 at 11:51 AM, martin odersky wrote:
That step would be strictly backwards compatible, and would let DSLs use == as they need it. So we can introduce this as early as 2.9.1.

What is the current thinking regarding this? Is there still a plan to introduce something like this in the 2.9.x series?

Best,
Ismael

### martin odersky

Aug 1, 2011, 4:45:07 AM8/1/11
It seems this requires a fudamental rethinking how equality of case classes is handled. That won't happen for any 2.9.x, I think.

Cheers

-- Martin
--
Martin Odersky
Prof., EPFL and Chairman, Typesafe
PSED, 1015 Lausanne, Switzerland
Tel. EPFL: +41 21 693 6863
Tel. Typesafe: +41 21 691 4967

### Ismael Juma

Aug 1, 2011, 4:47:35 AM8/1/11