Should we include the type of a class in the hashCode when using Object.hash (and variants)?


Mateus Felipe Cordeiro Caetano Pinto

Nov 14, 2023, 1:15:04 AM
to Dart Misc
Consider the following code:

void main() {
  const a = A(10, 11);
  const b = B(10, 11);
 
  print(a.hashCode);
  print(b.hashCode);
  print(a.hashCode == b.hashCode);
}

class A {
  const A(this.a, this.b);

  final int a;
  final int b;

  @override
  bool operator ==(Object other) => other is A && other.a == a && other.b == b;

  @override
  int get hashCode => Object.hash(a, b);
}

class B {
  const B(this.a, this.b);

  final int a;
  final int b;

  @override
  bool operator ==(Object other) => other is B && other.a == a && other.b == b;

  @override
  int get hashCode => Object.hash(a, b);
}

Running this code prints two equal integers and then true.

Jacob Bang

Nov 14, 2023, 1:59:19 AM
to mi...@dartlang.org
hashCode is not guaranteed to be unique; it is mostly used by hash
tables to get a good distribution of data between buckets. When
checking whether two objects are equal, you cannot trust hashCode
alone; you should call the == operator.

If you do use hashCode, it should only be a quick way of checking
whether objects might be equal (since different hashCodes mean the
two objects are definitely not equal). But again, if the hashCodes
end up being equal, you need to call == afterwards to be sure.




--
Jacob Bang / julemand101
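
To illustrate the point above, here is a minimal sketch that reuses the classes A and B from the original post (with an explicit `bool` return type added to `==`). It puts one instance of each into the same Set: the hash codes collide, yet the set keeps both elements, because `==` has the final say.

```dart
class A {
  const A(this.a, this.b);

  final int a;
  final int b;

  @override
  bool operator ==(Object other) => other is A && other.a == a && other.b == b;

  @override
  int get hashCode => Object.hash(a, b);
}

class B {
  const B(this.a, this.b);

  final int a;
  final int b;

  @override
  bool operator ==(Object other) => other is B && other.a == a && other.b == b;

  @override
  int get hashCode => Object.hash(a, b);
}

void main() {
  const a = A(10, 11);
  const b = B(10, 11);

  // Both classes feed the same values to Object.hash, so the codes collide.
  print(a.hashCode == b.hashCode); // true

  // The collision does not make the objects equal: == still says no.
  print(a == b); // false

  // A hash-based collection uses hashCode to pick candidates, then ==
  // to decide, so both objects are stored as separate elements.
  final set = <Object>{a, b};
  print(set.length); // 2
  print(set.contains(const A(10, 11))); // true
}
```

In other words, a collision between unrelated types costs at most an extra `==` call; it never makes the collection treat the two objects as the same element.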

Jacob Bang

Nov 14, 2023, 6:20:08 AM
to Mateus Felipe Cordeiro Caetano Pinto, Dart Misc
> 1. Given that, in the example, adding the type to the hash would make the hash codes different, does this imply some potential performance improvement, since it would avoid `==` being called?

I can only see this benefit if you have a Map/Set holding objects of
mixed classes that end up with the same internal values but still
need to be treated as different entities. Personally, I think this is
such a rare case that it is not worth adding the type to the hashsum,
and I have yet to see anybody adding the type to the hashsum.
From a performance perspective it should not really matter, but it
does make the hash more "expensive" to calculate, since you include
yet another parameter that is very likely never to matter. That is
such a minor cost that I would say it can be ignored (unless somebody
comes along and says that computing the hashCode of a Type is a very
expensive operation) :)

> 2. Is there any downside to including the type in the hash? I think the fact that it makes the two hash codes different is either positive or neutral, so what would be the reason not to include it?

I guess one downside could be that it makes it more complicated to
write a subclass that should behave the same as the original class,
but that depends on how you include the class in the hashsum. E.g. if
you include .runtimeType, it would be rather annoying, but if you
manually use something like "AType.hashsum" (a fixed type rather than
the runtime type), then I guess it is less problematic.

But I don't see any benefit really of doing this and you would just
make your code more complicated without any gains. So my own personal
recommendation would be to NOT include the type of your class as part
of hashcode generation.
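
The subclass concern above can be sketched as follows. All class names here are hypothetical and exist only for illustration. The first pair mixes `runtimeType` into the hash; the second pair mixes in a fixed type literal, which is one possible reading of the "AType.hashsum" idea.

```dart
// Variant 1: the hash depends on the runtime type of the instance.
class RuntimeTypeHashed {
  const RuntimeTypeHashed(this.x, this.y);

  final int x;
  final int y;

  @override
  bool operator ==(Object other) =>
      other is RuntimeTypeHashed && other.x == x && other.y == y;

  @override
  int get hashCode => Object.hash(runtimeType, x, y);
}

class RuntimeTypeHashedChild extends RuntimeTypeHashed {
  const RuntimeTypeHashedChild(int x, int y) : super(x, y);
}

// Variant 2: the hash always uses the same type literal, so subclasses
// inherit a hash that agrees with the base class.
class LiteralHashed {
  const LiteralHashed(this.x, this.y);

  final int x;
  final int y;

  @override
  bool operator ==(Object other) =>
      other is LiteralHashed && other.x == x && other.y == y;

  @override
  int get hashCode => Object.hash(LiteralHashed, x, y);
}

class LiteralHashedChild extends LiteralHashed {
  const LiteralHashedChild(int x, int y) : super(x, y);
}

void main() {
  final base1 = RuntimeTypeHashed(1, 2);
  final child1 = RuntimeTypeHashedChild(1, 2);
  print(base1 == child1); // true: == only checks `is RuntimeTypeHashed`
  // Equal objects are supposed to have equal hash codes. Here the runtime
  // types differ, so the two codes will almost certainly differ as well.
  print(base1.hashCode == child1.hashCode);

  final base2 = LiteralHashed(1, 2);
  final child2 = LiteralHashedChild(1, 2);
  print(base2 == child2); // true
  print(base2.hashCode == child2.hashCode); // true: same inputs to the hash
}
```

With the `runtimeType` variant, a subclass that is meant to compare equal to its base class would also need `==` to compare runtime types, or it would have to override `hashCode` again, which is the extra complication described above.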


Mateus Felipe Cordeiro Caetano Pinto

Nov 14, 2023, 6:56:24 AM
to Dart Misc, Jacob Bang
Sorry about my initial message: I accidentally posted it before I had finished writing it, but I think the overall idea came across.

> But again, if the hashCodes end up being equal, you need to call == afterwards to be sure.

I think my question revolves around this. I know that `==` is the "canonical" source of truth regarding equality. However, it's not clear how hash codes are supposed to be used, and even less clear how they ARE used internally by Dart.

Sure, the documentation says that they are used by hash-based data structures like the default Set and Map implementations, but there are no details about how this works. To be more specific, I'm going to split my question in two:

1. Given that, in the example, adding the type to the hash would make the hash codes different, does this imply some potential performance improvement, since it would avoid `==` being called?
2. Is there any downside to including the type in the hash? I think the fact that it makes the two hash codes different is either positive or neutral, so what would be the reason not to include it?

Thanks!


Mateus Felipe Cordeiro Caetano Pinto

Nov 14, 2023, 12:31:21 PM
to Dart Misc, Jacob Bang
> I have yet to see anybody adding the type to the hashsum

I've seen it a few times, and started doing it myself, but reading Flutter's source code I realized they don't, so it was not clear to me whether it was good practice or not.

Anyway, I think your clarifications settle the matter.

Thanks, again!

Bob Nystrom

Nov 14, 2023, 1:46:06 PM
to mi...@dartlang.org, Jacob Bang
On Tue, Nov 14, 2023 at 9:31 AM Mateus Felipe Cordeiro Caetano Pinto <mateu...@gmail.com> wrote:
>> I have yet to see anybody adding the type to the hashsum
>
> I've seen it a few times, and started doing it myself, but reading Flutter's source code I realized they don't, so it was not clear to me whether it was good practice or not.

Like almost all performance questions, the answer is really "it depends".

The more data about an object that goes into its hash code, the less likely there are to be hash collisions. Fewer hash collisions generally make hash tables (sets and maps) perform better because, as you say, they fall back to `==` less often.

But calculating the hash code itself takes time and using the runtime type of an object as part of its hash code can make calculating the hash code significantly slower. Whether that additional cost is more than made up for by fewer collisions will depend very heavily on specifically how fast you can hash a runtime type, how the specific hash table is implemented, and what kinds of objects you are putting into the table. So it's hard to have a blanket 100% correct answer.

A good approximate answer in most cases is that, no, it's not worth hashing the type. I never do in my code.

– bob
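
The trade-off described above can be observed with a small counting experiment. This is only a sketch under stated assumptions: the four key classes, the `equalsCalls` counter and the `countEqualsCalls` helper are hypothetical names invented for this illustration, and the exact counts depend on how the hash table is implemented on a given platform.

```dart
// Global counter bumped by every == call on the key classes below.
int equalsCalls = 0;

// Two unrelated classes whose hash ignores the type: equal fields collide.
class PlainKey {
  PlainKey(this.a, this.b);
  final int a;
  final int b;

  @override
  bool operator ==(Object other) {
    equalsCalls++;
    return other is PlainKey && other.a == a && other.b == b;
  }

  @override
  int get hashCode => Object.hash(a, b);
}

class OtherPlainKey {
  OtherPlainKey(this.a, this.b);
  final int a;
  final int b;

  @override
  bool operator ==(Object other) {
    equalsCalls++;
    return other is OtherPlainKey && other.a == a && other.b == b;
  }

  @override
  int get hashCode => Object.hash(a, b);
}

// Two unrelated classes that mix a type literal into the hash.
class TypedKey {
  TypedKey(this.a, this.b);
  final int a;
  final int b;

  @override
  bool operator ==(Object other) {
    equalsCalls++;
    return other is TypedKey && other.a == a && other.b == b;
  }

  @override
  int get hashCode => Object.hash(TypedKey, a, b);
}

class OtherTypedKey {
  OtherTypedKey(this.a, this.b);
  final int a;
  final int b;

  @override
  bool operator ==(Object other) {
    equalsCalls++;
    return other is OtherTypedKey && other.a == a && other.b == b;
  }

  @override
  int get hashCode => Object.hash(OtherTypedKey, a, b);
}

// Stores one object in a set, looks up another, and reports how many
// == calls the lookup needed.
int countEqualsCalls(Object stored, Object probe) {
  final set = <Object>{stored};
  equalsCalls = 0;
  set.contains(probe);
  return equalsCalls;
}

void main() {
  // Colliding hash codes: the lookup has to fall back to == at least once.
  print(countEqualsCalls(PlainKey(10, 11), OtherPlainKey(10, 11)));

  // Type mixed into the hash: the codes most likely differ, so the lookup
  // can usually reject the probe without calling == (not guaranteed).
  print(countEqualsCalls(TypedKey(10, 11), OtherTypedKey(10, 11)));
}
```

The experiment only counts `==` calls. It does not measure the extra time spent hashing the type, which is the other half of the trade-off, so it says nothing about which variant is faster overall.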


 

James D. Lin

Nov 14, 2023, 3:00:06 PM
to mi...@dartlang.org, Jacob Bang
That two objects with unrelated types produce equal hash codes matters only if you're storing those objects in a heterogeneous hashing collection.  Are you?  Ignoring serialization to/from JSON (where this situation doesn't apply since you don't control the hash codes of any of the elements), I don't think that's a very common thing to do.

- James

