Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
Valid hash keys?
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  14 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Autrijus Tang  
View profile  
 More options Feb 27 2005, 3:50 am
Newsgroups: perl.perl6.language
From: autri...@autrijus.org (Autrijus Tang)
Date: Sun, 27 Feb 2005 16:50:50 +0800
Local: Sun, Feb 27 2005 3:50 am
Subject: Valid hash keys?

Just a quick question: Is Hash keys still Strings, or can they
be arbitary values? If the latter, can Int 2, Num 2.0 and Str "2"
point to different values?

Thanks,
/Autrijus/

  application_pgp-signature_part
< 1K Download

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Luke Palmer  
View profile  
 More options Feb 27 2005, 4:20 am
Newsgroups: perl.perl6.language
From: l...@luqui.org (Luke Palmer)
Date: Sun, 27 Feb 2005 02:20:59 -0700
Local: Sun, Feb 27 2005 4:20 am
Subject: Re: Valid hash keys?

I forgot an important concretity.  Hashes should compare based on the
generic "equal" operator, which knows how to compare apples and apples,
and oranges and oranges, and occasionally a red orange to an apple.

That is:

    3 equal 3              ==> true
    3 equal { codeblock }  ==> false
    3 equal "3"            ==> true

Luke


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Luke Palmer  
View profile  
 More options Feb 27 2005, 4:18 am
Newsgroups: perl.perl6.language
From: l...@luqui.org (Luke Palmer)
Date: Sun, 27 Feb 2005 02:18:06 -0700
Local: Sun, Feb 27 2005 4:18 am
Subject: Re: Valid hash keys?

Autrijus Tang writes:
> Just a quick question: Is Hash keys still Strings, or can they be
> arbitary values?

They can be declared to be arbitrary:

    my %hash is shape(Any);

> If the latter, can Int 2, Num 2.0 and Str "2" point to different
> values?

That's an interesting question.  Some people would want them to, and
some people would definitely not want them to.  I think the general
consensus is that people would not want them to be different, since in
the rest of perl, 2 and "2" are the same.

The object model that I'm working on actually identifies 2 and "2" as
the same object, indistinguishable in every respect.  But that hasn't
been accepted (or even proposed)... yet.

Luke


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Nigel Sandever  
View profile  
 More options Feb 27 2005, 7:19 am
Newsgroups: perl.perl6.language
From: nigelsande...@btconnect.com (Nigel Sandever)
Date: Sun, 27 Feb 2005 12:19:20 GMT
Local: Sun, Feb 27 2005 7:19 am
Subject: Re: Valid hash keys?

I would have assumed a hash who shape was defined as C<Int> to perform the
hashing function directly on the (nominally 32-bit) binary representation Int 2.

Likewise, c<my % hash is shape(Double)>, would perform the hashing on the binary
rep of the (nom.64-bit) Double value.

And C<my %hash is shape(Ref)>, on the address of the key passed?

By extension, a C<%hash is shape( Any )> would hash the binary representation of
whatever (type of) key it was given, which would make keys of 2, 2.0, '2',
'2.0', (Int2)2 etc. all map to different keys.

If C<%hash is shape(Any)> maps all the above representation of 2 to the same
value, then C<my %hash is shape(Any)> becomes a synonym for

C<%hash is shape(Scalar)>.

(is that the same as C<my %hash of Scalar>?).

If my auumption is correct, then that would imply that each type hash or
inherits a .binary or .raw method to allow access to the internal
representation?


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Alex Burr  
View profile  
 More options Feb 27 2005, 2:54 pm
Newsgroups: perl.perl6.language
From: ajb44....@yahoo.com (Alex Burr)
Date: Sun, 27 Feb 2005 19:54:47 +0000
Local: Sun, Feb 27 2005 2:54 pm
Subject: Re: Valid hash keys?

On Sun, Feb 27, 2005 at 02:20:59AM -0700, Luke Palmer wrote:
> I forgot an important concretity.  Hashes should compare based on the
> generic "equal" operator, which knows how to compare apples and apples,
> and oranges and oranges, and occasionally a red orange to an apple.

Um. Hashes don't really compare, though, do they? Maybe you
just mean a notional equals operator, which isn't really used; but
it seems to me that what hashes acutally implement is more of a
'canonicalize' operator. Actually, it would be useful sometimes
to be able to give a hash an explicit canonicalizer:

my %msdos_files is canonicalized_by lc;

my %fractions is canonicalized_by gcd;

Alex


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Luke Palmer  
View profile  
 More options Feb 27 2005, 5:36 pm
Newsgroups: perl.perl6.language
From: l...@luqui.org (Luke Palmer)
Date: Sun, 27 Feb 2005 15:36:42 -0700
Local: Sun, Feb 27 2005 5:36 pm
Subject: Re: Valid hash keys?

I wasn't even thinking about implementation.  Sometimes it's good to let
implementation drive language, but I don't think it's appropriate here.

When we're talking about hashes of everything, there are a couple of
routes we can take.  We can go Java's way and define a .hash method on
every object.  We can go C++'s way and not hash at all, but instead
define a transitive ordering on every object.  We can go Perl's way and
find a string representation of the object and map the problem back to
string hashing, which we can do well.

But the biggest problem is that if the user overloads 'equal' on two
objects, the hash should consider them equal.  We could require that to
overload 'equal', you also have to overload .hash so that you've given
some thought to the thing.  The worry I have is that people will do:

    method hash() { 0 }

But I suppose that's okay.  That just punts the work off to 'equal',
albeit in linear time.

That may be the right way to go.  Use a Javaesque .hash method with a
sane default (an introspecting default, perhaps), and use a sane
equivalent default for 'equal'.  

As far as getting 2, 2.0, and "2" to hash to the same object, well, we
know they're 'equal', so we just need to know how to hash them the same
way.  In fact, I don't believe 2.0 can be represented as a Num.  At
least in Perl 5, it translates itself back to an int.  So we can just
stringify and hash for the scalar types.

Luke


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Thomas Sandlaß  
View profile  
 More options Feb 27 2005, 5:57 pm
Newsgroups: perl.perl6.language
From: Thomas.Sandl...@orthogon.com (Thomas Sandlaß)
Date: Sun, 27 Feb 2005 23:57:30 +0100
Local: Sun, Feb 27 2005 5:57 pm
Subject: Re: Valid hash keys?

Alex Burr wrote:
> [..] Actually, it would be useful sometimes
> to be able to give a hash an explicit canonicalizer:

> my %msdos_files is canonicalized_by lc;

> my %fractions is canonicalized_by gcd;

Shouldn't that be handled by container subclasses of Hash?
Like PersitentScalar or SparseArray?

Regards,
--
TSa (Thomas Sandlaß)


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Nigel Sandever  
View profile  
 More options Feb 27 2005, 5:57 pm
Newsgroups: perl.perl6.language
From: nigelsande...@btconnect.com (Nigel Sandever)
Date: Sun, 27 Feb 2005 22:57:22 GMT
Local: Sun, Feb 27 2005 5:57 pm
Subject: Re: Valid hash keys?

My thought is that if c<my %hash is shape(Any)> uses the stringyfied values of
the keys, then it is no different to C<my %hash>,

I think it would be useful for shape(Any) be different to an ordinary hash, and
hashing the binary representation of the key, so that

(Int)2, (Num)2, (String)2, (uint)2 (uint4)2 etc.

would be a useful way of collating things according to their "type" rather than
their value....?

> Luke

njs.

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Luke Palmer  
View profile  
 More options Feb 27 2005, 6:55 pm
Newsgroups: perl.perl6.language
From: l...@luqui.org (Luke Palmer)
Date: Sun, 27 Feb 2005 16:55:20 -0700
Local: Sun, Feb 27 2005 6:55 pm
Subject: Re: Valid hash keys?

Nigel Sandever writes:
> On Sun, 27 Feb 2005 15:36:42 -0700, l...@luqui.org (Luke Palmer) wrote:
> > As far as getting 2, 2.0, and "2" to hash to the same object, well, we
> > know they're 'equal', so we just need to know how to hash them the same
> > way.  In fact, I don't believe 2.0 can be represented as a Num.  At
> > least in Perl 5, it translates itself back to an int.  So we can just
> > stringify and hash for the scalar types.

> My thought is that if c<my %hash is shape(Any)> uses the stringyfied values of
> the keys, then it is no different to C<my %hash>,

Indeed, but I meant just for our non-reference scalar types, such as
Num, Int, and Str.

> I think it would be useful for shape(Any) be different to an ordinary
> hash, and hashing the binary representation of the key, so that

> (Int)2, (Num)2, (String)2, (uint)2 (uint4)2 etc.

> would be a useful way of collating things according to their "type"
> rather than their value....?

That may indeed be a useful kind of map to have, but it's hardly what
people expect when they declare a hash keyed by Any.  The reason I want
to identify 2 and "2" is because they are identical everywhere else in
the language.

Also, if you hash on the binary representation, what's to say that uint
hashes differently from uint4.  And even if we do guarantee that they
hash differently, there will be collisions.  And then uint(2) equal
uint4(2) (for any reasonable implementation of equal), and they are the
same, but only for collision values.

A better way to make a type-collated hash would be to:

    class TypeHash is Hash[shape => [Class; Any]] {
        method postcircumfix:<{ }> (Any $arg) {
            .SUPER::{$arg.type; $arg}
        }
    }

And if you really think that it's going to be that common, then you can
write a module that implements that.  It would be about five lines long.

Luke


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Rod Adams  
View profile  
 More options Feb 27 2005, 7:42 pm
Newsgroups: perl.perl6.language
From: r...@rodadams.net (Rod Adams)
Date: Sun, 27 Feb 2005 18:42:17 -0600
Local: Sun, Feb 27 2005 7:42 pm
Subject: Re: Valid hash keys?

Luke Palmer wrote:
>The object model that I'm working on actually identifies 2 and "2" as
>the same object, indistinguishable in every respect.

Okay, that's fine, since C< 2 eq "2" > and C< 2 == "2" >. But what about
2.0 and "2.0"?

In Perl5, C< 2.0 == "2.0" >, but  C< 2.0 ne "2.0" >.

-- Rod Adams


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Alex Burr  
View profile  
 More options Feb 27 2005, 7:21 pm
Newsgroups: perl.perl6.language
From: ajb44....@yahoo.com (Alex Burr)
Date: Mon, 28 Feb 2005 00:21:06 +0000
Local: Sun, Feb 27 2005 7:21 pm
Subject: Re: Valid hash keys?

On Sun, Feb 27, 2005 at 11:57:30PM +0100, Thomas Sandlaß wrote:
> Alex Burr wrote:

> >[..] Actually, it would be useful sometimes
> >to be able to give a hash an explicit canonicalizer:

> >my %msdos_files is canonicalized_by lc;

> >my %fractions is canonicalized_by gcd;

> Shouldn't that be handled by container subclasses of Hash?
> Like PersitentScalar or SparseArray?

Possibly. Clearly that's what one would do in any other language.
What I was thinking was that if hashes are going to have a
canonicalizer function *anyway*, maybe the default implementation
could be overridable with a trait (or role?). But I can't
actually claim to have followed perl6 development enough to
be able to argue that it really makes sense.

Alex


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Nigel Sandever  
View profile  
 More options Feb 27 2005, 7:41 pm
Newsgroups: perl.perl6.language
From: nigelsande...@btconnect.com (Nigel Sandever)
Date: Mon, 28 Feb 2005 00:41:43 GMT
Local: Sun, Feb 27 2005 7:41 pm
Subject: Re: Valid hash keys?

On Sun, 27 Feb 2005 15:36:42 -0700, l...@luqui.org (Luke Palmer) wrote:
> Nigel Sandever writes:

> When we're talking about hashes of everything, there are a couple of
> routes we can take.  We can go Java's way and define a .hash method on
> every object.  We can go C++'s way and not hash at all, but instead
> define a transitive ordering on every object.  

The more I think about this, please, no. The reason hashes are called hashes is
because they hash.

If we need bags, sets, or orderThingies with overloadable transitive ordering,
they can be written as classes--that possibly overload the hash syntax

        obj<<key>> = value

or whatever, but don't tack all that overhead on to the basic primitive.

> We can go Perl's way and
> find a string representation of the object and map the problem back to
> string hashing, which we can do well.

The only question is why does the thing that gets hashed have to be stringyfied
first?

In p5, I often use hash{ pack'V', $key } = $value. # or 'd'

1) Because for large hashes using numeric keys it use up less space for the
keys. 4-bytes rather than  10 for 2**32.

2) By using all 256 values of each byte, it tends to spread the keys more even
across fewer buckets;

        use Devel::Size qw[total_size size];
        undef $h{ pack 'V', $_ } for map{ $_ * 10000  } 0 .. 999999;

        print total_size \%h;
        18418136

        print scalar %h;
        292754/524288

versus

        use Devel::Size qw[total_size size];
        undef $h{ $_ } for map{ $_ * 10000  } 0 .. 999999;

        print total_size \%h;
        48083250

        print scalar %h;
        644301/1048576

It would also avoid the need for hacks like Tie:RefHash, by hashing the address
of the ref rather than the stringyfied ref and forcing the key to be stored
twice and the creation of zillions of anonymous arrays to hold the unstrigyfied
ref+value.

The same could be extended to hashing the composite binary representations of
whole structures and objects.

njs.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Alex Burr  
View profile  
 More options Mar 1 2005, 6:51 pm
Newsgroups: perl.perl6.language
From: ajb44....@yahoo.com (Alex Burr)
Date: Tue, 1 Mar 2005 23:51:44 +0000
Local: Tues, Mar 1 2005 6:51 pm
Subject: Re: Valid hash keys?

On Sun, Feb 27, 2005 at 03:36:42PM -0700, Luke Palmer wrote:
> But the biggest problem is that if the user overloads 'equal' on two
> objects, the hash should consider them equal.  We could require that to
> overload 'equal', you also have to overload .hash so that you've given
> some thought to the thing.  The worry I have is that people will do:

>     method hash() { 0 }

This is why I think a  'canonicalize' method is better than a
.hash' method: if canonicalize is defined correctly, then .equal
is just {$a.canon() is_bitwise_identical_to $b.canon()}, so
people don't have to supply both. Quite often when you would ordinarily
define a .equal method, it is of that form anyhow. And then .hash
can be straightforwardly defined as {$_.canon().binaryhash()}, where
binaryhash() just means hash all the memory of this object.

 Stringify would be an instance of a canonicalization function; and
perhaps the default.

Alex


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Luke Palmer  
View profile  
 More options Mar 1 2005, 7:23 pm
Newsgroups: perl.perl6.language
From: l...@luqui.org (Luke Palmer)
Date: Tue, 1 Mar 2005 17:23:27 -0700
Local: Tues, Mar 1 2005 7:23 pm
Subject: Re: Valid hash keys?

Alex Burr writes:
> On Sun, Feb 27, 2005 at 03:36:42PM -0700, Luke Palmer wrote:
> > But the biggest problem is that if the user overloads 'equal' on two
> > objects, the hash should consider them equal.  We could require that to
> > overload 'equal', you also have to overload .hash so that you've given
> > some thought to the thing.  The worry I have is that people will do:

> >     method hash() { 0 }

> This is why I think a  'canonicalize' method is better than a .hash'
> method: if canonicalize is defined correctly, then .equal is just
> {$a.canon() is_bitwise_identical_to $b.canon()}, so people don't have
> to supply both.

Sure.  And you know what's better than all of that?  Transitive ordering
on all objects, so that .equal is just:

    !($a < $b && $b < $a)

But the reason we're not going with that is because it's very hard (both
from a user perspective and computationally) to do that in the general
case.  And the same is true for canonicalization.  Sure, when you have
certain kinds of objects in mind, like strings, or trees, it's very easy
to canonicalize.  But when you have other objects in mind, like
filehandles, graphs, or references to remote objects, it's hard or
impossible to canonicalize.

And in fact, there are some objects for which it's hard or impossible to
define an .equal method.  And those objects can't be hashed, plain and
simple. If you need to hash them, you can define a type of hash that
uses =:= as comparison, so that things are only equal if they are
referentially equal.

But I expect that we have can have a standard role that defines .hash
and .equal in terms of .canonicalize, if it exists.  We can also have a
magical role that defines .canonicalize when it can. If .hash isn't
defined when an object is used as a key of a hash (but .equal is), it
will have to bite the speed bullet and use 0, probably along with a
warning.

Luke


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »