Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

ordered hash ?

5 views
Skip to first unread message

itsme213

unread,
Dec 1, 2004, 7:33:48 PM12/1/04
to
Is there a pure-ruby ordered hash? I'm looking for something that will
preserve the order in which key/value pairs were entered. The one on RAA is
buggy http://raa.ruby-lang.org/project/orderedhash/

Will 2.0 come with a good ordered hash?

Thanks


Austin Ziegler

unread,
Dec 1, 2004, 10:30:00 PM12/1/04
to
On Thu, 2 Dec 2004 09:37:44 +0900, itsme213 <itsm...@hotmail.com> wrote:
> Is there a pure-ruby ordered hash? I'm looking for something that will
> preserve the order in which key/value pairs were entered. The one on RAA is
> buggy http://raa.ruby-lang.org/project/orderedhash/

What's buggy about it? I have an ohash in the code to PDF::Writer that
you're welcome to use.

> Will 2.0 come with a good ordered hash?

The native hashing mechanism has been changed to preserve that order, I think.

-austin
--
Austin Ziegler * halos...@gmail.com
* Alternate: aus...@halostatue.ca


itsme213

unread,
Dec 1, 2004, 11:12:12 PM12/1/04
to
> What's buggy about it?

It threw an error on '==', I did not dig much deeper after that.

> I have an ohash in the code to PDF::Writer that
> you're welcome to use.

I'll take a look, thanks.

> > Will 2.0 come with a good ordered hash?
>
> The native hashing mechanism has been changed to preserve that order, I
think.

That's good to know. Is this also in the 1.8 stream, or just 1.9 ?

Gavin Sinclair

unread,
Dec 2, 2004, 1:33:24 AM12/2/04
to
On Thursday, December 2, 2004, 2:30:00 PM, Austin wrote:

>> Will 2.0 come with a good ordered hash?

> The native hashing mechanism has been changed to preserve that order, I think.

That sounds a bit silly. Wouldn't there be a performance overhead
associated with that? Shouldn't that price be paid only by those
people who want that functionality, given that it can be easily
implemented in C or Ruby?

A C-based extension providing an InsertOrderHash, distributed with
Ruby, would be very handy.

Gavin

nobu....@softhome.net

unread,
Dec 2, 2004, 1:46:58 AM12/2/04
to
Hi,

At Thu, 2 Dec 2004 12:30:00 +0900,
Austin Ziegler wrote in [ruby-talk:122119]:


> The native hashing mechanism has been changed to preserve that order, I think.

No.

--
Nobu Nakada


Robert Klemme

unread,
Dec 2, 2004, 5:43:40 AM12/2/04
to

"itsme213" <itsm...@hotmail.com> schrieb im Newsbeitrag
news:Mptrd.59198$KQ2....@fe2.texas.rr.com...

> Is there a pure-ruby ordered hash? I'm looking for something that will
> preserve the order in which key/value pairs were entered. The one on RAA
is
> buggy http://raa.ruby-lang.org/project/orderedhash/

Whenever I hear "ordered hash" then I think of a hash whose keys are
ordered according to a definable order - not necessarily insertion order.
In fact, insertion order seems to me a very special case. Is this really
widely used?

> Will 2.0 come with a good ordered hash?

Dunno.

Kind regards

robert

Michael Neumann

unread,
Dec 2, 2004, 8:23:49 AM12/2/04
to

This works fine for me:

class OrderedHash < Hash
def initialize
@order_of_keys = []
super
end

def []=(key, value)
@order_of_keys.delete(key) if has_key?(key)
@order_of_keys << key
super
end

def each
@order_of_keys.each do |key|
yield key, self[key]
end
end
end


Regards,

Michael


itsme213

unread,
Dec 2, 2004, 1:09:20 PM12/2/04
to

"David G. Andersen" <d...@lcs.mit.edu> wrote in message
> Brian's point is the right one to consider: insertion order, in a
> sense, is merely another key, one that's defined implicitly.

Sure, but then any operation that is sequence-sensitive (it modifies state)
can be shoehorned into implicit ordering keys. This does not make it
necessarily a good way to understand (i.e. model or specify) them, or to
implement them, imho.

stack.push(x)

push order is merely another implicit ordering key?

array merely a map from explicit (except for << and the like) integer keys
to value?

stack merely a map from implicit integer keys to value?


Nikolai Weibull

unread,
Dec 2, 2004, 9:37:34 AM12/2/04
to
* itsme213 <itsm...@hotmail.com> [Dec 02, 2004 14:00]:

> Is there a pure-ruby ordered hash? I'm looking for something that will
> preserve the order in which key/value pairs were entered. The one on
> RAA is buggy http://raa.ruby-lang.org/project/orderedhash/

Having an ordered Hash is like saying that arrays should be indexed by
strings--it's simply not what they are meant to be. My RDSL library
(http://rdsly.rubyforge.org/) contains an implementation of the Treap
ADT. It works like a Hash, but preserves order. It is implemented much
like a Tree but with Heap-like abilities--hence the name. Anyway, the
items are ordered, not by insertion but by comparison. It does not use
a two-level scheme like OrderedHash seems to use, so there's no wasted
memory on keeping ordering information. Also, benchmarks have shown
that it works very fast, even though it's purely ruby.

If a Treap isn't your style, RDSL also includes a SkipList
implementation, that has similar capabilities to that of a Treap; it's
yet another sorted ADT.

Please, if you haven't already commited to some solution, give RDSL a
try. It needs some testing. I'm going to get back to working on it
when I get time, but until then, some love from other users wouldn't
hurt.

End of plug. Thanks,
nikolai


--
::: name: Nikolai Weibull :: aliases: pcp / lone-star / aka :::
::: born: Chicago, IL USA :: loc atm: Gothenburg, Sweden :::
::: page: www.pcppopper.org :: fun atm: gf,lps,ruby,lisp,war3 :::
main(){printf(&linux["\021%six\012\0"],(linux)["have"]+"fun"-97);}


Hugh Sasse Staff Elec Eng

unread,
Dec 2, 2004, 10:17:43 AM12/2/04
to
On Thu, 2 Dec 2004, Austin Ziegler wrote:

> On Thu, 2 Dec 2004 23:37:34 +0900, Nikolai Weibull
> <mailing-lis...@rawuncut.elitemail.org> wrote:
>> * itsme213 <itsm...@hotmail.com> [Dec 02, 2004 14:00]:
>>> Is there a pure-ruby ordered hash? I'm looking for something that will

[...]


>> Having an ordered Hash is like saying that arrays should be indexed by
>> strings--it's simply not what they are meant to be.
>

> Be that as it may, it is still necessary to have an insertion-ordered
> hash-like object.

Is anyone friendly with the Lua crowd? Could they donate us a
table? :-) http://www.lua.org/pil/11.html

Hugh


Austin Ziegler

unread,
Dec 2, 2004, 10:56:32 AM12/2/04
to
On Fri, 3 Dec 2004 00:28:17 +0900, Nikolai Weibull
<mailing-lis...@rawuncut.elitemail.org> wrote:
> * Austin Ziegler <halos...@gmail.com> [Dec 02, 2004 16:00]:

> > > > Is there a pure-ruby ordered hash? I'm looking for something that
> > > > will preserve the order in which key/value pairs were entered. The
> > > > one on RAA is buggy http://raa.ruby-lang.org/project/orderedhash/
> > > Having an ordered Hash is like saying that arrays should be indexed
> > > by strings--it's simply not what they are meant to be.
> > Be that as it may, it is still necessary to have an insertion-ordered
> > hash-like object.
> I just gave you one--you seem to have missed the gist of my posting.
> (You chopped it right off it seems...)

Which one? The Treap you said is comparison ordered, and you didn't
make it clear whether the SkipList is insertion ordered. The order for
PDF::Writer *must* be insertion ordered. I'll happily use RDSL in a
future version of PDF::Writer rather than the hackish ohash
implementation that I have, but ...

David G. Andersen

unread,
Dec 2, 2004, 1:42:59 PM12/2/04
to
On Fri, Dec 03, 2004 at 03:12:42AM +0900, itsme213 scribed:

>
> "David G. Andersen" <d...@lcs.mit.edu> wrote in message
> > Brian's point is the right one to consider: insertion order, in a
> > sense, is merely another key, one that's defined implicitly.
>
> Sure, but then any operation that is sequence-sensitive (it modifies state)
> can be shoehorned into implicit ordering keys. This does not make it
> necessarily a good way to understand (i.e. model or specify) them, or to
> implement them, imho.

I repeat from my earlier message:
"if you gained significant speed or ease of
implementation from using insertion order specifically."

A stack is simple to implement as a list -- much more simple than
a lookup tree -- and O(1) for push/pop operations. A hash provides
O(1) key lookup instead of O(log n) key lookup. So does an array.
These are obvious.

It gets a little less obvious when you start comparing the things
we were _actually_ talking about, which are considerably more
complex.

a) Multiple key ordered data structure
+ General, can be used for arbitrary multiple key lookups
- O(log n) operations
b) Hash + doubly-linked list
+ O(1) operations (_if_ you implement the list traversal right)
- Can only handle insertion order

In many cases, you might be using insertion order as a synonym
for time. If you wanted at some later point to permit
the user to insert objects retroactively, you'd have to
move to an explicit representation. (a) would let you do that;
(b) would not.

In an interpreted, high-level language, I'd probably prefer to
have a good, solid implementation of (a), because it's more useful
to more people, and for the size of things we usually do in Ruby,
O(log n) operations in fast C for N in the thousands issn't too
much slower -- relative to the speed of the entire program running
in ruby -- as using an O(1) operation in C.

(The arguments above should not be construed as suggesting
that it should also replace general hash tables -- they're
more commonly used, and so benefit more from a little attention
to speed).

If this were in the kernel, it would usually be implemented the
other way.

-Dave

--
work: d...@lcs.mit.edu me: d...@pobox.com
MIT Laboratory for Computer Science http://www.angio.net/


gabriele renzi

unread,
Dec 2, 2004, 9:05:45 AM12/2/04
to
Michael Neumann ha scritto:

> itsme213 wrote:
>
>> Is there a pure-ruby ordered hash? I'm looking for something that will
>> preserve the order in which key/value pairs were entered. The one on
>> RAA is
>> buggy http://raa.ruby-lang.org/project/orderedhash/
>>
>> Will 2.0 come with a good ordered hash?
>
>
> This works fine for me:
>

<snipcode>
well, but it works just unless you use a different method to access it
(i.e. shift). We definitely need an Association/Map mixin, imo.

Nikolai Weibull

unread,
Dec 2, 2004, 10:28:17 AM12/2/04
to
* Austin Ziegler <halos...@gmail.com> [Dec 02, 2004 16:00]:

> > > Is there a pure-ruby ordered hash? I'm looking for something that
> > > will preserve the order in which key/value pairs were entered. The
> > > one on RAA is buggy http://raa.ruby-lang.org/project/orderedhash/

> > Having an ordered Hash is like saying that arrays should be indexed
> > by strings--it's simply not what they are meant to be.

> Be that as it may, it is still necessary to have an insertion-ordered
> hash-like object.

I just gave you one--you seem to have missed the gist of my posting.
(You chopped it right off it seems...)

Brian Schröder

unread,
Dec 2, 2004, 12:16:17 PM12/2/04
to
On Fri, 3 Dec 2004 00:59:29 +0900
Glenn Parker <glenn....@comcast.net> wrote:

> Austin Ziegler wrote:
> > On Thu, 2 Dec 2004 23:37:34 +0900, Nikolai Weibull
> >>

> >>Having an ordered Hash is like saying that arrays should be indexed by
> >>strings--it's simply not what they are meant to be.
> >

> > Be that as it may, it is still necessary to have an insertion-ordered
> > hash-like object.
> >

> > I use it in PDF::Writer for page objects that can be referred to
> > meaningfully -- but still render in the order in which they were
> > inserted.
>
> How is an "OrderedHash" different from something we might call a
> "HashedArray", i.e. an array where elements also have a string-like
> address? With this hybrid type, the array-ness and the hash-ness are
> orthogonal and neither takes precedence. To me, this says that the only
> solution (that can maintain an arbitrary ordering) is something like
> Michael's two-layered implementation, where a hash and an array both
> refer to the same set of objects internally. Hashing destroys ordering,
> so you have to pay for it somewhere else.
>
> A more flexible altnerative to Michael's OrderedHash implementation
> might embed an index within each element, or maintain a parallel hash
> with indices, then sort the elements by index on demand. If you access
> the elements in sorted order infrequently, this might be a win for
> performance.
>
> SkipList and related patterns impose an external sort-ordering on the
> hash elements, so they would not suit Austin's requirements.
>

Why is everybody in this thread so fixed on using array and hash. As I see it
the requirements are:

insert(k1, k2, value)
delete_by_k1(k1, value)
delete_by_k2(k2, value)
get_by_k1(k1)
get_by_k2(k2)
each_by_k1

and maybe things like delete_min etc. Any datastructure that fullfills these
purposes fits. If I can construct one by using an Array and a Hash thats good,
but if I can achieve this using a fibonaccy heap or a skiplist or something
else that should not matter.

Maybe we should first get the specifications right? (Austin, what exactly are
your needs, have I hit them more or less?)

kind regards,

Brian

--
Brian Schröder
http://www.brian-schroeder.de/

Charles Mills

unread,
Dec 2, 2004, 11:28:39 AM12/2/04
to
On Dec 2, 2004, at 7:59 AM, Glenn Parker wrote:

> Austin Ziegler wrote:
>> On Thu, 2 Dec 2004 23:37:34 +0900, Nikolai Weibull
>>>

>>> Having an ordered Hash is like saying that arrays should be indexed
>>> by
>>> strings--it's simply not what they are meant to be.

>> Be that as it may, it is still necessary to have an insertion-ordered
>> hash-like object.
>> I use it in PDF::Writer for page objects that can be referred to
>> meaningfully -- but still render in the order in which they were
>> inserted.
>
> How is an "OrderedHash" different from something we might call a
> "HashedArray", i.e. an array where elements also have a string-like
> address?

I think an ordered hash typically keeps track of the insertion order.
So you basically have two ways of accessing elements - by the order of
insertion (index?) and the hash key.

> With this hybrid type, the array-ness and the hash-ness are
> orthogonal and neither takes precedence. To me, this says that the
> only solution (that can maintain an arbitrary ordering) is something
> like Michael's two-layered implementation, where a hash and an array
> both refer to the same set of objects internally. Hashing destroys
> ordering, so you have to pay for it somewhere else.
>

Yeah. Seems like you have to have an array or linked list combined
with a hash table to make this work.
Here is one way to do it:

Have a hash table which maps hash keys to insertion indices like so
(note: key.hash #=> hash key):
Hash Table: { hash key => insertion index, hash key => insertion index }

Then an array, indexed by insertion order which contains key, value
pairs.
Array: [ [ key, value ], [key, value] ]

So to find an element you generate its hash key from the given key.
Then get the insertion index from the hash table (may be the wrong
insertion index), if it exists. Check to make sure it is the right
insertion index by comparing the key at the insertion index to the key
you were given. If it is then you return the value, if it isn't the
either the hash doesn't contain that key or it is deeper in the hash
bucket.

You don't have to do it the way described above, but it has some
advantages. The main one being you can go both ways insertion index =>
key and key => insertion index. This is a very useful property. The
other big advantage is iteration is very fast. You have everything you
need for any type of iteration in the array - the key, the value, and
the insertion index.

> A more flexible altnerative to Michael's OrderedHash implementation
> might embed an index within each element, or maintain a parallel hash
> with indices, then sort the elements by index on demand. If you
> access the elements in sorted order infrequently, this might be a win
> for performance.

You could also include a reorder function, it would be kind of
expensive, but it is possible because you can go both ways insertion
index <=> key.

> SkipList and related patterns impose an external sort-ordering on the
> hash elements, so they would not suit Austin's requirements.
>

-Charlie

Glenn Parker

unread,
Dec 2, 2004, 10:59:29 AM12/2/04
to
Austin Ziegler wrote:
> On Thu, 2 Dec 2004 23:37:34 +0900, Nikolai Weibull
>>
>>Having an ordered Hash is like saying that arrays should be indexed by
>>strings--it's simply not what they are meant to be.
>
> Be that as it may, it is still necessary to have an insertion-ordered
> hash-like object.
>
> I use it in PDF::Writer for page objects that can be referred to
> meaningfully -- but still render in the order in which they were
> inserted.

How is an "OrderedHash" different from something we might call a
"HashedArray", i.e. an array where elements also have a string-like

address? With this hybrid type, the array-ness and the hash-ness are

orthogonal and neither takes precedence. To me, this says that the only
solution (that can maintain an arbitrary ordering) is something like
Michael's two-layered implementation, where a hash and an array both
refer to the same set of objects internally. Hashing destroys ordering,
so you have to pay for it somewhere else.

A more flexible altnerative to Michael's OrderedHash implementation

might embed an index within each element, or maintain a parallel hash
with indices, then sort the elements by index on demand. If you access
the elements in sorted order infrequently, this might be a win for
performance.

SkipList and related patterns impose an external sort-ordering on the

hash elements, so they would not suit Austin's requirements.

--
Glenn Parker | glenn.parker-AT-comcast.net | <http://www.tetrafoil.com/>


David G. Andersen

unread,
Dec 2, 2004, 12:56:17 PM12/2/04
to
On Fri, Dec 03, 2004 at 02:35:42AM +0900, Austin Ziegler scribed:

> On Fri, 3 Dec 2004 02:16:17 +0900, Brian Schr?der
> <ru...@brian-schroeder.de> wrote:
> > Why is everybody in this thread so fixed on using array and hash.
> > As I see it the requirements are:
> >
> > insert(k1, k2, value)
> > delete_by_k1(k1, value)
> > delete_by_k2(k2, value)
> > get_by_k1(k1)
> > get_by_k2(k2)
> > each_by_k1
>
> Actually, my needs are:
>
> insert(key, value)
> delete(key)
> get(key)
> each_by_insertion_order
>
> I don't want to manage two sets of keys.

Which, if you had Brian's mechanism as a base, you could
implement as:

class OrderedThingy < BriansMultiKeyThingy
def initialize
@sequence = 0
end
def insert(key, value)
# XXX: Note lack of thread safety here.
super(@sequence, key, value)
@sequence += 1
end
def delete(key)
delete_by_k2(key)
end
def get(key)
get_by_k2(key)
end
def each_by_insertion_order(&proc)
each_by_k1(&proc)
end
end

Brian's point is the right one to consider: insertion order, in a

sense, is merely another key, one that's defined implicitly. The
only reason to not use the general case were if you gained significant


speed or ease of implementation from using insertion order specifically.

Insertion order _is_ particularly easy to manage using a linked
list in a way that the doubly keyed implementation above isn't
(though the naive list.delete operation has to go - you'd be better
off maintaining a doubly linked list so that you can do the delete
from the key), but unless you're super concerned with performance,
the generality of the above implementation is a pretty nice thing
to have.

Austin Ziegler

unread,
Dec 2, 2004, 12:35:42 PM12/2/04
to
On Fri, 3 Dec 2004 02:16:17 +0900, Brian Schröder
<ru...@brian-schroeder.de> wrote:
> Why is everybody in this thread so fixed on using array and hash.
> As I see it the requirements are:
>
> insert(k1, k2, value)
> delete_by_k1(k1, value)
> delete_by_k2(k2, value)
> get_by_k1(k1)
> get_by_k2(k2)
> each_by_k1

Actually, my needs are:

insert(key, value)
delete(key)
get(key)
each_by_insertion_order

I don't want to manage two sets of keys.

Basically, I need to be able to manipulate a number of objects on
the page queue, but they have to be rendered in the order inserted.
I handle other issues through transaction support via
Transaction::Simple.

itsme213

unread,
Dec 2, 2004, 10:21:08 AM12/2/04
to
> Whenever I hear "ordered hash" then I think of a hash whose keys are
> ordered according to a definable order - not necessarily insertion order.

You are right.

> In fact, insertion order seems to me a very special case. Is this really
> widely used?

It's not uncommon. And not meant to replace comparison -based ordered ADTs.


Austin Ziegler

unread,
Dec 2, 2004, 9:49:47 AM12/2/04
to
On Thu, 2 Dec 2004 23:37:34 +0900, Nikolai Weibull
<mailing-lis...@rawuncut.elitemail.org> wrote:
> * itsme213 <itsm...@hotmail.com> [Dec 02, 2004 14:00]:
> > Is there a pure-ruby ordered hash? I'm looking for something that will
> > preserve the order in which key/value pairs were entered. The one on
> > RAA is buggy http://raa.ruby-lang.org/project/orderedhash/
>
> Having an ordered Hash is like saying that arrays should be indexed by
> strings--it's simply not what they are meant to be.

Be that as it may, it is still necessary to have an insertion-ordered
hash-like object.

I use it in PDF::Writer for page objects that can be referred to
meaningfully -- but still render in the order in which they were
inserted.

-austin

itsme213

unread,
Dec 2, 2004, 4:51:40 PM12/2/04
to
> I have an ohash in the code to PDF::Writer that
> you're welcome to use.

I may have to add some Hash conveniences, like default blocks. Thanks.


Nikolai Weibull

unread,
Dec 2, 2004, 4:09:12 PM12/2/04
to
* Austin Ziegler <halos...@gmail.com> [Dec 02, 2004 17:00]:

> > I just gave you one--you seem to have missed the gist of my posting.
> > (You chopped it right off it seems...)

> Which one? The Treap you said is comparison ordered, and you didn't
> make it clear whether the SkipList is insertion ordered. The order for
> PDF::Writer *must* be insertion ordered. I'll happily use RDSL in a
> future version of PDF::Writer rather than the hackish ohash
> implementation that I have, but ...

Hm, OK, sorry. I misunderstood. I thought you wanted comparison.

nobu....@softhome.net

unread,
Dec 2, 2004, 9:38:24 PM12/2/04
to
Hi,

At Thu, 2 Dec 2004 15:51:46 +0900,
Yukihiro Matsumoto wrote in [ruby-talk:122141]:
> Nobu himself made a patch to preserve hash order. I have not decided
> yet to merge it. The only concern is performance.

The performance wouldn't increase for insertion, iteration and
lookup, but do for direct deleting (except for st_delete_safe).
And the memory usage increases 2 pointers for each hash
entries.

--
Nobu Nakada


Michael Neumann

unread,
Dec 3, 2004, 6:09:55 AM12/3/04
to

Hm, 2 pointers for each hash entry, that's an increase of 50%. On the
other side, 8 bytes more per hash entry isn't very much (if it's useful
enough).

Regards,

Michael


Jim Crigler

unread,
Dec 2, 2004, 4:03:50 PM12/2/04
to
Something like this (hoping GG doesn't mess up the indenting):

# Assumes you're not initializing with values at the beginning, i.e.
# h = InsertOrderHash.new
# Extension to the general case is left as an exercise for the reader
# (I've always wanted to say that ;-)
class InsertOrderHash < Hash
def initialize(*args)
@insert_order = Array.new
end
def []=(key, val)
@insert_order.push key unless self.key? key
super
end
def each # Assumes block follows
@insert_order.each do |key|

Robert Klemme

unread,
Dec 3, 2004, 7:20:45 AM12/3/04
to

"Hal Fulton" <hal...@hypermetrics.com> schrieb im Newsbeitrag
news:41AFA897...@hypermetrics.com...

> Robert Klemme wrote:
> > "itsme213" <itsm...@hotmail.com> schrieb im Newsbeitrag
> > news:Mptrd.59198$KQ2....@fe2.texas.rr.com...
> >
> >>Is there a pure-ruby ordered hash? I'm looking for something that will
> >>preserve the order in which key/value pairs were entered. The one on
RAA
> >
> > is
> >
> >>buggy http://raa.ruby-lang.org/project/orderedhash/
> >
> >
> > Whenever I hear "ordered hash" then I think of a hash whose keys are
> > ordered according to a definable order - not necessarily insertion
order.
> > In fact, insertion order seems to me a very special case. Is this
really
> > widely used?
> >
>
> I'll reply to this, since an ordered hash is on my wish list.
>
> First of all, if it HAS an order, then it can be sorted (reordered)
> in any way you want. The (first) important thing to me is that it
> HAVE an order.

Definitely!

> Secondly, insertion order should IMO be the default -- because of
> hash literals, if for no other reason.

Hm, I'd prefer the "natural" order of key elements, i.e. the order defined
by <=>. The reason being that you can get away with no additional data
stored. If you make insertion oder the default order you're likely going
to have to wrap keys with some kind of other object that maintains an
insertion count (which then is used for ordering) and references the
original key. If you base order on the key values only you can safe all
the hassle with maintaining insertion oder information, taking care of key
comparisons, hash values etc.

> If I say {this=>1, that=>2, other=>3} then that is what I expect
> (assuming there is any order at all). I would not want my hash
> literals scrambled (assuming some ordering, but one not based on
> insertion).

I can understand that from the point of view of source code, but in which
situations in applications does this really matter?

> But as long as a hash HAS an order, you can order it any way you
> like -- just as we can sort an array any way we like.

Yep, I'd do it similar to java.util.TreeMap - set a block as comparator
during hash creation that receives two values and returns -1,0 or 1 (same
as <=> does).

> And yes, I know there are fundamental differences between hashes
> and arrays. ;)

What differences? Just kiddin'... I'm in a strange mood today. :-)

Kind regards

robert

Andrew Johnson

unread,
Dec 2, 2004, 8:10:04 PM12/2/04
to


There are many more methods than just #[]= and #each that should be
(re)defined for an insert-order Hash --- including #delete, #delete_if,
#clear, #keys, #values, and others. The OrderedHash on RAA goes a
long way towards (re)defining the methods that might make sense
for such an object.

Fixing (or overriding) the #== method (depending on the semantics
you desire) would probably be a trivial matter (adding a check
that the second hash is also an OrderedHash seems a likely choice). That
module's author has been quick to respond to bug reports/fixes in the
past.

regards,
andrew

gabriele renzi

unread,
Dec 2, 2004, 7:14:50 PM12/2/04
to
Nikolai Weibull ha scritto:

> * itsme213 <itsm...@hotmail.com> [Dec 02, 2004 14:00]:
>
>>Is there a pure-ruby ordered hash? I'm looking for something that will
>>preserve the order in which key/value pairs were entered. The one on
>>RAA is buggy http://raa.ruby-lang.org/project/orderedhash/
>
>
> Having an ordered Hash is like saying that arrays should be indexed by
> strings--it's simply not what they are meant to be.

<kidding>
but sometimes they are :)
>> a=[1,2]
=> [1, 2]
>> b=['one' ,'two']
=> ['one', 'two']
>> c=a,b
=> [[1, 2], ['one', 'two']]
>> c.assoc 'one'
=> ['one', 'two']
</kidding>

Hal Fulton

unread,
Dec 2, 2004, 6:43:27 PM12/2/04
to
Robert Klemme wrote:
> "itsme213" <itsm...@hotmail.com> schrieb im Newsbeitrag
> news:Mptrd.59198$KQ2....@fe2.texas.rr.com...
>
>>Is there a pure-ruby ordered hash? I'm looking for something that will
>>preserve the order in which key/value pairs were entered. The one on RAA
>
> is
>
>>buggy http://raa.ruby-lang.org/project/orderedhash/
>
>
> Whenever I hear "ordered hash" then I think of a hash whose keys are
> ordered according to a definable order - not necessarily insertion order.
> In fact, insertion order seems to me a very special case. Is this really
> widely used?
>

I'll reply to this, since an ordered hash is on my wish list.

First of all, if it HAS an order, then it can be sorted (reordered)
in any way you want. The (first) important thing to me is that it
HAVE an order.

Secondly, insertion order should IMO be the default -- because of


hash literals, if for no other reason.

If I say {this=>1, that=>2, other=>3} then that is what I expect


(assuming there is any order at all). I would not want my hash
literals scrambled (assuming some ordering, but one not based on
insertion).

But as long as a hash HAS an order, you can order it any way you


like -- just as we can sort an array any way we like.

And yes, I know there are fundamental differences between hashes
and arrays. ;)


Hal

Robert Klemme

unread,
Dec 3, 2004, 9:07:41 AM12/3/04
to

"David A. Black" <dbl...@wobblini.net> schrieb im Newsbeitrag
news:Pine.LNX.4.44.0412030519240.28725-100000@wobblini...
> Hi --

>
> On Fri, 3 Dec 2004, Robert Klemme wrote:
>
> > Hm, I'd prefer the "natural" order of key elements, i.e. the order
defined
> > by <=>.
>
> The problem is, though, that keys can be anything, including things
> that don't respond to <=> or that can't be meaningfully compared with
> other keys.

But that's okay: you'll get bitten by an exception if you insert such
keys. It will work perfectly for the most common key types (I guess these
are String and maybe Fixnum). If you want such keys then you can easily
define your own ordering with the lambda you provide.

class Foo
attr_accessor :bar
end

oh = OrderedHash( lambda {|foo1,foo2| foo1.bar.size <=> foo2.bar.size} )
oh[Foo.new]="bar"
...

Kind regards

robert

Robert Klemme

unread,
Dec 3, 2004, 9:40:47 AM12/3/04
to

"David A. Black" <dbl...@wobblini.net> schrieb im Newsbeitrag
news:Pine.LNX.4.44.0412030621130.4637-100000@wobblini...

> Hi --
>
> On Fri, 3 Dec 2004, Robert Klemme wrote:
>
> >
> > "David A. Black" <dbl...@wobblini.net> schrieb im Newsbeitrag
> > news:Pine.LNX.4.44.0412030519240.28725-100000@wobblini...
> > > Hi --
> > >
> > > On Fri, 3 Dec 2004, Robert Klemme wrote:
> > >
> > > > Hm, I'd prefer the "natural" order of key elements, i.e. the order
> > defined
> > > > by <=>.
> > >
> > > The problem is, though, that keys can be anything, including things
> > > that don't respond to <=> or that can't be meaningfully compared
with
> > > other keys.
> >
> > But that's okay: you'll get bitten by an exception if you insert such
> > keys. It will work perfectly for the most common key types (I guess
these
> > are String and maybe Fixnum). If you want such keys then you can
easily
> > define your own ordering with the lambda you provide.
>
> There may be a lot of hashes with such keys, but I wouldn't describe
> that uniformity as more "natural" to a hash than the act of key/value
> insertion itself.

I meant "natural" to refer to the ordering of the instance class. I guess
I didn't make that clear enough. But yes, I'd use that as the default
ordering of a hash, too. :-)

> I guess I'd prefer to see hash order be a
> characteristic of every hash (without an exception raised), regardless

Please, not of every Hash - just the OrderedHashes. IMHO space and time
overheads are significant enough to not include this into every hash.

> of the keys, and then have things like ordering by string value, etc.,
> be the more specialized case.

Well, I beg to differ for stated reasons. Let's see what Matz and others
think about this.

Kind regards

robert

Robert Klemme

unread,
Dec 3, 2004, 10:29:56 AM12/3/04
to

"David A. Black" <dbl...@wobblini.net> schrieb im Newsbeitrag
news:Pine.LNX.4.44.0412030658240.26976-100000@wobblini...

> Hi --
>
> On Fri, 3 Dec 2004, Robert Klemme wrote:
>
> > > I guess I'd prefer to see hash order be a
> > > characteristic of every hash (without an exception raised),
regardless
> >
> > Please, not of every Hash - just the OrderedHashes. IMHO space and
time
> > overheads are significant enough to not include this into every hash.
>
> Right, just to the extent that a Hash is Ordered in the first place.
> (I had the feeling Matz was considering having all of them be ordered,
> but that may be wrong and/or a different matter.)

Hm, might be. I can't remember such a statement but then again - I don't
read every single posting. So this might have slipped through my fingers.
Still I don't think it's a good idea to make all hashes insertion ordered.

I could imagine a solution that uses a callback instance for greatest
flexibility. That way even insertion order could be easily realized:


class OrderedHash
include Enumerable

# callbacks for various events
class InstanceCallbacks
def hash_instance(o) o.hash end
def insert(key,val) end
def update(key,val) end
def remove(key,val) end
def compare(a,b) a<=>b end
end

class InsertionOrderCB < InstanceCallbacks
def initialize() @ins = [] end

def insert(key,val) @ins << key end
def remove(key,val) @ins.delete key end

def compare(a,b)
return 0 if a <=> == 0

@ins.each do |x|
if a <=> x == 0
return -1
elsif b <=> x == 0
return 1
end
end

raise "Internal Error"
end
end

def initialize(ih = InsertionOrderCB.new,&b)
@ih = ih
# ...
end

def store(key,val)
# use @ih instance handler to determine order...
end

def delete(key)
# ...
end
end


Kind regards

robert

David A. Black

unread,
Dec 3, 2004, 9:24:23 AM12/3/04
to
Hi --

On Fri, 3 Dec 2004, Robert Klemme wrote:

>
> "David A. Black" <dbl...@wobblini.net> schrieb im Newsbeitrag
> news:Pine.LNX.4.44.0412030519240.28725-100000@wobblini...
> > Hi --
> >
> > On Fri, 3 Dec 2004, Robert Klemme wrote:
> >
> > > Hm, I'd prefer the "natural" order of key elements, i.e. the order
> defined
> > > by <=>.
> >
> > The problem is, though, that keys can be anything, including things
> > that don't respond to <=> or that can't be meaningfully compared with
> > other keys.
>
> But that's okay: you'll get bitten by an exception if you insert such
> keys. It will work perfectly for the most common key types (I guess these
> are String and maybe Fixnum). If you want such keys then you can easily
> define your own ordering with the lambda you provide.

There may be a lot of hashes with such keys, but I wouldn't describe


that uniformity as more "natural" to a hash than the act of key/value

insertion itself. I guess I'd prefer to see hash order be a


characteristic of every hash (without an exception raised), regardless

of the keys, and then have things like ordering by string value, etc.,
be the more specialized case.


David

--
David A. Black
dbl...@wobblini.net

David A. Black

unread,
Dec 3, 2004, 9:59:24 AM12/3/04
to
Hi --

On Fri, 3 Dec 2004, Robert Klemme wrote:

> > I guess I'd prefer to see hash order be a
> > characteristic of every hash (without an exception raised), regardless
>
> Please, not of every Hash - just the OrderedHashes. IMHO space and time
> overheads are significant enough to not include this into every hash.

Right, just to the extent that a Hash is Ordered in the first place.


(I had the feeling Matz was considering having all of them be ordered,
but that may be wrong and/or a different matter.)

David A. Black

unread,
Dec 3, 2004, 8:20:37 AM12/3/04
to
Hi --

On Fri, 3 Dec 2004, Robert Klemme wrote:

> Hm, I'd prefer the "natural" order of key elements, i.e. the order defined
> by <=>.

The problem is, though, that keys can be anything, including things


that don't respond to <=> or that can't be meaningfully compared with
other keys.

Hal Fulton

unread,
Dec 3, 2004, 11:30:33 AM12/3/04
to

In my opinion, it would be worth it.

I am curious: How often do people use "large" hashes? Nearly all
of mine are under 50 keys, I think.

It becomes more of an issue if the hash has 1000000 keys, of
course.


Hal


David G. Andersen

unread,
Dec 3, 2004, 2:09:12 PM12/3/04
to
On Fri, Dec 03, 2004 at 11:59:24PM +0900, David A. Black scribed:

>
> Right, just to the extent that a Hash is Ordered in the first place.
> (I had the feeling Matz was considering having all of them be ordered,
> but that may be wrong and/or a different matter.)

The point is that a hash ordered by insertion order is pretty
easy to maintain. Your hash table changes from something like:

hash_item *hash_table;

where:

struct hash_item {
something *item_ref;
}

update(new_item) {
/* Yes, yes, you have to do some kind of chaining or
* whatever.. but you understand the point here. :) */
hash_table[ hash_of(new_item) ] = new_item;
}

to:

hash_item *hash_table;
struct hash_item {
something *item_ref;
hash_item *prev, *next;
}
hash_item *head, *tail;

update(new_item) {
hash_table[ hash_of(new_item) ] = new_item;
new_item.prev = tail;
tail = new_item;
}

And, as Nobu pointed out, all of the operations remain O(1),
except, obviously, for each(). They do a little more work, but
not exceedingly more.

If you want to maintain by something other than insertion order,
you have to sort things, which _would_ be likely to slow down the
basic hash table.

Christoph

unread,
Dec 3, 2004, 5:31:05 PM12/3/04
to
Hal Fulton schrieb:

>>
>>
>> The performance wouldn't increase for insertion, iteration and
>> lookup, but do for direct deleting (except for st_delete_safe).
>> And the memory usage increases 2 pointers for each hash
>> entries.
>>
>
> In my opinion, it would be worth it.
>
> I am curious: How often do people use "large" hashes? Nearly all
> of mine are under 50 keys, I think.

I don't use ruby that often these days but mine
were and are often much larger then this

>
> It becomes more of an issue if the hash has 1000000 keys, of
> course.

I am strongly against an insertion-ordered Hash Class (as the
default implementation) . Feature overloading a fundamental data
structure like Hash, instead of defining a separate class like
"InsOrderedHash", to maintain a shallow tree of bases classes,
violates my sense of aesthetics - I would probably seriously consider
changing to the Python-camp:-(

/Christoph


Brian Schröder

unread,
Dec 4, 2004, 4:20:05 AM12/4/04
to

I use hashes as large as this. (For building my chordlist.
http://chordlist.brian-schroeder.de/)

Regards,

David Garamond

unread,
Dec 5, 2004, 6:26:56 AM12/5/04
to
Brian Schröder wrote:
>>It becomes more of an issue if the hash has 1000000 keys, of
>>course.
>
> I use hashes as large as this. (For building my chordlist.
> http://chordlist.brian-schroeder.de/)

Forbidden
You don't have permission to access /index.rb on this server.

--
dave


Brian Schröder

unread,
Dec 5, 2004, 3:42:43 PM12/5/04
to

Thanks for pointing me to it. I just buyed some more space on the webserver,
and the admin messed it up and disabled cgi. Now its working again.

Regards,

Brian

--
Brian Schröder
http://ruby.brian-schroeder.de/

Robert Klemme

unread,
Dec 7, 2004, 7:12:06 AM12/7/04
to

"David G. Andersen" <d...@lcs.mit.edu> schrieb im Newsbeitrag
news:20041203190...@lcs.mit.edu...

Yeah, but space *can* be crucial sometimes. So I'd vote for keeping the
current Hash as it is and maybe adding another class InsertionOrderedHash
(or whatever) that keeps insertion order.

> If you want to maintain by something other than insertion order,
> you have to sort things, which _would_ be likely to slow down the
> basic hash table.

I think for ordered maps usually some kind of tree implementation is
chosen, which naturally is a bit slower than a hash. I think in Java and
C++ (STL) they use red-black-trees.

Kind regards

robert

0 new messages