Hashes maintain field order?

1,063 views
Skip to first unread message

tribalvibes

unread,
May 10, 2011, 7:20:41 PM5/10/11
to Redis DB
Are Redis hashes guaranteed to maintain the insertion order of
fields? I.e. is HGETALL guaranteed to return the fields in the order
each is created with multiple HSET/HSETNX/HMSET ops?

This is a very useful property that I'd like to see implemented (or
documented, if it is already implemented this way.) (E.g., ruby 1.9
Hash is an OrderedHash.)

Secondly, does the global keyspace use the same hash implementation as
the hash type and provide the same insertion order guarantees ?

Also by the way, what is the difference really between HSET and HMSET
besides possibly the return value? Why have both? Do you foresee a
polymorphic or "reduced instruction set" Redis design in the future?

Josiah Carlson

unread,
May 11, 2011, 4:42:30 AM5/11/11
to redi...@googlegroups.com
On Tue, May 10, 2011 at 4:20 PM, tribalvibes <m...@tribalvibes.com> wrote:
> Are Redis hashes guaranteed to maintain the insertion order of
> fields?  I.e. is  HGETALL guaranteed to return the fields in the order
> each is created with multiple HSET/HSETNX/HMSET ops?

No.

> This is a very useful property that I'd like to see implemented (or
> documented, if it is already implemented this way.)  (E.g., ruby 1.9
> Hash is an OrderedHash.)

It's not implemented this way, and it seems unlikely at this point
that it would change.

> Secondly, does the global keyspace use the same hash implementation as
> the hash type and provide the same insertion order guarantees ?

Yes. Which is to say that there really aren't any guarantees with
regards to order returned related to insertion (beyond the vagaries of
the hash bucket implementation, the number of hashes, and the
distribution of the hash function).

> Also by the way, what is the difference really between HSET and HMSET
> besides possibly the return value? Why have both? Do you foresee a
> polymorphic or "reduced instruction set" Redis design in the future?

One sets one value, the other sets multiple values. There have been
the addition of "varargs" versions of sadd and others recently, I
personally do not know the plans with regards to potential deprecation
of HMSET, so I will not comment.

Regards,
- Josiah

paulino huerta

unread,
May 11, 2011, 5:04:07 AM5/11/11
to redi...@googlegroups.com

You can also clarify some ideas

http://redis.io/topics/memory-optimization

--Paulino

2011/5/11 tribalvibes <m...@tribalvibes.com>

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.


tribalvibes

unread,
May 11, 2011, 4:27:10 PM5/11/11
to Redis DB
Thanks Paulino, that's insighftul. The client-side sharding is a nice
idea although it spreads rather than encapsulates an implementation
detail. I think we should step back and look at the design and not
only the implementation towards Redis 3.

Hashes that maintain the insertion order of their fields are important
for numerous applications, and notably this is also now the way that
not only Ruby 1.9 but also Javascript de facto maintain their hashes,
see:
http://talklikeaduck.denhaven2.com/2009/04/15/ordered-hashes-in-ruby-1-9-and-javascript
http://www.igvita.com/2009/02/04/ruby-19-internals-ordered-hash/

As it turns out, w.r.t. the small hash optimization, this would not
even add the back-pointer storage overhead. Our case for field
ordering is not so much for individual objects though but rather for
hashes representing object collections ordered by a sort key. So this
would not use the small hash optimization. I suppose we could hack
some workaround using sorted set, and then maintain that workaround in
each of the client libraries...

Another useful primitive operation would be to efficiently return the
field-value pairs of a hash (or sorted set) where the field name
(member) matches a regexp, (as keys * in the global space) and with a
limit/offset/range.

And, when I do something like:
SORT aList BY *->fieldname GET *
or for that matter the interesting:
SORT aList BY NOSORT GET *
I would like it to return each entire hash, all field-value pairs,
rather than having to explicitly enumerate each fieldname, which the
client may not know if the objects are polymorphic.

Conversely, the other opportunity for the small hash optimization is
that a typical use case is that the client already serialize the
objects, e.g. with json or bson. Perhaps this is what you are
effectively doing internally. In which case it would it make sense to
accept bson or json serialized hashes directly from the wire?

I'd rather see a design based on first principles and generic
operations, e.g. as STL, as well as actual use cases, rather than a
heap of one-off tricks and optimizations.

For example, next question is how to maintain a sorted list, with
O(log(N)) insert, find and delete? Perhaps again a hack workaround
with sorted set?

And, perhaps sorted set should really be an ordered map? I.e. not
just a set of value ordered by score, but a map of key => value
ordered by score (or by insertion order.) There have already been
calls for this (e.g., strangely for time series but it is more general
than that.) So this brings us back to OrderedHash as an implementation
of OrderedMap in lua of sorted set.

Sorry if this seems rambling, but I hope you could extract a useful
direction. Maybe now that there is an embedded ruby or js interpreter
some of these things can be implemented server-side through scripting
without inventing yet another ad-hoc language. But sure, I'll code a
stored proc class lib in Lua over SQL any day.


On May 11, 2:04 am, paulino huerta <paulinohue...@gmail.com> wrote:
> You can also clarify some ideas
>
> http://redis.io/topics/memory-optimization<http://www.google.com/url?sa=D&q=http://redis.io/topics/memory-optimi...>

Josiah Carlson

unread,
May 11, 2011, 4:54:25 PM5/11/11
to redi...@googlegroups.com

Redis uses fairly standard C, not C++, and doesn't use the STL at all.

> For example, next question is how to maintain a sorted list, with
> O(log(N)) insert, find and delete?   Perhaps again a hack workaround
> with sorted set?

If you want to implement a new structure, this is trivial. Any sorted
dictionary implementation will do.

> And, perhaps sorted set should really be an ordered map?  I.e. not
> just a set of value ordered by score, but a map of key => value
> ordered by score (or by insertion order.)  There have already been
> calls for this (e.g., strangely for time series but it is more general
> than that.) So this brings us back to OrderedHash as an implementation
> of OrderedMap in lua of sorted set.
>
> Sorry if this seems rambling, but I hope you could extract a useful
> direction. Maybe now that there is an embedded ruby or js interpreter
> some of these things can be implemented server-side through scripting
> without inventing yet another ad-hoc language.  But sure, I'll code a
> stored proc class lib in Lua over SQL any day.

There is no embedded Ruby or JS interpreter, it's Lua, and it's
currently in testing.

If all you care about is insertion order (as opposed to update order,
all of which are very similar), this can be accomplished with 2
additional pointers per hash item, and 1 or 2 additional pointers for
the hash itself. The ideas are 99.9% identical to O(1) LRU updates,
which you can see here:
http://code.activestate.com/recipes/252524-length-limited-o1-lru-cache-implementation/
.

Regards,
- Josiah

tribalvibes

unread,
May 11, 2011, 5:29:25 PM5/11/11
to Redis DB
On May 11, 1:54 pm, Josiah Carlson <josiah.carl...@gmail.com> wrote:
> Redis uses fairly standard C, not C++, and doesn't use the STL at all.

The point here is to be able to abstract the thinking, so that the
implementation follows from a coherent design pattern based on real
use cases, rather than letting the constraints of the current
implementation drive the evolution.

I was suggesting only that C++ STL is designed based on first
principles of computational time-space complexity of data structures
and Redes could do well in a redesign to consider such a generic
approach rather than a proliferation of commands that each only work
with a concrete data type. This has nothing to do with whether you
implement the server core in C, Algol, assembly, pYthon, whatever.

>
> > For example, next question is how to maintain a sorted list, with
> > O(log(N)) insert, find and delete?   Perhaps again a hack workaround
> > with sorted set?
>
> If you want to implement a new structure, this is trivial. Any sorted
> dictionary implementation will do.

Yes, well how hard would it be to implement ordered binary operations
(i.e. the list is implicitly ordered by the client's insertions) on
the current list structure without resorting to an explicitly ordered
structure? So if it's trivial it would be a good fundamental addition,
no?

> If all you care about is insertion order (as opposed to update order,
> all of which are very similar), this can be accomplished with 2
> additional pointers per hash item, and 1 or 2 additional pointers for
> the hash itself. The ideas are 99.9% identical to O(1) LRU updates,
> which you can see here:http://code.activestate.com/recipes/252524-length-limited-o1-lru-cach...
> .
>
> Regards,
>  - Josiah

paulino huerta

unread,
May 11, 2011, 5:47:47 PM5/11/11
to redi...@googlegroups.com


2011/5/11 tribalvibes <m...@tribalvibes.com>

On May 11, 1:54 pm, Josiah Carlson <josiah.carl...@gmail.com> wrote:
> Redis uses fairly standard C, not C++, and doesn't use the STL at all.

The point here is to be able to abstract the thinking, so that the
implementation follows from a coherent design pattern based on real
use cases, rather than letting the constraints of the current
implementation drive the evolution.

I was suggesting only that C++ STL is designed based on first
principles of computational time-space complexity of data structures
and Redes could do well in a redesign to consider such a generic
approach rather than a proliferation of commands that each only work
with a concrete data type.   This has nothing to do with whether you
implement the server core in C, Algol, assembly, pYthon, whatever.
 
Redis is a DSL.
Redis is a protocol
Redis is a server structures (natural structures: sets, lists, variable, hashs, sorted sets, and little else)

The HTTP protocol, is less than 10 commands
GET, HEAD, POST, PUT, DELETE, ...

FTP, more than 60

Where are the problems with the number of commands?

>
> > For example, next question is how to maintain a sorted list, with
> > O(log(N)) insert, find and delete?   Perhaps again a hack workaround
> > with sorted set?
>
> If you want to implement a new structure, this is trivial. Any sorted
> dictionary implementation will do.

Yes, well how hard would it be to implement ordered binary operations
(i.e. the list is implicitly ordered by the client's insertions) on
the current list structure without resorting to an explicitly ordered
structure? So if it's trivial it would be a good fundamental addition,
no?

> If all you care about is insertion order (as opposed to update order,
> all of which are very similar), this can be accomplished with 2
> additional pointers per hash item, and 1 or 2 additional pointers for
> the hash itself. The ideas are 99.9% identical to O(1) LRU updates,
> which you can see here:http://code.activestate.com/recipes/252524-length-limited-o1-lru-cach...
> .
>
> Regards,
>  - Josiah

Josiah Carlson

unread,
May 12, 2011, 12:52:37 AM5/12/11
to redi...@googlegroups.com
On Wed, May 11, 2011 at 2:29 PM, tribalvibes <m...@tribalvibes.com> wrote:
> On May 11, 1:54 pm, Josiah Carlson <josiah.carl...@gmail.com> wrote:
>> Redis uses fairly standard C, not C++, and doesn't use the STL at all.
>
> The point here is to be able to abstract the thinking, so that the
> implementation follows from a coherent design pattern based on real
> use cases, rather than letting the constraints of the current
> implementation drive the evolution.

You must be new here. Salvatore doesn't add new features just because
someone wants them, he waits for multiple people to have similar/same
uses, and considers adding functionality that aren't against what he
considers to be the core ideas behind Redis.

> I was suggesting only that C++ STL is designed based on first
> principles of computational time-space complexity of data structures
> and Redes could do well in a redesign to consider such a generic
> approach rather than a proliferation of commands that each only work
> with a concrete data type.   This has nothing to do with whether you
> implement the server core in C, Algol, assembly, pYthon, whatever.

Indeed, someone could redesign Redis to (for example) not use a
line-based protocol (readline in many languages is horrible), or use a
set structure that doesn't use on average average (at least in older
versions) 86 bytes/per entry (on 32 bit, 96 on 64 bit), or use a
different kind of tree structure for zsets that reduce storage
requirements, or use item freelists, or used tagged strings/ints,
etc., etc., etc.

Redis originally offered features and functionality that weren't
really all that new for people with languages that didn't suck, except
that instead of being exclusive to a single process, it was now shared
among many processes. But Redis is evolving from just a simple data
structure server to more; adding scripting (making it similar in many
respects to traditional databases with PL/SQL), clustering, improved
on-disk synchronization, etc.

You may have a bit of a point with respect to having a generic
approach for a data structure agnostic approach... except that each
structure within Redis have radically different uses, time complexity
for operations, etc., and for the sake of offering the least surprise,
keeps operations only on data of a particular type.

>> > For example, next question is how to maintain a sorted list, with
>> > O(log(N)) insert, find and delete?   Perhaps again a hack workaround
>> > with sorted set?
>>
>> If you want to implement a new structure, this is trivial. Any sorted
>> dictionary implementation will do.
>
> Yes, well how hard would it be to implement ordered binary operations
> (i.e. the list is implicitly ordered by the client's insertions) on
> the current list structure without resorting to an explicitly ordered
> structure? So if it's trivial it would be a good fundamental addition,
> no?

To know the order would take just one list and a little bit of client hacking.

WATCH hash_key
if (HGET hash_key hash_member) is null:
MULTI
LPUSH hash_key:order hash_member
else:
MULTI
HADD hash_key hash_member hash_value
EXEC

This would be made easier with the scripting branch; basically add to
a list if the member that you are adding/manipulating isn't already
there.

- Josiah

>> If all you care about is insertion order (as opposed to update order,
>> all of which are very similar), this can be accomplished with 2
>> additional pointers per hash item, and 1 or 2 additional pointers for
>> the hash itself. The ideas are 99.9% identical to O(1) LRU updates,
>> which you can see here:http://code.activestate.com/recipes/252524-length-limited-o1-lru-cach...
>> .
>>
>> Regards,
>>  - Josiah
>

tribalvibes

unread,
May 12, 2011, 2:11:48 AM5/12/11
to Redis DB
You are correct--it's the lack of genericity that concerns me,
especially as the features and complexity of Redis grow. I think
things could be made more functional by removing features/commands and
instead adding generic ops.

For example, SORT ... STORE does this store only to a list, or could
it store the sorted weights and values to a new sorted set? Does the
stored type depend on the sorted type?

> To know the order would take just one list and a little bit of client hacking.
>
> WATCH hash_key
> if (HGET hash_key hash_member) is null:
>   MULTI
>   LPUSH hash_key:order hash_member
> else:
>   MULTI
> HADD hash_key hash_member hash_value
> EXEC
>
> This would be made easier with the scripting branch; basically add to
> a list if the member that you are adding/manipulating isn't already
> there.

Yes, maintaining a separate list is a fine way to keep the insertion
order of the hash. It might as well be intrinsic. What about delete?
Now it's no longer O(1) in this example. Not that we couldn't live
without delete. But if we wanted to maintain the update order, rather
than the insertion order as you suggested? How would you do that while
maintaining O(1) average insertion cost of hash?

paulino huerta

unread,
May 12, 2011, 4:46:18 AM5/12/11
to redi...@googlegroups.com


2011/5/12 tribalvibes <m...@tribalvibes.com>

You are correct--it's the lack of genericity that concerns me,
especially as the features and complexity of Redis grow. I think
things could be made more functional by removing features/commands and
instead adding generic ops.

For example, SORT ... STORE  does this store only to a list, or could
it store the sorted weights and values to a new sorted set?  Does the
stored type depend on the sorted type?
 
Better if we specify ¿OK?

SORT:
 Sort the elements in a list, set or sorted set
 http://redis.io/commands/sort
STORE:
  not exist

More ...

> To know the order would take just one list and a little bit of client hacking.
>
> WATCH hash_key
> if (HGET hash_key hash_member) is null:
>   MULTI
>   LPUSH hash_key:order hash_member
> else:
>   MULTI
> HADD hash_key hash_member hash_value
> EXEC
>
> This would be made easier with the scripting branch; basically add to
> a list if the member that you are adding/manipulating isn't already
> there.

Yes, maintaining a separate list is a fine way to keep the insertion
order of the hash. It might as well be intrinsic. What about delete?
Now it's no longer O(1) in this example.  Not that we couldn't live
without delete. But if we wanted to maintain the update order, rather
than the insertion order as you suggested? How would you do that while
maintaining O(1) average insertion cost of hash?

--

Josiah Carlson

unread,
May 12, 2011, 2:44:14 PM5/12/11
to redi...@googlegroups.com
On Wed, May 11, 2011 at 11:11 PM, tribalvibes <m...@tribalvibes.com> wrote:
> You are correct--it's the lack of genericity that concerns me,
> especially as the features and complexity of Redis grow. I think
> things could be made more functional by removing features/commands and
> instead adding generic ops.

Sounds like bikeshedding unless you have a specific set of "replace
command X with generic command Y". But again, less surprises are
better.

> For example, SORT ... STORE  does this store only to a list, or could
> it store the sorted weights and values to a new sorted set?  Does the
> stored type depend on the sorted type?

It currently only stores a list. It works on sets and lists as input
(though there is additional functionality offered by the GET and BY
arguments).

>> To know the order would take just one list and a little bit of client hacking.
>>
>> WATCH hash_key
>> if (HGET hash_key hash_member) is null:
>>   MULTI
>>   LPUSH hash_key:order hash_member
>> else:
>>   MULTI
>> HADD hash_key hash_member hash_value
>> EXEC
>>
>> This would be made easier with the scripting branch; basically add to
>> a list if the member that you are adding/manipulating isn't already
>> there.
>
> Yes, maintaining a separate list is a fine way to keep the insertion
> order of the hash. It might as well be intrinsic.

I disagree. Python didn't include an OrderedDict until very recently
when Ruby users started coming to Python. Before then, few people
worried or cared about it not being there. You are coming to Redis
from the same perspective, having had it before, and wanting it here.
You've not asked on the mailing list "can I do X with primitives that
Redis offers", but instead start with "can Redis offer this
functionality that I think I need?"

> What about delete?
> Now it's no longer O(1) in this example.  Not that we couldn't live
> without delete. But if we wanted to maintain the update order, rather
> than the insertion order as you suggested? How would you do that while
> maintaining O(1) average insertion cost of hash?

See http://code.activestate.com/recipes/252524-length-limited-o1-lru-cache-implementation/
.
Your hash should contain the entries of the form: hash_key -> {member
-> value, ...} .
Every member should have a list of the form: hash_key:member ->
[prev_member, next_member]
You keep an extra list of the form: hash_key::fl -> [last_member, first_member]

If you want to get the sequence of operations, you perform HKEYS on
hash_key (or HGETALL if you also want the data), then perform a
pipelined pull of hash_key:<members> and hash_key::fl with LRANGE. You
start with the "first_member" listed in the ::fl list, and order your
entries by following the next_member pointers.

Updates are performed by updating at most 8 previous/next pointers in lists.

The details are tedious, which is why I am not including a race-free
implementation for Redis, but using the Scripting branch should remove
race conditions while offering you what you want.

- Josiah

tribalvibes

unread,
May 12, 2011, 5:11:57 PM5/12/11
to Redis DB
On May 12, 11:44 am, Josiah Carlson <josiah.carl...@gmail.com> wrote:
> Sounds like bikeshedding unless you have a specific set of "replace
> command X with generic command Y". But again, less surprises are
> better.
>
> > For example, SORT ... STORE  does this store only to a list, or could
> > it store the sorted weights and values to a new sorted set?  Does the
> > stored type depend on the sorted type?
>
> It currently only stores a list. It works on sets and lists as input
> (though there is additional functionality offered by the GET and BY
> arguments).

To continue the one-by-one feature approach, we could add a STORE type
specifier, e.g. SORT ... STORE Z:dest
If only the ZSET weights could be alpha as well as double...

> I disagree. Python didn't include an OrderedDict until very recently
> when Ruby users started coming to Python. Before then, few people
> worried or cared about it not being there. You are coming to Redis
> from the same perspective, having had it before, and wanting it here.
> You've not asked on the mailing list "can I do X with primitives that
> Redis offers", but instead start with "can Redis offer this
> functionality that I think I need?"

Well, what if the primitives offered were simply read bit, write bit,
move the tape to the left, move to the right, halt. We could build
anything we need from those primitives, right? And with server-side
scripting we would not have race conditions. But maybe we want to
write our applications and not reinvent the wheel.

So the point is just that Redis is a great idea--a data structure
server that offers certain concurrency/atomicity and in-memory with
backing store benefits, with a tight minimalistic implementation.
Total kudos to Salvatore. Why not make the design choice of data
structures and generic primitives more in tune with the app usage
requirements going forward?

Sure, go ahead and tell us please how would you implement OrderedDict
either from the client side or with Lua scripting? Why ZSET is
implemented with doubles only is I'm guessing merely another
implementation detail that is exposed to the client. Sure, we have
rank list/leaderbord usage too but also would like to order by alpha
key and map to value as OrderedMap.

> > What about delete?
> > Now it's no longer O(1) in this example.  Not that we couldn't live
> > without delete. But if we wanted to maintain the update order, rather
> > than the insertion order as you suggested? How would you do that while
> > maintaining O(1) average insertion cost of hash?
>
> Seehttp://code.activestate.com/recipes/252524-length-limited-o1-lru-cach...
> ...
> The details are tedious, which is why I am not including a race-free
> implementation for Redis, but using the Scripting branch should remove
> race conditions while offering you what you want.

Yes, a lot of tedious (and error prone) details to do something really
simple that should be an intrinsic. It is a design decision and the
reasoning is identical to that for including it in ruby 1.9--the
overhead of adding it is negligible and does not adversely impact
existing users, whereas the benefit is substantial and easily enables
a new class of usage that otherwise would require tedious and error
prone workarounds.

Peace,
TV.

Josiah Carlson

unread,
May 12, 2011, 9:26:06 PM5/12/11
to redi...@googlegroups.com
On Thu, May 12, 2011 at 2:11 PM, tribalvibes <m...@tribalvibes.com> wrote:
> On May 12, 11:44 am, Josiah Carlson <josiah.carl...@gmail.com> wrote:
>> Sounds like bikeshedding unless you have a specific set of "replace
>> command X with generic command Y". But again, less surprises are
>> better.
>>
>> > For example, SORT ... STORE  does this store only to a list, or could
>> > it store the sorted weights and values to a new sorted set?  Does the
>> > stored type depend on the sorted type?
>>
>> It currently only stores a list. It works on sets and lists as input
>> (though there is additional functionality offered by the GET and BY
>> arguments).
>
> To continue the one-by-one feature approach, we could add a STORE type
> specifier, e.g. SORT ... STORE Z:dest
> If only the ZSET weights could be alpha as well as double...

Or it could set the score to a default of "1" like it does during
zinterstore/zunionstore with a standard set as input. But I'm not all
that interested (personally) in talking about changes to commands that
I'm not planning on implementing. Now would be a good time for others
to hop in and discuss this if they are interested.

>> I disagree. Python didn't include an OrderedDict until very recently
>> when Ruby users started coming to Python. Before then, few people
>> worried or cared about it not being there. You are coming to Redis
>> from the same perspective, having had it before, and wanting it here.
>> You've not asked on the mailing list "can I do X with primitives that
>> Redis offers", but instead start with "can Redis offer this
>> functionality that I think I need?"
>
> Well, what if the primitives offered were simply read bit, write bit,
> move the tape to the left, move to the right, halt. We could build
> anything we need from those primitives, right? And with server-side
> scripting we would not have race conditions. But maybe we want to
> write our applications and not reinvent the wheel.

If you are willing to produce a patch, I think that may go a lot
farther towards getting the functionality than just talking about "it
would be great if Redis did ...".

> So the point is just that Redis is a great idea--a data structure
> server that offers certain concurrency/atomicity and in-memory with
> backing store benefits, with a tight minimalistic implementation.
> Total kudos to Salvatore. Why not make the design choice of data
> structures and generic primitives more in tune with the app usage
> requirements going forward?

That's exactly what Salvatore does (looking for multiple apps and
users having the same needs). I may be mistaken, but you are the first
in my memory to ask for the particular feature, and generally
Salvatore doesn't add features for the sake of a single user. Heck,
there haven't even been any others who have thrown in a +1 for this
feature.

> Sure, go ahead and tell us please how would you implement OrderedDict
> either from the client side or with Lua scripting? Why ZSET is
> implemented with doubles only is I'm guessing merely another
> implementation detail that is exposed to the client. Sure, we have
> rank list/leaderbord usage too but also would like to order by alpha
> key and map to value as OrderedMap.

I have already described a schema where you can implement the "node"
as a hash member/value plus a list related to the hash + member, and
the hash itself behaves exactly as the hash does in the LRU class in
the earlier provided link. The actual details for what to do and how
to manipulate them are also at the link. Translating the Python to Lua
for each command is left as an exercise to the reader. I also have a
stripped down C version from 7 years ago that I can put up as a github
gist.

>> > What about delete?
>> > Now it's no longer O(1) in this example.  Not that we couldn't live
>> > without delete. But if we wanted to maintain the update order, rather
>> > than the insertion order as you suggested? How would you do that while
>> > maintaining O(1) average insertion cost of hash?
>>
>> Seehttp://code.activestate.com/recipes/252524-length-limited-o1-lru-cach...
>> ...
>> The details are tedious, which is why I am not including a race-free
>> implementation for Redis, but using the Scripting branch should remove
>> race conditions while offering you what you want.
>
> Yes, a lot of tedious (and error prone) details to do something really
> simple that should be an intrinsic. It is a design decision and the
> reasoning is identical to that for including it in ruby 1.9--the
> overhead of adding it is negligible and does not adversely impact
> existing users, whereas the benefit is substantial and easily enables
> a new class of usage that otherwise would require tedious and error
> prone workarounds.

I disagree. Considering how trivial it is to accomplish the same thing
with a hash + zset, for the O(logn) cost (which disappears in network
latency), the tedious "right way" is easily shrugged off for the
alternate 5 line Lua implementation. Throw in it being tedious for
Salvatore, all of the other more important things on his plate, as
well as what I recognize as a general lack of demand from the masses,
and I can understand why Salvatore didn't do it in the first place and
isn't hopping in to say "heck yeah, love it, I'll build it now".

If you really want the functionality; patches talk. Everything else is
bikeshedding.

Regards,
- Josiah

tribalvibes

unread,
May 13, 2011, 2:01:19 AM5/13/11
to Redis DB
On May 12, 6:26 pm, Josiah Carlson <josiah.carl...@gmail.com> wrote:
> > We could add a STORE type
> > specifier, e.g. SORT ... STORE Z:dest
> Or it could set the score to a default of "1" like it does during
> zinterstore/zunionstore with a standard set as input

OK yes this is trivial with scripting. We can test the performance
difference between SORT + insert sorted copy vs unsorted insert into
the zset.

By the way, what is the limit on the number of items in a list, set
and zset?

> there haven't even been any others who have thrown in a +1 for this
> feature.

That's because no one is following such a noisy thread. If you were to
proactively cull the group though, you'll see dozens of requests for
things that amount to maintaining a sorted index on an attribute--
undoubtedly one of the most common app requirements.

> > Sure, go ahead and tell us please how would you implement OrderedDict
> > either from the client side or with Lua scripting?
> I disagree. Considering how trivial it is to accomplish the same thing
> with a hash + zset, for the O(logn) cost

How about OrderedMultiDict? ;-)

Seriously. Let's make this concrete. Yes OK it's trivial to make a
sorted index with a hash and zset for unique attribute values, but how
about if duplicates are allowed? What would you suggest then?

And considering the need to shard the index with this approach for
more than 2^32 items. Or will cluster remove that size limitation for
hashes (and the other types)?

> If you really want the functionality; patches talk. Everything else is
> bikeshedding.

I am not bikeshedding. I'm trying to design an app and in the process
extend Ohm to overcome its deficiencies. I have specific requirements
in mind and I need "good" scalable and not necessarily bleeding edge
performance (i.e. O(k log n) is ok but O(N) is not.) I'd like to just
use Redis off the shelf and not have to patch it to get reasonable
basic functionality, while still trying to use it in it's "natural"
way to get the benes of its performance optimizations. If only it were
not so opaque what that way is.

Thanks/.tv

Josiah Carlson

unread,
May 13, 2011, 2:55:08 AM5/13/11
to redi...@googlegroups.com
On Thu, May 12, 2011 at 11:01 PM, tribalvibes <m...@tribalvibes.com> wrote:
> On May 12, 6:26 pm, Josiah Carlson <josiah.carl...@gmail.com> wrote:
>> > We could add a STORE type
>> > specifier, e.g. SORT ... STORE Z:dest
>> Or it could set the score to a default of "1" like it does during
>> zinterstore/zunionstore with a standard set as input
>
> OK yes this is trivial with scripting. We can test the performance
> difference between SORT + insert sorted copy  vs unsorted insert into
> the zset.
>
> By the way, what is the limit on the number of items in a list, set
> and zset?

However much memory you have and can address with your architecture.
VM was supposed to alleviate some of it, but it doesn't work as well
as the rest of Redis. Diskstore is supposed to improve it further, but
building a BTree from scratch is not easy (SQLite has a public domain
BTree implementation that is amazing, I don't know why that wasn't
chosen instead).

>> there haven't even been any others who have thrown in a +1 for this
>> feature.
>
> That's because no one is following such a noisy thread. If you were to
> proactively cull the group though, you'll see dozens of requests for
> things that amount to maintaining a sorted index on an attribute--
> undoubtedly one of the most common app requirements.

Since I apparently must have been blind, perhaps you can go through
the last week of archives and discover those dozens of requests and
bring them to Salvatore's attention ;)

>> > Sure, go ahead and tell us please how would you implement OrderedDict
>> > either from the client side or with Lua scripting?
>> I disagree. Considering how trivial it is to accomplish the same thing
>> with a hash + zset, for the O(logn) cost
>
> How about OrderedMultiDict? ;-)
>
> Seriously. Let's make this concrete. Yes OK it's trivial to make a
> sorted index with a hash and zset for unique attribute values, but how
> about if duplicates are allowed? What would you suggest then?

Use lists in the base top-level hash, keyed by hash_name:member_name,
using a zset for ordering again.

> And considering the need to shard the index with this approach for
> more than 2^32 items. Or will cluster remove that size limitation for
> hashes (and the other types)?

There is no such limitation at present, unless you are using a 32 bit
Redis, but then you are running into memory limits and not item number
limits.

>> If you really want the functionality; patches talk. Everything else is
>> bikeshedding.
>
> I am not bikeshedding. I'm trying to design an app and in the process
> extend Ohm to overcome its deficiencies. I have specific requirements
> in mind and I need "good" scalable and not necessarily bleeding edge
> performance (i.e. O(k log n) is ok but O(N) is not.) I'd like to just
> use Redis off the shelf and not have to patch it to get reasonable
> basic functionality, while still trying to use it in it's "natural"
> way to get the benes of its performance optimizations. If only it were
> not so opaque what that way is.

With a hash, set, list, sorted set, and strings, you can build just
about any structure you can imagine (it just takes a bit more
imagination). Some have small-moderate caveats (O(log n) time per call
vs. O(1), more memory, requires explicit locking, etc.), but just
about anything is possible.

- Josiah

Javier Guerra Giraldez

unread,
May 13, 2011, 10:56:50 AM5/13/11
to redi...@googlegroups.com
On Fri, May 13, 2011 at 1:01 AM, tribalvibes <m...@tribalvibes.com> wrote:
>> there haven't even been any others who have thrown in a +1 for this
>> feature.
>
> That's because no one is following such a noisy thread.

I am, but didn't bother to raise my voice simply because i agree with
everybody else: would be nice, but waaay below the priority of having
the basics well tuned. (and no, OrderedDict isn't a basic structure)


> If you were to
> proactively cull the group though, you'll see dozens of requests for
> things that amount to maintaining a sorted index on an attribute--
> undoubtedly one of the most common app requirements.

i guess you're referring to the not uncommon question of "how do i
model this and that?". and the answers seem somewhat contrived from a
newcomer's eyes. But in the end this means that the choice of basic
structures isn't so bad.

frankly, if there was a higher level structure, i might use them
(where appropriate, i hope); but with the scripting addition it's less
and less significant.

I have my own wishlist, but i prefer to put any suggestion with some
proof of concept code; even if it's not production-ready.

--
Javier

tribalvibes

unread,
May 16, 2011, 3:14:03 AM5/16/11
to Redis DB
On May 13, 7:56 am, Javier Guerra Giraldez <jav...@guerrag.com> wrote:
>the priority of having
> the basics well tuned.  (and no, OrderedDict isn't a basic structure)
> ...
> i guess you're referring to the not uncommon question of "how do i
> model this and that?".  and the answers seem somewhat contrived from a
> newcomer's eyes.   But in the end this means that the choice of basic
> structures isn't so bad.

Maybe not, but the specific case of maintaining an incrementally
sorted index on an attribute is so common that "having the basics well
tuned" might mean providing this functionality as an intrinsic for a
majority of apps. That is I hope you are thinking tuning is in terms
of actual app performance, not Redis 'busy beaver' mode. Sure, this
can be implemented within a constant efficiency of optimal using
server-side scripting, but that constant might be 3, 4, 5 times the
space and time complexity, not 1.01. So, wouldn't you look to
something like that as a place to tune the basics if it could deliver
triple or more the performance for a very common (and bottleneck)
case?

I think it would be very simple to have zset with float weights also
be zmap string -> string which would cover a ton of cases. This would
require only a tweak to the existing structure and perhaps the
addition (or again, generalisation) of a few Z commands. But, fair
enough, I'll can it until I could justify spending time on such a
patch. I need to just use Redis as-is for now.

> I have my own wishlist, but i prefer to put any suggestion with some
> proof of concept code; even if it's not production-ready.

I'm working on extending ruby's Ohm for our app purposes. The way it
uses Redis seems very inefficient both in terms of round trips and
index design. So, any changes made there would then serve as use cases
for Redis enhancements.
Reply all
Reply to author
Forward
0 new messages