List of hashes

35 views
Skip to first unread message

Marcus

unread,
Oct 19, 2010, 9:52:02 PM10/19/10
to Redis DB
I am wondering if list of hashes is on the list of features to be
added? And if not, if anyone has any interest in such a feature?


Examples of possible commands:

append hash to list
HLRPUSH key field value field value field value

returns length of hash list
HLLLEN key

returns number of fields that has has
HLHLEN key index

returns values of specified fields
HLLINDEXGET key index field, field2, field3

sets hash field
HLLINDEXSET key index field value

returns hash (field value, field value, etc)
HLLPOP key

Jason J. W. Williams

unread,
Oct 19, 2010, 10:18:51 PM10/19/10
to redi...@googlegroups.com
Honestly, I'd like to see sets as datatype for hash members.

-J

> --
> You received this message because you are subscribed to the Google Groups "Redis DB" group.
> To post to this group, send email to redi...@googlegroups.com.
> To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.
>
>

Josiah Carlson

unread,
Oct 19, 2010, 11:41:15 PM10/19/10
to redi...@googlegroups.com
I think that the added overhead to the protocol and commands to
support the embedding of arbitrary structures within one another
(because that's really what is being proposed) is probably too much.

If/when Redis server-side scripting is done, I can imagine a syntax
where this kind of thing wouldn't be unreasonable (and I would already
have a few different use-cases for it), but I also think that
simplicity is a goal worth striving for.

- Josiah

Marcus

unread,
Oct 20, 2010, 12:48:19 AM10/20/10
to Redis DB
Yeah, I can see how this could be problematic. This users will request
lists of sets, hashes of lists, lists of lists, etc, etc. It could be
never ending.

Xiangrong Fang

unread,
Oct 20, 2010, 12:50:45 AM10/20/10
to redi...@googlegroups.com
Server Side Scripting (S3?) maybe a cool idea.

2010/10/20 Marcus <gwebmas...@yahoo.com>:


> Yeah, I can see how this could be problematic. This users will request
> lists of sets, hashes of lists, lists of lists, etc, etc. It could be
> never ending.
>

Marc Byrd

unread,
Oct 20, 2010, 1:09:38 AM10/20/10
to redi...@googlegroups.com
An idea was floated a while back called recursive redis , which starts with the observation that the introduction of Hashes, even though redis already had SET/GET, is by itself at least one level of brute force recursion which could be simplified and yet amplified by saying that a value can be any type, including int, string, list, hash, set, zset, etc.

For language bindings which seek to automagically map language primitives about lists, hashes, sets, etc., this would be a great boon.  Redis has changed the way many of us think about programming, it's that much of a sea change.  This recursive, more general approach would complete the evolution, wherein basic and interesting data structures in any programming language could be mapped, in completely language-natural ways, to a shared memory space, and operated upon with complete confidence in atomicity.  

The number of lines of code would actually go down with this approach - e.g. note the current redundancy between top level memcached-like hash (SET/GET) and the proposed approach, where both are handled by the same set of code.  

This also seems to provide the opportunity to reconcile issues around concepts like volatile keys and VM:  For VM, if there's no such thing as a "top-level key", and if any key at any level can point to something stored on disk, then the efficiency of VM can go way up (How many times have we seen an entire huge zset get pulled into memory because we only needed one score?).   As for keys - if there's no such thing as a level, then the data structure currently known as a Hash could contain keys which have expirations, and so on.  

It could also be a way to approach clustering:  If a "value" is a tuple of (type, location), where location could be VM, host/port, etc., then that provides a very viable way to distribute redis.  

The changes required to the current code base would likely break all clients, perhaps even at a protocol level (in much the same way redis broke the memcached protocol).  None of this is likely to be in or get into the "trunk" any time soon.  But it might be an interesting fork, and certainly seems worth some discussion.


m
 

On Tue, Oct 19, 2010 at 7:18 PM, Jason J. W. Williams <jasonjw...@gmail.com> wrote:

Jeremy Zawodny

unread,
Oct 20, 2010, 1:17:12 AM10/20/10
to redi...@googlegroups.com
Well said.  I've often had similar thoughts when I wanted a HASH of small ZSETs or something similar.  Given that I work in Perl, this is a natural sort of thing to want. :-)

Jeremy

Jason J. W. Williams

unread,
Oct 20, 2010, 1:30:18 AM10/20/10
to redi...@googlegroups.com, redi...@googlegroups.com
I frequently find myself wanting the single unit granularity of sets for a hash field...instead of stuffing a JSON list in and leaving it to the app to add/remove members and stuff a full new copy of the list in. 

-J

Sent via iPhone

Is your e-mail Premiere?

Marcus

unread,
Oct 20, 2010, 2:20:22 AM10/20/10
to Redis DB
Maybe something like this could be used for the protocol without
changing too much. Each line is a separate command:

Define a top level key as a certain type

SETTYPE [key] LIST-HASH

To read a hash value:

BEGINCOMMAND
LIST-HASH [key]
LINDEX [index]
HGET [field]
ENDCOMMAND

This tells redis we are dealing with a type "LIST-HASH" where the key
of the top-level type (list) is [key]. Since a list is being used the
next command in the group must read a list element. The folowing
command must operate on the hash. Neither LINDEX or HGET need to
define a key because it is implied. In redis, the value of LINDEX key
[index] is a pointer pointing to a hash table, since it is a type LIST-
HASH

to read a complex data type of type LIST-HASH-HASH


BEGINCOMMAND
LIST-HASH-HASH [key]
LINDEX [index]
HGET [field]
HGET [field]
ENDCOMMAND

BEGINCOMMAND and ENDCOMMAND are used to tell redis that the commands
should be interpreted as a group.




Would something like this work for the protocol?

Josiah Carlson

unread,
Oct 20, 2010, 2:24:14 AM10/20/10
to redi...@googlegroups.com
On Tue, Oct 19, 2010 at 10:09 PM, Marc Byrd <dr.mar...@gmail.com> wrote:
> An idea was floated a while back called recursive redis , which starts with
> the observation that the introduction of Hashes, even though redis already
> had SET/GET, is by itself at least one level of brute force recursion which
> could be simplified and yet amplified by saying that a value can be any
> type, including int, string, list, hash, set, zset, etc.
> For language bindings which seek to automagically map
> language primitives about lists, hashes, sets, etc., this would be a great
> boon.  Redis has changed the way many of us think about programming, it's
> that much of a sea change.  This recursive, more general approach would

I guess it's all about what languages you use. I find Redis' use of
data structures restrictive (but sufficient to solve problems with a
little bit of work), but I come from a Python world where arbitrary
structure embedding is easy, free, and expected. When I was using C++
STL, I would have seen Redis as a life-saver (it is now, but for
different reasons).

> complete the evolution, wherein basic and interesting data structures in any
> programming language could be mapped, in completely language-natural ways,
> to a shared memory space, and operated upon with complete confidence in
> atomicity.
> The number of lines of code would actually go down with this approach - e.g.
> note the current redundancy between top level memcached-like hash (SET/GET)
> and the proposed approach, where both are handled by the same set of code.
> This also seems to provide the opportunity to reconcile issues around
> concepts like volatile keys and VM:  For VM, if there's no such thing as a
> "top-level key", and if any key at any level can point to something stored
> on disk, then the efficiency of VM can go way up (How many times have we
> seen an entire huge zset get pulled into memory because we only needed one

Sadly, it's not that easy with a ZSET. If you are looking something
up by rank, you need to traverse down your log(n) levels to get to the
proper leaf. If you are looking something up by member, you need to
pull the hash entry and potentially all of the items in the bucket.
During modification, you have to update pointers of adjacent skiplist
nodes for the old and new location. Also, if the values in a ZSET can
be of arbitrary type (number, string, list, set, hash, zset, ...) you
run into sorting issues trying to determine a total ordering of all
values.

I know Redis isn't Python, but Python has at least two applicable zens
in this scenario:
Simple is better than complex.
Flat is better than nested.

> score?).   As for keys - if there's no such thing as a level, then the data
> structure currently known as a Hash could contain keys which have
> expirations, and so on.

That would make expiration fairly painful from an accounting
perspective, as every item to be expired would basically need to be
tracked in a zset. :/

Also, the thing to remember is that in order to operate on something
in a nested way, you first need to select the parent item. There is a
reasonable syntax for doing this, but it basically requires performing
an implicit or explicit reset when the final operation is performed.

> It could also be a way to approach clustering:  If a "value" is a tuple of
> (type, location), where location could be VM, host/port, etc., then that
> provides a very viable way to distribute redis.
> The changes required to the current code base would likely break all
> clients, perhaps even at a protocol level (in much the same way redis broke
> the memcached protocol).  None of this is likely to be in or get into the
> "trunk" any time soon.  But it might be an interesting fork, and certainly
> seems worth some discussion.

Well... memcached has a protocol. Redis, being a different piece of
software, chose a different protocol of a specific design. If a
future Redis changes the protocol in such a way to support these kinds
of structures, then it's not like memcached -> Redis, it is Redis X ->
Redis Y.

That said, there is a Python almost-work-alike of Redis that can be
very easily modified to support these kinds of operations (and it
speaks the Redis protocol), which would make for a much faster method
of exploring things. The only annoying part is that Python doesn't
have a fast equivalent of a ZSET, though heapq.merge +
itertools.groupby can get most of the way there.

Regards,
- Josiah

Jak Sprats

unread,
Oct 21, 2010, 3:51:19 AM10/21/10
to Redis DB

Marcus,

on your protocol
> BEGINCOMMAND
> LIST-HASH [key]
> LINDEX [index]
> HGET [field]
> ENDCOMMAND

I am not convinced that multiple commands are needed. We are talking
about doing the following in C
getList(key)->getLindex(index)->hashGet(field)... etc....

This would also work (written in bash format)
HGET $(LINDEX key index) field
where the $() command gets evaluated first and returned to the HGET
command

Since complex data structures can be written to in the following
fashion
HSET $(LINDEX key index) field VALUE_THAT_CAN_BE_REALLLLLLLLY_LONG
the final argument being really long does not complicate things server
side or client side.

Complex data structures are actually a natural fit for redis and not
that hard to implement server side, but coming up w/ a syntax that
everyone agrees on (e.g. my bash suggestion will be real unpopular
[its not 1972]:) is not so easy.

Complex data structure writes and reads can be one-iiners, but a redis
eval() (e.g. the $() i used above) must be introduced into the syntax
and its worth debating how this eval() would look.

- Jak

Jak Sprats

unread,
Oct 28, 2010, 11:06:12 PM10/28/10
to Redis DB
I already posted this here:
http://groups.google.com/group/redis-db/browse_thread/thread/8a981b3fd76f8f41

But its nice to have relevant info in one post

EVAL (what other people calledrecursive redis- cant find post) where
you can do the following:
./redis-cli HSET hash0 name "I AM HASH ZERO"
./redis-cli HSET hash1 name "I AM HASH ONE"
./redis-cli HSET hash2 name "I AM HASH TWO"
./redis-cli RPUSH hashlist hash0
./redis-cli RPUSH hashlist hash1
./redis-cli RPUSH hashlist hash2
./redis-cli HGET $(LINDEX hashlist 1) name -> "I AM HASH ONE"

What is cool about bash, is this actually works: (EVAL in bash)
./redis-cli HGET $(./redis-cli LINDEX hashlist 1) name

Marcus

unread,
Oct 29, 2010, 10:23:33 PM10/29/10
to Redis DB
That would not solve my issue though as every hash would still have a
key in the global key space. This means that it cannot be swapped to
disk. If there were 10,000,000,000 hashes, then the amount of memory
that would be used would be huge. Whereas, using the same
implementation with true list of hashes, the minimum amount of memory
used would be negligible.
Reply all
Reply to author
Forward
0 new messages