Is a single hash capable of handling 10000 items?


Gopalakrishnan Subramani

Jan 14, 2011, 12:56:45 AM
to Redis DB
I have more than 10000 topics which need to be stored in a single hash
structure.

Is there any performance penalty as the number of items in the hash
grows larger?

Josiah Carlson

Jan 14, 2011, 1:12:53 AM
to redi...@googlegroups.com
I've not experienced any slowdowns. I have a hash that currently contains 20 million items, and will grow to at least 40 million.

 - Josiah
 

Gopalakrishnan Subramani

Jan 14, 2011, 1:26:22 AM
to Redis DB
20 million is a veryyyyy big number. Thank you for the answer.


Bob Hutchison

Jan 15, 2011, 12:31:24 PM
to redi...@googlegroups.com
Ummmm.... wow! I don't think I ever really thought about it consciously but I guess I've always just assumed these would be small. This is going to take a few minutes to absorb :-) Followed by a re-think.

Can you say what you are using that hash for? I'm wondering why you'd do this.

Cheers,
Bob



--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.

Josiah Carlson

Jan 15, 2011, 5:27:12 PM
to redi...@googlegroups.com
On Sat, Jan 15, 2011 at 9:31 AM, Bob Hutchison <hutch-...@recursive.ca> wrote:
> Can you say what you are using that hash for? I'm wondering why you'd do this.

Imagine for a moment that you have 40 million rows in a database with a small number of columns. If you want to store them in Redis, there are 3 obvious solutions:

* each field is its own entry in Redis (table:rowid:colname -> colvalue, the memcache way)
* each row is its own hash in Redis (table:rowid -> {col1:col1value, col2:col2value, ...}, row-order)
* each column is its own hash in Redis (table:colname -> {rowid1:colvalue1, rowid2:colvalue2, ...}, column-order)

Obviously the first one results in a huge number of long string entries in the main hash (the top-level keyspace), so it is generally a waste. The second one is commonly a very good choice, especially if your use-cases are driven by pulling a small number of rows, entire rows at a time. However, because there are 40 million keys, if you want to check the state of the system, the introspection command KEYS takes a very long time.
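To make the three layouts above concrete, here is a minimal sketch that uses plain Python dicts to stand in for the Redis keyspace. The table name, column names, and helper functions are all hypothetical, chosen only for illustration; nothing here talks to a real server.

```python
# Plain dicts stand in for the Redis keyspace (an assumption for illustration).
# Sample data: 3 rows with 2 columns each.
ROWS = {
    1: {"name": "alice", "score": "10"},
    2: {"name": "bob", "score": "20"},
    3: {"name": "carol", "score": "30"},
}

def per_field(table, rows):
    """Layout 1: one top-level string key per field (table:rowid:colname)."""
    return {
        f"{table}:{rowid}:{col}": value
        for rowid, cols in rows.items()
        for col, value in cols.items()
    }

def row_order(table, rows):
    """Layout 2: one hash per row (table:rowid -> {col: value})."""
    return {f"{table}:{rowid}": dict(cols) for rowid, cols in rows.items()}

def column_order(table, rows):
    """Layout 3: one hash per column (table:colname -> {rowid: value})."""
    keyspace = {}
    for rowid, cols in rows.items():
        for col, value in cols.items():
            keyspace.setdefault(f"{table}:{col}", {})[str(rowid)] = value
    return keyspace

if __name__ == "__main__":
    # Number of top-level keys each layout creates, i.e. what KEYS would list.
    for layout in (per_field, row_order, column_order):
        print(layout.__name__, len(layout("users", ROWS)))
```

The top-level key count grows with rows times columns in the first layout, with rows in the second, and only with the number of columns in the third, which is why KEYS stays cheap in the column-oriented case.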

In my use-case, I rarely want all of the columns for the given rows (generally just one or two), but I do want many rows' worth of data for those columns. By having column-oriented data, I can use HMGET table:colname <rowids> to get thousands of values with a single call, without the use of pipelines, without requiring Redis to parse a separate command per value, without sending all those commands, etc. (small savings, sure, but when you pull a few million values over the course of a few seconds, you can feel the difference). Toss in the fact that I can introspect on the system via KEYS (which returns 10-20 items instead of 40 million), and using column-oriented hashes just makes sense.
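A rough sketch of that access pattern, assuming a client object with an `hmget(name, keys)` method (redis-py's client has one). The `FakeRedis` class and the `fetch_columns` helper are hypothetical stand-ins so the example runs without a server.

```python
class FakeRedis:
    """In-memory stand-in exposing only hmget; not a real Redis client."""

    def __init__(self, hashes):
        self.hashes = hashes  # {hash name: {field: value}}
        self.calls = 0        # counts commands "sent", for illustration

    def hmget(self, name, keys):
        self.calls += 1
        fields = self.hashes.get(name, {})
        # Like HMGET, return None for fields that do not exist.
        return [fields.get(str(key)) for key in keys]

def fetch_columns(client, table, colnames, rowids):
    """Return {colname: {rowid: value}} using one HMGET per column."""
    result = {}
    for col in colnames:
        values = client.hmget(f"{table}:{col}", rowids)
        result[col] = dict(zip(rowids, values))
    return result

if __name__ == "__main__":
    client = FakeRedis({"users:score": {"1": "10", "2": "20"}})
    print(fetch_columns(client, "users", ["score"], [1, 2, 3]))
    print("commands sent:", client.calls)
```

With column-oriented hashes the number of commands depends on how many columns are requested, not on how many rows, which is where the savings over one HGET per value come from.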

Now, there is the argument that if I were to store it row-oriented I would be able to get memory savings due to the small number of columns, but at this point, I've got enough data so that it can't fit in a smaller box than what I'm using, so memory isn't a concern (but introspection and examination is).

Incidentally, I know how large my hashes are going to grow because I've got a zset which determines the priority in which the hashes get filled, and the zset has over 40 million entries. :)

Regards,
 - Josiah

Salvatore Sanfilippo

Jan 15, 2011, 5:29:18 PM
to redi...@googlegroups.com
Related, I just created a new section in this page:

http://redis.io/topics/memory-optimization

Check the section "Using hashes to abstract a very memory efficient
plain key-value store on top of Redis".

Cheers,
Salvatore


--
Salvatore 'antirez' Sanfilippo
open source developer - VMware

http://invece.org
"We are what we repeatedly do. Excellence, therefore, is not an act,
but a habit." -- Aristotele

Bob Hutchison

Jan 16, 2011, 10:01:03 AM
to redi...@googlegroups.com
Thanks Josiah. I'm going to play around with this idea a bit.

Cheers,
Bob
