HyperDex vs Redis


Riyad Kalla

Feb 22, 2012, 1:54:49 PM
to redi...@googlegroups.com
HyperDex ( http://hyperdex.org/ ) is the new K/V store out of Cornell that uses some unique hashing to allow searching and retrieving records without key values (I haven't dug deep, this is all high-level blurbs at this point).

They just added Redis performance comparisons today (bottom graph):

Curious what "Workload E" was, but they claim better performance in every case, with instant consistency across multiple nodes (no idea how) as well as failover that sounds similar to Cassandra/Riak/HBase, at least going by the word-pictures on the site.

Anybody know more about HyperDex?

Felix Gallo

Feb 22, 2012, 2:10:02 PM
to redi...@googlegroups.com
'Workload E' is part of the 'Yahoo Cloud Serving Benchmark' (https://github.com/brianfrankcooper/YCSB/wiki).  Unfortunately, like so many Yahoo products these days, it appears that the benchmark was designed without regard to quality, and primarily tests how well a given solution's Java client library and object model conform to, and are implemented against, a toy database model.  It may be that HyperDex is faster, but from casual inspection of the Redis-specific code included in the benchmark distribution, there are no grounds for concluding that from YCSB.

F.

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.

Arek Bochinski

Feb 22, 2012, 2:16:13 PM
to redi...@googlegroups.com
I spent 15 minutes searching around for some justification of the reasoning behind the project, and it seems
like a prototype k/v store based on one of the professor's papers on hyperspace hashing. That made me
laugh and, at the same time, wonder whether they applied a common octree data structure algorithm to distribute the keys
across many nodes/shards/key spaces.

Josiah Carlson

Feb 22, 2012, 2:26:10 PM
to redi...@googlegroups.com
On Wed, Feb 22, 2012 at 10:54 AM, Riyad Kalla <rka...@gmail.com> wrote:

I'm sure what they mean by "instant" consistency is that data sync
between nodes is completed before your command returns.

There are lies, damn lies, and benchmarks. Benchmarks are particularly
insidious, especially when they are designed to test a platform in a
way that the platform was not designed for. Felix already talked about
this.

Regards,
- Josiah

> Anybody know more about HyperDex?
>

Salvatore Sanfilippo

Feb 22, 2012, 4:47:13 PM
to redi...@googlegroups.com
Reality is much less interesting, as usual:

Redis and memcached provide more or less the upper limit figure in
queries per second per core.
Memcached can also use more cores automatically (something that Redis
will do at some point), while with Redis you need multiple instances,
but that's an upper limit for a networked server, since both these
systems serve data from memory and are reasonably well optimized.

What I mean is, I can modify Redis to just return always "foo", and
I'll still get 150k ops/sec per core.

So what is happening here is that:

1) Benchmark "E" is just a primitive that Redis does not provide, the
comparison here is just a bit of useless marketing material.
2) In all the other tests, probably they are comparing single-core
Redis with multi-core HyperDex, or multi-node HyperDex. To give you an
example Redis LPUSH can easily insert 1 million items per second into
lists, but if you push against four instances at the same time, one
per core, you can reach 3 to 4 million inserts per second. This does not
mean we should write "four millions operations per second11!1!!!one"
in our home page.
3) Probably data set fits in memory anyway in the benchmark test case.
4) Methodology is not shown at all, so everything in this page is more
or less useless.

I think that creating false illusions is also bad when marketing a
product: it can help for the first three months, but then what happens?

"He who chases two rabbits, catches neither one."

(Sicilian proverb, kindly translated by Ted Nyman).

Salvatore


--
Salvatore 'antirez' Sanfilippo
open source developer - VMware

http://invece.org
"We are what we repeatedly do. Excellence, therefore, is not an act,
but a habit." -- Aristotele

Riyad Kalla

Feb 22, 2012, 5:59:50 PM
to redi...@googlegroups.com
Great clarifications guys.

Salvatore, I respectfully disagree that you shouldn't put "four millions operations per second11!1!!!one" on the homepage ;)

-R

Salvatore Sanfilippo

Feb 23, 2012, 4:31:00 AM
to redi...@googlegroups.com
My investigation of those benchmarks ->
http://news.ycombinator.com/item?id=3622888

Basically YCSB is an epic fail in the way it is conceived: instead of
giving use cases and trying the best implementation with different DBs,
different DBs are forced into a given data model using a layer. For most
DBs this is the native data model; in others (like Redis) it is
simulated using native operations.

The benchmark was also comparing single-core Redis against multi-core HyperDex.

Cheers,
Salvatore

Dvir Volk

Feb 23, 2012, 4:49:15 AM
to redi...@googlegroups.com
you mention there that redis is going to be able to use multiple cores soon.
what do you mean by that?
Dvir Volk
System Architect, The Everything Project (formerly DoAT)

Salvatore Sanfilippo

Feb 23, 2012, 5:06:00 AM
to redi...@googlegroups.com
On Thu, Feb 23, 2012 at 10:49 AM, Dvir Volk <dv...@doat.com> wrote:

> you mention there that redis is going to be able to use multiple cores soon.
> what do you mean by that?

Hi Dvir,

I want to use threads in Redis to improve performance in the
single instance case, but at the same time I want to have a design
that will not make it slower when using a single core.

The idea is to start with the networking layer that is where a lot of
time is lost, we could have N threads reading and parsing requests
from clients, and then serving the result back to clients. A further
step can be to also call commands in the context of different threads
with key-based locking.

After the Redis 2.6 RC1 release I'll start experimenting with a few
branches to see what is the best balance between internal simplicity and
the ability to use more threads, but I think it is a change we can't
avoid, the future appears to really belong to computers with many many
cores, and Redis should try to take advantage of this in the best way.
If there are 4 cores in a computer it is fine to run four instances,
but maybe in two or three years we'll start to commonly see computers
with 16, 64, 128 cores... and it starts to be not practical :)

Salvatore

Dvir Volk

Feb 23, 2012, 5:17:50 AM
to redi...@googlegroups.com
that's interesting. the networking and parsing thing could sure benefit from it.
another approach could be to enable managing many instances simultaneously with ease. you can think for example of an apache web server as dozens of instances of apache, but you have a single coordinator so it's painless. 
so maybe targeting the cluster also for the single machine would also be a viable approach.

Hampus Wessman

Feb 23, 2012, 5:47:27 AM
to redi...@googlegroups.com
There will always be a certain drawback of running multiple single-threaded instances instead of one large multi-threaded instance, unless the queries are spread uniformly among the instances. With an uneven load, some of the single-threaded instances could use 100% CPU while others were completely idle and unable to help out. I think multi-threading in the future, for scaling up on a single machine, would be very cool :)

Some way to more easily manage multiple instances on a single machine may be a better short-term solution, though. I can imagine that threading will require a lot of work and testing!

Just a thought about not making it slower for single-core use. Perhaps it would be useful to have a "single-threaded" mode (an option in the config file), where Redis would only run one thread and also skip a lot of synchronization. Definitely more complicated, but may be worth experimenting with. I'm unsure how big a difference that would really make...

Cheers,
Hampus

Salvatore Sanfilippo

Feb 23, 2012, 7:33:33 AM
to redi...@googlegroups.com
On Thu, Feb 23, 2012 at 11:17 AM, Dvir Volk <dv...@doat.com> wrote:
> that's interesting. the networking and parsing thing could sure benefit from
> it.
> another approach could be to enable managing many instances simultaneously
> with ease. you can think for example of an apache web server as dozens of
> instances of apache, but you have a single coordinator so it's painless.
> so maybe targeting the cluster also for the single machine would also be a
> viable approach.

Yep, tools to manage multiple instances at once are definitely a good idea...
I think this is the kind of stuff that should be a community project
btw; especially the part of the community that actually has to manage
many redis instances (and can test the tool in the wild) should join
forces and create a google group to go forward.

It is not possible for me, unfortunately, to allocate bandwidth to
this kind of project: while extremely interesting, fun, and useful, I
must focus on the core itself most of the time. But I can very well see
such a project.

Salvatore Sanfilippo

Feb 23, 2012, 7:36:17 AM
to redi...@googlegroups.com
On Thu, Feb 23, 2012 at 11:47 AM, Hampus Wessman
<hampus....@gmail.com> wrote:

> Just a thought about not making it slower for single-core use. Perhaps it
> would be useful to have a "single-threaded" mode (an option in the config
> file), where Redis would only run one thread and also skip a lot of
> synchronization. Definitely more complicated, but may be worth experimenting
> with. I'm unsure how big difference that would really make...

This is definitely possible... just wrapping the lock calls with a
define so that they do something like:

#define lock_foo(x) do { if (server.threads != 1) really_lock_foo(x); } while (0)

Not sure if without contention this is worth doing; apparently on Mac
OS X acquiring a lock without contention is still a bit of work
(according to profiling info I obtained using Instruments), but
possibly in Linux this is more optimized.

Cheers,
Salvatore

Dvir Volk

Feb 23, 2012, 7:49:13 AM
to redi...@googlegroups.com
I've been wanting to do that for a long time but haven't found the time.
my main dilemma thinking about it is whether it should be a proxy that will handle failovers and such,  or just a database of endpoints and process manager, that enables clients to initiate connections.
the advantage of the simpler approach can be an easy-to-implement REST API to add, delete, replicate and query redis nodes, that will not add any performance overhead to redis itself.

Jak Sprats

Feb 23, 2012, 9:06:16 AM
to Redis DB
Hi Salvatore,

Just a warning: key based locking is analogous to row based locking in
a RDBMS and it has all sorts of issues, the biggest one is deadlock
(e.g. 1000 concurrent connections need Key88, the connections pile up
and when key88 is unlocked, they need to be awakened efficiently).
A lot of RDBMSes went to MVCC for this reason, and MVCC in redis (or
anywhere) will be a lot of work.

A cluster can run efficiently on many cores, in many ways better than
a multithreaded application, as clusters can taskset to a core, have
fewer context switches, higher L1/L2 hit ratios (lower L3), optimal
NUMA memory placement, possibilities to isolate NIC to core via IRQ
affinity, and biggest and best, no deadlocks/mutexes, nothing thread-
wise.

The huge downside to a cluster on a single machine is the cross-node-
intersection/union, here a multi-threaded app would have all the ZSETs
in a single memory space and it would be a normal (key locking)
intersection/union.

Also w/ multiple threads serving the network layer, command based
replication becomes tricky, there needs to be a consistent (w/ what
happened across many threads on a single machine) ordering to the
slave replication log, which adds another very tricky thread mutex
dimension to the whole equation, and the slave must replay them
serially (to guarantee ordering), so it may not be able to keep up.

Why not a cluster as a single machine multi threaded app, w/ an
efficient way to handle cross-node-intersections/joins?

- jak


Dvir Volk

Feb 23, 2012, 9:16:08 AM
to redi...@googlegroups.com
on the same machine, maybe using read only shared memory or something, cross node intersections might not be that bad. 
but I was thinking more along the lines of two of the following scenarios:

1. each instance represents a database (what is now the database number). or

2. if you have a proxy, and don't care about ram overhead, just raw performance, a simple master/slaves cluster, where the proxy routes the queries to the right instance and does load balancing and failover.

Salvatore Sanfilippo

Feb 23, 2012, 9:21:05 AM
to redi...@googlegroups.com
Hello Jak,

whatever the design is, you'll be able to run using just one thread
with the same performance, so we'll never have lower results per core
if we run with a single core. However there are no alternatives to
key-based locking in Redis: accessing the same data structure from
different threads means killing performance with locking, making the
implementation more complex, and so forth.

I mean, everything else is the same: Redis Cluster, and so forth,
Redis API, atomicity. The only difference is that it will be good if
we can specify to Redis to use more cores to serve requests.

If we handle just the networking side with threads, it is just like
having a global lock on the key space, since a single request is
performed at a given time. Exactly like now, but the networking side
handled by multiple threads can improve performance, judging by the
memcached experience with it.

If we have a way to also serve queries about different keys with
threads maybe it is an improvement over the above model, maybe not,
it's a matter of performing tests.

I can't see how introducing support for multiple cores may make Redis
worse than that, if the implementation used can "scale down" to a
single thread with the same performance as today.

Salvatore

Dvir Volk

Feb 23, 2012, 9:34:27 AM
to redi...@googlegroups.com
> If we handle just the networking side with threads, it is just like
> having a global lock on the key space, since a single request is
> performed at a given time. Exactly like now, but the networking side
> handled by multiple threads can improve performance, judging by the
> memcached experience with it.

I would start with just that, maybe lua scripting as well?
 

> If we have a way to also serve queries about different keys with
> threads maybe it is an improvement over the above model, maybe not,
> it's a matter of performing tests.

how about internal sharding per thread? :) 
you have a pool of N threads, each handling 1/N of the keys, and locking just its own keys on normal operations.
on intersections, sorts, etc - you can lock a global lock.
it will be much simpler than key level locking I suppose.
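
A minimal sketch of the internal sharding idea above, assuming a fixed pool of `NUM_SHARDS` worker threads and a plain FNV-1a string hash. The names `key_hash`, `shard_for_key`, and `single_shard` are made up for illustration and are not Redis functions. The sketch only shows how a key could be routed to its owning thread and how a multi-key command could detect that it needs the global lock.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_SHARDS 4u   /* assumption: one shard per worker thread */

/* FNV-1a, used here only because it is a short, well-known string hash. */
uint32_t key_hash(const char *key) {
    uint32_t h = 2166136261u;
    for (const unsigned char *p = (const unsigned char *)key; *p; p++) {
        h ^= *p;
        h *= 16777619u;
    }
    return h;
}

/* Map a key to the shard (and therefore the thread) that owns it. */
unsigned shard_for_key(const char *key) {
    return key_hash(key) % NUM_SHARDS;
}

/* Returns 1 if every key lives on the same shard, so the owning thread can
 * run the command on its own. Returns 0 if the command spans shards and
 * would have to take the global lock instead. */
int single_shard(const char *const *keys, size_t nkeys) {
    for (size_t i = 1; i < nkeys; i++) {
        if (shard_for_key(keys[i]) != shard_for_key(keys[0]))
            return 0;
    }
    return 1;
}
```

With this layout a single-key command never contends with commands on other shards, while an intersection or sort across keys from different shards falls back to the coarse global lock.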

Salvatore Sanfilippo

Feb 24, 2012, 4:31:49 AM
to redi...@googlegroups.com
On Thu, Feb 23, 2012 at 3:34 PM, Dvir Volk <dv...@doat.com> wrote:
> how about internal sharding per thread? :)
> you have a pool of N threads, each handling 1/N of the keys, and locking
> just its own keys on normal operations.
> on intersections, sorts, etc - you can lock a global lock.
> it will be much simpler than key level locking I suppose.

That's possible indeed but what is the advantage compared to key
granularity locking?
You can lock just the key that a thread is going to touch.

Btw one general problem with Redis and threads is that we can't afford
to have a mutex in every Redis object (too space consuming), so
probably it is better to lock keys using a hash table of locked keys
or something like that...
First step is definitely to try the networking side and see if the
results are cool enough to justify further investigations :)

Cheers,
Salvatore

Dvir Volk

Feb 24, 2012, 5:58:52 AM
to redi...@googlegroups.com

>> on intersections, sorts, etc - you can lock a global lock.
>> it will be much simpler than key level locking I suppose.
>
> That's possible indeed but what is the advantage compared to key
> granularity locking?
> You can lock just the key that a thread is going to touch.


Since of course you can't have a mutex per key, I thought this would be simpler to implement than a hash table of currently locked keys.
 

Hampus Wessman

Feb 24, 2012, 6:11:37 AM
to redi...@googlegroups.com
I think more granular locking (i.e. key-based) could be beneficial (if
done right) for Redis. The overhead and complexity increase with more
fine-grained locking, but we could also get less contention and better
performance. It's always hard to strike the perfect balance between the
two. Internal sharding will only let a single query run for an entire
shard at a time, so it's equivalent to more coarse-grained locking (with
an exclusive lock per shard/partition, both on reads and writes).
Intersections between shards will be more expensive too, as mentioned
before.

One mutex per object is probably not the way to go, agreed. I think most
RDBMS implement database locking by having some kind of global "lock
manager" (or "lock table") that keeps track of all database locks
(that's similar to what you said). Then you just need to have one mutex
for the entire lock manager. This could become a contention point, but
it's easy to have several lock managers and just hash the keys between
them. All this would cost some performance, but I think it may be needed
if we want to make optimal use of many cores. It can probably be
implemented very efficiently.
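
One simple way to approximate "several lock managers with keys hashed between them" is lock striping: a fixed array of mutexes, with each key hashing to one slot. This is a simplified sketch, not a full lock manager that tracks individual locked keys. `LOCK_SLOTS`, `lock_slot_for`, `lock_key`, and `unlock_key` are hypothetical names, and two different keys that share a slot will block each other.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define LOCK_SLOTS 64u   /* assumption: number of lock stripes */

static pthread_mutex_t lock_slots[LOCK_SLOTS];

/* Must be called once before any key is locked. */
void lock_table_init(void) {
    for (unsigned i = 0; i < LOCK_SLOTS; i++)
        pthread_mutex_init(&lock_slots[i], NULL);
}

/* FNV-1a string hash, chosen only because it is short and well known. */
static uint32_t slot_hash(const char *key) {
    uint32_t h = 2166136261u;
    for (const unsigned char *p = (const unsigned char *)key; *p; p++) {
        h ^= *p;
        h *= 16777619u;
    }
    return h;
}

/* Slot that guards this key. Different keys may share a slot. */
unsigned lock_slot_for(const char *key) {
    return slot_hash(key) % LOCK_SLOTS;
}

/* Block until the slot guarding `key` is free, then take it. */
void lock_key(const char *key) {
    pthread_mutex_lock(&lock_slots[lock_slot_for(key)]);
}

/* Release the slot guarding `key`. */
void unlock_key(const char *key) {
    pthread_mutex_unlock(&lock_slots[lock_slot_for(key)]);
}
```

The memory cost is fixed at `LOCK_SLOTS` mutexes regardless of how many keys exist, which avoids a mutex per object. A command that touches several keys would need to take its slots in a consistent order (for example ascending slot index, skipping duplicates) to avoid deadlock.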

Deadlocks and MVCC were mentioned earlier. I don't think that's
necessarily a problem for Redis (but that depends). Everything that
Redis Cluster or internal sharding can do (without using distributed
transactions/locks or cluster-wide snapshots) should be easy to
implement efficiently with threads and key-based locking too.

An important question here is whether a multi-threaded implementation
needs to behave exactly like a single Redis instance does now (from the
clients' point of view). That will be impossible to achieve with
internal sharding, as far as I can tell. If we would relax that
requirement just a little, then I think we could "easily" implement a
very efficient threading model based on locking individual keys.
Otherwise, it gets a bit harder.

That got a bit long... interesting topic. I have more thoughts about
this, but I will stop here for now :)

BR,
Hampus

Daniel Schnell

Feb 24, 2012, 8:29:03 AM
to redi...@googlegroups.com
One should not leave aside the possibilities of lockless approaches via atomic operations like atomic_inc/atomic_dec/atomic_test_and_set, etc. These should be applied at a very fine granularity.
If applied wisely they scale almost linearly with the number of processors involved. The downside is that not all algorithms can be protected with atomic operations alone. But e.g. Linux spinlocks have been implemented this way.

One spinlock per key should be pretty good for the case where there are a lot of keys and key access is mostly random. But even if the same keys are often involved, real collisions are pretty unlikely.
For every locking mechanism, care has to be taken against deadlocks. The more locks are involved, the more likely deadlocks are. There is a reason it has taken so long to get rid of the Big Kernel Lock (BKL) in Linux.
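
As a rough illustration of a lock built from a single atomic test-and-set, here is a minimal user-space spinlock using C11 `atomic_flag`. It is a sketch under the assumption of a C11 compiler, not the Linux kernel implementation; the names `key_spinlock`, `spin_trylock`, `spin_lock`, and `spin_unlock` are invented for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* One flag per lock: set means held, clear means free. */
typedef struct {
    atomic_flag held;
} key_spinlock;

#define KEY_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once. Returns true if the lock was acquired. */
bool spin_trylock(key_spinlock *l) {
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the flag was clear at the moment we set it. */
void spin_lock(key_spinlock *l) {
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        /* spin: another thread holds the lock */
    }
}

/* Clear the flag so the next test-and-set succeeds. */
void spin_unlock(key_spinlock *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

A lock like this is cheap when uncontended, but a waiting thread burns CPU while it spins, so it only suits very short critical sections.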

Regards,
Daniel.



Yiftach Shoolman

Feb 24, 2012, 8:32:19 AM
to redi...@googlegroups.com
This is how multi-threading is done in Memcached, which might help to choose the right approach


 




--
Yiftach Shoolman
+972-54-7634621

Salvatore Sanfilippo

Feb 24, 2012, 9:36:22 AM
to redi...@googlegroups.com
On Fri, Feb 24, 2012 at 2:32 PM, Yiftach Shoolman
<yiftach....@gmail.com> wrote:
> This is how multi-threading is done in Memcached, which might help to choose
> the right approach

This is exactly the approach I would like to follow, but we need a few
more mutexes, especially in the client structure, since what is in
memcached a "connection", is in Redis a more complex idea of "client",
and there are commands operating in the context of one thread that
could interact with a different client.

This at least to make the networking part parallel. For the other
parts in memcached things are much more simple than in Redis. They
just have a global lock for the hash table and other stuff like that.
I think we need to start this way too, and then try to implement
key-based locking.

Salvatore


Jak Sprats

Feb 24, 2012, 9:54:26 AM
to Redis DB
Hi Salvatore,

Using multiple threads will speed things up, I get that, it works in
memcached, it will work in redis. But memcached does not replicate
data via a command log.

redis has to give the slaves a serialised command log and the slaves
must execute it in EXACTLY the same order to guarantee data integrity.
So while the master can benefit from doing the networking part in many
threads, I do not think the slave can, and that means the slave may
not be able to keep up.

Maybe since the master->slave connection is already established and
slaves only handle writes and pipelining could be introduced (if
sensible), the slave can process commands quickly enough to keep up, but
if the new threaded master redis can do 500K SETs per second, can the
slave do them in the same order? I don't think the slave benefits from
being multithreaded.

I think the replication part of this is trickier than the key-level-
locking part
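
To make the ordering constraint concrete, here is a small sketch in which a hypothetical master stamps every write with a position from one shared atomic counter, and a hypothetical replica refuses anything that is not the next position. `assign_log_seq`, `replica_state`, and `replica_apply` are invented for illustration and do not describe how Redis replication is implemented.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Master side: one shared counter hands out positions in the command log.
 * Every writing thread has to go through it, so it is a serialization point. */
static atomic_uint_fast64_t next_log_seq = 1;

uint64_t assign_log_seq(void) {
    return atomic_fetch_add(&next_log_seq, 1);
}

/* Replica side: remembers the last log position it applied. */
typedef struct {
    uint64_t last_applied;
} replica_state;

/* Apply a command only if it directly follows the last one applied.
 * Returns false for anything out of order, which must wait its turn. */
bool replica_apply(replica_state *r, uint64_t seq) {
    if (seq != r->last_applied + 1)
        return false;
    r->last_applied = seq;
    return true;
}
```

The counter alone does not solve the problem: the stamp would have to be taken while the command's effect is being applied (for example under the key lock), otherwise log order and execution order can diverge. The replica side also stays strictly serial, which is the keep-up concern raised above.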

- jak

Matthew Alton

Feb 24, 2012, 1:49:29 PM
to redi...@googlegroups.com
Howdy Folks,

  This is my official un-lurk moment.  Here goes.

  I'm a professional C/Unix code monkey -- been this way since 1992.  I would like to investigate new and exciting ways to solve the C10K problem using some combination of threads, events, aio...  I would also like to make Redis work and play well with AIX.

  How can I help?

  Thanks.

--Matt
Matthew Alton
UNIX Systems Programming & Administration

"Beware of bugs in the above code; I have only proved it correct, not tried it." -- Donald Knuth

Hikeonpast

Feb 25, 2012, 12:05:10 AM
to Redis DB
Hi Hampus,

>>Everything that Redis Cluster or internal sharding can do
>>(without using distributed transactions/locks or
>>cluster-wide snapshots) should be easy to implement
>>efficiently with threads and key-based locking too.

Perhaps I misunderstand your point, but it seems to me that one big
thing that threading and key-based locking don't address is spreading
the load across multiple hosts. Redis Cluster and explicit/external
sharding are the only two approaches that facilitate distribution of
CPU/memory load at the host level. They also both facilitate improved
multi-core utilization on a single host, just as threading and locking
would, although perhaps not with quite the same efficiency.

Not too long ago, there was active discussion on the possibility of
running multiple 32-bit Redis Cluster nodes per host, affording
reduced memory footprint and improved multi-core utilization. Adding
the ability to spread the load across hosts in a memory-grid type of
architecture seemed like a clear win. The biggest drawback was (and
still is) the limitation on multiple-key operations to a single
instance.

Does the desire for multi-threaded support for multiple key operations
outweigh the benefits of automating the sharding of Redis across
multiple hosts via Redis Cluster? At the risk of over simplifying, it
seems that Redis Cluster solves both the multi-core and multi-host
issues, while threading addresses only multi-core. For our use case
(big memory footprint + high qps, manually sharded), Redis Cluster
seems substantially more desirable than a multi-threaded, single-host
Redis.

Cheers,

Dean

Josiah Carlson

Feb 25, 2012, 12:31:52 AM
to redi...@googlegroups.com

Adding threading to the mix offers the same Redis we have today, only
with better multi-core utilization. It's also potentially
significantly easier to implement than the remainder of Redis cluster.
While Redis Cluster is one of many destinations on Redis' path, making
Redis faster generally is a good idea. It also means that you don't
need to run 16 instances on your 16 core machine, which means less
inter-node traffic, less overhead, fewer instances to upgrade, etc.

Regards,
- Josiah

Hampus Wessman

Feb 25, 2012, 4:22:40 AM
to redi...@googlegroups.com
On 02/25/2012 06:05 AM, Hikeonpast wrote:
> Hi Hampus,
>
>>> Everything that Redis Cluster or internal sharding can do
>>> (without using distributed transactions/locks or
>>> cluster-wide snapshots) should be easy to implement
>>> efficiently with threads and key-based locking too.
> Perhaps I misunderstand your point, but it seems to me that one big
> thing that threading and key-based locking don't address is spreading
> the load across multiple hosts. Redis Cluster and explicit/external
> sharding are the only two approaches that facilitate distribution of
> CPU/memory load at the host level. They also both facilitate improved
> multi-core utilization on a single host, just as threading and locking
> would, although perhaps not with quite the same efficiency.

Yes, of course. I didn't mean it that way. I rather meant that at least
all the behavior/guarantees/commands that Redis Cluster can support on
multiple hosts, should be possible to implement very efficiently on a
single machine with threading and more fine-grained locking. The
alternative is to run several Redis instances on individual machines
too, which may not be as efficient. When one physical machine is not
enough, then Redis Cluster (or similar) will definitely be the only
solution. Each machine in the cluster could still run a single big Redis
instance that makes good use of all available CPU cores, so I was just
focusing on each machine there :)

> Not too long ago, there was active discussion on the possibility of
> running multiple 32-bit Redis Cluster nodes per host, affording
> reduced memory footprint and improved multi-core utilization. Adding
> the ability to spread the load across hosts in a memory-grid type of
> architecture seemed like a clear win. The biggest drawback was (and
> still is) the limitation on multiple-key operations to a single
> instance.
>
> Does the desire for multi-threaded support for multiple key operations
> outweigh the benefits of automating the sharding of Redis across
> multiple hosts via Redis Cluster? At the risk of over simplifying, it
> seems that Redis Cluster solves both the multi-core and multi-host
> issues, while threading addresses only multi-core. For our use case
> (big memory footprint + high qps, manually sharded), Redis Cluster
> seems substantially more desirable than a multi-threaded, single-host
> Redis.

I agree. Redis Cluster is very important and it will make it possible to
use multiple cores too. Threaded Redis would just be an optimization for
single machines (in a cluster or not), in my opinion. It's questionable
how far it's worth going with that even, as it will get more and more
complicated... On the other hand, it could potentially be very useful
for big machines with a lot of cores and memory, both when it comes to
performance and administration. Redis on e.g. 16 cores could probably do
more than many people ever need too =)

Cheers,
Hampus

Hampus Wessman

Feb 25, 2012, 4:40:15 AM
to redi...@googlegroups.com
Hi Jak,

I very much agree with this. Without fully multi-threaded replication
(which is hard!), it will not be that useful to be able to run several
write queries in parallel on the master...

It could still be useful to allow several read-only queries to run in
parallel however. That should be a lot easier to implement too, so it
may be a good start (after the networking code). Just need to make
everything those queries do (expiration and logging in particular)
thread-safe and then have some kind of global
shared-read/exclusive-write lock. That would be great for everyone who
does a lot of reads, at least!
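
The global shared-read/exclusive-write idea can be sketched as a single atomic counter: any number of readers may hold it at once, while a writer needs it to itself. This is a non-blocking, try-style illustration with hypothetical names (`rw_gate`, `rw_try_read`, `rw_try_write`, and so on), not a proposal for the actual Redis code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* state > 0: that many readers; state == 0: free; state == -1: one writer. */
typedef struct {
    atomic_int state;
} rw_gate;

/* Join the readers unless a writer currently holds the gate. */
bool rw_try_read(rw_gate *g) {
    int s = atomic_load(&g->state);
    while (s >= 0) {
        /* On failure, s is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(&g->state, &s, s + 1))
            return true;
    }
    return false;
}

/* Leave the readers. */
void rw_read_done(rw_gate *g) {
    atomic_fetch_sub(&g->state, 1);
}

/* Succeeds only when there are no readers and no writer. */
bool rw_try_write(rw_gate *g) {
    int expected = 0;
    return atomic_compare_exchange_strong(&g->state, &expected, -1);
}

/* Release exclusive access. */
void rw_write_done(rw_gate *g) {
    atomic_store(&g->state, 0);
}
```

In practice a blocking lock such as POSIX `pthread_rwlock_t` would be the more likely choice. Note that this simple version lets a steady stream of readers starve writers.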

Cheers,
Hampus


Jak Sprats

Feb 25, 2012, 8:24:19 AM
to Redis DB
Hi Hampus,

yeah, multi threaded reading should yield a performance boost.

I cannot figure out a way to get write ops that span multiple shards
(e.g. ZINTERSTORE a b c d e f g) to run in a concurrent manner that
can be replicated via commands AND guarantee consistency.

If we restrict the commands the same way we restrict them in redis
cluster, and redis-threaded shards internally to a given thread, that
would work ... but it is a subset of the commands.

this may be a can't-have-it-all situation, or maybe someone else sees
something I don't, but I don't have a solution to this.

- jak

Jan Oberst

Feb 25, 2012, 11:42:22 AM
to redi...@googlegroups.com
Hi Salvatore,

It's super interesting to see where Redis is going. I bet adding threads to Redis networking is going to lead to a massive increase in speed. However, it seems to me that speed is probably not the biggest thing that would keep someone from using Redis in production. I think iterating on redis-cluster would be more important.

Obviously I'd be thrilled about threaded networking code :) But I'm more excited about redis-cluster. I don't know quite enough about Redis internals to contribute to the implementation of core cluster functionality. But once core cluster APIs are stable enough, I (and tons of others) could get our hands dirty and start working on cluster management details or Redis client code.

The main problem I see is hot failover between Redis instances.

I bet most simple master/slave setups do not take care of failover at all. People write custom sharding code to run Redis on multiple machines. Even with pretty complicated management scripts, a simple server or instance crash still results in anywhere from a few seconds to a couple of minutes of downtime. I don't even dream of handling more complicated failure cases like netsplits or transient errors. Yet for Redis clusters that run on dozens of machines those errors happen all the time. Until there is a rock-solid redis-cluster implementation I can't really scale to a 500GB cluster easily. Adding a lot of custom code to deal with cluster management helps a bit but only goes so far.
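
For anyone who hasn't seen this kind of custom sharding code, here is a minimal Python sketch of the usual shape: hash the key to a shard, then fall back to that shard's slave if the master looks dead. All names (`ShardMap`, `node_for`, the `is_alive` callback, the address strings) are invented for this illustration, CRC32 is an arbitrary choice of hash, and nothing here talks to a real Redis server.

```python
import zlib


class ShardMap:
    """Hypothetical client-side sharding with naive master/slave fallback."""

    def __init__(self, shards, is_alive):
        # shards: list of [master, slave, ...] address lists, one per shard.
        # is_alive: caller-supplied health check, address -> bool.
        self._shards = shards
        self._is_alive = is_alive

    def node_for(self, key):
        """Return the first live node of the shard that owns this key."""
        index = zlib.crc32(key.encode("utf-8")) % len(self._shards)
        for node in self._shards[index]:
            if self._is_alive(node):
                return node
        raise RuntimeError("no live node for key %r" % key)


# Usage: shard 0's master is down, so its keys go to the slave.
status = {"10.0.0.1:6379": False, "10.0.0.2:6379": True,
          "10.0.0.3:6379": True, "10.0.0.4:6379": True}
shard_map = ShardMap(
    [["10.0.0.1:6379", "10.0.0.2:6379"],
     ["10.0.0.3:6379", "10.0.0.4:6379"]],
    lambda node: status[node],
)
print(shard_map.node_for("user:42"))
```

Note what the sketch does not do: it never promotes the slave, every client has to agree on what "alive" means, and during a netsplit two clients can pick different nodes for the same key. Those gaps are where the downtime and the hard failure cases come from.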

I am thrilled by the way Redis is developed, and I am convinced that both redis-threading and the final redis-cluster solution will be awesome. If it was up to me I'd start another iteration on redis-cluster.

Early Cassandra and Riak issues were often focused on figuring out just the right way to handle replication. I think getting Redis to a point where those issues can surface is important. That includes finishing most APIs and implementing basic cluster functionality in clients. At that point we can try redis-cluster with actual machines, actual application code, and actual data, and we will inevitably discover dozens of race conditions and edge cases that have to be handled properly.

- Jan

James Chou

unread,
Mar 1, 2012, 1:46:45 AM3/1/12
to redi...@googlegroups.com
Hello Dvir,

Back in April 2011 I brought up some ideas for a "redis administration console" (for details please see: http://www.jrcutie.com/wp/2011/04/05/open-source-redis-admin/). Do you think it's something that you and the Redis community might be interested in building?

Unfortunately I haven't had much time for this at all, since my day job is still RDBMS... but if there's enough interest, I might just start cranking something out...

Regards,
James

Dvir Volk

unread,
Mar 1, 2012, 3:03:47 PM3/1/12
to redi...@googlegroups.com
Hi James,
I don't see myself having time for this in the coming months, but if I manage to convince my boss to give me a few days to start this project, I'd be glad to :) He liked the general idea, so we'll see.