Redis critiques, let's take the good part.


Salvatore Sanfilippo

Dec 6, 2013, 8:52:41 AM
to Redis DB
Hello dear Redis community,

today Pierre Chapuis started a discussion on Twitter about Redis
bashing, stimulated by this thread on Twitter from Rick Branson:

https://twitter.com/rbranson/status/408853897495592960

It is not the first time that Rick Branson, who works at Instagram,
has openly criticized Redis, I guess because he does not like the Redis
design and / or implementation.
However, according to Pierre, this is not limited to Rick:
there are other engineers in the SF area who believe that Redis
sucks, and Pierre also reported hearing similar stories in Paris.

Of course every open source project of a given size is a target of
critiques, especially a project like Redis that is very opinionated on how
programs should be written, with a search for simple design and
implementation that is sometimes felt to be sub-optimal.
However, what can we learn from these critiques, and what is it that you
think is not working well in Redis? I really encourage you to share
your view.

As a starting point I'll use Rick's tweet: "BGSAVE. the sentinel wtf.
memory cliffs. impossible to track what's in it. heap fragmentation.
LRU impl sux. etc et".
He also writes: "you can't even really dump the whole keyspace because
KEYS "*" causes it to shit it's"

This is a good starting point, and I'll use the rest of this email to
see what happened in the different areas of Redis criticized by Rick.

1) BGSAVE

I'm not sure what is wrong with BGSAVE, probably Rick had bad
experiences with EC2 instances where the fork time can create latency
spikes?
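
One way to check whether fork time is the culprit is the `latest_fork_usec` field that Redis reports in INFO. A minimal sketch, assuming the INFO output has already been fetched into a dict (for example by a client library); the helper name and the 100 ms threshold are hypothetical, chosen only for illustration:

```python
def fork_latency_report(info, warn_ms=100.0):
    """Summarize the last fork duration from an INFO-style dict.

    `info` is assumed to map INFO field names to values. `warn_ms` is an
    arbitrary illustrative threshold, not a Redis default.
    """
    fork_usec = int(info.get("latest_fork_usec", 0))
    fork_ms = fork_usec / 1000.0
    return {"fork_ms": fork_ms, "slow": fork_ms > warn_ms}


# Made-up sample value: a fork that took 250 ms would be flagged as slow.
sample = {"latest_fork_usec": 250000}
report = fork_latency_report(sample)
```

Watching this value after each BGSAVE on the instance type in question would show whether the fork itself is where the latency goes.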

2) The Sentinel WTF.

Here probably the reference is the following:
http://aphyr.com/posts/283-call-me-maybe-redis

Aphyr analyzed Redis Sentinel from the point of view of a consistent
system, consistent as in CAP "strong consistency". During partitions in
Aphyr's tests, Sentinel was not able to keep the promises of a CP
system.
I replied with a blog post trying to clarify that Redis Sentinel is
not designed to provide strong consistency in the face of partitions,
but only to provide some degree of availability when the master
instance fails.

However the implementation of Sentinel, even as a system promoting a
slave when the master fails, was not optimal, so there was work to
reimplement it from scratch. Finally the new Sentinel is available in
Redis 2.8.x and is much simpler to understand and predict. This is
surely an improvement. The new implementation is able to version changes
in the configuration that are eventually propagated to all the other
Sentinels, requires a majority to perform the failover, and so forth.

However if you understand even the basics of distributed programming
you know a few things, like how a system with asynchronous replication
cannot guarantee consistency.
Even if Sentinel was not designed for this, is Redis improving from
this point of view? Probably yes. For example now the unstable branch
has support for a new command called WAIT that implements a form of
synchronous replication.

Using WAIT and the new Sentinel, it is possible to have a setup that
is quite partition resistant. For example if you have three computers,
A, B, C, and run a Sentinel instance and a Redis instance on every
computer, only the majority partition will be able to perform the
failover, and the minority partition will stop accepting writes if you
use "WAIT 1", that is, if you wait for the propagation of the write to at
least one replica. The new Sentinel also automatically elects the slave
that has the most up-to-date version of the data.
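
From the client side, the "WAIT 1" pattern could look roughly like the sketch below. It assumes a client object with redis-py style `set` and `execute_command` methods; `FakeMaster` is a hypothetical stand-in used only so the example runs without a server, and the timeout value is arbitrary.

```python
def replicated_set(client, key, value, min_replicas=1, timeout_ms=100):
    """SET a key, then block with WAIT until replicas acknowledge it.

    Returns True only if at least `min_replicas` replicas acknowledged
    the write within `timeout_ms` milliseconds.
    """
    client.set(key, value)
    acked = client.execute_command("WAIT", min_replicas, timeout_ms)
    return int(acked) >= min_replicas


class FakeMaster:
    """Hypothetical stand-in for a master that reaches N replicas."""

    def __init__(self, reachable_replicas):
        self.reachable_replicas = reachable_replicas
        self.data = {}

    def set(self, key, value):
        self.data[key] = value
        return True

    def execute_command(self, name, *args):
        if name != "WAIT":
            raise ValueError("only WAIT is simulated here")
        # WAIT replies with the number of replicas that acknowledged.
        return self.reachable_replicas


# Master on the majority side still reaches one replica; the isolated one reaches none.
majority_side = FakeMaster(reachable_replicas=1)
minority_side = FakeMaster(reachable_replicas=0)
ok_majority = replicated_set(majority_side, "k", "v")
ok_minority = replicated_set(minority_side, "k", "v")
```

Note that the write is still applied locally on the isolated master; WAIT only tells the client that no replica acknowledged it, so it is up to the application to treat such a write as failed.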

Redis Cluster is another step forward towards Redis HA and automatic
sharding, we'll see how it works in practice. However I believe that
Sentinel is improving and Redis is providing more tools to fine-tune
consistency guarantees.

3) Impossible to track what is in it.

Lack of SCAN was a problem indeed, now it is solved. Even before, using
RANDOMKEY, it was somewhat possible to inspect data sets, but SCAN is
surely a much better way to do this.
The same argument goes for KEYS *.
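
As a rough illustration of walking the keyspace with SCAN instead of KEYS, here is a minimal sketch. It assumes a client with a redis-py style `scan(cursor=..., match=..., count=...)` method returning `(next_cursor, keys)`; `FakeScanClient` is a hypothetical in-memory stand-in (it ignores `match`) so the example runs without a server.

```python
def iter_keys(client, match=None, count=100):
    """Yield keys by following the SCAN cursor until it returns to 0."""
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=match, count=count)
        for key in keys:
            yield key
        if int(cursor) == 0:
            break


class FakeScanClient:
    """Hypothetical stand-in that pages through a fixed list of keys."""

    def __init__(self, keys):
        self.keys = list(keys)

    def scan(self, cursor=0, match=None, count=10):
        page = self.keys[cursor:cursor + count]
        next_cursor = cursor + count
        if next_cursor >= len(self.keys):
            next_cursor = 0  # a cursor of 0 signals the end of the iteration
        return next_cursor, page


fake = FakeScanClient(["user:1", "user:2", "user:3", "session:9", "user:4"])
found = list(iter_keys(fake, count=2))
```

Each call does a small amount of work, so the server is never blocked for the whole keyspace. Against a real server SCAN may return the same key more than once, so callers should tolerate duplicates.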

4) LRU implementation sucks.

The LRU implementation in Redis 2.4 had issues, and under mass-expire
there were latency spikes.
The LRU in 2.6 is much smoother, however it contained issues reported
by Pavlo Baron where the algorithm was not able to guarantee that expired
keys were always under a given threshold.
Newer versions of 2.6, and 2.8 of course, both fix this issue.

I'm not aware of other issues with the LRU algorithm.

I have the feeling that Rick's opinion is a bit biased by the fact that
he was exposed to older versions of Redis; however, his criticisms were
in part actually applicable to older versions of Redis.
This shows that there is something good about these critiques. For
instance Rick always said that replication sucked because of the lack of
partial resynchronization. I'm sorry he is no longer able to say this.
As a consolation prize we'll send him a t-shirt if the budget permits.
But this again shows that critiques tend to be focused where
deficiencies *are*, so hiding Redis behind a needle is not a good idea
IMHO. We need to improve the system to make it better, as long as it is
still a useful system for many users.

So, what are the critiques that you hear frequently about Redis? What
are your own critiques? When does Redis suck?

Let's tear Redis apart, something good will happen.

Salvatore

--
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
http://invece.org

We suspect that trading off implementation flexibility for
understandability makes sense for most system designs.
— Diego Ongaro and John Ousterhout (from Raft paper)

Pierre Chapuis

Dec 6, 2013, 10:22:02 AM
to redi...@googlegroups.com
Others:

Quentin Adam, CEO of Clever Cloud (a PaaS) has a presentation that says Redis is not fit to store sessions: http://www.slideshare.net/quentinadam/dotscale2013-how-to-scale/15 (he advises Membase)

Tony Arcieri (Square, ex-LivingSocial) is a "frequent offender":

https://twitter.com/bascule/status/277163514412548096
https://twitter.com/bascule/status/335538863869136896
https://twitter.com/bascule/status/371108333979054081
https://twitter.com/bascule/status/390919938862379008

Then there's the Disqus guys, who migrated to Cassandra,
the Superfeedr guys who migrated to Riak...

Instagram moved to Cassandra as well, here's more on
it by Branson to see where he comes from:
http://www.planetcassandra.org/blog/post/cassandra-summit-2013-instagrams-shift-to-cassandra-from-redis-by-rick-branson

This presentation about scaling Instagram with a small
team (by Mike Krieger) is very interesting as well:
http://qconsf.com/system/files/presentation-slides/How%20a%20Small%20Team%20Scales%20Instagram.pdf
He says he would go with Redis again, but there are
some points about scaling up Redis starting at slide 56.

My personal experience, to be clear, is that Redis is an
awesome tool when you know how it works and how to
use it, especially for a small team (like Krieger basically).

I have worked for a company with a very small technical
team for the last 3.5 years. We make technology for mobile
applications which we sell to large companies (retail, TV,
cinema, press...) mostly white-labelled. I have written most
of our server side software, and I have also been responsible
for operations. We have used and still use Redis *a lot*, and
some of the things we have done would just not have been
possible with such a small team in so little time without it.

So when I read someone saying he would ban Redis from
his architecture if he ever makes a startup, I think: "good
thing he doesn't." :)

Thank you Antirez for this awesome tool.

Alexander Gladysh

Dec 6, 2013, 10:25:14 AM
to redi...@googlegroups.com
On Fri, Dec 6, 2013 at 7:22 PM, Pierre Chapuis
<catwell...@catwell.info> wrote:

> My personal experience, to be clear, is that Redis is an
> awesome tool when you know how it works and how to
> use it, especially for a small team (like Krieger basically).

Indeed! Until you have bumped into all the hidden obstacles, the experience
is rather horrible. When Redis blows up in production, it usually
costs developers a few gray hairs :-)

However, after you know what not to do, Redis is all awesomeness.

My 2c,
Alexander.

Pierre Chapuis

Dec 6, 2013, 10:33:31 AM
to redi...@googlegroups.com
On Friday, December 6, 2013 at 16:25:14 UTC+1, Alexander Gladysh wrote:
> On Fri, Dec 6, 2013 at 7:22 PM, Pierre Chapuis
> <catwell...@catwell.info> wrote:
>
> Indeed! Until you bumped on all the hidden obstacles, the experience
> is rather horrible. When Redis blows up on production -- it usually
> costs developers a few gray hairs :-)

I would say that of every tool. You can outgrow any of them or use them poorly.

I had a terrible experience with MySQL. A (VC funded) startup around
here had issues with CouchDB, moved to Riak with Basho support,
had issues, moved to HBase which they still use (I think). That does
not make any of those tools bad. You just have to invest some time
into learning what those tools can and cannot do, which one to use for
which use case, and how to use them correctly.

--
Pierre Chapuis

Alexander Gladysh

Dec 6, 2013, 10:34:38 AM
to redi...@googlegroups.com
On Fri, Dec 6, 2013 at 7:33 PM, Pierre Chapuis
<catwell...@catwell.info> wrote:
> On Friday, December 6, 2013 at 16:25:14 UTC+1, Alexander Gladysh wrote:
>>
>> On Fri, Dec 6, 2013 at 7:22 PM, Pierre Chapuis
>> <catwell...@catwell.info> wrote:
>>
>> Indeed! Until you bumped on all the hidden obstacles, the experience
>> is rather horrible. When Redis blows up on production — it usually
>> costs developers a few gray hairs :-)
>
>
> I would say that of every tool. You can all outgrow them or use them poorly.
>
> I had a terrible experience with MySQL. A (VC funded) startup around
> here had issues with CouchDB, moved to Riak with Basho support,
> had issued, moved to HBase which the still use (I think). That does
> not make any of those tools bad. You just have to invest some time
> into learning what those tools can and cannot do, which one to use for
> which use case, and how to use them correctly.

I agree :-)

If the learning curve is flat, it usually means that the tool is too
casual to be useful.

Alexander.

Pierre Chapuis

Dec 6, 2013, 10:41:21 AM
to redi...@googlegroups.com
Also: I am not saying I have never experienced scaling issues
with Redis! I have. You always will when you build a system from
scratch that ends up serving millions of users. So there are
bottlenecks I hit, models I had to reconsider, and even things I had
to move off Redis.

But none of that made me go "OMG this tool is terrible and nobody
should use it, ever!!1". And I still think going with Redis in the first
place was a very good idea.

On a side note: one of the things it *did* make me decide not
to use is intermediate layers between my application and Redis
that abstract your models. When you hit a bottleneck, you want
to know exactly what you have stored in Redis, how and why.

So things like https://github.com/soveran/ohm are really cool
for prototyping and things that are not intended to scale, but
if you decide to use them for a product with traction you'd better
understand exactly what they do or just write your own abstraction
layer that suits your business logic.

Salvatore Sanfilippo

Dec 6, 2013, 10:47:11 AM
to Redis DB
On Fri, Dec 6, 2013 at 4:22 PM, Pierre Chapuis
<catwell...@catwell.info> wrote:
> Others:
>
> Quentin Adam, CEO of Clever Cloud (a PaaS) has a presentation that says
> Redis is not fit to store sessions:
> http://www.slideshare.net/quentinadam/dotscale2013-how-to-scale/15 (he
> advises Membase)

To be super-honest, I don't quite understand the presentation. What
does "multiple writes" / "pseudo atomic" mean? I'm not sure.
MULTI/EXEC and Lua scripts both retain their semantics on the slave,
which will process the transaction all-or-nothing.

About HA, with the new Sentinel and Cluster we have something to say in
the present and in the future.
I'm not sure what Membase's properties are: their page seems like marketing,
and to be honest I don't know a single person who uses it.

> Tony Arcieri (Square, ex-LivingSocial) is a "frequent offender":
>
> https://twitter.com/bascule/status/277163514412548096

Latency complaints, 2.2.x, no information given, but Redis can be
operated with excellent latency characteristics if you know what you
are doing.
Honestly I believe that from the point of view of average latency, and
the ability to provide consistent latency, Redis is one of the better
DBs available out there.
If you run it on EC2 with EBS, with instances that can't fork and fsync
that can't cope, it is a sysop fail, not a problem with the system IMHO.

> https://twitter.com/bascule/status/335538863869136896

FUD

> https://twitter.com/bascule/status/371108333979054081

FUD

> https://twitter.com/bascule/status/390919938862379008

Distributed systems 101 is that non-synchronous replication can
drop acknowledged writes.
Every single-instance on-disk DB that is not configured to fsync to disk
at every write can drop acknowledged writes.

So this is totally obvious for most DBs deployed currently.

What does not drop acknowledged writes as long as the majority is up?
CP systems with strong consistency like Zookeeper.

It's worth mentioning that WAIT, announced yesterday, can do a lot from
this point of view.
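
To make the point about acknowledged writes concrete, here is a toy simulation. It is not Redis code: `ToyMaster` is a hypothetical class that acknowledges a write before shipping it to its replica, which is the essence of asynchronous replication.

```python
class ToyMaster:
    """Toy master that acknowledges writes before replicating them."""

    def __init__(self):
        self.data = {}
        self.backlog = []  # writes not yet shipped to the replica

    def write(self, key, value):
        self.data[key] = value
        self.backlog.append((key, value))
        return "OK"  # acknowledged before any replica has the write

    def replicate_to(self, replica):
        for key, value in self.backlog:
            replica[key] = value
        self.backlog.clear()


master = ToyMaster()
replica = {}

master.write("a", 1)
master.replicate_to(replica)  # "a" reaches the replica
ack = master.write("b", 2)    # the client sees OK for "b"...
# ...but the master dies here, before the next replication round.
promoted = replica            # failover promotes the replica
lost = [key for key in master.data if key not in promoted]
```

After the failover the promoted replica has never seen "b", even though the client received an acknowledgement for it.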

> Then there's the Disqus guys, who migrated to Cassandra,

I've no idea why Disqus migrated to Cassandra, probably it was just a
much better pick for them?
Migrating to a different system does not necessarily imply a problem with
Redis, so this is not a criticism we can act on in a positive way,
unless the Disqus guys tell us why they migrated and what Redis
deficiencies they found.

> the Superfeedr guys who migrated to Riak...

Same story here.

> Instagram moved to Cassandra as well, here's more on
> it by Branson to see where he comes from:
> http://www.planetcassandra.org/blog/post/cassandra-summit-2013-instagrams-shift-to-cassandra-from-redis-by-rick-branson

And again...

> This presentation about scaling Instagram with a small
> team (by Mike Krieger) is very interesting as well:
> http://qconsf.com/system/files/presentation-slides/How%20a%20Small%20Team%20Scales%20Instagram.pdf
> He says he would go with Redis again, but there are
> some points about scaling up Redis starting at slide 56.

This is interesting indeed, and sounds like problems that we can solve
with Redis Cluster.
Let's face it, partitioning client side is complex. Redis Cluster
provides a lot of help for big players with many instances since
operations will be much simpler once you can reshard live.
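
To show why client-side partitioning gets painful once the number of instances changes, here is a small sketch. The CRC32-modulo scheme and the key names are assumptions made for the example, not what any particular client library does.

```python
import zlib


def shard_for(key, num_shards):
    """Pick a shard by hashing the key (CRC32 modulo the shard count)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def moved_fraction(keys, old_shards, new_shards):
    """Fraction of keys that land on a different shard after resizing."""
    moved = sum(
        1 for key in keys
        if shard_for(key, old_shards) != shard_for(key, new_shards)
    )
    return moved / len(keys)


# Hypothetical key names; going from 4 to 5 instances.
keys = ["user:%d" % i for i in range(10000)]
fraction = moved_fraction(keys, 4, 5)
```

With a uniform hash a key keeps its shard only when the hash gives the same remainder modulo 4 and modulo 5, which happens for about one key in five. So adding a single instance relocates most of the data, and that migration has to be handled by hand when sharding lives in the client.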

I find the above pointers interesting, but how do we act on them?
IMHO the current route of providing a simple HA system like Sentinel
and trying to make it robust, while at the same time providing a more
complex system like Redis Cluster for "bigger needs", is the best
direction the Redis project can take.

The "moved away from Redis" stories don't tell us much. What I believe
is that sometimes when you are small you tend to do things with an
in-memory data store that don't really scale cost-wise, since the IOPS
per instance could be handled by a disk-oriented system, so moving away
could be a natural consequence, and this is fine. At the start, maybe
using Redis helped a lot by serving many queries with few machines
during the boom, with relatively few users (on the order of maybe 1
million) but with the hype about the service creating big pressure
in terms of load.

What do you think we can do to improve Redis based on the above stories?

Cheers!

Pierre Chapuis

Dec 6, 2013, 10:48:05 AM
to redi...@googlegroups.com
On Friday, December 6, 2013 at 16:34:38 UTC+1, Alexander Gladysh wrote:

> If learning curve is flat, it usually means that the tool is too
> casual to be useful.

This.

Also, maybe I avoided some of the issues others encountered in
production because:

  1) I have an MSc in distributed systems (helps sometimes :p)

  2) I had forked Redis and implemented custom commands
     before I actually deployed it so I understood the code base.

Also, I had read the documentation and not skipped the
parts about algorithmic complexity of the commands,
persistence trade-offs... :)

I guess that if you let a novice developer use Redis in his
application it may be easier for him to shoot himself in the
foot.

But... if you think about it, those things are also true of a
relational database: if you don't understand what you do
you will write dangerous code, and if you decide to use an
ORM and scale you'd better understand it.

Salvatore Sanfilippo

Dec 6, 2013, 10:52:07 AM
to Redis DB
On Fri, Dec 6, 2013 at 4:33 PM, Pierre Chapuis
<catwell...@catwell.info> wrote:

> I had a terrible experience with MySQL. A (VC funded) startup around
> here had issues with CouchDB, moved to Riak with Basho support,

About the "moves to Riak", this is also a factor: people sought
help with Redis and there was nothing, with me busy and Pivotal not yet
providing support (now they finally do!).
If Basho engineers say "hi, we'll fix your issues", this is surely an
incentive (yet in this case people still moved away).

Unfortunately I'm really not qualified to say whether there is big value
in Riak for the use case it is designed for, as I hear a mix of
horrible and great things, and I never deployed it seriously.
But I'm happy that people try other solutions: in the end what is no
longer useful MUST DIE in technology.

If Redis dies in 6 months, this is great news: it means that
technology evolved enough that with other systems you can do the same
in some simpler way.
However as long as I see traction like I'm seeing right now in the
project, and there is a company like Pivotal supporting the effort,
I'll continue to improve it.

Shane McEwan

Dec 6, 2013, 11:05:32 AM
to redi...@googlegroups.com
On 06/12/13 15:52, Salvatore Sanfilippo wrote:
> Unfortunately I'm really not qualified to say if there is big value or
> not into Riak for the use case it is designed about as I hear a mix of
> horrible and great things, and I never deployed it seriously.
> But I'm happy that people try other solutions: in the end what is no
> longer useful MUST DIE in technology.

For what it's worth, we run both Riak and Redis. They each solve
different problems for us. You use whichever tool solves your problem.
There's no point complaining that your screwdriver is no good at
hammering nails!

Shane.

Pierre Chapuis

Dec 6, 2013, 11:08:19 AM
to redi...@googlegroups.com
On Friday, December 6, 2013 at 16:47:11 UTC+1, Salvatore Sanfilippo wrote:
> On Fri, Dec 6, 2013 at 4:22 PM, Pierre Chapuis
> <catwell...@catwell.info> wrote:
>> Others:
>>
>> Quentin Adam, CEO of Clever Cloud (a PaaS) has a presentation that says
>> Redis is not fit to store sessions:
>> http://www.slideshare.net/quentinadam/dotscale2013-how-to-scale/15 (he
>> advises Membase)
>
> I don't quite understand the presentation to be super-honest, what
> means "multiple writes" / "pseudo automic"? I'm not sure.

Afaik he is saying the system is single master and you cannot
have two writes executing concurrently, so write throughput / latency
is limited by a single node.

>> Then there's the Disqus guys, who migrated to Cassandra,
>
> I've no idea why Disqus migrated to Cassandra, probably it was just a
> much better pick for them?
> Migrating to a different does not necessarily implies a problem with
> Redis, so this is not a criticism we can use in a positive way to act,
> unless Disqus guys write us why they migrated and what Redis
> deficiencies they found.

They mention it here:
http://planetcassandra.org/blog/post/disqus-discusses-migration-from-redis-to-cassandra-for-horizontal-scalability
 
But they don't say much about their reasons, basically "it didn't
scale" :(

>> This presentation about scaling Instagram with a small
>> team (by Mike Krieger) is very interesting as well:
>> http://qconsf.com/system/files/presentation-slides/How%20a%20Small%20Team%20Scales%20Instagram.pdf
>> He says he would go with Redis again, but there are
>> some points about scaling up Redis starting at slide 56.
>
> This is interesting indeed, and sounds like problems that we can solve
> with Redis Cluster. [...]

He also mentions the allocator as their reason to use Memcache
instead of Redis. I wonder if a lot of this criticism does not come
from people who don't use jemalloc.
 
> Let's face it, partitioning client side is complex. Redis Cluster
> provides a lot of help for big players with many instances since
> operations will be much simpler once you can reshard live.

I can't comment much on that, I don't see a reason to use Redis
Cluster for now. Most of my data is trivial to shard in the application.
Maybe that would help with migrations / re-sharding but this is not
*so* terrible if you don't let your shards grow really huge.

> We suspect that trading off implementation flexibility for
> understandability makes sense for most system designs.
> -- Diego Ongaro and John Ousterhout (from Raft paper)

:)

Jonathan Leibiusky

Dec 6, 2013, 11:09:30 AM
to redi...@googlegroups.com

One of the big challenges we had with redis in mercadolibre was the size of the dataset. The fact that it needs to fit in memory was a big issue for us.
We commonly had 500gb DBs or even more.
Not sure if this is a common case for other redis users anyway.


Alexander Gladysh

Dec 6, 2013, 11:12:46 AM
to redi...@googlegroups.com
On Fri, Dec 6, 2013 at 8:09 PM, Jonathan Leibiusky <iona...@gmail.com> wrote:
> One of the big challenges we had with redis in mercadolibre was size of
> dataset. The fact that it needs to fit in memory was a big issue for us.
> We used to have, on a common basis, 500gb DBs or even more.
> Not sure if this is a common case for other redis users anyway.

Seems to be kind of a screwdriver vs. nails problem, no? Why use Redis
for a task that it is explicitly not designed for?

(Not trying to offend you, this is an honest question, and a relevant one I
think, since we're talking about why Redis is perceived as deficient
by some users...)

Alexander.

Salvatore Sanfilippo

Dec 6, 2013, 11:15:41 AM
to Redis DB
On Fri, Dec 6, 2013 at 5:05 PM, Shane McEwan <sh...@mcewan.id.au> wrote:
> For what it's worth, we run both Riak and Redis. They each solve different
> problems for us. You use whichever tool solves your problem. There's no
> point complaining that your screwdriver is no good at hammering nails!

Totally makes sense indeed. The systems are very different.

Just a question: supposing Redis Cluster were available and stable, is
there some problem at the intersection between Redis and Riak that you
ended up solving with Riak, where the choice would be more debatable with
Redis Cluster? Or was it a matter of other metrics like the consistency
model and the like?

Jonathan Leibiusky

Dec 6, 2013, 11:18:23 AM
to redi...@googlegroups.com

It's not that we planned it. Developers started using it for something they thought would stay small, but it grew. And it grew a lot. We ended up using redis to cache a small chunk of the data, with mysql or oracle as the backend data store.

Salvatore Sanfilippo

Dec 6, 2013, 11:18:50 AM
to Redis DB
On Fri, Dec 6, 2013 at 5:12 PM, Alexander Gladysh <agla...@gmail.com> wrote:
> On Fri, Dec 6, 2013 at 8:09 PM, Jonathan Leibiusky <iona...@gmail.com> wrote:
>> One of the big challenges we had with redis in mercadolibre was size of
>> dataset. The fact that it needs to fit in memory was a big issue for us.
>> We used to have, on a common basis, 500gb DBs or even more.
>> Not sure if this is a common case for other redis users anyway.
>
> Seems to be kind of screwdriver vs. nails problem, no? Why use Redis
> for the task that it is explicitly not designed for?

This is entirely possible but depends a lot on the use case. If the IOPS
per object are in a range where you pay less for RAM than for the
nodes you would need to spin up with an on-disk solution, then switching
becomes hard even when you realize you are using a lot of RAM. Also it
depends on where you run. On premise 500GB is not huge, on EC2 it is.
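
The trade-off can be sketched as a back-of-the-envelope calculation. Every figure below is hypothetical, chosen only to show the shape of the comparison; real node sizes, throughput and prices have to be measured for the workload at hand.

```python
import math


def nodes_needed(dataset_gb, ops_per_sec, node_capacity_gb, node_ops_per_sec):
    """Nodes required to satisfy both the capacity and the throughput need."""
    by_capacity = math.ceil(dataset_gb / node_capacity_gb)
    by_throughput = math.ceil(ops_per_sec / node_ops_per_sec)
    return max(by_capacity, by_throughput)


# Hypothetical workload: 500 GB of data, 200k operations per second.
dataset_gb = 500
ops_per_sec = 200_000

# Hypothetical in-memory node: 64 GB of RAM, 100k ops/s.
ram_nodes = nodes_needed(dataset_gb, ops_per_sec,
                         node_capacity_gb=64, node_ops_per_sec=100_000)

# Hypothetical on-disk node: 2 TB of disk, 5k ops/s.
disk_nodes = nodes_needed(dataset_gb, ops_per_sec,
                          node_capacity_gb=2000, node_ops_per_sec=5_000)
```

With these made-up numbers the in-memory setup is bound by capacity (8 nodes) and the on-disk one by throughput (40 nodes). With a lower request rate the result flips, which is why the answer depends so much on the use case.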

Alexander Gladysh

Dec 6, 2013, 11:28:02 AM
to redi...@googlegroups.com
On Fri, Dec 6, 2013 at 8:18 PM, Jonathan Leibiusky <iona...@gmail.com> wrote:
> On Dec 6, 2013 1:13 PM, "Alexander Gladysh" <agla...@gmail.com> wrote:
>> On Fri, Dec 6, 2013 at 8:09 PM, Jonathan Leibiusky <iona...@gmail.com>
>> wrote:
>> > One of the big challenges we had with redis in mercadolibre was size of
>> > dataset. The fact that it needs to fit in memory was a big issue for us.
>> > We used to have, on a common basis, 500gb DBs or even more.
>> > Not sure if this is a common case for other redis users anyway.
>>
>> Seems to be kind of screwdriver vs. nails problem, no? Why use Redis
>> for the task that it is explicitly not designed for?
>>
> It's not that we planned it. Developers started using it for something they
> thought will stay small but it grew. And it grew a lot.

Ah, I see. We had that happen (on much smaller scale). But, despite
Redis blowing up in our faces several times, we were eventually able
to get away with optimizing data sizes (and adding a few ad-hoc
cluster nodes).

> We ended up using
> redis to cache a small chunk of the data and the as a backend data store
> mysql or oracle.

This is exactly what I would do now, after having had that experience.
Redis can be a primary data storage, but you have to think very well
before using it as such.

I had a different point of view before, and it was the source of some
pain for us. You live and learn :-)

My 2c,
Alexander.

Alexander Gladysh

Dec 6, 2013, 11:29:28 AM
to redi...@googlegroups.com
On Fri, Dec 6, 2013 at 8:18 PM, Salvatore Sanfilippo <ant...@gmail.com> wrote:
> On Fri, Dec 6, 2013 at 5:12 PM, Alexander Gladysh <agla...@gmail.com> wrote:
>> On Fri, Dec 6, 2013 at 8:09 PM, Jonathan Leibiusky <iona...@gmail.com> wrote:
>>> One of the big challenges we had with redis in mercadolibre was size of
>>> dataset. The fact that it needs to fit in memory was a big issue for us.
>>> We used to have, on a common basis, 500gb DBs or even more.
>>> Not sure if this is a common case for other redis users anyway.
>>
>> Seems to be kind of screwdriver vs. nails problem, no? Why use Redis
>> for the task that it is explicitly not designed for?
>
> This is entirely possible but depends a lot on use case. If IOPS for
> object are in a range that you pay less for RAM compared to how many
> nodes you need to spin with an on-disk solution, then switching
> becomes hard even when you realize you are using a lot of RAM. Also it
> depends on where you run. On premise 500GB is not huge, on EC2 it is.

Of course. But you have to know Redis well to be able to get away with
this, and even to be able to make a weighed and sane decision on that
matter.

Alexander.

Salvatore Sanfilippo

Dec 6, 2013, 11:31:09 AM
to Redis DB
On Fri, Dec 6, 2013 at 5:08 PM, Pierre Chapuis
<catwell...@catwell.info> wrote:

> Afaik he is saying the system is single master and you cannot
> have two writes executing concurrently, so write throughput / latency
> is limited by a single node.

Unless you use sharding. Otherwise any system that accepts at the same
time, in two different nodes, a write for the same object, is
eventually consistent.

> But they don't say much about their reasons, basically "it didn't
> scale" :(

From what I can tell, Redis *can not* really scale on EC2 for
applications requiring a large data set just because of the cost of
spinning enough instances.
Imagine the 4TB Twitter Redis cluster on EC2. Totally possible even
for small companies on premise.

> He also mentions the allocator as their reason to use Memcache
> instead of Redis. I wonder if a lot of this criticism does not come
> from people who don't use jemalloc.

That's pre-jemalloc IMHO.

>> Let's face it, partitioning client side is complex. Redis Cluster
>> provides a lot of help for big players with many instances since
>> operations will be much simpler once you can reshard live.
>
>
> I can't comment much on that, I don't see a reason to use Redis
> Cluster for now. Most of my data is trivial to shard in the application.
> Maybe that would help with migrations / re-sharding but this is not
> *so* terrible if you don't let your shards grow really huge.

I'm quite sure that as soon as we provide solid Sentinel and a Redis
Cluster that works, we'll see a lot of new users...







Felix Gallo

Dec 6, 2013, 11:32:22 AM
to redi...@googlegroups.com
I think there's three types of criticism.  

The first type comes from a surge in popularity of high-A-style systems and, owing to the sexiness of those concepts and relative newness, a corresponding surge in dilettantes who try to eagerly apply knowledge gleaned from Aphyr's (great) Jepsen posts against all use cases, find Redis wanting, and try to be the first to tweet out the hipster sneering.  I won't name names but there's a dude who posted that you should replace redis with zookeeper.  I literally cried with laughter.

The second type is serious high-A folk like Aphyr, who do correctly point out that Redis cluster was not designed "properly."  It turns out that distributed systems are incredibly complicated and doing things the most simple and direct way, as Salvatore seems to aim to do, frequently misses some complex edge cases.  This type of criticism is more important, because here traditionally Redis has claimed it has a story when it really didn't.  I have concerns that Salvatore working alone will not get to a satisfactory story here owing to the complexities, and sometimes wonder if maybe external solutions (e.g. the system that uses zookeeper as a control plane) would not be better, not go for 100% availability, and for focus to be placed on the third area of criticism.

The third type is the most important, in my opinion: it's the people who fundamentally misunderstand Redis.  You see it all the time on this list: people who think Redis is mysql, or who ask why the server seems to have exploded when they put 100G of data in an m1.small, or why expiry is not instant, or why a transaction isn't rollable back.  The problem here is that Redis is very much a database construction set, with Unix-style semantics.  By itself it gives you just enough rope to hang you with.  By itself without care and feeding and diligence, Redis will detonate over time in the face of junior- and mid- level developers.  People will create clashing schemas across applications.  People will issue KEYS * in production.  People will DEL a 4 million long list and wonder why it doesn't return immediately (<-- this was me).  Heck, I'd been using Redis hard for a year before I learned the stupid SORT join trick from Josiah.  Many of these warts and complexities around usage and operation of a single instance could be smoothed over (KEYS *, ARE YOU SURE (Y/N) in redis-cli), and as far as making The World happy, that's probably the biggest bang for the buck.
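
As an illustration of the big-list DEL case, one common workaround is to shrink the list in slices with LTRIM so that no single command has to free millions of elements. This is only a sketch: it assumes a client with redis-py style `llen` and `ltrim` methods, and `FakeListClient` is a hypothetical stand-in that mimics LTRIM's inclusive, negative-index semantics so the example runs without a server.

```python
def delete_list_in_chunks(client, key, chunk=1000):
    """Empty a big list a slice at a time instead of one blocking DEL.

    Each LTRIM drops at most `chunk` elements from the tail.
    Returns the number of LTRIM calls that were needed.
    """
    calls = 0
    while client.llen(key) > 0:
        # Keep everything except the last `chunk` elements.
        client.ltrim(key, 0, -(chunk + 1))
        calls += 1
    return calls


class FakeListClient:
    """Hypothetical in-memory stand-in for the two list commands used."""

    def __init__(self):
        self.lists = {}

    def llen(self, key):
        return len(self.lists.get(key, []))

    def ltrim(self, key, start, stop):
        items = self.lists.get(key, [])
        n = len(items)
        if start < 0:
            start = max(n + start, 0)
        if stop < 0:
            stop = n + stop
        if start > stop or start >= n:
            self.lists.pop(key, None)  # an empty range removes the key
        else:
            self.lists[key] = items[start:stop + 1]
        return True


fake = FakeListClient()
fake.lists["queue"] = list(range(2500))
calls = delete_list_in_chunks(fake, "queue", chunk=1000)
```

The total work is the same as a single DEL, but it is split into short commands, so other clients get served in between.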

Personally, I've just finished deploying a major application component for an online game for which you have seen many billboards no matter where you are in the world.  Over 2 million users use the component every day, and we put and get tens-to-hundreds-of-thousands of data items per second.  We don't use in-redis clustering, and we don't use sentinel, but I sleep at night fine because my dev and ops teams understand the product and know how it fails.

F.




Shane McEwan

Dec 6, 2013, 11:42:25 AM12/6/13
to redi...@googlegroups.com
On 06/12/13 16:15, Salvatore Sanfilippo wrote:
> Just a question, supposing Redis Cluster were available and stable, is
> some problem at the intersection between Redis and Riak that you ended
> solving with Riak more disputable with Redis Cluster? Or it was a
> matter of other metrics like consistency model and alike?

I haven't looked at Redis Cluster yet so I can't say for sure. The main
reason for choosing Riak was scalability and redundancy. We know there's
some huge Riak clusters out there and we plan to be one of them
eventually. Our dataset is larger than can easily (cheaply!) fit into
memory so we use Riak with LevelDB to store our data while anything we
want quick and easy access to we store in Redis.

Shane.

Yiftach Shoolman

Dec 6, 2013, 12:03:03 PM12/6/13
to redi...@googlegroups.com
From the point of view of a Redis provider who "lives" off these OSS issues, I can only say that I know only a handful of companies that can actually manage any OSS DB by themselves at large scale in production. I'm quite sure that most of these transitions to Riak/Cassandra were backed by the Basho and Datastax guys. The fact that Redis is much more popular than those DBs (second only to Mongo in real NoSQL deployments) actually means that someone built a solid product here.
From the commercial side, there are now a few companies with enough cash in the bank to provide support and services around Redis, and I'm sure this will only strengthen its position.

Another point to mention is cloud deployment. I can only guess that most Redis deployments today are on AWS, and managing any large distributed deployment in this environment is a great challenge, especially with in-memory databases. This is because instances fail frequently, data-centers fail, network partitions happen too often, noisy neighbors are everywhere, local storage is ephemeral, SAN/EBS storage is not tuned for sequential writes, etc. That said, due to the competition from the other cloud vendors (SoftLayer, GCE and Azure), AWS infrastructure is constantly improving. For instance, last year there were 4 zone (data-center) failure events in the AWS us-east region; this year, zero. The new AWS C3 instances are now based on HVM, and most of the BGSAVE fork time issues have been solved.
    




--

Yiftach Shoolman
+972-54-7634621

Josiah Carlson

Dec 6, 2013, 1:29:16 PM12/6/13
to redi...@googlegroups.com
> Heck, I'd been using Redis hard for a year before I learned the stupid SORT join trick from Josiah.

Not stupid, just crazy :)


My criticisms are primarily from the point of view of someone who knows enough about Redis to be dangerous, who has spent the last 11+ years studying, designing, and building data structures, but who doesn't have a lot of time to work on Redis itself. All of the runtime-related issues have already been covered.

Long story short: every one of the existing data structures in Redis can be improved substantially. All of them can have their memory use reduced, and most of them can have their performance improved. I would argue that the ziplist encoding should be removed in favor of structures that are concise enough to make the optimization unnecessary for structures with more than 5 or 10 items. If the intset encoding is to be kept, I would also argue that it should be modified to apply to all sets of integers (not just small ones), and its performance characteristics updated if it happens that the implementation changes to improve large intset performance.
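
To make the intset trade-off concrete: an intset is, in essence, a sorted array of integers searched with binary search. Below is a minimal Python sketch of that idea. The class name IntSet and its methods are illustrative only and are not the Redis C implementation. Lookups are O(log n), but every insert shifts the tail of the array, which is O(n) and is presumably the kind of large-set cost that would need addressing before the encoding could apply to sets of any size.

```python
from array import array
from bisect import bisect_left


class IntSet:
    """Sorted, duplicate-free array of signed 64-bit integers."""

    def __init__(self):
        self._items = array("q")  # compact storage: 8 bytes per element

    def _find(self, value):
        """Return (index, found) using binary search over the sorted array."""
        i = bisect_left(self._items, value)
        return i, i < len(self._items) and self._items[i] == value

    def add(self, value):
        """Insert value in sorted position; return False if already present."""
        i, found = self._find(value)
        if found:
            return False
        self._items.insert(i, value)  # O(n): shifts every later element
        return True

    def __contains__(self, value):
        return self._find(value)[1]

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)
```

For example, adding 5, 1 and 3 in that order yields the stored sequence 1, 3, 5, and a second add(3) returns False without changing the array.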

I might also argue that something like Redis-nds should be included in core, but that it should *not* involve the development of a new storage engine, unless that storage engine is super simple (I wrote a bitcask w/index creation on shutdown in Go a few weeks ago in a week, and it is the best on-disk key/value storage engine I've ever used). I don't know whether explicitly paging data in and out makes sense, or whether it should be automatic, as I can make passionate arguments on both sides.
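
For readers unfamiliar with the bitcask design mentioned here, the core idea is an append-only data file plus an in-memory index from each key to the position of its latest value. The Python sketch below is written under heavy simplifying assumptions: the class name TinyBitcask and the record layout are hypothetical, there are no tombstones, checksums or compaction, and the index is rebuilt by scanning the log on open rather than being written out on shutdown. It is not Josiah's Go code and not Redis-nds.

```python
import os
import struct


class TinyBitcask:
    """Append-only key/value log with an in-memory index (illustrative only)."""

    HEADER = struct.Struct(">II")  # key length, value length (big-endian)

    def __init__(self, path):
        self._file = open(path, "a+b")  # creates the file; writes always append
        self._index = {}                # key -> (value offset, value length)
        self._rebuild_index()

    def _rebuild_index(self):
        """Scan the whole log once; later records for a key override earlier ones."""
        self._file.seek(0)
        offset = 0
        while True:
            header = self._file.read(self.HEADER.size)
            if len(header) < self.HEADER.size:
                break  # end of log
            klen, vlen = self.HEADER.unpack(header)
            key = self._file.read(klen)
            value_offset = offset + self.HEADER.size + klen
            self._index[key] = (value_offset, vlen)
            self._file.seek(vlen, os.SEEK_CUR)  # skip the value bytes
            offset = value_offset + vlen

    def put(self, key, value):
        """Append a record and point the index at the new value."""
        self._file.seek(0, os.SEEK_END)
        offset = self._file.tell()
        self._file.write(self.HEADER.pack(len(key), len(value)) + key + value)
        self._file.flush()
        self._index[key] = (offset + self.HEADER.size + len(key), len(value))

    def get(self, key):
        """Return the latest value for key, or None if it was never written."""
        entry = self._index.get(key)
        if entry is None:
            return None
        offset, size = entry
        self._file.seek(offset)
        return self._file.read(size)

    def close(self):
        self._file.close()
```

Keys and values are bytes. Every read costs one seek, and all keys must fit in memory, which is the usual price of this layout.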

All of that said, Redis does work very well for every use case that I find reasonable, even if there are some rough edges.
 - Josiah

Salvatore Sanfilippo

Dec 6, 2013, 1:36:13 PM12/6/13
to Redis DB
On Fri, Dec 6, 2013 at 5:32 PM, Felix Gallo <felix...@gmail.com> wrote:
> I think there's three types of criticism.

Hello Felix, I like classifications ;-)

> I won't
> name names but there's a dude who posted that you should replace redis with
> zookeeper. I literally cried with laughter.

Skipping that... as I recognized this and not worth analyzing :-)

> The second type is serious high-A folk like Aphyr, who do correctly point
> out that Redis cluster was not designed "properly." It turns out that
> distributed systems are incredibly complicated and doing things the most
> simple and direct way, as Salvatore seems to aim to do, frequently misses
> some complex edge cases. This type of criticism is more important, because
> here traditionally Redis has claimed it has a story when it really didn't.
> I have concerns that Salvatore working alone will not get to a satisfactory
> story here owing to the complexities, and sometimes wonder if maybe external
> solutions (e.g. the system that uses zookeeper as a control plane) would not
> be better, not go for 100% availability, and for focus to be placed on the
> third area of criticism.

Here I owe a "mea culpa": the first Sentinel and the first
version of Redis Cluster were designed before I seriously learned the
theoretical basis of distributed systems. This is why I spent the past
months reading and learning about distributed systems.

I believe the new design of Redis Cluster is focused on real trade
offs and will hold well in practice. It may not be bug free or some
minor changes may be needed but IMHO there are not huge mistakes.

Aphyr did a great thing by analyzing systems in practice: do they meet
expectations? However I think that distributed systems are not super
hard, just as kernel programming is not super hard, and C system
programming is not super hard. Everything that is new, or that you
don't do on a daily basis, seems super hard, but it is really just a
set of different concepts that everybody on this list can definitely
master.

So Redis Sentinel as a distributed system was not consistent? Wow,
asynchronous replication is used, so there is no way for a master that
is partitioned away to stop receiving writes, and there is no merge
operation afterward, just "whoever is the master rewrites the history".
Also the first Sentinel was much simpler to take apart from a
theoretical perspective: the system would not converge after the
partition heals, and that was simple to prove. It is also possible to
trivially prove that the ODOWN state, for the kind of algorithm used,
does not guarantee liveness (but this is practically not important for
*how* it is used now).

It is important to learn, but there is no distributed systems cathedral
that is impossible to climb. At most, one needs to learn more, and to
adapt the implementation to the best one can provide at a given
moment, given one's understanding, practical limits (a single coder)
and so forth.

However my take is that the Redis project responded in a
positive way to theoretical criticisms. I never believed that improving
our consistency story a lot was interesting for the kind of uses Redis
was designed for. I changed my mind, and we got things
like WAIT. This is a huge change: WAIT means that if you run three
nodes A, B, C where every node contains a Sentinel instance and a
Redis instance, and you "WAIT 1" after every operation to reach the
majority of slaves, you get a consistent system.
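
The "WAIT 1 after every operation" pattern can be sketched in Python as follows. This assumes a client whose set and wait methods behave like redis-py's (wait takes a replica count and a timeout in milliseconds and returns how many replicas acknowledged); the names write_with_ack and NotEnoughReplicas and the default timeout are hypothetical. As the replies that follow point out, this narrows the window for lost writes but does not by itself amount to strong consistency.

```python
class NotEnoughReplicas(Exception):
    """Raised when fewer replicas than requested acknowledged a write."""


def write_with_ack(client, key, value, replicas=1, timeout_ms=100):
    """SET a key, then WAIT until `replicas` slaves acknowledge the write.

    Note: if WAIT reports too few acknowledgements, the write has still
    been applied on the master. Nothing is rolled back; the caller only
    learns that the replication target was not reached in time.
    """
    client.set(key, value)
    acked = client.wait(replicas, timeout_ms)
    if acked < replicas:
        raise NotEnoughReplicas(
            f"{acked} of {replicas} replicas acknowledged within {timeout_ms} ms"
        )
    return acked
```

With three nodes (one master, two slaves), replicas=1 means the write is held by a majority (master plus one slave) before the call returns successfully.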

> The third type is the most important, in my opinion: it's the people who
> fundamentally misunderstand Redis. You see it all the time on this list:
> people who think Redis is mysql, or who ask why the server seems to have
> exploded when they put 100G of data in an m1.small, or why expiry is not
> instant, or why a transaction isn't rollable back. The problem here is that
> Redis is very much a database construction set, with Unix-style semantics.
> By itself it gives you just enough rope to hang you with. By itself without
> care and feeding and diligence, Redis will detonate over time in the face of
> junior- and mid- level developers. People will create clashing schemas
> across applications. People will issue KEYS * in production. People will
> DEL a 4 million long list and wonder why it doesn't return immediately (<--
> this was me). Heck, I'd been using Redis hard for a year before I learned
> the stupid SORT join trick from Josiah. Many of these warts and
> complexities around usage and operation of a single instance could be
> smoothed over (KEYS *, ARE YOU SURE (Y/N) in redis-cli), and as far as
> making The World happy, that's probably the biggest bang for the buck.

Totally agree. What is disturbing is that, in most environments where
you would expect "A class" developers, the system was sometimes misused
like that.

> Personally, I've just finished deploying a major application component for
> an online game for which you have seen many billboards no matter where you
> are in the world. Over 2 million users use the component every day, and we
> put and get tens-to-hundreds-of-thousands of data items per second. We
> don't use in-redis clustering, and we don't use sentinel, but I sleep at
> night fine because my dev and ops teams understand the product and know how
> it fails.

Totally reasonable... thanks for sharing.

John Watson

Dec 6, 2013, 2:34:28 PM12/6/13
to redi...@googlegroups.com
We outgrew Redis in 1 specific use case. For the exact tradeoff Salvatore
has already ceded as a possible deficiency.

Some info about that in this slide deck:
http://www.slideshare.net/gjcourt/cassandra-sf-meetup20130731

Besides that, Redis is still a critical piece of our infrastructure and
has not been much of a pain point. We "cluster" by running many instances per
machine (and in some "clusters", some semblance of HA by a spider web of
SLAVE OFs between them.) We also built a Python library for handling the clustering
client side using various routing methods: https://pypi.python.org/pypi/nydus
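
One common client-side routing method is consistent hashing, sketched below in Python. This is a generic illustration and not the Nydus API: the class name HashRing, the vnodes parameter and the node names are hypothetical. Each instance is placed on a ring at many pseudo-random points, and a key is routed to the first instance point at or after the key's own hash.

```python
import bisect
import hashlib


class HashRing:
    """Route keys to nodes so that removing a node only remaps that node's keys."""

    def __init__(self, nodes, vnodes=64):
        if not nodes:
            raise ValueError("at least one node is required")
        # Each node gets `vnodes` points on the ring to even out the load.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(text):
        # First 8 bytes of MD5 as an integer; used for placement, not security.
        return int.from_bytes(hashlib.md5(text.encode()).digest()[:8], "big")

    def node_for(self, key):
        """Return the node owning `key`: the next ring point clockwise."""
        i = bisect.bisect_right(self._hashes, self._hash(key))
        return self._ring[i % len(self._ring)][1]
```

A caller would keep one connection per instance and pick it with something like connections[ring.node_for("user:42")]. Multi-key operations across instances are not possible with this scheme, which is one of the usual drawbacks of client-side sharding.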

Of course Nydus has some obvious drawbacks and so we're watching the work
Salvatore has been putting in to Sentinel/Cluster very closely.

Aphyr Null

Dec 6, 2013, 3:07:37 PM12/6/13
to redi...@googlegroups.com
> WAIT means that if you run three nodes A, B, C where every node contains a Sentinel instance and a Redis instance, and you "WAIT 1" after every operation to reach the majority of slaves, you get a consistent system.

While I am enthusiastic about the Redis project's improvements with respect to safety, this is not correct.

Salvatore Sanfilippo

Dec 6, 2013, 4:14:37 PM12/6/13
to Redis DB
On Fri, Dec 6, 2013 at 9:07 PM, Aphyr Null <aphyr...@gmail.com> wrote:
> While I am enthusiastic about the Redis project's improvements with respect
> to safety, this is not correct.

It is not correct if you take it as "strong consistency" because there
are definitely failure modes; basically it is not as if synchronous
replication + failover turned the system into Paxos or Raft. For
example, if the master becomes writable again when the failover has
already started, we are no longer sure to pick the slave with the best
replication offset. However this is definitely "more consistent" than
in the past, and probably it is possible to achieve strong consistency
if you have a way to stop writes during the replication process.

I understand this is not the "C" consistency of "CAP", but compare.
Before: the partition with clients and the (old) master partitioned
away would receive writes that get lost.
After: under certain system models the system is consistent, for
example if you assume that crashed instances never start again. It is not
realistic as a system model, but it means that in practice you have a
better real-world behavior, and in theory you have a system that is
going towards a better consistency model.

Regards,
Salvatore

--
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
http://invece.org

Matt Palmer

Dec 6, 2013, 5:04:11 PM12/6/13
to redi...@googlegroups.com
On Fri, Dec 06, 2013 at 11:09:30AM -0500, Jonathan Leibiusky wrote:
> One of the big challenges we had with redis in mercadolibre was size of
> dataset. The fact that it needs to fit in memory was a big issue for us.
> We used to have, on a common basis, 500gb DBs or even more.
> Not sure if this is a common case for other redis users anyway.

Common enough that I sat down and hacked together NDS to satisfy it. As you
said in your other message, though, it isn't that anyone usually *plans* to
store 500GB of data from the start and chooses Redis anyway, but rather that
you start small, and then things get out of hand... the situation isn't
helped when the developers aren't aware enough of what's going on "inside
the box" to realise that they can't just throw data at Redis
indefinitely -- but then, I (ops) didn't exactly give them the full
visibility required to know how big those Rediises were getting...

- Matt

--
Ruby's the only language I've ever used that feels like it was designed by a
programmer, and not by a hardware engineer (Java, C, C++), an academic
theorist (Lisp, Haskell, OCaml), or an editor of PC World (Python).
-- William Morgan

Matt Palmer

Dec 6, 2013, 5:06:32 PM12/6/13
to redi...@googlegroups.com
On Fri, Dec 06, 2013 at 07:22:02AM -0800, Pierre Chapuis wrote:
> So when I read someone saying he would ban Redis from
> his architecture if he ever makes a startup, I think: "good
> thing he doesn't." :)

I, on the other hand, just sincerely hope that whatever startup he makes is
competing with mine, because if he refuses to use the right tool for the job
(if Redis turns out to be the right tool for a specific use case), then I'll
gladly use that tool as a competitive advantage, and I need every advantage
I can get.

- Matt

Salvatore Sanfilippo

Dec 6, 2013, 5:16:21 PM12/6/13
to Redis DB
On Fri, Dec 6, 2013 at 11:06 PM, Matt Palmer <mpa...@hezmatt.org> wrote:
> I, on the other hand, just sincerely hope that whatever startup he makes is
> competing with mine, because if he refuses to use the right tool for the job
> (if Redis turns out to be the right tool for a specific use case), then I'll
> gladly use that tool as a competitive advantage, and I need every advantage
> I can get.

This is a fundamental point.

If you consider systems from a theoretical point of view, everybody
should use Zookeeper.
It is like trying to win every war with a precision rifle: it is
the most accurate weapon, however it does not work against a tank.

People use Redis because it solves problems, because its data model
fits a given problem, and so forth, not because it offers the
best consistency guarantees.
This is our point of view as programmers. We try to do our best to
implement systems in a great way.

There are other guys, like the authors of the Raft algorithm, that try
to do "A grade" work in the field of applicable distributed systems.
Those people provide us with the theoretical foundation to improve the
systems we are designing, however it is the sensibility of the
programmer to pick the trade offs, the API, and so forth.

Companies using the right tools will survive and will solve user
problems. When a tool, like Redis, stops solving problems, it
becomes obsolete and after a few years is marginalized.
This is not a linear process because fashion is also a big player in
tech. Especially in the field of DBs, lately there is too much money
for the environment to be sane: people don't just argue from a
technical point of view, and there is a bit too much rage IMHO. But my
optimism tells me that eventually the technology is the most important
thing.

Salvatore Sanfilippo

Dec 6, 2013, 5:37:04 PM12/6/13
to Redis DB
On Fri, Dec 6, 2013 at 8:34 PM, John Watson <jo...@disqus.com> wrote:
> We outgrew Redis in 1 specific use case. For the exact tradeoff Salvatore
> has already ceded as a possible deficiency.
>
> Some info about that in this slide deck:
> http://www.slideshare.net/gjcourt/cassandra-sf-meetup20130731
>
> Besides that, Redis is still a critical piece of our infrastructure and
> has not been much of a pain point. We "cluster" by running many instances
> per
> machine (and in some "clusters", some semblance of HA by a spider web of
> SLAVE OFs between them.) We also built a Python library for handling the
> clustering
> client side using various routing methods:
> https://pypi.python.org/pypi/nydus

Hello John, thank you a lot for your feedback.
I seriously believe in using multiple DB systems to get the job done,
maybe because my point of view is biased by Redis being not very
general purpose, but I believe there is definitely value in being open
to use the right technologies for the right jobs. Of course this is a
concept that is hard to generalize: good engineers will understand with
great sensibility when something new is needed, and less experienced
ones sometimes make the mistake of throwing many technologies together
when they are not exactly needed, including Redis...

Thanks for the link to Nydus, I was not aware of this project. I'm
adding it here in the tools section -> http://redis.io/clients

> Of course Nydus has some obvious drawbacks and so we're watching the work
> Salvatore has been putting in to Sentinel/Cluster very closely.

Thanks, those are the priorities of the Redis project currently!

Salvatore



--
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
http://invece.org