Redis critiques, let's take the good part.

Salvatore Sanfilippo

Dec 6, 2013, 8:52:41 AM
to Redis DB
Hello dear Redis community,

today Pierre Chapuis started a discussion on Twitter about Redis
bashing, stimulated by this thread on Twitter from Rick Branson:

https://twitter.com/rbranson/status/408853897495592960

It is not the first time that Rick Branson, who works at Instagram,
openly criticizes Redis, I guess because he does not like the Redis
design and / or implementation.
However, according to Pierre, this is not limited to Rick:
there are other engineers in the SF area who believe that Redis
sucks, and Pierre also reported hearing similar stories in Paris.

Of course every open source project of a given size is a target of
critiques, especially a project like Redis that is very opinionated on how
programs should be written, with a search for simple design and
implementation that is sometimes felt as sub-optimal.
However, what can we learn from these critiques, and what is it that you
think is not working well in Redis? I really encourage you to share
your view.

As a starting point I'll use Rick's tweet: "BGSAVE. the sentinel wtf.
memory cliffs. impossible to track what's in it. heap fragmentation.
LRU impl sux. etc et".
He also writes: "you can't even really dump the whole keyspace because
KEYS "*" causes it to shit it's"

This is a good starting point, and I'll use the rest of this email to
see what happened in the different areas of Redis criticized by Rick.

1) BGSAVE

I'm not sure what is wrong with BGSAVE. Probably Rick had bad
experiences with EC2 instances, where the fork time can create latency
spikes?

2) The Sentinel WTF.

Here probably the reference is the following:
http://aphyr.com/posts/283-call-me-maybe-redis

Aphyr analyzed Redis Sentinel from the point of view of a consistent
system, consistent as in CAP "strong consistency". During partitions in
Aphyr's tests Sentinel was not able to keep the promises of a CP
system.
I replied with a blog post trying to clarify that Redis Sentinel is
not designed to provide strong consistency in the face of partitions,
but only to provide some degree of availability when the master
instance fails.

However the implementation of Sentinel, even as a system promoting a
slave when the master fails, was not optimal, so there was work to
reimplement it from scratch. The new Sentinel is finally available in
Redis 2.8.x and is much simpler to understand and predict. This is surely an
improvement. The new implementation is able to version changes in the
configuration that are eventually propagated to all the other
Sentinels, requires a majority to perform the failover, and so forth.

However if you understand even the basics of distributed programming
you know a few things, like how a system with asynchronous replication
is not capable of guaranteeing consistency.
Even if Sentinel was not designed for this, is Redis improving from
this point of view? Probably yes. For example the unstable branch now
has support for a new command called WAIT that implements a form of
synchronous replication.

Using WAIT and the new Sentinel, it is possible to have a setup that
is quite partition resistant. For example if you have three computers,
A, B, C, and run a Sentinel instance and a Redis instance on every
computer, only the majority partition will be able to perform the
failover, and the minority partition will stop accepting writes if you
use "WAIT 1", that is, if you wait for the propagation of the write to at
least one replica. The new Sentinel also automatically elects the slave
that has the most up-to-date version of the data.
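
To make the A, B, C scenario above concrete, here is a toy model in Python. It is only a sketch: `Node`, `write_with_wait` and `can_failover` are names invented for this illustration, not Redis or Sentinel APIs, and the timeouts and election details of the real WAIT command and of Sentinel are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy stand-in for a Redis instance (hypothetical, for illustration only)."""
    name: str
    data: dict = field(default_factory=dict)


def write_with_wait(master, replicas, reachable, key, value, min_replicas=1):
    """Apply a write on the master, then count replicas that received it.

    `reachable` is the set of node names on the master's side of the
    partition. Returns True only if at least `min_replicas` replicas got
    the write, which is the condition a client would check after WAIT.
    """
    master.data[key] = value  # the master applies the write locally first
    acked = 0
    for replica in replicas:
        if replica.name in reachable:
            replica.data[key] = value
            acked += 1
    return acked >= min_replicas


def can_failover(partition, total_nodes):
    """A partition may elect a new master only with a strict majority."""
    return len(partition) > total_nodes // 2


a, b, c = Node("A"), Node("B"), Node("C")

# Healthy network: A is master, B and C are replicas.
ok_healthy = write_with_wait(a, [b, c], {"A", "B", "C"}, "k", "v1")

# Partition: A is isolated, B and C are together.
ok_minority = write_with_wait(a, [b, c], {"A"}, "k", "v2")

print(ok_healthy, ok_minority)                               # True False
print(can_failover({"A"}, 3), can_failover({"B", "C"}, 3))   # False True
```

Note that in this model the isolated master still applies `v2` locally. The client simply never gets a positive acknowledgement, so it can treat the write as failed.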

Redis Cluster is another step forward towards Redis HA and automatic
sharding, we'll see how it works in practice. However I believe that
Sentinel is improving and Redis is providing more tools to fine-tune
consistency guarantees.

3) Impossible to track what is in it.

The lack of SCAN was indeed a problem; now it is solved. Even before, it
was somewhat possible to inspect data sets using RANDOMKEY, but SCAN is
surely a much better way to do this.
The same argument goes for KEYS *.
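
The difference between the two calling patterns can be sketched in a few lines of Python. This is a toy model over an in-memory list, with `scan` and `scan_all` as made-up helper names. The real SCAN command uses a different cursor scheme internally, so only the shape of the loop (many small calls until the cursor returns to 0) is meant to carry over.

```python
def scan(keys, cursor=0, count=10):
    """Return (next_cursor, batch); next_cursor == 0 means iteration is done.

    Toy model over a sorted list of keys. Real SCAN walks the hash table
    with a different cursor scheme; only the calling pattern is the same.
    """
    batch = keys[cursor:cursor + count]
    next_cursor = cursor + count
    if next_cursor >= len(keys):
        next_cursor = 0
    return next_cursor, batch


def scan_all(keys, count=10):
    """Collect every key using many small calls instead of one big KEYS *."""
    seen, calls, cursor = [], 0, 0
    while True:
        cursor, batch = scan(keys, cursor, count)
        seen.extend(batch)
        calls += 1
        if cursor == 0:
            break
    return seen, calls


keyspace = sorted(f"user:{i}" for i in range(95))
found, calls = scan_all(keyspace, count=10)
print(len(found), calls)  # 95 10
```

Each call does a bounded amount of work, so other clients can be served between calls, which is the property KEYS * lacks.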

4) LRU implementation sucks.

The LRU implementation in Redis 2.4 had issues, and under mass-expire
there were latency spikes.
The LRU in 2.6 is much smoother, however it contained issues signaled
by Pavlo Baron where the algorithm was not able to guarantee that expired
keys were always under a given threshold.
Newer versions of 2.6, and 2.8 of course, both fix this issue.

I'm not aware of issues with the LRU algorithm.
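
For readers wondering what "expired keys under a given threshold" refers to, here is a minimal sketch of a sampling-based expiration cycle. The constants (sample 20 keys, repeat while more than 25% of the sample was expired) follow the description in the EXPIRE command documentation as far as I can tell; the function name, the data layout and the absence of any time limit per cycle are assumptions of this sketch, not the actual Redis implementation.

```python
import random


def active_expire_cycle(expires, now, rng, sample_size=20, threshold=0.25):
    """Delete expired keys by repeated random sampling.

    `expires` maps key -> expiry timestamp and is modified in place.
    Each round samples up to `sample_size` keys, deletes those already
    expired, and repeats while more than `threshold` of the sample was
    expired. Returns the number of keys deleted.
    """
    deleted = 0
    while expires:
        sample = rng.sample(sorted(expires), min(sample_size, len(expires)))
        expired = [k for k in sample if expires[k] <= now]
        for k in expired:
            del expires[k]
        deleted += len(expired)
        if len(expired) / len(sample) <= threshold:
            break  # few expired keys left in the sample: stop this cycle
    return deleted


rng = random.Random(42)
# 1000 keys, 800 of them already expired at time now=100.
expires = {f"k{i}": (50 if i < 800 else 500) for i in range(1000)}
deleted = active_expire_cycle(expires, now=100, rng=rng)
still_expired = sum(1 for t in expires.values() if t <= 100)
print(deleted, still_expired)
```

The cycle stops on a statistical estimate, so some expired keys can remain in memory afterwards. That is also why expiry is not instant from the point of view of memory usage.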

I have the feeling that Rick's opinion is a bit biased by the fact that
he was exposed to older versions of Redis, however his criticisms were
in part actually applicable to older versions of Redis.
This shows that there is something good about these critiques. For
instance Rick always said that replication sucked because of the lack of
partial resynchronization. I'm sorry he is no longer able to say this.
As a consolation prize we'll send him a t-shirt if budget permits.
But this again shows that critiques tend to be focused where
deficiencies *are*, so hiding Redis behind a needle is not a good idea
IMHO. We need to improve the system to make it better, as long as it is
still a useful system for many users.

So, what are the critiques that you hear frequently about Redis? What
are your own critiques? When does Redis suck?

Let's tear Redis apart, something good will happen.

Salvatore

--
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
http://invece.org

We suspect that trading off implementation flexibility for
understandability makes sense for most system designs.
— Diego Ongaro and John Ousterhout (from Raft paper)

Pierre Chapuis

Dec 6, 2013, 10:22:02 AM
to redi...@googlegroups.com
Others:

Quentin Adam, CEO of Clever Cloud (a PaaS) has a presentation that says Redis is not fit to store sessions: http://www.slideshare.net/quentinadam/dotscale2013-how-to-scale/15 (he advises Membase)

Tony Arcieri (Square, ex-LivingSocial) is a "frequent offender":

https://twitter.com/bascule/status/277163514412548096
https://twitter.com/bascule/status/335538863869136896
https://twitter.com/bascule/status/371108333979054081
https://twitter.com/bascule/status/390919938862379008

Then there's the Disqus guys, who migrated to Cassandra,
the Superfeedr guys who migrated to Riak...

Instagram moved to Cassandra as well, here's more on
it by Branson to see where he comes from:
http://www.planetcassandra.org/blog/post/cassandra-summit-2013-instagrams-shift-to-cassandra-from-redis-by-rick-branson

This presentation about scaling Instagram with a small
team (by Mike Krieger) is very interesting as well:
http://qconsf.com/system/files/presentation-slides/How%20a%20Small%20Team%20Scales%20Instagram.pdf
He says he would go with Redis again, but there are
some points about scaling up Redis starting at slide 56.

My personal experience, to be clear, is that Redis is an
awesome tool when you know how it works and how to
use it, especially for a small team (like Krieger basically).

I have worked for a company with a very reduced technical
team for the last 3.5 years. We make technology for mobile
applications which we sell to large companies (retail, TV,
cinema, press...) mostly white-labelled. I have written most
of our server side software, and I have also been responsible
for operations. We have used and still use Redis *a lot*, and
some of the things we have done would just not have been
possible with such a reduced team in so little time without it.

So when I read someone saying he would ban Redis from
his architecture if he ever makes a startup, I think: "good
thing he doesn't." :)

Thank you Antirez for this awesome tool.

Alexander Gladysh

Dec 6, 2013, 10:25:14 AM
to redi...@googlegroups.com
On Fri, Dec 6, 2013 at 7:22 PM, Pierre Chapuis
<catwell...@catwell.info> wrote:

> My personal experience, to be clear, is that Redis is an
> awesome tool when you know how it works and how to
> use it, especially for a small team (like Krieger basically).

Indeed! Until you have bumped into all the hidden obstacles, the experience
is rather horrible. When Redis blows up in production, it usually
costs developers a few gray hairs :-)

However, after you know what not to do, Redis is all awesomeness.

My 2c,
Alexander.

Pierre Chapuis

Dec 6, 2013, 10:33:31 AM
to redi...@googlegroups.com
On Friday, December 6, 2013 16:25:14 UTC+1, Alexander Gladysh wrote:
> On Fri, Dec 6, 2013 at 7:22 PM, Pierre Chapuis
> <catwell...@catwell.info> wrote:
>
> Indeed! Until you have bumped into all the hidden obstacles, the experience
> is rather horrible. When Redis blows up in production, it usually
> costs developers a few gray hairs :-)

I would say that of every tool. You can outgrow any of them or use them poorly.

I had a terrible experience with MySQL. A (VC funded) startup around
here had issues with CouchDB, moved to Riak with Basho support,
had issues, moved to HBase which they still use (I think). That does
not make any of those tools bad. You just have to invest some time
into learning what those tools can and cannot do, which one to use for
which use case, and how to use them correctly.

--
Pierre Chapuis

Alexander Gladysh

Dec 6, 2013, 10:34:38 AM
to redi...@googlegroups.com
On Fri, Dec 6, 2013 at 7:33 PM, Pierre Chapuis
<catwell...@catwell.info> wrote:
> On Friday, December 6, 2013 16:25:14 UTC+1, Alexander Gladysh wrote:
>>
>> On Fri, Dec 6, 2013 at 7:22 PM, Pierre Chapuis
>> <catwell...@catwell.info> wrote:
>>
>> Indeed! Until you have bumped into all the hidden obstacles, the experience
>> is rather horrible. When Redis blows up in production, it usually
>> costs developers a few gray hairs :-)
>
>
> I would say that of every tool. You can outgrow any of them or use them poorly.
>
> I had a terrible experience with MySQL. A (VC funded) startup around
> here had issues with CouchDB, moved to Riak with Basho support,
> had issues, moved to HBase which they still use (I think). That does
> not make any of those tools bad. You just have to invest some time
> into learning what those tools can and cannot do, which one to use for
> which use case, and how to use them correctly.

I agree :-)

If the learning curve is flat, it usually means that the tool is too
casual to be useful.

Alexander.

Pierre Chapuis

Dec 6, 2013, 10:41:21 AM
to redi...@googlegroups.com
Also: I am not saying I have never experienced scaling issues
with Redis! I have. You always will when you build a system from
scratch that ends up serving millions of users. So there are
bottlenecks I hit, models I had to reconsider, and even things I had
to move off Redis.

But none of that made me go "OMG this tool is terrible and nobody
should use it, ever!!1". And I still think going with Redis in the first
place was a very good idea.

On a side note: one of the things it *did* make me decide not
to use is intermediate layers between my application and Redis
that abstract your models. When you hit a bottleneck, you want
to know exactly what you have stored in Redis, how and why.

So things like https://github.com/soveran/ohm are really cool
for prototyping and things that are not intended to scale, but
if you decide to use them for a product with traction you'd better
understand exactly what they do or just write your own abstraction
layer that suits your business logic.

Salvatore Sanfilippo

Dec 6, 2013, 10:47:11 AM
to Redis DB
On Fri, Dec 6, 2013 at 4:22 PM, Pierre Chapuis
<catwell...@catwell.info> wrote:
> Others:
>
> Quentin Adam, CEO of Clever Cloud (a PaaS) has a presentation that says
> Redis is not fit to store sessions:
> http://www.slideshare.net/quentinadam/dotscale2013-how-to-scale/15 (he
> advises Membase)

I don't quite understand the presentation to be super-honest. What
does "multiple writes" / "pseudo automic" mean? I'm not sure.
MULTI/EXEC and Lua scripts both retain their semantics on the slave,
which will process the transaction all-or-nothing.

About HA, with new Sentinel and Cluster we have something to say in
the present and in the future.
Not sure what Membase properties are, their page seems like marketing,
and I don't know a single person that uses it to be honest.

> Tony Arcieri (Square, ex-LivingSocial) is a "frequent offender":
>
> https://twitter.com/bascule/status/277163514412548096

Latency complaints, 2.2.x, no information given, but Redis can be
operated with excellent latency characteristics if you know what you
are doing.
Honestly I believe that from the point of view of average latency, and
ability to provide a consistent latency, Redis is one of the better
DBs available out there.
If you run it on EC2 with EBS, instances that can't fork, fsync that
can't cope, it is a sysop fail, not a problem with the system IMHO.

> https://twitter.com/bascule/status/335538863869136896

FUD

> https://twitter.com/bascule/status/371108333979054081

FUD

> https://twitter.com/bascule/status/390919938862379008

101 of distributed systems is that non-synchronous replication can
drop acknowledged writes.
Every single-instance on-disk DB not configured to fsync to disk at
every write can drop acknowledged writes.

So this is totally obvious for most DBs deployed currently.
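
The failure mode described above can be shown with a very small model. This is a sketch under obvious simplifications: `Master` is a made-up class, replication is a method call instead of a network stream, and the crash is just a comment in the code.

```python
class Master:
    """Toy master that acknowledges writes before replicating them."""

    def __init__(self):
        self.data = {}
        self.backlog = []  # writes not yet shipped to the replica

    def write(self, key, value):
        self.data[key] = value
        self.backlog.append((key, value))
        return "OK"  # acknowledged to the client immediately

    def replicate_to(self, replica):
        for key, value in self.backlog:
            replica[key] = value
        self.backlog.clear()


master, replica = Master(), {}

master.write("a", 1)
master.replicate_to(replica)   # this write reaches the replica
ack = master.write("b", 2)     # acknowledged, but still in the backlog

# The master crashes here; the replica is promoted with what it has.
promoted = dict(replica)
print(ack, "b" in promoted)    # OK False
```

The client was told "OK" for key `b`, yet the promoted replica never saw it. Any system that acknowledges before replicating (or before fsync, in the single-node case) has this window.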

What does not drop acknowledged writes as long as the majority is up?
CP systems with strong consistency like Zookeeper.

It's worth mentioning that WAIT, announced yesterday, can do a lot from
this point of view.

> Then there's the Disqus guys, who migrated to Cassandra,

I've no idea why Disqus migrated to Cassandra, probably it was just a
much better pick for them?
Migrating to a different system does not necessarily imply a problem with
Redis, so this is not a criticism we can use in a positive way to act,
unless the Disqus guys tell us why they migrated and what Redis
deficiencies they found.

> the Superfeedr guys who migrated to Riak...

Same story here.

> Instagram moved to Cassandra as well, here's more on
> it by Branson to see where he comes from:
> http://www.planetcassandra.org/blog/post/cassandra-summit-2013-instagrams-shift-to-cassandra-from-redis-by-rick-branson

And again...

> This presentation about scaling Instagram with a small
> team (by Mike Krieger) is very interesting as well:
> http://qconsf.com/system/files/presentation-slides/How%20a%20Small%20Team%20Scales%20Instagram.pdf
> He says he would go with Redis again, but there are
> some points about scaling up Redis starting at slide 56.

This is interesting indeed, and sounds like problems that we can solve
with Redis Cluster.
Let's face it, partitioning client side is complex. Redis Cluster
provides a lot of help for big players with many instances since
operations will be much simpler once you can reshard live.
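
As a rough illustration of why client-side partitioning gets painful when the number of instances changes, here is a sketch of the simplest scheme, hash modulo the number of shards. The helper names and the choice of CRC32 are assumptions made for this example; real clients often use consistent hashing or other schemes instead.

```python
import zlib


def shard_for(key, num_shards):
    """Pick a shard with a stable hash of the key, modulo the shard count."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def moved_fraction(keys, old_shards, new_shards):
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


keys = [f"user:{i}" for i in range(10_000)]
fraction = moved_fraction(keys, old_shards=4, new_shards=5)
print(f"{fraction:.0%} of keys map to a different shard")
```

With plain modulo hashing and a uniform hash, going from 4 to 5 shards relocates about 80% of the keys, and the application has to move that data itself. This is the kind of operation that live resharding is meant to take off the client's hands.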

I find the above pointers interesting, but how to act based on this?
IMHO the current route of providing a simple HA system like Sentinel,
trying to make it robust, and at the same time providing a more
complex system like Redis Cluster for "bigger needs" is the best
direction the Redis project can head in.

The "moved away from Redis" stories don't tell us much. What I believe
is that sometimes when you are small you tend to do things with an
in-memory data store that don't really scale cost-wise, since the IOPS
per instance can be handled with a disk-oriented system, so moving away
could be a natural consequence, and this is fine. At the start maybe using
Redis helped a lot by serving many queries with few machines,
during the boom, with relatively few users (in the order of maybe 1
million) but with the hype about the service creating big pressure from
the point of view of load.

What do you think we can do to improve Redis based on the above stories?

Cheers!

Pierre Chapuis

Dec 6, 2013, 10:48:05 AM
to redi...@googlegroups.com
On Friday, December 6, 2013 16:34:38 UTC+1, Alexander Gladysh wrote:

> If the learning curve is flat, it usually means that the tool is too
> casual to be useful.

This.

Also, maybe I avoided some of the issues others encountered in
production because:

  1) I have a MSc in distributed systems (helps sometimes :p)

  2) I had forked Redis and implemented custom commands
     before I actually deployed it so I understood the code base.

Also, I had read the documentation and not skipped the
parts about algorithmic complexity of the commands,
persistence trade-offs... :)

I guess that if you let a novice developer use Redis in his
application it may be easier for him to shoot himself in the
foot.

But... if you think about it, those things are also true of a
relational database: if you don't understand what you do
you will write dangerous code, and if you decide to use an
ORM and scale you'd better understand it.

Salvatore Sanfilippo

Dec 6, 2013, 10:52:07 AM
to Redis DB
On Fri, Dec 6, 2013 at 4:33 PM, Pierre Chapuis
<catwell...@catwell.info> wrote:

> I had a terrible experience with MySQL. A (VC funded) startup around
> here had issues with CouchDB, moved to Riak with Basho support,

About the "moves to Riak", this is also a component. People sought
help with Redis and there was nothing: I was busy, and Pivotal was not yet
providing support (now they finally do!).
If Basho engineers say "hi, we'll fix your issues", this is surely an
incentive (yet in this case people moved).

Unfortunately I'm really not qualified to say whether there is big value
in Riak for the use case it is designed for, as I hear a mix of
horrible and great things, and I never deployed it seriously.
But I'm happy that people try other solutions: in the end what is no
longer useful MUST DIE in technology.

If Redis dies in 6 months, this is great news: it means that
technology evolved enough that with other systems you can do the same
in some simpler way.
However as long as I see traction as I'm seeing it right now in the
project, and there is a company like Pivotal supporting the effort,
I'll continue to improve it.

Shane McEwan

Dec 6, 2013, 11:05:32 AM
to redi...@googlegroups.com
On 06/12/13 15:52, Salvatore Sanfilippo wrote:
> Unfortunately I'm really not qualified to say if there is big value or
> not into Riak for the use case it is designed about as I hear a mix of
> horrible and great things, and I never deployed it seriously.
> But I'm happy that people try other solutions: in the end what is no
> longer useful MUST DIE in technology.

For what it's worth, we run both Riak and Redis. They each solve
different problems for us. You use whichever tool solves your problem.
There's no point complaining that your screwdriver is no good at
hammering nails!

Shane.

Pierre Chapuis

Dec 6, 2013, 11:08:19 AM
to redi...@googlegroups.com
On Friday, December 6, 2013 16:47:11 UTC+1, Salvatore Sanfilippo wrote:
> On Fri, Dec 6, 2013 at 4:22 PM, Pierre Chapuis
> <catwell...@catwell.info> wrote:
>> Others:
>>
>> Quentin Adam, CEO of Clever Cloud (a PaaS) has a presentation that says
>> Redis is not fit to store sessions:
>> http://www.slideshare.net/quentinadam/dotscale2013-how-to-scale/15 (he
>> advises Membase)
>
> I don't quite understand the presentation to be super-honest, what
> means "multiple writes" / "pseudo automic"? I'm not sure.

Afaik he is saying the system is single master and you cannot
have two writes executing concurrently, so write throughput / latency
is limited by a single node.

>> Then there's the Disqus guys, who migrated to Cassandra,
>
> I've no idea why Disqus migrated to Cassandra, probably it was just a
> much better pick for them?
> Migrating to a different does not necessarily implies a problem with
> Redis, so this is not a criticism we can use in a positive way to act,
> unless Disqus guys write us why they migrated and what Redis
> deficiencies they found.

They mention it here:
http://planetcassandra.org/blog/post/disqus-discusses-migration-from-redis-to-cassandra-for-horizontal-scalability
 
But they don't say much about their reasons, basically "it didn't
scale" :(

>> This presentation about scaling Instagram with a small
>> team (by Mike Krieger) is very interesting as well:
>> http://qconsf.com/system/files/presentation-slides/How%20a%20Small%20Team%20Scales%20Instagram.pdf
>> He says he would go with Redis again, but there are
>> some points about scaling up Redis starting at slide 56.
>
> This is interesting indeed, and sounds like problems that we can solve
> with Redis Cluster. [...]

He also mentions the allocator as their reason to use Memcache
instead of Redis. I wonder if a lot of this criticism does not come
from people who don't use jemalloc.
 
> Let's face it, partitioning client side is complex. Redis Cluster
> provides a lot of help for big players with many instances since
> operations will be much simpler once you can reshard live.

I can't comment much on that, I don't see a reason to use Redis
Cluster for now. Most of my data is trivial to shard in the application.
Maybe that would help with migrations / re-sharding but this is not
*so* terrible if you don't let your shards grow really huge.

We suspect that trading off implementation flexibility for
understandability makes sense for most system designs.
       — Diego Ongaro and John Ousterhout (from Raft paper)

:)

Jonathan Leibiusky

Dec 6, 2013, 11:09:30 AM
to redi...@googlegroups.com

One of the big challenges we had with redis in mercadolibre was size of dataset. The fact that it needs to fit in memory was a big issue for us.
We used to have, on a common basis, 500gb DBs or even more.
Not sure if this is a common case for other redis users anyway.

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to redis-db+u...@googlegroups.com.
To post to this group, send email to redi...@googlegroups.com.
Visit this group at http://groups.google.com/group/redis-db.
For more options, visit https://groups.google.com/groups/opt_out.

Alexander Gladysh

Dec 6, 2013, 11:12:46 AM
to redi...@googlegroups.com
On Fri, Dec 6, 2013 at 8:09 PM, Jonathan Leibiusky <iona...@gmail.com> wrote:
> One of the big challenges we had with redis in mercadolibre was size of
> dataset. The fact that it needs to fit in memory was a big issue for us.
> We used to have, on a common basis, 500gb DBs or even more.
> Not sure if this is a common case for other redis users anyway.

Seems to be kind of screwdriver vs. nails problem, no? Why use Redis
for the task that it is explicitly not designed for?

(Not trying to offend you, this is an honest question, and a relevant one
I think, since we're talking about why Redis is perceived as deficient
by some users...)

Alexander.

Salvatore Sanfilippo

Dec 6, 2013, 11:15:41 AM
to Redis DB
On Fri, Dec 6, 2013 at 5:05 PM, Shane McEwan <sh...@mcewan.id.au> wrote:
> For what it's worth, we run both Riak and Redis. They each solve different
> problems for us. You use whichever tool solves your problem. There's no
> point complaining that your screwdriver is no good at hammering nails!

Totally makes sense indeed. The systems are very different.

Just a question, supposing Redis Cluster were available and stable: is
there some problem at the intersection between Redis and Riak that you
ended up solving with Riak that would be more debatable with Redis Cluster?
Or was it a matter of other metrics like the consistency model and the like?

Jonathan Leibiusky

Dec 6, 2013, 11:18:23 AM
to redi...@googlegroups.com

It's not that we planned it. Developers started using it for something they thought would stay small but it grew. And it grew a lot. We ended up using Redis to cache a small chunk of the data, with MySQL or Oracle as the backend data store.

Salvatore Sanfilippo

Dec 6, 2013, 11:18:50 AM
to Redis DB
On Fri, Dec 6, 2013 at 5:12 PM, Alexander Gladysh <agla...@gmail.com> wrote:
> On Fri, Dec 6, 2013 at 8:09 PM, Jonathan Leibiusky <iona...@gmail.com> wrote:
>> One of the big challenges we had with redis in mercadolibre was size of
>> dataset. The fact that it needs to fit in memory was a big issue for us.
>> We used to have, on a common basis, 500gb DBs or even more.
>> Not sure if this is a common case for other redis users anyway.
>
> Seems to be kind of screwdriver vs. nails problem, no? Why use Redis
> for the task that it is explicitly not designed for?

This is entirely possible but depends a lot on the use case. If the IOPS per
object are in a range where you pay less for RAM than for the number of
nodes you would need to spin up with an on-disk solution, then switching
becomes hard even when you realize you are using a lot of RAM. Also it
depends on where you run. On premise 500GB is not huge, on EC2 it is.
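
The trade-off can be put into back-of-the-envelope arithmetic. In the sketch below every figure (dataset size, request rate, per-node capacity and throughput) is invented purely for illustration, and `nodes_needed` is a hypothetical helper; plug in your own measurements before drawing any conclusion.

```python
import math


def nodes_needed(dataset_gb, ops_per_sec, node_capacity_gb, node_ops_per_sec):
    """Nodes required to satisfy both the capacity and the throughput need."""
    by_capacity = math.ceil(dataset_gb / node_capacity_gb)
    by_throughput = math.ceil(ops_per_sec / node_ops_per_sec)
    return max(by_capacity, by_throughput)


# All figures below are invented for illustration, not measurements.
dataset_gb, ops_per_sec = 500, 200_000

ram_nodes = nodes_needed(dataset_gb, ops_per_sec,
                         node_capacity_gb=64, node_ops_per_sec=100_000)
disk_nodes = nodes_needed(dataset_gb, ops_per_sec,
                          node_capacity_gb=2_000, node_ops_per_sec=5_000)
print(ram_nodes, disk_nodes)  # 8 40
```

With these made-up numbers the in-memory setup is bound by capacity (8 nodes) while the on-disk setup is bound by throughput (40 nodes). Drop the request rate to 1,000 operations per second and the on-disk side needs a single node, so the balance flips.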

Alexander Gladysh

Dec 6, 2013, 11:28:02 AM
to redi...@googlegroups.com
On Fri, Dec 6, 2013 at 8:18 PM, Jonathan Leibiusky <iona...@gmail.com> wrote:
> On Dec 6, 2013 1:13 PM, "Alexander Gladysh" <agla...@gmail.com> wrote:
>> On Fri, Dec 6, 2013 at 8:09 PM, Jonathan Leibiusky <iona...@gmail.com>
>> wrote:
>> > One of the big challenges we had with redis in mercadolibre was size of
>> > dataset. The fact that it needs to fit in memory was a big issue for us.
>> > We used to have, on a common basis, 500gb DBs or even more.
>> > Not sure if this is a common case for other redis users anyway.
>>
>> Seems to be kind of screwdriver vs. nails problem, no? Why use Redis
>> for the task that it is explicitly not designed for?
>>
> It's not that we planned it. Developers started using it for something they
> thought will stay small but it grew. And it grew a lot.

Ah, I see. We had that happen (on much smaller scale). But, despite
Redis blowing up in our faces several times, we were eventually able
to get away with optimizing data sizes (and adding a few ad-hoc
cluster nodes).

> We ended up using
> redis to cache a small chunk of the data and the as a backend data store
> mysql or oracle.

This is exactly what I would do now — after I had that experience.
Redis can be a primary data storage, but you have to think very well
before using it as such.

I had a different point of view before, and it was the source of some
pain for us. You live and learn :-)

My 2c,
Alexander.

Alexander Gladysh

Dec 6, 2013, 11:29:28 AM
to redi...@googlegroups.com
On Fri, Dec 6, 2013 at 8:18 PM, Salvatore Sanfilippo <ant...@gmail.com> wrote:
> On Fri, Dec 6, 2013 at 5:12 PM, Alexander Gladysh <agla...@gmail.com> wrote:
>> On Fri, Dec 6, 2013 at 8:09 PM, Jonathan Leibiusky <iona...@gmail.com> wrote:
>>> One of the big challenges we had with redis in mercadolibre was size of
>>> dataset. The fact that it needs to fit in memory was a big issue for us.
>>> We used to have, on a common basis, 500gb DBs or even more.
>>> Not sure if this is a common case for other redis users anyway.
>>
>> Seems to be kind of screwdriver vs. nails problem, no? Why use Redis
>> for the task that it is explicitly not designed for?
>
> This is entirely possible but depends a lot on use case. If IOPS for
> object are in a range that you pay less for RAM compared to how many
> nodes you need to spin with an on-disk solution, then switching
> becomes hard even when you realize you are using a lot of RAM. Also it
> depends on where you run. On premise 500GB is not huge, on EC2 it is.

Of course. But you have to know Redis well to be able to get away with
this, and even to be able to make a well-weighed and sane decision on that
matter.

Alexander.

Salvatore Sanfilippo

Dec 6, 2013, 11:31:09 AM
to Redis DB
On Fri, Dec 6, 2013 at 5:08 PM, Pierre Chapuis
<catwell...@catwell.info> wrote:

> Afaik he is saying the system is single master and you cannot
> have two writes executing concurrently, so write throughput / latency
> is limited by a single node.

Unless you use sharding. Otherwise any system that accepts at the same
time, in two different nodes, a write for the same object, is
eventually consistent.

> But they don't say much about their reasons, basically "it didn't
> scale" :(

From what I can tell, Redis *can not* really scale on EC2 for
applications requiring a large data set just because of the cost of
spinning enough instances.
Imagine the 4TB Twitter Redis cluster on EC2. Totally possible even
for small companies on premise.

> He also mentions the allocator as their reason to use Memcache
> instead of Redis. I wonder if a lot of this criticism does not come
> from people who don't use jemalloc.

That's pre-jemalloc IMHO.

>> Let's face it, partitioning client side is complex. Redis Cluster
>> provides a lot of help for big players with many instances since
>> operations will be much simpler once you can reshard live.
>
>
> I can't comment much on that, I don't see a reason to use Redis
> Cluster for now. Most of my data is trivial to shard in the application.
> Maybe that would help with migrations / re-sharding but this is not
> *so* terrible if you don't let your shards grow really huge.

I'm quite sure that as soon as we provide solid Sentinel and a Redis
Cluster that works, we'll see a lot of new users...



>
>> We suspect that trading off implementation flexibility for
>> understandability makes sense for most system designs.
>> — Diego Ongaro and John Ousterhout (from Raft paper)
>
>
> :)



--
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
http://invece.org

Felix Gallo

Dec 6, 2013, 11:32:22 AM
to redi...@googlegroups.com
I think there's three types of criticism.  

The first type comes from a surge in popularity of high-A-style systems and, owing to the sexiness of those concepts and relative newness, a corresponding surge in dilettantes who try to eagerly apply knowledge gleaned from Aphyr's (great) Jepsen posts against all use cases, find Redis wanting, and try to be the first to tweet out the hipster sneering.  I won't name names but there's a dude who posted that you should replace redis with zookeeper.  I literally cried with laughter.

The second type is serious high-A folk like Aphyr, who do correctly point out that Redis cluster was not designed "properly."  It turns out that distributed systems are incredibly complicated and doing things the most simple and direct way, as Salvatore seems to aim to do, frequently misses some complex edge cases.  This type of criticism is more important, because here traditionally Redis has claimed it has a story when it really didn't.  I have concerns that Salvatore working alone will not get to a satisfactory story here owing to the complexities, and sometimes wonder if maybe external solutions (e.g. the system that uses zookeeper as a control plane) would not be better, not go for 100% availability, and for focus to be placed on the third area of criticism.

The third type is the most important, in my opinion: it's the people who fundamentally misunderstand Redis.  You see it all the time on this list: people who think Redis is mysql, or who ask why the server seems to have exploded when they put 100G of data in an m1.small, or why expiry is not instant, or why a transaction isn't rollable back.  The problem here is that Redis is very much a database construction set, with Unix-style semantics.  By itself it gives you just enough rope to hang you with.  By itself without care and feeding and diligence, Redis will detonate over time in the face of junior- and mid- level developers.  People will create clashing schemas across applications.  People will issue KEYS * in production.  People will DEL a 4 million long list and wonder why it doesn't return immediately (<-- this was me).  Heck, I'd been using Redis hard for a year before I learned the stupid SORT join trick from Josiah.  Many of these warts and complexities around usage and operation of a single instance could be smoothed over (KEYS *, ARE YOU SURE (Y/N) in redis-cli), and as far as making The World happy, that's probably the biggest bang for the buck.
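
To illustrate the DEL example above: one workaround people use for a huge list is to shrink it in bounded steps instead of removing it in a single blocking call. The sketch below is an assumption-laden illustration, not something proposed in this thread. `FakeList` is a tiny in-memory stand-in invented here so the example runs without a server; its `llen` and `ltrim` methods mirror the names used by the redis-py client as far as I know.

```python
class FakeList:
    """Tiny in-memory stand-in exposing LLEN/LTRIM-like methods (hypothetical)."""

    def __init__(self, items):
        self.items = list(items)

    def llen(self, key):
        return len(self.items)

    def ltrim(self, key, start, end):
        # Redis-style inclusive, possibly negative, indexes.
        n = len(self.items)
        start = max(start + n, 0) if start < 0 else start
        end = end + n if end < 0 else min(end, n - 1)
        self.items = self.items[start:end + 1] if start <= end else []


def delete_list_in_chunks(client, key, chunk=1000):
    """Shrink a list by `chunk` elements per call until it is empty.

    Returns the number of LTRIM calls issued.
    """
    calls = 0
    while client.llen(key) > 0:
        # Keep everything except the last `chunk` elements.
        client.ltrim(key, 0, -(chunk + 1))
        calls += 1
    return calls


client = FakeList(range(4500))
calls = delete_list_in_chunks(client, "biglist", chunk=1000)
print(calls, client.llen("biglist"))  # 5 0
```

Each call removes at most `chunk` elements, so no single command holds the server for long. The price is more round trips and a list that stays partially visible while it is being removed.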

Personally, I've just finished deploying a major application component for an online game for which you have seen many billboards no matter where you are in the world.  Over 2 million users use the component every day, and we put and get tens-to-hundreds-of-thousands of data items per second.  We don't use in-redis clustering, and we don't use sentinel, but I sleep at night fine because my dev and ops teams understand the product and know how it fails.

F.




Shane McEwan

Dec 6, 2013, 11:42:25 AM
to redi...@googlegroups.com
On 06/12/13 16:15, Salvatore Sanfilippo wrote:
> Just a question, supposing Redis Cluster were available and stable, is
> some problem at the intersection between Redis and Riak that you ended
> solving with Riak more disputable with Redis Cluster? Or it was a
> matter of other metrics like consistency model and alike?

I haven't looked at Redis Cluster yet so I can't say for sure. The main
reason for choosing Riak was scalability and redundancy. We know there are
some huge Riak clusters out there and we plan to be one of them
eventually. Our dataset is larger than can easily (cheaply!) fit into
memory so we use Riak with LevelDB to store our data while anything we
want quick and easy access to we store in Redis.

Shane.

Yiftach Shoolman

Dec 6, 2013, 12:03:03 PM
to redi...@googlegroups.com
From the point of view of a Redis provider who "lives" off these OSS issues, I can only say that I know a handful of companies that can actually manage any OSS DB themselves at large scale in production. I'm quite sure that most of these transitions to Riak/Cassandra were backed by the Basho and DataStax guys. The fact that Redis is much more popular than those DBs (second only to Mongo in real NoSQL deployments) actually means that someone built a solid product here.
From the commercial side, there are now a few companies with enough cash in the bank to support and provide services around Redis; I'm sure this will only strengthen its position.

Another point to mention is cloud deployment. I can only guess that most Redis deployments today are on AWS, and managing any large distributed deployment in this environment is a great challenge, especially with in-memory databases. This is because instances fail frequently, data centers fail, network partitions happen too often, noisy neighbors are everywhere, storage is ephemeral, SAN/EBS storage is not tuned for sequential writes, etc. I can only say that due to the competition from the other cloud vendors (SoftLayer, GCE and Azure), AWS infrastructure is constantly improving. For instance, last year there were 4 zone (data-center) failure events in the AWS us-east region; this year, zero. The new AWS C3 instances are now based on HVM, and most of the BGSAVE fork time issues have been solved.
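One way to check whether fork time is actually hurting a given instance is to look at the `latest_fork_usec` field that Redis reports in INFO. The sketch below only parses an INFO reply that has already been fetched; the sample text and the 100 ms budget are assumptions chosen for illustration, not recommendations.

```python
def parse_info(text: str) -> dict:
    """Parse the 'field:value' lines of a Redis INFO reply into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and section headers
            continue
        name, _, value = line.partition(":")
        fields[name] = value
    return fields


def fork_was_slow(info_text: str, threshold_ms: float = 100.0) -> bool:
    """True if the last fork took longer than threshold_ms (an example budget)."""
    usec = int(parse_info(info_text).get("latest_fork_usec", "0"))
    return usec / 1000.0 > threshold_ms


# Made-up sample reply, only to exercise the functions above.
SAMPLE = """# Stats
total_commands_processed:1000
latest_fork_usec:250000
"""

print(fork_was_slow(SAMPLE))
```

Tracking this value after each BGSAVE on the old and the new instance types would show whether the move to HVM really removes the latency spike for a particular dataset size.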
    


--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to redis-db+u...@googlegroups.com.
To post to this group, send email to redi...@googlegroups.com.
Visit this group at http://groups.google.com/group/redis-db.
For more options, visit https://groups.google.com/groups/opt_out.



--

Yiftach Shoolman
+972-54-7634621

Josiah Carlson

Dec 6, 2013, 1:29:16 PM
to redi...@googlegroups.com
> Heck, I'd been using Redis hard for a year before I learned the stupid SORT join trick from Josiah.

Not stupid, just crazy :)


My criticisms are primarily from the point of view of someone who knows enough about Redis to be dangerous, who has spent the last 11+ years studying, designing, and building data structures, but who doesn't have a lot of time to work on Redis itself. All of the runtime-related issues have already been covered.

Long story short: every one of the existing data structures in Redis can be improved substantially. All of them can have their memory use reduced, and most of them can have their performance improved. I would argue that the ziplist encoding should be removed in favor of structures that are concise enough to make the optimization unnecessary for structures with more than 5 or 10 items. If the intset encoding is to be kept, I would also argue that it should be modified to apply to all sets of integers (not just small ones), and its performance characteristics updated if it happens that the implementation changes to improve large intset performance.
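For readers who have not looked at the encodings, the idea behind an intset can be sketched as a sorted, duplicate-free array of integers searched with binary search. This is a rough Python illustration of that idea under my own simplifications (fixed 64-bit width, no encoding upgrades), not Redis's C implementation.

```python
import bisect
from array import array


class IntSet:
    """Sorted, duplicate-free array of integers; membership via binary search."""

    def __init__(self) -> None:
        self._items = array("q")  # signed 64-bit integers, stored contiguously

    def add(self, value: int) -> bool:
        i = bisect.bisect_left(self._items, value)
        if i < len(self._items) and self._items[i] == value:
            return False  # already present
        self._items.insert(i, value)  # O(n) shift: the cost that grows with size
        return True

    def __contains__(self, value: int) -> bool:
        i = bisect.bisect_left(self._items, value)
        return i < len(self._items) and self._items[i] == value

    def __len__(self) -> int:
        return len(self._items)


s = IntSet()
for n in (42, 7, 42, 1000):
    s.add(n)
print(len(s), 7 in s, 8 in s)
```

Lookups stay logarithmic and memory stays compact, but every insert shifts the tail of the array. That linear insert cost is why a layout like this is attractive for small sets and would need a different structure to hold up for large ones.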

I might also argue that something like Redis-nds should be included in core, but that it should *not* involve the development of a new storage engine, unless that storage engine is super simple (I wrote a bitcask w/index creation on shutdown in Go a few weeks ago in a week, and it is the best on-disk key/value storage engine I've ever used). I don't know whether explicitly paging data in and out makes sense, or whether it should be automatic, as I can make passionate arguments on both sides.
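To give a feel for how small a bitcask-style engine can be, here is a minimal sketch: an append-only log of records, an in-memory index from key to value position, and an index ("hint") file written on clean shutdown so the next start does not have to scan the log. The class name, record layout, and JSON hint file are all assumptions made for this illustration; it has no compaction, deletes, or checksums.

```python
import json
import os
import struct

HEADER = struct.Struct(">II")  # key length, value length


class TinyBitcask:
    """Append-only log plus an in-memory index of key -> (value offset, length)."""

    def __init__(self, path: str) -> None:
        self.hint_path = path + ".hint"
        self.log = open(path, "a+b")  # created if missing; writes always append
        if os.path.exists(self.hint_path):
            # Index saved at the last clean shutdown: no log scan needed.
            with open(self.hint_path) as f:
                self.index = {k: tuple(v) for k, v in json.load(f).items()}
            os.remove(self.hint_path)  # stale as soon as we write again
        else:
            self.index = self._scan_log()  # e.g. after a crash

    def _scan_log(self) -> dict:
        index = {}
        self.log.seek(0)
        while True:
            pos = self.log.tell()
            header = self.log.read(HEADER.size)
            if len(header) < HEADER.size:
                break
            klen, vlen = HEADER.unpack(header)
            key = self.log.read(klen).decode("utf-8")
            index[key] = (pos + HEADER.size + klen, vlen)  # newest record wins
            self.log.seek(vlen, os.SEEK_CUR)  # skip over the value bytes
        return index

    def put(self, key: str, value: bytes) -> None:
        kb = key.encode("utf-8")
        self.log.seek(0, os.SEEK_END)
        pos = self.log.tell()
        self.log.write(HEADER.pack(len(kb), len(value)) + kb + value)
        self.log.flush()
        self.index[key] = (pos + HEADER.size + len(kb), len(value))

    def get(self, key: str) -> bytes:
        offset, length = self.index[key]
        self.log.seek(offset)
        return self.log.read(length)

    def close(self) -> None:
        with open(self.hint_path, "w") as f:
            json.dump(self.index, f)  # index creation on shutdown
        self.log.close()
```

Every read is one seek plus one read, and every write is one append, which is what makes this design attractive as a simple storage layer underneath an in-memory front end.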

All of that said, Redis does work very well for every use case that I find reasonable, even if there are some rough edges.
 - Josiah

Salvatore Sanfilippo

Dec 6, 2013, 1:36:13 PM
to Redis DB
On Fri, Dec 6, 2013 at 5:32 PM, Felix Gallo <felix...@gmail.com> wrote:
> I think there's three types of criticism.

Hello Felix, I like classifications ;-)

> I won't
> name names but there's a dude who posted that you should replace redis with
> zookeeper. I literally cried with laughter.

Skipping that... I recognized this one, and it is not worth analyzing :-)

> The second type is serious high-A folk like Aphyr, who do correctly point
> out that Redis cluster was not designed "properly." It turns out that
> distributed systems are incredibly complicated and doing things the most
> simple and direct way, as Salvatore seems to aim to do, frequently misses
> some complex edge cases. This type of criticism is more important, because
> here traditionally Redis has claimed it has a story when it really didn't.
> I have concerns that Salvatore working alone will not get to a satisfactory
> story here owing to the complexities, and sometimes wonder whether it would
> be better to rely on external solutions (e.g. the system that uses zookeeper
> as a control plane), not go for 100% availability, and place the focus on the
> third area of criticism.

Here a "mea culpa" is due: the first Sentinel and the first
version of Redis Cluster were designed before I seriously learned the
theoretical basis of distributed systems. This is why I spent the past
months reading and learning about distributed systems.

I believe the new design of Redis Cluster is focused on real
trade-offs and will hold up well in practice. It may not be bug free, and
some minor changes may be needed, but IMHO there are no huge mistakes.

Aphyr did a great thing analyzing systems in practice: do they live up to
expectations? However I think that distributed systems are not super
hard, just as kernel programming is not super hard and C system
programming is not super hard. Everything that is new, or that you don't
do on a daily basis, seems super hard, but these are just different concepts,
and definitely ones that everybody here on this list can master.

So Redis Sentinel as a distributed system was not consistent? Wow:
asynchronous replication is used, so there is no way for a master that is
partitioned away to stop receiving writes, and there is no merge operation
afterward, just "whoever is the master rewrites the history". Also, the
first Sentinel was much simpler to take apart from a theoretical
perspective: the system would not converge after the partition heals, and
that was simple to prove. It is also possible to trivially prove that the
ODOWN state, for the kind of algorithm used, does not guarantee liveness
(but this is practically not important for *how* it is used now).

It is important to learn, but there is no distributed systems cathedral
that is impossible to climb. At most you need to learn more, and to
adapt the implementation to the best one can provide at a given
moment, given one's understanding, practical limits (a single coder) and so
forth.

However, my take is that the Redis project responded in a
positive way to theoretical criticism. I never believed that greatly
improving our consistency story was interesting for the kind of uses
Redis was designed for. I changed my mind, and we got things
like WAIT. This is a huge change: WAIT means that if you run three
nodes A, B, C where every node contains a Sentinel instance and a
Redis instance, and you "WAIT 1" after every operation to reach the
majority of slaves, you get a consistent system.
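The "WAIT 1 after every operation" pattern can be sketched as follows. WAIT takes a number of replicas and a timeout in milliseconds and replies with how many replicas acknowledged the preceding writes; the `FakeClient` class is a hypothetical stand-in so the example runs without a server, and `set_and_wait` is an illustrative helper name, not part of any client library.

```python
class FakeClient:
    """Hypothetical stand-in for a Redis client; acks a fixed number of replicas."""

    def __init__(self, acked_replicas: int) -> None:
        self.acked_replicas = acked_replicas
        self.commands = []

    def execute_command(self, *args):
        self.commands.append(args)
        if args[0] == "WAIT":
            return self.acked_replicas  # WAIT replies with the number of acks
        return "OK"


def set_and_wait(client, key, value, replicas=1, timeout_ms=100) -> bool:
    """Write, then block until `replicas` replicas acknowledge or the timeout expires.

    Returns True only if enough replicas acknowledged the write.
    """
    client.execute_command("SET", key, value)
    acked = client.execute_command("WAIT", replicas, timeout_ms)
    return acked >= replicas


print(set_and_wait(FakeClient(acked_replicas=1), "k", "v"))
print(set_and_wait(FakeClient(acked_replicas=0), "k", "v"))
```

A False result does not undo the write on the master; it only tells the caller that the write was not acknowledged by enough replicas in time, so the application has to decide what to do next. Whether this pattern adds up to a consistent system is exactly what is disputed further down the thread.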

> The third type is the most important, in my opinion: it's the people who
> fundamentally misunderstand Redis. You see it all the time on this list:
> people who think Redis is mysql, or who ask why the server seems to have
> exploded when they put 100G of data in an m1.small, or why expiry is not
> instant, or why a transaction isn't rollable back. The problem here is that
> Redis is very much a database construction set, with Unix-style semantics.
> By itself it gives you just enough rope to hang you with. By itself without
> care and feeding and diligence, Redis will detonate over time in the face of
> junior- and mid- level developers. People will create clashing schemas
> across applications. People will issue KEYS * in production. People will
> DEL a 4 million long list and wonder why it doesn't return immediately (<--
> this was me). Heck, I'd been using Redis hard for a year before I learned
> the stupid SORT join trick from Josiah. Many of these warts and
> complexities around usage and operation of a single instance could be
> smoothed over (KEYS *, ARE YOU SURE (Y/N) in redis-cli), and as far as
> making The World happy, that's probably the biggest bang for the buck.

Totally agree. What is disturbing is that even in most of the environments
where you would expect "A class" developers, the system was sometimes
misused like that.

> Personally, I've just finished deploying a major application component for
> an online game for which you have seen many billboards no matter where you
> are in the world. Over 2 million users use the component every day, and we
> put and get tens-to-hundreds-of-thousands of data items per second. We
> don't use in-redis clustering, and we don't use sentinel, but I sleep at
> night fine because my dev and ops teams understand the product and know how
> it fails.

Totally reasonable... thanks for sharing.

John Watson

Dec 6, 2013, 2:34:28 PM
to redi...@googlegroups.com
We outgrew Redis in 1 specific use case, because of the exact tradeoff Salvatore
has already conceded as a possible deficiency.

Some info about that in this slide deck:
http://www.slideshare.net/gjcourt/cassandra-sf-meetup20130731

Besides that, Redis is still a critical piece of our infrastructure and
has not been much of a pain point. We "cluster" by running many instances per
machine (and in some "clusters", some semblance of HA by a spider web of
SLAVEOFs between them). We also built a Python library for handling the clustering
client side using various routing methods: https://pypi.python.org/pypi/nydus
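As an illustration of what client-side routing can look like, here is a generic consistent-hashing ring that maps each key to one of several instances. It does not reproduce Nydus's API (which is not shown in this thread); the class name, node names, and number of points per node are assumptions made for the sketch.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring: each node owns many points, a key goes to the next point."""

    def __init__(self, nodes, points_per_node: int = 64) -> None:
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(points_per_node)
        )
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(text: str) -> int:
        return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        # First point clockwise from the key's hash, wrapping around at the end.
        i = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[i][1]


ring = HashRing(["redis-a:6379", "redis-b:6379", "redis-c:6379"])
print(ring.node_for("user:42"))
```

The point of the ring over a plain `hash(key) % N` is that removing an instance only moves the keys that lived on it. The drawback is the one hinted at in the message: the client has to know the topology, and nothing here handles failover.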

Of course Nydus has some obvious drawbacks and so we're watching the work
Salvatore has been putting in to Sentinel/Cluster very closely.

Aphyr Null

Dec 6, 2013, 3:07:37 PM
to redi...@googlegroups.com
> WAIT means that if you run three nodes A, B, C where every node contains a Sentinel instance and a Redis instance, and you "WAIT 1" after every operation to reach the majority of slaves, you get a consistent system.

While I am enthusiastic about the Redis project's improvements with respect to safety, this is not correct.

Salvatore Sanfilippo

Dec 6, 2013, 4:14:37 PM
to Redis DB
On Fri, Dec 6, 2013 at 9:07 PM, Aphyr Null <aphyr...@gmail.com> wrote:
> While I am enthusiastic about the Redis project's improvements with respect
> to safety, this is not correct.

It is not correct if you take it as "strong consistency", because there
are definitely failure modes; basically it is not as if synchronous
replication + failover turned the system into Paxos or Raft. For
example, if the master becomes writable again when the failover has already
started, we are no longer sure to pick the slave with the best
replication offset. However this is definitely "more consistent" than
in the past, and it is probably possible to achieve strong consistency
if you have a way to stop writes during the replication process.

I understand this is not the "C" consistency of "CAP", but compare:

Before: the partition with the clients and the (old) master partitioned
away would receive writes that get lost.
After: under certain system models the system is consistent, for example if
you assume that crashed instances never start again. It is not
realistic as a system model, but it means that in practice you have
better real-world behavior, and in theory you have a system that is
going towards a better consistency model.

Regards,
Salvatore

--
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
http://invece.org

Matt Palmer

Dec 6, 2013, 5:04:11 PM
to redi...@googlegroups.com
On Fri, Dec 06, 2013 at 11:09:30AM -0500, Jonathan Leibiusky wrote:
> One of the big challenges we had with redis in mercadolibre was size of
> dataset. The fact that it needs to fit in memory was a big issue for us.
> We used to have, on a common basis, 500gb DBs or even more.
> Not sure if this is a common case for other redis users anyway.

Common enough that I sat down and hacked together NDS to satisfy it. As you
said in your other message, though, it isn't that anyone usually *plans* to
store 500GB of data from the start and chooses Redis anyway, but rather that
you start small, and then things get out of hand... the situation isn't
helped when the developers are so unaware of what's going on "inside
the box" that they don't realise that they can't just throw data at
Redis indefinitely -- but then, I (ops) didn't exactly give them the full
visibility required to know how big those Rediises were getting...

- Matt

--
Ruby's the only language I've ever used that feels like it was designed by a
programmer, and not by a hardware engineer (Java, C, C++), an academic
theorist (Lisp, Haskell, OCaml), or an editor of PC World (Python).
-- William Morgan

Matt Palmer

Dec 6, 2013, 5:06:32 PM
to redi...@googlegroups.com
On Fri, Dec 06, 2013 at 07:22:02AM -0800, Pierre Chapuis wrote:
> So when I read someone saying he would ban Redis from
> his architecture if he ever makes a startup, I think: "good
> thing he doesn't." :)

I, on the other hand, just sincerely hope that whatever startup he makes is
competing with mine, because if he refuses to use the right tool for the job
(if Redis turns out to be the right tool for a specific use case), then I'll
gladly use that tool as a competitive advantage, and I need every advantage
I can get.

- Matt

Salvatore Sanfilippo

Dec 6, 2013, 5:16:21 PM
to Redis DB
On Fri, Dec 6, 2013 at 11:06 PM, Matt Palmer <mpa...@hezmatt.org> wrote:
> I, on the other hand, just sincerely hope that whatever startup he makes is
> competing with mine, because if he refuses to use the right tool for the job
> (if Redis turns out to be the right tool for a specific use case), then I'll
> gladly use that tool as a competitive advantage, and I need every advantage
> I can get.

This is a fundamental point.

If you consider systems from a theoretical point of view, everybody
should use Zookeeper.
It is like trying to win every war with a precision rifle: it is
the most accurate weapon, however it does not work against a tank.

People use Redis because it solves problems, because its data model
fits a given problem, and so forth, not because it offers the
best consistency guarantees.
This is our point of view as programmers. We try to do our best to
implement systems in a great way.

There are other guys, like the authors of the Raft algorithm, who try
to do "A grade" work in the field of applicable distributed systems.
Those people provide us with the theoretical foundation to improve the
systems we are designing, however it is up to the sensibility of the
programmer to pick the trade-offs, the API, and so forth.

Companies using the right tools will survive and will solve user
problems. When a tool like Redis stops solving problems, it
becomes obsolete and after a few years marginalized.
This is not a linear process, because fashion is also a big player in
tech. Especially in the field of DBs, lately there is too much money
for the environment to be sane: people don't just argue from a
technical point of view, and there is a bit too much rage IMHO. But my
optimism tells me that eventually the technology is the most important
thing.

Salvatore Sanfilippo

Dec 6, 2013, 5:37:04 PM
to Redis DB
On Fri, Dec 6, 2013 at 8:34 PM, John Watson <jo...@disqus.com> wrote:
> We outgrew Redis in 1 specific use case, because of the exact tradeoff
> Salvatore has already conceded as a possible deficiency.
>
> Some info about that in this slide deck:
> http://www.slideshare.net/gjcourt/cassandra-sf-meetup20130731
>
> Besides that, Redis is still a critical piece of our infrastructure and
> has not been much of a pain point. We "cluster" by running many instances
> per machine (and in some "clusters", some semblance of HA by a spider web of
> SLAVEOFs between them). We also built a Python library for handling the
> clustering client side using various routing methods:
> https://pypi.python.org/pypi/nydus

Hello John, thank you a lot for your feedback.
I seriously believe in using multiple DB systems to get the job done,
maybe because my point of view is biased by Redis not being very
general purpose, but I believe there is definitely value in being open
to using the right technologies for the right jobs. Of course this is a
concept that is hard to generalize: good engineers will understand, with
great sensibility, when something new is needed, and less experienced
ones sometimes make the mistake of throwing many technologies together when
they are not exactly needed, including Redis...

Thanks for the link to Nydus, I was not aware of this project. I'm
adding it here in the tools section -> http://redis.io/clients

> Of course Nydus has some obvious drawbacks and so we're watching the work
> Salvatore has been putting in to Sentinel/Cluster very closely.

Thanks, those are the priorities of the Redis project currently!

Salvatore



--
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
http://invece.org

Alberto Gimeno Brieba

Dec 6, 2013, 5:46:34 PM
to redi...@googlegroups.com
Hi,

I use redis a lot (3 big projects already) and I love it. And I know many people that love it too.

For me the two biggest problems with redis used to be:

- distribute it over several nodes. The problem is being solved with redis-cluster. And synchronous replication is a great feature.

- the dataset size needs to fit into memory. Of course I totally understand that redis is an in-memory db and that this is the main reason redis is so fast. However I would appreciate something like NDS being officially supported. There were some attempts to address this problem in the past (vm and diskstore) but in the end they were removed.

I think that having something like NDS officially supported would make redis a great option for many more use cases. Often 90% of the "hot data" in your db fits in an inexpensive server, but the rest of the data is too big, and it would be too expensive (unaffordable) to have enough RAM for it. So in the end you choose another db for the entire dataset.

My 2 cents.

Salvatore Sanfilippo

Dec 6, 2013, 5:54:49 PM
to Redis DB
On Fri, Dec 6, 2013 at 6:03 PM, Yiftach Shoolman
<yiftach....@gmail.com> wrote:

> From the commercial side, there are now a few companies with enough cash in
> the bank for supporting and giving services around Redis, I'm sure this will
> only strengthen its position.

This is a very important point. Until recently, with Redis you were
on your own, and that's not good.

> Another point to mention is the cloud deployment - I can only guess that
> most of the Redis deployments today are on AWS, and managing any large
> distributed deployment over this environment is a great challenge and
> especially with in-memory databases. This is because: instances fail
> frequently, data-centers fail, network partitions happen too often, noisy
> neighbors are everywhere, the concept of ephemeral storage, SAN/EBS storage
> which is not tuned for sequential writes, etc - I can only say that due to the
> competition from the other cloud vendors, SoftLayer, GCE and Azure, AWS
> infrastructure is constantly improving. For instance - last year there were
> 4 zone (data-center) failure events in the AWS us-east region; this year -
> zero. The new AWS C3 instances are now based on HVM and most of the BGSAVE
> fork time issues have been solved.

Absolutely. In some ways EC2 is good for distributed systems: it is
problematic enough that it is much simpler to sense pain points and
to see partitions that in practice are very rare in other deployments.
This is a sort of "training for failure", and that's good. But seriously,
sometimes the problems are *just* a result of EC2, and if you don't
know how to fine-tune for this environment you are likely to see latency
and other issues that in other conditions are very hard to see at all.

I'm super happy about C3 instances, but... what about EBS? It remains
a problem I guess when AOF is enabled and disk can't cope with the
fsync policy...

Thanks,
Salvatore

Salvatore Sanfilippo

Dec 6, 2013, 6:00:01 PM
to Redis DB
On Fri, Dec 6, 2013 at 11:46 PM, Alberto Gimeno Brieba
<gime...@gmail.com> wrote:
> I think that having something like NDS officially supported would make redis
> a great option for many more use cases. Often 90% of the "hot data" in your
> db fits in an inexpensive server, but the rest of the data is too big, and
> it would be too expensive (unaffordable) to have enough RAM for it.
> So in the end you choose another db for the entire dataset.

I completely understand this, but IMHO to do Redis on disk right we need:

1) An optional threaded model. You might use it to dispatch only
slow queries and on-disk queries. Threads are not a good fit for
in-memory Redis, I think. Conversely, I believe that threads are the key
to a good on-disk implementation.
2) A native on-disk representation for every data structure. Mostly a
btree of btrees or the like, but there is definitely some work ahead to
understand what to use or what to implement.

Currently this is an effort that is basically impossible to take on. If I
am able to continue the development of what we have now (the core, Cluster,
Sentinel, the community...), now that the complexity has increased, that is
already a good result.
So for now the choice is to stay focused on the in-memory paradigm,
even if I understand this makes Redis less useful for certain use
cases: there are other DBs solving at least in part the "Redis
on disk" case, but there are few systems doing the Redis work well
IMHO.

Thanks!

Dvir Volk

Dec 6, 2013, 6:00:47 PM
to redi...@googlegroups.com
Just my two cents. 
I've been using redis in production for almost 3 years, and I've had many difficulties but many many wins with it.
I think the biggest mistake I made was, being excited about it in the beginning, to use it for too many things, some of which it didn't fit. 
I'm very happy with redis as:
1. a geo resolving database.
2. complex cache (where just putting a blob in something like memcache is not enough)
3. distributed event bus
4. semantic entity store.

Where I wasn't happy with it was:
1. storing data that had complex relations in it.
2. storing data that needed migration from other dbs constantly (that's not redis' fault though)
3. storing data that needed high persistence rates on EC2, although the forking problem was solved in recent generation machines.
4. having a mission critical DB that needed super fast failover. Sentinel in its original form was simply not good enough for what we needed.
5. needing cross DC replication. This has been solved in 2.8, but I needed it before.

So we've been moving some things we used to do with redis to other databases, but I still love this tool and would definitely use it for new projects.









--
Dvir Volk
Chief Architect, Everything.me

Salvatore Sanfilippo

Dec 6, 2013, 6:02:47 PM
to Redis DB
On Fri, Dec 6, 2013 at 7:29 PM, Josiah Carlson <josiah....@gmail.com> wrote:
> Long story short: every one of the existing data structures in Redis can be
> improved substantially. All of them can have their memory use reduced, and
> most of them can have their performance improved. I would argue that the
> ziplist encoding should be removed in favor of structures that are concise
> enough to make the optimization unnecessary for structures with more than 5
> or 10 items. If the intset encoding is to be kept, I would also argue that
> it should be modified to apply to all sets of integers (not just small
> ones), and its performance characteristics updated if it happens that the
> implementation changes to improve large intset performance.

Hello Josiah, thanks for your contribution. I agree with you, it is exactly
another case of "this is the simplest way to avoid work given that it
is good enough".
This deserves a person allocated solely to it, able to make steady
progress and merge code when it is mature / tested enough to avoid
disasters, since it is a very sensitive area.
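
To make the intset trade-off concrete: the encoding is essentially a sorted array of integers searched by binary search, so lookups stay cheap while inserts have to shift elements, which is one reason it is reserved for small sets. A minimal Python sketch of that idea follows; the class and method names are illustrative only and do not mirror Redis internals.

```python
import bisect


class IntSet:
    """Toy sorted-array set of integers (not the Redis implementation)."""

    def __init__(self):
        self._items = []  # kept sorted, no duplicates

    def add(self, value: int) -> bool:
        # Binary search for the slot: O(log n) comparisons.
        i = bisect.bisect_left(self._items, value)
        if i < len(self._items) and self._items[i] == value:
            return False  # already present
        # Inserting shifts every later element: O(n) in the worst case.
        self._items.insert(i, value)
        return True

    def __contains__(self, value: int) -> bool:
        i = bisect.bisect_left(self._items, value)
        return i < len(self._items) and self._items[i] == value

    def __len__(self) -> int:
        return len(self._items)


s = IntSet()
for v in (42, 7, 19, 7):
    s.add(v)
print(len(s), 19 in s, 8 in s)  # 3 True False
```

Applying such an encoding to large sets, as suggested above, would presumably need a structure with cheaper inserts than a flat array.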

Cheers,

Salvatore Sanfilippo

Dec 6, 2013, 6:07:36 PM
to Redis DB
Thanks Dvir, this is a very balanced message.

Certain use cases in your non-happy list probably will never be good
for Redis, including complex relations.
The good side is that I see other entries about issues that are
getting solved finally.

Just to collect a data point about the fast failover: what were the
consistency requirements and the actual failover times? A few seconds,
or milliseconds, or what?
The new Sentinel is faster at failing over, but could be made a lot faster
(on the order of 200 milliseconds instead of the 2-3 seconds it takes
now).

Salvatore

Dvir Volk

Dec 6, 2013, 6:20:47 PM
to redi...@googlegroups.com
A few seconds were fine. If you remember our lengthy discussion about it (and the rejected 1000 line-long pull request :) ) from about a year ago, the problem we had was how to do this stuff dynamically without changing config files, and without having sentinel state conflicting with Chef state.

I ended up protecting the code itself against a lost master, so the app won't fail while we do a longer failover process, and also moving the write-intensive, mission-critical stuff away from redis to cassandra. As long as I treat redis as (almost) read-only and as a potentially volatile data store (although we never suffered any major data loss with it), all is fine.
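
One way to read "protecting the code itself" is to treat every Redis call as something that may fail during a failover and to degrade to a default instead of raising. The sketch below only illustrates that pattern; `FlakyStore` and `MasterUnavailable` are hypothetical stand-ins, not a real client API, and the actual code at Everything.me is not shown in this thread.

```python
class MasterUnavailable(Exception):
    """Raised by the (hypothetical) client when the master cannot be reached."""


class FlakyStore:
    """Stand-in for a Redis client; `up` simulates master availability."""

    def __init__(self):
        self.up = True
        self.data = {}

    def get(self, key):
        if not self.up:
            raise MasterUnavailable(key)
        return self.data.get(key)

    def set(self, key, value):
        if not self.up:
            raise MasterUnavailable(key)
        self.data[key] = value


def read_or_default(store, key, default=None):
    """Treat the store as a volatile cache: a failed read is just a miss."""
    try:
        value = store.get(key)
    except MasterUnavailable:
        return default
    return default if value is None else value


def write_best_effort(store, key, value):
    """Return True if the write landed, False if it was dropped."""
    try:
        store.set(key, value)
        return True
    except MasterUnavailable:
        return False


store = FlakyStore()
write_best_effort(store, "feed:1", "cached")
store.up = False  # failover in progress
print(read_or_default(store, "feed:1", "fallback"))  # fallback
print(write_best_effort(store, "feed:2", "x"))       # False
```

Dropping writes like this is only acceptable for data that can be rebuilt, which matches the decision above to move mission-critical writes elsewhere.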

Salvatore Sanfilippo

Dec 6, 2013, 6:26:08 PM
to Redis DB
OK, thanks for the additional context. A few seconds is already in
line with the new implementation; however, now that you mention it, it
is really easy to bring the failover timeout delay under a few hundred
milliseconds. About the PR, I'm sorry, but I did not have enough focus /
context at the time to really understand if it was a good thing or
not... I really tried to take a slower evolution path where I was able
to understand more during the process. Thanks for the PR anyway and
for the chats ;-)

Salvatore

Pierre Chapuis

Dec 6, 2013, 6:42:00 PM
to redi...@googlegroups.com

On Friday, December 6, 2013 at 16:22:02 UTC+1, Pierre Chapuis wrote:

> Tony Arcieri (Square, ex-LivingSocial) is a "frequent offender":

OK, it looks like I have an apology to make. I wanted to say that Tony had often criticised Redis. Instead I used an English expression which I clearly did not understand well. That was a really, really stupid thing to do.

Moreover, even though I do not share his point of view on Redis, I think Tony is a very good engineer I respect a lot. In particular, he wrote Celluloid, which you probably know about if you are interested in distributed systems and/or Ruby. That makes me even more ashamed to have written such a terrible thing.

Aphyr Null

Dec 6, 2013, 6:44:53 PM
to redi...@googlegroups.com
> probably it is possible to achieve strong consistency 
> if you have a way to stop writes during the replication process.

A formal model and proof would go a long way towards convincing me. I strongly suspect that in the absence of transactional rollbacks, one cannot currently use WAIT to guarantee both linearizability and liveness in the presence of one or more node failures--not without careful control of the election process, anyway.
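
The rollback concern can be illustrated with a toy model. It is not a proof and not Redis code; it only assumes the documented behavior that WAIT reports how many replicas acknowledged a write and never undoes the write already applied on the master. All class and method names are made up for the illustration.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}


class Master(Node):
    def __init__(self, name, replicas):
        super().__init__(name)
        self.replicas = replicas
        self.reachable = set()  # names of replicas currently reachable

    def write_and_wait(self, key, value, numreplicas):
        # The write is applied locally first and is immediately visible.
        self.data[key] = value
        acked = 0
        for r in self.replicas:
            if r.name in self.reachable:
                r.data[key] = value
                acked += 1
        # Like WAIT, only report whether enough replicas acknowledged;
        # nothing is rolled back when the count is too low.
        return acked >= numreplicas


r1, r2 = Node("r1"), Node("r2")
m = Master("m", [r1, r2])
m.reachable = set()  # partition: the master cannot reach its replicas

ok = m.write_and_wait("x", 1, numreplicas=1)
print(ok)                # False: the client is told the write did not replicate
print(m.data.get("x"))   # 1: yet readers of the old master can observe it
# A failover now promotes r1, which never saw the write.
print(r1.data.get("x"))  # None: the value that was observed is gone
```

A reader who saw `x = 1` on the old master and then sees nothing on the new one has witnessed a history that no single-copy register could produce, which is the linearizability violation in question.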

Howard Chu

Dec 6, 2013, 6:44:56 PM
to redi...@googlegroups.com


On Friday, December 6, 2013 3:00:01 PM UTC-8, Salvatore Sanfilippo wrote:
> On Fri, Dec 6, 2013 at 11:46 PM, Alberto Gimeno Brieba
> <gime...@gmail.com> wrote:
> > I think that having something like NDS officially supported would make redis
> > a great option for many more usage cases. Many times the 90% of the "hot
> > data" in your db fits in an inexpensive server, but the rest of the data is
> > too big and would be too expensive (unaffordable) to have enough RAM for it.
> > So in the end you choose other db for the entire dataset.
>
> I completely understand this, but IMHO to make Redis on disk right we need:
>
> 1) an optional threaded model. You may use it to dispatch maybe only
> slow queries and on-disk queries. Threads are not a good fit for Redis
> in memory I think. Similarly I believe that threads are the key for a
> good on disk implementation.
> 2) Representing every data structure on disk in a native way. Mostly a
> btree of btrees or alike, but definitely some work ahead to understand
> what to use or what to implement.

LMDB, which NDS uses, already supports btree of btrees.

The major flaw in any in-memory DB design, as I see it, is the notion that there is a difference between in-memory data and on-disk data. It inherently leads to waste of CPU + memory due to redundant caches and associated management code.

Kelly Sommers

Dec 6, 2013, 7:21:08 PM
to redi...@googlegroups.com


On Friday, December 6, 2013 4:14:37 PM UTC-5, Salvatore Sanfilippo wrote:
> On Fri, Dec 6, 2013 at 9:07 PM, Aphyr Null <aphyr...@gmail.com> wrote:
> > While I am enthusiastic about the Redis project's improvements with respect
> > to safety, this is not correct.
>
> It is not correct if you take it as "strong consistency" because there
> are definitely failure modes; basically it is not as if synchronous
> replication + failover turned the system into Paxos or Raft. For
> example if the master returns writable when the failover already
> started we are no longer sure to pick the slave with the best
> replication offset. However this is definitely "more consistent" than
> in the past, and probably it is possible to achieve strong consistency
> if you have a way to stop writes during the replication process.

Descriptions like this indicate the trade-offs aren't understood, explicitly chosen and designed or accounted for. What is Redis trying to be? Is Redis trying to be a CP or AP system? Pick one and design it as such. From my perspective, with masters and slaves, Redis is trying to be a CP system but it's not achieving the goals. If it's trying to be an AP system, it isn't achieving those goals either. 

Broken CP systems are the worst kinds of AP systems. They aren't as consistent as they intend to be, nor as available and eventually consistent as they ought to be.

Now for a little tough love. I share the #2 type criticism concern Felix mentioned. Respect for the complexity of the problems that production distributed systems face seems to be a root problem here. This is a common theme I see repeating, even today. I don't think one can claim that "distributed systems are not super hard" while their distributed system has issues. Some people devote their entire career to this domain and you don't just learn it in a couple months.

I post this because like many, I want to see Redis improve and I want to see users I work with that use it and everyone else have a better experience. I think the distributed systems community is very welcoming and that Redis could benefit from some design discussions and peer review in these areas.

Josiah Carlson

Dec 6, 2013, 7:31:33 PM
to redi...@googlegroups.com
I thought your use of "frequent offender" with respect to Tony's complaints against Redis was right on :P

Whether or not he has built a lot of good stuff, Salvatore pointed out that his complaints were either FUD or missing the point of what Redis offers. Right tool for the right job and all that.

I wouldn't take it back, and I don't think that any reasonable person should have a problem with what you said.
 - Josiah


Josiah Carlson

Dec 6, 2013, 7:39:59 PM
to redi...@googlegroups.com
On Fri, Dec 6, 2013 at 3:02 PM, Salvatore Sanfilippo <ant...@gmail.com> wrote:
> On Fri, Dec 6, 2013 at 7:29 PM, Josiah Carlson <josiah....@gmail.com> wrote:
> > Long story short: every one of the existing data structures in Redis can be
> > improved substantially. All of them can have their memory use reduced, and
> > most of them can have their performance improved. I would argue that the
> > ziplist encoding should be removed in favor of structures that are concise
> > enough to make the optimization unnecessary for structures with more than 5
> > or 10 items. If the intset encoding is to be kept, I would also argue that
> > it should be modified to apply to all sets of integers (not just small
> > ones), and its performance characteristics updated if it happens that the
> > implementation changes to improve large intset performance.
>
> Hello Josiah, thanks for your contribution. I agree with you, it is exactly
> another case of "this is the simplest way to avoid work given that it
> is good enough".
> This deserves a person allocated solely to it, able to make steady
> progress and merge code when it is mature / tested enough to avoid
> disasters, since it is a very sensitive area.

Having someone on this as their job is exactly what it needs. It's a pity Pivotal missed the boat back in July.

 - Josiah

Alberto Gimeno

Dec 6, 2013, 8:00:12 PM
to redi...@googlegroups.com
Hi,

On Sat, Dec 7, 2013, at 12:00 AM, Salvatore Sanfilippo wrote:
> On Fri, Dec 6, 2013 at 11:46 PM, Alberto Gimeno Brieba
> <gime...@gmail.com> wrote:
> > I think that having something like NDS officially supported would make redis
> > a great option for many more usage cases. Many times the 90% of the "hot
> > data" in your db fits in an inexpensive server, but the rest of the data is
> > too big and would be too expensive (unaffordable) to have enough RAM for it.
> > So in the end you choose other db for the entire dataset.
>
> I completely understand this, but IMHO to make Redis on disk right we
> need:
>
> 1) an optional threaded model. You may use it to dispatch maybe only
> slow queries and on-disk queries. Threads are not a good fit for Redis
> in memory I think. Similarly I believe that threads are the key for a
> good on disk implementation.
> 2) Representing every data structure on disk in a native way. Mostly a
> btree of btrees or alike, but definitely some work ahead to understand
> what to use or what to implement.

What about using an already working disk key-value store like leveldb,
rocksdb (http://rocksdb.org), lmdb (like nds does
https://github.com/mpalmer/redis/tree/nds-2.6/deps/liblmdb ), etc.?

Matt Palmer

Dec 6, 2013, 8:43:23 PM
to redi...@googlegroups.com
On Sat, Dec 07, 2013 at 12:00:01AM +0100, Salvatore Sanfilippo wrote:
> On Fri, Dec 6, 2013 at 11:46 PM, Alberto Gimeno Brieba
> <gime...@gmail.com> wrote:
> > I think that having something like NDS officially supported would make redis
> > a great option for many more usage cases. Many times the 90% of the "hot
> > data" in your db fits in an inexpensive server, but the rest of the data is
> > too big and would be too expensive (unaffordable) to have enough RAM for it.
> > So in the end you choose other db for the entire dataset.
>
> I completely understand this, but IMHO to make Redis on disk right we need:
>
> 1) an optional threaded model. You may use it to dispatch maybe only
> slow queries and on-disk queries. Threads are not a good fit for Redis
> in memory I think. Similarly I believe that threads are the key for a
> good on disk implementation.

Well, in theory we've got Posix AIO and O_NONBLOCK, but... hahahaha. No.

I've pondered using bio to handle reads from disk, which would mostly just
involve adding the ability for bio to notify the event loop that a
particular key was now in memory (and thus running all those commands
blocked on that key), but I'm keeping that in reserve for a rainy and boring
weekend...
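
The idea of handing disk reads to a background worker and waking the blocked commands once the key is in memory can be sketched with Python's asyncio as a stand-in for the Redis event loop and bio threads. This is only an illustration under that assumption; `memory`, `disk`, `loading` and `get` are made-up names and nothing here is NDS code.

```python
import asyncio
import time

memory = {}                       # keys currently resident in RAM
disk = {"cold": "value-on-disk"}  # stand-in for the on-disk store
loading = {}                      # key -> Future shared by blocked commands


def read_from_disk(key):
    time.sleep(0.01)  # simulate a slow, blocking disk read
    return disk[key]


async def get(key):
    if key in memory:
        return memory[key]  # hot path: never leaves the event loop
    if key not in loading:
        loop = asyncio.get_running_loop()
        # Hand the blocking read to a worker thread; the loop stays free.
        loading[key] = loop.run_in_executor(None, read_from_disk, key)
    value = await loading[key]  # every command on this key waits here
    memory[key] = value
    loading.pop(key, None)
    return value


async def main():
    # Two commands hit the same cold key; only one disk read is issued.
    return await asyncio.gather(get("cold"), get("cold"))


results = asyncio.run(main())
print(results)  # ['value-on-disk', 'value-on-disk']
```

Commands on other keys keep being served while the read is in flight, which is the latency benefit being described.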

For now, I recommend enabling nds-keycache, keeping an eye on your cache hit
rate to make sure your maxmemory is high enough, and living with the
occasional latency spike when you have to go to disk to read in a
rarely-used key. Hell, if you're running Redis in EC2, you're used to huge
latency spikes, right? </me ducks>
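
Keeping an eye on the hit rate boils down to one ratio. The sketch below assumes you can obtain hit and miss counters from somewhere; the thread does not say which statistics NDS exposes, so the inputs and the 0.95 threshold are purely illustrative.

```python
def hit_rate(hits: int, misses: int) -> float:
    """Fraction of lookups served from memory; 0.0 when there is no traffic."""
    total = hits + misses
    return hits / total if total else 0.0


def needs_more_memory(hits: int, misses: int, target: float = 0.95) -> bool:
    # `target` is an arbitrary illustrative threshold, not an NDS default.
    return hit_rate(hits, misses) < target


print(hit_rate(950, 50))            # 0.95
print(needs_more_memory(900, 100))  # True
```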

> 2) Representing every data structure on disk in a native way. Mostly a
> btree of btrees or alike, but definitely some work ahead to understand
> what to use or what to implement.

I actually don't think this is a huge blocker. The time involved in
deserialising a value from the packed RDB format is, I believe, a small part
of the total time involved in getting a key from disk to memory -- compared
to how long you spend waiting for the disk to barf up something useful,
almost any CPU-oriented operation is lightning fast. True, I haven't
benchmarked this, and if someone does wave a profiler at NDS and it shows
that the amount of time spent in rdbLoadObject is a significant percentage
of the time spent in getNDS, I'll gladly change my opinion. Until then,
I'll worry more about reducing the impact of disk operations on request
latency.
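
The measurement proposed here (what share of a cold read is deserialization rather than disk wait) can be mocked up as follows. This is a rough sketch, not a benchmark of NDS: `pickle` stands in for the RDB format, `time.sleep` stands in for the disk, and the 5 ms latency is an arbitrary assumption.

```python
import pickle
import time


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def fake_disk_read(blob, latency_s):
    time.sleep(latency_s)  # stand-in for waiting on the disk
    return blob


def deserialize_share(value, latency_s):
    """Fraction of total load time spent deserializing."""
    blob = pickle.dumps(value)
    raw, t_read = timed(fake_disk_read, blob, latency_s)
    _, t_load = timed(pickle.loads, raw)
    return t_load / (t_read + t_load)


share = deserialize_share(list(range(200_000)), latency_s=0.005)
print(f"deserialization share of total: {share:.1%}")
```

Lowering `latency_s` toward SSD-like figures makes the deserialization share grow, which is the scenario where the CPU cost would start to matter.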

> So for now the choice is to stay focused in the in-memory paradigm

For you, perhaps. I'm having quite a bit of fun over here shuffling data on
and off disk inside of Redis. <grin> It's the beauty of OSS -- you can
focus on what you think is more important / interesting, and so can everyone
else.

And thanks, by the way, for providing such a high-quality, easy-to-hack-on
codebase to use as a starting point for my adventures.

- Matt

--
"After years of studying math and encountering surprising and
counterintuitive results, I came to accept that math is always reasonable,
but my intuition of what is reasonable is not always reasonable."
-- Steve VanDevender, ASR

Josh Berkus

Dec 6, 2013, 9:00:44 PM
to redi...@googlegroups.com
On 12/06/2013 05:43 PM, Matt Palmer wrote:
> I actually don't think this is a huge blocker. The time involved in
> deserialising a value from the packed RDB format is, I believe, a small part
> of the total time involved in getting a key from disk to memory -- compared
> to how long you spend waiting for the disk to barf up something useful,
> almost any CPU-oriented operation is lightning fast. True, I haven't
> benchmarked this, and if someone does wave a profiler at NDS and it shows
> that the amount of time spent in rdbLoadObject is a significant percentage
> of the time spent in getNDS, I'll gladly change my opinion. Until then,
> I'll worry more about reducing the impact of disk operations on request
> latency.

Actually, you'd be surprised how much time you can spend in
serialization operations. It's nothing compared with reading from EBS,
of course, but some people have faster disks than that; SSDs are quite
affordable these days, and even Amazon has dedicated IOPS. Not that
your prioritization is wrong; it's still better to spend your time where
you are spending it.

BTW, once we go over to disk-backed Redis, we're pretty much certain to
need a better append-only log. The general approach for on-disk
databases is to write first to the AOL (or WAL), and then have a
background process shuffle data to the searchable representation of the
database on disk; it turns out that writing to an AOL is vastly faster
than writing to more elaborately structured data, even (nay, especially)
on SSD.

Of course, right now we don't *have* background processes ...
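
The write-ahead pattern described above (append to a log first, fold the log into the searchable structure later) can be sketched in a few lines. This is a toy under simplifying assumptions: the "searchable representation" is a plain dict, the checkpoint is called explicitly instead of running in a background process, and `TinyStore` is a made-up name.

```python
import json
import os
import tempfile


class TinyStore:
    """Write to an append-only log first; fold it into the index later."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.index = {}   # stand-in for the searchable on-disk structure
        self.applied = 0  # number of log records already folded in

    def put(self, key, value):
        # Sequential append plus fsync: the write is durable once this returns.
        with open(self.log_path, "a", encoding="utf-8") as f:
            f.write(json.dumps([key, value]) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def checkpoint(self):
        # In a real system this runs in the background; here it is explicit.
        with open(self.log_path, encoding="utf-8") as f:
            records = f.readlines()
        for line in records[self.applied:]:
            key, value = json.loads(line)
            self.index[key] = value
        self.applied = len(records)


path = os.path.join(tempfile.mkdtemp(), "wal.log")
store = TinyStore(path)
store.put("a", 1)
store.put("a", 2)
store.checkpoint()
print(store.index)  # {'a': 2}
```

The foreground path only ever does a sequential append, which is why it tends to be faster than updating a structured on-disk representation in place.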

Anyway, as an Old Database Geek, I'll speak for the Postgres community
and say that we're around if you need advice on how to manage disk-based
access. We have more than a little experience in this regard ;-)

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

Matt Palmer

Dec 6, 2013, 9:04:29 PM
to redi...@googlegroups.com
On Sat, Dec 07, 2013 at 02:00:12AM +0100, Alberto Gimeno wrote:
> On Sat, Dec 7, 2013, at 12:00 AM, Salvatore Sanfilippo wrote:
> > I completely understand this, but IMHO to make Redis on disk right we
> > need:

[...]

> > 2) Representing every data structure on disk in a native way. Mostly a
> > btree of btrees or alike, but definitely some work ahead to understand
> > what to use or what to implement.
>
> What about using an already working disk key-value store like leveldb,
> rocksdb (http://rocksdb.org), lmdb (like nds does
> https://github.com/mpalmer/redis/tree/nds-2.6/deps/liblmdb ), etc.?

I think the issue that Salvatore is talking about there is that with all
those examples you've given, they all treat the values associated with keys
as opaque blobs. Redis, on the other hand, provides its value squarely in
the realm of "I know what these values are, and I have the commands
necessary to allow you to manipulate them".

For NDS, I've gotten around that by only allowing disk/memory granularity at
the key level -- if you want any part of a key, the entire key gets loaded
into memory, and then Redis works on it entirely as normal. This is
hideously inefficient for very large values (hence the "Naive" in "Naive
Disk Store") and performance for almost all types of values would be greatly
improved if granularity increased, but what's there now works well enough
for a great variety of workloads. (/me gives the sigmonster a cookie)
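
Key-level granularity means that touching one element of a large value pulls the whole value into memory. A minimal sketch of that behavior follows, with `pickle` standing in for the serialized on-disk format and `NaiveKeyStore` as a made-up name; it does not reflect how NDS is actually implemented.

```python
import pickle


class NaiveKeyStore:
    """Whole-key paging: touching any part of a value loads all of it."""

    def __init__(self):
        self.disk = {}    # key -> serialized blob (stand-in for the disk store)
        self.memory = {}  # key -> live Python object
        self.loads = 0    # how many whole-value loads happened

    def _load(self, key):
        if key not in self.memory:
            self.memory[key] = pickle.loads(self.disk[key])
            self.loads += 1
        return self.memory[key]

    def flush(self, key):
        # Write the whole value back and drop it from memory.
        self.disk[key] = pickle.dumps(self.memory.pop(key))

    def lindex(self, key, i):
        # Reading one element still deserializes the entire list.
        return self._load(key)[i]


s = NaiveKeyStore()
s.memory["big"] = list(range(100_000))
s.flush("big")
print(s.lindex("big", 5), s.loads)  # 5 1
print(s.lindex("big", 6), s.loads)  # 6 1
```

The first access pays for all 100,000 elements; later accesses are ordinary in-memory operations until the key is flushed again.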

- Matt

--
> There really is no substitute for brute force.
Indeed - I must admit to being a disciple of blessed Saint Makita myself.
-- Robert Sneddon and Tanuki, in the Monastery

Rodrigo Ribeiro

Dec 7, 2013, 12:05:30 AM
to redi...@googlegroups.com
This is a great post Antirez, redis will only improve from this kind of feedback.

Well, we use redis extensively at JusBrasil, and my biggest complaint is how expensive it can be to keep a large dataset highly available.
One of our uses is processing user feeds. For this we have a cluster of 90 redis instances (distributed across 15 servers); 2/3 of those instances are slaves, used for reads and by our failover mechanism (similar to sentinel).
The problem is that we need to use 2x more memory even if we decide not to read from slaves.

Redis could have an option to run as a "cold-slave" that only receives changes from the master and appends them to disk (RDB+AOF or something similar to the NDS fork), keeping minimal memory usage while in this state.
Then, when sentinel elects it as the new master, it would load everything into memory and return to normal execution.
This would be a huge memory reduction for our cluster; just an idea though.
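
The "cold-slave" idea can be sketched as a replica that only records changes while cold and materializes the dataset on promotion. This is a toy model of the proposal, not an existing Redis feature; the list stands in for an on-disk log and all names are hypothetical.

```python
class ColdReplica:
    """Keeps only an append-only change log until it is promoted."""

    def __init__(self):
        self.log = []     # stand-in for an on-disk log of changes
        self.data = None  # no in-memory dataset while cold

    def replicate(self, op, key, value=None):
        # While cold, just record the change; nothing is materialized.
        self.log.append((op, key, value))

    def promote(self):
        # On promotion, replay the log to build the in-memory dataset.
        self.data = {}
        for op, key, value in self.log:
            if op == "SET":
                self.data[key] = value
            elif op == "DEL":
                self.data.pop(key, None)
        return self.data


r = ColdReplica()
r.replicate("SET", "a", 1)
r.replicate("SET", "b", 2)
r.replicate("DEL", "a")
print(r.data)       # None
print(r.promote())  # {'b': 2}
```

The trade-off is visible in `promote`: memory is saved while cold, but failover time now includes replaying (or loading) the whole dataset.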

I also think core development could work more closely with the community.
I understand that it is important to keep redis simple, but I see a few forks with good contributions (e.g. NDS, Sentinel automatic discovery/registration), yet not much movement to merge them into the core.
-- 
Rodrigo Ribeiro

Matt Palmer

Dec 7, 2013, 3:26:34 AM
to redi...@googlegroups.com
On Fri, Dec 06, 2013 at 06:00:44PM -0800, Josh Berkus wrote:
> On 12/06/2013 05:43 PM, Matt Palmer wrote:
> > I actually don't think this is a huge blocker. The time involved in
> > deserialising a value from the packed RDB format is, I believe, a small part
> > of the total time involved in getting a key from disk to memory -- compared
> > to how long you spend waiting for the disk to barf up something useful,
> > almost any CPU-oriented operation is lightning fast. True, I haven't
> > benchmarked this, and if someone does wave a profiler at NDS and it shows
> > that the amount of time spent in rdbLoadObject is a significant percentage
> > of the time spent in getNDS, I'll gladly change my opinion. Until then,
> > I'll worry more about reducing the impact of disk operations on request
> > latency.
>
> Actually, you'd be surprised how much time you can spend in
> serialization operations. It's nothing compared with reading from EBS,
> of course, but some people have faster disks than that; SSDs are quite
> affordable these days, and even Amazon has dedicated IOPS.

While I've come to the conclusion that PIOPS are snakeoil, SSDs are quite
nice -- but they're not magic. They're still not as fast as RAM or CPU.

> BTW, once we go over to disk-backed Redis, we're pretty much certain to
> need a better append-only log. The general approach for on-disk
> databases is to write first to the AOL (or WAL), and then have a
> background process shuffle data to the searchable representation of the
> database on disk; it turns out that writing to an AOL is vastly faster
> than writing to more elaborately structured data, even (nay, especially)
> on SSD.

Oh, definitely. In the case of NDS, writing to disk doesn't impact
performance, because that's done from memory to disk in a forked background
process, but that naturally sucks because the data isn't properly durable
(the use case I was addressing meant I can suffer the loss of the last few
writes).

For a proper disk-backed Redis, I'd be switching to something like AOF
fragments to store the log, and the background process would rewrite the AOF
fragments into the disk cache; on startup, this would also be done before we
start serving data.

> Anyway, as an Old Database Geek, I'll speak for the Postgres community
> and say that we're around if you need advice on how to manage disk-based
> access. We have more than a little experience in this regard ;-)

Yeah, I can imagine...

- Matt

--
The hypothalamus is one of the most important parts of the brain, involved
in many kinds of motivation, among other functions. The hypothalamus
controls the "Four F's": 1. fighting; 2. fleeing; 3. feeding; and 4. mating.
-- Psychology professor in neuropsychology intro course

Matt Palmer

Dec 7, 2013, 3:31:20 AM
to redi...@googlegroups.com
On Fri, Dec 06, 2013 at 09:05:30PM -0800, Rodrigo Ribeiro wrote:
> Redis could have a option to run as a "cold-slave", that only receive
> changes from master and append to disc(RDB+AOF or something similar to NDS
> fork), keeping minimal memory usage while in this state.

You could definitely do this with NDS right now -- set a low nds-watermark
and a huge maxmemory on the slaves, and then as part of the promotion
process, set nds-watermark to 0 (turns it off) and trigger a preload.
Performance will suck a little while the preloading gets everything into
memory, but after that it'll feel just like normal Redis, except you'll get
the benefits of NDS persistence (quick restarts, frequent but tiny disk
flushes, etc).

- Matt


--
Judging by this particular thread, many people in this group spent their
school years taking illogical, pointless orders from morons and having their
will to live systematically crushed. And people say school doesn't prepare
kids for the real world. -- Rayner, in the Monastery

Robert Allen

Dec 6, 2013, 9:07:56 PM
to redi...@googlegroups.com
Firstly, I would like to say thank you to all contributors for your time, efforts and contributions to this outstanding project. 

We have utilised redis for three and a half years with only one notable incident; an incident I attribute solely to a failed HA implementation not related to redis itself. Charles Eames is quoted as saying, "design depends largely on constraints." This holds true with redis and all other systems components. As consumers, we have the noble responsibility to ensure we know, define and learn the constraints of all components we deploy or develop. Our deployment[s] of redis has grown massively in the three+ years of constant use, necessitating these deployments to be configured and tuned with workloads divided specifically to what they are responsible for. We do not mix persisting data, transient cache keys or sessions; we do not utilise Sentinel for HA yet (I would like to but I am giving it more time). 

In summary, I am convinced that, at this time, there are no other viable products that would fit our environment and constraints as well as redis has and will continue to for the foreseeable future. 



Salvatore Sanfilippo

Dec 7, 2013, 3:52:12 AM