I wanted to have a hello world application for Redis. Something so
simple that even the average PHP programmer can understand it without
problems, but that is a real application of Redis in a plausible
environment. The result is a Twitter clone called Retwis, that you can
see at work here: http://retwis.antirez.com/home.php
I'll release the source code together with an article explaining the
design and the source code in a very simple way. My guess is that
key-value stores have so far failed a bit on the marketing side of the
game; I would like to avoid this error with Redis :)
Any hint appreciated!
antirez
--
Salvatore 'antirez' Sanfilippo
http://antirez.com
Organizations which design systems are constrained to produce designs
which are copies of the communication structures of these
organizations.
Conway's Law
> with your small dev hardware this would take 5 full seconds to
> process. How would this scale for real in production?
Side-stepping your question somewhat, is 5 seconds actually bad? If you
did the SETs from a message queue, I'm pretty sure it would be fine;
people with 250,000 followers aren't Tweeting every minute.
You can keep the illusion up by doing a single synchronous SET into the
tweeter's own feed too.
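To make that concrete, here is a rough sketch in Python with the
redis-py client of what this could look like; the uid:<id>:posts key
follows the Retwis convention mentioned in this thread, and the queue
key name is only an example:

import json
import redis

r = redis.Redis(decode_responses=True)  # a single node, just for the sketch

def publish_post(uid, post_id):
    # Synchronous part: the post appears in the author's own feed right away.
    r.lpush("uid:%s:posts" % uid, post_id)
    # Deferred part: fanning out to 250,000 followers is left to a worker
    # reading from this queue.
    r.lpush("global:largeUsersUpdateQueue",
            json.dumps({"uid": uid, "post": post_id}))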
Regards,
--
Chris Lamb, UK ch...@chris-lamb.co.uk
GPG: 0x634F9A20
Exactly, I agree. The central idea here is that if the operation is huge
you have to create a queue that "workers" will operate against, instead
of trying to perform it synchronously during web page generation like
Retwis is doing.
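For what it's worth, a minimal worker sketch along these lines could
look like this (Python/redis-py; uid:<id>:followers as a set is an
assumption modeled on the Retwis schema):

import json
import redis

r = redis.Redis(decode_responses=True)

def fanout_worker():
    while True:
        # Block until a fan-out job shows up on the queue.
        _, raw = r.brpop("global:largeUsersUpdateQueue")
        job = json.loads(raw)
        # Push the post onto every follower's feed, outside the page request.
        for follower in r.smembers("uid:%s:followers" % job["uid"]):
            r.lpush("uid:%s:posts" % follower, job["post"])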
Regards,
Salvatore
>
> Oh OK, I think I'm getting it now. If the post is huge, don't try to
> do it all at once in the main Redis database, and instead create a new
> Redis database to act as a queue and have separate workers operate on
> that queue to do all 250,000 notifications.
Exactly, but there is no main database. You have N Redis servers, and
using some kind of partitioning (vanilla key hashing, consistent
hashing, user-id range dispatch, ...) different keys will be stored in
different Redis servers. The more users you need to handle, the more
servers you'll use to scale.
There are some important keys that it may be wise to store on dedicated
servers, like global:timeline and things like that, probably even the
global:largeUsersUpdateQueue we are talking about.
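Just to illustrate the vanilla key hashing case, something as naive as
this would already work (Python/redis-py, made-up addresses; with
consistent hashing you would avoid reshuffling most keys when adding a
node):

import hashlib
import redis

# One connection per Redis node; the addresses are just placeholders.
nodes = [
    redis.Redis(host="10.0.0.1", port=6379),
    redis.Redis(host="10.0.0.2", port=6379),
    redis.Redis(host="10.0.0.3", port=6379),
]

# A dedicated node for hot global keys like global:timeline
# or global:largeUsersUpdateQueue.
global_node = redis.Redis(host="10.0.0.10", port=6379)

def node_for(key):
    if key.startswith("global:"):
        return global_node
    # Vanilla hashing: the same key always maps to the same node.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return nodes[h % len(nodes)]

# Example: this LPUSH goes to whatever node owns uid:1000:posts.
node_for("uid:1000:posts").lpush("uid:1000:posts", 42)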
> I presume the read/write benchmarks are regardless of how many DBs you
> have in your one Redis server, correct?
If a server can do 50k read/write operations per second and you have N
servers, with this schema you'll get more or less N*50k queries per
second. This is why operating with this data model leads to
horizontally scalable applications. There is no need for all the data
to be available at the same time on a single node.
> Meaning for optimal performance the queue must be on different
> hardware?
The queue is not the most stressed part. If many users with many
followers update frequently, what you actually need is just more
servers, since the PUSH operations against uid:*:posts are partitioned
among all these servers. You don't really need a single big, very
powerful server for this kind of application. What you need is N
commodity hardware servers, and a hashing scheme smart enough that if
a node dies, the same data is stored somewhere else.
This is a viable idea: you have N master servers and you use
partitioning across these servers. In addition, every one of these N
servers has M slaves that are used only to serve read queries, and to
keep the system highly available if one of the N servers dies (just
turn one of the slaves into a master). But there are many other ways
to deal with this kind of problem.
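To sketch that last scheme (Python/redis-py again; the hosts are
placeholders, and a real failover also has to repoint the remaining
slaves and the clients):

import random
import redis

# One master plus M read-only slaves; the addresses are placeholders.
master = redis.Redis(host="10.0.0.1", port=6379)
slaves = [
    redis.Redis(host="10.0.0.2", port=6379),
    redis.Redis(host="10.0.0.3", port=6379),
]

def write(key, value):
    master.set(key, value)

def read(key):
    # Spread read queries across the slaves.
    return random.choice(slaves).get(key)

def promote(slave):
    # If the master dies, turn one of the slaves into the new master
    # (redis-py's slaveof() with no arguments sends SLAVEOF NO ONE).
    slave.slaveof()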