'Crazy' new commands idea.


qyloxe

Oct 28, 2010, 3:44:29 PM
to Redis DB
Some 'crazy' ideas. New commands in capital letters:

Conditionals:

IF operand_type_1 operand_1 operator operand_type_2 operand_2
THEN
block
ELSE
block

operand_type_1,operand_type_2 -> ["key","value","list"...]
operator -> ["=",">","<",">=","<=","<>","!=","in","not in",...]

block -> single command or:

BEGIN
command
command
command
...
END

BEGINPROCEDURE proc_name args
ENDPROCEDURE

CALL proc_name args

====================
Example 1:

IF key mykey1 = value "xx"
THEN
BEGIN
SET mykey2 "yy"
INCR mykey3
END
ELSE
BEGIN
SET mykey2 "zz"
DECR mykey3
END

====================
Example 2:

BEGINPROCEDURE myswap key1 key2
SET myswap_temp key1
SET key1 key2
SET key2 myswap_temp
INCR myswap_cnt
ENDPROCEDURE

CALL myswap mykey1 mykey2
CALL myswap mykey2 mykey3

==============================

REDIS as a virtual machine - wow!

What do you think?

Xiangrong Fang

Oct 28, 2010, 6:46:02 PM
to redi...@googlegroups.com
This is the idea of server-side-scripting, or macro commands. I like it.

2010/10/29 qyloxe <qyl...@gmail.com>:

> --
> You received this message because you are subscribed to the Google Groups "Redis DB" group.
> To post to this group, send email to redi...@googlegroups.com.
> To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.
>
>

Dvir Volk

Oct 28, 2010, 7:10:35 PM
to redi...@googlegroups.com
Personally I really don't like this direction. What's best about redis
is how dead simple it is, which keeps it easy to work with, and
performing amazingly.
If you want stored procedures you have traditional RDBMS for that. For
me as a (happy) user, this looks like going against the core values
and philosophy of redis.

qyloxe

Oct 28, 2010, 7:57:39 PM
to Redis DB
Dvir, it seems you missed the point.

Please write Example 1 in the language of your choice, and let it
be short, correct, multi-process safe and free of timeouts.
I dare you: it won't be "dead simple" code ;-)

Josiah Carlson

Oct 28, 2010, 8:08:00 PM
to redi...@googlegroups.com
If you're going to have a programming language embedded in there, why
create a new one? Why not use one that is designed to be embedded?

- Josiah

Xiangrong Fang

Oct 28, 2010, 8:24:53 PM
to redi...@googlegroups.com
That's right. Simple is NOT only speed. If this facility is provided,
it should be correctly used by those who need it.

2010/10/29 qyloxe <qyl...@gmail.com>:

Nick Quaranto

Oct 28, 2010, 8:26:49 PM
to redi...@googlegroups.com
-1 to all of this. Your language's driver can provide this behavior. Let's keep the protocol and commands simple, Redis is not a programming language.

Xiangrong Fang

Oct 28, 2010, 8:31:03 PM
to redi...@googlegroups.com
It's up to Salvatore and the core team to decide. For me, if this
feature is added and stored procedures can be created, they could at
least be used by SAs, i.e. people who use Redis directly via redis-cli
or even telnet. That means the server alone is more powerful and is not
at the mercy of any particular client to provide advanced features.

But I agree that this is not at all a high-priority feature, even if
it is to be added in the future...

2010/10/29 Nick Quaranto <ni...@quaran.to>:

qyloxe

Oct 28, 2010, 8:52:04 PM
to Redis DB
Surely it would be tempting to have embedded Python or Java or Ruby,
BUT as I see it, that's not the right level of abstraction.
Python, Java and Ruby are compiled to bytecode, which runs on their
own virtual machines.
I consider Redis a virtual machine too, so it's one level below
those languages.

Redis does an excellent job as a VM for data structures. It has
atomicity, persistence, communication, and did I mention
atomicity ;-)?
On the other hand, data structures are static but algorithms aren't, so
with a little help from the VM, we could open the door to many useful
solutions.
Consider Example 1 from my first post - currently it is difficult to
write it correctly on the client. Moreover, the code would look ugly,
with all that synchronisation, barriers and even timeouts or clean-up
background tasks. On the server, the code looks simple and
beautiful ;-)

Michael Russo

Oct 28, 2010, 9:05:36 PM
to redi...@googlegroups.com
On 2010-10-28, at 7:57 PM, qyloxe wrote:

> Please, write the Example 1 in the language of your choice and let it
> be short, correct, multi-process safe and without timeouts.

> I dare you, it won't be a "dead simple" code ;-) Moreover the code would

> look ugly, with all that synchronisation, barriers and even timeouts or clean-up
> background tasks. On the server - the code looks simple and
> beautiful ;-)


I've written this code for you, in Python. See here: http://gist.github.com/652679

-Michael

qyloxe

Oct 28, 2010, 9:14:08 PM
to Redis DB
Thanks, and now please remove the bugs from the code ;-)

From:
http://code.google.com/p/redis/wiki/MultiExecCommand

"""WATCHed keys are monitored in order to detect changes against this
keys. If at least a watched key will be modified before the EXEC call,
the whole transaction will abort, and EXEC will return a nil object (A
Null Multi Bulk reply) to notify that the transaction failed."""

You need to check for the nil object on the client side and run the
transaction again (and again... and wait... and again...) :-(
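
The retry loop described above can be sketched without a server. This is a minimal sketch in Python, assuming a tiny in-memory stand-in for Redis: `FakeStore`, `snapshot`, `exec_if_unchanged` and `run_example_1` are hypothetical names invented for illustration, not part of any Redis client. It runs Example 1 as an optimistic check-and-set and retries whenever the simulated EXEC reports that a watched key changed.

```python
class FakeStore:
    """In-memory stand-in for a Redis server (hypothetical, illustration only)."""

    def __init__(self):
        self.data = {}
        self.versions = {}  # key -> write count, imitating what WATCH tracks

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value
        self.versions[key] = self.versions.get(key, 0) + 1

    def snapshot(self, keys):
        """Imitates WATCH: remember the current version of each key."""
        return {key: self.versions.get(key, 0) for key in keys}

    def exec_if_unchanged(self, watched, writes):
        """Imitates EXEC: apply writes only if no watched key was modified."""
        if watched != self.snapshot(watched):
            return None  # plays the role of the nil reply
        for key, value in writes:
            self.set(key, value)
        return len(writes)


def run_example_1(store, max_retries=10, before_exec=None):
    """Example 1 on the client side; returns the number of attempts needed."""
    for attempt in range(1, max_retries + 1):
        watched = store.snapshot(["mykey1", "mykey3"])
        counter = int(store.get("mykey3") or 0)
        if store.get("mykey1") == "xx":
            writes = [("mykey2", "yy"), ("mykey3", str(counter + 1))]
        else:
            writes = [("mykey2", "zz"), ("mykey3", str(counter - 1))]
        if before_exec:
            before_exec(attempt)  # hook used below to simulate a rival client
        if store.exec_if_unchanged(watched, writes) is not None:
            return attempt
    raise RuntimeError("gave up after %d attempts" % max_retries)


store = FakeStore()
store.set("mykey1", "xx")

def rival(attempt):
    # Another client modifies mykey1 between "WATCH" and "EXEC", once.
    if attempt == 1:
        store.set("mykey1", "other")

attempts = run_example_1(store, before_exec=rival)
```

The first attempt is discarded because `mykey1` changed underneath it, and the second attempt takes the ELSE branch. The bounded `max_retries` is a choice of this sketch; the thread does not say how long a client should keep retrying.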

Jak Sprats

Oct 28, 2010, 9:23:18 PM
to Redis DB
I see this as related to some other posts:
"Lists of hashes" http://groups.google.com/group/redis-db/browse_thread/thread/2ea22f14bc3b7d6d
"Pipe the results of a command as arguments to another"
http://groups.google.com/group/redis-db/browse_thread/thread/d7ec7df97d89ec89/4b33853ff7910082

How can we extend redis to do some new stuff w/o totally obfuscating
the syntax and language?

I am a fan of IF statements: they can minimise sequential request
steps from app-server to redis-server, which cause app-server threads
to block multiple times for a single frontend request.
I am not a fan of PROCEDUREs; these should be done app-server side,
IMO.

I am a bigger fan of what I call EVAL (what other people called
recursive redis - can't find the post), where you can do the following:
./redis-cli HSET hash0 name "I AM HASH ZERO"
./redis-cli HSET hash1 name "I AM HASH ONE"
./redis-cli HSET hash2 name "I AM HASH TWO"
./redis-cli RPUSH hashlist hash0
./redis-cli RPUSH hashlist hash1
./redis-cli RPUSH hashlist hash2
./redis-cli HGET $(LINDEX hashlist 1) name -> "I AM HASH ONE"

What is cool about bash, is this actually works: (EVAL in bash)
./redis-cli HGET $(./redis-cli LINDEX hashlist 1) name
And this also illustrates why I am a fan of the IF statement, this
request requires two ./redis-cli processes, which are akin to two
sequential app-server to redis-server requests (i.e. lots of overhead,
waiting, blocking, etc...).
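
The nested `$(...)` idea can be illustrated with a small evaluator. This is a minimal sketch in Python, assuming an in-memory dictionary in place of a server and modelling only the two commands used in the post; `run_simple` and `run_nested` are hypothetical names, and the `$(...)` syntax is the proposal from this post, not an existing Redis feature.

```python
import re

# In-memory stand-ins for the list and hashes created in the post.
lists = {"hashlist": ["hash0", "hash1", "hash2"]}
hashes = {
    "hash0": {"name": "I AM HASH ZERO"},
    "hash1": {"name": "I AM HASH ONE"},
    "hash2": {"name": "I AM HASH TWO"},
}

# Matches a $(...) group that contains no further $(...) inside it.
INNERMOST = re.compile(r"\$\(([^()$]*)\)")


def run_simple(command):
    """Execute one flat command; only LINDEX and HGET are modelled."""
    parts = command.split()
    name, args = parts[0].upper(), parts[1:]
    if name == "LINDEX":
        return lists[args[0]][int(args[1])]
    if name == "HGET":
        return hashes[args[0]][args[1]]
    raise ValueError("unsupported command: " + name)


def run_nested(command):
    """Substitute innermost $(...) groups with their results, then run the rest.

    Results are spliced in as plain text, so this sketch only works when an
    inner result contains no whitespace (true for the key names used here).
    """
    while True:
        match = INNERMOST.search(command)
        if match is None:
            return run_simple(command)
        result = run_simple(match.group(1))
        command = command[:match.start()] + result + command[match.end():]


answer = run_nested("HGET $(LINDEX hashlist 1) name")
```

Because both lookups happen inside one call, the client makes a single request instead of two, which is the round-trip saving the post argues for.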

Mike Shaver

Oct 28, 2010, 9:40:34 PM
to redi...@googlegroups.com
On Thu, Oct 28, 2010 at 5:52 PM, qyloxe <qyl...@gmail.com> wrote:
> On the other hand, data structures are static but algorithms don't, so
> with a little help from VM, we could open doors for many useful
> solutions.
> Consider Example 1 from my first post - currently it is difficult to
> write it correctly on the client. Moreover the code would look ugly,
> with all that synchronisation, barriers and even timeouts or clean-up
> background tasks. On the server - the code looks simple and
> beautiful ;-)

Redis can be the data store, and even provide the protocol, without
needing to put all the algorithms inside it. Put another daemon on
the server, have it speak the Redis protocol with the extensions you
want, and let it talk to a "protected" redis server via localhost TCP
(Unix domain sockets in the future, I hope).

client: "CALLMACRO SWAP key1 key2"

qyloxed -> redis:
$tmp = "GET key1"
"SET key1 key2"
"SET key2" ($tmp)
"INCR myswap_cnt"
$cnt = "GET myswap_cnt"

qyloxed -> client: ($cnt)

By luck or ridiculously impeccable taste, or perhaps a combination of
both, Salvatore has put a great set of algorithms into redis that
allow it to be fast and flexible and fast and fast and reliable and
fast and atomic. (And fast.) It gives the sought-after atomicity by
virtue of being single threaded, but *any* single-threaded server can
provide that guarantee in a simple way -- not to diminish redis'
architectural achievements!
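
The single-threaded argument can be sketched as follows. This is a minimal sketch in Python, assuming a dictionary stands in for the "protected" redis server and a queue stands in for the network socket; `call_macro`, `MACROS` and `worker` are hypothetical names, and the swap is read here as exchanging the values of the two keys. One worker thread drains the queue, so the steps of a macro never interleave with another macro.

```python
import queue
import threading

# Stands in for the protected redis instance; only the worker touches it.
backend = {"key1": "a", "key2": "b", "myswap_cnt": 0}
requests = queue.Queue()


def swap(key1, key2):
    """The SWAP macro: several steps, run back to back by the only worker."""
    tmp = backend[key1]
    backend[key1] = backend[key2]
    backend[key2] = tmp
    backend["myswap_cnt"] += 1
    return backend["myswap_cnt"]


MACROS = {"SWAP": swap}


def worker():
    """Single thread that serves requests one at a time, in arrival order."""
    while True:
        item = requests.get()
        if item is None:  # shutdown signal
            break
        name, args, reply = item
        reply.put(MACROS[name](*args))


def call_macro(name, *args):
    """Client side: enqueue a request and block until the worker answers."""
    reply = queue.Queue(maxsize=1)
    requests.put((name, args, reply))
    return reply.get()


daemon = threading.Thread(target=worker)
daemon.start()

# 100 concurrent clients all ask for the same swap.
clients = [
    threading.Thread(target=call_macro, args=("SWAP", "key1", "key2"))
    for _ in range(100)
]
for client in clients:
    client.start()
for client in clients:
    client.join()

requests.put(None)
daemon.join()
```

After an even number of swaps the two keys hold their original values and the counter equals the number of calls. The guarantee only holds if no other client reaches the backend directly, which is why the post calls the redis server "protected".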

Mike

qyloxe

Oct 29, 2010, 4:47:48 AM
to Redis DB
> allow it to be fast and flexible and fast and fast and reliable and
> fast and atomic.  (And fast.)  It gives the sought-after atomicity by
Please write Example 1 in the language of your choice, and let it
be fast and reliable in a multi-process environment (and fast).
Can you? Nah..

Answering your question - why procedures? Because they allow you to write
simpler, faster and more reliable client-side code.
How so? Consider map, filter and reduce (google "map filter
reduce"). Those are basic building blocks of many distributed
algorithms. Filter on the client side - well, possible, but not so
fast and definitely very hard to write reliably. Just try to
imagine exclusive locks in an environment of 500 client processes. On the
other hand, filter on the server side - wow - extremely fast,
guaranteed atomic, simple code for a developer to write. Everybody
wants to write (fast) simple (and fast) code which is easy to
maintain and error-free.

+1 for that it's up to Salvatore and the core team.

Jacob Rothstein

Oct 28, 2010, 11:29:21 PM
to redi...@googlegroups.com
Hi all. I'm new here, but I've been using redis for a while.
Hopefully this isn't too wacky.

My initial response was similar to "Redis isn't a language, it's a
datastore. Run a fancy driver or daemon if you want additional logic."
A small concern popped up, though, and I was curious what others
thought. When redis is clustered, it would be pretty neat to run an
embedded language local to a particular piece of data. A potential
use case would be map/reduce, with the map stage running on a node
that has the particular key. Would a fancy client have enough
information to do that without a lot of roundtrips? I haven't really
internalized the commands outlined in the cluster presentation yet,
but can imagine data-locality being a motivation for embedding an
interpreted language.

–Jacob

Mike Shaver

Oct 29, 2010, 11:18:28 AM
to redi...@googlegroups.com

Writing it the same way redis is written (single threaded server) would give you the atomicity easily, afaict.  What am I missing? Redis wisely avoids fancy (read: fragile) locking, so that's what you'd get if Salvatore did the work for you anyway. :-)

teleo

Oct 29, 2010, 12:46:55 PM
to Redis DB
I believe that scripting support is far from being high priority for
Redis. In my opinion, the next priority after Redis Cluster should be
improved persistence support that would handle some of the issues both
RDB and AOF files have with huge datasets.

That said, running server-side code on Redis can dramatically reduce
the I/O needed to process large amount of data, because in many cases
it alleviates the need of sending data to the client, and then
receiving it back. The idea of running code on the premises of the
data is sometimes called 'co-location' or 'collocation', and it is a
principle that is too often overlooked.

Still, many of the use cases where server-side code is beneficial
could also be solved by better support of aggregation, filtering and
piping.

-Teleo

Marcus

Oct 29, 2010, 10:20:56 PM
to Redis DB
Some problems can never be solved with client-side code. For example, the
problem that a native list of hashes solves is not the need for a list of
hashes, which redis can already simulate, but the size of the key space
that the simulated version has to occupy.
With a large list of hashes, using the current implementation of
redis, the available system memory would be occupied just by the keys,
let alone the data.

The current method of doing lists of hashes in redis works great when you
are talking about 10,000,000 hashes in lists. When you begin to get
into 10,000,000,000 keys it becomes impossible. However, that number
of "keys" would become trivial with native list-of-hashes support, as
the VM would then be able to put infrequently used data (the hashes,
in this case) on disk.

Jak Sprats

Oct 30, 2010, 12:26:34 AM
to Redis DB
Well put Teleo, I would also vote for better large data AOF and RDB
handling.

With a little elbow grease, a spare core can be used to rewrite the
AOF file w/o using copy-on-write (which limits you to only using 50%
of your RAM).
The key is dumping the data in the AOF file in a different format, one
that can easily be sorted by date and then key. Also, every
write op must be dumped in full (dumping "SINTERSTORE dest a b" won't work;
you have to dump the entire "dest" SET).
W/ a new AOF file format and unix sort (which uses a low-memory,
<50MB mergesort), the AOF file can be rewritten on a separate
core, and redis can use 2X the RAM .. which is a GIGANTIC win. (And these
days everyone has a spare core.)

Xiangrong Fang

Oct 30, 2010, 6:08:04 AM
to redi...@googlegroups.com
2010/10/30 Jak Sprats <jaks...@gmail.com>:

> Well put Teleo, I would also vote for better large data AOF and RDB
> handling.

Vote for improving the persistence layer asap after cluster...

Salvatore Sanfilippo

Oct 30, 2010, 6:29:56 AM
to redi...@googlegroups.com

This is already in our plans, but there are no magical things we can do.
There are limits very similar to the CAP theorem that we have to fight
with: the worst case scenario always requires 2x memory in
order to persist in snapshot form (rdb) or to rewrite the
AOF in a system as semantically complex as Redis, with inter-key
operations.

Anyway, we have strong hints that Redis Master should
reduce the number of copy-on-write operations compared to 2.0. I
still have not found the time to test it, but the difference may be
huge.

Cheers,
Salvatore

--
Salvatore 'antirez' Sanfilippo
http://invece.org

"We are what we repeatedly do. Excellence, therefore, is not an act,
but a habit." -- Aristotele

Xiangrong Fang

Oct 30, 2010, 8:35:12 AM
to redi...@googlegroups.com
Hi Salvatore,

Enjoy your holiday! Until you get back... this is what I think:

Using 2x memory in the worst case scenario is not a core issue. For
example, I might design a master-slave-slave replica set so that
persistence only takes place on one of the slaves. There are 2
questions though:

1) if a slave is designed specifically for persistence, it
should be configured differently from the master, e.g. minimize memory
usage and maximize vm usage, i.e. leave enough free memory for the
"dump" process. At the same time, this slave will not answer any user
queries; it just connects to the master and receives data to persist.

2) Someone mentioned earlier that you should merge the VM with either
AOF or RDB. You also mentioned the possibility of using a series
of binary-format AOF files... anyway, if we could simply back up (copy)
the VM dump file it would be very cool. Also, if the VM file is
"paged", i.e. there is not only one VM file but a set of files (one
per "page" for example), it would be very good for incremental backup.

The key problem (uncertainty) for me is the restore procedure in case
of a system crash, especially in a clustered situation, i.e. if I
back up data on a server, I need to know what data I have backed up.
That's why I suggested a series of SHARDING-related commands so that
it is very clear to the SA what data is on what node! Please see here:

http://groups.google.com/group/redis-db/browse_thread/thread/5d02fee62d434070

Best Regards,
Shannon

2010/10/30 Salvatore Sanfilippo <ant...@gmail.com>:

Salvatore Sanfilippo

Oct 30, 2010, 8:48:30 AM
to redi...@googlegroups.com
On Sat, Oct 30, 2010 at 2:35 PM, Xiangrong Fang <xrf...@gmail.com> wrote:
> Hi Salvatore,
>
> Enjoy your holiday! Until you get back... this is what I think:
>
> Using 2x memory in the worst case scenario is not a core issue, For
> example, I might design a master-slave-slave replica set so that the
> persistence only takes place on one of the slave.  There are 2
> questions though:

Hey, while waiting for girlfriend to be ready I can reply ;)

In the master -> slave case you very obviously have a copy of the data
in the slave ;)
so 2x memory.

> 1) if a slave is designed specifically for persistence purpose, it
> should be configured differently than the master, e.g minimize memory
> usage and maximize vm usage.  i.e. make enough free memory for the
> "dump" process.  At the same time, this slave will not answer any user
> query, it just connect to the master and receive data to persist.

This does not make sense as the slave is required to be able to afford
the same write speed of the master (and the master <-> slave link
itself), otherwise you are d00med.

> 2) Someone mentioned earlier that you should merge the VM with either
> AOF or RDB.  You also mentioned about the possibility to use a series
> of binary-format AOF files... anyway, if we can simply backup (copy)
> the VM dump file it would be very cool. Also, if the VM file is
> "paged", i.e. there is not only one VM file but a set of files (one
> per "page" for example) it would be very good for incremental backup.

VM and persistence will likely not be merged, except in a single case:
if future Redis versions try to just use the disk, with
memory as a cache. If we find reasonable ways to store our data
structures with good performance and good locality directly on disk,
then there will no longer be any need for VM at all. I doubt this will
work, but this is just to show the only case where VM and persistence
may converge IMHO.

VM can't be used as a persistence mechanism anyway since it does not
hold enough information, there are only the values stored there and
not the keys that are in RAM.

Redesigning VM so that swapped objects will point to AOF/RDB instead
of the swap file is IMHO not a good idea:

1) In many cases you may want no persistence but large VM (caching)
without any background operation (BGSAVE / AOF-REWRITE) happening in
the background. The VM as it is today allows you to do this already.
2) as 1) but you do this because you want persistence only in the
slave in order to have a master as responsive as possible (in this
specific case the master->slave link buffers are enough to compensate
for slow down in the slave that will not last too much).
3) From an operational point of view it is a nightmare. As it works today,
users know that they can for instance back up the latest .rdb file with
the "mv" unix command and a new dump will be created later. If persistence
and VM are mixed, this will corrupt the instance.
4) Can't be done with .rdb as it is rewritten every time. The whole
point of .rdb is that it's a compact representation of data in memory
so it is not viable to rewrite the new .rdb so that offsets will be
the same as the old one.
5) AOF does not contain a single value. What in .rdb / VM is a list of
"a,b,c", in the AOF can be actually the result of different PUSH/POP
operations.
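
Point 5 can be made concrete with a toy replay. This is a minimal sketch in Python, assuming a simplified log of tuples rather than the real AOF file format; `replay` is a hypothetical helper and only four list commands are modelled. It shows that the list "a,b,c" exists only as the outcome of replaying several records, so no single record could serve as the swapped-out value.

```python
def replay(log):
    """Rebuild key -> list state by applying logged commands in order."""
    state = {}
    for command, key, *args in log:
        items = state.setdefault(key, [])
        if command == "RPUSH":
            items.extend(args)
        elif command == "LPUSH":
            for value in args:
                items.insert(0, value)
        elif command == "LPOP":
            if items:
                items.pop(0)
        elif command == "RPOP":
            if items:
                items.pop()
        else:
            raise ValueError("unsupported command: " + command)
    return state


# A log in the spirit of an append-only file: operations, not values.
aof = [
    ("RPUSH", "mylist", "x"),
    ("RPUSH", "mylist", "a"),
    ("LPOP", "mylist"),
    ("RPUSH", "mylist", "b", "c", "d"),
    ("RPOP", "mylist"),
]

# What an .rdb snapshot or a VM swap entry would hold for the same key.
snapshot = replay(aof)
```

Five records are needed to arrive at the three-element list, and none of them contains the list itself.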

We have been thinking very hard about these issues for months; it's not that we
don't care or that there are obvious solutions.

Cheers,
Salvatore

qyloxe

Oct 30, 2010, 9:01:38 AM
to Redis DB
> That said, running server-side code on Redis can dramatically reduce
> the I/O needed to process large amount of data, because in many cases
> it alleviates the need of sending data to the client, and then
> receiving it back. The idea of running code on the premises of the
> data is sometimes called 'co-location' or 'collocation', and it is a
> principle that is too often overlooked.

Even more: with the Virtual Memory configuration it would be
beneficial to run server-side code on data, without the risk of it being
paged out between consecutive client calls.

Xiangrong Fang

Oct 30, 2010, 9:05:41 AM
to redi...@googlegroups.com
Well, I will leave you alone with these hardcore problems :) My core
concern is still about disaster recovery and restoration of data. For
example, if the master of a master-slave set is down, there are 2 obvious
ways to recover:

1) promote the slave to master and put in a new machine as slave; then there
will be a MASSIVE sync, as the new server is empty.

2) promote the slave to master, put in a new server as slave, copy
the AOF file (backup) back to the empty machine, then start the slave.

Which one is better (or easier)?

Also, do you have anything to say about auto-failover within the
replica set? As far as I know, redis-cluster is mostly about sharding,
not about HA or data redundancy?

Thanks,
Shannon

2010/10/30 Salvatore Sanfilippo <ant...@gmail.com>:

Jak Sprats

Oct 30, 2010, 4:42:56 PM
to Redis DB
> The worst case scenario always requires 2x memory in order to rewrite the
> AOF in a system semantically as complex as Redis and with inter-key
> operations.

I believe there is a way around this.
Inter-key operations always produce a result.
If the result set is dumped instead of the command producing the
result then the AOF file can be rewritten from the AOF file (not how
it is currently done by using the real data). Further if the
operations are sorted by timestamp and then by result-set key, the
rewriting can be done w/ very little memory.
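
The idea of rewriting the log from the log alone can be sketched like this. This is a minimal sketch in Python, assuming every write is logged as a `(timestamp, key, full value)` record; the record layout and the names `log_sinterstore` and `compact` are assumptions made for illustration, not Redis internals. The sketch sorts by key and then timestamp (the reverse of the order named in the post) so that one streaming pass can keep the newest record per key; a real rewrite would use an external merge sort instead of the in-memory `sorted`.

```python
from itertools import groupby
from operator import itemgetter


def log_sinterstore(log, timestamp, sets, dest, a, b):
    """Apply SINTERSTORE dest a b, but log the resulting set, not the command."""
    sets[dest] = sets[a] & sets[b]
    log.append((timestamp, dest, frozenset(sets[dest])))


def compact(log):
    """Rewrite the log using only the log: keep the newest record per key."""
    ordered = sorted(log, key=itemgetter(1, 0))  # by key, then timestamp
    compacted = []
    for _key, records in groupby(ordered, key=itemgetter(1)):
        last = None
        for last in records:  # stream through the group, remember the final one
            pass
        compacted.append(last)
    return compacted


sets = {"a": {1, 2, 3}, "b": {2, 3, 4}}
log = [(1, "a", frozenset(sets["a"])), (2, "b", frozenset(sets["b"]))]

log_sinterstore(log, 3, sets, "dest", "a", "b")   # dest = {2, 3}
sets["b"] = {3, 4}
log.append((4, "b", frozenset(sets["b"])))
log_sinterstore(log, 5, sets, "dest", "a", "b")   # dest = {3}

rewritten = compact(log)
rebuilt = {key: set(value) for _ts, key, value in rewritten}
```

Because each record carries the complete value, the compacted log reproduces the live data without consulting it. Had the log stored the command `SINTERSTORE dest a b`, dropping the older records for `b` would have changed what a replay computes.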

This is not easy to do, but it is definitely doable.

Am I missing something?


Bjarni Rúnar Einarsson

Oct 30, 2010, 5:21:36 PM
to redi...@googlegroups.com
On Sat, Oct 30, 2010 at 8:42 PM, Jak Sprats <jaks...@gmail.com> wrote:
> > The worst case scenario always requires 2x memory in order to rewrite the
> > AOF in a system semantically as complex as Redis and with inter-key
> > operations.
>
> I believe there is a way around this.
> Inter-key operations always produce a result.
> If the result set is dumped instead of the command producing the
> result then the AOF file can be rewritten from the AOF file (not how
> it is currently done by using the real data). Further if the
> operations are sorted by timestamp and then by result-set key, the
> rewriting can be done w/ very little memory.
>
> This is not easy to do, but it is definitely doable.
>
> Am I missing something?

Wouldn't that potentially generate lots of I/O for large sets, violating the premise that AOF logging is guaranteed to be reasonably fast?  The writes would become O(sizeof(set)) instead of effectively O(1).

--
Bjarni R. Einarsson
The Beanstalks Project ehf.

Making personal web-pages fly: http://pagekite.net/

Jak Sprats

Oct 30, 2010, 7:18:30 PM
to Redis DB
> The writes would become O(sizeof(set)) instead of effectively O(1)

It is a trade off.

And yes, for the handful of commands where this happens (S*STORE and
Z*STORE), if the result set was large, it would have to be written to
disk (or first RAM then disk, depending on your appendfsync settings).
Likewise, if you do an MSET w/ 1000 key-values, it gets written to disk.

If you're not doing large S*STOREs or Z*STOREs 100s of times a
second and you don't need "appendfsync always", then the win is you get
2X more RAM.

On Oct 30, 2:21 pm, Bjarni Rúnar Einarsson <b...@pagekite.net> wrote:

Xiangrong Fang

Oct 31, 2010, 8:14:19 AM
to redi...@googlegroups.com
We are already doing a cluster, right? I do NOT see any problem with using a
"dedicated persistence server" :-) i.e., the problem of 2x memory
is not that intolerable. I would simply do it completely
asynchronously: the slave server receives a continuous
stream of AOF data and simply pipelines it to a background service
whose SOLE purpose is doing BGREWRITEAOF, i.e. that slave does NOT answer
client queries, and all available RAM is reserved exclusively for the
BGREWRITEAOF process.

What's wrong with that, if the dedicated "persistence server" is
programmed very efficiently, and ideally it can serve the entire
cluster (i.e. one "persistence server" works for multiple shards in
the cluster...)? I think this would be a "clean" and "easy" solution,
because it is quite "separated".

I am sorry if my thinking seems weird or immature... what I think is
that the next step for redis persistence is to make a rock solid HA
solution, as well as a proven, easy-to-follow disaster recovery
procedure. In other words, with redis as it is today, I can still
do the above steps myself, but I will feel much more comfortable if
redis is built to be safe and solid (in terms of data persistence).

Regards,
Shannon

2010/10/31 Jak Sprats <jaks...@gmail.com>:

Jak Sprats

Nov 1, 2010, 5:11:18 AM
to Redis DB

Not everyone will use redis as a cluster and not everyone has the
money to afford a bunch of machines to do the setup you described
above.

RAM is the single most expensive component in servers, especially if
you have like 96GB, so being able to use all of it is a huge win.

Here is an example, if your data set is 28GB, in the setup you
proposed you would put (via sharding or clustering) 14GB on machine1
(16GB machine), 14GB on machine2 (16GB machine) and then machine3 (the
persistence server) needs 56GB+ (so 64GB). If you solve the RAM
problem, you need a single 32GB machine, which would be 4X+ cheaper.
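
The RAM figures in this example can be checked with a few lines of arithmetic. This is a minimal sketch in Python, assuming machines come only in the sizes listed in `SIZES_GB` (an assumption of this sketch) and that the persistence server needs twice the dataset, as the post states; `machine_for` is a hypothetical helper.

```python
SIZES_GB = [8, 16, 32, 64, 128]  # assumed purchasable RAM sizes


def machine_for(needed_gb):
    """Smallest assumed machine size that fits the requested amount of RAM."""
    return next(size for size in SIZES_GB if size >= needed_gb)


dataset_gb = 28

# Proposed setup: two shards holding half the data each, plus a
# persistence server that needs 2x the whole dataset.
shards = [machine_for(dataset_gb / 2)] * 2
persistence = machine_for(2 * dataset_gb)
cluster_total_gb = sum(shards) + persistence

# If the 2x rewrite problem were solved: one machine that fits the dataset.
single_gb = machine_for(dataset_gb)

ram_ratio = cluster_total_gb / single_gb
```

Under these assumptions the three-machine setup needs 96GB of RAM against 32GB for the single machine, a factor of 3 in RAM alone. The "4X+ cheaper" figure in the post is a cost estimate and is not modelled here.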


Xiangrong Fang

Nov 1, 2010, 10:04:20 AM
to redi...@googlegroups.com
Hi Jak,

I do not agree with your reasoning, but as you are an expert on redis
internals, I will be very happy if you correct my mistakes:

1) as redis needs 2x size-of-dataset memory in the worst case, and
suppose you cannot afford a cluster, and now you have a 32 GB machine,
you can set up a 15GB dataset, reserve 15GB for the worst case, and keep 2GB
for the system and other possible tasks. Now your memory efficiency is a
little less than 50%, that's fine.

2) But if you exceed your limit, there are 2 options:

a) add more memory on that server, e.g. make it 48GB or 64GB, this way
your memory efficiency is still below 50%

b) arrange 3 machines with 16GB memory on each box, you will have a
cluster of 32GB memory with the persistence server having 16GB memory.
Memory efficiency improves to 60+%.

3) Now the most important point: I do *not* think the so-called
persistence server requires as much memory as the working servers. In
fact, if the persistence server in the above example has 8GB of memory,
cost-effectiveness will be better, because it should be a dedicated
server processing AOF; it does not need the ability to rapidly
reply to client queries, and it should use an algorithm suited to
serializing data rather than random access. I hope this is
possible...
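
The efficiency figures in points 1) and 2) can be reproduced with a short calculation. This is a minimal sketch in Python, assuming "memory efficiency" means dataset size divided by total installed RAM, that a single box keeps 2GB for the system as in point 1), and that the working boxes in option b) are filled completely; the function names are hypothetical.

```python
def single_box_efficiency(ram_gb, system_reserve_gb=2):
    """Dataset gets half of what is left after the system reserve."""
    dataset_gb = (ram_gb - system_reserve_gb) / 2
    return dataset_gb / ram_gb


def cluster_efficiency(working_boxes, persistence_boxes, box_gb):
    """Working boxes hold data; persistence boxes hold none of the live set."""
    dataset_gb = working_boxes * box_gb
    total_gb = (working_boxes + persistence_boxes) * box_gb
    return dataset_gb / total_gb


# Point 1) and option a): one machine with 32, 48 or 64GB.
single = {ram: single_box_efficiency(ram) for ram in (32, 48, 64)}

# Option b): two 16GB working boxes plus one 16GB persistence server.
cluster = cluster_efficiency(working_boxes=2, persistence_boxes=1, box_gb=16)
```

With these definitions every single-box configuration stays just under 50%, while the three-box arrangement reaches about 67%, consistent with the "60+%" quoted in option b).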

That's the whole point of the hypothetical persistence server. If that
server requires as much memory as the sum of all servers in the
cluster, that is ridiculous, if not useless.

Regards,
Shannon

2010/11/1 Jak Sprats <jaks...@gmail.com>:

Jak Sprats

Nov 2, 2010, 6:31:14 AM
to Redis DB

Hi Shannon,

I am not sure it's possible to have a "dedicated server processing AOF
and it does not need the ability to rapidly reply client queries, and
it should use an algorithm suitable for serializing data but not for
random access"

In worst case scenarios, you are doing writes to all of your data (at
random-access speed). The persistence server needs to read
in these writes at the speed they are happening (random access) and
then use a more memory-efficient algorithm to save space. It is a
paradox, I think. If the algorithm is more memory-efficient at the
cost of speed, then worst cases will cause backlogs on the persistence
server, which are real bad.

The algorithm I proposed (dumping all data of S/Z*STORE, then
mergesort and apply) is actually very similar to the persistence
server you are proposing, the difference being that it runs locally on a
separate core and avoids adding network I/O. W/ my algorithm, there
is still a chance of backlogs, but it will catch up much quicker, as
it would be a dedicated core reading thru a file as opposed to the
persistence server reading from a socket.

I probably need to explain my algorithm in a lot more detail; this
stuff is complicated.

- Jak
> 2010/11/1 Jak Sprats <jakspr...@gmail.com>:

Xiangrong Fang

Nov 2, 2010, 8:55:41 AM
to redi...@googlegroups.com
Hi Jak,

I think your algorithm is very valuable for Salvatore, even though you
may have different ideas... I am not that "in" the core of redis, so
I cannot comment.

My idea is just "eventually persistent". Because web access has
patterns, a web site cannot have a very high access rate 24*7*365.
With the help of replicas & sharding, you can be sure that some servers
are relatively free at some times. Optimistically speaking, you may
distribute servers geographically, so that you can always have quiet
times while the majority of your users are sleeping.

In other words, with redis, you can say that "products do not
persist, but architectures do", i.e. redis can focus on its strength,
doing things fast and rock-stable, and leave the things that it cannot do
to the administrator... of course, with the necessary mechanisms built
into redis, and with established best practices.

Regards.

2010/11/2 Jak Sprats <jaks...@gmail.com>:
