2010/10/29 qyloxe <qyl...@gmail.com>:
> --
> You received this message because you are subscribed to the Google Groups "Redis DB" group.
> To post to this group, send email to redi...@googlegroups.com.
> To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.
>
>
- Josiah
2010/10/29 qyloxe <qyl...@gmail.com>:
But I agree that this is not at all a high-priority feature, even if
it is to be added in the future...
2010/10/29 Nick Quaranto <ni...@quaran.to>:
> Please, write the Example 1 in the language of your choice and let it
> be short, correct, multi-process safe and without timeouts.
> I dare you, it won't be a "dead simple" code ;-) Moreover the code would
> look ugly, with all that synchronisation, barriers and even timeouts or clean-up
> background tasks. On the server - the code looks simple and
> beautiful ;-)
I've written this code for you, in Python. See here: http://gist.github.com/652679
-Michael
Redis can be the data store, and even provide the protocol, without
needing to put all the algorithms inside it. Put another daemon on
the server, have it speak the Redis protocol with the extensions you
want, and let it talk to a "protected" redis server via localhost TCP
(Unix domain sockets in the future, I hope).
client: "CALLMACRO SWAP key1 key2"
qyloxed -> redis:
$tmp1 = "GET key1"
$tmp2 = "GET key2"
"SET key1" ($tmp2)
"SET key2" ($tmp1)
"INCR myswap_cnt"
$cnt = "GET myswap_cnt"
qyloxed -> client: ($cnt)
By luck or ridiculously impeccable taste, or perhaps a combination of
both, Salvatore has put a great set of algorithms into redis that
allow it to be fast and flexible and fast and fast and reliable and
fast and atomic. (And fast.) It gives the sought-after atomicity by
virtue of being single threaded, but *any* single-threaded server can
provide that guarantee in a simple way -- not to diminish redis'
architectural achievements!
Mike
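A minimal sketch of the daemon idea above, in Python. Everything in it is hypothetical: `InMemoryStore` stands in for the "protected" redis server (a real daemon would talk to redis over localhost TCP), and `MacroDaemon`, `CALLMACRO` and `SWAP` are illustrative names, not Redis commands. The point is only that a front end which fully processes one request before reading the next makes each macro atomic for every client that goes through it.

```python
class InMemoryStore:
    """Stand-in for the protected redis server (assumption: GET/SET/INCR only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def incr(self, key):
        self._data[key] = int(self._data.get(key, 0)) + 1
        return self._data[key]


class MacroDaemon:
    """Hypothetical single-threaded front end: one request is fully
    processed before the next one is read, so each macro is atomic
    with respect to other clients of this daemon."""

    def __init__(self, store):
        self.store = store

    def handle(self, request):
        parts = request.split()
        if parts[:2] == ["CALLMACRO", "SWAP"] and len(parts) == 4:
            return self._swap(parts[2], parts[3])
        raise ValueError("unknown request: %r" % request)

    def _swap(self, key1, key2):
        # Read both values first, then write them back crossed over.
        tmp1 = self.store.get(key1)
        tmp2 = self.store.get(key2)
        self.store.set(key1, tmp2)
        self.store.set(key2, tmp1)
        # INCR already returns the new counter value.
        return self.store.incr("myswap_cnt")


store = InMemoryStore()
store.set("key1", "a")
store.set("key2", "b")
daemon = MacroDaemon(store)
swap_count = daemon.handle("CALLMACRO SWAP key1 key2")
```

The guarantee only holds if every writer goes through the daemon; a client that talks to the backing store directly could still see a half-finished swap.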
My initial response was similar to "Redis isn't a language, it's a
datastore. Run a fancy driver or daemon if you want additional logic."
A small concern popped up, though, and I was curious what others
thought. When redis is clustered, it would be pretty neat to run an
embedded language local to a particular piece of data. A potential
use case would be map/reduce, with the map stage running on a node
that has the particular key. Would a fancy client have enough
information to do that without a lot of roundtrips? I haven't really
internalized the commands outlined in the cluster presentation yet,
but can imagine data-locality being a motivation for embedding an
interpreted language.
–Jacob
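One way to look at the roundtrip question: if the client can compute which node owns a key, it can group keys per node and send one request per node rather than one per key. The sketch below assumes a made-up placement rule (CRC32 of the key modulo the number of nodes) and made-up node addresses; it is not how redis cluster assigns keys, just an illustration of what a "fancy client" would need to know.

```python
import zlib
from collections import defaultdict

NODES = ["node-a:6379", "node-b:6379", "node-c:6379"]  # hypothetical node addresses


def node_for_key(key, nodes=NODES):
    """Assumed placement rule: CRC32(key) mod number of nodes."""
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


def group_keys_by_node(keys, nodes=NODES):
    """Group keys so the map stage needs one request per node, not per key."""
    groups = defaultdict(list)
    for key in keys:
        groups[node_for_key(key, nodes)].append(key)
    return dict(groups)


keys = ["user:%d" % i for i in range(10)]
plan = group_keys_by_node(keys)
for node, node_keys in sorted(plan.items()):
    print(node, node_keys)
```

With a rule like this the client needs no extra roundtrips to locate data, but it does need an up-to-date node list, which is exactly the information a cluster would have to expose.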
Writing it the same way redis is written (single threaded server) would give you the atomicity easily, afaict. What am I missing? Redis wisely avoids fancy (read: fragile) locking, so that's what you'd get if Salvatore did the work for you anyway. :-)
Vote for improving the persistence layer asap after cluster...
This is already in our plans, but there are no magical things we can do.
There are limits very similar to the CAP theorem that we have to fight
with, that is, the worst case scenario always requires 2x memory in
order to persist in a snapshot form (rdb) or in order to rewrite the
AOF in a system semantically as complex as Redis and with inter-key
operations.
Anyway, we have strong hints that Redis master should reduce the
number of copy-on-write operations compared to 2.0. I have not yet
found the time to test it, but the difference may be huge.
Cheers,
Salvatore
--
Salvatore 'antirez' Sanfilippo
http://invece.org
"We are what we repeatedly do. Excellence, therefore, is not an act,
but a habit." -- Aristotele
Enjoy your holiday! Until you get back... this is what I think:
Using 2x memory in the worst case scenario is not a core issue. For
example, I might design a master-slave-slave replica set so that
persistence only takes place on one of the slaves. There are 2
questions though:
1) if a slave is designed specifically for persistence purposes, it
should be configured differently than the master, e.g. minimize memory
usage and maximize vm usage, i.e. leave enough free memory for the
"dump" process. At the same time, this slave will not answer any user
queries; it just connects to the master and receives data to persist.
2) Someone mentioned earlier that you should merge the VM with either
AOF or RDB. You also mentioned the possibility of using a series
of binary-format AOF files... anyway, if we could simply back up (copy)
the VM dump file it would be very cool. Also, if the VM file were
"paged", i.e. there is not only one VM file but a set of files (one
per "page" for example), it would be very good for incremental backup.
The key problem (uncertainty) for me is the restore procedure in case
of a system crash, especially in a clustered setup. I.e. if I
back up data on a server, I need to know what data I have backed up.
That's why I suggested a series of SHARDING-related commands, so that
it is very clear to the SA what data is on what node! Please see here:
http://groups.google.com/group/redis-db/browse_thread/thread/5d02fee62d434070
Best Regards,
Shannon
2010/10/30 Salvatore Sanfilippo <ant...@gmail.com>:
Hey, while waiting for my girlfriend to be ready I can reply ;)
In the master -> slave case you very obviously have a copy of the data
in the slave ;)
so 2x memory.
> 1) if a slave is designed specifically for persistence purposes, it
> should be configured differently than the master, e.g. minimize memory
> usage and maximize vm usage, i.e. leave enough free memory for the
> "dump" process. At the same time, this slave will not answer any user
> queries; it just connects to the master and receives data to persist.
This does not make sense, as the slave must be able to sustain the
same write speed as the master (and as the master <-> slave link
itself), otherwise you are d00med.
> 2) Someone mentioned earlier that you should merge the VM with either
> AOF or RDB. You also mentioned the possibility of using a series
> of binary-format AOF files... anyway, if we could simply back up (copy)
> the VM dump file it would be very cool. Also, if the VM file were
> "paged", i.e. there is not only one VM file but a set of files (one
> per "page" for example), it would be very good for incremental backup.
VM and persistence will likely not be merged, except in a single case:
if in future Redis versions we try to just use the disk, with
memory as a cache. If we find reasonable ways to store our data
structures with good performance and good locality directly on disk,
then there will no longer be any need for VM at all. I doubt this will
work, but this is just to show the only case where VM and persistence
may converge IMHO.
VM can't be used as a persistence mechanism anyway, since it does not
hold enough information: only the values are stored there, while the
keys stay in RAM.
Redesigning VM so that swapped objects will point to AOF/RDB instead
of the swap file is IMHO not a good idea:
1) In many cases you may want no persistence but a large VM (caching)
without any background operation (BGSAVE / AOF-REWRITE) taking
place. The VM as it is today allows you to do this already.
2) Same as 1), but you do this because you want persistence only in the
slave in order to have a master as responsive as possible (in this
specific case the master->slave link buffers are enough to compensate
for slowdowns in the slave that do not last too long).
3) From an operational point of view it is a nightmare. As it works today,
users know that they can, for instance, back up the latest .rdb file with
the "mv" unix command and a new dump will be created later. If persistence
and VM are mixed, this will corrupt the instance.
4) Can't be done with .rdb as it is rewritten every time. The whole
point of .rdb is that it's a compact representation of data in memory
so it is not viable to rewrite the new .rdb so that offsets will be
the same as the old one.
5) AOF does not contain a single value. What in .rdb / VM is a list of
"a,b,c" can actually be, in the AOF, the result of several different
PUSH/POP operations.
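To illustrate point 5, here is a small Python sketch that replays an append-only log of list operations. The log format and the `replay` helper are simplified assumptions for illustration, not the real AOF format. The final value "a,b,c" never appears anywhere in the log; it only exists after the whole log has been replayed, so there is no single offset a swapped object could point to.

```python
from collections import deque


def replay(log):
    """Rebuild list values by applying each logged operation in order."""
    lists = {}
    for op, key, *args in log:
        items = lists.setdefault(key, deque())
        if op == "RPUSH":
            items.append(args[0])
        elif op == "LPUSH":
            items.appendleft(args[0])
        elif op == "RPOP":
            if items:
                items.pop()
        elif op == "LPOP":
            if items:
                items.popleft()
        else:
            raise ValueError("unsupported operation: %s" % op)
    return {key: list(items) for key, items in lists.items()}


# Seven logged operations that end up as one three-element list.
log = [
    ("RPUSH", "mylist", "b"),
    ("RPUSH", "mylist", "x"),
    ("LPUSH", "mylist", "a"),
    ("RPOP", "mylist"),
    ("RPUSH", "mylist", "c"),
    ("RPUSH", "mylist", "y"),
    ("RPOP", "mylist"),
]
state = replay(log)
```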
We have been thinking very hard about these issues for months; it's
not that we don't care and that there are obvious solutions.
Cheers,
Salvatore
1) promote the slave to master, put a new machine as slave, then there
will be a MASSIVE sync, as the new server is empty.
2) promote the slave to master, put a new server as slave, copy back
the AOF file (backup) to the empty machine, then start the slave.
Which one is better (or easier)?
Also, do you have anything to say about auto-failover within the
replica set? As far as I know redis-cluster is mostly about sharding,
but not about HA or data-redundancy?
Thanks,
Shannon
2010/10/30 Salvatore Sanfilippo <ant...@gmail.com>:
>> The worst case scenario always requires 2x memory in order to rewrite the
>> AOF in a system semantically as complex as Redis and with inter-key
>> operations.
I believe there is a way around this.
Inter-key operations always produce a result.
If the result set is dumped instead of the command producing the
result, then the AOF file can be rewritten from the AOF file itself
(rather than from the real data, as is currently done). Further, if the
operations are sorted by timestamp and then by result-set key, the
rewriting can be done with very little memory.
This is not easy to do, but it is definitely doable.
Am I missing something?
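A rough Python sketch of how such a rewrite could look, under loud assumptions: the log holds hypothetical (timestamp, key, value) records where the value is the result a command produced, and the records have been sorted by key and then by timestamp (my reading of the ordering; the in-memory `sorted` call stands in for an on-disk sort). Deletes and expires are ignored here. Once the records are in that order, the rewrite is a single streaming pass that keeps one record in memory at a time.

```python
from itertools import groupby
from operator import itemgetter


def compact(records):
    """Yield only the newest (timestamp, key, value) record for each key.

    Assumes `records` is already sorted by (key, timestamp), so only the
    current record has to be held in memory while streaming.
    """
    for _key, group in groupby(records, key=itemgetter(1)):
        newest = None
        for newest in group:
            pass  # walk to the last (newest) record of this key
        yield newest


# Hypothetical result log: each entry stores the value a command produced.
result_log = [
    (1, "counter", "1"),
    (2, "union:a:b", "x,y"),
    (3, "counter", "2"),
    (4, "union:a:b", "x,y,z"),
    (5, "counter", "3"),
]
# In-memory sort as a stand-in for an on-disk external sort.
ordered = sorted(result_log, key=lambda r: (r[1], r[0]))
rewritten = list(compact(ordered))
```

The cost moves from memory to the sort step and to logging full results, which for large values could make the log much bigger than a command log.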
What's wrong with that, if the dedicated "persistence server" is
programmed very efficiently, and ideally can serve the entire
cluster (i.e. one "persistence server" works for multiple shards in
the cluster...)? I think this would be a "clean" and "easy" solution,
because it is quite "separated".
I am sorry if my thinking seems weird or immature... what I think is
that the next step for redis persistence is to make a rock solid HA
solution, as well as a proven, easy-to-follow disaster recovery
procedure. In other words, with redis as it is today, I can still
do the above steps myself, but I will feel much more comfortable if
redis is built to be safe and solid (in terms of data persistence).
Regards,
Shannon
2010/10/31 Jak Sprats <jaks...@gmail.com>:
I do not agree with your reasoning, but as you are redis internals
experts, I will be very happy if you correct my mistakes:
1) as redis needs 2x size-of-dataset memory in the worst case, and
supposing you cannot afford a cluster and have a 32 GB machine,
you can set up a 15GB dataset, reserve 15GB for the worst case, and 2GB
for the system and other possible tasks. Now your memory efficiency is a
little less than 50%, that's fine.
2) But if you exceed your limit, there are 2 options:
a) add more memory on that server, e.g. make it 48GB or 64GB, this way
your memory efficiency is still below 50%
b) arrange 3 machines with 16GB memory on each box, you will have a
cluster of 32GB memory with the persistence server having 16GB memory.
Memory efficiency improves to 60+%.
3) Now the most important point: I do *not* think the so-called
persistence server requires as much memory as the working servers. In
fact, if the persistence server in the above example has 8GB memory,
cost-effectiveness will be better, because it should be a dedicated
server processing the AOF: it does not need the ability to reply
rapidly to client queries, and it should use an algorithm suited to
serializing data rather than to random access. I hope this is
possible...
That's the whole point of the hypothetical persistence server. If that
server requires a memory space equal to the sum of all servers in the
cluster, that would be ridiculous, if not useless.
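The arithmetic behind these figures, as a small Python sketch. "Memory efficiency" is taken here to mean dataset size divided by total installed RAM, and the cluster cases assume (as a simplification of the example above) that each working server uses its full 16GB for data.

```python
def memory_efficiency(dataset_gb, total_gb):
    """Fraction of installed RAM that holds live data."""
    return dataset_gb / total_gb


# Single 32GB box: 15GB dataset, 15GB reserved for the worst case, 2GB for the system.
single_box = memory_efficiency(15, 32)

# Three 16GB boxes: two working servers (32GB of data) plus one persistence server.
three_boxes = memory_efficiency(2 * 16, 3 * 16)

# Same cluster, but the persistence server has only 8GB.
small_persistence = memory_efficiency(2 * 16, 2 * 16 + 8)

print("%.1f%% %.1f%% %.1f%%" % (
    single_box * 100, three_boxes * 100, small_persistence * 100))
```

Under these assumptions the three cases come out at roughly 47%, 67% and 80%, which matches the "a little less than 50%" and "60+%" figures above.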
Regards,
Shannon
2010/11/1 Jak Sprats <jaks...@gmail.com>:
I think your algorithm is very valuable for Salvatore, even though you
may have different ideas... I am not that "in" the core of redis, so
I cannot comment.
My idea is just "eventually persistent". Because web access has
patterns, a web site cannot have a very high access rate 24*7*365.
With the help of replicas & sharding, you can be sure that some servers
are relatively free at some time. Optimistically speaking, you may
distribute servers geographically, so that you can always have quiet
times while the majority of your users are sleeping.
In other words, with redis, you can say that "products do not
persist, but architectures do", i.e. redis can focus on its strength
of doing things fast and rock stable, and leave the things that it
cannot do to the administrator... of course, with the necessary
mechanisms built into redis, and with established best practices.
Regards.
2010/11/2 Jak Sprats <jaks...@gmail.com>: