From time to time somebody asks why Redis lacks "CAS", the memcached
Check And Set operator. This operator works in the following way:
when the client performs a GET, the server actually returns two values:
the value of the key itself and a 64-bit integer, called the
"cas_token" in memcached slang, that you can think of as a "version"
of the value stored at the key. Every time the key is set to a new
value this counter is incremented. Thanks to this token you can update
the value stored at the key in such a way that it is only modified if
no other client changed it since your GET, with the following
semantics:
value,cas_token = GET <key>
... perform some work on value that will produce new_value ...
CAS <key> <new_value> <cas_token_received_by_get>
If the token you are sending to the CAS operation matches the current
key token then the key will be set to the new value, otherwise CAS
notifies the client that no update was performed.
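The token semantics described above can be modelled in a few lines. This is only a toy in-memory sketch, not memcached's implementation: the class name CasStore and the methods gets and cas are names chosen here for illustration.

```python
class CasStore:
    """Toy in-memory model of a store that hands out CAS tokens."""

    def __init__(self):
        self._data = {}  # key -> (value, token)

    def set(self, key, value):
        # Every write bumps the per-key version counter.
        _, token = self._data.get(key, (None, 0))
        self._data[key] = (value, token + 1)

    def gets(self, key):
        # Returns (value, cas_token); (None, 0) if the key is missing.
        return self._data.get(key, (None, 0))

    def cas(self, key, new_value, token):
        # Write only if nobody changed the key since the token was read.
        _, current = self._data.get(key, (None, 0))
        if token != current:
            return False
        self._data[key] = (new_value, current + 1)
        return True


store = CasStore()
store.set("greeting", "hello")
value, token = store.gets("greeting")
store.set("greeting", "hi")                        # another client writes
stale = store.cas("greeting", value + "!", token)  # token is now stale
value, token = store.gets("greeting")
fresh = store.cas("greeting", value + "!", token)  # token is current
```

Here the first cas call is rejected because the key changed after the token was read, while the second one, made with a freshly read token, goes through.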
This makes it possible to build a number of atomic operations on
values. For example you can implement Redis' LPUSH:
LOOP FOREVER
value,token = GET mykey
value = .... restore the value that is probably serialized in some
way, append the element, re-serialize the value ...
CAS mykey value token
BREAK if CAS returned success (i.e. the value was updated)
END
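The retry loop above can be made concrete as follows. It is a minimal sketch under stated assumptions: the server is replaced by a plain dict, the list is serialized as JSON, and gets, cas, lpush_with_cas and the before_cas hook are hypothetical names used only for this illustration.

```python
import json

store = {}  # key -> (serialized_value, token); toy stand-in for the server


def gets(key):
    return store.get(key, (None, 0))


def cas(key, new_value, token):
    if store.get(key, (None, 0))[1] != token:
        return False
    store[key] = (new_value, token + 1)
    return True


def lpush_with_cas(key, element, before_cas=None):
    """Prepend element to a JSON-encoded list; return the number of attempts."""
    attempts = 0
    while True:
        attempts += 1
        raw, token = gets(key)
        items = json.loads(raw) if raw is not None else []  # de-serialize
        items.insert(0, element)                            # append at the head
        new_raw = json.dumps(items)                         # re-serialize
        if before_cas is not None:
            before_cas(attempts)  # hook used to inject a competing write
        if cas(key, new_raw, token):
            return attempts


lpush_with_cas("mylist", "a")


def rival(attempt):
    # Simulates another client sneaking in a push during our first attempt.
    if attempt == 1:
        lpush_with_cas("mylist", "x")


attempts_needed = lpush_with_cas("mylist", "b", before_cas=rival)
final_list = json.loads(store["mylist"][0])
```

With one competing write injected, the second push needs two attempts, and all the de-serialize, insert, re-serialize work of the first attempt is thrown away.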
At first glance this may look like a cool lock-free operation
that makes it possible to build a lot of higher level atomic operations
without locking. Unfortunately, if you study the semantics a bit more
in depth, it turns out that CAS is a weak, ill-conceived form of
locking. Basically, when CAS reports that it was not able to perform
the operation because some client updated the key in the meantime, you
need to redo the whole operation: this is like waiting for a lock, but
with two main drawbacks:
- You only find out that the lock was not obtained after you have
already done all the work to create the new value, transferred it to
the server, and so on...
- There is no serialization of the clients waiting for a key, so
in theory a slow client will have a very bad time trying to modify a
key.
Imagine for example that you have two Linux boxes trying to perform a
lot of LPUSHes against a single key holding a lot of values, but
one Linux box is four times faster than the other one. The
de-serialize, append the new value, re-serialize cycle can be pretty
slow, so the fast Linux box will probably be able to update the key
multiple times while the slow one is trying to update it once. The
slow Linux box will get a lot of CAS failures before being able to
update the key.
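To make the two-box scenario tangible, here is a small deterministic simulation. It is a sketch under simplifying assumptions, not a benchmark: time advances in discrete ticks, the fast box needs 1 tick per read/modify/write cycle and the slow one 4, and the function simulate and its parameters are invented for this example.

```python
def simulate(work_ticks, pushes_each):
    """Return the number of CAS failures per client on one shared key.

    work_ticks maps a client name to the ticks it needs for one
    read / de-serialize / append / serialize cycle.
    """
    version = 0  # the key's CAS token
    failures = {name: 0 for name in work_ticks}
    remaining = {name: pushes_each for name in work_ticks}
    # Every client reads the key at time 0: (token_read, tick_when_cas_is_sent)
    pending = {name: (version, ticks) for name, ticks in work_ticks.items()}
    now = 0
    while any(remaining.values()):
        now += 1
        # On a tie the faster client's CAS reaches the server first.
        for name in sorted(pending, key=lambda n: work_ticks[n]):
            if remaining[name] == 0:
                continue
            token, done_at = pending[name]
            if done_at != now:
                continue
            if token == version:
                version += 1          # CAS accepted
                remaining[name] -= 1
            else:
                failures[name] += 1   # CAS rejected, all the work is redone
            pending[name] = (version, now + work_ticks[name])
    return failures


result = simulate({"fast": 1, "slow": 4}, pushes_each=20)
```

In this model the slow client fails five times in a row and gets its first update in only after the fast one has finished all 20 of its pushes, while the fast client never fails.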
So IMHO CAS appears to be cool but it sucks. Moreover it is conceived
entirely for a strings-only scenario: for instance you can't
implement Redis' SMOVE with CAS, or any other operation working
against multiple elements in a Redis List or Set. For complex
operations what we want is a very fast and flexible locking primitive,
and this is what Redis will get in the long run. That said, thanks to
the higher level operations offered by Redis we can live well
without locking for now, and this will remain the preferred way to
do things anyway, but sometimes locking is really needed.
Ok looks like we have a new entry for the FAQ :)
Regards,
Salvatore
--
Salvatore 'antirez' Sanfilippo
http://invece.org
> it would work like SET, but client should also pass a checksum of the
> OLD value the client thinks should be replaced (or the old value, but
> sometimes you store large values, so it might be not efficient). If
> within Redis the checksums do not match, error is returned. Now it is
> the client application that should decide what to do next - this would
> vary depending on the particular scenario.
>
> The checksum could be calculated using MD5, SHA1 or any other hashing
> algorithm that is quick enough and provides low probability of
> collisions. Or the old value itself could be passed - it would work
> better for short values.
Hello Michal,
Are the LOCK semantics I described enough to fix your issue? If
possible, could you give more details about the kind of data
manipulation that requires this SETIF instruction? I'm not sure I can
argue about this without a few more details, since I wonder whether you
need SETIF in order to give up on the update entirely if the data
changed in the meantime, or whether you instead retry the operation.
Using LOCK, your code could change to the following:
LOCK key
GET key
... manipulate key by code ...
SET key <newval>
UNLOCK key
Note that reading clients will not use LOCK at all, so they will be
able to read the data without any kind of locking.
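As a rough illustration of the pattern, here is an in-process sketch. Everything in it is an assumption made for the example: the proposed LOCK/UNLOCK is modelled with a threading.Lock per key, the server with a plain dict, and locked_update and read are hypothetical helper names.

```python
import threading

data = {"counter": 0}                  # toy stand-in for the server's keyspace
locks = {"counter": threading.Lock()}  # stands in for the proposed LOCK/UNLOCK


def locked_update(key, fn):
    with locks[key]:            # LOCK key
        value = data[key]       # GET key
        new_value = fn(value)   # ... manipulate the value in client code ...
        data[key] = new_value   # SET key <newval>
    # UNLOCK key happens when the with-block exits


def read(key):
    return data[key]            # readers never take the lock


def worker():
    for _ in range(1000):
        locked_update("counter", lambda v: v + 1)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
total = read("counter")
```

Unlike the CAS loop, a writer here waits before doing its work rather than redoing it afterwards, so no update is ever computed and then discarded.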
Thank you very much!
Salvatore
On Wed, Aug 8, 2012 at 8:08 AM, Joseph Jang <josep...@gmail.com> wrote:
-- TYPE returns a status reply (a Lua table with an "ok" field), hence .ok
if redis.call("type", KEYS[1]).ok ~= "string" or redis.call("get", KEYS[1]) ~= ARGV[1] then
    redis.call("set", KEYS[1], ARGV[2])
end
You can easily modify this in order to set a value only if the key does not exist.
> LOCK key
> GET key
> ... manipulate key by code ...
> SET key <newval>
> UNLOCK key
The problem with this is that client has to "GET" the value of the key first, check it and then "SET" it. This is one extra roundtrip to the server. For e.g. here's how I plan to use Redis as pure cache:

    Customer getCustomerData(customerId) {
        Customer c = redis.get(customerId);
        if (c == null) {
            c = db.get(customerId);
            redis.put(customerId, c);
        }
        return c;
    }

Note that the "put" above needs to be a conditional put. I only want to put the Customer record in cache if this is a newest version. If some other client already put a newer version, I don't want to overwrite it. So with your current solution, I will have to do the following:

    Customer getCustomerData(customerId) {
        Customer c = redis.get(customerId);
        if (c == null) {
            c = db.get(customerId);
            redis.LOCK(customerId);
            oldValue = redis.get(customerId);
            if (oldValue.version < c.version) {
                redis.put(customerId, c);
            }
            redis.UNLOCK(customerId);
        }
        return c;
    }

The above pattern forces me to fetch the value on client side (one extra roundtrip to redis cache server). I would really like to avoid it. Is there a way to do that?

Ideally, I would like to do the following:

    Customer getCustomerData(customerId) {
        Customer c = redis.get(customerId);
        if (c == null) {
            c = db.get(customerId);
            redis.putIfAbsent(customerId, c, c.version);
            // OR if I updated the customer record, I would invoke
            // "redis.put(customerId, c.version /* current version */, expectedVersion)"
        }
        return c;
    }

This pattern of CAS avoids one complete round-trip to fetch the value from cache and also any multi-instruction LOCK.
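The conditional put asked for here can be modelled as a single check-and-write step. The sketch below is a toy in-memory model, not a Redis API: put_if_newer, get_customer and load_from_db are hypothetical names, and on a real server the version check would have to run atomically on the server side for the single round trip to be safe.

```python
cache = {}  # key -> (version, value); toy stand-in for the cache server


def put_if_newer(key, value, version):
    """Store value unless the cache already holds an equal or newer version."""
    current = cache.get(key)
    if current is not None and current[0] >= version:
        return False
    cache[key] = (version, value)
    return True


def get_customer(customer_id, load_from_db):
    entry = cache.get(customer_id)
    if entry is not None:
        return entry[1]
    # Cache miss: load from the database, then do one conditional write.
    version, customer = load_from_db(customer_id)
    put_if_newer(customer_id, customer, version)
    return customer


first = put_if_newer("c1", {"name": "Ann"}, 2)   # accepted, cache was empty
second = put_if_newer("c1", {"name": "Old"}, 1)  # rejected, version 2 is newer
bob = get_customer("c2", lambda cid: (1, {"name": "Bob"}))
```

The caller never reads the old value back: it sends the new value together with its version and lets the store decide whether to keep it.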
--You received this message because you are subscribed to the Google Groups "Redis DB" group.
To view this discussion on the web visit https://groups.google.com/d/msg/redis-db/-/EZ2e9GVsaSEJ.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.
I read that executing Lua scripts is synchronous i.e. no other clients can run commands on the server while the script is executing (See "atomicity of scripts" section: http://redis.io/commands/eval). Will this not seriously affect concurrency/throughput?
All the Redis commands are like that
> All the Redis commands are like that
So is a "GET k1" and "GET k2" from different clients serialized?