Add a feature 'strong cas', developed from the 'lease' mentioned in Facebook's paper


Zhiwei Chan

Apr 17, 2014, 3:21:50 AM
to memc...@googlegroups.com

I'm working on a trading system, and serving stale data is unacceptable most of the time. But the high throughput makes it impossible to read everything from MySQL, so I want to make memcached more reliable as a cache. Facebook's paper "Scaling Memcache at Facebook" mentions methods called 'lease' and 'mcsqueal', but mcsqueal is difficult in my case, because it is hard to derive the cache key from the MySQL statements.

Adding the 'strong cas' feature is meant to solve the following typical problem: client A and client B both want to update the same key, and A (set key=>1) updates the database before B (set key=>2):
key does not exist in cache: (A get-miss)->(B get-miss)->(B set key=2) -> (A set key=1);
or key exists in cache: (A delete key)->(B delete key)->(B set key=2) -> (A set key=1);
Something is wrong! key=2 in the database but key=1 in the cache.
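The race can be reproduced with a minimal sketch (the dict-backed `MockCache` below is a hypothetical stand-in for a memcached client, not real client code): A's database write lands first, but A's cache set lands last, so the cache ends up stale.

```python
class MockCache:
    """Dict-backed stand-in for a memcached client."""
    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


db = {}
cache = MockCache()

# Database writes: A first, then B, so the DB's final value is 2.
db["key"] = 1   # A
db["key"] = 2   # B

# But the cache operations interleave the other way around:
assert cache.get("key") is None   # A get-miss
assert cache.get("key") is None   # B get-miss
cache.set("key", 2)               # B fills the cache
cache.set("key", 1)               # A fills the cache last -- stale!

print(db["key"], cache.get("key"))  # 2 1 -- inconsistent
```

Plain cas does not help here because both fills start from a miss, so neither client holds a cas-unique for the other's write.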

This can happen in a highly concurrent system, and I have not found a way to solve it with the current cas mechanism. So I added two commands, 'getss' and 'deletess': they create a lease and return a cas-unique, or tell the client that a lease already exists on the server. The client can then do something to prevent stale data, such as waiting or invalidating the previous lease.
I also think of the lease as a kind of 'dirty lock': anyone who tries to update the key will have their item's expiration replaced by the lease's expiration (the lease expiration should be very short), so in the worst case (low probability) stale data only lives in the cache for a short time. That is acceptable for most apps in my case.
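A toy model of the intended semantics (the command names come from the proposal, but this in-memory `LeaseCache` is only my sketch of the described behavior, not the patch itself): `deletess` invalidates and takes a lease, `getss` either returns the value, hands out a lease on a fresh miss, or tells the caller to wait.

```python
import time

class LeaseCache:
    """Toy model of the proposed 'deletess'/'getss' lease semantics."""
    LEASE_TTL = 0.5  # the lease expiration should be very short

    def __init__(self):
        self.data = {}    # key -> value
        self.leases = {}  # key -> (token, expires_at)

    def deletess(self, key):
        """Delete the key and create a lease; return a cas-unique token."""
        self.data.pop(key, None)
        token = object()
        self.leases[key] = (token, time.monotonic() + self.LEASE_TTL)
        return token

    def getss(self, key):
        """Return ('value', v), ('lease', token) on a fresh miss,
        or ('wait', None) if another client already holds the lease."""
        if key in self.data:
            return ("value", self.data[key])
        lease = self.leases.get(key)
        if lease and lease[1] > time.monotonic():
            return ("wait", None)  # someone else is updating: back off
        token = object()
        self.leases[key] = (token, time.monotonic() + self.LEASE_TTL)
        return ("lease", token)

    def set_with_lease(self, key, value, token):
        """Only the current lease holder may fill the key."""
        lease = self.leases.get(key)
        if not lease or lease[0] is not token or lease[1] <= time.monotonic():
            return False
        del self.leases[key]
        self.data[key] = value
        return True


c = LeaseCache()
t_b = c.deletess("key")       # B invalidates the key and takes the lease
status, _ = c.getss("key")    # A arrives while B holds the lease
print(status)                 # wait -- A backs off instead of reading stale data
print(c.set_with_lease("key", 2, t_b))  # True -- B fills with the fresh value
```

If B crashes before filling, the lease simply expires after `LEASE_TTL`, which is the "dirty lock" property: the bad state cannot outlive the lease.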

For more detailed information, please read doc/strongcas.txt. Hoping for your suggestions ~_~

I have created a pull request on GitHub.
https://github.com/memcached/memcached/pull/65

dormando

Apr 20, 2014, 3:23:17 AM
to memc...@googlegroups.com
Well I haven't read the lease paper yet. Ryan, can folks more familiar
with the actual implementation have a look through it maybe?

Ryan McElroy

Apr 22, 2014, 2:17:12 PM
to memc...@googlegroups.com
I'll take a look at the API and behavior and provide feedback. I'll also ping our former server guys to see if they can take a look at the implementation.

Cheers,

~Ryan

Guille -bisho-

Apr 22, 2014, 2:51:23 PM
to memc...@googlegroups.com
Leases are better for fighting thundering herds (too many clients
trying to fill a miss) and for being a bit more efficient by blocking,
but the same strong cas can be implemented with regular cas and add.
Even the leases can be implemented with a lock in memcache when you
try to update a key, built from add/delete (though again less
efficiently than the FB integrated implementation).

From your example:
- client A and client B want to update the same key, and A (set
key=>1) updates the database before B (set key=>2):
- key does not exist in cache: (A get-miss)->(B get-miss)->(B set key=2) ->
(A set key=1);
They should be using add, not set!
B add key=2 (success); A add key=1 (fail); A re-reads from the master
DB and updates the key using cas. If reading from the master DB is not
an option due to load, re-read from the slave but set a low TTL so the
key is re-read and corrected soon.
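That add-then-cas pattern can be sketched with a toy client (the `add`/`gets`/`cas` semantics below mirror memcached's protocol commands, but the dict-backed class itself is hypothetical):

```python
class MockMemcache:
    """Dict-backed toy with memcached-style add/gets/cas semantics."""
    def __init__(self):
        self.data = {}    # key -> (value, cas_unique)
        self.counter = 0  # monotonically increasing cas-unique source

    def add(self, key, value):
        """Store only if the key does not exist (NOT_STORED otherwise)."""
        if key in self.data:
            return False
        self.counter += 1
        self.data[key] = (value, self.counter)
        return True

    def gets(self, key):
        """Return (value, cas_unique), or None on a miss."""
        return self.data.get(key)

    def cas(self, key, value, unique):
        """Store only if the cas_unique still matches (EXISTS otherwise)."""
        cur = self.data.get(key)
        if cur is None or cur[1] != unique:
            return False
        self.counter += 1
        self.data[key] = (value, self.counter)
        return True


mc = MockMemcache()
db_master = {"key": 2}       # B's write landed last in the database

print(mc.add("key", 2))      # B: True (success)
print(mc.add("key", 1))      # A: False (fail) -- A knows it raced and lost
fresh = db_master["key"]     # A re-reads the current value from the master DB
_, unique = mc.gets("key")
print(mc.cas("key", fresh, unique))  # True: cache now matches the DB
print(mc.gets("key")[0])             # 2
```

The key point is that A's failed add replaces the silent last-writer-wins of set with an explicit signal to re-read, after which cas guards the repair against further concurrent writers.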

That said, it's good to have these leases implemented in vanilla memcached!

Regards,
Guille -ℬḭṩḩø- <bish...@gmail.com>
:wq

Zhiwei Chan

Apr 23, 2014, 3:11:17 AM
to memc...@googlegroups.com
   Thanks for the suggestions, folks, specifically bisho. Maybe I should
 describe the problem I face in more detail. Recently I read more
 papers and articles and found that it is very difficult to achieve
 "strong consistency" between cache and database. But the lease seems
 to make it work better in my case, which is shown below.

   We have two data centers in cities far apart,
 and communication between them is expensive, with
 more than 200ms latency. The two data centers have the same
 structure, because we want to keep the system working when one
 of them breaks. The apps only write to the master
 database, but read from the cache and slave database in their own data center.

  I have to make sure that most of the time apps can read an almost "realtime"
  value from the cache (we read the very important data from the master
  database directly). So I want the lease to be part of my solution.
      
 (master)Data center 1     >= 1000KM  (slave)Data center 2        
 -------------------------|---------------------------------
                          |                          
       <APP>              |               <APP>             
        || \        <Invalidate>           /||
        /\  \--->(Send) ------>(Rcv)->|   / | \
       /  \               |     |     |  /  |  \   
      /    \              |     | (marker)  |   \            
     /      \             |     |           |    \       
    /        \            |     |       /<--|     \          
(DB(master))  (Cache)     |     |->(Cache)    (DB(slave))  
    |                     |                       /         
    |-->------------Replication-------- - --------->/
                          |                                 
                          |                                 
  
  
   In the picture above:
   1. the marker, as described in Facebook's paper, is a special cache
   server that tells the cache client which keys are being updated on
   the master;
   2. Send and Rcv form a proxy, like a pipe, sending cache invalidation
   messages from the master data center to all slave data centers
   whenever a key is updated. The messages carry only the key and
   perhaps something like a CRC, so they should not cause a traffic problem.

     As the earlier mail said, the "deletess" command is designed to create
   a lease in every cache in the slave data centers. The lease tells
   the client to wait while the key is being updated, and it also makes
   the key temporary for a short period; this is the "dirty"
   lock from the earlier mail. So in the worst case, the inconsistent state
   disappears after the lease time. The "getss" command is designed
   to get the lease or the value for the client.
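 The slave-side flow could be sketched like this (all names are my own, hypothetical: `SlaveCache` stands in for a cache node behind the Rcv proxy, `fill` stands in for replication catching up):

```python
class SlaveCache:
    """Toy cache node in the slave data center."""
    def __init__(self):
        self.data = {}
        self.leased = set()  # keys currently being updated on the master

    def deletess(self, key):
        """Invalidation arrived from the master: drop the key, take a lease."""
        self.data.pop(key, None)
        self.leased.add(key)

    def getss(self, key):
        """Readers see the lease and wait rather than reading a stale value."""
        if key in self.data:
            return ("value", self.data[key])
        if key in self.leased:
            return ("wait", None)
        return ("miss", None)

    def fill(self, key, value):
        """Replication caught up: clear the lease and store the fresh value."""
        self.leased.discard(key)
        self.data[key] = value


cache = SlaveCache()
cache.fill("price", 100)

# Master updates the key; the Send/Rcv proxy delivers the invalidation.
cache.deletess("price")
print(cache.getss("price"))   # ('wait', None): no stale read during the update

# Replication reaches the slave DB; the cache is refilled.
cache.fill("price", 101)
print(cache.getss("price"))   # ('value', 101)
```

In the real design the lease would also expire on its own, so even a lost invalidation or a crashed writer cannot leave readers waiting for more than the lease time.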

    So I just wanted to share my idea; any suggestion is welcome.
 


