Add a feature 'strong cas', developed from the 'lease' mentioned in Facebook's paper


Zhiwei Chan

Apr 17, 2014, 3:21:50 AM
to memc...@googlegroups.com

I'm working on a trading system, and serving stale data is unacceptable most of the time. But the high throughput makes it impossible to read everything from MySQL, so I want to make memcached more reliable as a cache. Facebook's paper "Scaling Memcache at Facebook" mentions methods called 'lease' and 'mcsqueal', but mcsqueal is difficult in my case, because it is hard to derive the cache key from the MySQL statements.

Adding the 'strong cas' feature is meant to solve the following typical problem: client A and client B both want to update the same key, and A (set key=>1) updates the database before B (set key=>2):
key does not exist in cache: (A get-miss)->(B get-miss)->(B set key=2) -> (A set key=1);
or key exists in cache: (A delete key)->(B delete key)->(B set key=2) -> (A set key=1);
Something is wrong! key=2 in the database but key=1 in the cache.
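The race can be reproduced with a minimal sketch (the dict-backed `MockCache` below is a hypothetical stand-in for a memcached client, not real client code): A's database write lands first, but A's cache set lands last, so the cache ends up stale.

```python
class MockCache:
    """Dict-backed stand-in for a memcached client."""
    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


db = {}
cache = MockCache()

# Database writes: A first, then B, so the DB's final value is 2.
db["key"] = 1   # A
db["key"] = 2   # B

# But the cache operations interleave the other way around:
assert cache.get("key") is None   # A get-miss
assert cache.get("key") is None   # B get-miss
cache.set("key", 2)               # B fills the cache
cache.set("key", 1)               # A fills the cache last -- stale!

print(db["key"], cache.get("key"))  # 2 1 -- inconsistent
```

Plain cas does not help here because both fills start from a miss, so neither client holds a cas-unique for the other's write.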

This can happen in a highly concurrent system, and I have not found a way to solve it with the current cas mechanism. So I added two commands, 'getss' and 'deletess': they create a lease and return a cas-unique, or tell the client that a lease already exists on the server. The client can then do something to prevent stale data, such as waiting or invalidating the previous lease.
I also think of the lease as a kind of 'dirty lock': anyone who tries to update the key will have their item's expiration replaced by the lease's expiration (the lease expiration should be very short), so in the worst case (low probability) stale data only lives in the cache for a short time. That is acceptable for most apps in my case.
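A toy model of the intended semantics (the command names come from the proposal, but this in-memory `LeaseCache` is only my sketch of the described behavior, not the patch itself): `deletess` invalidates and takes a lease, `getss` either returns the value, hands out a lease on a fresh miss, or tells the caller to wait.

```python
import time

class LeaseCache:
    """Toy model of the proposed 'deletess'/'getss' lease semantics."""
    LEASE_TTL = 0.5  # the lease expiration should be very short

    def __init__(self):
        self.data = {}    # key -> value
        self.leases = {}  # key -> (token, expires_at)

    def deletess(self, key):
        """Delete the key and create a lease; return a cas-unique token."""
        self.data.pop(key, None)
        token = object()
        self.leases[key] = (token, time.monotonic() + self.LEASE_TTL)
        return token

    def getss(self, key):
        """Return ('value', v), ('lease', token) on a fresh miss,
        or ('wait', None) if another client already holds the lease."""
        if key in self.data:
            return ("value", self.data[key])
        lease = self.leases.get(key)
        if lease and lease[1] > time.monotonic():
            return ("wait", None)  # someone else is updating: back off
        token = object()
        self.leases[key] = (token, time.monotonic() + self.LEASE_TTL)
        return ("lease", token)

    def set_with_lease(self, key, value, token):
        """Only the current lease holder may fill the key."""
        lease = self.leases.get(key)
        if not lease or lease[0] is not token or lease[1] <= time.monotonic():
            return False
        del self.leases[key]
        self.data[key] = value
        return True


c = LeaseCache()
t_b = c.deletess("key")       # B invalidates the key and takes the lease
status, _ = c.getss("key")    # A arrives while B holds the lease
print(status)                 # wait -- A backs off instead of reading stale data
print(c.set_with_lease("key", 2, t_b))  # True -- B fills with the fresh value
```

If B crashes before filling, the lease simply expires after `LEASE_TTL`, which is the "dirty lock" property: the bad state cannot outlive the lease.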

For more detailed information, please read doc/strongcas.txt. Hoping for your suggestions ~_~

I have created a pull request on GitHub.
https://github.com/memcached/memcached/pull/65

dormando

Apr 20, 2014, 3:23:17 AM
to memc...@googlegroups.com
Well I haven't read the lease paper yet. Ryan, can folks more familiar
with the actual implementation have a look through it maybe?

Ryan McElroy

Apr 22, 2014, 2:17:12 PM
to memc...@googlegroups.com
I'll take a look at the API and behavior and provide feedback. I'll also ping our former server guys to see if they can take a look at the implementation.

Cheers,

~Ryan

Guille -bisho-

Apr 22, 2014, 2:51:23 PM
to memc...@googlegroups.com
Leases are better for fighting thundering herds (too many clients
trying to fill a miss) and for being a bit more efficient by blocking,
but the same strong cas can be implemented with regular cas and add.
Even the leases can be implemented with a lock in memcache when you
try to update a key, built from add/delete (though again less
efficiently than the FB integrated implementation).

From your example:
- client A and client B want to update the same key, and A (set
key=>1) updates the database before B (set key=>2):
- key does not exist in cache: (A get-miss)->(B get-miss)->(B set key=2) ->
(A set key=1);
They should be using add, not set!
B add key=2 (success); A add key=1 (fail); A re-reads from the master
DB and updates the key using cas. If reading from the master DB is not
an option due to load, re-read from the slave but set a low TTL so the
key is re-read and corrected soon.
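That add-then-cas pattern can be sketched with a toy client (the `add`/`gets`/`cas` semantics below mirror memcached's protocol commands, but the dict-backed class itself is hypothetical):

```python
class MockMemcache:
    """Dict-backed toy with memcached-style add/gets/cas semantics."""
    def __init__(self):
        self.data = {}    # key -> (value, cas_unique)
        self.counter = 0  # monotonically increasing cas-unique source

    def add(self, key, value):
        """Store only if the key does not exist (NOT_STORED otherwise)."""
        if key in self.data:
            return False
        self.counter += 1
        self.data[key] = (value, self.counter)
        return True

    def gets(self, key):
        """Return (value, cas_unique), or None on a miss."""
        return self.data.get(key)

    def cas(self, key, value, unique):
        """Store only if the cas_unique still matches (EXISTS otherwise)."""
        cur = self.data.get(key)
        if cur is None or cur[1] != unique:
            return False
        self.counter += 1
        self.data[key] = (value, self.counter)
        return True


mc = MockMemcache()
db_master = {"key": 2}       # B's write landed last in the database

print(mc.add("key", 2))      # B: True (success)
print(mc.add("key", 1))      # A: False (fail) -- A knows it raced and lost
fresh = db_master["key"]     # A re-reads the current value from the master DB
_, unique = mc.gets("key")
print(mc.cas("key", fresh, unique))  # True: cache now matches the DB
print(mc.gets("key")[0])             # 2
```

The key point is that A's failed add replaces the silent last-writer-wins of set with an explicit signal to re-read, after which cas guards the repair against further concurrent writers.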

That said, it's good to have these leases implemented in vanilla memcached!

Regards,
Guille -ℬḭṩḩø- <bish...@gmail.com>
:wq

Zhiwei Chan

Apr 23, 2014, 3:11:17 AM
to memc...@googlegroups.com
   Thanks for the suggestions, folks, specifically bisho. Maybe I should
 describe the problem I face in more detail. Recently I read more
 papers and articles and found that it is very difficult to achieve
 "strong consistency" between cache and database. But the lease seems
 to make it work better in my case, which is shown below.

   We have two data centers in cities far apart,
 and communication between them is expensive, with
 more than 200ms latency. The two data centers have the same
 structure, because we want to keep the system working when one
 of them breaks. The apps only write to the master
 database, but read from the cache and slave database in their own data center.

  I have to make sure that most of the time apps can read an almost "realtime"
  value from the cache (we read the very important data from the master
  database directly). So I want the lease to be part of my solution.
      
 (master)Data center 1     >= 1000KM  (slave)Data center 2        
 -------------------------|---------------------------------
                          |                          
       <APP>              |               <APP>             
        || \        <Invalidate>           /||
        /\  \--->(Send) ------>(Rcv)->|   / | \
       /  \               |     |     |  /  |  \   
      /    \              |     | (marker)  |   \            
     /      \             |     |           |    \       
    /        \            |     |       /<--|     \          
(DB(master))  (Cache)     |     |->(Cache)    (DB(slave))  
    |                     |                       /         
    |-->------------Replication-------- - --------->/
                          |                                 
                          |                                 
  
  
   In the picture above:
   1. the marker, as described in Facebook's paper, is a special cache
   server that tells the cache client which keys are being updated on
   the master;
   2. Send and Rcv form a proxy, like a pipe, sending cache invalidation
   messages from the master data center to all slave data centers
   whenever a key is updated. The messages carry only the key and
   perhaps something like a CRC, so they should not cause a traffic problem.

     As the earlier mail said, the "deletess" command is designed to create
   a lease in every cache in the slave data centers. The lease tells
   the client to wait while the key is being updated, and it also makes
   the key temporary for a short period; this is the "dirty"
   lock from the earlier mail. So in the worst case, the inconsistent state
   disappears after the lease time. The "getss" command is designed
   to get the lease or the value for the client.
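 The slave-side flow could be sketched like this (all names are my own, hypothetical: `SlaveCache` stands in for a cache node behind the Rcv proxy, `fill` stands in for replication catching up):

```python
class SlaveCache:
    """Toy cache node in the slave data center."""
    def __init__(self):
        self.data = {}
        self.leased = set()  # keys currently being updated on the master

    def deletess(self, key):
        """Invalidation arrived from the master: drop the key, take a lease."""
        self.data.pop(key, None)
        self.leased.add(key)

    def getss(self, key):
        """Readers see the lease and wait rather than reading a stale value."""
        if key in self.data:
            return ("value", self.data[key])
        if key in self.leased:
            return ("wait", None)
        return ("miss", None)

    def fill(self, key, value):
        """Replication caught up: clear the lease and store the fresh value."""
        self.leased.discard(key)
        self.data[key] = value


cache = SlaveCache()
cache.fill("price", 100)

# Master updates the key; the Send/Rcv proxy delivers the invalidation.
cache.deletess("price")
print(cache.getss("price"))   # ('wait', None): no stale read during the update

# Replication reaches the slave DB; the cache is refilled.
cache.fill("price", 101)
print(cache.getss("price"))   # ('value', 101)
```

In the real design the lease would also expire on its own, so even a lost invalidation or a crashed writer cannot leave readers waiting for more than the lease time.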

    So I just wanted to share my idea; any suggestion is welcome.
 


