New Project Based Off of Voldemort (open source)

Lester Pi

Mar 29, 2016, 8:38:02 PM
to project-voldemort
Hi Community,

I am involved in a recently open-sourced project inspired by Project Voldemort called CacheStore. Originally, CacheStore was created as a plugin for Project Voldemort before breaking off into its own key-value storage system. It has been used in production for the past 5 years at Viant Inc., a data and ad-tech company. Many of its features, such as server-side replication, an object query language, range scans, stored procedures, and cursors, were added to address the large-scale data needs of the business.

I would like to share this with you guys since this project was heavily influenced by Project Voldemort.

Come check it out if you have time: http://viant.github.io/CacheStore/

Hope you guys like it!

Thanks and Best Regards,
Lester Pi

Arunachalam

Mar 29, 2016, 10:36:20 PM
to project-...@googlegroups.com
This is cool. Is there a benchmark or load test that you have done and published the numbers for? I am most interested in throughput and latency.

Thanks,
Arun.

Mickey Hsieh

Mar 30, 2016, 10:39:33 AM
to project-...@googlegroups.com
Let me share our production use case and environment as one reference point.

Cluster: 60 nodes
Hardware: 4-core CPU, 32 GB RAM, 256 GB SSD, and 500 GB HDD per node
Data: 1.2B user profile records, two replicas, average 40M records per node
Record: key is a long integer, value averages 1 KB
Throughput: average 200 transactions per second per node
Latency: average random R/W 1-2 ms on the client side; 60-100 microseconds on the server side, given a ~90% cache hit rate

It was designed to meet our company's needs: an ad-tech business that requires low latency and high throughput.
Hope that answers some of your questions.
If you need more information, please contact me.

Thanks 



Félix GV

Mar 30, 2016, 1:23:34 PM
to project-...@googlegroups.com
Thanks for sharing and for the details! Looks like a very cool project.

I am curious regarding your workload. What would you say is the limiting factor in this configuration? Is it memory-bound? CPU-bound? IO-bound?

I'm just trying to extrapolate which spec would need to be bumped up if one wanted denser nodes that can, say, store more keys per node or serve more QPS per node.

Also, I'm curious how server-side replication and failover work. Is it a master-slave model? Eventually consistent with conflict resolution? Or some other strategy?

-F

Mickey Hsieh

Mar 30, 2016, 3:40:26 PM
to project-...@googlegroups.com
Good questions!!

For random reads:
In most cases, if the value is in cache, it returns the result directly. If not, it takes one read from disk based on the metadata (keys and metadata are always in cache).
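
In rough pseudocode terms, the read path looks something like this. The class and interface names below are illustrative assumptions for the sketch, not the real CacheStore API:

// Illustrative read path: keys and metadata are always in memory, so a
// miss in the value cache costs exactly one positioned disk read.
public class ReadPathSketch {
    interface ValueCache { byte[] get(long key); void put(long key, byte[] value); }
    interface MetaIndex  { Long diskOffsetOf(long key); }   // always resident in memory
    interface DataFile   { byte[] readAt(long offset); }

    private final ValueCache cache;
    private final MetaIndex index;
    private final DataFile file;

    ReadPathSketch(ValueCache cache, MetaIndex index, DataFile file) {
        this.cache = cache; this.index = index; this.file = file;
    }

    public byte[] get(long key) {
        byte[] v = cache.get(key);
        if (v != null) return v;            // ~90% of reads in the use case above
        Long offset = index.diskOffsetOf(key);
        if (offset == null) return null;    // key does not exist
        v = file.readAt(offset);            // a single disk read on a cache miss
        cache.put(key, v);
        return v;
    }
}
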
For random writes:
It supports two modes, sync and async. In sync mode, it waits until the write succeeds and then returns; in async mode, the system puts the write into a queue and returns to the client right away.
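
A minimal sketch of the two write modes as seen from a client. Again, these names are illustrative assumptions, not the actual CacheStore client API:

import java.util.AbstractMap.SimpleEntry;
import java.util.Map.Entry;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative only: sketches the sync vs. async write semantics
// described above.
public class WriteModesSketch {
    private final BlockingQueue<Entry<Long, byte[]>> writeQueue = new LinkedBlockingQueue<>();

    public WriteModesSketch() {
        // Background drainer applies queued async writes in order.
        Thread drainer = new Thread(() -> {
            try {
                while (true) {
                    Entry<Long, byte[]> e = writeQueue.take();
                    writeToStore(e.getKey(), e.getValue());
                }
            } catch (InterruptedException ignored) { }
        });
        drainer.setDaemon(true);
        drainer.start();
    }

    // Sync mode: returns only after the write has succeeded.
    public void putSync(long key, byte[] value) {
        writeToStore(key, value);
    }

    // Async mode: enqueue and return to the caller right away;
    // the client is never blocked on disk I/O.
    public void putAsync(long key, byte[] value) {
        writeQueue.offer(new SimpleEntry<>(key, value));
    }

    private void writeToStore(long key, byte[] value) {
        // Placeholder for the actual disk/cache write path.
    }
}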

Queries and scan cursors are CPU-bound (see http://viant.github.io/CacheStore/index.html for details).

One thing you need to be aware of is that during bootstrap, it loads the keys and metadata (around 27 bytes per record on disk, around 60 bytes per record in memory). It can support up to 60M records on a 32 GB node in our production environment; I would not suggest going beyond that, since it might run out of memory depending on your memory footprint. So it fits low-latency use cases well, but you need more hardware.
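
To make the arithmetic concrete (the per-record figures are the ones quoted above; the rest is simple multiplication):

// Back-of-the-envelope check of the bootstrap memory figures above.
public class FootprintMath {
    public static void main(String[] args) {
        long records = 60_000_000L;   // suggested maximum per 32 GB node
        long bytesInMemory = 60L;     // key + metadata per record, in memory
        long total = records * bytesInMemory;
        // 60M * 60 B = 3.6 GB of key/metadata, leaving the rest of the
        // 32 GB for the value cache, heap overhead, and the OS.
        System.out.printf("key+metadata footprint: %.1f GB%n", total / 1e9);
    }
}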

CacheStore cluster nodes are active/active, not master/slave. You can shut down a replica node at any time; the client will automatically fail over as long as at least one replica node is up.
Replication in CacheStore is async, fault tolerant, and reliable. After a write to the main store, it first writes to a journal log, then sends the write to the replication port of the other cluster nodes. If a destination is down, replication resumes as soon as that cluster node comes back up.
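
A minimal sketch of that write-journal-ship flow, assuming hypothetical Store/Journal/Replica interfaces (these are not CacheStore internals, just the shape of the mechanism described above):

import java.util.List;

// Illustrative replication flow: apply locally, journal, then ship to
// replicas; the journal lets a downed replica catch up later.
public class ReplicationSketch {
    interface Store   { void put(long key, byte[] value); }
    interface Journal { long append(long key, byte[] value); }  // returns the log offset
    interface Replica { boolean send(long offset, long key, byte[] value); }

    private final Store store;
    private final Journal journal;
    private final List<Replica> replicas;

    ReplicationSketch(Store store, Journal journal, List<Replica> replicas) {
        this.store = store;
        this.journal = journal;
        this.replicas = replicas;
    }

    void write(long key, byte[] value) {
        store.put(key, value);                     // 1. apply to the main store
        long offset = journal.append(key, value);  // 2. persist to the journal log
        for (Replica r : replicas) {
            // 3. ship asynchronously; a replica that is down can replay
            //    from this journal offset once it comes back up.
            r.send(offset, key, value);
        }
    }
}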

It always maintains consistency: a write either succeeds or fails. Every write carries a global version number for its key, so for concurrent writes only one will succeed and the rest will get an exception.
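
A sketch of those version-checked write semantics, using a plain in-memory map as a stand-in for the store (illustrative only, not CacheStore code):

import java.util.concurrent.ConcurrentHashMap;

// Illustrative version-checked write: the store rejects any write whose
// expected version does not match the key's current global version.
public class VersionedWriteSketch {
    static class Versioned {
        final long version;
        final byte[] value;
        Versioned(long version, byte[] value) { this.version = version; this.value = value; }
    }

    private final ConcurrentHashMap<Long, Versioned> map = new ConcurrentHashMap<>();

    // Succeeds only if the caller saw the latest version; of two
    // concurrent writers, exactly one wins and the other gets an exception.
    public void put(long key, byte[] value, long expectedVersion) {
        map.compute(key, (k, current) -> {
            long currentVersion = (current == null) ? 0L : current.version;
            if (currentVersion != expectedVersion) {
                throw new IllegalStateException("stale version for key " + k);
            }
            return new Versioned(currentVersion + 1, value);
        });
    }
}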

