YBC - fast in-process out-of-GC cache with persistence support

Aliaksandr Valialkin

Dec 20, 2012, 12:22:42 PM

to golan...@googlegroups.com

Hi all,

I already posted a topic here about a memcached implementation in Go - see https://groups.google.com/d/topic/golang-nuts/O-N5Pc2ErBY/discussion . This implementation is based on the YBC caching library - https://github.com/valyala/ybc/tree/master/bindings/go/ybc , which has the following features:

* It is an extremely fast key-value cache.

* Its performance doesn't depend on the number of cached items or on the size of the cache.

* Since the cache manages its memory itself, i.e. it is out-of-GC, it scales to billions of items without the negative impact on application performance caused by increased GC pauses.

* It supports cache sizes exceeding available RAM by multiple orders of magnitude.

* It supports cache data persistence - i.e. cached data may survive application restarts.

* It can cache huge blobs with sizes up to 2 GB each.

* The speed of the object eviction algorithm doesn't depend on the number or the size of cached items.

This library shows the following performance numbers, in millions of queries per second (Mqps), on my not-so-fast laptop (Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz):

GetMiss: 3.5 Mqps

GetHit: 1.9 Mqps

Set: 2.5 Mqps

Library documentation is available at http://godoc.org/github.com/valyala/ybc/bindings/go/ybc .

Currently the library works only on Linux, but it should be easy to port to other platforms, since all platform-specific code is located in a single folder: https://github.com/valyala/ybc/tree/master/platform .

Aliaksandr Valialkin

Dec 24, 2012, 4:42:08 PM

to golang-nuts

Small FAQ:

Q: Why is YBC better than a simple map[string][]byte?

A: Because it has the following features, which map[string][]byte lacks:

* Automatic cache size management. You can continuously add millions of new objects to YBC and it will never exceed the maximum size set via Config.DataFileSize. With map[string][]byte you have to implement a cache eviction mechanism yourself to prevent unbounded cache growth.

* Out-of-GC memory management. YBC can contain billions of objects without the negative performance impact of increased GC pauses. With map[string][]byte every stored object increases GC pauses, since the GC must periodically traverse all live objects in memory, including the objects in the map.

* YBC is thread-safe out of the box. It supports concurrent read/write access to objects under the same key from multiple threads (goroutines). With map[string][]byte you have to carefully design and implement thread safety yourself.

* YBC is optimized for cache sizes exceeding available RAM. It writes data only sequentially into a contiguous memory region. This guarantees that write speed won't be limited by the number of disk seeks per second when the cache size exceeds available RAM; instead it may be limited by the sequential write speed of the backing store (either HDD or SSD, or an arbitrary device hidden behind the filesystem). On the read side, YBC tries to pack frequently accessed small objects into the smallest possible contiguous memory region. If this region fits in RAM (i.e. if the hot data fits in RAM), then read speed won't be limited by the number of disk seeks per second. YBC doesn't use swap - it backs cache memory with special files, and you can always put these files on the fastest available storage (see Config.DataFile and Config.IndexFile). You can even shard these files across distinct storage devices for higher throughput (see ClusterConfig). With map[string][]byte you can't control the cache memory layout, so there is a high probability that it will swap like hell for cache sizes exceeding available RAM, even if the hot data size is smaller than RAM.

* YBC can persist cache contents to files, so cached objects won't be lost across application restarts or on an application crash. With map[string][]byte all cached objects are lost on application exit.
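To make the first and third points concrete, here is a minimal sketch of the bookkeeping a plain map[string][]byte needs before it can act as a bounded, goroutine-safe cache. The type boundedMap and its FIFO eviction policy are hypothetical names and choices made for this illustration only; they are not part of YBC and say nothing about how YBC evicts items.

```go
package main

import (
	"fmt"
	"sync"
)

// boundedMap is a hypothetical hand-rolled cache around map[string][]byte.
// It is NOT part of YBC; it only shows the extra work a plain map requires.
type boundedMap struct {
	mu       sync.Mutex        // guards all fields below
	items    map[string][]byte // cached values
	order    []string          // keys in first-insertion order, oldest first
	size     int               // total bytes of stored values
	maxBytes int               // upper bound for size
}

func newBoundedMap(maxBytes int) *boundedMap {
	return &boundedMap{items: make(map[string][]byte), maxBytes: maxBytes}
}

// Set stores a copy of value, then evicts the oldest keys until the
// total value size fits into maxBytes again.
func (c *boundedMap) Set(key string, value []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if old, ok := c.items[key]; ok {
		c.size -= len(old)
	} else {
		c.order = append(c.order, key)
	}
	c.items[key] = append([]byte(nil), value...)
	c.size += len(value)
	for c.size > c.maxBytes && len(c.order) > 0 {
		oldest := c.order[0]
		c.order = c.order[1:]
		c.size -= len(c.items[oldest])
		delete(c.items, oldest)
	}
}

// Get returns the cached value for key, if it is still present.
func (c *boundedMap) Get(key string) ([]byte, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	v, ok := c.items[key]
	return v, ok
}

func main() {
	c := newBoundedMap(8)
	c.Set("a", []byte("1234"))
	c.Set("b", []byte("5678"))
	c.Set("c", []byte("90")) // 10 bytes in total, so "a" is evicted
	_, okA := c.Get("a")
	vC, okC := c.Get("c")
	fmt.Println(okA, string(vC), okC) // false 90 true
}
```

Even with this extra code, every value is still an ordinary heap object that the GC has to scan, and nothing here addresses memory layout or persistence.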

Q: Why is YBC better than memcache, redis, couchbase, your-favorite-key-value-service?

A: Because, unlike all out-of-process services, YBC manages its memory inside the application's process. It has no RPC overhead, so it will always be faster than any existing or future out-of-process service. Of course, pure YBC won't help you when you need a cache shared among multiple processes. In this case you can build a thin RPC wrapper on top of YBC or just use the out-of-the-box solution - go-memcached - a memcache server already built on top of YBC :)

Q: Why is YBC better than leveldb, dbm, your-favorite-key-value-library?

A: Because these libraries implement storage, not a cache. A cache is free to evict an arbitrary object at an arbitrary time for an arbitrary reason, while storage must preserve all objects put into it. This freedom in object lifetime management provides two benefits to cache implementations:

* Cache size can easily be limited without external help, while storage size must be controlled by external tools (aka 'cleaners').

* Better potential performance.
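The freedom to evict can be illustrated with a small sketch. It assumes a fixed-size byte buffer written strictly sequentially, where wrapping around silently overwrites (evicts) the oldest values. The names ringCache and entry are hypothetical, and this is not a description of YBC's actual internals; it only shows why a cache, unlike storage, can bound its size with no separate cleaner.

```go
package main

import "fmt"

// entry records where a value was written. Hypothetical type for this sketch.
type entry struct {
	pos uint64 // logical offset of the first value byte
	n   int    // value length in bytes
}

// ringCache keeps values in a fixed buffer that is written sequentially.
// It is NOT YBC's implementation and is not safe for concurrent use.
type ringCache struct {
	buf     []byte
	written uint64 // total logical bytes written so far
	index   map[string]entry
}

func newRingCache(size int) *ringCache {
	return &ringCache{buf: make([]byte, size), index: make(map[string]entry)}
}

// Set appends value at the current write position. Older values that
// occupy the same bytes are overwritten, which is the whole eviction step.
func (c *ringCache) Set(key string, value []byte) bool {
	size := uint64(len(c.buf))
	if uint64(len(value)) > size {
		return false // value can never fit
	}
	off := c.written % size
	if off+uint64(len(value)) > size {
		c.written += size - off // skip the tail so the value stays contiguous
		off = 0
	}
	copy(c.buf[off:], value)
	c.index[key] = entry{pos: c.written, n: len(value)}
	c.written += uint64(len(value))
	return true
}

// Get returns a copy of the value unless its bytes have been overwritten.
func (c *ringCache) Get(key string) ([]byte, bool) {
	e, ok := c.index[key]
	if !ok {
		return nil, false
	}
	size := uint64(len(c.buf))
	if c.written-e.pos > size {
		delete(c.index, key) // stale: the write position has lapped this value
		return nil, false
	}
	off := e.pos % size
	return append([]byte(nil), c.buf[off:off+uint64(e.n)]...), true
}

func main() {
	c := newRingCache(8)
	c.Set("a", []byte("aaaa"))
	c.Set("b", []byte("bbbb"))
	c.Set("c", []byte("cccc")) // wraps around and overwrites "a"
	_, okA := c.Get("a")
	vB, okB := c.Get("b")
	vC, okC := c.Get("c")
	fmt.Println(okA, string(vB), okB, string(vC), okC) // false bbbb true cccc true
}
```

Eviction here costs nothing beyond the write itself, regardless of how many items are cached. The sketch leaves the index map unbounded and removes stale keys only lazily on Get; a real cache would have to bound the index as well.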

Q: Where can YBC be used?

A: In any application that may benefit from an in-process cache:

* forward and reverse caching web proxies such as squid, varnish, nginx, etc.;

* CDNs;

* web browsers;

* file hosting services such as youtube, picasa, dropbox, instagram, facebook photos;

* scientific applications.

IMHO, it would be great if a caching library similar to YBC were bundled into future Go releases.

--
Best Regards,

Aliaksandr

Aliaksandr Valialkin

Dec 24, 2012, 5:10:19 PM

to golang-nuts, vit...@googlecode.com

YBC could be used by vitess for its row cache. According to their wiki:

Go’s existing mark-and-sweep garbage collector is sub-optimal for systems that use large amounts of static memory (like caches). In the case of vtocc, this would be the row cache. To alleviate this, we intend to use memcache for the time being. If the gc ends up addressing this, it should be fairly trivial to switch to an in-memory row cache

--
Best Regards,

Aliaksandr

Aliaksandr Valialkin

Mar 18, 2013, 10:29:47 AM

to golan...@googlegroups.com

The YBC library now provides the SimpleCache API, which is easier to use than the Cache API. It provides straightforward methods for a key-value cache - Get(), Set() and Delete(). Besides that, SimpleCache.Get() performance scales better on multi-CPU systems than Cache.Get().
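As a rough illustration of how an application might use a cache with these three operations, here is a cache-aside sketch. The kvCache interface and its signatures are assumptions made for this example, not YBC's actual SimpleCache API (see the library documentation for the real signatures). mapCache is an in-memory stand-in so that the example runs without YBC.

```go
package main

import "fmt"

// kvCache is a HYPOTHETICAL interface with the three operations named above.
// The signatures are assumptions, not YBC's actual SimpleCache API.
type kvCache interface {
	Get(key []byte) ([]byte, bool)
	Set(key, value []byte)
	Delete(key []byte)
}

// mapCache is an in-memory stand-in so the example runs without YBC.
type mapCache map[string][]byte

func (m mapCache) Get(key []byte) ([]byte, bool) {
	v, ok := m[string(key)]
	return v, ok
}

func (m mapCache) Set(key, value []byte) {
	m[string(key)] = value
}

func (m mapCache) Delete(key []byte) {
	delete(m, string(key))
}

// getOrLoad returns the cached value for key and calls load only on a miss.
func getOrLoad(c kvCache, key []byte, load func() []byte) []byte {
	if v, ok := c.Get(key); ok {
		return v
	}
	v := load()
	c.Set(key, v)
	return v
}

func main() {
	var c kvCache = mapCache{}
	calls := 0
	load := func() []byte {
		calls++ // counts how often the slow path runs
		return []byte("expensive result")
	}

	key := []byte("report")
	getOrLoad(c, key, load) // miss: load runs
	getOrLoad(c, key, load) // hit: load is skipped
	c.Delete(key)           // invalidate the entry
	getOrLoad(c, key, load) // miss again: load runs

	fmt.Println(calls) // 2
}
```

Since a cache may evict any item at any time, callers should always be prepared for a miss, which is why the load path is part of every lookup.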

P.S. Could someone with write access to https://code.google.com/p/go-wiki/wiki/Projects add YBC to the Caching section with the following description: "Fast in-process out-of-GC key-value cache with persistence support. Optimized for SSDs".
