golang 1.5.1 GC problem for large memory cache service


Huabin Zheng

Nov 3, 2015, 8:11:22 AM
to golang-nuts
Hi all,

    I have a service that acts as an LRU cache in front of backend Redis storage. It caches about 800M items, which consume about 15GB of memory. It soon ran into GC problems: the CPU usage monitor (Ganglia) shows peaks every 2 minutes, and the gctrace log looks like the following:
gc 35 @603.815s 0%: 3.9+156+0.096+393+10 ms clock, 125+156+0+547/3146/8758+321 ms cpu, 2674->2696->1422 MB, 2742 MB goal, 32 P
gc 36 @636.661s 0%: 3.0+135+54+210+7.8 ms clock, 97+135+0+851/1625/8711+249 ms cpu, 2730->2746->1446 MB, 2800 MB goal, 32 P
gc 37 @670.483s 0%: 4.0+199+10+207+7.9 ms clock, 129+199+0+1933/1650/8515+255 ms cpu, 2789->2807->1479 MB, 2861 MB goal, 32 P
gc 38 @705.215s 0%: 4.3+101+17+585+10 ms clock, 138+101+0+583/3333/9336+340 ms cpu, 2849->2876->1518 MB, 2922 MB goal, 32 P
gc 39 @741.302s 0%: 4.6+181+0.080+325+9.0 ms clock, 150+181+0+242/2599/7452+289 ms cpu, 2907->2927->1539 MB, 2981 MB goal, 32 P
scvg4: inuse: 2278, idle: 717, sys: 2996, released: 0, consumed: 2996 (MB)
2015/11/03 15:12:34 error processing request: read tcp 10.4.15.78:9702->10.4.20.131:12955: read: connection reset by peer
2015/11/03 15:12:34 error processing request: read tcp 10.4.15.78:9702->10.4.20.131:12955: read: connection reset by peer
gc 40 @777.700s 0%: 4.3+218+15+382+13 ms clock, 140+218+0+441/3057/10811+417 ms cpu, 2963->2986->1571 MB, 3038 MB goal, 32 P


1) For this line:
gc 39 @741.302s 0%: 4.6+181+0.080+325+9.0 ms clock, 150+181+0+242/2599/7452+289 ms cpu, 2907->2927->1539 MB, 2981 MB goal, 32 P
does this mean the STW time is 150 + 289 ms?

2) After the service runs for a while, there are "GC forced" lines in the log, and it seems that every GC is triggered by this forced GC:
GC forced
gc 486 @56747.960s 0%: 24+80+459+3840+44 ms clock, 368+80+0+1.7/15323/48398+668 ms cpu, 13987->14152->9635 MB, 28404 MB goal, 16 P
2015/11/03 14:51:34 error processing request: read tcp 10.4.15.78:9702->10.4.21.194:45694: read: connection reset by peer
2015/11/03 14:51:34 error processing request: read tcp 10.4.15.78:9702->10.4.21.194:45694: read: connection reset by peer
scvg378: 3 MB released
scvg378: inuse: 14349, idle: 2366, sys: 16716, released: 1652, consumed: 15063 (MB)
2015/11/03 14:51:46 error processing request: read tcp 10.4.15.78:9702->10.4.17.106:29327: read: connection reset by peer
2015/11/03 14:51:46 error processing request: read tcp 10.4.15.78:9702->10.4.17.106:29327: read: connection reset by peer

3) Any hints on how to optimize this?

Thanks

Tamás Gulácsi

Nov 3, 2015, 3:31:05 PM
to golang-nuts
Do you really need that many items?
Map or slice? A slice is better.
Can you eliminate pointers? That would help the GC, too.
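For example, a minimal sketch of a pointer-free layout (type and field names invented): if neither the map's key type nor its value type contains a pointer, the runtime can skip scanning the map's buckets during GC, so mark time stops growing with the number of cached entries.

package main

import "fmt"

// Pointer-free value: a fixed-size payload plus an int64 expiry instead of a
// string or []byte payload and a time.Time (all of which contain pointers).
type entry struct {
	expiresAt int64
	payload   [64]byte
}

// Pointer-free key: a fixed-size array instead of a string.
type key [16]byte

func main() {
	cache := make(map[key]entry, 1024)

	var k key
	copy(k[:], "user:12345")

	var e entry
	copy(e.payload[:], "hot value")
	cache[k] = e

	got := cache[k]
	fmt.Printf("%s\n", got.payload[:9])
}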

Huabin Zheng

Nov 4, 2015, 4:02:01 AM
to golang-nuts
Yes, I need to cache enough items to hold hot data.

I used this LRU cache: https://github.com/bluele/gcache

Is cgo helpful in such a case?

Tamás Gulácsi

Nov 4, 2015, 5:56:07 AM
to golang-nuts
That library has so many quirks that it would need a little rewrite for performance...
Just quirks: function pointer pointers, expiration as *time.Duration...
Performance: it uses map[interface{}]*list.List.

What do you need to store in the cache?

I'd first try github.com/dgryski/go-arc, then implement something like a big []byte backend and store the starting and ending indices in the cache (though eviction and free lists are troublesome).
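Roughly like this sketch, which assumes append-only writes and skips eviction and free lists entirely (all names invented):

package main

import "fmt"

// span records where a value lives inside the slab.
type span struct {
	start, end int
}

type slabCache struct {
	slab  []byte          // one big allocation; its bytes contain no pointers, so the GC never scans them
	index map[string]span // key -> offsets into the slab
}

func newSlabCache(capacity int) *slabCache {
	return &slabCache{
		slab:  make([]byte, 0, capacity),
		index: make(map[string]span),
	}
}

func (c *slabCache) Set(key string, value []byte) {
	start := len(c.slab)
	c.slab = append(c.slab, value...)
	c.index[key] = span{start: start, end: len(c.slab)}
}

func (c *slabCache) Get(key string) ([]byte, bool) {
	s, ok := c.index[key]
	if !ok {
		return nil, false
	}
	return c.slab[s.start:s.end], true
}

func main() {
	c := newSlabCache(1 << 20)
	c.Set("answer", []byte("42"))
	if v, ok := c.Get("answer"); ok {
		fmt.Println(string(v))
	}
}

The string keys in the index still contain pointers, so the index itself gets scanned; the win is that the cached bytes themselves do not.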

Nov 4, 2015, 7:12:11 AM
to golang-nuts
On Tuesday, November 3, 2015 at 2:11:22 PM UTC+1, Huabin Zheng wrote:
The CPU usage monitor (Ganglia) shows peaks every 2 minutes ...
 
2) After the service runs for a while, there are "GC forced" lines in the log, and it seems that every GC is triggered by this forced GC:
GC forced
gc 486 @56747.960s 0%: 24+80+459+3840+44 ms clock, 368+80+0+1.7/15323/48398+668 ms cpu, 13987->14152->9635 MB, 28404 MB goal, 16 P
How long does the peak last (for gc 486 and similar)? 4 seconds? Is the CPU usage on most of the 16 cores close to 100% throughout the peak?

Huabin Zheng

Nov 4, 2015, 9:48:18 AM
to golang-nuts
I want to store key:value pairs in the cache, where the key is a string and the value is a user-defined struct, e.g.

cache <string, T>

cache.set(k, v)
cache.get(k) -> v

but it should have a size limit and automatically evict items.
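In pre-generics Go, that amounts to an interface like this hypothetical sketch, with values stored as interface{}:

package cache

// Cache is a hypothetical sketch of the API described above; a concrete
// implementation would enforce a maximum size and evict the least recently
// used entry when Set would exceed it.
type Cache interface {
	Set(key string, value interface{})
	Get(key string) (value interface{}, ok bool)
	Len() int
}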

Huabin Zheng

Nov 4, 2015, 9:59:13 AM
to golang-nuts
No, the CPU usage is very low. I set GOMAXPROCS to 16, but the CPU usage peak is less than 200%.

Today I decreased the LRU cache size to 1/8 of the original size, and it performs quite well.

Is there any way to skip GC for the items saved in the LRU cache?

Damian Gryski

Nov 4, 2015, 10:31:32 AM
to golang-nuts


On Wednesday, November 4, 2015 at 11:56:07 AM UTC+1, Tamás Gulácsi wrote:

I'd first try github.com/dgryski/go-arc, then implement something like a big []byte backend and store the starting and ending indices in the cache (though eviction and free lists are troublesome).

ARC, while effective, is also a fairly slow algorithm: there is a lot to do on each update. It might be worth looking at https://github.com/dgryski/go-s4lru or https://github.com/dgryski/go-clockpro.

Damian

Jason E. Aten

Nov 4, 2015, 12:17:59 PM
to golang-nuts
Doesn't implement LRU for you, but my offheap hash table shows how to avoid GC completely for very large hash tables in Go. 

https://github.com/glycerine/offheap

You could easily have the hash table index into a ring buffer instead of storing the data inline in the table (e.g. https://github.com/glycerine/rbuf). You'd have to validate that each key hadn't expired before returning the value stored in the ringbuf, but that's pretty easy to accomplish by comparing timestamps and/or checksums on the data.
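For illustration, a minimal sketch of that general shape — not the offheap/rbuf API, all names invented, Unix-only — keeping the data in an anonymous mmap region the GC never scans, with only a small index of offsets and timestamps on the Go heap and a TTL check before returning a value:

package main

import (
	"fmt"
	"syscall"
	"time"
)

// slot is the small on-heap record for one cached value.
type slot struct {
	off, n   int
	storedAt int64 // unix nanoseconds, checked against the TTL on Get
}

type offHeapCache struct {
	buf   []byte          // anonymous mmap region: outside the Go heap, never scanned
	used  int
	index map[string]slot // small index kept on the normal Go heap
	ttl   time.Duration
}

func newOffHeapCache(size int, ttl time.Duration) (*offHeapCache, error) {
	buf, err := syscall.Mmap(-1, 0, size,
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE)
	if err != nil {
		return nil, err
	}
	return &offHeapCache{buf: buf, index: make(map[string]slot), ttl: ttl}, nil
}

func (c *offHeapCache) Set(key string, val []byte) bool {
	if c.used+len(val) > len(c.buf) {
		return false // full; a real cache would evict or wrap around here
	}
	off := c.used
	copy(c.buf[off:], val)
	c.used += len(val)
	c.index[key] = slot{off: off, n: len(val), storedAt: time.Now().UnixNano()}
	return true
}

func (c *offHeapCache) Get(key string) ([]byte, bool) {
	s, ok := c.index[key]
	if !ok || time.Since(time.Unix(0, s.storedAt)) > c.ttl {
		return nil, false // missing or expired
	}
	return c.buf[s.off : s.off+s.n], true
}

func main() {
	c, err := newOffHeapCache(1<<20, time.Minute)
	if err != nil {
		panic(err)
	}
	c.Set("k", []byte("hello"))
	if v, ok := c.Get("k"); ok {
		fmt.Println(string(v))
	}
}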