Hi all,
I have a service that acts as an LRU cache for a backend Redis store. It caches about 800M items, which consume about 15 GB of memory. It soon ran into GC problems. The CPU usage monitor (Ganglia) shows peaks every 2 minutes, and the gctrace log looks like the following:
gc 35 @603.815s 0%: 3.9+156+0.096+393+10 ms clock, 125+156+0+547/3146/8758+321 ms cpu, 2674->2696->1422 MB, 2742 MB goal, 32 P
gc 36 @636.661s 0%: 3.0+135+54+210+7.8 ms clock, 97+135+0+851/1625/8711+249 ms cpu, 2730->2746->1446 MB, 2800 MB goal, 32 P
gc 37 @670.483s 0%: 4.0+199+10+207+7.9 ms clock, 129+199+0+1933/1650/8515+255 ms cpu, 2789->2807->1479 MB, 2861 MB goal, 32 P
gc 38 @705.215s 0%: 4.3+101+17+585+10 ms clock, 138+101+0+583/3333/9336+340 ms cpu, 2849->2876->1518 MB, 2922 MB goal, 32 P
gc 39 @741.302s 0%: 4.6+181+0.080+325+9.0 ms clock, 150+181+0+242/2599/7452+289 ms cpu, 2907->2927->1539 MB, 2981 MB goal, 32 P
scvg4: inuse: 2278, idle: 717, sys: 2996, released: 0, consumed: 2996 (MB)
2015/11/03 15:12:34 error processing request: read tcp 10.4.15.78:9702->10.4.20.131:12955: read: connection reset by peer
2015/11/03 15:12:34 error processing request: read tcp 10.4.15.78:9702->10.4.20.131:12955: read: connection reset by peer
gc 40 @777.700s 0%: 4.3+218+15+382+13 ms clock, 140+218+0+441/3057/10811+417 ms cpu, 2963->2986->1571 MB, 3038 MB goal, 32 P
1) For this line:
gc 39 @741.302s 0%: 4.6+181+0.080+325+9.0 ms clock, 150+181+0+242/2599/7452+289 ms cpu, 2907->2927->1539 MB, 2981 MB goal, 32 P
does it mean the STW (stop-the-world) time is 150 + 289 ms?
2) After the service runs for a while, "GC forced" lines appear in the log, and it seems that every GC is then triggered right after a "GC forced" line:
GC forced
gc 486 @56747.960s 0%: 24+80+459+3840+44 ms clock, 368+80+0+1.7/15323/48398+668 ms cpu, 13987->14152->9635 MB, 28404 MB goal, 16 P
2015/11/03 14:51:34 error processing request: read tcp 10.4.15.78:9702->10.4.21.194:45694: read: connection reset by peer
2015/11/03 14:51:34 error processing request: read tcp 10.4.15.78:9702->10.4.21.194:45694: read: connection reset by peer
scvg378: 3 MB released
scvg378: inuse: 14349, idle: 2366, sys: 16716, released: 1652, consumed: 15063 (MB)
2015/11/03 14:51:46 error processing request: read tcp 10.4.15.78:9702->10.4.17.106:29327: read: connection reset by peer
2015/11/03 14:51:46 error processing request: read tcp 10.4.15.78:9702->10.4.17.106:29327: read: connection reset by peer
3) Any hints on how to optimize it?
Thanks