High CPU usage


jam+

Nov 11, 2013, 11:33:36 PM
to project-...@googlegroups.com
Hi,

Our VDM cluster is showing high CPU usage.
So far I have no idea what the root cause is.
There are no errors or exceptions in the logs and no unstable status; all I can see is CPU usage reaching 100%.

Disk space
/dev/md0               63G   45G   16G  75% /opt/vdm

server.properties

max.threads=100
client.max.connections.per.node=50

############### DB options ######################

data.directory=/opt/vdm/service/data

http.enable=true
socket.enable=true

slop.pusher.enable=true

# BDB
bdb.write.transactions=true
bdb.flush.transactions=false
bdb.cache.size=11000m

# Mysql
mysql.host=localhost
mysql.port=1521
mysql.user=root
mysql.password=3306
mysql.database=test

#NIO connector settings.
enable.nio.connector=true

storage.configs=voldemort.store.bdb.BdbStorageConfiguration, voldemort.store.readonly.ReadOnlyStorageConfiguration, voldemort.store.memory.CacheStorageConfiguration


Here is my jstack dump: https://cloudup.com/cNJDQplqskC


Hope someone can give me a hint, thanks!


jam+

Nov 11, 2013, 11:42:46 PM
to project-...@googlegroups.com
BTW, here is top:
9246 csrunner  15   0 12.9g  10g 5120 S 102.2 68.8  76729:04 java

here is free:
jam[0]0$ free -m
             total       used       free     shared    buffers     cached
Mem:         15360      15325         34          0         36       4182
-/+ buffers/cache:      11105       4254
Swap:         2047        505       1542

Memory usage of the java process is almost 70%, and the machine is dipping into swap...


On Tuesday, November 12, 2013 at 12:33:36 PM UTC+8, jam+ wrote:

jam+

Nov 12, 2013, 4:09:39 AM
to project-...@googlegroups.com
Here is some more information.



Thanks.

On Tuesday, November 12, 2013 at 12:42:46 PM UTC+8, jam+ wrote:

Esteban Donato

Nov 12, 2013, 6:48:08 AM
to project-...@googlegroups.com
Did you check whether you are running full GCs (FGC) too frequently? What's your max heap size? It seems all of your heap is being used by the BDB cache.
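
One way to answer the full-GC question is to sample `jstat -gcutil <pid>` twice and compare the `FGC` (full GC count) and `FGCT` (cumulative full GC time, in seconds) columns. The helper below is only a sketch of that arithmetic: the function name and the sample readings are hypothetical and are not taken from the poster's server.

```python
def full_gc_overhead(fgc_before, fgct_before, fgc_after, fgct_after, interval_s):
    """Return (full GCs per minute, fraction of wall-clock time spent in full GC).

    The four readings are the FGC and FGCT values from two jstat samples
    taken interval_s seconds apart.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    collections = fgc_after - fgc_before
    seconds_in_gc = fgct_after - fgct_before
    return collections * 60.0 / interval_s, seconds_in_gc / interval_s


# Hypothetical readings taken 60 seconds apart (illustrative values only).
per_minute, fraction = full_gc_overhead(1200, 5400.0, 1212, 5442.0, 60)
print(f"{per_minute:.1f} full GCs/min, {fraction:.0%} of wall time in full GC")
```

A fraction that stays high across several samples would point at GC as the source of the CPU load; a value near zero would suggest looking elsewhere, for example at the thread dump.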


--
You received this message because you are subscribed to the Google Groups "project-voldemort" group.
To unsubscribe from this group and stop receiving emails from it, send an email to project-voldem...@googlegroups.com.
Visit this group at http://groups.google.com/group/project-voldemort.
For more options, visit https://groups.google.com/groups/opt_out.

Brendan Harris (a.k.a. stotch on irc.oftc.net)

Nov 12, 2013, 11:33:09 AM
to project-...@googlegroups.com
Hi Jam,

Like Esteban asked, please give us your JVM config (the full config).

Also ...


storage.configs=voldemort.store.bdb.BdbStorageConfiguration, voldemort.store.readonly.ReadOnlyStorageConfiguration, voldemort.store.memory.CacheStorageConfiguration

I don't recommend running more than one storage configuration on a single JVM instance. The objects, their creation rate and lifespans are very different from storage engine to storage engine. The JVM's automated GC is not generally adequate for such a complex system of objects. The JVM's GC activity could keep the CPU very busy.

Thanks,

Brendan

jam+

Nov 12, 2013, 10:26:49 PM
to project-...@googlegroups.com
Thank you for all your replies.


Here is the JVM config (I trimmed the classpath part):

java -Xms12g -Xmx12g -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=70 -XX:SurvivorRatio=2 -XX:+PrintTenuringDistribution -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/opt/vdm-0.95/bin/../logs/gc-vdm.log -XX:NewSize=512m -XX:MaxNewSize=512m -XX:MaxPermSize=160M -Dlog4j.configuration=file:///opt/vdm-0.95/conf/log4j.properties -Dcom.sun.management.jmxremote.port=9001 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.library.path=/opt/vdm-0.95/bin/../lib/boot -classpath /opt/vdm-0.95/lib/.. -Dwrapper.key=B6OXdyhr5fM8dD83 -Dwrapper.port=32000 -Dwrapper.jvm.port.min=31000 -Dwrapper.jvm.port.max=31999 -Dwrapper.pid=24254 -Dwrapper.version=3.2.3 -Dwrapper.native_library=wrapper -Dwrapper.service=TRUE -Dwrapper.cpu.timeout=10 -Dwrapper.jvmid=1 org.tanukisoftware.wrapper.WrapperSimpleApp voldemort.server.VoldemortServer


For the storage.configs, we will figure out how to separate these instances.
Thanks for the advice! 



On Wednesday, November 13, 2013 at 12:33:09 AM UTC+8, Brendan Harris (a.k.a. stotch on irc.oftc.net) wrote:

Brendan Harris (a.k.a. stotch on irc.oftc.net)

Nov 12, 2013, 11:04:58 PM
to project-...@googlegroups.com
So, your bdb.cache.size is 11g and your jvm heap size is 12g. 0.5g of the heap is for newgen, leaving roughly 11.4g for oldgen, which is inevitably where your bdb cache objects will land and remain for a long time. If you want a cache size that large, you're going to need to bump your heap size up to at least 16g and you should probably give at least 1g to newgen. You're probably spending most of your time in GC, which is probably what is keeping the CPU busy. You may need a much larger heap than even 16g (and larger newgen) depending upon your throughput rate. How many queries per second are you serving?
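
The heap arithmetic above can be checked with a few lines of Python. This is a rough sketch, not a sizing tool: the numbers are copied from the posted `server.properties` and JVM flags, the function name is made up for illustration, and it assumes the whole BDB cache ends up in oldgen and that `CMSInitiatingOccupancyFraction` is the oldgen occupancy at which CMS starts a cycle.

```python
# Figures copied from the thread; the helper itself is illustrative.
HEAP_MB = 12 * 1024      # -Xms12g -Xmx12g
NEW_GEN_MB = 512         # -XX:NewSize=512m -XX:MaxNewSize=512m
BDB_CACHE_MB = 11000     # bdb.cache.size=11000m
CMS_TRIGGER_PCT = 70     # -XX:CMSInitiatingOccupancyFraction=70


def old_gen_budget(heap_mb, new_gen_mb, cache_mb, cms_trigger_pct):
    """Compare the BDB cache size with the space available in oldgen."""
    old_gen_mb = heap_mb - new_gen_mb
    return {
        "old_gen_mb": old_gen_mb,
        # Share of oldgen the cache would occupy once it is full.
        "cache_share_of_old_gen": cache_mb / old_gen_mb,
        # Occupancy (in MB) at which CMS is asked to start collecting.
        "cms_trigger_mb": old_gen_mb * cms_trigger_pct / 100,
        # What is left in oldgen for everything that is not BDB cache.
        "headroom_mb": old_gen_mb - cache_mb,
    }


budget = old_gen_budget(HEAP_MB, NEW_GEN_MB, BDB_CACHE_MB, CMS_TRIGGER_PCT)
for name, value in budget.items():
    print(f"{name}: {value:.2f}")
```

Under those assumptions the full cache alone sits above the CMS trigger point, which fits the picture of a collector that never gets to rest. Calling `old_gen_budget(16 * 1024, 1024, BDB_CACHE_MB, CMS_TRIGGER_PCT)` runs the same comparison for the 16g heap and 1g newgen suggested above.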

Also, if you're running voldemort 0.95 (I assume that is what /opt/vdm-0.95 is), you're _very_ out of date and should upgrade to 1.2.0+ to get all of the performance improvements.

Lastly, with the bdb cache consuming 90% of the jvm heap, there's probably no room for the read-only and in-memory storage engines to run properly, so you're probably just stuck in a non-stop GC loop trying to allocate for all three engines in barely enough heap space for even one engine.

Brendan

jam+

Nov 12, 2013, 11:13:09 PM
to project-...@googlegroups.com
Thanks, that's really helpful! We will look at rearranging the settings and may upgrade to 1.3 as well.

Thanks again!

On Wednesday, November 13, 2013 at 12:04:58 PM UTC+8, Brendan Harris (a.k.a. stotch on irc.oftc.net) wrote: