Hi, I need your opinion about some strange behavior that I am seeing on my Voldemort platform.
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: [10:25:04,067 voldemort.server.scheduler.slop.StreamingSlopPusherJob] INFO Completed streaming slop pusher job which started at Tue Dec 29 1...cutor$Worker]
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at java.lang.Thread.run(Thread.java:745)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: Caused by: voldemort.store.UnreachableStoreException: Failure while checking out socket for bcn1-cache-vold-095p1:6666(vp1):
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at voldemort.store.UnreachableStoreException.wrap(UnreachableStoreException.java:41)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at voldemort.store.socket.clientrequest.ClientRequestExecutorPool.checkout(ClientRequestExecutorPool.java:214)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at voldemort.store.socket.SocketStore.request(SocketStore.java:278)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at voldemort.store.socket.SocketStore.get(SocketStore.java:200)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at voldemort.store.socket.SocketStore.get(SocketStore.java:62)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at voldemort.store.serialized.SerializingStore.get(SerializingStore.java:107)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at voldemort.client.AbstractStoreClientFactory.getRemoteMetadata(AbstractStoreClientFactory.java:579)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at voldemort.client.SocketStoreClientFactory.getRemoteMetadata(SocketStoreClientFactory.java:97)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: ... 16 more
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: Caused by: java.net.ConnectException: ClientRequestExecutor timed out for destination bcn1-cache-vold-095p1:6666(vp1)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$1.requestComplete(ClientRequestExecutorFactory.java:210)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at voldemort.store.socket.clientrequest.NonblockingStoreCallbackClientRequest.invokeCallback(NonblockingStoreCallbackClientRequest.java:68)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at voldemort.store.socket.clientrequest.NonblockingStoreCallbackClientRequest.timeOut(NonblockingStoreCallbackClientRequest.java:128)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at voldemort.store.socket.clientrequest.ClientRequestExecutor.completeClientRequest(ClientRequestExecutor.java:358)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at voldemort.store.socket.clientrequest.ClientRequestExecutor.close(ClientRequestExecutor.java:200)
Dec 29 10:25:04 bcn1-cache-vold-095p2. voldemort-server.sh[656]: at voldemort.store.socket.clientrequest.ClientRequestExecutor.checkTimeout(ClientRequestExecutor.java:108)
It seems that we have a connection problem: if we check the JMX graphs, we can see connection problems on some servers.
On the other hand, we can see strange behavior in DaemonTheadControl.
In the Voldemort logs, we have connection problems at the same time (the Ethernet interfaces of the servers are not busy).
GC time at that moment is 30 ms.
So we know that we are probably hitting some limit in the Voldemort server or client, but we can't find it.
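One way to narrow this down is to time plain TCP connects to the socket port that appears in the stack trace (bcn1-cache-vold-095p1:6666). If plain connects also stall or fail while the interfaces are idle, the limit is more likely below Voldemort (kernel or network) than in its connection pool. The following is only a minimal sketch: the class name `ConnectProbe`, the 2-second timeout and the 10 attempts are illustrative assumptions, not Voldemort settings.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Voldemort: measures raw TCP connect time.
class ConnectProbe {

    /** Returns the connect time in milliseconds, or -1 if the connect failed or timed out. */
    static long connectMillis(String host, int port, int timeoutMs) {
        long start = System.nanoTime();
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return (System.nanoTime() - start) / 1_000_000;
        } catch (IOException e) {
            // Covers connection refused, unknown host and SocketTimeoutException.
            return -1;
        }
    }

    public static void main(String[] args) {
        // Defaults are taken from the log above; override them on the command line.
        String host = args.length > 0 ? args[0] : "bcn1-cache-vold-095p1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 6666;
        for (int i = 0; i < 10; i++) {
            System.out.println(host + ":" + port + " connect ms = "
                    + connectMillis(host, port, 2000));
        }
    }
}
```

Running it from the node that logs the timeouts, during a window in which the errors occur, would show whether the delay happens before Voldemort is involved at all.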
Finally, these are our Java and server options:
node.id=1
max.threads=20000
############### DB options ######################
http.enable=true
socket.enable=true
# BDB
bdb.write.transactions=false
bdb.flush.transactions=false
bdb.cache.size=17G
bdb.one.env.per.store=true
# Mysql
mysql.host=localhost
mysql.port=1521
mysql.user=root
mysql.password=3306
mysql.database=test
#NIO connector settings.
enable.nio.connector=true
request.format=vp3
storage.configs=voldemort.store.bdb.BdbStorageConfiguration, voldemort.store.readonly.ReadOnlyStorageConfiguration, voldemort.store.memory.CacheStorageConfiguration
java -Dlog4j.configuration=file:///opt/voldemort/src/java/log4j.properties -server -Xms28g -Xmx28g -XX:NewSize=2048m -XX:MaxNewSize=2048m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=70 -XX:SurvivorRatio=2 -XX:+AlwaysPreTouch -XX:+UseCompressedOops -XX:+PrintTenuringDistribution -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:gc.log -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=7198 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false voldemort.server.VoldemortServer
Any ideas?
Some last-minute extra info:
None of the clients has the client zone configured, so by default all clients connect to zone 0 (I suppose).
Thanks!
What are the client and server versions?
...
--
You received this message because you are subscribed to a topic in the Google Groups "project-voldemort" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/project-voldemort/rkP7gzLCq74/unsubscribe.
To unsubscribe from this group and all its topics, send an email to project-voldem...@googlegroups.com.
Visit this group at https://groups.google.com/group/project-voldemort.
For more options, visit https://groups.google.com/d/optout.
Can you increase the number of server selectors to see if it makes a difference?
I am on a tablet, will send you more info when I am back on my macbook.
public void setNioConnectorSelectors(int nioConnectorSelectors)
[root@bcn1-cache-vold-095p1:mauso]# /usr/java/jdk1.7.0_72/bin/jps -vvv
12958 VoldemortServer -Dlog4j.configuration=file:///opt/voldemort/src/java/log4j.properties -Xms28g -Xmx28g -XX:NewSize=2048m -XX:MaxNewSize=2048m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=70 -XX:SurvivorRatio=2 -XX:+AlwaysPreTouch -XX:+UseCompressedOops -XX:+PrintTenuringDistribution -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:gc.log -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=7198 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false
31144 Jps -Dapplication.home=/usr/java/jdk1.7.0_72 -Xms8m
[root@bcn1-cache-vold-095p1:mauso]# /usr/java/jdk1.7.0_72/bin/jstack -l 12958 | grep 'voldemort-nio-socket-server-t'
"voldemort-nio-socket-server-t32" daemon prio=10 tid=0x00007f0351230800 nid=0x3334 runnable [0x00007f00d3cfb000]
"voldemort-nio-socket-server-t31" daemon prio=10 tid=0x00007f0351216000 nid=0x3333 runnable [0x00007f00d3dfc000]
"voldemort-nio-socket-server-t30" daemon prio=10 tid=0x00007f03511fa800 nid=0x3332 runnable [0x00007f00d3efd000]
"voldemort-nio-socket-server-t29" daemon prio=10 tid=0x00007f03511df800 nid=0x3331 runnable [0x00007f00d3ffe000]
"voldemort-nio-socket-server-t28" daemon prio=10 tid=0x00007f03511c4800 nid=0x3330 runnable [0x00007f01d41c0000]
"voldemort-nio-socket-server-t27" daemon prio=10 tid=0x00007f03511a9800 nid=0x332f runnable [0x00007f01d42c1000]
"voldemort-nio-socket-server-t26" daemon prio=10 tid=0x00007f035118f000 nid=0x332e runnable [0x00007f01d43c2000]
"voldemort-nio-socket-server-t25" daemon prio=10 tid=0x00007f0351174000 nid=0x332d runnable [0x00007f01d44c3000]
"voldemort-nio-socket-server-t24" daemon prio=10 tid=0x00007f0351159000 nid=0x332c runnable [0x00007f01d45c4000]
"voldemort-nio-socket-server-t23" daemon prio=10 tid=0x00007f035113d800 nid=0x332b runnable [0x00007f01d46c5000]
"voldemort-nio-socket-server-t22" daemon prio=10 tid=0x00007f0351122800 nid=0x332a runnable [0x00007f01d47c6000]
"voldemort-nio-socket-server-t21" daemon prio=10 tid=0x00007f0351107800 nid=0x3329 runnable [0x00007f01d48c7000]
"voldemort-nio-socket-server-t20" daemon prio=10 tid=0x00007f03510ec800 nid=0x3328 runnable [0x00007f01d49c8000]
"voldemort-nio-socket-server-t19" daemon prio=10 tid=0x00007f03510d1800 nid=0x3327 runnable [0x00007f01d4ac9000]
"voldemort-nio-socket-server-t18" daemon prio=10 tid=0x00007f03510b6800 nid=0x3326 runnable [0x00007f01d4bca000]
"voldemort-nio-socket-server-t17" daemon prio=10 tid=0x00007f035109b800 nid=0x3325 runnable [0x00007f01d4ccb000]
"voldemort-nio-socket-server-t16" daemon prio=10 tid=0x00007f0351080000 nid=0x3324 runnable [0x00007f01d4dcc000]
"voldemort-nio-socket-server-t15" daemon prio=10 tid=0x00007f0351065000 nid=0x3323 runnable [0x00007f01d4ecd000]
"voldemort-nio-socket-server-t14" daemon prio=10 tid=0x00007f035104a000 nid=0x3322 runnable [0x00007f01d4fce000]
"voldemort-nio-socket-server-t13" daemon prio=10 tid=0x00007f035102f000 nid=0x3321 runnable [0x00007f01d50cf000]
"voldemort-nio-socket-server-t12" daemon prio=10 tid=0x00007f0351014000 nid=0x3320 runnable [0x00007f01d51d0000]
"voldemort-nio-socket-server-t11" daemon prio=10 tid=0x00007f0350ff9000 nid=0x331f runnable [0x00007f01d52d1000]
"voldemort-nio-socket-server-t10" daemon prio=10 tid=0x00007f0350fdd800 nid=0x331e runnable [0x00007f01d53d2000]
"voldemort-nio-socket-server-t9" daemon prio=10 tid=0x00007f0350fc2800 nid=0x331d runnable [0x00007f01d54d3000]
"voldemort-nio-socket-server-t8" daemon prio=10 tid=0x00007f0350fa7800 nid=0x331c runnable [0x00007f01d55d4000]
"voldemort-nio-socket-server-t7" daemon prio=10 tid=0x00007f0350f8c800 nid=0x331b runnable [0x00007f01d56d5000]
"voldemort-nio-socket-server-t6" daemon prio=10 tid=0x00007f0350f71800 nid=0x331a runnable [0x00007f01d57d6000]
"voldemort-nio-socket-server-t5" daemon prio=10 tid=0x00007f0350f56800 nid=0x3319 runnable [0x00007f01d58d7000]
"voldemort-nio-socket-server-t4" daemon prio=10 tid=0x00007f0350ebb000 nid=0x3318 runnable [0x00007f01d59d8000]
"voldemort-nio-socket-server-t3" daemon prio=10 tid=0x00007f0350eb9800 nid=0x3317 runnable [0x00007f01d5ad9000]
"voldemort-nio-socket-server-t2" daemon prio=10 tid=0x00007f0350ebd800 nid=0x3316 runnable [0x00007f01d5bda000]
"voldemort-nio-socket-server-t1" daemon prio=10 tid=0x00007f0350ebc800 nid=0x3315 runnable [0x00007f01d5cdb000]
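The jstack output above lists threads t1 through t32. A quick way to confirm that a change to the number of selectors actually took effect after a restart is to count those threads instead of reading them by eye. This is a minimal sketch, assuming the thread-name prefix `voldemort-nio-socket-server-t` seen above; the class `SelectorThreadCount` is a hypothetical helper, not part of Voldemort.

```java
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts NIO selector threads in a jstack dump.
class SelectorThreadCount {

    // Thread-name prefix copied from the jstack output in this thread.
    private static final Pattern SELECTOR =
            Pattern.compile("\"voldemort-nio-socket-server-t(\\d+)\"");

    /** Counts how many selector thread headers appear in the given jstack text. */
    static int countSelectorThreads(String jstackOutput) {
        Matcher matcher = SELECTOR.matcher(jstackOutput);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Reads the dump from standard input.
        StringBuilder dump = new StringBuilder();
        try (Scanner in = new Scanner(System.in)) {
            while (in.hasNextLine()) {
                dump.append(in.nextLine()).append('\n');
            }
        }
        System.out.println("selector threads: " + countSelectorThreads(dump.toString()));
    }
}
```

Usage would be along the lines of `jstack -l 12958 | java SelectorThreadCount`, run once before and once after changing the setting.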
...
2016-01-06 00:52:06,079 ERROR [AsynchronousVoldemortDistributedCache: :ED67E0848842B96ACC5B13CBE2807017] [VoldemortDistributedCache] - Exception adding entry in cache: voldemort.versioning.ObsoleteVersionException: Key 00ec332e322e322e39
2e342e312e332e312e312e312e312e322e322e332e312e312e312e312e312e312e312e312e332e332e342e322e312e312e312e322e392e342e312e332e312e312e312e312e322e322e332e312e312e312e312e312e312e312e312e332e332e342e322e312e312e312d3141534b3132303030313437383
53633323030303030393734333234373946323230465354414e444152445f52414e47455f444154453131343739363636363030303030323437393937343346323230465354414e444152445f52414e47455f44415445464c49474854464744454641554c544e4f524d414c5b5d5b48424f42415d5b43
4f52504f524154455f554e4946415245532c20454c454354524f4e49435f5449434b45545f4f4e4c592c204e4f5f4c43435f46415245532c2050415353454e4745525f53414d455f424f4f4b494e475f434f44452c205055424c49534845445f46415245532c205449434b45545f4142494c4954595f4
34845434b2c20554e4946415245535d66616c73657472756530304742 version(30:4) ts:1452037926078 is obsolete, it is no greater than the current version of version(30:4) ts:1452037926077.
Jan 07 10:45:14 bcn1-cache-vold-095p1 voldemort-server.sh[588]: voldemort.VoldemortException: voldemort.store.UnreachableStoreException: Failure while checking out socket for bcn1-cache-vold-095p1:6666(vp1):
Jan 07 10:45:14 bcn1-cache-vold-095p1 voldemort-server.sh[588]: Caused by: voldemort.store.UnreachableStoreException: Failure while checking out socket for bcn1-cache-vold-095p1:6666(vp1):
The client uses the client port. Also, the client should preferably be on 1.10.2+, as we fixed some more connection issues in that release.
Hi Arun, thanks for your reply.
I'm not sure what the client version is (I will ask the developers tomorrow). Even so, the slop process uses the client version on the server, and I'm sure that version is 1.9.18. When clients request the metadata, which port do they use? The admin port? Maybe I need to increase the admin connections.
On Sunday, January 10, 2016 at 2:08:32 (UTC+1), Arun Thirupathi wrote:
sysctl -w net.ipv4.neigh.default.gc_thresh3=8192
sysctl -w net.ipv4.neigh.default.gc_thresh2=8192
sysctl -w net.ipv4.neigh.default.gc_thresh1=4096
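These settings raise the Linux neighbor (ARP) table thresholds; `gc_thresh3` is the hard limit on the number of entries. To judge whether a node is actually close to that limit, the current number of ARP entries can be compared with `gc_thresh3`. The following is only a rough sketch for Linux hosts: the class `NeighborTableCheck` and the 80% warning ratio are assumptions made for illustration, and it counts IPv4 entries from `/proc/net/arp` only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper: compares the ARP table size with the kernel hard limit.
class NeighborTableCheck {

    /** Counts entries in the lines of /proc/net/arp; the first line is the column header. */
    static int countArpEntries(List<String> arpLines) {
        int count = 0;
        for (int i = 1; i < arpLines.size(); i++) {
            if (!arpLines.get(i).trim().isEmpty()) {
                count++;
            }
        }
        return count;
    }

    /** True when the table holds at least 80% of gc_thresh3 (an arbitrary warning ratio). */
    static boolean nearLimit(int entries, int gcThresh3) {
        return entries >= gcThresh3 * 0.8;
    }

    public static void main(String[] args) throws IOException {
        Path arp = Paths.get("/proc/net/arp");
        Path thresh = Paths.get("/proc/sys/net/ipv4/neigh/default/gc_thresh3");
        if (!Files.isReadable(arp) || !Files.isReadable(thresh)) {
            System.out.println("Neighbor table files not readable (not a Linux host?)");
            return;
        }
        int entries = countArpEntries(Files.readAllLines(arp));
        int limit = Integer.parseInt(new String(Files.readAllBytes(thresh)).trim());
        System.out.println("ARP entries: " + entries + ", gc_thresh3: " + limit
                + (nearLimit(entries, limit) ? " (close to the limit)" : ""));
    }
}
```

If the count sits near the limit during the windows in which the socket checkout timeouts appear, that would support the neighbor table as the bottleneck; otherwise the cause is probably elsewhere.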