I'm running a 4-node cluster (BDB). Each node has a 6G heap and a 5G
BDB cache; there are 3 partitions and the replication factor is 2. To
benchmark the cluster, I started loading it with 100M records (keys
and values are strings). After about 20M records, the server logs
showed the following for about 2 hours:
2010-04-14 17:20:18,871 INFO voldemort.server.socket.SocketServerSession [voldemort-server-0] - Client /10.80.0.51:57882 connected successfully with protocol vp1
2010-04-14 17:20:19,257 INFO voldemort.server.socket.SocketServerSession [voldemort-server-1] - Client /10.80.0.51:57883 connected successfully with protocol vp1
2010-04-14 17:20:20,050 INFO voldemort.server.socket.SocketServerSession [voldemort-server-2] - Client /10.80.0.51:57890 connected successfully with protocol vp1
2010-04-14 17:20:20,561 INFO voldemort.server.socket.SocketServerSession [voldemort-server-3] - Client /10.80.0.51:57894 connected successfully with protocol vp1
2010-04-14 17:20:43,786 INFO voldemort.server.socket.SocketServerSession [voldemort-server-4] - Client /10.80.0.51:57899 connected successfully with protocol vp1
2010-04-14 17:34:38,360 INFO voldemort.server.socket.SocketServerSession [voldemort-server-5] - Client /10.80.0.51:57085 connected successfully with protocol vp1
2010-04-14 18:30:03,557 INFO voldemort.server.socket.SocketServerSession [voldemort-server-2] - Client /10.80.0.51:57890 disconnected.
2010-04-14 18:30:03,557 INFO voldemort.server.socket.SocketServerSession [voldemort-server-3] - Client /10.80.0.51:57894 disconnected.
2010-04-14 18:30:13,427 INFO voldemort.server.socket.SocketServerSession [voldemort-server-6] - Client /10.80.0.51:59481 connected successfully with protocol vp1
2010-04-14 18:38:47,114 INFO voldemort.server.socket.SocketServerSession [voldemort-server-1] - Client /10.80.0.51:57883 disconnected.
2010-04-14 18:38:47,115 INFO voldemort.server.socket.SocketServerSession [voldemort-server-4] - Client /10.80.0.51:57899 disconnected.
2010-04-14 18:38:48,482 INFO voldemort.server.socket.SocketServerSession [voldemort-server-7] - Client /10.80.0.51:39788 connected successfully with protocol vp1
2010-04-14 18:39:11,776 INFO voldemort.server.socket.SocketServerSession [voldemort-server-8] - Client /10.80.0.51:39789 connected successfully with protocol vp1
2010-04-14 18:39:11,777 INFO voldemort.server.socket.SocketServerSession [voldemort-server-9] - Client /10.80.0.51:39790 connected successfully with protocol vp1
2010-04-14 18:45:19,543 INFO voldemort.server.socket.SocketServerSession [voldemort-server-7] - Client /10.80.0.51:39788 disconnected.
2010-04-14 18:45:19,543 INFO voldemort.server.socket.SocketServerSession [voldemort-server-6] - Client /10.80.0.51:59481 disconnected.
2010-04-14 18:45:51,454 INFO voldemort.server.socket.SocketServerSession [voldemort-server-10] - Client /10.80.0.51:38223 connected successfully with protocol vp1
2010-04-14 18:46:49,939 INFO voldemort.server.socket.SocketServerSession [voldemort-server-8] - Client /10.80.0.51:39789 disconnected.
2010-04-14 18:46:49,940 INFO voldemort.server.socket.SocketServerSession [voldemort-server-9] - Client /10.80.0.51:39790 disconnected.
2010-04-14 18:46:50,683 INFO voldemort.server.socket.SocketServerSession [voldemort-server-11] - Client /10.80.0.51:45701 connected successfully with protocol vp1
2010-04-14 18:47:14,363 INFO voldemort.server.socket.SocketServerSession [voldemort-server-12] - Client /10.80.0.51:45702 connected successfully with protocol vp1
2010-04-14 18:48:02,990 INFO voldemort.server.socket.SocketServerSession [voldemort-server-10] - Client /10.80.0.51:38223 disconnected.
2010-04-14 18:48:02,991 INFO voldemort.server.socket.SocketServerSession [voldemort-server-11] - Client /10.80.0.51:45701 disconnected.
2010-04-14 18:48:04,574 INFO voldemort.server.socket.SocketServerSession [voldemort-server-13] - Client /10.80.0.51:45704 connected successfully with protocol vp1
2010-04-14 18:48:33,130 INFO voldemort.server.socket.SocketServerSession [voldemort-server-14] - Client /10.80.0.51:45706 connected successfully with protocol vp1
2010-04-14 18:49:11,074 INFO voldemort.server.socket.SocketServerSession [voldemort-server-12] - Client /10.80.0.51:45702 disconnected.
2010-04-14 18:49:11,074 INFO voldemort.server.socket.SocketServerSession [voldemort-server-13] - Client /10.80.0.51:45704 disconnected.
2010-04-14 18:49:12,245 INFO voldemort.server.socket.SocketServerSession [voldemort-server-15] - Client /10.80.0.51:45708 connected successfully with protocol vp1
2010-04-14 18:49:26,231 INFO voldemort.server.socket.SocketServerSession [voldemort-server-16] - Client /10.80.0.51:45709 connected successfully with protocol vp1
2010-04-14 18:50:18,372 INFO voldemort.server.socket.SocketServerSession [voldemort-server-14] - Client /10.80.0.51:45706 disconnected.
2010-04-14 18:50:18,372 INFO voldemort.server.socket.SocketServerSession [voldemort-server-16] - Client /10.80.0.51:45709 disconnected.
2010-04-14 18:50:18,406 INFO voldemort.server.socket.SocketServerSession [voldemort-server-18] - Client /10.80.0.51:45713 connected successfully with protocol vp1
2010-04-14 18:50:18,406 INFO voldemort.server.socket.SocketServerSession [voldemort-server-17] - Client /10.80.0.51:45712 connected successfully with protocol vp1
2010-04-14 18:51:35,492 INFO voldemort.server.socket.SocketServerSession [voldemort-server-15] - Client /10.80.0.51:45708 disconnected.
2010-04-14 18:51:35,495 INFO voldemort.server.socket.SocketServerSession [voldemort-server-5] - Client /10.80.0.51:57085 disconnected.
2010-04-14 18:51:35,495 INFO voldemort.server.socket.SocketServerSession [voldemort-server-17] - Client /10.80.0.51:45712 disconnected.
2010-04-14 18:51:35,497 INFO voldemort.server.socket.SocketServerSession [voldemort-server-22] - Client /10.80.0.51:33013 connected successfully with protocol vp1
2010-04-14 18:51:35,497 INFO voldemort.server.socket.SocketServerSession [voldemort-server-21] - Client /10.80.0.51:33012 connected successfully with protocol vp1
2010-04-14 18:51:35,497 INFO voldemort.server.socket.SocketServerSession [voldemort-server-20] - Client /10.80.0.51:33011 connected successfully with protocol vp1
2010-04-14 18:51:35,498 INFO voldemort.server.socket.SocketServerSession [voldemort-server-19] - Client /10.80.0.51:33010 connected successfully with protocol vp1
2010-04-14 18:52:31,754 INFO voldemort.server.socket.SocketServerSession [voldemort-server-21] - Client /10.80.0.51:33012 disconnected.
Then the client crashed with:
Exception in thread "main" voldemort.store.InsufficientOperationalNodesException: No master node succeeded!
        at voldemort.store.routed.RoutedStore.put(RoutedStore.java:703)
        at voldemort.store.routed.RoutedStore.put(RoutedStore.java:72)
        at voldemort.store.DelegatingStore.put(DelegatingStore.java:68)
        at voldemort.store.stats.StatTrackingStore.put(StatTrackingStore.java:90)
        at voldemort.store.serialized.SerializingStore.put(SerializingStore.java:109)
        at voldemort.store.DelegatingStore.put(DelegatingStore.java:68)
        at voldemort.client.DefaultStoreClient.put(DefaultStoreClient.java:208)
        at com.demo.PutTest.main(PutTest.java:52)
Caused by: voldemort.store.UnreachableStoreException: Failure in put on dev102.dev:6666(vp1): Read timed out
        at voldemort.store.socket.SocketStore.put(SocketStore.java:150)
        at voldemort.store.socket.SocketStore.put(SocketStore.java:47)
        at voldemort.store.logging.LoggingStore.put(LoggingStore.java:121)
        at voldemort.store.routed.RoutedStore.put(RoutedStore.java:687)
        ... 7 more
Caused by: java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
        at java.io.DataInputStream.readShort(DataInputStream.java:295)
        at voldemort.client.protocol.vold.VoldemortNativeClientRequestFormat.checkException(VoldemortNativeClientRequestFormat.java:177)
        at voldemort.client.protocol.vold.VoldemortNativeClientRequestFormat.readPutResponse(VoldemortNativeClientRequestFormat.java:170)
        at voldemort.store.socket.SocketStore.put(SocketStore.java:147)
        ... 10 more
Is my test wrong? Should I instantiate a new StoreClient every few
minutes? Besides the socket issues, there are no other errors in the
server logs, and none of the servers crashed.
Tom