Asynchronous read and timeout exceptions - performance tool

Paolo Forte

Mar 26, 2015, 11:45:58 AM
to project-...@googlegroups.com
Hi,
I am using the performance tool as a load generator (for the moment it only performs reads). For this purpose I modified voldemort.performance.benchmark.VoldemortWrapper.java to spawn a new thread that takes care of each read operation asynchronously.
Note: I insert a sleep(time) just before spawning the thread so that I can control the number of operations per second.

The problem is that, when I do this, I receive a lot of these messages on the client:
[15:59:07,599 voldemort.store.socket.clientrequest.ClientRequestExecutor] WARN Client request associated with Socket[addr=/192.168.0.9,port=6666,localport=59117] timed out. Start time(ns) 168755944268398 allowed time(ns) 4979608367 elapsed time(ns) 5145758141 [voldemort-niosocket-client-6]

and the following exceptions:
Exception in thread "Thread-4676" voldemort.store.InsufficientOperationalNodesException: 1 gets required, but only 0 succeeded Original replication set :[0] Known failed nodes before operation :[] Estimated live nodes in preference list :[0] New failed nodes during operation :[0]
at voldemort.store.routed.action.PerformSerialRequests.execute(PerformSerialRequests.java:142)
at voldemort.store.routed.Pipeline.execute(Pipeline.java:212)
at voldemort.store.routed.PipelineRoutedStore.get(PipelineRoutedStore.java:352)
at voldemort.store.routed.PipelineRoutedStore.get(PipelineRoutedStore.java:258)
at voldemort.store.routed.PipelineRoutedStore.get(PipelineRoutedStore.java:81)
at voldemort.store.logging.LoggingStore.get(LoggingStore.java:106)
at voldemort.store.DelegatingStore.get(DelegatingStore.java:65)
at voldemort.store.stats.StatTrackingStore.get(StatTrackingStore.java:87)
at voldemort.store.stats.StatTrackingStore.get(StatTrackingStore.java:40)
at voldemort.store.serialized.SerializingStore.get(SerializingStore.java:107)
at voldemort.store.DelegatingStore.get(DelegatingStore.java:65)
at voldemort.store.versioned.InconsistencyResolvingStore.get(InconsistencyResolvingStore.java:51)
at voldemort.client.DefaultStoreClient.get(DefaultStoreClient.java:157)
at voldemort.client.DefaultStoreClient.get(DefaultStoreClient.java:314)
at voldemort.client.LazyStoreClient.get(LazyStoreClient.java:103)
at voldemort.performance.benchmark.VoldemortWrapper$1.run(VoldemortWrapper.java:118)
at java.lang.Thread.run(Thread.java:745)
Caused by: voldemort.store.UnreachableStoreException: Failure in get: Protocol negotation timed out for socket Socket[addr=/192.168.0.9,port=6666,localport=60958]
at voldemort.store.socket.clientrequest.ClientRequestExecutorPool$AsyncSocketDestinationRequest.handleException(ClientRequestExecutorPool.java:358)
at voldemort.utils.pool.QueuedKeyedResourcePool.registerResourceRequest(QueuedKeyedResourcePool.java:108)
at voldemort.store.socket.clientrequest.ClientRequestExecutorPool.submitAsync(ClientRequestExecutorPool.java:273)
at voldemort.store.socket.SocketStore.requestAsync(SocketStore.java:371)
at voldemort.store.socket.SocketStore.submitGetRequest(SocketStore.java:122)
at voldemort.store.routed.action.PerformParallelRequests.execute(PerformParallelRequests.java:159)
... 16 more
Caused by: java.util.concurrent.TimeoutException: Protocol negotation timed out for socket Socket[addr=/192.168.0.9,port=6666,localport=60958]
at voldemort.store.socket.clientrequest.ClientRequestExecutorFactory.create(ClientRequestExecutorFactory.java:212)
at voldemort.store.socket.clientrequest.ClientRequestExecutorFactory.create(ClientRequestExecutorFactory.java:51)
at voldemort.utils.pool.KeyedResourcePool$Pool.attemptGrow(KeyedResourcePool.java:455)
at voldemort.utils.pool.KeyedResourcePool.attemptNonBlockingCheckout(KeyedResourcePool.java:177)
at voldemort.utils.pool.QueuedKeyedResourcePool.registerResourceRequest(QueuedKeyedResourcePool.java:104)
... 20 more

On the server instead, I can see a stream of these messages:
[16:09:06,720 voldemort.server.niosocket.AsyncRequestHandler] INFO Protocol negotiated for Socket[addr=/192.168.0.6,port=32862,localport=6666]: voldemort-native-v3 [voldemort-niosocket-server74]

interspersed with a lot of these messages:
[16:09:06,725 voldemort.server.niosocket.AsyncRequestHandler] INFO Connection reset from Socket[addr=/192.168.0.6,port=60448,localport=6666] with message - Connection reset by peer [voldemort-niosocket-server428]

What am I doing wrong? How can I work around it?
Thanks in advance.

In the rest of the post I have attached info about the systems and the settings. Let me know if you need anything more.



Performance Tool
Some parameters of ./voldemort-performance-tool.sh:
--num-connections-per-node 999999 
--percent-cached 0 -r 100
--record-selection uniform
--threads 1
--url tcp://192.168.0.9:6666
--lambda 200

The lambda parameter is one I added: it lets me statistically control the number of operations performed per second. I have tested it with the default configuration (that is, without spawning a new thread for each read operation) and it works correctly.
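For clarity, this is roughly how the throttling works (an illustrative sketch, not my exact code): I draw the inter-arrival time of a Poisson process with rate lambda and sleep for that long before spawning each read.

import java.util.Random;

// Illustrative sketch only: exponentially distributed inter-arrival sleep
// for a target rate of 'lambda' operations per second (mean sleep = 1/lambda s).
public class ExponentialThrottle {

    private final double lambda;           // target ops/sec, e.g. 200
    private final Random random = new Random();

    public ExponentialThrottle(double lambda) {
        this.lambda = lambda;
    }

    public void pause() throws InterruptedException {
        // Inverse-CDF sampling of an Exponential(lambda) variable, converted to ms.
        double sleepSec = -Math.log(1.0 - random.nextDouble()) / lambda;
        Thread.sleep((long) (sleepSec * 1000.0));
    }
}

With --lambda 200 the mean sleep is 5 ms between spawned reads.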

Moreover, I added this to bin/voldemort-performance-tool.sh:
export VOLD_OPTS=" -server -Xms32684m -Xmx32684m -XX:NewSize=8048m -XX:MaxNewSize=20048m -XX:+UseG1GC -XX:SurvivorRatio=2 -XX:+AlwaysPreTouch -XX:+UseCompressedOops -XX:+PrintTenuringDistribution -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:gc.log -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime "
 
Here is my VoldemortWrapper.java read() method:
    public void read(final Object key, final Object expectedValue, final Object transforms) {
        long startNs = System.nanoTime();
        // Here I actually insert an exponentially distributed sleep to tune the load generator.
        // By tuning the sleeping time, I can decide how many operations (i.e. connections) per second are done.
        new Thread(new Runnable() {
            public void run() {
                // The get() now runs asynchronously in its own thread.
                Versioned<Object> returnedValue = voldemortStore.get(key, transforms);
            }
        }).start();

        long endNs = System.nanoTime();
        measurement.recordLatency(Operations.Read.getOpString(), ((endNs - startNs) / Time.NS_PER_MS));

        ReturnCode res = ReturnCode.Ok;
        // Verification is disabled because returnedValue is no longer available here:
        // if (returnedValue == null && !this.ignoreNulls) {
        //     res = ReturnCode.Error;
        // }
        // if (verifyReads && !expectedValue.equals(returnedValue.getValue())) {
        //     res = ReturnCode.Error;
        // }

        measurement.recordReturnCode(Operations.Read.getOpString(), res.ordinal());
    }
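Side note: as written above, endNs is captured right after Thread.start(), so the recorded latency covers only the thread spawn, not the get() itself. If I wanted to record the latency of the asynchronous get() instead, the thread body would look roughly like this (just a sketch; it also assumes measurement can be called safely from multiple threads):

        new Thread(new Runnable() {
            public void run() {
                long opStartNs = System.nanoTime();
                Versioned<Object> returnedValue = voldemortStore.get(key, transforms);
                long opEndNs = System.nanoTime();
                // Assumes measurement.recordLatency() can be called concurrently.
                measurement.recordLatency(Operations.Read.getOpString(),
                                          ((opEndNs - opStartNs) / Time.NS_PER_MS));
            }
        }).start();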

Server
I run the following command:
sudo bin/voldemort-prod-server.sh config/prod_single_node_cluster

I modified voldemort-prod-server.sh, raising the memory parameters:
export VOLD_OPTS=" -server -Xms32684m -Xmx32684m -XX:NewSize=2048m -XX:MaxNewSize=20048m -XX:+UseG1GC -XX:SurvivorRatio=2 -XX:+AlwaysPreTouch -XX:+UseCompressedOops -XX:+PrintTenuringDistribution -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:gc.log -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime "

Here is my server.properties (including the settings I added):
# The ID of *this* particular cluster node

# configs
admin.enable=true
admin.max.threads=40
bdb.cache.evictln=true
bdb.cache.size=20GB
bdb.checkpoint.interval.bytes=2147483648
bdb.checkpointer.off.batch.writes=true
bdb.cleaner.interval.bytes=15728640
bdb.cleaner.lazy.migration=false
bdb.cleaner.min.file.utilization=0
bdb.cleaner.threads=1
bdb.enable=true
bdb.evict.by.level=true
bdb.expose.space.utilization=true
bdb.lock.nLockTables=47
bdb.minimize.scan.impact=true
bdb.one.env.per.store=true
enable.server.routing=false
enable.verbose.logging=false
http.enable=true
nio.connector.selectors=500
client.max.queued.requests=10000
num.scan.permits=2
request.format=vp3
restore.data.timeout.sec=1314000
scheduler.threads=24
socket.enable=true
storage.configs=voldemort.store.bdb.BdbStorageConfiguration, voldemort.store.readonly.ReadOnlyStorageConfiguration
stream.read.byte.per.sec=209715200
stream.write.byte.per.sec=78643200

Here is the cluster.xml:
<cluster>
        <name>myprodcluster</name>
        <server>
                <id>0</id>
                <host>localhost</host>
                <http-port>8081</http-port>
                <socket-port>6666</socket-port>
                <admin-port>7777</admin-port>
                <partitions>0, 1</partitions>
        </server>
</cluster>

And the store.xml (the store is actually named just "test"):
<store>
  <name>test</name>
  <persistence>bdb</persistence>
  <description>Test store</description>
  <routing-strategy>consistent-routing</routing-strategy>
  <routing>client</routing>
  <replication-factor>1</replication-factor>
  <required-reads>1</required-reads>
  <required-writes>1</required-writes>
  <key-serializer>
    <type>string</type>
  </key-serializer>
  <value-serializer>
    <type>string</type>
  </value-serializer>
</store>
 

Hardware

The machines have 64 GB RAM, two 12-core AMD Opteron processors, a 500 GB hard disk and a 1 Gb network controller. They run Ubuntu 12.04 LTS.

Arunachalam

Mar 26, 2015, 1:13:59 PM
to project-...@googlegroups.com
You are creating too many connections, and there is a design flaw in Voldemort: connection creation delays other operations. If too many connection creations happen at once, they fail and the client gets into a death spiral.

Try the patched branch https://github.com/arunthirupathi/voldemort/tree/clientV2

and let me know how it goes. This is, to a certain extent, a rewrite of the NIO layer, so I haven't merged it in yet. It is currently going through internal testing and I will release it after that.

What is the performance effort you are working on? If you are interested in Voldemort performance, I have several patches that will improve Voldemort's latency and throughput. Give me a brief introduction to your work and we can take it from there.

Thanks,
Arun.


Carlos Tasada

Mar 30, 2015, 7:05:23 AM
to project-...@googlegroups.com
Hi Arun,

I see the same error in my production environment. What's "too many connections" for you?
In my production environment I have the following voldemort-client configuration:
# Maximum number of connections allowed to each voldemort node
max_connections = 50
# Maximum number of connections allowed to all voldemort nodes
max_total_connections = 400
# The maximum number of client threads (used by the client thread pool) [max_connections + 10]
max_threads = 60
server.properties:
max.threads = 2048
enable.nio.connector   = true
nio.connector.selectors = 8
request.format = vp2
enable.gossip          = false

I have a total of 28 clients and 12 Voldemort Servers.

Do you see anything that could be causing such errors?
How stable do you believe the clientV2 branch is?

Thanks a lot

Paolo Forte

Mar 30, 2015, 7:24:46 AM
to project-...@googlegroups.com
Thanks for your reply. I'll take your branch into consideration, but first I want to see if I can avoid it, because I can't really use experimental branches.

Basically, I just use the built-in performance tool both as a load generator (to create the load on the servers) and as a client (on which I can monitor the average delay), in order to check whether, under high load, the client suffers a lack of service.

How do the patches work?

Thanks.

Arunachalam

Mar 30, 2015, 1:14:14 PM
to project-...@googlegroups.com
Carlos,
      There are 2 factors that can cause this.

1) Round-trip time between the client and the server. The current implementation of the client blocks the selector for 2 round trips (one for the TCP connect and one for the protocol negotiation). This happens during connection creation.
2) The servers not responding or GCing. When there is a failed connection, the selector initiates a new connection just as it is about to return the failed one, and it sort of gets into a death spiral.

The example case was weird, as it had too many connections, but I believe this should not be a problem in the real world.

The selectors handle multiple reads/writes at the same time, but when a selector is too busy tending to one connection, it times out the other reads/writes. The fix is to not block and to do all operations asynchronously. My personal tests have shown that this fix should cut the 95th-percentile latency by 50%. I am internally testing the fix with a few of our partners and I plan to release it to open source in the next 1 or 2 months.
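To illustrate the pattern (a simplified sketch, not Voldemort's actual NIO code; the negotiation call is hypothetical): when connection setup is done with blocking steps inside the selector loop, every other channel registered with that selector has to wait, and its requests can time out.

import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Simplified sketch of selector starvation: one slow, blocking connection setup
// inside the event loop delays every other registered channel.
public class SelectorStarvationSketch {

    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        // ... client channels would be registered with the selector here ...

        while (true) {
            selector.select(500);
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();

                if (key.isConnectable()) {
                    SocketChannel channel = (SocketChannel) key.channel();
                    channel.finishConnect();
                    // Problematic pattern: doing a blocking request/response protocol
                    // negotiation here stalls the whole loop for roughly two round
                    // trips, so every other ready key waits (and may time out).
                    // negotiateProtocolBlocking(channel);   // hypothetical blocking call
                    //
                    // Non-blocking alternative: register OP_WRITE/OP_READ and drive the
                    // negotiation as another asynchronous state-machine step, so the
                    // loop can move on to other ready channels immediately.
                    key.interestOps(SelectionKey.OP_WRITE);
                } else if (key.isReadable() || key.isWritable()) {
                    // handle normal request/response traffic for this channel
                }
            }
        }
    }
}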

But I have done decent testing on my sample client and it shows good improvements. Feel free to try out the patch and let me know if you have any questions/concerns.

Thanks,
Arun.


Arunachalam

Mar 30, 2015, 1:15:22 PM
to project-...@googlegroups.com
Paolo,
In that case I recommend dropping the number of connections to a reasonable value, say 200 or 500.

--num-connections-per-node 999999

Thanks,
Arun.

Paolo Forte

Mar 30, 2015, 1:52:28 PM
to project-...@googlegroups.com
Hi Arun, thanks for the help.
Unfortunately, I have already tried different values (in the range [10; 500]) without any luck; I get the same output as in the first post, with timed-out connections.

Up to 120 ops/sec it works, but when the rate exceeds 150 it begins to fail. I don't know how to fix it.

Anyhow thanks for your time.

Arunachalam

Mar 30, 2015, 1:56:36 PM
to project-...@googlegroups.com
Paolo,
Try increasing the selectors to 100, with connections at 500. That would mitigate the issue to a certain extent.
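For a standalone client, applying those numbers would look roughly like this (an illustrative sketch only; I am assuming the ClientConfig setters below exist in the version you are running, so double-check the names):

import voldemort.client.ClientConfig;
import voldemort.client.SocketStoreClientFactory;
import voldemort.client.StoreClient;
import voldemort.client.StoreClientFactory;
import voldemort.versioning.Versioned;

// Illustrative only: apply the suggested selector/connection counts on the client.
// Assumes ClientConfig exposes setSelectors() and setMaxConnectionsPerNode().
public class TunedClientSketch {

    public static void main(String[] args) {
        ClientConfig config = new ClientConfig()
                .setBootstrapUrls("tcp://192.168.0.9:6666")
                .setSelectors(100)                  // suggested selector count
                .setMaxConnectionsPerNode(500);     // suggested connection cap
        StoreClientFactory factory = new SocketStoreClientFactory(config);
        StoreClient<String, String> client = factory.getStoreClient("test");
        Versioned<String> value = client.get("some-key");
        System.out.println(value);
    }
}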

Thanks,
Arun.

Paolo Forte

Mar 31, 2015, 8:07:41 AM
to project-...@googlegroups.com
I noticed that after a while the server caches the data (using iotop I can see that the disk I/O drops). Once most of the values are in the cache, the server can bear far more connections than before.
Unfortunately, for the purposes I am pursuing I need to avoid caching, so this is not a solution for me; rather, I should somehow disable caching.