Hi,
I’d appreciate any guidance on the optimal setup for a multi-threaded, high-throughput, low-latency Java client using the DataStax Java Driver for Apache Cassandra. I appreciate that ‘roll-your-own’ benchmarking is not recommended, but this task also serves as a proof of concept for a real-world application that needs to achieve high TPS.
Setup:
-------
Client side: Java 8 client, configurable number of executor threads (facilitated by LMAX Disruptor), cassandra-driver-core-3.0.0.jar, running on Redhat 6.3, 24 core machine, dl360s
Server side: 3 node Cassandra cluster (apache-cassandra-2.2.4, on Redhat 6 with Java 8), Replication Factor = 3, running on Redhat 6.3, 24 core machine, dl360s
Testing
--------
With cl=LOCAL_QUORUM, tests have been in the region of 3.5K INSERTS and 6.5K READS per second against a relatively simple schema, with latency circa 6 and 2 milliseconds respectively, and CPU usage circa 20% across the box.
However, the problem I cannot solve is this: when I run multiple separate instances of my load client application, I achieve significantly higher TPS summed across instances, and greater CPU usage. This suggests that my Java client application is neither IO nor CPU bound, and that the server-side Cassandra cluster is not the bottleneck. Likewise, when I stub out the Cassandra call I achieve much higher TPS, which gives me confidence that the application itself is not suffering from contention.
So my question is: is this a common problem, that one single Java client using the DataStax Java Driver for Apache Cassandra is somehow limited in its throughput? And assuming not, can anyone point me in the right direction to investigate?
I have tested multiple sequences (READs and WRITEs), and also both execute and executeAsync, with variable number of concurrent threads. As you’d expect I see higher numbers with executeAsync but still the same limitation within my app.
I have tested multiple connection pooling settings, and have tried building 1 Cluster instance per client application and multiple Cluster instances per application, varying the CoreConnections, maxRequestsPerConnection and newConnectionThreshold values, but thus far with no success.
My current best results were with 50 executor threads, 5 instances; MaxRequestsPerConnection(L) = 1024; NewConnectionThreshold(L) = 800; CoreConnectionsPerHost(L) = 20
This yielded ~4K TPS BUT only used 18% of the CPU. When I start a separate application instance I achieve 7.5K TPS across both using 30% CPU, but I cannot achieve this 7.5K within the same JVM.
Thanks
Aidan
--
You received this message because you are subscribed to the Google Groups "DataStax Java Driver for Apache Cassandra User Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to java-driver-us...@lists.datastax.com.
          | 1 Thread | 2 Threads | 5 Threads | 10 Threads | 20 Threads | 50 Threads | 100 Threads | 250 Threads |
execASYNC | 65K TPS  | 64K       | 66K       | 56K        | 62K        | 65K        | 55K         |             |
exec      | Too few  |           | 7K        | 11K        | 19K        | 35K        | 47K         | 27K         |
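One way to read the blocking exec figures is through Little's law: in a closed loop of blocking callers, mean latency is roughly threads divided by throughput. The sketch below applies that to the exec numbers from the table. It assumes every thread is continuously busy in a blocking call, which the thread does not confirm, and the class and method names (LittlesLaw, impliedLatencyMillis, maxBlockingTps) are made up for illustration.

```java
import java.util.Locale;

/** Little's law applied to the blocking exec figures from the table. */
class LittlesLaw {

    /**
     * Mean latency implied by a closed loop of blocking callers:
     * latency = threads / throughput, returned in milliseconds.
     * Assumes every thread is always waiting on a request.
     */
    static double impliedLatencyMillis(int threads, double requestsPerSecond) {
        return threads / requestsPerSecond * 1000.0;
    }

    /** Upper bound on blocking throughput for a thread count and mean latency. */
    static double maxBlockingTps(int threads, double latencyMillis) {
        return threads / (latencyMillis / 1000.0);
    }

    public static void main(String[] args) {
        // Thread counts and TPS copied from the exec (blocking) results.
        int[] threads = {5, 10, 20, 50, 100, 250};
        double[] tps = {7_000, 11_000, 19_000, 35_000, 47_000, 27_000};
        for (int i = 0; i < threads.length; i++) {
            System.out.printf(Locale.ROOT,
                    "%3d threads, %6.0f TPS -> ~%.2f ms per request%n",
                    threads[i], tps[i], impliedLatencyMillis(threads[i], tps[i]));
        }
    }
}
```

Under that assumption the implied latency rises from about 0.7 ms at 5 threads to about 2.1 ms at 100 threads and about 9.3 ms at 250 threads, so the drop at 250 threads looks like added queueing or contention rather than more useful work. By the same arithmetic, if the 6 ms insert latency from the original post held at 50 blocking threads, the ceiling would be roughly 8.3K TPS per JVM.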
In-Flight Query Data
execASYNC (50 threads)
Load per node appears to jump around – which is probably as we’d expect with async requests.
1 cluster per-app: Host=/10.20.53.48:9042 connections=1, current load=0, max load=1024
1 cluster per-app: Host=/10.20.53.183:9042 connections=1, current load=1024, max load=1024
1 cluster per-app: Host=/10.20.53.184:9042 connections=1, current load=0, max load=1024
--
1 cluster per-app: Host=/10.20.53.48:9042 connections=1, current load=1023, max load=1024
1 cluster per-app: Host=/10.20.53.183:9042 connections=1, current load=40, max load=1024
1 cluster per-app: Host=/10.20.53.184:9042 connections=1, current load=41, max load=1024
--
1 cluster per-app: Host=/10.20.53.48:9042 connections=1, current load=1024, max load=1024
1 cluster per-app: Host=/10.20.53.183:9042 connections=1, current load=0, max load=1024
1 cluster per-app: Host=/10.20.53.184:9042 connections=1, current load=0, max load=1024
--
Exec blocking (100 threads)
More stable and evenly distributed.
1 cluster per-app: Host=/10.20.53.48:9042 connections=1, current load=58, max load=1024
1 cluster per-app: Host=/10.20.53.183:9042 connections=1, current load=22, max load=1024
1 cluster per-app: Host=/10.20.53.184:9042 connections=1, current load=20, max load=1024
--
1 cluster per-app: Host=/10.20.53.48:9042 connections=1, current load=36, max load=1024
1 cluster per-app: Host=/10.20.53.183:9042 connections=1, current load=27, max load=1024
1 cluster per-app: Host=/10.20.53.184:9042 connections=1, current load=34, max load=1024
--
1 cluster per-app: Host=/10.20.53.48:9042 connections=1, current load=34, max load=1024
1 cluster per-app: Host=/10.20.53.183:9042 connections=1, current load=35, max load=1024
1 cluster per-app: Host=/10.20.53.184:9042 connections=1, current load=31, max load=1024
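In the execASYNC samples above, one connection is repeatedly pinned at or near its max load of 1024 while the others sit idle. A common way to keep an async client from flooding a connection is to cap the number of outstanding requests with a semaphore. The following is only a sketch of that pattern using the JDK alone; it is not what was run in these tests. InFlightThrottle, runDemo and the fake server pool are hypothetical names, and the simulated call stands in for session.executeAsync, whose ResultSetFuture (a Guava ListenableFuture in driver 3.0) would need adapting before it could be plugged in.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Caps the number of asynchronous requests that may be outstanding at once. */
class InFlightThrottle {
    private final Semaphore permits;
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();

    InFlightThrottle(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    /** Blocks the caller until a slot is free, then starts the async call. */
    <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> asyncCall) {
        permits.acquireUninterruptibly();
        peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
        CompletableFuture<T> future;
        try {
            future = asyncCall.get();
        } catch (RuntimeException e) {
            // The call never started, so give the slot back immediately.
            inFlight.decrementAndGet();
            permits.release();
            throw e;
        }
        // Free the slot when the request completes, successfully or not.
        return future.whenComplete((result, error) -> {
            inFlight.decrementAndGet();
            permits.release();
        });
    }

    /** Highest number of requests that were outstanding at the same time. */
    int peakInFlight() {
        return peak.get();
    }

    /** Pushes simulated requests through a throttle and returns the peak in-flight count. */
    static int runDemo(int requests, int maxInFlight) {
        // Hypothetical stand-in for the cluster: completes requests on other threads.
        ExecutorService fakeServer = Executors.newFixedThreadPool(8);
        InFlightThrottle throttle = new InFlightThrottle(maxInFlight);
        CompletableFuture<?>[] futures = new CompletableFuture<?>[requests];
        try {
            for (int i = 0; i < requests; i++) {
                final int id = i;
                Supplier<CompletableFuture<Integer>> call =
                        () -> CompletableFuture.supplyAsync(() -> id, fakeServer);
                futures[i] = throttle.submit(call);
            }
            CompletableFuture.allOf(futures).join();
        } finally {
            fakeServer.shutdown();
        }
        return throttle.peakInFlight();
    }

    public static void main(String[] args) {
        System.out.println("peak in-flight = " + runDemo(10_000, 256));
    }
}
```

The permit is taken before the call starts and released in the completion callback, so the number of outstanding requests never exceeds the configured cap, regardless of how many producer threads call submit. Choosing a cap below hosts × connections × max requests per connection leaves headroom; the right value for this cluster would have to be found by measurement.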
Output from stress tool
# ./cassandra-stress write n=1000000 -node 10.20.53.48
INFO 16:18:26 Did not find Netty's native epoll transport in the classpath, defaulting to NIO.
INFO 16:18:27 Using data-center name 'datacenter1' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
INFO 16:18:27 New Cassandra host /10.20.53.48:9042 added
INFO 16:18:27 New Cassandra host /10.20.53.183:9042 added
INFO 16:18:27 New Cassandra host /10.20.53.184:9042 added
Connected to cluster: TestClusteraidan224
Datatacenter: datacenter1; Host: /10.20.53.48; Rack: rack1
Datatacenter: datacenter1; Host: /10.20.53.183; Rack: rack1
Datatacenter: datacenter1; Host: /10.20.53.184; Rack: rack1
Created keyspaces. Sleeping 1s for propagation.
Sleeping 2s...
Warming up WRITE with 50000 iterations...
Failed to connect over JMX; not collecting these stats
Running WRITE with 200 threads for 1000000 iteration
Failed to connect over JMX; not collecting these stats
type, total ops, op/s, pk/s, row/s, mean, med, .95, .99, .999, max, time, stderr, errors, gc: #, max ms, sum ms, sdv ms, mb
total, 105837, 105816, 105816, 105816, 2.0, 1.6, 3.4, 7.0, 66.0, 106.5, 1.0, 0.00000, 0, 0, 0, 0, 0, 0
total, 260081, 138394, 138394, 138394, 1.4, 1.3, 2.4, 3.8, 48.8, 58.5, 2.1, 0.09433, 0, 0, 0, 0, 0, 0
total, 404236, 119122, 119122, 119122, 1.7, 1.1, 2.0, 2.9, 161.5, 165.2, 3.3, 0.06809, 0, 0, 0, 0, 0, 0
total, 564912, 150632, 150632, 150632, 1.3, 1.1, 2.0, 3.1, 106.6, 111.7, 4.4, 0.07768, 0, 0, 0, 0, 0, 0
total, 735872, 165532, 165532, 165532, 1.1, 1.1, 2.0, 2.5, 4.9, 52.5, 5.4, 0.06977, 0, 0, 0, 0, 0, 0
total, 918927, 175529, 175529, 175529, 1.2, 1.1, 1.9, 2.6, 51.8, 56.3, 6.5, 0.06550, 0, 0, 0, 0, 0, 0
total, 1000000, 114363, 114363, 114363, 1.7, 1.1, 2.0, 2.8, 117.2, 118.4, 7.2, 0.06581, 0, 0, 0, 0, 0, 0
Results:
op rate : 139351 [WRITE:139351]
partition rate : 139351 [WRITE:139351]
row rate : 139351 [WRITE:139351]
latency mean : 1.4 [WRITE:1.4]
latency median : 1.1 [WRITE:1.1]
latency 95th percentile : 2.2 [WRITE:2.2]
latency 99th percentile : 3.2 [WRITE:3.2]
latency 99.9th percentile : 68.3 [WRITE:68.3]
latency max : 165.2 [WRITE:165.2]
Total partitions : 1000000 [WRITE:1000000]
Total errors : 0 [WRITE:0]
total gc count : 0
total gc mb : 0
total gc time (s) : 0
avg gc time(ms) : NaN
stdev gc time(ms) : 0
Total operation time : 00:00:07
END
--
You received this message because you are subscribed to a topic in the Google Groups "DataStax Java Driver for Apache Cassandra User Mailing List" group.
To unsubscribe from this topic, visit https://groups.google.com/a/lists.datastax.com/d/topic/java-driver-user/zqtlMDktuv4/unsubscribe.
To unsubscribe from this group and all its topics, send an email to java-driver-us...@lists.datastax.com.