Best practice when using LatencyAwarePolicy?

Steven

Aug 24, 2016, 1:41:08 PM
to DataStax Java Driver for Apache Cassandra User Mailing List
I recently experimented with the LatencyAwarePolicy in the DataStax Java Driver in an attempt to avoid sporadically slow nodes in our Cassandra clusters. The results were mixed at best, and I'm hoping someone here can advise me on whether there is a way to make better use of this policy or whether there are better alternatives.

In our Cassandra clusters, which are hosted on Amazon EC2, we observe a good median response time of a few milliseconds for any given read or write request, but the 99th percentile of response times over any given period can be orders of magnitude higher. We see this across all types of requests whether read or write, all nodes, and all column families. Under the hypothesis that these slow requests are the result of some transient slow-down on the coordinator nodes, I was hoping that by using the LatencyAwarePolicy, the Driver would better balance requests between slow and fast nodes and thus reduce the 99th percentile.

In load testing and "shadow" production testing with the LatencyAwarePolicy, configured with the default exclusion threshold, I observed that the policy was hysterically blacklisting and then reinstating nodes several times a minute, and yet this didn't seem to move the needle much on the 99th percentile response time. Setting the exclusion threshold higher reduced the frequency of blacklisting, but did nothing for response time.
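
To make the exclusion threshold concrete: as documented for the 3.x driver, LatencyAwarePolicy penalizes a host when its averaged latency exceeds the exclusion threshold (default 2) times that of the fastest host. The sketch below is a simplified, standard-library-only model of that comparison, not the driver's code. The class name, host names, and latency figures are made up for illustration, and the model leaves out the time-decayed averaging, the minimum measurement count, and the retry period that the real policy applies.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the exclusion rule; not part of the DataStax driver.
class ExclusionThresholdSketch {

    /** Returns hosts whose average latency exceeds threshold x the fastest host's average. */
    public static List<String> excluded(Map<String, Double> avgLatencyMillis,
                                        double exclusionThreshold) {
        // Find the best (lowest) average latency across all hosts.
        double fastest = Double.MAX_VALUE;
        for (double v : avgLatencyMillis.values()) {
            fastest = Math.min(fastest, v);
        }
        // A host is excluded when it is more than `exclusionThreshold` times slower.
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Double> e : avgLatencyMillis.entrySet()) {
            if (e.getValue() > exclusionThreshold * fastest) {
                out.add(e.getKey());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Double> avg = new LinkedHashMap<>();
        avg.put("node-a", 2.0); // illustrative averages in milliseconds
        avg.put("node-b", 3.0);
        avg.put("node-c", 4.5);
        System.out.println(excluded(avg, 2.0)); // prints [node-c]
        System.out.println(excluded(avg, 3.0)); // prints []
    }
}
```

With baseline averages of only a few milliseconds, a node needs to be just a couple of milliseconds slower than the best one to cross a 2x threshold, which may be one reason for the frequent exclusions described above.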

So, what is the best practice for configuring the LatencyAwarePolicy? Is there a way to use it effectively to balance load between slow and fast nodes that results in better performance? Can anyone share a successful case of using this policy? Can anyone suggest an alternative approach?

Vishy Kasar

Aug 24, 2016, 1:53:56 PM
to java-dri...@lists.datastax.com
We did not see much success with LatencyAwarePolicy. 

An alternative approach would be to use ConstantSpeculativeExecutionPolicy. That certainly helped us get responses faster when some coordinators became busy.
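
For anyone unfamiliar with the idea behind speculative execution: if the first coordinator has not answered within a fixed delay, the same request is sent to a second node and whichever response arrives first is used. The sketch below illustrates that pattern with the Java standard library only. It is not the driver's implementation; the class, the method names, and the simulated node delays are all hypothetical. In the 3.x driver the corresponding policy is constructed as `new ConstantSpeculativeExecutionPolicy(delayMillis, maxSpeculativeExecutions)` and, as far as I know, only applies to statements marked idempotent.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical illustration of constant-delay speculative execution.
class SpeculativeExecutionSketch {

    /** Runs `primary`; if it has not finished after `delayMillis`, also runs `backup`. */
    public static <T> T firstResponse(Supplier<T> primary, Supplier<T> backup, long delayMillis) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);
        try {
            CompletableFuture<T> winner = new CompletableFuture<>();
            // Start the primary request immediately.
            CompletableFuture.supplyAsync(primary, scheduler).thenAccept(winner::complete);
            // After the delay, start the backup only if nothing has answered yet.
            scheduler.schedule(() -> {
                if (!winner.isDone()) {
                    CompletableFuture.supplyAsync(backup, scheduler).thenAccept(winner::complete);
                }
            }, delayMillis, TimeUnit.MILLISECONDS);
            // Error handling is omitted: this blocks until one request succeeds.
            return winner.join();
        } finally {
            scheduler.shutdownNow(); // cancel whichever request lost the race
        }
    }

    /** Simulates a node that takes `millis` to answer and returns its own name. */
    public static String simulatedNode(String name, long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return name;
    }

    public static void main(String[] args) {
        // Slow primary: the backup is started after 20 ms and is expected to win.
        System.out.println(firstResponse(
                () -> simulatedNode("node-a", 500), () -> simulatedNode("node-b", 5), 20));
        // Fast primary: it answers before the delay, so no backup is started.
        System.out.println(firstResponse(
                () -> simulatedNode("node-a", 5), () -> simulatedNode("node-b", 5), 200));
    }
}
```

The trade-off is extra load: every request that outlasts the delay is sent twice, so the delay is usually chosen near a high percentile of normal latency rather than near the median.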
