I feel like I could just as well be asking this question on the cassandra-user list, but I'll ask it here. I'm using version 3.0.1 of the DataStax Java Driver pointing at Cassandra 2.1.12 (DSE 4.8.4).
I have a keyspace that's using a replication factor of 3.
I have some inserts that are timing out, which in itself is fine:
com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency LOCAL_ONE (1 replica were required but only 0 acknowledged the write)
at com.datastax.driver.core.exceptions.WriteTimeoutException.copy(WriteTimeoutException.java:100) ~[tools-timeseries-migrator.jar:?]
at com.datastax.driver.core.Responses$Error.asException(Responses.java:122) ~[tools-timeseries-migrator.jar:?]
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:471) [tools-timeseries-migrator.jar:?]
at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1013) [tools-timeseries-migrator.jar:?]
...
Caused by: com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency LOCAL_ONE (1 replica were required but only 0 acknowledged the write)
at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:59) ~[tools-timeseries-migrator.jar:?]
at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:37) ~[tools-timeseries-migrator.jar:?]
What's strange to me is that they're timing out at consistency level LOCAL_ONE when the queries themselves have been configured to use LOCAL_QUORUM.
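For reference, the consistency level is being set on the statements themselves, roughly like this (the table and bind values here are illustrative, not my actual schema):

```java
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

// Statement explicitly configured with LOCAL_QUORUM; the
// WriteTimeoutException nevertheless reports LOCAL_ONE.
Statement insert = new SimpleStatement(
        "INSERT INTO my_ks.my_table (id, value) VALUES (?, ?)", id, value)
        .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
session.execute(insert);
```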
When I saw this occurring, it reminded me of the behavior of DowngradingConsistencyRetryPolicy, as described in this DataStax Developer Blog post:
http://www.datastax.com/dev/blog/cassandra-error-handling-done-right. Under that policy, LOCAL_QUORUM would downgrade to LOCAL_ONE. The thing is, I'm not using DowngradingConsistencyRetryPolicy. I don't specify a RetryPolicy when I build my Cluster object, so it defaults to DefaultRetryPolicy, and, as you can see from the API docs for DefaultRetryPolicy: "This retry policy is conservative in that it will never retry with a different consistency level than the one of the initial operation."
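In other words, my Cluster is built without any withRetryPolicy(...) call, along these lines (contact point is illustrative):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.policies.DefaultRetryPolicy;

// No withRetryPolicy(...) here, so the driver falls back to
// DefaultRetryPolicy, which never retries at a different
// consistency level than the one on the original statement.
Cluster cluster = Cluster.builder()
        .addContactPoint("127.0.0.1")
        .build();

// This should be equivalent to explicitly writing:
// Cluster.builder().withRetryPolicy(DefaultRetryPolicy.INSTANCE)...
```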
So I'm puzzled as to why LOCAL_ONE is showing up in the exception. Anyone have any idea?
My immediate practical plan of action is to keep the cluster from getting overloaded so the timeouts go away; at that point, the fact that the timeouts were being reported at the wrong consistency level becomes academic. But I can't help thinking this is a sign of something deeper amiss, or of something I'm missing.