Hey guys,
We are using Astyanax 1.56.26 with Cassandra 1.1.9 in various configurations, and we are getting an exception like this:
com.netflix.astyanax.connectionpool.exceptions.OperationTimeoutException: OperationTimeoutException: [host=10.10.3.99(10.10.3.99):9160, latency=1(90011), attempts=14]TimedOutException()
at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:171) ~[rookery-spark.jar:na]
at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:61) ~[rookery-spark.jar:na]
at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28) ~[rookery-spark.jar:na]
at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151) ~[rookery-spark.jar:na]
at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:69) ~[rookery-spark.jar:na]
at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:253) ~[rookery-spark.jar:na]
at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation(ThriftKeyspaceImpl.java:465) ~[rookery-spark.jar:na]
at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000(ThriftKeyspaceImpl.java:67) ~[rookery-spark.jar:na]
at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1.execute(ThriftKeyspaceImpl.java:103) ~[rookery-spark.jar:na]
at ooyala.rookery.spark.MBICassandraCacheManager.saveMetadata(MBICacheManager.scala:75
The code is pretty simple: it just sets up a BatchMutation and does one write with batch.withRow(....).putEmptyColumn(....). We have seen the same exception in other code as well.
What bothers us is that the exception is only thrown after the call appears to hang for roughly 10-30 minutes without doing anything. Is there anything we can do to tune Astyanax to fail faster?
We'd also like to know the exact reason for this exception.
Some configuration params:
.setMaxConnsPerHost(100)
.setMaxConns(1000)
Also, connection timeout is set to 2 seconds, and socket timeout to 10 seconds.
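On the fail-faster question: independent of any Astyanax setting, one way to bound the wait is a hard client-side deadline around the blocking call. The following is only a minimal sketch under that assumption. `DeadlineCall` and `callWithDeadline` are hypothetical names, not part of Astyanax, and it uses nothing but `java.util.concurrent`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper (not an Astyanax class): runs a blocking call on a
// worker thread and gives up waiting once the deadline passes.
final class DeadlineCall {
    // Daemon threads so a stuck call cannot keep the JVM alive on shutdown.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "deadline-call");
        t.setDaemon(true);
        return t;
    });

    private DeadlineCall() {}

    static <T> T callWithDeadline(Callable<T> op, long timeout, TimeUnit unit)
            throws Exception {
        Future<T> future = POOL.submit(op);
        try {
            // Wait at most `timeout`; throws TimeoutException if it elapses.
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // Best effort: interrupts the worker thread. A thread blocked in
            // socket I/O may ignore the interrupt and keep running.
            future.cancel(true);
            throw e;
        } catch (ExecutionException e) {
            // Unwrap so the caller sees the original exception type.
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        }
    }
}
```

With the batch from above, the call would look something like `DeadlineCall.callWithDeadline(() -> batch.execute(), 15, TimeUnit.SECONDS)`, where the 15 seconds is an arbitrary illustrative value. This only stops the caller from waiting. It does not explain the hang, and the underlying operation may still be running in the background after the deadline fires.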
Thanks,
Evan