Hi,
There are 2 nodes in the ring (replication factor = 2). Here is the output from nodetool:
Address      DC           Rack   Status  State   Load      Owns    Token
                                                                   140637942640091053069842345145244564255
10.90.106.8  datacenter1  rack1  Up      Normal  78.86 MB  65.21%  81437580563132525083742012705162878500
10.90.106.7  datacenter1  rack1  Up      Normal  78.93 MB  34.79%  140637942640091053069842345145244564255
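As a sanity check on the Owns column, the percentages can be recomputed from the two tokens. The sketch below assumes RandomPartitioner (token space 0 to 2^127), which the output above does not state explicitly; the class and method names are made up for illustration.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;

public class RingOwnership {
    // Assumption: RandomPartitioner, whose tokens live in [0, 2^127).
    static final BigInteger RING_SIZE = BigInteger.ONE.shiftLeft(127);

    /**
     * Percentage of the ring covered by the range (previousToken, token].
     * The mod() handles the range that wraps around past zero.
     * Note: with a single node (both tokens equal) this returns 0, not 100.
     */
    static BigDecimal ownedPercent(BigInteger previousToken, BigInteger token) {
        BigInteger width = token.subtract(previousToken).mod(RING_SIZE);
        return new BigDecimal(width)
                .multiply(BigDecimal.valueOf(100))
                .divide(new BigDecimal(RING_SIZE), 2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // Tokens copied from the nodetool output above.
        BigInteger token8 = new BigInteger("81437580563132525083742012705162878500");
        BigInteger token7 = new BigInteger("140637942640091053069842345145244564255");

        System.out.println("10.90.106.8 owns " + ownedPercent(token7, token8) + "%");
        System.out.println("10.90.106.7 owns " + ownedPercent(token8, token7) + "%");
    }
}
```

Under that assumption the two ranges come out at 65.21% and 34.79%, matching nodetool, so the ring is unbalanced but otherwise consistent.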
I am using port 9161 on both servers. However, I should note that there is a 0.6.x installation running on the same servers on port 9160 (which is why I had to change the port). The client is not configured to talk to 9160.
Here is how I have my cluster + keyspace initialized:
CassandraHostConfigurator config = new CassandraHostConfigurator();
config.setHosts(hosts);
// Maximum number of active connections per host pool
config.setMaxActive(maxActive);
config.setClockResolution(ClockResolution.MICROSECONDS);
config.setCassandraThriftSocketTimeout(connectTimeout);
// Fail immediately instead of blocking when a pool is exhausted
config.setExhaustedPolicy(ExhaustedPolicy.WHEN_EXHAUSTED_FAIL);
this.cluster = HFactory.getOrCreateCluster("myCluster", config);
// Consistency level ONE for reads and writes; on failure, try all available hosts
Keyspace ks = HFactory.createKeyspace(keyspace, this.cluster,
        new AllOneConsistencyLevelPolicy(), FailoverPolicy.ON_FAIL_TRY_ALL_AVAILABLE);
I restarted the client and tried taking A (10.90.106.7) down again. After doing so, I checked the ring info from the CLI just to be sure, and the output was identical to the above. Here is the full log output after I brought A down:
2012-01-17 21:44:00,204 ERROR client.HThriftClient: Could not flush transport (to be expected if the pool is shutting down) in close for client: CassandraClient<10.90.106.7:9161-1>
org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:156)
at me.prettyprint.cassandra.connection.client.HThriftClient.close(HThriftClient.java:98)
at me.prettyprint.cassandra.connection.client.HThriftClient.close(HThriftClient.java:26)
at me.prettyprint.cassandra.connection.HConnectionManager.closeClient(HConnectionManager.java:308)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:257)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97)
at me.prettyprint.cassandra.service.template.ThriftColumnFamilyTemplate.multigetSliceInternal(ThriftColumnFamilyTemplate.java:110)
at me.prettyprint.cassandra.service.template.ThriftColumnFamilyTemplate.doExecuteMultigetSlice(ThriftColumnFamilyTemplate.java:51)
at me.prettyprint.cassandra.service.template.ColumnFamilyTemplate.queryColumns(ColumnFamilyTemplate.java:117)
...
Caused by: java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
... 17 more
2012-01-17 21:44:00,205 ERROR connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.90.106.7(10.90.106.7):9161
2012-01-17 21:44:00,205 ERROR connection.HConnectionManager: Pool state on shutdown: <ConcurrentCassandraClientPoolByHost>:{10.90.106.7(10.90.106.7):9161}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19
2012-01-17 21:44:00,206 INFO connection.ConcurrentHClientPool: Shutdown triggered on <ConcurrentCassandraClientPoolByHost>:{10.90.106.7(10.90.106.7):9161}
2012-01-17 21:44:00,206 INFO connection.ConcurrentHClientPool: Shutdown complete on <ConcurrentCassandraClientPoolByHost>:{10.90.106.7(10.90.106.7):9161}
2012-01-17 21:44:00,206 INFO connection.CassandraHostRetryService: Host detected as down was added to retry queue: 10.90.106.7(10.90.106.7):9161
2012-01-17 21:44:00,207 WARN connection.HConnectionManager: Could not fullfill request on this host CassandraClient<10.90.106.7:9161-1>
2012-01-17 21:44:00,207 WARN connection.HConnectionManager: Exception:
me.prettyprint.hector.api.exceptions.HectorTransportException: org.apache.thrift.transport.TTransportException
at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:39)
at me.prettyprint.cassandra.service.template.ThriftColumnFamilyTemplate$2.execute(ThriftColumnFamilyTemplate.java:120)
at me.prettyprint.cassandra.service.template.ThriftColumnFamilyTemplate$2.execute(ThriftColumnFamilyTemplate.java:110)
at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:99)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:243)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97)
at me.prettyprint.cassandra.service.template.ThriftColumnFamilyTemplate.multigetSliceInternal(ThriftColumnFamilyTemplate.java:110)
at me.prettyprint.cassandra.service.template.ThriftColumnFamilyTemplate.doExecuteMultigetSlice(ThriftColumnFamilyTemplate.java:51)
at me.prettyprint.cassandra.service.template.ColumnFamilyTemplate.queryColumns(ColumnFamilyTemplate.java:117)
...
Caused by: org.apache.thrift.transport.TTransportException
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at org.apache.cassandra.thrift.Cassandra$Client.recv_multiget_slice(Cassandra.java:656)
at org.apache.cassandra.thrift.Cassandra$Client.multiget_slice(Cassandra.java:638)
at me.prettyprint.cassandra.service.template.ThriftColumnFamilyTemplate$2.execute(ThriftColumnFamilyTemplate.java:116)
... 15 more
2012-01-17 21:44:00,207 INFO connection.HConnectionManager: Client CassandraClient<10.90.106.7:9161-1> released to inactive or dead pool. Closing.
2012-01-17 21:44:00,207 WARN connection.CassandraHostRetryService: Downed 10.90.106.7(10.90.106.7):9161 host still appears to be down: Unable to open transport to 10.90.106.7(10.90.106.7):9161 , java.net.ConnectException: Connection refused
2012-01-17 21:44:00,626 INFO connection.CassandraHostRetryService: Removing host 10.90.106.7(10.90.106.7):9161 - It does no longer exist in the ring.
Then I brought down B (10.90.106.8). The log output is similar (though note the lack of the "no longer exist in the ring" message):
2012-01-17 21:52:37,429 ERROR client.HThriftClient: Could not flush transport (to be expected if the pool is shutting down) in close for client: CassandraClient<10.90.106.8:9161-9>
org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:156)
at me.prettyprint.cassandra.connection.client.HThriftClient.close(HThriftClient.java:98)
at me.prettyprint.cassandra.connection.client.HThriftClient.close(HThriftClient.java:26)
at me.prettyprint.cassandra.connection.HConnectionManager.closeClient(HConnectionManager.java:308)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:257)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97)
at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
at me.prettyprint.cassandra.service.template.AbstractColumnFamilyTemplate.executeBatch(AbstractColumnFamilyTemplate.java:115)
at me.prettyprint.cassandra.service.template.AbstractColumnFamilyTemplate.executeIfNotBatched(AbstractColumnFamilyTemplate.java:149)
at me.prettyprint.cassandra.service.template.ColumnFamilyTemplate.update(ColumnFamilyTemplate.java:69)
...
Caused by: java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
... 18 more
2012-01-17 21:52:37,430 ERROR connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host 10.90.106.8(10.90.106.8):9161
2012-01-17 21:52:37,430 ERROR connection.HConnectionManager: Pool state on shutdown: <ConcurrentCassandraClientPoolByHost>:{10.90.106.8(10.90.106.8):9161}; IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19
2012-01-17 21:52:37,430 INFO connection.ConcurrentHClientPool: Shutdown triggered on <ConcurrentCassandraClientPoolByHost>:{10.90.106.8(10.90.106.8):9161}
2012-01-17 21:52:37,430 INFO connection.ConcurrentHClientPool: Shutdown complete on <ConcurrentCassandraClientPoolByHost>:{10.90.106.8(10.90.106.8):9161}
2012-01-17 21:52:37,430 INFO connection.CassandraHostRetryService: Host detected as down was added to retry queue: 10.90.106.8(10.90.106.8):9161
2012-01-17 21:52:37,431 WARN connection.CassandraHostRetryService: Downed 10.90.106.8(10.90.106.8):9161 host still appears to be down: Unable to open transport to 10.90.106.8(10.90.106.8):9161 , java.net.ConnectException: Connection refused
2012-01-17 21:52:37,431 WARN connection.HConnectionManager: Could not fullfill request on this host CassandraClient<10.90.106.8:9161-9>
2012-01-17 21:52:37,431 WARN connection.HConnectionManager: Exception:
me.prettyprint.hector.api.exceptions.HectorTransportException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:39)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:249)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97)
at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
at me.prettyprint.cassandra.service.template.AbstractColumnFamilyTemplate.executeBatch(AbstractColumnFamilyTemplate.java:115)
at me.prettyprint.cassandra.service.template.AbstractColumnFamilyTemplate.executeIfNotBatched(AbstractColumnFamilyTemplate.java:149)
at me.prettyprint.cassandra.service.template.ColumnFamilyTemplate.update(ColumnFamilyTemplate.java:69)
...
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:157)
at org.apache.cassandra.thrift.Cassandra$Client.send_batch_mutate(Cassandra.java:1020)
at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:1008)
at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:246)
at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:243)
at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:99)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:243)
... 13 more
Caused by: java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
... 20 more
2012-01-17 21:52:37,431 INFO connection.HConnectionManager: Client CassandraClient<10.90.106.8:9161-9> released to inactive or dead pool. Closing.
2012-01-17 21:52:37,432 INFO connection.HConnectionManager: Client CassandraClient<10.90.106.8:9161-9> released to inactive or dead pool. Closing.
Then the following exception bubbles up to the app:
me.prettyprint.hector.api.exceptions.HectorException: All host pools marked down. Retry burden pushed out to client.
at me.prettyprint.cassandra.connection.HConnectionManager.getClientFromLBPolicy(HConnectionManager.java:354)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:234)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97)
at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
at me.prettyprint.cassandra.service.template.AbstractColumnFamilyTemplate.executeBatch(AbstractColumnFamilyTemplate.java:115)
at me.prettyprint.cassandra.service.template.AbstractColumnFamilyTemplate.executeIfNotBatched(AbstractColumnFamilyTemplate.java:149)
at me.prettyprint.cassandra.service.template.ColumnFamilyTemplate.update(ColumnFamilyTemplate.java:69)
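Since the message says the retry burden is pushed out to the client, one way an application could absorb this is a small retry wrapper around the failing call. The following is only a minimal sketch under stated assumptions: RetryWithBackoff and callWithRetry are hypothetical names, not part of Hector, and it relies only on HectorException being an unchecked exception. It does not address why both pools ended up marked down.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class RetryWithBackoff {
    /**
     * Runs op, retrying on any RuntimeException up to maxAttempts times,
     * doubling the delay between attempts. Rethrows the last failure.
     */
    static <T> T callWithRetry(Supplier<T> op, int maxAttempts, long initialDelayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(delayMs);
                    } catch (InterruptedException ie) {
                        // Preserve the interrupt and give up early.
                        Thread.currentThread().interrupt();
                        throw last;
                    }
                    delayMs *= 2;
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        // Stand-in for a Cassandra call: fails twice, then succeeds.
        AtomicInteger calls = new AtomicInteger();
        String result = callWithRetry(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new RuntimeException("All host pools marked down");
            }
            return "ok";
        }, 5, 10L);
        System.out.println(result + " after " + calls.get() + " attempts");
    }
}
```

In a real client the Supplier would wrap the template call that appears in the stack trace above (for example the update), and the catch could be narrowed to HectorException so that unrelated programming errors are not retried.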
Hope this helps,
Mike