HThriftClient - Could not flush transport

Chris Kaminski

Apr 7, 2011, 6:27:43 PM

to hector...@googlegroups.com

I'm getting the following error when doing integration testing (single JVM, no fork).

ERROR me.prettyprint.cassandra.connection.HThriftClient - Could not flush transport (to be expected if the pool is shutting down) in close for client: CassandraClient<localhost:9160-2>
org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
    at org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161)
    ...

I'm using the cassandra-maven-plugin to start Cassandra. My tests fail after I try to push a Thrift message of 22 MB with thrift_framed_transport_size_in_mb=16.

So once I get a TTransportException, Hector gives up communicating with that server.

So if I'm in production with my web service, do I need to reissue getOrCreateCluster to recover, assuming that the Cassandra hosts are running? If I lose a whole Cassandra cluster, I do not want to have to bounce my web servers when the Cassandra cluster recovers.

In my integration tests I'm using forkMode = always, so I am certain that Cassandra is alive after this error.

Regards

-Chris

Nate McCall

Apr 7, 2011, 6:35:17 PM

to hector...@googlegroups.com, Chris Kaminski

So TTransportException is what comes back immediately when the file's
size exceeds the configured transport size? Or did this occur after some
period of test activity?

Chris Kaminski

Apr 7, 2011, 6:44:28 PM

to Nate McCall, hector...@googlegroups.com

Buffer2 Size: 22038392 is the size of the BLOB I'm storing. It throws the exception immediately upon insert. I might have had 800 successful inserts ahead of that. From then on the Cluster appears to be useless. I am checking now to see if going from 10 MB to 22 MB back to 10 MB behaves any differently.

I only have 1 Cassandra server configured (as these are integration tests).

[TRACE] StoredFileDAO: file: cdc74d1f-d5a6-4209-8ea7-fd5866c528f9 persisting buffer2 size: 22038392

166125 [19689347@qtp-26659010-6] ERROR me.prettyprint.cassandra.connection.HThriftClient - Could not flush transport (to be expected if the pool is shutting down) in close for client: CassandraClient<localhost:9160-2>
org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
    at org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161)
    at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:158)
    at me.prettyprint.cassandra.connection.HThriftClient.close(HThriftClient.java:82)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:233)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:129)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:100)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:106)
    at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:203)
    at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:200)
    at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
    at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
    at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:200)
    at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:55)
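
For reference, the size mismatch reported in this thread can be checked with simple arithmetic. The sketch below is a hypothetical helper, not part of Hector or Thrift. It assumes the configured limit is interpreted as megabytes of 1024 * 1024 bytes, and it ignores any framing or serialization overhead that Thrift adds on top of the payload.

```java
/** Hypothetical helper for illustration; not part of Hector, Thrift, or Cassandra. */
final class FrameSizeCheck {
    private FrameSizeCheck() {}

    /** Converts a frame limit in MB to bytes, assuming 1 MB = 1024 * 1024 bytes. */
    static long frameLimitBytes(int frameSizeInMb) {
        return frameSizeInMb * 1024L * 1024L;
    }

    /**
     * Returns true if the raw payload is smaller than the frame limit.
     * Thrift's own per-message overhead is not accounted for here.
     */
    static boolean fitsInFrame(long payloadBytes, int frameSizeInMb) {
        return payloadBytes < frameLimitBytes(frameSizeInMb);
    }
}
```

With the figures from this thread, `FrameSizeCheck.fitsInFrame(22038392L, 16)` returns false, since a 16 MB limit works out to 16777216 bytes under that assumption. A check like this before the insert would let the application reject or split an oversized BLOB instead of sending it.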

Nate McCall

Apr 7, 2011, 6:49:31 PM

to Chris Kaminski, hector...@googlegroups.com

A TTransportException causes us to mark the host as down, as that usually
only comes when the server is under duress. I'll see if I can reproduce
this; my understanding was that these types of errors should produce an
InvalidRequestException.

Chris Kaminski

Apr 7, 2011, 7:03:14 PM

to Nate McCall, hector...@googlegroups.com

Thank you, Nate.

I'm using hector 0.7.0-28 and cassandra 0.7.0-rc4

So am I correct that this is a situation where I need to create a new Cluster and restart processing after my servers come back?

Nate McCall

Apr 8, 2011, 10:42:01 AM

to Chris Kaminski, hector...@googlegroups.com

You could, but it might be easier to sleep for N seconds, where N >
CassandraHostConfigurator#retryDownedHostsDelayInSeconds (default 10).
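
A minimal sketch of the sleep-and-retry approach suggested here, written in plain Java with no Hector dependency. The class and method names are hypothetical. The Hector operation (for example, a Mutator insert) would be passed in as the Callable, and the sleep should be longer than the configured retry delay for downed hosts (10 seconds by default, per the message above).

```java
import java.util.concurrent.Callable;

/** Hypothetical retry helper for illustration; not part of Hector. */
final class RetryAfterDownedHostDelay {
    private RetryAfterDownedHostDelay() {}

    /**
     * Calls the operation up to maxAttempts times, sleeping sleepMillis
     * between failed attempts. Rethrows the last failure if every attempt fails.
     */
    static <T> T callWithRetry(Callable<T> operation, int maxAttempts, long sleepMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                last = e;
                // Wait before the next attempt so the client has a chance
                // to bring the downed host back into its pool.
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMillis);
                }
            }
        }
        throw last;
    }
}
```

A call might look like `RetryAfterDownedHostDelay.callWithRetry(() -> doInsert(), 3, 11_000L)`, where `doInsert` is a hypothetical method wrapping the Hector call and 11 seconds is chosen to exceed the default delay. Note that retrying will not help if the request itself is the cause, as with a message larger than the configured frame size.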

Daniel Lundin

Apr 11, 2011, 4:03:17 AM

to hector...@googlegroups.com

We're seeing this in logs as well, using a single api.Keyspace, RoundRobinBalancingPolicy, and a plenitude of worker threads.

Actual load is fairly constant and high, with no spikes. Cassandra itself isn't stressed, though.

All pools are marked down periodically, at regular intervals.

Will look further into what's happening. Meanwhile, any ideas are much appreciated.
