HThriftClient - Could not flush transport

Chris Kaminski

Apr 7, 2011, 6:27:43 PM
to hector...@googlegroups.com
I'm getting the following error when doing integration testing (single JVM, no fork). 

ERROR me.prettyprint.cassandra.connection.HThriftClient - Could not flush transport (to be expected if the pool is shutting down) in close for client: CassandraClient<localhost:9160-2>
org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
        at org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161)
        ...

I'm using the cassandra-maven-plugin to start Cassandra.  My tests fail after I try to push a Thrift message of 22 MB when thrift_framed_transport_size_in_mb=16.

So once I get a TTransportException, Hector gives up communicating with that server.

So if this happens in production with my web service, should I call getOrCreateCluster again to recover, assuming that the Cassandra hosts are running?  If I lose a whole Cassandra cluster, I do not want to have to bounce my web servers when the Cassandra cluster recovers.

In my integration tests I'm using forkMode = always, so I am certain that Cassandra is alive after this error. 

Regards 
-Chris 

 

Nate McCall

Apr 7, 2011, 6:35:17 PM
to hector...@googlegroups.com, Chris Kaminski
So TTransportException is what comes back immediately when the file's
size exceeds the configured transport size? Or did this occur after some
period of test activity?

Chris Kaminski

Apr 7, 2011, 6:44:28 PM
to Nate McCall, hector...@googlegroups.com
The "buffer2 size" in the log below, 22038392, is the size of the BLOB I'm storing.  It throws the exception immediately upon insert.  I might have had 800 successful inserts ahead of that.  From then on the Cluster appears to be useless.  I am checking now to see if going from 10 MB to 22 MB and back to 10 MB behaves any differently.
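
One way to avoid hitting the frame limit in the first place is to check the payload size on the client before inserting, and to split an oversized BLOB into pieces that each stay under the limit. The following is only a minimal sketch under assumptions: the 16 MB limit is taken from the thrift_framed_transport_size_in_mb=16 setting mentioned earlier in the thread, and the class name, method names, 1 MB headroom, and 8 MB chunk size are all hypothetical. None of this is part of Hector or Thrift.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FrameSizeGuard {
    // Assumption: mirrors thrift_framed_transport_size_in_mb=16 on the server.
    public static final long FRAME_LIMIT_BYTES = 16L * 1024 * 1024;

    /**
     * Returns true if a payload of this size, plus some headroom for the
     * rest of the Thrift message, stays within the frame limit.
     */
    public static boolean fitsInFrame(long payloadBytes, long headroomBytes) {
        return payloadBytes + headroomBytes <= FRAME_LIMIT_BYTES;
    }

    /** Splits data into consecutive chunks of at most chunkSize bytes. */
    public static List<byte[]> split(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        int start = 0;
        while (start < data.length) {
            // Length of this chunk; the last one may be shorter.
            int len = Math.min(chunkSize, data.length - start);
            chunks.add(Arrays.copyOfRange(data, start, start + len));
            start += len;
        }
        return chunks;
    }

    public static void main(String[] args) {
        int blobSize = 22038392; // size reported in this thread
        byte[] blob = new byte[blobSize];

        System.out.println("fits in one frame: "
                + fitsInFrame(blob.length, 1024 * 1024));
        System.out.println("chunks at 8 MB: "
                + split(blob, 8 * 1024 * 1024).size());
    }
}
```

Each chunk would then need to be written in its own request (for example as a separate column), and how the pieces are keyed and reassembled is left open here.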

I only have 1 Cassandra server configured (as these are integration tests). 

[TRACE] StoredFileDAO: file: cdc74d1f-d5a6-4209-8ea7-fd5866c528f9 persisting buffer2 size: 22038392
166125 [19689347@qtp-26659010-6] ERROR me.prettyprint.cassandra.connection.HThriftClient - Could not flush transport (to be expected if the pool is shutting down) in close for client: CassandraClient<localhost:9160-2>
org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
    at org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161)
    at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:158)
    at me.prettyprint.cassandra.connection.HThriftClient.close(HThriftClient.java:82)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:233)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:129)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:100)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:106)
    at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:203)
    at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:200)
    at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
    at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
    at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:200)
    at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:55)

Nate McCall

Apr 7, 2011, 6:49:31 PM
to Chris Kaminski, hector...@googlegroups.com
A TTransportException causes us to mark the server as down, as it usually
only occurs when the server is under duress. I'll see if I can reproduce
this; my understanding was that these types of errors should produce
InvalidRequestExceptions.

Chris Kaminski

Apr 7, 2011, 7:03:14 PM
to Nate McCall, hector...@googlegroups.com
Thank you, Nate.  

I'm using hector 0.7.0-28 and cassandra 0.7.0-rc4

So am I correct that this is a situation where I need to create a new Cluster and restart processing after my servers come back?

Nate McCall

Apr 8, 2011, 10:42:01 AM
to Chris Kaminski, hector...@googlegroups.com
You could - it might be easier to sleep for N seconds where N >
CassandraHostConfigurator#retryDownedHostDelayInSeconds (default 10).
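
As a rough illustration of the "sleep for N seconds and try again" idea, here is a minimal sketch of a generic retry helper. It does not use any Hector API: the class and method names are hypothetical, the operation is a stand-in for whatever Hector call failed, and the 10-second delay is simply the default quoted above. In real code you would probably catch only the Hector exception that signals a downed host rather than every Exception.

```java
import java.util.concurrent.Callable;

public class RetryAfterHostRecovery {
    /**
     * Runs op. If it throws, waits waitMillis and tries again,
     * up to maxAttempts attempts in total. Rethrows the last failure.
     */
    public static <T> T callWithRetry(Callable<T> op, int maxAttempts, long waitMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Give the downed-host retry a chance to bring the host back.
                    Thread.sleep(waitMillis);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int retryDelaySeconds = 10; // default quoted in this thread
        // N must be greater than the retry delay, so add a small margin.
        long waitMillis = (retryDelaySeconds + 2) * 1000L;

        // Stand-in for the real operation, e.g. a mutator insert.
        String result = callWithRetry(() -> "inserted", 3, waitMillis);
        System.out.println(result);
    }
}
```

The wait is only useful if it is longer than the configured retry delay, which is why the sketch derives it from that value instead of hard-coding a number.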

Daniel Lundin

Apr 11, 2011, 4:03:17 AM
to hector...@googlegroups.com
We're seeing this in our logs as well, using a single api.Keyspace, RoundRobinBalancingPolicy, and a plenitude of worker threads.
The actual load is fairly constantly high, but with no spikes. Cassandra itself isn't stressed though.

All pools are marked down periodically, at regular intervals.

Will look further into what's happening. Meanwhile, any ideas are much appreciated.