Hector connection pool

Daning Wang

Mar 12, 2012, 2:47:08 PM

to hector-users

We are using Hector 0.8.0.3 right now. We get this exception once in a
while, and after that all the nodes are marked down. We have to
restart the application to get the connections working again.

What does the message "are we shutting down" mean?

We will upgrade to the latest release; I just want to check whether there
is a reported bug for this. I searched the net and could not find one.

2012-03-08 16:37:15,103 [pool-2-thread-34288] Cassandra client acquisition interrupted
java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(Unknown Source)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
        at java.util.concurrent.ArrayBlockingQueue.poll(Unknown Source)
        at me.prettyprint.cassandra.connection.ConcurrentHClientPool.waitForConnection(ConcurrentHClientPool.java:117)
        at me.prettyprint.cassandra.connection.ConcurrentHClientPool.borrowClient(ConcurrentHClientPool.java:77)
        at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:226)
        at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97)
        at me.prettyprint.cassandra.model.CqlQuery.execute(CqlQuery.java:93)
        at com.netseer.cassandra.cache.dao.CacheReader.getRows(CacheReader.java:267)
        at com.netseer.cassandra.cache.dao.CacheReader.getCache0(CacheReader.java:55)
        at com.netseer.cassandra.cache.dao.CacheDao.getCaches(CacheDao.java:85)
        at com.netseer.cassandra.cache.dao.CacheDao.getCache(CacheDao.java:71)
        at com.netseer.cassandra.cache.dao.CacheDao.getCache(CacheDao.java:149)
        at com.netseer.cassandra.cache.service.CacheServiceImpl.getCache(CacheServiceImpl.java:55)
        at com.netseer.cassandra.cache.service.CacheServiceImpl.getCache(CacheServiceImpl.java:28)
        at com.netseer.dsat.cache.CassandraDSATCacheImpl.get(CassandraDSATCacheImpl.java:62)
        at com.netseer.dsat.cache.CassandraDSATCacheImpl.getTimedValue(CassandraDSATCacheImpl.java:144)
        at com.netseer.dsat.serving.GenericCacheManager$4.call(GenericCacheManager.java:427)
        at com.netseer.dsat.serving.GenericCacheManager$4.call(GenericCacheManager.java:1)
        at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
2012-03-08 16:37:15,104 [pool-2-thread-34288] Failed getting remote cache for key=Key String = 'http://www.my-banners.com', long key = 5630311119483252185, keyType = 'PATH'
me.prettyprint.hector.api.exceptions.HectorException: HConnectionManager returned a null client after aquisition - are we shutting down?
        at me.prettyprint.cassandra.connection.ConcurrentHClientPool.borrowClient(ConcurrentHClientPool.java:83)
        at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:226)
        at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97)
        at me.prettyprint.cassandra.model.CqlQuery.execute(CqlQuery.java:93)

Thank you in advance.

Daning

Patricio Echagüe

Mar 15, 2012, 4:47:40 PM

to hector...@googlegroups.com

We fixed some bugs in the connection pool in newer versions of Hector. Are you on Cassandra 0.8?

Daning Wang

Mar 15, 2012, 5:27:47 PM

to hector...@googlegroups.com, Maciej Miklas

I found this bug in the Hector code (version 1.0-3). I submitted it as CASSANDRA-4055, though I am not sure that is the right place to submit it.

Basically, if there is an exception in addCassandraHost, the host may not be added to the active list, yet it is still removed from the downed list, so the host is gone forever.

connectionManager.addCassandraHost(cassandraHost);
downedHostQueue.remove(cassandraHost);

We have hit this bug a few times.
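
To illustrate the failure mode, here is a minimal, self-contained sketch. It is not Hector's actual code: `RetrySketch`, `Manager`, `addHost`, `retryBuggy` and `retryGuarded` are hypothetical names, and it assumes the add call reports a failed connect through its return value rather than propagating the exception. The guarded variant is only one possible way to keep a host from disappearing.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

public class RetrySketch {

    /** Hypothetical stand-in for a connection manager; not Hector's real class. */
    static class Manager {
        final Set<String> activeHosts = new HashSet<>();
        boolean reachable = false; // simulates whether the connect attempt succeeds

        /** Assumption: a failed connect is swallowed here and reported as false. */
        boolean addHost(String host) {
            if (!reachable) {
                return false; // e.g. the connect timed out
            }
            activeHosts.add(host);
            return true;
        }
    }

    /** The pattern from the post: the host leaves the downed queue even if the add failed. */
    static void retryBuggy(Manager manager, Queue<String> downed, String host) {
        manager.addHost(host);
        downed.remove(host);
    }

    /** Possible fix: drop the host from the downed queue only once it is active again. */
    static void retryGuarded(Manager manager, Queue<String> downed, String host) {
        if (manager.addHost(host)) {
            downed.remove(host);
        }
    }

    public static void main(String[] args) {
        Manager manager = new Manager(); // host is unreachable by default
        Queue<String> downed = new ArrayDeque<>();

        downed.add("host-a");
        retryBuggy(manager, downed, "host-a");
        // host-a is now in neither collection, so nothing will ever retry it
        System.out.println("buggy:   active=" + manager.activeHosts + " downed=" + downed);

        downed.add("host-b");
        retryGuarded(manager, downed, "host-b");
        // host-b stays in the downed queue and can be retried later
        System.out.println("guarded: active=" + manager.activeHosts + " downed=" + downed);
    }
}
```

With the guarded version, a host whose reconnect attempt fails remains in the downed queue, so a later retry pass can pick it up once the node is reachable again.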

Thanks,

Daning


Nate McCall

Mar 15, 2012, 6:41:08 PM

to hector...@googlegroups.com

Thanks for bringing this up. Our issue system is here:
https://github.com/rantav/hector/issues

There is not too much going on in addCassandraHost - what kind of
exception are you seeing?

Daning Wang

Mar 15, 2012, 6:48:58 PM

to hector...@googlegroups.com

Thanks, Nate. It requires a login to submit a bug, and I am not sure how to register. Can you log the bug there?

Here is the exception:

2012-03-15 05:39:20,944 [Hector.me.prettyprint.cassandra.connection.CassandraHostRetryService-1]        Transport exception host to HConnectionManager: s2.dsat4.netseer.com(10.210.101.116):9160
me.prettyprint.hector.api.exceptions.HectorTransportException: Unable to open transport to s2.dsat4.netseer.com(10.210.101.116):9160 , java.net.SocketTimeoutException: connect timed out
        at me.prettyprint.cassandra.connection.client.HThriftClient.open(HThriftClient.java:144)
        at me.prettyprint.cassandra.connection.client.HThriftClient.open(HThriftClient.java:26)
        at me.prettyprint.cassandra.connection.ConcurrentHClientPool.createClient(ConcurrentHClientPool.java:147)
        at me.prettyprint.cassandra.connection.ConcurrentHClientPool.<init>(ConcurrentHClientPool.java:53)
        at me.prettyprint.cassandra.connection.LeastActiveBalancingPolicy.createConnection(LeastActiveBalancingPolicy.java:59)
        at me.prettyprint.cassandra.connection.HConnectionManager.addCassandraHost(HConnectionManager.java:116)
        at me.prettyprint.cassandra.connection.CassandraHostRetryService$1.run(CassandraHostRetryService.java:75)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:207)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: connect timed out
        at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
        at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
        at me.prettyprint.cassandra.connection.client.HThriftClient.open(HThriftClient.java:138)
        ... 14 more
Caused by: java.net.SocketTimeoutException: connect timed out
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
        at java.net.Socket.connect(Socket.java:529)
        at org.apache.thrift.transport.TSocket.open(TSocket.java:17

Daning Wang

Mar 15, 2012, 7:02:18 PM

to hector...@googlegroups.com

OK, submitted

https://github.com/rantav/hector/issues/439


Nate McCall

Mar 15, 2012, 7:05:32 PM

to hector...@googlegroups.com

Awesome - Thanks!

Thibaut Britz

Mar 19, 2012, 2:33:05 PM

to hector...@googlegroups.com

Could you also please fix this in 0.8?

The bug is also there.

Thanks

Thibaut
