All hosts marked down


CassUser CassUser

Jan 6, 2011, 2:39:03 PM
to hector...@googlegroups.com
Hey Hector users, I'm running a map/reduce job that uses Hector to write to Cassandra from my reducer, running on a small 6-node cluster.  After upgrading to Cassandra 0.7, all hosts are now being marked as down.

Here is what I'm seeing in the logs:

2011-01-06 11:26:21,509 DEBUG me.prettyprint.cassandra.connection.HThriftClient: Transport open status false for client CassandraClient<cass1:9160-61>
2011-01-06 11:26:21,509 DEBUG me.prettyprint.cassandra.connection.HThriftClient: Creating a new thrift connection to cass1(192.168.240.1):9160
2011-01-06 11:26:21,511 DEBUG me.prettyprint.cassandra.connection.ConcurrentHClientPool: created new client. NumActive:1 untilExhausted: 49
2011-01-06 11:26:21,511 DEBUG me.prettyprint.cassandra.connection.ConcurrentHClientPool: blocking on queue - current block count 1
2011-01-06 11:26:21,511 DEBUG me.prettyprint.cassandra.connection.HThriftClient: Transport open status true for client CassandraClient<cass1:9160-61>
2011-01-06 11:26:21,722 DEBUG me.prettyprint.cassandra.connection.HThriftClient: Closing client CassandraClient<cass1:9160-61>
2011-01-06 11:26:21,722 DEBUG me.prettyprint.cassandra.connection.HThriftClient: Transport open status true for client CassandraClient<cass1:9160-61>
2011-01-06 11:26:21,722 ERROR me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN TRIGGERED for host cass1(192.168.240.1):9160
2011-01-06 11:26:21,723 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Pool state on shutdown: <ConcurrentCassandraClientPoolByHost>:{cass1(192.168.240.1):9160}; IsActive?: true; Active: 1; Blocked: 0; Idle: 0; NumBeforeExhausted: 49
2011-01-06 11:26:21,724 ERROR me.prettyprint.cassandra.connection.ConcurrentHClientPool: Shutdown triggered on <ConcurrentCassandraClientPoolByHost>:{cass1(192.168.240.1):9160}
2011-01-06 11:26:21,724 ERROR me.prettyprint.cassandra.connection.ConcurrentHClientPool: Shutdown complete on <ConcurrentCassandraClientPoolByHost>:{cass1(192.168.240.1):9160}
2011-01-06 11:26:21,724 INFO me.prettyprint.cassandra.connection.CassandraHostRetryService: Host detected as down was added to retry queue: cass1(192.168.240.1):9160
2011-01-06 11:26:21,724 DEBUG me.prettyprint.cassandra.connection.HThriftClient: Closing client CassandraClient<cass1:9160-61>
2011-01-06 11:26:21,724 DEBUG me.prettyprint.cassandra.connection.HThriftClient: Transport open status false for client CassandraClient<cass1:9160-61>
2011-01-06 11:26:21,725 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Could not fullfill request on this host CassandraClient<cass1:9160-61>
2011-01-06 11:26:21,727 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Exception:
me.prettyprint.hector.api.exceptions.HUnavailableException: UnavailableException()
    at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:50)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:95)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:88)
    at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:89)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:142)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:129)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:100)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:106)
    at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:149)
    at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:146)
    at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
    at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:65)
    at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:146)
    at com.atsid.cassandra.ingest.thrift.NGramMutationCreator.flushMutations(NGramMutationCreator.java:133)
    at com.atsid.cassandra.ingest.thrift.NGramMutationShipper.sendUpdate(NGramMutationShipper.java:115)
    at com.atsid.cassandra.ingest.thrift.NGramMutationShipper.shipPayloadsIfNeeded(NGramMutationShipper.java:91)
    at com.atsid.cassandra.ingest.thrift.NGramMutationShipper.write(NGramMutationShipper.java:66)
    at com.atsid.cassandra.ingest.thrift.DefaultNGramDataCollector.collectAndWrite(DefaultNGramDataCollector.java:43)
    at com.atsid.cassandra.ingest.thrift.mapreduce.NGramReducer.reduce(NGramReducer.java:56)
    at com.atsid.cassandra.ingest.thrift.mapreduce.NGramReducer.reduce(NGramReducer.java:42)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: UnavailableException()
    at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:16485)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:916)
    at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:890)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:93)
    ... 22 more
2011-01-06 11:26:21,730 INFO me.prettyprint.cassandra.connection.HConnectionManager: Client CassandraClient<cass1:9160-61> released to inactive or dead pool. Closing.
2011-01-06 11:26:21,730 DEBUG me.prettyprint.cassandra.connection.HThriftClient: Closing client CassandraClient<cass1:9160-61>
2011-01-06 11:26:21,731 DEBUG me.prettyprint.cassandra.connection.HThriftClient: Transport open status false for client CassandraClient<cass1:9160-61>
2011-01-06 11:26:21,731 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Could not fullfill request on this host CassandraClient<cass1:9160-61>
2011-01-06 11:26:21,731 ERROR me.prettyprint.cassandra.connection.HConnectionManager: Exception:
me.prettyprint.hector.api.exceptions.HectorException: All host pools marked down. Retry burden pushed out to client.
    at me.prettyprint.cassandra.connection.HConnectionManager.getClientFromLBPolicy(HConnectionManager.java:212)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:132)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:129)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:100)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:106)
    at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:149)
    at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:146)
    at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
    at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:65)
    at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:146)
    at com.atsid.cassandra.ingest.thrift.NGramMutationCreator.flushMutations(NGramMutationCreator.java:133)
    at com.atsid.cassandra.ingest.thrift.NGramMutationShipper.sendUpdate(NGramMutationShipper.java:115)
    at com.atsid.cassandra.ingest.thrift.NGramMutationShipper.shipPayloadsIfNeeded(NGramMutationShipper.java:91)
    at com.atsid.cassandra.ingest.thrift.NGramMutationShipper.write(NGramMutationShipper.java:66)
    at com.atsid.cassandra.ingest.thrift.DefaultNGramDataCollector.collectAndWrite(DefaultNGramDataCollector.java:43)
    at com.atsid.cassandra.ingest.thrift.mapreduce.NGramReducer.reduce(NGramReducer.java:56)
    at com.atsid.cassandra.ingest.thrift.mapreduce.NGramReducer.reduce(NGramReducer.java:42)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)


and eventually:



me.prettyprint.hector.api.exceptions.HectorException: All host pools marked down. Retry burden pushed out to client.
    at me.prettyprint.cassandra.connection.HConnectionManager.getClientFromLBPolicy(HConnectionManager.java:212)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:132)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:129)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:100)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:106)
    at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:149)
    at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:146)
    at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
    at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:65)
    at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:146)
    at com.atsid.cassandra.ingest.thrift.NGramMutationCreator.flushMutations(NGramMutationCreator.java:133)
    at com.atsid.cassandra.ingest.thrift.NGramMutationShipper.sendUpdate(NGramMutationShipper.java:115)
    at com.atsid.cassandra.ingest.thrift.NGramMutationShipper.shipPayloadsIfNeeded(NGramMutationShipper.java:91)
    at com.atsid.cassandra.ingest.thrift.NGramMutationShipper.write(NGramMutationShipper.java:66)
    at com.atsid.cassandra.ingest.thrift.DefaultNGramDataCollector.collectAndWrite(DefaultNGramDataCollector.java:43)
    at com.atsid.cassandra.ingest.thrift.mapreduce.NGramReducer.reduce(NGramReducer.java:56)
    at com.atsid.cassandra.ingest.thrift.mapreduce.NGramReducer.reduce(NGramReducer.java:42)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

Nate McCall

Jan 6, 2011, 2:45:24 PM
to hector...@googlegroups.com
How many hosts do you have configured in Hector? Perhaps it was
truncated for clarity, but it looks like only 1 from this log output.

Patricio Echagüe

Jan 6, 2011, 2:49:09 PM
to hector...@googlegroups.com
Are you able to see the 6 hosts in Hector JMX ?
--
Patricio.-

CassUser CassUser

Jan 6, 2011, 4:33:08 PM
to hector...@googlegroups.com
Thanks for the responses, guys.  According to the Cassandra ring, all 6 hosts are up and running.

If I pass a single host to the CassandraHostConfigurator, does it discover the others automatically?


Nate McCall

Jan 6, 2011, 4:36:19 PM
to hector...@googlegroups.com
No. You have to turn that on explicitly as it does not work for some
configurations. For some background, see:
https://issues.apache.org/jira/browse/CASSANDRA-1777

It can be turned on via:
cassandraHostConfigurator.setAutoDiscoverHosts(true);
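In context, a minimal setup sketch (the host name "cass1:9160" and cluster name "MyCluster" are placeholders; the calls are from the Hector 0.7 API, so verify them against your version):

```java
// Sketch: enabling host auto-discovery in Hector 0.7.
// "cass1:9160" and "MyCluster" are example values, not anything special.
CassandraHostConfigurator configurator = new CassandraHostConfigurator("cass1:9160");
configurator.setAutoDiscoverHosts(true);  // off by default; see CASSANDRA-1777
Cluster cluster = HFactory.getOrCreateCluster("MyCluster", configurator);
```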

Patricio Echagüe

Jan 6, 2011, 4:35:38 PM
to hector...@googlegroups.com
It does, if the discovery mechanism is enabled:

CassandraHostConfigurator.autoDiscoverHosts

The default is false.

Check Hector JMX to see the active pools.
--
Patricio.-

CassUser CassUser

Jan 6, 2011, 4:51:10 PM
to hector...@googlegroups.com
Cool thanks!

Question regarding the retry logic: in the HConnectionManager class, in the method operateWithFailover, if one has retries set to zero (fail fast) and there is a recoverable exception, the retry count is decremented by one, making it -1.  The retry check then tests whether the retry count is exactly equal to zero.  Shouldn't it be checking whether it's less than 1?
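To make the off-by-one concrete, here is a standalone sketch (this is not Hector's actual code; the class and method names are purely illustrative):

```java
// Illustrative sketch of the retry off-by-one described above.
// With retries configured to 0 (fail fast), a recoverable exception
// decrements the counter to -1, so a check for "retries == 0" never
// fires and the client keeps retrying instead of failing fast.
public class RetryCheck {
    // Buggy check: only stops when the counter lands exactly on zero.
    static boolean buggyShouldStop(int retries) {
        return retries == 0;
    }

    // Fixed check: stops whenever no retries remain.
    static boolean fixedShouldStop(int retries) {
        return retries < 1;
    }

    public static void main(String[] args) {
        int retries = 0;  // fail-fast configuration
        retries--;        // decremented on a recoverable exception
        System.out.println(buggyShouldStop(retries)); // false: would keep retrying
        System.out.println(fixedShouldStop(retries)); // true: fails fast as intended
    }
}
```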



Nate McCall

Jan 6, 2011, 6:00:43 PM
to hector...@googlegroups.com
Indeed. Just committed this to trunk. Good catch.

CassUser CassUser

Jan 6, 2011, 6:35:51 PM
to hector...@googlegroups.com
Thanks Nate.

Setting auto-discovery solves the issue with nodes being down.  Can you explain how/why there are separate interfaces for Thrift and gossip?

I'm getting the following errors now:


    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.IllegalStateException: Queue full
    at java.util.AbstractQueue.add(AbstractQueue.java:71)
    at me.prettyprint.cassandra.connection.CassandraHostRetryService.add(CassandraHostRetryService.java:55)
    at me.prettyprint.cassandra.connection.HConnectionManager.markHostAsDown(HConnectionManager.java:243)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:154)

    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:129)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:100)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:106)
    at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:149)
    at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:146)
    at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
    at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:65)
    at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:146)
    ... 9 more



I have not added the retry fix yet (doing that next); could that be causing the exception above?

Nate McCall

Jan 6, 2011, 6:45:33 PM
to hector...@googlegroups.com
You are hitting the default queue size for the node recovery service
(3 by default - can be adjusted via
CassandraHostConfigurator#setRetryDownedHostsQueueSize). Is this
happening right away or after some time of bulk importing?

The retry issue would only be triggered if you configured the failover
policy for failing fast (the default is to keep retrying).
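If you do want a deeper retry queue, the knob mentioned above looks roughly like this (the size 12 is an arbitrary example; check the setter against your Hector version):

```java
// Sketch: widening the downed-host retry queue (default is 3).
CassandraHostConfigurator configurator = new CassandraHostConfigurator("cass1:9160");
configurator.setRetryDownedHostsQueueSize(12);  // 12 is an arbitrary example value
```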

CassUser CassUser

Jan 6, 2011, 6:54:56 PM
to hector...@googlegroups.com
This is happening after some time (around 80% through the reduce phase).

Why does an UnavailableException mark the host down, versus a timeout?

Nate McCall

Jan 6, 2011, 6:54:59 PM
to hector...@googlegroups.com
As for the separate interfaces thing, this is unfortunately a
Cassandra API wart. The Thrift API methods describe_ring and
describe_splits return hosts identified by the IP address associated
with gossip traffic, because they are a reflection of the current
status of the ring.

If Thrift traffic were run on a different hardware interface (it's
not, unless you explicitly configured it that way), Thrift would try
to connect to the machines returned from the API, which would be the
gossip addresses, and would get a connection refused (Thrift trying to
connect to port 9160 on a gossip IP address that is only listening on
port 7000).
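In cassandra.yaml terms, the split looks roughly like this (addresses are examples; gossip uses listen_address/storage_port, Thrift uses rpc_address/rpc_port):

```yaml
# Example only -- addresses are placeholders.
listen_address: 192.168.240.1   # gossip / inter-node traffic
storage_port: 7000
rpc_address: 10.0.0.1           # Thrift client traffic
rpc_port: 9160
```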

Nate McCall

Jan 7, 2011, 1:36:26 AM
to hector...@googlegroups.com
HUnavailableException not being treated as a timeout looks like a bug as well.

Regardless, if you have multiple hosts becoming unavailable within the
cluster, you should look at throttling your jobs, or at adjusting
read_repair_chance (or turning RR off completely) and
dynamic_snitch_badness_threshold (see the comments on the latter in
cassandra.yaml for an explanation).
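For reference, the snitch knob lives in cassandra.yaml (the values below are illustrative, not recommendations; consult the comments in your own file), while read_repair_chance is a per-column-family attribute set in the schema rather than in cassandra.yaml:

```yaml
# cassandra.yaml (0.7) -- illustrative values; see the in-file comments.
dynamic_snitch: true
dynamic_snitch_badness_threshold: 0.0
```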

I'll investigate and get a fix out for the above in the morning. Thanks
for bringing this up.

CassUser CassUser

Jan 7, 2011, 1:26:00 PM
to hector...@googlegroups.com
We're currently throttling writes.  The retry we use for throttling is a variable back-off based on 'n' previous successful writes.  This worked pretty well for us in production code on Cassandra 0.6; at the time we weren't using Hector, which is why I'm a bit confused.  The goal was to upgrade to Cassandra 0.7 with the same variable back-off we had before.  I modified Hector to pass Unavailable and Timeout exceptions back to us, and fixed the retry bug.  See below:

        } else if ( he instanceof HTimedOutException || he instanceof HUnavailableException) {
          // DO NOT decrement retries; we will keep retrying on timeouts until the host comes back
          retryable = true;
          monitor.incCounter(Counter.RECOVERABLE_TIMED_OUT_EXCEPTIONS);
          client.close();
          // TODO timecheck on how long we've been waiting on timeouts here
          // suggestion per user moores on hector-users
        } else if ( he instanceof PoolExhaustedException ) {
          retryable = true;
          --retries;
          if ( hostPools.size() == 1 ) {
            throw he;
          }
          monitor.incCounter(Counter.POOL_EXHAUSTED);
          excludeHosts.add(client.cassandraHost);
        }
        if ( retries < 1 || retryable == false) throw he;


I'm still receiving errors saying all nodes are down, though.  I'll look into the properties you mentioned, Nate.  Maybe it's the Cassandra configuration?

Nate McCall

Jan 7, 2011, 2:17:53 PM
to hector...@googlegroups.com
We default to a read and write consistency level of QUORUM. I think if
you change that to ONE (by passing a ConfigurableConsistencyLevel in
to HFactory#createKeyspace) and tweak those Cassandra settings
(turning off read repair for the duration of your bulk load), you
should see a lot better throughput.
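A sketch of that change (the keyspace name is a placeholder and the API names are from Hector 0.7, so verify against your version):

```java
// Sketch: forcing read/write consistency to ONE for the bulk load.
// "MyKeyspace" is a placeholder; "cluster" is an existing Cluster instance.
ConfigurableConsistencyLevel ccl = new ConfigurableConsistencyLevel();
ccl.setDefaultReadConsistencyLevel(HConsistencyLevel.ONE);
ccl.setDefaultWriteConsistencyLevel(HConsistencyLevel.ONE);
Keyspace keyspace = HFactory.createKeyspace("MyKeyspace", cluster, ccl);
```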

Also, having monitoring in place, or at the least taking a couple of
samples from "nodetool tpstats" on the nodes having issues while this
is happening, will get us some extra data on where to look.

I do know of other folks successfully bulk loading from Hector though.
(Anybody else seeing similar issues?)

One more thing - what version of Hector? Update to the latest,
0.7.0-22, if you have not already, as there was some screwiness with
the default least-active load balancing policy.

CassUser CassUser

Jan 7, 2011, 6:13:26 PM
to hector...@googlegroups.com
Thanks for the tips Nate.

- I'm using consistency level one for both reads and writes
- Already on hector version 22
- I will turn off read repair and report what happens

Are there instructions on how to monitor Hector?  We're using a map/reduce job, where the reducer uses Hector to ship stuff off to Cassandra.

Here is what I'm seeing in some of the cassandra logs:

org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=1001
        at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:117)
        at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:385)
        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:395)
        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:353)
        at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:52)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
ERROR [MutationStage:1] 2011-01-06 23:12:55,480 RowMutationVerbHandler.java (line 83) Error in row mutation
org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=1003
        at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:117)
        at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:385)
        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:395)
        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:353)
        at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:52)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

Nate McCall

Jan 7, 2011, 6:48:51 PM
to hector...@googlegroups.com
What version of Cassandra? This could be a result of
https://issues.apache.org/jira/browse/CASSANDRA-1847
via:
https://issues.apache.org/jira/browse/CASSANDRA-1860

Upgrade to rc4 if you have not already.

There are some details on monitoring hector in this document:
http://www.riptano.com/sites/default/files/hector-v2-client-doc.pdf
