Cannot connect to Cassandra during tests


David Webb

Nov 20, 2015, 3:40:28 PM
to cassandra-unit-users
Jeremy,

We have a situation where we are using cassandra-unit for our integration tests with spring-data-cassandra. Our integration tests work great on many build machines, but one in particular has an issue that I am hoping you can help with.

I have turned up the logging and looked through your code, and I think this is the correct question for you:

We call 
EmbeddedCassandraServerHelper.startEmbeddedCassandra()

at the beginning of each integration test. We can see from your debug logs that this initialises C* the first time; after that, C* is running (and the cassandraDaemon is not null), so we get back the non-null cassandraDaemon for each subsequent test.

My question is: if the cassandraDaemon is not null (which assumes C* is running), why would we get the NoHostAvailableException shown below?

com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: localhost/127.0.0.1:36337 (com.datastax.driver.core.OperationTimedOutException: [localhost/127.0.0.1:36337] Operation timed out))

For the first several tests, which reuse the cassandraDaemon, everything works great, and then at some point along the way we can no longer connect.
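
A non-null cassandraDaemon only shows that the embedded node was started at some point; it does not show that the native transport port is still accepting connections. A minimal sketch of a plain TCP probe that could be run before a failing test to tell those cases apart is given here. PortProbe and isListening are hypothetical names, not part of cassandra-unit or the driver, and the port would be whatever the random-port configuration picked (36337 in the exception above).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not part of cassandra-unit or the DataStax driver).
final class PortProbe {

    private PortProbe() {
    }

    // Returns true if a TCP connection to host:port can be opened within timeoutMillis.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused or timed out: nothing accepted the connection.
            return false;
        }
    }
}
```

If the probe returns true while the driver still reports OperationTimedOutException, that would suggest a node that is up but too slow to answer, rather than one that has stopped listening.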

Please let me know what else I can provide to you for troubleshooting this, and thank you in advance.

Sincerely,
David Webb
Co-Author of Spring-Data-Cassandra

David Webb

Nov 20, 2015, 3:44:18 PM
to cassandra-unit-users
Sorry, I forgot some important info:

cassandra-unit version 2.1.9.2
DSE Driver 2.1.9

Jérémy SEVELLEC

Nov 20, 2015, 4:00:18 PM
to cassandra-...@googlegroups.com
Hi David,

I suppose that it's not affecting your test results, right? If so, it's unfortunately not something new. Anyway, I have to investigate to try to fix that. 

I'm guessing that Cassandra is shut down before the Java driver has closed its connections...

Jérémy

--
You received this message because you are subscribed to the Google Groups "cassandra-unit-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cassandra-unit-u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Nate McCall

Nov 20, 2015, 4:10:21 PM
to cassandra-...@googlegroups.com
> We have a situation where we are using cassandra-unit for our integration
> tests with spring-data-cassandra. Our integration tests work great on many
> build machines, but one in particular has an issue that I am hoping you can
> help with.

Is this particular build machine slightly smaller/slower or more
heavily loaded?

>
> My question is: if the cassandraDaemon is not null (which assumes C* is
> running), why would we get the NoHostAvailableException shown below?
>
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s)
> tried for query failed (tried: localhost/127.0.0.1:36337
> (com.datastax.driver.core.OperationTimedOutException:
> [localhost/127.0.0.1:36337] Operation timed out))
>
> For the first several tests, which reuse the cassandraDaemon, everything
> works great, and then at some point along the way we can no longer connect.

Per the question above, you would see this error if the testing
instance got bogged down.

>
> Please let me know what else I can provide to you for troubleshooting this,
> and thank you in advance.
>

Anything unusual showing up in the log files? IIRC, cassandra-unit has
org.apache.cassandra at ERROR level, so not a lot is getting printed
by default. Turning this up will tell you a lot, but gets real chatty.

A couple of things to change in the Cassandra configuration to
minimize resource consumption:

hinted_handoff_enabled: false
key_cache_size_in_mb: 1
key_cache_save_period: 0
concurrent_reads: 2
concurrent_writes: 2
concurrent_compactors: 1
memtable_flush_queue_size: 2

And if you're not using thrift:
start_rpc: false
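
One way to apply settings like these without hand-editing the file is to rewrite the YAML text before handing it to the embedded server. The sketch below assumes the settings are top-level "key: value" lines; YamlOverrides is a hypothetical helper, not a cassandra-unit API, and it does not parse YAML properly.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: rewrites top-level "key: value" lines in cassandra.yaml text.
final class YamlOverrides {

    private YamlOverrides() {
    }

    // The low-resource values listed above, keyed by setting name.
    static Map<String, String> lowResourceSettings() {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("hinted_handoff_enabled", "false");
        settings.put("key_cache_size_in_mb", "1");
        settings.put("key_cache_save_period", "0");
        settings.put("concurrent_reads", "2");
        settings.put("concurrent_writes", "2");
        settings.put("concurrent_compactors", "1");
        settings.put("memtable_flush_queue_size", "2");
        return settings;
    }

    // Replaces the value of each top-level key found in overrides;
    // keys that are absent from the input are appended at the end.
    static String apply(String yaml, Map<String, String> overrides) {
        Map<String, String> pending = new LinkedHashMap<>(overrides);
        List<String> out = new ArrayList<>();
        for (String line : yaml.split("\n", -1)) {
            int colon = line.indexOf(':');
            // Only unindented, uncommented lines count as top-level keys.
            boolean topLevel = colon > 0
                    && !Character.isWhitespace(line.charAt(0))
                    && line.charAt(0) != '#';
            String key = topLevel ? line.substring(0, colon) : null;
            if (key != null && pending.containsKey(key)) {
                out.add(key + ": " + pending.remove(key));
            } else {
                out.add(line);
            }
        }
        for (Map.Entry<String, String> entry : pending.entrySet()) {
            out.add(entry.getKey() + ": " + entry.getValue());
        }
        return String.join("\n", out);
    }
}
```

Calling YamlOverrides.apply(originalYaml, YamlOverrides.lowResourceSettings()) would return the modified text, which could then be written to a custom configuration file on the test classpath.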

David Webb

Nov 20, 2015, 4:29:09 PM
to cassandra-...@googlegroups.com
This configuration works fine on the same build server:

cassandra-unit: 2.0.2.2
use-driver: 2.1.5

I am not sure about the power of the build server or the various build agents since that is maintained by Spring/Pivotal.

I will turn up the cassandra logging.

Also, I am using the EmbeddedCassandraServerHelper.CASSANDRA_RNDPORT_YML_FILE configuration file, so I am not sure what C* settings it contains. Do you recommend using this one or creating a custom one?

Thank you.
-Dave

David Webb

Nov 20, 2015, 4:39:01 PM
to cassandra-unit-users
It is affecting the results and the build:



On Friday, November 20, 2015 at 4:00:18 PM UTC-5, jsevellec wrote:
> I suppose that it's not affecting your test results, right?

David Webb

Nov 20, 2015, 4:40:10 PM
to cassandra-unit-users
I customized the cassandra.yaml file with those recommended changes and turned up the logging.  I will let you know the results.  Thanks for the prompt reply.

David Webb

Nov 20, 2015, 8:21:09 PM
to cassandra-...@googlegroups.com
With the customized YAML file, I still experience the same problems. Attaching the build.log with org.apache.cassandra turned up to DEBUG.

Thank you for looking into this for us.
SPRINGDATACASSANDRA-SDC6-JOB1-13.log.zip

Jérémy SEVELLEC

Nov 21, 2015, 4:03:03 AM
to cassandra-...@googlegroups.com
Hi David, 

First of all, sorry for my first email... forget it... I read yours too quickly and did not see that the issue affects only one build machine.

Luckily @Nate replied with a more helpful email.

I had a quick look at the log. So far I was only able to see errors from the Cassandra Java driver but nothing from Cassandra itself. It looks weird. Do you have the same feeling, @Nate?

I will try to have a deeper look as soon as possible.

Did you always have this issue or is it something new? Did you change something that caused the issue (driver or cassandra-unit version)?


Jérémy



Nate McCall

Nov 21, 2015, 9:57:59 AM
to cassandra-...@googlegroups.com

I looked closer at the initial stack trace; it looks like it chokes on the drop statement when the keyspace exists.

This has been a historically brittle code path on the C* side as schema modifications in quick succession are not a design goal.

A couple of things to try:
- running without the drop
- updating to the latest C* by overriding deps

David Webb

Nov 21, 2015, 10:22:30 AM
to cassandra-...@googlegroups.com
Thanks guys.  I am in the process of starting over and incrementing each dependency one at a time to narrow it down.  I'll let you know what I find.  Hopefully I can find the right combination that will build on all machines.





David Webb

Nov 21, 2015, 4:10:55 PM
to cassandra-...@googlegroups.com
Jeremy/Nate,

This build server is very, very slow.  A basic create table statement takes about 20 seconds.  I fixed most of the tests by increasing the timeout parameters in cassandra.yaml to 120000.  This will not negatively affect the performance on the fast build machines, while allowing extra time for the slow ones.
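
The post does not say which timeout keys were raised. As a sketch under the assumption that every top-level key ending in timeout_in_ms (for example read_request_timeout_in_ms and write_request_timeout_in_ms) is set to the same value, the change could be scripted like this. TimeoutBump is a hypothetical name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: raises every top-level "*timeout_in_ms" value in cassandra.yaml text.
final class TimeoutBump {

    // Matches an unindented, uncommented line such as "read_request_timeout_in_ms: 5000".
    private static final Pattern TIMEOUT_LINE =
            Pattern.compile("(?m)^(\\w*timeout_in_ms):[ \\t]*\\d+[ \\t]*$");

    private TimeoutBump() {
    }

    // Returns the YAML text with each matching timeout set to the given milliseconds.
    static String raiseTimeouts(String yaml, long millis) {
        Matcher matcher = TIMEOUT_LINE.matcher(yaml);
        return matcher.replaceAll("$1: " + millis);
    }
}
```

Commented-out and indented lines are left alone, so only settings that are already active in the file would change.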

There is one feature of cassandra-unit that we are using and that I believe may be the source of some more timeouts I am working through.

@Rule
public CassandraCQLUnit cassandraCQLUnit = new CassandraCQLUnit(
        new ClassPathCQLDataSet(
                "integration/cql/generator/CreateIndexCqlGeneratorIntegrationTests-BasicTest.cql", this.keyspace),
        CASSANDRA_CONFIG, CQL_INIT_TIMEOUT);

And the CQL file it's loading is this simple:
create table mytable (id uuid primary key, column1 text);

We use this to pre-load CQL for test runs. Even though CQL_INIT_TIMEOUT is 60000, it seems to throw timeout exceptions much sooner than that. I looked in the Javadocs, but that constructor isn't really clear to me. Am I using it correctly? Can you guys check out that third parameter (which appears to be new between CU 2.0.2 and CU 2.1.9) and make sure it's working as intended?

When the @Rule runs, the embedded C* server is already running, so it should just have to run the CQL. And again, the statement above takes ~20 seconds on this server.
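
To see how long the CQL load really runs before it fails, and which exception ends it, the call can be timed directly. This is a minimal sketch: FailureTimer is a hypothetical helper, and the action passed in stands in for whatever executes the CQL in the test.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: measures how long a call runs and captures what it threw.
final class FailureTimer {

    // Result of one timed call; failure is null when the call succeeded.
    static final class Outcome {
        final long elapsedMillis;
        final Exception failure;

        Outcome(long elapsedMillis, Exception failure) {
            this.elapsedMillis = elapsedMillis;
            this.failure = failure;
        }
    }

    private FailureTimer() {
    }

    // Runs the action once and reports the elapsed wall-clock time in milliseconds.
    static Outcome time(Callable<?> action) {
        long start = System.nanoTime();
        Exception failure = null;
        try {
            action.call();
        } catch (Exception e) {
            failure = e;
        }
        long elapsed = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        return new Outcome(elapsed, failure);
    }
}
```

Comparing elapsedMillis with the configured 60000 and looking at the class of the captured exception would indicate whether the rule's own timeout or some other, shorter timeout is the one firing.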

So this is isolated to a very slow server, and definitely an edge case.  I don't think anything is "wrong" in CU or SDC, but we might have to tweak some stuff for this extreme.

Thanks for all the input so far...it really helped me today.

Dave