READ Timeout showing LOCAL_QUORUM when LOCAL_ONE was used.


Ankush Goyal

Apr 13, 2015, 12:56:33 AM
to java-dri...@lists.datastax.com
Hi Guys,

We are using driver version 2.0.5, and I am seeing an unusual bug: although we specify a consistency level of LOCAL_ONE, during some testing the driver started throwing read timeouts, and the error message states that the driver was reading at consistency level LOCAL_QUORUM.

Here's a snapshot of the stack trace:

ERROR [] [] [2015-04-13 03:50:47,794 +0000] - c.l.d.StackTraceDAO: StackTrace{message=Cassandra timeout during read query at consistency LOCAL_QUORUM (2 responses were required but only 1 replica responded), type=ContactCassandraDAO.getByPropertiesAsyncInRS, rootCause=com.datastax.driver.core.exceptions.ReadTimeoutException
        at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:57)
        at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:34)
        at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:182)
        at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:66)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)}

Any help regarding this would be highly appreciated.

P.S. We can't upgrade the driver to a higher version due to a NoHostAvailable (no host tried) exception --> https://groups.google.com/a/lists.datastax.com/forum/#!topic/java-driver-user/b076HRgEfoo.

Ananth kumar

Apr 13, 2015, 2:06:22 AM
to java-dri...@lists.datastax.com
Hi Ankush, 


We once got a QUORUM read timeout for a LOCAL_QUORUM read. I think the driver has a bug here. Can anyone from the driver team please comment? From the source, I can see you are taking the consistency from the query statement, but is it cached somewhere? Either way, we never use QUORUM consistency anywhere in our code base, so it was a shock to see it.


Regards,

Ananthkumar K S

Olivier Michallat

Apr 13, 2015, 6:21:09 AM
to java-dri...@lists.datastax.com
Are you using lightweight transactions or batches, and against which Cassandra version?

I'll have to dig in for the exact ticket numbers, but there are known issues where the consistency level mentioned in the message is that of a secondary query (for example, with a logged batch, writing the distributed batch log is done at a consistency > 1, even if the batch itself uses ONE).
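To make the idea concrete, here is a minimal, stdlib-only sketch of that resolution rule (illustrative names only, not the driver's actual code): the user's statement-level consistency applies to the user's query, while a secondary operation such as the batch log write has its own fixed level, and it is the latter that can surface in the error message.

```java
import java.util.Optional;

enum ConsistencyLevel { ONE, LOCAL_ONE, LOCAL_QUORUM, QUORUM }

// Stand-in for a driver statement; null means "use the cluster default".
class Statement {
    private ConsistencyLevel cl;
    Statement setConsistencyLevel(ConsistencyLevel cl) { this.cl = cl; return this; }
    ConsistencyLevel getConsistencyLevel() { return cl; }
}

public class ConsistencyResolution {
    static final ConsistencyLevel DEFAULT_CL = ConsistencyLevel.ONE;
    // Illustrative: a secondary operation (e.g. the distributed batch log)
    // is written at its own consistency, regardless of the statement's level.
    static final ConsistencyLevel BATCH_LOG_CL = ConsistencyLevel.LOCAL_QUORUM;

    // Effective consistency for the user's query: statement override, else default.
    static ConsistencyLevel effective(Statement s) {
        return Optional.ofNullable(s.getConsistencyLevel()).orElse(DEFAULT_CL);
    }

    public static void main(String[] args) {
        Statement read = new Statement().setConsistencyLevel(ConsistencyLevel.LOCAL_ONE);
        System.out.println("user query runs at " + effective(read));
        System.out.println("secondary (batch log) write runs at " + BATCH_LOG_CL);
    }
}
```

So a timeout raised while the secondary operation is in flight would report LOCAL_QUORUM even though the user's statement asked for LOCAL_ONE.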

--

Olivier Michallat

Driver & tools engineer, DataStax



Ankush Goyal

Apr 13, 2015, 12:35:09 PM
to java-dri...@lists.datastax.com
Olivier,

We are using Cassandra version 2.0.11.

These are just simple row reads (not writes), and the rows themselves are fairly thin. Anyway, the problem surfaced because one of our three C* nodes went down.
We do have batch writes, but this test was done in an isolated environment with no writes, only reads.

Ananth kumar

Apr 14, 2015, 5:29:55 AM
to java-dri...@lists.datastax.com
Hi Olivier,

The error I mentioned occurred on a read, not a write. The read was done at a consistency of LOCAL_QUORUM.

As for my batches, they are all logged batches carrying at most 20 inserts at a time. Also, since I am on Cassandra 2.0.3, I have not started using lightweight transactions; I felt it was too early to try them. Are lightweight transactions now stable to use with recent versions of Cassandra?

The driver version I am using is 2.1.4.

Olivier Michallat

Apr 15, 2015, 4:27:47 AM
to java-dri...@lists.datastax.com
Ankush, could you try to isolate the issue in a unit test? From your stack trace, you should be able to find the statement that got the read timeout. And since the timeout was triggered by a node being down, that's something that can easily be reproduced with a test cluster.

--

Olivier Michallat

Driver & tools engineer, DataStax

