Driver sends additional? query even for CL=ONE if SELECT with LIMIT actually limits result set


Eugene Voytitsky

Aug 17, 2016, 7:32:03 AM
to DataStax Java Driver for Apache Cassandra User Mailing List
Hi all.

I'm using the DataStax Java driver for Cassandra, version 3.0.0, with Cassandra server 3.x.
(The problem also reproduces with the newest driver, 3.1.0.)

Preamble: to work around the issues
https://datastax-oss.atlassian.net/browse/JAVA-268
https://datastax-oss.atlassian.net/browse/JAVA-471
our code always executes queries asynchronously.

We found that the driver sends a SELECT query twice, even at CL=ONE, when the SELECT has LIMIT N and the limit actually truncates the result set to N rows (that is, more than N rows in the DB satisfy the WHERE clause).
In the logs it looks like this:

2016-08-16 18:59:22,858000 DEBUG main/my.CassandraSession.Q: Query [SELECT ...] on [KS1] took 84ms at host03.some.domain with consistency ONE; OK

2016-08-16 18:59:22,858001 TRACE host-cass-nio-worker-9/com.datastax.driver.core.QueryLogger.NORMAL: [host] [host03.some.domain/ipv6:9042] Query completed normally, took 83 ms: [1 bound values] SELECT ...; [params]

2016-08-16 18:59:22,935000 TRACE host-cass-nio-worker-1/com.datastax.driver.core.QueryLogger.NORMAL: [host] [host01.some.domain/ipv6:9042] Query completed normally, took 61 ms: [1 bound values] SELECT ...; [params]

This example shows two TRACE entries from the driver: it sent the query twice, to different hosts, from different NIO threads.
Sometimes it even sends the query twice to the same host from the same NIO thread.
Also note that the timings differ, so this does not look like a single query simply being logged twice.

The DEBUG log is from my app. It shows that both:
  • the desired CL (statement.getConsistencyLevel())
  • and the actual CL (ExecutionInfo.getAchievedConsistencyLevel())
are the same, namely ONE.
If they differed, the log message would instead say that the CL was altered during query execution.

If fewer than N rows match the WHERE clause, we see only a single TRACE log.

I can't find any information about such a problem on the internet. Should I open an issue in the tracker?

When I turn on TRACE for the whole "com.datastax" I see additional info:
1st query logs: received: ROWS [5 columns] (to be continued)
2nd query logs: received: ROWS [5 columns]
If you want, I can show the full TRACE output for these queries too.

Ryan Svihla

Aug 17, 2016, 8:48:26 AM
to DataStax Java Driver for Apache Cassandra User Mailing List

What's your fetch size relative to N?


--
Regards,

Ryan Svihla

Eugene Voytitsky

Aug 17, 2016, 9:59:33 AM
to java-dri...@lists.datastax.com, r...@foundev.pro
On 17.08.16 15:48, Ryan Svihla wrote:

What's your fetch size relative to N


Hi, Ryan.

Thank you for your short but very helpful question!
My guess that this happens "if SELECT with LIMIT actually limits result set" was completely wrong.

The number of queries is RS.size()/fetchSize, rounded up.

In my initial example
N=10K, fetchSize=5K(default) -> 2 queries

I've changed N to 21K and it became
N=21K, fetchSize=5K(default) -> 5 queries
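
The arithmetic behind those two observations can be sketched as a ceiling division. This is only an illustration that reproduces the counts seen in this thread; `PageRequestCount` and `requestsNeeded` are hypothetical names, not part of the driver, and the exact count at page boundaries may differ when no LIMIT is involved.

```java
// Hypothetical helper for illustration; not part of the DataStax driver API.
final class PageRequestCount {

    /** Estimated number of page requests needed to read rowCount rows, fetchSize rows per page. */
    static long requestsNeeded(long rowCount, int fetchSize) {
        if (rowCount < 0 || fetchSize <= 0) {
            throw new IllegalArgumentException("rowCount must be >= 0 and fetchSize must be > 0");
        }
        // Even an empty result costs one request: the initial query itself.
        if (rowCount == 0) {
            return 1;
        }
        // Ceiling division: a partially filled last page still needs its own request.
        return (rowCount + fetchSize - 1) / fetchSize;
    }

    public static void main(String[] args) {
        System.out.println(requestsNeeded(10_000, 5_000)); // prints 2
        System.out.println(requestsNeeded(21_000, 5_000)); // prints 5
    }
}
```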


Okay, now I have additional questions:

1. Why does fetching the next page of rows trigger an additional SELECT query?
I assumed that one SELECT opens one row cursor and that fetching just transmits the next page from the server to the driver.

2. Why can the additional SELECT query be sent to a different host than the original one?

3. Maybe QueryLogger should be improved to show that the additional SELECT relates to fetching a subsequent page?
-- 
Best regards,
Eugene Voytitsky

Ryan Svihla

Aug 17, 2016, 10:21:27 AM
to Eugene Voytitsky, java-dri...@lists.datastax.com
Eugene,

This is actually expected behavior. Scroll down to "automatic paging" in this link: http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0

You'll note that the diagram in that article exactly matches the behavior you're describing; this is a feature, not a bug. In fact, before Cassandra 2.0 we had to manage paging ourselves, and it was quite easy to get that wrong. I for one am glad for the massive reduction in issues people encounter with paging since that feature came out, even if the automatic behavior came as a surprise to you (well done investigating and discovering it without knowing about the feature).
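
As a rough illustration of why each page shows up as a separate query, possibly on a different host, here is a toy model. It assumes the server keeps no per-query cursor and that every page request carries a resume position. Everything in it is hypothetical: `PagingSketch`, `Node` and `Page` are not driver types, the integer offset stands in for the opaque paging state, and round-robin host selection is just an assumed load-balancing choice.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types belong to the DataStax driver.
final class PagingSketch {

    /** One page of rows plus the position to resume from (-1 when the result is exhausted). */
    static final class Page {
        final List<Integer> rows;
        final int nextOffset;

        Page(List<Integer> rows, int nextOffset) {
            this.rows = rows;
            this.nextOffset = nextOffset;
        }
    }

    /** Stand-in for a coordinator node; it keeps no cursor between requests. */
    static final class Node {
        final String name;
        int requestsServed = 0;

        Node(String name) {
            this.name = name;
        }

        /** Serves one page starting at offset; all state needed arrives with the request. */
        Page query(int totalRows, int limit, int fetchSize, int offset) {
            requestsServed++;
            int lastRow = Math.min(totalRows, limit);
            int end = Math.min(lastRow, offset + fetchSize);
            List<Integer> rows = new ArrayList<>();
            for (int i = offset; i < end; i++) {
                rows.add(i);
            }
            return new Page(rows, end >= lastRow ? -1 : end);
        }
    }

    /** Reads the whole result, sending each page request to the next node in round-robin order. */
    static int readAll(Node[] nodes, int totalRows, int limit, int fetchSize) {
        int offset = 0;
        int requests = 0;
        while (offset >= 0) {
            Node node = nodes[requests % nodes.length];
            Page page = node.query(totalRows, limit, fetchSize, offset);
            requests++;
            offset = page.nextOffset;
        }
        return requests;
    }
}
```

In this model, `readAll` with 50,000 matching rows, LIMIT 10,000 and a fetch size of 5,000 issues two requests, one per node, which is the shape of the two TRACE lines in the original post.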

Hope that helps.

Regards,

Ryan Svihla

Eugene Voytitsky

Aug 17, 2016, 11:15:59 AM
to Ryan Svihla, java-dri...@lists.datastax.com
Thanks a lot! Very helpful.

My past SQL experience led me to an incorrect guess about how fetchSize should work.
Sorry for the disturbance.

Ryan Svihla

Aug 17, 2016, 11:34:22 AM
to Eugene Voytitsky, java-dri...@lists.datastax.com
Eugene,

No problem! We all start somewhere and we've all struggled to make that same logical jump.

Regards,

Ryan Svihla