"rs.isExhausted()" instead of "rs.getExecutionInfo().getPagingState() == null" ?


Fabrice Larcher

Jun 20, 2016, 8:01:25 AM
to java-dri...@lists.datastax.com
Hello,

I followed the documentation of async-paging in order to fetch results with a row count greater than the fetchSize value. I am using the Java driver version 2.1.10.2 and C* version 2.1.4. For my test, I put 20 rows in a table and tried to retrieve all rows using a request having a fetchSize of only 10 rows.

When I use the line "boolean wasLastPage = rs.getExecutionInfo().getPagingState() == null;" of the code example, I do not get the last 10 rows. But I have found that using "boolean wasLastPage = rs.isExhausted();" instead makes it work.

Is that a small mistake in the documentation? Or am I misunderstanding something?


Regards,
Fabrice

Olivier Michallat

Jun 20, 2016, 12:56:09 PM
to java-dri...@lists.datastax.com
Hi,

It's hard to say without seeing your code, but "getPagingState() == null" means you have fetched the last page (but you might still have local rows to consume), whereas "rs.isExhausted()" means you have consumed the last row. 

From your description it sounds like you're using a null PagingState as the condition to stop the iteration, which is too early. Note that in the example you linked to, the paging state is tested after the current page's rows were printed.

--

Olivier Michallat

Driver & tools engineer, DataStax



Fabrice Larcher

Jun 27, 2016, 9:08:29 AM
to java-dri...@lists.datastax.com
Hi,

Thanks for the clarification. 

I should have included the code with my question; here is the code sample I was using:

Iterator<T> fetchAll(final ResultSet result, final Function<Row, T> readFunction) {

    return new Iterator<T>() {

        private final Iterator<Row> rowIterator = result.iterator();
        private int remaining = result.getAvailableWithoutFetching();

        @Override
        public boolean hasNext() {
            return rowIterator.hasNext();
        }

        @Override
        public T next() {
            T out;
            if (remaining > 0) {
                out = readFunction.apply(rowIterator.next());
                remaining--;
            } else if (result.isExhausted()) {
            //} else if (result.getExecutionInfo().getPagingState() == null) {
                throw new NoSuchElementException();
            } else {
                // Wait for the next page
                Futures.getUnchecked(result.fetchMoreResults());
                remaining = result.getAvailableWithoutFetching();
                out = next();
            }
            return out;
        }

        @Override
        public void remove() {
            throw new UnsupportedOperationException();
        }

    };
}

I put 15 rows in the table and try to retrieve all of them. The ResultSet argument comes from a request with a fetchSize of only 10 rows. The code shown above returns all 15 rows (making 2 requests). If I replace the condition "result.isExhausted()" with "result.getExecutionInfo().getPagingState() == null", then I do not retrieve the last page (only the first 10 rows) and get a NoSuchElementException. I still do not understand why the PagingState was null in my example since, as I understand it, the last page had not yet been fetched when it was checked.

Thanks,
-- 
Fabrice

Olivier Michallat

Jun 28, 2016, 5:28:24 PM
to java-dri...@lists.datastax.com
Hi,

The paging state is null as soon as the driver has retrieved the last page. The rows of the last page are still in the local cache at this point.

To summarize what's happening internally:

1) query 1 with paging state = null
2) receive response with 10 rows and paging state = xyz
3) add those 10 rows to local cache
4) as user iterates, pop rows from local cache
5) if local cache empty or fetchMoreResults called => query 2 with paging state = xyz
6) receive response with 5 rows and paging state = null (indicating the server has no more pages)
7) add those 5 rows to local cache
8) as user iterates, pop rows from local cache
9) local cache empty => signal no more results

If your condition to stop the iteration is "getPagingState() == null", you stop at step 6, with 5 rows still not returned to the user.
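
The steps above can be modeled with a small driver-free toy. This is only a sketch: SimulatedResultSet, PagingSimulation and all of their methods are hypothetical stand-ins, not the driver's API, and the model assumes (as in step 5) that an empty local cache triggers the next query as soon as the iterator is asked whether more rows exist. The drain method mirrors the hand-rolled iterator from this thread, with the stop condition switchable between the two variants.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy stand-in for the driver's ResultSet. Every name here is hypothetical.
class SimulatedResultSet {
    private final List<Integer> serverRows;                 // rows held by the "server"
    private final int fetchSize;
    private final Deque<Integer> localCache = new ArrayDeque<>();
    private int offset = 0;                                 // plays the role of the paging state
    private boolean lastPageFetched = false;                // true <=> paging state is null

    SimulatedResultSet(List<Integer> serverRows, int fetchSize) {
        this.serverRows = serverRows;
        this.fetchSize = fetchSize;
        fetchMoreResults();                                 // steps 1-3: first query
    }

    // Steps 5-7: pull one more page into the local cache, if the server has one.
    void fetchMoreResults() {
        if (lastPageFetched) return;
        int end = Math.min(offset + fetchSize, serverRows.size());
        localCache.addAll(serverRows.subList(offset, end));
        offset = end;
        lastPageFetched = (offset >= serverRows.size());
    }

    boolean pagingStateIsNull() { return lastPageFetched; }

    int availableWithoutFetching() { return localCache.size(); }

    // An empty cache triggers the next query (step 5) before answering.
    boolean isExhausted() {
        if (localCache.isEmpty()) fetchMoreResults();
        return localCache.isEmpty();
    }

    Integer one() { return isExhausted() ? null : localCache.poll(); }
}

class PagingSimulation {
    // Mirrors the hand-rolled iterator from this thread; `break` replaces the exception.
    static List<Integer> drain(SimulatedResultSet rs, boolean stopOnNullPagingState) {
        List<Integer> out = new ArrayList<>();
        int remaining = rs.availableWithoutFetching();
        while (!rs.isExhausted()) {                         // hasNext(): may run query 2
            if (remaining > 0) {
                out.add(rs.one());
                remaining--;
            } else if (stopOnNullPagingState ? rs.pagingStateIsNull() : rs.isExhausted()) {
                break;                                      // stop condition under test
            } else {
                rs.fetchMoreResults();
                remaining = rs.availableWithoutFetching();
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 15; i++) rows.add(i);
        // Stop on "paging state is null": the 5 cached rows of the last page are lost.
        System.out.println(drain(new SimulatedResultSet(rows, 10), true).size());   // prints 10
        // Stop on "exhausted": all rows are returned.
        System.out.println(drain(new SimulatedResultSet(rows, 10), false).size());  // prints 15
    }
}
```

In this model the second page is already in the local cache when the stop condition is evaluated, which is why the paging-state check ends the iteration with rows still unread.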

On a side note, if your only goal is to transform the iterator with a function, why deal with paging at all? You could just have hasNext() and next() call the internal iterator directly, and let it deal with paging.
Or better yet, use Guava's Iterators.transform. 
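
For illustration, here is roughly what such a transforming wrapper does, written in plain Java so it runs without Guava or the driver. It is a sketch under assumptions: TransformingIterators and its transform method are hypothetical names, and it uses java.util.function.Function (Java 8) rather than Guava's Function type. The wrapper only delegates, so whatever paging the source iterator performs stays hidden inside it.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Function;

// Hypothetical helper; a plain-Java approximation of a transforming iterator.
class TransformingIterators {

    // Wraps `source` so that each element is mapped through `fn` lazily, on next().
    static <F, T> Iterator<T> transform(final Iterator<F> source,
                                        final Function<? super F, ? extends T> fn) {
        return new Iterator<T>() {
            @Override
            public boolean hasNext() {
                return source.hasNext();      // the source decides when it is exhausted
            }

            @Override
            public T next() {
                return fn.apply(source.next());
            }

            @Override
            public void remove() {
                source.remove();
            }
        };
    }

    public static void main(String[] args) {
        // A list iterator stands in for result.iterator() here.
        Iterator<String> it = transform(Arrays.asList(1, 2, 3).iterator(), i -> "row-" + i);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

With the driver, the source would presumably be result.iterator() and the function the readFunction from the earlier post, with no remaining counter or paging-state check needed.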


--

Olivier Michallat

Driver & tools engineer, DataStax

