[Proposal] Return most recent results first

116 views
Skip to first unread message

Fangjin Yang

unread,
Aug 6, 2015, 6:48:41 PM8/6/15
to Druid Development
I've heard numerous folks in the community request for the ability to return results for an interval in reverse order (with most recent timestamps first). Most often, I hear this feedback for people using the select query, where queries often have page through numerous results before reaching the most recent ones.

Although there are several workarounds on the client side that can be instrumented to support this use case, returning most recent results can also be supported natively in Druid. I am writing this proposal to begin the discussion. If users are interested in contributing, this thread can be used to coordinate the development.

Query Context

We can add a new flag in the query context that will return reversed results. 

Reversing the Cursor

This flag will be passed down through the various query engines, through makeCursorBasedQuery(), and through the adaptors.makeCursors() method.

For IncrementalIndexStorageAdapter, we can simply reverse the cursorMap.


For QueryableIndexStorageAdapter, things get more tricky. We'll need reversible offset objects for filtered and non-filtered queries. The non-filtered index is more straightforward to implement. For the filtered case, we'll need the bitmap indexes to support reversible iterators. Concise sets already support this out of the box, but the immutable concise set will need to be adjusted to support this as well. Roaring bitmaps currently have no concept of a reversible cursor, and we'll need to either throw an exception, or add the logic to roaring bitmaps.


Merging Results

Care needs to be taken here as most of the merge logic assumes timestamps are ordered naturally. We'll need the ability to use different comparators based on the query. I believe this can be tricky to get right, so many new unit tests will be required.


Thoughts?


-- FJ



Daniel Lemire

unread,
Sep 1, 2015, 10:18:34 AM9/1/15
to Druid Development

 Roaring bitmaps currently have no concept of a reversible cursor, and we'll need to either throw an exception, or add the logic to roaring bitmaps.

I do not think that this is quite correct. With Roaring, you can iterate through the values in reverse order :


So it is a simple matter of using the available API. 

Fangjin Yang

unread,
Sep 2, 2015, 11:07:51 PM9/2/15
to Druid Development
My bad, good to know. Thanks Daniel.

Xavier

unread,
Sep 17, 2015, 7:45:41 PM9/17/15
to Druid Development
My only concern with reverse cursor is the added complexity and potential for bugs / performance regressions.

Currently there are quite a few places in the query path that assume things are traversed in increasing time order, and take advantage of this to take some shortcuts and speed things up. I know this is the case for groupBy, for instance. 

An alternative would be to keep the same cursor direction, but implement sorting on the results. This would also give the flexibility of re-ordering results based on things other than time which I think would be more beneficial overall. 

Fangjin Yang

unread,
Sep 18, 2015, 12:57:55 PM9/18/15
to Druid Development
I worry about the performance of just sorting the results as the select query isn't the fastest as is. Also, the result set from select can be substantial, which means multiple queries need to be issued to get all the results from a time range before the results can be sorted.

nav...@gmail.com

unread,
Nov 26, 2015, 2:39:01 AM11/26/15
to Druid Development
I've found ImmutableRoaringReverseIntIterator is not cloneable. Could you check https://github.com/lemire/RoaringBitmap/pull/54?

2015년 9월 1일 화요일 오후 11시 18분 34초 UTC+9, Daniel Lemire 님의 말:

Daniel Lemire

unread,
Nov 26, 2015, 9:17:25 AM11/26/15
to Druid Development
Thanks to your PR, the issue is already fixed. Thanks. I am issuing a new release.

Daniel Lemire

unread,
Nov 26, 2015, 9:17:57 AM11/26/15
to Druid Development
The version with the fix will be 0.5.13, should be available today from the maven repository.

nav...@gmail.com

unread,
Nov 27, 2015, 4:04:43 AM11/27/15
to Druid Development
Thanks for the quick answer. 

2015년 11월 26일 목요일 오후 11시 17분 57초 UTC+9, Daniel Lemire 님의 말:

Fangjin Yang

unread,
Nov 28, 2015, 5:14:15 PM11/28/15
to Druid Development
Thanks for the contrib! We should be able to get to review it after thanksgiving.

Charles Allen

unread,
Dec 1, 2015, 3:58:25 PM12/1/15
to Druid Development
https://github.com/metamx/bytebuffer-collections/pull/21 submitted to get the workflow going for "proper" druid support

Fangjin Yang

unread,
Dec 30, 2015, 1:17:33 PM12/30/15
to Druid Development
BTW, a lot of this work has been done in the following PR: https://github.com/druid-io/druid/pull/2014/files
Reply all
Reply to author
Forward
0 new messages