BufferOverflowException in Broker when using orderBy in groupBy query

Asbjørn Clemmensen

Aug 7, 2014, 6:38:15 AM
to druid-de...@googlegroups.com
First off, Druid is pretty amazing. :) However, I'm experiencing some issues with queries routed through the broker and I'm hoping you guys can help me.

On 6.137, using a query like this:

{
   "queryType":"groupBy",
   "dataSource":"<ds>",
   "granularity":"all",
   "dimensions":[ "url" ],
   "orderBy": {
      "type": "default",
      "columns": [ {"dimension": "pv", "direction":"DESCENDING"} ],
      "limit": 5
   },
   "aggregations":[
      {"type":"count", "name":"pv"},
      {"type":"longSum", "name":"bounces", "fieldName":"bounces"},
      {"type":"longSum", "name":"entries", "fieldName":"entries"}
   ],
   "postAggregations": [
      {"type":"arithmetic", "name": "bouncerate", "fn": "*", "fields": [
              {"type":"arithmetic", "name": "boucerate", "fn": "/", "fields": [
                 {"type": "fieldAccess", "name": "xyz", "fieldName": "bounces"},
                 {"type": "fieldAccess", "name": "zyz", "fieldName": "entries"}
              ]},
              {"type": "constant", "name": "const", "value": 100}]}
   ],
   "intervals":[ "2014-01-01T00:00/2014-07-31T23:59" ]
}

I'm getting an empty response and an error message in the broker console output like this:

2014-08-07 10:14:17,982 INFO [qtp1280156403-30] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-08-07T10:14:17.956Z","service":"broker","host":"localhost:8080","metric":"query/time","value":14,"user10":"failed","user2":"<ds>","user3":"1 dims","user4":"groupBy","user5":"2010-01-01T00:00:00.000Z/2014-07-31T23:59:59.000Z","user6":"false","user7":"3 aggs","user8":"c9610613-9f48-49b9-963f-ce62cf1b4c95","user9":"PT2409119M"}]
2014-08-07 10:14:17,983 INFO [qtp1280156403-30] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2014-08-07T10:14:17.982Z","service":"broker","host":"localhost:8080","metric":"query/wait","value":12,"user10":"failed","user2":"<ds>","user3":"1 dims","user4":"groupBy","user5":"2010-01-01T00:00:00.000Z/2014-07-31T23:59:59.000Z","user6":"false","user7":"3 aggs","user8":"c9610613-9f48-49b9-963f-ce62cf1b4c95","user9":"PT2409119M"}]
2014-08-07 10:14:17,983 WARN [qtp1280156403-30] io.druid.server.QueryResource - Exception occurred on request [GroupByQuery{limitSpec=DefaultLimitSpec{columns='[OrderByColumnSpec{dimension='pv', direction=DESCENDING}]', limit=5}, dimFilter=null, granularity=AllGranularity, dimensions=[DefaultDimensionSpec{dimension='url', outputName='url'}], aggregatorSpecs=[CountAggregatorFactory{name='pv'}, LongSumAggregatorFactory{fieldName='bounces', name='bounces'}, LongSumAggregatorFactory{fieldName='entries', name='entries'}], postAggregatorSpecs=[ArithmeticPostAggregator{name='bouncerate', fnName='*', fields=[ArithmeticPostAggregator{name='boucerate', fnName='/', fields=[FieldAccessPostAggregator{name='xyz', fieldName='bounces'}, FieldAccessPostAggregator{name='zyz', fieldName='entries'}], op=DIV}, ConstantPostAggregator{name='const', constantValue=100}], op=MULT}], limitFn=io.druid.query.groupby.orderby.DefaultLimitSpec$TopNFunction@6507c2cd}]
java.nio.BufferOverflowException
        at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:183)
        at java.nio.ByteBuffer.put(ByteBuffer.java:832)
        at io.druid.query.groupby.orderby.OrderByColumnSpec.getCacheKey(OrderByColumnSpec.java:188)
        at io.druid.query.groupby.orderby.DefaultLimitSpec.getCacheKey(DefaultLimitSpec.java:362)
        at io.druid.query.groupby.GroupByQueryQueryToolChest$6.computeCacheKey(GroupByQueryQueryToolChest.java:276)
        at io.druid.query.groupby.GroupByQueryQueryToolChest$6.computeCacheKey(GroupByQueryQueryToolChest.java:259)
        at io.druid.client.CachingClusteredClient.run(CachingClusteredClient.java:184)
        at io.druid.query.IntervalChunkingQueryRunner.run(IntervalChunkingQueryRunner.java:54)
        at io.druid.query.SubqueryQueryRunner.run(SubqueryQueryRunner.java:45)
        at io.druid.query.MetricsEmittingQueryRunner$1.accumulate(MetricsEmittingQueryRunner.java:87)
        at io.druid.query.groupby.GroupByQueryQueryToolChest.makeIncrementalIndex(GroupByQueryQueryToolChest.java:189)
        at io.druid.query.groupby.GroupByQueryQueryToolChest.mergeGroupByResults(GroupByQueryQueryToolChest.java:157)
        at io.druid.query.groupby.GroupByQueryQueryToolChest.access$100(GroupByQueryQueryToolChest.java:71)
        at io.druid.query.groupby.GroupByQueryQueryToolChest$3.run(GroupByQueryQueryToolChest.java:116)
        at io.druid.query.FinalizeResultsQueryRunner.run(FinalizeResultsQueryRunner.java:96)
        at io.druid.query.BaseQuery.run(BaseQuery.java:80)
        at io.druid.query.BaseQuery.run(BaseQuery.java:75)
        at io.druid.server.QueryResource.doPost(QueryResource.java:155)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)

If I run the same query against the historical node directly, it works as intended. If I remove the orderBy, it also works from both the broker and the historical node, but I do need the result set to be ordered.
I can't quite figure out whether it's possible to avoid branching into the getCacheKey call. I also haven't changed any settings pertaining to cacheable queries.
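
For illustration, the exception at the top of that stack trace is what java.nio raises when more bytes are written into a fixed-capacity ByteBuffer than it was allocated for. A minimal standalone sketch of that failure pattern (hypothetical method and names, not the actual Druid code) would be:

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class CacheKeyOverflowDemo {
    // Hypothetical mirror of a getCacheKey-style method: the buffer is sized
    // for the dimension name only, but the direction bytes are written too.
    static byte[] buildCacheKey(String dimension, String direction) {
        byte[] dimBytes = dimension.getBytes(StandardCharsets.UTF_8);
        byte[] dirBytes = direction.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(dimBytes.length); // undersized
        buf.put(dimBytes);
        buf.put(dirBytes); // capacity already exhausted: BufferOverflowException
        return buf.array();
    }

    public static void main(String[] args) {
        try {
            buildCacheKey("pv", "DESCENDING");
            System.out.println("no exception");
        } catch (BufferOverflowException e) {
            System.out.println("BufferOverflowException");
        }
    }
}
```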

Thank you for your time and for creating Druid in the first place.

Gian Merlino

Aug 7, 2014, 10:04:41 AM
to druid-de...@googlegroups.com
That looks like a bug. If you are able to build Druid from source, I think this patch should fix it: https://github.com/metamx/druid/pull/663/files. If not, a fix should be in a new release soon.
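
A fix along those lines would size the buffer for everything that gets written into it, the dimension name plus the direction. This is a hypothetical sketch of the corrected pattern under that assumption, not the actual patch:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class CacheKeySizingSketch {
    // Hypothetical corrected sketch: the buffer's capacity covers both fields,
    // so both puts succeed and the full key is returned.
    static byte[] buildCacheKey(String dimension, String direction) {
        byte[] dimBytes = dimension.getBytes(StandardCharsets.UTF_8);
        byte[] dirBytes = direction.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(dimBytes.length + dirBytes.length)
                         .put(dimBytes)
                         .put(dirBytes)
                         .array();
    }

    public static void main(String[] args) {
        byte[] key = buildCacheKey("pv", "DESCENDING");
        System.out.println(key.length); // 2 bytes for "pv" + 10 for "DESCENDING"
    }
}
```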

You can disable caching on a per query basis by adding this to your query:

    "context": {"useCache":false, "populateCache":false}

I am not sure that will actually prevent computation of the cache key, though. It probably won't matter either way: unless you have modified your config, groupBy query results are not cached by default anyway.

Fangjin Yang

Aug 7, 2014, 7:02:23 PM
to druid-de...@googlegroups.com
This has been merged and I'll update the RC accordingly.

Asbjørn Clemmensen

Aug 8, 2014, 1:55:20 AM
to druid-de...@googlegroups.com
Thank you guys for reacting so quickly. 