Druid realtime unable to hand off segments


pe...@whisper.sh

May 3, 2016, 2:52:51 PM
to Druid User
Our Druid realtime nodes had been running for some time, but recently they hit what looks like a hard limit related to druid.processing.buffer.sizeBytes. We have it configured at 2GB, but that does not seem to be adequate for merging some of the segments. I see the following errors in the log for the broken/missing segments:

2016-05-02T12:33:50,893 ERROR [weaver_events-2016-04-29T02:00:00.000Z-persist-n-merge] io.druid.segment.realtime.plumber.RealtimePlumber - Failed to persist merged index[weaver_events]: {class=io.druid.segment.realtime.plumber.RealtimePlumber, exceptionType=class com.metamx.common.IAE, exceptionMessage=Asked to add buffers[2,439,613,871] larger than configured max[2,147,483,647], interval=2016-04-29T02:00:00.000Z/2016-04-29T03:00:00.000Z}

com.metamx.common.IAE: Asked to add buffers[2,439,613,871] larger than configured max[2,147,483,647]
  at com.metamx.common.io.smoosh.FileSmoosher.addWithSmooshedWriter(FileSmoosher.java:152) ~[java-util-0.27.7.jar:?]
  at io.druid.segment.IndexIO$DefaultIndexIOHandler.convertV8toV9(IndexIO.java:744) ~[druid-processing-0.9.0.jar:0.9.0]
  at io.druid.segment.IndexMerger.makeIndexFiles(IndexMerger.java:1009) ~[druid-processing-0.9.0.jar:0.9.0]
  at io.druid.segment.IndexMerger.merge(IndexMerger.java:421) ~[druid-processing-0.9.0.jar:0.9.0]
  at io.druid.segment.IndexMerger.mergeQueryableIndex(IndexMerger.java:242) ~[druid-processing-0.9.0.jar:0.9.0]
  at io.druid.segment.IndexMerger.mergeQueryableIndex(IndexMerger.java:215) ~[druid-processing-0.9.0.jar:0.9.0]
  at io.druid.segment.realtime.plumber.RealtimePlumber$4.doRun(RealtimePlumber.java:536) [druid-server-0.9.0.jar:0.9.0]
  at io.druid.common.guava.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:42) [druid-common-0.9.0.jar:0.9.0]
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_60]
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_60]
  at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60]



I thought, OK, I just need to bump the buffer size a bit, no problem. But after setting the new value I get the following exception. Is there a hard limit at 2GB? How do I fix this?

1) Error in custom provider, java.lang.NumberFormatException: For input string: "3073741824"
  at io.druid.guice.ConfigProvider.bind(ConfigProvider.java:44)
  at io.druid.guice.ConfigProvider.bind(ConfigProvider.java:44)
  while locating io.druid.query.DruidProcessingConfig
  at io.druid.guice.DruidProcessingModule.getProcessingExecutorService(DruidProcessingModule.java:92)
  at io.druid.guice.DruidProcessingModule.getProcessingExecutorService(DruidProcessingModule.java:92)
  while locating java.util.concurrent.ExecutorService annotated with @io.druid.guice.annotations.Processing()
    for parameter 0 at io.druid.query.IntervalChunkingQueryRunnerDecorator.<init>(IntervalChunkingQueryRunnerDecorator.java:37)
  while locating io.druid.query.IntervalChunkingQueryRunnerDecorator
    for parameter 0 at io.druid.query.timeseries.TimeseriesQueryQueryToolChest.<init>(TimeseriesQueryQueryToolChest.java:71)
  at io.druid.guice.QueryToolChestModule.configure(QueryToolChestModule.java:74)
  while locating io.druid.query.timeseries.TimeseriesQueryQueryToolChest
    for parameter 0 at io.druid.query.timeseries.TimeseriesQueryRunnerFactory.<init>(TimeseriesQueryRunnerFactory.java:53)
  at io.druid.guice.QueryRunnerFactoryModule.configure(QueryRunnerFactoryModule.java:82)
  while locating io.druid.query.timeseries.TimeseriesQueryRunnerFactory
  while locating io.druid.query.QueryRunnerFactory annotated with @com.google.inject.multibindings.Element(setName=,uniqueId=18, type=MAPBINDER)
  at io.druid.guice.DruidBinders.queryRunnerFactoryBinder(DruidBinders.java:38)
  while locating java.util.Map<java.lang.Class<? extends io.druid.query.Query>, io.druid.query.QueryRunnerFactory>
    for parameter 0 at io.druid.query.DefaultQueryRunnerFactoryConglomerate.<init>(DefaultQueryRunnerFactoryConglomerate.java:36)
  while locating io.druid.query.DefaultQueryRunnerFactoryConglomerate
  at io.druid.guice.StorageNodeModule.configure(StorageNodeModule.java:55)
  while locating io.druid.query.QueryRunnerFactoryConglomerate
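A note on the two errors above (inferred from the messages, not stated in the thread): the configured max[2,147,483,647] in the merge failure is exactly Integer.MAX_VALUE, and the NumberFormatException is what Java's integer parsing throws for a value outside the 32-bit range, so druid.processing.buffer.sizeBytes cannot be raised past 2GB here. A minimal runtime.properties sketch of the two cases, values chosen only for illustration:

    # Rejected at startup: 3,073,741,824 does not fit in a 32-bit signed int (max 2,147,483,647)
    druid.processing.buffer.sizeBytes=3073741824

    # Parses, but the 2GB per-file ceiling hit during the merge still applies,
    # so the practical fix is smaller segments rather than a larger buffer
    druid.processing.buffer.sizeBytes=1073741824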


Fangjin

May 3, 2016, 3:01:59 PM
to Druid User
How big are your segments? How many rows are in each?

Druid segments should generally be around 300-700MB in size, with roughly 5M rows each.


pe...@whisper.sh

May 3, 2016, 4:13:17 PM
to Druid User
Segments are around 2-4GB in size, and here is what I see in the logs regarding size:

2016-05-03T07:11:30,582 INFO [events-incremental-persist] io.druid.segment.IndexMerger - Starting persist for interval[2016-05-03T07:00:00.000Z/2016-05-03T08:00:00.000Z], rows[500,000]


Here is the result of a count query by hour:

user@druid-realtime:~$ curl -X POST 'http://localhost:8084/druid/v2/?pretty' -H 'content-type: application/json' -d @query.q

[ { "timestamp" : "2016-05-03T00:00:00.000Z", "result" : { "count" : 41555535 } },
  { "timestamp" : "2016-05-03T01:00:00.000Z", "result" : { "count" : 46411458 } },
  { "timestamp" : "2016-05-03T02:00:00.000Z", "result" : { "count" : 40967300 } },
  { "timestamp" : "2016-05-03T03:00:00.000Z", "result" : { "count" : 32833436 } },
  { "timestamp" : "2016-05-03T04:00:00.000Z", "result" : { "count" : 29186762 } },
  { "timestamp" : "2016-05-03T05:00:00.000Z", "result" : { "count" : 24195599 } },
  { "timestamp" : "2016-05-03T06:00:00.000Z", "result" : { "count" : 20235289 } },
  { "timestamp" : "2016-05-03T07:00:00.000Z", "result" : { "count" : 16825411 } },
  { "timestamp" : "2016-05-03T08:00:00.000Z", "result" : { "count" : 14114823 } },
  { "timestamp" : "2016-05-03T09:00:00.000Z", "result" : { "count" : 13307833 } },
  { "timestamp" : "2016-05-03T10:00:00.000Z", "result" : { "count" : 15092930 } },
  { "timestamp" : "2016-05-03T11:00:00.000Z", "result" : { "count" : 20319835 } },
  { "timestamp" : "2016-05-03T12:00:00.000Z", "result" : { "count" : 20064067 } },
  { "timestamp" : "2016-05-03T13:00:00.000Z", "result" : { "count" : 20766580 } },
  { "timestamp" : "2016-05-03T14:00:00.000Z", "result" : { "count" : 24863256 } },
  { "timestamp" : "2016-05-03T15:00:00.000Z", "result" : { "count" : 24225975 } },
  { "timestamp" : "2016-05-03T16:00:00.000Z", "result" : { "count" : 31565259 } },
  { "timestamp" : "2016-05-03T17:00:00.000Z", "result" : { "count" : 34518830 } },
  { "timestamp" : "2016-05-03T18:00:00.000Z", "result" : { "count" : 25000000 } } ]
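The contents of query.q are not shown in the thread; for reference, a count-by-hour timeseries query of roughly the following shape would produce output in that format. The dataSource name and interval below are assumptions, not taken from the original post:

    {
      "queryType": "timeseries",
      "dataSource": "events",
      "granularity": "hour",
      "aggregations": [ { "type": "count", "name": "count" } ],
      "intervals": [ "2016-05-03T00:00:00.000Z/2016-05-03T19:00:00.000Z" ]
    }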

Fangjin

May 3, 2016, 4:15:53 PM
to Druid User
You need way more partitions and much smaller segments.
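Rough arithmetic from the counts above, using the ~5M-rows-per-segment guideline from earlier in the thread:

    46,411,458 rows in the busiest hour / ~5,000,000 rows per segment ≈ 9.3, i.e. roughly 8-10 partitions for that hour

Quieter hours would need proportionally fewer partitions.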

pe...@whisper.sh

May 3, 2016, 5:25:40 PM
to Druid User
Thanks Fangjin!

Is there any way to fix the data that has already been collected? I was under the assumption that modifying these parameters is not retroactive. I currently only have one realtime node consuming; I am switching over to Tranquility soon but am trying to minimize data loss.

-Pere
...

pe...@whisper.sh

May 3, 2016, 5:27:38 PM
to Druid User
I forgot to ask: will Tranquility handle these issues automatically, or is this more of a problem with the segment length?

Thanks,
Pere



Fangjin Yang

May 3, 2016, 8:00:56 PM
to Druid User
With Tranquility you can define the number of partitions you want, and it will create the additional required segments and manage the sharding for you. You can do the same thing with realtime nodes, but it requires much more manual tuning: you set up multiple realtime nodes, each with a different shardSpec partition number.
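For illustration only (the exact field and property names are assumptions to check against the docs for your versions): with standalone realtime nodes the per-node partition is usually expressed as a linear shardSpec in the spec file's tuningConfig, each node using a different partitionNum, e.g.

    "tuningConfig" : {
      "type" : "realtime",
      "maxRowsInMemory" : 500000,
      "intermediatePersistPeriod" : "PT10M",
      "windowPeriod" : "PT10M",
      "shardSpec" : { "type" : "linear", "partitionNum" : 0 }
    }

A second node would run the same spec with "partitionNum" : 1, a third with 2, and so on. The rough Tranquility Server equivalent is a per-dataSource property such as:

    "properties" : {
      "task.partitions" : "10",
      "task.replicants" : "1"
    }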

Pere

May 3, 2016, 9:10:24 PM
to Druid User
Thanks again. Will go down that path then!



shivani gupta

Dec 17, 2018, 11:46:34 PM
to Druid User
@Pere I am also facing the same issue. Were you able to resolve this? What did you change?