Indexing task fails with Error " java.lang.IllegalArgumentException: fromIndex(0) > toIndex(-1)"


Atul Suryavanshi

Feb 7, 2018, 7:47:55 AM
to Druid User
Hi,

I am trying to load a sample data set into Druid, but my indexing task fails with the error below.
I am not using Hadoop; the data is read from local storage. Could someone please help me resolve this issue?
I have looked through a post with a similar issue, but it was not of much help.

I have tried loading a JSON file and that works fine, so I am not sure whether there is an issue with my spec file (attached for reference).


2018-02-07T12:36:34,705 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.AppenderatorDriver - New segment[retail_2010-12-01T00:00:00.000Z_2010-12-02T00:00:00.000Z_2018-02-07T12:36:29.193Z] for sequenceName[index_retail_2018-02-07T12:36:29.191Z].
2018-02-07T12:36:34,760 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.AppenderatorDriver - Persisting data.
2018-02-07T12:36:34,766 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Shutting down...
2018-02-07T12:36:34,770 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[retail_2010-12-01T00:00:00.000Z_2010-12-02T00:00:00.000Z_2018-02-07T12:36:29.193Z].
2018-02-07T12:36:34,776 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[IndexTask{id=index_retail_2018-02-07T12:36:29.191Z, type=index, dataSource=retail}]
java.lang.IllegalArgumentException: fromIndex(0) > toIndex(-1)
        at java.util.ArrayList.subListRangeCheck(ArrayList.java:1014) ~[?:1.8.0_152]
        at java.util.ArrayList.subList(ArrayList.java:1004) ~[?:1.8.0_152]
        at io.druid.segment.realtime.appenderator.AppenderatorImpl.persist(AppenderatorImpl.java:381) ~[druid-server-0.11.0.jar:0.11.0]
        at io.druid.segment.realtime.appenderator.AppenderatorImpl.persistAll(AppenderatorImpl.java:462) ~[druid-server-0.11.0.jar:0.11.0]
        at io.druid.segment.realtime.appenderator.AppenderatorDriver.persist(AppenderatorDriver.java:258) ~[druid-server-0.11.0.jar:0.11.0]
        at io.druid.indexing.common.task.IndexTask.generateAndPublishSegments(IndexTask.java:695) ~[druid-indexing-service-0.11.0.jar:0.11.0]
        at io.druid.indexing.common.task.IndexTask.run(IndexTask.java:233) ~[druid-indexing-service-0.11.0.jar:0.11.0]
        at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.11.0.jar:0.11.0]
        at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.11.0.jar:0.11.0]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_152]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_152]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_152]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_152]
2018-02-07T12:36:34,783 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_retail_2018-02-07T12:36:29.191Z] status changed to [FAILED].
2018-02-07T12:36:34,787 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_retail_2018-02-07T12:36:29.191Z",
  "status" : "FAILED",
  "duration" : 354
}

Thanks
Atul S

retail.json

Jonathan Wei

Feb 7, 2018, 9:04:46 PM
to druid...@googlegroups.com
Hi Atul,

I think this exception occurs when the segment to be persisted has no rows. Can you try running your job with `reportParseExceptions=true` in the tuningConfig? It may also be worth double-checking your timestamp formats.
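For reference, a minimal sketch of where that flag goes in an index task spec (the surrounding fields are illustrative, not taken from your attached spec):

```json
"tuningConfig": {
  "type": "index",
  "reportParseExceptions": true
}
```

With this set, the task should fail fast on the first unparseable row and name it in the task log, rather than silently skipping rows and ending up with an empty segment.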

- Jon






--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+unsubscribe@googlegroups.com.
To post to this group, send email to druid...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/c074f660-accc-4dc9-a0cf-b82d4c36a0cc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

ahe...@barefootnetworks.com

May 29, 2018, 8:15:38 PM
to Druid User
Hi Jon,

I see the same issue with Druid 0.12.1-rc2. I also notice some data loss after this error is hit. Is this going to be resolved in 0.12.1 or 0.13.0?

2018-05-29T20:06:46,138 INFO [task-runner-0-priority-0] io.druid.server.coordination.CuratorDataSegmentServerAnnouncer - Unannouncing self[DruidServerMetadata{name='10.1.1.1:8102', hostAndPort='10.1.1.1:8102', hostAndTlsPort='null', maxSize=0, tier='_default_tier', type=indexer-executor, priority=0}] at [/druid/announcements/10.1.1.1:8102]
2018-05-29T20:06:46,138 INFO [task-runner-0-priority-0] io.druid.curator.announcement.Announcer - unannouncing [/druid/announcements/10.1.1.1:8102]
2018-05-29T20:06:46,142 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[KafkaIndexTask{id=index_kafka_topic_8e372287495ab31_aijnmgpe, type=index_kafka, dataSource=topic}]
java.lang.IllegalArgumentException: fromIndex(0) > toIndex(-1)
	at java.util.ArrayList.subListRangeCheck(ArrayList.java:1014) ~[?:1.8.0_162]
	at java.util.ArrayList.subList(ArrayList.java:1004) ~[?:1.8.0_162]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl.persistAll(AppenderatorImpl.java:408) ~[druid-server-0.12.1-rc2.jar:0.12.1-rc2]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl.push(AppenderatorImpl.java:519) ~[druid-server-0.12.1-rc2.jar:0.12.1-rc2]
	at io.druid.segment.realtime.appenderator.BaseAppenderatorDriver.pushInBackground(BaseAppenderatorDriver.java:351) ~[druid-server-0.12.1-rc2.jar:0.12.1-rc2]
	at io.druid.segment.realtime.appenderator.StreamAppenderatorDriver.publish(StreamAppenderatorDriver.java:268) ~[druid-server-0.12.1-rc2.jar:0.12.1-rc2]
	at io.druid.indexing.kafka.KafkaIndexTask.lambda$createAndStartPublishExecutor$1(KafkaIndexTask.java:364) ~[?:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_162]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_162]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_162]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_162]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
2018-05-29T20:06:46,143 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_kafka_topic_8e372287495ab31_aijnmgpe] status changed to [FAILED].
2018-05-29T20:06:46,146 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_kafka_topic_8e372287495ab31_aijnmgpe",
  "status" : "FAILED",
  "duration" : 583724
}

Thanks,
Avinash

Jonathan Wei

May 30, 2018, 9:55:53 PM
to druid...@googlegroups.com
Hi Avinash,

The resolution for "java.lang.IllegalArgumentException: fromIndex(0) > toIndex(-1)" is most likely to fix whatever errors in the input data lead to unparseable rows, which is something that must be done on the user side.
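One common culprit is a timestampSpec whose format doesn't match the actual data, so every row fails to parse and the segment comes out empty. A hypothetical example (the column name and format here are illustrative, not from anyone's actual spec):

```json
"timestampSpec": {
  "column": "date",
  "format": "dd/MM/yyyy HH:mm"
}
```

If the rows actually carry ISO-style timestamps like "2010-12-01T08:26:00", a format string such as the one above parses none of them, and the persist then trips over the empty row list.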

0.13.0 will have better error reporting for ingestion tasks (https://github.com/druid-io/druid/pull/5418), which should help with identifying such parsing errors.

Thanks,
Jon


Xianyin Xin

Jan 29, 2019, 10:00:16 PM
to Druid User
We observed this issue as well, but I suspect it is not a parsing problem, since the segments were regenerated by the following task and eventually published successfully.
One difference is that in my case the exception is reported while serializing the metadata:

2019-01-29T13:30:48,059 INFO [access_log-incremental-persist] io.druid.segment.realtime.appenderator.AppenderatorImpl - Committing metadata[AppenderatorDriverMetadata{segments={****}].
2019-01-29T13:30:48,060 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.StreamAppenderatorDriver - Persisted pending data in 48ms.
2019-01-29T13:30:48,064 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Shutting down immediately...
2019-01-29T13:30:48,065 INFO [task-runner-0-priority-0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[***] at path[/druid/segments/emr-worker-4.cluster-72321:8101/emr-worker-4.cluster-72321:8101_indexer-executor__default_tier_2019-01-29T12:50:13.319Z_51df8fd2617a48868262f617e05efb170]
2019-01-29T13:30:48,066 INFO [task-runner-0-priority-0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[***] at path[/druid/segments/emr-worker-4.cluster-72321:8101/emr-worker-4.cluster-72321:8101_indexer-executor__default_tier_2019-01-29T12:50:13.319Z_51df8fd2617a48868262f617e05efb170]
2019-01-29T13:30:48,067 INFO [task-runner-0-priority-0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[***] at path[/druid/segments/emr-worker-4.cluster-72321:8101/emr-worker-4.cluster-72321:8101_indexer-executor__default_tier_2019-01-29T12:50:13.319Z_51df8fd2617a48868262f617e05efb170]
2019-01-29T13:30:48,067 INFO [task-runner-0-priority-0] io.druid.curator.announcement.Announcer - unannouncing [/druid/segments/emr-worker-4.cluster-72321:8101/emr-worker-4.cluster-72321:8101_indexer-executor__default_tier_2019-01-29T12:50:13.319Z_51df8fd2617a48868262f617e05efb170]
2019-01-29T13:30:48,088 INFO [task-runner-0-priority-0] io.druid.segment.realtime.firehose.ServiceAnnouncingChatHandlerProvider - Unregistering chat handler[***]
2019-01-29T13:30:48,088 INFO [task-runner-0-priority-0] io.druid.curator.discovery.CuratorDruidNodeAnnouncer - Unannouncing [DiscoveryDruidNode{druidNode=DruidNode{serviceName='druid/middleManager', host='emr-worker-4.cluster-72321', port=-1, plaintextPort=8101, enablePlaintextPort=true, tlsPort=-1, enableTlsPort=false}, nodeType='peon', services={dataNodeService=DataNodeService{tier='_default_tier', maxSize=0, type=indexer-executor, priority=0}, lookupNodeService=LookupNodeService{lookupTier='__default'}}}].
2019-01-29T13:30:48,088 INFO [task-runner-0-priority-0] io.druid.curator.announcement.Announcer - unannouncing [/druid/internal-discovery/peon/emr-worker-4.cluster-72321:8101]
2019-01-29T13:30:48,092 INFO [task-runner-0-priority-0] io.druid.curator.discovery.CuratorDruidNodeAnnouncer - Unannounced [DiscoveryDruidNode{druidNode=DruidNode{serviceName='druid/middleManager', host='emr-worker-4.cluster-72321', port=-1, plaintextPort=8101, enablePlaintextPort=true, tlsPort=-1, enableTlsPort=false}, nodeType='peon', services={dataNodeService=DataNodeService{tier='_default_tier', maxSize=0, type=indexer-executor, priority=0}, lookupNodeService=LookupNodeService{lookupTier='__default'}}}].
2019-01-29T13:30:48,092 INFO [task-runner-0-priority-0] io.druid.server.coordination.CuratorDataSegmentServerAnnouncer - Unannouncing self[DruidServerMetadata{name='emr-worker-4.cluster-72321:8101', hostAndPort='emr-worker-4.cluster-72321:8101', hostAndTlsPort='null', maxSize=0, tier='_default_tier', type=indexer-executor, priority=0}] at [/druid/announcements/emr-worker-4.cluster-72321:8101]
2019-01-29T13:30:48,092 INFO [task-runner-0-priority-0] io.druid.curator.announcement.Announcer - unannouncing [/druid/announcements/emr-worker-4.cluster-72321:8101]
2019-01-29T13:30:48,095 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[KafkaIndexTask{id=index_kafka_access_log_5855cca6d6ce7d7_ojahebcn, type=index_kafka, dataSource=access_log}]
java.lang.IllegalArgumentException: fromIndex(0) > toIndex(-1)
    at java.util.ArrayList.subListRangeCheck(ArrayList.java:1012) ~[?:1.8.0_151]
    at java.util.ArrayList.subList(ArrayList.java:1002) ~[?:1.8.0_151]
    at io.druid.segment.realtime.appenderator.AppenderatorImpl.persistAll(AppenderatorImpl.java:408) ~[druid-server-0.12.1.jar:0.12.1]
    at io.druid.segment.realtime.appenderator.AppenderatorImpl.push(AppenderatorImpl.java:519) ~[druid-server-0.12.1.jar:0.12.1]
    at io.druid.segment.realtime.appenderator.BaseAppenderatorDriver.pushInBackground(BaseAppenderatorDriver.java:351) ~[druid-server-0.12.1.jar:0.12.1]
    at io.druid.segment.realtime.appenderator.StreamAppenderatorDriver.publish(StreamAppenderatorDriver.java:268) ~[druid-server-0.12.1.jar:0.12.1]
    at io.druid.indexing.kafka.KafkaIndexTask.lambda$createAndStartPublishExecutor$1(KafkaIndexTask.java:364) ~[?:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_151]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_151]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_151]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_151]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_151]
2019-01-29T13:30:48,096 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [***] status changed to [FAILED].
2019-01-29T13:30:48,097 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "***",
  "status" : "FAILED",
  "duration" : 2436516
}