Error on realtime ingestion using tranquility (Server module) and druid


Rodrigo G.

Feb 17, 2016, 7:15:58 PM
to Druid User
Hi Guys,

I'm having trouble ingesting data in real time using Tranquility and Druid. I couldn't get much information from the exception in the task log.
Basically this part (full task log attached):

2016-02-17T23:26:05,983 INFO [Thread-44] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void com.metamx.http.client.NettyHttpClient.stop()] on object[com.metamx.http.client.NettyHttpClient@35fe7ec2].
2016-02-17T23:26:05,983 ERROR [task-runner-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[RealtimeIndexTask{id=index_realtime_000C7221A34CA9DD8444662FFFB2EFD0_2016-02-17T00:00:00.000Z_0_0, type=index_realtime, dataSource=000C7221A34CA9DD8444662FFFB2EFD0}]
java.lang.RuntimeException: java.lang.InterruptedException
at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
at io.druid.segment.realtime.firehose.EventReceiverFirehoseFactory$EventReceiverFirehose.hasMore(EventReceiverFirehoseFactory.java:209) ~[druid-server-0.8.3.jar:0.8.3]
at io.druid.segment.realtime.firehose.TimedShutoffFirehoseFactory$TimedShutoffFirehose.hasMore(TimedShutoffFirehoseFactory.java:99) ~[druid-server-0.8.3.jar:0.8.3]
at io.druid.segment.realtime.firehose.PredicateFirehose.hasMore(PredicateFirehose.java:50) ~[druid-server-0.8.3.jar:0.8.3]
at io.druid.indexing.common.task.RealtimeIndexTask.run(RealtimeIndexTask.java:314) ~[druid-indexing-service-0.8.3.jar:0.8.3]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:285) [druid-indexing-service-0.8.3.jar:0.8.3]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:265) [druid-indexing-service-0.8.3.jar:0.8.3]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) [?:1.7.0_95]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_95]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_95]
at java.lang.Thread.run(Thread.java:745) [?:1.7.0_95]
Caused by: java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017) ~[?:1.7.0_95]
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2095) ~[?:1.7.0_95]
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:389) ~[?:1.7.0_95]
at io.druid.segment.realtime.firehose.EventReceiverFirehoseFactory$EventReceiverFirehose.hasMore(EventReceiverFirehoseFactory.java:201) ~[druid-server-0.8.3.jar:0.8.3]
... 9 more
2016-02-17T23:26:05,988 INFO [task-runner-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_realtime_000C7221A34CA9DD8444662FFFB2EFD0_2016-02-17T00:00:00.000Z_0_0",
  "status" : "FAILED",
  "duration" : 434577
}
2016-02-17T23:26:05,991 INFO [Thread-44] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void com.metamx.metrics.MonitorScheduler.stop()] on object[com.metamx.metrics.MonitorScheduler@3d75caf9].

What I can observe:
The task is received by the overlord.
The temporary files are created in /tmp/druid/zk_druid.
The task shows up in the indexing tab of the coordinator console (web).
The task shows up in the overlord console (web), always with status RUNNING.
The log of the task is created in /usr/local/druid/log.
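
For reference, a minimal sketch (in Python) of checking what the overlord reports for this task over its HTTP API; the host placeholder and the urllib-based call are illustrative, while /druid/indexer/v1/task/<id>/status is the standard indexing service endpoint:

# Minimal sketch: ask the overlord what status it reports for the task.
# Assumes the overlord is reachable on machine B at port 8090 (as configured
# later in this message) and uses the task id from the log above.
import json
from urllib.parse import quote
from urllib.request import urlopen

OVERLORD = "http://<machine-b-ip>:8090"  # placeholder; replace with machine B's IP
TASK_ID = "index_realtime_000C7221A34CA9DD8444662FFFB2EFD0_2016-02-17T00:00:00.000Z_0_0"

url = "{}/druid/indexer/v1/task/{}/status".format(OVERLORD, quote(TASK_ID, safe=""))
with urlopen(url) as resp:
    print(resp.read().decode("utf-8"))  # prints the JSON status the overlord reports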

About my environment, I'm running:
Druid 0.8.3
Tranquility 0.7.3
Zookeeper 3.4.6
Mysql 5.7.11

My current topology is:
Machine A (7.5 GB RAM, 14 ECU) running ZooKeeper, MySQL, and the coordinator node.
Machine B (16 GB RAM, 13 ECU) running the historical and overlord (indexing service) nodes.

My deep storage is S3.
My metadata storage is MySQL.

My current node configuration for machine A (common properties):
druid.extensions.coordinates=["io.druid.extensions:druid-examples","io.druid.extensions:druid-kafka-eight", "io.druid.extensions:mysql-metadata-storage", "io.druid.extensions:druid-hdfs-storage", "io.druid.extensions:druid-s3-extensions"]
druid.extensions.localRepository=extensions-repo

druid.zk.service.host=localhost
druid.zk.paths.base=/druid

druid.metadata.storage.type=mysql
druid.metadata.storage.connector.connectURI=jdbc:mysql://mysqlip/druid
druid.metadata.storage.connector.user=druid
druid.metadata.storage.connector.password=(somepass)

druid.storage.type=s3
druid.s3.accessKey=(some access key)
druid.s3.secretKey=(some secret)
druid.storage.bucket=(bucket)
druid.storage.baseKey=(key)
druid.storage.archiveBucket=druid-archive
druid.storage.archiveBaseKey=bc

druid.cache.type=local
druid.cache.sizeInBytes=10000000

druid.selectors.indexing.serviceName=overlord

druid.selectors.coordinator.serviceName=coordinator

druid.emitter=noop

coordinator:
JVM parameters:  -Xmx4g -Duser.timezone=UTC -Dfile.encoding=UTF-8

druid.host=(machine A IP)
druid.port=8081
druid.service=coordinator

druid.coordinator.startDelay=PT70s


Common configuration for machine B:
druid.extensions.coordinates=["io.druid.extensions:druid-examples","io.druid.extensions:druid-kafka-eight", "io.druid.extensions:mysql-metadata-storage", "io.druid.extensions:druid-hdfs-storage", "io.druid.extensions:druid-s3-extensions"]
druid.extensions.localRepository=extensions-repo

druid.zk.service.host=(machine A's ip address)
druid.zk.paths.base=/druid

druid.metadata.storage.type=mysql
druid.metadata.storage.connector.connectURI=jdbc:mysql://mysqlip/druid
druid.metadata.storage.connector.user=druid
druid.metadata.storage.connector.password=(somepass)

druid.storage.type=s3
druid.s3.accessKey=(some access key)
druid.s3.secretKey=(some secret)
druid.storage.bucket=(bucket)
druid.storage.baseKey=(key)
druid.storage.archiveBucket=druid-archive
druid.storage.archiveBaseKey=bc


druid.cache.type=local
druid.cache.sizeInBytes=10000000

druid.selectors.indexing.serviceName=overlord

druid.selectors.coordinator.serviceName=coordinator

druid.monitoring.monitors=["com.metamx.metrics.SysMonitor","com.metamx.metrics.JvmMonitor"]

druid.emitter=noop

historical:
JVM parameters:  -server -Xmx12g -Xms12g -XX:NewSize=6g -XX:MaxNewSize=6g -XX:MaxDirectMemorySize=16g -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Duser.timezone=UTC -Dfile.encoding=UTF-8

druid.host=(machine B ip)
druid.port=8083
druid.service=historical

druid.processing.numThreads=15

druid.historical.cache.useCache=true
druid.historical.cache.populateCache=true

druid.server.http.numThreads=50
druid.server.maxSize=300000000000

druid.segmentCache.locations=[{"path": "/tmp/druid/zk_druid", "maxSize"\: 300000000000}]

overlord (not running middle manager):
JVM parameters: 

druid.host=(machine B ip)
druid.port=8090
druid.service=overlord



Thanks in advance for any help!





Attachment: index_realtime_000C7221A34CA9DD8444662FFFB2EFD0_2016-02-17T00%3A00%3A00.000Z_0_0.log

Slim Bouguerra

Feb 18, 2016, 3:02:28 PM
to druid...@googlegroups.com
This seems like the task was interrupted; I'm not sure what caused that. Did you send a cancel task?


Rodrigo G.

Feb 18, 2016, 4:31:29 PM
to Druid User
No, I didn't send any cancel task.
I just pushed the data to Tranquility and collected the logs.
The task is stuck as RUNNING, even after the exception.

Fangjin Yang

Feb 19, 2016, 7:04:11 PM
to Druid User
Hi Rodrigo, do you see this happen often, or was it only with this one task?

Rodrigo G.

Feb 21, 2016, 5:10:22 PM
to Druid User
Hi Fangjin,

It happened to every task I submitted via the Tranquility HTTP server (I tried different configurations and different data sources).
I connected to ZooKeeper via zkCli and saw that the segments were listed under /druid/segments, but /druid/loadQueue was empty (sketched below).
I also tried running an instance of the MiddleManager and saw on the indexing service web console that the worker showed up as available and was then allocated to my task.
On the indexing service console, the task hangs as RUNNING for about an hour and then goes to the "completed" bucket, but no segments are correctly created.
I checked my task temp directories under /tmp/task and the files for the task are there.
I really can't figure out what part I'm missing.
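
For completeness, a sketch of that ZooKeeper check using the (third-party) kazoo client; the host is a placeholder, and the paths follow druid.zk.paths.base=/druid from the configs earlier in the thread:

# Sketch of the zkCli check above, using the kazoo client library.
# Assumes ZooKeeper runs on machine A on the default port 2181 and that
# druid.zk.paths.base=/druid, as in the configs posted earlier.
from kazoo.client import KazooClient

zk = KazooClient(hosts="<machine-a-ip>:2181")  # placeholder host
zk.start()
try:
    print("segments:", zk.get_children("/druid/segments"))    # announced segments
    print("loadQueue:", zk.get_children("/druid/loadQueue"))  # per-historical load queues (empty in my case)
finally:
    zk.stop()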

Thanks for the help!

Fangjin Yang

Feb 29, 2016, 12:13:34 PM
to Druid User
You likely have a misconfiguration somewhere in your setup. Can you attach a task log?

biswa...@gmail.com

Mar 30, 2016, 3:53:49 PM
to Druid User
Hi all,

Any resolution for this error? I'm also seeing the exact same error during ingestion via Tranquility Server.

Thank you.

Gian Merlino

Mar 30, 2016, 4:22:21 PM
to druid...@googlegroups.com
For the "task hangs as running for ~1 hour and then goes to 'completed'" issue, try checking out: https://github.com/druid-io/tranquility/blob/master/docs/trouble.md#no-data-is-showing-up-in-druid-even-though-events-seem-to-be-successfully-sent
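
A quick way to test that theory (a sketch, not taken from the doc): send one event stamped with the current UTC time straight to Tranquility Server and compare "received" vs "sent" in the response. The host, data source name, and fields below are placeholders; the /v1/post/<dataSource> endpoint and port 8200 are the ones used in the Tranquility Server docs:

# Sketch: post a single freshly-timestamped event to Tranquility Server so
# that windowPeriod drops can be ruled out. Adjust host, data source, and
# fields to match your server.json; the names here are hypothetical.
import json
from datetime import datetime, timezone
from urllib.request import Request, urlopen

TRANQUILITY = "http://<tranquility-host>:8200"  # placeholder host/port
DATASOURCE = "mydatasource"                     # hypothetical data source name

event = {
    "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    "dimension1": "test",                       # hypothetical dimension
    "metric1": 1,                               # hypothetical metric
}
req = Request(
    "{}/v1/post/{}".format(TRANQUILITY, DATASOURCE),
    data=json.dumps(event).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urlopen(req) as resp:
    # If "received" is 1 but "sent" is 0, the event was dropped, most likely
    # because its timestamp fell outside windowPeriod (see the link above).
    print(resp.read().decode("utf-8"))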

Biswajit, is that what you're seeing too? If not, could you describe what you're seeing in more detail?

Gian


Rodrigo G.

Apr 4, 2016, 7:18:34 PM
to Druid User
Hi Guys,

Sorry for the late reply... I ended up upgrading to 0.9.0.rc3, and everything started working.

Thanks for the support!