Failed realtime task finished with SUCCESS state


zdenek tison

Mar 2, 2015, 5:08:03 AM
to druid-de...@googlegroups.com
Hi,

We are observing odd behaviour with realtime tasks: segments are missing, yet the tasks finished with status SUCCESS.
In the logs we found a "No space left on device" exception, but the final status is still SUCCESS. Is this expected behaviour?

Interesting parts from log file:

2015-02-28 02:51:45,747 INFO [ssp-auction-2015-02-27T00:00:00.000Z-persist-n-merge] io.druid.segment.IndexMerger - outDir[/data/druid/baseTaskDir/index_realtime_ssp-auction_2015-02-27T00:00:00.000Z_1_0_dmpongod/work/persist/ssp-auction/2015-02-27T00:00:00.000Z_2015-02-28T00:00:00.000Z/merged/v8-tmp] walked 500,000/29,000,000 rows in 16,455 millis.
2015-02-28 02:51:53,908 ERROR [ssp-auction-2015-02-27T00:00:00.000Z-persist-n-merge] io.druid.segment.realtime.plumber.RealtimePlumber - Failed to persist merged index[ssp-auction]: {class=io.druid.segment.realtime.plumber.RealtimePlumber, exceptionType=class java.io.IOException, exceptionMessage=No space left on device, interval=2015-02-27T00:00:00.000Z/2015-02-28T00:00:00.000Z}
java.io.IOException: No space left on device
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:315)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
        at com.google.common.io.CountingOutputStream.write(CountingOutputStream.java:53)
        at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
        at io.druid.segment.data.VSizeIndexedWriter.write(VSizeIndexedWriter.java:77)
        at io.druid.segment.IndexMerger.makeIndexFiles(IndexMerger.java:652)
        at io.druid.segment.IndexMerger.merge(IndexMerger.java:307)
        at io.druid.segment.IndexMerger.mergeQueryableIndex(IndexMerger.java:169)
        at io.druid.segment.IndexMerger.mergeQueryableIndex(IndexMerger.java:162)
        at io.druid.segment.realtime.plumber.RealtimePlumber$4.doRun(RealtimePlumber.java:348)
        at io.druid.common.guava.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

......

2015-02-28 02:51:59,505 INFO [task-runner-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_realtime_ssp-auction_2015-02-27T00:00:00.000Z_1_0_dmpongod",
  "status" : "SUCCESS",
  "duration" : 53328147
}


A follow-up question: which disk ran out of space? Is it the one from the log file (outDir[/data/druid/baseTask......)?
We ask because we are not sure that disk could actually be full:

Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1       7.8G  3.8G  3.6G  52% /
tmpfs           3.6G     0  3.6G   0% /dev/shm
/dev/vdb1       727G   56G  671G   8% /data
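One way to answer this programmatically is to ask the JVM which filesystem a given directory resides on and how much space it has left. A minimal sketch (the class name and the default path argument are ours, not part of Druid; pass the task's actual outDir on the command line):

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class DiskCheck {
    public static void main(String[] args) throws IOException {
        // Hypothetical default; substitute e.g. /data/druid/baseTaskDir.
        Path dir = Paths.get(args.length > 0 ? args[0] : "/");
        // Resolves the mount point / filesystem backing this path.
        FileStore store = Files.getFileStore(dir);
        System.out.printf("%s is on [%s], usable: %,d bytes of %,d%n",
                dir, store, store.getUsableSpace(), store.getTotalSpace());
    }
}
```

Running it against the directory from the log line tells you which of the mounts in the `df` output above is actually being written to, without guessing from the path prefix.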

Thanks

index_realtime_ssp-auction_2015-02-27T00:00:00.000Z_1_0_dmpongod.zip

zdenek tison

Mar 2, 2015, 5:26:36 AM
to druid-de...@googlegroups.com
To be precise, we have also set:

druid.indexer.task.baseTaskDir=/data/druid/baseTaskDir
druid.indexer.task.baseDir=/data/druid/baseDir
druid.indexer.task.hadoopWorkingPath=/data/druid/hadoopWorkingPath
druid.fork.property.druid.indexer.task.baseTaskDir=/data/druid/baseTaskDir
druid.fork.property.druid.indexer.task.baseDir=/data/druid/baseDir
druid.fork.property.druid.indexer.task.hadoopWorkingPath=/data/druid/hadoopWorkingPath 
 

Nishant Bangarwa

Mar 2, 2015, 11:13:50 AM
to druid-de...@googlegroups.com
Hi Zdenek,
It looks like an issue with the way exceptions are handled while shutting down the plumber.
I feel the correct behaviour when an IOException occurs while persisting the segment should, in all cases, be to wait and retry until the handoff succeeds, to prevent any data loss.
Can you create a GitHub issue for this?
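The retry behaviour suggested here could look roughly like the sketch below. This is not Druid's actual plumber API; `PersistAction`, `persistWithRetry`, and the backoff policy are hypothetical names standing in for the real persist-and-merge step:

```java
import java.io.IOException;

public class PersistRetry {
    // Hypothetical stand-in for the plumber's persist-and-merge step.
    interface PersistAction { void run() throws IOException; }

    // Retry on IOException instead of swallowing it; rethrow after
    // maxAttempts so the task can finish FAILED rather than a false SUCCESS.
    static void persistWithRetry(PersistAction action, int maxAttempts)
            throws IOException, InterruptedException {
        for (int attempt = 1; ; attempt++) {
            try {
                action.run();
                return; // persisted successfully
            } catch (IOException e) {
                if (attempt >= maxAttempts) {
                    throw e; // exhausted retries: propagate the failure
                }
                Thread.sleep(1000L * attempt); // simple linear backoff
            }
        }
    }
}
```

The key point is that the exception must propagate out of the persist path when retries are exhausted, so the task status reflects the failure instead of reporting SUCCESS with missing segments.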

The location that is full is /data/druid/baseTaskDir.

--
You received this message because you are subscribed to the Google Groups "Druid Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-developm...@googlegroups.com.
To post to this group, send email to druid-de...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-development/cec3c9cb-33f2-4e06-b91a-fee505887d3e%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.




zdenek tison

Mar 3, 2015, 8:53:02 AM
to druid-de...@googlegroups.com