Realtime not performing incremental persist (throws java.io.IOException: No such file or directory)

Maxime Nay

Apr 8, 2014, 1:44:39 PM
to druid-de...@googlegroups.com
Hi,

Our realtime nodes seem unable to perform incremental persists.
We are getting the following exception (one per dataSource per incremental persist attempt):

2014-04-08 10:27:46,832 ERROR [rtb_auctions-incremental-persist] io.druid.segment.realtime.plumber.RealtimePlumber - dataSource[rtb_auctions] -- incremental persist failed: {class=io.druid.segment.realtime.plumber.RealtimePlumber, interval=2014-04-08T10:00:00.000-07:00/2014-04-08T11:00:00.000-07:00, count=2}
2014-04-08 10:27:46,832 INFO [rtb_auctions-incremental-persist] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"alerts","timestamp":"2014-04-08T10:27:46.832-07:00","service":"druid/prod/realtime","host":"xxx","severity":"component-failure","description":"dataSource[rtb_auctions] -- incremental persist failed","data":{"class":"io.druid.segment.realtime.plumber.RealtimePlumber","interval":"2014-04-08T10:00:00.000-07:00/2014-04-08T11:00:00.000-07:00","count":2}}]
Exception in thread "plumber_persist_2" java.lang.RuntimeException: java.io.IOException: No such file or directory
        at com.google.common.base.Throwables.propagate(Throwables.java:160)
        at io.druid.segment.realtime.plumber.RealtimePlumber.persistHydrant(RealtimePlumber.java:695)
        at io.druid.segment.realtime.plumber.RealtimePlumber$3.doRun(RealtimePlumber.java:295)
        at io.druid.common.guava.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.IOException: No such file or directory
        at java.io.UnixFileSystem.createFileExclusively(Native Method)
        at java.io.File.createNewFile(File.java:1006)
        at java.io.File.createTempFile(File.java:1989)
        at java.io.File.createTempFile(File.java:2040)
        at io.druid.segment.data.TmpFileIOPeon.makeOutputStream(TmpFileIOPeon.java:44)
        at io.druid.segment.data.GenericIndexedWriter.open(GenericIndexedWriter.java:67)
        at io.druid.segment.IndexMerger.makeIndexFiles(IndexMerger.java:466)
        at io.druid.segment.IndexMerger.merge(IndexMerger.java:306)
        at io.druid.segment.IndexMerger.persist(IndexMerger.java:149)
        at io.druid.segment.IndexMerger.persist(IndexMerger.java:119)
        at io.druid.segment.IndexMerger.persist(IndexMerger.java:104)
        at io.druid.segment.realtime.plumber.RealtimePlumber.persistHydrant(RealtimePlumber.java:675)
        ... 5 more


We are using Druid 0.6.73.

Here is how I am starting the realtime node:
java -server -Xmx3g -Xms2g -XX:+UseConcMarkSweepGC -Duser.timezone=PST -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/mnt/tmp -Dcom.sun.management.jmxremote.port=17071 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -classpath lib/*:config/realtime io.druid.cli.Main server realtime

Here is my runtime.properties:
druid.host=xxx
druid.port=8080
druid.service=druid/prod/realtime

druid.zk.service.host=xxx
druid.zk.paths.base=/druid/prod

druid.extensions.coordinates=["io.druid.extensions:druid-s3-extensions:0.6.73","io.druid.extensions:druid-kafka-eight:0.6.73"]

druid.s3.secretKey=xxx
druid.s3.accessKey=xxx
druid.storage.type=s3
druid.storage.bucket=xxx
druid.storage.baseKey=xxx

druid.db.connector.connectURI=jdbc:mysql://xxx:3306/Druid
druid.db.connector.user=xxx
druid.db.connector.password=xxx
druid.db.connector.useValidationQuery=true
druid.db.tables.base=prod

druid.publish.type=db

druid.realtime.specFile=config/schemas.json

druid.server.maxSize=10000000000
druid.server.http.numThreads=30

# Change these to make Druid faster
druid.processing.buffer.sizeBytes=100000000
druid.processing.numThreads=2


Here is the definition of one of our dataSources in the specFile:
{
  "schema": {
    "dataSource": "rtb_auctions",
    "aggregators" : [{
       "type" : "count",
       "name" : "count"
      }, {
       "type" : "longSum",
       "name" : "f",
       "fieldName" : "f"
      }, {
       "type" : "longSum",
       "name" : "a",
       "fieldName" : "a"
      }, {
       "type" : "longSum",
       "name" : "b",
       "fieldName" : "b"
      }, {
       "type" : "doubleSum",
       "name" : "tbp",
       "fieldName" : "tbp"
      }, {
       "type" : "doubleSum",
       "name" : "wp",
       "fieldName" : "wp"
      }, {
       "type" : "doubleSum",
       "name" : "sp",
       "fieldName" : "sp"
      }],
     "indexGranularity": "minute"
  },
  "config": {
    "maxRowsInMemory": 500000,
    "intermediatePersistPeriod": "PT2m"
  },
  "firehose": {
   "type": "kafka-0.8",
    "consumerProps": {
      "zookeeper.connect": "xxx,xxx,xxx",
      "zookeeper.connectiontimeout.ms": "15000",
      "zookeeper.sessiontimeout.ms": "15000",
      "zookeeper.synctime.ms": "5000",
      "group.id": "RTB_AUCTIONS_DRUID",
      "fetch.size": "1048586",
      "autooffset.reset": "largest",
      "autocommit.enable": "false"
    },
    "feed": "RTB_AUCTIONS",
    "parser": {
        "timestampSpec": {
          "column": "t",
          "format": "millis"
        },
        "data": {
          "format": "json",
          "dimensions" : ["cc","tid"]
        }
    }
  },
  "plumber": {
      "type": "realtime",
      "windowPeriod": "PT10m",
      "segmentGranularity": "hour",
      "basePersistDirectory": "\/tmp\/realtime\/basePersist"
  }
}


Any help/advice (even if not related to our issue) would be greatly appreciated!
Thanks,
Maxime

Kiran Patchigolla

Apr 8, 2014, 1:49:42 PM
to druid-de...@googlegroups.com
Maxime,
Create the folder /mnt/tmp and restart the realtime node; that will fix your issue. It's unable to create temp files because that folder does not exist, yet it is set as the JVM temp directory by -Djava.io.tmpdir=/mnt/tmp.

Kiran
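
For anyone hitting the same trace: the failing frame is File.createTempFile, which writes into the directory named by java.io.tmpdir and throws an IOException when that directory is missing. Below is a minimal sketch of a pre-flight check, assuming you want to verify (and create) the temp directory before starting the node. The class TmpDirCheck and its helper methods are hypothetical names for illustration and are not part of Druid.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Druid: checks that temp files can be created.
class TmpDirCheck {

    // Creates the directory (and any missing parents); a no-op if it already exists.
    static Path ensureDir(String dir) throws IOException {
        return Files.createDirectories(Paths.get(dir));
    }

    // Attempts the same kind of call that fails in the stack trace above.
    // Returns false on any IOException (missing directory, no write permission, ...).
    static boolean canCreateTempFile(File dir) {
        try {
            File probe = File.createTempFile("probe", ".tmp", dir);
            probe.delete();
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Reads the same property that -Djava.io.tmpdir=/mnt/tmp sets on the node.
        String tmpDir = System.getProperty("java.io.tmpdir");
        File dir = new File(tmpDir);
        if (!canCreateTempFile(dir)) {
            System.out.println("Cannot create temp files in " + tmpDir + ", trying to create it");
            ensureDir(tmpDir);
        }
        System.out.println("Temp dir usable: " + canCreateTempFile(dir));
    }
}
```

Run it with the same -Djava.io.tmpdir flag as the realtime node. Running mkdir -p /mnt/tmp before starting the node achieves the same thing.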


Maxime Nay

Apr 8, 2014, 2:06:06 PM
to druid-de...@googlegroups.com
Aha, indeed, this was the issue...
Thanks!