Kafka indexing service: Could not allocate segment for row with timestamp


ago...@redborder.com

Dec 1, 2016, 4:20:11 AM
to Druid User
Hi all,

Currently, I'm testing the Kafka indexing task on a Kubernetes Druid cluster. The cluster works fine, and the indexing task is running on my MiddleManager:

[screenshot of the running task omitted]
I send Kafka messages from another node, but only the first message ever makes it in.
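This is roughly how I produce a test message (a sketch; the broker and topic match my ioConfig below, and the payload values are illustrative):

echo '{"timestamp": 1480582680, "dim1": "val2", "value": 200.0}' | \
  kafka-console-producer.sh --broker-list kafka:9092 --topic druid-testing

After sending one message, I can query it: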

 {
  "queryType": "topN",
  "dataSource": "test-data-4",
  "granularity": "all",
  "dimension": "dim1",
  "threshold": 1000,
  "metric": "value",
  "aggregations": [
    {
      "type": "doubleSum",
      "name": "value",
      "fieldName": "value_sum"
    }
  ],
  "intervals": [
    "2016-12-01T08:30:00/2016-12-01T12:00:00"
  ]
}


[ {
  "timestamp" : "2016-12-01T08:58:00.000Z",
  "result" : [ {
    "value" : 200.0,
    "dim1" : "val2"
  } ]
} ]

But if I send another message, the task throws this exception and dies:

2016-11-30T12:02:21,156 INFO [test-data-1-incremental-persist] io.druid.segment.realtime.appenderator.AppenderatorImpl - Committing metadata[FiniteAppenderatorDriverMetadata{activeSegments={index_kafka_test-data-1_b1d9f9f90e48493_0=[test-data-1_2016-11-30T11:00:00.000Z_2016-11-30T12:00:00.000Z_2016-11-30T11:31:00.880Z]}, lastSegmentIds={index_kafka_test-data-1_b1d9f9f90e48493_0=test-data-1_2016-11-30T11:00:00.000Z_2016-11-30T12:00:00.000Z_2016-11-30T11:31:00.880Z}, callerMetadata={nextPartitions=KafkaPartitions{topic='druid-testing', partitionOffsetMap={0=2, 1=3, 2=2, 3=3}}}}] for sinks[test-data-1_2016-11-30T11:00:00.000Z_2016-11-30T12:00:00.000Z_2016-11-30T11:31:00.880Z:1].
2016-11-30T12:02:21,163 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.FiniteAppenderatorDriver - Persisted pending data in 157ms.
2016-11-30T12:02:21,167 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Shutting down...
2016-11-30T12:02:21,179 INFO [appenderator_persist_0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[test-data-1_2016-11-30T11:00:00.000Z_2016-11-30T12:00:00.000Z_2016-11-30T11:31:00.880Z] at path[/druid/segments/10.0.4.20:7081/10.0.4.20:7081_indexer-executor__default_tier_2016-11-30T12:02:20.942Z_ef8717f4c2d04819905b17ab9378ca140]
2016-11-30T12:02:21,180 INFO [appenderator_persist_0] io.druid.curator.announcement.Announcer - unannouncing [/druid/segments/10.0.4.20:7081/10.0.4.20:7081_indexer-executor__default_tier_2016-11-30T12:02:20.942Z_ef8717f4c2d04819905b17ab9378ca140]
2016-11-30T12:02:21,199 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[test-data-1_2016-11-30T11:00:00.000Z_2016-11-30T12:00:00.000Z_2016-11-30T11:31:00.880Z].
2016-11-30T12:02:21,207 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[KafkaIndexTask{id=index_kafka_test-data-1_b1d9f9f90e48493_njmjinao, type=index_kafka, dataSource=test-data-1}]
com.metamx.common.ISE: Could not allocate segment for row with timestamp[2016-11-30T11:40:36.000Z]
        at io.druid.indexing.kafka.KafkaIndexTask.run(KafkaIndexTask.java:427) ~[?:?]
        at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.9.1.1.jar:0.9.1.1]
        at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.9.1.1.jar:0.9.1.1]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_111]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_111]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_111]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111]
2016-11-30T12:02:21,231 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_kafka_test-data-1_b1d9f9f90e48493_njmjinao] status changed to [FAILED].
2016-11-30T12:02:21,233 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_kafka_test-data-1_b1d9f9f90e48493_njmjinao",
  "status" : "FAILED",
  "duration" : 1093
}
2016-11-30T12:02:21,238 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.server.coordination.AbstractDataSegmentAnnouncer.stop()] on object[io.druid.server.coordination.BatchDataSegmentAnnouncer@4538856f].

I ran this test several times and the result is the same every time.

This is my Kafka supervisor spec:

{
  "type": "kafka",
  "dataSchema": {
    "dataSource": "test-data-4",
    "parser": {
      "type": "string",
      "parseSpec": {
        "format": "json",
        "timestampSpec": {
          "column": "timestamp",
          "format": "ruby"
        },
        "dimensionsSpec": {
          "dimensions": ["dim1"]
        }
      }
    },
    "metricsSpec": [
      {
        "name": "count",
        "type": "count"
      },
      {
        "name": "value_sum",
        "fieldName": "value",
        "type": "doubleSum"
      },
      {
        "name": "value_min",
        "fieldName": "value",
        "type": "doubleMin"
      },
      {
        "name": "value_max",
        "fieldName": "value",
        "type": "doubleMax"
      }
    ],
    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "HOUR",
      "queryGranularity": "minute"
    }
  },
  "tuningConfig": {
    "type": "kafka",
    "maxRowsPerSegment": 5000000
  },
  "ioConfig": {
    "topic": "druid-testing",
    "consumerProperties": {
      "bootstrap.servers": "kafka:9092"
    },
    "taskCount": 1,
    "replicas": 1,
    "taskDuration": "PT1H"
  }
}
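For clarity: the timestampSpec format "ruby" means seconds since the epoch, optionally with a fractional part. For example, a timestamp value of 1480506036 is the instant 2016-11-30T11:40:36Z; with segmentGranularity HOUR it falls in the 2016-11-30T11:00Z/12:00Z bucket, the same interval the failed task above was writing to.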

I have read in other threads that this exception may be thrown when you have Hadoop-generated segments, which don't support appending data ... but this is an empty cluster, without Hadoop and without data.

Any idea about the problem?

Regards,
Andrés

Slim Bouguerra

Dec 1, 2016, 1:06:39 PM
to druid...@googlegroups.com
My guess is that the time format depends on the timezone, so if it is not UTC some serialization/deserialization mismatch will occur.
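If I recall correctly, Druid builds segment identifiers from ISO-8601 interval strings rendered in the JVM's default timezone, so services with different default zones can disagree on the identifier for the same instant. A minimal Joda-Time sketch of the mismatch (an illustration only, not the actual Druid allocation code):

import org.joda.time.DateTime;
import org.joda.time.DateTimeZone;

public class TimezoneDemo {
    public static void main(String[] args) {
        long millis = 1480506036000L; // 2016-11-30T11:40:36Z, the failing row
        // Service running with -Duser.timezone=UTC:
        DateTime utc = new DateTime(millis, DateTimeZone.UTC)
                .hourOfDay().roundFloorCopy();
        // Service left on a local zone, e.g. Europe/Madrid (UTC+1):
        DateTime local = new DateTime(millis, DateTimeZone.forID("Europe/Madrid"))
                .hourOfDay().roundFloorCopy();
        System.out.println(utc);   // 2016-11-30T11:00:00.000Z
        System.out.println(local); // 2016-11-30T12:00:00.000+01:00
        // Same instant, but different string forms, so segment identifiers
        // built from them on differently-configured nodes will not line up.
    }
}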
-- 

B-Slim
_______/\/\/\_______/\/\/\_______/\/\/\_______/\/\/\_______/\/\/\_______

On Dec 1, 2016, at 2:47 AM, Andrés Gómez <ago...@redborder.com> wrote:

Apparently, the issue was solved when I added -Duser.timezone=UTC -Dfile.encoding=UTF-8 to the JVM args on all Druid services.

Can someone tell me why this solves the problem?

Regards,

Andrés Gómez

Big Data Development Manager

agomez@redborder.com

David Lim

Dec 1, 2016, 1:27:13 PM
to Druid User
Can you post your overlord logs as well? The actual exception thrown while trying to allocate the segment should be found there.

Andrés Gómez

Dec 2, 2016, 4:07:11 AM
to druid...@googlegroups.com
Hi!!

Apparently the problem was solved when I added -Duser.timezone=UTC -Dfile.encoding=UTF-8 to the JVM args, and also added the druid-s3-extensions extension, which I had forgotten! hahaha
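For reference, this is where those settings live (paths assume the stock Druid 0.9.x distribution layout; adjust for your Kubernetes images):

# conf/druid/<service>/jvm.config (same flags on every Druid service)
-Duser.timezone=UTC
-Dfile.encoding=UTF-8

# conf/druid/_common/common.runtime.properties
druid.extensions.loadList=["druid-kafka-indexing-service", "druid-s3-extensions"]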

I suppose this makes sense, doesn't it?

Regards,

Andrés Gómez

Big Data Development Manager

agomez@redborder.com

+34 606224922 | +34 955 601 160

