Kafka indexing service: Could not allocate segment for row with timestamp


ago...@redborder.com

Dec 1, 2016, 4:20:11 AM
to Druid User
Hi all,

Currently, I'm testing the Kafka indexing task on a Kubernetes Druid cluster. The cluster works fine, and the indexing task is running on my MiddleManager:

[screenshot of the running task omitted]
I send Kafka messages from another node, but only the first message ever makes it in.
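This is roughly how I produce a test message (a sketch; the broker and topic match my ioConfig below, and the payload values are illustrative):

echo '{"timestamp": 1480582680, "dim1": "val2", "value": 200.0}' | \
  kafka-console-producer.sh --broker-list kafka:9092 --topic druid-testing

After sending one message, I can query it: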

 {
  "queryType": "topN",
  "dataSource": "test-data-4",
  "granularity": "all",
  "dimension": "dim1",
  "threshold": 1000,
  "metric": "value",
  "aggregations": [
    {
      "type": "doubleSum",
      "name": "value",
      "fieldName": "value_sum"
    }
  ],
  "intervals": [
    "2016-12-01T08:30:00/2016-12-01T12:00:00"
  ]
}


[ {
  "timestamp" : "2016-12-01T08:58:00.000Z",
  "result" : [ {
    "value" : 200.0,
    "dim1" : "val2"
  } ]
} ]

But if I send another message, the task throws this exception and dies:

2016-11-30T12:02:21,156 INFO [test-data-1-incremental-persist] io.druid.segment.realtime.appenderator.AppenderatorImpl - Committing metadata[FiniteAppenderatorDriverMetadata{activeSegments={index_kafka_test-data-1_b1d9f9f90e48493_0=[test-data-1_2016-11-30T11:00:00.000Z_2016-11-30T12:00:00.000Z_2016-11-30T11:31:00.880Z]}, lastSegmentIds={index_kafka_test-data-1_b1d9f9f90e48493_0=test-data-1_2016-11-30T11:00:00.000Z_2016-11-30T12:00:00.000Z_2016-11-30T11:31:00.880Z}, callerMetadata={nextPartitions=KafkaPartitions{topic='druid-testing', partitionOffsetMap={0=2, 1=3, 2=2, 3=3}}}}] for sinks[test-data-1_2016-11-30T11:00:00.000Z_2016-11-30T12:00:00.000Z_2016-11-30T11:31:00.880Z:1].
2016-11-30T12:02:21,163 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.FiniteAppenderatorDriver - Persisted pending data in 157ms.
2016-11-30T12:02:21,167 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Shutting down...
2016-11-30T12:02:21,179 INFO [appenderator_persist_0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[test-data-1_2016-11-30T11:00:00.000Z_2016-11-30T12:00:00.000Z_2016-11-30T11:31:00.880Z] at path[/druid/segments/10.0.4.20:7081/10.0.4.20:7081_indexer-executor__default_tier_2016-11-30T12:02:20.942Z_ef8717f4c2d04819905b17ab9378ca140]
2016-11-30T12:02:21,180 INFO [appenderator_persist_0] io.druid.curator.announcement.Announcer - unannouncing [/druid/segments/10.0.4.20:7081/10.0.4.20:7081_indexer-executor__default_tier_2016-11-30T12:02:20.942Z_ef8717f4c2d04819905b17ab9378ca140]
2016-11-30T12:02:21,199 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[test-data-1_2016-11-30T11:00:00.000Z_2016-11-30T12:00:00.000Z_2016-11-30T11:31:00.880Z].
2016-11-30T12:02:21,207 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[KafkaIndexTask{id=index_kafka_test-data-1_b1d9f9f90e48493_njmjinao, type=index_kafka, dataSource=test-data-1}]
com.metamx.common.ISE: Could not allocate segment for row with timestamp[2016-11-30T11:40:36.000Z]
        at io.druid.indexing.kafka.KafkaIndexTask.run(KafkaIndexTask.java:427) ~[?:?]
        at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.9.1.1.jar:0.9.1.1]
        at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.9.1.1.jar:0.9.1.1]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_111]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_111]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_111]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111]
2016-11-30T12:02:21,231 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_kafka_test-data-1_b1d9f9f90e48493_njmjinao] status changed to [FAILED].
2016-11-30T12:02:21,233 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_kafka_test-data-1_b1d9f9f90e48493_njmjinao",
  "status" : "FAILED",
  "duration" : 1093
}
2016-11-30T12:02:21,238 INFO [main] com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.server.coordination.AbstractDataSegmentAnnouncer.stop()] on object[io.druid.server.coordination.BatchDataSegmentAnnouncer@4538856f].

I ran this test several times and the result is the same every time.

This is my Kafka supervisor spec:

{
  "type": "kafka",
  "dataSchema": {
    "dataSource": "test-data-4",
    "parser": {
      "type": "string",
      "parseSpec": {
        "format": "json",
        "timestampSpec": {
          "column": "timestamp",
          "format": "ruby"
        },
        "dimensionsSpec": {
          "dimensions": ["dim1"]
        }
      }
    },
    "metricsSpec": [
      {
        "name": "count",
        "type": "count"
      },
      {
        "name": "value_sum",
        "fieldName": "value",
        "type": "doubleSum"
      },
      {
        "name": "value_min",
        "fieldName": "value",
        "type": "doubleMin"
      },
      {
        "name": "value_max",
        "fieldName": "value",
        "type": "doubleMax"
      }
    ],
    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "HOUR",
      "queryGranularity": "minute"
    }
  },
  "tuningConfig": {
    "type": "kafka",
    "maxRowsPerSegment": 5000000
  },
  "ioConfig": {
    "topic": "druid-testing",
    "consumerProperties": {
      "bootstrap.servers": "kafka:9092"
    },
    "taskCount": 1,
    "replicas": 1,
    "taskDuration": "PT1H"
  }
}
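For clarity: the timestampSpec format "ruby" means seconds since the epoch, optionally with a fractional part. For example, a timestamp value of 1480506036 is the instant 2016-11-30T11:40:36Z; with segmentGranularity HOUR it falls in the 2016-11-30T11:00Z/12:00Z bucket, the same interval the failed task above was writing to.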

I have read in other threads that this exception may be thrown when you have Hadoop-generated segments, which don't support appending data ... but this is an empty cluster, without Hadoop and without data.

Any idea about the problem?

Regards,
Andrés

Slim Bouguerra

Dec 1, 2016, 1:06:39 PM
to druid...@googlegroups.com
My guess is that the time format depends on the timezone, so if it is not UTC some serialization/deserialization mismatch will occur.
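If I recall correctly, Druid builds segment identifiers from ISO-8601 interval strings rendered in the JVM's default timezone, so services with different default zones can disagree on the identifier for the same instant. A minimal Joda-Time sketch of the mismatch (an illustration only, not the actual Druid allocation code):

import org.joda.time.DateTime;
import org.joda.time.DateTimeZone;

public class TimezoneDemo {
    public static void main(String[] args) {
        long millis = 1480506036000L; // 2016-11-30T11:40:36Z, the failing row
        // Service running with -Duser.timezone=UTC:
        DateTime utc = new DateTime(millis, DateTimeZone.UTC)
                .hourOfDay().roundFloorCopy();
        // Service left on a local zone, e.g. Europe/Madrid (UTC+1):
        DateTime local = new DateTime(millis, DateTimeZone.forID("Europe/Madrid"))
                .hourOfDay().roundFloorCopy();
        System.out.println(utc);   // 2016-11-30T11:00:00.000Z
        System.out.println(local); // 2016-11-30T12:00:00.000+01:00
        // Same instant, but different string forms, so segment identifiers
        // built from them on differently-configured nodes will not line up.
    }
}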
-- 

B-Slim
_______/\/\/\_______/\/\/\_______/\/\/\_______/\/\/\_______/\/\/\_______

On Dec 1, 2016, at 2:47 AM, Andrés Gómez <ago...@redborder.com> wrote:

Apparently, the issue was solved when I added -Duser.timezone=UTC -Dfile.encoding=UTF-8 to the JVM args on all Druid services.

Can someone tell me why this solves the problem?

Regards,

Andrés Gómez

Big Data Development Manager

agomez@redborder.com

David Lim

Dec 1, 2016, 1:27:13 PM
to Druid User
Can you post your overlord logs as well? The actual exception thrown while trying to allocate the segment should be found there.

Andrés Gómez

Dec 2, 2016, 4:07:11 AM
to druid...@googlegroups.com
Hi!!

Apparently the problem was solved when I added -Duser.timezone=UTC -Dfile.encoding=UTF-8 to the JVM args, and also added the druid-s3-extensions extension, which I had forgotten! hahaha
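For reference, this is where those settings live (paths assume the stock Druid 0.9.x distribution layout; adjust for your Kubernetes images):

# conf/druid/<service>/jvm.config (same flags on every Druid service)
-Duser.timezone=UTC
-Dfile.encoding=UTF-8

# conf/druid/_common/common.runtime.properties
druid.extensions.loadList=["druid-kafka-indexing-service", "druid-s3-extensions"]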

I suppose this makes sense, doesn't it?

Regards,

Andrés Gómez

Big Data Development Manager

agomez@redborder.com

+34 606224922 | +34 955 601 160

