WARN [task-runner-0-priority-0] io.druid.indexing.kafka.KafkaIndexTask - Skipped to offset[16,521,675] after offset[16,521,674] in partition[0].
I'm using Kafka Streams in transactional mode to transform data before ingesting it into Druid.
I have set "skipOffsetGaps": true in my ioConfig, so the ingestion is working correctly. But it seems that Druid doesn't understand that Kafka uses up offsets for commits when using transactions. It certainly doesn't seem like something that should be logged at WARN level, since Kafka is expected to skip offsets with transactions.
I am worried that Druid may have other issues with Kafka Transactions. Is ingestion from topics that have been produced with Kafka transactions supported by Druid or should I expect other issues when doing so?
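To illustrate the offset gaps described above, here is a minimal sketch of a toy partition log in which each transaction ends with a commit marker that occupies an offset but is never handed to the consuming application. The names `LogEntry`, `append_transaction` and `visible_offsets` are hypothetical and exist only for this illustration; this is a simplified model, not Kafka's actual storage format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    offset: int
    is_control: bool  # True for a transaction marker (hypothetical model)
    value: str = ""


def append_transaction(log, values):
    """Append the data records, then one commit marker; each takes an offset."""
    next_offset = len(log)  # the toy log starts at offset 0 and is never trimmed
    for value in values:
        log.append(LogEntry(next_offset, False, value))
        next_offset += 1
    log.append(LogEntry(next_offset, True))


def visible_offsets(log):
    """Offsets the application sees once control markers are filtered out."""
    return [entry.offset for entry in log if not entry.is_control]


log = []
append_transaction(log, ["a", "b"])  # data at offsets 0 and 1, marker at 2
append_transaction(log, ["c"])       # data at offset 3, marker at 4
print(visible_offsets(log))          # [0, 1, 3]
```

In this model the consumer sees offset 3 directly after offset 1, which is the kind of gap a strict "every offset must be consecutive" check would flag.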
--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+unsubscribe@googlegroups.com.
To post to this group, send email to druid...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/f6e9c41d-cd4a-4e82-901c-2ab5f5d6c117%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Hi Erwin,
The 'skipped offset' check in Druid's Kafka indexing consumer is a sanity check meant to make sure we process every message in order. It was originally put there just in case there were bugs in the Kafka broker, consumer, etc., that caused them to skip messages. This doesn't hold for compacted topics, of course, but that's OK since it's not expected that people will want to read from compacted topics into Druid.
I'm not familiar with how Kafka transactions work, but it sounds like they break this assumption too. That's more interesting, since I do expect people to want to read from these topics into Druid. Do you have some pointers to how this works, especially on the consumer side? It sounds like we should change the Druid code a bit.
Gian
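A sanity check of the kind described above could be sketched roughly as follows. This is an assumption-laden illustration, not Druid's actual code: `check_offsets` and its `skip_offset_gaps` parameter are hypothetical names, with the parameter loosely mirroring the "skipOffsetGaps" ioConfig setting mentioned in the thread.

```python
def check_offsets(offsets, start_offset, skip_offset_gaps):
    """Walk the received offsets and report every non-consecutive step.

    Returns a list of warning strings when gaps are tolerated; raises
    ValueError on the first gap when they are not.
    """
    warnings = []
    expected = start_offset
    for offset in offsets:
        if offset != expected:
            message = f"gap: expected offset {expected}, got {offset}"
            if not skip_offset_gaps:
                raise ValueError(message)
            warnings.append(message)
        expected = offset + 1  # the next record should directly follow this one
    return warnings


# Offsets as an application might receive them when offset 2 was taken
# by something other than a data record.
print(check_offsets([0, 1, 3], 0, skip_offset_gaps=True))
# ['gap: expected offset 2, got 3']
```

With gaps tolerated, every transaction boundary produces one warning, which would explain a large volume of log lines on a topic written transactionally.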
On Tue, Feb 13, 2018 at 10:38 PM, Erwin Bolwidt <ebol...@gmail.com> wrote:
I'm running Druid 0.11 and I'm getting loads of warnings in the Kafka indexer peon task log for my datasource of the form:
WARN [task-runner-0-priority-0] io.druid.indexing.kafka.KafkaIndexTask - Skipped to offset[16,521,675] after offset[16,521,674] in partition[0].
I'm using Kafka Streams in transactional mode to transform data before ingesting it into Druid.
I have set "skipOffsetGaps": true in my ioConfig, so the ingestion is working correctly. But it seems that Druid doesn't understand that Kafka uses up offsets for commits when using transactions. It certainly doesn't seem like something that should be logged at WARN level, since Kafka is expected to skip offsets with transactions.
I am worried that Druid may have other issues with Kafka Transactions. Is ingestion from topics that have been produced with Kafka transactions supported by Druid or should I expect other issues when doing so?