I have set up a Kafka ingestion supervisor. At first it works fine. However, usually after a day or two, at no predictable point, the task log starts showing the error below. The ingestion task is still reported as successful, but no data is actually ingested.
...
2021-07-29T23:50:29,887 INFO [task-runner-0-priority-0] org.apache.kafka.clients.consumer.KafkaConsumer - [Consumer clientId=consumer-kafka-supervisor-feommldm-1, groupId=kafka-supervisor-feommldm] Subscribed to partition(s): final-occupancy-0
2021-07-29T23:50:29,892 INFO [task-runner-0-priority-0] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Seeking partition[0] to[253221925924319233].
2021-07-29T23:50:29,892 INFO [task-runner-0-priority-0] org.apache.kafka.clients.consumer.KafkaConsumer - [Consumer clientId=consumer-kafka-supervisor-feommldm-1, groupId=kafka-supervisor-feommldm] Seeking to offset 253221925924319233 for partition final-occupancy-0
2021-07-29T23:50:30,977 INFO [main] org.eclipse.jetty.server.handler.ContextHandler - Started o.e.j.s.ServletContextHandler@59b98ad1{/,null,AVAILABLE}
2021-07-29T23:50:30,998 INFO [main] org.eclipse.jetty.server.AbstractConnector - Started ServerConnector@7a3a49e5{HTTP/1.1, (http/1.1)}{0.0.0.0:8100}
2021-07-29T23:50:30,998 INFO [main] org.eclipse.jetty.server.Server - Started @9187ms
2021-07-29T23:50:30,999 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle - Starting lifecycle [module] stage [ANNOUNCEMENTS]
2021-07-29T23:50:31,030 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle - Successfully started lifecycle [module]
2021-07-29T23:50:31,049 INFO [task-runner-0-priority-0] org.apache.kafka.clients.Metadata - [Consumer clientId=consumer-kafka-supervisor-feommldm-1, groupId=kafka-supervisor-feommldm] Cluster ID: pulsar-cluster-1
2021-07-29T23:51:01,112 INFO [task-runner-0-priority-0] org.apache.kafka.clients.FetchSessionHandler - [Consumer clientId=consumer-kafka-supervisor-feommldm-1, groupId=kafka-supervisor-feommldm] Error sending fetch request (sessionId=INVALID, epoch=INITIAL) to node 2042978450:
org.apache.kafka.common.errors.DisconnectException: null
2021-07-29T23:51:31,242 INFO [task-runner-0-priority-0] org.apache.kafka.clients.FetchSessionHandler - [Consumer clientId=consumer-kafka-supervisor-feommldm-1, groupId=kafka-supervisor-feommldm] Error sending fetch request (sessionId=INVALID, epoch=INITIAL) to node 2042978450:
org.apache.kafka.common.errors.DisconnectException: null
2021-07-29T23:52:01,426 INFO [task-runner-0-priority-0] org.apache.kafka.clients.FetchSessionHandler - [Consumer clientId=consumer-kafka-supervisor-feommldm-1, groupId=kafka-supervisor-feommldm] Error sending fetch request (sessionId=INVALID, epoch=INITIAL) to node 2042978450:
org.apache.kafka.common.errors.DisconnectException: null
...
I understand that this may not be a Druid issue per se, but I'd like some pointers on how to find the root cause. Right now the log is not very informative. Specifically, I have no idea why sessionId=INVALID in the first place. How can I gather more information?
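One detail that can be pulled out of the log without any extra tooling is the spacing of the failures. The sketch below is only an illustration (the function name `fetch_error_gaps` and the abbreviated sample lines are mine, with the `[Consumer ...]` prefix shortened to `[...]`); it extracts the timestamps of the "Error sending fetch request" lines and prints the gaps between them. In the excerpt above the gaps come out at roughly 30 seconds, which happens to match the Kafka consumer's default `request.timeout.ms` of 30000 ms. That might mean each fetch is timing out rather than being rejected, but that is a guess, not a diagnosis.

```python
import re
from datetime import datetime

# Matches the timestamp at the start of a log line that reports a failed fetch.
FETCH_ERROR = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3}) .*Error sending fetch request"
)


def fetch_error_gaps(log_text):
    """Return the seconds between consecutive 'Error sending fetch request' lines."""
    stamps = []
    for line in log_text.splitlines():
        m = FETCH_ERROR.match(line)
        if m:
            # %f accepts the 3-digit millisecond field used in the task log.
            stamps.append(datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S,%f"))
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


# Abbreviated copies of the three error lines from the log above.
sample = """\
2021-07-29T23:51:01,112 INFO [task-runner-0-priority-0] org.apache.kafka.clients.FetchSessionHandler - [...] Error sending fetch request (sessionId=INVALID, epoch=INITIAL) to node
2021-07-29T23:51:31,242 INFO [task-runner-0-priority-0] org.apache.kafka.clients.FetchSessionHandler - [...] Error sending fetch request (sessionId=INVALID, epoch=INITIAL) to node
2021-07-29T23:52:01,426 INFO [task-runner-0-priority-0] org.apache.kafka.clients.FetchSessionHandler - [...] Error sending fetch request (sessionId=INVALID, epoch=INITIAL) to node
"""

print(fetch_error_gaps(sample))
```

Running the same function over the full task log would show whether the spacing stays constant for the whole lifetime of the task.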
Many thanks.