Kafka Connect is waiting a long time for connectors


Raju Divakaran

Jul 8, 2016, 7:02:26 AM
to Confluent Platform
Hello,

I have been checking the Kafka Connect logs and our monitoring, and it looks like Kafka Connect isn't processing anything. It does nothing other than answer requests for the connector list:

[2016-07-08 10:53:01,472] INFO x.x.x.x - - [08/Jul/2016:10:53:01 +0000] "GET /connectors HTTP/1.1" 200 205  1 (org.apache.kafka.connect.runtime.rest.RestServer:60)
[2016-07-08 10:53:01,475] INFO x.x.x.x - - [08/Jul/2016:10:53:01 +0000] "GET /connectors HTTP/1.1" 200 205  1 (org.apache.kafka.connect.runtime.rest.RestServer:60)
[2016-07-08 10:54:01,523] INFO x.x.x.x - - [08/Jul/2016:10:54:01 +0000] "GET /connectors HTTP/1.1" 200 205  2 (org.apache.kafka.connect.runtime.rest.RestServer:60)
[2016-07-08 10:54:01,526] INFO x.x.x.x - - [08/Jul/2016:10:54:01 +0000] "GET /connectors HTTP/1.1" 200 205  1 (org.apache.kafka.connect.runtime.rest.RestServer:60)
[2016-07-08 10:54:01,529] INFO x.x.x.x - - [08/Jul/2016:10:54:01 +0000] "GET /connectors HTTP/1.1" 200 205  1 (org.apache.kafka.connect.runtime.rest.RestServer:60)
[2016-07-08 10:55:01,648] INFO x.x.x.x - - [08/Jul/2016:10:55:01 +0000] "GET /connectors HTTP/1.1" 200 205  2 (org.apache.kafka.connect.runtime.rest.RestServer:60)
[2016-07-08 10:55:01,651] INFO x.x.x.x - - [08/Jul/2016:10:55:01 +0000] "GET /connectors HTTP/1.1" 200 205  1 (org.apache.kafka.connect.runtime.rest.RestServer:60)
[2016-07-08 10:55:01,654] INFO x.x.x.x - - [08/Jul/2016:10:55:01 +0000] "GET /connectors HTTP/1.1" 200 205  1 (org.apache.kafka.connect.runtime.rest.RestServer:60)
[2016-07-08 10:56:01,588] INFO x.x.x.x - - [08/Jul/2016:10:56:01 +0000] "GET /connectors HTTP/1.1" 200 205  3 (org.apache.kafka.connect.runtime.rest.RestServer:60)
[2016-07-08 10:56:01,592] INFO x.x.x.x - - [08/Jul/2016:10:56:01 +0000] "GET /connectors HTTP/1.1" 200 205  1 (org.apache.kafka.connect.runtime.rest.RestServer:60)
[2016-07-08 10:56:01,596] INFO x.x.x.x - - [08/Jul/2016:10:56:01 +0000] "GET /connectors HTTP/1.1" 200 205  2 (org.apache.kafka.connect.runtime.rest.RestServer:60)

These lines have been in the logs for almost a day! (I have masked the IP with x.x.x.x.)

I have 13 instances of Kafka Connect running, with 5 connectors. I can see that a few of these 13 instances are working, but the others only write the traces above to their logs.

Has anyone seen the same? Any suggestions or help would be greatly appreciated.

Thanks

Ewen Cheslack-Postava

Jul 20, 2016, 12:03:28 AM
to Confluent Platform
How many connectors and tasks are there in total? You should be able to use the API to see which are assigned to which workers. If some of the workers are not logging anything besides these messages, it's possible they simply have no work to do. Alternatively, if they are only assigned connectors (i.e. no tasks) and those connectors don't do any background monitoring, they may technically be assigned work but not *really* have any to do.

-Ewen
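
For readers following along, here is a minimal sketch of the check suggested above: listing the connectors over the Connect REST API and grouping connectors and tasks by the worker they are assigned to. It assumes the workers expose `GET /connectors/{name}/status` (not every Connect version does) and that a worker is reachable at `http://localhost:8083`. The URL and the helper names (`get_json`, `fetch_statuses`, `group_by_worker`) are illustrative assumptions, not anything from this thread.

```python
import json
from collections import defaultdict
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: any one Connect worker's REST endpoint; adjust host and port.
CONNECT_URL = "http://localhost:8083"


def get_json(url):
    """Fetch a URL and decode its JSON body."""
    with urlopen(url) as resp:
        return json.load(resp)


def fetch_statuses(base_url=CONNECT_URL):
    """Return the status document of every connector known to the cluster."""
    names = get_json(f"{base_url}/connectors")
    return [
        get_json(f"{base_url}/connectors/{quote(name, safe='')}/status")
        for name in names
    ]


def group_by_worker(statuses):
    """Map worker_id -> list of (connector name, 'connector' or 'task-N', state)."""
    assignments = defaultdict(list)
    for status in statuses:
        name = status["name"]
        conn = status["connector"]
        assignments[conn["worker_id"]].append((name, "connector", conn["state"]))
        for task in status.get("tasks", []):
            assignments[task["worker_id"]].append(
                (name, f"task-{task['id']}", task["state"])
            )
    return dict(assignments)


def print_assignments(base_url=CONNECT_URL):
    """Print one block per worker; workers with no assignments do not appear."""
    for worker, items in sorted(group_by_worker(fetch_statuses(base_url)).items()):
        print(worker)
        for name, what, state in items:
            print(f"  {name} {what} {state}")
```

Calling `print_assignments()` against a live cluster prints only the workers that hold at least one connector or task, so any of the 13 instances missing from the output is idle. Tasks reported in a `FAILED` state are the ones worth checking in that worker's log.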

--
You received this message because you are subscribed to the Google Groups "Confluent Platform" group.
To unsubscribe from this group and stop receiving emails from it, send an email to confluent-platf...@googlegroups.com.
To post to this group, send email to confluent...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/confluent-platform/fe37065d-57a3-44af-8744-40111c604e52%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Thanks,
Ewen

Raju Divakaran

Jul 21, 2016, 11:04:59 AM
to Confluent Platform
Hello Ewen,

Thanks for the explanation. What I can see from the logs is that whenever I do a config refresh via the API or a service restart, all tasks defined under a certain connector get killed due to the exception below:

org.apache.kafka.connect.errors.DataException: Failed to deserialize data to Avro:
    at io.confluent.connect.avro.AvroConverter.toConnectData(AvroConverter.java:109)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:346)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:226)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:170)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:142)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:140)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:175)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.kafka.common.errors.SerializationException: Error retrieving Avro schema for id 427
Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Schema not found; error code: 40403



:( 


Ewen Cheslack-Postava

Jul 21, 2016, 11:47:53 PM
to Confluent Platform
It sounds like you have some bad data in your topic. Unfortunately Connect doesn't handle this very gracefully today, so probably what's happening is that one task sees the bad data, crashes, and then the partition with the bad data is rebalanced to another task, which also subsequently throws an exception and dies, and so on. For situations like this, we're going to need to either provide an API to allow you to reset offsets so you can get past the bad data or add some policies for how to handle messages that cannot be parsed.

-Ewen
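
The root cause in the trace earlier in the thread is `Schema not found; error code: 40403` for schema id 427. One way to confirm whether the registry the workers point at really lacks that id is to ask Schema Registry directly via `GET /schemas/ids/{id}`. The following is a minimal sketch under stated assumptions: the registry URL is a placeholder, and `schema_lookup_url` and `schema_exists` are hypothetical helper names.

```python
import json
from urllib.error import HTTPError
from urllib.request import urlopen

# Assumption: the Schema Registry the Connect workers are configured to use.
SCHEMA_REGISTRY_URL = "http://localhost:8081"


def schema_lookup_url(schema_id, base_url=SCHEMA_REGISTRY_URL):
    """Build the lookup URL for a schema id."""
    return f"{base_url.rstrip('/')}/schemas/ids/{int(schema_id)}"


def schema_exists(schema_id, base_url=SCHEMA_REGISTRY_URL):
    """Return True if the registry knows the id, False on HTTP 404."""
    try:
        with urlopen(schema_lookup_url(schema_id, base_url)) as resp:
            json.load(resp)  # body holds the schema; only presence matters here
            return True
    except HTTPError as err:
        if err.code == 404:  # the registry reports "Schema not found" as a 404
            return False
        raise  # any other status is a different problem; surface it
```

If `schema_exists(427)` returns `False`, the records carrying that id cannot be decoded by any task, which would be consistent with the crash-and-rebalance cycle described above. One possible explanation, which this thread does not confirm, is that those records were produced against a different registry.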

