what is the expected behavior when a kafka connect task fails?


Andrew Xue

Oct 17, 2016, 7:45:38 PM
to Confluent Platform
hi -- i have a rethinkdb connector running on a distributed connect cluster. it encountered an error like this:

    trace: com.rethinkdb.gen.exc.ReqlDriverError: Can't write query because response pump is not running.
        at com.rethinkdb.net.Connection.sendQuery(Connection.java:218)
        at com.rethinkdb.net.Connection.runQuery(Connection.java:269)
        at com.rethinkdb.net.Connection.runQuery(Connection.java:253)
        at com.rethinkdb.net.Connection.runQuery(Connection.java:249)
        at com.rethinkdb.net.Connection.noreplyWait(Connection.java:305)
        at com.rethinkdb.net.Connection.close(Connection.java:152)
        at com.datamountaineer.streamreactor.connect.rethink.sink.ReThinkWriter.close(ReThinkWriter.scala:125)
        at com.datamountaineer.streamreactor.connect.rethink.sink.ReThinkSinkTask$$anonfun$stop$1.apply(ReThinkSinkTask.scala:73)
        at com.datamountaineer.streamreactor.connect.rethink.sink.ReThinkSinkTask$$anonfun$stop$1.apply(ReThinkSinkTask.scala:73)
        at scala.Option.foreach(Option.scala:257)
        at com.datamountaineer.streamreactor.connect.rethink.sink.ReThinkSinkTask.stop(ReThinkSinkTask.scala:73)
        at org.apache.kafka.connect.runtime.WorkerSinkTask.close(WorkerSinkTask.java:126)
        at org.apache.kafka.connect.runtime.WorkerTask.doClose(WorkerTask.java:121)
        at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:146)
        at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:175)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

the task died and that was the end of that. i would have expected the worker to try to restart the task in distributed mode. any advice on how to further debug this issue? 

Ewen Cheslack-Postava

Oct 18, 2016, 1:42:10 AM
to Confluent Platform
The Connect framework won't just retry, because it has no indication that the exception it is catching is even retriable. Connect, like the lower-level Kafka clients, has a standardized exception, ConnectException (http://kafka.apache.org/0100/javadoc/org/apache/kafka/connect/errors/ConnectException.html), and a retriable version, RetriableException (http://kafka.apache.org/0100/javadoc/org/apache/kafka/connect/errors/RetriableException.html). If you throw a RetriableException, the framework will try to redeliver the data. If you throw any other type of error (a general ConnectException or really anything else), the framework will simply set the status of the task to failed, and you'll need to intervene to get it to restart.
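For the "intervene" part, the Connect REST API has a status endpoint and, in versions that expose it, a per-task restart endpoint. The following is a rough sketch of building those two requests with the JDK's HTTP client (Java 11+). The worker address `http://localhost:8083`, the connector name `rethink-sink`, and task id `0` are assumptions for illustration, not values from this thread.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class ConnectRestSketch {
    // GET /connectors/{name}/status reports connector and task state,
    // including the stack trace of a failed task.
    static HttpRequest status(String workerUrl, String connector) {
        return HttpRequest.newBuilder(
                        URI.create(workerUrl + "/connectors/" + connector + "/status"))
                .GET()
                .build();
    }

    // POST /connectors/{name}/tasks/{id}/restart asks the worker to restart one task.
    static HttpRequest restartTask(String workerUrl, String connector, int taskId) {
        return HttpRequest.newBuilder(
                        URI.create(workerUrl + "/connectors/" + connector
                                + "/tasks/" + taskId + "/restart"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) throws Exception {
        String worker = "http://localhost:8083"; // assumed worker address
        String connector = "rethink-sink";       // hypothetical connector name
        HttpClient client = HttpClient.newHttpClient();

        // Print the current status, then request a restart of task 0.
        HttpResponse<String> before =
                client.send(status(worker, connector), HttpResponse.BodyHandlers.ofString());
        System.out.println(before.body());
        client.send(restartTask(worker, connector, 0), HttpResponse.BodyHandlers.discarding());
    }
}
```

Running `main` needs a reachable Connect worker; the two builder methods can be used without one.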

Note that if you throw something outside the ConnectException hierarchy, we can't tell whether your connector is even in an OK state (e.g. whether it is safe to invoke *any* further methods that might write data). In contrast, with ConnectException we at least know you specifically chose to throw the error (rather than, e.g., some library doing it for you), so the framework could potentially introduce its own limited retries/backoff in the future if you correctly throw these types of exceptions. Ideally you would only throw ConnectExceptions to the framework, but we know that with unchecked RuntimeExceptions it can be difficult to be sure you've caught them all.
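To make the distinction concrete, here is a minimal sketch of a connector translating driver errors at the task boundary. So that it compiles without the Kafka jars, `ConnectException` and `RetriableException` below are local stand-ins for the classes in `org.apache.kafka.connect.errors`; `DriverError`, its `transientFailure` flag, and the `deliver` method (a heavily simplified model of what the framework does) are hypothetical and only for illustration.

```java
// Stand-ins for org.apache.kafka.connect.errors.ConnectException and
// RetriableException. In a real connector, import the Kafka classes instead.
class ConnectException extends RuntimeException {
    ConnectException(String message, Throwable cause) { super(message, cause); }
}

class RetriableException extends ConnectException {
    RetriableException(String message, Throwable cause) { super(message, cause); }
}

// Hypothetical driver error, standing in for something like ReqlDriverError.
class DriverError extends RuntimeException {
    final boolean transientFailure;
    DriverError(String message, boolean transientFailure) {
        super(message);
        this.transientFailure = transientFailure;
    }
}

class TaskFailureSketch {
    enum Outcome { OK, RETRY, FAILED }

    // Connector side: nothing outside the ConnectException hierarchy escapes.
    static void put(Runnable write) {
        try {
            write.run();
        } catch (ConnectException e) {
            throw e; // already a deliberate signal, pass it through
        } catch (DriverError e) {
            if (e.transientFailure) {
                throw new RetriableException("transient write failure", e);
            }
            throw new ConnectException("unrecoverable write failure", e);
        } catch (RuntimeException e) {
            // Catch-all for unchecked exceptions that would otherwise leak.
            throw new ConnectException("unexpected error in sink", e);
        }
    }

    // Framework side, simplified: only RetriableException leads to redelivery.
    static Outcome deliver(Runnable write) {
        try {
            put(write);
            return Outcome.OK;
        } catch (RetriableException e) {
            return Outcome.RETRY;  // the same data would be delivered again
        } catch (RuntimeException e) {
            return Outcome.FAILED; // task stays failed until someone restarts it
        }
    }

    public static void main(String[] args) {
        System.out.println(deliver(() -> {}));
        System.out.println(deliver(() -> { throw new DriverError("timeout", true); }));
        System.out.println(deliver(() -> { throw new DriverError("bad schema", false); }));
        System.out.println(deliver(() -> { throw new IllegalStateException("leaked"); }));
    }
}
```

How a connector decides that a given driver error is transient is its own call; the boolean flag here just marks where that decision would go.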

-Ewen
