Sporadic ConnectException shuts down the whole connect process


Sagar Rao

Nov 2, 2016, 12:26:55 PM
to Confluent Platform
I had set up a 2-node distributed kafka-connect process. Everything went well and I could see a lot of data flowing into the relevant Kafka topics.

After some time, JdbcUtils.getCurrentTimeOnDB threw a ConnectException with the following stack trace:

The last packet successfully received from the server was 792 milliseconds ago.  The last packet sent successfully to the server was 286 milliseconds ago. (io.confluent.connect.jdbc.source.JdbcSourceTask:234)
[2016-11-02 12:42:06,116] ERROR Failed to get current time from DB using query select CURRENT_TIMESTAMP; on database MySQL (io.confluent.connect.jdbc.util.JdbcUtils:226)
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet successfully received from the server was 1,855 milliseconds ago.  The last packet sent successfully to the server was 557 milliseconds ago.
       at sun.reflect.GeneratedConstructorAccessor51.newInstance(Unknown Source)
       at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
       at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
       at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
       at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)
       at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3829)
       at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2449)
       at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2629)
       at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2719)
       at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2155)
       at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1379)
       at com.mysql.jdbc.StatementImpl.createResultSetUsingServerFetch(StatementImpl.java:651)
       at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1527)
       at io.confluent.connect.jdbc.util.JdbcUtils.getCurrentTimeOnDB(JdbcUtils.java:220)
       at io.confluent.connect.jdbc.source.TimestampIncrementingTableQuerier.executeQuery(TimestampIncrementingTableQuerier.java:157)
       at io.confluent.connect.jdbc.source.TableQuerier.maybeStartQuery(TableQuerier.java:78)
       at io.confluent.connect.jdbc.source.TimestampIncrementingTableQuerier.maybeStartQuery(TimestampIncrementingTableQuerier.java:57)
       at io.confluent.connect.jdbc.source.JdbcSourceTask.poll(JdbcSourceTask.java:207)
       at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:155)
       at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:140)
       at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:175)
       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketException: Broken pipe (Write failed)
       at java.net.SocketOutputStream.socketWrite0(Native Method)
       at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
       at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
       at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
       at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
       at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3810)
       ... 20 more

This was just a minor glitch in the connection, as the EC2 instances are able to connect to the MySQL Aurora instances without any issues.

But after this exception (which appears a number of times in the logs), none of the connectors' tasks are executing. Beyond this, all I see in the logs is:

[2016-11-02 16:17:41,983] ERROR Failed to run query for table TimestampIncrementingTableQuerier{name='eng_match_series', query='null', topicPrefix='ci-eng-', timestampColumn='modified', incrementingColumn='id'}: com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: No operations allowed after statement closed. (io.confluent.connect.jdbc.source.JdbcSourceTask:234)



Is this expected behaviour? I restarted the connector using the REST API, but that didn't help. How do we handle such scenarios?

Sagar.

Gwen Shapira

Nov 2, 2016, 6:29:14 PM
to confluent...@googlegroups.com
mmm... I'd expect that restarting the connector would help. Were there additional errors after the restart?

The fact that the JDBC connector doesn't recover from a minor disconnect on its own does sound concerning, though, and is worth opening an issue for so we can track and resolve it.
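
To make "recovering from a minor disconnect" concrete, here is a minimal sketch of the general pattern: discard a connection once it has thrown, and open a fresh one on the next attempt instead of reusing a closed statement. This is not the connector's actual code. `ReconnectingClient`, `StaleConnectionError`, `FlakyConnection`, and the `connect` factory are all hypothetical names used only for illustration.

```python
import time


class StaleConnectionError(Exception):
    """Hypothetical stand-in for a driver error such as a broken pipe."""


class ReconnectingClient:
    """Runs queries through a connection that is reopened after a failure."""

    def __init__(self, connect, max_attempts=3, backoff_seconds=0.0):
        self._connect = connect  # hypothetical factory returning a new connection
        self._max_attempts = max_attempts
        self._backoff_seconds = backoff_seconds
        self._conn = None

    def query(self, sql):
        last_error = None
        for attempt in range(self._max_attempts):
            try:
                if self._conn is None:
                    self._conn = self._connect()
                return self._conn.execute(sql)
            except StaleConnectionError as err:
                # Drop the dead connection so the next attempt opens a new one
                # rather than reusing a closed statement or socket.
                last_error = err
                self._conn = None
                time.sleep(self._backoff_seconds * (attempt + 1))
        raise last_error


class FlakyConnection:
    """Test double: optionally fails on its first execute call."""

    def __init__(self, fail_first):
        self._fail_next = fail_first

    def execute(self, sql):
        if self._fail_next:
            self._fail_next = False
            raise StaleConnectionError("Communications link failure")
        return "result of " + sql
```

With a factory whose first connection fails once, `ReconnectingClient(connect).query("select CURRENT_TIMESTAMP")` would open a second connection and return its result. The bounded attempt count keeps a genuinely unreachable database from looping forever.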

--
You received this message because you are subscribed to the Google Groups "Confluent Platform" group.
To unsubscribe from this group and stop receiving emails from it, send an email to confluent-platform+unsub...@googlegroups.com.
To post to this group, send email to confluent-platform@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/confluent-platform/aa71e7b9-7f29-4268-b2f8-acd62999a264%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Gwen Shapira
Product Manager | Confluent
650.450.2760 @gwenshap
Follow us: Twitter | blog

Sagar Rao

Nov 3, 2016, 6:04:33 AM
to Confluent Platform
Created an issue on JIRA:


There weren't any additional errors after the restart. This is what the logs showed when I restarted with the /restart API (using curl):

[2016-11-03 06:00:40,165] INFO 127.0.0.1 - - [03/Nov/2016:06:00:40 +0000] "POST /connectors/ci-engine-non-bbb/restart HTTP/1.1" 204 -  36 (org.apache.kafka.connect.runtime.rest.RestServer:60)
[2016-11-03 06:00:40,402] ERROR Failed to run query for table TimestampIncrementingTableQuerier{name='eng_match_team', query='null', topicPrefix='ci-eng-', timestampColumn='modified', incrementingColumn='id'}: com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: No operations allowed after statement closed. (io.confluent.connect.jdbc.source.JdbcSourceTask:234)
[2016-11-03 06:00:41,336] 

As you can see, the restart happened and then the same errors started to appear.
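
One thing that may be worth trying is restarting the individual tasks rather than only the connector. As far as I know, in Connect releases of this period the connector-level `/restart` endpoint restarts the connector instance but not its tasks, and the failing queries run inside the tasks. The sketch below uses the Connect REST endpoints `GET /connectors/{name}/status` and `POST /connectors/{name}/tasks/{id}/restart`. The base URL (default port 8083) is an assumption, the helper names are hypothetical, and whether this clears the "statement closed" error is not established in this thread.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the worker's REST interface listens on the default port 8083.
BASE_URL = "http://localhost:8083"


def connector_url(base_url, connector):
    """Build the base URL for one connector, escaping the name for a URL path."""
    return f"{base_url}/connectors/{urllib.parse.quote(connector, safe='')}"


def task_ids(status):
    """Extract task ids from the JSON body of GET /connectors/{name}/status."""
    return [task["id"] for task in status.get("tasks", [])]


def task_restart_url(base_url, connector, task_id):
    """Build the URL for POST /connectors/{name}/tasks/{id}/restart."""
    return f"{connector_url(base_url, connector)}/tasks/{task_id}/restart"


def restart_all_tasks(base_url, connector):
    """Restart every task the status endpoint lists and return their ids."""
    with urllib.request.urlopen(f"{connector_url(base_url, connector)}/status") as resp:
        status = json.load(resp)
    ids = task_ids(status)
    for task_id in ids:
        request = urllib.request.Request(
            task_restart_url(base_url, connector, task_id), data=b"", method="POST"
        )
        urllib.request.urlopen(request).close()
    return ids
```

Against a running worker this would be called as `restart_all_tasks(BASE_URL, "ci-engine-non-bbb")`. It restarts every listed task rather than only those in a failed state, because a task that keeps logging query errors may still be reported as running.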

Sagar.

Sagar Rao

Nov 3, 2016, 8:36:04 AM
to Confluent Platform
Gwen,

There's a comment on the ticket stating that it's a bug on the kafka-connect-jdbc side. I couldn't assign the ticket to you there, hence mentioning it here. It's a fairly critical bug for us.

Also, there's another PR that is critical for us, as we have data in MySQL related to it. Here's the PR:


Thanks!
Sagar.

Gwen Shapira

Nov 4, 2016, 7:39:31 PM
to confluent...@googlegroups.com
For PR-152, we are planning on adding single-message transforms to the
Connect API itself, and I think Shikhar suggested that this can be
resolved through the transforms rather than for the JDBC connector
specifically.

I'll let you and Shikhar figure it out on the GitHub PR...