HDFS Sink Connector Error creating writer for log file


Basti

Apr 18, 2017, 5:54:45 AM
to Confluent Platform
Hey guys,

I am trying to configure an HDFS Sink Connector, with every component of Kafka Connect running dockerized. Only the HDFS cluster resides on another machine.
However, I ran into the error below. I then tried to set up HDFS from scratch, but got the same error. I searched the forum but didn't find anyone with this exact problem.
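For reference, my connector config looks roughly like this (the `hdfs.url` matches the error below and the topic name comes from the WAL path; the other values are placeholders for testing, so check the exact property names against the HDFS connector docs):

```json
{
  "name": "hdfs-sink",
  "config": {
    "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
    "topics": "testApp",
    "hdfs.url": "hdfs://10.42.0.86:9000",
    "flush.size": "3",
    "logs.dir": "logs",
    "topics.dir": "topics"
  }
}
```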

Thanks a lot.

ERROR Recovery failed at state RECOVERY_PARTITION_PAUSED (io.confluent.connect.hdfs.TopicPartitionWriter)
org.apache.kafka.connect.errors.ConnectException: Error creating writer for log file hdfs://10.42.0.86:9000/logs/testApp/0/log
        at io.confluent.connect.hdfs.wal.FSWAL.acquireLease(FSWAL.java:91)
        at io.confluent.connect.hdfs.wal.FSWAL.apply(FSWAL.java:105)
        at io.confluent.connect.hdfs.TopicPartitionWriter.applyWAL(TopicPartitionWriter.java:484)
        at io.confluent.connect.hdfs.TopicPartitionWriter.recover(TopicPartitionWriter.java:212)
        at io.confluent.connect.hdfs.TopicPartitionWriter.write(TopicPartitionWriter.java:256)
        at io.confluent.connect.hdfs.DataWriter.write(DataWriter.java:234)
        at io.confluent.connect.hdfs.HdfsSinkTask.put(HdfsSinkTask.java:103)
        at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:429)
        at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:250)
        at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:179)
        at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148)
        at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:139)
        at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:182)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:197)
        at java.io.DataInputStream.readFully(DataInputStream.java:169)
        at io.confluent.connect.hdfs.wal.WALFile$Reader.init(WALFile.java:584)
        at io.confluent.connect.hdfs.wal.WALFile$Reader.initialize(WALFile.java:552)
        at io.confluent.connect.hdfs.wal.WALFile$Reader.<init>(WALFile.java:529)
        at io.confluent.connect.hdfs.wal.WALFile$Writer.<init>(WALFile.java:214)
        at io.confluent.connect.hdfs.wal.WALFile.createWriter(WALFile.java:67)
        at io.confluent.connect.hdfs.wal.FSWAL.acquireLease(FSWAL.java:73)
        ... 17 more

Ewen Cheslack-Postava

Apr 18, 2017, 11:43:32 PM
to Confluent Platform
Did anything crash at any point? This doesn't look exactly the same as https://github.com/confluentinc/kafka-connect-hdfs/issues/168, but it could be a similar/related problem where an incomplete WAL file causes recovery to fail.

-Ewen

--
You received this message because you are subscribed to the Google Groups "Confluent Platform" group.
To unsubscribe from this group and stop receiving emails from it, send an email to confluent-platform+unsub...@googlegroups.com.
To post to this group, send email to confluent-platform@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/confluent-platform/9335b03f-13f6-4696-a574-627ca705f5f1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Basti

Apr 19, 2017, 3:45:03 AM
to Confluent Platform
Hey,

Thanks for the reply. Everything seems fine: I checked the logs of ZooKeeper, Kafka, and the Schema Registry, and there are no errors. The error above happens right after the connector is initialized, so there is no earlier error.

Ewen Cheslack-Postava

Apr 19, 2017, 7:28:09 PM
to Confluent Platform
Could you check the file hdfs://10.42.0.86:9000/logs/testApp/0/log that is causing the problem to see whether it exists and whether it has any contents? If we have the file, we might be able to inspect it to see what the problem is (e.g. whether it is somehow corrupt, has only a partial entry, or something else).
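For example, something along these lines (this assumes the `hdfs` CLI is available and pointed at the same NameNode; `hexdump` is just one convenient way to eyeball the raw bytes):

```shell
# Does the WAL file exist, and what size is it?
hdfs dfs -ls hdfs://10.42.0.86:9000/logs/testApp/0/log

# Dump the first bytes; a zero-length or truncated file
# would be consistent with the EOFException in the stack trace.
hdfs dfs -cat hdfs://10.42.0.86:9000/logs/testApp/0/log | hexdump -C | head
```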

Since this looks like a new issue, you might file a bug on the repo. There are a few that hit a related code path already but this one seems like it is new: https://github.com/confluentinc/kafka-connect-hdfs/issues?utf8=%E2%9C%93&q=is%3Aissue%20is%3Aopen%20RECOVERY_PARTITION_PAUSED%20

-Ewen

