HDFS Connector - Connector can't renew properly Kerberos ticket

830 views

Alex Piermatteo

May 2, 2016, 9:47:07 AM5/2/16
to Confluent Platform

Hi,

I have this problem in a Kerberized environment: when I start the connector, everything works fine. I obtain my Kerberos credentials and the connector starts writing without issues. The problem begins a day later, when the Kerberos ticket is renewed: the connector immediately crashes with this error:

ERROR Recovery failed at state RECOVERY_PARTITION_PAUSED (io.confluent.connect.hdfs.TopicPartitionWriter:221) org.apache.kafka.connect.errors.ConnectException: java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "mitstatlodpbroker01/10.72.176.22"; destination host is: "mitstatlodpmaster01":8020;
at io.confluent.connect.hdfs.wal.FSWAL.apply(FSWAL.java:131)
at io.confluent.connect.hdfs.TopicPartitionWriter.applyWAL(TopicPartitionWriter.java:519)
at io.confluent.connect.hdfs.TopicPartitionWriter.recover(TopicPartitionWriter.java:204)
at io.confluent.connect.hdfs.TopicPartitionWriter.write(TopicPartitionWriter.java:234)
at io.confluent.connect.hdfs.DataWriter.write(DataWriter.java:234)
at io.confluent.connect.hdfs.HdfsSinkTask.put(HdfsSinkTask.java:91)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:287)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:176)
at org.apache.kafka.connect.runtime.WorkerSinkTaskThread.iteration(WorkerSinkTaskThread.java:90)
at org.apache.kafka.connect.runtime.WorkerSinkTaskThread.execute(WorkerSinkTaskThread.java:58)
at org.apache.kafka.connect.util.ShutdownableThread.run(ShutdownableThread.java:82)

It is strange because the connector does renew the ticket periodically (via the async function in the DataWriter class), but the issue is still there, and I have to restart the connector manually every time this error happens.
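For context, the periodic renewal in DataWriter amounts to a scheduled task that invokes UserGroupInformation.reloginFromKeytab(). A minimal, self-contained sketch of that scheduling pattern (the Hadoop call is replaced by a hypothetical stub, since the real one needs a live Kerberos environment; the interval is shortened to milliseconds so the sketch finishes quickly):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class TicketRenewer {
    static final AtomicInteger renewals = new AtomicInteger();

    // Hypothetical stand-in for UserGroupInformation.reloginFromKeytab();
    // here we only count invocations instead of talking to a KDC.
    static void reloginFromKeytabStub() {
        renewals.incrementAndGet();
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // The real renewer fires on a fraction of the ticket lifetime;
        // 50 ms here purely for demonstration.
        scheduler.scheduleAtFixedRate(TicketRenewer::reloginFromKeytabStub,
                0, 50, TimeUnit.MILLISECONDS);
        try {
            Thread.sleep(300);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        scheduler.shutdown();
        System.out.println("stub relogins fired: " + renewals.get());
    }
}
```

The point of the sketch is only that the renewal runs on its own schedule, independent of the write path, which is why a renewal failure can surface later as a write-time GSS error.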

Do you recognize this error, or is there something I might be doing wrong?

Thanks.

Liquan Pei

May 2, 2016, 11:21:51 AM5/2/16
to confluent...@googlegroups.com
Hi Alex,

Can you share your HDFS connector configuration with me? Especially the Hadoop security settings. Thanks!

Liquan

--
You received this message because you are subscribed to the Google Groups "Confluent Platform" group.
To unsubscribe from this group and stop receiving emails from it, send an email to confluent-platf...@googlegroups.com.
To post to this group, send email to confluent...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/confluent-platform/e7f0564d-9ec8-4061-a649-e22edd4196f9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Liquan Pei | Software Engineer | Confluent | +1 413.230.6855
Download Apache Kafka and Confluent Platform: www.confluent.io/download

Alex Piermatteo

May 2, 2016, 11:37:07 AM5/2/16
to Confluent Platform
Hi Liquan,

Thank you for your quick response. Please find my connector configuration below:

{"name": "kafka-connect-hdfs-jdbctopic",
          "hdfs.url": "hdfs://mitstatlodpmaster01:8020",
          "hadoop.conf.dir": "/etc/hadoop/conf",
          "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
          "flush.size": "134217728",
          "rotate.interval.ms": "10000",
          "topics.dir": "/data/raw/usr/mgt",
          "logs.dir": "/data/raw/usr/mgt/log",
          "topics": "raw_usr_mgt_v_bi_activation,raw_usr_mgt_v_bi_coupon,raw_usr_mgt_v_bi_coupon_rule,raw_usr_mgt_v_bi_creditmemo",
          "tasks.max": "2",
          "partitioner.class": "io.confluent.connect.hdfs.partitioner.DailyPartitioner",
          "locale": "en",
          "timezone": "UTC",
          "hdfs.authentication.kerberos": "true",
          "connect.hdfs.principal": "ka...@SKY.LOCAL",
          "connect.hdfs.keytab": "/opt/kerberos/keytabs/kafka.keytab",
          "hdfs.namenode.principal": "hdfs/mitstatlodpmas...@SKY.LOCAL"}



Naturally, every hostname in the config is reachable from the machine where the connector runs; in fact there is no problem at the first start, only after the renewal.
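One side note, probably unrelated to the crash: the payload above lists the connector properties at the top level next to "name". When a connector is submitted through the Connect REST API, the properties are expected inside a nested "config" object (the flat key=value form belongs in a standalone properties file). A sketch of the REST-shaped equivalent, abbreviated to a few of the keys above:

```json
{
  "name": "kafka-connect-hdfs-jdbctopic",
  "config": {
    "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
    "hdfs.url": "hdfs://mitstatlodpmaster01:8020",
    "hdfs.authentication.kerberos": "true",
    "connect.hdfs.keytab": "/opt/kerberos/keytabs/kafka.keytab"
  }
}
```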



Thank you, 


Alex

Alex Piermatteo

May 22, 2016, 7:18:28 AM5/22/16
to Confluent Platform
Hi,

after some research I found the problem: it is related to the hadoop-common 2.6.0 library and Java 8.
Basically, the UGI#reloginFromKeytab call used in the DataWriter.java class to renew the Kerberos ticket every day fails silently, because a flag (isKeyTab) returns false every time even though it should be true (bug link: https://issues.apache.org/jira/browse/HADOOP-10786).

I solved the issue by running the connector on Java 7, together with the hadoop-client 2.6.1 library, which contains the fix.
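To make the failure mode concrete: per HADOOP-10786, reloginFromKeytab() checks an internal isKeyTab flag up front and returns silently when it is false, and on Java 8 with hadoop-common 2.6.0 that flag was computed incorrectly. A sketch of that guard pattern, with illustrative names rather than the actual Hadoop code:

```java
class UgiSketch {
    // On Java 8 + hadoop-common 2.6.0 this flag ended up false
    // even for keytab-based logins (HADOOP-10786).
    private final boolean isKeytab;
    private int loginCount = 0;

    UgiSketch(boolean isKeytab) {
        this.isKeytab = isKeytab;
    }

    // Mirrors the guard at the top of reloginFromKeytab(): if the flag is
    // wrong, the method returns without renewing and without throwing,
    // so the caller never learns the renewal was skipped.
    boolean reloginFromKeytab() {
        if (!isKeytab) {
            return false; // silent no-op
        }
        loginCount++;     // real code would re-authenticate against the KDC here
        return true;
    }

    int loginCount() {
        return loginCount;
    }
}
```

The renewal thread therefore keeps running without errors while the TGT quietly expires, which matches the symptom above: nothing in the logs until the old ticket lapses, then "GSS initiate failed".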

Regards,

Alex

GADI Younes

Mar 26, 2019, 2:49:16 PM3/26/19
to Confluent Platform
Hi,

Did you install Kerberos on the host where the Kafka connector runs, in order to generate a ticket for the user?


