Kafka-connect with Hive integration: using high availability name nodes

George @paytm.com

Mar 29, 2016, 4:44:33 PM
to Confluent Platform
Hi,

Here is our context:

1. Our HDFS uses HA NameNodes. I use the active NameNode's IP in my Kafka Connect config because Kafka Connect does not recognize the HA NN host.
2. Importing to HDFS succeeds, and I can run SHOW CREATE TABLE on the table for my imported topic.
3. But I can't query or even drop the table. Hive gives the following error:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.lang.IllegalArgumentException: Wrong FS: hdfs://ACTIVE_NAME_NODE_IP:8020/TOPICS_DIR/mytopic, expected: hdfs://HA_NN_Name)

Is anyone else experiencing similar problems?

Thanks,

George


Liquan Pei

Mar 29, 2016, 7:13:40 PM
to confluent...@googlegroups.com
Hi George,

It seems that there are some issues with the Hive configuration. Did you upgrade the Hive metastore after enabling HA? http://www.cloudera.com/documentation/enterprise/5-2-x/topics/cdh_hag_hdfs_ha_cdh_components_config.html

Thanks,
Liquan

--
Liquan Pei | Software Engineer | Confluent | +1 413.230.6855
Download Apache Kafka and Confluent Platform: www.confluent.io/download

Tariq Mohammad

Mar 30, 2016, 4:18:40 AM
to Confluent Platform
Hi George,

Try using the NN alias (say, nameservice1) instead of the active NN's IP, and specify the hadoop.conf.dir property in your kafka-connect-hdfs properties file. Since you have an HA setup, Hive looks for nameservice1, not the hostname of the currently active NN, when it resolves the underlying HDFS data location.

As far as I understand kafka-connect-hdfs, it uses the name specified in its properties file to create the HDFS location for the Hive data. Hive doesn't recognize that location because, per its own configuration, it expects nameservice1.
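
A minimal sketch of this suggestion as a properties fragment. The alias nameservice1 is only the example name used in this post, and the directory /etc/hadoop/conf is an assumption; substitute the nameservice and configuration directory of your own cluster.

```properties
# nameservice1 is the example alias from this post; use the value of
# dfs.nameservices from your hdfs-site.xml.
hdfs.url=hdfs://nameservice1
# Assumed path; point it at the directory holding core-site.xml and hdfs-site.xml.
hadoop.conf.dir=/etc/hadoop/conf
```

With hadoop.conf.dir set, the connector can read the HA client settings from those files, so the alias no longer has to be replaced by a NameNode IP.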

HTH

Tariq Mohammad

Mar 30, 2016, 4:24:13 AM
to Confluent Platform
Also add hive.conf.dir to your properties file.

George @paytm.com

Mar 30, 2016, 11:38:36 AM
to Confluent Platform
Hi all,

I had hive.conf.dir set but not hadoop.conf.dir. Setting hadoop.conf.dir solved my HA NN alias problem.
My hdfs.url also had :8020 appended to it. Removing the port solved my Wrong FS problem.

Thanks a lot for your help.

George
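
For reference, the resolution reported in this thread might look roughly like the connector properties file below. This is a sketch under stated assumptions, not the poster's actual file: HA_NN_Name and mytopic are the placeholders from the error message, while the connector name, flush.size value, configuration directories, and metastore host are hypothetical.

```properties
# Hypothetical connector name; topic name taken from the error message in this thread.
name=hdfs-sink-example
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
tasks.max=1
topics=mytopic
# Assumed value; number of records written before a file is committed.
flush.size=1000

# Logical HA nameservice: no NameNode IP and no :8020 port.
hdfs.url=hdfs://HA_NN_Name
# Assumed path; must contain the core-site.xml and hdfs-site.xml that define the nameservice.
hadoop.conf.dir=/etc/hadoop/conf

# Hive integration settings. Host and path are placeholders.
hive.integration=true
hive.metastore.uris=thrift://HIVE_METASTORE_HOST:9083
hive.conf.dir=/etc/hive/conf
# Hive integration requires a schema compatibility mode other than NONE.
schema.compatibility=BACKWARD
```

The port is left off hdfs.url because a logical nameservice is resolved through hdfs-site.xml, which already carries the RPC addresses of the individual NameNodes. Keeping the URL identical to the filesystem Hive expects is what avoids the Wrong FS mismatch.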