Presto not resolving HA Namenode cluster alias despite having resources named.

Douglas Moore

Sep 22, 2015, 10:55:16 AM
to Presto

Hi All,


I'm having trouble with an UnknownHostException after upgrading to the 115t beta.


I have this Hive connector file, per the documentation:

hive.properties:

connector.name=hive-hadoop2
hive.metastore.uri=thrift://TDXYZN1:9083
hive.config.resources=/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml


When I run a query from the Presto CLI, Presto gets through accessing the Hive metastore and the HDFS ACL permission checks, and then fails when retrieving the Hive split.


presto:default> select count(*) from hive.default.employee;

Query 20150922_142613_00022_xq8uj, FAILED, 1 node
Splits: 2 total, 0 done (0.00%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s]

Query 20150922_142613_00022_xq8uj failed: Error opening Hive split hdfs://CLUSTERALIAS/apps/hive/warehouse/employee/MOCK_DATA.csv (offset=0, length=62315) using org.apache.hadoop.mapred.TextInputFormat: java.net.UnknownHostException: CLUSTERALIAS


Suggestions?



Thanks in advance.


David Phillips

Sep 22, 2015, 11:18:26 AM
to presto...@googlegroups.com
What version of Hadoop are you using?


--
You received this message because you are subscribed to the Google Groups "Presto" group.
To unsubscribe from this group and stop receiving emails from it, send an email to presto-users...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Wallin, Christina A

Sep 22, 2015, 11:19:57 AM
to presto...@googlegroups.com
Hi Douglas,

The HA Hive support does not work with HA HDFS cluster aliases for now. Thus, you need to specify both of the metastore URIs explicitly:

hive.metastore.uri=thrift://metastore.machine.one:9083,thrift://metastore.machine.two:9083

Best,
Christina


Douglas Moore

Sep 22, 2015, 11:57:43 AM
to Presto, da...@acz.org
HDP 2.1 (Hadoop 2.4.0)

Douglas Moore

Sep 22, 2015, 12:01:20 PM
to Presto, Christin...@teradata.com
Hi Christina,
I applied the suggested change, but got the same results. Presto does seem to make the initial connection to the metastore, and it did resolve the location of the HDFS files.
The HDFS permission checks passed (in an earlier test they did not, because my ID did not have read access; now it does).
- Douglas

Douglas Moore

Sep 22, 2015, 12:57:40 PM
to Presto

From the server.log:


2015-09-22T11:10:17.830-0400    ERROR   query-execution-0       com.facebook.presto.execution.QueryStateMachine Query 20150922_151016_00010_7pbi6 failed
com.facebook.presto.spi.PrestoException: Error opening Hive split hdfs://CLUSTERALIAS/apps/hive/warehouse/employee/MOCK_DATA.csv (offset=0, length=62315) using org.apache.hadoop.mapred.TextInputFormat: java.net.UnknownHostException: CLUSTERALIAS
        at com.facebook.presto.hive.HiveUtil.createRecordReader(HiveUtil.java:163)
        at com.facebook.presto.hive.GenericHiveRecordCursorProvider.createHiveRecordCursor(GenericHiveRecordCursorProvider.java:47)
        at com.facebook.presto.hive.HivePageSourceProvider.getHiveRecordCursor(HivePageSourceProvider.java:128)
        at com.facebook.presto.hive.HivePageSourceProvider.createPageSource(HivePageSourceProvider.java:106)
        at com.facebook.presto.spi.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:43)
        at com.facebook.presto.split.PageSourceManager.createPageSource(PageSourceManager.java:48)
        at com.facebook.presto.operator.TableScanOperator.createSourceIfNecessary(TableScanOperator.java:258)
        at com.facebook.presto.operator.TableScanOperator.isFinished(TableScanOperator.java:206)
        at com.facebook.presto.operator.Driver.processInternal(Driver.java:377)
        at com.facebook.presto.operator.Driver.processFor(Driver.java:303)
        at com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:587)
        at com.facebook.presto.execution.TaskExecutor$PrioritizedSplitRunner.process(TaskExecutor.java:505)
        at com.facebook.presto.execution.TaskExecutor$Runner.run(TaskExecutor.java:639)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: CLUSTERALIAS
        at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
        at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:231)
        at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:139)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:510)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:453)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
        at org.apache.hadoop.fs.PrestoFileSystemCache.createFileSystem(PrestoFileSystemCache.java:74)
        at org.apache.hadoop.fs.PrestoFileSystemCache.getInternal(PrestoFileSystemCache.java:61)
        at org.apache.hadoop.fs.PrestoFileSystemCache.get(PrestoFileSystemCache.java:43)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
        at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:105)
        at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
        at com.facebook.presto.hive.HiveUtil.lambda$createRecordReader$2(HiveUtil.java:160)
        at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:136)
        at com.facebook.presto.hive.HiveUtil.createRecordReader(HiveUtil.java:160)
        ... 15 more
Caused by: java.net.UnknownHostException: CLUSTERALIAS
        ... 31 more

Wallin, Christina A

Sep 22, 2015, 1:11:20 PM
to presto...@googlegroups.com
Hi Douglas,

Can you make sure that you restarted Presto after changing the hive.properties file? What is now in your hive.properties file?

I once came across this issue when I forgot to stop the old Presto server with the old configuration, so make sure you only have one Presto server running.

From the stacktrace, it looks like the Hadoop libraries called by Presto are not using HA HDFS (which I presume you have configured, since you are using the cluster alias).
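
For anyone checking the same thing: the stack trace goes through NameNodeProxies.createNonHAProxy, which suggests the HDFS client did not find HA settings for the alias and treated CLUSTERALIAS as a plain hostname. Below is a rough Python sketch (not part of Presto or Hadoop) that reads the text of an hdfs-site.xml and reports which of the usual HA client properties are missing for a nameservice. The function names load_properties and missing_ha_properties are made up for this illustration; the property names are the standard HDFS HA client settings.

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Parse Hadoop-style <configuration> XML text into a dict of name -> value."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def split_list(value):
    """Split a comma-separated config value, dropping empty entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def missing_ha_properties(props, nameservice):
    """Return the HA client properties that are absent for the given nameservice."""
    missing = []
    # The alias must be declared as a nameservice.
    if nameservice not in split_list(props.get("dfs.nameservices", "")):
        missing.append("dfs.nameservices")
    # The nameservice must list its NameNode IDs.
    nn_key = "dfs.ha.namenodes.%s" % nameservice
    namenodes = split_list(props.get(nn_key, ""))
    if not namenodes:
        missing.append(nn_key)
    # Each NameNode ID needs an RPC address.
    for nn in namenodes:
        rpc_key = "dfs.namenode.rpc-address.%s.%s" % (nameservice, nn)
        if not props.get(rpc_key):
            missing.append(rpc_key)
    # The client needs a failover proxy provider for the nameservice.
    provider_key = "dfs.client.failover.proxy.provider.%s" % nameservice
    if not props.get(provider_key):
        missing.append(provider_key)
    return missing
```

Feeding it the contents of the hdfs-site.xml named in hive.config.resources, on each Presto node, together with the alias from the error message, should give an empty list when the HA client settings are all present. A non-empty list points at the file that needs a closer look.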

Christina


Douglas Moore

Sep 22, 2015, 1:13:23 PM
to Presto
David,
Looking back through this group, I found two other posts related to HA. In my case I've included the resources. In another, you asked if the cluster is in secure mode.
We just enabled Hive impersonation (doAs=true). We do not have Kerberos or LDAP integration at this time.

Could this be a factor?
- Douglas



Wallin, Christina A

Sep 22, 2015, 1:48:45 PM
to presto...@googlegroups.com
Hi Douglas,

I think the issue is in HA HDFS, not in the Hive layer; your stacktrace deals with finding the location of the split on HDFS, whereas impersonation changes what user hive-server2 uses to read from HDFS.

Can you ensure that all of the nodes in your cluster have the right configs? Also, do queries in hive work properly?

Christina


Douglas Moore

Sep 22, 2015, 2:35:55 PM
to Presto, Christin...@teradata.com

Yes, the services were restarted. The same Hive query does work.

Current hive.properties:

connector.name=hive-hadoop2
hive.metastore.uri=thrift://tdxyz3n1:9083,thrift://tdxyz3n2:9083
hive.config.resources=/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml

How do I turn on more extensive logging? It seems worthwhile to track down where it's choosing the non-HA HDFS libraries over the HA HDFS libraries.
- Douglas

Fuller, Matthew S

Sep 22, 2015, 3:44:32 PM
to presto...@googlegroups.com, Wallin, Christina A
Hi Doug,
Let’s take this to a separate thread. Although this issue may turn out to be relevant to the Presto community, you are using Teradata’s supported version of Presto. This presto-users group is not meant to be a support list for Teradata’s version. Let’s get to bedrock, and then we can reopen any relevant discussions on this list.
- Matt

David Phillips

Sep 22, 2015, 9:48:57 PM
to presto...@googlegroups.com
Matt, thanks for looking into this. We don't have any experience with HA Namenode (Facebook has Avatar [1]), so it will be great to have your help here.

Fuller, Matthew S

Sep 23, 2015, 10:06:06 AM
to presto...@googlegroups.com
Our pleasure. We will report back our findings. Thanks for the link!

Douglas Moore

Sep 27, 2015, 1:55:39 PM
to Presto, Matthew...@teradata.com
This issue came down to a mix-up of configuration files. launcher.py creates symlinks, which does not help when you later use scp -r to replicate the install dirs across the cluster.
We can close this item. Thanks everyone, especially Christina.

A note for future readers: running launcher.py -v start displays the config files actually being used, which was very helpful in sorting out my mess.
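
Since the root cause here was symlinked config files getting mixed up during replication, one quick way to see which entries under an install directory are symlinks, and where they point, is sketched below in Python. This is only an illustration: find_symlinks is a hypothetical helper, not part of launcher.py, and the directory you pass in is whatever your own install path happens to be.

```python
import os


def find_symlinks(root):
    """Return sorted (link_path, target) pairs for every symlink under root."""
    links = []
    # followlinks=False: report symlinked directories but do not descend into them.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                links.append((path, os.readlink(path)))
    return sorted(links)
```

Running this on each node before and after copying, and comparing the output, shows whether links were replaced by plain copies or point somewhere unexpected.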

The earlier suggestion about specifying the resources was good.

connector.name=hive-hadoop2
hive.metastore.uri=thrift://tdxyz3n1:9083,thrift://tdxyz3n2:9083
hive.config.resources=/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml


- Douglas

伍照坤

Sep 27, 2015, 3:24:02 PM
to presto...@googlegroups.com
Could you double-check the hdfs-site.xml used for Hive?
--
Sincerely.
伍 涛 | Tony Wu

CHITRAPANDI S

Jul 21, 2020, 7:53:14 AM
to Presto
Any update about this issue?

Piotr Findeisen

Jul 21, 2020, 7:56:42 AM
to presto...@googlegroups.com
Hi,

Douglas Moore reported this problem on version 115t.
If you also use that version, please be sure to upgrade.
Otherwise, I encourage you to ask on the #troubleshooting channel

Best
PF


