Hive refuses to work with Yarn

Hive refuses to work with Yarn eran 6/27/12 6:24 AM
Hi,
I'm using CDH4 with Yarn. When running queries in hive that require a
M/R job the job is submitted to the Yarn resource manager (I see it
listed there with status "KILLED") but is immediately killed by hive:

It keeps failing with:
Job running in-process (local Hadoop)
Hadoop job information for null: number of mappers: 1; number of reducers: 0
2012-06-27 09:08:24,810 null map = 0%,  reduce = 0%
Ended Job = job_1340800364224_0002 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1340800364224_0002_m_000000 (and more) from job job_1340800364224_0002
Exception in thread "Thread-33" java.lang.IllegalArgumentException: Does not contain a valid host:port authority: local
        at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:206)
        at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:158)
        at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:147)
        at org.apache.hadoop.hive.ql.exec.JobTrackerURLResolver.getURL(JobTrackerURLResolver.java:42)
        at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:198)
        at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:83)
        at java.lang.Thread.run(Thread.java:662)
Execution failed with exit status: 2
Obtaining error information

Task failed!
Task ID:
  Stage-1


I tried setting the MRv1 parameter mapred.job.tracker which did
resolve this exception but then, of course, it failed on something
else.
This issue seems to imply that hive requires MRv1, although the
documentation suggests that it should also work with Yarn.

I do have the HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce and the
mapreduce.framework.name is set to "yarn".
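
For reference, a minimal sketch of how that setting typically looks. It assumes the property lives in mapred-site.xml; the file name is an assumption, since the post does not say where it was set.

```xml
<!-- mapred-site.xml fragment (file location assumed, not stated in the post) -->
<property>
  <!-- Tells MapReduce clients to submit jobs to YARN instead of running them locally -->
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```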

Any ideas why it wouldn't work?

Thanks,
Eran
Re: Hive refuses to work with Yarn Roman Shaposhnik 6/27/12 9:49 AM
On Wed, Jun 27, 2012 at 6:24 AM, eran
<eran%gigy...@gtempaccount.com> wrote:
> Exception in thread "Thread-33" java.lang.IllegalArgumentException:
> Does not contain a valid host:port authority: local

This tells me that your configuration is somehow busted. Check all the
properties ending in .address in your yarn-site.xml

Thanks,
Roman.
Re: Hive refuses to work with Yarn Carl Steinbach 6/28/12 12:41 AM
Hi Eran,

There are actually two failures here:

1) The MR job that Hive launched on your cluster failed for some reason. I can't determine why based on the information provided. I recommend trying to locate the task logs for the failed tasks on the cluster.

2) When a job fails Hive attempts to automatically retrieve the task logs from the JobTracker's TaskLogServlet. This service doesn't exist in MR2, which is why Hive is throwing an exception (either because mapred.job.tracker is undefined, or because it can't find the TaskLogServlet service running on the machine that mapred.job.tracker points to). This is a known issue and one that we plan to address in the next release of CDH.

In the meantime I recommend doing the following if you need to run Hive on MR2:

* Keep Hive happy by setting mapred.job.tracker to a bogus value.
* Disable task log retrieval by setting hive.exec.show.job.failure.debug.info=false
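
As a rough sketch, the two settings above could be placed in hive-site.xml as shown below. The property names are the ones given in this message. The value placeholder-host:8021 is a made-up address (an assumption, not a real JobTracker); it only needs to differ from the default "local" that appears in the exception.

```xml
<!-- hive-site.xml fragment: workaround sketch for running Hive on MR2 -->
<property>
  <!-- Bogus value: placeholder-host:8021 is hypothetical, not a real JobTracker -->
  <name>mapred.job.tracker</name>
  <value>placeholder-host:8021</value>
</property>
<property>
  <!-- Skip the task-log lookup that expects the MRv1 TaskLogServlet -->
  <name>hive.exec.show.job.failure.debug.info</name>
  <value>false</value>
</property>
```

The same properties can presumably also be set per session from the Hive CLI with set name=value; instead of editing the file.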

Thanks.

Carl

On Wed, Jun 27, 2012 at 11:14 AM, <er...@gigya-inc.com> wrote:
Thanks Roman,
The addresses seem correct. Also note that setting the mapred.job.tracker property makes this error go away. Doesn't that mean that Hive is looking for an MRv1 property?

This is the relevant part of my yarn-site.xml

  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>hadoop2-m2:8025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>hadoop2-m2:8040</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>hadoop2-m2:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>hadoop2-m2:8141</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>hadoop2-m2:8088</value>
  </property>

-eran



Re: Hive refuses to work with Yarn Shilin Jiang 6/29/12 12:54 AM
Hi,
I have the same problem. Did you find the reason?
Re: Hive refuses to work with Yarn tony.xjz 7/16/12 12:02 AM
I have the same problem. I found the task logs for the failed tasks on the cluster; the errors are as follows:
 
 
2012-07-16 14:41:58,520 FATAL [IPC Server handler 2 on 34459] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1342417297772_0014_m_000000_0 - exited : java.lang.RuntimeException: java.io.FileNotFoundException: /tmp/hadoop/hive_2012-07-16_14-43-33_873_8745321134507504113/-mr-10001/9a014a09-4cc5-40c6-b2cd-cc121ff11b7c (No such file or directory)
	at org.apache.hadoop.hive.ql.exec.Utilities.getMapRedWork(Utilities.java:223)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:255)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:381)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:374)
	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:536)
	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:160)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:381)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
Caused by: java.io.FileNotFoundException: /tmp/hadoop/hive_2012-07-16_14-43-33_873_8745321134507504113/-mr-10001/9a014a09-4cc5-40c6-b2cd-cc121ff11b7c (No such file or directory)
	at java.io.FileInputStream.open(Native Method)
	at java.io.FileInputStream.<init>(FileInputStream.java:120)
	at java.io.FileInputStream.<init>(FileInputStream.java:79)
	at org.apache.hadoop.hive.ql.exec.Utilities.getMapRedWork(Utilities.java:214)
 

On Thursday, June 28, 2012 3:41:24 PM UTC+8, Carl Steinbach wrote:
Re: Hive refuses to work with Yarn Swarnim Kulkarni 7/17/12 1:22 PM
Guess I am also in the same boat.

All my regular M/R jobs run fine on Yarn, but every job launched from Hive fails. Is there any workaround for this issue? It seems like a pretty big problem if we cannot run any M/R jobs from within Hive.
Re: Hive refuses to work with Yarn Carl Steinbach 7/17/12 2:15 PM
Can you please provide some details about how your Hive jobs are failing? Do you have any stack traces or log files? Can you provide a dump of your Hive configuration properties taken from the Hive CLI?

Thanks.

Carl

Re: Hive refuses to work with Yarn Swarnim Kulkarni 7/18/12 10:15 AM
Thanks for your reply Carl.

My exceptions are similar to what others have mentioned in the thread. That is,


Job running in-process (local Hadoop)
Hadoop job information for null: number of mappers: 1; number of reducers: 1
2012-07-18 11:58:57,600 null map = 0%,  reduce = 0%
Ended Job = job_1342625887245_0016 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1342625887245_0016_m_000000 (and more) from job job_1342625887245_0016
Exception in thread "Thread-19" java.lang.IllegalArgumentException: Does not contain a valid host:port authority: local
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:206)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:158)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:147)
at org.apache.hadoop.hive.ql.exec.JobTrackerURLResolver.getURL(JobTrackerURLResolver.java:42)
at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:198)
at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:83)
at java.lang.Thread.run(Thread.java:662)
Execution failed with exit status: 2
12/07/18 11:59:11 ERROR exec.Task: Execution failed with exit status: 2
Obtaining error information
12/07/18 11:59:11 ERROR exec.Task: Obtaining error information

Task failed!
Task ID:
  Stage-1

I suppressed the above exception by setting hive.exec.show.job.failure.debug.info=false.

However, looking into the task logs, I found this:

2012-07-18 11:59:00,307 FATAL [IPC Server handler 2 on 49957] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1342625887245_0016_m_000000_0 - exited : java.lang.RuntimeException: java.io.FileNotFoundException: /tmp/root/hive_2012-07-18_11-58-51_379_5336502585428454730/-mr-10001/758fb869-b2ae-492f-9b2e-fc333058bedc (No such file or directory)
	at org.apache.hadoop.hive.ql.exec.Utilities.getMapRedWork(Utilities.java:223)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:255)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:381)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:374)
	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:536)
	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:160)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:381)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
Caused by: java.io.FileNotFoundException: /tmp/root/hive_2012-07-18_11-58-51_379_5336502585428454730/-mr-10001/758fb869-b2ae-492f-9b2e-fc333058bedc (No such file or directory)
	at java.io.FileInputStream.open(Native Method)
	at java.io.FileInputStream.<init>(FileInputStream.java:120)
	at java.io.FileInputStream.<init>(FileInputStream.java:79)
	at org.apache.hadoop.hive.ql.exec.Utilities.getMapRedWork(Utilities.java:214)
	... 12 more

On Wednesday, June 27, 2012 8:24:42 AM UTC-5, eran wrote:
Re: Hive refuses to work with Yarn Eran Kutner 7/24/12 12:33 PM
I was having too many problems with Yarn. Its integration with other members of the hadoop family seems unstable. After wasting too much time over it I finally gave up and switched to MRv1 and everything, including hive, started working just fine.

-eran


Re: Hive refuses to work with Yarn midair77 8/2/12 5:11 PM
I ran into this problem too. Following a hint from a JIRA issue, I
added this to hive-site.xml:

<property>
   <name>mapreduce.jobtracker.address</name>
   <value>ignorethis</value>
</property>

and restarted hive-server.

Since then I have been able to run Hive queries, and things look good.

Hope this helps.