Error with Terasort on Google Cloud Storage


Christian Guegi

Oct 8, 2015, 6:00:57 AM
to SequenceIQ Cloudbreak
Hi,

I have a small HDP 2.2.8 cluster running on GCP in combination with GCS (provisioned with Cloudbreak).
The p12 key and the Google Cloud Storage connector are installed on all nodes.

Access to GCS from the command line works: hdfs dfs -cat gs://<bucket>/data/access.log returns the content of the log file.

I'd like to validate the setup with TeraSort, but TeraGen (which should store the generated data on GCS) fails with the error below:

yarn jar /usr/hdp/2.2.8.0-3150/hadoop-mapreduce/hadoop-mapreduce-examples.jar teragen 1000000 gs://<bucket>/data/terasort-input
INFO gcs.GoogleHadoopFileSystemBase: GHFS version: 1.4.2-hadoop2
WARN gcs.GoogleHadoopFileSystemBase: No working directory configured, using default: 'gs://zh-hadoop/'
INFO impl.TimelineClientImpl: Timeline service address: http://<hostname>:8188/ws/v1/timeline/
INFO client.RMProxy: Connecting to ResourceManager at <hostname>/10.0.0.6:8050
INFO terasort.TeraSort: Generating 1000000 using 2
INFO mapreduce.JobSubmitter: number of splits:2
INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1444297515143_0001
INFO impl.YarnClientImpl: Submitted application application_1444297515143_0001
INFO mapreduce.Job: The url to track the job: http://<hostname>/proxy/application_1444297515143_0001/
INFO mapreduce.Job: Running job: job_1444297515143_0001
INFO mapreduce.Job: Job job_1444297515143_0001 running in uber mode : false
INFO mapreduce.Job:  map 0% reduce 0%
INFO mapreduce.Job: Job job_1444297515143_0001 failed with state FAILED due to: Application application_1444297515143_0001 failed 2 times due to AM Container for appattempt_1444297515143_0001_000002 exited with  exitCode: 1
For more detailed output, check application tracking page:http://<hostname>/proxy/application_1444297515143_0001/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e04_1444297515143_0001_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1: 
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.

It looks like the NodeManager has some problems:

WARN  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:launchContainer(230)) - Exception from container-launch with container ID: container_e04_1444297515143_0001_01_000001 and exit code: 1
ExitCodeException exitCode=1: 
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)


Any idea what's wrong here?


Thanks much in advance,

Christian



marto...@gmail.com

Oct 8, 2015, 7:26:45 AM
to SequenceIQ Cloudbreak
Hi Christian,

What do the logs of the submitted application in the ResourceManager UI say?

Marton

Christian Guegi

Oct 8, 2015, 7:50:01 AM
to SequenceIQ Cloudbreak, marto...@gmail.com
Hi Marton,

It's the same as the console output I've already posted. Below is the statement from the UI:
Application application_1444297515143_0001 failed 2 times due to AM Container for appattempt_1444297515143_0001_000002 exited with exitCode: 1
For more detailed output, check application tracking page:http://hostname:8088/proxy/application_1444297515143_0001/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e04_1444297515143_0001_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.

The YARN container gets terminated before the mapper starts, so there are no logs from the mappers.

Btw: I also tried it with a cluster provisioned from a slightly modified blueprint. I only changed the HDP version, that's all. I got the same error.

Regards, Christian

Marton Sereg

Oct 8, 2015, 8:01:14 AM
to Christian Guegi, SequenceIQ Cloudbreak
I've managed to reproduce it and from the application logs it seems to me that it's a classpath issue:

2015-10-08 11:40:47,200 INFO [main] org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem not found
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem not found
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:478)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:458)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1560)
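
For anyone hitting the same symptom, here is a minimal sketch of how the missing class could be pulled out of the application logs. The helper name missing_classes is hypothetical (it is not part of Hadoop); it only assumes the log wording shown in the excerpt above.

```shell
# Hypothetical helper: read container log text on stdin and print each class
# that a ClassNotFoundException reports as missing, once per class.
missing_classes() {
  sed -n 's/.*ClassNotFoundException: Class \([A-Za-z0-9_.$]*\) not found.*/\1/p' | sort -u
}
```

If log aggregation is enabled on the cluster, the input could come from the YARN CLI, for example: yarn logs -applicationId application_1444297515143_0001 | missing_classes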

Please try modifying the mapreduce.application.classpath property through Ambari by appending this to the end of the value: :/usr/lib/hadoop/lib/* (or any other directory that holds the connector jar). Once that's done, restart the required services from Ambari and submit the job again.
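
The classpath change can be sketched as a small shell helper that builds the new property value. This is only an illustration: append_classpath_entry is a hypothetical function, not an Ambari or Hadoop command, and the sample value below is made up. The real current value should be copied from Ambari.

```shell
# Hypothetical helper: append an entry to a colon-separated classpath value
# unless that exact entry is already listed.
append_classpath_entry() {
  current=$1
  entry=$2
  case ":$current:" in
    *":$entry:"*) printf '%s\n' "$current" ;;                    # already present, unchanged
    *)            printf '%s\n' "${current:+$current:}$entry" ;; # append with a ':' separator
  esac
}

# Illustrative value only; it is not the real HDP default.
append_classpath_entry '/example/a/*:/example/b/*' '/usr/lib/hadoop/lib/*'
```

The printed string is what would be pasted back into the mapreduce.application.classpath field. The entries are single-quoted so that the shell does not expand the trailing * before the value reaches Hadoop.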

I'm not yet sure if it's a Cloudbreak issue or not, but I'll file a ticket about it, thanks for reporting.

Marton


Christian Guegi

Oct 8, 2015, 8:17:40 AM
to SequenceIQ Cloudbreak, christi...@gmail.com
Marton,

Thanks for your help!
After adding :/usr/lib/hadoop/lib/* to the property mapreduce.application.classpath it works like a charm.

Best, Christian