Cluster creation fails in AWS EMR when starting with my Cascading jar


Srinivasan Ramaswamy

Feb 26, 2014, 8:38:01 PM
to cascadi...@googlegroups.com
I am trying to run my jar (with Cascading dependencies), which joins two files on a common field. The jar runs fine on my local machine, but on AWS the cluster doesn't even start up.

The cluster terminates because the bootstrap action fails with this error:

Terminated with errors   On the master instance (i-8b5170aa), bootstrap action 1 returned a non-zero return code

Here is the bootstrap action I specified while creating the cluster:

Bootstrap Actions
Custom action s3://my-bucket/project/dataMerger-1.0-jar-with-dependencies.jar s3://my-bucket/inputData/input1.txt, s3://my-bucket/inputData/input2.dat,, s3://my-bucket/inputData/mergedOutput

Here are the contents of the logs.

j-1GWFGNLBY1CGD/node/i-4e52746f/bootstrap-actions/1/controller

2014-02-26T21:45:19.180Z INFO Fetching file 's3://my-bucket/project/dataMerger-1.0-jar-with-dependencies.jar'
2014-02-26T21:45:29.333Z INFO Working dir /mnt/var/lib/bootstrap-actions/1
2014-02-26T21:45:29.333Z INFO Executing /mnt/var/lib/bootstrap-actions/1/dataMerger-1.0-jar-with-dependencies.jar s3://my-bucket/inputData/input1.txt s3://my-bucket/inputData/input2.dat s3://my-bucket/inputData/mergedOutput
2014-02-26T21:45:29.838Z INFO Execution ended with ret val 2
2014-02-26T21:45:29.839Z ERROR Execution failed with code '2'

j-1GWFGNLBY1CGD/node/i-4e52746f/bootstrap-actions/1/stderr

/mnt/var/lib/bootstrap-actions/1/dataMerger-1.0-jar-with-dependencies.jar: line 1: PK : command not found
/mnt/var/lib/bootstrap-actions/1/dataMerger-1.0-jar-with-dependencies.jar: line 3: ZD: command not found
/mnt/var/lib/bootstrap-actions/1/dataMerger-1.0-jar-with-dependencies.jar: line 4: ú: command not found
/mnt/var/lib/bootstrap-actions/1/dataMerger-1.0-jar-with-dependencies.jar: line 3: ùZDéÆoËѰ META-INF/MANIFEST.MFEÕ±: No such file or directory
/mnt/var/lib/bootstrap-actions/1/dataMerger-1.0-jar-with-dependencies.jar: line 4: syntax error near unexpected token `)'
/mnt/var/lib/bootstrap-actions/1/dataMerger-1.0-jar-with-dependencies.jar: line 4: `¬0 Ä·=êw» §‘¡ Ÿ⁄: Apï≥9mö ]ZÌ€K q˝˛·˜ê‚ %€ ≤ƒ19≥)J≠jÓ˙8#ˇ˘@¯öƒ¸ÇV-#d ∂Yú© –ıh<Ãò¥j¶H˘„¬0Ä<aXæ Ï>‹◊EUîÁj´ïáòlK ‚à â‡B∏É ˘∂N¥z PK '


I searched online for a few hours and didn't find anything useful. Why does this jar, which runs on my machine, not run on AWS? Does this ring a bell for anyone? Any help is appreciated :)


Thanks
Srini

Andre Kelpe

Feb 27, 2014, 3:22:48 AM
to cascadi...@googlegroups.com
This is not a Cascading problem: EMR bootstrap actions are shell scripts that get executed after the cluster has been booted. You are giving it your job jar, which will not work: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-bootstrap.html

- André
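
The stderr output quoted above fits this explanation: a jar is a ZIP archive, and ZIP local file headers begin with the bytes `PK`, which is exactly what the shell reports as "command not found" on line 1. As a rough sketch (the function name and return labels are hypothetical, not part of EMR or Cascading), the leading bytes of a file can be checked before registering it as a bootstrap action:

```python
# Sketch: tell a ZIP/jar archive apart from a script by its leading bytes.
# classify_header and its return labels are illustrative names only.

ZIP_LOCAL_HEADER = b"PK\x03\x04"  # signature of a ZIP local file header
SHEBANG = b"#!"                   # typical first two bytes of an executable script


def classify_header(header: bytes) -> str:
    """Return 'archive', 'script', or 'unknown' for the first bytes of a file."""
    if header.startswith(ZIP_LOCAL_HEADER):
        return "archive"
    if header.startswith(SHEBANG):
        return "script"
    return "unknown"


def classify_file(path: str) -> str:
    """Read the first four bytes of a local file and classify them."""
    with open(path, "rb") as handle:
        return classify_header(handle.read(4))
```

A file classified as `archive` here belongs in a Step, not in a bootstrap action. A script without a shebang line would come back as `unknown`, so the check is only a hint.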


--
You received this message because you are subscribed to the Google Groups "cascading-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cascading-use...@googlegroups.com.
To post to this group, send email to cascadi...@googlegroups.com.
Visit this group at http://groups.google.com/group/cascading-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/cascading-user/fdbd50ce-8a77-43ed-89e7-ca8d005706d6%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.



--
André Kelpe
an...@concurrentinc.com
http://concurrentinc.com

Srinivasan Ramaswamy

Feb 27, 2014, 11:37:03 AM
to cascadi...@googlegroups.com
Yeah, I agree it's not a Cascading problem. But I felt people in this forum would have experience running Cascading jobs on AWS, so I was wondering whether anyone had faced the same problem.


This is the first time I am trying to run a Cascading job on AWS. I followed the instructions on this page to create the cluster.



Andre Kelpe

Feb 27, 2014, 11:57:49 AM
to cascadi...@googlegroups.com
You have to either add a Step via the UI or pass the jar file to run on the command line to the "elastic-mapreduce" script as explained in the CLI section of that document.

- André

P.S.: The document is a bit outdated with regard to the Cascading SDK. If you want to install the latest SDK, use this bootstrap action: s3://cascading.org/sdk/2.5/install-cascading-sdk.sh. Please also note that the SDK is not necessary if you are just running a Cascading application. You only need it when you want to use the lingual, multitool, or load programs on your cluster.
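
For the Step route, a minimal sketch follows. It assumes the boto3 Python SDK, which is not mentioned in this thread and stands in for the UI or the "elastic-mapreduce" script. It also assumes the jar's manifest declares a main class. The helper names and the cluster ID passed to `submit_step` are placeholders; the S3 paths are the ones from the original post.

```python
from typing import Dict, List


def build_jar_step(name: str, jar: str, args: List[str]) -> Dict:
    """Describe a custom-jar step in the shape the EMR API expects."""
    return {
        "Name": name,
        # Assumption: keep the cluster alive if the step fails.
        "ActionOnFailure": "CANCEL_AND_WAIT",
        "HadoopJarStep": {"Jar": jar, "Args": list(args)},
    }


def submit_step(cluster_id: str, step: Dict) -> str:
    """Add the step to a running cluster; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so building a step needs no AWS setup

    emr = boto3.client("emr")
    response = emr.add_job_flow_steps(JobFlowId=cluster_id, Steps=[step])
    return response["StepIds"][0]


# The jar and its arguments go into a Step, not into a bootstrap action.
step = build_jar_step(
    "dataMerger",
    "s3://my-bucket/project/dataMerger-1.0-jar-with-dependencies.jar",
    [
        "s3://my-bucket/inputData/input1.txt",
        "s3://my-bucket/inputData/input2.dat",
        "s3://my-bucket/inputData/mergedOutput",
    ],
)
```

Calling `submit_step` with the ID of a running cluster and this `step` would queue the job. Nothing in the block contacts AWS until that call is made.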




Srinivasan Ramaswamy

Feb 27, 2014, 5:46:54 PM
to cascadi...@googlegroups.com
Thanks, Andre! Yeah, I got mixed up between Bootstrap Actions and Steps. Thanks for the clarification; it works now.

