Unable to run spark-submit using azkaban

vikra...@gmail.com

Aug 11, 2015, 4:47:23 PM
to azkaban
Hi,
I have created a foo.job file as follows:

type=command
command = ~/spark/spark-1.4.1/bin/spark-submit --master spark://weve:7077 ~/spark/spark-1.4.1/examples/src/main/python/pi.py 100

When I run the same command in a terminal, it works correctly regardless of which directory I run it from.

However, the job fails when executed from Azkaban with the following error. What am I doing wrong?

11-08-2015 13:35:12 PDT pi INFO - Starting job pi at 1439325312319
11-08-2015 13:35:12 PDT pi INFO - Building command job executor. 
11-08-2015 13:35:12 PDT pi INFO - 1 commands to execute.
11-08-2015 13:35:12 PDT pi INFO - Command: ~/spark/spark-1.4.1/bin/spark-submit --master spark://weve:7077 ~/spark/spark-1.4.1/examples/src/main/python/pi.py 100
11-08-2015 13:35:12 PDT pi INFO - Environment variables: {JOB_NAME=pi, JOB_PROP_FILE=/home/weveadmin/azkaban/azkaban-solo-2.5.0/executions/1/pi_props_765525279793906107_tmp, JOB_OUTPUT_PROP_FILE=/home/weveadmin/azkaban/azkaban-solo-2.5.0/executions/1/pi_output_7062891433530190749_tmp}
11-08-2015 13:35:12 PDT pi INFO - Working directory: /home/weveadmin/azkaban/azkaban-solo-2.5.0/executions/1
11-08-2015 13:35:12 PDT pi INFO - Process completed unsuccessfully in 0 seconds.
11-08-2015 13:35:12 PDT pi ERROR - Job run failed!
11-08-2015 13:35:12 PDT pi ERROR - java.io.IOException: Cannot run program "~/spark/spark-1.4.1/bin/spark-submit" (in directory "/home/weveadmin/azkaban/azkaban-solo-2.5.0/executions/1"): error=2, No such file or directory
11-08-2015 13:35:12 PDT pi INFO - Finishing job pi at 1439325312351 with status FAILED
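
The error points at the ~ in the command: Azkaban's command job type appears to launch the process directly rather than through a shell, so tilde expansion never happens and the literal path "~/spark/..." does not exist. A likely fix, assuming the home directory is /home/weveadmin as the working directory in the log suggests, is to write absolute paths in the job file:

type=command
command=/home/weveadmin/spark/spark-1.4.1/bin/spark-submit --master spark://weve:7077 /home/weveadmin/spark/spark-1.4.1/examples/src/main/python/pi.py 100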

Vikram Kone

Aug 12, 2015, 5:00:45 PM
to azkab...@googlegroups.com, Hien Luu, Nick Pentreath
Is there a way to submit Spark jar files to workers in Spark standalone cluster mode using Azkaban?
I see that if we use the command job type, it needs access to the Spark job jar files on all the client machines in order to work.
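
One option, sketched here as a guess rather than a tested recipe: pass --deploy-mode cluster to spark-submit so the driver is launched on a worker rather than on the Azkaban host. Note that in standalone cluster mode the jar path must still be reachable from the worker that runs the driver (for example via a shared filesystem or HDFS), so this does not remove the distribution problem entirely. The class name and jar path below are placeholders:

type=command
command=/home/weveadmin/spark/spark-1.4.1/bin/spark-submit --master spark://weve:7077 --deploy-mode cluster --class com.example.MyJob /home/weveadmin/jobs/my-job.jar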
