What should be the Main Class for PySpark jobs in Oozie?


Sachin P

Sep 19, 2017, 2:48:10 AM
to Hue-Users

I created a PySpark job, and it works perfectly fine when submitted through spark-submit. When I run it through Oozie, however, it fails. I suspect the values I entered in the fields below are the problem. These fields are required for the Spark action in Oozie.

Spark Master : local
Mode : client 
Main class : Do I need to enter anything here, given that this is Python + Spark (PySpark) code?
Jars/py files : My py module


====

The stdout log is as below:

  =================================================================

  >>> Invoking Main class now >>>

  Fetching child yarn jobs
  tag id : oozie-653992fdf1609a2d4e19a863dff21a1
  Child yarn jobs are found -
  Spark Action Main class        : org.apache.spark.deploy.SparkSubmit

  Oozie Spark action configuration
  =================================================================

  --master
  local[*]
  --deploy-mode
  client
  --name
  POC1L
  --verbose
  /user/sachinkerala6174/pgm/poc1l.py

  =================================================================

  >>> Invoking Spark class now >>>

  python: can't open file '/user/sachinkerala6174/pgm/poc1l.py': [Errno 2] No such file or directory
  Intercepting System.exit(2)

  <<< Invocation of Main class completed <<<

  Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], exit code [2]

  Oozie Launcher failed, finishing Hadoop job gracefully

  Oozie Launcher, uploading action data to HDFS sequence file: hdfs://ip-172-31-53-48.ec2.internal:8020/user/sachinkerala6174/oozie-oozi/0000509-170711051319609-oozie-oozi-W/spark-fea0--spark/action-data.seq

  Oozie Launcher ends
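
The `python: can't open file` line in the log means the interpreter looked for `/user/sachinkerala6174/pgm/poc1l.py` on the local filesystem of the machine running the launcher. The path resembles the HDFS home directory seen in the last log lines, which suggests (but does not prove) that the script exists only on HDFS. A minimal sketch for checking both locations follows; the helper names are hypothetical, and it assumes the `hdfs` command-line client is available on the machine where it runs.

```python
import os
import subprocess

# Path copied from the log above.
SCRIPT_PATH = "/user/sachinkerala6174/pgm/poc1l.py"


def exists_locally(path):
    """Return True if the path exists on the local filesystem."""
    return os.path.exists(path)


def hdfs_test_command(path):
    """Build the `hdfs dfs -test -e` command that checks the path on HDFS."""
    return ["hdfs", "dfs", "-test", "-e", path]


def exists_on_hdfs(path):
    """Return True/False from the HDFS check, or None if `hdfs` is not installed.

    Assumption: the `hdfs` CLI is on PATH and configured for the cluster.
    """
    try:
        return subprocess.call(hdfs_test_command(path)) == 0
    except OSError:
        return None


if __name__ == "__main__":
    print("local:", exists_locally(SCRIPT_PATH))
    print("hdfs:", exists_on_hdfs(SCRIPT_PATH))
```

If the local check is false and the HDFS check is true, the path given to the Spark action points at a file that the launcher's local `python` process cannot open directly.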
