How to schedule Dataproc jobs


amerikiwi

Sep 17, 2017, 4:13:09 PM
to Google Cloud Dataproc Discussions
What is the best way to schedule Dataproc jobs? My jobs use GCS, BigQuery, and PySpark on Dataproc. If anyone has some sample code, that would be great.

Dan Sedov

Sep 18, 2017, 12:59:44 PM
to Google Cloud Dataproc Discussions
At the moment cloud scheduler [1] appears to be the best option available. Here's a blog post talking about using it for Dataproc jobs [2].
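
Since sample code was requested: whichever scheduler fires the trigger, it ultimately has to submit a job to Dataproc. Below is a minimal sketch of the request a scheduled HTTP call could send to the Dataproc v1 `jobs:submit` REST endpoint. The project, region, cluster, and GCS path are hypothetical placeholders, and authentication (an OAuth token on the request) is assumed to be handled by the caller and is not shown.

```python
import json


def build_submit_request(project_id, region, cluster_name, main_py_uri, args=()):
    """Return (url, body) for a Dataproc v1 jobs:submit call for a PySpark job."""
    url = (
        "https://dataproc.googleapis.com/v1/"
        f"projects/{project_id}/regions/{region}/jobs:submit"
    )
    body = {
        "job": {
            # Which existing cluster should run the job.
            "placement": {"clusterName": cluster_name},
            # The PySpark driver script, stored in GCS, plus its arguments.
            "pysparkJob": {
                "mainPythonFileUri": main_py_uri,
                "args": list(args),
            },
        }
    }
    return url, body


# All values below are hypothetical placeholders, not taken from the thread.
url, body = build_submit_request(
    "my-project",
    "us-central1",
    "my-cluster",
    "gs://my-bucket/jobs/etl.py",
    ["--date", "2017-09-17"],
)
print(url)
print(json.dumps(body, indent=2))
```

The scheduler would then POST `body` as JSON to `url` on whatever cadence you configure. This assumes the cluster already exists when the trigger fires.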

James Malone

Sep 18, 2017, 1:13:12 PM
to Google Cloud Dataproc Discussions
You can also use a self-installed Apache Airflow instance. Airflow has support for Cloud Dataproc along with other GCP products:
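
As a rough illustration of the Airflow route, here is a sketch of a DAG that submits one PySpark job per day. It assumes a recent Airflow 2.x install with the `apache-airflow-providers-google` package, which is newer than this thread, so the operator and the `schedule` argument may differ on older versions. The DAG id, project, region, cluster, and GCS path are all hypothetical placeholders.

```python
from datetime import datetime

# Hypothetical placeholders, not taken from the thread.
PROJECT_ID = "my-project"
REGION = "us-central1"
CLUSTER_NAME = "my-cluster"
MAIN_PY_URI = "gs://my-bucket/jobs/etl.py"


def pyspark_job(project_id, cluster_name, main_py_uri):
    """Build the job definition dict passed to the Dataproc operator."""
    return {
        "reference": {"project_id": project_id},
        "placement": {"cluster_name": cluster_name},
        "pyspark_job": {"main_python_file_uri": main_py_uri},
    }


def build_dag():
    """Create a daily DAG with a single Dataproc PySpark submission task."""
    # Imported here so the helper above stays usable without Airflow installed.
    from airflow import DAG
    from airflow.providers.google.cloud.operators.dataproc import (
        DataprocSubmitJobOperator,
    )

    with DAG(
        dag_id="dataproc_pyspark_daily",
        start_date=datetime(2017, 9, 18),
        schedule="@daily",  # assumption: Airflow 2.4+; older versions use schedule_interval
        catchup=False,
    ) as dag:
        DataprocSubmitJobOperator(
            task_id="submit_pyspark",
            project_id=PROJECT_ID,
            region=REGION,
            job=pyspark_job(PROJECT_ID, CLUSTER_NAME, MAIN_PY_URI),
        )
    return dag


# In a real DAG file, assign the result at module level so Airflow can find it:
# dag = build_dag()
```

Upstream tasks that stage data in GCS or downstream tasks that load results into BigQuery could be chained onto the same DAG with Airflow's usual dependency operators.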
