Hello,
Dataproc for Spark does not seem to be reliable at all! I am unable to create my Spark cluster with the correct number of executors.
1. Create a dataproc cluster:
$ gcloud beta dataproc clusters create my-dataproc
2. Follow the documentation to access the web UI [1]
3. Connect to the master node using gcloud
4. Run PySpark
$ pyspark
5. Check the number of executors on the Spark web UI
I got only 2 executors (screenshot 1). I should have 8 executors, as I have 2 worker nodes with 4 CPUs each.
6. After about 3 minutes, I got a lost executor error in my logs:
>>> 15/10/29 15:11:33 ERROR org.apache.spark.scheduler.cluster.YarnScheduler: Lost executor 1 on guillaume-dataproc-w-1.c.databerries-staging.internal: remote Rpc client disassociated
15/10/29 15:11:33 WARN akka.remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkE...@guillaume-dataproc-w-1.c.databerries-staging.internal:48238] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
7. If I check the number of executors in the web UI, only one executor remains.
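For reference, this is how I would have expected to pin the executor count explicitly when launching PySpark on YARN (a sketch only: the core/memory values are guesses for 4-CPU workers, and if Dataproc enables Spark dynamic allocation by default, --num-executors may be ignored unless dynamic allocation is turned off):

```shell
# Sketch: request 8 executors explicitly instead of relying on defaults.
# The executor-cores/executor-memory values are assumptions, not values
# taken from the Dataproc documentation.
pyspark \
  --num-executors 8 \
  --executor-cores 1 \
  --executor-memory 2g \
  --conf spark.dynamicAllocation.enabled=false
```

Even with something like this, I would still expect the default cluster to give me 8 executors out of the box.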
Screenshot 1

Cluster config:
