Executor also starting along with driver on Master Node - Any reason


Moses Sundheep

Mar 7, 2020, 11:21:19 AM3/7/20
to Google Cloud Dataproc Discussions
Hi Team,

I created a Dataproc cluster with the configuration below:

1 Master Node
3 Worker Nodes

When I started spark-shell on the master node, which runs in client mode, I expected only the driver on the master node. But I see an executor on the master node as well.

Is there any reason for this, and how can I fix it?

Please find the attached image.

Thanks in advance.

Regards,
Moses Palla

Dennis Huo

Mar 7, 2020, 1:18:46 PM3/7/20
to Google Cloud Dataproc Discussions
Is it possible you accidentally specified something like "--master local"? Or overrode the "spark.master" property to be "local" somewhere? Generally any settings related to "spark.master" should not be specified if you're running on Dataproc, as you should let Dataproc's default settings ensure it is pointed at YARN correctly.
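One quick way to confirm what master the shell actually resolved is to inspect it from inside spark-shell itself. A minimal sketch, assuming the default `sc` and `spark` objects that spark-shell creates:

```scala
// Inside spark-shell on the Dataproc master node.
// On a correctly configured Dataproc cluster this should print "yarn",
// not "local" or "local[*]".
println(sc.master)

// The same setting as recorded in the SparkConf.
println(sc.getConf.get("spark.master"))

// Deploy mode should be "client" when spark-shell is launched directly
// on the master node.
println(sc.deployMode)
```

If any of these print a `local` value, some configuration is overriding Dataproc's YARN defaults.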

Moses Sundheep

Mar 8, 2020, 9:24:40 AM3/8/20
to Google Cloud Dataproc Discussions
Hi Dennis,

Thank you for your reply.

I didn't change any property on Dataproc.
I launched a new cluster, ran spark-shell from the master node, and verified again. But I still see an executor running along with the driver on the master node.

I see the master is set to yarn; please find the output below. Can you guide me on how to set things up so that the master node only holds the driver program?


Spark context available as 'sc' (master = yarn, app id = application_1583671393942_0002).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.4
      /_/
         
Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_242)
Type in expressions to have them evaluated.
Type :help for more information.

scala> 

Dennis Huo

Mar 8, 2020, 3:21:45 PM3/8/20
to Google Cloud Dataproc Discussions
Where are you seeing the executor on the master node? Does "sudo jps" show a CoarseGrainedExecutorBackend running on the master?

Which Dataproc image version are you using?

Moses Sundheep

Mar 8, 2020, 3:58:24 PM3/8/20
to Google Cloud Dataproc Discussions
Hi Dennis,

I see the driver and an executor started on the master node under the application UI section. Please find the attached image.

I've also attached an image of the sudo jps output. I don't see any such process there.

I am using Cloud Dataproc version 1.3; an image of that is attached as well.
Master-Driver.JPG
15836971139468902485416246695118.jpg
15836974680595389375948331520219.jpg

karth...@google.com

Mar 8, 2020, 4:30:27 PM3/8/20
to Google Cloud Dataproc Discussions
Ah this is what I suspected from the beginning -- this is intended behavior. The Spark UI just visualizes the driver (your local Spark shell) as a special type of "executor". But the executors where tasks run are still on worker nodes.

You can verify this by running something like `sc.parallelize(1 until 10).map(x => java.net.InetAddress.getLocalHost().getHostName()).collect()`. You'll see that all the tasks ran on worker nodes (hostnames ending in -w-*).
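Another way to see why the driver shows up in the UI's executor list is to enumerate the block managers the SparkContext knows about. A sketch, assuming `getExecutorMemoryStatus` (whose keys are "host:port" strings and whose listing includes the driver itself):

```scala
// Keys are "host:port" for every block manager, including the driver's.
// On Dataproc you would expect the master (-m) host to appear once, for
// the driver, and the worker (-w-*) hosts for the actual executors.
sc.getExecutorMemoryStatus.keys.foreach(println)
```

The driver entry here is the "executor" you are seeing on the master node; no tasks are scheduled on it in YARN client mode.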

Moses Sundheep

Mar 11, 2020, 12:42:23 PM3/11/20
to Google Cloud Dataproc Discussions
Thank you, Karthik, for your valuable inputs. I can see the output showing the worker nodes.

Regards,
Moses Palla