Question about parallelism of saveAsNewAPIHadoopFile

chi zhang

Dec 21, 2013, 10:46:20 PM

to spark...@googlegroups.com

Hi,

I've recently been testing my Spark program, which has 4 RDDs holding different data. After a groupBy and some map operations on each RDD, I call saveAsNewAPIHadoopFile to save the data to HDFS.

In the Spark web UI, I saw that the groupBy and saveAsNewAPIHadoopFile operations were processed one by one. Why don't these 4 saveAsNewAPIHadoopFile operations run concurrently?