Very good insight! But I have a somewhat different view on this. :-D
First, about the memory pressure I mentioned in the last email.
Fault tolerance is a basic feature of RDDs, and if we think a bit deeper about it, there are two properties that support that fault tolerance: the DAG and immutability. The DAG means that at the map stage we chain a lot of work together, such as map(function1).map(function2).flatMap(function3). Yes, this leads to a lot of CPU usage, and that is what we all see: Spark is CPU-bound. But on the other hand, because RDDs are immutable, every transformation needs new space for new objects, and that leads to a lot of memory pressure and extra CPU usage.
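To make that concrete, here is a minimal sketch of the kind of chained job I mean (a made-up local example, not any real workload; the functions just stand in for function1/function2/function3). Each call returns a new immutable RDD node in the DAG, and at execution time each step produces new objects per record:

import org.apache.spark.{SparkConf, SparkContext}

object ImmutabilityPressureSketch {
  def main(args: Array[String]): Unit = {
    // Assumed local setup, only for illustration.
    val sc = new SparkContext(
      new SparkConf().setAppName("immutability-pressure-sketch").setMaster("local[*]"))

    val base = sc.parallelize(1 to 1000000)

    // Each transformation below builds a new (immutable) RDD in the DAG;
    // nothing mutates `base` in place. When the action runs, every record
    // passes through each step and is re-created as a new object, which is
    // where the extra allocation/GC pressure on top of the CPU work comes from.
    val result = base
      .map(x => x + 1)            // stands in for map(function1)
      .map(x => x * 2)            // stands in for map(function2)
      .flatMap(x => Seq(x, -x))   // stands in for flatMap(function3)

    println(result.count())       // the action that actually triggers the work
    sc.stop()
  }
}

Of course Spark pipelines these narrow transformations within a stage, so it does not keep three full copies of the data around, but the per-record object churn at each step is still there.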
I know Spark is CPU-bound; that is what the published papers say, and in my experiments it holds true. What I worry about is that being CPU-bound is just a surface phenomenon. If we compare Spark with other frameworks, its compute efficiency is not very high, and I do not believe a JVM application cannot achieve better compute efficiency than that. In other words, if Spark really is CPU-bound, then accelerating Spark with GPUs should give a boost even for batch processing, so I really want to know whether that is true (there is a paper about this:
https://spark-summit.org/2015-east/wp-content/uploads/2015/03/SSE15-28-Peilong-Li-Yan-Luo.pdf)
Unfortunately, I do not have enough data to support my view, and this is just my speculation. Maybe I am wrong, and Spark really is just CPU-bound :-D
Second, let us talk about Sparrow.
I agree with you; I cannot find much difference between Sparrow and Mesos/YARN. They are all decentralized, and that is how Sparrow outperforms other scheduling systems. Maybe Sparrow cares more about scheduling and less about resource management?
What we can do with Sparrow, we can also do with Mesos/YARN, such as running multiple drivers in parallel. No offense intended here :-D
Thanks!