Why doesn't Spark add an option to support Sparrow?


Lee hu

Oct 21, 2015, 2:28:46 PM
to Sparrow Users
Since Sparrow is very efficient at scheduling large numbers of tasks, why doesn't Spark add an option to support it? I ran some experiments on Spark 1.5 with 816 cores and found that the maximum scheduling delay can reach 0.5 s, which is a very high overhead for short-duration tasks. (I use standalone mode for Spark.)
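For reference, this is roughly how I measured it; a minimal sketch, assuming a standalone master at spark://master:7077 (the master address and the 10,000-task job are just placeholders), using a SparkListener to approximate per-task scheduler delay the way the web UI derives it:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

    // Hypothetical standalone master; adjust to your own cluster.
    val sc = new SparkContext(new SparkConf()
      .setAppName("scheduling-delay-probe")
      .setMaster("spark://master:7077"))

    var maxDelayMs = 0L
    sc.addSparkListener(new SparkListener {
      override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
        val info = taskEnd.taskInfo
        val m = taskEnd.taskMetrics
        if (m != null) {
          // Roughly what the web UI reports as "scheduler delay": total task
          // time minus execution time and (de)serialization time.
          val delay = info.duration - m.executorRunTime -
            m.executorDeserializeTime - m.resultSerializationTime
          maxDelayMs = math.max(maxDelayMs, delay)
        }
      }
    })

    // 10,000 trivial tasks, so the measured latency is dominated by scheduling
    // and launch overhead rather than by real work.
    sc.parallelize(1 to 10000, numSlices = 10000).map(_ => 1).count()

    // The listener bus is asynchronous; give it a moment to drain before reading.
    Thread.sleep(2000)
    println(s"max per-task scheduler delay: $maxDelayMs ms")
    sc.stop()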

Thanks!

Kay Ousterhout

Oct 21, 2015, 3:04:57 PM
to sparrow-sch...@googlegroups.com
Unfortunately there are also a few practical problems with using Sparrow with Spark.  Sparrow distributes scheduling over many Sparrow schedulers that are each associated with their own Spark driver (this is where Sparrow's improvements stem from -- there's no longer a single driver serving as the bottleneck for your application, but all of the schedulers/drivers share the same slots for scheduling tasks).  As a result, data stored in Spark's block manager on one Spark driver (and created as part of a job scheduled by the associated Sparrow scheduler) cannot be accessed by other Spark drivers.

We've also found that many people have scheduling issues when they're scheduling a single large job, whereas Sparrow targets a use case where users are scheduling many different jobs.  We're doing some ongoing work at Berkeley now to try to alleviate the scheduling bottleneck in Spark when running a single large job.

-Kay

On Wed, Oct 21, 2015 at 11:28 AM, Lee hu <lih...@gmail.com> wrote:
Since Sparrow is very efficient at scheduling large numbers of tasks, why doesn't Spark add an option to support it? I ran some experiments on Spark 1.5 with 816 cores and found that the maximum scheduling delay can reach 0.5 s, which is a very high overhead for short-duration tasks. (I use standalone mode for Spark.)

Thanks!


lihu

Oct 21, 2015, 4:12:52 PM
to sparrow-sch...@googlegroups.com
Thanks for your reply! I understand it now: if we are running different jobs, we can use YARN or Mesos.

But since Spark can serve as a parallel query engine for many users, could we use an external meta store to hold the block info? Then the block info could be shared by different drivers, which might solve the practical issue for Sparrow.




Kay Ousterhout

Oct 21, 2015, 5:04:40 PM
to sparrow-sch...@googlegroups.com
Yes, if there were an external meta store, then Sparrow could be used! One indirect way to do this is to store data in a caching layer like Tachyon that can be accessed by different Spark contexts.
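For example, something along these lines might work; a minimal sketch assuming a Tachyon master at tachyon://tachyon-master:19998 and the Tachyon client jar on Spark's classpath (both are placeholders, not a tested setup):

    import org.apache.spark.{SparkConf, SparkContext}

    // Driver A: materialize a dataset into Tachyon so other drivers can read it.
    val scA = new SparkContext(new SparkConf().setAppName("producer"))
    scA.parallelize(1 to 1000000)
       .map(i => (i, i * i))
       .saveAsObjectFile("tachyon://tachyon-master:19998/shared/squares")
    scA.stop()

    // Driver B: a completely separate SparkContext (in the hypothetical setup,
    // behind its own Sparrow scheduler and normally a separate driver program)
    // reads the same data back, since Tachyon exposes a Hadoop-compatible
    // file system.
    val scB = new SparkContext(new SparkConf().setAppName("consumer"))
    val squares = scB.objectFile[(Int, Int)]("tachyon://tachyon-master:19998/shared/squares")
    println(squares.count())
    scB.stop()

Persisting with StorageLevel.OFF_HEAP also puts blocks in Tachyon, but as far as I know those blocks are still tracked by the owning application's block manager, so going through the tachyon:// file path is the simpler way to share data across drivers.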

Lou

Oct 21, 2015, 6:18:43 PM
to Sparrow Users
> I ran some experiments on Spark 1.5 with 816 cores and found that the maximum scheduling delay can reach 0.5 s, which is a very high overhead for short-duration tasks.
> (I use standalone mode for Spark.)

I think this also depends on the number of jobs users submit per second. For example, with more than 850 queries submitted, job response time (in other words, job completion time) increases dramatically under the Spark 0.7 standalone cluster scheduler (as shown in the Sparrow paper, and on our testbed running in a small-scale cloud stack). This might also be true for Spark 1.5.1 (correct me if I'm wrong).

Moreover, Sparrow is very good at scheduling short-lived jobs (picture many sparrows eating small pieces of cheese without looking at each other): low scheduling latency, high system throughput, fault tolerance, and high availability, all granted by the decentralized approach. On the other hand, since each per-job Sparrow scheduler is stateless, owing to the lightweight scheduling algorithm it adopts, some features are simply gone, including gang scheduling, which may not be a concern if you don't need it.

More interestingly, if Mesos becomes an HCOS, then Sparrow may find her way to fly there too, again flying with Spark 1.5.1+, which she has always been intimately tied to.

No offense intended, if any of this sounds that way.

Best regards,
Lou

lihu

Oct 21, 2015, 10:05:21 PM
to sparrow-sch...@googlegroups.com
Yes, you are right. Since Spark is designed with fault tolerance as a first-class feature, a Spark cluster should have hundreds or thousands of machines (with only tens of machines we would not need fault tolerance, given how much memory pressure RDDs bring), so we should think carefully about the overhead of launching lots of tasks in parallel, and the centralized driver limits this. Maybe we can ignore the scheduling delay and other overhead for batch processing, but as Spark is memory-based and Spark SQL becomes more and more powerful, maybe we should consider how to reduce the overhead for query tasks or short-duration tasks.
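Today, with the single centralized driver, roughly the available approach is to share one long-lived SparkContext across users with fair scheduling pools, which at least keeps short queries from waiting behind large batch jobs; a minimal sketch (the user names and job sizes are placeholders, not from a real workload):

    import org.apache.spark.{SparkConf, SparkContext}

    // One long-lived driver shared by many users, with fair scheduling so that
    // small interactive queries are not stuck behind large batch jobs.
    val sc = new SparkContext(new SparkConf()
      .setAppName("shared-query-driver")
      .set("spark.scheduler.mode", "FAIR"))

    def runQuery(user: String): Unit = new Thread(new Runnable {
      override def run(): Unit = {
        // Local properties are thread-local, so each user's jobs land in
        // their own fair-scheduler pool.
        sc.setLocalProperty("spark.scheduler.pool", user)
        val n = sc.parallelize(1 to 100000, 100).filter(_ % 7 == 0).count()
        println(s"$user: $n")
      }
    }).start()

    runQuery("alice")
    runQuery("bob")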

This may be interesting, thanks!







Lou

Oct 22, 2015, 5:58:58 PM
to Sparrow Users
My two (more) cents here. ;)

> maybe we should consider how to reduce the overhead for query tasks or short-duration tasks.

Note that, based on one recent piece of work in which a few Spark workloads were evaluated, CPU utilization is reported to have become the bottleneck for reducing job completion time (under the premise that this observation, drawn from some simple descriptive statistics, holds up), so a possible solution might land very soon.


> with only tens of machines we would not need fault tolerance, given how much memory pressure RDDs bring

My understanding of Spark's fault tolerance is this: Spark mainly relies on RDDs, which can be stored either in memory or on disk, so fault tolerance exists to avoid data loss, and it is necessary to compete with MapReduce and related models. I stress that I might be wrong on this one, because I am neither a Spark guy nor a Flink buddy. ;)

Taking your example of a large-scale cluster of hundreds or even thousands of (virtual) machines with Spark installed, Mesos would usually take on the job of matching framework scheduler demands with resource offers from the available slaves. The scheduling and scalability bottleneck thus shifts to Mesos, since it adopts a centralized resource allocator. Then again, a question can always be argued from both sides: one may consider partitioning the cluster into small cells or mini-clusters, in each of which Mesos runs as the cluster OS, coordinating various framework schedulers. In that setup, Spark serves as a framework scheduler handling in-memory data processing workloads, while Marathon is good at handling long-lived workloads.

If you would like to stay in fashion, just try Mesos + Kubernetes, as in "it is a Google thing".

Last but not least, since this is Sparrow's territory, and I speak for nobody, my understanding of Spark and Sparrow is this: flying together, there is very little space for Sparrow to grow, since the two are essentially two sides of the same coin. Looking at the impressive achievements Sparrow has made, with many citations and a standing as the popular decentralized framework that others are often compared against, it is a great success in itself, literally.

What else? One day, Sparrow might be taken to the next level, not only by being reborn, drawing inspiration from the many pieces of work that came after her, but also for a simple reason: lightning could strike!

All the best,
Lou

lihu

Oct 22, 2015, 8:35:35 PM
to sparrow-sch...@googlegroups.com, ylu...@gmail.com
Very good insight! But I have a somewhat different view on this. :-D


First, about the memory pressure I mentioned in my last email.

Fault tolerance is the basic feature of RDDs. If we think more deeply about it, we find two properties that support fault tolerance: the DAG and immutability. The DAG means that within a map stage we do a lot of work, such as map(function1).map(function2).flatMap(function3). Yes, this leads to a lot of CPU usage, and to what we all see: Spark is CPU-bound. On the other hand, because RDDs are immutable, every transformation needs new space for new objects, which leads to a lot of memory pressure and CPU usage.
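To make the immutability point concrete, here is a toy illustration (Record, records, and the transformations are made up, not from a real workload):

    import org.apache.spark.{SparkConf, SparkContext}

    case class Record(a: Int, b: Int)

    val sc = new SparkContext(new SparkConf().setAppName("immutability-toy"))
    val records = sc.parallelize(1 to 1000000).map(i => Record(i, i))

    // Each transformation below produces a brand-new Record per input element,
    // because RDD elements are treated as immutable; nothing is updated in place.
    // Spark pipelines these narrow transformations within one stage, so the cost
    // is not extra passes over the data but extra allocation and GC pressure.
    val step1 = records.map(r => Record(r.a + 1, r.b))
    val step2 = step1.map(r => Record(r.a, r.b * 2))
    val step3 = step2.flatMap(r => Seq(Record(r.a, r.b), Record(r.b, r.a)))
    println(step3.count())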

I know Spark is CPU-bound; that is what the published paper reports, and in my experiments it is true. What I worry about is that CPU-bound is just a surface phenomenon. If we compare Spark with other frameworks, its compute efficiency is not very high, and I do not believe the compute efficiency of a JVM application is high enough. In other words, if Spark is truly CPU-bound, then using GPUs to accelerate Spark should give a boost even for batch processing, so I really want to know whether this is true. (There is a paper about this: https://spark-summit.org/2015-east/wp-content/uploads/2015/03/SSE15-28-Peilong-Li-Yan-Luo.pdf)

Unfortunately, I do not have enough data to support my view, and this is just my speculation. Maybe I am wrong, and Spark is simply CPU-bound. :-D


Second, let us talk about Sparrow.

I agree with you; I cannot find much difference between Sparrow and Mesos/YARN. They are all decentralized, and this is how Sparrow outperforms other scheduling systems; maybe Sparrow cares more about scheduling and less about resource management?

What we can do with Sparrow, we can also do with Mesos/YARN, such as using multiple drivers in parallel. No offense here. :-D


Thanks!








