Hi Lou,
It's hard to give an exact answer to your first question because integrating Sparrow with Spark was intimately tied to the development of Sparrow in general. We ended up essentially doing the Spark integration twice: the first integration happened around Spring 2012, and then we had to do a near-complete re-write in August 2013 because the Spark scheduling code had changed significantly. Patrick Wendell did the first integration and I did the second one; we each essentially started from scratch in terms of our Spark knowledge (this would be different now!). I remember it taking a month or so the first time and less (more like a week) the second time, because having the old code was a useful reference. In the end, the current Sparrow branch (https://github.com/kayousterhout/spark/commits/sparrow) reflects 326 added lines of code compared to the base Spark version (plus the ThroughputTester file, which we wrote to compare against the default Spark scheduler). The current Sparrow branch is not feature-complete with the default Spark scheduler (for example, it doesn't currently support the UI -- which didn't exist when we started!), so getting there would require additional work.
You're right that gang scheduling is not supported by Sparrow, and it would be difficult to support given Sparrow's decentralized architecture. Gang scheduling is not needed by Spark (since synchronization happens between stages and not within a stage) and is also not supported by many other cluster schedulers. There's a brief discussion of this on page 14 of our SOSP paper.
For inter-job dependencies, some of these could be supported by Sparrow. For example, one inter-job dependency I've heard of folks wanting is "job X should never be scheduled on the same machine as job Y". Sparrow could support this by adding some logic to the probe responses so that frontends learn what else is running on each machine. Sparrow's late binding approach ends up looking a lot like Mesos, where scheduler frontends get "offers" from workers (in response to probes) describing available resources, and the scheduler frontend can accept or reject those resources. It's possible to implement a fairly broad set of policies in this model, which the Mesos paper discusses. Supporting this would require adding some logic in Sparrow to avoid sending multiple offers from one worker at the same time (which is possible now but would lead to race conditions with inter-job policies) and also adding logic to expose more information about the jobs currently running on a particular machine. We have generally been surprised at how many policies can be implemented using Sparrow! I'm curious to hear if there are other policies you're thinking of that seem difficult or impossible to support.
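
To make that concrete, here's a rough sketch in Scala of what the frontend-side policy check could look like in an offer-style model. All of the names here (Offer, AntiAffinityPolicy, chooseWorker, etc.) are made up for illustration -- this isn't Sparrow's actual API, and it assumes probe responses were extended to include the set of jobs running on each worker:

    // Hypothetical offer a worker sends back in response to a probe,
    // extended to report which jobs it is currently running.
    case class Offer(workerId: String, freeCores: Int, runningJobs: Set[String])

    // Hypothetical policy: jobs listed for a given jobId must never
    // share a machine with that job.
    class AntiAffinityPolicy(conflicts: Map[String, Set[String]]) {
      def conflictsWith(jobId: String): Set[String] =
        conflicts.getOrElse(jobId, Set.empty[String])
    }

    object FrontendPlacement {
      // Accept the first offer whose worker has free resources and runs no
      // conflicting job; otherwise the frontend would re-probe or queue the task.
      def chooseWorker(jobId: String,
                       offers: Seq[Offer],
                       policy: AntiAffinityPolicy): Option[String] = {
        val banned = policy.conflictsWith(jobId)
        offers.find(o => o.freeCores > 0 && o.runningJobs.intersect(banned).isEmpty)
              .map(_.workerId)
      }
    }

    // Example: job "X" must never share a machine with job "Y".
    object Example extends App {
      val policy = new AntiAffinityPolicy(Map("X" -> Set("Y")))
      val offers = Seq(
        Offer("worker-1", freeCores = 2, runningJobs = Set("Y")),  // rejected: runs Y
        Offer("worker-2", freeCores = 4, runningJobs = Set("Z"))   // accepted
      )
      println(FrontendPlacement.chooseWorker("X", offers, policy)) // Some(worker-2)
    }

The tricky part, as mentioned above, is making sure a worker doesn't have multiple outstanding offers at once; otherwise two frontends could each decide a machine is conflict-free and place conflicting jobs on it.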
I hope this is helpful. Let me know if you have further questions!
-Kay