Spark Job Server


Evan Chan

Jul 15, 2013, 7:30:19 PM
to spark-de...@googlegroups.com
Hey guys,

As some of you know, here at Ooyala we are working on a general-purpose REST Job Server for Spark, and we very much intend to contribute it back to Spark.
We would like to share what we have done so far and gather feedback on its architecture and design. In addition, there are questions such as:
- Should it be contributed back as a separate project within Spark itself, or would it be better if we just open sourced it independently?

Would it be better to share the design / overview here as an email?  Should I create a ticket?

Overall, the job server is intended to help with running jobs
- in standalone, Mesos, or YARN modes
- as isolated jobs or as jobs sharing cached datasets in a context
- with job statuses and history persisted and queryable

Looking forward to the discussion,
Evan

Matei Zaharia

Jul 15, 2013, 8:44:09 PM
to spark-de...@googlegroups.com
Hey Evan,

It might be easiest to open a JIRA for this if you already have a design doc. JIRA makes it easy to attach files and to track the discussion over time.

In any case though, this would be cool to have. Where to include it is hard to tell without seeing the design in more detail. It might be nice to make it part of Spark as a whole (and for example have it on by default in the standalone mode), unless it has some things that seem like they'll change quickly over time or otherwise seems hard to develop together with Spark.

Matei

--
You received this message because you are subscribed to the Google Groups "Spark Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to spark-develope...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Jason Dai

Jul 15, 2013, 9:33:26 PM
to spark-de...@googlegroups.com
Hi Evan,

This is very cool. We are planning to deploy Spark for multiple users and concurrent jobs, and will be very interested in such a job server. As Matei mentioned, a design doc in JIRA is a good starting point; in addition, if you have code/prototype that you can share, we are happy to hack it around :-)

Thanks,
-Jason



Evan Chan

Jul 16, 2013, 2:24:00 AM
to spark-de...@googlegroups.com
It wouldn't be too hard for us to share the code, since we have mostly maintained it as something we plan to contribute... but let me start with a JIRA ticket. Also, we're about to embark on a redesign, so it's a good time to share our design with the community.

-Evan



You received this message because you are subscribed to a topic in the Google Groups "Spark Developers" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/spark-developers/NPw9Kk-2wk4/unsubscribe.
To unsubscribe from this group and all its topics, send an email to spark-develope...@googlegroups.com.

For more options, visit https://groups.google.com/groups/opt_out.
 
 



--
Evan Chan
Staff Engineer
e...@ooyala.com


Henry Saputra

Jul 16, 2013, 4:13:15 PM
to spark-de...@googlegroups.com
Hi Evan, +1 for this effort.

I would love to contribute and help with this feature in any way.


- Henry



Evan Chan

Jul 17, 2013, 7:25:32 PM
to spark-de...@googlegroups.com
Hey guys,

I created a JIRA ticket and attached a very high-level features / architecture / API document.

We are in the middle of redesigning the system and should have more documents / detailed dataflows to attach soon, but please have a look and start commenting.  

thanks,
Evan



 
 

Evan Chan

Jul 24, 2013, 7:31:13 PM
to spark-de...@googlegroups.com
I received one comment... anybody else? It'd be great to hear feedback on the overall architecture, whether this could be a Spark project itself, etc.

I will start a separate email discussion on a particular aspect of the job server that I'd like feedback on.

thanks,
Evan

MLnick

Aug 8, 2013, 10:12:31 AM
to spark-de...@googlegroups.com
Hi Evan

This sounds like something very useful and something that I need right now. I have a workflow where my front-end servers will periodically kick off jobs on the compute cluster in a scheduled manner (and possibly low-latency jobs in the future).

The basic architecture looks good overall. 

I started a basic version of the same idea just for my own narrow use case and ran into issues that appear to be conflicts between Jetty / Netty versions in the web server (I am using Scalatra) and Spark.

How did you approach this? Did you just use the same versions as Spark? Spray? etc etc?

Any idea when this might be ready to release?

Thanks
Nick

Evan Chan

Aug 9, 2013, 4:09:09 AM
to spark-de...@googlegroups.com
Hi Nick,

I responded on the ticket itself, but we are planning to start contributing it back in bits and pieces starting in about a week, so we can get feedback.

-Evan

Henry Saputra

Aug 9, 2013, 2:56:57 PM
to spark-de...@googlegroups.com
Hi Evan, will you guys be doing it in a separate branch/repo, or will it be a new module in Spark?

Would love to try and help around when we can.

- Henry

Evan Chan

Aug 9, 2013, 6:40:41 PM
to spark-de...@googlegroups.com
Hi Henry, we are planning for it to be in a separate repo.

-Evan

Evan Chan

Aug 9, 2013, 6:40:54 PM
to spark-de...@googlegroups.com
Sorry, I meant a separate module.