Batch recovery with multiple Livy servers!


Meisam

Oct 20, 2016, 2:45:42 PM
to Livy Development
LIVY-211 adds support for batch recovery, but only with one Livy server. We extended LIVY-211 and added support for multiple Livy servers. We would like to gauge the community's interest in this feature and contribute our changes back.

This is a short description of how the patch works.
Multiple Livy servers coordinate through the Apache Curator framework and ZooKeeper.
When one of the servers updates a batch session's metadata, ZooKeeper propagates the change to all Livy servers.
We only support recovery in YARN mode and with ZooKeeper as the meta-store (we do not support a file-based meta-store).
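
To illustrate the propagation described above, here is a minimal sketch. It is not taken from the patch: `MetaStore` is a hypothetical in-memory stand-in for ZooKeeper, `LivyServer` is a toy model of a server, and the `/livy/batches/<id>` path layout is an assumption made for the example.

```python
from typing import Callable, Dict, List


class MetaStore:
    """Hypothetical in-memory stand-in for a ZooKeeper-backed meta-store."""

    def __init__(self) -> None:
        self._data: Dict[str, dict] = {}
        self._watchers: List[Callable[[str, dict], None]] = []

    def watch(self, callback: Callable[[str, dict], None]) -> None:
        """Register a watcher and replay existing entries to it."""
        self._watchers.append(callback)
        for path, meta in self._data.items():
            callback(path, dict(meta))

    def set(self, path: str, meta: dict) -> None:
        """Store the metadata and notify every registered watcher."""
        self._data[path] = dict(meta)
        for callback in self._watchers:
            callback(path, dict(meta))


class LivyServer:
    """Toy server that mirrors the shared meta-store in a local cache."""

    def __init__(self, name: str, store: MetaStore) -> None:
        self.name = name
        self.store = store
        self.sessions: Dict[str, dict] = {}
        store.watch(self._on_change)

    def _on_change(self, path: str, meta: dict) -> None:
        self.sessions[path] = meta

    def update_batch(self, batch_id: int, state: str, app_id: str) -> None:
        # The path layout is an assumption, not the layout used by the patch.
        self.store.set(f"/livy/batches/{batch_id}",
                       {"state": state, "appId": app_id})


store = MetaStore()
server_a = LivyServer("livy-a", store)
server_b = LivyServer("livy-b", store)

# server_a updates a batch session; server_b sees the change via its watcher.
server_a.update_batch(0, "running", "application_0001")

# A server that starts later recovers the session from the shared store.
server_c = LivyServer("livy-c", store)

print(server_b.sessions["/livy/batches/0"]["state"])
print(server_c.sessions["/livy/batches/0"]["appId"])
```

In this toy model every server ends up with the same view because the store pushes each write to all watchers. A real ZooKeeper deployment delivers notifications asynchronously, which the sketch does not model.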

We are also working on support for interactive session recovery with multiple Livy servers (based on LIVY-212). We would like to contribute that feature as well.

Thanks,
Fathi Salmi, Meisam
Software Engineer at PayPal

Alex Man

Oct 20, 2016, 4:28:03 PM
to Meisam, Livy Development

This sounds so cool. Your work could be extended to support HA!!

 

I’m curious and have some questions about your approach:

Do you have a GitHub link to your work?

Is there a master/slave hierarchy among the Livy servers?

Does each Livy server poll YARN individually? Would different Livy servers return different states because of differences in the timing of polling?

 

Thanks!

Alex

--
You received this message because you are subscribed to the Google Groups "Livy Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to livy-dev+u...@cloudera.org.

Meisam Fathi

Oct 20, 2016, 4:58:14 PM
to Alex Man, Livy Development
Hi Alex!

Please see the inline comments.

Do you have a GitHub link to your work?


I have a clone on my corporate GitHub account. If the community is interested, I can clean up our commits and push them to my public GitHub account.
 

Is there a master/slave hierarchy among the Livy servers?


There is no master/slave hierarchy. Users can access sessions through any Livy server.
 

Does each Livy server poll YARN individually? Would different Livy servers return different states because of differences in the timing of polling?

 
No, they don't. To keep the Livy servers consistent, we use the Path Cache recipe from Apache Curator (https://curator.apache.org/curator-recipes/path-cache.html). Only one server polls YARN; once it has the app info, it updates ZooKeeper, and the other servers get the YARN info from ZooKeeper.
There is still a chance that servers see inconsistent states, but that would not be due to the timing of polling.
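
The single-poller arrangement can be sketched roughly as follows. This is an illustration only, not the patch's code: `FakeYarn` and `SharedStore` are hypothetical in-memory stand-ins for the YARN ResourceManager and ZooKeeper, and the `is_poller` flag is an assumption, since how the polling server is chosen is not described here.

```python
from typing import Dict, Optional


class FakeYarn:
    """Hypothetical stand-in for YARN that counts how often it is polled."""

    def __init__(self, states: Dict[str, str]) -> None:
        self.states = states
        self.poll_count = 0

    def app_state(self, app_id: str) -> str:
        self.poll_count += 1
        return self.states[app_id]


class SharedStore:
    """Hypothetical in-memory stand-in for ZooKeeper nodes."""

    def __init__(self) -> None:
        self.nodes: Dict[str, str] = {}

    def put(self, app_id: str, state: str) -> None:
        self.nodes[app_id] = state

    def get(self, app_id: str) -> Optional[str]:
        return self.nodes.get(app_id)


class Server:
    """Toy Livy server; only the designated poller ever talks to YARN."""

    def __init__(self, name: str, store: SharedStore, yarn: FakeYarn,
                 is_poller: bool = False) -> None:
        self.name = name
        self.store = store
        self.yarn = yarn
        self.is_poller = is_poller  # assumption: chosen by some external means

    def refresh(self, app_id: str) -> None:
        # Non-pollers do nothing here; they rely on the shared store.
        if self.is_poller:
            self.store.put(app_id, self.yarn.app_state(app_id))

    def get_state(self, app_id: str) -> Optional[str]:
        # Every server answers from the shared store, never from YARN directly.
        return self.store.get(app_id)


yarn = FakeYarn({"application_0001": "RUNNING"})
store = SharedStore()
servers = [
    Server("livy-0", store, yarn, is_poller=True),
    Server("livy-1", store, yarn),
    Server("livy-2", store, yarn),
]

for server in servers:
    server.refresh("application_0001")

states = [server.get_state("application_0001") for server in servers]
print(states, yarn.poll_count)
```

Because every server reads the same stored value, the answers agree regardless of when each server is queried, and YARN is polled once rather than once per server. The sketch reads the store synchronously, so it does not reproduce the remaining window of inconsistency mentioned above.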

Thanks,
Meisam

Meisam Fathi

Oct 20, 2016, 5:38:52 PM
to Alex Man, Livy Development
Here is a JIRA ticket to track the discussion for this feature:

Thanks,
Meisam

Saisai Shao

Oct 26, 2016, 5:37:04 AM
to Meisam Fathi, Alex Man, Livy Development
This is a master-master solution, as opposed to the master-slave architecture Alex mentioned. My concern is whether this architecture will easily lead to inconsistent state between Livy servers. I'm also wondering whether it will introduce heavy communication overhead between servers, or between the servers and ZooKeeper.

This is definitely an option for Livy HA. I think we need to discuss the pros and cons of the different solutions thoroughly, since this is an important feature.

Let's discuss this in the JIRA. Btw, do you have a design doc for this?

Thanks
Jerry



Lin Chan

Oct 26, 2016, 11:08:33 AM
to Saisai Shao, Meisam Fathi, Alex Man, Livy Development

I second Jerry’s proposal to have a deeper discussion on this. It would be great if you could provide an initial design doc so that everyone can chip in on the JIRA.

 

Thanks

Lin


Meisam Fathi

Oct 26, 2016, 2:41:17 PM
to Lin Chan, Saisai Shao, Alex Man, Livy Development
I updated the JIRA ticket with a link to a design doc.

Thanks,
Meisam