Chinmay Soman
Oct 11, 2012, 12:08:18 AM
to project-...@googlegroups.com
The problem:
The DefaultStoreClient currently in use in production at LinkedIn is the source of a number of issues, enumerated below:
* The cluster topology and store metadata have to be synchronized on the client side. This makes bootstrapping (the process by which the client obtains this metadata from a random server node) an expensive operation. It also adds the burden of fetching the latest changes, or bouncing the client, whenever this metadata changes.
* Complicated connection management, which appears to have operational and performance implications.
* The existing client handles all replication (including cross-zone replication), which is inefficient.
* The client design is pretty complicated, with multiple levels of nesting.
* There are far too many client-side config parameters (some of them confusing), which creates operational overhead.
Although some of these issues can be fixed, doing so requires rolling a new release out to every client. For instance, we recently implemented an auto-bootstrapper mechanism that tracks changes to cluster.xml on the cluster and automatically re-bootstraps the client, which helps with cluster maintenance. However, to reap the benefits, every client using the cluster must pick up this new release, which is a very costly operation when a large number of clients are involved.
The proposal:
Refactor most of the Voldemort logic out of the client and create a new entity called the 'Coordinator'. Essentially we're trying to make the client more lightweight to address most of the issues discussed above.
A high level description of how it will work is as follows:
- We can deploy a set of N coordinators in front of M server nodes (the ratio of N:M needs to be worked out).
- The coordinator deployment is symmetric, in the sense that each coordinator can satisfy any Voldemort request (get/put/getall/...).
- The coordinator is stateless (across requests) and supports a REST interface.
- When it receives a Voldemort request (get/put/...), it takes care of routing the request to the required server nodes as specified by the store properties, aggregating the results, and returning an HTTP response.
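To make the routing step concrete, here is a minimal sketch of what the coordinator's replica selection might look like, assuming the usual consistent-hashing scheme over a partition ring (all names here are hypothetical illustrations, not the actual Voldemort code path):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch: hash the key onto the partition ring, then walk
// the ring forward to collect the preference list of distinct server
// nodes, up to the store's replication factor.
public class RoutingSketch {

    // partitionToNode[p] = id of the server node owning partition p
    static List<Integer> preferenceList(byte[] key, int[] partitionToNode,
                                        int replicationFactor) {
        int numPartitions = partitionToNode.length;
        int master = Math.floorMod(Arrays.hashCode(key), numPartitions);
        List<Integer> nodes = new ArrayList<>();
        for (int i = 0; i < numPartitions && nodes.size() < replicationFactor; i++) {
            int node = partitionToNode[(master + i) % numPartitions];
            if (!nodes.contains(node)) { // skip partitions owned by an already-chosen node
                nodes.add(node);
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        // 8 partitions spread over 4 nodes, replication factor 3
        int[] ring = {0, 1, 2, 3, 0, 1, 2, 3};
        System.out.println(preferenceList("hello".getBytes(), ring, 3));
    }
}
```

The point of moving this logic into the coordinator is exactly that sketches like this can be fixed and redeployed without touching any client.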
The coordinator still needs to obtain the cluster.xml and stores.xml, but the point is that it's within the control of the administrator (thus, fetching/bouncing is really easy). In addition, all bug fixes can be deployed quickly for the same reason. In summary, this is a huge operational benefit. Please note that subsequent bug fixes would also be made available in the existing thick client (same code path).
Open issue:
The main issue now is how to route a request from the thin client to the coordinator. The ideal solution should take care of the following:
- Fair load balancing of the incoming REST requests amongst the N coordinators
- The ability to dynamically add / remove / change coordinator nodes (with minimal impact on the clients)
- Minimal extra overhead (since every request will route through this layer)
- High availability
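As one illustration of the simplest possible option (and only that; it does not solve dynamic membership or health checking on its own), a thin client could keep a static list of coordinator URLs and round-robin requests across them. All names below are hypothetical:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of client-side round-robin over a fixed set of coordinators.
// Dynamic add/remove and failure detection would still need an external
// mechanism (e.g. a coordination service or a VIP in front); this only
// shows the load-spreading part.
public class CoordinatorPicker {
    private final List<String> coordinators;
    private final AtomicInteger next = new AtomicInteger(0);

    public CoordinatorPicker(List<String> coordinators) {
        this.coordinators = coordinators;
    }

    public String pick() {
        // floorMod keeps the index non-negative even after int overflow
        int i = Math.floorMod(next.getAndIncrement(), coordinators.size());
        return coordinators.get(i);
    }

    public static void main(String[] args) {
        CoordinatorPicker picker = new CoordinatorPicker(List.of(
            "http://coord1:8080", "http://coord2:8080", "http://coord3:8080"));
        for (int i = 0; i < 4; i++) {
            System.out.println(picker.pick());
        }
    }
}
```

The trade-off is that the coordinator list is baked into the client, which reintroduces the "bounce all clients on topology change" problem the proposal is trying to avoid; hence the question below.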
At LinkedIn, we have a proprietary solution that addresses these requirements. Unfortunately, it is not open sourced (yet). So we're opening the question to the community: what open source options might address the requirements above?
We will document the detailed design for the coordinator and thin client in a separate wiki. This post is just to get some initial feedback from the community regarding the long-term routing logic between the thin client and the coordinator. If you have any ideas, please share them with us!
Voldemort team