Regarding newbie tasks / projects

omkar patil

Apr 9, 2017, 11:16:09 PM

to project-voldemort

Hey!

My name is Omkar, and I am currently doing my Master's at UCLA.

Recently, while looking for cool distributed database projects to study or work on, I came across Voldemort.

I want to study the project and make valuable contributions to it, but I wasn't able to follow the messages on the mailing list,
and the beginner projects on the GitHub page seem outdated.
So I am not sure where to start, given the current state of the project.

It would be great if you guys could provide any little direction or keyword that I could start from.

I understand that I might be bothering you all with such trivialities and I apologise for the same. Hope to hear from you guys soon!! Have a great day!

Félix GV

Apr 10, 2017, 12:00:42 AM

to project-voldemort

Hi Omkar,

You are totally right that the beginner projects on GitHub are outdated by now...

We could suggest some things, but it would help to know what level of workload you are interested in. Is it a few hours a week for one semester? Or something bigger than that?

Also, if you have any specific interest, that could also guide the suggestion process. For example, are you interested in serialization/wire formats? fault-tolerance? performance? cluster management/operations? Hadoop stuff? etc.

Let's start with your thoughts on the above questions, and we'll go from there.

-F

--
Félix

--
You received this message because you are subscribed to the Google Groups "project-voldemort" group.
To unsubscribe from this group and stop receiving emails from it, send an email to project-voldemort+unsubscribe@googlegroups.com.
Visit this group at https://groups.google.com/group/project-voldemort.
For more options, visit https://groups.google.com/d/optout.

omkar patil

Apr 10, 2017, 12:09:07 AM

to project-voldemort

Hey!!

Thank you so much for the follow-up. You guys are great.

I am planning to put in 20 hours per week for the next 2 months.

I would love to know more about and work on the cluster management / fault tolerance parts.

Looking forward to further correspondence.

Thanks a lot again!

Best,
Omkar

Félix GV

Apr 10, 2017, 12:47:02 PM

to project-voldemort

Hi Omkar,

That sounds like a pretty significant commitment. I think you should be able to pull off some pretty impactful work with that.

Here's a summary of the current state of Voldemort, and what I would suggest as an improvement.

Background Context:

Voldemort is usable in two modes:

• Read-Write, which is a fairly typical tunable consistency key-value store (similar to Cassandra).
• Read-Only, which is somewhat unique, in that it provides first-class support for bulk loading data from Hadoop.

At LinkedIn and, as far as I can tell, in the open-source community, the biggest footprint for Voldemort is in Read-Only (RO) deployments.

In terms of fault-tolerance, the read path (as well as the write path in RW mode) is very stable. It uses replicas to fall back on when nodes are down, etc. There used to be some instability in some fringe cases (very high-throughput use cases), but even that was fixed a couple of years ago. So, really, there isn't much to do there.

Where there is a significant opportunity for improvement is in the bulk load from Hadoop feature. The process that loads data into Voldemort from Hadoop is called Build and Push (BnP) and has historically been considered explicitly NOT highly available (HA). Later on, we added a BnP HA feature, which allows a BnP job to finish successfully on N-1 nodes if a node is down, and that helps a bit, but there is still a pretty major flaw: bringing a new node into the cluster is a manual, brittle process, and it requires BnP downtime. So, essentially, the BnP HA feature made it easier to transform an unplanned outage into a planned (and hopefully shorter) outage, so it makes life a bit easier for operators, since they don't need to fix issues ASAP in the middle of the night. But overall, it is still pretty sketchy and could use an overhaul.

Proposal:

What I'd like to suggest is the creation of a "recovery mode": after an outage, we could bring a fresh, empty node into the cluster, and it would automatically restore its data from peer nodes without extensive operator intervention. When it is done restoring its data, it would automatically come back online and start serving requests, again without manual operator intervention.

The work required would be something along the lines of:

1. Create new mode where a server refuses to serve online requests, but accepts new BnP jobs.
2. Figure out what data is missing, and what peer nodes hold that data.
3. Fetch the data from peer nodes.
4. Deal with race conditions where new BnP jobs may conflict with the recovery process.
5. Ensure that there are good automated checks to verify that the cluster is in a healthy state before bringing the node back online.
6. Clean up state that was kept around to indicate that the cluster is under-replicated, so that BnP HA can kick in again in the future if need be.
7. Bring the node online.
8. ...
9. Profit!

Some of these items already have code that pretty much does what's needed, but refactoring would be required to expose it from new code paths. Some other items are brand new and would require more significant work.

The end result is that Voldemort RO would have self-healing, rather than a clumsy and error-prone node restoration process.
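
To make the ordering of the recovery steps concrete, here is a minimal sketch of them as a small state machine. It is written in Python for brevity (Voldemort itself is Java), and everything in it is hypothetical: the `NodeState` names, the `RecoveringNode` class and its methods do not exist in the Voldemort code base. It only illustrates the constraint that a recovering node accepts BnP fetches but refuses online reads until nothing is missing.

```python
from enum import Enum


class NodeState(Enum):
    RECOVERING = "recovering"  # refuses online reads, accepts BnP fetches
    ONLINE = "online"          # serves reads and accepts BnP fetches


class RecoveringNode:
    """Hypothetical model of a fresh node restoring its data from peers."""

    def __init__(self, expected_partitions, peers):
        # peers: dict mapping a peer name to the set of partition ids it holds
        self.state = NodeState.RECOVERING
        self.expected = set(expected_partitions)
        self.local = set()
        self.peers = peers

    def can_serve_reads(self):
        return self.state is NodeState.ONLINE

    def accepts_bnp_fetch(self):
        # In this sketch both states accept new BnP jobs.
        return True

    def missing_partitions(self):
        return self.expected - self.local

    def plan_fetches(self):
        """Map each missing partition to one peer that holds it."""
        plan = {}
        for partition in sorted(self.missing_partitions()):
            holders = sorted(p for p, parts in self.peers.items() if partition in parts)
            if not holders:
                raise RuntimeError(f"no peer holds partition {partition}")
            plan[partition] = holders[0]
        return plan

    def restore(self):
        for partition in self.plan_fetches():
            self.local.add(partition)  # stand-in for the real file transfer

    def try_go_online(self):
        # Health check: only come online once nothing is missing.
        if not self.missing_partitions():
            self.state = NodeState.ONLINE
        return self.can_serve_reads()


node = RecoveringNode({0, 3, 5}, {"node1": {0, 1, 3}, "node2": {3, 5}})
print(node.try_go_online())  # data is still missing, so the node stays offline
print(node.plan_fetches())
node.restore()
print(node.try_go_online())
```

The sketch leaves out the hard parts named in the list, in particular the race with concurrent BnP jobs and the cleanup of the under-replication state.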

Let me know what you think. If you're interested, we can discuss the details, and I'll be happy to provide some mentoring.

-F

--
Félix

Félix GV

Apr 10, 2017, 1:03:28 PM

to project-voldemort

I guess that's not clear from my previous email, but what I'm suggesting would not only provide self-healing, it would also provide full HA for the BnP job (i.e.: no planned outages anymore either).

omkar patil

Apr 10, 2017, 2:31:09 PM

to project-...@googlegroups.com

Hey Felix!

Thank you so much for the follow-up.

The proposal sounds exciting and I am really pumped up for it.

While reading your description of the current state of the system, I wasn't able to understand this part:

> So, essentially, the BnP HA feature made it easier to transform an unplanned outage into a planned (and hopefully shorter) outage, so it makes life a bit easier for operators, since they don't need to fix issues ASAP in the middle of the night. But overall, it is still pretty sketchy and could use an overhaul.

Could you provide a few more details if you have the time?

Irrespective of that, I would love to start working on this ASAP, and it would be great if you could give me further general directions/keywords to get started.

Looking forward to this amazing experience.

Thanks a lot for your time.

Best,

Omkar

--

Omkar Patil,
Computer Science Graduate Student,

UCLA

David Ongaro

Apr 10, 2017, 2:41:32 PM

to project-...@googlegroups.com

Hi Félix, hi Omkar,

Step 3 would require implementing some kind of internode data transfer (which might be fundamentally different from the current internode communication for metadata, due to the different scale, throughput and latency requirements). Having this would immediately make it possible to implement a feature I have wanted for quite some time: fetch each partition only once from HDFS, even if the replication factor is >1, and replicate partitions through internode transfer according to the replication factor.

The basic impetus is that the Voldemort nodes of a cluster normally live on the same rack, whereas the HDFS cluster might even be in a different DC, and it’s cheaper and quicker (and possibly more reliable) to do as many network transfers as possible locally.

To give some background: the RO stores are generated “offline” in Hadoop to avoid having to build the indexes on the edge nodes (Voldemort) themselves, which would cut into their time serving actual requests. Once they are generated, they need to be fetched by all Voldemort nodes from HDFS and swapped (atomically replacing the previous version of the store cluster-wide). This is what the “Push” part in BnP does. The “Build” part of BnP was already enhanced in the past so that each partition only needs to be generated once in Hadoop, but the fetcher was kept relatively simple: if the replication factor is 2, then each data and index file is fetched twice from HDFS, by two different nodes, in order to reach a cluster-wide replication factor of 2. With internode transfer, only one node has to fetch each file from HDFS; a second node has to figure out which node has the partition it needs and fetch directly from there.
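
As a rough illustration of the saving, the following Python sketch counts HDFS fetches under both schemes. The function names, the `layout` mapping and the partition names are made up for the example; this is not how the Voldemort fetcher is structured.

```python
def plan_fetches(replica_nodes, internode=True):
    """Return a list of (node, partition, source) fetch tasks.

    replica_nodes maps a partition name to the ordered list of nodes that
    must hold a copy of it (illustrative layout only).
    """
    tasks = []
    for partition, nodes in sorted(replica_nodes.items()):
        first = nodes[0]
        # One node always has to pull the files over the WAN from HDFS.
        tasks.append((first, partition, "hdfs"))
        for node in nodes[1:]:
            # Current scheme: every replica pulls from HDFS again.
            # Internode scheme: replicas pull from the node that already has the files.
            source = first if internode else "hdfs"
            tasks.append((node, partition, source))
    return tasks


def hdfs_fetch_count(tasks):
    return sum(1 for _, _, source in tasks if source == "hdfs")


# Three partitions, replication factor 2.
layout = {
    "partition-0": ["node0", "node1"],
    "partition-1": ["node1", "node2"],
    "partition-2": ["node2", "node0"],
}
print(hdfs_fetch_count(plan_fetches(layout, internode=False)))  # 6
print(hdfs_fetch_count(plan_fetches(layout, internode=True)))   # 3
```

With replication factor R, the number of HDFS transfers drops by a factor of R; the remaining copies move over the local network instead.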

This might only indirectly improve HA, but it surely seems like a “low-hanging” (although not trivial) fruit which could be implemented along the way for an easy win. Of course, some care needs to be taken not to congest the local network so much that request serving is affected (which can probably be handled by implementing a rate limit, like we have for the current fetchers).
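
One common way to implement such a rate limit is a token bucket. The Python sketch below is a generic example of that technique and not Voldemort's existing fetcher throttle; the class name, parameters and numbers are assumptions chosen for illustration.

```python
import time


class TokenBucket:
    """Generic token-bucket throttle for bytes sent over the local network."""

    def __init__(self, bytes_per_second, burst_bytes, clock=time.monotonic):
        self.rate = float(bytes_per_second)
        self.capacity = float(burst_bytes)
        self.tokens = float(burst_bytes)
        self.clock = clock  # injectable so the throttle can be tested without sleeping
        self.last = clock()

    def _refill(self):
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def delay_for(self, nbytes):
        """Consume nbytes and return how many seconds the caller should sleep first."""
        self._refill()
        self.tokens -= nbytes
        if self.tokens >= 0:
            return 0.0
        # The bucket is in debt; the debt is paid off at `rate` bytes per second.
        return -self.tokens / self.rate


# Usage: throttle a peer-to-peer transfer to 10 MB/s with a 1 MB burst.
bucket = TokenBucket(bytes_per_second=10 * 1024 * 1024, burst_bytes=1024 * 1024)
for chunk in (b"x" * 65536 for _ in range(4)):
    time.sleep(bucket.delay_for(len(chunk)))
    # send `chunk` to the peer here
```

A per-node bucket shared by all outgoing transfers would cap the total bandwidth taken away from request serving, whatever the number of peers fetching at once.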

1. Implement an internode data transfer (basically step 3 from Félix)

2. Change the fetching logic so that each file is only fetched once from HDFS. Nodes with missing partitions need to know which node has the missing files (and have to wait until it has finished fetching them) and fetch directly from there. (This is similar to step 2 from Félix.) Maybe this is also something the Push job itself can coordinate.

3. The push job probably needs a few adaptations to handle the new mode.

4. Profit

So a profit step could be reached much earlier while already doing a significant amount of the work needed for your proposal.

David

PS:

> Create new mode where a server refuses to serve online requests, but accepts new BnP jobs.

Isn’t that what server.state=OFFLINE_SERVER and readonly.fetch.enabled=true is supposed to be?

Felix GV

Apr 10, 2017, 2:45:14 PM

to project-...@googlegroups.com

Hi Omkar,

I'm glad to see your enthusiasm. Of course, I understand that there will be a fair bit of ramp up before you can start tackling this, so definitely feel free to ask as many questions as you need.

Here are a few resources that I would recommend reading first:

High-level intro and quick-start guide for BnP: http://www.project-voldemort.com/voldemort/build-and-push.html
First commit of the BnP HA feature: https://github.com/voldemort/voldemort/commit/c2db8fd9b0afe714f6f88908c769b66485d28825 (There are minor details that have changed since then, but it is a good intro nonetheless. Just read the commit message, no need to look at the code for now.).

--

Felix GV
Staff Software Engineer
Data Infrastructure
LinkedIn

f...@linkedin.com
linkedin.com/in/felixgv

Felix GV

Apr 10, 2017, 3:43:19 PM

to project-...@googlegroups.com

Hi David,

I definitely see a lot of synergy between these two proposed changes. In fact, in either case, I think the biggest piece would be the stabilization of the inter-node transfer functionality. Once that is in, the rest of the work for either feature can be implemented "fairly easily".

For the sake of realistically achieving a goal (any goal), I would recommend that Omkar focus on just one of the two, and if he still has time left, he can look into doing the small remaining incremental work to kill two birds with one stone. But regardless of whether we get to that or not, having one of the two would be more valuable than having half of each.

Whether Omkar is more interested in the fault-tolerance aspect or in the performance / efficiency aspect can help guide his choice.

--

Felix GV
Staff Software Engineer
Data Infrastructure
LinkedIn

f...@linkedin.com
linkedin.com/in/felixgv

omkar patil

Apr 11, 2017, 3:40:31 AM

to project-...@googlegroups.com

Hey David, Felix!

Thank you so much for the succinct explanation of the workings in your mails.

This helps a lot.

I went through the resources while referring to the design (http://www.project-voldemort.com/voldemort/design.html) and I now have a clearer understanding of the goal/proposal and its relation to fault tolerance.

Hoping to hear from you guys soon regarding further steps.

Have a great day.

Best,

Omkar

Félix GV

Apr 12, 2017, 7:41:53 PM

to project-...@googlegroups.com

Hi Omkar,

Here are a few pointers:
	•	Experimental work on server-to-server Read-Only file transfer, by Arun:
	◦	https://github.com/arunthirupathi/voldemort/tree/primaryPartitionFetch_2
	◦	By Arun, one of my ex-colleagues who is very good and hacked extensively on Voldemort.
	◦	This was aiming to solve David Ongaro's problem of fetching data only once across the WAN. However, parts of that work may be salvageable as part of the node auto-restoration work as well. It may be usable as is, or perhaps can be used as a starting point or merely as inspiration. Feel free to use it, adapt it or discard it.
	◦	I think it has a few unit tests, but it is definitely lacking a lot of testing. There may be bugs in there. Although, IIRC, someone in the open-source community (from Apple or otherwise, can't remember now) reported that "it works".
	•	Server-side code which initializes a store-version, after it has been downloaded:
	◦	https://github.com/voldemort/voldemort/blob/master/src/java/voldemort/store/readonly/chunk/ChunkedFileSet.java#L342 
	◦	Already part of production code.
	◦	The initVersion2() function linked above has logic which can determine which partitions (and even replica number) are supposed to be hosted on the current server. This can serve as a starting point for understanding where the other replicas of a partition may live.
	•	New recovery mode which accepts fetches from BnP jobs but does not allow read requests from clients.
	◦	https://github.com/voldemort/voldemort/pull/471 
	◦	A pull request by me.
	◦	The part which models the server states more cleanly may be useful, but it does not actually provide "recovery" functionality per se. This is not the whole story.
	•	The Failed Fetch Lock implementation which stores some BnP HA state on HDFS so that distinct BnP jobs can coordinate among one another
	◦	https://github.com/voldemort/voldemort/blob/ea37ef67fa7724180608510c6d4237167b78dd63/contrib/hadoop-store-builder/src/java/voldemort/store/readonly/swapper/HdfsFailedFetchLock.java 
	◦	Already part of production code.
	◦	This state will need to be cleaned up at the end of the recovery process, so that BnP HA can kick in again.
These are good starting points for reading the code.
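
For orientation on where the other replicas of a partition may live, here is a small Python sketch of ring-based replica placement in the spirit of the consistent-hashing scheme described in the Voldemort design page: starting at the primary partition, walk the ring and collect distinct nodes until the replication factor is reached. It is not the logic of initVersion2() itself, and the function names and the example ring are hypothetical.

```python
def replica_nodes(partition_to_node, primary_partition, replication_factor):
    """Walk the ring from primary_partition and collect distinct node ids."""
    ring_size = len(partition_to_node)
    nodes = []
    for step in range(ring_size):
        node = partition_to_node[(primary_partition + step) % ring_size]
        if node not in nodes:
            nodes.append(node)
            if len(nodes) == replication_factor:
                break
    return nodes


def peers_holding(partition_to_node, primary_partition, replication_factor, me):
    """Nodes other than `me` that should hold a replica of the partition."""
    replicas = replica_nodes(partition_to_node, primary_partition, replication_factor)
    return [node for node in replicas if node != me]


ring = [0, 1, 2, 0, 1, 2]  # index = partition id, value = owning node id
print(replica_nodes(ring, 4, 2))         # [1, 2]
print(peers_holding(ring, 4, 2, me=2))   # [1]
```

A recovering node could use this kind of lookup to decide which peer to ask for each partition it is supposed to host.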

I'll be offline for the next few days, but feel free to email the list if you have questions. Perhaps Arun is still lurking and may be able to help if there are questions about his branch.

-F

omkar patil

Apr 12, 2017, 8:32:52 PM

to project-...@googlegroups.com

Hey Felix !

Thanks for the update.

I will go through the material and definitely get back to you or Arun if I have any doubts.

Have a great day!!

best,

omkar

--
Omkar Patil,
Computer Science Graduate Student,
UCLA

David Ongaro

Apr 18, 2017, 7:25:22 PM

to project-...@googlegroups.com

Hi again,

On Apr 10, 2017, at 11:41 AM, David Ongaro <BITT...@GMAIL.COM> wrote:
> Create new mode where a server refuses to serve online requests, but accepts new BnP jobs.
> Isn’t that what server.state=OFFLINE_SERVER and readonly.fetch.enabled=true is supposed to be?

I just checked this and it’s not working. The push job recognizes the OFFLINE_SERVER node with an “Invoking fetch for Node…” message, but it doesn’t actually fetch anything even though readonly.fetch.enabled is set to true. When the fetch for the other nodes is done, it yields a “java.util.concurrent.ExecutionException: voldemort.store.UnreachableStoreException: Failure while checking out socket for…” exception for this node.

At least that seems to be the case with our current server version, 1.10.13. Do later versions contain any related fixes? Otherwise, I guess this is something that ought to be implemented.

Thanks,

David

Felix GV

Apr 18, 2017, 8:29:07 PM

to project-...@googlegroups.com

On Tue, Apr 18, 2017 at 4:25 PM, David Ongaro <bitt...@gmail.com> wrote:

> At least that seems to be the case with our current Server version 1.10.13, do later Versions contain any related fixes?

Not that I know of, as this is not a use case we have attempted to make work yet.

> Otherwise I guess this is something ought to be implemented.

I think so, yes.

Arunachalam

Apr 19, 2017, 12:47:34 AM

to project-...@googlegroups.com

The Voldemort server exposes two ports: client and admin. Setting OFFLINE_SERVER disables the client port, so in theory admin operations should go through. But the problem is that this feature was added later, so there could be some client operation going on along the way which could fail. BnP and any code involved should use the admin port directly, as the admin port can serve all client operations in addition to admin operations.

If you have the full call stack, I can probably tell you what is going on. But I moved on from LinkedIn and I have very little time to focus on Voldemort these days.

Thanks,

Arun.

David Ongaro

Apr 19, 2017, 3:00:58 PM

to project-...@googlegroups.com

Indeed we’re using the admin port to initiate a push. Interestingly the fetch itself seems to be still using the client port. Maybe that is the problem? A redacted version of our push log looks like this:

[... skipped lines ...]

2017/04/18 22:41:57.123 +0000 INFO [net:6667] [Azkaban] tcp://node0.net:6667 : Initiating fetch of somestore with dataDir: webhdfs://hadoop:50070/data/somestore

2017/04/18 22:41:57.126 +0000 INFO [net:6667] [Azkaban] tcp://node0.net:6667 : Invoking fetch for Node node1.net:6666 [id 1] for webhdfs://hadoop:50070/data/somestore/node-1

2017/04/18 22:41:57.126 +0000 INFO [net:6667] [Azkaban] tcp://node0.net:6667 : Invoking fetch for Node node4.net:6666 [id 4] for webhdfs://hadoop:50070/data/somestore/node-4

2017/04/18 22:41:57.126 +0000 INFO [net:6667] [Azkaban] tcp://node0.net:6667 : Invoking fetch for Node node2.net:6666 [id 2] for webhdfs://hadoop:50070/data/somestore/node-2

2017/04/18 22:41:57.126 +0000 INFO [net:6667] [Azkaban] tcp://node0.net:6667 : Invoking fetch for Node node0.net:6666 [id 0] for webhdfs://hadoop:50070/data/somestore/node-0

2017/04/18 22:41:57.126 +0000 INFO [net:6667] [Azkaban] tcp://node0.net:6667 : Invoking fetch for Node node5.net:6666 [id 5] for webhdfs://hadoop:50070/data/somestore/node-5

2017/04/18 22:41:57.126 +0000 INFO [net:6667] [Azkaban] tcp://node0.net:6667 : Invoking fetch for Node node6.net:6666 [id 6] for webhdfs://hadoop:50070/data/somestore/node-6

2017/04/18 22:41:57.126 +0000 INFO [net:6667] [Azkaban] tcp://node0.net:6667 : Invoking fetch for Node node3.net:6666 [id 3] for webhdfs://hadoop:50070/data/somestore/node-3

2017/04/18 22:41:57.434 +0000 INFO [AdminClient] [Azkaban] Node node2.net:6666 [id 2] : AsyncOperationStatus(task id = 147, description = Fetch store 'somestore' v146, complete = false, status = 0 MB copied at 0 MB/sec - 0 % complete)

2017/04/18 22:41:57.434 +0000 INFO [AdminClient] [Azkaban] Node node6.net:6666 [id 6] : AsyncOperationStatus(task id = 10, description = Fetch store 'somestore' v146, complete = false, status = 0 MB copied at 0 MB/sec - 0 % complete)

2017/04/18 22:41:57.434 +0000 INFO [AdminClient] [Azkaban] Node node1.net:6666 [id 1] : AsyncOperationStatus(task id = 147, description = Fetch store 'somestore' v146, complete = false, status = 0 MB copied at 0 MB/sec - 0 % complete)

2017/04/18 22:41:57.434 +0000 INFO [AdminClient] [Azkaban] Node node0.net:6666 [id 0] : AsyncOperationStatus(task id = 146, description = Fetch store 'somestore' v146, complete = false, status = 0 MB copied at 0 MB/sec - 0 % complete)

2017/04/18 22:41:57.434 +0000 INFO [AdminClient] [Azkaban] Node node5.net:6666 [id 5] : AsyncOperationStatus(task id = 145, description = Fetch store 'somestore' v146, complete = false, status = 0 MB copied at 0 MB/sec - 0 % complete)

2017/04/18 22:41:57.434 +0000 INFO [AdminClient] [Azkaban] Node node3.net:6666 [id 3] : AsyncOperationStatus(task id = 145, description = Fetch store 'somestore' v146, complete = false, status = 0 MB copied at 0 MB/sec - 0 % complete)

2017/04/18 22:41:57.443 +0000 INFO [AdminClient] [Azkaban] Node node4.net:6666 [id 4] : AsyncOperationStatus(task id = 145, description = Fetch store 'somestore' v146, complete = false, status = 0 MB copied at 0 MB/sec - 0 % complete)

[... skipped lines ...]

2017/04/18 22:59:04.248 +0000 INFO [AdminClient] [Azkaban] Node node5.net:6666 [id 5] : AsyncOperationStatus(task id = 145, description = Fetch store 'somestore' v146, complete = true, status = /var/voldemort/data/read-only/somestore/version-146)

2017/04/18 22:59:04.248 +0000 INFO [net:6667] [Azkaban] tcp://node0.net:6667 : Fetch succeeded on Node node5.net:6666 [id 5]

2017/04/18 22:59:04.261 +0000 INFO [AdminClient] [Azkaban] Node node0.net:6666 [id 0] : AsyncOperationStatus(task id = 146, description = Fetch store 'somestore' v146, complete = false, status = 45 MB copied at 0.03 MB/sec, 99.53 % complete, attempt: #1/6, current file: 854_1.index)

2017/04/18 22:59:04.285 +0000 INFO [AdminClient] [Azkaban] Node node4.net:6666 [id 4] : AsyncOperationStatus(task id = 145, description = Fetch store 'somestore' v146, complete = true, status = /var/voldemort/data/read-only/somestore/version-146)

2017/04/18 22:59:04.285 +0000 INFO [net:6667] [Azkaban] tcp://node0.net:6667 : Fetch succeeded on Node node4.net:6666 [id 4]

2017/04/18 23:00:08.385 +0000 INFO [AdminClient] [Azkaban] Node node2.net:6666 [id 2] : AsyncOperationStatus(task id = 147, description = Fetch store 'somestore' v146, complete = true, status = /var/voldemort/data/read-only/somestore/version-146)

2017/04/18 23:00:08.385 +0000 INFO [net:6667] [Azkaban] tcp://node0.net:6667 : Fetch succeeded on Node node2.net:6666 [id 2]

2017/04/18 23:00:08.394 +0000 INFO [AdminClient] [Azkaban] Node node3.net:6666 [id 3] : AsyncOperationStatus(task id = 145, description = Fetch store 'somestore' v146, complete = true, status = /var/voldemort/data/read-only/somestore/version-146)

2017/04/18 23:00:08.394 +0000 INFO [net:6667] [Azkaban] tcp://node0.net:6667 : Fetch succeeded on Node node3.net:6666 [id 3]

2017/04/18 23:00:08.395 +0000 INFO [AdminClient] [Azkaban] Node node1.net:6666 [id 1] : AsyncOperationStatus(task id = 147, description = Fetch store 'somestore' v146, complete = true, status = /var/voldemort/data/read-only/somestore/version-146)

2017/04/18 23:00:08.395 +0000 INFO [net:6667] [Azkaban] tcp://node0.net:6667 : Fetch succeeded on Node node1.net:6666 [id 1]

2017/04/18 23:00:08.412 +0000 INFO [AdminClient] [Azkaban] Node node0.net:6666 [id 0] : AsyncOperationStatus(task id = 146, description = Fetch store 'somestore' v146, complete = true, status = /var/voldemort/data/read-only/somestore/version-146)

2017/04/18 23:00:08.412 +0000 INFO [net:6667] [Azkaban] tcp://node0.net:6667 : Fetch succeeded on Node node0.net:6666 [id 0]

2017/04/18 23:00:08.413 +0000 ERROR [net:6667] [Azkaban] tcp://node0.net:6667 : Error on Node node6.net:6666 [id 6] during push :

java.util.concurrent.ExecutionException: voldemort.store.UnreachableStoreException: Failure while checking out socket for node6.net:6666(vp1):

at java.util.concurrent.FutureTask.report(FutureTask.java:122)

at java.util.concurrent.FutureTask.get(FutureTask.java:188)

at voldemort.store.readonly.swapper.AdminStoreSwapper.invokeFetch(AdminStoreSwapper.java:220)

at voldemort.store.readonly.swapper.AdminStoreSwapper.fetchAndSwapStoreData(AdminStoreSwapper.java:123)

at voldemort.store.readonly.mr.azkaban.VoldemortSwapJob.run(VoldemortSwapJob.java:159)

at voldemort.store.readonly.mr.azkaban.VoldemortBuildAndPushJob.runPushStore(VoldemortBuildAndPushJob.java:837)

at voldemort.store.readonly.mr.azkaban.VoldemortBuildAndPushJob$StorePushTask.call(VoldemortBuildAndPushJob.java:556)

at voldemort.store.readonly.mr.azkaban.VoldemortBuildAndPushJob$StorePushTask.call(VoldemortBuildAndPushJob.java:539)

at java.util.concurrent.FutureTask.run(FutureTask.java:262)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:745)

Caused by: voldemort.store.UnreachableStoreException: Failure while checking out socket for node6.net:6666(vp1):

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)

at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

at java.lang.reflect.Constructor.newInstance(Constructor.java:526)

at voldemort.utils.ReflectUtils.callConstructor(ReflectUtils.java:116)

at voldemort.utils.ReflectUtils.callConstructor(ReflectUtils.java:103)

at voldemort.store.ErrorCodeMapper.getError(ErrorCodeMapper.java:84)

at voldemort.client.protocol.admin.AdminClient$HelperOperations.throwException(AdminClient.java:462)

at voldemort.client.protocol.admin.AdminClient$RPCOperations.getAsyncRequestStatus(AdminClient.java:707)

at voldemort.client.protocol.admin.AdminClient$RPCOperations.waitForCompletion(AdminClient.java:912)

at voldemort.client.protocol.admin.AdminClient$RPCOperations.waitForCompletion(AdminClient.java:983)

at voldemort.client.protocol.admin.AdminClient$ReadOnlySpecificOperations.fetchStore(AdminClient.java:4417)

at voldemort.store.readonly.swapper.AdminStoreSwapper$1.fetch(AdminStoreSwapper.java:180)

at voldemort.store.readonly.swapper.AdminStoreSwapper$1.call(AdminStoreSwapper.java:168)

at voldemort.store.readonly.swapper.AdminStoreSwapper$1.call(AdminStoreSwapper.java:158)

... 4 more

[...]

The admin port is 6667 and the client port 6666. The node in offline mode is node 6. As you can see it’s appearing in the “Invoking fetch” message and also in the first AsyncOperationStatus message but not anymore till the ERROR line. Probably more helpful is the trace on the corresponding node:

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: Failure while checking out socket for node6.net:6666(vp1): [voldemort-scheduler-service1-t3; AsyncOp ID 10]

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: voldemort.store.UnreachableStoreException: Failure while checking out socket for node6.net:6666(vp1):

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at voldemort.store.UnreachableStoreException.wrap(UnreachableStoreException.java:41)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at voldemort.store.socket.clientrequest.ClientRequestExecutorPool.checkout(ClientRequestExecutorPool.java:214)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at voldemort.store.socket.SocketStore.request(SocketStore.java:278)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at voldemort.store.socket.SocketStore.get(SocketStore.java:200)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at voldemort.client.protocol.admin.AdminClient$StoreOperations.getNodeKey(AdminClient.java:3031)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at voldemort.client.protocol.admin.AdminClient$QuotaManagementOperations.getQuotaForNode(AdminClient.java:4812)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at voldemort.store.readonly.fetcher.HdfsFetcher.fetch(HdfsFetcher.java:203)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at voldemort.server.protocol.admin.AdminServiceRequestHandler$4.operate(AdminServiceRequestHandler.java:1134)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at voldemort.server.protocol.admin.AsyncOperation.run(AsyncOperation.java:35)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at java.util.concurrent.FutureTask.run(FutureTask.java:266)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at java.lang.Thread.run(Thread.java:745)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: Caused by: java.net.ConnectException: Connection refused

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at voldemort.store.socket.clientrequest.ClientRequestExecutor.connect(ClientRequestExecutor.java:310)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at voldemort.common.nio.SelectorManagerWorker.run(SelectorManagerWorker.java:103)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: at voldemort.common.nio.AbstractSelectorManager.run(AbstractSelectorManager.java:243)

Apr 18 22:41:57 node6.net voldemort-server.sh[12330]: ... 3 more

Apr 18 22:41:58 node6.net voldemort-server.sh[12330]: [22:41:58,049 voldemort.server.protocol.admin.AdminServiceRequestHandler] ERROR handleAsyncStatus failed for request(request_id: 10) [voldemort-admin-server-t1

Apr 18 22:41:58 node6.net voldemort-server.sh[12330]: voldemort.store.UnreachableStoreException: Failure while checking out socket for node6.net:6666(vp1):

It looks like node6 is trying to connect to itself?

Thanks,

David
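One way to check the "Connection refused" in the trace above is to probe the port named in the log from the node itself. The following is only a minimal sketch: `PortProbe` and `isListening` are hypothetical names, not part of Voldemort, and the host and port defaults are simply the ones that appear in the log line.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Voldemort: reports whether anything
// accepts TCP connections on host:port.
final class PortProbe {

    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // java.net.ConnectException ("Connection refused"), unknown hosts
            // and timeouts all end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are taken from the log line; pass other values as arguments.
        String host = args.length > 0 ? args[0] : "node6.net";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 6666;
        System.out.println(host + ":" + port + " listening: "
                + isListening(host, port, 2000));
    }
}
```

If the probe returns false on the node itself, nothing is accepting connections on that port, which would match the exception in the trace.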

Arunachalam

Apr 19, 2017, 3:41:26 PM4/19/17

to project-...@googlegroups.com

This is a bug I fixed in later versions. But there could be more of this kind.
https://github.com/voldemort/voldemort/commit/701e97684ff635d68c79a12f53061250a6cb06c1

Update to latest and retry to see if it works.

Arunachalam

Apr 19, 2017, 4:11:00 PM4/19/17

to project-...@googlegroups.com

I sent that from my phone. The server was using a connection to itself to set the quota. The code was fixed to invoke the APIs correctly when setting the quota.

https://github.com/voldemort/voldemort/commit/701e97684ff635d68c79a12f53061250a6cb06c1#diff-01af3f151b7185726bbb5500510a80a9R1662

But there is no guarantee that it fixes your scenario, as we never aimed to support that scenario. I would be curious to know whether there are more issues there.

Thanks,

Arun.

On Tue, Apr 18, 2017 at 5:28 PM, 'Felix GV' via project-voldemort <project-voldemort@googlegroups.com> wrote:

On Tue, Apr 18, 2017 at 4:25 PM, David Ongaro <bitt...@gmail.com> wrote:
At least that seems to be the case with our current Server version 1.10.13, do later Versions contain any related fixes?

Not that I know of, as this is not a use case we have attempted to make work yet.

Otherwise I guess this is something ought to be implemented.

I think so, yes.

--
Felix GV
Staff Software Engineer
Data Infrastructure
LinkedIn

f...@linkedin.com
linkedin.com/in/felixgv

David Ongaro

Apr 19, 2017, 4:15:46 PM4/19/17

to project-...@googlegroups.com

Thanks for the pointer, that makes sense. (Even though I don’t understand why the Admin call is/was using the client port instead of the admin port.) We’re planning to set up our next cluster with version 1.10.23, so we can retest then to see what happens.

Thanks,

David

Arunachalam

Apr 19, 2017, 6:50:37 PM4/19/17

to project-...@googlegroups.com

In Voldemort, the client and admin ports were interchangeable for a long time and essentially did the same things.

To make matters worse, the client and admin share the same bootstrap code, which used both ports intermixed with one another. I refactored most of the code to pass the ports as parameters, but did not proactively clean up all of it.
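To illustrate what passing the ports as parameters can look like, here is a minimal sketch. It is not the actual Voldemort code: `PortKind` and `NodeEndpoints` are hypothetical names, and the admin port value in the usage note is an assumption (only the client port 6666 appears in the log earlier in the thread).

```java
import java.net.InetSocketAddress;

// Hypothetical types for illustration only; none of these names come from Voldemort.
enum PortKind { CLIENT, ADMIN }

final class NodeEndpoints {
    private final String host;
    private final int clientPort;
    private final int adminPort;

    NodeEndpoints(String host, int clientPort, int adminPort) {
        this.host = host;
        this.clientPort = clientPort;
        this.adminPort = adminPort;
    }

    // The caller has to state which port it wants, so an admin call cannot
    // silently end up on the client port (or the other way around).
    InetSocketAddress addressFor(PortKind kind) {
        int port = (kind == PortKind.ADMIN) ? adminPort : clientPort;
        // Unresolved on purpose: building the address does no DNS lookup.
        return InetSocketAddress.createUnresolved(host, port);
    }
}
```

For example, `new NodeEndpoints("node6.net", 6666, 6667).addressFor(PortKind.ADMIN)` yields an address on the (assumed) admin port 6667, while `PortKind.CLIENT` yields port 6666.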