Voldemort 0.70 Release Candidate 1 uploaded

Alex Feinberg

Jan 22, 2010, 11:41:47 PM
to project-...@googlegroups.com
Hello,

I've updated the release 0.70 candidate branch (containing the
long-awaited failure detection and rebalancing features) and uploaded
the tarball/zip archives (branded as 0.70.RC1):

http://github.com/voldemort/voldemort/downloads

These tarballs have been cut from the release-070 branch <
http://github.com/voldemort/voldemort/tree/release-070 >. I've also
merged the development work (mostly bug fixes, tools and slight
enhancements) that happened on the release-070 branch back into the
master branch.

The purpose of this candidate release is to allow the open source
community to test and sanity-check the 0.70 release before a
definitive "voldemort-0.70" archive is created and uploaded. Please
feel free to download this snapshot, experiment with it and report any
potential bugs or broken or unusual behaviour. If no bugs are seen
with 0.70.RC1, a final 0.70 release will be made on Monday
(1/25/2010), with release notes, documentation and the blog being
updated.

Thanks,
- Alex

B. Todd Burruss

Jan 25, 2010, 2:44:22 PM
to project-...@googlegroups.com
is hinted-handoff a part of 0.7?

thx

Alex Feinberg

Jan 25, 2010, 11:25:58 PM
to project-...@googlegroups.com
Hinted handoff isn't going to be in this release, but it's on the
roadmap to finish/correct its implementation in the near future.
There's an issue assigned to me to track this:

http://code.google.com/p/project-voldemort/issues/detail?id=118

Thanks,
- Alex

B. Todd Burruss

Jan 25, 2010, 11:31:17 PM
to project-...@googlegroups.com
so without hinted handoff voldemort relies on read repair to fix nodes
that may have been taken out of rotation and put back?

Alex Feinberg

Jan 25, 2010, 11:41:30 PM
to project-...@googlegroups.com
There's read repair, and there's also the ability to use the
AdminClient API to restore a partition from its replicas. It's exposed
via a JMX operation: the mbean is voldemort.server.VoldemortServer and
the operation is restoreDataFromReplication (you should give it an
integer >= 1, which is the number of transfers to do in parallel).
This is a very freshly added feature, so it will receive more
documentation.
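
As a rough illustration, here's a minimal sketch of invoking that
operation with the standard javax.management client API. The JMX
host/port, the exact object name format and the parameter signature
below are assumptions about a typical deployment rather than
documented values; check jconsole against your own server for the
real names before scripting this.

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class RestoreFromReplication {
        public static void main(String[] args) throws Exception {
            // Assumed JMX endpoint; substitute the host/port your
            // Voldemort server actually exposes.
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
            JMXConnector connector = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection mbeans =
                        connector.getMBeanServerConnection();
                // Assumed object name, following the usual
                // domain:type=Class convention for the
                // voldemort.server.VoldemortServer mbean.
                ObjectName server = new ObjectName(
                        "voldemort.server:type=VoldemortServer");
                // The single argument is the number of transfers to do
                // in parallel (an integer >= 1), as described above.
                mbeans.invoke(server,
                              "restoreDataFromReplication",
                              new Object[] { Integer.valueOf(2) },
                              new String[] { "int" });
            } finally {
                connector.close();
            }
        }
    }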

This is good for longer-term outages. Read repair should work for the
transient cases (nodes being down for a short time, then entering back
into the cluster). However, for everything in between, hinted handoff
would be the correct solution, which is why it's high on my list.

Thanks,
- Alex

B. Todd Burruss

Jan 26, 2010, 1:09:51 PM
to project-...@googlegroups.com
thx for the feedback. a good idea for voldemort's wiki would be an
"operations" page describing the actions to take in certain
situations:

- how to add/remove nodes proactively
- how to repair a failed node that lost all data
- how to repair a failed node that has data intact
- how to load balance
- etc.

and with each of the operational scenarios, how long the process
should be expected to take, or some sort of percent complete via JMX -
or both ;)

thx. 0.7 is looking good

to give you some stats feedback:

- i have a 4 node cluster, N=3, R=2, W=2 (preferred writes=3) - fast
15k SCSI drives, 48G RAM, 8 cores (store definition sketched after
this list)
- keys = ~82,269,011 inserted
- data = ~4k data per key
- node0 = 542G
- node1 = 464G
- node2 = 581G
- node3 = 365G
- i have a rebalance that was initiated 21 hours ago
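
fwiw, in stores.xml terms that N/R/W setup corresponds to a store
definition roughly like the sketch below. the store name, serializers
and persistence engine here are placeholders, not my actual config:

    <store>
      <name>example-store</name>  <!-- placeholder name -->
      <persistence>bdb</persistence>
      <routing>client</routing>
      <replication-factor>3</replication-factor>  <!-- N -->
      <required-reads>2</required-reads>          <!-- R -->
      <required-writes>2</required-writes>        <!-- W -->
      <preferred-writes>3</preferred-writes>
      <key-serializer>
        <type>string</type>
      </key-serializer>
      <value-serializer>
        <type>string</type>
      </value-serializer>
    </store>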

write stats (millis/write):
high value = 85.11ms
99% value = 65.48ms
90% value = 48.01ms
50% value = 28.11ms

read stats (millis/read):
high value = 71.12ms
99% value = 58.97ms
90% value = 50.47ms
50% value = 40.24ms

i pushed the cluster hard to get the data in, then dropped off to
~650 reads/sec and 500 writes/sec
