Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
Order Preserving Partitions in Voldemort
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  4 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Mark Rambacher  
View profile  
 More options Nov 23 2010, 9:28 am
From: Mark Rambacher <mramb...@gmail.com>
Date: Tue, 23 Nov 2010 06:28:17 -0800 (PST)
Local: Tues, Nov 23 2010 9:28 am
Subject: Order Preserving Partitions in Voldemort
Has anyone attempted to build an Order Preserving Routing Strategy
using Voldemort, comparable to that provided by Cassandra?

On the surface, this sounds like a fairly trivial operation to do
(implementing a new routing strategy).  But digging deeper, I think
the structure of the cluster file (to support range partitions rather
than numeric) and possibly other structures would need to change.

Has anyone attempted to write an order preserving routing strategy in
Voldemort or thought about what would need changed to implement such a
routing strategy?

Thanks,

Mark


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Tatu Saloranta  
View profile  
 More options Nov 23 2010, 1:03 pm
From: Tatu Saloranta <tsalora...@gmail.com>
Date: Tue, 23 Nov 2010 10:03:16 -0800
Local: Tues, Nov 23 2010 1:03 pm
Subject: Re: [project-voldemort] Order Preserving Partitions in Voldemort

On Tue, Nov 23, 2010 at 6:28 AM, Mark Rambacher <mramb...@gmail.com> wrote:
> Has anyone attempted to build an Order Preserving Routing Strategy
> using Voldemort, comparable to that provided by Cassandra?

> On the surface, this sounds like a fairly trivial operation to do
> (implementing a new routing strategy).  But digging deeper, I think
> the structure of the cluster file (to support range partitions rather
> than numeric) and possibly other structures would need to change.

> Has anyone attempted to write an order preserving routing strategy in
> Voldemort or thought about what would need changed to implement such a
> routing strategy?

I haven't heard of such efforts, but my first question is about
utility: would the use case be that of range queries?
That seems to be the obvious one to me, but maybe there are others.

-+ Tatu +-


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Mark Rambacher  
View profile  
 More options Nov 23 2010, 5:21 pm
From: Mark Rambacher <mramb...@gmail.com>
Date: Tue, 23 Nov 2010 14:21:03 -0800 (PST)
Local: Tues, Nov 23 2010 5:21 pm
Subject: Re: Order Preserving Partitions in Voldemort
Range queries are one use case and a definite need for us.

Another would be a means of storing "like" data together.  For
example, we could create "list keys" where we store a bunch of related
keys as individual keys with the same prefix.  I could create an
aggregate key list by having a common prefix (and unique suffix) for a
group of keys.  For example, I could prefix the set of contact keys
for a user with some prefix and retrieve all of the contacts by
looking for that prefix.  This is comparable to the "range query" case
but slightly different, in that the end goal is to get a single set of
information (all my contacts) rather than a general-purpose search.

The "storing like data together" could be taken one step further to
implement something comparable to a column-store, where a search of
"key" would give me all the data associated with a key, where "key-
column" would look up a specific property.  Again, a more specialized
example of a range query.

There are potentially other use cases that could be dreamed of, but
the question is if this would be generally useful and, if so, how one
would go about implementing it in the existing Voldemort
infrastructure.

/ Mark

On Nov 23, 1:03 pm, Tatu Saloranta <tsalora...@gmail.com> wrote:


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Tatu Saloranta  
View profile  
 More options Nov 23 2010, 6:11 pm
From: Tatu Saloranta <tsalora...@gmail.com>
Date: Tue, 23 Nov 2010 15:11:58 -0800
Local: Tues, Nov 23 2010 6:11 pm
Subject: Re: [project-voldemort] Re: Order Preserving Partitions in Voldemort

Yes, these make sense to me, and could also be an alternative to other
kinds of modeling that is being discussed on other threads. Benefit of
this approach is that it would be possibly to prevent loading of extra
data, instead of reading and discarding it on server side.

Ability to do range queries seems useful, but even Cassandra
seems/seemed to be moving away from this direction, since it is rather
difficult to handle even load balancing with order-preserving key
hash. It is tricky balance; usefulness of ordering versus difficulties
in making it scale, especially when adding/removing hosts.

However I have been thinking that one way around this would be using
combo-keys; where part of key is used for partitioning data, and other
parts as "local" key. This would work for many (but not all) use
cases: for example, where you have natural multi-level hierarchy (like
S3 has account/bucket/key), and distribution of data is good enough
even if only using part of key.
In such cases one could even just hash part of the key, locate host(s)
that have the data, and then access ranges from individual host.

-+ Tatu +-


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »