Question on Blueflood Rollup Redundancy and Sharding


Kevin Mendel

Mar 21, 2016, 4:13:05 PM
to Blueflood Discuss
I am looking for a little clarity regarding rollup node redundancy. We are trying to expand our deployment very soon. 

I guess what I am asking is: is the redundancy at the shard level, or is it at the node level?

One configuration scenario that was described to me informally was having 4 pairs of rollup nodes, each pair operating on a set of shards. 
That sounds like one node of the pair is rolling up while the other node in the pair is on standby. 
When the active node goes down, the standby node is activated and continues rolling up the set of shards.

This seems to imply that a rollup node is either entirely on standby, or entirely active.

But the documentation I have been able to find on this is rather ambiguous in this area:

Each rollup node is responsible for managing one or more 'shards.' It is possible (recommended!) to configure your Blueflood cluster in such a way that multiple rollup nodes are responsible for the same shards. If a rollup node goes down, another rollup node will pick up the shards assigned to the downed node and roll up the metrics in those shards.
 
Zookeeper is used by nodes to claim active 'ownership' of a particular shard so that multiple nodes aren't rolling up the same data.

This suggests (but does not confirm) that the redundancy might happen in a completely different way. That is, on a shard-by-shard basis: the active node of each shard is determined independently of the other shards. 

In other words, if a node goes down, its shards could be taken over by other nodes that are already actively rolling up other shards. A rollup node can be rolling up shards, and if another node goes down, it would be rolling up MORE shards. 

Or maybe this is a distinction without a difference. What is possible might be ruled out by best practices.

Can anyone clear this up for me?

Could I configure Blueflood like this:
    - node A, shards 0-63
    - node B, shards 32-95
    - node C, shards 64-127
    - node D, shards 96-127 and 0-31

In that configuration, each shard is owned by 2 nodes. If a node fails, another node is available to take over. 
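
The claim that each shard has two candidate nodes can be checked mechanically. The following is a minimal sketch, assuming 128 shards numbered 0-127 (inferred from the ranges above); `ASSIGNMENT` and `owners_by_shard` are hypothetical names for illustration and are not part of Blueflood or its configuration format.

```python
from collections import defaultdict

# Hypothetical shard assignment copied from the list above.
# The total of 128 shards (0-127) is an assumption based on those ranges.
ASSIGNMENT = {
    "A": [range(0, 64)],
    "B": [range(32, 96)],
    "C": [range(64, 128)],
    "D": [range(96, 128), range(0, 32)],
}


def owners_by_shard(assignment):
    """Map each shard number to the sorted list of nodes configured for it."""
    owners = defaultdict(list)
    for node, ranges in assignment.items():
        for shard_range in ranges:
            for shard in shard_range:
                owners[shard].append(node)
    return {shard: sorted(nodes) for shard, nodes in owners.items()}


owners = owners_by_shard(ASSIGNMENT)

# Every shard from 0 to 127 is covered, and each has exactly two candidates.
assert sorted(owners) == list(range(128))
assert all(len(nodes) == 2 for nodes in owners.values())

print(owners[0], owners[40], owners[100])
```

This only checks the configuration itself; it says nothing about which of the two candidate nodes would actually be rolling up a given shard at runtime.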

If this is a possible configuration, would it be possible for all 4 nodes to be rolling up some number of shards?





Kevin Mendel

Mar 21, 2016, 4:16:12 PM
to Blueflood Discuss
P.S. I don't actually want to configure Blueflood that way. It's just a way to pursue the question.

Scott Nelson Windels

Mar 21, 2016, 5:20:23 PM
to Blueflood Discuss
Hi Kevin -

I am the current DevOps engineer for the Rackspace Metrics team.  We currently have 8 rollup nodes, each processing 1/8 of the available shards.  If a node fails, we would stop processing the rollups for that set of shards.
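
A static split like the one described above can be sketched as follows. This is an illustration only: the total of 128 shards is assumed from the ranges earlier in the thread, and `split_shards` is a hypothetical helper, not a Blueflood function or setting.

```python
def split_shards(total_shards, node_count):
    """Split shards 0..total_shards-1 into contiguous, near-equal ranges."""
    base, extra = divmod(total_shards, node_count)
    ranges = []
    start = 0
    for index in range(node_count):
        # The first `extra` nodes take one additional shard each.
        size = base + (1 if index < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


# Assumed numbers: 128 shards spread over 8 rollup nodes.
ranges = split_shards(128, 8)
for index, shard_range in enumerate(ranges):
    print(f"node {index}: shards {shard_range.start}-{shard_range.stop - 1}")

# With no overlap between nodes, losing one node leaves its whole range
# unprocessed until a replacement is built.
unprocessed = list(ranges[3])
print(f"if node 3 is down, shards {unprocessed[0]}-{unprocessed[-1]} wait")
```

With these assumed numbers each node owns 16 shards, and no shard has a second candidate node.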
Before I arrived in this role, I know that there was an attempt to use Zookeeper to help provide redundancy for the rollup nodes, but it was not implemented fully and we have not spent time trying to implement it since I've been here.  I think it will be a worthwhile effort at some point, but it hasn't risen to the top of the queue.  Building and deploying a new rollup node would take no more than about 30 minutes, I would guess (I could probably do it faster if needed), so it is not fully automated, but most of the process is. 

When we look at a future solution we'll likely revisit Zookeeper and also review etcd and consul as more recent options for automatic service discovery in the infrastructure.  There are a few places in the application where increased awareness between the components would be a big win.

I don't know if that answers your question - but hopefully gives the background for it.

thx
Scott

Kevin Mendel

Mar 21, 2016, 5:32:08 PM
to Blueflood Discuss
Are you saying you don't use the zookeeper/redundancy feature, because it's not really needed?
Or are you saying rollup redundancy is not actually functional or supported at the moment?

If the last one, that's quite a shocker. 

I am now more confused than I was. 

Vinny Ly

Mar 21, 2016, 5:46:56 PM
to Blueflood Discuss
Hi Kevin, 

The text you referred to in the Blueflood wiki on GitHub is confusing, I agree. Basically, the idea in that documentation is to use Zookeeper to control deployment and shard assignment, resetting the shard assignment values in blueflood.conf on the rollup nodes when one goes down. You can try doing that if you need that kind of redundancy.

For us, in practice, we don't actually do that today because of the complexity and upkeep.  We currently have Chef recipes in place to create rollup nodes really quickly, and we actively monitor the state of all the rollup nodes.  If one goes down, we know about it pretty much right away. There will be a few hours of downtime for the affected shards, but ultimately the data will be rolled up again.

Re: what you proposed with the following configuration:
>>>
Could I configure Blueflood like this:
    - node A, shards 0-63
    - node B, shards 32-95
    - node C, shards 64-127
    - node D, shards 96-127 and 0-31
>>>

In my opinion that would actually work.  Remember that a rollup doesn't modify data, it just creates new data points.  So all you'll be doing is the same rollup work twice for every shard.  It doesn't mess up the data and will provide a level of redundancy, but it does mean that you're rolling up twice.  Zookeeper is a better route for redundancy, I think.

Vinny Ly

Mar 21, 2016, 5:49:39 PM
to Blueflood Discuss
Also, there's no built-in rollup redundancy feature, not in the way you are thinking of (e.g. Cassandra token sharding and redundancy).  It's all done with devops tools at the moment.  We do think it is something we can explore in the future, though, and will make sure it's on our backlog of things to add.



Vinny Ly

Mar 21, 2016, 5:52:37 PM
to Blueflood Discuss
For clarification, this line

"a rollup doesn't modify data, it just creates new data points." should actually be 
"a rollup doesn't modify data, it just creates new data points, or replaces ones that already exist".  
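
To illustrate why a repeated rollup that only creates or replaces points would be harmless, here is a minimal sketch. It assumes rolled-up points are written under a key derived from the metric name and time slot, so a second write lands on the same key; the averaging function, the 300-second slot width, and the dictionary standing in for storage are all hypothetical and are not Blueflood's actual schema or code.

```python
def rollup(raw_points, slot_seconds=300):
    """Average raw (timestamp, value) points into fixed-width time slots."""
    buckets = {}
    for timestamp, value in raw_points:
        slot = timestamp - timestamp % slot_seconds
        buckets.setdefault(slot, []).append(value)
    return {slot: sum(values) / len(values) for slot, values in buckets.items()}


def write_rollups(store, metric, rollups):
    """Upsert rolled-up points: create a point or replace an existing one."""
    for slot, average in rollups.items():
        store[(metric, slot)] = average


raw = [(0, 1.0), (60, 3.0), (300, 10.0)]  # raw data is read, never modified
store = {}

write_rollups(store, "cpu", rollup(raw))   # first node rolls up the shard
snapshot = dict(store)
write_rollups(store, "cpu", rollup(raw))   # second node repeats the work

# The second pass replaced each point with an identical value.
assert store == snapshot
print(store)
```

Under that assumption the cost of overlapping assignments without coordination is duplicated work, not corrupted data.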

Kevin Mendel

Mar 21, 2016, 6:16:12 PM
to Blueflood Discuss
Thanks, vinny_ly. That's partially the answer.

I understand the reasoning behind avoiding Zookeeper. And I see your reasoning for it being unnecessary.

But the original question remains: is the redundancy at the shard level, or is it at the node level?
 
In my opinion that would actually work.  Remember that a rollup doesn't modify data, it just creates new data points.  So all you'll be doing is the same rollup work twice for every shard.

Wait ... what? I thought you assigned the same shard to multiple nodes and then Zookeeper sorted out which node rolled it and which ones didn't. "Zookeeper is used by nodes to claim active 'ownership' of a particular shard so that multiple nodes aren't rolling up the same data." 
 
 It doesn't mess up the data and will provide a level of redundancy, but it does mean that you're rolling up twice.  Zookeeper is a better route for redundancy I think.

Wait ... what? Did I say I wasn't using Zookeeper in that example anywhere? I didn't think I did. 

I am just trying to figure out ... is the redundancy at the shard level, or is it at the node level [when using Zookeeper of course]?

Kevin Mendel

Mar 21, 2016, 6:19:13 PM
to Blueflood Discuss
Also, there's no built-in rollup redundancy feature, not in the way you are thinking of (e.g. Cassandra token sharding and redundancy).  It's all done with devops tools at the moment.  We do think it is something we can explore in the future, though, and will make sure it's on our backlog of things to add.

So redundancy for rollup using Zookeeper ISN'T supported as described? Or IS supported? 

Vinny Ly

Mar 21, 2016, 6:23:10 PM
to Blueflood Discuss
My bad, ignore my reply on the overlapping sharding comments; it was just a theory that came from my misreading your question.  I will have to dig deeper to figure out how you would actually set up Zookeeper for the rollup nodes.  

Kevin Mendel

Mar 21, 2016, 6:39:29 PM
to Blueflood Discuss
It sounds from what I am hearing on the #blueflood channel that zookeeper/redundancy has fallen into disuse and so there's no good guarantee that it would work. 
So that's probably why I am so confused - I keep asking how it works and I keep getting back that no one uses it. 



chinmay gupte

Mar 21, 2016, 6:43:20 PM
to Kevin Mendel, Blueflood Discuss
Hi Kevin,

If I remember correctly, the redundancy is at the shard level in blueflood. 

With zookeeper, you assign overlapping shards to multiple nodes, they fight with each other to claim ownership for shards and the one which succeeds eventually gets to do rollup work.
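
The shard-level contention described above can be simulated in a few lines. This is a sketch only: `LockRegistry` is an in-memory stand-in for a coordination service and does not use the ZooKeeper API, the claim order is fixed rather than racy, and none of this is Blueflood's actual code. The assignment is the hypothetical A/B/C/D layout from the original question.

```python
class LockRegistry:
    """In-memory stand-in for a coordination service (NOT the ZooKeeper API)."""

    def __init__(self):
        self.holder = {}  # shard number -> node currently holding it

    def try_claim(self, shard, node):
        """Claim the shard if it is free; return True if `node` holds it."""
        return self.holder.setdefault(shard, node) == node

    def release_all(self, node):
        """Drop every claim held by `node`, as if the node had gone down."""
        self.holder = {s: n for s, n in self.holder.items() if n != node}


def claim_round(registry, assignment, live_nodes):
    """Each live node tries to claim every shard it is configured for."""
    for node in live_nodes:
        for shard in assignment[node]:
            registry.try_claim(shard, node)


def shard_counts(registry):
    """Count how many shards each node currently holds."""
    counts = {}
    for node in registry.holder.values():
        counts[node] = counts.get(node, 0) + 1
    return counts


# Hypothetical overlapping assignment from the original question.
ASSIGNMENT = {
    "A": list(range(0, 64)),
    "B": list(range(32, 96)),
    "C": list(range(64, 128)),
    "D": list(range(96, 128)) + list(range(0, 32)),
}

registry = LockRegistry()
claim_round(registry, ASSIGNMENT, ["A", "B", "C", "D"])
before = shard_counts(registry)

registry.release_all("A")  # node A goes down
claim_round(registry, ASSIGNMENT, ["B", "C", "D"])
after = shard_counts(registry)

print("before:", before)
print("after: ", after)
```

In this run, node B was already active and ends up holding more shards after A fails, while D picks up the remainder, which is the shard-level behavior asked about. Which node wins each shard depends entirely on claim order; here the order is fixed, whereas a real deployment would see races.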

This fight-for-shard approach created operational problems, since Zookeeper is a network-sensitive distributed system. So although the code is functional, we ended up using static assignment of shards with Zookeeper contention.

I am not an active contributor anymore, so take it with a grain of salt, but I have seen the Zookeeper-related shard assignment code working in production.

Hope this helps,

Chinmay



Kevin Mendel

Mar 22, 2016, 10:23:51 AM
to Blueflood Discuss, kevinj...@gmail.com
Thank you Chinmay (and Vinny and Scott).

Knowing that the redundancy is at the shard level gives me the clues I need to unlock understanding in the rest of this. So that's good to hear.

However, for me, it may be only academic, since we also have an aversion to Zookeeper and if we don't need it we probably won't use it. 