'Other bootstrapping/leaving/moving nodes detected' error

Gleb Natapov

<gleb@scylladb.com>

Nov 19, 2015, 4:34:29 AM

to Asias He, scylladb-dev

Recently I started to get this error:

Exiting on unhandled exception of type 'std::runtime_error': Other
bootstrapping/leaving/moving nodes detected, cannot bootstrap while
cassandra.consistent.rangemovement is true

while starting a cluster for the first time. I start all the Scylla nodes
almost simultaneously, but if one is started half a second too late I get
this error. Shouldn't there be a period during the first run when nodes
wait for all other nodes to join?

--
Gleb.

Asias He

<asias@scylladb.com>

Nov 19, 2015, 5:12:26 AM

to Gleb Natapov, scylladb-dev

On Thu, Nov 19, 2015 at 5:34 PM, Gleb Natapov <gl...@scylladb.com> wrote:

When you start the nodes simultaneously, do you get the error?

Sorry, what do you mean here? First of all, the check is there to prevent operational errors.

In real life, we should bootstrap nodes one by one. However, to make testing easier, we can bootstrap multiple nodes at once, since there is probably no data. In that case we can set cassandra.consistent.rangemovement to false.

--

Asias

Gleb Natapov

<gleb@scylladb.com>

Nov 19, 2015, 5:16:25 AM

to Asias He, scylladb-dev

On Thu, Nov 19, 2015 at 06:11:56PM +0800, Asias He wrote:
> On Thu, Nov 19, 2015 at 5:34 PM, Gleb Natapov <gl...@scylladb.com> wrote:
>
> > Recently I started to get this error:
> >
> > Exiting on unhandled exception of type 'std::runtime_error': Other
> > bootstrapping/leaving/moving nodes detected, cannot bootstrap while
> > cassandra.consistent.rangemovement is true
> >
> > while starting cluster for the first time. I run all scylla's almost
> > simultaneously, but if one is started half a second too late I get this
> > error.
>
>
>
> When you start nodes simultaneously, did you get the error?
>

Define simultaneously? I sometimes get the error when I start them by
pressing enter in 3 windows one after another as fast as possible.
Definitely less than 3 seconds between each.

>
>
> > Shouldn't it be a period during first run when nodes are waiting
> > for all other nodes to join?
> >
>
>
> Sorry, what do you mean here? First of all, the check is to prevent
> operation errors.

IIRC, previously the seed node started the CQL server immediately, but
non-seed nodes waited 10 seconds for all nodes to join on the first run.
Is this different now?

--
Gleb.

Asias He

<asias@scylladb.com>

Nov 19, 2015, 5:38:19 AM

to Gleb Natapov, scylladb-dev

On Thu, Nov 19, 2015 at 6:16 PM, Gleb Natapov <gl...@scylladb.com> wrote:

OK.

Yes, it is different now. In check_for_endpoint_collision, we do this check:

    if (state == sstring(versioned_value::STATUS_BOOTSTRAPPING) ||
        state == sstring(versioned_value::STATUS_LEAVING) ||
        state == sstring(versioned_value::STATUS_MOVING)) {
        throw std::runtime_error("Other bootstrapping/leaving/moving nodes detected, cannot bootstrap while cassandra.consistent.rangemovement is true");
    }

There is no such sleep before the check, so the check now runs much earlier than before.
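
For illustration only, here is a minimal self-contained sketch of what such a check does. It assumes the gossip state is available as a plain map from endpoint address to status string. The status constants and the names range_movement_in_progress and check_for_range_movement are hypothetical stand-ins, not Scylla's actual API.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the versioned_value::STATUS_* constants.
const std::string STATUS_BOOTSTRAPPING = "BOOT";
const std::string STATUS_LEAVING = "LEAVING";
const std::string STATUS_MOVING = "MOVING";
const std::string STATUS_NORMAL = "NORMAL";

// Assumed shape of the gossip view: endpoint address -> STATUS string.
using endpoint_states = std::map<std::string, std::string>;

// Returns true if any known endpoint is bootstrapping, leaving or moving.
bool range_movement_in_progress(const endpoint_states& states) {
    for (const auto& entry : states) {
        const std::string& state = entry.second;
        if (state == STATUS_BOOTSTRAPPING ||
            state == STATUS_LEAVING ||
            state == STATUS_MOVING) {
            return true;
        }
    }
    return false;
}

// Refuses to continue while another node is changing token ownership.
void check_for_range_movement(const endpoint_states& states) {
    if (range_movement_in_progress(states)) {
        throw std::runtime_error(
            "Other bootstrapping/leaving/moving nodes detected, "
            "cannot bootstrap while cassandra.consistent.rangemovement is true");
    }
}
```

With nodes started within a second of each other and no sleep before the check, a second node can easily see the first one still in the bootstrapping state, which is the situation that raises the exception.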

--

Asias

Gleb Natapov

<gleb@scylladb.com>

Nov 19, 2015, 5:41:20 AM

to Asias He, scylladb-dev

OK, so this is the intended behaviour and matches Cassandra's, right?

--
Gleb.

Asias He

<asias@scylladb.com>

Nov 19, 2015, 5:53:15 AM

to Gleb Natapov, scylladb-dev

Exactly.

--

Asias

Vlad Zolotarov

<vladz@cloudius-systems.com>

Nov 19, 2015, 6:22:53 AM

to Asias He, Gleb Natapov, scylladb-dev

That's a bit strange - this would cause a lot of false alarms both in
real life and in standard testing environments (e.g. Jepsen),
which bring up nodes "simultaneously" and not "one-by-one". What is
"one-by-one" anyway? Do you mean that a user has to use nodetool to
check the node state before bringing up the next node? Sounds
insane!

Asias He

<asias@scylladb.com>

Nov 19, 2015, 6:52:40 AM

to Vlad Zolotarov, Gleb Natapov, scylladb-dev

Take a look:

http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_node_to_cluster_t.html

'''

3. Use nodetool status to verify that the node is fully bootstrapped and all other nodes are up (UN) and not in any other state.

Note: Consistency issues might result when more than one node is added at a time. To assess the risks to your environment, see JIRA issues CASSANDRA-2434 and CASSANDRA-7069. Although adding multiple nodes at the same time is not a best practice, you can reduce data consistency risks by following these steps:

'''

--

Asias

Gleb Natapov

<gleb@scylladb.com>

Nov 19, 2015, 6:54:23 AM

to Asias He, Vlad Zolotarov, scylladb-dev

> CASSANDRA-2434 <https://issues.apache.org/jira/browse/CASSANDRA-2434> and
> CASSANDRA-7069 <https://issues.apache.org/jira/browse/CASSANDRA-7069>.

> Although adding multiple nodes at the same time is not a best practice, you
> can reduce data consistency risks by following these steps:
> '''
>

This is about adding a node to an existing cluster, not bootstrapping a
new, empty one.

--
Gleb.

Asias He

<asias@scylladb.com>

Nov 19, 2015, 7:10:59 AM

to Gleb Natapov, Vlad Zolotarov, scylladb-dev

The bootstrap code cannot know whether the cluster is empty or not. The user has to tell it.

If it is an empty cluster, we need to set auto_bootstrap to false. With auto_bootstrap = false, there will be no such check.

"""
auto_bootstrap (Default: true) This setting has been removed from default configuration. It makes new (non-seed) nodes automatically migrate the right data to themselves. When initializing a fresh cluster without data, add auto_bootstrap: false. Related information: Initializing a multiple node cluster (single data center) on page 57 and Initializing a multiple node cluster (multiple data centers) on page 60.

"""
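
As a rough sketch of how these settings interact, assuming the check is only reached on the bootstrap path: the struct and function names below are hypothetical and only mirror the options named in this thread, they are not Scylla's real configuration types.

```cpp
#include <cassert>

// Hypothetical view of the relevant settings (not Scylla's config class).
struct node_config {
    bool auto_bootstrap;            // auto_bootstrap (documented default: true)
    bool is_seed;                   // per the quoted docs, only non-seed nodes bootstrap
    bool consistent_rangemovement;  // cassandra.consistent.rangemovement
};

// A node streams data to itself only if auto_bootstrap is on and it is not a seed.
bool will_bootstrap(const node_config& cfg) {
    return cfg.auto_bootstrap && !cfg.is_seed;
}

// Assumption: the bootstrapping/leaving/moving check is enforced only for a
// bootstrapping node, and only while consistent range movement is requested.
bool range_movement_check_applies(const node_config& cfg) {
    return will_bootstrap(cfg) && cfg.consistent_rangemovement;
}
```

Under these assumptions, a fresh cluster started with auto_bootstrap: false never reaches the check, and neither does a node started with cassandra.consistent.rangemovement set to false.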

--

Asias

Gleb Natapov

<gleb@scylladb.com>

Nov 19, 2015, 8:06:21 AM

to Asias He, Vlad Zolotarov, scylladb-dev

Can't it? If the peers table is empty, then this is the initial cluster bootstrap, no?

--
Gleb.

Vlad Zolotarov

<vladz@cloudius-systems.com>

Nov 19, 2015, 8:19:14 AM

to Asias He, Gleb Natapov, scylladb-dev

On 11/19/15 13:52, Asias He wrote:

I see. Thanks for the clarification.

Vlad Zolotarov

<vladz@cloudius-systems.com>

Nov 19, 2015, 8:22:02 AM

to Gleb Natapov, Asias He, scylladb-dev

Won't it just mean that this is the node's first boot? The system.peers
table is a node-local table filled in by the gossiper, isn't it?

Gleb Natapov

<gleb@scylladb.com>

Nov 19, 2015, 8:23:10 AM

to Vlad Zolotarov, Asias He, scylladb-dev

True.

--
Gleb.

Vlad Zolotarov

<vladz@cloudius-systems.com>

Nov 19, 2015, 8:25:39 AM

to Gleb Natapov, Asias He, scylladb-dev

In that case, looking at the "peers" table won't help to determine whether
the cluster is empty, since the cluster may be quite "full" when we try to
boot the current node... ;)

Gleb Natapov

<gleb@scylladb.com>

Nov 19, 2015, 8:28:59 AM

to Vlad Zolotarov, Asias He, scylladb-dev

Yes, that is what I agreed with, but OTOH can't a node connect to a seeder
and ask whether this is a cluster bootstrap or a node bootstrap?

--
Gleb.

Vlad Zolotarov

<vladz@cloudius-systems.com>

Nov 19, 2015, 8:33:31 AM

to Gleb Natapov, Asias He, scylladb-dev

I'd like Asias to answer that. It seems to me that the gossiper (of a
seeder) doesn't have a state such as "cluster bootstrap", but it sounds
like a good idea to add one...

Asias He

<asias@scylladb.com>

Nov 19, 2015, 8:40:23 AM

to Vlad Zolotarov, Gleb Natapov, scylladb-dev

Actually, we can check whether there are any non-system keyspaces/tables in the schema. If there are no user keyspaces/tables, then it is an empty cluster. But we only get this schema information after the check in join_token_ring.

If there is no data in the cluster, why not just set auto_bootstrap to false? I don't think it is a real issue for us.
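
The schema-based emptiness test mentioned above could be sketched roughly as follows. This is an illustration under assumptions: the schema is reduced to a list of keyspace names, and both the set of system keyspace names and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Assumption: these names identify system keyspaces; the real list may differ.
bool is_system_keyspace(const std::string& name) {
    static const std::set<std::string> system_keyspaces = {"system", "system_traces"};
    return system_keyspaces.count(name) > 0;
}

// The cluster is treated as empty when every known keyspace is a system one.
bool cluster_looks_empty(const std::vector<std::string>& keyspaces) {
    return std::all_of(keyspaces.begin(), keyspaces.end(), is_system_keyspace);
}
```

The sketch does not address the ordering problem: the schema would have to be known before the check runs, which is not the case today.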

--

Asias

Gleb Natapov

<gleb@scylladb.com>

Nov 19, 2015, 8:42:13 AM

to Asias He, Vlad Zolotarov, scylladb-dev

On Thu, Nov 19, 2015 at 09:39:52PM +0800, Asias He wrote:
> If there is no data in the cluster, why not just set auto_bootstrap to
> false. I don't think it is a real issue for us.

Fewer chances for error, but for our testing it should not be an issue.

--
Gleb.

Asias He

<asias@scylladb.com>

Nov 19, 2015, 8:45:45 AM

to Gleb Natapov, Vlad Zolotarov, scylladb-dev

The check is trying to prevent errors.

What error do you mean above?

--

Asias

Gleb Natapov

<gleb@scylladb.com>

Nov 19, 2015, 8:47:59 AM

to Asias He, Vlad Zolotarov, scylladb-dev

Configuring auto_bootstrap=false and connecting to a running cluster.

--
Gleb.

Asias He

<asias@scylladb.com>

Nov 19, 2015, 8:53:32 AM

to Gleb Natapov, Vlad Zolotarov, scylladb-dev

That is not a problem either. The user can have the node join the cluster (streaming data to the node) later using nodetool.

--

Asias

Gleb Natapov

<gleb@scylladb.com>

Nov 19, 2015, 8:56:06 AM

to Asias He, Vlad Zolotarov, scylladb-dev

He would have to notice that the node did not join the cluster. But I
guess he will see this in nodetool.

--
Gleb.

Asias He

<asias@scylladb.com>

Nov 19, 2015, 8:59:39 AM

to Gleb Natapov, Vlad Zolotarov, scylladb-dev

Yes, there is a status code for it.

--

Asias

Asias He

<asias@scylladb.com>

Nov 19, 2015, 9:06:41 AM

to Gleb Natapov, Vlad Zolotarov, scylladb-dev

Like this?

Configure the seed node to set a flag meaning "there is no data in the cluster".

A new node joins, sees the flag and skips bootstrapping.

Configure the seed node to clear the flag once there is data.

This looks problematic, e.g., what if a client starts to write data and a new node joins while the flag has not yet been updated?

It isn't worth the trouble just to optimize the empty cluster case.

--

Asias

Gleb Natapov

<gleb@scylladb.com>

Nov 19, 2015, 9:29:22 AM

to Asias He, Vlad Zolotarov, scylladb-dev

Almost like that, but set the flag based not on cluster emptiness but on
the first 10s of the cluster running, so if a node tries to connect to a
seed during the first 10s it knows that this is the initial bootstrap and
waits for 10s before starting the CQL transport. I am not sure how gossip
works exactly, but what I am trying to replicate here is the old
behaviour: on first start, non-seed nodes wait for 10s before they start
working.
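
A minimal sketch of that idea, for illustration only. It assumes the seed knows when the cluster was first started (how it learns or persists that is left open) and that a joining node can ask the seed for the flag. All names are hypothetical; the 10s window is the figure from this thread.

```cpp
#include <cassert>
#include <chrono>

using clock_type = std::chrono::steady_clock;

// Length of the initial-bootstrap window (10s, as discussed above).
const std::chrono::seconds initial_bootstrap_window(10);

// Seed side: true while the cluster is within its first 10s of running.
bool in_initial_bootstrap(clock_type::time_point cluster_started,
                          clock_type::time_point now) {
    assert(now >= cluster_started);
    return now - cluster_started < initial_bootstrap_window;
}

// Joining side: how long to wait before starting the CQL transport.
// Only non-seed nodes that were told "initial bootstrap" wait.
std::chrono::seconds startup_delay(bool is_seed, bool seed_reports_initial_bootstrap) {
    if (!is_seed && seed_reports_initial_bootstrap) {
        return initial_bootstrap_window;
    }
    return std::chrono::seconds(0);
}
```

A node joining after the window has closed gets no delay and would go through the normal bootstrap path, including the bootstrapping/leaving/moving check.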

--
Gleb.

Vlad Zolotarov

<vladz@cloudius-systems.com>

Nov 19, 2015, 9:36:25 AM

to Gleb Natapov, scylladb-dev, Asias He

+1

Asias He

<asias@scylladb.com>

Nov 20, 2015, 2:04:41 AM

to Gleb Natapov, Vlad Zolotarov, scylladb-dev

With

[PATCH scylla 0/2] Relax bootstrapping/leaving/moving nodes check

your old scripts to start nodes should work as before.

--

Asias
