restore backup to a new cluster (4.0.1)


Micha

<micha-1@fantasymail.de>
Jun 19, 2020, 6:50:28 AM
to scylladb-users@googlegroups.com
Hi,

when replacing the cluster hardware (all nodes) with a new cluster (to switch to
newer hardware), is it OK to make a backup on the old cluster and copy it to
the new cluster? The number of nodes is the same.

Or is there a preferred way of doing this?

thanks
Michael




Dan Yasny

<dyasny@scylladb.com>
Jun 19, 2020, 9:09:10 AM
to scylladb-users@googlegroups.com
If you want to keep the database live as you replace nodes, you could try two approaches:
1. Replace the nodes one by one, using https://docs.scylladb.com/operating-scylla/procedures/cluster-management/replace_dead_node/
2. Build a second DC with the new nodes, alter the schema to have replicas in the second DC, wait for the data to propagate over, and then remove the replicas in the original DC and turn it off.
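Roughly, option 2 could look like the sketch below (the keyspace name ks1, the replication factor of 3, and the datacenter names DC_old / DC_new are all placeholders, not anything from this thread):

  # add replicas in the new DC to the keyspace's replication settings
  cqlsh -e "ALTER KEYSPACE ks1 WITH replication =
    {'class': 'NetworkTopologyStrategy', 'DC_old': 3, 'DC_new': 3};"

  # on every new node, stream the existing data from the old DC
  nodetool rebuild DC_old

  # once the new DC holds the data and clients point at it, drop the old DC
  cqlsh -e "ALTER KEYSPACE ks1 WITH replication =
    {'class': 'NetworkTopologyStrategy', 'DC_new': 3};"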


Micha

<micha-1@fantasymail.de>
Jun 19, 2020, 10:56:04 AM
to scylladb-users@googlegroups.com
Thanks for the answer. It isn't required to keep the old cluster alive.

I will think about adding the second DC, but does the backup approach
also work? I'm not sure because of the token-range-to-node distribution...

Michael



On Friday, June 19, 2020 at 15:08:57 CEST, Dan Yasny wrote:
> If you want to keep the database live as you replace nodes, you could try
> two approaches:
> 1. Replace the nodes one by one, using
> https://docs.scylladb.com/operating-scylla/procedures/cluster-management/replace_dead_node/
> 2. Build a second DC with the new

Dan Yasny

<dyasny@scylladb.com>
Jun 19, 2020, 11:05:09 AM
to scylladb-users@googlegroups.com
On Fri, Jun 19, 2020 at 10:56 AM Micha <mic...@fantasymail.de> wrote:
Thanks for the answer. It isn't required to keep the old cluster alive.

I will think about adding the second DC, but does the backup approach
also work? I'm not sure because of the token-range-to-node distribution...

If you are restoring to the same number and size (core count) of nodes, just make sure you back up and restore the system tables as well. That will enforce the same token-ring topology as the original cluster had. You need to have the exact same IP addresses as before, of course.
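Roughly, that could look like this on each old node (a hedged sketch: /var/lib/scylla/data is the usual default data directory, and the keyspace/table names, snapshot tag, and <uuid> suffix are placeholders):

  # snapshot every keyspace, including the system keyspaces
  nodetool snapshot -t migrate

  # snapshots land under the per-table data directories, e.g.
  #   /var/lib/scylla/data/<keyspace>/<table>-<uuid>/snapshots/migrate/
  # copy each snapshot's files into the matching table directory on the
  # corresponding new node (which keeps the same IP address), e.g.:
  rsync -a /var/lib/scylla/data/ks1/t1-<uuid>/snapshots/migrate/ \
        new-node:/var/lib/scylla/data/ks1/t1-<uuid>/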

Another option would be to set the token list in scylla.yaml as initial_token: [tokenlist], but this will require you to remove the setting later on.
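For example (the query against system.local is standard CQL; how you format the result into scylla.yaml is up to you):

  # on each old node, read that node's tokens
  cqlsh -e "SELECT tokens FROM system.local;"

  # put the result, comma separated, into scylla.yaml on the matching new node
  # before its first start, e.g.:
  #   initial_token: <token1>,<token2>,...
  # and remove the line again once the new cluster is up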
 

One last and more generic option would be to simply load all the sstables into all the nodes and do a nodetool refresh / nodetool cleanup after each ingestion step. This will load the node with the relevant data and shed the irrelevant data. A bit of an overkill, but you will not lose anything and will be able to restore to a cluster of any size/configuration instead of an exact mirror of the old cluster.
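A sketch of one ingestion step, assuming a keyspace ks1 with a table t1 and the standard per-table upload directory (all placeholder names):

  # copy the sstables into the table's upload directory on the target node,
  # e.g. /var/lib/scylla/data/ks1/t1-<uuid>/upload/ , then:
  nodetool refresh ks1 t1      # load the newly placed sstables
  nodetool cleanup ks1         # drop data this node does not own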


 Michael



On Friday, June 19, 2020 at 15:08:57 CEST, Dan Yasny wrote:
> If you want to keep the database live as you replace nodes, you could try
> two approaches:
> 1. Replace the nodes one by one, using
> https://docs.scylladb.com/operating-scylla/procedures/cluster-management/replace_dead_node/
> 2. Build a second DC with the new nodes, alter the schema to have replicas
> in the second DC, wait for the data to propagate over, and then remove the
> replicas in the original DC and turn it off.
>




Micha

<micha-1@fantasymail.de>
Jun 21, 2020, 6:18:17 AM
to scylladb-users@googlegroups.com, Dan Yasny
On Friday, June 19, 2020 at 17:04:56 CEST, Dan Yasny wrote:
> If you are restoring to the same number and size (core count) of nodes,
> just make sure you back up and restore the system tables as well. That will
> enforce the same token-ring topology as the original cluster had. You need
> to have the exact same IP addresses as before, of course.

OK, the IPs are different; this cannot be changed.

> Another option would be to set the token list in scylla.yaml as
> initial_token: [tokenlist], but this will require you to remove the setting
> later on.

Just to be sure:

From each old node I get the token list and set it as "initial_token: ..." on a new
node: so on node new_1 I set the initial_token list from, say, node old_1.
Then I put the backup from old_1 into new_1's table directories.
After doing this for all 8 nodes I start the new cluster and do a repair.
After that I have to remove the initial_token setting from the config.
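For illustration, one old_1 -> new_1 step might look like this (a sketch only: the paths, host names, and <uuid> suffix are placeholders, and old_1's tokens are assumed to be pinned as initial_token on new_1 already):

  # copy old_1's backed-up sstables into the matching table directory on new_1
  rsync -a /backup/old_1/ks1/t1/ new_1:/var/lib/scylla/data/ks1/t1-<uuid>/

  # repeat for every table and every old_N -> new_N pair, start the new
  # cluster, then run a repair on each node and remove the initial_token lines
  nodetool repair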

>
> One last and more generic option would be to simply load all the
> sstables into all the nodes and do a nodetool refresh / nodetool cleanup
> after each ingestion step. This will load the node with the relevant data
> and shed the irrelevant data. A bit of an overkill, but you will not lose
> anything and will be able to restore to a cluster of any size/configuration
> instead of an exact mirror of the old cluster.

With this variant I do it as follows:

I put all the sstable files from, say, old_1 into the table upload folders of
node new_1, then do a refresh.
Then I put all the sstables from old_2 into the table upload folders of new_2
and do a refresh.
After doing this for all 8 nodes, I execute a cleanup on all nodes.

Is this correct? (Any mistake is just... really time consuming.)
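As a hedged sketch, the refresh pass on one node could be scripted from the directory layout itself (assuming the default /var/lib/scylla/data layout, where table directories are named <table>-<uuid> and each contains an upload/ subdirectory):

  # refresh every table's upload directory on this node
  # (a refresh with an empty upload directory simply loads nothing)
  for dir in /var/lib/scylla/data/*/*/upload; do
      [ -d "$dir" ] || continue
      ks=$(basename "$(dirname "$(dirname "$dir")")")
      tbl=$(basename "$(dirname "$dir")")
      nodetool refresh "$ks" "${tbl%-*}"   # strip the -<uuid> suffix
  done

  # after all 8 nodes have been loaded
  nodetool cleanup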

cheers
Michael









