drop / create database not replicated to other nodes?


cyusedfzfb

May 16, 2024, 9:28:08 AM5/16/24
to codership
Hi,

Today we discovered to our surprise that, on our 3-node galera cluster,
dropping a database did NOT replicate to the other two nodes.

The node status was "synced" on all three nodes, both before and after
the database drop.
Status continued to be "synced", while some hosts still showed the
database (and its contents!) and others no longer showed it.

Then we dropped the database manually on the other two nodes. That worked.

We then created the database fresh again, and also THAT did not
replicate to the other nodes.

Is this behaviour intentional? As designed..? Are we having issues with
our cluster?

I expected a CREATE DATABASE to replicate to the other nodes as well...
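A minimal way to reproduce what we saw (the database name here is just an example):

```sql
-- On node 1:
CREATE DATABASE repltest;

-- On nodes 2 and 3, shortly afterwards:
SHOW DATABASES LIKE 'repltest';
-- An empty result here means the CREATE DATABASE was not replicated.
```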

We are puzzled, and are utterly unsure if our cluster IS in fact
"synced" or not...

Anyone..?

mj | galera

May 16, 2024, 9:41:45 AM5/16/24
to codership
After creating the database on all three nodes, we observed that "CREATE TABLE" etc. in that DB is in fact successfully replicated to all nodes.

We have also had several times SST/IST in our cluster: all expected, and all working perfectly.

This behaviour of DROP/CREATE DATABASE is completely new and unexpected for us.

We really hope for some feedback from experts here.

cyusedfzfb

May 17, 2024, 5:25:18 AM5/17/24
to codersh...@googlegroups.com

Hi all,

Meanwhile I have read up a lot, and still hope someone here can reply and explain.

There are places that say "DROP DATABASE" is excluded from galera replication and has to be done on each node separately and manually, but there are also places that say it *IS* replicated.

However: "CREATE DATABASE" is commonly said to be replicated to the other nodes, were as we observed that it did NOT. We had to CREATE DATABASE on each node.

Once the new database was created, contents (TABLES) of that new database WERE replicated successfully.

All of this is happening on rhel8.9, mariadb 10.5.22, and the galera cluster is showing healthy, the three nodes are "synced", cluster_size = 3, they share the same wsrep_cluster_name, etc, etc. Node reboots result in successful IST's, we have performed successful SST's, etc. All in all the cluster is functioning and behaving very well.

We really hope for some insights from the mailing list on how this is supposed to work!

I guess this output helps, from one of the nodes:

MariaDB [(none)]> show status like 'wsrep_%';
+-------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
| Variable_name                 | Value                                                                                                                                          |
+-------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
| wsrep_local_state_uuid        | 470cef5b-f0ca-11ee-ab68-87cd9b53c729                                                                                                           |
| wsrep_protocol_version        | 10                                                                                                                                             |
| wsrep_last_committed          | 41921937                                                                                                                                       |
| wsrep_replicated              | 7                                                                                                                                              |
| wsrep_replicated_bytes        | 2680                                                                                                                                           |
| wsrep_repl_keys               | 12                                                                                                                                             |
| wsrep_repl_keys_bytes         | 264                                                                                                                                            |
| wsrep_repl_data_bytes         | 1950                                                                                                                                           |
| wsrep_repl_other_bytes        | 0                                                                                                                                              |
| wsrep_received                | 8208123                                                                                                                                        |
| wsrep_received_bytes          | 47764240283                                                                                                                                    |
| wsrep_local_commits           | 2                                                                                                                                              |
| wsrep_local_cert_failures     | 0                                                                                                                                              |
| wsrep_local_replays           | 0                                                                                                                                              |
| wsrep_local_send_queue        | 0                                                                                                                                              |
| wsrep_local_send_queue_max    | 1                                                                                                                                              |
| wsrep_local_send_queue_min    | 0                                                                                                                                              |
| wsrep_local_send_queue_avg    | 0                                                                                                                                              |
| wsrep_local_recv_queue        | 0                                                                                                                                              |
| wsrep_local_recv_queue_max    | 114                                                                                                                                            |
| wsrep_local_recv_queue_min    | 0                                                                                                                                              |
| wsrep_local_recv_queue_avg    | 0.00758419                                                                                                                                     |
| wsrep_local_cached_downto     | 41080804                                                                                                                                       |
| wsrep_flow_control_paused_ns  | 47828550454                                                                                                                                    |
| wsrep_flow_control_paused     | 6.24912e-05                                                                                                                                    |
| wsrep_flow_control_sent       | 1                                                                                                                                              |
| wsrep_flow_control_recv       | 29                                                                                                                                             |
| wsrep_flow_control_active     | false                                                                                                                                          |
| wsrep_flow_control_requested  | false                                                                                                                                          |
| wsrep_cert_deps_distance      | 15.5805                                                                                                                                        |
| wsrep_apply_oooe              | 0.0373219                                                                                                                                      |
| wsrep_apply_oool              | 0.00186536                                                                                                                                     |
| wsrep_apply_window            | 1.04028                                                                                                                                        |
| wsrep_apply_waits             | 594                                                                                                                                            |
| wsrep_commit_oooe             | 0                                                                                                                                              |
| wsrep_commit_oool             | 0                                                                                                                                              |
| wsrep_commit_window           | 1.02563                                                                                                                                        |
| wsrep_local_state             | 4                                                                                                                                              |
| wsrep_local_state_comment     | Synced                                                                                                                                         |
| wsrep_cert_index_size         | 809                                                                                                                                            |
| wsrep_causal_reads            | 0                                                                                                                                              |
| wsrep_cert_interval           | 1882.9                                                                                                                                         |
| wsrep_open_transactions       | 0                                                                                                                                              |
| wsrep_open_connections        | 0                                                                                                                                              |
| wsrep_incoming_addresses      | 1.2.3.103:0,1.2.3.112:0,1.2.3.100:0                                                                                          |
| wsrep_cluster_weight          | 3                                                                                                                                              |
| wsrep_desync_count            | 0                                                                                                                                              |
| wsrep_evs_delayed             |                                                                                                                                                |
| wsrep_evs_evict_list          |                                                                                                                                                |
| wsrep_evs_repl_latency        | 0.000562427/0.000562427/0.000562427/0/1                                                                                                        |
| wsrep_evs_state               | OPERATIONAL                                                                                                                                    |
| wsrep_gcomm_uuid              | 8e50abd0-0d38-11ef-ba95-dfec4a163b8a                                                                                                           |
| wsrep_gmcast_segment          | 0                                                                                                                                              |
| wsrep_applier_thread_count    | 4                                                                                                                                              |
| wsrep_cluster_capabilities    |                                                                                                                                                |
| wsrep_cluster_conf_id         | 187                                                                                                                                            |
| wsrep_cluster_size            | 3                                                                                                                                              |
| wsrep_cluster_state_uuid      | 470cef5b-f0ca-11ee-ab68-87cd9b53c729                                                                                                           |
| wsrep_cluster_status          | Primary                                                                                                                                        |
| wsrep_connected               | ON                                                                                                                                             |
| wsrep_local_bf_aborts         | 0                                                                                                                                              |
| wsrep_local_index             | 0                                                                                                                                              |
| wsrep_provider_capabilities   | :MULTI_MASTER:CERTIFICATION:PARALLEL_APPLYING:TRX_REPLAY:ISOLATION:PAUSE:CAUSAL_READS:INCREMENTAL_WRITESET:UNORDERED:PREORDERED:STREAMING:NBO: |
| wsrep_provider_name           | Galera                                                                                                                                         |
| wsrep_provider_vendor         | Codership Oy <in...@codership.com>                                                                                                              |
| wsrep_provider_version        | 26.4.14(rXXXX)                                                                                                                                 |
| wsrep_ready                   | ON                                                                                                                                             |
| wsrep_rollbacker_thread_count | 1                                                                                                                                              |
| wsrep_thread_count            | 5                                                                                                                                              |
+-------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+

Many thanks in advance for insights!

--
You received this message because you are subscribed to the Google Groups "codership" group.
To unsubscribe from this group and stop receiving emails from it, send an email to codership-tea...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/codership-team/59536d52-09b8-4cd5-b4ae-74acb8fda1d4n%40googlegroups.com.

Wayne Gemmell

May 21, 2024, 7:07:01 AM5/21/24
to cyusedfzfb, codersh...@googlegroups.com
I had similar problems when specifying the wsrep_thread_count variable. My setting was a bit higher than that and everything just went inconsistent. Maybe try leaving that as a default value and see if that helps? Disclaimer, I'm a n00b at this and just relaying what worked for me.

Jan Lindström

May 21, 2024, 8:32:06 AM5/21/24
to codership
Hi,

Can you provide some example where you can verify that CREATE/DROP DATABASE is not replicated? What exact version are you using? You could also provide your config file.

R: Jan

sacawulu

May 22, 2024, 2:49:26 AM5/22/24
to codersh...@googlegroups.com
Hi Jan Lindström and Wayne Gemmell,

Thanks very much for your replies, much appreciated..!

Today I will prepare and post everything as requested.

I have not yet incorporated your suggestion Wayne, to reduce
wsrep_thread_count, but it is appreciated and we will try it in the
coming days.

Worrying that a non-default wsrep_thread_count setting could possibly
have such side effects...

I will get back to you with configs, and 'proof' to show what we observed.

Again: thanks for your replies!

MJ



cyusedfzfb

May 22, 2024, 6:54:23 AM5/22/24
to codersh...@googlegroups.com

Dear Jan and Wayne, and others as well.

We have found our issue.

Our galera cluster was also acting as an 'old style' (I know it's not old) master/slave replication source, and therefore our /etc/my.cnf contained the following lines (at the time NOT YET commented out):

server-id=UNIQUE_NUMBER
#log_bin = /data/mysql/mysql-bin.log
#binlog_do_db = db1
#binlog_do_db = db2
#replicate-do-db = db1
#replicate-do-db = db2
#relay_log = /data/mysql/mysql-relay.log

With the above 7 lines uncommented, no "CREATE DATABASE xxx;" was ever replicated, we tried both db names (db1 / db2) from the config, but also new unique db names.

But here it comes: after commenting out as above, only leaving "server-id=UNIQUE_NUMBER", restarting mariadb, everything started working immediately:

# CREATE DATABASE TEST1;

immediately showed up at the other hosts.

But of course, removing these lines will break replication to the old-style slave. :-(
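One thing we may try (a sketch, assuming the old-style slave is a separate server with its own my.cnf): replicate-do-db is documented as a replica-side filter, so it could live on the slave only, leaving the Galera nodes with just the binlog options:

```ini
# On the Galera nodes (binlog source for the old-style slave):
[mysqld]
server-id    = UNIQUE_NUMBER
log_bin      = /data/mysql/mysql-bin.log
binlog_do_db = db1
binlog_do_db = db2

# On the old-style slave only (replication filters are applied on the replica):
[mysqld]
replicate-do-db = db1
replicate-do-db = db2
relay_log       = /data/mysql/mysql-relay.log
```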

For us, the side effect of those 6 lines was completely unexpected..? But perhaps that's just our stupidity?

Opinions, anyone...?

MJ



cyusedfzfb

May 22, 2024, 8:38:03 AM5/22/24
to codersh...@googlegroups.com
Hi,

Just to report back with more (per request) information, and also the
fix in my.cnf:

#    replicate-do-db = db1 # commented out, because these two lines cause
#    replicate-do-db = db2 # CREATE & DROP DATABASE statements to no longer be replicated

I was under the impression that the above lines were specifically for
old-style master/slave replication, but they also apply to galera
replication.

Commenting them out, fixes the issue in the subject line.

For us it is *still* unexpected that, with specific databases configured
under replicate-do-db, all databases (including db1 & db2!) have to be
created and dropped manually. We wonder why..?
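For what it's worth, the documented (non-Galera) filter semantics we found are roughly the following; how Galera's own replication interacts with these filters is exactly what we are unsure about:

```sql
-- With replicate-do-db = db1 (and db2) in effect, the MariaDB/MySQL docs say
-- CREATE/DROP/ALTER DATABASE are matched against the database *named in the
-- statement*, not the current default database:
CREATE DATABASE db3;   -- expected to be filtered out (db3 is not in the do list)
CREATE DATABASE db1;   -- expected to replicate; yet in our cluster it did not
```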

Anyway: even though our issue seems to be pinpointed...as requested,
some more information on our setup:

Three identical hosts, called galera1/galera2/galera3, running AlmaLinux
release 8.9 (Midnight Oncilla), fully updated, with the stock MariaDB
server (version 10.5.22-MariaDB), installed like:
 # yum distro-sync
 # yum module reset mariadb
 # yum module enable mariadb:10.5
 # dnf install mariadb-server-galera -y

For testing I disabled selinux and firewalld. The hosts run on a local
subnet, no firewalling, and below is my.cnf (only the server-id is
customized per host, hence the placeholder XXX-UNIQUE)

[root@galera2 ~]# cat /etc/my.cnf | grep -v "^#" | grep -v "^$"
[mysqld]
innodb_file_per_table
innodb_flush_method=O_DIRECT
performance_schema=ON
datadir=/data/mysql
socket=/data/mysql/mysql.sock
symbolic-links=0
tmpdir = /data/tmp
server-id=XXX-UNIQUE
log_bin = /data/mysql/mysql-bin.log
binlog_do_db = db1
binlog_do_db = db2
#    replicate-do-db = db1 # commented out, because these two lines cause
#    replicate-do-db = db2 # CREATE & DROP DATABASE statements to no longer be replicated
relay_log = /data/mysql/mysql-relay.log
[mysqld_safe]
log-error=/var/log/mariadb/mariadb.log
pid-file=/var/run/mariadb/mariadb.pid
[mysqld]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
innodb_flush_log_at_trx_commit=2
bind-address=0.0.0.0
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_name="Galera_Cluster"
wsrep_cluster_address    = "gcomm://galera1,galera2,galera3"
wsrep_slave_threads=4
wsrep_max_ws_rows=0
wsrep_max_ws_size=2147483647
wsrep_debug=0
wsrep_convert_LOCK_to_trx=0
wsrep_retry_autocommit=1
wsrep_auto_increment_control=1
wsrep_drupal_282555_workaround=0
wsrep_causal_reads=0
wsrep_notify_cmd=/usr/local/bin/wsrep_notify.sh
wsrep_provider_options="gcache.size = 5G"
[mysqld]
datadir=/data/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mariadb/mariadb.log
pid-file=/run/mariadb/mariadb.pid

MJ

Jan Lindström

May 22, 2024, 9:25:14 AM5/22/24
to cyusedfzfb, codersh...@googlegroups.com
Hi,

As this is MariaDB, I can say that at the moment replicate-do-db does not work correctly in a Galera cluster (see https://jira.mariadb.org/browse/MDEV-421); there is a bug in this area under work. Just to point out: these options are also dangerous for a cluster, as the expectation is that databases are consistent across all nodes.

R: Jan
