Thanks Jay, your suggestions were indeed helpful, and yes, I will investigate the mysql startup in detail and then create a bug request.

Another thing to note: could this be caused by wsrep_urls being in the [mysqld_safe] section, since the database has to be started through mysqld_safe?
Can that parameter be moved to the [mysqld] section, or is there another variable I can use instead, e.g.
wsrep_cluster_address=gcomm://10.1.6.118:4567,gcomm://10.1.3.30:4567,gcomm://10.1.3.101:4567,gcomm://
So after applying your suggestions I noticed that if the db is restarted, the node becomes available very quickly. Is that due to IST taking effect?
Here are my new parameters:

[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
nice = 0

[mysqld]
#
# * Basic Settings
#
server_id=1
binlog_format=ROW
wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_slave_threads=2
wsrep_cluster_name=dev_cluster
wsrep_sst_method=xtrabackup # changed from rsync to xtrabackup in order to use IST
wsrep_node_name=node1
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
log_slave_updates
wsrep_replicate_myisam=1
wsrep_sst_receive_address=10.1.6.118 # I believe this should be the private IP of this node?
wsrep_provider_options = "gmcast.listen_addr=tcp://0.0.0.0:4567; ist.recv_addr=10.1.6.118:4568; " # I believe this should be the private IP of this node?
Hi Jay, thanks for the answers. I have another question. This might be a side track, so let me know if I should open a new thread for it.

We were running some load tests on the entire setup (1 haproxy lb + 3 nodes), and I am noticing that after a few connections the scripts stop running with the error "Error connecting to mysql" and then start working again after a while. I checked innotop and did not see any locks or deadlocks on the db node. I am also running just 1 thread at a time, so I don't think it should be using too many connections, but I wasn't sure whether one of the system user connections is causing the db to lock up.

Here is what my process list looks like while the load test is running:

mysql> show full processlist;
+--------+-------------+----------------+------+---------+-------+--------------------+-----------------------+-----------+---------------+-----------+
| Id     | User        | Host           | db   | Command | Time  | State              | Info                  | Rows_sent | Rows_examined | Rows_read |
+--------+-------------+----------------+------+---------+-------+--------------------+-----------------------+-----------+---------------+-----------+
|      1 | system user |                | NULL | Sleep   | 39005 | wsrep aborter idle | NULL                  |         0 |             0 |         1 |
|      2 | system user |                | NULL | Sleep   |  2815 | committed 81728    | NULL                  |         0 |             0 |         1 |
|      3 | system user |                | NULL | Sleep   |  2816 | committed 81727    | NULL                  |         0 |             0 |         1 |
| 136444 | user1       | localhost      | NULL | Query   |     0 | sleeping           | show full processlist |         0 |             0 |         1 |
| 141869 | applusdev   | 10.1.4.6:35006 | demo | Sleep   |     0 |                    | NULL                  |         1 |             1 |         1 |
| 141871 | applusdev   | 10.1.4.6:35008 | demo | Sleep   |     0 |                    | NULL                  |         0 |             0 |         1 |
+--------+-------------+----------------+------+---------+-------+--------------------+-----------------------+-----------+---------------+-----------+

I am open to suggestions on how to debug this, as I cannot proceed to production with this issue lingering...
Yes, the precise error in the log file is:

PHP Warning: mysqli::mysqli(): (HY000/2003): Can't connect to MySQL server on '<server_IP>'

And yes, the script opens a new connection for every record it inserts and then closes it.

Some variables from the db:

mysql> show variables like 'max_connection%';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 151   |
+-----------------+-------+

mysql> show status like '%connections';
+----------------------+--------+
| Variable_name        | Value  |
+----------------------+--------+
| Connections          | 305747 |
| Max_used_connections | 115    |
+----------------------+--------+
Connecting to haproxy. Here is the config I used in haproxy to avoid lock conflicts:

backend pxc-onenode-back
    mode tcp
    balance leastconn
    option httpchk
    server c2 10.1.3.3:3306 check port 9200 inter 12000 rise 3 fall 3
    server c1 10.1.6.8:3306 check port 9200 inter 12000 rise 3 fall 3 backup
    server c3 10.1.3.1:3306 check port 9200 inter 12000 rise 3 fall 3 backup
I also tried connecting to server c2 directly, and during the load test I got similar errors…
Hi, this is my first post to the group and I am hoping to find some answers to my questions. I apologize for the long post, but I think debugging will be easier if I give you all the details.

So here is a detailed description of the issue:

Server version: Ubuntu 10.04 LTS
Percona version: 5.5.24-55-log Percona XtraDB Cluster (GPL), wsrep_23.6.r341
Configuration details: 3 nodes (node1, node2, node3)

(my.cnf) in node 1
++++++++++++++++++++++++++++
[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
nice = 0

[mysqld]
#
# * Basic Settings
#
server_id=1
binlog_format=ROW
wsrep_provider=/usr/lib64/libgalera_smm.so
#wsrep_cluster_address=gcomm://
wsrep_slave_threads=2
wsrep_cluster_name=dev_cluster
wsrep_sst_method=rsync
wsrep_node_name=node1
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
log_slave_updates
wsrep_replicate_myisam=1
++++++++++++++++++++++++++++
(my.cnf) in node 2
++++++++++++++++++++++++++++
# This was formally known as [safe_mysqld]. Both versions are currently parsed.
[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
nice = 0

[mysqld]
#
# * Basic Settings
#
server_id=2
binlog_format=ROW
wsrep_provider=/usr/lib64/libgalera_smm.so
#wsrep_cluster_address=gcomm://10.1.6.118:4567
wsrep_slave_threads=2
wsrep_cluster_name=dev_cluster
wsrep_sst_method=rsync
wsrep_node_name=node2
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
log_slave_updates
wsrep_replicate_myisam=1
++++++++++++++++++++++++++++
(my.cnf) in node 3
++++++++++++++++++++++++++++
# This was formally known as [safe_mysqld]. Both versions are currently parsed.
[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
nice = 0

[mysqld]
#
# * Basic Settings
#
server_id=3
binlog_format=ROW
wsrep_provider=/usr/lib64/libgalera_smm.so
#wsrep_cluster_address=gcomm://10.1.6.118:4567
wsrep_slave_threads=2
wsrep_cluster_name=dev_cluster
wsrep_sst_method=rsync
wsrep_node_name=node3
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
log_slave_updates
wsrep_replicate_myisam=1
++++++++++++++++++++++++++++

Testing scenario: haproxy is set up with node1 up and node2 and node3 as backup (so the connections always go to one node).

When I reboot node 3:
- node1 becomes the donor: wsrep_local_state_comment | Donor (+)
- node2 is up and running
- node3 comes back up and starts to sync
node3:~$ ps -ef | grep mysql
mysql 2429 1 0 11:27 ? 00:00:00 /usr/sbin/mysqld
root 2549 1 0 11:27 ? 00:00:00 /bin/sh /usr/bin/mysqld_safe
mysql 3031 2549 0 11:27 ? 00:00:00 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --log-error=/var/log/mysql/error.log --pid-file=/var/lib/mysql/dev-db-node3.pid --socket=/var/run/mysqld/mysqld.sock --port=3306 --wsrep_cluster_address=gcomm://10.1.6.118:4567
mysql 3188 3031 0 11:27 ? 00:00:00 sh -c wsrep_sst_rsync 'joiner' '<public_ip_node3>' '' '/var/lib/mysql/' '/etc/mysql/conf.d/mysqld_safe_syslog.cnf' '3031' 2>sst.err
mysql 3189 3188 0 11:27 ? 00:00:01 /bin/bash -ue /usr//bin/wsrep_sst_rsync joiner <public_ip_node3> /var/lib/mysql/ /etc/mysql/conf.d/mysqld_safe_syslog.cnf 3031
mysql 3203 1 0 11:27 ? 00:00:00 rsync --daemon --port 4444 --config /var/lib/mysql//rsync_sst.conf
mysql 3243 3203 0 11:27 ? 00:00:00 rsync --daemon --port 4444 --config /var/lib/mysql//rsync_sst.conf
mysql 3248 3243 1 11:27 ? 00:00:08 rsync --daemon --port 4444 --config /var/lib/mysql//rsync_sst.conf
mysql 5279 3189 0 11:35 ? 00:00:00 sleep 1
akedar 5281 3771 0 11:35 pts/0 00:00:00 grep --color=auto mysql
node3:~$
Question1: How can I change the rsync process to use the private IP instead of the public IP?

4. Once the sync is completed on node 3, clustercheck still shows that the node is down, and the node is not usable as a cluster node.
5. Then I have to issue sudo service mysql stop and then sudo /etc/init.d/mysql start. It says the database failed to start, but the rsync process starts, and after that process is completed node3 becomes part of the cluster.

Question2: How can I change the mysql process to start using /etc/init.d/mysql instead of service mysql start at boot time?

Question3: If node1 becomes a donor it stops accepting connections, which makes the application unusable. One suggestion is to add +if [ "$WSSREP_STATUS" == "4" ] || [ "$WSSREP_STATUS" == "2" ] to the cluster check, but if I do that, how accurate is the data during the rsync? Or should I be using xtrabackup?

Question4: How do I configure the nodes to use incremental state transfer (IST) to avoid this error?
120807 11:48:00 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (afc4ea7d-dc5e-11e1-0800-0616c529eebe): 1 (Operation not permitted)
at galera/src/replicator_str.cpp:prepare_for_IST():439. IST will be unavailable.

I have many more questions as I go on and test the configuration, but if someone can answer these, I think I can clear up a lot of my doubts...
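
For Question3, the suggested clustercheck change can be sketched roughly as below. This is only a minimal sketch, not the clustercheck script that ships with the packages: the function names node_is_available and http_status_line are made up for illustration, and it assumes the usual wsrep_local_state values (2 = Donor/Desynced, 4 = Synced) and that the health check answers haproxy's httpchk with a 200 or a 503 status line.

```shell
#!/bin/bash
# Minimal sketch with hypothetical helpers; NOT the packaged clustercheck script.
# Assumed wsrep_local_state values: 1=Joining, 2=Donor/Desynced, 3=Joined, 4=Synced.

# Succeeds when the node should stay in rotation: Synced (4) or Donor (2).
node_is_available() {
    local state="$1"
    if [ "$state" = "4" ] || [ "$state" = "2" ]; then
        return 0
    fi
    return 1
}

# Prints the HTTP status line an httpchk-style health check would receive.
http_status_line() {
    if node_is_available "$1"; then
        echo "HTTP/1.1 200 OK"
    else
        echo "HTTP/1.1 503 Service Unavailable"
    fi
}

# Walk through the four states to show which ones pass the check.
for state in 1 2 3 4; do
    echo "wsrep_local_state=$state -> $(http_status_line "$state")"
done
```

Note that this only changes what the load balancer is told about a donor node. It does not change what the donor itself does while the rsync is running, which is the part the question is about.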
--
You received this message because you are subscribed to the Google Groups "Percona Discussion" group.
To view this discussion on the web visit https://groups.google.com/d/msg/percona-discussion/-/2f0ODzIxAncJ.
To post to this group, send email to percona-d...@googlegroups.com.
To unsubscribe from this group, send email to percona-discuss...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/percona-discussion?hl=en.
So I resolved that error by creating a user:
grant process on *.* to 'mysql'@'localhost' identified by ''; flush privileges;
Then, on server reboot, I see that the donor was a different node and it shows this error....
So the question is: does IST only work when you have stopped the db and started it again? If you reboot a node, does it always do an SST?