Message from discussion
Issue on ubuntu when i reboot the machine Percona cluster fails on node
Date: Thu, 9 Aug 2012 09:33:00 -0700 (PDT)
From: amol <ajke...@gmail.com>
To: percona-discussion@googlegroups.com
Message-Id: <82a4e0e1-7abf-4347-9ca2-8dddf04093cc@googlegroups.com>
In-Reply-To: <CDC55D50-B79E-4F4E-A4EE-1EAC5E530E2F@percona.com>
References: <f8b71bd3-74fb-48e3-a486-1d4e572d1008@googlegroups.com> <cdbe0c8f-d917-4294-8806-5ee3e74ba8ea@googlegroups.com>
<CDC55D50-B79E-4F4E-A4EE-1EAC5E530E2F@percona.com>
Subject: Re: Issue on ubuntu when i reboot the machine Percona cluster fails
on node
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="----=_Part_720_23477386.1344529980721"
------=_Part_720_23477386.1344529980721
Content-Type: multipart/alternative;
boundary="----=_Part_721_17872352.1344529980721"
------=_Part_721_17872352.1344529980721
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Yes, I just did a reboot of the machine,
and after some searching I found this error on the donor node:
innobackupex: Error: mysql child process has died: ERROR 1045 (28000):
Access denied for user 'mysql'@'localhost' (using password: NO)
So I resolved that error by creating a user:
grant process on *.* to 'mysql'@'localhost' identified by '';
flush privileges;
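A dedicated account for the state transfer might be cleaner than an empty-password 'mysql' user. Below is a minimal sketch, assuming the SST script in this version picks up its credentials from wsrep_sst_auth. The account name sstuser and the password are placeholders, and the three privileges are the ones the Percona documentation lists for xtrabackup SST. It has not been tested on this cluster.

```shell
#!/bin/sh
# Sketch only: print the SQL for a dedicated SST account instead of an
# empty-password 'mysql' user. The account name and password passed in
# are hypothetical placeholders, not values from this cluster.
sst_grant_sql() {
    user="$1"
    pass="$2"
    # RELOAD, LOCK TABLES and REPLICATION CLIENT are the privileges the
    # Percona documentation lists for an xtrabackup SST account.
    printf "CREATE USER '%s'@'localhost' IDENTIFIED BY '%s';\n" "$user" "$pass"
    printf "GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO '%s'@'localhost';\n" "$user"
}

# Review the statements first, then pipe them into the client, e.g.:
#   sst_grant_sql sstuser 'changeme' | mysql -u root -p
sst_grant_sql sstuser 'changeme'
```

The matching my.cnf line on each node would then be wsrep_sst_auth=sstuser:changeme, with the real password in place of the placeholder.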
Then, on the next server reboot, the donor was a different node, and it showed this error:
innobackupex: Error: mysql child process has died: ERROR 1044 (42000) at
line 3: Access denied for user 'mysql'@'localhost' to database 'mysql'
while waiting for reply to MySQL request: 'USE mysql;' at
/usr//bin/innobackupex line 374.
Now I see that the mysql user needs more privileges, so I have granted it
all privileges. Next I have to try bringing the node back up using SST
and then try the reboot again.
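Before the reboot test it may also be worth checking whether the node kept its saved state, since a zeroed UUID rules out IST. A rough sketch, assuming the state lives in grastate.dat under the datadir (/var/lib/mysql here) and contains a "uuid:" line; the function names are made up for illustration:

```shell
#!/bin/sh
# Sketch only: report whether a Galera saved-state file still holds a real
# cluster UUID. The default path is an assumption based on the datadir
# (/var/lib/mysql) that appears in the logs.
ZERO_UUID="00000000-0000-0000-0000-000000000000"

saved_uuid() {
    # Print the value of the "uuid:" line from the given state file.
    awk '$1 == "uuid:" { print $2 }' "$1"
}

check_state() {
    file="${1:-/var/lib/mysql/grastate.dat}"
    if [ ! -r "$file" ]; then
        echo "no readable state file: expect a full SST"
        return 1
    fi
    uuid="$(saved_uuid "$file")"
    if [ -z "$uuid" ] || [ "$uuid" = "$ZERO_UUID" ]; then
        echo "state is empty or zeroed: expect a full SST"
        return 1
    fi
    # A real UUID is necessary for IST, not sufficient on its own.
    echo "saved state $uuid: IST may be possible"
}
```

Running check_state /var/lib/mysql/grastate.dat after stopping mysqld, and again after the reboot, should show whether the UUID survives the restart.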
On Thursday, August 9, 2012 8:00:35 AM UTC-4, Jay Janssen wrote:
>
> Something about how you have SST configured is causing the ultimate
> problem here.
>
> I can't say why the local state was reset to all zeros on reboot. How was
> the machine restarted? If the local server had kept its state correctly,
> an IST should have been possible.
>
> On Aug 8, 2012, at 4:54 PM, amol <ajk...@gmail.com> wrote:
>
> Now when I reboot node1, the database does not start (even after using
> /etc/init.d/mysql start),
>
> and I see these errors in mysql/error.log:
>
> 120808 16:39:19 [Note] WSREP: Flow-control interval: [14, 28]
> 120808 16:39:19 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 151412)
> 120808 16:39:19 [Note] WSREP: State transfer required:
> Group state: afc4ea7d-dc5e-11e1-0800-0616c529eebe:151412
> Local state: 00000000-0000-0000-0000-000000000000:-1
> 120808 16:39:19 [Note] WSREP: New cluster view: global state:
> afc4ea7d-dc5e-11e1-0800-0616c529eebe:151412, view# 31: Primary, number of
> nodes: 3, my index: 0, protocol version 2
> 120808 16:39:19 [Warning] WSREP: Gap in state sequence. Need state
> transfer.
> 120808 16:39:21 [Note] WSREP: Running: 'wsrep_sst_xtrabackup 'joiner'
> '10.1.6.8' '' '/var/lib/mysql/' '/etc/mysql/conf.d/mysqld_safe_syslog.cnf'
> '4411' 2>sst.err'
> 120808 16:39:21 [Note] WSREP: Prepared SST request: xtrabackup|
> 10.1.6.8:4444/xtrabackup_sst
> 120808 16:39:21 [Note] WSREP: wsrep_notify_cmd is not defined, skipping
> notification.
> 120808 16:39:21 [Note] WSREP: Assign initial position for certification:
> 151412, protocol version: 2
> 120808 16:39:21 [Warning] WSREP: Failed to prepare for incremental state
> transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not
> match group state UUID (afc4ea7d-dc5e-11e1-0800-0616c529eebe): 1 (Operation
> not permitted)
> at galera/src/replicator_str.cpp:prepare_for_IST():439. IST will be
> unavailable.
> 120808 16:39:21 [Note] WSREP: Node 0 (node1) requested state transfer from
> '*any*'. Selected 1 (node2)(SYNCED) as donor.
> 120808 16:39:21 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 151412)
> 120808 16:39:21 [Note] WSREP: Requesting state transfer: success, donor: 1
> 120808 16:39:27 [ERROR] WSREP: Process completed with error:
> wsrep_sst_xtrabackup 'joiner' '10.1.6.8' '' '/var/lib/mysql/'
> '/etc/mysql/conf.d/mysqld_safe_syslog.cnf' '4411' 2>sst.err: 32 (Broken
> pipe)
> 120808 16:39:27 [ERROR] WSREP: Failed to read uuid:seqno from joiner
> script.
> 120808 16:39:27 [ERROR] WSREP: SST failed: 32 (Broken pipe)
> 120808 16:39:27 [ERROR] Aborting
>
> 120808 16:39:27 [Warning] WSREP: 1 (node2): State transfer to 0 (node1)
> failed: -1 (Operation not permitted)
> 120808 16:39:27 [ERROR] WSREP:
> gcs/src/gcs_group.c:gcs_group_handle_join_msg():712: Will never receive
> state. Need to abort.
> 120808 16:39:27 [Note] WSREP: gcomm: terminating thread
> 120808 16:39:27 [Note] WSREP: gcomm: joining thread
> 120808 16:39:27 [Note] WSREP: gcomm: closing backend
> 120808 16:39:27 [Note] WSREP:
> view(view_id(NON_PRIM,20ca3744-e199-11e1-0800-0de247e11b46,31) memb {
> 20ca3744-e199-11e1-0800-0de247e11b46,
> } joined {
> } left {
> } partitioned {
> 5ffb372a-e118-11e1-0800-1e749dee7061,
> 71386e58-e109-11e1-0800-8855542b6c12,
> })
> 120808 16:39:27 [Note] WSREP: view((empty))
> 120808 16:39:27 [Note] WSREP: gcomm: closed
> 120808 16:39:27 [Note] WSREP: /usr/sbin/mysqld: Terminated.
> Aborted
> 120808 16:39:27 mysqld_safe mysqld from pid file
> /var/lib/mysql/dev2-db-upgrade.pid ended
>
>
> This is after all the changes I made earlier in the day to the node 1
> my.cnf for IST:
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> [mysqld_safe]
> socket = /var/run/mysqld/mysqld.sock
> nice = 0
> wsrep_urls = gcomm://10.1.6.3:4567,gcomm://10.1.3.1:4567,gcomm://10.1.6.8:4567,gcomm://
>
> [mysqld]
> #
> # * Basic Settings
> #
> server_id=1
> binlog_format=ROW
> wsrep_provider=/usr/lib64/libgalera_smm.so
> wsrep_slave_threads=2
> wsrep_cluster_name=dev_cluster
> wsrep_sst_method=xtrabackup
> wsrep_node_name=node1
> innodb_locks_unsafe_for_binlog=1
> innodb_autoinc_lock_mode=2
> log_slave_updates
> wsrep_replicate_myisam=1
> wsrep_sst_receive_address=10.1.6.8
> wsrep_provider_options = "gmcast.listen_addr=tcp://0.0.0.0:4567; ist.recv_addr=10.1.6.8:4568; "
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
> On Tuesday, August 7, 2012 1:03:48 PM UTC-4, amol wrote:
>>
>> Hi, this is my first post to the group and I am hoping to find some
>> answers to my questions. I apologize for the long post, but I think that
>> if I give you all the details, debugging will be easier.
>>
>> So here is a detailed description of the issue
>>
>> Server Version : ubuntu 10.04 LTS
>> percona version: 5.5.24-55-log Percona XtraDB Cluster (GPL),
>> wsrep_23.6.r341
>>
>> *Configuration details: 3 nodes (node1, node2, node3)*
>> *(my.cnf) in node 1 *
>> ++++++++++++++++++++++++++++
>> [mysqld_safe]
>> socket = /var/run/mysqld/mysqld.sock
>> nice = 0
>> wsrep_urls = gcomm://10.1.6.118:4567,gcomm://10.1.3.30:4567,gcomm://10.1.3.101:4567,gcomm://
>>
>> [mysqld]
>> #
>> # * Basic Settings
>> #
>> server_id=1
>> binlog_format=ROW
>> wsrep_provider=/usr/lib64/libgalera_smm.so
>> #wsrep_cluster_address=gcomm://
>> wsrep_slave_threads=2
>> wsrep_cluster_name=dev_cluster
>> wsrep_sst_method=rsync
>> wsrep_node_name=node1
>> innodb_locks_unsafe_for_binlog=1
>> innodb_autoinc_lock_mode=2
>> log_slave_updates
>> wsrep_replicate_myisam=1
>> ++++++++++++++++++++++++++++
>>
>> *(my.cnf) in node 2 *
>> ++++++++++++++++++++++++++++
>> # This was formally known as [safe_mysqld]. Both versions are currently parsed.
>> [mysqld_safe]
>> socket = /var/run/mysqld/mysqld.sock
>> nice = 0
>> wsrep_urls = gcomm://10.1.6.118:4567,gcomm://10.1.3.30:4567,gcomm://10.1.3.101:4567,gcomm://
>>
>> [mysqld]
>> #
>> # * Basic Settings
>> #
>> server_id=2
>> binlog_format=ROW
>> wsrep_provider=/usr/lib64/libgalera_smm.so
>> #wsrep_cluster_address=gcomm://10.1.6.118:4567
>> wsrep_slave_threads=2
>> wsrep_cluster_name=dev_cluster
>> wsrep_sst_method=rsync
>> wsrep_node_name=node2
>> innodb_locks_unsafe_for_binlog=1
>> innodb_autoinc_lock_mode=2
>> log_slave_updates
>> wsrep_replicate_myisam=1
>> ++++++++++++++++++++++++++++
>>
>> *(my.cnf) in node 3*
>>
>> ++++++++++++++++++++++++++++
>> # This was formally known as [safe_mysqld]. Both versions are currently parsed.
>> [mysqld_safe]
>> socket = /var/run/mysqld/mysqld.sock
>> nice = 0
>> wsrep_urls = gcomm://10.1.6.118:4567,gcomm://10.1.3.30:4567,gcomm://10.1.3.101:4567,gcomm://
>>
>> [mysqld]
>> #
>> # * Basic Settings
>> #
>> server_id=3
>> binlog_format=ROW
>> wsrep_provider=/usr/lib64/libgalera_smm.so
>> #wsrep_cluster_address=gcomm://10.1.6.118:4567
>> wsrep_slave_threads=2
>> wsrep_cluster_name=dev_cluster
>> wsrep_sst_method=rsync
>> wsrep_node_name=node3
>> innodb_locks_unsafe_for_binlog=1
>> innodb_autoinc_lock_mode=2
>> log_slave_updates
>> wsrep_replicate_myisam=1
>> ++++++++++++++++++++++++++++
>>
>>
>> Testing scenario: HAProxy is set up with node1 active and node2 and node3
>> as backups (so the connections always go to one node).
>>
>> When I reboot node 3:
>>
>> 1. node1 becomes the donor: wsrep_local_state_comment | Donor (+)
>> 2. node2 is up and running
>> 3. node3 comes back up and starts to sync
>>
>> node3:~$ ps -ef | grep mysql
>> mysql 2429 1 0 11:27 ? 00:00:00 /usr/sbin/mysqld
>> root 2549 1 0 11:27 ? 00:00:00 /bin/sh
>> /usr/bin/mysqld_safe
>> mysql 3031 2549 0 11:27 ? 00:00:00 /usr/sbin/mysqld
>> --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin
>> --user=mysql --log-error=/var/log/mysql/error.log
>> --pid-file=/var/lib/mysql/dev-db-node3.pid
>> --socket=/var/run/mysqld/mysqld.sock --port=3306
>> --wsrep_cluster_address=gcomm://10.1.6.118:4567
>> mysql 3188 3031 0 11:27 ? 00:00:00 sh -c wsrep_sst_rsync
>> 'joiner' '<public_ip_node3>' '' '/var/lib/mysql/'
>> '/etc/mysql/conf.d/mysqld_safe_syslog.cnf' '3031' 2>sst.err
>> mysql 3189 3188 0 11:27 ? 00:00:01 /bin/bash -ue
>> /usr//bin/wsrep_sst_rsync joiner <public_ip_node3> /var/lib/mysql/
>> /etc/mysql/conf.d/mysqld_safe_syslog.cnf 3031
>> mysql 3203 1 0 11:27 ? 00:00:00 rsync --daemon --port
>> 4444 --config /var/lib/mysql//rsync_sst.conf
>> mysql 3243 3203 0 11:27 ? 00:00:00 rsync --daemon --port
>> 4444 --config /var/lib/mysql//rsync_sst.conf
>> mysql 3248 3243 1 11:27 ? 00:00:08 rsync --daemon --port
>> 4444 --config /var/lib/mysql//rsync_sst.conf
>> mysql 5279 3189 0 11:35 ? 00:00:00 sleep 1
>> akedar 5281 3771 0 11:35 pts/0 00:00:00 grep --color=auto mysql
>> node3:~$
>>
>> Question1: How can I change the rsync process to use the private IP
>> instead of the public IP?
>>
>> 4. Once the sync is completed on node 3, clustercheck still
>> shows that the node is down, and the node is not usable as a cluster node.
>> 5. Then I have to issue sudo service mysql stop and then sudo
>> /etc/init.d/mysql start. It says the database failed to start, but the
>> rsync process starts, and after that process completes node3 becomes
>> part of the cluster.
>>
>> Question2: How can I make the mysql process start using
>> /etc/init.d/mysql instead of service mysql start at boot time?
>>
>> Question3: If node1 becomes a donor, it stops accepting connections, which
>> makes the application unusable. One suggestion is to add +if [
>> "$WSSREP_STATUS" == "4" ] || [ "$WSSREP_STATUS" == "2" ] to clustercheck,
>> but if I do that, how accurate is the data during the rsync? Or should I
>> be using xtrabackup?
>>
>> Question4: How do I configure the nodes to use incremental state transfer
>> (IST) to avoid this warning?
>> 120807 11:48:00 [Warning] WSREP: Failed to prepare for incremental state
>> transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not
>> match group state UUID (afc4ea7d-dc5e-11e1-0800-0616c529eebe): 1 (Operation
>> not permitted)
>> at galera/src/replicator_str.cpp:prepare_for_IST():439. IST will
>> be unavailable.
>>
>> I will have more questions as I go on testing the configuration, but if
>> someone can answer these, I think it will clear up a lot of my doubts.
>>
>>
>>
>>
>>
>>
>
> Jay Janssen, Senior MySQL Consultant, Percona Inc.
> http://about.me/jay.janssen
> Percona Live in NYC Oct 1-2nd: http://www.percona.com/live/nyc-2012/
>
>
------=_Part_721_17872352.1344529980721--
------=_Part_720_23477386.1344529980721--