3rd node is unable to rejoin the cluster after node was not responding and mysql-galera stop was issued
 
Emilio  
From: Emilio <emil.sal...@uber.biz>
Date: Tue, 18 Sep 2012 00:18:38 -0700 (PDT)
Local: Tues, Sep 18 2012 3:18 am
Subject: 3rd node is unable to be rejoin the cluster after node was not responding and mysql-galera stop was issued

The 3rd node is no longer able to rejoin the cluster after a mysql-galera
stop was issued because the node was not responding to SQL requests. There
were no hardware failures and no software issues; the node simply stopped
responding to SQL queries.

Error log (.err) from the failing server:

...

InnoDB:
http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadic...
InnoDB: for how to resolve the issue.
120918 17:15:40  InnoDB: Error: table 'prod_c5_priceline/Packages'
InnoDB: in InnoDB data dictionary has tablespace id 4355,
InnoDB: but a tablespace with that id does not exist. There is
InnoDB: a tablespace of name ./prod_c5_priceline/Packages.ibd and id 1295,
though. Have
InnoDB: you deleted or moved .ibd files?
InnoDB: Please refer to
InnoDB:
http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadic...
InnoDB: for how to resolve the issue.
120918 17:15:40  InnoDB: Error: table 'prod_c5_priceline/PagePaths'
InnoDB: in InnoDB data dictionary has tablespace id 4356,
InnoDB: but a tablespace with that id does not exist. There is
InnoDB: a tablespace of name ./prod_c5_priceline/PagePaths.ibd and id 3993,
though. Have
InnoDB: you deleted or moved .ibd files?
InnoDB: Please refer to
InnoDB:
http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadic...
InnoDB: for how to resolve the issue.
120918 17:15:40  InnoDB: Error: table
'prod_c5_priceline/PagePermissionPageTypes'
InnoDB: in InnoDB data dictionary has tablespace id 4357,
InnoDB: but a tablespace with that id does not exist. There is
InnoDB: a tablespace of name
./prod_c5_priceline/PagePermissionPageTypes.ibd and id 3994, though. Have
InnoDB: you deleted or moved .ibd files?
InnoDB: Please refer to
InnoDB:
http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadic...
InnoDB: for how to resolve the issue.
120918 17:15:40  InnoDB: Error: table 'prod_c5_priceline/PagePermissions'
InnoDB: in InnoDB data dictionary has tablespace id 4358,
InnoDB: but a tablespace with that id does not exist. There is
InnoDB: a tablespace of name ./prod_c5_priceline/PagePermissions.ibd and id
3995, though. Have
InnoDB: you deleted or moved .ibd files?
InnoDB: Please refer to
InnoDB:
http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadic...
InnoDB: for how to resolve the issue.
120918 17:15:40  InnoDB: Error: table 'prod_c5_priceline/PageSearchIndex'
InnoDB: in InnoDB data dictionary has tablespace id 4359,
InnoDB: but a tablespace with that id does not exist. There is
InnoDB: a tablespace of name ./prod_c5_priceline/PageSearchIndex.ibd and id
3997, though. Have
InnoDB: you deleted or moved .ibd files?
InnoDB: Please refer to
InnoDB:
http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadic...
InnoDB: for how to resolve the issue.
120918 17:15:40  InnoDB: Error: table 'prod_c5_priceline/PageStatistics'
InnoDB: in InnoDB data dictionary has tablespace id 4360,
InnoDB: but a tablespace with that id does not exist. There is
InnoDB: a tablespace of name ./prod_c5_priceline/PageStatistics.ibd and id
244, though. Have
InnoDB: you deleted or moved .ibd files?
InnoDB: Please refer to
InnoDB:
http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadic...
InnoDB: for how to resolve the issue.
120918 17:15:40  InnoDB: Error: table 'prod_mage_soulpattinson/dummy'
InnoDB: in InnoDB data dictionary has tablespace id 1,
InnoDB: but tablespace with that id or name does not exist. Have
InnoDB: you deleted or moved .ibd files?
InnoDB: This may also be a table created with CREATE TEMPORARY TABLE
InnoDB: whose .ibd and .frm files MySQL automatically removed, but the
InnoDB: table still exists in the InnoDB internal data dictionary.
InnoDB: Please refer to
InnoDB:
http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadic...
InnoDB: for how to resolve the issue.
120918 17:15:40  InnoDB: Waiting for the background threads to start
120918 17:15:41 InnoDB: 1.1.8 started; log sequence number 1423444579
120918 17:15:41 [Note] Event Scheduler: Loaded 0 events
120918 17:15:41 [Note] WSREP: Signalling provider to continue.
120918 17:15:41 [Note] WSREP: Received SST:
b2ad718b-0150-11e2-0800-3066d877c09e:2829
120918 17:15:41 [Note] WSREP: SST received:
b2ad718b-0150-11e2-0800-3066d877c09e:2829
120918 17:15:41  InnoDB: error: space object of table
'prod_c5_priceline/PageStatistics',
InnoDB: space id 4360 did not exist in memory. Retrying an open.
120918 17:15:41 [Note] WSREP: 0
(API-B2C-DB02): State transfer from 1 (API-B2C-Cluster01) complete.
120918 17:15:41 [Note] WSREP: Shifting JOINER -> JOINED (TO: 2832)
120918 17:15:41  InnoDB: Error: tablespace id and flags in file
'./prod_c5_priceline/PageStatistics.ibd' are 244 and 0, but in the InnoDB
InnoDB: data dictionary they are 4360 and 0.
InnoDB: Have you moved InnoDB .ibd files around without using the
InnoDB: commands DISCARD TABLESPACE and IMPORT TABLESPACE?
InnoDB: Please refer to
InnoDB:
http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadic...
InnoDB: for how to resolve the issue.
120918 17:15:41 [Note] WSREP: Member 0 (API-B2C-DB02) synced with group.
120918 17:15:41 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 2832)
120918 17:15:41  InnoDB: cannot calculate statistics for table
prod_c5_priceline/PageStatistics
InnoDB: because the .ibd file is missing.  For help, please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting.html
120918 17:15:41 [ERROR] MySQL is trying to open a table handle but the .ibd
file for
table prod_c5_priceline/PageStatistics does not exist.
Have you deleted the .ibd file from the database directory under
the MySQL datadir, or have you used DISCARD TABLESPACE?
See http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting.html
how you can resolve the problem.
120918 17:15:41 [Warning] WSREP: BF applier failed to open_and_lock_tables:
1146, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 0 seqno: 2830)
120918 17:15:41 [ERROR] Slave SQL: Error executing row event: 'Table
'prod_c5_priceline.PageStatistics' doesn't exist', Error_code: 1146
120918 17:15:41 [Warning] WSREP: RBR event 2 Write_rows apply warning:
1146, 2830
120918 17:15:41 [Note]
/opt/mysql-galera/mysql-5.5.23-galera-23.2.1-x86_64/mysql/sbin/mysqld:
ready for connections.
Version: '5.5.23'  socket:
'/opt/mysql-galera/mysql-5.5.23-galera-23.2.1-x86_64/mysql/var/mysqld.sock'  
port: 3306  Source distribution, wsrep_23.6.r3755
120918 17:15:41 [ERROR] WSREP: Failed to apply trx: source:
b2abd484-0150-11e2-0800-3e31ed9fda91 version: 2 local: 0 state: CERTIFYING
flags: 1 conn_id: 5586 trx_id: 573003579 seqnos (l: 4, g: 2830, s: 2829, d:
2828, ts: 1347952539315888478)
120918 17:15:41 [ERROR] WSREP: Failed to apply app buffer: P, seqno: 2830,
status: WSREP_FATAL
         at galera/src/replicator_smm.cpp:apply_wscoll():50
         at galera/src/replicator_smm.cpp:apply_trx_ws():121
120918 17:15:41 [ERROR] WSREP: Node consistency compromized, aborting...
120918 17:15:41 [Note] WSREP: Closing send monitor...
120918 17:15:41 [Note] WSREP: Closed send monitor.
120918 17:15:41 [Note] WSREP: gcomm: terminating thread
120918 17:15:41 [Note] WSREP: gcomm: joining thread
120918 17:15:41 [Note] WSREP: gcomm: closing backend
120918 17:15:42 [Note] WSREP:
view(view_id(NON_PRIM,a2bcd106-0160-11e2-0800-a52aa051edfc,8) memb {
        a2bcd106-0160-11e2-0800-a52aa051edfc,

} joined {
} left {
} partitioned {

        b2abd484-0150-11e2-0800-3e31ed9fda91,
})

120918 17:15:42 [Note] WSREP: view((empty))
120918 17:15:42 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no,
my_idx = 0, memb_num = 1
120918 17:15:42 [Note] WSREP: gcomm: closed
120918 17:15:42 [Note] WSREP: Flow-control interval: [8, 16]
120918 17:15:42 [Note] WSREP: Received NON-PRIMARY.
120918 17:15:42 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 2832)
120918 17:15:42 [Note] WSREP: Received self-leave message.
120918 17:15:42 [Note] WSREP: Flow-control interval: [0, 0]
120918 17:15:42 [Note] WSREP: Received SELF-LEAVE. Closing connection.
120918 17:15:42 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 2832)
120918 17:15:42 [Note] WSREP: RECV thread exiting 0: Success
120918 17:15:42 [Note] WSREP: recv_thread() joined.
120918 17:15:42 [Note] WSREP: Closing slave action queue.
120918 17:15:42 [Note] WSREP:
/opt/mysql-galera/mysql-5.5.23-galera-23.2.1-x86_64/mysql/sbin/mysqld:
Terminated.
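
To summarize which tables the InnoDB errors above refer to, the log can be scanned for the "data dictionary has tablespace id X ... There is a tablespace of name ... and id Y" messages. The following is only a rough sketch, assuming the log is line-wrapped as pasted here; the function name `find_tablespace_mismatches` is made up for illustration and is not part of MySQL or Galera.

```python
import re

# Matches the InnoDB message after line wrapping and "InnoDB: " prefixes
# have been removed (see find_tablespace_mismatches below).
PATTERN = re.compile(
    r"table '([^']+)' in InnoDB data dictionary has tablespace id (\d+), "
    r"but a tablespace with that id does not exist\. "
    r"There is a tablespace of name (\S+) and id (\d+),"
)


def find_tablespace_mismatches(log_text):
    """Return (table, dictionary_id, ibd_file, file_id) for each mismatch."""
    # Undo the line wrapping, then drop the "InnoDB: " continuation prefixes.
    flat = re.sub(r"\s+", " ", log_text).replace("InnoDB: ", "")
    return [
        (table, int(dict_id), ibd_file, int(file_id))
        for table, dict_id, ibd_file, file_id in PATTERN.findall(flat)
    ]


# Two entries copied from the error log above.
sample = """\
120918 17:15:40  InnoDB: Error: table 'prod_c5_priceline/Packages'
InnoDB: in InnoDB data dictionary has tablespace id 4355,
InnoDB: but a tablespace with that id does not exist. There is
InnoDB: a tablespace of name ./prod_c5_priceline/Packages.ibd and id 1295,
though. Have
InnoDB: you deleted or moved .ibd files?
120918 17:15:40  InnoDB: Error: table 'prod_c5_priceline/PageStatistics'
InnoDB: in InnoDB data dictionary has tablespace id 4360,
InnoDB: but a tablespace with that id does not exist. There is
InnoDB: a tablespace of name ./prod_c5_priceline/PageStatistics.ibd and id
244, though. Have
InnoDB: you deleted or moved .ibd files?
"""

for table, dict_id, ibd_file, file_id in find_tablespace_mismatches(sample):
    print(f"{table}: dictionary id {dict_id}, {ibd_file} has id {file_id}")
```

This only lists the affected tables; it does not say why the .ibd files and the data dictionary disagree.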

SST error log (sst.err):

Joiner cleanup:
++ cat
/opt/mysql-galera/mysql-5.5.23-galera-23.2.1-x86_64/mysql/var//rsync_sst.pid
+ local PID=14296
+ '[' 0 '!=' 14296 ']'
+ kill 14296
+ sleep 0.5
+ kill -9 14296
/opt/mysql-galera/mysql-5.5.23-galera-23.2.1-x86_64/mysql//bin/wsrep_sst_rsync:
line 29: kill: (14296) - No such process
+ :
+ set +x

Galera state:

# GALERA saved state
version: 2.1
uuid:    00000000-0000-0000-0000-000000000000
seqno:   -1
cert_index:
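
For reference, an all-zero uuid together with seqno -1 means the saved state records no known position in the cluster history, so the node has to request a full state transfer (SST) when it joins. A minimal Python sketch for checking this, assuming the state file keeps the simple `key: value` layout shown above; the function names are hypothetical and not part of Galera, and treating any negative seqno as "unknown" is a simplification.

```python
ZERO_UUID = "00000000-0000-0000-0000-000000000000"


def parse_saved_state(text):
    """Parse 'key: value' lines of a Galera saved-state file into a dict."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment header
        key, _, value = line.partition(":")
        state[key.strip()] = value.strip()
    return state


def has_known_position(state):
    """True only if a real cluster UUID and a non-negative seqno are recorded."""
    uuid = state.get("uuid", ZERO_UUID)
    try:
        seqno = int(state.get("seqno", "-1"))
    except ValueError:
        return False
    return uuid != ZERO_UUID and seqno >= 0


# The saved state exactly as posted above.
sample = """# GALERA saved state
version: 2.1
uuid:    00000000-0000-0000-0000-000000000000
seqno:   -1
cert_index:
"""

state = parse_saved_state(sample)
print("known position:", has_known_position(state))
```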

Is it possible to clean this node up and rejoin it to the cluster? If so, how?


 
Emilio  
From: Emilio <emil.sal...@uber.biz>
Date: Tue, 18 Sep 2012 00:53:50 -0700 (PDT)
Local: Tues, Sep 18 2012 3:53 am
Subject: Re: 3rd node is unable to be rejoin the cluster after node was not responding and mysql-galera stop was issued

Just to add the Donor Log as well:

120918 17:20:41 [Note] WSREP: declaring
59dd02c8-0161-11e2-0800-32448ed27c29 stable
120918 17:20:41 [Note] WSREP:
view(view_id(PRIM,59dd02c8-0161-11e2-0800-32448ed27c29,10) memb {
        59dd02c8-0161-11e2-0800-32448ed27c29,
        b2abd484-0150-11e2-0800-3e31ed9fda91,

} joined {
} left {
} partitioned {
})

120918 17:20:41 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no,
my_idx = 1, memb_num = 2
120918 17:20:41 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
120918 17:20:41 [Note] WSREP: STATE EXCHANGE: sent state msg:
5a77284a-0161-11e2-0800-2c3a847811a7
120918 17:20:41 [Note] WSREP: STATE EXCHANGE: got state msg:
5a77284a-0161-11e2-0800-2c3a847811a7 from 0 (API-B2C-DB02)
120918 17:20:41 [Note] WSREP: STATE EXCHANGE: got state msg:
5a77284a-0161-11e2-0800-2c3a847811a7 from 1 (API-B2C-Cluster01)
120918 17:20:41 [Note] WSREP: Quorum results:
        version    = 2,
        component  = PRIMARY,
        conf_id    = 9,
        members    = 1/2 (joined/total),
        act_id     = 2950,
        last_appl. = 2029,
        protocols  = 0/4/2 (gcs/repl/appl),
        group UUID = b2ad718b-0150-11e2-0800-3066d877c09e
120918 17:20:41 [Note] WSREP: Flow-control interval: [12, 23]
120918 17:20:41 [Note] WSREP: New cluster view: global state:
b2ad718b-0150-11e2-0800-3066d877c09e:2950, view# 10: Primary, number of
nodes: 2, my index: 1, protocol version 2
120918 17:20:41 [Note] WSREP: wsrep_notify_cmd is not defined, skipping
notification.
120918 17:20:41 [Note] WSREP: Assign initial position for certification:
2950, protocol version: 2
120918 17:20:43 [Warning] IP address '192.168.101.251' could not be
resolved: Temporary failure in name resolution
120918 17:20:43 [Warning] IP address '192.168.101.252' could not be
resolved: Temporary failure in name resolution
120918 17:20:43 [Note] WSREP: Node 0 (API-B2C-DB02) requested state
transfer from '*any*'. Selected 1 (API-B2C-Cluster01)(SYNCED) as donor.
120918 17:20:43 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 2951)
120918 17:20:43 [Note] WSREP: wsrep_notify_cmd is not defined, skipping
notification.
120918 17:20:43 [Note] WSREP: Running: 'wsrep_sst_rsync 'donor'
'172.28.63.69:4444/rsync_sst' 'root:rootpass'
'/opt/mysql-galera/mysql-5.5.23-galera-23.2.1-x86_64/mysql/var/'
'/opt/mysql-galera/mysql-5.5.23-galera-23.2.1-x86_64/mysql/etc/my.cnf'
'b2ad718b-0150-11e2-0800-3066d877c09e'
120918 17:20:43 [Note] WSREP: sst_donor_thread signaled with 0
120918 17:20:43 [Note] WSREP: Flushing tables for SST...
120918 17:20:43 [Note] WSREP: Provider paused at
b2ad718b-0150-11e2-0800-3066d877c09e:2951
120918 17:20:43 [Note] WSREP: Tables flushed.
120918 17:20:45 [Warning] IP address '192.168.101.250' could not be
resolved: Temporary failure in name resolution
120918 17:20:46 [Note] WSREP: Provider resumed.
120918 17:20:46 [Note] WSREP: 1 (API-B2C-Cluster01): State transfer to 0
(API-B2C-DB02) complete.
120918 17:20:46 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 2951)
120918 17:20:46 [Note] WSREP: Member 1 (API-B2C-Cluster01) synced with
group.
120918 17:20:46 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 2951)
120918 17:20:46 [Note] WSREP: Synchronized with group, ready for connections
120918 17:20:46 [Note] WSREP: wsrep_notify_cmd is not defined, skipping
notification.
120918 17:20:47 [Warning] IP address '192.168.101.252' could not be
resolved: Temporary failure in name resolution
120918 17:20:48 [Warning] IP address '192.168.101.251' could not be
resolved: Temporary failure in name resolution
120918 17:20:48 [Warning] IP address '192.168.101.252' could not be
resolved: Temporary failure in name resolution
120918 17:20:49 [Warning] IP address '192.168.101.252' could not be
resolved: Temporary failure in name resolution
120918 17:20:50 [Warning] IP address '192.168.101.250' could not be
resolved: Temporary failure in name resolution
120918 17:20:51 [Note] WSREP: 0 (API-B2C-DB02): State transfer from 1
(API-B2C-Cluster01) complete.
120918 17:20:51 [Note] WSREP: Member 0 (API-B2C-DB02) synced with group.
120918 17:20:51 [Note] WSREP:
view(view_id(PRIM,b2abd484-0150-11e2-0800-3e31ed9fda91,11) memb {
        b2abd484-0150-11e2-0800-3e31ed9fda91,
} joined {
} left {
} partitioned {

        59dd02c8-0161-11e2-0800-32448ed27c29,
})

120918 17:20:51 [Note] WSREP: forgetting
59dd02c8-0161-11e2-0800-32448ed27c29 (tcp://192.168.101.3:4567)
120918 17:20:51 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no,
my_idx = 0, memb_num = 1
120918 17:20:51 [Note] WSREP: STATE_EXCHANGE: sent state UUID:
606732df-0161-11e2-0800-83cc96ce7138
120918 17:20:51 [Note] WSREP: STATE EXCHANGE: sent state msg:
606732df-0161-11e2-0800-83cc96ce7138
120918 17:20:51 [Note] WSREP: STATE EXCHANGE: got state msg:
606732df-0161-11e2-0800-83cc96ce7138 from 0 (API-B2C-Cluster01)
120918 17:20:51 [Note] WSREP: Quorum results:
        version    = 2,
        component  = PRIMARY,
        conf_id    = 10,
        members    = 1/1 (joined/total),
        act_id     = 2954,
        last_appl. = 2029,
        protocols  = 0/4/2 (gcs/repl/appl),
        group UUID = b2ad718b-0150-11e2-0800-3066d877c09e
120918 17:20:51 [Note] WSREP: Flow-control interval: [8, 16]
120918 17:20:51 [Note] WSREP: New cluster view: global state:
b2ad718b-0150-11e2-0800-3066d877c09e:2954, view# 11: P

...

Alex Yurchenko  
From: Alex Yurchenko <alexey.yurche...@codership.com>
Date: Tue, 18 Sep 2012 19:08:19 +0300
Local: Tues, Sep 18 2012 12:08 pm
Subject: Re: [codership-team] Re: 3rd node is unable to be rejoin the cluster after node was not responding and mysql-galera stop was issued
Hi Emilio,

1) I see that your donor log covers an SST at 17:20, whereas the joiner
log covers an SST at 17:15. Are these about the same event? Galera does
not require synchronized clocks on nodes, but successful log parsing does ;)

2) It seems that you have serious datadir corruption on the joiner. I
don't know what the cause could be, but given that you mention the
server just stopped responding, it may well be a hardware problem
(like running out of disk space). rsync SST should normally fix
datadir corruption, but if it does not, I'd suggest that you:
1. check the filesystem for errors and for enough free space
2. manually remove ALL contents of the data dir and try to join again

Regards,
Alex
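
The two suggestions above (check free space, then empty the data dir before rejoining) could be sketched roughly as follows. This is an illustration only: the helper names are made up, the demo runs against a temporary directory rather than a real datadir, mysqld is assumed to be stopped before anything is deleted, and the filesystem check itself (fsck) is not covered here.

```python
import shutil
import tempfile
from pathlib import Path


def free_space_fraction(path):
    """Fraction (0.0 to 1.0) of the filesystem holding `path` that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def clear_directory_contents(datadir, dry_run=True):
    """Remove everything inside `datadir` but keep the directory itself.

    Returns the names of the top-level entries. With dry_run=True (the
    default) nothing is deleted, so the list can be reviewed first.
    """
    removed = []
    for entry in sorted(Path(datadir).iterdir()):
        removed.append(entry.name)
        if dry_run:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    return removed


# Demo on a throwaway directory standing in for the joiner's datadir.
with tempfile.TemporaryDirectory() as tmp:
    datadir = Path(tmp) / "var"
    (datadir / "prod_c5_priceline").mkdir(parents=True)
    (datadir / "prod_c5_priceline" / "Packages.ibd").write_bytes(b"")
    (datadir / "ibdata1").write_bytes(b"")

    print(f"free space: {free_space_fraction(datadir):.0%}")
    print("would remove:", clear_directory_contents(datadir, dry_run=True))
    clear_directory_contents(datadir, dry_run=False)
    print("left after cleanup:", list(datadir.iterdir()))
```

With an empty data dir the node has no local state left, so the next join should go through a full SST from a donor.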

On 2012-09-18 10:53, Emilio wrote:

...

Henrik Ingo  
From: Henrik Ingo <henrik.i...@avoinelama.fi>
Date: Tue, 18 Sep 2012 22:43:06 +0300
Local: Tues, Sep 18 2012 3:43 pm
Subject: Re: [codership-team] 3rd node is unable to be rejoin the cluster after node was not responding and mysql-galera stop was issued
I think I had something like that when the rsync versions differed
between nodes. It didn't copy the datadir correctly and MySQL
couldn't start. (Disk full is also a good guess...)

henrik

--
henrik.i...@avoinelama.fi
+358-40-8211286 skype: henrik.ingo irc: hingo
www.openlife.cc

My LinkedIn profile: http://www.linkedin.com/profile/view?id=9522559


 
Emilio  
From: Emilio <emil.sal...@uber.biz>
Date: Tue, 18 Sep 2012 18:20:11 -0700 (PDT)
Local: Tues, Sep 18 2012 9:20 pm
Subject: Re: [codership-team] Re: 3rd node is unable to be rejoin the cluster after node was not responding and mysql-galera stop was issued

Hi Alexey,

Thanks for your prompt response. Here's the log from the correct timestamp
(I've made sure the clocks on all the servers are the same):

1)
120918 17:15:33 [Note] WSREP: declaring
a2bcd106-0160-11e2-0800-a52aa051edfc stable
120918 17:15:33 [Note] WSREP:
view(view_id(PRIM,a2bcd106-0160-11e2-0800-a52aa051edfc,8) memb {
120918 17:15:33 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no,
my_idx = 1, memb_num = 2
120918 17:15:33 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
120918 17:15:34 [Warning] IP address '192.168.101.250' could not be
resolved: Temporary failure in name resolution
120918 17:15:34 [Note] WSREP: STATE EXCHANGE: sent state msg:
a3571f1b-0160-11e2-0800-5af6eb6272c2
120918 17:15:34 [Note] WSREP: STATE EXCHANGE: got state msg:
a3571f1b-0160-11e2-0800-5af6eb6272c2 from 0 (API-B2C-DB02)
120918 17:15:34 [Note] WSREP: STATE EXCHANGE: got state msg:
a3571f1b-0160-11e2-0800-5af6eb6272c2 from 1 (API-B2C-Cluster01)
120918 17:15:34 [Note] WSREP: Quorum results:
120918 17:15:34 [Note] WSREP: Flow-control interval: [12, 23]
120918 17:15:34 [Note] WSREP: New cluster view: global state:
b2ad718b-0150-11e2-0800-3066d877c09e:2828, view# 8: Primary, number of
nodes: 2, my index: 1, protocol version 2
120918 17:15:34 [Note] WSREP: wsrep_notify_cmd is not defined, skipping
notification.
120918 17:15:34 [Note] WSREP: Assign initial position for certification:
2828, protocol version: 2
120918 17:15:34 [Warning] IP address '192.168.101.252' could not be
resolved: Temporary failure in name resolution
120918 17:15:36 [Note] WSREP: Node 0 (API-B2C-DB02) requested state
transfer from '*any*'. Selected 1 (API-B2C-Cluster01)(SYNCED) as donor.
120918 17:15:36 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 2829)
120918 17:15:36 [Note] WSREP: wsrep_notify_cmd is not defined, skipping
notification.
120918 17:15:36 [Note] WSREP: Running: 'wsrep_sst_rsync 'donor'
'172.28.63.69:4444/rsync_sst' 'root:rootpass'
'/opt/mysql-galera/mysql-5.5.23-galera-23.2.1-x86_64/mysql/var/'
'/opt/mysql-galera/mysql-5.5.23-galera-23.2.1-x86_64/mysql/etc/my.cnf'
'b2ad718b-0150-11e2-0800-3066d877c09e'
120918 17:15:36 [Note] WSREP: sst_donor_thread signaled with 0
120918 17:15:36 [Note] WSREP: Flushing tables for SST...
120918 17:15:36 [Note] WSREP: Provider paused at
b2ad718b-0150-11e2-0800-3066d877c09e:2829
120918 17:15:36 [Note] WSREP: Tables flushed.
120918 17:15:37 [Warning] IP address '192.168.101.252' could not be
resolved: Temporary failure in name resolution
120918 17:15:38 [Warning] IP address '192.168.101.251' could not be
resolved: Temporary failure in name resolution
120918 17:15:38 [Warning] IP address '192.168.101.252' could not be
resolved: Temporary failure in name resolution
120918 17:15:39 [Note] WSREP: Provider resumed.
120918 17:15:39 [Warning] IP address '192.168.101.250' could not be
resolved: Temporary failure in name resolution
120918 17:15:39 [Note] WSREP: 1 (API-B2C-Cluster01): State transfer to 0
(API-B2C-DB02) complete.
120918 17:15:39 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 2831)
120918 17:15:39 [Note] WSREP: Member 1 (API-B2C-Cluster01) synced with
group.
120918 17:15:39 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 2831)
120918 17:15:39 [Note] WSREP: Synchronized with group, ready for connections
120918 17:15:39 [Note] WSREP: wsrep_notify_cmd is not defined, skipping
notification.
120918 17:15:39 [Warning] IP address '192.168.101.252' could not be
resolved: Temporary failure in name resolution
120918 17:15:41 [Note] WSREP: 0 (API-B2C-DB02): State transfer from 1
(API-B2C-Cluster01) complete.
120918 17:15:41 [Note] WSREP: Member 0 (API-B2C-DB02) synced with group.
120918 17:15:42 [Note] WSREP:
view(view_id(PRIM,b2abd484-0150-11e2-0800-3e31ed9fda91,9) memb {
120918 17:15:42 [Note] WSREP: forgetting
a2bcd106-0160-11e2-0800-a52aa051edfc (tcp://192.168.101.3:4567)
120918 17:15:42 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no,
my_idx = 0, memb_num = 1
120918 17:15:42 [Note] WSREP: STATE_EXCHANGE: sent state UUID:
a813b93b-0160-11e2-0800-914a571aa574
120918 17:15:42 [Note] WSREP: STATE EXCHANGE: sent state msg:
a813b93b-0160-11e2-0800-914a571aa574
120918 17:15:42 [Note] WSREP: STATE EXCHANGE: got state msg:
a813b93b-0160-11e2-0800-914a571aa574 from 0 (API-B2C-Cluster01)
120918 17:15:42 [Note] WSREP: Quorum results:
120918 17:15:42 [Note] WSREP: Flow-control interval: [8, 16]
120918 17:15:42 [Note] WSREP: New cluster view: global state:
b2ad718b-0150-11e2-0800-3066d877c09e:2832, view# 9: Primary, number of
nodes: 1, my index: 0, protocol version 2
120918 17:15:42 [Note] WSREP: wsrep_notify_cmd is not defined, skipping
notification.
120918 17:15:42 [Note] WSREP: Assign initial position for certification:
2832, protocol version: 2

2) I will provide feedback about the filesystem check later today (a
no-change check, fsck -n, shows errors in the filesystem). However, there is
definitely ample space for the databases.

/../../magento-magento
                       50G  675M   47G   2% /mnt/prod_mage
/../../concrete5-concrete5
                       49G  395M   46G   1% /mnt/prod_c5
/dev/mapper/DBB
                       20G  173M   19G   1% /mnt/prod_mageB
/dev/mapper/DBB2
                       20G  236M   18G   2% /mnt/prod_c5B

...

Emilio  
From: Emilio <emil.sal...@uber.biz>
Date: Tue, 18 Sep 2012 21:40:46 -0700 (PDT)
Local: Wed, Sep 19 2012 12:40 am
Subject: Re: 3rd node is unable to be rejoin the cluster after node was not responding and mysql-galera stop was issued

Hi Alexey

Please find the fsck results from DB02 below. Would these have an effect on
Galera?

FSCK Errors:

fsck -n

************** Root mount (Galera installation and control + mysql)

fsck from util-linux-ng 2.17.2

e2fsck 1.41.12 (17-May-2010)

Warning!  /dev/mapper/root--vg-root is mounted.

Warning: skipping journal recovery because doing a read-only filesystem
check.

/dev/mapper/root--vg-root contains a file system with errors, check forced.

Pass 1: Checking inodes, blocks, and sizes

Inodes that were part of a corrupted orphan linked list found.  Fix? No

(see http://jira.whamcloud.com/browse/LU-1281)

Inode 261635 was part of the orphaned inode list.  IGNORED.

Deleted inode 261636 has zero dtime.  Fix? no

Inode 261637 was part of the orphaned inode list.  IGNORED.

Inode 265226 was part of the orphaned inode list.  IGNORED.

Pass 2: Checking directory structure

Pass 3: Checking directory connectivity

Pass 4: Checking reference counts

Pass 5: Checking group summary information

Block bitmap differences:  -1057529 -(7953409--7953411)

Fix? no

Free blocks count wrong (9332762, counted=8630828).

Fix? no

Inode bitmap differences:  -(261635--261637) -265226

Fix? no

Directories count wrong for group #32 (131, counted=130).

Fix? no

Free inodes count wrong (2560481, counted=2558866).

Fix? no

/dev/mapper/root--vg-root: ********** WARNING: Filesystem still has errors**********

/dev/mapper/root--vg-root: 47663/2608144 files (0.7% non-contiguous),
1095654/10428416 blocks

***************** Magento DB Mount *******************

Warning!  /dev/mapper/magento-magento is mounted.

Warning: skipping journal recovery because doing a read-only filesystem
check.

/dev/mapper/magento-magento contains a file system with errors, check
forced.

Pass 1: Checking inodes, blocks, and sizes

Pass 2: Checking directory structure

Pass 3: Checking directory connectivity

Pass 4: Checking reference counts

Pass 5: Checking group summary information

Free blocks count wrong (12854444, counted=12726079).

Fix? no

Free inodes count wrong (3276789, counted=3275957).

Fix? no

/dev/mapper/magento-magento: ********** WARNING: Filesystem still has errors**********

/dev/mapper/magento-magento: 11/3276800 files (3654.5% non-contiguous),
251732/13106176 blocks

********************** C5 DB **********************

Free blocks count wrong (12725428, counted=12670390).

Fix? no

Free inodes count wrong (3244021, counted=3243630).

Fix? no

/dev/mapper/concrete5-concrete5: ********** WARNING: Filesystem still has
errors **********

...

Alex Yurchenko  
From: Alex Yurchenko <alexey.yurche...@codership.com>
Date: Wed, 19 Sep 2012 10:31:38 +0300
Local: Wed, Sep 19 2012 3:31 am
Subject: Re: [codership-team] Re: 3rd node is unable to be rejoin the cluster after node was not responding and mysql-galera stop was issued
Emilio,

An error in a filesystem means it can't store data reliably. You can't
expect ANYTHING to work on a broken filesystem. You can't expect ANY
data to be preserved there. A broken filesystem should be your biggest
fear.

On 2012-09-19 07:40, Emilio wrote:

...

Emilio  
From: Emilio <emil.sal...@uber.biz>
Date: Wed, 19 Sep 2012 02:15:14 -0700 (PDT)
Local: Wed, Sep 19 2012 5:15 am
Subject: Re: 3rd node is unable to be rejoin the cluster after node was not responding and mysql-galera stop was issued

Thanks, Alexey, for all your help. I'll get back to you with an update once
I've run all the filesystem checks.

Regards
Emilio


 