Re: 3rd node is unable to rejoin the cluster after node was not responding and mysql-galera stop was issued


Emilio

Sep 18, 2012, 3:53:50 AM
to codersh...@googlegroups.com
Just to add the Donor Log as well:
 
120918 17:20:41 [Note] WSREP: declaring 59dd02c8-0161-11e2-0800-32448ed27c29 stable
120918 17:20:41 [Note] WSREP: view(view_id(PRIM,59dd02c8-0161-11e2-0800-32448ed27c29,10) memb {
        59dd02c8-0161-11e2-0800-32448ed27c29,
        b2abd484-0150-11e2-0800-3e31ed9fda91,
} joined {
} left {
} partitioned {
})
120918 17:20:41 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
120918 17:20:41 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
120918 17:20:41 [Note] WSREP: STATE EXCHANGE: sent state msg: 5a77284a-0161-11e2-0800-2c3a847811a7
120918 17:20:41 [Note] WSREP: STATE EXCHANGE: got state msg: 5a77284a-0161-11e2-0800-2c3a847811a7 from 0 (API-B2C-DB02)
120918 17:20:41 [Note] WSREP: STATE EXCHANGE: got state msg: 5a77284a-0161-11e2-0800-2c3a847811a7 from 1 (API-B2C-Cluster01)
120918 17:20:41 [Note] WSREP: Quorum results:
        version    = 2,
        component  = PRIMARY,
        conf_id    = 9,
        members    = 1/2 (joined/total),
        act_id     = 2950,
        last_appl. = 2029,
        protocols  = 0/4/2 (gcs/repl/appl),
        group UUID = b2ad718b-0150-11e2-0800-3066d877c09e
120918 17:20:41 [Note] WSREP: Flow-control interval: [12, 23]
120918 17:20:41 [Note] WSREP: New cluster view: global state: b2ad718b-0150-11e2-0800-3066d877c09e:2950, view# 10: Primary, number of nodes: 2, my index: 1, protocol version 2
120918 17:20:41 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120918 17:20:41 [Note] WSREP: Assign initial position for certification: 2950, protocol version: 2
120918 17:20:43 [Warning] IP address '192.168.101.251' could not be resolved: Temporary failure in name resolution
120918 17:20:43 [Warning] IP address '192.168.101.252' could not be resolved: Temporary failure in name resolution
120918 17:20:43 [Note] WSREP: Node 0 (API-B2C-DB02) requested state transfer from '*any*'. Selected 1 (API-B2C-Cluster01)(SYNCED) as donor.
120918 17:20:43 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 2951)
120918 17:20:43 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120918 17:20:43 [Note] WSREP: Running: 'wsrep_sst_rsync 'donor' '172.28.63.69:4444/rsync_sst' 'root:rootpass' '/opt/mysql-galera/mysql-5.5.23-galera-23.2.1-x86_64/mysql/var/' '/opt/mysql-galera/mysql-5.5.23-galera-23.2.1-x86_64/mysql/etc/my.cnf' 'b2ad718b-0150-11e2-0800-3066d877c09e'
120918 17:20:43 [Note] WSREP: sst_donor_thread signaled with 0
120918 17:20:43 [Note] WSREP: Flushing tables for SST...
120918 17:20:43 [Note] WSREP: Provider paused at b2ad718b-0150-11e2-0800-3066d877c09e:2951
120918 17:20:43 [Note] WSREP: Tables flushed.
120918 17:20:45 [Warning] IP address '192.168.101.250' could not be resolved: Temporary failure in name resolution
120918 17:20:46 [Note] WSREP: Provider resumed.
120918 17:20:46 [Note] WSREP: 1 (API-B2C-Cluster01): State transfer to 0 (API-B2C-DB02) complete.
120918 17:20:46 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 2951)
120918 17:20:46 [Note] WSREP: Member 1 (API-B2C-Cluster01) synced with group.
120918 17:20:46 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 2951)
120918 17:20:46 [Note] WSREP: Synchronized with group, ready for connections
120918 17:20:46 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120918 17:20:47 [Warning] IP address '192.168.101.252' could not be resolved: Temporary failure in name resolution
120918 17:20:48 [Warning] IP address '192.168.101.251' could not be resolved: Temporary failure in name resolution
120918 17:20:48 [Warning] IP address '192.168.101.252' could not be resolved: Temporary failure in name resolution
120918 17:20:49 [Warning] IP address '192.168.101.252' could not be resolved: Temporary failure in name resolution
120918 17:20:50 [Warning] IP address '192.168.101.250' could not be resolved: Temporary failure in name resolution
120918 17:20:51 [Note] WSREP: 0 (API-B2C-DB02): State transfer from 1 (API-B2C-Cluster01) complete.
120918 17:20:51 [Note] WSREP: Member 0 (API-B2C-DB02) synced with group.
120918 17:20:51 [Note] WSREP: view(view_id(PRIM,b2abd484-0150-11e2-0800-3e31ed9fda91,11) memb {
        b2abd484-0150-11e2-0800-3e31ed9fda91,
} joined {
} left {
} partitioned {
        59dd02c8-0161-11e2-0800-32448ed27c29,
})
120918 17:20:51 [Note] WSREP: forgetting 59dd02c8-0161-11e2-0800-32448ed27c29 (tcp://192.168.101.3:4567)
120918 17:20:51 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
120918 17:20:51 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 606732df-0161-11e2-0800-83cc96ce7138
120918 17:20:51 [Note] WSREP: STATE EXCHANGE: sent state msg: 606732df-0161-11e2-0800-83cc96ce7138
120918 17:20:51 [Note] WSREP: STATE EXCHANGE: got state msg: 606732df-0161-11e2-0800-83cc96ce7138 from 0 (API-B2C-Cluster01)
120918 17:20:51 [Note] WSREP: Quorum results:
        version    = 2,
        component  = PRIMARY,
        conf_id    = 10,
        members    = 1/1 (joined/total),
        act_id     = 2954,
        last_appl. = 2029,
        protocols  = 0/4/2 (gcs/repl/appl),
        group UUID = b2ad718b-0150-11e2-0800-3066d877c09e
120918 17:20:51 [Note] WSREP: Flow-control interval: [8, 16]
120918 17:20:51 [Note] WSREP: New cluster view: global state: b2ad718b-0150-11e2-0800-3066d877c09e:2954, view# 11: P
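
When comparing a donor log like the one above against a joiner log, it can help to pull out just the node state transitions with their timestamps. The following is a minimal sketch, assuming the log lines keep the exact "Shifting X -> Y (TO: n)" shape seen in this thread; the function name `state_shifts` and the sample constant are made up for illustration.

```python
import re
from datetime import datetime

# Matches lines such as:
# 120918 17:20:43 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 2951)
SHIFT_RE = re.compile(
    r"^(\d{6} \d{2}:\d{2}:\d{2}) \[Note\] WSREP: Shifting (\S+) -> (\S+) \(TO: (\d+)\)"
)


def state_shifts(log_text):
    """Return (timestamp, old_state, new_state, seqno) tuples in log order."""
    shifts = []
    for line in log_text.splitlines():
        m = SHIFT_RE.match(line.strip())
        if m:
            # The log prefix is YYMMDD HH:MM:SS.
            ts = datetime.strptime(m.group(1), "%y%m%d %H:%M:%S")
            shifts.append((ts, m.group(2), m.group(3), int(m.group(4))))
    return shifts


# Three lines copied from the donor log in this thread.
DONOR_SAMPLE = """\
120918 17:20:43 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 2951)
120918 17:20:46 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 2951)
120918 17:20:46 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 2951)
"""

for ts, old, new, seqno in state_shifts(DONOR_SAMPLE):
    print(ts.isoformat(sep=" "), old, "->", new, "at seqno", seqno)
```

Running the same function over both the donor and the joiner log makes it easier to see whether the two excerpts describe the same state transfer.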
 

On Tuesday, September 18, 2012 5:18:38 PM UTC+10, Emilio wrote:
The 3rd node is no longer able to rejoin the cluster after a mysql-galera stop was issued. The stop was issued because the node was not responding to SQL requests. There were no hardware failures and no software issues; the node simply stopped responding to SQL queries.
 
Error log (.err) from the failing server:
 
...
 
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
120918 17:15:40  InnoDB: Error: table 'prod_c5_priceline/Packages'
InnoDB: in InnoDB data dictionary has tablespace id 4355,
InnoDB: but a tablespace with that id does not exist. There is
InnoDB: a tablespace of name ./prod_c5_priceline/Packages.ibd and id 1295, though. Have
InnoDB: you deleted or moved .ibd files?
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
120918 17:15:40  InnoDB: Error: table 'prod_c5_priceline/PagePaths'
InnoDB: in InnoDB data dictionary has tablespace id 4356,
InnoDB: but a tablespace with that id does not exist. There is
InnoDB: a tablespace of name ./prod_c5_priceline/PagePaths.ibd and id 3993, though. Have
InnoDB: you deleted or moved .ibd files?
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
120918 17:15:40  InnoDB: Error: table 'prod_c5_priceline/PagePermissionPageTypes'
InnoDB: in InnoDB data dictionary has tablespace id 4357,
InnoDB: but a tablespace with that id does not exist. There is
InnoDB: a tablespace of name ./prod_c5_priceline/PagePermissionPageTypes.ibd and id 3994, though. Have
InnoDB: you deleted or moved .ibd files?
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
120918 17:15:40  InnoDB: Error: table 'prod_c5_priceline/PagePermissions'
InnoDB: in InnoDB data dictionary has tablespace id 4358,
InnoDB: but a tablespace with that id does not exist. There is
InnoDB: a tablespace of name ./prod_c5_priceline/PagePermissions.ibd and id 3995, though. Have
InnoDB: you deleted or moved .ibd files?
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
120918 17:15:40  InnoDB: Error: table 'prod_c5_priceline/PageSearchIndex'
InnoDB: in InnoDB data dictionary has tablespace id 4359,
InnoDB: but a tablespace with that id does not exist. There is
InnoDB: a tablespace of name ./prod_c5_priceline/PageSearchIndex.ibd and id 3997, though. Have
InnoDB: you deleted or moved .ibd files?
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
120918 17:15:40  InnoDB: Error: table 'prod_c5_priceline/PageStatistics'
InnoDB: in InnoDB data dictionary has tablespace id 4360,
InnoDB: but a tablespace with that id does not exist. There is
InnoDB: a tablespace of name ./prod_c5_priceline/PageStatistics.ibd and id 244, though. Have
InnoDB: you deleted or moved .ibd files?
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
120918 17:15:40  InnoDB: Error: table 'prod_mage_soulpattinson/dummy'
InnoDB: in InnoDB data dictionary has tablespace id 1,
InnoDB: but tablespace with that id or name does not exist. Have
InnoDB: you deleted or moved .ibd files?
InnoDB: This may also be a table created with CREATE TEMPORARY TABLE
InnoDB: whose .ibd and .frm files MySQL automatically removed, but the
InnoDB: table still exists in the InnoDB internal data dictionary.
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
120918 17:15:40  InnoDB: Waiting for the background threads to start
120918 17:15:41 InnoDB: 1.1.8 started; log sequence number 1423444579
120918 17:15:41 [Note] Event Scheduler: Loaded 0 events
120918 17:15:41 [Note] WSREP: Signalling provider to continue.
120918 17:15:41 [Note] WSREP: Received SST: b2ad718b-0150-11e2-0800-3066d877c09e:2829
120918 17:15:41 [Note] WSREP: SST received: b2ad718b-0150-11e2-0800-3066d877c09e:2829
120918 17:15:41  InnoDB: error: space object of table 'prod_c5_priceline/PageStatistics',
InnoDB: space id 4360 did not exist in memory. Retrying an open.
120918 17:15:41 [Note] WSREP: 0 (API-B2C-DB02): State transfer from 1 (API-B2C-Cluster01) complete.
120918 17:15:41 [Note] WSREP: Shifting JOINER -> JOINED (TO: 2832)
120918 17:15:41  InnoDB: Error: tablespace id and flags in file './prod_c5_priceline/PageStatistics.ibd' are 244 and 0, but in the InnoDB
InnoDB: data dictionary they are 4360 and 0.
InnoDB: Have you moved InnoDB .ibd files around without using the
InnoDB: commands DISCARD TABLESPACE and IMPORT TABLESPACE?
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
120918 17:15:41 [Note] WSREP: Member 0 (API-B2C-DB02) synced with group.
120918 17:15:41 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 2832)
120918 17:15:41  InnoDB: cannot calculate statistics for table prod_c5_priceline/PageStatistics
InnoDB: because the .ibd file is missing.  For help, please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting.html
120918 17:15:41 [ERROR] MySQL is trying to open a table handle but the .ibd file for
table prod_c5_priceline/PageStatistics does not exist.
Have you deleted the .ibd file from the database directory under
the MySQL datadir, or have you used DISCARD TABLESPACE?
See http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting.html
how you can resolve the problem.
120918 17:15:41 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1146, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 0 seqno: 2830)
120918 17:15:41 [ERROR] Slave SQL: Error executing row event: 'Table 'prod_c5_priceline.PageStatistics' doesn't exist', Error_code: 1146
120918 17:15:41 [Warning] WSREP: RBR event 2 Write_rows apply warning: 1146, 2830
120918 17:15:41 [Note] /opt/mysql-galera/mysql-5.5.23-galera-23.2.1-x86_64/mysql/sbin/mysqld: ready for connections.
Version: '5.5.23'  socket: '/opt/mysql-galera/mysql-5.5.23-galera-23.2.1-x86_64/mysql/var/mysqld.sock'  port: 3306  Source distribution, wsrep_23.6.r3755
120918 17:15:41 [ERROR] WSREP: Failed to apply trx: source: b2abd484-0150-11e2-0800-3e31ed9fda91 version: 2 local: 0 state: CERTIFYING flags: 1 conn_id: 5586 trx_id: 573003579 seqnos (l: 4, g: 2830, s: 2829, d: 2828, ts: 1347952539315888478)
120918 17:15:41 [ERROR] WSREP: Failed to apply app buffer: P, seqno: 2830, status: WSREP_FATAL
         at galera/src/replicator_smm.cpp:apply_wscoll():50
         at galera/src/replicator_smm.cpp:apply_trx_ws():121
120918 17:15:41 [ERROR] WSREP: Node consistency compromized, aborting...
120918 17:15:41 [Note] WSREP: Closing send monitor...
120918 17:15:41 [Note] WSREP: Closed send monitor.
120918 17:15:41 [Note] WSREP: gcomm: terminating thread
120918 17:15:41 [Note] WSREP: gcomm: joining thread
120918 17:15:41 [Note] WSREP: gcomm: closing backend
120918 17:15:42 [Note] WSREP: view(view_id(NON_PRIM,a2bcd106-0160-11e2-0800-a52aa051edfc,8) memb {
        a2bcd106-0160-11e2-0800-a52aa051edfc,
} joined {
} left {
} partitioned {
        b2abd484-0150-11e2-0800-3e31ed9fda91,
})
120918 17:15:42 [Note] WSREP: view((empty))
120918 17:15:42 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
120918 17:15:42 [Note] WSREP: gcomm: closed
120918 17:15:42 [Note] WSREP: Flow-control interval: [8, 16]
120918 17:15:42 [Note] WSREP: Received NON-PRIMARY.
120918 17:15:42 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 2832)
120918 17:15:42 [Note] WSREP: Received self-leave message.
120918 17:15:42 [Note] WSREP: Flow-control interval: [0, 0]
120918 17:15:42 [Note] WSREP: Received SELF-LEAVE. Closing connection.
120918 17:15:42 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 2832)
120918 17:15:42 [Note] WSREP: RECV thread exiting 0: Success
120918 17:15:42 [Note] WSREP: recv_thread() joined.
120918 17:15:42 [Note] WSREP: Closing slave action queue.
120918 17:15:42 [Note] WSREP: /opt/mysql-galera/mysql-5.5.23-galera-23.2.1-x86_64/mysql/sbin/mysqld: Terminated.
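
The .err log above repeats the same four-line InnoDB report for every table whose tablespace id in the data dictionary differs from the id stored in its .ibd file. As a rough sketch, assuming the messages keep exactly this wording, the mismatches could be collected into a list for review. The name `tablespace_mismatches` and the sample constant are hypothetical.

```python
import re

# One mismatch report spans four consecutive "InnoDB:" lines in the .err log.
MISMATCH_RE = re.compile(
    r"InnoDB: Error: table '([^']+)'\s+"
    r"InnoDB: in InnoDB data dictionary has tablespace id (\d+),\s+"
    r"InnoDB: but a tablespace with that id does not exist\. There is\s+"
    r"InnoDB: a tablespace of name (\S+) and id (\d+), though"
)


def tablespace_mismatches(log_text):
    """Return one dict per table whose dictionary id differs from its .ibd id."""
    return [
        {
            "table": m.group(1),
            "dictionary_id": int(m.group(2)),
            "ibd_file": m.group(3),
            "file_id": int(m.group(4)),
        }
        for m in MISMATCH_RE.finditer(log_text)
    ]


# Two reports copied from the joiner log in this thread.
ERR_SAMPLE = """\
120918 17:15:40  InnoDB: Error: table 'prod_c5_priceline/Packages'
InnoDB: in InnoDB data dictionary has tablespace id 4355,
InnoDB: but a tablespace with that id does not exist. There is
InnoDB: a tablespace of name ./prod_c5_priceline/Packages.ibd and id 1295, though. Have
InnoDB: you deleted or moved .ibd files?
120918 17:15:40  InnoDB: Error: table 'prod_c5_priceline/PageStatistics'
InnoDB: in InnoDB data dictionary has tablespace id 4360,
InnoDB: but a tablespace with that id does not exist. There is
InnoDB: a tablespace of name ./prod_c5_priceline/PageStatistics.ibd and id 244, though. Have
InnoDB: you deleted or moved .ibd files?
"""

for row in tablespace_mismatches(ERR_SAMPLE):
    print(row["table"], "dictionary id", row["dictionary_id"], "file id", row["file_id"])
```

The sketch only reads the log. It does not cover the other report variant in the log, where no tablespace with that id or name exists at all.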
 
SST error log (sst.err):
 
Joiner cleanup:
++ cat /opt/mysql-galera/mysql-5.5.23-galera-23.2.1-x86_64/mysql/var//rsync_sst.pid
+ local PID=14296
+ '[' 0 '!=' 14296 ']'
+ kill 14296
+ sleep 0.5
+ kill -9 14296
/opt/mysql-galera/mysql-5.5.23-galera-23.2.1-x86_64/mysql//bin/wsrep_sst_rsync: line 29: kill: (14296) - No such process
+ :
+ set +x
 
 
Galera state:
 
# GALERA saved state
version: 2.1
uuid:    00000000-0000-0000-0000-000000000000
seqno:   -1
cert_index:
 
Is it possible to clean this node up and rejoin it? If so, how?
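
For reference, the saved state shown above holds the all-zero UUID and seqno -1, which suggests the node has no recorded position of its own. A minimal sketch of checking this programmatically follows. It assumes the simple "key: value" layout shown in this post, and the function names are invented for the example.

```python
NIL_UUID = "00000000-0000-0000-0000-000000000000"


def parse_saved_state(text):
    """Parse 'key: value' lines of a Galera saved state, skipping comments."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        state[key.strip()] = value.strip()
    return state


def has_recorded_position(state):
    """True only if the state names a real cluster UUID and a seqno other than -1."""
    uuid = state.get("uuid")
    seqno = state.get("seqno")
    return uuid not in (None, "", NIL_UUID) and seqno not in (None, "", "-1")


# Saved state copied from this post.
STATE_SAMPLE = """\
# GALERA saved state
version: 2.1
uuid:    00000000-0000-0000-0000-000000000000
seqno:   -1
cert_index:
"""

state = parse_saved_state(STATE_SAMPLE)
print("recorded position:", has_recorded_position(state))
```

A node without a recorded position would be expected to request a full state transfer when it rejoins, so this check mainly confirms what the log already implies.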
 

Alex Yurchenko

Sep 18, 2012, 12:08:19 PM
to codersh...@googlegroups.com
Hi Emilio,

1) I see that your donor log covers an SST at 17:20, whereas the joiner
log covers an SST at 17:15. Are these about the same event? Galera does
not require synchronized clocks on nodes, but successful log parsing does ;)

2) It seems that you have serious datadir corruption on the joiner. What
the cause could be, I don't know, but paired with your mention that
the server just stopped responding, it may well be a hardware problem
(like running out of disk space). rsync SST should normally fix the
datadir corruption, but if it does not, I'd suggest that you:
1. check the filesystem for errors and for enough free space
2. manually remove ALL contents of the data dir and try to join again
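
The two suggestions above can be partly rehearsed before touching anything. The sketch below is a dry run under stated assumptions: it reports free space on the filesystem holding a directory and lists what wiping that directory would remove, without deleting anything. The helper name `datadir_report` and the `min_free_bytes` threshold are hypothetical, and the filesystem error check itself (fsck) is not covered here.

```python
import os
import shutil
import tempfile


def datadir_report(datadir, min_free_bytes):
    """Report free space and list what a full wipe of datadir would remove.

    Dry run only: nothing is deleted.
    """
    usage = shutil.disk_usage(datadir)
    return {
        "free_bytes": usage.free,
        "enough_space": usage.free >= min_free_bytes,
        "would_remove": sorted(
            os.path.join(datadir, name) for name in os.listdir(datadir)
        ),
    }


# Demonstrate on a throwaway directory rather than a real datadir.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("ibdata1", "grastate.dat"):
        open(os.path.join(demo_dir, name), "w").close()
    report = datadir_report(demo_dir, min_free_bytes=0)
    print("enough space:", report["enough_space"])
    for path in report["would_remove"]:
        print("would remove:", path)
```

Pointing the function at the real data directory, with mysqld stopped, would show exactly which files a manual cleanup is about to discard.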

Regards,
Alex
--
Alexey Yurchenko,
Codership Oy, www.codership.com
Skype: alexey.yurchenko, Phone: +358-400-516-011

Henrik Ingo

Sep 18, 2012, 3:43:06 PM
to Emilio, codersh...@googlegroups.com
I think I had something like that when the rsync versions differed
between nodes. It didn't copy the datadir correctly and MySQL
couldn't start. (Disk full is also a good guess...)
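
A quick way to rule out the version mismatch is to compare the first line of `rsync --version` from each node. The sketch below assumes that line has the usual "rsync  version X  protocol version N" shape. The version strings in the example are made up for illustration and are not taken from the nodes in this thread.

```python
import re

VERSION_RE = re.compile(r"rsync\s+version\s+(\S+)\s+protocol version\s+(\d+)")


def parse_rsync_version(output):
    """Extract (version, protocol) from the text printed by `rsync --version`."""
    m = VERSION_RE.search(output)
    if m is None:
        raise ValueError("unrecognized rsync --version output")
    return m.group(1), int(m.group(2))


def mismatched_nodes(outputs):
    """Given {node: version output}, return nodes that differ from the first one."""
    parsed = {node: parse_rsync_version(text) for node, text in outputs.items()}
    if not parsed:
        return []
    reference = next(iter(parsed.values()))
    return sorted(node for node, found in parsed.items() if found != reference)


# Hypothetical outputs, one per node; collect the real ones on each host.
EXAMPLE_OUTPUTS = {
    "API-B2C-Cluster01": "rsync  version 3.0.7  protocol version 30",
    "API-B2C-DB02": "rsync  version 2.6.9  protocol version 29",
}

print("nodes differing from the first:", mismatched_nodes(EXAMPLE_OUTPUTS))
```

An empty result means every node reported the same version and protocol as the first one listed.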

henrik
--
henri...@avoinelama.fi
+358-40-8211286 skype: henrik.ingo irc: hingo
www.openlife.cc

My LinkedIn profile: http://www.linkedin.com/profile/view?id=9522559

Emilio

Sep 18, 2012, 9:20:11 PM
to codersh...@googlegroups.com
Hi Alexey
 
Thanks for your prompt response. Here's the log from the correct timestamp (I've made sure the time is the same on all the servers):
 
 
1)
120918 17:15:33 [Note] WSREP: declaring a2bcd106-0160-11e2-0800-a52aa051edfc stable
120918 17:15:33 [Note] WSREP: view(view_id(PRIM,a2bcd106-0160-11e2-0800-a52aa051edfc,8) memb {
120918 17:15:33 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
120918 17:15:33 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
120918 17:15:34 [Warning] IP address '192.168.101.250' could not be resolved: Temporary failure in name resolution
120918 17:15:34 [Note] WSREP: STATE EXCHANGE: sent state msg: a3571f1b-0160-11e2-0800-5af6eb6272c2
120918 17:15:34 [Note] WSREP: STATE EXCHANGE: got state msg: a3571f1b-0160-11e2-0800-5af6eb6272c2 from 0 (API-B2C-DB02)
120918 17:15:34 [Note] WSREP: STATE EXCHANGE: got state msg: a3571f1b-0160-11e2-0800-5af6eb6272c2 from 1 (API-B2C-Cluster01)
120918 17:15:34 [Note] WSREP: Quorum results:
120918 17:15:34 [Note] WSREP: Flow-control interval: [12, 23]
120918 17:15:34 [Note] WSREP: New cluster view: global state: b2ad718b-0150-11e2-0800-3066d877c09e:2828, view# 8: Primary, number of nodes: 2, my index: 1, protocol version 2
120918 17:15:34 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120918 17:15:34 [Note] WSREP: Assign initial position for certification: 2828, protocol version: 2
120918 17:15:34 [Warning] IP address '192.168.101.252' could not be resolved: Temporary failure in name resolution
120918 17:15:36 [Note] WSREP: Node 0 (API-B2C-DB02) requested state transfer from '*any*'. Selected 1 (API-B2C-Cluster01)(SYNCED) as donor.
120918 17:15:36 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 2829)
120918 17:15:36 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120918 17:15:36 [Note] WSREP: Running: 'wsrep_sst_rsync 'donor' '172.28.63.69:4444/rsync_sst' 'root:rootpass' '/opt/mysql-galera/mysql-5.5.23-galera-23.2.1-x86_64/mysql/var/' '/opt/mysql-galera/mysql-5.5.23-galera-23.2.1-x86_64/mysql/etc/my.cnf' 'b2ad718b-0150-11e2-0800-3066d877c09e'
120918 17:15:36 [Note] WSREP: sst_donor_thread signaled with 0
120918 17:15:36 [Note] WSREP: Flushing tables for SST...
120918 17:15:36 [Note] WSREP: Provider paused at b2ad718b-0150-11e2-0800-3066d877c09e:2829
120918 17:15:36 [Note] WSREP: Tables flushed.
120918 17:15:37 [Warning] IP address '192.168.101.252' could not be resolved: Temporary failure in name resolution
120918 17:15:38 [Warning] IP address '192.168.101.251' could not be resolved: Temporary failure in name resolution
120918 17:15:38 [Warning] IP address '192.168.101.252' could not be resolved: Temporary failure in name resolution
120918 17:15:39 [Note] WSREP: Provider resumed.
120918 17:15:39 [Warning] IP address '192.168.101.250' could not be resolved: Temporary failure in name resolution
120918 17:15:39 [Note] WSREP: 1 (API-B2C-Cluster01): State transfer to 0 (API-B2C-DB02) complete.
120918 17:15:39 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 2831)
120918 17:15:39 [Note] WSREP: Member 1 (API-B2C-Cluster01) synced with group.
120918 17:15:39 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 2831)
120918 17:15:39 [Note] WSREP: Synchronized with group, ready for connections
120918 17:15:39 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120918 17:15:39 [Warning] IP address '192.168.101.252' could not be resolved: Temporary failure in name resolution

120918 17:15:41 [Note] WSREP: 0 (API-B2C-DB02): State transfer from 1 (API-B2C-Cluster01) complete.
120918 17:15:41 [Note] WSREP: Member 0 (API-B2C-DB02) synced with group.
120918 17:15:42 [Note] WSREP: view(view_id(PRIM,b2abd484-0150-11e2-0800-3e31ed9fda91,9) memb {
120918 17:15:42 [Note] WSREP: forgetting a2bcd106-0160-11e2-0800-a52aa051edfc (tcp://192.168.101.3:4567)
120918 17:15:42 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
120918 17:15:42 [Note] WSREP: STATE_EXCHANGE: sent state UUID: a813b93b-0160-11e2-0800-914a571aa574
120918 17:15:42 [Note] WSREP: STATE EXCHANGE: sent state msg: a813b93b-0160-11e2-0800-914a571aa574
120918 17:15:42 [Note] WSREP: STATE EXCHANGE: got state msg: a813b93b-0160-11e2-0800-914a571aa574 from 0 (API-B2C-Cluster01)
120918 17:15:42 [Note] WSREP: Quorum results:

120918 17:15:42 [Note] WSREP: Flow-control interval: [8, 16]
120918 17:15:42 [Note] WSREP: New cluster view: global state: b2ad718b-0150-11e2-0800-3066d877c09e:2832, view# 9: Primary, number of nodes: 1, my index: 0, protocol version 2
120918 17:15:42 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
120918 17:15:42 [Note] WSREP: Assign initial position for certification: 2832, protocol version: 2
2) I will provide feedback about the filesystem check later today (a no-change check, fsck -n, shows errors in the filesystem). However, there is definitely ample space for the databases.
 
/../../magento-magento
                       50G  675M   47G   2% /mnt/prod_mage
/../../concrete5-concrete5
                       49G  395M   46G   1% /mnt/prod_c5
/dev/mapper/DBB
                       20G  173M   19G   1% /mnt/prod_mageB
/dev/mapper/DBB2
                       20G  236M   18G   2% /mnt/prod_c5B

Emilio

Sep 19, 2012, 12:40:46 AM
to codersh...@googlegroups.com
Hi Alexey
 
Please find the fsck results from DB02 below. Would these have an effect on Galera?
 

FSCK Errors:

fsck -n

************** Root mount (Galera installation and control + mysql)

fsck from util-linux-ng 2.17.2
e2fsck 1.41.12 (17-May-2010)
Warning!  /dev/mapper/root--vg-root is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
/dev/mapper/root--vg-root contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Inodes that were part of a corrupted orphan linked list found.  Fix? No
-----  (http://jira.whamcloud.com/browse/LU-1281)

Inode 261635 was part of the orphaned inode list.  IGNORED.
Deleted inode 261636 has zero dtime.  Fix? no

Inode 261637 was part of the orphaned inode list.  IGNORED.
Inode 265226 was part of the orphaned inode list.  IGNORED.
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences:  -1057529 -(7953409--7953411)
Fix? no

Free blocks count wrong (9332762, counted=8630828).
Fix? no

Inode bitmap differences:  -(261635--261637) -265226
Fix? no

Directories count wrong for group #32 (131, counted=130).
Fix? no

Free inodes count wrong (2560481, counted=2558866).
Fix? no

/dev/mapper/root--vg-root: ********** WARNING: Filesystem still has errors **********

/dev/mapper/root--vg-root: 47663/2608144 files (0.7% non-contiguous), 1095654/10428416 blocks

***************** Magento DB Mount *******************

Warning!  /dev/mapper/magento-magento is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
/dev/mapper/magento-magento contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (12854444, counted=12726079).
Fix? no

Free inodes count wrong (3276789, counted=3275957).
Fix? no

/dev/mapper/magento-magento: ********** WARNING: Filesystem still has errors **********

/dev/mapper/magento-magento: 11/3276800 files (3654.5% non-contiguous), 251732/13106176 blocks

********************** C5 DB **********************

Free blocks count wrong (12725428, counted=12670390).
Fix? no

Free inodes count wrong (3244021, counted=3243630).
Fix? no

/dev/mapper/concrete5-concrete5: ********** WARNING: Filesystem still has errors **********
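
To summarize output like the above across several mounts, the count discrepancies and the devices still flagged with errors can be extracted with a couple of regular expressions. This is only a sketch, assuming e2fsck keeps the message wording shown in this post; the function names are hypothetical.

```python
import re

COUNT_RE = re.compile(r"Free (blocks|inodes) count wrong \((\d+), counted=(\d+)\)")
ERRORS_RE = re.compile(
    r"^(/dev/\S+): \*+ WARNING: Filesystem still has errors \*+", re.MULTILINE
)


def count_discrepancies(fsck_text):
    """Return (kind, recorded, counted, recorded - counted) for each wrong count."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)), int(m.group(2)) - int(m.group(3)))
        for m in COUNT_RE.finditer(fsck_text)
    ]


def devices_with_errors(fsck_text):
    """Return the devices that fsck reports as still having errors."""
    return ERRORS_RE.findall(fsck_text)


# Lines copied from the root mount section of this post.
FSCK_SAMPLE = """\
Free blocks count wrong (9332762, counted=8630828).
Fix? no
Free inodes count wrong (2560481, counted=2558866).
Fix? no
/dev/mapper/root--vg-root: ********** WARNING: Filesystem still has errors **********
"""

for kind, recorded, counted, diff in count_discrepancies(FSCK_SAMPLE):
    print(kind, "recorded", recorded, "counted", counted, "difference", diff)
print("still has errors:", devices_with_errors(FSCK_SAMPLE))
```

Because the check was run read-only on mounted filesystems, the numbers are only indicative; the sketch just makes them easier to compare between mounts.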

Alex Yurchenko

Sep 19, 2012, 3:31:38 AM
to codersh...@googlegroups.com
Emilio,

An error in a filesystem means it can't store data reliably. You can't
expect ANYTHING to work on a broken filesystem. You can't expect ANY
data to be preserved there. A broken filesystem should be your biggest
fear.


Emilio

Sep 19, 2012, 5:15:14 AM
to codersh...@googlegroups.com
Thanks, Alexey, for all your help. I'll get back to you with an update once I've run all the filesystem checks.
 
 
Regards
Emilio