Percona Xtradb crash after [ERROR] WSREP: exception from gcomm, backend must be restarted
Abdel Said  
From: Abdel Said <said.ab...@gmail.com>
Date: Tue, 9 Oct 2012 07:11:38 -0700 (PDT)
Local: Tues, Oct 9 2012 10:11 am
Subject: Percona Xtradb crash after [ERROR] WSREP: exception from gcomm, backend must be restarted

See the logs especially node-1-mysql-error.log at line 320.

This is not supposed to happen, node 2 and node 3 tried to sync and node 1
tried to take over but crashed. Any idea what's going on here?

Attachments:
  node-3-mysql-error.log (38K)
  node-2-mysql-error.log (51K)
  node-1-mysql-error.log (40K)

 
Alex Yurchenko  
From: Alex Yurchenko <alexey.yurche...@codership.com>
Date: Tue, 09 Oct 2012 19:47:09 +0300
Local: Tues, Oct 9 2012 12:47 pm
Subject: Re: [percona-group] Percona Xtradb crash after [ERROR] WSREP: exception from gcomm, backend must be restarted
On 2012-10-09 17:11, Abdel Said wrote:

> See the logs especially node-1-mysql-error.log at line 320.

> This is not supposed to happen, node 2 and node 3 tried to sync and
> node 1
> tried to take over but crashed. Any idea what's going on here?

Hi,

1) Your servers seem to be silently crashing from time to time. I'd
look into the system logs around the times when you see lines such as:

121002 15:03:43 mysqld_safe Number of processes running now: 0

But this is not the cause of the situation you encountered. It was a
result of misconfiguration:

2) node1 seems to have wsrep_cluster_address=gcomm:// so
- every time it crashes, other nodes forget it.
- every time it is restarted it starts a new cluster.

So you've been routinely running 2 disjoint clusters, one consisting of
node1 alone and another consisting of node2 and node3. And it was
perfectly fine, except that they of course became inconsistent with each
other (but that's another story).

Until one day node3 silently crashed.

Since it is a split-brain situation, node2 could not form a majority and
started trying to reconnect to the last node it saw: node3.

At the same time node3 was automatically restarted by mysqld_safe.
Since it had wsrep_cluster_address=node1 it connected to node1.

And then node2 connected to node3, since it was trying to reconnect.

This way two nodes from different primary components saw each other in
one cluster. And that's what caused an exception, because Galera
detected inconsistency - and stopped operation to prevent data loss. So
it is not a bug, in fact it is a very valuable feature. Now you can
properly decide which data set is more representative - the one from
node1 or the one from node2.

This story of three nodes once again reminds us how automatic recovery
is inherently evil and can punish you any time. Especially if you have
your cluster misconfigured.
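
The majority rule behind "node2 could not form majority" can be sketched as follows. This is a simplified illustration, not Galera's actual implementation: it assumes equal node weights and ignores graceful departures, and the function name `has_quorum` is hypothetical.

```python
def has_quorum(reachable: int, previous_size: int) -> bool:
    """Simplified majority rule (assumption: all nodes weigh the same).

    A group of nodes may stay primary only if it holds strictly more
    than half of the members of the previous primary component.
    """
    return 2 * reachable > previous_size


# The node2/node3 cluster from the story: node3 crashes, node2 is alone.
print(has_quorum(reachable=1, previous_size=2))  # False: 1 of 2 is not a majority

# A properly formed three-node cluster that loses one node.
print(has_quorum(reachable=2, previous_size=3))  # True: 2 of 3 is a majority
```

With only two members in its cluster, node2 had no way to keep a majority once node3 disappeared, which is why it fell back to reconnecting.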

Thanks,
Alex


 
Vadim Tkachenko  
From: Vadim Tkachenko <va...@percona.com>
Date: Tue, 9 Oct 2012 10:29:28 -0700
Local: Tues, Oct 9 2012 1:29 pm
Subject: Re: [percona-group] Percona Xtradb crash after [ERROR] WSREP: exception from gcomm, backend must be restarted
Abdel,

5.5.24 is known to have crashing issues.
Can you please try 5.5.27?

Thanks,
Vadim

On Tue, Oct 9, 2012 at 7:11 AM, Abdel Said <said.ab...@gmail.com> wrote:
> See the logs especially node-1-mysql-error.log at line 320.

> This is not supposed to happen, node 2 and node 3 tried to sync and node 1
> tried to take over but crashed. Any idea what's going on here?


--
Vadim Tkachenko, CTO, Percona Inc.
Phone +1-925-400-7377,  Skype: vadimtk153
Schedule meeting: http://tungle.me/VadimTkachenko



 
Abdel Said  
From: Abdel Said <said.ab...@gmail.com>
Date: Wed, 24 Oct 2012 07:20:43 -0700 (PDT)
Local: Wed, Oct 24 2012 10:20 am
Subject: Re: [percona-group] Percona Xtradb crash after [ERROR] WSREP: exception from gcomm, backend must be restarted

Thanks, Alex, for your reply. Unfortunately, that's the standard Percona
configuration. Can you point me to the right configuration?


 
Alex Yurchenko  
From: Alex Yurchenko <alexey.yurche...@codership.com>
Date: Wed, 24 Oct 2012 19:36:31 +0300
Local: Wed, Oct 24 2012 12:36 pm
Subject: Re: [percona-group] Percona Xtradb crash after [ERROR] WSREP: exception from gcomm, backend must be restarted
On 2012-10-24 17:20, Abdel Said wrote:

> Thanks Alex for your reply. Unfortunatly that's the standard Percona
> configuration. Can you point me to the right configuration?

You should never leave wsrep_cluster_address=gcomm:// on a running
node.
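
As a rough illustration of that rule, the sketch below scans the text of a my.cnf file and reports whether wsrep_cluster_address is left as a bare gcomm://. It is a minimal check under stated assumptions: the function name `bootstraps_new_cluster`, the sample configs, and the peer address 192.0.2.12 are all hypothetical, and the parser ignores section headers and include directives.

```python
def bootstraps_new_cluster(cnf_text: str) -> bool:
    """Return True if the last wsrep_cluster_address setting is a bare gcomm://.

    Assumption: a simple line scan is enough; sections and !include
    directives are not interpreted.
    """
    address = None
    for raw in cnf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        # MySQL accepts both dashes and underscores in option names.
        if key.strip().replace("-", "_") == "wsrep_cluster_address":
            address = value.strip().strip("'\"")
    return address == "gcomm://"


# Hypothetical config resembling node1 in this thread.
node1_cnf = """
[mysqld]
wsrep_cluster_address=gcomm://
"""

# Hypothetical corrected config pointing at an existing cluster member.
fixed_cnf = """
[mysqld]
# hypothetical peer address
wsrep_cluster_address=gcomm://192.0.2.12
"""

print(bootstraps_new_cluster(node1_cnf))  # True
print(bootstraps_new_cluster(fixed_cnf))  # False
```

A node whose config trips this check will start a new cluster on every restart, including the automatic restarts done by mysqld_safe.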

--
Alexey Yurchenko,
Codership Oy, www.codership.com
Skype: alexey.yurchenko, Phone: +358-400-516-011

 
Abdel Said  
From: Abdel Said <said.ab...@gmail.com>
Date: Wed, 24 Oct 2012 09:42:00 -0700 (PDT)
Subject: Re: [percona-group] Percona Xtradb crash after [ERROR] WSREP: exception from gcomm, backend must be restarted

What do you suggest? Set it to the IP of the second node after startup?


 
Abdel Said  
From: Abdel Said <said.ab...@gmail.com>
Date: Wed, 24 Oct 2012 09:43:06 -0700 (PDT)
Local: Wed, Oct 24 2012 12:43 pm
Subject: Re: [percona-group] Percona Xtradb crash after [ERROR] WSREP: exception from gcomm, backend must be restarted

Is there any way to avoid using wsrep_cluster_address at all? To list the IPs
of the 3 nodes and let the system do the rest?


 
Alex Yurchenko  
From: Alex Yurchenko <alexey.yurche...@codership.com>
Date: Wed, 24 Oct 2012 20:10:12 +0300
Local: Wed, Oct 24 2012 1:10 pm
Subject: Re: [percona-group] Percona Xtradb crash after [ERROR] WSREP: exception from gcomm, backend must be restarted
On 2012-10-24 19:43, Abdel Said wrote:

> is there any way to avoid using wsrep_cluster_addres at all? to list
> the ip
> of the 3 nodes and the system do the rest?

To an extent: check our recent 2.2 RC2 and
http://www.codership.com/wiki/doku.php?id=galera_url

If, however, you need to start a cluster from scratch, or the primary
component is lost, you will have to (re)bootstrap the PC manually, by

mysql> SET GLOBAL wsrep_provider_options="pc.bootstrap=1";

It is your responsibility though to make sure that there is no more
than 1 PC at a time.
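
One way to sanity-check that before bootstrapping is sketched below. It is a heuristic, not an official procedure: it assumes the operator has polled every node with SHOW STATUS LIKE 'wsrep_%' and collected wsrep_cluster_status and wsrep_cluster_size by hand. The function name `primary_views_disagree` and the sample values are hypothetical.

```python
def primary_views_disagree(status_by_node: dict) -> bool:
    """Heuristic check for more than one primary component.

    Assumption: every node of the intended cluster was polled. If all
    nodes reporting "Primary" belong to one component, each of them
    should report a cluster size equal to the number of such nodes.
    """
    primaries = {
        node: status
        for node, status in status_by_node.items()
        if status["wsrep_cluster_status"] == "Primary"
    }
    return any(
        int(status["wsrep_cluster_size"]) != len(primaries)
        for status in primaries.values()
    )


# Hypothetical values for the situation in this thread: node1 runs its
# own cluster while node2 and node3 form another one.
split = {
    "node1": {"wsrep_cluster_status": "Primary", "wsrep_cluster_size": "1"},
    "node2": {"wsrep_cluster_status": "Primary", "wsrep_cluster_size": "2"},
    "node3": {"wsrep_cluster_status": "Primary", "wsrep_cluster_size": "2"},
}

# Hypothetical values for a healthy three-node cluster.
healthy = {
    name: {"wsrep_cluster_status": "Primary", "wsrep_cluster_size": "3"}
    for name in ("node1", "node2", "node3")
}

print(primary_views_disagree(split))    # True
print(primary_views_disagree(healthy))  # False
```

If some nodes were not polled, the check can report a disagreement that does not exist, so it should be read as a prompt to investigate rather than a verdict.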

--
Alexey Yurchenko,
Codership Oy, www.codership.com
Skype: alexey.yurchenko, Phone: +358-400-516-011

 
Abdel Said  
From: Abdel Said <said.ab...@gmail.com>
Date: Sat, 10 Nov 2012 11:20:02 -0800 (PST)
Local: Sat, Nov 10 2012 2:20 pm
Subject: Re: [percona-group] Percona Xtradb crash after [ERROR] WSREP: exception from gcomm, backend must be restarted

Thanks Alex. This "You should never leave wsrep_cluster_address=gcomm:// on
a running node." seems to have fixed the issue.


 