Unnecessary fail-over: MySQL incorrectly detected as failed?

madpah

Oct 19, 2009, 3:07:16 PM
to MySQL Multi Master Manager Development
Hi All,

In advance, thanks for any light you can shed on this issue I'm
experiencing.

The set up I have is two MySQL servers in MySQL-MMM configuration
(only one active writer role [10.0.1.1], and no reader roles between
them). Two further MySQL servers then replicate from 10.0.1.1 and are
used in the "application" as read-only sources (these are the
"application servers"). The application servers only read from
themselves and direct all writes to 10.0.1.1. There is monitoring in
place by way of a table within the replicated database that has the
current time entered every minute (via a cron job), and each slave
"application server" has a monitoring cron job which checks the
replicated table for the time and compares it to its own time. If the
difference is greater than 1 minute, I am alerted via SMS.
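
A minimal sketch of the slave-side check described above, assuming the heartbeat value read from the replicated table and the local clock are already available as datetime objects. The names `MAX_LAG`, `replication_lag` and `should_alert` are hypothetical and are not part of the poster's actual cron job.

```python
from datetime import datetime, timedelta

# Threshold from the post: alert when the difference exceeds 1 minute.
MAX_LAG = timedelta(minutes=1)


def replication_lag(heartbeat_time, local_time):
    """Return how far the replicated heartbeat is behind the local clock."""
    return local_time - heartbeat_time


def should_alert(heartbeat_time, local_time, max_lag=MAX_LAG):
    """Return True when the replicated heartbeat is older than max_lag."""
    return replication_lag(heartbeat_time, local_time) > max_lag


# Example values only; a real job would read the heartbeat row from MySQL.
now = datetime(2009, 10, 19, 15, 7, 16)
print(should_alert(now - timedelta(seconds=40), now))
print(should_alert(now - timedelta(seconds=95), now))
```

Because the heartbeat is only written once a minute, a check like this measures lag with roughly one-minute granularity and also depends on the clocks of the servers being in sync.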

MySQL Version (on all 4 boxes) is: MySQL-5.0.84 (build 18) from
Percona [http://www.percona.com/mysql/5.0.84-b18]
MySQL-MMM Version (on 2 master boxes) is: 2.0.9
Perl Version: "This is perl, v5.8.8 built for i386-linux-thread-multi"
All boxes running CentOS-5.3(Final).

Sorry for the long-windedness, but some or all of the above may be
relevant.

Now the problem...

Rather at random, I'm seeing log entries in the monitor log as
follows:

ERROR Check 'mysql' on 'db01' has failed for 14 seconds! Message:
ERROR: Connect error (host = 10.0.1.11:3306, user = usr_mmm_monitor)!
Can't connect to MySQL server on '10.0.1.11' (4)

At the same time, the agent log shows things like:

FATAL Couldn't allow writes: ERROR: Can't connect to MySQL (host =
10.0.1.11:3306, user = usr_mmm_agent)!

Both of the above log entries are seen (randomly?) throughout normal
operation, and don't seem to actually impact performance or
availability until they happen many times in a row, causing a
fail-over.

As far as the monitor is concerned, mysql has been dead for 14 seconds
(in the example above).

With default MySQL-MMM config, after 15 seconds of dead time, the
writer role is pushed to the other server (db02 in my case), which is
great but...

...during the 15 seconds of dead time above, the cron job has
connected to local mysql, updated the time in the check table and both
the "application servers" have replicated this value too, so mysql is
certainly not dead.

For the time being, I've upped the required mysql dead time to 30
seconds, which is working around the problem (although I still see
entries such as the above in the agent and monitor logs).
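
The effect of raising the dead-time threshold can be sketched as follows. This is a generic illustration of "fail over only after N seconds of continuous failure", not MMM's actual implementation; the class name `CheckState` and its methods are hypothetical.

```python
class CheckState:
    """Track how long a health check has been failing without a success."""

    def __init__(self, threshold_seconds):
        self.threshold = threshold_seconds
        self.failing_since = None  # time of the first failure in the current run

    def record(self, ok, now):
        """Record one check result; return True when fail-over should trigger."""
        if ok:
            # Any successful check resets the failure window.
            self.failing_since = None
            return False
        if self.failing_since is None:
            self.failing_since = now
        return (now - self.failing_since) >= self.threshold


# With a 15 second threshold, a run of failures lasting 15 seconds triggers.
state = CheckState(threshold_seconds=15)
for second in (0, 5, 10, 14, 15):
    print(second, state.record(ok=False, now=second))
```

With the threshold at 30, the same run of failures would not trigger a fail-over, which matches the workaround described above while leaving the underlying connect errors in place.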

If anyone has any suggestions as to what I can check, I'd appreciate
it, and apologies for the potentially confusing long post.

Thanks in advance,

Paul

Istvan Podor

Oct 21, 2009, 2:00:40 AM
to mmm-...@googlegroups.com
Hi Madpah,

At first sight, I think the problem is this: if, as you wrote, you
are using 10.0.1.1 as the VIP that is balanced by mmm, then you may
have misunderstood the documentation.

You MUST configure your slaves and mmm_mon to use the REAL IPs and not
connect to the VIP.

However, in any balancing logic, you have to use the VIP.

So, let's say, in the monitoring you have to set up db01/db02 with
their real IPs and assign a VIP as the writer role.
You must set up your slaves to replicate from the real IP (and not the
VIP, in contrast to a typical DRBD configuration).
You should use exactly the same mmm_common.conf on each agent host
to avoid issues like this :)

I hope this helps; if not, send us your configurations and, if it's
not confidential, the line from show slave status\G which contains
master_host.

Regards,
Istvan

Pascal Hofmann

Oct 21, 2009, 2:07:50 AM
to mmm-...@googlegroups.com
Hi Paul,

> ERROR Check 'mysql' on 'db01' has failed for 14 seconds! Message:
> ERROR: Connect error (host = 10.0.1.11:3306, user = usr_mmm_monitor)!
> Can't connect to MySQL server on '10.0.1.11' (4)


Error code 4 is EINTR. So something is interrupting mysql connect. I
think the code should try to do a reconnect there.
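
A minimal sketch of the "retry on EINTR" idea, assuming the connect attempt is wrapped in a callable. MMM itself is written in Perl, so this is only an illustration of the technique; `connect_with_retry` is a hypothetical helper and `OSError` stands in for whatever error the real database driver would raise.

```python
import errno
import time


def connect_with_retry(connect, attempts=3, delay=0.1):
    """Call connect(); retry only when the call was interrupted (EINTR)."""
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except OSError as exc:
            # Errors other than EINTR, and the final failed attempt, propagate.
            if exc.errno != errno.EINTR or attempt == attempts:
                raise
            time.sleep(delay)


# Stand-in for a real connect call: interrupted twice, then succeeds.
calls = {"count": 0}


def flaky_connect():
    calls["count"] += 1
    if calls["count"] < 3:
        raise OSError(errno.EINTR, "Interrupted system call")
    return "connected"


print(connect_with_retry(flaky_connect, delay=0))
```

Retrying hides the symptom from the monitor but does not explain what is interrupting the connect call, so it is worth combining with the tracing suggested later in the thread.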


Regards

Pascal

Paul Horton

Oct 21, 2009, 3:49:18 AM
to mmm-...@googlegroups.com
Thanks for the feedback Istvan, Pascal.

To Istvan:
Perhaps I've not fully explained my setup.

There are four database servers, 2 of which are part of a single MySQL-MMM configuration. These have each other as slaves (using real IP addresses).

The other two database servers are not part of the MySQL-MMM configuration at all. They are simply slaves, but are set to replicate from the active master (or writer role), which is why they replicate from the VIP IP (10.0.1.1).

Do your comments from before still stand?

To Pascal:

Re-connecting sounds like a good idea, but I'm worried it might be masking a bigger problem.

Do you think that this connection issue could be MySQL or Perl related (either DBI or DBD::mysql)?

Thanks,

Paul


Istvan Podor

Oct 21, 2009, 3:58:53 AM
to mmm-...@googlegroups.com
Hey Paul,

Of course they still stand. Unlike with DRBD, the two masters write two different binlog files and work at different positions.

Test a failover after you have restarted the passive master a few times and you will see what I am talking about :)

mmm_mon is the thing that commands the agents on the slaves to change the master to the correct position on the passive master. You don't need it to handle balancing or anything like that; your logic as it stands is OK.

Set up an agent on every replica, and use RIPs :)

I think your issues will be solved soon.

Btw, Pascal can confirm, but I think it can't connect to 10.0.1.1 because mmm drops the VIP before it continues acting. So by the time it tries to connect to that IP, it's already gone.

Regards,
Istvan

Pascal Hofmann

Oct 21, 2009, 4:12:07 AM
to mmm-...@googlegroups.com
Hi Istvan,

the VIP is 10.0.1.1 - but the real IP is 10.0.1.11. Everything is
fine with his setup as far as I can see.


Cheers

Pascal

Pascal Hofmann

Oct 21, 2009, 4:15:13 AM
to mmm-...@googlegroups.com
Hi Paul,

> Do you think that this connection issue could be MySQL or Perl
> related (either DBI or DBD::mysql)?

I don't know, maybe attach with strace (to one of the checker
processes, not mmmd_mon) and redirect output to a file then wait for
the error to happen.


Regards

Pascal

Istvan Podor

Oct 21, 2009, 4:16:34 AM
to mmm-...@googlegroups.com
Hi Pascal,

My bad :( Still pretty sleepy :)))

thanks,
Istvan

Paul Horton

Oct 21, 2009, 7:33:21 AM
to mmm-...@googlegroups.com
Cheers Gents.

I'll try to get an strace. As the system is live (production!), I'll have to pick an appropriate moment. Will post ASAP.

Thanks,

Paul

P.S. When the writer role fails over, (and VIP is moved too), there seems to also be an issue with the slaves reconnecting to the VIP to continue replication after the VIP has moved. I've lowered slave reconnect time to 15 seconds, but still no dice. Any thoughts?


Pascal Hofmann

Oct 21, 2009, 7:35:41 AM
to mmm-...@googlegroups.com
Hi Paul,

> P.S. When the writer role fails over, (and VIP is moved too), there
> seems to also be an issue with the slaves reconnecting to the VIP to
> continue replication after the VIP has moved. I've lowered slave
> reconnect time to 15 seconds, but still no dice. Any thoughts?

The slaves should not replicate from the VIP. They should replicate
from the real IP of the current active master - MMM will change the
master when the writer role moves.

Cheers

Pascal

Istvan Podor

Oct 21, 2009, 8:03:46 AM
to mmm-...@googlegroups.com
So I was right? :O

Istvan

Pascal Hofmann

Oct 21, 2009, 8:14:45 AM
to mmm-...@googlegroups.com
Hi Istvan,

Yes - My bad :( Still pretty sleepy :)))


(c:

Paul Horton

Oct 21, 2009, 8:53:55 AM
to mmm-...@googlegroups.com
Istvan,

Why is it incorrect or bad practice to have slaves outside of a MySQL-MMM cluster replicate from the VIP?

The behaviour I was looking for (and expecting) was:
  1. If db01 fails (primary writer), writer role moves to db02 along with VIP.
  2. Additional slaves that were replicating from db01 continue replicating from db02 post fail over.

Thanks for the hand-holding.

Paul


Walter Heck

Oct 21, 2009, 9:01:19 AM
to mmm-...@googlegroups.com
Hi Paul,

if you put the slaves inside the cluster, that is exactly the
behaviour you will get. MMM will take care of failing the slaves over
to the other master :)

cheers,

Walter

Paul Horton

Oct 21, 2009, 9:07:19 AM
to mmm-...@googlegroups.com
Hi Walter,

Thanks for that "revelation" - I wasn't aware that MMM did that at all!

Pascal, Istvan - does this meet with your expectations too?

Are there any restrictions on this operation?

Paul


Walter Heck

Oct 21, 2009, 9:13:27 AM
to mmm-...@googlegroups.com
shameless plug: Paul, if you wanna know more about what MMM does, and
how and why, attend the MySQL university session I'm presenting in 23
hours :)

http://forge.mysql.com/wiki/MySQL_University

I'll explain most of the basics of MMM with slides and all.

cheers,

Walter

Paul Horton

Oct 21, 2009, 9:21:31 AM
to mmm-...@googlegroups.com
Cheers Walter.

I'll be sure to put that in my diary and will certainly join in.

Thanks,

Istvan Podor

Oct 21, 2009, 9:37:11 AM
to mmm-...@googlegroups.com
Paul,

Hmm, I think you are a bit confused. Let me explain:

It's not a bad idea to have slaves outside MMM, but it is a very bad idea to have them replicate from the VIP.

Let me explain how replication works (briefly):

Replication needs at least two components, master and slave.

On the master, every query you execute that changes data is written to the binlog (there are modifier options like binlog-ignore-db, binlog-do-db etc., but let's ignore them for now).
Every binlog entry is identified by a file name and a position. Each file name is a real file on the master's filesystem, and every position points to a "query" in that binlog file.
Each "query" in the binlog file also stores the position of the query that comes after it. The names of the binlog files are incremented sequentially (001, 002, 003 ...), so when a binlog file reaches its end, the master starts a new one, beginning at position 98 in that file. This is how binlogs are treated on the master (briefly, and in an easy-to-understand way, don't hate me because of this :)) )

On the slave, you set up a master host, a binlog file (name) and a position. So now we know that the master position you configure on your slave points to a file on the master's filesystem, at the position you specified, on THAT exact master you set as master_host.

Now, suppose you have a master-master setup. That means you effectively have one master and one slave (the passive master), and you can look at them like this.
This means each master logs to its own binlog file and reads from the other server's binlog. But on the slave (or passive master) the file names and positions tend to be different. Not just tend to be; let's say they are NEVER the same. So, for example, if you execute a query like this on your master:

insert into table values (1,'sometext');

This will be logged in the master's binlog, let's say in the file mysql-binlog.0001 at position 1002.

When the passive master requests events from the active master, the active master answers: Hey, I've got a new event in mysql-binlog.0001 at position 1002, which is   insert into table values (1,'sometext');
(statement-based replication; it may differ depending on your config, but this is the default)

The passive master will execute this query and log it to its own binlog (because log-bin is turned on on your masters; logging replicated queries also requires log-slave-updates), let's say to mysql-binlog.0012 at position 12461.

OK, I hope you've got this so far.

So when your slave replicates from the VIP without MMM and the IP moves to the other master, the slave is going to request a binlog file and position that possibly don't even exist there. So if you use the VIP, you can be sure your replication will STOP as soon as you move the IP.
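
The point about coordinates can be illustrated with a toy model. This is only a sketch: the `Master` class is hypothetical, the file names and starting positions are the example values from this post, and real binlog event sizes are not simply the statement length.

```python
class Master:
    """Toy model of a server that writes statements to its own binary log."""

    def __init__(self, name, log_file, start_pos):
        self.name = name
        self.log_file = log_file
        self.pos = start_pos
        self.events = {}  # (file name, position) -> statement

    def write(self, statement):
        """Log a statement and return the coordinates it was stored at."""
        coords = (self.log_file, self.pos)
        self.events[coords] = statement
        # Toy rule: advance by the statement length.
        self.pos += len(statement)
        return coords

    def read(self, coords):
        """Return the statement at the given coordinates, or None if absent."""
        return self.events.get(coords)


active = Master("db01", "mysql-binlog.0001", 1002)
passive = Master("db02", "mysql-binlog.0012", 12461)

stmt = "insert into table values (1,'sometext');"
coords_active = active.write(stmt)
coords_passive = passive.write(stmt)  # same statement, re-logged on db02

print(coords_active)
print(coords_passive)
# A slave holding db01's coordinates finds nothing at them on db02.
print(passive.read(coords_active))  # prints None
```

A slave that follows the VIP keeps db01's coordinates after the move, which is the lookup that fails in the last line.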

What to use then? Use the real IP. Why? Because if you use the real IP of one of your masters, without MMM managing the slave, then when that master is down your replication will stop, but when that master comes back it carries on again. Again, if you use the VIP, your slave will request a WRONG position to fetch next.

So what's different with mmm?

If you use mmm, you again have to use the RIP. But when your master crashes, stops, or you just move the role, MMM_MON will COMMAND the slave to change its master host, binlog file and position to the other master :)) And you (hopefully and/or mostly) stay consistent on your slaves.
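
As a rough illustration of what "change the master host, binlog and position" amounts to on the slave, here is a sketch that builds the standard MySQL statements. It is not MMM's actual code; the helper names are hypothetical, 10.0.1.12 is an assumed real IP for db02 (the thread only gives db01's), and the coordinates are the example values from this post.

```python
def quote(value):
    """Quote a string as a SQL literal, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def repoint_slave_statements(master_host, log_file, log_pos):
    """Build the statements that point a slave at another master's binlog."""
    if not isinstance(log_pos, int) or log_pos < 0:
        raise ValueError("log_pos must be a non-negative integer")
    change = (
        "CHANGE MASTER TO MASTER_HOST=%s, MASTER_LOG_FILE=%s, MASTER_LOG_POS=%d"
        % (quote(master_host), quote(log_file), log_pos)
    )
    # Replication must be stopped before the master is changed.
    return ["STOP SLAVE", change, "START SLAVE"]


# Assumed values: db02's real IP and its own binlog coordinates.
for statement in repoint_slave_statements("10.0.1.12", "mysql-binlog.0012", 12461):
    print(statement + ";")
```

The hard part, which this sketch does not cover, is working out which file and position on the new master correspond to the point where the slave stopped on the old one.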

If you have other questions, don't hesitate to ask. I hope you've got this.

Regards,
istvan

Paul Horton

Oct 21, 2009, 9:49:29 AM
to mmm-...@googlegroups.com
Istvan,

Thanks for that. Appreciate your time explaining in so much detail.

Your detailed explanation makes sense - I forgot that the log positions will differ (potentially) on each master.

FYI, my actual set-up has 4 master database servers (2 pairs) (i.e. circular replication), with each pair running their own MMM and having a further two slave servers.

Not sure how this configuration stacks up in terms of craziness?

Paul


Istvan Podor

Oct 21, 2009, 9:54:41 AM
to mmm-...@googlegroups.com
Paul,

I would recommend the clusters function in mmm for managing that. But I also have to warn you, circular replication is kind of risky.
I had some thoughts on a really perverse configuration for MMM with lots of clusters to manage circular replication, but it was all just a weird late-night thought :)

Are you sure you want 4 masters? It may be worth becoming familiar with DRBD + MMM-managed replication and, of course, with Maatkit to make sure your slaves and masters are consistent with each other.

Regards,
Istvan

Paul Horton

Oct 21, 2009, 10:10:36 AM
to mmm-...@googlegroups.com
Istvan,

DRBD was considered, but was ruled out for a number of reasons, including its inability to support MyISAM tables and the *potential* switch-over time being lengthy (although I now know that this is minimised with the Percona patched versions of MySQL - InnoDB Speed Recovery Hack).

They are installed as two MMM's mainly because they are geographically separate and it wasn't straightforward to configure the required access from a single monitoring node to all four DB Servers.

The theory of 4 masters wasn't actually a requirement, but was a "side-effect" of having two individual clusters (1 running back-office, 1 web-site), each of which was required to have a backup master...

Thanks again for your input and thoughts all.

Paul

P.S. Is anyone attending the MySQL HA Breakfast (November 19th - London)?