Orchestrator Query

vinay jaiswal

Sep 12, 2017, 2:50:19 PM

to orchestrator-mysql

Hi,

Can you help me understand orchestrator's flow for automatic recovery?

I haven't found concrete documentation on how orchestrator works in the background.

What are the default checks when promoting a new master to replace the old one, apart from the pre-failover and post-failover hooks?
Are there any built-in default scripts for the pre-failover and post-failover hooks?
Can orchestrator do binlog recovery the way MHA does?
Is there any way to display all cluster topologies on a single page, instead of going to the dashboard and clicking on each individual cluster?

Thanks in advance.

Shlomi Noach

Sep 13, 2017, 2:52:17 AM

to vinay jaiswal, orchestrator-mysql

Hi,

> What are the default checks when promoting a new master to replace the old one, apart from the pre-failover and post-failover hooks?

I'm not sure what you mean by "default checks", or "checks"? Can you elaborate?

> Are there any built-in default scripts for the pre-failover and post-failover hooks?

The million dollar question. There is none, and I'm hoping to create a couple of generic scripts. This is so tightly coupled with your environment. Some people will remote SSH; others will write to Consul; yet others will talk to a proxy...
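
As a rough illustration of what such a hook could look like, here is a minimal sketch of a post-failover script. It assumes an orchestrator version that exports ORC_* environment variables (ORC_FAILURE_TYPE, ORC_FAILED_HOST, ORC_FAILED_PORT, ORC_SUCCESSOR_HOST, ORC_SUCCESSOR_PORT) to hook processes, so check the documentation for your version. The function name describe_failover is made up for this example, and the actual action is left as a stub because it depends on your environment.

```python
import os
import sys


def describe_failover(env):
    """Build a one-line summary from the (assumed) ORC_* hook variables."""
    failure = env.get("ORC_FAILURE_TYPE", "unknown")
    failed = "{}:{}".format(
        env.get("ORC_FAILED_HOST", "?"), env.get("ORC_FAILED_PORT", "?")
    )
    successor = "{}:{}".format(
        env.get("ORC_SUCCESSOR_HOST", "?"), env.get("ORC_SUCCESSOR_PORT", "?")
    )
    return "failure={} failed={} successor={}".format(failure, failed, successor)


if __name__ == "__main__":
    # The environment-specific action would go here (hypothetical): update a
    # proxy, write to Consul, run a remote command over SSH, and so on.
    # This sketch only logs what happened.
    print(describe_failover(os.environ), file=sys.stderr)
```

A script like this would be listed under PostFailoverProcesses in the orchestrator configuration; anything beyond logging is specific to your setup.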

> Can orchestrator do binlog recovery the way MHA does?

Unfortunately not. I gave this a try and decided to back off. I felt it was not guaranteed to be correct/stable.

> Is there any way to display all cluster topologies on a single page, instead of going to the dashboard and clicking on each individual cluster?

There is no such way. I'm just wondering aloud here, how would you display 10 or 50 different clusters on the same page? You'd have to scroll down and sideways.

This isn't on my roadmap; I'm happy for a PR showing this is possible.

--
You received this message because you are subscribed to the Google Groups "orchestrator-mysql" group.
To unsubscribe from this group and stop receiving emails from it, send an email to orchestrator-mysql+unsub...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/orchestrator-mysql/989e453d-310e-433e-85ab-f8e0b1d18e7e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

vinay jaiswal

Sep 13, 2017, 3:51:17 PM

to Shlomi Noach, orchestrator-mysql

Thanks, Shlomi, for your response.

Congratulations, and thank you for developing good software. I'm exploring orchestrator because I want to move my current clusters from MHA to orchestrator.

> I'm not sure what you mean by "default checks", or "checks"? Can you elaborate?

Internally, what does orchestrator check before promoting a new master to replace the dead one?

Suppose the master goes down: by what logic does orchestrator choose the most advanced slave, promote it to master, and make it the master of its siblings?
Master failover and slave promotion can happen without GTID / Pseudo-GTID, using binlog file and position, so why does orchestrator require Pseudo-GTID, and what does orchestrator run internally to change the topology with GTID?
Where are coordinates such as binlog file and position kept in the Pseudo-GTID case, so that the dead master can be manually reattached as a slave using those coordinates, or is there another way to reattach the dead master to the topology?
Are there any downsides to Pseudo-GTID?
What does this limitation mean: "Slaves can not be manually promoted to be a master"?

--

Thanks
Vinay kumar Jaiswal
Mobile No:9650526003
Skype:vinayjaiswal1989

Shlomi Noach

Sep 13, 2017, 11:00:00 PM

to vinay jaiswal, orchestrator-mysql

Each of these questions is quite big, and the answers can be found in:

> Suppose the master goes down: by what logic does orchestrator choose the most advanced slave, promote it to master, and make it the master of its siblings?

http://code.openark.org/blog/mysql/mysql-high-availability-tools-followup-the-missing-piece-orchestrator

http://code.openark.org/blog/mysql/whats-so-complicated-about-a-master-failover

> Master failover and slave promotion can happen without GTID / Pseudo-GTID, using binlog file and position, so why does orchestrator require Pseudo-GTID, and what does orchestrator run internally to change the topology with GTID?

Not sure I understand the question. Orchestrator supports either GTID or Pseudo-GTID and does not support relay-log based failover. It does not work like MHA. See above links.

I'm not going to pursue an MHA-like solution. I worked on this for a few months (https://github.com/github/orchestrator/pull/42, https://github.com/github/orchestrator/pull/32 and more) and saw some problems with the solution that I'm unhappy about.

> Where are coordinates such as binlog file and position kept in the Pseudo-GTID case, so that the dead master can be manually reattached as a slave using those coordinates, or is there another way to reattach the dead master to the topology?

The dead master is lost. It does not get reconnected to the topology.

> Are there any downsides to Pseudo-GTID?

You need to make sure it is injected. So if you failover to a new master you need to, as part of the failover process, make sure pseudo-gtid starts writing in the new master.

Otherwise it is very lightweight.

> What does this limitation mean: "Slaves can not be manually promoted to be a master"?

I do not understand the question.

You can do graceful-failover to do planned promotion of a replica as new master.

vinay jaiswal

Sep 14, 2017, 4:17:44 AM

to Shlomi Noach, orchestrator-mysql

Thanks for your prompt response.

>> Where are coordinates such as binlog file and position kept in the Pseudo-GTID case, so that the dead master can be manually reattached as a slave using those coordinates, or is there another way to reattach the dead master to the topology?

> The dead master is lost. It does not get reconnected to the topology.

Agreed on this, but how can the dead master be reconnected manually in the Pseudo-GTID case?

In the Pseudo-GTID case, if the master goes down, orchestrator removes the dead master from the topology and promotes the most advanced slave as the new master. How can the dead master then be repointed using coordinates such as binlog file and position, or is there another way to capture the coordinates? Does orchestrator keep the binlog file and position of the promoted master?

>> What does this limitation mean: "Slaves can not be manually promoted to be a master"?

> I do not understand the question.

> You can do graceful-failover to do planned promotion of a replica as new master.

This is listed as one of orchestrator's limitations in the Percona blog post https://www.percona.com/blog/2016/03/08/orchestrator-mysql-replication-topology-manager/ so I need to understand it. I know about graceful failover for planned activity.

If orchestrator does not see the master but the replicas do, what will happen? If the master is seen by 2 nodes, but all of its replicas are broken, does that make a failure scenario or not? What if a couple of replicas are happy but ten others are not?

Shlomi Noach

Sep 14, 2017, 4:32:43 AM

to vinay jaiswal, orchestrator-mysql

Answers:

>>> Where are coordinates such as binlog file and position kept in the Pseudo-GTID case, so that the dead master can be manually reattached as a slave using those coordinates, or is there another way to reattach the dead master to the topology?
>> The dead master is lost. It does not get reconnected to the topology.

> Agreed on this, but how can the dead master be reconnected manually in the Pseudo-GTID case?

> In the Pseudo-GTID case, if the master goes down, orchestrator removes the dead master from the topology and promotes the most advanced slave as the new master. How can the dead master then be repointed using coordinates such as binlog file and position, or is there another way to capture the coordinates? Does orchestrator keep the binlog file and position of the promoted master?

The problem is that there is no guarantee this can be done in the first place (not with MHA either, by the way). This is async replication: the master may have had some binlog entries that never made it to the replicas.

If you're lucky and the replicas were 100% up to date with master, then you may try:

orchestrator -c match -i dead.master.com -d new.master.com

"match" is a pseudo-gtid specific request.
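
To make the "100% up to date" condition concrete, here is a small sketch of the comparison involved. It is only an illustration, not how orchestrator itself decides anything. It assumes you recorded two sets of coordinates in the old master's binlog numbering: the last position the old master wrote (File/Position from SHOW MASTER STATUS) and the last position the promoted replica executed (Relay_Master_Log_File/Exec_Master_Log_Pos from SHOW SLAVE STATUS, captured at failover time). The names BinlogCoordinates and replica_has_everything are hypothetical.

```python
from typing import NamedTuple, Tuple


class BinlogCoordinates(NamedTuple):
    log_file: str  # e.g. "mysql-bin.000012"
    log_pos: int


def _sort_key(coords: BinlogCoordinates) -> Tuple[int, int]:
    # Compare by numeric file suffix first, then by position within the file.
    # Comparing the file names as plain strings is not safe in general.
    sequence = int(coords.log_file.rsplit(".", 1)[1])
    return (sequence, coords.log_pos)


def replica_has_everything(
    master_written: BinlogCoordinates, replica_executed: BinlogCoordinates
) -> bool:
    """True if the replica executed everything the old master wrote."""
    return _sort_key(replica_executed) >= _sort_key(master_written)
```

If this returns False, the old master holds binlog entries the new master never received, and reattaching it as a replica would leave the two with different data.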

>>> What does this limitation mean: "Slaves can not be manually promoted to be a master"?
>> I do not understand the question.
>> You can do graceful-failover to do planned promotion of a replica as new master.

> This is listed as one of orchestrator's limitations in the Percona blog post https://www.percona.com/blog/2016/03/08/orchestrator-mysql-replication-topology-manager/ so I need to understand it. I know about graceful failover for planned activity.

This blog post is 1.5 years old and does not reflect the current state. You can have planned failovers.

> If orchestrator does not see the master but the replicas do, what will happen? If the master is seen by 2 nodes, but all of its replicas are broken, does that make a failure scenario or not? What if a couple of replicas are happy but ten others are not?

If a couple of replicas are happy then there is no failover. If the master is seen by orchestrator but all replicas are broken, there is no failover.

To fail over the master, the master must be dead to orchestrator and dead to all replicas; or dead to orchestrator and dead to some replicas, where all other replicas are themselves dead.
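
The rule above can be sketched as a small predicate. This is an illustration of the rule as stated in this thread, not orchestrator's actual code; the Replica type and the function name are hypothetical. It reads "dead to some replicas" as requiring at least one reachable replica that reports the master as gone.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Replica:
    reachable: bool    # can orchestrator reach this replica at all?
    sees_master: bool  # is its replication connection to the master working?


def should_failover_master(
    orchestrator_sees_master: bool, replicas: List[Replica]
) -> bool:
    """Apply the master failover rule described above."""
    if orchestrator_sees_master:
        return False  # master is not dead to orchestrator: no failover
    live = [r for r in replicas if r.reachable]
    if not live:
        return False  # assumption: with no live replica nobody can confirm
    # Every replica that is still alive must agree the master is gone.
    return all(not r.sees_master for r in live)
```

With this reading, a master that orchestrator cannot reach but whose replicas are still replicating happily yields False, which matches the "too many connections" incident described in this message.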

You do have detection. If you set detection hooks to email/ping/SMS/alert you, you get the alert. But these are fuzzy incidents where I chose to not take automated action.

As an example of the complexity of the scenario (this happened in real production): the master was overloaded with connections. The replicas had "old" connections so they were happy, but orchestrator was not able to connect and got "too many connections". In the past, this kicked off a failover. Guess what: the new master got hammered by the same problem that hit the old master.

Today, this does not kick off a failover. It's a gray zone.

vinay jaiswal

Sep 16, 2017, 3:17:10 AM

to Shlomi Noach, orchestrator-mysql

Hi Shlomi,

Thanks a lot. Most of my doubts are now cleared.

Is there any way to run orchestrator as a custom user other than root?
I am trying to do a graceful-master-takeover with more than one slave, but it did not succeed. Is there any way to do a graceful master takeover for planned activity so that all slaves are repointed to the newly promoted master, and the old master rejoins the topology as a slave?

I am getting the errors below while doing the graceful master takeover.

orchestrator -c graceful-master-takeover -i 192.168.56.102:3306

FATAL Cannot deduce cluster master for 192.168.56.101:3306. Found 0 potential masters

WARNING executeCheckAndRecoverFunction: ignoring analysisEntry that has no action plan: FirstTierSlaveFailingToConnectToMaster;

FATAL GracefulMasterTakeover: master 192.168.56.101:3306 should only have one replica (making the takeover safe and simple), but has 2. Aborting

sampath....@boldcommerce.com

May 6, 2019, 11:50:38 AM

to orchestrator-mysql

If more than one slave is attached to the current master, we need either a GTID or a Pseudo-GTID implementation to perform graceful master takeovers. Without GTID or Pseudo-GTID, orchestrator cannot realign and reattach the slaves to the newly promoted master after a takeover or failover.

Matthew Boehm

May 6, 2019, 12:42:27 PM

to sampath....@boldcommerce.com, orchestrator-mysql

Gotta give a +1 for Pseudo-GTID. I don't have very many clients that use GTID because it changes everything with regard to fixing replication. You can't use skip anymore, and fixing/finding phantoms is a pain.

-Matthew

--
Matthew Boehm
Senior Instructor / Senior Architect
Percona, Inc / www.percona.com

Shlomi Noach

May 6, 2019, 1:08:52 PM

to Matthew Boehm, sampath....@boldcommerce.com, orchestrator-mysql

Notwithstanding my agreement re: GTID, orchestrator today has very good support for GTID, including helping you find errant GTIDs, issuing reset master for them, or injecting errant GTIDs on the master.
