Goal: We'd like to use Orchestrator in AWS EC2 for MySQL 5.7 using GTIDs in a typical main-plus-replicas topology -- no multi-master scenarios and no semi-sync plugin configured.
Context: I downloaded the latest GA release (3.0.3) of Orchestrator from the GitHub orchestrator repo and used a modified version of the Dockerfile included in the 3.0.3 zipfile, with no entrypoint.sh and with CMD set to tail -f /dev/null.* I did that so that I could use the orchestrator.sample.conf.json file and start the orchestrator process with the debug flag, pointing it at the /usr/local/orchestrator directory, which is where that Dockerfile puts the executable.
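For reference, the manual startup inside the container looks roughly like this (a sketch of my setup; the container name and the copied config file name are my own choices, not defaults):

    docker exec -it orchestrator bash
    cd /usr/local/orchestrator
    cp orchestrator.sample.conf.json orchestrator.config.json
    ./orchestrator --debug --config=/usr/local/orchestrator/orchestrator.config.json http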
I used the Docker Hub MySQL 5.7 Dockerfile, also modified, dropping the entrypoint shell script and the CMD set to mysqld, so that I could start mysqld independently by calling the docker-entrypoint.sh script with mysqld as its argument. I put both built image tags into a docker-compose file so that I could tie the orchestrator container to four individual MySQL instances set up as one main and three replicas, mount volumes with specific scripts to manually set up and check replication, and put config files for both MySQL and Orchestrator into place.
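The wiring is equivalent to something like this (a sketch only; the image and network names are placeholders, and the real compose file also mounts the volumes mentioned above):

    docker network create orch_net
    # one main and three replicas; with CMD stripped, keep the containers alive
    for i in 1 2 3 4; do
      docker run -d --name mysql$i --network orch_net mysql57-noentry tail -f /dev/null
    done
    # the orchestrator image already has CMD tail -f /dev/null
    docker run -d --name orchestrator --network orch_net orchestrator-noentry
    # mysqld is then started by hand inside each MySQL container, e.g.:
    docker exec -d mysql1 docker-entrypoint.sh mysqld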
*The reason I took the entrypoint out of the Orchestrator Dockerfile is twofold: the defaults didn't pick up information from docker network inspection of the assigned network (and therefore the IPs), and the directory /usr/local/orchestrator isn't one of the default paths Orchestrator searches for its config file.

What I have run into while testing in this Docker Compose environment, and the questions I am hoping you all can help me with, are:

1. I killed the mysqld process and the topology in the GUI showed "DeadMaster" -- no replica was promoted automatically. Does this article still hold true: https://www.percona.com/blog/2016/03/08/orchestrator-mysql-replication-topology-manager/ ? Quoting the Limitations section: "One of the key missing features is that there is no easy way to promote a slave to be the new master. This could be useful in scenarios where the master server has to be upgraded, there is a planned failover, etc. (this is a known feature request)." ("known" = https://github.com/outbrain/orchestrator/issues/151)
Or does the sample config setting "ApplyMySQLPromotionAfterMasterFailover": false have to be changed to true? A sketch of the settings block I have in mind follows.
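Here is the kind of recovery fragment I believe has to be merged into orchestrator.config.json for automatic promotion to happen, pieced together from the sample file and the topology-recovery doc (a minimal sketch; the "*" filters and the echo hooks are placeholders, not recommendations):

    {
      "ApplyMySQLPromotionAfterMasterFailover": true,
      "RecoverMasterClusterFilters": ["*"],
      "RecoverIntermediateMasterClusterFilters": ["*"],
      "FailureDetectionPeriodBlockMinutes": 60,
      "RecoveryPeriodBlockSeconds": 3600,
      "OnFailureDetectionProcesses": [
        "echo 'Detected {failureType} on {failureCluster}' >> /tmp/recovery.log"
      ],
      "PostMasterFailoverProcesses": [
        "echo 'Promoted {successorHost}:{successorPort}' >> /tmp/recovery.log"
      ]
    }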
I plan to test "ApplyMySQLPromotionAfterMasterFailover": true next and wanted to make sure I covered all bases for other orchestrator.config.json settings I probably overlooked or don't understand yet.

2. When I forced the failover with the default orchestrator config file, via orchestrator -c force-master-failover -i 58606e157e68 --config=/usr/local/orchestrator/orchestrator.config.json, a replica was automatically chosen, but the slaves were not reset, nor was the automatically chosen new target set to read/write as I was expecting. I was hoping the remaining replicas would end up under the new master. I read https://github.com/github/orchestrator/blob/master/docs/topology-recovery.md and it isn't as clear to me as the commands discussed in issue 151. (It also looks like switch-master is not a flag in 3.0.3; is that now take-master instead?)
switch-master" never existed. It was _requested_ by a user; that doesn't mean it was implemented the way the user requested; a request by a user isn't means of documentation.I included the log of the command run with the /tmp/recovery.log outputIt looks like the GUI may have initiated the force failover yet didn't acknowledge it at all.
The command line may have tried one replica and then a second; I'm not sure, because line 69 in the file shows both docker container ids. The by-product /tmp files would be great to have around, to verify the steps taken and connect the dots of which commands follow which steps. It looks like they get removed?
3. Would folks be willing to share what the orchestrator.config.json settings are at Booking or GitHub (without any usernames, passwords, or other incriminating things ;) ) that work best for unplanned and planned failover and recovery?
4. I've been looking for the orchestrator -c command-line list of steps to run through as a playbook in various presentations, Percona blogs, etc., and it's not clear to me which orchestrator -c command(s) chatops is running when the log of orchestrator commands is posted to a Jabber/IRC client.
Will this be a moot point if I get the orchestrator.config.json settings tweaked properly to handle an unplanned failover of the master (DeadMaster) or a planned failover? Or does each step in either failover scenario need a corresponding orchestrator -c command? Referencing https://githubengineering.com/mysql-testing-automation-at-github/ . The candidate commands I've collected so far are sketched below.
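For what it's worth, my playbook candidates so far, from the docs and the CLI help (a sketch; the alias and instance names are placeholders):

    # planned switchover while the master is healthy
    orchestrator -c graceful-master-takeover -alias mycluster --config=/usr/local/orchestrator/orchestrator.config.json
    # force a failover even though orchestrator sees no failure
    orchestrator -c force-master-failover -i 58606e157e68 --config=/usr/local/orchestrator/orchestrator.config.json
    # run a recovery against an instance orchestrator has flagged as failed
    orchestrator -c recover -i failed.instance.host:3306 --config=/usr/local/orchestrator/orchestrator.config.json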
bash-4.4# orchestrator -c graceful-master-takeover help
graceful-master-takeover:
    Gracefully discard master and promote another (direct child) instance instead, even if everything is running well.
    This allows for planned switchover.
    NOTE:
    - Promoted instance must be a direct child of the existing master
    - Promoted instance must be the *only* direct child of the existing master. It *is* a planned failover thing.
    - Orchestrator will first issue a "set global read_only=1" on existing master
    - It will promote candidate master to the binlog positions of the existing master after issuing the above
    - There _could_ still be statements issued and executed on the existing master by SUPER users, but those are ignored.
    - Orchestrator then proceeds to handle a DeadMaster failover scenario
    - Orchestrator will issue all relevant pre-failover and post-failover external processes.
    Examples:
    orchestrator -c graceful-master-takeover -alias mycluster
        Indicate cluster by alias. Orchestrator automatically figures out the master and verifies it has a single direct replica
    orchestrator -c force-master-takeover -i instance.in.relevant.cluster.com
        Indicate cluster by an instance. You don't strictly need to specify the master; orchestrator will infer the master's identity.
bash-4.4# orchestrator -c graceful-master-takeover -debug -verbose -i 58606e157e68 --config=/usr/local/orchestrator/orchestrator.config.json
mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: ebb7130cec9f
                  Master_User:
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File:
          Read_Master_Log_Pos: 4
               Relay_Log_File: 58606e157e68-relay-bin.000001
                Relay_Log_Pos: 4
        Relay_Master_Log_File:
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 0
              Relay_Log_Space: 154
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1593
                Last_IO_Error: Fatal error: Invalid (empty) username when attempting to connect to the master server. Connection attempt terminated.
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 0
                  Master_UUID:
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp: 171226 19:20:11
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set: 56546043-df7b-11e7-b8df-0242ac130002:1-3
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)
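If I'm reading that output right, the replica lost its replication credentials during the takeover (empty Master_User), so the fix would be along these lines (a sketch; the repl user and password are placeholders from my test scripts):

    mysql -u root -p -e "STOP SLAVE;
      CHANGE MASTER TO MASTER_HOST='ebb7130cec9f', MASTER_USER='repl',
        MASTER_PASSWORD='repl_pass', MASTER_AUTO_POSITION=1;
      START SLAVE;"

I believe orchestrator also has a ReplicationCredentialsQuery setting that lets it fill these in itself, but I haven't tried it.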