MPP Engine team.
> EC process is a loop:
>
> 6. The follower should wait some time before taking over, to tolerate network glitches.
Can you please elaborate this point?
>
> The defect of this proposal:
> Ideally, we should trigger a segment transition when any one of the following happens:
> 1. The segment postmaster process is killed or is not responding.
> 2. The segment is unreachable from the master.
> 3. The segment is unreachable from other segments.
In addition, even if a postmaster is reachable and alive, it is
effectively down if its data directory cannot be written to (e.g. disk
full or I/O error). The FTS probe handler currently checks this by
writing a page to disk before responding to a probe. FTS uses libpq
messages for probing, which means that a postmaster that cannot spawn
a backend process to handle a probe request is also considered down.
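To make the disk-writability part of that concrete, here is a minimal sketch (not Greenplum's actual FTS code; the function name and 8 KB page size are illustrative assumptions) of a probe handler that reports "up" only if it can durably write a page into the data directory:

```python
import os
import tempfile

def probe_data_directory(datadir: str) -> str:
    """Illustrative sketch: report 'up' only if we can durably write
    a page-sized file into the data directory, so a segment with a
    full or failing disk is treated as down even if its postmaster
    is alive and reachable."""
    try:
        fd, path = tempfile.mkstemp(dir=datadir, prefix="fts_probe_")
        try:
            os.write(fd, b"\x00" * 8192)  # one 8 KB page worth of data
            os.fsync(fd)                  # ensure it actually hit disk
        finally:
            os.close(fd)
            os.unlink(path)
        return "up"
    except OSError:                       # ENOSPC, EIO, EROFS, ...
        return "down"
```

A read-only or missing data directory raises `OSError` somewhere in that sequence, which is exactly the "effectively down" case described above.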
>
> Conclusion:
> In order to test the connectivity, we need EC to be able to send heartbeat between each node, it's the same as FTS. So one other option is to use etcd for master/standby and keep using FTS for primary/mirror segments.
How about starting with etcd for master/standby and leaving
primary/mirror to FTS as a first step?
Asim
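The etcd-for-master/standby suggestion above boils down to lease-based leadership: whoever holds a lease on a well-known key is master, and the lease expires if keep-alives stop. Here is a toy in-memory simulation of that idea (the class and method names are hypothetical, not the etcd API; with real etcd you would use its lease and election primitives):

```python
class LeaseStore:
    """Toy stand-in for etcd-style key/lease machinery, for illustration only."""

    def __init__(self):
        self.holder = None
        self.expires_at = 0.0

    def campaign(self, node: str, ttl: float, now: float) -> bool:
        # Acquire leadership if the key is free or its lease has expired.
        if self.holder is None or now >= self.expires_at:
            self.holder = node
            self.expires_at = now + ttl
            return True
        return self.holder == node

    def keepalive(self, node: str, ttl: float, now: float) -> bool:
        # Refresh the lease; only the current holder may do this.
        if self.holder == node and now < self.expires_at:
            self.expires_at = now + ttl
            return True
        return False

store = LeaseStore()
assert store.campaign("master", ttl=5.0, now=0.0)       # master wins election
assert not store.campaign("standby", ttl=5.0, now=1.0)  # standby is blocked
assert store.keepalive("master", ttl=5.0, now=4.0)      # heartbeat renews lease
# Master stops heartbeating; the lease lapses and the standby takes over.
assert store.campaign("standby", ttl=5.0, now=20.0)
assert store.holder == "standby"
```

The appeal over hand-rolled heartbeats is that etcd provides the expiry and consensus semantics itself; the sketch only shows the shape of the protocol.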
Good to know. Thanks Goutam and Ivan.

On Tue, Oct 30, 2018 at 2:24 AM Ivan Novick <ino...@pivotal.io> wrote:
> I was advised by Cloud Foundry engineering that they have seen data loss with Consul, and to avoid it.

On Mon, Oct 29, 2018 at 8:55 AM Goutam Tadi <gt...@pivotal.io> wrote:
> Hi Gang,
> FYI regarding Consul: there has been a document since June 2017 that explains the pain points of Consul and the reasons why Cloud Foundry is moving away from it. I don't know if the same issues would affect Greenplum as well.
> Thanks,
> Goutam Tadi
> Since this document is internal, I limited recipients of this reply to internal folks only.
Automatically failing over the master doesn't matter if the end user clients and all utilities don't know to connect to the "new" master. Any promotion or change in state would need to be made externally visible so that the corresponding host knows to take control of the IP, DNS needs to modify to point to the new host. It would be exceptionally nice to have some sort of facility to execute scripts in location X if the system failover happens.
On Tue, Oct 30, 2018 at 12:00 PM Scott Kahler <ska...@pivotal.io> wrote:
> Automatically failing over the master doesn't matter if the end user clients and all utilities don't know to connect to the "new" master. [...]

The usual suspects for handling that (assuming no existing external LB) are pgbouncer, pgpool, or generically haproxy, aren't they? Any others?
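For the haproxy option, the externally visible failover could look like the following fragment (hypothetical host names; a sketch, not a recommended production config). The standby is marked `backup`, so clients always connect to the proxy and traffic only shifts once the master fails its health check:

```
# Illustrative haproxy fragment: clients connect to the proxy on 5432;
# the standby only receives traffic when the master's check fails.
listen gpdb-master
    bind *:5432
    mode tcp
    option tcp-check
    server mdw   mdw.example.com:5432  check
    server smdw  smdw.example.com:5432 check backup
```

Note this only solves client routing; promotion of the standby still has to happen separately, and a plain TCP check cannot tell a promoted standby from an unpromoted one.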
Andreas Scherbaum
Principal Software Engineer
GoPivotal Deutschland GmbH
Amtsgericht Königstein im Taunus, HRB 8433
Geschäftsführer: Andrew Michael Cohen, Paul Thomas Dacier
--
You received this message because you are subscribed to the Google Groups "Greenplum Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gpdb-dev+u...@greenplum.org.
Have you considered keepalived?
Hi Paul and Asim,

I explored your idea of using one segment (for example, segment 0) as the arbiter a bit more; it is similar to using FTS to arbitrate primary/mirror segments. The solution is practicable and much easier, except that it cannot tolerate any 2 of the 3 nodes (arbiter segment, master, and standby) failing at the same time.

TL;DR

The process:
1. The arbiter segment regularly sends heartbeats to the master and the standby. When it detects that the master is down, it promotes the standby.
2. When the master finds the standby is not responding, it stops all transactions, then writes the standby's out-of-sync status to the arbiter segment and its mirror. If the status is written successfully, the master resumes transactions. If the arbiter says "the standby has already taken over", the master shuts down.

Failure modes:
1. The master is down.
   The arbiter segment detects it and promotes the standby.
2. The arbiter segment detects the master as down, but the master is actually alive.
   The arbiter promotes the standby. After the standby is promoted, the master loses the connection to the walreceiver, so it talks to the arbiter and finds the standby has taken over.
3. The standby is down.
   The master loses the connection to the walreceiver, so it writes out-of-sync to the arbiter and resumes transactions.
4. The arbiter segment is down.
   Master FTS detects it and negotiates with the standby to assign the arbiter role to a new segment.
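The arbiter process and failure modes described above can be sketched as a toy state machine (class and function names are hypothetical; real Greenplum would do this over libpq and WAL replication):

```python
class Arbiter:
    """Toy model of the arbiter segment's two decisions."""

    def __init__(self):
        self.standby_promoted = False
        self.standby_out_of_sync = False

    def on_master_heartbeat_timeout(self):
        # Process step 1 / failure modes 1 & 2: master heartbeat lost,
        # so promote the standby.
        self.standby_promoted = True

    def record_out_of_sync(self) -> bool:
        # Process step 2: the master reports the standby unreachable.
        # Refuse if the standby has already taken over, which tells the
        # (old) master it must shut down.
        if self.standby_promoted:
            return False
        self.standby_out_of_sync = True
        return True

def master_lost_walreceiver(arbiter: Arbiter) -> str:
    # The master's side: stop transactions, consult the arbiter, then
    # either resume (standby marked out-of-sync) or shut down (replaced).
    return "resume" if arbiter.record_out_of_sync() else "shutdown"
```

Failure mode 3 corresponds to `master_lost_walreceiver` returning "resume"; failure mode 2 ends with the stale master getting "shutdown" because the promotion was recorded first.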
> If the arbiter can't reach the master, how can we say the master is down? It could be just that the connection between this segment and the master is broken while all other segments can still reach the master. We would unnecessarily promote the standby and cause current transactions to abort.

Per failure modes 1 and 2, we promote the standby both when the master is really down and when the arbiter merely detects it as down while it is actually alive. With this simple solution we cannot distinguish a real master failure from a connection failure between the arbiter segment and the master. We may need to handle corner cases such as the arbiter detecting the master as down while walrep is still working (i.e. the master is alive), in which case we should not promote the standby.
> How does this stop the previous arbiter segment from continuing to run in the system? Today a primary can be marked as down in the configuration but still keep running.

The master and standby keep the dbid of the arbiter. After negotiating, they record the new dbid and refuse arbitration connections from the old arbiter.

> Also, how is manual promotion allowed or disallowed with this scheme? It is a very important thing to protect, which doesn't exist today. Are we planning to completely remove gpactivatestandby? It seems we can't, as we should still have a manual option to fail over to the standby.

We won't completely remove gpactivatestandby. It is necessary in some extreme cases. For example, if the master node is broken and all the data is lost and cannot be recovered, activating the standby is the only choice, even though some walrep logs may be missing.
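The dbid fencing described above can be sketched as follows (a toy illustration with hypothetical names, assuming each arbitration request carries the sender's dbid):

```python
class FencedNode:
    """Toy model of how the master/standby fence out a deposed arbiter."""

    def __init__(self, arbiter_dbid: int):
        # The dbid of the arbiter this node currently recognizes.
        self.arbiter_dbid = arbiter_dbid

    def reassign_arbiter(self, new_dbid: int):
        # Negotiated between master and standby when the arbiter dies
        # (failure mode 4): record the new arbiter's dbid.
        self.arbiter_dbid = new_dbid

    def accept_arbitration(self, sender_dbid: int) -> bool:
        # Requests from any dbid other than the recognized arbiter are
        # refused, so an old arbiter still running cannot act on us.
        return sender_dbid == self.arbiter_dbid
```

This mirrors the answer above: the old arbiter may well keep running, but once the master and standby have recorded a new dbid, its arbitration connections are simply rejected.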