Someone asked in a PR whether there is any documentation or working docs
that we are using to help explain the replication slot work.
Replication slots are a Postgres feature, first introduced in 9.4:
https://www.postgresql.org/docs/9.4/catalog-pg-replication-slots.html
"They are a persistent record of the state of a replica that is kept
on the master server even when the replica is offline and
disconnected." This means we can ensure that the WAL needed to bring
a mirror back up still exists on the primary.
https://blog.2ndquadrant.com/postgresql-9-4-slots/
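To make the retention guarantee a bit more concrete, here is a minimal toy model in Python of how a slot pins WAL on the primary while the replica is offline. It is only a sketch: it is not Postgres or Greenplum code, and the class, method, and slot names (ToyPrimary, advance_slot, "mirror_slot") are made up for illustration.

```python
# Toy model only: ToyPrimary, advance_slot and "mirror_slot" are hypothetical
# names, not Postgres or Greenplum APIs.

class ToyPrimary:
    def __init__(self):
        self.wal = {}        # lsn -> record, WAL still kept on disk
        self.next_lsn = 1
        self.slots = {}      # slot name -> oldest LSN the replica still needs

    def create_slot(self, name):
        # A new slot starts retaining WAL from the current write position.
        self.slots[name] = self.next_lsn

    def drop_slot(self, name):
        del self.slots[name]

    def write(self, record):
        self.wal[self.next_lsn] = record
        self.next_lsn += 1

    def advance_slot(self, name, confirmed_lsn):
        # The replica reports that it has everything up to confirmed_lsn.
        self.slots[name] = max(self.slots[name], confirmed_lsn + 1)

    def checkpoint(self):
        # Recycle only the WAL that no slot still needs.
        keep_from = min(self.slots.values(), default=self.next_lsn)
        for lsn in [l for l in self.wal if l < keep_from]:
            del self.wal[lsn]


primary = ToyPrimary()
primary.create_slot("mirror_slot")
for rec in ("a", "b", "c"):
    primary.write(rec)

primary.advance_slot("mirror_slot", 1)   # mirror confirmed LSN 1, then went offline
primary.write("d")
primary.checkpoint()
retained_while_offline = sorted(primary.wal)   # WAL the mirror still needs is kept

primary.drop_slot("mirror_slot")
primary.checkpoint()
retained_after_drop = sorted(primary.wal)      # nothing pins the WAL any more
```

The second half also shows the flip side: a slot that nobody consumes keeps pinning WAL until it is dropped.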
I apologize for some of the wording below. The terminology used in
Greenplum is that there are primary/mirror pairs and there is always a
primary. When the primary fails, the mirror is promoted; there is then
no longer a mirror, but there is still a primary, which used to be the
mirror. The lack of good language for this might cause confusion in my
comments below, so please ask clarifying questions.
Fundamentally, all we are doing with replication slots is plumbing them
into the utilities and the FTS system so that their usage is automated.
This work consists of a few stories.
* Whenever a mirror is added, a replication slot is created to
enable incremental recovery if the mirror goes down and then comes back up.
* When a mirror is promoted, a replication slot is created
to enable incremental recovery of the old primary via pg_rewind.
* Finally, relevant to the PR discussion: when a primary fails, there
is still a replication slot on it (not sure what to call a dead
primary). When it becomes a mirror, that (stale) replication slot needs
to be killed so that it doesn't retain logs that are never flushed,
because it isn't a primary anymore.
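The three stories above can be sketched as a small state model in Python. This is an illustration under assumptions, not Greenplum code: the Segment class, the function names, and the slot name are all hypothetical.

```python
# Hypothetical sketch of the three stories; none of these names are real
# Greenplum utilities, functions, or catalog entries.

SLOT = "internal_mirror_slot"  # assumed slot name, for illustration only


class Segment:
    def __init__(self, name, role):
        self.name = name
        self.role = role      # "primary" or "mirror"
        self.slots = set()    # replication slots stored on this segment
        self.up = True


def add_mirror(primary, mirror_name):
    # Story 1: adding a mirror creates a slot on the primary.
    primary.slots.add(SLOT)
    return Segment(mirror_name, "mirror")


def promote(mirror, failed_primary):
    # Story 2: the promoted mirror gets a slot so the old primary can later
    # be recovered incrementally.
    failed_primary.up = False
    mirror.role = "primary"
    mirror.slots.add(SLOT)


def start_as_mirror(old_primary):
    # Story 3: the old primary still carries its slot; drop it when it
    # starts as a mirror so it does not retain WAL that nothing consumes.
    old_primary.role = "mirror"
    old_primary.up = True
    old_primary.slots.discard(SLOT)


seg_a = Segment("seg_a", "primary")
seg_b = add_mirror(seg_a, "seg_b")
promote(seg_b, seg_a)
stale = SLOT in seg_a.slots   # the dead primary still has its old slot
start_as_mirror(seg_a)
```

After the last call, seg_b is the primary and holds the slot, while seg_a is a mirror with no slot of its own.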
There are a few follow-up stories around providing warnings about
setting up replication slots without using the utilities, and making
sure that only superusers can modify them, but essentially that is it.
-- Rob
>
> If primary failed, and mirror is promoted, is there any tiny window, where mirror's xlog is
> behind of master's xlog, and after mirror is promoted, some of latest changes are missed?
No, in order for 2 phase commit to commit a transaction the xlog needs
to be written to the mirror so we know it exists on the mirror.
>
> 'pg_rewind of old primary': do you mean when primary is back and becomes the new mirror,
> pg_rewind is used to bring old primary (new mirror) to sync state with old mirror (new primary)?
>>
>> * Finally, relevant to the PR discussion, when a primary fails there
>> is still a replication slot on that (not sure what to call a dead
>> primary) When it becomes a mirror that (stale) replication slot needs
>> to be killed so that it doesn't store logs that aren't being flushed
>> because it isn't a primary anymore.
>
> This helps a lot to understand corresponding PR. Thanks! So it will be removed
> during startup as mirror (old primary). Then after startup, pg_rewind is used
> to bring mirror up to date with primary. Is this true? Who will call pg_rewind? FTS or utilities such as gprecoverseg?
pg_rewind is called by gprecoverseg. I'll let someone else explain
the exact order of events that you mention. I'm not sure whether the
replication slot is created before or after pg_rewind, or how they
interact.
-- Rob
>
> If primary failed, and mirror is promoted, is there any tiny window, where mirror's xlog is
> behind of master's xlog, and after mirror is promoted, some of latest changes are missed?
> No, in order for 2 phase commit to commit a transaction the xlog needs
> to be written to the mirror so we know it exists on the mirror.
The primary commits the transaction first, then ships xlog to the mirror. If the primary crashes just after sending xlog to the mirror but before receiving the ACK, what happens? Will the mirror commit successfully after it is promoted, or will it abort the transaction?
For this case, the client will treat it as aborted anyway.
On Thu, Jan 24, 2019 at 3:06 AM Yandong Yao <yy...@pivotal.io> wrote:
>
> If primary failed, and mirror is promoted, is there any tiny window, where mirror's xlog is
> behind of master's xlog, and after mirror is promoted, some of latest changes are missed?
> No, in order for 2 phase commit to commit a transaction the xlog needs
> to be written to the mirror so we know it exists on the mirror.
> Primary will commit transaction firstly, then ship xlog to mirror. If primary crashes just after sending xlog to mirror while before receiving ACK. What will happen? Will the mirror commit successfully after mirror is promoted, or mirror abort the transaction?
2PC in Greenplum is coordinated by the QD. If the primary crashes after sending the commit xlog to the mirror but before receiving the ack from the mirror, the primary has not yet sent its ack to the QD. Hence, the QD retries the commit on the promoted mirror and completes the transaction.
An important point to note: the primary waits for the ack from the mirror in both phases of 2PC, first for prepare and then for commit.
So, the flow is:
  QD -> (prepare) -> primary -> (waits prepare lsn xlog flush) -> mirror
  QD <- (ack) <- primary <- (ack) <- mirror
  QD (commit)
  phase 1 completes
  QD -> (commit) -> primary -> (waits commit lsn xlog flush) -> mirror
  QD <- (ack) <- primary <- (ack) <- mirror
  QD (marks done)
  phase 2 completes
For the benefit of other readers, I just want to point out that this question is not related to replication slots; it is about how 2PC works in general with WAL replication.
> For this case, client will treat it as aborted anyway.
Why would the client treat it as aborted? The QD is our shield: it keeps the primary failure from being exposed to clients in this case. The transaction is aborted only if the primary crashes before phase 1 of 2PC completes.
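As a rough illustration of the behavior described above, the following Python sketch models a QD driving the two phases against a primary/mirror pair and retrying the commit on the promoted mirror. It is a toy under assumptions: all class and function names are hypothetical, promotion is treated as instantaneous, and the crash is placed after the xlog reaches the mirror but before the primary acks the QD.

```python
# Toy model of the QD-coordinated 2PC flow; every name here is hypothetical
# and none of this is Greenplum code.

class PrimaryCrashed(Exception):
    pass


class Segment:
    def __init__(self, mirror=None, crash_at=None):
        self.mirror = mirror
        self.crash_at = crash_at   # "prepare" or "commit": crash before acking the QD
        self.xlog = []

    def apply(self, phase, xid):
        record = (phase, xid)
        if record not in self.xlog:          # a retried phase is idempotent here
            self.xlog.append(record)
        if self.mirror is not None and record not in self.mirror.xlog:
            # xlog must reach the mirror before the primary may ack the QD.
            self.mirror.xlog.append(record)
        if self.crash_at == phase:
            # Crash after the mirror has the record but before acking the QD.
            raise PrimaryCrashed(phase)
        return "ack"


def run_transaction(primary, xid):
    """Return (outcome, acting_primary) as seen by the QD."""
    try:
        primary.apply("prepare", xid)
    except PrimaryCrashed:
        # Phase 1 never completed, so the QD aborts the transaction.
        return "aborted", primary.mirror
    try:
        primary.apply("commit", xid)
    except PrimaryCrashed:
        # Phase 1 completed: retry the commit on the promoted mirror.
        promoted = primary.mirror
        promoted.apply("commit", xid)
        return "committed", promoted
    return "committed", primary


healthy = Segment(mirror=Segment())
outcome_ok, _ = run_transaction(healthy, 1)

crashing = Segment(mirror=Segment(), crash_at="commit")
outcome_retry, new_primary = run_transaction(crashing, 2)

early = Segment(mirror=Segment(), crash_at="prepare")
outcome_early, _ = run_transaction(early, 3)
```

In this model the client sees a commit in the second case even though the primary died mid-commit, and only the crash during prepare ends in an abort.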
> 'pg_rewind of old primary': do you mean when primary is back and becomes the new mirror,
> pg_rewind is used to bring old primary (new mirror) to sync state with old mirror (new primary)?
>>
>> * Finally, relevant to the PR discussion, when a primary fails there
>> is still a replication slot on that (not sure what to call a dead
>> primary) When it becomes a mirror that (stale) replication slot needs
>> to be killed so that it doesn't store logs that aren't being flushed
>> because it isn't a primary anymore.
>
> This helps a lot to understand corresponding PR. Thanks! So it will be removed
> during startup as mirror (old primary). Then after startup, pg_rewind is used
> to bring mirror up to date with primary. Is this true? Who will call pg_rewind? FTS or utilities such as gprecoverseg?
gprecoverseg calls pg_rewind. FTS can't perform this action because it has to be a manually initiated event. pg_rewind needs to run first, to roll back the extra transactions on the old primary, before that segment can be converted and connected back as a mirror. So that happens as the first step, and afterwards the segment is connected back as a mirror.
Both pg_rewind and pg_basebackup (for full mirror recovery) copy over the primary's replication slot. So, when a segment starts as a mirror (irrespective of how it was created), it deletes the internal gp replication slot, since a mirror is not supposed to keep retaining xlog.
On Fri, Jan 25, 2019 at 3:21 AM Ashwin Agrawal <aagr...@pivotal.io> wrote:
> 2PC in Greenplum is coordinated by QD. If primary crashes after sending the commit xlog to mirror but before receiving ack from mirror, means primary has not sent ack to QD. Hence, QD is going to retry the commit to the promoted mirror and complete the transaction.
Thanks for the detailed information. What about master and standby, for the same question?
"The master commits the transaction first, then ships xlog to the standby. If the master crashes just after sending xlog to the standby but before receiving the ACK, what happens? Will the standby commit successfully after it is promoted, or will it abort the transaction?"
And what is the behavior for the client application?