What should the segment/master do after they hit the timeout?
The DTX recovery process retrying forever seems fine, as there is not much else it can do. But I feel we should have some sort of timeout for the backend to bail out and ERROR out if recovery is not completing. I was thinking of using statement_timeout for this, but that's not enabled by default, so it will not serve us for CI and similar cases. I also considered adding a 10-minute timeout, but that felt like an arbitrary number; depending on cluster size and such, distributed recovery could legitimately take longer in practice, and we would unnecessarily fail the connection.
Does anyone have a reasonable value or action to suggest?
Hi Ashwin, I tried setting a retry timeout of 4 seconds in cdb_setup() before, but it turned out to make the pipeline flaky, so I reverted that fix and kept the always-waiting behavior (referring to a54a557c212f). I am not sure whether your PR will reintroduce the flakiness.
Things have passed in the PR pipeline and in local runs, so I am not sure whether we would hit flakiness similar to last time. This change is much more aggressive than the last one, so I would expect any flakiness to show up sooner and more often. I don't have a better way to shake out the flakiness other than to commit the PR and see what happens.
Also, while looking into this I realized that cdb_setup() (mainly
InitMotionLayerIPC() within it) is currently performed at the start of
every non-utility connection. I am wondering if there is a way to delay
that initialization to some later point, only if the connection actually
executes a distributed query. For example, the query:
psql -c "select * from gp_segment_configuration;"
does not need InitMotionLayerIPC(). Currently such a connection would
also fail if DTX recovery is in progress, which really is not necessary
and seems avoidable. Thoughts?
Hmm, InitMotionLayerIPC() is used to fork the interconnect (ic) thread and set up ic contexts; we need the listener port information when a QE is initialized.
I agree we can move the logic of waiting for DTX recovery out of cdb_setup() and delay it until we perform a distributed query.
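The deferred-initialization idea discussed above could look roughly like the following. This is a hedged sketch only: `ensure_motion_layer` and the stand-in for InitMotionLayerIPC() are hypothetical names, not real Greenplum functions; the point is just the once-per-connection lazy guard.

```c
#include <stdbool.h>

/* Hypothetical sketch of lazy initialization: instead of doing
 * motion-layer setup in cdb_setup() for every connection, defer it
 * until the first distributed query actually needs it. Names are
 * illustrative, not real Greenplum symbols. */
static bool motion_layer_initialized = false;
static int  init_call_count = 0;   /* for demonstration only */

static void
init_motion_layer_ipc(void)        /* stand-in for InitMotionLayerIPC() */
{
    init_call_count++;
}

/* Called from the distributed-query path; a no-op after the first call. */
static void
ensure_motion_layer(void)
{
    if (!motion_layer_initialized)
    {
        init_motion_layer_ipc();
        motion_layer_initialized = true;
    }
}
```

A connection that only ever runs catalog-local queries would then never pay the initialization cost, and would not block on DTX recovery either.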
I am not planning to work on that piece, so I will leave it for someone else to tackle.
On 5/27/20 6:23 PM, Pengzhou Tang wrote:
>> Hi Ashwin, I tried setting a retry timeout of 4 seconds in cdb_setup() before, but it turned out to make the pipeline flaky, so I reverted that fix and kept the always-waiting behavior (referring to a54a557c212f). I am not sure whether your PR will reintroduce the flakiness.
>
> Things have passed in the PR pipeline and in local runs, so I am not sure whether we would hit flakiness similar to last time. This change is much more aggressive than the last one, so I would expect any flakiness to show up sooner and more often. I don't have a better way to shake out the flakiness other than to commit the PR and see what happens.
Okay, I can see where the flakiness in tests comes from. It's mainly the change in expectations around gpstart. Crash recovery rejects connections immediately, but pg_ctl start does not: the default behavior (the same as with -w) for pg_ctl start is to wait until it can successfully connect to the database. In Greenplum, gpstart doesn't give the user the option to specify -w or any timeout; it always behaves as if -w were specified. So, on successful exit of gpstart, connections should succeed and no retries are required. Hence, tests that perform a restart (via gpstop -air) expect to be able to connect as soon as the command is done. With the old logic, the connection internally waited if distributed transaction recovery was still running, but with the immediate failure as proposed, connections might fail. So, if making this change, we need to add logic to gpstart to also wait for distributed transaction recovery to complete before declaring start done. - Ashwin
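The gpstart-side change suggested above amounts to a two-stage readiness check: poll until the postmaster accepts connections, then keep polling until DTX recovery has finished, and only then declare startup complete. gpstart itself is a Python utility, so this C fragment is only a hedged illustration of the control flow; both predicates and the function name are stand-ins.

```c
#include <stdbool.h>

/* Hypothetical sketch of the gpstart-side wait: startup is "done"
 * only when the cluster both accepts connections AND has finished
 * distributed transaction recovery. A real implementation would
 * sleep between polls and report progress. */
static bool
wait_until_ready(bool (*accepting)(void),
                 bool (*dtx_recovery_done)(void),
                 int max_tries)
{
    for (int i = 0; i < max_tries; i++)
    {
        /* both conditions must hold before declaring start done */
        if (accepting() && dtx_recovery_done())
            return true;
    }
    return false;   /* gpstart would warn/fail here */
}
```

With this shape, tests that run `gpstop -air` and immediately issue distributed queries keep their current expectation: a successful gpstart exit implies distributed transactions can be accepted.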
On 30-May-2020, at 2:46 AM, Ashwin Agrawal <aas...@vmware.com> wrote:
> In Greenplum, gpstart doesn't give the user the option to specify -w or any timeout; it always behaves as if -w were specified. So, on successful exit of gpstart, connections should succeed and no retries are required. Hence, tests that perform a restart (via gpstop -air) expect to be able to connect as soon as the command is done. With the old logic, the connection internally waited if distributed transaction recovery was still running, but with the immediate failure as proposed, connections might fail. So, if making this change, we need to add logic to gpstart to also wait for distributed transaction recovery to complete before declaring start done.
The flakiness is less about queries from gpstart itself; it's about the expectation in tests that after successful completion of gpstart, further queries/tests will run fine. With this change, that expectation breaks. At completion of gpstart, we should be ready to accept distributed transactions.
I have created PR
https://github.com/greenplum-db/gpdb-postgres-merge/pull/52 against the PostgreSQL 12 merge branch, as it depends on a PostgreSQL 10 change. Please let me know your thoughts.