Sync GUC between master and segments

87 views
Skip to first unread message

Hubert Zhang

unread,
Mar 4, 2019, 4:54:49 AM3/4/19
to Greenplum Developers
Hi all,

Currently, gpdb supports two ways to synchronize the GUCs between master and segments.
1. Using `SET` command, which call DispatchSetPGVariable() after set_config_option() on QD.
It only handles two GucAction type: GUC_ACTION_SET and GUC_ACTION_LOCAL which corresponds to `SET` and `SET LOCAL` command. But GUC_ACTION_SAVE is missing.

2. Sync GUC when creating gang. By calling makeOptions(), all the gucs will be passed to QEs from QD.

The above two methods both failed to handle the following case: `create extension xxx with schema yyy;`
It will set the search_path inside the create extension statement only on master. No explicit `SET` or `SET LOCAL` command will be called. Currently, it only calls set_config_option() which set the search_path on master. It leads to issues6716

One way to fix this issue is to dispatch the search_path to segments manually by calling DispatchSetPGVariable(). But the dispatch command could only call `SET` or `SET LOCAL`, which misses the case GUC_ACTION_SAVE. The semantic for GUC_ACTION_SAVE is that guc's lifecycle is at function level, while GUC_ACTION_LOCAL is at transaction level and GUC_ACTION_SET is at session level. 

In the 'create extension with schema' context, GUC_ACTION_SAVE is used by upstream. But we consider is that OK to treat GUC_ACTION_SAVE and GUC_ACTION_LOCAL to equal in this context? Because after the transaction commit, the GUC will both changed to the old value, no matter GUC_ACTION_SAVE or GUC_ACTION_LOCAL.

Or is there any better ways to fix this problem?
 
Thanks

Hubert Zhang, Haozhou Wang

Richard Guo

unread,
Mar 4, 2019, 6:06:09 AM3/4/19
to Hubert Zhang, Greenplum Developers
Hi,

A rough search on the codes shows that execute_extension_script() is not
executed by QE, so the search_path with action GUC_ACTION_SAVE is not set on
QE.

So I'm thinking how about we execute execute_extension_script() on QE too,
to set up the search path to contain the target schema on QE. Meanwhile, we
need to prevent execute_sql_string() from being executed on QE.

Look at the patch below, which is definitely far from correct, but can work
on this specific case:

diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 4f67b46ef91..fc374ecccb9 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -851,6 +851,9 @@ execute_extension_script(CreateExtensionStmt *stmt,
                              PGC_USERSET, PGC_S_SESSION,
                              GUC_ACTION_SAVE, true, 0);

+    if (Gp_role == GP_ROLE_EXECUTE)
+        return;
+
     /*
      * Set creating_extension and related variables so that
      * recordDependencyOnCurrentExtension and other functions do the right
@@ -1543,7 +1546,7 @@ CreateExtension(CreateExtensionStmt *stmt)
         CurrentExtensionObject = extensionOid;
     }

-    if (Gp_role != GP_ROLE_EXECUTE)
+//  if (Gp_role != GP_ROLE_EXECUTE)
     {
         execute_extension_script(stmt, extensionOid, control,
                              oldVersionName, versionName,


With this patch:

gpadmin=# set search_path to abc;
SET
gpadmin=# create extension gp_inject_fault with schema public;
CREATE EXTENSION
gpadmin=# select * from pg_extension ;
     extname     | extowner | extnamespace | extrelocatable | extversion | extconfig | extcondition
-----------------+----------+--------------+----------------+------------+-----------+--------------
 plpgsql         |       10 |           11 | f              | 1.0        |           |
 gp_inject_fault |       10 |         2200 | t              | 1.0        |           |
(2 rows)

Thanks
Richard

Hubert Zhang

unread,
Mar 4, 2019, 9:05:56 AM3/4/19
to Richard Guo, Greenplum Developers
Thanks, Richard.
execute_extension_script() will do the following things 1.set guc(search_path etc.), 2 run extension script. 3. If run script failed, rollback the guc setting.
If we run `set guc` on QD+QE but run extension script only on QD, then how about the rollback step? 

--
Thanks

Hubert Zhang

Ashwin Agrawal

unread,
Mar 4, 2019, 5:51:41 PM3/4/19
to Hubert Zhang, Greenplum Developers
On Mon, Mar 4, 2019 at 1:54 AM Hubert Zhang <hzh...@pivotal.io> wrote:
Hi all,

Currently, gpdb supports two ways to synchronize the GUCs between master and segments.
1. Using `SET` command, which call DispatchSetPGVariable() after set_config_option() on QD.
It only handles two GucAction type: GUC_ACTION_SET and GUC_ACTION_LOCAL which corresponds to `SET` and `SET LOCAL` command. But GUC_ACTION_SAVE is missing.

2. Sync GUC when creating gang. By calling makeOptions(), all the gucs will be passed to QEs from QD.

The above two methods both failed to handle the following case: `create extension xxx with schema yyy;`
It will set the search_path inside the create extension statement only on master. No explicit `SET` or `SET LOCAL` command will be called. Currently, it only calls set_config_option() which set the search_path on master. It leads to issues6716

Can you provide more details on why makeOptions() didn't dispatch the search_path guc correctly for commands being dispatched as part of `execute_extension_script()` ? I think we should fix to dispatch such gucs as well and generically fix the issue instead of point fixing Create Extension specifically.

Ashwin Agrawal

unread,
Mar 4, 2019, 5:58:38 PM3/4/19
to Richard Guo, Hubert Zhang, Greenplum Developers
On Mon, Mar 4, 2019 at 3:06 AM Richard Guo <ri...@pivotal.io> wrote:
Hi,

A rough search on the codes shows that execute_extension_script() is not
executed by QE, so the search_path with action GUC_ACTION_SAVE is not set on
QE.

So I'm thinking how about we execute execute_extension_script() on QE too,
to set up the search path to contain the target schema on QE. Meanwhile, we
need to prevent execute_sql_string() from being executed on QE.

This seems super hacky and very confusing in long run. Ideally, surprised it worked basically its elevating GUC_ACTION_SAVE to GUC_ACTION_LOCAL behavior on QE as despite completing the function CreateExtension() on QE the guc value remains and later is used for statements coming from QD run by `execute_extension_script()`.

Hubert Zhang

unread,
Mar 4, 2019, 9:23:29 PM3/4/19
to Ashwin Agrawal, Greenplum Developers
makeOptions() is only called by cdbgang_createGang_async() which means it only synchronize GUCs when creating gangs.
Consider the following statement "create extension gp_inject_fault with schema public."
It will first call CdbDispatchUtilityStatement() on master to create gang on all segments and pass all the gucs. And then it will call execute_extension_script() on master too, and inside execute_extension_script(), set_config_option will be called to set search_path to public only on master as well. Since gangs are already created, execute_extension_script() will not call makeOptions() again.


--
Thanks

Hubert Zhang

Pengzhou Tang

unread,
Mar 4, 2019, 9:39:53 PM3/4/19
to Hubert Zhang, Ashwin Agrawal, Greenplum Developers
The discussion in https://github.com/greenplum-db/gpdb/pull/6596 might help, we need a new mechanism to manage GUCs and make them sync between
QD and QEs. From Heikki's comment:


I think we should change our approach, wrt. keeping GUCs in sync in all the processes. Instead of immediately dispatching the SET command, I think we should have a per-connection flag, to indicate whether the GUCs in that QE process are in sync with the QD process. If the GUCs become out of sync for any reason, for example because the user issues a SET command, or because the connection was just established and the GUCs haven't been set yet, clear the flag. Check the flag before dispatching any command, and issue the required SETs at that point, if needed.

And instead of trying to keep track of which GUCs are in sync, dispatch all the GUCs that are different from the default values.


Will the potential approach resolve the problem?

Hubert Zhang

unread,
Mar 4, 2019, 10:30:12 PM3/4/19
to Pengzhou Tang, Ashwin Agrawal, Greenplum Developers
I think so. +1 on keeping GUCs in sync in all the processes:
For GUC_ACTION_SAVE which there is no corresponding "SET" command to it. By using this new approach, we don't need to consider the GUC_ACTION_TYPE on segment. We could let master to manage the GUC life cycle, e.g. restore when function/transaction committed.


--
Thanks

Hubert Zhang

Hubert Zhang

unread,
Mar 5, 2019, 5:09:40 AM3/5/19
to Pengzhou Tang, Ashwin Agrawal, Greenplum Developers
Did some additional spike followed Heikki's approach. Here is a brief summary of what to change:
QD side:
1. In buildGpQueryString(), detect modified GUCs and serialize these modified GUCs
2. When GUC is modified, mark guc_need_sync to true. It is similar to guc_dirty. But guc_dirty is used by AtEOXact_GUC() in transaction commit/abort. guc_need_sync is set to true when GUC value changed; set to false after successful dispatch.

QE side:
1. Parse connection string from QD and extract GUC list in postgres.c.
2. Call set_config_option to set the GUC value on QE

Two things need to be discussed:
1. cached multi gangs. Suppose last query run a multiple slice query, and generate five gangs. But new query only uses 1 gang. To avoid inconsistent GUC value on old idle gangs, there are 3 approaches:
  a. Retire the old four gangs and only left 1 gang to be valid. e.g. call cdbcomponent_cleanupIdleQEs()
  b. If detecting additional gangs, don’t set guc_need_sync to false.
  c. Firstly dispatch all gucs to all the gangs, and then dispatch the real query.
Which is better?

2. parameter need to be serialized in modified guc list. Below is function interface:
set_config_option(const char *name, const char *value, GucContext context, GucSource source, GucAction action, bool changeVal, int elevel)
Except name:value pair, is there a need to pass other parameters? Since GUCs on Segments is just a copy of GUCs on Master, So the additional info like GucContext, GucSource and GucAction is not important. I wonder if it is OK to just pass name:value pair as serialized GUCs?

Thanks
Hubert Zhang

Hubert Zhang

unread,
Mar 22, 2019, 5:13:16 AM3/22/19
to Pengzhou Tang, Ashwin Agrawal, Greenplum Developers, Heikki Linnakangas
Hi all,

Based on the previous discussion on PR6596, I follow Heikki's approach to refactor the GUC sync mechanism between QD and QE.  Please refer to PR7184 for implementation.
Here I want to involve more people to discuss and recheck the behavior of GUC synchronization. 
As explained in PR7184, Previous GUC sync is implemented by SET command +  Pass GUCs with GUC_GPDB_ADDOPT when creating gang. 
It has two problems:
1. Set command is not 2PC and may lead to inconsistent GUC state. Issues6610 describes the details of this problem.
2. When set GUCs without GUC_GPDB_ADDOPT using SET command, it will apply to all the existing QEs, but when cached QE exits with timeout, the changed GUC value will not appear on the new QE process.

New solution does not dispatch "SET command" to QE anymore. Instead, SET command only take effect on QD and enable GUC_need_sync flag. GUC will be passed to QE along with dispatched query/command later . As for what kind of GUCs need to passed to QE, we introduce a GUC flag GUC_GPDB_NEED_SYNC to replace the old flag  GUC_GPDB_ADDOPT. And the GUC sync logic become:
1. What to sync 1: Only GUCs with flag GUC_GPDB_NEED_SYNC will be passed from QD to QE. GUCs without this flag will be treated as no need to sync.
2. What to sync 2: Only changed GUCs will be passed to QE to reduce the number of GUC need to be synced. Here the changed means GUC was modified at some time even if it's value changed back to the default value. This is due to without 2PC guarantee,  default GUC value on QD doesn't mean GUC value on QE is also default value.
3. Time to sync 1: "SET command" will lead to GUC to be synced at the next query. Even if you just set one GUC, all the changed GUC will be passed to QE to ensure consistent.
4. Time to sync 2: Transaction abort will lead to GUC to be synced at the next query.
5. GUC will not be passed by libpq option to QE when creating gang. Old logic is removed.

Based on Logic#1, we need to recheck semantic of all the GUCs and set GUC_GPDB_NEED_SYNC flag for them if needed.
Any comments on the GUC sync behavior change?

Thanks
Hubert
--
Thanks

Hubert Zhang
Reply all
Reply to author
Forward
0 new messages