DML Locking behavior Issues for Partition Table FIXME


Zhenghua Lyu

Sep 15, 2020, 10:14:53 AM
to gpdb...@greenplum.org

DML Locking behavior Issues for Partition Table FIXME


Overview

The Postgres v12 merge is approaching its end. I spent some time on the DML locking behavior for partition tables, based on the top commit in the gpdb-postgres-merge repo (https://github.com/greenplum-db/gpdb-postgres-merge/commit/01f2d101e876747248abb6cb7655d031b5e00726).


For DMLs, there are several places where locks are taken (a small way to observe the stages is sketched after this list):

  • parse analysis, which sets the target relation

  • planning, which may expand a partition table that has subclasses; this takes locks on the sub-tables

  • InitPlan, which initializes the result relations; fetching their partition quals takes locks

  • for insert, the QEs, which take locks during InitPlan and in the executor code
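
One way to observe the stages separately is the sketch below, using the t_part_heap table and the show_locks_lockmodes view defined later in this report. An assumption here is that PREPARE runs parse analysis only, with planning and execution deferred to EXECUTE; the exact stage boundaries may differ.

begin;
prepare del_root as delete from t_part_heap;  -- parse-analysis locks
select * from show_locks_lockmodes;
execute del_root;                             -- planner + InitPlan + executor locks
select * from show_locks_lockmodes;
abort;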


This report mainly shows the current behavior and clarifies the logic behind it. We can discuss in this thread what the best design for the locking behavior of partition tables in Greenplum is.


My Suggestion

I propose the following rule for locking behavior in Greenplum:

  • if a QE holds a lock on relation T in mode M, then the QD holds a lock on relation T in a higher-level mode than M
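
As a sketch of how this rule could be checked from SQL (hypothetical; it reuses the pg_locks layout that the show_locks_lockmodes view below relies on): rank the lock modes, then flag every QE lock that has no QD lock at or above its level.

with ranked as (
  select gp_segment_id,
         relation::regclass::text as rel,
         case mode
           when 'AccessShareLock'          then 1
           when 'RowShareLock'             then 2
           when 'RowExclusiveLock'         then 3
           when 'ShareUpdateExclusiveLock' then 4
           when 'ShareLock'                then 5
           when 'ShareRowExclusiveLock'    then 6
           when 'ExclusiveLock'            then 7
           when 'AccessExclusiveLock'      then 8
         end as lvl
  from pg_locks
  where locktype = 'relation' and granted
)
select qe.rel, qe.gp_segment_id, qe.lvl as qe_level, coalesce(max(qd.lvl), 0) as qd_level
from ranked qe
left join ranked qd on qd.rel = qe.rel and qd.gp_segment_id = -1
where qe.gp_segment_id >= 0
group by qe.rel, qe.gp_segment_id, qe.lvl
having coalesce(max(qd.lvl), 0) < qe.lvl;  -- use <= to enforce the strictly-higher form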


In short, the QD should hold all locks, and in the highest-level mode. The motivation is:

  • without GDD, we must not let lock waits happen on QEs

  • a DistributedSnapshot can only be taken on the QD and then dispatched to the QEs.


So if this suggestion is reasonable, the fix might look like:

  • save the correct lockmode in the RTEs

  • even for insert, hold locks on every leaf in the QD

  • decide whether ORCA should hold locks (I vote not)


Claim for GDD

GDD affects the lock level for update and delete on heap tables. As the results below show, it only affects the relation named in the RangeVar: when expanding the root relation, the mode stored in the RTE is used, and the same is true in InitPlan.
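
To see the difference, enable the detector and rerun the delete-on-root case from the test section (a sketch, assuming the gp_enable_global_deadlock_detector GUC; it requires a restart, e.g. gpconfig -c gp_enable_global_deadlock_detector -v on). With GDD on, the QD is expected to take RowExclusiveLock instead of ExclusiveLock on the relation named in the statement:

show gp_enable_global_deadlock_detector;
begin;
delete from t_part_heap;
select * from show_locks_lockmodes;  -- expect RowExclusiveLock on t_part_heap at gp_segment_id -1
abort;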

Claim for ORCA


This report only shows the results under the planner. As far as I can see, ORCA holds locks on a table and all of its inheritors in the function cdb_estimate_partitioned_numtuples. Maybe it should choose the lock mode dynamically instead of hard-coding AccessShareLock.
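
A quick way to compare would be to rerun any case below under ORCA (a sketch, assuming the optimizer GUC toggles ORCA; this report itself only covers the planner):

set optimizer = on;
begin;
delete from t_part_heap;
select * from show_locks_lockmodes;  -- the AccessShareLock taken in cdb_estimate_partitioned_numtuples should be visible here
abort;
set optimizer = off;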

Test Scripts and Results


See the detailed results below; the comments starting with ---- are my analysis.

Table Definition and Locking Query View

create table t_part_heap (a int, b int, c int)
distributed randomly
partition by range(b) (start(0) end(2) every(1));

create table t_part_ao (a int, b int, c int)
with(appendonly=true)
distributed randomly
partition by range(b) (start(0) end(2) every(1));

create table t_part_mix (a int, b int, c int)
with(appendonly=true)
distributed randomly
partition by list(b)
(
  partition p1 values (0),
  partition p2 values (1) with(appendonly=true),
  partition p3 values (2)
);

create table t_part_multi_heap (a int, b int, c int)
distributed randomly
partition by range(b)
  subpartition by list(c) subpartition template
  (
    values(0),
    values(1),
    values(2)
  )
(start(0) end(2) every(1));

create table t_part_multi_ao (a int, b int, c int)
with(appendonly=true)
distributed randomly
partition by range(b)
  subpartition by list(c) subpartition template
  (
    values(0),
    values(1),
    values(2)
  )
(start(0) end(2) every(1));

create table t_part_multi_mix (a int, b int, c int)
distributed randomly
partition by range(b)
  subpartition by list(c) subpartition template
  (
    values(0),
    values(1) with(appendonly=true),
    values(2)
  )
(start(0) end(2) every(1));

create or replace view show_locks_lockmodes as
  select gp_segment_id, relation::regclass::text, array_agg(mode)
  from pg_locks
  where
    gp_segment_id < 1 and
    locktype = 'relation' and
    granted and
    relation::regclass::text like 't_part%'
  group by gp_segment_id, relation
  order by gp_segment_id, relation;
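
(Note the gp_segment_id < 1 filter: to keep the output short, the view only shows the QD, which reports gp_segment_id -1, and segment 0.)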


Heap Partition Table, SingleLevel, Disable GDD

-- delete on root
begin;
BEGIN
delete from t_part_heap;
DELETE 0
select * from show_locks_lockmodes;
gp_segment_id |      relation       |             array_agg             
---------------+---------------------+------------------------------------
            -1 | t_part_heap         | {AccessShareLock,ExclusiveLock}
            -1 | t_part_heap_1_prt_1 | {RowExclusiveLock}
            -1 | t_part_heap_1_prt_2 | {RowExclusiveLock}
            0 | t_part_heap         | {RowExclusiveLock,AccessShareLock}
            0 | t_part_heap_1_prt_1 | {RowExclusiveLock}
            0 | t_part_heap_1_prt_2 | {RowExclusiveLock}
(6 rows)

abort;
ROLLBACK
---- analysis for the above locking behavior
---- On QD:
----   1. It first takes ExclusiveLock on the target relation when
----      invoking setTargetTable; in this case that is the root
----      partition. ExclusiveLock is used because GDD is disabled.
----   2. It then enters the planner. During this stage it invokes
----      add_other_rels_to_query -> expand_inherited_rtentry to take
----      locks on the target relation's inherited tables. The lockmode
----      is read from the root relation's RTE.
----   3. Then, during InitPlan, it invokes RelationGetPartitionQual
----      from InitResultRelInfo. This is skipped for the root
----      partition; otherwise it takes AccessShareLock on each parent.
---- On QEs:
----   1. A QE receives the dispatched plan, so it has no parse-analysis
----      stage; it takes its locks during InitPlan, where it invokes
----      ExecGetRangeTableRelation to lock the result relations (the
----      leaves).
----   2. Then, during InitResultRelInfo, it locks each parent in
----      AccessShareLock via generate_partition_qual.
----   3. Then it handles rootResultRelations and locks them; the mode
----      comes from the RTE.

-- delete on first leaf
begin;
BEGIN
delete from t_part_heap_1_prt_1;
DELETE 0
select * from show_locks_lockmodes;
gp_segment_id |      relation       |     array_agg     
---------------+---------------------+--------------------
            -1 | t_part_heap         | {AccessShareLock}
            -1 | t_part_heap_1_prt_1 | {ExclusiveLock}
            0 | t_part_heap         | {AccessShareLock}
            0 | t_part_heap_1_prt_1 | {RowExclusiveLock}
(4 rows)

abort;
ROLLBACK
---- here the leaf is the RangeVar target, so the QD takes ExclusiveLock
---- on it; its parent (the root) is only locked in AccessShareLock when
---- fetching the partition qual.

-- insert on root
begin;
BEGIN
insert into t_part_heap values (0,0,0), (1,1,1);
INSERT 0 2
select * from show_locks_lockmodes;
gp_segment_id |      relation       |             array_agg
---------------+---------------------+------------------------------------
            -1 | t_part_heap         | {RowExclusiveLock}
            0 | t_part_heap         | {AccessShareLock,RowExclusiveLock}
            0 | t_part_heap_1_prt_1 | {RowExclusiveLock}
(3 rows)

---- On QD: it just takes RowExclusiveLock when setting the target
----   relation.
---- On QEs:
----   1. During InitPlan it takes RowExclusiveLock on the target
----      relation.
----   2. During ExecModifyTable it takes locks on the needed leaf and
----      its parent.

abort;
ROLLBACK
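
---- a quick way to see the leaf-only locking in isolation (a sketch;
---- which segment shows the lock depends on the random distribution):
begin;
insert into t_part_heap values (7,1,7);  -- b=1 routes to t_part_heap_1_prt_2 only
select * from show_locks_lockmodes;
abort;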
-- insert on a leaf
begin;
BEGIN
insert into t_part_heap_1_prt_1 values (0,0,0), (1,0,0);
INSERT 0 2
select * from show_locks_lockmodes;
gp_segment_id |      relation       |     array_agg     
---------------+---------------------+--------------------
            -1 | t_part_heap         | {AccessShareLock}
            -1 | t_part_heap_1_prt_1 | {RowExclusiveLock}
            0 | t_part_heap         | {AccessShareLock}
            0 | t_part_heap_1_prt_1 | {RowExclusiveLock}
(4 rows)

abort;
ROLLBACK



AO Partition Table, SingleLevel, Disable GDD

---- the analysis for this part is almost the same as for the heap table

-- delete on root
begin;
BEGIN
delete from t_part_ao;
DELETE 0
select * from show_locks_lockmodes;
gp_segment_id |     relation      |             array_agg             
---------------+-------------------+------------------------------------
            -1 | t_part_ao         | {AccessShareLock,ExclusiveLock}
            -1 | t_part_ao_1_prt_1 | {RowExclusiveLock}
            -1 | t_part_ao_1_prt_2 | {RowExclusiveLock}
            0 | t_part_ao         | {RowExclusiveLock,AccessShareLock}
            0 | t_part_ao_1_prt_1 | {RowExclusiveLock}
            0 | t_part_ao_1_prt_2 | {RowExclusiveLock}
(6 rows)

abort;
ROLLBACK
-- delete on first leaf
begin;
BEGIN
delete from t_part_ao_1_prt_1;
DELETE 0
select * from show_locks_lockmodes;
gp_segment_id |     relation      |     array_agg     
---------------+-------------------+--------------------
            -1 | t_part_ao         | {AccessShareLock}
            -1 | t_part_ao_1_prt_1 | {ExclusiveLock}
            0 | t_part_ao         | {AccessShareLock}
            0 | t_part_ao_1_prt_1 | {RowExclusiveLock}
(4 rows)

abort;
ROLLBACK
-- insert on root
begin;
BEGIN
insert into t_part_ao values (0,0,0), (1,1,1);
INSERT 0 2
select * from show_locks_lockmodes;
gp_segment_id |     relation      |             array_agg             
---------------+-------------------+------------------------------------
            -1 | t_part_ao         | {RowExclusiveLock}
            0 | t_part_ao         | {AccessShareLock,RowExclusiveLock}
            0 | t_part_ao_1_prt_2 | {RowExclusiveLock}
(3 rows)

abort;
ROLLBACK
-- insert on a leaf
begin;
BEGIN
insert into t_part_ao_1_prt_1 values (0,0,0), (1,0,0);
INSERT 0 2
select * from show_locks_lockmodes;
gp_segment_id |     relation      |     array_agg     
---------------+-------------------+--------------------
            -1 | t_part_ao         | {AccessShareLock}
            -1 | t_part_ao_1_prt_1 | {RowExclusiveLock}
            0 | t_part_ao         | {AccessShareLock}
            0 | t_part_ao_1_prt_1 | {RowExclusiveLock}
(4 rows)

abort;
ROLLBACK

Heap Partition Table, MultiLevel, Disable GDD


-- delete on root
begin;
BEGIN
delete from t_part_multi_heap;
DELETE 0
select * from show_locks_lockmodes;
gp_segment_id |             relation              |             array_agg             
---------------+-----------------------------------+------------------------------------
            -1 | t_part_multi_heap                 | {ExclusiveLock,AccessShareLock}
            -1 | t_part_multi_heap_1_prt_1         | {AccessShareLock,RowExclusiveLock}
            -1 | t_part_multi_heap_1_prt_1_2_prt_1 | {RowExclusiveLock}
            -1 | t_part_multi_heap_1_prt_1_2_prt_2 | {RowExclusiveLock}
            -1 | t_part_multi_heap_1_prt_1_2_prt_3 | {RowExclusiveLock}
            -1 | t_part_multi_heap_1_prt_2         | {AccessShareLock,RowExclusiveLock}
            -1 | t_part_multi_heap_1_prt_2_2_prt_1 | {RowExclusiveLock}
            -1 | t_part_multi_heap_1_prt_2_2_prt_2 | {RowExclusiveLock}
            -1 | t_part_multi_heap_1_prt_2_2_prt_3 | {RowExclusiveLock}
            0 | t_part_multi_heap                 | {AccessShareLock,RowExclusiveLock}
            0 | t_part_multi_heap_1_prt_1         | {AccessShareLock}
            0 | t_part_multi_heap_1_prt_1_2_prt_1 | {RowExclusiveLock}
            0 | t_part_multi_heap_1_prt_1_2_prt_2 | {RowExclusiveLock}
            0 | t_part_multi_heap_1_prt_1_2_prt_3 | {RowExclusiveLock}
            0 | t_part_multi_heap_1_prt_2         | {AccessShareLock}
            0 | t_part_multi_heap_1_prt_2_2_prt_1 | {RowExclusiveLock}
            0 | t_part_multi_heap_1_prt_2_2_prt_2 | {RowExclusiveLock}
            0 | t_part_multi_heap_1_prt_2_2_prt_3 | {RowExclusiveLock}
(18 rows)

abort;
ROLLBACK
-- delete on first leaf
begin;
BEGIN
delete from t_part_multi_heap_1_prt_1_2_prt_1;
DELETE 0
select * from show_locks_lockmodes;
gp_segment_id |             relation              |     array_agg     
---------------+-----------------------------------+--------------------
            -1 | t_part_multi_heap                 | {AccessShareLock}
            -1 | t_part_multi_heap_1_prt_1         | {AccessShareLock}
            -1 | t_part_multi_heap_1_prt_1_2_prt_1 | {ExclusiveLock}
            0 | t_part_multi_heap                 | {AccessShareLock}
            0 | t_part_multi_heap_1_prt_1         | {AccessShareLock}
            0 | t_part_multi_heap_1_prt_1_2_prt_1 | {RowExclusiveLock}
(6 rows)

abort;
ROLLBACK
-- delete on middle
begin;
BEGIN
delete from t_part_multi_heap_1_prt_1;
DELETE 0
select * from show_locks_lockmodes;
gp_segment_id |             relation              |             array_agg             
---------------+-----------------------------------+------------------------------------
            -1 | t_part_multi_heap                 | {AccessShareLock}
            -1 | t_part_multi_heap_1_prt_1         | {AccessShareLock,ExclusiveLock}
            -1 | t_part_multi_heap_1_prt_1_2_prt_1 | {RowExclusiveLock}
            -1 | t_part_multi_heap_1_prt_1_2_prt_2 | {RowExclusiveLock}
            -1 | t_part_multi_heap_1_prt_1_2_prt_3 | {RowExclusiveLock}
            0 | t_part_multi_heap                 | {AccessShareLock}
            0 | t_part_multi_heap_1_prt_1         | {RowExclusiveLock,AccessShareLock}
            0 | t_part_multi_heap_1_prt_1_2_prt_1 | {RowExclusiveLock}
            0 | t_part_multi_heap_1_prt_1_2_prt_2 | {RowExclusiveLock}
            0 | t_part_multi_heap_1_prt_1_2_prt_3 | {RowExclusiveLock}
(10 rows)

abort;
ROLLBACK
-- insert on root
begin;
BEGIN
insert into t_part_multi_heap values (0,0,0), (1,1,1);
INSERT 0 2
select * from show_locks_lockmodes;
gp_segment_id |             relation              |             array_agg             
---------------+-----------------------------------+------------------------------------
            -1 | t_part_multi_heap                 | {RowExclusiveLock}
            0 | t_part_multi_heap                 | {AccessShareLock,RowExclusiveLock}
            0 | t_part_multi_heap_1_prt_2         | {AccessShareLock,RowExclusiveLock}
            0 | t_part_multi_heap_1_prt_2_2_prt_2 | {RowExclusiveLock}
(4 rows)

abort;
ROLLBACK
-- insert on a leaf
begin;
BEGIN
insert into t_part_multi_heap_1_prt_1_2_prt_1 values (0,0,0), (1,0,0);
INSERT 0 2
select * from show_locks_lockmodes;
gp_segment_id |             relation              |     array_agg     
---------------+-----------------------------------+--------------------
            -1 | t_part_multi_heap                 | {AccessShareLock}
            -1 | t_part_multi_heap_1_prt_1         | {AccessShareLock}
            -1 | t_part_multi_heap_1_prt_1_2_prt_1 | {RowExclusiveLock}
            0 | t_part_multi_heap                 | {AccessShareLock}
            0 | t_part_multi_heap_1_prt_1         | {AccessShareLock}
            0 | t_part_multi_heap_1_prt_1_2_prt_1 | {RowExclusiveLock}
(6 rows)

abort;
ROLLBACK
-- insert on a middle
begin;
BEGIN
insert into t_part_multi_heap_1_prt_1 values (0,0,0), (0,0,1);
INSERT 0 2
select * from show_locks_lockmodes;
gp_segment_id |             relation              |             array_agg             
---------------+-----------------------------------+------------------------------------
            -1 | t_part_multi_heap                 | {AccessShareLock}
            -1 | t_part_multi_heap_1_prt_1         | {RowExclusiveLock}
            0 | t_part_multi_heap                 | {AccessShareLock}
            0 | t_part_multi_heap_1_prt_1         | {AccessShareLock,RowExclusiveLock}
            0 | t_part_multi_heap_1_prt_1_2_prt_1 | {RowExclusiveLock}
(5 rows)

abort;
ROLLBACK


AO Partition Table, MultiLevel, Disable GDD

-- delete on root
begin;
BEGIN
delete from t_part_multi_ao;
DELETE 0
select * from show_locks_lockmodes;
gp_segment_id |            relation             |             array_agg             
---------------+---------------------------------+------------------------------------
            -1 | t_part_multi_ao                 | {ExclusiveLock,AccessShareLock}
            -1 | t_part_multi_ao_1_prt_1         | {AccessShareLock,RowExclusiveLock}
            -1 | t_part_multi_ao_1_prt_1_2_prt_1 | {RowExclusiveLock}
            -1 | t_part_multi_ao_1_prt_1_2_prt_2 | {RowExclusiveLock}
            -1 | t_part_multi_ao_1_prt_1_2_prt_3 | {RowExclusiveLock}
            -1 | t_part_multi_ao_1_prt_2         | {AccessShareLock,RowExclusiveLock}
            -1 | t_part_multi_ao_1_prt_2_2_prt_1 | {RowExclusiveLock}
            -1 | t_part_multi_ao_1_prt_2_2_prt_2 | {RowExclusiveLock}
            -1 | t_part_multi_ao_1_prt_2_2_prt_3 | {RowExclusiveLock}
            0 | t_part_multi_ao                 | {AccessShareLock,RowExclusiveLock}
            0 | t_part_multi_ao_1_prt_1         | {AccessShareLock}
            0 | t_part_multi_ao_1_prt_1_2_prt_1 | {RowExclusiveLock}
            0 | t_part_multi_ao_1_prt_1_2_prt_2 | {RowExclusiveLock}
            0 | t_part_multi_ao_1_prt_1_2_prt_3 | {RowExclusiveLock}
            0 | t_part_multi_ao_1_prt_2         | {AccessShareLock}
            0 | t_part_multi_ao_1_prt_2_2_prt_1 | {RowExclusiveLock}
            0 | t_part_multi_ao_1_prt_2_2_prt_2 | {RowExclusiveLock}
            0 | t_part_multi_ao_1_prt_2_2_prt_3 | {RowExclusiveLock}
(18 rows)

abort;
ROLLBACK
-- delete on first leaf
begin;
BEGIN
delete from t_part_multi_ao_1_prt_1_2_prt_1;
DELETE 0
select * from show_locks_lockmodes;
gp_segment_id |            relation             |     array_agg     
---------------+---------------------------------+--------------------
            -1 | t_part_multi_ao                 | {AccessShareLock}
            -1 | t_part_multi_ao_1_prt_1         | {AccessShareLock}
            -1 | t_part_multi_ao_1_prt_1_2_prt_1 | {ExclusiveLock}
            0 | t_part_multi_ao                 | {AccessShareLock}
            0 | t_part_multi_ao_1_prt_1         | {AccessShareLock}
            0 | t_part_multi_ao_1_prt_1_2_prt_1 | {RowExclusiveLock}
(6 rows)

abort;
ROLLBACK
-- delete on middle
begin;
BEGIN
delete from t_part_multi_ao_1_prt_1;
DELETE 0
select * from show_locks_lockmodes;
gp_segment_id |            relation             |             array_agg             
---------------+---------------------------------+------------------------------------
            -1 | t_part_multi_ao                 | {AccessShareLock}
            -1 | t_part_multi_ao_1_prt_1         | {AccessShareLock,ExclusiveLock}
            -1 | t_part_multi_ao_1_prt_1_2_prt_1 | {RowExclusiveLock}
            -1 | t_part_multi_ao_1_prt_1_2_prt_2 | {RowExclusiveLock}
            -1 | t_part_multi_ao_1_prt_1_2_prt_3 | {RowExclusiveLock}
            0 | t_part_multi_ao                 | {AccessShareLock}
            0 | t_part_multi_ao_1_prt_1         | {RowExclusiveLock,AccessShareLock}
            0 | t_part_multi_ao_1_prt_1_2_prt_1 | {RowExclusiveLock}
            0 | t_part_multi_ao_1_prt_1_2_prt_2 | {RowExclusiveLock}
            0 | t_part_multi_ao_1_prt_1_2_prt_3 | {RowExclusiveLock}
(10 rows)

abort;
ROLLBACK
-- insert on root
begin;
BEGIN
insert into t_part_multi_ao values (0,0,0), (1,1,1);
INSERT 0 2
select * from show_locks_lockmodes;
gp_segment_id |    relation     |     array_agg     
---------------+-----------------+--------------------
            -1 | t_part_multi_ao | {RowExclusiveLock}
            0 | t_part_multi_ao | {RowExclusiveLock}
(2 rows)

abort;
ROLLBACK
-- insert on a leaf
begin;
BEGIN
insert into t_part_multi_ao_1_prt_1_2_prt_1 values (0,0,0), (1,0,0);
INSERT 0 2
select * from show_locks_lockmodes;
gp_segment_id |            relation             |     array_agg     
---------------+---------------------------------+--------------------
            -1 | t_part_multi_ao                 | {AccessShareLock}
            -1 | t_part_multi_ao_1_prt_1         | {AccessShareLock}
            -1 | t_part_multi_ao_1_prt_1_2_prt_1 | {RowExclusiveLock}
            0 | t_part_multi_ao                 | {AccessShareLock}
            0 | t_part_multi_ao_1_prt_1         | {AccessShareLock}
            0 | t_part_multi_ao_1_prt_1_2_prt_1 | {RowExclusiveLock}
(6 rows)

abort;
ROLLBACK
-- insert on a middle
begin;
BEGIN
insert into t_part_multi_ao_1_prt_1 values (0,0,0), (0,0,1);
INSERT 0 2
select * from show_locks_lockmodes;
gp_segment_id |        relation         |     array_agg     
---------------+-------------------------+--------------------
            -1 | t_part_multi_ao         | {AccessShareLock}
            -1 | t_part_multi_ao_1_prt_1 | {RowExclusiveLock}
            0 | t_part_multi_ao         | {AccessShareLock}
            0 | t_part_multi_ao_1_prt_1 | {RowExclusiveLock}
(4 rows)

abort;
ROLLBACK




Heikki Linnakangas

Sep 24, 2020, 3:33:27 AM
to Zhenghua Lyu, gpdb...@greenplum.org
On 15/09/2020 17:14, Zhenghua Lyu wrote:
> I propose the following rule for locking behavior in Greenplum:
>
> * if a QE holds a lock on relation T in mode M, then the QD holds a lock on relation T in a higher-level mode than M

That's a simple rule, which is nice. I'm worried about the impact
especially on inserts to a partitioned table, though.

> In short, the QD should hold all locks, and in the highest-level mode. The motivation is:
>
> * without GDD, we must not let lock waits happen on QEs

That would be good. Although in general I think we should turn GDD on by
default, and deprecate running without GDD. Is there any reason to not
use GDD?

> * a DistributedSnapshot can only be taken on the QD and then dispatched to the QEs.

How is DistributedSnapshot related to this?

> So if this suggestion is reasonable, the fix might look like:
>
> * save the correct lockmode in the RTEs
>
> * even for insert, hold locks on every leaf in the QD

That sounds bad. If I understand correctly, normally when you INSERT
into a partitioned table, you only need to lock the partitioned table
itself, and the partition where the new row goes. Locking *all*
partitions on every insert would be a lot more locking.

- Heikki

Zhenghua Lyu

Sep 24, 2020, 5:39:40 AM
to Heikki Linnakangas, gpdb...@greenplum.org
Hi Heikki,
  
That would be good. Although in general I think we should turn GDD on by
default, and deprecate running without GDD. Is there any reason to not
use GDD?

The only problem I can come up with is when GDD and ORCA are both enabled: currently, ORCA treats every update (even an update of non-hash-distributed columns) as a split-update. So ORCA does not use heap_update to implement update; it uses ExecDelete and ExecInsert. Concurrent updates (even of non-distkey columns) under this scenario might therefore lead to too many rollbacks.

But Alex Wang will pick this up later (make ORCA not use split-update for non-distkey updates), so that will not be a problem once it is finished.

--------------------------------------------------------------------------------------------------------------------------------------

How is DistributedSnapshot related to this?

My concern is: when an insert into a partition table waits only on the QEs, then even without a global deadlock, at some point the hanging transaction must wake up (which means the world has changed: others have committed or aborted). But since it is on a QE, it cannot update the global transaction snapshot.

I tried to come up with cases but failed. One case that came to my mind is:

tx1: begin;
tx1: insert into root values (1, 1);

tx2: create index on leaf (xxx); -- this goes to the QEs and waits there, because tx1 only holds a lock on the leaf on the QEs

But it seems that the function `heapam_index_build_range_scan` uses SnapshotAny to scan the heap table.
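
A concrete version of this attempted case, using the t_part_heap table from my report (a sketch; the index name is made up, and as said above it does not actually reproduce a problem):

-- session 1
begin;
insert into t_part_heap values (0,0,0);

-- session 2: waits on the QEs only, since session 1 holds no lock on the leaf on the QD
create index t_part_heap_1_prt_1_idx on t_part_heap_1_prt_1 (c);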

--------------------------------------------------------------------------------------------------------------------------------------

That sounds bad. If I understand correctly, normally when you INSERT
into a partitioned table, you only need to lock the partitioned table
itself, and the partition where the new row goes. Locking *all*
partitions on every insert would be a lot more locking.

You are right. But currently GP (and Postgres) locks the leaf (the partition the data goes into) during ExecModifyTable, which happens on the QEs, and that may be too late. It can lead to a global deadlock (with GDD the deadlock can be resolved): the QD's dispatch to the QEs is async, so we cannot guarantee the order in which the QEs receive the statements. Look at the following case:

1: begin; -- start transaction 1
2: begin; -- start transaction 2
1: insert into root values (1), (2), (3); -- this is dispatched to all segments
2: truncate root_leaf_part1; -- this does not conflict with transaction 1 on the QD

But on the QEs the order is not guaranteed:
      if on seg0 transaction 2 arrives before transaction 1, then tx1 waits for tx2 on seg0;
      if on seg1 transaction 2 arrives after transaction 1, then tx2 waits for tx1 on seg1.
A global deadlock happens.
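
For concreteness, here is the same outline against the t_part_heap table from my report (a sketch; whether the deadlock actually forms depends on the dispatch order):

-- session 1
begin;
insert into t_part_heap values (1,0,1), (2,0,2), (3,1,3);  -- dispatched to all segments

-- session 2, issued concurrently
begin;
truncate t_part_heap_1_prt_1;  -- no conflict on the QD, only on the QEs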


It seems that we cannot find a general rule for an insert statement to know on the QD which leaves should be locked. For some very simple cases, we might add logic like direct dispatch to determine which partition leaves will be touched.

------------------------------------------------------------------------------------------------------------------------




