Behavioral Difference between DROP COLUMN and DROP..CASCADE

71 views
Skip to first unread message

ekh...@vmware.com

unread,
May 13, 2023, 12:19:23 AM5/13/23
to Greenplum Developers
Hello Folks,

Context:
Recently we came across an issue in GP6 where the partition key was dropped by using `DROP TYPE .. CASCADE` on a user defined type that was associated with that partition key. Later when a select query was executed on this table it caused a crash with ORCA. 
A direct drop on the partition key column is not allowed in GP6 or GP7.
This led to looking into if there are any other behavioral differences between direct drop (`ALTER TABLE .... DROP COLUMN`) and `DROP TYPE.. CASCADE` on GP6 and GP7 that would leave the table in a state where queries may cause other issues.

Below is a matrix of scenarios tried :
DropColMatrix.png


There are 2 main differences that I have highlighted in the above matrix. More details below:

1. Behavioral difference between `ALTER TABLE ... DROP COLUMN` and `DROP TYPE.. CASCADE` on Partition key
ALTER TABLE ... DROP COLUMN
For both GP6 and GP7, when we try to do a direct DROP for the partition key column, it is not allowed and errors out.

DROP TYPE..CASCADE
GP6: when a DROP TYPE..CASCADE is called for a column type associated with the partition key, it drops the partition column from the table.
GP7: consistent with Postgres, the table is dropped if the user type being dropped is associated with the partition key column.

CREATE TYPE test_type AS ENUM ('new', 'open', 'closed');
CREATE TABLE part_table(dist_key int, part_key test_type, col int)
DISTRIBUTED BY(dist_key)
PARTITION BY LIST(part_key) ( PARTITION p1 VALUES ('new'),
                              PARTITION p2 VALUES ('open')) ;

ALTER TABLE part_table DROP COLUMN part_key;
ERROR:  cannot drop partitioning column "part_key"

-------- GP6 -----------
DROP TYPE test_type CASCADE;
NOTICE:  drop cascades to 3 other objects
DETAIL:  drop cascades to table part_table column part_key
drop cascades to table part_table_1_prt_p1 column part_key
drop cascades to table part_table_1_prt_p2 column part_key
DROP TYPE

\d+ part_table
                       Table "public.part_table"
  Column  |  Type   | Modifiers | Storage | Stats target | Description
----------+---------+-----------+---------+--------------+-------------
 dist_key | integer |           | plain   |              |
 col      | integer |           | plain   |              |
Child tables: part_table_1_prt_p1,
              part_table_1_prt_p2
Distributed by: (dist_key)
Partition by: (........pg.dropped.2........)

-------- GP7(same as PG) -----------
DROP TYPE test_type CASCADE;
NOTICE:  drop cascades to table part_table
DROP TYPE
\d part_table
Did not find any relation named "part_table".

2. Behavioral difference between `ALTER TABLE ... DROP COLUMN` and `DROP TYPE.. CASCADE` on distribution key COLUMN

ALTER TABLE ... DROP COLUMN
For both GP6 and GP7, when we try to do a direct DROP for the distribution key column, it goes ahead and drops the distribution key column and sets the distribution policy for the table as randomly distributed.

DROP TYPE.. CASCADE
For both GP6 and GP7, when `DROP TYPE..CASCADE` is called for a column type associated with the distribution key, it drops the column but does not update the table to be randomly distributed, leaving the table in a weird state.
Note: the same behavior is also seen for non-partitioned tables

CREATE DOMAIN test_int as INT;
CREATE TABLE test_part_table(a test_int, b int) DISTRIBUTED BY(a)
PARTITION BY RANGE(b) (START(0) END(10) EVERY(5));
CREATE TABLE test_table(a test_int, b int);
INSERT INTO test_part_table SELECT 1,1;
INSERT INTO test_table SELECT 1,1;

DROP DOMAIN test_int CASCADE;
\d test_table ---> Crash
\d test_part_table ---> Crash
select policytype,distkey from gp_distribution_policy where localoid=(select oid from pg_class where relname='test_table');
 policytype | distkey
------------+---------
 p          | 1
(1 row)

select policytype,distkey from gp_distribution_policy where localoid=(select oid from pg_class where relname='test_part_table');
 policytype | distkey
------------+---------
 p          | 1
(1 row)


This state further leads to query failures with ORCA on such tables (planner still works fine)
--- PLANNER
SELECT * FROM test_table;
b
---
1
(1 row)
SELECT * FROM test_part_table;
b
---
1
(1 row)

---- ORCA
SELECT * FROM test_table;
ERROR:  could not find hash function for type 0 in operator family 1977 (cdbhash.c:407)
SELECT * FROM test_part_table;
ERROR:  could not find hash function for type 0 in operator family 1977 (cdbhash.c:407)


Wanted to understand better if there is a reason we have these differences in behavior? Or If they were not intended, then how would we want to proceed moving forward?

Below are some initial proposals for closing the gaps:  
1. For partition key column: GP6 should also drop the table altogether and be consistent with GP7 and Postgres, but not sure how that would impact current GP6 users. 
    Another way for GP6 would be to disallow dropping partition key columns completely and error out during DROP TYPE .. CASCADE (same behavior as direct drop on partition key COLUMN).
2. For distribution key column: this should be the same behavior as direct drop on distribution column,i.e, drop the column and set distribution policy as randomly distributed.

Thanks,
Ekta 

Ekta Khanna

unread,
May 15, 2023, 2:12:25 PM5/15/23
to gpdb...@greenplum.org

(Resending; group conversation via google groups didn’t make it to mail. 2nd attempt to send via mail)


Hello Folks,

Context:

Recently we came across an issue in GP6 where the partition key was dropped by using `DROP TYPE .. CASCADE` on a user defined type that was associated with that partition key. Later when a select query was executed on this table it caused a crash with ORCA. 

A direct drop on the partition key column is not allowed in GP6 or GP7.

This led to looking into if there are any other behavioral differences between direct drop (`ALTER TABLE .... DROP COLUMN`) and `DROP TYPE.. CASCADE` on GP6 and GP7 that would leave the table in a state where queries may cause other issues.


Below is a matrix of scenarios tried :

b29va2trYwYsQIQVIiBDSmJRpMdaVuBDQP6oZOJzYJAflsk4DXbYVABgKal3IPIVBMBP4DOcODraU8OWsAAAAASUVORK5CYII=



Wanted to understand better if there is a reason we have these differences in behavior? Or if they were not intended, then how would we want to proceed moving forward?

Ashwin Agrawal

unread,
May 15, 2023, 6:46:47 PM5/15/23
to Ekta Khanna, gpdb...@greenplum.org
On Mon, May 15, 2023 at 11:12 AM 'Ekta Khanna' via Greenplum Developers <gpdb...@greenplum.org> wrote:

(Resending; group conversation via google groups didn’t make it to mail. 2nd attempt to send via mail)


Hello Folks,

Context:

Recently we came across an issue in GP6 where the partition key was dropped by using `DROP TYPE .. CASCADE` on a user defined type that was associated with that partition key. Later when a select query was executed on this table it caused a crash with ORCA. 

A direct drop on the partition key column is not allowed in GP6 or GP7.

This led to looking into if there are any other behavioral differences between direct drop (`ALTER TABLE .... DROP COLUMN`) and `DROP TYPE.. CASCADE` on GP6 and GP7 that would leave the table in a state where queries may cause other issues.


Below is a matrix of scenarios tried :
b29va2trYwYsQIQVIiBDSmJRpMdaVuBDQP6oZOJzYJAflsk4DXbYVABgKal3IPIVBMBP4DOcODraU8OWsAAAAASUVORK5CYII=



Thanks a lot for work and articulating the findings.

Wanted to understand better if there is a reason we have these differences in behavior? Or if they were not intended, then how would we want to proceed moving forward?


I don't think it's intentional chosen behavior (unless you are able to
trace down some tests validating the current behavior).

 

Below are some initial proposals for closing the gaps:  

1. For partition key column: GP6 should also drop the table altogether and be consistent with GP7 and Postgres, but not sure how that would impact current GP6 users. 

    Another way for GP6 would be to disallow dropping partition key columns completely and error out during DROP TYPE .. CASCADE (same behavior as direct drop on partition key COLUMN).


Dropping the table does feel like a very drastic step (specially for
STABLE release if any logic error happens to correctly detect the
condition may end up deleting tables which shouldn't have been
dropped). Hence, I feel we should be conservative at changing the
behavior and just fail DROP TYPE..CASCADE and let users take the
action (which also gives them a chance to backup data and such if
required).

Also, one dangling part remains when mail says it works fine for the
planner after this condition. What exactly does fine mean? How does
insert behavior with dropped partition key? How does partition
elimination work with dropped partition key? Does planner have
explicit or special logic coded to handle all such cases or is
working by accident?
 

2. For distribution key column: this should be the same behavior as direct drop on distribution column,i.e, drop the column and set distribution policy as randomly distributed.


Agree +1 to that.

--
Ashwin Agrawal (VMware)
Message has been deleted

Ekta Khanna

unread,
May 16, 2023, 10:19:29 PM5/16/23
to Ashwin Agrawal, gpdb...@greenplum.org

 

Below are some initial proposals for closing the gaps:  

1. For partition key column: GP6 should also drop the table altogether and be consistent with GP7 and Postgres, but not sure how that would impact current GP6 users. 

    Another way for GP6 would be to disallow dropping partition key columns completely and error out during DROP TYPE .. CASCADE (same behavior as direct drop on partition key COLUMN).


Dropping the table does feel like a very drastic step (specially for
STABLE release if any logic error happens to correctly detect the
condition may end up deleting tables which shouldn't have been
dropped). Hence, I feel we should be conservative at changing the
behavior and just fail DROP TYPE..CASCADE and let users take the
action (which also gives them a chance to backup data and such if
required).

Agreed. (my exact thoughts too) 
  
Also, one dangling part remains when mail says it works fine for the
planner after this condition. What exactly does fine mean? How does
insert behavior with dropped partition key? How does partition
elimination work with dropped partition key? Does planner have
explicit or special logic coded to handle all such cases or is
working by accident?

By works fine I meant to say we don't see any crashes and it generates a plan. Behavior wise, for a table with dropped partition column GP6:
1. INSERT and UPDATE: Fail during execution with the following error:
ERROR:  no partition for partitioning key
This error occurs as it's trying to look for which partition(table) to insert/update the data into, which is done at execution time.
2. SELECT: SELECT rows from all partitions or can filter on non-dropped columns 
3. DELETE: DELETE rows from all partitions or can delete specific rows filtered on non-dropped column   
SELECT and DELETE works similar to how it would work if any regular column is dropped and since no partition key will be part of the query itself, planner ends up scanning all partitions as no partition elimination is required.

Looking more into the code, planner doesn't have special code to handle such cases, it mainly relies on `rel_is_partitioned()` for checking if a table is partitioned and add scans for the underlying child partitions.
This function looks at `pg_partition` and `pg_partition_rule` table to determine if the table is partitioned or not and eventhough the partition key column is dropped there is information populated in these tables(seems incorrect) that lets planner fetch the child table relid's and add scans for them.
Planner's code that does partition elimination or anything else with partition keys is only triggered when the query has that column in the targetlist or filters on it but since the partition key cannot be used in the query itself(as it is dropped), planner doesn't go through that part of the code at all. Thus, no failures/crashes while planning.

The more I look at these scenarios, it makes more sense to just fail for DROP TYPE...CASCADE and avoid failure/crash for other scenarios that I can't think of right now.


From: Ashwin Agrawal <ashwi...@gmail.com>
Sent: Monday, May 15, 2023 3:46 PM
To: Ekta Khanna <ekh...@vmware.com>
Cc: gpdb...@greenplum.org <gpdb...@greenplum.org>
Subject: Re: Behavioral Difference between DROP COLUMN and DROP..CASCADE
 
!! External Email
!! External Email: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender.

Ekta Khanna

unread,
May 17, 2023, 3:53:08 AM5/17/23
to Ashwin Agrawal, gpdb...@greenplum.org

Another scenario: for a table with default partition and dropped partition key

  1. INSERT: Planner generates a plan that inserts data into the default partition
  2. UPDATE: Still fails but with a different error: moving tuple from partition "partition_key_dropped_1_prt_p2" to partition "partition_key_dropped_1_prt_dp" not supported
  3. SELECT and DELETE: work same a table with dropped partition key and no default partition.

Updated matrix below:

--
To unsubscribe from this topic, visit https://groups.google.com/a/greenplum.org/d/topic/gpdb-dev/Yo2Tp18ENu8/unsubscribe.
To unsubscribe from this group and all its topics, send an email to gpdb-dev+u...@greenplum.org.

Reply all
Reply to author
Forward
0 new messages