Syntax for specifying column-level encoding options in ALTER TABLE

46 views
Skip to first unread message

Huansong Fu

unread,
Aug 1, 2022, 2:28:22 PM8/1/22
to Greenplum Developers
Hi All, 

We would like to support specifying column-level encoding directives during altering table from Heap/AO to AOCO. The main reason for this support is to make sure that the user can set both access method and column-level encodings w/ one command and (more importantly) one single table rewrite.

There are a few different approaches in doing this which I would like to ask your opinions about:

#1 Currently we have a prototype adding a new syntax ENCODING(<options>,...) in an ATSETAM command, such as:

   ALTER TABLE foo SET ACCESS METHOD ao_column ENCODING (COLUMN a ENCODING (compresstype=rle_type)); 
   ALTER TABLE foo SET ACCESS METHOD ao_column ENCODING (DEFAULT COLUMN ENCODING (compresstype=rle_type)); 
   — Similarly for the variation of ATSETAM:
   ALTER TABLE foo SET WITH (appendonly=true, orientation=column) ENCODING (COLUMN a ENCODING (compresstype=rle_type));

#2 Put the encodings in the WITH clause, so one can do:
   ALTER TABLE foo SET ACCESS METHOD ao_column WITH(COLUMN a ENCODIONG (compresstype=rle_type));
  • Pros: seems like a quite natural usage. This was also my first thought.
  • Cons: WITH clause is an existing syntax for specifying table-level options. Mixing them could lead to confusion, e.g., would the user then expect to call CREATE TABLE ... WITH (<table options>, <column encodings>) and we have to support that too?
#3 Just use the existing ALTER COLUMN … SET (…) syntax (functionality still WIP, but syntax is there). So one can do:
   ALTER TABLE foo SET ACCESS METHOD ao_column, ALTER COLUMN a SET (compresstype=rle_type);
  • Pros: no extra syntax so no user education on that :)
  • Cons:
    • Have to call two subcommands so there’s chance that users can forget about this usage and do two separate ALTER TABLE commands which will incur two table rewrite. So still some user education might be needed (though I imagine it’s easier to adopt than new syntax).
    • This couldn’t specify the DEFAULT column encoding, which exists for CREATE TABLE (note that this is different than the table-level options, which would be overrided by the DEFAULT column encoding).
#4 Add an ALTER ENCODING subcommand so one can do:
   ALTER TABLE foo SET ACCESS METHOD ao_column, ALTER ENCODING (COLUMN a ENCODIONG (compresstype=rle_type));
  • Pros: can change DEFAULT encoding. It’s also a nice syntax to have if the user wants to change DEFAULT encoding for an existing AOCO table (though I’m not sure how useful it is).
  • Cons: the bad side of both #1 and #3: this is a new syntax, AND we still need to calling two subcommands.
Thoughts?

Regards,
Huansong

Alexandra Wang

unread,
Aug 2, 2022, 2:53:15 PM8/2/22
to Greenplum Developers, Huansong Fu
Hi Huansong, 

On Monday, August 1, 2022 at 11:28:22 AM UTC-7 Huansong Fu wrote:
#3 Just use the existing ALTER COLUMN … SET (…) syntax (functionality still WIP, but syntax is there). So one can do:
   ALTER TABLE foo SET ACCESS METHOD ao_column, ALTER COLUMN a SET (compresstype=rle_type);
  • Pros: no extra syntax so no user education on that :)
  • Cons:
    • Have to call two subcommands so there’s chance that users can forget about this usage and do two separate ALTER TABLE commands which will incur two table rewrite. So still some user education might be needed (though I imagine it’s easier to adopt than new syntax).
    • This couldn’t specify the DEFAULT column encoding, which exists for CREATE TABLE (note that this is different than the table-level options, which would be overrided by the DEFAULT column encoding).

I prefer option 3 because of the pros you pointed out - this is existing syntax that users are already familiar with. I don't think we need to worry about cons #1 as long as we have this usage well documented. For cons #2, can we consider adding a table-level command like:
ALTER TABLE ... SET DEFAULT ENCODING ...
Or a subcommand like:
ALTER TABLE ... SET ACCESS METHOD ..., ALTER DEFAULT ENCODING ... (as I'm writing I realized this is kind of similar to your option 4 but just for default encoding)

Thanks,
Alex



Huansong Fu

unread,
Aug 2, 2022, 5:20:49 PM8/2/22
to Greenplum Developers, Alexandra Wang, Huansong Fu
Thanks Alex! So far option 3 is preferred by all who have voiced their opinions on this topic (you and Deep who we briefly discussed over Slack). 

I agree with your point on cons #1. Avoiding new syntax should be a bigger factor than having to call two subcommands. And I partly agree with #2, too, but was just wondering if there's any usefulness of specifying DEFAULT column encoding in ALTER TABLE w/o changing the table-level option. E.g. (a random one):
    CREATE TABLE foo(c1 int, c2 int) WITH (appendonl=true, compresslevel=1); 
    ALTER TABLE t SET ACCESS METHOD ao_column ENCODING (DEFAULT COLUMN ENCODING (compresslevel=9));
So after the above commands, both c1 and c2 will have compresslevel=9 but the table-level option remains having compresslevel=1, which will still be used if any new column is added to the table. I don't know whether this is of any practical usefulness, but just want to bring it up in case it's discussion-worthy.

Thanks,
Huansong

Soumyadeep Chakraborty

unread,
Aug 4, 2022, 7:48:17 PM8/4/22
to Greenplum Developers, fhua...@vmware.com, walex...@vmware.com
Hello,

I think we can follow the following rule of thumb: introduce as little new syntax as possible and try to leverage existing syntax.
This is why I prefer option 3.
  • Multiple subcommands has been around for a long time. So users shouldn't have that much trouble. Plus, it will be up to us somewhat to educate them with the right promotional material.
  • Since we can do stuff like alter table a set access method heap, alter column i set (n_distinct=20); , in PG upstream today lets just use this.
We can treat the default column encoding as a separate command altogether (if we want to support it at all) -  seems like a nice-to-have than a must-have really (for ALTER time).
Reply all
Reply to author
Forward
0 new messages