Hi folks,
I want to start a discussion about gp_default_storage_options and its compatibility from GP6 to GP7.
In GP7, a new GUC (default_table_access_method) was introduced and due to its overlap with gp_default_storage_options in GP7, several options of gp_default_storage_options were removed.
For example, you will see the following error when trying to set appendonly for gp_default_storage_options in GP7:
postgres=# SET gp_default_storage_options = 'appendonly=true, compresstype=rle_type';
ERROR: invalid storage option "appendonly"
HINT: For table access methods use "default_table_access_method" instead
NOTE: this issue also applies for orientation and appendoptimized option.
SCENARIO: on top of my mind, there are two scenarios that the issue can have a big enough impact for a user (there might be more that I'm missing?).
1. Upgrade from GP6 to GP7 will have issues, specially if they have roles that modified gp_default_storage_options.
postgres=# ALTER ROLE foo SET gp_default_storage_options TO 'appendonly=true,compresstype=zstd,compresslevel=3';
ERROR: invalid storage option "appendonly"
HINT: For table access methods use "default_table_access_method" instead.
Upgrading from GP6 to GP7 will definitely fail due to this (or at the very least have inconsistencies). If we ignore this problem, then the tables that are created GP7 will not have the same options compared to the GP6 counterpart.
2. Users may have scripts based on GP6 with gp_default_storage_options that include the removed options; the scripts will no longer work for GP7.
Do we want to introduce backwards compatibility for this and have a seamless transition to GP7 for their scripts or will the users need to modify their script to adhere to GP7 standard?
APPROACH: assuming that we agree that we need some form of backwards compatibility, here are some approaches:
1. Enable GP7 pg_dump/pg_dumpall to be able to modify SQL when dumping from a GP6 cluster.
2. Enable pg_restore to read old dumps and modify the SQL on the fly to restore on GP7.
3. Enable to have GPDB (server) have backwards compatibility and it will handle things internally with in the server. (keep GP6 syntax intact).
4. gpupgrade will create a script that will update the dumps to the GP7 version.
If I'm missing other interesting approaches for this issue, please feel free to chime in.
Some observations with the approaches.
Approach 1,2, and 4 do not handle scenario 2.
If we go for approach 3, then it opens up more questions about resolving conflicting options, who takes priority, etc.
For example, gp_default_storage_options and default_table_access_method both have option appendonly, but one is set to true and the other as false.
Who is the correct one?
Input into about this topic would greatly appreciated!
Thank you,
Marbin