Mixing GPDB 5X and 6X environments causes issues when executing Greenplum utilities. This is especially common when multiple versions are in play, such as during an upgrade, and it results in a frustrating customer workflow and experience. Common errors include:
```
/usr/local/greenplum-db-6.19.3/bin/postgres: /usr/local/greenplum-db-5.29.1/lib/libxml2.so.2: no version information available (required by /usr/local/greenplum-db-6.19.3/bin/postgres)
# and...
stderr='Error: unable to import module: /usr/local/greenplum-db-6.20.3/lib/libpq.so.5: symbol gss_acquire_cred_from, version gssapi_krb5_2_MIT not defined in file libgssapi_krb5.so.2 with link time reference
```
Details:
The Greenplum installation documentation instructs users as follows:
```
Add the above source command to the gpadmin user’s .bashrc or other shell startup file so that the Greenplum Database path and environment variables are set whenever you log in as gpadmin.
```
The recent 6X PR "[6X] support kerberos delegation in libpq #12926" removed the vendored libkrb library in favor of using the system library. However, 5X still vendors libkrb, and its entries in PATH or LD_LIBRARY_PATH take precedence over the system library. Thus, if one’s PATH or LD_LIBRARY_PATH contains 5X values, executing 6X utilities such as gpstart picks up the wrong libkrb, resulting in the above errors.
Specifically, during gpupgrade initialize, gpinitsystem is called, which internally calls gpstart, which in turn ssh's to the segments. Even though gpupgrade explicitly clears the environment and sets the correct values before calling the 6X gpinitsystem, the command still fails if customers source the 5X greenplum_path.sh in their .bashrc as documented. This is because when 6X gpstart ssh's to the segments, .bashrc sources the 5X greenplum_path.sh, which adds the 5X values to PATH and LD_LIBRARY_PATH and causes the 5X libkrb library to be used for 6X gpstart.
Current Workaround:
Before upgrading, remove the sourcing of greenplum_path.sh and any Greenplum variables from .bashrc or .bash_profile on all hosts. Start a new shell and ensure PATH, LD_LIBRARY_PATH, PYTHONHOME, and PYTHONPATH are clear of any Greenplum values. Depending on the desired operation, source the correct greenplum_path.sh and set MASTER_DATA_DIRECTORY and PGPORT for either the source or target cluster. Then continue with running a Greenplum command or gpupgrade.
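As a rough illustration of the "ensure the environment is clear" step, a small check like the one below could be run in the fresh shell. The function name `gp_env_residue` is made up for this sketch, and matching on the substring "greenplum" assumes the install directories are named like the /usr/local/greenplum-db-* paths in the errors above.

```shell
# Sketch only: gp_env_residue is a hypothetical helper, not a Greenplum utility.
# It prints every entry of the listed variables that mentions "greenplum"
# (assumes install paths look like /usr/local/greenplum-db-<version>).
gp_env_residue() (
    for var in PATH LD_LIBRARY_PATH PYTHONHOME PYTHONPATH; do
        # Indirectly read the variable named in $var (empty if unset).
        eval "value=\${$var:-}"
        # One entry per line, keep the Greenplum ones, prefix with the variable name.
        printf '%s\n' "$value" | tr ':' '\n' | grep -i 'greenplum' | sed "s|^|$var: |"
    done
)

# Usage in a fresh shell (paths and values are placeholders for your cluster):
#   gp_env_residue        # should print nothing before continuing
#   source /usr/local/greenplum-db-6.19.3/greenplum_path.sh
#   export MASTER_DATA_DIRECTORY=<master data directory of the source or target cluster>
#   export PGPORT=<port of the same cluster>
```

If the function prints anything, some startup file is still adding Greenplum values and should be cleaned up first.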
Potential Ideas:
1) .set_greenplum_vars
- In the Greenplum Install documentation, instruct users to create a `.set_greenplum_vars` file which sources greenplum_path.sh and sets any other Greenplum-specific variables such as MASTER_DATA_DIRECTORY, PGPORT, and PGDATABASE.
- Users will then add a single line to their .bashrc or .bash_profile to load the Greenplum variables (e.g., `source ~/.set_greenplum_vars`).
- In a situation like the one above, customers can execute a single gpssh sed command to comment out this line in their .bashrc on all hosts. Next, they can open a fresh shell so their environment is free of any Greenplum values. They can then source the desired greenplum_path.sh and set MASTER_DATA_DIRECTORY and PGPORT for the source or target cluster. After upgrading, they can update .set_greenplum_vars on all hosts if needed and re-enable it in their .bashrc with the gpssh sed command.
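To make idea 1 a bit more concrete, here is a minimal sketch of the toggle. The functions `disable_gp_vars` and `enable_gp_vars` are hypothetical names, the contents shown for `.set_greenplum_vars` are placeholders, and `hostfile` in the gpssh comment is assumed to be a file listing all cluster hosts. The gpssh line is an untested example of the kind of command we have in mind and assumes GNU sed on the hosts.

```shell
# Sketch only; nothing here ships with Greenplum today.
#
# ~/.set_greenplum_vars would hold something like (placeholder values):
#   source /usr/local/greenplum-db-5.29.1/greenplum_path.sh
#   export MASTER_DATA_DIRECTORY=<source cluster master data directory>
#   export PGPORT=<source cluster port>

# The single line users would add to .bashrc or .bash_profile.
GP_VARS_LINE='source ~/.set_greenplum_vars'

# Comment out that line in a startup file (default ~/.bashrc), keeping a .bak copy.
disable_gp_vars() {
    rc_file=${1:-$HOME/.bashrc}
    cp "$rc_file" "$rc_file.bak" &&
        sed "s|^${GP_VARS_LINE}\$|#${GP_VARS_LINE}|" "$rc_file.bak" > "$rc_file"
}

# Re-enable the line after the upgrade.
enable_gp_vars() {
    rc_file=${1:-$HOME/.bashrc}
    cp "$rc_file" "$rc_file.bak" &&
        sed "s|^#${GP_VARS_LINE}\$|${GP_VARS_LINE}|" "$rc_file.bak" > "$rc_file"
}

# Across the cluster, the same edit could go through gpssh, something like:
#   gpssh -f hostfile -e "sed -i 's|^source ~/.set_greenplum_vars\$|#&|' ~/.bashrc"
```

Because only one well-known line is toggled, the rest of the user's startup file is left alone, which is the main appeal of this approach.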
2) GPVERSION variable
In the user's .bashrc, create a variable such as GPVERSION=5 that is set to the major version (5, 6, or 7). Then update all the utility code, such as gpstart, to read this variable and unset or set the correct greenplum_path.sh environment variables.
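A minimal sketch of what idea 2 might look like in a shared shell helper that utilities run on startup. The names `strip_gp_entries` and `use_gpversion`, and the `GP_HOME_<major>` variables that point at each install directory, are all hypothetical; no such mechanism exists in the utilities today.

```shell
# Sketch only: every name below is hypothetical.

# Drop each ':'-separated entry that mentions "greenplum" from a path list.
strip_gp_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -vi 'greenplum' | paste -s -d ':' -
}

# Reset the environment to the install selected by GPVERSION.
# Assumes GP_HOME_5, GP_HOME_6, ... name the install directory of each major version.
use_gpversion() {
    case "${GPVERSION:-}" in
        5|6|7) ;;
        *) echo "GPVERSION must be 5, 6, or 7" >&2; return 1 ;;
    esac
    eval "gp_home=\${GP_HOME_${GPVERSION}:-}"
    if [ -z "$gp_home" ]; then
        echo "GP_HOME_${GPVERSION} is not set" >&2
        return 1
    fi
    # Clear values left behind by any other version before sourcing.
    PATH=$(strip_gp_entries "$PATH")
    LD_LIBRARY_PATH=$(strip_gp_entries "${LD_LIBRARY_PATH:-}")
    PYTHONPATH=$(strip_gp_entries "${PYTHONPATH:-}")
    unset PYTHONHOME
    # Load the environment for the requested major version.
    . "$gp_home/greenplum_path.sh"
}
```

The key design point is that stale entries are removed before the new greenplum_path.sh is sourced, so a 5X value inherited from .bashrc cannot shadow the 6X libraries.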
3) Other ideas...?
We are looking for any input on what we can do today to help customers.
Thanks,
gpupgrade team