Issues Mixing GPDB 5X and 6X Environments


Kalen Krempely

May 2, 2022, 8:49:29 PM
to Greenplum Developers
Mixing GPDB 5X and 6X environments causes errors when executing Greenplum utilities. This is especially common when multiple versions are in play, such as during an upgrade, and it results in a frustrating customer workflow and experience. Common errors include:
```
/usr/local/greenplum-db-6.19.3/bin/postgres: /usr/local/greenplum-db-5.29.1/lib/libxml2.so.2: no version information available (required by /usr/local/greenplum-db-6.19.3/bin/postgres)
# and...
stderr='Error: unable to import module: /usr/local/greenplum-db-6.20.3/lib/libpq.so.5: symbol gss_acquire_cred_from, version gssapi_krb5_2_MIT not defined in file libgssapi_krb5.so.2 with link time reference
```


Details:
The Greenplum install documentation states to add greenplum_path.sh to the user's .bashrc.
```
 Add the above source command to the gpadmin user’s .bashrc or other shell startup file so that the Greenplum Database path and environment variables are set whenever you log in as gpadmin.
```

A recent 6X PR, "[6X] support kerberos delegation in libpq" (#12926), removed the vendored libkrb library in favor of the system library. However, 5X still vendors libkrb, and the values in PATH and LD_LIBRARY_PATH take precedence over the system library. Thus, if PATH or LD_LIBRARY_PATH contains 5X values, executing a 6X utility such as gpstart picks up the wrong libkrb, resulting in the errors above.

Specifically, gpupgrade initialize calls gpinitsystem, which internally calls gpstart, which in turn connects to the segments over ssh. Even though gpupgrade explicitly clears the environment and sets the correct values before calling the 6X gpinitsystem, the command still fails if customers source the 5X greenplum_path.sh in their .bashrc as documented. When 6X gpstart connects to the segments over ssh, .bashrc sources the 5X greenplum_path.sh, which adds the 5X values to PATH and LD_LIBRARY_PATH. As a result, 6X gpstart loads the 5X libkrb library.


Current Workaround:
Before upgrading, remove the line that sources greenplum_path.sh, along with any Greenplum variables, from .bashrc or .bash_profile on all hosts. Start a new shell and ensure that PATH, LD_LIBRARY_PATH, PYTHONHOME, and PYTHONPATH are clear of any Greenplum values. Depending on the desired operation, source the correct greenplum_path.sh and set MASTER_DATA_DIRECTORY and PGPORT for either the source or the target cluster. Then continue with the Greenplum command or gpupgrade.
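
The "ensure the environment is clear" step could be checked with something like the sketch below. The function name `check_greenplum_env` is hypothetical, and matching on the substring "greenplum" is an assumption based on the install paths shown in the error messages above; installs under a different prefix would need a different pattern.

```shell
# Sketch only: report which environment variables still mention Greenplum.
# Matching on the substring "greenplum" is an assumption based on the
# install paths in the error messages (/usr/local/greenplum-db-<version>).
check_greenplum_env() {
    gp_found=0
    for gp_var in PATH LD_LIBRARY_PATH PYTHONHOME PYTHONPATH; do
        # Look up the value of the variable whose name is stored in $gp_var.
        eval "gp_value=\${$gp_var:-}"
        case "$gp_value" in
            *greenplum*)
                echo "$gp_var still contains a Greenplum value"
                gp_found=1
                ;;
        esac
    done
    # Returns 0 when the environment is clean, 1 otherwise.
    return "$gp_found"
}
```

If `check_greenplum_env` returns 0 in the fresh shell, it should be safe to source the desired greenplum_path.sh (for example `/usr/local/greenplum-db-6.19.3/greenplum_path.sh` for the target cluster) and then export MASTER_DATA_DIRECTORY and PGPORT for that cluster.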


Potential Ideas:
1) .set_greenplum_vars
- In the Greenplum install documentation, instruct users to create a `.set_greenplum_vars` file which sources greenplum_path.sh and sets any other Greenplum-specific variables such as MASTER_DATA_DIRECTORY, PGPORT, and PGDATABASE.
- Users then add a single line to their .bashrc or .bash_profile to load the Greenplum variables (i.e., `source ~/.set_greenplum_vars`).
- In a situation like the one above, customers can execute a single gpssh sed command to comment out this line in their .bashrc on all hosts. Next, they can open a fresh shell so that their environment is free of any Greenplum values. They can then source the desired greenplum_path.sh and set MASTER_DATA_DIRECTORY and PGPORT for the source or target cluster. After upgrading, they can update .set_greenplum_vars on all hosts if needed and re-enable it in their .bashrc with another gpssh sed command.
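
As a rough sketch of idea 1, the toggle could look like the following. The function names, the MASTER_DATA_DIRECTORY and PGPORT values, and the `hostfile` name are all hypothetical, and the gpssh invocation in the comment is untested; it only illustrates the shape of the command.

```shell
# Illustrative contents of ~/.set_greenplum_vars (values are assumptions):
#   source /usr/local/greenplum-db-5.29.1/greenplum_path.sh
#   export MASTER_DATA_DIRECTORY=/data/master/gpseg-1
#   export PGPORT=5432

# Comment out the "source ~/.set_greenplum_vars" line in the given rc file.
disable_greenplum_vars() {
    rc_file="$1"
    sed 's|^source ~/\.set_greenplum_vars|#&|' "$rc_file" > "$rc_file.tmp" &&
        mv "$rc_file.tmp" "$rc_file"
}

# Re-enable the line after the upgrade.
enable_greenplum_vars() {
    rc_file="$1"
    sed 's|^#source ~/\.set_greenplum_vars|source ~/.set_greenplum_vars|' \
        "$rc_file" > "$rc_file.tmp" &&
        mv "$rc_file.tmp" "$rc_file"
}

# Across all hosts, the same sed expression might be run through gpssh,
# e.g. (untested, "hostfile" is a hypothetical list of hosts):
#   gpssh -f hostfile -e "sed -i 's|^source ~/\.set_greenplum_vars|#&|' ~/.bashrc"
```

Writing to a temporary file and moving it into place avoids relying on `sed -i`, at the cost of replacing the original file (so a symlinked .bashrc would need different handling).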

2) GPVERSION variable
In the user's .bashrc, define a variable called GPVERSION that is set to the major version, such as 5, 6, or 7 (for example, GPVERSION=5). Then update all the utility code, such as gpstart, to read this variable and set or unset the environment variables from the corresponding greenplum_path.sh.
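
A minimal sketch of the lookup idea 2 implies is shown below. The function name `greenplum_home_for_version` is hypothetical, and the install prefixes are taken from the error messages above purely for illustration; a real implementation would live in the utilities themselves and would need an entry for each supported major version.

```shell
# Sketch only: map a GPVERSION value to an install prefix.
# The prefixes below are illustrative, not a documented layout.
greenplum_home_for_version() {
    case "$1" in
        5) echo "/usr/local/greenplum-db-5.29.1" ;;
        6) echo "/usr/local/greenplum-db-6.19.3" ;;
        *) echo "unsupported GPVERSION: $1" >&2; return 1 ;;
    esac
}
```

A utility or shell startup file could then run `source "$(greenplum_home_for_version "$GPVERSION")/greenplum_path.sh"` instead of hard-coding one version's path.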

3) Other ideas...?


We are looking for any input on what we can do today to help customers.

Thanks,

gpupgrade team

Ashwin Agrawal

May 2, 2022, 9:20:46 PM
to Kalen Krempely, Greenplum Developers
On Mon, May 2, 2022 at 5:49 PM Kalen Krempely <kkre...@vmware.com> wrote:
The Greenplum install documentation states to add greenplum_path.sh to the user's .bashrc.
```
 Add the above source command to the gpadmin user’s .bashrc or other shell startup file so that the Greenplum Database path and environment variables are set whenever you log in as gpadmin.
```

Before diving into solutions, I would like to understand why we recommend, or need, adding greenplum_path.sh to the .bashrc file on segment hosts. I understand that it is recommended on the coordinator for ease of use (though as a developer I moved away from modifying my .bashrc file many years ago). I also understand that even if we can restrict the recommendation to the coordinator only, the problem will not go away and will still need solving; it would just help reduce the scope and focus.

--
Ashwin Agrawal (VMware)