OS replacement on Endpoint


Axel Haenssen

Jun 7, 2024, 2:58:12 PM
to Discuss
We have a Globus Endpoint running on RHEL7 and need to upgrade to RHEL9. Can't do an in-place upgrade so it will be a re-install. Can anyone provide guidance on how to preserve the Endpoint and put it back after the install? Thanks for any pointers.
Axel

Lev Gorenstein

Jun 7, 2024, 4:39:44 PM
to Axel Haenssen, Discuss

Hello Axel,

An in-place upgrade and a full reinstall follow very similar procedures, with just a few caveats to keep an eye on. For single-DTN endpoints you are of course bound to have a [small] downtime, while for multi-DTN ones you should be able to do it in a rolling fashion without service interruption.

From a very high-level overview, all of your endpoints/storage gateways/collections configurations are stored in the Globus service. So as long as you have that precious deployment-key.json file, you can redeploy your DTNs and resurrect everything.

The crux is that on a given endpoint, all active DTNs must run the same GCS version (the OS version does not matter much, but the GCS version does), and there was a breaking change in the deployment key format in GCS 5.4.67. In other words, if your current GCS is older, the GCS update and key conversion must be taken care of first, and only then followed by the OS upgrade.
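As a rough illustration of that ordering rule, here is a minimal shell sketch. The function name gcs_update_status is hypothetical, the installed version is passed in by hand rather than detected, and it assumes a sort that supports -V (GNU coreutils does). A result of "ok" only means the version is not older than 5.4.67; it says nothing about whether the key itself has been converted.

```shell
# Hypothetical helper: decide whether GCS must be updated before the OS work.
# Assumption: 5.4.67 is the release that changed the deployment key format,
# as stated in the post. The installed version is supplied as an argument.
gcs_update_status() {
    current=$1
    threshold=5.4.67

    if [ "$current" = "$threshold" ]; then
        echo "ok"
        return 0
    fi

    # sort -V orders dotted version strings numerically, so 5.4.9 < 5.4.67.
    oldest=$(printf '%s\n%s\n' "$current" "$threshold" | sort -V | head -n 1)

    if [ "$oldest" = "$current" ]; then
        # Older than the threshold: update GCS and convert the key first.
        echo "update-gcs-first"
    else
        echo "ok"
    fi
}
```

For example, `gcs_update_status 5.4.61` reports that the GCS update has to come first, while a version at or past 5.4.67 does not.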

We have something of a standard process for sites looking to do an OS upgrade for a node in their endpoint, which I’m including below for your convenience. Note that you’ll need to update your GCS install on the existing node so as to be able to perform the deployment key conversion to the new format. You do NOT want to be in a position where you are trying to deploy a new node in your endpoint using the latest version of GCS with only an old style deployment key. You’ll want to be sure you get a converted copy of your deployment key before you blow away your old node.

A)
You will first want to determine whether your deployment key is in the old format or the new format. The distinguishing characteristic is that the new format has a client_id field, so you could just verify it visually… or use this command (note: you may need to install jq and sqlite3 if they are not already present):

# Checks that the deployment key contains a 'client_id' field and that the client id matches the endpoint id of the local node

unset dkid; unset dbid; dkid=$(jq -r '.client_id' deployment-key.json); dbid=$(sqlite3 ~gcsweb/gcs54.db "select value from configuration where name is 'client_id';"); echo "Deployment Key ID is: " $dkid; echo "gcs54.db ID is: " $dbid; if [ "$dkid" != "$dbid" ]; then echo "IDs don't match"; else echo "IDs match"; fi

If the IDs match, then you’re on the new format. If not, then you’ll want to convert your deployment key over to the new format with this command: https://docs.globus.org/globus-connect-server/v5.4/reference/endpoint/key-convert/

If the key-convert subcommand is not present on your system, then you will need to update your GCS install to the latest GCS version as discussed here first: https://docs.globus.org/globus-connect-server/v5/#upgrading_globus_connect_server

Once you’ve updated your install, you’ll then convert your key as discussed above.

DO NOT attempt to proceed further without having ensured you have a new format deployment key for your endpoint!
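If jq is not available on the node, the visual check described above can be approximated with grep alone. This is only a sketch: the function name key_format is hypothetical, and it tests solely for the presence of a client_id field. It does not compare the value against gcs54.db the way the one-liner above does.

```shell
# Hypothetical helper: classify a deployment key file by format.
# Assumption: a new-format key contains a "client_id" field (per the post).
# Exit status: 0 = new format, 1 = old format, 2 = file not readable.
key_format() {
    key_file=$1

    if [ ! -r "$key_file" ]; then
        echo "unreadable"
        return 2
    fi

    # Look for a JSON member named client_id; this is a text match,
    # not a full JSON parse.
    if grep -q '"client_id"[[:space:]]*:' "$key_file"; then
        echo "new"
    else
        echo "old"
        return 1
    fi
}
```

A result of "old" means the key still needs to go through key-convert before any node is removed.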

B)
Next, the OS upgrade. The pattern here is “remove node from endpoint, do OS upgrade, add back to the endpoint”. There are small variations depending on whether you are reimaging a given server, or standing up an entirely new server to replace the old.

Here’s the general process for upgrading the OS of a system hosting a GCSv5 node:

  1. Make sure you’ve got a backup of your endpoint’s deployment key. Do not proceed without this.

  2. Remove the node from the endpoint using the ‘globus-connect-server node cleanup’ command (https://docs.globus.org/globus-connect-server/v5.4/reference/node/cleanup/)

  3. Remove the Globus software and config from the system as per our doc here: https://docs.globus.org/globus-connect-server/v5.4/#removing_globus_software_from_a_node

  4. Upgrade the OS / drop down a new image of a new OS, as per your standard procedure for this process

  5. Install the GCSv5 software on the new OS as per our doc here: https://docs.globus.org/globus-connect-server/v5.4/#gcsv5-install

  6. Deploy the system as a new node in the existing endpoint as per our doc here: https://docs.globus.org/globus-connect-server/v5.4/#add-dtns

If you just want to drop down a new image - totally erasing and overwriting the old - you can follow this same process but omit step 3.

If you want to simply replace an old server with a new one, then just run steps 1 and 2 on the old server, then run steps 5 and 6 on the new server.
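Since step 1 is the one everything else depends on, here is a minimal sketch of a backup that verifies the copy before you go any further. The function name backup_deployment_key and the timestamped file naming are assumptions for illustration. The destination directory should live on storage that survives the reinstall (another host or a mounted share), which the function itself cannot check.

```shell
# Hypothetical helper: copy the deployment key and verify the copy.
# Usage: backup_deployment_key /path/to/deployment-key.json /path/to/backup/dir
# Prints the path of the verified backup on success.
backup_deployment_key() {
    src=$1
    dest_dir=$2

    if [ ! -r "$src" ]; then
        echo "cannot read $src" >&2
        return 1
    fi

    mkdir -p "$dest_dir" || return 1
    dest="$dest_dir/$(basename "$src").$(date +%Y%m%d%H%M%S).bak"

    cp -p "$src" "$dest" || return 1
    chmod 600 "$dest"   # the key is a secret; keep it owner-readable only

    # Byte-for-byte comparison of the original and the copy.
    if cmp -s "$src" "$dest"; then
        echo "$dest"
    else
        echo "backup differs from source" >&2
        return 1
    fi
}
```

Only move on to the node cleanup in step 2 once this prints a path and you have confirmed the file is reachable from somewhere other than the node being rebuilt.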

Hope this helps, and let us know if you have any questions!

Lev




--
Lev Gorenstein
Solutions Architect
Globus // University of Chicago
e: l...@globus.org