Karl's Globus-related NHC items

Karl Kornel

Oct 22, 2024, 6:51:11 PM
to dis...@globus.org

Hello!

 

A while ago (before GCS 5.4.78 came out) I had cause to watch the status of the GCS services a little more closely.  We put our DTNs into SLURM, even though we don't run jobs on them, so that node problems (typically identified by NHC) are flagged to us through SLURM, which drains the affected node.

 

While doing some NHC configuration updates, I thought others might be interested in the Globus-related NHC entries I have, so I’m sharing them here!

 

Here are the nhc.conf entries I’m using:

 

HOSTNAME            || check_ps_service globus-gridftp-server
HOSTNAME            || check_ps_service -u apache httpd
HOSTNAME            || check_ps_service -f -m "* /opt/globus/bin/gunicorn *" -u gcsweb gcs_manager
HOSTNAME            || check_file_test -S /run/gcs_manager.sock
HOSTNAME            || check_file_test -S /var/run/globus-connect-server/control
HOSTNAME            || check_file_test -S /var/run/globus-connect-server/ipc
HOSTNAME            || check_ps_service -f -m "* /opt/globus/bin/globus-connect-server assistant" -u gcsweb gcs_manager_assistant

 

Lines 1, 2, 3-6, and 7, respectively, are used to monitor GridFTP, Apache, the GCS Manager, and the GCS Manager Assistant.  The GCS Manager and GCS Manager Assistant are Python-based services, so matching the correct process name is a little complicated.  Also, this is on an Enterprise Linux system; your Apache process and/or user names may be different.
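
To illustrate why the two Python-based entries need a match string, here is a small sketch of how those patterns behave as shell-style globs. It assumes NHC treats the -m string as a glob applied to the full command line when -f is given, which is my reading of NHC's match-string rules and not something I have checked against its source. Python's fnmatch is only a stand-in for NHC's own matching, and the two command lines (including the /usr/bin/python3 interpreter path and the gunicorn option) are invented for illustration, not copied from a real GCS host.

```python
from fnmatch import fnmatchcase

# Match strings copied from the nhc.conf entries above.
GCS_MANAGER = "* /opt/globus/bin/gunicorn *"
GCS_ASSISTANT = "* /opt/globus/bin/globus-connect-server assistant"

# Hypothetical command lines, invented for illustration only.
# For a Python service, the first word is the interpreter, not the service.
manager_cmd = "/usr/bin/python3 /opt/globus/bin/gunicorn --hypothetical-option"
assistant_cmd = "/usr/bin/python3 /opt/globus/bin/globus-connect-server assistant"


def matches(cmdline, pattern):
    """Return True if the whole command line matches the glob pattern."""
    return fnmatchcase(cmdline, pattern)


if __name__ == "__main__":
    for name, cmd, pat in [
        ("manager", manager_cmd, GCS_MANAGER),
        ("assistant", assistant_cmd, GCS_ASSISTANT),
    ]:
        print(name, matches(cmd, pat))
```

The leading "* " in each pattern is what absorbs the interpreter path, so the match keys on the script path that follows it. The assistant pattern has no trailing "*", so under these assumptions it would stop matching if extra arguments were appended after "assistant".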

 

Also, we don’t run any other web services on our DTNs, so checking “Is Apache running?” is enough for us.

 

Finally, the socket-file checks on Lines 4-6 might not really be needed.  All NHC does is check if the socket file exists, but it doesn't check if the socket file is actually hooked up to anything.  Still, I decided to add them anyway.
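
For anyone who wants to go one step past "the socket file exists", here is a minimal sketch of a liveness probe that tries to connect to the socket. This is not part of NHC or GCS: the function name is hypothetical, and it assumes the socket is a stream (SOCK_STREAM) socket, which I have not verified for the three GCS sockets. I also don't know how the GCS services react to a connection that opens and immediately closes, so treat this as an idea to test, not something to drop onto a production DTN.

```python
import socket


def unix_socket_accepts(path, timeout=2.0):
    """Return True if a stream connection to the Unix socket at `path` succeeds.

    A stale socket file (the file exists but nothing is listening) makes
    connect() fail, so this returns False where a plain existence test passes.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return True
    except OSError:
        # Covers missing file, connection refused, permission denied, timeout.
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    # Path taken from the nhc.conf entries above.
    print(unix_socket_accepts("/run/gcs_manager.sock"))
```

Wiring something like this into NHC would need a custom check, which I have not written. The sketch only shows the difference between "file exists" and "something is listening".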

 

Hopefully these NHC entries are useful to others!

 

-- 

A. Karl Kornel | Info. Sys. Specialist

UIT Research Computing | Stanford University

+1 (650) 736-932
