Dear all,
I am trying to use NHC on the Rocks Cluster distribution, which uses SGE as its job scheduler.
The instructions in the NHC git repository are heavily oriented toward SLURM and TORQUE. It would be nice if you could add a short "SGE Integration" section. Right now I'm not sure what the best practice is for using NHC with SGE.
I tried a couple of approaches. Currently I install NHC in a shared directory, run it periodically on every node, and have my own script update the status reported by my load_sensor. Finally, I check which nodes are in alert status. I doubt this is the best practice, though.
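For reference, the load-sensor approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual script in use: the NHC path and the complex name `nhc_alert` are hypothetical placeholders, and it assumes NHC exits with a non-zero status when a check fails. The loop follows the usual SGE load sensor convention, where the execution daemon writes a line to the sensor's stdin at each load interval and sends `quit` to stop it.

```python
#!/usr/bin/env python3
"""Sketch of an SGE load sensor that reports NHC's result as a load value."""
import socket
import subprocess
import sys

NHC_CMD = ["/shared/nhc/sbin/nhc"]  # hypothetical path on the shared directory
COMPLEX_NAME = "nhc_alert"          # hypothetical complex attribute name


def run_nhc(cmd=NHC_CMD):
    """Return 0 if the command exits cleanly, 1 otherwise.

    A command that cannot be started at all is also reported as 1.
    """
    try:
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except OSError:
        return 1
    return 0 if result.returncode == 0 else 1


def format_report(host, name, value):
    """Build one load report block in the begin/host:name:value/end format."""
    return f"begin\n{host}:{name}:{value}\nend\n"


def main():
    # Assumption: this host name matches the name SGE knows the node by.
    host = socket.gethostname()
    for line in sys.stdin:
        if line.strip() == "quit":
            break
        sys.stdout.write(format_report(host, COMPLEX_NAME, run_nhc()))
        sys.stdout.flush()  # the report must not sit in a buffer


if __name__ == "__main__":
    main()
```

With something like this in place, the complex would still need to be defined in SGE and the script registered as the load sensor for the hosts; nodes reporting `1` are the ones in alert status.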
My real question is: what is the correct way to use NHC on SGE clusters?
Sincerely,
Eisa