Hello,
Is there support within CCTools for managing custom resources in an HPC environment?
My use case is distributed simulations, where some of the simulations in our testbed have a hard limit on the number of instances that can run simultaneously. Currently, we manage those resources by "checking out" a specific machine on a whiteboard, which is prone to human error and definitely causes efficiency problems.
An example of what I'm looking for is:
Name | # Concurrent Instances
Sim A | No Limit (CPU only)
Sim B | 5 Instances
Sim C | No Limit (CPU only)
I have 3 different runs to submit:
Run A requires:
1x Simulation A
5x Simulation B
Run B requires:
1x Simulation A
1x Simulation C
Run C requires:
1x Simulation A
1x Simulation B
1x Simulation C
If I submit Run A, B, and C at the same time, I'd like Run A and Run B to start successfully, while Run C waits for Run A to complete, because Run A holds all 5 instances of Sim B and none remain available.
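To make the desired behavior concrete, here is a minimal sketch of the admission rule I have in mind, written in plain Python. It does not use any CCTools API; the names `LIMITS`, `RUNS`, and `admit` are hypothetical and only for illustration, and it assumes Sim B is the only limited simulation.

```python
# Hypothetical sketch of the desired admission rule; not a CCTools API.
# Simulations absent from LIMITS are treated as unlimited (CPU only).
LIMITS = {"Sim B": 5}

RUNS = [
    ("Run A", {"Sim A": 1, "Sim B": 5}),
    ("Run B", {"Sim A": 1, "Sim C": 1}),
    ("Run C", {"Sim A": 1, "Sim B": 1, "Sim C": 1}),
]


def admit(runs, limits):
    """Start runs in submission order when their limited resources fit.

    Returns (started, waiting) as lists of run names.
    """
    in_use = {sim: 0 for sim in limits}
    started, waiting = [], []
    for run_name, needs in runs:
        # A run fits if every limited simulation it needs has enough free instances.
        fits = all(
            in_use[sim] + count <= limits[sim]
            for sim, count in needs.items()
            if sim in limits
        )
        if fits:
            for sim, count in needs.items():
                if sim in limits:
                    in_use[sim] += count
            started.append(run_name)
        else:
            waiting.append(run_name)
    return started, waiting


started, waiting = admit(RUNS, LIMITS)
print("started:", started)  # started: ['Run A', 'Run B']
print("waiting:", waiting)  # waiting: ['Run C']
```

In a real system, Run C would be re-evaluated and started once Run A finishes and releases its 5 instances of Sim B.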
We have an unusual HPC setup: it requires both Windows and Linux to interoperate, and policy restrictions prevent us from using containers or VMs, or from leveraging a system like k8s to spin up resources as necessary.
It's been many years since I've used CCTools, but no matter where I look for a "close-enough" solution, I can't stop thinking that the various components of CCTools might be the answer.
Thank you for your help!
V/R
Nick Callahan