to slurm...@schedmd.com
I am trying to find out how, as a user, I can get information about available (unused) GPUs on a cluster.
This cluster is set up so that a user can request between 1 and 8 GPUs per node. sinfo shows the node STATE as "mix", but I can't tell how many of the 8 GPUs on a mixed-state node are in use and how many are still available for other users.
Please let me know your suggestions.
--
Thanks
Jayraj
sefa....@tubitak.gov.tr
Mar 16, 2018, 2:34:50 PM
to sefa....@tubitak.gov.tr, slurm...@schedmd.com
Thank you.
I tried that option, but it only tells me how many GPUs are attached to the node (such as gpu:8).
I want to know how many are in use and how many are free, so that I can run my jobs on the free GPUs.
--
Thanks
Jayraj
Alex Chekholko
Mar 16, 2018, 3:18:10 PM
to Slurm User Community List
There was a previous thread where someone recommended a third-party script, "pestat -G", which parses the output of 'scontrol show node' and 'scontrol show job' and perhaps adds up the used GPUs: https://github.com/OleHolmNielsen/Slurm_tools/tree/master/pestat
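For anyone who wants to see the general idea without installing anything, here is a rough sketch of that kind of parsing. It is not how pestat itself is implemented. It assumes that your Slurm version prints `Gres=` and `GresUsed=` fields in the detailed node listing (`scontrol -d show node`); if `GresUsed=` is missing on your site, the "used" column will simply read 0. The function name `gpu_free_per_node` is made up for this example.

```shell
# Hypothetical helper: reads "scontrol -d show node"-style text on stdin and
# prints one line per node with total, used and free GPU counts.
# Assumes Gres=/GresUsed= fields such as "gpu:8" or "gpu:tesla:4(IDX:0-3)".
gpu_free_per_node() {
    awk '
    # Sum the counts of all gpu entries in a gres string.
    function gpus(s,    n, i, m, items, parts, total) {
        gsub(/\([^)]*\)/, "", s)          # drop "(IDX:...)" style suffixes
        n = split(s, items, ",")
        total = 0
        for (i = 1; i <= n; i++) {
            if (items[i] !~ /^gpu/) continue
            m = split(items[i], parts, ":")
            if (parts[m] ~ /^[0-9]+$/) total += parts[m]
        }
        return total
    }
    # Print the record collected for the current node, if any.
    function report() {
        if (node != "")
            printf "%s total=%d used=%d free=%d\n", node, have, busy, have - busy
    }
    {
        for (f = 1; f <= NF; f++) {
            if ($f ~ /^NodeName=/) { report(); node = substr($f, 10); have = 0; busy = 0 }
            else if ($f ~ /^Gres=/) have = gpus(substr($f, 6))
            else if ($f ~ /^GresUsed=/) busy = gpus(substr($f, 10))
        }
    }
    END { report() }
    '
}

# Usage on a cluster (not run here):
#   scontrol -d show node | gpu_free_per_node
```

The function only reads standard input, so you can try it on a saved copy of the scontrol output before pointing it at a live cluster.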
Christopher Benjamin Coffey
Mar 16, 2018, 3:34:18 PM
to Slurm User Community List
We tell our users to do this:
squeue -h -t R -O gres | grep gpu | wc -l
The command above reports the number of GPUs in use (strictly, it counts running jobs that requested a GPU gres, so the two match only when each job uses one GPU). If the number is 16, then all of the GPUs are currently being used. If the number is 0, then all of the GPUs are available.
In our case we have 16 GPUs. There are probably better ways than that, but it works.
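If jobs on your cluster can request more than one GPU each, a variation on the one-liner above is to add up the requested counts instead of counting lines. The sketch below assumes the gres column looks like `gpu:2`, `gpu:tesla:1` or `(null)`, and it treats a bare `gpu` with no count as one GPU. The function name `sum_gpus` is made up for this example. The gres value is a per-node request, so a multi-node job would be undercounted.

```shell
# Hypothetical helper: reads one gres string per line on stdin
# (as printed by "squeue -h -t R -O gres") and prints the total GPU count.
sum_gpus() {
    awk '
    {
        # $1 drops the padding that squeue adds around the column.
        n = split($1, items, ",")
        for (i = 1; i <= n; i++) {
            if (items[i] !~ /^gpu/) continue      # skip non-GPU gres and "(null)"
            m = split(items[i], parts, ":")
            if (parts[m] ~ /^[0-9]+$/) total += parts[m]
            else total += 1                       # assumption: no count means 1
        }
    }
    END { print total + 0 }
    '
}

# Usage on a cluster (not run here):
#   squeue -h -t R -O gres | sum_gpus
```

Subtracting the result from the known total (16 in the case above) gives a rough count of free GPUs.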
Best,
Chris
—
Christopher Coffey
High-Performance Computing
Northern Arizona University
928-523-1167