[slurm-users] Available gpus ?


jayraj shah

Mar 16, 2018, 1:59:22 PM
to slurm...@schedmd.com
I am trying to find out how, as a user, I can get information about the available (unused) GPUs on a cluster.

This cluster is set up so that a user can request between 1 and 8 GPUs per node. sinfo shows the node STATE as "mix", but I can't tell how many of the 8 GPUs on a mixed-state node are in use and how many are still available for other users.


Please let me know your suggestions. 



--
Thanks
Jayraj

sefa....@tubitak.gov.tr

Mar 16, 2018, 2:34:50 PM
to jayraj...@gmail.com, slurm...@schedmd.com
You may use sinfo with format parameters, e.g. %G. See https://slurm.schedmd.com/sinfo.html


Sefa Arslan

------ Original message ------
From: jayraj shah
Date: Fri, 16 Mar 2018 21:18
Subject: [slurm-users] Available gpus ?

jayraj shah

Mar 16, 2018, 2:44:52 PM
to sefa....@tubitak.gov.tr, slurm...@schedmd.com
Thank you. 
I tried that option, but it only tells me how many GPUs are attached to each node (such as gpu:8).
I want to know how many are in use and how many are free, so that I can run my jobs on the free GPUs.
--
Thanks
Jayraj

Alex Chekholko

Mar 16, 2018, 3:18:10 PM
to Slurm User Community List
There was a previous thread where someone recommended a third-party script, "pestat -G", that parses the output of 'scontrol show node' and 'scontrol show job' and adds up the used GPUs, perhaps? https://github.com/OleHolmNielsen/Slurm_tools/tree/master/pestat
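
For anyone who wants the per-node numbers without a third-party script, here is a minimal sketch of the same idea. It is not how pestat works. It assumes a Slurm version whose 'scontrol -o show node' output carries CfgTRES= and AllocTRES= fields containing a gres/gpu=N entry, which depends on the version and on the cluster's accounting configuration. The function name free_gpus_per_node is made up for illustration.

```shell
# Hypothetical helper: reads `scontrol -o show node` output (one node per line)
# on stdin and prints "<node> <free_gpus> <total_gpus>" for every GPU node.
# Assumption: CfgTRES/AllocTRES contain "gres/gpu=N"; not every site has this.
free_gpus_per_node() {
  awk '
    # Return N from the "gres/gpu=N" item of a "Key=a=1,b=2,..." field, else 0.
    function gpu_count(field,   items, n, j) {
      sub(/^[A-Za-z]+=/, "", field)            # drop the leading "CfgTRES=" etc.
      n = split(field, items, ",")
      for (j = 1; j <= n; j++)
        if (items[j] ~ /^gres\/gpu=/) return substr(items[j], 10) + 0
      return 0
    }
    {
      name = ""; total = 0; used = 0
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^NodeName=/)       name  = substr($i, 10)
        else if ($i ~ /^CfgTRES=/)   total = gpu_count($i)
        else if ($i ~ /^AllocTRES=/) used  = gpu_count($i)
      }
      if (total > 0) print name, total - used, total
    }'
}

# Usage on a cluster (not run here):
#   scontrol -o show node | free_gpus_per_node
```

If those fields are missing on your cluster, the function prints nothing rather than a wrong number, and summing the job requests from squeue is the fallback.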

Christopher Benjamin Coffey

Mar 16, 2018, 3:34:18 PM
to Slurm User Community List
We tell our users to do this:

squeue -h -t R -O gres | grep gpu|wc -l

The command above counts the running jobs that hold a GPU, which equals the number of GPUs in use as long as each job requests a single GPU. If the number is 16, then all of the GPUs are currently being used. If it prints 0, then all of the GPUs are available.

In our case we have 16 GPUs. There are probably better ways than that, but it works.
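
When jobs can request more than one GPU, counting lines undercounts. A rough sketch that sums the requested GPUs instead follows. It assumes the GRES string printed by squeue's %b looks like gpu:2, gpu:tesla:2, a bare gpu (one GPU), or gres:gpu:2 on newer versions, and that the request is per node, so it is multiplied by the node count from %D. The function name sum_job_gpus is made up for illustration.

```shell
# Hypothetical helper: reads "<node_count> <gres>" lines on stdin, as printed by
#   squeue -h -t R -o '%D %b'
# and prints the total number of GPUs requested by those jobs.
# Assumption: GRES is a per-node request; formats differ between Slurm versions.
sum_job_gpus() {
  awk '{
    gres = $2
    gsub(/\([^)]*\)/, "", gres)                # drop "(null)" and "(...)" suffixes
    n = split(gres, items, ",")
    for (i = 1; i <= n; i++) {
      if (items[i] !~ "(^|[:/])gpu([:=]|$)") continue   # skip non-GPU resources
      m = split(items[i], f, /[:=]/)
      count = (f[m] ~ /^[0-9]+$/) ? f[m] : 1   # a bare "gpu" means one GPU
      total += $1 * count
    }
  } END { print total + 0 }'
}

# Usage on a cluster (not run here):
#   squeue -h -t R -o '%D %b' | sum_job_gpus
```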

Best,
Chris


Christopher Coffey
High-Performance Computing
Northern Arizona University
928-523-1167



Nadav Toledo

Mar 18, 2018, 7:26:47 AM
to slurm...@lists.schedmd.com
Hey all,
We use the numbers from the following commands:
sinfo -o %G (as suggested above) - gives the total number of GPUs in the cluster

squeue -o %b - gives the number of GPUs in use by each running job
Summing all the numbers under %b gives the number of GPUs in use across the cluster.
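
The "total" half of that subtraction could be sketched roughly as below. It assumes node GRES strings such as gpu:8 or gpu:tesla:4(S:0-1), and it de-duplicates nodes because 'sinfo -N' lists a node once per partition it belongs to. The function name total_gpus is made up for illustration.

```shell
# Hypothetical helper: reads "<node> <gres>" lines on stdin, as printed by
#   sinfo -h -N -o '%N %G'
# and prints the number of GPUs configured across all distinct nodes.
total_gpus() {
  awk '!seen[$1]++ {                           # count each node only once
    gres = $2
    gsub(/\([^)]*\)/, "", gres)                # drop "(null)" and "(S:0-1)" parts
    n = split(gres, items, ",")
    for (i = 1; i <= n; i++) {
      if (items[i] !~ "(^|:)gpu(:|$)") continue
      m = split(items[i], f, ":")
      total += (f[m] ~ /^[0-9]+$/) ? f[m] : 1
    }
  } END { print total + 0 }'
}

# Usage on a cluster (not run here), with $used being the sum of the %b column:
#   total=$(sinfo -h -N -o '%N %G' | total_gpus)
#   echo "free GPUs: $((total - used))"
```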

pestat was suggested by Ole, but the flag it requires is not supported yet.

There are