[slurm-dev] Control task distribution across nodes


Felip Moll

Feb 12, 2013, 2:44:05 PM
to slurm-dev
Hello SLURM list!

This is my second question about job binding. As I said in another thread, I set up CR_Core_Memory with Slurm 2.4.3 on a small cluster of 15 compute nodes, each with 2 quad-core processors.


Suppose that the cluster is empty of jobs. When a user submits one serial job, it goes to node0, processor0, core0. All right.

If the same or another user submits another serial job, Slurm sends it to node1, processor0, core0. In terms of performance this is optimal, because it separates the tasks as much as it can.

In terms of energy, however, it is not optimal, because node1 could be suspended if both tasks had gone to node0.

Another problem I found the other day is that a single user sent 15 jobs to the cluster and these jobs were spread across 15 nodes, making it impossible to then run a job asking for 4 cores.

Is there any solution here? Is it possible to control how Slurm distributes tasks?


Thank you all!
Felip
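
A toy illustration of the two placement policies described in this message, assuming the 15-node cluster with 8 cores per node mentioned above. The names `place_packed`, `place_spread`, and `busy_nodes` are hypothetical, and this is only a sketch of the trade-off, not Slurm's actual node selection algorithm.

```python
NODES = 15
CORES_PER_NODE = 8  # 2 quad-core processors per node (from the message)

def place_packed(num_jobs, nodes=NODES, cores=CORES_PER_NODE):
    """Fill each node completely before moving on to the next one."""
    used = [0] * nodes
    for _ in range(num_jobs):
        for i in range(nodes):
            if used[i] < cores:
                used[i] += 1
                break
        else:
            raise RuntimeError("no free core left in the cluster")
    return used

def place_spread(num_jobs, nodes=NODES, cores=CORES_PER_NODE):
    """Put each serial job on the currently least-loaded node."""
    used = [0] * nodes
    for _ in range(num_jobs):
        i = min(range(nodes), key=lambda n: used[n])
        if used[i] >= cores:
            raise RuntimeError("no free core left in the cluster")
        used[i] += 1
    return used

def busy_nodes(used):
    """Count nodes that run at least one job and so cannot be suspended."""
    return sum(1 for u in used if u > 0)

print(busy_nodes(place_packed(15)))  # 2
print(busy_nodes(place_spread(15)))  # 15
```

In this simplified model, packing 15 serial jobs keeps 2 nodes busy and leaves 13 idle, while spreading keeps all 15 busy.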

Moe Jette

Feb 12, 2013, 4:02:06 PM
to slurm-dev

Most configurations will result in packing jobs onto a node rather than
spreading them across multiple nodes. Perhaps your jobs are consuming
all of the memory or generic resources.

Felip Moll

Feb 12, 2013, 4:41:07 PM
to slurm-dev
I will review the launched tasks. Maybe you are right and the user sent jobs asking for 6GB of RAM while the nodes have 15GB (so 2 jobs per node), and by the time I looked, one of the two jobs on the node had already finished.

Thanks for your fast answer.
Felip
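
The arithmetic behind this guess can be sketched as follows, assuming memory is tracked as a consumable resource (as with CR_Core_Memory) and using the 6GB-per-job and 15GB-per-node figures from this message. The helper `jobs_per_node` is hypothetical and deliberately simplified; it ignores generic resources and any other limits.

```python
def jobs_per_node(node_cores, node_mem_gb, job_cores, job_mem_gb):
    """Jobs that fit on one node when both cores and memory are consumed."""
    by_cores = node_cores // job_cores
    by_mem = node_mem_gb // job_mem_gb
    return min(by_cores, by_mem)

# 8 cores and 15GB per node; each serial job asks for 1 core and 6GB.
per_node = jobs_per_node(8, 15, 1, 6)
print(per_node)  # 2: memory, not cores, is the limiting resource

# Nodes needed for 15 such jobs (ceiling division).
print(-(-15 // per_node))  # 8
```

Under these assumptions only 2 jobs fit per node even though 8 cores are available, so 15 such jobs would occupy 8 nodes however tightly they are packed.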


Martin...@bull.com

Feb 12, 2013, 5:20:05 PM
to slurm-dev
Felip,
Slurm provides a lot of control over the allocation of CPU resources.  It's certainly possible to run two or more jobs on the same node at the same time with each job allocated a different set of CPUs.  For more information, see the CPU Management Guide:
http://www.schedmd.com/slurmdocs/cpu_management.html
Martin Perry
Bull Phoenix