Express jobs on Slurm

jonatan.smp

Sep 4, 2014, 5:03:57 AM9/4/14
to genome-au-c...@googlegroups.com
On the old cluster there used to be a feature where you could submit a job asking for 1 node for 1 hour, and it would (most of the time) start immediately. I found this highly useful for running quick programs or testing code before submitting a large job. However, this has not worked for me on Slurm. I don't know if this is because the feature has been disabled (although some jobs are labelled express), or because the express queue is constantly heavily booked.

Anyway, I guess my question is whether this is an important feature for other people than me, and if so, whether steps could be taken to reinstate it.

Anders Halager

Sep 4, 2014, 5:14:28 AM9/4/14
to genome-au-c...@googlegroups.com
It is already supported. If you submit a job with a time limit of 1 hour or less, it will automatically be eligible for the express partition. There are 3 express machines right now (s02n[61-63]); we will add a few more soon.
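
For example (my_test.sh is just a hypothetical script name), a submission like

sbatch -N 1 -t 01:00:00 my_test.sh

asks for 1 node for at most 1 hour and would therefore be eligible for express.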

Anders.

jonatan.smp

Sep 4, 2014, 5:17:44 AM9/4/14
to genome-au-c...@googlegroups.com
OK, but then what about some sort of limit, so that each user can submit at most, say, 2 express jobs? That way it might actually be express. But again, I don't want a big deal made out of this if I'm the only one who finds the feature important.

Palle Villesen

Sep 5, 2014, 6:08:47 AM9/5/14
to genome-au-c...@googlegroups.com
If the other jobs are then submitted to the normal queue, that's fine with me.

I agree with Jonatan that we should reserve a few (3) nodes for these very small jobs, and that they should be readily available for testing code etc.

Henrik and I (palle and heho) didn't ask Slurm to use the express queue, but because our thousands of jobs are quick jobs, Slurm automatically submits them to both express and normal, making it impossible for other users to use the express nodes.

That is not really optimal for the users (but it IS optimal for the cluster load), so maybe jobs should only be submitted to the express queue when people specifically ask for it?

I agree that it is necessary for people to have quick access to a few nodes for development; otherwise people start running on fe2 (some already do).

best,
.p




jakob.skou

Sep 5, 2014, 7:11:27 AM9/5/14
to genome-au-c...@googlegroups.com

Is it possible to use different weighting schemes for the two queues, even though they have a common entry point? In my view it would be optimal if jobs with a short overall expected wall time were given top priority within the express queue. It would then not matter if other, larger jobs were given access and were already running on the express nodes, as long as they would complete within an hour anyway.

my 2c.

Jakob

Rune Møllegaard Friborg

Sep 9, 2014, 4:30:18 AM9/9/14
to genome-au-c...@googlegroups.com

We have added a few nodes to the express queue, so it now has 8 nodes. For now...

Regarding configuration options: the challenge is to find the express queue configuration with the fewest negative consequences. There might be one better than the current setup.

Let's review.

Current configuration:
 * Jobs are automatically submitted to both normal and express if the job has a wall time of 1 hour or less. 8 nodes in the express queue.
   (neg) If many jobs are submitted with a wall time of less than 1 hour, then they may block the express queue for some hours.
   (pos) 8 nodes available for short jobs, thus real work can be done on the express queue.

Configuration alternatives:
 * No jobs are automatically sent to the express queue. 2 nodes in the express queue.
   (neg) The nodes in the express queue cannot as easily be utilised for computations, so fewer nodes are reserved for this queue.
   (pos) They will not be blocked by short jobs sent to the normal queue.

 * Limit the express queue such that no user can have more than 2 jobs in the queue, and no jobs are automatically sent to the express queue. 2 nodes reserved.
   (neg) The nodes would likely be unused most of the time, especially during non-work hours.
   (pos) No user would be able to block the express queue for more than an hour.

 * No jobs are automatically sent to the express queue. All jobs sent to express will additionally be sent to normal. 2 nodes reserved. (See the example below the list.)
   (pos) Users blocking the express queue with many jobs would be doing it intentionally, so they could be reprimanded.
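
For illustration (a rough sketch only; pilot.sh is a hypothetical script name): under this last option nothing lands on express unless the user asks for it, and an explicit request such as

sbatch -p express -t 00:30:00 pilot.sh

would effectively be treated like

sbatch -p express,normal -t 00:30:00 pilot.sh

i.e. the job also remains eligible for the normal queue.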

I prefer the last configuration on this list. What do you think?

/ Rune



Palle Villesen

Sep 22, 2014, 6:46:01 AM9/22/14
to genome-au-c...@googlegroups.com
I have changed my scripts so that all short jobs request 01:01:00 (deliberately MORE than 1 hour, so they stay off the express queue), but as it is, someone has 5000+ jobs queued on express, making it impossible to get an interactive shell at the moment.
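
If you want to check this yourself, something like

squeue -p express -h | wc -l

counts the jobs currently queued on express, and sinfo -p express shows the state of the express nodes.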

This is clearly not optimal. I will vote for the last option, maybe with fewer express nodes, since they should NOT be used for "actual" workflows but for testing, development, etc.

Having a few machines/cores accessible at all times would be nice.

Anders Halager

Sep 24, 2014, 6:08:41 AM9/24/14
to genome-au-c...@googlegroups.com
Alright, I have changed the express partition now. It only has two nodes, but jobs are only added to it if you explicitly ask with "-p express". Express jobs will also be eligible for the normal partition.
This will hopefully mean that the express machines have a shorter wait for test jobs and interactive jobs.
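
So to run a quick test job on express you now have to ask for it explicitly, e.g. (quick_test.sh being a hypothetical script name):

sbatch -p express -t 00:15:00 quick_test.sh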

Anders.

Michał Świtnicki

Sep 24, 2014, 6:28:53 AM9/24/14
to genome-au-c...@googlegroups.com
Hi Guys,

Sounds all good. 8 express nodes was a bit much to keep idle, given the cluster's overall capacity. Because of that and the huge load on the cluster, I would sometimes use them to run my pilot jobs, and as far as I've seen, so did others.
On a different note: when are you actually planning on merging the 2 queuing systems, thus ensuring optimal utilization of the cluster?

Michał

Rune Møllegaard Friborg

Sep 24, 2014, 6:41:15 AM9/24/14
to genome-au-c...@googlegroups.com

>When are you actually planning on merging the 2 queuing systems thus ensuring optimal utilization of the cluster?

We will reduce the number of Torque/PBS nodes (currently 20) at an undecided pace, depending on the day-to-day usage.

Finally, Torque/PBS will be completely removed December 1st.

We will assist any users who are unable to make the move to Slurm by themselves. No one will be left behind.

If you need help, contact us.

best regards,
Rune Friborg


Palle Villesen

Sep 24, 2014, 7:12:36 AM9/24/14
to genome-au-c...@googlegroups.com
Perfect. Works like a charm.

To get an interactive node to work on for 1 hour (4 cores, 16 GB memory):

srun -p express --pty -c 4 -t 01:00:00 --mem=16g bash 

Palle

Rune Møllegaard Friborg

Sep 24, 2014, 7:19:22 AM9/24/14
to genome-au-c...@googlegroups.com
To save time, you don't need to specify the time... (pun intended)

srun -p express -c 4 --mem=16g --pty bash

/Rune

Palle Villesen

Sep 25, 2014, 6:53:20 AM9/25/14
to genome-au-c...@googlegroups.com
And you should stop running pilot jobs on express now... (or make your pilot jobs a LOT smaller). Submitting more than 4-5 jobs to the express queue will effectively block it (as it is now), which has 2 consequences:

1. People will not be able to get an interactive shell or run test jobs.
2. They will switch to using the front-end nodes (which is a really bad idea).

From Rune's post: