[slurm-dev] Multifactor Job Priority

Cook, Justin (London)

Nov 6, 2009, 9:51:02 AM
to slur...@lists.llnl.gov

Hi,

Firstly, let me take my hat off to all the developers for a fine product. We
are testing SLURM for wider deployment within the firm and are amazed at the
execution speed -- especially for our use case.

We have developed two map-reduce libraries for use with R and Python in
the SLURM environment, so execution speed is very important for us. The
way we want to use SLURM is simply to throw any number of single-core jobs
at it and let the SLURM scheduler handle fair access without completely
obliterating larger jobs.

Let's say I have a user who has a list of 1000 entries that take ten
minutes each to complete. He may be taking up 80% of resources with
others using the rest. We want other new jobs to be able to filter up to
the top of the queue and interleave with the other jobs that were
submitted first.

I have set slurm.conf to use the multifactor priority plugin as follows:
PriorityDecayHalfLife = 00:10:00
PriorityFavorSmall = 0
PriorityMaxAge = 00:00:00
PriorityUsageResetPeriod = NONE
PriorityType = priority/multifactor
PriorityWeightAge = 0
PriorityWeightFairShare = 100000
PriorityWeightJobSize = 1000
PriorityWeightPartition = 0
PriorityWeightQOS = 0

I understand that, since almost all jobs will use one CPU, job size
will not affect the priority. But I would expect fair share to have some
sort of impact.
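
For reference, the Slurm documentation describes the multifactor priority as a weighted sum of factors, each a floating-point value between 0.0 and 1.0. Below is a minimal Python sketch using the weights from the slurm.conf settings above. The function name and the example factor values are hypothetical and chosen only for illustration; the real plugin computes the factors internally.

```python
# Weights copied from the slurm.conf settings quoted above.
WEIGHTS = {
    "age": 0,            # PriorityWeightAge
    "fairshare": 100000, # PriorityWeightFairShare
    "job_size": 1000,    # PriorityWeightJobSize
    "partition": 0,      # PriorityWeightPartition
    "qos": 0,            # PriorityWeightQOS
}


def job_priority(factors, weights=WEIGHTS):
    """Weighted sum of priority factors (hypothetical helper).

    `factors` maps a factor name to a value in [0.0, 1.0];
    missing factors are treated as 0.0.
    """
    return sum(weights[name] * factors.get(name, 0.0) for name in weights)


# Assumed factor values for two single-core jobs of equal size:
# one from a heavy user, one from a user with little recent usage.
heavy_user = job_priority({"fairshare": 0.1, "job_size": 0.0})
light_user = job_priority({"fairshare": 0.9, "job_size": 0.0})
print(heavy_user, light_user)
```

Under these assumptions, two single-core jobs differ in priority only through their fairshare factor, so the ordering depends on how the accounting associations are set up.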

When I execute the following, I always see the same priority on all jobs,
which is 2147483647. This number never changes:

sacct -a --fields=JobID,JobName,Account,State,Priority --state=r,s,pd

In order to accomplish what I am after, what is the best approach /
settings?

Cheers,

--
Justin Cook
AHL Research & Trading Systems
Technology Group

Man Investments
5th Floor, Sugar Quay, Lower Thames St, London, EC3R 6DU
Office: +44 (0) 20 7144 3744

Danny Auble

Nov 6, 2009, 11:52:05 AM
to slur...@lists.llnl.gov
Hi Justin, thanks for the accolades.

How is your fairshare tree set up in accounting? That will determine the effect fairshare has on each incoming job. Keep in mind that with such a low PriorityDecayHalfLife, recorded usage is halved every 10 minutes, so soon after a job runs its time will have effectively decayed into nothing. Unless there is a constant stream of jobs from all parties involved, I would suggest a higher value if you really want the priority to reflect usage accurately.

Still, the biggest factor here is how you have set up your fairshare tree. If you could post the output of 'sacctmgr list assoc tree', that would be helpful.

Danny
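
As a rough illustration of the half-life point above, the sketch below models recorded usage as decaying exponentially with a fixed half-life. It is a simplified model, not the plugin's actual code; the function name and the sample numbers are hypothetical.

```python
def decayed_usage(raw_usage, elapsed_minutes, half_life_minutes):
    """Usage remaining after `elapsed_minutes`, assuming simple
    exponential decay with the given half-life (illustrative model)."""
    return raw_usage * 0.5 ** (elapsed_minutes / half_life_minutes)


# 600 CPU-seconds of usage (one ten-minute single-core job),
# observed at several points after the job finished.
for half_life in (10, 7 * 24 * 60):  # 10 minutes vs. an assumed 7 days
    remaining = [round(decayed_usage(600, t, half_life), 2)
                 for t in (10, 30, 60)]
    print(f"half-life {half_life} min -> usage after 10/30/60 min: {remaining}")
```

With a 10-minute half-life, only about 1.6% of the usage is left after an hour, so fair share has little history to work with unless jobs arrive continuously. A longer half-life keeps past usage visible for longer.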
