[slurm-users] Node (anti?) Feature / attribute

David Magda via slurm-users

Jun 14, 2024, 2:20:38 PM
to Slurm User Community List
Hello,

What I’m looking for is a way for a node to remain in the same partition, with the same QoS(es), but only be chosen if a particular capability is asked for. This is because we are rolling something (an OS upgrade) out slowly, to a small batch of nodes at first and then to more and more over time, and do not want to interrupt users’ workflows: we want them to default to the ‘current’ nodes and only land on the ‘special’ ones if requested. (At a certain point the ‘special’ ones will become the majority and we’d swap the behaviour.)

Slurm has the well-known feature item that can be put on a node(s):

> A comma-delimited list of arbitrary strings indicative of some characteristic associated with the node. There is no value or count associated with a feature at this time, a node either has a feature or it does not. A desired feature may contain a numeric component indicating, for example, processor speed but this numeric component will be considered to be part of the feature string. Features are intended to be used to filter nodes eligible to run jobs via the --constraint argument. By default a node has no features. Also see Gres for being able to have more control such as types and count. Using features is faster than scheduling against GRES but is limited to Boolean operations.


https://slurm.schedmd.com/slurm.conf.html#OPT_Features

So if there are (a bunch of) partitions, and nodes within those partitions, a job can be submitted to a partition and it can run on any available node, or even be requested to run on a particular node (--nodelist). With the above (and --constraint / --prefer), a particular subset of node(s) can be requested. But (AIUI) that subset is also available generally to everyone, even if a particular feature is not requested.

Is there a way to tell Slurm to not schedule a job on a node UNLESS a flag or option is set? Or is it necessary to set up new partition(s) or QoS(es)? I see that AllowAccounts (and AllowGroups) is applicable only to Partitions, and not (AFAICT) on a per node basis.

We’re currently on 22.05.x, but upgrading is fine.

Regards,
David

--
slurm-users mailing list -- slurm...@lists.schedmd.com
To unsubscribe send an email to slurm-us...@lists.schedmd.com

Bill via slurm-users

Jun 14, 2024, 2:28:54 PM
to slurm...@lists.schedmd.com
We've done this with job_submit.lua, mostly for OS updates. We add a
feature to every node and then proceed, telling users that requesting
the feature gets them onto the "new" nodes.

I can send you the snippet if you're using the job_submit.lua script.

Bill

Laura Hild via slurm-users

Jun 14, 2024, 2:35:46 PM
to slurm...@lists.schedmd.com
I wrote a job_submit.lua also. It would append "&centos79" to the feature string unless the features already contained "el9"; if the feature string was empty, it would set it to "centos79" without the ampersand. I didn't hear from any users doing anything fancy enough with their feature string for the ampersand to cause a problem.
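
The rule described above can be exercised outside the scheduler. What follows is a minimal standalone C sketch of it, not the actual plugin: `apply_default_feature` and its parameters are hypothetical names, nothing in it is a Slurm API, and it assumes an unset or empty feature string means "no constraint requested".

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Slurm): build the feature string a
 * job would end up with. The default feature is AND-ed onto the request
 * unless the request already mentions the opt-in feature.
 * Returns a malloc()ed string the caller must free, or NULL on failure.
 */
char *apply_default_feature(const char *features,
                            const char *default_feature,
                            const char *optin_feature)
{
        const char *sep = "&";
        size_t len;
        char *out;

        if (features == NULL || features[0] == '\0') {
                /* No constraint requested: the default is the whole string. */
                features = default_feature;
                default_feature = "";
                sep = "";
        } else if (strstr(features, optin_feature) != NULL) {
                /* The user opted in to the new nodes: leave the request alone. */
                default_feature = "";
                sep = "";
        }

        len = strlen(features) + strlen(sep) + strlen(default_feature) + 1;
        out = malloc(len);
        if (out != NULL)
                snprintf(out, len, "%s%s%s", features, sep, default_feature);
        return out;
}
```

With "centos79" as the default and "el9" as the opt-in, a request for "gpu" becomes "gpu&centos79" and a request for "gpu&el9" is returned unchanged. A request that uses OR, such as "gpu|bigmem", becomes "gpu|bigmem&centos79", which may not group the way the user intended; that is the kind of "fancy" feature string where a plain append can cause trouble.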

Ryan Cox via slurm-users

Jun 14, 2024, 3:40:19 PM
to slurm...@lists.schedmd.com
We did something like this in the past but from C.  However, modifying
the features was painful if the user did any interesting syntax.

What we are doing now is using --extra for that purpose.  The nodes boot
up with SLURMD_OPTIONS="--extra {\\\"os\\\":\\\"rhel9\\\"}" or similar. 
Users can request --extra=os=rhel9 or whatever if they want to submit
across OS versions for some weird reason.

Handling defaults is problematic because there is no way to set a
default --extra for people.  We had some things working to set an
environment variable on the nodes that gets passed by sbatch, et al. and
then read it from the submit plugin.  We would then set the --extra in
the job submit plugin.  The problem is that salloc and srun behave
differently and you can't access the environment.

Instead, we are now looking up the alloc_node in the plugin and reading
its `extra` directly.  Here's what the relevant parts look like:
static void _set_extra_from_alloc_node(job_desc_msg_t *job_desc)
{
        node_record_t *node_ptr = find_node_record(job_desc->alloc_node);
        char *default_str = "os=rhel7";

        if (node_ptr == NULL) {
                job_desc->extra = xstrdup(default_str);
                info("WARNING: _set_extra_from_alloc_node: node %s not found. "
                     "Setting job to default '%s'",
                     job_desc->alloc_node, default_str);
        } else {
                if (!xstrcmp(node_ptr->extra, "{\"os\":\"rhel7\"}")) {
                        job_desc->extra = xstrdup("os=rhel7");
                } else if (!xstrcmp(node_ptr->extra, "{\"os\":\"rhel9\"}")) {
                        job_desc->extra = xstrdup("os=rhel9");
                } else {
                        job_desc->extra = xstrdup(default_str);
                        info("WARNING: _set_extra_from_alloc_node: node %s "
                             "returned extra of '%s' which did not match "
                             "known values. Setting job to default '%s'",
                             job_desc->alloc_node, node_ptr->extra,
                             default_str);
                }
        }
}

...


        if (!job_desc->extra) {
                _set_extra_from_alloc_node(job_desc);
        }

I don't know if you can do it in lua.  The easiest way to do this would
be if there was an environment variable for a default --extra, but there
isn't currently.  I've been meaning to ask SchedMD about that but
haven't done so yet.

By the way, the nice thing about --extra is that there's no juggling of
features in config files.  Whatever OS it boots up in, that's what ends
up in the extra field.  We have a script that populates the relevant
file before Slurm boots.
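
As more OS versions appear, the exact-match chain in _set_extra_from_alloc_node could be kept in a lookup table instead. The following is only a standalone sketch of that idea: `extra_map`, `extra_table` and `extra_request_for_node` are hypothetical names, and the code deliberately uses no Slurm headers so it can be compiled and tested on its own. In a plugin, the returned string would still be copied with xstrdup() as in the function above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table: the node's `extra` JSON and the matching job request. */
struct extra_map {
        const char *node_extra;
        const char *job_extra;
};

static const struct extra_map extra_table[] = {
        { "{\"os\":\"rhel7\"}", "os=rhel7" },
        { "{\"os\":\"rhel9\"}", "os=rhel9" },
};

/*
 * Return the --extra request for a node's `extra` string. The fallback is
 * returned when the node has no `extra` or its value is not in the table.
 * The result points at static storage or at `fallback`; it is not allocated.
 */
const char *extra_request_for_node(const char *node_extra, const char *fallback)
{
        size_t i;

        if (node_extra == NULL)
                return fallback;

        for (i = 0; i < sizeof(extra_table) / sizeof(extra_table[0]); i++) {
                if (strcmp(node_extra, extra_table[i].node_extra) == 0)
                        return extra_table[i].job_extra;
        }
        return fallback;
}
```

Supporting another OS then means adding one table row rather than another else-if branch. The comparison is still an exact string match, so the JSON written at boot has to be byte-for-byte identical to the table entry.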

David Magda via slurm-users

Jun 17, 2024, 10:28:51 AM
to Ryan Cox, slurm...@lists.schedmd.com
This functionality in slurmd was added in August 2023, so not in the version we’re currently running:

https://github.com/SchedMD/slurm/commit/0daa1fda97c125c0b1c48cbdcdeaf1382ed71c4f

Perhaps something for the future. Currently looking like the job_submit.lua is the best candidate.


> On Jun 14, 2024, at 15:37, Ryan Cox via slurm-users <slurm...@lists.schedmd.com> wrote:
>
> What we are doing now is using --extra for that purpose. The nodes boot up with SLURMD_OPTIONS="--extra {\\\"os\\\":\\\"rhel9\\\"}" or similar. Users can request --extra=os=rhel9 or whatever if they want to submit across OS versions for some weird reason.

David Magda via slurm-users

Jun 17, 2024, 10:29:27 AM
to Laura Hild, slurm...@lists.schedmd.com
Could you post that snippet?

Laura Hild via slurm-users

Jun 17, 2024, 10:37:01 AM
to David Magda, slurm...@lists.schedmd.com
> Could you post that snippet?

function slurm_job_submit ( job_desc, part_list, submit_uid )
    if job_desc.features then
        if not string.find(job_desc.features,"el9") then
            job_desc.features = job_desc.features .. '&centos79'
        end
    else
        job_desc.features = "centos79"
    end
    return slurm.SUCCESS
end