[slurm-users] 21.08.6 srun fails with error "Invalid job credential" ; sbatch is fine.

Williams, Jenny Avis

May 13, 2022, 5:32:05 PM
to slurm...@schedmd.com

 

Yesterday I upgraded slurmdbd and slurmctld nodes from RHEL7 / Slurm v. 20.11.8 to RHEL8.5 / Slurm v. 21.08.6 on our production cluster.

I also updated Slurm on the RHEL7 login nodes to 21.08.6.

Sbatch jobs run fine.

 

However, srun fails from the updated login node with "Invalid job credential" errors. Running srun from nodes that have not been updated works fine.

I am hoping this looks familiar to you.

 

 

$ srun --slurmd-debug=verbose -n 1 -t 8:00:00 --mem=3g -p interact -w c0801 --pty /bin/bash
srun: job 45281066 queued and waiting for resources
srun: job 45281066 has been allocated resources
srun: error: Task launch for StepId=45281066.0 failed on node c0801: Invalid job credential
srun: error: Application launch failed: Invalid job credential
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
srun: error: Timed out waiting for job step to complete

 

 

Brian Andrus

May 13, 2022, 7:14:39 PM
to slurm...@lists.schedmd.com

Double-check the account info on that node (c0801).

Could be the node does not recognize the uid being assigned to the user/job.
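
One rough way to run that check is to compare the uid:gid the user resolves to on the login node with what c0801 reports. This is only a sketch: the helper names same_identity and uid_gid are made up for illustration, and the commented-out example assumes you can ssh to the compute node, which may not be true on your cluster.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Slurm) for comparing user identity
# between the submitting host and a compute node.

# Print the uid:gid a user name resolves to through NSS
# (local passwd files, LDAP, SSSD, and so on).
uid_gid() {
    getent passwd "$1" | cut -d: -f3,4
}

# Compare two uid:gid strings; returns non-zero on mismatch.
same_identity() {
    if [ "$1" = "$2" ]; then
        echo "match: $1"
    else
        echo "MISMATCH: local=$1 remote=$2"
        return 1
    fi
}

# Example (assumes ssh access to c0801):
# same_identity "$(uid_gid "$USER")" \
#     "$(ssh c0801 getent passwd "$USER" | cut -d: -f3,4)"
```

If the two values differ, or the remote lookup comes back empty, the node is not resolving the account the same way the login node does.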

Brian Andrus
