Never mind. This was a layer 8 problem. I was editing the wrong
slurm.conf. We recently switched to using RPMs, and I had accidentally
edited the file in the location we used before the switch.
It turns out those errors were always there in slurmctld.log, and no one
ever noticed. Now that I am using the output of 'slurmd -C' in the
correct file, those errors have gone away.
What is interesting is that the configuration produced by 'slurmd -C'
treats each NUMA node as a separate socket (4 sockets), whereas the old
configuration in slurm.conf matched the physical configuration (2
sockets). So the 'correct' physical configuration had been causing
those errors.
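
A rough sketch of that comparison in Python, for anyone who wants to
check their own nodes. The node name and all values below are made up
for illustration and are not the real node definition. It assumes the
'slurmd -C' line reports SocketsPerBoard while the slurm.conf line uses
Sockets; the helper names are hypothetical.

```python
def parse_node_line(line):
    """Split a 'Key=Value Key=Value ...' node line into a dict."""
    return dict(tok.split("=", 1) for tok in line.split() if "=" in tok)


def socket_count(fields):
    """Return the socket count, accepting either Sockets or SocketsPerBoard."""
    if "Sockets" in fields:
        return int(fields["Sockets"])
    return int(fields["SocketsPerBoard"]) * int(fields.get("Boards", 1))


# Hypothetical values, for illustration only.
# 'reported' stands in for the first line printed by 'slurmd -C'.
reported = parse_node_line(
    "NodeName=node01 CPUs=32 Boards=1 SocketsPerBoard=4 "
    "CoresPerSocket=8 ThreadsPerCore=1"
)
# 'configured' stands in for the old NodeName line in slurm.conf.
configured = parse_node_line(
    "NodeName=node01 CPUs=32 Sockets=2 CoresPerSocket=16 ThreadsPerCore=1"
)

mismatch = socket_count(reported) != socket_count(configured)
print("socket mismatch:", mismatch)
```

With these made-up values, the physical layout (2 sockets) disagrees
with what slurmd detects (4 sockets, one per NUMA node), which is the
kind of mismatch described above.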
Prentice