[slurm-users] Node specs for Epyc 7xx3 processors?


Steffen Grunewald

Dec 22, 2021, 10:38:37 AM
to Slurm users
Hello,

I'm wondering whether there is some rule-of-thumb to translate the core
config listed in https://en.wikipedia.org/wiki/Epyc to the node information
Slurm expects in "Sockets=x CoresPerSocket=y"? ("ThreadsPerCore=2" is clear.)
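(For reference, the three values Slurm wants can be read straight off `lscpu`. A minimal sketch, using hypothetical sample output for a dual-socket 32-core part rather than a real box:)

```shell
# Sketch: extract the fields Slurm's node definition needs from lscpu
# output. The sample values below are hypothetical, not from real hardware.
sample='Socket(s):             2
Core(s) per socket:    32
Thread(s) per core:    2'
sockets=$(echo "$sample" | awk -F': *' '/^Socket\(s\)/ {print $2}')
cores=$(echo "$sample" | awk -F': *' '/^Core\(s\) per socket/ {print $2}')
threads=$(echo "$sample" | awk -F': *' '/^Thread\(s\) per core/ {print $2}')
echo "Sockets=$sockets CoresPerSocket=$cores ThreadsPerCore=$threads"
# -> Sockets=2 CoresPerSocket=32 ThreadsPerCore=2
# On a live node, replace "$sample" with the real `lscpu` output.
```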

We'll be getting Epyc 7313 and 7513 machines, and perhaps add a single 7713
one - "lscpu" outputs are wildly different, while the total number of cores
is correct.

Will I have to wait until the machines have arrived, and do some experiments,
or did someone already retrieve the right numbers, and is willing to share?

Thanks,
Steffen

--
Steffen Grunewald, Cluster Administrator
Max Planck Institute for Gravitational Physics (Albert Einstein Institute)
Am Mühlenberg 1 * D-14476 Potsdam-Golm * Germany
~~~
Fon: +49-331-567 7274
Mail: steffen.grunewald(at)aei.mpg.de
~~~

Stuart MacLachlan

Dec 22, 2021, 11:02:47 AM
to Slurm User Community List
Hi Steffen,

Not sure if the output from 'numactl --hardware' is more consistent and easier to parse with a script or similar?
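(For what it's worth, the interesting header lines are easy to pick out with awk. A sketch against hypothetical NPS4 output, not captured from a real machine:)

```shell
# Sketch: count NUMA nodes and CPUs per node from `numactl --hardware`
# style output. The sample below is hypothetical NPS4 output.
sample='available: 8 nodes (0-7)
node 0 cpus: 0 1 2 3 64 65 66 67
node 0 size: 32143 MB'
nodes=$(echo "$sample" | awk '/^available:/ {print $2}')
# a "node N cpus:" line has 3 header fields, then one field per hardware thread
threads_node0=$(echo "$sample" | awk '/^node 0 cpus:/ {print NF - 3}')
echo "nodes=$nodes threads_per_node=$threads_node0"
# -> nodes=8 threads_per_node=8
```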

Kind Regards,
Stuart



Steffen Grunewald

Dec 22, 2021, 11:28:17 AM
to Slurm User Community List
On Wed, 2021-12-22 at 16:02:00 +0000, Stuart MacLachlan wrote:
> Hi Steffen,
>
> Not sure if the output from 'numactl --hardware' is more consistent and easier to parse with a script or similar?

Hi,

I'm getting confusing results.
For an older dual 7351, there are 8 NUMA nodes, 4 physical cores each.
(This already works with Slurm "Sockets=8 CoresPerSocket=4".)
For a dual 7713 running Ubuntu, kernel 5.11, I get 2 NUMA nodes, one
per processor (64 physical cores, times 2).
I've seen "lscpu" output for a 7313 which also shows 8 nodes, 4 cores,
2 threads each, kernel 4.19, Debian Buster.
Does Ubuntu (or the 5.11 kernel) handle NUMA nodes differently?

Thanks,
Steffen


--
Steffen Grunewald, Cluster Administrator
Max Planck Institute for Gravitational Physics (Albert Einstein Institute)
Am Mühlenberg 1 * D-14476 Potsdam-Golm * Germany
~~~
Fon: +49-331-567 7274
Mail: steffen.grunewald(at)aei.mpg.de
~~~

Brice Goglin

Dec 22, 2021, 11:40:45 AM
to slurm...@lists.schedmd.com

Le 22/12/2021 à 17:27, Steffen Grunewald a écrit :
> On Wed, 2021-12-22 at 16:02:00 +0000, Stuart MacLachlan wrote:
>> Hi Steffen,
>>
>> Not sure if the output from 'numactl --hardware' is more consistent and easier to parse with a script or similar?
> Hi,
>
> I'm getting confusing results.
> For an older dual 7351, there are 8 NUMA nodes, 4 physical cores each.
> (This already works with Slurm "Sockets=8 CoresPerSocket=4".)
> For a dual 7713 running Ubuntu, kernel 5.11, I get 2 NUMA nodes, one
> per processor (64 physical cores, times 2).
> I've seen "lscpu" output for a 7313 which also shows 8 nodes, 4 cores,
> 2 threads each, kernel 4.19, Debian Buster.
> Does Ubuntu (or the 5.11 kernel) handle NUMA nodes differently?
>

Hello

AMD Epyc can be configured with 1, 2 or 4 NPS (nodes per socket) in the
BIOS.

Your old 7351 is configured in NPS4, your dual 7713 is NPS1, and 7313 is
NPS4 again.
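(In other words, the effective NPS setting can be read back from what the kernel reports: NUMA nodes divided by physical sockets. A small sketch with the counts from this thread:)

```shell
# Sketch: NPS as seen from Linux is simply NUMA node count / socket count.
nps() { echo $(( $1 / $2 )); }   # $1 = NUMA node count, $2 = socket count
nps 8 2   # dual 7351 showing 8 NUMA nodes -> 4 (NPS4)
nps 2 2   # dual 7713 showing 2 NUMA nodes -> 1 (NPS1)
```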

Brice




Steffen Grunewald

Dec 22, 2021, 12:39:49 PM
to Slurm User Community List
Hi Brice,

old dog still learning new tricks...

On Wed, 2021-12-22 at 17:40:17 +0100, Brice Goglin wrote:
>
> Hello
>
> AMD Epyc can be configured with 1, 2 or 4 NPS (nodes per socket) in the
> BIOS.
>
> Your old 7351 is configured in NPS4, your dual 7713 is NPS1, and 7313 is
> NPS4 again.

Thanks for this interesting detail - I must have missed that shady corner
of the settings ;)

In a nutshell, this means that the configuration I will get is completely
unpredictable - could be 2 x 32, 4 x 16 or 8 x 8 (Sockets x CoresPerSocket).
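(The three possibilities follow mechanically from the NPS setting; a sketch assuming a dual-socket part with 32 cores per socket, e.g. the 7513:)

```shell
# Sketch: enumerate the slurm.conf view of a hypothetical dual-socket,
# 32-cores-per-socket node under each NPS setting (each NUMA node is
# presented to Slurm as one "socket").
phys_sockets=2
cores_per_socket=32
for nps in 1 2 4; do
  echo "NPS$nps: Sockets=$((phys_sockets * nps)) CoresPerSocket=$((cores_per_socket / nps))"
done
# -> NPS1: Sockets=2 CoresPerSocket=32
#    NPS2: Sockets=4 CoresPerSocket=16
#    NPS4: Sockets=8 CoresPerSocket=8
```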

So I have to wait until I get my hands on either the BIOS settings or the
numactl/lscpu output.

Thanks - case closed ;)