[PATCH] ppc64_cpu: Support partial SMT level through SYS FS smt/control files

58 views
Skip to first unread message

Laurent Dufour

<ldufour@linux.ibm.com>
unread,
Aug 10, 2023, 5:30:07 AM8/10/23
to powerpc-utils-devel@googlegroups.com, tyreld@linux.ibm.com, nathanl@linux.ibm.com, nysal@linux.ibm.com
The next kernel release will support partial SMT level [1] though the SYS
FS file "devices/system/cpu/smt/control". This allows the SMT level to be
recorded in the kernel. With the current SMT level stored in the kernel,
when a new CPU is added, only the necessary threads are brought online.

The legacy way to active threads through the SYS FS files
'devices/system/cpu/cpu<n>/online', is still used in the case the new SYS
FS API is not available. This allows compatibility with the previous kernel
versions.

[1] https://lore.kernel.org/linuxppc-dev/20230705145143....@linux.ibm.com/

Signed-off-by: Laurent Dufour <ldu...@linux.ibm.com>
---
src/ppc64_cpu.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/src/ppc64_cpu.c b/src/ppc64_cpu.c
index 5fdf86a14312..c33a293fef57 100644
--- a/src/ppc64_cpu.c
+++ b/src/ppc64_cpu.c
@@ -56,6 +56,8 @@
#define DIAGNOSTICS_RUN_MODE 42
#define CPU_OFFLINE -1

+#define SYS_SMT_CONTROL "/sys/devices/system/cpu/smt/control"
+
#ifdef HAVE_LINUX_PERF_EVENT_H
struct cpu_freq {
int offline;
@@ -360,6 +362,20 @@ static int is_dscr_capable(void)
return 0;
}

+/*
+ * Depends on kernel's CONFIG_HOTPLUG_CPU
+ */
+static int set_smt_control(int smt_state)
+{
+ if (set_attribute(SYS_SMT_CONTROL, "%d", smt_state)) {
+ /* Silently ignore kernel not supporting this feature */
+ if (errno != ENODEV)
+ perror(SYS_SMT_CONTROL);
+ return -1;
+ }
+ return 0;
+}
+
static int do_smt(char *state, bool numeric)
{
int rc = 0;
@@ -388,7 +404,9 @@ static int do_smt(char *state, bool numeric)
return -1;
}

- rc = set_smt_state(smt_state);
+ /* Try using smt/control if failing, fall back to the legacy way */
+ if (set_smt_control(smt_state))
+ rc = set_smt_state(smt_state);
}

return rc;
--
2.41.0

Nathan Lynch

<nathanl@linux.ibm.com>
unread,
Oct 9, 2023, 4:47:57 PM10/9/23
to Laurent Dufour, powerpc-utils-devel@googlegroups.com, tyreld@linux.ibm.com, nysal@linux.ibm.com
Laurent Dufour <ldu...@linux.ibm.com> writes:
> The next kernel release will support partial SMT level [1] though the SYS
> FS file "devices/system/cpu/smt/control". This allows the SMT level to be
> recorded in the kernel. With the current SMT level stored in the kernel,
> when a new CPU is added, only the necessary threads are brought online.
>
> The legacy way to active threads through the SYS FS files
> 'devices/system/cpu/cpu<n>/online', is still used in the case the new SYS
> FS API is not available. This allows compatibility with the previous kernel
> versions.
>
> [1] https://lore.kernel.org/linuxppc-dev/20230705145143....@linux.ibm.com/
>
> Signed-off-by: Laurent Dufour <ldu...@linux.ibm.com>

Reviewed-by: Nathan Lynch <nat...@linux.ibm.com>

I think this could be applied as-is, but there may be some follow-up
work to do, see below.
If the smt control path does not exist, I would expect errno to be
ENOENT from the open() in set_attribute()...

> +
> static int do_smt(char *state, bool numeric)
> {
> int rc = 0;
> @@ -388,7 +404,9 @@ static int do_smt(char *state, bool numeric)
> return -1;
> }
>
> - rc = set_smt_state(smt_state);
> + /* Try using smt/control if failing, fall back to the legacy way */
> + if (set_smt_control(smt_state))
> + rc = set_smt_state(smt_state);
> }

...and I believe that is the only case where we want to suppress an
error message and fall back to the old method. As written, this code
will fall back to the legacy method if set_smt_control() errors for any
reason, for instance if the kernel fails to online all CPUs due to low
memory. I don't think that's the behavior we want.

Since Laurent is no longer working in this area, I think we could:

* Apply this to next as-is and follow it with a separate change
correcting the minor issues described above; or
* Create a new patch based on this one, retaining Laurent as
co-author.

Nysal Jan K.A.

<nysal@linux.ibm.com>
unread,
Oct 16, 2023, 7:20:13 AM10/16/23
to Nathan Lynch, Laurent Dufour, powerpc-utils-devel@googlegroups.com, tyreld@linux.ibm.com
I verified that errno is set to ENODEV if open() fails

>
> > +
> > static int do_smt(char *state, bool numeric)
> > {
> > int rc = 0;
> > @@ -388,7 +404,9 @@ static int do_smt(char *state, bool numeric)
> > return -1;
> > }
> >
> > - rc = set_smt_state(smt_state);
> > + /* Try using smt/control if failing, fall back to the legacy way */
> > + if (set_smt_control(smt_state))
> > + rc = set_smt_state(smt_state);
> > }
>
> ...and I believe that is the only case where we want to suppress an
> error message and fall back to the old method. As written, this code
> will fall back to the legacy method if set_smt_control() errors for any
> reason, for instance if the kernel fails to online all CPUs due to low
> memory. I don't think that's the behavior we want.
>
> Since Laurent is no longer working in this area, I think we could:
>
> * Apply this to next as-is and follow it with a separate change
> correcting the minor issues described above; or
> * Create a new patch based on this one, retaining Laurent as
> co-author.
Maybe we can apply this as-is and post a follow up patch? Let me know your
preference.


--Nysal

Tyrel Datwyler

<tyreld@linux.ibm.com>
unread,
Oct 26, 2023, 5:13:03 PM10/26/23
to Nysal Jan K.A., Nathan Lynch, Laurent Dufour, powerpc-utils-devel@googlegroups.com
There are two different things at play here. If smt/control path doesn't exist
then yes the errno would be ENOENT, but in this case at play the write is
actually what is failing with ENODEV because smt/contorl exists, but smt state
control is not supported by the kernel (presumably because (cpu_smt_control ==
CPU_SMT_NOT_SUPPORTED)).

I would expect smt/control to exist on any reasonably modern kernel, but maybe
we should handle the ENOENT case as well with ENODEV.

There is a third case as well where we could get EPERM for (cpu_smt_control ==
CPU_SMT_FORCE_DISABLED).

>
>>
>>> +
>>> static int do_smt(char *state, bool numeric)
>>> {
>>> int rc = 0;
>>> @@ -388,7 +404,9 @@ static int do_smt(char *state, bool numeric)
>>> return -1;
>>> }
>>>
>>> - rc = set_smt_state(smt_state);
>>> + /* Try using smt/control if failing, fall back to the legacy way */
>>> + if (set_smt_control(smt_state))
>>> + rc = set_smt_state(smt_state);
>>> }
>>
>> ...and I believe that is the only case where we want to suppress an
>> error message and fall back to the old method. As written, this code
>> will fall back to the legacy method if set_smt_control() errors for any
>> reason, for instance if the kernel fails to online all CPUs due to low
>> memory. I don't think that's the behavior we want.
>>
>> Since Laurent is no longer working in this area, I think we could:
>>
>> * Apply this to next as-is and follow it with a separate change
>> correcting the minor issues described above; or
>> * Create a new patch based on this one, retaining Laurent as
>> co-author.

It seems fairly trivial to fix up the handling of the cases we want to silently
ignore and make do_smt() fall back to legacy, and instead erroring out in any
other case of set_smt_control failing.

-Tyrel

Tyrel Datwyler

<tyreld@linux.ibm.com>
unread,
Jan 31, 2024, 8:01:21 PMJan 31
to Laurent Dufour, powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com, nysal@linux.ibm.com
On 8/10/23 02:29, Laurent Dufour wrote:
> The next kernel release will support partial SMT level [1] though the SYS
> FS file "devices/system/cpu/smt/control". This allows the SMT level to be
> recorded in the kernel. With the current SMT level stored in the kernel,
> when a new CPU is added, only the necessary threads are brought online.
>
> The legacy way to active threads through the SYS FS files
> 'devices/system/cpu/cpu<n>/online', is still used in the case the new SYS
> FS API is not available. This allows compatibility with the previous kernel
> versions.
>
> [1] https://lore.kernel.org/linuxppc-dev/20230705145143....@linux.ibm.com/
>
> Signed-off-by: Laurent Dufour <ldu...@linux.ibm.com>
> ---

This discussion seems to have gone stale. I resent this patch to the list from
the github pull request, and upon closer inspection realized that there had
already been discussion around this patch and some trivial error handling
updates that seemed necessary. Any chance of seeing a v2?

-Tyrel

Michal Suchanek

<msuchanek@suse.de>
unread,
Feb 9, 2024, 7:26:49 AMFeb 9
to Laurent Dufour, powerpc-utils-devel@googlegroups.com, Tyrel Datwyler, nathanl@linux.ibm.com, nysal@linux.ibm.com, Michal Suchanek
When the kernel does not support the sysfs intercface do not report and
arror, fall back to the old method silently.

Suggested-by: Nathan Lynch<nat...@linux.ibm.com>
Signed-off-by: Michal Suchanek <msuc...@suse.de>
---
It has been suggested to also not try the old iterface in case of
unknown error because it's unlikely to work but that would needlessly
complicate the code.
---
src/ppc64_cpu.c | 20 +++++++++++++++-----
1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/ppc64_cpu.c b/src/ppc64_cpu.c
index c318928..90ae5a7 100644
--- a/src/ppc64_cpu.c
+++ b/src/ppc64_cpu.c
@@ -368,12 +368,22 @@ static int is_dscr_capable(void)
static int set_smt_control(int smt_state)
{
if (set_attribute(SYS_SMT_CONTROL, "%d", smt_state)) {
- /* Silently ignore kernel not supporting this feature */
- if (errno != ENODEV)
- perror(SYS_SMT_CONTROL);
- return -1;
+ switch (errno) {
+ case ENOENT:
+ /*
+ * The kernel does not have the interface.
+ * Try the old method.
+ */
+ return -1;
+ case ENODEV:
+ /* Setting SMT state not supported on the system. */
+ return 0;
+ default:
+ perror(SYS_SMT_CONTROL);
+ return -1;
+ }
}
- return 0;
+ return true;
}

static int do_smt(char *state, bool numeric)
--
2.43.0

Michal Suchanek

<msuchanek@suse.de>
unread,
Feb 9, 2024, 7:34:30 AMFeb 9
to Laurent Dufour, powerpc-utils-devel@googlegroups.com, Tyrel Datwyler, Nathan Lynch, Nysal Jan K . A ., Michal Suchanek
When the kernel does not support the sysfs intercface do not report and
arror, fall back to the old method silently.

Suggested-by: Nathan Lynch<nat...@linux.ibm.com>
Signed-off-by: Michal Suchanek <msuc...@suse.de>
---
It has been suggested to also not try the old iterface in case of
unknown error because it's unlikely to work but that would needlessly
complicate the code.
---
src/ppc64_cpu.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/ppc64_cpu.c b/src/ppc64_cpu.c
index c318928..759ceff 100644
--- a/src/ppc64_cpu.c
+++ b/src/ppc64_cpu.c
@@ -368,10 +368,20 @@ static int is_dscr_capable(void)
static int set_smt_control(int smt_state)
{
if (set_attribute(SYS_SMT_CONTROL, "%d", smt_state)) {
- /* Silently ignore kernel not supporting this feature */
- if (errno != ENODEV)
- perror(SYS_SMT_CONTROL);
- return -1;
+ switch (errno) {
+ case ENOENT:
+ /*
+ * The kernel does not have the interface.
+ * Try the old method.
+ */
+ return -1;
+ case ENODEV:
+ /* Setting SMT state not supported on the system. */
+ return 0;
+ default:
+ perror(SYS_SMT_CONTROL);
+ return -1;
+ }
}
return 0;
}
--
2.43.0

Michal Suchánek

<msuchanek@suse.de>
unread,
Mar 20, 2024, 2:51:52 PMMar 20
to Laurent Dufour, powerpc-utils-devel@googlegroups.com, Tyrel Datwyler, Nathan Lynch, Nysal Jan K . A .
Hello,

any progress with merging this?

Thanks

Michal

Tyrel Datwyler

<tyreld@linux.ibm.com>
unread,
Mar 25, 2024, 7:25:14 PMMar 25
to Laurent Dufour, powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com, nysal@linux.ibm.com
On 8/10/23 02:29, Laurent Dufour wrote:
> The next kernel release will support partial SMT level [1] though the SYS
> FS file "devices/system/cpu/smt/control". This allows the SMT level to be
> recorded in the kernel. With the current SMT level stored in the kernel,
> when a new CPU is added, only the necessary threads are brought online.

I've received a report that with this patch applied the previous expected
behavior breaks. Take the following steps.

ppc64_cpu --cores-on=1
ppc64_cpu --smt=2

Expectation is that a single core with threads 1,2 online, and this is observed.
However, the next step breaks previous behavior.

ppc64_cpu --smt=4

Expectation is that a single core with threads 1-4 online. With this patch we
observe all cores being onlined in smt 4. I have not had a chance to investigate
further.

-Tyrel

Michal Suchánek

<msuchanek@suse.de>
unread,
Apr 2, 2024, 4:13:27 AMApr 2
to Tyrel Datwyler, Laurent Dufour, powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com, nysal@linux.ibm.com
Hello,

On Mon, Mar 25, 2024 at 04:25:05PM -0700, Tyrel Datwyler wrote:
> On 8/10/23 02:29, Laurent Dufour wrote:
> > The next kernel release will support partial SMT level [1] though the SYS
> > FS file "devices/system/cpu/smt/control". This allows the SMT level to be
> > recorded in the kernel. With the current SMT level stored in the kernel,
> > when a new CPU is added, only the necessary threads are brought online.
>
> I've received a report that with this patch applied the previous expected
> behavior breaks. Take the following steps.
>
> ppc64_cpu --cores-on=1
> ppc64_cpu --smt=2
>
> Expectation is that a single core with threads 1,2 online, and this is observed.
> However, the next step breaks previous behavior.
>
> ppc64_cpu --smt=4
>
> Expectation is that a single core with threads 1-4 online. With this patch we
> observe all cores being onlined in smt 4. I have not had a chance to investigate
> further.

It also breaks SMT support on powernv, the
/sys/devices/system/cpu/smt/control returns ENODEV there which is not
retried.

Thanks

Michal
> --
> You received this message because you are subscribed to the Google Groups "Powerpc-utils development mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to powerpc-utils-d...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/powerpc-utils-devel/af3f93f0-d53d-4251-8302-ca573d02bf3e%40linux.ibm.com.

Michal Suchanek

<msuchanek@suse.de>
unread,
Apr 2, 2024, 5:43:26 AMApr 2
to Laurent Dufour, powerpc-utils-devel@googlegroups.com, Tyrel Datwyler, Nathan Lynch, Nysal Jan K . A ., Michal Suchanek
When the kernel does not support the sysfs intercface do not report an
arror, fall back to the old method silently.

Suggested-by: Nathan Lynch<nat...@linux.ibm.com>
Signed-off-by: Michal Suchanek <msuc...@suse.de>
---
v3: retry is needed on ENODEV to support powernv
---
src/ppc64_cpu.c | 24 +++++++++++++++++++-----
1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/src/ppc64_cpu.c b/src/ppc64_cpu.c
index c318928..688152b 100644
--- a/src/ppc64_cpu.c
+++ b/src/ppc64_cpu.c
@@ -364,14 +364,28 @@ static int is_dscr_capable(void)

/*
* Depends on kernel's CONFIG_HOTPLUG_CPU
+ * Return -1 for fatal error, -2 to retry.
*/
static int set_smt_control(int smt_state)
{
if (set_attribute(SYS_SMT_CONTROL, "%d", smt_state)) {
- /* Silently ignore kernel not supporting this feature */
- if (errno != ENODEV)
- perror(SYS_SMT_CONTROL);
- return -1;
+ switch (errno) {
+ case ENOENT:
+ /*
+ * The kernel does not have the interface.
+ * Try the old method.
+ */
+ return -2;
+ case ENODEV:
+ /*
+ * Setting SMT state not supported by this interface.
+ * eg. powernv
+ */
+ return -2;
+ default:
+ perror(SYS_SMT_CONTROL);
+ return -1;
+ }
}
return 0;
}
@@ -405,7 +419,7 @@ static int do_smt(char *state, bool numeric)
}

/* Try using smt/control if failing, fall back to the legacy way */
- if (set_smt_control(smt_state))
+ if ((rc = set_smt_control(smt_state)) == -2)
rc = set_smt_state(smt_state);
}

--
2.44.0

Michal Suchánek

<msuchanek@suse.de>
unread,
Apr 2, 2024, 8:36:19 AMApr 2
to Laurent Dufour, powerpc-utils-devel@googlegroups.com, Tyrel Datwyler, Nathan Lynch, Nysal Jan K . A .
Seems to work on powernv 8247-22L but not S812LC

Don't have access to the latter so can't really say what's different.

Michal Suchánek

<msuchanek@suse.de>
unread,
Apr 4, 2024, 7:06:14 AMApr 4
to Laurent Dufour, powerpc-utils-devel@googlegroups.com, Tyrel Datwyler, Nathan Lynch, Nysal Jan K . A .
Hello,
It turns out that the difference is kernel version.

While this file does exist on older kernels (eg. 5.3) it is not wired on
powerpc resulting in powerpc-utils assuming that managing CPU threads is
not possible. In reality it is not possible only through this interface.

Thanks

Michal

>
> > + */
> > + return -2;
> > + default:
> > + perror(SYS_SMT_CONTROL);
> > + return -1;
> > + }
> > }
> > return 0;
> > }
> > @@ -405,7 +419,7 @@ static int do_smt(char *state, bool numeric)
> > }
> >
> > /* Try using smt/control if failing, fall back to the legacy way */
> > - if (set_smt_control(smt_state))
> > + if ((rc = set_smt_control(smt_state)) == -2)
> > rc = set_smt_state(smt_state);
> > }
> >
> > --
> > 2.44.0
> >
>
> --
> You received this message because you are subscribed to the Google Groups "Powerpc-utils development mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to powerpc-utils-d...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/powerpc-utils-devel/20240402123615.GM20665%40kitsune.suse.cz.

Michal Suchánek

<msuchanek@suse.de>
unread,
Apr 8, 2024, 10:02:06 AMApr 8
to Tyrel Datwyler, Laurent Dufour, powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com, nysal@linux.ibm.com
On Mon, Mar 25, 2024 at 04:25:05PM -0700, Tyrel Datwyler wrote:
> On 8/10/23 02:29, Laurent Dufour wrote:
> > The next kernel release will support partial SMT level [1] though the SYS
> > FS file "devices/system/cpu/smt/control". This allows the SMT level to be
> > recorded in the kernel. With the current SMT level stored in the kernel,
> > when a new CPU is added, only the necessary threads are brought online.
>
> I've received a report that with this patch applied the previous expected
> behavior breaks. Take the following steps.
>
> ppc64_cpu --cores-on=1
> ppc64_cpu --smt=2
>
> Expectation is that a single core with threads 1,2 online, and this is observed.
> However, the next step breaks previous behavior.
>
> ppc64_cpu --smt=4
>
> Expectation is that a single core with threads 1-4 online. With this patch we
> observe all cores being onlined in smt 4. I have not had a chance to investigate
> further.

This should be addressed in the kernel code, AFAICT powerpc-utils cannot
do anything about this problem, the behavior of the kernel-side
implementation differs from the userspace implementation.

Thanks

Michal

Michal Suchanek

<msuchanek@suse.de>
unread,
Apr 8, 2024, 10:05:49 AMApr 8
to Laurent Dufour, powerpc-utils-devel@googlegroups.com, Tyrel Datwyler, Nathan Lynch, Nysal Jan K . A ., Michal Suchanek
When the kernel does not support the sysfs interface do not report an
error, fall back to the old method silently.

Suggested-by: Nathan Lynch<nat...@linux.ibm.com>
Signed-off-by: Michal Suchanek <msuc...@suse.de>
---
v3: retry is needed on ENODEV to support powernv
v4: typo fix in commit message
ENODEV depends on kernel version, not platform
---
src/ppc64_cpu.c | 26 +++++++++++++++++++++-----
1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/src/ppc64_cpu.c b/src/ppc64_cpu.c
index c318928..4017240 100644
--- a/src/ppc64_cpu.c
+++ b/src/ppc64_cpu.c
@@ -364,14 +364,30 @@ static int is_dscr_capable(void)

/*
* Depends on kernel's CONFIG_HOTPLUG_CPU
+ * Return -1 for fatal error, -2 to retry.
*/
static int set_smt_control(int smt_state)
{
if (set_attribute(SYS_SMT_CONTROL, "%d", smt_state)) {
- /* Silently ignore kernel not supporting this feature */
- if (errno != ENODEV)
- perror(SYS_SMT_CONTROL);
- return -1;
+ switch (errno) {
+ case ENOENT:
+ /*
+ * The kernel does not have the interface.
+ * Try the old method.
+ */
+ return -2;
+ case ENODEV:
+ /*
+ * Setting SMT state not supported by this interface.
+ * On older kernels (before Linux 6.6) the generic interface
+ * may exist but is not hooked on powerpc resulting in ENODEV
+ * on kernels that can set SMT using the old interface.
+ */
+ return -2;
+ default:
+ perror(SYS_SMT_CONTROL);
+ return -1;
+ }
}
return 0;
}
@@ -405,7 +421,7 @@ static int do_smt(char *state, bool numeric)

Nysal Jan K.A.

<nysal@linux.ibm.com>
unread,
Apr 15, 2024, 2:42:46 AMApr 15
to Michal Suchánek, Tyrel Datwyler, Laurent Dufour, powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com
I'm taking a look at this Michal. I have a very basic patch that fixes the
issue. I'd like to test this a bit and also verify that the original DLPAR
use case has no regressions.

Regards
--Nysal
Reply all
Reply to author
Forward
0 new messages