
[PATCH v8 3/8] x86: Enable Intel Turbo Boost Max Technology 3.0

Tim Chen

Nov 22, 2016, 3:30:06 PM
On platforms supporting Intel Turbo Boost Max Technology 3.0, the maximum
turbo frequencies of some cores in a CPU package may be higher than for
the other cores in the same package. In that case, better performance
(and possibly lower energy consumption as well) can be achieved by
making the scheduler prefer to run tasks on the CPUs with higher max
turbo frequencies.

To that end, set up a core priority metric to abstract the core
preferences based on the maximum turbo frequency. In that metric,
the cores with higher maximum turbo frequencies are higher-priority
than the other cores in the same package and that causes the scheduler
to favor them when making load-balancing decisions using the asymmetric
packing approach. At the same time, the priority of SMT threads with a
higher CPU number is reduced so as to avoid scheduling tasks on all of
the threads that belong to a favored core before all of the other cores
have been given a task to run.
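
To make the sibling arithmetic concrete, here is a minimal userspace sketch (not part of the patch). It models only the formula used in the sibling loop of sched_set_itmt_core_prio(), prio * smp_num_siblings / i. The helper names smt_thread_prio() and fill_thread_prios() and the flat array layout are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model of the sibling loop in sched_set_itmt_core_prio().
 * The i-th SMT thread of a core (i counted from 1) gets
 * core_prio * nr_siblings / i, using integer division as in the patch.
 */
static int smt_thread_prio(int core_prio, int nr_siblings, int i)
{
	assert(nr_siblings > 0 && i >= 1 && i <= nr_siblings);
	return core_prio * nr_siblings / i;
}

/*
 * Fill out[] with per-thread priorities for nr_cores cores, laid out as
 * out[core * nr_siblings + thread]. The layout is hypothetical; the
 * kernel stores these values in a per-CPU variable instead.
 */
static void fill_thread_prios(const int *core_prio, size_t nr_cores,
			      int nr_siblings, int *out)
{
	for (size_t c = 0; c < nr_cores; c++)
		for (int t = 0; t < nr_siblings; t++)
			out[c * nr_siblings + t] =
				smt_thread_prio(core_prio[c], nr_siblings, t + 1);
}
```

With made-up core priorities of 40 and 30 and two threads per core, the thread priorities come out as 80, 40, 60 and 30, so the first thread of each core ranks above either second thread. That ordering relies on the core priorities being reasonably close: with 100 and 40, the favored core's second thread (100) would outrank the other core's first thread (80).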

The priority metric will be initialized by the P-state driver with the
help of the sched_set_itmt_core_prio() function. The P-state driver
will also determine whether or not ITMT is supported by the platform
and will call sched_set_itmt_support() to indicate that.
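
The call order described here (priorities first, then the support flag) can be sketched in userspace as follows. This is an illustration under stated assumptions, not the P-state driver code: the two sched_* functions are simplified stand-ins for the ones added by this patch (no locking, no sibling scaling), and mock_pstate_init(), NR_CPUS and the turbo values are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for the sketch */

/* Userspace stand-ins for the per-CPU variable and flag in itmt.c. */
static int sched_core_priority[NR_CPUS];
static bool sched_itmt_capable;

/* Simplified: one thread per core, so no sibling scaling here. */
static void sched_set_itmt_core_prio(int prio, int core_cpu)
{
	assert(core_cpu >= 0 && core_cpu < NR_CPUS);
	sched_core_priority[core_cpu] = prio;
}

static void sched_set_itmt_support(void)
{
	sched_itmt_capable = true;
}

/*
 * Hypothetical driver init: publish every CPU's priority first, then
 * announce ITMT support, matching the ordering the patch asks for.
 */
static void mock_pstate_init(const int *max_turbo, int nr_cpus)
{
	for (int cpu = 0; cpu < nr_cpus; cpu++)
		sched_set_itmt_core_prio(max_turbo[cpu], cpu);
	sched_set_itmt_support();
}
```

Setting the flag last means that anything checking sched_itmt_capable only ever sees fully populated priorities, which is the reason the kernel-doc comment in itmt.c requires this order.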

Co-developed-by: Peter Zijlstra (Intel) <pet...@infradead.org>
Co-developed-by: Srinivas Pandruvada <srinivas....@linux.intel.com>
Signed-off-by: Tim Chen <tim.c...@linux.intel.com>
---
arch/x86/Kconfig | 9 ++++
arch/x86/include/asm/topology.h | 28 +++++++++++
arch/x86/kernel/Makefile | 1 +
arch/x86/kernel/itmt.c | 109 ++++++++++++++++++++++++++++++++++++++++
4 files changed, 147 insertions(+)
create mode 100644 arch/x86/kernel/itmt.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index bada636..25950f0 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -939,6 +939,15 @@ config SCHED_MC
making when dealing with multi-core CPU chips at a cost of slightly
increased overhead in some places. If unsure say N here.

+config SCHED_ITMT
+ bool "Intel Turbo Boost Max Technology (ITMT) scheduler support"
+ depends on SCHED_MC && CPU_SUP_INTEL && X86_INTEL_PSTATE
+ ---help---
+ ITMT enabled scheduler support improves the CPU scheduler's decision
+ to move tasks to cpu core that can be boosted to a higher frequency
+ than others. It will have better performance at a cost of slightly
+ increased overhead in task migrations. If unsure say N here.
+
source "kernel/Kconfig.preempt"

config UP_LATE_INIT
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index a5ca88a..8ace951 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -147,4 +147,32 @@ int x86_pci_root_bus_node(int bus);
void x86_pci_root_bus_resources(int bus, struct list_head *resources);

extern bool x86_topology_update;
+
+#ifdef CONFIG_SCHED_ITMT
+#include <asm/percpu.h>
+
+DECLARE_PER_CPU_READ_MOSTLY(int, sched_core_priority);
+
+/* Interface to set priority of a cpu */
+void sched_set_itmt_core_prio(int prio, int core_cpu);
+
+/* Interface to notify scheduler that system supports ITMT */
+void sched_set_itmt_support(void);
+
+/* Interface to notify scheduler that system revokes ITMT support */
+void sched_clear_itmt_support(void);
+
+#else /* CONFIG_SCHED_ITMT */
+
+static inline void sched_set_itmt_core_prio(int prio, int core_cpu)
+{
+}
+static inline void sched_set_itmt_support(void)
+{
+}
+static inline void sched_clear_itmt_support(void)
+{
+}
+#endif /* CONFIG_SCHED_ITMT */
+
#endif /* _ASM_X86_TOPOLOGY_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 79076d7..bbd0ebc 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -123,6 +123,7 @@ obj-$(CONFIG_EFI) += sysfb_efi.o

obj-$(CONFIG_PERF_EVENTS) += perf_regs.o
obj-$(CONFIG_TRACING) += tracepoint.o
+obj-$(CONFIG_SCHED_ITMT) += itmt.o

ifdef CONFIG_FRAME_POINTER
obj-y += unwind_frame.o
diff --git a/arch/x86/kernel/itmt.c b/arch/x86/kernel/itmt.c
new file mode 100644
index 0000000..63c9b3e
--- /dev/null
+++ b/arch/x86/kernel/itmt.c
@@ -0,0 +1,109 @@
+/*
+ * itmt.c: Support Intel Turbo Boost Max Technology 3.0
+ *
+ * (C) Copyright 2016 Intel Corporation
+ * Author: Tim Chen <tim.c...@linux.intel.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2
+ * of the License.
+ *
+ * On platforms supporting Intel Turbo Boost Max Technology 3.0, (ITMT),
+ * the maximum turbo frequencies of some cores in a CPU package may be
+ * higher than for the other cores in the same package. In that case,
+ * better performance can be achieved by making the scheduler prefer
+ * to run tasks on the CPUs with higher max turbo frequencies.
+ *
+ * This file provides functions and data structures for enabling the
+ * scheduler to favor scheduling on cores that can be boosted to a higher
+ * frequency under ITMT.
+ */
+
+#include <linux/sched.h>
+#include <linux/cpumask.h>
+#include <linux/cpuset.h>
+#include <asm/mutex.h>
+#include <linux/sched.h>
+#include <linux/sysctl.h>
+#include <linux/nodemask.h>
+
+static DEFINE_MUTEX(itmt_update_mutex);
+DEFINE_PER_CPU_READ_MOSTLY(int, sched_core_priority);
+
+/* Boolean to track if system has ITMT capabilities */
+static bool __read_mostly sched_itmt_capable;
+
+/**
+ * sched_set_itmt_support() - Indicate platform supports ITMT
+ *
+ * This function is used by the OS to indicate to scheduler that the platform
+ * is capable of supporting the ITMT feature.
+ *
+ * In the current scheme, the pstate driver detects whether the system
+ * is ITMT capable and calls sched_set_itmt_support().
+ *
+ * This must be done only after sched_set_itmt_core_prio
+ * has been called to set the cpus' priorities.
+ */
+void sched_set_itmt_support(void)
+{
+ mutex_lock(&itmt_update_mutex);
+
+ sched_itmt_capable = true;
+
+ mutex_unlock(&itmt_update_mutex);
+}
+
+/**
+ * sched_clear_itmt_support() - Revoke platform's support of ITMT
+ *
+ * This function is used by the OS to indicate that it has
+ * revoked the platform's support of ITMT feature.
+ *
+ */
+void sched_clear_itmt_support(void)
+{
+ mutex_lock(&itmt_update_mutex);
+
+ sched_itmt_capable = false;
+
+ mutex_unlock(&itmt_update_mutex);
+}
+
+int arch_asym_cpu_priority(int cpu)
+{
+ return per_cpu(sched_core_priority, cpu);
+}
+
+/**
+ * sched_set_itmt_core_prio() - Set CPU priority based on ITMT
+ * @prio: Priority of cpu core
+ * @core_cpu: The cpu number associated with the core
+ *
+ * The pstate driver will find out the max boost frequency
+ * and call this function to set a priority proportional
+ * to the max boost frequency. CPU with higher boost
+ * frequency will receive higher priority.
+ *
+ * No need to rebuild sched domain after updating
+ * the CPU priorities. The sched domains have no
+ * dependency on CPU priorities.
+ */
+void sched_set_itmt_core_prio(int prio, int core_cpu)
+{
+ int cpu, i = 1;
+
+ for_each_cpu(cpu, topology_sibling_cpumask(core_cpu)) {
+ int smt_prio;
+
+ /*
+ * Ensure that the siblings are moved to the end
+ * of the priority chain and only used when
+ * all other high priority cpus are out of capacity.
+ */
+ smt_prio = prio * smp_num_siblings / i;
+ per_cpu(sched_core_priority, cpu) = smt_prio;
+ i++;
+ }
+}
--
2.5.5

tip-bot for Tim Chen

Nov 24, 2016, 3:00:05 PM
Commit-ID: 5e76b2ab36b40ca33023e78725bdc69eafd63134
Gitweb: http://git.kernel.org/tip/5e76b2ab36b40ca33023e78725bdc69eafd63134
Author: Tim Chen <tim.c...@linux.intel.com>
AuthorDate: Tue, 22 Nov 2016 12:23:55 -0800
Committer: Thomas Gleixner <tg...@linutronix.de>
CommitDate: Thu, 24 Nov 2016 20:44:19 +0100

x86: Enable Intel Turbo Boost Max Technology 3.0

On platforms supporting Intel Turbo Boost Max Technology 3.0, the maximum
turbo frequencies of some cores in a CPU package may be higher than for
the other cores in the same package. In that case, better performance
(and possibly lower energy consumption as well) can be achieved by
making the scheduler prefer to run tasks on the CPUs with higher max
turbo frequencies.

To that end, set up a core priority metric to abstract the core
preferences based on the maximum turbo frequency. In that metric,
the cores with higher maximum turbo frequencies are higher-priority
than the other cores in the same package and that causes the scheduler
to favor them when making load-balancing decisions using the asymmetric
packing approach. At the same time, the priority of SMT threads with a
higher CPU number is reduced so as to avoid scheduling tasks on all of
the threads that belong to a favored core before all of the other cores
have been given a task to run.

The priority metric will be initialized by the P-state driver with the
help of the sched_set_itmt_core_prio() function. The P-state driver
will also determine whether or not ITMT is supported by the platform
and will call sched_set_itmt_support() to indicate that.

Co-developed-by: Peter Zijlstra (Intel) <pet...@infradead.org>
Co-developed-by: Srinivas Pandruvada <srinivas....@linux.intel.com>
Signed-off-by: Tim Chen <tim.c...@linux.intel.com>
Cc: linu...@vger.kernel.org
Cc: pet...@infradead.org
Cc: jo...@redhat.com
Cc: r...@rjwysocki.net
Cc: linux...@vger.kernel.org
Cc: Srinivas Pandruvada <srinivas....@linux.intel.com>
Cc: b...@suse.de
Link: http://lkml.kernel.org/r/cd401ccdff88f88c8349314febdc25d51...@linux.intel.com
Signed-off-by: Thomas Gleixner <tg...@linutronix.de>

---
arch/x86/Kconfig | 9 ++++
arch/x86/include/asm/topology.h | 28 +++++++++++
arch/x86/kernel/Makefile | 1 +
arch/x86/kernel/itmt.c | 109 ++++++++++++++++++++++++++++++++++++++++
4 files changed, 147 insertions(+)

Ingo Molnar

Nov 25, 2016, 3:30:05 AM

* tip-bot for Tim Chen <tip...@zytor.com> wrote:

> Commit-ID: 5e76b2ab36b40ca33023e78725bdc69eafd63134
> Gitweb: http://git.kernel.org/tip/5e76b2ab36b40ca33023e78725bdc69eafd63134
> Author: Tim Chen <tim.c...@linux.intel.com>
> AuthorDate: Tue, 22 Nov 2016 12:23:55 -0800
> Committer: Thomas Gleixner <tg...@linutronix.de>
> CommitDate: Thu, 24 Nov 2016 20:44:19 +0100
>
> x86: Enable Intel Turbo Boost Max Technology 3.0

This patch doesn't build:

Note that this patch has to be redone anyway, as it won't even build:

> +#include <linux/sched.h>
> +#include <linux/cpumask.h>
> +#include <linux/cpuset.h>
> +#include <asm/mutex.h>
> +#include <linux/sched.h>
> +#include <linux/sysctl.h>
> +#include <linux/nodemask.h>

arch/x86/kernel/itmt.c:26:23: fatal error: asm/mutex.h: No such file or directory

> +config SCHED_ITMT
> + bool "Intel Turbo Boost Max Technology (ITMT) scheduler support"
> + depends on SCHED_MC && CPU_SUP_INTEL && X86_INTEL_PSTATE
> + ---help---
> + ITMT enabled scheduler support improves the CPU scheduler's decision
> + to move tasks to cpu core that can be boosted to a higher frequency
> + than others. It will have better performance at a cost of slightly
> + increased overhead in task migrations. If unsure say N here.

Argh, so the 'itmt' name really sucks as well - could we please make it something
more obvious - like SCHED_INTEL_TURBO or so - and similarly rename the file as
well?

The sched_intel_turbo.c file could thus host all things related to scheduler
support of turbo frequencies - it shouldn't be named after the Intel acronym of
the day...

Thanks,

Ingo

Peter Zijlstra

Nov 25, 2016, 3:40:07 AM
On Fri, Nov 25, 2016 at 09:19:47AM +0100, Ingo Molnar wrote:
>
> * tip-bot for Tim Chen <tip...@zytor.com> wrote:
>
> > Commit-ID: 5e76b2ab36b40ca33023e78725bdc69eafd63134
> > Gitweb: http://git.kernel.org/tip/5e76b2ab36b40ca33023e78725bdc69eafd63134
> > Author: Tim Chen <tim.c...@linux.intel.com>
> > AuthorDate: Tue, 22 Nov 2016 12:23:55 -0800
> > Committer: Thomas Gleixner <tg...@linutronix.de>
> > CommitDate: Thu, 24 Nov 2016 20:44:19 +0100
> >
> > x86: Enable Intel Turbo Boost Max Technology 3.0
>
> This patch doesn't build:
>
> Note that this patch has to be redone anyway, as it won't even build:
>
> > +#include <linux/sched.h>
> > +#include <linux/cpumask.h>
> > +#include <linux/cpuset.h>
> > +#include <asm/mutex.h>
> > +#include <linux/sched.h>
> > +#include <linux/sysctl.h>
> > +#include <linux/nodemask.h>
>
> arch/x86/kernel/itmt.c:26:23: fatal error: asm/mutex.h: No such file or directory

Hehe, indeed, we killed that dead in the locking branch. Weird include
to have anyway.

Thomas Gleixner

Nov 25, 2016, 2:20:07 PM
On Fri, 25 Nov 2016, Ingo Molnar wrote:

>
> * tip-bot for Tim Chen <tip...@zytor.com> wrote:
>
> > Commit-ID: 5e76b2ab36b40ca33023e78725bdc69eafd63134
> > Gitweb: http://git.kernel.org/tip/5e76b2ab36b40ca33023e78725bdc69eafd63134
> > Author: Tim Chen <tim.c...@linux.intel.com>
> > AuthorDate: Tue, 22 Nov 2016 12:23:55 -0800
> > Committer: Thomas Gleixner <tg...@linutronix.de>
> > CommitDate: Thu, 24 Nov 2016 20:44:19 +0100
> >
> > x86: Enable Intel Turbo Boost Max Technology 3.0
>
> This patch doesn't build:
>
> Note that this patch has to be redone anyway, as it won't even build:

The branch where I merged it to builds fine.

Though, yes I missed the asm/mutex.h include which obviously should be
linux/mutex.h

> > +#include <linux/sched.h>
> > +#include <linux/cpumask.h>
> > +#include <linux/cpuset.h>
> > +#include <asm/mutex.h>
> > +#include <linux/sched.h>
> > +#include <linux/sysctl.h>
> > +#include <linux/nodemask.h>
>
> arch/x86/kernel/itmt.c:26:23: fatal error: asm/mutex.h: No such file or directory
>
> > +config SCHED_ITMT
> > + bool "Intel Turbo Boost Max Technology (ITMT) scheduler support"
> > + depends on SCHED_MC && CPU_SUP_INTEL && X86_INTEL_PSTATE
> > + ---help---
> > + ITMT enabled scheduler support improves the CPU scheduler's decision
> > + to move tasks to cpu core that can be boosted to a higher frequency
> > + than others. It will have better performance at a cost of slightly
> > + increased overhead in task migrations. If unsure say N here.
>
> Argh, so the 'itmt' name really sucks as well - could we please make it something
> more obvious - like SCHED_INTEL_TURBO or so - and similarly rename the file as
> well?
>
> The sched_intel_turbo.c file could thus host all things related to scheduler
> support of turbo frequencies - it shouldn't be named after the Intel acronym of
> the day...

It would be nice to come up with such nitpicks during review. This thing
went through 8 iterations, but nothing came up and I didn't mind the itmt
naming.

Thanks,

tglx

Ingo Molnar

Nov 28, 2016, 4:00:05 AM

* Thomas Gleixner <tg...@linutronix.de> wrote:

> On Fri, 25 Nov 2016, Ingo Molnar wrote:
>
> >
> > * tip-bot for Tim Chen <tip...@zytor.com> wrote:
> >
> > > Commit-ID: 5e76b2ab36b40ca33023e78725bdc69eafd63134
> > > Gitweb: http://git.kernel.org/tip/5e76b2ab36b40ca33023e78725bdc69eafd63134
> > > Author: Tim Chen <tim.c...@linux.intel.com>
> > > AuthorDate: Tue, 22 Nov 2016 12:23:55 -0800
> > > Committer: Thomas Gleixner <tg...@linutronix.de>
> > > CommitDate: Thu, 24 Nov 2016 20:44:19 +0100
> > >
> > > x86: Enable Intel Turbo Boost Max Technology 3.0
> >
> > This patch doesn't build:
> >
> > Note that this patch has to be redone anyway, as it won't even build:
>
> The branch where I merged it to builds fine.

Indeed you are right - asm/mutex.h is gone in locking/core, so this is a semantic
merge conflict, not a build failure.

Yeah, so I had to NAK an early iteration and didn't get around to doing a really
detailed review yet - and after (falsely) thinking it had a build failure I got
overly worked up about the bad naming: my bad and apologies!

So the code looks good to me but the naming still sucks a bit - I'm fine with
having the commits re-merged as-is and renaming the Kconfig variable to something
more expressive: I've done this in tip:sched/core and have fixed the asm/mutex.h
thing as well.

Wrt. improving the naming:

Firstly, popular tech news has coined the 'Turbo Boost Max' technology 'TBM' (TBM2
and TBM3) as the natural acronym of the Intel feature - not 'ITMT'. So to anyone
except people well aware of Intel acronyms the term 'ITMT' will be pretty
meaningless.

Does something more generic like SCHED_MC_PRIO (as an extension to SCHED_MC) work
with everyone? Intel Turbo Max 3.0 is the current (only) implementation of it, but
I don't think the technology will stop at that stage as dies are getting larger
but thinner.

I also think the Kconfig text is somewhat misleading and the default-disabled
status is counterproductive:

+config SCHED_ITMT
+ bool "Intel Turbo Boost Max Technology (ITMT) scheduler support"
+ depends on SCHED_MC && CPU_SUP_INTEL && X86_INTEL_PSTATE
+ ---help---
+ ITMT enabled scheduler support improves the CPU scheduler's decision
+ to move tasks to cpu core that can be boosted to a higher frequency
+ than others. It will have better performance at a cost of slightly
+ increased overhead in task migrations. If unsure say N here.

... the extra cost of smarter CPU selection is IMHO overwhelmed by the negative
effects of not knowing about core frequency ordering, on most workloads.

A better default would be default-y I believe (that is what we do for CPU hardware
enablement typically), and a better description would be something like:

+config SCHED_MC_PRIO
+ bool "CPU core priorities scheduler support"
+ depends on SCHED_MC && CPU_SUP_INTEL && X86_INTEL_PSTATE
+ default y
+ ---help---
+ Intel Turbo Boost Max 3.0 enabled CPUs have a core ordering determined at
+ manufacturing time, which allows certain cores to reach higher turbo
+ frequencies (when running single threaded workloads) than others.
+
+ Enabling this kernel feature teaches the scheduler about the TBM3 priority
+ order of the CPU cores and adjusts the scheduler's CPU selection logic
+ accordingly, so that higher overall system performance can be achieved.
+
+ This feature will have no effect on CPUs without this feature.
+
+ If unsure say Y here.

If/when other architectures make use of this the Kconfig entry can be moved into
the scheduler Kconfig - but for the time being it can stay in arch/x86/.

Another variant would be to eliminate the Kconfig option altogether and make it a
natural feature of SCHED_MC (like it is in the core scheduler).

Thanks,

Ingo

Tim Chen

Nov 28, 2016, 12:40:08 PM
I am fine with renaming SCHED_ITMT to SCHED_MC_PRIO. Patches 7 and 8, which
Rafael merged into his tree, also use SCHED_ITMT, so they will need to
be updated if we rename it.

Thanks.

Tim

Rafael J. Wysocki

Nov 28, 2016, 6:30:05 PM
No, I haven't. They are in tip AFAICS.

Thanks,
Rafael

Ingo Molnar

Nov 29, 2016, 2:20:04 AM

* Tim Chen <tim.c...@linux.intel.com> wrote:

> > + If unsure say Y here.
> >
> > If/when other architectures make use of this the Kconfig entry can be moved into 
> > the scheduler Kconfig - but for the time being it can stay in arch/x86/.
> >
> > Another variant would be to eliminate the Kconfig option altogether and make it a 
> > natural feature of SCHED_MC (like it is in the core scheduler).
> >
>
> I am fine with renaming SCHED_ITMT to SCHED_MC_PRIO.

Ok, could you please send a delta patch on top of tip:sched/core that does this
and the other improvements?

Thanks,

Ingo

Tim Chen

Nov 29, 2016, 1:50:07 PM
I have sent a delta patch in a separate mail for this change.

Thanks.

Tim