[PATCH] dynamically allocate struct cpumask

Nick Desaulniers

Jan 27, 2020, 2:16:18 AM1/27/20
to pbon...@redhat.com, tg...@linutronix.de, mi...@redhat.com, b...@alien8.de, Nick Desaulniers, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, H. Peter Anvin, x...@kernel.org, k...@vger.kernel.org, linux-...@vger.kernel.org, clang-bu...@googlegroups.com
This helps avoid a potentially large stack allocation.

When building with:
$ make CC=clang arch/x86/ CFLAGS=-Wframe-larger-than=1000
The following warning is observed:
arch/x86/kernel/kvm.c:494:13: warning: stack frame size of 1064 bytes in
function 'kvm_send_ipi_mask_allbutself' [-Wframe-larger-than=]
static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int
vector)
^
Debugging with:
https://github.com/ClangBuiltLinux/frame-larger-than
via:
$ python3 frame_larger_than.py arch/x86/kernel/kvm.o \
kvm_send_ipi_mask_allbutself
points to the stack allocated `struct cpumask new_mask` in
`kvm_send_ipi_mask_allbutself`. A `struct cpumask` is potentially large:
it holds CONFIG_NR_CPUS bits, stored as an array of
CONFIG_NR_CPUS / BITS_PER_LONG unsigned longs for the target
architecture. CONFIG_NR_CPUS for X86_64 can be as high as 8192, making a
single instance of a `struct cpumask` 1024 B.
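
For reference, `struct cpumask` boils down to a fixed-size bitmap (a
simplified sketch of include/linux/cpumask.h; exact form varies by
kernel version):

typedef struct cpumask { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;

/*
 * DECLARE_BITMAP(bits, NR_CPUS) expands to
 *   unsigned long bits[BITS_TO_LONGS(NR_CPUS)];
 * so NR_CPUS=8192 with BITS_PER_LONG=64 gives 128 longs = 1024 bytes,
 * all of which previously lived on this function's stack frame.
 */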

Signed-off-by: Nick Desaulniers <nick.des...@gmail.com>
---
arch/x86/kernel/kvm.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 32ef1ee733b7..d41c0a0d62a2 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -494,13 +494,15 @@ static void kvm_send_ipi_mask(const struct cpumask *mask, int vector)
static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int vector)
{
unsigned int this_cpu = smp_processor_id();
- struct cpumask new_mask;
+ struct cpumask *new_mask;
const struct cpumask *local_mask;

- cpumask_copy(&new_mask, mask);
- cpumask_clear_cpu(this_cpu, &new_mask);
- local_mask = &new_mask;
+ new_mask = kmalloc(sizeof(*new_mask), GFP_KERNEL);
+ cpumask_copy(new_mask, mask);
+ cpumask_clear_cpu(this_cpu, new_mask);
+ local_mask = new_mask;
__send_ipi_mask(local_mask, vector);
+ kfree(new_mask);
}

/*
--
2.17.1

Peter Zijlstra

Jan 27, 2020, 3:10:01 AM1/27/20
to Nick Desaulniers, pbon...@redhat.com, tg...@linutronix.de, mi...@redhat.com, b...@alien8.de, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, H. Peter Anvin, x...@kernel.org, k...@vger.kernel.org, linux-...@vger.kernel.org, clang-bu...@googlegroups.com
Right, an on-stack cpumask is definitely dodgy.

> + struct cpumask *new_mask;
> const struct cpumask *local_mask;
>
> - cpumask_copy(&new_mask, mask);
> - cpumask_clear_cpu(this_cpu, &new_mask);
> - local_mask = &new_mask;
> + new_mask = kmalloc(sizeof(*new_mask), GFP_KERNEL);
> + cpumask_copy(new_mask, mask);
> + cpumask_clear_cpu(this_cpu, new_mask);
> + local_mask = new_mask;
> __send_ipi_mask(local_mask, vector);
> + kfree(new_mask);
> }

One alternative approach is adding the inverse of cpu_bit_bitmap. I'm
not entirely sure how often we need the all-but-self mask, but ISTR
there were other places too.
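
For context, cpumask_of() hands out single-bit masks today by offsetting
a pointer into the shared cpu_bit_bitmap table, roughly (simplified from
include/linux/cpumask.h):

/* Shared table of single-bit masks; row x+1 has only bit x set. */
extern const unsigned long
	cpu_bit_bitmap[BITS_PER_LONG + 1][BITS_TO_LONGS(NR_CPUS)];

static inline const struct cpumask *get_cpu_mask(unsigned int cpu)
{
	const unsigned long *p = cpu_bit_bitmap[1 + cpu % BITS_PER_LONG];
	p -= cpu / BITS_PER_LONG;	/* slide the window so the set bit lands at 'cpu' */
	return to_cpumask(p);
}

The inverse Peter suggests would presumably be an analogous shared table
whose rows have every bit set except one, so an all-but-self mask could
be handed out as a const pointer with no per-call copy or allocation.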

Nick Desaulniers

Jan 27, 2020, 3:14:38 AM1/27/20
to Peter Zijlstra, Paolo Bonzini, Thomas Gleixner, Ingo Molnar, b...@alien8.de, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, H. Peter Anvin, x...@kernel.org, k...@vger.kernel.org, Linux Kernel Mailing List, clang-bu...@googlegroups.com
Probably should check for allocation failure, d'oh!
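
Something along these lines would cover it (an untested sketch on top of
the patch above; note there is no good failure mode here, which is part
of why allocating on this path is awkward in the first place):

	new_mask = kmalloc(sizeof(*new_mask), GFP_KERNEL);
	if (!new_mask) {
		/*
		 * No good recovery: dropping the IPI is not an option, and
		 * falling back to the original mask would wrongly include
		 * this CPU.
		 */
		WARN_ON_ONCE(1);
		return;
	}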

Vitaly Kuznetsov

Jan 27, 2020, 3:56:34 AM1/27/20
to Nick Desaulniers, Sean Christopherson, Wanpeng Li, Jim Mattson, Joerg Roedel, H. Peter Anvin, x...@kernel.org, k...@vger.kernel.org, linux-...@vger.kernel.org, clang-bu...@googlegroups.com, pbon...@redhat.com, tg...@linutronix.de, mi...@redhat.com, b...@alien8.de
You could've used alloc_cpumask_var() instead; however, I think that
memory allocation on this path is undesirable. We can always
pre-allocate one cpumask variable per cpu and use it every time, e.g. we
do this for Hyper-V.
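
For the first suggestion, the body would look roughly like this
(untested sketch; with CONFIG_CPUMASK_OFFSTACK=n the cpumask_var_t stays
on the stack, but then NR_CPUS is small enough for that to be harmless,
and with CONFIG_CPUMASK_OFFSTACK=y it becomes a heap allocation):

	cpumask_var_t new_mask;

	if (!alloc_cpumask_var(&new_mask, GFP_KERNEL))
		return;	/* same open question about the failure case as above */

	cpumask_copy(new_mask, mask);
	cpumask_clear_cpu(this_cpu, new_mask);
	__send_ipi_mask(new_mask, vector);
	free_cpumask_var(new_mask);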

> + cpumask_copy(new_mask, mask);
> + cpumask_clear_cpu(this_cpu, new_mask);
> + local_mask = new_mask;
> __send_ipi_mask(local_mask, vector);
> + kfree(new_mask);
> }
>
> /*

--
Vitaly

Wanpeng Li

Feb 3, 2020, 3:32:04 AM2/3/20
to Nick Desaulniers, Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, H. Peter Anvin, the arch/x86 maintainers, kvm, LKML, clang-bu...@googlegroups.com
Hi Nick,
On Mon, 27 Jan 2020 at 15:16, Nick Desaulniers
<nick.des...@gmail.com> wrote:
>
> This helps avoid a potentially large stack allocation.
>
> When building with:
> $ make CC=clang arch/x86/ CFLAGS=-Wframe-larger-than=1000
> The following warning is observed:
> arch/x86/kernel/kvm.c:494:13: warning: stack frame size of 1064 bytes in
> function 'kvm_send_ipi_mask_allbutself' [-Wframe-larger-than=]
> static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int
> vector)
> ^
> Debugging with:
> https://github.com/ClangBuiltLinux/frame-larger-than
> via:
> $ python3 frame_larger_than.py arch/x86/kernel/kvm.o \
> kvm_send_ipi_mask_allbutself
> points to the stack allocated `struct cpumask new_mask` in
> `kvm_send_ipi_mask_allbutself`. A `struct cpumask` is potentially large:
> it holds CONFIG_NR_CPUS bits, stored as an array of
> CONFIG_NR_CPUS / BITS_PER_LONG unsigned longs for the target
> architecture. CONFIG_NR_CPUS for X86_64 can be as high as 8192, making a
> single instance of a `struct cpumask` 1024 B.

Could you help test the below untested patch?

From 867753e2fa27906f15df7902ba1bce7f9cef6ebe Mon Sep 17 00:00:00 2001
From: Wanpeng Li <wanp...@tencent.com>
Date: Mon, 3 Feb 2020 16:26:35 +0800
Subject: [PATCH] KVM: Pre-allocate 1 cpumask variable per cpu for both
pv tlb and pv ipis

Reported-by: Nick Desaulniers <nick.des...@gmail.com>
Signed-off-by: Wanpeng Li <wanp...@tencent.com>
---
arch/x86/kernel/kvm.c | 33 +++++++++++++++++++++------------
1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 81045aab..b1e8efa 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -425,6 +425,8 @@ static void __init sev_map_percpu_data(void)
}
}

+static DEFINE_PER_CPU(cpumask_var_t, __pv_cpu_mask);
+
#ifdef CONFIG_SMP
#define KVM_IPI_CLUSTER_SIZE (2 * BITS_PER_LONG)

@@ -490,12 +492,12 @@ static void kvm_send_ipi_mask(const struct cpumask *mask, int vector)
static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int vector)
{
unsigned int this_cpu = smp_processor_id();
- struct cpumask new_mask;
+ struct cpumask *new_mask = this_cpu_cpumask_var_ptr(__pv_cpu_mask);
const struct cpumask *local_mask;

- cpumask_copy(&new_mask, mask);
- cpumask_clear_cpu(this_cpu, &new_mask);
- local_mask = &new_mask;
+ cpumask_copy(new_mask, mask);
+ cpumask_clear_cpu(this_cpu, new_mask);
+ local_mask = new_mask;
__send_ipi_mask(local_mask, vector);
}

@@ -575,7 +577,6 @@ static void __init kvm_apf_trap_init(void)
update_intr_gate(X86_TRAP_PF, async_page_fault);
}

-static DEFINE_PER_CPU(cpumask_var_t, __pv_tlb_mask);

static void kvm_flush_tlb_others(const struct cpumask *cpumask,
const struct flush_tlb_info *info)
@@ -583,7 +584,7 @@ static void kvm_flush_tlb_others(const struct cpumask *cpumask,
u8 state;
int cpu;
struct kvm_steal_time *src;
- struct cpumask *flushmask = this_cpu_cpumask_var_ptr(__pv_tlb_mask);
+ struct cpumask *flushmask = this_cpu_cpumask_var_ptr(__pv_cpu_mask);

cpumask_copy(flushmask, cpumask);
/*
@@ -624,6 +625,7 @@ static void __init kvm_guest_init(void)
kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) {
pv_ops.mmu.flush_tlb_others = kvm_flush_tlb_others;
pv_ops.mmu.tlb_remove_table = tlb_remove_table;
+ pr_info("KVM setup pv remote TLB flush\n");
}

if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
@@ -732,23 +734,30 @@ static __init int activate_jump_labels(void)
}
arch_initcall(activate_jump_labels);

-static __init int kvm_setup_pv_tlb_flush(void)
+static __init int kvm_alloc_cpumask(void)
{
int cpu;
+ bool alloc = false;

if (kvm_para_has_feature(KVM_FEATURE_PV_TLB_FLUSH) &&
!kvm_para_has_hint(KVM_HINTS_REALTIME) &&
- kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) {
+ kvm_para_has_feature(KVM_FEATURE_STEAL_TIME))
+ alloc = true;
+
+#if defined(CONFIG_SMP)
+ if (!alloc && kvm_para_has_feature(KVM_FEATURE_PV_SEND_IPI))
+ alloc = true;
+#endif
+
+ if (alloc)
for_each_possible_cpu(cpu) {
- zalloc_cpumask_var_node(per_cpu_ptr(&__pv_tlb_mask, cpu),
+ zalloc_cpumask_var_node(per_cpu_ptr(&__pv_cpu_mask, cpu),
GFP_KERNEL, cpu_to_node(cpu));
}
- pr_info("KVM setup pv remote TLB flush\n");
- }

return 0;
}
-arch_initcall(kvm_setup_pv_tlb_flush);
+arch_initcall(kvm_alloc_cpumask);

#ifdef CONFIG_PARAVIRT_SPINLOCKS

--
1.8.3.1

Nick Desaulniers

Feb 3, 2020, 10:03:03 AM2/3/20
to Wanpeng Li, Nick Desaulniers, Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, H. Peter Anvin, the arch/x86 maintainers, kvm, LKML, clang-built-linux
Yes, this should help reduce the stack usage, thanks.
Acked-by: Nick Desaulniers <ndesau...@google.com>

--
Thanks,
~Nick Desaulniers