[PATCH] mempool: resize l1_pool_stats once

16 views
Skip to first unread message

Waldemar Kozaczuk

unread,
May 3, 2021, 4:48:48 PM5/3/21
to osv...@googlegroups.com, Waldemar Kozaczuk
On OSv emailing list David Smith reported sporadic crashes
when running OSv on firecracker in nested virtualization setup
looking like this:

"OSv v0.55.0-168-g31a04239
1 CPUs detected
Firmware vendor: Unknown
bsd: initializingAssertion failed: sched::preemptable() (arch/x64/mmu.cc: page_fault: 37)
[backtrace]"

or

"OSv v0.55.0-237-g7cd33926
1 CPUs detected
Firmware vendor: Unknown
trying to execute or access missing symbol
[backtrace]"

After long back-and-forth, I was able to re-create this issue
on firecracker and qemu microvm where I was able to connect
with gdb. It turned out that OSv would trigger page fault in
at core/mempool.cc:1198:

l1_pool_stats[cpu_id]._nr = nr;

The l1_pool_stats is a vector that gets re-sized to the number of cpus
every time the cpu notifier is called. Unfortunately, with this
approach we may occasionally experience a race condition when this
vector gets resized in parallel at the same time on multiple CPUs.
In another race scenario, the smp_allocator_cnt counter might be incremented
in l2::fill_thread() and smp_allocator set to true before l1_pool_stats
get resized in cpu::notifier in which case the accesses to l1_pool_stats
might result in page faults.

To fix this we need to guarantee that the l1_pool_stats vector is
resized once and before smp_allocator flag is set.

To achieve this, this patch adds another atomic counter,
l1_initialized_cnt, that tracks number of initialized l1 pools.
Only when that counter is equal to the number of cpus in the system
we resize the l1_pool_stats vector. We also do it before incrementing
the smp_allocator_cnt.

Signed-off-by: Waldemar Kozaczuk <jwkoz...@gmail.com>
---
core/mempool.cc | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/core/mempool.cc b/core/mempool.cc
index bd8e2fcf..c85bc2be 100644
--- a/core/mempool.cc
+++ b/core/mempool.cc
@@ -1323,15 +1323,18 @@ private:
std::unique_ptr<sched::thread> _fill_thread;
};

+std::atomic<unsigned int> l1_initialized_cnt{};
PERCPU(l1*, percpu_l1);
static sched::cpu::notifier _notifier([] () {
*percpu_l1 = new l1(sched::cpu::current());
+ if (++l1_initialized_cnt == sched::cpus.size()) {
+ l1_pool_stats.resize(sched::cpus.size());
+ }
// N per-cpu threads for L1 page pool, 1 thread for L2 page pool
// Switch to smp_allocator only when all the N + 1 threads are ready
if (smp_allocator_cnt++ == sched::cpus.size()) {
smp_allocator = true;
}
- l1_pool_stats.resize(sched::cpus.size());
});
static inline l1& get_l1()
{
--
2.30.2

Commit Bot

unread,
May 5, 2021, 4:24:57 AM5/5/21
to osv...@googlegroups.com, Waldemar Kozaczuk
From: Waldemar Kozaczuk <jwkoz...@gmail.com>
Committer: Nadav Har'El <n...@scylladb.com>
Branch: master

mempool: resize l1_pool_stats once
Message-Id: <20210503204838.84...@gmail.com>

---
diff --git a/core/mempool.cc b/core/mempool.cc
--- a/core/mempool.cc
+++ b/core/mempool.cc
@@ -1323,15 +1323,18 @@ class l2 {
Reply all
Reply to author
Forward
0 new messages