[COMMIT seastar master] reactor: disable io_uring on older kernels if not enough lockable memory is available

Commit Bot

<bot@cloudius-systems.com>
Jan 26, 2023, 11:04:17 AM
to seastar-dev@googlegroups.com, Avi Kivity
From: Avi Kivity <a...@scylladb.com>
Committer: Pavel Emelyanov <xe...@scylladb.com>
Branch: master

reactor: disable io_uring on older kernels if not enough lockable memory is available

Before Linux 5.12[1], io_uring accounted the memory of the ring buffers
as locked memory. This amounts to 32k/shard in our configuration, and
as the default mlock limit was 64k at the time, it allows for a maximum
of two vcpus.

To prevent problems, disable io_uring for those old kernels if less
than 8MB of lockable memory is available. This 8MB is more than
enough, but also allows other applications (e.g. mail clients that
lock encryption keys) to co-exist with Seastar. In practice lockable
memory will be either 64k (old configurations), 8MB (newer kernels
and/or systemd) or infinite (server installations that choose to let
the Seastar application lock its memory), so we aren't missing anything
by disallowing 7.3MB.
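
The arithmetic above can be sketched as a small standalone check. This is only an illustration, not code from the patch: the 32k/shard figure and the 8MB threshold come from the description above, while the helper names (`max_shards_for`, `enough_lockable_memory`, `current_mlock_limit`) are hypothetical.

```cpp
#include <sys/resource.h>
#include <cstdint>

// Figure quoted in the commit message: about 32 KiB of ring memory per shard.
constexpr std::uint64_t ring_bytes_per_shard = 32 << 10;
// Threshold chosen by the commit for pre-5.12 kernels: 8 MiB.
constexpr std::uint64_t min_lockable_bytes = 8 << 20;

// Hypothetical helper: how many shards fit under a given mlock limit
// when each shard's ring is accounted as locked memory.
constexpr std::uint64_t max_shards_for(std::uint64_t mlock_limit_bytes) {
    return mlock_limit_bytes / ring_bytes_per_shard;
}

// Hypothetical helper: the memory half of the commit's condition
// (the kernel version check is not modeled here).
constexpr bool enough_lockable_memory(std::uint64_t mlock_limit_bytes) {
    return mlock_limit_bytes >= min_lockable_bytes;
}

// Hypothetical helper: read the soft RLIMIT_MEMLOCK of the current process.
// Returns 0 on failure, i.e. assumes the worst, as the patch does.
inline std::uint64_t current_mlock_limit() {
    struct ::rlimit lim;
    if (::getrlimit(RLIMIT_MEMLOCK, &lim) == -1) {
        return 0;
    }
    return lim.rlim_cur; // RLIM_INFINITY is a very large value, so it passes the check
}

// The old 64k default leaves room for only two shards and fails the 8 MiB check.
static_assert(max_shards_for(64 << 10) == 2, "64k limit fits two shards");
static_assert(!enough_lockable_memory(64 << 10), "64k limit is below the threshold");
static_assert(enough_lockable_memory(8 << 20), "8 MiB limit meets the threshold");
```

Calling `enough_lockable_memory(current_mlock_limit())` at startup would report whether the running process meets the 8MB threshold on a pre-5.12 kernel.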

[1] https://github.com/torvalds/linux/commit/26bfa89e25f42d2b26fe951bbcf04bb13937fbba

Closes #1436

---
diff --git a/src/core/reactor_backend.cc b/src/core/reactor_backend.cc
--- a/src/core/reactor_backend.cc
+++ b/src/core/reactor_backend.cc
@@ -33,6 +33,7 @@
 #include <filesystem>
 #include <sys/poll.h>
 #include <sys/syscall.h>
+#include <sys/resource.h>
 
 #ifdef SEASTAR_HAVE_URING
 #include <liburing.h>
@@ -1246,13 +1247,27 @@ have_md_devices() {
     return false;
 }
 
+static size_t mlock_limit() {
+    struct ::rlimit lim;
+    int r = ::getrlimit(RLIMIT_MEMLOCK, &lim);
+    if (r == -1) {
+        return 0; // assume the worst; this is advisory anyway
+    }
+    return lim.rlim_cur;
+}
+
 static
 bool
 detect_io_uring() {
     if (!kernel_uname().whitelisted({"5.17"}) && have_md_devices()) {
         // Older kernels fall back to workqueues for RAID devices
         return false;
     }
+    if (!kernel_uname().whitelisted({"5.12"}) && mlock_limit() < (8 << 20)) {
+        // Older kernels lock about 32k/vcpu for the ring itself. Require 8MB of
+        // locked memory to be safe (8MB is what newer kernels and newer systemd provide)
+        return false;
+    }
     auto ring_opt = try_create_uring(1, false);
     if (ring_opt) {
         ::io_uring_queue_exit(&ring_opt.value());