Heads-up (Non-Qubes): thin-provisioned swap via LVM caused machine lockup

Ulrich Windl

Oct 16, 2020, 6:02:05 PM
to qubes...@googlegroups.com
Hi!

Just a note, since Qubes uses thin-provisioned LVs (though not for swap by
default): I could reproducibly freeze a Linux kernel when multiple
processes started paging to a non-encrypted thin-provisioned LV (backed by
SSD; actually two SSDs behind a hardware RAID1 controller).
As the machine has a rather large amount of RAM (>512GB), I used two swap
devices: a smaller first one of 5GB, which is a plain partition on the
RAID device, and a huge thin-provisioned one with lower priority.
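
To make the two-device setup concrete, here is a minimal sketch that lists swap areas in the order the kernel fills them (higher priority first), using the /proc/swaps table layout. The device names, sizes and priority values in SAMPLE are hypothetical stand-ins, not the values from my machine; the script falls back to SAMPLE only if /proc/swaps cannot be read.

```python
# Sketch: list swap devices in the order the kernel will fill them.
# SAMPLE mimics the /proc/swaps layout; its device names, sizes and
# priorities are hypothetical, not taken from the affected machine.
SAMPLE = """\
Filename                                Type            Size            Used            Priority
/dev/dm-3                               partition       536870908       0               5
/dev/sda2                               partition       5242876         0               10
"""


def parse_swaps(text):
    """Return (name, size_kib, used_kib, priority) tuples, highest priority first."""
    entries = []
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 5:
            continue
        name, _type, size, used, prio = fields[:5]
        entries.append((name, int(size), int(used), int(prio)))
    # Swap areas with a higher priority value are used before lower ones.
    entries.sort(key=lambda e: e[3], reverse=True)
    return entries


if __name__ == "__main__":
    try:
        with open("/proc/swaps") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # not on Linux, or /proc not mounted
    for name, size, used, prio in parse_swaps(text):
        print(f"{name}: priority {prio}, {used}/{size} KiB used")
```

With a layout like this, the small partition fills completely before the thin-provisioned device sees its first page.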

My tests showed clearly that the problem did not occur while the first
swap device was being filled. Soon after the second device came into use
(more precisely, when new blocks were allocated from the pool), the kernel
paused several times, each pause longer than the last, until eventually
nothing happened for minutes or hours (I had top running in a PuTTY
terminal).
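
For anyone who wants to record such pauses rather than watch top, here is a rough sketch of a userspace stall detector. It is not what I used; the function names, the interval and the threshold are all assumptions for illustration. It sleeps for a fixed interval and reports any gap that overshoots by more than a threshold. A process like this only sees its own scheduling delay, so treat the numbers as a lower bound on what the kernel was doing.

```python
import time


def find_stalls(timestamps, interval, threshold):
    """Return (start_time, extra_delay) for each gap exceeding interval + threshold."""
    stalls = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        extra = (cur - prev) - interval
        if extra > threshold:
            stalls.append((prev, extra))
    return stalls


def sample(count, interval):
    """Sleep `interval` seconds `count` times, recording a monotonic timestamp each time."""
    stamps = [time.monotonic()]
    for _ in range(count):
        time.sleep(interval)
        stamps.append(time.monotonic())
    return stamps


if __name__ == "__main__":
    # Hypothetical settings: sample every 0.1 s, flag overshoots above 0.5 s.
    stamps = sample(count=5, interval=0.1)
    for start, extra in find_stalls(stamps, interval=0.1, threshold=0.5):
        print(f"stall of {extra:.2f}s beginning at t={start:.2f}")
```

Because the gaps are computed from the recorded timestamps afterwards, a pause that also freezes the script itself still shows up once the script runs again.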

I could also reproduce that the problem did not occur when the swap
device already had enough blocks allocated. But when I used discard to
return them to the pool, the freeze happened again as soon as new blocks
needed to be allocated.

For those interested: I filed a bug at kernel.org a few days ago...

Because PuTTY could no longer exchange any data with the host (while pings
were still answered) and did not detect that the connection was dead, I
also filed a bug report for PuTTY a few days ago (PuTTY should detect a
dead connection). The problem was confirmed, but was considered
"not common enough" to be taken seriously ("patches welcome").

Regards,
Ulrich