Lance Yang
to kanchanaps...@gmail.com, yo...@kernel.org, nph...@gmail.com, coreg...@gmail.com, syzk...@googlegroups.com, han...@cmpxchg.org, chengmi...@linux.dev, ak...@linux-foundation.org, linu...@kvack.org, linux-...@vger.kernel.org, Lance Yang
On Tue, Jun 09, 2026 at 08:36:29AM -1000, Kanchana P. Sridhar wrote:
>On Tue, Jun 9, 2026 at 10:50 AM Yosry Ahmed <yo...@kernel.org> wrote:
>>
>> On Tue, Jun 9, 2026 at 8:40 AM Nhat Pham <nph...@gmail.com> wrote:
>> >
>> > On Tue, Jun 9, 2026 at 4:51 AM Longxing Li <coreg...@gmail.com> wrote:
>> > >
>> > > Dear Linux kernel developers and maintainers,
Thanks for reporting this!
>> > >
>> > > We would like to report a new kernel bug found by our tool. INFO: task
>> > > hung in zswap_decompress. Details are as follows.
>> > >
>> > > Kernel commit: v7.0.6
>> > > Kernel config: see attachment
>> > > report: see attachment
>>
>> If I am reading the report correctly, it seems like we are doing
>> swapin from the page fault path, and waiting for the per-CPU mutex
>> that is held by kswapd. Since we can sleep waiting for decompression
>> while holding the mutex, it's possible that we have some kind of
>> priority inversion where kswapd held the lock, went to sleep, and
>> didn't run again for a while. But that has always been possible for a
>> long time AFAICT.
Cool!

Worth rerunning with CONFIG_DETECT_HUNG_TASK_BLOCKER=y. It should be on
by default with CONFIG_DETECT_HUNG_TASK=y, but I don't see it in the
config.

With that enabled, the kernel should hopefully tell us which task likely
owns the mutex :) If kswapd is sitting on the per-CPU zswap mutex and not
getting scheduled back in, that should be pretty clear ;)

CONFIG_DETECT_HUNG_TASK=y
CONFIG_DETECT_HUNG_TASK_BLOCKER=y

# detect after 10s in D state instead of the default 120s
echo 10 > /proc/sys/kernel/hung_task_timeout_secs
# 0 means use hung_task_timeout_secs as the check interval
echo 0 > /proc/sys/kernel/hung_task_check_interval_secs
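
A quick way to double-check both options before rerunning could look
roughly like the sketch below. The helper names (config_enabled,
check_hung_task_config) are made up for illustration, and where the
config file lives is an assumption that depends on your setup.

```shell
# Hypothetical helper: succeeds if the given Kconfig option is set to "y"
# in the given config file. -x matches the whole line, so
# CONFIG_DETECT_HUNG_TASK does not match CONFIG_DETECT_HUNG_TASK_BLOCKER.
config_enabled() {
    # $1 = option name, $2 = path to a kernel .config-style file
    grep -qx "$1=y" "$2"
}

# Hypothetical helper: print the status of both hung-task options.
check_hung_task_config() {
    cfg="$1"
    for opt in CONFIG_DETECT_HUNG_TASK CONFIG_DETECT_HUNG_TASK_BLOCKER; do
        if config_enabled "$opt" "$cfg"; then
            echo "$opt: enabled"
        else
            echo "$opt: missing"
        fi
    done
}
```

For example, run check_hung_task_config .config in the build tree (or
point it at wherever the config for the test kernel is kept). Anything
reported as "missing" needs to be enabled before the rerun.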
>>
>> Do you have any more details? Is this a new regression (observed when
>> upgrading to v7.0.6), or is it possible this was a pre-existing issue
>> and you just found it on this kernel?
>>
>> > >
>> > > We are currently analyzing the root cause and working on a
>> > > reproducible PoC. We will provide further updates in this thread as
>> > > soon as we have more information.
>>
>> Yeah more details like a known-good kernel version, or even better a
>> reproducer, would certainly help a lot.
>>
>> > >
>> > > Best regards,
>> > > Longxing Li
>> > >
>> > > ==================================================================
>> > > https://drive.google.com/file/d/1Bx2unEf-QntjVi8g6Zw7QNO6OP4cjGO_/view?usp=drive_link
>> > >
>> > > https://drive.google.com/file/d/16xzUrwOvwE67cnMPH3AhhNRWq6hr26Qj/view?usp=drive_link
>> >
>> > + Kanchana, who last worked on this piece of code
>
>Thanks Nhat and Yosry. I agree with Yosry, having a known-good kernel
>version would be helpful.
>
>Also, it appears the kernel stack-trace is from before the merging of
>the per-CPU acomp_ctx simplifications wrt the mutex, and resources'
>lifetime being tied to that of the zswap pool.
>
>Looking forward to more details.
+1
Cheers, Lance