
Bug#990464: Kdump is unable to use NVME


dann frazier

Jun 30, 2021, 2:10:03 PM
tag 990464 + moreinfo
thanks

On Tue, Jun 29, 2021 at 06:25:48PM -0500, Kody Barks wrote:
> Package: kdump-tools
> Version: 1:1.6.5-1
>
> When the kernel panics, it loads the kdump kernel correctly but this kernel
> is unable to use NVME, making it unable to mount the root filesystem and
> thus making the kdump kernel panic.
> I am unable to generate a proper log of this error. The best I could do is
> to take a picture of my screen when this happens. This is the attached
> file.
> I have tried using the boot parameter: "nvme_core.default_ps_max_latency_us=0"
> but this made no changes in behavior.
> I am using Debian 10 Buster with Linux 5.10.40 with libc6 2.28-10
> My motherboard is the ASUS PRIME x399-a .
> The SSD I'm trying to use as the root filesystem is the Intel Corporation
> Optane SSD 900P Series [8086:2700].
> It should be noted that outside of kdump, during normal operation, the
> device works flawlessly. Only Kdump's kernel has an issue.

hey Kody,

It's likely that your crashkernel= parameter just doesn't provide
enough memory for the crashkernel to initialize all of your devices.
This is set in /etc/default/grub.d/kdump-tools.cfg. Try bumping that
up to see if it fixes things. Note that after changing it you'll need
to 1) sudo update-grub and 2) reboot before it will take effect for
the next crash dump. It's unfortunately not possible to provide
a static default crashkernel= that works for everyone, so tuning it is
often required.
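
As a rough sketch of the steps above, assuming the stock snippet already contains a crashkernel= entry: `set_crashkernel` is a hypothetical helper, and 512M is only an example value, not a recommendation for this hardware.

```shell
#!/bin/sh
# Sketch only: rewrite the crashkernel= value in a GRUB config snippet.
# set_crashkernel is a hypothetical helper, not part of kdump-tools.
set_crashkernel() {
    cfg="$1"    # e.g. /etc/default/grub.d/kdump-tools.cfg
    size="$2"   # e.g. 512M (example value; tune for your machine)
    # Replace everything after crashkernel= up to the next space or quote.
    # Uses GNU sed's -i, as shipped on Debian.
    sed -i "s/crashkernel=[^ \"]*/crashkernel=${size}/" "$cfg"
}

# Typical use, as root:
#   set_crashkernel /etc/default/grub.d/kdump-tools.cfg 512M
#   update-grub
#   reboot
# After the reboot, check what was actually reserved:
#   cat /proc/cmdline
#   cat /sys/kernel/kexec_crash_size
```

If /sys/kernel/kexec_crash_size reports 0 after the reboot, no memory was reserved and the new value did not take effect.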

-dann

Bernhard Übelacker

Jul 14, 2021, 6:10:04 AM
Hello Kody,
I also had a working crash kernel setup in the past,
but it started to fail with kernel 5.5 and later.
Unfortunately I have not yet gotten around to reporting this to Debian.

If that is also the case for you, a workaround could be to
install the old 5.4.19 bpo kernel [1].

And modify /etc/default/kdump-tools like this:
-KDUMP_KERNEL=/var/lib/kdump/vmlinuz
-KDUMP_INITRD=/var/lib/kdump/initrd.img
+KDUMP_KERNEL=/boot/vmlinuz-5.4.0-4-amd64
+KDUMP_INITRD=/var/lib/kdump/initrd.img-5.4.0-4-amd64

This workaround works for me on an "Asus-PRIME-B350M-A",
though with a plain SATA SSD and running testing.
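
Before triggering a test crash with such a change, it may be worth confirming that the configured files exist. The following is a minimal sketch: `check_kdump_paths` is a hypothetical helper, and it assumes the variables are written as plain `NAME=value` lines, as in the snippet above.

```shell
#!/bin/sh
# Sketch only: check that the kernel and initrd named in a kdump-tools
# config file exist. check_kdump_paths is a hypothetical helper.
check_kdump_paths() {
    cfg="$1"    # e.g. /etc/default/kdump-tools
    rc=0
    for var in KDUMP_KERNEL KDUMP_INITRD; do
        # Take the last uncommented assignment and strip any quotes.
        path=$(sed -n "s/^${var}=//p" "$cfg" | tail -n 1 | tr -d '"')
        if [ -z "$path" ]; then
            echo "$var: not set in $cfg"
        elif [ ! -e "$path" ]; then
            echo "$var: $path missing"
            rc=1
        else
            echo "$var: $path ok"
        fi
    done
    return "$rc"
}

# Typical use, as root:
#   check_kdump_paths /etc/default/kdump-tools
#   kdump-config unload
#   kdump-config load
#   kdump-config show
```

Reloading with kdump-config after the edit should make the new kernel and initrd the ones used for the next crash, and `kdump-config show` prints what is currently configured.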


I just looked in the upstream tracker and found this bug [2],
but it might describe a different issue.

Kind regards,
Bernhard

[1] https://snapshot.debian.org/package/linux/5.4.19-1%7Ebpo10%2B1/
[2] https://bugzilla.kernel.org/show_bug.cgi?id=209351

Kody Barks

Jul 14, 2021, 1:00:04 PM
Reverting the kdump kernel to 5.4 did not make any difference.

Kody Barks

Dec 27, 2021, 10:50:03 AM
Kernel 5.10, and an upgrade to Bullseye. This issue is still not fixed.

Guilherme G. Piccoli

Feb 16, 2022, 11:20:03 AM
On Mon, 27 Dec 2021 09:44:24 -0600 Kody Barks <kodyre...@gmail.com>
wrote:
> Kernel 5.10, and an upgrade to Bullseye. This issue is still not fixed.

Hi Kody, can you please share the outputs of the following 3 commands?

lspci -nns 0000:09:00.0
lspci -nns 0000:41:00.0
lspci -nns 0000:42:00.0

Also, do you remember a specific kernel version that ever worked with
these drivers on kdump?
Thanks,


Guilherme


P.S. Please don't forget to CC me when responding. I'm not sure I would
receive responses sent only to the bug, as it seems I'm not subscribed.

Kody Barks

Apr 11, 2023, 11:10:04 PM
here ya go:
root@TESSA:~# lspci -nns 0000:09:00.0
09:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808]
root@TESSA:~# lspci -nns 0000:41:00.0
41:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961/SM963 [144d:a804]
root@TESSA:~# lspci -nns 0000:42:00.0
42:00.0 Non-Volatile memory controller [0108]: Intel Corporation Optane SSD 900P Series [8086:2700]

No, this has never worked. My system crashes frequently, and without kdump I have no means of figuring out why.