Title: Firmware-Powered Kernel Dumps: Enhancing Reliability
Abstract:
Currently, kdump, the upstream dump mechanism for most architectures, relies by
design on the crashed kernel to boot the second kernel. This means that
devices may remain in an inconsistent or unstable state when the second kernel
starts, potentially impacting the reliability of the dump capture process.
Firmware-Assisted Dump (FADump) is an alternative dump mechanism to kdump, in
which firmware assists in preserving the crashing kernel's memory and in
booting the next kernel.
This design greatly enhances reliability by shifting recovery control to the
firmware layer instead of relying on the unstable state of the crashed kernel.
FADump is currently implemented on IBM Power systems, though the underlying
concepts are applicable to any architecture that can support a
memory-preserving reboot.
The talk will briefly cover what a kernel dump is, and then focus on the FADump
emulation implemented in QEMU for:
1. PowerVM (a virtualised platform; FADump support merged in QEMU)
2. PowerNV (a non-virtualised/bare-metal platform; patches on the mailing list)
The implementation provides insight into the hardware/firmware side of FADump,
giving an idea of how it could be supported on architectures other than PowerPC.
Preferred Format: 25+5 mins
Bio:
A kernel engineer with ~3 years of experience, part of the Linux Bringup team
in LTC, IBM.
Actively working on various subsystems in the Linux kernel, as well as PowerPC's
bare-metal firmware (skiboot).
Reviewer of the PowerNV machine in QEMU, and maintainer of the FADump subsystem
in QEMU.
Thanks,
- Aditya G
Patches:
FADump in QEMU PowerVM (virtualised):
https://lore.kernel.org/qemu-devel/20251021134823.1...@linux.ibm.com/
FADump/MPIPL in QEMU PowerNV (non-virtualised):
https://lore.kernel.org/qemu-devel/20250217071934....@linux.ibm.com/