Extracting only kernel messages from vmcore using kdump.

523 views
Skip to first unread message

Tetsuo Handa

unread,
Jul 6, 2018, 11:11:24 AM7/6/18
to syzkaller
Somewhat related with https://github.com/google/syzkaller/issues/491 or
http://lkml.kernel.org/r/CACT4Y+ZX-QtOzCXaDhGnk4FzpSUk3rNw=5_o34O9=je2j...@mail.gmail.com .
But this post rather focuses on only dmesg part.

Sometimes I can't check past kernel messages (even one minute prior to kernel panic) from log
files because lines by syzkaller programs can flood out lines by kernel messages (and vise
versa). It might be helpful to save log file which contains only kernel messages. And kdump
could be used for such purpose.



Here is an example approach for running minimal initramfs which reads only dmesg
part. If we can reserve more memory for kdump kernel and the testing machines can
fetch kernels from build server via network, we could do complicated things (e.g.
running scripted crash commands without saving vmcore into storage).



(Step 1) Build latest makedumpfile command and /init program.

Download makedumpfile from source and build.

$ git clone https://git.code.sf.net/p/makedumpfile/code makedumpfile.git
$ make -C makedumpfile.git LINKTYPE=dynamic

Build /init program for initramfs for kdump kernel from source.
You can customize like whatever you want.

$ gcc -Wall -O3 -m64 -o init -x c - << EOF
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/mount.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/reboot.h>
#define LINUX_REBOOT_CMD_POWER_OFF 0x4321FEDC

int main(int argc, char *argv[])
{
if (mount("proc", "/proc", "proc", 0, NULL) || mount("sysfs", "/sys", "sysfs", 0, NULL))
return 1;
puts("***** Start of dmesg *****");
if (fork() == 0) {
char *args[5] = { "/makedumpfile", "-F", "--dump-dmesg", "/proc/vmcore", NULL };
execv(args[0], args);
_exit(1);
}
wait(NULL);
puts("***** End of dmesg *****");
sleep(10);
//return 0;
return reboot(LINUX_REBOOT_CMD_POWER_OFF) != 0;
}
EOF

(Step 2) Prepare files for initramfs for kdump kernel.

Create a directory and copy makedupmpfile and init built at Step 1.

$ mkdir -p /var/tmp/miniroot
$ cp makedumpfile.git/makedumpfile init /var/tmp/miniroot/
$ cd /var/tmp/miniroot/
$ mkdir dev proc sys lib64 tmp
$ su -c 'cp -a /dev/console dev/'

Copy library files which "ldd makedumpfile" reported.

$ cp -L /lib64/libpthread.so.0 /lib64/libdw.so.1 /lib64/libbz2.so.1 /lib64/libdl.so.2 /lib64/libelf.so.1 /lib64/libz.so.1 /lib64/libc.so.6 /lib64/ld-linux-x86-64.so.2 /lib64/liblzma.so.5 lib64/

(Step 3) Build a kdump kernel.

The kernel version for testing and the kernel version for kdump does not need to match.
In other words, you can choose a kernel version which you want to use, and you don't
have to rebuild kdump kernel every time a kernel for testing is build, and you can embed
files needed for kdump into the filesystem image used for testing.

$ cd /usr/src/linux-4.9.111/
$ make -s menuconfig
$ make -s

You may start from "make allnoconfig" and enable minimum features such as

CONFIG_64BIT=y

CONFIG_BLK_DEV_INITRD=y
# CONFIG_RD_GZIP is not set
# CONFIG_RD_BZIP2 is not set
# CONFIG_RD_LZMA is not set
# CONFIG_RD_XZ is not set
# CONFIG_RD_LZO is not set
# CONFIG_RD_LZ4 is not set
# CONFIG_EMBEDDED is not set

CONFIG_CRASH_DUMP=y
CONFIG_RELOCATABLE=y

CONFIG_BINFMT_ELF=y

CONFIG_PROC_FS=y
CONFIG_PROC_VMCORE=y
CONFIG_SYSFS=y

and the directory containing files for initramfs for kdump kernel like

CONFIG_INITRAMFS_SOURCE="/var/tmp/miniroot"

and config options for printing to standard output like

CONFIG_TTY=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y

and some environment dependent bits required for boot/shutdown correctly.

CONFIG_PCI=y
CONFIG_ACPI=y
CONFIG_HYPERVISOR_GUEST=y
CONFIG_X86_X2APIC=y

(Step 4) Test the kdump kernel.

Install the kexec-tools package and copy arch/x86/boot/bzImage built at Step 3
to the filesystem image used for testing after booted using the filesystem image
(e.g. when SSH became ready). Then, load bzImage with appropriate command line
options.

# /sbin/kexec -p /data/linux-stable/arch/x86/boot/bzImage --append 'console=ttyS0,115200n8 quiet'
# echo c > /proc/sysrq-trigger

You might need to do trial and error (change kernel config and/or
command line option) for several times between Step 3 and Step 4.

If the the kdump kernel booted correctly and dmesg is dumped correctly,
the output (with 'quiet' also added to command line option) would look like below.

***** Start of dmesg *****
The kernel version is not supported.
The makedumpfile operation may be incomplete.
[ 0.000000] Linux version 4.18.0-rc3+ (root@ccsecurity) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-28) (GCC)) #436 SMP Fri Jul 6 19:55:49 JST 2018
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.18.0-rc3+ root=UUID=17c3c28f-a70a-4666-95fa-ecf6acd901e4 ro vconsole.keymap=jp106 crashkernel=256M vconsole.font=latarcyrheb-sun16 security=none sysrq_always_enabled console=ttyS0,115200n8 console=tty0 LANG=en_US.UTF-8
[ 0.000000] Disabled fast string operations
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
(...snipped...)
[ 50.360013] ? default_idle+0x5/0x10
[ 50.361267] do_idle+0x19d/0x290
[ 50.362396] ? _raw_spin_unlock_irqrestore+0x2d/0x50
[ 50.363859] cpu_startup_entry+0x6a/0x70
[ 50.365132] start_secondary+0x186/0x1d0
[ 50.366373] secondary_startup_64+0xa5/0xb0
[ 50.367659] Modules linked in: pcspkr sg vmw_vmci i2c_piix4 sd_mod ata_generic pata_acpi ahci libahci vmwgfx drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ata_piix drm e1000 i2c_core mptspi scsi_transport_spi mptscsih libata mptbase serio_raw
[ 50.373245] CR2: 0000000000000000

The dmesg log is saved to STDOUT.

makedumpfile Completed.
***** End of dmesg *****

Tetsuo Handa

unread,
Jul 7, 2018, 2:57:35 AM7/7/18
to syzkaller
I noticed that hung task reports from *-next [1] say "INFO: lockdep is turned off.".
Is this because lockdep splat occurred but messages are already flooded out?

[1] https://syzkaller.appspot.com/bug?id=287aa8708bc940d0ca1645223c53dd4c2d203be6

Tetsuo Handa

unread,
Jul 7, 2018, 4:19:21 AM7/7/18
to syzkaller
I also noticed that syzbot is failing to report backtrace correctly [2].
Is this because lines by syzkaller programs cut into the backtrace?
Or, is this because backtrace lines with '?' are eliminated?

[2] https://syzkaller.appspot.com/bug?id=1114b866137dcd9b8c087544a34d52bec892fbb4

Dmitry Vyukov

unread,
Jul 10, 2018, 5:47:51 AM7/10/18
to Tetsuo Handa, syzkaller
Hi,

I don't think so. It may happen episodically (if some output is
lost/corrupted on console), but should not happen systematically
because any lockdep splat should be detected and reported and kernel
aborted and not run beyond that point.

I think what happens is that panic() calls debug_locks_off() and so
lockdep rejects to dump current locks, etc and instead prints this
line.

Dmitry Vyukov

unread,
Jul 10, 2018, 5:49:47 AM7/10/18
to Tetsuo Handa, syzkaller
Yes, lines with '?' are removed. They are not useful in 99.99% of
cases and are still available in the log.

Dmitry Vyukov

unread,
Jul 10, 2018, 5:58:49 AM7/10/18
to Tetsuo Handa, syzkaller
On Fri, Jul 6, 2018 at 5:11 PM, Tetsuo Handa
<penguin...@i-love.sakura.ne.jp> wrote:
> Somewhat related with https://github.com/google/syzkaller/issues/491 or
> http://lkml.kernel.org/r/CACT4Y+ZX-QtOzCXaDhGnk4FzpSUk3rNw=5_o34O9=je2j...@mail.gmail.com .
> But this post rather focuses on only dmesg part.
>
> Sometimes I can't check past kernel messages (even one minute prior to kernel panic) from log
> files because lines by syzkaller programs can flood out lines by kernel messages (and vise
> versa). It might be helpful to save log file which contains only kernel messages. And kdump
> could be used for such purpose.


If the goal is to obtain kernel output, then we don't need kdump for
this. syzkaller already captures all console output on the fly (that's
required to understand when kernel crashes). But the thing is that it
throws it away on the fly too, because kernel can print hundreds of
megs over an hour. So it's just does not look feasible to save all
this. There is a bunch of places in kernel that print large splats of
output all the time. And we have hundreds of machines printing
something all the time and crashing in the end.
> --
> You received this message because you are subscribed to the Google Groups "syzkaller" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller+...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Tetsuo Handa

unread,
Jul 10, 2018, 6:15:22 AM7/10/18
to Dmitry Vyukov, syzkaller
On 2018/07/10 18:47, Dmitry Vyukov wrote:
> I think what happens is that panic() calls debug_locks_off() and so
> lockdep rejects to dump current locks, etc and instead prints this
> line.

I don't think so. khungtaskd calls debug_show_all_locks() before calling panic().

if (hung_task_show_lock)
debug_show_all_locks();
if (hung_task_call_panic) {
trigger_all_cpu_backtrace();
panic("hung_task: blocked tasks");
}

I think there is lockdep splat which was not saved to the console logs.

Dmitry Vyukov

unread,
Jul 10, 2018, 7:07:53 AM7/10/18
to Tetsuo Handa, syzkaller
Perhaps one of these guys who silently turn off lockdep:

static int save_trace(struct stack_trace *trace)
{
...
if (nr_stack_trace_entries >= MAX_STACK_TRACE_ENTRIES-1) {
if (!debug_locks_off_graph_unlock())
return 0;

print_lockdep_off("BUG: MAX_STACK_TRACE_ENTRIES too low!");
dump_stack();

return 0;
}



static inline int add_chain_cache_classes(unsigned int prev,
unsigned int next,
unsigned int irq_context,
u64 chain_key)
{
...
if (unlikely(nr_lock_chains >= MAX_LOCKDEP_CHAINS)) {
if (!debug_locks_off_graph_unlock())
return 0;

print_lockdep_off("BUG: MAX_LOCKDEP_CHAINS too low!");
dump_stack();
return 0;
}


If we would turned them into WARNING then sybot would catch them and
we would know what happens and could adjust the value.

Dmitry Vyukov

unread,
Jul 10, 2018, 7:09:42 AM7/10/18
to Tetsuo Handa, syzkaller
Wait, this prints "BUG:" to console, so should be caught by syzkaller.
However, this is printed with KERN_DEBUG and I am not sure these
messages make it to console at all...

Tetsuo Handa

unread,
Jul 11, 2018, 8:50:14 AM7/11/18
to Dmitry Vyukov, syzkaller
It might be very early in the boot process.
Can you try "dmesg -c" before starting the test (i.e. when
"Warning: Permanently added 'xxx.xxx.xxx.xxx' (ECDSA) to the list of known hosts."
is printed) ?

Dmitry Vyukov

unread,
Jul 11, 2018, 1:24:12 PM7/11/18
to Tetsuo Handa, syzkaller
I've booted this revision 3 times, did scp, ssh and sysrq d. All works
without any lockdep splats.

Tetsuo Handa

unread,
Jul 11, 2018, 4:23:02 PM7/11/18
to Dmitry Vyukov, syzkaller
Hmm, no splats before starting the test... Then, try changing kernel's printk loglevel to "debug" ?

Dmitry Vyukov

unread,
Jul 12, 2018, 6:58:12 AM7/12/18
to Tetsuo Handa, syzkaller
On Wed, Jul 11, 2018 at 10:22 PM, Tetsuo Handa
Well, I am not typing all commands manually for hundreds of syzbot
machines and can't just adjust one thing on the fly and check how it
affects things. One can adjust configs for all of them here:
https://github.com/google/syzkaller/tree/master/dashboard/config
But then what one is looking for needs to crash loudly, because nobody
is carefully reading terabytes of kernel output to filter it out.
Reply all
Reply to author
Forward
0 new messages