My 5 cents on this topic.
First, GDB can sometimes be useful, but in many cases it may be
misleading or confusing: it may have been attached too late, other
things may be happening in the background, stack traces may be
corrupted, etc. I wouldn't recommend starting with it.
What GDB is good at, at a basic level, is quickly detecting that the
kernel has deadlocked: if you attach GDB to a running program, dump the
stack and keep seeing some _spin_lock() functions at the top for a long
time, this basically means you're stuck.
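But I believe there are easier ways to detect deadlocks: there should
be a debug config for that (lockdep, enabled via CONFIG_PROVE_LOCKING,
and the lockup and hung task detectors are the ones I'd look at first).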
Once you have a stable bug reproducer (which is critical!) and a crash
stack trace, your tools of the trade should be printk() and bisection.
The basic idea of debugging is to figure out which invariant in the
program is violated and to pinpoint the place at which the violation
happened (not necessarily the place where the program crashed).
For example, if you know that the kernel crashes on a NULL dereference
in a function foo(), you can figure out where the data comes from.
First, print the pointer value right before the crash site to make sure
it's really NULL.
Then check all assignments to that variable in the current function
(if any) to see which of them have introduced that NULL value.
If the pointer was passed to foo() as a parameter, check its value
right at the beginning of foo(). If it's also NULL, go up the crash
stack till you find the origin of that NULL value.
It's extremely handy to use the __FILE__ and __LINE__ macros when
printing the values, e.g.:
pr_err("ptr=%px HERE: %s:%d\n", ptr, __FILE__, __LINE__);
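To make the walk up the stack more concrete, here is a minimal userspace
sketch of the same technique. Everything in it is made up for
illustration: TRACE_PTR(), struct item, lookup() and caller() are
hypothetical names, and fprintf() stands in for pr_err() (in kernel code
you would call pr_err() with %px as shown above). foo() also returns -1
on NULL instead of crashing, so that the example can run to completion.

```c
#include <stdio.h>
#include <stddef.h>

/* Userspace stand-in for pr_err(); prints the pointer and the call site. */
#define TRACE_PTR(ptr) \
	fprintf(stderr, "%s=%p HERE: %s:%d\n", #ptr, (void *)(ptr), \
		__FILE__, __LINE__)

/* Hypothetical data type, only here to have something to point at. */
struct item {
	int value;
};

/* Hypothetical lookup: returns NULL when the key is not in the table. */
struct item *lookup(struct item *table, size_t n, int key)
{
	for (size_t i = 0; i < n; i++)
		if (table[i].value == key)
			return &table[i];
	return NULL;
}

/* The function where the crash is reported. */
int foo(struct item *it)
{
	TRACE_PTR(it);		/* is it already NULL on entry to foo()? */
	if (!it)
		return -1;	/* a real kernel would oops on it->value */
	return it->value;
}

/* One frame up the crash stack: this is where the NULL originates. */
int caller(struct item *table, size_t n, int key)
{
	struct item *it = lookup(table, n, key);

	TRACE_PTR(it);		/* NULL here means lookup() produced it */
	return foo(it);
}
```

If both trace lines show a NULL pointer, foo() is only the messenger and
the next thing to look at is whatever assigned the pointer in caller().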
If for some reason you don't have the crash stack, but have a rough
idea what might have gone wrong, you can still figure out the exact
location by inserting pr_err() into the suspect functions and seeing
which of them were called before the crash.
This requires some knowledge of the subsystem in question, however.
Bisection is a general trick to reduce the search space by dividing it
into two halves.
It can be applied to kernel commits (as in "which commit introduced
this bug?" [1]), reproducer programs ("which part of the program
actually triggers the bug?" [2]), source files ("on which line of the
function is the following invariant violated?"), printk outputs
("there are too many calls to printk already, can I remove some?") and
anything else.
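Whatever you bisect over, the loop is the same. Below is a rough sketch
in C, assuming the search space can be numbered, that the first index is
known good and the last one known bad, and that everything after the
first bad index is also bad. bisect_first_bad(), is_bad_fn and the toy
predicate bad_from() are names I made up; in practice the predicate is
"build, boot and run the reproducer".

```c
#include <stddef.h>

/* Returns nonzero if the bug reproduces at index idx. */
typedef int (*is_bad_fn)(size_t idx, void *ctx);

/*
 * Return the first bad index in (good, bad].
 * Assumes: `good` is known good, `bad` is known bad, good < bad,
 * and once an index is bad all later indices are bad too.
 */
size_t bisect_first_bad(size_t good, size_t bad, is_bad_fn is_bad, void *ctx)
{
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (is_bad(mid, ctx))
			bad = mid;	/* the culprit is at mid or earlier */
		else
			good = mid;	/* the culprit is after mid */
	}
	return bad;
}

/* Toy predicate for illustration: everything from *ctx onwards is bad. */
int bad_from(size_t idx, void *ctx)
{
	return idx >= *(size_t *)ctx;
}
```

Each iteration halves the range, so the number of test runs grows only
logarithmically with the size of the search space. That matters a lot
when a single run means rebuilding and rebooting a kernel.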
As Dmitry said, it's quite hard to write a general instruction,
especially given the variety of bugs that syzkaller is able to detect.
I suggest you start with something more or less straightforward (like
null dereferences, local stack buffer overflows or assertion
violations) and reproducible to gain some hands-on experience, and
then move on to more complex cases like a hanging kernel.
[1] - you may want to check out the manual for `git bisect` if you
want to do commit bisection. This can also be handy: looking at the
commit that introduced a bug may be easier than reading the whole
file.
[2] - usually syzkaller repros are good enough, but they can be
reduced further either manually or by using tools like multidelta.
Reducing a test case manually may also give you a better understanding
of what's going on in the kernel.
--
Alexander Potapenko
Software Engineer
Google Germany GmbH