cgdstrategy: divide fault in supervisor mode

Alexander Nasonov

Sep 13, 2016, 4:40:43 AM

to tech...@netbsd.org

Someone warned me that adding cgd to dump devices can have bad
consequences. I think I caught one possible bad case yesterday.
I was lucky enough to still have my data.

My setup is quite complicated. I have a small root on wd0a which
does only one thing: mount the real root on cgd0a and chroot into
it. The rest of the system is on cgd1.

I was in single-user mode, inside /altroot (iirc), with all file
systems mounted, and I wanted to remount them read-only. Some of them
couldn't be unmounted, so I forced the umounts with the -f flag. Then I
mounted them with the read-only flag. I don't remember the exact
commands, but I have nested mount points, e.g. /var/log inside /var,
and I was definitely trying to remount both the inner and the outer fs.

All mount/umount worked but when I ran reboot, the system trapped here:

fatal integer divide fault in supervisor mode
trap type 8 code 0 rip ffffffff808db36f cs 8 rflags 10246 cr2 efd...
curlwp 0xfffffe81163b4a40 pid 276.1 lowest kstack 0xfffffe8117343...
kernel: integer divide fault trap, code=0
Stopped in pid 276.1 (reboot) at netbsd:cgdstrategy+0x26: divl 40(%rdi),%eax

This is what I run:

NetBSD neva 7.99.36 NetBSD 7.99.36 (GENERIC) #0: Fri Sep 2 22:04:02 BST 2016 alnsn@nebeda:/home/alnsn/netbsd-current/clean/src/sys/arch/amd64/compile/obj/GENERIC amd64

Sources checked out on Sep 2.

Looking at the assembly, it appears that the fault happened at the
second line of this branch:

	if (bp->b_blkno < 0 ||
	    (bp->b_bcount % dg->dg_secsize) != 0 ||

(offset of b_blkno is 0x48, b_bcount's offset is 0x34).

ffffffff808db349 <cgdstrategy>:
ffffffff808db349: 55 push %rbp
ffffffff808db34a: 48 89 e5 mov %rsp,%rbp
ffffffff808db34d: 53 push %rbx
ffffffff808db34e: 48 83 ec 08 sub $0x8,%rsp
ffffffff808db352: 48 89 fb mov %rdi,%rbx
ffffffff808db355: 48 8b 7f 38 mov 0x38(%rdi),%rdi
ffffffff808db359: e8 4d fe ff ff callq ffffffff808db1ab <getcgd_softc>
ffffffff808db35e: 48 83 7b 48 00 cmpq $0x0,0x48(%rbx)
ffffffff808db363: 78 3d js ffffffff808db3a2 <cgdstrategy+0x59>
ffffffff808db365: 48 89 c7 mov %rax,%rdi
ffffffff808db368: 8b 4b 34 mov 0x34(%rbx),%ecx
ffffffff808db36b: 89 c8 mov %ecx,%eax
ffffffff808db36d: 31 d2 xor %edx,%edx
ffffffff808db36f: f7 77 40 divl 0x40(%rdi)

Alex
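
For readers following the disassembly: the last three instructions copy
b_bcount into %eax, clear %edx, and divide the 64-bit value %edx:%eax by
the 32-bit word at 0x40(%rdi). The sketch below is a userland C model of
that divl, not kernel code; model_divl, bcount_mod_secsize and
struct divl_result are names made up for illustration. It shows that the
instruction faults whenever the divisor is zero, which is the condition
the trap message reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result record; the CPU leaves these in %eax and %edx. */
struct divl_result {
	bool fault;		/* true where the CPU would raise a divide fault */
	uint32_t quotient;	/* %eax after divl */
	uint32_t remainder;	/* %edx after divl */
};

/* Model of "divl divisor" with the dividend held in %edx:%eax. */
static struct divl_result
model_divl(uint32_t edx, uint32_t eax, uint32_t divisor)
{
	struct divl_result r = { false, 0, 0 };
	uint64_t dividend = ((uint64_t)edx << 32) | eax;

	/* A zero divisor, or a quotient wider than 32 bits, faults. */
	if (divisor == 0 || dividend / divisor > UINT32_MAX) {
		r.fault = true;
		return r;
	}
	r.quotient = (uint32_t)(dividend / divisor);
	r.remainder = (uint32_t)(dividend % divisor);
	return r;
}

/* b_bcount % dg_secsize as compiled: xor %edx,%edx; divl; read %edx. */
static struct divl_result
bcount_mod_secsize(uint32_t b_bcount, uint32_t dg_secsize)
{
	return model_divl(0, b_bcount, dg_secsize);
}

/* Usage: 512 divides 8192 evenly; a zero sector size faults. */
static void
model_selfcheck(void)
{
	assert(!bcount_mod_secsize(8192, 512).fault);
	assert(bcount_mod_secsize(8192, 512).remainder == 0);
	assert(bcount_mod_secsize(8192, 0).fault);
}
```

With %edx cleared first, the quotient always fits in 32 bits, so in this
code path a zero divisor is the only way the model reports a fault.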

Michael van Elst

Sep 13, 2016, 7:26:36 AM

to tech...@netbsd.org

al...@yandex.ru (Alexander Nasonov) writes:

>All mount/umount worked but when I ran reboot, the system trapped here:

>fatal integer divide fault in supervisor mode
>trap type 8 code 0 rip ffffffff808db36f cs 8 rflags 10246 cr2 efd...
>curlwp 0xfffffe81163b4a40 pid 276.1 lowest kstack 0xfffffe8117343...
>kernel: integer divide fault trap, code=0
>Stopped in pid 276.1 (reboot) at netbsd:cgdstrategy+0x26:

> if (bp->b_blkno < 0 ||
> (bp->b_bcount % dg->dg_secsize) != 0 ||

>ffffffff808db36b: 89 c8 mov %ecx,%eax
>ffffffff808db36d: 31 d2 xor %edx,%edx
>ffffffff808db36f: f7 77 40 divl 0x40(%rdi)

That would require dg_secsize to be 0 which is difficult to do
because the drivers initialize the value and the common disk_set_info()
function fixes up a zero value.

But maybe the dg pointer is bad? Please have a look at the %rdi
register.

N.B. there are some rare failure paths in getcgd_softc() that would
return a NULL pointer that isn't checked. If the kernel maps zeros
at NULL this could trigger a divide error here.

--
Michael van Elst
Internet: mle...@serpens.de
"A potential Snark may lurk in every tree."

Alexander Nasonov

Sep 13, 2016, 4:26:01 PM

to Michael van Elst, tech...@netbsd.org

Michael van Elst wrote:
> That would require dg_secsize to be 0 which is difficult to do
> because the drivers initialize the value and the common disk_set_info()
> function fixes up a zero value.

I can reproduce division by zero but not when rebooting. If I take
an unconfigured cgd device, i.e. cgd2 and run

mount /dev/cgd2d /tmp

the kernel will panic instead of returning ENXIO.

> But maybe the dg pointer is bad? Please have a look at the %rdi
> register.

I don't know what rdi's value was when it crashed during reboot, but
the crashes when mounting /dev/cgd2d all have good kernel-space values.
I can examine data at 0x34 offset and it's indeed 0.

$ crash -M netbsd.12.core
Crash version 7.99.36, image version 7.99.36.
System panicked: dump forced via kernel debugger
Backtrace from time of crash is available.
crash> dmesg|tail
iwm0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
acpibat0: normal capacity on 'charge state'

fatal integer divide fault in supervisor mode

trap type 8 code 0 rip ffffffff808db36f cs 8 rflags 10246 cr2 4d8000 ilevel 0 rsp fffffe8116cfba50
curlwp 0xfffffe836fcbab00 pid 13.1 lowest kstack 0xfffffe8116cf82c0

dumping to dev 20,17 (offset=212951, size=3119109):
dump
crash> bt
_KERNEL_OPT_NARCNET() at 0
_KERNEL_OPT_NARCNET() at 0
db_reboot_cmd() at db_reboot_cmd
db_command() at db_command+0xeb
db_command_loop() at db_command_loop+0x90
db_trap() at db_trap+0xe3
kdb_trap() at kdb_trap+0xe1
trap() at trap+0x574
--- trap (number 8) ---
cgdstrategy() at cgdstrategy+0x26
bdev_strategy() at bdev_strategy+0x68
spec_strategy() at spec_strategy+0x81
VOP_STRATEGY() at VOP_STRATEGY+0x33
bio_doread() at bio_doread+0x98
bread() at bread+0x1a
ffs_mountfs() at ffs_mountfs+0x170
ffs_mount() at ffs_mount+0x227
VFS_MOUNT() at VFS_MOUNT+0x34
mount_domount() at mount_domount+0x122
do_sys_mount() at do_sys_mount+0x20f
sys___mount50() at sys___mount50+0x33
syscall() at syscall+0x15b
--- syscall (number 410) ---
75c7da:
crash> ps
PID LID S CPU FLAGS STRUCT LWP * NAME WAIT
13 > 1 7 1 0 fffffe836fcbab00 mount_ffs
12 1 2 1 8020000 fffffe811681d2a0 mount
8 1 2 1 8020000 fffffe811681d6c0 ksh
2 1 2 1 8020000 fffffe811681dae0 ksh
1 1 2 1 8020000 fffffe81163f5680 init
..

Alex

Alexander Nasonov

Sep 13, 2016, 5:13:22 PM

to tech...@netbsd.org, Michael van Elst

Alexander Nasonov wrote:
> I can examine data at 0x34 offset and it's indeed 0.

Correction: $rdi+0x40 is the correct location.

I also inspected the low half of the dev_t ($rbx+0x38) and its value
was 0x1423, which corresponds to:

brw-r----- 1 root operator 20, 35 Dec 13 2015 /dev/cgd2d

Alex
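
The arithmetic behind that match can be written out. The sketch below
assumes the bit layout used by NetBSD's major() and minor() macros
(major in bits 8..19, minor in bits 0..7 plus bits 20..31) and 16
partitions per disk, which is my understanding of NetBSD/amd64; the
sketch_ names are invented for illustration and are not kernel
interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 16 partitions per disk unit, as on NetBSD/amd64. */
#define SKETCH_MAXPARTITIONS 16

/* Major number: bits 8..19 of the device number. */
static uint32_t
sketch_major(uint32_t dev)
{
	return (dev & 0x000fff00u) >> 8;
}

/* Minor number: bits 20..31 shifted down, joined with bits 0..7. */
static uint32_t
sketch_minor(uint32_t dev)
{
	return ((dev & 0xfff00000u) >> 12) | (dev & 0x000000ffu);
}

/* Disk unit: which cgdN the minor number refers to. */
static uint32_t
sketch_unit(uint32_t dev)
{
	return sketch_minor(dev) / SKETCH_MAXPARTITIONS;
}

/* Partition letter within that unit ('a' is partition 0). */
static char
sketch_partition(uint32_t dev)
{
	return (char)('a' + sketch_minor(dev) % SKETCH_MAXPARTITIONS);
}

/* Usage: 0x1423 decodes to major 20, minor 35, unit 2, partition d. */
static void
sketch_selfcheck(void)
{
	assert(sketch_major(0x1423) == 20);
	assert(sketch_minor(0x1423) == 35);
	assert(sketch_unit(0x1423) == 2);
	assert(sketch_partition(0x1423) == 'd');
}
```

Under those assumptions 0x1423 is major 0x14 = 20 and minor 0x23 = 35,
and 35 = 2 * 16 + 3 gives unit 2, partition d, i.e. cgd2d.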

Michael van Elst

Sep 13, 2016, 5:45:49 PM

to Alexander Nasonov, tech...@netbsd.org

On Tue, Sep 13, 2016 at 09:27:11PM +0100, Alexander Nasonov wrote:
>
> I can reproduce division by zero but not when rebooting. If I take
> an unconfigured cgd device, i.e. cgd2 and run
>
> mount /dev/cgd2d /tmp
>
> the kernel will panic instead of returning ENXIO.

Ah, maybe then:

--- cgd.c 5 Aug 2016 08:24:46 -0000 1.110
+++ cgd.c 13 Sep 2016 21:43:27 -0000
@@ -305,13 +305,17 @@
 static void
 cgdstrategy(struct buf *bp)
 {
-	struct cgd_softc *cs = getcgd_softc(bp->b_dev);
-	struct dk_softc *dksc = &cs->sc_dksc;
-	struct disk_geom *dg = &dksc->sc_dkdev.dk_geom;
+	struct cgd_softc *cs;
+	struct dk_softc *dksc;
+	struct disk_geom *dg;

 	DPRINTF_FOLLOW(("cgdstrategy(%p): b_bcount = %ld\n", bp,
 	    (long)bp->b_bcount));

+	GETCGD_SOFTC(cs, bp->b_dev);
+	dksc = &cs->sc_dksc;
+	dg = &dksc->sc_dkdev.dk_geom;
+
 	/*
 	 * Reject unaligned writes. We can encrypt and decrypt only
 	 * complete disk sectors, and we let the ciphers require their
Greetings,

Alexander Nasonov

Sep 14, 2016, 2:17:50 AM

to Michael van Elst, tech...@netbsd.org

Michael van Elst wrote:
> Ah, maybe then:
>
> --- cgd.c 5 Aug 2016 08:24:46 -0000 1.110
> +++ cgd.c 13 Sep 2016 21:43:27 -0000
> @@ -305,13 +305,17 @@
>  static void
>  cgdstrategy(struct buf *bp)
>  {
> -	struct cgd_softc *cs = getcgd_softc(bp->b_dev);
> -	struct dk_softc *dksc = &cs->sc_dksc;
> -	struct disk_geom *dg = &dksc->sc_dkdev.dk_geom;
> +	struct cgd_softc *cs;
> +	struct dk_softc *dksc;
> +	struct disk_geom *dg;
>
>  	DPRINTF_FOLLOW(("cgdstrategy(%p): b_bcount = %ld\n", bp,
>  	    (long)bp->b_bcount));
>
> +	GETCGD_SOFTC(cs, bp->b_dev);
> +	dksc = &cs->sc_dksc;
> +	dg = &dksc->sc_dkdev.dk_geom;
> +

It will not compile because cgdstrategy() returns void.

Alex

Michael van Elst

Sep 14, 2016, 2:22:56 AM

to Alexander Nasonov, tech...@netbsd.org

Right. This needs to be written differently. Instead of GETCGD_SOFTC()
use:

	cs = getcgd_softc(bp->b_dev);
	if (!cs) {
		bp->b_error = ENXIO;
		biodone(bp);
		return;
	}
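
The shape of that early return can be tried outside the kernel. The
sketch below is a userland model under stated assumptions: struct
sketch_buf, struct sketch_softc, sketch_getsoftc() and sketch_biodone()
are hypothetical stand-ins for struct buf, struct cgd_softc,
getcgd_softc() and biodone(), and setting b_resid on the error path is
an addition of the sketch, not part of the snippet above. The point it
illustrates is that a void strategy routine reports failure by storing
an error in the buffer and still completing it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct cgd_softc and struct buf. */
struct sketch_softc {
	unsigned secsize;
};

struct sketch_buf {
	int unit;		/* stands in for b_dev */
	int b_error;
	unsigned b_bcount;
	unsigned b_resid;
	int done;		/* set once the request has been completed */
};

static struct sketch_softc sketch_units[2] = { { 512 }, { 512 } };

/* Like getcgd_softc() in spirit: NULL when no such unit exists. */
static struct sketch_softc *
sketch_getsoftc(int unit)
{
	if (unit < 0 || unit >= 2)
		return NULL;
	return &sketch_units[unit];
}

/* Stands in for biodone(): hands the buffer back to its issuer. */
static void
sketch_biodone(struct sketch_buf *bp)
{
	bp->done = 1;
}

static void
sketch_strategy(struct sketch_buf *bp)
{
	struct sketch_softc *sc = sketch_getsoftc(bp->unit);

	if (sc == NULL) {
		/* Fail the request, and still complete it. */
		bp->b_error = ENXIO;
		bp->b_resid = bp->b_bcount;
		sketch_biodone(bp);
		return;
	}
	/* Real I/O would be queued here; the sketch completes at once. */
	bp->b_resid = 0;
	sketch_biodone(bp);
}

/* Usage: a request for a missing unit ends with ENXIO and done set. */
static void
sketch_selfcheck(void)
{
	struct sketch_buf bp = { 5, 0, 512, 0, 0 };

	sketch_strategy(&bp);
	assert(bp.b_error == ENXIO && bp.done == 1);
}
```

Without the completion call the issuer of the request would have no
signal that the buffer is finished, whatever b_error or b_resid say.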

Alexander Nasonov

Sep 14, 2016, 4:38:58 AM

to Michael van Elst, tech...@netbsd.org

Michael van Elst wrote:
> Right. This needs to be written differently. Instead of GETCGD_SOFTC()
> use:
>
> cs = getcgd_softc(bp->b_dev);
> if (!cs) {
> bp->b_error = ENXIO;
> biodone(bp);
> return;
> }

I tried something similar but with bp->b_resid = bp->b_bcount;
instead of biodone(bp); It still crashes.

I'll try your code.

Alex

Alexander Nasonov

Sep 14, 2016, 4:49:18 PM

to Michael van Elst, tech...@netbsd.org

Michael van Elst wrote:
> Right. This needs to be written differently. Instead of GETCGD_SOFTC()
> use:
>
> cs = getcgd_softc(bp->b_dev);
> if (!cs) {
> bp->b_error = ENXIO;
> biodone(bp);
> return;
> }

I enabled DEBUG in the config and changed cgdstrategy. Same crash:

Stopped in pid 10.1 (mount_ffs) at netbsd:cgdstrategy+0x2d: divl 40(%r12),%eax

ffffffff808edcd8 <cgdstrategy>:
ffffffff808edcd8: 55 push %rbp
ffffffff808edcd9: 48 89 e5 mov %rsp,%rbp
ffffffff808edcdc: 53 push %rbx
ffffffff808edcdd: 48 83 ec 08 sub $0x8,%rsp
ffffffff808edce1: 48 89 fb mov %rdi,%rbx
ffffffff808edce4: f6 05 d5 d0 8e 00 01 testb $0x1,0x8ed0d5(%rip) # ffffffff811dadc0 <cgddebug>
ffffffff808edceb: 75 52 jne ffffffff808edd3f <cgdstrategy+0x67>
ffffffff808edced: 48 8b 7b 38 mov 0x38(%rbx),%rdi
ffffffff808edcf1: e8 e5 fd ff ff callq ffffffff808edadb <getcgd_softc>
ffffffff808edcf6: 48 89 c7 mov %rax,%rdi
ffffffff808edcf9: 48 85 c0 test %rax,%rax
ffffffff808edcfc: 74 58 je ffffffff808edd56 <cgdstrategy+0x7e>
ffffffff808edcfe: 48 83 7b 48 00 cmpq $0x0,0x48(%rbx)
ffffffff808edd03: 8b 4b 34 mov 0x34(%rbx),%ecx
ffffffff808edd06: 78 11 js ffffffff808edd19 <cgdstrategy+0x41>
ffffffff808edd08: 89 c8 mov %ecx,%eax
ffffffff808edd0a: 31 d2 xor %edx,%edx
ffffffff808edd0c: f7 77 40 divl 0x40(%rdi)
ffffffff808edd0f: 85 d2 test %edx,%edx
ffffffff808edd11: 75 06 jne ffffffff808edd19 <cgdstrategy+0x41>
ffffffff808edd13: f6 43 40 03 testb $0x3,0x40(%rbx)
ffffffff808edd17: 74 18 je ffffffff808edd31 <cgdstrategy+0x59>
ffffffff808edd19: c7 43 20 16 00 00 00 movl $0x16,0x20(%rbx)
ffffffff808edd20: 89 4b 24 mov %ecx,0x24(%rbx)
ffffffff808edd23: 48 89 df mov %rbx,%rdi
ffffffff808edd26: 48 83 c4 08 add $0x8,%rsp
ffffffff808edd2a: 5b pop %rbx
ffffffff808edd2b: 5d pop %rbp
ffffffff808edd2c: e9 f0 c3 fc ff jmpq ffffffff808ba121 <biodone>
ffffffff808edd31: 48 89 de mov %rbx,%rsi
ffffffff808edd34: 48 83 c4 08 add $0x8,%rsp
ffffffff808edd38: 5b pop %rbx
ffffffff808edd39: 5d pop %rbp
ffffffff808edd3a: e9 a1 2e 00 00 jmpq ffffffff808f0be0 <dk_strategy>
ffffffff808edd3f: 48 63 57 34 movslq 0x34(%rdi),%rdx
ffffffff808edd43: 48 89 fe mov %rdi,%rsi
ffffffff808edd46: 48 c7 c7 18 15 f9 80 mov $0xffffffff80f91518,%rdi
ffffffff808edd4d: 31 c0 xor %eax,%eax
ffffffff808edd4f: e8 4f d8 f8 ff callq ffffffff8087b5a3 <printf>
ffffffff808edd54: eb 97 jmp ffffffff808edced <cgdstrategy+0x15>
ffffffff808edd56: c7 43 20 06 00 00 00 movl $0x6,0x20(%rbx)
ffffffff808edd5d: eb c4 jmp ffffffff808edd23 <cgdstrategy+0x4b>

ffffffff808eeb2e: 48 c7 c7 d8 dc 8e 80 mov $0xffffffff808edcd8,%rdi
ffffffff808eeb35: 5b pop %rbx
ffffffff808eeb36: 41 5c pop %r12
ffffffff808eeb38: 5d pop %rbp
ffffffff808eeb39: e9 4f db f4 ff jmpq ffffffff8083c68d <physio>

ffffffff808eeb9d: 48 c7 c7 d8 dc 8e 80 mov $0xffffffff808edcd8,%rdi
ffffffff808eeba4: 5b pop %rbx
ffffffff808eeba5: 41 5c pop %r12
ffffffff808eeba7: 5d pop %rbp
ffffffff808eeba8: e9 e0 da f4 ff jmpq ffffffff8083c68d <physio>

Alex
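
One possible reading of the listing above, offered as a sketch rather
than a diagnosis: the test %rax,%rax / je pair shows the NULL check is
compiled in, and the two movl stores write 0x6 and 0x16, which match
ENXIO (6) and EINVAL (22). If the divl is still reached and faults, the
softc pointer was non-NULL and the 32-bit word at offset 0x40 was zero.
The userland model below is not from the thread or from cgd.c;
sketch_check() and struct sketch_geom are hypothetical, and treating a
zero sector size as "unit not configured" is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the one geometry field that matters here. */
struct sketch_geom {
	uint32_t dg_secsize;
};

/*
 * Returns 0 when the request may proceed, or the errno value that
 * would be stored in b_error. The buffer-address alignment test that
 * follows in the real condition is left out of this sketch.
 */
static int
sketch_check(const struct sketch_geom *dg, int64_t b_blkno, uint32_t b_bcount)
{
	if (dg == NULL)
		return ENXIO;		/* no such unit */
	if (dg->dg_secsize == 0)
		return ENXIO;		/* assumption: unit not configured */
	if (b_blkno < 0 || (b_bcount % dg->dg_secsize) != 0)
		return EINVAL;		/* unaligned or negative request */
	return 0;
}

/* Usage: a zero sector size is refused before any division happens. */
static void
sketch_selfcheck(void)
{
	struct sketch_geom unconfigured = { 0 };

	assert(sketch_check(&unconfigured, 0, 512) == ENXIO);
}
```

The order of the tests matters: the zero check has to come before the
modulo, otherwise the division is evaluated first.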