scsi: sg: assorted memory corruptions

Dmitry Vyukov

Jan 22, 2018, 6:07:20 AM

to Doug Gilbert, je...@linux.vnet.ibm.com, Martin K. Petersen, linux-scsi, LKML, Ben Hutchings, syzkaller

Hello,

The following program triggers assorted memory corruptions on 4.15-rc9:

// autogenerated by syzkaller (http://github.com/google/syzkaller)
#include <sys/ioctl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

#define SG_NEXT_CMD_LEN 0x2283

int main()
{
	int fd = open("/dev/sg0", O_RDWR);
	long len = 9;
	ioctl(fd, SG_NEXT_CMD_LEN, &len);
	char* p = "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x47\x00\x00\x24\x00"
		"\x00\x00\x00\x00\x00\x1c\xbb\xac\x14\x00\xaa\xe0\x00\x00\x01"
		"\x00\x07\x07\x00\x00\x59\x08\x00\x00\x00\x80\xfe\x7f\x00\x00\x01";
	write(fd, p, 46);
	return 0;
}

Run it in a loop as "while ./a.out; do true; done". Below are some
manifestations, but it really looks like it smashes the heap badly and
then manifests in random ways:

general protection fault: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 2 PID: 11158 Comm: syz-executor2 Not tainted 4.15.0-rc9+ #65
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
RIP: 0010:find_stack lib/stackdepot.c:173 [inline]
RIP: 0010:depot_save_stack+0x108/0x440 lib/stackdepot.c:225
RSP: 0018:ffff88007118ed68 EFLAGS: 00010002
RAX: 0000000033ae8ebb RBX: 00000000891e24d1 RCX: 0000000000000002
RDX: 0000000024208bf1 RSI: 0000000001000000 RDI: ffff88007118edc0
RBP: ffff88007118edb0 R08: 1ffff1000e231d77 R09: ffff88007118edd8
R10: 00000000e160d61a R11: 00000000f692b9a9 R12: 000000000000000d
R13: 0000000000000068 R14: 0001800800008008 R15: 00000000000e24d1
FS: 00000000023e4940(0000) GS:ffff88002db00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b9bc25000 CR3: 000000007d476000 CR4: 00000000000026e0
DR0: 0000000020000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
Call Trace:
save_stack+0xa3/0xd0 mm/kasan/kasan.c:453
set_track mm/kasan/kasan.c:459 [inline]
kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
__cache_free mm/slab.c:3488 [inline]
kfree+0xc5/0x160 mm/slab.c:3803
__mmu_notifier_mm_destroy+0x116/0x1c0 mm/mmu_notifier.c:323
mmu_notifier_mm_destroy include/linux/mmu_notifier.h:297 [inline]
__mmdrop+0x104/0x3f0 kernel/fork.c:908
mmdrop include/linux/sched/mm.h:43 [inline]
finish_task_switch+0x44c/0x6f0 kernel/sched/core.c:2671
context_switch kernel/sched/core.c:2802 [inline]
__schedule+0x842/0x1e10 kernel/sched/core.c:3375
schedule+0xe8/0x420 kernel/sched/core.c:3434
freezable_schedule include/linux/freezer.h:172 [inline]
futex_wait_queue_me+0x3af/0x770 kernel/futex.c:2548
futex_wait+0x374/0x9e0 kernel/futex.c:2663
do_futex+0xe20/0x2750 kernel/futex.c:3545
SYSC_futex kernel/futex.c:3605 [inline]
SyS_futex+0x368/0x485 kernel/futex.c:3573
entry_SYSCALL_64_fastpath+0x24/0x8c
RIP: 0033:0x4482b9
RSP: 002b:0000000000a2f908 EFLAGS: 00000206 ORIG_RAX: 00000000000000ca
RAX: ffffffffffffffda RBX: 000000000071bea0 RCX: 00000000004482b9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000071becc
RBP: 00000000000000bb R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000a2f910 R11: 0000000000000206 R12: 00000000000003b7
R13: 0000000000000b5a R14: 00000000c59c644d R15: 0000000000000000
Code: 75 b8 48 89 7d c0 41 81 e7 ff ff 0f 00 4e 8b 34 fd 60 f9 ed 87
4d 85 f6 74 5e 4d 63 ec 49 c1 e5 03 eb 08 4d 8b 36 4d 85 f6 74 4d <41>
39 5e 08 75 f2 45 3b 66 0c 75 ec 49 8d 76 18 4c 89 cf 4c 89
RIP: find_stack lib/stackdepot.c:173 [inline] RSP: ffff88007118ed68
RIP: depot_save_stack+0x108/0x440 lib/stackdepot.c:225 RSP: ffff88007118ed68
---[ end trace a25d77609c7bff29 ]---

[ 71.351814] general protection fault: 0000 [#1] SMP KASAN
[ 71.352992] Modules linked in:
[ 71.353611] CPU: 3 PID: 3724 Comm: bash Not tainted 4.15.0-rc9+ #65
[ 71.354666] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS Bochs 01/01/2011
[ 71.356440] RIP: 0010:thread_group_cputime+0x4b8/0x1000
[ 71.357354] RSP: 0018:ffff88005dbff658 EFLAGS: 00010206
[ 71.358240] RAX: 0000000000000005 RBX: dffffc0000000000 RCX: 1ffff1000cf0b917
[ 71.359403] RDX: ffff88005dbff838 RSI: 1ffff1000bb7ff06 RDI: 0000000000000028
[ 71.360757] RBP: ffff88005dbff800 R08: ffff88005dbff840 R09: ffff88006454a280
[ 71.362314] R10: ffff88006b06e1c0 R11: ffff88006785c1c0 R12: fffffffffffffa70
[ 71.363800] R13: ffff88005dbff830 R14: ffffed000bb7fef3 R15: ffff88005dbff7d8
[ 71.365291] FS: 00007fb375f95700(0000) GS:ffff88006cb80000(0000)
knlGS:0000000000000000
[ 71.366990] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 71.368198] CR2: 00000000006edbd4 CR3: 00000000637fb002 CR4: 00000000001606e0
[ 71.369710] Call Trace:
[ 71.370259] ? print_unlock_imbalance_bug+0x70/0x70
[ 71.371297] ? account_idle_time+0x1b0/0x1b0
[ 71.372292] ? lock_downgrade+0x8e0/0x8e0
[ 71.373148] ? lock_downgrade+0x8e0/0x8e0
[ 71.374036] thread_group_cputime_adjusted+0x6b/0xd0
[ 71.375086] ? task_cputime_adjusted+0x240/0x240
[ 71.376067] wait_consider_task+0x1a91/0x38b0
[ 71.376987] ? graph_lock+0x160/0x160
[ 71.377780] ? graph_lock+0x160/0x160
[ 71.378568] ? exit_notify+0xb60/0xb60
[ 71.379381] ? print_unlock_imbalance_bug+0x70/0x70
[ 71.380420] ? find_held_lock+0x35/0x1d0
[ 71.381262] ? lock_acquire+0x1f7/0x4f0
[ 71.382151] ? do_wait+0x3ba/0x9d0
[ 71.382885] ? lock_downgrade+0x8e0/0x8e0
[ 71.383743] ? lock_release+0xaf0/0xaf0
[ 71.384566] ? add_wait_queue+0x19e/0x230
[ 71.385427] ? __wake_up_locked_key_bookmark+0x20/0x20
[ 71.386553] ? task_active_pid_ns+0xd0/0xd0
[ 71.387451] do_wait+0x45b/0x9d0
[ 71.388155] ? wait_consider_task+0x38b0/0x38b0
[ 71.389139] ? tty_vhangup+0x30/0x30
[ 71.389933] ? find_held_lock+0x35/0x1d0
[ 71.390777] ? lock_downgrade+0x8e0/0x8e0
[ 71.391634] ? lock_release+0xaf0/0xaf0
[ 71.392522] ? do_raw_spin_unlock+0x1f0/0x2d0
[ 71.393487] kernel_wait4+0x234/0x3b0
[ 71.394289] ? SyS_waitid+0x50/0x50
[ 71.395041] ? task_stopped_code+0x190/0x190
[ 71.395957] ? sigprocmask+0xf4/0x2e0
[ 71.396746] SYSC_wait4+0x119/0x120
[ 71.397516] ? kernel_wait4+0x3b0/0x3b0
[ 71.398353] ? _copy_to_user+0x85/0xd0
[ 71.399169] ? __sanitizer_cov_trace_const_cmp8+0x18/0x20
[ 71.400318] ? SyS_rt_sigprocmask+0x1ca/0x240
[ 71.401251] ? sigprocmask+0x2e0/0x2e0
[ 71.402147] ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
[ 71.403291] ? security_file_ioctl+0x95/0xc0
[ 71.404210] SyS_wait4+0x2c/0x40
[ 71.404913] entry_SYSCALL_64_fastpath+0x24/0x8c
[ 71.405928] RIP: 0033:0x7fb375671a3e
[ 71.406699] RSP: 002b:00007ffc523ec340 EFLAGS: 00000246 ORIG_RAX:
000000000000003d
[ 71.408297] RAX: ffffffffffffffda RBX: 00000000ffffffff RCX: 00007fb375671a3e
[ 71.409817] RDX: 000000000000000a RSI: 00007ffc523ec398 RDI: ffffffffffffffff
[ 71.411317] RBP: 0000000000000000 R08: 00000000011c1a48 R09: 0000000000000000
[ 71.412867] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
[ 71.414388] R13: 00000000011abb88 R14: 0000000000000000 R15: 00000000011abd08
[ 71.415888] Code: c0 10 49 39 c6 0f 84 09 01 00 00 48 c1 ef 03 4c
89 ee 49 89 fe 48 c1 ee 03 49 01 de 49 8d bc 24 b8 05 00 00 48 89 f8
48 c1 e8 03 <80> 3c 18 00 0f 85 c7 05 00 00 49 8d bc 24 c0 05 00 00 49
8b 84
[ 71.419966] RIP: thread_group_cputime+0x4b8/0x1000 RSP: ffff88005dbff658
[ 71.421464] ---[ end trace 982cd2844bb6092a ]---

[ 493.794289] BUG: unable to handle kernel paging request at fffff1e03c000220
[ 493.795959] IP: qlist_free_all+0xe4/0x110
[ 493.796893] PGD 0 P4D 0
[ 493.797450] Oops: 0000 [#1] SMP KASAN
[ 493.798274] Modules linked in:
[ 493.798953] CPU: 1 PID: 4273 Comm: a.out Not tainted 4.15.0-rc9+ #65
[ 493.800321] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS Bochs 01/01/2011
[ 493.802005] RIP: 0010:qlist_free_all+0xe4/0x110
[ 493.802960] RSP: 0018:ffff88006ae17cb8 EFLAGS: 00010286
[ 493.804073] RAX: 0001800f0000800f RBX: 0000000000000282 RCX: ffffea0000000000
[ 493.805540] RDX: fffff1e03c000200 RSI: 000077ff80000000 RDI: 0000000000000000
[ 493.807061] RBP: ffff88006ae17ce0 R08: 1ffff1000d5c2f68 R09: ffff880063e60040
[ 493.808562] R10: 1ffff1000c7cc115 R11: 0000000000000001 R12: ffff88006ae17cf0
[ 493.810080] R13: 0001800f0000800f R14: ffffffff86acaf20 R15: 0000000000000000
[ 493.812477] FS: 00000000007e9880(0000) GS:ffff88006ca80000(0000)
knlGS:0000000000000000
[ 493.813762] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 493.814678] CR2: fffff1e03c000220 CR3: 000000006b812003 CR4: 00000000001606e0
[ 493.815830] Call Trace:
[ 493.816280] quarantine_reduce+0x141/0x170
[ 493.817011] kasan_kmalloc+0x99/0xe0
[ 493.817607] kasan_slab_alloc+0x12/0x20
[ 493.818254] kmem_cache_alloc+0x10c/0x620
[ 493.818899] ? map_id_range_down+0x1e6/0x400
[ 493.819585] getname_flags+0xd0/0x5a0
[ 493.820184] user_path_at_empty+0x2d/0x50
[ 493.820819] SyS_access+0x254/0x7b0
[ 493.821399] ? SyS_faccessat+0x7c0/0x7c0
[ 493.822061] ? lockdep_sys_exit_thunk+0x16/0x29
[ 493.822814] ? async_page_fault+0x36/0x60
[ 493.823481] entry_SYSCALL_64_fastpath+0x24/0x8c
[ 493.824251] RIP: 0033:0x463327
[ 493.824748] RSP: 002b:00007ffe8b588888 EFLAGS: 00000246 ORIG_RAX:
0000000000000015
[ 493.825938] RAX: ffffffffffffffda RBX: 00000000007ea1f0 RCX: 0000000000463327
[ 493.827132] RDX: 0000000000000004 RSI: 0000000000000000 RDI: 00000000004af1a1
[ 493.828214] RBP: 00007ffe8b588ae8 R08: 00007ffe8b5c1040 R09: 0000000000000000
[ 493.829226] R10: 00000000006c3f20 R11: 0000000000000246 R12: 00007ffe8b588af8
[ 493.830308] R13: 0000000000401d20 R14: 0000000000401db0 R15: 0000000000000000
[ 493.831386] Code: 00 00 00 80 48 01 c2 72 42 48 be 00 00 00 80 ff
77 00 00 48 01 f2 48 b9 00 00 00 00 00 ea ff ff 48 c1 ea 0c 48 c1 e2
06 48 01 ca <48> 8b 72 20 48 8d 7e ff 83 e6 01 48 0f 45 d7 48 8b 7a 30
e9 36
[ 493.834255] RIP: qlist_free_all+0xe4/0x110 RSP: ffff88006ae17cb8
[ 493.835157] CR2: fffff1e03c000220
[ 493.835733] ---[ end trace 1fbd2672ad8e619c ]---

Bart Van Assche

Jan 22, 2018, 11:30:55 AM

to je...@linux.vnet.ibm.com, linux...@vger.kernel.org, dgil...@interlog.com, dvy...@google.com, linux-...@vger.kernel.org, martin....@oracle.com, ben.hu...@codethink.co.uk, syzk...@googlegroups.com

On Mon, 2018-01-22 at 12:06 +0100, Dmitry Vyukov wrote:
> general protection fault: 0000 [#1] SMP KASAN

How about the untested patch below?

Thanks,

Bart.

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index cd9b6ebd7257..04a644b39d79 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -627,6 +627,10 @@ sg_write(struct file *filp, const char __user *buf, size_t count, loff_t * ppos)
 	mutex_unlock(&sfp->f_mutex);
 	SCSI_LOG_TIMEOUT(4, sg_printk(KERN_INFO, sdp,
 		"sg_write: scsi opcode=0x%02x, cmd_size=%d\n", (int) opcode, cmd_size));
+	if (cmd_size > sizeof(cmnd)) {
+		sg_remove_request(sfp, srp);
+		return -EFAULT;
+	}
 /* Determine buffer size. */
 	input_size = count - cmd_size;
 	mxsize = max(input_size, old_hdr.reply_len);

Douglas Gilbert

Jan 22, 2018, 1:57:58 PM

to Bart Van Assche, je...@linux.vnet.ibm.com, linux...@vger.kernel.org, dvy...@google.com, linux-...@vger.kernel.org, martin....@oracle.com, ben.hu...@codethink.co.uk, syzk...@googlegroups.com

Using 'scsi_logging_level -s -T 5' on the sg driver and running the test
program provided, the cmd_size is 9, just like the ioctl() in his program
set it to. The sizeof(cmnd) is 252. So I don't know what caused the
GPF but it wasn't cmd_size being out of bounds.

As for the above patch, did you notice this check in that function:

	if ((!hp->cmdp) || (hp->cmd_len < 6) || (hp->cmd_len > sizeof (cmnd))) {
		sg_remove_request(sfp, srp);
		return -EMSGSIZE;
	}
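
As a rough illustration of why a size check does not trip here, the
sketch below redoes the length part of that test in user space. It
assumes the 252-byte sizeof(cmnd) quoted above; SKETCH_MAX_CDB and
sketch_cdb_len_in_bounds() are made-up names, not kernel symbols.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 252 is the sizeof(cmnd) value reported in this thread. */
#define SKETCH_MAX_CDB 252

/*
 * Hypothetical helper: mirrors only the length part of the check quoted
 * above (at least 6 bytes, at most sizeof(cmnd)). It does not look at the
 * command pointer or the command contents.
 */
bool sketch_cdb_len_in_bounds(size_t cmd_len)
{
	return cmd_len >= 6 && cmd_len <= SKETCH_MAX_CDB;
}
```

With the test program's length of 9 the helper returns true, so a check
on size alone would not have rejected the command.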

As far as I remember, Dmitry has not indicated in multiple reports
over several years what /dev/sg0 is. Perhaps it misbehaves when it
gets a SCSI command in the T10 range (i.e. not vendor specific) with
a 9 byte cdb length. As far as I'm aware T10 (and the Ansi committee
before it) have never defined a cdb with an odd length.

For those that are not aware, the sg driver is a relatively thin
shim over the block layer, the SCSI mid-level, and a low-level
driver which may have another kernel driver stack underneath it
(e.g. UAS (USB attached SCSI)). The previous report from syzkaller
on the sg driver ("scsi: memory leak in sg_start_req") has resulted
in one accepted patch on the block layer with probably more to
come in the same area.

Testing the program Dmitry gave (with some added error checks which
reported no problems) with the scsi_debug driver supplying /dev/sg0
I have not seen any problems running that test program. Again
there might be a very slow memory leak, but if there is I don't
believe it is in the sg driver.

While it's not invalid from a testing perspective, throwing total
nonsense at a pass-through mechanism, including absurd SCSI commands
at best will test error paths, but only at a very shallow level.
Setting up almost valid pass-through scenarios will test error
paths at a deeper level. Then there are lots of valid pass-through
scenarios that would be expected not to fail.

Doug Gilbert

Dmitry Vyukov

Jan 22, 2018, 2:06:23 PM

to Doug Gilbert, Bart Van Assche, je...@linux.vnet.ibm.com, linux...@vger.kernel.org, linux-...@vger.kernel.org, martin....@oracle.com, ben.hu...@codethink.co.uk, syzk...@googlegroups.com

That's because I know nothing about sg. If you give me a command to run,
I will provide its output.

> Perhaps it misbehaves when it
> gets a SCSI command in the T10 range (i.e. not vendor specific) with
> a 9 byte cdb length. As far as I'm aware T10 (and the Ansi committee
> before it) have never defined a cdb with an odd length.
>
> For those that are not aware, the sg driver is a relatively thin
> shim over the block layer, the SCSI mid-level, and a low-level
> driver which may have another kernel driver stack underneath it
> (e.g. UAS (USB attached SCSI)). The previous report from syzkaller
> on the sg driver ("scsi: memory leak in sg_start_req") has resulted
> in one accepted patch on the block layer with probably more to
> come in the same area.
>
> Testing the patch Dmitry gave (with some added error checks which
> reported no problems) with the scsi_debug driver supplying /dev/sg0
> I have not seen any problems running that test program. Again
> there might be a very slow memory leak, but if there is I don't
> believe it is in the sg driver.

Did you run it in a loop? First runs pass just fine for me too.

> While it's not invalid from a testing perspective, throwing total
> nonsense at a pass-through mechanism, including absurd SCSI commands
> at best will test error paths, but only at a very shallow level.
> Setting up almost valid pass-through scenarios will test error
> paths at a deeper level. Then there are lots of valid pass-through
> scenarios that would be expected not to fail.

Agreed. syzkaller can test very elaborate scenarios, but it needs help
for this (someone has to tell it what these "almost valid" inputs are).
The kernel has hundreds of APIs; some of them are quite elaborate,
undocumented, and require expertise to use, so we can't describe all of
them. Frequently, after a proper description is added, syzkaller finds a
dozen or two bugs in the subsystem.

Bart Van Assche

Jan 22, 2018, 2:13:59 PM

to dgil...@interlog.com, dvy...@google.com, je...@linux.vnet.ibm.com, linux...@vger.kernel.org, linux-...@vger.kernel.org, ben.hu...@codethink.co.uk, martin....@oracle.com, syzk...@googlegroups.com

On Mon, 2018-01-22 at 20:06 +0100, Dmitry Vyukov wrote:
> On Mon, Jan 22, 2018 at 7:57 PM, Douglas Gilbert <dgil...@interlog.com> wrote:
> > As far as I remember, Dmitry has not indicated in multiple reports
> > over several years what /dev/sg0 is.
>
> That's because I know nothing about sg. If you give a command to run,
> I will provide it's output.

/dev/sg0 refers to a SCSI device. We would like to know the name of the SCSI
LLD driver. Would it be possible to provide the output of the following commands:

readlink /sys/class/scsi_generic/sg0
cat /sys/class/scsi_generic/sg0/device/vendor

Thanks,

Bart.

Dmitry Vyukov

Jan 30, 2018, 7:22:49 AM

to Doug Gilbert, Bart Van Assche, je...@linux.vnet.ibm.com, linux...@vger.kernel.org, linux-...@vger.kernel.org, martin....@oracle.com, ben.hu...@codethink.co.uk, syzk...@googlegroups.com

Uh, I answered this a week ago, but did not notice that Doug had
dropped everybody from CC. Reposting to all.

On Mon, Jan 22, 2018 at 8:16 PM, Douglas Gilbert <dgil...@interlog.com> wrote:

> On 2018-01-22 02:06 PM, Dmitry Vyukov wrote:
>>
>> On Mon, Jan 22, 2018 at 7:57 PM, Douglas Gilbert <dgil...@interlog.com>

> Please show me the output of 'lsscsi -g' on your test machine.
> /dev/sg0 is often associated with /dev/sda which is often a SATA
> SSD (or a virtualized one) that holds the root file system.
> With the sg pass-through driver it is relatively easy to write
> random (user provided data) over the root file system which will
> almost certainly "root" the system.

This is a pretty standard QEMU VM started with:

qemu-system-x86_64 -hda wheezy.img -net user,host=10.0.2.10 -net nic
-nographic -kernel arch/x86/boot/bzImage -append "console=ttyS0
root=/dev/sda earlyprintk=serial " -m 2G -smp 4

# lsscsi -g
[0:0:0:0] disk ATA QEMU HARDDISK 0 /dev/sda /dev/sg0
[1:0:0:0] cd/dvd QEMU QEMU DVD-ROM 2.0. /dev/sr0 /dev/sg1

# readlink /sys/class/scsi_generic/sg0
../../devices/pci0000:00/0000:00:01.1/ata1/host0/target0:0:0/0:0:0:0/scsi_generic/sg0

# cat /sys/class/scsi_generic/sg0/device/vendor
ATA

>>> Perhaps it misbehaves when it
>>> gets a SCSI command in the T10 range (i.e. not vendor specific) with
>>> a 9 byte cdb length. As far as I'm aware T10 (and the Ansi committee
>>> before it) have never defined a cdb with an odd length.
>>>
>>> For those that are not aware, the sg driver is a relatively thin
>>> shim over the block layer, the SCSI mid-level, and a low-level
>>> driver which may have another kernel driver stack underneath it
>>> (e.g. UAS (USB attached SCSI)). The previous report from syzkaller
>>> on the sg driver ("scsi: memory leak in sg_start_req") has resulted
>>> in one accepted patch on the block layer with probably more to
>>> come in the same area.
>>>
>>> Testing the patch Dmitry gave (with some added error checks which
>>> reported no problems) with the scsi_debug driver supplying /dev/sg0
>>> I have not seen any problems running that test program. Again
>>> there might be a very slow memory leak, but if there is I don't
>>> believe it is in the sg driver.
>>
>> Did you run it in a loop? First runs pass just fine for me too.
>
> Is thirty minutes long enough ??

Yes, it certainly should be enough. Here is what I see:

# while ./a.out; do echo RUN; done
RUN
RUN
RUN
RUN
RUN
RUN
RUN
[ 371.977266] ==================================================================
[ 371.980158] BUG: KASAN: double-free or invalid-free in
__put_task_struct+0x1e7/0x5c0
....

Here is the full execution trace of the write call, in case it is of any help:
https://gist.githubusercontent.com/dvyukov/14ae64c3e753dedf9ab2608676ecf0b9/raw/9803d52bb1e317a9228e362236d042aaf0fa9d69/gistfile1.txt

This is on upstream commit 0d665e7b109d512b7cae3ccef6e8654714887844.
Also attaching my config just in case.

.config

Douglas Gilbert

Feb 1, 2018, 1:03:47 AM

to Dmitry Vyukov, Bart Van Assche, je...@linux.vnet.ibm.com, linux...@vger.kernel.org, linux-...@vger.kernel.org, martin....@oracle.com, ben.hu...@codethink.co.uk, syzk...@googlegroups.com

On 2018-01-30 07:22 AM, Dmitry Vyukov wrote:
> Uh, I've answered this a week ago, but did not notice that Doug
> dropped everybody from CC. Reporting to all.
>
> On Mon, Jan 22, 2018 at 8:16 PM, Douglas Gilbert <dgil...@interlog.com> wrote:
>> On 2018-01-22 02:06 PM, Dmitry Vyukov wrote:
>>>
>>> On Mon, Jan 22, 2018 at 7:57 PM, Douglas Gilbert <dgil...@interlog.com>
>> Please show me the output of 'lsscsi -g' on your test machine.
>> /dev/sg0 is often associated with /dev/sda which is often a SATA
>> SSD (or a virtualized one) that holds the root file system.
>> With the sg pass-through driver it is relatively easy to write
>> random (user provided data) over the root file system which will
>> almost certainly "root" the system.
>
>
> This is pretty standard qemu vm started with:
>
> qemu-system-x86_64 -hda wheezy.img -net user,host=10.0.2.10 -net nic
> -nographic -kernel arch/x86/boot/bzImage -append "console=ttyS0
> root=/dev/sda earlyprintk=serial " -m 2G -smp 4
>
> # lsscsi -g
> [0:0:0:0] disk ATA QEMU HARDDISK 0 /dev/sda /dev/sg0

With lk 4.15.0-rc9 I can run your test program (with some additions, see
attachment) for 30 minutes against a scsi_debug simulated disk. You can
easily replicate this test: just run 'modprobe scsi_debug' and a third
line should appear in your lsscsi output. The new device will most likely
be /dev/sg2.

With lk 4.15.0 (release) running against a SAS SSD (SEAGATE ST200FM0073),
the test has been running 20 minutes and counting without problems. That
is using a LSI HBA with the mpt3sas driver.

> [1:0:0:0] cd/dvd QEMU QEMU DVD-ROM 2.0. /dev/sr0 /dev/sg1
>
> # readlink /sys/class/scsi_generic/sg0
> ../../devices/pci0000:00/0000:00:01.1/ata1/host0/target0:0:0/0:0:0:0/scsi_generic/sg0
>
> # cat /sys/class/scsi_generic/sg0/device/vendor
> ATA

^^^^^
That subsystem is the culprit IMO, most likely libata.

Until you can show this test failing on something other than an
ATA disk, I will treat this issue as closed.

Doug Gilbert

sg_syzk_next_cbd.c

Dmitry Vyukov

Feb 1, 2018, 2:05:14 AM

to Doug Gilbert, Bart Van Assche, je...@linux.vnet.ibm.com, linux...@vger.kernel.org, linux-...@vger.kernel.org, martin....@oracle.com, ben.hu...@codethink.co.uk, syzk...@googlegroups.com

Hi Doug,

Why is a bug in ATA not a bug? Is it long unused by everybody? I got
it by running qemu with default flags...

Ben Hutchings

Feb 1, 2018, 11:17:48 AM

to Dmitry Vyukov, Doug Gilbert, Bart Van Assche, je...@linux.vnet.ibm.com, linux...@vger.kernel.org, linux-...@vger.kernel.org, martin....@oracle.com, syzk...@googlegroups.com

On Thu, 2018-02-01 at 08:04 +0100, Dmitry Vyukov wrote:
> On Thu, Feb 1, 2018 at 7:03 AM, Douglas Gilbert <dgil...@interlog.com> wrote:
> > On 2018-01-30 07:22 AM, Dmitry Vyukov wrote:

[...]

> > > [1:0:0:0] cd/dvd QEMU QEMU DVD-ROM 2.0. /dev/sr0 /dev/sg1
> > >
> > > # readlink /sys/class/scsi_generic/sg0
> > >
> > > ../../devices/pci0000:00/0000:00:01.1/ata1/host0/target0:0:0/0:0:0:0/scsi_generic/sg0
> > >
> > > # cat /sys/class/scsi_generic/sg0/device/vendor
> > > ATA
> >
> >
> > ^^^^^
> > That subsystem is the culprit IMO, most likely libata.
> >
> > Until you can show this test failing on something other than an
> > ATA disk, then I will treat this issue as closed.
>
> Hi Doug,
>
> Why is bug in ATA not a bug? Is it long unused by everybody? I've got
> it by running qemu with default flags...

If the bug is in libata then it's not on Doug to fix it since he's only
maintaining sg.

Ben.

--
Ben Hutchings
Software Developer, Codethink Ltd.

Dmitry Vyukov

Feb 1, 2018, 11:21:34 AM

to Ben Hutchings, Tejun Heo, linu...@vger.kernel.org, Doug Gilbert, Bart Van Assche, je...@linux.vnet.ibm.com, linux...@vger.kernel.org, linux-...@vger.kernel.org, martin....@oracle.com, syzk...@googlegroups.com

Then I think we need to CC ata maintainers rather than treat it as closed.
+Tejun, linux-ide@, you can see full thread here:
https://groups.google.com/forum/#!topic/syzkaller/9RNr9Gu0MyY

Eric Biggers

Feb 4, 2018, 4:07:13 AM

to Dmitry Vyukov, Ben Hutchings, Tejun Heo, linu...@vger.kernel.org, Doug Gilbert, Bart Van Assche, je...@linux.vnet.ibm.com, linux...@vger.kernel.org, linux-...@vger.kernel.org, martin....@oracle.com, syzk...@googlegroups.com

To get memory corruption it's actually sufficient just to submit "1-byte" reads;
there's no need for the SG_NEXT_CMD_LEN ioctl or anything:

#include <fcntl.h>
#include <unistd.h>

int main()
{
	int fd = open("/dev/sg0", O_RDWR);

	char buf[43] = { [36] = 0x08 /* READ_6 */ };

	for (;;)
		write(fd, buf, sizeof(buf));
}

(where /dev/sg0 is the default QEMU disk type, "82371SB PIIX3 IDE")

The SCSI command descriptor block is the 6 bytes at indices 36-41, so index 42
is the only data byte.
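
To spell out that arithmetic, here is a small sketch. It assumes a
36-byte header in front of the CDB (inferred from the CDB starting at
index 36); SKETCH_SG_HEADER_LEN and sketch_data_len() are names invented
for this example, not part of the sg interface.

```c
#include <stddef.h>

/* Assumption: the header in front of the CDB is 36 bytes long, which is
 * what "the CDB is at indices 36-41" implies. */
#define SKETCH_SG_HEADER_LEN 36

/*
 * Hypothetical helper: number of bytes left over as data once the header
 * and the CDB have been taken out of a write() buffer.
 * Returns -1 if the buffer is too short to hold both.
 */
long sketch_data_len(size_t write_len, size_t cdb_len)
{
	if (write_len < SKETCH_SG_HEADER_LEN + cdb_len)
		return -1;
	return (long)(write_len - SKETCH_SG_HEADER_LEN - cdb_len);
}
```

For the 43-byte buffer with a 6-byte CDB this gives 1. The original
46-byte reproducer with a 9-byte command length also gives 1, and its
byte at index 36 is likewise 0x08, so both programs appear to request
the same kind of 1-byte transfer.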

Also this is a different bug from the crash in ata_bmdma_fill_sg() which is
fixed by "libata: fix length validation of ATAPI-relayed SCSI commands".

I'm guessing the driver is DMA'ing to somewhere it shouldn't be...

Eric

Dmitry Vyukov

Feb 4, 2018, 6:11:19 AM

to Eric Biggers, Ben Hutchings, Tejun Heo, linu...@vger.kernel.org, Doug Gilbert, Bart Van Assche, je...@linux.vnet.ibm.com, linux...@vger.kernel.org, linux-...@vger.kernel.org, martin....@oracle.com, syzk...@googlegroups.com

It would be good to add KASAN checks to the DMA code that issues
transfers. This is another case where a silent memory corruption
causes dozens of assorted crashes all over the kernel. If we add
checks, KASAN would pinpoint the exact stack that issues the bad
command. This may be the simplest way to debug this bug as well. I've
filed https://bugzilla.kernel.org/show_bug.cgi?id=198661 for this.

Eric Biggers

Feb 10, 2018, 2:13:07 PM

to Dmitry Vyukov, Ben Hutchings, Tejun Heo, linu...@vger.kernel.org, Doug Gilbert, Bart Van Assche, je...@linux.vnet.ibm.com, linux...@vger.kernel.org, linux-...@vger.kernel.org, martin....@oracle.com, syzk...@googlegroups.com

On Sun, Feb 04, 2018 at 12:10:58PM +0100, Dmitry Vyukov wrote:
> >
> > To get memory corruption it's actually sufficient just to submit "1-byte" reads;
> > there's no need for the SG_NEXT_CMD_LEN ioctl or anything:
> >
> > #include <fcntl.h>
> > #include <unistd.h>
> >
> > int main()
> > {
> > 	int fd = open("/dev/sg0", O_RDWR);
> > 	char buf[43] = { [36] = 0x08 /* READ_6 */ };
> >
> > 	for (;;)
> > 		write(fd, buf, sizeof(buf));
> > }
> >
> > (where /dev/sg0 is the default QEMU disk type, "82371SB PIIX3 IDE")
> >
> > The SCSI command descriptor block is the 6 bytes at indices 36-41, so index 42
> > is the only data byte.
> >
> > Also this is a different bug from the crash in ata_bmdma_fill_sg() which is
> > fixed by "libata: fix length validation of ATAPI-relayed SCSI commands".
> >
> > I'm guessing the driver is DMA'ing to somewhere it shouldn't be...
>
> It would be good to add KASAN checks to the DMA code that issues
> transfers. This is another case where a silent memory corruption
> causes dozens of assorted crashes all over the kernel. If we add
> checks, KASAN would pinpoint the exact stack that issues the bad
> command. This may be the simplest way to debug this bug as well. I've
> filed https://bugzilla.kernel.org/show_bug.cgi?id=198661 for this.

It seems the problem is related to the fact that in the PRD (Physical Region
Descriptor) list for the DMA transfer for "BMDMA" ATA disks, the disk (emulated
in QEMU here: https://github.com/qemu/qemu/blob/master/hw/ide/pci.c#L89) ignores
the low bit in the lengths, causing a length of 1 (byte) to be interpreted as 0.
But, at the same time there is also a special case where a length of 0 is
interpreted as 65536 bytes.

So the disk will DMA up to 65536 bytes into a 1-byte buffer, causing massive
memory corruption.
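
As a sketch of the two rules just described (low bit ignored, 0 meaning
65536), the function below models how a 16-bit PRD byte count would be
interpreted. The name sketch_prd_effective_len() is made up for this
example and the code is a user-space model, not the driver or QEMU
source.

```c
#include <stdint.h>

/*
 * Model of the PRD byte-count interpretation described above:
 *   1. the low bit of the length is ignored;
 *   2. a resulting length of 0 is treated as 65536 bytes.
 * Hypothetical helper for illustration only.
 */
uint32_t sketch_prd_effective_len(uint16_t prd_len)
{
	uint32_t len = prd_len & 0xfffeu;	/* rule 1: drop the low bit */

	if (len == 0)
		len = 0x10000u;			/* rule 2: 0 means 64 KiB */
	return len;
}
```

Under this model a requested length of 1 becomes 65536, while any even
nonzero length passes through unchanged.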

I'm not sure what the best fix is, but probably it needs to be required that the
lengths in the sglist have the alignment needed for the disk.
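
One possible shape for such a requirement, purely as a sketch: reject
any segment whose length is zero or odd before it reaches the PRD list.
The 2-byte granularity is an assumption derived from the low bit being
ignored, and sketch_sg_lengths_ok() is a hypothetical user-space helper,
not an existing kernel function or the eventual fix.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical validation: every segment length must be nonzero and even,
 * so the device cannot round it down to 0 and then read that 0 as 65536.
 * Assumption: 2-byte granularity is what the device needs.
 */
bool sketch_sg_lengths_ok(const size_t *seg_len, size_t nsegs)
{
	for (size_t i = 0; i < nsegs; i++) {
		if (seg_len[i] == 0 || (seg_len[i] & 1) != 0)
			return false;
	}
	return true;
}
```

A 1-byte segment like the one in the reproducer would fail this check,
whereas typical 512-byte or 4096-byte segments would pass.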

KASAN would not have helped here unfortunately. Even if there were KASAN checks
when mapping the sglist for DMA or when filling in the PRD list, the kernel
would not have known that the disk would actually interpret "1" as "65536".

- Eric
