WARNING in memtype_reserve

32 views
Skip to first unread message

syzbot

unread,
May 9, 2020, 3:20:16 AM5/9/20
to b...@alien8.de, dave....@linux.intel.com, dmitry....@gmail.com, ebie...@xmission.com, gre...@linuxfoundation.org, h...@zytor.com, jeremy...@arm.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, lu...@kernel.org, mi...@redhat.com, pet...@infradead.org, st...@rowland.harvard.edu, syzkall...@googlegroups.com, tg...@linutronix.de, x...@kernel.org
Hello,

syzbot found the following crash on:

HEAD commit: d5eeab8d Merge tag 'scsi-fixes' of git://git.kernel.org/pu..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15093632100000
kernel config: https://syzkaller.appspot.com/x/.config?x=b0212dbee046bc1f
dashboard link: https://syzkaller.appspot.com/bug?extid=353be47c9ce21b68b7ed
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
userspace arch: i386
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=168ee02c100000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=119f3788100000

The bug was bisected to:

commit 2bef9aed6f0e22391c8d4570749b1acc9bc3981e
Author: Jeremy Linton <jeremy...@arm.com>
Date: Mon May 4 20:13:48 2020 +0000

usb: usbfs: correct kernel->user page attribute mismatch

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1701f168100000
final crash: https://syzkaller.appspot.com/x/report.txt?x=1481f168100000
console output: https://syzkaller.appspot.com/x/log.txt?x=1081f168100000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+353be4...@syzkaller.appspotmail.com
Fixes: 2bef9aed6f0e ("usb: usbfs: correct kernel->user page attribute mismatch")

------------[ cut here ]------------
memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back
WARNING: CPU: 1 PID: 7025 at arch/x86/mm/pat/memtype.c:589 memtype_reserve+0x69f/0x820 arch/x86/mm/pat/memtype.c:589
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 7025 Comm: syz-executor254 Not tainted 5.7.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x188/0x20d lib/dump_stack.c:118
panic+0x2e3/0x75c kernel/panic.c:221
__warn.cold+0x2f/0x35 kernel/panic.c:582
report_bug+0x27b/0x2f0 lib/bug.c:195
fixup_bug arch/x86/kernel/traps.c:175 [inline]
fixup_bug arch/x86/kernel/traps.c:170 [inline]
do_error_trap+0x12b/0x220 arch/x86/kernel/traps.c:267
do_invalid_op+0x32/0x40 arch/x86/kernel/traps.c:286
invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
RIP: 0010:memtype_reserve+0x69f/0x820 arch/x86/mm/pat/memtype.c:589
Code: 48 8b 2c ed c0 00 29 88 e8 ae ad 3e 00 48 8d 4b ff 49 89 e8 4c 89 e2 48 c7 c6 20 01 29 88 48 c7 c7 80 f9 28 88 e8 79 e8 0f 00 <0f> 0b 41 bf ea ff ff ff e9 03 fc ff ff 41 bf ea ff ff ff e9 f8 fb
RSP: 0018:ffffc900015e7790 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000009000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff815ce181 RDI: fffff520002bcee4
RBP: ffffffff8828ff40 R08: ffff888097ce85c0 R09: ffffed1015ce45f1
R10: ffff8880ae722f83 R11: ffffed1015ce45f0 R12: 000ffffffffff000
R13: 1ffff920002bcef8 R14: dffffc0000000000 R15: 0000000000000000
reserve_pfn_range+0x173/0x470 arch/x86/mm/pat/memtype.c:941
track_pfn_remap+0x18b/0x280 arch/x86/mm/pat/memtype.c:1033
remap_pfn_range+0x202/0xbf0 mm/memory.c:2130
dma_direct_mmap+0x197/0x260 kernel/dma/direct.c:453
dma_mmap_attrs+0xfe/0x150 kernel/dma/mapping.c:237
usbdev_mmap+0x3ae/0x730 drivers/usb/core/devio.c:254
call_mmap include/linux/fs.h:1912 [inline]
mmap_region+0xafb/0x1540 mm/mmap.c:1772
do_mmap+0x849/0x1160 mm/mmap.c:1545
do_mmap_pgoff include/linux/mm.h:2553 [inline]
vm_mmap_pgoff+0x197/0x200 mm/util.c:506
ksys_mmap_pgoff+0x457/0x5b0 mm/mmap.c:1595
do_syscall_32_irqs_on arch/x86/entry/common.c:337 [inline]
do_fast_syscall_32+0x270/0xe90 arch/x86/entry/common.c:396
entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

Greg KH

unread,
May 9, 2020, 3:45:12 AM5/9/20
to syzbot, b...@alien8.de, dave....@linux.intel.com, dmitry....@gmail.com, ebie...@xmission.com, h...@zytor.com, jeremy...@arm.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, lu...@kernel.org, mi...@redhat.com, pet...@infradead.org, st...@rowland.harvard.edu, syzkall...@googlegroups.com, tg...@linutronix.de, x...@kernel.org
So should memtype_reserve() not do a WARN if given invalid parameters as
it can be triggered by userspace requests?

A normal "invalid request" debug line is probably all that is needed,
right?

thanks,

greg k-h

Thomas Gleixner

unread,
May 9, 2020, 6:01:04 AM5/9/20
to Greg KH, syzbot, b...@alien8.de, dave....@linux.intel.com, dmitry....@gmail.com, ebie...@xmission.com, h...@zytor.com, jeremy...@arm.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, lu...@kernel.org, mi...@redhat.com, pet...@infradead.org, st...@rowland.harvard.edu, syzkall...@googlegroups.com, x...@kernel.org
Greg KH <gre...@linuxfoundation.org> writes:
> On Sat, May 09, 2020 at 12:20:14AM -0700, syzbot wrote:
>> memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back
>> WARNING: CPU: 1 PID: 7025 at arch/x86/mm/pat/memtype.c:589 memtype_reserve+0x69f/0x820 arch/x86/mm/pat/memtype.c:589
>
> So should memtype_reserve() not do a WARN if given invalid parameters as
> it can be triggered by userspace requests?
>
> A normal "invalid request" debug line is probably all that is needed,
> right?

I disagree. The callsite espcially if user space triggerable should not
attempt to ask for a reservation where start > end:

>> memtype_reserve failed: [mem 0xffffffffff000-0x00008fff], req write-back

The real question is which part of the call chain is responsible for
this. That needs to be fixed.

Thanks,

tglx

Alan Stern

unread,
May 9, 2020, 9:41:56 AM5/9/20
to Thomas Gleixner, Greg KH, syzbot, b...@alien8.de, dave....@linux.intel.com, dmitry....@gmail.com, ebie...@xmission.com, h...@zytor.com, jeremy...@arm.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, lu...@kernel.org, mi...@redhat.com, pet...@infradead.org, syzkall...@googlegroups.com, x...@kernel.org
What about all the other ways in which a reservation request could be
invalid? The MM core already checks for these; what point is there in
duplicating these checks in many places higher up the call chain?

Alan Stern

Hillf Danton

unread,
May 9, 2020, 11:47:42 AM5/9/20
to syzbot, gre...@linuxfoundation.org, jeremy...@arm.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, tg...@linutronix.de, Christoph Hellwig, syzkall...@googlegroups.com

Sat, 09 May 2020 00:20:14 -0700
Add check of physical address and boundary.

--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -447,6 +447,7 @@ int dma_direct_mmap(struct device *dev,
unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
unsigned long pfn = PHYS_PFN(dma_to_phys(dev, dma_addr));
int ret = -ENXIO;
+ unsigned long pa;

vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs);

@@ -455,6 +456,13 @@ int dma_direct_mmap(struct device *dev,

if (vma->vm_pgoff >= count || user_count > count - vma->vm_pgoff)
return -ENXIO;
+ if (pfn > pfn + vma->vm_pgoff)
+ return -ENXIO;
+ pa = (pfn + vma->vm_pgoff) << PAGE_SHIFT;
+ if (pa < (pfn << PAGE_SHIFT))
+ return -ENXIO;
+ if (pa > pa + (user_count << PAGE_SHIFT))
+ return -ENXIO;
return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
user_count << PAGE_SHIFT, vma->vm_page_prot);
}

Jeremy Linton

unread,
May 9, 2020, 1:42:44 PM5/9/20
to syzbot, b...@alien8.de, dave....@linux.intel.com, dmitry....@gmail.com, ebie...@xmission.com, gre...@linuxfoundation.org, h...@zytor.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, lu...@kernel.org, mi...@redhat.com, pet...@infradead.org, st...@rowland.harvard.edu, syzkall...@googlegroups.com, tg...@linutronix.de, x...@kernel.org
Hi,
Adding a range check in dma_direct_mmap() looks like a fine fix. I'm
curious how the mapping attribute changed though. I expected that would
be rare on x86 (ISA devices/etc).

What is the GCE USB controller here? Its not the same as the virtual
qemu/pci/xhci right?

Greg KH

unread,
May 13, 2020, 8:41:13 AM5/13/20
to Hillf Danton, syzbot, jeremy...@arm.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, tg...@linutronix.de, Christoph Hellwig, syzkall...@googlegroups.com
Isn't there some sort of "check mmap parms" function somewhere that
should do this logic all in one place? Putting it all over the kernel
feels wrong to me...

thanks,

greg k-h

Greg KH

unread,
May 13, 2020, 8:44:48 AM5/13/20
to Thomas Gleixner, syzbot, b...@alien8.de, dave....@linux.intel.com, dmitry....@gmail.com, ebie...@xmission.com, h...@zytor.com, jeremy...@arm.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, lu...@kernel.org, mi...@redhat.com, pet...@infradead.org, st...@rowland.harvard.edu, syzkall...@googlegroups.com, x...@kernel.org
This is caused by 2bef9aed6f0e ("usb: usbfs: correct kernel->user page
attribute mismatch") which changed a call to remap_pfn_range() to
dma_mmap_coherent(). Looks like the error checking in remap_pfn_range()
handled the invalid options better than dma_mma_coherent() when odd
values are passed in.

We can add the check to dma_mmap_coherent(), again, but really, this
type of check should probably only be needed in one place to ensure we
always get it correct, right?

thanks,

greg k-h

Hillf Danton

unread,
May 13, 2020, 10:13:03 AM5/13/20
to Greg KH, syzbot, jeremy...@arm.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, tg...@linutronix.de, Christoph Hellwig, syzkall...@googlegroups.com

Hi Greg
Feel free to shed light on where to look for such a function after lol,
I did not see one checking physical address/bundary in particular to
avoid something like the report.

> feels wrong to me...

What's special here is the pfn, and a bigger pain is that I'm not sure
if it's the right place to add the check that may be not needed anywhere
else.

Thanks
Hillf

Thomas Gleixner

unread,
May 13, 2020, 12:21:45 PM5/13/20
to Alan Stern, Greg KH, syzbot, b...@alien8.de, dave....@linux.intel.com, dmitry....@gmail.com, ebie...@xmission.com, h...@zytor.com, jeremy...@arm.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, lu...@kernel.org, mi...@redhat.com, pet...@infradead.org, syzkall...@googlegroups.com, x...@kernel.org
Defensive programming rule #1: Check crap early but have the check which
ultimatively catches it at the last possible place as well.

Thomas Gleixner

unread,
May 13, 2020, 12:23:03 PM5/13/20
to Greg KH, syzbot, b...@alien8.de, dave....@linux.intel.com, dmitry....@gmail.com, ebie...@xmission.com, h...@zytor.com, jeremy...@arm.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, lu...@kernel.org, mi...@redhat.com, pet...@infradead.org, st...@rowland.harvard.edu, syzkall...@googlegroups.com, x...@kernel.org
That might be correct for this particular call chain, but this check
really is the last defense before stuff goes down the drain. None of the
last line functions should ever be reached with crappy arguments.

Thanks,

tglx

Greg KH

unread,
May 13, 2020, 12:55:44 PM5/13/20
to Hillf Danton, Thomas Gleixner, syzbot, b...@alien8.de, dave....@linux.intel.com, dmitry....@gmail.com, ebie...@xmission.com, h...@zytor.com, jeremy...@arm.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, lu...@kernel.org, mi...@redhat.com, pet...@infradead.org, st...@rowland.harvard.edu, syzkall...@googlegroups.com, x...@kernel.org
Looking at the other callers of dma_mmap_coherent(), it looks like this
needs to be done in that function, as other drivers are passing in "raw"
data as well. So Hillf's patch is probably the best one.

Hillf, can you resend it in a format we can apply it in and have syzbot
test?

thanks,

greg k-h

Hillf Danton

unread,
May 13, 2020, 11:55:13 PM5/13/20
to syzbot, Greg KH, Thomas Gleixner, jeremy...@arm.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, syzkall...@googlegroups.com, Christoph Hellwig, Hillf Danton, x...@kernel.org
#syz test: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git d5eeab8d


--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -441,6 +441,7 @@ int dma_direct_mmap(struct device *dev,
unsigned long user_count = vma_pages(vma);
unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
unsigned long pfn = PHYS_PFN(dma_to_phys(dev, dma_addr));
+ unsigned long pa;
int ret = -ENXIO;

vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs);
@@ -450,6 +451,17 @@ int dma_direct_mmap(struct device *dev,

if (vma->vm_pgoff >= count || user_count > count - vma->vm_pgoff)
return -ENXIO;
+
+ if (pfn > pfn + vma->vm_pgoff)
+ return -ENXIO;
+
+ pa = (pfn + vma->vm_pgoff) << PAGE_SHIFT;
+ if (pa < (pfn << PAGE_SHIFT))
+ return -ENXIO;
+
+ if (pa > pa + (user_count << PAGE_SHIFT))
+ return -ENXIO;
+
return remap_pfn_range(vma, vma->vm_start, pfn + vma->vm_pgoff,
user_count << PAGE_SHIFT, vma->vm_page_prot);
}
--

Christoph Hellwig

unread,
May 14, 2020, 2:14:20 AM5/14/20
to Hillf Danton, syzbot, Greg KH, Thomas Gleixner, jeremy...@arm.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, syzkall...@googlegroups.com, Christoph Hellwig, x...@kernel.org
Guys, can you please start formal thread on this? I have no
idea where this came from and what the rationale is. Btw, if the
pfn is crap in dma_direct_mmap then the dma_addr_t passed in is
crap, as it is derived from that. What is the caller, and how is
this triggered?

Dmitry Vyukov

unread,
May 14, 2020, 2:19:36 AM5/14/20
to Christoph Hellwig, Hillf Danton, syzbot, Greg KH, Thomas Gleixner, jeremy...@arm.com, LKML, USB list, syzkaller-bugs, the arch/x86 maintainers
FWIW the whole thread is available on lore:
https://lore.kernel.org/lkml/2020051406...@lst.de/T/#t

Greg KH

unread,
May 14, 2020, 2:27:53 AM5/14/20
to Christoph Hellwig, Hillf Danton, syzbot, Thomas Gleixner, jeremy...@arm.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, syzkall...@googlegroups.com, x...@kernel.org
Ok, to summarize, commit 2bef9aed6f0e ("usb: usbfs: correct kernel->user
page attribute mismatch") changed a call from remap_pfn_range() to
dma_mmap_coherent() for usb data buffers being sent from userspace.

Details on why this was changed is in the commit changelog, but this has
triggered a WARN_ON() in memtype_reserve when I think some odd data
buffers were sent to it by syzbot fuzzing.

The warning caught the wrong data, but triggered syzbot.

So, the question is, should all callers of dma_mmap_coherent() validate
the parms better before it is called, or should the call chain here
properly sanitize things on their own.

Note, usbfs is not the only direct-from-userspace caller of
dma_mmap_coherent(), it's also used in other userspace apis (frame
buffer drivers, fastrpc, some SoC drivers) but it looks like nothing has
fuzzed those apis before to trigger this.

Does that help explain things better?

thanks,

greg k-h

Christoph Hellwig

unread,
May 14, 2020, 2:32:01 AM5/14/20
to Greg KH, Christoph Hellwig, Hillf Danton, syzbot, Thomas Gleixner, jeremy...@arm.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, syzkall...@googlegroups.com, x...@kernel.org
On Thu, May 14, 2020 at 08:27:50AM +0200, Greg KH wrote:
> On Thu, May 14, 2020 at 08:14:17AM +0200, Christoph Hellwig wrote:
> > Guys, can you please start formal thread on this? I have no
> > idea where this came from and what the rationale is. Btw, if the
> > pfn is crap in dma_direct_mmap then the dma_addr_t passed in is
> > crap, as it is derived from that. What is the caller, and how is
> > this triggered?
>
>
> Ok, to summarize, commit 2bef9aed6f0e ("usb: usbfs: correct kernel->user
> page attribute mismatch") changed a call from remap_pfn_range() to
> dma_mmap_coherent() for usb data buffers being sent from userspace.

I only need to look at the commit for 3 seconds to tell you that it is
completely buggy. While using dma_mmap_coherent is fundamentally the
right thing and absolutely required for dma_alloc_* allocations, USB
also uses it's own local gen pool allocator or plain kmalloc for not
DMA capable controller. This need to use remap_pfn_range. I'm pretty
sure you hit one of those cases.

The logic should be something like:

if (hcd->localmem_pool || !hcd_uses_dma(hcd))
remap_pfn_range()
else
dma_mmap_coherent()

Greg KH

unread,
May 14, 2020, 3:47:02 AM5/14/20
to Christoph Hellwig, jeremy...@arm.com, Hillf Danton, syzbot, Thomas Gleixner, jeremy...@arm.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, syzkall...@googlegroups.com, x...@kernel.org
Ok, that's simple enough, patch is below.

Jeremy, any objection to this change?

thanks,

greg k-h


diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
index b9db9812d6c5..d93d94d7ff50 100644
--- a/drivers/usb/core/devio.c
+++ b/drivers/usb/core/devio.c
@@ -251,9 +251,19 @@ static int usbdev_mmap(struct file *file, struct vm_area_struct *vma)
usbm->vma_use_count = 1;
INIT_LIST_HEAD(&usbm->memlist);

- if (dma_mmap_coherent(hcd->self.sysdev, vma, mem, dma_handle, size)) {
- dec_usb_memory_use_count(usbm, &usbm->vma_use_count);
- return -EAGAIN;
+ if (hcd->localmem_pool || !hcd_uses_dma(hcd)) {
+ if (remap_pfn_range(vma, vma->vm_start,
+ virt_to_phys(usbm->mem) >> PAGE_SHIFT,
+ size, vma->vm_page_prot) < 0) {
+ dec_usb_memory_use_count(usbm, &usbm->vma_use_count);
+ return -EAGAIN;
+ }
+ } else {
+ if (dma_mmap_coherent(hcd->self.sysdev, vma, mem, dma_handle,
+ size)) {
+ dec_usb_memory_use_count(usbm, &usbm->vma_use_count);
+ return -EAGAIN;
+ }
}

vma->vm_flags |= VM_IO;

syzbot

unread,
May 14, 2020, 5:08:06 AM5/14/20
to gre...@linuxfoundation.org, h...@lst.de, hda...@sina.com, jeremy...@arm.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, syzkall...@googlegroups.com, tg...@linutronix.de, x...@kernel.org
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger crash:

Reported-and-tested-by: syzbot+353be4...@syzkaller.appspotmail.com

Tested on:

commit: d5eeab8d Merge tag 'scsi-fixes' of git://git.kernel.org/pu..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel config: https://syzkaller.appspot.com/x/.config?x=b0212dbee046bc1f
dashboard link: https://syzkaller.appspot.com/bug?extid=353be47c9ce21b68b7ed
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
patch: https://syzkaller.appspot.com/x/patch.diff?x=1046a052100000

Note: testing is done by a robot and is best-effort only.

Greg KH

unread,
May 14, 2020, 6:27:52 AM5/14/20
to syzbot, b...@alien8.de, dave....@linux.intel.com, dmitry....@gmail.com, ebie...@xmission.com, h...@zytor.com, jeremy...@arm.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, lu...@kernel.org, mi...@redhat.com, pet...@infradead.org, st...@rowland.harvard.edu, syzkall...@googlegroups.com, tg...@linutronix.de, x...@kernel.org
On Sat, May 09, 2020 at 12:20:14AM -0700, syzbot wrote:

syzbot

unread,
May 14, 2020, 6:44:05 AM5/14/20
to b...@alien8.de, dave....@linux.intel.com, dmitry....@gmail.com, ebie...@xmission.com, gre...@linuxfoundation.org, h...@zytor.com, jeremy...@arm.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, lu...@kernel.org, mi...@redhat.com, pet...@infradead.org, st...@rowland.harvard.edu, syzkall...@googlegroups.com, tg...@linutronix.de, x...@kernel.org
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger crash:

Reported-and-tested-by: syzbot+353be4...@syzkaller.appspotmail.com

Tested on:

commit: d5eeab8d Merge tag 'scsi-fixes' of git://git.kernel.org/pu..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel config: https://syzkaller.appspot.com/x/.config?x=b0212dbee046bc1f
dashboard link: https://syzkaller.appspot.com/bug?extid=353be47c9ce21b68b7ed
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
patch: https://syzkaller.appspot.com/x/patch.diff?x=12bd6806100000

Jeremy Linton

unread,
May 14, 2020, 7:10:06 AM5/14/20
to Christoph Hellwig, Greg KH, Hillf Danton, syzbot, Thomas Gleixner, linux-...@vger.kernel.org, linu...@vger.kernel.org, syzkall...@googlegroups.com, x...@kernel.org
Hi,

On 5/14/20 1:31 AM, Christoph Hellwig wrote:
> On Thu, May 14, 2020 at 08:27:50AM +0200, Greg KH wrote:
>> On Thu, May 14, 2020 at 08:14:17AM +0200, Christoph Hellwig wrote:
>>> Guys, can you please start formal thread on this? I have no
>>> idea where this came from and what the rationale is. Btw, if the
>>> pfn is crap in dma_direct_mmap then the dma_addr_t passed in is
>>> crap, as it is derived from that. What is the caller, and how is
>>> this triggered?
>>
>>
>> Ok, to summarize, commit 2bef9aed6f0e ("usb: usbfs: correct kernel->user
>> page attribute mismatch") changed a call from remap_pfn_range() to
>> dma_mmap_coherent() for usb data buffers being sent from userspace.
>
> I only need to look at the commit for 3 seconds to tell you that it is
> completely buggy. While using dma_mmap_coherent is fundamentally the
> right thing and absolutely required for dma_alloc_* allocations, USB
> also uses it's own local gen pool allocator or plain kmalloc for not
> DMA capable controller. This need to use remap_pfn_range. I'm pretty
> sure you hit one of those cases.

? The code path in question is usbdev_mmap() and the allocation is done
~13 lines lines before as a usb_alloc_coherent().


>
> The logic should be something like:
>
> if (hcd->localmem_pool || !hcd_uses_dma(hcd))
> remap_pfn_range()
> else
> dma_mmap_coherent()
>

That sort of makes sense, except for the above, and the fact that I
would imagine the dma_mmap_coherent should be dealing with that case.
I'm not really clear about the details of the GCE usb device here, but
my first guess at this was the dma_pgprot() in dma_direct_mmap() is
incorrectly picking a pgprot...

Christoph Hellwig

unread,
May 14, 2020, 7:14:45 AM5/14/20
to Jeremy Linton, Christoph Hellwig, Greg KH, Hillf Danton, syzbot, Thomas Gleixner, linux-...@vger.kernel.org, linu...@vger.kernel.org, syzkall...@googlegroups.com, x...@kernel.org
On Thu, May 14, 2020 at 06:10:03AM -0500, Jeremy Linton wrote:
>> I only need to look at the commit for 3 seconds to tell you that it is
>> completely buggy. While using dma_mmap_coherent is fundamentally the
>> right thing and absolutely required for dma_alloc_* allocations, USB
>> also uses it's own local gen pool allocator or plain kmalloc for not
>> DMA capable controller. This need to use remap_pfn_range. I'm pretty
>> sure you hit one of those cases.
>
> ? The code path in question is usbdev_mmap() and the allocation is done ~13
> lines lines before as a usb_alloc_coherent().

And did you take a look at how usb_alloc_coherent is implemented? That
should make it completely obvious that not all allocations come
from dma_alloc_*.

> That sort of makes sense, except for the above, and the fact that I would
> imagine the dma_mmap_coherent should be dealing with that case. I'm not
> really clear about the details of the GCE usb device here, but my first
> guess at this was the dma_pgprot() in dma_direct_mmap() is incorrectly
> picking a pgprot...

No, dma_mmap_* / dma_direct_mmap has absolutely no business dealing
with memory that did not come from the DMA allocator.

Jeremy Linton

unread,
May 14, 2020, 7:16:13 AM5/14/20
to Christoph Hellwig, Greg KH, Hillf Danton, syzbot, Thomas Gleixner, linux-...@vger.kernel.org, linu...@vger.kernel.org, syzkall...@googlegroups.com, x...@kernel.org
On 5/14/20 6:14 AM, Christoph Hellwig wrote:
> On Thu, May 14, 2020 at 06:10:03AM -0500, Jeremy Linton wrote:
>>> I only need to look at the commit for 3 seconds to tell you that it is
>>> completely buggy. While using dma_mmap_coherent is fundamentally the
>>> right thing and absolutely required for dma_alloc_* allocations, USB
>>> also uses it's own local gen pool allocator or plain kmalloc for not
>>> DMA capable controller. This need to use remap_pfn_range. I'm pretty
>>> sure you hit one of those cases.
>>
>> ? The code path in question is usbdev_mmap() and the allocation is done ~13
>> lines lines before as a usb_alloc_coherent().
>
> And did you take a look at how usb_alloc_coherent is implemented? That
> should make it completely obvious that not all allocations come
> from dma_alloc_*.

No, your right, I noticed/remembered the usb_alloc vs dma_alloc
difference right after sending that email, and was just about to say so.

Sorry, you right.

Jeremy Linton

unread,
May 14, 2020, 7:17:44 AM5/14/20
to Greg KH, Christoph Hellwig, Hillf Danton, syzbot, Thomas Gleixner, linux-...@vger.kernel.org, linu...@vger.kernel.org, syzkall...@googlegroups.com, x...@kernel.org
No, thats fine but since I just translated usb_alloc_coherent() to
dma_map_coherent in my not fully away head. Putting this as
"usb_map_cohernet()" sort of makes more sense.

Greg KH

unread,
May 14, 2020, 7:22:04 AM5/14/20
to Jeremy Linton, Christoph Hellwig, Hillf Danton, syzbot, Thomas Gleixner, linux-...@vger.kernel.org, linu...@vger.kernel.org, syzkall...@googlegroups.com, x...@kernel.org
Thanks, I'll turn this into a "real" patch now...

greg k-h
Reply all
Reply to author
Forward
0 new messages