Groups keyboard shortcuts have been updated
Dismiss
See shortcuts

Using KASAN to catch streaming DMA API violations

3 views
Skip to first unread message

Ahmad Fatoum

unread,
Dec 5, 2024, 9:54:50 AM12/5/24
to kasa...@googlegroups.com, io...@lists.linux.dev, Arnd Bergmann, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, Christoph Hellwig, Marek Szyprowski, Robin Murphy, Paul E . McKenney, el...@google.com, Kees Cook, Pengutronix Kernel Team
Hello,

This is a follow-up to a discussion that took place in the Kernel Sanitizers
Office Hours (IIRC) at this year's Plumbers Event in Vienna.

I had asked about how KCSAN could detect races due to DMA[1] and Arnd
suggested that we could use KASAN to detect the CPU accessing buffers that
it doesn't have ownership of. I mentioned having implemented[2] this exact scheme
in the barebox bootloader's KASAN support and promised to type up an email
about it to help getting a similar functionality into the kernel, but first
some context:

The streaming DMA API is used to annotate ownership transfer of buffers in
memory shared between the kernel and a DMA-capable device.

The relevant kernel documentation is:

https://www.kernel.org/doc/Documentation/DMA-API.txt
https://www.kernel.org/doc/Documentation/DMA-API-HOWTO.txt

But I'll give a quick recap. There are four key operations:

- dma_map_single() moves a range of memory from CPU to device ownership

- dma_sync_single_for_cpu() can be called on all or a subset of the range
mapped by dma_map_single() to move ownership back to the CPU

- dma_sync_single_for_device() moves back all oo a subset of the range
mapped by dma_map_single() to the ownership of the device

- dma_unmap_single() gives back ownership of a range of memory to the CPU

It's a bug for the CPU or the device to access a streaming DMA mapping while
it's owned by the other side. On many systems, that bug will manifest itself
as memory corruption due to loss of cache coherence.

To make it easier to spot some misuses of the API, the kernel has a
CONFIG_DMA_API_DEBUG feature, which will run sanity checks when using the API.
It can't however detect if a memory access happens to a buffer while it's
owned by other side, which is where KASAN can come in by having CONFIG_DMA_API_DEBUG
record ownership information into the KASAN shadow memory.

That way accessing a device mapped buffer before sync'ing it to the CPU is
detected like KASAN would detect a use-after-free. When the ownership is moved
back to the CPU, the memory is unpoisoned and such an access would be allowed
again.

I had implemented this scheme[3] in the barebox bootloader and it works ok:

BUG: KASAN: dma-mapped-to-device in eqos_send+0xdc/0x1a8
Read of size 4 at addr 0000000040419f00

Call trace:
[<7fbd4980>] (unwind_backtrace+0x0/0xb0) from [<7fbd4a40>] (dump_stack+0x10/0x18)
[<7fbd4a40>] (dump_stack+0x10/0x18) from [<7fba2360>] (kasan_report+0x11c/0x290)
[<7fba2360>] (kasan_report+0x11c/0x290) from [<7fba1f44>] (__asan_load4+0x54/0xb8)
[<7fba1f44>] (__asan_load4+0x54/0xb8) from [<7fb2e52c>] (eqos_send+0xdc/0x1a8)
[<7fb2e52c>] (eqos_send+0xdc/0x1a8) from [<7fbb6544>] (eth_send+0x154/0x16c)
[<7fbb6544>] (eth_send+0x154/0x16c) from [<7fbb7114>] (net_ip_send+0xe8/0xf8)
[<7fbb7114>] (net_ip_send+0xe8/0xf8) from [<7fbb7d10>] (net_udp_send+0x68/0x78)


The aforementioned barebox functionality goes a step further and also used
the shadow memory information to detect repeated syncs without an ownership
change. While this is not a bug, my impression is that this is unnecessary
overhead and a diagnostic could help correct a developer's misunderstanding
of the API.

I hope to kick off a discussion about this with my mail here and perhaps even
motivate someone else to port it over or reimplement it. :D

[1]: when CONFIG_KCSAN_REPORT_RACE_UNKNOWN_ORIGIN is enabled
[2]: https://lore.barebox.org/barebox/20240910114832.2...@pengutronix.de/
[3]: https://github.com/barebox/barebox/blob/master/drivers/dma/debug.c

Cheers,
Ahmad

--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |

Dmitry Vyukov

unread,
Dec 6, 2024, 2:06:23 AM12/6/24
to Ahmad Fatoum, kasa...@googlegroups.com, io...@lists.linux.dev, Arnd Bergmann, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, Christoph Hellwig, Marek Szyprowski, Robin Murphy, Paul E . McKenney, el...@google.com, Kees Cook, Pengutronix Kernel Team
Hi Ahmad,

This looks great. The example stack you posted - is it a real bug, or
an injected one? Has it found any real ones?

Added the link to https://bugzilla.kernel.org/show_bug.cgi?id=198661
so that it's not lost.

I guess this memory usually does not come from kmalloc, right?
Otherwise it would be possible to use kasan_record_aux_stack() to
attach the stack of the last sync operation.

Re repeated syncs. If not all of them will be fixed, then at least for
the kernel it should be under a separate config b/c syzbot does not
tolerate false positives.

Ahmad Fatoum

unread,
Dec 6, 2024, 3:09:08 AM12/6/24
to Dmitry Vyukov, kasa...@googlegroups.com, io...@lists.linux.dev, Arnd Bergmann, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Vincenzo Frascino, Christoph Hellwig, Marek Szyprowski, Robin Murphy, Paul E . McKenney, el...@google.com, Kees Cook, Pengutronix Kernel Team
Hi Dmitry,
That was an injected one. The code is recent enough not to have hit a
barebox release yet.

I have gotten feedback that it was useful during driver development,
but it hasn't found an issue yet. I did run it though against issues
that were pointed out by reviewer feedback.

> Added the link to https://bugzilla.kernel.org/show_bug.cgi?id=198661
> so that it's not lost.

Cool, I didn't know that the discussion restarted in the Bugzilla.

> I guess this memory usually does not come from kmalloc, right?

No, kmalloc() in barebox is just TLSF with DMA-suitable alignment.

> Otherwise it would be possible to use kasan_record_aux_stack() to
> attach the stack of the last sync operation.

barebox architectures currently only implement stack_dump(). It needs
to be split into stack walk, kallsyms lookup and printing to console,
before a kasan_record_aux_stack() equivalent can be implemented.

> Re repeated syncs. If not all of them will be fixed, then at least for
> the kernel it should be under a separate config b/c syzbot does not
> tolerate false positives.

Agreed.

Arnd Bergmann

unread,
Dec 6, 2024, 3:14:52 AM12/6/24
to Ahmad Fatoum, kasa...@googlegroups.com, io...@lists.linux.dev, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, Christoph Hellwig, Marek Szyprowski, Robin Murphy, Paul E. McKenney, Marco Elver, Kees Cook, Pengutronix Kernel Team
On Thu, Dec 5, 2024, at 15:54, Ahmad Fatoum wrote:

> That way accessing a device mapped buffer before sync'ing it to the CPU is
> detected like KASAN would detect a use-after-free. When the ownership is moved
> back to the CPU, the memory is unpoisoned and such an access would be allowed
> again.

Right. I would go even further and say that transferring ownership
to the device poisons an area that is aligned to ARCH_DMA_MINALIGN,
making it possibly bigger on both ends of the area. Transferring
ownership back to the CPU only unpoisons the exact area that was
specified, leaving the unaligned bytes around it as uninitialized.

That may need to be controlled by an additional Kconfig option on
top of poisoning the data initially.

ARCH_DMA_MINALIGN is between 4 and 128 bytes depending on the
architecture.

> The aforementioned barebox functionality goes a step further and also used
> the shadow memory information to detect repeated syncs without an ownership
> change. While this is not a bug, my impression is that this is unnecessary
> overhead and a diagnostic could help correct a developer's misunderstanding
> of the API.

Agreed, there is clearly something wrong if a driver does it, but
I can see them still work correctly without any risk of data corruption,
so it's a different class of bug.

Arnd

Christoph Hellwig

unread,
Dec 12, 2024, 12:40:20 AM12/12/24
to Arnd Bergmann, Ahmad Fatoum, kasa...@googlegroups.com, io...@lists.linux.dev, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, Christoph Hellwig, Marek Szyprowski, Robin Murphy, Paul E. McKenney, Marco Elver, Kees Cook, Pengutronix Kernel Team
On Fri, Dec 06, 2024 at 09:14:27AM +0100, Arnd Bergmann wrote:
> Right. I would go even further and say that transferring ownership
> to the device poisons an area that is aligned to ARCH_DMA_MINALIGN,
> making it possibly bigger on both ends of the area. Transferring
> ownership back to the CPU only unpoisons the exact area that was
> specified, leaving the unaligned bytes around it as uninitialized.

Yes.

> That may need to be controlled by an additional Kconfig option on
> top of poisoning the data initially.

Note that we'll definitively need a config option for the basic
checks as well. There is plenty o drivers that don't do any DMA
ownership management right now. And while I'd like to see everything
fixed it's going to take a while to get there.

Reply all
Reply to author
Forward
0 new messages