Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

USB mass storage and ARM cache coherency

31 views
Skip to first unread message

Catalin Marinas

unread,
Jan 29, 2010, 9:40:02 AM1/29/10
to
Hi Matthew,

I've been trying for some time to use a rootfs (ext2) on a USB memory
stick on ARM platforms but without any success. The USB HCD driver is
ISP1760 which doesn't use DMA.

ARM has a Harvard cache architecture and what I get is incoherency
between the I and D caches. The CPU I'm using (ARM11MPCore) has PIPT
caches with D-cache lines allocation on write.

Basically, when user space tries to execute from a new page, it faults
and the data is requested via the VFS layer, SCSI block device and USB
mass storage from the ISP1760 driver. The page is then mapped into user
space and update_mmu_cache() called.

However, since the driver is PIO, the data copied from the USB device
into RAM gets stuck in the D-cache. On the above page requesting path
there is no call to flush_dcache_page() to handle D-cache maintenance
(for DMA drivers, that's handled by the DMA API).

Since the USB mass storage code has the information about the USB driver
capabilities (DMA or PIO), it looks like the best place to call
flush_dcache_page(). But I got lost in the SCSI emulation and all my
attempts failed to get a working rootfs.

Adding flush_dcache_page() higher up in mpage_end_io_read() solves the
problem but that's not the correct fix as it has wider implications and
it's not needed for DMA-capable devices.

Thanks.

--
Catalin

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Oliver Neukum

unread,
Jan 29, 2010, 11:20:04 AM1/29/10
to
Am Freitag, 29. Januar 2010 15:34:15 schrieb Catalin Marinas:
> Basically, when user space tries to execute from a new page, it faults
> and the data is requested via the VFS layer, SCSI block device and USB
> mass storage from the ISP1760 driver. The page is then mapped into user
> space and update_mmu_cache() called.
>
> However, since the driver is PIO, the data copied from the USB device
> into RAM gets stuck in the D-cache. On the above page requesting path
> there is no call to flush_dcache_page() to handle D-cache maintenance
> (for DMA drivers, that's handled by the DMA API).
>
> Since the USB mass storage code has the information about the USB driver
> capabilities (DMA or PIO), it looks like the best place to call
> flush_dcache_page(). But I got lost in the SCSI emulation and all my
> attempts failed to get a working rootfs.

No, that would be a very bad place in the layering to do this.
The problem would happen with ub and storage. It might also
happen with any other driver.
Please add this to the HCD driver.

Regards
Oliver

Ming Lei

unread,
Jan 29, 2010, 11:30:03 AM1/29/10
to
2010/1/29 Catalin Marinas <catalin...@arm.com>:

> Hi Matthew,
>
> I've been trying for some time to use a rootfs (ext2) on a USB memory
> stick on ARM platforms but without any success. The USB HCD driver is
> ISP1760 which doesn't use DMA.
>
> ARM has a Harvard cache architecture and what I get is incoherency
> between the I and D caches. The CPU I'm using (ARM11MPCore) has PIPT
> caches with D-cache lines allocation on write.
>
> Basically, when user space tries to execute from a new page, it faults
> and the data is requested via the VFS layer, SCSI block device and USB
> mass storage from the ISP1760 driver. The page is then mapped into user
> space and update_mmu_cache() called.
>
> However, since the driver is PIO, the data copied from the USB device
> into RAM gets stuck in the D-cache. On the above page requesting path
> there is no call to flush_dcache_page() to handle D-cache maintenance
> (for DMA drivers, that's handled by the DMA API).
>
> Since the USB mass storage code has the information about the USB driver

Sorry, I am a little confused that usb mass storage has what information
about DMA or PIO of low level usb transfer?

Thanks,

--
Lei Ming

Catalin Marinas

unread,
Jan 29, 2010, 11:40:01 AM1/29/10
to
On Fri, 2010-01-29 at 16:23 +0000, Ming Lei wrote:
> 2010/1/29 Catalin Marinas <catalin...@arm.com>:

> > I've been trying for some time to use a rootfs (ext2) on a USB memory
> > stick on ARM platforms but without any success. The USB HCD driver is
> > ISP1760 which doesn't use DMA.
> >
> > ARM has a Harvard cache architecture and what I get is incoherency
> > between the I and D caches. The CPU I'm using (ARM11MPCore) has PIPT
> > caches with D-cache lines allocation on write.
> >
> > Basically, when user space tries to execute from a new page, it faults
> > and the data is requested via the VFS layer, SCSI block device and USB
> > mass storage from the ISP1760 driver. The page is then mapped into user
> > space and update_mmu_cache() called.
> >
> > However, since the driver is PIO, the data copied from the USB device
> > into RAM gets stuck in the D-cache. On the above page requesting path
> > there is no call to flush_dcache_page() to handle D-cache maintenance
> > (for DMA drivers, that's handled by the DMA API).
> >
> > Since the USB mass storage code has the information about the USB driver
>
> Sorry, I am a little confused that usb mass storage has what information
> about DMA or PIO of low level usb transfer?

I was thinking about checking dev->bus->controller->dma_mask which the
code (though not the storage one) seems to imply that if the dma_mask is
0, the HCD driver is only capable of PIO.

That would be a more general solution rather than going through each HCD
driver since my understanding is that flush_dcache_page() is only needed
together with the mass storage support.

--
Catalin

Oliver Neukum

unread,
Jan 29, 2010, 11:50:03 AM1/29/10
to
Am Freitag, 29. Januar 2010 17:34:03 schrieb Catalin Marinas:

> I was thinking about checking dev->bus->controller->dma_mask which the
> code (though not the storage one) seems to imply that if the dma_mask is
> 0, the HCD driver is only capable of PIO.

That a HCD is capable of DMA need not imply that DMA is used for every
transfer.



> That would be a more general solution rather than going through each HCD
> driver since my understanding is that flush_dcache_page() is only needed
> together with the mass storage support.

What about ub, nfs or nbd over a USB<->ethernet converter?
This, I am afraid is best solved at the HCD or glue layer.

Regards
Oliver

Catalin Marinas

unread,
Jan 29, 2010, 12:20:02 PM1/29/10
to
On Fri, 2010-01-29 at 16:41 +0000, Oliver Neukum wrote:
> Am Freitag, 29. Januar 2010 17:34:03 schrieb Catalin Marinas:
>
> > I was thinking about checking dev->bus->controller->dma_mask which the
> > code (though not the storage one) seems to imply that if the dma_mask is
> > 0, the HCD driver is only capable of PIO.
>
> That a HCD is capable of DMA need not imply that DMA is used for every
> transfer.

Actually the DMA drivers are safe in this respect only if the transfer
happens directly to a page cache page that may be (later) mapped into
user space. I'm not familiar with the USB drivers to fully understand
the data flow, so any help would be appreciated.

> > That would be a more general solution rather than going through each HCD
> > driver since my understanding is that flush_dcache_page() is only needed
> > together with the mass storage support.
>
> What about ub, nfs or nbd over a USB<->ethernet converter?
> This, I am afraid is best solved at the HCD or glue layer.

NFS handles the cache flushing itself, so in this case there is no need
to duplicate the cache flushing at the HCD level. AFAICT, the HCD driver
may be used in several cases and it's only the storage case (via either
ub, mass storage etc.) that requires cache flushing. Is there a way to
differentiate between these at the HCD driver level?

Regarding nbd, is there any copying happening between the HCD driver
receiving the network packet from the USB-ethernet converter and the nbd
bio_vec buffers (most likely during the TCP/IP stack flow)? In this case
it would be for the nbd driver (doesn't seem to be the case now) to
flush the D-cache as the HCD flushing is not necessary as long as it
doesn't write directly to the page cache page.

The ub case is similar to the USB mass storage one, so they could both
benefit from flushing at the HCD driver level. But is this possible
without duplicating the flushing in the nfs case?

Regards.

--
Catalin

Sergei Shtylyov

unread,
Jan 29, 2010, 1:00:02 PM1/29/10
to
Hello.

Catalin Marinas wrote:

>>> I've been trying for some time to use a rootfs (ext2) on a USB memory
>>> stick on ARM platforms but without any success. The USB HCD driver is
>>> ISP1760 which doesn't use DMA.
>>>
>>> ARM has a Harvard cache architecture and what I get is incoherency
>>> between the I and D caches. The CPU I'm using (ARM11MPCore) has PIPT
>>> caches with D-cache lines allocation on write.
>>>
>>> Basically, when user space tries to execute from a new page, it faults
>>> and the data is requested via the VFS layer, SCSI block device and USB
>>> mass storage from the ISP1760 driver. The page is then mapped into user
>>> space and update_mmu_cache() called.
>>>
>>> However, since the driver is PIO, the data copied from the USB device
>>> into RAM gets stuck in the D-cache. On the above page requesting path
>>> there is no call to flush_dcache_page() to handle D-cache maintenance
>>> (for DMA drivers, that's handled by the DMA API).
>>>
>>> Since the USB mass storage code has the information about the USB driver
>>>
>> Sorry, I am a little confused that usb mass storage has what information
>> about DMA or PIO of low level usb transfer?
>>
>
> I was thinking about checking dev->bus->controller->dma_mask which the
> code (though not the storage one) seems to imply that if the dma_mask is
> 0, the HCD driver is only capable of PIO.
>
> That would be a more general solution rather than going through each HCD
> driver since my understanding is that flush_dcache_page() is only needed
> together with the mass storage support.

Note that DMA capable driver can be doing some transfers in PIO mode
or falling back to PIO mode if DMA mode transfer is unsuccessful (the
musb driver is an example of the latter and if the DMA rewrite patches
will get accepted, it'll do short transfers in PIO mode).

MBR, Sergei

Matthew Dharm

unread,
Jan 29, 2010, 2:00:02 PM1/29/10
to
On Fri, Jan 29, 2010 at 08:51:47PM +0300, Sergei Shtylyov wrote:
> Catalin Marinas wrote:
>
> >That would be a more general solution rather than going through each HCD
> >driver since my understanding is that flush_dcache_page() is only needed
> >together with the mass storage support.
>
> Note that DMA capable driver can be doing some transfers in PIO mode
> or falling back to PIO mode if DMA mode transfer is unsuccessful (the
> musb driver is an example of the latter and if the DMA rewrite patches
> will get accepted, it'll do short transfers in PIO mode).

Given that an HCD can choose, on the fly, if it's using DMA or PIO, the HCD
driver is the only place to reasonably put any cache-synchronization code.

That said, what do the other SCSI HCDs do? I'm guessing the question gets
kinda muddy there, since the other SCSI HCDs all talk directly to some
piece of hardware, and thus are responsible for the cache management
themselves.

Based on that, one could argue that ub and usb-storage should be doing
this.

HOWEVER, I firmly believe that the cache-management functions belong with
the driver that actually talks to the low-level hardware, as that's the
only place where you can be 100% certain of what cache operations are
needed. After all, I think someone is working on a USB-over-IP transport,
and trying to manage cache at the usb-storage level in that scenario is
just silly.

So, let's put this in the HCD drivers and be done with it.

Matt

--
Matthew Dharm Home: mdhar...@one-eyed-alien.net
Maintainer, Linux USB Mass Storage Driver

I see you've been reading alt.sex.chubby.sheep voraciously.
-- Tanya
User Friendly, 11/24/97

Greg KH

unread,
Jan 29, 2010, 2:40:02 PM1/29/10
to

I agree, that's the place to fix this issue.

thanks,

greg k-h

Catalin Marinas

unread,
Feb 1, 2010, 9:00:02 AM2/1/10
to

Doing this (flush_dcache_page) in the HCD driver (ISP1760) solves my
problem. I'll post a patch and also cc the driver maintainer.

Thanks.

--
Catalin

Catalin Marinas

unread,
Feb 1, 2010, 12:30:04 PM2/1/10
to
On Fri, 2010-01-29 at 18:54 +0000, Matthew Dharm wrote:
> HOWEVER, I firmly believe that the cache-management functions belong with
> the driver that actually talks to the low-level hardware, as that's the
> only place where you can be 100% certain of what cache operations are
> needed. After all, I think someone is working on a USB-over-IP transport,
> and trying to manage cache at the usb-storage level in that scenario is
> just silly.
>
> So, let's put this in the HCD drivers and be done with it.

The patch below is what fixes the I-D cache incoherency issues on ARM. I
don't particularly like the solution but it seems to be the only one
available.

IMHO, Linux should have functions similar to the DMA API but for PIO
drivers (e.g. pio_map_single/pio_unmap_single) that non-coherent
architectures can define, otherwise being no-ops. Any thoughts?

Thanks.

isp1760: Flush the D-cache for the pipe-in transfer buffers

From: Catalin Marinas <catalin...@arm.com>

When the HDC driver writes the data to the transfer buffers it pollutes
the D-cache (unlike DMA drivers where the device writes the data). If
the corresponding pages get mapped into user space, there are no
additional cache flushing operations performed and this causes random
user space faults on architectures with separate I and D caches
(Harvard) or those with aliasing D-cache.

Signed-off-by: Catalin Marinas <catalin...@arm.com>
Cc: Matthew Dharm <mdharm...@one-eyed-alien.net>
Cc: Greg KH <gr...@kroah.com>
Cc: Sebastian Siewior <big...@linutronix.de>
---
drivers/usb/host/isp1760-hcd.c | 10 ++++++++++
1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/drivers/usb/host/isp1760-hcd.c b/drivers/usb/host/isp1760-hcd.c
index 27b8f7c..4d3eeee 100644
--- a/drivers/usb/host/isp1760-hcd.c
+++ b/drivers/usb/host/isp1760-hcd.c
@@ -18,6 +18,8 @@
#include <linux/uaccess.h>
#include <linux/io.h>
#include <asm/unaligned.h>
+#include <asm/cacheflush.h>
+#include <asm/memory.h>

#include "../core/hcd.h"
#include "isp1760-hcd.h"
@@ -904,6 +906,14 @@ __acquires(priv->lock)
status = 0;
}

+ if (usb_pipein(urb->pipe) && usb_pipetype(urb->pipe) == PIPE_BULK) {
+ void *ptr;
+ for (ptr = urb->transfer_buffer;
+ ptr < urb->transfer_buffer + urb->transfer_buffer_length;
+ ptr += PAGE_SIZE)
+ flush_dcache_page(virt_to_page(ptr));
+ }
+
/* complete() can reenter this HCD */
usb_hcd_unlink_urb_from_ep(priv_to_hcd(priv), urb);
spin_unlock(&priv->lock);

Alan Stern

unread,
Feb 1, 2010, 3:20:02 PM2/1/10
to
On Mon, 1 Feb 2010, Catalin Marinas wrote:

> On Fri, 2010-01-29 at 18:54 +0000, Matthew Dharm wrote:
> > HOWEVER, I firmly believe that the cache-management functions belong with
> > the driver that actually talks to the low-level hardware, as that's the
> > only place where you can be 100% certain of what cache operations are
> > needed. After all, I think someone is working on a USB-over-IP transport,
> > and trying to manage cache at the usb-storage level in that scenario is
> > just silly.
> >
> > So, let's put this in the HCD drivers and be done with it.
>
> The patch below is what fixes the I-D cache incoherency issues on ARM. I
> don't particularly like the solution but it seems to be the only one
> available.
>
> IMHO, Linux should have functions similar to the DMA API but for PIO
> drivers (e.g. pio_map_single/pio_unmap_single) that non-coherent
> architectures can define, otherwise being no-ops. Any thoughts?

You should bring this up on the linux-arm-kernel mailing list and CC:
the ARM maintainer. They are the ones most directly affected.

Alan Stern

Andreas Mohr

unread,
Feb 1, 2010, 5:40:02 PM2/1/10
to
[CC Takashi]

On Mon, Feb 01, 2010 at 05:29:14PM +0000, Catalin Marinas wrote:
> On Fri, 2010-01-29 at 18:54 +0000, Matthew Dharm wrote:
> > HOWEVER, I firmly believe that the cache-management functions belong with
> > the driver that actually talks to the low-level hardware, as that's the
> > only place where you can be 100% certain of what cache operations are
> > needed. After all, I think someone is working on a USB-over-IP transport,
> > and trying to manage cache at the usb-storage level in that scenario is
> > just silly.
> >
> > So, let's put this in the HCD drivers and be done with it.
>
> The patch below is what fixes the I-D cache incoherency issues on ARM. I
> don't particularly like the solution but it seems to be the only one
> available.

Thanks very much for working on this amazingly large problem!

I took some time to add your patch to ehci-q.c / ohci-q.c
(for my *hci-ssb.c ASUS WL-500gP v2), on my now _heavily_ patched-up 2.6.31.9,
but _UNFORTUNATELY_ it kept locking up the same way as always when stopping
playback despite being damn sure this time that this patch could have
the potential to finally fix it ;)
(I had to replace memory.h with page.h on my arch though, to fix the build)

This is on MIPSEL (not one of my many ARM devices, unfortunately ;),
with usb-audio, and the madplay process crashes in __bzero(),
which strongly indicates cache coherency issues (other subsequent backtraces
have lots of mmap and vma listed, see also my "snd_usb_audio OOPS on MIPSEL -
is that the mmap issue?").

Next thing I'll do is fire up gdb and get a good backtrace of the
__bzero() address to find out which page handling in mpd exactly
is hampered with crashes. This is now ~ the third patch that I applied
on-the-go and that didn't help, so it's probably time to do
some earnest analysis on what's really going on locally.

Note that usb-storage itself does work on this platform though.

Rather annoying to be so close (sound works) yet so far away,
especially after all that USB host trouble I already had.

Thanks a lot again,

Andreas Mohr

Paul Mundt

unread,
Feb 1, 2010, 11:30:01 PM2/1/10
to
On Mon, Feb 01, 2010 at 03:14:04PM -0500, Alan Stern wrote:
> On Mon, 1 Feb 2010, Catalin Marinas wrote:
>
> > On Fri, 2010-01-29 at 18:54 +0000, Matthew Dharm wrote:
> > > HOWEVER, I firmly believe that the cache-management functions belong with
> > > the driver that actually talks to the low-level hardware, as that's the
> > > only place where you can be 100% certain of what cache operations are
> > > needed. After all, I think someone is working on a USB-over-IP transport,
> > > and trying to manage cache at the usb-storage level in that scenario is
> > > just silly.
> > >
> > > So, let's put this in the HCD drivers and be done with it.
> >
> > The patch below is what fixes the I-D cache incoherency issues on ARM. I
> > don't particularly like the solution but it seems to be the only one
> > available.
> >
> > IMHO, Linux should have functions similar to the DMA API but for PIO
> > drivers (e.g. pio_map_single/pio_unmap_single) that non-coherent
> > architectures can define, otherwise being no-ops. Any thoughts?
>
> You should bring this up on the linux-arm-kernel mailing list and CC:
> the ARM maintainer. They are the ones most directly affected.
>
No, this belongs on linux-arch, as it's something that impacts a lot of
people besides ARM.

Paul Mundt

unread,
Feb 2, 2010, 1:40:01 AM2/2/10
to
On Mon, Feb 01, 2010 at 05:29:14PM +0000, Catalin Marinas wrote:
> On Fri, 2010-01-29 at 18:54 +0000, Matthew Dharm wrote:
> > HOWEVER, I firmly believe that the cache-management functions belong with
> > the driver that actually talks to the low-level hardware, as that's the
> > only place where you can be 100% certain of what cache operations are
> > needed. After all, I think someone is working on a USB-over-IP transport,
> > and trying to manage cache at the usb-storage level in that scenario is
> > just silly.
> >
> > So, let's put this in the HCD drivers and be done with it.
>
> The patch below is what fixes the I-D cache incoherency issues on ARM. I
> don't particularly like the solution but it seems to be the only one
> available.
>
> IMHO, Linux should have functions similar to the DMA API but for PIO
> drivers (e.g. pio_map_single/pio_unmap_single) that non-coherent
> architectures can define, otherwise being no-ops. Any thoughts?
>
I would certainly be in favour of such a thing, particularly since on SH
we often find ourselves with coherent PIO and non-coherent MMIO.

This is however something that should be prototyped and submitted to
linux-arch for discussion.

> diff --git a/drivers/usb/host/isp1760-hcd.c b/drivers/usb/host/isp1760-hcd.c
> index 27b8f7c..4d3eeee 100644
> --- a/drivers/usb/host/isp1760-hcd.c
> +++ b/drivers/usb/host/isp1760-hcd.c
> @@ -18,6 +18,8 @@
> #include <linux/uaccess.h>
> #include <linux/io.h>
> #include <asm/unaligned.h>
> +#include <asm/cacheflush.h>
> +#include <asm/memory.h>
>

asm/memory.h isn't a portable header. If you are including it for
virt_to_page(), linux/io.h should already bring that in via asm/io.h.
If arm doesn't bring in virt_to_page() through its asm/io.h, then fix the
headers there please.

FWIW I used the same fix you came up with on r8a66597_hcd and it fixed up
crashes we were seeing there, too.

Oliver Neukum

unread,
Feb 2, 2010, 2:00:02 AM2/2/10
to
Am Montag, 1. Februar 2010 23:30:01 schrieb Andreas Mohr:
> I took some time to add your patch to ehci-q.c / ohci-q.c
> (for my *hci-ssb.c ASUS WL-500gP v2), on my now heavily patched-up 2.6.31.9,
> but UNFORTUNATELY it kept locking up the same way as always when stopping

> playback despite being damn sure this time that this patch could have
> the potential to finally fix it ;)
> (I had to replace memory.h with page.h on my arch though, to fix the build)

A moment please. You are using ehci and ohci. Both are using dma.
Why does this issue arise?

Regards
Oliver

Sebastian Andrzej Siewior

unread,
Feb 2, 2010, 4:20:02 AM2/2/10
to
* Catalin Marinas | 2010-02-01 17:29:14 [+0000]:

>> So, let's put this in the HCD drivers and be done with it.

That is the correct place. MMC -hcd drivers for instance are doing this
way.

>The patch below is what fixes the I-D cache incoherency issues on ARM. I
>don't particularly like the solution but it seems to be the only one
>available.

The PIO-MMC drivers walk through a scatter list via sg_miter_start() and
friends. Those helpers take care of this automaticly.

>IMHO, Linux should have functions similar to the DMA API but for PIO
>drivers (e.g. pio_map_single/pio_unmap_single) that non-coherent
>architectures can define, otherwise being no-ops. Any thoughts?

What is wrong with flush_dcache_page() ? And I think linux-arch is the
appropriate place.

I'm fine with the patch generally but I don't like the asm headers.
cacheflush.h is available on most architectures as far as I can see it but
memory.h is only available on arm. So you break the build on !arm and
therefore I NAK this.

> #include "../core/hcd.h"
> #include "isp1760-hcd.h"
>@@ -904,6 +906,14 @@ __acquires(priv->lock)
> status = 0;
> }
>
>+ if (usb_pipein(urb->pipe) && usb_pipetype(urb->pipe) == PIPE_BULK) {
>+ void *ptr;
>+ for (ptr = urb->transfer_buffer;
>+ ptr < urb->transfer_buffer + urb->transfer_buffer_length;
>+ ptr += PAGE_SIZE)
>+ flush_dcache_page(virt_to_page(ptr));
>+ }
>+
> /* complete() can reenter this HCD */
> usb_hcd_unlink_urb_from_ep(priv_to_hcd(priv), urb);
> spin_unlock(&priv->lock);
>

Sebastian

Florian Fainelli

unread,
Feb 2, 2010, 4:40:02 AM2/2/10
to
On Tuesday 02 February 2010 07:58:42 Oliver Neukum wrote:
> Am Montag, 1. Februar 2010 23:30:01 schrieb Andreas Mohr:
> > I took some time to add your patch to ehci-q.c / ohci-q.c
> > (for my *hci-ssb.c ASUS WL-500gP v2), on my now heavily patched-up
> > 2.6.31.9, but UNFORTUNATELY it kept locking up the same way as always
> > when stopping playback despite being damn sure this time that this patch
> > could have the potential to finally fix it ;)
> > (I had to replace memory.h with page.h on my arch though, to fix the
> > build)
>
> A moment please. You are using ehci and ohci. Both are using dma.
> Why does this issue arise?

Because the BCM4710 CPU core is know to have cache problems and we have been
trying to workaround this, your problem Andreas is imho a different one.
--
Regards, Florian

Catalin Marinas

unread,
Feb 2, 2010, 5:00:02 AM2/2/10
to
On Tue, 2010-02-02 at 04:24 +0000, Paul Mundt wrote:
> On Mon, Feb 01, 2010 at 03:14:04PM -0500, Alan Stern wrote:
> > On Mon, 1 Feb 2010, Catalin Marinas wrote:
> >
> > > On Fri, 2010-01-29 at 18:54 +0000, Matthew Dharm wrote:
> > > > HOWEVER, I firmly believe that the cache-management functions belong with
> > > > the driver that actually talks to the low-level hardware, as that's the
> > > > only place where you can be 100% certain of what cache operations are
> > > > needed. After all, I think someone is working on a USB-over-IP transport,
> > > > and trying to manage cache at the usb-storage level in that scenario is
> > > > just silly.
> > > >
> > > > So, let's put this in the HCD drivers and be done with it.
> > >
> > > The patch below is what fixes the I-D cache incoherency issues on ARM. I
> > > don't particularly like the solution but it seems to be the only one
> > > available.
> > >
> > > IMHO, Linux should have functions similar to the DMA API but for PIO
> > > drivers (e.g. pio_map_single/pio_unmap_single) that non-coherent
> > > architectures can define, otherwise being no-ops. Any thoughts?
> >
> > You should bring this up on the linux-arm-kernel mailing list and CC:
> > the ARM maintainer. They are the ones most directly affected.
>
> No, this belongs on linux-arch, as it's something that impacts a lot of
> people besides ARM.

I agree. I'll try to come up with a proposal and post it there.

BTW, this was already raised on the ARM Linux lists and people there are
aware of these problems. Their suggestion was to take it to LKML.

--
Catalin

Catalin Marinas

unread,
Feb 2, 2010, 6:10:02 AM2/2/10
to
On Tue, 2010-02-02 at 06:39 +0000, Paul Mundt wrote:
> On Mon, Feb 01, 2010 at 05:29:14PM +0000, Catalin Marinas wrote:
> > diff --git a/drivers/usb/host/isp1760-hcd.c b/drivers/usb/host/isp1760-hcd.c
> > index 27b8f7c..4d3eeee 100644
> > --- a/drivers/usb/host/isp1760-hcd.c
> > +++ b/drivers/usb/host/isp1760-hcd.c
> > @@ -18,6 +18,8 @@
> > #include <linux/uaccess.h>
> > #include <linux/io.h>
> > #include <asm/unaligned.h>
> > +#include <asm/cacheflush.h>
> > +#include <asm/memory.h>
>
> asm/memory.h isn't a portable header. If you are including it for
> virt_to_page(), linux/io.h should already bring that in via asm/io.h.
> If arm doesn't bring in virt_to_page() through its asm/io.h, then fix the
> headers there please.

In the ARM case, yes, it brings virt_to_page() but I'm not sure that's
the case for the other architectures. I think a better header is
linux/mm.h which already uses this function in virt_to_head_page().

--
Catalin

Catalin Marinas

unread,
Feb 2, 2010, 6:10:02 AM2/2/10
to
On Tue, 2010-02-02 at 09:11 +0000, Sebastian Andrzej Siewior wrote:
> * Catalin Marinas | 2010-02-01 17:29:14 [+0000]:
> >> So, let's put this in the HCD drivers and be done with it.
>
> That is the correct place. MMC -hcd drivers for instance are doing this
> way.
>
> >The patch below is what fixes the I-D cache incoherency issues on ARM. I
> >don't particularly like the solution but it seems to be the only one
> >available.
>
> The PIO-MMC drivers walk through a scatter list via sg_miter_start() and
> friends. Those helpers take care of this automaticly.
>
> >IMHO, Linux should have functions similar to the DMA API but for PIO
> >drivers (e.g. pio_map_single/pio_unmap_single) that non-coherent
> >architectures can define, otherwise being no-ops. Any thoughts?
>
> What is wrong with flush_dcache_page() ?

In this particular case, it's too many lines to do the virt_to_page for
the transfer buffer since the HCD driver doesn't have access to the
individual pages (via something like urb->sg). A better solution would
be to move such loop in a flush_dcache_range() function to make it
easier for drivers.

Apart from that, flush_dcache_page() doesn't have any data flow
information. Optimisations could be done on ARM if we know that the
kernel only intends to read from a page (no flushing necessary with a
non-aliasing D-cache).

> And I think linux-arch is the appropriate place.

For changes to the cache flushing API, yes, that's the right place. I'll
get there with a patch.


>
> >diff --git a/drivers/usb/host/isp1760-hcd.c b/drivers/usb/host/isp1760-hcd.c
> >index 27b8f7c..4d3eeee 100644
> >--- a/drivers/usb/host/isp1760-hcd.c
> >+++ b/drivers/usb/host/isp1760-hcd.c
> >@@ -18,6 +18,8 @@
> > #include <linux/uaccess.h>
> > #include <linux/io.h>
> > #include <asm/unaligned.h>
> >+#include <asm/cacheflush.h>
> >+#include <asm/memory.h>
>
> I'm fine with the patch generally but I don't like the asm headers.
> cacheflush.h is available on most architectures as far as I can see it but
> memory.h is only available on arm. So you break the build on !arm and
> therefore I NAK this.

Yes, that was already pointed out. I'll post a revised patch (until we
maybe get a better API for such things but that's for linux-arch).

Thanks.

--
Catalin

Paul Mundt

unread,
Feb 2, 2010, 6:20:01 AM2/2/10
to
On Tue, Feb 02, 2010 at 11:05:39AM +0000, Catalin Marinas wrote:
> On Tue, 2010-02-02 at 06:39 +0000, Paul Mundt wrote:
> > On Mon, Feb 01, 2010 at 05:29:14PM +0000, Catalin Marinas wrote:
> > > diff --git a/drivers/usb/host/isp1760-hcd.c b/drivers/usb/host/isp1760-hcd.c
> > > index 27b8f7c..4d3eeee 100644
> > > --- a/drivers/usb/host/isp1760-hcd.c
> > > +++ b/drivers/usb/host/isp1760-hcd.c
> > > @@ -18,6 +18,8 @@
> > > #include <linux/uaccess.h>
> > > #include <linux/io.h>
> > > #include <asm/unaligned.h>
> > > +#include <asm/cacheflush.h>
> > > +#include <asm/memory.h>
> >
> > asm/memory.h isn't a portable header. If you are including it for
> > virt_to_page(), linux/io.h should already bring that in via asm/io.h.
> > If arm doesn't bring in virt_to_page() through its asm/io.h, then fix the
> > headers there please.
>
> In the ARM case, yes, it brings virt_to_page() but I'm not sure that's
> the case for the other architectures. I think a better header is
> linux/mm.h which already uses this function in virt_to_head_page().
>
For some reason I was thinking virt_to_phys() instead of virt_to_page()
when I wrote that, so just ignore me. linux/mm.h is obviously fine.

Oliver Neukum

unread,
Feb 2, 2010, 6:50:02 AM2/2/10
to
Am Montag, 1. Februar 2010 18:29:14 schrieb Catalin Marinas:
> + if (usb_pipein(urb->pipe) && usb_pipetype(urb->pipe) == PIPE_BULK) {
> + void *ptr;
> + for (ptr = urb->transfer_buffer;
> + ptr < urb->transfer_buffer + urb->transfer_buffer_length;
> + ptr += PAGE_SIZE)
> + flush_dcache_page(virt_to_page(ptr));

Is it correct to limit this to BULK pipes?

Regards
Oliver

Catalin Marinas

unread,
Feb 2, 2010, 7:10:01 AM2/2/10
to
On Tue, 2010-02-02 at 11:48 +0000, Oliver Neukum wrote:
> Am Montag, 1. Februar 2010 18:29:14 schrieb Catalin Marinas:
> > + if (usb_pipein(urb->pipe) && usb_pipetype(urb->pipe) == PIPE_BULK) {
> > + void *ptr;
> > + for (ptr = urb->transfer_buffer;
> > + ptr < urb->transfer_buffer + urb->transfer_buffer_length;
> > + ptr += PAGE_SIZE)
> > + flush_dcache_page(virt_to_page(ptr));
>
> Is it correct to limit this to BULK pipes?

I'm not entirely sure. The flush_dcache_page() should only be called for
pages that may be mapped into user space (page cache pages). We don't
need this for control buffers. It was my impression that what's coming
from the mass storage layer intended for page cache pages has the
PIPE_BULK type (I may be wrong though).

--
Catalin

Oliver Neukum

unread,
Feb 2, 2010, 7:10:02 AM2/2/10
to
Am Dienstag, 2. Februar 2010 13:01:12 schrieb Catalin Marinas:
> On Tue, 2010-02-02 at 11:48 +0000, Oliver Neukum wrote:
> > Am Montag, 1. Februar 2010 18:29:14 schrieb Catalin Marinas:
> > > + if (usb_pipein(urb->pipe) && usb_pipetype(urb->pipe) == PIPE_BULK) {
> > > + void *ptr;
> > > + for (ptr = urb->transfer_buffer;
> > > + ptr < urb->transfer_buffer + urb->transfer_buffer_length;
> > > + ptr += PAGE_SIZE)
> > > + flush_dcache_page(virt_to_page(ptr));
> >
> > Is it correct to limit this to BULK pipes?
>
> I'm not entirely sure. The flush_dcache_page() should only be called for
> pages that may be mapped into user space (page cache pages). We don't
> need this for control buffers. It was my impression that what's coming
> from the mass storage layer intended for page cache pages has the
> PIPE_BULK type (I may be wrong though).

For storage that is correct. But what about other sources of pages,
for example iSCSI?

Regards
Oliver

Andreas Mohr

unread,
Feb 2, 2010, 7:20:03 AM2/2/10
to
Hi,

On Tue, Feb 02, 2010 at 01:07:56PM +0100, Oliver Neukum wrote:
> Am Dienstag, 2. Februar 2010 13:01:12 schrieb Catalin Marinas:
> > On Tue, 2010-02-02 at 11:48 +0000, Oliver Neukum wrote:
> > > Am Montag, 1. Februar 2010 18:29:14 schrieb Catalin Marinas:
> > > Is it correct to limit this to BULK pipes?
> >
> > I'm not entirely sure. The flush_dcache_page() should only be called for
> > pages that may be mapped into user space (page cache pages). We don't
> > need this for control buffers. It was my impression that what's coming
> > from the mass storage layer intended for page cache pages has the
> > PIPE_BULK type (I may be wrong though).
>
> For storage that is correct. But what about other sources of pages,
> for example iSCSI?

Or... usb-audio? I should have verified that it is using bulk endpoints
(and thus the patch applies to my case).
usb-audio probably uses isochronous transfers, thus that would be
an obvious reason why the patch didn't work for me.
(with some other reason possibly being BCM4710 issues, of course)

Andreas Mohr

Catalin Marinas

unread,
Feb 2, 2010, 7:50:02 AM2/2/10
to
On Tue, 2010-02-02 at 12:07 +0000, Oliver Neukum wrote:
> Am Dienstag, 2. Februar 2010 13:01:12 schrieb Catalin Marinas:
> > On Tue, 2010-02-02 at 11:48 +0000, Oliver Neukum wrote:
> > > Am Montag, 1. Februar 2010 18:29:14 schrieb Catalin Marinas:
> > > > + if (usb_pipein(urb->pipe) && usb_pipetype(urb->pipe) == PIPE_BULK) {
> > > > + void *ptr;
> > > > + for (ptr = urb->transfer_buffer;
> > > > + ptr < urb->transfer_buffer + urb->transfer_buffer_length;
> > > > + ptr += PAGE_SIZE)
> > > > + flush_dcache_page(virt_to_page(ptr));
> > >
> > > Is it correct to limit this to BULK pipes?
> >
> > I'm not entirely sure. The flush_dcache_page() should only be called for
> > pages that may be mapped into user space (page cache pages). We don't
> > need this for control buffers. It was my impression that what's coming
> > from the mass storage layer intended for page cache pages has the
> > PIPE_BULK type (I may be wrong though).
>
> For storage that is correct. But what about other sources of pages,
> for example iSCSI?

In the iSCSI case, does the HCD driver write directly to a page cache
page? Or it just fills in network packets that are copied to page cache
pages by the iSCSI code (sorry, I'm not familiar with this part of the
kernel). If the latter, the cache flushing in the HCD driver would not
help and it needs to be done in the iSCSI code.

--
Catalin

Oliver Neukum

unread,
Feb 2, 2010, 8:10:02 AM2/2/10
to
Am Dienstag, 2. Februar 2010 13:39:35 schrieb Catalin Marinas:
> > For storage that is correct. But what about other sources of pages,
> > for example iSCSI?
>
> In the iSCSI case, does the HCD driver write directly to a page cache
> page? Or it just fills in network packets that are copied to page cache
> pages by the iSCSI code (sorry, I'm not familiar with this part of the
> kernel). If the latter, the cache flushing in the HCD driver would not
> help and it needs to be done in the iSCSI code.

As far as I can tell iSCSI does a private copy. But I don't know how
many methods to transfer code pages over USB exist. I'd say the
conservative solution is to flush for everything but control transfers.

Regards
Oliver

Ming Lei

unread,
Feb 2, 2010, 8:50:02 AM2/2/10
to
2010/2/2 Catalin Marinas <catalin...@arm.com>:

> In the iSCSI case, does the HCD driver write directly to a page cache
> page? Or it just fills in network packets that are copied to page cache
> pages by the iSCSI code (sorry, I'm not familiar with this part of the
> kernel). If the latter, the cache flushing in the HCD driver would not
> help and it needs to be done in the iSCSI code.

So we should flush dcache page in the place where the user mapped
page is copied to, instead of low level driver which does not do such
thing always.

--
Lei Ming

Catalin Marinas

unread,
Feb 2, 2010, 9:40:01 AM2/2/10
to
On Tue, 2010-02-02 at 13:08 +0000, Oliver Neukum wrote:
> Am Dienstag, 2. Februar 2010 13:39:35 schrieb Catalin Marinas:
> > > For storage that is correct. But what about other sources of pages,
> > > for example iSCSI?
> >
> > In the iSCSI case, does the HCD driver write directly to a page cache
> > page? Or it just fills in network packets that are copied to page cache
> > pages by the iSCSI code (sorry, I'm not familiar with this part of the
> > kernel). If the latter, the cache flushing in the HCD driver would not
> > help and it needs to be done in the iSCSI code.
>
> As far as I can tell iSCSI does a private copy. But I don't know how
> many methods to transfer code pages over USB exist. I'd say the
> conservative solution is to flush for everything but control transfers.

flush_dcache_page() is on many architectures implemented lazily so that
if the page isn't mapped in user space no flushing takes place. It's
mainly the cost of virt_to_page() which I suspect is slightly higher
with sparsemem enabled.

--
Catalin

Catalin Marinas

unread,
Feb 2, 2010, 9:40:01 AM2/2/10
to
On Tue, 2010-02-02 at 13:36 +0000, Ming Lei wrote:
> 2010/2/2 Catalin Marinas <catalin...@arm.com>:
>
> > In the iSCSI case, does the HCD driver write directly to a page cache
> > page? Or it just fills in network packets that are copied to page cache
> > pages by the iSCSI code (sorry, I'm not familiar with this part of the
> > kernel). If the latter, the cache flushing in the HCD driver would not
> > help and it needs to be done in the iSCSI code.
>
> So we should flush dcache page in the place where the user mapped
> page is copied to, instead of low level driver which does not do such
> thing always.

Or both if you can't be sure whether the driver copies directly to a
page cache page.

--
Catalin

Clemens Ladisch

unread,
Feb 2, 2010, 9:50:01 AM2/2/10
to
Andreas Mohr wrote:
> On Tue, Feb 02, 2010 at 01:07:56PM +0100, Oliver Neukum wrote:
> > Am Dienstag, 2. Februar 2010 13:01:12 schrieb Catalin Marinas:
> > > On Tue, 2010-02-02 at 11:48 +0000, Oliver Neukum wrote:
> > > > Am Montag, 1. Februar 2010 18:29:14 schrieb Catalin Marinas:
> > > > Is it correct to limit this to BULK pipes?
> > >
> > > I'm not entirely sure. The flush_dcache_page() should only be called for
> > > pages that may be mapped into user space (page cache pages). We don't
> > > need this for control buffers. It was my impression that what's coming
> > > from the mass storage layer intended for page cache pages has the
> > > PIPE_BULK type (I may be wrong though).
> >
> > For storage that is correct. But what about other sources of pages,
> > for example iSCSI?
>
> Or... usb-audio? I should have verified that it is using bulk endpoints
> (and thus the patch applies to my case).
> usb-audio probably uses isochronous transfers, thus that would be
> an obvious reason why the patch didn't work for me.

snd-usb-audio indeed uses isochronous transfers, but those buffers are
never mapped into user space. The intermediate vmalloc()ed buffer is,
however, and there was a bugfix for this recently. Do you have these
patches in your tree?
http://git.kernel.org/?p=linux/kernel/git/tiwai/sound-2.6.git;a=commit;h=3e879d7bac705be4813a0ec9560cbe31db4b269f
http://git.kernel.org/?p=linux/kernel/git/tiwai/sound-2.6.git;a=commit;h=c32d977b8157bf67cdf47729ce7dd054a26eb534


Best regards,
Clemens

Oliver Neukum

unread,
Feb 2, 2010, 10:00:02 AM2/2/10
to
Am Dienstag, 2. Februar 2010 15:42:49 schrieb Clemens Ladisch:
> > Or... usb-audio? I should have verified that it is using bulk endpoints
> > (and thus the patch applies to my case).
> > usb-audio probably uses isochronous transfers, thus that would be
> > an obvious reason why the patch didn't work for me.
>
> snd-usb-audio indeed uses isochronous transfers, but those buffers are
> never mapped into user space. The intermediate vmalloc()ed buffer is,
> however, and there was a bugfix for this recently. Do you have these
> patches in your tree?

Now that I think about it, several video drivers do map it to user space.

Regards
Oliver

Andreas Mohr

unread,
Feb 2, 2010, 10:20:03 AM2/2/10
to
[added another __bzero coherency crash victim, see
http://lkml.org/lkml/2008/6/9/14 ]

On Tue, Feb 02, 2010 at 03:52:19PM +0100, Oliver Neukum wrote:
> Am Dienstag, 2. Februar 2010 15:42:49 schrieb Clemens Ladisch:
> > > Or... usb-audio? I should have verified that it is using bulk endpoints
> > > (and thus the patch applies to my case).
> > > usb-audio probably uses isochronous transfers, thus that would be
> > > an obvious reason why the patch didn't work for me.
> >
> > snd-usb-audio indeed uses isochronous transfers, but those buffers are
> > never mapped into user space. The intermediate vmalloc()ed buffer is,
> > however, and there was a bugfix for this recently. Do you have these
> > patches in your tree?
>
> Now that I think about it, several video drivers do map it to user space.

OK, then the urb loop needs to also handle isochronous pipes,
and IMHO we should have a generic helper for this instead of open-coding
it, since it probably needs to be done in a couple affected HCDs
(and, most importantly, only on affected architectures - which the helper
could handle transparently).

Clemens: no, both of these patches haven't been applied (yet!!),
many thanks for the notification!

Will apply both patches and the isochronous addition, hopefully that
improves things (will be painful to check which of these things managed to
fix it - in case it does! -, though). Nope, will apply step by step,
both patches, then isochronous as a last resort.

Andreas Mohr

Catalin Marinas

unread,
Feb 2, 2010, 10:40:02 AM2/2/10
to
On Tue, 2010-02-02 at 15:10 +0000, Andreas Mohr wrote:
> [added another __bzero coherency crash victim, see
> http://lkml.org/lkml/2008/6/9/14 ]
>
> On Tue, Feb 02, 2010 at 03:52:19PM +0100, Oliver Neukum wrote:
> > Am Dienstag, 2. Februar 2010 15:42:49 schrieb Clemens Ladisch:
> > > > Or... usb-audio? I should have verified that it is using bulk endpoints
> > > > (and thus the patch applies to my case).
> > > > usb-audio probably uses isochronous transfers, thus that would be
> > > > an obvious reason why the patch didn't work for me.
> > >
> > > snd-usb-audio indeed uses isochronous transfers, but those buffers are
> > > never mapped into user space. The intermediate vmalloc()ed buffer is,
> > > however, and there was a bugfix for this recently. Do you have these
> > > patches in your tree?
> >
> > Now that I think about it, several video drivers do map it to user space.
>
> OK, then the urb loop needs to also handle isochronous pipes,
> and IMHO we should have a generic helper for this instead of open-coding
> it, since it probably needs to be done in a couple affected HCDs
> (and, most importantly, only on affected architectures - which the helper
> could handle transparently).

I'm planning to send a proposal to linux-arch for a flush_dcache_range()
function.

--
Catalin

Alan Stern

unread,
Feb 2, 2010, 12:20:02 PM2/2/10
to
On Tue, 2 Feb 2010, Oliver Neukum wrote:

> Am Dienstag, 2. Februar 2010 13:39:35 schrieb Catalin Marinas:
> > > For storage that is correct. But what about other sources of pages,
> > > for example iSCSI?
> >
> > In the iSCSI case, does the HCD driver write directly to a page cache
> > page? Or it just fills in network packets that are copied to page cache
> > pages by the iSCSI code (sorry, I'm not familiar with this part of the
> > kernel). If the latter, the cache flushing in the HCD driver would not
> > help and it needs to be done in the iSCSI code.
>
> As far as I can tell iSCSI does a private copy. But I don't know how
> many methods to transfer code pages over USB exist. I'd say the
> conservative solution is to flush for everything but control transfers.

This doesn't make any sense. Nobody would ever use isochronous
transfers to store data into a code page because isochronous is
unreliable. (Audio isn't a counterexample -- audio data may be mapped
to userspace, but only to data pages, not code pages. And the problem
here is to maintain consistency between the D and I caches.)

In principle interrupt transfers could be used, but it is most
unlikely. They are intended for bounded-latency transfers, not
transfers of potentially large amounts of data.

The only transfer type that makes sense to worry about is bulk.

Alan Stern

Catalin Marinas

unread,
Feb 2, 2010, 12:30:02 PM2/2/10
to
On Tue, 2010-02-02 at 17:11 +0000, Alan Stern wrote:
> On Tue, 2 Feb 2010, Oliver Neukum wrote:
>
> > Am Dienstag, 2. Februar 2010 13:39:35 schrieb Catalin Marinas:
> > > > For storage that is correct. But what about other sources of pages,
> > > > for example iSCSI?
> > >
> > > In the iSCSI case, does the HCD driver write directly to a page cache
> > > page? Or it just fills in network packets that are copied to page cache
> > > pages by the iSCSI code (sorry, I'm not familiar with this part of the
> > > kernel). If the latter, the cache flushing in the HCD driver would not
> > > help and it needs to be done in the iSCSI code.
> >
> > As far as I can tell iSCSI does a private copy. But I don't know how
> > many methods to transfer code pages over USB exist. I'd say the
> > conservative solution is to flush for everything but control transfers.
>
> This doesn't make any sense. Nobody would ever use isochronous
> transfers to store data into a code page because isochronous is
> unreliable. (Audio isn't a counterexample -- audio data may be mapped
> to userspace, but only to data pages, not code pages. And the problem
> here is to maintain consistency between the D and I caches.)

My issues is with both I-D coherency and D-cache aliasing caused by
pages mapped in both user and kernel space (with different colours). The
flush_dcache_page() call should target both cases.

--
Catalin

Andreas Mohr

unread,
Feb 2, 2010, 3:40:02 PM2/2/10
to
On Tue, Feb 02, 2010 at 03:42:49PM +0100, Clemens Ladisch wrote:
> Andreas Mohr wrote:
> > Or... usb-audio? I should have verified that it is using bulk endpoints
> > (and thus the patch applies to my case).
> > usb-audio probably uses isochronous transfers, thus that would be
> > an obvious reason why the patch didn't work for me.
>
> snd-usb-audio indeed uses isochronous transfers, but those buffers are
> never mapped into user space. The intermediate vmalloc()ed buffer is,
> however, and there was a bugfix for this recently. Do you have these
> patches in your tree?
> http://git.kernel.org/?p=linux/kernel/git/tiwai/sound-2.6.git;a=commit;h=3e879d7bac705be4813a0ec9560cbe31db4b269f
> http://git.kernel.org/?p=linux/kernel/git/tiwai/sound-2.6.git;a=commit;h=c32d977b8157bf67cdf47729ce7dd054a26eb534

OK, I've now added both patches to my quilt series (and pushed everything),
rebuilt, reflashed image and copied modules, and it still bombs
just the very same way.
And this also with Catalins latest patch version (the one using != PIPE_CONTROL
to hit iso transfers etc. as well).
So it seems I still haven't got to the core of the issue despite all these
rather different patch attempts.

I'm afraid if it turns out that keeping open the sound device manually
via another process manages to workaround it, then I'll simply
give it all up completely and live with the current semi-satisfying situation
on my custom 2.6.31.9 build.

Any further ideas or patches that I could try?
(I might investigate the issue myself in a serious way sometime later,
but don't count on it)

Thanks!

Andreas Mohr

netconsole log (some previous crashes were at __bzero, now it was two times
at __copy_user - maybe the patches changed something for real?):

Instruction bus error, epc == 80004dd8, ra == 80000018
Oops[#1]:
Cpu 0
$ 0 : 00000000 1000d000 00000000 00000000
$ 4 : 7f9e6be8 81ee7ec4 00000004 00000000
$ 8 : 00000000 00000000 00000000 81fac000
$12 : 4b688a80 80340000 81d6e868 00000400
$16 : 81ee7f00 00000000 7f9e6be8 00000001
$20 : 81ee7eb8 00000000 7f9e6c9c 7f9eb320
$24 : 00000000 2b565ed0
$28 : 81ee6000 81ee7ea8 7f9f6c98 80000018
Hi : 00000000
Lo : 00000000
epc : 80004dd8 __copy_user+0xd4/0x2bc
Not tainted
ra : 80000018 0x80000018
Status: 1000d003 KERNEL EXL IE
Cause : 00800018
PrId : 00029029 (Broadcom BCM3302)
Modules linked in: snd_usb_audio snd_pcm_oss snd_mixer_oss snd_pcm evdev snd_timer snd_page_alloc snd_usb_lib snd_rawmidi usbhid snd_seq_device snd_hwdep hid snd input_core soundcore ipv6 arc4 ecb cryptomgr aead pcompress crypto_blkcipher crypto_hash crypto_algapi b43 mac80211 cfg80211
Process mpd (pid: 1310, threadinfo=81ee6000, task=81d92838, tls=00000000)
Stack : 00000000 800afd4c 81ee56f8 00000219 00000000 00000000 00000000 00000000
ffffffff 3b043616 81ee7f00 7f9e6be8 00000000 00000000 00000000 800b1370
7f9e6ca0 81ee5680 000182fc 00493a83 81ee7f00 8009f6b0 00000219 03b1daad
00000000 00002710 00000000 00000000 7f9f6ca0 7f9e6ca0 7f9ec528 00000127
00000000 800031f0 00000000 2ae49060 7f9f75a8 2ae49060 7f9e6be8 7f9f5bd8
...
Call Trace:
[<80004dd8>] __copy_user+0xd4/0x2bc


Code: 8ca80000 24a50004 24c6fffc <ac880000> 1706fffb 24840004 10c00040 00864821 240a0020
Disabling lock debugging due to kernel taint
Instruction bus error, epc == 80096fa0, ra == 80000018
Oops[#2]:
Cpu 0
$ 0 : 00000000 1000d000 c0156064 00000064
$ 4 : 00000032 803b5514 00000032 81fa8000
$ 8 : 8037f840 00080000 81040000 00000003
$12 : 00000010 8037f840 00000004 00000000
$16 : 00000032 81fa8d54 2ab55000 00398f45
$20 : 2ab56000 0064d613 00000000 00000000
$24 : 00000000 80018ff0
$28 : 81ee6000 81ee7c58 00000000 80000018
Hi : 00000000
Lo : 00000000
epc : 80096fa0 swap_info_get+0x74/0xfc
Tainted: G D
ra : 80000018 0x80000018
Status: 1000d003 KERNEL EXL IE
Cause : 00800018
PrId : 00029029 (Broadcom BCM3302)
Modules linked in: snd_usb_audio snd_pcm_oss snd_mixer_oss snd_pcm evdev snd_timer snd_page_alloc snd_usb_lib snd_rawmidi usbhid snd_seq_device snd_hwdep hid snd input_core soundcore ipv6 arc4 ecb cryptomgr aead pcompress crypto_blkcipher crypto_hash crypto_algapi b43 mac80211 cfg80211
Process mpd (pid: 1310, threadinfo=81ee6000, task=81d92838, tls=00000000)
Stack : ffffffff 800736a0 00000004 81d6e870 80350fd0 80099498 81d929cc 81036e40
81fa8bfc 2aaff000 00064000 81fa8d54 2ab55000 80088a30 8033dab8 80029d78
803402d4 00000006 2ab55fff 8033d9c0 81eaae60 81f0c2a8 81f0c2a8 2ab56000
00000000 00000001 80c4c0bc 81eaae60 8037f840 81eaae94 81d92838 00000000
00000001 7f9eb320 7f9f6c98 8008db1c 81ee7d00 80c4ce90 00000000 ffffffff
...
Call Trace:
[<80096fa0>] swap_info_get+0x74/0xfc
[<80099498>] free_swap_and_cache+0x1c/0x218
[<80088a30>] unmap_vmas+0x418/0x63c
[<8008db1c>] exit_mmap+0xb8/0x148
[<8002e3c4>] mmput+0xc0/0x1d8
[<800333e8>] exit_mm+0x260/0x298
[<800357cc>] do_exit+0x1cc/0x688
[<80014658>] nmi_exception_handler+0x0/0x34


Code: 00041840 8ca20020 00431021 <94440000> 1480001d 8fbf0014 3c048030 3c05802c 24a5f280
Fixing recursive fault but reboot is needed!
Instruction bus error, epc == 80004dd8, ra == 80000018
Oops[#3]:
Cpu 0
$ 0 : 00000000 1000d000 00000000 00000000
$ 4 : 7fcd81b0 81c0ddd0 00000000 1000d001
$ 8 : 00000000 00000000 00000000 806f8000
$12 : 4b688a84 7f9f7f18 81d6e868 00000000
$16 : 00000004 00000000 81c0ddc0 81c0ddc0
$20 : 7fcd81b0 00000000 81c0ddcc 00000000
$24 : 00000000 2b565ed0
$28 : 81c0c000 81c0dd98 00000001 80000018
Hi : 0000007d
Lo : eb254400
epc : 80004dd8 __copy_user+0xd4/0x2bc
Tainted: G D
ra : 80000018 0x80000018
Status: 1000d003 KERNEL EXL IE
Cause : 00800018
PrId : 00029029 (Broadcom BCM3302)
Modules linked in: snd_usb_audio snd_pcm_oss snd_mixer_oss snd_pcm evdev snd_timer snd_page_alloc snd_usb_lib snd_rawmidi usbhid snd_seq_device snd_hwdep hid snd input_core soundcore ipv6 arc4 ecb cryptomgr aead pcompress crypto_blkcipher crypto_hash crypto_algapi b43 mac80211 cfg80211
Process init (pid: 1, threadinfo=81c0c000, task=81c08480, tls=00000000)
Stack : 00000000 81c0dda8 81c0df00 80350fb0 81c0ddc0 81c0ddc4 81c0ddc8 81c0ddcc
81c0ddd0 81c0ddd4 00000400 00000000 00000000 00000000 00000000 00000000
00000000 00000000 81d98000 ffffff9c 81c0dea8 0044a234 7fcd8598 8009b864
00000001 81c0dea8 00000001 81d98000 ffffff9c 800a2d18 00000003 00000002
00000003 00000003 0000000d 00000000 00000000 00000000 000000c9 00001180
...
Call Trace:
[<80004dd8>] __copy_user+0xd4/0x2bc


Code: 8ca80000 24a50004 24c6fffc <ac880000> 1706fffb 24840004 10c00040 00864821 240a0020
kobject: 'ep_01' (81e52f10): kobject_uevent_env
kobject: 'ep_01' (81e52f10): kobject_uevent_env: filter function caused the event to drop!
Kernel panic - not syncing: Attempted to kill init!

Andreas Mohr

unread,
Feb 2, 2010, 5:00:02 PM2/2/10
to
Hi,

On Tue, Feb 02, 2010 at 05:20:11PM +0000, Catalin Marinas wrote:
> My issues is with both I-D coherency and D-cache aliasing caused by
> pages mapped in both user and kernel space (with different colours). The
> flush_dcache_page() call should target both cases.

Yup, it does, and quite successfully at that (aka "at that point in time we
having nothing any more to worry about, everything dealt with" ;-)


usbcore: registered new interface driver ums-datafab
hub 2-1:1.0: state 7 ports 2 chg 0002 evt 0000
kobject: 'ums-freecom' (81de0a80): kobject_add_internal: parent: 'drivers', set: 'drivers'
hub 2-1:1.0: port 1, status 0101, change 0000, 12 Mb/s
kobject: 'ums-freecom' (81de0a80): kobject_uevent_env
kobject: 'ums-freecom' (81de0a80): fill_kobj_path: path = '/bus/usb/drivers/ums-freecom'
usbcore: registered new interface driver ums-freecom
kobject: 'ums-jumpshot' (81de0c80): kobject_add_internal: parent: 'drivers', set: 'drivers'
CPU 0 Unable to handle kernel paging request at virtual address 0000041c, epc == 800171e8, ra == 801da5dc
Oops[#1]:
Cpu 0
$ 0 : 00000000 10008000 803b0000 00010000
$ 4 : 00000408 8143bc60 0043bc60 00000001
$ 8 : 81dd7124 81dd7190 00000004 00000000
$12 : 0000003b 80380000 00000002 f2d9b780
$16 : a1de4020 803b0000 8037f840 81de7f00
$20 : 00000000 81dd7080 80000000 00000000
$24 : 00000000 80016bb8
$28 : 81c0c000 81c0da98 a1dd414c 801da5dc
Hi : 00000008
Lo : 00000000
epc : 800171e8 __flush_dcache_page+0x38/0x120
Not tainted
ra : 801da5dc ehci_urb_done+0x178/0x1dc
Status: 10008002 KERNEL EXL
Cause : 00805008
BadVA : 0000041c


PrId : 00029029 (Broadcom BCM3302)
Modules linked in:

Process swapper (pid: 1, threadinfo=81c0c000, task=81c08480, tls=00000000)
Stack : 81dd7080 00000001 10009000 8033dab8 a1dd8120 a1dd4114 ffffff6a ffffff6a
81de7f00 a1dd414c a1dd4100 801db39c 05b8d800 00000000 00000018 803a0000
803a0000 0000054c 00000001 00000000 a1dd8180 81dd7080 00000000 a1dd4100
00000000 81c0dbb8 00000000 80318d24 81dd7158 81dd7080 81dda004 801deb38
81dd7158 8004f984 01f63104 0000003c 81c0dc78 8033feb8 00000008 00000042
...
Call Trace:
[<800171e8>] __flush_dcache_page+0x38/0x120
[<801da5dc>] ehci_urb_done+0x178/0x1dc
[<801db39c>] qh_completions+0x484/0x554
[<801deb38>] ehci_work+0x438/0xb68
[<801df2bc>] ehci_watchdog+0x54/0x94
[<8003d3ec>] run_timer_softirq+0x1b0/0x268
[<80037fbc>] __do_softirq+0xb8/0x174
[<800380d4>] do_softirq+0x5c/0x98
[<80038244>] irq_exit+0x40/0x88
[<8000e12c>] plat_irq_dispatch+0x60/0x178
[<80001444>] ret_from_irq+0x0/0x4
[<80031de8>] vprintk+0x36c/0x3bc
[<8000a48c>] printk+0x24/0x30
[<80151918>] kobject_add_internal+0x124/0x254
[<80151f80>] kobject_init_and_add+0x40/0x58
[<8018e854>] bus_add_driver+0xdc/0x2b4
[<801902c8>] driver_register+0xe0/0x19c
[<801ce000>] usb_register_driver+0x84/0x118
[<8000d640>] do_one_initcall+0x70/0x1f4
[<80354334>] kernel_init+0xd0/0x140
[<8000fb4c>] kernel_thread_helper+0x10/0x18


Code: 00000000 10800029 3c02803b <8c820014> 14400026 3c02803b 8c83001c 2482001c 14620021


Disabling lock debugging due to kernel taint

Kernel panic - not syncing: Fatal exception in interrupt

Any ideas? To my uncaring mind this would look like __flush_dcache_page()
not being quite so happy with a NULL pointer that it is being served
(although I haven't managed to precisely investigate yet where the
dereferencing offset 0000041c is coming from).

Yes, crash is reproducible (three times on boot already, although some bootup
does make it successfully).

My ehci-q.c has:

if (usb_pipein(urb->pipe) && usb_pipetype(urb->pipe) != PIPE_CONTROL) {
void *ptr;


for (ptr = urb->transfer_buffer;

ptr < urb->transfer_buffer + urb->transfer_buffer_length;

ptr += PAGE_SIZE)
flush_dcache_page(virt_to_page(ptr));
}

Hmm, OTOH this code seems to postulate that urb->transfer_buffer_length
is that 0x41c from above...
(IOW the code is simply missing an urb->transfer_buffer NULL check)
OTOH there would also be the question whether flush_dcache_page() should
have caught the NULL pointer input...
And then there's the question whether urb->transfer_buffer is allowed to end
up as NULL anyway...

BTW, trying to keep open /dev/dsp by another app when closing the playback app
does not prevent the audio OOPS.


Been seeing a nano-tiny wee bit too many crashes these days,

Andreas Mohr

Alan Stern

unread,
Feb 3, 2010, 10:20:01 AM2/3/10
to
On Tue, 2 Feb 2010, Andreas Mohr wrote:

> Any ideas? To my uncaring mind this would look like __flush_dcache_page()
> not being quite so happy with a NULL pointer that it is being served
> (although I haven't managed to precisely investigate yet where the
> dereferencing offset 0000041c is coming from).
>
> Yes, crash is reproducible (three times on boot already, although some bootup
> does make it successfully).
>
> My ehci-q.c has:
>
> if (usb_pipein(urb->pipe) && usb_pipetype(urb->pipe) != PIPE_CONTROL) {
> void *ptr;
> for (ptr = urb->transfer_buffer;
> ptr < urb->transfer_buffer + urb->transfer_buffer_length;
> ptr += PAGE_SIZE)
> flush_dcache_page(virt_to_page(ptr));
> }
>
> Hmm, OTOH this code seems to postulate that urb->transfer_buffer_length
> is that 0x41c from above...
> (IOW the code is simply missing an urb->transfer_buffer NULL check)
> OTOH there would also be the question whether flush_dcache_page() should
> have caught the NULL pointer input...
> And then there's the question whether urb->transfer_buffer is allowed to end
> up as NULL anyway...

Have you looked at the code in qh_urb_transaction() in ehci-q.c
involving this_sg_len and buf? It's quite possible that
urb->transfer_buffer is a NULL pointer and that the actual buffer is
not a contiguous set of pages -- but only if DMA is used.

Alan Stern

George Spelvin

unread,
Feb 3, 2010, 7:00:03 PM2/3/10
to
> Apart from that, flush_dcache_page() doesn't have any data flow
> information. Optimisations could be done on ARM if we know that the
> kernel only intends to read from a page (no flushing necessary with a
> non-aliasing D-cache).

Already done in flush_dcache_page(). If possible (uniprocessor), it just
flags the page as PG_dcache_dirty, and defers the actual flush operation
until it's mapped somewhere else (either a virtual alias or executable).

See Documentation/cachetlb.txt. (Really, all PIO drivers should
be calling flush_dcache_page.)

Paul Mundt

unread,
Feb 3, 2010, 11:40:02 PM2/3/10
to
On Wed, Feb 03, 2010 at 06:56:44PM -0500, George Spelvin wrote:
> > Apart from that, flush_dcache_page() doesn't have any data flow
> > information. Optimisations could be done on ARM if we know that the
> > kernel only intends to read from a page (no flushing necessary with a
> > non-aliasing D-cache).
>
> Already done in flush_dcache_page(). If possible (uniprocessor), it just
> flags the page as PG_dcache_dirty, and defers the actual flush operation
> until it's mapped somewhere else (either a virtual alias or executable).
>
Try reading the thread again, as you seem to have missed the point
completely. The issue isn't with lazy dcache writeback, the issue is that
flush_dcache_page() is a bit of a sledgehammer for cases when directional
information is available. The DMA mapping operations conversely are aware
of data flow and optimize accordingly.

Additionally, with something like a flush_dcache_range() it's possible
to optimize for large ranges as opposed to page-at-a-time looping for
anything that needs to flag PG_dcache_dirty on a bulk group of pages.

Pavel Machek

unread,
Feb 8, 2010, 2:00:02 AM2/8/10
to
Hi!

> > So, let's put this in the HCD drivers and be done with it.
>

> The patch below is what fixes the I-D cache incoherency issues on ARM. I
> don't particularly like the solution but it seems to be the only one
> available.

Really? It looks like arm should just flush the caches when mapping
executable page to the userspace.... you can't expect all the drivers
to be modified like that...

Plus it does unneccessary flushes on x86, etc...

> @@ -904,6 +906,14 @@ __acquires(priv->lock)
> status = 0;


> }
>
> + if (usb_pipein(urb->pipe) && usb_pipetype(urb->pipe) == PIPE_BULK) {
> + void *ptr;
> + for (ptr = urb->transfer_buffer;
> + ptr < urb->transfer_buffer + urb->transfer_buffer_length;
> + ptr += PAGE_SIZE)
> + flush_dcache_page(virt_to_page(ptr));

> + }
> +
> /* complete() can reenter this HCD */
> usb_hcd_unlink_urb_from_ep(priv_to_hcd(priv), urb);
> spin_unlock(&priv->lock);
>

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Pavel Machek

unread,
Feb 8, 2010, 2:00:02 AM2/8/10
to
On Tue 2010-02-02 12:11:25, Alan Stern wrote:
> On Tue, 2 Feb 2010, Oliver Neukum wrote:
>
> > Am Dienstag, 2. Februar 2010 13:39:35 schrieb Catalin Marinas:
> > > > For storage that is correct. But what about other sources of pages,
> > > > for example iSCSI?
> > >
> > > In the iSCSI case, does the HCD driver write directly to a page cache
> > > page? Or it just fills in network packets that are copied to page cache
> > > pages by the iSCSI code (sorry, I'm not familiar with this part of the
> > > kernel). If the latter, the cache flushing in the HCD driver would not
> > > help and it needs to be done in the iSCSI code.
> >
> > As far as I can tell iSCSI does a private copy. But I don't know how
> > many methods to transfer code pages over USB exist. I'd say the
> > conservative solution is to flush for everything but control transfers.
>
> This doesn't make any sense. Nobody would ever use isochronous
> transfers to store data into a code page because isochronous is
> unreliable. (Audio isn't a counterexample -- audio data may be

Why not?

Use isochronous transfer to load data, verify it is okay, exec it.

Or maybe someone is doing crashme testing with usb audio as random
generator :-).

Sure, unlikely, but...
Pavel

Andreas Mohr

unread,
Feb 8, 2010, 2:40:01 AM2/8/10
to
Hi,

On Mon, Feb 08, 2010 at 07:55:19AM +0100, Pavel Machek wrote:
> Plus it does unneccessary flushes on x86, etc...

Noticed that as well, there should be an arch-obeying helper for this.


On my MIPSEL, I had urb->transfer_buffer NULL ptr crashes
(I think that was expected in case of a certain DMA setup, Alan said).

However, even with NULL check added I still had:

hub 2-1.1:1.0: state 7 ports 7 chg 0000 evt 0010
Unhandled kernel unaligned access[#1]:
Cpu 0
$ 0 : 00000000 fffffffd 803b0000 00010000
$ 4 : 08002042 8143bfe0 0043bfe0 0000000d
$ 8 : 00000001 3b9aca00 c4653600 00000000
$12 : 00000049 3b9aca00 81dbc868 00000000
$16 : a1e00000 803b0000 8037f840 81dfaa80
$20 : 00000000 81dd5080 80000000 00000000
$24 : 00000000 80015a64
$28 : 8033a000 8033bc10 a1dd83cc 801da5e4
Hi : 00000000


Lo : 00000000
epc : 800171e8 __flush_dcache_page+0x38/0x120
Not tainted

ra : 801da5e4 ehci_urb_done+0x180/0x1e4
Status: 10009002 KERNEL EXL
Cause : 00800010
BadVA : 08002056


PrId : 00029029 (Broadcom BCM3302)
Modules linked in:

Process swapper (pid: 0, threadinfo=8033a000, task=8033c000, tls=00000000)
Stack : 00000000 00000000 81e04980 801c80ac a1dd9060 a1dd8394 ffffff6a ffffff6a
81dfaa80 a1dd83cc a1dd8380 801db3a4 803a6a28 80068e9c 000003f8 00003fc0
a1dd81cc 801dea58 00000001 00000000 a1dd9360 81dd5080 a1dd8380 10009001
a1dd83cc 81dd5158 00000000 80318d44 81dd5158 00000001 00010031 801de8f4
81dd5158 8033bce0 803a76a0 803a0000 8033d860 8004f924 00000219 00000043


...
Call Trace:
[<800171e8>] __flush_dcache_page+0x38/0x120

[<801da5e4>] ehci_urb_done+0x180/0x1e4
[<801db3a4>] qh_completions+0x484/0x554
[<801de8f4>] ehci_work+0x1ec/0xb68
[<801e2598>] ehci_irq+0x360/0x3a4
[<801c7cf8>] usb_hcd_irq+0x64/0x15c
[<80066d58>] handle_IRQ_event+0x90/0x280
[<80068e80>] handle_percpu_irq+0x48/0x9c
[<8000e228>] plat_irq_dispatch+0x15c/0x178
[<80001444>] ret_from_irq+0x0/0x4
[<80001680>] r4k_wait+0x20/0x40
[<8000fe34>] cpu_idle+0x30/0x60
[<80354a34>] start_kernel+0x338/0x350


Code: 00000000 10800029 3c02803b <8c820014> 14400026 3c02803b 8c83001c 2482001c 14620021
Disabling lock debugging due to kernel taint
Kernel panic - not syncing: Fatal exception in interrupt

Seems like BadVA : 08002056 really isn't as aligned (offset 0x6) as it should be.

I've given up on this now BTW, I'll wait until the dust has settled (i.e. some nice improvements
have found their way to the kernel) and retry in some months with a much newer kernel version
(currently patched-up 2.6.31.9) whether something remains to be fixed.
I'll work on more productive things such as submitting some waiting patches.

Andreas Mohr

Catalin Marinas

unread,
Feb 8, 2010, 5:00:01 AM2/8/10
to
Hi,

On Mon, 2010-02-08 at 06:55 +0000, Pavel Machek wrote:
> > > So, let's put this in the HCD drivers and be done with it.
> >
> > The patch below is what fixes the I-D cache incoherency issues on ARM. I
> > don't particularly like the solution but it seems to be the only one
> > available.
>
> Really? It looks like arm should just flush the caches when mapping
> executable page to the userspace.... you can't expect all the drivers
> to be modified like that...

We could of course flush the caches every time we get a page fault but
that's far from optimal, especially since DMA-capable drivers to do not
pollute the D-cache and don't need this extra flushing. Note that the
recent ARM processors have PIPT caches but separate for I and D and it's
the PIO drivers that pollute the D-cache.

The kernel API provides flush_dcache_page() to be called every time the
kernel writes to a page cache page. This is further optimised for
working in pair with update_mmu_cache() to delay the flushing until the
actual page is mapped into user space and this latter function is called
(which in general is not a cache maintenance function).

The problem with some PIO drivers and a filesystems like ext2 is that
there is no call to flush_dcache_page() when getting data into a page
cache page. Since the page isn't marked as dirty (PG_arch_1), a
subsequent call to update_mmu_cache() as a result of a page fault
doesn't flush the caches.

There is a flush_icache_page() function called from __do_fault(),
however, Documentation/cachetlb.txt states that all the functionality of
this function can be implemented in flush_dcache_page() and
update_mmu_cache(), hence this function is a no-op.

Please suggest a better solution that does not involve modifying generic
Linux code.

> Plus it does unneccessary flushes on x86, etc...

On x86, it should indeed be conditionally compiled based on
ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE.

Regards.

--
Catalin

Andy Green

unread,
Feb 8, 2010, 5:10:02 AM2/8/10
to
On 02/08/10 10:51, Somebody in the thread at some point said:

> We could of course flush the caches every time we get a page fault but
> that's far from optimal, especially since DMA-capable drivers to do not
> pollute the D-cache and don't need this extra flushing. Note that the
> recent ARM processors have PIPT caches but separate for I and D and it's
> the PIO drivers that pollute the D-cache.

Just noting that AFAIK iMX31 USB and MMC drivers both are PIO at the
moment, for lack of any platform DMA support of its unusual DMA engine.

-Andy

Catalin Marinas

unread,
Feb 8, 2010, 5:30:02 AM2/8/10
to
On Mon, 2010-02-08 at 07:33 +0000, Andreas Mohr wrote:
> On Mon, Feb 08, 2010 at 07:55:19AM +0100, Pavel Machek wrote:
> > Plus it does unneccessary flushes on x86, etc...
>
> Noticed that as well, there should be an arch-obeying helper for this.
>
>
> On my MIPSEL, I had urb->transfer_buffer NULL ptr crashes
> (I think that was expected in case of a certain DMA setup, Alan said).
>
> However, even with NULL check added I still had:
>
> hub 2-1.1:1.0: state 7 ports 7 chg 0000 evt 0010
> Unhandled kernel unaligned access[#1]:

Just to avoid confusion - that's a similar patch applied to a different
driver. The ISP1760 HCD driver works fine with my patch (transfer_buffer
never seems to be NULL with latest mainline). I can't comment on the
ehci-q.c driver (it looks like it has some support for DMA while my
patch only applies to PIO drivers where transfer_buffer should be set).

--
Catalin

Pavel Machek

unread,
Feb 8, 2010, 6:00:01 AM2/8/10
to
> Hi,
>
> On Mon, 2010-02-08 at 06:55 +0000, Pavel Machek wrote:
> > > > So, let's put this in the HCD drivers and be done with it.
> > >
> > > The patch below is what fixes the I-D cache incoherency issues on ARM. I
> > > don't particularly like the solution but it seems to be the only one
> > > available.
> >
> > Really? It looks like arm should just flush the caches when mapping
> > executable page to the userspace.... you can't expect all the drivers
> > to be modified like that...
>
> We could of course flush the caches every time we get a page fault but
> that's far from optimal, especially since DMA-capable drivers to do
> not

Maybe far for optimal, but it is something that should be done,
_first_. Correctness is more important than performance, and you can't
expect all drivers to behave like you want them.

Then you can add optimalizations not to do the flushes on drivers you
audited and where you care...

Catalin Marinas

unread,
Feb 8, 2010, 6:30:02 AM2/8/10
to
On Mon, 2010-02-08 at 10:52 +0000, Pavel Machek wrote:
> > On Mon, 2010-02-08 at 06:55 +0000, Pavel Machek wrote:
> > > > > So, let's put this in the HCD drivers and be done with it.
> > > >
> > > > The patch below is what fixes the I-D cache incoherency issues on ARM. I
> > > > don't particularly like the solution but it seems to be the only one
> > > > available.
> > >
> > > Really? It looks like arm should just flush the caches when mapping
> > > executable page to the userspace.... you can't expect all the drivers
> > > to be modified like that...
> >
> > We could of course flush the caches every time we get a page fault but
> > that's far from optimal, especially since DMA-capable drivers to do
> > not
>
> Maybe far for optimal, but it is something that should be done,
> _first_. Correctness is more important than performance, and you can't
> expect all drivers to behave like you want them.

I wouldn't call heavy cache flushing "correctness". We could as well
disable the caches so that it is fully coherent.

The arch code follows an API defined in cachetlb.txt but the PIO drivers
don't (some do, like mmci.c). It may be inconvenient to call
flush_dcache_page() in the driver, hence I started a discussion on
linux-arch on a PIO mapping API that x86 or other fully coherent
architectures can leave it as no-ops.

> Then you can add optimalizations not to do the flushes on drivers you
> audited and where you care...

Sorry but that's not really feasible (unless I don't fully understand
what you mean) - if we do the cache flushing on the fault handling path
in the arch code, there is no way for the arch code to know what driver
is doing, unless we make this conditionally compiled with something like
CONFIG_ARCH_NEEDS_HEAVY_FLUSHING.

--
Catalin

Shilimkar, Santosh

unread,
Feb 16, 2010, 3:00:01 AM2/16/10
to


Continuing on the USB issue w.r.t cache coherency, the usb host
code is violating the buffer ownership rules of streaming APIs from
dma and non-dma transfers point if view.

We have a below temporary patch to get around the issue and probably it
needs to be fixed in the right way in the stack because some controllers
may not have PIO option even for control transfers. (e.g. Synopsis EHCI
controller)

From: Maulik Mankad <x008...@ti.com>

USB: Avoid DMA map/unmap of control transfer buffers.

This patch avoids the DMA mapping of buffers for control
transfers.

Signed-off-by: Maulik Mankad <x008...@ti.com>
---
Index: omap4_integration/drivers/usb/core/hcd.c
===================================================================
--- omap4_integration.orig/drivers/usb/core/hcd.c
+++ omap4_integration/drivers/usb/core/hcd.c
@@ -1274,6 +1274,10 @@ static int map_urb_for_dma(struct usb_hc
if (is_root_hub(urb->dev))
return 0;

+ if (usb_endpoint_xfer_control(&urb->ep->desc))
+ urb->transfer_flags = URB_NO_SETUP_DMA_MAP |
+ URB_NO_TRANSFER_DMA_MAP;
+
if (usb_endpoint_xfer_control(&urb->ep->desc)
&& !(urb->transfer_flags & URB_NO_SETUP_DMA_MAP)) {
if (hcd->self.uses_dma) {

Oliver Neukum

unread,
Feb 16, 2010, 3:30:02 AM2/16/10
to
Am Dienstag, 16. Februar 2010 08:57:53 schrieb Shilimkar, Santosh:
> Continuing on the USB issue w.r.t cache coherency, the usb host
> code is violating the buffer ownership rules of streaming APIs from
> dma and non-dma transfers point if view.
>
> We have a below temporary patch to get around the issue and probably it
> needs to be fixed in the right way in the stack because some controllers
> may not have PIO option even for control transfers. (e.g. Synopsis EHCI
> controller)

This seems wrong to me. Buffers for control transfers may be transfered
by DMA, so the caches must be flushed on architectures whose caches
are not coherent with respect to DMA.

Would you care to elaborate on the exact nature of the bug you are fixing?

Regards
Oliver

Russell King - ARM Linux

unread,
Feb 16, 2010, 3:50:03 AM2/16/10
to
On Tue, Feb 16, 2010 at 01:27:53PM +0530, Shilimkar, Santosh wrote:
> Continuing on the USB issue w.r.t cache coherency, the usb host
> code is violating the buffer ownership rules of streaming APIs from
> dma and non-dma transfers point if view.
>
> We have a below temporary patch to get around the issue and probably it
> needs to be fixed in the right way in the stack because some controllers
> may not have PIO option even for control transfers. (e.g. Synopsis EHCI
> controller)

if (usb_endpoint_xfer_control(&urb->ep->desc)


&& !(urb->transfer_flags & URB_NO_SETUP_DMA_MAP)) {

if (hcd->self.uses_dma) { <=================
urb->setup_dma = dma_map_single(
hcd->self.controller,
urb->setup_packet,
sizeof(struct usb_ctrlrequest),
DMA_TO_DEVICE);

struct usb_hcd *usb_create_hcd (const struct hc_driver *driver,
struct device *dev, const char *bus_name)
{
...
hcd->self.uses_dma = (dev->dma_mask != NULL);

Is it easier to make sure that PIO devices don't have dev->dma_mask set?

Shilimkar, Santosh

unread,
Feb 16, 2010, 4:00:02 AM2/16/10
to
> -----Original Message-----
> From: Oliver Neukum [mailto:oli...@neukum.org]
> Sent: Tuesday, February 16, 2010 1:53 PM
> To: Shilimkar, Santosh
> Cc: Catalin Marinas; Pavel Machek; Greg KH; Russell King - ARM Linux; Matthew Dharm; Sergei Shtylyov;
> Ming Lei; Sebastian Siewior; linu...@vger.kernel.org; linux-kernel; linux-arm-kernel; Mankad,
> Maulik Ojas
> Subject: Re: USB mass storage and ARM cache coherency
>
> Am Dienstag, 16. Februar 2010 08:57:53 schrieb Shilimkar, Santosh:
> > Continuing on the USB issue w.r.t cache coherency, the usb host
> > code is violating the buffer ownership rules of streaming APIs from
> > dma and non-dma transfers point if view.
> >
> > We have a below temporary patch to get around the issue and probably it
> > needs to be fixed in the right way in the stack because some controllers
> > may not have PIO option even for control transfers. (e.g. Synopsis EHCI
> > controller)
>
> This seems wrong to me. Buffers for control transfers may be transfered
> by DMA, so the caches must be flushed on architectures whose caches
> are not coherent with respect to DMA.
Indeed and that's what I mentioned in the comment. But we shouldn't have dma
cache maintenance operations done for the buffers which would use pio based transfer.
> Would you care to elaborate on the exact nature of the bug you are fixing?
On the OMAP4 (ARM cortex-a9) platform, the enumeration fails because control
transfer buffers are corrupted. On our platform, we use PIO mode for control
transfers and DMA for bulk transfers.

The current stack performs dma cache maintenance even for the PIO transfers
which leads to the corruption issue. The control buffers are handled by CPU
and they already coherent from CPU point of view.


Regards,
Santosh

Gadiyar, Anand

unread,
Feb 16, 2010, 4:00:02 AM2/16/10
to
Russell King - ARM Linux wrote:
> On Tue, Feb 16, 2010 at 01:27:53PM +0530, Shilimkar, Santosh wrote:
> > Continuing on the USB issue w.r.t cache coherency, the usb host
> > code is violating the buffer ownership rules of streaming APIs from
> > dma and non-dma transfers point if view.
> >
> > We have a below temporary patch to get around the issue and probably it
> > needs to be fixed in the right way in the stack because some controllers
> > may not have PIO option even for control transfers. (e.g. Synopsis EHCI
> > controller)
>
> if (usb_endpoint_xfer_control(&urb->ep->desc)
> && !(urb->transfer_flags & URB_NO_SETUP_DMA_MAP)) {
> if (hcd->self.uses_dma) { <=================
> urb->setup_dma = dma_map_single(
> hcd->self.controller,
> urb->setup_packet,
> sizeof(struct usb_ctrlrequest),
> DMA_TO_DEVICE);
>
> struct usb_hcd *usb_create_hcd (const struct hc_driver *driver,
> struct device *dev, const char *bus_name)
> {
> ...
> hcd->self.uses_dma = (dev->dma_mask != NULL);
>
> Is it easier to make sure that PIO devices don't have dev->dma_mask set?

Not really. For instance, in the case of the DMA engine in the MUSB
controller in OMAP3, we can only use DMA with endpoints other than
EP0, and EP0 is what is used for control transfers.

It's not PIO for all the endpoints or DMA for all of them.

- Anand

Oliver Neukum

unread,
Feb 16, 2010, 4:10:02 AM2/16/10
to
Am Dienstag, 16. Februar 2010 09:55:55 schrieb Shilimkar, Santosh:
> > This seems wrong to me. Buffers for control transfers may be transfered
> > by DMA, so the caches must be flushed on architectures whose caches
> > are not coherent with respect to DMA.
> Indeed and that's what I mentioned in the comment. But we shouldn't have dma
> cache maintenance operations done for the buffers which would use pio based transfer.

Given that the generic layer can't know which buffers will be used for DMA
that would require a callback into the hcd driver.

> > Would you care to elaborate on the exact nature of the bug you are fixing?
> On the OMAP4 (ARM cortex-a9) platform, the enumeration fails because control
> transfer buffers are corrupted. On our platform, we use PIO mode for control
> transfers and DMA for bulk transfers.
>
> The current stack performs dma cache maintenance even for the PIO transfers
> which leads to the corruption issue. The control buffers are handled by CPU
> and they already coherent from CPU point of view.

How does the mapping corrupt buffers? It might impact performance, but why
do you see corruption?

Regards
Oliver

Russell King - ARM Linux

unread,
Feb 16, 2010, 4:50:02 AM2/16/10
to
On Tue, Feb 16, 2010 at 10:07:20AM +0100, Oliver Neukum wrote:
> Am Dienstag, 16. Februar 2010 09:55:55 schrieb Shilimkar, Santosh:
> > > Would you care to elaborate on the exact nature of the bug you are fixing?
> > On the OMAP4 (ARM cortex-a9) platform, the enumeration fails because control
> > transfer buffers are corrupted. On our platform, we use PIO mode for control
> > transfers and DMA for bulk transfers.
> >
> > The current stack performs dma cache maintenance even for the PIO transfers
> > which leads to the corruption issue. The control buffers are handled by CPU
> > and they already coherent from CPU point of view.
>
> How does the mapping corrupt buffers? It might impact performance, but why
> do you see corruption?

On map, buffers are cleaned if they're being used for DMA_TO_DEVICE and
DMA_BIDIRECTIONAL, or invalidated in the case of DMA_FROM_DEVICE.

However, because ARM CPUs can now speculatively prefetch, just leaving it
at that results in corruption of buffers used for DMA. So we have to
invalidate DMA_FROM_DEVICE and DMA_BIDIRECTIONAL buffers on unmap to
ensure coherency with DMA operations.

If the CPU writes to a DMA_FROM_DEVICE buffer between map and unmap, the
writes can sit in the cache, and on unmap, they will be discarded.

Cleaning the cache on unmap is not an option; that too can lead to DMA
buffer corruption in the DMA case.

USB and associated host driver must abide by the DMA API buffer
ownership rules otherwise the result will be data corruption; either
that or USB/host driver people need to have a discussion with the
DMA API authors to remove this sensible "restriction".

Oliver Neukum

unread,
Feb 16, 2010, 8:40:02 AM2/16/10
to
Am Dienstag, 16. Februar 2010 10:39:46 schrieb Russell King - ARM Linux:
> However, because ARM CPUs can now speculatively prefetch, just leaving it
> at that results in corruption of buffers used for DMA. So we have to
> invalidate DMA_FROM_DEVICE and DMA_BIDIRECTIONAL buffers on unmap to
> ensure coherency with DMA operations.
>
> If the CPU writes to a DMA_FROM_DEVICE buffer between map and unmap, the
> writes can sit in the cache, and on unmap, they will be discarded.
>
> Cleaning the cache on unmap is not an option; that too can lead to DMA
> buffer corruption in the DMA case.

I am afraid for these controllers the controller driver must be responsible
for all DMA and cache issues. Indicating the exact requirements to the
upper layer would be a battle already lost.
so the safe choice is not to set has_dma and the generic layer will leave
the issue to the lower level.

Regards
Oliver

Shilimkar, Santosh

unread,
Feb 16, 2010, 8:50:01 AM2/16/10
to
> -----Original Message-----
> From: Oliver Neukum [mailto:oli...@neukum.org]
> Sent: Tuesday, February 16, 2010 7:03 PM
> To: Russell King - ARM Linux
> Cc: Shilimkar, Santosh; Catalin Marinas; Pavel Machek; Greg KH; Matthew Dharm; Sergei Shtylyov; Ming
> Lei; Sebastian Siewior; linu...@vger.kernel.org; linux-kernel; linux-arm-kernel; Mankad, Maulik
> Ojas
> Subject: Re: USB mass storage and ARM cache coherency
>
> Am Dienstag, 16. Februar 2010 10:39:46 schrieb Russell King - ARM Linux:
> > However, because ARM CPUs can now speculatively prefetch, just leaving it
> > at that results in corruption of buffers used for DMA. So we have to
> > invalidate DMA_FROM_DEVICE and DMA_BIDIRECTIONAL buffers on unmap to
> > ensure coherency with DMA operations.
> >
> > If the CPU writes to a DMA_FROM_DEVICE buffer between map and unmap, the
> > writes can sit in the cache, and on unmap, they will be discarded.
> >
> > Cleaning the cache on unmap is not an option; that too can lead to DMA
> > buffer corruption in the DMA case.
>
> I am afraid for these controllers the controller driver must be responsible
> for all DMA and cache issues. Indicating the exact requirements to the
> upper layer would be a battle already lost.
> so the safe choice is not to set has_dma and the generic layer will leave
> the issue to the lower level.
This means don't use dma at all which will almost kill the performance.

Regards,
Santosh

Oliver Neukum

unread,
Feb 16, 2010, 8:50:01 AM2/16/10
to
Am Dienstag, 16. Februar 2010 14:40:45 schrieb Shilimkar, Santosh:
> > > If the CPU writes to a DMA_FROM_DEVICE buffer between map and unmap, the
> > > writes can sit in the cache, and on unmap, they will be discarded.
> > >
> > > Cleaning the cache on unmap is not an option; that too can lead to DMA
> > > buffer corruption in the DMA case.
> >
> > I am afraid for these controllers the controller driver must be responsible
> > for all DMA and cache issues. Indicating the exact requirements to the
> > upper layer would be a battle already lost.
> > so the safe choice is not to set has_dma and the generic layer will leave
> > the issue to the lower level.
> This means don't use dma at all which will almost kill the performance.

Why would you be unable to map a buffer in the hcd driver when you know
that you'll use DMA?

Regards
Oliver

Shilimkar, Santosh

unread,
Feb 16, 2010, 9:20:02 AM2/16/10
to
> -----Original Message-----
> From: Oliver Neukum [mailto:oli...@neukum.org]
> Sent: Tuesday, February 16, 2010 7:17 PM
> To: Shilimkar, Santosh
> Cc: Russell King - ARM Linux; Catalin Marinas; Pavel Machek; Greg KH; Matthew Dharm; Sergei Shtylyov;
> Ming Lei; Sebastian Siewior; linu...@vger.kernel.org; linux-kernel; linux-arm-kernel; Mankad,
> Maulik Ojas
> Subject: Re: USB mass storage and ARM cache coherency
>
> Am Dienstag, 16. Februar 2010 14:40:45 schrieb Shilimkar, Santosh:
> > > > If the CPU writes to a DMA_FROM_DEVICE buffer between map and unmap, the
> > > > writes can sit in the cache, and on unmap, they will be discarded.
> > > >
> > > > Cleaning the cache on unmap is not an option; that too can lead to DMA
> > > > buffer corruption in the DMA case.
> > >
> > > I am afraid for these controllers the controller driver must be responsible
> > > for all DMA and cache issues. Indicating the exact requirements to the
> > > upper layer would be a battle already lost.
> > > so the safe choice is not to set has_dma and the generic layer will leave
> > > the issue to the lower level.
> > This means don't use dma at all which will almost kill the performance.
>
> Why would you be unable to map a buffer in the hcd driver when you know
> that you'll use DMA?
Probably it can be. The USB stack has the dma maintenance code at common
place for all controllers and hence we were just trying to see if there is
way to handle that way.

We shall check this possibility

Regards,
Santosh

Oliver Neukum

unread,
Feb 16, 2010, 9:30:01 AM2/16/10
to
Am Dienstag, 16. Februar 2010 15:12:45 schrieb Shilimkar, Santosh:
> > > > I am afraid for these controllers the controller driver must be responsible
> > > > for all DMA and cache issues. Indicating the exact requirements to the
> > > > upper layer would be a battle already lost.
> > > > so the safe choice is not to set has_dma and the generic layer will leave
> > > > the issue to the lower level.
> > > This means don't use dma at all which will almost kill the performance.
> >
> > Why would you be unable to map a buffer in the hcd driver when you know
> > that you'll use DMA?
> Probably it can be. The USB stack has the dma maintenance code at common
> place for all controllers and hence we were just trying to see if there is
> way to handle that way.

This is true. If you can find a clean way to describe your requirements
to the generic layer, that would be better. The problem is that we must
not end up with a dozen flags.

Your original patch however kills ehci, ohci and uhci on some architectures.

Regards
Oliver

Shilimkar, Santosh

unread,
Feb 16, 2010, 9:50:01 AM2/16/10
to
> -----Original Message-----
> From: Oliver Neukum [mailto:oli...@neukum.org]
> Sent: Tuesday, February 16, 2010 7:53 PM
> To: Shilimkar, Santosh
> Cc: Russell King - ARM Linux; Catalin Marinas; Pavel Machek; Greg KH; Matthew Dharm; Sergei Shtylyov;
> Ming Lei; Sebastian Siewior; linu...@vger.kernel.org; linux-kernel; linux-arm-kernel; Mankad,
> Maulik Ojas
> Subject: Re: USB mass storage and ARM cache coherency
>
> Am Dienstag, 16. Februar 2010 15:12:45 schrieb Shilimkar, Santosh:
> > > > > I am afraid for these controllers the controller driver must be responsible
> > > > > for all DMA and cache issues. Indicating the exact requirements to the
> > > > > upper layer would be a battle already lost.
> > > > > so the safe choice is not to set has_dma and the generic layer will leave
> > > > > the issue to the lower level.
> > > > This means don't use dma at all which will almost kill the performance.
> > >
> > > Why would you be unable to map a buffer in the hcd driver when you know
> > > that you'll use DMA?
> > Probably it can be. The USB stack has the dma maintenance code at common
> > place for all controllers and hence we were just trying to see if there is
> > way to handle that way.
>
> This is true. If you can find a clean way to describe your requirements
> to the generic layer, that would be better. The problem is that we must
> not end up with a dozen flags.
Agree
> Your original patch however kills ehci, ohci and uhci on some architectures.
Well the patch was making _ONLY_ control transfers use PIO and rest of
the transfer would still use dma. So not sure how much performance impact would
be because of that.
Another issue with that patch is there are few controllers which can't do PIO
at all and hence the patch would broke those controllers.

So we need a clean way to handle it as you described.

Regards,
Santosh

Alan Stern

unread,
Feb 16, 2010, 10:50:02 AM2/16/10
to
On Tue, 16 Feb 2010, Shilimkar, Santosh wrote:

> > Your original patch however kills ehci, ohci and uhci on some architectures.
> Well the patch was making _ONLY_ control transfers use PIO and rest of
> the transfer would still use dma. So not sure how much performance impact would
> be because of that.
> Another issue with that patch is there are few controllers which can't do PIO
> at all and hence the patch would broke those controllers.

More than "a few"! None of the EHCI, OHCI, or UHCI controllers used in
Intel-compatible desktop and laptop systems can do PIO. That's what
Oliver meant.

Alan Stern

Ming Lei

unread,
Feb 16, 2010, 10:30:01 PM2/16/10
to
2010/2/16 Shilimkar, Santosh <santosh....@ti.com>:

>> > We have a below temporary patch to get around the issue and probably it
>> > needs to be fixed in the right way in the stack because some controllers
>> > may not have PIO option even for control transfers. (e.g. Synopsis EHCI
>> > controller)

Your temporary patch only removes dma map and umap for setup buffer in
control transfer.

>>
>> This seems wrong to me. Buffers for control transfers may be transfered
>> by DMA, so the caches must be flushed on architectures whose caches
>> are not coherent with respect to DMA.
> Indeed and that's what I mentioned in the comment. But we shouldn't have dma
> cache maintenance operations done for the buffers which would use pio based transfer.
>> Would you care to elaborate on the exact nature of the bug you are fixing?
> On the OMAP4 (ARM cortex-a9) platform, the enumeration fails because control
> transfer buffers are corrupted. On our platform, we use PIO mode for control
> transfers and DMA for bulk transfers.

I don't know you mean you use PIO mode for seup buffer only or whole control
transfer(setup sent, data in or data out). If you mean do not use DMA
for setup sent, data in or data out in a control transfer, your
temporary patch maybe is not enough, right?

>
> The current stack performs dma cache maintenance even for the PIO transfers
> which leads to the corruption issue. The control buffers are handled by CPU
> and they already coherent from CPU point of view.

--
Lei Ming

Shilimkar, Santosh

unread,
Feb 17, 2010, 4:00:01 AM2/17/10
to
> -----Original Message-----
> From: Oliver Neukum [mailto:oli...@neukum.org]
> Sent: Tuesday, February 16, 2010 7:53 PM
> To: Shilimkar, Santosh
> Cc: Russell King - ARM Linux; Catalin Marinas; Pavel Machek; Greg KH; Matthew Dharm; Sergei Shtylyov;
> Ming Lei; Sebastian Siewior; linu...@vger.kernel.org; linux-kernel; linux-arm-kernel; Mankad,
> Maulik Ojas
> Subject: Re: USB mass storage and ARM cache coherency
>
> Am Dienstag, 16. Februar 2010 15:12:45 schrieb Shilimkar, Santosh:
> > > > > I am afraid for these controllers the controller driver must be responsible
> > > > > for all DMA and cache issues. Indicating the exact requirements to the
> > > > > upper layer would be a battle already lost.
> > > > > so the safe choice is not to set has_dma and the generic layer will leave
> > > > > the issue to the lower level.
> > > > This means don't use dma at all which will almost kill the performance.
> > >
> > > Why would you be unable to map a buffer in the hcd driver when you know
> > > that you'll use DMA?
> > Probably it can be. The USB stack has the dma maintenance code at common
> > place for all controllers and hence we were just trying to see if there is
> > way to handle that way.
>
> This is true. If you can find a clean way to describe your requirements
> to the generic layer, that would be better. The problem is that we must
> not end up with a dozen flags.
>
> Your original patch however kills ehci, ohci and uhci on some architectures.

How about below approach? Controller driver can set
"uses_pio_for_control" if it can't do dma for control transfer.

diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
index 80995ef..e3eae02 100644
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -1276,7 +1276,7 @@ static int map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,



if (usb_endpoint_xfer_control(&urb->ep->desc)
&& !(urb->transfer_flags & URB_NO_SETUP_DMA_MAP)) {

- if (hcd->self.uses_dma) {
+ if (hcd->self.uses_dma && !hcd->self.uses_pio_for_control) {


urb->setup_dma = dma_map_single(
hcd->self.controller,
urb->setup_packet,

@@ -1335,7 +1335,7 @@ static void unmap_urb_for_dma(struct usb_hcd *hcd, struct urb *urb)



if (usb_endpoint_xfer_control(&urb->ep->desc)
&& !(urb->transfer_flags & URB_NO_SETUP_DMA_MAP)) {

- if (hcd->self.uses_dma)
+ if (hcd->self.uses_dma && !hcd->self.uses_pio_for_control)
dma_unmap_single(hcd->self.controller, urb->setup_dma,
sizeof(struct usb_ctrlrequest),
DMA_TO_DEVICE);
diff --git a/include/linux/usb.h b/include/linux/usb.h
index d7ace1b..ba5b0a2 100644
--- a/include/linux/usb.h
+++ b/include/linux/usb.h
@@ -329,6 +329,9 @@ struct usb_bus {
int busnum; /* Bus number (in order of reg) */
const char *bus_name; /* stable id (PCI slot_name etc) */
u8 uses_dma; /* Does the host controller use DMA? */
+ u8 uses_pio_for_control; /* Does the host controller use PIO
+ * for control tansfers?
+ */
u8 otg_port; /* 0, or number of OTG/HNP port */
unsigned is_b_host:1; /* true during some HNP roleswitches */
unsigned b_hnp_enable:1; /* OTG: did A-Host enable HNP? */

Regards,
Santosh

Oliver Neukum

unread,
Feb 17, 2010, 4:20:02 AM2/17/10
to
Am Mittwoch, 17. Februar 2010 09:55:08 schrieb Shilimkar, Santosh:
> > Your original patch however kills ehci, ohci and uhci on some architectures.
>
> How about below approach? Controller driver can set
> "uses_pio_for_control" if it can't do dma for control transfer.
>
> diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
> index 80995ef..e3eae02 100644
> --- a/drivers/usb/core/hcd.c
> +++ b/drivers/usb/core/hcd.c
> @@ -1276,7 +1276,7 @@ static int map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
>
> if (usb_endpoint_xfer_control(&urb->ep->desc)
> && !(urb->transfer_flags & URB_NO_SETUP_DMA_MAP)) {
> - if (hcd->self.uses_dma) {
> + if (hcd->self.uses_dma && !hcd->self.uses_pio_for_control) {

It is not elegant to describe exceptions. It would be better, if you split up
the flag into two flags, called uses_dma_for_ordinary_transfers and
uses_dma_for control_transfers. Doing so also makes sure you look at
all hcd drivers ;-)

And the tests become straightforward. And please add a detailed comment
to explain why this differentiation is needed on ARM.

Regards
Oliver

Shilimkar, Santosh

unread,
Feb 17, 2010, 4:20:02 AM2/17/10
to
> -----Original Message-----
> From: Oliver Neukum [mailto:oli...@neukum.org]
> Sent: Wednesday, February 17, 2010 2:41 PM
> To: Shilimkar, Santosh
> Cc: Russell King - ARM Linux; Catalin Marinas; Pavel Machek; Greg KH; Matthew Dharm; Sergei Shtylyov;
> Ming Lei; Sebastian Siewior; linu...@vger.kernel.org; linux-kernel; linux-arm-kernel; Mankad,
> Maulik Ojas; Gadiyar, Anand
> Subject: Re: USB mass storage and ARM cache coherency
>
> Am Mittwoch, 17. Februar 2010 09:55:08 schrieb Shilimkar, Santosh:
> > > Your original patch however kills ehci, ohci and uhci on some architectures.
> >
> > How about below approach? Controller driver can set
> > "uses_pio_for_control" if it can't do dma for control transfer.
> >
> > diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
> > index 80995ef..e3eae02 100644
> > --- a/drivers/usb/core/hcd.c
> > +++ b/drivers/usb/core/hcd.c
> > @@ -1276,7 +1276,7 @@ static int map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
> >
> > if (usb_endpoint_xfer_control(&urb->ep->desc)
> > && !(urb->transfer_flags & URB_NO_SETUP_DMA_MAP)) {
> > - if (hcd->self.uses_dma) {
> > + if (hcd->self.uses_dma && !hcd->self.uses_pio_for_control) {
>
> It is not elegant to describe exceptions. It would be better, if you split up
> the flag into two flags, called uses_dma_for_ordinary_transfers and
> uses_dma_for control_transfers. Doing so also makes sure you look at
> all hcd drivers ;-)
>
Good point. Negative checks are any way not elegant

> And the tests become straightforward. And please add a detailed comment
> to explain why this differentiation is needed on ARM.
OK. I shall create a patch with description about the problem.

Thanks for feedback!!

Regards,
Santosh

Oliver Neukum

unread,
Feb 17, 2010, 4:20:02 AM2/17/10
to
Am Mittwoch, 17. Februar 2010 10:05:43 schrieb Benjamin Herrenschmidt:
> > Would you care to elaborate on the exact nature of the bug you are
> > fixing?
>
> I missed part of this thread, so forgive me if I'm a bit off here, but
> if the problem is indeed I$/D$ cache coherency vs. PIO transfers, then
> this is a long solved issue on other archs such as ppc (and I think
> sparc).

We should have changed the subject line.

There's a second problem. It turns out that on ARM
mapping for DMA must not be done if PIO will be used. Some HCDs
use PIO for some transfers but DMA for others. The generic layer
must learn about this.

Benjamin Herrenschmidt

unread,
Feb 17, 2010, 4:20:01 AM2/17/10
to
On Tue, 2010-02-16 at 09:22 +0100, Oliver Neukum wrote:
> This seems wrong to me. Buffers for control transfers may be
> transfered
> by DMA, so the caches must be flushed on architectures whose caches
> are not coherent with respect to DMA.
>
> Would you care to elaborate on the exact nature of the bug you are
> fixing?

I missed part of this thread, so forgive me if I'm a bit off here, but


if the problem is indeed I$/D$ cache coherency vs. PIO transfers, then

this is a long solved issue on other archs such as ppc (and I _think_
sparc).

The way we do it, at least on powerpc which is PIPT, is to keep track on
a per-page basis, whether a given page is clean for execution using
PG_arch1 bit. This bit is cleared when a new page is popped into the
page cache, and we clear it from flush_dcache_page() iirc (you may want
to dbl check I don't have the code at hand right now, or rather, I do
but I'm to lazy to look right now :-)

Any page with that not set is mapped into userspace with execute
permission disabled. We do the flush and set PG_arch1 on the first exec
fault to that page.

Cheers,
Ben.

Benjamin Herrenschmidt

unread,
Feb 17, 2010, 4:50:02 AM2/17/10
to
On Wed, 2010-02-17 at 10:15 +0100, Oliver Neukum wrote:
> We should have changed the subject line.
>
> There's a second problem. It turns out that on ARM
> mapping for DMA must not be done if PIO will be used. Some HCDs
> use PIO for some transfers but DMA for others. The generic layer
> must learn about this.

Ah, that makes a lot of sense and the same problem would happen on
any non-DMA coherent architecture, including some embedded ppc's.

I can see why the dma unmap would invalidate the dcache and blow
away the PIO.

What bugs me here is that the dma_map_* operation should always
be done at the lowest level, ie, the actual HCD driver, and thus
it should be up to the HCD to decide whether to dma_map or not
depending on whether it's going to do DMA or not. I haven't
scrutinized USB lately but if that isn't the case and the dma_map_*
operations are done behind your back by the USB core then that needs to
be changed in a way or another, or hooked at least.

Cheers,
Ben.

Andy Green

unread,
Feb 17, 2010, 5:00:03 AM2/17/10
to
On 02/17/10 10:50, Somebody in the thread at some point said:

> On Mon, Feb 08, 2010 at 11:03:14AM +0100, Andy Green wrote:
>> On 02/08/10 10:51, Somebody in the thread at some point said:
>>
>>> We could of course flush the caches every time we get a page fault but
>>> that's far from optimal, especially since DMA-capable drivers to do not
>>> pollute the D-cache and don't need this extra flushing. Note that the
>>> recent ARM processors have PIPT caches but separate for I and D and it's
>>> the PIO drivers that pollute the D-cache.
>>
>> Just noting that AFAIK iMX31 USB and MMC drivers both are PIO at the
>> moment, for lack of any platform DMA support of its unusual DMA engine.
>
> The EHCI module has its own DMA engine and has nothing to do with the
> SDMA engine.

You're right, my mistake. iMX31 MMC is PIO due to no SDMA support though.

Sascha Hauer

unread,
Feb 17, 2010, 5:00:02 AM2/17/10
to
On Mon, Feb 08, 2010 at 11:03:14AM +0100, Andy Green wrote:
> On 02/08/10 10:51, Somebody in the thread at some point said:
>
>> We could of course flush the caches every time we get a page fault but
>> that's far from optimal, especially since DMA-capable drivers to do not
>> pollute the D-cache and don't need this extra flushing. Note that the
>> recent ARM processors have PIPT caches but separate for I and D and it's
>> the PIO drivers that pollute the D-cache.
>
> Just noting that AFAIK iMX31 USB and MMC drivers both are PIO at the
> moment, for lack of any platform DMA support of its unusual DMA engine.

The EHCI module has its own DMA engine and has nothing to do with the
SDMA engine.

Sascha


--
Pengutronix e.K. | |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |

Russell King - ARM Linux

unread,
Feb 17, 2010, 5:10:01 AM2/17/10
to
On Wed, Feb 17, 2010 at 08:05:43PM +1100, Benjamin Herrenschmidt wrote:
> On Tue, 2010-02-16 at 09:22 +0100, Oliver Neukum wrote:
> > This seems wrong to me. Buffers for control transfers may be
> > transfered
> > by DMA, so the caches must be flushed on architectures whose caches
> > are not coherent with respect to DMA.
> >
> > Would you care to elaborate on the exact nature of the bug you are
> > fixing?
>
> I missed part of this thread, so forgive me if I'm a bit off here, but
> if the problem is indeed I$/D$ cache coherency vs. PIO transfers, then
> this is a long solved issue on other archs such as ppc (and I _think_
> sparc).

Nope. It's to do with mapping a buffer for DMA, and then doing PIO
reads/writes to it.

With speculative prefetches, you have to deal with cache coherency with
hardware DMA on DMA unmap. If you've written to the buffer in violation
of the DMA API buffer ownership rules, then your writes get thrown away
resulting in immediate data corruption.

Oliver Neukum

unread,
Feb 17, 2010, 5:10:02 AM2/17/10
to
Am Mittwoch, 17. Februar 2010 10:40:09 schrieb Benjamin Herrenschmidt:
> What bugs me here is that the dma_map_* operation should always
> be done at the lowest level, ie, the actual HCD driver, and thus
> it should be up to the HCD to decide whether to dma_map or not
> depending on whether it's going to do DMA or not. I haven't
> scrutinized USB lately but if that isn't the case and the dma_map_*
> operations are done behind your back by the USB core then that needs to
> be changed in a way or another, or hooked at least.

No problem here. USB core does the mapping only if the low-level driver
so requests. The only exception is in usb_buffer_alloc(), but that boils
down to dma_alloc_coherent()

Regards
Oliver

Benjamin Herrenschmidt

unread,
Feb 17, 2010, 5:10:02 AM2/17/10
to
On Wed, 2010-02-17 at 09:55 +0000, Russell King - ARM Linux wrote:
> Nope. It's to do with mapping a buffer for DMA, and then doing PIO
> reads/writes to it.
>
> With speculative prefetches, you have to deal with cache coherency with
> hardware DMA on DMA unmap. If you've written to the buffer in violation
> of the DMA API buffer ownership rules, then your writes get thrown away
> resulting in immediate data corruption.

Right, and this exact same problem will bite some embedded powerpc
too I suppose :-)

Hrm... actually not :-) We don't do the invalidate at unmap time
today because we know 44x have such a broken prefetcher that we disable
it ... interesting considering that there are machines around that
do non-coherent DMA with 750's style chips who -do- have a prefetcher...
damn, we have a bug :-)

In any case, same problem here.

See my reply to Oliver. Basically, the problem boils down to the
dma_map/unmap being done at the wrong layer. The driver should
simply not do these if it's going to do PIO over that range.

Cheers,
Ben.

Oliver Neukum

unread,
Feb 17, 2010, 5:30:02 AM2/17/10
to
Am Mittwoch, 17. Februar 2010 11:18:01 schrieb Benjamin Herrenschmidt:
> > No problem here. USB core does the mapping only if the low-level driver
> > so requests. The only exception is in usb_buffer_alloc(), but that boils
> > down to dma_alloc_coherent()
>
> Allright, so why do we need to "fix" anything ? Or is the whole thread
> moot ? :-)

The request a low-level driver does is all or nothing. Either DMA
issues have to be handled by that driver alone, or a finer-grained
description of the DMA requirements is needed. A fix using the latter
approach is being worked on.

Benjamin Herrenschmidt

unread,
Feb 17, 2010, 5:30:02 AM2/17/10
to
On Wed, 2010-02-17 at 11:09 +0100, Oliver Neukum wrote:
>
> No problem here. USB core does the mapping only if the low-level driver
> so requests. The only exception is in usb_buffer_alloc(), but that boils
> down to dma_alloc_coherent()

Allright, so why do we need to "fix" anything ? Or is the whole thread
moot ? :-)

It's pretty clear that between dma_map* and subsequent unmap, the memory
is owned by the device and must not be touched by the CPU. If that is
violated, then we have a driver bug.

Cheers,
Ben.

Benjamin Herrenschmidt

unread,
Feb 17, 2010, 7:20:02 AM2/17/10
to
On Wed, 2010-02-17 at 11:23 +0100, Oliver Neukum wrote:
>
> The request a low-level driver does is all or nothing. Either DMA
> issues have to be handled by that driver alone, or a finer-grained
> description of the DMA requirements is needed. A fix using the latter
> approach is being worked on.

Well, that's what I'm trying to understand.

IE. It's a pretty strong rule ... don't do CPU accesses between dma_map
and unmap. So it's all in driver land at that stage. I'm not sure how
the DMA requirements get into the picture here. IE. That rule is
globally true. It's not going to hurt just non-coherent archs, it's
going to hurt anybody using swiotlb too... So I don't see you need more
info about the DMA requirements, but maybe I did miss something :-)

Cheers,
Ben.

Jamie Lokier

unread,
Feb 17, 2010, 7:40:02 AM2/17/10
to
Russell King - ARM Linux wrote:
> On Tue, Feb 16, 2010 at 10:07:20AM +0100, Oliver Neukum wrote:
> > Am Dienstag, 16. Februar 2010 09:55:55 schrieb Shilimkar, Santosh:
> > > > Would you care to elaborate on the exact nature of the bug you are fixing?
> > > On the OMAP4 (ARM cortex-a9) platform, the enumeration fails because control
> > > transfer buffers are corrupted. On our platform, we use PIO mode for control
> > > transfers and DMA for bulk transfers.
> > >
> > > The current stack performs dma cache maintenance even for the PIO transfers
> > > which leads to the corruption issue. The control buffers are handled by CPU
> > > and they already coherent from CPU point of view.
> >
> > How does the mapping corrupt buffers? It might impact performance, but why
> > do you see corruption?
>
> On map, buffers are cleaned if they're being used for DMA_TO_DEVICE and
> DMA_BIDIRECTIONAL, or invalidated in the case of DMA_FROM_DEVICE.
>
> However, because ARM CPUs can now speculatively prefetch, just leaving it
> at that results in corruption of buffers used for DMA. So we have to
> invalidate DMA_FROM_DEVICE and DMA_BIDIRECTIONAL buffers on unmap to
> ensure coherency with DMA operations.
>
> If the CPU writes to a DMA_FROM_DEVICE buffer between map and unmap, the
> writes can sit in the cache, and on unmap, they will be discarded.
>
> Cleaning the cache on unmap is not an option; that too can lead to DMA
> buffer corruption in the DMA case.

Provided the buffers are cleaned on map for
DMA_TO_DEVICE/DMA_BIDIRECTIONAL, I don't see how cleaning on unmap for
DMA_FROM_DEVICE/DMA_BIDIRECTIONAL can cause corruption. The only way
to get dirty cache lines while mapped is if the CPU did PIO to them.
If it was real DMA, the second clean should be a no-op. (Assume it's
all one or the other).

Can you explain why cleanining the cache on unmap (as well as map, in
DMA_BIDIRECTIONAL case) is not an option? Just curious, because I
don't see what would go wrong.

> USB and associated host driver must abide by the DMA API buffer
> ownership rules otherwise the result will be data corruption; either
> that or USB/host driver people need to have a discussion with the
> DMA API authors to remove this sensible "restriction".

Just in case my question gives the wrong impression, I agree that the
DMA API must be followed. Additional flushes/cleans are not good for
performance either.

-- Jamie

Catalin Marinas

unread,
Feb 17, 2010, 10:30:03 AM2/17/10
to
On Wed, 2010-02-17 at 20:05 +1100, Benjamin Herrenschmidt wrote:
> On Tue, 2010-02-16 at 09:22 +0100, Oliver Neukum wrote:
> > This seems wrong to me. Buffers for control transfers may be
> > transfered
> > by DMA, so the caches must be flushed on architectures whose caches
> > are not coherent with respect to DMA.
> >
> > Would you care to elaborate on the exact nature of the bug you are
> > fixing?
>
> I missed part of this thread, so forgive me if I'm a bit off here, but
> if the problem is indeed I$/D$ cache coherency vs. PIO transfers, then
> this is a long solved issue on other archs such as ppc (and I _think_
> sparc).

The thread I started was indeed regarding I/D cache coherency and PIO.
But it diverged into DMA issues a few days ago (should have been a new
thread).

> The way we do it, at least on powerpc which is PIPT, is to keep track on
> a per-page basis, whether a given page is clean for execution using
> PG_arch1 bit. This bit is cleared when a new page is popped into the
> page cache, and we clear it from flush_dcache_page() iirc (you may want
> to dbl check I don't have the code at hand right now, or rather, I do
> but I'm to lazy to look right now :-)

We do the same on ARM. The problem with most (all) HCD drivers that do
PIO is that they copy the data to the transfer buffer but there is no
call in this driver to flush_dcache_page(). The upper mass storage or
filesystem layers don't call this function either, so there isn't
anything that would set the PG_arch1 bit.

--
Catalin

Catalin Marinas

unread,
Feb 17, 2010, 10:50:02 AM2/17/10
to
On Wed, 2010-02-17 at 20:05 +1100, Benjamin Herrenschmidt wrote:
> On Tue, 2010-02-16 at 09:22 +0100, Oliver Neukum wrote:
> > This seems wrong to me. Buffers for control transfers may be
> > transfered
> > by DMA, so the caches must be flushed on architectures whose caches
> > are not coherent with respect to DMA.
> >=20

> > Would you care to elaborate on the exact nature of the bug you are
> > fixing?
>=20

> I missed part of this thread, so forgive me if I'm a bit off here, but
> if the problem is indeed I$/D$ cache coherency vs. PIO transfers, then
> this is a long solved issue on other archs such as ppc (and I _think_
> sparc).

The thread I started was indeed regarding I/D cache coherency and PIO.


But it diverged into DMA issues a few days ago (should have been a new
thread).

> The way we do it, at least on powerpc which is PIPT, is to keep track on


> a per-page basis, whether a given page is clean for execution using
> PG_arch1 bit. This bit is cleared when a new page is popped into the
> page cache, and we clear it from flush_dcache_page() iirc (you may want
> to dbl check I don't have the code at hand right now, or rather, I do
> but I'm to lazy to look right now :-)

We do the same on ARM. The problem with most (all) HCD drivers that do


PIO is that they copy the data to the transfer buffer but there is no
call in this driver to flush_dcache_page(). The upper mass storage or
filesystem layers don't call this function either, so there isn't
anything that would set the PG_arch1 bit.

--=20
Catalin
--
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

Catalin Marinas

unread,
Feb 17, 2010, 10:50:03 AM2/17/10
to
On Wed, 2010-02-17 at 20:05 +1100, Benjamin Herrenschmidt wrote:
> On Tue, 2010-02-16 at 09:22 +0100, Oliver Neukum wrote:
> > This seems wrong to me. Buffers for control transfers may be
> > transfered
> > by DMA, so the caches must be flushed on architectures whose caches
> > are not coherent with respect to DMA.
> >=20

> > Would you care to elaborate on the exact nature of the bug you are
> > fixing?
>=20

> I missed part of this thread, so forgive me if I'm a bit off here, but
> if the problem is indeed I$/D$ cache coherency vs. PIO transfers, then
> this is a long solved issue on other archs such as ppc (and I _think_
> sparc).

The thread I started was indeed regarding I/D cache coherency and PIO.


But it diverged into DMA issues a few days ago (should have been a new
thread).

> The way we do it, at least on powerpc which is PIPT, is to keep track on


> a per-page basis, whether a given page is clean for execution using
> PG_arch1 bit. This bit is cleared when a new page is popped into the
> page cache, and we clear it from flush_dcache_page() iirc (you may want
> to dbl check I don't have the code at hand right now, or rather, I do
> but I'm to lazy to look right now :-)

We do the same on ARM. The problem with most (all) HCD drivers that do


PIO is that they copy the data to the transfer buffer but there is no
call in this driver to flush_dcache_page(). The upper mass storage or
filesystem layers don't call this function either, so there isn't
anything that would set the PG_arch1 bit.

--=20
Catalin
--
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

Catalin Marinas

unread,
Feb 17, 2010, 11:00:01 AM2/17/10
to
On Wed, 2010-02-17 at 20:05 +1100, Benjamin Herrenschmidt wrote:
> On Tue, 2010-02-16 at 09:22 +0100, Oliver Neukum wrote:
> > This seems wrong to me. Buffers for control transfers may be
> > transfered
> > by DMA, so the caches must be flushed on architectures whose caches
> > are not coherent with respect to DMA.
> >=20

> > Would you care to elaborate on the exact nature of the bug you are
> > fixing?
>=20

> I missed part of this thread, so forgive me if I'm a bit off here, but
> if the problem is indeed I$/D$ cache coherency vs. PIO transfers, then
> this is a long solved issue on other archs such as ppc (and I _think_
> sparc).

The thread I started was indeed regarding I/D cache coherency and PIO.


But it diverged into DMA issues a few days ago (should have been a new
thread).

> The way we do it, at least on powerpc which is PIPT, is to keep track on


> a per-page basis, whether a given page is clean for execution using
> PG_arch1 bit. This bit is cleared when a new page is popped into the
> page cache, and we clear it from flush_dcache_page() iirc (you may want
> to dbl check I don't have the code at hand right now, or rather, I do
> but I'm to lazy to look right now :-)

We do the same on ARM. The problem with most (all) HCD drivers that do


PIO is that they copy the data to the transfer buffer but there is no
call in this driver to flush_dcache_page(). The upper mass storage or
filesystem layers don't call this function either, so there isn't
anything that would set the PG_arch1 bit.

--=20
Catalin
--
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

Catalin Marinas

unread,
Feb 17, 2010, 11:00:02 AM2/17/10
to
On Wed, 2010-02-17 at 20:05 +1100, Benjamin Herrenschmidt wrote:
> On Tue, 2010-02-16 at 09:22 +0100, Oliver Neukum wrote:
> > This seems wrong to me. Buffers for control transfers may be
> > transfered
> > by DMA, so the caches must be flushed on architectures whose caches
> > are not coherent with respect to DMA.
> >=20

> > Would you care to elaborate on the exact nature of the bug you are
> > fixing?
>=20

> I missed part of this thread, so forgive me if I'm a bit off here, but
> if the problem is indeed I$/D$ cache coherency vs. PIO transfers, then
> this is a long solved issue on other archs such as ppc (and I _think_
> sparc).

The thread I started was indeed regarding I/D cache coherency and PIO.


But it diverged into DMA issues a few days ago (should have been a new
thread).

> The way we do it, at least on powerpc which is PIPT, is to keep track on


> a per-page basis, whether a given page is clean for execution using
> PG_arch1 bit. This bit is cleared when a new page is popped into the
> page cache, and we clear it from flush_dcache_page() iirc (you may want
> to dbl check I don't have the code at hand right now, or rather, I do
> but I'm to lazy to look right now :-)

We do the same on ARM. The problem with most (all) HCD drivers that do


PIO is that they copy the data to the transfer buffer but there is no
call in this driver to flush_dcache_page(). The upper mass storage or
filesystem layers don't call this function either, so there isn't
anything that would set the PG_arch1 bit.

--=20
Catalin
--
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

Catalin Marinas

unread,
Feb 17, 2010, 11:20:02 AM2/17/10
to
SORRY - one more message to apologise for the multiple reposts (and
automatically appended legal disclaimer). I've been moved to Exchange
2007 and trying to use Evolution + Exchange-MAPI. It looks like it went
terribly wrong.

Catalin

> _______________________________________________
> linux-arm-kernel mailing list
> linux-ar...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

Catalin Marinas

unread,
Feb 17, 2010, 11:30:02 AM2/17/10
to
SORRY - one more message to apologise for the multiple reposts (and
automatically appended legal disclaimer). I've been moved to Exchange
2007 and trying to use Evolution + Exchange-MAPI. It looks like it went
terribly wrong.

Catalin


On Wed, 2010-02-17 at 15:40 +0000, Catalin Marinas wrote:

> On Wed, 2010-02-17 at 20:05 +1100, Benjamin Herrenschmidt wrote:
> > On Tue, 2010-02-16 at 09:22 +0100, Oliver Neukum wrote:
> > > This seems wrong to me. Buffers for control transfers may be
> > > transfered
> > > by DMA, so the caches must be flushed on architectures whose caches
> > > are not coherent with respect to DMA.

> > >=3D20


> > > Would you care to elaborate on the exact nature of the bug you are
> > > fixing?

> >=3D20


> > I missed part of this thread, so forgive me if I'm a bit off here, but
> > if the problem is indeed I$/D$ cache coherency vs. PIO transfers, then
> > this is a long solved issue on other archs such as ppc (and I _think_
> > sparc).

>=20


> The thread I started was indeed regarding I/D cache coherency and PIO.
> But it diverged into DMA issues a few days ago (should have been a new
> thread).

>=20
> > The way we do it, at least on powerpc which is PIPT, is to keep track o=


n
> > a per-page basis, whether a given page is clean for execution using
> > PG_arch1 bit. This bit is cleared when a new page is popped into the
> > page cache, and we clear it from flush_dcache_page() iirc (you may want
> > to dbl check I don't have the code at hand right now, or rather, I do
> > but I'm to lazy to look right now :-)

>=20


> We do the same on ARM. The problem with most (all) HCD drivers that do
> PIO is that they copy the data to the transfer buffer but there is no
> call in this driver to flush_dcache_page(). The upper mass storage or
> filesystem layers don't call this function either, so there isn't
> anything that would set the PG_arch1 bit.

>=20
> --=3D20
> Catalin
> --=20


> IMPORTANT NOTICE: The contents of this email and any attachments are
> confidential and may also be privileged. If you are not the intended
> recipient, please notify the sender immediately and do not disclose
> the contents to any other person, use it for any purpose, or store or
> copy the information in any medium. Thank you.

>=20


> _______________________________________________
> linux-arm-kernel mailing list
> linux-ar...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

--

Alan Stern

unread,
Feb 17, 2010, 12:10:03 PM2/17/10
to

Why do you skip mapping the setup packet but not the data packet?

Alan Stern

Russell King - ARM Linux

unread,
Feb 17, 2010, 3:30:03 PM2/17/10
to
On Wed, Feb 17, 2010 at 12:02:21PM -0500, Alan Stern wrote:
> Why do you skip mapping the setup packet but not the data packet?

This is something of a FAQ in this thread. Here are the responses to
similar questions yesterday:

"Gadiyar, Anand" <gad...@ti.com> said:
> Not really. For instance, in the case of the DMA engine in the MUSB
> controller in OMAP3, we can only use DMA with endpoints other than
> EP0, and EP0 is what is used for control transfers.
>
> It's not PIO for all the endpoints or DMA for all of them.

"Shilimkar, Santosh" <santosh....@ti.com> said:
> On the OMAP4 (ARM cortex-a9) platform, the enumeration fails because control
> transfer buffers are corrupted. On our platform, we use PIO mode for control
> transfers and DMA for bulk transfers.
>
> The current stack performs dma cache maintenance even for the PIO transfers
> which leads to the corruption issue. The control buffers are handled by CPU
> and they already coherent from CPU point of view.

--

Benjamin Herrenschmidt

unread,
Feb 17, 2010, 3:40:03 PM2/17/10
to
On Wed, 2010-02-17 at 15:27 +0000, Catalin Marinas wrote:
> We do the same on ARM. The problem with most (all) HCD drivers that do
> PIO is that they copy the data to the transfer buffer but there is no
> call in this driver to flush_dcache_page(). The upper mass storage or
> filesystem layers don't call this function either, so there isn't
> anything that would set the PG_arch1 bit.

Actually, clear it :-)

I suppose that's one thing that needs to be fixed in the drivers.

Cheers,
Ben.

Gadiyar, Anand

unread,
Feb 17, 2010, 3:40:04 PM2/17/10
to

I think that's oversight. For this controller, we need to skip mapping
all buffers used to do transfers on EP0, which is all control transfers.

Will fix in the next version of the patch.

- Anand

Russell King - ARM Linux

unread,
Feb 17, 2010, 3:50:02 PM2/17/10
to
On Thu, Feb 18, 2010 at 07:37:00AM +1100, Benjamin Herrenschmidt wrote:
> On Wed, 2010-02-17 at 15:27 +0000, Catalin Marinas wrote:
> > We do the same on ARM. The problem with most (all) HCD drivers that do
> > PIO is that they copy the data to the transfer buffer but there is no
> > call in this driver to flush_dcache_page(). The upper mass storage or
> > filesystem layers don't call this function either, so there isn't
> > anything that would set the PG_arch1 bit.
>
> Actually, clear it :-)
>
> I suppose that's one thing that needs to be fixed in the drivers.

No, because that'd probably bugger up the Sparc64 method of delaying
flush_dcache_page.

This method works as follows:

- a page cache page is allocated - this has PG_arch_1 clear.

- IO happens on it and it's placed into the page cache. PG_arch_1 is
still clear.

- someone calls read()/write() which accesses the page. The generic
file IO layers call flush_dcache_page() in response to read()/write()
fs calls. flush_dcache_page() spots that the page is not yet mapped
into userspace, and sets PG_arch_1 to mark the fact that the kernel
mapping is dirty.

- when someone maps the page, we check PG_arch_1 in update_mmu_cache.
If PG_arch_1 is set, we flush the kernel mapping.

Clearly, if we go around having drivers clearing PG_arch_1, this is going
to break horribly.

Benjamin Herrenschmidt

unread,
Feb 17, 2010, 5:40:02 PM2/17/10
to
On Wed, 2010-02-17 at 20:44 +0000, Russell King - ARM Linux wrote:
> No, because that'd probably bugger up the Sparc64 method of delaying
> flush_dcache_page.
>
> This method works as follows:
>
> - a page cache page is allocated - this has PG_arch_1 clear.
>
> - IO happens on it and it's placed into the page cache. PG_arch_1 is
> still clear.
>
> - someone calls read()/write() which accesses the page. The generic
> file IO layers call flush_dcache_page() in response to
> read()/write()
> fs calls. flush_dcache_page() spots that the page is not yet mapped
> into userspace, and sets PG_arch_1 to mark the fact that the kernel
> mapping is dirty.
>
> - when someone maps the page, we check PG_arch_1 in update_mmu_cache.
> If PG_arch_1 is set, we flush the kernel mapping.
>
> Clearly, if we go around having drivers clearing PG_arch_1, this is
> going to break horribly.

Ok, you do things very differently than us on ppc then. We clear
PG_arch_1 in flush_dcache_page(), and we set it when the page has been
cache cleaned for execution.

We assume that anybody that dirties a page in the kernel will call
flush_dcache_page() which removes our PG_arch_1 bit thus marking the
page "dirty".

Note that from experience, doing the check & flushes in
update_mmu_cache() is racy on SMP. At least for I$/D$, we have the case
where processor one does set_pte followed by update_mmu_cache(). The
later isn't done yet but processor 2 sees the PTE now and starts using
it, cache hasn't been fully flushed yet. You may avoid that race in some
ways, but on ppc, I've stopped using that.

I now do things directly in set_pte_at(). In fact, that's why I want
your patch to change update_mmu_cache() to take a PTE pointer :-) Since
my set_pte_at() can now remove the _PAGE_EXEC bit, I need
update_mmu_cache() to re-read the PTE before it updates the hash table
or TLB.

Cheers,
Ben.

Oliver Neukum

unread,
Feb 18, 2010, 2:00:02 AM2/18/10
to
Am Mittwoch, 17. Februar 2010 21:30:24 schrieb Gadiyar, Anand:
> > Why do you skip mapping the setup packet but not the data packet?
> >
>
> I think that's oversight. For this controller, we need to skip mapping
> all buffers used to do transfers on EP0, which is all control transfers.

One thing more. Do you have an issue with EP 0 only or all control
endpoints? EP 0 must be control, but devices are within spec if they
have multiple control endpoints provided EP 0 is control.

Regards
Oliver

Gadiyar, Anand

unread,
Feb 18, 2010, 2:20:02 AM2/18/10
to
Oliver Neukum wrote:
> Am Mittwoch, 17. Februar 2010 21:30:24 schrieb Gadiyar, Anand:
> > > Why do you skip mapping the setup packet but not the data packet?
> > >
> >
> > I think that's oversight. For this controller, we need to skip mapping
> > all buffers used to do transfers on EP0, which is all control transfers.
>
> One thing more. Do you have an issue with EP 0 only or all control
> endpoints? EP 0 must be control, but devices are within spec if they
> have multiple control endpoints provided EP 0 is control.

Sorry for the confusion. The issue is not with EP 0 of devices
connected to the controller; the problem is with EP 0 on the host
controller itself.

The controller in question is the MUSB OTG controller present in
OMAPs, Davinci chips, and some Blackfins. The MUSB HCD driver is
written such that it carries out all control transfers on EP 0 of
the controller. All bulk transfers are carried out on other hardware
endpoints.

(This is the same "hardware endpoint" that is used in when the MUSB
is used in gadget mode.)


I'm not really sure why EP0 was chosen for control transfers, or
if there is a restriction that we *need* to use it. Let me study
the docs some more.

The problem is that with the driver code as written today, we use
EP 0 for all control transfers, and the DMA engine cannot do DMA
to this endpoint's FIFO.

- Anand

Catalin Marinas

unread,
Feb 19, 2010, 12:20:06 PM2/19/10
to
On Wed, 2010-02-17 at 22:31 +0000, Benjamin Herrenschmidt wrote:
> On Wed, 2010-02-17 at 20:44 +0000, Russell King - ARM Linux wrote:
> > No, because that'd probably bugger up the Sparc64 method of delaying
> > flush_dcache_page.
> >
> > This method works as follows:
> >
> > - a page cache page is allocated - this has PG_arch_1 clear.
> >
> > - IO happens on it and it's placed into the page cache. PG_arch_1 is
> > still clear.
> >
> > - someone calls read()/write() which accesses the page. The generic
> > file IO layers call flush_dcache_page() in response to
> > read()/write()
> > fs calls. flush_dcache_page() spots that the page is not yet mapped
> > into userspace, and sets PG_arch_1 to mark the fact that the kernel
> > mapping is dirty.
> >
> > - when someone maps the page, we check PG_arch_1 in update_mmu_cache.
> > If PG_arch_1 is set, we flush the kernel mapping.
> >
> > Clearly, if we go around having drivers clearing PG_arch_1, this is
> > going to break horribly.
>
> Ok, you do things very differently than us on ppc then. We clear
> PG_arch_1 in flush_dcache_page(), and we set it when the page has been
> cache cleaned for execution.

For this perspective it's not that different, just that we use the
negated PG_arch_1.

> We assume that anybody that dirties a page in the kernel will call
> flush_dcache_page() which removes our PG_arch_1 bit thus marking the
> page "dirty".

This assumption is not valid with some drivers like USB HCD doing PIO.
But, yes, that's how it should be done.

> Note that from experience, doing the check & flushes in
> update_mmu_cache() is racy on SMP. At least for I$/D$, we have the case
> where processor one does set_pte followed by update_mmu_cache(). The
> later isn't done yet but processor 2 sees the PTE now and starts using
> it, cache hasn't been fully flushed yet. You may avoid that race in some
> ways, but on ppc, I've stopped using that.

I think that's possible on ARM too. Having two threads on different
CPUs, one thread triggers a prefetch abort (instruction page fault) on
CPU0 but the second thread on CPU1 may branch into this page after
set_pte() (hence not fault) but before update_mmu_cache() doing the
flush.

On ARM11MPCore we flush the caches in flush_dcache_page() because the
cache maintenance operations weren't visible to the other CPUs.
Cortex-A9 broadcasts the cache operations in hardware so we can use lazy
flushing but with the race you pointed out.

Using set_pte_at() for delayed flushing may be a better option for ARM
as well (and maybe Documentation/cachetlb.txt updated).

Thanks.

--
Catalin

Catalin Marinas

unread,
Feb 19, 2010, 12:40:03 PM2/19/10
to
On Fri, 2010-02-19 at 17:15 +0000, Catalin Marinas wrote:
> On Wed, 2010-02-17 at 22:31 +0000, Benjamin Herrenschmidt wrote:
> > On Wed, 2010-02-17 at 20:44 +0000, Russell King - ARM Linux wrote:
> > > No, because that'd probably bugger up the Sparc64 method of delaying
> > > flush_dcache_page.
> > >
> > > This method works as follows:
> > >
> > > - a page cache page is allocated - this has PG_arch_1 clear.
> > >
> > > - IO happens on it and it's placed into the page cache. PG_arch_1 is
> > > still clear.
> > >
> > > - someone calls read()/write() which accesses the page. The generic
> > > file IO layers call flush_dcache_page() in response to
> > > read()/write()
> > > fs calls. flush_dcache_page() spots that the page is not yet mapped
> > > into userspace, and sets PG_arch_1 to mark the fact that the kernel
> > > mapping is dirty.
> > >
> > > - when someone maps the page, we check PG_arch_1 in update_mmu_cache.
> > > If PG_arch_1 is set, we flush the kernel mapping.
> > >
> > > Clearly, if we go around having drivers clearing PG_arch_1, this is
> > > going to break horribly.
> >
> > Ok, you do things very differently than us on ppc then. We clear
> > PG_arch_1 in flush_dcache_page(), and we set it when the page has been
> > cache cleaned for execution.
>
> For this perspective it's not that different, just that we use the
> negated PG_arch_1.

I got your point now (after reading the replies on linux-arch :)).

So PPC assumes that if PG_arch_1 is clear (the default), the page wasn't
cleaned. If there is no call to flush_dcache_page() but the page gets
mapped to user space, update_mmu_cache() (or set_pte_at()) would simply
assume that the page was dirtied, flush the caches and set this bit.

We could easily do this on ARM as well and assume that the page is dirty
if !PG_arch_1. But it only partially solves the problem (only for
faulted-in pages).

If a page is already mapped in user space, flush_dcache_page() on ARM
does the flushing rather than deferring it to update_mmu_cache(). The
PIO HCD drivers, however, don't call flush_dcache_page(). Is it possible
that the HCD could transfer data into a page cache page already mapped
in user space? My understanding is that the scenario above is possible.

Oliver Neukum

unread,
Feb 19, 2010, 4:00:01 PM2/19/10
to
Am Freitag, 19. Februar 2010 18:36:51 schrieb Catalin Marinas:
> If a page is already mapped in user space, flush_dcache_page() on ARM
> does the flushing rather than deferring it to update_mmu_cache(). The
> PIO HCD drivers, however, don't call flush_dcache_page(). Is it possible
> that the HCD could transfer data into a page cache page already mapped
> in user space? My understanding is that the scenario above is possible.

Yes, video drivers do that.

Regards
Oliver

Pete Zaitcev

unread,
Feb 20, 2010, 2:30:01 AM2/20/10
to
On Tue, 16 Feb 2010 14:21:48 +0530
"Gadiyar, Anand" <gad...@ti.com> wrote:

> > hcd->self.uses_dma = (dev->dma_mask != NULL);
> >
> > Is it easier to make sure that PIO devices don't have dev->dma_mask set?


>
> Not really. For instance, in the case of the DMA engine in the MUSB
> controller in OMAP3, we can only use DMA with endpoints other than
> EP0, and EP0 is what is used for control transfers.
>
> It's not PIO for all the endpoints or DMA for all of them.

The HC driver does not have to be 100% truthful here. If the system
is not HIGHMEM, HCD can easily set uses_dma to false yet use DMA
by mapping buffers itself, without relying on the quoted code.

On a HIGHMEM system, block layer will bounce-buffer data in such case.
Hopefuly not a problem for ARM?

All network stack drivers work that way, BTW.

-- Pete

Benjamin Herrenschmidt

unread,
Feb 23, 2010, 9:50:01 PM2/23/10
to
On Fri, 2010-02-19 at 17:15 +0000, Catalin Marinas wrote:
> > Ok, you do things very differently than us on ppc then. We clear
> > PG_arch_1 in flush_dcache_page(), and we set it when the page has
> been
> > cache cleaned for execution.
>
> For this perspective it's not that different, just that we use the
> negated PG_arch_1.

Right, though you default as "clean" while we default as "dirty".

> > We assume that anybody that dirties a page in the kernel will call
> > flush_dcache_page() which removes our PG_arch_1 bit thus marking the
> > page "dirty".
>
> This assumption is not valid with some drivers like USB HCD doing PIO.
> But, yes, that's how it should be done.

So we go back to the fix should be done at the individual drivers level.
If it's going to write into the page cache, it needs to whack the bits.

Now there's of course the question as to whether you really only want to
do that for a PIO access and not for a DMA access, I think on power, we
don't really discriminate that much (since in any case our icache still
needs flushing). Maybe it would be useful to separate the I$ and D$ bits
but I'm not sure I can be bothered.



> > Note that from experience, doing the check & flushes in
> > update_mmu_cache() is racy on SMP. At least for I$/D$, we have the
> case
> > where processor one does set_pte followed by update_mmu_cache(). The
> > later isn't done yet but processor 2 sees the PTE now and starts
> using
> > it, cache hasn't been fully flushed yet. You may avoid that race in
> some
> > ways, but on ppc, I've stopped using that.
>
> I think that's possible on ARM too. Having two threads on different
> CPUs, one thread triggers a prefetch abort (instruction page fault) on
> CPU0 but the second thread on CPU1 may branch into this page after
> set_pte() (hence not fault) but before update_mmu_cache() doing the
> flush.
>
> On ARM11MPCore we flush the caches in flush_dcache_page() because the
> cache maintenance operations weren't visible to the other CPUs.

I'm not even sure that's going to be 100% correct. Don't you also need
to flush the remote icaches when you are dealing with instructions (such
as swap) anyways ?

I've had some discussions in the past with Russell and others around the
problem of non-broadcast cache ops on ARM SMP since that's also hurting
you hard with dma mappings.

Can you issue IPIs as FIQs if needed (from my old ARM knowledge, FIQs
are still on even in local_irq_save() blocks right ? I haven't touched
low level ARM for years tho, I may have forgotten things).

In this case, you should probably use the same bits as A9 and simply
make them use FIQs on 11MP to make the other cores flush as well.

> Cortex-A9 broadcasts the cache operations in hardware so we can use
> lazy flushing but with the race you pointed out.

Right.

> Using set_pte_at() for delayed flushing may be a better option for ARM
> as well (and maybe Documentation/cachetlb.txt updated).

Cheers,
Ben.

It is loading more messages.
0 new messages