The patch also sets the PG_arch_1 bit in the DMA cache maintenance
function to avoid additional cache flushing in update_mmu_cache().
Signed-off-by: Catalin Marinas <catalin...@arm.com>
Cc: Russell King <r...@arm.linux.org.uk>
Cc: Benjamin Herrenschmidt <be...@kernel.crashing.org>
Cc: James Bottomley <James.B...@HansenPartnership.com>
---
This idea came up during a long discussion on USB mass storage and ARM
cache coherency and is also the approach used on PowerPC:
http://thread.gmane.org/gmane.linux.usb.general/27072
The patch is against 2.6.33, but there may be some additional patches in
Linus' tree and it may not apply cleanly. Anyway, at this stage it is
meant for comments.
With this patch, we may no longer need a PIO mapping API.
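For reference, the semantic inversion boils down to flipping the bit test
(a sketch mirroring the hunks below; `page' and `mapping' stand for
whatever the caller has at hand):

	/* Old: flush only if the page was explicitly marked dirty. */
	if (test_and_clear_bit(PG_dcache_dirty, &page->flags))
		__flush_dcache_page(mapping, page);

	/*
	 * New: flush unless the page has already been marked clean,
	 * i.e. newly allocated pages default to "dirty".
	 */
	if (!test_and_set_bit(PG_dcache_clean, &page->flags))
		__flush_dcache_page(mapping, page);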
arch/arm/include/asm/cacheflush.h | 6 +++---
arch/arm/mm/copypage-v6.c | 2 +-
arch/arm/mm/dma-mapping.c | 5 +++++
arch/arm/mm/fault-armv.c | 2 +-
arch/arm/mm/flush.c | 2 +-
5 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index 8113bb5..691c5b3 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -136,10 +136,10 @@
#endif
/*
- * This flag is used to indicate that the page pointed to by a pte
- * is dirty and requires cleaning before returning it to the user.
+ * This flag is used to indicate that the page pointed to by a pte is clean
+ * and does not require cleaning before returning it to the user.
*/
-#define PG_dcache_dirty PG_arch_1
+#define PG_dcache_clean PG_arch_1
/*
* MM Cache Management
diff --git a/arch/arm/mm/copypage-v6.c b/arch/arm/mm/copypage-v6.c
index 0fa1319..7e0a050 100644
--- a/arch/arm/mm/copypage-v6.c
+++ b/arch/arm/mm/copypage-v6.c
@@ -86,7 +86,7 @@ static void v6_copy_user_highpage_aliasing(struct page *to,
unsigned int offset = CACHE_COLOUR(vaddr);
unsigned long kfrom, kto;
- if (test_and_clear_bit(PG_dcache_dirty, &from->flags))
+ if (!test_and_set_bit(PG_dcache_clean, &from->flags))
__flush_dcache_page(page_mapping(from), from);
/* FIXME: not highmem safe */
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index dcf1ecc..105a3a9 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -478,6 +478,11 @@ static void dma_cache_maint_contiguous(struct page *page, unsigned long offset,
paddr = page_to_phys(page) + offset;
outer_op(paddr, paddr + size);
+
+ /*
+ * Mark the D-cache clean for this page to avoid extra flushing.
+ */
+ set_bit(PG_dcache_clean, &page->flags);
}
void dma_cache_maint_page(struct page *page, unsigned long offset,
diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c
index 8b755ff..89dc1dd 100644
--- a/arch/arm/mm/fault-armv.c
+++ b/arch/arm/mm/fault-armv.c
@@ -162,7 +162,7 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr, pte_t pte)
return;
mapping = page_mapping(page);
- if (test_and_clear_bit(PG_dcache_dirty, &page->flags))
+ if (!test_and_set_bit(PG_dcache_clean, &page->flags))
__flush_dcache_page(mapping, page);
if (mapping) {
if (cache_is_vivt())
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 834db87..b829d30 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -209,7 +209,7 @@ void flush_dcache_page(struct page *page)
if (!cache_ops_need_broadcast() &&
!PageHighMem(page) && mapping && !mapping_mapped(mapping))
- set_bit(PG_dcache_dirty, &page->flags);
+ clear_bit(PG_dcache_clean, &page->flags);
else {
__flush_dcache_page(mapping, page);
if (mapping && cache_is_vivt())
Could you please send for RFC a fuller patch which covers all places that
PG_dcache_dirty is used and/or mentioned?
$ grep PG_dcache_dirty arch/arm -r
arch/arm/mm/copypage-v6.c: if (test_and_clear_bit(PG_dcache_dirty, &from->flags))
arch/arm/mm/fault-armv.c: * 1. If PG_dcache_dirty is set for the page, we need to ensure
arch/arm/mm/fault-armv.c: if (test_and_clear_bit(PG_dcache_dirty, &page->flags))
arch/arm/mm/flush.c: set_bit(PG_dcache_dirty, &page->flags);
arch/arm/mm/copypage-v4mc.c: if (test_and_clear_bit(PG_dcache_dirty, &from->flags))
arch/arm/mm/copypage-xscale.c: if (test_and_clear_bit(PG_dcache_dirty, &from->flags))
arch/arm/include/asm/tlbflush.h: * if PG_dcache_dirty is set for the page, we need to ensure that any
arch/arm/include/asm/cacheflush.h:#define PG_dcache_dirty PG_arch_1
Ah, I thought the compilation would find them but I was wrong. I'll
repost.
--
Catalin
ARM: Assume new page cache pages have dirty D-cache
From: Catalin Marinas <catalin...@arm.com>
There are places in Linux where writes to newly allocated page cache
pages happen without a subsequent call to flush_dcache_page() (several
PIO drivers including USB HCD). This patch changes the meaning of
PG_arch_1 to be PG_dcache_clean and always flushes the D-cache for a newly
mapped page in update_mmu_cache().
The patch also sets the PG_arch_1 bit in the DMA cache maintenance
function to avoid additional cache flushing in update_mmu_cache().
Signed-off-by: Catalin Marinas <catalin...@arm.com>
Cc: Russell King <r...@arm.linux.org.uk>
Cc: Benjamin Herrenschmidt <be...@kernel.crashing.org>
Cc: James Bottomley <James.B...@HansenPartnership.com>
---
arch/arm/include/asm/cacheflush.h | 6 +++---
arch/arm/include/asm/tlbflush.h | 2 +-
arch/arm/mm/copypage-v4mc.c | 2 +-
arch/arm/mm/copypage-v6.c | 2 +-
arch/arm/mm/copypage-xscale.c | 2 +-
arch/arm/mm/dma-mapping.c | 5 +++++
arch/arm/mm/fault-armv.c | 4 ++--
arch/arm/mm/flush.c | 2 +-
8 files changed, 15 insertions(+), 10 deletions(-)
diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index ed7d289..96d666f 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -137,10 +137,10 @@
#endif
/*
- * This flag is used to indicate that the page pointed to by a pte
- * is dirty and requires cleaning before returning it to the user.
+ * This flag is used to indicate that the page pointed to by a pte is clean
+ * and does not require cleaning before returning it to the user.
*/
-#define PG_dcache_dirty PG_arch_1
+#define PG_dcache_clean PG_arch_1
/*
* MM Cache Management
diff --git a/arch/arm/include/asm/tlbflush.h b/arch/arm/include/asm/tlbflush.h
index c2f1605..ccc3a2a 100644
--- a/arch/arm/include/asm/tlbflush.h
+++ b/arch/arm/include/asm/tlbflush.h
@@ -525,7 +525,7 @@ extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
#endif
/*
- * if PG_dcache_dirty is set for the page, we need to ensure that any
+ * If PG_dcache_clean is not set for the page, we need to ensure that any
* cache entries for the kernels virtual memory range are written
* back to the page.
*/
diff --git a/arch/arm/mm/copypage-v4mc.c b/arch/arm/mm/copypage-v4mc.c
index 7370a71..34c9fe5 100644
--- a/arch/arm/mm/copypage-v4mc.c
+++ b/arch/arm/mm/copypage-v4mc.c
@@ -73,7 +73,7 @@ void v4_mc_copy_user_highpage(struct page *to, struct page *from,
{
void *kto = kmap_atomic(to, KM_USER1);
- if (test_and_clear_bit(PG_dcache_dirty, &from->flags))
+ if (!test_and_set_bit(PG_dcache_clean, &from->flags))
__flush_dcache_page(page_mapping(from), from);
spin_lock(&minicache_lock);
diff --git a/arch/arm/mm/copypage-v6.c b/arch/arm/mm/copypage-v6.c
index 0fa1319..7e0a050 100644
--- a/arch/arm/mm/copypage-v6.c
+++ b/arch/arm/mm/copypage-v6.c
@@ -86,7 +86,7 @@ static void v6_copy_user_highpage_aliasing(struct page *to,
unsigned int offset = CACHE_COLOUR(vaddr);
unsigned long kfrom, kto;
- if (test_and_clear_bit(PG_dcache_dirty, &from->flags))
+ if (!test_and_set_bit(PG_dcache_clean, &from->flags))
__flush_dcache_page(page_mapping(from), from);
/* FIXME: not highmem safe */
diff --git a/arch/arm/mm/copypage-xscale.c b/arch/arm/mm/copypage-xscale.c
index 76824d3..4c94ea3 100644
--- a/arch/arm/mm/copypage-xscale.c
+++ b/arch/arm/mm/copypage-xscale.c
@@ -95,7 +95,7 @@ void xscale_mc_copy_user_highpage(struct page *to, struct page *from,
{
void *kto = kmap_atomic(to, KM_USER1);
- if (test_and_clear_bit(PG_dcache_dirty, &from->flags))
+ if (!test_and_set_bit(PG_dcache_clean, &from->flags))
__flush_dcache_page(page_mapping(from), from);
spin_lock(&minicache_lock);
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index dcf1ecc..105a3a9 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -478,6 +478,11 @@ static void dma_cache_maint_contiguous(struct page *page, unsigned long offset,
paddr = page_to_phys(page) + offset;
outer_op(paddr, paddr + size);
+
+ /*
+ * Mark the D-cache clean for this page to avoid extra flushing.
+ */
+ set_bit(PG_dcache_clean, &page->flags);
}
void dma_cache_maint_page(struct page *page, unsigned long offset,
diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c
index 8b755ff..baa5742 100644
--- a/arch/arm/mm/fault-armv.c
+++ b/arch/arm/mm/fault-armv.c
@@ -136,7 +136,7 @@ make_coherent(struct address_space *mapping, struct vm_area_struct *vma, unsigne
* a page table, or changing an existing PTE. Basically, there are two
* things that we need to take care of:
*
- * 1. If PG_dcache_dirty is set for the page, we need to ensure
+ * 1. If PG_dcache_clean is not set for the page, we need to ensure
* that any cache entries for the kernels virtual memory
* range are written back to the page.
* 2. If we have multiple shared mappings of the same space in
@@ -162,7 +162,7 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr, pte_t pte)
return;
mapping = page_mapping(page);
- if (test_and_clear_bit(PG_dcache_dirty, &page->flags))
+ if (!test_and_set_bit(PG_dcache_clean, &page->flags))
__flush_dcache_page(mapping, page);
if (mapping) {
if (cache_is_vivt())
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 834db87..b829d30 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -209,7 +209,7 @@ void flush_dcache_page(struct page *page)
if (!cache_ops_need_broadcast() &&
!PageHighMem(page) && mapping && !mapping_mapped(mapping))
- set_bit(PG_dcache_dirty, &page->flags);
+ clear_bit(PG_dcache_clean, &page->flags);
else {
__flush_dcache_page(mapping, page);
if (mapping && cache_is_vivt())
--
Catalin
As I just realised, this is going to subject all pages placed into
userspace to a D-cache flush - even anonymous pages, and those whose
cache issues we've already been careful to deal with (eg, via the COW
page copying code).
I think all the copypage functions need to set PG_dcache_clean on the
new pages once their copy has completed.
I wonder if there are any other anonymous-page-creating functions which
could do with similar treatment...
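A minimal sketch of that copypage suggestion (hypothetical placement, not
part of the posted patch; assumes the 2.6.33 kmap_atomic() interface and
omits whatever cache maintenance a given variant already performs):

	static void copy_user_highpage_sketch(struct page *to, struct page *from,
					      unsigned long vaddr)
	{
		void *kfrom = kmap_atomic(from, KM_USER0);
		void *kto = kmap_atomic(to, KM_USER1);

		copy_page(kto, kfrom);

		kunmap_atomic(kto, KM_USER1);
		kunmap_atomic(kfrom, KM_USER0);

		/*
		 * Proposed: mark the new page clean once the copy has
		 * completed, so update_mmu_cache() skips the extra
		 * D-cache flush.
		 */
		set_bit(PG_dcache_clean, &to->flags);
	}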
In the anonymous page case, flush_anon_page() is always called prior to
flush_dcache_page(), so flush_anon_page() could just set PG_dcache_clean
to work around that. That would handle get_user_pages(), too.
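A sketch of that idea against the existing flush_anon_page() wrapper in
asm/cacheflush.h (hypothetical; the set_bit() is the proposed addition):

	static inline void flush_anon_page(struct vm_area_struct *vma,
					   struct page *page, unsigned long vmaddr)
	{
		if (PageAnon(page)) {
			__flush_anon_page(vma, page, vmaddr);
			/* Proposed: the flush above leaves the page clean. */
			set_bit(PG_dcache_clean, &page->flags);
		}
	}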
Do we do anything other than COW and the zero page? clear_user_page(), I
suppose, could deal with that if you do the cache bits there.
Cheers,
Ben.
Well, currently, we clear PG_arch_1 in flush_dcache_page(), at least on
ppc.
Now, I have a nagging feeling that we might not need to... I'll have to
give it a closer look when I'm back from this extended week-end :-)
Cheers,
Ben.
As Ben said, I think we can set PG_dcache_clean in the
clear/copy_user_page() functions. My doubt with these functions is the
highmem case, where kunmap_atomic() only flushes the D-cache in one
situation; the other path just calls kunmap_high(), which doesn't seem to
do anything to the caches.
But for non-aliasing VIPT, do we actually need the D-cache flushing in
kunmap_atomic() or copy_user_page()? If we default to new pages being
dirty, I think we can remove the cache cleaning from this function and
leave it to update_mmu_cache() (or set_pte_at(), but that's for a
different patch).
--
Catalin
In which case you're totally missing the point of these functions.
The copy_user_page and clear_user_page functions specifically do tricks
to ensure that they can avoid additional cache maintenance - or any
cache maintenance at all.
For instance, on aliasing VIPT, they will map the user page in using
the same colour as the ultimate userspace address, ensuring that any
cache lines created will be visible to the userspace application.
So what kunmap_atomic() does with caches is not really relevant to the
coherency issue.
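For reference, the colour trick looks roughly like this (condensed from
the v6 aliasing copypage code; locking and discarding of the old
destination data omitted):

	unsigned int offset = CACHE_COLOUR(vaddr);
	unsigned long kfrom = from_address + (offset << PAGE_SHIFT);
	unsigned long kto = to_address + (offset << PAGE_SHIFT);

	/*
	 * Map source and destination at kernel addresses that share the
	 * cache colour of the user address `vaddr'.
	 */
	set_pte_ext(TOP_PTE(from_address) + offset,
		    pfn_pte(page_to_pfn(from), PAGE_KERNEL), 0);
	set_pte_ext(TOP_PTE(to_address) + offset,
		    pfn_pte(page_to_pfn(to), PAGE_KERNEL), 0);
	flush_tlb_kernel_page(kfrom);
	flush_tlb_kernel_page(kto);

	/* Cache lines written here are visible through the user mapping. */
	copy_page((void *)kto, (void *)kfrom);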
I was thinking more of the non-aliasing VIPT case, where we could defer
the flushing until update_mmu_cache(). But I'm fine with just setting
PG_dcache_clean in these functions to avoid checks for non-aliasing vs.
aliasing VIPT.
> So what kunmap_atomic() does with caches is not really relevant to the
> coherency issue.
The relevant part is that if highmem is enabled, copy_user_page() does
not flush the D-cache, leaving it to kunmap_atomic(). Does this latter
function flush the D-cache in all the relevant situations?
--
Catalin