[PATCH] x86_64: make GART PTEs uncacheable

Joachim Deguara

Apr 23, 2007, 5:20:12 AM
This patch fixes the silent data corruption seen when using the GART iommu,
where 4kB of data were incorrect (seen mostly on Nvidia CK804 systems). The
fix, which marks the memory region the GART PTEs reside in as uncacheable,
also brings the code in line with the AGP specification.

Signed-off-by: Joachim Deguara <joachim...@amd.com>

---
diff --git a/arch/x86_64/kernel/pci-gart.c b/arch/x86_64/kernel/pci-gart.c
index 2bac8c6..0bae862 100644
--- a/arch/x86_64/kernel/pci-gart.c
+++ b/arch/x86_64/kernel/pci-gart.c
@@ -519,7 +519,11 @@ static __init int init_k8_gatt(struct ag
gatt_size = (aper_size >> PAGE_SHIFT) * sizeof(u32);
gatt = (void *)__get_free_pages(GFP_KERNEL, get_order(gatt_size));
if (!gatt)
- panic("Cannot allocate GATT table");
+ panic("Cannot allocate GATT table");
+ if (change_page_attr_addr((unsigned long)gatt, gatt_size >> PAGE_SHIFT,
PAGE_KERNEL_NOCACHE))
+ panic("Could not set GART PTEs to uncacheable pages");
+ global_flush_tlb();
+
memset(gatt, 0, gatt_size);
agp_gatt_table = gatt;



Andi Kleen

Apr 23, 2007, 5:40:56 AM
On Monday 23 April 2007 11:14:10 Joachim Deguara wrote:
> This patch fixes the silent data corruption seen when using the GART iommu,
> where 4kB of data were incorrect (seen mostly on Nvidia CK804 systems).

Performance numbers? How much slower does this make this? Is it still faster
than swiotlb?

Also this will always split up the direct memory mapping of the kernel,
so you'll lose more TLB entries even for other data.
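
To put a rough number on the direct-mapping concern, here is a minimal sketch. It assumes the kernel direct mapping is built from 2 MB large pages and that changing the attribute of any 4 kB page inside one forces that large page to be split into 512 small PTEs. The helper names are hypothetical and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define SMALL_PAGE_SHIFT 12   /* assumption: 4 kB base pages */
#define LARGE_PAGE_SHIFT 21   /* assumption: 2 MB large pages in the direct mapping */
#define PTES_PER_LARGE_PAGE (1UL << (LARGE_PAGE_SHIFT - SMALL_PAGE_SHIFT)) /* 512 */

/* Number of 2 MB mappings that overlap the byte range [start, start + len). */
static unsigned long large_pages_split(uint64_t start, uint64_t len)
{
    assert(len > 0);
    uint64_t first = start >> LARGE_PAGE_SHIFT;
    uint64_t last = (start + len - 1) >> LARGE_PAGE_SHIFT;
    return (unsigned long)(last - first + 1);
}

/* Number of 4 kB PTEs that replace those large mappings once they are split. */
static unsigned long small_ptes_after_split(uint64_t start, uint64_t len)
{
    return large_pages_split(start, len) * PTES_PER_LARGE_PAGE;
}
```

Under these assumptions, a 64 kB table that sits inside a single 2 MB mapping turns one large-page TLB entry into as many as 512 small-page translations for that region, which is the cost being asked about.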

> The fix, which marks the memory region the GART PTEs reside in as
> uncacheable, also brings the code in line with the AGP specification.

Where in the AGP specification? I can't find any such requirement
in AGPv3

-Andi

Joachim Deguara

Apr 23, 2007, 5:50:16 AM
On Monday 23 April 2007 11:32, Andi Kleen wrote:
> On Monday 23 April 2007 11:14:10 Joachim Deguara wrote:
> > This patch fixes the silent data corruption seen when using the GART
> > iommu, where 4kB of data were incorrect (seen mostly on Nvidia CK804
> > systems).
>
> Performance numbers? How much slower does this make this? Is it still
> faster than swiotlb?

I can work on that separately, but as long as the GART IOMMU is in the
kernel we need this fix.

> Also this will always split up the direct memory mapping of the kernel,
> so you'll lose more TLB entries even for other data.
>
> > The fix, which marks the memory region the GART PTEs reside in as
> > uncacheable, also brings the code in line with the AGP specification.
>
> Where in the AGP specification? I can't find any such requirement
> in AGPv3
>

Mark pointed this out and he can answer best. I believe he was referring to
section 5.3.4 point 8:
"Core-logic accesses to the GART are not guaranteed to be coherent with host
processor caches. In order to avoid having to flush the cache after every GART
update, portable system software should place the GART in a range of the
physical memory space that is considered un-cacheable by host processors. (A
good example is mapping the GART as UC in an Intel® Pentium® II processor).
However, the specification does not preclude the placement of the GART in
cachable memory space in cases where the coherency is guaranteed through some
hardware or software mechanism."

Updated patch follows, with the long line corrected.

-Joachim

---


This patch fixes the silent data corruption seen when using the GART iommu,
where 4kB of data were incorrect (seen mostly on Nvidia CK804 systems). The
fix, which marks the memory region the GART PTEs reside in as uncacheable,
also brings the code in line with the AGP specification.

Signed-off-by: Joachim Deguara <joachim...@amd.com>

---

diff --git a/arch/x86_64/kernel/pci-gart.c b/arch/x86_64/kernel/pci-gart.c
index 2bac8c6..8fb4957 100644
--- a/arch/x86_64/kernel/pci-gart.c
+++ b/arch/x86_64/kernel/pci-gart.c
@@ -519,7 +519,12 @@ static __init int init_k8_gatt(struct ag
gatt_size = (aper_size >> PAGE_SHIFT) * sizeof(u32);
gatt = (void *)__get_free_pages(GFP_KERNEL, get_order(gatt_size));
if (!gatt)
- panic("Cannot allocate GATT table");
+ panic("Cannot allocate GATT table");
+ if (change_page_attr_addr((unsigned long)gatt, gatt_size >> PAGE_SHIFT,
+ PAGE_KERNEL_NOCACHE))
+ panic("Could not set GART PTEs to uncacheable pages");
+ global_flush_tlb();
+
memset(gatt, 0, gatt_size);
agp_gatt_table = gatt;
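
For a sense of how much memory the patch marks uncacheable, the size arithmetic in the hunk can be reproduced in user space. This is only a sketch: it assumes 4 kB pages and one 32-bit PTE per aperture page, as in the `gatt_size` line above, and the helper names `gatt_bytes` and `gatt_pages` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12                      /* assumption: 4 kB pages */
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

/* Table size in bytes: one 32-bit GART PTE per aperture page,
 * mirroring gatt_size = (aper_size >> PAGE_SHIFT) * sizeof(u32). */
static uint64_t gatt_bytes(uint64_t aper_size)
{
    assert(aper_size % PAGE_SIZE == 0);    /* aperture is a whole number of pages */
    return (aper_size >> PAGE_SHIFT) * sizeof(uint32_t);
}

/* Page count corresponding to gatt_size >> PAGE_SHIFT, the value the
 * patch passes to change_page_attr_addr(). */
static uint64_t gatt_pages(uint64_t aper_size)
{
    return gatt_bytes(aper_size) >> PAGE_SHIFT;
}
```

With a 64 MB aperture this gives 16384 entries, a 64 kB table, and therefore 16 pages switched to uncacheable. A 32 MB aperture gives 8 pages.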

Andi Kleen

Apr 23, 2007, 5:50:25 AM
On Monday 23 April 2007 11:45:11 Joachim Deguara wrote:

> I can work on that as a side note, but while the GART IOMMU is still in the
> kernel then we need this fix.

If it's too slow we can just use swiotlb instead. Probably while enlarging it.

-Andi
