[PATCH 0/3] mm: Fix bug affecting swapping in MTE tagged pages

Peter Collingbourne

May 12, 2023, 7:58:13 PM
to Catalin Marinas, Peter Collingbourne, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, da...@redhat.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price
This patch series reworks the logic that handles swapping in page
metadata to fix a reported bug [1] where metadata can sometimes not
be swapped in correctly after commit c145e0b47c77 ("mm: streamline COW
logic in do_swap_page()").

[1] https://lore.kernel.org/all/5050805753ac469e8d727c7...@mediatek.com/

Peter Collingbourne (3):
mm: Move arch_do_swap_page() call to before swap_free()
mm: Call arch_swap_restore() from arch_do_swap_page() and deprecate
the latter
arm64: mte: Simplify swap tag restoration logic and fix uninitialized
tag issue

arch/arm64/include/asm/mte.h | 4 ++--
arch/arm64/include/asm/pgtable.h | 14 ++------------
arch/arm64/kernel/mte.c | 32 +++-----------------------------
arch/arm64/mm/mteswap.c | 7 +++----
include/linux/pgtable.h | 26 +++++++++++++-------------
mm/memory.c | 26 +++++++++++++-------------
6 files changed, 36 insertions(+), 73 deletions(-)

--
2.40.1.606.ga4b1b128d6-goog

Peter Collingbourne

May 12, 2023, 7:58:16 PM
to Catalin Marinas, Peter Collingbourne, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, da...@redhat.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
Commit c145e0b47c77 ("mm: streamline COW logic in do_swap_page()") moved
the call to swap_free() before the call to set_pte_at(), which meant that
the MTE tags could end up being freed before set_pte_at() had a chance
to restore them. One other possibility was to hook arch_do_swap_page(),
but this had a number of problems:

- The call to the hook was also after swap_free().

- The call to the hook was after the call to set_pte_at(), so there was a
racy window where uninitialized metadata may be exposed to userspace.
This likely also affects SPARC ADI, which implements this hook to
restore tags.

- As a result of commit 1eba86c096e3 ("mm: change page type prior to
adding page table entry"), we were also passing the new PTE as the
oldpte argument, preventing the hook from knowing the swap index.

Fix all of these problems by moving the arch_do_swap_page() call before
the call to swap_free(), and ensuring that we do not set orig_pte until
after the call.

Signed-off-by: Peter Collingbourne <p...@google.com>
Suggested-by: Catalin Marinas <catalin...@arm.com>
Link: https://linux-review.googlesource.com/id/I6470efa669e8bd2f841049b8c61020c510678965
Cc: <sta...@vger.kernel.org> # 6.1
Fixes: ca827d55ebaa ("mm, swap: Add infrastructure for saving page metadata on swap")
Fixes: 1eba86c096e3 ("mm: change page type prior to adding page table entry")
---
mm/memory.c | 26 +++++++++++++-------------
1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 01a23ad48a04..83268d287ff1 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3914,19 +3914,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
}
}

- /*
- * Remove the swap entry and conditionally try to free up the swapcache.
- * We're already holding a reference on the page but haven't mapped it
- * yet.
- */
- swap_free(entry);
- if (should_try_to_free_swap(folio, vma, vmf->flags))
- folio_free_swap(folio);
-
- inc_mm_counter(vma->vm_mm, MM_ANONPAGES);
- dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
pte = mk_pte(page, vma->vm_page_prot);
-
/*
* Same logic as in do_wp_page(); however, optimize for pages that are
* certainly not shared either because we just allocated them without
@@ -3946,8 +3934,21 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
pte = pte_mksoft_dirty(pte);
if (pte_swp_uffd_wp(vmf->orig_pte))
pte = pte_mkuffd_wp(pte);
+ arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);
vmf->orig_pte = pte;

+ /*
+ * Remove the swap entry and conditionally try to free up the swapcache.
+ * We're already holding a reference on the page but haven't mapped it
+ * yet.
+ */
+ swap_free(entry);
+ if (should_try_to_free_swap(folio, vma, vmf->flags))
+ folio_free_swap(folio);
+
+ inc_mm_counter(vma->vm_mm, MM_ANONPAGES);
+ dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
+
/* ksm created a completely new copy */
if (unlikely(folio != swapcache && swapcache)) {
page_add_new_anon_rmap(page, vma, vmf->address);
@@ -3959,7 +3960,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
VM_BUG_ON(!folio_test_anon(folio) ||
(pte_write(pte) && !PageAnonExclusive(page)));
set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
- arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);

folio_unlock(folio);
if (folio != swapcache && swapcache) {
--
2.40.1.606.ga4b1b128d6-goog

Peter Collingbourne

May 12, 2023, 7:58:18 PM
to Catalin Marinas, Peter Collingbourne, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, da...@redhat.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
The previous patch made it possible for MTE to restore tags before they
are freed by hooking arch_do_swap_page().

However, the arch_do_swap_page() hook API is incompatible with swap
restoration in circumstances where we do not have an mm or a vma,
such as swapoff with swapped out shmem, and I expect that ADI will
currently fail to restore tags in these circumstances. This implies that
arch-specific metadata stores ought to be indexed by swap index, as MTE
does, rather than by mm and vma, as ADI does, and we should discourage
hooking arch_do_swap_page(), preferring to hook arch_swap_restore()
instead, as MTE already does.

Therefore, instead of directly hooking arch_do_swap_page() for
MTE, deprecate that hook, change its default implementation to call
arch_swap_restore() and rely on the existing implementation of the latter
for MTE.

Fixes: c145e0b47c77 ("mm: streamline COW logic in do_swap_page()")
Link: https://linux-review.googlesource.com/id/Id2f1ad76eaf606ae210e1d2dd0b7fe287e5f7d87
Signed-off-by: Peter Collingbourne <p...@google.com>
Reported-by: Qun-wei Lin (林群崴) <Qun-w...@mediatek.com>
Link: https://lore.kernel.org/all/5050805753ac469e8d727c7...@mediatek.com/
Cc: <sta...@vger.kernel.org> # 6.1
---
include/linux/pgtable.h | 26 +++++++++++++-------------
1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index c63cd44777ec..fc0259cf60fb 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -740,6 +740,12 @@ static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b)
set_pgd(pgdp, pgd); \
})

+#ifndef __HAVE_ARCH_SWAP_RESTORE
+static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio)
+{
+}
+#endif
+
#ifndef __HAVE_ARCH_DO_SWAP_PAGE
/*
* Some architectures support metadata associated with a page. When a
@@ -748,14 +754,14 @@ static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b)
* processors support an ADI (Application Data Integrity) tag for the
* page as metadata for the page. arch_do_swap_page() can restore this
* metadata when a page is swapped back in.
+ *
+ * This hook is deprecated. Architectures should hook arch_swap_restore()
+ * instead, because this hook is not called on all code paths that can
+ * swap in a page, particularly those where mm and vma are not available
+ * (e.g. swapoff for shmem pages).
*/
-static inline void arch_do_swap_page(struct mm_struct *mm,
- struct vm_area_struct *vma,
- unsigned long addr,
- pte_t pte, pte_t oldpte)
-{
-
-}
+#define arch_do_swap_page(mm, vma, addr, pte, oldpte) \
+ arch_swap_restore(pte_to_swp_entry(oldpte), page_folio(pte_page(pte)))
#endif

#ifndef __HAVE_ARCH_UNMAP_ONE
@@ -798,12 +804,6 @@ static inline void arch_swap_invalidate_area(int type)
}
#endif

-#ifndef __HAVE_ARCH_SWAP_RESTORE
-static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio)
-{
-}
-#endif
-
#ifndef __HAVE_ARCH_PGD_OFFSET_GATE
#define pgd_offset_gate(mm, addr) pgd_offset(mm, addr)
#endif
--
2.40.1.606.ga4b1b128d6-goog

Peter Collingbourne

May 12, 2023, 7:58:21 PM
to Catalin Marinas, Peter Collingbourne, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, da...@redhat.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
As a result of the previous two patches, there are no circumstances
in which a swapped-in page is installed in a page table without first
having arch_swap_restore() called on it. Therefore, we no longer need
the logic in set_pte_at() that restores the tags, so remove it.

Because we can now rely on the page being locked, we no longer need to
handle the case where a page is having its tags restored by multiple tasks
concurrently, so we can slightly simplify the logic in mte_restore_tags().

This patch also fixes an issue where a page can have PG_mte_tagged set
with uninitialized tags. The issue is that the mte_sync_page_tags()
function sets PG_mte_tagged if it initializes page tags. Then we
return to mte_sync_tags(), which sets PG_mte_tagged again. At best,
this is redundant. However, it is possible for mte_sync_page_tags()
to return without having initialized tags for the page, i.e. in the
case where check_swap is true (non-compound page), is_swap_pte(old_pte)
is false and pte_is_tagged is false. So at worst, we set PG_mte_tagged
on a page with uninitialized tags. This can happen if, for example,
page migration causes a PTE for an untagged page to be replaced. If the
userspace program subsequently uses mprotect() to enable PROT_MTE for
that page, the uninitialized tags will be exposed to userspace.

Signed-off-by: Peter Collingbourne <p...@google.com>
Link: https://linux-review.googlesource.com/id/I8ad54476f3b2d0144ccd8ce0c1d7a2963e5ff6f3
Fixes: e059853d14ca ("arm64: mte: Fix/clarify the PG_mte_tagged semantics")
Cc: <sta...@vger.kernel.org> # 6.1
---
The Fixes: tag (and the commit message in general) are written assuming
that this patch is landed in a maintainer tree instead of
"arm64: mte: Do not set PG_mte_tagged if tags were not initialized".

arch/arm64/include/asm/mte.h | 4 ++--
arch/arm64/include/asm/pgtable.h | 14 ++------------
arch/arm64/kernel/mte.c | 32 +++-----------------------------
arch/arm64/mm/mteswap.c | 7 +++----
4 files changed, 10 insertions(+), 47 deletions(-)

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index 20dd06d70af5..dfea486a6a85 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -90,7 +90,7 @@ static inline bool try_page_mte_tagging(struct page *page)
}

void mte_zero_clear_page_tags(void *addr);
-void mte_sync_tags(pte_t old_pte, pte_t pte);
+void mte_sync_tags(pte_t pte);
void mte_copy_page_tags(void *kto, const void *kfrom);
void mte_thread_init_user(void);
void mte_thread_switch(struct task_struct *next);
@@ -122,7 +122,7 @@ static inline bool try_page_mte_tagging(struct page *page)
static inline void mte_zero_clear_page_tags(void *addr)
{
}
-static inline void mte_sync_tags(pte_t old_pte, pte_t pte)
+static inline void mte_sync_tags(pte_t pte)
{
}
static inline void mte_copy_page_tags(void *kto, const void *kfrom)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index b6ba466e2e8a..efdf48392026 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -337,18 +337,8 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
* don't expose tags (instruction fetches don't check tags).
*/
if (system_supports_mte() && pte_access_permitted(pte, false) &&
- !pte_special(pte)) {
- pte_t old_pte = READ_ONCE(*ptep);
- /*
- * We only need to synchronise if the new PTE has tags enabled
- * or if swapping in (in which case another mapping may have
- * set tags in the past even if this PTE isn't tagged).
- * (!pte_none() && !pte_present()) is an open coded version of
- * is_swap_pte()
- */
- if (pte_tagged(pte) || (!pte_none(old_pte) && !pte_present(old_pte)))
- mte_sync_tags(old_pte, pte);
- }
+ !pte_special(pte) && pte_tagged(pte))
+ mte_sync_tags(pte);

__check_safe_pte_update(mm, ptep, pte);

diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index f5bcb0dc6267..c40728046fed 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -35,41 +35,15 @@ DEFINE_STATIC_KEY_FALSE(mte_async_or_asymm_mode);
EXPORT_SYMBOL_GPL(mte_async_or_asymm_mode);
#endif

-static void mte_sync_page_tags(struct page *page, pte_t old_pte,
- bool check_swap, bool pte_is_tagged)
-{
- if (check_swap && is_swap_pte(old_pte)) {
- swp_entry_t entry = pte_to_swp_entry(old_pte);
-
- if (!non_swap_entry(entry))
- mte_restore_tags(entry, page);
- }
-
- if (!pte_is_tagged)
- return;
-
- if (try_page_mte_tagging(page)) {
- mte_clear_page_tags(page_address(page));
- set_page_mte_tagged(page);
- }
-}
-
-void mte_sync_tags(pte_t old_pte, pte_t pte)
+void mte_sync_tags(pte_t pte)
{
struct page *page = pte_page(pte);
long i, nr_pages = compound_nr(page);
- bool check_swap = nr_pages == 1;
- bool pte_is_tagged = pte_tagged(pte);
-
- /* Early out if there's nothing to do */
- if (!check_swap && !pte_is_tagged)
- return;

/* if PG_mte_tagged is set, tags have already been initialised */
for (i = 0; i < nr_pages; i++, page++) {
- if (!page_mte_tagged(page)) {
- mte_sync_page_tags(page, old_pte, check_swap,
- pte_is_tagged);
+ if (try_page_mte_tagging(page)) {
+ mte_clear_page_tags(page_address(page));
set_page_mte_tagged(page);
}
}
diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c
index cd508ba80ab1..3a78bf1b1364 100644
--- a/arch/arm64/mm/mteswap.c
+++ b/arch/arm64/mm/mteswap.c
@@ -53,10 +53,9 @@ void mte_restore_tags(swp_entry_t entry, struct page *page)
if (!tags)
return;

- if (try_page_mte_tagging(page)) {
- mte_restore_page_tags(page_address(page), tags);
- set_page_mte_tagged(page);
- }
+ WARN_ON_ONCE(!try_page_mte_tagging(page));
+ mte_restore_page_tags(page_address(page), tags);
+ set_page_mte_tagged(page);
}

void mte_invalidate_tags(int type, pgoff_t offset)
--
2.40.1.606.ga4b1b128d6-goog

David Hildenbrand

May 12, 2023, 11:30:00 PM
to Peter Collingbourne, Catalin Marinas, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
On 13.05.23 01:57, Peter Collingbourne wrote:
> Commit c145e0b47c77 ("mm: streamline COW logic in do_swap_page()") moved
> the call to swap_free() before the call to set_pte_at(), which meant that
> the MTE tags could end up being freed before set_pte_at() had a chance
> to restore them. One other possibility was to hook arch_do_swap_page(),
> but this had a number of problems:
>
> - The call to the hook was also after swap_free().
>
> - The call to the hook was after the call to set_pte_at(), so there was a
> racy window where uninitialized metadata may be exposed to userspace.
> This likely also affects SPARC ADI, which implements this hook to
> restore tags.
>
> - As a result of commit 1eba86c096e3 ("mm: change page type prior to
> adding page table entry"), we were also passing the new PTE as the
> oldpte argument, preventing the hook from knowing the swap index.
>
> Fix all of these problems by moving the arch_do_swap_page() call before
> the call to swap_free(), and ensuring that we do not set orig_pte until
> after the call.
>
> Signed-off-by: Peter Collingbourne <p...@google.com>
> Suggested-by: Catalin Marinas <catalin...@arm.com>
> Link: https://linux-review.googlesource.com/id/I6470efa669e8bd2f841049b8c61020c510678965
> Cc: <sta...@vger.kernel.org> # 6.1
> Fixes: ca827d55ebaa ("mm, swap: Add infrastructure for saving page metadata on swap")
> Fixes: 1eba86c096e3 ("mm: change page type prior to adding page table entry")

I'm confused. You say c145e0b47c77 changed something (which was after
above commits), indicate that it fixes two other commits, and indicate
"6.1" as stable which does not apply to any of these commits.
You are moving the folio_free_swap() call after the
folio_ref_count(folio) == 1 check, which means that such (previously)
swapped pages that are exclusive cannot be detected as exclusive.

There must be a better way to handle MTE here.

Where are the tags stored, how is the location identified, and when are
they effectively restored right now?

--
Thanks,

David / dhildenb

David Hildenbrand

May 12, 2023, 11:34:29 PM
to Peter Collingbourne, Catalin Marinas, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
On 13.05.23 01:57, Peter Collingbourne wrote:
> The previous patch made it possible for MTE to restore tags before they
> are freed by hooking arch_do_swap_page().
>
> However, the arch_do_swap_page() hook API is incompatible with swap
> restoration in circumstances where we do not have an mm or a vma,
> such as swapoff with swapped out shmem, and I expect that ADI will
> currently fail to restore tags in these circumstances. This implies that
> arch-specific metadata stores ought to be indexed by swap index, as MTE
> does, rather than by mm and vma, as ADI does, and we should discourage
> hooking arch_do_swap_page(), preferring to hook arch_swap_restore()
> instead, as MTE already does.
>
> Therefore, instead of directly hooking arch_do_swap_page() for
> MTE, deprecate that hook, change its default implementation to call
> arch_swap_restore() and rely on the existing implementation of the latter
> for MTE.
>
> Fixes: c145e0b47c77 ("mm: streamline COW logic in do_swap_page()")

Can you enlighten me how this change fixes that commit? I'm afraid I am
missing something important.

What is the user-visible impact of the problem, how was it caused by
c145e0b47c77, and how does your change fix it?

Catalin Marinas

May 15, 2023, 1:34:38 PM
to David Hildenbrand, Peter Collingbourne, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
I haven't gone through Peter's patches yet but a pretty good description
of the problem is here:
https://lore.kernel.org/all/5050805753ac469e8d727c7...@mediatek.com/.
I couldn't reproduce it with my swap setup but both Qun-wei and Peter
triggered it.

When a tagged page is swapped out, the arm64 code stores the metadata
(tags) in a local xarray indexed by the swap pte. When restoring from
swap, the arm64 set_pte_at() checks this xarray using the old swap pte
and spills the tags onto the new page. Apparently something changed in
the kernel recently that causes swap_range_free() to be called before
set_pte_at(). The arm64 arch_swap_invalidate_page() frees the metadata
from the xarray and the subsequent set_pte_at() won't find it.

If we have the page, the metadata can be restored before set_pte_at()
and I guess that's what Peter is trying to do (again, I haven't looked
at the details yet; leaving it for tomorrow).

Is there any other way of handling this? E.g. not release the metadata
in arch_swap_invalidate_page() but later in set_pte_at() once it was
restored. But then we may leak this metadata if there's no set_pte_at()
(the process mapping the swap entry died).

--
Catalin

Peter Collingbourne

May 15, 2023, 7:40:08 PM
to Catalin Marinas, David Hildenbrand, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
In order to reproduce this bug it is necessary for the swap slot cache
to be disabled, which is unlikely to occur during normal operation. I
was only able to reproduce the bug by disabling it forcefully with the
following patch:

diff --git a/mm/swap_slots.c b/mm/swap_slots.c
index 0bec1f705f8e0..25afba16980c7 100644
--- a/mm/swap_slots.c
+++ b/mm/swap_slots.c
@@ -79,7 +79,7 @@ void disable_swap_slots_cache_lock(void)

static void __reenable_swap_slots_cache(void)
{
- swap_slot_cache_enabled = has_usable_swap();
+ swap_slot_cache_enabled = false;
}

void reenable_swap_slots_cache_unlock(void)

With that I can trigger the bug on an MTE-utilizing process by running
a program that enumerates the process's private anonymous mappings and
calls process_madvise(MADV_PAGEOUT) on all of them.

> When a tagged page is swapped out, the arm64 code stores the metadata
> (tags) in a local xarray indexed by the swap pte. When restoring from
> swap, the arm64 set_pte_at() checks this xarray using the old swap pte
> and spills the tags onto the new page. Apparently something changed in
> the kernel recently that causes swap_range_free() to be called before
> set_pte_at(). The arm64 arch_swap_invalidate_page() frees the metadata
> from the xarray and the subsequent set_pte_at() won't find it.
>
> If we have the page, the metadata can be restored before set_pte_at()
> and I guess that's what Peter is trying to do (again, I haven't looked
> at the details yet; leaving it for tomorrow).
>
> Is there any other way of handling this? E.g. not release the metadata
> in arch_swap_invalidate_page() but later in set_pte_at() once it was
> restored. But then we may leak this metadata if there's no set_pte_at()
> (the process mapping the swap entry died).

Another problem that I can see with this approach is that it does not
respect reference counts for swap entries, and it's unclear whether that
can be done in a non-racy fashion.

Another approach that I considered was to move the hook to swap_readpage()
as in the patch below (sorry, it only applies to an older version
of Android's android14-6.1 branch and not mainline, but you get the
idea). But during a stress test (running the aforementioned program that
calls process_madvise(MADV_PAGEOUT) in a loop during an Android "monkey"
test) I discovered the following racy use-after-free that can occur when
two tasks T1 and T2 concurrently restore the same page:

T1: | T2:
arch_swap_readpage() |
| arch_swap_readpage() -> mte_restore_tags() -> xa_load()
swap_free() |
| arch_swap_readpage() -> mte_restore_tags() -> mte_restore_page_tags()

We can avoid it by taking the swap_info_struct::lock spinlock in
mte_restore_tags(), but it seems like it would lead to lock contention.

Peter

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index 3f8199ba265a1..99c8be073f107 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -25,7 +25,7 @@ unsigned long mte_copy_tags_to_user(void __user *to, void *from,
unsigned long n);
int mte_save_tags(struct page *page);
void mte_save_page_tags(const void *page_addr, void *tag_storage);
-bool mte_restore_tags(swp_entry_t entry, struct page *page);
+void mte_restore_tags(struct page *page);
void mte_restore_page_tags(void *page_addr, const void *tag_storage);
void mte_invalidate_tags(int type, pgoff_t offset);
void mte_invalidate_tags_area(int type);
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 812373cff4eec..32d3c661a0eee 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -1054,11 +1054,11 @@ static inline void arch_swap_invalidate_area(int type)
mte_invalidate_tags_area(type);
}

-#define __HAVE_ARCH_SWAP_RESTORE
-static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio)
+#define __HAVE_ARCH_SWAP_READPAGE
+static inline void arch_swap_readpage(struct page *page)
{
- if (system_supports_mte() && mte_restore_tags(entry, &folio->page))
- set_page_mte_tagged(&folio->page);
+ if (system_supports_mte())
+ mte_restore_tags(page);
}

#endif /* CONFIG_ARM64_MTE */
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 84a085d536f84..176f094ecaa1e 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -38,15 +38,6 @@ EXPORT_SYMBOL_GPL(mte_async_or_asymm_mode);
static void mte_sync_page_tags(struct page *page, pte_t old_pte,
bool check_swap, bool pte_is_tagged)
{
- if (check_swap && is_swap_pte(old_pte)) {
- swp_entry_t entry = pte_to_swp_entry(old_pte);
-
- if (!non_swap_entry(entry) && mte_restore_tags(entry, page)) {
- set_page_mte_tagged(page);
- return;
- }
- }
-
if (!pte_is_tagged)
return;

diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c
index 70f913205db99..3fe7774f32b3c 100644
--- a/arch/arm64/mm/mteswap.c
+++ b/arch/arm64/mm/mteswap.c
@@ -46,21 +46,23 @@ int mte_save_tags(struct page *page)
return 0;
}

-bool mte_restore_tags(swp_entry_t entry, struct page *page)
+void mte_restore_tags(struct page *page)
{
+ swp_entry_t entry = folio_swap_entry(page_folio(page));
void *tags = xa_load(&mte_pages, entry.val);

if (!tags)
- return false;
+ return;

/*
* Test PG_mte_tagged again in case it was racing with another
* set_pte_at().
*/
- if (!test_and_set_bit(PG_mte_tagged, &page->flags))
+ if (!test_and_set_bit(PG_mte_tagged, &page->flags)) {
mte_restore_page_tags(page_address(page), tags);
-
- return true;
+ if (kasan_hw_tags_enabled())
+ page_kasan_tag_reset(page);
+ }
}

void mte_invalidate_tags(int type, pgoff_t offset)
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 5f0d7d0b9471b..eea1e545595ca 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -793,8 +793,8 @@ static inline void arch_swap_invalidate_area(int type)
}
#endif

-#ifndef __HAVE_ARCH_SWAP_RESTORE
-static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio)
+#ifndef __HAVE_ARCH_SWAP_READPAGE
+static inline void arch_swap_readpage(struct page *page)
{
}
#endif
diff --git a/mm/page_io.c b/mm/page_io.c
index 3a5f921b932e8..a2f53dbeca7b3 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -470,6 +470,12 @@ int swap_readpage(struct page *page, bool synchronous,
}
delayacct_swapin_start();

+ /*
+ * Some architectures may have to restore extra metadata to the
+ * page when reading from swap.
+ */
+ arch_swap_readpage(page);
+
if (frontswap_load(page) == 0) {
SetPageUptodate(page);
unlock_page(page);
diff --git a/mm/shmem.c b/mm/shmem.c
index 0b335607bf2ad..82ccf1e6efe5d 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1784,12 +1784,6 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
}
folio_wait_writeback(folio);

- /*
- * Some architectures may have to restore extra metadata to the
- * folio after reading from swap.
- */
- arch_swap_restore(swap, folio);
-
if (shmem_should_replace_folio(folio, gfp)) {
error = shmem_replace_folio(&folio, gfp, info, index);
if (error)

Peter Collingbourne

May 15, 2023, 8:16:16 PM
to David Hildenbrand, Catalin Marinas, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
On Sat, May 13, 2023 at 05:29:53AM +0200, David Hildenbrand wrote:
Sorry, the situation is indeed a bit confusing.

- In order to make the arch_do_swap_page() hook suitable for fixing the
bug introduced by c145e0b47c77, patch 1 addresses a number of issues,
including fixing bugs introduced by ca827d55ebaa and 1eba86c096e3,
but we haven't fixed the c145e0b47c77 bug yet, so there's no Fixes:
tag for it yet.

- Patch 2, relying on the fixes in patch 1, makes MTE install an
arch_do_swap_page() hook (indirectly, by making arch_swap_restore()
also hook arch_do_swap_page()), thereby fixing the c145e0b47c77 bug.

- 6.1 is the first stable version in which all 3 commits in my Fixes: tags
are present, so that is the version that I've indicated in my stable
tag for this series. In theory patch 1 could be applied to older kernel
versions, but it wouldn't fix any problems that we are facing with MTE
(because it only fixes problems relating to the arch_do_swap_page()
hook, which older kernel versions don't hook with MTE), and there are
some merge conflicts if we go back further anyway. If the SPARC folks
(the previous only user of this hook) want to fix these issues with ADI,
they can propose their own backport.

> > @@ -3959,7 +3960,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> > VM_BUG_ON(!folio_test_anon(folio) ||
> > (pte_write(pte) && !PageAnonExclusive(page)));
> > set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
> > - arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);
> > folio_unlock(folio);
> > if (folio != swapcache && swapcache) {
>
>
> You are moving the folio_free_swap() call after the folio_ref_count(folio)
> == 1 check, which means that such (previously) swapped pages that are
> exclusive cannot be detected as exclusive.

Ack. I will fix this in v2.

Peter

Peter Collingbourne

May 15, 2023, 10:35:21 PM
to Catalin Marinas, Peter Collingbourne, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, da...@redhat.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price
This patch series reworks the logic that handles swapping in page
metadata to fix a reported bug [1] where metadata can sometimes not
be swapped in correctly after commit c145e0b47c77 ("mm: streamline COW
logic in do_swap_page()").

[1] https://lore.kernel.org/all/5050805753ac469e8d727c7...@mediatek.com/

v2:
- Call arch_swap_restore() directly instead of via arch_do_swap_page()

Peter Collingbourne (2):
mm: Call arch_swap_restore() from do_swap_page()
arm64: mte: Simplify swap tag restoration logic and fix uninitialized
tag issue

arch/arm64/include/asm/mte.h | 4 ++--
arch/arm64/include/asm/pgtable.h | 14 ++------------
arch/arm64/kernel/mte.c | 32 +++-----------------------------
arch/arm64/mm/mteswap.c | 7 +++----
mm/memory.c | 7 +++++++
5 files changed, 17 insertions(+), 47 deletions(-)

--
2.40.1.606.ga4b1b128d6-goog

Peter Collingbourne

May 15, 2023, 10:35:23 PM
to Catalin Marinas, Peter Collingbourne, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, da...@redhat.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
Commit c145e0b47c77 ("mm: streamline COW logic in do_swap_page()") moved
the call to swap_free() before the call to set_pte_at(), which meant that
the MTE tags could end up being freed before set_pte_at() had a chance
to restore them. Fix it by adding a call to the arch_swap_restore() hook
before the call to swap_free().

Signed-off-by: Peter Collingbourne <p...@google.com>
Fixes: c145e0b47c77 ("mm: streamline COW logic in do_swap_page()")
---
v2:
- Call arch_swap_restore() directly instead of via arch_do_swap_page()

mm/memory.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 01a23ad48a04..a2d9e6952d31 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3914,6 +3914,13 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
}
}

+ /*
+ * Some architectures may have to restore extra metadata to the page
+ * when reading from swap. This metadata may be indexed by swap entry
+ * so this must be called before swap_free().
+ */
+ arch_swap_restore(entry, folio);
+
/*
* Remove the swap entry and conditionally try to free up the swapcache.
* We're already holding a reference on the page but haven't mapped it
--
2.40.1.606.ga4b1b128d6-goog

Peter Collingbourne

May 15, 2023, 10:35:26 PM
to Catalin Marinas, Peter Collingbourne, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, da...@redhat.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
As a result of the previous two patches, there are no circumstances
in which a swapped-in page is installed in a page table without first
having arch_swap_restore() called on it. Therefore, we no longer need
the logic in set_pte_at() that restores the tags, so remove it.

Because we can now rely on the page being locked, we no longer need to
handle the case where a page is having its tags restored by multiple tasks
concurrently, so we can slightly simplify the logic in mte_restore_tags().

This patch also fixes an issue where a page can have PG_mte_tagged set
with uninitialized tags. The issue is that the mte_sync_page_tags()
function sets PG_mte_tagged if it initializes page tags. Then we
return to mte_sync_tags(), which sets PG_mte_tagged again. At best,
this is redundant. However, it is possible for mte_sync_page_tags()
to return without having initialized tags for the page, i.e. in the
case where check_swap is true (non-compound page), is_swap_pte(old_pte)
is false and pte_is_tagged is false. So at worst, we set PG_mte_tagged
on a page with uninitialized tags. This can happen if, for example,
page migration causes a PTE for an untagged page to be replaced. If the
userspace program subsequently uses mprotect() to enable PROT_MTE for
that page, the uninitialized tags will be exposed to userspace.

Signed-off-by: Peter Collingbourne <p...@google.com>
Link: https://linux-review.googlesource.com/id/I8ad54476f3b2d0144ccd8ce0c1d7a2963e5ff6f3
Fixes: e059853d14ca ("arm64: mte: Fix/clarify the PG_mte_tagged semantics")
Cc: <sta...@vger.kernel.org> # 6.1
---
The Fixes: tag (and the commit message in general) are written assuming
that this patch is landed in a maintainer tree instead of
"arm64: mte: Do not set PG_mte_tagged if tags were not initialized".

arch/arm64/include/asm/mte.h | 4 ++--
arch/arm64/include/asm/pgtable.h | 14 ++------------
arch/arm64/kernel/mte.c | 32 +++-----------------------------
arch/arm64/mm/mteswap.c | 7 +++----
4 files changed, 10 insertions(+), 47 deletions(-)

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index 20dd06d70af5..dfea486a6a85 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -90,7 +90,7 @@ static inline bool try_page_mte_tagging(struct page *page)
}

void mte_zero_clear_page_tags(void *addr);
-void mte_sync_tags(pte_t old_pte, pte_t pte);
+void mte_sync_tags(pte_t pte);
void mte_copy_page_tags(void *kto, const void *kfrom);
void mte_thread_init_user(void);
void mte_thread_switch(struct task_struct *next);
@@ -122,7 +122,7 @@ static inline bool try_page_mte_tagging(struct page *page)
static inline void mte_zero_clear_page_tags(void *addr)
{
}
-static inline void mte_sync_tags(pte_t old_pte, pte_t pte)
+static inline void mte_sync_tags(pte_t pte)
{
}
static inline void mte_copy_page_tags(void *kto, const void *kfrom)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index b6ba466e2e8a..efdf48392026 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -337,18 +337,8 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
* don't expose tags (instruction fetches don't check tags).
*/
if (system_supports_mte() && pte_access_permitted(pte, false) &&
- !pte_special(pte)) {
- pte_t old_pte = READ_ONCE(*ptep);
- /*
- * We only need to synchronise if the new PTE has tags enabled
- * or if swapping in (in which case another mapping may have
- * set tags in the past even if this PTE isn't tagged).
- * (!pte_none() && !pte_present()) is an open coded version of
- * is_swap_pte()
- */
- if (pte_tagged(pte) || (!pte_none(old_pte) && !pte_present(old_pte)))
- mte_sync_tags(old_pte, pte);
- }
+ !pte_special(pte) && pte_tagged(pte))
+ mte_sync_tags(pte);

__check_safe_pte_update(mm, ptep, pte);

diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index f5bcb0dc6267..c40728046fed 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -35,41 +35,15 @@ DEFINE_STATIC_KEY_FALSE(mte_async_or_asymm_mode);
EXPORT_SYMBOL_GPL(mte_async_or_asymm_mode);
#endif

-static void mte_sync_page_tags(struct page *page, pte_t old_pte,
- bool check_swap, bool pte_is_tagged)
-{
- if (check_swap && is_swap_pte(old_pte)) {
- swp_entry_t entry = pte_to_swp_entry(old_pte);
-
- if (!non_swap_entry(entry))
- mte_restore_tags(entry, page);
- }
-
- if (!pte_is_tagged)
diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c
index cd508ba80ab1..3a78bf1b1364 100644
--- a/arch/arm64/mm/mteswap.c
+++ b/arch/arm64/mm/mteswap.c
@@ -53,10 +53,9 @@ void mte_restore_tags(swp_entry_t entry, struct page *page)
if (!tags)
return;

- if (try_page_mte_tagging(page)) {
- mte_restore_page_tags(page_address(page), tags);
- set_page_mte_tagged(page);
- }
+ WARN_ON_ONCE(!try_page_mte_tagging(page));
+ mte_restore_page_tags(page_address(page), tags);
+ set_page_mte_tagged(page);
}

void mte_invalidate_tags(int type, pgoff_t offset)
--
2.40.1.606.ga4b1b128d6-goog

Peter Collingbourne

May 15, 2023, 10:35:31 PM
to David Hildenbrand, Catalin Marinas, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
I gave this some thought and concluded that the added complexity needed
to make this hook suitable for arm64 without breaking sparc probably
isn't worth it in the end, and as I explained in patch 2, sparc ought
to be moving away from this hook anyway. So in v2 I replaced patches 1
and 2 with a patch that adds a direct call to the arch_swap_restore()
hook before the call to swap_free().

Peter

David Hildenbrand

May 16, 2023, 8:31:05 AM
to Catalin Marinas, Peter Collingbourne, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
Thanks for the details! I was missing that we also have a hook in
swap_range_free().

>
> Is there any other way of handling this? E.g. not release the metadata
> in arch_swap_invalidate_page() but later in set_pte_at() once it was
> restored. But then we may leak this metadata if there's no set_pte_at()
> (the process mapping the swap entry died).

That was my immediate thought: do we really have to hook into
swap_range_free() at all? And I also wondered why we have to do this
from set_pte_at() and not do this explicitly (maybe that's the other
arch_* callback on the swapin path).

I'll have a look at v2, maybe it can be fixed easily without having to
shuffle around too much of the swapin code (which can easily break again
because the dependencies are not obvious at all and even undocumented in
the code).

David Hildenbrand

May 16, 2023, 8:35:32 AM
to Peter Collingbourne, Catalin Marinas, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
Would the idea be to fail swap_readpage() on the one that comes last,
simply retrying the page lookup?

This might be a naive question, but how does MTE play along with shared
anonymous pages?

David Hildenbrand

May 16, 2023, 8:40:41 AM
to Peter Collingbourne, Catalin Marinas, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
Oh. That's indeed confusing. Maybe that should all be squashed to have
one logical fix for the overall problem. It's especially confusing
because this patch here fixes the other two issues and touches code moved by
c145e0b47c77.

> - 6.1 is the first stable version in which all 3 commits in my Fixes: tags
> are present, so that is the version that I've indicated in my stable
> tag for this series. In theory patch 1 could be applied to older kernel
> versions, but it wouldn't fix any problems that we are facing with MTE
> (because it only fixes problems relating to the arch_do_swap_page()
> hook, which older kernel versions don't hook with MTE), and there are
> some merge conflicts if we go back further anyway. If the SPARC folks
> (the previous only user of this hook) want to fix these issues with ADI,
> they can propose their own backport.

Sometimes, it's a good idea to not specify a stable version and rather
let the Fixes: tags imply that.

David Hildenbrand

May 16, 2023, 8:49:47 AM
to Peter Collingbourne, Catalin Marinas, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
Looks much better to me, thanks :)

... staring at unuse_pte(), I suspect it also doesn't take care of MTE
tags and needs fixing?

Peter Collingbourne

May 16, 2023, 9:37:35 PM
to David Hildenbrand, Catalin Marinas, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
As I alluded to in another reply, without the hook in
swap_range_free() I think we would either end up with a race or an
effective memory leak in the arch code that maintains the metadata for
swapped out pages, as there would be no way for the arch-specific code
to know when it is safe to free it after swapin.

> And I also wondered why we have to do this
> from set_pte_at() and not do this explicitly (maybe that's the other
> arch_* callback on the swapin path).

I don't think it's necessary, as the set_pte_at() call sites for
swapped in pages are known. I'd much rather do this via an explicit
hook at those call sites, as the existing approach of implicit
restoring seems too subtle and easy to be overlooked when refactoring,
as we have seen with this bug. In the end we only have 3 call sites
for the hook and hopefully the comments that I'm adding are sufficient
to ensure that any new swapin code should end up with a call to the
hook in the right place.

Peter

Peter Collingbourne

May 16, 2023, 9:57:39 PM
to David Hildenbrand, Catalin Marinas, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
The idea would be that T2's arch_swap_readpage() could potentially not
find tags if it ran after swap_free(), so T2 would produce a page
without restored tags. But that wouldn't matter, because T1 reaching
swap_free() means that T2 will follow the goto at [1] after waiting
for T1 to unlock at [2], and T2's page will be discarded.

> This might be a naive question, but how does MTE play along with shared
> anonymous pages?

It should work fine. shmem_writepage() calls swap_writepage() which
calls arch_prepare_to_swap() to write the tags. And
shmem_swapin_folio() has a call to arch_swap_restore() to restore
them.

Peter

[1] https://github.com/torvalds/linux/blob/f1fcbaa18b28dec10281551dfe6ed3a3ed80e3d6/mm/memory.c#L3881
[2] https://github.com/torvalds/linux/blob/f1fcbaa18b28dec10281551dfe6ed3a3ed80e3d6/mm/memory.c#L4006

Peter Collingbourne

May 16, 2023, 10:13:53 PM
to David Hildenbrand, Catalin Marinas, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
Maybe. It can sometimes be hard to reconcile "one logical change per
patch" with "bug requires more than one logical change to fix" though.
Fortunately in this case I think we have an approach that fixes the
bug in one logical change, with some followup patches to clean things
up.

> > - 6.1 is the first stable version in which all 3 commits in my Fixes: tags
> > are present, so that is the version that I've indicated in my stable
> > tag for this series. In theory patch 1 could be applied to older kernel
> > versions, but it wouldn't fix any problems that we are facing with MTE
> > (because it only fixes problems relating to the arch_do_swap_page()
> > hook, which older kernel versions don't hook with MTE), and there are
> > some merge conflicts if we go back further anyway. If the SPARC folks
> > (the previous only user of this hook) want to fix these issues with ADI,
> > they can propose their own backport.
>
> Sometimes, it's a good idea to not specify a stable version and rather
> let the Fixes: tags imply that.

Yeah, but sometimes it's hard to say which way would be more
efficient. Either we spend time discussing why the version is
necessary or Greg spends time trying to apply patches to the wrong
trees because I wasn't more explicit...

Peter

Peter Collingbourne

May 16, 2023, 10:21:24 PM
to Catalin Marinas, Peter Collingbourne, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, da...@redhat.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price
This patch series reworks the logic that handles swapping in page
metadata to fix a reported bug [1] where metadata can sometimes not
be swapped in correctly after commit c145e0b47c77 ("mm: streamline COW
logic in do_swap_page()").

- Patch 1 fixes the bug itself, but still requires architectures
to restore metadata in both arch_swap_restore() and set_pte_at().

- Patch 2 makes it so that architectures only need to restore metadata
in arch_swap_restore().

- Patch 3 changes arm64 to remove support for restoring metadata
in set_pte_at().

[1] https://lore.kernel.org/all/5050805753ac469e8d727c7...@mediatek.com/

v3:
- Added patch to call arch_swap_restore() from unuse_pte()
- Rebased onto arm64/for-next/fixes

v2:
- Call arch_swap_restore() directly instead of via arch_do_swap_page()

Peter Collingbourne (3):
mm: Call arch_swap_restore() from do_swap_page()
mm: Call arch_swap_restore() from unuse_pte()
arm64: mte: Simplify swap tag restoration logic

arch/arm64/include/asm/mte.h | 4 ++--
arch/arm64/include/asm/pgtable.h | 14 ++----------
arch/arm64/kernel/mte.c | 37 ++++++--------------------------
arch/arm64/mm/mteswap.c | 7 +++---
mm/memory.c | 7 ++++++
mm/swapfile.c | 7 ++++++
6 files changed, 28 insertions(+), 48 deletions(-)

--
2.40.1.606.ga4b1b128d6-goog

Peter Collingbourne

May 16, 2023, 10:21:26 PM
to Catalin Marinas, Peter Collingbourne, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, da...@redhat.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
Commit c145e0b47c77 ("mm: streamline COW logic in do_swap_page()") moved
the call to swap_free() before the call to set_pte_at(), which meant that
the MTE tags could end up being freed before set_pte_at() had a chance
to restore them. Fix it by adding a call to the arch_swap_restore() hook
before the call to swap_free().

Signed-off-by: Peter Collingbourne <p...@google.com>
Fixes: c145e0b47c77 ("mm: streamline COW logic in do_swap_page()")
Reported-by: Qun-wei Lin (林群崴) <Qun-w...@mediatek.com>
Closes: https://lore.kernel.org/all/5050805753ac469e8d727c7...@mediatek.com/
---
v2:
- Call arch_swap_restore() directly instead of via arch_do_swap_page()

mm/memory.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index f69fbc251198..fc25764016b3 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3932,6 +3932,13 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
}
}

+ /*
+ * Some architectures may have to restore extra metadata to the page
+ * when reading from swap. This metadata may be indexed by swap entry
+ * so this must be called before swap_free().
+ */
+ arch_swap_restore(entry, folio);
+
/*
* Remove the swap entry and conditionally try to free up the swapcache.
* We're already holding a reference on the page but haven't mapped it
--
2.40.1.606.ga4b1b128d6-goog

Peter Collingbourne

May 16, 2023, 10:21:29 PM
to Catalin Marinas, Peter Collingbourne, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, da...@redhat.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price
We would like to move away from requiring architectures to restore
metadata from swap in the set_pte_at() implementation, as this is not only
error-prone but adds complexity to the arch-specific code. This requires
us to call arch_swap_restore() before calling swap_free() whenever pages
are restored from swap. We are currently doing so everywhere except in
unuse_pte(); do so there as well.

Signed-off-by: Peter Collingbourne <p...@google.com>
Link: https://linux-review.googlesource.com/id/I68276653e612d64cde271ce1b5a99ae05d6bbc4f
---
mm/swapfile.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 274bbf797480..e9843fadecd6 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1794,6 +1794,13 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
goto setpte;
}

+ /*
+ * Some architectures may have to restore extra metadata to the page
+ * when reading from swap. This metadata may be indexed by swap entry
+ * so this must be called before swap_free().
+ */
+ arch_swap_restore(entry, page_folio(page));
+
/* See do_swap_page() */
BUG_ON(!PageAnon(page) && PageMappedToDisk(page));
BUG_ON(PageAnon(page) && PageAnonExclusive(page));
--
2.40.1.606.ga4b1b128d6-goog

Peter Collingbourne

May 16, 2023, 10:21:30 PM
to Catalin Marinas, Peter Collingbourne, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, da...@redhat.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price
As a result of the previous two patches, there are no circumstances
in which a swapped-in page is installed in a page table without first
having arch_swap_restore() called on it. Therefore, we no longer need
the logic in set_pte_at() that restores the tags, so remove it.

Because we can now rely on the page being locked, we no longer need to
handle the case where a page is having its tags restored by multiple tasks
concurrently, so we can slightly simplify the logic in mte_restore_tags().

Signed-off-by: Peter Collingbourne <p...@google.com>
Link: https://linux-review.googlesource.com/id/I8ad54476f3b2d0144ccd8ce0c1d7a2963e5ff6f3
---
v3:
- Rebased onto arm64/for-next/fixes, which already has a fix
for the issue previously tagged, therefore removed Fixes:
tag

arch/arm64/include/asm/mte.h | 4 ++--
arch/arm64/include/asm/pgtable.h | 14 ++----------
arch/arm64/kernel/mte.c | 37 ++++++--------------------------
arch/arm64/mm/mteswap.c | 7 +++---
4 files changed, 14 insertions(+), 48 deletions(-)

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index c028afb1cd0b..4cedbaa16f41 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -90,7 +90,7 @@ static inline bool try_page_mte_tagging(struct page *page)
}

void mte_zero_clear_page_tags(void *addr);
-void mte_sync_tags(pte_t old_pte, pte_t pte);
+void mte_sync_tags(pte_t pte);
void mte_copy_page_tags(void *kto, const void *kfrom);
void mte_thread_init_user(void);
void mte_thread_switch(struct task_struct *next);
@@ -122,7 +122,7 @@ static inline bool try_page_mte_tagging(struct page *page)
static inline void mte_zero_clear_page_tags(void *addr)
{
}
-static inline void mte_sync_tags(pte_t old_pte, pte_t pte)
+static inline void mte_sync_tags(pte_t pte)
{
}
static inline void mte_copy_page_tags(void *kto, const void *kfrom)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 0bd18de9fd97..e8a252e62b12 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -337,18 +337,8 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
* don't expose tags (instruction fetches don't check tags).
*/
if (system_supports_mte() && pte_access_permitted(pte, false) &&
- !pte_special(pte)) {
- pte_t old_pte = READ_ONCE(*ptep);
- /*
- * We only need to synchronise if the new PTE has tags enabled
- * or if swapping in (in which case another mapping may have
- * set tags in the past even if this PTE isn't tagged).
- * (!pte_none() && !pte_present()) is an open coded version of
- * is_swap_pte()
- */
- if (pte_tagged(pte) || (!pte_none(old_pte) && !pte_present(old_pte)))
- mte_sync_tags(old_pte, pte);
- }
+ !pte_special(pte) && pte_tagged(pte))
+ mte_sync_tags(pte);

__check_safe_pte_update(mm, ptep, pte);

diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 7e89968bd282..c40728046fed 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -35,41 +35,18 @@ DEFINE_STATIC_KEY_FALSE(mte_async_or_asymm_mode);
- for (i = 0; i < nr_pages; i++, page++)
- if (!page_mte_tagged(page))
- mte_sync_page_tags(page, old_pte, check_swap,
- pte_is_tagged);
+ for (i = 0; i < nr_pages; i++, page++) {
+ if (try_page_mte_tagging(page)) {
+ mte_clear_page_tags(page_address(page));
+ set_page_mte_tagged(page);
+ }
+ }

/* ensure the tags are visible before the PTE is set */
smp_wmb();

Peter Collingbourne

May 16, 2023, 10:21:58 PM
to David Hildenbrand, Catalin Marinas, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
Nice catch, I've fixed it in v3.

I don't think there are any other cases like this. I looked for code
that decrements the MM_SWAPENTS counter and we're already covering all
of them.

Peter

Huang, Ying

May 16, 2023, 11:42:20 PM
to Peter Collingbourne, Catalin Marinas, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, da...@redhat.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
Should you add

Suggested-by: David Hildenbrand <da...@redhat.com>

for 1/3 and 2/3?

It looks good to me for the swap-code-related part. Feel free to add

Acked-by: "Huang, Ying" <ying....@intel.com>

to 1/3 and 2/3.

Best Regards,
Huang, Ying

David Hildenbrand

May 17, 2023, 4:30:25 AM
to Peter Collingbourne, Catalin Marinas, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
>> Would the idea be to fail swap_readpage() on the one that comes last,
>> simply retrying to lookup the page?
>
> The idea would be that T2's arch_swap_readpage() could potentially not
> find tags if it ran after swap_free(), so T2 would produce a page
> without restored tags. But that wouldn't matter, because T1 reaching
> swap_free() means that T2 will follow the goto at [1] after waiting
> for T1 to unlock at [2], and T2's page will be discarded.

Ah, right.

>
>> This might be a naive question, but how does MTE play along with shared
>> anonymous pages?
>
> It should work fine. shmem_writepage() calls swap_writepage() which
> calls arch_prepare_to_swap() to write the tags. And
> shmem_swapin_folio() has a call to arch_swap_restore() to restore
> them.

Sorry, I meant actual anonymous memory pages, not shmem. Like, anonymous
pages that are COW-shared due to fork() or KSM.

How does MTE, in general, interact with that? Assume one process ends up
modifying the tags ... and the page is COW-shared with a different
process that should not observe these tag modifications.

David Hildenbrand

May 17, 2023, 4:31:21 AM
to Peter Collingbourne, Catalin Marinas, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org

>>> Is there any other way of handling this? E.g. not release the metadata
>>> in arch_swap_invalidate_page() but later in set_pte_at() once it was
>>> restored. But then we may leak this metadata if there's no set_pte_at()
>>> (the process mapping the swap entry died).
>>
>> That was my immediate thought: do we really have to hook into
>> swap_range_free() at all?
>
> As I alluded to in another reply, without the hook in
> swap_range_free() I think we would either end up with a race or an
> effective memory leak in the arch code that maintains the metadata for
> swapped out pages, as there would be no way for the arch-specific code
> to know when it is safe to free it after swapin.

Agreed, hooking swap_range_free() is actually cleaner (also considering
COW-shared pages).

>
>> And I also wondered why we have to do this
>> from set_pte_at() and not do this explicitly (maybe that's the other
>> arch_* callback on the swapin path).
>
> I don't think it's necessary, as the set_pte_at() call sites for
> swapped in pages are known. I'd much rather do this via an explicit
> hook at those call sites, as the existing approach of implicit
> restoring seems too subtle and easy to be overlooked when refactoring,
> as we have seen with this bug. In the end we only have 3 call sites
> for the hook and hopefully the comments that I'm adding are sufficient
> to ensure that any new swapin code should end up with a call to the
> hook in the right place.


Agreed, much cleaner, thanks!

David Hildenbrand

May 17, 2023, 4:34:48 AM
to Peter Collingbourne, Catalin Marinas, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
On 16.05.23 04:35, Peter Collingbourne wrote:
As a side note, I recall that the sparc code might be a bit fragile and
possibly already broken (arch_unmap_one()):

https://lkml.kernel.org/r/d98bd1f9-e9b7-049c...@redhat.com

David Hildenbrand

May 17, 2023, 4:37:28 AM
to Huang, Ying, Peter Collingbourne, Catalin Marinas, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, Steven Price, sta...@vger.kernel.org
For 1/3, I think I rather only explained the problem in the first patch
and didn't really suggest this.

Acked-by: David Hildenbrand <da...@redhat.com>


Steven Price

May 17, 2023, 10:57:50 AM
to Peter Collingbourne, Catalin Marinas, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, da...@redhat.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com, sta...@vger.kernel.org
On 17/05/2023 03:21, Peter Collingbourne wrote:
> Commit c145e0b47c77 ("mm: streamline COW logic in do_swap_page()") moved
> the call to swap_free() before the call to set_pte_at(), which meant that
> the MTE tags could end up being freed before set_pte_at() had a chance
> to restore them. Fix it by adding a call to the arch_swap_restore() hook
> before the call to swap_free().
>
> Signed-off-by: Peter Collingbourne <p...@google.com>
> Link: https://linux-review.googlesource.com/id/I6470efa669e8bd2f841049b8c61020c510678965
> Cc: <sta...@vger.kernel.org> # 6.1
> Fixes: c145e0b47c77 ("mm: streamline COW logic in do_swap_page()")
> Reported-by: Qun-wei Lin (林群崴) <Qun-w...@mediatek.com>
> Closes: https://lore.kernel.org/all/5050805753ac469e8d727c7...@mediatek.com/

Reviewed-by: Steven Price <steven...@arm.com>

Steven Price

May 17, 2023, 10:58:12 AM
to Peter Collingbourne, Catalin Marinas, Qun-wei Lin (林群崴), linux-ar...@lists.infradead.org, linu...@kvack.org, linux-...@vger.kernel.org, sur...@google.com, da...@redhat.com, Chinwen Chang (張錦文), kasa...@googlegroups.com, Kuan-Ying Lee (李冠穎), Casper Li (李中榮), gre...@linuxfoundation.org, vincenzo...@arm.com, Alexandru Elisei, wi...@kernel.org, eug...@google.com
On 17/05/2023 03:21, Peter Collingbourne wrote:
> We would like to move away from requiring architectures to restore
> metadata from swap in the set_pte_at() implementation, as this is not only
> error-prone but adds complexity to the arch-specific code. This requires
> us to call arch_swap_restore() before calling swap_free() whenever pages
> are restored from swap. We are currently doing so everywhere except in
> unuse_pte(); do so there as well.
>
> Signed-off-by: Peter Collingbourne <p...@google.com>
> Link: https://linux-review.googlesource.com/id/I68276653e612d64cde271ce1b5a99ae05d6bbc4f

Reviewed-by: Steven Price <steven...@arm.com>