[PATCH v2 00/33] Separate struct slab from struct page


Vlastimil Babka

Dec 1, 2021, 1:15:17 PM12/1/21
to Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Vlastimil Babka, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org
Folks from non-slab subsystems are Cc'd only to patches affecting them, and
this cover letter.

Series also available in git, based on 5.16-rc3:
https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-struct_slab-v2r2

The plan: as with my SLUB PREEMPT_RT series in 5.15, I would again prefer to
eventually merge this via a git pull request, as it's also not a small
series. I will thus reply to this mail asking for my branch to be included in
linux-next.

As stated in the v1/RFC cover letter, I wouldn't mind then continuing to
maintain a git tree for all slab patches in general. It was apparently
already done that way before, by Pekka:
https://lore.kernel.org/linux-mm/alpine.DEB.2.00.1107221108190.2996@tiger/

Changes from v1/RFC:
https://lore.kernel.org/all/20211116001628...@suse.cz/
- Added virt_to_folio() and folio_address() in the new Patch 1.
- Addressed feedback from Andrey Konovalov and Matthew Wilcox (Thanks!)
- Added Tested-by: Marco Elver for the KFENCE parts (Thanks!)

Previous version from Matthew Wilcox:
https://lore.kernel.org/all/20211004134650....@infradead.org/

LWN coverage of the above:
https://lwn.net/Articles/871982/

This is originally an offshoot of the folio work by Matthew. One of the more
complex parts of the struct page definition is the set of fields used by the
slab allocators. It would be good for the MM in general if struct slab were
its own data type, and it also helps to prevent tail pages from slipping in
anywhere. As Matthew requested in his proof-of-concept series, I have taken
over the development of this series, so it's a mix of patches from him (often
modified by me) and my own.

One big difference is the use of coccinelle to perform the relatively trivial
parts of the conversions automatically and at once, instead of a larger number
of smaller incremental reviewable steps. Thanks to Julia Lawall and Luis
Chamberlain for all their help!

Another notable difference (based also on review feedback) is that I don't
represent large kmalloc allocations with a struct slab, as they are not
really slabs but use the page allocator directly. When going from an object
address to a struct slab, the code first tests the folio slab flag, and only
if it's set does it convert to struct slab. This makes the struct slab type
stronger.
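
A minimal sketch of that conversion path, modeled on how virt_to_folio(),
folio_test_slab() and folio_slab() are used in the KASAN patch later in this
thread (the helper name below is made up for illustration; the real
virt_to_slab() in the series may differ in detail):

/*
 * Illustrative sketch only: going from an object address to a struct slab
 * via the folio slab flag, as virt_to_slab() does in this series.
 */
static inline struct slab *object_to_slab_sketch(const void *addr)
{
	struct folio *folio = virt_to_folio(addr);

	/* Large kmalloc allocations use the page allocator directly and
	 * do not have the slab flag set, so they are rejected here. */
	if (!folio_test_slab(folio))
		return NULL;

	return folio_slab(folio);
}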

Finally, although Matthew's version didn't use any of the folio work, the
initial folio support has been merged in the meantime, so my version builds
on top of it where appropriate. This eliminates some of the redundant
compound_head() calls, e.g. when testing the slab flag.

To sum up, after this series the struct page fields used by the slab
allocators are moved from struct page to a new struct slab that uses the same
physical storage (a layout sketch follows the list below). Which fields are
available further depends on the selected slab allocator implementation. The
advantages include:

- Similar to folios, if the slab is of order > 0, a struct slab is always
guaranteed to correspond to the head page. Additionally, it's guaranteed to
be an actual slab page, not a large kmalloc. This removes uncertainty and
potential for bugs.
- It's not possible to accidentally use fields of a slab implementation that
isn't configured.
- Other subsystems can no longer use slab's fields in struct page (some
existing non-slab usages had to be adjusted in this series), so slab
implementations have more freedom to rearrange them in struct slab.
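
Because struct slab reuses the physical storage of struct page, the fields
still consumed by the page allocator and common MM code have to stay at the
same offsets. A condensed sketch of how the series enforces this, based on
the SLAB_MATCH() static asserts visible in the mm/slab.h hunks later in this
thread (the macro definition line is reconstructed here; only its body and
its uses appear in the diffs):

#define SLAB_MATCH(pg, sl)						\
	static_assert(offsetof(struct page, pg) == offsetof(struct slab, sl))

SLAB_MATCH(flags, __page_flags);
SLAB_MATCH(compound_head, slab_list);	/* Ensure bit 0 is clear */
SLAB_MATCH(_refcount, __page_refcount);
#ifdef CONFIG_MEMCG
SLAB_MATCH(memcg_data, memcg_data);
#endif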

Matthew Wilcox (Oracle) (16):
mm: Split slab into its own type
mm: Add account_slab() and unaccount_slab()
mm: Convert virt_to_cache() to use struct slab
mm: Convert __ksize() to struct slab
mm: Use struct slab in kmem_obj_info()
mm: Convert check_heap_object() to use struct slab
mm/slub: Convert detached_freelist to use a struct slab
mm/slub: Convert kfree() to use a struct slab
mm/slub: Convert print_page_info() to print_slab_info()
mm/slub: Convert pfmemalloc_match() to take a struct slab
mm/slob: Convert SLOB to use struct slab
mm/kasan: Convert to struct folio and struct slab
zsmalloc: Stop using slab fields in struct page
bootmem: Use page->index instead of page->freelist
iommu: Use put_pages_list
mm: Remove slab from struct page

Vlastimil Babka (17):
mm: add virt_to_folio() and folio_address()
mm/slab: Dissolve slab_map_pages() in its caller
mm/slub: Make object_err() static
mm/slub: Convert __slab_lock() and __slab_unlock() to struct slab
mm/slub: Convert alloc_slab_page() to return a struct slab
mm/slub: Convert __free_slab() to use struct slab
mm/slub: Convert most struct page to struct slab by spatch
mm/slub: Finish struct page to struct slab conversion
mm/slab: Convert kmem_getpages() and kmem_freepages() to struct slab
mm/slab: Convert most struct page to struct slab by spatch
mm/slab: Finish struct page to struct slab conversion
mm: Convert struct page to struct slab in functions used by other
subsystems
mm/memcg: Convert slab objcgs from struct page to struct slab
mm/kfence: Convert kfence_guarded_alloc() to struct slab
mm/sl*b: Differentiate struct slab fields by sl*b implementations
mm/slub: Simplify struct slab slabs field definition
mm/slub: Define struct slab fields for CONFIG_SLUB_CPU_PARTIAL only
when enabled

arch/x86/mm/init_64.c | 2 +-
drivers/iommu/amd/io_pgtable.c | 59 +-
drivers/iommu/dma-iommu.c | 11 +-
drivers/iommu/intel/iommu.c | 89 +--
include/linux/bootmem_info.h | 2 +-
include/linux/iommu.h | 3 +-
include/linux/kasan.h | 9 +-
include/linux/memcontrol.h | 48 --
include/linux/mm.h | 12 +
include/linux/mm_types.h | 38 +-
include/linux/page-flags.h | 37 -
include/linux/slab.h | 8 -
include/linux/slab_def.h | 16 +-
include/linux/slub_def.h | 29 +-
mm/bootmem_info.c | 7 +-
mm/kasan/common.c | 27 +-
mm/kasan/generic.c | 8 +-
mm/kasan/kasan.h | 1 +
mm/kasan/quarantine.c | 2 +-
mm/kasan/report.c | 13 +-
mm/kasan/report_tags.c | 10 +-
mm/kfence/core.c | 17 +-
mm/kfence/kfence_test.c | 6 +-
mm/memcontrol.c | 43 +-
mm/slab.c | 455 ++++++-------
mm/slab.h | 322 ++++++++-
mm/slab_common.c | 8 +-
mm/slob.c | 46 +-
mm/slub.c | 1164 ++++++++++++++++----------------
mm/sparse.c | 2 +-
mm/usercopy.c | 13 +-
mm/zsmalloc.c | 18 +-
32 files changed, 1317 insertions(+), 1208 deletions(-)

--
2.33.1

Vlastimil Babka

Dec 1, 2021, 1:15:20 PM12/1/21
to Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Vlastimil Babka, Julia Lawall, Luis Chamberlain, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Marco Elver, Johannes Weiner, Michal Hocko, Vladimir Davydov, kasa...@googlegroups.com, cgr...@vger.kernel.org
KASAN, KFENCE and memcg interact with SLAB or SLUB internals through the
functions nearest_obj(), obj_to_index() and objs_per_slab(), which take
struct page as a parameter. This patch converts them, including all callers,
to struct slab through a coccinelle semantic patch.
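
For reference, the converted helper signatures (copied from the slab_def.h
and slub_def.h hunks below; only the prototypes are shown here) end up as:

static inline void *nearest_obj(struct kmem_cache *cache,
				const struct slab *slab, void *x);
static inline unsigned int obj_to_index(const struct kmem_cache *cache,
					const struct slab *slab, void *obj);
static inline int objs_per_slab(const struct kmem_cache *cache,
				const struct slab *slab);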

// Options: --include-headers --no-includes --smpl-spacing include/linux/slab_def.h include/linux/slub_def.h mm/slab.h mm/kasan/*.c mm/kfence/kfence_test.c mm/memcontrol.c mm/slab.c mm/slub.c
// Note: needs coccinelle 1.1.1 to avoid breaking whitespace

@@
@@

-objs_per_slab_page(
+objs_per_slab(
...
)
{ ... }

@@
@@

-objs_per_slab_page(
+objs_per_slab(
...
)

@@
identifier fn =~ "obj_to_index|objs_per_slab";
@@

fn(...,
- const struct page *page
+ const struct slab *slab
,...)
{
<...
(
- page_address(page)
+ slab_address(slab)
|
- page
+ slab
)
...>
}

@@
identifier fn =~ "nearest_obj";
@@

fn(...,
- struct page *page
+ const struct slab *slab
,...)
{
<...
(
- page_address(page)
+ slab_address(slab)
|
- page
+ slab
)
...>
}

@@
identifier fn =~ "nearest_obj|obj_to_index|objs_per_slab";
expression E;
@@

fn(...,
(
- slab_page(E)
+ E
|
- virt_to_page(E)
+ virt_to_slab(E)
|
- virt_to_head_page(E)
+ virt_to_slab(E)
|
- page
+ page_slab(page)
)
,...)

Signed-off-by: Vlastimil Babka <vba...@suse.cz>
Cc: Julia Lawall <julia....@inria.fr>
Cc: Luis Chamberlain <mcg...@kernel.org>
Cc: Andrey Ryabinin <ryabin...@gmail.com>
Cc: Alexander Potapenko <gli...@google.com>
Cc: Andrey Konovalov <andre...@gmail.com>
Cc: Dmitry Vyukov <dvy...@google.com>
Cc: Marco Elver <el...@google.com>
Cc: Johannes Weiner <han...@cmpxchg.org>
Cc: Michal Hocko <mho...@kernel.org>
Cc: Vladimir Davydov <vdavyd...@gmail.com>
Cc: <kasa...@googlegroups.com>
Cc: <cgr...@vger.kernel.org>
---
include/linux/slab_def.h | 16 ++++++++--------
include/linux/slub_def.h | 18 +++++++++---------
mm/kasan/common.c | 4 ++--
mm/kasan/generic.c | 2 +-
mm/kasan/report.c | 2 +-
mm/kasan/report_tags.c | 2 +-
mm/kfence/kfence_test.c | 4 ++--
mm/memcontrol.c | 4 ++--
mm/slab.c | 10 +++++-----
mm/slab.h | 4 ++--
mm/slub.c | 2 +-
11 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
index 3aa5e1e73ab6..e24c9aff6fed 100644
--- a/include/linux/slab_def.h
+++ b/include/linux/slab_def.h
@@ -87,11 +87,11 @@ struct kmem_cache {
struct kmem_cache_node *node[MAX_NUMNODES];
};

-static inline void *nearest_obj(struct kmem_cache *cache, struct page *page,
+static inline void *nearest_obj(struct kmem_cache *cache, const struct slab *slab,
void *x)
{
- void *object = x - (x - page->s_mem) % cache->size;
- void *last_object = page->s_mem + (cache->num - 1) * cache->size;
+ void *object = x - (x - slab->s_mem) % cache->size;
+ void *last_object = slab->s_mem + (cache->num - 1) * cache->size;

if (unlikely(object > last_object))
return last_object;
@@ -106,16 +106,16 @@ static inline void *nearest_obj(struct kmem_cache *cache, struct page *page,
* reciprocal_divide(offset, cache->reciprocal_buffer_size)
*/
static inline unsigned int obj_to_index(const struct kmem_cache *cache,
- const struct page *page, void *obj)
+ const struct slab *slab, void *obj)
{
- u32 offset = (obj - page->s_mem);
+ u32 offset = (obj - slab->s_mem);
return reciprocal_divide(offset, cache->reciprocal_buffer_size);
}

-static inline int objs_per_slab_page(const struct kmem_cache *cache,
- const struct page *page)
+static inline int objs_per_slab(const struct kmem_cache *cache,
+ const struct slab *slab)
{
- if (is_kfence_address(page_address(page)))
+ if (is_kfence_address(slab_address(slab)))
return 1;
return cache->num;
}
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 8a9c2876ca89..33c5c0e3bd8d 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -158,11 +158,11 @@ static inline void sysfs_slab_release(struct kmem_cache *s)

void *fixup_red_left(struct kmem_cache *s, void *p);

-static inline void *nearest_obj(struct kmem_cache *cache, struct page *page,
+static inline void *nearest_obj(struct kmem_cache *cache, const struct slab *slab,
void *x) {
- void *object = x - (x - page_address(page)) % cache->size;
- void *last_object = page_address(page) +
- (page->objects - 1) * cache->size;
+ void *object = x - (x - slab_address(slab)) % cache->size;
+ void *last_object = slab_address(slab) +
+ (slab->objects - 1) * cache->size;
void *result = (unlikely(object > last_object)) ? last_object : object;

result = fixup_red_left(cache, result);
@@ -178,16 +178,16 @@ static inline unsigned int __obj_to_index(const struct kmem_cache *cache,
}

static inline unsigned int obj_to_index(const struct kmem_cache *cache,
- const struct page *page, void *obj)
+ const struct slab *slab, void *obj)
{
if (is_kfence_address(obj))
return 0;
- return __obj_to_index(cache, page_address(page), obj);
+ return __obj_to_index(cache, slab_address(slab), obj);
}

-static inline int objs_per_slab_page(const struct kmem_cache *cache,
- const struct page *page)
+static inline int objs_per_slab(const struct kmem_cache *cache,
+ const struct slab *slab)
{
- return page->objects;
+ return slab->objects;
}
#endif /* _LINUX_SLUB_DEF_H */
diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index 8428da2aaf17..6a1cd2d38bff 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -298,7 +298,7 @@ static inline u8 assign_tag(struct kmem_cache *cache,
/* For caches that either have a constructor or SLAB_TYPESAFE_BY_RCU: */
#ifdef CONFIG_SLAB
/* For SLAB assign tags based on the object index in the freelist. */
- return (u8)obj_to_index(cache, virt_to_head_page(object), (void *)object);
+ return (u8)obj_to_index(cache, virt_to_slab(object), (void *)object);
#else
/*
* For SLUB assign a random tag during slab creation, otherwise reuse
@@ -341,7 +341,7 @@ static inline bool ____kasan_slab_free(struct kmem_cache *cache, void *object,
if (is_kfence_address(object))
return false;

- if (unlikely(nearest_obj(cache, virt_to_head_page(object), object) !=
+ if (unlikely(nearest_obj(cache, virt_to_slab(object), object) !=
object)) {
kasan_report_invalid_free(tagged_object, ip);
return true;
diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
index 84a038b07c6f..5d0b79416c4e 100644
--- a/mm/kasan/generic.c
+++ b/mm/kasan/generic.c
@@ -339,7 +339,7 @@ static void __kasan_record_aux_stack(void *addr, bool can_alloc)
return;

cache = page->slab_cache;
- object = nearest_obj(cache, page, addr);
+ object = nearest_obj(cache, page_slab(page), addr);
alloc_meta = kasan_get_alloc_meta(cache, object);
if (!alloc_meta)
return;
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 0bc10f452f7e..e00999dc6499 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -249,7 +249,7 @@ static void print_address_description(void *addr, u8 tag)

if (page && PageSlab(page)) {
struct kmem_cache *cache = page->slab_cache;
- void *object = nearest_obj(cache, page, addr);
+ void *object = nearest_obj(cache, page_slab(page), addr);

describe_object(cache, object, addr, tag);
}
diff --git a/mm/kasan/report_tags.c b/mm/kasan/report_tags.c
index 8a319fc16dab..06c21dd77493 100644
--- a/mm/kasan/report_tags.c
+++ b/mm/kasan/report_tags.c
@@ -23,7 +23,7 @@ const char *kasan_get_bug_type(struct kasan_access_info *info)
page = kasan_addr_to_page(addr);
if (page && PageSlab(page)) {
cache = page->slab_cache;
- object = nearest_obj(cache, page, (void *)addr);
+ object = nearest_obj(cache, page_slab(page), (void *)addr);
alloc_meta = kasan_get_alloc_meta(cache, object);

if (alloc_meta) {
diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
index 695030c1fff8..f7276711d7b9 100644
--- a/mm/kfence/kfence_test.c
+++ b/mm/kfence/kfence_test.c
@@ -291,8 +291,8 @@ static void *test_alloc(struct kunit *test, size_t size, gfp_t gfp, enum allocat
* even for KFENCE objects; these are required so that
* memcg accounting works correctly.
*/
- KUNIT_EXPECT_EQ(test, obj_to_index(s, page, alloc), 0U);
- KUNIT_EXPECT_EQ(test, objs_per_slab_page(s, page), 1);
+ KUNIT_EXPECT_EQ(test, obj_to_index(s, page_slab(page), alloc), 0U);
+ KUNIT_EXPECT_EQ(test, objs_per_slab(s, page_slab(page)), 1);

if (policy == ALLOCATE_ANY)
return alloc;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 6863a834ed42..906edbd92436 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2819,7 +2819,7 @@ static struct mem_cgroup *get_mem_cgroup_from_objcg(struct obj_cgroup *objcg)
int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s,
gfp_t gfp, bool new_page)
{
- unsigned int objects = objs_per_slab_page(s, page);
+ unsigned int objects = objs_per_slab(s, page_slab(page));
unsigned long memcg_data;
void *vec;

@@ -2881,7 +2881,7 @@ struct mem_cgroup *mem_cgroup_from_obj(void *p)
struct obj_cgroup *objcg;
unsigned int off;

- off = obj_to_index(page->slab_cache, page, p);
+ off = obj_to_index(page->slab_cache, page_slab(page), p);
objcg = page_objcgs(page)[off];
if (objcg)
return obj_cgroup_memcg(objcg);
diff --git a/mm/slab.c b/mm/slab.c
index f0447b087d02..785fffd527fe 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1560,7 +1560,7 @@ static void check_poison_obj(struct kmem_cache *cachep, void *objp)
struct slab *slab = virt_to_slab(objp);
unsigned int objnr;

- objnr = obj_to_index(cachep, slab_page(slab), objp);
+ objnr = obj_to_index(cachep, slab, objp);
if (objnr) {
objp = index_to_obj(cachep, slab, objnr - 1);
realobj = (char *)objp + obj_offset(cachep);
@@ -2530,7 +2530,7 @@ static void *slab_get_obj(struct kmem_cache *cachep, struct slab *slab)
static void slab_put_obj(struct kmem_cache *cachep,
struct slab *slab, void *objp)
{
- unsigned int objnr = obj_to_index(cachep, slab_page(slab), objp);
+ unsigned int objnr = obj_to_index(cachep, slab, objp);
#if DEBUG
unsigned int i;

@@ -2717,7 +2717,7 @@ static void *cache_free_debugcheck(struct kmem_cache *cachep, void *objp,
if (cachep->flags & SLAB_STORE_USER)
*dbg_userword(cachep, objp) = (void *)caller;

- objnr = obj_to_index(cachep, slab_page(slab), objp);
+ objnr = obj_to_index(cachep, slab, objp);

BUG_ON(objnr >= cachep->num);
BUG_ON(objp != index_to_obj(cachep, slab, objnr));
@@ -3663,7 +3663,7 @@ void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab)
objp = object - obj_offset(cachep);
kpp->kp_data_offset = obj_offset(cachep);
slab = virt_to_slab(objp);
- objnr = obj_to_index(cachep, slab_page(slab), objp);
+ objnr = obj_to_index(cachep, slab, objp);
objp = index_to_obj(cachep, slab, objnr);
kpp->kp_objp = objp;
if (DEBUG && cachep->flags & SLAB_STORE_USER)
@@ -4181,7 +4181,7 @@ void __check_heap_object(const void *ptr, unsigned long n,

/* Find and validate object. */
cachep = slab->slab_cache;
- objnr = obj_to_index(cachep, slab_page(slab), (void *)ptr);
+ objnr = obj_to_index(cachep, slab, (void *)ptr);
BUG_ON(objnr >= cachep->num);

/* Find offset within object. */
diff --git a/mm/slab.h b/mm/slab.h
index 7376c9d8aa2b..15d109d8ec89 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -483,7 +483,7 @@ static inline void memcg_slab_post_alloc_hook(struct kmem_cache *s,
continue;
}

- off = obj_to_index(s, page, p[i]);
+ off = obj_to_index(s, page_slab(page), p[i]);
obj_cgroup_get(objcg);
page_objcgs(page)[off] = objcg;
mod_objcg_state(objcg, page_pgdat(page),
@@ -522,7 +522,7 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s_orig,
else
s = s_orig;

- off = obj_to_index(s, page, p[i]);
+ off = obj_to_index(s, page_slab(page), p[i]);
objcg = objcgs[off];
if (!objcg)
continue;
diff --git a/mm/slub.c b/mm/slub.c
index f5344211d8cc..61aaaa662c5e 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4342,7 +4342,7 @@ void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab)
#else
objp = objp0;
#endif
- objnr = obj_to_index(s, slab_page(slab), objp);
+ objnr = obj_to_index(s, slab, objp);
kpp->kp_data_offset = (unsigned long)((char *)objp0 - (char *)objp);
objp = base + s->size * objnr;
kpp->kp_objp = objp;
--
2.33.1

Vlastimil Babka

Dec 1, 2021, 1:15:21 PM12/1/21
to Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Vlastimil Babka, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, kasa...@googlegroups.com
From: "Matthew Wilcox (Oracle)" <wi...@infradead.org>

KASAN accesses some slab-related struct page fields, so we need to convert
it to struct slab. Some places are a bit simplified thanks to
kasan_addr_to_slab() encapsulating the PageSlab flag check through
virt_to_slab(). When resolving an object address to either a real slab or a
large kmalloc allocation, use struct folio as the intermediate type for
testing the slab flag, to avoid an unnecessary implicit compound_head() call.

[ vba...@suse.cz: use struct folio, adjust to differences in previous patches ]

Signed-off-by: Matthew Wilcox (Oracle) <wi...@infradead.org>
Signed-off-by: Vlastimil Babka <vba...@suse.cz>
Cc: Andrey Ryabinin <ryabin...@gmail.com>
Cc: Alexander Potapenko <gli...@google.com>
Cc: Andrey Konovalov <andre...@gmail.com>
Cc: Dmitry Vyukov <dvy...@google.com>
Cc: <kasa...@googlegroups.com>
---
include/linux/kasan.h | 9 +++++----
mm/kasan/common.c | 23 +++++++++++++----------
mm/kasan/generic.c | 8 ++++----
mm/kasan/kasan.h | 1 +
mm/kasan/quarantine.c | 2 +-
mm/kasan/report.c | 13 +++++++++++--
mm/kasan/report_tags.c | 10 +++++-----
mm/slab.c | 2 +-
mm/slub.c | 2 +-
9 files changed, 42 insertions(+), 28 deletions(-)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index d8783b682669..fb78108d694e 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -9,6 +9,7 @@

struct kmem_cache;
struct page;
+struct slab;
struct vm_struct;
struct task_struct;

@@ -193,11 +194,11 @@ static __always_inline size_t kasan_metadata_size(struct kmem_cache *cache)
return 0;
}

-void __kasan_poison_slab(struct page *page);
-static __always_inline void kasan_poison_slab(struct page *page)
+void __kasan_poison_slab(struct slab *slab);
+static __always_inline void kasan_poison_slab(struct slab *slab)
{
if (kasan_enabled())
- __kasan_poison_slab(page);
+ __kasan_poison_slab(slab);
}

void __kasan_unpoison_object_data(struct kmem_cache *cache, void *object);
@@ -322,7 +323,7 @@ static inline void kasan_cache_create(struct kmem_cache *cache,
slab_flags_t *flags) {}
static inline void kasan_cache_create_kmalloc(struct kmem_cache *cache) {}
static inline size_t kasan_metadata_size(struct kmem_cache *cache) { return 0; }
-static inline void kasan_poison_slab(struct page *page) {}
+static inline void kasan_poison_slab(struct slab *slab) {}
static inline void kasan_unpoison_object_data(struct kmem_cache *cache,
void *object) {}
static inline void kasan_poison_object_data(struct kmem_cache *cache,
diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index 6a1cd2d38bff..7c06db78a76c 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -247,8 +247,9 @@ struct kasan_free_meta *kasan_get_free_meta(struct kmem_cache *cache,
}
#endif

-void __kasan_poison_slab(struct page *page)
+void __kasan_poison_slab(struct slab *slab)
{
+ struct page *page = slab_page(slab);
unsigned long i;

for (i = 0; i < compound_nr(page); i++)
@@ -401,9 +402,9 @@ void __kasan_kfree_large(void *ptr, unsigned long ip)

void __kasan_slab_free_mempool(void *ptr, unsigned long ip)
{
- struct page *page;
+ struct folio *folio;

- page = virt_to_head_page(ptr);
+ folio = virt_to_folio(ptr);

/*
* Even though this function is only called for kmem_cache_alloc and
@@ -411,12 +412,14 @@ void __kasan_slab_free_mempool(void *ptr, unsigned long ip)
* !PageSlab() when the size provided to kmalloc is larger than
* KMALLOC_MAX_SIZE, and kmalloc falls back onto page_alloc.
*/
- if (unlikely(!PageSlab(page))) {
+ if (unlikely(!folio_test_slab(folio))) {
if (____kasan_kfree_large(ptr, ip))
return;
- kasan_poison(ptr, page_size(page), KASAN_FREE_PAGE, false);
+ kasan_poison(ptr, folio_size(folio), KASAN_FREE_PAGE, false);
} else {
- ____kasan_slab_free(page->slab_cache, ptr, ip, false, false);
+ struct slab *slab = folio_slab(folio);
+
+ ____kasan_slab_free(slab->slab_cache, ptr, ip, false, false);
}
}

@@ -560,7 +563,7 @@ void * __must_check __kasan_kmalloc_large(const void *ptr, size_t size,

void * __must_check __kasan_krealloc(const void *object, size_t size, gfp_t flags)
{
- struct page *page;
+ struct slab *slab;

if (unlikely(object == ZERO_SIZE_PTR))
return (void *)object;
@@ -572,13 +575,13 @@ void * __must_check __kasan_krealloc(const void *object, size_t size, gfp_t flag
*/
kasan_unpoison(object, size, false);

- page = virt_to_head_page(object);
+ slab = virt_to_slab(object);

/* Piggy-back on kmalloc() instrumentation to poison the redzone. */
- if (unlikely(!PageSlab(page)))
+ if (unlikely(!slab))
return __kasan_kmalloc_large(object, size, flags);
else
- return ____kasan_kmalloc(page->slab_cache, object, size, flags);
+ return ____kasan_kmalloc(slab->slab_cache, object, size, flags);
}

bool __kasan_check_byte(const void *address, unsigned long ip)
diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
index 5d0b79416c4e..a25ad4090615 100644
--- a/mm/kasan/generic.c
+++ b/mm/kasan/generic.c
@@ -330,16 +330,16 @@ DEFINE_ASAN_SET_SHADOW(f8);

static void __kasan_record_aux_stack(void *addr, bool can_alloc)
{
- struct page *page = kasan_addr_to_page(addr);
+ struct slab *slab = kasan_addr_to_slab(addr);
struct kmem_cache *cache;
struct kasan_alloc_meta *alloc_meta;
void *object;

- if (is_kfence_address(addr) || !(page && PageSlab(page)))
+ if (is_kfence_address(addr) || !slab)
return;

- cache = page->slab_cache;
- object = nearest_obj(cache, page_slab(page), addr);
+ cache = slab->slab_cache;
+ object = nearest_obj(cache, slab, addr);
alloc_meta = kasan_get_alloc_meta(cache, object);
if (!alloc_meta)
return;
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index aebd8df86a1f..c17fa8d26ffe 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -265,6 +265,7 @@ bool kasan_report(unsigned long addr, size_t size,
void kasan_report_invalid_free(void *object, unsigned long ip);

struct page *kasan_addr_to_page(const void *addr);
+struct slab *kasan_addr_to_slab(const void *addr);

depot_stack_handle_t kasan_save_stack(gfp_t flags, bool can_alloc);
void kasan_set_track(struct kasan_track *track, gfp_t flags);
diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
index d8ccff4c1275..587da8995f2d 100644
--- a/mm/kasan/quarantine.c
+++ b/mm/kasan/quarantine.c
@@ -117,7 +117,7 @@ static unsigned long quarantine_batch_size;

static struct kmem_cache *qlink_to_cache(struct qlist_node *qlink)
{
- return virt_to_head_page(qlink)->slab_cache;
+ return virt_to_slab(qlink)->slab_cache;
}

static void *qlink_to_object(struct qlist_node *qlink, struct kmem_cache *cache)
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index e00999dc6499..3ad9624dcc56 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -150,6 +150,14 @@ struct page *kasan_addr_to_page(const void *addr)
return NULL;
}

+struct slab *kasan_addr_to_slab(const void *addr)
+{
+ if ((addr >= (void *)PAGE_OFFSET) &&
+ (addr < high_memory))
+ return virt_to_slab(addr);
+ return NULL;
+}
+
static void describe_object_addr(struct kmem_cache *cache, void *object,
const void *addr)
{
@@ -248,8 +256,9 @@ static void print_address_description(void *addr, u8 tag)
pr_err("\n");

if (page && PageSlab(page)) {
- struct kmem_cache *cache = page->slab_cache;
- void *object = nearest_obj(cache, page_slab(page), addr);
+ struct slab *slab = page_slab(page);
+ struct kmem_cache *cache = slab->slab_cache;
+ void *object = nearest_obj(cache, slab, addr);

describe_object(cache, object, addr, tag);
}
diff --git a/mm/kasan/report_tags.c b/mm/kasan/report_tags.c
index 06c21dd77493..1b41de88c53e 100644
--- a/mm/kasan/report_tags.c
+++ b/mm/kasan/report_tags.c
@@ -12,7 +12,7 @@ const char *kasan_get_bug_type(struct kasan_access_info *info)
#ifdef CONFIG_KASAN_TAGS_IDENTIFY
struct kasan_alloc_meta *alloc_meta;
struct kmem_cache *cache;
- struct page *page;
+ struct slab *slab;
const void *addr;
void *object;
u8 tag;
@@ -20,10 +20,10 @@ const char *kasan_get_bug_type(struct kasan_access_info *info)

tag = get_tag(info->access_addr);
addr = kasan_reset_tag(info->access_addr);
- page = kasan_addr_to_page(addr);
- if (page && PageSlab(page)) {
- cache = page->slab_cache;
- object = nearest_obj(cache, page_slab(page), (void *)addr);
+ slab = kasan_addr_to_slab(addr);
+ if (slab) {
+ cache = slab->slab_cache;
+ object = nearest_obj(cache, slab, (void *)addr);
alloc_meta = kasan_get_alloc_meta(cache, object);

if (alloc_meta) {
diff --git a/mm/slab.c b/mm/slab.c
index 785fffd527fe..fed55fa1b7d0 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2605,7 +2605,7 @@ static struct slab *cache_grow_begin(struct kmem_cache *cachep,
* page_address() in the latter returns a non-tagged pointer,
* as it should be for slab pages.
*/
- kasan_poison_slab(slab_page(slab));
+ kasan_poison_slab(slab);

/* Get slab management. */
freelist = alloc_slabmgmt(cachep, slab, offset,
diff --git a/mm/slub.c b/mm/slub.c
index 61aaaa662c5e..58f0d499a293 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1961,7 +1961,7 @@ static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)

slab->slab_cache = s;

- kasan_poison_slab(slab_page(slab));
+ kasan_poison_slab(slab);

start = slab_address(slab);

--
2.33.1

Vlastimil Babka

Dec 1, 2021, 1:15:21 PM12/1/21
to Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Vlastimil Babka, Marco Elver, Alexander Potapenko, Dmitry Vyukov, kasa...@googlegroups.com
kfence_guarded_alloc() sets some fields that are being moved from struct
page to struct slab, so it needs to be converted.

Signed-off-by: Vlastimil Babka <vba...@suse.cz>
Tested-by: Marco Elver <el...@google.com>
Cc: Alexander Potapenko <gli...@google.com>
Cc: Marco Elver <el...@google.com>
Cc: Dmitry Vyukov <dvy...@google.com>
Cc: <kasa...@googlegroups.com>
---
mm/kfence/core.c | 12 ++++++------
mm/kfence/kfence_test.c | 6 +++---
2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 09945784df9e..4eb60cf5ff8b 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -360,7 +360,7 @@ static void *kfence_guarded_alloc(struct kmem_cache *cache, size_t size, gfp_t g
{
struct kfence_metadata *meta = NULL;
unsigned long flags;
- struct page *page;
+ struct slab *slab;
void *addr;

/* Try to obtain a free object. */
@@ -424,13 +424,13 @@ static void *kfence_guarded_alloc(struct kmem_cache *cache, size_t size, gfp_t g

alloc_covered_add(alloc_stack_hash, 1);

- /* Set required struct page fields. */
- page = virt_to_page(meta->addr);
- page->slab_cache = cache;
+ /* Set required slab fields. */
+ slab = virt_to_slab((void *)meta->addr);
+ slab->slab_cache = cache;
if (IS_ENABLED(CONFIG_SLUB))
- page->objects = 1;
+ slab->objects = 1;
if (IS_ENABLED(CONFIG_SLAB))
- page->s_mem = addr;
+ slab->s_mem = addr;

/* Memory initialization. */
for_each_canary(meta, set_canary_byte);
diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
index f7276711d7b9..a22b1af85577 100644
--- a/mm/kfence/kfence_test.c
+++ b/mm/kfence/kfence_test.c
@@ -282,7 +282,7 @@ static void *test_alloc(struct kunit *test, size_t size, gfp_t gfp, enum allocat
alloc = kmalloc(size, gfp);

if (is_kfence_address(alloc)) {
- struct page *page = virt_to_head_page(alloc);
+ struct slab *slab = virt_to_slab(alloc);
struct kmem_cache *s = test_cache ?:
kmalloc_caches[kmalloc_type(GFP_KERNEL)][__kmalloc_index(size, false)];

@@ -291,8 +291,8 @@ static void *test_alloc(struct kunit *test, size_t size, gfp_t gfp, enum allocat
* even for KFENCE objects; these are required so that
* memcg accounting works correctly.
*/
- KUNIT_EXPECT_EQ(test, obj_to_index(s, page_slab(page), alloc), 0U);
- KUNIT_EXPECT_EQ(test, objs_per_slab(s, page_slab(page)), 1);
+ KUNIT_EXPECT_EQ(test, obj_to_index(s, slab, alloc), 0U);
+ KUNIT_EXPECT_EQ(test, objs_per_slab(s, slab), 1);

if (policy == ALLOCATE_ANY)
return alloc;
--
2.33.1

Vlastimil Babka

Dec 1, 2021, 1:15:22 PM12/1/21
to Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Vlastimil Babka, Marco Elver, Alexander Potapenko, Dmitry Vyukov, kasa...@googlegroups.com
With a struct slab definition separate from struct page, we can go further and
define only fields that the chosen sl*b implementation uses. This means
everything between __page_flags and __page_refcount placeholders now depends on
the chosen CONFIG_SL*B. Some fields exist in all implementations (slab_list)
but can be part of a union in some, so it's simpler to repeat them than
complicate the definition with ifdefs even more.

The patch doesn't change physical offsets of the fields, although it could be
done later - for example it's now clear that tighter packing in SLOB could be
possible.

This should also prevent accidental use of fields that don't exist in given
implementation. Before this patch virt_to_cache() and and cache_from_obj() was
visible for SLOB (albeit not used), although it relies on the slab_cache field
that isn't set by SLOB. With this patch it's now a compile error, so these
functions are now hidden behind #ifndef CONFIG_SLOB.

Signed-off-by: Vlastimil Babka <vba...@suse.cz>
Tested-by: Marco Elver <el...@google.com> # kfence
Cc: Alexander Potapenko <gli...@google.com>
Cc: Marco Elver <el...@google.com>
Cc: Dmitry Vyukov <dvy...@google.com>
Cc: <kasa...@googlegroups.com>
---
mm/kfence/core.c | 9 +++++----
mm/slab.h | 46 ++++++++++++++++++++++++++++++++++++----------
2 files changed, 41 insertions(+), 14 deletions(-)

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 4eb60cf5ff8b..46103a7628a6 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -427,10 +427,11 @@ static void *kfence_guarded_alloc(struct kmem_cache *cache, size_t size, gfp_t g
/* Set required slab fields. */
slab = virt_to_slab((void *)meta->addr);
slab->slab_cache = cache;
- if (IS_ENABLED(CONFIG_SLUB))
- slab->objects = 1;
- if (IS_ENABLED(CONFIG_SLAB))
- slab->s_mem = addr;
+#if defined(CONFIG_SLUB)
+ slab->objects = 1;
+#elif defined (CONFIG_SLAB)
+ slab->s_mem = addr;
+#endif

/* Memory initialization. */
for_each_canary(meta, set_canary_byte);
diff --git a/mm/slab.h b/mm/slab.h
index 2d50c099a222..8c5a8c005896 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -8,9 +8,24 @@
/* Reuses the bits in struct page */
struct slab {
unsigned long __page_flags;
+
+#if defined(CONFIG_SLAB)
+
+ union {
+ struct list_head slab_list;
+ struct rcu_head rcu_head;
+ };
+ struct kmem_cache *slab_cache;
+ void *freelist; /* array of free object indexes */
+ void * s_mem; /* first object */
+ unsigned int active;
+
+#elif defined(CONFIG_SLUB)
+
union {
struct list_head slab_list;
- struct { /* Partial pages */
+ struct rcu_head rcu_head;
+ struct {
struct slab *next;
#ifdef CONFIG_64BIT
int slabs; /* Nr of slabs left */
@@ -18,25 +33,32 @@ struct slab {
short int slabs;
#endif
};
- struct rcu_head rcu_head;
};
- struct kmem_cache *slab_cache; /* not slob */
+ struct kmem_cache *slab_cache;
/* Double-word boundary */
void *freelist; /* first free object */
union {
- void *s_mem; /* slab: first object */
- unsigned long counters; /* SLUB */
- struct { /* SLUB */
+ unsigned long counters;
+ struct {
unsigned inuse:16;
unsigned objects:15;
unsigned frozen:1;
};
};
+ unsigned int __unused;
+
+#elif defined(CONFIG_SLOB)
+
+ struct list_head slab_list;
+ void * __unused_1;
+ void *freelist; /* first free block */
+ void * __unused_2;
+ int units;
+
+#else
+#error "Unexpected slab allocator configured"
+#endif

- union {
- unsigned int active; /* SLAB */
- int units; /* SLOB */
- };
atomic_t __page_refcount;
#ifdef CONFIG_MEMCG
unsigned long memcg_data;
@@ -47,7 +69,9 @@ struct slab {
static_assert(offsetof(struct page, pg) == offsetof(struct slab, sl))
SLAB_MATCH(flags, __page_flags);
SLAB_MATCH(compound_head, slab_list); /* Ensure bit 0 is clear */
+#ifndef CONFIG_SLOB
SLAB_MATCH(rcu_head, rcu_head);
+#endif
SLAB_MATCH(_refcount, __page_refcount);
#ifdef CONFIG_MEMCG
SLAB_MATCH(memcg_data, memcg_data);
@@ -623,6 +647,7 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s,
}
#endif /* CONFIG_MEMCG_KMEM */

+#ifndef CONFIG_SLOB
static inline struct kmem_cache *virt_to_cache(const void *obj)
{
struct slab *slab;
@@ -669,6 +694,7 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
print_tracking(cachep, x);
return cachep;
}
+#endif /* CONFIG_SLOB */

static inline size_t slab_ksize(const struct kmem_cache *s)
{
--
2.33.1

Vlastimil Babka

Dec 2, 2021, 7:25:55 AM12/2/21
to Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org, Robin Murphy
On 12/1/21 19:14, Vlastimil Babka wrote:
> Folks from non-slab subsystems are Cc'd only to patches affecting them, and
> this cover letter.
>
> Series also available in git, based on 5.16-rc3:
> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-struct_slab-v2r2

I have pushed a v3, but I'm not going to resend it immediately to avoid
unnecessary spamming; the difference is just that some patches are removed
and others reordered, so the current v2 posting should still be sufficient
for on-list review:

https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-struct_slab-v3r1

patch 29/33 iommu: Use put_pages_list
- removed, as this version is broken and Robin Murphy has meanwhile partially
incorporated it into his series:
https://lore.kernel.org/lkml/cover.1637671820...@arm.com/

patch 30/33 mm: Remove slab from struct page
- removed and postponed for later, as it can only be applied after the iommu
use of page.freelist is resolved

patch 27/33 zsmalloc: Stop using slab fields in struct page
patch 28/33 bootmem: Use page->index instead of page->freelist
- moved towards the end of the series, to further separate out the part that
adjusts non-slab users of slab fields, towards removing those fields from
struct page.

Andrey Konovalov

Dec 2, 2021, 12:16:42 PM12/2/21
to Vlastimil Babka, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, Linux Memory Management List, Andrew Morton, pat...@lists.linux.dev, Julia Lawall, Luis Chamberlain, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Marco Elver, Johannes Weiner, Michal Hocko, Vladimir Davydov, kasan-dev, cgr...@vger.kernel.org
Reviewed-by: Andrey Konovalov <andre...@gmail.com>

Thanks!

Andrey Konovalov

Dec 2, 2021, 12:17:02 PM12/2/21
to Vlastimil Babka, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, Linux Memory Management List, Andrew Morton, pat...@lists.linux.dev, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasan-dev
On Wed, Dec 1, 2021 at 7:15 PM Vlastimil Babka <vba...@suse.cz> wrote:
>
Reviewed-by: Andrey Konovalov <andre...@gmail.com>

Great job with the overall struct page refactoring!

Thanks!

Hyeonggon Yoo

Dec 10, 2021, 11:38:06 AM12/10/21
to Vlastimil Babka, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Marco Elver, Alexander Potapenko, Dmitry Vyukov, kasa...@googlegroups.com
On Wed, Dec 01, 2021 at 07:15:08PM +0100, Vlastimil Babka wrote:
> With a struct slab definition separate from struct page, we can go further and
> define only fields that the chosen sl*b implementation uses. This means
> everything between __page_flags and __page_refcount placeholders now depends on
> the chosen CONFIG_SL*B.

When I read this patch series first, I thought struct slab is allocated
separately from struct page.

But after reading it again, It uses same allocated space of struct page.

So, the code should care about fields that page allocator cares when
freeing page. (->mapping, ->refcount, ->flags, ...)

And, we can change offset of fields between page->flags and page->refcount,
If we care about the value of page->mapping before freeing it.

Did I get it right?

> Some fields exist in all implementations (slab_list)
> but can be part of a union in some, so it's simpler to repeat them than
> complicate the definition with ifdefs even more.

Before this patch I always ran preprocessor in my brain.
now it's MUCH easier to understand than before!

>
> The patch doesn't change physical offsets of the fields, although it could be
> done later - for example it's now clear that tighter packing in SLOB could be
> possible.
>

Is there a benefit if we pack SLOB's struct slab tighter?

...

> #ifdef CONFIG_MEMCG
> unsigned long memcg_data;
> @@ -47,7 +69,9 @@ struct slab {
> static_assert(offsetof(struct page, pg) == offsetof(struct slab, sl))
> SLAB_MATCH(flags, __page_flags);
> SLAB_MATCH(compound_head, slab_list); /* Ensure bit 0 is clear */
> +#ifndef CONFIG_SLOB
> SLAB_MATCH(rcu_head, rcu_head);

Because SLUB and SLAB sets slab->slab_cache = NULL (to set page->mapping = NULL),
What about adding this?:

SLAB_MATCH(mapping, slab_cache);

there was SLAB_MATCH(slab_cache, slab_cache) but removed.

> +#endif
> SLAB_MATCH(_refcount, __page_refcount);
> #ifdef CONFIG_MEMCG
> SLAB_MATCH(memcg_data, memcg_data);

I couldn't find any functional problem on this patch.
but it seems there's some style issues.

Below is what checkpatch.pl complains.
it's better to fix them!

WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#7:
With a struct slab definition separate from struct page, we can go further and

WARNING: Possible repeated word: 'and'
#19:
implementation. Before this patch virt_to_cache() and and cache_from_obj() was

WARNING: space prohibited between function name and open parenthesis '('
#49: FILE: mm/kfence/core.c:432:
+#elif defined (CONFIG_SLAB)

ERROR: "foo * bar" should be "foo *bar"
#73: FILE: mm/slab.h:20:
+void * s_mem;/* first object */

ERROR: "foo * bar" should be "foo *bar"
#111: FILE: mm/slab.h:53:
+void * __unused_1;

ERROR: "foo * bar" should be "foo *bar"
#113: FILE: mm/slab.h:55:
+void * __unused_2;

---
Thanks,
Hyeonggon.

Vlastimil Babka

Dec 10, 2021, 1:26:13 PM12/10/21
to Hyeonggon Yoo, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Marco Elver, Alexander Potapenko, Dmitry Vyukov, kasa...@googlegroups.com
On 12/10/21 17:37, Hyeonggon Yoo wrote:
> On Wed, Dec 01, 2021 at 07:15:08PM +0100, Vlastimil Babka wrote:
>> With a struct slab definition separate from struct page, we can go further and
>> define only fields that the chosen sl*b implementation uses. This means
>> everything between __page_flags and __page_refcount placeholders now depends on
>> the chosen CONFIG_SL*B.
>
> When I read this patch series first, I thought struct slab is allocated
> separately from struct page.
>
> But after reading it again, It uses same allocated space of struct page.

Yes. Allocating it elsewhere is something that can be discussed later. It's
not a simple clear win - more memory used, more overhead, complicated code...

> So, the code should care about fields that page allocator cares when
> freeing page. (->mapping, ->refcount, ->flags, ...)
>
> And, we can change offset of fields between page->flags and page->refcount,
> If we care about the value of page->mapping before freeing it.
>
> Did I get it right?

Yeah. Also whatever aliases with compound_head must not have bit zero set as
that means a tail page.
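
To illustrate the constraint (a simplified sketch; the real PageTail() and
compound_head() helpers in include/linux/page-flags.h are more involved):

/*
 * A tail page stores a pointer to its head page with bit 0 set in
 * page->compound_head. Any struct slab field that overlays that word
 * (slab_list / rcu_head, per the SLAB_MATCH asserts) must therefore
 * never have bit 0 set, or the page would be treated as a tail page.
 */
static inline bool sketch_is_tail(unsigned long compound_head_word)
{
	return compound_head_word & 1;
}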

>> Some fields exist in all implementations (slab_list)
>> but can be part of a union in some, so it's simpler to repeat them than
>> complicate the definition with ifdefs even more.
>
> Before this patch I always ran preprocessor in my brain.
> now it's MUCH easier to understand than before!
>
>>
>> The patch doesn't change physical offsets of the fields, although it could be
>> done later - for example it's now clear that tighter packing in SLOB could be
>> possible.
>>
>
> Is there a benefit if we pack SLOB's struct slab tighter?

I don't see any immediate benefit, except avoiding the page->mapping alias
as you suggested.

> ...
>
>> #ifdef CONFIG_MEMCG
>> unsigned long memcg_data;
>> @@ -47,7 +69,9 @@ struct slab {
>> static_assert(offsetof(struct page, pg) == offsetof(struct slab, sl))
>> SLAB_MATCH(flags, __page_flags);
>> SLAB_MATCH(compound_head, slab_list); /* Ensure bit 0 is clear */
>> +#ifndef CONFIG_SLOB
>> SLAB_MATCH(rcu_head, rcu_head);
>
> Because SLUB and SLAB sets slab->slab_cache = NULL (to set page->mapping = NULL),

Hm, now that you mention it, maybe it would be better to do a
"folio->mapping = NULL" instead as we now have a more clearer view where we
operate on struct slab, and where we transition between that and a plain
folio. This is IMHO part of preparing the folio for freeing, not a struct
slab cleanup as struct slab doesn't need this cleanup.

> What about adding this?:
>
> SLAB_MATCH(mapping, slab_cache);
>
> there was SLAB_MATCH(slab_cache, slab_cache) but removed.

With the change suggested above, it wouldn't be needed as a safety check
anymore.

>> +#endif
>> SLAB_MATCH(_refcount, __page_refcount);
>> #ifdef CONFIG_MEMCG
>> SLAB_MATCH(memcg_data, memcg_data);
>
> I couldn't find any functional problem on this patch.
> but it seems there's some style issues.
>
> Below is what checkpatch.pl complains.
> it's better to fix them!

Not all checkpatch suggestions are correct and have to be followed, but I'll
check what I missed. Thanks.

Hyeonggon Yoo

Dec 11, 2021, 6:55:34 AM12/11/21
to Vlastimil Babka, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Marco Elver, Alexander Potapenko, Dmitry Vyukov, kasa...@googlegroups.com
On Fri, Dec 10, 2021 at 07:26:11PM +0100, Vlastimil Babka wrote:
> On 12/10/21 17:37, Hyeonggon Yoo wrote:
> > On Wed, Dec 01, 2021 at 07:15:08PM +0100, Vlastimil Babka wrote:
> >> With a struct slab definition separate from struct page, we can go further and
> >> define only fields that the chosen sl*b implementation uses. This means
> >> everything between __page_flags and __page_refcount placeholders now depends on
> >> the chosen CONFIG_SL*B.
> >
> > When I read this patch series first, I thought struct slab is allocated
> > separately from struct page.
> >
> > But after reading it again, It uses same allocated space of struct page.
>
> Yes. Allocating it elsewhere is something that can be discussed later. It's
> not a simple clear win - more memory used, more overhead, complicated code...
>

Right. That is a something that can be discussed,
But I don't think there will be much win.

> > So, the code should care about fields that page allocator cares when
> > freeing page. (->mapping, ->refcount, ->flags, ...)
> >
> > And, we can change offset of fields between page->flags and page->refcount,
> > If we care about the value of page->mapping before freeing it.
> >
> > Did I get it right?
>
> Yeah. Also whatever aliases with compound_head must not have bit zero set as
> that means a tail page.
>

Oh I was missing that. Thank you.

Hmm then in struct slab, page->compound_head and slab->list_head (or
slab->rcu_head) has same offset. And list_head / rcu_head both store pointers.

then it has a alignment requirement. (address saved in list_head/rcu_head
should be multiple of 2)

Anyway, it was required long time before this patch,
so it is not a problem for this patch.
Oh, folio->mapping = NULL seems more intuitive.

And we can reorder the fields of struct slab more flexibly with
folio->mapping = NULL, because we then have no reason to keep page->mapping
and slab->slab_cache at the same offset.

So it should be done in a separate patch for SLUB/SLAB.
Do you mind if I send a patch for this after some testing?

> This is IMHO part of preparing the folio for freeing, not a struct
> slab cleanup as struct slab doesn't need this cleanup.

I agree that. it's needed for folio, not for struct slab.

> > What about adding this?:
> >
> > SLAB_MATCH(mapping, slab_cache);
> >
> > there was SLAB_MATCH(slab_cache, slab_cache) but removed.
>
> With the change suggested above, it wouldn't be needed as a safety check
> anymore.
>

Okay.

> >> +#endif
> >> SLAB_MATCH(_refcount, __page_refcount);
> >> #ifdef CONFIG_MEMCG
> >> SLAB_MATCH(memcg_data, memcg_data);
> >
> > I couldn't find any functional problem on this patch.
> > but it seems there's some style issues.
> >
> > Below is what checkpatch.pl complains.
> > it's better to fix them!
>
> Not all checkpatch suggestions are correct and have to be followed, but I'll
> check what I missed. Thanks.
>

You're welcome.
They are just a typo in the changelog and whitespace warnings.

So now, except for a few style issues, I can't find any problem in this patch.
And this patch gives us a much better view of struct slab.

Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>

Thanks,
Hyeonggon.

Matthew Wilcox

Dec 11, 2021, 11:24:06 AM12/11/21
to Vlastimil Babka, Hyeonggon Yoo, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Marco Elver, Alexander Potapenko, Dmitry Vyukov, kasa...@googlegroups.com
On Fri, Dec 10, 2021 at 07:26:11PM +0100, Vlastimil Babka wrote:
> > Because SLUB and SLAB sets slab->slab_cache = NULL (to set page->mapping = NULL),
>
> Hm, now that you mention it, maybe it would be better to do a
> "folio->mapping = NULL" instead as we now have a more clearer view where we
> operate on struct slab, and where we transition between that and a plain
> folio. This is IMHO part of preparing the folio for freeing, not a struct
> slab cleanup as struct slab doesn't need this cleanup.

Yes, I did that as part of "mm/slub: Convert slab freeing to struct slab"
in my original series:

- __ClearPageSlabPfmemalloc(page);
+ __slab_clear_pfmemalloc(slab);
__ClearPageSlab(page);
- /* In union with page->mapping where page allocator expects NULL */
- page->slab_cache = NULL;
+ page->mapping = NULL;

Matthew Wilcox

Dec 11, 2021, 11:52:59 AM12/11/21
to Hyeonggon Yoo, Vlastimil Babka, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Marco Elver, Alexander Potapenko, Dmitry Vyukov, kasa...@googlegroups.com
On Sat, Dec 11, 2021 at 11:55:27AM +0000, Hyeonggon Yoo wrote:
> On Fri, Dec 10, 2021 at 07:26:11PM +0100, Vlastimil Babka wrote:
> > On 12/10/21 17:37, Hyeonggon Yoo wrote:
> > > On Wed, Dec 01, 2021 at 07:15:08PM +0100, Vlastimil Babka wrote:
> > >> With a struct slab definition separate from struct page, we can go further and
> > >> define only fields that the chosen sl*b implementation uses. This means
> > >> everything between __page_flags and __page_refcount placeholders now depends on
> > >> the chosen CONFIG_SL*B.
> > >
> > > When I read this patch series first, I thought struct slab is allocated
> > > separately from struct page.
> > >
> > > But after reading it again, It uses same allocated space of struct page.
> >
> > Yes. Allocating it elsewhere is something that can be discussed later. It's
> > not a simple clear win - more memory used, more overhead, complicated code...
> >
>
> Right. That is a something that can be discussed,
> But I don't think there will be much win.

Oh no, there's a substantial win. If we can reduce struct page to a
single pointer, that shrinks it from 64 bytes/4k to 8 bytes/4k. Set
against that, you have to allocate the struct folio / struct slab / ...
but then it's one _per allocation_ rather than one per page. So for
an order-2 allocation, it takes 32 bytes + 64 bytes (= 96 bytes)
rather than 4*64 = 256 bytes. It's an even bigger win for larger
allocations, and it lets us grow the memory descriptors independently
of each other.
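
To spell out the arithmetic above (a sketch; it assumes 8-byte pointers,
a 64-byte struct page and one shared descriptor per allocation, as in the
example):

/* Per-allocation metadata today: one full struct page per page. */
static unsigned long meta_bytes_today(unsigned int order)
{
	return (1UL << order) * 64;	/* order 2 -> 256 bytes */
}

/* With a shrunken struct page: one 8-byte pointer per page plus one
 * separately allocated memory descriptor shared by the allocation. */
static unsigned long meta_bytes_proposed(unsigned int order)
{
	return (1UL << order) * 8 + 64;	/* order 2 -> 96 bytes */
}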

But it's also a substantial amount of work, so don't expect us to get
there any time soon. Everything currently using struct page needs to
be converted to use another type, and that's just the pre-requisite
step.

Some more thoughts on this here:
https://lore.kernel.org/linux-mm/YXcLqcFh...@casper.infradead.org/

> > Yeah. Also whatever aliases with compound_head must not have bit zero set as
> > that means a tail page.
> >
>
> Oh I was missing that. Thank you.
>
> Hmm then in struct slab, page->compound_head and slab->list_head (or
> slab->rcu_head) has same offset. And list_head / rcu_head both store pointers.
>
> then it has a alignment requirement. (address saved in list_head/rcu_head
> should be multiple of 2)
>
> Anyway, it was required long time before this patch,
> so it is not a problem for this patch.

Yes, that's why there's an assert that the list_heads all line up. This
requirement will go away if we do get separately allocated memory
descriptors (because that bottom bit is no longer PageTail).

Hyeonggon Yoo

Dec 12, 2021, 12:54:27 AM12/12/21
to Matthew Wilcox, Vlastimil Babka, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Marco Elver, Alexander Potapenko, Dmitry Vyukov, kasa...@googlegroups.com
On Sat, Dec 11, 2021 at 04:52:47PM +0000, Matthew Wilcox wrote:
> On Sat, Dec 11, 2021 at 11:55:27AM +0000, Hyeonggon Yoo wrote:
> > On Fri, Dec 10, 2021 at 07:26:11PM +0100, Vlastimil Babka wrote:
> > > On 12/10/21 17:37, Hyeonggon Yoo wrote:
> > > > On Wed, Dec 01, 2021 at 07:15:08PM +0100, Vlastimil Babka wrote:
> > > >> With a struct slab definition separate from struct page, we can go further and
> > > >> define only fields that the chosen sl*b implementation uses. This means
> > > >> everything between __page_flags and __page_refcount placeholders now depends on
> > > >> the chosen CONFIG_SL*B.
> > > >
> > > > When I read this patch series first, I thought struct slab is allocated
> > > > separately from struct page.
> > > >
> > > > But after reading it again, It uses same allocated space of struct page.
> > >
> > > Yes. Allocating it elsewhere is something that can be discussed later. It's
> > > not a simple clear win - more memory used, more overhead, complicated code...
> > >
> >
> > Right. That is a something that can be discussed,
> > But I don't think there will be much win.
>
> Oh no, there's a substantial win. If we can reduce struct page to a
> single pointer, that shrinks it from 64 bytes/4k to 8 bytes/4k. Set
> against that, you have to allocate the struct folio / struct slab / ...
> but then it's one _per allocation_ rather than one per page. So for
> an order-2 allocation, it takes 32 bytes + 64 bytes (= 96 bytes)
> rather than 4*64 = 256 bytes. It's an even bigger win for larger
> allocations, and it lets us grow the memory descriptors independently
> of each other.

Oh, I thought there wouldn't be much win, because I thought it was just
allocating additional memory for struct slab while still allocating memory
for struct page as we do now.

It will be more efficient if we can allocate the descriptor of a
slab/page/...etc. per *allocation*, which may have order > 1. Currently we're
duplicating the memory descriptor (struct page) even for high order
allocations.

Even if we do not allocate high order pages at all, it's still more efficient
if we can reduce struct page to a double word. And we can allocate something
like struct slab only when we need it.

One challenge here is that we would have to allocate the descriptors
dynamically... I'm going to read the link you sent.

> Everything currently using struct page needs to
> be converted to use another type, and that's just the pre-requisite
> step.
>

Oh, you're planning to separate *everything* from struct page, not only
struct slab!

So the intention of this patch series is to prepare for that physical
separation. It's fascinating...

> But it's also a substantial amount of work, so don't expect us to get
> there any time soon.

Yeah, that will require much work. But I'll wait for your good work.
It's so interesting.
Thank you for the link.

> > > Yeah. Also whatever aliases with compound_head must not have bit zero set as
> > > that means a tail page.
> > >
> >
> > Oh I was missing that. Thank you.
> >
> > Hmm then in struct slab, page->compound_head and slab->list_head (or
> > slab->rcu_head) has same offset. And list_head / rcu_head both store pointers.
> >
> > then it has a alignment requirement. (address saved in list_head/rcu_head
> > should be multiple of 2)
> >
> > Anyway, it was required long time before this patch,
> > so it is not a problem for this patch.
>
> Yes, that's why there's an assert that the list_heads all line up. This
> requirement will go away if we do get separately allocated memory
> descriptors (because that bottom bit is no longer PageTail).

Yeah, we don't need to care about that if we allocate memory for struct slab
separately.

Thanks,
Hyeonggon.

Hyeonggon Yoo

Dec 12, 2021, 1:00:07 AM12/12/21
to Matthew Wilcox, Vlastimil Babka, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Marco Elver, Alexander Potapenko, Dmitry Vyukov, kasa...@googlegroups.com
Didn't know your original patch series did it already.
Vlastimil, would you update that patch?

Hyeonggon Yoo

unread,
Dec 12, 2021, 1:52:49 AM12/12/21
to Vlastimil Babka, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Marco Elver, Alexander Potapenko, Dmitry Vyukov, kasa...@googlegroups.com, Hyeonggon Yoo
After commit f1ac9059ca34 ("mm/sl*b: Differentiate struct slab fields
by sl*b implementations"), we can reorder the fields of struct slab
depending on the slab allocator.

For now, page_mapcount_reset() is called because page->_mapcount and
slab->units have the same offset. But this is not necessary for
struct slab. Use an unused field for units instead.

Signed-off-by: Hyeonggon Yoo <42.h...@gmail.com>
---
mm/slab.h | 4 ++--
mm/slob.c | 1 -
2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/slab.h b/mm/slab.h
index 90d7fceba470..dd0480149d38 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -50,8 +50,8 @@ struct slab {
struct list_head slab_list;
void * __unused_1;
void *freelist; /* first free block */
- void * __unused_2;
- int units;
+ long units;
+ unsigned int __unused_2;

#else
#error "Unexpected slab allocator configured"
diff --git a/mm/slob.c b/mm/slob.c
index 39b651b3e6e7..7b2d2c7d69cc 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -404,7 +404,6 @@ static void slob_free(void *block, int size)
clear_slob_page_free(sp);
spin_unlock_irqrestore(&slob_lock, flags);
__ClearPageSlab(slab_page(sp));
- page_mapcount_reset(slab_page(sp));
slob_free_pages(b, 0);
return;
}
--
2.30.2

Vlastimil Babka

unread,
Dec 14, 2021, 6:51:35 AM12/14/21
to Hyeonggon Yoo, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Marco Elver, Alexander Potapenko, Dmitry Vyukov, kasa...@googlegroups.com
On 12/12/21 07:52, Hyeonggon Yoo wrote:
> After commit f1ac9059ca34 ("mm/sl*b: Differentiate struct slab fields
> by sl*b implementations"), we can reorder fields of struct slab
> depending on slab allocator.
>
> For now, page_mapcount_reset() is called because page->_mapcount and
> slab->units have same offset. But this is not necessary for
> struct slab. Use unused field for units instead.
>
> Signed-off-by: Hyeonggon Yoo <42.h...@gmail.com>

Will add to the series, thanks!

Vlastimil Babka

unread,
Dec 14, 2021, 7:57:25 AM12/14/21
to Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org, Hyeonggon Yoo
On 12/1/21 19:14, Vlastimil Babka wrote:
> Folks from non-slab subsystems are Cc'd only to patches affecting them, and
> this cover letter.
>
> Series also available in git, based on 5.16-rc3:
> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-struct_slab-v2r2

Pushed a new branch slab-struct-slab-v3r3 with accumulated fixes and small tweaks
and a new patch from Hyeonggon Yoo on top. To avoid too much spam, here's a range diff:

1: 10b656f9eb1e = 1: 10b656f9eb1e mm: add virt_to_folio() and folio_address()
2: 5e6ad846acf1 = 2: 5e6ad846acf1 mm/slab: Dissolve slab_map_pages() in its caller
3: 48d4e9407aa0 = 3: 48d4e9407aa0 mm/slub: Make object_err() static
4: fe1e19081321 = 4: fe1e19081321 mm: Split slab into its own type
5: af7fd46fbb9b = 5: af7fd46fbb9b mm: Add account_slab() and unaccount_slab()
6: 7ed088d601d9 = 6: 7ed088d601d9 mm: Convert virt_to_cache() to use struct slab
7: 1d41188b9401 = 7: 1d41188b9401 mm: Convert __ksize() to struct slab
8: 5d9d1231461f ! 8: 8fd22e0b086e mm: Use struct slab in kmem_obj_info()
@@ Commit message
slab type instead of the page type, we make it obvious that this can
only be called for slabs.

+ [ vba...@suse.cz: also convert the related kmem_valid_obj() to folios ]
+
Signed-off-by: Matthew Wilcox (Oracle) <wi...@infradead.org>
Signed-off-by: Vlastimil Babka <vba...@suse.cz>

@@ mm/slab.h: struct kmem_obj_info {
#endif /* MM_SLAB_H */

## mm/slab_common.c ##
+@@ mm/slab_common.c: bool slab_is_available(void)
+ */
+ bool kmem_valid_obj(void *object)
+ {
+- struct page *page;
++ struct folio *folio;
+
+ /* Some arches consider ZERO_SIZE_PTR to be a valid address. */
+ if (object < (void *)PAGE_SIZE || !virt_addr_valid(object))
+ return false;
+- page = virt_to_head_page(object);
+- return PageSlab(page);
++ folio = virt_to_folio(object);
++ return folio_test_slab(folio);
+ }
+ EXPORT_SYMBOL_GPL(kmem_valid_obj);
+
@@ mm/slab_common.c: void kmem_dump_obj(void *object)
{
char *cp = IS_ENABLED(CONFIG_MMU) ? "" : "/vmalloc";
@@ mm/slub.c: int __kmem_cache_shutdown(struct kmem_cache *s)
objp = base + s->size * objnr;
kpp->kp_objp = objp;
- if (WARN_ON_ONCE(objp < base || objp >= base + page->objects * s->size || (objp - base) % s->size) ||
-+ if (WARN_ON_ONCE(objp < base || objp >= base + slab->objects * s->size || (objp - base) % s->size) ||
++ if (WARN_ON_ONCE(objp < base || objp >= base + slab->objects * s->size
++ || (objp - base) % s->size) ||
!(s->flags & SLAB_STORE_USER))
return;
#ifdef CONFIG_SLUB_DEBUG
9: 3aef771be335 ! 9: c97e73c3b6c2 mm: Convert check_heap_object() to use struct slab
@@ mm/slab.h: struct kmem_obj_info {
+#else
+static inline
+void __check_heap_object(const void *ptr, unsigned long n,
-+ const struct slab *slab, bool to_user) { }
++ const struct slab *slab, bool to_user)
++{
++}
+#endif
+
#endif /* MM_SLAB_H */
10: 2253e45e6bef = 10: da05e0f7179c mm/slub: Convert detached_freelist to use a struct slab
11: f28202bc27ba = 11: 383887e77104 mm/slub: Convert kfree() to use a struct slab
12: 31b58b1e914f = 12: c46be093c637 mm/slub: Convert __slab_lock() and __slab_unlock() to struct slab
13: 636406a3ad59 = 13: 49dbbf917052 mm/slub: Convert print_page_info() to print_slab_info()
14: 3b49efda3b6f = 14: 4bb0c932156a mm/slub: Convert alloc_slab_page() to return a struct slab
15: 61a195526d3b ! 15: 4b9761b5cfab mm/slub: Convert __free_slab() to use struct slab
@@ mm/slub.c: static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int n

- __ClearPageSlabPfmemalloc(page);
- __ClearPageSlab(page);
+- /* In union with page->mapping where page allocator expects NULL */
+- page->slab_cache = NULL;
+ __slab_clear_pfmemalloc(slab);
+ __folio_clear_slab(folio);
- /* In union with page->mapping where page allocator expects NULL */
-- page->slab_cache = NULL;
-+ slab->slab_cache = NULL;
++ folio->mapping = NULL;
if (current->reclaim_state)
current->reclaim_state->reclaimed_slab += pages;
- unaccount_slab(page_slab(page), order, s);
16: 987c7ed31580 = 16: f384ec918065 mm/slub: Convert pfmemalloc_match() to take a struct slab
17: cc742564237e ! 17: 06738ade4e17 mm/slub: Convert most struct page to struct slab by spatch
@@ Commit message

// Options: --include-headers --no-includes --smpl-spacing include/linux/slub_def.h mm/slub.c
// Note: needs coccinelle 1.1.1 to avoid breaking whitespace, and ocaml for the
- // embedded script script
+ // embedded script

// build list of functions to exclude from applying the next rule
@initialize:ocaml@
18: b45acac9aace = 18: 1a4f69a4cced mm/slub: Finish struct page to struct slab conversion
19: 76c3eeb39684 ! 19: 1d62d706e884 mm/slab: Convert kmem_getpages() and kmem_freepages() to struct slab
@@ mm/slab.c: slab_out_of_memory(struct kmem_cache *cachep, gfp_t gfpflags, int nod
- __ClearPageSlabPfmemalloc(page);
- __ClearPageSlab(page);
- page_mapcount_reset(page);
+- /* In union with page->mapping where page allocator expects NULL */
+- page->slab_cache = NULL;
+ BUG_ON(!folio_test_slab(folio));
+ __slab_clear_pfmemalloc(slab);
+ __folio_clear_slab(folio);
+ page_mapcount_reset(folio_page(folio, 0));
- /* In union with page->mapping where page allocator expects NULL */
-- page->slab_cache = NULL;
-+ slab->slab_cache = NULL;
++ folio->mapping = NULL;

if (current->reclaim_state)
current->reclaim_state->reclaimed_slab += 1 << order;
20: ed6144dbebce ! 20: fd4c3aabacd3 mm/slab: Convert most struct page to struct slab by spatch
@@ Commit message

// Options: --include-headers --no-includes --smpl-spacing mm/slab.c
// Note: needs coccinelle 1.1.1 to avoid breaking whitespace, and ocaml for the
- // embedded script script
+ // embedded script

// build list of functions for applying the next rule
@initialize:ocaml@
21: 17fb81e601e6 = 21: b59720b2edba mm/slab: Finish struct page to struct slab conversion
22: 4e8d1faebc24 ! 22: 65ced071c3e7 mm: Convert struct page to struct slab in functions used by other subsystems
@@ Commit message
,...)

Signed-off-by: Vlastimil Babka <vba...@suse.cz>
+ Reviewed-by: Andrey Konovalov <andre...@gmail.com>
Cc: Julia Lawall <julia....@inria.fr>
Cc: Luis Chamberlain <mcg...@kernel.org>
Cc: Andrey Ryabinin <ryabin...@gmail.com>
23: eefa12e18a92 = 23: c9c8dee01e5d mm/memcg: Convert slab objcgs from struct page to struct slab
24: fa5ba4107ce2 ! 24: def731137335 mm/slob: Convert SLOB to use struct slab
@@ Metadata
Author: Matthew Wilcox (Oracle) <wi...@infradead.org>

## Commit message ##
- mm/slob: Convert SLOB to use struct slab
+ mm/slob: Convert SLOB to use struct slab and struct folio

- Use struct slab throughout the slob allocator.
+ Use struct slab throughout the slob allocator. Where non-slab page can appear
+ use struct folio instead of struct page.

[ vba...@suse.cz: don't introduce wrappers for PageSlobFree in mm/slab.h just
for the single callers being wrappers in mm/slob.c ]

+ [ Hyeonggon Yoo <42.h...@gmail.com>: fix NULL pointer deference ]
+
Signed-off-by: Matthew Wilcox (Oracle) <wi...@infradead.org>
Signed-off-by: Vlastimil Babka <vba...@suse.cz>

## mm/slob.c ##
+@@
+ * If kmalloc is asked for objects of PAGE_SIZE or larger, it calls
+ * alloc_pages() directly, allocating compound pages so the page order
+ * does not have to be separately tracked.
+- * These objects are detected in kfree() because PageSlab()
++ * These objects are detected in kfree() because folio_test_slab()
+ * is false for them.
+ *
+ * SLAB is emulated on top of SLOB by simply calling constructors and
@@ mm/slob.c: static LIST_HEAD(free_slob_large);
/*
* slob_page_free: true for pages on free_slob_pages list.
@@ mm/slob.c: static void *slob_page_alloc(struct page *sp, size_t size, int align,
int align_offset)
{
- struct page *sp;
++ struct folio *folio;
+ struct slab *sp;
struct list_head *slob_list;
slob_t *b = NULL;
@@ mm/slob.c: static void *slob_alloc(size_t size, gfp_t gfp, int align, int node,
return NULL;
- sp = virt_to_page(b);
- __SetPageSlab(sp);
-+ sp = virt_to_slab(b);
-+ __SetPageSlab(slab_page(sp));
++ folio = virt_to_folio(b);
++ __folio_set_slab(folio);
++ sp = folio_slab(folio);

spin_lock_irqsave(&slob_lock, flags);
sp->units = SLOB_UNITS(PAGE_SIZE);
@@ mm/slob.c: static void slob_free(void *block, int size)
spin_unlock_irqrestore(&slob_lock, flags);
- __ClearPageSlab(sp);
- page_mapcount_reset(sp);
-+ __ClearPageSlab(slab_page(sp));
++ __folio_clear_slab(slab_folio(sp));
+ page_mapcount_reset(slab_page(sp));
slob_free_pages(b, 0);
return;
}
+@@ mm/slob.c: EXPORT_SYMBOL(__kmalloc_node_track_caller);
+
+ void kfree(const void *block)
+ {
+- struct page *sp;
++ struct folio *sp;
+
+ trace_kfree(_RET_IP_, block);
+
+@@ mm/slob.c: void kfree(const void *block)
+ return;
+ kmemleak_free(block);
+
+- sp = virt_to_page(block);
+- if (PageSlab(sp)) {
++ sp = virt_to_folio(block);
++ if (folio_test_slab(sp)) {
+ int align = max_t(size_t, ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
+ unsigned int *m = (unsigned int *)(block - align);
+ slob_free(m, *m + align);
+ } else {
+- unsigned int order = compound_order(sp);
+- mod_node_page_state(page_pgdat(sp), NR_SLAB_UNRECLAIMABLE_B,
++ unsigned int order = folio_order(sp);
++
++ mod_node_page_state(folio_pgdat(sp), NR_SLAB_UNRECLAIMABLE_B,
+ -(PAGE_SIZE << order));
+- __free_pages(sp, order);
++ __free_pages(folio_page(sp, 0), order);
+
+ }
+ }
25: aa4f573a4c96 ! 25: 466b9fb1f6e5 mm/kasan: Convert to struct folio and struct slab
@@ Commit message

Signed-off-by: Matthew Wilcox (Oracle) <wi...@infradead.org>
Signed-off-by: Vlastimil Babka <vba...@suse.cz>
+ Reviewed-by: Andrey Konovalov <andre...@gmail.com>
Cc: Andrey Ryabinin <ryabin...@gmail.com>
Cc: Alexander Potapenko <gli...@google.com>
Cc: Andrey Konovalov <andre...@gmail.com>
26: 67b7966d2fb6 = 26: b8159ae8e5cd mm/kfence: Convert kfence_guarded_alloc() to struct slab
31: d64dfe49c1e7 ! 27: 4525180926f9 mm/sl*b: Differentiate struct slab fields by sl*b implementations
@@ Commit message
possible.

This should also prevent accidental use of fields that don't exist in given
- implementation. Before this patch virt_to_cache() and and cache_from_obj() was
- visible for SLOB (albeit not used), although it relies on the slab_cache field
+ implementation. Before this patch virt_to_cache() and cache_from_obj() were
+ visible for SLOB (albeit not used), although they rely on the slab_cache field
that isn't set by SLOB. With this patch it's now a compile error, so these
functions are now hidden behind #ifndef CONFIG_SLOB.

@@ mm/kfence/core.c: static void *kfence_guarded_alloc(struct kmem_cache *cache, si
- slab->s_mem = addr;
+#if defined(CONFIG_SLUB)
+ slab->objects = 1;
-+#elif defined (CONFIG_SLAB)
++#elif defined(CONFIG_SLAB)
+ slab->s_mem = addr;
+#endif

@@ mm/slab.h
+
+#if defined(CONFIG_SLAB)
+
-+ union {
-+ struct list_head slab_list;
+ union {
+ struct list_head slab_list;
+- struct { /* Partial pages */
+ struct rcu_head rcu_head;
+ };
+ struct kmem_cache *slab_cache;
+ void *freelist; /* array of free object indexes */
-+ void * s_mem; /* first object */
++ void *s_mem; /* first object */
+ unsigned int active;
+
+#elif defined(CONFIG_SLUB)
+
- union {
- struct list_head slab_list;
-- struct { /* Partial pages */
++ union {
++ struct list_head slab_list;
+ struct rcu_head rcu_head;
+ struct {
struct slab *next;
@@ mm/slab.h: struct slab {
+#elif defined(CONFIG_SLOB)
+
+ struct list_head slab_list;
-+ void * __unused_1;
++ void *__unused_1;
+ void *freelist; /* first free block */
-+ void * __unused_2;
++ void *__unused_2;
+ int units;
+
+#else
@@ mm/slab.h: struct slab {
#ifdef CONFIG_MEMCG
unsigned long memcg_data;
@@ mm/slab.h: struct slab {
- static_assert(offsetof(struct page, pg) == offsetof(struct slab, sl))
SLAB_MATCH(flags, __page_flags);
SLAB_MATCH(compound_head, slab_list); /* Ensure bit 0 is clear */
+ SLAB_MATCH(slab_list, slab_list);
+#ifndef CONFIG_SLOB
SLAB_MATCH(rcu_head, rcu_head);
+ SLAB_MATCH(slab_cache, slab_cache);
++#endif
++#ifdef CONFIG_SLAB
+ SLAB_MATCH(s_mem, s_mem);
+ SLAB_MATCH(active, active);
+#endif
SLAB_MATCH(_refcount, __page_refcount);
#ifdef CONFIG_MEMCG
32: 0abf87bae67e = 28: 94b78948d53f mm/slub: Simplify struct slab slabs field definition
33: 813c304f18e4 = 29: f5261e6375f0 mm/slub: Define struct slab fields for CONFIG_SLUB_CPU_PARTIAL only when enabled
27: ebce4b5b5ced ! 30: 1414e8c87de6 zsmalloc: Stop using slab fields in struct page
@@ Commit message

Signed-off-by: Matthew Wilcox (Oracle) <wi...@infradead.org>
Signed-off-by: Vlastimil Babka <vba...@suse.cz>
- Cc: Minchan Kim <min...@kernel.org>
+ Acked-by: Minchan Kim <min...@kernel.org>
Cc: Nitin Gupta <ngu...@vflare.org>
Cc: Sergey Senozhatsky <senoz...@chromium.org>

28: f124425ae7de = 31: 8a3cda6b38eb bootmem: Use page->index instead of page->freelist
29: 82da48c73b2e < -: ------------ iommu: Use put_pages_list
30: 181e16dfefbb < -: ------------ mm: Remove slab from struct page
-: ------------ > 32: 91e069ba116b mm/slob: Remove unnecessary page_mapcount_reset() function call

Johannes Weiner

unread,
Dec 14, 2021, 9:31:21 AM12/14/21
to Vlastimil Babka, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Julia Lawall, Luis Chamberlain, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Marco Elver, Michal Hocko, Vladimir Davydov, kasa...@googlegroups.com, cgr...@vger.kernel.org
LGTM.

Acked-by: Johannes Weiner <han...@cmpxchg.org>

Hyeonggon Yoo

unread,
Dec 14, 2021, 9:38:35 AM12/14/21
to Vlastimil Babka, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org
On Tue, Dec 14, 2021 at 01:57:22PM +0100, Vlastimil Babka wrote:
> On 12/1/21 19:14, Vlastimil Babka wrote:
> > Folks from non-slab subsystems are Cc'd only to patches affecting them, and
> > this cover letter.
> >
> > Series also available in git, based on 5.16-rc3:
> > https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-struct_slab-v2r2
>
> Pushed a new branch slab-struct-slab-v3r3 with accumulated fixes and small tweaks
> and a new patch from Hyeonggon Yoo on top. To avoid too much spam, here's a range diff:
>

Hello Vlastimil, thank you for the nice work.
I'm going to review and test the new version soon in my free time.

Btw, I gave you some review and test tags and they seem to be missing in the new
series. Did I do the review/test process wrongly? It's my first time reviewing
patches, so please let me know if I did something wrong.

--
Thank you.
Hyeonggon.

Vlastimil Babka

unread,
Dec 14, 2021, 9:43:38 AM12/14/21
to Hyeonggon Yoo, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org
On 12/14/21 15:38, Hyeonggon Yoo wrote:
> On Tue, Dec 14, 2021 at 01:57:22PM +0100, Vlastimil Babka wrote:
>> On 12/1/21 19:14, Vlastimil Babka wrote:
>> > Folks from non-slab subsystems are Cc'd only to patches affecting them, and
>> > this cover letter.
>> >
>> > Series also available in git, based on 5.16-rc3:
>> > https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-struct_slab-v2r2
>>
>> Pushed a new branch slab-struct-slab-v3r3 with accumulated fixes and small tweaks
>> and a new patch from Hyeonggon Yoo on top. To avoid too much spam, here's a range diff:
>>
>
> Hello Vlastimil, Thank you for nice work.
> I'm going to review and test new version soon in free time.

Thanks!

> Btw, I gave you some review and test tags and seems to be missing in new
> series. Did I do review/test process wrongly? It's first time to review
> patches so please let me know if I did it wrongly.

You did right, sorry! I didn't include them as those were for patches that I
was additionally changing after your review/test and the decision what is
substantial change enough to need a new test/review is often fuzzy. So if
you can recheck the new versions it would be great and then I will pick that
up, thanks!

> --
> Thank you.
> Hyeonggon.

Roman Gushchin

unread,
Dec 14, 2021, 8:03:42 PM12/14/21
to Vlastimil Babka, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org, Hyeonggon Yoo
On Tue, Dec 14, 2021 at 01:57:22PM +0100, Vlastimil Babka wrote:
> On 12/1/21 19:14, Vlastimil Babka wrote:
> > Folks from non-slab subsystems are Cc'd only to patches affecting them, and
> > this cover letter.
> >
> > Series also available in git, based on 5.16-rc3:
> > https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-struct_slab-v2r2
>
> Pushed a new branch slab-struct-slab-v3r3 with accumulated fixes and small tweaks
> and a new patch from Hyeonggon Yoo on top. To avoid too much spam, here's a range diff:

Hi Vlastimil!

I've started to review this patchset (btw, really nice work, I like
the resulting code way more). Because I'm looking at v3 and I don't have
the whole v2 in my mailbox, here is what I have so far:

* mm: add virt_to_folio() and folio_address()
Reviewed-by: Roman Gushchin <gu...@fb.com>

* mm/slab: Dissolve slab_map_pages() in its caller
Reviewed-by: Roman Gushchin <gu...@fb.com>

* mm/slub: Make object_err() static
Reviewed-by: Roman Gushchin <gu...@fb.com>

* mm: Split slab into its own type
1) Shouldn't SLAB_MATCH() macro use struct folio instead of struct page for the
comparison?
2) page_slab() is used only in kasan and only in one place, so maybe it's better
to not introduce it as a generic helper?
Other than that
Reviewed-by: Roman Gushchin <gu...@fb.com>

* mm: Add account_slab() and unaccount_slab()
1) maybe change the title to convert/replace instead of add?
2) maybe move later changes to memcg_alloc_page_obj_cgroups() to this patch?
Reviewed-by: Roman Gushchin <gu...@fb.com>

* mm: Convert virt_to_cache() to use struct slab
Reviewed-by: Roman Gushchin <gu...@fb.com>

* mm: Convert __ksize() to struct slab
It looks like certain parts of __ksize() can be merged between slab, slub and slob?
Reviewed-by: Roman Gushchin <gu...@fb.com>

* mm: Use struct slab in kmem_obj_info()
Reviewed-by: Roman Gushchin <gu...@fb.com>


I'll try to finish reviewing the patchset by the end of the week.

Thanks!

Roman

Hyeonggon Yoo

unread,
Dec 14, 2021, 10:47:58 PM12/14/21
to Vlastimil Babka, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org
On Tue, Dec 14, 2021 at 03:43:35PM +0100, Vlastimil Babka wrote:
> On 12/14/21 15:38, Hyeonggon Yoo wrote:
> > On Tue, Dec 14, 2021 at 01:57:22PM +0100, Vlastimil Babka wrote:
> >> On 12/1/21 19:14, Vlastimil Babka wrote:
> >> > Folks from non-slab subsystems are Cc'd only to patches affecting them, and
> >> > this cover letter.
> >> >
> >> > Series also available in git, based on 5.16-rc3:
> >> > https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-struct_slab-v2r2
> >>
> >> Pushed a new branch slab-struct-slab-v3r3 with accumulated fixes and small tweaks
> >> and a new patch from Hyeonggon Yoo on top. To avoid too much spam, here's a range diff:
> >>
> >
> > Hello Vlastimil, Thank you for nice work.
> > I'm going to review and test new version soon in free time.
>
> Thanks!
>

You're welcome!

> > Btw, I gave you some review and test tags and seems to be missing in new
> > series. Did I do review/test process wrongly? It's first time to review
> > patches so please let me know if I did it wrongly.
>
> You did right, sorry! I didn't include them as those were for patches that I
> was additionally changing after your review/test and the decision what is
> substantial change enough to need a new test/review is often fuzzy.

Ah, okay. Review/test tags become invalid after further changes.
That's fine, I was just unfamiliar with the process. Thank you!

> So if you can recheck the new versions it would be great and then I will pick that
> up, thanks!

Okay. I'll recheck the new versions.

>
> > --
> > Thank you.
> > Hyeonggon.
>

Roman Gushchin

unread,
Dec 15, 2021, 6:38:45 PM12/15/21
to Vlastimil Babka, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org, Hyeonggon Yoo
Part 2:

* mm: Convert check_heap_object() to use struct slab
Reviewed-by: Roman Gushchin <gu...@fb.com>

* mm/slub: Convert detached_freelist to use a struct slab
How about converting free_nonslab_page() to free_nonslab_folio()?
And maybe renaming it to something like free_large_kmalloc()?
If I'm not missing something, large kmallocs are the only way we can end up
there with a !slab folio/page.

* mm/slub: Convert kfree() to use a struct slab
Reviewed-by: Roman Gushchin <gu...@fb.com>

* mm/slub: Convert __slab_lock() and __slab_unlock() to struct slab
Reviewed-by: Roman Gushchin <gu...@fb.com>

* mm/slub: Convert print_page_info() to print_slab_info()
Do we really need to explicitly convert slab_folio()'s result to (struct folio *)?
Reviewed-by: Roman Gushchin <gu...@fb.com>

* mm/slub: Convert alloc_slab_page() to return a struct slab
Reviewed-by: Roman Gushchin <gu...@fb.com>

* mm/slub: Convert __free_slab() to use struct slab
Reviewed-by: Roman Gushchin <gu...@fb.com>

* mm/slub: Convert pfmemalloc_match() to take a struct slab
Cool! Removing pfmemalloc_unsafe() is really nice.
Reviewed-by: Roman Gushchin <gu...@fb.com>

* mm/slub: Convert most struct page to struct slab by spatch
Reviewed-by: Roman Gushchin <gu...@fb.com>

* mm/slub: Finish struct page to struct slab conversion
Reviewed-by: Roman Gushchin <gu...@fb.com>

* mm/slab: Convert kmem_getpages() and kmem_freepages() to struct slab
Reviewed-by: Roman Gushchin <gu...@fb.com>

* mm/slab: Convert most struct page to struct slab by spatch

Another patch with the same title? Rebase error?

* mm/slab: Finish struct page to struct slab conversion

And this one too?


Thanks!

Roman

Vlastimil Babka

unread,
Dec 16, 2021, 4:19:04 AM12/16/21
to Roman Gushchin, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org, Hyeonggon Yoo
On 12/16/21 00:38, Roman Gushchin wrote:
> On Tue, Dec 14, 2021 at 05:03:12PM -0800, Roman Gushchin wrote:
>> On Tue, Dec 14, 2021 at 01:57:22PM +0100, Vlastimil Babka wrote:
>> > On 12/1/21 19:14, Vlastimil Babka wrote:
>> > > Folks from non-slab subsystems are Cc'd only to patches affecting them, and
>> > > this cover letter.
>> > >
>> > > Series also available in git, based on 5.16-rc3:
>> > > https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-struct_slab-v2r2
>> >
>> > Pushed a new branch slab-struct-slab-v3r3 with accumulated fixes and small tweaks
>> > and a new patch from Hyeonggon Yoo on top. To avoid too much spam, here's a range diff:
>>
>> Hi Vlastimil!
>>
>> I've started to review this patchset (btw, a really nice work, I like
>> the resulting code way more). Because I'm looking at v3 and I don't have

Thanks a lot, Roman!

...

>
> * mm/slab: Convert most struct page to struct slab by spatch
>
> Another patch with the same title? Rebase error?
>
> * mm/slab: Finish struct page to struct slab conversion
>
> And this one too?

No, these are for mm/slab.c, the previous were for mm/slub.c :)

>
> Thanks!
>
> Roman

Hyeonggon Yoo

unread,
Dec 16, 2021, 10:00:53 AM12/16/21
to Vlastimil Babka, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org
On Tue, Dec 14, 2021 at 01:57:22PM +0100, Vlastimil Babka wrote:
> On 12/1/21 19:14, Vlastimil Babka wrote:
> > Folks from non-slab subsystems are Cc'd only to patches affecting them, and
> > this cover letter.
> >
> > Series also available in git, based on 5.16-rc3:
> > https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-struct_slab-v2r2
>
> Pushed a new branch slab-struct-slab-v3r3 with accumulated fixes and small tweaks
> and a new patch from Hyeonggon Yoo on top. To avoid too much spam, here's a range diff:

Reviewing the whole patch series takes longer than I thought.
I'll try to review and test the rest of the patches when I have time.

I added Tested-by if the kernel builds okay and kselftests
do not break the kernel on my machine
(with CONFIG_SLAB/SLUB/SLOB depending on the patch).
Let me know if you know a better way to test a patch.

# mm/slub: Define struct slab fields for CONFIG_SLUB_CPU_PARTIAL only when enabled

Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
Tested-by: Hyeonggon Yoo <42.h...@gmail.com>

Comment:
Works on both SLUB_CPU_PARTIAL and !SLUB_CPU_PARTIAL.
btw, do we need slabs_cpu_partial attribute when we don't use
cpu partials? (!SLUB_CPU_PARTIAL)

# mm/slub: Simplify struct slab slabs field definition
Comment:

This is how struct page looks on top of the v3r3 branch:
struct page {
[...]
	struct {	/* slab, slob and slub */
		union {
			struct list_head slab_list;
			struct {	/* Partial pages */
				struct page *next;
#ifdef CONFIG_64BIT
				int pages;	/* Nr of pages left */
#else
				short int pages;
#endif
			};
		};
[...]

It's not consistent with struct slab.
I think this is because "mm: Remove slab from struct page" was dropped.
Would you update some of patches?

# mm/sl*b: Differentiate struct slab fields by sl*b implementations
Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
Works SL[AUO]B on my machine and makes code much better.

# mm/slob: Convert SLOB to use struct slab and struct folio
Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
It still works fine on SLOB.

# mm/slab: Convert kmem_getpages() and kmem_freepages() to struct slab
Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
Tested-by: Hyeonggon Yoo <42.h...@gmail.com>

# mm/slub: Convert __free_slab() to use struct slab
Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
Tested-by: Hyeonggon Yoo <42.h...@gmail.com>

Thanks,
Hyeonggon.

>

Vlastimil Babka

unread,
Dec 19, 2021, 7:24:55 PM12/19/21
to Roman Gushchin, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org, Hyeonggon Yoo
On 12/15/21 02:03, Roman Gushchin wrote:
> On Tue, Dec 14, 2021 at 01:57:22PM +0100, Vlastimil Babka wrote:
>> On 12/1/21 19:14, Vlastimil Babka wrote:
>> > Folks from non-slab subsystems are Cc'd only to patches affecting them, and
>> > this cover letter.
>> >
>> > Series also available in git, based on 5.16-rc3:
>> > https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-struct_slab-v2r2
>>
>> Pushed a new branch slab-struct-slab-v3r3 with accumulated fixes and small tweaks
>> and a new patch from Hyeonggon Yoo on top. To avoid too much spam, here's a range diff:
>
> Hi Vlastimil!
>
> I've started to review this patchset (btw, a really nice work, I like
> the resulting code way more). Because I'm looking at v3 and I don't have
> the whole v2 in my mailbox, here is what I've now:

Thanks a lot, Roman!

> * mm: add virt_to_folio() and folio_address()
> Reviewed-by: Roman Gushchin <gu...@fb.com>
>
> * mm/slab: Dissolve slab_map_pages() in its caller
> Reviewed-by: Roman Gushchin <gu...@fb.com>
>
> * mm/slub: Make object_err() static
> Reviewed-by: Roman Gushchin <gu...@fb.com>
>
> * mm: Split slab into its own type
> 1) Shouldn't SLAB_MATCH() macro use struct folio instead of struct page for the
> comparison?

Folio doesn't define most of the fields, and matching some to page and
others to folio seems like unnecessary complication. Maybe as part of the
final struct page cleanup when the slab fields are gone from struct page,
the rest could all be in folio - I'll check once we get there.

> 2) page_slab() is used only in kasan and only in one place, so maybe it's better
> to not introduce it as a generic helper?

Yeah that's the case after the series, but as part of the incremental steps,
page_slab() gets used in many places. I'll consider removing it on top though.

> Other than that
> Reviewed-by: Roman Gushchin <gu...@fb.com>
>
> * mm: Add account_slab() and unaccount_slab()
> 1) maybe change the title to convert/replace instead of add?

Done.

> 2) maybe move later changes to memcg_alloc_page_obj_cgroups() to this patch?

Maybe possible, but that would distort the series more than I'd like to at
this rc6 time.

Vlastimil Babka

unread,
Dec 19, 2021, 7:47:56 PM12/19/21
to Roman Gushchin, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org, Hyeonggon Yoo
On 12/16/21 00:38, Roman Gushchin wrote:
> Part 2:
>
> * mm: Convert check_heap_object() to use struct slab
> Reviewed-by: Roman Gushchin <gu...@fb.com>
>
> * mm/slub: Convert detached_freelist to use a struct slab
> How about to convert free_nonslab_page() to free_nonslab_folio()?
> And maybe rename it to something like free_large_kmalloc()?
> If I'm not missing something, large kmallocs is the only way how we can end up
> there with a !slab folio/page.

Good point, thanks! But I did it as part of the following patch, where it fits
logically better.

> * mm/slub: Convert kfree() to use a struct slab
> Reviewed-by: Roman Gushchin <gu...@fb.com>

Didn't add your tag because of the addition of free_large_kmalloc() change.
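For reference, the renamed helper ends up looking roughly like this (a sketch
of what's in the work-in-progress branch, details may still change):

static void free_large_kmalloc(struct folio *folio, void *object)
{
	unsigned int order = folio_order(folio);

	if (WARN_ON_ONCE(order == 0))
		pr_warn_once("object pointer: 0x%p\n", object);

	kfree_hook(object);
	mod_lruvec_page_state(folio_page(folio, 0), NR_SLAB_UNRECLAIMABLE_B,
			      -(PAGE_SIZE << order));
	__free_pages(folio_page(folio, 0), order);
}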

> * mm/slub: Convert __slab_lock() and __slab_unlock() to struct slab
> Reviewed-by: Roman Gushchin <gu...@fb.com>
>
> * mm/slub: Convert print_page_info() to print_slab_info()
> Do we really need to explicitly convert slab_folio()'s result to (struct folio *)?

Unfortunately yes, as long as folio_flags() don't take const struct folio *,
which will need some yak shaving.

> Reviewed-by: Roman Gushchin <gu...@fb.com>
>
> * mm/slub: Convert alloc_slab_page() to return a struct slab
> Reviewed-by: Roman Gushchin <gu...@fb.com>
>
> * mm/slub: Convert __free_slab() to use struct slab
> Reviewed-by: Roman Gushchin <gu...@fb.com>
>
> * mm/slub: Convert pfmemalloc_match() to take a struct slab
> Cool! Removing pfmemalloc_unsafe() is really nice.
> Reviewed-by: Roman Gushchin <gu...@fb.com>
>
> * mm/slub: Convert most struct page to struct slab by spatch
> Reviewed-by: Roman Gushchin <gu...@fb.com>
>
> * mm/slub: Finish struct page to struct slab conversion
> Reviewed-by: Roman Gushchin <gu...@fb.com>
>
> * mm/slab: Convert kmem_getpages() and kmem_freepages() to struct slab
> Reviewed-by: Roman Gushchin <gu...@fb.com>

Thanks again!

Matthew Wilcox

unread,
Dec 19, 2021, 8:43:02 PM12/19/21
to Vlastimil Babka, Roman Gushchin, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org, Hyeonggon Yoo
On Mon, Dec 20, 2021 at 01:47:54AM +0100, Vlastimil Babka wrote:
> > * mm/slub: Convert print_page_info() to print_slab_info()
> > Do we really need to explicitly convert slab_folio()'s result to (struct folio *)?
>
> Unfortunately yes, as long as folio_flags() don't take const struct folio *,
> which will need some yak shaving.

In case anyone's interested ...

folio_flags calls VM_BUG_ON_PGFLAGS() which would need its second
argument to be const.

That means dump_page() needs to take a const struct page, which
means __dump_page() needs its argument to be const.

That calls ...

is_migrate_cma_page()
page_mapping()
page_mapcount()
page_ref_count()
page_to_pgoff()
page_to_pfn()
hpage_pincount_available()
head_compound_mapcount()
head_compound_pincount()
compound_order()
PageKsm()
PageAnon()
PageCompound()

... and at that point, I ran out of motivation to track down some parts
of this tarbaby that could be fixed. I did do:

mm: constify page_count and page_ref_count
mm: constify get_pfnblock_flags_mask and get_pfnblock_migratetype
mm: make compound_head const-preserving
mm/page_owner: constify dump_page_owner

so some of those are already done. But a lot of them just need to be
done at the same time. For example, page_mapping() calls
folio_mapping() which calls folio_test_slab() which calls folio_flags(),
so dump_page() and page_mapping() need to be done at the same time.

One bit that could be broken off easily (I think ...) is PageTransTail(),
PageTail(), PageCompound(), PageHuge(), page_to_pgoff() and
page_to_index(). One wrinkle is that a temporary
TESTPAGEFLAGS_FALSE_CONST is needed. But I haven't tried it yet.
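To give a taste, the page_ref_count() constify I mentioned is essentially just
adding const to the argument (roughly; it works because atomic_read() already
takes a const atomic_t *):

static inline int page_ref_count(const struct page *page)
{
	return atomic_read(&page->_refcount);
}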

Vlastimil Babka

unread,
Dec 20, 2021, 6:58:17 PM12/20/21
to Hyeonggon Yoo, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org
On 12/16/21 16:00, Hyeonggon Yoo wrote:
> On Tue, Dec 14, 2021 at 01:57:22PM +0100, Vlastimil Babka wrote:
>> On 12/1/21 19:14, Vlastimil Babka wrote:
>> > Folks from non-slab subsystems are Cc'd only to patches affecting them, and
>> > this cover letter.
>> >
>> > Series also available in git, based on 5.16-rc3:
>> > https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-struct_slab-v2r2
>>
>> Pushed a new branch slab-struct-slab-v3r3 with accumulated fixes and small tweaks
>> and a new patch from Hyeonggon Yoo on top. To avoid too much spam, here's a range diff:
>
> Reviewing the whole patch series takes longer than I thought.
> I'll try to review and test rest of patches when I have time.
>
> I added Tested-by if kernel builds okay and kselftests
> does not break the kernel on my machine.
> (with CONFIG_SLAB/SLUB/SLOB depending on the patch),

Thanks!

> Let me know me if you know better way to test a patch.

Testing on your machine is just fine.

> # mm/slub: Define struct slab fields for CONFIG_SLUB_CPU_PARTIAL only when enabled
>
> Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
> Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
>
> Comment:
> Works on both SLUB_CPU_PARTIAL and !SLUB_CPU_PARTIAL.
> btw, do we need slabs_cpu_partial attribute when we don't use
> cpu partials? (!SLUB_CPU_PARTIAL)

The sysfs attribute? Yeah we should be consistent to userspace expecting to
read it (even with zeroes), regardless of config.

> # mm/slub: Simplify struct slab slabs field definition
> Comment:
>
> This is how struct page looks on the top of v3r3 branch:
> struct page {
> [...]
> struct { /* slab, slob and slub */
> union {
> struct list_head slab_list;
> struct { /* Partial pages */
> struct page *next;
> #ifdef CONFIG_64BIT
> int pages; /* Nr of pages left */
> #else
> short int pages;
> #endif
> };
> };
> [...]
>
> It's not consistent with struct slab.

Hm right. But as we don't actually use the struct page version anymore, and
it's not one of the fields checked by SLAB_MATCH(), we can ignore this.

> I think this is because "mm: Remove slab from struct page" was dropped.

That was just postponed until the iommu changes are in. Matthew mentioned those
might be merged too, so that final cleanup will happen as well and take care of
the discrepancy above, so no need for extra churn to address it specifically.

> Would you update some of patches?
>
> # mm/sl*b: Differentiate struct slab fields by sl*b implementations
> Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
> Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
> Works SL[AUO]B on my machine and makes code much better.
>
> # mm/slob: Convert SLOB to use struct slab and struct folio
> Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
> Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
> It still works fine on SLOB.
>
> # mm/slab: Convert kmem_getpages() and kmem_freepages() to struct slab
> Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
> Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
>
> # mm/slub: Convert __free_slab() to use struct slab
> Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
> Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
>
> Thanks,
> Hyeonggon.

Thanks again,
Vlastimil

Robin Murphy

unread,
Dec 21, 2021, 12:25:16 PM12/21/21
to Vlastimil Babka, Hyeonggon Yoo, Peter Zijlstra, Dave Hansen, Michal Hocko, linu...@kvack.org, Andrey Ryabinin, Alexander Potapenko, kasa...@googlegroups.com, H. Peter Anvin, Christoph Lameter, Will Deacon, Julia Lawall, Sergey Senozhatsky, x...@kernel.org, Luis Chamberlain, Matthew Wilcox, Ingo Molnar, Vladimir Davydov, David Rientjes, Nitin Gupta, Marco Elver, Borislav Petkov, Andy Lutomirski, cgr...@vger.kernel.org, Thomas Gleixner, Joonsoo Kim, Dmitry Vyukov, Andrey Konovalov, pat...@lists.linux.dev, Pekka Enberg, Minchan Kim, io...@lists.linux-foundation.org, Johannes Weiner, Andrew Morton, David Woodhouse
FYI the IOMMU changes are now queued in linux-next, so if all goes well
you might be able to sneak that final patch in too.

Robin.

>
>> Would you update some of patches?
>>
>> # mm/sl*b: Differentiate struct slab fields by sl*b implementations
>> Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
>> Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
>> Works SL[AUO]B on my machine and makes code much better.
>>
>> # mm/slob: Convert SLOB to use struct slab and struct folio
>> Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
>> Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
>> It still works fine on SLOB.
>>
>> # mm/slab: Convert kmem_getpages() and kmem_freepages() to struct slab
>> Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
>> Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
>>
>> # mm/slub: Convert __free_slab() to use struct slab
>> Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
>> Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
>>
>> Thanks,
>> Hyeonggon.
>
> Thanks again,
> Vlastimil
> _______________________________________________
> iommu mailing list
> io...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu

Hyeonggon Yoo

unread,
Dec 22, 2021, 2:36:44 AM12/22/21
to Vlastimil Babka, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org
On Tue, Dec 21, 2021 at 12:58:14AM +0100, Vlastimil Babka wrote:
> On 12/16/21 16:00, Hyeonggon Yoo wrote:
> > On Tue, Dec 14, 2021 at 01:57:22PM +0100, Vlastimil Babka wrote:
> >> On 12/1/21 19:14, Vlastimil Babka wrote:
> >> > Folks from non-slab subsystems are Cc'd only to patches affecting them, and
> >> > this cover letter.
> >> >
> >> > Series also available in git, based on 5.16-rc3:
> >> > https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-struct_slab-v2r2
> >>
> >> Pushed a new branch slab-struct-slab-v3r3 with accumulated fixes and small tweaks
> >> and a new patch from Hyeonggon Yoo on top. To avoid too much spam, here's a range diff:
> >
> > Reviewing the whole patch series takes longer than I thought.
> > I'll try to review and test rest of patches when I have time.
> >
> > I added Tested-by if kernel builds okay and kselftests
> > does not break the kernel on my machine.
> > (with CONFIG_SLAB/SLUB/SLOB depending on the patch),
>
> Thanks!
>

:)

> > Let me know me if you know better way to test a patch.
>
> Testing on your machine is just fine.
>

Good!

> > # mm/slub: Define struct slab fields for CONFIG_SLUB_CPU_PARTIAL only when enabled
> >
> > Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
> > Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
> >
> > Comment:
> > Works on both SLUB_CPU_PARTIAL and !SLUB_CPU_PARTIAL.
> > btw, do we need slabs_cpu_partial attribute when we don't use
> > cpu partials? (!SLUB_CPU_PARTIAL)
>
> The sysfs attribute? Yeah we should be consistent to userspace expecting to
> read it (even with zeroes), regardless of config.
>

I thought entirely disabling the attribute would be simpler,
but okay, it should be exposed even if it's always zero.

> > # mm/slub: Simplify struct slab slabs field definition
> > Comment:
> >
> > This is how struct page looks on the top of v3r3 branch:
> > struct page {
> > [...]
> > struct { /* slab, slob and slub */
> > union {
> > struct list_head slab_list;
> > struct { /* Partial pages */
> > struct page *next;
> > #ifdef CONFIG_64BIT
> > int pages; /* Nr of pages left */
> > #else
> > short int pages;
> > #endif
> > };
> > };
> > [...]
> >
> > It's not consistent with struct slab.
>
> Hm right. But as we don't actually use the struct page version anymore, and
> it's not one of the fields checked by SLAB_MATCH(), we can ignore this.
>

Yeah, this is not a big problem. I just mentioned it because
it looked weird and I didn't know when the patch "mm: Remove slab from struct page"
would come back.

> > I think this is because "mm: Remove slab from struct page" was dropped.
>
> That was just postponed until iommu changes are in. Matthew mentioned those
> might be merged too, so that final cleanup will happen too and take care of
> the discrepancy above, so no need for extra churn to address it speficially.
>

Okay, it seems no extra work is needed until the iommu changes are in!

BTW, the patch that I sent ("mm/slob: Remove unnecessary page_mapcount_reset()
function call") refers to commit 4525180926f9 ("mm/sl*b: Differentiate struct slab fields by
sl*b implementations"). But the commit hash 4525180926f9 changed after the
tree was updated.

It would be nice to have a script to handle situations like this.

> > Would you update some of patches?
> >
> > # mm/sl*b: Differentiate struct slab fields by sl*b implementations
> > Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
> > Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
> > Works SL[AUO]B on my machine and makes code much better.
> >
> > # mm/slob: Convert SLOB to use struct slab and struct folio
> > Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
> > Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
> > It still works fine on SLOB.
> >
> > # mm/slab: Convert kmem_getpages() and kmem_freepages() to struct slab
> > Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
> > Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
> >
> > # mm/slub: Convert __free_slab() to use struct slab
> > Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
> > Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
> >
> > Thanks,
> > Hyeonggon.
>
> Thanks again,
> Vlastimil

Have a nice day, thanks!
Hyeonggon.

Vlastimil Babka

unread,
Dec 22, 2021, 11:57:52 AM12/22/21
to Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org, Hyeonggon Yoo, Roman Gushchin
On 12/14/21 13:57, Vlastimil Babka wrote:
> On 12/1/21 19:14, Vlastimil Babka wrote:
>> Folks from non-slab subsystems are Cc'd only to patches affecting them, and
>> this cover letter.
>>
>> Series also available in git, based on 5.16-rc3:
>> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-struct_slab-v2r2
>
> Pushed a new branch slab-struct-slab-v3r3 with accumulated fixes and small tweaks
> and a new patch from Hyeonggon Yoo on top. To avoid too much spam, here's a range diff:

Hi, I've pushed another update branch slab-struct_slab-v4r1, and also to
-next. I've shortened git commit log lines to make checkpatch happier,
so no range-diff as it would be too long. I believe it would be useless
spam to post the whole series now, shortly before xmas, so I will do it
at rc8 time, to hopefully collect remaining reviews. But if anyone wants
a mailed version, I can do that.

Changes in v4:
- rebase to 5.16-rc6 to avoid a conflict with mainline
- collect acks/reviews/tested-by from Johannes, Roman, Hyeonggon Yoo -
thanks!
- in patch "mm/slub: Convert detached_freelist to use a struct slab"
renamed free_nonslab_page() to free_large_kmalloc() and use folio there,
as suggested by Roman
- in "mm/memcg: Convert slab objcgs from struct page to struct slab"
change one caller of slab_objcgs_check() to slab_objcgs() as suggested
by Johannes, realize the other caller should be also changed, and remove
slab_objcgs_check() completely.

Hyeonggon Yoo

unread,
Dec 25, 2021, 4:17:06 AM12/25/21
to Vlastimil Babka, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org, Roman Gushchin
On Wed, Dec 22, 2021 at 05:56:50PM +0100, Vlastimil Babka wrote:
> On 12/14/21 13:57, Vlastimil Babka wrote:
> > On 12/1/21 19:14, Vlastimil Babka wrote:
> >> Folks from non-slab subsystems are Cc'd only to patches affecting them, and
> >> this cover letter.
> >>
> >> Series also available in git, based on 5.16-rc3:
> >> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-struct_slab-v2r2
> >
> > Pushed a new branch slab-struct-slab-v3r3 with accumulated fixes and small tweaks
> > and a new patch from Hyeonggon Yoo on top. To avoid too much spam, here's a range diff:
>
> Hi, I've pushed another update branch slab-struct_slab-v4r1, and also to
> -next. I've shortened git commit log lines to make checkpatch happier,
> so no range-diff as it would be too long. I believe it would be useless
> spam to post the whole series now, shortly before xmas, so I will do it
> at rc8 time, to hopefully collect remaining reviews. But if anyone wants
> a mailed version, I can do that.

Hello Vlastimil, Merry Christmas!
This is part 2 of reviewing/testing patches.

# mm/kasan: Convert to struct folio and struct slab
I'm not familiar with kasan yet, but kasan runs well on my machine and
kasan's bug report functionality works fine too.
Tested-by: Hyeonggon Yoo <42.h...@gmail.com>

# mm: Convert struct page to struct slab in functions used by other subsystems
I'm not familiar with kasan, but let me ask:
does ____kasan_slab_free() detect an invalid free if someone frees
an object that is not allocated from a slab?

@@ -341,7 +341,7 @@ static inline bool ____kasan_slab_free(struct kmem_cache *cache, void *object,
- if (unlikely(nearest_obj(cache, virt_to_head_page(object), object) !=
+ if (unlikely(nearest_obj(cache, virt_to_slab(object), object) !=
object)) {
kasan_report_invalid_free(tagged_object, ip);
return true;

I'm asking this because virt_to_slab() will return NULL if folio_test_slab()
returns false. That will cause a NULL pointer dereference in nearest_obj().
I don't think this change is intended.

This makes me think some of the virt_to_head_page() -> virt_to_slab()
conversions need to be reviewed with caution.
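For context, virt_to_slab() in the series looks roughly like this (paraphrasing
from mm/slab.h, so the exact wording may differ):

static inline struct slab *virt_to_slab(const void *addr)
{
	struct folio *folio = virt_to_folio(addr);

	if (!folio_test_slab(folio))
		return NULL;

	return folio_slab(folio);
}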

# mm/slab: Finish struct page to struct slab conversion
Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>

# mm/slab: Convert most struct page to struct slab by spatch
Tested-by: Hyeonggon Yoo <42.h...@gmail.com>

I'll come back with part 3 :)
Enjoy your Christmas!
Hyeonggon

Matthew Wilcox

unread,
Dec 25, 2021, 12:54:08 PM12/25/21
to Hyeonggon Yoo, Vlastimil Babka, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org, Roman Gushchin
On Sat, Dec 25, 2021 at 09:16:55AM +0000, Hyeonggon Yoo wrote:
> # mm: Convert struct page to struct slab in functions used by other subsystems
> I'm not familiar with kasan, but let me ask: does ____kasan_slab_free()
> detect an invalid free if someone frees an object that was not allocated
> from a slab?
>
> @@ -341,7 +341,7 @@ static inline bool ____kasan_slab_free(struct kmem_cache *cache, void *object,
> - if (unlikely(nearest_obj(cache, virt_to_head_page(object), object) !=
> + if (unlikely(nearest_obj(cache, virt_to_slab(object), object) !=
> object)) {
> kasan_report_invalid_free(tagged_object, ip);
> return true;
>
> I'm asking this because virt_to_slab() will return NULL if folio_test_slab()
> returns false, which would cause a NULL pointer dereference in nearest_obj().
> I don't think this change is intended.

You need to track down how this could happen. As far as I can tell,
it's always called when we know the object is part of a slab. That's
where the cachep pointer is deduced from.

Hyeonggon Yoo

unread,
Dec 26, 2021, 9:43:26 PM12/26/21
to Matthew Wilcox, Vlastimil Babka, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org, Roman Gushchin
Thank you Matthew, you are right. I read the code too narrowly.
When we call the kasan hooks, we already know that the object was
allocated from the slab cache (via cache_from_obj).
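
For reference, cache_from_obj() around this series looks roughly like the
following (a sketch from memory, so details may differ); with hardening or
consistency checks enabled it cross-checks the cache deduced from the
object's slab:

static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
{
	struct kmem_cache *cachep;

	if (!IS_ENABLED(CONFIG_SLAB_FREELIST_HARDENED) &&
	    !kmem_cache_debug_flags(s, SLAB_CONSISTENCY_CHECKS))
		return s;

	/* Looks up the cache via the object's slab; NULL for non-slab pages. */
	cachep = virt_to_cache(x);
	if (WARN(cachep && cachep != s,
		 "%s: Wrong slab cache. %s but object is from %s\n",
		 __func__, s->name, cachep->name))
		print_tracking(cachep, x);
	return cachep;
}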

I'll review that patch again in part 3!

Thanks,
Hyeonggon

Hyeonggon Yoo

unread,
Dec 29, 2021, 6:23:04 AM12/29/21
to Vlastimil Babka, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org, Roman Gushchin
On Wed, Dec 22, 2021 at 05:56:50PM +0100, Vlastimil Babka wrote:
> On 12/14/21 13:57, Vlastimil Babka wrote:
> > On 12/1/21 19:14, Vlastimil Babka wrote:
> >> Folks from non-slab subsystems are Cc'd only to patches affecting them, and
> >> this cover letter.
> >>
> >> Series also available in git, based on 5.16-rc3:
> >> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-struct_slab-v2r2
> >
> > Pushed a new branch slab-struct-slab-v3r3 with accumulated fixes and small tweaks
> > and a new patch from Hyeonggon Yoo on top. To avoid too much spam, here's a range diff:
>
> Hi, I've pushed another update branch slab-struct_slab-v4r1, and also to
> -next. I've shortened git commit log lines to make checkpatch happier,
> so no range-diff as it would be too long. I believe it would be useless
> spam to post the whole series now, shortly before xmas, so I will do it
> at rc8 time, to hopefully collect remaining reviews. But if anyone wants
> a mailed version, I can do that.
>

Hello Matthew and Vlastimil,
this is part 3 of my review.

# mm: Convert struct page to struct slab in functions used by other subsystems
Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>


# mm/slub: Convert most struct page to struct slab by spatch
Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
with a question below.

-static int check_slab(struct kmem_cache *s, struct page *page)
+static int check_slab(struct kmem_cache *s, struct slab *slab)
{
int maxobj;

- if (!PageSlab(page)) {
- slab_err(s, page, "Not a valid slab page");
+ if (!folio_test_slab(slab_folio(slab))) {
+ slab_err(s, slab, "Not a valid slab page");
return 0;
}

Can't we guarantee that a struct slab * always points to a slab?

For a struct page *, !PageSlab(page) is possible because a struct page *
can describe something other than a slab. But unlike struct page, a
struct slab * can only be a slab, so the code would be simpler if we
guaranteed that a struct slab * always points to a slab (or is NULL).


# mm/slub: Convert pfmemalloc_match() to take a struct slab
It's a bit confusing because the original pfmemalloc_match() is removed,
and pfmemalloc_match_unsafe() was renamed to pfmemalloc_match() and
converted to use the slab_test_pfmemalloc() helper.

But I agree with the resulting code. so:
Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
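
For readers following along, the resulting pfmemalloc_match() after the
rename looks roughly like this (a sketch from memory of the code in this
series; details may differ):

static inline bool pfmemalloc_match(struct slab *slab, gfp_t gfpflags)
{
	/* A pfmemalloc slab may only serve allocations allowed to use reserves. */
	if (unlikely(slab_test_pfmemalloc(slab)))
		return gfp_pfmemalloc_allowed(gfpflags);

	return true;
}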


# mm/slub: Convert alloc_slab_page() to return a struct slab
Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
Tested-by: Hyeonggon Yoo <42.h...@gmail.com>


# mm/slub: Convert print_page_info() to print_slab_info()
Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>

I hope to review the rest of the patches in a week.

Thanks,
Hyeonggon

Vlastimil Babka

unread,
Jan 3, 2022, 12:56:24 PM1/3/22
to Hyeonggon Yoo, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, pat...@lists.linux.dev, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, David Woodhouse, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, io...@lists.linux-foundation.org, Joerg Roedel, Johannes Weiner, Julia Lawall, kasa...@googlegroups.com, Lu Baolu, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Suravee Suthikulpanit, Thomas Gleixner, Vladimir Davydov, Will Deacon, x...@kernel.org, Roman Gushchin
Normally, yes.

> For a struct page *, !PageSlab(page) is possible because a struct page *
> can describe something other than a slab. But unlike struct page, a
> struct slab * can only be a slab, so the code would be simpler if we
> guaranteed that a struct slab * always points to a slab (or is NULL).

That's indeed what the code does. But check_slab() is called as part of
various consistency checks, so there we deliberately question all assumptions
in order to find a bug (e.g. memory corruption) - such as a page that's
still on the list of slabs although it was already freed and reused, and
thus e.g. lacks the slab page flag.

But it's nice how using struct slab makes such a check immediately stand out
as suspicious, right?
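
For context, folio_slab() and slab_folio() are pure compile-time casts along
these lines (a sketch from memory of the definitions in mm/slab.h in this
series), so the conversion itself never re-validates anything - which is why
a consistency check like the above still has to test the flag explicitly:

/* Cast between the overlaid types; no runtime check is performed. */
#define folio_slab(folio)	(_Generic((folio),			\
	const struct folio *:	(const struct slab *)(folio),		\
	struct folio *:		(struct slab *)(folio)))

#define slab_folio(s)		(_Generic((s),				\
	const struct slab *:	(const struct folio *)s,		\
	struct slab *:		(struct folio *)s))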

> # mm/slub: Convert pfmemalloc_match() to take a struct slab
> It's a bit confusing because the original pfmemalloc_match() is removed,
> and pfmemalloc_match_unsafe() was renamed to pfmemalloc_match() and
> converted to use the slab_test_pfmemalloc() helper.
>
> But I agree with the resulting code. so:
> Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
>
>
> # mm/slub: Convert alloc_slab_page() to return a struct slab
> Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
> Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
>
>
> # mm/slub: Convert print_page_info() to print_slab_info()
> Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
>
> I hope to review the rest of the patches in a week.

Thanks for your reviews/tests!

Vlastimil Babka

unread,
Jan 3, 2022, 7:10:54 PM1/3/22
to Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, Johannes Weiner, Roman Gushchin, Hyeonggon Yoo, pat...@lists.linux.dev, Vlastimil Babka, Alexander Potapenko, Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov, cgr...@vger.kernel.org, Dave Hansen, Dmitry Vyukov, H. Peter Anvin, Ingo Molnar, Julia Lawall, kasa...@googlegroups.com, Luis Chamberlain, Marco Elver, Michal Hocko, Minchan Kim, Nitin Gupta, Peter Zijlstra, Sergey Senozhatsky, Thomas Gleixner, Vladimir Davydov, x...@kernel.org
Folks from non-slab subsystems are Cc'd only to patches affecting them, and
this cover letter.

Series also available in git, based on 5.16-rc6:
https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-struct_slab-v4r2

The plan is to submit this as a pull request; the previous versions have
been in linux-next since v2 in early December. This v4 has been in
linux-next since Dec 22:
https://lore.kernel.org/all/f3a83708-3f3c-a634...@suse.cz/
I planned to post it on the mailing list for any final review in January,
so this is it. Only reviewed/tested tags from Hyeonggon Yoo were added
meanwhile.

Changes from v3:
https://lore.kernel.org/all/4c3dfdfa-2e19-a9a7...@suse.cz/
- rebase to 5.16-rc6 to avoid a conflict with mainline
- collect acks/reviews/tested-by from Johannes, Roman, Hyeonggon Yoo -
thanks!
- in patch "mm/slub: Convert detached_freelist to use a struct slab"
renamed free_nonslab_page() to free_large_kmalloc() and use folio there,
as suggested by Roman
- in "mm/memcg: Convert slab objcgs from struct page to struct slab"
change one caller of slab_objcgs_check() to slab_objcgs() as suggested
by Johannes, realize the other caller should be also changed, and remove
slab_objcgs_check() completely.

Initial version from Matthew Wilcox:
https://lore.kernel.org/all/20211004134650....@infradead.org/

LWN coverage of the above:
https://lwn.net/Articles/871982/

This is originally an offshoot of the folio work by Matthew. One of the more
complex parts of the struct page definition are the parts used by the slab
allocators. It would be good for the MM in general if struct slab were its own
data type, and it also helps to prevent tail pages from slipping in anywhere.
As Matthew requested in his proof of concept series, I have taken over the
development of this series, so it's a mix of patches from him (often modified
by me) and my own.

One big difference is the use of coccinelle to perform the relatively trivial
parts of the conversions automatically and at once, instead of a larger number
of smaller incremental reviewable steps. Thanks to Julia Lawall and Luis
Chamberlain for all their help!

Another notable difference is (based also on review feedback) I don't represent
with a struct slab the large kmalloc allocations which are not really a slab,
but use page allocator directly. When going from an object address to a struct
slab, the code tests first folio slab flag, and only if it's set it converts to
struct slab. This makes the struct slab type stronger.

Finally, although Matthew's version didn't use any of the folio work, the
initial support has been merged meanwhile so my version builds on top of it
where appropriate. This eliminates some of the redundant compound_head()
being performed e.g. when testing the slab flag.

To sum up, after this series, struct page fields used by slab allocators are
moved from struct page to a new struct slab, that uses the same physical
storage. The availability of the fields is further distinguished by the
selected slab allocator implementation. The advantages include:

- Similar to folios, if the slab is of order > 0, struct slab always is
guaranteed to be the head page. Additionally it's guaranteed to be an actual
slab page, not a large kmalloc. This removes uncertainty and potential for
bugs.
- It's not possible to accidentally use fields of the slab implementation that's
not configured.
- Other subsystems cannot use slab's fields in struct page anymore (some
existing non-slab usages had to be adjusted in this series), so slab
implementations have more freedom in rearranging them in the struct slab.
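
To make the "same physical storage" point concrete: the series keeps the
struct page and struct slab layouts in sync with compile-time asserts along
these lines (a sketch from memory; the actual list of asserted fields
differs per config, see the SLAB_MATCH hunks later in the series):

/* Fail the build if a struct slab field drifts from its struct page twin. */
#define SLAB_MATCH(pg, sl)						\
	static_assert(offsetof(struct page, pg) == offsetof(struct slab, sl))

SLAB_MATCH(flags, __page_flags);
SLAB_MATCH(slab_cache, slab_cache);
SLAB_MATCH(_refcount, __page_refcount);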

Hyeonggon Yoo (1):
mm/slob: Remove unnecessary page_mapcount_reset() function call

Matthew Wilcox (Oracle) (14):
mm: Split slab into its own type
mm: Convert [un]account_slab_page() to struct slab
mm: Convert virt_to_cache() to use struct slab
mm: Convert __ksize() to struct slab
mm: Use struct slab in kmem_obj_info()
mm: Convert check_heap_object() to use struct slab
mm/slub: Convert detached_freelist to use a struct slab
mm/slub: Convert kfree() to use a struct slab
mm/slub: Convert print_page_info() to print_slab_info()
mm/slub: Convert pfmemalloc_match() to take a struct slab
mm/slob: Convert SLOB to use struct slab and struct folio
mm/kasan: Convert to struct folio and struct slab
zsmalloc: Stop using slab fields in struct page
bootmem: Use page->index instead of page->freelist

Vlastimil Babka (17):
mm: add virt_to_folio() and folio_address()
mm/slab: Dissolve slab_map_pages() in its caller
mm/slub: Make object_err() static
mm/slub: Convert __slab_lock() and __slab_unlock() to struct slab
mm/slub: Convert alloc_slab_page() to return a struct slab
mm/slub: Convert __free_slab() to use struct slab
mm/slub: Convert most struct page to struct slab by spatch
mm/slub: Finish struct page to struct slab conversion
mm/slab: Convert kmem_getpages() and kmem_freepages() to struct slab
mm/slab: Convert most struct page to struct slab by spatch
mm/slab: Finish struct page to struct slab conversion
mm: Convert struct page to struct slab in functions used by other
subsystems
mm/memcg: Convert slab objcgs from struct page to struct slab
mm/kfence: Convert kfence_guarded_alloc() to struct slab
mm/sl*b: Differentiate struct slab fields by sl*b implementations
mm/slub: Simplify struct slab slabs field definition
mm/slub: Define struct slab fields for CONFIG_SLUB_CPU_PARTIAL only
when enabled

arch/x86/mm/init_64.c | 2 +-
include/linux/bootmem_info.h | 2 +-
include/linux/kasan.h | 9 +-
include/linux/memcontrol.h | 48 --
include/linux/mm.h | 12 +
include/linux/mm_types.h | 10 +-
include/linux/slab.h | 8 -
include/linux/slab_def.h | 16 +-
include/linux/slub_def.h | 29 +-
mm/bootmem_info.c | 7 +-
mm/kasan/common.c | 27 +-
mm/kasan/generic.c | 8 +-
mm/kasan/kasan.h | 1 +
mm/kasan/quarantine.c | 2 +-
mm/kasan/report.c | 13 +-
mm/kasan/report_tags.c | 10 +-
mm/kfence/core.c | 17 +-
mm/kfence/kfence_test.c | 6 +-
mm/memcontrol.c | 47 +-
mm/slab.c | 456 +++++++------
mm/slab.h | 305 +++++++--
mm/slab_common.c | 14 +-
mm/slob.c | 62 +-
mm/slub.c | 1177 +++++++++++++++++-----------------
mm/sparse.c | 2 +-
mm/usercopy.c | 13 +-
mm/zsmalloc.c | 18 +-
27 files changed, 1263 insertions(+), 1058 deletions(-)

--
2.34.1

Vlastimil Babka

unread,
Jan 3, 2022, 7:10:58 PM1/3/22
to Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, Johannes Weiner, Roman Gushchin, Hyeonggon Yoo, pat...@lists.linux.dev, Vlastimil Babka, Andrey Konovalov, Julia Lawall, Luis Chamberlain, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Marco Elver, Michal Hocko, Vladimir Davydov, kasa...@googlegroups.com, cgr...@vger.kernel.org

Signed-off-by: Vlastimil Babka <vba...@suse.cz>
Reviewed-by: Andrey Konovalov <andre...@gmail.com>
Acked-by: Johannes Weiner <han...@cmpxchg.org>
Cc: Julia Lawall <julia....@inria.fr>
Cc: Luis Chamberlain <mcg...@kernel.org>
Cc: Andrey Ryabinin <ryabin...@gmail.com>
Cc: Alexander Potapenko <gli...@google.com>
Cc: Andrey Konovalov <andre...@gmail.com>
Cc: Dmitry Vyukov <dvy...@google.com>
Cc: Marco Elver <el...@google.com>
Cc: Johannes Weiner <han...@cmpxchg.org>
Cc: Michal Hocko <mho...@kernel.org>
Cc: Vladimir Davydov <vdavyd...@gmail.com>
Cc: <kasa...@googlegroups.com>
Cc: <cgr...@vger.kernel.org>
---
include/linux/slab_def.h | 16 ++++++++--------
include/linux/slub_def.h | 18 +++++++++---------
mm/kasan/common.c | 4 ++--
mm/kasan/generic.c | 2 +-
mm/kasan/report.c | 2 +-
mm/kasan/report_tags.c | 2 +-
mm/kfence/kfence_test.c | 4 ++--
mm/memcontrol.c | 4 ++--
mm/slab.c | 10 +++++-----
mm/slab.h | 4 ++--
mm/slub.c | 2 +-
11 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
index 3aa5e1e73ab6..e24c9aff6fed 100644
--- a/include/linux/slab_def.h
+++ b/include/linux/slab_def.h
@@ -87,11 +87,11 @@ struct kmem_cache {
struct kmem_cache_node *node[MAX_NUMNODES];
};

-static inline void *nearest_obj(struct kmem_cache *cache, struct page *page,
+static inline void *nearest_obj(struct kmem_cache *cache, const struct slab *slab,
void *x)
{
- void *object = x - (x - page->s_mem) % cache->size;
- void *last_object = page->s_mem + (cache->num - 1) * cache->size;
+ void *object = x - (x - slab->s_mem) % cache->size;
+ void *last_object = slab->s_mem + (cache->num - 1) * cache->size;

if (unlikely(object > last_object))
return last_object;
@@ -106,16 +106,16 @@ static inline void *nearest_obj(struct kmem_cache *cache, struct page *page,
* reciprocal_divide(offset, cache->reciprocal_buffer_size)
*/
static inline unsigned int obj_to_index(const struct kmem_cache *cache,
- const struct page *page, void *obj)
+ const struct slab *slab, void *obj)
{
- u32 offset = (obj - page->s_mem);
+ u32 offset = (obj - slab->s_mem);
return reciprocal_divide(offset, cache->reciprocal_buffer_size);
}

-static inline int objs_per_slab_page(const struct kmem_cache *cache,
- const struct page *page)
+static inline int objs_per_slab(const struct kmem_cache *cache,
+ const struct slab *slab)
{
- if (is_kfence_address(page_address(page)))
+ if (is_kfence_address(slab_address(slab)))
return 1;
return cache->num;
}
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 8a9c2876ca89..33c5c0e3bd8d 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -158,11 +158,11 @@ static inline void sysfs_slab_release(struct kmem_cache *s)

void *fixup_red_left(struct kmem_cache *s, void *p);

-static inline void *nearest_obj(struct kmem_cache *cache, struct page *page,
+static inline void *nearest_obj(struct kmem_cache *cache, const struct slab *slab,
void *x) {
- void *object = x - (x - page_address(page)) % cache->size;
- void *last_object = page_address(page) +
- (page->objects - 1) * cache->size;
+ void *object = x - (x - slab_address(slab)) % cache->size;
+ void *last_object = slab_address(slab) +
+ (slab->objects - 1) * cache->size;
void *result = (unlikely(object > last_object)) ? last_object : object;

result = fixup_red_left(cache, result);
@@ -178,16 +178,16 @@ static inline unsigned int __obj_to_index(const struct kmem_cache *cache,
}

static inline unsigned int obj_to_index(const struct kmem_cache *cache,
- const struct page *page, void *obj)
+ const struct slab *slab, void *obj)
{
if (is_kfence_address(obj))
return 0;
- return __obj_to_index(cache, page_address(page), obj);
+ return __obj_to_index(cache, slab_address(slab), obj);
}

-static inline int objs_per_slab_page(const struct kmem_cache *cache,
- const struct page *page)
+static inline int objs_per_slab(const struct kmem_cache *cache,
+ const struct slab *slab)
{
- return page->objects;
+ return slab->objects;
}
#endif /* _LINUX_SLUB_DEF_H */
diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index 8428da2aaf17..6a1cd2d38bff 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -298,7 +298,7 @@ static inline u8 assign_tag(struct kmem_cache *cache,
/* For caches that either have a constructor or SLAB_TYPESAFE_BY_RCU: */
#ifdef CONFIG_SLAB
/* For SLAB assign tags based on the object index in the freelist. */
- return (u8)obj_to_index(cache, virt_to_head_page(object), (void *)object);
+ return (u8)obj_to_index(cache, virt_to_slab(object), (void *)object);
#else
/*
* For SLUB assign a random tag during slab creation, otherwise reuse
@@ -341,7 +341,7 @@ static inline bool ____kasan_slab_free(struct kmem_cache *cache, void *object,
if (is_kfence_address(object))
return false;

- if (unlikely(nearest_obj(cache, virt_to_head_page(object), object) !=
+ if (unlikely(nearest_obj(cache, virt_to_slab(object), object) !=
object)) {
kasan_report_invalid_free(tagged_object, ip);
return true;
diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
index 84a038b07c6f..5d0b79416c4e 100644
--- a/mm/kasan/generic.c
+++ b/mm/kasan/generic.c
@@ -339,7 +339,7 @@ static void __kasan_record_aux_stack(void *addr, bool can_alloc)
return;

cache = page->slab_cache;
- object = nearest_obj(cache, page, addr);
+ object = nearest_obj(cache, page_slab(page), addr);
alloc_meta = kasan_get_alloc_meta(cache, object);
if (!alloc_meta)
return;
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 0bc10f452f7e..e00999dc6499 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -249,7 +249,7 @@ static void print_address_description(void *addr, u8 tag)

if (page && PageSlab(page)) {
struct kmem_cache *cache = page->slab_cache;
- void *object = nearest_obj(cache, page, addr);
+ void *object = nearest_obj(cache, page_slab(page), addr);

describe_object(cache, object, addr, tag);
}
diff --git a/mm/kasan/report_tags.c b/mm/kasan/report_tags.c
index 8a319fc16dab..06c21dd77493 100644
--- a/mm/kasan/report_tags.c
+++ b/mm/kasan/report_tags.c
@@ -23,7 +23,7 @@ const char *kasan_get_bug_type(struct kasan_access_info *info)
page = kasan_addr_to_page(addr);
if (page && PageSlab(page)) {
cache = page->slab_cache;
- object = nearest_obj(cache, page, (void *)addr);
+ object = nearest_obj(cache, page_slab(page), (void *)addr);
alloc_meta = kasan_get_alloc_meta(cache, object);

if (alloc_meta) {
diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
index 695030c1fff8..f7276711d7b9 100644
--- a/mm/kfence/kfence_test.c
+++ b/mm/kfence/kfence_test.c
@@ -291,8 +291,8 @@ static void *test_alloc(struct kunit *test, size_t size, gfp_t gfp, enum allocat
* even for KFENCE objects; these are required so that
* memcg accounting works correctly.
*/
- KUNIT_EXPECT_EQ(test, obj_to_index(s, page, alloc), 0U);
- KUNIT_EXPECT_EQ(test, objs_per_slab_page(s, page), 1);
+ KUNIT_EXPECT_EQ(test, obj_to_index(s, page_slab(page), alloc), 0U);
+ KUNIT_EXPECT_EQ(test, objs_per_slab(s, page_slab(page)), 1);

if (policy == ALLOCATE_ANY)
return alloc;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2ed5f2a0879d..f7b789e692a0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2819,7 +2819,7 @@ static inline void mod_objcg_mlstate(struct obj_cgroup *objcg,
int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s,
gfp_t gfp, bool new_page)
{
- unsigned int objects = objs_per_slab_page(s, page);
+ unsigned int objects = objs_per_slab(s, page_slab(page));
unsigned long memcg_data;
void *vec;

@@ -2881,7 +2881,7 @@ struct mem_cgroup *mem_cgroup_from_obj(void *p)
struct obj_cgroup *objcg;
unsigned int off;

- off = obj_to_index(page->slab_cache, page, p);
+ off = obj_to_index(page->slab_cache, page_slab(page), p);
objcg = page_objcgs(page)[off];
if (objcg)
return obj_cgroup_memcg(objcg);
diff --git a/mm/slab.c b/mm/slab.c
index 547ed068a569..c13258116791 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1559,7 +1559,7 @@ static void check_poison_obj(struct kmem_cache *cachep, void *objp)
struct slab *slab = virt_to_slab(objp);
unsigned int objnr;

- objnr = obj_to_index(cachep, slab_page(slab), objp);
+ objnr = obj_to_index(cachep, slab, objp);
if (objnr) {
objp = index_to_obj(cachep, slab, objnr - 1);
realobj = (char *)objp + obj_offset(cachep);
@@ -2529,7 +2529,7 @@ static void *slab_get_obj(struct kmem_cache *cachep, struct slab *slab)
static void slab_put_obj(struct kmem_cache *cachep,
struct slab *slab, void *objp)
{
- unsigned int objnr = obj_to_index(cachep, slab_page(slab), objp);
+ unsigned int objnr = obj_to_index(cachep, slab, objp);
#if DEBUG
unsigned int i;

@@ -2716,7 +2716,7 @@ static void *cache_free_debugcheck(struct kmem_cache *cachep, void *objp,
if (cachep->flags & SLAB_STORE_USER)
*dbg_userword(cachep, objp) = (void *)caller;

- objnr = obj_to_index(cachep, slab_page(slab), objp);
+ objnr = obj_to_index(cachep, slab, objp);

BUG_ON(objnr >= cachep->num);
BUG_ON(objp != index_to_obj(cachep, slab, objnr));
@@ -3662,7 +3662,7 @@ void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab)
objp = object - obj_offset(cachep);
kpp->kp_data_offset = obj_offset(cachep);
slab = virt_to_slab(objp);
- objnr = obj_to_index(cachep, slab_page(slab), objp);
+ objnr = obj_to_index(cachep, slab, objp);
objp = index_to_obj(cachep, slab, objnr);
kpp->kp_objp = objp;
if (DEBUG && cachep->flags & SLAB_STORE_USER)
@@ -4180,7 +4180,7 @@ void __check_heap_object(const void *ptr, unsigned long n,

/* Find and validate object. */
cachep = slab->slab_cache;
- objnr = obj_to_index(cachep, slab_page(slab), (void *)ptr);
+ objnr = obj_to_index(cachep, slab, (void *)ptr);
BUG_ON(objnr >= cachep->num);

/* Find offset within object. */
diff --git a/mm/slab.h b/mm/slab.h
index 039babfde2fe..bca9181e96d7 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -483,7 +483,7 @@ static inline void memcg_slab_post_alloc_hook(struct kmem_cache *s,
continue;
}

- off = obj_to_index(s, page, p[i]);
+ off = obj_to_index(s, page_slab(page), p[i]);
obj_cgroup_get(objcg);
page_objcgs(page)[off] = objcg;
mod_objcg_state(objcg, page_pgdat(page),
@@ -522,7 +522,7 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s_orig,
else
s = s_orig;

- off = obj_to_index(s, page, p[i]);
+ off = obj_to_index(s, page_slab(page), p[i]);
objcg = objcgs[off];
if (!objcg)
continue;
diff --git a/mm/slub.c b/mm/slub.c
index cc64ba9d9963..ddf21c7a381a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4342,7 +4342,7 @@ void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab)
#else
objp = objp0;
#endif
- objnr = obj_to_index(s, slab_page(slab), objp);
+ objnr = obj_to_index(s, slab, objp);
kpp->kp_data_offset = (unsigned long)((char *)objp0 - (char *)objp);
objp = base + s->size * objnr;
kpp->kp_objp = objp;
--
2.34.1

Vlastimil Babka

unread,
Jan 3, 2022, 7:10:59 PM1/3/22
to Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, Johannes Weiner, Roman Gushchin, Hyeonggon Yoo, pat...@lists.linux.dev, Vlastimil Babka, Marco Elver, Alexander Potapenko, Dmitry Vyukov, kasa...@googlegroups.com
The function sets some fields that are being moved from struct page to
struct slab so it needs to be converted.

Signed-off-by: Vlastimil Babka <vba...@suse.cz>
Tested-by: Marco Elver <el...@google.com>
Cc: Alexander Potapenko <gli...@google.com>
Cc: Marco Elver <el...@google.com>
Cc: Dmitry Vyukov <dvy...@google.com>
Cc: <kasa...@googlegroups.com>
---
mm/kfence/core.c | 12 ++++++------
mm/kfence/kfence_test.c | 6 +++---
2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 09945784df9e..4eb60cf5ff8b 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -360,7 +360,7 @@ static void *kfence_guarded_alloc(struct kmem_cache *cache, size_t size, gfp_t g
{
struct kfence_metadata *meta = NULL;
unsigned long flags;
- struct page *page;
+ struct slab *slab;
void *addr;

/* Try to obtain a free object. */
@@ -424,13 +424,13 @@ static void *kfence_guarded_alloc(struct kmem_cache *cache, size_t size, gfp_t g

alloc_covered_add(alloc_stack_hash, 1);

- /* Set required struct page fields. */
- page = virt_to_page(meta->addr);
- page->slab_cache = cache;
+ /* Set required slab fields. */
+ slab = virt_to_slab((void *)meta->addr);
+ slab->slab_cache = cache;
if (IS_ENABLED(CONFIG_SLUB))
- page->objects = 1;
+ slab->objects = 1;
if (IS_ENABLED(CONFIG_SLAB))
- page->s_mem = addr;
+ slab->s_mem = addr;

/* Memory initialization. */
for_each_canary(meta, set_canary_byte);
diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
index f7276711d7b9..a22b1af85577 100644
--- a/mm/kfence/kfence_test.c
+++ b/mm/kfence/kfence_test.c
@@ -282,7 +282,7 @@ static void *test_alloc(struct kunit *test, size_t size, gfp_t gfp, enum allocat
alloc = kmalloc(size, gfp);

if (is_kfence_address(alloc)) {
- struct page *page = virt_to_head_page(alloc);
+ struct slab *slab = virt_to_slab(alloc);
struct kmem_cache *s = test_cache ?:
kmalloc_caches[kmalloc_type(GFP_KERNEL)][__kmalloc_index(size, false)];

@@ -291,8 +291,8 @@ static void *test_alloc(struct kunit *test, size_t size, gfp_t gfp, enum allocat
* even for KFENCE objects; these are required so that
* memcg accounting works correctly.
*/
- KUNIT_EXPECT_EQ(test, obj_to_index(s, page_slab(page), alloc), 0U);
- KUNIT_EXPECT_EQ(test, objs_per_slab(s, page_slab(page)), 1);
+ KUNIT_EXPECT_EQ(test, obj_to_index(s, slab, alloc), 0U);
+ KUNIT_EXPECT_EQ(test, objs_per_slab(s, slab), 1);

if (policy == ALLOCATE_ANY)
return alloc;
--
2.34.1

Vlastimil Babka

unread,
Jan 3, 2022, 7:10:59 PM1/3/22
to Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, Johannes Weiner, Roman Gushchin, Hyeonggon Yoo, pat...@lists.linux.dev, Vlastimil Babka, Andrey Konovalov, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasa...@googlegroups.com
From: "Matthew Wilcox (Oracle)" <wi...@infradead.org>

KASAN accesses some slab related struct page fields so we need to
convert it to struct slab. Some places are a bit simplified thanks to
kasan_addr_to_slab() encapsulating the PageSlab flag check through
virt_to_slab(). When resolving object address to either a real slab or
a large kmalloc, use struct folio as the intermediate type for testing
the slab flag to avoid unnecessary implicit compound_head().

[ vba...@suse.cz: use struct folio, adjust to differences in previous
patches ]

Signed-off-by: Matthew Wilcox (Oracle) <wi...@infradead.org>
Signed-off-by: Vlastimil Babka <vba...@suse.cz>
Reviewed-by: Andrey Konovalov <andre...@gmail.com>
Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
Cc: Andrey Ryabinin <ryabin...@gmail.com>
Cc: Alexander Potapenko <gli...@google.com>
Cc: Andrey Konovalov <andre...@gmail.com>
Cc: Dmitry Vyukov <dvy...@google.com>
Cc: <kasa...@googlegroups.com>
---
include/linux/kasan.h | 9 +++++----
mm/kasan/common.c | 23 +++++++++++++----------
mm/kasan/generic.c | 8 ++++----
mm/kasan/kasan.h | 1 +
mm/kasan/quarantine.c | 2 +-
mm/kasan/report.c | 13 +++++++++++--
mm/kasan/report_tags.c | 10 +++++-----
mm/slab.c | 2 +-
mm/slub.c | 2 +-
9 files changed, 42 insertions(+), 28 deletions(-)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index d8783b682669..fb78108d694e 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -9,6 +9,7 @@

struct kmem_cache;
struct page;
+struct slab;
struct vm_struct;
struct task_struct;

@@ -193,11 +194,11 @@ static __always_inline size_t kasan_metadata_size(struct kmem_cache *cache)
return 0;
}

-void __kasan_poison_slab(struct page *page);
-static __always_inline void kasan_poison_slab(struct page *page)
+void __kasan_poison_slab(struct slab *slab);
+static __always_inline void kasan_poison_slab(struct slab *slab)
{
if (kasan_enabled())
- __kasan_poison_slab(page);
+ __kasan_poison_slab(slab);
}

void __kasan_unpoison_object_data(struct kmem_cache *cache, void *object);
@@ -322,7 +323,7 @@ static inline void kasan_cache_create(struct kmem_cache *cache,
slab_flags_t *flags) {}
static inline void kasan_cache_create_kmalloc(struct kmem_cache *cache) {}
static inline size_t kasan_metadata_size(struct kmem_cache *cache) { return 0; }
-static inline void kasan_poison_slab(struct page *page) {}
+static inline void kasan_poison_slab(struct slab *slab) {}
static inline void kasan_unpoison_object_data(struct kmem_cache *cache,
void *object) {}
static inline void kasan_poison_object_data(struct kmem_cache *cache,
diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index 6a1cd2d38bff..7c06db78a76c 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -247,8 +247,9 @@ struct kasan_free_meta *kasan_get_free_meta(struct kmem_cache *cache,
}
#endif

-void __kasan_poison_slab(struct page *page)
+void __kasan_poison_slab(struct slab *slab)
{
+ struct page *page = slab_page(slab);
unsigned long i;

for (i = 0; i < compound_nr(page); i++)
@@ -401,9 +402,9 @@ void __kasan_kfree_large(void *ptr, unsigned long ip)

void __kasan_slab_free_mempool(void *ptr, unsigned long ip)
{
- struct page *page;
+ struct folio *folio;

- page = virt_to_head_page(ptr);
+ folio = virt_to_folio(ptr);

/*
* Even though this function is only called for kmem_cache_alloc and
@@ -411,12 +412,14 @@ void __kasan_slab_free_mempool(void *ptr, unsigned long ip)
* !PageSlab() when the size provided to kmalloc is larger than
* KMALLOC_MAX_SIZE, and kmalloc falls back onto page_alloc.
*/
- if (unlikely(!PageSlab(page))) {
+ if (unlikely(!folio_test_slab(folio))) {
if (____kasan_kfree_large(ptr, ip))
return;
- kasan_poison(ptr, page_size(page), KASAN_FREE_PAGE, false);
+ kasan_poison(ptr, folio_size(folio), KASAN_FREE_PAGE, false);
} else {
- ____kasan_slab_free(page->slab_cache, ptr, ip, false, false);
+ struct slab *slab = folio_slab(folio);
+
+ ____kasan_slab_free(slab->slab_cache, ptr, ip, false, false);
}
}

@@ -560,7 +563,7 @@ void * __must_check __kasan_kmalloc_large(const void *ptr, size_t size,

void * __must_check __kasan_krealloc(const void *object, size_t size, gfp_t flags)
{
- struct page *page;
+ struct slab *slab;

if (unlikely(object == ZERO_SIZE_PTR))
return (void *)object;
@@ -572,13 +575,13 @@ void * __must_check __kasan_krealloc(const void *object, size_t size, gfp_t flag
*/
kasan_unpoison(object, size, false);

- page = virt_to_head_page(object);
+ slab = virt_to_slab(object);

/* Piggy-back on kmalloc() instrumentation to poison the redzone. */
- if (unlikely(!PageSlab(page)))
+ if (unlikely(!slab))
return __kasan_kmalloc_large(object, size, flags);
else
- return ____kasan_kmalloc(page->slab_cache, object, size, flags);
+ return ____kasan_kmalloc(slab->slab_cache, object, size, flags);
}

bool __kasan_check_byte(const void *address, unsigned long ip)
diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
index 5d0b79416c4e..a25ad4090615 100644
--- a/mm/kasan/generic.c
+++ b/mm/kasan/generic.c
@@ -330,16 +330,16 @@ DEFINE_ASAN_SET_SHADOW(f8);

static void __kasan_record_aux_stack(void *addr, bool can_alloc)
{
- struct page *page = kasan_addr_to_page(addr);
+ struct slab *slab = kasan_addr_to_slab(addr);
struct kmem_cache *cache;
struct kasan_alloc_meta *alloc_meta;
void *object;

- if (is_kfence_address(addr) || !(page && PageSlab(page)))
+ if (is_kfence_address(addr) || !slab)
return;

- cache = page->slab_cache;
- object = nearest_obj(cache, page_slab(page), addr);
+ cache = slab->slab_cache;
+ object = nearest_obj(cache, slab, addr);
alloc_meta = kasan_get_alloc_meta(cache, object);
if (!alloc_meta)
return;
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index aebd8df86a1f..c17fa8d26ffe 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -265,6 +265,7 @@ bool kasan_report(unsigned long addr, size_t size,
void kasan_report_invalid_free(void *object, unsigned long ip);

struct page *kasan_addr_to_page(const void *addr);
+struct slab *kasan_addr_to_slab(const void *addr);

depot_stack_handle_t kasan_save_stack(gfp_t flags, bool can_alloc);
void kasan_set_track(struct kasan_track *track, gfp_t flags);
diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
index d8ccff4c1275..587da8995f2d 100644
--- a/mm/kasan/quarantine.c
+++ b/mm/kasan/quarantine.c
@@ -117,7 +117,7 @@ static unsigned long quarantine_batch_size;

static struct kmem_cache *qlink_to_cache(struct qlist_node *qlink)
{
- return virt_to_head_page(qlink)->slab_cache;
+ return virt_to_slab(qlink)->slab_cache;
}

static void *qlink_to_object(struct qlist_node *qlink, struct kmem_cache *cache)
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index e00999dc6499..3ad9624dcc56 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -150,6 +150,14 @@ struct page *kasan_addr_to_page(const void *addr)
return NULL;
}

+struct slab *kasan_addr_to_slab(const void *addr)
+{
+ if ((addr >= (void *)PAGE_OFFSET) &&
+ (addr < high_memory))
+ return virt_to_slab(addr);
+ return NULL;
+}
+
static void describe_object_addr(struct kmem_cache *cache, void *object,
const void *addr)
{
@@ -248,8 +256,9 @@ static void print_address_description(void *addr, u8 tag)
pr_err("\n");

if (page && PageSlab(page)) {
- struct kmem_cache *cache = page->slab_cache;
- void *object = nearest_obj(cache, page_slab(page), addr);
+ struct slab *slab = page_slab(page);
+ struct kmem_cache *cache = slab->slab_cache;
+ void *object = nearest_obj(cache, slab, addr);

describe_object(cache, object, addr, tag);
}
diff --git a/mm/kasan/report_tags.c b/mm/kasan/report_tags.c
index 06c21dd77493..1b41de88c53e 100644
--- a/mm/kasan/report_tags.c
+++ b/mm/kasan/report_tags.c
@@ -12,7 +12,7 @@ const char *kasan_get_bug_type(struct kasan_access_info *info)
#ifdef CONFIG_KASAN_TAGS_IDENTIFY
struct kasan_alloc_meta *alloc_meta;
struct kmem_cache *cache;
- struct page *page;
+ struct slab *slab;
const void *addr;
void *object;
u8 tag;
@@ -20,10 +20,10 @@ const char *kasan_get_bug_type(struct kasan_access_info *info)

tag = get_tag(info->access_addr);
addr = kasan_reset_tag(info->access_addr);
- page = kasan_addr_to_page(addr);
- if (page && PageSlab(page)) {
- cache = page->slab_cache;
- object = nearest_obj(cache, page_slab(page), (void *)addr);
+ slab = kasan_addr_to_slab(addr);
+ if (slab) {
+ cache = slab->slab_cache;
+ object = nearest_obj(cache, slab, (void *)addr);
alloc_meta = kasan_get_alloc_meta(cache, object);

if (alloc_meta) {
diff --git a/mm/slab.c b/mm/slab.c
index c13258116791..ddf5737c63d9 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2604,7 +2604,7 @@ static struct slab *cache_grow_begin(struct kmem_cache *cachep,
* page_address() in the latter returns a non-tagged pointer,
* as it should be for slab pages.
*/
- kasan_poison_slab(slab_page(slab));
+ kasan_poison_slab(slab);

/* Get slab management. */
freelist = alloc_slabmgmt(cachep, slab, offset,
diff --git a/mm/slub.c b/mm/slub.c
index ddf21c7a381a..d08ba1025aae 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1961,7 +1961,7 @@ static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)

slab->slab_cache = s;

- kasan_poison_slab(slab_page(slab));
+ kasan_poison_slab(slab);

start = slab_address(slab);

--
2.34.1

Vlastimil Babka

unread,
Jan 3, 2022, 7:11:00 PM1/3/22
to Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, Johannes Weiner, Roman Gushchin, Hyeonggon Yoo, pat...@lists.linux.dev, Vlastimil Babka, Marco Elver, Alexander Potapenko, Dmitry Vyukov, kasa...@googlegroups.com
With a struct slab definition separate from struct page, we can go
further and define only fields that the chosen sl*b implementation uses.
This means everything between __page_flags and __page_refcount
placeholders now depends on the chosen CONFIG_SL*B. Some fields exist in
all implementations (slab_list) but can be part of a union in some, so
it's simpler to repeat them than complicate the definition with ifdefs
even more.

The patch doesn't change physical offsets of the fields, although it
could be done later - for example it's now clear that tighter packing in
SLOB could be possible.

This should also prevent accidental use of fields that don't exist in
given implementation. Before this patch virt_to_cache() and
cache_from_obj() were visible for SLOB (albeit not used), although they
rely on the slab_cache field that isn't set by SLOB. With this patch
it's now a compile error, so these functions are now hidden behind
an #ifndef CONFIG_SLOB.

Signed-off-by: Vlastimil Babka <vba...@suse.cz>
Tested-by: Marco Elver <el...@google.com> # kfence
Reviewed-by: Hyeonggon Yoo <42.h...@gmail.com>
Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
Cc: Alexander Potapenko <gli...@google.com>
Cc: Marco Elver <el...@google.com>
Cc: Dmitry Vyukov <dvy...@google.com>
Cc: <kasa...@googlegroups.com>
---
mm/kfence/core.c | 9 +++++----
mm/slab.h | 48 ++++++++++++++++++++++++++++++++++++++----------
2 files changed, 43 insertions(+), 14 deletions(-)

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 4eb60cf5ff8b..267dfde43b91 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -427,10 +427,11 @@ static void *kfence_guarded_alloc(struct kmem_cache *cache, size_t size, gfp_t g
/* Set required slab fields. */
slab = virt_to_slab((void *)meta->addr);
slab->slab_cache = cache;
- if (IS_ENABLED(CONFIG_SLUB))
- slab->objects = 1;
- if (IS_ENABLED(CONFIG_SLAB))
- slab->s_mem = addr;
+#if defined(CONFIG_SLUB)
+ slab->objects = 1;
+#elif defined(CONFIG_SLAB)
+ slab->s_mem = addr;
+#endif

/* Memory initialization. */
for_each_canary(meta, set_canary_byte);
diff --git a/mm/slab.h b/mm/slab.h
index 36e0022d8267..b8da249f44f9 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -8,9 +8,24 @@
/* Reuses the bits in struct page */
struct slab {
unsigned long __page_flags;
+
+#if defined(CONFIG_SLAB)
+
union {
struct list_head slab_list;
- struct { /* Partial pages */
+ struct rcu_head rcu_head;
+ };
+ struct kmem_cache *slab_cache;
+ void *freelist; /* array of free object indexes */
+ void *s_mem; /* first object */
+ unsigned int active;
+
+#elif defined(CONFIG_SLUB)
+
+ union {
+ struct list_head slab_list;
+ struct rcu_head rcu_head;
+ struct {
struct slab *next;
#ifdef CONFIG_64BIT
int slabs; /* Nr of slabs left */
@@ -18,25 +33,32 @@ struct slab {
short int slabs;
#endif
};
- struct rcu_head rcu_head;
};
- struct kmem_cache *slab_cache; /* not slob */
+ struct kmem_cache *slab_cache;
/* Double-word boundary */
void *freelist; /* first free object */
union {
- void *s_mem; /* slab: first object */
- unsigned long counters; /* SLUB */
- struct { /* SLUB */
+ unsigned long counters;
+ struct {
unsigned inuse:16;
unsigned objects:15;
unsigned frozen:1;
};
};
+ unsigned int __unused;
+
+#elif defined(CONFIG_SLOB)
+
+ struct list_head slab_list;
+ void *__unused_1;
+ void *freelist; /* first free block */
+ void *__unused_2;
+ int units;
+
+#else
+#error "Unexpected slab allocator configured"
+#endif

- union {
- unsigned int active; /* SLAB */
- int units; /* SLOB */
- };
atomic_t __page_refcount;
#ifdef CONFIG_MEMCG
unsigned long memcg_data;
@@ -48,10 +70,14 @@ struct slab {
SLAB_MATCH(flags, __page_flags);
SLAB_MATCH(compound_head, slab_list); /* Ensure bit 0 is clear */
SLAB_MATCH(slab_list, slab_list);
+#ifndef CONFIG_SLOB
SLAB_MATCH(rcu_head, rcu_head);
SLAB_MATCH(slab_cache, slab_cache);
+#endif
+#ifdef CONFIG_SLAB
SLAB_MATCH(s_mem, s_mem);
SLAB_MATCH(active, active);
+#endif
SLAB_MATCH(_refcount, __page_refcount);
#ifdef CONFIG_MEMCG
SLAB_MATCH(memcg_data, memcg_data);
@@ -602,6 +628,7 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s,
}
#endif /* CONFIG_MEMCG_KMEM */

+#ifndef CONFIG_SLOB
static inline struct kmem_cache *virt_to_cache(const void *obj)
{
struct slab *slab;
@@ -648,6 +675,7 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
print_tracking(cachep, x);
return cachep;
}
+#endif /* CONFIG_SLOB */

static inline size_t slab_ksize(const struct kmem_cache *s)
{
--
2.34.1

Roman Gushchin

unread,
Jan 4, 2022, 9:13:11 PM1/4/22
to Vlastimil Babka, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, Johannes Weiner, Hyeonggon Yoo, pat...@lists.linux.dev, Andrey Konovalov, Julia Lawall, Luis Chamberlain, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Marco Elver, Michal Hocko, Vladimir Davydov, kasa...@googlegroups.com, cgr...@vger.kernel.org
Nice! It indeed looks better.
s/tab/space
Reviewed-by: Roman Gushchin <gu...@fb.com>

Thanks!

Vlastimil Babka

unread,
Jan 5, 2022, 11:39:11 AM1/5/22
to Roman Gushchin, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, Johannes Weiner, Hyeonggon Yoo, pat...@lists.linux.dev, Andrey Konovalov, Julia Lawall, Luis Chamberlain, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Marco Elver, Michal Hocko, Vladimir Davydov, kasa...@googlegroups.com, cgr...@vger.kernel.org
On 1/5/22 03:12, Roman Gushchin wrote:
> On Tue, Jan 04, 2022 at 01:10:36AM +0100, Vlastimil Babka wrote:
>> --- a/mm/kasan/report.c
>> +++ b/mm/kasan/report.c
>> @@ -249,7 +249,7 @@ static void print_address_description(void *addr, u8 tag)
>>
>> if (page && PageSlab(page)) {
>> struct kmem_cache *cache = page->slab_cache;
>> - void *object = nearest_obj(cache, page, addr);
>> + void *object = nearest_obj(cache, page_slab(page), addr);
> s/tab/space

Yeah, it was pointed out earlier that the tab was already there, but only this
change made it stand out. Fixing that up here would go against the automated
spatch conversion, so it's done in a later manual patch that also touches this
line.

>> 2.34.1
>>
>
> Reviewed-by: Roman Gushchin <gu...@fb.com>
>
> Thanks!

Thanks!

Roman Gushchin

unread,
Jan 5, 2022, 11:06:22 PM1/5/22
to Vlastimil Babka, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, Johannes Weiner, Hyeonggon Yoo, pat...@lists.linux.dev, Andrey Konovalov, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasa...@googlegroups.com
On Tue, Jan 04, 2022 at 01:10:39AM +0100, Vlastimil Babka wrote:
> From: "Matthew Wilcox (Oracle)" <wi...@infradead.org>
>
> KASAN accesses some slab related struct page fields so we need to
> convert it to struct slab. Some places are a bit simplified thanks to
> kasan_addr_to_slab() encapsulating the PageSlab flag check through
> virt_to_slab(). When resolving object address to either a real slab or
> a large kmalloc, use struct folio as the intermediate type for testing
> the slab flag to avoid unnecessary implicit compound_head().
>
> [ vba...@suse.cz: use struct folio, adjust to differences in previous
> patches ]
>
> Signed-off-by: Matthew Wilcox (Oracle) <wi...@infradead.org>
> Signed-off-by: Vlastimil Babka <vba...@suse.cz>
> Reviewed-by: Andrey Konovalov <andre...@gmail.com>
> Tested-by: Hyeonggon Yoo <42.h...@gmail.com>
> Cc: Andrey Ryabinin <ryabin...@gmail.com>
> Cc: Alexander Potapenko <gli...@google.com>
> Cc: Andrey Konovalov <andre...@gmail.com>
> Cc: Dmitry Vyukov <dvy...@google.com>
> Cc: <kasa...@googlegroups.com>

Roman Gushchin

unread,
Jan 5, 2022, 11:12:13 PM1/5/22
to Vlastimil Babka, Matthew Wilcox, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, linu...@kvack.org, Andrew Morton, Johannes Weiner, Hyeonggon Yoo, pat...@lists.linux.dev, Marco Elver, Alexander Potapenko, Dmitry Vyukov, kasa...@googlegroups.com
Nice!

Reviewed-by: Roman Gushchin <gu...@fb.com>