[Kern Meetup Blr] [CFP] : Talk Proposal - Optimizing Page Clearing in the Linux Kernel's init_on_alloc Path


hrushikesh salunke

Mar 27, 2026, 2:15:14 AM
to kernel-meet...@googlegroups.com

Preferred Format: Lightning Talk (10+5 minutes)

 

Abstract:

The Linux kernel must zero freshly allocated pages in several cases: for security, when servicing userspace page faults, and for HugeTLB. Traditionally, this was done one 4KB page at a time: map the page, clear 4KB, unmap, repeat. For a 2MB HugeTLB allocation, that is 512 separate iterations, each with its own per-page overhead. Recent upstream work introduced batched page clearing via a clear_pages() primitive that clears an entire contiguous range in a single operation, letting the CPU's memory subsystem work at full bandwidth. On x86, this means one long REP STOSB instead of 512 short ones.
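To make the contrast concrete, here is a minimal userspace sketch of the two clearing styles. It is only a model: memset() stands in for the kernel's clearing primitive, the map/unmap steps are omitted, and the names clear_page_by_page(), clear_range_batched(), all_zero() and PAGE_SIZE_BYTES are hypothetical, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE_BYTES 4096u /* assumption: 4KB base pages */

/*
 * Page-by-page style: one clearing call per 4KB page.
 * Returns the number of clearing calls issued.
 */
static size_t clear_page_by_page(unsigned char *base, size_t npages)
{
    size_t calls = 0;

    for (size_t i = 0; i < npages; i++) {
        memset(base + i * PAGE_SIZE_BYTES, 0, PAGE_SIZE_BYTES);
        calls++;
    }
    return calls;
}

/*
 * Batched style: a single clearing call over the whole contiguous range.
 * Returns the number of clearing calls issued (always 1).
 */
static size_t clear_range_batched(unsigned char *base, size_t npages)
{
    assert(npages > 0);
    memset(base, 0, npages * PAGE_SIZE_BYTES);
    return 1;
}

/* Helper for checking results: returns 1 if every byte is zero. */
static int all_zero(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0)
            return 0;
    }
    return 1;
}
```

For a 2MB region (512 pages of 4KB), the first function issues 512 clearing calls and the second issues one, which mirrors the "512 short operations versus one long one" point above. The sketch says nothing about real-world timing.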

However, this optimization covered only the page-fault path (folio_zero_user). When init_on_alloc is enabled, as it is by default on many distros for security hardening, all pages are zeroed at allocation time through kernel_init_pages(), which still used the old page-by-page loop. We instrumented this path, found that 84–99% of the pages cleared there belong to higher-order allocations (32KB SLUB allocations, 2MB HugeTLB), and applied the same batching. The result: 2.68x faster HugeTLB allocation and up to a 51% reduction in kernel time on AMD EPYC.
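The "higher-order" share mentioned above can be pictured with a small accounting sketch. It assumes 4KB base pages and treats any allocation of order greater than 0 as higher-order, so a 32KB allocation is order 3 and a 2MB allocation is order 9. The struct, the function names and MAX_TRACKED_ORDER are hypothetical and purely illustrative; they are not the vmstat counters or the instrumentation used in the talk.

```c
#include <assert.h>
#include <stddef.h>

#define BASE_PAGE_BYTES 4096u    /* assumption: 4KB base pages */
#define MAX_TRACKED_ORDER 10u    /* hypothetical bound for this sketch */

/* Hypothetical per-order page counters (not real vmstat items). */
struct clear_stats {
    unsigned long pages_by_order[MAX_TRACKED_ORDER + 1];
};

/* An order-N allocation spans 2^N contiguous base pages. */
static size_t pages_in_order(unsigned int order)
{
    return (size_t)1 << order;
}

static size_t bytes_in_order(unsigned int order)
{
    return pages_in_order(order) * BASE_PAGE_BYTES;
}

/* Account for one allocation of the given order being cleared. */
static void record_clear(struct clear_stats *s, unsigned int order)
{
    assert(order <= MAX_TRACKED_ORDER);
    s->pages_by_order[order] += pages_in_order(order);
}

/* Integer percentage of cleared pages that came from order > 0 allocations. */
static unsigned int higher_order_percent(const struct clear_stats *s)
{
    unsigned long total = 0, higher = 0;

    for (unsigned int o = 0; o <= MAX_TRACKED_ORDER; o++) {
        total += s->pages_by_order[o];
        if (o > 0)
            higher += s->pages_by_order[o];
    }
    return total ? (unsigned int)(higher * 100 / total) : 0;
}
```

Counting pages rather than allocations is a deliberate choice in this sketch: a single order-9 allocation contributes 512 pages, so a handful of large allocations can dominate the total amount of memory cleared even when small allocations are far more numerous.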

 

Outline:

  1. Background: how page clearing works today
  2. The gap: init_on_alloc bypasses the new batched path entirely and clears everything page by page; implications for hugepage allocations
  3. Profiling results for kernel_init_pages(): vmstat counter values
  4. Optimizing kernel_init_pages()
  5. Results
  6. Q&A (5 min)

 

Speaker Bio:

I am a Linux kernel developer at AMD working on performance optimization, with a focus on memory management. Previously, I worked at Texas Instruments. I hold an MTech degree in computer science from IISc Bangalore.

 

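Links:

* Batched page clearing upstream patch series (prerequisite infrastructure): https://lore.kernel.org/lkml/20260107072009.1615...@oracle.com/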

 

Regards,
Hrushikesh.

AKHILESH PATIL

Mar 27, 2026, 2:34:20 AM
to Kernel Meetup Bangalore
+1

Srinivasulu Thanneeru

Mar 27, 2026, 2:48:14 AM
to hrushikesh salunke, kernel-meet...@googlegroups.com
+1

Regards,
Srinivas Th



Shivraj Jamgade

Mar 28, 2026, 12:51:22 AM
to hrushikesh salunke, kernel-meet...@googlegroups.com
+1

Thanks,
Shivraj 


I Viswanath

Mar 28, 2026, 2:39:17 PM
to Shivraj Jamgade, hrushikesh salunke, Kernel Meetup Bangalore

Wyes Karny

Mar 29, 2026, 1:22:21 AM
to I Viswanath, Shivraj Jamgade, hrushikesh salunke, Kernel Meetup Bangalore

Gautham

Mar 31, 2026, 5:02:33 AM
to Wyes Karny, I Viswanath, Shivraj Jamgade, hrushikesh salunke, Kernel Meetup Bangalore

Sandipan Das

Mar 31, 2026, 5:54:10 AM
to hrushikesh salunke, kernel-meet...@googlegroups.com
+1

hrushikesh salunke wrote:
> Links:
>
> * Batched page clearing upstream patch series (prerequisite infrastructure): https://lore.kernel.org/lkml/20260107072009.1615...@oracle.com/
>
>  
>
> Regards,
> Hrushikesh.
>