[PATCH v2 1/1] pgo: Fix allocate_node() v2

6 views
Skip to first unread message

Jarmo Tiitto

unread,
Jun 3, 2021, 9:41:11 AM6/3/21
to Sami Tolvanen, Bill Wendling, Kees Cook, Nathan Chancellor, Nick Desaulniers, clang-bu...@googlegroups.com, linux-...@vger.kernel.org, Jarmo Tiitto, mo...@google.com
Based on Kees and others feedback here is v2 patch
that clarifies why the current checks in allocate_node()
are flawed. I did fair amount of KGDB time on it.

When clang instrumentation eventually calls allocate_node()
the struct llvm_prf_data *p argument tells us from what section
we should reserve the vnode: It either points into vmlinux's
core __llvm_prf_data section or some loaded module's
__llvm_prf_data section.

But since we don't have access to corresponding
__llvm_prf_vnds section(s) for any module, the function
should return just NULL and ignore any profiling attempts
from modules for now.

Signed-off-by: Jarmo Tiitto <jarmo....@gmail.com>
---
kernel/pgo/instrument.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/kernel/pgo/instrument.c b/kernel/pgo/instrument.c
index 0e07ee1b17d9..afe9982b07a3 100644
--- a/kernel/pgo/instrument.c
+++ b/kernel/pgo/instrument.c
@@ -23,6 +23,7 @@
#include <linux/export.h>
#include <linux/spinlock.h>
#include <linux/types.h>
+#include <asm-generic/sections.h>
#include "pgo.h"

/*
@@ -55,17 +56,19 @@ void prf_unlock(unsigned long flags)
static struct llvm_prf_value_node *allocate_node(struct llvm_prf_data *p,
u32 index, u64 value)
{
- if (&__llvm_prf_vnds_start[current_node + 1] >= __llvm_prf_vnds_end)
- return NULL; /* Out of nodes */
-
- current_node++;
-
- /* Make sure the node is entirely within the section */
- if (&__llvm_prf_vnds_start[current_node] >= __llvm_prf_vnds_end ||
- &__llvm_prf_vnds_start[current_node + 1] > __llvm_prf_vnds_end)
+ const int max_vnds = prf_vnds_count();
+ /* Check that p is within vmlinux __llvm_prf_data section.
+ * If not, don't allocate since we can't handle modules yet.
+ */
+ if (!memory_contains(__llvm_prf_data_start,
+ __llvm_prf_data_end, p, sizeof(*p)))
return NULL;

- return &__llvm_prf_vnds_start[current_node];
+ if (WARN_ON_ONCE(current_node >= max_vnds))
+ return NULL; /* Out of nodes */
+
+ /* reserve vnode for vmlinux */
+ return &__llvm_prf_vnds_start[current_node++];
}

/*

base-commit: 5d0cda65918279ada060417c5fecb7e86ccb3def
--
2.31.1

Nick Desaulniers

unread,
Jun 3, 2021, 4:50:30 PM6/3/21
to Jarmo Tiitto, Sami Tolvanen, Bill Wendling, Kees Cook, Nathan Chancellor, clang-built-linux, LKML, Bill Wendling
On Thu, Jun 3, 2021 at 6:41 AM Jarmo Tiitto <jarmo....@gmail.com> wrote:
>
> Based on Kees and others feedback here is v2 patch
> that clarifies why the current checks in allocate_node()
> are flawed. I did fair amount of KGDB time on it.

Kees can probably cut it when merging, but the above paragraph might
be better "below the fold" below next time (doesn't necessitate a v3).

>
> When clang instrumentation eventually calls allocate_node()
> the struct llvm_prf_data *p argument tells us from what section
> we should reserve the vnode: It either points into vmlinux's
> core __llvm_prf_data section or some loaded module's
> __llvm_prf_data section.
>
> But since we don't have access to corresponding
> __llvm_prf_vnds section(s) for any module, the function
> should return just NULL and ignore any profiling attempts
> from modules for now.
>
> Signed-off-by: Jarmo Tiitto <jarmo....@gmail.com>
> ---

^ ie. here. If you put text between the `---` and the diffstat, git
just discards it when applying. It's a good way to hide commentary
just meant for reviewers when sending a patch.
Sorry, where was prf_vnds_count() defined? I don't see it in
linux-next, but I'm also not sure which tree has
5d0cda65918279ada060417c5fecb7e86ccb3def.

> + /* Check that p is within vmlinux __llvm_prf_data section.
> + * If not, don't allocate since we can't handle modules yet.
> + */
> + if (!memory_contains(__llvm_prf_data_start,
> + __llvm_prf_data_end, p, sizeof(*p)))
> return NULL;
>
> - return &__llvm_prf_vnds_start[current_node];
> + if (WARN_ON_ONCE(current_node >= max_vnds))
> + return NULL; /* Out of nodes */
> +
> + /* reserve vnode for vmlinux */
> + return &__llvm_prf_vnds_start[current_node++];
> }
>
> /*
>
> base-commit: 5d0cda65918279ada060417c5fecb7e86ccb3def
> --
> 2.31.1
>


--
Thanks,
~Nick Desaulniers

Nathan Chancellor

unread,
Jun 3, 2021, 4:52:45 PM6/3/21
to Nick Desaulniers, Jarmo Tiitto, Sami Tolvanen, Bill Wendling, Kees Cook, clang-built-linux, LKML, Bill Wendling
It is generated via the __DEFINE_PRF_SIZE macro in kernel/pgo/pgo.h.

Nick Desaulniers

unread,
Jun 3, 2021, 5:00:50 PM6/3/21
to Nathan Chancellor, Jarmo Tiitto, Sami Tolvanen, Bill Wendling, Kees Cook, clang-built-linux, LKML, Bill Wendling
Thanks, it kills me when I can't find something with grep. I wonder if
language servers more modern than ctags have figured this out yet.

Patch looks fine to me then.
Reviewed-by: Nick Desaulniers <ndesau...@google.com>

>
> >> + /* Check that p is within vmlinux __llvm_prf_data section.
> >> + * If not, don't allocate since we can't handle modules yet.
> >> + */
> >> + if (!memory_contains(__llvm_prf_data_start,
> >> + __llvm_prf_data_end, p, sizeof(*p)))
> >> return NULL;
> >>
> >> - return &__llvm_prf_vnds_start[current_node];
> >> + if (WARN_ON_ONCE(current_node >= max_vnds))
> >> + return NULL; /* Out of nodes */
> >> +
> >> + /* reserve vnode for vmlinux */
> >> + return &__llvm_prf_vnds_start[current_node++];
> >> }
> >>
> >> /*
> >>
> >> base-commit: 5d0cda65918279ada060417c5fecb7e86ccb3def
> >> --
> >> 2.31.1
> >>
> >
> >
> > --
> > Thanks,
> > ~Nick Desaulniers
> >
>
> --
> You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-li...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/f06200fd-4ce5-e976-20ec-d2ea9d734b3c%40kernel.org.



--
Thanks,
~Nick Desaulniers

Nathan Chancellor

unread,
Jun 3, 2021, 5:14:26 PM6/3/21
to Jarmo Tiitto, Sami Tolvanen, Bill Wendling, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, linux-...@vger.kernel.org, mo...@google.com
On 6/3/2021 6:38 AM, Jarmo Tiitto wrote:
> Based on Kees and others feedback here is v2 patch
> that clarifies why the current checks in allocate_node()
> are flawed. I did fair amount of KGDB time on it.
>
> When clang instrumentation eventually calls allocate_node()
> the struct llvm_prf_data *p argument tells us from what section
> we should reserve the vnode: It either points into vmlinux's
> core __llvm_prf_data section or some loaded module's
> __llvm_prf_data section.
>
> But since we don't have access to corresponding
> __llvm_prf_vnds section(s) for any module, the function
> should return just NULL and ignore any profiling attempts
> from modules for now.
>
> Signed-off-by: Jarmo Tiitto <jarmo....@gmail.com>

I agree with Nick on the comments about the commit message. A few more
small nits below, not sure they necessitate a v3, up to you. Thank you
for the patch!

Reviewed-by: Nathan Chancellor <nat...@kernel.org>

> ---
> kernel/pgo/instrument.c | 21 ++++++++++++---------
> 1 file changed, 12 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/pgo/instrument.c b/kernel/pgo/instrument.c
> index 0e07ee1b17d9..afe9982b07a3 100644
> --- a/kernel/pgo/instrument.c
> +++ b/kernel/pgo/instrument.c
> @@ -23,6 +23,7 @@
> #include <linux/export.h>
> #include <linux/spinlock.h>
> #include <linux/types.h>
> +#include <asm-generic/sections.h>

Not sure that it actually matters but I think this should be

#include <asm/sections.h>

instead. Might be nice to keep this sorted by moving it to the top as well.

> #include "pgo.h"
>
> /*
> @@ -55,17 +56,19 @@ void prf_unlock(unsigned long flags)
> static struct llvm_prf_value_node *allocate_node(struct llvm_prf_data *p,
> u32 index, u64 value)
> {
> - if (&__llvm_prf_vnds_start[current_node + 1] >= __llvm_prf_vnds_end)
> - return NULL; /* Out of nodes */
> -
> - current_node++;
> -
> - /* Make sure the node is entirely within the section */
> - if (&__llvm_prf_vnds_start[current_node] >= __llvm_prf_vnds_end ||
> - &__llvm_prf_vnds_start[current_node + 1] > __llvm_prf_vnds_end)
> + const int max_vnds = prf_vnds_count();

A blank line between this variable and the comment below would look nice.

> + /* Check that p is within vmlinux __llvm_prf_data section. > + * If not, don't allocate since we can't handle modules yet.
> + */

For every subsystem except for netdev, there should be a blank line at
the beginning of a comment. In other works:

/*
* Check that p is within vmlinux __llvm_prf_data section.
* If not, don't allocate since we can't handle modules yet.

Kees Cook

unread,
Jun 3, 2021, 5:36:42 PM6/3/21
to Nathan Chancellor, Jarmo Tiitto, Sami Tolvanen, Bill Wendling, Nick Desaulniers, clang-bu...@googlegroups.com, linux-...@vger.kernel.org, mo...@google.com
On Thu, Jun 03, 2021 at 02:14:24PM -0700, Nathan Chancellor wrote:
> On 6/3/2021 6:38 AM, Jarmo Tiitto wrote:
> > Based on Kees and others feedback here is v2 patch
> > that clarifies why the current checks in allocate_node()
> > are flawed. I did fair amount of KGDB time on it.
> >
> > When clang instrumentation eventually calls allocate_node()
> > the struct llvm_prf_data *p argument tells us from what section
> > we should reserve the vnode: It either points into vmlinux's
> > core __llvm_prf_data section or some loaded module's
> > __llvm_prf_data section.
> >
> > But since we don't have access to corresponding
> > __llvm_prf_vnds section(s) for any module, the function
> > should return just NULL and ignore any profiling attempts
> > from modules for now.
> >
> > Signed-off-by: Jarmo Tiitto <jarmo....@gmail.com>
>
> I agree with Nick on the comments about the commit message. A few more small
> nits below, not sure they necessitate a v3, up to you. Thank you for the
> patch!

It would make my life easier to get a v3. :) I agree with all of
Nathan's suggestions. :)

Thanks!

-Kees

--
Kees Cook

Jarmo Tiitto

unread,
Jun 4, 2021, 2:50:09 AM6/4/21
to Kees Cook, Nathan Chancellor, Sami Tolvanen, Bill Wendling, Nick Desaulniers, clang-built-linux, LKML, Bill Wendling
Hello,

Ok, I'll make the requested changes and post v3 patch.
Btw. These patches were  based on kees/clang/pgo/features.

Thanks for patience.

Jarmo Tiitto

unread,
Jun 4, 2021, 5:40:30 AM6/4/21
to Nathan Chancellor, Kees Cook, Jarmo Tiitto, Sami Tolvanen, Bill Wendling, Nick Desaulniers, clang-bu...@googlegroups.com, linux-...@vger.kernel.org, mo...@google.com
Hello,

Ok, I'll make the requested changes, noted by Nathan and post v3 patch soon. :-)
Btw. These patches were based on kees/for-next/clang/features branch.

Thanks for patience.
-Jarmo



Jarmo Tiitto

unread,
Jun 4, 2021, 1:03:23 PM6/4/21
to Sami Tolvanen, Bill Wendling, Kees Cook, Nathan Chancellor, Nick Desaulniers, clang-bu...@googlegroups.com, linux-...@vger.kernel.org, Jarmo Tiitto, mo...@google.com
When clang instrumentation eventually calls allocate_node()
the struct llvm_prf_data *p argument tells us from what section
we should reserve the vnode: It either points into vmlinux's
core __llvm_prf_data section or some loaded module's
__llvm_prf_data section.

But since we don't have access to corresponding
__llvm_prf_vnds section(s) for any module, the function
should return just NULL and ignore any profiling attempts
from modules for now.

Signed-off-by: Jarmo Tiitto <jarmo....@gmail.com>
---
Based on Kees and others feedback here is v3 patch
that clarifies why the current checks in allocate_node()
are flawed. I did fair amount of KGDB time on it.

The commit is based on kees/for-next/clang/features tree,
hopefully this is ok. Should I have based it on linux-next
instead?

I grep -R'd where the memory_contains() can be found and it is only
found in #include <asm-generic/sections.h>

I cross my fingers and await if this is my first accepted patch. :-)
---
kernel/pgo/instrument.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/kernel/pgo/instrument.c b/kernel/pgo/instrument.c
index 0e07ee1b17d9..c69b4f7ebaad 100644
--- a/kernel/pgo/instrument.c
+++ b/kernel/pgo/instrument.c
@@ -18,6 +18,7 @@

#define pr_fmt(fmt) "pgo: " fmt

+#include <asm-generic/sections.h>
#include <linux/bitops.h>
#include <linux/kernel.h>
#include <linux/export.h>
@@ -55,17 +56,21 @@ void prf_unlock(unsigned long flags)
static struct llvm_prf_value_node *allocate_node(struct llvm_prf_data *p,
u32 index, u64 value)
{
- if (&__llvm_prf_vnds_start[current_node + 1] >= __llvm_prf_vnds_end)
- return NULL; /* Out of nodes */
-
- current_node++;
-
- /* Make sure the node is entirely within the section */
- if (&__llvm_prf_vnds_start[current_node] >= __llvm_prf_vnds_end ||
- &__llvm_prf_vnds_start[current_node + 1] > __llvm_prf_vnds_end)
+ const int max_vnds = prf_vnds_count();
+
+ /*
+ * Check that p is within vmlinux __llvm_prf_data section.
+ * If not, don't allocate since we can't handle modules yet.
+ */
+ if (!memory_contains(__llvm_prf_data_start,
+ __llvm_prf_data_end, p, sizeof(*p)))
return NULL;

- return &__llvm_prf_vnds_start[current_node];
+ if (WARN_ON_ONCE(current_node >= max_vnds))
+ return NULL; /* Out of nodes */
+
+ /* reserve vnode for vmlinux */
+ return &__llvm_prf_vnds_start[current_node++];
}

/*

base-commit: 5d0cda65918279ada060417c5fecb7e86ccb3def
--
2.31.1

Nathan Chancellor

unread,
Jun 4, 2021, 1:04:43 PM6/4/21
to Jarmo Tiitto, Sami Tolvanen, Bill Wendling, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, linux-...@vger.kernel.org, mo...@google.com
On 6/4/2021 9:58 AM, Jarmo Tiitto wrote:
> When clang instrumentation eventually calls allocate_node()
> the struct llvm_prf_data *p argument tells us from what section
> we should reserve the vnode: It either points into vmlinux's
> core __llvm_prf_data section or some loaded module's
> __llvm_prf_data section.
>
> But since we don't have access to corresponding
> __llvm_prf_vnds section(s) for any module, the function
> should return just NULL and ignore any profiling attempts
> from modules for now.
>
> Signed-off-by: Jarmo Tiitto <jarmo....@gmail.com>

Reviewed-by: Nathan Chancellor <nat...@kernel.org>

Nick Desaulniers

unread,
Jun 4, 2021, 1:28:58 PM6/4/21
to Jarmo Tiitto, Sami Tolvanen, Bill Wendling, Kees Cook, Nathan Chancellor, clang-built-linux, LKML, Bill Wendling
^ What about Nathan's feedback on this being just `<asm/sections.h>`?
--
Thanks,
~Nick Desaulniers

Kees Cook

unread,
Jun 4, 2021, 1:59:54 PM6/4/21
to Bill Wendling, Nathan Chancellor, Jarmo Tiitto, linux-...@vger.kernel.org, clang-bu...@googlegroups.com, Nick Desaulniers, Sami Tolvanen, Kees Cook, mo...@google.com
On Fri, 4 Jun 2021 19:58:20 +0300, Jarmo Tiitto wrote:
> When clang instrumentation eventually calls allocate_node()
> the struct llvm_prf_data *p argument tells us from what section
> we should reserve the vnode: It either points into vmlinux's
> core __llvm_prf_data section or some loaded module's
> __llvm_prf_data section.
>
> But since we don't have access to corresponding
> __llvm_prf_vnds section(s) for any module, the function
> should return just NULL and ignore any profiling attempts
> from modules for now.

Thanks for working on this! I tweaked the commit title, reflowed the
commit log to 80 columns, and adjusted asm-generic to asm.

Applied to for-next/clang/features, thanks!

[1/1] pgo: Limit allocate_node() to vmlinux sections
https://git.kernel.org/kees/c/46773f32ddf1

--
Kees Cook

Kees Cook

unread,
Jun 4, 2021, 2:06:40 PM6/4/21
to Jarmo Tiitto, Sami Tolvanen, Bill Wendling, Nathan Chancellor, Nick Desaulniers, clang-bu...@googlegroups.com, linux-...@vger.kernel.org, mo...@google.com
On Fri, Jun 04, 2021 at 07:58:20PM +0300, Jarmo Tiitto wrote:
> When clang instrumentation eventually calls allocate_node()
> the struct llvm_prf_data *p argument tells us from what section
> we should reserve the vnode: It either points into vmlinux's
> core __llvm_prf_data section or some loaded module's
> __llvm_prf_data section.
>
> But since we don't have access to corresponding
> __llvm_prf_vnds section(s) for any module, the function
> should return just NULL and ignore any profiling attempts
> from modules for now.
>
> Signed-off-by: Jarmo Tiitto <jarmo....@gmail.com>
> ---
> Based on Kees and others feedback here is v3 patch
> that clarifies why the current checks in allocate_node()
> are flawed. I did fair amount of KGDB time on it.
>
> The commit is based on kees/for-next/clang/features tree,
> hopefully this is ok. Should I have based it on linux-next
> instead?
>
> I grep -R'd where the memory_contains() can be found and it is only
> found in #include <asm-generic/sections.h>

That's true, but the way to use "asm-generic" is to include the
top-level "asm" file, so that architectures can override things as
needed.

> I cross my fingers and await if this is my first accepted patch. :-)

I tweaked it a bit and applied it (see the separate email).

Thank you!

-Kees

--
Kees Cook

Jarmo Tiitto

unread,
Jun 5, 2021, 1:15:32 PM6/5/21
to Jarmo Tiitto, Kees Cook, Sami Tolvanen, Bill Wendling, Nathan Chancellor, Nick Desaulniers, clang-bu...@googlegroups.com, linux-...@vger.kernel.org, mo...@google.com
> Kees Cook wrote perjantaina 4. kesäkuuta 2021 21.06.37 EEST:
> >
> > I grep -R'd where the memory_contains() can be found and it is only
> > found in #include <asm-generic/sections.h>
>
> That's true, but the way to use "asm-generic" is to include the
> top-level "asm" file, so that architectures can override things as
> needed.
>
Thanks, I didn't know that.

> > I cross my fingers and await if this is my first accepted patch. :-)
>
> I tweaked it a bit and applied it (see the separate email).
>
> Thank you!
>
> -Kees
>
> --
> Kees Cook
>

Whoa!
Thanks, I'm glad it worked out. :-)

Btw. I have almost forgotten that I once wrote code
(that I didn't send) for the GCC gcov subsystem that also enabled
-fprofile-generate/use for the kernel.
However the Clang PGO looks much more approachable and
easier to hack on since the profile data format is simpler.

So starting to work on this felt just natural to me. :-)

-Jarmo



Fangrui Song

unread,
Jun 5, 2021, 7:45:59 PM6/5/21
to Jarmo Tiitto, Kees Cook, Sami Tolvanen, Bill Wendling, Nathan Chancellor, Nick Desaulniers, clang-bu...@googlegroups.com, linux-...@vger.kernel.org, mo...@google.com
Right, __llvm_prf_vnodes reserves space for static allocation.
There is no relocation referencing __llvm_prf_vnodes so there is
no straightforward way using the space for the kernel.

In userspace shared objects, the scheme works by linking
libclang_rt.profile-*.a into every shared object. The
__start___llvm_prf_vnodes/__stop___llvm_prf_vnodes symbols are
delieberately hidden in compiler-rt InstrProfilingPlatformLinux.c.

The GCC gcov format is definitely simpler than the LLVM format:)
Reply all
Reply to author
Forward
0 new messages