[go] runtime: aggregate records with identical stacks in MutexProfile


Nick Ripley (Gerrit)

Wed, 03 Jul 2024 12:09:04 +0000
to goph...@pubsubhelper.golang.org, Felix Geisendörfer, Go LUCI, golang-co...@googlegroups.com
Attention needed from Felix Geisendörfer

Nick Ripley added 1 comment

File src/runtime/pprof/pprof_test.go
Line 2582, Patchset 2 (Latest):func TestMutexDeduplication(t *testing.T) {
Felix Geisendörfer . unresolved

Do we also need this test and logic for block profiles? This could be a separate CL, I'm just curious.

Nick Ripley

I looked through the existing call sites leading to `blockevent` and I didn't see any places where it looked like there were multiple ways to reach it (as is the case for `semrelease` for mutex unlocking). That's not _proof_ it can't happen, of course. We could do deduplication for the block profiler defensively. But perhaps absent a clear case where we can get duplicates, we should avoid the cost of deduplication?


Nick Ripley (Gerrit)

Wed, 03 Jul 2024 12:26:26 +0000
to goph...@pubsubhelper.golang.org, Felix Geisendörfer, Go LUCI, golang-co...@googlegroups.com
Attention needed from Felix Geisendörfer

Nick Ripley added 1 comment

File src/runtime/pprof/pprof_test.go
Line 2585, Patchset 2 (Latest): // Make sure we have permission to use it, or write a new one.
Nick Ripley . unresolved

I've reached out on that issue to get permission to use this code.


Nick Ripley (Gerrit)

Wed, 03 Jul 2024 13:21 +0000
to goph...@pubsubhelper.golang.org, golang-co...@googlegroups.com
Attention needed from Felix Geisendörfer

Nick Ripley uploaded new patchset

Nick Ripley uploaded patch set #3 to this change.
Following approvals got outdated and were removed:
  • TryBots-Pass: LUCI-TryBot-Result+1 by Go LUCI

Nick Ripley (Gerrit)

Wed, 03 Jul 2024 13:21:16 +0000
to goph...@pubsubhelper.golang.org, Felix Geisendörfer, Go LUCI, golang-co...@googlegroups.com
Attention needed from Felix Geisendörfer

Nick Ripley added 1 comment

File src/runtime/pprof/pprof_test.go
Line 2585, Patchset 2: // Make sure we have permission to use it, or write a new one.
Nick Ripley . resolved

I've reached out on that issue to get permission to use this code.

Nick Ripley

Done


Rhys Hiltner (Gerrit)

Wed, 03 Jul 2024 17:28:22 +0000
to Nick Ripley, goph...@pubsubhelper.golang.org, Rhys Hiltner, Go LUCI, Felix Geisendörfer, golang-co...@googlegroups.com
Attention needed from Felix Geisendörfer and Nick Ripley

Rhys Hiltner added 3 comments

Patchset-level comments
File-level comment, Patchset 3 (Latest):
Rhys Hiltner . resolved

Fixing the records after the fact looks tricky. It's hard to hide fully from the user (see line comments), and it seems like it'll result in repeating a lot of the work that the original change was trying to avoid (skip/inline calculations on every read, though not on every event). Plus allocating a big map on each read.

Is there a path forward where an mprof bucket could serve as a forwarding record to another bucket? That would involve doing the stack skip / inlining calculations once (at the time of creating the bucket corresponding to the non-deduped stack). It might double the memory footprint of block/mutex profiles .. though in my experience they're usually dwarfed by the size of the heap profile when all are enabled.

File src/runtime/mprof.go
Line 1180, Patchset 2: // I don't see any tests for this behavior
Felix Geisendörfer . unresolved

Good question. I think it's okay because the caller can't assume that the returned `n` stays the same between calls.

However, it's possible to imagine a caller that makes the assumption that `n` is not decreasing between calls for this profile type.

So the most-compatible thing to do here is to remove the `size` argument from `mutexProfileInternal` (so it always invokes its callbacks) and handle the `n` counting case inside of this function, including the aggregation when needed.

But I'd wait for feedback from the maintainers. I'm a bit uncertain how far to interpret Hyrum's law given the new [compatibility focus](https://go.dev/blog/compat).

Rhys Hiltner

Another non-maintainer view:

Most profiles say "If len(p) < n, [FooProfile] does not change p and returns n, false."; this one says "If len(p) >= n, MutexProfile copies the profile into p and returns n, true."

I'm not sure how to interpret those when "n" isn't a single value (at some point in time). PS3 looks like its behavior could be:

```
MutexProfile(buf[:100]) == (1, true)

MutexProfile(buf[:1]) == (3, false)
MutexProfile(buf[:2]) == (3, false)
MutexProfile(buf[:3]) == (1, true)
```

The docs guarantee that there won't be changes to the slice if ok==false, which I think will cause trouble for Felix's suggestion of doing the count inside this function -- hard to deduplicate when we're not allowed to use the user-allocated memory as scratch space.

The docs don't make guarantees about not touching the elements in `p[n:len(p)]`. As of PS3, this will overwrite some of them (and leave those records in place after it returns).

File src/runtime/pprof/pprof_test.go
Line 2582, Patchset 2:func TestMutexDeduplication(t *testing.T) {
Felix Geisendörfer . unresolved

Do we also need this test and logic for block profiles? This could be a separate CL, I'm just curious.

Nick Ripley

I looked through the existing call sites leading to `blockevent` and I didn't see any places where it looked like there were multiple ways to reach it (as is the case for `semrelease` for mutex unlocking). That's not _proof_ it can't happen, of course. We could do deduplication for the block profiler defensively. But perhaps absent a clear case where we can get duplicates, we should avoid the cost of deduplication?

Rhys Hiltner

Is there a test that will break as soon as someone adds a `blockevent` call that violates that invariant? Absent that, it doesn't seem like a safe / future-proof assumption to make.


Nick Ripley (Gerrit)

Wed, 03 Jul 2024 19:23:48 +0000
to goph...@pubsubhelper.golang.org, Cherry Mui, Rhys Hiltner, Go LUCI, Felix Geisendörfer, golang-co...@googlegroups.com
Attention needed from Felix Geisendörfer and Rhys Hiltner

Nick Ripley added 2 comments

Patchset-level comments
Rhys Hiltner . resolved

Fixing the records after the fact looks tricky. It's hard to hide fully from the user (see line comments), and it seems like it'll result in repeating a lot of the work that the original change was trying to avoid (skip/inline calculations on every read, though not on every event). Plus allocating a big map on each read.

Is there a path forward where an mprof bucket could serve as a forwarding record to another bucket? That would involve doing the stack skip / inlining calculations once (at the time of creating the bucket corresponding to the non-deduped stack). It might double the memory footprint of block/mutex profiles .. though in my experience they're usually dwarfed by the size of the heap profile when all are enabled.

Nick Ripley

it seems like it'll result in repeating a lot of the work that the original change was trying to avoid

Indeed, I don't see a way forward without losing a lot of the gain from frame pointer unwinding.

Plus allocating a big map on each read.

Yeah, I chose the implementation as of PS3 for simplicity, but the added memory overhead is unfortunate. We could do it without additional memory overhead by sorting the records (lexicographically by stack) in place and merging consecutive records with identical stacks. I opted not to do this for a first implementation because we can't use an existing sort implementation here, and I didn't trust myself to write one without subtle bugs. But I could give that a go if it sounds better.
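
A minimal sketch of the merge step described above, assuming the records have already been sorted lexicographically by stack so that duplicates are adjacent; the names are hypothetical, and the in-runtime sort itself (the hard part noted above) is not shown:

```
package sketch

import (
	"runtime"
	"slices"
)

// mergeSortedRecords folds adjacent records with identical stacks into one,
// summing their counts and cycle totals. It assumes the input is already
// sorted lexicographically by stack, so duplicates are adjacent.
func mergeSortedRecords(p []runtime.BlockProfileRecord) []runtime.BlockProfileRecord {
	out := p[:0] // reuse the caller's backing array; no extra allocation
	for _, r := range p {
		if n := len(out); n > 0 && slices.Equal(out[n-1].Stack(), r.Stack()) {
			out[n-1].Count += r.Count
			out[n-1].Cycles += r.Cycles
			continue
		}
		out = append(out, r)
	}
	return out
}
```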

Is there a path forward where an mprof bucket could serve as a forwarding record to another bucket?

I could see something like that working. This also brings up the point that this issue affects the runtime/pprof output as well. That is, we can get mutex profiles where there are multiple records with the same call stack. This can already happen for the heap profile due to removing runtime frames during reporting, so in theory pprof consumers already have to deal with this. But if we consider that a regression as well then we'll need to address it here. And if so, it'd probably be better to handle aggregation/deduplication in one place than to make each mutex/block profile consumer do it.

Talking through this has me reconsidering what the best fix is. Given that there doesn't seem to be a way forward without _some_ performance hit, perhaps the best options are:

• Don't use frame pointer unwinding for now, i.e. revert CL 533258
• Still use frame pointer unwinding to collect the stack, but handle skipping & inline expansion at sample time.

The second option would lose most of the performance benefit of frame pointer unwinding, but we'd still save _some_ work and come out ahead of 1.22 perf-wise. Thoughts? cc @cher...@google.com as a runtime maintainer

File src/runtime/pprof/pprof_test.go
Line 2582, Patchset 2:func TestMutexDeduplication(t *testing.T) {
Felix Geisendörfer . unresolved

Do we also need this test and logic for block profiles? This could be a separate CL, I'm just curious.

Nick Ripley

I looked through the existing call sites leading to `blockevent` and I didn't see any places where it looked like there were multiple ways to reach it (as is the case for `semrelease` for mutex unlocking). That's not _proof_ it can't happen, of course. We could do deduplication for the block profiler defensively. But perhaps absent a clear case where we can get duplicates, we should avoid the cost of deduplication?

Rhys Hiltner

Is there a test that will break as soon as someone adds a `blockevent` call that violates that invariant? Absent that, it doesn't seem like a safe / future-proof assumption to make.

Nick Ripley

There isn't such a test, so you're right, the only safe thing to do is handle this for the block profiler as well.


Rhys Hiltner (Gerrit)

Wed, 03 Jul 2024 20:40:55 +0000
to Nick Ripley, goph...@pubsubhelper.golang.org, Cherry Mui, Rhys Hiltner, Go LUCI, Felix Geisendörfer, golang-co...@googlegroups.com
Attention needed from Felix Geisendörfer and Nick Ripley

Rhys Hiltner added 1 comment

Patchset-level comments
Rhys Hiltner . resolved

Fixing the records after the fact looks tricky. It's hard to hide fully from the user (see line comments), and it seems like it'll result in repeating a lot of the work that the original change was trying to avoid (skip/inline calculations on every read, though not on every event). Plus allocating a big map on each read.

Is there a path forward where an mprof bucket could serve as a forwarding record to another bucket? That would involve doing the stack skip / inlining calculations once (at the time of creating the bucket corresponding to the non-deduped stack). It might double the memory footprint of block/mutex profiles .. though in my experience they're usually dwarfed by the size of the heap profile when all are enabled.

Nick Ripley

it seems like it'll result in repeating a lot of the work that the original change was trying to avoid

Indeed, I don't see a way forward without losing a lot of the gain from frame pointer unwinding.

Plus allocating a big map on each read.

Yeah, I chose the implementation as of PS3 for simplicity, but the added memory overhead is unfortunate. We could do it without additional memory overhead by sorting the records (lexicographically by stack) in place and merging consecutive records with identical stacks. I opted not to do this for a first implementation because we can't use an existing sort implementation here, and I didn't trust myself to write one without subtle bugs. But I could give that a go if it sounds better.

Is there a path forward where an mprof bucket could serve as a forwarding record to another bucket?

I could see something like that working. This also brings up the point that this issue affects the runtime/pprof output as well. That is, we can get mutex profiles where there are multiple records with the same call stack. This can already happen for the heap profile due to removing runtime frames during reporting, so in theory pprof consumers already have to deal with this. But if we consider that a regression as well then we'll need to address it here. And if so, it'd probably be better to handle aggregation/deduplication in one place than to make each mutex/block profile consumer do it.

Talking through this has me reconsidering what the best fix is. Given that there doesn't seem to be a way forward without _some_ performance hit, perhaps the best options are:

• Don't use frame pointer unwinding for now, i.e. revert CL 533258
• Still use frame pointer unwinding to collect the stack, but handle skipping & inline expansion at sample time.

The second option would lose most of the performance benefit of frame pointer unwinding, but we'd still save _some_ work and come out ahead of 1.22 perf-wise. Thoughts? cc @cher...@google.com as a runtime maintainer

Rhys Hiltner

I was proposing what I see as a third option; maybe it didn't come across. What if we handle skipping and inline expansion at sample time, but only the first time we see a particular stack (and create the mprof bucket)?

Previously (before CL 533258?), we had one `runtime.bucket` for each profile entry. Now, we sometimes (#67548) have multiple `runtime.bucket` values that end up with identical stacks in the resulting profile. I'm proposing that we consider having two kinds of `runtime.bucket` values: one with an expanded/skipped stack (which shows up in the profile), and one that holds the unprocessed stack that we get from frame pointer unwinding. The second kind of bucket would forward to the first (`type bucket struct { ... ; forwardTo *bucket ; ... }`), and be filtered out when exporting the profile to the user.

We'd still use frame pointer unwinding to collect the stack. We'd look up the entry in the mprof hash table, we'd see that the `runtime.(*bucket).forwardTo` field was set, we'd follow that pointer and add the samples to that other bucket. In effect, it's a cache of the expansion/skipping work. The main runtime cost would be the memory due to (nearly) doubling the number of bucket values.

But in my experience the mutex/block profiles are usually small relative to heap profiles, so I estimate that as a low / acceptable cost. Do your data and experience match that?
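
A rough sketch of this forwarding idea, using simplified, hypothetical types and names; the runtime's real `bucket`, hash table, and locking are more involved, and the string key below is only a stand-in for the runtime's stack hash:

```
package sketch

import "fmt"

// bucket is a simplified stand-in for the runtime's mprof bucket.
type bucket struct {
	stk       []uintptr // raw (frame-pointer-unwound) or expanded stack
	count     int64
	cycles    int64
	forwardTo *bucket // set on raw-stack buckets; nil on the bucket the profile reports
}

// addSample pays the skip/inline expansion cost only the first time a raw
// stack is seen; afterwards the raw-stack bucket just forwards to the
// expanded bucket that actually accumulates the samples.
func addSample(table map[string]*bucket, rawStk []uintptr, cycles int64, expand func([]uintptr) []uintptr) {
	rawKey := fmt.Sprint(rawStk) // stand-in for the mprof hash-table lookup
	b, ok := table[rawKey]
	if !ok {
		expanded := expand(rawStk) // skip + inline expansion, done once per distinct raw stack
		expKey := fmt.Sprint(expanded)
		target, ok := table[expKey]
		if !ok {
			target = &bucket{stk: expanded} // the bucket exported to the profile
			table[expKey] = target
		}
		b = &bucket{stk: rawStk, forwardTo: target} // filtered out when exporting
		table[rawKey] = b
	}
	if b.forwardTo != nil {
		b = b.forwardTo // follow the forwarding record
	}
	b.count++
	b.cycles += cycles
}
```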


Cherry Mui (Gerrit)

Wed, 03 Jul 2024 21:37:55 +0000
to Nick Ripley, goph...@pubsubhelper.golang.org, Rhys Hiltner, Go LUCI, Felix Geisendörfer, golang-co...@googlegroups.com
Attention needed from Felix Geisendörfer, Nick Ripley and Rhys Hiltner

Cherry Mui added 5 comments

Patchset-level comments
Rhys Hiltner . unresolved

Fixing the records after the fact looks tricky. It's hard to hide fully from the user (see line comments), and it seems like it'll result in repeating a lot of the work that the original change was trying to avoid (skip/inline calculations on every read, though not on every event). Plus allocating a big map on each read.

Is there a path forward where an mprof bucket could serve as a forwarding record to another bucket? That would involve doing the stack skip / inlining calculations once (at the time of creating the bucket corresponding to the non-deduped stack). It might double the memory footprint of block/mutex profiles .. though in my experience they're usually dwarfed by the size of the heap profile when all are enabled.

Nick Ripley

it seems like it'll result in repeating a lot of the work that the original change was trying to avoid

Indeed, I don't see a way forward without losing a lot of the gain from frame pointer unwinding.

Plus allocating a big map on each read.

Yeah, I chose the implementation as of PS3 for simplicity, but the added memory overhead is unfortunate. We could do it without additional memory overhead by sorting the records (lexicographically by stack) in place and merging consecutive records with identical stacks. I opted not to do this for a first implementation because we can't use an existing sort implementation here, and I didn't trust myself to write one without subtle bugs. But I could give that a go if it sounds better.

Is there a path forward where an mprof bucket could serve as a forwarding record to another bucket?

I could see something like that working. This also brings up the point that this issue affects the runtime/pprof output as well. That is, we can get mutex profiles where there are multiple records with the same call stack. This can already happen for the heap profile due to removing runtime frames during reporting, so in theory pprof consumers already have to deal with this. But if we consider that a regression as well then we'll need to address it here. And if so, it'd probably be better to handle aggregation/deduplication in one place than to make each mutex/block profile consumer do it.

Talking through this has me reconsidering what the best fix is. Given that there doesn't seem to be a way forward without _some_ performance hit, perhaps the best options are:

• Don't use frame pointer unwinding for now, i.e. revert CL 533258
• Still use frame pointer unwinding to collect the stack, but handle skipping & inline expansion at sample time.

The second option would lose most of the performance benefit of frame pointer unwinding, but we'd still save _some_ work and come out ahead of 1.22 perf-wise. Thoughts? cc @cher...@google.com as a runtime maintainer

Rhys Hiltner

I was proposing what I see as a third option; maybe it didn't come across. What if we handle skipping and inline expansion at sample time, but only the first time we see a particular stack (and create the mprof bucket)?

Previously (before CL 533258?), we had one `runtime.bucket` for each profile entry. Now, we sometimes (#67548) have multiple `runtime.bucket` values that end up with identical stacks in the resulting profile. I'm proposing that we consider having two kinds of `runtime.bucket` values: one with an expanded/skipped stack (which shows up in the profile), and one that holds the unprocessed stack that we get from frame pointer unwinding. The second kind of bucket would forward to the first (`type bucket struct { ... ; forwardTo *bucket ; ... }`), and be filtered out when exporting the profile to the user.

We'd still use frame pointer unwinding to collect the stack. We'd look up the entry in the mprof hash table, we'd see that the `runtime.(*bucket).forwardTo` field was set, we'd follow that pointer and add the samples to that other bucket. In effect, it's a cache of the expansion/skipping work. The main runtime cost would be the memory due to (nearly) doubling the number of bucket values.

But in my experience the mutex/block profiles are usually small relative to heap profiles, so I estimate that as a low / acceptable cost. Do your data and experience match that?

Cherry Mui

Yeah, I was writing my comment and was about to propose something similar. What if we do the expansion only up to the skip count? That is, we still use the frame pointer unwinder, which is cheaper than the full unwinding. Then we expand just a few frames to cover the skip count, leave the rest of the PCs unexpanded, and record the partially expanded stack to buckets. At profile read time, we expand the rest of the PCs. (We probably don't need two types of buckets.)

This will be more expensive than just the frame pointer unwinding. But skip counts are usually pretty small, so we don't need to expand many frames. And if it matters we probably can memoize the expansion result somehow.

Could we try this and see if the performance is acceptable?

Thanks.
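
The read-time half of that idea, expanding the stored PCs into inline-aware frames only when the profile is read, can be sketched at user level with `runtime.CallersFrames`; the actual in-runtime mechanism differs, and the sample-time partial expansion to cover the skip count is not shown:

```
package sketch

import "runtime"

// expandAtReadTime turns stored program counters into symbolized,
// inline-expanded frames only when the profile is read, rather than at
// sample time. Inlined callees come back as separate frames.
func expandAtReadTime(pcs []uintptr) []runtime.Frame {
	var out []runtime.Frame
	frames := runtime.CallersFrames(pcs)
	for {
		f, more := frames.Next()
		out = append(out, f)
		if !more {
			break
		}
	}
	return out
}
```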

File src/runtime/mprof.go
Line 1160, Patchset 3 (Latest):

```
stackHash := func(stk []uintptr) uintptr {
	var h uintptr
	for _, pc := range stk {
		h += pc
		h += h << 10
		h ^= h >> 6
	}
	// finalize
	h += h << 3
	h ^= h >> 11
	return h
}
```

Cherry Mui . unresolved

Since we're operating on BlockProfileRecord, we can just hash the Stack0 array and use a Go map.
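
A minimal sketch of that suggestion, with hypothetical names: because `BlockProfileRecord` embeds `StackRecord`, whose `Stack0` field is a fixed-size (and therefore comparable) array, an ordinary Go map can key the deduplication directly and supply the hashing:

```
package sketch

import "runtime"

// dedupByStack0 merges records that share an identical Stack0, summing
// their counts and cycle totals. The comparable [32]uintptr array (the
// current size of StackRecord.Stack0) serves directly as the map key.
func dedupByStack0(p []runtime.BlockProfileRecord) []runtime.BlockProfileRecord {
	index := make(map[[32]uintptr]int, len(p)) // Stack0 -> position in out
	out := p[:0]
	for _, r := range p {
		if i, ok := index[r.Stack0]; ok {
			out[i].Count += r.Count
			out[i].Cycles += r.Cycles
			continue
		}
		index[r.Stack0] = len(out)
		out = append(out, r)
	}
	return out
}
```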

Line 1180, Patchset 2: // I don't see any tests for this behavior
Felix Geisendörfer . unresolved

Good question. I think it's okay because the caller can't assume that the returned `n` stays the same between calls.

However, it's possible to imagine a caller that makes the assumption that `n` is not decreasing between calls for this profile type.

So the most-compatible thing to do here is to remove the `size` argument from `mutexProfileInternal` (so it always invokes its callbacks) and handle the `n` counting case inside of this function, including the aggregation when needed.

But I'd wait for feedback from the maintainers. I'm a bit uncertain how far to interpret Hyrum's law given the new [compatibility focus](https://go.dev/blog/compat).

Rhys Hiltner

Another non-maintainer view:

Most profiles say "If len(p) < n, [FooProfile] does not change p and returns n, false."; this one says "If len(p) >= n, MutexProfile copies the profile into p and returns n, true."

I'm not sure how to interpret those when "n" isn't a single value (at some point in time). PS3 looks like its behavior could be:

```
MutexProfile(buf[:100]) == (1, true)

MutexProfile(buf[:1]) == (3, false)
MutexProfile(buf[:2]) == (3, false)
MutexProfile(buf[:3]) == (1, true)
```

The docs guarantee that there won't be changes to the slice if ok==false, which I think will cause trouble for Felix's suggestion of doing the count inside this function -- hard to deduplicate when we're not allowed to use the user-allocated memory as scratch space.

The docs don't make guarantees about not touching the elements in `p[n:len(p)]`. As of PS3, this will overwrite some of them (and leave those records in place after it returns).

Cherry Mui

Interesting questions.

I'm not too worried about writing to `p[n:len(p)]`. As it is possible to write all the elements in p, the caller cannot keep any useful data there.

The non-monotonicity is unfortunate and is more concerning. Most uses are probably still fine. The count is essentially only useful for allocating the buffer.

As Rhys mentioned, it would be hard to dedup inside mutexProfileInternal, since if ok==false we don't want to write to p.

Maybe this is okay? (But see the other discussion.)

Line 1231, Patchset 3 (Latest): })
Cherry Mui . unresolved

Maybe mention that we don't need to dedup here, as the pprof package dedups.

File src/runtime/pprof/pprof_test.go
Line 2582, Patchset 2:func TestMutexDeduplication(t *testing.T) {
Felix Geisendörfer . unresolved

Do we also need this test and logic for block profiles? This could be a separate CL, I'm just curious.

Nick Ripley

I looked through the existing call sites leading to `blockevent` and I didn't see any places where it looked like there were multiple ways to reach it (as is the case for `semrelease` for mutex unlocking). That's not _proof_ it can't happen, of course. We could do deduplication for the block profiler defensively. But perhaps absent a clear case where we can get duplicates, we should avoid the cost of deduplication?

Rhys Hiltner

Is there a test that will break as soon as someone adds a `blockevent` call that violates that invariant? Absent that, it doesn't seem like a safe / future-proof assumption to make.

Nick Ripley

There isn't such a test, so you're right, the only safe thing to do is handle this for the block profiler as well.

Cherry Mui

It would be good to add a test (can be a separate CL). We could do actual dedup logic later. Thanks.
