David Koloski would like Owners Override to review this change.
David Koloski removed Clayton McCray and Will Drewry from reviewers of this change.
[fuchsia-async] Inline OnSignals allocation
This commit includes four major changes:
1. Changing the lock around fasync's `PacketReceiverMap` to an `RWLock`
2. Switching `PacketReceiverMap` to hold pointers to `dyn
PacketReceiver`
3. Introducing `RawReceiverRegistration` to allow for more efficient
packet receiver registration
4. Splitting the core implementation of `OnSignals` into an unboxed
`OnSignalsFuture` and a boxed `OnSignals`.
`OnSignals` only ever registers a packet receiver while being polled,
and therefore only when it is pinned. Since it's pinned, we can avoid
making an additional allocation for the coordinating memory between the
executor and the future. Being pinned guarantees that `drop` will be
called to unregister the receiver before the memory is freed.
To prevent a race removing packet receivers, packet receivers are now
called while holding a read lock on the packet receiver map. This does
open some potential for deadlocks if packet receivers attempt to insert
or remove packet receivers during their callback. There are ways we can
work around this if needed.
As a side-effect of some code reorganization, this change also switches
a few `Arc`s to `Box`es. This should be vanishingly more efficient.
In benchmarks, this removed an allocation corresponding with each
`OnSignals` created. It also reduced the amount of memory allocated
slightly, by about 1% overall. This is not expected to be representative
of the entire codebase.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Code-Review | +2 |
Looks good - modifying the commit message will clear the Owners-Override bit so I'll hold off on setting that for this patch set.
Change-Id: Ifb011aa4a276341e3b0d9154401ed717c2fd7d55
Could you please file a tracking bug, link it from here, and attach / link to the relevant benchmark results to that bug? That will help us greatly when doing performance and regression analysis in the future.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Code-Review | -2 |
Commit-Queue | +0 |
Looks like there are some more failures that need investigating.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Code-Review | -2 |
Unfortunately there's some UB here that's proving very difficult to reproduce locally. I can't afford to spend any more time trying to track it down right now, but I suspect https://github.com/rust-lang/rust/issues/63818 is related.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Code-Review | +2 |
Unfortunately there's some UB here that's proving very difficult to reproduce locally. I can't afford to spend any more time trying to track it down right now, but I suspect https://github.com/rust-lang/rust/issues/63818 is related.
Is there a particular test that is failing? Do you have a link to a failure?
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Chris SuterUnfortunately there's some UB here that's proving very difficult to reproduce locally. I can't afford to spend any more time trying to track it down right now, but I suspect https://github.com/rust-lang/rust/issues/63818 is related.
Is there a particular test that is failing? Do you have a link to a failure?
The tests in the [`core.x64-release` builder](https://ci.chromium.org/ui/p/fuchsia/builders/try/core.x64-release/b8711797439120975889/overview) seem to have the highest failure rate. When attempting to reproduce locally, the failures appeared random. I wasn't able to reproduce in any other configurations.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Chris SuterUnfortunately there's some UB here that's proving very difficult to reproduce locally. I can't afford to spend any more time trying to track it down right now, but I suspect https://github.com/rust-lang/rust/issues/63818 is related.
David KoloskiIs there a particular test that is failing? Do you have a link to a failure?
The tests in the [`core.x64-release` builder](https://ci.chromium.org/ui/p/fuchsia/builders/try/core.x64-release/b8711797439120975889/overview) seem to have the highest failure rate. When attempting to reproduce locally, the failures appeared random. I wasn't able to reproduce in any other configurations.
If I had to guess, I'd say this is caused by the deadlock you mentioned in the CL description.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Chris SuterUnfortunately there's some UB here that's proving very difficult to reproduce locally. I can't afford to spend any more time trying to track it down right now, but I suspect https://github.com/rust-lang/rust/issues/63818 is related.
David KoloskiIs there a particular test that is failing? Do you have a link to a failure?
Chris SuterThe tests in the [`core.x64-release` builder](https://ci.chromium.org/ui/p/fuchsia/builders/try/core.x64-release/b8711797439120975889/overview) seem to have the highest failure rate. When attempting to reproduce locally, the failures appeared random. I wasn't able to reproduce in any other configurations.
If I had to guess, I'd say this is caused by the deadlock you mentioned in the CL description.
Do you think you will get back to this CL? I think it's a positive change. If you don't have the time right now, let me know and it might be something I can pick up.
Chris SuterUnfortunately there's some UB here that's proving very difficult to reproduce locally. I can't afford to spend any more time trying to track it down right now, but I suspect https://github.com/rust-lang/rust/issues/63818 is related.
David KoloskiIs there a particular test that is failing? Do you have a link to a failure?
Chris SuterThe tests in the [`core.x64-release` builder](https://ci.chromium.org/ui/p/fuchsia/builders/try/core.x64-release/b8711797439120975889/overview) seem to have the highest failure rate. When attempting to reproduce locally, the failures appeared random. I wasn't able to reproduce in any other configurations.
Chris SuterIf I had to guess, I'd say this is caused by the deadlock you mentioned in the CL description.
Do you think you will get back to this CL? I think it's a positive change. If you don't have the time right now, let me know and it might be something I can pick up.
The deadlock should be turned into a panic via the thread-local `RECEIVING` variable, but there might be a case I missed.
I probably won't get back to it in the immediate future, and I would be very grateful for help getting the issues figured out.
I suspect the issue may be due to `&mut` aliasing a shared memory region even though the value is pinned. I was looking at [`UnsafePinned`](https://doc.rust-lang.org/nightly/std/pin/struct.UnsafePinned.html) to see if it would address the issue, but that's unfortunately nightly-only and last I checked was mid-implementation.
It's very possible the issue is due to something else, but I was having a bear of a time getting a local repro. I _think_ this change may break ffx in release mode based on the symptoms I was seeing. I can't be sure though.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Chris SuterUnfortunately there's some UB here that's proving very difficult to reproduce locally. I can't afford to spend any more time trying to track it down right now, but I suspect https://github.com/rust-lang/rust/issues/63818 is related.
David KoloskiIs there a particular test that is failing? Do you have a link to a failure?
Chris SuterThe tests in the [`core.x64-release` builder](https://ci.chromium.org/ui/p/fuchsia/builders/try/core.x64-release/b8711797439120975889/overview) seem to have the highest failure rate. When attempting to reproduce locally, the failures appeared random. I wasn't able to reproduce in any other configurations.
Chris SuterIf I had to guess, I'd say this is caused by the deadlock you mentioned in the CL description.
David KoloskiDo you think you will get back to this CL? I think it's a positive change. If you don't have the time right now, let me know and it might be something I can pick up.
The deadlock should be turned into a panic via the thread-local `RECEIVING` variable, but there might be a case I missed.
I probably won't get back to it in the immediate future, and I would be very grateful for help getting the issues figured out.
I suspect the issue may be due to `&mut` aliasing a shared memory region even though the value is pinned. I was looking at [`UnsafePinned`](https://doc.rust-lang.org/nightly/std/pin/struct.UnsafePinned.html) to see if it would address the issue, but that's unfortunately nightly-only and last I checked was mid-implementation.
It's very possible the issue is due to something else, but I was having a bear of a time getting a local repro. I _think_ this change may break ffx in release mode based on the symptoms I was seeing. I can't be sure though.
The issue was a deadlock (the tracing_mutex crate readily pointed to the issue). Fxfs wants to be able to free a receiver from within receive_packet. I've fixed it up here: https://fxrev.dev/1313267.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |