Set Ready For Review
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Commit-Queue | +1 |
Hi Toon, hi Leszek, this patch is ready for review, PTAL, thanks!
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Commit-Queue | +1 |
OperandType::kReg, OperandType::kFlag16, OperandType::kFlag8) \We add one padding byte to ensure that the feedback value can be stored at a 16-bit aligned address. This guarantees that atomic loads will always operate on an aligned address. We have a more detailed explanation of this in our design doc: https://docs.google.com/document/d/1QmkY6LEZ7B6kEu1xAr3Dzn4O5gFljmRwvqcA5eMmLsw/edit?disco=AAABrMA17pE.
Do you have any suggestions regarding this approach?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
OperandType::kReg, OperandType::kFlag16, OperandType::kFlag8) \We add one padding byte to ensure that the feedback value can be stored at a 16-bit aligned address. This guarantees that atomic loads will always operate on an aligned address. We have a more detailed explanation of this in our design doc: https://docs.google.com/document/d/1QmkY6LEZ7B6kEu1xAr3Dzn4O5gFljmRwvqcA5eMmLsw/edit?disco=AAABrMA17pE.
Do you have any suggestions regarding this approach?
Would it be possible to keep this as a racy read, and deal with the consequences in the optimizing compiler (e.g. swallow the risk of reading incorrect feedback then deopting, or validate the feedback on finalization)?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
OperandType::kReg, OperandType::kFlag16, OperandType::kFlag8) \Leszek SwirskiWe add one padding byte to ensure that the feedback value can be stored at a 16-bit aligned address. This guarantees that atomic loads will always operate on an aligned address. We have a more detailed explanation of this in our design doc: https://docs.google.com/document/d/1QmkY6LEZ7B6kEu1xAr3Dzn4O5gFljmRwvqcA5eMmLsw/edit?disco=AAABrMA17pE.
Do you have any suggestions regarding this approach?
Would it be possible to keep this as a racy read, and deal with the consequences in the optimizing compiler (e.g. swallow the risk of reading incorrect feedback then deopting, or validate the feedback on finalization)?
You're right, I think the optimizing compiler itself could likely tolerate a racy read for the feedback. We have an earlier version of this implementation which used a simple unaligned read for the 16-bit value (see https://chromium-review.googlesource.com/c/v8/v8/+/6855690/18/src/interpreter/bytecode-array-iterator.cc#371).
The specific problem we're running into is with ThreadSanitizer in trybot. When the `is_tsan` build flag is enabled, the Store operation in the CSA (src/codegen/code-stub-assembler.cc#13297 of this CL) is lowered to a tsan-aware store, and the unaligned access will be flagged as a data race.
Current implementation uses a 16-bit atomic load to satisfy these tsan and ubsan checks. I think if we want to use unaligned read/write, we would need to bypass the tsan check (maybe we introduce an "unsafe" store in CSA which will ignore the tsan? I'm not sure what the best practice would be here).
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
OperandType::kReg, OperandType::kFlag16, OperandType::kFlag8) \Leszek SwirskiWe add one padding byte to ensure that the feedback value can be stored at a 16-bit aligned address. This guarantees that atomic loads will always operate on an aligned address. We have a more detailed explanation of this in our design doc: https://docs.google.com/document/d/1QmkY6LEZ7B6kEu1xAr3Dzn4O5gFljmRwvqcA5eMmLsw/edit?disco=AAABrMA17pE.
Do you have any suggestions regarding this approach?
Yuheng WeiWould it be possible to keep this as a racy read, and deal with the consequences in the optimizing compiler (e.g. swallow the risk of reading incorrect feedback then deopting, or validate the feedback on finalization)?
You're right, I think the optimizing compiler itself could likely tolerate a racy read for the feedback. We have an earlier version of this implementation which used a simple unaligned read for the 16-bit value (see https://chromium-review.googlesource.com/c/v8/v8/+/6855690/18/src/interpreter/bytecode-array-iterator.cc#371).
The specific problem we're running into is with ThreadSanitizer in trybot. When the `is_tsan` build flag is enabled, the Store operation in the CSA (src/codegen/code-stub-assembler.cc#13297 of this CL) is lowered to a tsan-aware store, and the unaligned access will be flagged as a data race.
Current implementation uses a 16-bit atomic load to satisfy these tsan and ubsan checks. I think if we want to use unaligned read/write, we would need to bypass the tsan check (maybe we introduce an "unsafe" store in CSA which will ignore the tsan? I'm not sure what the best practice would be here).
We have precedent with RacyReadHeapNumberBits https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/heap-refs.cc;l=309;drc=77ce2c33f61b01daac3d9e4299b3d0805bee6461 which is suppressed in https://source.chromium.org/chromium/chromium/src/+/main:v8/tools/sanitizers/tsan_suppressions.txt;l=15;drc=96cf03108a535070d387bbaf88c899442d47d4d0
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Commit-Queue | +1 |
OperandType::kReg, OperandType::kFlag16, OperandType::kFlag8) \Leszek SwirskiWe add one padding byte to ensure that the feedback value can be stored at a 16-bit aligned address. This guarantees that atomic loads will always operate on an aligned address. We have a more detailed explanation of this in our design doc: https://docs.google.com/document/d/1QmkY6LEZ7B6kEu1xAr3Dzn4O5gFljmRwvqcA5eMmLsw/edit?disco=AAABrMA17pE.
Do you have any suggestions regarding this approach?
Yuheng WeiWould it be possible to keep this as a racy read, and deal with the consequences in the optimizing compiler (e.g. swallow the risk of reading incorrect feedback then deopting, or validate the feedback on finalization)?
Leszek SwirskiYou're right, I think the optimizing compiler itself could likely tolerate a racy read for the feedback. We have an earlier version of this implementation which used a simple unaligned read for the 16-bit value (see https://chromium-review.googlesource.com/c/v8/v8/+/6855690/18/src/interpreter/bytecode-array-iterator.cc#371).
The specific problem we're running into is with ThreadSanitizer in trybot. When the `is_tsan` build flag is enabled, the Store operation in the CSA (src/codegen/code-stub-assembler.cc#13297 of this CL) is lowered to a tsan-aware store, and the unaligned access will be flagged as a data race.
Current implementation uses a 16-bit atomic load to satisfy these tsan and ubsan checks. I think if we want to use unaligned read/write, we would need to bypass the tsan check (maybe we introduce an "unsafe" store in CSA which will ignore the tsan? I'm not sure what the best practice would be here).
We have precedent with RacyReadHeapNumberBits https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/heap-refs.cc;l=309;drc=77ce2c33f61b01daac3d9e4299b3d0805bee6461 which is suppressed in https://source.chromium.org/chromium/chromium/src/+/main:v8/tools/sanitizers/tsan_suppressions.txt;l=15;drc=96cf03108a535070d387bbaf88c899442d47d4d0
Thanks for this information! I've uploaded a new patch set implementing the racy embedded feedback read and removing the padding byte, PTAL.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
Looks pretty nice. I do think you might need to whitelist code that writes to the bytecode array for sandbox/pkey builds? Adding mlippautz for pkey insights.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Looks pretty nice. I do think you might need to whitelist code that writes to the bytecode array for sandbox/pkey builds? Adding mlippautz for pkey insights.
Trying to figure out where this actually happens. The CL description doesn't really help here. For the HW sandbox using pkeys the reads are fine but write are not.
Basically we need exit/enter pairs for the HW sandbox around code that writes outside of untrusted memory.
What's going on here? That's a 1kLOC change without any description/bug/design doc?
race:BytecodeDecoder::RacyDecodeEmbeddedFeedbackThis is for C++ code, right? Then it should be fixed with using proper atomics. We should not add any more exceptions to this file as this is just not sustainable.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
What's going on here? That's a 1kLOC change without any description/bug/design doc?
Sorry for missing the description. The purpose of the current CL is to embed the feedback slots for the bytecode "StrictEqual" in its bytecode rather than allocating the slots in the feedback vector. The CL itself can save some memory, and more importantly, we can do some further Sparkplug optimization based on this CL. For the details, please refer to this design doc. (And sure we will add some description in the commit message.)
Here is the design doc.
https://docs.google.com/document/d/1QmkY6LEZ7B6kEu1xAr3Dzn4O5gFljmRwvqcA5eMmLsw/edit?tab=t.0#heading=h.byzqhv7xsdrd
Hao A XuWhat's going on here? That's a 1kLOC change without any description/bug/design doc?
Sorry for missing the description. The purpose of the current CL is to embed the feedback slots for the bytecode "StrictEqual" in its bytecode rather than allocating the slots in the feedback vector. The CL itself can save some memory, and more importantly, we can do some further Sparkplug optimization based on this CL. For the details, please refer to this design doc. (And sure we will add some description in the commit message.)
Here is the design doc.
https://docs.google.com/document/d/1QmkY6LEZ7B6kEu1xAr3Dzn4O5gFljmRwvqcA5eMmLsw/edit?tab=t.0#heading=h.byzqhv7xsdrd
Thanks, I checked the doc and security sgtm but we should update the text of the design doc a little :)
Also, we still need add the HW sandbox primitives to make PKU bots pass. You can find various enter/exit pairs depending on your needs here: https://source.chromium.org/search?q=EnterSandbox&ss=chromium
Let's use the following tracking bug: https://issues.chromium.org/u/1/issues/429351411