[turboshaft][wasm] Wasm-in-JS inlining: Part 1 [v8/v8 : main]

2 views

Skip to first unread message

Daniel Lehmann (Gerrit)

unread,

Aug 29, 2024, 9:43:28 AM8/29/24

to Matthias Liedtke, V8 LUCI CQ, Michael Hablich, dmercadi...@chromium.org, saelo...@chromium.org, v8-re...@googlegroups.com, victorgo...@chromium.org, was...@google.com

Attention needed from Matthias Liedtke

Daniel Lehmann voted Commit-Queue+1

Commit-Queue

Open in Gerrit

Related details

Attention is currently required from:

Matthias Liedtke

Submit Requirements:

Code-Review

Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings.

Gerrit

unsatisfied_requirement

open

diffy

Daniel Lehmann (Gerrit)

unread,

Aug 29, 2024, 11:53:11 AM8/29/24

to Matthias Liedtke, V8 LUCI CQ, Michael Hablich, dmercadi...@chromium.org, saelo...@chromium.org, v8-re...@googlegroups.com, victorgo...@chromium.org, was...@google.com

Attention needed from Matthias Liedtke

Daniel Lehmann added 7 comments

Patchset-level comments

File-level comment, Patchset 8 (Latest):

Daniel Lehmann . unresolved

Thanks for the first, quick synchronous review. It's ready for another pass in your own time now. (I fixed the notes I made for myself earlier.)

File src/compiler/turboshaft/graph-builder.cc

Line 337, Patchset 7: if (v8_flags.turboshaft_wasm_in_js_inlining) {

Daniel Lehmann . resolved

remove: DCE interaction etc.

Line 1331, Patchset 7: js_wasm_calls_sidetable->erase(it);

Daniel Lehmann . resolved

remove

File src/compiler/turboshaft/operations.h

Line 3903, Patchset 7: const JSWasmCallParameters* js_wasm_call_parameters;

Daniel Lehmann . resolved

add TODO: if TSCallDescriptor becomes shared, add this to the TS CallOp, needs lots of code churn in all reducers

File src/compiler/turboshaft/wasm-in-js-inlining-phase.cc

Line 18, Patchset 7: CopyingPhase<WasmInJSInliningReducer, WasmLoweringReducer>::Run(data,

Daniel Lehmann . resolved

add TODO: GC TypeOptimizationReducer

File src/compiler/turboshaft/wasm-in-js-inlining-reducer-inl.h

Line 40, Patchset 7: DCHECK(v8_flags.turboshaft_wasm_in_js_inlining);

Daniel Lehmann . resolved

Make it a `CHECK`

File src/wasm/function-body-decoder-impl.h

Line 2907, Patchset 7: if constexpr (ValidationTag::validate) {

Daniel Lehmann . resolved

I double checked, and this is required. See added comment for clarity.

Open in Gerrit

Related details

Attention is currently required from:

Matthias Liedtke

Submit Requirements:

Code-Review

Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings.

Gerrit

unsatisfied_requirement

open

diffy

Matthias Liedtke (Gerrit)

unread,

Aug 30, 2024, 6:24:46 AM8/30/24

to Daniel Lehmann, V8 LUCI CQ, Michael Hablich, dmercadi...@chromium.org, saelo...@chromium.org, v8-re...@googlegroups.com, victorgo...@chromium.org, was...@google.com

Attention needed from Daniel Lehmann

Matthias Liedtke added 36 comments

Patchset-level comments

File-level comment, Patchset 8 (Latest):

Matthias Liedtke . resolved

Great work! Given the size of the CL, I hope it isn't surprising that I also have a bunch of comments... 😊

File src/compiler/js-inlining.h

Line 42, Patchset 8 (Latest):#if V8_ENABLE_WEBASSEMBLY

Matthias Liedtke . unresolved

As mentioned above, I think this is a large difference in readability. If we can just keep the members (a few extra bytes in one reducer during a limited amount of time during compilation), that sounds like the better trade-off.

File src/compiler/js-inlining.cc

Line 506, Patchset 8 (Latest): // don't set the thread-in-wasm flag now, and instead do that when _not_

Matthias Liedtke . unresolved

File src/compiler/pipeline-data-inl.h

Line 569, Patchset 8 (Latest): // TODO(dlehmann,353475584): Fix this restriction when porting to Turboshaft.

Matthias Liedtke . unresolved

Sure? Even the wasm-into-wasm inlining doesn't support this (i.e. cross-module-inlining), see e.g. https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/turboshaft/wasm-lowering-reducer.h;l=977. This will not be a trivial change.
(You can obviously keep the `TODO` but you might want to lower your expectations on it. 😊)

File src/compiler/pipeline.cc

Line 1048, Patchset 8 (Latest): // in Turboshaft (or in Maglev, depending on the shared frontend).

Matthias Liedtke . resolved

That's the big disadvantage. We'd need to implement JS-to-Wasm-wrappers in maglev. Ideally we'd have a design where we don't need to duplicate it. (Although even in Turbofan that wasn't fully the case.)
So doing it late in Turboshaft would probably be a nicer solution.

File src/compiler/turboshaft/graph-builder.cc

Line 1307, Patchset 8 (Latest): const JSWasmCallParameters* jswcp = nullptr;

Matthias Liedtke . unresolved

`wasm_call_parameters`? Yes, there is an 80 char line limit but it isn't that terrible. 😄

Line 1309, Patchset 8 (Latest):

      if (call_descriptor->kind() == CallDescriptor::kCallWasmFunction) {
        if (v8_flags.turboshaft_wasm_in_js_inlining) {

Matthias Liedtke . unresolved

Nit:
```suggestion
      if (call_descriptor->kind() == CallDescriptor::kCallWasmFunction
          && v8_flags.turboshaft_wasm_in_js_inlining) {
```

Line 1314, Patchset 8 (Latest): // Make sure that for each no-yet-body-inlined call node, there is an

Matthias Liedtke . unresolved

not

Line 1318, Patchset 8 (Latest): DCHECK_NE(it, js_wasm_calls_sidetable->end());

Matthias Liedtke . unresolved

Nit: This would produce an oob-pointer, so maybe just `CHECK_NE`? I used to use `DCHECK` everywhere, but especially in the optimizing compiler I feel like we should use `CHECK`s more deliberately.

File src/compiler/turboshaft/operations.h

Line 3907, Patchset 8 (Latest): // Then, one would need to make this an argument of the Turboshaft CallOp

Matthias Liedtke . unresolved

`Then`? In which of the two cases? 😊
Maybe "For sharing call descriptors, the JSWasmCallParameters need to be moved to the CallOp which causes a lot of code churn (needs touching all `REDUCE(Call)`).

Well, actually, there aren't many of them (2), so it wouldn't be all that bad. Still needs touching the operation etc. but that isn't a lot of work.

Line 3904, Patchset 8 (Latest): // one particular call site, this assumes (only works correctly) if

Matthias Liedtke . unresolved

```suggestion
  // one particular call site, this assumes that (only works correctly if)
```

File src/compiler/turboshaft/pipelines.h

Line 183, Patchset 8 (Latest): Run<turboshaft::MachineLoweringPhase>();

Matthias Liedtke . unresolved

I feel like we should do the inlining conceptually after the `MachineLowering` as it replaces higher-level JS concepts with low-level instructions and the inlined wasm uses low-level instructions as well.

On the other hand, the `MachineLoweringPhase` also does a few optimizations like `MachineOptimizationReducer` that we'd want to run for the inlined wasm as well.

It could indeed be the best strategy to merge it into the `MachineLoweringPhase` in the end but for now I'd keep it separate (because of the experimental nature and the open questions).

File src/compiler/turboshaft/wasm-in-js-inlining-reducer-inl.h

Line 142, Patchset 8 (Latest): std::copy(arguments_.begin(), arguments_.end(), locals_.begin());

Matthias Liedtke . unresolved

`std::copy` doesn't know about `locals_.end()`, so we should probably add a `CHECK` here?

Line 122, Patchset 8 (Latest): bool func_is_shared)

Matthias Liedtke . unresolved

Can we please just bail out in that case? Shared-everything is a super-complicated proposal and the whole `trusted_instance_data` handling for it will be challenging. It is unclear who'll be responsible for then providing the necessary testing for mixing wasm-into-js-inlining with shared-everything, so there is a big risk that this will create issues in the future.

There is also no need to support this right now, shared-everything isn't going to be shipped within the next 12 months (I guess).

Line 93, Patchset 8 (Latest):namespace v8::internal::wasm {

Matthias Liedtke . unresolved

I don't think this is a good solution. We should stay with the `compiler::turboshaft` namespace in my opinion. But I also had my concerns about putting the `turboshaft-graph-interface.cc` to `src/wasm` instead of `src/compiler`.

Line 80, Patchset 8 (Latest):

    // Regular call, nothing to do with Wasm or inlining. Proceed untouched...
    return Next::ReduceCall(callee, frame_state, arguments, descriptor,
                            effects);

Matthias Liedtke . unresolved

I think it's easier to read if you put this in the beginning:
```
if (!descriptor->js_wasm_call_parameters) {
  return Next::ReduceCall(callee, frame_state, arguments, descriptor,
                          effects);
}
```

Line 61, Patchset 8 (Latest): OpIndex thread_in_wasm_flag_address =

Matthias Liedtke . unresolved

`V<WordPtr>`?

Line 55, Patchset 8 (Latest): // not toggle the thread-in-Wasm flag, since it's not needed in the

Matthias Liedtke . unresolved

I don't think this is semantically correct. We do not want to set the flag in the inlined case (the flag is meant to be only set on wasm frames).

Line 1147, Patchset 8 (Latest): V<WasmTrustedInstanceData> trusted_instance_data(bool element_is_shared) {

Matthias Liedtke . unresolved

As said above, I'd like to get rid of any sharedness here.

Line 1120, Patchset 8 (Latest): V<Any> DefaultValue(ValueType type) {

Matthias Liedtke . unresolved

Nit: The data members should be last (meaning this functions should all be moved above them).

Line 1110, Patchset 8 (Latest): // Wasm instance. Used only in `StartFunction()`.

Matthias Liedtke . unresolved

Nit: I'm not a fan of such comments. They tend to get outdated and then cause more confusion than without having the comment.

Line 625, Patchset 8 (Latest):

      // TODO(dlehmann): Not 100% sure this is the correct way. Took this from
      // the Wasm pipeline, but is it also correct in JavaScript?

Matthias Liedtke . unresolved

This should be correct. However, you should have an `isolate` so you should be able to just emit a `HeapConstant` for this instead of having to look up the value from the roots.

I couldn't see any of your test-cases testing the result value produced by the inlining (in all cases, not just the void return one), you should do that, otherwise we only know that it didn't crash...

Line 529, Patchset 8 (Latest): UNREACHABLE();

Matthias Liedtke . unresolved

Same as above.

Line 324, Patchset 8 (Latest): UNREACHABLE();

Matthias Liedtke . unresolved

Just `Bailout`? I think chances are high that we'll add a new unary op at some point in time. Whoever adds that, might miss this inliner and it would be fine if we don't support the new instruction but it wouldn't be great if we then suddenly have stability issues in rare corner cases because of that.

If you think that "the inlining should support all unary ops", then we can add a `DCHECK` plus the bailout, so that fuzzers can find this but this shouldn't risk production stability.

Line 239, Patchset 8 (Latest): return __ TruncateWord64ToWord32(arg);

Matthias Liedtke . unresolved

I'm pretty sure that most if not all of the `I64` instructions you added are broken. When not on 64 bits, you'll need to run the `turboshaft::Int64LoweringReducer`.

As you're on an experimental flag, this doesn't have to happen on this change but that's a bit of the risk of adding dozens of instructions and testing like 5 of them.

Line 169, Patchset 8 (Latest): // need to understand what the inlining ID is / where to get it.

Matthias Liedtke . unresolved

You'll need to provide that from the JS side, see https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/js-inlining.cc;l=454
This might not be fully trivial given that you are doing the inlining later than the JS inlining.

Line 144, Patchset 8 (Latest): // Initialize the non-parameter locals.

Matthias Liedtke . unresolved

How many test cases do you have for this? #passiveAggressiveQuestion 😊
Maybe you'd want to add a `Bailout` here if we have non-parameter-locals and add this feature later.

File src/flags/flag-definitions.h

Line 1395, Patchset 8 (Latest):// Can't use Turboshaft Wasm-in-JS inlining without the Turboshaft _JavaScript_

Matthias Liedtke . unresolved

Nit: I think those emphasizing `_`s are a bit more confusing than helping.

File src/wasm/function-body-decoder-impl.h

Line 2908, Patchset 8 (Latest): if constexpr (ValidationTag::validate) {

Matthias Liedtke . unresolved

So, what was the reason for this change again?

File test/message/wasm-in-js-inlining-turboshaft.js

Line 76, Patchset 8 (Latest): // We would need to extend the wrapper inlining to support this.

Matthias Liedtke . unresolved

As discussed offline, I don't think so for multi-returns as these require allocating a JSArray, so it isn't completely straight-forward.

Line 83, Patchset 8 (Latest): kExprUnreachable,

Matthias Liedtke . unresolved

This should be easy, we need traps anyways for all the gc instructions. But we don't really need `kExprUnreachable` (it only makes sense in combination with exceptions or other control flow).

Line 143, Patchset 8 (Latest): console.log(e)

Matthias Liedtke . unresolved

It might be better to return it or not use the error at all. I've learnt in the past that printing errors creates confusion because someone looks at the output and thinks "Oh, `RuntimeError: unreachable`, seems like the test has failed!"

File test/message/wasm-in-js-inlining-turboshaft.out

Line 3, Patchset 8 (Latest):Successfully inlined Wasm function #0

Matthias Liedtke . unresolved

Nit: You probably want to print some information about the module (e.g. the address) which helps in case of multiple instantiations.

Line 6, Patchset 8 (Latest):Successfully inlined Wasm function #1

Matthias Liedtke . unresolved

Nit: The `successfully` seems redundant.

Line 9, Patchset 8 (Latest):Successfully inlined Wasm function #2

Matthias Liedtke . unresolved

Nit: You should be able to also print the debug name if present (see the turbofan implementation that seems to do that).

Line 27, Patchset 8 (Latest):Cannot consider {ADDRESS} {{ADDRESS} <FeedbackCell[many closures]>} for inlining (no feedback vector)

Matthias Liedtke . unresolved

Nit: But we "considered" it for inlining, otherwise we wouldn't have printed this line? 😊
Should we just treat this as "Cannot inline [...] (no feedback vector)"?

Open in Gerrit

Related details

Attention is currently required from:

Daniel Lehmann

Submit Requirements:

Code-Review

Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings.

Gerrit

unsatisfied_requirement

open

diffy

Daniel Lehmann (Gerrit)

unread,

Sep 5, 2024, 1:48:51 PM9/5/24

to Matthias Liedtke, V8 LUCI CQ, Michael Hablich, dmercadi...@chromium.org, saelo...@chromium.org, v8-re...@googlegroups.com, victorgo...@chromium.org, was...@google.com

Attention needed from Matthias Liedtke

Daniel Lehmann added 36 comments

Patchset-level comments

File-level comment, Patchset 8:

Daniel Lehmann . resolved

Thanks for the first, quick synchronous review. It's ready for another pass in your own time now. (I fixed the notes I made for myself earlier.)

Daniel Lehmann

Done

File src/compiler/js-inlining.h

Line 42, Patchset 8:#if V8_ENABLE_WEBASSEMBLY

Matthias Liedtke . resolved

As mentioned above, I think this is a large difference in readability. If we can just keep the members (a few extra bytes in one reducer during a limited amount of time during compilation), that sounds like the better trade-off.

Daniel Lehmann

Ah, I forgot to make this change after our sync meeting. Done.

File src/compiler/js-inlining.cc

Line 506, Patchset 8: // don't set the thread-in-wasm flag now, and instead do that when _not_

Matthias Liedtke . resolved

if

Daniel Lehmann

Done

File src/compiler/pipeline-data-inl.h

Line 569, Patchset 8: // TODO(dlehmann,353475584): Fix this restriction when porting to Turboshaft.

Matthias Liedtke . unresolved

Sure? Even the wasm-into-wasm inlining doesn't support this (i.e. cross-module-inlining), see e.g. https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/turboshaft/wasm-lowering-reducer.h;l=977. This will not be a trivial change.
(You can obviously keep the `TODO` but you might want to lower your expectations on it. 😊)

Daniel Lehmann

Ack, this is not going to happen for the TS Wasm-in-JS inlining MVP, so I toned down the comment.

In terms of importance/prioritization, I do think the restriction to inline only from a single module in Wasm-in-JS inlining is more of a weird performance cliff than not inlining across different Wasm modules. I.e., that the first Wasm module "wins" in Wasm-in-JS inlining feels "more arbitrary" or "more surprising". (This is just my gut feeling for what I think "a typical Web developer" expects, WDYT?)

File src/compiler/pipeline.cc

Line 1048, Patchset 8: // in Turboshaft (or in Maglev, depending on the shared frontend).

Matthias Liedtke . resolved

That's the big disadvantage. We'd need to implement JS-to-Wasm-wrappers in maglev. Ideally we'd have a design where we don't need to duplicate it. (Although even in Turbofan that wasn't fully the case.)
So doing it late in Turboshaft would probably be a nicer solution.

Daniel Lehmann

Good point, that's a good reason to do it late in Turboshaft in any case (i.e., here.)

File src/compiler/turboshaft/graph-builder.cc

Line 1307, Patchset 8: const JSWasmCallParameters* jswcp = nullptr;

Matthias Liedtke . resolved

`wasm_call_parameters`? Yes, there is an 80 char line limit but it isn't that terrible. 😄

Daniel Lehmann

fair 😄

Line 1309, Patchset 8:

      if (call_descriptor->kind() == CallDescriptor::kCallWasmFunction) {
        if (v8_flags.turboshaft_wasm_in_js_inlining) {

Matthias Liedtke . resolved

Nit:
```suggestion
      if (call_descriptor->kind() == CallDescriptor::kCallWasmFunction
          && v8_flags.turboshaft_wasm_in_js_inlining) {
```

Daniel Lehmann

Done

Line 1314, Patchset 8: // Make sure that for each no-yet-body-inlined call node, there is an

Matthias Liedtke . resolved

not

Daniel Lehmann

Done

Line 1318, Patchset 8: DCHECK_NE(it, js_wasm_calls_sidetable->end());

Matthias Liedtke . resolved

Nit: This would produce an oob-pointer, so maybe just `CHECK_NE`? I used to use `DCHECK` everywhere, but especially in the optimizing compiler I feel like we should use `CHECK`s more deliberately.

Daniel Lehmann

Ack & done

File src/compiler/turboshaft/operations.h

Line 3907, Patchset 8: // Then, one would need to make this an argument of the Turboshaft CallOp

Matthias Liedtke . resolved

`Then`? In which of the two cases? 😊
Maybe "For sharing call descriptors, the JSWasmCallParameters need to be moved to the CallOp which causes a lot of code churn (needs touching all `REDUCE(Call)`).
Well, actually, there aren't many of them (2), so it wouldn't be all that bad. Still needs touching the operation etc. but that isn't a lot of work.

Daniel Lehmann

Done

Line 3904, Patchset 8: // one particular call site, this assumes (only works correctly) if

Matthias Liedtke . resolved

```suggestion
  // one particular call site, this assumes that (only works correctly if)
```

Daniel Lehmann

Done

File src/compiler/turboshaft/pipelines.h

Line 183, Patchset 8: Run<turboshaft::MachineLoweringPhase>();

Matthias Liedtke . unresolved

I feel like we should do the inlining conceptually after the `MachineLowering` as it replaces higher-level JS concepts with low-level instructions and the inlined wasm uses low-level instructions as well.
On the other hand, the `MachineLoweringPhase` also does a few optimizations like `MachineOptimizationReducer` that we'd want to run for the inlined wasm as well.
It could indeed be the best strategy to merge it into the `MachineLoweringPhase` in the end but for now I'd keep it separate (because of the experimental nature and the open questions).

Daniel Lehmann

I have thought again about my comment `Skip this phase if there were no JS->Wasm calls` a couple lines above and replaced it instead with the plan you laid out:
1. Keep a separate phase before the `MachineLoweringPhase` until the MVP is feature-complete and cleaned-up (but before Finching or shipping it).
2. Then integrate my reducer into the `MachineLoweringPhase` probably before `DataViewLoweringReducer` (since we have `DataViewLoweringReducer` in the `WasmGraphBuilderBase` as well, that seems like the right place).
3. Not worry about the performance of always running the reducer: If we have inlineable Wasm functions, running a separate phase is more expensive, and if we don't inline it's just an additional load+conditional branch per CallOp during Turbofan compilation, which is not worth the hassle. (Except if we add a lot more reducers that apply to JS operations as well, but I don't think that will be the case.)

File src/compiler/turboshaft/wasm-in-js-inlining-reducer-inl.h

Line 142, Patchset 8: std::copy(arguments_.begin(), arguments_.end(), locals_.begin());

Matthias Liedtke . resolved

`std::copy` doesn't know about `locals_.end()`, so we should probably add a `CHECK` here?

Daniel Lehmann

Done

Line 122, Patchset 8: bool func_is_shared)

Matthias Liedtke . resolved

Can we please just bail out in that case? Shared-everything is a super-complicated proposal and the whole `trusted_instance_data` handling for it will be challenging. It is unclear who'll be responsible for then providing the necessary testing for mixing wasm-into-js-inlining with shared-everything, so there is a big risk that this will create issues in the future.
There is also no need to support this right now, shared-everything isn't going to be shipped within the next 12 months (I guess).

Daniel Lehmann

Yes, sure. Removed the field, bail out in `TryInlineWasmCall` now.

Line 93, Patchset 8:namespace v8::internal::wasm {

Matthias Liedtke . resolved

I don't think this is a good solution. We should stay with the `compiler::turboshaft` namespace in my opinion. But I also had my concerns about putting the `turboshaft-graph-interface.cc` to `src/wasm` instead of `src/compiler`.

Daniel Lehmann

Sure, kept both in the same `compiler::turboshaft` namespace and added a bunch of `wasm::` prefixes (and few `using wasm::...` declarations for frequently repeated things or where there's no danger of confusing it with JS things).

Line 80, Patchset 8: // Regular call, nothing to do with Wasm or inlining. Proceed untouched...


    return Next::ReduceCall(callee, frame_state, arguments, descriptor,
                            effects);

Matthias Liedtke . resolved

I think it's easier to read if you put this in the beginning:
```
if (!descriptor->js_wasm_call_parameters) {
  return Next::ReduceCall(callee, frame_state, arguments, descriptor,
                          effects);
}
```

Daniel Lehmann

Good idea! Done.

Line 61, Patchset 8: OpIndex thread_in_wasm_flag_address =

Matthias Liedtke . resolved

`V<WordPtr>`?

Daniel Lehmann

Done

Line 55, Patchset 8: // not toggle the thread-in-Wasm flag, since it's not needed in the

Matthias Liedtke . resolved

I don't think this is semantically correct. We do not want to set the flag in the inlined case (the flag is meant to be only set on wasm frames).

Daniel Lehmann

Done

Line 1147, Patchset 8: V<WasmTrustedInstanceData> trusted_instance_data(bool element_is_shared) {

Matthias Liedtke . resolved

As said above, I'd like to get rid of any sharedness here.

Daniel Lehmann

Done

Line 1120, Patchset 8: V<Any> DefaultValue(ValueType type) {

Matthias Liedtke . resolved

Nit: The data members should be last (meaning this functions should all be moved above them).

Daniel Lehmann

Done

Line 1110, Patchset 8: // Wasm instance. Used only in `StartFunction()`.

Matthias Liedtke . resolved

Nit: I'm not a fan of such comments. They tend to get outdated and then cause more confusion than without having the comment.

Daniel Lehmann

Done (removed).

Line 625, Patchset 8: // TODO(dlehmann): Not 100% sure this is the correct way. Took this from


      // the Wasm pipeline, but is it also correct in JavaScript?

Matthias Liedtke . resolved

This should be correct. However, you should have an `isolate` so you should be able to just emit a `HeapConstant` for this instead of having to look up the value from the roots.
I couldn't see any of your test-cases testing the result value produced by the inlining (in all cases, not just the void return one), you should do that, otherwise we only know that it didn't crash...

Daniel Lehmann

Changed to loading `__ HeapConstant(isolate->factory()->undefined_value())`.

Regarding testing the result of the inlining: all test cases `assertEqual(unoptimizedResult, inlinedResult)`, so basically a differential oracle. I think that's good enough, but feel free to reopen if I misunderstood something.

Line 529, Patchset 8: UNREACHABLE();

Matthias Liedtke . resolved

Same as above.

Daniel Lehmann

Done

Line 324, Patchset 8: UNREACHABLE();

Matthias Liedtke . resolved

Just `Bailout`? I think chances are high that we'll add a new unary op at some point in time. Whoever adds that, might miss this inliner and it would be fine if we don't support the new instruction but it wouldn't be great if we then suddenly have stability issues in rare corner cases because of that.
If you think that "the inlining should support all unary ops", then we can add a `DCHECK` plus the bailout, so that fuzzers can find this but this shouldn't risk production stability.

Daniel Lehmann

My reasoning for the `UNREACHABLE` was that we should trip over it when adding a new (in this case: unary) op and forget to handle it here (in the spirit of "make assumptions explicit", "constrain the valid execution paths maximally", and "make all handled enum cases explicit") but I agree that this shouldn't risk stability. So instead, I am going with a three pronged approach:

ops that we support
ops that we don't support, but not a bug: `Bailout()`
default (ops that we "forgot to handle"), a bug, but should not affect stability: `DHCECK(false); Bailout()`

Line 239, Patchset 8: return __ TruncateWord64ToWord32(arg);

Matthias Liedtke . unresolved

I'm pretty sure that most if not all of the `I64` instructions you added are broken. When not on 64 bits, you'll need to run the `turboshaft::Int64LoweringReducer`.
As you're on an experimental flag, this doesn't have to happen on this change but that's a bit of the risk of adding dozens of instructions and testing like 5 of them.

Daniel Lehmann

Oh, you are right, and constant folding may eliminate the i64 operations in many simple test cases so it's not immediately obvious. Just adding the `Int64LoweringReducer` is also not possible to support them, since it makes some assumptions about `Parameter` and `Return` ops and assumes a (non-existent in the Wasm-in-JS inlining case) `wasm_sig`.

All in all, I have to agree that this is a bit too risky and would require a whole bunch of more changes to land right now. So I will remove i64 unary/binary ops from this CL, and only gradually extend later.

I also extended the test with all unary and binary ops that are currently supported by the inliner (which required a bit of a rewrite/testing framework, otherwise it would have been a lot of repetition).

Line 169, Patchset 8: // need to understand what the inlining ID is / where to get it.

Matthias Liedtke . unresolved

You'll need to provide that from the JS side, see https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/js-inlining.cc;l=454
This might not be fully trivial given that you are doing the inlining later than the JS inlining.

Daniel Lehmann

Thanks for the pointer, will leave this open and address in a follow-up CL/part of the implementation.

Line 144, Patchset 8: // Initialize the non-parameter locals.

Matthias Liedtke . resolved

How many test cases do you have for this? #passiveAggressiveQuestion 😊
Maybe you'd want to add a `Bailout` here if we have non-parameter-locals and add this feature later.

Daniel Lehmann

Haha, well, there was one test with a non-parameter local (`localTee` below), but I see your point :)
I will leave it in, since the code required for non-parameter locals is really quite minimal, but added another test with more than one local and some swaps between locals.

File src/flags/flag-definitions.h

Line 1395, Patchset 8:// Can't use Turboshaft Wasm-in-JS inlining without the Turboshaft _JavaScript_

Matthias Liedtke . resolved

Nit: I think those emphasizing `_`s are a bit more confusing than helping.

Daniel Lehmann

Done (removed).

File src/wasm/function-body-decoder-impl.h

Line 2908, Patchset 8: if constexpr (ValidationTag::validate) {

Matthias Liedtke . resolved

So, what was the reason for this change again?

Daniel Lehmann

`DecoderError()` in its body has `V8_ASSUME(ValidationTag::validate)` to help the compiler optimize (and since it's an unchecked assumption, V8_ASSUME additionally DCHECKs that it holds). Here, we call `DecodeError` in a context where `validate` could be false. I am not entirely sure why we didn't trip over this beforehand, but this is the correct thing to do and it comes up with bailouts in the `WasmInJsInliningInterface` now.

File test/message/wasm-in-js-inlining-turboshaft.js

Line 76, Patchset 8: // We would need to extend the wrapper inlining to support this.

Matthias Liedtke . resolved

As discussed offline, I don't think so for multi-returns as these require allocating a JSArray, so it isn't completely straight-forward.

Daniel Lehmann

Correct, rephrased the comment: If we want to support it, we first need to support this in the JS-to-Wasm wrapper inlining. I would rather first port the wrapper inlining to Turboshaft and then implement it, so it's quite far out (doesn't make much sense to spend time on features in the TurboFan wrapper inlining).

Line 83, Patchset 8: kExprUnreachable,

Matthias Liedtke . resolved

This should be easy, we need traps anyways for all the gc instructions. But we don't really need `kExprUnreachable` (it only makes sense in combination with exceptions or other control flow).

Daniel Lehmann

Acknowledged

Line 143, Patchset 8: console.log(e)

Matthias Liedtke . resolved

It might be better to return it or not use the error at all. I've learnt in the past that printing errors creates confusion because someone looks at the output and thinks "Oh, `RuntimeError: unreachable`, seems like the test has failed!"

Daniel Lehmann

Good point, ack and done.

File test/message/wasm-in-js-inlining-turboshaft.out

Line 3, Patchset 8:Successfully inlined Wasm function #0

Matthias Liedtke . resolved

Nit: You probably want to print some information about the module (e.g. the address) which helps in case of multiple instantiations.

Daniel Lehmann

Ack and done: Use the same format as the TurboFan implementation (see also comment below).

Line 6, Patchset 8:Successfully inlined Wasm function #1

Matthias Liedtke . resolved

Nit: The `successfully` seems redundant.

Daniel Lehmann

Ack and done: Use the same format as the TurboFan implementation (see also comment below).

Line 9, Patchset 8:Successfully inlined Wasm function #2

Matthias Liedtke . resolved

Nit: You should be able to also print the debug name if present (see the turbofan implementation that seems to do that).

Daniel Lehmann

Ack and done: Use the same format as the TurboFan implementation (see also comment below).

Line 27, Patchset 8:Cannot consider {ADDRESS} {{ADDRESS} <FeedbackCell[many closures]>} for inlining (no feedback vector)

Matthias Liedtke . resolved

Nit: But we "considered" it for inlining, otherwise we wouldn't have printed this line? 😊
Should we just treat this as "Cannot inline [...] (no feedback vector)"?

Daniel Lehmann

This is actually output from the existing JS-to-Wasm wrapper inlining in Turbofan. I assume it is written this way to be consistent with JavaScript inlining and the positive case where inlining works out: ``` Considering wasm function [24] anyConvertExternConvertAny of module 0x56523c722ac8 for inlining

inlining
```

I don't terribly like the format either, but it's really just bikeshedding and for consistency with the existing tracing output, I use the same format now here. The wrapper inlining is still in Turbofan, which is yet another reason to use this format.

Open in Gerrit

Related details

Attention is currently required from:

Matthias Liedtke

Submit Requirements:

Code-Review

Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings.

Gerrit

unsatisfied_requirement

open

diffy

Matthias Liedtke (Gerrit)

unread,

Sep 9, 2024, 5:22:59 AM9/9/24

to Daniel Lehmann, V8 LUCI CQ, Michael Hablich, dmercadi...@chromium.org, saelo...@chromium.org, v8-re...@googlegroups.com, victorgo...@chromium.org, was...@google.com

Attention needed from Daniel Lehmann

Matthias Liedtke voted and added 19 comments

Votes added by Matthias Liedtke

Code-Review

19 comments

Patchset-level comments

File-level comment, Patchset 9 (Latest):

Matthias Liedtke . resolved

LGTM once tests pass. 😊

File src/compiler/pipeline-data-inl.h

Line 569, Patchset 8: // TODO(dlehmann,353475584): Fix this restriction when porting to Turboshaft.

Matthias Liedtke . unresolved

Sure? Even the wasm-into-wasm inlining doesn't support this (i.e. cross-module-inlining), see e.g. https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/turboshaft/wasm-lowering-reducer.h;l=977. This will not be a trivial change.
(You can obviously keep the `TODO` but you might want to lower your expectations on it. 😊)

Daniel Lehmann

Ack, this is not going to happen for the TS Wasm-in-JS inlining MVP, so I toned down the comment.
In terms of importance/prioritization, I do think the restriction to inline only from a single module in Wasm-in-JS inlining is more of a weird performance cliff than not inlining across different Wasm modules. I.e., that the first Wasm module "wins" in Wasm-in-JS inlining feels "more arbitrary" or "more surprising". (This is just my gut feeling for what I think "a typical Web developer" expects, WDYT?)

Matthias Liedtke

More arbitrary, yes.
However, the multi-module-use-cases are rare so far. It does work with multiple instantiations afaik as both instantiations would have the same `wasm::WasmModule` object, so that's already ahead of the wasm-into-wasm inlining.
I'd rather see us implement cross-module / cross-instantiation inlining for the wasm-inlining than for the Wasm-into-JS inlining as this is much more of a corner case (and much less something that developers would expect the engine to optimize for).

File src/compiler/turboshaft/pipelines.h

Line 183, Patchset 8: Run<turboshaft::MachineLoweringPhase>();

Matthias Liedtke . unresolved

I feel like we should do the inlining conceptually after the `MachineLowering` as it replaces higher-level JS concepts with low-level instructions and the inlined wasm uses low-level instructions as well.
On the other hand, the `MachineLoweringPhase` also does a few optimizations like `MachineOptimizationReducer` that we'd want to run for the inlined wasm as well.
It could indeed be the best strategy to merge it into the `MachineLoweringPhase` in the end but for now I'd keep it separate (because of the experimental nature and the open questions).

Daniel Lehmann

I have thought again about my comment `Skip this phase if there were no JS->Wasm calls` a couple lines above and replaced it instead with the plan you laid out:
1. Keep a separate phase before the `MachineLoweringPhase` until the MVP is feature-complete and cleaned-up (but before Finching or shipping it).
2. Then integrate my reducer into the `MachineLoweringPhase` probably before `DataViewLoweringReducer` (since we have `DataViewLoweringReducer` in the `WasmGraphBuilderBase` as well, that seems like the right place).
3. Not worry about the performance of always running the reducer: If we have inlineable Wasm functions, running a separate phase is more expensive, and if we don't inline it's just an additional load+conditional branch per CallOp during Turbofan compilation, which is not worth the hassle. (Except if we add a lot more reducers that apply to JS operations as well, but I don't think that will be the case.)

Matthias Liedtke

Ah, actually I remember one difference here:
For Turbofan, if we inline wasm functions, we run a bunch of extra optimizations (the `WasmGCOptimizationPhase`) and some of that stuff is actually relevant (e.g. eliminating redundant casts and the `extern.internalize(extern.externalize(...))` that would be a typical pattern if you chain two wasm calls (one externalizes the return and the other internalizes the argument.)

So you might want to do the same later on. (OTOH, we can still run it in the regular phase and then conditionally run the wasm optimizations later on, they anyways need their own phase in Turboshaft.)

File src/compiler/turboshaft/wasm-in-js-inlining-reducer-inl.h

Line 457, Patchset 9 (Latest): // We currently don't sup port operations that:

Matthias Liedtke . unresolved

Nit: `support`

Line 312, Patchset 9 (Latest): // asm.js:

Matthias Liedtke . unresolved

Thanks. Also note that we can't inline asm.js because it would require different handling in frames.cc as it would need to create a "JS-looking" stack frame for a wasm function. (There is a test case somewhere I think.)

Line 625, Patchset 8:

      // TODO(dlehmann): Not 100% sure this is the correct way. Took this from
      // the Wasm pipeline, but is it also correct in JavaScript?

Matthias Liedtke . resolved

This should be correct. However, you should have an `isolate` so you should be able to just emit a `HeapConstant` for this instead of having to look up the value from the roots.
I couldn't see any of your test-cases testing the result value produced by the inlining (in all cases, not just the void return one), you should do that, otherwise we only know that it didn't crash...

Daniel Lehmann

Changed to loading `__ HeapConstant(isolate->factory()->undefined_value())`.
Regarding testing the result of the inlining: all test cases `assertEqual(unoptimizedResult, inlinedResult)`, so basically a differential oracle. I think that's good enough, but feel free to reopen if I misunderstood something.

Matthias Liedtke

Fair, I missed the `assertEquals`. (I'll blame it on the CL size! 😜)

Line 324, Patchset 8: UNREACHABLE();

Matthias Liedtke . resolved

Just `Bailout`? I think chances are high that we'll add a new unary op at some point in time. Whoever adds that, might miss this inliner and it would be fine if we don't support the new instruction but it wouldn't be great if we then suddenly have stability issues in rare corner cases because of that.
If you think that "the inlining should support all unary ops", then we can add a `DCHECK` plus the bailout, so that fuzzers can find this but this shouldn't risk production stability.

Daniel Lehmann

My reasoning for the `UNREACHABLE` was that we should trip over it when adding a new (in this case: unary) op and forget to handle it here (in the spirit of "make assumptions explicit", "constrain the valid execution paths maximally", and "make all handled enum cases explicit") but I agree that this shouldn't risk stability. So instead, I am going with a three pronged approach:
ops that we support
ops that we don't support, but not a bug: `Bailout()`
default (ops that we "forgot to handle"), a bug, but should not affect stability: `DHCECK(false); Bailout()`

Matthias Liedtke

Sounds good!

Line 239, Patchset 8: return __ TruncateWord64ToWord32(arg);

Matthias Liedtke . unresolved

I'm pretty sure that most if not all of the `I64` instructions you added are broken. When not on 64 bits, you'll need to run the `turboshaft::Int64LoweringReducer`.
As you're on an experimental flag, this doesn't have to happen on this change but that's a bit of the risk of adding dozens of instructions and testing like 5 of them.

Daniel Lehmann

Oh, you are right, and constant folding may eliminate the i64 operations in many simple test cases so it's not immediately obvious. Just adding the `Int64LoweringReducer` is also not possible to support them, since it makes some assumptions about `Parameter` and `Return` ops and assumes a (non-existent in the Wasm-in-JS inlining case) `wasm_sig`.
All in all, I have to agree that this is a bit too risky and would require a whole bunch of more changes to land right now. So I will remove i64 unary/binary ops from this CL, and only gradually extend later.
I also extended the test with all unary and binary ops that are currently supported by the inliner (which required a bit of a rewrite/testing framework, otherwise it would have been a lot of repetition).

Matthias Liedtke

and constant folding may eliminate the i64 operations in many simple test cases so it's not immediately obvious.

Yes, but even `Word64Constant` should fail without the lowering. 😊

Line 144, Patchset 8: // Initialize the non-parameter locals.

Matthias Liedtke . resolved

How many test cases do you have for this? #passiveAggressiveQuestion 😊
Maybe you'd want to add a `Bailout` here if we have non-parameter-locals and add this feature later.

Daniel Lehmann

Haha, well, there was one test with a non-parameter local (`localTee` below), but I see your point :)
I will leave it in, since the code required for non-parameter locals is really quite minimal, but added another test with more than one local and some swaps between locals.

Matthias Liedtke

Thanks!

File src/wasm/function-body-decoder-impl.h

Line 2908, Patchset 8: if constexpr (ValidationTag::validate) {

Matthias Liedtke . resolved

So, what was the reason for this change again?

Daniel Lehmann

`DecoderError()` in its body has `V8_ASSUME(ValidationTag::validate)` to help the compiler optimize (and since it's an unchecked assumption, V8_ASSUME additionally DCHECKs that it holds). Here, we call `DecodeError` in a context where `validate` could be false. I am not entirely sure why we didn't trip over this beforehand, but this is the correct thing to do and it comes up with bailouts in the `WasmInJsInliningInterface` now.

Matthias Liedtke

Ah, I think I got it, with bailouts, we don't reach the end but we still don't want to trigger a `DecodeError` here, right?

File test/message/wasm-in-js-inlining-turboshaft.js

Line 32, Patchset 9 (Latest):

    // TODO(353475584): Is it worth lifting this restriction? Is this a
    // restriction also for regular JS inlining?

Matthias Liedtke . resolved

I'd hope that
1) People don't use a generic wrapper mechanism with `...` (as that would also prevent inlining as everything goes through the same call).
2) compilers generate the arguments explicitly since wasm signatures are static, so it should be very easy to generate the argument list.)

So, if these assumptions hold true, we shouldn't need this. (But still a good idea to keep the todo for now.)

Line 87, Patchset 9 (Latest):

  kGCPrefix, kExprAnyConvertExtern,
  kGCPrefix, kExprExternConvertAny,

Matthias Liedtke . unresolved

Note that this can be optimized away (as mentioned in another comment).
An idea using `kSi_r_i`:
```
kExprLocalGet, 0,
kExprRefI31,
kExprExternConvertAny
```

and `kSig_i_r` for the other way:
```
kExprLocalGet, 0,
kExprAnyConvertExtern,
kExprI31GetS,
```
(I didn't check if you have the ref.i31 stuff in your CL, so this could also be a nice test case for a follow-up CL.)

And yes, there could still be some optimizations here, but I don't think we optimize anything there (other than very generic optimizations).

Line 194, Patchset 9 (Latest):addTestcase('createArray', makeSig([kWasmI32], [kWasmExternRef]), [42], [

Matthias Liedtke . unresolved

This isn't needed for feature parity, it's only needed to test the `array.len` below, so your TODO is slightly misleading. 😊

Line 212, Patchset 9 (Latest):// addBinaryTestcase('RefEq', makeSig([kWasmExternRef, kWasmExternRef], [kWasmExternRef]), {}, null);

Matthias Liedtke . unresolved

Should this be `kWasmEqRef` or do I misunderstand the comment?

Line 216, Patchset 9 (Latest):// Also, there is no wrapper inlining for i64 on 32-bit architectures right

Matthias Liedtke . resolved

Oh, I never considered this. Independently, it's probably easiest to add the lowering to support the operations and ignore the fact that wrappers won't work well on 32 bits as we don't care too much about missed optimizations on 32 bit platforms.

Line 244, Patchset 9 (Latest):

  try {
    wasmExports.trapNoInline();
  } catch (e) {
    return e.toString();
  }

Matthias Liedtke . resolved

This would also require inlining into a try-block which in Turbofan would've been a pain (not sure about Turboshaft in that aspect).

Line 263, Patchset 9 (Latest):

  eval(jsFunctionSrc);
  const jsFunction = eval('js_' + name);

Matthias Liedtke . unresolved

You can slightly reduce the amount of eval and magic by doing


```suggestion
  const jsFunction = eval(jsFunctionSrc);
```

Line 273, Patchset 9 (Latest):

  assertEquals(%GetOptimizationStatus(jsFunction), V8OptimizationStatus.kIsFunction | V8OptimizationStatus.kOptimized | V8OptimizationStatus.kTurboFanned);

Matthias Liedtke . unresolved

I think these asserts can produce all kinds of issues in various build modes (e.g. the one that randomly deopts functions).
You should probably try to use the helpers like `assertOptimized(jsFunction)` that hides most of that complexity.

File test/message/wasm-in-js-inlining-turboshaft.out

Line 27, Patchset 8:Cannot consider {ADDRESS} {{ADDRESS} <FeedbackCell[many closures]>} for inlining (no feedback vector)

Matthias Liedtke . resolved

Nit: But we "considered" it for inlining, otherwise we wouldn't have printed this line? 😊
Should we just treat this as "Cannot inline [...] (no feedback vector)"?

Daniel Lehmann

This is actually output from the existing JS-to-Wasm wrapper inlining in Turbofan. I assume it is written this way to be consistent with JavaScript inlining and the positive case where inlining works out: ``` Considering wasm function [24] anyConvertExternConvertAny of module 0x56523c722ac8 for inlining
inlining
```
I don't terribly like the format either, but it's really just bikeshedding and for consistency with the existing tracing output, I use the same format now here. The wrapper inlining is still in Turbofan, which is yet another reason to use this format.

Matthias Liedtke

Consistency makes sense, I don't like the format too much either, but I guess, I also wanted to make it consistent with the JS inlining. 😊

Open in Gerrit

Related details

Attention is currently required from:

Daniel Lehmann

Submit Requirements:

Code-Review

Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings.

Gerrit

satisfied_requirement

open

diffy

Matthias Liedtke (Gerrit)

unread,

Sep 10, 2024, 8:54:36 AM9/10/24

to Daniel Lehmann, V8 LUCI CQ, Michael Hablich, dmercadi...@chromium.org, saelo...@chromium.org, v8-re...@googlegroups.com, victorgo...@chromium.org, was...@google.com

Attention needed from Daniel Lehmann

Matthias Liedtke voted Code-Review+1

Code-Review

Open in Gerrit

Related details

Attention is currently required from:

Daniel Lehmann

Submit Requirements:

Code-Review

Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings.

Gerrit

satisfied_requirement

open

diffy

Daniel Lehmann (Gerrit)

unread,

Sep 10, 2024, 9:47:47 AM9/10/24

to Matthias Liedtke, V8 LUCI CQ, Michael Hablich, dmercadi...@chromium.org, saelo...@chromium.org, v8-re...@googlegroups.com, victorgo...@chromium.org, was...@google.com

Attention needed from Matthias Liedtke

Daniel Lehmann voted and added 11 comments

Votes added by Daniel Lehmann

Commit-Queue

11 comments

File src/compiler/pipeline-data-inl.h

Line 569, Patchset 8: // TODO(dlehmann,353475584): Fix this restriction when porting to Turboshaft.

Matthias Liedtke . resolved

Sure? Even the wasm-into-wasm inlining doesn't support this (i.e. cross-module-inlining), see e.g. https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/turboshaft/wasm-lowering-reducer.h;l=977. This will not be a trivial change.
(You can obviously keep the `TODO` but you might want to lower your expectations on it. 😊)

Daniel Lehmann

Ack, this is not going to happen for the TS Wasm-in-JS inlining MVP, so I toned down the comment.
In terms of importance/prioritization, I do think the restriction to inline only from a single module in Wasm-in-JS inlining is more of a weird performance cliff than not inlining across different Wasm modules. I.e., that the first Wasm module "wins" in Wasm-in-JS inlining feels "more arbitrary" or "more surprising". (This is just my gut feeling for what I think "a typical Web developer" expects, WDYT?)

Matthias Liedtke

More arbitrary, yes.
However, the multi-module-use-cases are rare so far. It does work with multiple instantiations afaik as both instantiations would have the same `wasm::WasmModule` object, so that's already ahead of the wasm-into-wasm inlining.
I'd rather see us implement cross-module / cross-instantiation inlining for the wasm-inlining than for the Wasm-into-JS inlining as this is much more of a corner case (and much less something that developers would expect the engine to optimize for).

Daniel Lehmann

That makes sense: more arbitrary, but less used in practice, hence add features rather to Wasm-inlining. (Marking as resolved, further discussion on team priorities offline.)

File src/compiler/turboshaft/pipelines.h

Line 183, Patchset 8: Run<turboshaft::MachineLoweringPhase>();

Matthias Liedtke . resolved

I feel like we should do the inlining conceptually after the `MachineLowering` as it replaces higher-level JS concepts with low-level instructions and the inlined wasm uses low-level instructions as well.
On the other hand, the `MachineLoweringPhase` also does a few optimizations like `MachineOptimizationReducer` that we'd want to run for the inlined wasm as well.
It could indeed be the best strategy to merge it into the `MachineLoweringPhase` in the end but for now I'd keep it separate (because of the experimental nature and the open questions).

Daniel Lehmann

I have thought again about my comment `Skip this phase if there were no JS->Wasm calls` a couple lines above and replaced it instead with the plan you laid out:
1. Keep a separate phase before the `MachineLoweringPhase` until the MVP is feature-complete and cleaned-up (but before Finching or shipping it).
2. Then integrate my reducer into the `MachineLoweringPhase` probably before `DataViewLoweringReducer` (since we have `DataViewLoweringReducer` in the `WasmGraphBuilderBase` as well, that seems like the right place).
3. Not worry about the performance of always running the reducer: If we have inlineable Wasm functions, running a separate phase is more expensive, and if we don't inline it's just an additional load+conditional branch per CallOp during Turbofan compilation, which is not worth the hassle. (Except if we add a lot more reducers that apply to JS operations as well, but I don't think that will be the case.)

Matthias Liedtke

Ah, actually I remember one difference here:
For Turbofan, if we inline wasm functions, we run a bunch of extra optimizations (the `WasmGCOptimizationPhase`) and some of that stuff is actually relevant (e.g. eliminating redundant casts and the `extern.internalize(extern.externalize(...))` that would be a typical pattern if you chain two wasm calls (one externalizes the return and the other internalizes the argument.)
So you might want to do the same later on. (OTOH, we can still run it in the regular phase and then conditionally run the wasm optimizations later on, they anyways need their own phase in Turboshaft.)

Daniel Lehmann

Yes, good point. I already had a `TODO` in `wasm-in-js-inlining-phase.cc` to add Wasm GC optimizations, but I hadn't accounted for them needing their own phase due to the analysis on the input graph.

IIUC the analysis code walks over all operations in all blocks (i.e., it wouldn't only add overhead for Wasm GC operations), so we cannot just add it to the beginning of a following phase unconditionally.

File src/compiler/turboshaft/wasm-in-js-inlining-reducer-inl.h

Line 457, Patchset 9: // We currently don't sup port operations that:

Matthias Liedtke . resolved

Nit: `support`

Daniel Lehmann

Done

Line 312, Patchset 9: // asm.js:

Matthias Liedtke . resolved

Thanks. Also note that we can't inline asm.js because it would require different handling in frames.cc as it would need to create a "JS-looking" stack frame for a wasm function. (There is a test case somewhere I think.)

Daniel Lehmann

I added an early bailout to ensure the Wasm module we inline is not coming from `asm.js`, to document this explicitly.

Line 239, Patchset 8: return __ TruncateWord64ToWord32(arg);

Matthias Liedtke . resolved

I'm pretty sure that most if not all of the `I64` instructions you added are broken. When not on 64 bits, you'll need to run the `turboshaft::Int64LoweringReducer`.
As you're on an experimental flag, this doesn't have to happen on this change but that's a bit of the risk of adding dozens of instructions and testing like 5 of them.

Daniel Lehmann

Oh, you are right, and constant folding may eliminate the i64 operations in many simple test cases so it's not immediately obvious. Just adding the `Int64LoweringReducer` is also not possible to support them, since it makes some assumptions about `Parameter` and `Return` ops and assumes a (non-existent in the Wasm-in-JS inlining case) `wasm_sig`.
All in all, I have to agree that this is a bit too risky and would require a whole bunch of more changes to land right now. So I will remove i64 unary/binary ops from this CL, and only gradually extend later.
I also extended the test with all unary and binary ops that are currently supported by the inliner (which required a bit of a rewrite/testing framework, otherwise it would have been a lot of repetition).

Matthias Liedtke

and constant folding may eliminate the i64 operations in many simple test cases so it's not immediately obvious.
Yes, but even `Word64Constant` should fail without the lowering. 😊

Daniel Lehmann

Right, I also removed support for `i64.const`. (But it's not easy to test a lone `i64.const` instruction: the wrapper inlining doesn't support `i64` in the signature on 32-bit architectures, i.e., you first need to convert it to an `i32` somehow, and that may be constant folded again.)

File src/wasm/function-body-decoder-impl.h

Line 2908, Patchset 8: if constexpr (ValidationTag::validate) {

Matthias Liedtke . resolved

So, what was the reason for this change again?

Daniel Lehmann

`DecoderError()` in its body has `V8_ASSUME(ValidationTag::validate)` to help the compiler optimize (and since it's an unchecked assumption, V8_ASSUME additionally DCHECKs that it holds). Here, we call `DecodeError` in a context where `validate` could be false. I am not entirely sure why we didn't trip over this beforehand, but this is the correct thing to do and it comes up with bailouts in the `WasmInJsInliningInterface` now.

Matthias Liedtke

Ah, I think I got it, with bailouts, we don't reach the end but we still don't want to trigger a `DecodeError` here, right?

Daniel Lehmann

Yes, exactly, good explanation. I updated the comment to explicitly mention bailouts.

File test/message/wasm-in-js-inlining-turboshaft.js

Line 87, Patchset 9:

  kGCPrefix, kExprAnyConvertExtern,
  kGCPrefix, kExprExternConvertAny,

Matthias Liedtke . unresolved

Note that this can be optimized away (as mentioned in another comment).
An idea using `kSi_r_i`:
```
kExprLocalGet, 0,
kExprRefI31,
kExprExternConvertAny
```
and `kSig_i_r` for the other way:
```
kExprLocalGet, 0,
kExprAnyConvertExtern,
kExprI31GetS,
```
(I didn't check if you have the ref.i31 stuff in your CL, so this could also be a nice test case for a follow-up CL.)
And yes, there could still be some optimizations here, but I don't think we optimize anything there (other than very generic optimizations).

Daniel Lehmann

Thanks for the test case. Since I don't have `ref.i31` support yet, I'll keep this open and add it in a later CL.

Line 194, Patchset 9:addTestcase('createArray', makeSig([kWasmI32], [kWasmExternRef]), [42], [

Matthias Liedtke . resolved

This isn't needed for feature parity, it's only needed to test the `array.len` below, so your TODO is slightly misleading. 😊

Daniel Lehmann

Oh, good point. Fixed.

Line 212, Patchset 9:// addBinaryTestcase('RefEq', makeSig([kWasmExternRef, kWasmExternRef], [kWasmExternRef]), {}, null);

Matthias Liedtke . resolved

Should this be `kWasmEqRef` or do I misunderstand the comment?

Daniel Lehmann

Right, I missed that I can already test this without having to support operations that create an `eqref` in Wasm, since one can pass JS numbers as `kWasmEqRef`. So fixed and added a `RefEq` testcase now.

Line 263, Patchset 9: eval(jsFunctionSrc);


  const jsFunction = eval('js_' + name);

Matthias Liedtke . resolved

You can slightly reduce the amount of eval and magic by doing
```suggestion
  const jsFunction = eval(jsFunctionSrc);
```

Daniel Lehmann

Unfortunately, that's not possible, since in `eval('function foo() { ... }')` the JS is parsed as a function declaration (not a named function expression) and hence has return value `undefined`. It's only a named function expression when this syntax is used in a context that doesn't allow statements (e.g., the right-hand side of an assignment). So I think I have to keep this particular eval magic :/

Line 273, Patchset 9:

  assertEquals(%GetOptimizationStatus(jsFunction), V8OptimizationStatus.kIsFunction | V8OptimizationStatus.kOptimized | V8OptimizationStatus.kTurboFanned);

Matthias Liedtke . resolved

I think these asserts can produce all kinds of issues in various build modes (e.g. the one that randomly deopts functions).
You should probably try to use the helpers like `assertOptimized(jsFunction)` that hides most of that complexity.

Daniel Lehmann

Thanks, used this and `assertUnoptimized` now.

Open in Gerrit

Related details

Attention is currently required from:

Matthias Liedtke

Submit Requirements:

Code-Review

Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings.

Gerrit

satisfied_requirement

open

diffy

V8 LUCI CQ (Gerrit)

unread,

Sep 10, 2024, 9:49:07 AM9/10/24

to Daniel Lehmann, Matthias Liedtke, Michael Hablich, dmercadi...@chromium.org, saelo...@chromium.org, v8-re...@googlegroups.com, victorgo...@chromium.org, was...@google.com

V8 LUCI CQ submitted the change

Change information

Commit message:

[turboshaft][wasm] Wasm-in-JS inlining: Part 1

Start with some boilerplate and incomplete implementation. This adds:
- Off-by-default, experimental flag.
- Attaching the information which Wasm function is called to a new
  TurboFan sidetable (determined the least invasive/best option
  together with Nico).
- Convert that sidetable information and attach to `TSCallDescriptor`.
- New Turboshaft phase and reducer, which consume this info and do the
  actual inlining.
- Mjsunit-style message test.
- WasmFullDecoder instantiation (in the TS reducer) and new interface
  for the decoder.
- Bailout by decoding Wasm function twice.
- Don't toggle thread-in-Wasm-flag when inlining (HACKY, see comment).
- Implementation (roughly copied from regular Wasm TS pipeline) of
  * locals, globals, constants,
  * simple arithmetic,
  * several binary and unary operations that don't trap or call
    builtins.

TODOs for follow-up parts:
- Correct stack trace, metadata, debugging ...
- Implement support for more Wasm instructions in the inlining
  interface (which might need smarter code sharing with the regular
  Wasm pipeline)
- Implement memory accesses (with bounds checking)
- Fuzz the whole thing with Fuzzilli

Bug: 353475584

Change-Id: Ica7283e76ea15b4cf5efd0dbc8d47877ee080382

Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/5823909

Commit-Queue: Daniel Lehmann <dleh...@chromium.org>

Reviewed-by: Matthias Liedtke <mlie...@chromium.org>

Cr-Commit-Position: refs/heads/main@{#96030}

Files:

M BUILD.bazel
M BUILD.gn
M src/compiler/js-inlining-heuristic.h
M src/compiler/js-inlining.cc
M src/compiler/js-inlining.h
M src/compiler/pipeline-data-inl.h
M src/compiler/pipeline.cc
M src/compiler/turboshaft/build-graph-phase.cc
M src/compiler/turboshaft/graph-builder.cc
M src/compiler/turboshaft/graph-builder.h
M src/compiler/turboshaft/operations.h
M src/compiler/turboshaft/pipelines.h
A src/compiler/turboshaft/wasm-in-js-inlining-phase.cc
A src/compiler/turboshaft/wasm-in-js-inlining-phase.h
A src/compiler/turboshaft/wasm-in-js-inlining-reducer-inl.h
M src/flags/flag-definitions.h
M src/logging/runtime-call-stats.h
M src/wasm/function-body-decoder-impl.h
M test/message/message.status
A test/message/wasm-in-js-inlining-turboshaft.js
A test/message/wasm-in-js-inlining-turboshaft.out
M test/mjsunit/mjsunit.status
M test/mjsunit/wasm/wasm-module-builder.js

Change size: XL

Delta: 23 files changed, 2094 insertions(+), 47 deletions(-)

Branch: refs/heads/main

Submit Requirements:

Code-Review: +1 by Matthias Liedtke

Open in Gerrit

Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings.

Gerrit

open

diffy

satisfied_requirement

Leszek Swirski (Gerrit)

unread,

Sep 10, 2024, 11:37:31 AM9/10/24

to Daniel Lehmann, V8 LUCI CQ, Matthias Liedtke, Michael Hablich, dmercadi...@chromium.org, saelo...@chromium.org, v8-re...@googlegroups.com, victorgo...@chromium.org, was...@google.com

Leszek Swirski has created a revert of this change

Open original change

Related details

Attention set is empty

Submit Requirements:

Code-Review

Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings.

Gerrit

satisfied_requirement

open

diffy

Daniel Lehmann (Gerrit)

unread,

Apr 2, 2026, 4:43:44 PM (3 days ago) Apr 2

to V8 LUCI CQ, Matthias Liedtke, Michael Hablich, dmercadi...@chromium.org, saelo...@chromium.org, v8-re...@googlegroups.com, victorgo...@chromium.org, was...@google.com

Daniel Lehmann added 2 comments

File src/compiler/turboshaft/wasm-in-js-inlining-reducer-inl.h

Line 169, Patchset 8: // need to understand what the inlining ID is / where to get it.

Matthias Liedtke . resolved

You'll need to provide that from the JS side, see https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/js-inlining.cc;l=454

This might not be fully trivial given that you are doing the inlining later than the JS inlining.

Daniel Lehmann

Thanks for the pointer, will leave this open and address in a follow-up CL/part of the implementation.

Daniel Lehmann

This got fixed in https://crrev.com/c/7638289.

File test/message/wasm-in-js-inlining-turboshaft.js

Line 87, Patchset 9:

  kGCPrefix, kExprAnyConvertExtern,
  kGCPrefix, kExprExternConvertAny,

Matthias Liedtke . resolved

Note that this can be optimized away (as mentioned in another comment).
An idea using `kSi_r_i`:
```
kExprLocalGet, 0,
kExprRefI31,
kExprExternConvertAny
```
and `kSig_i_r` for the other way:
```
kExprLocalGet, 0,
kExprAnyConvertExtern,
kExprI31GetS,
```
(I didn't check if you have the ref.i31 stuff in your CL, so this could also be a nice test case for a follow-up CL.)
And yes, there could still be some optimizations here, but I don't think we optimize anything there (other than very generic optimizations).

Daniel Lehmann

Thanks for the test case. Since I don't have `ref.i31` support yet, I'll keep this open and add it in a later CL.

Daniel Lehmann

Finally fixed this test together with the support for `any.convert_extern` and `extern.convert_any` in https://crrev.com/c/7725990.

Open in Gerrit

Related details

Attention set is empty

Submit Requirements:

Code-Owners
Code-Review

Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings.

Gerrit

satisfied_requirement

open

diffy

Reply all

Reply to author

Forward

0 new messages