Set Ready For Review
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
First quick pass. I haven't looked at the main file yet...
void ThisStartNewSnapshot(base::Vector<const Snapshot> predecessors,What's the difference from this and the StartNewSnapshot above?
if (maglev_block->has_phi()) {
DCHECK(!maglev_block->is_exception_handler_block());
ComputePredecessorPermutations(maglev_block, turboshaft_block, false,
false);
}This change seems unrelated to this CL, maybe do it in another?
bool ignore_last_predecessor) {
predecessor_permutation_.clear();
// This function is only called for loops that need a "single block
// predecessor" (from EmitLoopSinglePredecessorBlock). The backedge should
// always be skipped in thus cases. Additionally, this means that when
// even when {maglev_block} is a loop, {turboshaft_block} shouldn't and
// should instead be the new single forward predecessor of the loop.
DCHECK_EQ(skip_backedge, maglev_block->is_loop());
DCHECK(!turboshaft_block->IsLoop());
if (maglev_block->predecessor_count() == 1) {
// This block has a single predecessor ==> trivial.
predecessor_permutation_.push_back(0);
return;
}
DCHECK(maglev_block->has_state());Is this related to the current CL?
void MakeLoopPhi(maglev::ValueNode* phi,
const maglev::ProcessingState& state) {
DCHECK(phi->Is<maglev::Phi>());
DCHECK(__ current_block()->IsLoop());
DCHECK(state.block()->is_loop());
OpIndex first_phi_input;
if (state.block()->predecessor_count() > 2 ||
generator_analyzer_.HeaderIsBypassed(state.block())) {
// This loop has multiple forward edges in Maglev, so we should have
// created an intermediate block in Turboshaft, which will be the only
// predecessor of the Turboshaft loop, and from which we'll find the
// first input for this loop phi.
DCHECK_EQ(loop_phis_first_input_.size(),
static_cast<size_t>(state.block()->phis()->LengthForTest()));
DCHECK_GE(loop_phis_first_input_index_, 0);
DCHECK_LT(loop_phis_first_input_index_, loop_phis_first_input_.size());
DCHECK(loop_single_edge_predecessors_.contains(state.block()));
DCHECK_EQ(loop_single_edge_predecessors_[state.block()],
__ current_block()->LastPredecessor());
first_phi_input = loop_phis_first_input_[loop_phis_first_input_index_];
loop_phis_first_input_index_++;
} else {
DCHECK_EQ(phi->input_count(), 2);
DCHECK_EQ(state.block()->predecessor_count(), 2);
DCHECK(loop_phis_first_input_.empty());
first_phi_input = Map(phi->input(0));
}
RegisterRepresentation rep =
RegisterRepresentationFor(phi->value_representation());
SetMap(phi, __ PendingLoopPhi(first_phi_input, rep));
}
Unrelated refactoring.
if (!node->is_used()) {
// TODO(dmercadier): we could make DCE to remove such unused nodes
// earlier.
return maglev::ProcessResult::kContinue;
}You removed the code that removes it in DCE. I guess it should still be valid?
if (size == 0) {
// TODO(dmercadier): DCE could also remove these nodes.
return maglev::ProcessResult::kContinue;
}Same here.
if (loop->has_phi()) {
for (maglev::Phi* maglev_phi : *loop->phis()) {
// Note that we've already emitted the backedge Goto, which means that
// we're currently not in a block, which means that we need to pass
// can_be_invalid=false to `Map`, otherwise it will think that we're
// currently emitting unreachable operations and return
// OpIndex::Invalid().
constexpr bool kIndexCanBeInvalid = false;
OpIndex phi_index = Map(maglev_phi, kIndexCanBeInvalid);
PendingLoopPhiOp& pending_phi =
__ output_graph().Get(phi_index).Cast<PendingLoopPhiOp>();
__ output_graph().Replace<PhiOp>(
phi_index,
base::VectorOf(
{pending_phi.first(),
Map(maglev_phi -> backedge_input(), kIndexCanBeInvalid)}),
pending_phi.rep);
}Not need to do this here...
// TODO: re-enable this DCHECK (it requires running DCE to remove unused
// AllocationBlock)
// DCHECK_IMPLIES(result == maglev::ProcessResult::kContinue &&
// !GraphBuildingNodeProcessor::Asm()
// .generating_unreachable_operations() &&
// maglev::IsValueNode(node->opcode()),
// IsMapped(node));You removed the code in DCE, why?
processor(AnyUseMarkingProcessor{true});nit: you could create a constructor with default arg.
processor(DeadNodeSweepingProcessor{true},
LiveRangeAndNextUseProcessor{compilation_info, graph,Same nit here.
return !is_maglev_ || phase_ != MaglevPhase::kAnyUseMarking;Did you negate this? Was it wrong before?
So in turbolev we can print sweepable dead phis in any use marking, but not in maglev?
BlockPostProcessResult post_action =
node_processor_.PostProcessBasicBlock(block);
USE(post_action);
DCHECK_EQ(post_action, BlockPostProcessResult::kContinue);
return;We didn't need to do this post action before?
maybe we can have a "abort" hook instead? ProcessAbortedBasicBlock(...) or something? Wdyt?
node_index_(node_index) {}Why do you removed
```
DCHECK_IMPLIES(node_index != kNoNodeIndex, node_index >= 0);
```
?
enum class BlockPreProcessResult {
kContinue, // Process exited normally.
kSkip, // Skip processing this blockand and do not call the following
// processors.
};
enum class BlockPostProcessResult {
kContinue, // Process exited normally.
kRevisitLoop, // Revisit the current loop, assuming that the current block
// ends with a JumpLoop (no MultiProcessor support).
};Nit: I would prefer kRevisitLoop inside BlockProcessResult to avoid type polution.
We can DCHECK inside the processor. Similar to ProcessResult, not all return values are valid in all situation.
kRevisitLoop is even already special, since it must be the result from a block ending with JumpLoop.
int block_id(const BasicBlock* target) {We should have a better name for this. We already have a block id.
blocks_offset? not great, idk...
// if (!run_escape_analysis_) return ProcessResult::kContinue;Delete.
if (!run_escape_analysis_) return ProcessResult::kContinue;Do you need this? we could remove unused AllocationBlocks here.
return BlockPostProcessResult::kRevisitLoop;Have you check the cost (in compilation in the entire JS3) by just running the processor twice?
// Constant for simplicity.
ValueNode* dummy_cst;
switch (node->value_representation()) {
case ValueRepresentation::kTagged:
dummy_cst = data_.graph->GetSmiConstant(0);
break;
case ValueRepresentation::kInt32:
dummy_cst = data_.graph->GetInt32Constant(0);
break;
case ValueRepresentation::kUint32:
dummy_cst = data_.graph->GetUint32Constant(0);
break;
case ValueRepresentation::kFloat64:
dummy_cst = data_.graph->GetFloat64Constant(0);
break;
case ValueRepresentation::kHoleyFloat64:
dummy_cst = data_.graph->GetFloat64Constant(0);
break;
case ValueRepresentation::kIntPtr:
dummy_cst = data_.graph->GetIntPtrConstant(0);
break;
case ValueRepresentation::kRawPtr:
case ValueRepresentation::kNone:
UNREACHABLE();
}
node->OverwriteWithIdentityTo(dummy_cst);Maybe we should have an Unreachable value node?
You can also return kTruncate (+ maybe some massages).
// This phase will update some deopt frames, thus rendering the recorded deopt
// frames in the graph stale. We thus clear them now.
// Note that we don't repopulate eager_deopt_top_frames_ and
// lazy_deopt_top_frames_ in the graph since the Turbolev pipeline with this
// escape analysis will never read this again. In particular,
// UnwrapDeoptFrames (which is the main place where those frames are read)
// doesn't handle recursive (and mutually recursive) objects, and adding
// support for this would add a significant cost with no obvious benefits, so
// we just avoid calling UnwrapDeoptFrames.
data_.graph->ClearRecordedFrames();The main idea of the frames were to unwrap identities and conversion nodes in the frames.
One could walk the graph and visit the frames in the graphs, but since nodes share frames, we would walk the frames too many times. The recorded frames avoided that.
I don't think the escape analysis solve the unwrapping issue, so we might still need these frames (or walk the entire graph which is more expensive).
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Thanks for the initial comments; I cleaned up :)
(I've also cleared the CCs, because of a wrong rebase a bunch of folks got auto-cced)
void ThisStartNewSnapshot(base::Vector<const Snapshot> predecessors,What's the difference from this and the StartNewSnapshot above?
Whups, left over from debugging, sorry.
if (maglev_block->has_phi()) {
DCHECK(!maglev_block->is_exception_handler_block());
ComputePredecessorPermutations(maglev_block, turboshaft_block, false,
false);
}This change seems unrelated to this CL, maybe do it in another?
Done
bool ignore_last_predecessor) {
predecessor_permutation_.clear();
// This function is only called for loops that need a "single block
// predecessor" (from EmitLoopSinglePredecessorBlock). The backedge should
// always be skipped in thus cases. Additionally, this means that when
// even when {maglev_block} is a loop, {turboshaft_block} shouldn't and
// should instead be the new single forward predecessor of the loop.
DCHECK_EQ(skip_backedge, maglev_block->is_loop());
DCHECK(!turboshaft_block->IsLoop());
if (maglev_block->predecessor_count() == 1) {
// This block has a single predecessor ==> trivial.
predecessor_permutation_.push_back(0);
return;
}
DCHECK(maglev_block->has_state());Is this related to the current CL?
Not in its current state; reverted.
void MakeLoopPhi(maglev::ValueNode* phi,Darius MercadierUnrelated refactoring.
Whups sorry, again because of this other Phi kind I added at some point and them removed :/
if (!node->is_used()) {
// TODO(dmercadier): we could make DCE to remove such unused nodes
// earlier.
return maglev::ProcessResult::kContinue;
}You removed the code that removes it in DCE. I guess it should still be valid?
You're right, I guess that for some reason the DCE code didn't work in my early prototypes, but it works now. I've re-enabled it and removed the bailouts here.
if (size == 0) {
// TODO(dmercadier): DCE could also remove these nodes.
return maglev::ProcessResult::kContinue;
}Darius MercadierSame here.
Done
if (loop->has_phi()) {
for (maglev::Phi* maglev_phi : *loop->phis()) {
// Note that we've already emitted the backedge Goto, which means that
// we're currently not in a block, which means that we need to pass
// can_be_invalid=false to `Map`, otherwise it will think that we're
// currently emitting unreachable operations and return
// OpIndex::Invalid().
constexpr bool kIndexCanBeInvalid = false;
OpIndex phi_index = Map(maglev_phi, kIndexCanBeInvalid);
PendingLoopPhiOp& pending_phi =
__ output_graph().Get(phi_index).Cast<PendingLoopPhiOp>();
__ output_graph().Replace<PhiOp>(
phi_index,
base::VectorOf(
{pending_phi.first(),
Map(maglev_phi -> backedge_input(), kIndexCanBeInvalid)}),
pending_phi.rep);
}Not need to do this here...
Yea sorry about that; leftovers from a previous version that was adding a new kind of Phis :/
// TODO: re-enable this DCHECK (it requires running DCE to remove unused
// AllocationBlock)
// DCHECK_IMPLIES(result == maglev::ProcessResult::kContinue &&
// !GraphBuildingNodeProcessor::Asm()
// .generating_unreachable_operations() &&
// maglev::IsValueNode(node->opcode()),
// IsMapped(node));You removed the code in DCE, why?
Done
nit: you could create a constructor with default arg.
Done
processor(DeadNodeSweepingProcessor{true},
LiveRangeAndNextUseProcessor{compilation_info, graph,Darius MercadierSame nit here.
Done
return !is_maglev_ || phase_ != MaglevPhase::kAnyUseMarking;Did you negate this? Was it wrong before?
So in turbolev we can print sweepable dead phis in any use marking, but not in maglev?
Yes it was wrong before. I could also move this to another CL.
So in turbolev we can print sweepable dead phis in any use marking, but not in maglev?
Yes, because the issue in Maglev is that a dead phi can have regalloc data for some of its inputs but not the other ones, and the printer doesn't like that. In Turbolev, we never allocate regalloc data, so this isn't an issue.
BlockPostProcessResult post_action =
node_processor_.PostProcessBasicBlock(block);
USE(post_action);
DCHECK_EQ(post_action, BlockPostProcessResult::kContinue);
return;We didn't need to do this post action before?
maybe we can have a "abort" hook instead? ProcessAbortedBasicBlock(...) or something? Wdyt?
Ah actually I've double-checked and this isn't needed anymore.
The reasoning behind this post action is that we needed it a some point to close the current SnapshotTable snapshot, but now the escape analysis never returns `kAbort`, so we don't need to bother here. I've reverted this (here and later in the file).
Why do you removed
```
DCHECK_IMPLIES(node_index != kNoNodeIndex, node_index >= 0);
```
?
Good question; reverted.
enum class BlockPreProcessResult {
kContinue, // Process exited normally.
kSkip, // Skip processing this blockand and do not call the following
// processors.
};
enum class BlockPostProcessResult {
kContinue, // Process exited normally.
kRevisitLoop, // Revisit the current loop, assuming that the current block
// ends with a JumpLoop (no MultiProcessor support).
};Nit: I would prefer kRevisitLoop inside BlockProcessResult to avoid type polution.
We can DCHECK inside the processor. Similar to ProcessResult, not all return values are valid in all situation.
kRevisitLoop is even already special, since it must be the result from a block ending with JumpLoop.
Ok sounds good, done.
We should have a better name for this. We already have a block id.
blocks_offset? not great, idk...
Good point. I went with block_offset...
// if (!run_escape_analysis_) return ProcessResult::kContinue;Darius MercadierDelete.
Done
if (!run_escape_analysis_) return ProcessResult::kContinue;Do you need this? we could remove unused AllocationBlocks here.
Yea not sure why I disabled this; I'm guessing that this was breaking an earlier version of the CL; now it works.
return BlockPostProcessResult::kRevisitLoop;Have you check the cost (in compilation in the entire JS3) by just running the processor twice?
Will do.
// Constant for simplicity.I was a bit afraid of the kTruncate, but maybe it could simply work. If that's ok I'll do this in a follow up; there is a TODO above already.
And if kTruncate doesn't work, the Unreachable makes sense so that a subsequent processor can detect this and take care of the truncation indeed.
// This phase will update some deopt frames, thus rendering the recorded deopt
// frames in the graph stale. We thus clear them now.
// Note that we don't repopulate eager_deopt_top_frames_ and
// lazy_deopt_top_frames_ in the graph since the Turbolev pipeline with this
// escape analysis will never read this again. In particular,
// UnwrapDeoptFrames (which is the main place where those frames are read)
// doesn't handle recursive (and mutually recursive) objects, and adding
// support for this would add a significant cost with no obvious benefits, so
// we just avoid calling UnwrapDeoptFrames.
data_.graph->ClearRecordedFrames();The main idea of the frames were to unwrap identities and conversion nodes in the frames.
One could walk the graph and visit the frames in the graphs, but since nodes share frames, we would walk the frames too many times. The recorded frames avoided that.I don't think the escape analysis solve the unwrapping issue, so we might still need these frames (or walk the entire graph which is more expensive).
Ok, I could re-add this, but then we'll need UnwrapDeoptFrames to handle recursive objects... Do you want me to add a TODO?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |