Commit-Queue | +1 |
Hey Sam: I'm sorry I haven't managed to review this CL yet. Will take a look tomorrow!
Okay, thanks! And just for context, it now seems fast enough to use at runtime and I'm seeing ~1.5% geomean improvement on my personal suite.
Code-Review | +1 |
LGTM except for a few nits
```
std::optional<base::RandomNumberGenerator>(v8_flags.random_seed)) {}
```
nit: I don't think we need this; we can just initialize it with the generator directly.
```
auto it = waiting_.begin();
while (it != waiting_.end()) {
  if (cycle >= (*it)->start_cycle()) {
    ready_.push_back(*it);
    it = waiting_.erase(it);
  } else {
    ++it;
  }
}
```
nit: Do you think it would make sense to write this in terms of standard algorithms? E.g.
```
auto IsReady = [cycle](ScheduleGraphNode* n) { return cycle >= n->start_cycle(); };
auto it = std::partition(waiting_.begin(), waiting_.end(), IsReady);
ready_.insert(ready_.end(), waiting_.begin(), it);
waiting_.erase(waiting_.begin(), it);
```
I'm undecided which is the better choice.
```
int max_latency = 0;
auto best_candidate = ready_.begin();
for (auto it = ready_.begin(); it != ready_.end(); ++it) {
  if ((*it)->total_latency() > max_latency) {
    max_latency = (*it)->total_latency();
    best_candidate = it;
  }
}
```
nit:
```
auto best_candidate = std::max_element(ready_.begin(), ready_.end(),
    [](auto l, auto r) { return l->total_latency() < r->total_latency(); });
```
```
std::optional<base::RandomNumberGenerator>(v8_flags.random_seed)) {}
```
nit: I don't think we need this; we can just initialize it with the generator directly.
Done
```
auto it = waiting_.begin();
while (it != waiting_.end()) {
  if (cycle >= (*it)->start_cycle()) {
    ready_.push_back(*it);
    it = waiting_.erase(it);
  } else {
    ++it;
  }
}
```
nit: Do you think it would make sense to write this in terms of standard algorithms? E.g.
```
auto IsReady = [cycle](ScheduleGraphNode* n) { return cycle >= n->start_cycle(); };
auto it = std::partition(waiting_.begin(), waiting_.end(), IsReady);
ready_.insert(ready_.end(), waiting_.begin(), it);
waiting_.erase(waiting_.begin(), it);
```
I'm undecided which is the better choice.
Considering the pain I always encounter with iterators, this SGTM!
```
int max_latency = 0;
auto best_candidate = ready_.begin();
for (auto it = ready_.begin(); it != ready_.end(); ++it) {
  if ((*it)->total_latency() > max_latency) {
    max_latency = (*it)->total_latency();
    best_candidate = it;
  }
}
```
nit:
```
auto best_candidate = std::max_element(ready_.begin(), ready_.end(),
    [](auto l, auto r) { return l->total_latency() < r->total_latency(); });
```
Thanks.
2 is the latest approved patch-set.
The change was submitted with unreviewed changes in the following files:
```
The name of the file: src/compiler/backend/instruction-scheduler.cc
Insertions: 10, Deletions: 18.
The diff is too large to show. Please review the diff.
```
[compiler] Make the scheduler faster
Fix up the instruction scheduler to be (~an order of magnitude) faster.
This is achieved in several ways, mainly changing data structures:
- Split nodes into two lists: waiting and ready.
- We don't keep waiting or ready sorted; we just iterate through the
  ready list to find the longest latency.
- Use a ZoneUnorderedMap, instead of a ZoneMap, for operands_map.
- Use a SmallZoneVector, instead of a ZoneDeque, for successors, which
  reduces the overhead of resizing.
- Reserve enough space when constructing the graph.
Other structural changes are:
- Cache the result from GetInstructionFlags, as this was taking a
  significant amount of time after being called many times for each
  instruction.
- We no longer add the terminator to the graph, as this caused far too
  many unnecessary edges. We want the terminator to be the last
  instruction, so we simply append it to the almost-final sequence.
- The two SchedulingQueues have been combined and we just construct one
per scheduler, not per call to Schedule.