Has someone looked at not checkpointing prove facts for dominated values

94 views
Skip to first unread message

Jorropo

unread,
Jun 7, 2026, 1:46:56 AM (2 days ago) Jun 7
to golang-dev
Hello, when prove scans a block (case descend:) it finds facts about values inside the current block (shocker I know).

However we push these facts onto the various revert stacks, which doesn't make sense.
Since if a fact for v23 is learned at block b4 and v23 is dominated by b4 (like by being inside it).
The fact will be true for all users of v23 so we don't need to revert it*.

Has anyone looked into implementing this already ?
We could save these forever facts across passes allowing opt rules to use prove information.
It should also make prove faster and use less memory but I don't know if we could measure that.

*I haven't checked if that is true for orderS and orderU, maybe if we have b1 → b2, b3. Having fact about b2 values when descending b3 could create wrong facts about b1 from transitive properties between b1 and b2's values, altho I feel this should be solvable by putting the transitively learned facts about b1 values in the revert stack even if the original b2 facts aren't.

In case the idea is unclear here is a really Q&D implementation of the idea for the limitStack:
diff --git a/src/cmd/compile/internal/ssa/fuse_branchredirect.go b/src/cmd/compile/internal/ssa/fuse_branchredirect.go
index 153c2a56b7..a431c0ece9 100644
--- a/src/cmd/compile/internal/ssa/fuse_branchredirect.go
+++ b/src/cmd/compile/internal/ssa/fuse_branchredirect.go
@@ -28,7 +28,7 @@ package ssa
// contradiction is found then redirect p to another successor of b.
func fuseBranchRedirect(f *Func) bool {
ft := newFactsTable(f)
- ft.checkpoint()
+ ft.checkpoint(nil)
changed := false
for i := len(f.Blocks) - 1; i >= 0; i-- {
@@ -54,13 +54,13 @@ func fuseBranchRedirect(f *Func) bool {
if pk.i == 1 {
pbranch = negative
}
- ft.checkpoint()
+ ft.checkpoint(nil)
// Assume branch p->b is taken.
addBranchRestrictions(ft, p, pbranch)
// Check if any outgoing branch is unreachable based on the above condition.
parent := b
for j, bbranch := range [...]branch{positive, negative} {
- ft.checkpoint()
+ ft.checkpoint(nil)
// Try to update relationship b->child, and check if the contradiction occurs.
addBranchRestrictions(ft, parent, bbranch)
unsat := ft.unsat
diff --git a/src/cmd/compile/internal/ssa/prove.go b/src/cmd/compile/internal/ssa/prove.go
index 6f21432711..8e154923bb 100644
--- a/src/cmd/compile/internal/ssa/prove.go
+++ b/src/cmd/compile/internal/ssa/prove.go
@@ -507,6 +507,8 @@ type ordering struct {
// by the facts table is effective for real code while remaining very
// efficient.
type factsTable struct {
+ currentlyDescending *Block
+
// unsat is true if facts contains a contradiction.
//
// Note that the factsTable logic is incomplete, so if unsat
@@ -677,8 +679,10 @@ func (ft *factsTable) newLimit(v *Value, newLim limit) {
ft.recurseCheck[v.ID] = false
}()
- // Record undo information.
- ft.limitStack = append(ft.limitStack, limitFact{v.ID, oldLim})
+ if v.Block != ft.currentlyDescending {
+ // Record undo information.
+ ft.limitStack = append(ft.limitStack, limitFact{v.ID, oldLim})
+ }
// Record new information.
ft.limits[v.ID] = lim
if v.Block.Func.pass.debug > 2 {
@@ -1264,7 +1268,11 @@ func (ft *factsTable) isNonNegative(v *Value) bool {
// checkpoint saves the current state of known relations.
// Called when descending on a branch.
-func (ft *factsTable) checkpoint() {
+// descending is used to ignore checkpointing facts learned
+// in the current block as they will be always be true in
+// dominated users. Can be nil.
+func (ft *factsTable) checkpoint(descending *Block) {
+ ft.currentlyDescending = descending
if ft.unsat {
ft.unsatDepth++
}
@@ -1598,7 +1606,7 @@ func prove(f *Func) {
}
ft := newFactsTable(f)
- ft.checkpoint()
+ ft.checkpoint(nil)
// Find length and capacity ops.
for _, b := range f.Blocks {
@@ -1673,7 +1681,7 @@ func prove(f *Func) {
switch node.state {
case descend:
- ft.checkpoint()
+ ft.checkpoint(node.block)
// Entering the block, add facts about the induction variable
// that is bound to this block.
@@ -3031,7 +3039,7 @@ func simplifyBlock(sdom SparseTree, ft *factsTable, b *Block) {
}
// For edges to other blocks, this can trim a branch
// even if we couldn't get rid of the child itself.
- ft.checkpoint()
+ ft.checkpoint(nil)
addBranchRestrictions(ft, parent, branch)
unsat := ft.unsat
ft.restore()

Other ones are trickier to do since they store data in more complex data structures.

Jorropo

unread,
Jun 7, 2026, 1:55:23 AM (2 days ago) Jun 7
to golang-dev
The Q&D patch above alone yields to more code being removed:
file before after Δ %
syscall.s 80806 80682 -124 -0.153%
time.s 89469 89437 -32 -0.036%
vendor/golang.org/x/crypto/chacha20.s 5354 5316 -38 -0.710%
debug/gosym.s 30679 30664 -15 -0.049%
syscall [cmd/compile].s 80806 80682 -124 -0.153%
time [cmd/compile].s 89469 89437 -32 -0.036%
cmd/vendor/golang.org/x/arch/riscv64/riscv64asm.s 64429 64109 -320 -0.497%
cmd/link/internal/dwtest.s 5222 5038 -184 -3.524%
cmd/vendor/golang.org/x/arch/x86/x86asm.s 90202 90106 -96 -0.106%
cmd/vendor/golang.org/x/arch/ppc64/ppc64asm.s 202252 202060 -192 -0.095%
cmd/link/internal/ld.s 645536 645531 -5 -0.001%
cmd/compile/internal/ssa.s 4828630 4828992 +362 +0.007% ; bigger since I've added more code
cmd/compile/internal/ssa [cmd/compile].s 4975530 4975851 +321 +0.006% : same
total

I guess this is because it allows the OpPhi intersection code to do a better job:
case OpPhi:
// Compute the union of all the input phis.
// Often this will convey no information, because the block
// is not dominated by its predecessors and hence the
// phi arguments might not have been processed yet. But if
// the values are declared earlier, it may help. e.g., for
// v = phi(c3, c5)
// where c3 = OpConst [3] and c5 = OpConst [5] are
// defined in the entry block, we can derive [3,5]
// as the limit for v.
l := ft.limits[v.Args[0].ID]
for _, a := range v.Args[1:] {
l2 := ft.limits[a.ID]
l.min = min(l.min, l2.min)
l.max = max(l.max, l2.max)
l.umin = min(l.umin, l2.umin)
l.umax = max(l.umax, l2.umax)
}
ft.newLimit(v, l)

Jorropo

unread,
Jun 7, 2026, 5:03:46 AM (2 days ago) Jun 7
to golang-dev
NVM the Q&D patch is broken, it removes some IsSliceInBounds that are definitely needed.

Keith Randall

unread,
Jun 8, 2026, 4:17:00 PM (20 hours ago) Jun 8
to Jorropo, golang-dev
Maybe this breaks for Phis? There are uses of v outside the dominator subtree rooted at v.Block, namely phi args for edges going from within that dominator subtree to outside of it.
Not sure how exactly that would break things. The constraints of v at the end of v.Block that we kept would always be conservative?

I do worry about the value-value relations. Maybe there is something like issue 67625 going on?

It would be interesting to test by removing the constraint propagation through Phis. Then what you are doing shouldn't make any difference to the generated code. It would just be a potential compiler speed benefit. If you're seeing generated code differences, it might point to what is going wrong.




--
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/golang-dev/CAHWihb_rjFOaRsM1%3DnaV0ggPuP4C2Lf4ow0s0giJJ491SROAJw%40mail.gmail.com.

Jorropo

unread,
Jun 8, 2026, 9:43:36 PM (15 hours ago) Jun 8
to kei...@alum.mit.edu, golang-dev
I always think that phi uses are at the end of all block's preds which would be dominated so that should be fine ?

if I have:
  x = a | 1
  if b {
    x = x << 1
  } else {
    x = x | 2
  }
  ; at this point x's phi would do intersection between umin: 2 & umin: 3 and should return umin: 2 even it's not dominated

I didn't investigated the details, but the misscompilation of the Q&D patch is in code without any phis.
Reply all
Reply to author
Forward
0 new messages