Phase state bruteforcing for rewrite rules validation

167 views
Skip to first unread message

Jorropo

unread,
Oct 28, 2025, 2:55:38 PM (9 days ago) Oct 28
to golang-dev
Hello, I would like any opinion on a new testing idea for .rules dsl.

Backstory:
My recent topo-sort prove adventure got me thinking.

I spawned 3 issues about the exact same bug in 3 different unrelated parts of the codebase:
cmd/compile: amd64 CMOV* lowering rules depend on the values schedule
cmd/compile: ppc64 MADDLD lowering rules depend on the values schedule

all of this because I dared to change a meaningless thing (schedule of values before the schedule pass).
The previous ordering wasn't thought through by anyone or good in any metric since it doesn't matter.
However ossification later, tests come to rely on it.

Idea:
In classic mode (what exists today) rewrite goes top down in the function and finds the first matching rule (again going down rule by rule).
Instead there would be a test mode where it finds all matching rules and executes all of them on it's own clone of the function.
Then repeat until no more rules can be applied.

This means rather than rewriting a single first match win path.
We would explore the graph of all possible rewrites.

The test would be considered to pass if there is a single definite end state no matter the path you take in the graph (excluding cycles).

Implementation wise it would be extremely slow,
we are gonna need a transposition table.
And I hope it wouldn't blow up exponentially that it couldn't even be used to compile std and testdir.

This would enforce a complete impossibility to rely on rules ordering (since all orderings would be tested) or schedule (since all values / blocks orderings would be tested)*.
*for the sample test code compiled under test mode.

Keith Randall

unread,
Oct 28, 2025, 4:05:18 PM (9 days ago) Oct 28
to Jorropo, golang-dev
Yes please.

I've always wanted a mode to ensure that rule ordering doesn't matter. I guess the value ordering problem you were running into has a very similar flavor. Maybe they are not exactly the same though? Not sure.

A stochastic approach might be fine. A mode that clones the function 10 times, randomizes rule & value order, applies the rules, and checks that all 10 results are identical. Maybe the ssacheck builder can do it.

"identical" might not be trivial. Applying rules in different orders might change value numbering, value order, etc. We want some sort of graph equivalence.

--
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/golang-dev/CAHWihb_k6Fpi5EK8j38Zg1s3mc-PGCAb76EZo9bBkibEWGVc_A%40mail.gmail.com.

David Chase

unread,
Oct 28, 2025, 4:42:00 PM (9 days ago) Oct 28
to kei...@alum.mit.edu, Jorropo, golang-dev
Will things ever end up threaded differently on the mem value?  It seems like parameters and mem could be the base case for graph equivalence.

And also before we do full randomization, maybe just try running the whole thing backwards.  Could do it 4 ways, forward/backward on blocks, forward/backward on values in blocks.  At least do those first because any bugs we find w/o rearranging instructions and just matching "backwards" are probably easier to look at than fully random.

Jorropo

unread,
Oct 28, 2025, 5:56:05 PM (9 days ago) Oct 28
to David Chase, kei...@alum.mit.edu, golang-dev
I am not sure about the mem value.
my First thought is that no, since it seems a non rule-atomic break of the mem graph would always not work sometimes ¿
Maybe we have rules about disjointed stores that could cause give or take identical functions that are not topologically identical, this would have be handled by the test.

Idk about the orders,
I think it will be much easier to keep following the no sequencing rule once we fixed all the ones discovered during initial discovery.
I'll keep in mind the other directions if it makes the initial discovery more digestible.

Russ Cox

unread,
Nov 2, 2025, 12:38:06 AM (4 days ago) Nov 2
to Jorropo, David Chase, kei...@alum.mit.edu, golang-dev
Brute forcing only helps when paired with an input that trips over the ambiguity, and it may not be necessary. 

The only time the rule order can matter is when you have a structure (A ... B ...) where there is a rule that matches the whole expression (and rewrites away B) and also a rule that matches B. So to start with, you could have the generator look for one rule's pattern appearing inside another rule's pattern and then print an error about that. If that has false positives (it probably will), you will need to check whether there are conditions that exclude both from matching simultaneously. 

It really should be computable at generation time instead of resorting to brute force.

Best,
Russ

David Chase

unread,
Nov 3, 2025, 10:02:52 AM (3 days ago) Nov 3
to Russ Cox, Jorropo, kei...@alum.mit.edu, golang-dev
I think that this filter may also need to consider matches between subpatterns and rewrite results.  If you assume two top level patterns, more and less specific, the more specific one may only fire if a subtree rewrite first creates a match for the corresponding subpattern of the more-specific pattern.  Otherwise, less-specific fires, and the opportunity is lost.  This actually seems a little hard to fix.

(I worked on this sort of stuff N decades ago.)

Another option for doing this might be:
- scan the entire program for matching patterns, but only note their existence, do not rewrite them.
- process the program in a random ordering, but before applying a rewrite, rerun the pattern match on that node and see if the same rewrite still applies.
- report differences, which we then study.

I think this dodges the output-graph equivalence problem which is its own can of worms.

Russ Cox

unread,
Nov 3, 2025, 11:50:45 AM (3 days ago) Nov 3
to David Chase, Jorropo, kei...@alum.mit.edu, golang-dev
On Mon, Nov 3, 2025 at 10:02 AM David Chase <drc...@google.com> wrote:
I think that this filter may also need to consider matches between subpatterns and rewrite results.  If you assume two top level patterns, more and less specific, the more specific one may only fire if a subtree rewrite first creates a match for the corresponding subpattern of the more-specific pattern.  Otherwise, less-specific fires, and the opportunity is lost.  This actually seems a little hard to fix.

You're right. The pattern I ran into and had in mind was

(A (B)) -> (Fast)
(A x) -> (Slow)
(B) -> (C)

with input (A (B)) where the problem was that if (B) -> (C) triggered first, then the fast path wasn't taken.

What you're describing is the reverse:

(A (B)) -> (Fast)
(A x) -> (Slow)
(C) -> (B)

with input (A (C)) and the problem is that (C) -> (B) must trigger first in order for the fast path to be taken.

I still think it must be possible to analyze the rule sets without needing a triggering input. There must be papers about this.

Best,
Russ

Reply all
Reply to author
Forward
0 new messages