There's no backtracking in RETE-based production systems (RETE itself is just the matching algorithm), so on each cycle conflict resolution chooses the single "best" rule to fire. I think that if all applicable rules were fired, the amount of data in working memory could grow exponentially. Also, it's difficult enough to understand the behaviour when single-firing; if all were fired, it would be even more difficult to understand. (And some rule firings would invalidate other matches, so firing everything could result in inconsistencies?)
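To make the single-firing cycle concrete, here is a minimal sketch in Python. It is not how OPS5 or any real RETE engine is implemented: the rule names, the facts, and the "highest priority wins" strategy are all made up for illustration, and the matching is a naive re-scan rather than a RETE network.

```python
# Hypothetical toy production system. Working memory is a set of facts;
# each rule has a name, a priority, a condition ("when") and an action ("then").

def run(rules, facts, max_cycles=100):
    """Recognize-act cycle: match, pick ONE rule, fire it, then re-match."""
    facts = set(facts)
    fired = []
    for _ in range(max_cycles):
        # Match: the conflict set is every rule whose condition holds right now.
        conflict_set = [r for r in rules if r["when"](facts)]
        if not conflict_set:
            break
        # Conflict resolution: highest priority wins (an assumed strategy).
        best = max(conflict_set, key=lambda r: r["priority"])
        # Act: the chosen rule may add or retract facts; there is no undo.
        best["then"](facts)
        fired.append(best["name"])
    return fired, facts

# Two rules that match the same fact but reach incompatible conclusions.
rules = [
    {"name": "use-large-cabinet", "priority": 2,
     "when": lambda f: "needs-cabinet" in f,
     "then": lambda f: (f.discard("needs-cabinet"), f.add("large-cabinet"))},
    {"name": "use-small-cabinet", "priority": 1,
     "when": lambda f: "needs-cabinet" in f,
     "then": lambda f: (f.discard("needs-cabinet"), f.add("small-cabinet"))},
]

fired, facts = run(rules, {"needs-cabinet"})
print(fired, facts)
```

Both rules match at first, but only use-large-cabinet fires; it retracts the fact that use-small-cabinet depended on, so that match disappears on the next cycle. If both had been fired against the same snapshot, working memory would end up with both cabinets, which is the kind of inconsistency I mean.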
For the R1/XCON configurator, I suspect that a constraint-based design would be more effective (though I don't think that constraint solving was very advanced back then). For other applications, something like Linda, with its shared tuple space, is probably cleaner.
But others are probably more knowledgeable on this than I am. (And backward-chaining has sufficed for most things that I've worked on.)
- peter