On profiling my code I found it spent a lot of time in subs, particularly the subs call at the end of linsolve where it substitutes in your given free variables for the dummy parameters from gauss_jordan_solve. If I replace subs with xreplace I get a 10x speedup in the substitution process (250x without fastcache!).
Are there any possible pitfalls in doing this? My understanding is the advantage of subs over xreplace is that it is mathematically aware, so (x**4).subs(x**2, y) will yield y**2 for example, while the same would not work for xreplace. But if one is always replacing just a free symbol, such as replacing the dummy symbol in linsolve, surely the two are equivalent? If so, should linsolve just use xreplace?
On the subject, why does the matrix class not have an xreplace method which applies xreplace to its elements, like it has for subs?