The issue is not so much scaling as the fact that, with shape(A) = (600, 608), the system is underdetermined, so there is no reason why the Mosek solution and your favourite solution should coincide. The linear system Ax = b has an 8-dimensional solution space. There isn't even a reason why x_p[:8] and xTrue should be the same unless you fix x_p[:8]:
problem = cp.Problem(objective=cost, constraints=[xdiag==xTrue/scale])
x_p = np.linalg.pinv(np.vstack([A, np.eye(8,608)])) @ (np.hstack([b/scale, xTrue/scale]))
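To see the degeneracy concretely, here is a minimal numpy sketch with synthetic A, b, and xTrue standing in for yours (the shapes match your problem, everything else is made up): the minimum-norm solution is feasible but its first 8 entries generically differ from xTrue, while pinning x[:8] as above removes the 8-dimensional freedom.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((600, 608))   # 600 equations, 608 unknowns
x_full = rng.standard_normal(608)
xTrue = x_full[:8]                    # the 8 entries we compare against
b = A @ x_full

# Minimum-norm solution: zero residual, hence "optimal", but x_mn[:8] != xTrue.
x_mn = np.linalg.pinv(A) @ b
print(np.allclose(A @ x_mn, b))       # True
print(np.allclose(x_mn[:8], xTrue))   # False (generically)

# Stacking the constraint x[:8] == xTrue makes the system square and
# (generically) uniquely solvable, so xTrue is recovered.
x_p = np.linalg.pinv(np.vstack([A, np.eye(8, 608)])) @ np.hstack([b, xTrue])
print(np.allclose(x_p[:8], xTrue))    # True
```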
You probably want to add the R(x) constraints to get a nondegenerate problem and test with that.
In all cases I tried, scaled or not, the solution produced by the solver satisfies np.linalg.norm(A...@x.value - b) == 0, so it is optimal.
Without scaling you are asking the solver to handle a problem where b is of order 1e-23 and the expected x is of similar order. That is a bad idea numerically. You most likely want to use b/scale, in which case your solution will also be rescaled by the same factor.
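The rescaling pattern looks like this; again a sketch with synthetic data, where scale = 1e-23 is the assumed magnitude of your b:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((600, 608))
scale = 1e-23
b = scale * (A @ rng.standard_normal(608))   # right-hand side of order 1e-23

# Solve the well-scaled problem A x' = b/scale, then undo the scaling.
x_scaled = np.linalg.lstsq(A, b / scale, rcond=None)[0]
x = scale * x_scaled                          # solution of the original system
```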
Reading x.value after x has participated in two different cvxpy problems is suspicious. To be safe, I would never let the same cp.Variable() participate in two different cp.Problem()s.
Best,
Michal