DCM and quaternion representations are entirely equivalent. You can transform back and forth. There is no difference in accuracy or transient response.
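To illustrate the equivalence, here is a minimal sketch of the conversion in both directions. It is not taken from MatrixPilot; the function names `quat_to_dcm` and `dcm_to_quat` are hypothetical, and it assumes a unit quaternion stored scalar-first as (w, x, y, z) and a DCM stored as a 3x3 list of rows.

```python
import math

def quat_to_dcm(q):
    """Unit quaternion (w, x, y, z), scalar first (assumed), to a 3x3 DCM."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def dcm_to_quat(r):
    """3x3 DCM to a unit quaternion (w, x, y, z).

    Branches on the largest of w, x, y, z so the divisor s is never small.
    q and -q describe the same rotation; this returns one of the two.
    """
    trace = r[0][0] + r[1][1] + r[2][2]
    if trace > 0:
        s = 2.0 * math.sqrt(1.0 + trace)  # s = 4w
        w = 0.25 * s
        x = (r[2][1] - r[1][2]) / s
        y = (r[0][2] - r[2][0]) / s
        z = (r[1][0] - r[0][1]) / s
    elif r[0][0] > r[1][1] and r[0][0] > r[2][2]:
        s = 2.0 * math.sqrt(1.0 + r[0][0] - r[1][1] - r[2][2])  # s = 4x
        w = (r[2][1] - r[1][2]) / s
        x = 0.25 * s
        y = (r[0][1] + r[1][0]) / s
        z = (r[0][2] + r[2][0]) / s
    elif r[1][1] > r[2][2]:
        s = 2.0 * math.sqrt(1.0 + r[1][1] - r[0][0] - r[2][2])  # s = 4y
        w = (r[0][2] - r[2][0]) / s
        x = (r[0][1] + r[1][0]) / s
        y = 0.25 * s
        z = (r[1][2] + r[2][1]) / s
    else:
        s = 2.0 * math.sqrt(1.0 + r[2][2] - r[0][0] - r[1][1])  # s = 4z
        w = (r[1][0] - r[0][1]) / s
        x = (r[0][2] + r[2][0]) / s
        y = (r[1][2] + r[2][1]) / s
        z = 0.25 * s
    return (w, x, y, z)
```

Converting a unit quaternion to a DCM and back recovers the same rotation up to floating-point rounding (and up to the overall sign of the quaternion), which is the sense in which nothing is lost going either way.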
In the present version of MatrixPilot, I actually use both representations. In my opinion, DCM is the more convenient representation for most of the computations we do in MatrixPilot, but there are a couple of places where quaternions are more convenient, so I do the conversion there.
Computing quaternions is a little more efficient than computing DCM, but in MatrixPilot the DCM computation uses less than 1% of the CPU. The largest CPU loads are from communications and A/D conversion.
As a minor point, I note that the renormalization process needs neither a divide nor a square root, for either DCM or quaternions. You can use a Taylor expansion instead.
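One way to sketch the Taylor expansion idea: the exact scale factor is 1/sqrt(v·v), and the first-order expansion of 1/sqrt(x) about x = 1 is (3 - x)/2, so a vector that is already close to unit length can be rescaled with multiplies and a subtract only. The function name `renormalize` is hypothetical, and this assumes the norm has drifted only slightly between updates, which is the usual situation when renormalizing every step.

```python
def renormalize(v):
    """Rescale a nearly-unit vector toward unit length.

    Uses 1/sqrt(x) ~= (3 - x)/2 near x = 1, so no divide or square root
    is needed. Works for a quaternion (4 elements) or a DCM row (3 elements).
    """
    norm_sq = sum(c * c for c in v)   # v . v, close to 1 by assumption
    scale = 0.5 * (3.0 - norm_sq)     # first-order Taylor approximation
    return tuple(c * scale for c in v)
```

If the length is off by a small amount d before the call, it is off by roughly 1.5 d squared afterward, so a drift of 0.1% shrinks to about a part per million in one pass.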
Also, when I first started out, I was careful not to use divide or square root very much in MatrixPilot, because divide takes 17 CPU cycles, and square root takes a lot more than that. But we found that the architecture of MatrixPilot and the PIC that we are using are very efficient with regard to CPU loading. The single CPU on the UDB4 is currently at about 10% loading, and it is doing the work of the 3 CPUs in the latest version of the Ardu system.
Best regards,
Bill Premerlani