Now if only phase A is on and phase B is off, you have 1A of total current into the motor, and the torque is therefore lower. If we could put 1.414A into phase A we would have the same torque, but if we limit to 1A per phase, our torque curve ends up as the square shown above. We could instead limit the torque to a circle inside the square, so that at 45 degrees the current would be 0.707A per phase for a vector total of 1A.
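To put numbers on the square versus the circle, here is a minimal sketch. It assumes torque is proportional to the vector sum of the two phase currents; the function names and the `mode` parameter are made up for illustration and are not taken from the smart stepper firmware.

```python
import math

def phase_currents(angle_deg, i_max=1.0, mode="circle"):
    """Return (phase A, phase B) current in amps for an electrical angle.

    mode="square": the larger phase always runs at i_max, so the vector
                   tip traces the square.
    mode="circle": the vector total is held at i_max, so the tip traces
                   the circle inside the square.
    """
    theta = math.radians(angle_deg)
    a = math.cos(theta)
    b = math.sin(theta)
    if mode == "square":
        # Scale so the bigger of the two phases sits exactly at i_max.
        scale = i_max / max(abs(a), abs(b))
    else:
        scale = i_max
    return a * scale, b * scale

def vector_current(ia, ib):
    """Magnitude of the combined current vector (assumed proportional to torque)."""
    return math.hypot(ia, ib)

if __name__ == "__main__":
    for angle in (0, 45):
        for mode in ("square", "circle"):
            ia, ib = phase_currents(angle, mode=mode)
            print(angle, mode, round(ia, 3), round(ib, 3),
                  round(vector_current(ia, ib), 3))
```

At 45 degrees the square gives 1A in each phase for a vector total of 1.414A, and the circle gives 0.707A in each phase for a vector total of 1A. At 0 degrees both give 1A in phase A only.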
So as you can see, even if you have improved accuracy from the smart stepper, the torque during microstepping is lower. However, we have a few more tricks we can employ...
The maximum motor current of 1A is normally the maximum continuous current you can apply to the motor. The motors have windings of wire to generate the magnetic field, and these windings fail when they get too hot (either hot enough to melt the insulation, or hot enough to burn the wire in two). So if you apply more than 1A to a cold motor it will start heating up, and after some time it will get so hot that it melts the windings or insulation. This does not happen instantly; the heat has to build up before failure. What if you put 1.4A into the motor for a short period of time? If the time is short, the motor will most likely not fail, as it does not have time to build up enough heat to melt. Of course, at some amperage level it will heat up so quickly that it melts the windings almost instantly.
One thing I am working on with the smart steppers is to have a peak current limit (per phase) and also an average current limit (per phase). The average would be taken over some time period, say 100ms. The idea is that you can exceed the motor's continuous current limit for a brief period of time, but the average current must stay below the continuous current limit of the motor.
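As a rough sketch of how a peak limit plus an averaged limit could work together (this is not the firmware implementation, which is still in progress), the example below keeps a sliding window of recent current samples. The class name, the 1kHz sample rate implied by a 100-sample window for 100ms, and the choice to average the absolute current are all assumptions for illustration; averaging the squared current would track heating more closely.

```python
import math
from collections import deque

class PhaseCurrentLimiter:
    """Clamp one phase to a peak current and a windowed average current."""

    def __init__(self, peak_a, continuous_a, window_samples):
        self.peak_a = peak_a
        self.continuous_a = continuous_a
        # Start with an empty (all zero) history, i.e. a cold motor.
        self.history = deque([0.0] * window_samples, maxlen=window_samples)

    def limit(self, requested_a):
        """Return the current actually allowed for this sample."""
        # 1) Never exceed the peak limit.
        magnitude = min(abs(requested_a), self.peak_a)
        # 2) Work out how much current this sample may use so the window
        #    average stays at or below the continuous limit. The oldest
        #    sample is about to fall out of the window, so exclude it.
        used = sum(self.history) - self.history[0]
        budget = self.continuous_a * len(self.history) - used
        magnitude = max(0.0, min(magnitude, budget))
        self.history.append(magnitude)  # maxlen drops the oldest sample
        return math.copysign(magnitude, requested_a)

    def average(self):
        return sum(self.history) / len(self.history)

if __name__ == "__main__":
    limiter = PhaseCurrentLimiter(peak_a=1.4, continuous_a=1.0, window_samples=100)
    outputs = [limiter.limit(1.4) for _ in range(300)]
    print(outputs[0], max(outputs), limiter.average())
```

With these numbers a constant 1.4A request is passed through for a burst and then cut back once the window budget is spent, while a request at or below the continuous limit is never touched.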
For applications like 3D printers this would allow faster moves and more instantaneous torque, and hence better overall performance. However, for CNC milling machines, where you need continuous torque to drive the tool into the material, it will not be an advantage, as the current needs to be high continuously.
Since you need high torque on CNC mills, my recommendation is to design CNC mills with motors having large torque (NEMA 23 or 34) and not use the small NEMA 17. The reasoning is that if you need more torque, you can buy a bigger motor in the NEMA 23 or 34 form factor without making new mounting brackets.
Once you have the bigger motor I still recommend trying to match the full steps to the accuracy you need, especially if you are not concerned about speed. This is not because the smart steppers are inaccurate when microstepping, but because you will have the maximum torque and the most flexibility. That is, it is better to add the smart steppers and get better accuracy than required than it is to be on the edge of what you need.
For 3D printers, designers often count on microstepping for accuracy because they also want high-velocity moves. Speed is such a concern that the accuracy has to be made up by microstepping. Even here, however, the main advantage of the smart stepper is preventing missed steps, since 3D printers are such cost-sensitive designs that their accuracy is limited by other factors (backlash, belts, etc). One 24-hour print saved from missed steps can easily pay for the smart steppers. The missed steps in 3D printers often come from things such as a plastic blob on the print or dirt on the linear rails. These things are often not predictable, so having closed loop to deal with them makes sense.
On CNC mills most people start adding the smart steppers when they are trying to get every drop of performance out of the machine. That is, they are pushing the machine to limits where it can miss steps because it does not have the torque needed for the speed they are running. My personal experience is that when this starts happening, even the smart steppers will be just a temporary band-aid. Maybe the motor can recover if the mill hits a hard spot briefly, but if the motor is already at full torque there is little the smart steppers can do to get more power from the motor to try to correct the error. Here again it is better to design with bigger motors so you have excess torque; this way the smart steppers can increase the torque to deal with the errors. The other advantages of the smart stepper are lower motor heat and smoother movement, which can be significant as well.
As far as improving accuracy in microstepping using the smart steppers, imagine that one coil of your motor differs by 1% in inductance or resistance. Then when both coils are on at 1A, one will actually get more power than the other. In the diagram above the vector would not be at 45 degrees but would have some error. The smart stepper measures the actual shaft position and will change the currents until the angle is correct.
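To get a feel for the size of that error, the sketch below models the mismatch as a simple gain on one phase and computes the resulting field angle in electrical degrees. It also shows a toy correction loop that nudges the commanded angle until the measured angle matches the target. The function names and the correction loop are illustrative assumptions only, not the control loop the smart stepper firmware runs.

```python
import math

def field_angle_deg(command_deg, gain_a=1.0, gain_b=1.0, current_a=1.0):
    """Electrical angle of the field when each phase has its own gain error."""
    theta = math.radians(command_deg)
    ia = gain_a * current_a * math.cos(theta)
    ib = gain_b * current_a * math.sin(theta)
    return math.degrees(math.atan2(ib, ia))

def corrected_command_deg(target_deg, gain_a, gain_b, iterations=20):
    """Adjust the commanded angle until the 'measured' angle hits the target.

    field_angle_deg() stands in for the encoder reading here.
    """
    command = target_deg
    for _ in range(iterations):
        measured = field_angle_deg(command, gain_a, gain_b)
        command += target_deg - measured  # feed the angle error back
    return command

if __name__ == "__main__":
    # Phase B gets 1% more current than phase A for the same set point.
    error = field_angle_deg(45.0, gain_a=1.0, gain_b=1.01) - 45.0
    print("open-loop angle error (electrical degrees):", round(error, 3))
    command = corrected_command_deg(45.0, gain_a=1.0, gain_b=1.01)
    print("corrected command:", round(command, 3))
```

A 1% mismatch at 45 degrees works out to an error of a little under 0.3 electrical degrees, and the loop removes it by commanding a slightly smaller angle.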
The difference in the motor coils does happen. In early firmware we calibrated with both phases powered up, but since the currents were different our motor angle was off. Now the smart steppers calibrate with only one phase on to ensure that the motor shaft is at the correct angle. That is, with one phase on the vector is always on the X or Y axis above, regardless of minor changes in the phase currents.
Trampas
Just to clarify for others to know, I think you meant AS5047D which gives us theta.
I think the phase advancement circular reference you mentioned can be (partially?) solved with a bit of a model-based approach. Basically, you can predict the speed based on your command. That would solve the "unexpected" deceleration problem when a full step is commanded. (A Kalman filter with a modelled observer could work really nicely here.)
The only unknown would be the load. Depending on the application, the load may change instantly or smoothly. If the latter, the resulting speed change will be slow, so no problem. If the former, for example hitting a wall, then the calculation will be incorrect for a moment, like you said. But given that we have a wall in front of us, I think it would be acceptable to "lose a few steps" trying to push the wall.
If someone was trying to detect the end stops of their 3D printer that way, they should do it at low speed.
Regarding the maximum speed: I think one could also find out whether there is more speed potential to be unlocked by just driving the motor (from UART) completely open loop until it loses steps. This way the encoder latency wouldn't slow us down.
That is, if the program loop is the limiting factor before the magnetic field switching, as you pointed out.
I guess only then would field weakening be worth considering, because in theory it should allow faster rotation than full-step driving. But this sort of thing would probably require DMA to still write a proper sine wave to the PWM; otherwise the CPU probably wouldn't be fast enough. So it's probably out of scope for an Arduino library.
Above is the block diagram for the A4954. As you can see, at the bottom there is the VREF and the current sense resistor going to the comparator. What the A4954 does is turn on power to the motor, then wait a "blanking time" of 2-4us before it starts checking the comparator for over current. Once it detects over current it enters what it calls "mixed decay" operation. Specifically, once it detects that we have reached the desired current, it reverses the H-bridge driver for some 'fast decay' time. After that it shorts both wires of the motor coil to ground for the 'slow decay' time. These times are both fixed at 12.5us each and cannot be changed. After this time it turns the coil back on and repeats the process.
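The drive, fast decay, slow decay cycle described above can be sketched as a small simulation. This is a simplified model, not a model of the actual chip: the coil is treated as a plain resistor plus inductor with no back-EMF, the blanking time is assumed to be 3us (inside the 2-4us range), and the motor values in the example are made up.

```python
def simulate_chopper(supply_v, resistance_ohm, inductance_h, target_a,
                     duration_s=1e-3, dt=0.1e-6,
                     blanking_s=3e-6, fast_decay_s=12.5e-6, slow_decay_s=12.5e-6):
    """Return a list of coil current samples (amps), one per time step dt."""
    current = 0.0
    state = "drive"
    time_in_state = 0.0
    trace = []
    for _ in range(int(round(duration_s / dt))):
        if state == "drive":
            coil_v = supply_v        # H-bridge on
        elif state == "fast":
            coil_v = -supply_v       # H-bridge reversed
        else:
            coil_v = 0.0             # both coil wires shorted to ground
        # Euler step of L * di/dt = V - i * R
        current += (coil_v - current * resistance_ohm) / inductance_h * dt
        time_in_state += dt

        if state == "drive":
            # The comparator is ignored until the blanking time has passed.
            if time_in_state >= blanking_s and current >= target_a:
                state, time_in_state = "fast", 0.0
        elif state == "fast":
            if time_in_state >= fast_decay_s:
                state, time_in_state = "slow", 0.0
        elif time_in_state >= slow_decay_s:
            state, time_in_state = "drive", 0.0
        trace.append(current)
    return trace

if __name__ == "__main__":
    # Hypothetical motor: 12V supply, 2 ohm, 2mH coil, 1A set point.
    trace = simulate_chopper(12.0, 2.0, 2e-3, 1.0)
    print("peak current:", round(max(trace), 3))
    print("ripple minimum near the end:", round(min(trace[-1000:]), 3))
```

With this fairly large inductance the current ramps up to the set point, then ripples just below it as the chip cycles through the two decay phases.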
There are a few problems here. Specifically, if you use a small motor and a large voltage, it is possible to exceed the desired current level within the blanking time. You can also reverse all your current in the fast decay time. Usually this is not a problem, but it is possible. Additionally, the coils on the motor and the current sense resistors are not perfect, so you will not always get the current you wanted into the coils. For example, one coil could get a higher or lower current for the same VREF set point. These problems, however, get taken care of in the control loop. By knowing the motor position we correct for any position errors, so if the current was off, the motor would move to a bad position in microstepping, but then the control loop would correct it.
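A quick back-of-the-envelope check shows when the blanking time and fast decay time start to matter. The sketch below uses the approximation di = V * t / L, which ignores coil resistance and back-EMF; the supply voltage and inductance are made-up example values, and the helper name is hypothetical.

```python
def current_change_a(coil_voltage_v, inductance_h, time_s):
    """Approximate current change in an inductor: di = V * t / L."""
    return coil_voltage_v * time_s / inductance_h

BLANKING_S = 4e-6      # worst case of the 2-4us blanking time
FAST_DECAY_S = 12.5e-6  # fixed fast decay time

# Hypothetical small motor: 1mH coil on a 24V supply.
supply_v = 24.0
inductance_h = 1.0e-3

# Current can rise this much before the comparator is even checked.
overshoot_a = current_change_a(supply_v, inductance_h, BLANKING_S)
# Current can fall this much while the H-bridge is reversed.
fast_decay_drop_a = current_change_a(supply_v, inductance_h, FAST_DECAY_S)

if __name__ == "__main__":
    print("rise during blanking (A):", overshoot_a)
    print("drop during fast decay (A):", fast_decay_drop_a)
```

With these example values the current can climb roughly 0.1A during blanking and drop roughly 0.3A during fast decay, so a set point below about 0.3A could be driven past zero. A lower inductance or a higher supply voltage makes both numbers larger.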
Also, as the load on the motor changes, the inductance of the coils changes too, which changes the current ramp-up and ramp-down times. The reality is that knowing the precise current does not help either, as the relationship between current and magnetic flux in the motor will be different for each coil. Hence in the end it all gets taken care of by measuring what we really want, which is the positional accuracy.