Hey Clayton,
Thank you for this summary! It is great to hear that Utterly Voice is working well for you.
Yes, Google Cloud can be a bit pricey. If you don't already, use "stop listening" instead of "pause listening" whenever you can, because utterances are still sent to the recognizer while the microphone is paused. We expect cloud speech recognition to get cheaper over time as the number of providers grows. We recently switched from Google Cloud to DigitalOcean for our cloud storage, and our monthly bill went from $25 to $5. Competition is good :-)
Yes, Deepgram has made some significant improvements lately in both accuracy and latency. It might be good to retry them periodically in case they surpass Google.
Nabla looks interesting. I have added it to our task list for future investigation. It does support streaming, which is great. However, it streams audio as base64 text rather than raw binary data. Base64 makes each payload about a third larger, which can increase latency somewhat. It still looks worth trying.
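To put a rough number on the base64 overhead, here is a minimal Python sketch. The chunk format (100 ms of 16 kHz, 16-bit mono PCM) is only an assumption for illustration, not something taken from Nabla's documentation, and the chunk is filled with silence rather than real audio.

```python
import base64

# Assumed audio format, for illustration only: 16 kHz, 16-bit mono PCM.
SAMPLE_RATE = 16000
BYTES_PER_SAMPLE = 2
CHUNK_MS = 100

# One chunk of silence standing in for real microphone data.
raw_chunk = bytes(SAMPLE_RATE * BYTES_PER_SAMPLE * CHUNK_MS // 1000)

# Base64 turns every 3 bytes of input into 4 ASCII characters.
encoded_chunk = base64.b64encode(raw_chunk)

overhead = len(encoded_chunk) / len(raw_chunk)
print(f"raw: {len(raw_chunk)} bytes, base64: {len(encoded_chunk)} bytes, "
      f"ratio: {overhead:.2f}")
```

The extra bytes, plus the encode and decode steps on each end, are where the added latency would come from. How noticeable that is in practice would depend on the connection and the chunk size.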
We are starting to review Azure right now. It looks very promising in general. It appears that they do not offer any medical-specific models. They do, however, provide a nice interface for creating custom models, and this is what they recommend to users looking for medical dictation. This might require more work for your setup, but it might also improve accuracy, because the model would be trained on your voice and on the terms you use frequently.
-Tony