Do you want to use uniMRCP to transcribe preexisting audio files?
If so, I don't think this is a good use case for uniMRCP. I don't know about MRCP v1, but with MRCP v2 you would establish a SIP session, and the audio would then be sent at the rate negotiated for the call, i.e. in real time.
So a WAV file with 2 minutes of audio would take 2 minutes just to be sent from the client to the server.
If you call the Google API directly instead, the whole file is uploaded as fast as your network allows.
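To put rough numbers on that, here is a small back-of-the-envelope sketch. The audio format (8 kHz, 16-bit mono PCM) and the 10 Mbit/s uplink are assumptions I picked for illustration, not values taken from your setup.

```python
def upload_seconds(audio_seconds, sample_rate_hz, bytes_per_sample, uplink_bits_per_s):
    """Time to push the raw audio through a link of the given speed."""
    size_bytes = audio_seconds * sample_rate_hz * bytes_per_sample
    return size_bytes * 8 / uplink_bits_per_s

audio_seconds = 120  # the 2-minute file from the example above

# Over an MRCP/RTP session the audio is paced in real time,
# so sending it takes as long as the audio itself.
streamed = audio_seconds

# Direct upload: assumed 8 kHz, 16-bit mono PCM over an assumed 10 Mbit/s uplink.
direct = upload_seconds(audio_seconds, 8000, 2, 10_000_000)

print(f"streamed over RTP: {streamed} s, direct upload: {direct:.3f} s")
```

Whatever the exact figures are for you, the streamed case cannot be faster than the length of the audio.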
An MRCP server also has to do a lot of work because of the real-time nature of its usual scenario, where it deals with human callers and SIP/RTP issues: managing timers, handling RTP jitter, running VAD. None of that is relevant to simple audio file transcription with Google (well, VAD is always relevant for SR engines).
Now, I checked the MRCP v2 RFC (RFC 6787) and found this:
9.4.10. Input-Waveform-URI
This optional header field specifies a URI pointing to audio content
to be processed by the RECOGNIZE operation. This enables the client
to request recognition from a specified buffer or audio file.
input-waveform-uri = "Input-Waveform-URI" ":" uri CRLF
So if uniMRCP supports this, there might be a case for using it in this scenario. You would need to either serve the audio file over HTTP (assuming Input-Waveform-URI can be an HTTP URI) or upload it to the file system of the uniMRCP host.
But I suspect it doesn't, because I tried sending this header to a uniMRCP server:
MRCP/2.0 232 RECOGNIZE 1
channel-identifier: 5251fff6aa614ecd@speechrecog
speech-language: ja-JP
content-type: text/uri-list
input-waveform-uri: http://192.168.3.138:7777/some.wav
content-length: 25

builtin:speech/transcribe
but it didn't do anything with it according to the logs.
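In case you want to repeat the experiment, this is a minimal sketch of how such a request can be assembled. The helper `build_mrcp_request` is my own hypothetical function, not part of uniMRCP. It assumes CRLF line endings, an empty line between the headers and the body, and that the message-length on the start line counts the whole message including the start line itself.

```python
def build_mrcp_request(method, request_id, headers, body):
    """Assemble an MRCPv2 request as bytes (hypothetical helper, not a uniMRCP API)."""
    body_bytes = body.encode("utf-8")
    header_block = "".join(f"{name}: {value}\r\n" for name, value in headers)
    header_block += f"content-length: {len(body_bytes)}\r\n"

    # The message-length field counts its own digits, so iterate until it is stable.
    length = 0
    while True:
        start_line = f"MRCP/2.0 {length} {method} {request_id}\r\n"
        message = (start_line + header_block + "\r\n").encode("utf-8") + body_bytes
        if len(message) == length:
            return message
        length = len(message)

request = build_mrcp_request(
    "RECOGNIZE",
    1,
    [
        ("channel-identifier", "5251fff6aa614ecd@speechrecog"),
        ("speech-language", "ja-JP"),
        ("content-type", "text/uri-list"),
        ("input-waveform-uri", "http://192.168.3.138:7777/some.wav"),
    ],
    "builtin:speech/transcribe",
)
print(request.decode("utf-8"))
```

The resulting bytes would then be written to the MRCP channel (the TCP connection negotiated through SIP); that part is left out here.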
Anyway, Google's API is easy to use, so I would just call it directly. Other providers like AWS, Azure, etc. should have similarly simple APIs, so I would not insert an MRCP layer just to handle static files.
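As a rough idea of what "calling it directly" involves, here is a sketch that only builds the JSON body for the Speech-to-Text v1 REST `speech:recognize` call, using nothing but the standard library. The helper name `build_recognize_body`, the placeholder audio bytes, and the 8 kHz LINEAR16 format are assumptions for illustration. Authentication and the actual HTTP POST are left out.

```python
import base64
import json

# Synchronous recognition endpoint of the Speech-to-Text v1 REST API.
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"

def build_recognize_body(audio_bytes, sample_rate_hz=8000, language_code="ja-JP"):
    """Build the request body; the audio goes inline as base64 (hypothetical helper)."""
    return {
        "config": {
            "encoding": "LINEAR16",          # assumed: raw 16-bit PCM samples
            "sampleRateHertz": sample_rate_hz,
            "languageCode": language_code,
        },
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }

# Placeholder bytes standing in for the real audio data.
body = build_recognize_body(b"\x00\x01" * 4)
payload = json.dumps(body)
print(payload)
```

You would POST `payload` to `RECOGNIZE_URL` with your credentials. As far as I remember, the synchronous call only accepts short audio (about a minute), so for longer files check the docs for the long-running variant.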
But some SR providers like Nuance, Voxeo, etc. might only offer an MRCP interface (I have no idea). In that case you would be constrained by them, and using uniMRCP would make sense. It would still be a poor use of resources, though, since you would be using a platform built for real-time processing when your use case doesn't seem to have real-time constraints.