Recognition Accuracy

Marcelo Botelho

Mar 5, 2021, 3:12:46 PM
to UniMRCP
Hi, all

I have an Asterisk 11.25.3 setup with UniMRCP ASR and TTS by Google.

part of my extension:
exten => 44007,n(voz),MRCPRecog("builtin:speech/transcribe",pt-BR&p=uni2)

Sometimes it doesn't understand the pt-BR words "sim" and "não".

Is there a way to adjust sensitivity and accuracy?

Arsen Chaloyan

Mar 8, 2021, 9:55:37 PM
to UniMRCP
Hi Marcelo,

If you could provide logs and utterances with such occurrences, I would be able to tell what the problem is exactly. Otherwise, I have to guess, as there are too many parameters involved.

First, check the value of speech-start-timeout. The default value of this parameter was changed from 300 msec to 50 msec a year or two ago so that short utterances are not dismissed.
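
To illustrate why a lower speech-start-timeout helps with short words, here is a minimal sketch. It is not UniMRCP's actual detector: it assumes a simplified model in which start of speech is declared once voice activity has lasted continuously for at least the timeout. The 10 ms frame size, the function name and the 120 ms utterance are hypothetical values chosen for illustration.

```python
from typing import Optional, Sequence

FRAME_MS = 10  # assumed frame duration (hypothetical value for this sketch)


def start_of_input_ms(activity: Sequence[bool],
                      speech_start_timeout_ms: int) -> Optional[int]:
    """Return the time (ms) at which start of speech would be declared,
    or None if activity never lasts long enough (simplified model)."""
    run_ms = 0
    for i, active in enumerate(activity):
        # Extend the current run of activity, or reset it on an inactive frame.
        run_ms = run_ms + FRAME_MS if active else 0
        if run_ms >= speech_start_timeout_ms:
            return (i + 1) * FRAME_MS
    return None


# Hypothetical utterance: 500 ms silence, 120 ms of activity, 500 ms silence.
frames = [False] * 50 + [True] * 12 + [False] * 50

short_timeout = start_of_input_ms(frames, 50)    # 50 msec timeout
long_timeout = start_of_input_ms(frames, 300)    # 300 msec timeout
print(short_timeout, long_timeout)
```

Under this model the 120 ms burst triggers start of input with the 50 msec setting but is never detected with 300 msec, which is the kind of dismissal described above.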

Recognition of short utterances probably remains one of the most challenging tasks for Google. You would need to consult them for recommendations applicable to the pt-BR language in particular. In general, it would help to use the enhanced model with the speech adaptation feature enabled.



--
You received this message because you are subscribed to the Google Groups "UniMRCP" group.
To unsubscribe from this group and stop receiving emails from it, send an email to unimrcp+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/unimrcp/809706ae-f5de-46c0-8aa0-9afd92b59d9co%40googlegroups.com.


--
Arsen Chaloyan
Author of UniMRCP
http://www.unimrcp.org

Marcelo Botelho

Mar 19, 2021, 12:27:06 PM
to UniMRCP
Thanks Arsen,

I'll send the logs by e-mail for calls that were OK and not OK.


{"recog-details-record": {
   "datetime": "2021-03-19 12:00:09",
   "language": "pt-BR",
   "sampling-rate": "8000 Hz",
   "max-alternatives": 1,
   "gRPC": {
      "creation-ts": "122 ms",
      "start-of-streaming-ts": "934 ms",
      "end-of-streaming-ts": "4284 ms",
      "sent": "102400 bytes"
   },
   "completion-cause": "no-match",
   "completion-ts": "4504 ms"
}}

{"recog-details-record": {
   "datetime": "2021-03-19 11:59:11",
   "language": "pt-BR",
   "sampling-rate": "8000 Hz",
   "max-alternatives": 1,
   "input": {
      "type": "speech",
      "start-of-input-ts": "754 ms",
      "end-of-input-ts": "4284 ms",
      "end-of-input-cause": "success",
      "duration": "6580 ms",
      "size": "105280 bytes"
   },
   "gRPC": {
      "creation-ts": "187 ms",
      "start-of-streaming-ts": "754 ms",
      "end-of-streaming-ts": "4284 ms",
      "sent": "105280 bytes"
   },
   "recog-results": [
      {"alternatives": [
         {"transcript": "sim", "confidence": 0.670689}
      ]}
   ],
   "completion-cause": "success",
   "completion-ts": "4508 ms"
}}
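
For anyone comparing such records side by side, a small script can pull out the fields that differ. This is only a sketch: the helper names are made up here, and the audio length estimate assumes 16-bit mono PCM, which the records do not state explicitly (although 105280 bytes at 8000 Hz works out to the 6580 ms reported as the input duration).

```python
import json

NO_MATCH = '''{"recog-details-record": {
   "datetime": "2021-03-19 12:00:09", "language": "pt-BR",
   "sampling-rate": "8000 Hz", "max-alternatives": 1,
   "gRPC": {"creation-ts": "122 ms", "start-of-streaming-ts": "934 ms",
            "end-of-streaming-ts": "4284 ms", "sent": "102400 bytes"},
   "completion-cause": "no-match", "completion-ts": "4504 ms"}}'''

SUCCESS = '''{"recog-details-record": {
   "datetime": "2021-03-19 11:59:11", "language": "pt-BR",
   "sampling-rate": "8000 Hz", "max-alternatives": 1,
   "input": {"type": "speech", "start-of-input-ts": "754 ms",
             "end-of-input-ts": "4284 ms", "end-of-input-cause": "success",
             "duration": "6580 ms", "size": "105280 bytes"},
   "gRPC": {"creation-ts": "187 ms", "start-of-streaming-ts": "754 ms",
            "end-of-streaming-ts": "4284 ms", "sent": "105280 bytes"},
   "recog-results": [{"alternatives": [
       {"transcript": "sim", "confidence": 0.670689}]}],
   "completion-cause": "success", "completion-ts": "4508 ms"}}'''


def number(value: str) -> int:
    """Parse the leading integer from strings like '934 ms' or '8000 Hz'."""
    return int(value.split()[0])


def summarize(raw: str) -> dict:
    rec = json.loads(raw)["recog-details-record"]
    grpc = rec["gRPC"]
    sent = number(grpc["sent"])
    rate = number(rec["sampling-rate"])
    alts = [a for r in rec.get("recog-results", []) for a in r["alternatives"]]
    return {
        "cause": rec["completion-cause"],
        "has_input_section": "input" in rec,
        "streamed_ms": number(grpc["end-of-streaming-ts"])
                       - number(grpc["start-of-streaming-ts"]),
        # Assumption: 16-bit mono PCM, i.e. 2 bytes per sample.
        "audio_ms": sent * 1000 // (rate * 2),
        "best": (alts[0]["transcript"], alts[0]["confidence"]) if alts else None,
    }


for name, raw in (("no-match", NO_MATCH), ("success", SUCCESS)):
    print(name, summarize(raw))
```

Both calls sent a similar amount of audio. The visible differences are the missing "input" section and the missing results in the no-match record; the thread does not say what the absent "input" section means.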





Marcelo Botelho

Mar 19, 2021, 12:28:44 PM
to uni...@googlegroups.com
Hi, Arsen

I'm sending the files here by e-mail, because the forum is rejecting them.

recog.zip

Arsen Chaloyan

Mar 20, 2021, 9:37:29 PM
to UniMRCP
Hi Marcelo,

I tried to restream audio data stored in umsgsr-8a3aa2ef69714222-1_sim_nomatch.wav to Google and got no-match as you did. Using v1beta1p1 API with use-enhanced set to true did not seem to make any difference in this case either.

Out of curiosity, I streamed the same audio to Azure and got the intended "sim" in return with a confidence score of 0.3, and then to AWS Transcribe, which also returned the intended result with a confidence score of 0.9, without making any changes to the default input parameters.

Next, it would make sense to experiment with speech adaptation boost, but sorry I have no time for that. You should contact Google to get their recommendations.
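
As a rough illustration of what speech adaptation with boost looks like on the Google side, the sketch below builds a JSON request body for the Speech-to-Text REST API. The field names (speechContexts, phrases, boost, useEnhanced) follow Google's RecognitionConfig as I understand it; the bucket URI, the boost value and the helper name are hypothetical, and whether an enhanced model is available for pt-BR is not confirmed here. It does not show how to pass these settings through the UniMRCP plugin.

```python
import json
from typing import List


def build_recognize_request(audio_uri: str, phrases: List[str],
                            boost: float) -> dict:
    """Build a request body for a Speech-to-Text recognize call (sketch only)."""
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": 8000,        # matches the 8000 Hz in the logs
            "languageCode": "pt-BR",
            "useEnhanced": True,
            # Speech adaptation: hint the expected short answers and boost them.
            "speechContexts": [{"phrases": phrases, "boost": boost}],
        },
        "audio": {"uri": audio_uri},
    }


# Hypothetical bucket path and boost value, for illustration only.
request = build_recognize_request(
    "gs://example-bucket/sim_nomatch.wav", ["sim", "não"], 10.0
)
print(json.dumps(request, ensure_ascii=False, indent=2))
```

A suitable boost value has to be found by experiment; nothing in this thread indicates which value, if any, fixes the "sim" no-match case.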

Marcelo Botelho

Mar 21, 2021, 8:21:14 AM
to uni...@googlegroups.com
Sorry,

I didn't understand where to configure SpeechContext, as advised by Google.

Arsen Chaloyan

Mar 24, 2021, 4:54:16 PM
to UniMRCP
See this post for more info. The version of the plugin you use may not support this option, though. You may need to upgrade the license to be able to use newer versions of the plugin.
