online-audio-server-decode-faster - online-audio-client

Piero Cosi

Oct 1, 2015, 9:51:55 AM
to kaldi-developers, Piero Cosi
Hi all,

I am using the online setup online-audio-server-decode-faster with online-audio-client on my Italian VoxForge-like data and ASR system.

Results are OK; in fact, the recognition performance is great, but the last word of the sentence is often missing.
I suppose I am doing something wrong, perhaps with the parameter settings.
I generally use the default parameters for both programs.

Could someone give me some hints on this problem?

MANY THANKS
Piero

Danijel Korzinek

Oct 1, 2015, 11:44:47 AM
to kaldi-de...@googlegroups.com
Can you use the same files with another program (e.g. online-wav-gmm-decode-faster) and see if there is a problem there?

If the result is the same for both online programs, it may be an issue with the model. If it works on the latter, but doesn't on the server, then it is probably a bug in the server code (which I made), so I will have to take a look at it.

Best regards,
Danijel

--
You received this message because you are subscribed to the Google Groups "kaldi-developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-develope...@googlegroups.com.
To post to this group, send email to kaldi-de...@googlegroups.com.
Visit this group at http://groups.google.com/group/kaldi-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-developers/f3c8b722-aa81-4ce1-9ec6-fcb1f16927ed%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Daniel Povey

Oct 1, 2015, 2:52:32 PM
to kaldi-developers
That decoder program uses OnlineFasterDecoder (from Cisco), and the
special point of that decoder is that it does not output a word until
it is certain about it, i.e. until the tracebacks converge. This was
necessary for some application they had.
There is some kind of logic for telling the decoder that the features
being input have finished, and it's possible that if your gstreamer
code is not telling it that the features have finished, it will fail
to flush out the last word. (kEndFeats and kEndUtt may have
something to do with this).
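
To illustrate the behaviour described above, here is a toy sketch in Python. It is not the OnlineFasterDecoder code: the class and function names are hypothetical, and hypotheses are plain word lists rather than lattice tracebacks. It only shows why a decoder that emits words once all live hypotheses agree on them will hold back the last word until it is explicitly told the input has ended.

```python
def common_prefix(hyps):
    """Longest word sequence shared by the start of every hypothesis."""
    prefix = []
    for words in zip(*hyps):
        if len(set(words)) != 1:
            break
        prefix.append(words[0])
    return prefix


class ToyPartialDecoder:
    """Hypothetical stand-in for a decoder that outputs only converged words."""

    def __init__(self):
        self.emitted = []

    def update(self, hyps):
        """hyps: list of (score, words) still alive. Returns newly converged words."""
        agreed = common_prefix([words for _, words in hyps])
        new = agreed[len(self.emitted):]
        self.emitted.extend(new)
        return new

    def finish(self, hyps):
        """End of input: commit to the best-scoring hypothesis and flush the rest."""
        best = max(hyps, key=lambda h: h[0])[1]
        new = best[len(self.emitted):]
        self.emitted.extend(new)
        return new


dec = ToyPartialDecoder()
first = dec.update([(-1.0, ["buon"]), (-2.0, ["buon"])])
second = dec.update([(-3.0, ["buon", "giorno"]), (-4.0, ["buon", "giorni"])])
flushed = dec.finish([(-3.0, ["buon", "giorno"]), (-4.0, ["buon", "giorni"])])
```

In this toy run the second update returns nothing because the two hypotheses still disagree on the final word; it only appears once `finish` is called. If the end-of-input signal never arrives, that last word is never output.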

Dan

Danijel Korzinek

Oct 1, 2015, 3:02:38 PM
to kaldi-de...@googlegroups.com
Hey Daniel,

The online-audio-server-decode-faster does have logic to deal with the "end-of-stream" scenario. When the server receives an EOS mark (which is simply a frame of zero length), it calls "decoder.FinishTraceBack(&out_fst)" and "decoder.GetBestPath(&out_fst)" and outputs the whole result as is. I'm not sure if anything else is needed, but it should work (I copied almost all of it from the wav decoder code).
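
As a rough illustration of the end-of-stream idea, the Python sketch below frames audio as length-prefixed chunks and terminates the stream with a zero-length frame. The header layout (a 4-byte little-endian signed length) and the function names are assumptions made for the example, not taken from the client or server source, so check the actual code before relying on them. The reader reports whether it saw the EOS mark, which is the condition under which the final flush would run.

```python
import io
import struct

# Assumed header: 4-byte little-endian signed chunk length (hypothetical layout).
HEADER = struct.Struct("<i")


def write_stream(out, pcm, chunk_bytes=4096):
    """Write pcm as length-prefixed chunks, then a zero-length EOS frame."""
    for start in range(0, len(pcm), chunk_bytes):
        chunk = pcm[start:start + chunk_bytes]
        out.write(HEADER.pack(len(chunk)))
        out.write(chunk)
    out.write(HEADER.pack(0))  # zero-length frame marks end of stream


def read_stream(inp):
    """Return (audio_bytes, saw_eos). saw_eos is False if the stream just stopped."""
    audio = bytearray()
    while True:
        header = inp.read(HEADER.size)
        if len(header) < HEADER.size:
            return bytes(audio), False  # input ended without an EOS frame
        (size,) = HEADER.unpack(header)
        if size == 0:
            return bytes(audio), True  # EOS: the decoder can flush its output
        audio += inp.read(size)


pcm = b"\x01\x02" * 5000
buf = io.BytesIO()
write_stream(buf, pcm)
audio_ok, eos_ok = read_stream(io.BytesIO(buf.getvalue()))
audio_cut, eos_cut = read_stream(io.BytesIO(buf.getvalue()[:-HEADER.size]))
```

The second call simulates a client that sends all the audio but never sends the EOS frame: the audio arrives intact, yet the reader never sees the signal that would trigger the final traceback.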

Now, as far as the last word is concerned, could it be the case that if there isn't sufficient silence after the final word (it's cut off abruptly), the decoder would not send it to output?
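
One way to test this hypothesis is to append some silence to a test file and decode it again. A minimal sketch using Python's standard `wave` module follows; the function name and the half-second default are illustrative choices, and it assumes 16-bit PCM input, where zero bytes represent silence.

```python
import wave


def pad_trailing_silence(src_path, dst_path, seconds=0.5):
    """Copy a WAV file, appending `seconds` of silence. Returns frames added."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())
    n_pad = int(params.framerate * seconds)
    # Zero bytes are silence for signed 16-bit PCM (assumed input format).
    silence = b"\x00" * (n_pad * params.nchannels * params.sampwidth)
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(frames + silence)  # header frame count is fixed on close
    return n_pad
```

If the last word shows up after padding, the missing trailing silence is the likely cause rather than the EOS handling.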

There could also be other problems, like the online-audio-client not sending the complete file for some reason. That is why I suggested verifying it with the other program, which (to my knowledge) works well and is thoroughly tested.

Danijel
