Real-time decoding: force decoding of the last audio data


J Kokotinis

Apr 5, 2019, 10:00:10 AM
to kaldi-help
Hello,

I am building a real-time decoding application in Python that takes microphone audio as input, based on the voxforge run-live.py example (https://github.com/kaldi-asr/kaldi/tree/master/egs/voxforge/gst_demo).

The problem is that the decoder does not print everything that was spoken when I press the pause button. As far as I can see, the pause function just sets the asr to silent (line 110), and no matter how long I wait, the remaining results are not printed. They are printed only when I unpause the process and give the decoder more audio data.

Is there any way to force the decoder (onlinegmmdecodefaster) to decode and print all the audio spoken into the microphone each time I press the pause button?

If not, what is the best approach to decode microphone audio in real time and get all the results when the process is paused?


Thank you in advance!

Daniel Povey

Apr 5, 2019, 12:15:50 PM
to kaldi-help
I think that code may be designed to only output the portion of the audio that it is sure about, so if future context is required to disambiguate different hypotheses, it may not output everything.  It's based on the Viterbi traceback.
There are other ways to handle this, e.g. correcting things later, but they are not exposed in that demo.
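To illustrate why a decoder based on a partial Viterbi traceback can hold back the most recent words, here is a minimal Python sketch. It is not Kaldi's code: the `Token` class and the `backtrace`, `stable_prefix` and `forced_best` functions are hypothetical names made up for this example, and comparing words rather than token identity is a simplification. The idea shown is that only the part of the history shared by all active hypotheses is safe to print, while forcing the single best path gives the full (but possibly revisable) result.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    """Hypothetical decoder token: a word, a path cost, and a back-pointer."""
    word: str
    cost: float
    prev: Optional["Token"] = None


def backtrace(token: Optional[Token]) -> List[str]:
    """Follow back-pointers to recover the word sequence ending at this token."""
    words = []
    while token is not None:
        words.append(token.word)
        token = token.prev
    return words[::-1]


def stable_prefix(active: List[Token]) -> List[str]:
    """Words on which every active hypothesis agrees (safe to output now)."""
    paths = [backtrace(t) for t in active]
    prefix = []
    for column in zip(*paths):
        if len(set(column)) != 1:
            break  # hypotheses diverge here; later words are still ambiguous
        prefix.append(column[0])
    return prefix


def forced_best(active: List[Token]) -> List[str]:
    """Full word sequence of the lowest-cost hypothesis (what a forced flush would print)."""
    return backtrace(min(active, key=lambda t: t.cost))


# Two hypotheses that share "turn on" but disagree on the last word.
root = Token("turn", 0.0)
on = Token("on", 1.0, root)
active = [Token("the", 2.0, on), Token("a", 2.5, on)]

print(stable_prefix(active))  # the agreed part only
print(forced_best(active))    # the best path, including the ambiguous tail
```

In this toy case the stable prefix stops before the last word, which matches the behaviour described in the question: the tail only appears once more audio resolves the ambiguity, or once the decoder is told to commit to its current best path.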

--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+...@googlegroups.com.
To post to this group, send email to kaldi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/6fe88822-8573-4ec2-8abd-29de6e96d1e8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

J Kokotinis

Apr 6, 2019, 5:26:04 AM
to kaldi-help
Thank you for your reply.

So the best way to deal with this is to write a wav file while the microphone is recording and feed it to online-wav-gmm-decode-faster at the same time. When I pause the recording, the wav file will be complete and the decoder will output the complete results. Is this right?




Daniel Povey

Apr 6, 2019, 12:49:17 PM
to kaldi-help
online-wav-gmm-decode-faster doesn't read the wav incrementally.

This recently merged PR may be closer to what you need.
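For readers unsure what reading a wav "incrementally" would look like, here is a minimal sketch using only Python's standard `wave` module. It is not how online-wav-gmm-decode-faster works and is unrelated to the PR mentioned above; `iter_wav_chunks` and `make_silent_wav` are hypothetical helpers written for this example, and the chunk size of 1600 frames (0.1 s at 16 kHz) is an arbitrary assumption. Each yielded chunk is what a streaming decoder would receive in turn.

```python
import io
import wave


def iter_wav_chunks(wav_source, frames_per_chunk=1600):
    """Yield successive blocks of raw PCM bytes from a WAV file or file-like object."""
    with wave.open(wav_source, "rb") as reader:
        while True:
            chunk = reader.readframes(frames_per_chunk)
            if not chunk:
                break  # end of audio data
            yield chunk


def make_silent_wav(num_frames, sample_rate=16000):
    """Build an in-memory 16-bit mono WAV of silence, used here as test input."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        writer.setframerate(sample_rate)
        writer.writeframes(b"\x00\x00" * num_frames)
    buffer.seek(0)
    return buffer


# 4000 frames read in chunks of 1600 frames; each frame is 2 bytes.
for chunk in iter_wav_chunks(make_silent_wav(4000)):
    print(len(chunk))
```

Note that this reads a finished file. A wav that is still being recorded is a harder case, since the WAV header stores the data length, so streaming raw audio from the microphone is usually the simpler route.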



J Kokotinis

Apr 8, 2019, 3:06:36 AM
to kaldi-help
Thank you very much for the help! Keep up the good work!