The most advanced recipe for offline speech recognition

Alexander Gorodetski

Mar 20, 2019, 1:18:20 PM

to kaldi-help

Hello All,

I would like to ask: what is the most advanced recipe for offline speech recognition? Until yesterday I thought it was Tedlium, but after yesterday's post from Nvidia (https://devblogs.nvidia.com/nvidia-accelerates-speech-text-transcription-3500x-kaldi/) it seems that I was wrong.

Regarding the Nvidia post: does it mean that transcription runs 3500 times faster than real time, including lattice rescoring and LM rescoring? Is that correct?

Thanks,

Alex.

Daniel Povey

Mar 20, 2019, 1:20:57 PM

to kaldi-help

Depends what you mean by advanced. Anyway most of the up-to-date recipes have the same basic setup; look for scripts called nnet3/chain/run_tdnn.sh that have 'tdnnf-layer' inside the script.
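One way to act on that hint is to scan a Kaldi checkout for matching scripts. The following is a minimal sketch, not part of Kaldi: the function name `find_tdnnf_recipes` and the `kaldi/egs` path in the usage note are assumptions made for illustration, and the filename pattern (`run_tdnn*.sh`) is a loose reading of the script name mentioned above.

```python
import os


def find_tdnnf_recipes(egs_root, marker="tdnnf-layer"):
    """Return sorted paths of run_tdnn*.sh scripts under egs_root that contain marker.

    Hypothetical helper for illustration; it only walks the directory tree
    and does a plain substring search in each candidate script.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(egs_root):
        for name in filenames:
            # Only consider scripts named like run_tdnn.sh, run_tdnn_1a.sh, ...
            if not (name.startswith("run_tdnn") and name.endswith(".sh")):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    if marker in handle.read():
                        matches.append(path)
            except OSError:
                # Skip unreadable files or dangling symlinks.
                continue
    return sorted(matches)
```

Calling `find_tdnnf_recipes("kaldi/egs")` (with the path adjusted to wherever the checkout lives) would list the candidate recipes to read through.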

There is a PR up for that; it's GPU-based decoding. The 3500xRT number is throughput: it means that the machine with a GPU can process 3500 seconds of audio per second of wall-clock time. That doesn't include LM rescoring, just first-pass decoding, but it does generate lattices.
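To make the throughput reading concrete, here is a small arithmetic sketch. The function names are hypothetical, and the figures in the usage note are illustrative, not measurements from the Nvidia post.

```python
def throughput_xrt(audio_seconds, wall_clock_seconds):
    """Throughput as a multiple of real time: audio processed per unit of wall-clock time."""
    if wall_clock_seconds <= 0:
        raise ValueError("wall_clock_seconds must be positive")
    return audio_seconds / wall_clock_seconds


def wall_clock_hours(audio_hours, xrt):
    """Wall-clock hours needed to process audio_hours of audio at a given xRT throughput."""
    if xrt <= 0:
        raise ValueError("xrt must be positive")
    return audio_hours / xrt
```

For example, a machine that gets through 3500 hours of audio in one hour has `throughput_xrt(3500 * 3600, 3600)` equal to 3500, and at that rate `wall_clock_hours(7000, 3500)` gives 2 hours. This is an aggregate figure for the whole machine; it says nothing by itself about the latency of any single utterance.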

Dan

--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+...@googlegroups.com.
To post to this group, send email to kaldi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/52187738-5c15-49d7-9f7e-8c49e9e8e7ad%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Alexander Gorodetski

Mar 21, 2019, 4:57:44 AM

to kaldi-help

Hi Dan,

Thank you for your answer. If decoding runs at about 3500x real time, I guess that the token passing (Viterbi) algorithm was implemented in matrix form too (for the GPU). Could you please point me to an article that describes an implementation of token passing in matrix form?

Thank you so much,

AlexG.

Daniel Povey

Mar 21, 2019, 4:23:46 PM

to kaldi-help

It's not done with sparse-matrix ideas.

I think this paper

http://www.danielpovey.com/files/2018_interspeech_gpu_wfst.pdf

may contain *some* of the ideas, but the work has moved on considerably since then. Most of it is about CUDA programming techniques and understanding the hardware.
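For readers unfamiliar with the term used in the question, token passing can be sketched on a CPU as below. This is a toy illustration only, not the GPU decoder discussed in this thread and not taken from the paper: the graph layout, the function name `token_passing`, and the cost values are all assumptions, and epsilon arcs and lattice generation are left out.

```python
def token_passing(arcs, frame_costs, start, finals, beam=float("inf")):
    """Toy Viterbi token passing over a small decoding graph.

    arcs: dict mapping state -> list of (next_state, label, arc_cost)
    frame_costs: list with one dict per frame, mapping label -> acoustic cost
    Returns (total_cost, labels) for the best path ending in a final state,
    or None if no such path survives.
    """
    tokens = {start: (0.0, [])}  # best token per active state
    for costs in frame_costs:
        new_tokens = {}
        for state, (cost, labels) in tokens.items():
            for next_state, label, arc_cost in arcs.get(state, []):
                if label not in costs:
                    continue
                new_cost = cost + arc_cost + costs[label]
                # Viterbi recombination: keep only the cheapest token per state.
                if next_state not in new_tokens or new_cost < new_tokens[next_state][0]:
                    new_tokens[next_state] = (new_cost, labels + [label])
        if not new_tokens:
            return None
        # Beam pruning: drop tokens much worse than the current best.
        best = min(cost for cost, _ in new_tokens.values())
        tokens = {s: t for s, t in new_tokens.items() if t[0] <= best + beam}
    final_tokens = [t for s, t in tokens.items() if s in finals]
    return min(final_tokens, key=lambda t: t[0]) if final_tokens else None
```

The inner loops visit a different number of arcs for each active state, and that irregularity is part of why the work does not map directly onto dense matrix operations.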
