Multi-threading the online nnet3 decoder


Prajwal Rao

Nov 29, 2017, 3:10:30 AM
to kaldi-help
Hello Dan,

Is multi-threading an online model such as online2-wav-nnet3-latgen-faster possible (with the help of TaskSequencer)?
In the case of the Kaldi GStreamer server, I have noticed that the model is reloaded for every worker that is initialised, which is unnecessary when the same model has to be loaded for every client request.

Also, could you please guide me in making the necessary changes to online2-wav-nnet3-latgen-faster.cc so that it uses TaskSequencer to multi-thread the decoding of multiple input wav files.

Any advice would be greatly appreciated.

Thank you,
Prajwal Rao

Daniel Povey

Nov 29, 2017, 12:56:13 PM
to kaldi-help
The multithreading is definitely possible.  If you look at the core code in online2-wav-nnet3-latgen-faster.cc, you'll see that the objects that do the decoding, such as OnlineNnet2FeaturePipeline and SingleUtteranceNnet3Decoder, are lightweight and take only const references to the model and the decoding graph.  So the model and decoding graph can be loaded once and shared by multiple threads.
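
A rough sketch of that pattern, following the structure of online2-wav-nnet3-latgen-faster.cc (option parsing, i-vector adaptation and endpointing omitted; the *_rxfilename and *_opts variables stand in for command-line arguments, and exact reader/constructor names may differ slightly between Kaldi versions):

#include "fstext/fstext-lib.h"
#include "online2/online-nnet2-feature-pipeline.h"
#include "online2/online-nnet3-decoding.h"

// Configs would normally be filled in from the command line.
kaldi::OnlineNnet2FeaturePipelineConfig feature_opts;
kaldi::nnet3::NnetSimpleLoopedComputationOptions decodable_opts;
kaldi::LatticeFasterDecoderConfig decoder_opts;

// Loaded ONCE, then shared read-only by every decoding thread.
kaldi::TransitionModel trans_model;
kaldi::nnet3::AmNnetSimple am_nnet;
{
  bool binary;
  kaldi::Input ki(nnet3_rxfilename, &binary);
  trans_model.Read(ki.Stream(), binary);
  am_nnet.Read(ki.Stream(), binary);
}
fst::Fst<fst::StdArc> *decode_fst = fst::ReadFstKaldiGeneric(fst_rxfilename);
kaldi::OnlineNnet2FeaturePipelineInfo feature_info(feature_opts);
kaldi::nnet3::DecodableNnetSimpleLoopedInfo decodable_info(decodable_opts,
                                                           &am_nnet);

// Per utterance, inside each worker thread: these are the lightweight
// objects; they only hold const references/pointers to the shared state.
kaldi::OnlineNnet2FeaturePipeline feature_pipeline(feature_info);
kaldi::SingleUtteranceNnet3Decoder decoder(decoder_opts, trans_model,
                                           decodable_info, *decode_fst,
                                           &feature_pipeline);
feature_pipeline.AcceptWaveform(samp_freq, wave_chunk);
decoder.AdvanceDecoding();
// ... repeat for more audio; at the end of the utterance:
feature_pipeline.InputFinished();
decoder.AdvanceDecoding();
decoder.FinalizeDecoding();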

The gstreamer stuff isn't maintained by the core Kaldi team and I've decided not to expend my energy doing that.

Regarding using TaskSequencer to multithread the decoding of multiple input wav files: that is definitely possible and people who are more experienced in Kaldi programming will be able to do it, but I don't have time myself.  In any case, TaskSequencer probably isn't suitable for real-time (interactive) applications... my suspicion is that you don't really need online decoding and could probably accomplish what you need to accomplish using the existing decoding scripts.
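
For reference, the TaskSequencer pattern from util/kaldi-thread.h, as used by the offline parallel decode binaries, looks roughly like the sketch below. DecodeWavTask is a hypothetical class written just to show the required shape: operator() runs in one of the worker threads, and the destructor, which TaskSequencer calls in submission order, is where output should be written.

#include "feat/wave-reader.h"     // WaveData, WaveHolder
#include "util/kaldi-table.h"     // SequentialTableReader
#include "util/kaldi-thread.h"    // TaskSequencer, TaskSequencerConfig

// Hypothetical task class, just to show the shape TaskSequencer expects.
class DecodeWavTask {
 public:
  DecodeWavTask(const std::string &utt, const kaldi::WaveData &wave
                /* plus const refs to the shared model, graph, writer */)
      : utt_(utt), wave_(wave) {}

  // Called by TaskSequencer from a worker thread: do the expensive work
  // (feature extraction + decoding) here and store the result in a member.
  void operator()() { /* decode wave_ into a lattice member */ }

  // Called sequentially, in the order the tasks were submitted:
  // write the output here so results come out in input order.
  ~DecodeWavTask() { /* write the lattice/transcript for utt_ */ }

 private:
  std::string utt_;
  kaldi::WaveData wave_;
};

// Usage sketch:
kaldi::TaskSequencerConfig sequencer_config;   // exposes --num-threads
kaldi::TaskSequencer<DecodeWavTask> sequencer(sequencer_config);
kaldi::SequentialTableReader<kaldi::WaveHolder> wav_reader(wav_rspecifier);
for (; !wav_reader.Done(); wav_reader.Next()) {
  // Run() takes ownership and returns as soon as a worker slot is free.
  sequencer.Run(new DecodeWavTask(wav_reader.Key(), wav_reader.Value()));
}
sequencer.Wait();   // block until all remaining tasks have finished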




Prajwal Rao

Nov 30, 2017, 1:55:06 AM
to kaldi-help
Thank you for your quick and informative reply, sir.

I will look into that within online2-wav-nnet3-latgen-faster.cc.

Regarding the TaskSequencer, I'm really not very experienced in Kaldi programming. My ultimate goal is to provide online real-time decoding to multiple users streaming input to my server, and their recordings could be of any length. I may provide partial output on the fly as decoding happens.

So, is there any other approach you would suggest, or should I dig deeper into implementing TaskSequencer in the online2-wav-nnet3-latgen-faster.cc code?

Prajwal Rao.

 

Daniel Povey

Nov 30, 2017, 1:50:00 PM
to kaldi-help
TaskSequencer isn't suitable for a real-time application like what you describe.
For the application you describe you need to write your own multi-threaded code. If you haven't written multi-threaded programs before, I warn you that it's very tricky, and you may need to find someone with experience in that kind of thing. On top of that, server code of this type is tricky in general, because you have to worry about connection protocols, timeouts, dropped connections, knowing when the server has reached capacity, and so on. I recommend that you first try to write server code that will work for just one user; that's easily complicated enough to keep you busy for a while.
If you look at the code in online2-wav-nnet3-latgen-faster.cc, and bear in mind what I told you about certain things being const, it should be obvious how to write the multi-threaded application without re-loading the model and graph each time.
Others have done this before, so it's definitely possible. But the aim of the Kaldi project is just to provide the core ASR capabilities; we don't help you write your application code.

Dan
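
As one hedged illustration of that load-once, share-by-const-reference structure for a per-connection worker: in the sketch below, SharedResources (the bundle of objects loaded at start-up) and ClientConnection with its ReadAudioChunk/SendPartialResult methods are invented placeholders for whatever networking layer is used; only the feature-pipeline and decoder calls are actual Kaldi APIs.

#include <thread>

// One of these runs per client connection; everything it constructs is
// per-utterance and cheap, and it only reads the shared const resources.
void HandleClient(const SharedResources &shared, ClientConnection conn) {
  kaldi::OnlineNnet2FeaturePipeline feature_pipeline(shared.feature_info);
  kaldi::SingleUtteranceNnet3Decoder decoder(shared.decoder_opts,
                                             shared.trans_model,
                                             shared.decodable_info,
                                             *shared.decode_fst,
                                             &feature_pipeline);
  kaldi::Vector<kaldi::BaseFloat> chunk;
  while (conn.ReadAudioChunk(&chunk)) {            // placeholder I/O
    feature_pipeline.AcceptWaveform(shared.samp_freq, chunk);
    decoder.AdvanceDecoding();
    if (decoder.NumFramesDecoded() > 0) {
      kaldi::Lattice partial;
      decoder.GetBestPath(/*end_of_utterance=*/false, &partial);
      conn.SendPartialResult(partial);             // placeholder I/O
    }
  }
  feature_pipeline.InputFinished();
  decoder.AdvanceDecoding();
  decoder.FinalizeDecoding();
}

// In the accept loop: one worker thread per connection.  A real server also
// needs a thread cap, timeouts, and error handling, as Dan warns above.
std::thread(HandleClient, std::cref(shared), std::move(conn)).detach();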


Prajwal Rao

Dec 1, 2017, 8:59:40 AM
to kaldi-help
Thank you for your guidance. I will most likely carry on with building my own multi-threading architecture.

Prajwal Rao

Arkadi Gurevich

Dec 19, 2017, 6:09:09 AM
to kaldi-help
Hi Rao,
Have you been able to prepare the multi-threading code?
I'm also interested in doing something similar and I'd love to see your example...
Cheers
Arkadi

Prajwal Rao

Dec 21, 2017, 12:39:57 AM
to kaldi-help
Hello Arkadi,
Glad to know you are interested in building a similar architecture. Unfortunately my company policy does not let me share the code, but I can give you pointers on how to do it if you are interested.
Currently I have a simple architecture in C++:
  • Carefully read the online2-wav-nnet3-latgen-faster.cc code.
  • Isolate the for loop and create a function that decodes a single file.
  • Keep in mind the parameters you pass into the function, as Dan mentioned: try to pass all of them as (const) references unless a copy is particularly required.
  • You can use the std::thread or pthreads library for multi-threading, but in my case I have used the Boost thread library. Multiprocessing, on the other hand, is very complicated to do in C++, so if you would like to take that route I would suggest you use Python.
  • I guess at this point you will be done if you just want the results. If you want a confidence score for each word, you will have to look into the lattice-to-ctm-conf.cc code and create a function that does the same (see the sketch after this note).
Note: If you are ultimately planning to take the Python route, you can use an already existing wrapper, which I will link below, that does nnet3 online decoding.
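
For the confidence-score step, what lattice-to-ctm-conf.cc does essentially reduces to Minimum Bayes Risk decoding over the final compact lattice. A minimal sketch, assuming clat is the CompactLattice your per-file decode function produced (after whatever acoustic scaling you normally apply):

#include "lat/sausages.h"   // kaldi::MinimumBayesRisk

// clat: the CompactLattice from the per-file decode function.
kaldi::MinimumBayesRiskOptions mbr_opts;
kaldi::MinimumBayesRisk mbr(clat, mbr_opts);
const std::vector<kaldi::int32> &words = mbr.GetOneBest();
const std::vector<kaldi::BaseFloat> &confs = mbr.GetOneBestConfidences();
// words[i] is a word-id (map it through words.txt to get the word text);
// confs[i] is the per-word confidence that lattice-to-ctm-conf would put
// in the CTM output.  GetOneBestTimes() gives the corresponding times.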
