Does Kaldi online decoding (online2-tcp-nnet3-decode-faster) support multiplexing or multithreading?

Yu Beomgon

Aug 23, 2019, 6:43:13 AM
to kaldi-help

I am trying to add multiplexing to the decoder using the libwebsockets library,
but when I combined the two, a segmentation fault occurred.

The code looks like this:

int main(int argc, char *argv[])
{
  {
    using namespace kaldi;
    using namespace fst;
    .....
    .....

    fst::SymbolTable *word_syms = NULL;
    if (!word_syms_filename.empty())
      if (!(word_syms = fst::SymbolTable::ReadText(word_syms_filename)))
        KALDI_ERR << "Could not read symbol table from file "
                  << word_syms_filename;
  }
  {
    // libwebsockets code
    while (destroy_flag)
    {
      lws_service();
    }
  }
}

The segmentation fault happens during websocket initialization.

First, I want to check whether Kaldi supports multithreading or multiplexing.
I got the libwebsockets code from the link below.


Daniel Povey

Aug 24, 2019, 12:55:28 PM
to kaldi-help
Most objects used in Kaldi don't contain their own mutexes for thread safety; if multiple threads will be modifying an object, you need to add your own mutexes to guard it. But Kaldi is designed so that large objects that need to be shared are generally not modified by the code that uses them, so you can share them safely.
If it's an error in websocket init, I suspect it may not even be a Kaldi issue per se.
If you use OpenBLAS, by default it has a limit on the number of threads you can use.

Dan
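
The rule described above (share read-only objects freely, guard anything that threads modify) can be sketched as follows. This is only an illustration in Python rather than Kaldi's C++; SHARED_MODEL, Stats, and decode_stream are hypothetical stand-ins and are not Kaldi APIs.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a large model that is loaded once and only read.
SHARED_MODEL = {"hello": 1, "world": 2}


class Stats:
    """Shared mutable object: every update must hold the lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.frames_seen = 0

    def add(self, n):
        with self._lock:
            self.frames_seen += n


def decode_stream(words, stats):
    # Per-stream state lives in local variables, so it needs no locking.
    ids = [SHARED_MODEL.get(w, 0) for w in words]  # read-only access
    stats.add(len(words))  # guarded write to the shared object
    return ids


def run(streams, num_threads=4):
    stats = Stats()
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        results = list(pool.map(lambda s: decode_stream(s, stats), streams))
    return results, stats.frames_seen
```

The model is never written after loading, so it needs no lock; only the counter, which every thread updates, is protected.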


--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/331457c5-f8d8-4686-9677-12996068e47a%40googlegroups.com.

Nickolay Shmyrev

Aug 25, 2019, 7:23:03 AM
to kaldi-help
Hey.

You should really check out https://github.com/alphacep/kaldi-websocket-python; it contains a proper server implementation with multithreaded processing of many streams and shared model data. Neither the TCP server, the GStreamer server, nor py-kaldi-simple has that.

It also uses asyncio, which makes the parallelization very straightforward.
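
As a rough illustration of that pattern (not code taken from kaldi-websocket-python), the sketch below serves several streams from one asyncio event loop and hands the blocking decode step to a thread pool, with all streams reading the same model object. MODEL and decode_chunk are hypothetical placeholders for the real model and recognizer, and the websocket transport is left out so the example stays self-contained.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for model data loaded once and shared by all streams.
MODEL = {"scale": 2}


def decode_chunk(model, chunk):
    # Placeholder for a blocking, CPU-bound decoder call; it only reads the model.
    return sum(chunk) * model["scale"]


async def handle_stream(pool, chunks):
    loop = asyncio.get_running_loop()
    partials = []
    for chunk in chunks:
        # Off-load the blocking work so the event loop can serve other streams.
        result = await loop.run_in_executor(pool, decode_chunk, MODEL, chunk)
        partials.append(result)
    return partials


async def serve(streams):
    with ThreadPoolExecutor(max_workers=4) as pool:
        return await asyncio.gather(*(handle_stream(pool, s) for s in streams))


def main(streams):
    return asyncio.run(serve(streams))
```

Each stream keeps its own partial results, while the event loop stays free to accept data from other connections during decoding.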

Python also makes the server more flexible than your libwebsockets approach: you can add logging, store results in a database, etc.

If you have any questions you can ask me off-list.



orum farhang

Aug 25, 2019, 7:40:07 AM
to kaldi...@googlegroups.com
Hi Nickolay,

Could you please explain what the benefit is of using GStreamer instead of your sample websocket-python server? I mean, if we can create an instance of the online decoder in Python and use the Python web server to send and receive data, do we still need to use GStreamer? Also, would you mind describing how you bind the C++ code to Python? I see there is a kaldi_recognizer.i file which provides an interface, but how did you generate that file?

Thanks.


Nickolay Shmyrev

Aug 25, 2019, 8:12:58 AM
to kaldi...@googlegroups.com

On Aug 25, 2019, at 14:39, orum farhang <orumf...@gmail.com> wrote:

Hi Nickolay,

Could you please explain what the benefit is of using GStreamer instead of your sample websocket-python server?

Personally, I do not see any use for GStreamer. GStreamer was a thing when the GNOME desktop adopted it as a media framework. It is a pretty complicated beast with plugins, a thousand dependencies, complex memory management, etc. I do not think it is easy to share model data with GStreamer; you have to run separate workers. GStreamer helps to transcode MP3 streams, but you rarely stream MP3 from the web; for FLAC/u-law you are better off adopting a simple standalone FLAC codec. For MP3 decoding I would rather use ffmpeg.

I mean, if we can create an instance of the online decoder in Python and use the Python web server to send and receive data, do we still need to use GStreamer? Also, would you mind describing how you bind the C++ code to Python? I see there is a kaldi_recognizer.i file which provides an interface, but how did you generate that file?

This is a standard SWIG (http://swig.org) interface file. You write the .i file yourself, then use the swig tool to create a Python wrapper from the .cpp and .i files; see the Makefile in the project. Also see

http://swig.org/Doc4.0/SWIGPlus.html

You can use the same interface file to create Java/JavaScript/Perl/Go bindings.

You don't necessarily have to use SWIG; you can use Cython or pybind11 instead, since the C++ interface to wrap is very simple.

Yu Beomgon

Aug 25, 2019, 8:49:33 PM
to kaldi-help

Hi Nickolay,
Could you please explain more about the code and the environment?
For example, from the code it seems that I need to install asyncio, pathlib, and websockets,
but the README has no information about that.
Which libraries should I install to test it?


import asyncio
import pathlib
import websockets
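
For reference, asyncio and pathlib ship with the Python 3 standard library, while websockets is a separate package on PyPI. A small sketch for checking which of these imports are missing from the current environment follows; the helper name missing_modules is hypothetical and is not part of the project.

```python
import importlib.util


def missing_modules(names):
    """Return the names that cannot be imported in the current environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_modules(["asyncio", "pathlib", "websockets"])
    if missing:
        # Anything listed here has to be installed separately, e.g. with pip.
        print("Install with: python3 -m pip install " + " ".join(missing))
    else:
        print("All required modules are available.")
```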

On Sunday, August 25, 2019 at 9:12:58 PM UTC+9, Nickolay Shmyrev wrote:

Ahmet A. Akın

Aug 26, 2019, 2:19:24 AM
to kaldi-help
Hi Dan,

As of version 0.3.7 (released two weeks ago), OpenBLAS adds a new flag (USE_LOCKING) intended specifically for the case where multiple threads call single-threaded OpenBLAS functions. I managed to get it working in a multithreaded application. Before that, it was giving memory errors ("BLAS : Program is Terminated. Because you tried to allocate too many memory regions"). I created a pull request for it.


