Kaldi Pytorch LM rescoring


Rishabh Kumar

Apr 26, 2020, 2:37:05 AM
to kaldi-help
So far Kaldi has no PyTorch LM rescoring, whereas it does have TensorFlow LM rescoring: https://github.com/kaldi-asr/kaldi/tree/master/src/tfrnnlmbin

I am going to write PyTorch LM rescoring code for Kaldi. I am new to this, so if someone could guide me, it would be easier to understand and write the code. Or is there any other way to rescore with a PyTorch LM?
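
For readers new to the idea, rescoring can be illustrated on an n-best list, which is simpler than a lattice. The sketch below is a toy illustration, not Kaldi code: the `Hypothesis` fields, the `rescore_nbest` helper, the stand-in `toy_lm_cost` function, and all numbers are assumptions made up for the example. Costs are negative log-probabilities, so lower is better.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypothesis:
    words: List[str]
    acoustic_cost: float  # negative log-likelihood from the acoustic model
    old_lm_cost: float    # negative log-prob from the first-pass LM


def rescore_nbest(
    nbest: List[Hypothesis],
    new_lm_cost: Callable[[List[str]], float],
    weight: float = 0.5,
) -> List[Hypothesis]:
    """Sort hypotheses by acoustic cost plus an interpolated LM cost.

    `weight` is the share given to the new LM; the rest stays with the old LM.
    """
    def total(h: Hypothesis) -> float:
        lm = (1.0 - weight) * h.old_lm_cost + weight * new_lm_cost(h.words)
        return h.acoustic_cost + lm

    return sorted(nbest, key=total)


def toy_lm_cost(words: List[str]) -> float:
    # Stand-in for a neural LM: charges 1.0 per word, 5.0 for an unlikely word.
    return sum(5.0 if w == "wreck" else 1.0 for w in words)


nbest = [
    Hypothesis(["wreck", "a", "nice", "beach"], acoustic_cost=10.0, old_lm_cost=6.0),
    Hypothesis(["recognize", "speech"], acoustic_cost=10.5, old_lm_cost=6.5),
]
best = rescore_nbest(nbest, toy_lm_cost)[0]
print(" ".join(best.words))
```

With `weight=0.0` the first-pass ranking is kept unchanged; raising the weight lets the new LM reorder the list. Lattice rescoring applies the same idea to a compact graph of hypotheses instead of an explicit list.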

Daniel Povey

Apr 26, 2020, 3:27:38 AM
to kaldi-help
Currently all my energy is going to a ground-up rewrite of Kaldi (or not-exactly-Kaldi) that will use PyTorch
(and eventually, as an alternative, TensorFlow) and will support things like that.  I'm not focusing on adding
features like what you mention for the time being.
If anyone volunteers to work with you, though, great.

Dan

--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/26f489f0-9e99-4561-894f-520a81e3f615%40googlegroups.com.

Rishabh Kumar

Apr 26, 2020, 7:17:41 AM
to kaldi-help
Thanks for the reply, Sir.



Heng xin Fun

Apr 27, 2020, 12:02:27 PM
to kaldi-help
Hi, Is there any way I can help contribute to this initiative? Thanks! Heng


Rishabh Kumar

Apr 27, 2020, 3:09:57 PM
to kaldi-help
Yes Heng, why not... Thanks :)

Daniel Povey

Apr 27, 2020, 11:56:03 PM
to kaldi-help
Again, I think Heng may have been talking about next-gen Kaldi.
Heng: the easiest way would be to choose one of those two projects I mentioned, look for something that seems to be un-implemented, and make a pull request.  Feel free to ask if anyone is working on it before you try.   

We will also start one or two more prongs, which will be about the interfaces of the acoustic modeling part and the RNNLM part.  I have to figure out the details.  But I likely won't pro-actively include you until you prove yourself useful by making a reasonable PR :-)




Rishabh Kumar

Apr 28, 2020, 5:59:31 AM
to kaldi-help
I have a question, Dan Sir: do you think that lattice rescoring with PyTorch does not need a wrapper like the one used for TensorFlow? PyTorch itself has a C++ API which can be used directly.

Right now, I am just replicating the TensorFlow code in PyTorch.

Rémi Francis

Apr 28, 2020, 7:29:29 AM
to kaldi-help
Is there a place (separate from github issues) to follow and to ask questions?

Rishabh Kumar

May 5, 2020, 5:45:36 PM
to kaldi-help
@Dan, what is the difference between lmrescore_rnnlm_lat.sh and lattice-lmrescore-tf-rnnlm.cc?

Daniel Povey

May 5, 2020, 10:23:23 PM
to kaldi-help
The first is a script that calls the second, I believe.


Rishabh Kumar

May 6, 2020, 5:11:23 PM
to kaldi-help
Thanks Dan



Rishabh Kumar

May 7, 2020, 4:33:54 PM
to kaldi-help
Dan, the second file is src/tfrnnlmbin/lattice-lmrescore-tf-rnnlm.cc. It is called from lmrescore_rnnlm_lat.sh through cmd.
My question is: does the second file have to be inside src, or can it be anywhere?



Rishabh Kumar

May 7, 2020, 7:01:45 PM
to kaldi-help
Dan, I have one more question: how does Kaldi know that lattice-lmrescore-tf-rnnlm.cc is inside the folder src/tfrnnlmbin/, given that we do not mention the folder when calling it (for example, in steps/decode.sh)?
Is it because we add the path in tools/config/common_path.sh as "${KALDI_ROOT}/src/tfrnnlmbin:\"?

Rishabh Kumar

May 8, 2020, 7:38:19 AM
to kaldi-help
@Dan can you please reply?

Daniel Povey

May 8, 2020, 8:07:52 AM
to kaldi-help
Yes.. that's a path issue, read about UNIX path.  I don't always reply to basic questions that are not specific to Kaldi.
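
The PATH lookup mentioned here can be sketched as follows. This is a minimal illustration, not Kaldi code, and it assumes a POSIX system: it creates a throwaway directory with a dummy executable named `my-rescore-tool` (a made-up name), prepends that directory to `PATH`, and asks `shutil.which` to resolve the bare name. That is the same kind of search the shell performs when a script calls a program without giving its directory.

```python
import os
import shutil
import stat
import tempfile

# Create a temporary directory holding a dummy executable (hypothetical name).
bin_dir = tempfile.mkdtemp()
tool = os.path.join(bin_dir, "my-rescore-tool")
with open(tool, "w") as f:
    f.write("#!/bin/sh\necho hello\n")
os.chmod(tool, os.stat(tool).st_mode | stat.S_IXUSR)

# Before the directory is on PATH, the bare name cannot be resolved.
before = shutil.which("my-rescore-tool")

# Prepend the directory, as a path-setup script would do with `export PATH=...`.
os.environ["PATH"] = bin_dir + os.pathsep + os.environ.get("PATH", "")
after = shutil.which("my-rescore-tool")

print(before)
print(after)
```

Note that what is found this way is the compiled program, not the .cc source file; the source only has to be wherever the build system expects it.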


Rishabh Kumar

May 8, 2020, 9:42:25 AM
to kaldi-help
Thanks @Dan, Sorry in advance...

In kaldi/egs/wsj/s5/steps/tfrnnlm/lstm.py, why are there separate get_initial_state and single_step functions, when the same can be obtained from the model itself? The TensorFlow model can give all the states and the output.

Correct me if I am wrong.

Daniel Povey

May 8, 2020, 9:52:24 AM
to kaldi-help
I'm sorry, I didn't write that code and I'm not sure.


Rishabh Kumar

May 8, 2020, 10:07:18 AM
to kaldi-help
Can you suggest the right person to ask this question?

Daniel Povey

May 8, 2020, 10:11:15 AM
to kaldi-help, Hainan Xu
cc'd


Rishabh Kumar

May 10, 2020, 10:09:47 PM
to kaldi-help
@Dan I have almost completed the Pytorch rescoring part.

I tried to run lstm.py (TensorFlow) separately, but it had many errors, many of them version-related.

In lstm.py there are two functions:
get_initial_state() -> initial_state
state_steps(context,word_id) -> log_prob, rnn_state, rnn_out
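
One plausible reason for exposing an initial-state call and a per-word step call is that lattice rescoring walks the lattice arc by arc, so the rescoring code has to advance the LM one word at a time and keep the recurrent state for each history, rather than score whole sentences in a single call. The sketch below illustrates that calling pattern with a toy, hand-written model. `ToyStepLM`, its bigram table, and the probabilities are assumptions for illustration and are unrelated to the actual lstm.py.

```python
import math
from typing import Dict, List, Tuple

State = Tuple[str, ...]  # toy "RNN state": here just the previous word


class ToyStepLM:
    """Toy LM with a two-call interface: an initial state and a single step."""

    def __init__(self, bigram: Dict[Tuple[str, str], float], floor: float = 0.01):
        self.bigram = bigram
        self.floor = floor  # probability used for unseen word pairs

    def get_initial_state(self) -> State:
        return ("<s>",)

    def single_step(self, state: State, word: str) -> Tuple[float, State]:
        """Return log P(word | state) and the state after consuming `word`."""
        prob = self.bigram.get((state[-1], word), self.floor)
        return math.log(prob), (word,)


def score_path(lm: ToyStepLM, words: List[str]) -> float:
    """Score one path by stepping through it word by word."""
    state = lm.get_initial_state()
    total = 0.0
    for w in words:
        logp, state = lm.single_step(state, w)
        total += logp
    return total


lm = ToyStepLM({("<s>", "hello"): 0.5, ("hello", "world"): 0.4})

# Two paths share the prefix "hello": the state after "hello" is computed once
# and reused for both continuations, as it would be at a lattice branch point.
state0 = lm.get_initial_state()
logp_hello, state_hello = lm.single_step(state0, "hello")
logp_world, _ = lm.single_step(state_hello, "world")
logp_there, _ = lm.single_step(state_hello, "there")

print(round(logp_hello + logp_world, 4))
print(round(logp_hello + logp_there, 4))
```

A PyTorch model used for rescoring would need to offer the same two operations, with the tuple replaced by the network's hidden state.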

Daniel Povey

May 11, 2020, 12:34:23 AM
to kaldi-help
Great!  Perhaps you can create a repo to show how you are doing this, so we can judge how easy/hard it might be to merge?



Rishabh Kumar

May 11, 2020, 9:36:23 AM
to kaldi-help
Yes, for merging I need to work more on it... but right now I am just trying to get it to run... Maybe after INTERSPEECH... But if you want to see the files and point out anything missing, I can share them with you... There may be a lot of debugging code left in...
maybe in every part, so don't expect a lot from me... also the code is not written the way you write code...

It is dirty code, but after INTERSPEECH I will clean it up and make it good enough to merge...

Rishabh Kumar

May 11, 2020, 3:58:13 PM
to kaldi-help
1. Is it okay if I compile tfrnnlm and tfrnnlmbin separately?
2. Then I make an executable from both.
3. I have modified the Makefile and CMakeLists.txt.


4. kaldi/CMakeLists.txt
adding this:
if(PYTORCH_DIR)
  add_subdirectory(src/pyrnnlm)
  add_subdirectory(src/prnnlmbin)
endif()

adding this:

if d.startswith('pyrnnlm'):
    continue

@Dan
No more... Please reply as soon as possible...

Rishabh Kumar

May 11, 2020, 6:47:06 PM
to kaldi-help
Sorry to disturb you again, Sir...

Sir, I configured kaldi/src/pyrnnlm/CMakeLists.txt successfully with the command "cmake -DCMAKE_PREFIX_PATH=/home/rakesh/rishabh_workspace/Garbage/libtorch".
But when I build the executable with the command "cmake --build . --config Release",
I get the error "fatal error: fst/types.h: No such file or directory".

Rishabh Kumar

May 11, 2020, 11:00:18 PM
to kaldi-help

Sir, how do I run CMakeLists.txt? Is there any source where I can learn how to compile the code?
In CMakeLists.txt there is
add_kaldi_executable(NAME lattice-lmrescore-py-rnnlm SOURCES lattice-lmrescore-py-rnnlm.cc DEPENDS kaldi-pyrnnlm kaldi-lat)

which means that I need to compile it together with Kaldi.
Should I run Kaldi's CMakeLists.txt by calling cmake?

Rishabh Kumar

May 16, 2020, 5:31:15 PM
to kaldi-help
Apologies, @Dan Sir, for asking you foolish questions. It won't happen again.

Sir, here is my repository for enabling RNNLM lattice rescoring with PyTorch: https://github.com/cyfer0618/kaldi-pytorch-rnnlm.git

There is a small compilation issue which I will solve in 2-3 days. Sorry once again, Sir.

Jan Trmal

May 16, 2020, 5:38:32 PM
to kaldi-help
You will have to set your include directory to include the OpenFst directories -- perhaps they are not correct (try to find types.h in your OpenFst install and cross-check against the CMakeLists.txt).
I don't have the time to go through all of this so it's possible your problem is different -- in that case, can you please re-state your problem?
Thanks,
y.
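
The search suggested above can be sketched in a few lines. This is only an illustration: `find_include_dir` is a made-up helper, and the demo builds a fake directory tree instead of touching a real OpenFst install. Given a root directory, it returns the directory that would have to be on the compiler's include path for `#include <fst/types.h>` to resolve.

```python
import os
import tempfile
from typing import Optional


def find_include_dir(root: str, header: str = os.path.join("fst", "types.h")) -> Optional[str]:
    """Return the first directory under `root` that contains `header`, else None."""
    for dirpath, _dirnames, _filenames in os.walk(root):
        if os.path.isfile(os.path.join(dirpath, header)):
            return dirpath
    return None


# Demo on a fake tree that mimics <root>/openfst/include/fst/types.h.
root = tempfile.mkdtemp()
include_dir = os.path.join(root, "openfst", "include")
os.makedirs(os.path.join(include_dir, "fst"))
open(os.path.join(include_dir, "fst", "types.h"), "w").close()

found = find_include_dir(root)
print(found)  # the directory to pass to the compiler as an include directory
```

On a real system the function would be pointed at the OpenFst install directory, and the result compared with the include directories listed in CMakeLists.txt.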


Rishabh Kumar

May 16, 2020, 9:36:05 PM
to kaldi-help
@Yenda Sir, I have solved all the problems I mentioned earlier. Now I am getting compilation errors because of PyTorch C++.

Jan Trmal

May 16, 2020, 9:48:29 PM
to kaldi-help
I'm not an expert in the PyTorch C++ API, but in those cmake files you are showing there, you don't link against the PyTorch library.
This:
-L/home/rakesh/rishabh_workspace/Garbage/kaldi/tools/libtorch -L/home/rakesh/rishabh_workspace/Garbage/kaldi/tools/libtorch/lib -L/home/rakesh/rishabh_workspace/Garbage/kaldi/tools/cuda/lib64
only adds directories to the list of dirs used to look up a library, nothing else...
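
To make the distinction concrete, here is a toy model of how these flags are treated. It is a simplification, not a real linker: `resolve_libraries` is a made-up helper that only looks for `lib<name>.so`, and the demo uses an empty dummy file in a temporary directory. `-L` entries only extend the search list; a library is pulled in only when a `-l<name>` flag asks for it.

```python
import os
import tempfile
from typing import Dict, List


def resolve_libraries(args: List[str]) -> Dict[str, str]:
    """Map each -l<name> request to the first lib<name>.so found in the -L dirs."""
    search_dirs = [a[2:] for a in args if a.startswith("-L")]
    requested = [a[2:] for a in args if a.startswith("-l")]
    resolved = {}
    for name in requested:
        for d in search_dirs:
            candidate = os.path.join(d, "lib" + name + ".so")
            if os.path.isfile(candidate):
                resolved[name] = candidate
                break
    return resolved


# Fake library directory containing an empty libtorch.so (illustration only).
lib_dir = tempfile.mkdtemp()
open(os.path.join(lib_dir, "libtorch.so"), "w").close()

only_search_dirs = resolve_libraries(["-L" + lib_dir])
with_request = resolve_libraries(["-L" + lib_dir, "-ltorch"])

print(only_search_dirs)  # {} : search directories alone link nothing
print(with_request)
```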


Rishabh Kumar

May 16, 2020, 10:58:25 PM
to kaldi-help
@yenda, I added the Torch library, i.e. ${TORCH_LIBRARIES}, in place of these, and then I got the error...

CMake Error: Error evaluating generator expression: 
$<TARGET_PROPERTY:torch_cpu,INTERFACE_INCLUDE_DIRECTORIES> Target "torch_cpu" not found. 
CMake Error: Error evaluating generator expression: 
$<TARGET_PROPERTY:torch_cuda,INTERFACE_SYSTEM_INCLUDE_DIRECTORIES> Target "torch_cuda" not found.



On Sat, May 16, 2020 at 9:36 PM Rishabh Kumar <cyfe...@gmail.com> wrote:
@Yenda Sir, Sir I have solved all problems I have mentioned earlier. Now, I am getting compilation errors because of Pytorch C++.
https://discuss.pytorch.org/t/build-a-pytorch-wrapper-error-c10-todouble-const-similar-errors/81280


Jan Trmal

May 17, 2020, 8:57:05 AM
to kaldi-help
I don't understand the generator expressions -- delete them :p
y.


Rishabh Kumar

May 30, 2020, 9:12:45 PM
to kaldi-help
Sir, I didn't understand what you said... but it now compiles 100%. Thank you Sir.

keli...@gmail.com

Mar 9, 2021, 6:21:34 PM
to kaldi-help
Hi, now Kaldi supports lattice rescoring with PyTorch LMs. Examples for SWBD and WSJ can be found here:
https://github.com/kaldi-asr/kaldi/tree/master/egs/swbd/s5c/local/pytorchnn
https://github.com/kaldi-asr/kaldi/tree/master/egs/wsj/s5/local/pytorchnn

Ke
