Time delay neural network implementation

speechMachine

Feb 3, 2016, 1:47:29 PM
to kaldi-help
Hello,

I have a brief question: does Kaldi have an implementation of a time-delay neural network (TDNN)?

Thanks!

Daniel Povey

Feb 3, 2016, 1:49:37 PM
to kaldi-help
Yes. Search for TDNN in the scripts; it's the default recipe in many of them.

speechMachine

Feb 3, 2016, 2:16:27 PM
to kaldi-help
Thanks Dan. I did a grep and found the recipes that use the implementation. From the comments, I understand that the reference paper is the multi-splice paper (http://www.danielpovey.com/files/2015_interspeech_multisplice.pdf). I'm assuming the nnet3 binary for TDNN supports the layered architecture shown in Fig. 1, which allows a TDNN with subsampling at multiple layers. Is that right?

Is a basic single-layer TDNN essentially the same as the usual splicing of frames over a certain context window, as is done in regular DNNs?

Daniel Povey

Feb 3, 2016, 2:19:59 PM
to kaldi-help
> Thanks Dan. I did a grep and found the recipes that use the implementation. From the comments, I understand that the reference paper is the multi-splice paper (http://www.danielpovey.com/files/2015_interspeech_multisplice.pdf). I'm assuming the nnet3 binary for TDNN supports the layered architecture shown in Fig. 1, which allows a TDNN with subsampling at multiple layers. Is that right?

Yes. Well, there's no specific binary for that; it's done via config files. Any recipe that uses 'steps/nnet3/train_tdnn.sh' is a TDNN recipe. The corresponding nnet2 recipes are steps/nnet2/train_multisplice*.sh.

> Is a basic single-layer TDNN essentially the same as the usual splicing of frames over a certain context window, as is done in regular DNNs?

Yes, if there is just one layer.
Dan
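
For readers who want to see the single-layer equivalence concretely, here is a minimal sketch in plain Python. It is not Kaldi code: the function names, dimensions, and the edge handling (clamping to the first/last frame) are assumptions made for illustration. It computes one layer two ways, as a sum of per-offset weight matrices applied to delayed frames and as a single affine transform on spliced frames, and compares the results.

```python
import random

def splice(frames, offsets):
    """Concatenate the frames at t+offset for every t (edges clamped; an assumption)."""
    T = len(frames)
    spliced = []
    for t in range(T):
        row = []
        for off in offsets:
            row.extend(frames[min(max(t + off, 0), T - 1)])
        spliced.append(row)
    return spliced

def affine(rows, weight, bias):
    """Apply y = W x + b to each row x."""
    return [[sum(w * x for w, x in zip(w_row, row)) + b
             for w_row, b in zip(weight, bias)] for row in rows]

def tdnn_delays(frames, offsets, weights, bias):
    """Time-delay view: one weight matrix per offset, contributions summed."""
    T = len(frames)
    out = [list(bias) for _ in range(T)]
    for off, W in zip(offsets, weights):
        for t in range(T):
            x = frames[min(max(t + off, 0), T - 1)]
            for j, w_row in enumerate(W):
                out[t][j] += sum(w * v for w, v in zip(w_row, x))
    return out

random.seed(0)
dim, out_dim, T = 4, 3, 10
offsets = [-2, -1, 0, 1, 2]          # contiguous context window
frames = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(T)]
weights = [[[random.gauss(0, 1) for _ in range(dim)] for _ in range(out_dim)]
           for _ in offsets]
bias = [random.gauss(0, 1) for _ in range(out_dim)]

# Placing the per-offset matrices side by side gives the spliced-input weight matrix.
wide = [sum((W[j] for W in weights), []) for j in range(out_dim)]

a = affine(splice(frames, offsets), wide, bias)
b = tdnn_delays(frames, offsets, weights, bias)
max_diff = max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
```

The two outputs should agree up to floating-point rounding, which is the sense in which a one-layer TDNN is the same as frame splicing followed by an ordinary affine layer.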

Vijayaditya Peddinti

Feb 3, 2016, 2:44:49 PM
to kaldi-help
The core element of the TDNN is the splicing layer (the Append descriptor in nnet3), which can splice non-contiguous indices in the given context. Once this spliced input is formed, it is simply processed by a normal affine layer.

--Vijay
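
A rough sketch of the splice-then-affine idea with non-contiguous offsets, stacked over several layers. This is plain Python, not the nnet3 implementation: the offsets, dimensions, ReLU nonlinearity, edge clamping, and all function names are assumptions chosen for illustration. It also computes the total input context of the stack and checks that perturbing a frame just outside that context leaves the output at frame t unchanged.

```python
import random

def splice(frames, offsets):
    """Concatenate frames at t+offset; offsets need not be contiguous (edges clamped)."""
    T = len(frames)
    return [[v for off in offsets for v in frames[min(max(t + off, 0), T - 1)]]
            for t in range(T)]

def affine_relu(rows, weight, bias):
    """Ordinary affine layer followed by ReLU (the nonlinearity is an assumption)."""
    return [[max(0.0, sum(w * x for w, x in zip(w_row, row)) + b)
             for w_row, b in zip(weight, bias)] for row in rows]

def make_layer(rng, in_dim, out_dim):
    """Random weights and a constant bias, for illustration only."""
    weight = [[rng.gauss(0, 0.3) for _ in range(in_dim)] for _ in range(out_dim)]
    return weight, [0.1] * out_dim

def forward(frames, layer_offsets, params):
    """Each layer splices its input at the given offsets, then applies the affine layer."""
    h = frames
    for offsets, (weight, bias) in zip(layer_offsets, params):
        h = affine_relu(splice(h, offsets), weight, bias)
    return h

def total_context(layer_offsets):
    """Left/right input context of the whole stack: per-layer extremes add up."""
    left = sum(-min(o) for o in layer_offsets)
    right = sum(max(o) for o in layer_offsets)
    return left, right

rng = random.Random(0)
feat_dim, hidden_dim, T = 3, 4, 30
layer_offsets = [[-2, -1, 0, 1, 2], [-1, 2], [-3, 3]]   # hypothetical splice indices

params, in_dim = [], feat_dim
for offsets in layer_offsets:
    params.append(make_layer(rng, in_dim * len(offsets), hidden_dim))
    in_dim = hidden_dim

frames = [[rng.gauss(0, 1) for _ in range(feat_dim)] for _ in range(T)]
left, right = total_context(layer_offsets)   # (6, 7) for the offsets above

t = 15
baseline = forward(frames, layer_offsets, params)[t]
perturbed = [list(f) for f in frames]
perturbed[t - left - 1] = [v + 1.0 for v in perturbed[t - left - 1]]  # just outside context
unchanged = forward(perturbed, layer_offsets, params)[t] == baseline
```

Because the upper layers splice only a few widely spaced indices, the stack covers a wide input context while each affine layer sees a fairly small spliced input.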