Hi Daniel,
I want to train a TDNN model on the LibriSpeech dataset.
I went through the network configuration and tried to understand its topology.
I have attached the topology from "run_tdnn_1h.sh" below, and I have a few questions about the highlighted lines:
input dim=100 name=ivector
input dim=40 name=input
fixed-affine-layer name=lda input=Append(-1,0,1,ReplaceIndex(ivector, t, 0)) affine-transform-file=$dir/configs/lda.mat
relu-batchnorm-dropout-layer name=tdnn1 $tdnn_opts dim=768
tdnnf-layer name=tdnnf2 $tdnnf_opts dim=768 bottleneck-dim=96 time-stride=1
tdnnf-layer name=tdnnf3 $tdnnf_opts dim=768 bottleneck-dim=96 time-stride=1
tdnnf-layer name=tdnnf4 $tdnnf_opts dim=768 bottleneck-dim=96 time-stride=1
tdnnf-layer name=tdnnf5 $tdnnf_opts dim=768 bottleneck-dim=96 time-stride=0
tdnnf-layer name=tdnnf6 $tdnnf_opts dim=768 bottleneck-dim=96 time-stride=3
tdnnf-layer name=tdnnf7 $tdnnf_opts dim=768 bottleneck-dim=96 time-stride=3
tdnnf-layer name=tdnnf8 $tdnnf_opts dim=768 bottleneck-dim=96 time-stride=3
tdnnf-layer name=tdnnf9 $tdnnf_opts dim=768 bottleneck-dim=96 time-stride=3
tdnnf-layer name=tdnnf10 $tdnnf_opts dim=768 bottleneck-dim=96 time-stride=3
tdnnf-layer name=tdnnf11 $tdnnf_opts dim=768 bottleneck-dim=96 time-stride=3
tdnnf-layer name=tdnnf12 $tdnnf_opts dim=768 bottleneck-dim=96 time-stride=3
tdnnf-layer name=tdnnf13 $tdnnf_opts dim=768 bottleneck-dim=96 time-stride=3
linear-component name=prefinal-l dim=192 $linear_opts
prefinal-layer name=prefinal-chain input=prefinal-l $prefinal_opts small-dim=192 big-dim=768
output-layer name=output include-log-softmax=false dim=$num_targets $output_opts
prefinal-layer name=prefinal-xent input=prefinal-l $prefinal_opts small-dim=192 big-dim=768
output-layer name=output-xent dim=$num_targets learning-rate-factor=$learning_rate_factor $output_opts
1. What is the dimension of the input layer (fixed-affine)? Is it 100+40?
2. I guess tdnn_i+1 gets a window of context from tdnn_i as its input.
If so, how do I know the size of that context window? Is it related to "time-stride"?
3. The two highlighted lines that relate to the output of the net: do they define one layer or two?
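To make questions 1 and 2 concrete, here is a rough sketch of how I would count things myself. It rests on assumptions I have not confirmed: that the bare offsets -1,0,1 inside Append(...) refer to the 40-dim "input" node, that ReplaceIndex(ivector, t, 0) contributes the ivector once, that tdnn1 adds no temporal context, and that a tdnnf-layer with time-stride=s sees s frames on each side. The helper names (split_top_level, append_dim, total_context) are my own and are not part of Kaldi. Under those assumptions the lda input would be 220-dimensional rather than 140, and the whole net would see 28 frames on each side. Please correct me if this reading is wrong.

```python
import re

# Values copied from the config above.
INPUT_DIMS = {"input": 40, "ivector": 100}
LDA_INPUT = "Append(-1,0,1,ReplaceIndex(ivector, t, 0))"
# time-stride of tdnnf2 .. tdnnf13, in order (12 layers).
TDNNF_TIME_STRIDES = [1, 1, 1, 0, 3, 3, 3, 3, 3, 3, 3, 3]


def split_top_level(text):
    """Split on commas that are not nested inside parentheses."""
    parts, depth, current = [], 0, ""
    for ch in text:
        if ch == "," and depth == 0:
            parts.append(current.strip())
            current = ""
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        current += ch
    parts.append(current.strip())
    return parts


def append_dim(descriptor, dims):
    """Return (spliced dimension, frame offsets) for an Append(...) descriptor.

    ASSUMPTION: a bare integer is a time offset applied to the node named
    "input"; ReplaceIndex(name, ...) contributes dims[name] once.
    """
    assert descriptor.startswith("Append(") and descriptor.endswith(")")
    total, offsets = 0, []
    for part in split_top_level(descriptor[len("Append("):-1]):
        if re.fullmatch(r"-?\d+", part):
            offsets.append(int(part))
            total += dims["input"]
        else:
            name = re.match(r"ReplaceIndex\((\w+)", part).group(1)
            total += dims[name]
    return total, offsets


def total_context(lda_offsets, time_strides):
    """Return (left, right) context in input frames.

    ASSUMPTION: each tdnnf-layer with time-stride=s adds s frames per side.
    """
    left = -min(lda_offsets) + sum(time_strides)
    right = max(lda_offsets) + sum(time_strides)
    return left, right


lda_dim, lda_offsets = append_dim(LDA_INPUT, INPUT_DIMS)
left, right = total_context(lda_offsets, TDNNF_TIME_STRIDES)
print("lda input dim:", lda_dim)
print("left/right context:", left, right)
```

If the per-layer assumption holds, changing a single time-stride in the list shows directly how much that layer contributes to the overall context.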
Thanks for your time,