Hi all,
I noticed that there are --ivector-silence-weighting.* options for the nnet3 online decoders.
I am wondering whether there are any similar options in nnet3 model training, so that the model is trained with silence-weighted ivectors and a mismatch between the training and decoding setups is avoided. I have not found anything like that.
I am using steps/online/nnet2/extract_ivectors_online.sh to extract ivectors for ASR training, and it does not seem to use any speech / non-speech information (e.g. from alignments).
If it is not implemented yet, do you think it would be helpful? I assume it may have an impact when the training data contains a lot of silence/noise segments.
Thanks!
Best regards,
Filip