1)
I thought of getting the pdf-ids (senones) for every frame of every utterance used in training, like this:
ali-to-pdf exp/nnet_ali/final.mdl "ark:gunzip -c exp/tri_ali/ali.*.gz|" ark,t:all_senone_alignments.txt
Then I would write a script that counts how many times each pdf-id occurs. Is this the right way?
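A minimal sketch of such a counting script, in case it helps to make the question concrete. It assumes the text-format output of ali-to-pdf has one line per utterance: the utterance id followed by space-separated integer pdf-ids, one per frame. The function names and the example lines are made up for illustration.

```python
from collections import Counter


def count_pdf_ids(lines):
    """Count occurrences of each pdf-id in text-format alignment lines.

    Assumed line format: "<utt-id> <pdf-id> <pdf-id> ...".
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        # fields[0] is the utterance id; the rest are per-frame pdf-ids.
        counts.update(int(pdf_id) for pdf_id in fields[1:])
    return counts


def relative_frequencies(counts):
    """Turn raw frame counts into relative frequencies that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pdf_id: n / total for pdf_id, n in counts.items()}


# Hypothetical alignment lines, only to show the expected input shape.
example = ["utt1 0 0 3 3 3", "utt2 3 7 7"]
counts = count_pdf_ids(example)
priors = relative_frequencies(counts)

for pdf_id in sorted(counts):
    print(pdf_id, counts[pdf_id], priors[pdf_id])
```

For the real data, an open file handle can be passed instead of the list, e.g. `count_pdf_ids(open("all_senone_alignments.txt"))`. Note that pdf-ids which never occur in the alignments will simply be missing from the counts.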
2)
Is this assumption fine? "All senones in a DNN-HMM system are equally likely"