If you don't have silences, the utterances may be cut off in the middle, and that's not good. Also, test data may have silences. There isn't a hard and fast rule.
Anyway, I doubt that the problems with your TDNN+LSTM were related to this. I think it's more likely there was an unrelated error such as a tree mismatch or wrong chunk-size or extra-{left,right}-context options.
However, it's possible that there was an issue with silence. Try:

grep optional exp/your-gmm-dir/log/analyze_alignments.log

which should show how often the optional-silence phone appears in the alignments.
One possibility is that your training data had very little silence in its alignments, so the den.fst assigned a very low probability to silence. That would have tended to push up the acoustic probabilities of silence, which in turn could lead to a lot of silence being recognized at decode time, since the lexicon fixes the probability of silence at 0.5. But my feeling is that if this were a likely failure mode, we'd have seen it before. Were you using a lexicon with pronunciation and silence probabilities (utils/dict_dir_add_pronprobs.sh)?
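To make the mismatch concrete, here is a toy arithmetic sketch (not Kaldi code; the frame counts are made up) of what happens when the alignments contain very little silence while the lexicon fixes P(silence) at 0.5: the gap between the two has to be absorbed somewhere, and it tends to show up as inflated acoustic scores for silence.

```python
import math

# Hypothetical alignment statistics, chosen to illustrate the point:
sil_frames = 500        # frames aligned to optional silence
total_frames = 100_000  # total frames in the training alignments

# Silence prior as seen by the den.fst (roughly, its relative frequency
# in the alignments) versus the probability fixed in the lexicon.
den_sil_prob = sil_frames / total_frames
lexicon_sil_prob = 0.5

# Log-ratio the acoustic model is pushed to absorb so that silence
# comes out with comparable total score at decode time.
log_boost = math.log(lexicon_sil_prob / den_sil_prob)

print(f"den silence prior: {den_sil_prob:.4f}")
print(f"log boost absorbed by acoustics: {log_boost:.2f}")
```

With these numbers the acoustic model ends up compensating by several nats in favor of silence, which is consistent with decoding a lot of spurious silence.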