Yes, "nnet3bin/nnet3-latgen-faster" is used to generate lattices. A lattice is the result of decoding and contains all of the information from decoding. From the lattice, you can get the best path and so on. The alignment is simply the sequence of "ilabel" values along the best path of the lattice, so you can get the alignment with this binary. Check its usage message; we already provide an alignment option.
Why do you want to check the acoustic model performance on the "test set" rather than the "dev set"?
I think the key point is that, if you want to check the performance, you need a criterion that defines what "good" means.
Obviously, frame accuracy and WER are the most straightforward ways to judge it. Bear in mind that the two are correlated but not directly proportional: higher frame accuracy tends to come with lower WER, but a given gain in frame accuracy does not translate into a fixed gain in WER. This holds for both ML and DT models.
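To make the two criteria concrete, here is a minimal Python sketch of how they could be computed outside of Kaldi. It is only an illustration, not what the Kaldi tools do internally: the function names `frame_accuracy` and `word_error_rate` are made up for this example, and it assumes you have already dumped the reference and hypothesis alignments as equal-length lists of integer frame labels (e.g. pdf-ids or phone ids) and the transcripts as lists of words.

```python
from typing import Sequence


def frame_accuracy(ref: Sequence[int], hyp: Sequence[int]) -> float:
    """Fraction of frames whose hypothesis label matches the reference.

    Assumes both alignments cover the same frames (same length).
    """
    if len(ref) != len(hyp):
        raise ValueError("alignments must have the same number of frames")
    if not ref:
        raise ValueError("alignments must not be empty")
    correct = sum(r == h for r, h in zip(ref, hyp))
    return correct / len(ref)


def word_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """WER = (substitutions + deletions + insertions) / len(ref),
    computed with a standard Levenshtein edit distance over words."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] = edit distance between the first i-1 ref words and first j hyp words
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]  # deleting all i reference words
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Toy data, purely illustrative.
    print(frame_accuracy([1, 1, 2, 2], [1, 1, 2, 3]))
    print(word_error_rate("the cat sat".split(), "the cat sat down".split()))
```

Note that one wrong frame label rarely changes the recognized word, while one insertion in a short utterance moves WER a lot, which is one way to see why the two numbers are correlated but not proportional.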
In addition, you could use some kind of "confidence measure" to help you check the performance.