There will generally be an optimal beam, but the differences are very small, and in your case they may
simply be due to random noise rather than real differences.
As you make the beam smaller, you lose more data to failed alignment (and/or make more alignment errors). As you make the beam larger, you end up aligning data that may have had a wrong transcript, so in the limit a larger beam can sometimes make things slightly worse. This is probably dataset-dependent (it depends on how many bad transcripts you have). In general we expect any differences to be quite tiny.
Larger beam is slower, too.
Also, I think gmm-align-compiled has two beams: one for the initial pass, and one for a second pass if the initial pass fails to reach the final state.
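
To make the two-beam idea concrete, here is a toy sketch in Python. It is not Kaldi code: the function names (`beam_align`, `align_with_retry`), the left-to-right topology, and the cost matrix are all made up for illustration. As far as I know, the two beams correspond to the `--beam` and `--retry-beam` options of gmm-align-compiled, but treat that mapping as an assumption and check the tool's usage message.

```python
import math

def beam_align(frame_costs, beam):
    """Toy Viterbi over a left-to-right chain with beam pruning.

    frame_costs[t][s] is the cost of being in state s at frame t.
    Returns the best cost of ending in the last state, or None if
    pruning removed every path that reaches it.
    """
    num_states = len(frame_costs[0])
    active = {0: frame_costs[0][0]}  # must start in state 0
    for costs in frame_costs[1:]:
        nxt = {}
        for s, c in active.items():
            for t in (s, s + 1):  # self-loop or advance one state
                if t < num_states:
                    cand = c + costs[t]
                    if cand < nxt.get(t, math.inf):
                        nxt[t] = cand
        best = min(nxt.values())
        # Beam pruning: drop states much worse than the current best.
        active = {s: c for s, c in nxt.items() if c <= best + beam}
    return active.get(num_states - 1)

def align_with_retry(frame_costs, beam, retry_beam):
    """First pass with `beam`; on failure, second pass with `retry_beam`."""
    result = beam_align(frame_costs, beam)
    if result is None and retry_beam > beam:
        result = beam_align(frame_costs, retry_beam)
    return result

# Hypothetical data: the only complete path (0 -> 1 -> 2) looks bad at frame 1.
toy_costs = [
    [0, 0, 0],
    [0, 5, 0],
    [0, 0, 0],
]
first_pass = beam_align(toy_costs, beam=2)                       # pruned away
with_retry = align_with_retry(toy_costs, beam=2, retry_beam=10)  # recovered
```

With the narrow beam the only path that can reach the final state is pruned at the second frame, so the first pass fails; the wider retry beam keeps it alive. This is the same trade-off as above: the retry pass rescues utterances that would otherwise be dropped, at the cost of extra time and possibly aligning some with bad transcripts.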