I think I have a pretty good understanding of how Kaldi works at this point: which files need to be created, which scripts to use, etc. I'm not a complete noob, is what I'm saying.
One thing I'm having trouble with is the phones.txt file. I think I know what phones are in general, but what must be put in the phones.txt file (in the data/lang directory)? Where do those symbols come from? Do I choose them from somewhere (like the International Phonetic Alphabet), or is it something else entirely? For instance, egs/yesno uses only 2 phones (Y N), even though the spoken Hebrew words for 'yes' and 'no' clearly contain more than that. And in egs/voxforge I'm seeing a ridiculous number of phones, 167 I think, even though I read that the English language uses around 40 different phones. A lot of the voxforge entries also seem complex, in that they have suffixes made of other letters. What is that about?
I'd really appreciate it if someone could clear this up for me. I haven't found anything in the documentation that really answers this. Thank you in advance.