Hi Dan,
Some more documentation would be good. There are apparently lots of people experimenting with Kaldi, but what I've noticed is a lack of high-level documentation that simply describes how Kaldi works at a basic level, without getting way down into the weeds: which files and directories are for what, which scripts should be used for different types of training or for simple decoding, and so on. So many of the scripts have similar names, and the comments just don't differentiate enough between what is used for what. Many of us don't need to delve into the nitty-gritty of everything; we just want to run some basic experiments.
I've had a long IT career designing and building complex networks and figuring out complex systems on my own, and I'll eventually figure all this out, but little pointers here and there on how to do basic things without digging too deep really help. I know Kaldi was designed for speech recognition professionals (that seems to be mentioned in every thread on the forums), but there are enough of us who aren't speech recognition professionals experimenting with it that I think some high-level documentation would be worth it, and it would keep a lot of us from asking so many questions. Once I learn this stuff, I'll gladly contribute back with documentation help.
I had already read the link you provided, but I'm still trying to figure it out. I've probably read almost all of the documentation and many forum posts by this point as well.
One thing that would be perfect is something I found in an Amazon VM with pre-built models that's all ready to go (though unfortunately extremely limited in accuracy). It's called "Offline transcription system for Estonian using Kaldi" (it has English models, too; I'm sure you're familiar with it), and it has *exactly* what many of us are looking for: speech2text.sh. I'm trying to modify that script to use the models I've trained, but I still don't know what all of the files do, so it's not working yet.
I have also spoken with many consultants. The ones willing to provide consulting (and not just sell access to their own API) want to sell a complete, turnkey package. I may go with that down the road, but we can't get more funding until I achieve proof of concept, and for that I just want to put my trained models to use decoding some .wav files right now. None seem willing to provide just a couple of hours of hand-holding to get what I have working. We'd gladly pay for two or three hours of consulting for that, which should be more than enough.
I know Kaldi is a complex product in a complex field, but it seems like there should be some simpler way of using it for basic testing, unless I'm somehow over-complicating things.
Thanks for your time,
RB