# local/run_unk_model.sh
This script demonstrates how you can add something to the decoding graph that makes it possible to decode arbitrary phone sequences in addition to regular words; they fill the slot where the "unknown word" would normally appear in the LM.
But don't expect the resulting decoded phone sequences to be particularly accurate. Also, getting the actual phone sequences is not 100% trivial; the simplest method would be to pipe the lattices through lattice-best-path (with the correct acoustic scale), then lattice-align-words, then lattice-arc-post, then int2sym.pl a couple of times with suitable options to convert the words and phones to text form.
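A rough sketch of that pipeline, under stated assumptions: the function names, the argument layout and the 0.1 acoustic scale are placeholders, and lattice-1best stands in for the best-path step because the later stages read lattices rather than transcriptions. The second function only imitates what two passes of utils/int2sym.pl (one for the word field, one for the phone fields) would do, so that the conversion step can be tried without a Kaldi build.

```shell
#!/usr/bin/env bash
# Sketch only. Paths, the 0.1 acoustic scale and the choice of
# lattice-1best for the best-path step are assumptions.

# Best path -> word-aligned lattice -> one text line per arc.
# Needs the Kaldi binaries on PATH; defining the function runs nothing.
best_path_arcs() {
  local lang=$1 model=$2 lats=$3
  lattice-1best --acoustic-scale=0.1 "ark:gunzip -c $lats|" ark:- |
    lattice-align-words "$lang/phones/word_boundary.int" "$model" ark:- ark:- |
    lattice-arc-post "$model" ark:- -
}

# lattice-arc-post lines look like:
#   <utt-id> <start-frame> <num-frames> <posterior> <word-id> <phone-id> ...
# Map field 5 through words.txt and fields 6+ through phones.txt.
arcs_to_text() {
  local words=$1 phones=$2
  awk -v wf="$words" -v pf="$phones" '
    BEGIN {
      # Symbol tables are "<symbol> <integer-id>" per line.
      while ((getline line < wf) > 0) { split(line, a, " "); w[a[2]] = a[1] }
      while ((getline line < pf) > 0) { split(line, a, " "); p[a[2]] = a[1] }
    }
    {
      out = $1 " " $2 " " $3 " " $4 " " w[$5]
      for (i = 6; i <= NF; i++) out = out " " p[$i]
      print out
    }'
}
```

With Kaldi available, something like `best_path_arcs data/lang exp/model/final.mdl exp/model/decode/lat.1.gz | arcs_to_text data/lang/words.txt data/lang/phones.txt` would print the words and phones as text; those paths are hypothetical.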
Dan
--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Thanks, I'll look at G.fst.
Note that before I built the unk model, I used the graph I built following the aspire directory, and I did get unk in the output, so that means it exists in G.fst, doesn't it? Correct me if I'm wrong, but I am doing exactly what you prescribed:

1. lattice-best-path, as follows:

    CompactLattice clat;
    bool end_of_utterance = true;
    decoder->GetLattice(end_of_utterance, &clat);
    CompactLattice best_path_clat;
    CompactLatticeShortestPath(clat, &best_path_clat);
    Lattice best_path_lat;
    ConvertLattice(best_path_clat, &best_path_lat);

2. lattice-align-words, as follows:

    WordBoundaryInfoNewOpts opts;
    WordBoundaryInfo info(opts, "/path/to/word_boundary.int");
    CompactLattice aligned_clat;
    bool ok = WordAlignLattice(best_path_clat, trans_model, info, 0, &aligned_clat);

3. lattice-arc-post, as follows:

    kaldi::TopSortCompactLatticeIfNeeded(&aligned_clat);
    kaldi::ArcPosteriorComputer computer(aligned_clat, min_post, false, &trans_model);
    std::vector<int32> phoneme = computer.OutputPosteriors();

Instead of printing the phonemes, I changed the OutputPosteriors function to return them.
Isn't that the right way to get the phonemes? For words that are in my lexicon, I do get the phonemes this way ...
That was the problem. I took care of it, and now I get the phonemes as I wanted.
Thank you!
The next problem is that sometimes I get phonemes that are far from what I actually said.
I saw that you wrote that the process should be performed several times
until I get maximum accuracy. What is the purpose of the repetitions?