Words with empty pronunciation


Rémi Francis

Jun 13, 2016, 1:47:53 PM
to kaldi-help
The Kaldi documentation (http://kaldi-asr.org/doc/graph_recipe_test.html#graph_lexicon) says:
"Notice that we allow words with empty phonetic representations."

However, utils/validate_dict_dir.pl contains this check (line 189):
    if (@col == 0) {
      print "--> ERROR: lexicon.txt contains word $word with empty ";
      print "pronunciation.\n";
      set_to_fail();
    }

Is the doc outdated, or are empty pronunciations allowed only for words like "<s>", "</s>" and "#0"?
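
For readers who want to find such entries before running the validator, the check quoted above can be approximated in a few lines of Python. This is only a sketch, not the code from validate_dict_dir.pl: it assumes each lexicon.txt line is a word followed by whitespace-separated phones, and the function name find_empty_pronunciations and the sample entries are made up for illustration.

```python
def find_empty_pronunciations(lexicon_lines):
    """Return the words whose lexicon entry has no phones after the word.

    Assumes each line looks like "word phone1 phone2 ...", separated by
    whitespace. Hypothetical helper, not part of Kaldi.
    """
    empty = []
    for line in lexicon_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        word, phones = fields[0], fields[1:]
        if not phones:
            # Same condition as the quoted Perl check: no columns are
            # left once the word itself has been taken off the line.
            empty.append(word)
    return empty


# Made-up sample entries; a real lexicon.txt would be read from disk.
lexicon = [
    "hello hh ah l ow",
    "<s>",
    "world w er l d",
]
print(find_empty_pronunciations(lexicon))
```

Any word this returns would trigger the "empty pronunciation" error shown in the snippet.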

Daniel Povey

Jun 13, 2016, 2:09:09 PM
to kaldi-help
The docs are outdated. I think we currently never list those types of
words in the lexicon, only in the word list. I created an issue for
it.
Dan

Rémi Francis

Jun 14, 2016, 5:50:48 AM
to kaldi-help, dpo...@gmail.com
Thanks, I see.
So why are words with empty pronunciations forbidden? With disambiguation symbols, it should be possible to handle them properly.

Daniel Povey

Jun 14, 2016, 2:55:01 PM
to Rémi Francis, kaldi-help
Because they would cause lattice word alignment to fail.
Dan

Rémi Francis

Jun 15, 2016, 7:13:27 AM
to kaldi-help, re...@speechmatics.com, dpo...@gmail.com
Thanks, I see.