What are oov.int and sets.int in thchs30?


yangyu1980.1

Jun 26, 2018, 3:24:16 AM
to kaldi-help
Hi everyone,

I am trying to run run.sh from the thchs30 recipe, but it fails because some ".int" files are missing, for example oov.int and sets.int, which s5/steps/train_mono.sh needs.
I can't find any material on how to generate these .int files.
Please help.

BR
Yangyu

jhennrich89

Jun 26, 2018, 7:37:00 AM
to kaldi-help
The files should be created by prepare_lang.sh. All *.int files are equivalent to the *.txt files of the same name, but with integer IDs instead of symbol names. The mappings from symbol names to IDs can be found in the files that are used as isymbols or osymbols. For oov.int it's usually words.txt, and for sets.int it's phones.txt.
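
As a rough illustration of the mapping described above, here is a minimal Python sketch of a symbol-to-integer conversion. It is not Kaldi's own code: the function names `load_symbol_table` and `symbols_to_ints` and the sample table contents are made up for the example. It assumes a symbol table with one "symbol id" pair per line, and it follows the commands used later in this thread, where oov.txt is mapped through words.txt and sets.txt through phones.txt.

```python
def load_symbol_table(lines):
    """Parse 'symbol id' lines (the assumed layout of words.txt / phones.txt)."""
    table = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        symbol, idx = fields
        table[symbol] = int(idx)
    return table


def symbols_to_ints(text_lines, table):
    """Replace every whitespace-separated symbol on each line with its integer id."""
    out = []
    for line in text_lines:
        ids = [str(table[sym]) for sym in line.split()]  # KeyError if a symbol is unknown
        out.append(" ".join(ids))
    return out


# Hypothetical table contents, for illustration only.
words = load_symbol_table(["<eps> 0", "<SPOKEN_NOISE> 1", "hello 2"])
phones = load_symbol_table(["<eps> 0", "sil 1", "a1 2", "a2 3"])

oov_int = symbols_to_ints(["<SPOKEN_NOISE>"], words)   # oov.txt lines -> oov.int lines
sets_int = symbols_to_ints(["sil", "a1 a2"], phones)   # sets.txt lines -> sets.int lines
print(oov_int, sets_int)
```

Each output line keeps the same number of fields as the input line, so a .int file lines up one-to-one with its .txt counterpart.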

yangyu1980.1

Jun 27, 2018, 2:55:31 AM
to kaldi-help
Hi Hennrich,

Thanks for the information.
So I need to write a script like prepare_lang.sh, right?

BR
Yangyu

On Tuesday, June 26, 2018 at 7:37:00 PM UTC+8, Johannes Hennrich wrote:

Daniel Povey

Jun 27, 2018, 1:29:50 PM
to kaldi-help
You shouldn't have to write anything, it should work out of the box.
If you actually showed us the output on the screen when you run run.sh
we would probably be able to immediately say what the problem is. The
problem is that you are trying to describe the error but you likely
can't even see where the initial error was.
> --
> Go to http://kaldi-asr.org/forums.html find out how to join
> ---
> You received this message because you are subscribed to the Google Groups
> "kaldi-help" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to kaldi-help+...@googlegroups.com.
> To post to this group, send email to kaldi...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/kaldi-help/970016b2-3dec-48ab-a5f0-7efc4c81f2e4%40googlegroups.com.
>
> For more options, visit https://groups.google.com/d/optout.

yangyu1980.1

Jun 28, 2018, 8:56:53 AM
to kaldi-help
Hi Povey,

log.txt is attached. It shows "cat: data/lang/oov.int: No such file or directory".
Besides the missing oov.int, the log also contains the errors below, but I don't know whether they are important:
ERROR: GenericRegister::GetEntry: No such file or directory
ERROR: MutableFst::Read: Unknown FST type "vector" (arc type = "standard"): standard input
ERROR: GenericRegister::GetEntry: No such file or directory
ERROR: MutableFst::Read: Unknown FST type "vector" (arc type = "standard"): standard input

There is no script for creating ".int" files in the thchs30 folder.

There is thchs30/s5/data/lang/oov.txt, whose content is <SPOKEN_NOISE>. I can create the oov.int file with "cat data/lang/oov.txt | utils/sym2int.pl data/lang/words.txt > data/lang/oov.int".
There is thchs30/s5/data/lang/phones/sets.txt. I can create the sets.int file with "cat data/lang/phones/sets.txt | utils/sym2int.pl data/lang/phones.txt > data/lang/phones/sets.int".
I have uploaded oov.txt, oov.int, sets.txt, and sets.int. Are the .int files right?

BR
Yangyu

On Thursday, June 28, 2018 at 1:29:50 AM UTC+8, Dan Povey wrote:
log.txt
sets.int
sets.txt
oov.int
oov.txt
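
One way to sanity-check hand-made .int files like the ones attached is to map the integers back to symbols and compare the result with the matching .txt file. The Python sketch below shows the idea under stated assumptions: the helper names `ints_to_symbols` and `int_file_matches` are hypothetical, and the phone table and file contents are invented, since the attachments are not reproduced in this thread.

```python
def ints_to_symbols(int_lines, table):
    """Map each integer id on each line back to its symbol (the inverse mapping)."""
    inverse = {idx: sym for sym, idx in table.items()}
    return [" ".join(inverse[int(tok)] for tok in line.split()) for line in int_lines]


def int_file_matches(txt_lines, int_lines, table):
    """True if the .int lines decode to exactly the .txt lines (whitespace-normalized)."""
    normalized = [" ".join(line.split()) for line in txt_lines]
    return ints_to_symbols(int_lines, table) == normalized


# Hypothetical phone table and file contents, for illustration only.
phones = {"<eps>": 0, "sil": 1, "a1": 2, "a2": 3}
sets_txt = ["sil", "a1 a2"]

print(int_file_matches(sets_txt, ["1", "2 3"], phones))  # ids decode to the same lines
print(int_file_matches(sets_txt, ["1", "3 2"], phones))  # same ids, different order
```

A round trip like this only shows that the two files agree with the symbol table that was used. It says nothing about whether that table is the right one for the file.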

Daniel Povey

Jun 28, 2018, 1:12:58 PM
to kaldi-help
I suspect that you have installed OpenFst at the system level (e.g.
using a system installer like apt-get or yum), and the version is
incompatible with Kaldi somehow. If you do
. ./path.sh
which fstcopy
it might tell you where the OpenFst is coming from. Removing the
system OpenFst, if that exists, might help.
The following error:

ERROR: GenericRegister::GetEntry: No such file or directory
ERROR: MutableFst::Read: Unknown FST type "vector" (arc type =
"standard"): standard input

basically means that the OpenFst installation is super badly messed
up, because the standard-arc vector FST is the most common, basic type
of FST.



Dan
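
The check suggested above comes down to asking which `fstcopy` executable is found first on the PATH once path.sh has been sourced. Note that `. ./path.sh` sources the file into the current shell and normally prints nothing, whereas running `./path.sh` as a command executes it in a child process and leaves the current shell's PATH unchanged. The Python sketch below shows the lookup itself. The helper names `find_tool` and `all_candidates` are hypothetical, and the script only sees the PATH of the environment it is started from.

```python
import os
import shutil


def find_tool(name, path_value=None):
    """Return the first executable called `name` on the given PATH string, or None."""
    return shutil.which(name, path=path_value)


def all_candidates(name, path_value):
    """List every executable `name` found in the PATH directories, in search order."""
    hits = []
    for directory in path_value.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            hits.append(candidate)
    return hits


path_value = os.environ.get("PATH", "")
print(find_tool("fstcopy", path_value))       # None means no fstcopy is on PATH
print(all_candidates("fstcopy", path_value))  # several hits suggest a conflicting install
```

If the first hit is outside the Kaldi tools directory, a system-level OpenFst is probably shadowing the one Kaldi built. If there is no hit at all, the OpenFst binaries were likely never built or are not on the PATH.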

yangyu1980.1

Jun 29, 2018, 4:54:16 AM
to kaldi-help
Hi Dan,

I installed Kaldi on Windows 7 under Cygwin, and I didn't use apt-get or yum to install OpenFst.
The OpenFst version is 1.6.7; it was installed by kaldi/tools/Makefile.
Running ./path.sh in thchs30/s5 prints nothing.
"which fstcopy" reports no fstcopy. In fact, there is no fstcopy.exe in kaldi/tools/openfst-1.6.7/bin/.

BR
Yangyu

On Friday, June 29, 2018 at 1:12:58 AM UTC+8, Dan Povey wrote:

yangyu1980.1

Jun 29, 2018, 6:02:26 AM
to kaldi-help
Hi Dan,

I used "./configure --static --static-fst" to configure kaldi/src, following the instruction in the output below. I don't know whether this is related to the OpenFst error.

yangyu4@LIHY7-PC0B0J9W ~/kaldi/src
$ ./configure
Configuring ...
Backing up kaldi.mk to kaldi.mk.bak ...
Checking compiler g++ ...
Checking OpenFst library in /home/yangyu4/kaldi/tools/openfst ...
***configure failed: Dynamic libraries are not supported on this platform.
             Run configure with --static --static-fst flag. *** 

BR
Yangyu

On Friday, June 29, 2018 at 4:54:16 PM UTC+8, yangyu1980.1 wrote:

Daniel Povey

Jun 29, 2018, 1:54:44 PM
to kaldi-help
Oh.

So many things break on cygwin with recent Windows versions that I
don't even bother trying to debug it. I wouldn't even bother trying.
Anyway it's an OpenFst issue not a kaldi issue, and the OpenFst code
involved (about registering FST types) is quite complicated, it's code
that I don't fully understand, so I don't know what the issue might
be.
I advise using a UNIX-like system. Windows/cygwin used to work but
recently the platform has become too unstable to properly maintain,
and anyway it's declining in popularity.

Dan

yangyu1980.1

Jun 30, 2018, 11:43:37 PM
to kaldi-help
Hi Dan,

OK, many thanks for your advice.
On Ubuntu there is no such problem.

BR
Yangyu

On Saturday, June 30, 2018 at 1:54:44 AM UTC+8, Dan Povey wrote: