SRILM toolkit is probably not installed.


jfische...@gmail.com

Aug 15, 2016, 6:21:14 PM
to kaldi-help
I have been following the "Kaldi for Dummies" tutorial and seem to have everything set up, except that I am getting the error "SRILM toolkit is probably not installed." The path my script references checks out, and SRILM installed fine. I think it has something to do with ngram-count not returning the correct value. The other thing is that I could not use the line "export LC_ALL=C" at the bottom of path.sh, as it was complaining about an incorrect locale. Lastly, the path "/tools/mitlm-svn/lib" referenced in path.sh does not exist on my system, or I cannot find where it resides. Thanks for your help.

(console output)
./run.sh

===== PREPARING ACOUSTIC DATA =====


===== FEATURES EXTRACTION =====

 data/train exp/make_mfcc/train mfccpl
steps/make_mfcc.sh: line 19: parse_options.sh: No such file or directory
 data/test exp/make_mfcc/test mfccn.pl
steps/make_mfcc.sh: line 19: parse_options.sh: No such file or directory
steps/compute_cmvn_stats.sh data/train exp/make_mfcc/train mfcc
make_cmvn.sh: no such file data/train/feats.scp
steps/compute_cmvn_stats.sh data/test exp/make_mfcc/test mfcc
make_cmvn.sh: no such file data/test/feats.scp

===== PREPARING LANGUAGE DATA =====

Checking data/local/dict/silence_phones.txt ...
--> reading data/local/dict/silence_phones.txt
--> data/local/dict/silence_phones.txt is OK

Checking data/local/dict/optional_silence.txt ...
--> reading data/local/dict/optional_silence.txt
--> data/local/dict/optional_silence.txt is OK

Checking data/local/dict/nonsilence_phones.txt ...
--> reading data/local/dict/nonsilence_phones.txt
--> data/local/dict/nonsilence_phones.txt is OK

Checking disjoint: silence_phones.txt, nonsilence_phones.txt
--> disjoint property is OK.

Checking data/local/dict/lexicon.txt
--> reading data/local/dict/lexicon.txt
--> data/local/dict/lexicon.txt is OK

Checking data/local/dict/extra_questions.txt ...
--> data/local/dict/extra_questions.txt is empty (this is OK)
--> SUCCESS [validating dictionary directory data/local/dict]

**Creating data/local/dict/lexiconp.txt from data/local/dict/lexicon.txt
utils/prepare_lang.sh: line 378: fstcompile: command not found
utils/prepare_lang.sh: line 380: fstarcsort: command not found

===== LANGUAGE MODEL CREATION =====
===== MAKING lm.arpa =====

/src/latbin/:/root/kaldi/egs/digits:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:/root/bin
SRILM toolkit is probably not installed.
              Instructions: tools/install_srilm.sh
[root@localhost digits]#



(run.sh portion)
loc=`which ngram-count`
if [ -z "$loc" ]; then
    # Pick the SRILM binary directory that matches this machine's architecture
    if uname -a | grep 64 >/dev/null; then
        sdir=$KALDI_ROOT/tools/srilm/bin/i686-m64
    else
        sdir=$KALDI_ROOT/tools/srilm/bin/i686
    fi

    if [ -f "$sdir/ngram-count" ]; then
        echo "Using SRILM language modelling tool from $sdir"
        export PATH=$PATH:$sdir
    else
        echo "SRILM toolkit is probably not installed.
              Instructions: tools/install_srilm.sh"
        exit 1
    fi
fi



(path.sh file)

# Defining Kaldi root directory

export KALDI_ROOT=`pwd`/../../..

# Setting paths to useful tools

export PATH=$KALDI_ROOT/utils/:$KALDI_ROOT/src/bin:$KALDI_ROOT/tools/openfst/bin:$KALDI_ROOT/src/fstbin/:$KALDI_ROOT/src/gmmbin/:$KALDI_ROOT/src/featbin/:$KALDI_ROOT/src/lm/:$KALDI_ROOT/src/sgmmbin/:$KALDI_ROOT/src/sgmm2bin/:$KALDI_ROOT/src/fgmmbin/:$KALDI_ROOT/src/latbin/:$PWD:$PATH

# Defining audio data directory (modify it for your installation directory!)

export DATA_ROOT="root/kaldi/egs/digits/digits_audio"

# Variable that stores path to MITLM library

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$(pwd)/tools/mitlm-svn/lib

# Variable needed for proper data sorting
#export LC_ALL=C

Daniel Povey

Aug 15, 2016, 6:40:33 PM
to kaldi-help
I suspect you missed a stage early on in the instructions where you
are supposed to create soft links or set up your path or source
path.sh, e.g. parse_options.sh was not found.

Dan
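
One way to check the setup described above is to source path.sh and then confirm that each helper the scripts call resolves on PATH. The following is a minimal sketch, not part of Kaldi: the function name check_tools is hypothetical, and the tool names in the usage line are simply the ones that appear in the console output earlier in this thread. Note that command -v only reports executables, so a file that is meant to be sourced but lacks the execute bit would be listed as missing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which of the named commands resolve on PATH.
# The body runs in a subshell so its variables do not leak into the caller.
check_tools() (
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Exit status is 0 only if every tool was found.
    exit "$missing"
)

# Usage, from the recipe directory after `. ./path.sh`:
#   check_tools parse_options.sh fstcompile fstarcsort ngram-count
```

If anything is reported missing, the PATH set up by path.sh (or the soft links the tutorial asks for) is the first thing to revisit.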

jfische...@gmail.com

Aug 16, 2016, 1:45:16 PM
to kaldi-help, dpo...@gmail.com
I reinstalled all the tools and the parse_options error disappeared; however, I still have an unsorted-data error, which is due to my commenting out export LC_ALL=C in path.sh. If I uncomment that line, I get a different error:

perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = "en_US.UTF-8",
",    LC_ALL = "C
    LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").

I know this may be more of a Linux OS issue, but have you seen this before on CentOS 7.2?

Daniel Povey

Aug 16, 2016, 3:01:50 PM
to jfische...@gmail.com, kaldi-help
I don't use CentOS personally but we haven't had this issue reported before.
It's odd because from what I can tell, it's supposed to always be safe
to export LC_ALL=C; it's the standard way of forcing C-style sorting.
For instance, on Debian, the man page of 'sort' says this:

*** WARNING *** The locale specified by the environment
affects sort order. Set LC_ALL=C to get the traditional sort order
that uses native byte values.


It is probably effectively a bug in CentOS if this does not work (but
check the man page of 'sort' on that platform to verify).
See also this:
https://www.centos.org/forums/viewtopic.php?t=1013
You may be able to fix this by modifying the path.sh to unset LANG and
LANGUAGE at the same place as it does 'export LC_ALL=C'.


Dan
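
To illustrate what "C-style sorting" means here, the sketch below sorts four lines in byte order. The function name sort_bytewise is hypothetical and used only for this example; the only assumption is a standard sort utility.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sort stdin by native byte values, regardless of
# the locale the calling shell happens to be using.
sort_bytewise() {
    LC_ALL=C sort
}

# In byte order, uppercase ASCII letters come before lowercase ones,
# so this prints A, B, a, b (one per line).
printf 'b\nA\na\nB\n' | sort_bytewise
```

Under a locale such as en_US.UTF-8 the same input is typically ordered differently, which is why data sorted under one locale can be reported as unsorted by tools that expect byte order.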

jfische...@gmail.com

Aug 16, 2016, 4:08:49 PM
to kaldi-help, jfische...@gmail.com, dpo...@gmail.com
I had tried that method to fix it, and a few others, with no luck. Do you recommend Debian for running Kaldi? I'm just running a VM, so I'd rather install whatever has been battle-tested than waste any more time fighting CentOS.

Daniel Povey

Aug 16, 2016, 4:17:31 PM
to jfische...@gmail.com, kaldi-help
Yes, we use Debian at JHU so it should work very smoothly.

However, on second thought, it might not be just about the remote OS;
it might be that your ssh session is exporting your local locale to the
cloud machine, which might not support that locale.

http://stackoverflow.com/questions/2499794/how-can-i-fix-a-locale-warning-from-perl

Also you need to export the variables for them to work.
Try doing
unset LANG
unset LANGUAGE
export LC_ALL=C


Dan
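
The three commands above can also be tried on a single command without editing path.sh or changing the current shell. This is only a sketch: the function name with_c_locale is hypothetical, and it runs its arguments in a subshell so the caller's environment is left untouched.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run one command with LANG and LANGUAGE unset and
# LC_ALL exported as C. The parenthesised body is a subshell, so none of
# these changes affect the calling shell.
with_c_locale() (
    unset LANG LANGUAGE
    LC_ALL=C
    export LC_ALL
    "$@"
)

# Usage: see whether perl still prints the locale warning.
#   with_c_locale perl -e 1
```

If the warning persists even under these settings, the locale variables themselves are probably not the whole story.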

Daniel Povey

Aug 16, 2016, 4:48:56 PM
to John Fiscer, kaldi-help
We figured this out off-list, but just for the record, the issue was
that his path.sh had DOS line endings - presumably he had edited it on
Windows. It contains the line
export LC_ALL=C
if you look at it in emacs, but you'll see the [DOS] indication at the
bottom of the screen, indicating that it has DOS-style line endings, so
really it contains:
export LC_ALL=C\r
and it seems that bash does not treat that \r as whitespace; it makes
it part of the variable's value. I sourced his path.sh on my Mac and the
same thing happened to me, as you can see below:
mac:Downloads: . ./path.sh
mac:Downloads: echo $LC_ALL
C
mac:Downloads: echo "$LC_ALL:x"
:x
mac:Downloads: export LC_ALL=C
mac:Downloads: echo "$LC_ALL:x"
C:x
That is why perl was producing the quotes in the wrong place below:
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = "en_US.UTF-8",
", LC_ALL = "C
LANG = "en_US.UTF-8"
because the \r moves the cursor back to the beginning of the line.

Dan
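
For anyone hitting the same symptom, the diagnosis above can be reproduced without emacs. The sketch below is an illustration under stated assumptions: the function names has_cr, strip_cr and show_value are hypothetical, and only grep, tr, printf and od are used.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for spotting and removing DOS (CRLF) line endings.

# Succeeds if the file contains at least one carriage return.
has_cr() {
    grep -q "$(printf '\r')" "$1"
}

# Write a copy of the file to stdout with all carriage returns removed.
strip_cr() {
    tr -d '\r' < "$1"
}

# Dump a value character by character; a stray carriage return shows up
# as a literal \r in the od output instead of silently moving the cursor.
show_value() {
    printf '%s' "$1" | od -c
}

# Usage:
#   if has_cr path.sh; then
#       strip_cr path.sh > path.sh.unix && mv path.sh.unix path.sh
#   fi
#   show_value "$LC_ALL"
```

Converting the file once and sourcing it again should leave LC_ALL set to a plain C, which show_value can confirm.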