Lobby to autodetect kenlm

Kenneth Heafield

Jun 28, 2013, 9:56:14 AM
to jane-...@googlegroups.com
Dear Jane,

    In LMInterface.cpp, the documentation provided to the user says "Default: autodetect. Note however that auto-detection fails for the randlm LM type."  This is not quite true: it also fails for kenlm.  I note you're already calling lm::ngram::RecognizeBinary(fname.c_str(), modelType) to classify the type of kenlm model.  That call also returns false when the file is not one of my binary files, so it can double as the detection test.  So can I convince you to autodetect kenlm?

Change

        if (userType == LMInterface::autodetectType) {
            if (identifier == "\\data\\" || identifier.substr(0, 18) == "SRILM_BINARY_NGRAM")
                lm = new SriLMInterface(config, fname, externalAlphabet);
            else
                lm = new BinaryLMInterface(config, fname, externalAlphabet);
            fp.close();
        } else {

to

        lm::ngram::ModelType modelType;
        if (userType == LMInterface::autodetectType) {
            if (identifier == "\\data\\" || identifier.substr(0, 18) == "SRILM_BINARY_NGRAM")
                userType = LMInterface::sriLmType;
            else if (lm::ngram::RecognizeBinary(fname.c_str(), modelType))
                userType = LMInterface::kenLmType;
            else
                userType = LMInterface::janeLmType;
        }
        {

Then presumably remove the spurious braces and indentation. 
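
For illustration, the detection order proposed above could be isolated into a small helper like the one below. This is only a sketch under assumptions: the `LmType` enum, the `resolveLmType` function, and the `KenLmProbe` callback are hypothetical names that are not part of Jane or KenLM. The callback stands in for lm::ngram::RecognizeBinary so the snippet compiles without KenLM installed.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for the LMInterface type constants quoted above.
enum class LmType { Autodetect, Sri, Ken, Jane };

// Hypothetical callback standing in for lm::ngram::RecognizeBinary:
// it should return true only if the file is a KenLM binary.
using KenLmProbe = std::function<bool(const std::string&)>;

// Resolves Autodetect to a concrete type; an explicit user choice is kept.
// `identifier` is the leading token already read from the file.
LmType resolveLmType(LmType userType,
                     const std::string& identifier,
                     const std::string& fname,
                     const KenLmProbe& isKenLmBinary) {
    if (userType != LmType::Autodetect)
        return userType;
    // ARPA text files and SRILM binaries are recognised by their header.
    if (identifier == "\\data\\" ||
        identifier.substr(0, 18) == "SRILM_BINARY_NGRAM")
        return LmType::Sri;
    // Next, ask the KenLM probe whether it recognises the file.
    if (isKenLmBinary(fname))
        return LmType::Ken;
    // Everything else falls back to Jane's own binary format.
    return LmType::Jane;
}
```

In the real code the probe would presumably wrap the existing RecognizeBinary call, and the resolved type would then feed the same branch that already handles an explicitly configured LM type.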

Kenneth

Stephan Peitz

Jul 8, 2013, 2:51:04 AM
to jane-...@googlegroups.com
Hi Kenneth,

Thanks! It's fixed!

Stephan