Hi, everyone!
I have a 3-gram language model, about 4 GB in total, in iARPA format. Here are the details:
I want to prune it with IRSTLM's prune-lm tool, but it is very slow: the call to lmt.load(...) in prune-lm.cpp spends most of its time loading the 3-grams, while the 1-grams and 2-grams are read very quickly.
Reading 500,000 lines of 3-grams takes about 10 minutes.
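For reference, I invoke prune-lm roughly like this (the threshold values and file names below are only examples, not my exact setup):

    prune-lm --threshold=1e-6,1e-6 lm.3gram.iarpa lm.3gram.pruned.iarpa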
My machine has 6 cores (12 threads) and 64 GB of memory, but the code runs on only a single core.
Can you give me some suggestions?
I have also trained an 80 GB 3-gram model, and pruning that one with IRSTLM is even harder.