Creating a graph ended with "Hangup problem" without completing the job on ec2 instance

nda...@eqratech.com

Jul 30, 2018, 7:44:34 AM
to kaldi-help
Dear Dan,
We tried to build a lang directory from a 2 GB text corpus, but it failed because the machine ran out of memory.
So we created an EC2 instance with 64 GB of RAM and added 100 GB of swap, which solved that problem.
When we went ahead to create the graph, the same problem occurred, so I increased the swap to 130 GB to resolve the memory issue. To deal with the process not responding, I split the following step:
fsttablecompose $lang/L_disambig.fst $lang/G.fst | fstdeterminizestar --use-log=true | \
    fstminimizeencoded | fstpushspecial | \
    fstarcsort --sort_type=ilabel > $lang/tmp/LG.fst.$$ || exit 1;
  mv $lang/tmp/LG.fst.$$ $lang/tmp/LG.fst
  fstisstochastic $lang/tmp/LG.fst || echo "[info]: LG not stochastic."

into these separate steps:
        fsttablecompose $lang/L_disambig.fst $lang/G.fst  > $1/tmp/LG.fst.1
        #echo "Compose Complete"
        fstdeterminizestar $1/tmp/LG.fst.1  > $1/tmp/LG.fst.2

        echo "Determinizer Complete"
        fstminimizeencoded $1/tmp/LG.fst.2 > $1/tmp/LG.fst.3
        echo "Encoded Complete"

        fstpushspecial $1/tmp/LG.fst.3 > $1/tmp/LG.fst.4 
        echo "Special Complete"


        fstarcsort --sort_type=ilabel $1/tmp/LG.fst.4 > $1/tmp/LG.fst.5

        mv $1/tmp/LG.fst.5 $1/tmp/LG.fst
        echo "Renaming Complete"
        fstisstochastic $1/tmp/LG.fst || echo "[info]: LG not stochastic."

This segmentation solved the problem of composing LG.fst.
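
For anyone repeating this, the staged approach can be sketched with a small helper that stops at the first failing stage and only renames an output into place once the stage has succeeded. This is a rough sketch, not part of Kaldi: the function name `run_stage` and the `.tmp` suffix are hypothetical, and the example calls in the comments assume the same file names as above.

```shell
#!/usr/bin/env bash
# Sketch only: run_stage is a hypothetical helper, not a Kaldi script.
set -euo pipefail

# run_stage <label> <output-file> <command...>
# Runs the command with stdout redirected to <output-file>.tmp.
# On success the temp file is renamed to <output-file>;
# on failure the temp file is removed and a non-zero status is returned.
run_stage() {
  local label=$1 out=$2
  shift 2
  if ! "$@" > "$out.tmp"; then
    echo "Stage '$label' failed" >&2
    rm -f "$out.tmp"
    return 1
  fi
  mv "$out.tmp" "$out"
  echo "$label complete" >&2
}

# Example calls (assumed paths, matching the steps above):
#   run_stage compose     "$lang/tmp/LG.fst.1" fsttablecompose "$lang/L_disambig.fst" "$lang/G.fst"
#   run_stage determinize "$lang/tmp/LG.fst.2" fstdeterminizestar --use-log=true "$lang/tmp/LG.fst.1"
#   run_stage minimize    "$lang/tmp/LG.fst.3" fstminimizeencoded "$lang/tmp/LG.fst.2"
```

Because of `set -e`, a failed stage ends the script instead of letting later stages read a truncated intermediate file, which is the role `|| exit 1` plays in the original pipeline.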

But in the next step, which uses Ha.fst to build HCLGa.fst, a "Hangup" problem occurs. Note that I ran a shell script to check the memory every 15 seconds, and the log shows that 35G out of 194G was still free before the "Hangup" problem. The following is the original segment of code:
fsttablecompose $dir/Ha.fst "$clg" | fstdeterminizestar --use-log=true \
    | fstrmsymbols $dir/disambig_tid.int | fstrmepslocal | \
     fstminimizeencoded > $dir/HCLGa.fst.$$ || exit 1;
  mv $dir/HCLGa.fst.$$ $dir/HCLGa.fst
  fstisstochastic $dir/HCLGa.fst || echo "HCLGa is not stochastic"

and the modified one is:
fsttablecompose $dir/Ha.fst "$clg" > $dir/Ha.fst.1
  fstdeterminizestar --use-log=true $dir/Ha.fst.1 > $dir/Ha.fst.2
  fstrmsymbols $dir/disambig_tid.int $dir/Ha.fst.2 > $dir/Ha.fst.3 
  fstrmepslocal $dir/Ha.fst.3 > $dir/Ha.fst.4
  fstminimizeencoded $dir/Ha.fst.4 > $dir/HCLGa.fst.$$ || exit 1;
  mv $dir/HCLGa.fst.$$ $dir/HCLGa.fst
  fstisstochastic $dir/HCLGa.fst || echo "HCLGa is not stochastic"
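
The memory check mentioned above is not included in the post. A minimal sketch of such a logger, assuming a Linux host where `/proc/meminfo` reports `MemAvailable` and `SwapFree` in kB, might look like this (the function name `log_memory` and the log format are hypothetical):

```shell
#!/usr/bin/env bash
# Sketch only: log_memory is a hypothetical helper, not the poster's script.

# log_memory <logfile> <interval-seconds> <samples>
# Appends one timestamped line per sample with available RAM and free swap.
log_memory() {
  local log=$1 interval=$2 samples=$3 i
  for ((i = 0; i < samples; i++)); do
    # Values in /proc/meminfo are reported in kB.
    awk -v ts="$(date '+%F %T')" '
      /^MemAvailable:/ { mem = $2 }
      /^SwapFree:/     { swap = $2 }
      END { printf "%s mem_available_kB=%s swap_free_kB=%s\n", ts, mem, swap }
    ' /proc/meminfo >> "$log"
    sleep "$interval"
  done
}

# Example (assumed values): one sample every 15 seconds for 24 hours.
#   log_memory mem.log 15 5760 &
```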


When I left the server working and came back the next day to check the log file, I found "Hangup" in the log; the process had stopped and the memory was free.
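
A note on the "Hangup" message: that is the text a shell prints when a process is killed by SIGHUP, which commonly happens when the terminal or SSH session that started the job closes. Whether that is what happened here is not confirmed in this thread; it is only one possible cause. Under that assumption, a minimal sketch of starting a long job so that it survives a disconnect (the helper name `run_detached` and the log file name are hypothetical):

```shell
#!/usr/bin/env bash
# Sketch only: run_detached is a hypothetical helper.

# run_detached <logfile> <command...>
# Starts the command under nohup in the background, with stdin detached and
# stdout/stderr written to <logfile>, then prints the PID of the job.
run_detached() {
  local log=$1
  shift
  nohup "$@" > "$log" 2>&1 < /dev/null &
  echo $!
}

# Example (script arguments elided):
#   pid=$(run_detached mkgraph.log ./mkgraph2.sh ...)
#   tail -f mkgraph.log
```

Running the script inside `tmux` or `screen` is a common alternative with the same effect.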

I have attached the modified mkgraph shell script to this email.

I would be thankful for your help.

Thank you

Best Regards
Nour Alhuda Damer
mkgraph2.sh

Daniel Povey

Jul 30, 2018, 12:09:29 PM
to kaldi-help
Thanks... that might be of interest to someone in future.  
But in general I recommend pruning the LM to a smaller size to make the graph, and then doing LM rescoring on the lattices with a larger LM after decoding.
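
A rough sketch of that workflow, for illustration only. It assumes SRILM's `ngram` tool is installed and that it is run from a standard Kaldi recipe directory containing `utils/` and `steps/`. Every path, the n-gram order, and the pruning threshold are hypothetical placeholders, and the decoding step is left out because it depends on the model type.

```shell
#!/usr/bin/env bash
# Sketch only: paths, order and threshold are assumptions, not recommendations.
set -euo pipefail

# With DRY_RUN=1 each command is printed instead of executed.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# prune_then_rescore <big-arpa.gz> <lang> <lexicon> <model-dir> <data-dir> <work-dir>
prune_then_rescore() {
  local big_lm=$1 lang=$2 lexicon=$3 model=$4 data=$5 work=$6
  local small_lm=$work/lm_pruned.arpa.gz

  # 1. Prune the large ARPA LM with SRILM (threshold is an assumed value).
  run ngram -order 3 -lm "$big_lm" -prune 1e-8 -write-lm "$small_lm"

  # 2. Build a lang directory and decoding graph from the pruned LM.
  run utils/format_lm.sh "$lang" "$small_lm" "$lexicon" "$work/lang_small"
  run utils/mkgraph.sh "$work/lang_small" "$model" "$model/graph_small"

  # 3. Decode with graph_small into $model/decode_small (not shown here).

  # 4. Convert the large LM to const-arpa format and rescore the lattices.
  run utils/build_const_arpa_lm.sh "$big_lm" "$lang" "$work/lang_big"
  run steps/lmrescore_const_arpa.sh "$work/lang_small" "$work/lang_big" \
    "$data" "$model/decode_small" "$model/decode_big"
}

# Example: print the commands without running them (all arguments assumed).
#   DRY_RUN=1 prune_then_rescore lm_big.arpa.gz data/lang lexicon.txt exp/tri3 data/test work
```

The point of the split is that only the pruned LM has to pass through graph compilation, which is where the memory is consumed; the large LM is applied to the lattices afterwards.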

