Reading HCLG.fst is slow


7862...@qq.com

Dec 28, 2017, 7:55:01 AM
to kaldi-help
Hi,

I am trying to use the command online2-wav-nnet3-latgen-faster to decode a wav file with my model. The decoding result is pretty good, but it takes a long time: about 3 minutes before I get the result. I debugged and found that the program spends about 3 minutes reading HCLG.fst; the HCLG.fst file is about 8 GB.

The command is as follows:
       ~/kaldi-master/src/online2bin/online2-wav-nnet3-latgen-faster --do-endpointing=false --online=true --frame-subsampling-factor=3 --config=online.conf --add-pitch=false --max-active=7000 --beam=15 --frames-per-chunk=50 --lattice-beam=6.0 --acoustic-scale=1.0 --word-symbol-table=words.txt HCLG.fst 'ark:echo utterance-id1 utterance-id1|' 'scp:echo utterance-id1 TEST.wav |' ark:/dev/null

Is the HCLG.fst file so big that it has to take this long to read?

I want to decode faster; is there any method to speed up decoding?

Daniel Povey

Dec 28, 2017, 5:00:05 PM
to kaldi-help
It might be taking that long to read HCLG.fst because your disk is slow.  But it's inevitable that it will take a while.

It's not intended that you call that program each time you want to decode a file-- you have to either call it with multiple files, in batches, or write some server-type code that will load the model and wait for requests.
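For instance, a batch-style run (a minimal sketch; the wav paths and utterance IDs below are made up) builds one wav.scp and one spk2utt covering all files, so the 8 GB HCLG.fst is read only once for the whole batch:

```shell
# List every utterance to decode in one wav.scp (hypothetical paths).
cat > wav.scp <<'EOF'
utt1 /data/test1.wav
utt2 /data/test2.wav
utt3 /data/test3.wav
EOF

# Trivial one-speaker-per-utterance spk2utt, matching the style of the
# original command's 'ark:echo utterance-id1 utterance-id1|' argument.
awk '{print $1, $1}' wav.scp > spk2utt

# A single invocation then decodes all three files with one graph load,
# using the same flags as above:
#   online2-wav-nnet3-latgen-faster ... HCLG.fst ark:spk2utt scp:wav.scp ark:/dev/null
```

The per-file cost of loading the graph and model is then amortized over the whole batch.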



--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+unsubscribe@googlegroups.com.
To post to this group, send email to kaldi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/702434d3-ed11-43ba-8464-055d160cb83b%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Armin Oliya

Sep 11, 2018, 5:59:26 AM
to kaldi-help
Hi Dan, 

I'm trying to understand how each component contributes to the graph size, but I can't find a consistent pattern in the numbers below:


AM                               | AM_size (final.mdl) | Lexicon_size | LM                           | LM_size (L+G, MB) | HCLG_size
NL_small                         | 30M                 | 250k         | L1                           | 182               | 622M
NL_large                         | 89M                 | 250k         | L1                           | 182               | 626M
""                               | ""                  | 253k         | L2+L3                        | 111               | 232M
""                               | ""                  | 253k         | L1+L2+L3                     | 205               | 731M
EN_Kaldi_aspire_chain_pretrained | 142M                | 42k          | Fisher? (under lang_pp_test) | 90                | 1021M


Specifically, NL_large has an AM three times the size of NL_small's, but the graph sizes are almost the same.
On the other hand, the AM of the pretrained Aspire model seems to have a big effect on graph size. What am I missing?






Daniel Povey

Sep 11, 2018, 12:06:42 PM
to kaldi-help
Maybe the models have different phonetic context widths (biphone vs. triphone) or topologies (1-state vs. 3-state).

Armin Oliya

Sep 11, 2018, 12:39:14 PM
to kaldi-help
NL_large

Armin Oliya

Sep 11, 2018, 12:42:10 PM
to kaldi-help
NL_large is based on swbd tdnn_7o, and I took the Aspire model from here: http://kaldi-asr.org/models/m1



Daniel Povey

Sep 11, 2018, 12:46:49 PM
to kaldi-help
Oh.  The size of the language model  (G.fst) probably accounts for the difference then.
Dan


Armin Oliya

Sep 11, 2018, 3:19:20 PM
to kaldi-help
Sorry, that's what I don't get.

Comparing NL_small and NL_large, the acoustic model doesn't affect graph size.
If it's L or G, the Aspire model has a much smaller L and G than the NL models, but a much bigger graph.

Daniel Povey

Sep 11, 2018, 3:24:33 PM
to kaldi-help
Well, I can't really see that in your table, because you add the L+G sizes together; you don't show the G size directly.

Armin Oliya

Sep 12, 2018, 6:40:46 AM
to kaldi-help
AM                               | AM_size (final.mdl) | Lexicon_size | LM                           | LM_size (L+G, MB) | G_size | HCLG_size
NL_small                         | 30M                 | 250k         | L1                           | 182               | 122M   | 622M
NL_large                         | 89M                 | 250k         | L1                           | 182               | 122M   | 626M
""                               | ""                  | 253k         | L2+L3                        | 111               | 47M    | 232M
""                               | ""                  | 253k         | L1+L2+L3                     | 205               | 141M   | 731M
EN_Kaldi_aspire_chain_pretrained | 142M                | 42k          | Fisher? (under lang_pp_test) | 90                | 79M    | 1021M

Daniel Povey

Sep 12, 2018, 11:43:58 AM
to kaldi-help
I think the Aspire-model graph has word-dependent silence probabilities (utils/dict_dir_add_pronprobs.sh was run).
That increases the graph size.


CW Huang

Sep 12, 2018, 12:38:12 PM
to kaldi-help
Hi

I think it's not the size of final.mdl that affects the size of HCLG, since most of the bytes stored in final.mdl are neural-network parameters, which do not contribute to the graph size.

You should probably take a look at context width and topology instead. The pre-trained Aspire model's context width is 3, and it has a 3-state topology (but it's a chain model?). From the swbd tdnn_7o script I think its context width is 2, and it uses a 1-state topology. So it's reasonable that the graph of the pre-trained Aspire model is slightly larger.
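As a rough illustration of why context width matters (assuming a hypothetical phone set of 40 phones, and ignoring the decision-tree clustering that shrinks the real counts down to a few thousand pdfs), the number of raw phonetic contexts grows exponentially with the width:

```shell
# Raw context counts for an assumed 40-phone set (illustrative only).
phones=40
echo "biphone (width 2):  $(( phones * phones ))"
echo "triphone (width 3): $(( phones * phones * phones ))"
```

So a width-3 tree has far more raw contexts to distinguish than a width-2 tree, and even after clustering this tends to produce more HMM states and a larger H-level component in HCLG.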

Dan, please correct me if I said something wrong.


William

Daniel Povey

Sep 12, 2018, 12:40:45 PM
to kaldi-help
I think if it were a chain model it would have had a 1-state topology and left-biphone context. It could be a very old one, though. tree-info would show you the context. Anyway, I think it's likely the silence probabilities.



CW Huang

Sep 12, 2018, 12:56:24 PM
to kaldi-help
I downloaded the pre-trained aspire model here: http://kaldi-asr.org/models/m1

Here's the tree-info output:
tree-info exp/chain/tdnn_7b/tree 
num-pdfs 8629
context-width 3
central-position 1

And the topology in data/lang_chain/topo is a 3-state topology.
Is it possible this model came from an old recipe that used the old topology with the LF-MMI objective?

William

Daniel Povey

Sep 12, 2018, 12:57:53 PM
to kaldi-help
Hm, yes, it's possible, I suppose. I didn't realize we had any checked-in models like that.



Armin Oliya

Sep 14, 2018, 4:57:39 AM
to kaldi-help
Got it, thank you both!