pocolm count_scale validation error

ZRV

Nov 3, 2016, 4:41:26 PM
to kaldi-help

Hi all,

I am trying to train a language model with pocolm (tedlium recipe) using three text sources. I ran into an error where the validation script (validate_metaparameters.py) rejects the count_scale of one of the sources. Here is the content of the log file data/local/local_lm/data/work/optimize_wordlist_4_subset20/57.log:

...
validate_vocab.py: validated file data/local/local_lm/data/work/counts_wordlist_4_subset20/words.txt with 227999 entries.
validate_count_dir.py: validated counts directory data/local/local_lm/data/work/counts_wordlist_4_subset20
validate_vocab.py: validated file data/local/local_lm/data/work/counts_wordlist_4_subset20/split10/1/words.txt with 227999 entries.
validate_count_dir.py: validated counts directory data/local/local_lm/data/work/counts_wordlist_4_subset20/split10/1
validate_metaparameters.py: bad 2'th line 'count_scale_2 1.0'of metaparameters file data/local/local_lm/data/work/optimize_wordlist_4_subset20/57.metaparams

These are the values in the corresponding metaparameter file:

count_scale_1 0.249149111822
count_scale_2 1.0
count_scale_3 0.000179659232405
order2_D1 0.260773918308
order2_D2 0.260773918007
order2_D3 0.160098043941
order2_D4 0.160028714727
order3_D1 0.645986300793
order3_D2 0.375220719982
order3_D3 0.231268998618
order3_D4 3.90020292588e-06
order4_D1 0.706065074843
order4_D2 0.394692938757
order4_D3 0.261320899569
order4_D4 6.42484578026e-16
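
A minimal sketch for spotting which entries in a metaparameters listing like the one above sit on a boundary. It assumes count_scale values must lie strictly between 0 and 1; that range is only inferred from the values rejected in this thread, not taken from validate_metaparameters.py, and the helper name find_boundary_count_scales is hypothetical.

```python
def find_boundary_count_scales(lines, lower=0.0, upper=1.0):
    """Return (line_number, name, value) for count_scale entries outside (lower, upper).

    The open interval (0, 1) is an assumption based on the rejected values
    reported in this thread, not on pocolm's actual validation code.
    """
    bad = []
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        name, value = fields[0], float(fields[1])
        if name.startswith("count_scale_") and not (lower < value < upper):
            bad.append((lineno, name, value))
    return bad


# First lines of the metaparameters file quoted in this post.
metaparams = """count_scale_1 0.249149111822
count_scale_2 1.0
count_scale_3 0.000179659232405
order2_D1 0.260773918308""".splitlines()

for lineno, name, value in find_boundary_count_scales(metaparams):
    print("line %d: %s = %s" % (lineno, name, value))
```

On the excerpt above this flags only count_scale_2, which matches the line the validator complained about.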

The count_scale values were initialized to 0.5 by the default initialize_metaparameters.py script.

I am wondering if there is a way to stop the weight from becoming 1.0?

Thanks in advance

Daniel Povey

Nov 3, 2016, 4:48:32 PM
to kaldi-help
Do "git log -1" in pocolm so I can see whether you're using the current scripts or are behind.
Dan


Daniel Povey

Nov 3, 2016, 4:49:33 PM
to kaldi-help
... actually, I suspect you are behind. I'd be surprised if this happened with the current scripts.


ZRV

Nov 3, 2016, 4:57:42 PM
to kaldi-help
Here is the outcome of git log -1 in pocolm:

commit 75e73cf3ef3c30ce93d1bf13fbc4c680aa31156c
Author: Daniel Povey <dpo...@gmail.com>
Date:   Thu Sep 15 17:01:03 2016 -0400

    Setting TMPDIR in get_counts.py

Daniel Povey

Nov 3, 2016, 5:01:11 PM
to kaldi-help
That's well behind.  Do "git pull" and recompile, then rerun.


ZRV

Nov 4, 2016, 12:41:59 AM
to kaldi-help
Thanks for your replies, Dan. I updated and recompiled pocolm, and the count_scale no longer ends up equal to "1.0". But now I get an error because a count_scale is equal to "0.0":

validate_metaparameters.py: bad 1'th line 'count_scale_1 0.00000000000000000000'of metaparameters file

Is there anything else that I might be missing here?




Daniel Povey

Nov 4, 2016, 12:46:18 AM
to kaldi-help
Send me the output of the optimize log (it should have info about every iteration), to dpo...@gmail.com


Daniel Povey

Nov 5, 2016, 2:38:29 PM
to kaldi-help
Just to follow up on the list: this problem was resolved with a script change.
