Hi all,
I am trying to train a language model with pocolm (tedlium recipe) using three text sources. I am running into an error where the validation script (validate_metaparameters.py) rejects the count_scale of one of the sources. Here is the content of the log file
data/local/local_lm/data/work/optimize_wordlist_4_subset20/57.log:
...
validate_vocab.py: validated file data/local/local_lm/data/work/counts_wordlist_4_subset20/words.txt with 227999 entries.
validate_count_dir.py: validated counts directory data/local/local_lm/data/work/counts_wordlist_4_subset20
validate_vocab.py: validated file data/local/local_lm/data/work/counts_wordlist_4_subset20/split10/1/words.txt with 227999 entries.
validate_count_dir.py: validated counts directory data/local/local_lm/data/work/counts_wordlist_4_subset20/split10/1
validate_metaparameters.py: bad 2'th line 'count_scale_2 1.0'of metaparameters file data/local/local_lm/data/work/optimize_wordlist_4_subset20/57.metaparams
These are the values in the corresponding metaparameter file:
count_scale_1 0.249149111822
count_scale_2 1.0
count_scale_3 0.000179659232405
order2_D1 0.260773918308
order2_D2 0.260773918007
order2_D3 0.160098043941
order2_D4 0.160028714727
order3_D1 0.645986300793
order3_D2 0.375220719982
order3_D3 0.231268998618
order3_D4 3.90020292588e-06
order4_D1 0.706065074843
order4_D2 0.394692938757
order4_D3 0.261320899569
order4_D4 6.42484578026e-16
The count_scale values were initialized to 0.5 by the default initialize_metaparameters.py script.
I am wondering if there is a way to stop the weight from becoming 1.0?
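To make the problem concrete, here is a minimal Python sketch of the kind of range check that seems to be failing. It is not pocolm code: I am assuming the validator requires every metaparameter to lie strictly between 0 and 1, which would explain why exactly 1.0 is rejected, and the function names are made up for illustration.

```python
def parse_metaparams(text):
    """Parse 'name value' lines into a list of (name, float) pairs."""
    params = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        params.append((fields[0], float(fields[1])))
    return params


def out_of_range(params):
    """Return entries whose value is not strictly inside (0, 1).

    Assumption: this mirrors what the validator enforces; the real
    check in validate_metaparameters.py may differ.
    """
    return [(name, value) for name, value in params
            if not (0.0 < value < 1.0)]


# A few lines copied from 57.metaparams above.
EXAMPLE = """\
count_scale_1 0.249149111822
count_scale_2 1.0
count_scale_3 0.000179659232405
order4_D4 6.42484578026e-16
"""

if __name__ == "__main__":
    for name, value in out_of_range(parse_metaparams(EXAMPLE)):
        print(name, value)
```

On the values above, only count_scale_2 is flagged; order4_D4 is tiny but still positive, so it passes this check.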
Thanks in advance