Difference between the downloaded ASpIRE model and mine


Jon Nichols

Feb 7, 2018, 5:04:08 PM
to kaldi-help

I recently got the Fisher data and tried to re-create the ASpIRE Chain Model that can be downloaded from http://kaldi-asr.org/models.html


The only change I made was to the number of GPUs, from

--trainer.optimization.num-jobs-initial 3 \
--trainer.optimization.num-jobs-final 16 \

to

--trainer.optimization.num-jobs-initial 2 \
--trainer.optimization.num-jobs-final 8 \

I don’t have the ASpIRE test data, so I tested with my own data, which is out of domain.

When comparing the results between the premade model and the one I made, the real-time factor of my model averaged 1.479064 faster (premade 13.3457 vs. mine 11.866656), but the WER averaged 2.642 percentage points worse in absolute terms (premade 33.95% vs. mine 36.592%).
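To make the size of the gap concrete, here is a small sketch of the arithmetic behind those numbers. It uses only the averages quoted in the post; the variable names are made up for illustration. It separates the absolute WER difference (in percentage points) from the relative one, since "2.642% worse" can be read either way.

```python
# Averages quoted in the post (premade model vs. the retrained one).
premade_wer, my_wer = 33.95, 36.592
premade_rtf, my_rtf = 13.3457, 11.866656

# Absolute difference: plain subtraction of the two averages.
wer_abs = my_wer - premade_wer
# Relative difference: the absolute gap as a fraction of the premade value.
wer_rel = 100.0 * wer_abs / premade_wer

rtf_abs = premade_rtf - my_rtf
rtf_rel = 100.0 * rtf_abs / premade_rtf

print(f"WER: +{wer_abs:.3f} points absolute, +{wer_rel:.2f}% relative")
print(f"RTF: -{rtf_abs:.6f} absolute, -{rtf_rel:.2f}% relative")
```

The RTF difference computed from the rounded averages comes out slightly different from the quoted 1.479064, presumably because the quoted figure was computed before the averages were rounded.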

  

I assumed my model would have the same or better real-time factor and WER because of improvements in Kaldi between the version the premade model was built on and mine (the Jan 9th commit a0b71317df1035bd3c6fa49a2b6bb33c801b56ac). Was this a bad assumption, or was it the GPU change?

 

Thanks for any help

Daniel Povey

Feb 8, 2018, 12:24:30 AM
to kaldi-help

Using fewer GPUs could cause it to optimize faster, and you might need to use fewer iterations in that case.
If it's the TDNN model, perhaps a comparison of the chain_dir_info.pl output with the following would help clarify whether this is the issue:

steps/info/chain_dir_info.pl exp/chain/tdnn_7b
exp/chain/tdnn_7b: num-iters=1596 nj=3..16 num-params=36.9M dim=40+100->8672 combine=-0.163->-0.162 xent:train/valid[1062,1595,final]=(-2.00,-1.97,-1.96/-2.02,-1.99,-1.99) logprob:train/valid[1062,1595,final]=(-0.154,-0.150,-0.149/-0.158,-0.155,-0.154)
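As a rough illustration of how the job counts interact with the iteration count, here is a sketch. It assumes (based on a reading of Kaldi's nnet3 training scripts, not confirmed in this thread) that the number of iterations is approximately `2 * archives_to_process / (num_jobs_initial + num_jobs_final)`. The function names are hypothetical, and the archive count is only inferred from the line above, so treat the result as an estimate.

```python
def implied_archives(num_iters, nj_initial, nj_final):
    """Invert the assumed relation num_iters = 2 * archives / (nj_initial + nj_final)."""
    return num_iters * (nj_initial + nj_final) / 2


def expected_iters(archives, nj_initial, nj_final):
    """Iterations needed to process `archives` with jobs ramping nj_initial..nj_final."""
    return int(archives * 2 / (nj_initial + nj_final))


# From the chain_dir_info.pl line above: num-iters=1596 nj=3..16
archives = implied_archives(1596, 3, 16)

# Same amount of data with the reduced job counts (2..8) from the original post.
iters_2_8 = expected_iters(archives, 2, 8)

print(f"implied archives processed: {archives:.0f}")
print(f"expected iterations with nj=2..8: {iters_2_8}")
```

Under that assumption, halving the job counts roughly doubles the number of sequential model updates over the same data, which is one way fewer GPUs could let the model optimize further per epoch.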


--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+unsubscribe@googlegroups.com.
To post to this group, send email to kaldi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/d94e773d-a89b-4b77-b4e3-4f49a1afa97a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jon Nichols

Feb 8, 2018, 6:01:38 AM
to kaldi-help
From my model:

steps/info/chain_dir_info.pl exp/chain/tdnn_7b
exp/chain/tdnn_7b: num-iters=3297 nj=2..8 num-params=36.7M dim=40+100->8600 combine=-0.160->-0.160 (over 10) xent:train/valid[2195,3296,final]=(-2.05,-2.00,-2.00/-1.90,-1.88,-1.87) logprob:train/valid[2195,3296,final]=(-0.155,-0.148,-0.148/-0.146,-0.144,-0.143)

It looks like the pre-made model doesn't include the logs needed to run the script against it; when I tried, it just returned to the command line with no error or output.
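For a side-by-side view of the two chain_dir_info.pl lines quoted in this thread, here is a minimal parsing sketch. The function name and dictionary keys are made up for illustration, and the regular expressions only cover the fields that appear in these two lines; this is not a general parser for the script's output.

```python
import re

# The two lines quoted in this thread (reference run and the retrained model).
REFERENCE = (
    "exp/chain/tdnn_7b: num-iters=1596 nj=3..16 num-params=36.9M "
    "dim=40+100->8672 combine=-0.163->-0.162 "
    "xent:train/valid[1062,1595,final]=(-2.00,-1.97,-1.96/-2.02,-1.99,-1.99) "
    "logprob:train/valid[1062,1595,final]=(-0.154,-0.150,-0.149/-0.158,-0.155,-0.154)"
)
MINE = (
    "exp/chain/tdnn_7b: num-iters=3297 nj=2..8 num-params=36.7M "
    "dim=40+100->8600 combine=-0.160->-0.160 (over 10) "
    "xent:train/valid[2195,3296,final]=(-2.05,-2.00,-2.00/-1.90,-1.88,-1.87) "
    "logprob:train/valid[2195,3296,final]=(-0.155,-0.148,-0.148/-0.146,-0.144,-0.143)"
)


def parse_chain_dir_info(line):
    """Extract a few fields from one chain_dir_info.pl output line."""
    nj = re.search(r"nj=(\d+)\.\.(\d+)", line)
    info = {
        "num_iters": int(re.search(r"num-iters=(\d+)", line).group(1)),
        "nj": (int(nj.group(1)), int(nj.group(2))),
        "num_outputs": int(re.search(r"dim=\S+->(\d+)", line).group(1)),
    }
    for name in ("xent", "logprob"):
        m = re.search(name + r":train/valid\[[^\]]*\]=\(([^/]*)/([^)]*)\)", line)
        train = [float(x) for x in m.group(1).split(",")]
        valid = [float(x) for x in m.group(2).split(",")]
        # Keep only the 'final' entry (the last value) for train and valid.
        info[name + "_final"] = (train[-1], valid[-1])
    return info


ref = parse_chain_dir_info(REFERENCE)
mine = parse_chain_dir_info(MINE)
for key in ref:
    print(f"{key:14s} reference={ref[key]}  mine={mine[key]}")
```

Printed this way, the differences in iteration count, job range, and number of output units are easy to spot next to the final train/valid objective values.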


Daniel Povey

Feb 8, 2018, 6:10:11 AM
to kaldi-help
The diagnostics look OK to me. If there has been any regression, I think it's most likely due to some kind of change in the aspire script itself. What was that WER measured on?


Jon Nichols

Feb 8, 2018, 3:34:29 PM
to kaldi-help
It's on a bunch of customer service calls that we have manually transcribed.

There is a fair number of terms that are out of domain, and my next planned step was to rebuild the LM to include the out-of-domain words and phrases. But I was confused by the WER getting worse with the new model, so I haven't moved on from that yet.



Daniel Povey

Feb 8, 2018, 6:56:31 PM
to kaldi-help
I'd suggest going ahead and rebuilding the LM; I'll try to investigate separately, here, whether there has been any regression in the aspire model. But I warn you, it may not show up on the official ASPIRE test set, which is where we'd test it.

Dan


Jon Nichols

Feb 8, 2018, 8:03:53 PM
to kaldi-help
Will do, and thanks again for the help.

Also, is https://catalog.ldc.upenn.edu/LDC2017S21 the official ASPIRE test set, or where can it be obtained? LDC2017S21 has 16 kHz files, so an 8 kHz model seems like an odd choice, which is why I'm guessing it's a different ASPIRE data set.



Daniel Povey

Feb 8, 2018, 8:05:48 PM
to kaldi-help
I think it is. We probably downsample the files to 8 kHz; the recipe will make that clear.


Dan

