steps/cleanup/segment_long_utterances_nnet3.sh, pitch features and ivectors


Rémi Francis

Nov 21, 2019, 6:14:54 AM
to kaldi-help
Hi,

My neural net is trained with pitch features, but my ivector extractor is trained without.
This script doesn't accommodate this scenario. I can hack it to force calling utils/data/limit_feature_dim.sh 0:39, but it would be good to fix that properly.

Btw, why are ivectors trained without pitch? Can these features harm their performance?

Best regards.

Daniel Povey

Nov 21, 2019, 6:22:54 AM
to kaldi-help
Does it fail in extract_ivectors_online.sh?  We could perhaps change that script so that if it notices
the input feature dim is too large by 3, it will just remove the last 3 dims itself.
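
The check proposed here can be sketched in isolation. This is only an illustration under assumptions, not the actual Kaldi change: `TrimPitchIfNeeded`, `kNumPitchDims` and the `std::vector`-based `Features` type are hypothetical stand-ins for the real script and matrix classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a feature matrix: one inner vector per frame.
using Features = std::vector<std::vector<float>>;

// Assumption: pitch contributes exactly 3 trailing dimensions per frame.
constexpr std::size_t kNumPitchDims = 3;

// Drops the trailing pitch dimensions when the input is exactly
// kNumPitchDims wider than expected. Any other width is returned unchanged,
// so a genuine mismatch still shows up as an error downstream.
Features TrimPitchIfNeeded(const Features &feats, std::size_t expected_dim) {
  if (feats.empty()) return feats;
  const std::size_t dim = feats[0].size();
  if (dim != expected_dim + kNumPitchDims) return feats;
  Features trimmed;
  trimmed.reserve(feats.size());
  for (const auto &frame : feats) {
    assert(frame.size() == dim);  // all frames must have the same width
    trimmed.emplace_back(frame.begin(), frame.begin() + expected_dim);
  }
  return trimmed;
}
```

Trimming only on an exact difference of 3 keeps the behaviour conservative: features that are wrong for some other reason are not silently truncated.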

The pitch didn't make a difference to the ivector extraction, IIRC; and I think there were concerns about
compounding the latency of the ivector extractor and of the pitch.

--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/bb950308-c914-485e-ae08-0496edfe2a7d%40googlegroups.com.

Rémi Francis

Nov 21, 2019, 6:49:58 AM
to kaldi-help
Yes, it fails in the extraction jobs launched by extract_ivectors_online.sh; it complains about a dimension mismatch (280 vs 301).

Another ivector question: is there a metric that can be computed on ivectors that gives an indication of their performance (that would be faster than training a neural net and getting WERs)?

Daniel Povey

Nov 21, 2019, 6:57:53 AM
to kaldi-help
Please show the exact error message; I may have the script grep for it on failure.

Daniel Povey

Nov 21, 2019, 7:20:25 AM
to kaldi-help
Please see if this fixes it (not tested!).
I accidentally merged and had to revert, which is why it shows as merged.

Rémi Francis

Nov 21, 2019, 7:22:58 AM
to kaldi-help
ERROR (ivector-extract-online2[5.5]:OnlineTransform():online-feature.cc:527) Dimension mismatch: source features have dimension 301 and LDA #cols is 280
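
The two dimensions in this error are consistent with 7-frame splicing, assuming a context of 3 frames on each side (the values that appear in the logs later in this thread): 40 base features give 280, and 40 + 3 pitch features give 301. A small sketch of that arithmetic follows; `SplicedDim` and `ExtraDimsPerFrame` are hypothetical helper names, not Kaldi functions.

```cpp
#include <cassert>

// Dimension of a spliced feature vector: the current frame plus its left and
// right context frames, concatenated.
constexpr int SplicedDim(int base_dim, int left_context, int right_context) {
  return base_dim * (1 + left_context + right_context);
}

// 40 base features over 7 frames match the 280 columns of the LDA matrix.
static_assert(SplicedDim(40, 3, 3) == 280, "40-dim features, +/-3 context");
// 3 extra pitch dimensions per frame give the 301 from the error message.
static_assert(SplicedDim(43, 3, 3) == 301, "43-dim features, +/-3 context");

// Number of unexpected dimensions per frame, given the spliced source
// dimension, the LDA column count and the number of spliced frames.
int ExtraDimsPerFrame(int source_dim, int lda_cols, int num_splice) {
  assert(num_splice > 0);
  assert(source_dim % num_splice == 0 && lda_cols % num_splice == 0);
  return (source_dim - lda_cols) / num_splice;
}
```

With the numbers above, `ExtraDimsPerFrame(301, 280, 7)` evaluates to 3, which matches the "too large by 3" case discussed earlier.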

Rémi Francis

Nov 21, 2019, 10:20:32 AM
to kaldi-help
ASSERTION_FAILED (ivector-extract-online2[5.5]:ExpectedFeatureDim():online-ivector-feature.cc:76) Assertion failed: (full_dim % num_splice == 0 && "Something went wrong getting the feature dimension")

I added some logs:
LOG (ivector-extract-online2[5.5]:Check():online-ivector-feature.cc:86) lda_mat.NumRows() 40
LOG (ivector-extract-online2[5.5]:Check():online-ivector-feature.cc:87) diag_ubm.Dim() 40
LOG (ivector-extract-online2[5.5]:ExpectedFeatureDim():online-ivector-feature.cc:73) full_dim 40
LOG (ivector-extract-online2[5.5]:ExpectedFeatureDim():online-ivector-feature.cc:74) num_splice 7


I tried to return `full_dim` in that function, but then I get another assert:
LOG (ivector-extract-online2[5.5]:GetFrame():online-feature.cc:508) feat->Dim() 280
LOG (ivector-extract-online2[5.5]:GetFrame():online-feature.cc:509) dim_in 100
LOG (ivector-extract-online2[5.5]:GetFrame():online-feature.cc:510) left_context_ 3
LOG (ivector-extract-online2[5.5]:GetFrame():online-feature.cc:511) right_context_ 3
ASSERTION_FAILED (ivector-extract-online2[5.5]:GetFrame():online-feature.cc:512) Assertion failed: (feat->Dim() == dim_in * (1 + left_context_ + right_context_))

Not sure why `dim_in` is 100.

Daniel Povey

Nov 21, 2019, 8:34:41 PM
to kaldi-help

OK, try now.
full_dim is not the right thing to compare with; it should be the num-cols of the LDA matrix (or maybe the num-cols - 1, if it includes the offset term).

Rémi Francis

Nov 22, 2019, 6:39:26 AM
to kaldi-help
NumCols() is 280, so it's without the minus 1.

But I still get:
LOG (ivector-extract-online2[5.5]:GetFrame():online-feature.cc:508) feat->Dim() 280
LOG (ivector-extract-online2[5.5]:GetFrame():online-feature.cc:509) dim_in 100
LOG (ivector-extract-online2[5.5]:GetFrame():online-feature.cc:510) left_context_ 3
LOG (ivector-extract-online2[5.5]:GetFrame():online-feature.cc:511) right_context_ 3
ASSERTION_FAILED (ivector-extract-online2[5.5]:GetFrame():online-feature.cc:512) Assertion failed: (feat->Dim() == dim_in * (1 + left_context_ + right_context_))

I think that there's something wrong with:
OnlineMatrixFeature matrix_feature(feats.ColRange(0, feat_dim));
If I use that line with features that don't include pitch (obtained with utils/data/limit_feature_dim.sh 0:39), I still get the same error, where dim_in should be 40, even though it worked before that change.

Daniel Povey

Nov 22, 2019, 9:01:47 PM
to kaldi-help
It only subtracts 3 if it is too large by 3, so I don't see how that could be.
dim_in should be 40, not 100; I don't see where the 100 could be coming from. Perhaps you experimented with higher-dim features, or somehow the ivector features got in there?

Rémi Francis

Nov 25, 2019, 9:21:55 AM
to kaldi-help
I debugged it a bit more:
OnlineMatrixFeature matrix_feature(feats.ColRange(0, feat_dim));

Gives me issues, but when I did:
auto base_feats = feats.ColRange(0, feat_dim);
OnlineMatrixFeature matrix_feature(base_feats);
It worked.

It's because:
  /// Caution: this class maintains the const reference from the constructor, so
  /// don't let it go out of scope while this object exists.
  explicit OnlineMatrixFeature(const MatrixBase<BaseFloat> &mat): mat_(mat) { }


In the first version, the temporary returned by feats.ColRange(0, feat_dim) is destroyed at the end of that statement, so the reference stored in matrix_feature dangles and the behaviour is undefined.
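
The lifetime issue can be reproduced outside Kaldi. The sketch below uses hypothetical stand-ins (`View`, `KeepsReference`, `ColRange`, `SafeDim`) rather than the real `SubMatrix` and `OnlineMatrixFeature` classes, and it only runs the safe variant; reading through the dangling reference would be consistent with a garbage value such as the unexpected dim_in of 100, though that is a guess.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a matrix view. Like the result of ColRange() in
// the thread, it is returned by value and does not own the data.
struct View {
  const std::vector<float> *data;
  std::size_t num_cols;
  std::size_t NumCols() const { return num_cols; }
};

// Hypothetical stand-in for a class that stores the const reference passed
// to its constructor instead of copying the object.
class KeepsReference {
 public:
  explicit KeepsReference(const View &view) : view_(view) {}
  std::size_t Dim() const { return view_.NumCols(); }

 private:
  const View &view_;  // dangles if the referenced View is destroyed first
};

View ColRange(const std::vector<float> &row, std::size_t num_cols) {
  assert(num_cols <= row.size());
  return View{&row, num_cols};
}

std::size_t SafeDim(const std::vector<float> &row, std::size_t num_cols) {
  // Unsafe (do not do this): KeepsReference f(ColRange(row, num_cols));
  // The temporary View is destroyed at the end of that statement, so a later
  // f.Dim() would read through a dangling reference.
  const View base = ColRange(row, num_cols);  // named object, outlives `f`
  KeepsReference f(base);
  return f.Dim();
}
```

Binding the view to a named variable, as in the working version above, keeps it alive for the whole scope in which the referencing object is used.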

Rémi Francis

Nov 25, 2019, 12:42:43 PM
to kaldi-help
I've actually got models with a NumCols() of 280, and others where it's 281.
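
One way to accept both kinds of model is to treat a single leftover column as the offset term. This is a sketch under that assumption, not the code that was pushed; `BaseDimFromLdaCols` is a hypothetical helper.

```cpp
#include <cassert>

// Hypothetical helper: derives the per-frame input dimension from the number
// of LDA matrix columns, accepting matrices with or without one extra offset
// column. Returns -1 if neither interpretation divides evenly.
int BaseDimFromLdaCols(int lda_cols, int num_splice) {
  assert(lda_cols > 0 && num_splice > 0);
  if (lda_cols % num_splice == 0) return lda_cols / num_splice;
  if ((lda_cols - 1) % num_splice == 0) return (lda_cols - 1) / num_splice;
  return -1;
}
```

With 7 spliced frames, both 280 and 281 columns give a base dimension of 40, while 40 (the LDA output dimension that tripped the earlier assertion) is rejected. The check cannot detect an offset column when num_splice is 1, since every column count divides evenly in that case.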

Daniel Povey

Nov 26, 2019, 11:39:25 PM
to kaldi-help
OK, great.  I pushed fixes for both of those issues; please confirm that the version up there works so I can merge it.

Rémi Francis

Nov 28, 2019, 5:42:23 AM
to kaldi-help
full_dim = lda_mat.NumCols() - 1;
should instead be
full_dim = lda_mat.NumCols();
since lda_mat.NumCols() is 280 or 281.

But with this fix it worked for me.
