Mismatch between audio duration and number of frames returned by "ali-to-phones --write-lengths=true ...."

Sergei Tushev

Sep 2, 2024, 10:09:03 AM

to kaldi-help

Hello.

I use the default MFCC setting, --frame-length=25. So for a 0.75 s audio file I would expect to get 30 frames, with no offset.

When I align this audio with nnet3-align-compiled, I get only 26 frames, e.g. "utt-1 SIL 10 ; AA 6 ; SIL 10".
What could be the reason?

Daniel Povey

Sep 2, 2024, 10:46:27 AM

to kaldi...@googlegroups.com

Probably end effects. I think one of the MFCC options relates to that (whether the number of frames is reduced by end effects), but Kaldi models also have end effects, as you would get from convolution without padding.
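
As a rough illustration of the MFCC-side end effect, the sketch below reproduces the frame-count arithmetic behind the --snip-edges option of compute-mfcc-feats. It assumes 16 kHz audio and the default 10 ms frame shift, neither of which is stated in the thread, and num_frames is a hypothetical helper, not a Kaldi API. Note that the count is driven by the frame shift as well as the frame length, so the numbers differ from the 30 in the question.

```python
def num_frames(num_samples, sample_rate=16000, frame_length_ms=25.0,
               frame_shift_ms=10.0, snip_edges=True):
    """Count feature frames for a waveform (sketch of the windowing arithmetic)."""
    # Assumed defaults: 16 kHz audio, 25 ms window, 10 ms shift.
    frame_length = int(sample_rate * frame_length_ms / 1000)
    frame_shift = int(sample_rate * frame_shift_ms / 1000)
    if snip_edges:
        # Only windows that fit entirely inside the waveform are kept,
        # so frames are lost at the end of the file.
        if num_samples < frame_length:
            return 0
        return 1 + (num_samples - frame_length) // frame_shift
    # Without snipping, the count depends only on the shift (rounded).
    return (num_samples + frame_shift // 2) // frame_shift


samples = int(0.75 * 16000)  # 0.75 s at the assumed 16 kHz
print(num_frames(samples, snip_edges=True))   # 73
print(num_frames(samples, snip_edges=False))  # 75
```

This covers only the feature-extraction side; the model's own context can shorten the output further, as the reply above notes.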

--
Go to http://kaldi-asr.org/forums.html to find out how to join the kaldi-help group
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/213c101a-380c-41d1-b56f-c0d0791e9ecfn%40googlegroups.com.

Sergei Tushev

Sep 2, 2024, 11:04:41 AM

to kaldi-help

Thank you.

If I understand correctly, there is no way to fix this quickly without changing the model?


Daniel Povey

Sep 3, 2024, 3:48:19 AM

to kaldi...@googlegroups.com

Not really, but the end effect should be fixed and symmetric, so you can compensate for it in post-processing.
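
One way to do that post-processing, sketched under assumptions: parse a line of "ali-to-phones --write-lengths=true" output and split the missing frames evenly between the first and last segments. The expected frame count (30 here, taken from the question) has to be supplied by the caller, and pad_alignment is a hypothetical helper, not part of Kaldi. Whether the lost frames really belong to the edge phones is also an assumption.

```python
def pad_alignment(line, expected_frames):
    """Pad the first and last phone of an 'utt P1 n1 ; P2 n2 ; ...' line symmetrically."""
    utt, rest = line.split(maxsplit=1)
    segments = []
    for chunk in rest.split(";"):
        phone, length = chunk.split()
        segments.append([phone, int(length)])
    missing = expected_frames - sum(n for _, n in segments)
    if missing < 0:
        raise ValueError("alignment is longer than expected")
    # Assumption: the end effect is symmetric, so half of the missing
    # frames go to the first segment and the remainder to the last.
    segments[0][1] += missing // 2
    segments[-1][1] += missing - missing // 2
    return utt + " " + " ; ".join(f"{p} {n}" for p, n in segments)


print(pad_alignment("utt-1 SIL 10 ; AA 6 ; SIL 10", 30))
# utt-1 SIL 12 ; AA 6 ; SIL 12
```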

