Hi,
I've been struggling with this issue for a few days.
I have a very basic understanding of how the mfcc function works, and I need some help.
Here is a gist containing my test code.
My question is: why are the MFCCs different for the first 50 ms when the first 2 seconds of both audio signals are exactly the same?
The only difference is that the second audio is longer.
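To make the comparison concrete, here is a rough sketch of how the first 50 ms of two feature matrices could be compared frame by frame. The sample rate, hop length, and helper names are assumptions for illustration and are not taken from the gist; the feature matrices stand in for whatever the mfcc function returns, arranged as one list of coefficients per frame.

```python
def frames_covering(duration_ms, sample_rate, hop_length):
    """Number of frames whose start falls within the first `duration_ms` milliseconds.

    Uses integer arithmetic (ceiling division) to avoid float rounding.
    """
    return -(-duration_ms * sample_rate // (1000 * hop_length))


def max_abs_diff(feats_a, feats_b, n_frames):
    """Largest absolute coefficient difference over the first `n_frames` frames.

    feats_a and feats_b are frame-major: a list of frames, each frame a list
    of coefficients. Transpose first if your library returns
    (coefficients, frames).
    """
    return max(
        abs(a - b)
        for frame_a, frame_b in zip(feats_a[:n_frames], feats_b[:n_frames])
        for a, b in zip(frame_a, frame_b)
    )


# Assumed settings, for illustration only (not from the gist).
SAMPLE_RATE = 16000
HOP_LENGTH = 160  # 10 ms hop at 16 kHz

n_frames = frames_covering(50, SAMPLE_RATE, HOP_LENGTH)
```

Passing the MFCCs of the short audio and the long audio to `max_abs_diff` with this `n_frames` should give 0.0 if the early frames really matched.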
The main problem is that I trained my model on segmented data, and because the MFCC values come out different for non-segmented audio, the model's performance drops significantly.
This came up with real-world data; I'm just using random data here as an example.
Thanks for all your help!
Willy