How to Normalize Melspectrogram?


Muhammad Faisal

Oct 25, 2021, 7:43:08 AM
to librosa
Hi, can anyone help me normalize a mel spectrogram using librosa?

I converted my audio signal into a mel spectrogram, but I want to normalize it for machine learning.

Also, is there a fast way to convert a mel spectrogram back to audio?

Vincent Lostanlen

Oct 25, 2021, 8:07:37 AM
to lib...@googlegroups.com
Hello Muhammad,

I recommend Per-Channel Energy Normalization (PCEN)

https://librosa.org/doc/main/generated/librosa.pcen.html


followed by batch normalization


Sincerely,

Vincent.

Muhammad Faisal

Oct 26, 2021, 5:44:05 AM
to librosa
OK, basically I am building a text-to-speech dataset where I use the mel spectrogram as the audio feature. I tried PCEN, but inverting the PCEN mel spectrogram gives very bad audio. Is there any way to reverse this normalization after my model predicts the normalized mel spectrogram?

Vincent Lostanlen

Oct 26, 2021, 7:16:06 AM
to lib...@googlegroups.com

Hello,

Sounds difficult. By definition, any kind of normalization removes some factors of variability in the data so recovering them after the fact is not possible without some side-channel information.

In the case of PCEN, approximating the inverse might actually be doable via convex optimization if you have a good enough initial guess for the denominator ("M").

Another direction is to implement a PCEN-based neural generative model à la WaveGlow.

But both of these entail research questions of their own and are out of the scope of librosa dev.


I hope this helps!

Vincent.
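
One way to read the "side-channel information" remark: a normalization can be undone when the quantities it divides out are stored alongside the data. The sketch below is not from the thread and does not invert PCEN. It swaps in a simpler technique, per-band standardization of log-mel spectrograms with the mean and standard deviation kept as the side channel. The function names, the random stand-in data, and the 1e-5 floor are illustrative assumptions.

```python
import numpy as np


def fit_stats(log_mels):
    """Per-band mean and std over a list of (n_mels, n_frames) log-mel arrays."""
    frames = np.concatenate(log_mels, axis=1)
    mean = frames.mean(axis=1, keepdims=True)
    std = frames.std(axis=1, keepdims=True) + 1e-8  # avoid division by zero
    return mean, std


def normalize(log_mel, mean, std):
    """Standardize each mel band with the stored statistics."""
    return (log_mel - mean) / std


def denormalize(z, mean, std):
    """Exact inverse of normalize, given the same stored statistics."""
    return z * std + mean


# Random positive arrays stand in for real mel spectrograms (assumption).
rng = np.random.default_rng(0)
mels = [rng.random((80, n)) + 1e-3 for n in (120, 95)]
log_mels = [np.log(np.maximum(m, 1e-5)) for m in mels]

mean, std = fit_stats(log_mels)  # the "side channel" to save with the model
z = normalize(log_mels[0], mean, std)
recovered = np.exp(denormalize(z, mean, std))
```

Because the statistics are fixed for the whole dataset rather than adapted per recording (as PCEN's smoothed denominator "M" is), the model's predicted features can be mapped back to a mel spectrogram with denormalize.
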

Muhammad Faisal

Oct 29, 2021, 1:47:44 AM
to librosa
So, is there a fast way to convert a mel spectrogram into audio? The librosa inverse function is too slow.