How to convert a spectrogram PNG back into audio

John Ho

May 29, 2023, 7:07:15 AM

to librosa

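import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np

def audio_to_spectrogram(audio, img):
    # 2.56 x 2.56 inches at matplotlib's default 100 dpi gives a 256x256 image
    plt.rcParams["figure.figsize"] = [2.56, 2.56]
    plt.rcParams["figure.autolayout"] = True

    fig, ax = plt.subplots()

    # Load the audio (resampled to librosa's default rate of 22050 Hz)
    y, sr = librosa.load(audio)
    # Mel power spectrogram with shape (n_mels, n_frames)
    S = librosa.feature.melspectrogram(y=y, sr=sr)
    print(sr)
    # Convert power to decibels relative to the peak value
    S_dB = librosa.power_to_db(S, ref=np.max)
    librosa.display.specshow(S_dB, sr=sr, fmax=8000, ax=ax)

    plt.savefig(img)
    plt.close(fig)  # release the figure when the function is called repeatedly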

Using this code, I have converted my wav file into a 256x256 image to be used in a neural network. I was wondering how I could reverse the image back into audio.

Graham Coleman

May 29, 2023, 7:17:58 AM

to John Ho, librosa

Hi John,

As long as your image matches the dimensions of your mel spectrogram (and all of the parameters match), you can use the following function to invert it (using Griffin-Lim) back to audio:

https://librosa.org/doc/main/generated/librosa.feature.inverse.mel_to_audio.html

Graham


John Ho

May 29, 2023, 8:42:10 AM

to librosa

Hi,

These are some example shapes of my mel spectrograms:

(128, 59)

(128, 60)

(128, 27)

(128, 438)

(128, 652)

However, when I plot it, the saved image has shape (256, 256, 3).

How would I convert this (256, 256, 3) array back into the dimensions of the spectrogram?
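
For context on why this is hard: the plotted figure is a colormapped rendering resampled to 256x256 pixels, so its pixels do not correspond one-to-one to the cells of a (128, n_frames) array. One possible workaround, which is an assumption and not something proposed in this thread, is to skip the plot and store the dB array itself as an 8-bit grayscale image with one pixel per spectrogram cell. The helper names db_to_uint8 and uint8_to_db below are hypothetical, and the random array stands in for a real S_dB. The sketch relies on power_to_db with ref=np.max and the default top_db=80 producing values in [-80, 0].

```python
import numpy as np

def db_to_uint8(S_dB, top_db=80.0):
    """Map a dB spectrogram in [-top_db, 0] to pixel values in [0, 255]."""
    clipped = np.clip(S_dB, -top_db, 0.0)
    return np.round((clipped + top_db) / top_db * 255.0).astype(np.uint8)

def uint8_to_db(pixels, top_db=80.0):
    """Map pixel values in [0, 255] back to dB values in [-top_db, 0]."""
    return pixels.astype(np.float32) / 255.0 * top_db - top_db

# Random stand-in for a real S_dB array of shape (n_mels, n_frames)
rng = np.random.default_rng(0)
S_dB = rng.uniform(-80.0, 0.0, size=(128, 59)).astype(np.float32)

pixels = db_to_uint8(S_dB)       # same shape as S_dB, one pixel per cell
S_dB_rec = uint8_to_db(pixels)   # approximate inverse, limited by 8-bit rounding
max_error = float(np.max(np.abs(S_dB_rec - S_dB)))
```

The pixel array keeps the spectrogram's shape, so the recovered dB values can be passed through db_to_power and mel_to_audio. A network that needs a fixed 256x256 input would still require padding or cropping along the time axis, which this sketch does not cover.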
