How to calculate score of cross similarity?

Jorge

Jul 10, 2022, 8:18:05 PM
to librosa
Hello,

I am trying to identify a short audio clip inside a longer one. I use librosa to extract chroma features (chroma_stft) and compute a cross-similarity matrix.
When the audio matches, I get the "diagonal" line in the plot, as the documentation shows.
Now I need to calculate a score from it so the code can decide whether there is a match or not. So how can I calculate it after using cross_similarity?

This is probably not a question directly related to librosa, but I've spent a lot of time trying to do it, as I am learning AI/ML, starting with librosa and numpy. Maybe I am doing it wrong.
Thank you for any help or pointers.

code:

import librosa
import librosa.display
import matplotlib.pyplot as plt

# Load ad and record
y_ad, sr_ad = librosa.load('ad.mp3')
y_rec, sr_rec = librosa.load('rec.mp3')

# Frame ranges (seconds -> frames) where the ad and the recording match
ad_offset = (slice(None), slice(*librosa.time_to_frames([20, 25])))
rec_offset = (slice(None), slice(*librosa.time_to_frames([35, 40])))
# Frame range in the recording that does NOT match the ad
rec_offset_fail = (slice(None), slice(*librosa.time_to_frames([50, 55])))

# Chroma stft
ad_chroma = librosa.feature.chroma_stft(y=y_ad)
rec_chroma = librosa.feature.chroma_stft(y=y_rec)

fig, ax = plt.subplots(nrows=2)
fig.set_size_inches(20, 5)
fig.set_facecolor("white")

ax[0].set_title("ad")
ax[1].set_title("rec")

librosa.display.specshow(ad_chroma[ad_offset], y_axis="chroma", x_axis="time", ax=ax[0])
librosa.display.specshow(rec_chroma[rec_offset], y_axis="chroma", x_axis="time", ax=ax[1])

plt.show()

# Cross similarity
sim = librosa.segment.cross_similarity(ad_chroma[ad_offset], rec_chroma[rec_offset], k=5)
sim_fail = librosa.segment.cross_similarity(ad_chroma[ad_offset], rec_chroma[rec_offset_fail], k=5)

fig, ax = plt.subplots(ncols=2)
fig.set_facecolor("white")
fig.set_size_inches(10, 5)

ax[0].set_title("Match OK")
librosa.display.specshow(sim, ax=ax[0])

ax[1].set_title("Match FAIL")
librosa.display.specshow(sim_fail, ax=ax[1])

plt.show()



Brian McFee

Jul 11, 2022, 6:35:01 AM
to librosa
You might want to look into the RQA function: https://librosa.org/doc/latest/generated/librosa.sequence.rqa.html , and reference [1] (Serrà et al., 2009) in the documentation for background on how the algorithm is used.

It takes as input a (cross-)similarity matrix and produces an optimal alignment path, which should be (near) diagonal in the case of a match. It doesn't do detection of match/non-match though, for which you'll probably need to determine a score and/or path length threshold after looking at a sample of data.