fpcalc.exe gives different fingerprints for different encoding qualities

hantzs...@googlemail.com

Oct 14, 2013, 8:27:28 AM
to acou...@googlegroups.com
Hi,

I'm trying to use fpcalc to find duplicates in my music collection.
To test it, I converted a FLAC file that I ripped from CD to 320 kbps CBR MP3 and 112 kbps CBR MP3 using LAME.

Unfortunately, fpcalc gives me different fingerprints for all three files:
(put it on pastebin because it's a lot of text)
As far as I can see, I only get partial matches. I thought I'd get the same fingerprint for all files, since they are the same song.
Isn't that how it is supposed to work? This also happens with 320 kbps CBR <-> FLAC, which are nearly identical, at least to the listener.

Hope someone can help.

Regards
Sebastian

Lukáš Lalinský

Oct 14, 2013, 9:14:32 AM
to Acoustid
I thought I explained this in another post, but I can't find it now, so I'll write a more detailed explanation here.

You can't compare two Chromaprint fingerprints in their compressed, base64-encoded form, and you can't compare two fingerprints by simply checking whether they are identical.

If you want to compare two fingerprints, you need to get the raw data. With fpcalc, you get that by calling it with the -raw parameter. That will give you a list of 32-bit numbers instead of the long base64-encoded string. You can also decode the string yourself, but for that you would need to use Chromaprint as a library.
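As a rough illustration of that first step, here is a minimal Python sketch that pulls the integer list out of the output of `fpcalc -raw`. The helper names (`run_fpcalc_raw`, `parse_raw_fingerprint`) and the sample text are made up for this example; it only assumes that fpcalc is on the PATH and prints a `FINGERPRINT=` line with comma-separated numbers, as in the lines quoted in this post.

```python
import subprocess


def run_fpcalc_raw(path):
    """Run `fpcalc -raw <path>` and return its text output.

    Assumes an fpcalc binary is available on the PATH.
    """
    result = subprocess.run(
        ["fpcalc", "-raw", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_raw_fingerprint(fpcalc_output):
    """Return the list of integers from the FINGERPRINT= line."""
    prefix = "FINGERPRINT="
    for line in fpcalc_output.splitlines():
        if line.startswith(prefix):
            values = line[len(prefix):].strip()
            return [int(v) for v in values.split(",") if v]
    raise ValueError("no FINGERPRINT= line found")


# Illustrative output only; the numbers are the first three items quoted in this post.
sample_output = "DURATION=10\nFINGERPRINT=1376359748,1397470669,1397396981\n"
items = parse_raw_fingerprint(sample_output)
print(items)
```

For a real file you would call `parse_raw_fingerprint(run_fpcalc_raw("song.flac"))`, where the file name is a placeholder.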

FINGERPRINT=1376359748,1397470669,1397396981,1359666679,1359667159,1359716246,1342814102,1347003830,1892394390,1888202135...
FINGERPRINT=1376358852,1397470669,1397396981,1359667191,1359667191,1359716246,1342814102,1347003830,1892394390,1888202135...

They are 32-bit integers, so you can look at each of them as a sequence of 32 bits. Then you need to compare them for bit differences (e.g. by counting set bits in the result of xor). For the first 10 items you get:

0. 01010010000010011001010101000100 xor 01010010000010011001000111000100 = 00000000000000000000010010000000
1. 01010011010010111011010111001101 xor 01010011010010111011010111001101 = 00000000000000000000000000000000
2. 01010011010010101001010111110101 xor 01010011010010101001010111110101 = 00000000000000000000000000000000
3. 01010001000010101101110111110111 xor 01010001000010101101111111110111 = 00000000000000000000001000000000
4. 01010001000010101101111111010111 xor 01010001000010101101111111110111 = 00000000000000000000000000100000
5. 01010001000010111001111110010110 xor 01010001000010111001111110010110 = 00000000000000000000000000000000
6. 01010000000010011011011110010110 xor 01010000000010011011011110010110 = 00000000000000000000000000000000
7. 01010000010010011010010110110110 xor 01010000010010011010010110110110 = 00000000000000000000000000000000
8. 01110000110010111010010110010110 xor 01110000110010111010010110010110 = 00000000000000000000000000000000
9. 01110000100010111010110110010111 xor 01110000100010111010110110010111 = 00000000000000000000000000000000

That means that item 0 differs in 2 bits, items 3 and 4 differ in one bit each, and the rest are identical. The simplest way to score the similarity of two fingerprints is to sum the bit differences.

So in this case you would get only 4 differing bits out of a total of 320 bits, or 1.25% (assuming the fingerprints are only 10 items long).
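The xor-and-count step can be sketched in a few lines of Python. This is only an illustration using the ten items quoted in this post; the function name `bit_differences` is made up for the example.

```python
# The first ten raw items of the two fingerprints quoted in this post.
fp_a = [1376359748, 1397470669, 1397396981, 1359666679, 1359667159,
        1359716246, 1342814102, 1347003830, 1892394390, 1888202135]
fp_b = [1376358852, 1397470669, 1397396981, 1359667191, 1359667191,
        1359716246, 1342814102, 1347003830, 1892394390, 1888202135]


def bit_differences(a, b):
    """Count differing bits per item pair: xor, then count the set bits.

    The mask keeps the result within 32 bits even if the inputs are signed.
    """
    return [bin((x ^ y) & 0xFFFFFFFF).count("1") for x, y in zip(a, b)]


diffs = bit_differences(fp_a, fp_b)
total = sum(diffs)
error_rate = total / (32 * len(diffs))

print(diffs)              # [2, 0, 0, 1, 1, 0, 0, 0, 0, 0]
print(total, error_rate)  # 4 0.0125
```

What error rate still counts as "the same recording" is a threshold you have to choose yourself; this post does not give one.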

Sometimes you might need to align the two fingerprints, because they do not start at exactly the same time, so for example you will be comparing item 0 from the first fingerprint with item 1 from the second fingerprint. See http://acoustid.org/fingerprint/26931382/compare/13214169 for an example.
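One simple way to do that alignment is to try a handful of item offsets and keep the one with the lowest bit error rate. The sketch below is a brute-force illustration of the idea, not how AcoustID itself does it; `bit_error_rate`, `best_offset`, and the `max_offset` and `min_overlap` parameters are names invented for this example.

```python
def bit_error_rate(fp_a, fp_b, offset, min_overlap=1):
    """Fraction of differing bits when fp_a[i + offset] is compared with fp_b[i].

    A negative offset shifts the second fingerprint instead of the first.
    """
    if offset >= 0:
        pairs = list(zip(fp_a[offset:], fp_b))
    else:
        pairs = list(zip(fp_a, fp_b[-offset:]))
    if len(pairs) < min_overlap:
        return 1.0  # too little overlap to say anything useful
    wrong = sum(bin((x ^ y) & 0xFFFFFFFF).count("1") for x, y in pairs)
    return wrong / (32 * len(pairs))


def best_offset(fp_a, fp_b, max_offset=5, min_overlap=1):
    """Return the offset in [-max_offset, max_offset] with the lowest error rate."""
    candidates = range(-max_offset, max_offset + 1)
    return min(candidates,
               key=lambda off: bit_error_rate(fp_a, fp_b, off, min_overlap))


# Toy data: the second fingerprint is the first one with its first item cut off.
first = [1376359748, 1397470669, 1397396981, 1359666679, 1359667159,
         1359716246, 1342814102, 1347003830, 1892394390, 1888202135]
second = first[1:]

offset = best_offset(first, second)
print(offset, bit_error_rate(first, second, offset))
```

With real fingerprints you would want a larger `min_overlap`, because very short overlaps can produce a low error rate by chance.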

Lukas





--
You received this message because you are subscribed to the Google Groups "AcoustID" group.
To unsubscribe from this group and stop receiving emails from it, send an email to acoustid+u...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Philipp Wolfer

Feb 6, 2019, 7:58:43 AM
to acou...@googlegroups.com

On Wed, Feb 6, 2019 at 13:53, <webm...@allsync.de> wrote:
I use fpcalc 1.4.3 under Win7 with the -raw parameter and I got some values higher than the 32-bit limit (2,147,483,647):

AFAIK those are unsigned integers, so the max. representable value is 4,294,967,295.
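To illustrate the difference: 2,147,483,647 is the largest signed 32-bit value, while an unsigned 32-bit integer goes up to 4,294,967,295. If your own code expects signed values, the conversion is a small bit operation. This is a generic Python sketch with made-up helper names, not something taken from fpcalc.

```python
def to_unsigned32(value):
    """Reinterpret a (possibly negative) 32-bit value as unsigned."""
    return value & 0xFFFFFFFF


def to_signed32(value):
    """Reinterpret a 32-bit value as signed two's complement."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


print(to_signed32(4294967295))   # -1
print(to_signed32(2147483648))   # -2147483648
print(to_unsigned32(-1))         # 4294967295
```

The bit pattern is the same either way, so xor-based comparisons are not affected by which interpretation you use.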
 


I tried it with the 32-bit and 64-bit versions of fpcalc.

Any ideas why the values are higher than 32 bits?



Jashan Chittesh

Feb 14, 2019, 10:11:03 AM
to AcoustID
On Monday, October 14, 2013 at 3:14:32 PM UTC+2, Lukáš Lalinský wrote:

Sometimes you might need to align the two fingerprints, because they do not start at exactly the same time, so for example you will be comparing item 0 from the first fingerprint with item 1 from the second fingerprint. See http://acoustid.org/fingerprint/26931382/compare/13214169 for an example.

Could this be used to calculate a time offset to synchronize two "same but slightly different" audio files? The use case is a rhythm game, where two different files of the same recording with an offset at the beginning should "match", but would need that offset applied so that gameplay and music don't get out of sync.

This doesn't have to be sample-precise, of course. A difference of 20-30 ms probably wouldn't be noticeable to most people, and possibly even more than that.
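As a rough sketch of what such a conversion could look like: an item offset can be turned into a time offset if you know how much audio one item covers. The constants below are assumptions based on what I understand to be Chromaprint's default settings (11025 Hz sample rate, 4096-sample frames advancing by one third of a frame); none of them are stated in this thread, so please verify them against the Chromaprint source before relying on them.

```python
# All three constants are assumptions about Chromaprint's defaults,
# not values confirmed anywhere in this thread.
SAMPLE_RATE = 11025            # Hz, assumed internal sample rate
FRAME_SIZE = 4096              # samples per analysis frame, assumed
FRAME_STEP = FRAME_SIZE // 3   # samples between consecutive items, assumed

ITEM_SECONDS = FRAME_STEP / SAMPLE_RATE


def offset_to_milliseconds(item_offset):
    """Convert an offset counted in fingerprint items to milliseconds."""
    return item_offset * ITEM_SECONDS * 1000.0


print(round(offset_to_milliseconds(1), 1))
print(round(offset_to_milliseconds(-3), 1))
```

If those assumptions hold, one item is roughly 124 ms, which is coarser than the 20-30 ms mentioned above. The item offset would then only give a first estimate that needs refining by some other method on the decoded audio.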