Timestamps for OTA recognition

Mike Sheldon

Jun 3, 2012, 2:48:55 AM
to echo...@googlegroups.com
Hi all,

I'm attempting to perform OTA recognition; however, I'm finding that
recognition of songs that are part-way through is much less reliable
than for songs that start at roughly the same time as the recording. Is
there something I should be doing to indicate that any timestamps
generated should only be considered relative to one another, not as
absolute values?

Currently I simply give an offset of 0 when initialising libcodegen,
but I'm concerned that the server might be interpreting this as meaning
that we're really at the start of a song.

Thanks,
Mike.

Andrew Nesbit

Jun 4, 2012, 6:38:30 PM
to echo...@googlegroups.com
Hi Mike,


On Sun, Jun 3, 2012 at 7:48 AM, Mike Sheldon <mikes...@gmail.com> wrote:
> I'm attempting to perform OTA recognition, however I'm finding that
> recognition of songs that are part-way through is much less reliable
> than songs that start roughly at the same time as recording does.

This is not an issue with the time codes. What is happening is that, when you start the codegen, it takes a small amount of time to "warm up", and this means that the distribution of the detected onsets (from which the hash codes are computed) changes as the codegen passes over the audio, relative to the point at which the codegen was started. We've studied this behaviour in depth and have a very good understanding of it, and are implementing updates, in conjunction with several other OTA improvements, which will further improve match rates from the middle of a song.

> Is
> there something I should be doing to indicate that any timestamps
> generated should only be considered relative to one another, not as
> absolute values?

This is indeed how the time codes are already interpreted (see below).

> Currently I simply give an offset of 0 when initialising libcodegen,
> but I'm concerned that the server might be interpreting this as meaning
> that we're really at the start of a song.

The purpose of the time codes is not to measure the time distance between the query and the start of the song. It is to measure the relative distances between hash codes in the fingerprint segment so that it can be aligned in time with the query fingerprint during the matching process - if this alignment is strong then the match is considered to be a good one. During matching, the server normalizes all time offsets to a zero-based offset anyway, so I wouldn't worry about that.
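
As a rough illustration of why the absolute offset does not matter, here is a minimal Python sketch. It is not the Echoprint server's matching code; the function names, the (hash, time) pair layout and the sample values are all assumptions made for the example. It scores a match by counting how many shared hash codes agree on a single time difference, which is unchanged when every query time is shifted by a constant.

```python
from collections import Counter


def normalize(codes):
    """Shift (hash, time) pairs so that the earliest time becomes zero."""
    if not codes:
        return []
    t0 = min(t for _, t in codes)
    return [(h, t - t0) for h, t in codes]


def alignment_score(query, reference):
    """Count the hash matches that agree on the most common time difference."""
    ref_times = {}
    for h, t in reference:
        ref_times.setdefault(h, []).append(t)
    diffs = Counter()
    for h, tq in query:
        for tr in ref_times.get(h, ()):
            diffs[tr - tq] += 1
    return max(diffs.values(), default=0)


# Hypothetical fingerprint of a full song: (hash code, time code) pairs.
reference = [(101, 0), (202, 40), (303, 95), (404, 130), (505, 170), (606, 220)]

# Query recorded from the middle of the song, with the codegen started at offset 0.
query_a = [(303, 0), (404, 35), (505, 75)]

# The same query with an arbitrary absolute start offset added to every time code.
query_b = [(h, t + 1000) for h, t in query_a]

print(alignment_score(query_a, reference))  # 3
print(alignment_score(query_b, reference))  # 3
print(normalize(query_b) == query_a)        # True
```

Both queries get the same score because only the differences between time codes enter the histogram, so an explicit zero-based normalization step changes nothing about the result.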

Andrew

Mike Sheldon

Jun 4, 2012, 7:42:46 PM
to echo...@googlegroups.com
On Mon, 2012-06-04 at 23:38 +0100, Andrew Nesbit wrote:

> During matching, the server normalizes all time offsets to a
> zero-based offset anyway, so I wouldn't worry about that.

Ah, that's great, thanks for the feedback.

Mike.


Ton

Aug 13, 2012, 4:09:54 AM
to echo...@googlegroups.com
 
Hi Andrew,

On Tuesday, June 5, 2012 12:38:30 AM UTC+2, Andrew Nesbit wrote:
> Hi Mike,
>
> On Sun, Jun 3, 2012 at 7:48 AM, Mike Sheldon <mikes...@gmail.com> wrote:
> > I'm attempting to perform OTA recognition, however I'm finding that
> > recognition of songs that are part-way through is much less reliable
> > than songs that start roughly at the same time as recording does.
>
> This is not an issue with the time codes. What is happening is that, when you start the codegen, it takes a small amount of time to "warm up", and this means that the distribution of the detected onsets (from which the hash codes are computed) changes as the codegen passes over the audio, relative to the point at which the codegen was started. We've studied this behaviour in depth and have a very good understanding of it, and are implementing updates, in conjunction with several other OTA improvements, which will further improve match rates from the middle of a song.
>
> ....
>
> Andrew
 
Are there any codegen updates related to this available for testing purposes? If not, could you provide some more details on how onset generation depends on the starting point? I suspect that you are referring to the threshold decay mechanism; I would appreciate any thoughts you have on this issue.
 
Ton
 

Andrew Nesbit

Aug 13, 2012, 8:18:40 AM
to echo...@googlegroups.com
On Mon, Aug 13, 2012 at 9:09 AM, Ton <to...@crunchtech.com> wrote:
 
> > On Sun, Jun 3, 2012 at 7:48 AM, Mike Sheldon <mikes...@gmail.com> wrote:
> > > I'm attempting to perform OTA recognition, however I'm finding that
> > > recognition of songs that are part-way through is much less reliable
> > > than songs that start roughly at the same time as recording does.
> >
> > This is not an issue with the time codes. What is happening is that, when you start the codegen, it takes a small amount of time to "warm up", and this means that the distribution of the detected onsets (from which the hash codes are computed) changes as the codegen passes over the audio, relative to the point at which the codegen was started. We've studied this behaviour in depth and have a very good understanding of it, and are implementing updates, in conjunction with several other OTA improvements, which will further improve match rates from the middle of a song.
>
> Are there any codegen updates related to this available for testing purposes? If not, could you provide some more details on the onset generation related to the starting point? I suspect that you refer to the threshold decay mechanism, I would appreciate any thoughts you have on this issue.

Yes, I am referring to the threshold decay mechanism - this affects the sensitivity of the onset detection, thus determining the average density of onsets detected. In addition to codegen updates which optimize for this and which will be published once they are ready (they are necessarily part of a suite of changes), we will soon be installing some further updates on the server-side for post-processing the hash codes and this should also help with the issue.
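
To make the warm-up effect concrete, here is a toy Python sketch of an onset detector with a decaying threshold. It is not libcodegen's actual detector; the `detect_onsets` function, the `decay` and `boost` parameters and the energy values are assumptions chosen only to show the behaviour. The threshold starts "cold" at zero wherever the detector is started, jumps after each onset, and then decays until the next frame exceeds it.

```python
def detect_onsets(energies, start=0, decay=0.8, boost=1.5):
    """Return the frame indices flagged as onsets, scanning from `start`."""
    threshold = 0.0  # the detector has no history at the point where it starts
    onsets = []
    for i in range(start, len(energies)):
        e = energies[i]
        if e > threshold:
            onsets.append(i)
            threshold = e * boost  # raise the threshold after an onset
        else:
            threshold *= decay     # let the threshold decay between onsets
    return onsets


# Hypothetical per-frame energies for one band of a song.
energies = [1.0, 0.1, 0.1, 0.1, 0.1, 0.6, 0.7, 0.1, 0.1, 0.1, 0.1,
            0.1, 0.1, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 1.0]

print(detect_onsets(energies))           # [0, 6, 13, 20]
print(detect_onsets(energies, start=5))  # [5, 13, 20]
```

Started from the beginning, the detector places an onset at frame 6; started at frame 5, it fires immediately at frame 5 and the raised threshold then masks frame 6. From frame 13 onwards the two runs agree, because the threshold state has converged. Hash codes computed from the onsets in the warm-up region would therefore differ between the two runs in this toy model.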

Best,

Andrew