Hi Mike,
I'm attempting to perform OTA recognition, however I'm finding that
recognition of songs that are part-way through is much less reliable
than songs that start roughly at the same time as recording does.
This is not an issue with the time codes. When you start the codegen, it takes a short time to "warm up". As a result, the distribution of the detected onsets (from which the hash codes are computed) depends on the point at which the codegen was started, and it shifts as the codegen passes over the audio. We have studied this behaviour in depth and understand it well. We are implementing updates, alongside several other OTA improvements, that will further improve match rates from the middle of a song.
Is there something I should be doing to indicate that any timestamps
generated should only be considered relative to one another, not as
absolute values?
This is indeed how the time codes are already interpreted (see below).
Currently I simply give an offset of 0 when initialising libcodegen,
but I'm concerned that the server might be interpreting this as meaning
that we're really at the start of a song.
The purpose of the time codes is not to measure the distance in time between the query and the start of the song. It is to measure the relative distances between hash codes in the fingerprint segment, so that the segment can be aligned in time with the query fingerprint during the matching process. If this alignment is strong, the match is considered to be a good one. During matching, the server normalizes all time offsets to a zero-based offset anyway, so I wouldn't worry about that.
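To illustrate why the starting offset does not matter, here is a minimal Python sketch. It is not the server's actual matching code: the function names, the (hash code, time) pair layout, and the toy data are all assumptions made for illustration. It normalizes the query times to start at zero, then votes on the time difference between matching hash codes; a strong alignment shows up as many votes for a single difference.

```python
from collections import Counter

def normalize(codes):
    """Shift (hash_code, time) pairs so the earliest time becomes zero."""
    if not codes:
        return []
    t0 = min(t for _, t in codes)
    return [(code, t - t0) for code, t in codes]

def best_alignment(query, reference):
    """Vote on time differences between matching hash codes.

    Returns (offset, votes) for the most common difference, or (None, 0)
    if no hash codes match. Hypothetical helper, for illustration only.
    """
    ref_times = {}
    for code, t in reference:
        ref_times.setdefault(code, []).append(t)
    votes = Counter()
    for code, t in query:
        for ref_t in ref_times.get(code, []):
            votes[ref_t - t] += 1
    if not votes:
        return None, 0
    return votes.most_common(1)[0]

# Toy reference fingerprint for a whole song (made-up values).
reference = [(101, 0), (202, 5), (303, 9), (404, 14), (505, 20), (606, 27)]

# The same mid-song recording, timed once from offset 0 and once from 1000.
query_zero = [(303, 0), (404, 5), (505, 11), (606, 18)]
query_shifted = [(code, t + 1000) for code, t in query_zero]

offset_a, votes_a = best_alignment(normalize(query_zero), reference)
offset_b, votes_b = best_alignment(normalize(query_shifted), reference)
```

In this sketch both queries produce the same alignment and the same number of votes, because only the spacing between hash codes survives normalization.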
Andrew