Echo Nest vs. Hosted Echoprint - audio fingerprinting accuracy?


Julie Yaunches

Jul 1, 2013, 10:03:02 AM
to echo...@googlegroups.com
Hi group,

I'm working on a prototype of an iOS mobile app that takes audio recordings through the phone's mic and sends them out for audio-fingerprinting.

We're currently integrating with Echo Nest and I've followed the iOS sample app that demonstrates how to do this. This has all gone well and we have our prototype up and running. However, the identification accuracy we're getting from the Echo Nest service is not that great: around 35% for 30+ second samples with no background noise. Introduce background noise and it drops to 0%.

This is sort of what I expected, because Echo Nest say they don't support this 'over-the-air' type of recording with background noise. I understand that the Echo Nest use the Echoprint project in their solution; however, Echoprint say clearly that it should work in environments like this. We're considering going ahead and hosting our own instance of an Echoprint server, but I wanted to check a couple of questions with the group first:

1. Will accuracy increase moving from Echo Nest to our own hosted Echoprint?
2. We don't have a C++ developer or audio expert on our team. I did C++ at university and none since; however, I am an Objective-C programmer. Do we need either of these types of experts if we're going to dive into the Echoprint project?

Thanks,


Julie

Andrew Nesbit

Jul 1, 2013, 10:21:00 AM
to echo...@googlegroups.com
On Mon, Jul 1, 2013 at 3:03 PM, Julie Yaunches <jmya...@gmail.com> wrote:
However, we're finding that the accuracy of identification we're receiving from the Echo Nest service is not that great. Around 35% for 30+ second samples with no background noise. Introduce background noise and it drops to 0%.

This is sort of what I expected because Echo Nest say they don't support this 'over-the-air' type of recording with background noise.

That's right. As we have stated before, over-the-air matching is still experimental. It does work if you increase the density of hash codes generated by the codegen, but you would have to set up your own server for that and ensure that it can scale.
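
To make "density of hash codes" concrete, here is a toy sketch in Python. It is not Echoprint's actual codegen: `pair_hashes`, `fan_out` and `max_delta` are hypothetical names, and the scheme (pairing each detected onset with a few later onsets) is only a simplified stand-in for landmark-style fingerprinting. It shows how one numerical parameter changes the number of codes produced for the same audio, which is also why a denser setting needs a larger index on the server.

```python
def pair_hashes(onsets, fan_out, max_delta=2.0):
    """Toy landmark hashing: pair each onset with up to `fan_out` later onsets.

    NOT Echoprint's real algorithm. `fan_out` and `max_delta` are hypothetical
    knobs used only to illustrate what "hash code density" means.
    """
    codes = []
    for i, t0 in enumerate(onsets):
        for t1 in onsets[i + 1 : i + 1 + fan_out]:
            delta = t1 - t0
            if delta > max_delta:
                break  # onsets are sorted, so later ones are even further away
            # Quantise the gap to 10 ms steps so small timing jitter
            # still maps to the same code.
            codes.append((round(delta * 100), t0))
    return codes


# Onset times (in seconds) for a short, made-up excerpt.
onsets = [0.0, 0.3, 0.7, 1.2, 1.5, 2.4]

sparse = pair_hashes(onsets, fan_out=1)
dense = pair_hashes(onsets, fan_out=3)
print(len(sparse), len(dense))
```

With these six onsets, `fan_out=1` yields 5 codes and `fan_out=3` yields 12. More codes per second give a noisy query more chances to share codes with the reference track, at the cost of a proportionally larger database.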

I understand the Echo Nest are using the Echoprint project in their solution, however Echoprint say clearly that it should work in environments like this.

Thank you for pointing that out. That information is unfortunately out of date and needs to be corrected.

A better source of information is the FAQ: http://notes.variogr.am/post/27796385927/the-audio-fingerprinting-at-the-echo-nest-faq

1. Will accuracy increase moving from Echo Nest to our own hosted Echoprint?

Not unless you increase the density of hash codes in the codegen.

2. We don't have a C++ developer or audio expert on our team. I did C++ in University, none since.. however, I am an Objective C programmer. Do we need either of these types of experts if we're going to dive into the Echoprint project?

If it's an iOS app then obviously you will need an iOS expert. If you are only making minor changes to the codegen (e.g., adjusting various numerical parameters) then you probably don't need to be a C++ expert. To deploy a server you should know a bit of Python and the overall principles of how to run a server, databases, etc.

Andrew

Julie Yaunches

Jul 1, 2013, 11:03:19 AM
to echo...@googlegroups.com
Hi Andrew, 

Thanks for your response, this is really useful. I appreciate the thoroughness as well. Do you know what kind of improvement we might see? If we try the technique you're suggesting, i.e. set up our own server and increase the density of hash codes generated by the codegen, how much improvement in identification do you think we'll see for samples with low to moderate background noise, given that we keep our samples around 30+ seconds?

Thanks again, 

Julie

Andrew Nesbit

Jul 1, 2013, 12:24:11 PM
to echo...@googlegroups.com
On Mon, Jul 1, 2013 at 4:03 PM, Julie Yaunches <jmya...@gmail.com> wrote:
Thanks for your response, this is really useful. I appreciate the thoroughness as well.

You're welcome.
 
I'm wondering if you'll know what kind of improvement we might see? If we try the technique you're suggesting.. to setup our own server and increase the density of hash codes generated by the codegen, how much improvement in identification do you think we'll see in samples with low to moderate background noise?

It depends on many variables, so the only way to find out is to prototype something and test it. Once we have a controlled evaluation kit it will be much easier to say. Increasing the density does significantly improve results, but again, if you want a large database, the increased hash code density means that even more consideration must be given to scalability.
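
In the absence of an evaluation kit, a small harness is enough to compare settings. The following is a minimal Python sketch under stated assumptions: `identification_rates` is a hypothetical helper, and `identify` stands for whatever lookup you wrap around your own server (it is not an Echoprint or Echo Nest API call). It reports the fraction of correctly identified queries per recording condition.

```python
from collections import defaultdict


def identification_rates(trials, identify):
    """Return {condition: fraction of queries identified correctly}.

    `trials` is an iterable of (condition, query, expected_track_id) tuples.
    `identify` is a hypothetical callable wrapping your own lookup; it should
    return a track id, or None when nothing matched.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for condition, query, expected in trials:
        totals[condition] += 1
        if identify(query) == expected:
            hits[condition] += 1
    return {condition: hits[condition] / totals[condition] for condition in totals}


if __name__ == "__main__":
    # Stand-in lookup for demonstration only: a dict from query name to track id.
    fake_index = {"clean_1": "A", "clean_2": "B", "noisy_1": "A"}
    trials = [
        ("clean", "clean_1", "A"),
        ("clean", "clean_2", "B"),
        ("noisy", "noisy_1", "A"),
        ("noisy", "noisy_2", "B"),  # not in the index, so it counts as a miss
    ]
    print(identification_rates(trials, fake_index.get))
```

Running the same set of trials before and after changing a codegen parameter gives a like-for-like comparison, provided the query recordings themselves stay fixed.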

Given that we keep our samples around 30+ seconds.

Is this the query length or the length of the reference audio tracks in the database?

Andrew

Julie Yaunches

Jul 1, 2013, 12:34:10 PM
to echo...@googlegroups.com
I think I understand.

I was speaking about the length of the .caf files that I then encode to send out to the hosted Echoprint server. Would increasing the density of the hash codes generated by the codegen also allow us to decrease the length of these samples?

Andrew Nesbit

Jul 1, 2013, 12:46:23 PM
to echo...@googlegroups.com
On Mon, Jul 1, 2013 at 5:34 PM, Julie Yaunches <jmya...@gmail.com> wrote:
I was speaking about the length of the .caf files that I then encode to send out to the hosted Echoprint server. Would increasing the density of the hash codes generated by the codegen also allow us to decrease the length of these samples?

Probably yes. Try it out and see what happens. It's highly dependent on what sort of audio it is, noise levels, how loud the source audio is, etc.
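
One way to explore those variables in a controlled manner is to build the test queries synthetically: trim a clean recording to different lengths and mix in noise at a known signal-to-noise ratio. The sketch below is plain Python operating on lists of PCM sample values; `rms`, `mix_at_snr` and `trim` are hypothetical helper names, and decoding the audio into samples is assumed to happen elsewhere.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(signal, noise, snr_db):
    """Add `noise` to `signal`, scaled so the mix has the requested SNR in dB."""
    if len(noise) < len(signal):
        raise ValueError("noise must be at least as long as signal")
    noise = noise[: len(signal)]
    noise_level = rms(noise)
    if noise_level == 0:
        raise ValueError("noise is silent")
    # SNR(dB) = 20 * log10(rms(signal) / rms(gain * noise)); solve for gain.
    gain = rms(signal) / (noise_level * 10 ** (snr_db / 20))
    return [s + gain * n for s, n in zip(signal, noise)]


def trim(samples, sample_rate, seconds):
    """Keep only the first `seconds` of audio."""
    return samples[: int(sample_rate * seconds)]


if __name__ == "__main__":
    sample_rate = 11025  # assumed rate for this demo only
    signal = [math.sin(2 * math.pi * 440 * i / sample_rate) for i in range(sample_rate * 2)]
    noise = [math.sin(2 * math.pi * 97 * i / sample_rate) for i in range(sample_rate * 2)]
    query = trim(mix_at_snr(signal, noise, snr_db=10), sample_rate, seconds=1)
    print(len(query))
```

Synthetic mixing is only a rough proxy for a real over-the-air recording, which also involves the room and the microphone, but it makes query length and noise level independently adjustable.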

Andrew

Julie Yaunches

Jul 1, 2013, 4:39:07 PM
to echo...@googlegroups.com
We'll do that. Thanks a lot!

cheche

Jul 1, 2013, 10:13:56 PM
to echo...@googlegroups.com
Dear Andrew Nesbit,
How do I increase the density of hash codes?
Thank you.

Julie Yaunches

Jul 3, 2013, 11:36:11 AM
to echo...@googlegroups.com
Hey Andrew,

I hope I can ask for a little more help. I've been working on getting both the Echoprint server and codegen set up. I've done all that and have ingested encoded tracks into my local server. I've basically duplicated the results I was seeing from the Echo Nest API, i.e. correct identification with no background noise, but no identification once moderate background noise is added.

I'm just looking through the codegen trying to figure out how to 'increase the density of the hash codes'. I'm a bit new to all this. Did you mean to increase the sampling rate? Or decrease the seconds per chunk?

You mentioned adjusting various numerical parameters. Can you elaborate on which ones?

Again, apologies. I'm quite new to many of these concepts but am trying to get up to speed.

Thanks,

Julie


adambenjamincohen

Jun 25, 2015, 5:31:00 PM
to echo...@googlegroups.com
For the benefit of other internet searchers, I think Andrew was referring to a patch such as this one:
