Google Image recognition POC (+Spotify integration demo)


Marc

May 15, 2013, 4:07:00 PM
to tas...@googlegroups.com
Dear Tasker fellows,

I just finished my newest and by far the most complicated script I've ever worked on.

The premise is simple - take a photo, let Google analyze it and get its textual "best guess" back to use in Tasker!

"Okay, so we can analyze photos, great. Now what could you do with it?" That's the question I was asking myself after getting it to work for the first time. I'm a passionate Spotify user, so for the purpose of demoing it I thought it would be cool if you could use a photo to launch albums, bands and so on in Spotify. And that's exactly what it does. It works surprisingly well and you can do some really cool stuff with it, like...

  • ... snapping the cover art of one of your CDs, vinyl records, music posters etc. to listen to the corresponding album
  • ... taking a photo of an artist from a website or - even better, of course - Google Image search to
  • ... launching specifically named playlists - for example the playlist 'banana' - by taking a picture of a banana :)

So how does it work?
I won't go into any details right now, but if there are a lot of requests I'll post a tutorial on how I did it. As you can imagine, it's pretty complicated and I'm currently pretty busy, so don't hold your breath. However, I'll try to post a demo video, probably at the end of the week (I currently don't have a second camera to record my phone, lol). I'll also gladly provide all the necessary actions, profiles and scenes whenever I find the time, but they will only work 'out of the box' if you meet all of the following requirements:
  1. a rooted Nexus 4
  2. Chrome and/or Chrome Beta
  3. The full version of Chrome UA Switcher (~$2, €1,49)
  4. joaomgcd's AutoShortcut (pro version optional; I recommend it though, for the sake of supporting this guy)
  5. GermainZ's getevent->sendevent script if you're not using a Nexus 4.
  6. Spotify (optional)

What are the flaws and drawbacks?
Well, it's incredibly slow. It takes 14 seconds from launching the action to reach the camera dialog. After you take your photo, it takes another crawling 21 seconds (2-megapixel photo on 3G) for Spotify to open your desired song, album, band or playlist. So it's a pretty cool proof of concept, I think, but due to its long loading times it's not something you'd use on a day-to-day basis (why would you want to do that anyway?! :). Please keep in mind that the majority of the loading time is due to all the workarounds one has to use in order to achieve what I was looking for.

Some optimisation within the boundaries of Tasker could probably bring loading times down by 25 - 50%. A "native app" solution using similar dirty workarounds to mine - needed because of missing APIs - could cut it down to mere seconds.
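
To put those percentages into seconds, here is a small worked calculation. It uses only the two timings quoted above (14 s and 21 s); the function and variable names are made up for illustration, and the 25 - 50% range is the estimate from the post, not a measurement.

```python
# Timings reported in the post (seconds).
TIME_TO_CAMERA = 14    # launching the action until the camera dialog appears
TIME_TO_SPOTIFY = 21   # photo taken until Spotify shows the result


def projected_total(total_seconds, reduction):
    """Return the total time after cutting it by `reduction` (0.25 means 25%)."""
    return total_seconds * (1 - reduction)


total = TIME_TO_CAMERA + TIME_TO_SPOTIFY
best_case = projected_total(total, 0.50)    # 50% reduction
worst_case = projected_total(total, 0.25)   # 25% reduction

print(f"current total: {total} s")
print(f"after optimisation: {best_case} s to {worst_case} s")
```

So even the optimistic end of the estimate would still leave a wait of roughly 17 to 26 seconds per photo.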



Yeah, so that's it for now, have to go back to work. Hope you guys are looking forward to hearing more from me. If you have any questions or a special request for the demo video, let me know.

Best,
Marc



Marc

May 15, 2013, 4:13:32 PM
to tas...@googlegroups.com
Well, "launching" an album, playlist etc. was probably the wrong word for it. It doesn't directly play the music but takes you to the corresponding search result within Spotify. You then have to select it in order to play.
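
For readers wondering how a text label can end up as a Spotify search, here is a minimal sketch. It is not the actual Tasker setup from this thread (that hasn't been posted); it assumes the recognised "best guess" is available as a plain string and that the Spotify app is opened through a `spotify:search:` URI. The helper name `spotify_search_uri` is hypothetical.

```python
from urllib.parse import quote


def spotify_search_uri(best_guess):
    """Turn a recognised text label into a Spotify search URI (assumed scheme)."""
    # Collapse stray whitespace so "abbey road  the beatles" becomes one clean query.
    query = " ".join(best_guess.split())
    # Percent-encode everything that is not URI-safe, including spaces.
    return "spotify:search:" + quote(query, safe="")


print(spotify_search_uri("banana"))
print(spotify_search_uri("abbey road  the beatles"))
```

Opening such a URI would only bring up the search results, which matches the behaviour described above: you still have to pick the album or playlist yourself.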

Best,
Marc

Mike L

May 16, 2013, 2:27:05 PM
to tas...@googlegroups.com
This sounds very cool. I wonder if it would work for locations. Looking forward to an update; I'm curious to see the Tasker logic for this.

Marc

May 16, 2013, 6:13:47 PM
to tas...@googlegroups.com


On Thursday, May 16, 2013 8:27:05 PM UTC+2, Mike L wrote:
This sounds very cool. I wonder if it would work for locations.

You mean sights, iconic buildings and so on, right? Yes, that should work.

I just took a photo of a random, completely untagged photo straight from Flickr and it worked - however, I haven't tried that in real-life conditions.

Best,
Marc

David Spivey

May 20, 2013, 11:52:26 AM
to tas...@googlegroups.com
For everyone here: you can switch Chrome's user agent from Tasker and Secure Settings instead of paying for a user agent switcher app.

This set of commands will switch chrome to Desktop mode.
Run "chmod 0777 /data/local"
Write File "../../data/local/chrome-command-line", chrome --user-agent="Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/22.0.1207.1 Safari/537.1", no append
Run "chmod 775 /data/local/chrome-command-line"

To reverse this, either delete /data/local/chrome-command-line or rename it to something else.
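
The quoting in the Write File step is easy to get wrong, so here is a small sketch of what ends up in the file. It only reproduces the steps listed above (overwrite the file, then set mode 775); the function names are hypothetical, and the demo writes to a temporary directory rather than to /data/local, which needs root on the device.

```python
import os
import tempfile

# Desktop user agent string taken from the post above.
DESKTOP_UA = ("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 "
              "(KHTML, like Gecko) Chrome/22.0.1207.1 Safari/537.1")


def command_line_content(user_agent):
    """Build the single line that goes into the chrome-command-line file."""
    if '"' in user_agent:
        # A double quote would end the --user-agent value early.
        raise ValueError("user agent must not contain double quotes")
    return f'chrome --user-agent="{user_agent}"'


def write_command_line(path, user_agent):
    """Overwrite `path` with the command line (no append) and set mode 775."""
    with open(path, "w") as fh:
        fh.write(command_line_content(user_agent))
    os.chmod(path, 0o775)


# Demo target in a temp directory; on the device it would be
# /data/local/chrome-command-line.
demo_path = os.path.join(tempfile.mkdtemp(), "chrome-command-line")
write_command_line(demo_path, DESKTOP_UA)

with open(demo_path) as fh:
    written = fh.read()
print(written)
```

The important detail is that the whole user agent string sits inside one pair of double quotes, since it contains spaces and parentheses.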

David Spivey

May 20, 2013, 11:54:08 AM
to tas...@googlegroups.com
Also, you can use an app called FRep from the Play Store with AutoShortcut to record and replay screen touch events without using the getevent->sendevent script.

Matt R

May 20, 2013, 12:10:50 PM
to tas...@googlegroups.com
The same set of shell commands should work in Tasker itself. You shouldn't need Secure Settings.

Matt
