Image similarity search: training the model without tags


Anguo Yang

Jun 1, 2016, 7:15:59 AM
to Caffe Users
Hi all,
Could anyone kindly help me with this issue? Thanks.

We have a large number of photos/images, say 10 million or more. They are original photos/images from our customers that need to be protected (to prevent plagiarism). We call this dataset A.
We also collected many images with a web crawler from blogs, websites, forums, etc. Some of these images are simply copied from dataset A, and some have an added watermark. We call this dataset B. It currently contains about 300,000 images, but it grows day by day.
We will take one or several images from dataset A, which we call dataset C. We want to search B for images that are similar to those in C and list all similar images.

We want to use deep learning for similarity search, but most of the images in dataset A have no tags. Could we train a dedicated model on these untagged images so that we get more accurate results when searching for similar images?

Thanks a lot for your patience in reading this long requirement, and have a nice day!

B.R
Anguo

Hossein Hasanpour

Jun 1, 2016, 12:06:37 PM
to Caffe Users
I guess your best bet is to use autoencoders: take the compressed vector (the encoder output after training) for each image and save it in your database. Then, for a similarity search, look for stored vectors that are close to the vector of your query image.
This is a normal use case for autoencoders, if I'm not mistaken. You won't need labeled data either, since it's unsupervised training.
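A minimal sketch of the lookup step, assuming the autoencoder is already trained and each image has been reduced to a short code vector. The file names, the three-element vectors, the `find_similar` helper, and the 0.95 threshold are all made up for illustration. Cosine similarity is just one reasonable choice of distance, not something the thread prescribes.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def find_similar(query_code, database, threshold=0.9):
    """Return (image_id, score) pairs with score >= threshold, best match first."""
    scored = [
        (image_id, cosine_similarity(query_code, code))
        for image_id, code in database.items()
    ]
    hits = [(image_id, score) for image_id, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical encoder outputs for three crawled images (dataset B).
database_b = {
    "b_001.jpg": [0.90, 0.10, 0.40],  # stands in for a plain copy
    "b_002.jpg": [0.10, 0.80, 0.20],  # stands in for an unrelated image
    "b_003.jpg": [0.85, 0.15, 0.45],  # stands in for a watermarked copy
}

# Hypothetical encoder output for one protected image (dataset C).
query_c = [0.88, 0.12, 0.42]

matches = find_similar(query_c, database_b, threshold=0.95)
for image_id, score in matches:
    print(image_id, round(score, 3))
```

This brute-force loop compares the query against every stored vector, which is fine for a quick test but probably too slow once dataset B grows into the millions. At that scale an approximate nearest-neighbor index would likely be needed.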