Caffe basics: image scaling, scale invariance


PeasAndCues

Jan 17, 2018, 5:14:31 PM
to Caffe Users
I have just downloaded, built and tested Caffe on the CPP Classification example - success.

But then, instead of applying the net to the cat.jpg file (size 480x260 pixels), I applied it to the fish-bike.jpg file (size 481x323 pixels). I'm not sure what the answer should be, but the result didn't seem unreasonable.

I did this specifically to see if the classification.bin (Windows: classification.exe) process would work at all on the different file size. It did, but how?

That is... until this point it was my understanding that there was a 1:1 mapping between pixels in the input layer of the network and the original image. Clearly not. So how is an image of another size made compatible with a network? Is it that an image of a cat is still a cat at any scale, so the classification process always pre-scales the input image to fit the input layer of the network? What if my application cannot allow such scaling - how can I control it and/or prevent it?

That is, suppose I want to create a net to classify T-shirts on a conveyor into sizes (S, M, L, XL, XXL, etc across several product lines), because I'm trying to create some robot sewing machine to identify the product & size, and automatically sew the correct label into the collar. In this situation, I really don't want to scale my images - I want a fixed geometry camera with respect to my products as they come along the conveyor, and for my network to be tailored to the image size of my camera, or to some reasonable fixed size portion of it. Where in Caffe can I control this?

Przemek D

Jan 19, 2018, 6:13:17 AM
to Caffe Users
Remember, this is just an example of how you might use Caffe. If you were developing your own solution for T-shirt classification, you would probably want to build the application from the ground up, using Caffe only as the backend for classification. Data acquisition and preprocessing would be on your side; you would feed an image to Caffe in whichever way you see fit.
Please take a look at the source of this particular example, especially lines 205 and 206: you will see that there is a good deal of external code (doing image resizing, among other things). The actual call to the Caffe backend occurs on line 162.
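To illustrate what "preprocessing on your side" could look like for the fixed-camera case, here is a minimal sketch that cuts a fixed window out of a camera frame instead of resizing it. It uses only the standard library and does not call Caffe. The function name CropToPlanar and all of its parameters are hypothetical, made up for this illustration. The sketch assumes an interleaved 8-bit frame (height x width x channels) and produces planar float data (channels x height x width), which is the layout Caffe input blobs use. The crop size would have to match the input dimensions declared in your network definition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of Caffe: copies a fixed window out of an
// interleaved 8-bit image (height x width x channels) into planar float data
// (channels x crop_h x crop_w). No resampling happens; a window that does not
// fit inside the image is reported as an error instead of being rescaled.
std::vector<float> CropToPlanar(const std::vector<std::uint8_t>& image,
                                std::size_t width, std::size_t height,
                                std::size_t channels,
                                std::size_t x0, std::size_t y0,
                                std::size_t crop_w, std::size_t crop_h) {
  if (image.size() != width * height * channels) {
    throw std::invalid_argument("image buffer does not match its dimensions");
  }
  // Written with subtractions so the checks cannot overflow.
  if (crop_w == 0 || crop_h == 0 ||
      x0 > width || crop_w > width - x0 ||
      y0 > height || crop_h > height - y0) {
    throw std::out_of_range("crop window does not fit inside the image");
  }

  std::vector<float> planar(channels * crop_h * crop_w);
  for (std::size_t c = 0; c < channels; ++c) {
    for (std::size_t y = 0; y < crop_h; ++y) {
      for (std::size_t x = 0; x < crop_w; ++x) {
        // Source index in the interleaved frame.
        const std::size_t src = ((y0 + y) * width + (x0 + x)) * channels + c;
        // Destination index in the planar output.
        const std::size_t dst = (c * crop_h + y) * crop_w + x;
        assert(src < image.size());
        planar[dst] = static_cast<float>(image[src]);
      }
    }
  }
  return planar;
}
```

The returned buffer could then be copied into the network's input blob before the forward pass. Mean subtraction or any other normalization your model expects is left out here and would need to be added separately.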