Cropping an image degrades vision product search results

Raffi Krikorian

Feb 19, 2021, 7:20:15 PM
to cloud-vision-discuss
I noticed that making a minor crop to the background of my image drastically degrades the results I get from a Vision Product Search. Has anyone else encountered this issue?

Note that I'm not cropping the person in the image, just a minor sliver around the border. I'm looking at both the full image results and the PRODUCT_SEARCH results for the various annotations.

Fernando Cid Samper

Feb 23, 2021, 4:27:20 AM
to cloud-vision-discuss
Could you please give more detail on which Cloud Vision API feature you are using when you see this?
Please take a look at crop hints detection [1], which could help you select the best crop region of an image to increase accuracy.

--------------------

Raffi Krikorian

Feb 23, 2021, 12:52:15 PM
to cloud-vision-discuss
I'm first running OBJECT_LOCALIZATION and getting the boundingPoly.normalizedVertices for the PERSON annotation in my image.
Then I'm passing those normalizedVertices to PRODUCT_SEARCH via boundingPoly: {normalizedVertices: normalizedVertices}.
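
For readers following along, here is a minimal sketch of the request body this describes, assuming the REST images:annotate JSON shape with productSearchParams inside imageContext. The helper name, the image URI, the product set path, and the category are hypothetical placeholders, not values from this thread.

```python
from typing import Dict, List, Optional


def build_product_search_request(
    image_uri: str,
    product_set: str,
    product_categories: List[str],
    normalized_vertices: Optional[List[Dict[str, float]]] = None,
) -> dict:
    """Build a JSON-serializable PRODUCT_SEARCH request body (illustrative helper)."""
    params: dict = {
        "productSet": product_set,
        "productCategories": product_categories,
    }
    # Only attach a boundingPoly when the caller supplies one.
    if normalized_vertices is not None:
        params["boundingPoly"] = {"normalizedVertices": normalized_vertices}
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [{"type": "PRODUCT_SEARCH"}],
                "imageContext": {"productSearchParams": params},
            }
        ]
    }


# Hypothetical PERSON box taken from an OBJECT_LOCALIZATION response.
person_vertices = [
    {"x": 0.2, "y": 0.1},
    {"x": 0.8, "y": 0.1},
    {"x": 0.8, "y": 0.95},
    {"x": 0.2, "y": 0.95},
]

request_body = build_product_search_request(
    image_uri="gs://example-bucket/photo.jpg",  # placeholder
    product_set="projects/PROJECT_ID/locations/LOCATION/productSets/SET_ID",  # placeholder
    product_categories=["apparel-v2"],  # placeholder category
    normalized_vertices=person_vertices,
)
```

Calling the same helper without normalized_vertices produces a request with no boundingPoly at all.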

Fernando Cid Samper

Feb 24, 2021, 5:53:04 AM
to cloud-vision-discuss
Could you please give more detail about your model? Namely:

- Which data are you using?
- What is the accuracy of your model with and without cropping?

Are you following recommendations described in [1]?

--------------------

Clayton Mellina

Feb 24, 2021, 11:58:03 AM
to cloud-vision-discuss
Hello,

I understand that you are using the Vision Product Search API https://cloud.google.com/vision/product-search/docs — please correct me if I misunderstood.

Following your second email, I understand that you are not actually cropping the image. Instead, you are keeping the image the same, but passing the normalizedVertices of the "Person" label obtained via OBJECT_LOCALIZATION to the PRODUCT_SEARCH API in the ProductSearchParams https://cloud.google.com/vision/product-search/docs/reference/rest/v1/ImageContext#ProductSearchParams. Again, please correct me if I've misunderstood.

The boundingPoly in ProductSearchParams is expected to contain a product which should serve as the query, and its primary use case is to enable users to manually draw a bounding box around a product of interest to get results specifically for that product. When the box is provided, the API will skip its internal product localization and return results for the bounding box provided. In your case, you are setting the bounding box to a "Person" box, not a product box, so the API is treating the person box as a product when it is not, resulting in degraded results.

Feel free to share more details of your use case, but my recommendation is that you remove the person bounding box from the product search queries so that the API can do its own product detection for best results. If you need to know the "Person" box as well for your application, you can separately retrieve it via OBJECT_LOCALIZATION, as you are. You may wish to reason about the relationship between the "Person" boxes and the product boxes (returned by Product Search API in its multi-detection response https://cloud.google.com/vision/product-search/docs/searching-response). For example, for apparel use cases you may wish to group the product boxes into "outfits" based on the "Person" box. If this is true of your intended use case, then it is best to do this after getting both the OBJECT_LOCALIZATION and PRODUCT_SEARCH responses for the image, namely by computing how much each product box spatially overlaps each "Person" box and making a grouping based on this.
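
The overlap-based grouping could be sketched roughly as follows. This is only one possible approach under stated assumptions: each box arrives as a list of normalizedVertices dicts, the axis-aligned extent of the vertices is a good enough box, and a product belongs to the person whose box covers most of it. The function names and the 0.5 threshold are illustrative choices, not part of the API.

```python
from typing import Dict, List, Tuple

Vertex = Dict[str, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def to_box(vertices: List[Vertex]) -> Box:
    """Axis-aligned extent of a polygon; a missing x or y is treated as 0.0."""
    xs = [v.get("x", 0.0) for v in vertices]
    ys = [v.get("y", 0.0) for v in vertices]
    return min(xs), min(ys), max(xs), max(ys)


def overlap_fraction(product: Box, person: Box) -> float:
    """Fraction of the product box's area that lies inside the person box."""
    width = min(product[2], person[2]) - max(product[0], person[0])
    height = min(product[3], person[3]) - max(product[1], person[1])
    if width <= 0 or height <= 0:
        return 0.0
    area = (product[2] - product[0]) * (product[3] - product[1])
    return (width * height) / area if area > 0 else 0.0


def group_into_outfits(
    product_polys: List[List[Vertex]],
    person_polys: List[List[Vertex]],
    min_overlap: float = 0.5,  # illustrative threshold
) -> Dict[int, List[int]]:
    """Map each person index to the indices of the products that overlap it most."""
    person_boxes = [to_box(p) for p in person_polys]
    outfits: Dict[int, List[int]] = {i: [] for i in range(len(person_boxes))}
    for j, poly in enumerate(product_polys):
        box = to_box(poly)
        scores = [overlap_fraction(box, pb) for pb in person_boxes]
        if not scores:
            continue
        best = max(range(len(scores)), key=scores.__getitem__)
        # Products that barely touch any person are left ungrouped.
        if scores[best] >= min_overlap:
            outfits[best].append(j)
    return outfits


# Made-up boxes: one person on the left half, two detected products.
persons = [[{"y": 0.0}, {"x": 0.5}, {"x": 0.5, "y": 1.0}, {"y": 1.0}]]
products = [
    [{"x": 0.1, "y": 0.2}, {"x": 0.4, "y": 0.2}, {"x": 0.4, "y": 0.5}, {"x": 0.1, "y": 0.5}],
    [{"x": 0.6, "y": 0.2}, {"x": 0.9, "y": 0.2}, {"x": 0.9, "y": 0.5}, {"x": 0.6, "y": 0.5}],
]
outfits = group_into_outfits(products, persons)
```

Measuring overlap as a fraction of the product box, rather than intersection over union, keeps small items such as shoes from being penalized for being much smaller than the person box.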

Hopefully this is helpful. Thanks and all the best,
clayton

Raffi Krikorian

Feb 24, 2021, 7:31:45 PM
to cloud-vision-discuss
Thanks Clayton. Your assumptions are correct and your explanation clears everything up. 