getComponentImages falling short of a few words/characters


Muhammad Ali


Dec 17, 2023, 2:04:51 AM

to tesseract-ocr

Hi team,

I keep running into an issue where the ROI boxes returned by getComponentImages are smaller than the actual words (sample attachment provided), but I couldn't put a finger on what could be causing this.

I am using the tesserocr wrapper with Tesseract 5.3 underneath.

Another question about the same getComponentImages API: does Tesseract have a pre-recognition text detector, like EAST for example? Or is getComponentImages the same thing as running recognition, with only the output being different, i.e. ROI boxes with coordinates instead of text values?

tesserOCR_getComponentImages.png

Zdenko Podobny


Dec 17, 2023, 8:07:00 AM

to tesser...@googlegroups.com

First of all, provide the original input image.

Next, it would be nice to see code to replicate the problem.
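Something along the lines of the sketch below would be enough to replicate it. This is only an illustration, assuming the tesserocr wrapper with Pillow installed and word-level components; the file name input.png and the helper names box_to_corners and dump_word_boxes are hypothetical, not taken from the original post.

```python
from typing import Dict, List, Tuple

# tesserocr reports each component box as a dict with keys 'x', 'y', 'w', 'h'.
Box = Dict[str, int]


def box_to_corners(box: Box) -> Tuple[int, int, int, int]:
    """Convert an x/y/w/h box to (left, top, right, bottom) for drawing or comparison."""
    return (box["x"], box["y"], box["x"] + box["w"], box["y"] + box["h"])


def dump_word_boxes(image_path: str) -> List[Tuple[int, int, int, int]]:
    """Print and return the word-level boxes that GetComponentImages reports."""
    # Imported here so the helper above can be used without tesserocr installed.
    from PIL import Image
    from tesserocr import PyTessBaseAPI, RIL

    image = Image.open(image_path)
    with PyTessBaseAPI() as api:
        api.SetImage(image)
        # Each entry is (component image, box dict, block id, paragraph id).
        components = api.GetComponentImages(RIL.WORD, True)

    corners = [box_to_corners(box) for _, box, _, _ in components]
    for i, c in enumerate(corners):
        print(f"word {i}: left={c[0]} top={c[1]} right={c[2]} bottom={c[3]}")
    return corners


# Hypothetical usage; replace the path with the original input image:
# dump_word_boxes("input.png")
```

Posting the printed coordinates together with the original image makes it possible to check which words are being clipped.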

Zdenko

On Sun, Dec 17, 2023 at 8:04, 'Muhammad Ali' via tesseract-ocr <tesser...@googlegroups.com> wrote:
