getComponentImages falling short of a few words/characters


Muhammad Ali

Dec 17, 2023, 2:04:51 AM
to tesseract-ocr
Hi team,

I have a recurring issue with getComponentImages returning inaccurate ROI boxes: the boxes come out smaller than the actual words and cut off a few characters (sample attached). I couldn't put a finger on what could be causing this.

I am using the tesserocr wrapper with Tesseract 5.3 underneath.

Another question about the same getComponentImages API: does Tesseract have a pre-recognition text detector, like EAST for example? Or does getComponentImages do the same work as text recognition, with only the output being different, i.e. ROI boxes with coordinates instead of text values?
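
For reference, here is a minimal sketch of the kind of call in question. It is not the exact code that produced the attached screenshot. It assumes tesserocr and Pillow are installed and follows the usage shown in the tesserocr README: SetImage followed by GetComponentImages at RIL.WORD level, with no explicit recognition call. The pad_box helper and its pad parameter are hypothetical names added here only to illustrate one possible workaround (growing each box by a few pixels); they are not part of tesserocr, and padding does not address whatever makes the boxes too small in the first place.

```python
from typing import Dict, List

Box = Dict[str, int]  # tesserocr reports each box as {"x", "y", "w", "h"}


def pad_box(box: Box, pad: int, img_w: int, img_h: int) -> Box:
    """Hypothetical helper: grow a box by `pad` pixels per side, clamped to the image."""
    left = max(box["x"] - pad, 0)
    top = max(box["y"] - pad, 0)
    right = min(box["x"] + box["w"] + pad, img_w)
    bottom = min(box["y"] + box["h"] + pad, img_h)
    return {"x": left, "y": top, "w": right - left, "h": bottom - top}


def word_boxes(image_path: str, pad: int = 0) -> List[Box]:
    """Return word-level boxes for an image, optionally padded."""
    # Imported lazily so pad_box stays usable without tesserocr installed.
    from PIL import Image
    from tesserocr import PyTessBaseAPI, RIL

    image = Image.open(image_path)
    with PyTessBaseAPI() as api:
        api.SetImage(image)
        # Each component is a tuple (image, box, block_id, paragraph_id).
        components = api.GetComponentImages(RIL.WORD, True)
    return [pad_box(box, pad, image.width, image.height)
            for _, box, _, _ in components]
```

Calling word_boxes("page.png", pad=3) (the file name and pad value are placeholders) would return the same boxes grown by 3 pixels on each side, which makes it easy to compare padded and unpadded ROIs on the same input.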


tesserOCR_getComponentImages.png

Zdenko Podobny

Dec 17, 2023, 8:07:00 AM
to tesser...@googlegroups.com
First of all, provide the original input image.
Next, it would be nice to see code to replicate the problem.

Zdenko


On Sun, Dec 17, 2023 at 8:04, 'Muhammad Ali' via tesseract-ocr <tesser...@googlegroups.com> wrote: