Hi team,
I keep running into inaccurate ROI boxes from getComponentImages: the boxes come out smaller than the actual words (sample attachments provided). I couldn't put a finger on what could be causing this.
I am using the tesserocr wrapper with Tesseract 5.3 underneath.
Another question about the same getComponentImages API in Tesseract: does Tesseract have a pre-recognition text detector, like EAST for example? Or is getComponentImages the same thing as running recognition, with only the output being different, i.e. instead of text values it returns ROI boxes with coordinates?
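
For reference, a minimal sketch of the kind of call both questions refer to. It assumes tesserocr's PyTessBaseAPI with word-level components (RIL.WORD); the helper names to_ltrb and word_boxes are hypothetical, and the image path is whatever file is being tested. It is only meant to show where the boxes come from, not the exact code behind the attachments.

```python
def to_ltrb(box):
    """Convert a tesserocr box dict (x, y, w, h) to (left, top, right, bottom)."""
    return (box["x"], box["y"], box["x"] + box["w"], box["y"] + box["h"])


def word_boxes(image_path):
    """Return word-level boxes from GetComponentImages as (l, t, r, b) tuples.

    Hypothetical helper: assumes tesserocr and its language data are installed.
    """
    # Imported lazily so to_ltrb can be used without tesserocr installed.
    from tesserocr import PyTessBaseAPI, RIL

    with PyTessBaseAPI() as api:
        api.SetImageFile(image_path)
        # Second argument True restricts the result to text components.
        components = api.GetComponentImages(RIL.WORD, True)
        # Each entry is (PIL image, box dict, block id, paragraph id).
        return [to_ltrb(box) for _im, box, _block_id, _para_id in components]
```

Drawing the returned rectangles over the original image makes it easy to see by how much each box falls short of the word it should cover.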