Dear Tesseract Community,
We run a high-volume, multi-engine OCR pipeline that includes Tesseract 4 (LSTM, Latin), a custom Tesseract 3 (outline-based) model for specific cases, and newer OCR models running in a low-resource serverless environment.
We wanted to share some brief internal results that may be useful to the community.
Key points

Head-to-head comparisons: Direct word-level comparisons show a meaningful share of cases where the Tesseract LSTM model is correct while the other engines are not.
This complementary behavior means the Tesseract LSTM model still adds significant value in an ensemble, despite being an older engine.
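To make the "complementary behavior" claim concrete, a word-level comparison can count the cases where exactly one engine matches the ground truth while all others miss. The sketch below is illustrative only; the engine names and sample data are hypothetical, not from our actual pipeline.

```python
# Hypothetical sketch: counting words each engine gets uniquely right.
# Engine names ("t4_lstm", "t3_legacy", "modern") and data are illustrative.

def exclusive_wins(predictions, truth):
    """Count words an engine gets right while every other engine is wrong."""
    engines = list(predictions)
    wins = {e: 0 for e in engines}
    for i, gt in enumerate(truth):
        correct = [e for e in engines if predictions[e][i] == gt]
        if len(correct) == 1:          # exactly one engine is right
            wins[correct[0]] += 1
    return wins

preds = {
    "t4_lstm":   ["invoice", "total", "4711"],
    "t3_legacy": ["invoice", "tota1", "4711"],
    "modern":    ["invo1ce", "tota1", "4711"],
}
truth = ["invoice", "total", "4711"]
print(exclusive_wins(preds, truth))  # t4_lstm is uniquely correct on "total"
```

A nonzero exclusive-win count for an older engine is exactly the signal that it still pays for its seat in the ensemble.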

Thank you for your follow-up and interest in the technical details.
Regarding the -1 values for T3 and T4: -1 is a sentinel, not an accuracy score.
To keep our single-word accuracy analysis clean, we assigned this value whenever an engine returned multiple words for a single-word source image. This let us separate segmentation discrepancies from standard character-recognition scores. Given the scale of our dataset, grouping these instances under a single label still provides a complete picture of each engine's behavior.
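The -1 sentinel described above can be sketched as follows. This is a minimal, hypothetical illustration: the scoring function name and the per-character accuracy metric are assumptions for the example, not our exact evaluation code.

```python
# Hypothetical sketch of the -1 sentinel: when an engine returns more than
# one word for a single-word image, record -1 instead of an accuracy score,
# keeping segmentation failures out of the character-recognition statistics.

def score_single_word(ocr_output, ground_truth):
    tokens = ocr_output.split()
    if len(tokens) != 1:
        return -1  # segmentation mismatch: excluded from accuracy averages
    word = tokens[0]
    # simple positional per-character accuracy over the ground-truth length
    matches = sum(a == b for a, b in zip(word, ground_truth))
    return matches / len(ground_truth)

print(score_single_word("inv oice", "invoice"))  # -1: two tokens returned
print(score_single_word("invoice", "invoice"))   # 1.0: exact match
```

Downstream, scores equal to -1 are simply filtered out before computing mean accuracy, so segmentation behavior can be reported as a separate rate.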
I hope this clarifies the anomaly. Please let me know if you have any further questions.
Best regards,
Jakub Hybl