Dear Tesseract Community,
We run a high-volume, multi-engine OCR pipeline that includes Tesseract 4 (LSTM, Latin), a custom Tesseract 3 (outline-based) model for specific cases, and newer OCR models, all in a low-resource serverless environment.
We wanted to share some brief internal results that may be useful to the community.
Key points
- Data: Internal printed-word sets. We cannot publish the datasets. The goal here is practical relevance, not reproducibility claims.
- Scope: These results focus strictly on word recognition.
- Context: We did not evaluate other valuable Tesseract features (e.g., segmentation, CPU performance) or address known limitations (e.g., GPU support or the practicality of generic LSTM retraining); however, they might be important for your use case.
Findings
Confidence Calibration
For Tesseract LSTM-based models, there is a strong link between confidence and correctness: most errors sit at lower confidence levels. This makes thresholding and model voting reasonably straightforward. In our tests, the confidence distributions of Tesseract LSTM models are usable for such decisions.
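To make the thresholding and voting idea concrete, here is a minimal Python sketch. It is not our production code: the `Reading` type, the engine labels, the threshold of 80, and the "sum of confidences per candidate" voting rule are all assumptions chosen for illustration, and the right threshold depends on your own data.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple, Optional


class Reading(NamedTuple):
    """One engine's output for a single word (hypothetical structure)."""
    engine: str        # illustrative engine label, e.g. "tess4-lstm"
    text: str          # recognized word
    confidence: float  # word confidence in the [0, 100] range


def accept(reading: Reading, threshold: float = 80.0) -> bool:
    """Simple thresholding: keep a word only if its confidence is high enough.

    The default of 80 is an assumed value, not a recommendation.
    """
    return reading.confidence >= threshold


def vote(readings: Iterable[Reading], min_confidence: float = 0.0) -> Optional[str]:
    """Confidence-weighted vote across engines for one word.

    Each candidate text scores the sum of the confidences of the engines
    that produced it; the highest-scoring text wins. Returns None when no
    reading passes min_confidence.
    """
    scores = defaultdict(float)
    for r in readings:
        if r.confidence >= min_confidence:
            scores[r.text] += r.confidence
    if not scores:
        return None
    return max(scores, key=scores.get)
```

Summing raw confidences across engines only makes sense if the engines' scores are roughly comparable; if they are not, a per-engine rescaling step would be needed first.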
Note that the Tesseract 3 outline-based matching model is more noise-sensitive on our data, reinforcing that the tested dataset is not "easy".
Confidence scores are limited to the [0, 100] range. Each confidence level has two corresponding values, red and green. Ideally, high confidence levels have a low red value and a high green value, and vice versa at low confidence levels.
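For anyone who wants to produce the same kind of per-confidence breakdown on their own data, a rough sketch follows. It assumes the green value counts correctly recognized words and the red value counts incorrect ones, which is our reading of the description above rather than something stated explicitly; the function name and the bin width of 10 are likewise illustrative.

```python
from typing import Dict, Iterable, List, Tuple


def confidence_histogram(
    samples: Iterable[Tuple[float, bool]], bin_width: int = 10
) -> Dict[int, List[int]]:
    """Count correct and incorrect words per confidence bin.

    samples: (confidence, is_correct) pairs, confidence in [0, 100].
    Returns {bin_start: [n_correct, n_incorrect]}, i.e. the assumed
    "green" and "red" values for each bin.
    """
    bins = {start: [0, 0] for start in range(0, 100, bin_width)}
    last_start = max(bins)
    for confidence, is_correct in samples:
        clamped = min(max(confidence, 0), 100)
        # A confidence of exactly 100 falls into the last bin.
        start = min(int(clamped // bin_width) * bin_width, last_start)
        bins[start][0 if is_correct else 1] += 1
    return bins
```

With well-behaved confidence scores, the incorrect count should dominate in the low bins and shrink toward zero in the high bins.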
Head-to-Head Comparisons
Direct word-level comparisons show a meaningful share of cases where the Tesseract LSTM model is correct while the others are not.
This complementary behavior means the Tesseract LSTM model still adds significant value in an ensemble, despite being an older engine.
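A head-to-head comparison of this kind can be tallied with very little code. The sketch below is an illustration under assumptions: it uses exact string match as the correctness criterion and hypothetical aligned word lists, which may differ from how you score your own data.

```python
from typing import Dict, Sequence


def head_to_head(
    truth: Sequence[str], engine_a: Sequence[str], engine_b: Sequence[str]
) -> Dict[str, int]:
    """Tally word-level agreement of two engines against ground truth.

    All three sequences must be aligned word by word. Correctness here is
    exact string equality (an assumption; normalize first if needed).
    """
    if not (len(truth) == len(engine_a) == len(engine_b)):
        raise ValueError("inputs must have the same length")
    counts = {"both_correct": 0, "only_a": 0, "only_b": 0, "neither": 0}
    for t, a, b in zip(truth, engine_a, engine_b):
        a_ok, b_ok = (a == t), (b == t)
        if a_ok and b_ok:
            counts["both_correct"] += 1
        elif a_ok:
            counts["only_a"] += 1
        elif b_ok:
            counts["only_b"] += 1
        else:
            counts["neither"] += 1
    return counts
```

The `only_a` and `only_b` counts are the interesting ones for ensembles: a non-trivial `only_a` share means engine A contributes answers that engine B cannot supply on its own.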

Conclusion
Mature engines like Tesseract are not obsolete (at least for certain use cases). In our pipeline, Tesseract LSTM word recognition remains competitive and, importantly, provides well-calibrated confidence scores that are useful for filtering and ensemble voting.