
Suggestions for Improving Handwritten English Text Recognition


李佳栓

Nov 26, 2024, 8:57:50 AM
to tesseract-ocr

Dear Tesseract Team,

I am currently working on a project that involves recognizing English text. My workflow uses the CRAFT text detector to identify text regions, and I then run Tesseract OCR on each isolated region. This approach achieves high accuracy on printed text, but performance drops significantly on handwritten text (see the attached image for an example).

To improve accuracy, I’ve already applied preprocessing steps such as grayscale conversion and binarization. However, I would like to ask for advice on optimizing preprocessing parameters for images with diverse characteristics. Specifically:

  1. Are there recommended preprocessing configurations that generally work well for most images when preparing them for OCR?
  2. Are there additional steps or methods that could further enhance handwritten text recognition accuracy?
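For reference, the grayscale conversion and binarization mentioned above could look roughly like the following minimal sketch. It is pure Python so it runs without dependencies, and it assumes Otsu's method for choosing the threshold; the post does not say which binarization was actually used, and the function names here are illustrative only. In practice a library such as OpenCV would do the same job far faster.

```python
def to_grayscale(rgb_rows):
    """Convert rows of (R, G, B) tuples to rows of 0-255 luminance values."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_rows
    ]


def otsu_threshold(gray_rows):
    """Return the threshold that maximizes between-class variance (Otsu)."""
    hist = [0] * 256
    for row in gray_rows:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0    # number of pixels with value <= t
    sum_dark = 0  # sum of intensities of those pixels
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        w_light = total - w_dark
        if w_dark == 0:
            continue  # no pixels in the dark class yet
        if w_light == 0:
            break     # every pixel is already in the dark class
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        var_between = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(gray_rows, threshold):
    """Map pixels above the threshold to white (255), the rest to black (0)."""
    return [[255 if v > threshold else 0 for v in row] for row in gray_rows]
```

A global threshold like this tends to struggle with uneven lighting, which is one reason a single configuration rarely suits every image.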

Thank you for your time and support. I look forward to your valuable insights.

Best regards

Attachment: 螢幕擷取畫面 2024-11-26 214426.png (screenshot)

Ajinkya Bobade

Nov 29, 2024, 1:18:24 AM
to tesseract-ocr
Hi,

The problem seems to be that you are using a stock pre-trained Tesseract model. For your scenario, you would need to train (or fine-tune) a separate Tesseract model on samples of your handwriting. That new model should improve your accuracy.
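As a rough sketch of the first step in such retraining: the tesstrain tooling expects, as far as I know, a folder of single-line images, each paired with a `.gt.txt` file holding its transcription. The helper below only arranges files in that layout. The function name, the `line_00000` naming scheme, and the `data/<model>-ground-truth` default are assumptions for illustration, so check them against the tesstrain README before relying on them.

```python
import shutil
from pathlib import Path


def write_ground_truth(samples, model_name, data_dir="data"):
    """Copy line images and write a matching .gt.txt transcription for each.

    samples: iterable of (image_path, transcription) pairs, one text line each.
    Returns a list of (image_copy, ground_truth_file) Path pairs.
    """
    out_dir = Path(data_dir) / f"{model_name}-ground-truth"
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for index, (image_path, text) in enumerate(samples):
        src = Path(image_path)
        # Hypothetical naming scheme: line_00000.png, line_00001.png, ...
        dst = out_dir / f"line_{index:05d}{src.suffix.lower()}"
        shutil.copyfile(src, dst)
        # The transcription shares the image's base name, with a .gt.txt suffix.
        gt_file = out_dir / f"{dst.stem}.gt.txt"
        gt_file.write_text(text.strip() + "\n", encoding="utf-8")
        written.append((dst, gt_file))
    return written
```

With the pairs in place, fine-tuning is typically started from the existing English model rather than from scratch, for example via tesstrain's `make training MODEL_NAME=... START_MODEL=eng`.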

Regards
Ajinkya