Handling text scans and cleaning

Ajinkya Bobade

Apr 8, 2025, 12:09:52 AM
to tesser...@googlegroups.com
I have noticed that text cleaning is the most difficult part of an OCR pipeline. I have struggled a lot with this part: without properly cleaned text, OCR accuracy simply collapses. To handle text cleaning separately, I created a GitHub repo that uses AI to clean up all the text in an image. Once the text is cleaned, we can run our own custom OCR models on it. I have personally seen OCR accuracy shoot up to 99% on a properly preprocessed and cleaned image.

Here is the GitHub link: https://github.com/ajinkya933/ClearText
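
For readers who want a baseline to compare against, here is a minimal sketch of one classical cleaning step, global Otsu binarization, in plain Python. This is not the method used in the ClearText repo, which is described above only as AI-based; the function names `otsu_threshold` and `binarize` and the list-of-rows image layout are assumptions made for illustration.

```python
def otsu_threshold(pixels):
    """Return an Otsu threshold for 8-bit gray values (integers 0-255)."""
    # Build the gray-level histogram.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of gray values at or below the candidate threshold
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue          # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # no foreground pixels left
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor); Otsu maximizes it.
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image, threshold=None):
    """Map pixels above the threshold to 255 and all others to 0.

    `image` is assumed to be a list of rows of 8-bit gray values.
    """
    if threshold is None:
        threshold = otsu_threshold(p for row in image for p in row)
    return [[255 if p > threshold else 0 for p in row] for row in image]


if __name__ == "__main__":
    # Tiny synthetic "scan": dark ink (about 30-40) on bright paper (about 200-220).
    scan = [[30, 40, 200],
            [35, 210, 220]]
    for row in binarize(scan):
        print(row)
```

A global threshold like this tends to work only on evenly lit scans; uneven lighting or stains usually call for adaptive thresholding or a learned cleaner, which is the kind of case the repo above targets.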

Regards 
Ajinkya

Zdenko Podobny

Apr 9, 2025, 12:56:46 AM
to tesser...@googlegroups.com

On Tue, Apr 8, 2025 at 6:09, Ajinkya Bobade <ajinkya...@gmail.com> wrote:

Ajinkya Bobade

Apr 9, 2025, 1:15:22 AM
to tesseract-ocr
Thank you, I just saw from your link that it has been posted!
I'm so glad to hear this news.

Ajinkya

Kliai Louay

Jun 19, 2025, 11:37:03 AM
to tesseract-ocr

Hi everyone,

I’m an engineering student currently working on my first project, which focuses on handwriting recognition. Since it’s my first time tackling this type of project, I’m not sure where to begin.

I would really appreciate any advice or resources you could share to help me get started.

Thank you in advance, and have a great day!

Best regards,

Louay klai

jannes hoekman

Jun 19, 2025, 11:54:34 AM
to tesser...@googlegroups.com
You can use the BIQE archive.

On Thu, Jun 19, 2025 at 17:36, Kliai Louay <kliai...@gmail.com> wrote: