Open source (BSD) MICR dataset for Tesseract v4 + evaluation app


Mamadou

Sep 17, 2019, 1:37:59 AM
to tesseract-ocr
Hello,

We've open sourced (BSD 3-Clause License) our MICR dataset and *.traineddata for Tesseract v4.

This was developed as an internal R&D project and never went to production, as we ended up using TensorFlow instead.

Even as a PoC it's already more accurate than many commercial products. The repo contains a command line application to check the accuracy. This application can detect the MICR lines, de-skew, de-slant and binarize them before OCR'ing using Tesseract v4.
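
For readers unfamiliar with the binarization step mentioned above, here is a minimal sketch of one common approach, Otsu's global threshold. This is only an illustration under stated assumptions: the post does not say which binarization method the application uses, and the function names `otsu_threshold` and `binarize` are hypothetical, not taken from the repo. The image is assumed to be a list of rows of 8-bit grey values.

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold (0-255) for an iterable of 8-bit grey values."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of grey values at or below the candidate threshold
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue          # no dark pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # every pixel is already in the dark class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor); Otsu maximizes it.
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(image):
    """Map a grey image (list of rows) to 0/1, where 1 marks dark (ink) pixels."""
    t = otsu_threshold(p for row in image for p in row)
    return [[1 if p <= t else 0 for p in row] for row in image]


if __name__ == "__main__":
    # Hypothetical 2x3 grey patch: dark ink on a light background.
    sample = [[250, 30, 245],
              [20, 240, 25]]
    print(binarize(sample))
```

A global threshold like this tends to work when the MICR line has already been cropped and lighting is fairly even; unevenly lit scans usually call for an adaptive (local) method instead.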


Regards,

René Hansen

Sep 17, 2019, 1:55:49 AM
to tesser...@googlegroups.com
Very cool. Thank you for open sourcing this!


/René


--
Never fear, Linux is here.