How can I train TesseractOCRiOS to recognize handwriting?

SlushyPuffin

Aug 29, 2019, 1:24:10 PM
to tesseract-ocr
I'm making an application whose goal is to take a picture of my school notes and process them into plain text (so I can have neater notes). I already have some code: as of now, my application can take text from uploaded images with even lighting and turn it into plain text. I'm using Xcode and Swift, and I installed TesseractOCRiOS and GPUImage via CocoaPods.
Does anyone know how I can train TesseractOCRiOS to process handwriting images so I can complete my app? I would much appreciate any help, as I can't find a clear way to do this on the Tesseract website!

Timothy Snyder

Aug 29, 2019, 2:33:46 PM
to tesser...@googlegroups.com
You will have to train it with handwriting samples, such as those in the IAM handwriting database.

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/38254d7a-0f64-4dfc-9eb5-a1ba2ea24c2b%40googlegroups.com.

SlushyPuffin

Aug 29, 2019, 3:32:52 PM
to tesseract-ocr
OK, thanks! Can you give me a step-by-step on how to do that?

Timothy Snyder

Aug 29, 2019, 4:03:57 PM
to tesser...@googlegroups.com
I would first learn how to train Tesseract with regular fonts. Once you understand that process well, you can think about how you'd train Tesseract with samples from something like the IAM handwriting database. That process will involve transforming the IAM images and metadata files into the "boxtif" format (a .tif image paired with a .box file) that Tesseract uses for training. You'd be best off doing this work with Tesseract on Linux and then importing the trained recognition model into your iOS app.
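
One small piece of that transformation, the coordinate flip, can be sketched in Python. This is only an illustration under stated assumptions: it assumes the IAM metadata gives each bounding box as x, y, width, height measured from the top-left corner of the image (check this against the files you download), while Tesseract box files list `symbol left bottom right top page` with the origin at the bottom-left. The function names `to_tesseract_box` and `box_line` are made up for this example.

```python
def to_tesseract_box(x, y, w, h, image_height):
    """Convert a top-left-origin (x, y, w, h) box into Tesseract's
    bottom-left-origin (left, bottom, right, top) order."""
    left = x
    right = x + w
    top = image_height - y            # distance of the upper edge from the image bottom
    bottom = image_height - (y + h)   # distance of the lower edge from the image bottom
    return left, bottom, right, top


def box_line(symbol, box, page=0):
    """Format one box-file line: symbol left bottom right top page."""
    left, bottom, right, top = box
    return f"{symbol} {left} {bottom} {right} {top} {page}"


if __name__ == "__main__":
    # Hypothetical box: 30x40 pixels at (10, 20) in an image 100 pixels tall.
    box = to_tesseract_box(x=10, y=20, w=30, h=40, image_height=100)
    print(box_line("A", box))
```

Note that IAM does not ship character-level boxes, so this covers only the coordinate convention, not a complete box file.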

Baking Squad

Aug 29, 2019, 4:20:01 PM
to tesser...@googlegroups.com
OK, thanks! Have you done this before? If so, can I have an example?

Timothy Snyder

Aug 29, 2019, 4:22:51 PM
to tesser...@googlegroups.com
Example of what?

Baking Squad

Aug 29, 2019, 4:52:15 PM
to tesser...@googlegroups.com
An example of how I can train Tesseract to recognize handwriting, or a tutorial on how to accomplish it, or a pointer to what I need to download.

Baking Squad

Aug 29, 2019, 5:16:11 PM
to tesser...@googlegroups.com
If you could do that, I would really appreciate it. I don't want to spend time building up to the main task I wish to accomplish. If you can refer me to a solid tutorial or give me an in-depth explanation of how to train TesseractOCR to do what I want, I would much appreciate it.

Baking Squad

Aug 29, 2019, 5:23:22 PM
to tesser...@googlegroups.com
I mean, how can I take the IAM images and metadata files and transform them into boxtif? And what are metadata files?

Timothy Snyder

Aug 29, 2019, 5:38:23 PM
to tesser...@googlegroups.com
There are no tutorials for this, and such a tutorial would take days to write. There is no easy path to developing an effective handwriting recognition system, and it will require a lot of independent research on your part. You need to do your reading. Follow the tesstraining tutorial on the Wiki. Download the IAM handwriting database and explore the files that come with it. Unfortunately, there's no easy, one-shot solution for this.
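
As a starting point for exploring those files, here is a rough Python sketch, not a tested pipeline. It assumes the IAM download includes a `lines.txt` metadata file in which `#` starts a comment and each entry has the form `line-id status graylevel components x y w h word|word|...`; verify that against the header of your copy. It also swaps in a different ground-truth convention from the boxtif route: one `.gt.txt` transcription file per line image, as used by the tesstrain Makefile workflow. The helper names are invented for this example.

```python
import tempfile
from pathlib import Path


def parse_iam_lines(metadata_text):
    """Yield (line_id, transcription) for each usable entry of lines.txt."""
    for raw in metadata_text.splitlines():
        raw = raw.strip()
        if not raw or raw.startswith("#"):
            continue  # skip blank lines and the comment header
        fields = raw.split(" ", 8)  # the 9th field is the transcription
        if len(fields) < 9:
            continue  # malformed entry
        line_id, status = fields[0], fields[1]
        if status != "ok":
            continue  # conservative choice: drop entries flagged "err"
        yield line_id, fields[8].replace("|", " ")


def write_ground_truth(entries, out_dir):
    """Write one <line_id>.gt.txt file per entry and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for line_id, text in entries:
        path = out_dir / f"{line_id}.gt.txt"
        path.write_text(text + "\n", encoding="utf-8")
        written.append(path)
    return written


if __name__ == "__main__":
    # Two made-up entries in the assumed format, instead of the real file.
    sample = (
        "# comment header\n"
        "a01-000u-00 ok 154 19 408 746 1661 89 A|MOVE|to|stop\n"
        "a01-000u-01 err 156 19 395 932 1850 105 nominating|any\n"
    )
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_ground_truth(parse_iam_lines(sample), tmp):
            print(path.name, "->", path.read_text(encoding="utf-8").strip())
```

Each `.gt.txt` file would sit next to the line image with the same base name. Dropping the `err` entries is a cautious default, and you may decide to keep them after inspecting the data.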

Baking Squad

Aug 29, 2019, 5:51:52 PM
to tesser...@googlegroups.com
That's OK, I'm figuring it out more and more. Can you by any chance tell me how to convert the files so they can be used with TesseractOCR?
