Automatic Number Plate Recognition

Mark Beylis

Nov 20, 2014, 7:53:43 AM

to tesser...@googlegroups.com

Hello

I am using Tesseract OCR to perform number plate recognition on vehicles.

I use jTessBoxEditor v1.1 to check my box and tif files.

At the moment, each iteration of my training uses about 250 - 300 number plates.

I have read in many places that one should train fonts separately. This is difficult in my case because my source images contain number plates in varying fonts. To separate them into groups, I would have to look manually through each of the 100 initial images I use per training iteration. Would this really be necessary?

I have been training for over a month now, probably on over 1000 images and 3000 number plates, and I cannot seem to get accuracy above 86%.

I was wondering if you have any suggestions, as ideally I would like to see accuracy in excess of 90%.

What I have picked up is that the OCR struggles with certain problem characters : O vs 0, 5 vs S, 2 vs Z, B vs 8

Is there a specific way of training that I should use to improve correct reads of these characters? While editing the tif/box files in jTessBoxEditor, I am torn between two approaches: discarding the poorly read characters and keeping only the well-read ones, or correcting every character to what it should be regardless of its quality in the tif file. Which is the better approach, and why?

Any other suggestions on how to improve my training using jTessBoxEditor would be greatly appreciated.

Thanks

newbie

Jan 7, 2015, 2:45:44 PM

to tesser...@googlegroups.com

Mark,

Did you find a solution to the problem in the line below (extracted from your original message)? If so, please let me know. Thanks.

What I have picked up is that the OCR struggles with certain problem characters : O vs 0, 5 vs S, 2 vs Z, B vs 8

Allistair C

Jan 7, 2015, 4:47:10 PM

to tesser...@googlegroups.com

You've tried unicharambigs, right? (See the bottom of this page: https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3)

newbie

Jan 7, 2015, 5:17:47 PM

to tesser...@googlegroups.com

Thanks, Allistair, for your response. I have the final crunched eng.traineddata file, but I am not sure whether unicharambigs has been merged into it. How would I know?

Allistair

Jan 7, 2015, 5:44:33 PM

to tesser...@googlegroups.com

I believe this file needs to be supplied before the final combined trained data is compiled, so you should check whether jTessBoxEditor supports creating it.

https://tesseract-ocr.googlecode.com/svn/trunk/doc/unicharambigs.5.html
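
For anyone who wants to see what such a file could look like, here is a rough sketch in Python that builds unicharambigs text for the four pairs mentioned earlier in this thread. It follows the tab-separated layout described on the page linked above (a "v1" header, then source length, source characters, target length, target characters, and a type flag where 1 is a mandatory substitution and 0 an optional one). The function names and the choice to list each pair in both directions as optional are my own assumptions, and I have not verified that this actually improves plate accuracy.

```python
# Sketch only: builds the text of a unicharambigs file for the confusable
# pairs named in this thread. Helper names and the pair list are assumptions.

CONFUSABLE_PAIRS = [("O", "0"), ("5", "S"), ("2", "Z"), ("B", "8")]


def ambig_line(source, target, mandatory=False):
    """Return one tab-separated unicharambigs line.

    Fields: source length, source characters (space-separated),
    target length, target characters (space-separated),
    type flag (1 = mandatory substitution, 0 = optional).
    """
    return "\t".join([
        str(len(source)),
        " ".join(source),
        str(len(target)),
        " ".join(target),
        "1" if mandatory else "0",
    ])


def build_unicharambigs(pairs):
    """Return file contents: 'v1' header, then both directions of each pair."""
    lines = ["v1"]
    for first, second in pairs:
        lines.append(ambig_line(first, second))
        lines.append(ambig_line(second, first))
    # The file is expected to end with a newline character.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the contents; save them as UTF-8 under your own language prefix.
    print(build_unicharambigs(CONFUSABLE_PAIRS), end="")
```

The output would be saved next to the other training files before they are combined. To check whether an existing traineddata file already contains this component, unpacking it with combine_tessdata -u and looking for a .unicharambigs file among the extracted pieces should tell you.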
