Need to OCR LED Number Display


Brandon

Oct 13, 2009, 10:58:58 AM
to tesseract-ocr
I have a huge batch of images, each one has an LED number display. I
had seen some mention in earlier discussions about someone with a
similar issue; I exclusively need to OCR numbers, no other text.

An example of what sort of image I'm talking about can be viewed here:

http://picasaweb.google.com/btreen/DigitalTextExample?feat=directlink

I've just recently learned of Tesseract and have never trained it to
recognize a new font personally, so if anyone has tackled something
like this before it'd be a huge help if you could share your files.
If not, I would also be very thankful if anyone could give me some
guidance on the "quick and dirty" method of setting Tesseract up to
recognize this sort of font (and exclusively the 10 number characters
0 - 9.)

patrickq

Oct 14, 2009, 12:33:04 PM
to tesseract-ocr
Hi Brandon,

I took a look at your sample image and it is very dark. I am
processing similar images myself with Tesseract and getting poor
results with such dark images. One might hope that there is nevertheless
sufficient contrast in this image to get decent results, but apparently
not with the current version of Tesseract. Has anyone here been
successful in improving results for such images through some
pre-processing to improve contrast?

Patrick

Georg Oberth

Oct 14, 2009, 2:18:35 PM
to tesser...@googlegroups.com
I am working on something similar and had pretty good results by defining the area
to scan and setting the brightness to -40 and the contrast to 30.
Good luck, and keep me posted.
BTW, do you happen to know where I can get box files for digital numbers
like the example?

Thank you.
Regards,


Georg Oberth
ge...@godata.at
www.godata.at
0043-676-692-6070
0043-3127-88121

-----Original Message-----
From: tesser...@googlegroups.com [mailto:tesser...@googlegroups.com]
On behalf of patrickq
Sent: Wednesday, October 14, 2009 6:33 PM
To: tesseract-ocr
Subject: Re: Need to OCR LED Number Display
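
The pre-processing Georg describes (crop to the display area, then lower the brightness and raise the contrast) can be sketched in plain Python. The thread does not say which tool the values -40 and 30 belong to, so the mapping below is an assumption: contrast is treated as a percentage gain around mid-gray and brightness as an offset in gray levels. The function names are hypothetical, and the image is represented as rows of 8-bit grayscale values.

```python
def crop(pixels, left, top, right, bottom):
    """Keep only the display region (right and bottom are exclusive)."""
    return [row[left:right] for row in pixels[top:bottom]]


def adjust_brightness_contrast(pixels, brightness=-40, contrast=30):
    """Apply a linear brightness/contrast change to 8-bit grayscale rows.

    Assumed mapping (not taken from the thread): contrast is a percentage
    gain around mid-gray (128), brightness is an offset in gray levels.
    """
    gain = 1.0 + contrast / 100.0
    adjusted = []
    for row in pixels:
        new_row = []
        for p in row:
            value = (p - 128) * gain + 128 + brightness
            # Clamp to the valid 8-bit range after rounding.
            new_row.append(max(0, min(255, int(round(value)))))
        adjusted.append(new_row)
    return adjusted
```

With a real image, the same two steps would be applied to the pixel data from whatever imaging library is in use before handing the result to Tesseract; the right values depend on the images and need to be found by trial.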

Brandon

Oct 15, 2009, 11:10:18 AM
to tesseract-ocr
I've been attempting to make box files, but so far no luck. Tesseract
can grab the individual segments of the LED numbers, but when I
redefined the box area so that it would instead capture the "whole"
number, I got a huge number of errors when trying to use the box file.
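
One way to approach the segment problem described above is to merge the per-segment boxes programmatically instead of editing them by hand. The sketch below assumes the plain box file layout of one `character left bottom right top` entry per line (later Tesseract versions append a page number) and assumes that the gaps between digits are wider than the gaps between segments of the same digit. The function names and the `max_gap` value are hypothetical; the correct digit labels still have to be supplied by hand.

```python
def parse_box_line(line):
    """Split one box file line into (char, left, bottom, right, top)."""
    parts = line.split()
    left, bottom, right, top = (int(v) for v in parts[1:5])
    return parts[0], left, bottom, right, top


def merge_segment_boxes(boxes, max_gap=3):
    """Merge boxes that overlap or nearly touch horizontally.

    Boxes are swept left to right; a box joins the current group when its
    left edge is within max_gap pixels of the group's right edge.
    """
    merged = []
    for _, left, bottom, right, top in sorted(boxes, key=lambda b: b[1]):
        if merged and left <= merged[-1][2] + max_gap:
            last = merged[-1]
            merged[-1] = [min(last[0], left), min(last[1], bottom),
                          max(last[2], right), max(last[3], top)]
        else:
            merged.append([left, bottom, right, top])
    return [tuple(m) for m in merged]


def format_box_lines(merged, labels):
    """Pair each merged box with its hand-supplied digit label."""
    return [f"{label} {l} {b} {r} {t}"
            for label, (l, b, r, t) in zip(labels, merged)]
```

For example, five segment boxes belonging to two digits collapse into two whole-digit entries:

```python
segments = [parse_box_line(s) for s in [
    "~ 10 10 12 30", "~ 12 30 20 32", "~ 20 10 22 30",
    "~ 40 10 42 30", "~ 42 8 50 10",
]]
print(format_box_lines(merge_segment_boxes(segments), "08"))
```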

Georg Oberth

Oct 15, 2009, 12:07:09 PM
to tesser...@googlegroups.com
There is a difference between the box files (which are used to train
Tesseract on your font) and the box around the area on the image you want to OCR.
Are you having problems generating the box files? What does your
tesseract.log say? Where exactly are you getting the errors?
Let me know.
Regards,


Georg Oberth
ge...@godata.at
www.godata.at
0043-676-692-6070
0043-3127-88121

-----Original Message-----
From: tesser...@googlegroups.com [mailto:tesser...@googlegroups.com]
On behalf of Brandon
Sent: Thursday, October 15, 2009 5:10 PM
To: tesseract-ocr
Subject: Re: Need to OCR LED Number Display

Suraj Supekar

Oct 15, 2009, 11:35:38 PM
to tesser...@googlegroups.com
Hi all
 
I have written .NET code to train OCR using Tesseract 2.4 and 3.0.
It trains different fonts as well as data from images and creates the training files in one shot (ENGLISH ONLY). You can merge blobs as well (not manually), and a GUI is included. The GUI is more of a sample test app, though; I am not currently working on it, so there has been very little progress on closing the issues. I will post the source code soon.
In the meantime, if anybody wants the binaries, mail me at suraj....@gmail.com with the subject "Tesseract Training binaries required" and I will mail them or upload them to the forum.
 
SURAJ
--
SURAJ MURALIDHAR SUPEKAR
Director,
REDIVIVUS,
www.redivivus.in
Mobile : +91-9226941901

Laris Qiao

Dec 14, 2014, 9:39:32 AM
to tesser...@googlegroups.com
Hi Brandon and all,

Has anybody made further progress on OCR for LED number displays?

I'm interested in this feature.

Thanks,
Laris

On Tuesday, October 13, 2009 at 10:58:58 PM UTC+8, Brandon wrote:

Artur Augusto

Dec 14, 2014, 12:47:43 PM
to tesser...@googlegroups.com
Hi Laris,

Take a look at this project: https://github.com/arturaugusto/display_ocr

There you can download the traineddata, and there is a Python sample app.

Also, you can use a web app developed for this purpose that uses the same trained data: http://ocr.sytes.net

Artur
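
For reference, a downloaded traineddata file can be tried from Python by calling the `tesseract` command-line tool. This is only a sketch under several assumptions: the language name `letsgodigital` is assumed to match the traineddata file name placed in the tessdata directory, `--psm 7` (treat the image as a single text line) uses the option syntax of Tesseract 4 and later (3.x releases spell it `-psm`), and the helper names are hypothetical. The `tessedit_char_whitelist` variable restricts output to the ten digits asked about at the start of the thread, although some Tesseract versions and engine modes may ignore it.

```python
import subprocess


def build_tesseract_command(image_path, lang="letsgodigital", psm=7):
    """Build the command line; lang is an assumed traineddata name."""
    return [
        "tesseract", image_path, "stdout",
        "-l", lang,
        "--psm", str(psm),
        "-c", "tessedit_char_whitelist=0123456789",
    ]


def read_display(image_path, **kwargs):
    """Run Tesseract on one image and keep only the digits it reports."""
    result = subprocess.run(
        build_tesseract_command(image_path, **kwargs),
        capture_output=True, text=True, check=True,
    )
    return "".join(ch for ch in result.stdout if ch.isdigit())
```

Calling `read_display("display.png")` requires Tesseract to be installed and on the PATH, and the image should already be cropped to the display.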
