Training 7-segment display digits


Raj

Mar 5, 2009, 2:03:03 AM
to tesseract-ocr
Hi all,

I'm a newbie and want to use Tesseract OCR to read a 7-segment
display.

For this I'm using C# .NET 2005 with the open-source image processing
library OpenCV, through the C# wrapper Emgu CV.

I remove noise from the image before passing it to the Tesseract
OCR engine.

But I'm getting mixed results. For example, the digit '0' is detected
as 11, and the digit '6' as 5.

I read about training Tesseract. Is it possible for me to train it
for a 7-segment display?

If yes, please tell me how the training process works.

Thank you.


luigi rensinghoff

Mar 6, 2009, 2:56:53 AM
to tesser...@googlegroups.com
Hey Raj,

That is interesting. I was trying to do the same, but then let the project sleep for a while.

I would like to stay in touch with you to hear about your progress.
We can probably both benefit.

So what is your goal? Why do you want to recognize 7-segment digits?


Cheers Luigi




Raj

Mar 6, 2009, 5:54:14 AM
to tesseract-ocr
Hi Luigi,

Nice to get a response so quickly.

I'm pursuing an M.Tech, and as part of my M.Tech project I want
to design an OCR system that can recognise 7-segment displays, among
other things.

As I said, I have already done much of the work on this project.
I'm currently using Tesseract OCR in my project from C# .NET, and
I'm getting mixed results.

So my question is: do I need to train Tesseract to recognize
7-segment display digits?
If so, what is the process? (As I mentioned, I'm only interested in
7-segment displays.)

If you know the training process, please let me know.

Bye for now.
Waiting for your reply.





paulfeakins

Apr 1, 2009, 12:52:07 PM
to tesseract-ocr
Hi there,

Have you managed to get tesseract working with 7-segment digits yet?

I'd recommend downloading a "digital" font that looks right and
training it by following this:

http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract

Paul.

Raj

Apr 9, 2009, 2:14:04 AM
to tesseract-ocr
Hi Paul,

I have now got Tesseract to recognize 7-segment digits.

It works as required, producing 100% accurate results with the
command "tesseract.exe input.tif output.txt -l sun", where "sun" is
the language name.

But my intention is to use Tesseract in a C# application. For this I
found an example in C# (named OCR) in Emgu CV 1.5.0.1, the wrapper for
OpenCV. The example calls Tesseract internally, using the tessdata for
the English language and the Tessnet2 dll (a .NET 2.0 open-source OCR
assembly using the Tesseract engine), and it works perfectly. When I
give it an image of 7-segment digits, it reaches about 90% accuracy,
with some ambiguity.

So I created my own 8 tessdata files and tried to use them in the
example. But when I replace the tessdata in the example with my
tessdata files, the application terminates at the statement
"ocr.init()" without producing any results.

With the English tessdata, the application works perfectly, as I said.

I have no clue what causes this problem. When I use my tessdata and
run the command "tesseract.exe input.tif output.txt -l sun", it
produces results with 100% accuracy. But when I use the same tessdata
in the application, it does not produce any results.

I hope I have made the problem clear.

Please provide some feedback.

Thanks in advance,

Regards,
Sunder Raj.M

Raj

Apr 9, 2009, 2:16:32 AM
to tesseract-ocr
Hi luigi,

I have got Tesseract working to recognize 7-segment digits.

Francisco Loché Costa

Sep 12, 2012, 5:20:16 AM
to tesser...@googlegroups.com
Hi there,

I have finally managed to train Tesseract correctly. Does anyone know how to use the resulting traineddata with pytesser? I'm working in Python with pytesser, but I don't know how to modify pytesser so that it calls tesseract with the options needed to work with the new traineddata.
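
A minimal sketch of one way to do this from Python, assuming you bypass pytesser's wrapper and call the tesseract executable yourself (this is not pytesser's own code). It reuses the `-l` flag from the command Raj quoted earlier in this thread. The function names, file names and the "sun" language name are placeholders, and the traineddata is assumed to sit in the tessdata directory that tesseract searches.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang, exe="tesseract"):
    """Return the argument list for: tesseract <image> <output_base> -l <lang>."""
    return [exe, str(image_path), str(output_base), "-l", lang]


def ocr_with_language(image_path, output_base, lang, exe="tesseract"):
    """Run tesseract with a custom language and return the recognized text.

    Assumes the tesseract executable is on PATH (or pass its full path as
    `exe`) and that the traineddata for `lang` is in its tessdata directory.
    """
    command = build_tesseract_command(image_path, output_base, lang, exe)
    # check=True raises CalledProcessError if tesseract exits with an error.
    subprocess.run(command, check=True)
    # tesseract appends ".txt" to the output base name it is given.
    return Path(str(output_base) + ".txt").read_text()
```

With hypothetical file names, `ocr_with_language("input.tif", "output", "sun")` would run the same command as above and read the text back from `output.txt`.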

Thanks to all of you.

2012/9/10 Francisco Loché Costa <francis...@gmail.com>
Hello,

I am trying to use Tesseract to recognize a seven-segment display too, but I have problems training the OCR. I think my training image is not very good: I'm using a black image with the numbers in white, and there are many errors when I run tesseract for training. If I change the same image to a white image with the numbers in black, makebox doesn't work, because tesseract can't open the image.

Could someone share a training image that works fine?

Thanks to all of you.




--
  Francisco Loché Costa,
  Ingeniero Técnico de Telecomunicación, esp. Telemática.

David Peter Lisin Crespo

Jan 28, 2013, 2:59:50 AM
to tesser...@googlegroups.com
Good morning Seema, 

As for training Tesseract for seven-segment OCR, I also asked in the forum but did not get a reply. In the end I simply used OpenCV.

The steps were:

- Convert the image to black and white.
- Clean the image (erosion, dilation, etc.).
- Detect contours (works very well).
- Once I have the contours, I draw three lines on each one: a vertical line that cuts the contour in half, and two horizontal lines at 1/4 and 3/4 of the contour height.
- Mark the points where the segments cross these lines to get 7 points: the vertical line cuts three segments, and each horizontal line cuts two (in the upper and lower halves respectively).
- Check the pixel value at each point. If it is black, consider the segment active.
- From the resulting active segments, you get which number is displayed.
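
The scan-line steps above can be sketched in Python roughly as follows. This is an illustration under assumptions, not the code I actually used: it takes one digit already cropped to its contour's bounding box, given as rows of 0/1 values where 1 means a segment pixel (flip that if your segments are black). It looks along each line within a region instead of at one exact point, and it adds an aspect-ratio shortcut for "1", because a lone "1" fills its whole bounding box. The names `decode_digit` and `one_ratio` are made up for this example.

```python
# Segment order: a=top, b=upper right, c=lower right, d=bottom,
# e=lower left, f=upper left, g=middle.
SEGMENTS_TO_DIGIT = {
    (1, 1, 1, 1, 1, 1, 0): "0",
    (0, 1, 1, 0, 0, 0, 0): "1",
    (1, 1, 0, 1, 1, 0, 1): "2",
    (1, 1, 1, 1, 0, 0, 1): "3",
    (0, 1, 1, 0, 0, 1, 1): "4",
    (1, 0, 1, 1, 0, 1, 1): "5",
    (1, 0, 1, 1, 1, 1, 1): "6",
    (1, 1, 1, 0, 0, 0, 0): "7",
    (1, 1, 1, 1, 1, 1, 1): "8",
    (1, 1, 1, 1, 0, 1, 1): "9",
}


def decode_digit(crop, one_ratio=0.4):
    """Decode one cropped 7-segment digit; return None if no pattern matches.

    `crop` is a list of equal-length rows of 0/1 values (1 = segment pixel).
    `one_ratio` is an assumed threshold: narrower crops are treated as "1".
    """
    height, width = len(crop), len(crop[0])
    if width < one_ratio * height:
        return "1"

    mid_x = width // 2
    column = [row[mid_x] for row in crop]   # vertical line through the middle
    upper_row = crop[height // 4]           # horizontal line at 1/4 height
    lower_row = crop[(3 * height) // 4]     # horizontal line at 3/4 height

    a = any(column[: height // 3])                      # top
    g = any(column[height // 3 : (2 * height) // 3])    # middle
    d = any(column[(2 * height) // 3 :])                # bottom
    f = any(upper_row[:mid_x])                          # upper left
    b = any(upper_row[mid_x:])                          # upper right
    e = any(lower_row[:mid_x])                          # lower left
    c = any(lower_row[mid_x:])                          # lower right

    key = tuple(int(bool(s)) for s in (a, b, c, d, e, f, g))
    return SEGMENTS_TO_DIGIT.get(key)
```

Slanted digits, decimal points and very thin segments would need extra handling that this sketch leaves out.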

Hope this helps ;)

PS: On this page you can see a similar algorithm in use: http://www.unix-ag.uni-kl.de/~auerswal/ssocr/
Raj found out how to do it with Tesseract, but did not answer: https://groups.google.com/forum/?fromgroups=#!topic/tesseract-ocr/elnIngFJvQs
A good paper describing a process similar to what I had to do: http://morgoth.zemris.fer.hr/people/Marko.Cupic/files/2009-SP-MIPRO.pdf

Hope it works out for you ;)

Kind regards, 

           David Lisin


2013/1/27 Seema Shettar <seema....@gmail.com>


jamilir

Jun 6, 2014, 2:01:24 AM
to tesser...@googlegroups.com
Hi Raj,
Could you please share your "sun" trained language for 7-segment display recognition with Tesseract OCR?
Alternatively, could you let me know how you trained this language? I already know all the steps of training, but it does not work for me with 7-segment digital fonts: when I give Tesseract any image containing numbers in a digital font to create a box file, only an empty box file is created.

Thanks,

jamilir

Jun 6, 2014, 2:05:18 AM
to tesser...@googlegroups.com, dlis...@gmail.com
Hi David,

Can you help me out? How can I use Tesseract to OCR numbers in digital fonts (7-segment display)?
Did you figure out how to train Tesseract for a 7-segment display numeric language?
Thanks,
Jamil