How to download the Tesseract trained data for digital display numbers (seven-segment display trained data)

Pixxe

Jul 2, 2014, 7:35:41 AM

to tesser...@googlegroups.com, G Sakthi

Hi guys,

How can I download the Tesseract trained data for digital display numbers (seven-segment display trained data)?

On some forum, people said that the language option for OCR of 7-segment display digits is "SUN".

But I can't find a file named "sun". Please help us find this file.

If someone already has it, please share the link or the trained data file.

Thanks in advance...

Artur Augusto

Jul 3, 2014, 3:44:02 AM

to tesser...@googlegroups.com, G Sakthi

Hi Pixxe,

Since many people ask how to use Tesseract to read 7-segment displays, I decided to publish an open-source sample project.

If anyone wants to check it out: https://github.com/arturaugusto/display_ocr

Artur

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at http://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/38cc4cdc-61e0-4cad-b0e3-1dafb047bbf2%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Nick White

Jul 3, 2014, 5:28:26 PM

to tesser...@googlegroups.com, G Sakthi

Hi Artur,

On Wed, Jul 02, 2014 at 10:18:55PM -0300, Artur Augusto wrote:
> As many people ask about how to use tesseract to read 7 segments display, I
> decided to publish an open source sample project.
>
> If someone wanna check it: https://github.com/arturaugusto/display_ocr

Awesome, thanks so much for sharing! I was about to add it to the
3rdParty wiki page, but Zdenko beat me to it :)

Can you share the source files for your training somewhere too
(image and box files), so people can potentially improve on / add to
the training themselves?

Nick

Artur Augusto

Jul 3, 2014, 7:58:06 PM

to tesser...@googlegroups.com

Sure, I just need some time to put everything together in a more organized way and document it.

I needed to apply some erosion as a preprocessing step, because Tesseract has trouble recognizing segmented fonts.

My trained data only works on images that have been eroded.

I will do that as soon as I can.

Artur Augusto

Jul 3, 2014, 8:12:44 PM

to tesser...@googlegroups.com

And that's why I created a project that uses OpenCV, so the user can control the erosion in real time.
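
For readers new to the term: on a display with dark digits on a light background, erosion replaces each pixel with the darkest value in its neighbourhood, which thickens the segments until the gaps between them close. The following is a minimal, dependency-free Python sketch of that idea, not the code from the display_ocr project; the `erode` function and the sample pixel values are illustrative assumptions. OpenCV provides the same operation as `cv2.erode`.

```python
def erode(image, radius=1):
    """Grayscale erosion: each pixel becomes the minimum of its square neighbourhood.

    `image` is a list of rows of integer pixel values (0 = black, 255 = white).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = []
    for y in range(height):
        # Clamp the neighbourhood window to the image borders.
        y0, y1 = max(0, y - radius), min(height, y + radius + 1)
        row = []
        for x in range(width):
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            row.append(min(image[yy][xx]
                           for yy in range(y0, y1)
                           for xx in range(x0, x1)))
        result.append(row)
    return result


# Hypothetical strip of a digit: two dark segments (0) with a one-pixel
# light gap (255) between them, surrounded by light background.
row = [255, 255, 0, 0, 255, 0, 0, 255, 255]
image = [row[:] for _ in range(3)]

eroded = erode(image, radius=1)
print(eroded[1])  # [255, 0, 0, 0, 0, 0, 0, 0, 255] -> the gap is closed
```

A larger `radius` closes wider gaps but also makes neighbouring digits merge sooner, so the value has to be tuned per display.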

Artur Augusto

Jul 3, 2014, 8:13:12 PM

to tesser...@googlegroups.com

Hi Nick,

I've just pushed the training data to my project page!

https://github.com/arturaugusto/display_ocr/tree/master/training_source

If anyone comes up with improvements, as you suggested, I will accept pull requests.

Artur

sabrina soraya

May 28, 2015, 9:22:27 AM

to tesser...@googlegroups.com

Hi Artur, first of all, thank you for sharing the trained data files with us. However, the "7" digit on my display differs from the one in your trained data: mine has one more segment at the top left. So I was thinking of training it myself. But I ran into a problem when following the training instructions at https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3, especially in the "Run Tesseract for Training" part: it did not give me a *.tr file. Can you give me clear instructions on how you trained your data? Thank you in advance!

Sabrina.
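
For reference, the Tesseract 3.0x training sequence described on the TrainingTesseract3 wiki page can be sketched as a list of commands. This is a hedged outline rather than the procedure Artur used: the `training_commands` helper is hypothetical, the names `seg` and `digital-7` are taken from the file names attached later in this thread, and the exact flags should be checked against the installed Tesseract version. The *.tr file is written by the first command (the `box.train` step), so that is the step to inspect when no *.tr file appears.

```python
def training_commands(lang, font, exp=0):
    """Return the Tesseract 3.0x training commands as argument lists.

    Nothing is executed here; each list can be passed to subprocess.run().
    Assumes <lang>.<font>.exp<N>.tif and the matching .box file sit in the
    current directory, plus a font_properties file containing a line such as
    "<font> 0 0 0 0 0" (italic, bold, fixed, serif, fraktur flags).
    """
    base = f"{lang}.{font}.exp{exp}"
    return [
        # 1. Reads base.tif + base.box and writes base.tr.
        ["tesseract", f"{base}.tif", base, "box.train"],
        # 2. Collects the set of characters into a file named "unicharset".
        ["unicharset_extractor", f"{base}.box"],
        # 3. Shape training; writes inttemp, pffmtable, shapetable
        #    and <lang>.unicharset.
        ["mftraining", "-F", "font_properties", "-U", "unicharset",
         "-O", f"{lang}.unicharset", f"{base}.tr"],
        # 4. Character normalisation training; writes normproto.
        ["cntraining", f"{base}.tr"],
        # 5. After renaming inttemp, normproto, pffmtable and shapetable
        #    with the "<lang>." prefix, pack them into <lang>.traineddata.
        ["combine_tessdata", f"{lang}."],
    ]


for command in training_commands("seg", "digital-7"):
    print(" ".join(command))
```

The .tif and .box files must share the same base name, since the first command locates the box file from the image name.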

Artur Augusto

May 28, 2015, 11:37:06 AM

to tesser...@googlegroups.com

Hi Sabrina,

At this link I included a Python script that helped me train Tesseract; you provide the .box file and the .tif image that contains the samples for the font.

I can't remember the details, since I did this work a year ago and have never trained any other font.

Do you already have sample images of the font you want to train?

sabrina soraya

May 31, 2015, 5:54:52 AM

to tesser...@googlegroups.com

Hi Artur,

Thank you for replying to my email. Unfortunately, I am not familiar with Python. My project, for Android devices, is to recognize the digits on a device like the one in the attached image. Here are my sample image and the tif and box files for my training. Do you have any idea how to train this? Any suggestion would help me. Thank you :)

P_20150518_231018_LL.jpg

seg.digital-7.exp0.box

seg.digitaldismay.exp0.box

seg.digital-7.exp0.tif

seg.digitaldismay.exp0.tif
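
Before training, it can help to sanity-check a .box file such as the ones attached above. The sketch below assumes the Tesseract 3.0x box format (one line per character: symbol, left, bottom, right, top, page, with the origin at the bottom-left corner of the image). The helper names, the sample lines, and the image size are illustrative and are not taken from the attached files.

```python
from collections import namedtuple

Box = namedtuple("Box", "symbol left bottom right top page")


def parse_box_line(line):
    """Parse one 'symbol left bottom right top page' line of a box file."""
    parts = line.split()
    if len(parts) != 6:
        raise ValueError(f"expected 6 fields, found {len(parts)}")
    # int() raises ValueError if a coordinate is not a whole number.
    left, bottom, right, top, page = (int(value) for value in parts[1:])
    return Box(parts[0], left, bottom, right, top, page)


def check_boxes(text, image_width, image_height):
    """Return (line_number, message) pairs for suspicious box lines."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            box = parse_box_line(line)
        except ValueError as err:
            problems.append((number, str(err)))
            continue
        inside = (0 <= box.left < box.right <= image_width
                  and 0 <= box.bottom < box.top <= image_height)
        if not inside:
            problems.append((number, "box is inverted or outside the image"))
    return problems


# Made-up sample: the third line has bottom and top swapped.
sample = "7 10 20 40 80 0\n1 50 20 60 80 0\n3 90 80 120 20 0\n"
print(check_boxes(sample, image_width=200, image_height=100))
```

Because the y axis points upwards in box files, a box copied from an image editor (where y usually points downwards) shows up here as inverted.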

Ashish Bachhav

May 17, 2016, 11:21:06 AM

to tesseract-ocr

Hi Artur and Sabrina,

Thanks for your helpful posts.

Sabrina, did you find a solution for 7-segment digital display OCR? I am also looking for a 7-segment traineddata file. I have one file, but I have some accuracy issues with it, and I want better accuracy.

So if you have a traineddata file for 7-segment displays, please share it with me. I want to read a multimeter's reading from its display.

Or give me clear steps for training the data, from the basics to the end.

Thanks

komala...@gmail.com

Mar 24, 2017, 5:57:39 AM

to tesseract-ocr, g.sa...@tcs.com

Hello,

I work mainly in electronics and am new to C#. I am currently working on an image-processing project in C#, and in one part of it I have to detect the text or digits in an image of a 7-segment display. Searching on Google, I found Tesseract as a solution.

As an experiment, I first tried converting images of normal text into a text file. That works fine for some basic images, but it does not work with a 7-segment display, so I learned that I need a trained data file for 7-segment digits.

To train the 7-segment data, I followed the steps shown in this video: https://www.youtube.com/watch?v=i_1-hGsXxy8

But the output.txt file shown in that video is not generated in my case. As a result, when I use the trained 7-segment data file, I get garbage values in the text file. To check whether my trained file is correct, I followed the procedure shown in the video, but it gives an error saying that output.txt was not found. Is this happening because of the missing output.txt file, or is there something else I am missing? I have followed all the steps shown in the video for training the 7-segment data.

I have also installed jTessBoxEditorFX.jar, Serak Tesseract Trainer, and Tesseract-OCR v3.02. So I am stuck at the point where I don't know where I am going wrong: is my procedure wrong, or is the software not installed properly? After installing Tesseract, there is a red cross mark against Tesseract.

Could somebody please help me figure this out? If possible, please provide a 7-segment trained data file and the exact steps to train 7-segment data, as I have to train some more files for various display icons and some specific messages. It is very urgent: my project is stuck, and I came to the Tesseract solution only after trying many other image-processing approaches to 7-segment display detection in C#, such as pixel counting and image comparison.

If anything in my query is unclear, please let me know.

Please do the needful.

komal gawade

Mar 27, 2017, 4:20:20 AM

to tesseract-ocr, g.sa...@tcs.com

ShreeDevi Kumar

Mar 27, 2017, 4:45:03 AM

to tesser...@googlegroups.com, g.sa...@tcs.com

https://github.com/tesseract-ocr/tesseract/wiki/AddOns

has a link to traineddata for digital seven fonts:

https://github.com/arturaugusto/display_ocr

You can download various digital seven fonts, create training data images, and train, all in jTessBoxEditor. Use a 3.0x version.

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
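
Once a seven-segment traineddata file is in the tessdata directory, it is selected by name with the `-l` option. The sketch below only assembles the command line and tidies the recognized text; it does not run Tesseract. The `letsgodigital` name is assumed from the folder of that name in the display_ocr repository, `-psm 7` (treat the image as a single text line) is the Tesseract 3.0x spelling of that option, and both helper functions and the file names are hypothetical.

```python
def build_ocr_command(image_path, output_base, lang="letsgodigital", psm=7):
    """Build a Tesseract 3.0x command line as an argument list.

    The recognized text is written to <output_base>.txt. The list can be
    passed to subprocess.run() once Tesseract is installed.
    """
    return ["tesseract", image_path, output_base,
            "-l", lang, "-psm", str(psm)]


def clean_reading(text):
    """Keep only the characters a numeric display reading can contain."""
    allowed = set("0123456789.-")
    return "".join(ch for ch in text if ch in allowed)


print(" ".join(build_ocr_command("display.png", "reading")))
# tesseract display.png reading -l letsgodigital -psm 7
print(clean_reading(" 12.5 V\n"))
# 12.5
```

Filtering the output this way hides stray characters but cannot repair misread digits, so it complements the preprocessing and training steps rather than replacing them.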

komal gawade

Mar 27, 2017, 5:49:52 AM

to tesseract-ocr, g.sa...@tcs.com

Sorry, friend, but I am not getting it.

I used the trained data from this link, https://github.com/arturaugusto/display_ocr/tree/master/letsgodigital, but I am still getting garbage values in the text file.

Also, I don't know how to get TOF and TTF files.

Tesseract v3.02 is now installed on my PC. I am using jTessBoxEditorFX-2.0-Beta and Serak Tesseract Trainer v0.3 for training the 7-segment data.

I have attached some images whose text I want to extract to a text file.

Below is my program:

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
using AForge;
using AForge.Video;
using AForge.Video.DirectShow;
using AForge.Imaging.Filters;
using tessnet2;
using System.IO;

/**************************** By using garbage in text file and comparing it (final prog) ****************************/

namespace Display_Detection
{
    public partial class FormDisplayDetection : Form
    {
        public FormDisplayDetection()
        {
            InitializeComponent();
        }

        private FilterInfoCollection CaptureDevice;
        private VideoCaptureDevice CaptureImage;

        private void FormDisplayDetection_Load(object sender, EventArgs e)
        {
            // enumerate video devices
            CaptureDevice = new FilterInfoCollection(FilterCategory.VideoInputDevice);
            foreach (FilterInfo VideoCaptureDevice in CaptureDevice)
            {
                comboBox1.Items.Add(VideoCaptureDevice.Name);
            }
            comboBox1.SelectedIndex = 0;
        }

        // Start capturing from the selected camera.
        private void button1_Click(object sender, EventArgs e)
        {
            // create video source
            // CaptureImage = new VideoCaptureDevice(CaptureDevice[0].MonikerString);
            CaptureImage = new VideoCaptureDevice(CaptureDevice[comboBox1.SelectedIndex].MonikerString);
            // set NewFrame event handler
            CaptureImage.NewFrame += new NewFrameEventHandler(video_NewFrame);
            // start the video source
            CaptureImage.Start();
        }

        private void video_NewFrame(object sender, NewFrameEventArgs eventArgs)
        {
            // get new frame
            pictureBox1.Image = (Bitmap)eventArgs.Frame.Clone();
            // process the frame
        }

        // Take a 320x240 snapshot of the live frame.
        private void button2_Click(object sender, EventArgs e)
        {
            Bitmap varBmp = new Bitmap(pictureBox1.Image);
            varBmp = ResizeBitmap(varBmp, 320, 240);
            pictureBox2.Image = (Bitmap)varBmp.Clone();
        }

        private static Bitmap ResizeBitmap(Bitmap sourceBMP, int width, int height)
        {
            Bitmap result = new Bitmap(width, height);
            using (Graphics g = Graphics.FromImage(result))
                g.DrawImage(sourceBMP, 0, 0, width, height);
            return result;
        }

        private void button4_Click(object sender, EventArgs e)
        {
            // signal to stop
            CaptureImage.Stop();
            // ...
        }

        // Save the snapshot to disk with a timestamped file name.
        private void button3_Click(object sender, EventArgs e)
        {
            if (pictureBox2.Image != null)
            {
                //Save First
                Bitmap varBmp = new Bitmap(pictureBox2.Image);
                // Bitmap newBitmap = new Bitmap(varBmp);
                //string Image = "ImageName_" + DateTime.Now + ".jpg";
                // string Image = "ImageCaptured_" + DateTime.Now.ToString("ddMMyyyy HHmmss") + ".jpg";
                //varBmp = ResizeBitmap(varBmp, 320, 240);
                // varBmp = ResizeBitmap(varBmp, 640, 480);
                string Image = "ImageCaptured_" + DateTime.Now.ToString("d-M-yyyy hh.mm.ss tt") + ".bmp";
                varBmp.Save(@"D:\Komal\Automation Project\Programs\task1\Display Text Detection\Save_Images\" + Image);
                // varBmp.Save(@"D:\Komal\Automation Project\Programs\task1\Capture Image\Save_Captured Image\filename.jpg", ImageFormat.Jpeg);
                //Now Dispose to free the memory
                varBmp.Dispose();
                varBmp = null;
            }
            else
            { MessageBox.Show("null exception"); }
        }

        private void button5_Click(object sender, EventArgs e)
        {
            CaptureImage.Stop();
            Application.Exit();
        }

        // Run OCR on the snapshot and write the recognized words to image1.txt.
        private void button6_Click(object sender, EventArgs e)
        {
            var image = new Bitmap(pictureBox2.Image);
            // empty the output file before appending the new result
            File.WriteAllText(@"D:\Komal\Automation Project\Programs\task1\Display Detection\image1.txt", String.Empty);
            // now add the following C# line in the code page
            var ocr = new Tesseract();
            // "eng" selects the English language data in the tessdata folder
            ocr.Init(@"D:\Komal\Automation Project\Programs\task1\Display Text Detection\packages\tessdata", "eng", false);
            var result = ocr.DoOCR(image, Rectangle.Empty);
            foreach (tessnet2.Word word in result)
            {
                File.AppendAllText(@"D:\Komal\Automation Project\Programs\task1\Display Detection\image1.txt", word.Text);
            }
            /* byte[] file1 = File.ReadAllBytes(@"D:\Komal\Automation Project\Programs\task1\Display Detection\image1.txt");
            byte[] file2 = File.ReadAllBytes(@"D:\Komal\Automation Project\Programs\task1\Display Detection\sample.txt");
            if (file1.Length == file2.Length)
            {
                MessageBox.Show("Both files are same!!!!!!!!!!!!");
            }
            else
            { MessageBox.Show("Both files are not same!!!!!!!!!!!!"); }*/
        }

        // Load an image file into the live-view picture box.
        private void button7_Click(object sender, EventArgs e)
        {
            DialogResult result = openFileDialog1.ShowDialog();
            if (result == DialogResult.OK)
            {
                Image image = Image.FromFile(openFileDialog1.FileName);
                pictureBox1.Image = image;
            }
        }

        // Load an image file into the snapshot picture box.
        private void button8_Click(object sender, EventArgs e)
        {
            DialogResult result = openFileDialog1.ShowDialog();
            if (result == DialogResult.OK)
            {
                Image image = Image.FromFile(openFileDialog1.FileName);
                pictureBox2.Image = image;
            }
        }
    }
}

Picture1.png

Picture2.png

Picture3.png

Picture4.png
