Training for plotter file

Dennis

Oct 18, 2014, 5:51:33 PM
to tesser...@googlegroups.com
Hello,

I am trying to recognize the characters in a plot file (attached). The characters are drawn as individual line strokes rather than rendered from a font.

I've tried training, but I was unsuccessful (I probably did something wrong).

Can anyone help?

Thank you,
Dennis
plotFileImage (1).gif

ShreeDevi Kumar

Oct 19, 2014, 10:02:05 PM
to tesser...@googlegroups.com
Which version of tesseract are you using?

Try changing the image to 300 or 600 dpi, applying a blur/soften filter, decreasing the brightness, and converting to greyscale.

I tried it with the VietOCR GUI. The zero with the line across it gets recognized as @; the rest comes out OK.

If you will not have @ in your plots, you could just substitute @ by zero in post-processing.
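
The substitution could be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and it assumes the plots never contain a genuine @.

```python
def replace_at_with_zero(ocr_text: str) -> str:
    """Replace every '@' in the OCR output with '0'.

    Assumption: a real '@' never occurs in the plots, so each '@'
    is a misread slashed zero.
    """
    return ocr_text.replace("@", "0")
```

For example, `replace_at_with_zero("1@.5@")` gives `"10.50"`.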

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

Dennis

Mar 22, 2015, 12:24:54 PM
to tesser...@googlegroups.com
I'm using the latest version of Tesseract: 3.02.

I successfully OCRed the image with the VietOCR GUI. If I set it to screenshot mode, apply a smooth filter, and use the textbox to select each line one by one, I get a 100% correct OCR.

Now I am wondering how I can automate this process. I want to write a program or run a command that takes the image, performs the steps above automatically, and outputs the recognized text along with its location in the image file.

Thank you,
Dennis Gahm

ShreeDevi Kumar

Mar 22, 2015, 2:20:18 PM
to tesser...@googlegroups.com
VietOCR has bulk OCR and batch options.

ShreeDevi

Dennis

Mar 22, 2015, 4:55:50 PM
to tesser...@googlegroups.com
I just tried the bulk option, and I see that it also outputs the location of the text it OCRed, which is what I wanted. However, it has no option to apply a smooth filter or to draw a textbox around the text I want to OCR. Is there no way to automate these things?

Also, how do I run VietOCR from the command line? Preferably through .NET rather than Java.

Thank you for the help,
Dennis

Quan Nguyen

Mar 28, 2015, 11:39:22 AM
to tesser...@googlegroups.com
The basic image processing is only available for individual images loaded in the UI. The bulk or batch OCR does not have this support. Therefore, it's suggested that you perform the bulk image processing outside of VietOCR, using ImageMagick, GIMP, etc.
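
As a rough sketch of such external preprocessing, the Python snippet below builds an ImageMagick command that applies the steps suggested earlier in the thread (greyscale, upscaling, blur, lower brightness) and runs it over a folder. The function names, the 300% scale, the blur sigma, and the brightness value are all assumptions to be tuned for your images. It also assumes ImageMagick is installed and on the PATH as `convert` (ImageMagick 7 uses `magick` instead).

```python
import subprocess
from pathlib import Path


def build_convert_command(src, dst, scale_percent=300, blur_sigma=1.0, brightness=-10):
    """Return an ImageMagick argument list: greyscale, upscale, blur, darken.

    All numeric defaults are illustrative guesses, not recommended values.
    """
    return [
        "convert", str(src),
        "-colorspace", "Gray",                      # convert to greyscale
        "-resize", f"{scale_percent}%",             # enlarge the thin strokes
        "-blur", f"0x{blur_sigma}",                 # soften the line edges
        "-brightness-contrast", f"{brightness}x0",  # lower the brightness
        "-density", "300", "-units", "PixelsPerInch",  # tag output as 300 dpi
        str(dst),
    ]


def preprocess_folder(in_dir, out_dir, pattern="*.gif"):
    """Run the command above on every matching file; return the output paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    results = []
    for src in sorted(Path(in_dir).glob(pattern)):
        dst = out_dir / (src.stem + ".tif")
        subprocess.run(build_convert_command(src, dst), check=True)
        results.append(dst)
    return results
```

The preprocessed TIFF files can then be handed to the bulk or batch OCR step.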

Both VietOCR Java & .NET support command-line execution. The command syntax is similar to that of Tesseract:

vietocr vietsample.tif out -l vie
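
To call this from a script, a minimal Python sketch could wrap the same arguments with `subprocess`. The function names are hypothetical, and the executable name `vietocr` is taken from the command above; how VietOCR is actually launched on your machine depends on the installation, so adjust `executable` accordingly.

```python
import subprocess


def build_ocr_command(image, output_base, lang="vie", executable="vietocr"):
    """Build the argument list: <executable> <image> <output_base> -l <lang>."""
    return [executable, str(image), str(output_base), "-l", lang]


def run_ocr(image, output_base, lang="vie", executable="vietocr"):
    """Run the OCR command and return its exit code."""
    completed = subprocess.run(build_ocr_command(image, output_base, lang, executable))
    return completed.returncode
```

Calling `build_ocr_command("vietsample.tif", "out")` reproduces the command line shown above as a list of arguments.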