Tesseract 3.02 Orientation Script Detection

Joe Aspara

May 11, 2014, 6:48:39 AM

to tesser...@googlegroups.com

I'm struggling with the OSD function of Tesseract 3.02.

I tried both the standalone command-line version and Tess4J, but I always get an error, regardless of the input type.

I downloaded the osd.traineddata for version 3.01 (I guess no such file exists yet for v3.02) from https://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-ocr-3.01.osd.tar.gz&can=2&q=

and copied it into the TESSDATA folder.

Below are my experiments:

COMMAND LINE

tesseract input_image output_text -l eng -psm 0

response: Error during processing.

With psm = 1 it reads the text with very poor quality; with psm = 2 or 3 it gives me empty output.

As far as I know, only the values 0 and 1 perform OSD. From the reference:

0 = Orientation and script detection (OSD) only.

1 = Automatic page segmentation with OSD.

TESS4J

Tesseract instance = Tesseract.getInstance();
instance.setLanguage("ita");
instance.setPageSegMode(TessPageSegMode.PSM_AUTO_OSD);
String result = instance.doOCR(myImage);

The result is always empty.

Knowing the input orientation is critical for my project, but so far I have not been able to find a way to accomplish this.

I hope somebody can help me! Thanks in advance

Quan Nguyen

May 11, 2014, 7:53:45 AM

to tesser...@googlegroups.com

With psm 0, Tesseract does not perform normal OCR; it only analyzes the layout and reports characteristics such as Orientation, Writing Direction, and Textline Order. Check the Tess4J unit tests for usage of OSD.

zdenop

May 11, 2014, 10:59:57 AM

to tesser...@googlegroups.com

Version 3.02 does not produce output for psm 0 (and 2?). This was changed in version 3.03, where tesseract produces output like this:

$ tesseract test.tif - -psm 0
Orientation: 0
Orientation in degrees: 0
Orientation confidence: 6.04
Script: 1
Script confidence: 4.17

See the current code [1] for how this information can be retrieved in C++.

[1] https://code.google.com/p/tesseract-ocr/source/browse/trunk/api/tesseractmain.cpp?r=1093#274
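If the 3.03 output is captured as text, the fields can be picked apart with a few lines of Java. This is only a minimal sketch: it assumes the "key: value" layout shown in the sample above stays the same, and the class and method names (OsdOutput, parse, orientationDegrees) are hypothetical helpers, not part of Tesseract or Tess4J.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Tesseract or Tess4J.
final class OsdOutput {

    // Splits every "key: value" line into a map entry; lines without a colon are skipped.
    static Map<String, String> parse(String text) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue;
            }
            String key = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            fields.put(key, value);
        }
        return fields;
    }

    // Assumes the "Orientation in degrees" key from the sample output is present.
    static int orientationDegrees(Map<String, String> fields) {
        return Integer.parseInt(fields.get("Orientation in degrees"));
    }

    public static void main(String[] args) {
        // Sample text copied from the 3.03 output quoted in this thread.
        String sample = "Orientation: 0\n"
                + "Orientation in degrees: 0\n"
                + "Orientation confidence: 6.04\n"
                + "Script: 1\n"
                + "Script confidence: 4.17\n";
        Map<String, String> fields = parse(sample);
        System.out.println("degrees = " + orientationDegrees(fields));
        System.out.println("script confidence = " + fields.get("Script confidence"));
    }
}
```

Keeping the values as strings in the map avoids guessing at types for fields such as "Script", whose numeric meaning is not explained in the output itself.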

Joe Aspara

May 14, 2014, 2:17:46 PM

to tesser...@googlegroups.com

@Quan

OK, I understand that with psm 0 OCR isn't performed, but I expected that running "tesseract input_image output_text -l eng -psm 0" would write the analysis result to the output_text file. With Tesseract 3.02 that isn't the case :(

@zdenop

So Tesseract v. 3.02 doesn't support this feature... I'll try version 3.03! Many thanks!
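For anyone driving a 3.03 command-line build from Java, the call could look roughly like the sketch below. The arguments are the ones from zdenop's example ("- -psm 0"); everything else is an assumption: the class name OsdCommand is hypothetical, the executable is assumed to be reachable as "tesseract", and stderr is merged into stdout because I am not sure which stream the OSD lines are written to.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Tesseract or Tess4J.
final class OsdCommand {

    // Builds the argument list from the thread's example: tesseract <image> - -psm 0
    static List<String> build(String tesseractExe, String imagePath) {
        return Arrays.asList(tesseractExe, imagePath, "-", "-psm", "0");
    }

    // Runs the command and returns everything it printed (stdout and stderr merged).
    static String run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // assumption: the OSD lines may go to stderr
        Process process = builder.start();
        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        process.waitFor();
        return output.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // Without an image argument, only show the command that would be run.
            System.out.println(build("tesseract", "test.tif"));
            return;
        }
        System.out.print(run(build("tesseract", args[0])));
    }
}
```

With 3.02 this would presumably still end in "Error during processing." as described above, so it only makes sense after upgrading.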

Quan Nguyen

May 14, 2014, 9:21:23 PM

to tesser...@googlegroups.com

On Wednesday, May 14, 2014 1:17:46 PM UTC-5, Joe Aspara wrote:

@Quan
OK, I understand that with psm 0 OCR isn't performed, but I expected that running "tesseract input_image output_text -l eng -psm 0" would write the analysis result to the output_text file. With Tesseract 3.02 that isn't the case :(

I was referring to Tess4J. You can get the information you need through the API.
