setting pagesegmode for multi-column text OCR


alexiuk

Oct 3, 2013, 10:23:20 AM
to tesser...@googlegroups.com
Hi, I have an image with multiple columns that I'd like to OCR. I'm using Tess4J v1.1 and have written a small convenience wrapper around it.

I expected to be able to change the page segmentation mode (pagesegmode), as in the following failing JUnit test. I'd sure appreciate some advice.

Thanks


@Test
public void test_setParams() {
  // my convenience wrapper
  OCR_Tesseract ocr = new OCR_Tesseract();

  int current_page_seg_mode = Tesseract1.TessBaseAPIGetPageSegMode(ocr.handle);
  assertEquals(TessPageSegMode.PSM_AUTO_OSD, current_page_seg_mode);
}

// excerpt from the wrapper's constructor

this.handle = TessAPI1.TessBaseAPICreate();
TessAPI1.TessBaseAPIInit3(this.handle, this.datapath, this.language);

this.instance = new Tesseract1();
this.instance.setPageSegMode( TessPageSegMode.PSM_AUTO_OSD );
...

Quan Nguyen

Oct 3, 2013, 12:08:28 PM
to tesser...@googlegroups.com
SetPageSegMode should be called after Init.

Take a look at the testOSD test case in http://sourceforge.net/p/tess4j/code/HEAD/tree/Tess4J_3/trunk/test/net/sourceforge/tess4j/
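
For illustration, here is a minimal sketch of that ordering using the low-level TessAPI1 calls already shown in the question. It assumes Tess4J and the native Tesseract library are on the path; the datapath "./" and language "eng" are placeholders, not values from the original post, and the class name is made up for the example. The mode is set and read back on the same handle, after Init.

```java
import net.sourceforge.tess4j.TessAPI1;

public class PageSegModeOrder {
    public static void main(String[] args) {
        // Assumptions: tessdata is reachable from "./" and "eng" is installed.
        String datapath = "./";
        String language = "eng";

        TessAPI1.TessBaseAPI handle = TessAPI1.TessBaseAPICreate();
        try {
            // 1. Initialize the engine first; Init3 returns 0 on success.
            int rc = TessAPI1.TessBaseAPIInit3(handle, datapath, language);
            if (rc != 0) {
                throw new IllegalStateException("Tesseract init failed, rc=" + rc);
            }

            // 2. Only now set the page segmentation mode, on the same handle.
            TessAPI1.TessBaseAPISetPageSegMode(
                    handle, TessAPI1.TessPageSegMode.PSM_AUTO_OSD);

            // 3. Read it back from that same handle.
            int mode = TessAPI1.TessBaseAPIGetPageSegMode(handle);
            System.out.println("page seg mode = " + mode);
        } finally {
            // Release the native engine.
            TessAPI1.TessBaseAPIDelete(handle);
        }
    }
}
```

One thing that may also matter in the posted constructor: the mode is set on a separate Tesseract1 instance, while the test reads it from this.handle. If Tesseract1 manages its own engine handle internally (which appears to be the case, but treat that as an assumption), the value set on the instance would not be visible through this.handle.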