Hello everyone,
Tesseract yields decent results from the command line. Through the API, however, the results are not as good.
I've written a few C wrapper functions in order to use Tesseract from Harbour.
This code works, but it yields different and less accurate text than the command-line output:
handle := TessBaseAPICreate() // use Tesseract to OCR the image
IF TessBaseAPIInit3( handle, NIL, "eng" ) != 0 ; LOOP ; ENDIF // abort if the English traineddata file can't be found locally
// The line below is commented out to keep the program from freezing when calling TessBaseAPIGetUTF8Text()
// TessBaseAPISetPageSegMode( handle, 3 ) // enabling this line causes the freeze in GetUTF8Text() below
img := pixRead( ALLTRIM( cPath ) + cFile )
TessBaseAPISetImage2( handle, img )
IF TessBaseAPIRecognize( handle, NIL ) != 0 ; LOOP ; ENDIF // abort if Recognize fails
cText := TessBaseAPIGetUTF8Text( handle ) // the program freezes here unless SetPageSegMode above stays commented out
I'm guessing the output differs because the page segmentation mode (PSM) defaults to a different value on the command line than in the API. However, when I uncomment the line "TessBaseAPISetPageSegMode( handle, 3 )" to make sure the PSM matches the command-line default, the program freezes when executing TessBaseAPIGetUTF8Text( handle ).
Can someone please help me understand what I might be doing wrong?
Thank you,
Reinaldo.