Build from source fails to recognize Arabic

Essam Zaky

Apr 6, 2017, 2:17:44 PM

to tesseract-ocr

Hi dears

I built Tesseract and the training tools from source on Windows with VS2015.

Recognizing an English page succeeds,

but recognizing an Arabic page fails:

C:\Users\emz\tesseract\build\bin\Debug>tesseract eurotext.tif eurotext -l eng

Tesseract Open Source OCR Engine v4.00.00dev with Leptonica

Page 1

C:\Users\emz\tesseract\build\bin\Debug>tesseract sample1.tif sample1 -l ara

Error: LSTM requested, but not present!! Loading tesseract.

tessdata_manager.SeekToStart(TESSDATA_INTTEMP):Error:Assert failed:in file C:\Users\emz\tesseract\classify\adaptmatch.cpp, line 537

What could be the cause of this error?

Best regards

Essam

universal reseller

Apr 6, 2017, 2:22:04 PM

to tesser...@googlegroups.com

Please send the output of
tesseract --list-langs

run from the command line.

Essam Zaky

Apr 6, 2017, 3:01:38 PM

to tesseract-ocr

Hi @.peiman

Thanks for the reply.

I found the problem.

I had previously installed an old v4 build from the DanBolomBerg site,

and TESSDATA_PREFIX was still pointing to that old version's data, which uses cube.

Now that I have updated TESSDATA_PREFIX in the system environment to point to the newly downloaded data, it's working.
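For anyone hitting the same error, here is a minimal sketch of how the data path could be checked before running tesseract. It only looks for the language file under TESSDATA_PREFIX; the helper name find_traineddata is made up for illustration, and the assumption that the variable may name either the tessdata directory or its parent (depending on the Tesseract version) should be verified against your build. It does not check whether the file actually contains an LSTM model.

```python
import os
from pathlib import Path
from typing import Optional


def find_traineddata(lang: str, prefix: Optional[str] = None) -> Optional[Path]:
    """Return the path of <lang>.traineddata under TESSDATA_PREFIX, or None.

    find_traineddata is a hypothetical helper, not part of Tesseract.
    """
    if prefix is None:
        prefix = os.environ.get("TESSDATA_PREFIX")
    if not prefix:
        return None
    base = Path(prefix)
    # Assumption: depending on the Tesseract version, TESSDATA_PREFIX names
    # either the tessdata directory itself or its parent, so try both.
    for candidate_dir in (base, base / "tessdata"):
        candidate = candidate_dir / f"{lang}.traineddata"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    print("TESSDATA_PREFIX =", os.environ.get("TESSDATA_PREFIX"))
    for lang in ("eng", "ara"):
        print(lang, "->", find_traineddata(lang))
```

If the path printed for ara points into the directory of an older installation, that is a sign the environment variable still refers to the old data.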

Thanks again

universal reseller

Apr 6, 2017, 3:25:53 PM

to tesser...@googlegroups.com

How accurate are the results for you?

Essam Zaky

Apr 8, 2017, 11:00:32 AM

to tesseract-ocr

For the sample images I used,

the accuracy for English is good,

but for Arabic, cube is still better than the current LSTM.
