Hi guys!
I am doing video subtitle recognition for one of my C++ projects and cannot figure out why, for the same image, Tesseract gives good results when I run it from the command line but fails when I call it through the API. I see a number of different parameters when running
tesseract --print-parameters
and don't know how to find out which of them affect the results.
Could anyone help me, please?
-- From command line ----------
tesseract ./subtitles/sub_ron_1.png stdout -l ron --dpi 600
-----------------------------------------
Turul virtual făcut de Kira şi Matt
a fost foarte amuzant.
-----------------------------------------
-- From API -------------------------
char *text;
std::string lang = "rum";
ocr->Init(NULL, lang.c_str());
ocr->SetImage(avframe->data[0], avframe->width, avframe->height, 4, avframe->linesize[0]);
text = ocr->GetUTF8Text();
-----------------------------------------
€ II a] e E ăn si 2 W a p:] VA
Turul'virtual făcut de Kira şi Matt
nat fn arte - SE,
a fost foarte amuzant.
-----------------------------------------
-- Version info ---------------------------
tesseract 4.1.1
leptonica-1.79.0
libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 2.0.3) : libpng 1.6.37 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.1
Found SSE
Found libarchive 3.4.0 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.8 liblz4/1.9.2 libzstd/1.4.4
---------------------------------------------
Image:
