Calling from API the same way as from command line

Yesbird

Dec 23, 2021, 3:28:18 PM
to tesseract-ocr
Hi, guys!

I am doing video subtitle recognition for one of my C++ projects and cannot figure out why, for the same image, tesseract gives good results when I run it from the command line but fails when I call it through the API. I see several different parameters when running

tesseract --print-parameters

and don't know how to find out which of them affect the results.

Could anyone help me, please?

-- From command line ----------
tesseract ./subtitles/sub_ron_1.png stdout -l ron --dpi 600
-----------------------------------------
Turul virtual făcut de Kira şi Matt
a fost foarte amuzant.
-----------------------------------------

-- From API -------------------------
char *text;
std::string lang = "rum";
ocr->Init(NULL, lang.c_str());
ocr->SetImage(avframe->data[0], avframe->width, avframe->height, 4, avframe->linesize[0]);
text = ocr->GetUTF8Text();
-----------------------------------------
€ II a] e E ăn si 2 W a p:] VA
Turul'virtual făcut de Kira şi Matt
nat fn arte - SE,
a fost foarte amuzant.
-----------------------------------------

-- Version info ---------------------------
tesseract 4.1.1
leptonica-1.79.0
libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 2.0.3) : libpng 1.6.37 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.1
Found SSE
Found libarchive 3.4.0 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.8 liblz4/1.9.2 libzstd/1.4.4
---------------------------------------------  

Image:
sub_ron_1.png

Yesbird

Dec 23, 2021, 3:37:27 PM
to tesseract-ocr
And sorry, the language initialization in the API call is the same as on the command line:
std::string lang = "ron";

Yesbird

Dec 23, 2021, 7:09:09 PM
to tesseract-ocr
Problem solved by additional preprocessing: blending with a white background.
This form of SetImage():
SetImage(avframe->data[0], avframe->width, avframe->height, 4, avframe->linesize[0]);
does not remove the alpha channel, so I need to do it myself.
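
For anyone hitting the same issue, here is a minimal sketch of what such a blending step could look like. It is not the exact code from my project: the helper name BlendOnWhite is made up for illustration, and it assumes the frame bytes are ordered R, G, B, A with straight (non-premultiplied) alpha. If your frame is BGRA or premultiplied, the indexing or the formula would need adjusting.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: blend a straight-alpha RGBA buffer onto a white
// background and return tightly packed RGB (3 bytes per pixel, no padding).
// Assumption: channel order is R,G,B,A; `stride` is the byte length of one
// input row (e.g. avframe->linesize[0]) and may include padding.
std::vector<std::uint8_t> BlendOnWhite(const std::uint8_t* rgba, int width,
                                       int height, int stride) {
  assert(rgba != nullptr && width > 0 && height > 0 && stride >= width * 4);
  std::vector<std::uint8_t> rgb(static_cast<std::size_t>(width) * height * 3);
  std::size_t out = 0;
  for (int y = 0; y < height; ++y) {
    const std::uint8_t* row = rgba + static_cast<std::size_t>(y) * stride;
    for (int x = 0; x < width; ++x) {
      const unsigned alpha = row[x * 4 + 3];
      for (int c = 0; c < 3; ++c) {
        // result = src * a + white * (1 - a), with a scaled to [0, 255];
        // adding 127 before the division rounds to nearest.
        const unsigned src = row[x * 4 + c];
        rgb[out++] = static_cast<std::uint8_t>(
            (src * alpha + 255u * (255u - alpha) + 127u) / 255u);
      }
    }
  }
  return rgb;
}

// Possible usage with the variables from the first post (untested sketch):
//   std::vector<std::uint8_t> rgb = BlendOnWhite(
//       avframe->data[0], avframe->width, avframe->height,
//       avframe->linesize[0]);
//   ocr->SetImage(rgb.data(), avframe->width, avframe->height, 3,
//                 avframe->width * 3);
```

The output buffer has 3 bytes per pixel, so SetImage() is called with bytes_per_pixel = 3 and bytes_per_line = width * 3, and the buffer has to stay alive until the text has been read back.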
  