Compiled Tesseract (on both Windows and Linux) gives junk results on most images

Vasudevakrishna R joshi

Jun 10, 2025, 1:43:35 AM

to tesseract-ocr

Hi,
I compiled Tesseract for both Windows and Linux (x64) by following the documentation.
I am using the latest version (5.5.0).
While testing, I get weird characters, whereas the prebuilt tesseract.exe binary from the repo gives proper results. Is there anything I am missing?

Thanks,
Vasudevakrishna

Zdenko Podobny

Jun 10, 2025, 1:47:38 AM

to tesser...@googlegroups.com

What about providing an example image?

What about reading documentation? Which suggestion did you try?

Zdenko


Vasudevakrishna R joshi

Jun 10, 2025, 10:41:15 AM

to tesseract-ocr

Even with simple images I am not getting proper results. But if I run the tesseract.exe provided on GitHub, I get better results. Is Tesseract doing something internally (such as preprocessing) that my code is not doing?
My code is as simple as this:

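// Init() returns 0 on success and -1 on failure.
ocr = new tesseract::TessBaseAPI();
if (ocr->Init(dataPath, languageCode) != 0) {
    return NULL;  // initialization failed
}
//ocr->SetPageSegMode(tesseract::PSM_RAW_LINE);

Pix* image = pixRead(imagePath);
if (!image) {
    return NULL;  // image could not be read
}
ocr->SetImage(image);
// GetUTF8Text() returns a heap-allocated string; the caller must delete[] it.
text = ocr->GetUTF8Text();
pixDestroy(&image);  // Tesseract keeps its own copy of the image
return text;         // NULL if recognition produced nothing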
