Compiled Tesseract (on both Windows and Linux) gives junk results on most images


Vasudevakrishna R joshi

Jun 10, 2025, 1:43:35 AM
to tesseract-ocr
Hi,
I have compiled Tesseract for both Windows and Linux (x64) by following the documentation.
I am currently using the latest version (5.5.0).
In testing, my build produces weird characters, while the original tesseract.exe binary from the repo gives proper results. Is there anything I am missing?

Thanks,
Vasudevakrishna

Zdenko Podobny

Jun 10, 2025, 1:47:38 AM
to tesser...@googlegroups.com
What about providing an example image?
What about reading documentation? Which suggestion did you try?

Zdenko



Vasudevakrishna R joshi

Jun 10, 2025, 10:41:15 AM
to tesseract-ocr
Even with simple images I am not getting proper results, but when I run the tesseract.exe provided on GitHub I get better results. Is that binary doing something internally (for example, some preprocessing) that my code is not?
My code is simple, as shown below:
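#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>

// Returns the recognized UTF-8 text, or NULL on failure.
// The caller must release the returned buffer with delete[].
char* recognize(const char* imagePath, const char* dataPath, const char* languageCode)
{
    Pix* image = pixRead(imagePath);
    if (!image)
    {
        return NULL;  // the image could not be read
    }
    tesseract::TessBaseAPI* ocr = new tesseract::TessBaseAPI();
    // Init returns 0 on success and -1 on failure.
    if (ocr->Init(dataPath, languageCode) != 0) {
        delete ocr;
        pixDestroy(&image);
        return NULL;
    }
    //ocr->SetPageSegMode(tesseract::PSM_RAW_LINE);
    ocr->SetImage(image);
    char* text = ocr->GetUTF8Text();  // NULL if recognition failed
    ocr->End();
    delete ocr;
    pixDestroy(&image);
    return text;
}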