Is psm wrong?

Pndaza

Mar 29, 2020, 10:01:07 AM
to tesseract-ocr
I fine-tuned the Myanmar traineddata and got accuracy above 95%.
But something is weird.
I rendered some text at different exposures for evaluation and ran

tesseract in exp_minus_1.png exp_minus_1 -l mya --psm 6
 
etc.
The output for exposure minus 1 is OK.
In the output for exposure minus 5 and minus 10, some output lines do not actually exist in the image. The output for the lines that do exist is still OK.
I tried the default psm and it made no difference.
What is wrong?
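
The evaluation runs described above could be scripted roughly as follows. This is only a sketch under assumptions: the image names follow the attachment list (exp_minus_1.png and so on), the image path is passed directly as the first argument, and the helper names `tesseract_cmd`, `count_nonempty_lines` and `run_all` are made up for illustration. Comparing the number of non-empty output lines with the number of lines that were rendered is one simple way to flag output lines that do not exist in the image.

```python
import subprocess
from pathlib import Path


def tesseract_cmd(image, lang="mya", psm=6):
    """Build the tesseract command line for one image.

    The output base is the image path without its extension, so
    exp_minus_5.png produces exp_minus_5.txt.
    """
    image = Path(image)
    return ["tesseract", str(image), str(image.with_suffix("")),
            "-l", lang, "--psm", str(psm)]


def count_nonempty_lines(text):
    """Count lines that contain something other than whitespace."""
    return sum(1 for line in text.splitlines() if line.strip())


def run_all(exposures=(-1, -5, -10), psm=6):
    """Run tesseract on each rendered exposure and return the OCR text.

    Assumed naming (from the attachment list): -5 -> exp_minus_5.png.
    Only negative exposures appear in the thread, hence abs().
    """
    results = {}
    for exposure in exposures:
        image = Path(f"exp_minus_{abs(exposure)}.png")
        subprocess.run(tesseract_cmd(image, psm=psm), check=True)
        results[exposure] = image.with_suffix(".txt").read_text(encoding="utf-8")
    return results
```

Calling `run_all()` in the directory that holds the images, and then `count_nonempty_lines()` on each result, shows at which exposure the line count starts to exceed the number of rendered lines.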


My system is
tesseract v5.0.0-alpha.20200328
leptonica-1.78.0 
libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
Found AVX2
Found AVX
Found FMA
Found SSE
Found libarchive 3.3.2 zlib/1.2.11 liblzma/5.2.3 bz2lib/1.0.6 liblz4/1.7.5
Found libcurl/7.59.0 OpenSSL/1.0.2o (WinSSL) zlib/1.2.11 WinIDN libssh2/1.7.0 nghttp2/1.31.0


traineddata file
exp_minus_1.png
exp_minus_1.txt
exp_minus_5.png
exp_minus_5.txt
exp_minus_10.png
exp_minus_10.txt

Shree Devi Kumar

Mar 29, 2020, 10:35:36 AM
to tesseract-ocr
If you want to recognise images with an exposure of -5 or -10, then you need to train with those exposures. Check your images; I think they will be too light to be recognised correctly.
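
As a rough illustration of training with those exposures, the sketch below builds one `text2image` command per exposure level using its `--exposure` option. The training text file name, the font name, the output naming scheme and the helper names are assumptions for illustration only, not details taken from this thread.

```python
def text2image_cmd(text_file, font, exposure, lang="mya"):
    """Build a text2image command that renders text_file at one exposure."""
    # Output base named lang.font.expN; this naming scheme is an assumption.
    font_tag = font.replace(" ", "_")
    outputbase = f"{lang}.{font_tag}.exp{exposure}"
    return ["text2image",
            f"--text={text_file}",
            f"--outputbase={outputbase}",
            f"--font={font}",
            f"--exposure={exposure}"]


def training_render_cmds(text_file, font, exposures=(-10, -5, -1, 0)):
    """One command per exposure, so the training set covers all of them."""
    return [text2image_cmd(text_file, font, e) for e in exposures]


# Hypothetical usage: print the commands instead of running them.
for cmd in training_render_cmds("mya.training_text", "Padauk"):
    print(" ".join(cmd))
```

Each printed command can be run by hand or passed to `subprocess.run`; the rendered images and box files would then feed into the usual fine-tuning steps.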


Pndaza

Mar 29, 2020, 10:42:15 AM
to tesseract-ocr
Thanks. I will try.

Pndaza

Mar 29, 2020, 10:51:56 AM
to tesseract-ocr
But I get the same result for this image:
test.png
test.txt