Lack of accuracy on reading numbers


Ajay Pandya

Mar 27, 2024, 2:42:06 AM
to tesseract-ocr
Hello Everyone,

I am using Tesseract 5.2 with C# and am having a problem reading this number.

PSM : 8
OEM : 3
Train file : eng (Best)

Actual number : 31111. Tesseract reads : 3111.

We have many similar images with different numbers. Sometimes Tesseract adds an extra digit, and sometimes it drops one.

Kindly help with this problem.

Thanks.
teach_t2.bmp

Zdenko Podobny

Mar 27, 2024, 4:46:26 AM
to tesser...@googlegroups.com
If there is an issue with the wrapper, always test with the command line first.

tesseract -v
tesseract 5.3.4-44-g2b07
 leptonica-1.84.0 (Dec 31 2023, 23:36:37) [MSC v.1929 LIB Release x64]
  libgif 5.1.2 : libjpeg 6b (libjpeg-turbo 2.1.90) : libpng 1.6.40 : libtiff 4.6.0 : zlib 1.2.13.zlib-ng : libwebp 1.3.2 : libopenjp2 2.5.0
 Found AVX2
 Found AVX
 Found FMA
 Found SSE4.1
 Found OpenMP 200203

tesseract teach_t2.png -
31111

tesseract teach_t2.png - --psm 8
C 3111

tesseract teach_t2.png - --psm 7
31111


It seems that PSM 8 is not suitable in this case.
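
For a batch of similar images, the manual comparison above could be scripted. The following is only a minimal sketch in Python, assuming the `tesseract` executable is on the PATH; the helper names `build_command`, `is_plain_number`, and `compare_psm` are made up for illustration and are not part of Tesseract or any wrapper. It runs the same command line as above once per page segmentation mode and flags any result that is not purely digits (such as the "C 3111" from --psm 8).

```python
import subprocess


def build_command(image_path, psm):
    """Build the same command line used above: tesseract <image> - --psm <n>."""
    return ["tesseract", image_path, "-", "--psm", str(psm)]


def is_plain_number(text):
    """Return True when the OCR output, ignoring surrounding whitespace, is digits only."""
    return text.strip().isdigit()


def compare_psm(image_path, modes=(7, 8)):
    """Run tesseract once per page segmentation mode and collect the stripped output.

    Assumes the tesseract executable is installed and on the PATH.
    """
    results = {}
    for psm in modes:
        proc = subprocess.run(
            build_command(image_path, psm),
            capture_output=True,  # recognized text goes to stdout because the output base is "-"
            text=True,
            check=True,           # raise CalledProcessError if tesseract exits with an error
        )
        results[psm] = proc.stdout.strip()
    return results
```

Calling `compare_psm("teach_t2.png")` and passing each value to `is_plain_number` gives a quick check of which mode returns a clean number for a given image. Which mode works best may differ for other images, so this is a way to test, not a recommendation of a specific mode.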

Zdenko


On Wed, Mar 27, 2024 at 7:42, Ajay Pandya <ajay....@einnosystech.com> wrote: