tesseract 4.1.0
leptonica-1.76.0
libgif 5.1.4 : libjpeg 6b (libjpeg-turbo 1.5.2) : libpng 1.6.36 : libtiff 4.0.10 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
Found AVX2
Found AVX
Found SSE
Debian 10 64bit
--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/9e8203e6-fbd5-47dc-8b2b-0327fe1e2e0a%40googlegroups.com.
I am afraid that such a small fragment of text (consisting only of characters that are commonly misinterpreted, such as S or 5) cannot be recognized with 100% accuracy. Try recognizing it within some context (a full line).
Zdenko
On Mon, 21 Oct 2019 at 20:22, Ast <asteptoe...@gmail.com> wrote:
I've spent a good amount of time looking into how to resolve this issue. I came across this unanswered post from 2017, tried it, and it is still reproducible today. There are 2 images: one with the letter S, one with 2S. As a single character, the letter S is detected successfully, but 2S is detected as 25.

From what I've been able to learn, this issue stems from the combination of alphanumeric characters (common in receipts or codes) and how tesseract tries to use dictionary words.

Environment:
tesseract 4.1.0
leptonica-1.76.0
libgif 5.1.4 : libjpeg 6b (libjpeg-turbo 1.5.2) : libpng 1.6.36 : libtiff 4.0.10 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
Found AVX2
Found AVX
Found SSE
Debian 10 64bit

I've tried changing some configuration variables such as load_system_dawg=0 and load_freq_dawg=0, but without luck.
I am fairly new to OCR so any input and feedback is greatly appreciated. Thank you.
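For readers who want to reproduce the configuration attempt described above, here is a minimal sketch of how those two variables can be passed to the tesseract command line with `-c`. The helper name `build_tesseract_args`, the file name `2S.png`, and the choice of `--psm 7` (treat the image as a single text line) are assumptions for illustration, not details from the post. It only shows how the settings are passed; as reported above, they did not resolve the 2S/25 confusion.

```python
import shlex


def build_tesseract_args(image_path, psm=7, variables=None):
    """Build an argv list for the tesseract CLI that prints the text to stdout.

    `variables` maps tesseract config variable names to values; each one is
    passed as `-c name=value`.
    """
    args = ["tesseract", image_path, "stdout", "--psm", str(psm)]
    for name, value in (variables or {}).items():
        args += ["-c", f"{name}={value}"]
    return args


# Hypothetical input file; the dictionary settings are the ones from the post.
args = build_tesseract_args(
    "2S.png",
    psm=7,
    variables={"load_system_dawg": 0, "load_freq_dawg": 0},
)

# Show the equivalent shell command.
print(shlex.join(args))
```

The resulting list can be handed to `subprocess.run(args, capture_output=True, text=True)` on a machine where tesseract is installed.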
import cv2

# img is assumed to be an image that has already been loaded (e.g. with cv2.imread)
# Upscale 4x with bicubic interpolation; cv2.resize returns a new image
img = cv2.resize(img,
                 None,
                 fx=4,
                 fy=4,
                 interpolation=cv2.INTER_CUBIC)

# Crop a region of interest
x = 10
y = 10
h = 20
w = 110
img = img[y:y + h, x:x + w]
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/4b6426d0-450b-4416-95c3-ba3b23f778d6%40googlegroups.com.