tesseract-ocr
Welcome,
Before sending an email to the group:
Please read the Frequently Asked Questions.
Make sure you read the Tesseract documentation.
Search internet sources (including this group) for a solution.
If you have a problem:
Provide all steps (including input resources) needed to replicate it.
Do not send a screenshot of the terminal; send the logs or copy the text from the terminal.
testcoal
Apr 22
Train Tesseract with my own Data
Hi, I am trying to train a Tesseract model with my own data. This is my code: import os # Pfade
Roparzh Hemon, … Zdenko Podobny (8)
Apr 22
Beginner question : could not initialize tesseract, missing eng.traineddata file in tessdata
No, you are not using best float tessdata files from: https://github.com/tesseract-ocr/tessdata_best/
Surya VaraPrasad Alla
Apr 22
Reg: Legacy Components not found
Hello, I have a similar response: pytesseract.pytesseract.TesseractError: (1, "read_params_file
Surya VaraPrasad Alla
Apr 19
couldn't find components in fine tuned traineddata file
When using the fine-tuned model, I get the error below: (1, "read_params_file: Can't open
Panumeth Khongsawatkiat, ZeroCool Zero (2)
Apr 19
Training Tesseract 5 for a New Font in Thai not working
I tried to train Tesseract 5 with a new font in Thai, but the BCER value keeps increasing. There is
testcoal, Misti Hamon (2)
Apr 18
Train Tesseract (german)
Scanned books? No help on training or choosing datasets, but, if these images are photoscanned book
Jayrajsinh Zala, Zdenko Podobny (2)
Apr 18
tesseract misleading in 8 and 6
Unfortunately, your post is very vague. Unless you provide a detailed description of what you are
Leder Extreme BR
Apr 18
Cursive letters
Hello, I'm testing tesseract and I'm not able to process texts that use cursive fonts. How do
achille sadjang
Apr 17
Tesseract to recognize images or shapes
Hello everyone, I have a concern: is it possible to train Tesseract to recognize images or shapes? If
Omar Samir
Apr 12
Fine-Tune Arabic Model
I have created a dataset with almost 200 million words. So there are about 20 million examples to
Mark Pellegrino, … Jeremiah (17)
Apr 11
Post OCR Verification and Editing
Hi Mark, Glad you found Scribe OCR useful. Regarding character support, all characters in the Windows
Nathan Bierema
Apr 11
Building from source
I'm trying to build Tesseract from source using these instructions, but I believe I'm doing
Shatter, Jeremiah (2)
Apr 8
Recognition when font is known
Cropping the image to only include the relevant area can significantly improve performance in cases
Cain Pian, Jeremiah (3)
Apr 7
Is there a good way to change the recognition rate for such images?
Yes, I've seen a lot of discussion on this issue that ended up going nowhere, it might be helpful
Misti Hamon
Apr 5
Image preprocessing - textbook like layout
I'm hoping someone here can help. I'm working with a scan of a book with a textbook like
Jean-Marc Spaggiari, René JM Clais (5)
Apr 3
Short word detection recommendations
Thanks for giving it a try! I ended up generating 11 versions of the same picture with very little
Cain Pian, Zdenko Podobny (3)
Apr 3
Does training new images increase the size of the traindata file?
Thanks for the reply. I'm simply confused by the fact that training a large number of images didn
aum hren, Tom Morris (3)
Mar 29
english-arabic dictionary - transliteration text
Rather than using random web resources, I'd suggest using the official documentation. The most
Madhav Pandey, … Zdenko Podobny (14)
Mar 27
Getting Error: No such file or directory: 'data/foo/all-lstmf'
You can try custom images - see the example ocrd-testset.zip And follow the example from https://
roei shlezinger, Zdenko Podobny (2)
Mar 27
fine tuning on images
You can easily test your hypothesis by modifying Makefile[1] lines from tesseract "$<" $
Ajay Pandya, Zdenko Podobny (2)
Mar 27
Lack of accuracy on reading numbers
Always test the command line if there is an issue with the wrapper. tesseract -v tesseract 5.3.4-44-
inKi Wang, Zdenko Podobny (3)
Mar 26
Reading large gray images with only numbers yields incorrect results
Yes, we have suggestions for me to improve the accuracy of the results - they are already in the
Misti Hamon, Ger Hobbelt (2)
Mar 25
hOCR verification and editing plus non-word characters
In your scenario, I would check performance of both modern lstm (v4/v5 engine) and old "classic
Keith M, … Graham Toal (14)
Mar 21
advice for OCR'ing 9-pin dot matrix BASIC code
I believe that for fixed font width listings, it is preferable to segment the page into characters
Liam Doherty, … Tom Morris (5)
Mar 19
why are there no new trained models since 2018?
Thanks, that's helpful. Is the collaboration with Google ongoing then? Can you give me a sense of
Jan Ploska
Mar 16
Chinese characters.
Hello, I am making a transcript of YT videos using Tesseract. Images I input to Tesseract look like
Quan Nguyen, JB Data31 (3)
Mar 13
VietOCR v6.3.0 & VietOCR.NET v6.3.0 Releases
VietOCR v6.13.0 & VietOCR.NET v6.11.0 Releases A Java/.NET WPF GUI frontend for Tesseract OCR
Ravil R, Zdenko Podobny (2)
Mar 13
Leptonica directory
It seems like you are not following the official documented way for compiling leptonica and tesseract
Roman Seidel, … Zdenko Podobny (7)
Mar 12
user patterns with tesserocr python API
One correction: I checked the example in the below mentioned url with the Tesseract executable and
Jan F
Mar 12
Some PDF readers see double spaces in tesseract PDF output
Dear readers, I'm experimenting with Tesseract 5.3.3.20231005 on Windows and I keep running into