tesseract-ocr
1–30 of 7151
Welcome,
Before sending an email to the group:
- Please read the Frequently Asked Questions.
- Make sure you read the Tesseract documentation.
- Search internet sources (including this group) for a solution.
If you have a problem:
- Provide all steps (including input resources) for its replication.
- Do not send a screenshot of the terminal; send the logs or copy the text from the terminal.

testcoal, Misti Hamon (2) · Apr 18
Train Tesseract (german)
Scanned books? No help on training or choosing datasets, but, if these images are photoscanned book

Jayrajsinh Zala, Zdenko Podobny (2) · Apr 18
tesseract misleading in 8 and 6
Unfortunately, your post is very vague. Unless you provide a detailed description of what you are

Leder Extreme BR · Apr 18
Cursive letters
Hello, I'm testing tesseract and I'm not able to process texts that use cursive fonts. How do

achille sadjang · Apr 17
Tesseract to recognize images or shapes
Hello everyone, I have a concern: is it possible to train Tesseract to recognize images or shapes? If

Omar Samir · Apr 12
Fine-Tune Arabic Model
I have created a dataset with almost 200 million words. So there are about 20 million examples to

Mark Pellegrino, …, Jeremiah (17) · Apr 11
Post OCR Verification and Editing
Hi Mark, Glad you found Scribe OCR useful. Regarding character support, all characters in the Windows

Nathan Bierema · Apr 11
Building from souce
I'm trying to build Tesseract from source using these instructions, but I believe I'm doing

Shatter, Jeremiah (2) · Apr 8
Recognition when font is known
Cropping the image to only include the relevant area can significantly improve performance in cases

Cain Pian, Jeremiah (3) · Apr 7
Is there a good way to change the recognition rate for such images?
Yes, I've seen a lot of discussion on this issue that ended up going nowhere, it might be helpful

Misti Hamon · Apr 5
Image preprocessing - textbook like layout
I'm hoping someone here can help. I'm working with a scan of a book with a textbook like

Jean-Marc Spaggiari, René JM Clais (5) · Apr 3
Shord word detection recommendations
Thanks for giving it a try! I ended up generating 11 versions of the same picture with very little

Cain Pian, Zdenko Podobny (3) · Apr 3
Does training new images increase the size of the traindata file?
Thanks for the reply I'm simply confused by the fact that training a large number of images didn

aum hren, Tom Morris (3) · Mar 29
english-arabic dictionary - transliteration text
Rather than using random web resources, I'd suggest using the official documentation. The most

Madhav Pandey, …, Zdenko Podobny (14) · Mar 27
Getting Error: No such file or directory: 'data/foo/all-lstmf'
You can try custom images - see the example ocrd-testset.zip And follow the example from https://

roei shlezinger, Zdenko Podobny (2) · Mar 27
fine tuning on images
You can easily test your hypothesis by modifying Makefile[1] lines from tesseract "$<" $

Ajay Pandya, Zdenko Podobny (2) · Mar 27
Lack of accuracy on reading numbers
Always test the command line if there is an issue with the wrapper. tesseract -v tesseract 5.3.4-44-

inKi Wang, Zdenko Podobny (3) · Mar 26
Reading large gray images with only numbers yields incorrect results
Yes, we have suggestions for me to improve the accuracy of the results - they are already in the

Misti Hamon, Ger Hobbelt (2) · Mar 25
hOCR verification and editing plus non-word characters
In your scenario, I would check performance of both modern lstm (v4/v5 engine) and old "classic

Keith M, …, Graham Toal (14) · Mar 21
advice for OCR'ing 9-pin dot matrix BASIC code
I believe that for fixed font width listings, it is preferable to segment the page into characters

Liam Doherty, …, Tom Morris (5) · Mar 19
why are there no new trained models since 2018?
Thanks, that's helpful. Is the collaboration with Google ongoing then? Can you give me a sense of

Jan Ploska · Mar 16
Chinise characters.
Hello, I am making a transcrypt of YT wideos using tessaract. Images I input to tessaract look like

Quan Nguyen, JB Data31 (3) · Mar 13
VietOCR v6.3.0 & VietOCR.NET v6.3.0 Releases
VietOCR v6.13.0 & VietOCR.NET v6.11.0 Releases A Java/.NET WPF GUI frontend for Tesseract OCR

Ravil R, Zdenko Podobny (2) · Mar 13
Leptonica directory
It seems like you are not following the official documented way for compiling leptonica and tesseract

Roman Seidel, …, Zdenko Podobny (7) · Mar 12
user patterns with tesserocr python API
One correction: I checked the example in the below mentioned url with the Tesseract executable and

Jan F · Mar 12
Some PDF readers see double spaces in tesseract PDF output
Dear readers, I'm experimenting with Tesseract 5.3.3.20231005 on Windows and I keep running into

Panumeth Khongsawatkiat · Mar 12
Training Tesseract 5 for a New Font in Thai not wroking
I tried to train Tesseract 5 with a new font in Thai but The BCER value keeps increasing. This is the

Mridul Davesar · Mar 12
LSTM training tesseract OCR high error rate
Hey everyone , I am train my own lstm model based using some specific images that I want tesseract to

Ali öksüzoglu · Mar 11
I can't create OCR traindata
Hello, I am trying to solve the Captcha in this image, but I am getting an error. Is there anyone who

thangaraj r · Mar 8
i got Failed to continue from: data/eng/eng_num_vert.lstm
Warning: LSTMTrainer deserialized an LSTMRecognizer! Error, data/eng/eng_num_vert.lstm is an integer

Minh Nguyen · Mar 7
How to get path tesseract_cmd
I'm using sam cli to build and deploy images to AWS ECR. The code snippet has been packaged into