tesseract-ocr
1–30 of 7358
Welcome,
Before sending an email to the group:
Please read the Frequently Asked Questions.
Make sure you read the Tesseract documentation.
Search internet sources (including this group) for a solution.
If you have a problem:
Provide all steps (including input resources) needed to reproduce it.
Do not send a screenshot of the terminal; send the logs or copy the text from the terminal.
James Smith
Jan 21
force 2 lines
I am trying to ocr my cell phone usage report. Sometimes it is two lines, all the numbers on one line
James Smith (6)
Jan 21
2 line PNG to more than 2 lines output (used to output 2 lines of text)
It wasn't the uneven line because it happened again and it seems even horizontally. On Thursday,
Thomas
Jan 19
Guidance for optimising model.
Hi!, Thank you for reading, I'm new here and have a hard time getting my head around how the
Shavkat Sultanov
Jan 18
is there a minimum amount of samples that is required for a training to start running?
I encountered a similar issue as this guy: https://github.com/tesseract-ocr/tesstrain/issues/40 I
Shavkat Sultanov (7)
Jan 18
Failed to read continue from: ./data/gg_custom_1/checkpoints/gg_custom_1_checkpoint make: *** [Makefile:325: data/gg_custom_1.traineddata] Error 1
combine_lang_model \ --input_unicharset ./data/gg_custom_1/unicharset \ --script_dir ./data/langdata
Shavkat Sultanov, Zdenko Podobny (13)
Jan 7
my training fails
Hi again, I copied over the actual eng-traineddata (14.6MB) file now. It does not have a problem with
Maxim Kizub
Jan 1
Strange PSM 7 and 13 results
Hello. I've trained tesseract for specific font on a carefully scaled and baseline aligned set of
Jürgen Uhl, Zdenko Podobny (2)
12/12/25
Strange bbox'es in image_to_boxes
Hi there, I'm afraid our crystal balls are on permanent vacation, so we have no way of magically
محمود محمد
12/7/25
Has anyone created a training file from scratch
Welcome everyone! This is Mahmoud, a member of the group. My question is: Has anyone created a
Baraah Kassab, Sara Elshobaky (4)
11/30/25
Arabic OCR
Dear, this is a main train data file or i need to combine it with other train data? when i use it
Stéphane Brunner, Zdenko Podobny (5)
11/25/25
Image just some color converted into black and white
Hello, Thanks, it is indeed working with a newer version :-) CU Stéph On Saturday, 22 November
Jozef M., Ger Hobbelt (2)
11/11/25
Tesseract LSTM competitive word recognition (at least for certain use cases)
Thank you for publishing this. Question: the -1 confidence numbers for T3 and T4 in the charts: could
Mattia Mirri, Ger Hobbelt (2)
11/10/25
Help with auto island detection
Relevant (if only sideways at first glance): - https://tesseract-ocr.github.io/tessdoc/ImproveQuality
Jeremy C. Reed, Ger Hobbelt (2)
11/9/25
training using a page at a time?
In answer to your question: AFAIK there is no 'simple' solution/answer. Reading, OCRing (old)
Harshit Goel, Ger Hobbelt (5)
11/5/25
tesseract via gosseract returns empty text for one image, but CLI detects correctly ("NO SMOKING")
"tav output modes": typo! I meant to say "TSV output mode". Sorry. Met
Sandeep G
11/3/25
Issue with Colon Recognition After Fine-Tuning Tesseract 5.5.3 on Russian Dataset
I'm currently working on fine-tuning the Tesseract OCR model (version 5.5.3) and encountered an
Michael Schuh, … Ger Hobbelt (11)
11/1/25
Trouble extracting date and time from image
You're welcome! Good luck and take care! .... (For posterity / google search, here's a
Coure 2011
10/29/25
Deserialize header failed: 1.lstmf
I need to train the default eng data, so that it can also recognize new characters. I created box
Jean-Marc Spaggiari, Zdenko Podobny (2)
9/28/25
Same command for 2 files
Hi, But for the Aurochs file I'm getting "Empty page!!". I have not been able to get a
pascal 06, Tom Morris (4)
9/18/25
Carriage return after each word
Good evening Tom, I suppose you are a French speaker :) Thanks for your reply! I will continue in English
Alessandro Griseta, … Milan Hauth (4)
8/30/25
[questions] what happened to `tessdata_best` in Tesseract 5?
works for me with tesseract 5.5.1 git clone --depth=1 https://github.com/tesseract-ocr/tessdata_best
Andrus Moor
8/24/25
How recognize text with background
Tried https://github.com/Sicos1977/TesseractOCR and Leptonica to convert jpg receipt slip to text:
Pavel Hanák
8/21/25
Tesseract returns exotic characters while processing standard latin-script document
Short version: Ghostscript uses Tesseract, but their data exchange interface may contain a bug.
Yuwen Hsieh
8/15/25
Can't install 5.5.1
Hello I tried to install Tesseract with docker base image python:3.11-trixie, but it's installing
Cary Lewis
8/14/25
OCR of iPhone Screen shots
I have been trying with some success to have tesseract recognize text from iPhone's about screen.
Thomas McGrew, Zdenko Podobny (7)
8/10/25
Incorrect text detection
You are correct, I did miss that section. Inverting the image seems to produce better results. I
Jan-Erik Lärka, Nikola Smolenski (4)
8/4/25
TESSDATA_PREFIX doesn't work with national character(s)
The problem is that there are two places attempting to use TESSDATA_PREFIX and they have conflicting
Terasgr
7/25/25
Modern Greek depends on Ancient Greek language?
Hello people. When I tried to OCR a Greek text using tesseract I found that the modern Greek data (
Graham Toal
7/20/25
Re: [tesseract-ocr] How can I find out the version of current Tesseract from cmdline?
'--' not '-' gtoal@linux:~/github/uparse-main$ tesseract Usage: tesseract --help | --
Tom Vercauteren, … Fly Night Society (7)
7/17/25
Best settings to OCR an image of some cyphered text (base64)
I already have, and yes to all. On Wednesday, July 16, 2025 at 5:33:53 PM UTC-4 tfmo...@gmail.com