Mute "Empty page!!" print when using libtesseract?


MPursche

Oct 8, 2019, 6:57:34 AM
to tesseract-ocr
Hello!

I have a use-case for Tesseract where I'm not always 100% sure that there is text in the image I try to do OCR on. When it runs and gets to one of these images it prints "Empty page!!" to the console which interferes with the user experience a lot.

I did some searching and I found this: https://github.com/tesseract-ocr/tesseract/wiki/FAQ#how-can-i-make-the-error-messages-go-to-tesseractlog-instead-of-stderr

However, I am using Tesseract as a library, not as a command line tool. Can I mute this behaviour from code?

I would prefer to change how I use libtesseract rather than modify libtesseract itself, if possible, since I am using a wrapper for C#. If I have to, though, I will compile my own libtesseract.dll.

I already tried redirecting Console.Out and Console.Error before making my calls to libtesseract but it didn't help.

Zdenko Podobny

Oct 8, 2019, 8:23:26 AM
to tesser...@googlegroups.com
Set the parameter debug_file to /dev/null (or to some filename where you can check the warnings from Tesseract).

Zdenko
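
For readers who call the library through a binding, the suggestion above comes down to one parameter assignment. Here is a minimal sketch in Python, assuming a binding object that exposes a `SetVariable(name, value)` method mirroring `TessBaseAPI::SetVariable`. The helper name `silence_tesseract` and its `log_path` argument are made up for illustration. `os.devnull` is used so that the path resolves to `/dev/null` on POSIX and `nul` on Windows.

```python
import os


def silence_tesseract(api, log_path=None):
    """Send Tesseract's diagnostic messages to a file instead of the console.

    `api` is assumed to be any binding object with a SetVariable(name, value)
    method; this helper is hypothetical and not part of Tesseract itself.
    """
    # os.devnull is "/dev/null" on POSIX and "nul" on Windows.
    target = log_path if log_path is not None else os.devnull
    # debug_file is the Tesseract parameter named in the reply above.
    return api.SetVariable("debug_file", target)
```

Passing a real path as `log_path` instead of relying on the default should keep the messages available for later inspection, which matches the "some filename" option in the reply.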


On Tue, Oct 8, 2019 at 12:57, MPursche <purs...@gmail.com> wrote:
--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/ce951ed6-f309-419c-a914-70b39dc93b54%40googlegroups.com.

MPursche

Oct 8, 2019, 8:51:38 AM
to tesseract-ocr
Thanks, that worked. :)



Lorenzo Bolzani

Oct 8, 2019, 8:52:42 AM
to tesser...@googlegroups.com
Hi,
I suspect that what you are using is not a real API binding but more of a command-line wrapper. Such wrappers are very slow and inconvenient to use.

I would simply use the API, probably even the plain tesseract libs as you are using C#. The API does not write anything to the console.


Lorenzo


MPursche

Oct 8, 2019, 9:07:43 AM
to tesseract-ocr
I think you might be right, I just installed this one from NuGet: https://github.com/charlesw/tesseract/

Do you know of a C# binding you would recommend?



Lorenzo Bolzani

Oct 8, 2019, 7:06:04 PM
to tesser...@googlegroups.com

I'm not a C# developer, but I suppose you can just use the C++ library as is.


Lorenzo


Zdenko Podobny

Oct 9, 2019, 2:51:20 AM
to tesser...@googlegroups.com
Unfortunately, you are wrong on both points:
  1. He is using a real API binding (if not the best, then at least the most active C# Tesseract solution).
  2. The Tesseract library prints output to stderr and stdout. Check the source.
Zdenko
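
Because the library writes to the process's own stderr and stdout, swapping a managed stream object such as Console.Error only affects writes made through that object, which would explain why the redirection attempted in the original question had no effect. If setting debug_file is not an option, a host program can redirect the underlying OS file descriptor instead. Below is a minimal sketch in Python. The helper name `redirect_fd` is made up for illustration, and `os.write(2, ...)` only stands in for a native library printing "Empty page!!".

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def redirect_fd(fd, path):
    """Temporarily point OS-level descriptor `fd` (2 = stderr) at `path`."""
    # Flush Python's own buffers so earlier output is not caught by the switch.
    sys.stdout.flush()
    sys.stderr.flush()
    target = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    saved = os.dup(fd)  # keep a copy of the original destination
    try:
        os.dup2(target, fd)  # from here on, writes to fd land in `path`
        yield
    finally:
        sys.stdout.flush()
        sys.stderr.flush()
        os.dup2(saved, fd)  # restore the original destination
        os.close(saved)
        os.close(target)


if __name__ == "__main__":
    with redirect_fd(2, os.devnull):
        # Stand-in for a native library writing straight to stderr.
        os.write(2, b"Empty page!!\n")
    os.write(2, b"back to normal\n")
```

One caveat with this approach: text that a native library has buffered internally (typical for stdout, less so for stderr) may only be flushed after the descriptor has been restored, so setting debug_file is likely the more reliable route.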

On Tue, Oct 8, 2019 at 14:52, Lorenzo Bolzani <l.bo...@gmail.com> wrote:

Lorenzo Bolzani

Oct 9, 2019, 3:34:10 AM
to tesser...@googlegroups.com
OK, I trust you. Strange that I never noticed it. Maybe it does this only for some page segmentation modes (PSMs) that I do not use.

Thanks for clarifying.


Lorenzo
