tesseract.exe has stopped working on win2008 r2


moos3

Mar 23, 2011, 11:50:50 AM
to tesseract-ocr
I have been trying to get the latest version that's available for download
to work on Windows 2008 Server R2. The moment it starts to process the
file, a "has stopped working" message instantly appears on the screen. I was
wondering if anyone could build a new Windows release or knows how to fix
this issue?



Thanks.

Sriranga(78yrsold)

Mar 23, 2011, 12:28:03 PM
to tesser...@googlegroups.com
Which version of tesseract-ocr did you download for Windows 2008? The latest version, r578 (released today), works fine for me on WinXP. I would hope it also works on Windows 2008 Server R2, but I can't be sure since I don't have Windows 2008.
Cheers.



--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to tesser...@googlegroups.com.
To unsubscribe from this group, send email to tesseract-oc...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en.


zdenko podobny

Mar 23, 2011, 2:54:53 PM
to tesser...@googlegroups.com, moos3
Hi,

tesseract is a command-line tool. The item in the Windows menu is more or less just for testing purposes (it will not be present in the next version of the tesseract installer).

If you need a GUI, have a look at VietOCR, PDF OCR X, Lector, etc.

Zdenko

Richard Genthner

Mar 23, 2011, 8:42:20 PM
to zdenko podobny, tesser...@googlegroups.com
It errors when I run it from the command line. Running something like tesseract c:\image.tif c:\output.txt causes the error as soon as it starts processing the image.
--
Thank you,

Richard Genthner
IT Consultant
545 Union Rd
Waldoboro,Me 04572
Ph. 585.563.9602
Fax. 1.413.294.9891
Web. www.guthnur.net


moos3

Mar 25, 2011, 10:19:41 AM
to tesseract-ocr
The following test command causes a "tesseract.exe has stopped
working." error:

"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe" "C:/www/test/wwwdocs/files/ea1/ea1064bb1fdb449c28f97fa31b8e3ea6.tif" "C:/www/test/wwwdocs/files/ea1/ea1064bb1fdb449c28f97fa31b8e3ea6.txt"

Ideas on how to get this working ?

Sriranga(78yrsold)

Mar 25, 2011, 10:33:19 AM
to tesser...@googlegroups.com
It would be better to upload the TIF file for testing, if the text is in English.

Dmitri Silaev

Mar 25, 2011, 10:40:39 AM
to tesser...@googlegroups.com, moos3
I'd say it's better to upload a screenshot of the error message, the image file, your traineddata, and the command line you use.

Dmitri

Lutz, Michael

Mar 25, 2011, 11:09:22 AM
to tesser...@googlegroups.com, moos3

Just a thought: have you checked that "tesseract.exe" is in the same folder as the "tessdata" folder?

I could provide an exe built on a Win7 32-bit system, but I am not sure how, since I think Google blocks the *.exe extension. Is it enough to rename the extension?

Mike




This message is confidential and intended only for the addressee. If you have received this message in error, please immediately notify the postm...@nds.com and delete it from your system as well as any copies. The content of e-mails as well as traffic data may be monitored by NDS for employment and security purposes.
To protect the environment please do not print this e-mail unless necessary.

An NDS Group Limited company. www.nds.com

Richard Genthner

Mar 25, 2011, 12:03:49 PM
to Lutz, Michael, tesser...@googlegroups.com
Here is the screenshot and the TIF file. Dmitri, if you rename the .exe, that should work. I'm trying to get the training data uploaded.
ea1064bb1fdb449c28f97fa31b8e3ea6.tif

Lutz, Michael

Mar 25, 2011, 12:40:13 PM
to Richard Genthner, tesser...@googlegroups.com

Hi,

I just ran your tif file, I get no results, it must have something to do with the size of the image. If I try to run a portion of tiff something smaller than 1000x1000 then I get results.

Can somebody explain why a tif size (2480x3508 @ 8BPP) is not processed?

Mike

 



Dmitri Silaev

Mar 26, 2011, 1:26:04 AM
to tesser...@googlegroups.com, Lutz, Michael, Richard Genthner
You didn't attach the .exe and the screenshot, or am I missing something?

To get past Gmail's obtrusive security checks, you can zip the .exe
with password protection (and give us the password). Or you can place the
.exe into two nested folders, change its extension, then zip and
send.

Your .tif looks okay, though.

Warm regards,
Dmitri Silaev

Quan Nguyen

Mar 26, 2011, 8:47:55 AM
to tesseract-ocr
The image appears to have been heavily compressed. Running OCR on the whole
image did not yield anything. Doing it blockwise, I got some results, but
they were not very accurate:

Ch Juhe 24, 2@@9 the ACHP vctect ct: revisect teccmmehdettcns tcr
mee_s1es-muhqes-t'ube[[e (NR/H~
‘evictetnce ct tmmuhity’ requtrementstcr heetthcete teefschheh‘. The
Heatthcate thtecttctn Ochtrct
Ptectices Aciviscry Ccmrmttee (HHCPAG) has ernctcfsed these changes.

Sriranga(78yrsold)

Mar 26, 2011, 9:12:41 AM
to tesser...@googlegroups.com
According to IrfanView, the file is an LZW-compressed TIF at 300 DPI. What Quan says is correct: the image is a heavily compressed TIF. In my experience, Tesseract-OCR supports only uncompressed TIF files.



zdenko podobny

Mar 26, 2011, 10:18:30 AM
to tesser...@googlegroups.com, Richard Genthner, Lutz, Michael
Convert it to PNG: you get a smaller file with the same quality, and tesseract should process it without problems.

Zdenko


zdenko podobny

Mar 26, 2011, 10:42:50 AM
to tesser...@googlegroups.com, Lutz, Michael, Richard Genthner
On Fri, Mar 25, 2011 at 5:40 PM, Lutz, Michael <ML...@nds.com> wrote:

Hi,

I just ran your tif file, I get no results, it must have something to do with the size of the image. If I try to run a portion of tiff something smaller than 1000x1000 then I get results.

Can somebody explain why a tif size (2480x3508 @ 8BPP) is not processed?


This is not a tesseract issue but a leptonica one (the library used for image handling). When I ran it on Linux I got error messages coming from leptonica (1.67; I have not tried 1.68 on Linux yet):
Error in pixReadFromTiffStream: spp not in set {1,3,4}
Error in pixReadStreamTiff: pix not read
Error in pixReadTiff: pix not read

On Windows, the leptonica "release version" library does not show error/warning messages because of the compile option "NO_CONSOLE_IO" (see http://code.google.com/p/leptonica/issues/detail?id=42).

It looks like leptonica does not support LZW compression for TIFF (see http://www.leptonica.com/source/README.html, "9. Image I/O": LZW is mentioned in the PNG and GIF sections, but not for TIFF). When I changed the TIFF compression from LZW to zip (BTW: this also gives a smaller image), tesseract produced output (on XP SP3).

Zdenko



Dmitri Silaev

Mar 26, 2011, 5:04:10 PM
to tesser...@googlegroups.com, zdenko podobny, Lutz, Michael, Richard Genthner
Guys, I still can't understand what error Tesseract produces.
Let's wait for the error screenshot. Or have you already figured
everything out? Richard says he's got an error message...

Warm regards,
Dmitri Silaev

TP

Mar 26, 2011, 7:45:15 PM
to tesser...@googlegroups.com
On Sat, Mar 26, 2011 at 7:42 AM, zdenko podobny <zde...@gmail.com> wrote:
>> Can somebody explain why a tif size (2480x3508 @ 8BPP) is not processed?

The test image has 16 bpp.

> This is not tesseract but leptonica issue (library used for image handling).
> When I run it on linux I got error message comming from leptonica (1.67 -> I
> did not try 1.68 on linux yet):
> Error in pixReadFromTiffStream: spp not in set {1,3,4}
> Error in pixReadStreamTiff: pix not read
> Error in pixReadTiff: pix not read

I get the same warnings with Leptonica v1.68 on Windows XP SP3.

> On Windows leptonica "release version" library did not show error/warning
> messages because of compile option "NO_CONSOLE_IO"
> (see http://code.google.com/p/leptonica/issues/detail?id=42).
> It looks like leptonica did not support lzw compression for tiff (
> see http://www.leptonica.com/source/README.html  "9. Image I/O" - lzw is
> mentioned in png and gif section, but not with tif). I change
> tif compression from lzw to zip (BTW: this will cause smaller image),
> tesseract will produce ouput (on XP SP3).

Incorrect. At least on Windows I build libtiff with "LZW_SUPPORT = 1"
in my nmake.opt file.

You can see the actual problem by looking at
http://tpgit.github.com/Leptonica/tiffio_8c_source.html#l00274, where
Leptonica gets the TIFFTAG_SAMPLESPERPIXEL. It allows 1, 3, or 4 but
not 2 as this image contains.
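
For anyone who wants to check a file before handing it to tesseract, the sketch below reads the relevant tags straight from a TIFF header using only the Python standard library. It is an illustration under assumptions, not Leptonica's actual code: `read_tiff_tags` and `leptonica_would_accept` are hypothetical helper names, only the first IFD and only SHORT/LONG fields are parsed, and the accepted set {1, 3, 4} is taken from the error message quoted above.

```python
import struct

# Tag IDs as defined in the TIFF 6.0 specification.
TAG_BITS_PER_SAMPLE = 258
TAG_COMPRESSION = 259
TAG_SAMPLES_PER_PIXEL = 277


def read_tiff_tags(data: bytes) -> dict:
    """Return {tag_id: first_value} for SHORT/LONG tags in the first IFD.

    Hypothetical helper for illustration; classic (non-BigTIFF) files only.
    """
    if data[:2] == b"II":
        endian = "<"  # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        endian = ">"  # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")

    (count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    tags = {}
    for i in range(count):
        entry = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes
        tag, field_type, n = struct.unpack_from(endian + "HHI", data, entry)
        if field_type == 3:    # SHORT
            size, fmt = 2, "H"
        elif field_type == 4:  # LONG
            size, fmt = 4, "I"
        else:
            continue           # other field types are skipped in this sketch
        if n * size <= 4:
            value_pos = entry + 8  # value stored inline in the entry
        else:
            # value stored elsewhere; the entry holds its offset
            (value_pos,) = struct.unpack_from(endian + "I", data, entry + 8)
        (tags[tag],) = struct.unpack_from(endian + fmt, data, value_pos)
    return tags


def leptonica_would_accept(tags: dict) -> bool:
    """Mirror the 'spp not in set {1,3,4}' check from the error message."""
    # SamplesPerPixel defaults to 1 when the tag is absent.
    return tags.get(TAG_SAMPLES_PER_PIXEL, 1) in (1, 3, 4)
```

For a file on disk, something like `read_tiff_tags(open(path, "rb").read())` should do; for the image in this thread it would be expected to report 2 for tag 277 (SamplesPerPixel), so `leptonica_would_accept` would return False.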

-- TP

zdenko podobny

Mar 27, 2011, 5:01:19 AM
to Dmitri Silaev, tesser...@googlegroups.com, Lutz, Michael, Richard Genthner
Three different users with a working tesseract-ocr have a problem with this input. Isn't that sufficient for you? Yes, there could be other issues related to Windows 7 (because the official build was done on 32-bit Windows XP), but without solving the "input issue" any other investigation will be unnecessarily difficult...

Zdenko

zdenko podobny

Mar 27, 2011, 5:27:29 AM
to tesser...@googlegroups.com, TP
On Sun, Mar 27, 2011 at 12:45 AM, TP <win...@gmail.com> wrote:
> The test image has 16 bpp.

Interesting. How did you get this information? I tried:
  • identify (ImageMagick): TIFF 2480x3508 2480x3508+0+0 8-bit Grayscale DirectClass 1.556MB
  • IrfanView "says": Original colors: 65536   (16 BitsPerPixel); Current colors: 256   (8 BitsPerPixel); Number of unique colors: 41;

> You can see the actual problem by looking at
> http://tpgit.github.com/Leptonica/tiffio_8c_source.html#l00274, where
> Leptonica gets the TIFFTAG_SAMPLESPERPIXEL. It allows 1, 3, or 4 but
> not 2 as this image contains.

Thanks for clarifying this. As I mentioned, it was just my guess based on my reading of the README :-)

Zdenko

TP

Mar 27, 2011, 7:58:33 AM
to tesser...@googlegroups.com
On Sun, Mar 27, 2011 at 2:27 AM, zdenko podobny <zde...@gmail.com> wrote:
> On Sun, Mar 27, 2011 at 12:45 AM, TP <win...@gmail.com> wrote:
>> On Sat, Mar 26, 2011 at 7:42 AM, zdenko podobny <zde...@gmail.com> wrote:
>> >> Can somebody explain why a tif size (2480x3508 @ 8BPP) is not
>> >> processed?
>>
>> The test image has 16 bpp.
>>
> Interesting. How did get this information? I tried:
>
> identify (imagemagick): TIFF 2480x3508 2480x3508+0+0 8-bit Grayscale
> DirectClass 1.556MB
> infranview "says": Original colors: 65536   (16 BitsPerPixel);
> Current colors: 256   (8 BitsPerPixel); Number of unique colors: 41;

I used:

ACDSee and

AsTiffTagViewer
(http://www.awaresystems.be/imaging/tiff/astifftagviewer.html). The
results are a bit complicated, but I'm pretty sure this one tells you
EXACTLY what the TIFF tags contain. "Whenever a customer reports your
software doesn't handle this or that particular TIFF, use
AsTiffTagViewer and discover why." :P It's the only tag viewer I know
of that correctly shows the ImageDescription tag for each page of a
multi-page tiff. Most image viewers seem to only show the first page's
ImageDescription.

And since you asked, I also ran libtiff 3.9.4's tiffinfo.exe to get:

TIFF Directory at offset 0x174e32 (1527346)
Image Width: 2480 Image Length: 3508
Resolution: 300, 300 (unitless)
Bits/Sample: 8
Compression Scheme: LZW
Photometric Interpretation: min-is-black
Extra Samples: 1<unassoc-alpha>
FillOrder: msb-to-lsb
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 2
Rows/Strip: 1
Planar Configuration: single image plane
Page Number: 0-1
DocumentName: C:/www/test/wwwdocs/files/ea1/ea1064bb1fdb449c28f97fa31b8e3ea6.tif
Predictor: horizontal differencing 2 (0x2)

Looks to me like some programs just look at Bits/Sample and forget to
also take into account Samples/Pixel?
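
The mismatch between the viewers can be made explicit with a little arithmetic. This is only a sketch: `bits_per_pixel` is a hypothetical helper name, and the input values are the Bits/Sample and Samples/Pixel fields from the tiffinfo output above.

```python
def bits_per_pixel(bits_per_sample: int, samples_per_pixel: int) -> int:
    """Total bits stored for one pixel across all of its samples."""
    return bits_per_sample * samples_per_pixel


# Values reported by tiffinfo for the test image: 8 bits/sample and
# 2 samples/pixel (one gray sample plus one unassociated-alpha sample).
bpp = bits_per_pixel(8, 2)
print(bpp)       # 16
print(2 ** bpp)  # 65536
```

That agrees with the "Original colors: 65536 (16 BitsPerPixel)" line from IrfanView, while a tool that reports only Bits/Sample would describe the same file as 8-bit.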

-- TP

zdenko podobny

Mar 28, 2011, 8:33:44 AM
to Lutz, Michael, Dmitri Silaev, tesser...@googlegroups.com, Richard Genthner

On Mon, Mar 28, 2011 at 11:54 AM, Lutz, Michael <ML...@nds.com> wrote:
> Hi All,
>
> So the image Richard gave us is a compressed TIF file. Since tesseract only supports uncompressed TIF images as noticed by Zdenko you will not get any results from this image.

Incorrect:
  1. Image support is the task of leptonica, so the list of supported formats can be found on the leptonica web site and in its source code. I think we really need to distinguish this, because upgrading leptonica could add support for a new format without changing a line of tesseract code.
  2. I guessed that leptonica has a problem with TIFF with "lzw compression". When I created a TIFF with "zip compression" it worked (there are also other compression algorithms available in TIFF: Packbits, G4, G3, ...). I never said that leptonica (tesseract) supports only uncompressed TIFF. I am sorry if I was not clear about this.
  3. As TP corrected me: the problem is not the LZW compression, but "Samples per Pixel". Leptonica supports 1, 3 and 4; the input image uses 2, which is unsupported. To "solve" this, just open the input file in IrfanView and save it as TIFF with LZW compression. It will change "Samples/Pixel" to 1 automatically ;-)
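
What changing "Samples/Pixel" from 2 to 1 means for the pixel data can be sketched in a few lines of Python. This is an illustration under assumptions, not a description of what IrfanView does internally: `drop_alpha` is a hypothetical helper, and it assumes already-decompressed 8-bit data in the single-plane (interleaved) layout that tiffinfo reported, with the gray sample first and the extra alpha sample second.

```python
def drop_alpha(gray_alpha: bytes) -> bytes:
    """Keep the gray sample of each (gray, alpha) pair and discard the alpha.

    Assumes 8 bits per sample and interleaved samples: G A G A G A ...
    """
    if len(gray_alpha) % 2 != 0:
        raise ValueError("expected an even number of bytes (gray, alpha pairs)")
    return gray_alpha[0::2]  # every second byte, starting with the first gray


# Three pixels, each stored as (gray, alpha).
row = bytes([10, 255, 20, 255, 30, 0])
print(list(drop_alpha(row)))  # [10, 20, 30]
```

The result has one sample per pixel, which falls inside the set {1, 3, 4} named in the Leptonica error message.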
 Zdenko


Dmitri Silaev

Mar 28, 2011, 8:59:05 AM
to tesser...@googlegroups.com
Sriranga and Mike,

Support for uncompressed TIFFs only has not been an issue for a long time!
That was only during the period when Tess used home-brewed TIFF
input/output routines. Now Tesseract supports many TIFF variations
through the use of Leptonica.

Actually I don't use the image handling part of Tesseract, so I'm
rather interested in investigation of Tesseract's errors, not
Leptonica's.

Warm regards,
Dmitri Silaev


Lutz, Michael

Mar 28, 2011, 8:41:28 AM
to zdenko podobny, Dmitri Silaev, tesser...@googlegroups.com, Richard Genthner

Sorry, you were not saying this. I mixed some things up when reading up on the issue this morning. This was what I was referring to:

> According irfanview, is compressed as - LZW tif file of 300 DPI   What Quan says is correct  image is heavily compressed tif one. Tesseract-OCR is supported only uncompressed tif file only from my experience.
>
> Sriranga(78yrsold)

Thanks for pointing it out.

Mike

 


Lutz, Michael

Mar 28, 2011, 5:54:31 AM
to Dmitri Silaev, tesser...@googlegroups.com, zdenko podobny, Richard Genthner
Hi All,

So the image Richard gave us is a compressed TIF file. Since tesseract only supports uncompressed TIF images as noticed by Zdenko you will not get any results from this image.

I attached the image as an uncompressed TIF file (see uncompressed.zip); tesseract processes this image without any problems.
Also attached is tesseract.zip, which should unpack to tesseract.executable; just rename it to tesseract.exe if it went through. It is a static release build made with Win7 and WinSDK 7.1, if anyone still wants it.

Regards,
Mike


uncompressed.zip
tesseract.zip

moos3

Mar 28, 2011, 11:17:45 AM
to tesser...@googlegroups.com, Richard Genthner, Lutz, Michael, zdenko podobny
Saving with IrfanView worked. ImageMagick doesn't seem to work. We think it has something to do with ImageMagick's LZW compression.

tarık Kaya

Oct 21, 2013, 11:52:09 AM
to tesser...@googlegroups.com
I changed the tesseract.exe and it worked. Tesseract version 3.2.0.0.