Build from source for Visual Studio and Windows

Essam Zaky

Jan 18, 2017, 8:32:36 AM
to tesseract-ocr
Dear all,
I have Windows with Visual Studio 2010 and 2015.
Is there a tutorial for building Tesseract 4.00 from source?
Is there also a tutorial for running the training process on Windows?

Any suggestions are welcome.

Thanks

ShreeDevi Kumar

Jan 18, 2017, 9:43:32 AM
to tesser...@googlegroups.com

ShreeDevi
____________________________________________________________
Bhajan - Kirtan - Aarti @ http://bhajans.ramparivar.com

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscribe@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/0e241ce0-2ffc-42bf-9814-c524bc144a9c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Essam Zaky

Jan 18, 2017, 11:23:42 AM
to tesseract-ocr

Thanks Shree

I already checked the link,
but when I run the following command it generates an error,
as follows:
C:\Users\emz>cppan --build pvt.cppan.demo.google.tesseract-master
No such file or directory, trying to build as package
Reading package specs...
dependency 'pvt.cppan.demo.unicode.icu.data' not found

C:\Users\emz>cd tesseract

C:\Users\emz\tesseract>cppan
Reading package specs...
dependency 'pvt.cppan.demo.unicode.icu.data' not found

Note: when I ran the following command
cppan --build pvt.cppan.demo.google.tesseract-master
it crashed on the first run.
Since then it always produces this error:
dependency 'pvt.cppan.demo.unicode.icu.data' not found

Egor Pugin

Jan 18, 2017, 1:03:53 PM
to tesseract-ocr
Hi,

Try removing the directory c:\users\emz\.cppan\storage and then re-run cppan.
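
For anyone who has to repeat this reset several times, the step can be scripted. The following is only a minimal Python sketch, not part of cppan: the helper name clear_cppan_storage is hypothetical, and it assumes the storage lives under .cppan\storage in the user's home directory, as in the path above.

```python
import shutil
from pathlib import Path


def clear_cppan_storage(home: Path) -> bool:
    """Delete <home>/.cppan/storage if it exists.

    Returns True when a directory was removed, False when there was
    nothing to remove. Other files under .cppan are left untouched.
    """
    storage = home / ".cppan" / "storage"
    if not storage.is_dir():
        return False
    shutil.rmtree(storage)
    return True
```

Calling clear_cppan_storage(Path.home()) before re-running cppan would correspond to the manual delete. On Windows, read-only files inside the storage may make the removal fail, in which case deleting the folder by hand is the fallback.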

Essam Zaky

Jan 18, 2017, 2:17:46 PM
to tesseract-ocr
Thanks Egor

I removed
c:\users\emz\.cppan\storage
and ran cppan as follows (Run --> cmd):
cppan --build pvt.cppan.demo.google.tesseract-master

cppan crashed after the following line:
Unpacking  : pvt.cppan.demo.unicode.icu.i18n-58.2.0...

What should I do: delete the storage again and re-run cppan,
or run cppan without cleaning the storage?

Egor Pugin

Jan 18, 2017, 2:18:54 PM
to tesseract-ocr
Please clean the storage again and run 'cppan --trace --build pvt.cppan.demo.google.tesseract-master'. Post the output here.

Essam Zaky

Jan 18, 2017, 2:33:28 PM
to tesseract-ocr

Attached is the cmd window log.
Note: the start of the log was cleared from the cmd window; it seems to keep only the most recent output.

thanks
cppanCrashlog0001.txt

Egor Pugin

Jan 18, 2017, 2:36:45 PM
to tesseract-ocr
Ok, then run 'cppan --trace --build pvt.cppan.demo.google.tesseract-master > 1.txt 2>&1' to get the full log and attach 1.txt here. (Of course, remove storage before this command.)
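
The redirection in that command is what makes the log complete: "> 1.txt" sends standard output to the file, and "2>&1" sends standard error to the same place, so nothing scrolls out of the cmd window. As a rough illustration only, the same idea can be sketched in Python; run_with_full_log is a hypothetical helper name, and the cppan arguments in the trailing comment are simply copied from the post above.

```python
import subprocess
from pathlib import Path
from typing import Sequence


def run_with_full_log(command: Sequence[str], log_path: Path) -> int:
    """Run command, send stdout and stderr to log_path, return the exit code."""
    with open(log_path, "wb") as log:
        # stdout=log plays the role of "> 1.txt";
        # stderr=subprocess.STDOUT plays the role of "2>&1".
        completed = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return completed.returncode


# Example call (assumes cppan is on PATH):
# run_with_full_log(
#     ["cppan", "--trace", "--build", "pvt.cppan.demo.google.tesseract-master"],
#     Path("1.txt"),
# )
```

The returned exit code gives a first hint of whether the run finished normally, although what exit codes cppan uses is not stated in this thread.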

Essam Zaky

Jan 18, 2017, 3:38:52 PM
to tesseract-ocr
Hi Egor,
The log is attached.

cppan did not crash this time.
How can I check whether the process completed successfully?
1.rar

Egor Pugin

Jan 18, 2017, 3:53:44 PM
to tesseract-ocr
Now go to c:\users\yourname\.cppan\storage\lnk\SomeHashDirHere\ and open pvt.cppan.demo.cairographics.cairo-1.15.2.sln.lnk there. Switch to the Release configuration in Visual Studio and try to build it.
You'll probably see errors. I'm not sure what caused them.

Next, try removing c:\users\yourname\.cppan\storage\src\24\65\930c (the whole directory).
Then re-run cppan --build pvt.cppan.demo.google.tesseract-master
without deleting the whole storage.

Gowzancha

Jan 23, 2017, 10:21:40 AM
to tesseract-ocr
I got the same error messages as Essam Zaky. I tried:
cppan --trace --build pvt.cppan.demo.google.tesseract-master > 1.txt 2>&1
1.txt is attached. cppan still crashes.
The directory c:\users\yourname\.cppan\storage\lnk\ is empty.

I tried to remove c:\users\yourname\.cppan\storage\src\24\65\930c
and ran again:
cppan --trace --build pvt.cppan.demo.google.tesseract-master > 2.txt 2>&1
2.txt is attached. Now when I run cppan from the tesseract directory, I get this error message:
dependency 'pvt.cppan.demo.unicode.icu.common' not found

I'm using Windows 7 Enterprise with Visual Studio 2012 installed.
1.txt
2.txt

Egor Pugin

Jan 23, 2017, 10:54:33 AM
to tesseract-ocr
Hi,

I'm trying to track down that issue (crash), but still need more info.
Could you please clear the storage, re-run 'cppan --build pvt.cppan.demo.google.tesseract-master' and attach the following log files from c:\Users\u\.cppan\:
cppan.log.debug
cppan.log.trace

Gowzancha

Jan 24, 2017, 5:22:05 AM
to tesseract-ocr
There are no such log files in the c:\Users\u\.cppan\ directory. There is only the storage directory and a cppan.yml file.

Egor Pugin

Jan 24, 2017, 5:26:44 AM
to tesseract-ocr
Please update to the latest cppan client with 'cppan --self-upgrade' and try again. The logs should then appear.

Gowzancha

Jan 24, 2017, 6:23:49 AM
to tesseract-ocr
I made three attempts at running:
'cppan --build pvt.cppan.demo.google.tesseract-master'
Each time I cleaned the storage and deleted the old log files.
The log files look quite different from each other; it seems the crash happens at a different point in the process each time!
Please find attached the log files for the three attempts.
cppan.log_2.zip
cppan.log_3.zip
cppan.log_1.zip

Egor Pugin

Jan 28, 2017, 12:56:24 PM
to tesseract-ocr
Hi,

I've updated the cppan client with a possible fix for your crashes. Please try running:
cppan --self-upgrade
cppan --build pvt.cppan.demo.google.tesseract-master

Essam Zaky

Jan 29, 2017, 6:31:10 AM
to tesseract-ocr

Hi Egor

Should I remove the storage before running these commands?
The process takes more than an hour.

Thanks

Egor Pugin

Jan 29, 2017, 6:32:08 AM
to tesseract-ocr
No, try it without removing storage.

Essam Zaky

Jan 29, 2017, 9:24:06 AM
to tesseract-ocr
Hi Egor

It has now completed without an error or crash:
0 Errors
140141 Warning

How can I check that the process is working fine?

Egor Pugin

Jan 29, 2017, 9:27:18 AM
to tesseract-ocr
What process?

Essam Zaky

Jan 29, 2017, 9:38:04 AM
to tesseract-ocr
I see some binary files here:
C:\Users\emz\.cppan\storage\bin\33e598b5\Release

and some binary files here:
C:\Users\emz

Also, where can I find the main *.sln file?
I would like to build the debug version of Tesseract.


Egor Pugin

Jan 29, 2017, 9:54:21 AM
to tesseract-ocr
1. The right binaries are in the folder from which you called the cppan command. It seems that is C:\Users\emz.
2. The solution file can be found next to those binaries (e.g. tesseract-9fa26eb4.sln.lnk). You can open it, switch to Debug and build.
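
Since the hash in the shortcut name differs from build to build, it can help to list the candidates rather than guess the name. This is only a small Python sketch under that assumption; find_solution_shortcuts is a hypothetical helper, and it looks only at files directly inside the given folder.

```python
from pathlib import Path


def find_solution_shortcuts(folder: Path) -> list:
    """Return the *.sln.lnk shortcuts directly inside folder, sorted by path."""
    # glob() without "**" does not descend into subdirectories.
    return sorted(folder.glob("*.sln.lnk"))
```

For the setup described above, find_solution_shortcuts(Path(r"C:\Users\emz")) would be the call to try; whichever shortcut it reports can then be opened in Visual Studio.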

Essam Zaky

Jan 29, 2017, 10:08:13 AM
to tesseract-ocr
Thanks Egor
Sorry for disturbing you so much.

I have another question.
I compared the leptonica files in the following path
C:\Users\emz\.cppan\storage\src\14\83\bafc\src

with the latest files downloaded from the Leptonica site.

Some files are different.

Should I copy the newer files into the cppan folders?
Should I do that for all libraries used in the solution?
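
A comparison like this can be reproduced with a few lines of Python, which makes it easier to report exactly which files differ. The sketch below is an illustration only: differing_files is a hypothetical helper, it uses the standard filecmp module, and it reports only files that exist in both trees, not files missing from one side.

```python
import filecmp
from pathlib import Path


def differing_files(left: Path, right: Path) -> list:
    """Return relative paths of files present in both trees whose contents differ."""
    changed = []
    for left_file in sorted(left.rglob("*")):
        if not left_file.is_file():
            continue
        relative = left_file.relative_to(left)
        right_file = right / relative
        # shallow=False forces a byte-by-byte comparison of the contents
        # instead of trusting size and modification time.
        if right_file.is_file() and not filecmp.cmp(left_file, right_file, shallow=False):
            changed.append(relative.as_posix())
    return changed
```

Here left would be the cppan source folder given above and right the folder with the downloaded Leptonica sources. The result only lists differences; it says nothing about which version the build expects.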

Egor Pugin

Jan 29, 2017, 10:12:33 AM
to tesseract-ocr
Tess uses stable 1.74 leptonica, not the master branch. You don't need to touch anything in cppan storage.

ShreeDevi Kumar

Jan 29, 2017, 11:16:25 AM
to tesser...@googlegroups.com
There are recent changes in leptonica which cater to the requirements of 4.0.0alpha. So I think you should build leptonica from the GitHub master for 4.0.

- excuse the brevity, sent from mobile


Essam Zaky

Jan 29, 2017, 1:50:28 PM
to tesseract-ocr
Thanks Egor, Shree

Egor, what do you think about Shree's opinion, which says "There are recent changes in leptonica which cater to requirements for 4.0.0alpha. so, I think you should build leptonica from the GitHub master for 4.0"?

Egor Pugin

Jan 29, 2017, 1:54:30 PM
to tesseract-ocr
I do not follow leptonica very closely. If you have strong concerns about the leptonica version, ask Zdenko or Ray.

Essam Zaky

Jan 29, 2017, 2:10:22 PM
to tesseract-ocr

Shree, I would like to train on some Arabic text images. Do you think the changes in Leptonica will affect the training or recognition process?

Essam Zaky

Feb 11, 2017, 6:13:28 AM
to tesseract-ocr
Dear Egor,

On my home machine, I built the training tools and Tesseract from source
by following the linked page.
I faced the following problems:
1. I ran the following:
cppan --build pvt.cppan.demo.google.tesseract.tesseract-master
The binary files were generated, but when I run tesseract it crashes:
C:\Users\myloo>tesseractmain 1.tif 1 -l ara
Tesseract Open Source OCR Engine v4.00.00alpha with Leptonica
TIFFReadDirectory: Warning, Unknown field with tag 292 (0x124) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 292 (0x124) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 292 (0x124) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
Page 1
DotProductSSE can't be used on Android

2. The generated .sln is not correct; it contains only 5 of the Tesseract projects.

What do you think, and why is my Windows machine detected as Android?



Egor Pugin

Feb 11, 2017, 6:18:12 AM
to tesseract-ocr
Hi Essam,

I've updated cppan sources to the latest tess master.
Please, try to re-run the command and check your issues again.
cppan --build pvt.cppan.demo.google.tesseract.tesseract-master

And by the way, what projects do you see in the solution?
Probably you have some build errors, so some projects were left unbuilt.
Did you see any errors during the build?

Essam Zaky

Feb 11, 2017, 10:31:58 AM
to tesseract-ocr
Hi Egor,

The same errors happen.

I did not see any errors during the cppan build, only warnings.

The 5 projects in the solution file are:
ALL_BUILD.vcxproj
ZERO_CHECK.vcxproj
cppan-d-b.vcxproj
cppan-d-b-d.vcxproj
cppan-d-c.vcxproj

Note: I did not clean the storage; I ran the command directly.

Best regards

Egor Pugin

Feb 11, 2017, 10:42:04 AM
to tesseract-ocr
That's ok.
What about your Android error? Does it still occur?

Essam Zaky

Feb 11, 2017, 11:44:42 AM
to tesseract-ocr
Yes, the error still occurs when I try to launch:

C:\Users\myloo>tesseractmain 1.tif 1 -l ara
Tesseract Open Source OCR Engine v4.00.00alpha with Leptonica
TIFFReadDirectory: Warning, Unknown field with tag 292 (0x124) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 292 (0x124) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 292 (0x124) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
Page 1
DotProductSSE can't be used on Android

Egor Pugin

Feb 11, 2017, 12:34:02 PM
to tesseract-ocr
The test image was recognized fine for me.

>tesseractmain.exe testing\phototest.tif - -l eng
Page 1
This is a lot of 12 point text to test the
ocr code and see if it works on all types
of file format.

The quick brown dog jumped over the
lazy fox. The quick brown dog jumped
over the lazy fox. The quick brown dog
jumped over the lazy fox. The quick
brown dog jumped over the lazy fox.

Please tell me:
1. your processor
2. your Windows version
3. your Visual Studio version

Essam Zaky

Feb 11, 2017, 2:15:07 PM
to tesseract-ocr

I downloaded the linked version,
installed it, copied the installed files over the files in the cppan build folder "C:\Users\myloo", and ran the command:

tesseract testing\phototest.tif - -l eng
Page 1
This is a lot of 12 point text to test the
ocr code and see if it works on all types
of file format.

The quick brown dog jumped over the
lazy fox. The quick brown dog jumped
over the lazy fox. The quick brown dog
jumped over the lazy fox. The quick
brown dog jumped over the lazy fox.

So the installed version behaves the same as on your machine.

But when I restore the original files built by the cppan command,
it crashes as follows:
C:\Users\myloo>tesseractmain testing\phototest.tif - -l eng
Page 1
DotProductSSE can't be used on Android

I'm using Visual Studio 2015; I also have VS2010 installed on the machine.
My machine configuration is shown in the attached image.

Egor Pugin

Feb 11, 2017, 2:27:35 PM
to tesseract-ocr

Essam Zaky

Feb 11, 2017, 3:06:15 PM
to tesseract-ocr
I replaced the binaries with the ones from your link:
C:\Users\myloo>tesseractmain testing\phototest.tif - -l eng
Page 1
DotProductSSE can't be used on Android

Note: at the beginning of the cppan build there is a message saying that the check for SSE found it. Maybe this helps.

Egor Pugin

Feb 23, 2017, 8:04:40 AM
to tesseract-ocr
Please check again.
I made some fixes; in general it should work now.

Essam Zaky

Feb 23, 2017, 11:07:03 AM
to tesseract-ocr
Hi Egor
Should I remove the storage and tesseract folders?
My last failed attempt was building leptonica 1.74 and Tesseract 3.05 for Visual Studio 2010.

Egor Pugin

Feb 23, 2017, 11:12:57 AM
to tesseract-ocr
> Should I remove the storage and tesseract folders?

First, try without removing them.
Use tess4.0 and vs2015.

Essam Zaky

Feb 23, 2017, 11:45:52 AM
to tesseract-ocr
cppan crashed.
The first problem appeared again while unpacking the following module:
Unpacking  : pvt.cppan.demo.unicode.icu.common-58.2.0...

Here are the steps I did:
cd tesseract
cppan

Note: at the beginning of the cppan run it says a new version of the client is available.
Should I now run cppan --self-upgrade?