Building tesseract 4.0.0 from master on OS X

Kevin Schiesser

Jul 30, 2017, 2:33:16 PM
to tesseract-ocr
Hi all,

I hit two walls when trying to build and run tesseract from the latest checkout of master on OS X Sierra. First, I ran into some issues when running make training. After some Makefile hacking I was able to link libpango-1.0, but the link then failed on libgobject-2.0. I couldn't find much about the availability of this library for Macs and stopped there.

The second issue occurs when running the vanilla tesseract OCR binary built from source:

TESSDATA_PREFIX=/path/to/repos/tesseract/tessdata tesseract AmazonSonicare.pdf ./
Error opening data file /path/to/repos/tesseract/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Could not initialize tesseract.

Along the way I had to depart from the Mac Homebrew instructions on the git wiki and pass in gcc/g++ v7.0 to the configure step (the instructions say to use v6.0). That said, the main binary build didn't report any warnings or errors.

Does the project intend to support Mac or should I simply use a Linux environment going forward?

Thanks much,
Kevin

ShreeDevi Kumar

Jul 31, 2017, 12:03:00 AM
to tesser...@googlegroups.com
Please see the following for the suggested solutions

Can't Install Latest Head With Brew

3.05 can't be be built as Standalone Self-contained Tesseract-OCR for Mac

Regarding TESSDATA_PREFIX, you can try either of the following: export the location, or pass --tessdata-dir as part of the command.

For example:

 export TESSDATA_PREFIX=/home/shree/tesseract-ocr

tesseract --tessdata-dir=/home/shree/tesseract-ocr testing/phototest.jpg testing/phototest-jpg 

Your /path/to/repos/tesseract/ should point to wherever your tessdata files actually are.
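
To see where the language file actually sits relative to the directory being passed, a small check along these lines may help. This is only a sketch: `find_traineddata` is a hypothetical helper, not part of Tesseract, and whether a given build expects the tessdata directory itself or its parent is treated as unknown here, so both locations are checked.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tesseract): report where LANG.traineddata
# is found relative to a candidate directory.
find_traineddata() {
    dir=$1
    lang=${2:-eng}
    # Check the directory itself first, then a tessdata/ subdirectory.
    for candidate in "$dir/$lang.traineddata" "$dir/tessdata/$lang.traineddata"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "no $lang.traineddata under $dir" >&2
    return 1
}
```

Calling `find_traineddata "$TESSDATA_PREFIX" eng` prints the path of the file if one is found. If nothing is found, the file is probably missing rather than the variable being wrong.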



ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscribe@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/96dcaf32-c4d0-4e5b-9f02-c06285bccdbf%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Kevin Schiesser

Jul 31, 2017, 12:27:13 AM
to tesseract-ocr
Thank you, Shree. I ran into the same issue with the Linux install: I had to wget eng.traineddata into tessdata/ because it was not delivered with the source installation. So I should be able to run tesseract on OS X from the HEAD source installation. My only remaining issue is executing make training, which bails when trying to link libgobject.
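
The download step described above might look like the following sketch. The helper name `fetch_traineddata` and the `TESSDATA_URL` variable are hypothetical, and the URL is an assumption about where the tesseract-ocr/tessdata repository serves raw files. Check that the model you fetch matches your Tesseract version.

```shell
#!/bin/sh
# Assumed download location; override TESSDATA_URL if it differs.
TESSDATA_URL=${TESSDATA_URL:-https://github.com/tesseract-ocr/tessdata/raw/main}

# Hypothetical helper: fetch LANG.traineddata into a tessdata directory
# unless the file is already there.
fetch_traineddata() {
    dest=$1
    lang=${2:-eng}
    target="$dest/$lang.traineddata"
    if [ -f "$target" ]; then
        echo "already present: $target"
        return 0
    fi
    mkdir -p "$dest" || return 1
    # wget -O writes to the given file; remove the partial file on failure.
    wget -O "$target" "$TESSDATA_URL/$lang.traineddata" || {
        rm -f "$target"
        return 1
    }
}
```

For example, `fetch_traineddata /path/to/repos/tesseract/tessdata eng` downloads the English model only if it is not already in place.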

ShreeDevi Kumar

Jul 31, 2017, 1:12:55 AM
to tesser...@googlegroups.com
I do not have a Mac, so I cannot check. But you can try the Homebrew formula option

option "with-training-tools", "Install OCR training tools"

with the Homebrew install, along with the --HEAD option.

Please add a comment to the existing macOS issue on GitHub if you still face a problem.

Stefan Weil

Jul 31, 2017, 3:42:32 PM
to tesseract-ocr
Kevin, how did you run the failing builds on macOS?

I just tested building with `brew install tesseract --HEAD --with-training-tools` and had no problems.
An automake-based build also works with MacPorts.
No modifications were needed for Tesseract git master.

Kevin Schiesser

Jul 31, 2017, 10:15:32 PM
to tesseract-ocr
I used brew to install the dependencies and then ran the following:

$ ./autogen.sh
$ make
$ sudo make install
$ make training

The last command exits with the following:

ld: library not found for -lgobject-2.0
collect2: error: ld returned 1 exit status
make[1]: *** [text2image] Error 1
make: *** [training] Error 2
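
libgobject-2.0 is part of GLib, so an error like this usually means the linker cannot see a GLib installation. One way to narrow it down is to ask pkg-config which linker flags it knows for the libraries involved. The sketch below assumes pkg-config is installed; `report_link_flags` is a hypothetical helper name, and `PKG_CONFIG` is the conventional override variable for the pkg-config binary.

```shell
#!/bin/sh
# Hypothetical helper: print the linker flags pkg-config knows for each
# module, or mark the module as missing. Returns non-zero if any is missing.
report_link_flags() {
    pc=${PKG_CONFIG:-pkg-config}
    status=0
    for module in "$@"; do
        if "$pc" --exists "$module"; then
            printf '%s: %s\n' "$module" "$("$pc" --libs "$module")"
        else
            printf '%s: not found by pkg-config\n' "$module"
            status=1
        fi
    done
    return $status
}
```

Running `report_link_flags pango gobject-2.0` shows whether each library is visible. If one is reported missing, installing it (for example with `brew install glib pango`, assuming Homebrew) or adjusting `PKG_CONFIG_PATH`, then re-running configure, is a reasonable next step.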

soumyas...@heavywater.solutions

Feb 23, 2018, 4:34:35 PM
to tesseract-ocr
Hi Kevin,

Were you able to install Tesseract from the master branch on Linux? I checked out the master branch but couldn't find the Makefile to complete the installation. I'm new to Tesseract, so I'm not sure if I'm missing something.

soumyas...@heavywater.solutions

Feb 23, 2018, 6:38:13 PM
to tesseract-ocr
OK, I figured out that ./configure creates the Makefile, but ./configure errors out with this message:

./configure: line 4193: syntax error near unexpected token `-mavx,'
./configure: line 4193: `AX_CHECK_COMPILE_FLAG(-mavx, avx=true, avx=false)'

Does anyone have any idea what this means?
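
A likely reading of this error, offered as an assumption rather than something confirmed in this thread: `AX_CHECK_COMPILE_FLAG` is a macro from the GNU Autoconf Archive. When that package is not installed at the time `configure` is generated, the macro name is copied into the script unexpanded, and the shell then fails on it. The sketch below lists such leftover macro calls; `list_unexpanded_macros` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: list lines in a generated configure script that still
# look like unexpanded m4 macro calls (AX_/AC_/AM_ name followed by "(").
# Exits non-zero when no such line is found, as grep does.
list_unexpanded_macros() {
    script=${1:-configure}
    grep -n -E '^[[:space:]]*(AX|AC|AM)_[A-Z0-9_]+\(' "$script"
}
```

If `list_unexpanded_macros ./configure` prints anything, installing the archive (the package is called `autoconf-archive` on Debian/Ubuntu and in Homebrew) and then re-running `./autogen.sh` followed by `./configure` may resolve it.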
