Error when running make training

Jingjing Lin

Jun 2, 2019, 10:10:48 PM
to tesseract-ocr
Hi, I installed tesseract about a month ago using brew install, and I'm now trying to set up the training tools on macOS Mojave with Homebrew, following the instructions here: https://github.com/tesseract-ocr/tesseract/wiki/Compiling#macos. I have been running into one problem after another.
The current error is:

boxchar.cpp:33:10: fatal error: unicode/uchar.h: No such file or directory

 #include "unicode/uchar.h"  // from libicu


Below is some information that might help figure out what is happening.

When running ./configure CC=gcc-8 CXX=g++-8 CPPFLAGS=-I/usr/local/opt/icu4c/include LDFLAGS=-L/usr/local/opt/icu4c/lib, I got the following configuration output:


checking pkg-config is at least version 0.9.0... yes
checking for lept >= 1.74... yes
checking for libarchive... yes
checking for icu-uc >= 52.1... no
checking for icu-i18n >= 52.1... no
checking for pango >= 1.22.0... yes
checking for cairo... yes

Training tools can be built and installed with:

$ make training
$ sudo make training-install


I'm wondering whether I got the error because icu-uc and icu-i18n are missing. Are these two related to icu4c? I did install icu4c, however. I also set:


export PKG_CONFIG_PATH=\
$(brew --prefix)/lib/pkgconfig:\
$(brew --prefix)/opt/libarchive/lib/pkgconfig:\
$(brew --prefix)/opt/icu4c/lib/pkgconfig:\
$(brew --prefix)/opt/libffi/lib/pkgconfig
./configure
Does anybody know why? Thanks in advance for the help.
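
The "checking for icu-uc >= 52.1..." lines look like pkg-config lookups, so one way to debug them is to ask pkg-config the same question from the same shell session that runs ./configure. The following is only a sketch under stated assumptions: brew_prefix and keg_pkgconfig_path are hypothetical helper names, and /usr/local is assumed as the fallback Homebrew prefix when brew is not on the PATH.

```shell
#!/bin/sh
# Print the Homebrew prefix, falling back to /usr/local (an assumption).
brew_prefix() {
    if command -v brew >/dev/null 2>&1; then
        brew --prefix
    else
        echo /usr/local
    fi
}

# Build a colon-separated pkg-config search path for the kegs named in the post.
# $1 = Homebrew prefix
keg_pkgconfig_path() {
    prefix=$1
    printf '%s' "$prefix/lib/pkgconfig"
    for keg in libarchive icu4c libffi; do
        printf ':%s' "$prefix/opt/$keg/lib/pkgconfig"
    done
    printf '\n'
}

PKG_CONFIG_PATH=$(keg_pkgconfig_path "$(brew_prefix)")
export PKG_CONFIG_PATH

# Ask pkg-config for the same modules and minimum version that configure reports.
if command -v pkg-config >/dev/null 2>&1; then
    for module in icu-uc icu-i18n; do
        if pkg-config --exists "$module >= 52.1"; then
            echo "$module: $(pkg-config --modversion "$module")"
        else
            echo "$module: not found via PKG_CONFIG_PATH"
        fi
    done
fi
```

If both modules are reported with a version here, running ./configure from this same session should see them too, because the exported variable is inherited by child processes.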

Zdenko Podobny

Jun 3, 2019, 1:21:10 AM
to tesser...@googlegroups.com

On Mon, Jun 3, 2019 at 4:10, Jingjing Lin <joejo...@gmail.com> wrote:

checking for icu-uc >= 52.1... no

checking for icu-i18n >= 52.1... no

This is the problem: you do not have ICU installed, or not the right version.
 

Jingjing Lin

Jun 3, 2019, 10:35:59 AM
to tesseract-ocr
Thanks for your reply.
Are these two related to icu4c? If so, 'brew info icu4c' gives me:

icu4c: stable 64.2 (bottled) [keg-only]
C/C++ and Java libraries for Unicode and globalization
https://ssl.icu-project.org/
/usr/local/Cellar/icu4c/64.2 (257 files, 69.2MB)
  Poured from bottle on 2019-05-06 at 17:49:58
From: https://github.com/Homebrew/homebrew-core/blob/master/Formula/icu4c.rb
==> Caveats
icu4c is keg-only, which means it was not symlinked into /usr/local,
because macOS provides libicucore.dylib (but nothing else).

If you need to have icu4c first in your PATH run:
  echo 'export PATH="/usr/local/opt/icu4c/bin:$PATH"' >> ~/.bash_profile
  echo 'export PATH="/usr/local/opt/icu4c/sbin:$PATH"' >> ~/.bash_profile

For compilers to find icu4c you may need to set:
  export LDFLAGS="-L/usr/local/opt/icu4c/lib"
  export CPPFLAGS="-I/usr/local/opt/icu4c/include"

For pkg-config to find icu4c you may need to set:
  export PKG_CONFIG_PATH="/usr/local/opt/icu4c/lib/pkgconfig"

==> Analytics
install: 368,818 (30 days), 986,695 (90 days), 3,236,278 (365 days)
install_on_request: 14,492 (30 days), 39,988 (90 days), 134,775 (365 days)
build_error: 0 (30 days)


Do you have any idea why './configure CC=gcc-8 CXX=g++-8 CPPFLAGS=-I/usr/local/opt/icu4c/include LDFLAGS=-L/usr/local/opt/icu4c/lib' gives me:


checking for icu-uc >= 52.1... no

checking for icu-i18n >= 52.1... no

Thanks!
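
A quick way to narrow this down is to check whether the keg actually contains the pkg-config metadata files that the icu-uc and icu-i18n checks rely on. CPPFLAGS and LDFLAGS only affect the compiler and linker, so they would not be expected to change a pkg-config lookup. This is a minimal sketch: missing_pc_files is a hypothetical helper, and the directory is the one printed in the brew caveats above.

```shell
#!/bin/sh
# Print the name of every expected .pc file that is absent from a directory.
# $1 = pkgconfig directory, remaining arguments = module names
missing_pc_files() {
    pc_dir=$1
    shift
    for module in "$@"; do
        [ -f "$pc_dir/$module.pc" ] || echo "$module.pc"
    done
}

# Path taken from the 'brew info icu4c' caveats; adjust if your prefix differs.
missing_pc_files /usr/local/opt/icu4c/lib/pkgconfig icu-uc icu-i18n
```

No output means both files are present and the search path is the more likely culprit. Any file name printed suggests the installation itself is incomplete.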



On Monday, June 3, 2019 at 1:21:10 AM UTC-4, zdenop wrote:

Jingjing Lin

Jun 3, 2019, 12:36:01 PM
to tesseract-ocr
The above error was fixed with brew reinstall icu4c.

Now ./configure CC=gcc-8 CXX=g++-8 CPPFLAGS=-I/usr/local/opt/icu4c/include LDFLAGS=-L/usr/local/opt/icu4c/lib gives:

checking pkg-config is at least version 0.9.0... yes
checking for lept >= 1.74... yes
checking for libarchive... yes
checking for icu-uc >= 52.1... yes
checking for icu-i18n >= 52.1... yes
checking for pango >= 1.22.0... yes
checking for cairo... yes



But I am still running into another error when running 'make training':


ld: library not found for -lpango-1.0
collect2: error: ld returned 1 exit status
make[1]: *** [text2image] Error 1
make: *** [training] Error 2


Any idea what is happening? Thanks!




On Monday, June 3, 2019 at 10:35:59 AM UTC-4, Jingjing Lin wrote:

Zdenko Podobny

Jun 3, 2019, 1:48:51 PM
to tesser...@googlegroups.com
I am not familiar with Mac, but AFAIK there is no problem with compiling tesseract on Mac.

The error means that the linker is not able to find the pango-1.0 library when linking text2image. Try adding the path to the pango library to LDFLAGS.
 
Zdenko
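
As a rough illustration of that suggestion, the sketch below assembles an LDFLAGS value from a list of library directories and, where pkg-config knows about pango, prints the -L flags it recommends. ldflags_for is a hypothetical helper, and /usr/local/lib is only an assumption about where Homebrew links pango on this machine; the pkg-config output is the more reliable source.

```shell
#!/bin/sh
# Turn a list of directories into a space-separated string of -L flags.
ldflags_for() {
    flags=""
    for dir in "$@"; do
        flags="${flags:+$flags }-L$dir"
    done
    printf '%s\n' "$flags"
}

# Assumed directories: the icu4c keg from the thread plus Homebrew's shared lib dir.
LDFLAGS=$(ldflags_for /usr/local/opt/icu4c/lib /usr/local/lib)
echo "LDFLAGS=$LDFLAGS"

# If pkg-config can see pango, show the library search flags it reports.
if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists pango; then
    pkg-config --libs-only-L pango
fi
```

The resulting string can be passed on the ./configure command line in place of the single -L flag used earlier, for example LDFLAGS="$LDFLAGS".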


On Mon, Jun 3, 2019 at 18:36, Jingjing Lin <joejo...@gmail.com> wrote:
--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/d7db4028-7f44-42c5-adce-1e480b5f0089%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jingjing Lin

Jun 5, 2019, 7:37:57 AM
to tesseract-ocr
I think my make training succeeded after I manually added several library paths via LDFLAGS. Compiling no longer gives any errors. But it still does not seem to work, because the 'text2image' command is not recognized. Is there anything else I need to do after 'make training'? Thanks.
Another question: I already installed tesseract via Homebrew. I don't have to uninstall it and reinstall it here, right?

On Monday, June 3, 2019 at 1:48:51 PM UTC-4, zdenop wrote:

Jingjing Lin

Jun 5, 2019, 9:10:43 AM
to tesseract-ocr
Actually, I found that none of the following commands are available. Am I missing something?
text2image
unicharset_extractor 
set_unicharset_properties 
combine_lang_model 
lstmtraining 
lstmeval

On Wednesday, June 5, 2019 at 7:37:57 AM UTC-4, Jingjing Lin wrote:

Jingjing Lin

Jun 5, 2019, 9:26:03 AM
to tesseract-ocr
In my tesseract/src/training folder, all of these

text2image
unicharset_extractor 
set_unicharset_properties 
combine_lang_model 
lstmtraining 
lstmeval

are there, in three forms: .cpp, .o, and with no filename extension.
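
To tell "built but not installed" apart from "not built at all", a small check like the one below can help. It is a sketch, not part of the tesseract build: report_tools is a hypothetical helper, and src/training is the build directory named in this post, relative to the tesseract checkout.

```shell
#!/bin/sh
# Report, for each tool, whether it is on PATH, only present in the build
# directory, or missing altogether.
# $1 = build directory, remaining arguments = tool names
report_tools() {
    build_dir=$1
    shift
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: on PATH"
        elif [ -x "$build_dir/$tool" ]; then
            echo "$tool: built in $build_dir but not installed"
        else
            echo "$tool: not found"
        fi
    done
}

# Run from the top of the tesseract source tree.
report_tools src/training text2image unicharset_extractor \
    set_unicharset_properties combine_lang_model lstmtraining lstmeval
```

Tools reported as built but not installed can be run by path (for example ./src/training/text2image) until they are installed somewhere on the PATH.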

On Wednesday, June 5, 2019 at 9:10:43 AM UTC-4, Jingjing Lin wrote:

Shree Devi Kumar

Jun 5, 2019, 10:01:09 AM
to tesser...@googlegroups.com
If the training tools are built correctly, you should have all those programs. At least that's how it is on Linux and Windows.



--

____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

Shree Devi Kumar

Jun 6, 2019, 12:37:09 AM
to tesser...@googlegroups.com

You are probably missing the last step:

sudo make training-install

Usual build and install instructions:

git clone https://github.com/tesseract-ocr/tesseract/
cd tesseract
./autogen.sh
./configure
make
sudo make install
sudo ldconfig
make training
sudo make training-install
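
Since this thread is about macOS, note that ldconfig is a Linux tool for refreshing the shared-library cache and has no macOS equivalent command, so that step can be skipped there. The dry-run sketch below only prints the steps for a given platform instead of executing them. build_steps is a hypothetical helper, and on macOS the ./configure line would additionally need the compiler and path flags discussed earlier in the thread.

```shell
#!/bin/sh
# Print the usual build sequence for a platform without running anything.
# $1 = kernel name as printed by `uname -s` (for example Linux or Darwin)
build_steps() {
    os=$1
    echo "./autogen.sh"
    echo "./configure"
    echo "make"
    echo "sudo make install"
    # ldconfig exists on Linux only, so emit it just for that platform.
    if [ "$os" = "Linux" ]; then
        echo "sudo ldconfig"
    fi
    echo "make training"
    echo "sudo make training-install"
}

build_steps "$(uname -s)"
```

Printing the steps first makes it easy to review them before running each one by hand inside the tesseract checkout.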
