training/text2image --text=../training_texts/model.txt --outputbase=test.arial.exp0 --font='Arial Medium' --fonts_dir=~/Library/Fonts/
I installed Tesseract 3.04 from the release on GitHub:
https://github.com/tesseract-ocr/tesseract/releases
Here is the version information:
tesseract 3.04.00
leptonica-1.71
libgif 4.2.3 : libjpeg 9a : libpng 1.6.18 : libtiff 4.0.4 : zlib 1.2.8 : libwebp 0.4.3 : libopenjp2 2.1.0
It is difficult to find information about this segmentation fault 11 with Google; apparently it isn't very common. The book I want to OCR is in the Komi language, which has a rather specific but not very complicated orthography. I have lots of text in the same variant, so I thought I would try to develop a language model for the purpose.
I would appreciate any help! Can it be that I'm missing some dependency?
Best wishes,
Niko