Issue 796 in tesseract-ocr: Loading .user-words doesn't work on Windows

tesser...@googlecode.com

Nov 16, 2012, 7:38:27 PM
to tesserac...@googlegroups.com
Status: New
Owner: ----

New issue 796 by piotr.fo...@gmail.com: Loading .user-words doesn't work on
Windows
http://code.google.com/p/tesseract-ocr/issues/detail?id=796

What steps will reproduce the problem?
1. Set user_words_suffix to user-words in config_file
2. Create eng.user-words file in tessdata directory
3. Run tesseract.exe in.png out.txt -l eng config_file

What is the expected output? What do you see instead?
eng.user-words should be loaded and used. Instead, this error message is
printed: "Could not open file, .\Tesseract-OCR\tessdata\eng.user-words".

What version of the product are you using? On what operating system?
tesseract 3.02
leptonica-1.68 (Mar 14 2011, 10:43:03) [MSC v.1500 LIB Release 32 bit]
libgif 4.1.6 : libjpeg 8c : libpng 1.4.3 : libtiff 3.9.4 : zlib 1.2.5

System is Windows 7 x86 with all updates.

tesser...@googlecode.com

Nov 18, 2012, 3:57:40 PM
to tesserac...@googlegroups.com
Updates:
Labels: OpSys-Windows Type-Other

Comment #1 on issue 796 by zde...@gmail.com: Loading .user-words doesn't
Does 'tesseract.exe in.png out.txt -l eng' work?
I think relative paths do not work on Windows. Try setting the full path.

tesser...@googlecode.com

Nov 19, 2012, 3:48:40 AM
to tesserac...@googlegroups.com

Comment #2 on issue 796 by piotr.fo...@gmail.com: Loading .user-words
Yes, 'tesseract.exe in.png out.txt -l eng' does work. Sadly, changing the
path to an absolute one doesn't help; I get the same error.

tesser...@googlecode.com

Nov 19, 2012, 9:43:03 AM
to tesserac...@googlegroups.com

Comment #4 on issue 796 by zde...@gmail.com: Loading .user-words doesn't
Please post your config_file.

tesser...@googlecode.com

Nov 30, 2012, 5:31:29 PM
to tesserac...@googlegroups.com
Updates:
Status: WorksForMe

Comment #5 on issue 796 by zde...@gmail.com: Loading .user-words doesn't
Works for me:

echo %TESSDATA_PREFIX%
C:\Program Files\Tesseract-OCR\

config_file: "c:\Program Files\Tesseract-OCR\tessdata\configs\config_file"
user word: "c:\Program Files\Tesseract-OCR\tessdata\deu.user-words"

Tested on Windows XP SP3, 32bit with these commands without error:
tesseract phototest.tif phototest -l deu
tesseract phototest.tif phototest-1 -l deu config_file

Attachments:
config_file 28 bytes
deu.user-words 20 bytes
phototest.txt 287 bytes
phototest-1.txt 287 bytes

tesser...@googlecode.com

Dec 15, 2012, 6:54:01 PM
to tesserac...@googlegroups.com

Comment #6 on issue 796 by mariuszr...@gmail.com: Loading .user-words
It seems this error occurs when the config file has DOS line endings (CRLF)
instead of Unix ones (LF only).
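
If that is the cause, converting the config file to LF-only line endings
should be enough to work around it. Below is a minimal Python sketch of such
a conversion. It was not posted in this thread; the function names
(has_crlf, to_unix_line_endings, normalize_file) are made up for
illustration, and it only assumes the file is small enough to read into
memory.

```python
from pathlib import Path


def has_crlf(data: bytes) -> bool:
    """Return True if the data contains at least one DOS (CRLF) line ending."""
    return b"\r\n" in data


def to_unix_line_endings(data: bytes) -> bytes:
    """Replace every CRLF pair with a single LF; existing LF-only lines are untouched."""
    return data.replace(b"\r\n", b"\n")


def normalize_file(path: Path) -> bool:
    """Rewrite the file in place with LF endings.

    Returns True if the file was changed, False if it had no CRLF endings.
    Hypothetical helper for illustration, not part of Tesseract.
    """
    original = path.read_bytes()  # bytes mode, so Python does no newline translation
    if not has_crlf(original):
        return False
    path.write_bytes(to_unix_line_endings(original))
    return True
```

For example, normalize_file(Path("config_file")) would rewrite a config file
in the current directory and return True if it contained CRLF endings.
Working on bytes rather than text avoids Python's own newline translation on
Windows, which would otherwise write CRLF back out.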
