CCExtractor version: 0.94
channel5-2018-02-12.ts from the TV Samples page
ccextractor tries to load the Tesseract traineddata from the wrong location and then blames TESSDATA_PREFIX. Here is the output it produces:
Opening file: /home/ibrahim/Downloads/channel5-2018-02-12.ts
File seems to be a transport stream, enabling TS mode
Analyzing data in general mode
Error opening data file /usr/share/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Failed TessBaseAPIInit4 -1
I checked the logic in `ocr.c` and confirmed that `probe_tessdata_location` works fine by tracing the syscalls it makes for each candidate tessdata location.
I ran `strace -e trace=openat ./ccextractor ~/Downloads/channel5-2018-02-12.ts` and the result is as follows:
Opening file: /home/ibrahim/Downloads/channel5-2018-02-12.ts
openat(AT_FDCWD, "/home/ibrahim/Downloads/channel5-2018-02-12.ts", O_RDONLY) = 3
File seems to be a transport stream, enabling TS mode
Analyzing data in general mode
openat(AT_FDCWD, "./tessdata/", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/tessdata/", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 4
openat(AT_FDCWD, "/usr/share/eng.traineddata", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/eng.traineddata", O_RDONLY) = -1 ENOENT (No such file or directory)
Error opening data file /usr/share/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Failed TessBaseAPIInit4 -1
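It checks the paths correctly and stops when it finds the directory at `/usr/share/tessdata/`, yet Tesseract then looks for `eng.traineddata` in `/usr/share/`, the parent directory. So I suspect the problem is possibly in the `TessBaseAPIInit4` call.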
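To narrow this down outside of ccextractor, here is a minimal sketch that checks which form of a directory actually holds the language file: the directory itself or its `tessdata` subdirectory. It assumes (as the trace above suggests, but I have not confirmed) that the installed Tesseract opens `<datapath>/<lang>.traineddata` directly, without appending `tessdata/`. The helper names `traineddata_readable` and `classify_datapath` are hypothetical and are not part of ccextractor or Tesseract.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not from ccextractor or Tesseract).
 * Builds "<dir>/<lang>.traineddata" in out and returns 1 if that file
 * is readable, 0 otherwise (including when out is too small). */
int traineddata_readable(const char *dir, const char *lang,
                         char *out, size_t out_len)
{
    assert(dir != NULL && lang != NULL && out != NULL);

    size_t n = strlen(dir);
    /* Avoid a doubled slash when dir already ends with one. */
    const char *sep = (n > 0 && dir[n - 1] == '/') ? "" : "/";

    int written = snprintf(out, out_len, "%s%s%s.traineddata", dir, sep, lang);
    if (written < 0 || (size_t)written >= out_len)
        return 0; /* path was truncated */

    return access(out, R_OK) == 0;
}

/* Hypothetical helper: reports where the language file lives relative to dir.
 * 1 = directly in dir, 2 = in dir/tessdata, 0 = in neither. */
int classify_datapath(const char *dir, const char *lang)
{
    char path[4096];
    char sub[4096];

    if (traineddata_readable(dir, lang, path, sizeof path))
        return 1;

    size_t n = strlen(dir);
    const char *sep = (n > 0 && dir[n - 1] == '/') ? "" : "/";
    int written = snprintf(sub, sizeof sub, "%s%stessdata", dir, sep);
    if (written < 0 || (size_t)written >= sizeof sub)
        return 0;

    if (traineddata_readable(sub, lang, path, sizeof path))
        return 2;

    return 0;
}
```

On my setup I would expect `classify_datapath("/usr/share", "eng")` to return 2 and `classify_datapath("/usr/share/tessdata", "eng")` to return 1. If so, that would be consistent with the parent directory being handed to Tesseract instead of the `tessdata` directory itself.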