Hey folks, I downloaded Tesseract tonight and I'm having an issue I can't get past. The error output is as follows:
Deserialize header failed: ☺
First document cannot be empty!!
num_pages_per_doc_ > 0:Error:Assert failed:in file ../../../src/ccstruct/imagedata.cpp, line 704
I am using a TIFF file as my raw image source. I have tried two different methods of generating the .tif file. The first is taking a screenshot with Snipping Tool, pasting it into GIMP, and saving it as a .tif. I also tried Print Screen instead of Snipping Tool. The second is taking a screenshot with Snipping Tool, saving it as a .png, then converting it to .tif with the ImageMagick command line. I am creating the box file like so:
tesseract 9.tif 9 makebox
I then edit the box file to make sure it is an accurate representation of the characters on the screen. I have also tried creating the box file and leaving it unedited to see if that resolves the issue; it does not. I then create the lstmf file like so:
tesseract 9.tif 9 --psm 6 lstm.train
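For anyone reproducing this, the two commands above can be scripted so that a failure in either step is caught before lstmtraining ever runs. This is a minimal sketch in Python, not part of the original workflow. The helper names (`tesseract_commands`, `lstmf_status`, `run_pipeline`) are made up for illustration, and it assumes `tesseract` is on the PATH and that the output lands in `<stem>.box` and `<stem>.lstmf` next to the image.

```python
import subprocess
from pathlib import Path


def tesseract_commands(stem: str) -> list[list[str]]:
    """Build the two command lines from the post for the image <stem>.tif."""
    image = f"{stem}.tif"
    return [
        ["tesseract", image, stem, "makebox"],                   # writes <stem>.box
        ["tesseract", image, stem, "--psm", "6", "lstm.train"],  # writes <stem>.lstmf
    ]


def lstmf_status(path: Path) -> str:
    """Report whether the lstmf file is missing, empty, or has some content."""
    if not path.exists():
        return "missing"
    size = path.stat().st_size
    return "empty" if size == 0 else f"{size} bytes"


def run_pipeline(stem: str) -> str:
    """Run both commands in order, then report on <stem>.lstmf."""
    for cmd in tesseract_commands(stem):
        # check=True raises CalledProcessError if tesseract exits non-zero
        subprocess.run(cmd, check=True)
    return lstmf_status(Path(f"{stem}.lstmf"))
```

Calling `run_pipeline("9")` would run both steps back to back, so it skips the manual box-file edit described above. A result of "missing" or "empty" would point at the lstm.train step rather than at lstmtraining. A non-empty file is not proof that it contains a usable page.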
I then try to run lstmtraining or lstmeval and I get the header error every time. I am using version 5.3.3, but I have also tried v4.1, recreating all the files, and I still got the same issue. Does anyone know why I'm getting this error and how to resolve it? I'm about to give up on Tesseract because this shit does not work out of the box. I am following Google's instructions to a T, so either I overlooked something crucial that is ruining my lstmf file or this just does not work for me. I'd appreciate any help.
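One thing worth ruling out, offered as an assumption rather than a diagnosis: `lstmtraining --train_listfile` and `lstmeval --eval_listfile` take a plain-text file that lists .lstmf paths, one per line, not the .lstmf file itself. The post does not show how lstmtraining was invoked, so this may not apply here. The sketch below writes such a list file and does a rough check that a given file looks like one. The function names and the `list.train` file name are hypothetical.

```python
from pathlib import Path


def write_listfile(lstmf_paths: list[str], listfile: str) -> None:
    """Write one lstmf path per line, using plain LF line endings."""
    with open(listfile, "w", encoding="utf-8", newline="\n") as fh:
        for p in lstmf_paths:
            fh.write(f"{p}\n")


def looks_like_listfile(path: str) -> bool:
    """Heuristic: the file decodes as UTF-8 and every non-blank line ends in .lstmf."""
    try:
        text = Path(path).read_text(encoding="utf-8")
    except UnicodeDecodeError:
        return False  # binary content, e.g. an lstmf file passed by mistake
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return bool(lines) and all(ln.endswith(".lstmf") for ln in lines)
```

For example, `write_listfile(["9.lstmf"], "list.train")` would produce a one-line text file, and `list.train` would then be the value passed to `--train_listfile` or `--eval_listfile`.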