Tess4J: Fails when tasks calling createDocuments are executed in parallel

Kristóf Horváth

Feb 26, 2019, 2:19:20 AM
to tesseract-ocr
I'm trying to build a setup where I can pass in a list of entities (each holding all the information needed to run OCR on a .tiff file). For this I use Spring, with a ThreadPoolExecutor to execute my tasks in parallel.
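
A minimal sketch of this kind of setup, under stated assumptions: the class name ParallelOcrSketch, the method runAll, and the ocrStep parameter are hypothetical and not taken from the actual project. The ocrStep function is only a stand-in for the real per-file work, which in the setup described here would be the task that calls Tess4J's createDocuments. Spring wiring and the database update are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class ParallelOcrSketch {

    /**
     * Runs ocrStep once per input file on a fixed-size thread pool and
     * returns the results in the same order as the inputs.
     * ocrStep is a placeholder (assumption) for the real OCR task.
     */
    public static List<String> runAll(List<String> tiffNames, int threads,
                                      Function<String, String> ocrStep)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per file; the tasks may run concurrently.
            List<Future<String>> futures = new ArrayList<>();
            for (String name : tiffNames) {
                Callable<String> task = () -> ocrStep.apply(name);
                futures.add(pool.submit(task));
            }
            // Collect results; get() rethrows a task failure as ExecutionException.
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder step: only derives the PDF name, performs no OCR.
        List<String> pdfs = runAll(List.of("lol1.tiff", "lol.tiff"), 2,
                name -> name.replace(".tiff", ".pdf"));
        System.out.println(pdfs);
    }
}
```

Collecting each Future with get() means an exception thrown inside a task surfaces in the calling thread instead of being lost in the pool, which can help when a run fails with a different error each time.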

Environment: Windows 10, Java, Spring Framework
Executor: FixedThreadPool
Tess4J version: 4.3.1

Error messages (there are several, because a run sometimes fails with a different error and sometimes just works, so here are my findings):

splitter_.orig_pix():Error:Assert failed:in file ..\..\src\ccmain\tesseractclass.cpp, line 674
This is the most common one, and I can reproduce it.


!w_it.cycled_list():Error:Assert failed:in file ..\..\src\ccstruct\pageres.cpp, line 1351
I couldn't reproduce this one.


HIGHlol1
LOWlol
Page 1
Page 1
Detected 224 diacritics
Didn't fail
OCR is done let's move!
tmp\lol1.pdf -> C:\Users\kh\Desktop\workstuff\samples\test_out\lol1.pdf: The process cannot access the file because it is being used by another process.
C:\Users\kh\Desktop\workstuff\samples\test_out\lol1.pdf [Fatal Error] :1:167: The markup in the document following the root element must be well-formed.
C:\Users\kh\Desktop\workstuff\samples\test_out\lol1.pdf
Here is the context for this output. "HIGHlol1" means high priority, and the file is named lol1(.tiff). "Page 1" and "Detected 224 diacritics" are standard Tesseract output, as far as I know. "Didn't fail" means it did not throw any Tesseract exception (I have never gotten one, by the way). "OCR is done let's move!" means I managed to tell the database that OCR on the file has finished. After that, the program fails to move the PDF file out of the tmp folder, which is the folder where Tess4J is supposed to create the PDF. I don't know what the error after that means, but it closes the application, so it never even tries to run OCR on the second .tiff file, called lol.

To sum up, these are the errors I get when I try to execute tasks (specifically, tasks that call Tess4J's createDocuments) in parallel.