proc sgs kernel hung task timeout secs
However, now using my config
tesseract term2onlyalpha.jpg term2onlyalpha liu
Yields the correct result
proc sys kernel hung task timeout secs
Just to check which part of the config was doing the work, I commented out user_words_suffix and the result went back to sgs. So I commented it back in and turned load_system_dawg back on, and the result stayed correct.
SO
I have a correct result out of tesseract but only when I doctored the input image to remove slashes.
I don't know how important those slashes are to you, but you COULD try replacing the slashes in your text if you are able to intercept it before it becomes an image that goes to Tesseract. If you need slashes, or some notion of slashes, you could try a different separator character, such as an underscore, that does not confuse Tesseract while it's parsing the characters.
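As a rough sketch of that interception step, assuming your text is available as a Python string before it gets rendered to an image, something like the following would do the swap. The function name replace_slashes is just made up for illustration, and the underscore is only one possible separator (if your text already contains underscores you would want to pick something else).

```python
def replace_slashes(text: str, separator: str = "_") -> str:
    """Return text with every forward slash replaced by separator."""
    # Hypothetical helper: the separator is an assumption, pick any
    # character that Tesseract reads reliably in your font.
    return text.replace("/", separator)


print(replace_slashes("/proc/sys/kernel"))  # prints _proc_sys_kernel
```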
OR
You could try and see if modifying tesseract's internal engine parameters will get you anywhere. I tweaked ONE parameter with no impact but there are a great many.
If you run
tesseract --print-parameters
You will get a list of all the parameters you can put in your config file which could make a difference to your image as it is.
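If you end up trying lots of parameter combinations, it may help to generate the config file rather than edit it by hand. A minimal sketch, assuming the usual config layout of one "name value" pair per line. The helper render_config is hypothetical, and the values shown (including the user-words suffix) are placeholders, not the contents of my liu config.

```python
def render_config(params: dict) -> str:
    """Render parameters as 'name value' lines, one per line."""
    lines = []
    for name, value in params.items():
        # Write booleans as T / F, one of the spellings Tesseract accepts.
        if isinstance(value, bool):
            value = "T" if value else "F"
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only.
example = render_config({
    "load_system_dawg": True,
    "user_words_suffix": "user-words",
})
print(example, end="")
```

You would write the returned string to a file and pass that file name as the last argument on the tesseract command line, the same way liu is passed above.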
Another tip is that you can get debug output from tesseract as it's running. To find the relevant parameters do:
tesseract --print-parameters | grep debug
Then you can use them with your tesseract runs. For instance I tried word_to_debug=sys: when it was failing I got NO output from this, but when it was working (with the image with slashes removed) I got output:
tesseract -c word_to_debug=sys term2onlyalpha.jpg term2onlyalpha liu
Best Raw Choice : sgs : R=47.8142, C=-3.75406, F=1, Perm=2, xht=[12.2215,15.2019], ambig=0
pos NORM NORM NORM
str s g s
state: 1 1 1
C -3.377 -3.754 -3.095
etc...
So this lets you see some of the values being used when it works, and then you can go back to the full print-parameters list and modify some of those options to see if you can get it working.
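If you want to compare those values across several runs, you could pull them out of the debug text with a small script. This is only a sketch based on the single "Best Raw Choice" line shown above. I am assuming the format stays the same for other words, and parse_best_raw_choice is a made-up helper name.

```python
import re


def parse_best_raw_choice(line: str):
    """Parse one 'Best Raw Choice' debug line, or return None if it is not one."""
    match = re.match(r"\s*Best Raw Choice\s*:\s*(\S+)\s*:\s*(.*)", line)
    if match is None:
        return None
    word, rest = match.groups()
    # Values are kept as strings; a bracketed value such as xht=[a,b]
    # is captured whole instead of being split at its inner comma.
    fields = dict(re.findall(r"(\w+)=(\[[^\]]*\]|[^,\s]+)", rest))
    return {"word": word, "fields": fields}


sample = ("Best Raw Choice : sgs : R=47.8142, C=-3.75406, F=1, Perm=2, "
          "xht=[12.2215,15.2019], ambig=0")
parsed = parse_best_raw_choice(sample)
print(parsed["word"], parsed["fields"]["R"], parsed["fields"]["xht"])
```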
For some of the debugging you will need the visual debugging tool, ScrollView.jar:
https://github.com/tesseract-ocr/tesseract/wiki/ViewerDebugging
Which will start showing you cool stuff like segmentation (looks OK)


Moral of the story: there is work ahead for you to understand which engine params you can change to tweak this gremlin out. The slashes seem to cause some kind of problem for tesseract around the y and g, and I also noticed proc vs prec sometimes.
Cheers
--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscribe@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/26291430-30f8-4920-9e63-7296245f8759%40googlegroups.com.