Hi everyone,
I am developing a Java application to vectorize a raster image. One of the steps is symbol recognition, and I was hoping to train Tesseract to find the symbols and return their pixel coordinates.
My questions are:
1) Is it possible to make a dictionary of symbols, so that Tesseract does not detect letters from the English dictionary instead?
2) What steps should I perform?
I managed to make a box file for my training image, but in a later step I get an "Empty page!" error.
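In case it helps to rule out a coordinate mix-up, here is a minimal sketch of how I understand the box file format: each line is "symbol left bottom right top page", with the origin at the bottom-left corner of the image, so the y values have to be flipped to get the usual top-left pixel coordinates. The class name BoxFileReader, the helper SymbolBox, and the sample lines are made up for illustration and are not part of Tesseract or of my actual project.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: read Tesseract box-file lines and convert them to top-left pixel rectangles. */
class BoxFileReader {

    /** Hypothetical holder for one symbol and its box in image coordinates (origin top-left). */
    static final class SymbolBox {
        final String symbol;
        final int x, y, width, height;

        SymbolBox(String symbol, int x, int y, int width, int height) {
            this.symbol = symbol;
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }

        @Override
        public String toString() {
            return symbol + " @ (" + x + ", " + y + ") " + width + "x" + height;
        }
    }

    /** Parses one line of the form "symbol left bottom right top [page]". */
    static SymbolBox parseLine(String line, int imageHeight) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length < 5) {
            throw new IllegalArgumentException("Malformed box line: " + line);
        }
        int left = Integer.parseInt(parts[1]);
        int bottom = Integer.parseInt(parts[2]);
        int right = Integer.parseInt(parts[3]);
        int top = Integer.parseInt(parts[4]);
        // A box that is empty or sticks out above the image is rejected here.
        if (right <= left || top <= bottom || top > imageHeight) {
            throw new IllegalArgumentException("Box outside image or empty: " + line);
        }
        // Box files measure y upwards from the bottom edge, so flip it.
        return new SymbolBox(parts[0], left, imageHeight - top, right - left, top - bottom);
    }

    /** Parses all non-blank lines of a box file. */
    static List<SymbolBox> parse(List<String> lines, int imageHeight) {
        List<SymbolBox> boxes = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                boxes.add(parseLine(line, imageHeight));
            }
        }
        return boxes;
    }

    public static void main(String[] args) {
        // Made-up sample lines for an image that is 100 pixels high.
        List<String> lines = new ArrayList<>();
        lines.add("A 10 20 30 50 0");
        lines.add("B 40 20 60 50 0");
        for (SymbolBox box : parse(lines, 100)) {
            System.out.println(box);
        }
    }
}
```

Lines that fail the check would be the first ones I look at, although I do not know whether that is what actually causes the error.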
I would be glad for any suggestions,
Piotrek
Here is a sample image of tree symbols that I would like to train on to check whether it works:
https://gyazo.com/85a1db80f92f2df44625875bcf20d37d