Hi community,
I have been experimenting with the engine and have run into issues with some images. I am using computer-generated bitmaps of diagrams that I create and that change regularly.
The issue is that the text, which is numeric, is either not recognized or recognized incorrectly (not by much, but enough to matter).
Attached is an example image. It shows 13.00%, which is sometimes recognized as I3.00%, I 3.00X, or I3.0096.
I can understand why this occurs, since these characters look similar to the engine. When I increase the image size, recognition improves, which is expected and consistent with the optimization documentation: the recommended resolution is 300 DPI.
I would like some guidance on any flags or similar settings, or even a specialized numeric traineddata file, that could help in this regard.
Any advice, tips, or a guide to making better use of the engine would be appreciated.
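For reference, the scale factor needed to reach that resolution can be estimated with simple arithmetic. The sketch below is only illustrative: `BlowupFactor` is a hypothetical helper, and the 96 DPI source resolution in the usage note is an assumption about screen-generated bitmaps, not something measured from the attached image.

```csharp
using System;

static class OcrScaling
{
    // Hypothetical helper: smallest whole-number factor that brings an image
    // rendered at sourceDpi up to at least targetDpi (300 by default).
    public static int BlowupFactor(double sourceDpi, double targetDpi = 300.0)
    {
        if (sourceDpi <= 0)
            throw new ArgumentOutOfRangeException(nameof(sourceDpi));

        // Never scale down: images already at or above the target keep factor 1.
        return Math.Max(1, (int)Math.Ceiling(targetDpi / sourceDpi));
    }
}
```

Under the 96 DPI assumption, `OcrScaling.BlowupFactor(96)` gives 4, since 300 / 96 = 3.125 is rounded up. The same value could be passed as both blowupW and blowupH to keep the aspect ratio.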
Thanks.
PS. Current code:
engine = new TesseractEngine(@"./tessdata", "eng", EngineMode.TesseractOnly, "config");

private string Decypher_add_entries(Bitmap bitmap, int blowupW, int blowupH)
{
    // Enlarge the bitmap before OCR; larger glyphs are recognized more reliably.
    bitmap = ResizeImage(bitmap, bitmap.Width * blowupW, bitmap.Height * blowupH);

    // Run the engine on the enlarged bitmap and return the recognized text.
    using (var page = engine.Process(bitmap))
    {
        return page.GetText();
    }
}
I might not be using all the available options that could help. That is all the code in my implementation, which is a fairly simple 3-4 lines.
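To make the question about flags concrete, here is a minimal sketch of the kind of setting being asked about. It assumes the .NET wrapper's `SetVariable` and `DefaultPageSegMode` members and Tesseract's `tessedit_char_whitelist` variable; `NumericOcr` and `ReadPercentage` are hypothetical names, and whether this fixes the I/1 and X/% confusion on these images is untested.

```csharp
using System.Drawing;
using Tesseract;

static class NumericOcr
{
    // Hypothetical helper, not part of the wrapper.
    public static string ReadPercentage(Bitmap bitmap)
    {
        using (var engine = new TesseractEngine(@"./tessdata", "eng", EngineMode.TesseractOnly))
        {
            // Assumption: the field only ever contains digits, a dot and a percent sign.
            engine.SetVariable("tessedit_char_whitelist", "0123456789.%");

            // Assumption: each bitmap holds a single line of text.
            engine.DefaultPageSegMode = PageSegMode.SingleLine;

            using (var page = engine.Process(bitmap))
            {
                return page.GetText().Trim();
            }
        }
    }
}
```

Creating the engine once and reusing it, as in the code above the sketch, would likely be cheaper than building it per call; it is constructed inline here only to keep the example in one piece.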