--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/7133178a-47d9-46bc-998e-a2471ee94413%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Hi,

Normalize the images so that the average brightness is the same; that makes the pre-processing easier to fine-tune. Also use connected-component analysis to estimate the text size, and normalize the image sizes too. For example, a blur of 5 pixels has a very different effect on a small image than on a large one.

Then try CLAHE to get even illumination, and difference of Gaussians (40,1 in GIMP) or divide (in GIMP: duplicate the layer, invert the upper one, set it to divide mode). Keep all the things you already tried, too.

To remove glare, you could simply take the clipped pixels and set them to the average value of the picture.

If you can take multiple pictures, you can merge them to remove glare and reduce noise.

Straighten the text: https://www.pyimagesearch.com/2017/02/20/text-skew-correction-opencv-python/ and apply automatic perspective correction based on straight lines (once you have the rectangle containing the text, you can use OpenCV findHomography/warpPerspective to fix it).

After all of this, do a little fine-tuning of the Tesseract model (ocrd-train) to limit the charset to the characters you are using, and to adapt it to the font (the 5, 9, G and R, for example, are quite special) and to the pre-processing output style. That way you do not need perfectly pure black-on-white text, just something that is close but consistent.

Prepare some test data and a small script to quickly see whether what you are doing improves the average results or not. When you take the pictures, take several of each plate.

Study the failing images to understand why they fail (glare, noise, strange characters, rotation, etc.), see if you can fix the problem in GIMP, then transpose the fix to code.

Bye,
Lorenzo
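
The brightness normalization and the simple glare fix described above could be sketched roughly as follows. This is only a minimal illustration with NumPy, assuming 8-bit grayscale images; the function names, the target mean of 128 and the clip level of 250 are hypothetical choices, not values from this thread.

```python
import numpy as np


def normalize_brightness(gray, target_mean=128.0):
    """Scale an 8-bit grayscale image so its mean brightness is about target_mean.

    target_mean is an assumed value; pick whatever suits your images.
    Values that would exceed 255 are clipped, which can lower the final mean.
    """
    data = gray.astype(np.float64)
    mean = data.mean()
    if mean == 0:
        # A completely black image cannot be rescaled; return it unchanged.
        return gray.copy()
    scaled = data * (target_mean / mean)
    return np.clip(scaled, 0, 255).astype(np.uint8)


def fill_clipped(gray, clip_level=250):
    """Replace clipped (saturated) pixels with the average value of the picture.

    clip_level is an assumed threshold for what counts as "clipped".
    """
    out = gray.copy()
    mask = gray >= clip_level
    if mask.any():
        out[mask] = int(round(float(gray.mean())))
    return out
```

Running `fill_clipped` before `normalize_brightness` keeps the glare pixels from skewing the mean that the normalization is based on, but the best order is worth checking on your own test images.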
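
For the "small script" that checks whether a change improves the average results, one possible sketch is to compute a mean character error rate over the test data. Everything here is an assumption for illustration: the function names are hypothetical, the metric (edit distance divided by the length of the expected text) is just one common choice, and the OCR call itself is left out, so the script only compares expected strings with whatever text your pipeline produced.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def mean_char_error_rate(pairs):
    """Average character error rate over (expected_text, ocr_text) pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no test data given")
    rates = [edit_distance(expected, got) / max(len(expected), 1)
             for expected, got in pairs]
    return sum(rates) / len(rates)
```

Run it once per pre-processing variant on the same set of plates and keep the variant with the lower value; with several pictures per plate, each picture simply becomes one more pair.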
On Mon, 27 May 2019 at 08:24, Bhamare Harshal <harshal...@gmail.com> wrote:
Hi All,

We need to do OCR on stamped or engraved metal plates, but after some image processing (grayscaling, thresholding, dilation, Canny edge detection) we are still not getting perfect OCR output on the stamped metal.

Thanks, please help me.