--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscribe@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/73f12a10-45d4-4879-9d62-456dd5dd3abf%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
The gaps in some of the characters are probably too large for Tesseract to identify them properly. I'd be tempted to use the parts of the characters where the segments are connected and infer the digits from their positions: for example, train the top and bottom of the zero as separate characters, then identify a zero when one sits above the other. I wonder if something like OpenCV would be a better tool in this case.
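To get a feel for how badly the gaps fragment a digit, one can count connected components before and after bridging small gaps. What follows is a minimal pure-Python sketch, not anything Tesseract does internally: the bitmap, the function names and the 3x3 neighbourhood are all made up for illustration. With OpenCV, the bridging step would normally be done with `cv2.morphologyEx` and `cv2.MORPH_CLOSE`.

```python
from collections import deque

def parse(rows):
    """Turn rows of '#'/'.' text into a 0/1 bitmap (hypothetical input format)."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]

def count_components(grid):
    """Count 8-connected groups of set pixels using a breadth-first flood fill."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c] or seen[r][c]:
                continue
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count

def _morph(grid, radius, combine):
    """Apply any() (dilation) or all() (erosion) over a square window.

    Pixels outside the bitmap count as unset, so the glyph needs a blank
    margin of at least `radius` pixels on every side.
    """
    rows, cols = len(grid), len(grid[0])
    offsets = range(-radius, radius + 1)
    return [[int(combine(0 <= r + dy < rows and 0 <= c + dx < cols
                         and grid[r + dy][c + dx]
                         for dy in offsets for dx in offsets))
             for c in range(cols)] for r in range(rows)]

def close_gaps(grid, radius=1):
    """Morphological closing: dilate, then erode, to bridge small gaps."""
    return _morph(_morph(grid, radius, any), radius, all)

# A made-up segmented "0" with one-pixel gaps between its six segments.
glyph = parse([
    ".........",
    ".#######.",
    ".........",
    ".#.....#.",
    ".#.....#.",
    ".........",
    ".#.....#.",
    ".#.....#.",
    ".........",
    ".#######.",
    ".........",
])

fragments_before = count_components(glyph)
fragments_after = count_components(close_gaps(glyph))
```

Here the raw glyph falls apart into six pieces, while the closed one is a single outline. A 3x3 window only bridges one-pixel gaps along a row or column; wider or diagonal gaps would need a larger radius, at the risk of merging neighbouring digits.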
art
ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
On Wed, Feb 8, 2017 at 12:20 AM, <rob...@inkrh.com> wrote:
Hello,

I am running into real difficulty getting Tesseract to work with a faux segmented display.

At the moment I am processing a video frame by frame, removing the background and replacing the character's color to leave images like the attached.

I have spent a long time training both using a set of the actual images produced by the above, as well as using tifs of LCD fonts that match, setting the expected format, whitelisting the characters expected and still am seeing no improvement in the recognition success - at best it is around 30% success, at worst 0%.

I have also tried using SSOCR (https://www.unix-ag.uni-kl.de/~auerswal/ssocr/) without any success (0 digits recognized from the above), and exploring all the different settings of tesseract and SSOCR.

Is there any advice for getting these characters recognized consistently? My target is to have the characters recognized with at least a 75% success rate.
It’s worth trying. A “character” is typically a set of connected line segments; in fact, I think LED display tools typically try to close the gaps between the segments. When you do font training, Tesseract’s tools tell you how it interprets the characters. I would be tempted to take the “0” from the image and see whether the “makebox” step identifies one box or two. OpenCV would indeed be more of a shape recognition exercise, specifically template matching; there’s a nice example here [1]. I have found template matching worthwhile for a limited set of characters or symbols, which is why a number sequence might be a candidate, but a lot depends on the consistency of the display. I tried something like this for a set of handwritten diaries, but the variations in the letters pushed me towards OpenIMAJ [2].
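As a rough illustration of the template matching idea, here is a small pure-Python sketch that slides each digit template over a binarized frame and keeps the label with the best pixel agreement. The bitmaps, the scoring rule and the function names are assumptions made up for this example; with OpenCV the sliding-window step would normally be `cv2.matchTemplate`, as in the article at [1].

```python
def parse(rows):
    """Turn rows of '#'/'.' text into a 0/1 bitmap (hypothetical input format)."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]

def match_score(image, template, top, left):
    """Fraction of template pixels that agree with the image window at (top, left)."""
    h, w = len(template), len(template[0])
    agree = sum(image[top + y][left + x] == template[y][x]
                for y in range(h) for x in range(w))
    return agree / (h * w)

def best_match(image, template):
    """Slide the template over the image; return (score, top, left) of the best fit."""
    h, w = len(template), len(template[0])
    best = (-1.0, 0, 0)
    for top in range(len(image) - h + 1):
        for left in range(len(image[0]) - w + 1):
            score = match_score(image, template, top, left)
            if score > best[0]:
                best = (score, top, left)
    return best

def classify(image, templates):
    """Return the label whose template fits best anywhere in the image."""
    return max(templates, key=lambda label: best_match(image, templates[label])[0])

# Made-up 5x3 templates for three digits.
templates = {
    "0": parse(["###", "#.#", "#.#", "#.#", "###"]),
    "1": parse(["..#", "..#", "..#", "..#", "..#"]),
    "7": parse(["###", "..#", "..#", "..#", "..#"]),
}

# A made-up frame containing a "7" with one pixel missing (a segment gap).
frame = parse([
    "......",
    "..###.",
    "....#.",
    "......",
    "....#.",
    "....#.",
    "......",
])

label = classify(frame, templates)
score, top, left = best_match(frame, templates[label])
```

On this frame the "7" template fits best despite the missing pixel, which is the appeal of the approach for a small, fixed set of symbols. It assumes the digits appear at the same scale as the templates, so it only holds up if the display is consistent from frame to frame.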
art
---
1. http://www.pyimagesearch.com/2015/01/26/multi-scale-template-matching-using-python-opencv/