BH
Hello,
I am an experienced programmer, but an absolute newbie to OCR, document analysis, and computer optical recognition in general.
Desired Effect (workflow I'm trying to program)
- Dynamically generate a form intended to be printed and filled out IRL
- Scan the completed form and obtain its data
Type of Data
- Highest Priority: Checkboxes filled in by pen, pencil, marker, etc., and marked with a check mark, an X, a diagonal strike, etc.
- Optional: Handwritten numbers, circled options
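To make the checkbox case concrete, here is a minimal sketch of how I imagine deciding whether a box is marked: count the dark pixels inside the box interior and compare the fraction to a cutoff. Everything here is an assumption on my part (the names `dark_ratio` and `is_checked`, the grayscale image stored as a list of rows, and the `threshold` / `min_ratio` values), and it is plain Python rather than any particular OCR library.

```python
def dark_ratio(image, box, threshold=128):
    """Return the fraction of pixels darker than `threshold` inside `box`.

    `image` is a list of rows of grayscale values (0 = black, 255 = white).
    `box` is (left, top, right, bottom) in pixels, right/bottom exclusive,
    and should cover the interior of the checkbox, not its printed border.
    """
    left, top, right, bottom = box
    total = 0
    dark = 0
    for y in range(top, bottom):
        for x in range(left, right):
            total += 1
            if image[y][x] < threshold:
                dark += 1
    return dark / total if total else 0.0


def is_checked(image, box, min_ratio=0.08, threshold=128):
    """Treat the box as marked when enough of its interior is dark.

    `min_ratio` is a guess; it would need tuning on real scans.
    """
    return dark_ratio(image, box, threshold) >= min_ratio


# Synthetic 20x20 white "scans": one left blank, one with an X drawn in the box.
box = (5, 5, 15, 15)
blank = [[255] * 20 for _ in range(20)]
marked = [[255] * 20 for _ in range(20)]
for i in range(10):
    marked[5 + i][5 + i] = 0    # one diagonal of the X
    marked[5 + i][14 - i] = 0   # the other diagonal

print(is_checked(blank, box), is_checked(marked, box))
```

A ratio test like this does not care whether the mark is a check mark, an X, or a strike, which is why it seems like a reasonable starting point for the highest-priority data.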
Theoretical Coding Solution
- When generating a form, store layout / coordinate information of form elements
- Place recognizable anchors (rotated 'L's or '+' symbols) at the corners of the printed page to define a known rectangular area
- Print a bar-code or numeric identifier at pre-defined coordinates in the rectangle area
- Obtain data out of form elements using layout/format information & coordinates previously stored for this identified form
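The steps above can be sketched roughly as follows. This assumes the anchors have already been located in the scan and the identifier has already been decoded (both done elsewhere), and that field positions were stored as fractions of the anchor rectangle when the form was generated. The names `LAYOUTS`, `form_to_scan`, and `locate_fields`, the form ID, and the field names are all hypothetical. The mapping uses bilinear interpolation between the four anchors, which handles shift, scale, rotation, and skew; a photographed (rather than flatbed-scanned) page would probably need a full perspective transform instead.

```python
# Layout stored at generation time, keyed by the printed identifier.
# Each field box is (left, top, right, bottom) as fractions of the
# rectangle spanned by the four corner anchors.
LAYOUTS = {
    "form-0001": {
        "agree": (0.10, 0.20, 0.15, 0.25),
        "subscribe": (0.10, 0.30, 0.15, 0.35),
    }
}


def form_to_scan(u, v, anchors):
    """Map form coordinates (u, v in 0..1) to scan pixel coordinates.

    `anchors` holds the detected pixel positions of the corner marks
    under the keys 'tl', 'tr', 'bl', 'br'.
    """
    (x_tl, y_tl), (x_tr, y_tr) = anchors["tl"], anchors["tr"]
    (x_bl, y_bl), (x_br, y_br) = anchors["bl"], anchors["br"]
    # Interpolate along the top and bottom edges, then between them.
    x_top = x_tl + u * (x_tr - x_tl)
    y_top = y_tl + u * (y_tr - y_tl)
    x_bot = x_bl + u * (x_br - x_bl)
    y_bot = y_bl + u * (y_br - y_bl)
    return (x_top + v * (x_bot - x_top), y_top + v * (y_bot - y_top))


def locate_fields(form_id, anchors):
    """Return {field name: pixel box} for the identified form.

    The pixel box is built from the mapped top-left and bottom-right
    corners, so it is only approximate if the scan is strongly rotated.
    """
    boxes = {}
    for name, (left, top, right, bottom) in LAYOUTS[form_id].items():
        x0, y0 = form_to_scan(left, top, anchors)
        x1, y1 = form_to_scan(right, bottom, anchors)
        boxes[name] = (round(x0), round(y0), round(x1), round(y1))
    return boxes


# Example: anchors found in a scan where the form is offset and scaled.
anchors = {"tl": (100, 50), "tr": (1100, 50), "bl": (100, 1550), "br": (1100, 1550)}
fields = locate_fields("form-0001", anchors)
print(fields["agree"])
```

Each resulting pixel box could then be cropped from the scan and examined on its own, so the recognition step never has to search the whole page.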
Bottom line: Is this possible? If so, how would I do it? What do I need to learn in order to get to a point where I know how to use OCRopus (or other libraries) to achieve these results?
Related Links (describe some technical aspects & bits of theoretical solutions, but no practical road-map of how to actualize this)