We've done this for customers before; we sub-contracted the microfilm scanning to
a specialist company, who returned TIFF images on tape. We then segmented the page
bitmaps into text/image regions, OCRed the text regions using the ScanWorX ICR, and
proof-checked the text (we used our own in-house line-by-line proof-checking tool to
maximise speed and accuracy but, depending on the accuracy you require, a quick
visual check might suffice). We usually go on to apply SGML markup to the text, but
what you do at that stage depends very much on how you plan to use the
electronic documents once you've got them.
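
The proof-checking and markup steps described above could be sketched roughly as
follows. This is only an illustration in Python, not the in-house tool or the
ScanWorX interface: the Region and OcrLine types, the per-line confidence score,
the 0.9 threshold, and the element names page, p and figure are all assumptions
made for the example. A real job would take its element names from whatever DTD
is chosen.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class OcrLine:
    text: str
    confidence: float  # hypothetical 0.0-1.0 score reported by the OCR engine


@dataclass
class Region:
    kind: str  # "text" or "image", as produced by page segmentation
    lines: List[OcrLine]


def lines_needing_review(regions: List[Region],
                         threshold: float = 0.9) -> List[Tuple[int, int]]:
    """Return (region index, line index) for each text line scoring below threshold."""
    flagged = []
    for r_idx, region in enumerate(regions):
        if region.kind != "text":
            continue  # image regions carry no OCR text to proof-check
        for l_idx, line in enumerate(region.lines):
            if line.confidence < threshold:
                flagged.append((r_idx, l_idx))
    return flagged


def escape_sgml(text: str) -> str:
    """Escape the characters that would otherwise be read as markup."""
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def to_sgml(regions: List[Region]) -> str:
    """Wrap each text region in <p> and mark each image region with <figure>.

    Element names are illustrative; <figure> is written without an end tag,
    as it would be if the DTD declared it EMPTY.
    """
    parts = ["<page>"]
    for region in regions:
        if region.kind == "text":
            body = " ".join(escape_sgml(line.text) for line in region.lines)
            parts.append(f"<p>{body}</p>")
        else:
            parts.append("<figure>")
    parts.append("</page>")
    return "\n".join(parts)
```

Raising the threshold sends more lines to a human checker; lowering it moves
towards the quick visual check mentioned above.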
Hope that helps a bit...
Hugh Stabler, Software Development Manager
Rank Xerox Business Services
Document Technology Centre
Beech House, Mitcheldean, Gloucestershire, GL17 0DD, UK.