OCR from microfilm: how?


PEI Crafts Council

Nov 17, 1994, 11:01:42 AM
I'm interested in hearing from anyone who knows anything about converting
microfilmed archival newspapers to digital information through some sort of
optical character recognition process.



Hugh Stabler

Nov 22, 1994, 11:59:47 AM

We've done this for customers before. We sub-contracted the microfilm scanning to
a specialist company, who returned TIFF images on tape. We then segmented the page
bitmaps into text and image regions, OCRed the text regions using the ScanWorX ICR,
and proof-checked the text. (We used our own in-house line-by-line proof-checking
tool to maximise speed and accuracy, but depending on the accuracy you require, a
quick visual check might suffice.) We usually go on to apply SGML markup to the
text, but what you would want to do depends very much on what you plan to do with
the electronic documents once you've got them.
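
As an illustration of the segmentation step only, here is a minimal pure-Python sketch. It is not how ScanWorX or the in-house tools mentioned above work; it assumes the page has already been reduced to a binary bitmap (a list of rows of 0/1 pixels), splits it into horizontal bands wherever enough blank rows occur, and guesses "text" or "image" from ink density. The function names, the `min_gap` parameter and the `image_density` threshold are all hypothetical, and real newspaper pages with multiple columns would need a column split as well.

```python
# Illustrative sketch only: a toy segmenter, not the pipeline described above.
# A page is a list of rows; each row is a list of 0 (white) or 1 (ink) pixels.

def split_into_bands(page, min_gap=2):
    """Return (top, bottom) row ranges of ink separated by >= min_gap blank rows."""
    bands = []
    start = None      # first row of the band currently being built
    last_ink = None   # most recent row that contained any ink
    for y, row in enumerate(page):
        if any(row):
            if start is None:
                start = y
            last_ink = y
        elif start is not None and y - last_ink >= min_gap:
            # Enough blank rows seen: close the current band.
            bands.append((start, last_ink + 1))
            start = None
    if start is not None:
        bands.append((start, last_ink + 1))
    return bands


def classify_band(page, band, image_density=0.5):
    """Guess 'image' for dense bands, 'text' otherwise (assumed threshold)."""
    top, bottom = band
    pixels = [p for row in page[top:bottom] for p in row]
    density = sum(pixels) / len(pixels)
    return "image" if density >= image_density else "text"


def segment(page):
    """Pair every band with its guessed region type."""
    return [(band, classify_band(page, band)) for band in split_into_bands(page)]
```

Only the bands labelled "text" would then be passed on to the OCR engine; bands labelled "image" would be kept as bitmaps.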

Hope that helps a bit...

Hugh Stabler, Software Development Manager
Rank Xerox Business Services
Document Technology Centre
Beech House, Mitcheldean, Gloucestershire, GL17 0DD, UK.
