Lucas Alexandre
Aug 17, 2016, 2:58:40 PM
to tesser...@googlegroups.com
Hello,
I am a new member of this mailing list. I am working on a small project to read electronic screens through OCR. We set up some equipment that captures
the VGA output of computers and other devices and converts the signal to RCA composite, so I can take pictures and videos of the machine's screen. My idea is to capture
BIOS (Setup) screens and convert them to text that can be read by visually impaired users, like me. The problem is that Tesseract does not seem to understand my images,
while commercial OCR products read almost 99% of the text, with amazing accuracy. Before purchasing a license for one of these products, I would like to know whether
there is anything I can do to make Tesseract read my screens with reasonable accuracy. I had Tesseract write out its tessinput.tif file, and the result is a 1 KB file
with poor picture quality and completely blurred, distorted letters. I believe this happens because Tesseract tries to improve the image internally but
ends up destroying it. If there were an option to stop Tesseract from modifying the image, I could certainly get better results. Even the commercial OCR products let me
choose whether the image is converted to black and white.
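To illustrate how a black-and-white conversion step can erase low-contrast screen text, here is a minimal Python sketch of a global Otsu threshold. It is only an illustration of the general idea and is not taken from Tesseract's source; the pixel values and the function names `otsu_threshold` and `binarize` are made up for this example.

```python
# Sketch only: a global Otsu threshold applied to a tiny grayscale sample.
# Assumption: pixel values are integers in 0..255; the sample data is invented.

def otsu_threshold(pixels):
    """Return the threshold t that maximizes the between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels with value <= t
    sum_bg = 0  # sum of the values of those pixels
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue          # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # every pixel is already in the background class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:    # keep the first threshold that reaches the maximum
            best_var, best_t = var, t
    return best_t


def binarize(rows, t):
    """Map every pixel above t to white (255) and everything else to black (0)."""
    return [[255 if p > t else 0 for p in row] for row in rows]


# Invented sample: dark background (20), dim text (60), bright text (200).
sample = [20] * 12 + [60] * 4 + [200] * 4
t = otsu_threshold(sample)
print(t)                              # 60
print(binarize([[20, 60, 200]], t))   # [[0, 0, 255]]
```

In this sample the dim text (value 60) ends up on the same side of the threshold as the background, so it disappears after binarization. That is the kind of loss that separate preprocessing (for example upscaling or contrast stretching before OCR) is meant to avoid.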
Does anyone have any idea how I can do this with Tesseract? I have even thought about recompiling Tesseract so that it does not alter the original image.
Or is there a binary for Windows or Linux that already has this capability?
I would be very grateful for any help.
Sincerely,
Lucas Alexandre