Business Card OCR


Vijay Rawat

Dec 28, 2017, 3:08:20 AM
to tesseract-ocr
Hi,
I am building a small server-side application that processes business card images and extracts their text, using Tesseract for OCR. Two sample business cards are attached.
If I run Tesseract directly on the original image, I get no text at all, so I use ImageMagick to improve the image quality first. I follow these steps:
Step 1. Increase the image resolution > magick bas_eng_sm.jpg -colorspace RGB -alpha off -units PixelsPerInch -resample 600 bas_eng_sm_resize.tiff
Step 2. Convert to the gray colorspace > magick bas_eng_sm_resize.tiff -colorspace gray bas_eng_sm_gray.tiff
Step 3. Apply OCR > tesseract bas_eng_sm_gray.tiff bas_eng_sm_gray
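For processing many cards on a server, the three steps above could be chained in one shell function. This is only a minimal sketch that repeats the same commands with the same flags. The names `ocr_card`, `run`, and `DRY_RUN` are hypothetical helpers introduced here for illustration and are not part of ImageMagick or Tesseract.

```shell
#!/bin/sh
# Sketch: chain the three steps from this post for a single input image.
# ocr_card, run, and DRY_RUN are illustrative names, not tool features.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: ocr_card <image>; writes <base>_gray.txt next to the input.
ocr_card() {
    input=$1
    base=${input%.*}                # strip the file extension
    resized="${base}_resize.tiff"
    gray="${base}_gray.tiff"

    # Step 1: resample to 600 pixels per inch.
    run magick "$input" -colorspace RGB -alpha off \
        -units PixelsPerInch -resample 600 "$resized" &&
    # Step 2: convert to the gray colorspace.
    run magick "$resized" -colorspace gray "$gray" &&
    # Step 3: OCR; tesseract appends .txt to the output base name.
    run tesseract "$gray" "${base}_gray"
}
```

Setting `DRY_RUN=1` before calling `ocr_card bas_eng_sm.jpg` prints the three commands without running them, which makes it easy to check the generated file names before processing real images.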
Output: I get some text, but much of it is garbled:
Peter M. Btcining
Pr:-sauna: 8 CEO
pmbilbasmodlcalacom
(650) 235-4000 (direct)

QBAS
Mf:DI(‘.»‘\l.

I660 S. Amphletl Blvd. 82(1)
San M8100. CA 94402-2525
Main Pb: (650) 235-41)!
cl’-‘ax: (650) 2-I0-«KID
www.bnsmedical.com

I need at least the name, phone number, and email address to come out correctly. What else can I do, in either ImageMagick or Tesseract, to improve the results?
bas_eng_sm.jpg
citrus_viet_sm.jpg

Divyasree m.p

Feb 7, 2018, 7:52:42 AM
to tesseract-ocr
Hi Vijay,

Did you find a solution? I have the same problem. If you did, please share it.

Thanks,
Divyasree MP