Asking for advice on improving unsatisfactory recognition results

James Liu

Nov 2, 2016, 3:20:08 AM
to tesseract-ocr
Hey all,

I'm working on OCR of screenshots of terminal output. They are error logs from standard Ubuntu or CentOS systems. This should be fairly easy compared with real-world camera images, but my results are very poor. I have tried everything I could find on Google, but nothing helps. I hope someone here can give me some advice. Anything is welcome.

My eng.traineddata was downloaded from https://github.com/tesseract-ocr/tessdata. The image was preprocessed by binarization and by changing the DPI to 600x600.

The command I ran was:
tesseract -l eng test1_dpi_b.tiff out -psm 4
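
For scripted runs, the same invocation could be wrapped in Python. This is only a minimal sketch: the helper names `build_tesseract_cmd` and `run_tesseract` are made up for illustration, and it assumes the `tesseract` binary is on the PATH. Tesseract 3.x spells the page segmentation option `-psm`, while 4.x and later use `--psm`, so the flag is selectable.

```python
import subprocess


def build_tesseract_cmd(image_path, output_base, lang="eng", psm=4, legacy_flag=True):
    """Build the argument list for the tesseract CLI call shown above.

    Hypothetical helper, not part of tesseract itself.
    """
    # Tesseract 3.x uses "-psm"; 4.x and later use "--psm".
    psm_flag = "-psm" if legacy_flag else "--psm"
    return ["tesseract", image_path, output_base, "-l", lang, psm_flag, str(psm)]


def run_tesseract(image_path, output_base, **kwargs):
    """Run tesseract and return the path of the text file it writes."""
    cmd = build_tesseract_cmd(image_path, output_base, **kwargs)
    # check=True raises CalledProcessError if tesseract exits with an error.
    subprocess.run(cmd, check=True)
    # With no config file given, tesseract writes <output_base>.txt.
    return output_base + ".txt"
```

Calling `run_tesseract("test1_dpi_b.tiff", "out")` would then be equivalent to the command above and leave the recognized text in `out.txt`.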

I also tested this image on a Tesseract web service, https://www.newocr.com. Its results are much better than mine.
test1_dpi_b.tiff

James Liu

Nov 2, 2016, 4:07:19 AM
to tesseract-ocr
The output is as follows:
[53752.1auuuzl uul uatehdag:
 [53754.142uu31 uul uatehda -
 lss7au.13uuual uul uatehda
 [63732.142EE7] uul uatehda

 : salt lucknp 7 cluua staek lar 22$! [lt7test:1salsl
 - salt lucknp 7 cluus staek lar 22$! lssdh7serue - 535]
 salt lucknp 7 cluua staek lar 22$! [lt7test:1salsl
 salt lucknp 7 cluus staek lar 22$! lssdh7seruer:zsssl
 lssaua.13uuuzl uul natehdag. . salt lucknp 7 cluua staek lar 22$! [lt7test:1salsl
 lssazu.142uu7l uul natehdag: : salt lucknp 7 cluus staek lar 23$! lssdh7seruer:zsssl
 [63332.222223] lulu: rea 7sehed sell7 deteeted stall an clulssaaz.225u111 lul rea7sehed deteeted stalls an CPUs/tasks
 [63336.13EEEZI uul natehdag: . salt lucknp 7 cluua staek lar 22$! [lt7test 13315]
 lssasu.142uull uul uatehda salt lucknp 7 cluus staek lar 23$! lssdh7serue . 535]
 [63364.13E223] uul uatehda salt lucknp 7 cluua staek lar 23$! [lt7test:1salsl
 lssaaa.142uzll uul uatehda salt lucknp 7 cluus staek lar 22$! lssdh7serue 535]
 [63332.13EEEEI uul uatehda salt lucknp 7 cluua staek lar 23$! [lt7test:1salsl
 [63316.142222] uul natehdag: : salt lucknp 7 cluus staek lar 22$! lssdh7seruer:zsssl
 [6332E.13EEESI uul natehdag: : salt lucknp 7 cluua staek lar 23$! [lt7test:1salsl
 [63344.142227] uul uatehda - - salt lucknp 7 cluus staek lar 22$! lssdh7serue - 535]
 [63343.13EEEZI uul uatehda salt lucknp 7 cluua staek lar 23$! [lt7test:1salsl
 [53372.142uu41 uul uatehda salt lucknp 7 cluus staek lar 22$! lssdh7seruer:zsssl
 [63376.13EEE4] uul natehdag. . salt lucknp 7 cluua staek lar 23$! [lt7test:1salsl
 l7uuuu.142uual uul natehdag: : salt lucknp 7 cluus staek lar 22$! lssdh7seruer:zsssl
 [7EBB4.13EEE4] uul natehdag: : salt lucknp 7 cluua staek lar 22$! [lt7test 13315]
 l7uu12.zzauual lulu: rea 7sehed sell7 deteeted stall an clul7uu12.zauu12] lul rea7sehed deteeted stalls an CPUs/tasks
 [7EB32.13EEE4] uul uatehda . salt lucknp 7 cluua staek lar 22$! [lt7test.1salsl
 [7EB4E.14ZEE4] uul uatehda salt lucknp 7 cluus staek lar 22$! lssdh7seruer:zsssl
 [7EB6E.13EEE4] uul uatehda salt lucknp 7 cluua staek lar 22$! [lt7test:1salsl
 l7uusa.142uual uul natehdag: uu salt lucknp 7 cluus staek lar 23$! lssdh7seruer:zsssl
 l7uuau.zasuaul lulu: task dnsetaa:1la7s hlaeked lar nare than 12a seeands.
 l7uuau.2427uul Tainted: G L 4. 1. u7 2. 216. aelaad. xus 754 u1
 [7EBBE.25E363] "eeha u > /pruE/sgs/kernel/huug7 task 7tineaat 7sees" disables this message
 l7uuaa.13uuuzl uul natehdag: . salt lucknp 7 cluua staek lar 22$! llt7 test: 13315]
 [7EB36.142211] uul uatehda salt lucknp 7 cluus staek lar 23$! lssdh7 7seruer: 2535]
 [72116.13E223] uul natehdag. . salt lucknp 7 cluua staek lar 22$! [lt7test:1salsl
 l7u124.142uu41 uul natehdag: : salt lucknp 7 CPUuS staek lar 23$! lssdh7seruer:zsssl
 l7u144.13uuu51 uul natehdag: : salt lucknp 7 cluua staek lar 22$! [lt7test:1salsl
 l7u152.142uu41 uul uatehda salt lucknp 7 cluus staek lar 23$! lssdh7seruer:zsssl
 l7u172.13uuu51 uul natehdag. . salt lucknp 7 cluua staek lar 22$! [lt7test:1salsl
 l7ulau.142uual uul natehdag: uue: salt lucknp 7 cluus staek lar 23$! lssdh7seruer:zsssl
 l7u132.zaauual lulu: rea7sehed sell7deteeted stall an clul7u132.zasulzl lul rea7sehed deteeted stalls an CPUs/tasks
 l7uzuu.13uuual uul natehdag: uue: salt lucknp 7 cluua staek lar 22$! [lt7test 13315]
 [7EZBE.Z§EE43] lulu: task dnsetaa:1la7s hlaeked lar nare than 12a seeands.
 l7uzuu.zs77zul Tainted: G L 4. 1. u7 2. 216. aelaad. xus 754 u1
 l7uzuu.z754311"eeha u > /pruE/sgs/kernel/huug7 task 7tineaat 7sees" disables this message
 l7uzzu.142uual uul natehdag: . salt lucknp 7 CPUuS staek lar 22$! lssdh7 7seruer: 2535]
 l7uzza.13uuual uul uatehda salt lucknp 7 cluua staek lar 23$! llt7 test: 13315]
 l7uzla.142uuzl uul natehdag. . salt lucknp 7 CPUuS staek lar 22$! lssdh7serue . 535]
 [72256.13EEESI uul natehdag: : salt lucknp 7 cluua staek lar 23$! [lt7test:1salsl
 l7uz7s.142uual uul natehdag: : salt lucknp 7 cluus staek lar 22$! lssdh7seruer:zsssl
 l7uzal.13uuzal uul natehdag: : salt lucknp 7 cluua staek lar 23$! [lt7test:1salsl


James Liu

Nov 2, 2016, 5:07:05 AM
to tesseract-ocr
I just found out there was a bug in my image scaling function. After fixing it, the input image is actually scaled by 300%, and tesseract works like a charm.
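
The actual scaling function is not shown in this thread, so the following is only an illustrative sketch of a 300% upscale plus a sanity check that the scaling really took effect. It assumes the image is a plain list of pixel rows and uses nearest-neighbour repetition; the names `upscale_nearest` and `check_scaled` are hypothetical, and a real pipeline would more likely use an imaging library with smoother interpolation. Note that changing only the DPI metadata does not add pixels, so a check on the pixel dimensions is a cheap way to catch this kind of bug.

```python
def upscale_nearest(pixels, factor):
    """Upscale a 2D list of pixel values by an integer factor.

    Each pixel is repeated `factor` times horizontally and each row
    `factor` times vertically (nearest-neighbour scaling).
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    scaled = []
    for row in pixels:
        # Repeat every pixel horizontally.
        wide_row = [value for value in row for _ in range(factor)]
        # Repeat the widened row vertically, copying so rows stay independent.
        scaled.extend(list(wide_row) for _ in range(factor))
    return scaled


def check_scaled(original, scaled, factor):
    """Return True if `scaled` has exactly `factor` times the rows and columns."""
    if len(scaled) != len(original) * factor:
        return False
    return all(len(row) == len(original[0]) * factor for row in scaled)


# Tiny 2x2 binary image, scaled by 300% (factor 3) as in the post above.
small = [[0, 255], [255, 0]]
big = upscale_nearest(small, 3)
assert check_scaled(small, big, 3)
```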

