What are the recommended parameters for converting a PDF to an image in order to get the best results from tesseract?

fady taher
Aug 18, 2019, 12:14:15 PM
to tesseract-ocr
I'm trying to perform OCR on some PDFs. I use ImageMagick to convert each PDF to an image, yet when I run tesseract on the result, some of the values are misinterpreted.

On the other hand, converting the same PDF with online tools produces images that give correct values when I run tesseract on them.

Are there any recommended parameters to use when converting to an image?

Below are the ones I'm currently using:

pdfConvDensity = "-density"
densityVal = "300"
setO, setOCoSp, setOCo, setOSeperate, setOAvg = "-set", "colorspace", "Gray", "-separate", "-average"
quality = '-quality'
qualityVal = '100'
depth = '-depth'
depthVal = '2'
antialias = '-antialias'
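
For context, here is a minimal sketch of how these values might be assembled into an ImageMagick `convert` call followed by a tesseract run. The function names, the file paths, and the order of the options are assumptions for illustration and are not taken from the actual script; the only deliberate ordering choice is that `-density` and `-antialias` come before the input file so that they apply while the PDF is being rasterized.

```python
import subprocess


def build_convert_command(pdf_path, image_path):
    """Build an ImageMagick `convert` argument list from the settings listed above.

    The option order is an assumption; the original post does not show it.
    """
    return [
        "convert",
        "-density", "300",             # rasterization resolution, set before reading the PDF
        "-antialias",                  # rendering setting, also placed before the input
        pdf_path,
        "-set", "colorspace", "Gray",
        "-separate", "-average",       # average the channels into one grayscale image
        "-depth", "2",                 # 2 bits per sample, as in the post
        "-quality", "100",
        image_path,
    ]


def run_ocr(pdf_path, image_path, output_base):
    """Convert the PDF, then run tesseract; tesseract writes <output_base>.txt."""
    subprocess.run(build_convert_command(pdf_path, image_path), check=True)
    subprocess.run(["tesseract", image_path, output_base], check=True)
```

With hypothetical file names, this would be called as `run_ocr("input.pdf", "page.png", "page")`, which requires both ImageMagick and tesseract to be installed and on the PATH.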