text close to lines

GuillaumeQ

Jul 10, 2017, 4:24:12 AM
to tesseract-ocr
I have a document with some text written in a table. The lines of the table are very close to the text. When I call doOCR, I don't get the text between the lines. Is there any way to improve this and read text that sits close to lines? The image is attached.

My code:

import net.sourceforge.tess4j.*

def ocrToStream() {
    def imageFile = new File("path\\to.PNG")
    ITesseract instance = new Tesseract1() // JNA Direct Mapping
    instance.setDatapath("<parentPath>") // replace <parentPath> with path to parent directory of tessdata
    instance.setLanguage("fra")

    try {
        def result = instance.doOCR(imageFile)
        System.out.println(result)
    } catch (TesseractException e) {
        System.err.println(e.getMessage())
    } catch (IOException e) {
        System.err.println(e.getMessage())
    }
}
zoom.PNG

THintz

Jul 10, 2017, 9:18:25 AM
to tesseract-ocr
Charles Weld's .NET wrapper for Tesseract implements Leptonica's grayscale line removal as RemoveLines in Pix.cs; removing the table lines before running OCR may help here.
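
For Tess4J users, the same idea can be roughed out in plain Java. To be clear, this is not Leptonica's algorithm and not the .NET wrapper's RemoveLines: it is a much cruder run-length sketch that whitens any horizontal or vertical run of dark pixels at least minRun long. The class LineRemover, the method removeLines, and the parameters minRun and threshold are all made-up names for illustration, and it assumes dark lines on a light background.

```java
import java.awt.image.BufferedImage;

/** Hypothetical helper; not part of Tess4J or Leptonica. */
class LineRemover {

    /** True if the average of the R, G, B channels is below the threshold. */
    static boolean isDark(BufferedImage img, int x, int y, int threshold) {
        int rgb = img.getRGB(x, y);
        int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
        return (r + g + b) / 3 < threshold;
    }

    /**
     * Returns a copy of src in which every horizontal or vertical run of
     * dark pixels at least minRun pixels long has been painted white.
     * Runs are detected on src, so crossing lines do not hide each other.
     */
    static BufferedImage removeLines(BufferedImage src, int minRun, int threshold) {
        int w = src.getWidth(), h = src.getHeight();
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                out.setRGB(x, y, src.getRGB(x, y));
            }
        }

        // Horizontal pass: scan each row, erase long dark runs.
        for (int y = 0; y < h; y++) {
            int start = 0;
            for (int x = 0; x <= w; x++) {
                boolean dark = x < w && isDark(src, x, y, threshold);
                if (!dark) {
                    if (x - start >= minRun) {
                        for (int i = start; i < x; i++) out.setRGB(i, y, 0xFFFFFF);
                    }
                    start = x + 1;
                }
            }
        }

        // Vertical pass: scan each column the same way.
        for (int x = 0; x < w; x++) {
            int start = 0;
            for (int y = 0; y <= h; y++) {
                boolean dark = y < h && isDark(src, x, y, threshold);
                if (!dark) {
                    if (y - start >= minRun) {
                        for (int i = start; i < y; i++) out.setRGB(x, i, 0xFFFFFF);
                    }
                    start = y + 1;
                }
            }
        }
        return out;
    }
}
```

The cleaned image could then be loaded with ImageIO.read and passed to instance.doOCR(BufferedImage) instead of the File. minRun needs to be longer than the longest straight stroke in the text, otherwise parts of characters get erased too, and glyphs that actually touch a table line will still lose a few pixels.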

Quan Nguyen

Jul 10, 2017, 10:09:00 AM
to tesseract-ocr