I have a problem with the current tesseract


Pkumar ..

May 8, 2019, 2:02:56 AM
to tesseract-ocr

I have a problem with the current Tesseract. I call Tesseract from PHP code, as shown below:


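<?php
// Handle a multi-file upload (form field named "image[]").
if (isset($_FILES['image'])) {
    $total = count($_FILES['image']['name']);

    for ($i = 0; $i < $total; $i++) {
        // basename() strips directory components from the client-supplied name.
        $file_name = basename($_FILES['image']['name'][$i]);
        $file_tmp  = $_FILES['image']['tmp_name'][$i];

        // Only report success (and run OCR) if the file was actually stored.
        if (!move_uploaded_file($file_tmp, "image1/" . $file_name)) {
            echo "<h3>Image Upload Failed</h3>";
            continue;
        }

        echo "<h3>Image Upload Success</h3>";
        echo '<img src="image1/' . htmlspecialchars($file_name) . '" style="width:100%">';

        // Run Tesseract on the uploaded image; the text is written to out.txt.
        $image = 'D:\\xampp\\htdocs\\Image_OCR\\Image1\\' . $file_name;
        shell_exec('"C:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe" ' . escapeshellarg($image) . ' out');

        echo "<br><h3>OCR after reading</h3><br><pre>";

        // file_get_contents() also copes with an empty out.txt.
        $text = file_get_contents("out.txt");
        if ($text === false) {
            die("Unable to open file!");
        }
        echo htmlspecialchars($text);

        echo "</pre>";
    }
}
?>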


But I have a problem: my images are .jpg files that contain only numeric characters, and the output comes back in an unreadable form. Can anyone help me get the right text? I have attached images for testing.

crop1G.jpg
crop2G.jpg
crop3G.jpg

zdenop

May 8, 2019, 8:06:31 AM
to tesseract-ocr
What is the reason for showing us your code? You are running tesseract from the shell, so...

If you have a problem with the tesseract output (I guess), you will need traineddata for the MICR font.

Zdenko

On Wednesday, May 8, 2019 at 8:02:56 UTC+2, Pkumar .. wrote:

Pkumar ..

May 10, 2019, 4:58:09 AM
to tesseract-ocr
Thanks for the reply, but I can't find traineddata for the MICR font.

Can you help me with where I can find it and how I should modify my code for it?

Zdenko Podobny

May 10, 2019, 5:15:17 AM
to tesser...@googlegroups.com
You can train it yourself.
There is also one PR, https://github.com/tesseract-ocr/tessdata/pull/121, but the result is not perfect.
 
Zdenko


On Fri, May 10, 2019 at 10:58, Pkumar .. <pkuma...@gmail.com> wrote:

Zdenko Podobny

May 11, 2019, 8:37:04 AM
to tesser...@googlegroups.com

Sid Sathi

May 16, 2019, 11:30:53 AM
to tesseract-ocr
Hi, I found trained MICR font data for tesseract in this Google group. Hope this helps! https://groups.google.com/forum/#!searchin/tesseract-ocr/micr/tesseract-ocr/obWI4cz8rXg/_Yl8RuCeJAkJ

shree

Jun 15, 2019, 10:02:37 AM
to tesseract-ocr
In case that file doesn't work with tesseract4, you can try MICR.traineddata from https://github.com/Shreeshrii/tessdata_MICR/tree/master/MICR-legacy

I was able to OCR the three images posted earlier in this thread:

⑈000144⑈ 400756051⑆ 000023⑈ 11

⑈069565⑈ 364013051⑆ 000079⑈ 11

⑈420360⑈ 36400 2052⑆ 000009⑈ 20

I used --dpi 300 --psm 6.
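
For the PHP side of the original question, here is a minimal sketch of how the shell_exec call might be adapted to those settings. It assumes MICR.traineddata has been copied into Tesseract's tessdata folder, so it can be selected with -l MICR. The helper name build_micr_command is hypothetical, and the paths in the usage comment are the ones from the first post. Using stdout as the output base makes Tesseract print the text instead of writing out.txt.

```php
<?php
// Hypothetical helper: builds a Tesseract command line for MICR text.
// Assumes MICR.traineddata sits in Tesseract's tessdata folder,
// so that it can be selected with "-l MICR".
function build_micr_command(string $tesseract, string $image, string $lang = 'MICR'): string
{
    return escapeshellarg($tesseract) . ' ' . escapeshellarg($image)
        . ' stdout'                       // print the text instead of writing out.txt
        . ' -l ' . escapeshellarg($lang)  // language = name of the traineddata file
        . ' --dpi 300 --psm 6';           // settings reported in this thread
}

// Usage (paths taken from the first post; adjust to your installation):
// $cmd  = build_micr_command('C:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe',
//                            'D:\\xampp\\htdocs\\Image_OCR\\Image1\\crop1G.jpg');
// $text = shell_exec($cmd);
// echo '<pre>' . htmlspecialchars((string) $text) . '</pre>';
```

Reading the text from stdout also avoids several uploads overwriting the same out.txt inside the loop.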