How can I train Tesseract from images directly?


Harshit Gupta

Aug 1, 2017, 6:52:36 AM
to tesseract-ocr
I have images of vehicle number plates, which do not use a standard font, and I have cropped out images of the individual characters. I tried the following:

  1. Created a grid of all the character images so Tesseract can read them for training, and generated a tif file from it.
  2. Generated a box file for that tif file and corrected the entries for wrongly detected characters.
  3. Tried running the remaining steps: unicharset, font_properties, shapeclustering, mftraining, cntraining.
  4. At shapeclustering I get output like this:
    1. Bad properties for index 3, char ‘: 0,255 0,255 0,0 0,0 0,0
    2. Bad properties for index 4, char J: 0,255 0,255 0,0 0,0 0,0
    3. Bad properties for index 5, char l: 0,255 0,255 0,0 0,0 0,0
    4. Bad properties for index 6, char I: 0,255 0,255 0,0 0,0 0,0
    5. Bad properties for index 7, char .: 0,255 0,255 0,0 0,0 0,0
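
Steps 1 and 2 above can also be sketched in code. Since the label of each cropped character is already known, the box file can in principle be written directly from the grid layout instead of being corrected after detection. The snippet below is only a rough sketch under assumptions: the function name `grid_box_lines`, the fixed square cell size, and the padding are hypothetical choices, and each glyph is assumed to be pasted at the top-left corner of its cell. It relies on the box file format being `symbol left bottom right top page`, with the origin at the bottom-left corner of the image.

```python
from typing import List, Tuple


def grid_box_lines(
    chars: List[Tuple[str, int, int]],  # (label, width, height) per cropped glyph
    cell: int,                          # side of one square grid cell, in pixels
    cols: int,                          # number of cells per row
    pad: int = 10,                      # white space around and between cells
    page: int = 0,                      # page index written to the box file
) -> Tuple[int, int, List[str]]:
    """Return (page_width, page_height, box_lines) for a grid of glyphs.

    Hypothetical helper: assumes glyph i is pasted at the top-left corner of
    cell i, filling the grid row by row.
    """
    rows = (len(chars) + cols - 1) // cols
    page_w = cols * cell + (cols + 1) * pad
    page_h = rows * cell + (rows + 1) * pad

    lines = []
    for i, (label, w, h) in enumerate(chars):
        if w > cell or h > cell:
            raise ValueError(f"glyph {label!r} ({w}x{h}) does not fit in a {cell}px cell")
        row, col = divmod(i, cols)
        # Top-left corner of the glyph in image coordinates (y grows downward).
        x0 = pad + col * (cell + pad)
        y0 = pad + row * (cell + pad)
        # Box files measure y from the bottom of the image, so flip the axis.
        left, right = x0, x0 + w
        top = page_h - y0
        bottom = page_h - (y0 + h)
        lines.append(f"{label} {left} {bottom} {right} {top} {page}")
    return page_w, page_h, lines


if __name__ == "__main__":
    # Made-up glyph sizes, for illustration only.
    w, h, box = grid_box_lines([("J", 20, 30), ("l", 8, 30), ("I", 10, 28)], cell=40, cols=2)
    print(f"page size: {w}x{h}")
    print("\n".join(box))
```

The returned lines would be saved next to the tif file under the same base name with a `.box` extension, and the page size tells you how large the grid image has to be when pasting the glyphs at the same positions.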

ahmed.ba...@gmail.com

Aug 2, 2017, 9:24:16 AM
to tesseract-ocr
I guess you are on Linux and doing the training manually, so:

On Linux you can use LIOS, which is very effective.

On Windows, use jTessBoxEditor and Serak.