I updated Tesseract to the latest version in SVN, and now I am getting errors while running training:
D:\BuildFolder\testing\TRAINdata\v6-TransliterationOnly>echo off
tesseract 3.02.03
leptonica-1.68 (Mar 14 2011, 10:43:03) [MSC v.1500 LIB Release 32 bit]
libgif 4.1.6 : libjpeg 8c : libpng 1.4.3 : libtiff 3.9.4 : zlib 1.2.5
**** extracting unicharset *****
Extracting unicharset from ipa.sanskrit2003.exp994.box
Wrote unicharset file ./unicharset.
**** done extracting unicharset from *****
**** ipa.sanskrit2003.exp994.box ****
**** Training using following .tr files *****
****
ipa.sanskrit2003.exp994.tr ****
**** NO Shapeclustering - Non Indic Language*****
**** Started MFTraining *****
Read shape table shapetable of 733 shapes
Reading ipa.sanskrit2003.exp994.tr ...
id < this->size():Error:Assert failed:in file ..\..\ccutil\unicharset.cpp, line 237
Has anyone else had this problem?
Additionally, with the Sanskrit language data, I am getting errors while running OCR on .png images; this worked fine earlier.
1 file(s) copied.
tesseract 3.02.03
leptonica-1.68 (Mar 14 2011, 10:43:03) [MSC v.1500 LIB Release 32 bit]
libgif 4.1.6 : libjpeg 8c : libpng 1.4.3 : libtiff 3.9.4 : zlib 1.2.5
processing san.0s2003.exp0.tif
processing san.0s2003.exp8.tif
processing san.0sanskrit2003.exp0.tif
processing san.0sanskrit2003.exp8.tif
processing san.mnt.exp013.png
TIFFstream: Not a TIFF file, bad magic number 20617 (0x5089).
processing san.mnt.exp014.png
TIFFstream: Not a TIFF file, bad magic number 20617 (0x5089).
processing san.mnt.exp031.png
TIFFstream: Not a TIFF file, bad magic number 20617 (0x5089).
processing san.mnt.exp032.png
TIFFstream: Not a TIFF file, bad magic number 20617 (0x5089).
processing san.mnt.exp038.png
TIFFstream: Not a TIFF file, bad magic number 20617 (0x5089).
processing san.mnt.exp424.png
TIFFstream: Not a TIFF file, bad magic number 20617 (0x5089).
Press any key to continue . . .
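A note on the number in the message above: 20617 (0x5089) is what the first two bytes of a PNG file (0x89 followed by 'P', which is 0x50) give when read as a little-endian 16-bit integer. So the .png files appear to be genuine PNGs that are being passed to the TIFF reader, rather than damaged TIFFs. Below is a minimal Python sketch for checking that on the input files. The helper names sniff_image_type and leading_uint16_le are made up for illustration and are not part of Tesseract or Leptonica, and the sketch makes no claim about how the library itself picks a reader.

```python
import struct
import sys

# Well-known file signatures (from the PNG and TIFF specifications).
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
TIFF_SIGNATURES = (b"II*\x00", b"MM\x00*")  # little-endian / big-endian TIFF


def sniff_image_type(header: bytes) -> str:
    """Guess the container type from the first bytes of a file."""
    if header.startswith(PNG_SIGNATURE):
        return "png"
    if header[:4] in TIFF_SIGNATURES:
        return "tiff"
    return "unknown"


def leading_uint16_le(header: bytes) -> int:
    """Interpret the first two bytes as a little-endian unsigned 16-bit value."""
    return struct.unpack("<H", header[:2])[0]


if __name__ == "__main__":
    # Hypothetical usage: pass the image paths on the command line.
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            header = f.read(8)
        if len(header) < 2:
            print(f"{path}: too short to identify")
            continue
        magic = leading_uint16_le(header)
        print(f"{path}: {sniff_image_type(header)} (leading value {magic} / {magic:#06x})")
```

If a file such as san.mnt.exp013.png is reported as "png" with a leading value of 20617, the image itself is probably fine and the problem is in how the new build chooses the reader for it.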
Should I open issues for the above?