The world of open source welcomes me with insufficient info, examples, and
documentation, but with open doors to ask ;)
I'm just trying to recognize a really clear and simple line of English
text like "Tess TEST 123.4 $15".
Here is what I have now:
//Tesseract block start
CTessOCR *tess = CTessOCR::Instance();
tess->api->SetVariable ("tessedit_char_whitelist", "0123456789");
tess->api->SetVariable ("classify_bln_numeric_mode", "1");
tess->api->Init ("./../../tessdata", "eng");
#ifdef DEBUG_MODE
tess->api->SetVariable ("tessedit_adaption_debug", "T");
tess->api->SetVariable ("tessedit_draw_outwords", "T");
tess->api->SetVariable ("tessedit_dump_choices", "T");
tess->api->SetVariable ("interactive_mode", "T");
tess->api->SetVariable ("tessedit_create_hocr", "T");
#endif
tess->api->SetVariable ("tessedit_single_match", "0");
tess->api->SetVariable ("tessedit_zero_rejection", "T");
tess->api->SetVariable ("tessedit_minimal_rejection", "F");
tess->api->SetVariable ("tessedit_write_rep_codes", "F");
tess->api->SetVariable ("tessedit_resegment_from_boxes", "T");
tess->api->SetVariable ("tessedit_train_from_boxes", "T");
tess->api->SetVariable ("textord_fast_pitch_test", "T");
tess->api->SetVariable ("textord_no_rejects", "T");
tess->api->SetVariable ("edges_children_fix", "F");
tess->api->SetVariable ("edges_childarea", "0.65");
tess->api->SetVariable ("edges_boxarea", "0.9");
tess->api->SetVariable ("il1_adaption_test", "1");
tess->api->SetPageSegMode (tesseract::PSM_SINGLE_LINE);
Mat img = imread( "../../tess.jpg", CV_LOAD_IMAGE_GRAYSCALE ); //err now
tess->api->SetImage(convert_mat_to_pix(img));
std::string text = tess->api->GetUTF8Text();
It all fails in:
> match.exe!OpenBoxFile(const STRING & fname)
match.exe!tesseract::Tesseract::ApplyBoxes(const STRING & fname, bool find_segmentation, BLOCK_LIST * block_list)
match.exe!tesseract::TessBaseAPI::Recognize(ETEXT_DESC * monitor)
match.exe!tesseract::TessBaseAPI::GetUTF8Text()
Obviously it fails because I've never set an input file name with boxes.
But why would I need one? I already have the trained data downloaded and
put in tessdata: eng.traineddata, eng.cube.size, etc.
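
A possible reading of the backtrace, offered as a guess rather than a confirmed diagnosis: the call path goes through ApplyBoxes, and the code above sets two box-related variables to "T" (tessedit_resegment_from_boxes and tessedit_train_from_boxes). Those look like training switches, which would explain why a box file is being opened during plain recognition. The sketch below only shows the idea of keeping such switches out of a recognition run. The names Setting, wants_box_file and recognition_only are hypothetical helpers, not part of Tesseract, and the snippet deliberately makes no Tesseract calls itself.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// A (variable name, value) pair as it would be passed to SetVariable.
typedef std::pair<std::string, std::string> Setting;

// Hypothetical helper (not a Tesseract function): true for the two variables
// from the post that appear to make Tesseract look for a box file.
bool wants_box_file(const std::string& name) {
    return name == "tessedit_resegment_from_boxes" ||
           name == "tessedit_train_from_boxes";
}

// Returns the requested settings with the box-file switches removed,
// preserving the original order of everything else.
std::vector<Setting> recognition_only(const std::vector<Setting>& requested) {
    std::vector<Setting> kept;
    for (std::size_t i = 0; i < requested.size(); ++i) {
        if (!wants_box_file(requested[i].first)) {
            kept.push_back(requested[i]);
        }
    }
    return kept;
}

// Intended use (assumption: variables are applied after Init has succeeded):
//   for each Setting s in recognition_only(requested):
//       tess->api->SetVariable(s.first.c_str(), s.second.c_str());
```

In practice the same effect comes from simply deleting those two SetVariable lines. Separately, a whitelist of "0123456789" would presumably keep the letters and the "$" in the sample line from being recognized.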
It looks like you need to specify the Leptonica library when linking (-llept).
Thanks for this wonderfully simple example!
It compiles for me on Ubuntu, but not on Mac OS X. Anyone know what I'm missing here?

$ g++ -o test test.cpp -I /opt/local/include/tesseract -I /opt/local/include/leptonica -L /opt/local/lib -ltesseract
Undefined symbols for architecture x86_64:
"_pixRead", referenced from:
_main in ccVNVpU7.o
"_pixDestroy", referenced from:
_main in ccVNVpU7.o
ld: symbol(s) not found for architecture x86_64
collect2: ld returned 1 exit status

Patrick.