tesseract.exe target.jpg stdout -l number --oem 0 -psm 6
--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscribe@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/273d9f86-39ce-42fe-8934-781f2103e4fa%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Your image is 96 dpi. Increase the dpi to 300 and try again. Preprocess the image to remove the boxes around the letters, if possible.
ShreeDevi
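
As a rough illustration of the arithmetic behind that advice, here is a minimal sketch, assuming "increase the dpi" means resampling the image so the glyphs cover more pixels. `Size` and `scaledSize` are hypothetical names for this example, not part of Tesseract or Leptonica, and the dimensions in the usage note are made up.

```cpp
#include <cmath>

// Hypothetical helper type: image dimensions in pixels.
struct Size {
    int width;
    int height;
};

// Pixel dimensions after resampling an image from one DPI to another.
// Going from 96 dpi to 300 dpi gives a scale factor of 300 / 96 = 3.125.
Size scaledSize(Size in, double fromDpi, double toDpi) {
    const double factor = toDpi / fromDpi;
    return { static_cast<int>(std::lround(in.width * factor)),
             static_cast<int>(std::lround(in.height * factor)) };
}
```

For an assumed 200x50 pixel image, `scaledSize({200, 50}, 96.0, 300.0)` gives 625x156. The actual resampling would be done by an image library before the image is handed to Tesseract.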
On Fri, Oct 20, 2017 at 1:24 PM, 朱裕清 <zyd1...@gmail.com> wrote:
This is my target image:

Actually, my question is similar to [this post](https://stackoverflow.com/questions/4944830/how-to-make-tesseract-to-recognize-only-numbers-when-they-are-mixed-with-letter), but I don't understand why the answers there go in a different direction. I just want to get the digits that are recognized with a high degree of confidence. For example, I can already do this with another language.

Then I can keep only the results whose confidence is above a threshold of `0.9`. Now I would like to do the same with *Tesseract*.
First, I trained a *number.traineddata* just for recognizing digits. You can get it [here](https://1drv.ms/u/s!Aumb0ijJibxOi1KVXFjwDzOVRQrm).
tesseract.exe target.jpg stdout -l number --oem 0 -psm 6

Note that this returns all digits, both the high-confidence and the low-confidence ones. Can Tesseract recognize the digits and also report the confidence of each one? I cannot find any information on how to do this. If *Tesseract* cannot do it, is there another **C++**-based method that can? Could anyone give me some pointers?
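
The filtering step described above could be sketched as follows, assuming the per-symbol text and confidence have already been collected from some OCR engine. `SymbolResult` and `keepConfident` are hypothetical names for this example. One detail to watch: Tesseract's result iterators report confidence as a percentage from 0 to 100, so a `0.9` threshold corresponds to 90 on that scale.

```cpp
#include <string>
#include <vector>

// Hypothetical record for one recognized symbol; not a Tesseract type.
struct SymbolResult {
    std::string text;   // UTF-8 text of the symbol
    float confidence;   // percentage in [0, 100]
};

// Keep only the symbols whose confidence reaches the threshold.
// `threshold` is a fraction in [0, 1] (0.9 in the post) and is
// converted to a percentage before comparing.
std::vector<SymbolResult> keepConfident(const std::vector<SymbolResult>& symbols,
                                        float threshold) {
    const float minPercent = threshold * 100.0f;
    std::vector<SymbolResult> kept;
    for (const SymbolResult& s : symbols) {
        if (s.confidence >= minPercent) {
            kept.push_back(s);
        }
    }
    return kept;
}
```

With made-up inputs `{"1", 97.5}`, `{"7", 62.0}`, `{"3", 91.0}` and a threshold of `0.9`, only `"1"` and `"3"` are kept.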
Maybe look at the API [1]. The output of the attached program shows that a lot of detail can be gleaned at this level, including the confidence of the selected character and that of the other candidates. Compiling against Tesseract on Ubuntu, at least, is fairly straightforward. I don't know about Windows or OS X.
art
---
1. https://github.com/tesseract-ocr/tesseract/wiki/APIExample
Hello, this is my target_image.tif.
I have changed your code a little, to this:
#include <cstdio>   // fprintf
#include <cstdlib>  // exit
#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
//#include<opencv.hpp>
//using namespace cv;
using namespace std;

int main()
{
    tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
    if (api->Init(".\\tessdata", "eng")) {
        fprintf(stderr, "Could not initialize tesseract.\n");
        exit(1);
    }
    api->SetPageSegMode(tesseract::PSM_SINGLE_BLOCK);
    //Mat mat_image = imread("target_image.tif", 0);
    Pix *image = pixRead("target_image.tif");
    if (image == NULL) {
        fprintf(stderr, "Could not read target_image.tif.\n");
        exit(1);
    }
    //cvtColor(mat_image, mat_image, CV_GRAY2BGR);
    api->SetImage(image);
    api->SetSourceResolution(300);
    //Important
    api->Recognize(NULL);
    tesseract::ResultIterator* ri = api->GetIterator();
    tesseract::PageIteratorLevel level = tesseract::RIL_SYMBOL;
    //Mat image_rect_bin(mat_image.size(), CV_8UC1, Scalar(0));
    int line = 0;
    if (ri != 0) {
        do {
            // The caller owns this string and must delete[] it.
            const char* symbol = ri->GetUTF8Text(level);
            if (ri->IsAtBeginningOf(tesseract::RIL_TEXTLINE))
                line++;
            if (symbol != 0) {
                // Bounding box of the current symbol, in image coordinates.
                int x1, y1, x2, y2;
                ri->BoundingBox(level, &x1, &y1, &x2, &y2);
                tesseract::ChoiceIterator ci(*ri);
                do {
                    // Owned by the iterator; must NOT be deleted.
                    const char* choice = ci.GetUTF8Text();
                    //rectangle(image_rect_bin, Point(x1, y1), Point(x2, y2), Scalar(255), -1);
                } while (ci.Next());
            }
            delete[] symbol;
        } while (ri->Next(level));
        delete ri;  // the iterator returned by GetIterator() must be deleted
    }
    api->End();
    delete api;
    pixDestroy(&image);
    return 0;
}

As you can see, I get many separate, disconnected rectangles from target_image.tif. Could you help me again? Please...
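
If the goal is one rectangle per text line rather than one per symbol (an assumption about what is wanted here), one possible approach is to take the union of the symbol boxes that share a line number, using the `line` counter and the `x1, y1, x2, y2` values from the loop above. The sketch below is independent of Tesseract; `Box` and `mergeByLine` are hypothetical names for this example.

```cpp
#include <algorithm>
#include <map>
#include <utility>
#include <vector>

// Hypothetical axis-aligned box: (x1, y1) top-left, (x2, y2) bottom-right.
struct Box {
    int x1, y1, x2, y2;
};

// Merge per-symbol boxes into one enclosing box per line number.
// Each input entry pairs a line number with the box of one symbol.
std::map<int, Box> mergeByLine(const std::vector<std::pair<int, Box>>& symbols) {
    std::map<int, Box> merged;
    for (const std::pair<int, Box>& entry : symbols) {
        const int line = entry.first;
        const Box& b = entry.second;
        std::map<int, Box>::iterator it = merged.find(line);
        if (it == merged.end()) {
            merged.insert(std::make_pair(line, b));  // first symbol on this line
        } else {
            Box& m = it->second;                     // grow the box to cover b
            m.x1 = std::min(m.x1, b.x1);
            m.y1 = std::min(m.y1, b.y1);
            m.x2 = std::max(m.x2, b.x2);
            m.y2 = std::max(m.y2, b.y2);
        }
    }
    return merged;
}
```

An alternative that avoids merging by hand is to iterate at a coarser level: passing `tesseract::RIL_WORD` or `tesseract::RIL_TEXTLINE` instead of `tesseract::RIL_SYMBOL` to `BoundingBox` and `Next` returns boxes for whole words or lines.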