SetVariable on whitelist has no effect

Matthew Scott

Jan 5, 2017, 3:09:11 AM
to tesseract-ocr
Version: 3.05 (I compiled tesseract305.dll)
System: Windows 10 x64
I want to recognize digits only, so my code is as follows:
    tesseract::TessBaseAPI tess;
    // Init() returns 0 on success and non-zero on failure.
    if (tess.Init("e:\\resources\\tesseract\\tessdata", "eng", tesseract::OEM_LSTM_ONLY))
    {
        cerr << "OCRTess: could not initialize tesseract" << endl;
        return -1;
    }
    tess.SetPageSegMode(tesseract::PageSegMode::PSM_SINGLE_LINE);
    // Restrict recognition to digits.
    cout << "setVariable succeed?  " << tess.SetVariable("tessedit_char_whitelist", "0123456789");
    Rect r;
    Mat roi = src(r); // where r is the region of interest
    tess.SetImage((uchar*)roi.data, roi.size().width, roi.size().height,
                  roi.channels(), roi.step1());
    tess.Recognize(0);
    text = std::unique_ptr<char[]>(tess.GetUTF8Text()).get();

However, the output still contains letters.
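
A side note on the last line of the snippet: GetUTF8Text() returns a buffer that the caller must release with delete[], and the temporary unique_ptr frees it at the end of that statement. That is only safe if `text` is a std::string (which copies the characters first), not a raw char*. A minimal sketch of the ownership pattern, assuming `text` is a std::string; FakeGetUTF8Text and TakeText are hypothetical names used here for illustration and are not part of the Tesseract API:

```cpp
#include <cstring>
#include <memory>
#include <string>

// Stand-in for tess.GetUTF8Text() (hypothetical, for illustration only):
// returns a buffer allocated with new[] that the caller must delete[].
char* FakeGetUTF8Text() {
    const char kSample[] = "12345\n";
    char* buffer = new char[sizeof(kSample)];
    std::memcpy(buffer, kSample, sizeof(kSample));
    return buffer;
}

// Take ownership of a new[]-allocated C string and copy it into a std::string.
// The unique_ptr<char[]> calls delete[] once the copy has been made.
std::string TakeText(char* raw) {
    std::unique_ptr<char[]> owner(raw);
    return owner ? std::string(owner.get()) : std::string();
}
```

With the real API the call would look like `std::string text = TakeText(tess.GetUTF8Text());`, which also handles a null return.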

Zdenko Podobný

Jan 5, 2017, 3:17:48 AM
to tesser...@googlegroups.com

Zdenko

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscribe@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/4415aaee-23e2-48af-b625-e3ab2b9902dc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Matthew Scott

Jan 5, 2017, 8:22:25 AM
to tesseract-ocr
I built it from the master branch, not from 3.05.
However, when using 'OEM_CUBE_ONLY', SetVariable does work.

On Thursday, January 5, 2017 at 4:17:48 PM UTC+8, zdenop wrote:

Zdenko



Zdenko Podobný

Jan 5, 2017, 9:27:08 AM
to tesser...@googlegroups.com
The master branch contains Tesseract 4.0, and Cube was removed from it (see e.g. [1]), so setting OEM_CUBE_ONLY has no effect.

Zdenko
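
For anyone who hits the same problem: one workaround that does not depend on whether the selected engine honors tessedit_char_whitelist is to filter the recognized text afterwards. A minimal sketch, assuming the OCR output has already been copied into a std::string; the helper name KeepDigits is hypothetical and not part of the Tesseract API:

```cpp
#include <algorithm>
#include <iterator>
#include <string>

// Hypothetical helper (not part of Tesseract): keep only the ASCII digits
// '0'..'9' from the recognized text and drop everything else, including
// letters, whitespace and the bytes of multi-byte UTF-8 characters.
std::string KeepDigits(const std::string& ocr_text) {
    std::string digits;
    std::copy_if(ocr_text.begin(), ocr_text.end(), std::back_inserter(digits),
                 [](char c) { return c >= '0' && c <= '9'; });
    return digits;
}
```

Note that this only discards unwanted characters. It does not turn a misread letter (for example 'O' instead of '0') back into the digit that was on the page, so it is weaker than a whitelist applied during recognition.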
