How does the Tesseract variable “save_blob_choices” work (in tess-two)?


Sergio Mendoza

Mar 14, 2016, 1:17:49 AM
to tesseract-ocr

I've been trying to use Tesseract OCR (specifically tess-two) in an Android project to scan some symbols.

Everything works fine, but sometimes the recognized String is returned as null. One of the solutions I found was to set the variable save_blob_choices to true so that Tesseract saves alternative recognition results.

But I don't know whether it is actually supposed to do that. Where does it save the alternatives, and how do I access them?

Of course, if you have any other solution that doesn't involve this variable, please tell me.

Here is my code:


TessBaseAPI baseApi = new TessBaseAPI();
baseApi.setDebug(true);
baseApi.init(MainActivity.DATA_PATH, MainActivity.lang);
baseApi.setPageSegMode(TessBaseAPI.PageSegMode.PSM_SINGLE_CHAR);


baseApi.setVariable("tessedit_char_whitelist","abcdefghijklmnopqrst");
baseApi.setVariable("save_blob_choices", "T");

baseApi.setImage(mainBitmap);

publishProgress(80);


mainBitmap.recycle();
mainBitmap = null;

// Iterate through the results.
ResultIterator iterator = baseApi.getResultIterator();
String lastUTF8Text;
float lastConfidence;


iterator.begin();
do {
    lastUTF8Text = iterator.getUTF8Text(TessBaseAPI.PageIteratorLevel.RIL_SYMBOL);
    lastConfidence = iterator.confidence(TessBaseAPI.PageIteratorLevel.RIL_SYMBOL);

    Log.i("string, intConfidence",lastUTF8Text+", "+lastConfidence);
} while (iterator.next(TessBaseAPI.PageIteratorLevel.RIL_SYMBOL));


baseApi.end();


Also, as an extra question: is baseApi.setDebug(true) supposed to work? It doesn't seem to do anything.

zdenko podobny

Mar 14, 2016, 3:29:37 AM
to tesser...@googlegroups.com

Zdenko


Tom Morris

Mar 14, 2016, 1:18:08 PM
to tesseract-ocr
What version are you using? I don't see a save_blob_choices variable in the current sources, and the two closest matches, save_alt_choices and save_raw_choices, don't actually appear to do anything.

Tom

Sergio Mendoza

Mar 14, 2016, 2:02:55 PM
to tesseract-ocr
I'm using tess-two, which uses Tesseract v3.05.00dev.

Sergio Mendoza

Mar 14, 2016, 2:16:16 PM
to tesseract-ocr


I think it could work, but I don't really know C++. Do you know of anywhere I could find this ported to Java, such as an update of the tess-two project? I wouldn't mind learning C++, but unfortunately I'm on the clock.

zdenko podobny

Mar 14, 2016, 4:13:06 PM
to tesser...@googlegroups.com
I am not familiar with tess-two, but I see it has a function, getChoicesAndConfidence [1].

[1] https://github.com/rmtheis/tess-two/blob/master/tess-two/src/com/googlecode/tesseract/android/ResultIterator.java#L80

Zdenko

On Mon, Mar 14, 2016 at 7:12 PM, Sergio Mendoza <srgmet...@gmail.com> wrote:
I think it could work, but in the tess-two version that class is not exposed to Java. Do you know if there is any place I could find it? (The problem is that I don't really know C++.)


On Monday, March 14, 2016 at 1:29:37 AM UTC-6, zdenop wrote:

Sergio Mendoza

Mar 17, 2016, 6:53:20 PM
to tesseract-ocr
Thanks, that works. I'd already found that class, but I couldn't get it to work because I didn't know what to pass as the parameter.

FYI, in tess-two the parameter is just defined as int level, which takes any constant from TessBaseAPI.PageIteratorLevel.
In my case, in place of the constant name, I had to pass RIL_SYMBOL so that Tesseract would recognize single characters.
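
For later readers, here is a minimal sketch of what one might do with the alternatives once they have been fetched. It assumes that getChoicesAndConfidence(TessBaseAPI.PageIteratorLevel.RIL_SYMBOL) yields a list of text/confidence pairs, which is how the linked ResultIterator.java appeared to work at the time of this thread; check the version you actually ship. To keep the example runnable off-device, Map.Entry stands in for the Android pair type, and the helper name bestChoice and the sample values are hypothetical, not part of tess-two.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SymbolChoices {

    /**
     * Returns the text of the highest-confidence choice, or the given
     * fallback when there are no choices at all (for example when the
     * recognized text came back as null).
     */
    static String bestChoice(List<Map.Entry<String, Double>> choices, String fallback) {
        if (choices == null || choices.isEmpty()) {
            return fallback;
        }
        Map.Entry<String, Double> best = choices.get(0);
        for (Map.Entry<String, Double> candidate : choices) {
            if (candidate.getValue() > best.getValue()) {
                best = candidate;
            }
        }
        return best.getKey();
    }

    public static void main(String[] args) {
        // Hypothetical sample data. On Android the list would instead come
        // from something like (assumption, not verified against every
        // tess-two release):
        //   iterator.getChoicesAndConfidence(TessBaseAPI.PageIteratorLevel.RIL_SYMBOL)
        List<Map.Entry<String, Double>> choices = new ArrayList<>();
        choices.add(new SimpleEntry<>("c", 64.2));
        choices.add(new SimpleEntry<>("o", 71.5));
        choices.add(new SimpleEntry<>("e", 12.0));

        System.out.println(bestChoice(choices, "?")); // prints o
    }
}
```

Inside the do/while loop from the original question, a helper like this could be called once per symbol, with the fallback covering the case where no text or alternatives are returned.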