Greetings,
I found a potential solution: rewrite each pixel to either white or black based on a set threshold. OpenCV's "threshold" function does exactly that, but Tesseract was still finding "ghost" characters in the white areas of the image, so I also had to find where the string starts and grab an ROI from that point. Note that the THRESH_BINARY_INV parameter to threshold also inverts the image: dark pixels become white and light pixels become black. From what I've read, Tesseract prefers black characters on a white background, so the inverted flag is the one to use when the original text is light on a dark background.
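To make the inverted rule concrete, here is a minimal sketch of what THRESH_BINARY_INV computes per pixel. The helper name thresholdBinaryInv is made up for illustration and is not an OpenCV function; it only mirrors the documented rule (values strictly above the threshold become 0, everything else becomes maxval) on a plain 8-bit buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of OpenCV): applies the THRESH_BINARY_INV
// rule to an 8-bit grayscale buffer. Pixels strictly above `thresh` become 0
// (black); all other pixels become `maxval` (white).
std::vector<std::uint8_t> thresholdBinaryInv(const std::vector<std::uint8_t>& src,
                                             std::uint8_t thresh,
                                             std::uint8_t maxval) {
    std::vector<std::uint8_t> dst;
    dst.reserve(src.size());
    for (std::uint8_t px : src) {
        dst.push_back(px > thresh ? static_cast<std::uint8_t>(0) : maxval);
    }
    return dst;
}
```

With a threshold of 100 and a maxval of 255, as in the call further down, a dark pixel of 30 becomes 255 (white) and a light pixel of 220 becomes 0 (black). A pixel exactly at the threshold is not above it, so it also becomes white.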
So the solution I came up with, using OpenCV and Tesseract, is the following:
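#include <opencv2/opencv.hpp>
#include <tesseract/baseapi.h>
#include <string>
using namespace cv;
using namespace std;

// xStart, yStart, xMove, yMove and popupNum are defined elsewhere
Mat img; // should already have the image
Mat cropped;
Mat grayed;
Mat inverted;
// Crop the original image to the defined ROI (x, y, width, height)
Rect roi(xStart, yStart, xMove, yMove);
cropped = img(roi);
// Convert the cropped image to grayscale
cvtColor(cropped, grayed, COLOR_BGR2GRAY);
// Threshold to pure black and white, swapping dark and light
threshold(grayed, inverted, 100, 255, THRESH_BINARY_INV);
// Use Tesseract to OCR the thresholded image
tesseract::TessBaseAPI *ocr = new tesseract::TessBaseAPI();
// Init returns 0 on success and a nonzero value on failure
ocr->Init(NULL, "eng", tesseract::OEM_LSTM_ONLY);
ocr->SetPageSegMode(tesseract::PSM_SINGLE_WORD);
// 4th argument is bytes per pixel (1 for grayscale), 5th is bytes per line
ocr->SetImage(inverted.data, inverted.cols, inverted.rows, 1, inverted.step);
// GetUTF8Text returns a heap-allocated char* that the caller must delete[]
char *text = ocr->GetUTF8Text();
popupNum = text ? string(text) : string();
delete[] text;
// Release Tesseract's resources once the result has been copied
ocr->End();
delete ocr;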
NOTE: Be careful with the 4th parameter of the ocr->SetImage function. This is the number of bytes per pixel (not bits). After converting to grayscale it's 1 and not 3. I forgot about this and was getting 3 strings back, which was quite strange.
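One way to catch this kind of mistake early is to check the arguments against the buffer layout before calling SetImage. The sketch below is plain C++ with a hypothetical helper, layoutIsConsistent, which is not part of Tesseract or OpenCV. It only assumes that the bytes used by one row of pixels cannot exceed the bytes-per-line value, which for a cv::Mat is its step.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sanity check (not a Tesseract or OpenCV API): the bytes used
// by one row (width * bytesPerPixel) must fit within bytesPerLine.
bool layoutIsConsistent(int width, int bytesPerPixel, std::size_t bytesPerLine) {
    assert(width > 0 && bytesPerPixel > 0);
    return static_cast<std::size_t>(width) *
               static_cast<std::size_t>(bytesPerPixel) <= bytesPerLine;
}
```

For a 200-pixel-wide grayscale image with a step of 200, layoutIsConsistent(200, 1, 200) holds while layoutIsConsistent(200, 3, 200) does not, which is the mistake described above. With OpenCV, inverted.elemSize() returns the bytes per pixel of a Mat, so it can be passed instead of a hard-coded number.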