Retrieve HUD text from a video game screenshot


Richard

Jan 13, 2020, 10:18:40 AM
to tesseract-ocr
Hello,

I'm starting a piece of software that analyzes eSports matches, mining statistics from screenshots (i.e. video frames).
I've started simple: I'm trying to extract the team names.

I've tried to crop and grayscale the image, but it isn't reliable (i.e. the recognized text doesn't match).

Any tips to improve the text recognition?

Code sample used:

import cv2
import pytesseract
from loguru import logger


image = cv2.imread("test.png")

x = 375
y = 0
h = 40
w = 160

# IN GREY
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# BLUR
image = cv2.medianBlur(image, 3)

# THRESH
#image = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]

#image = image[y:y+h, x:x+w]
cv2.imwrite("output.png", image)

for arg in range(1, 14):
    try:
        text = pytesseract.image_to_string(image, config=f"-l eng --oem 1 --psm {arg}")
        logger.info(f"Parsed text [{text}] with [{arg}]")
    except Exception as exception:
        logger.exception(exception)


Screenshot_20200113_075013.png
cropped.png

Daniel Anton

Jan 13, 2020, 11:48:01 AM
to tesseract-ocr
Hi Richard,

First of all, I would recommend visualizing each step of your pipeline (threshold, median blur, ...).
I use matplotlib for that purpose; you can find an example below.

The problem is not Tesseract but your preprocessing: you shouldn't blur text that small, and for the threshold you can just use cv2.THRESH_BINARY.

import cv2
import pytesseract
from loguru import logger

from matplotlib import pyplot as plt


image = cv2.imread("test.png")
x = 375
y = 0
h = 40
w = 160

# IN GREY
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

plt.figure(figsize=(40, 40))
plt.imshow(image, cmap="gray")
plt.title('image')
plt.show()

# THRESH
image = cv2.threshold(image, 200, 255, cv2.THRESH_BINARY)[1]

plt.figure(figsize=(40, 40))
plt.imshow(image, cmap="gray")
plt.title('image')
plt.show()

image = image[y:y+h, x:x+w]

plt.figure(figsize=(40, 40))
plt.imshow(image, cmap="gray")
plt.title('image')
plt.show()

cv2.imwrite("output.png", image)

for arg in range(1, 14):
    try:
        text = pytesseract.image_to_string(image, config=f"-l eng --oem 1 --psm {arg}")
        logger.info(f"Parsed text [{text}] with [{arg}]")
    except Exception as exception:
        logger.exception(exception)

Richard

Jan 14, 2020, 11:15:19 PM
to tesseract-ocr
Thank you very much Daniel.
This is very handy for sure; I can see every step of the image modification in PyCharm now :)
I tried transforming the image in different orders (crop first, then gray, then threshold; gray first, then crop, then threshold; etc.) but there is still no good result.
Do you have some material that would help pinpoint what's wrong in my preprocessing?

Best regards,
Richard

Daniel Anton

Jan 16, 2020, 3:46:45 AM
to tesseract-ocr
Yes, check the code from the previous post; it works well with half of the Tesseract PSM modes.
It's just a simple global threshold.
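For anyone landing on this thread later: a global binary threshold simply maps every pixel above a fixed cutoff to white and everything else to black. A tiny sketch of that per-pixel rule, with the 200 cutoff from the code above (cv2.threshold with cv2.THRESH_BINARY applies the same rule to the whole array):

```python
# One scanline of grayscale values: background (120) and bright HUD text (230)
row = [120, 230, 230, 120]

# Global binary threshold at 200, as with cv2.THRESH_BINARY above
binary = [255 if pixel > 200 else 0 for pixel in row]
print(binary)  # [0, 255, 255, 0]
```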