templatematch with transparent templates


Lewis Haley

Apr 9, 2013, 4:42:04 PM
to stb-t...@googlegroups.com
Hi everyone,

I have successfully prototyped a method for template matching with transparent templates,
i.e. templates which aren't square or otherwise regular shapes, and in which some part(s)
of the template have zero alpha.

This is the message from an internal commit, which outlines the process:

The general idea is:
    What if we want to match a template that isn't square, or otherwise might
    have background content that we wish to ignore?
    This prototype allows templates with an alpha channel (transparency)
    to be submitted to the template match algorithm.
   
    Note that the cv.imread which loads the images from disk uses -1 instead
    of 1 as the flag. This means that all channel information is kept as-is,
    whereas 1 would remove the alpha and just use RGB.
   
    * The process does not affect the first pass... which means that in general
      the match threshold (certainty, best_res, etc) has to be decreased as
      there is literally less image to match... I'm uncertain how
      cvTemplateMatch deals with alpha channels... I think it probably ignores
      it and just reads the RGB as (0, 0, 0) or black. Might want to
      consider cloning the template for the template match and converting every
      pixel with alpha = 0 to be mid-gray, because this would maximize the
      potential difference to any other colour.
   
    * The mask_from_alpha function creates a mask by parsing over the template
      and creating a new image where I[x,y] = white if alpha == 255, else
      black. We do it like this so that only 100% opaque pixels contribute to
      the mask, which eliminates anti-aliasing problems. Note that the convert
      to grayscale must come after the alpha parse so we do not lose the alpha
      information!
   
    * The normalize step still works, with the addition of passing the
      mask as an option. Note that OpenCV already supports this.
   
    * After the absdiff, threshold and erode, we do a bitwise AND between the
      resultant image and the mask. This means that only white pixels on the
      image which occur within the white areas of the mask are counted when
      assessing the final match/no match.

At present we are not looking to implement this idea immediately; however, feel free
to attempt it in gst/gsttemplatematch.c/h and contribute it to the stb-tester
repo: https://github.com/drothlis/stb-tester/.

If this is of interest or use to you, let us know so we can raise the priority of its implementation.

Thanks,
Lewis

David Röthlisberger

Apr 9, 2013, 5:14:21 PM
to stb-t...@googlegroups.com, Lewis Haley
On 9 Apr 2013, at 17:42, Lewis Haley wrote:
>
> I have successfully prototyped a method for template matching with transparent template,
> i.e. templates which aren't square or even regular shapes, and which part(s) of the template
> have zero alpha.
>
> This is the message from an internal commit, which outlines the process:

I'll just add that at YouView we have a bunch of Python scripts we used
to prototype the "templatematch" algorithm, and this is what Lewis is
referring to by "an internal commit"; we'll work on publishing these
scripts under stb-tester/extra, but it may take a few days.

Cheers,
Dave.

patri...@yahoo.com

Jun 2, 2014, 12:29:02 PM
to stb-t...@googlegroups.com, lewis...@gmail.com, da...@rothlis.net
Hi David, Lewis,

Template matching with a transparent template is a very useful function. Is this function available in the open-source version of stb-tester on GitHub?

Thanks,
Patrick

Lewis Haley

Jun 3, 2014, 8:34:13 AM
to stb-t...@googlegroups.com, lewis...@gmail.com, da...@rothlis.net, patri...@yahoo.com

> Hi David, Lewis,
>
> Template matching with transparent template is a very useful function. Is this function available in the open source version of stb-tester in github?
>
> Thanks,
> Patrick

No, this was never integrated into the main stbt repository.

David Röthlisberger

Jun 3, 2014, 9:42:28 AM
to Lewis Haley, patri...@yahoo.com, stb-t...@googlegroups.com
On 3 Jun 2014, at 09:34, Lewis Haley <lewis...@gmail.com> wrote:
>> Template matching with transparent template is a very useful function. Is this function available in the open source version of stb-tester in github?
>
> No, this was never integrated into the main stbt repository.

Hi Lewis,

Did it work well? Any chance you can release the code? Even if it's not
suitable for merging, feel free to raise a pull request with a proof of
concept, and maybe Patrick can polish it up if he needs to.

Thanks,
Dave.

Lewis Haley

Jun 3, 2014, 10:42:44 AM
to stb-t...@googlegroups.com, lewis...@gmail.com, patri...@yahoo.com, da...@stb-tester.com
Hi Dave,

The code is not in a state that I can simply push to a branch (it's in the wrong repository, for one)
and it's also using a prototype framework, as mentioned previously in this thread. It shouldn't take
me long to review it and make a proper, more thorough submission to stb-tester, however I can't
say exactly when I'll get round to this.

From what I remember, the process works - but as to the extent to which it works *well*, I couldn't
comment.

--Lewis

Robert G

Aug 22, 2014, 12:32:16 PM
to stb-t...@googlegroups.com, lewis...@gmail.com, patri...@yahoo.com, da...@stb-tester.com
Hi Lewis,

We're currently in a situation where transparent image matching, or masks (like in wait_for_motion()), would be really helpful in wait_for_match() as well.
Are there any updates on this topic, or plans to release this feature?
I think this is relevant for everyone who is not using solid background colours to indicate focus, but working with borders instead. I just want to match the border (= focus), ignoring the actual text within the element in focus - I'll attach a reference template here. I'd like to use this, masked or alpha-masked, to match just the outer border, for every other element on this on-screen keyboard.

Kind regards,
Robert
keyboard_letter_v.png

Will Manley

Aug 24, 2014, 11:05:48 AM
to stb-t...@googlegroups.com


On Tue, 9 Apr 2013, at 17:42, Lewis Haley wrote:
> Hi everyone,
>
> I have successfully prototyped a method for template matching with
> transparent templates, i.e. templates which aren't square or even regular
> shapes, and which part(s) of the template have zero alpha.
>
> This is the message from an internal commit, which outlines the process:
>
> The general idea is:
> > What if we want to match a template that isn't square, or otherwise
> > might have background content that we wish to ignore?
> > This prototype allows for template with an alpha channel (transparency)
> > to be submitted to the template match algorithm.
> >
> > Note that the cv.imread which loads the images from disk uses -1 instead
> > of 1 as the flag. This means that all channel information is kept as-is,
> > whereas 1 would remove the alpha and just use RGB.

I like the general approach. It would be good to work through the maths
of it to ensure that the results are consistent with the current results
for non-transparent templates, and that we could generalise the approach
to semi-transparent templates.

> > * The process does not affect the first pass... which means that in
> >   general the match threshold (certainty, best_res, etc) has to be
> >   decreased as there is literally less image to match... I'm uncertain
> >   how cvTemplateMatch deals with alpha channels... I think it probably
> >   ignores it and just reads the RGB as (0, 0, 0) or black. Might want to
> >   consider cloning the template for the template match and converting
> >   every pixel with alpha = 0 to be mid-gray because this would maximize
> >   the potential difference to any other colour.

I like that idea. Rather than grey, filling the space with noise
(random pixel values) might give better results. This is because nothing
should correlate with random noise well, including other random noise.
This could even work with semi-transparent templates. We could blend
the noise behind the semi-transparent image.
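Will's noise idea could be sketched like this. This is my own illustrative code, not from the prototype; note that it blends naively in the stored colourspace, with no gamma correction.

```python
import numpy as np

def fill_transparency_with_noise(template_bgra, rng=None):
    # Blend random noise behind the template according to its alpha:
    # fully transparent pixels become pure noise, semi-transparent ones
    # a blend, and opaque pixels are left untouched.
    if rng is None:
        rng = np.random.default_rng()
    rgb = template_bgra[:, :, :3].astype(np.float64)
    alpha = template_bgra[:, :, 3:4].astype(np.float64) / 255.0
    noise = rng.integers(0, 256, size=rgb.shape).astype(np.float64)
    return (alpha * rgb + (1 - alpha) * noise).round().astype(np.uint8)
```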

We would have to take care to adjust the certainties appropriately. Also
blending could be tricky as it's colourspace dependant and we might have
to be careful to take gamma into account. But certainly it should be
easier with a simple mask rather than semi-transparency, so we can start
there much like you have :).

> > * The mask_from_alpha function creates a mask by parsing over the
> >   template and creating a new image where I[x,y] = white if alpha == 255,
> >   else black. We do it like this so that only 100% opaque pixels
> >   contribute to the mask, which eliminates anti-aliasing problems. Note
> >   that the convert to grayscale must come after the alpha parse so we do
> >   not lose the alpha information!

For now I think I'd prefer to raise an exception on partial transparency,
for forward compatibility with future support for it.

> > * The normalize still works with the addition of giving the normalize
> >   the mask as an option. Note that OpenCV already supported this.
> >
> > * After the absdiff, threshold, erode, we do a bitwise and between the
> >   resultant image and the mask. This means that only white pixels on the
> >   image which occur within the white areas of the mask are counted when
> >   assessing the final match/no match.
> >
>
> At present we are not looking to immediately implement this idea, however
> feel free to attempt this idea in gst/gsttemplatematch.c/h and contribute
> it to the stb-tester repo: https://github.com/drothlis/stb-tester/.
>
> If this is of interest or use to you, let us know so we can raise the
> priority of its implementation.

I also have increased interest in transparent templates recently so if
you could show your working that would be great. Even if it wasn't a
patch to stbt it would be interesting as a discussion point so we can
see more concretely the details of how first_pass_result and
confirmation thresholds are affected.

Thanks

Will

Lewis Haley

Aug 25, 2014, 10:19:20 AM
to stb-t...@googlegroups.com, wi...@stb-tester.com
I'm working on some example code.

Lewis Haley

Aug 26, 2014, 12:09:22 PM
to stb-t...@googlegroups.com, wi...@stb-tester.com
Some conclusions from some experimentations:

It seems that my original method (filling alpha=0 with black) doesn't work reliably because it makes cv2.matchTemplate give stronger
matches where the frame/image *is actually black*. The same goes for filling with grey/white/noise/any other colour.

The method does give *OK-ish* results when the amount of transparency in the template is kept to a very small percentage of the
total number of pixels in the image. The template must also contain some form of "anchor point", i.e. a unique defining detail which
the template-matching algorithm can "latch onto". The big caveat is that most use cases for a transparent template are along
the lines of "match any button in this menu (ignore the text - we don't care which button)", so depending on the button style, or more
precisely on the details which *aren't* omitted due to transparency, there is a significant chance that the best match location
for the template won't be in the correct place (and therefore the second pass fails, if it gets there).

The only viable algorithm I can think of is to rewrite the cv.matchTemplate function (cross-correlation) to take alpha into account.
I've tried this in Python but the method is simply far too slow (minutes rather than microseconds). This would mean writing the function
in C in the form of a Python module. For those who don't know the algorithm, it is something along these lines (in pseudo-Python code):

def correlate(img, tpl):
    correlation = 0
    for y in range(tpl.shape[0]):
        for x in range(tpl.shape[1]):
            corr = sum(abs(int(img[y, x, i]) - int(tpl[y, x, i]))
                       for i in range(3))
            corr *= tpl[y, x, 3]  # alpha
            correlation += corr
    return correlation


def iter_subimages(img, shape):
    for y in range(img.shape[0] - shape[0]):
        for x in range(img.shape[1] - shape[1]):
            yield img[y:y + shape[0], x:x + shape[1]]


corrs = []
for subimage in iter_subimages(image, template.shape):
    corrs.append(correlate(subimage, template))

heatmap = numpy.array(corrs).reshape(
    image.shape[0] - template.shape[0],
    image.shape[1] - template.shape[1])

normalize(heatmap)
# the closer to 0 (black), the better the match.

And that gets you the first pass. You'd then have to take alpha into account for the second pass also.
There are probably various tricks which make the cross-correlation faster than pixel-by-pixel, channel-by-channel,
but this is how I understand the basic principle. The root of the/our problem is that the alpha must be taken into
account at the lowest level (i.e. when comparing individual pixels) and there is no way of applying it after the
pixel-level analysis.

Robert G

Aug 26, 2014, 12:34:54 PM
to stb-t...@googlegroups.com
Isn't the mask parameter from wait_for_motion() an option here?
At least for that case, I could imagine using one of the images containing the letter and telling the match algorithm to ignore the inner square through a black-and-white mask.

Lewis Haley

Aug 26, 2014, 12:44:41 PM
to stb-t...@googlegroups.com
OpenCV's matchTemplate function/algorithm does not accept a mask option.

http://docs.opencv.org/modules/imgproc/doc/object_detection.html#matchtemplate

Lewis Haley

Aug 26, 2014, 12:53:13 PM
to stb-t...@googlegroups.com
Just to revisit the code from above:


def correlate(img, tpl):
    correlation = 0
    for y in range(tpl.shape[0]):
        for x in range(tpl.shape[1]):
            corr = sum(abs(int(img[y, x, i]) - int(tpl[y, x, i]))
                       for i in range(3))
            corr *= tpl[y, x, 3]  # alpha
            correlation += corr
    return correlation

If you replace the "abs" with "** 2" (i.e. square the difference between the corresponding pixels), then this is essentially how OpenCV's TM_SQDIFF method works.

David Röthlisberger

Aug 26, 2014, 3:06:41 PM
to Lewis Haley, stb-t...@googlegroups.com, William Manley
On 26 Aug 2014, at 13:09, Lewis Haley <lewis...@gmail.com> wrote:

> The big caveat for this is that most use cases of a transparent template are along
> the lines of "match any button in this menu (ignore the text - we don't care which button)", so depending on the button style, or more
> precisely the details which *don't have to be omitted due to transparency*, there is a significant chance that the best match location
> for the template won't be in the correct place (therefore the second pass fails, if it gets there).

We could change the first pass to return multiple (potential) locations;
if the second pass fails on the first location, then try the second
location, etc.

The first pass thresholds would have to be low enough that the correct
location does eventually match (even though the "transparent" parts of
the template --actually filled in with random noise-- don't match
exactly), but high enough that we aren't running the expensive second
pass against every pixel position in the frame.

I imagine that even with an implementation in C, all those nested loops
would be quite slow. OpenCV's `matchTemplate` implementation uses SIMD
operations IIRC. If you used numpy's vector operations instead of `for`
loops you might get a better idea of the kind of performance you can
expect without dropping into C.
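For what it's worth, here is a rough numpy sketch of an alpha-weighted squared-difference heatmap. The per-pixel, per-channel arithmetic is vectorised within each window, but the window loop remains, so this is only an illustration for small frames, not production code.

```python
import numpy as np

def masked_sqdiff(image, template):
    # image: HxWx3 uint8 frame; template: hxwx4 uint8 (colour + alpha).
    # Returns a heatmap where lower values mean better matches; pixels
    # with alpha 0 contribute nothing at all.
    h, w = template.shape[:2]
    tpl = template[:, :, :3].astype(np.float64)
    alpha = template[:, :, 3].astype(np.float64)[:, :, None] / 255.0
    img = image.astype(np.float64)
    out_h = image.shape[0] - h + 1
    out_w = image.shape[1] - w + 1
    heatmap = np.empty((out_h, out_w))
    for y in range(out_h):
        for x in range(out_w):
            diff = img[y:y + h, x:x + w] - tpl  # vectorised per window
            heatmap[y, x] = np.sum(alpha * diff ** 2)
    return heatmap
```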

David Röthlisberger

Aug 26, 2014, 3:15:23 PM
to Robert Glas, stb-t...@googlegroups.com, Lewis Haley, patri...@yahoo.com
On 22 Aug 2014, at 13:32, Robert G <rober...@maxdome.de> wrote:

> I just want to match for the border (=focus), ignoring the actual text within the element in focus (border) - I'll attach a reference template here. I'd like to use this, masked or alpha-masked to just match the outer border, to be used for every other element on this onscreen keyboard.

As a workaround for full transparency support:

Is the black part of the selected button always black, or is it
transparent?

If it's black, you could try using a template like this:


and another one like this:


and then calculate the region of the selection (for example to pass to
stbt.ocr) based on the position of the 2 matches.
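The region arithmetic for that workaround is simple enough to sketch. The helper name and the tuple-based region below are my own illustration, not stbt API; stbt's match result exposes the match position, from which you'd compute something like this:

```python
def selection_region(left_pos, right_pos, left_tpl_width, height):
    # left_pos / right_pos: (x, y) top-left corners of the left-edge and
    # right-edge template matches. Returns (x, y, width, height) of the
    # area between the two templates, e.g. to pass to an OCR call.
    x = left_pos[0] + left_tpl_width
    y = left_pos[1]
    return (x, y, right_pos[0] - x, height)
```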

Lewis Haley

Aug 26, 2014, 3:57:26 PM
to stb-t...@googlegroups.com, lewis...@gmail.com, wi...@stb-tester.com, da...@stb-tester.com


On Tuesday, 26 August 2014 16:06:41 UTC+1, David Röthlisberger wrote:

> On 26 Aug 2014, at 13:09, Lewis Haley <lewis...@gmail.com> wrote:
>
>> The big caveat for this is that most use cases of a transparent template are along
>> the lines of "match any button in this menu (ignore the text - we don't care which button)", so depending on the button style, or more
>> precisely the details which *don't have to be omitted due to transparency*, there is a significant chance that the best match location
>> for the template won't be in the correct place (therefore the second pass fails, if it gets there).
>
> We could change the first pass to return multiple (potential) locations;
> if the second pass fails on the first location, then try the second
> location, etc.

This would probably be the best option to go for, although I don't know how
many revisits would be required. Could potentially use Will's multi-match
code.
 
> The first pass thresholds would have to be low enough that the correct
> location does eventually match (even though the "transparent" parts of
> the template --actually filled in with random noise-- don't match
> exactly), but high enough that we aren't running the expensive second
> pass against every pixel position in the frame.

I think mid-grey would be the better choice rather than random noise. With
mid-grey, the maximum potential difference on any given 8-bit channel is
128, meaning you'll get a stronger match value than with random noise,
which could potentially produce a difference of 255 on an 8-bit channel.
As we're attempting to find a match (a match that doesn't *actually* exist)
it's better to make the match more viable than to discourage it.
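To put numbers on that argument (a trivial check, purely for illustration):

```python
import numpy as np

channel = np.arange(256)  # every possible 8-bit channel value

# Worst-case per-channel difference against a mid-grey (128) fill:
grey_worst = int(np.abs(channel - 128).max())

# Worst-case per-channel difference against noise, which can land on
# whichever value is furthest from the pixel being compared:
noise_worst = int(np.maximum(channel, 255 - channel).max())

print(grey_worst, noise_worst)  # prints: 128 255
```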


I've had a look at the OpenCV source code and it is heavily dependent on OpenCL,
delegating this kind of processing to parallel kernel threads. In my example
I wasn't trying to write optimised code - quite the opposite - I was explaining the
algorithm.

Robert G

Aug 26, 2014, 4:49:24 PM
to stb-t...@googlegroups.com, rober...@maxdome.de, lewis...@gmail.com, patri...@yahoo.com, da...@stb-tester.com


Am Dienstag, 26. August 2014 17:15:23 UTC+2 schrieb David Röthlisberger:


> If it's black, you could try using a template like this:
>
> and another one like this:
>
> and then calculate the region of the selection (for example to pass to
> stbt.ocr) based on the position of the 2 matches.


Actually I'm already using it exactly like this, but as this doesn't seem to work very reliably, I'd prefer a more robust solution.
At least I'm still trying to find the right parameters to get this working smoothly.

Patrick L

Aug 30, 2014, 12:02:27 PM
to Robert G, stb-t...@googlegroups.com, lewis...@gmail.com, da...@stb-tester.com
Hi,

You can also have a look at this method:





David Röthlisberger

Aug 31, 2014, 9:43:03 AM
to stb-t...@googlegroups.com, Robert Glas, lewis...@gmail.com, William Manley
I've drawn up a mini "project plan" with the steps required to implement
transparency support (and later, translucency support):
https://github.com/drothlis/stb-tester/issues/216

The first step would be to finish & merge pull request #212
(stbt.find_all), which allows the second pass to consider multiple
locations from the first pass. This in turn depends on pull request #182
(stbt.match). I hope to work on both of these this week.

We'd appreciate any help anyone can provide, especially in testing our
experiments against your private corpus of screenshots.

Thanks,
Dave.
