On 30 Oct 2013, at 18:04, David Röthlisberger wrote:
> It looks like Sikuli uses "pyramid template matching"[1] to find the
> region-of-interest more quickly than the standard OpenCV matchTemplate
> function, and at the last stage it just uses matchTemplate to
> confirm.[2]
I've done a prototype of "pyramid template matching" (i.e. running
OpenCV's matchTemplate function against downscaled versions of the
template & source images to identify the likely match region, and
then only process that region in the full-size image):
https://github.com/drothlis/stb-tester/commits/templatematch-performance
It isn't fully working yet: it doesn't pass all the "make check" unit
tests, and when run against a collection of templates + screenshots from
the YouView UI it gives the wrong result for 6% of them (out of 141).
However, the performance figures are encouraging:
tests/run-performance-test.sh reports 36fps (up from 4.0fps).
(Measured on my laptop's 1.8GHz Intel Core i7, using a 720p mpeg file on
disk as the video source).
With this video, every frame was a "no match", which means that the
match algorithm exits early, after the first `matchTemplate` at the
smallest level of the "pyramid". However, I believe that initial
`matchTemplate` call is the most expensive part of the algorithm;
subsequent passes only process the region of interest identified by
the first pass.
Testing against the aforementioned collection of real screenshots (90%
of which do match, 10% don't) went from 32s to 6s (a 5x improvement,
compared with the 9x improvement measured by run-performance-test.sh).