Normalised cross correlation


Malcolm Reynolds

Mar 6, 2012, 12:16:37 PM3/6/12
to scikit...@googlegroups.com
Hi everyone,

I've just stumbled across skimage as I'm going through the arduous process of trying to switch my workflow (PhD study of computer vision & machine learning) from Matlab to Python. The project looks great, but I'm currently in need of a fast normalised cross-correlation function. As it is, I can't find any Python implementation which is anywhere near the speed of Matlab's normxcorr2 (correct me if I'm wrong...), and it looks like a corresponding function to match the normxcorr2 behaviour is planned but not yet implemented for skimage. Is this correct? I'd be interested in looking into this and contributing some code to the project, but before I put a heap of time into it I just wanted to check that something didn't already exist in this or another numpy-based project.

Cheers,

Malcolm

Tony Yu

Mar 6, 2012, 1:54:22 PM3/6/12
to scikit...@googlegroups.com
Hi Malcolm,

I currently have a PR for a normalized cross-correlation function (which was originally implemented by Pieter Holtzhausen). Comments/suggestions would definitely be appreciated.

-Tony

Malcolm Reynolds

Mar 20, 2012, 10:56:14 AM3/20/12
to scikit...@googlegroups.com
On Tuesday, March 6, 2012 6:54:22 PM UTC, Tony S Yu wrote:

> Hi Malcolm,
>
> I currently have a PR for a normalized cross-correlation function (which was originally implemented by Pieter Holtzhausen). Comments/suggestions would definitely be appreciated.
>
> -Tony

Hi Tony

I've just got round to trying this, and I may be using it wrong, but currently I'm finding it doesn't match the result given by Matlab's normxcorr2. For one thing, I think it would make sense to raise an error when the user passes the template and image the wrong way round (Matlab does this). Also, I've tested by taking the template from the central part of some image, and using NCC to compare the template to the whole image. Since there is one arrangement which makes each pixel match perfectly, there should be a very clear, strong peak in the NCC image - which is indeed what I see in Matlab (e.g. http://img840.imageshack.us/img840/8577/screenshot20120320at143.png ). With your match_template function, however, I'm unable to get the same result; what I get instead is this: http://img844.imageshack.us/img844/8577/screenshot20120320at143.png - and if I put the template and image arguments in a different order, I get an image that looks like it may have accessed invalid memory: http://img849.imageshack.us/img849/7696/screenshot20120320at144.png

Additionally, calling match_template a second time with the arguments in this 'wrong' order produces a segmentation fault.

Also note that in Matlab, given a 768x432 image and a 384x216 template, the size of the result from normxcorr2 is 647x1151. As far as I can see this size of result is not possible to create with your code, although please correct me if I'm wrong?
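(For reference, that size is consistent with a "full"-mode correlation, which for an M x N image and m x n template has shape (M+m-1, N+n-1); a quick sanity check of the arithmetic, with shapes written as (rows, cols):)

```python
# Matlab's normxcorr2 returns the "full" correlation, whose size is
# (M + m - 1, N + n - 1) for an M x N image and m x n template.
image_shape = (432, 768)     # the 768x432 image above, as (rows, cols)
template_shape = (216, 384)  # the 384x216 template, as (rows, cols)

full_shape = tuple(M + m - 1 for M, m in zip(image_shape, template_shape))
print(full_shape)  # (647, 1151) -- the size reported above
```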

I hope these bug reports are useful, if you need me to put together a reproduction script just let me know.

Malcolm

Tony Yu

Mar 20, 2012, 11:13:26 AM3/20/12
to scikit...@googlegroups.com
Wow, those results are incredibly different. Yes, please put together a small script to reproduce the different behaviors (ideally, with the matlab result saved as .npy---or if it's easier .mat---file; also, if you can reproduce with a smaller image, that would be great).

It'll be easy to check for switched arguments, but the rest might take some time for me to get to. In your original email, it sounded like you have some expertise in template matching; feel free to dig into the code and see if you can spot any errors in the code.

Cheers,
-Tony

Malcolm Reynolds

Mar 20, 2012, 12:12:04 PM3/20/12
to scikit...@googlegroups.com

> Wow, those results are incredibly different. Yes, please put together a small script to reproduce the different behaviors (ideally, with the matlab result saved as .npy---or if it's easier .mat---file; also, if you can reproduce with a smaller image, that would be great).
>
> It'll be easy to check for switched arguments, but the rest might take some time for me to get to. In your original email, it sounded like you have some expertise in template matching; feel free to dig into the code and see if you can spot any errors in the code.
>
> Cheers,
> -Tony

I've quickly whipped up some scripts in both Matlab (if you have access to that) and Python and put them here: https://github.com/malcolmreynolds/NCC-test-program - if you don't have access to Matlab, the included .mat file should contain the inputs and outputs from Matlab's normxcorr2. To run it, you just need to put an image called 'lena.png' in the directory; I then ran 'ipython --pylab -i run_ncc.py' at the shell on OS X. You may have your plotting set up a different way, or not use pylab, but if all else fails the result images are written out as PNGs (also included in the repository). With Matlab, just get to the Matlab prompt and do "run_ncc".

For reference, this is what I see when using a 256x256 downsampled version of the lena image: http://img843.imageshack.us/img843/2729/screenshot20120320at155.png  (matlab on left, middle and right are python with and without the pad_output flag).

I'm afraid I'm definitely not an expert on template matching, but I reckon it's pretty crucial to have a good free (& fast!) implementation available in numpy/scipy/skimage in order to compete. I will take a dig through your code sometime, but unfortunately deadlines are always looming for me; I'll hopefully get a chance this coming weekend.

Thanks for your quick response, let me know if you have any problems with the stuff on github.

Malcolm

Tony Yu

Mar 20, 2012, 12:31:26 PM3/20/12
to scikit...@googlegroups.com
Great. Thanks for putting this together and for the original bug report. I'll take a look at this soon.... I hope. :)

-Tony

Mike Sarahan

Mar 24, 2012, 5:16:21 PM3/24/12
to scikit...@googlegroups.com
Hi guys,

I have used NCC a lot in my research, so I thought I'd take a look at this.

How critical is it that skimage's result image is the same size as
Matlab's? The difference arises from the approach to the problem.
Matlab assumes that the template might match an area of the target
image that isn't fully in frame: it effectively pads the original
image (preferably with the target image average) and then does the
cross correlation on the padded image. This would explain the
different result size - Matlab's seems to be (M+m-1) x (N+n-1) for an
M x N image and m x n template. That means padding by the template
width minus one on each side laterally, and the template height minus
one vertically, giving an (M+2m-2) x (N+2n-2) image, to which they
then apply the template match.
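In shape terms, that equivalence is easy to verify: pad the image by one template dimension minus one on each side, then keep only the fully-overlapping positions. A sketch with scipy.signal.correlate2d (zero padding here, since only the shapes are at issue; for the normalised score the fill value would matter):

```python
import numpy as np
from scipy.signal import correlate2d

rng = np.random.default_rng(0)
image = rng.random((20, 30))     # M x N
template = rng.random((5, 7))    # m x n
m, n = template.shape

# "Full" correlation lets the template hang off every edge of the image.
full = correlate2d(image, template, mode='full')

# Equivalently: zero-pad the image by (m - 1, n - 1) on each side, then
# take only the fully-overlapping ("valid") positions.
padded = np.pad(image, ((m - 1, m - 1), (n - 1, n - 1)), mode='constant')
valid_on_padded = correlate2d(padded, template, mode='valid')

assert full.shape == (20 + m - 1, 30 + n - 1)   # (24, 36)
assert np.allclose(full, valid_on_padded)
```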

I haven't figured out the problem with the results not matching,
though. The "good" news is that the results with padding the Matlab
way (pre-match) are especially fishy (NaNs are cropping up). That's
certainly an indicator of something rather fundamental being wrong.

I'm out of time for now, but I hope this helps a little. I will
investigate further when time allows, provided you all don't beat me
to it.

Best,
Mike

template.py.diff

Stéfan van der Walt

Mar 24, 2012, 6:07:06 PM3/24/12
to scikit...@googlegroups.com
Hi Mike

On Sat, Mar 24, 2012 at 2:16 PM, Mike Sarahan <msar...@gmail.com> wrote:
> I'm out of time for now, but I hope this helps a little.  I will
> investigate further when time allows, provided you all don't beat me
> to it.

Thanks for looking into this, and for identifying the boundary issue.
The fact that there are differences even with padding is
disconcerting; I'll see if I can review this soon.

Stéfan

Tony Yu

Mar 24, 2012, 7:03:58 PM3/24/12
to scikit...@googlegroups.com


2012/3/24 Stéfan van der Walt <ste...@sun.ac.za>

Ok,... so it turns out I made a stupid mistake with the convolution-to-correlation translation. Fixing that gives comparable *looking* results to Matlab's (after Mike's padding patch). BUT, I also get NaN values with the padded output. These NaNs come from taking the sqrt of negative values (see the code which solves the denominator). I *think* the argument of the sqrt should (mathematically) be zero, but round-off errors are causing it to go negative. If that's the case, then it's an easy fix to check that the argument is positive, but someone should check my math to make sure the code matches the equation.
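(The round-off guard is the usual one-line fix; a sketch with hypothetical values, not the PR's actual variable names:)

```python
import numpy as np

# Hypothetical values for the NCC denominator term: mathematically it is
# non-negative, but round-off can push it slightly below zero, and the
# sqrt of a negative float is NaN.
denom_sq = np.array([4.0, 1e-17, -1e-17])

with np.errstate(invalid='ignore'):
    naive = np.sqrt(denom_sq)                  # NaN for the negative entry
guarded = np.sqrt(np.clip(denom_sq, 0, None))  # clamp to zero, then sqrt

assert np.isnan(naive[-1]) and not np.any(np.isnan(guarded))
```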

As for the padding---that was a hack. I noticed the original implementation clipped the image to (M-m+1, N-n+1), so I padded *the result* with zeros to make the output (M, N). Also, the way I padded it (all padding on bottom and right) meant that the output has a high value where the "origin" (i.e. top-left corner) of the template matches, as opposed to the center. Centering could instead be achieved by padding the image with half the template width on the left and right, and half the template height on the top and bottom.

Padding *the input* sounds like a good idea, but I think it should be done in such a way that the output is (M, N)---i.e. same size as the input image. I'm not a huge fan of the Matlab output size (but maybe some people expect this output?).

Is there a "right" way to do the padding? Or maybe "pad_output" should be changed to "mode" (to match scipy.signal.convolve) with values 'valid' (no padding; output (M-m+1, N-n+1)), 'same' (pad input with image mean), and 'zeros' (pad output with zeros)?

-T

Tony Yu

Mar 24, 2012, 8:09:45 PM3/24/12
to scikit...@googlegroups.com
Just to follow up: This would be my preferred way to pad the image. Here, the maximum in the output corresponds to the center of the best template-match. 
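A small sketch of that scheme (toy data, scipy's correlate2d rather than the PR's code): pad the input by half the template size on each side with the image mean, then a 'valid' correlation gives an output the same size as the image, with its peak at the centre of the match.

```python
import numpy as np
from scipy.signal import correlate2d

# Toy image: a single bright 5x5 blob on a dark background.
image = np.zeros((20, 20))
image[8:13, 8:13] = np.arange(25, dtype=float).reshape(5, 5) + 1
template = image[8:13, 8:13]   # the blob itself; its centre is at (10, 10)

# Pad the input by half the template size on each side (image mean as
# fill); a "valid" correlation then has the same shape as the image, and
# the maximum sits at the *centre* of the best match.
m, n = template.shape
pad = ((m // 2, (m - 1) // 2), (n // 2, (n - 1) // 2))
padded = np.pad(image, pad, mode='mean')
response = correlate2d(padded, template, mode='valid')

peak = np.unravel_index(np.argmax(response), response.shape)
assert response.shape == image.shape and peak == (10, 10)
```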
-T

Mike Sarahan

Mar 24, 2012, 8:29:24 PM3/24/12
to scikit...@googlegroups.com
Hi Tony,

Glad you figured this out.

I'm also not a huge fan of the Matlab-style padding. I guess
theoretically, it gives your match greater range - beyond the bounds
of the image. In practice, I found that such behavior in the presence
of noise often gave me a peak outside the image, when I knew I wanted
my peak inside the image.

I'm mixed on the input padding. Intuitively, I think the high value
at template center is easier to understand for a human looking at the
cross correlation result image. I know that this confused me a lot
when I first encountered cross correlation using OpenCV. However, any
subsequent analysis (peak finding, for example) would have to account
for the difference, and I think this would be more of a headache than
a benefit to people. I can imagine a lot of careless analyses with
offsets of exactly half the template dimensions.

Your proposed method for the padding makes more sense to me than
Matlab's. I would recommend defaulting to no padding, and also
recommend a little documentation that warns the user about the offset
they will have to account for if using the pad option.

I can't think of a use case for padding the output with zeros. If
anyone else knows one, I'll be happy to learn, though.

Best,
Mike

Tony Yu

Mar 25, 2012, 9:31:46 AM3/25/12
to scikit...@googlegroups.com
On Sat, Mar 24, 2012 at 8:29 PM, Mike Sarahan <msar...@gmail.com> wrote:
> Hi Tony,
>
> Glad you figured this out.
>
> I'm also not a huge fan of the Matlab-style padding.  I guess
> theoretically, it gives your match greater range - beyond the bounds
> of the image.  In practice, I found that such behavior in the presence
> of noise often gave me a peak outside the image, when I knew I wanted
> my peak inside the image.
>
> I'm mixed on the input padding.  Intuitively, I think the high value
> at template center is easier to understand for a human looking at the
> cross correlation result image.  I know that this confused me a lot
> when I first encountered cross correlation using OpenCV.  However, any
> subsequent analysis (peak finding, for example) would have to account
> for the difference, and I think this would be more of a headache than
> a benefit to people.  I can imagine a lot of careless analyses with
> offsets of exactly half the template dimensions.
>
> Your proposed method for the padding makes more sense to me than
> Matlab's.  I would recommend defaulting to no padding, and also
> recommend a little documentation that warns the user about the offset
> they will have to account for if using the pad option.
>
> I can't think of a use case for padding the output with zeros.  If
> anyone else knows one, I'll be happy to learn, though.

Hey Mike,

I guess my rationale for padding with zeros was the following: I want the output hotspot to match the image coordinates, but I don't want to have matches where the template exceeds the image boundaries.

With the match centered on the template-origin, this padding actually makes no difference since you're padding only the higher indices, but with the match centered on the template-center, there would be a difference. Also, there's a (usually-insignificant) computational improvement compared to padding the input.

I'm just making this stuff up as I go along, so this might not actually be a desirable use case.
 
-Tony



> Best,
> Mike

Stéfan van der Walt

Mar 26, 2012, 2:44:21 PM3/26/12
to scikit...@googlegroups.com
On Sat, Mar 24, 2012 at 5:09 PM, Tony Yu <tsy...@gmail.com> wrote:
> Just to follow up: This would be my preferred way to pad the image. Here,
> the maximum in the output corresponds to the center of the best
> template-match.

Just as a point of interest, there should soon be some padding
capabilities in numpy:

https://github.com/numpy/numpy/pull/198
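(For reference, the interface that eventually landed as `np.pad` looks roughly like this:)

```python
import numpy as np

a = np.arange(6, dtype=float).reshape(2, 3)

# Two elements of padding on every side, filled with the array mean...
b = np.pad(a, 2, mode='mean')
# ...or asymmetric widths with an explicit constant fill value.
c = np.pad(a, ((1, 1), (0, 2)), mode='constant', constant_values=0)

assert b.shape == (6, 7) and c.shape == (4, 5)
```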

Stéfan

Stéfan van der Walt

Mar 26, 2012, 3:03:04 PM3/26/12
to scikit...@googlegroups.com
Hi Tony

On Sat, Mar 24, 2012 at 4:03 PM, Tony Yu <tsy...@gmail.com> wrote:
> zero, but round-off errors are causing it to go negative. If that's the
> case, then it's an easy fix to check that the argument is positive, but
> someone should check my math to make sure the code matches the equation.

I checked the paper vs your code, and the code looks correct. The
only thing I didn't check was whether the normalization was done on
the correct position in the correlation.

> Padding *the input* sounds like a good idea, but I think it should be done
> in such a way that the output is (M, N)---i.e. same size as the input image.
> I'm not a huge fan of the Matlab output size (but maybe some people expect
> this output?).

If the user wants to pad the input, they should probably do that
themselves beforehand. Half-template matches feel sketchy (and where
do you stop: when at least half the template matches?). Also, with
the new proposed padding in numpy, this should become a lot easier.

Stéfan

Tony Yu

Mar 26, 2012, 8:47:10 PM3/26/12
to scikit...@googlegroups.com


2012/3/26 Stéfan van der Walt <ste...@sun.ac.za>

This looks like a great new feature. Thanks for the heads-up.
-Tony

Tony Yu

Mar 26, 2012, 9:36:40 PM3/26/12
to scikit...@googlegroups.com


2012/3/26 Stéfan van der Walt <ste...@sun.ac.za>

> Hi Tony
>
> On Sat, Mar 24, 2012 at 4:03 PM, Tony Yu <tsy...@gmail.com> wrote:
>> zero, but round-off errors are causing it to go negative. If that's the
>> case, then it's an easy fix to check that the argument is positive, but
>> someone should check my math to make sure the code matches the equation.
>
> I checked the paper vs your code, and the code looks correct.  The
> only thing I didn't check was whether the normalization was done on
> the correct position in the correlation.


Thanks for checking. What do you mean by "correct position in the correlation"?

 

>> Padding *the input* sounds like a good idea, but I think it should be done
>> in such a way that the output is (M, N)---i.e. same size as the input image.
>> I'm not a huge fan of the Matlab output size (but maybe some people expect
>> this output?).
>
> If the user wants to pad the input, they should probably do that
> themselves beforehand.  Half-template matches feel sketchy (and where
> do you stop: when at least half the template matches?).  Also, with
> the new proposed padding in numpy, this should become a lot easier.

Hmm, I was under the impression that half-template matches would be appropriately penalized.

Say we have a template of a face: If it perfectly matches a patch of an image, the result will have a match value of 1 at that location. Now, if half the face matches perfectly but the rest is garbage, then the result might get a value of 0.5 (I'm not sure what the value would actually be). To me, it doesn't really matter whether the face is half-obstructed by a shadow (but still fully in the image), or if it's half cropped by the image edge; they should both have values around 0.5. Unfortunately, the exact score would depend on what value was used to pad the image (Mike's suggestion of the average seems preferable, but nothing's going to be perfect).
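That intuition is easy to sanity-check in isolation: at a single location the NCC score reduces to the Pearson correlation between the template and the image patch, so a half-garbage template scores well below 1. A toy sketch (not the skimage code):

```python
import numpy as np

rng = np.random.default_rng(42)
patch = rng.random(100)              # the image patch at the true match

full_template = patch.copy()         # matches perfectly
half_template = patch.copy()
half_template[50:] = rng.random(50)  # second half replaced by garbage

def ncc_score(a, b):
    # Normalised cross-correlation at one offset = Pearson correlation.
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

full_score = ncc_score(patch, full_template)   # 1.0 up to round-off
half_score = ncc_score(patch, half_template)   # clearly penalised

assert abs(full_score - 1.0) < 1e-12 and half_score < 0.9
```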

I agree that it'd be better for users to take care of the padding themselves, but I think it's a common-enough use-case that we should make it easy to pad with the image average (although turn it off by default).

-Tony
 

> Stéfan

Stéfan van der Walt

Mar 27, 2012, 1:04:41 AM3/27/12
to scikit...@googlegroups.com
On Mon, Mar 26, 2012 at 6:36 PM, Tony Yu <tsy...@gmail.com> wrote:
> Thanks for checking. What do you mean by "correct position in the
> correlation"?

It looked like you took the result of np.correlate, and then
normalized those coefficients. I wasn't sure (didn't verify) whether
the normalization was computed for the correct position in the
np.correlate result.

> Hmm, I was under the impression that half-template matches would be
> appropriately penalized.

That explanation seems reasonable--such behaviour seems totally sane.

Stéfan

Tony Yu

Mar 30, 2012, 9:55:20 AM3/30/12
to scikit...@googlegroups.com


2012/3/27 Stéfan van der Walt <ste...@sun.ac.za>

> On Mon, Mar 26, 2012 at 6:36 PM, Tony Yu <tsy...@gmail.com> wrote:
>> Thanks for checking. What do you mean by "correct position in the
>> correlation"?
>
> It looked like you took the result of np.correlate, and then
> normalized those coefficients.  I wasn't sure (didn't verify) whether
> the normalization was computed for the correct position in the
> np.correlate result.

Yes, you're correct---not sure why I was confused before.


>> Hmm, I was under the impression that half-template matches would be
>> appropriately penalized.
>
> That explanation seems reasonable--such behaviour seems totally sane.
>
> Stéfan


I added some commits to the template PR. I changed the interface: `pad_output` is replaced by `pad_input`, which defaults to False. When True, the input image is padded with a border that is half the template dimensions on each side, and the padding value is the image mean, as Mike suggested. Note that this function still does not reproduce Matlab's output size; no one made the case for it, so I left it out (and, as Stefan pointed out, the user can manually pad the input if such a result is desired). Are people happy with this interface?
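For reference, usage with the new flag would look something like this (a sketch against the PR; the import path is assumed to be skimage.feature):

```python
import numpy as np
from skimage.feature import match_template

image = np.zeros((50, 50))
image[20:28, 30:38] = np.eye(8)   # a distinctive 8x8 pattern
template = image[20:28, 30:38]

# Default (pad_input=False): the result is (M - m + 1, N - n + 1) and
# the peak marks the top-left corner of the best match.
result = match_template(image, template)
ij = np.unravel_index(np.argmax(result), result.shape)
assert ij == (20, 30)

# pad_input=True: the image is padded with its mean by half the template
# size on each side; the result is the same size as the image, and the
# peak marks the centre of the match instead.
padded_result = match_template(image, template, pad_input=True)
assert padded_result.shape == image.shape
```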

I added a simpler example plot to the gallery, but there should probably only be a single example. Personally, I prefer the simpler plot. (The more complex plot, however, is more "realistic" since it has noise and uses a peak detector.) Any objections to removing the complex plot?

-Tony

Tony Yu

Mar 30, 2012, 10:27:15 AM3/30/12
to scikit...@googlegroups.com
Actually, I decided I like this third example better. (I'm just trying to see how many commits I can get into this PR before it's closed.)

-Tony