41 views

Skip to first unread message

Feb 22, 2020, 2:12:47 PM2/22/20

to reikna

Hi Bogdan,

I am trying to implement a phase correlation function using reikna (more info available under method section of this wikipedia article). I was hoping that you might be able to give me a few tips as I'm getting started in order to avoid making silly decisions on program structure and usage of the reikna library.

For starters, one part of this task is to take the complex conjugate. I was thinking I'd use the split_complex transformation on the input, multiply the complex component by -1, and then recombine the previously split data using the combine_complex transformation. Is there a better way to do this?

I was thinking it may be most efficient and clean to do the whole operation (shown in below pictures), in one computation with several transformations connected, do you think this is a good strategy?

The phase correlation process also requires an FFT and IFFT, but these operations are both straightforward in the reikna library (although I'm unsure if these parts should also be connected to the previously discussed math).

One last thing: is there any way to efficiently compute the maximum using the reikna (or PyOpenCL) interface (I searched but wasn't able to find much of anything)?

Thank you for all your work here on the forum and on reikna itself!

Best,

Mike

Feb 22, 2020, 2:21:56 PM2/22/20

to reikna

Oops, didn't realize how badly the math images displayed there. Here is an image of the math I was referencing

Feb 22, 2020, 8:24:38 PM2/22/20

to reikna

Hi Michael,

> For starters, one part of this task is to take the complex conjugate. I was thinking I'd use the split_complex transformation on the input, multiply the complex component by -1, and then recombine the previously split data using the combine_complex transformation. Is there a better way to do this?

Yes, you could use http://reikna.publicfields.net/en/latest/api/cluda.html#reikna.cluda.functions.conj , or just implement it yourself as a part of the rest of the calculation.

> I was thinking it may be most efficient and clean to do the whole operation (shown in below pictures), in one computation with several transformations connected, do you think this is a good strategy?

I don't think you can do it with one kernel call, but just two are possible - if you can put both images in the same array. Then your first computation will be the FFT with batch 2, and in the output transformation you will calculate Ga_jk * Gb_jk^* / |Ga_jk * Gb_jk^*| (that's why I said you might want to implement `conj` yourself - trying to use it here will only complicate things). This transformation will halve the size of the output array, producing R_jk. The second computation will calculate the IFFT.

If you cannot keep both images in the same array, then you will have to do three kernel calls - FFT on the first one, then FFT on the second one with postprocessing, using the results from the first FFT. And the third one is IFFT.

> One last thing: is there any way to efficiently compute the maximum using the reikna (or PyOpenCL) interface (I searched but wasn't able to find much of anything)?

Yes, you need to use Reduce with the custom predicate (which will be a function (x, y) -> max(x, y)). The example https://github.com/fjarri/reikna/blob/develop/examples/demo_struct_reduce.py shows how to calculate the minimum and the maximum simultaneously. If you just need one, you can avoid using a custom struct type.

Feb 23, 2020, 11:11:03 AM2/23/20

to reikna

Ok, thanks for all the info!

Just one question, how would you recommend taking the absolute value of the denominator here? I couldn't seem to find any built in transformation which accomplishes this, and am unsure how I may implement this myself.

Apologies if I've missed something in my search and am asking questions answered elsewhere. Thanks again for all the help.

Best,

Mike

Feb 23, 2020, 12:17:30 PM2/23/20

to reikna

In reference to the absolute value question, I think it should be sufficient to use the pyopencl-complex.h functions for this, although I'm unsure if the performance will be degraded compared to a fully reikna-based implementation. Thanks again!

Feb 23, 2020, 6:06:10 PM2/23/20

to reikna

> Just one question, how would you recommend taking the absolute value of the denominator here? I couldn't seem to find any built in transformation which accomplishes this, and am unsure how I may implement this myself.

There's http://reikna.publicfields.net/en/latest/api/cluda.html#reikna.cluda.functions.norm . But just like with `conj()`, you can just as well write `sqrt(v.x * v.x + v.y * v.y) in your code. The function in pyopencl-complex will do the same, so I don't expect any performance difference.

Feb 25, 2020, 1:47:51 PM2/25/20

to reikna

That makes sense, thank you for all the help!

Best,

Mike

Feb 25, 2020, 2:50:20 PM2/25/20

to reikna

I've just realized that instead of taking the maximum with a reduction kernel, I need to take the maximum *argument. *Do you have any tips on how I may adapt the example you mentioned (https://github.com/fjarri/reikna/blob/develop/examples/demo_struct_reduce.py) to accomplish this? It seems that all that I would need to change is to store the index (i,j) of the max each time instead of the max itself. However, my inexperience in working with the reikna backend and mako leaves me at a loss on how I might do this.

Feb 26, 2020, 4:37:07 PM2/26/20

to reikna

It is a bit trickier. Indices are available in transformations by their names from the array `idxs` (a python array, so it can be used in a template). The problem is that the predicate object is a Snippet, and only gets the values to perform operations on. A way to do it would be:

1) Attach an output transformation mapping value -> struct(value, index)

2) Perform the reduction on that output, comparing .value

3) Extract .index from the result.

Tell me if you have problems with implementing that.

Mar 2, 2020, 10:59:07 AM3/2/20

to reikna

Thanks! I appreciate the tips! I haven't approached this portion of the implementation yet since I'm mostly focused on getting all my functions working at a reasonable speed ASAP (and for argmax a jitted numba function serves my purposes fine for now).

So, one last (hopefully) question I have: what is the most efficient method for taking a sum over a particular axis of some array? It seems like I should be able to use a builtin PyOpenCL array function but PyOpenCL's array sum function has an argument 'slice' which I don't understand and can't find any writeup on. Is there a reikna method which behaves similarly to the numpy.sum function or perhaps a way to use the PyOpenCL function in this way?

I've tried writing a custom kernel in PyOpenCL but can't seem to figure out the indexing in this case.

Best,

Mike

Mar 12, 2020, 2:10:13 AM3/12/20

to reikna

Hi Michael,

Sorry for a delayed reply, I got caught up in work and completely forgot about this discussion.

Reikna's Reduce allows you to specify axes over which to reduce. Note though that reducing over the innermost axes is much faster than over arbitrary ones, so a combination of Transpose (target axes to the innermost ones) + Reduce may turn out faster than just Reduce over non-innermost axes.

Reply all

Reply to author

Forward

0 new messages

Search

Clear search

Close search

Google apps

Main menu