Could you let me know if this usage of FFT in reikna works?

zlff...@gmail.com

Feb 13, 2017, 9:32:44 PM

to reikna

I want to know whether the array passed to the FFT constructor can be replaced, when calling the compiled computation, with other arrays of the same dimensions.

import numpy as np
from reikna import cluda
from reikna.fft import FFT

api = cluda.get_api('cuda')
dev = api.get_platforms()[0].get_devices()[0]
thr = api.Thread(dev)

rand = np.random.random((10,10,10))

fft = FFT(rand, axes=(0,))
fftc = fft.compile(thr, fast_math=True)

two = 2 * rand
three = 3 * rand

two_dev = thr.to_device(two)
three_dev = thr.to_device(three)

fftc(two_dev, two_dev)
fftc(three_dev, three_dev)

two = two_dev.get()
three = three_dev.get()

zlff...@gmail.com

Feb 13, 2017, 10:38:40 PM

to reikna

It looks like it is not working. Could you explain why?

Bogdan Opanchuk

Feb 14, 2017, 6:29:41 PM

to reikna

The FFT computation requires a complex array. There are currently no versions optimized for real-valued arrays. You have two ways to proceed:

1. Convert the array to a complex datatype and use FFT normally:

rand = np.random.random((10,10,10)).astype(np.complex128)

and leave the rest of the code as is.

2. Use an input transformation that lets you pass a real-valued array directly. This can be useful if your arrays are large and you do not want to waste memory on a complex copy of the input array (and time actually making the copy). You can either write a custom transformation as shown in https://github.com/fjarri/reikna/blob/develop/examples/demo_real_to_complex_fft.py , or combine two existing ones:

import numpy as np
from reikna import cluda
from reikna.fft import FFT

from reikna.cluda.dtypes import complex_for
from reikna.core import Type
from reikna.transformations import combine_complex, broadcast_const


api = cluda.get_api('cuda')
dev = api.get_platforms()[0].get_devices()[0]
thr = api.Thread(dev)

rand = np.random.random((10, 10, 10))

fft = FFT(Type(complex_for(rand.dtype), rand.shape), axes=(0,))

# combines two real-valued inputs into a complex-valued input of the same shape
cc = combine_complex(fft.parameter.input)
# supplies a constant 0 to be used as the imaginary part
bc = broadcast_const(cc.imag, 0)

fft.parameter.input.connect(cc, cc.output, real_input=cc.real, imag_input=cc.imag)
fft.parameter.imag_input.connect(bc, bc.output)

fftc = fft.compile(thr, fast_math=True)

two = 2 * rand
three = 3 * rand

two_dev = thr.to_device(two)
three_dev = thr.to_device(three)

two_res_dev = thr.empty_like(fft.parameter.output)
three_res_dev = thr.empty_like(fft.parameter.output)

fftc(two_res_dev, two_dev)
fftc(three_res_dev, three_dev)

two_res = two_res_dev.get()
three_res = three_res_dev.get()

assert np.allclose(two_res, np.fft.fftn(two, axes=(0,)))
assert np.allclose(three_res, np.fft.fftn(three, axes=(0,)))

Note that with the second approach, the input and output of the resulting computation have different dtypes (the former is real, the latter is complex), so you cannot use the same array for both. You will have to allocate a separate array for the output.
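
To make the trade-off between the two options concrete, here is a small CPU-only sketch in plain NumPy. It is not reikna code: `np.fft.fftn` stands in for the compiled computation, and the seed and variable names are assumptions chosen for illustration. It checks that an explicit complex copy doubles the storage of the input, that the transform of the copy matches the transform of the real data, and that a real input and a complex output differ in dtype and size and so cannot share one buffer.

```python
import numpy as np

# Fixed seed, chosen only so the sketch is reproducible (an assumption).
rng = np.random.default_rng(0)
real_in = rng.random((10, 10, 10))  # float64, like `rand` in the thread

# Option 1: an explicit complex copy of the input.
complex_in = real_in.astype(np.complex128)
copy_ratio = complex_in.nbytes / real_in.nbytes  # storage cost of the copy

# The transform of the complex copy agrees with the transform of the real data.
res_from_complex = np.fft.fftn(complex_in, axes=(0,))
res_from_real = np.fft.fftn(real_in, axes=(0,))
same_result = np.allclose(res_from_complex, res_from_real)

# Option 2: real input, complex output. The dtypes and byte sizes differ,
# so the output needs its own buffer instead of reusing the input array.
out = np.empty(real_in.shape, dtype=np.complex128)
out[...] = res_from_real

print(real_in.dtype, real_in.nbytes)
print(out.dtype, out.nbytes)
print(copy_ratio, same_result)
```

The same size argument is why the transformation-based version above allocates `two_res_dev` and `three_res_dev` separately from `two_dev` and `three_dev`.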

zlff...@gmail.com

Feb 14, 2017, 11:19:45 PM

to reikna

Thank you, I'm stuck with real input. I will follow your first advice. Thanks.
