slicing in computation arguments


May 6, 2019, 11:51:45 PM
to reikna

I was trying to use the FFT computation with Transformations (though that is not important here). As an input argument, however, I passed not a regular Array (class Array(gpuarray.GPUArray)) but a slice of one:

fft = FFT(Type(self.data_dtype, shape=(sats_num, dopp_num, ca_samples)), axes=(2,))
fft.parameter.input.connect(prev_tr, prev_tr.out, indata=prev_tr.indata, f=prev_tr.f, phases=prev_tr.phases,
epochs_per_coh=prev_tr.epochs_per_coh, coh_starts=prev_tr.coh_starts, exp_lut_data=prev_tr.exp_lut_data)
fft.parameter.output.connect(post_tr, post_tr.fftout, result=post_tr.result, prn=post_tr.prn)
fftc = fft.compile(self.thr, compiler_options=nvcc_opts)

fftc(self._inter_res_dev, self._prn_dev, self._samples_dev, self._f_dev[non_coh_i],
self._phases_dev[non_coh_i], self._epochs_per_coh_dev, self._starts_dev[non_coh_i],

You can see that some arguments of the fftc call are slices. But I realized that my CUDA code receives the same address for this data regardless of the slice, which means the slicing is not taken into account. In Array::__getitem__(self, index) everything looks correct: a new array is created with new strides and offset, but the same base_data.
But then in Kernel::prepared_call(self, *args) everything besides base_data is ignored, so the slicing plays no role here.
Is that done intentionally?
My idea was to upload the whole data set once and then process it in portions, one by one. I can't process everything at once in parallel, because I need to integrate the portions.

Bogdan Opanchuk

May 7, 2019, 1:02:05 AM
to reikna
Yes, it is intentional. Kernel.prepared_call() is low-level (CLUDA level), it does not know about multidimensional arrays, strides and offsets and only uses the buffer address. Multi-dimensional array addressing is a high-level feature that works in computations. See the note in (I admit, I was not sure where to place it to make it easier to discover). 
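To make the distinction concrete, here is a toy model in plain Python. It is only a sketch and not reikna's actual code: ToyArray, low_level_call and high_level_call are hypothetical names made up for illustration. It shows how a slice can carry a correct offset while a call that receives only the base buffer still starts reading at element 0.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyArray:
    """Hypothetical stand-in for a GPU array: a shared flat buffer plus view metadata."""
    base_data: List[float]  # shared by every view created through slicing
    offset: int             # first element of this view, in elements
    length: int             # number of elements in this view

    def __getitem__(self, index: slice) -> "ToyArray":
        start, stop, step = index.indices(self.length)
        if step != 1:
            raise NotImplementedError("the toy model only supports contiguous slices")
        # The new view reuses the same buffer; only the metadata changes.
        return ToyArray(self.base_data, self.offset + start, max(stop - start, 0))


def low_level_call(arr: ToyArray, n: int) -> List[float]:
    """Mimics a raw kernel launch: only the buffer is passed, the offset is dropped."""
    return arr.base_data[:n]


def high_level_call(arr: ToyArray) -> List[float]:
    """Mimics an offset-aware layer: addressing honors the view metadata."""
    return arr.base_data[arr.offset:arr.offset + arr.length]


data = ToyArray([float(i) for i in range(8)], offset=0, length=8)
chunk = data[4:6]

print(low_level_call(chunk, chunk.length))  # [0.0, 1.0]: reads from the buffer start
print(high_level_call(chunk))               # [4.0, 5.0]: reads the sliced portion
```

In this model the slice itself is fine; the result depends only on which layer consumes the offset.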