'gfor' dim_t parameter variation within loops / PARTIAL unwrap() function

Jack Wells

Mar 11, 2022, 1:27:07 PM
to ArrayFire Users
Hi all, many thanks in advance to anyone who can help me solve this issue.
I have recently started using ArrayFire and have found it to be fantastic. However, I have hit a problem that I have been unable to solve using the documentation or by searching online.

To put it simply, the function I need is a reduced version of the unwrap() function (which produces a matrix whose columns are all of the {potentially overlapping} flattened 2D windows of an image). However, the data I am working with may be huge, and attempting the full unwrap() crashes the program due to memory limitations. Even if it could be achieved, it would be very inefficient, as I only want access to a subset of these "patches" at any given time.

What I really want is a function that is effectively...
array unwrap(array& in, seq desired_indexes, .... );
where "desired_indexes" determines which of the columns from the theoretical "full" unwrapping are generated.
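
To make the intended semantics concrete, here is a minimal plain C++ sketch (no ArrayFire calls) of how a requested column index could be mapped to the memory offset of its window. The helper name windowOffsets is hypothetical, and the sketch assumes a 2D column-major image of d0 x d1 elements, no padding, and windows enumerated along the first dimension fastest. I have not checked that this ordering matches what unwrap() actually produces, so treat it as an illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an ArrayFire function): maps column indices of the
// theoretical "full" unwrap to the linear offset of each window's first element.
// Assumptions: column-major d0 x d1 image, no padding, windows enumerated
// along the first dimension fastest.
std::vector<std::size_t> windowOffsets(std::size_t d0, std::size_t d1,
                                       std::size_t wx, std::size_t wy,
                                       std::size_t sx, std::size_t sy,
                                       const std::vector<std::size_t>& columns) {
    assert(wx >= 1 && wy >= 1 && sx >= 1 && sy >= 1);
    assert(wx <= d0 && wy <= d1);
    const std::size_t nx = (d0 - wx) / sx + 1;  // window count along dim 0
    const std::size_t ny = (d1 - wy) / sy + 1;  // window count along dim 1
    std::vector<std::size_t> offsets;
    offsets.reserve(columns.size());
    for (std::size_t j : columns) {
        assert(j < nx * ny);             // column must exist in the full unwrap
        const std::size_t ix = j % nx;   // window position along dim 0
        const std::size_t iy = j / nx;   // window position along dim 1
        offsets.push_back(ix * sx + iy * sy * d0);
    }
    return offsets;
}
```

For a 4 x 4 image with 2 x 2 windows and unit strides there are 9 windows in total, and requesting columns {0, 1, 3, 8} would touch only those four.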

At first, I thought this could be achieved with GFOR, such as...

// Some Device pointer
int* input_ptr = (some af::array).device<T>();
// Set stride and shape
dim4 shape(h, w, c);        // dimensions of "patch"
dim4 strides(sh, sw, sc); // stride of input array

int batch_size = 5; // Trivially small batch size as an example
dtype typeX(u32);
array X(dim4(h*w*c, batch_size), typeX); // Generating the column matrix

gfor(seq ii, batch_size)
    X(span, ii) = af::flat(af::createStridedArray(input_ptr, OFFSET, shape, strides, typeX, af::source::afDevice));

However, I cannot see a way of providing a different OFFSET value to each of the gfor iterations. If OFFSET is replaced by, say, 0, the function runs perfectly well, but all of the columns are filled with the first patch, which is clearly not the desired behaviour.

The offset could easily be calculated, and is in fact already stored in an array such that offsets(ii) would contain the correct value, but there seems to be no way of using the vectorised 'ii' seq index in place of the dim_t argument of createStridedArray().

Does anyone have any suggestions on how to implement this efficiently in parallel? In practice, batch_size will be very large, so this really needs to be parallelised/vectorised in some way.
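
For reference, the result I am after can be written as a pure gather: every element of X depends only on its row k, its column j, and offsets[j]. Below is a minimal sequential C++ sketch of that formulation (plain std::vector, no ArrayFire calls). The name gatherPatches is hypothetical, and h, w, c, sh, sw, sc have the same meaning as in the snippet above, with strides counted in elements. Since each output element is independent, I assume this could be mapped onto a single indexed lookup with a precomputed index array rather than a per-iteration dim_t, but I have not verified how that would perform.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference helper: builds the column-major matrix X of size
// (h*w*c) x offsets.size(), where column j is the flattened patch that starts
// at linear position offsets[j] of the input buffer.
template <typename T>
std::vector<T> gatherPatches(const std::vector<T>& in,
                             const std::vector<std::size_t>& offsets,
                             std::size_t h, std::size_t w, std::size_t c,
                             std::size_t sh, std::size_t sw, std::size_t sc) {
    const std::size_t patchLen = h * w * c;
    std::vector<T> X(patchLen * offsets.size());
    for (std::size_t j = 0; j < offsets.size(); ++j) {
        for (std::size_t k = 0; k < patchLen; ++k) {
            // Decompose the flat row index k into patch coordinates (a, b, d).
            const std::size_t a = k % h;
            const std::size_t b = (k / h) % w;
            const std::size_t d = k / (h * w);
            // Source position: per-column offset plus the strided coordinates.
            const std::size_t src = offsets[j] + a * sh + b * sw + d * sc;
            assert(src < in.size());
            X[k + j * patchLen] = in[src];
        }
    }
    return X;
}
```

The two loops carry no dependency between iterations, which is why the same index arithmetic (src as a function of k and j) could in principle be evaluated for all elements at once.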
