Hello, we are developing loop fusion for "reshape". In our case, we transform
for i in range(80):
    s0: a[i] = input[i]
for i2 in range(5):
    for j2 in range(2):
        for k2 in range(8):
            s1: b[i2, j2, k2] = a[k2 + j2*8 + i2*16]
to
for i2 in range(5):
    for j2 in range(2):
        for k2 in range(8):
            a[k2 + j2*8 + i2*16] = input[k2 + j2*8 + i2*16]
            b[i2, j2, k2] = a[k2 + j2*8 + i2*16]
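To make the intended transformation concrete, here is a minimal sketch in plain Python (not isl) that runs both loop nests and checks that they produce the same a and b. The function names unfused and fused and the input values are only illustrative assumptions.

```python
def unfused(inp):
    """Original form: copy loop (s0) followed by the reshape loop (s1)."""
    a = [0] * 80
    for i in range(80):
        a[i] = inp[i]                                  # s0
    b = [[[0] * 8 for _ in range(2)] for _ in range(5)]
    for i2 in range(5):
        for j2 in range(2):
            for k2 in range(8):
                b[i2][j2][k2] = a[k2 + j2 * 8 + i2 * 16]  # s1
    return a, b


def fused(inp):
    """Fused form: s0 and s1 share the [i2, j2, k2] loop nest."""
    a = [0] * 80
    b = [[[0] * 8 for _ in range(2)] for _ in range(5)]
    for i2 in range(5):
        for j2 in range(2):
            for k2 in range(8):
                idx = k2 + j2 * 8 + i2 * 16
                a[idx] = inp[idx]                      # s0, re-indexed
                b[i2][j2][k2] = a[idx]                 # s1
    return a, b


# Made-up input values, used only to compare the two versions.
data = list(range(100, 180))
results_match = unfused(data) == fused(data)
```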
The loop fusion has been done without isl. Following that, we utilized isl for loop tiling.
Now the schedule dimension is either [i] or [i2, j2, k2], which gives two choices:
1. The dim is [i]. We first tile i into i_0, i_1, i_2 such that i_0 = i2, i_1 = j2, i_2 = k2, and then tile the resulting axes i_0, i_1, i_2 further into smaller tiles.
2. The dim is [i2, j2, k2], and we directly tile the axes i2, j2, k2 into smaller tiles.
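The two choices can be sketched in plain Python (not isl) as follows. We assume that "tiling i into i_0, i_1, i_2" means the split i = 16*i_0 + 8*i_1 + i_2. The helper names delinearize and tile and the tile sizes are hypothetical. The sketch only checks that both routes assign each s0 instance the same tiled coordinates as the s1 instance that reads the same element of a.

```python
from itertools import product


def delinearize(i):
    """Split the flat index i into (i2, j2, k2) with i = 16*i2 + 8*j2 + k2."""
    i2, rem = divmod(i, 16)
    j2, k2 = divmod(rem, 8)
    return i2, j2, k2


def tile(point, sizes):
    """Return the tile coordinates followed by the intra-tile coordinates."""
    outer = tuple(p // s for p, s in zip(point, sizes))
    inner = tuple(p % s for p, s in zip(point, sizes))
    return outer + inner


TILE_SIZES = (1, 2, 4)  # hypothetical tile sizes for (i2, j2, k2)

# Choice 1: s0 is indexed by i, split into three axes first, then tiled.
sched_s0 = {i: tile(delinearize(i), TILE_SIZES) for i in range(80)}

# Choice 2: s1 is already indexed by (i2, j2, k2) and is tiled directly.
sched_s1 = {
    p: tile(p, TILE_SIZES)
    for p in product(range(5), range(2), range(8))
}

# Both statements should land on the same tiled point for the same a[i].
schedules_agree = all(
    sched_s0[16 * i2 + 8 * j2 + k2] == sched_s1[(i2, j2, k2)]
    for (i2, j2, k2) in sched_s1
)
```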
In the 1st case, the domain is supposed to be represented as:
domain: {s0[i]: 0 <= i <= 79;
s1[i2, j2, k2]: 0 <= i2 <= 4 and 0 <= j2 <= 1 and 0 <= k2 <= 7;}
In the 2nd case, the domain is supposed to be represented as:
domain: {s0[i]: i = k2 + j2*8 + i2*16 and 0 <= i2 <= 4 and 0 <= j2 <= 1 and 0 <= k2 <= 7;
s1[i2, j2, k2]: 0 <= i2 <= 4 and 0 <= j2 <= 1 and 0 <= k2 <= 7;}
In the 2nd case, the equality constraint is added with isl_constraint_alloc_equality.
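As a sanity check on the two domain descriptions, the following plain-Python sketch (not isl) enumerates the points of s0 in both cases. It assumes that i2, j2, k2 in the 2nd-case s0 domain are existentially quantified and that the last bound applies to k2. The variable names are illustrative only.

```python
from itertools import product

# 1st case: s0[i] with 0 <= i <= 79.
s0_case1 = set(range(80))

# s1[i2, j2, k2] with 0 <= i2 <= 4, 0 <= j2 <= 1, 0 <= k2 <= 7.
s1_points = set(product(range(5), range(2), range(8)))

# 2nd case: s0[i] such that i = k2 + 8*j2 + 16*i2 for some point of s1.
s0_case2 = {k2 + 8 * j2 + 16 * i2 for (i2, j2, k2) in s1_points}

# Both descriptions of s0 cover the same 80 points.
same_points = s0_case1 == s0_case2

# No two (i2, j2, k2) points collapse onto the same i, so the
# correspondence between s0 and s1 instances is one-to-one.
one_to_one = len(s0_case2) == len(s1_points) == 80
```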
Our first question: is the first choice feasible, and is its representation correct, given that the scheduling dims [i2, j2, k2] are redundant? If not, could you give us some advice on how to support this method?
Our second question: is the 2nd choice more appropriate for this problem? If neither method is feasible, could you provide some hints on how to solve our problem?