representations for "reshape"

47 views
Skip to first unread message

Webb Xu

unread,
Jan 2, 2025, 11:46:30 AMJan 2
to isl Development
Hello, we are developing the fusion for "reshape", the case is 
for i in range 80:
   s0: a[i] = input[i]
for i2 in range 5:
  for j2 in range 2:
    for k2 in range 8:
        s1: b[i2, j2, k2] = a[k2 + j2*8 + i2* 16]
to
for i2 in range 5:
  for j2 in range 2:
    for k2 in range 8:
        a[k2 + j2*8 + i2* 16] = input[k2 + j2*8 + i2* 16]
        b[i2, j2, k2] = a[k2 + j2*8 + i2* 16]
The loop fusion has been done without isl. Following that, we utilized isl for loop tiling.
Now the dim of scheduling is either [i] or [i2, j2, k2]. There are two choices:
1. The dim is i, and we first tile the i to i_0, i_1, i_2 such that i_0 = i2, j_0 = j2, k_0 = k2. And then we continue to tile the tiled axis  i_0, i_1, i_2 to smaller tiling.

2. The dim is [i2, j2, k2], and we directly tile the tiled axis   i2  ,  j2  ,  k2  to smaller tiling.

In the 1st case, the domain is supposed to be represented as:
domain: {s0[i]:  0<= i <= 79; 
                s1[ i2, j2, k2]:  0<= i2 <= 4 and 0<= j2 <= 1 and 0<= i2 <= 7 ;}
In the 2nd case, the domain is supposed to be represented as:
domain: {s0[ k2 + j2*8 + i2* 16  ]:  0<= i2 <= 4 and 0<= j2 <= 1 and 0<= i2 <= 7; 
                s1[i2, j2, k2]:  0<= i2 <= 4 and 0<= j2 <= 1 and 0<= i2 <= 7 ;}
In the 2nd cases, the constraints are applied with isl_constraint_alloc_equality.

The first question is that we are wondering if the first choice is feasible and the representation is correct because the scheduling dims [i2, j2, k2] are redundant. If not, could you give us some advice to support such method.
The second question is that is the 2nd choice more appropriate for this problem? If neither method is feasible, could you provide some hints to solve our problem?

Sven Verdoolaege

unread,
Jan 2, 2025, 12:23:23 PMJan 2
to Webb Xu, isl Development
On Wed, Jan 01, 2025 at 07:34:29PM -0800, Webb Xu wrote:
> Hello, we are developing the fusion for "reshape", the case is
> for i in range 80:
> s0: a[i] = input[i]
> for i2 in range 5:
> for j2 in range 2:
> for k2 in range 8:
> s1: b[i2, j2, k2] = a[k2 + j2*8 + i2* 16]
> to
> for i2 in range 5:
> for j2 in range 2:
> for k2 in range 8:
> a[k2 + j2*8 + i2* 16] = input[k2 + j2*8 + i2* 16]
> b[i2, j2, k2] = a[k2 + j2*8 + i2* 16]
> The loop fusion has been done without isl. Following that, we utilized isl
> for loop tiling.

How did you perform this fusion?
In particular, how do you represent it?

> Now the dim of scheduling is either [i] or [i2, j2, k2]. There are two
> choices:

You lost me here.

If you are tiling after the fusion, then there is only
a single 3D object to tile.

For example, if you represent the fusion using
the schedule tree

domain: "{ s0[0:79]; s1[0:4,0:1,0:7] }"
child:
schedule: "B[ \
{s0[i] -> [i//16]; s1[i,j,k] -> [i]}, \
{s0[i] -> [(i//8)%2]; s1[i,j,k] -> [j]}, \
{s0[i] -> [i%8]; s1[i,j,k] -> [k]} \
]"
child:
sequence:
- filter: "{ s0[*] }"
- filter: "{ s1[*,*,*] }"

then there is a single B-band that needs to be tiled.
For that, you can just navigate to the node and then
call isl_schedule_node_band_tile.

skimo

Webb Xu

unread,
Jan 2, 2025, 9:28:48 PMJan 2
to isl Development
Sorry, maybe I misled you. I have fused the "reshape" and its predominant elementwise for i2 in range 5: for j2 in range 2: for k2 in range 8: a[k2 + j2*8 + i2* 16] = input[k2 + j2*8 + i2* 16] b[i2, j2, k2] = a[k2 + j2*8 + i2* 16] The loop fusion has been done without isl. Following that, we utilized isl for loop tiling. The problem is how to express the schedule space. We are not sure that if the dims are [i, i2, j2, k2] or [i] or [i2, j2, k2]. We adopt the scheduling as [i] or [i2, j2, k2] because they can either be expressed with the other. The 2 cases are listed as following:
1. The dim is i, the domain is supposed to be represented as:
domain: {s0[i]: 0<= i <= 79;
s1[ i2, j2, k2]: 0<= i2 <= 4 and 0<= j2 <= 1 and 0<= i2 <= 7 ;}

2. The dim is [i2, j2, k2], the domain is supposed to be represented as:
domain: {s0[i ]: i = k2 + j2*8 + i2* 16 and 0<= i2 <= 4 and 0<= j2 <= 1 and 0<= i2 <= 7;
s1[i2, j2, k2]: 0<= i2 <= 4 and 0<= j2 <= 1 and 0<= i2 <= 7 ;}

If we take the second case as the example, the code reads:
  isl_ctx* ctx = isl_ctx_alloc();
  isl_set* cur_set;

  isl_val* i2_range = isl_val_int_from_si(ctx, 5);
  isl_val* j2_range = isl_val_int_from_si(ctx, 2);
  isl_val* k2_range = isl_val_int_from_si(ctx, 8);

  // exp1 space
  isl_space* exp1_space = isl_space_alloc(ctx, 0, 3, 1);
  isl_space_set_tuple_name(exp1_space, isl_dim_set, "exp1");
  isl_space_set_tuple_name(exp1_space, isl_dim_set, "exp1");

  VLOG_APEX_WARNING << "exp1_space:" << isl_space_to_str(exp1_space);
  isl_map* exp1_map = isl_map_universe(exp1_space);

  // i - (j2_range * k2_range * i2 + j2 * k2_range + k2) = 0
  isl_constraint* eq = isl_constraint_alloc_equality(isl_local_space_from_space(isl_map_get_space(exp1_map)));
  eq = isl_constraint_set_coefficient_si(eq, isl_dim_out, 0, 1);
// -j2_range * k2_range * i2 isl_constraint_set_coefficient_si(eq, isl_dim_in, 0, -16);
  // -k2_range * j2
  isl_constraint_set_coefficient_si(eq, isl_dim_in, 1, -8);
  // -k2
  eq = isl_constraint_set_coefficient_si(eq, isl_dim_in, 2, -1);
  // exp1_map constrain
  VLOG_APEX_WARNING << "isl_constraint:" << isl_constraint_list_to_str(isl_constraint_to_list(eq));
  exp1_map = isl_map_add_constraint(exp1_map, eq);
  VLOG_APEX_WARNING << "isl map:" << isl_map_to_str(exp1_map);

  isl_set* exp1_set = isl_map_domain(exp1_map);
  exp1_set = isl_set_lower_bound_si(exp1_set, isl_dim_set, 0, 0);
  exp1_set = isl_set_upper_bound_val(exp1_set, isl_dim_set, 0, isl_val_copy(i2_range));
  exp1_set = isl_set_lower_bound_si(exp1_set, isl_dim_set, 1, 0);
  exp1_set = isl_set_upper_bound_val(exp1_set, isl_dim_set, 1, isl_val_copy(j2_range));
  exp1_set = isl_set_lower_bound_si(exp1_set, isl_dim_set, 2, 0);
  exp1_set = isl_set_upper_bound_val(exp1_set, isl_dim_set, 2, isl_val_copy(k2_range));
  VLOG_APEX_WARNING << "exp1_set: " << isl_set_to_str(exp1_set);

  // reshape space
  isl_space* reshape_space = isl_space_set_alloc(ctx, 0, 3);
  isl_space_set_tuple_name(reshape_space, isl_dim_set, "reshape");
  isl_set* reshape_set = isl_set_universe(reshape_space);
  reshape_set = isl_set_lower_bound_si(reshape_set, isl_dim_set, 0, 0);
  reshape_set = isl_set_upper_bound_val(reshape_set, isl_dim_set, 0, isl_val_copy(i2_range));
  reshape_set = isl_set_lower_bound_si(reshape_set, isl_dim_set, 1, 0);
  reshape_set = isl_set_upper_bound_val(reshape_set, isl_dim_set, 1, isl_val_copy(j2_range));
  reshape_set = isl_set_lower_bound_si(reshape_set, isl_dim_set, 2, 0);
  reshape_set = isl_set_upper_bound_val(reshape_set, isl_dim_set, 2, isl_val_copy(k2_range));
  VLOG_APEX_WARNING << "reshape_set:" << isl_set_to_str(reshape_set);

  isl_union_set* domain = isl_union_set_from_set(exp1_set);
  domain = isl_union_set_union(domain, isl_union_set_from_set(reshape_set));
  // sch tree
  isl_schedule* schedule = isl_schedule_from_domain(domain);
The code yields: domain: "{  [i2, j2, k2] : 0 <= i2 <= 4and 0 <= j2 <= 1 and 0<=k2<=7;  reshape[i2, j2, k2] : 0 <= i2 <= 4and 0 <= j2 <= 1 and 0<=k2<=7}". The constraints are missing and space name of exp1 disappears. Is our idea and the schedule correct? Note that we would like to place exp1 and reshape spaces within the same schedule dims such that they could be lying in the same nested loops. 

Sven Verdoolaege

unread,
Jan 5, 2025, 3:52:13 PMJan 5
to Webb Xu, isl Development
On Thu, Jan 02, 2025 at 06:28:47PM -0800, Webb Xu wrote:
> Sorry, maybe I misled you. I have fused the "reshape" and its predominant
> elementwise for i2 in range 5: for j2 in range 2: for k2 in range 8: a[k2 +
> j2*8 + i2* 16] = input[k2 + j2*8 + i2* 16] b[i2, j2, k2] = a[k2 + j2*8 +
> i2* 16] The loop fusion has been done without isl. Following that, we
> utilized isl for loop tiling. The problem is how to express the schedule
> space. We are not sure that if the dims are [i, i2, j2, k2] or [i] or [i2,
> j2, k2].

If you expect the generated code to consist of three loops,
then you should use a 3-dimensional schedule space.
That's exactly what I showed before. Here it is again:

domain: "{ s0[0:79]; s1[0:4,0:1,0:7] }"
child:
schedule: "B[ \
{s0[i] -> [i//16]; s1[i,j,k] -> [i]}, \
{s0[i] -> [(i//8)%2]; s1[i,j,k] -> [j]}, \
{s0[i] -> [i%8]; s1[i,j,k] -> [k]} \
]"
child:
sequence:
- filter: "{ s0[*] }"
- filter: "{ s1[*,*,*] }"


If you prefer a map representation, then you should go for
something like

{ s0[i=0:79] -> B[i//16, (i//8)%2, i%8, 0];
s1[i=0:4,j=0:1,k=0:7] -> B[i, j, k, 1] }

However, if you intend to perform tiling, then a schedule tree
representation is more convenient.

See also Section 5.6 Schedule of
"Presburger Formulas and Polyhedral Compilation"
for more information about schedules.

> We adopt the scheduling as [i] or [i2, j2, k2] because they can
> either be expressed with the other. The 2 cases are listed as following:
> 1. The dim is i, the domain is supposed to be represented as:
> domain: {s0[i]: 0<= i <= 79;
> s1[ i2, j2, k2]: 0<= i2 <= 4 and 0<= j2 <= 1 and 0<= i2 <= 7 ;}
>
> 2. The dim is [i2, j2, k2], the domain is supposed to be represented as:
> domain: {s0[i ]: i = k2 + j2*8 + i2* 16 and 0<= i2 <= 4 and 0<= j2 <= 1 and
> 0<= i2 <= 7;
> s1[i2, j2, k2]: 0<= i2 <= 4 and 0<= j2 <= 1 and 0<= i2 <= 7 ;}

This is not a valid object since you didn't define k2, j2, and i2
in the first disjunct.
In any case, you shouldn't modify the domain;
you should define a schedule in terms of the original domain.

> If we take the second case as the example, the code reads:
> isl_ctx* ctx = isl_ctx_alloc();
> isl_set* cur_set;
>
> isl_val* i2_range = isl_val_int_from_si(ctx, 5);
> isl_val* j2_range = isl_val_int_from_si(ctx, 2);
> isl_val* k2_range = isl_val_int_from_si(ctx, 8);
>
> // exp1 space
> isl_space* exp1_space = isl_space_alloc(ctx, 0, 3, 1);
> isl_space_set_tuple_name(exp1_space, isl_dim_set, "exp1");
> isl_space_set_tuple_name(exp1_space, isl_dim_set, "exp1");

The first call to isl_space_set_tuple_name consumes exp1_space,
so you cannot pass the same value to the second call (or anywhere else).
At the same time, you should keep track of the return value
of the call(s) so that you can use or free the result later.
(It may happen that the return value is equal to the argument,
but this is by no means guaranteed.)
Also, for a map space, you should use isl_dim_in or isl_dim_out,
not isl_dim_set.
Finally, you are setting the same tuple name twice.

> // sch tree
> isl_schedule* schedule = isl_schedule_from_domain(domain);

You are creating an empty schedule tree here.
This doesn't define any order.

> The code yields: domain: "{ [i2, j2, k2] : 0 <= i2 <= 4and 0 <= j2 <= 1
> and 0<=k2<=7; reshape[i2, j2, k2] : 0 <= i2 <= 4and 0 <= j2 <= 1 and
> 0<=k2<=7}". The constraints are missing and space name of exp1 disappears.

You projected out the named tuple, so it makes sense
the name diappears.
It's not clear which constraints are supposed to be missing.

> Is our idea and the schedule correct?

No.

> Note that we would like to place exp1
> and reshape spaces within the same schedule dims such that they could be
> lying in the same nested loops.

For that, you should define a schedule as explained above.

skimo

Webb Xu

unread,
Apr 19, 2025, 11:38:28 PMApr 19
to isl Development
Thank you. That helps a lot.
Reply all
Reply to author
Forward
0 new messages