On Mon, Mar 20, 2017 at 10:23:14PM +0530, Uday K Bondhugula wrote:
> Now, even with --lastwriter, you'll see transformations like:
>
> T(S1): (l, i, n+j, p+k, m, 4m+n, 20m+4n+p)
>
> The large skews are all spurious and undesirable - this is again a
> side-effect of using small constant trip counts. The problem can be
> circumvented by using parametric bounds as mentioned above, or with
> --coeff-bound=1 (an undocumented option that upper bounds transformation
> coefficients by 1).

FYI, it may be useful to derive bounds on the coefficients
from the size of the domain. That's what I do in isl and
in fact this example helped to refine the strategy.
It's described in Section 7.5, "Loop Coalescing Avoidance",
of "Scheduling for PPCG".

With --no-isl-schedule-treat-coalescing, the resulting schedule
is very similar to the one you show above:

domain: "{ S_0[l, m, n, p, i, j, k] : 0 <= l <= 49 and 0 <= m <= 63 and 0 <= n <= 7 and 0 <= p <= 7 and 0 <= i <= 15 and 0 <= j <= 4 and 0 <= k <= 4 }"
child:
  schedule: "[{ S_0[l, m, n, p, i, j, k] -> [(l)] }, { S_0[l, m, n, p, i, j, k] -> [(n + j)] }, { S_0[l, m, n, p, i, j, k] -> [(p + k)] }, { S_0[l, m, n, p, i, j, k] -> [(i)] }, { S_0[l, m, n, p, i, j, k] -> [(m)] }, { S_0[l, m, n, p, i, j, k] -> [(4m + n)] }, { S_0[l, m, n, p, i, j, k] -> [(20m + 4n + p)] }]"
  permutable: 1
  coincident: [ 1, 1, 1, 1, 0, 0, 0 ]
Without the option, a bound of "2" is derived from the domain size
and the following schedule is produced:
domain: "{ S_0[l, m, n, p, i, j, k] : 0 <= l <= 49 and 0 <= m <= 63 and 0 <= n <= 7 and 0 <= p <= 7 and 0 <= i <= 15 and 0 <= j <= 4 and 0 <= k <= 4 }"
child:
  schedule: "[{ S_0[l, m, n, p, i, j, k] -> [(l)] }, { S_0[l, m, n, p, i, j, k] -> [(n + j)] }, { S_0[l, m, n, p, i, j, k] -> [(p + k)] }, { S_0[l, m, n, p, i, j, k] -> [(i)] }, { S_0[l, m, n, p, i, j, k] -> [(m)] }]"
  permutable: 1
  coincident: [ 1, 1, 1, 1, 0 ]
  child:
    schedule: "[{ S_0[l, m, n, p, i, j, k] -> [(n)] }]"
    child:
      schedule: "[{ S_0[l, m, n, p, i, j, k] -> [(p)] }]"
That is, the last two loops have been split off from the tilable band
due to the bounds on the coefficients.
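
To make the difference between the two schedules concrete, here is a
small Python sketch (not part of isl or PPCG) that writes both as
coefficient matrices over (l, m, n, p, i, j, k). The names SKEWED,
BOUNDED, max_coefficient and determinant are made up for this
illustration, and the nested bands of the second schedule are simply
flattened into one matrix. It only checks that both matrices have the
same determinant, so they order the same iterations in a one-to-one
way, while the largest coefficient drops from 20 to 1.

```python
from fractions import Fraction

VARS = ["l", "m", "n", "p", "i", "j", "k"]


def row(**coeffs):
    """Build one schedule row as a coefficient vector over VARS."""
    return [coeffs.get(v, 0) for v in VARS]


# Schedule obtained with --no-isl-schedule-treat-coalescing.
SKEWED = [
    row(l=1), row(n=1, j=1), row(p=1, k=1), row(i=1),
    row(m=1), row(m=4, n=1), row(m=20, n=4, p=1),
]

# Schedule obtained with the default bounds; the two nested
# single-member bands (n) and (p) are appended as the last rows.
BOUNDED = [
    row(l=1), row(n=1, j=1), row(p=1, k=1), row(i=1),
    row(m=1), row(n=1), row(p=1),
]


def max_coefficient(rows):
    """Largest absolute coefficient appearing in any schedule row."""
    return max(abs(c) for r in rows for c in r)


def determinant(rows):
    """Exact determinant by Gaussian elimination over fractions."""
    a = [[Fraction(c) for c in r] for r in rows]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


if __name__ == "__main__":
    print(max_coefficient(SKEWED), max_coefficient(BOUNDED))
    print(determinant(SKEWED) == determinant(BOUNDED))
```

The last two rows of SKEWED are the last two rows of BOUNDED plus
multiples of earlier rows, which is why the determinants agree.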
Note that for reliable results, you need isl-0.18-768-g033b61ae3d
or higher.
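
If you want to check this in a script, the following is a rough Python
sketch. It assumes the version string has the "git describe" shape
isl-<release>-<commits>-g<hash> and that comparing the release number
and then the commit count is good enough, which only holds for builds
from the same branch. The helper names parse_describe and
is_recent_enough are hypothetical; nothing here is an isl interface.

```python
import re

REQUIRED = "isl-0.18-768-g033b61ae3d"

_PATTERN = re.compile(r"isl-(\d+(?:\.\d+)*)(?:-(\d+)-g([0-9a-f]+))?")


def parse_describe(version):
    """Split 'isl-0.18-768-g033b61ae3d' into ((0, 18), 768)."""
    m = _PATTERN.fullmatch(version)
    if m is None:
        raise ValueError("unrecognized version string: " + version)
    release = tuple(int(x) for x in m.group(1).split("."))
    # A plain release such as 'isl-0.18' has no commit count.
    commits = int(m.group(2) or 0)
    return release, commits


def is_recent_enough(version, required=REQUIRED):
    """Compare by release number first, then by commit count."""
    return parse_describe(version) >= parse_describe(required)
```
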
skimo