On Thu, Oct 17, 2024 at 09:48:41AM -0700, Pedro Silvestre wrote:
> Hello all,
>
> I am using ISL to schedule GPU computations represented by dependence
> graphs such as below. Here, Op1 and Op2 are dependents of Op0 and must
> execute after it (arrows are reversed). An operation emits a value that can
> be taken as input by other operations. The phi's represent dependence and
> expressions while the sets S0, S1, S2 are domains. Translating this graph
> into validity constraints allows me to easily get a schedule from ISL.
>
> [image: dependence_graph.png]
>
> Now, I want to add the ability to offload and fetch values from/to GPU
> memory. My plan was to add an offload and fetch operation to the dependence
> graph for each dependent of a large value (the output of Op0 in this case)
> + an initial offload of the tensor at time of creation.
Why do you want to do this before scheduling?
> This is represented
> below with the black dashed lines and the new operations (O_ for offload
> and F_ for fetch). This nearly works, but can cause problems because the
> offloads of one dependent operation are not forced to be ordered with the
> fetches of the other dependent operation.
>
> Suppose S0=S1=S2=isl.Set("[I] -> {[i]: 0 <= i < I}"). This means that
> schedules like F_Op_1[i] -> Op1[i] -> F_Op_2[i] -> Op2[i] -> O_Op_1[i] ->
> O_Op_2[i] are possible.
Exactly which situations are you trying to avoid?
> I believe this issue can be solved using conditional validity constraints
> as pictured in red and blue, but I genuinely cannot figure out how to use
> them. The "Scheduling for PPCG" and "isl manual" both contain some
> information but no examples.
There is a bit more information in the paper Live-Range Reordering from 2016.
You could also run ppcg on a simple input with reuse and dump
the schedule constraints using --dump-schedule-constraints.
It essentially allows ordering constraints to be ignored as long as
the live-ranges that the ordering constraints are meant to prevent
from overlapping are local (zero dependence distance) within
a schedul band.
That is, either both live-ranges are local (in which case they cannot
overlap at the level of the scheudle band) or the ordering constraints
are imposed (in which case they are prevented from overlapping
by those ordering constraints).
> # I want to add these two conditioned validity constraints.
> # IF O_Op0_1[i] before O_Op0_2[i] THEN O_Op0_1[i] before F_Op0_2[i]
> # O_Op0_1[i] -> O_Op0_2[i] => O_Op0_1[i] -> F_Op0_2[i]
>
> # IF O_Op0_2[i] before O_Op0_1[i] THEN O_Op0_2[i] before F_Op0_1[i]
> # O_Op0_2[i] -> O_Op0_1[i] => O_Op0_2[i] -> F_Op0_1[i]
You cannot impose both of these at the same time.
If S(O_Op0_1[i]) - S(O_Op0_2[i]) is non-zero, then obviously
S(O_Op0_2[i]) - S(O_Op0_1[i]) is non-zero too and then
the two additional constraints lead to a circular dependence.
skimo