How to use Conditional Validity Scheduling Constraints

25 views
Skip to first unread message

Pedro Silvestre

unread,
Oct 17, 2024, 12:48:43 PM10/17/24
to isl Development
Hello all,

I am using ISL to schedule GPU computations represented by dependence graphs such as below. Here, Op1 and Op2 are dependents of Op0 and must execute after it (arrows are reversed). An operation emits a value that can be taken as input by other operations. The phi's represent dependence and expressions while the sets S0, S1, S2 are domains. Translating this graph into validity constraints allows me to easily get a schedule from ISL.

dependence_graph.png

Now, I want to add the ability to offload and fetch values from/to GPU memory. My plan was to add an offload and fetch operation to the dependence graph for each dependent of a large value (the output of Op0 in this case) + an initial offload of the tensor at time of creation. This is represented below with the black dashed lines and the new operations (O_ for offload and F_ for fetch). This nearly works, but can cause problems because the offloads of one dependent operation are not forced to be ordered with the fetches of the other dependent operation.

Suppose S0=S1=S2=isl.Set("[I] -> {[i]: 0 <= i < I}"). This means that schedules like F_Op_1[i] -> Op1[i] ->  F_Op_2[i] -> Op2[i] ->  O_Op_1[i] -> O_Op_2[i] are possible.

offload_fetch.png

I believe this issue can be solved using conditional validity constraints as pictured in red and blue, but I genuinely cannot figure out how to use them. The "Scheduling for PPCG" and "isl manual" both contain some information but no examples.

I would appreciate any help understanding if my issue can be resolved using conditional validity constraints (or indeed any other method), and if so, how should these be structured? Providing an example of how to encode what is on the picture above would be a great help.

Cheers,
Pedro

Pedro Silvestre

unread,
Oct 17, 2024, 2:27:27 PM10/17/24
to isl Development
Perhaps the following islpy example can help. Here I set the domains and dependence expressions to just be [i] : 0 <= i < I.

```
import islpy as isl

def isl_sched_to_c(schedule: isl.Schedule) -> str:
    astbuild = isl.AstBuild.from_context(isl.Set("{:}"))
    ast = astbuild.node_from_schedule(schedule)
    p = isl.Printer.to_str(ast.get_ctx())
    p = p.set_output_format(isl.format.C)
    return p.print_ast_node(ast).get_str()

dom_ops = isl.UnionSet("[I] -> { Op1[i] : 0 <= i < I; Op2[i] : 0 <= i < I; Op0[i] : 0 <= i < I }")
dom_off = isl.UnionSet("[I] -> {O_Op0[i]: 0 <= i < I; O_Op0_1[i]: 0 <= i < I;O_Op0_2[i]: 0 <= i < I}")
dom_fet = isl.UnionSet("[I] -> {F_Op0_1[i]: 0 <= i < I; F_Op0_2[i]: 0 <= i < I}")
dom = dom_ops.union(dom_off).union(dom_fet)
print(dom)

sc = isl.ScheduleConstraints.on_domain(dom)


val = isl.UnionMap("[I] -> {Op0[i] -> Op1[i]: 0 <= i < I; Op0[i] -> Op2[i]: 0 <= i < I; Op0[i] -> O_Op0[i]: 0<= i< I; O_Op0[i] -> F_Op0_1[i]: 0<= i< I; O_Op0[i] -> F_Op0_2[i]: 0<= i< I; F_Op0_1[i] -> Op1[i]: 0<= i< I;F_Op0_2[i] -> Op2[i]: 0<= i< I;Op1[i] -> O_Op0_1[i]: 0<= i< I;Op2[i] -> O_Op0_2[i]: 0<= i< I;}")

sc = sc.set_validity(val)

influence_bad_schedule_prox_constraint = isl.UnionMap("[I] -> {Op1[i] -> F_Op0_2[i]: 0 <= i <I }")
prox = isl.UnionMap("[I] -> {Op0[i] -> Op1[i]: 0 <= i < I; Op0[i] -> Op2[i]: 0 <= i < I; }")
prox = prox.union(influence_bad_schedule_prox_constraint)
sc = sc.set_proximity(prox)


# NOTE: Uncommenting this section does not help
## O_Op0_1[i] -> O_Op0_2[i] => F_Op0_2[i] -> O_Op0_1[i]
## O_Op0_2[i] -> O_Op0_1[i] => F_Op0_1[i] -> O_Op0_2[i]
#conditions = isl.UnionMap("[I] -> {O_Op0_1[i] -> O_Op0_2[i]: 0<= i < I; O_Op0_2[i] -> O_Op0_1[i]: 0 <= i < I}")
#conditional_val = isl.UnionMap("[I] -> {F_Op0_2[i] -> O_Op0_1[i]: 0 <= i < I;  F_Op0_1[i] -> O_Op0_2[i]: 0 <= i < I}")
#
#sc.set_conditional_validity(conditions, conditional_val)

s = sc.compute_schedule()
print(isl_sched_to_c(s))
```

If this does not work, then how do you actually use conditional validity correctly?

sched_isl.py

Pedro Silvestre

unread,
Oct 17, 2024, 8:25:50 PM10/17/24
to isl Development
Sorry, I messed up the arrow direction. Here it is fixed.
fixed.png
sched_isl.py

Sven Verdoolaege

unread,
Oct 19, 2024, 4:02:18 PM10/19/24
to Pedro Silvestre, isl Development
On Thu, Oct 17, 2024 at 09:48:41AM -0700, Pedro Silvestre wrote:
> Hello all,
>
> I am using ISL to schedule GPU computations represented by dependence
> graphs such as below. Here, Op1 and Op2 are dependents of Op0 and must
> execute after it (arrows are reversed). An operation emits a value that can
> be taken as input by other operations. The phi's represent dependence and
> expressions while the sets S0, S1, S2 are domains. Translating this graph
> into validity constraints allows me to easily get a schedule from ISL.
>
> [image: dependence_graph.png]
>
> Now, I want to add the ability to offload and fetch values from/to GPU
> memory. My plan was to add an offload and fetch operation to the dependence
> graph for each dependent of a large value (the output of Op0 in this case)
> + an initial offload of the tensor at time of creation.

Why do you want to do this before scheduling?

> This is represented
> below with the black dashed lines and the new operations (O_ for offload
> and F_ for fetch). This nearly works, but can cause problems because the
> offloads of one dependent operation are not forced to be ordered with the
> fetches of the other dependent operation.
>
> Suppose S0=S1=S2=isl.Set("[I] -> {[i]: 0 <= i < I}"). This means that
> schedules like F_Op_1[i] -> Op1[i] -> F_Op_2[i] -> Op2[i] -> O_Op_1[i] ->
> O_Op_2[i] are possible.

Exactly which situations are you trying to avoid?

> I believe this issue can be solved using conditional validity constraints
> as pictured in red and blue, but I genuinely cannot figure out how to use
> them. The "Scheduling for PPCG" and "isl manual" both contain some
> information but no examples.

There is a bit more information in the paper Live-Range Reordering from 2016.
You could also run ppcg on a simple input with reuse and dump
the schedule constraints using --dump-schedule-constraints.

It essentially allows ordering constraints to be ignored as long as
the live-ranges that the ordering constraints are meant to prevent
from overlapping are local (zero dependence distance) within
a schedul band.
That is, either both live-ranges are local (in which case they cannot
overlap at the level of the scheudle band) or the ordering constraints
are imposed (in which case they are prevented from overlapping
by those ordering constraints).

> # I want to add these two conditioned validity constraints.
> # IF O_Op0_1[i] before O_Op0_2[i] THEN O_Op0_1[i] before F_Op0_2[i]
> # O_Op0_1[i] -> O_Op0_2[i] => O_Op0_1[i] -> F_Op0_2[i]
>
> # IF O_Op0_2[i] before O_Op0_1[i] THEN O_Op0_2[i] before F_Op0_1[i]
> # O_Op0_2[i] -> O_Op0_1[i] => O_Op0_2[i] -> F_Op0_1[i]

You cannot impose both of these at the same time.
If S(O_Op0_1[i]) - S(O_Op0_2[i]) is non-zero, then obviously
S(O_Op0_2[i]) - S(O_Op0_1[i]) is non-zero too and then
the two additional constraints lead to a circular dependence.

skimo

Pedro Silvestre

unread,
Oct 19, 2024, 4:48:33 PM10/19/24
to isl Development
> Why do you want to do this before scheduling?
It seemed easier to have isl place these operations for me close to where they are used and then post-process the schedule to remove redundant operations if two fetch-offload pairs end-up in the same for-loop.
Perhaps this is not the case.

> Exactly which situations are you trying to avoid?
Semantically incorrect situations. A piece of data is either on-device or on-host. Attempting to fetch an on-device datum or offload an on-host datum are incorrect.

> If S(O_Op0_1[i]) - S(O_Op0_2[i]) is non-zero, then obviously S(O_Op0_2[i]) - S(O_Op0_1[i]) is non-zero too.
Well, that answers one of my questions. The PPCG document did say this, but I was hoping it was misworded and perhaps negative distances would not count.

Well then, it sadly seems that this approach does not work. I will likely have to manually insert the operations in the schedule.

Thanks Sven.

Sven Verdoolaege

unread,
Oct 20, 2024, 4:52:35 AM10/20/24
to Pedro Silvestre, isl Development
On Sat, Oct 19, 2024 at 01:48:31PM -0700, Pedro Silvestre wrote:
> > Why do you want to do this before scheduling?
> It seemed easier to have isl place these operations for me close to where
> they are used and then post-process the schedule to remove redundant
> operations if two fetch-offload pairs end-up in the same for-loop.
> Perhaps this is not the case.

Such post-processing sounds at least as difficult as
directly adding the offloading after scheduling.

skimo
Reply all
Reply to author
Forward
0 new messages