[llvm-dev] [VLIW Scheduler] Itineraries vs. per operand scheduling

Yatsina, Marina via llvm-dev

unread,

Feb 4, 2018, 12:16:14 PM2/4/18

to llvm...@lists.llvm.org

Hi,

What is the best way to model a scheduler for a VLIW in-order architecture?

I’ve looked at the Hexagon and R600 architectures and they are using itineraries. I wanted to understand the benefit in using itineraries over the per operand scheduling.

I also found this thread from almost 2 years ago:

http://lists.llvm.org/pipermail/llvm-dev/2016-April/098763.html

At that time it seemed the itineraries are a better choice, but is it still the case?

Also, in this thread Phil says:

“Some of the constraints that can be found in in-order micro architectures cannot be expressed in the per-operand scheduling model”

Does anybody have an example of such constraints that will be harder to model with per operand scheduling?

Thanks,

Marina

---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

陳韋任 via llvm-dev

unread,

Feb 5, 2018, 7:55:29 AM2/5/18

to Yatsina, Marina, llvm...@lists.llvm.org

I also found this thread from almost 2 years ago:

http://lists.llvm.org/pipermail/llvm-dev/2016-April/098763.html

At that time it seemed the itineraries are a better choice, but is it still the case?

Also, in this thread Phil says:

“Some of the constraints that can be found in in-order micro architectures cannot be expressed in the per-operand scheduling model”

Does anybody have an example of such constraints that will be harder to model with per operand scheduling?

From the thread you provided, it says

ARM might be a good start for generic superscalar. Hexagon for VLIW style scheduling.

Anyway, I also interest in knowing their difference. :-)

Regards,

chenwj

--

Wei-Ren Chen (陳韋任)
Homepage: https://people.cs.nctu.edu.tw/~chenwj

Andrew Trick via llvm-dev

unread,

Feb 8, 2018, 12:32:51 AM2/8/18

to Yatsina, Marina, llvm...@lists.llvm.org

On Feb 4, 2018, at 9:15 AM, Yatsina, Marina via llvm-dev <llvm...@lists.llvm.org> wrote:

Hi,

What is the best way to model a scheduler for a VLIW in-order architecture?
I’ve looked at the Hexagon and R600 architectures and they are using itineraries. I wanted to understand the benefit in using itineraries over the per operand scheduling.

I also found this thread from almost 2 years ago:
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098763.html

At that time it seemed the itineraries are a better choice, but is it still the case?
Also, in this thread Phil says:
“Some of the constraints that can be found in in-order micro architectures cannot be expressed in the per-operand scheduling model”
Does anybody have an example of such constraints that will be harder to model with per operand scheduling?

Thanks,
Marina

Hi Marina,

Itineraries have stuck around mainly because no one was interested in porting old itineraries to a new machine description. At least initially, there also wasn't anyone interested in working on an in-order scheduling implementation based on the new machine model, even though the machine description itself was designed for in-order scheduling. There are a handful of targets now using the new model with MicroOpBufferSize = 0/1 (in-order mode). It's hard for me to say how widely supported it is, because so many backends using the MachineScheduler are out-of-tree.

The per-operand model should be much more compile time efficient, but that's often not a concern. It can be easier now to incrementally bootstrap new machine descriptions based on similar machines with the same architecture. The subtarget details no longer need to be tied to the TableGen instruction descriptions. The new model lets you define the structure of the subtarget scheduling description, so it can be self-documenting. The new model lets you use a plethora of tablegen tricks to handle all the variations on vector operations. I believe it's much more straightfoward to specify special constraints, like bypassing. It's fairly easy to extend certain opcodes with predicates (callbacks) that depend on immediate values or any other arbtrary info.

See Javed and Florian's tutorial on the "new" machine model:

https://www.youtube.com/watch?v=brpomKUynEA

Itineraries are not strictly more powerful. The original motivation for the new machine description is that there were important issue constraints that weren't being correctly modeled by the itineraries. The only advantage of itineraries that I'm aware of is that, for instructions that use multiple machine resources, you can control during which cycle those resources are used relative to each other (it specifies a full resource matrix for each op). The per-operand model makes the simplifying assumption that a given resource is simply used for N cycles. Normally, all instructions begin using a given resource on the same cycle relative to instruction issue, so that's not an innaccurate assumption. If you actually need to correct for that, and don't use itineraries then you need to make use of the the MachineScheduler's hooks to write a little custom scheduling code.

Whatever you do, there's no sense in using both kinds of machine descriptions. Support is only there for the purpose of bootstrapping using ARM A9, since that was the latest in-tree ARM cpu at the time. It's just useless complexity now.

-Andy

陳韋任 via llvm-dev

unread,

Feb 8, 2018, 9:49:30 AM2/8/18

to Krzysztof Parzyszek, llvm...@lists.llvm.org

Hi Krzysztof,

2018-02-08 13:32 GMT+08:00 Andrew Trick via llvm-dev <llvm...@lists.llvm.org>:

On Feb 4, 2018, at 9:15 AM, Yatsina, Marina via llvm-dev <llvm...@lists.llvm.org> wrote:

Hi,

What is the best way to model a scheduler for a VLIW in-order architecture?
I’ve looked at the Hexagon and R600 architectures and they are using itineraries. I wanted to understand the benefit in using itineraries over the per operand scheduling.

Do you have time to give comment on why Hexagon still use itineraries, rather than switching to per operand scheduling like ARM does? I really like to hear your opinion.

Thanks. :-)

Krzysztof Parzyszek via llvm-dev

unread,

Feb 8, 2018, 2:40:31 PM2/8/18

to 陳韋任, llvm...@lists.llvm.org

We have a two different dimensions for each instruction: slot
assignments, and operand timings. These two are unrelated to each other,
and also each (or both) can change for any given instruction from one
architecture version to the next.

The main concern for us was which of these mechanisms contains all the
information that we need. We cannot express all the scheduling details
by hand, and majority of it was auto-generated anyway. I don't know if
the new model has all the required pieces of information, but we've been
using itineraries for a while, and we stuck with them.

The short answer is "because it works", but it's not meant to imply that
nothing else would.

-Krzysztof

On 2/8/2018 8:49 AM, 陳韋任 wrote:
> Hi Krzysztof,
>
> 2018-02-08 13:32 GMT+08:00 Andrew Trick via llvm-dev

> <llvm...@lists.llvm.org <mailto:llvm...@lists.llvm.org>>:

>
>
>
>> On Feb 4, 2018, at 9:15 AM, Yatsina, Marina via llvm-dev

>> <llvm...@lists.llvm.org <mailto:llvm...@lists.llvm.org>> wrote:
>>
>> Hi,____
>> __ __

>> What is the best way to model a scheduler for a VLIW in-order

>> architecture?____

>> I’ve looked at the Hexagon and R600 architectures and they are
>> using itineraries. I wanted to understand the benefit in using
>> itineraries over the per operand scheduling.
>
> Do you have time to give comment on why Hexagon still use itineraries,
> rather than switching to per operand scheduling like ARM does? I really
> like to hear your opinion.
>
> Thanks. :-)
>
> --
> Wei-Ren Chen (陳韋任)
> Homepage: https://people.cs.nctu.edu.tw/~chenwj

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Andrew Trick via llvm-dev

unread,

Feb 8, 2018, 7:32:52 PM2/8/18

to Krzysztof Parzyszek, llvm...@lists.llvm.org

> On Feb 8, 2018, at 11:40 AM, Krzysztof Parzyszek via llvm-dev <llvm...@lists.llvm.org> wrote:
>
> We have a two different dimensions for each instruction: slot assignments, and operand timings. These two are unrelated to each other, and also each (or both) can change for any given instruction from one architecture version to the next.
>
> The main concern for us was which of these mechanisms contains all the information that we need. We cannot express all the scheduling details by hand, and majority of it was auto-generated anyway. I don't know if the new model has all the required pieces of information, but we've been using itineraries for a while, and we stuck with them.
>
> The short answer is "because it works", but it's not meant to imply that nothing else would.
>
> -Krzysztof

That reminds me, in the new model, there are two different ways to define issues constraints:

ARMScheduleSwift has two ALU pipes. Some instructions can issue on either one, so we define a superset “P01” resource like this:
def SwiftUnitP01 : ProcResource<2>; // ALU unit.
def SwiftUnitP0 : ProcResource<1> { let Super = SwiftUnitP01; } // Mul unit.
def SwiftUnitP1 : ProcResource<1> { let Super = SwiftUnitP01; } // Br unit.

Haswell allows an instruction to issue an any arbitrary set of ports, so it uses groups:

def HWPort1 : ProcResource<1>;
def HWPort2 : ProcResource<1>;
def HWPort3 : ProcResource<1>;
def HWPort4 : ProcResource<1>;
def HWPort5 : ProcResource<1>;
def HWPort6 : ProcResource<1>;
def HWPort7 : ProcResource<1>;

// Many micro-ops are capable of issuing on multiple ports.
def HWPort01 : ProcResGroup<[HWPort0, HWPort1]>;
def HWPort23 : ProcResGroup<[HWPort2, HWPort3]>;
def HWPort237 : ProcResGroup<[HWPort2, HWPort3, HWPort7]>;
…

IIRC: they’re modeled the same way and ProcRegGroup is just computing superset relations for you.
And an instruction on pipe/port “01” can issue on either dynamically, so a subsequent pipe/port “0” instruction could issue in the same cycle.

For something like slot assignments, I would be tempted to write an small independent state machine as part of the hazard checker, then use the machine model (either itinerary or per-operand) just for operand timings.

-Andy

Yatsina, Marina via llvm-dev

unread,

Feb 11, 2018, 4:13:16 AM2/11/18

to Andrew Trick, llvm...@lists.llvm.org

Thank you so much for the detailed answer!

Our target only uses one resource at a time, so based on what you said, itineraries do not have any advantage over the new scheduling model.

The per operand model seems to be the way to go.

Thanks,

Marina

---------------------------------------------------------------------
Intel Israel (74) Limited

Reply all

Reply to author

Forward