[llvm-dev] How to get started with instruction scheduling? Advice needed.

Phil Tomson via llvm-dev

Apr 20, 2016, 1:51:21 PM

to LLVM Developers Mailing List

I need to add instruction scheduling for a new target which is a fairly simple in-order execution machine.

I've been watching this presentation from a 2014 LLVM dev meeting as it seems relevant:

"SchedMachineModel: Adding and Optimizing a Subtarget" http://llvm.org/devmtg/2014-10/Slides/Estes-MISchedulerTutorial.pdf

In this presentation the author says that there have been several ways to approach scheduling in LLVM over the years:

Pre 2008: SelectionDAGISel pass creates the ScheduleDAG from the SelectionDAG at the end of instruction selection; ScheduleDAG works on SelectionDAG nodes (SDNodes)
Circa 2008: Post-register-allocation pass added for instruction scheduling (SchedulePostRATDList works on MachineInstrs)
Circa 2012: MIScheduler (ScheduleDAGMI) added as a separate pass for pre-RA scheduling
Circa 2014: MIScheduler adapted to optionally replace the PostRA scheduler

In the presentation he defines a SchedMachineModel subclass in the target's schedule .td file, and apparently this approach needs no instruction itineraries.

So I'm wondering: what's the current recommended way to approach this, and does it depend on the type of target (in-order, superscalar, out-of-order, VLIW, ...)?

Someone earlier started to define instruction itineraries for our target. Should I continue down this road or move over to the SchedMachineModel approach? Are there other recommended presentations/documents that I should be looking at?

Thanks.

Phil

Phil Tomson via llvm-dev

Apr 20, 2016, 4:27:10 PM

to Sergei Larin, LLVM Developers Mailing List

So if I use the SchedMachineModel method, can I just skip itineraries?

Phil

On Wed, Apr 20, 2016 at 12:29 PM, Sergei Larin <sla...@codeaurora.org> wrote:

> Target does make a difference. VLIW needs more hand-holding. For what you are describing it should be fairly simple.
>
> Best strategy – see what other targets do. ARM might be a good start for generic superscalar. Hexagon for VLIW style scheduling.
>
> Depending on what you decide, you might need different target hooks.
>
> Sergei
>
> ---
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

Phil Tomson via llvm-dev

Apr 20, 2016, 6:07:02 PM

to LLVM Developers Mailing List

I notice from looking at ARMScheduleA9.td that there seems to be a hybrid approach where they still have itineraries but also use SchedMachineModel:

// ===---------------------------------------------------------------------===//
// The following definitions describe the simpler per-operand machine model.
// This works with MachineScheduler and will eventually replace itineraries.

class A9WriteLMOpsListType<list<WriteSequence> writes> {
  list <WriteSequence> Writes = writes;
  SchedMachineModel SchedModel = ?;
}

// Cortex-A9 machine model for scheduling and other instruction cost heuristics.
def CortexA9Model : SchedMachineModel {
  let IssueWidth = 2; // 2 micro-ops are dispatched per cycle.
  let MicroOpBufferSize = 56; // Based on available renamed registers.
  let LoadLatency = 2; // Optimistic load latency assuming bypass.
                       // This is overriden by OperandCycles if the
                       // Itineraries are queried instead.
  let MispredictPenalty = 8; // Based on estimate of pipeline depth.

  let Itineraries = CortexA9Itineraries;

  // FIXME: Many vector operations were never given an itinerary. We
  // haven't mapped these to the new model either.
  let CompleteModel = 0;
}

I'm guessing this is probably the way forward for my case since Itineraries seem to be already mostly defined.
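
A minimal sketch of what that hybrid arrangement could look like for a new target, assuming the target already has a ProcessorItineraries definition. Every name starting with "My" is a placeholder invented for illustration, the numeric values are made up, and the tiny itinerary is only there so the fragment stands on its own:

```tablegen
// Illustrative only: all "My*" names and numeric values are hypothetical.
// In a real target this lives in the schedule .td file, which already pulls
// in Target.td, so the include below would not be repeated there.
include "llvm/Target/Target.td"

// Stand-in for the itineraries the target already defines.
def MyALU     : FuncUnit;
def MyIIC_ALU : InstrItinClass;
def MyItineraries : ProcessorItineraries<
  [MyALU], // functional units
  [],      // bypasses
  [InstrItinData<MyIIC_ALU, [InstrStage<1, [MyALU]>], [1, 1]>]>;

// Wrap the existing itineraries in a machine model, as CortexA9Model does.
def MyModel : SchedMachineModel {
  let IssueWidth = 1;         // Assumed single-issue.
  let LoadLatency = 2;        // Placeholder value.
  let MispredictPenalty = 4;  // Placeholder value.
  let Itineraries = MyItineraries;
  let CompleteModel = 0;      // Not every instruction is covered yet.
}

// Attach the model to a CPU name.
def : ProcessorModel<"my-cpu", MyModel, []>;
```

The point of the wrapper is that the itinerary data stays where it is; only the processor definition changes from naming the itineraries directly to naming the model that contains them.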

Phil

Christof Douma via llvm-dev

Apr 26, 2016, 8:09:56 AM

to Phil Tomson, LLVM Developers Mailing List

Hi Phil.

You more or less answered your own question, but let me give you some more info. Maybe it is of use.

From what I understand the SchedMachineModel is the future, although it is not as powerful as itineraries at present. The mi-scheduler is mostly developed around out-of-order cores, I believe (I'd love to hear arguments to the contrary). Some of the constraints that can be found in in-order microarchitectures cannot be expressed in the per-operand scheduling model, and the heuristics of the pre-RA scheduling pass are probably a bit too focussed on register pressure for in-order cores (I have no numbers, just hearsay).

There is some documentation in comments at the start of include/llvm/Target/TargetSchedule.td that you might find useful. If you are going to look at an existing scheduling model, I suggest looking at an in-order core. A good example would be AArch64/AArch64SchedA53.td. If itineraries are present, they are used by the mi-scheduler next to the SchedMachineModel to detect hazards. I think that is the only place where the mi-scheduler uses itineraries.

There are some magic numbers you need for in-order operation. Most notably MicroOpBufferSize should be set to 0 for full in-order behaviour. You also want to set CompleteModel to 0 as that prevents asserts due to instructions without scheduling information. There is a script that might help you to visualise if you have provided scheduling information in the SchedMachineModel for all instructions (utils/schedcover.py). It is very simplistic and takes as input the debug output of tablegen. There are some usage comments at the beginning.
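
As a rough sketch of those settings in a per-operand model, loosely patterned on how in-order models such as AArch64SchedA53.td are laid out. All "My*" names and every number here are placeholders, not taken from any real core:

```tablegen
// Illustrative only: all "My*" names and numeric values are hypothetical.
// In a real target the include is already provided by the target's .td files.
include "llvm/Target/Target.td"

def MyInOrderModel : SchedMachineModel {
  let IssueWidth = 1;         // Assumed single-issue.
  let MicroOpBufferSize = 0;  // 0 requests fully in-order behaviour.
  let LoadLatency = 3;        // Placeholder value.
  let MispredictPenalty = 5;  // Placeholder value.
  let CompleteModel = 0;      // Tolerate instructions with no scheduling info.
}

// Hypothetical SchedWrite types. A real target declares these once and
// attaches them to its instructions via Sched<[...]>.
def WriteMyALU  : SchedWrite;
def WriteMyLoad : SchedWrite;

let SchedModel = MyInOrderModel in {
  // Unbuffered resources: an instruction waits at issue until the unit is free.
  def MyUnitALU  : ProcResource<1> { let BufferSize = 0; }
  def MyUnitLdSt : ProcResource<1> { let BufferSize = 0; }

  // Map each SchedWrite to the resources it uses and its latency.
  def : WriteRes<WriteMyALU,  [MyUnitALU]>  { let Latency = 1; }
  def : WriteRes<WriteMyLoad, [MyUnitLdSt]> { let Latency = 3; }
}
```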

Regards,

Christof


Phil Tomson via llvm-dev

Apr 28, 2016, 3:01:01 PM

to Christof Douma, LLVM Developers Mailing List

Christof,

Thanks for the reply. Comments below:

On Tue, Apr 26, 2016 at 5:09 AM, Christof Douma <Christo...@arm.com> wrote:

> From what I understand the SchedMachineModel is the future, although it is not as powerful as itineraries at present. The mi-scheduler is mostly developed around out-of-order cores, I believe (I'd love to hear arguments to the contrary). Some of the constraints that can be found in in-order microarchitectures cannot be expressed in the per-operand scheduling model, and the heuristics of the pre-RA scheduling pass are probably a bit too focussed on register pressure for in-order cores (I have no numbers, just hearsay).
>
> There is some documentation in comments at the start of include/llvm/Target/TargetSchedule.td that you might find useful. If you are going to look at an existing scheduling model, I suggest looking at an in-order core. A good example would be AArch64/AArch64SchedA53.td. If itineraries are present, they are used by the mi-scheduler next to the SchedMachineModel to detect hazards. I think that is the only place where the mi-scheduler uses itineraries.

If I don't use the mi-scheduler (since we already have itineraries defined for most ops and this is an in-order processor so it sounds like I wouldn't get much benefit from using the mi-scheduler), how can I tell if my itinerary definitions are being used? Is there any way to get a report or debug info?

> There are some magic numbers you need for in-order operation. Most notably MicroOpBufferSize should be set to 0 for full in-order behaviour. You also want to set CompleteModel to 0 as that prevents asserts due to instructions without scheduling information. There is a script that might help you to visualise if you have provided scheduling information in the SchedMachineModel for all instructions (utils/schedcover.py).

I do not see schedcover.py in my LLVM source tree, but we're still on LLVM 3.6 so this could be the issue. Are there other ways to debug itineraries?

Christof Douma via llvm-dev

Apr 29, 2016, 4:49:51 AM

to Phil Tomson, LLVM Developers Mailing List

Hi Phil.

That schedcover.py script is only useful for per-operand mi-model and is rather new. I have never tried to debug itineraries, so can’t help you with that. I also cannot compare the different scheduler passes, sorry.

You can get debug info from a debug build of LLVM by using the -debug or -debug-only=<pass-name> option. For example, -debug-only=misched will give you only the machine scheduler info. If the machine scheduler uses itineraries, you’ll see a line "Using scoreboard hazard recognizer" in the debug output. For other scheduling passes, I don’t know.

See http://llvm.org/docs/ProgrammersManual.html#the-debug-macro-and-debug-option for how debug info is controlled per pass.

Regards,

Christof

Matthias Braun via llvm-dev

Apr 29, 2016, 2:28:56 PM

to Christof Douma, LLVM Developers Mailing List

On Apr 26, 2016, at 5:09 AM, Christof Douma via llvm-dev <llvm...@lists.llvm.org> wrote:

> There are some magic numbers you need for in-order operation. Most notably MicroOpBufferSize should be set to 0 for full in-order behaviour. You also want to set CompleteModel to 0 as that prevents asserts due to instructions without scheduling information. There is a script that might help you to visualise if you have provided scheduling information in the SchedMachineModel for all instructions (utils/schedcover.py). It is very simplistic and takes as input the debug output of tablegen. There are some usage comments at the beginning.

Having itinerary data should be enough for an instruction to count as covered for the "CompleteModel" case. I'd highly recommend aiming for "CompleteModel = 1" in your targets, because it is easy to forget new instructions. It should also not be complicated to add empty scheduling information to a node as a temporary measure for cases where you have a reason not to provide scheduling information.

schedcover.py is indeed a nice tool to get a feeling for, and an overview of, your scheduling information. If schedcover.py shows no empty cells then "CompleteModel = 1" should work as well.

- Matthias

_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev