Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

DFA Scheduler and Processor Pipelines

1 view
Skip to first unread message

Christian Parpart

unread,
May 3, 2002, 4:04:03 PM5/3/02
to
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi all,

well this is a really very unknown part for me.
Does anyone know what exactly a DFA Scheduler is,
or even how to implement such a scheduler?

It is mentioned on the gcc.gnu.org site that they'll
use the DFA scheduler for their processor pipe lines.
I'd like to get more informations about DFA. Does anyone
have some usefull urls?

Thanks in advance,

Christian Parpart.
http://www.surakware.net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE80VGGPpa2GmDVhK0RAjoNAJ46rWf93SJljZrPOfU3QJfk5spoHQCfWBmn
D+j63PkSH3x53+uW/30Y5Ic=
=5AcY
-----END PGP SIGNATURE-----

Jason Lee Eckhardt

unread,
May 4, 2002, 2:21:59 PM5/4/02
to
Christian Parpart <cpar...@surakware.net> wrote:
>well this is a really very unknown part for me.
>Does anyone know what exactly a DFA Scheduler is,
>or even how to implement such a scheduler?
>
>It is mentioned on the gcc.gnu.org site that they'll
>use the DFA scheduler for their processor pipe lines.
>I'd like to get more informations about DFA. Does anyone
>have some usefull urls?
>

The so-called "DFA scheduler" is an instruction scheduler
which uses finite automata to detect resource conflicts (as opposed
to detecing conflicts with some other technique, such as resource
matrices).

Without going into too much detail, the general idea is that
when one is writing an algorithm to re-order instructions ("instruction
scheduling"), a method is needed to detect whether instructions
are competing for the the same resources ("structural hazard detection" or
"resource conflict detection", etc.) so that those conflicts can be
minimized (or eliminated on statically scheduled architectures).

There are numerous ways to implement hazard detection in instruction
schedulers, such as using finite automata. This method is desirable in
that each conflict query can be done with one (or a small number)
of table lookup(s). Other methods are generally not as fast, but
have other benefits. For example, it is cumbersome to use FAs to model
resource conflicts for modulo-scheduling (the standard FA method
isn't directly applicable and requires modifications -- usually requiring
large space requirements). The speed of conflict detection is especially
important in schedulers which do large amounts of backtracking, etc. In
those types of schedulers, conflict detection time can become a significant
fraction of overall scheduling time.

There is a huge body of literature on scheduling and related issues.
You should have no trouble searching with those keywords and developing
matches. However, for one example article to get started on using FAs for
hazard detection, read:
Muller, T. "Employing Finite Automata for Resource Scheduling"
MICRO-26, 1993.

As for gcc. Vladimir Makarov has implemented a FA-based hazard detection
module for the gcc instruction scheduler and an accompanying pipeline
description language (as an extension of the current machine desc. facility).
The code currently is in a development branch at gcc.gnu.org. It is to be
merged into the main sources when the code review process is done. Vlad
and I have both implemented production schedulers with his FA-based code.

Regards, Jason Eckhardt.

Naveen Sharma, Noida

unread,
May 4, 2002, 2:22:20 PM5/4/02
to
Hi,

> well this is a really very unknown part for me. Does anyone know
> what exactly a DFA Scheduler is, or even how to implement such a
> scheduler?

It is a scheduling technique based on FINITE AUTOMATA
described in

Efficient Instruction Scheduling using Finite State Automata
by Vasanth Bala and Norman Rubin. (MICRO-28) Nov,95

Broadly speaking GCC had function unit based model for
scheduling in which described characterisitics of
processor unit in the pipeline (multiplicity,simultaneiuty etc).
This is based the other way round. Instruction are described
in terms of functional unit requirements. The scheduler
maintains a collision matrix for resource hazards in processor pipeline
(which corresponds to a DFA state) and updates it as insns are issued.

Incidentally I wrote DFA description for Sh4 processor for GCC and
this method would yield performance improvement of the
order of 10-18 %

You will get lot more detail in the research paper which
is downloadable from the net.

Regards,
Naveen Sharma.

Vladimir Makarov

unread,
May 8, 2002, 12:16:44 AM5/8/02
to
Jason Lee Eckhardt wrote:
>
> As for gcc. Vladimir Makarov has implemented a FA-based hazard detection
> module for the gcc instruction scheduler and an accompanying pipeline
> description language (as an extension of the current machine desc. facility).
> The code currently is in a development branch at gcc.gnu.org. It is to be
> merged into the main sources when the code review process is done. Vlad
> and I have both implemented production schedulers with his FA-based code.

Hi, Jason.

It is in the main line of the gcc repository now. So some gcc ports
using the DFA scheduler will be probably in 3.2 release of gcc.

In brief, gcc DFA scheduler is gcc Haifa interblock scheduler with
recognizing pipeline hazards based on DFA (deterministic finite state
automaton). Modern RISC and VLIW processors issue insns when the data
are ready for given insn and the processor functional units are free
to execute the insn. A pipeline hazard recognizer aims to recognize
processor stalls because of busy functional units.

The DFA pipeline hazard recognizer uses DFA for this. Each state
encodes the current and planned future reservations of the functional
units. Arcs between states are marked by insns. If there is an
output arc marked by an insn in the automaton, then the insn can be
issued in given state (i.e. all necessary functional units are free in
given processor state). The destination state encodes the functional
unit reservation after issuing the insn. There is always arc marked
by advancing processor cycle from each state. So DFA describing a
processor pipelines is closely connected (there is a way from each its
state to any other state).

Recognizing pipeline hazards based on the DFA is much faster than
other approaches. The more functional units are used in a processor
pipeline description, the faster DFA based pipeline hazard recognizer
is. It is really important. E.g. more 15% of ia64 gcc time spent in
pipeline hazard recognizer and bundling. It is important to implement
insn schedulers trying more heuristics, more insn schedules
(e.g. Muchnik in his book about optimized compiler mentioned the
problem of inadequate insn scheduling heuristics for superscalar RISC
processors). Usage of DFA also simplifies implementation of insn
scheduler algorithms (e.g. you could easy to compare processor states
in kernel recognition software pipelining algorithms).

With the user point of view, gcc DFA description is a set of regular
expressions. Each expression describes a functional unit reservation
by an insn, e.g.

p1+div,div*10
(p1|p2)+(int1|int2)

The first expression could describe non pipelined division (because
the functional unit `div' is reserved during all insn execution) which
is issued into the 1st pipeline (p1). The second one describes an
integer insn which is issued into the 1st pipeline (p1) and, if it is
busy, into the 2nd pipeline and which is executed on the first integer
unit (int1) and, if it is busy, on the 2nd integer unit (int2).

Gcc DFA description is more powerful than the previous one and
permits more accurate descriptions. It is more important for
classical RISC and VLIW processor scheduling (e.g. some embedded
benchmarks for ultrasparc was sped up to 15%) and less important for
out-of-order and/or speculative RISC processors.

Gcc DFA pipeline hazard recognizer can be used for VLIW packing
(insn bundling) too. Although separation of insn scheduling and VLIW
packing tasks might do a big impact on code quality. Fortunately, you
could easy describe VLIW packing rules for the DFA scheduler too.

The gcc DFA pipeline hazard recognizer infrastructure has many new
features in comparison with other DFA approaches:

1. Usage of alternatives. You could describe that an insn reserves
these units or other units.

2. Possibility of describing NDFA. The processor itself is a
deterministic device. Therefore DFA is usually used to describe
it. It means that the first possible alternative of the unit
reservation is chosen in given processor state (see the examples
above). In NDFA, the alternative operator means that any
alternative can be chosen. So you could describe all possible
VLIW packing and (even inserting NOP insns for VLIW packing).

3. Exclusion, presence, and absence sets which permits to describe
easily VLIW packing and some complex cases for RISC processors.

4. Implementation of more complex queries (e.g. minimal time in
which given insn can be issued in given state) in the pipeline
hazard recognizer interface. You also could query unit
reservation in given state (the automaton minimization is not
possible in this case).

Actually, gcc could have the biggest benefits from such
infrastructure because it is the single compiler which is ported to
majority of the processors. Having a huge feedback (from Richard
Henderson, Jeff Law, Jason Eckhardt, David Miller, Jan Hubicka, Dan
Nicolaescu, David Edelhson and tens of other developers, sorry if I
did not name you) I see the advantage of open source model to design
and improve the infrastructure.

Vladimir Makarov

0 new messages