Just read an interesting new book:
"A Retargetable C Compiler: Design and Implementation"
Christopher W. Fraser (AT&T Bell Labs)
David R. Hanson (Princeton U.)
Benjamin/Cummings Publishing Company, Inc.
Copyright (c) 1995 AT&T & David Hanson
I found this book, after a first reading, to be one of the better
compiler or compiler-tool books I've read. Highly recommended.
Right up there with the Dragon Book and O'Reilly's 2nd edition
Lex & Yacc book (that ought to get this post past John ;) ).
The authors present an in-depth look at lcc, a retargetable
C compiler. They contrast their approach to texts (like the
Dragon Book, IMO) which provide breadth and strong theoretical
material, rather than depth and implementation details.
Also, the authors present their book as an example of "programming
in the large." It provides an opportunity for programmers to learn
about the design decisions, wrong turns, constraints, and strategies
that went into a large successful software project.
At the larger level, "A Retargetable C Compiler..." is organized in
the traditional way: starting with basics, then lexical analysis,
then parsing, then code generation.
Within each section, however, the book is organized untraditionally,
as a running narrative on the actual lcc source code:
"This book not only describes the implementation of lcc, it
*is* the implementation. The 'noweb' system for 'literate
programming' generates both the book and the code for lcc
from a single source. This source consists of interleaved
prose and labeled code 'fragments'." [p1]
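To make the quoted idea concrete, here is a minimal sketch of the "tangle" step, which pulls the code out of such a mixed source. It assumes noweb's basic conventions (a fragment starts with `<<name>>=`, ends at a line beginning with `@`, and `<<name>>` on a line by itself refers to another fragment). The function names `parse_chunks` and `tangle`, the sample text, and the fragment names are invented for illustration; this is not noweb's own implementation, and it is written in Python only for brevity.

```python
import re

CHUNK_DEF = re.compile(r"^<<(.+)>>=\s*$")      # start of a code fragment
CHUNK_REF = re.compile(r"^(\s*)<<(.+)>>\s*$")  # use of a fragment on its own line


def parse_chunks(source):
    """Collect code fragments by name; prose lines are skipped."""
    chunks = {}
    current = None
    for line in source.splitlines():
        m = CHUNK_DEF.match(line)
        if m:
            current = m.group(1)
            chunks.setdefault(current, [])  # a repeated name extends the fragment
        elif line.startswith("@"):
            current = None  # back to prose
        elif current is not None:
            chunks[current].append(line)
    return chunks


def tangle(chunks, name, indent=""):
    """Expand fragment `name` recursively into plain code lines."""
    out = []
    for line in chunks[name]:
        m = CHUNK_REF.match(line)
        if m:
            # Nested fragments inherit the indentation of the reference.
            out.extend(tangle(chunks, m.group(2), indent + m.group(1)))
        else:
            out.append(indent + line)
    return out


# Hypothetical two-fragment document: prose interleaved with code.
SAMPLE = """\
The scanner skips blanks before each token.
<<skip blanks>>=
while text[i] == " ":
    i += 1
@
The main routine uses that fragment.
<<next token>>=
def next_token(text, i):
    <<skip blanks>>
    return text[i], i + 1
@
"""

print("\n".join(tangle(parse_chunks(SAMPLE), "next token")))
```

The prose lines never reach the output; only the fragments do, stitched together in the order the root fragment asks for them rather than the order in which the text presents them.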
The approach works well, although there were times when I
would have liked to see the source code in some other arrangement
than that provided. Fortunately the source is available via ftp:
The most interesting parts of this book detail the authors' efforts
to make lcc easily retargetable. There's a good discussion about the
interface between the target-independent front-end and the various
target-dependent back-ends. Fraser & Hanson's observations and
experiences in this area will be very useful to anyone dealing with
multi-platform compiler issues.
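One way to picture this kind of interface is as a small, fixed set of operations that the front-end calls and that every back-end must supply. The sketch below is in Python for brevity. The `Backend` record, the `generate` function, the toy tuple-based intermediate form, and the instruction spellings are all invented for illustration; none of it reproduces lcc's real interface.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Backend:
    """Everything the front-end knows about a target (hypothetical)."""
    name: str
    load_const: Callable[[int, int], str]  # (register, value) -> asm text
    add: Callable[[int, int], str]         # (dest register, src register) -> asm text


def generate(ops, backend):
    """Front-end side: walk the toy IR and ask the back-end for each instruction."""
    asm = []
    depth = 0  # number of registers currently holding values
    for op in ops:
        if op[0] == "const":
            asm.append(backend.load_const(depth, op[1]))
            depth += 1
        elif op[0] == "add":
            depth -= 1
            asm.append(backend.add(depth - 1, depth))
        else:
            raise ValueError(f"unknown op {op[0]!r}")
    return asm


# Two illustrative back-ends; register names and mnemonics are simplified.
X86_REGS = ["eax", "ecx", "edx", "ebx"]

mips = Backend(
    name="mips-like",
    load_const=lambda r, v: f"li $t{r}, {v}",
    add=lambda d, s: f"addu $t{d}, $t{d}, $t{s}",
)
x86 = Backend(
    name="x86-like",
    load_const=lambda r, v: f"mov {X86_REGS[r]}, {v}",
    add=lambda d, s: f"add {X86_REGS[d]}, {X86_REGS[s]}",
)

program = [("const", 2), ("const", 3), ("add",)]  # the expression 2 + 3
for target in (mips, x86):
    print(target.name, generate(program, target))
```

The point of the design is that `generate` never mentions a particular machine: adding a target means writing another `Backend` value, not touching the front-end.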
The chapters on code generation for MIPS, SPARC, and x86 code provide
sufficient detail so that one comes away with a specific, detailed
understanding of some of the less-obvious issues in code generation.
(A good companion text for these chapters is "Microprocessors" by
Dewar and Smosna, ISBN 0-07-016638-2.)
Interestingly, Fraser & Hanson use a code generator-generator, lburg,
for the back-end, but they hand code the lexical analyzer and parser.
They seem to brush off Lex and Yacc as not particularly useful for
their purpose of generating a small, fast, portable C compiler
(though they do say that lex & yacc have an important role to play
for more complex languages, throw-away utilities, or instances in
which compilation speed is not important).
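For readers who have not written one, a hand-coded lexical analyzer is little more than a loop that dispatches on the current character. Below is a minimal sketch, in Python for brevity. The `tokenize` function, the token kinds, and the tiny keyword and punctuation sets are invented for illustration and are far simpler than lcc's actual scanner, which is written in C.

```python
KEYWORDS = {"int", "return", "if", "else", "while"}
PUNCT = set("+-*/=;(){}<>")


def tokenize(text):
    """Return a list of (kind, lexeme) pairs for a tiny C-like subset."""
    tokens = []
    i, n = 0, len(text)
    while i < n:
        c = text[i]
        if c.isspace():
            i += 1  # skip white space
        elif c.isalpha() or c == "_":
            start = i
            while i < n and (text[i].isalnum() or text[i] == "_"):
                i += 1
            word = text[start:i]
            tokens.append(("KEYWORD" if word in KEYWORDS else "ID", word))
        elif c.isdigit():
            start = i
            while i < n and text[i].isdigit():
                i += 1
            tokens.append(("INT", text[start:i]))
        elif c in PUNCT:
            tokens.append(("PUNCT", c))
            i += 1
        else:
            raise SyntaxError(f"unexpected character {c!r} at offset {i}")
    return tokens


print(tokenize("int x = y1 + 42;"))
```

Each branch consumes exactly the characters of one token and never backs up, which is one reason scanners written this way can be small and fast.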
Fraser & Hanson state that "automatically generated analyzers,
such as those produced by LEX, tend to be large and slower than
analyzers built by hand." [p107] While they say that re2c and
ELI can generate faster analyzers than lcc's, they do not say
the same for flex, which they say generates analyzers
"much faster and smaller than those produced by LEX." [p125]
Is it true that LEX's analyzers are generally larger and slower than
All in all, this is a worthwhile book which I recommend to anyone
interested in compilers.
ISC Consultants, Inc. 14 East 4 Street Ste 602
New York City 10012-1141
212 477-8800 fax:477-9895
[AT&T lex generates slow and often buggy lexers. Flex's are competitive
with hand-coded ones, particularly when you take advantage of its reports
and fiddle your token definitions to avoid having the lexer back up. -John]