Pascal v. C in compile speed

0 views
Skip to first unread message

Anders Nyg}rd

unread,
May 13, 1995, 3:00:00 AM5/13/95
to
Why is there such a difference between C and Pascal when it comes to compiler
speed. IMHO C should be slightly faster to parse since it has simpler and
more general syntax or maybe it is just Borlands Pascal compiler that is
so fast.

--
+-------------------------------------------------------------
| Anders Nygaard * any...@technis.vtyh.fi * Vasa Polytechnic
| "Information wants to be free" -- William Gibson

John P Temprile

unread,
May 14, 1995, 3:00:00 AM5/14/95
to
I would attribute this to the nature of Pascal and the nature of C, Wirth
defined and specified the syntax for Pascal, where as Ritchie(If everyone
will please forgive me for this blasphemy) invented C as a tool, and thus
did not put the effort into defining the language as Wirth perhaps would,
just out of a burst of curiousity, does Pascal require a
pre-processor??? I didnt think it did. I remeber using Turbo Pascal on
my XT and then compiling a similar program with Turbo C... wow!! What a
difference!!!

Anders Nyg}rd (any...@loke.vtyh.fi) wrote:
: Why is there such a difference between C and Pascal when it comes to compiler

--
==========================================================_/_/_/_/ _/_/_/_/
[]+[] Whats wrong with being drunk? Ask a glass of water! _/ _/_/
===========<j...@snowhite.cis.uoguelph.ca>================== _/ _/_/
_/ _/ + _/_/ +
_/_/_/ _/_/

Ping Huang

unread,
May 14, 1995, 3:00:00 AM5/14/95
to
In article <3p2nf0$4...@solutions.solon.com> any...@loke.vtyh.fi (Anders Nyg}rd) writes:

> Why is there such a difference between C and Pascal when it comes
> to compiler speed. IMHO C should be slightly faster to parse since
> it has simpler and more general syntax or maybe it is just Borlands
> Pascal compiler that is so fast.

A rule of thumb in computer science is "general is slower". Hence if
your statement "[C] has [....] more general syntax [than Pascal]" is
true, a parser for C would be slower, not faster, than a parser for
Pascal, and the conclusion you tried to draw would be incorrect.

There's lots of theory about parsing languages; certain classes of
language grammars are inherently harder/slower to parse than other
fundamentally simpler grammars. I would disagree with your statement
that C has a simpler syntax than Pascal; Pascal has longer keywords
than C, but other than that, I would argue that its grammar is
simpler, more uniform, and smaller than C's. Also, the most obvious
way to implement a C compiler is with a multi-pass approach, with the
first pass performing preprocessing, although that is actually not
absolutely mandated by the language grammar semantics; more passes are
likely (although not necessarily) to make a compiler slower.

I suspect that most of the speed difference that you see between a
"typical C compiler" and the Borland Pascal compiler can be attributed
not to theoretically-significant differences between the two
languages' grammars, but to more pragmatic implementation issues:

* Typical organization of C source code into .c and .h files means
that often to compile a relatively small .c file, the compiler
needs to parse through thousands of lines of .h files. For
example, the two-line C source file which includes the ANSI-C
stdio.h and stdlib.h header files turns into 447 non-blank lines
after the C preprocessor pass on the Linux box I am currently
sitting at. This can be especially significant if source code is
organized into many small files, which has various benefits,
including finer granularity for source control systems to allow
many programmers to work on the same project at once, but which
amplifies this problem. Having hundreds of .c files, each of which
re-include essentially the same headers, is not unusual. In
Borland Pascal, rather than including a human-readable header file
to define external procedure interfaces, etc., that needs to be
parsed, you reference other units, and the compiler can look up
pre-parsed information in the corresponding .TPU files. Note that
some C compilers support pre-parsed/pre-compiled header files, to
gain some of the same speed advantage.

* Tuning emphasis. Borland Pascal is the child of Turbo Pascal; I do
not know there is really any code left in the recent versions of
Borland Pascal or Delphi compilers that dates back to say Turbo
Pascal 3.0, but the historical lineage surely meant that the
programmers working on the Borland Pascal and Delphi compilers gave
much more thought to and put much more effort into speed of
compilation than their compatriots working on other compilers.
Note that Borland tried to emphasize compilation speed in their C
compilers as well during the late 80's. The strategy of putting
effort into compilation speed had a fair amount of payback for the
Turbo Pascal product, in my opinion, and less payback for the Turbo
C product. Although Turbo C inspired other vendors to come up with
quicker compilers (e.g., Microsoft Quick C), market forces seem to
have swung away from emphasizing compilation speed.

--
Ping Huang <psh...@mit.edu>, probably speaking for himself

Hogan Long

unread,
May 20, 1995, 3:00:00 AM5/20/95
to
>In article <3p2nf0$4...@solutions.solon.com> any...@loke.vtyh.fi (Anders Nyg}rd) writes:

> > Why is there such a difference between C and Pascal when it comes
> > to compiler speed. IMHO C should be slightly faster to parse since
> > it has simpler and more general syntax or maybe it is just Borlands
> > Pascal compiler that is so fast.

There is a very technical reson that the Pascal compilers are faster - and
saying that C requires a two pass compiler sort of gets at the issue.

Simply stated the CFG (context free grammar) that describes the Pascal
language is suitable for recursive descent parsing. Recursive descent
parsers are fairly easy to impliment and Pascal was clearly designed to
be implimented this way. However, if you use LL(1) parse tables you can
see an improvement in speed. C can NOT be implimented with LL(1) parse
tables because it is more complex.

This reply does not mean much to anyone who has not studied compiler
construction - and is not much use because if you had studied compiler
construction you would understand the issue. So I will try and explain more.
Simply put, the definition of the Pascal language is designed to be easy to
parse and the definition of the C language is not (even if you ignore the
pre-processor.) Some examples of what makes the C language complex to a parser
are the use of the "-" token as both a binary minus and a unary minus. The
use of "--" as both a pre and a post decriment, etc.
Those of you who are quick will say "wait a min - pascal has both types of
minus also!" And you are right sort of - but infact the locations where
the minus can occure are different in pascal, also this may not have been a
good example, lets just leave it at Pascal is easier to parse. That is,
the simple reply to your question.

If you want more information about these issues check out the first few
chapters in a compiler constuction book, where they talk about CFGs and
parsing techniques. C is usually implimented as a LRL(2) I think.

Hogan

Billy Chambless

unread,
May 22, 1995, 3:00:00 AM5/22/95
to
In article <3p2nf0$4...@solutions.solon.com>, any...@loke.vtyh.fi (Anders Nyg}rd) writes:
> Why is there such a difference between C and Pascal when it comes to compiler
> speed. IMHO C should be slightly faster to parse since it has simpler and
> more general syntax

Actually, it's that "simpler and more general syntax" that makes
C _harder_ to parse. The stricter the syntax of a language, the dumber
the parser is allowed to be.

> or maybe it is just Borlands Pascal compiler that is
> so fast.

That's another part. Borland's Pascal compilers do some really clever
stuff at the linking stage, which helps a lot.
--
* Billy Chambless bi...@cast.msstate.edu
* Voice: 601.688.3159 FAX: 601.688.7100
* Mississippi State University Center for Air-Sea Technology

Michael Cook

unread,
May 22, 1995, 3:00:00 AM5/22/95
to
>>>>> "HL" == Hogan Long <ho...@cs.nyu.edu> writes:

>> In article <3p2nf0$4...@solutions.solon.com> any...@loke.vtyh.fi (Anders Nyg}rd) writes:
>> > Why is there such a difference between C and Pascal when it comes
>> > to compiler speed. IMHO C should be slightly faster to parse since

>> > it has simpler and more general syntax or maybe it is just Borlands


>> > Pascal compiler that is so fast.

HL> There is a very technical reson that the Pascal compilers are faster - and
HL> saying that C requires a two pass compiler sort of gets at the issue.

HL> Simply stated the CFG (context free grammar) that describes the Pascal
HL> language is suitable for recursive descent parsing. Recursive descent
HL> parsers are fairly easy to impliment and Pascal was clearly designed to
HL> be implimented this way. However, if you use LL(1) parse tables you can
HL> see an improvement in speed. C can NOT be implimented with LL(1) parse
HL> tables because it is more complex.

Hey, this is all rubbish.

C compilers /can/ be implemented as single-pass. Such a compiler could read
(un-preprocessed) source code and generate an executable.

The only trick is that you need a feedback loop from the parser to the lexical
analyzer to distinguish between typedef identifiers and other identifiers.

Pascal has its own parsing quirk--its related to the fact that declarations
don't take effect immediately. To work correctly, a Pascal parser has to make
two "passes" over each block of declarations (but, yes, the parser can get by
with reading the whole declaration into memory and making two passes over it).

Incidentally, C++ has /both/ quirks: C's typedef quirk and (a form of)
Pascal's forward declaration quirk.

HL> This reply does not mean much to anyone who has not studied compiler
HL> construction - and is not much use because if you had studied compiler
HL> construction you would understand the issue. So I will try and explain
HL> more. Simply put, the definition of the Pascal language is designed to
HL> be easy to parse and the definition of the C language is not (even if you
HL> ignore the pre-processor.) Some examples of what makes the C language
HL> complex to a parser are the use of the "-" token as both a binary minus
HL> and a unary minus. The use of "--" as both a pre and a post decriment,
HL> etc. Those of you who are quick will say "wait a min - pascal has both
HL> types of minus also!" And you are right sort of - but infact the
HL> locations where the minus can occure are different in pascal, also this
HL> may not have been a good example,

Specious.

HL> lets just leave it at Pascal is easier to parse. That is, the simple
HL> reply to your question.

At best, your argument is weak. And you don't help your argument by
concluding with "just trust me on this one."

HL> If you want more information about these issues check out the first few
HL> chapters in a compiler constuction book, where they talk about CFGs and
HL> parsing techniques. C is usually implimented as a LRL(2) I think.

LR(1), actually. See `yacc'. (What's a LRL(2) parser?) See also Aho, Sethi
and Ullman's book--the title escapes me at the moment.

M.

Mike McCarty

unread,
May 23, 1995, 3:00:00 AM5/23/95
to
In article <3prm8b$9...@solutions.solon.com>,
Michael Cook <mc...@cognex.com> wrote:

[nested quotes gone]

)Hey, this is all rubbish.

Was it =all= rubbish?

)C compilers /can/ be implemented as single-pass. Such a compiler could read
)(un-preprocessed) source code and generate an executable.

The easiest way to write a single pass C compiler is to output
assembler. Then let the assembler do two passes :)

)The only trick is that you need a feedback loop from the parser to the lexical
)analyzer to distinguish between typedef identifiers and other identifiers.

Not necessary.

)Pascal has its own parsing quirk--its related to the fact that declarations
)don't take effect immediately. To work correctly, a Pascal parser has to make
)two "passes" over each block of declarations (but, yes, the parser can get by
)with reading the whole declaration into memory and making two passes over it).

This is untrue. I have worked on Pascal compilers which are completely
one pass. No second pass over the symbol table to fix it up. Things just
get filled in as they are encountered.

)Incidentally, C++ has /both/ quirks: C's typedef quirk and (a form of)
)Pascal's forward declaration quirk.

[nested stuff gone]

)At best, your argument is weak. And you don't help your argument by
)concluding with "just trust me on this one."

I agree that "just trust me" is a little weak.

) HL> If you want more information about these issues check out the first few
) HL> chapters in a compiler constuction book, where they talk about CFGs and
) HL> parsing techniques. C is usually implimented as a LRL(2) I think.
)
)LR(1), actually. See `yacc'. (What's a LRL(2) parser?) See also Aho, Sethi
)and Ullman's book--the title escapes me at the moment.
)
)M.

That's not LR(1), that's LALR(1).

I have no idea what an LRL(2) parser would be. Did you perhaps mean
LR(2)?

Mike
----
char *p="char *p=%c%s%c;main(){printf(p,34,p,34);}";main(){printf(p,34,p,34);}

Brent Eric Edwards

unread,
May 23, 1995, 3:00:00 AM5/23/95
to
In article <3prm8b$9...@solutions.solon.com>,
Michael Cook <mc...@cognex.com> wrote:
>Hey, this is all rubbish.
>
>C compilers /can/ be implemented as single-pass. Such a compiler could read
>(un-preprocessed) source code and generate an executable.

Because of things like forward goto statements in both Pascal and C,
such a single-pass compiler would need to use backpatching.

Say... isn't Borland's C, C++ compiler a single-pass compiler (not counting
the linking?)

> HL> If you want more information about these issues check out the first few

> HL> chapters in a compiler constuction book, where they talk about CFGs and

> HL> parsing techniques. C is usually implimented as a LRL(2) I think.
>

>LR(1), actually. See `yacc'. (What's a LRL(2) parser?) See also Aho, Sethi

>and Ullman's book--the title escapes me at the moment.

The book's title is _Compilers_, but because of the picture on the
front, it's generally called _The Dragon Book_.

Brent Edwards

Gordon Couger

unread,
May 23, 1995, 3:00:00 AM5/23/95
to
In article <3prkql$9...@solutions.solon.com>,
Billy Chambless <bi...@skink.NoSubdomain.NoDomain> wrote:

>In article <3p2nf0$4...@solutions.solon.com>, any...@loke.vtyh.fi (Anders Nyg}rd) writes:
> > Why is there such a difference between C and Pascal when it comes to compiler
It is not much if any faster than a true Pascal compiler. Turbo Pascal is
not a standard compiler. Pascal would probably be an almost dead language if
it had not been for Turbo Pascal. In many ways it is a work of art.

When it came out I had a program that took 30 minutes to fully compile in a
standard pascal compiler. The same thing would take almost no time at all
under turbo pascal.

> > speed. IMHO C should be slightly faster to parse since it has simpler and
> > more general syntax
>

> Actually, it's that "simpler and more general syntax" that makes
> C _harder_ to parse. The stricter the syntax of a language, the dumber
> the parser is allowed to be.
>

The linker also makes multiple passes to resolve hidden refferences. It
also has to make a pass for the preprocessor.


> > or maybe it is just Borlands Pascal compiler that is
> > so fast.
>

>That's another part. Borland's Pascal compilers do some really clever
>stuff at the linking stage, which helps a lot.
>--
>* Billy Chambless bi...@cast.msstate.edu
>* Voice: 601.688.3159 FAX: 601.688.7100
>* Mississippi State University Center for Air-Sea Technology

Gordon Couger - 624 Cheyenne, Stillwater, OK 74075
gco...@master.ceat.okstate.edu 405-624-2855 evenings
I do not speak for my employer

Michael Cook

unread,
May 24, 1995, 3:00:00 AM5/24/95
to
>>>>> "MM" == Mike McCarty <jmcc...@spdmail.spd.dsccc.com> writes:

MM> This is untrue. I have worked on Pascal compilers which are completely
MM> one pass. No second pass over the symbol table to fix it up. Things just
MM> get filled in as they are encountered.

The example I saw included a statement saying something like "most pascal
compilers parse this code incorrectly". Perhaps the compilers you worked on
also misparsed.

Michael.

Ray Dunn

unread,
May 24, 1995, 3:00:00 AM5/24/95
to
>Because of things like forward goto statements in both Pascal and C,
>such a single-pass compiler would need to use backpatching.

No, a jump address can be filled in at link time like any other address
(assuming the linker can handle relative addressing).
---
Ray Dunn (opinions are my own) | Phone: (514) 954 9050
Montreal | Phax : (514) 954 9057
r...@ultimate-tech.com | Home : (514) 630 3749

Mike McCarty

unread,
May 26, 1995, 3:00:00 AM5/26/95
to
In article <3q09lv$k...@solutions.solon.com>,
Ray Dunn <r...@ultimate-tech.com> wrote:
)>Because of things like forward goto statements in both Pascal and C,
)>such a single-pass compiler would need to use backpatching.
)
)No, a jump address can be filled in at link time like any other address
)(assuming the linker can handle relative addressing).
)---

AND in Pascal, a forward goto statement is a different kind of thing
from what it is in C. In C a label is just a label. In Pascal, it is
more like a jmpbuf. It allows non-local gotos, outside the current
scope.

Mike McCarty

unread,
May 26, 1995, 3:00:00 AM5/26/95
to
In article <3q09ed$k...@solutions.solon.com>,
Michael Cook <mc...@cognex.com> wrote:
)>>>>> "MM" == Mike McCarty <jmcc...@spdmail.spd.dsccc.com> writes:
)
) MM> This is untrue. I have worked on Pascal compilers which are completely
) MM> one pass. No second pass over the symbol table to fix it up. Things just
) MM> get filled in as they are encountered.
)
)The example I saw included a statement saying something like "most pascal
)compilers parse this code incorrectly". Perhaps the compilers you worked on
)also misparsed.
)
)Michael.

No, they did not misparse. One simply always generates an anonymous
type, and points to it. When the -actual- type is encountered, it is
generated as a high level type, which points to the anonymous one.
Borlan's Turbo Pascal has done this since at least version 2.0, for
example.

Reply all
Reply to author
Forward
0 new messages