What in fact is the preprocessor?

Penco...@gmail.com

May 24, 2007, 1:55:56 PM
What in fact is the preprocessor?

Things that follow # are evaluated by the preprocessor, right?

What in fact does that mean? What is the preprocessor?

Sam of California

May 24, 2007, 4:06:14 PM
<Penco...@gmail.com> wrote in message
news:1180029356....@g4g2000hsf.googlegroups.com...

> What in fact is the preprocessor?
>
> Things that follow # are evaluated by the preprocessor, right?
>
> What in fact does that mean? What is the preprocessor?

Most good books about C and C++ will explain that much better than most
people have time to provide here. If you don't get answers then it is
because people don't want to spend time answering a question that is
answered very well in many books.

A preprocessor reads the files that are input to the compiler and creates a
temporary file that is the result of some specific processing. When the
preprocessor processes an #include, the output contains all the code that is
to be included, and the compiler sees all that code without the #include.

If that is unclear or you have more questions, get a book.


Ian Collins

May 24, 2007, 5:41:53 PM
You'd save everyone a lot of time by getting a decent text book.

--
Ian Collins.

Ron Natalie

May 25, 2007, 6:51:27 AM
Sam of California wrote:

> A preprocessor reads the files that are input to the compiler and creates a
> temporary file that is the result of some specific processing. When the
> preprocessor processes an #include, the output contains all the code that is
> to be included, and the compiler sees all that code without the #include.
>

As far as the language goes, the preprocessor is just a pass in the
parsing of the source file. The fact that it may be a separate program
or put the intermediate results into a temporary file is purely an
implementation detail.

Look up "Phases of translation" in either the C or C++ standard.

Bart van Ingen Schenau

May 25, 2007, 4:33:32 PM
Penco...@gmail.com wrote:

As the others already said, this is explained much better in any
half-decent textbook.

The process of building an executable from your source files can be
viewed as having three stages.

The first stage is called preprocessing. In this stage, the preprocessor
makes a pass over the source file, expanding macros (that are defined
with #define, or on the command-line), replacing the #include lines
with the actual contents of the named header, and processing the other
preprocessor directives.
Lines that start with a #-sign are preprocessor directives. They tell
the preprocessor what to do.
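
To make that concrete, here is a minimal sketch in C. The names BUFFER_SIZE, SQUARE, USE_DOUBLING, scale and buffer_area are made up for illustration; the point is only to show the three kinds of work described above: pulling in a header, expanding macros, and selecting code with a conditional directive.

```c
#include <stdio.h>   /* replaced by the contents of the header */

#define BUFFER_SIZE 16            /* object-like macro (hypothetical name) */
#define SQUARE(x) ((x) * (x))     /* function-like macro (hypothetical name) */
#define USE_DOUBLING 1            /* hypothetical feature switch */

/* Only one of the two definitions survives preprocessing. */
#if USE_DOUBLING
int scale(int n) { return n * 2; }
#else
int scale(int n) { return n; }
#endif

/* After macro expansion the compiler proper sees: return ((16) * (16)); */
int buffer_area(void) { return SQUARE(BUFFER_SIZE); }
```

By the time the second stage starts, none of the lines beginning with # are left; the compiler only sees the text they produced.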

The second stage is the actual compilation. In this stage, the compiler
translates the pre-processed source file into an object file, which
contains executable code and references to functions that could not be
found in the current source file.

The third and last stage is linking, where the linker takes one or more
object files and some library files. With these inputs, the linker
tries to resolve all the references that the compiler had to leave open
in order to create an executable.

In the old days, you had a separate program for each of the stages, but
nowadays preprocessing and compilation are both done by the compiler,
and you can often even ask the compiler to do the linking stage for
you.

Bart v Ingen Schenau
--
a.c.l.l.c-c++ FAQ: http://www.comeaucomputing.com/learn/faq
c.l.c FAQ: http://www.eskimo.com/~scs/C-faq/top.html
c.l.c++ FAQ: http://www.parashift.com/c++-faq-lite/

Sam of California

May 26, 2007, 2:01:42 AM
"Ron Natalie" <r...@spamcop.net> wrote in message
news:4656bef1$0$18418$9a6e...@news.newshosting.com...

>
> As far as the language goes, the preprocessor is just a pass in the
> parsing of the source file. The fact that it may be a separate program
> or put the intermediate results into a temporary file is purely an
> implementation detail.

The preprocessor is not just a pass. It processes statements that the
compiler does not process. The language is very clear that the
preprocessor's statements are totally different from the compiler's.


James Dennett

May 26, 2007, 2:26:42 AM

If you read the C++ standard, you'll find that translation
from source code to programs is defined as a number of
passes; what is traditionally called "preprocessing"
consists of a number of these passes, the result of
which is a sequence of tokens for a translation unit.

There are some minor disagreements, such as about whether
concatenation of adjacent string literals is part of
preprocessing, but it's definitely consistent with the
C and C++ standards to think of preprocessing as an early
pass (or a number of early passes) in the translation
process.
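
As an illustration of the string-literal case, here is a small sketch in C (GREETING, greeting and greeting_length are hypothetical names). Macro expansion happens in translation phase 4, while the joining of adjacent literals is a separate, later phase (phase 6 in the C standard), which is why people disagree about whether to call it preprocessing.

```c
#include <string.h>

#define GREETING "Hello, "   /* expands to a string literal token */

/* Phase 4 expands the macro; phase 6 then joins the two adjacent
   literals into the single literal "Hello, world". */
const char *greeting(void) { return GREETING "world"; }

size_t greeting_length(void) { return strlen(greeting()); }
```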

I'm not sure quite which point(s) you were trying to
illustrate; perhaps you can clarify?

-- James

Sam of California

May 26, 2007, 3:13:24 AM
"James Dennett" <jden...@acm.org> wrote in message
news:EqQ5i.406295$6P2.2...@newsfe16.phx...

>
> If you read the C++ standard, you'll find that translation
> from source code to programs is defined as a number of
> passes; what is traditionally called "preprocessing"
> consists of a number of these passes, the result of
> which is a sequence of tokens for a translation unit.

Regardless of how many passes the preprocessor makes, it will never never
ever never process statements that are not preprocessor statements. After
the preprocessor executes, the data will have no preprocessor statements and
the number of passes made is totally irrelevant to the way the preprocessor
works.

Note that Ron Natalie said that the "preprocessor is just a pass in the
parsing of the source file", implying that there is no such thing as
preprocessor statements that are processed by the preprocessor and only the
preprocessor.


Keith Thompson

May 26, 2007, 4:36:04 AM

Referring to it as a "pass" (which, in typical implementations, is
exactly what it is) doesn't imply any such thing. It's a pass that
handles certain directives and leaves anything that's not part of its
grammar unchanged.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

James Dennett

May 26, 2007, 9:23:48 AM
Sam of California wrote:
> "James Dennett" <jden...@acm.org> wrote in message
> news:EqQ5i.406295$6P2.2...@newsfe16.phx...
>> If you read the C++ standard, you'll find that translation
>> from source code to programs is defined as a number of
>> passes; what is traditionally called "preprocessing"
>> consists of a number of these passes, the result of
>> which is a sequence of tokens for a translation unit.
>
> Regardless of how many passes the preprocessor makes, it will never never
> ever never process statements that are not preprocessor statements.

The term you're looking for would probably be
"directives" rather than "statements".

> After
> the preprocessor executes, the data will have no preprocessor statements

In practice, #line directives or similar annotations
often remain in preprocessor output, though the abstract
C++ specification does not require any such thing (as
it says the output from these phases is a sequence of
tokens, not text).
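
For readers who have not met #line, here is a minimal sketch in C of what the directive itself does. The file name "renamed.c" and the function names are made up for illustration. Compilers that can print preprocessed text (for example with an option such as -E) typically emit similar line markers so that later diagnostics still point at the original file and line.

```c
#include <string.h>

/* The directive below tells the implementation to treat the next source
   line as line 100 of a file called "renamed.c" (a made-up name). */
#line 100 "renamed.c"
int line_seen(void) { return __LINE__; }
int file_is_renamed(void) { return strcmp(__FILE__, "renamed.c") == 0; }
```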

> and the number of passes made is totally irrelevant
> to the way the preprocessor works.

It's very relevant to how preprocessors work, but not
to your point I fear.

> Note that Ron Natalie said that the "preprocessor is just a pass in the
> parsing of the source file", implying that there is no such thing as
> preprocessor statements that are processed by the preprocessor and only the
> preprocessor.

His statement doesn't seem to imply that, though it
could be viewed as misleading to say that preprocessing
is part of parsing; I'd prefer to say that it's a
precursor to parsing. In C++ (and usually in other
programming languages) parsing acts on a sequence of
tokens; preprocessing transforms text into tokens and
manipulates those tokens according to various rules,
which aren't limited only to interpreting directives
(for example, macro expansion includes some pre-defined
macros, and is handled by the preprocessor).

-- James

Sam of California

May 26, 2007, 12:25:17 PM
"James Dennett" <jden...@acm.org> wrote in message
news:FxW5i.337265$JN6.2...@newsfe17.phx...

>
> His statement doesn't seem to imply that, though it
> could be viewed as misleading to say that preprocessing
> is part of parsing


I still think the statement that the "preprocessor is just a pass in the
parsing of the source file" clearly implies something that is not true. If
it is valid, then I still don't understand how it is valid.


James Dennett

May 26, 2007, 4:05:15 PM

Do you accept that "preprocessing is the name given to
a number of passes in the translation of a C++ program"
is true? Is the confusion about the use of the term
"parsing" or something else?

Translation of C++ code is defined in terms of phases
by the C++ standard; preprocessing is a relatively
informal term that encompasses some of the early
phases of that process, stopping prior to applying
most of the grammar of C++ (though the preprocessor
grammar for expressions in conditionals is a subset
of that of the expression grammar used in later
phases).
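
A small sketch in C of that overlap (API_VERSION, HAVE_FAST_PATH and the function names are hypothetical): the controlling expression of #if is written with the same operators and precedence as an ordinary integer constant expression, with a few extras such as defined.

```c
#define API_VERSION 3          /* hypothetical version number */
#define HAVE_FAST_PATH 1       /* hypothetical feature flag */

/* The controlling expression uses ordinary operators and precedence:
   (3 * 2 + 1) > 5 is true, and HAVE_FAST_PATH is defined. */
#if (API_VERSION * 2 + 1) > 5 && defined(HAVE_FAST_PATH)
int selected_path(void) { return 1; }
#else
int selected_path(void) { return 0; }
#endif

/* An identifier that is not a macro evaluates to 0 inside #if. */
#if NOT_DEFINED_ANYWHERE
int undefined_counts_as_true(void) { return 1; }
#else
int undefined_counts_as_true(void) { return 0; }
#endif
```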

-- James

Sam of California

May 26, 2007, 6:20:09 PM
"James Dennett" <jden...@acm.org> wrote in message
news:0q06i.293282$ZA5....@newsfe15.phx...

>
> Do you accept that "preprocessing is the name given to
> a number of passes in the translation of a C++ program"
> is true? Is the confusion about the use of the term
> "parsing" or something else?

No, but my explanation would just be a repetition of what I have already
said. I assume you know the standards better than I do but the point I am
making is very fundamental. It is bizarre to me that we are not
communicating.

> preprocessing is a relatively informal term

It is my understanding that it is not.

> stopping prior to applying most of the grammar of C++

The preprocessor does nothing with the grammar of C++ other than the clearly
defined preprocessor grammer; the preprocessor only processes it's "grammer"
and the subsequent non-preprocessor does not have the preprocessor grammer
anywhere in the data it processes.

Preprocessing is the act of processing the unique grammer that it and only
it processes.


Keith Thompson

May 26, 2007, 7:06:01 PM
"Sam of California" <sam...@social.rr.com_change_social_to_socal> writes:

Right.

The point of disagreement is that you think this implies that the word
"pass" is not appropriate. Most of the rest of us see no such
implication in the word "pass", and I for one have trouble
understanding why you do see such an implication.

I understand your point of view: you're saying that calling the
preprocessor a "pass" is inconsistent with the fact that it leaves
much of its input unchanged. I simply disagree.

It would be helpful if you could present a definition of the word
"pass" that supports your argument.

Ron Natalie

May 26, 2007, 8:05:51 PM
Sam of California wrote:

>
> Regardless of how many passes the preprocessor makes, it will never never
> ever never process statements that are not preprocessor statements. After
> the preprocessor executes, the data will have no preprocessor statements and
> the number of passes made is totally irrelevant to the way the preprocessor
> works.

Incorrect. It performs substitutions based on the preprocessor
directives it previously encountered. It is a phase of translation
officially, and colloquially a pass. There's no such thing as a
"preprocessor statement" in the language.


>
> Note that Ron Natalie said that the "preprocessor is just a pass in the
> parsing of the source file", implying that there is no such thing as
> preprocessor statements that are processed by the preprocessor and only the
> preprocessor.
>

There are no such things as preprocessor statements. There are C (or C++)
statements and preprocessor directives. But even if you omit your
imprecise terminology, what I said is STILL correct. The preprocessing
pass interprets the directives and performs the substitution.

I suggest you actually read one of the standards rather than just guessing.

James Dennett

May 26, 2007, 9:05:02 PM
Sam of California wrote:
> "James Dennett" <jden...@acm.org> wrote in message
> news:0q06i.293282$ZA5....@newsfe15.phx...
>> Do you accept that "preprocessing is the name given to
>> a number of passes in the translation of a C++ program"
>> is true? Is the confusion about the use of the term
>> "parsing" or something else?
>
> No, but my explanation would just be a repetition of what I have already
> said. I assume you know the standards better than I do but the point I am
> making is very fundamental. It is bizarre to me that we are not
> communicating.
>
>> preprocessing is a relatively informal term
>
> It is my understanding that it is not.

It is, in the sense that the C++ standard does not define
"a preprocessor" or "preprocessing". So far as the definition
of C++ (in the standard) goes, there are only phases of
translation. Compared to the definition of translation
in terms of a number of phases (also known as passes),
the notion of preprocessing is more nebulous. Different
implementations do, in fact, vary in which phases they
consider to be part of preprocessing.

>> stopping prior to applying most of the grammar of C++
>
> The preprocessor does nothing with the grammar of C++ other than the clearly
> defined preprocessor grammer;

Why do you think so? I already pointed out that the
expressions handled by the preprocessor are governed
by the same grammar as expressions used in later
phases. What do you think constitutes "the clearly
defined preprocessor grammer [sic]"?

> the preprocessor only processes it's "grammer"
> and the subsequent non-preprocessor does not have the preprocessor grammer
> anywhere in the data it processes.

There isn't a "preprocessor grammar" separate from the
grammar of the rest of C++ in the standard. There
certainly are parts that apply only in the early phases
(such as the grammar rules defining preprocessing
directives), but there are also parts that apply in
handling of conditional compilation directives (in
what is conventionally called preprocessing) as well
as later on (in the phase conventionally, though
rather ambiguously, known as compilation).

> Preprocessing is the act of processing the unique grammer that it and only
> it processes.

But I've pointed out that there's an overlap; there
are not disjoint grammars for preprocessing and for
later phases, though the overlap is small.

In any case, I'd be interested to hear of any concrete
reason, supported by evidence, for attacking the position
that preprocessing is a term applied to some of the
early phases/passes during translation from source code
to executables.

-- James

Sam of California

May 27, 2007, 10:41:10 AM
"Keith Thompson" <ks...@mib.org> wrote in message
news:lnk5uvc...@nuthaus.mib.org...

>
> The point of disagreement is that you think this implies that the word
> "pass" is not appropriate.

Totally inaccurate. You are very much oversimplifying what I am saying.
Please go back and read what it is I said that got this started. Whether you
do or not, the response you made is useless since what you say I am saying
is total fabrication.


Richard Heathfield

May 27, 2007, 10:45:51 AM
Sam of California said:

Whilst it is certainly possible that Keith has misunderstood you, he is
not given to indulging in "total fabrication". Chill out, as they say.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Sam of California

May 27, 2007, 11:22:50 AM
"Richard Heathfield" <r...@see.sig.invalid> wrote in message
news:cdednZVC39q...@bt.com...

> Sam of California said:
>
>> "Keith Thompson" <ks...@mib.org> wrote in message
>> news:lnk5uvc...@nuthaus.mib.org...
>>>
>>> The point of disagreement is that you think this implies that the
>>> word "pass" is not appropriate.
>>
>> Totally inaccurate. You are very much oversimplifying what I am
>> saying. Please go back and read what it is I said that got this
>> started. Whether you do or not, the response you made is useless since
>> what you say I am saying is total fabrication.
>
> Whilst it is certainly possible that Keith has misunderstood you, he is
> not given to indulging in "total fabrication". Chill out, as they say.

I did not say that he does it often or even more than once, but in this
situation it is true, whether intentional or not (I will assume
unintentional), that what he says I said is a fabrication. I am simply
stating a fact; I am not expressing emotion.

By saying "chill out" you are getting personal in a manner I did not.


Ulrich Eckhardt

May 27, 2007, 11:02:47 AM

I think there is some misunderstanding here: the standard defines the steps
it takes to translate source code into an executable. These steps are called
passes. The input for one pass is the output of the former pass. The
algorithm applied to the data in one pass is _not_ the same as that applied
in the former pass.

Now, here is where I think the misunderstanding is: you seem to understand
that James said that the source code (or derivations thereof) is passed a
few times to the preprocessor, which is indeed not the case. Rather, what
is colloquially referred to as 'preprocessing' consists of some of the steps
(passes) that are required to compile source code.

Also, it seems like you were using the term 'parsing' as just the separation
of a file into statements/directives but not the execution of them. My
personal feeling would go into the same direction, but I'm not sure this
separation is universally accepted. I'm pretty sure that e.g. James does
not share my feeling there.

Uli

James Dennett

May 27, 2007, 11:56:50 AM
Ulrich Eckhardt wrote:
> Also, it seems like you were using the term 'parsing' as just the separation
> of a file into statements/directives but not the execution of them. My
> personal feeling would go into the same direction, but I'm not sure this
> separation is universally accepted. I'm pretty sure that e.g. James does
> not share my feeling there.

I'd like to reserve the term "parsing" in this
context for the specific operation of taking a
sequence of tokens and identifying which grammatical
productions gave rise to it (or diagnosing errors if
there is no such production).

However, Sam claimed in this thread that the term
"parsing" wasn't his sticking point, but I can't
see that he has identified any other problem (other
than some mysterious claim that preprocessing being
a pass means something about grammars).

-- James

Richard Heathfield

May 27, 2007, 1:37:44 PM
Sam of California said:
> "Richard Heathfield" wrote:
>> Sam of California said:
>>
<snip>

>>> the response you made is useless
>>> since what you say I am saying is total fabrication.
>>
>> Whilst it is certainly possible that Keith has misunderstood you, he
>> is not given to indulging in "total fabrication". Chill out, as they
>> say.
>
> I did not say that he does it often or even more than once, but in
> this situation it is true, whether intentional or not (I will assume
> unintentional), that what he says I said is a fabrication.

The trouble is that "fabrication" carries with it a strong connotation
of "lie". Keith is no liar. Whether you intended to imply that he is
one or not, it was a poor word choice.

> I am simply stating a fact; I am not expressing emotion.

You are actually stating an opinion. Furthermore, your replies do not
read as if they were emotion-free. If you wish not to convey emotion,
you'll need to work harder at it.

> By saying "chill out" you are getting personal in a manner I did not.

No, it's just a slang expression for "calm down". You don't have to, of
course - it's merely a suggestion.

Keith Thompson

May 27, 2007, 5:05:15 PM
"Sam of California" <sam...@social.rr.com_change_social_to_socal> writes:

It's obvious that you and I use words differently. As Richard says,
the word "fabrication" implies deliberate deceit; see
<http://dictionary.reference.com/browse/fabrication> if you don't
believe me or him.

In effect, you have called me a liar. Before I read your later
responses and saw that you used the word "fabrication" without
understanding what it really means, I assumed that it was deliberate.

For the record, my interpretation of what you wrote may have been
inaccurate, but it was entirely honest. You might wish to explain
more clearly what you actually meant. You might also wish to
apologize, but I'm not going to insist on it.

Yes, this is personal.

Keith Thompson

May 27, 2007, 5:07:53 PM
Ulrich Eckhardt <doom...@knuut.de> writes:
> Sam of California wrote:
[...]

> I think there is some misunderstanding here: the standard defines the steps
> it takes to translate source code into an executable. These steps are called
> passes. The input for one pass is the output of the former pass. The
> algorithm applied to the data in one pass is _not_ the same as that applied
> in the former pass.
[...]

Actually, both the C and C++ standards refer to "phases of translation".

Keith Thompson

May 27, 2007, 5:27:47 PM
"Sam of California" <sam...@social.rr.com_change_social_to_socal> writes:
> "James Dennett" <jden...@acm.org> wrote in message
> news:EqQ5i.406295$6P2.2...@newsfe16.phx...
>> If you read the C++ standard, you'll find that translation
>> from source code to programs is defined as a number of
>> passes; what is traditionally called "preprocessing"
>> consists of a number of these passes, the result of
>> which is a sequence of tokens for a translation unit.
>
> Regardless of how many passes the preprocessor makes, it will never never
> ever never process statements that are not preprocessor statements. After
> the preprocessor executes, the data will have no preprocessor statements and
> the number of passes made is totally irrelevant to the way the preprocessor
> works.

In a typical implementation, the preprocessor makes a single pass over
the source code; it reads the original source and writes a modified
version of that source.

As a matter of terminology, there are no "preprocessor statements";
there are "preprocessing directives". The preprocessor (again, in a
typical implementation) handles these directives; it also expands
macro invocations. It may do a few other things as well, such as
deleting comments and handling trigraphs.
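
Two of those "other things" can be seen in a short sketch in C (SUM3 and six are hypothetical names). It assumes nothing beyond the standard translation phases: line splicing happens in phase 2 and comment removal in phase 3, both before directives and macros are handled in phase 4.

```c
/* Phase 2: a backslash immediately followed by a newline is deleted,
   so this macro definition is one logical line. */
#define SUM3(a, b, c) \
    ((a) + (b) + (c))

/* Phase 3: the comment inside the argument list is replaced by one
   space before macro expansion happens in phase 4. */
int six(void) { return SUM3(1, /* ignored */ 2, 3); }
```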

The C standard defines 8 translation phases. The "preprocessor", if
it's implemented as a distinct program, typically handles the first 4
or 5 of these phases. (The C++ standard defines 9 translation phases;
the added phase 8 handles template instantiation.)

Of course, other implementations are possible. A compiler could have
a distinct program for each phase, or all 8 or 9 translation phases
could be incorporated into a single program.

> Note that Ron Natalie said that the "preprocessor is just a pass in the
> parsing of the source file", implying that there is no such thing as
> preprocessor statements that are processed by the preprocessor and only the
> preprocessor.

I don't understand what you mean by this, and I still have no idea why
you see this implication. What the preprocessor does arguably isn't
part of "parsing". The term "parsing" generally refers to the
syntactic analysis performed in translation phase 7. Is that what you
mean? Please explain.

Would you agree that the preprocessor is just a pass in the
*compilation* (rather than "parsing") of the source file? If not, how
do you define the word "pass", and how is your definition inconsistent
with what the preprocessor does?
