
thoughts on an old proposal: #once


Mike Conley

Mar 19, 2003, 2:45:30 PM
For reference, a great deal of discussion, but no apparent resolution, can
be found here:

http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&threadm=3dc12c7f.5179031%40news.bluecom.no&rnum=28&prev=/groups%3Fq%3D%2523include%2Bgroup:comp.std.*%26start%3D20%26hl%3Den%26lr%3D%26ie%3DUTF-8%26selm%3D3dc12c7f.5179031%2540news.bluecom.no%26rnum%3D28


A concern expressed by a number of posters centered on the question of
determining when two files are, in fact, the same.

I'd like to point out that #ifndef has similar problems: there is no
guarantee that any macro defined to guard any particular header will be
unique, no matter what elaborate steps are taken. Now, it is certainly
true that there are ways to define such macros that make duplicates highly
unlikely, but there are no guarantees.

It is also true that there are ways to organize one's source code that make
the suggested problems unlikely. We are therefore dealing with a question
of probabilities and programmer convenience.

Just to make the problems explicit:

#ifndef guards: it is possible for a macro to be replicated, causing a
file to be ignored when it should not be. This problem can be avoided with
high probability by defining elaborate macros.

#once: it is possible that the same file may be duplicated in multiple
include directories and, consequently, included multiple times. This
problem can be avoided with certainty by not specifying include directories
which contain duplicate versions of the same header.
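
The #ifndef failure mode can be reproduced in a single translation unit. The following is only a sketch: the two guarded regions stand in for two unrelated headers (the names abc/foo.h and def/foo.h, the guard FOO_H, and the marker macros are all made up for illustration) that happen to choose the same guard macro.

```c
/* Stand-in for the contents of a hypothetical abc/foo.h. */
#ifndef FOO_H
#define FOO_H
#define ABC_FOO_PROCESSED 1
#endif

/* Stand-in for a hypothetical def/foo.h: a different header from an
   unrelated project that happens to pick the same guard name. */
#ifndef FOO_H
#define FOO_H
#define DEF_FOO_PROCESSED 1
#endif

/* Report which of the two bodies the preprocessor actually kept. */
int abc_foo_processed(void) {
#ifdef ABC_FOO_PROCESSED
    return 1;
#else
    return 0;
#endif
}

int def_foo_processed(void) {
#ifdef DEF_FOO_PROCESSED
    return 1;
#else
    return 0;
#endif
}
```

The second body is dropped without any diagnostic at the point of inclusion; an error shows up only later, if something declared in the skipped header is actually used.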

Which problem is more likely to occur in practice? Before you answer
"obviously, replication of files is more likely, because I go through a
great deal of trouble to define 256-character-long guard macros with my
social security number in them" :), remember that (1) not everyone (this
includes standard library implementers) goes through such trouble, (2) it
isn't often that multiple include directories that might contain the same
files are used in the same build, and, I think, most tellingly, (3) the
feature is already implemented in most compilers, generally as a pragma
(the only objections to such compiler features that I'm aware of deal with
their lack of portability to other compilers).

Just when would a real problem arise? Consider the steps one must perform
in order to run into a problem (beyond the obvious fact that the header in
question must define something that, if defined twice, would cause a
compiler error):

1) I must set my compiler to search for include files in multiple
directories.

2) The file must appear in more than one of these directories in such a way
that the system itself cannot tell that they are the same (something that,
I think, occurs infrequently in practice -- basically, the file must be an
actual duplicate, and not the same file accessed through different paths,
or it must be accessible through different paths on a remote filesystem
which does not support any facility for determining uniqueness).

Now, I think it important to note here that you, the programmer, can ALWAYS
tell when a file is duplicated in your include directories (by manual
inspection if necessary, but more likely with diff). It is also possible,
though potentially much harder, to find every relevant definition of a
given macro and, from this, deduce whether or not it has been defined at a
given point in a file.

But why would you duplicate header files and then ask your compiler to
search for includes in both places? If you require only certain headers
from another directory (where, it so happens, there are duplicate versions
of headers you have elsewhere), you could easily correct the problem by (1)
eliminating the duplicate headers in one of the directories, (2) copying
the headers you need so that all necessary headers reside in a single
place, or (3) specifying the complete path to the needed include files and
removing the offending directory from your default search path.

Even if you're using NFS or Windows shares to access includes (what
advantage would you gain over using a local copy?), and the shares have
multiple mount points, why would you want to search for include files in
each mount point (as opposed to a single mount point)?

Considering the likelihood of a problem ever arising, is it not reasonable
to expect a programmer to perform one of the 3 steps above to remedy the
situation? Is this not more reasonable than expecting library authors to
come up with globally unique keys for each header they write so that
developers can #include from multiple directories with duplicate files?

--
Mike Conley
Thoughts for sale. $.10 or best offer.

---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std...@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]

jacob navia

Mar 19, 2003, 5:20:45 PM
The implementation of "#pragma once" in lcc-win32 assumes that it applies to
a file path. The compiler will only assume that two files with exactly the
same path (ignoring case) are the same file.

This means that #pragma once applies only to a certain physical file. If you
write:

#include "foo.h"

and in the include path the compiler finds
"z:\shares\include\foo.h"

and in this file a #pragma once occurs, this will apply only to THAT file in
THAT share.

If, while compiling another include file, "x:\shares\include\incs.h"
(on another share), that contains
#include "foo.h"

the compiler finds ANOTHER "x:\shares\include\foo.h", that one
will be compiled AGAIN because it is assumed that two files in different
shared drives are different. Some operating systems may let you share
different and maybe overlapping parts of the remote system into different
shares, but that is no longer the responsibility of the compiler.
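
A path-keyed rule of this kind can be sketched in a few lines of C. This is not lcc-win32's actual code, only an illustration of "same path, ignoring case, means same file"; the function names and the fixed table sizes are assumptions made for the example.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_ONCE_FILES 64
#define MAX_PATH_LEN   260

/* Paths of files in which a "#pragma once" has been seen so far. */
static char once_paths[MAX_ONCE_FILES][MAX_PATH_LEN];
static size_t once_count = 0;

/* Compare two paths character by character, ignoring ASCII case. */
static int paths_equal_nocase(const char *a, const char *b) {
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        ++a;
        ++b;
    }
    return *a == *b; /* equal only if both strings ended together */
}

/* Returns 1 if the path was already marked (skip the file), 0 if it is
   new (and records it), -1 if the path is too long or the table is full. */
int once_check_and_mark(const char *path) {
    size_t i;
    for (i = 0; i < once_count; ++i)
        if (paths_equal_nocase(once_paths[i], path))
            return 1;
    if (once_count == MAX_ONCE_FILES || strlen(path) >= MAX_PATH_LEN)
        return -1;
    strcpy(once_paths[once_count++], path);
    return 0;
}
```

With this rule, "z:\shares\include\foo.h" and "Z:\SHARES\include\Foo.h" count as one file, while "x:\shares\include\foo.h" is treated as a different file even if both drive letters map to the same share.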

Of course this simple schema could break down if we let

#include http://www.mysources.com/shared/include/foo.h

How could the compiler know whether or not that URL refers to the same
physical file?

What does it mean "physical" here anyway?

In Unix (and now under Windows too) a file can be just a symbolic link...
should the compiler verify that the link points to an already known file?

Just let's keep this simple. I would just propose that

#pragma STDC once

is added to the standard.

How to do the "once" is up to the implementation. Lcc-win32 assumes a simple
fact: Two different paths are different files. If your compilation
environment is so complex that that supposition no longer holds you can ask
for a fix if you have a budget for it. It is a QOI issue :-)


"Mike Conley" <conle...@osu.edu> wrote in message
news:Xns9343937778AB...@65.24.2.11...

Mike Conley

Mar 20, 2003, 2:30:06 AM
ja...@jacob.remcomp.fr ("jacob navia") wrote in
news:b5aojg$qi3$1...@news-reader12.wanadoo.fr:

> In Unix (and now under Windows too) a file can be just a symbolic
> link... should the compiler verify that the link points to an already
> known file?

In practice a compiler would have to follow symlinks to see the #pragma
in the first place. (I suppose a compiler could cache the strings that
followed #include and not even try to open duplicates, but that would
result in severely broken behavior for headers designed to be #included
multiple times).


> Just let's keep this simple. I would just propose that
>
> #pragma STDC once
>
> is added to the standard.
>
> How to do the "once" is up to the implementation.

Let's not leave everything up to the implementation :) There should be
a guarantee that a file specifying the #pragma and residing on a local
filesystem is not included twice, no matter the path it is referred to
in an #include (presuming, of course, that we're referring to the very
same file and not a copy -- and, yes, "the very same file" means that
these names all refer to the same data in the same place on the same
disk).

If your local filesystems don't support the basic function of
determining whether or not two names refer to the same physical data on
disk, I doubt they're very useful :)
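
On POSIX systems, one way to ask that question is to compare the device and inode numbers reported by stat(). This is a minimal sketch under that assumption (same_file is a made-up helper name), not a statement of how any particular compiler does it.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Returns 1 if both names resolve to the same file, 0 if they resolve to
   different files, and -1 if either name cannot be examined.
   stat() follows symbolic links, so a link and its target compare equal. */
int same_file(const char *a, const char *b) {
    struct stat sa, sb;
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

For example, same_file("/", "/.") reports a match. The check is only as good as the identifiers the filesystem supplies: device numbers are typically assigned per mount, so one remote directory mounted in two places would probably compare as two different files.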

Network shares are another matter entirely.

A main point of my original post was that #including files from a
network share is probably a bad idea to begin with, and certainly any
difficulties resulting from such use are easy to work around (make a
local copy, don't mount the same directory of the same filesystem twice and
then direct the compiler to search both mountpoints for #include files,
etc.).

That is, I can't see how such difficulties could be a serious impediment to
standardization (with a noticeable warning about limitations if enough
people think it is appropriate). It certainly hasn't stopped compiler
vendors from adding it as an extension.

> Lcc-win32 assumes a simple fact: Two different paths are different
> files. If your compilation environment is so complex that that
> supposition no longer holds you can ask for a fix if you have a budget
> for it. It is a QOI issue :-)

Agreed, though (as I said above) I think lcc might be simplifying things
just a bit too much :)

--
Mike Conley
Too lazy to come up with a clever sig.

Allan W

Mar 20, 2003, 1:13:56 PM
conle...@osu.edu (Mike Conley) wrote

> For reference, a great deal of discussion, but no apparent resolution, can
> be found here:
>
> http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&threadm=
> 3dc12c7f.5179031%40news.bluecom.no&rnum=28&prev=/groups%3Fq%3D%2523include%
> 2Bgroup:comp.std.*%26start%3D20%26hl%3Den%26lr%3D%26ie%3DUTF-8%26selm%
> 3D3dc12c7f.5179031%2540news.bluecom.no%26rnum%3D28
[snip]

> Just to make the problems explicit:
>
> #ifndef guards: it is possible for a macro to be replicated, causing a
> file to be ignored when it should not be.

This is a potential problem. In real life, it is always detected
immediately. (I could contrive an example where the lack of a header
file would change a program's meaning without an error message, but
these do not occur in real life.)

> This problem can be avoided with
> high probability by defining elaborate macros.

This problem can be avoided with REASONABLE probability by defining
simple macros not likely to be used by others. For instance, the
ABC project includes a file named foo.h:
#ifndef ABC_INCLUDE_FOO
#define ABC_INCLUDE_FOO
whereas the DEF project includes a file named foo.h:
#ifndef MYCOMPANY_INCLUDE_PROJECTDEF_FOO_H
#define MYCOMPANY_INCLUDE_PROJECTDEF_FOO_H
This isn't 100% secure. If you use an XYZ library from a third-party
vendor (and who doesn't!), you can't know that the XYZ library doesn't
also include some code from that vendor's ABC library -- and if they
happen to use the same guard format that you do, and they also happen
to have an include file named foo.h, you've still got your collision.
"Conceivably possible" is a far cry from "likely," though. Using such
a scheme, a collision is very unlikely -- and once again, it's usually
easy to detect and fix.

> #once: it is possible that the same file may be duplicated in multiple
> include directories and, consequently, included multiple times.

In some cases, depending on the contents of the header file, this might
be easily detected. More often it will not be, but this also won't
usually change the meaning of the program -- normally, the only penalty
will be poor compile-time performance.

> This
> problem can be avoided with certainty by not specifying include directories
> which contain duplicate versions of the same header.

Did you read the thread you quoted above? IIRC, "Duplicate versions" is
hardly the major stumbling block. Most networking systems, as well as
any OS with the filesystem concept of an "alias", can give you multiple
paths to the exact same file. It's very possible that
q:/myproject/headerone.h
and
r:/yourproject/somehead.h
are *EXACTLY* the same file (edit one and it shows up in both places, etc.)
Thus, a compiler can never be 100% certain about double-includes based
only on the names of header files.

I strongly suspect that every system that has this problem, also has
some method of getting a "normalized" (unique) name for every file it
opens. For instance, in Microsoft Windows this would be the UNC. On
any such platform we could indeed avoid the problem -- but there's no
guarantee that this type of thing is available on all systems, so we
can't call it a generalized solution.
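
As an illustration of the "normalized name" idea on a POSIX system (an analogue chosen for the example, not the Windows UNC mechanism mentioned above), realpath() resolves symbolic links and "." / ".." components into a canonical absolute path. The helper name same_canonical_name is hypothetical.

```c
#define _XOPEN_SOURCE 700 /* ask for the POSIX.1-2008 realpath() declaration */
#include <stdlib.h>
#include <string.h>

/* Returns 1 if both names canonicalize to the same absolute path, 0 if the
   canonical paths differ, and -1 if either name cannot be resolved. */
int same_canonical_name(const char *a, const char *b) {
    char *ca = realpath(a, NULL); /* NULL buffer: realpath allocates one */
    char *cb = realpath(b, NULL);
    int result = -1;
    if (ca != NULL && cb != NULL)
        result = strcmp(ca, cb) == 0;
    free(ca); /* free(NULL) is a harmless no-op */
    free(cb);
    return result;
}
```

This catches symlinks and redundant path components, but it is still a comparison of names: hard links and a remote directory reachable through two mount points keep distinct canonical paths, which is exactly the case that has no general answer.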

> Just when would a real problem arise? Consider the steps one must perform
> in order to run into a problem (beyond the obvious fact that the header in
> question must define something that, if defined twice, would cause a
> compiler error):
>
> 1) I must set my compiler to search for include files in multiple
> directories.

Or have #include directives that specify specific directories. (This is
usually a bad practice, because it makes the code hard to move -- but
it's by no means uncommon.)

> 2) The file must appear in more than one of these directories in such a way
> that the system itself cannot tell that they are the same (something that,
> I think, occurs infrequently in practice -- basically, the file must be an
> actual duplicate, and not the same file accessed through different paths,

If #once allowed both copies to be included, I would accept that.

> or it must be accessible through different paths on a remote filesystem
> which does not support any facility for determining uniqueness).

That's where the difficulty comes in.

> Now, I think it important to note here that you, the programmer, can ALWAYS
> tell when a file is duplicated in your include directories (by manual
> inspection if necessary, but more likely with diff).

How?

I suppose you could say that if you wrote every #include directive yourself.
The problem is in teams with multiple programmers and third-party vendors.

// Foo.cpp
#include "client.h"
#include "product.h"
// ... etc.

// Client.h
// Written by Programmer1
// Need to include QuickForms package from Vendor1
#include "\\SomeServer\Libraries\Vendor1\Forms.h"
// ... etc.

// Product.h
// Written by Programmer2
// Need to include Allforms package from Vendor1
#include "S:/Vendor1/Allforms.h"

The "S:" drive is mapped to \\SomeServer\Libraries. Vendor1's include
file "Allforms.h" includes several others, including "Forms.h"

> It is also possible,
> though potentially much harder, to find every relevent definition of a
> given macro and, from this, deduce whether or not it has been defined at a
> given point in a file.

Macros (and inline functions, and...) can be defined in multiple files.
Just because the first one happens to have already been defined, doesn't
mean that we don't need to process the whole file.

> But why would you duplicate header files and then ask your compiler to
> search for includes in both places?

Multiple programmers, multiple goals

> If you require only certain headers
> from another directory (where, it so happens, there are duplicate versions
> of headers you have elsewhere), you could easily correct the problem by (1)
> eliminating the duplicate headers in one of the directories,

But it's the same directory!

> (2) copying
> the headers you need so that all necessary headers reside in a single
> place,

But they already do!

> or (3) specifying the complete path to the needed include files and
> removing the offending directory from your default search path.

In other words, always using \\SomeServer\Libraries, or always using S:.

The trouble is, Product.h was written 8 years ago for another project
on Windows 3.1 (which didn't understand UNC filenames). It's been
converted to 32-bit since then, and to the newest version of Vendor1's
forms application. Fortunately the code changes were very minor.

Meanwhile, for 6 years we've had a corporate policy to use UNC names
wherever feasible. So when Client.h was written 2 years ago, naturally
we used the servername and sharename appropriately.

> Even if you're using NFS or Windows shares to access includes (what
> advantage would you gain over using a local copy?),

1. We don't have to replicate the Vendor's library on every machine
(space issues, plus one vendor has rather strange licensing terms).
We license over 80 libraries in-house. While no single project uses
more than 20 of them, programmers tend to bounce between projects
on a weekly basis (or even more often).

2. When we upgrade, everyone is current at once.

3. Our file server is highly optimized, and very often outperforms
the cheapo disks we put on individual workstations.

> and the shares have
> multiple mount points, why would you want to search for include files in
> each mount point (as opposed to a single mount point)?

Keep legacy code alive, while using more modern names for more recent
code.

Note: I've fictionalized many of my responses to illustrate reasonable
answers. (For instance, I don't personally keep libraries on a network
share, nor do I use more than 3 at once.) But I think that these answers
are plausible, and illustrate concerns that could happen in the real world.

Allan W

Mar 20, 2003, 5:16:26 PM
ja...@jacob.remcomp.fr ("jacob navia") wrote

> What does it mean "physical" here anyway?

If "Physical" means anything, it ought to mean:
To the best of the compiler's knowledge, these files do/do not
refer to literally the same source of information -- such that if
one of those sources were to change, the other one would as well.

I don't think you can do better without defining a source file,
the types of media it can be on, how those media can be accessed by
the computer, and so on... none of which belong in the standard.

> In Unix (and now under Windows too) a file can be just a symbolic link...
> should the compiler verify that the link points to an already known file?
>
> Just let's keep this simple. I would just propose that
>
> #pragma STDC once
>
> is added to the standard.

I think we should make some effort to keep it consistent with ALL of
the other standard pragmas. Pardon me if I've used excessive quoting
here, but this is absolutely everything that the current C++ standard
says about pragmas:

16.6 Pragma directive [cpp.pragma]
1 A preprocessing directive of the form
# pragma pp-tokens(opt) new-line
causes the implementation to behave in an implementation-defined
manner. Any pragma that is not recognized by the implementation
is ignored.

In other words, the standard doesn't say anything about what the pragmas
mean. It was specifically invented to allow compilers a way to introduce
implementation-defined content without interfering with other compilers!

Mike Conley has been careful NOT to call it #pragma once (which compilers
already do implement), but instead call it by a new name: #once.

> How to do the "once" is up to the implementation.

I think this would be reasonable. The standard should spell out the
intent, but not the formal mechanism.

> Lcc-win32 assumes a simple
> fact: Two different paths are different files. If your compilation
> environment is so complex that that supposition no longer holds you can ask
> for a fix if you have a budget for it.

I suspect that most single-platform compilers (and many multiple-
platform compilers) can do better -- but even this limited form has
value. Plus, as you say,

> It is a QOI issue :-)

---

Mike Conley

Mar 21, 2003, 1:27:18 AM
all...@my-dejanews.com (Allan W) wrote in
news:7f2735a5.03031...@posting.google.com:

>> #once: it is possible that the same file may be duplicated in
>> multiple
>> include directories and, consequently, included multiple times.

> Did you read the thread you quoted above? IIRC, "Duplicate versions"
> is hardly the major stumbling block.

I should have been more clear about this. s/"duplicate versions"/"aliases
or symlinks, etc". Clearly a copy of the same file cannot be protected by
#once, since it isn't "the same" file.


> Most networking systems, as well
> as any OS with the filesystem concept of an "alias", can give you
> multiple paths to the exact same file. It's very possible that

> are *EXACTLY* the same file (edit one and it shows up in both places,
> etc.) Thus, a compiler can never be 100% certain about double-includes
> based only on the names of header files.


I can't imagine a local filesystem not being able to tell you what file an
alias points to. The problem, as I understand it, is that I could have, eg
an NFS server with an include directory mounted on, say, /nfs-includes and
/usr/include, and have a compiler looking in both places for includes.

Even if the mount points don't refer to the same directory on the server
(say one points to a subdirectory of the other), doesn't the programmer
have the ability to resolve the situation with relative ease?

If /nfs-includes and /usr/include point to different directories on the
server, you shouldn't have a problem. The whole point is that #once
probably couldn't make any guarantees about remote file systems, but that
failures would be rare. Just to list the points more clearly:

1) You MUST have a remote file system mounted twice.
2) You must #include headers through these 2 different mount points, as in
#include "/nfs-includes/foo.h" and #include "/usr/include/foo.h", or
you must tell the compiler to look both places for headers.

If you perform the steps above, and your headers are guarded against
multiple inclusion with #once, all bets are off -- guarded headers may be
#included twice. The discussion in the thread I referred to seemed to
suggest that (1) this was a problem and (2) that #ifndef guards solve this
problem.

I'll grant (2). Truthfully, I don't see how (1) is a real problem. Unless
your sysadmin is purposely trying to aggravate you :) Maybe I'm missing
something.


> I strongly suspect that every system that has this problem, also has
> some method of getting a "normalized" (unique) name for every file it
> opens.

You're probably right. If such a facility is available, #once could make
use of it. For systems without such support, the behavior would be up to
the implementation (probably #include the file twice).


>> Now, I think it important to note here that you, the programmer, can
>> ALWAYS tell when a file is duplicated in your include directories (by
>> manual inspection if necessary, but more likely with diff).
>
> How?

Easy. Look through the directory structure of your file server. If you've
gotten an error of some kind, you probably have an idea of what header
you're looking for (which should give you an idea of where to look).


> I suppose you could say that if you wrote every #include directive
> yourself. The problem is in teams with multiple programmers and
> third-party vendors.

> // Client.h
> // Written by Programmer1
> // Need to include QuickForms package from Vendor1
> #include "\\SomeServer\Libraries\Vendor1\Forms.h"
> // ... etc.


Or I could say that people who do that deserve what they get, and that
somebody should tell programmer1 about the -I switch, or the moral
equivalent for his compiler. :)


>> It is also possible,
>> though potentially much harder, to find every relevent definition of
>> a given macro and, from this, deduce whether or not it has been
>> defined at a given point in a file.
>
> Macros (and inline functions, and...) can be defined in multiple
> files. Just because the first one happens to have already been
> defined, doesn't mean that we don't need to process the whole file.

I was referring to the #ifndef guard. If you're trying to figure out why a
header guarded with #ifndef isn't being included, you're really trying to
find a definition of the guard macro outside of said header.


>> If you require only certain headers
>> from another directory (where, it so happens, there are duplicate
>> versions of headers you have elsewhere), you could easily correct the
>> problem by (1) eliminating the duplicate headers in one of the
>> directories,
>
> But it's the same directory!

Again, I should have been clear. When I said duplicates, I meant symlinks.

You might have something like:
/include-dir/foo.h -> /usr/include/foo.h

with some special headers in /include-dir for your project. If this is a
local filesystem, things are fine. But if /include-dir and /usr/include
are on an NFS server and you've mounted them two different places....


>> or (3) specifying the complete path to the needed include files and
>> removing the offending directory from your default search path.
>
> In other words, always using \\SomeServer\Libraries, or always using
> S:.


Naturally, this is the least preferred solution.


>> why would you want to search for include files
>> in each mount point (as opposed to a single mount point)?
>
> Keep legacy code alive, while using more modern names for more recent
> code.

Sensible. If that's a goal of your project, you wouldn't want to use
#once.

If you wanted to convert your headers (or if the originals had used #once),
you could always write a perl script that does s/old-name/new-name on your
#includes :)

I suppose my point is that there are workarounds for any situation for
which #once would fail to work (and that these situations are probably rare
in practice, anyway).

--
Mike Conley

jacob navia

Mar 22, 2003, 2:15:32 PM

"Allan W" <all...@my-dejanews.com> wrote in message
news:7f2735a5.03032...@posting.google.com...

> ja...@jacob.remcomp.fr ("jacob navia") wrote
> >
> > Just let's keep this simple. I would just propose that
> >
> > #pragma STDC once
> >
> > is added to the standard.
>
> I think we should make some effort to keep it consistent with ALL of
> the other standard pragmas. Pardon me if I've used excessive quoting
> here, but this is absolutely everything that the current C++ standard
> says about pragmas:
>
> 16.6 Pragma directive [cpp.pragma]
> 1 A preprocessing directive of the form
> # pragma pp-tokens(opt) new-line
> causes the implementation to behave in an implementation-defined
> manner. Any pragma that is not recognized by the implementation
> is ignored.
>
> In other words, the standard doesn't say anything about what the pragmas
> mean. It was specifically invented to allow compilers a way to introduce
> implementation-defined content without interfering with other compilers!
>
> Mike Conley has been careful NOT to call it #pragma once (which compilers
> already do implement), but instead call it by a new name: #once.
>

C99 defines #pragma STDC for pragmas reserved for the standards committee.
In C++ that is maybe different...

John Nagle

Mar 24, 2003, 6:31:41 PM
Mike Conley wrote:
> all...@my-dejanews.com (Allan W) wrote in
> news:7f2735a5.03031...@posting.google.com:
>
>
>>> #once: it is possible that the same file may be duplicated in
>>> multiple
>>> include directories and, consequently, included multiple times.

If this is a real problem (which it probably isn't), it
can be solved by detecting duplicate files with a cryptographic
grade hash, like MD5.

That has to work better than "My include guards are so
obscure that nobody will duplicate them by accident".

John Nagle
Animats

Allan W

Mar 27, 2003, 1:36:29 PM
na...@animats.com (John Nagle) wrote

> > all...@my-dejanews.com (Allan W) wrote


> >>> #once: it is possible that the same file may be duplicated in
> >>> multiple
> >>>include directories and, consequently, included multiple times.

Excuse me -- but that isn't me you were quoting. Please try to be
more careful with your attributions.

> If this is a real problem (which it probably isn't), it
> can be solved by detecting duplicate files with a cryptographic
> grade hash, like MD5.

So, you're suggesting that when a header file is marked with #once,
the compiler should scan the entire header file to develop a hash.
And then, if the hash was NOT previously seen along with #once, it
could scan the entire header file again, this time processing the
contents.

I suppose that if #once was part of the language, this would be one
way to support it. Let's call it a QOI issue.
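
A rough sketch of that two-pass idea follows. To keep it self-contained it uses 64-bit FNV-1a, which is a plain non-cryptographic hash swapped in for MD5 purely for brevity; the function names and the fixed-size table are likewise assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SEEN 128

/* Hashes of the contents of every #once file processed so far. */
static uint64_t seen_hashes[MAX_SEEN];
static size_t seen_count = 0;

/* 64-bit FNV-1a over a buffer (a stand-in for a real cryptographic hash). */
uint64_t fnv1a64(const unsigned char *data, size_t len) {
    uint64_t h = 14695981039346656037ULL; /* FNV offset basis */
    size_t i;
    for (i = 0; i < len; ++i) {
        h ^= data[i];
        h *= 1099511628211ULL; /* FNV prime */
    }
    return h;
}

/* First pass over a #once file: returns 1 if identical contents were seen
   before (skip the file), 0 if the contents are new (hash recorded, caller
   goes on to process them), -1 if the table is full. */
int once_by_content(const unsigned char *data, size_t len) {
    uint64_t h = fnv1a64(data, len);
    size_t i;
    for (i = 0; i < seen_count; ++i)
        if (seen_hashes[i] == h)
            return 1;
    if (seen_count == MAX_SEEN)
        return -1;
    seen_hashes[seen_count++] = h;
    return 0;
}
```

Keying on contents rather than names also treats true copies of a header as "the same file", at the cost of reading every candidate file in full at least once.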

Furthermore, your idea makes it sound as if we should never include
a header file more than once. But some header files were designed
specifically to include as often as needed -- this is an obscure but
legal technique today. We need to have some way to distinguish such
header files. The obvious solution for backward-compatibility is to
mark header files with #once if (and only if) multiple inclusions are
either redundant or cause errors, while leaving off the declaration
on header files meant to be #included two or more times... Maybe we
could reverse this by using #multiple in the more unusual headers?
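
For readers who have not met headers that are meant to be included repeatedly, the usual case is a list expanded several times under different macro definitions. In the sketch below the list is written inline as COLOR_LIST so that the example stands alone; in a real project it might live in its own file (say, a hypothetical colors.def) that is #included once per expansion and therefore must not be guarded. All names here are made up for illustration.

```c
/* The list of entries. In a real project this would typically be the
   entire contents of a separate, unguarded header. */
#define COLOR_LIST \
    X(RED)         \
    X(GREEN)       \
    X(BLUE)

/* First expansion: build the enumerators. */
enum color {
#define X(name) COLOR_##name,
    COLOR_LIST
#undef X
    COLOR_COUNT
};

/* Second expansion of the same list: build the matching name table. */
static const char *const color_names[COLOR_COUNT] = {
#define X(name) #name,
    COLOR_LIST
#undef X
};

/* Look up the printable name of a color; "?" for out-of-range values. */
const char *color_name(enum color c) {
    return ((unsigned)c < COLOR_COUNT) ? color_names[c] : "?";
}
```

A header used this way would break under any rule that silently suppressed second inclusions, which is why the marking has to be opt-in in one direction or the other.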

> That has to work better than "My include guards are so
> obscure that nobody will duplicate them by accident".

If you mean an established pattern of using a header file name as
part of the symbol in an include guard, I disagree. In fact, I see
these two different mechanisms as complementary, unless we can
guarantee that ALL C++ compilers would understand #once (or
#multiple). Even if this idea comes to pass, it would take several
(ten?) years before we could take it for granted.

Thomas David Rivers

Mar 28, 2003, 4:32:54 PM

Just to add to that...

Even in the case of inclusion guards, many compilers will
actually reread the source (incurring the I/O) while
"skipping" the body (because of the inclusion guards.)

But - there is a way around this without using #pragma once,
and hence, I would argue against it. Our C and C++ compilers
accomplish this.

First of all - the only "real" way for a compiler to decide
if the file is "the same" is by the name used in an fopen()
statement... I think it's safe for a compiler to assume
that what name is used to access the file is the arbiter
of file uniqueness.

Our compilers cache these names, and recognize the
typical inclusion guards.

Thus, if a file has an inclusion guard such that
nothing else is defined outside of the guard, the
compiler knows that if that guard is still defined,
and it finds itself fopen()'ing the associated file
name again... it doesn't bother.
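
The bookkeeping described here might look roughly like the following. This is a toy model, not the Dignus implementation: the macro table is a stand-in for a real preprocessor's symbol table, and every name in it is hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 64
#define MAX_NAME    128

/* One remembered header: the name it was opened by and its guard macro. */
struct guarded_file {
    char file[MAX_NAME];
    char guard[MAX_NAME];
};

static struct guarded_file guard_cache[MAX_ENTRIES];
static size_t guard_cache_count = 0;

/* Toy stand-in for the preprocessor's table of currently defined macros. */
static char defined_macros[MAX_ENTRIES][MAX_NAME];
static size_t defined_count = 0;

static int macro_index(const char *name) {
    size_t i;
    for (i = 0; i < defined_count; ++i)
        if (strcmp(defined_macros[i], name) == 0)
            return (int)i;
    return -1;
}

void define_macro(const char *name) {
    if (macro_index(name) < 0 && defined_count < MAX_ENTRIES &&
        strlen(name) < MAX_NAME)
        strcpy(defined_macros[defined_count++], name);
}

void undef_macro(const char *name) {
    int i = macro_index(name);
    if (i >= 0) {
        --defined_count; /* move the last entry into the freed slot */
        if ((size_t)i != defined_count)
            strcpy(defined_macros[i], defined_macros[defined_count]);
    }
}

/* Called after the first full read of a file that turned out to be
   wrapped entirely in "#ifndef guard ... #endif". */
void remember_guard(const char *file, const char *guard) {
    if (guard_cache_count < MAX_ENTRIES &&
        strlen(file) < MAX_NAME && strlen(guard) < MAX_NAME) {
        strcpy(guard_cache[guard_cache_count].file, file);
        strcpy(guard_cache[guard_cache_count].guard, guard);
        ++guard_cache_count;
    }
}

/* Called before reopening a file: returns 1 only if the file is known to
   be fully guarded AND its guard macro is still defined right now. */
int can_skip_include(const char *file) {
    size_t i;
    for (i = 0; i < guard_cache_count; ++i)
        if (strcmp(guard_cache[i].file, file) == 0)
            return macro_index(guard_cache[i].guard) >= 0;
    return 0;
}
```

Because the guard is rechecked at every #include, an intervening #undef of the guard macro simply makes the compiler open and read the file again.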

This avoids the I/O, and furthermore, avoids
having to add a #pragma.

It seems to me, that solves the problem - doesn't it?

- Dave Rivers -

--
riv...@dignus.com Work: (919) 676-0847
Get your mainframe programming tools at http://www.dignus.com

jacob navia

Mar 28, 2003, 9:59:10 PM
> First of all - the only "real" way for a compiler to decide
> if the file is "the same" is by the name used in an fopen()
> statement... I think it's safe for a compiler to assume
> that what name is used to access the file is the arbiter
> of file uniqueness.
>

yes

> Our compilers cache these names, and recognize the
> typical inclusion guards.
>

This is easier said than done.

for instance
#ifndef __file_h__
/*
300 lines of comments
*/
#define __file_h__
// rest of file here
#endif


OK OK, then you mean that:
1: The first preprocessor directive encountered is an #ifndef
2: The next (immediately following) preprocessor directive is a #define of the
named symbol
3: The #ifndef/#endif pair encloses the whole file, excluding comment lines
before or after

THEN
the compiler can catch it.

OK. OK. But what about

#if ! defined(__file_h_)
#define __file_h_
...
etc

This is equivalent but looks completely different. OK, add the following rules:
4: Else if the first token is #if, followed by an expression that tests whether a
symbol is defined AND
5: The next preprocessor directive (on the next line) defines that symbol
THEN we have one bug less in the compiler... :-)

OK. OK. But what about

#ifndef __file_h__
#define __file_h__ 1
#else
#undef __file_h__
#define __file_h__ 2
#endif

#if (__file_h__ == 1)
// Rest of the file here
#endif

OK Add following rule:
... :-)

The #pragma construct has many reasons to exist. Why make things more
complicated than they already are?
In lcc-win32 the compiler doesn't try to second guess you. You write

#pragma once

when you want a file included once.
And you can be sure (and the reader too) that it means what is written!


> Thus, if a file has an inclusion guard such that
> nothing else is defined outside of the guard, the
> compiler knows that if that guard is still defined,
> and it finds itself fopen()'ing the associated file
> name again... it doesn't bother.

You forget that the preprocessor supports the #undef directive...
Besides, it is much more practical to write somewhere in the file

#pragma once

than to write 3 full lines taking care not to forget the enclosing #endif...
that will be FAR away from its start.

Mike Conley

Mar 29, 2003, 4:58:05 PM
riv...@dignus.com (Thomas David Rivers) wrote in
news:3E84B464...@dignus.com:

> Just to add to that...
>
> Even in the case of inclusion guards, many compilers will
> actually reread the source (incurring the I/O) while
> "skipping" the body (because of the inclusion guards.)
> But - there is a way around this without using #pragma once,
> and hence, I would argue against it. Our C and C++ compilers
> accomplish this.

The real problem, as I see it, is the requirement that every header you
want to guard has to have a globally unique key, and that generating this
key is bothersome and error prone. It's another thing that a compiler
can and should do for you (many already can), provided you ask it to.
The problem is that it isn't standardized and thus cannot be relied upon
in portable code.

A sticking point seems to be that it may be impossible to determine
whether or not two names refer to the same file in some cases. I doubt
that any system actually exhibits the characteristics necessary for this
problem to arise.


> First of all - the only "real" way for a compiler to decide
> if the file is "the same" is by the name used in an fopen()
> statement...

Two different names to an fopen call could easily refer to the same file
(symlinks, relative vs absolute pathnames, etc). It would be more
accurate to call stat and use the device/inode numbers.
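
A minimal sketch of that comparison, for POSIX systems only. The function name same_file is made up for the example, and treating a failed stat() as "different" is an assumption about the safe default, not something the thread settles.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* POSIX-only sketch: two path names denote the same file when both can be
   stat()ed and their device and inode numbers agree. */
static bool same_file(const char *path_a, const char *path_b)
{
    struct stat a, b;

    if (stat(path_a, &a) != 0 || stat(path_b, &b) != 0)
        return false;   /* unknown: treat as different, so the file is read */
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
```

Since stat() follows symbolic links, a symlink and its target compare equal here, as do "foo.h" and "./foo.h".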


>I think it's safe for a compiler to assume
> that what name is used to access the file is the arbiter
> of file uniqueness.


This is probably true much of the time, but I think we can do better,
depending on the interface the OS provides for various filesystems.


> Our compilers cache these names, and recognize the
> typical inclusion guards.

> It seems to me, that solves the problem - doesn't it?


It solves a problem, but not the one I'm complaining about :)

--
Mike Conley

Christian Bau

Mar 29, 2003, 5:46:50 PM
In article <b62jin$p8j$1...@news-reader10.wanadoo.fr>,

ja...@jacob.remcomp.fr ("jacob navia") wrote:

> > First of all - the only "real" way for a compiler to decide
> > if the file is "the same" is by the name used in an fopen()
> > statement... I think it's safe for a compiler to assume
> > that what name is used to access the file is the arbiter
> > of file uniqueness.
> >
>
> yes
>
> > Our compilers cache these names, and recognize the
> > typical inclusion guards.
> >
>
> This is easier said than done.
>
> for instance
> #ifndef __file_h__
> /*
> 300 lines of comments
> */
> #define __file_h__
> // rest of file here
> #endif

You don't need to check whether __file_h__ becomes #defined inside the
header file, it doesn't matter. It doesn't matter whether the first
non-empty line is an #if, #ifdef or #ifndef. All you need to do is find
out that the structure is one of

#if /* or #ifdef or #ifndef */
<arbitrary lines>
#endif

or

#if /* or #ifdef or #ifndef */
<arbitrary lines>
#else
<whitespace only>
#endif

or

#if /* or #ifdef or #ifndef */
<whitespace only>
#else
<arbitrary lines>
#endif

Remember the #if line completely, evaluate when the #include statement
is encountered and decide whether everything included would be white
space only. (Make sure that the #if line is evaluated in the correct
context, for example I could write

#if __LINE__ > 100
...
#endif

).
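
A rough sketch of the structural check in C, to show what "find out that the structure is one of" might involve. It only classifies the shape; remembering and re-evaluating the #if line is left out. It assumes comments are already stripped and there are no line continuations, and it conservatively rejects any top-level #elif. The names wrap_shape and classify_header are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum wrap_shape { NOT_WRAPPED, BODY_IN_IF, BODY_IN_ELSE };

static bool opens_conditional(const char *kw)
{
    return strcmp(kw, "if") == 0 || strcmp(kw, "ifdef") == 0
        || strcmp(kw, "ifndef") == 0;
}

/* Decide whether all non-blank text lies in one branch of a single
   top-level conditional that spans the whole file. */
static enum wrap_shape classify_header(const char *text)
{
    int depth = 0;
    bool closed = false, in_else = false;
    bool if_content = false, else_content = false;

    while (*text != '\0') {
        char line[256], kw[16];
        size_t len = strcspn(text, "\n");
        size_t n = len < sizeof line - 1 ? len : sizeof line - 1;

        memcpy(line, text, n);
        line[n] = '\0';
        text += len + (text[len] == '\n' ? 1 : 0);

        if (line[strspn(line, " \t\r")] == '\0')
            continue;                        /* whitespace-only line */
        if (sscanf(line, " # %15[a-z]", kw) != 1)
            kw[0] = '\0';                    /* ordinary, non-directive line */

        if (depth == 0) {
            if (closed || !opens_conditional(kw))
                return NOT_WRAPPED;          /* text outside the wrapper */
            depth = 1;
        } else if (strcmp(kw, "endif") == 0) {
            if (--depth == 0)
                closed = true;
        } else if (depth == 1 && strcmp(kw, "elif") == 0) {
            return NOT_WRAPPED;              /* more than two branches */
        } else if (depth == 1 && strcmp(kw, "else") == 0) {
            in_else = true;
        } else {
            if (opens_conditional(kw))
                depth++;                     /* nested conditional is content */
            if (in_else)
                else_content = true;
            else
                if_content = true;
        }
    }
    if (!closed || (if_content && else_content))
        return NOT_WRAPPED;
    return else_content ? BODY_IN_ELSE : BODY_IN_IF;
}
```

BODY_IN_IF covers the first two shapes above and BODY_IN_ELSE the third; a real implementation would gather this while processing the header for the first time rather than in a separate pass.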

jacob navia

Mar 29, 2003, 6:48:56 PM
Quote:

> You don't need to check whether __file_h__ becomes #defined inside the
> header file, it doesn't matter. It doesn't matter whether the first
> non-empty line is an #if, #ifdef or #ifndef. All you need to do is find
> out that the structure is one of
>

OK. Here are your rules now:
1)

> #if /* or #ifdef or #ifndef */
> <arbitrary lines>
> #endif

The endif should be the last line or followed only by comments. The #if
construct MUST be the first thing that the compiler sees after stripping
comments.

>
> or
>

2)

> #if /* or #ifdef or #ifndef */
> <arbitrary lines>
> #else
> <whitespace only>
> #endif
>

Same as above. The problem is determining this stuff precisely and with no
bugs.

> or
>

3)

> #if /* or #ifdef or #ifndef */
> <whitespace only>
> #else
> <arbitrary lines>
> #endif
>

Yes, rule count is now 3...


> Remember the #if line completely, evaluate when the #include statement
> is encountered and decide whether everything included would be white
> space only.

And why should I go to these extremes?

My whole point is:

WHY COMPLICATE THINGS?

Why can't the user clearly say

#pragma once

and be done with it?

The user is forced to figure out a new name for each file that could
interfere with the name space of the program. With #pragma once there is:

1) No need to generate a name and consume brain power in such a task that
can be better done by computers.
2) No need to enclose the whole file
3) It is much shorter and convenient. The compiler doesn't need to spend
time and dedicate code to figure out that this construct is actually a
#pragma once, and the user doesn't need to spend any effort figuring out
strange identifiers that must be typed EXACTLY TWICE.

I have been bitten by the bug of writing

#ifndef __file_h_
#define __file_h__
///
#endif

Any typing mistake is fatal here and you get NO warnings. What is incredible
about these bugs is that you do NOT see them. The brain automatically
corrects the eyes and I see the SAME identifier OVER and OVER. It took me a
while to get past my brain and see that I was missing the second final
underscore.
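
A mismatch like this can at least be caught mechanically. Below is a minimal sketch of such a check in C; the function name guard_names_match is hypothetical, and the sketch assumes the directives are written with no space after the '#' and that the first #define in the file is the guard's.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if the first #ifndef and the first #define name the same macro,
   0 if they differ (the typo case), -1 if either directive is missing. */
static int guard_names_match(const char *header_text)
{
    char tested[64], defined[64];
    const char *ifndef_at = strstr(header_text, "#ifndef");
    const char *define_at = strstr(header_text, "#define");

    if (ifndef_at == NULL || define_at == NULL)
        return -1;
    if (sscanf(ifndef_at, "#ifndef %63[A-Za-z0-9_]", tested) != 1
        || sscanf(define_at, "#define %63[A-Za-z0-9_]", defined) != 1)
        return -1;
    return strcmp(tested, defined) == 0;
}
```

Run over the example above, the check returns 0 because __file_h_ and __file_h__ differ by one underscore.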

I implemented "#pragma once" after I fixed that. It is so simple, just a few
lines of code in the compiler!

By using

#pragma once

typing mistakes are harmless since they will provoke just a preprocessor
error.


> (Make sure that the #if line is evaluated in the correct
> context, for example I could write
>
> #if __LINE__ > 100
> ...
> #endif
>
> ).


Imagine. Imagine all the bugs that I could have implementing that.

And you did NOT offer any new rule for

#ifndef __file_h__
#define __file_h__ 1
#else
#undef __file_h__
#define __file_h__ 2
#endif

#if (__file_h__ == 1)
///
#endif

Second guessing the user is bad practice: it complicates the compiler and
complicates the program.

Mike Conley

Mar 30, 2003, 11:22:44 PM
all...@my-dejanews.com (Allan W) wrote in
news:7f2735a5.03032...@posting.google.com:

in reply to john nagle:

> So, you're suggesting that when a header file is marked with #once,
> the compiler should scan the entire header file to develop a hash.
> And then, if the hash was NOT previously seen along with #once, it
> could scan the entire header file again, this time processing the
> contents.
>
> I suppose that if #once was part of the language, this would be one
> way to support it. Let's call it a QOI issue.

Especially since we can be sure that it would not be necessary for (at
least) local filesystems. I'd say that, if the compiler cannot determine,
given two pathnames (NOT the contents of the files), whether or not they
refer to the same file, the behavior should be implementation defined.
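
For reference, the content-hash fallback discussed in the quoted text could look roughly like the sketch below. It uses 64-bit FNV-1a purely as an example hash; the names seen_set and first_time_seen are hypothetical, and a careful implementation would compare the bytes on a hash match, since equal hashes do not prove equal contents.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SEEN 128

/* 64-bit FNV-1a over a buffer. */
static uint64_t fnv1a64(const void *data, size_t len)
{
    const unsigned char *bytes = data;
    uint64_t hash = 0xcbf29ce484222325ULL;   /* FNV offset basis */

    for (size_t i = 0; i < len; i++) {
        hash ^= bytes[i];
        hash *= 0x100000001b3ULL;            /* FNV prime */
    }
    return hash;
}

struct seen_set {
    uint64_t hashes[MAX_SEEN];
    size_t count;
};

/* True if no header with this content hash was recorded before; the hash is
   then recorded. When the set is full the header is simply processed again. */
static bool first_time_seen(struct seen_set *seen,
                            const void *contents, size_t len)
{
    uint64_t hash = fnv1a64(contents, len);

    for (size_t i = 0; i < seen->count; i++)
        if (seen->hashes[i] == hash)
            return false;
    if (seen->count < MAX_SEEN)
        seen->hashes[seen->count++] = hash;
    return true;
}
```

The cost is visible in the signature: the whole file has to be read before the compiler can decide to ignore it, which is why comparing path names first is attractive.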


> unless we can
> guarantee that ALL C++ compilers would understand #once (or
> #multiple).

If #once is standardized, those compilers not supporting it wouldn't be
standard-compliant (and that's the idea :)


> Even if this idea comes to pass, it would take several
> (ten?) years before we could take it for granted.

I think you're being too pessimistic. Many compilers implement it now as a
pragma. It would likely take them less than 10 days. Compilers that don't
implement it (I'm not aware of any, but I'm sure they exist), only need to
add a fairly simple extension to what is already a fairly simple macro
processor. We're not exactly talking about implementing export here :)

--
Mike Conley

Dave Hansen

Mar 30, 2003, 11:22:49 PM
On Fri, 28 Mar 2003 21:32:54 +0000 (UTC), riv...@dignus.com (Thomas
David Rivers) wrote:

[...]


>
>It seems to me, that solves the problem - doesn't it?
>

Not when the problem is colliding guard symbols. Something like
#pragma once could use file identity in the way you describe without
the worry that some other .h file #defined the same guard.

Regards,

-=Dave
--
Change is inevitable, progress is not.

Thomas David Rivers

Mar 31, 2003, 10:09:41 AM


I'm sorry.. but, I fail to see the difference between
adding a #pragma once and requiring that the source
be reasonably recognizable... both are changes,
aren't they?

That is - what's the difference between changing
the source by adding a #pragma once, and changing
the source so its #ifdef guards are recognizable?

And - many, many sources are already reasonably in
a form that can be recognized, unchanged...

David Hopwood

Mar 31, 2003, 2:00:40 PM
Christian Bau wrote:
[snip]

> #if /* or #ifdef or #ifndef */
> <arbitrary lines>
> #endif

[snip]

This will incorrectly match

#if /* or #ifdef or #ifndef */
<arbitrary lines>
#elif /* anything */
<not whitespace only>
#endif

--
David Hopwood <david....@zetnet.co.uk>

Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/
RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5 0F 69 8C D4 FA 66 15 01
Nothing in this message is intended to be legally binding. If I revoke a
public key but refuse to specify why, it is because the private key has been
seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip

jacob navia

Mar 31, 2003, 2:01:02 PM
> I'm sorry.. but, I fail to see the difference between
> adding a #pragma once and requiring that the source
> be reasonably recognizable... both are changes,
> aren't they?
>
> That is - what's the difference between changing
> the source by adding a #pragma once, and changing
> the source so its #ifdef guards are recognizable?
>
> And - many, many sources are already reasonably in
> a form that can be recognized, unchanged...

Nobody FORCES you to use #pragma once. You can still:

1) generate an identifier that is not used elsewhere.
2) Add at the top the guard, typing the identifier TWICE without any mistake
3) Make sure none of the other files in your build (present and future) uses
that identifier

The compiler will not see the file. I/O will still be done though, but this
is fairly fast with a reasonable OS file cache.

Eric Backus

Apr 1, 2003, 12:54:11 AM
"Mike Conley" <conle...@osu.edu> wrote in message
news:Xns934CCF9B4EBB...@65.24.2.11...

> Two different names to an fopen call could easily refer to the same file
> (symlinks, relative vs absolute pathnames, etc). It would be more
> accurate to call stat and use the device/inode numbers.

More accurate on Posix systems, but less accurate everywhere else. But
really the method of determining sameness is an implementation detail--a C
implementation must know the details of the environment in which it runs,
and it can do whatever it takes to decide whether two files are the "same"
file.

--
Eric Backus
R&D Design Engineer
Agilent Technologies, Inc.
425-335-2495 Tel

Fergus Henderson

Apr 1, 2003, 1:55:00 AM
This topic has also been discussed to death on the gcc mailing list.

See the threads starting at <http://gcc.gnu.org/ml/gcc/2003-01/msg01224.html>,
<http://gcc.gnu.org/ml/gcc/2003-02/msg00294.html>,
and <http://gcc.gnu.org/ml/gcc/2003-03/msg00269.html>.

--
Fergus Henderson <f...@cs.mu.oz.au> | "I have always known that the pursuit
The University of Melbourne | of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh> | -- the last words of T. S. Garp.

Christian Bau

Apr 1, 2003, 1:47:04 PM
In article <3E88654A...@zetnet.co.uk>,
david....@zetnet.co.uk (David Hopwood) wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
>
> Christian Bau wrote:
> [snip]
> > #if /* or #ifdef or #ifndef */
> > <arbitrary lines>
> > #endif
> [snip]
>
> This will incorrectly match
>
> #if /* or #ifdef or #ifndef */
> <arbitrary lines>
> #elif /* anything */
> <not whitespace only>
> #endif

I would expect anyone who implements this to avoid trivial mistakes.

James Dennett

Apr 1, 2003, 1:47:39 PM
jacob navia wrote:

[snip]

>>Remember the #if line completely, evaluate when the #include statement
>>is encountered and decide whether everything included would be white
>>space only.
>
>
> And why should I go to these extremes?

It's not an extreme, it's quite simple, it's existing
practice, and it works.

> My whole point is:
>
> WHY COMPLICATE THINGS?

Indeed. We don't want to complicate things, so we don't
want to change the semantics of #pragma -- let's leave them
implementation-defined for C++.

>
> Why can't the user clearly say
>
> #pragma once
>
> and be done with it?

Because defining what that means is turning out to be far
from trivial, and because we don't need a new mechanism
when the rules of the preprocessor are already sufficient
to allow this optimisation.

> The user is forced to figure out a new name for each file that could
> interfere with the name space of the program. With #pragma once there is:

I have my editor write that in, along with other boilerplate
such as documentation comment outlines.

> 1) No need to generate a name and consume brain power in such a task that
> can be better done by computers.

So, get your computer to do it. Many of us do.

> 2) No need to enclose the whole file

A trivial saving, IMO.

> 3) It is much shorter and convenient. The compiler doesn't need to spend
> time and dedicate code to figure out that this construct is actually a
> #pragma once, and the user doesn't need to spend any effort figuring out
> strange identifiers that must be typed EXACTLY TWICE.

It is insignificantly shorter. Language changes on the basis
of saving a little typing at the expense of complicating the
language need rather more justification.

> I have been bitten by the bug of writing
>
> #ifndef __file_h_
> #define __file_h__
> ///
> #endif
>
> Any typing mistake is fatal here and you get NO warnings. What is incredible
> about these bugs is that you do NOT see them. The brain automatically
> corrects the eyes and I see the SAME identifier OVER and OVER. It took me a
> while to get past my brain and see that I was missing the second final
> underscore.

Again, automating this with a sensible editor can avoid
that problem, and having a simple Perl script to check if
you're really having problems is quite easy.

> I implemented "#pragma once" after I fixed that. It is so simple, just a few
> lines of code in the compiler!

Did you cope with multiple different names for the same file?
What if it is accessed via different network paths?

> By using
>
> #pragma once
>
> typing mistakes are harmless since they will provoke just a preprocessor
> error.

Or you'll invoke a different pragma, which (being implementation
defined) might do almost anything.

>>(Make sure that the #if line is evaluated in the correct
>>context, for example I could write
>>
>> #if __LINE__ > 100
>> ...
>> #endif
>>
>>).
>
>
>
> Imagine. Imagine all the bugs that I could have implementing that.
>
> And you did NOT offer any new rule for
>
> #ifndef __file_h__
> #define __file_h__ 1
> #else
> #undef __file_h__
> #define __file_h__ 2
> #endif
>
> #if (__file_h__ == 1)
> ///
> #endif
>
> Second guessing the user is bad practice: it complicates the compiler and
> complicates the program.

There's no need for an extra rule to handle that one.
It's a contrived example, and it doesn't matter if the
file is opened more than once in such artificial cases.
Would it be sensible to add #pragma once to such a file?

In any case, the same rule applies. This header file
is _not_ idempotent (including it a second time changes
the macro), and so it must be reprocessed, and adding
#pragma once to it would change its meaning. Which
illustrates that #pragma once is dangerous if you use
it in situations like this, whereas the existing rules
work just fine, and the optimisation also works safely
within the framework of the existing rules.
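
The non-idempotence can be seen without any separate header file. In the sketch below the body of the contrived header is pasted twice into one translation unit to imitate two inclusions; FILE_H_STATE is a stand-in for __file_h__ (chosen to avoid a reserved identifier) and the enum constants are only there to capture the macro's value at each point.

```c
/* First "inclusion" of the header body. */
#ifndef FILE_H_STATE
#define FILE_H_STATE 1
#else
#undef FILE_H_STATE
#define FILE_H_STATE 2
#endif
enum { state_after_first = FILE_H_STATE };

/* Second "inclusion" of exactly the same body. */
#ifndef FILE_H_STATE
#define FILE_H_STATE 1
#else
#undef FILE_H_STATE
#define FILE_H_STATE 2
#endif
enum { state_after_second = FILE_H_STATE };
```

The macro is 1 after the first pass and 2 after the second, so an implementation that skipped the second inclusion would leave the translation unit in a different state.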

-- James.

James Dennett

Apr 1, 2003, 2:43:18 PM
jacob navia wrote:
>>I'm sorry.. but, I fail to see the difference between
>>adding a #pragma once and requiring that the source
>>be reasonably recognizable... both are changes,
>>aren't they?
>>
>>That is - what's the difference between changing
>>the source by adding a #pragma once, and changing
>>the source so its #ifdef guards are recognizable?
>>
>>And - many, many sources are already reasonably in
>>a form that can be recognized, unchanged...
>
>
> Nobody FORCES you to use #pragma once
>
> 1) generate an identifier that is not used elsewhere.

Easy, an editor script will take care of this.

> 2) Add at the top the guard, typing the identifier TWICE without any mistake

I don't have to type it at all.

> 3) Make sure none of the other files in your build (present and future) uses
> that identifier

Not a problem. Inclusion of a prefix, a mangled file name,
and the current date/time (to the second) is automated by a
simple editor script, which is invoked whenever I create a
new header file. Less effort than typing "#pragma once" and
less error prone, unless you put the pragma into a similar
editor script.
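
The editor script itself is not shown in this thread, so the following is only a guess at the scheme, written in C rather than as an editor macro: prefix, mangled file name, and a creation time to the second. The function name make_guard_name, the "ACME" prefix in the usage note, and the choice of UTC are all assumptions.

```c
#include <ctype.h>
#include <stdio.h>
#include <time.h>

/* Builds PREFIX_MANGLEDNAME_YYYYMMDDHHMMSS into out.
   Returns 0 on success, -1 if the time is invalid or out is too small. */
static int make_guard_name(const char *prefix, const char *file_name,
                           time_t created, char *out, size_t cap)
{
    char mangled[128], stamp[16];
    size_t i;
    int written;
    struct tm *utc = gmtime(&created);

    if (utc == NULL
        || strftime(stamp, sizeof stamp, "%Y%m%d%H%M%S", utc) == 0)
        return -1;

    /* Upper-case letters and digits are kept; everything else becomes '_'. */
    for (i = 0; file_name[i] != '\0' && i < sizeof mangled - 1; i++) {
        unsigned char c = (unsigned char)file_name[i];
        mangled[i] = isalnum(c) ? (char)toupper(c) : '_';
    }
    mangled[i] = '\0';

    written = snprintf(out, cap, "%s_%s_%s", prefix, mangled, stamp);
    return (written < 0 || (size_t)written >= cap) ? -1 : 0;
}
```

For example, make_guard_name("ACME", "net/socket.h", time(NULL), buf, sizeof buf) yields a name of the form ACME_NET_SOCKET_H_ followed by a 14-digit timestamp.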

> The compiler will not see the file. I/O will still be done though, but this
> is fairly fast with a reasonable OS file cache.

I/O is not done by compilers that recognize the existing
idiom.

-- James.

Dave Hansen

Apr 1, 2003, 3:12:22 PM
On Tue, 1 Apr 2003 18:47:04 +0000 (UTC),
christ...@cbau.freeserve.co.uk (Christian Bau) wrote:

>In article <3E88654A...@zetnet.co.uk>,
> david....@zetnet.co.uk (David Hopwood) wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>>
>> Christian Bau wrote:
>> [snip]
>> > #if /* or #ifdef or #ifndef */
>> > <arbitrary lines>
>> > #endif
>> [snip]
>>
>> This will incorrectly match
>>
>> #if /* or #ifdef or #ifndef */
>> <arbitrary lines>
>> #elif /* anything */
>> <not whitespace only>
>> #endif
>
>I would expect anyone who implements this to avoid trivial mistakes.

You would be disappointed. See the thread in c.l.c.m earlier this
month:

http://groups.google.com/groups?selm=clcm-20030305-0001%40plethora.net

Regards,

-=Dave
--
Change is inevitable, progress is not.

---

Douglas A. Gwyn

Apr 1, 2003, 6:57:49 PM
Eric Backus wrote:
> More accurate on Posix systems, but less accurate everywhere else. But
> really the method of determining sameness is an implementation detail--a C
> implementation must know the details of the environment in which it runs,
> and it can do whatever it takes to decide whether two files are the "same"
> file.

It would probably suffice for the header name to be spelled the same
in the corresponding #include statements.

Zack Weinberg

Apr 2, 2003, 11:46:46 AM
DAG...@null.net ("Douglas A. Gwyn") writes:

> Eric Backus wrote:
>> More accurate on Posix systems, but less accurate everywhere else.
>> But really the method of determining sameness is an implementation
>> detail--a C implementation must know the details of the environment
>> in which it runs, and it can do whatever it takes to decide whether
>> two files are the "same" file.

I do not want to rehash the argument that was done to death on the gcc
list, to which Fergus Henderson provided pointers. Suffice it to say that
this is not as easy as it may sound, for two reasons:

1) Environments are common which are POSIX conformant in every detail
except for failing to provide unique inode/device pairs for some
subset of the files visible in the file system, and there is no
practical way to determine whether or not any given file falls into
that subset.

2) Users do horrible things with symbolic links (or shortcuts, or even
hard links) and then expect multiple-inclusion guards to work.
#ifndef wrappers degrade gracefully in the face of such tricks,
#pragma once doesn't.

It is true that #pragma once is easier to work with than #ifndef
wrappers ... in simple situations. In more complex situations, I feel
that #ifndef wrappers are a better choice, because it is possible to
fix any problem one encounters with an #ifndef wrapper by editing the
offending file (e.g. to change the name of a guard which conflicts
with some other file's choice). It is not possible to fix problems
with #pragma once so easily. #pragma once can make the compiler
reject one's program because of factors which are invisible to the
user, and cannot be rectified by the user even if s/he can dig them up
... short of replacing all uses of #pragma once with #ifndef wrappers,
which I suspect can be done with a ten-line perl script.

So I cannot support standardizing #pragma once.

> It would probably suffice for the header name to be spelled the same
> in the corresponding #include statements.

This is not enough. Assume the usual implementation-defined behavior
referenced by 6.10.2p2,3: that #include "foo.h" attempts to access an
ordinary file named "foo.h" in the directory containing the file
containing the #include statement, before reprocessing as
#include <foo.h>, and that #include <foo.h> attempts to access an
ordinary file named "foo.h" in a directory whose contents are provided
by the implementation.

Now, suppose the existence of these files in that directory (this
hypothetical implementation aims to conform to POSIX.1 as well as C99):

time.h
sys/time.h
sys/stat.h

all of which contain a "#pragma once" style include guard, and further
suppose that sys/stat.h contains the statement

#include "time.h" /* expected to read sys/time.h */

An implementation based solely on the spelling of the #include
statements will then misinterpret a translation unit which begins

#include "time.h"
#include "sys/stat.h"

This is *not* a contrived example; GNU libc does more or less this
(not necessarily with those three header files).
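
The failure can be sketched in a few lines of C. This only models the first step of the quoted-include lookup described above (search the including file's directory); the fallback to the <...> search is left out, paths are written relative to the directory holding the three headers, and the function name resolve_quoted is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Resolve #include "name" relative to the directory of including_file.
   Returns 0 on success, -1 if out is too small. */
static int resolve_quoted(const char *including_file, const char *name,
                          char *out, size_t cap)
{
    const char *slash = strrchr(including_file, '/');
    int dir_len = slash ? (int)(slash - including_file) + 1 : 0;
    int written = snprintf(out, cap, "%.*s%s", dir_len, including_file, name);

    return (written < 0 || (size_t)written >= cap) ? -1 : 0;
}
```

The spelling "time.h" resolves to time.h when it appears in the translation unit but to sys/time.h when it appears in sys/stat.h, so a cache keyed on the spelling alone would wrongly treat the second as already included.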

zw

jacob navia

Apr 2, 2003, 1:54:52 PM

----- Original Message -----
From: "Dave Hansen" <id...@hotmail.com>
Newsgroups: comp.std.c++,comp.std.c
Sent: Tuesday, April 01, 2003 10:12 PM
Subject: Re: thoughts on an old proposal: #once


> On Tue, 1 Apr 2003 18:47:04 +0000 (UTC),
> christ...@cbau.freeserve.co.uk (Christian Bau) wrote:
>

[snip]


> >I would expect anyone who implements this to avoid trivial mistakes.
>
> You would be disappointed. See the thread in c.l.c.m earlier this
> month:
>
> http://groups.google.com/groups?selm=clcm-20030305-0001%40plethora.net
>

In that thread a customer complains about a bug in the optimization of the
#if
#endif
guards in the compiler. He wrote
#if
#else
#endif

The #else part was missed by the optimizing preprocessor.


This is the consequence of following the path of always adding more
complexity to the compiler instead of using simple solutions that work.

#pragma once

provokes no such bugs, damn it!

jacob navia

Apr 2, 2003, 3:27:28 PM
Hi James:

I said


> 3) Make sure none of the other files in your build (present and future)
> uses that identifier

You answered, (and really, it is worth the re-reading :-)

> Not a problem. Inclusion of a prefix, a mangled file name,
> and the current date/time (to the second) is automated by a
> simple editor script, which is invoked whenever I create a
> new header file. Less effort than typing "#pragma once" and
> less error prone, unless you put the pragma into a similar
> editor script.

You say that and I wonder...

Simple editor script?

OK.

Where is the doc of emacs? Oh mess...

Yes, I will generate a name based on the file name and the minute. Mmm, what
was the function to get the current time called? Damn, where *is* that manual!

Ah there yes. OK get_time, then I print it in the editor buffer, first line.
Then, go to last line, add #endif

Well, it is working. Ah, but I have to do this only if there isn't any guard
already. Yes, I have to recognize the guards with my script. Well, if the
first line is #ifndef something, or #if OK, I have just to implement the
rules proposed by Christian Bau in that thread. After implementing those
rules I will just need an evaluator of those ifdefs written in elisp. It is
a nice project actually.

But too much. I will just do some script that I will call once at each
header that I want included once. I will keep it around in the private elisp
directory and not forget the name of the script. This works only in emacs
but then, other editors are just a waste of time.

Yes, I have got a solution for my private problem:

1) Require a scripting editor and
2) learn it enough to do scripts and
3) Assume your compiler optimizes this construct

This will work well with emacs and gcc.

For people who are lazy like me, lcc-win32 offers

#pragma once

Just two words, James. What a simplification, isn't it?

jacob

Christian Bau

Apr 2, 2003, 5:21:16 PM
In article <b6e84b$1r4$1...@news-reader11.wanadoo.fr>,

ja...@jacob.remcomp.fr ("jacob navia") wrote:

> ----- Original Message -----
> From: "Dave Hansen" <id...@hotmail.com>
> Newsgroups: comp.std.c++,comp.std.c
> Sent: Tuesday, April 01, 2003 10:12 PM
> Subject: Re: thoughts on an old proposal: #once
>
>
> > On Tue, 1 Apr 2003 18:47:04 +0000 (UTC),
> > christ...@cbau.freeserve.co.uk (Christian Bau) wrote:
> >
> [snip]
> > >I would expect anyone who implements this to avoid trivial mistakes.
> >
> > You would be disappointed. See the thread in c.l.c.m earlier this
> > month:
> >
> > http://groups.google.com/groups?selm=clcm-20030305-0001%40plethora.net
> >
>
> In that thread a customer complains about a bug in the optimization of the
> #if
> #endif
> guards in the compiler. He wrote
> #if
> #else
> #endif
>
> The #else part was missed by the optimizing preprocessor.
>
>
> This is the consequence of following the path of always adding more
> complexity to the compiler instead of using simple solutions that work.

No, that is the consequence of letting people write compilers who have
no business doing so (and I am not referring to you, I am referring to
the unknown authors of the unknown compiler described in your link).

Collecting the relevant information about #if/#else/#endif structure is
trivial when you are processing the header file for the first time. It
is just unavoidable to spot the #else in the middle and process it
correctly, otherwise you can't compile any header file correctly.

Eric Backus

Apr 2, 2003, 6:20:45 PM
"Zack Weinberg" <za...@codesourcery.com> wrote in message
news:8765pxh...@egil.codesourcery.com...

> > Eric Backus wrote:
> >> More accurate on Posix systems, but less accurate everywhere else.
> >> But really the method of determining sameness is an implementation
> >> detail--a C implementation must know the details of the environment
> >> in which it runs, and it can do whatever it takes to decide whether
> >> two files are the "same" file.
>
> I do not want to rehash the argument that was done to death on the gcc
> list, to which Fergus Henderson provided pointers. Suffice it to say that
> this is not as easy as it may sound, for two reasons:
>
> 1) Environments are common which are POSIX conformant in every detail
> except for failing to provide unique inode/device pairs for some
> subset of the files visible in the file system, and there is no
> practical way to determine whether or not any given file falls into
> that subset.
>
> 2) Users do horrible things with symbolic links (or shortcuts, or even
> hard links) and then expect multiple-inclusion guards to work.
> #ifndef wrappers degrade gracefully in the face of such tricks,
> #pragma once doesn't.

I think that compiler writers could probably come up with reasonable ways to
deal with all that, but on the whole I'd agree that it's probably not worth
the effort. Today's #define include guards appear to work fairly well, and
the compiler can optimize the I/O away behind the scenes in some cases, so
there doesn't appear to be compelling need for #once.

--
Eric Backus
R&D Design Engineer
Agilent Technologies, Inc.
425-335-2495 Tel

Allan W

Apr 2, 2003, 8:29:07 PM
> riv...@dignus.com (Thomas David Rivers) wrote
> > Even in the case of inclusion guards, many compilers will
> > actually reread the source (incurring the I/O) while
> > "skipping" the body (because of the inclusion guards.)
> > But - there is a way around this without using #pragma once,
> > and hence, I would argue against it. Our C and C++ compilers
> > accomplish this.

I actually worked in a shop where they recommended using include
guards around the #include directives.

#ifndef INCLUDE_FOO_H
#include "foo.h"
#endif

Foo.h would have the usual include guard as well, "just in case" the
main programs left it out. As you can imagine, it was very important
to use the exact same name in the main program that the include
file used!

I found this system to be unwieldy and error-prone. It did improve
compile times, though.

conle...@osu.edu (Mike Conley) wrote


> The real problem, as I see it, is the requirement that every header you
> want to guard has to have a globally unique key, and that generating this
> key is bothersome and error prone. It's another thing that a compiler
> can and should do for you (many already can), provided you ask it to.
> The problem is that it isn't standardized and thus cannot be relied upon
> in portable code.

But some files are meant to be included more than once!

Furthermore, constructing a "Globally unique key" by hand isn't nearly
so hard as you're making it out to be. Just combine the vendor name
(or your own company name) with the filename. If you've done a decent
job of not naming your files the same as each other, that's sufficient.

INCLUDE_MYCOMPANY_FOO_H
INCLUDE_THIRDPARTYCOMPANY_FOO_H

> A sticking point seems to be that it may be impossible to determine
> whether or not two names refer to the same file in some cases. I doubt
> that any system actually exhibits the characteristics necessary for this
> problem to arise.

But then you go on and give examples:

> Two different names to an fopen call could easily refer to the same file
> (symlinks, relative vs absolute pathnames, etc). It would be more
> accurate to call stat and use the device/inode numbers.

Not all systems have device/inode numbers!

> >I think it's safe for a compiler to assume
> > that what name is used to access the file is the arbiter
> > of file uniqueness.
>
> This is probably true much of the time, but I think we can do better,
> depending on the interface the OS provides for various filesystems.

Which is obviously system-dependent. But that's really okay.
I think it's sufficient to have the standard say that the compiler
will attempt to determine if two files are the same, and add a short
warning that if the same file has two or more names, the results
are implementation-defined. Then let the rest of the details be a
QOI issue. Systems that have device/inode numbers could use them;
other systems could base it off of filename alone, etc.
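
On a POSIX system, the device/inode comparison could look roughly like the sketch below. The function name same_file is hypothetical, and the sketch assumes that stat() reports stable device and inode numbers, which may not hold on every network file system.

```c
#include <sys/stat.h>

/* Returns 1 if both names refer to the same file, 0 if they refer to
   different files, and -1 if either name cannot be examined. */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Because stat() follows symbolic links, two different spellings of the same file (relative vs. absolute, or via a symlink) compare equal here.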

(Note that the other way around is assumed not to be a problem: if
two different files have the same exact name, all bets are off.
AFAIK this happens only with devices that are not random-access.
For instance, reading a deck of cards twice might provide different
data, if the second deck of cards had different values...)

Allan W

unread,
Apr 2, 2003, 11:51:24 PM4/2/03
to
za...@codesourcery.com (Zack Weinberg) wrote

> DAG...@null.net ("Douglas A. Gwyn") writes:
> > It would probably suffice for the header name to be spelled the same
> > in the corresponding #include statements.
>
> This is not enough.
[snip]

> Now, suppose the existence of these files in that directory (this
> hypothetical implementation aims to conform to POSIX.1 as well as C99):
>
> time.h
> sys/time.h
> sys/stat.h
>
> all of which contain a "#pragma once" style include guard, and further

I interpreted Douglas's phrase "the header name" to include the "file
path" or "directory name" or "folder name" on systems that have such
a concept. Thus, "time.h" would not match "sys/time.h".

Furthermore, although Douglas didn't say so, I would expect that
non-explicit, discovered paths would also be included, so that
#include <time.h>
and
#include "time.h"
would refer to the same file IF (and only if) they were found in the
same path on disk.

James Dennett

unread,
Apr 2, 2003, 11:54:11 PM4/2/03
to
jacob navia wrote:
> This is the consequence of following the path of always adding more
> complexity to the compiler instead of using simple solutions that work.
>
> #pragma once
>
> provokes no such bugs damm it!

Perhaps you would care to write a formal description of
the behavior you require of #pragma once, such that
(1) the behavior is sufficiently consistent across all
conforming implementations, and
(2) you do not assume that it is possible to determine
from two #include directives whether or not they
refer to "the same file".

It might be possible to do this by ensuring that implementations
where the determination of whether two #include directives refer
to the same file cannot be made must read the file contents to
determine whether to ignore a #include directive.

All of the logic required for this would complicate compilers,
and probably provoke bugs.

-- James.

Mike Conley

unread,
Apr 2, 2003, 11:56:20 PM4/2/03
to
eric_...@alum.mit.edu ("Eric Backus") wrote in
news:10493215...@cswreg.cos.agilent.com:

[quoted problems snipped]

> I think that compiler writers could probably come up with reasonable
> ways to deal with all that,


They could even go so far as to compute and cache checksums for included
headers as others have suggested. I'd guess that could be avoided in most
cases though.
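
A rough sketch of the checksum idea follows. The choice of the 64-bit FNV-1a hash is mine, purely for illustration; nobody in this thread proposed a particular algorithm, and the names content_hash and same_contents are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* 64-bit FNV-1a hash of a buffer. */
uint64_t content_hash(const char *data, size_t len)
{
    uint64_t h = 0xcbf29ce484222325ULL;      /* FNV offset basis */

    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)data[i];
        h *= 0x100000001b3ULL;               /* FNV prime */
    }
    return h;
}

/* Treat two headers as "the same" when length and hash agree. */
int same_contents(const char *a, size_t alen, const char *b, size_t blen)
{
    return alen == blen && content_hash(a, alen) == content_hash(b, blen);
}
```

Equal hashes do not strictly prove equal contents, so a careful implementation would fall back to a byte-by-byte comparison when the hashes match.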

It might even be reasonable to let the preprocessor modify the header file
(appending the checksum and the time it was computed after the #once), so
that this work would only be done once (unless the header is modified).
This would provide the semantics of #ifndef guards, but without the manual
labor. Of course, you need to be able to determine the time the file was
last modified for this to work...

And, if you don't want the compile-time overhead this may imply, you don't
use the feature.


> but on the whole I'd agree that it's
> probably not worth the effort. Today's #define include guards appear
> to work fairly well, and the compiler can optimize the I/O away behind
> the scenes in some cases, so there doesn't appear to be compelling
> need for #once.


There's rarely a NEED for anything. It's more a question of WANT. And
wants can be quite compelling :)


--
Mike Conley

Allan W

unread,
Apr 2, 2003, 11:56:20 PM4/2/03
to
david....@zetnet.co.uk (David Hopwood) wrote

> Christian Bau wrote:
> [snip]
> > #if /* or #ifdef or #ifndef */
> > <arbitrary lines>
> > #endif
> [snip]
>
> This will incorrectly match
>
> #if /* or #ifdef or #ifndef */
> <arbitrary lines>
> #elif /* anything */
> <not whitespace only>
> #endif

And you wouldn't want it to match that, either.

We're talking about getting compilers to skip re-reading header
files, if they already "know" that the entire contents will be
skipped. In your example, the entire contents will NOT be
skipped -- either "arbitrary lines" or "not whitespace only"
will be used every time that header file is #included. So
having the compiler match it to that pattern would be the exact
opposite of what you want.
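
To illustrate, here is a simplified sketch of such a pattern check. It is not taken from any real compiler: the name find_guard is hypothetical, only the plain #ifndef form is recognised, and comments and line continuations are ignored.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer past any spaces or tabs at p. */
static const char *skip_blanks(const char *p)
{
    while (*p == ' ' || *p == '\t')
        p++;
    return p;
}

/* Does the word of length n starting at w equal name? */
static int word_is(const char *w, size_t n, const char *name)
{
    return strlen(name) == n && strncmp(w, name, n) == 0;
}

/* Returns 1 and stores the guard name in macro if the whole text is
   wrapped in one #ifndef ... #endif with no #else/#elif at the outer
   level and nothing but blank lines outside it; returns 0 otherwise. */
int find_guard(const char *text, char *macro, size_t macro_size)
{
    int depth = 0;      /* current #if nesting level */
    int closed = 0;     /* set once the outermost #endif has been seen */
    const char *line = text;

    while (*line != '\0') {
        const char *nl = strchr(line, '\n');
        const char *next = nl ? nl + 1 : line + strlen(line);
        const char *p = skip_blanks(line);

        if (*p == '\n' || *p == '\0') {      /* blank line */
            line = next;
            continue;
        }
        if (closed)
            return 0;                        /* something follows the #endif */
        if (*p != '#') {                     /* ordinary source line */
            if (depth == 0)
                return 0;                    /* text before the guard */
            line = next;
            continue;
        }

        p = skip_blanks(p + 1);
        const char *w = p;
        while (isalpha((unsigned char)*p))
            p++;
        size_t n = (size_t)(p - w);

        if (word_is(w, n, "if") || word_is(w, n, "ifdef") ||
            word_is(w, n, "ifndef")) {
            if (depth == 0) {
                if (!word_is(w, n, "ifndef"))
                    return 0;
                p = skip_blanks(p);
                const char *m = p;
                while (isalnum((unsigned char)*p) || *p == '_')
                    p++;
                size_t len = (size_t)(p - m);
                if (len == 0 || len >= macro_size)
                    return 0;
                memcpy(macro, m, len);
                macro[len] = '\0';
            }
            depth++;
        } else if (word_is(w, n, "endif")) {
            if (depth == 0)
                return 0;
            if (--depth == 0)
                closed = 1;
        } else if (word_is(w, n, "else") || word_is(w, n, "elif")) {
            if (depth <= 1)
                return 0;                    /* the #elif case quoted above */
        } else if (depth == 0) {
            return 0;                        /* directive outside the guard */
        }
        line = next;
    }
    return closed;
}
```

A file containing an outer-level #elif is rejected, so it is simply read again on every #include, which is the behaviour wanted here.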

Fergus Henderson

unread,
Apr 3, 2003, 11:25:02 AM4/3/03
to
ja...@jacob.remcomp.fr ("jacob navia") writes:

>This is the consequence of following the path of always adding more
>complexity to the compiler instead of using simple solutions that work.
>
>#pragma once
>
>provokes no such bugs damm it!

No, #pragma once just provokes different bugs. The bugs that #pragma
once provokes relate to the use of relative paths, symlinks, hard links,
network file systems, etc.

What does lcc-win32 do if the same file is included twice, via different names
that end up referring to the same file (e.g. via Samba, where two file
names on the Linux file server may refer to the same file)?
Will lcc-win32 include it twice, thus ignoring the #pragma once?
If so, isn't that a bug?

--
Fergus Henderson <f...@cs.mu.oz.au> | "I have always known that the pursuit
The University of Melbourne | of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh> | -- the last words of T. S. Garp.

---

James Dennett

unread,
Apr 3, 2003, 1:07:48 PM4/3/03
to
[Posted to comp.std.c++ only, as I am talking about C++,
and #pragma has different semantics in C99.]

jacob navia wrote:
> Hi James:
>
> I said
>
>>3) Make sure none of the other files in your build (present and future)
>>uses that identifier
>
>
> You answered, (and really, it is worth the re-reading :-)
>
> Not a problem. Inclusion of a prefix, a mangled file name,
> and the current date/time (to the second) is automated by a
> simple editor script, which is invoked whenever I create a
> new header file. Less effort than typing "#pragma once" and
> less error prone, unless you put the pragma into a similar
> editor script.
>
> You say that and I wonder...
>
> Simple editor script?

Yes. Compared to the C++ we write, an editor script to
do what I need there is trivial. Programmers use editors,
generally; they are our tools, and part of our job is to
know how to operate them well enough to do that job.

> OK.
>
> Where is the doc of emacs? Oh mess...

One's choice of text editor shouldn't cause the C++ Standard
to be changed to cope with difficulties there, surely.

> Yes, I will generate a name based on the file name and the minute. Mmm the
> function to get the current time was called? Damm where *is* that manual!
>
> Ah there yes. OK get_time, then I print it in the editor buffer, first line.
> Then, go to last line, add #endif
>
> Well, it is working. Ah, but I have to do this only if there isn't any guard
> already. Yes, I have to recognize the guards with my script. Well, if the
> first line is #ifndef something, or #if OK, I have just to implement the
> rules proposed by Christian Bau in that thread. After implementing those
> rules I will just need an evaluator of those ifdefs written in elisp. It is
> a nice project actually.

But unnecessary. Just add the guard, automatically, when
the file is first created. As it's automated, this won't
get missed. And as the file is newly created, you'll
probably want to add other boilerplate at the time too,
to suit a house style.

> But too much. I will just do some script that I will call once at each
> header that I want included once. I will keep it around in the private elisp
> directory and not forget the name of the script. This works only in emacs
> but then, other editors are just a waste of time.
>
> Yes, I have got a solution for my private problem:
>
> 1) Require a scripting editor and
> 2) learn it enough to do scripts and
> 3) Assume your compiler optimizes this construct
>
> This will work well with emacs and gcc.

Or any other reasonably powerful editor.

> For people that are lazy like me, lccwin32 proposes
>
> #pragma once
>
> Just two words James. What a simplification isn't it?

There's the rub. It isn't. If it were so simple that the
C++ Standard could just define #pragma once to do what you
want (avoid I/O if the same file is #included more than
once), my position might be different. (Even then I'd rather
avoid standardised pragmas, given that pragma currently is
explicitly a hook for implementation-dependent behavior, but
let's ignore the issue of notation for now). In reality,
determining whether two #include directives target the same
file is far from trivial, and it's not clear to me that it
is always possible.

If the implementation of "#pragma once" can't determine
that the files are the same, it will process the file twice,
and likely generate errors.

Compare to conventional include guards. If your compiler doesn't
handle the optimisation to avoid I/O, code using conventional
include guards degrades gracefully: the code still compiles,
just a little slower.

So, you add much complexity for a broken solution. If you
care to respond with a portable notion of how #pragma once
can work, including cases where the same header is accessed
via different paths (including cases where these access files
stored across a network), some of us may be interested to
hear about it.

I would like to see better support for named units of code
in C++, but I don't think that requiring #pragma to do
impossible things is progress.

-- James

Mike Conley

unread,
Apr 3, 2003, 1:07:57 PM4/3/03
to
all...@my-dejanews.com (Allan W) wrote in
news:7f2735a5.03040...@posting.google.com:

> conle...@osu.edu (Mike Conley) wrote
>> The real problem, as I see it, is the requirement that every header
>> you want to guard has to have a globally unique key, and that
>> generating this key is bothersome and error prone. It's another
>> thing that a compiler can and should do for you (many already can),
>> provided you ask it to. The problem is that it isn't standardized
>> and thus cannot be relied upon in portable code.
>
> But some files are meant to be included more than once!

I know. I think we're misunderstanding one another.

> Furthermore, constructing a "Globally unique key" by hand isn't nearly
> so hard as you're making it out to be.

It isn't hard, no. It is bothersome, at least to me. It is a source of
error (usually easy to diagnose and fix).

An #ifndef guard macro is more than a globally unique key (uniquely
identifying the header it is intended to guard, assuming it works as
intended). If anyone decides to use the same string of characters in
his program that you have used for a guard macro, and he includes your
header, he is likely to have problems.
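
A small, made-up illustration: suppose a third-party logging.h uses LOGGING as its guard, while the application uses the same name as its own feature switch. Both the header and the names are hypothetical, and the header body is shown inline to keep the sketch self-contained.

```c
/* ---- contents of a third-party logging.h ---- */
#ifndef LOGGING
#define LOGGING
/* declarations would go here */
#endif

/* ---- application code, which uses LOGGING as its own switch ---- */
int logging_enabled(void)
{
#ifdef LOGGING
    return 1;   /* taken, although the application never defined LOGGING */
#else
    return 0;
#endif
}
```

Merely including the header flips the application's switch.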

I'm not saying that this is a terrible problem. I'm sure it rarely
affects anyone.

But I do think #ifndef guards are an ugly, ugly hack. You introduce a
macro into your program for every guarded header (very likely into code
you didn't write), which is surely not a good thing, no matter how you
define your guards. They're bad for all the reasons macros are bad. And
it's something that can be automated.


> I think it's sufficient to have the standard say that the compiler
> will attempt to determine if two files are the same, and add a short
> warning that if the same file has two or more names, the results
> are implementation-defined. Then let the rest of the details be a
> QOI issue.

Works for me. I had thought it would be possible to make some guarantees
for common cases (eg, symlinks on local filesystems). It looks like I
misunderstood the scope of the problem: I thought it was limited to
network shares. Things are never as simple as they seem....

But, as you say, it's a QOI issue. Such guarantees probably aren't
desirable, from the standards point of view.

--
Mike Conley

Dave Hansen

unread,
Apr 4, 2003, 2:06:41 AM4/4/03
to
On Thu, 3 Apr 2003 18:07:57 +0000 (UTC), conle...@osu.edu (Mike
Conley) wrote:

[...]


>
>> Furthermore, constructing a "Globally unique key" by hand isn't nearly
>> so hard as you're making it out to be.
>
>It isn't hard, no. It is bothersome, at least to me. It is a source of
>error (usually easy to diagnose and fix).
>
>An #ifndef guard macro is more than a globally unique key (uniquely
>identifying the header it is intended to guard, assuming it works as
>intended). If anyone decides to use the same string of characters in
>his program that you have used for a guard macro, and he includes your
>header, he is likely to have problems.
>
>I'm not saying that this is a terrible problem. I'm sure it rarely
>affects anyone.
>
>But I do think #ifndef guards are an ugly, ugly hack. You introduce a
>macro into your program for every guarded header (very likely into code
>you didn't write), which is surely not a good thing, no matter how you
>define your guards. They're bad for all the reasons macros are bad. And
>it's something that can be automated.

It's the last statement of yours ("can be automated") that seems to be
at issue. Determining file identity isn't easily specified in the
absence of implementation details. In reality, we'd probably just be
trading one set of problems with another.

They say compromise produces a solution that no one likes, but here
goes anyway...

How about something like

#pragma STDC ONCE(<key>)

where <key> is a pp-token similar to a guard #define. This would work
the same way #ifndef guards work, but 1) the symbol needs only be
defined in the file once (in the pragma), and 2) it lives in a unique
namespace, so it won't collide with any #define, global variable,
typename, whatever.

This still leaves the problem of guard symbols colliding with each
other, but that may be the smallest problem we currently have. It's
(a little) prettier and easier than a guard.
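
A sketch of how a preprocessor might keep such keys, under the assumption that they live in a table entirely separate from the macro table. The function name once_pragma and the fixed-size table are illustrative choices only.

```c
#include <string.h>

#define MAX_KEYS    64
#define MAX_KEY_LEN 64

static char once_keys[MAX_KEYS][MAX_KEY_LEN];
static int once_key_count = 0;

/* Called when "#pragma STDC ONCE(key)" is met.
   Returns 1 if the rest of the file should be skipped (key seen
   before), 0 if this is the first sighting (the key is recorded),
   and -1 if the key cannot be stored. */
int once_pragma(const char *key)
{
    for (int i = 0; i < once_key_count; i++)
        if (strcmp(once_keys[i], key) == 0)
            return 1;
    if (once_key_count == MAX_KEYS || strlen(key) >= MAX_KEY_LEN)
        return -1;
    strcpy(once_keys[once_key_count++], key);
    return 0;
}
```

Since the keys are never entered into the macro table, a #define of the same spelling elsewhere in the program has no effect on them.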

The other problem is that someone's got to like this solution well
enough to shepherd it through the committee. Does it solve enough
problems to make it worthwhile?

Regards,

-=Dave
--
Change is inevitable, progress is not.

---

Allan W

unread,
Apr 4, 2003, 6:26:17 AM4/4/03
to
ja...@jacob.remcomp.fr ("jacob navia") wrote
> Yes, rule count is now 3...

But all three are nearly identical, and easy to understand.


> And why should I go to these extremes?
>
> My whole point is:
>
> WHY COMPLICATE THINGS?
>
> Why can't the user clearly say
>
> #pragma once
>
> and be done with it?

I'd like to see the name change slightly if we standardize it.
#once is an obvious name (and see the subject line of this thread).

> And you did NOT answer any new rule for
>
> #ifndef __file_h__
> #define __file_h__ 1
> #else
> #undef __file_h__
> #define __file_h__ 2
> #endif
>
> #if (__file_h__ == 1)
> ///
> #endif

You're trying to demonstrate that there are strange cases that won't
be covered by any simple rule, and I agree. But this type of pattern
is also very rare -- I imagine that it is rare enough that we don't
need to optimize it. It doesn't match the pattern established in the
first 3 rules, so just go ahead and include it normally. The compile
might take a few extra seconds, but the #if's will still ensure
that everything works correctly.

> Second guessing the user is bad practice: it complicates the compiler and
> complicates the program.

If that was always true, then the compiler couldn't optimize
char name[80+1];
to use the constant 81 -- it would always have to perform the addition.

Allan W

unread,
Apr 4, 2003, 6:27:33 AM4/4/03
to
conle...@osu.edu (Mike Conley) wrote
> I'd say that, if the compiler cannot determine,
> given two pathnames (NOT the contents of the files), whether or not they
> refer to the same file, the behavior should be implementation defined.

Agreed.

> > Even if this idea comes to pass, it would take several
> > (ten?) years before we could take it for granted.
>
> I think you're being too pessimistic. Many compilers implement it now as a
> pragma. It would likely take them less than 10 days. Compilers that don't
> implement it (I'm not aware of any, but I'm sure they exist), only need to
> add a fairly simple extension to what is already a fairly simple macro
> processor. We're not exactly talking about implementing export here :)

Perhaps your university upgrades all of its compilers the very day that
they are released. Mine does not, and neither do the businesses that I
work at.

Consider this code:
int i = 2;
{ for(int i=0; i<10; ++i) {} }
std::cout << i;

Can you assume that this will write "2"? I don't yet have that luxury.

What I'm saying is, if it takes 10 days to implement, 10 more days to
run through all the test cases, and 10 days to bring it to market, then
a majority of customers will have it on a majority of platforms in a
year or two. But if I need to write code that will work for the VAST
majority of customers on the VAST majority of platforms (say, 95%),
it's more like 10 years.

Mike Conley

unread,
Apr 4, 2003, 11:19:05 PM4/4/03
to
id...@hotmail.com (Dave Hansen) wrote in
news:3e8c97d...@News.CIS.DFN.DE:

> How about something like
>
> #pragma STDC ONCE(<key>)
>
> where <key> is a pp-token similar to a guard #define. This would work
> the same way #ifndef guards work, but 1) the symbol needs only be
> defined in the file once (in the pragma), and 2) it lives in a unique
> namespace, so it won't collide with any #define, global variable,
> typename, whatever.

It's a reasonable compromise, and it solves a number of problems.
Personally, I'd rather see

#once (key)

because (1) C++ doesn't have any notion of STDC pragmas, and this should
apply to both C and C++ (and Objective C, for that matter), and (2) the
behavior doesn't appear to depend upon the implementation at all.


--
Mike Conley

Douglas A. Gwyn

unread,
Apr 4, 2003, 11:19:44 PM4/4/03
to
jacob navia wrote:
> #pragma once
> provokes no such bugs damm it!

But it has problems of its own, discussed in a thread elsewhere.

During a recent WG14 meeting when it was stated that compilers
could easily keep track of idempotency guards in an internal
table to avoid even trying to open an already-processed header,
I wondered if it was really so. I spent a few minutes working
out an algorithm to do that correctly. It was of course not as
simple as the one apparently used by the buggy compiler, but it
also wasn't very complicated.

jacob navia

unread,
Apr 8, 2003, 4:32:11 AM4/8/03
to
> No, #pragma once just provokes different bugs. The bugs that #pragma
> once provokes relate to the use of relative paths, symlinks, hard links,
> network file systems, etc.
>
No!
If your setup is so messy that you are never sure if two different absolute
file paths point to different data you can ALWAYS use
#ifndef
#endif

#pragma once is NOT obligatory you see? Lcc-win32 will continue to support
the old ways.

> What does lcc-win32 do if the same file is included twice, via different
> names that end up referring to the same file (e.g. via Samba, where two file
> names on the Linux file server may refer to the same file)?

Will read the file twice.

If you want to be 100% sure that in all possible network configurations your
file will be included only once just write a little more.

For all others, that have never seen (and do not want to see) such a messy
network config pleeeezee...

> Will lcc-win32 include it twice, thus ignoring the #pragma once?

Yes

> If so, isn't that a bug?

No. It is a network misconfiguration.

jacob navia

unread,
Apr 8, 2003, 4:34:41 AM4/8/03
to
jden...@acm.org (James Dennett) wrote in message news:<QWOia.5909$ey1.4...@newsread1.prod.itd.earthlink.net>...

> jacob navia wrote:
> > This is the consequence of following the path of always adding more
> > complexity to the compiler instead of using simple solutions that work.
> >
> > #pragma once
> >
> > provokes no such bugs damm it!
>
> Perhaps you would care to write a formal description of
> the behavior you require of #pragma once, such that
> (1) the behavior is sufficiently consistent across all
> conforming implementations,

Lcc-win32 implementation does this:
1) When an included file is found in the pre-processor, its full
absolute path is determined
2) If this file contains a #pragma once, the name of the file is added
to the initially empty list of "only once" files.
3) When opening a file, if the absolute path is identical (case
insensitive under windows) to one of the files in the list, this is
equivalent to finding EOF immediately.

Simple isn't it?

But highly effective.

> (2) you do not assume that it is possible to determine
> from two #include directives whether or not they
> refer to "the same file".

The preprocessor looks for files in the preprocessor path list. If it
finds a file matching the file in the text of the include directive,
its full path is determined and the algorithm above applies. If you
say
#include "foo.h"
and in some directory in the path of includes there is a foo.h, it
matches, and then it is looked upon in the once list.

In OSes where this is possible, the absolute path name should be
devoid of symbolic links. Lcc-win32 doesn't do this because symbolic
links are quite recent to Win32 and the use of symbolic links is not
so widespread. A conforming implementation should not be required to
do this.

>
> It might be possible to do this by ensuring that implementations
> where the determination of whether two #include directives refer
> to the same file cannot be made must read the file contents to
> determine whether to ignore an #include directive.
>

In most systems and most implementations, different absolute paths
lead to different data. If the user sets up a messy networked
environment this is no longer a compiler issue.

If you want to support messy network configs with your code, use
#ifndef once17888
#define once17888
///
#endif

I repeat that this continues to work in lcc-win32.

> All of the logic required for this would complicate compilers,
> and probably provoke bugs.

The logic required is:

1) Get the absolute path of each file opened by the preprocessor.

Not very difficult in most systems...

2) Parse #pragma once

Again, only one word!

3) Maintain a linked list of "only once" files.

Trivial. At each "pragma once" we add the file at the root of the
list.

4) Check it when opening a file.

Go through the list. make a stricmp (under windows) or strcmp (under
Unix) with each element.
5) If found report EOF

There are no optimizations needed, no complicated logic, nothing.
Getting the absolute path of a file (following symbolic links) is just
an API call away in most systems.
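
The steps above can be sketched as follows. This is an illustration only, not the lcc-win32 source: the names once_add and once_should_skip are hypothetical, and the caller is assumed to have already obtained the absolute path (step 1) and parsed the pragma (step 2).

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct once_node {
    char *path;
    struct once_node *next;
};

static struct once_node *once_list = NULL;

/* Compare two paths; ignore case when fold_case is nonzero. */
static int path_equal(const char *a, const char *b, int fold_case)
{
    for (; *a != '\0' && *b != '\0'; a++, b++) {
        int ca = (unsigned char)*a;
        int cb = (unsigned char)*b;
        if (fold_case) {
            ca = tolower(ca);
            cb = tolower(cb);
        }
        if (ca != cb)
            return 0;
    }
    return *a == *b;    /* equal only if both strings ended together */
}

/* Step 3: record a file that contained "#pragma once" by adding it
   at the root of the list. Returns 0 on success, -1 if out of memory. */
int once_add(const char *abs_path)
{
    struct once_node *n = malloc(sizeof *n);

    if (n == NULL)
        return -1;
    n->path = malloc(strlen(abs_path) + 1);
    if (n->path == NULL) {
        free(n);
        return -1;
    }
    strcpy(n->path, abs_path);
    n->next = once_list;
    once_list = n;
    return 0;
}

/* Steps 4 and 5: nonzero means "act as if EOF was found immediately". */
int once_should_skip(const char *abs_path, int fold_case)
{
    for (const struct once_node *n = once_list; n != NULL; n = n->next)
        if (path_equal(n->path, abs_path, fold_case))
            return 1;
    return 0;
}
```

A second absolute path that reaches the same data, for example through a network share, is not in the list, so the file is read again.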

High quality compilers will discover that actually that network share
is the same part of the shared partition in another machine, whatever.
This is not required.

Basically I believe that messy network configs are a mistake. When two
different absolute paths lead to the same data, the file system
doesn't have a one to one relationship between file names and data.
This leads to many bugs and makes many other software fail, not only
compilers. A file caching application may be confused, writes could be
cancelled/done because of false assumptions, etc etc.

If you use
#pragma once
your code will work in most normal configurations, and it will surely
work in a single workstation setup.

But if you work in a networked mess..., beware. The only 100% sure way
is
#ifndef
... etc.

This is a matter of philosophy too. Why impose to most configurations
heavy stuff that is required for a very few?

In most sane setups the simple technique above works very well. And
most users of the compiler are in a workstation setup where all
include files are in the same machine anyway.

Thomas David Rivers

unread,
Apr 8, 2003, 12:19:17 PM4/8/03
to
jacob navia wrote:
>
> jden...@acm.org (James Dennett) wrote in message news:<QWOia.5909$ey1.4...@newsread1.prod.itd.earthlink.net>...
> > jacob navia wrote:
> > > This is the consequence of following the path of always adding more
> > > complexity to the compiler instead of using simple solutions that work.
> > >
> > > #pragma once
> > >
> > > provokes no such bugs damm it!
> >
> > Perhaps you would care to write a formal description of
> > the behavior you require of #pragma once, such that
> > (1) the behavior is sufficiently consistent across all
> > conforming implementations,
>
> Lcc-win32 implementation does this:
> 1) When an included file is found in the pre-processor, its full
> absolute path is determined

Just a quick note on this - step #1 isn't determinable (if that's
a word) on many operating systems.

For example - on the operating system we use, the file can
be a DD that "points" to any other file (kinda like expanding
a shell environment variable - if you want to think of it
that way.) Furthermore - such a name can be defined in
such a way that it is a concatenation of several files...

So - there isn't a reasonable way to determine the entire
"full absolute path." In fact, the idea doesn't really
exist...

Thus, I have a suggested modification below...

> 2) If this file contains a #pragma once, the name of the file is added
> to the initially empty list of "only once" files.
> 3) When opening a file, if the absolute path is identical (case
> insensitive under windows) to one of the files in the list, this is
> equivalent to finding EOF immediately.

I would suggest not using the "absolute path" - but simplifying
this to "the name the compiler used in the operating system's
file opening interface."

Presuming, of course, that the same name results in the
same file.

It's slightly less effective because of, say, symlinks
to the same file - but if the name the compiler uses
to open the file is different - shouldn't that be considered
a "different" file?

That is - is it the compiler's point-of-view that's interesting,
or the underlying operating system's?

- Dave Rivers -

--
riv...@dignus.com Work: (919) 676-0847
Get your mainframe programming tools at http://www.dignus.com

cody

unread,
Apr 8, 2003, 2:10:53 PM4/8/03
to
Only a stupid question: could there ever be a situation where somebody
would like to include one and the same header multiple times? I can't think
of such a situation.

--
cody

Freeware Tools, Games and Humour
http://www.deutronium.de.vu
[noncommercial and no fucking ads]

Allan W

unread,
Apr 8, 2003, 4:32:15 PM4/8/03
to
ja...@jacob.remcomp.fr (jacob navia) wrote

> > Will lcc-win32 include it twice, thus ignoring the #pragma once?
>
> Yes
>
> > If so, isn't that a bug?
>
> No. It is a network misconfiguration.

It depends on the stated purpose of #once (or #pragma once).

I view it as an optimization. A failed optimization is only a bug if it
causes the wrong results.

#define-style include guards have been around since the earliest days
of C programming. They're easy to write, easy to understand, and they
work very well (given a good naming standard). I teach this to
first-year C++ students.

#define-style include guards do have one shortcoming: in order for the
compiler to skip an already-seen include guard, it has to re-open the
source file, read the appropriate #if statement, scan for the matching
#endif, and finally close the file. While this does not result in any
sort of error, it can slow down the compile process.

#pragma once is an attempt to optimize the performance by skipping the
file read when it is unnecessary. If this optimization fails, the program
is still correct (but it compiles more slowly).

My point, then, is that if you combine #once (or #pragma once) with
traditional include guards, you get the best of all worlds.

Douglas A. Gwyn

unread,
Apr 8, 2003, 4:32:23 PM4/8/03
to
jacob navia wrote:
> 1) When an included file is found in the pre-processor, its full
> absolute path is determined
> ...
> Simple isn't it?

You ignored the issue that was raised concerning the
nonuniqueness of the association between full path name
and file object.

Nobody argued that you couldn't add some sort of hack
to your compiler. What was being argued was whether
the facility could be specified well enough to make
standard across all platforms.

Mike Conley

unread,
Apr 8, 2003, 5:15:30 PM4/8/03
to
deutr...@web.de ("cody") wrote in news:qcAka.11$Zt6....@news.ecrc.de:

> Only a stupid question: could there ever be a situation where
> somebody would like to include one and the same header multiple times?
> I can't think of such a situation.

It is, I think, somewhat uncommon, but it does happen. Probably the best
example is a header in which nothing but preprocessor macros is defined,
with a corresponding header to undef those macros. One then #includes the
definitions wherever they're needed, then #includes the #undefs when he's
finished with them. Boost does a lot of this.
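
A minimal sketch of that pattern, with the contents of the two headers written inline. The file names shorthand_on.h and shorthand_off.h and the macro SQ are hypothetical.

```c
/* ---- #include "shorthand_on.h" (first time) ---- */
#define SQ(x) ((x) * (x))

int area(int side) { return SQ(side); }

/* ---- #include "shorthand_off.h" ---- */
#undef SQ

/* Between the two regions the name SQ is not a macro. */
#ifdef SQ
#error "SQ should not be defined here"
#endif

/* ---- #include "shorthand_on.h" (second time) ---- */
#define SQ(x) ((x) * (x))

int volume(int side) { return SQ(side) * side; }

/* ---- #include "shorthand_off.h" ---- */
#undef SQ
```

If shorthand_on.h carried an include guard or #once, its second inclusion would define nothing and volume() would fail to compile.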


--
Mike Conley

Christian Bau

unread,
Apr 8, 2003, 5:26:36 PM4/8/03
to
In article <qcAka.11$Zt6....@news.ecrc.de>, deutr...@web.de ("cody")
wrote:

> Only a stupid question: could there ever be a situation where somebody
> would like to include one and the same header multiple times? I can't think
> of such a situation.

Could you imagine why I might write a header file containing a single
#pragma statement, and include it in the same source file dozens of
times?

Allan W

unread,
Apr 8, 2003, 10:53:04 PM4/8/03
to
deutr...@web.de ("cody") wrote
> Only a stupid question: could there ever be a situation where somebody
> would like to include one and the same header multiple times?

Yes.

> I can't think of such a situation.

You're just not being stupid enough. I, on the other hand, am quite
expert at being stupid. (I started when I was very young.) Let me
give two examples.

(1) <assert.h> -- the contents depend on the definition of NDEBUG

#include <assert.h>
void one() { /* ... */ }
void two() { /* ... */ }
void three() { /* ... */ }
void four() { /* ... */ }
void five() { /* ... */ }

// Something is acting funny in function six.
// We have to find it before we ship!
// Until then, EVERY build has asserts enabled...
#ifdef NDEBUG
#undef NDEBUG
#define REDEFINE_NDEBUG
#include <assert.h>
#endif
void six() { /* ... */ }
// Put NDEBUG back the way it was
#ifdef REDEFINE_NDEBUG
#undef REDEFINE_NDEBUG
#define NDEBUG
#include <assert.h>
#endif

void seven() { /* ... */ }
void eight() { /* ... */ }


(2) This doesn't happen with standard headers, but I've seen it in
more than one shop with user-defined headers.

/// sort3.inc

// Get lowest value into sort_a
if ((sort_b < sort_a) || (sort_c < sort_a)) {
    if (sort_b < sort_c)
        swap(sort_b, sort_a);
    else
        swap(sort_c, sort_a);
}

// Get highest value into sort_c
if (sort_c < sort_b)
    swap(sort_c, sort_b);

// These three elements are now sorted

/// Just for demonstration purposes -- you could
/// actually have quite a bit of fun in the .inc file.

Typical usage:
void show_median(double a, double b, double c) {
    // Sort them
#define sort_a a
#define sort_b b
#define sort_c c
#include "sort3.inc"
#undef sort_a
#undef sort_b
#undef sort_c

    // Show the median
    std::cout << "Median is " << b;
}

Surely you can see how two different functions in the same source
translation unit, might both want to use the magical code in
sort3.inc.

Hyman Rosen

unread,
Apr 8, 2003, 10:53:45 PM4/8/03
to
Allan W wrote:
> #pragma once is an attempt to optimize the performance by skipping the
> file read when it is unnecessary. If this optimization fails, the program
> is still correct (but it compiles more slowly).
>
> My point, then, is that if you combine #once (or #pragma once) with
> traditional include guards, you get the best of all worlds.

But some compilers already implement the same semantics by remembering
that the include guards encompass the entire file. Why bother forcing
everyone to implement a special pragma when they can just do the same
thing?

Bo Persson

unread,
Apr 9, 2003, 1:06:09 AM4/9/03
to

"Allan W" <all...@my-dejanews.com> skrev i meddelandet
news:7f2735a5.03040...@posting.google.com...

> ja...@jacob.remcomp.fr (jacob navia) wrote
> > > Will lcc-win32 include it twice, thus ignoring the #pragma once?
> >
> > Yes
> >
> > > If so, isn't that a bug?
> >
> > No. It is a network misconfiguration.
>
> It depends on the stated purpose of #once (or #pragma once).
>
> I view it as an optimization. A failed optimization is only a bug if it
> causes the wrong results.
>
> #define-style include guards have been around since the earliest days
> of C programming. They're easy to write, easy to understand, and they
> work very well (given a good naming standard). I teach this to
> first-year C++ students.
>
> #define-style include guards do have one shortcoming: in order for the
> compiler to skip an already-seen include guard, it has to re-open the
> source file, read the appropriate #if statement, scan for the matching
> #endif, and finally close the file. While this does not result in any
> sort of error, it can slow down the compile process.

No, it does not.

The compiler can remember files it has already seen, together with their
include guards. When it encounters another include of the same file, it can
check if the guard condition is still active, and skip the file entirely.
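
A minimal sketch of the bookkeeping described above, assuming a preprocessor that records, for every file whose whole body is wrapped in an #ifndef guard, the name of that guard macro. The names GuardCache, remember and should_skip are hypothetical and not taken from any real compiler.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical bookkeeping for the include-guard optimization.
class GuardCache {
public:
    // Record that `file` is entirely wrapped in `#ifndef guard ... #endif`.
    void remember(const std::string& file, const std::string& guard) {
        guard_of_[file] = guard;
    }

    // Track macro definitions as the preprocessor sees them.
    void define(const std::string& macro) { defined_.insert(macro); }
    void undefine(const std::string& macro) { defined_.erase(macro); }

    // True when a repeated #include of `file` can be skipped without
    // opening it: the file is known and its guard macro is still defined.
    bool should_skip(const std::string& file) const {
        auto it = guard_of_.find(file);
        return it != guard_of_.end() && defined_.count(it->second) != 0;
    }

private:
    std::map<std::string, std::string> guard_of_;  // file name -> guard macro
    std::set<std::string> defined_;                // currently defined macros
};
```

Note that the cache is keyed by file name, so this sketch does nothing about the "same file under two names" question; in that case the file is simply opened again and the guard does its usual job.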

> #pragma once is an attempt to optimize the performance by skipping the
> file read when it is unnecessary. If this optimization fails, the program
> is still correct (but it compiles more slowly).

But, unless the compiler already knows which files contain #pragma once
(unguarded pragmas!), it still has to open and close the file.

> My point, then, is that if you combine #once (or #pragma once) with
> traditional include guards, you get the best of all worlds.
>

Sounds like you get neither...


Bo Persson
bo...@telia.com

Dave Hansen

Apr 9, 2003, 1:06:22 AM
On Tue, 8 Apr 2003 18:10:53 +0000 (UTC), deutr...@web.de ("cody")
wrote:

>only a stupid question: could there ever be a situation where somebody
>would like to include one and the same header multiple times? I can't think
>of such a situation.

Yes. Consider assert.h.

Regards,

-=Dave
--
Change is inevitable, progress is not.

---

Mike Conley

Apr 9, 2003, 1:07:10 AM
jden...@acm.org (James Dennett) wrote in
news:QOOia.5896$ey1.4...@newsread1.prod.itd.earthlink.net:

> If you care to respond with a portable notion of how #pragma once can
> work, including cases where the same header is accessed via different
> paths (including cases where these access files stored across a
> network), some of us may be interested to hear about it.

Well, you didn't ask me, but I've got an idea:

Some people have suggested using MD5s to determine whether or not two
files are "the same". We could take that a step further (and achieve
the same semantics as #ifndef guards, when they guard the whole file) if
we modify the header file. It would work as follows:

The preprocessor (1) sees #once, (2) computes a checksum of the file, and
(3) modifies the line containing the #once declaration so that it contains
an encoding of the checksum and the current time (presumably a somewhat
readable chunk of text, and certainly without newlines).

Now, every time the preprocessor sees #once, it checks for the checksum
and modification time. If the file hasn't been modified since it was
last checksummed, and if the preprocessor has already seen the checksum,
it closes the file and moves on. (The usual optimizations would still
apply: there's no need to open a guarded header that the preprocessor
knows it's already #included).

Otherwise, it does whatever is necessary to #include it (taking note of
the checksum). Naturally, if you don't have write permissions for a
given header, the compiler should (one would hope) warn you about it (so
you can pester your sysadmin to generate checksums for said header).
The preprocessor could fail gracefully (computing the checksums
repeatedly if necessary) in such situations.

The implementation would depend only upon the ability of the compiler to
determine the current time and the last modification time of a file
reliably.
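
A rough sketch of the "have I already seen this checksum?" part of the scheme, under some loud assumptions: FNV-1a stands in for a stronger digest such as MD5, the header text is passed in as a string, and the step that writes the checksum and time back into the #once line is left out entirely. The names checksum, OnceTable and first_time are made up for illustration.

```cpp
#include <cstdint>
#include <set>
#include <string>

// FNV-1a, used here only as a short stand-in for a stronger digest
// such as MD5.
std::uint64_t checksum(const std::string& text) {
    std::uint64_t hash = 0xcbf29ce484222325ULL;  // FNV-1a 64-bit offset basis
    for (unsigned char c : text) {
        hash ^= c;
        hash *= 0x100000001b3ULL;                // FNV-1a 64-bit prime
    }
    return hash;
}

// Hypothetical record of the headers already included in the current
// translation unit, keyed by content checksum rather than by path.
class OnceTable {
public:
    // Returns true if a header with this content should be processed,
    // false if identical content has been included before.
    bool first_time(const std::string& header_text) {
        return seen_.insert(checksum(header_text)).second;
    }

private:
    std::set<std::uint64_t> seen_;
};
```

Because the key is the content and not the path, two copies of the same header reached through different directories are treated as one file, which is the property being asked for.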


An alternative would be to write some stamp to the header (after the
#once, as before), keeping a list of open headers. When preprocessing is
completed, remove the stamps.


What we would have, essentially, is automatic generation of an #ifndef
guard (generated the first time the file is seen), but without the macro
definitions. No need to try to specify what constitutes a path name or
what it means for two files to be the same.


-- Mike Conley

Ross Smith

Apr 10, 2003, 1:47:08 AM
Allan W wrote:

> deutr...@web.de ("cody") wrote
>> only a stupid question: could there ever be a situation where
>> somebody would like to include one and the same header multiple
>> times?
>

> [...]


>
> (2) This doesn't happen with standard headers, but I've seen it in
> more than one shop with user-defined headers.
>
> /// sort3.inc
>

> [...]


>
> void show_median(double a, double b, double c) {
> // Sort them
> #define sort_a a
> #define sort_b b
> #define sort_c c
> #include "sort3.inc"
> #undef sort_a
> #undef sort_b
> #undef sort_c
> // Show the median
> std::cout << "Median is " << b;
> }
>
> Surely you can see how two different functions in the same source
> translation unit, might both want to use the magical code in
> sort3.inc.

Glibc (the standard C library on Linux) does this sort of thing with the
maths functions. Something like this (grossly simplified from the
fairly complicated actual code):

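// <bits/mathcalls.h>
// (## only pastes inside a macro, so a helper is needed;
//  the PASTE names are illustrative, not glibc's)
#define PASTE_(name, suffix) name##suffix
#define PASTE(name, suffix) PASTE_(name, suffix)
extern TYPE PASTE(sin, SUFFIX)(TYPE x);
extern TYPE PASTE(cos, SUFFIX)(TYPE x);
extern TYPE PASTE(tan, SUFFIX)(TYPE x);
// etc...

// <math.h>
#define TYPE float
#define SUFFIX f
#include <bits/mathcalls.h>
#undef TYPE
#undef SUFFIX
#define TYPE double
#define SUFFIX
#include <bits/mathcalls.h>
#undef TYPE
#undef SUFFIX
#define TYPE long double
#define SUFFIX l
#include <bits/mathcalls.h>
#undef TYPE
#undef SUFFIX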

--
Ross Smith ......... r-s...@ihug.co.nz ......... Auckland, New Zealand
Welcome to Babylon, the traitors' homes of newer days
Come feel my terror, or watch the anger rise in me
What was this war for, if it is you who wins at last?
What is your word for, if it is us who breaks the laws?
-- Diary of Dreams

Charles Sanders

Apr 10, 2003, 3:12:04 PM

cody wrote:
>
> only a stupid question: could there ever be a situation where
> somebody would like to include one and the same header multiple
> times? I can't think of such a situation.

One use I have seen is to declare a set of overloaded functions
for several types. In the example below, all macro names, file
names, function names, etc. have been changed, because I cannot
remember what they were anyway; nor can I remember where or when
I saw this, except that it was long, long ago.

---- file1.hpp --------

TYPE f1( TYPE x );
TYPE f2( TYPE x, TYPE y );
TYPE g1( const TYPE x );
// .... and so on for many other functions

---- file2.hpp ---------

#if defined(TYPE)
#undef TYPE
#endif

#define TYPE short int
#include "file1.hpp"
#undef TYPE

#define TYPE int
#include "file1.hpp"
#undef TYPE

// ... and so on for other types
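
For readers who want to try this in a single file, here is a small sketch of the same idea. It swaps the repeated #include for a macro that is expanded once per type, and it defines the functions (with made-up bodies) instead of only declaring them, so that they can be called. FILE1_BODY and the function bodies are assumptions for illustration only.

```cpp
// Stand-in for the contents of file1.hpp. In one file the repeated
// #include is replaced by a macro that is expanded once per type.
#define FILE1_BODY(TYPE)                                  \
    TYPE f1(TYPE x) { return x; }                         \
    TYPE f2(TYPE x, TYPE y) { return x < y ? x : y; }

// Stand-in for file2.hpp: one expansion per type yields one overload
// of each function per type.
FILE1_BODY(short int)
FILE1_BODY(int)
FILE1_BODY(double)

#undef FILE1_BODY
```

A call such as f2(3, 5) then picks the int overload and f2(2.5, 1.5) the double one, just as it would with the two-file version.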

Charles

Mike Conley

Apr 11, 2003, 12:21:49 AM
I left a few things out.

> The implementation would depend only upon the ability of the compiler
> to determine the current time and the last modification time of a file
> reliably.

And, of course, some type of file locking.



> An alternative would be to write some stamp to the header (after the
> #once, as before), keeping a list of open headers. When preprocessing
> is completed, remove the stamps.

Let's just pretend I didn't say that :) The problems would be
overwhelming.

Allan W

Apr 15, 2003, 8:00:49 PM
conle...@osu.edu (Mike Conley) wrote

> > An alternative would be to write some stamp to the header (after the
> > #once, as before), keeping a list of open headers. When preprocessing
> > is completed, remove the stamps.
>
> Let's just pretend I didn't say that :) The problems would be
> overwhelming.

I don't think it was really you that said it anyway. Someone adept
at spoofing, no doubt... surely not you.

I have a completely different idea. For traditional header files
(function prototypes, class and constant definitions, etc.), what's
the real harm in allowing it to be included twice? Currently we can't
define the same class or constant twice, even if those definitions
are absolutely identical. Couldn't these be relaxed?

// SomeClass.h
// Assume no #include guards
#include <iosfwd>
class SomeClass {
int Someint;
float Somefloat;
public:
enum { low=1, medium=5, high=9 };
SomeClass();
SomeClass(int, float);
SomeClass(const SomeClass&);
SomeClass &operator=(const SomeClass&);
~SomeClass();
int getSomeint() { return Someint; }
void setSomeint(int s) { Someint = s; }
float getSomefloat() { return Somefloat; }
void setSomefloat(float f) { Somefloat = f; }
void process();
};
std::ostream& operator<<(std::ostream&, const SomeClass&);
SomeClass operator+(const SomeClass&lhs, const SomeClass&rhs);

// Main.cpp
#include "SomeClass.h"
#include "SomeClass.h" // Why should this be an error?

Would it be an unreasonable burden on compilers, to have them
display a diagnostic only if the two definitions of SomeClass
were not compatible? In the example above, they're not just
token-for-token compatible but actually character-for-character
identical (the result of processing the same code twice). Why
shouldn't the compiler just confirm that they are equivalent,
and then continue processing?

We'd still want something like #once or #pragma once, but after
this change (and even without include guards) it would be merely
an optimization hint that could be safely ignored.

Mike Conley

Apr 17, 2003, 1:46:29 PM
all...@my-dejanews.com (Allan W) wrote in
news:7f2735a5.03041...@posting.google.com:

> conle...@osu.edu (Mike Conley) wrote


>> Let's just pretend I didn't say that :) The problems would be
>> overwhelming.
>
> I don't think it was really you that said it anyway. Someone adept
> at spoofing, no doubt... surely not you.

No doubt :)

> I have a completely different idea. For traditional header files
> (function prototypes, class and constant definitions, etc.), what's
> the real harm in allowing it to be included twice? Currently we can't
> define the same class or constant twice, even if those definitions
> are absolutely identical. Couldn't these be relaxed?

They probably could be for the definitions of everything but functions.
The problem is that it is impossible to determine whether or not two
arbitrary functions compute the same value. Textual comparison won't help
here, as even the same text could have different meanings in different
contexts. You could allow different function definitions to be present in
a program, but I think you'll agree that it would be undesirable.
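
A small illustration of the point about context (not from the original discussion; the names pick, before and after are made up). The two call() bodies are token-for-token identical, yet overload resolution gives them different meanings, because a different set of declarations is visible at each point.

```cpp
int pick(long) { return 1; }

namespace before {
    // Only pick(long) has been declared at this point.
    inline int call() { return pick(0); }
}

int pick(int) { return 2; }

namespace after {
    // Token-for-token the same body, but pick(int) is now visible
    // and is a better match for the argument 0.
    inline int call() { return pick(0); }
}
```

Here before::call() ends up calling pick(long) and after::call() ends up calling pick(int), so a purely textual comparison would wrongly treat the two definitions as equivalent.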


--
Mike Conley

Allan W

Apr 21, 2003, 1:32:01 PM
> all...@my-dejanews.com (Allan W) wrote

> > I have a completely different idea. For traditional header files
> > (function prototypes, class and constant definitions, etc.), what's
> > the real harm in allowing it to be included twice? Currently we can't
> > define the same class or constant twice, even if those definitions
> > are absolutely identical. Couldn't these be relaxed?

conle...@osu.edu (Mike Conley) wrote


> They probably could be for the definitions of everything but functions.
> The problem is that it is impossible to determine whether or not two
> arbitrary functions compute the same value. Textual comparison won't help
> here, as even the same text could have different meanings in different
> contexts. You could allow different function definitions to be present in
> a program, but I think you'll agree that it would be undesirable.

AFAIK, the only functions normally found in header files are inline
functions, and these already have well-defined rules for multiple
definitions in multiple translation units. As for multiple definitions
in the SAME translation unit -- I wouldn't exactly recommend dropping
the ODR, if that's what you mean. Maybe the most innocuous possible
relaxation: if the same function is defined twice, it must be
token-for-token identical, AND the meaning of those tokens must not
have changed -- no diagnostic required. That way, a given
implementation could have many different coping strategies:
* Do some sort of checksum on the tokens (or the generated code!),
and emit a warning if the two definitions have different values
* Just use the first one encountered, and ignore the others without
warning
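
A sketch of the first strategy, with several assumptions stated up front: tokens are compared directly instead of through a checksum, the "lexer" just splits on whitespace, and nothing here checks that the meaning of the tokens is unchanged. The names tokenize, Outcome and DefinitionTable are hypothetical.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Crude stand-in for a real lexer: splits on whitespace only.
std::vector<std::string> tokenize(const std::string& source) {
    std::istringstream in(source);
    std::vector<std::string> tokens;
    for (std::string tok; in >> tok; ) tokens.push_back(tok);
    return tokens;
}

enum class Outcome { FirstDefinition, IdenticalRepeat, Conflict };

// Hypothetical table of definitions seen so far in one translation unit.
class DefinitionTable {
public:
    // Accept a repeated definition only if its token sequence matches
    // the first one; otherwise report a conflict (i.e. emit the warning).
    Outcome define(const std::string& name, const std::string& source) {
        std::vector<std::string> tokens = tokenize(source);
        auto it = definitions_.find(name);
        if (it == definitions_.end()) {
            definitions_[name] = tokens;
            return Outcome::FirstDefinition;
        }
        return it->second == tokens ? Outcome::IdenticalRepeat
                                    : Outcome::Conflict;
    }

private:
    std::map<std::string, std::vector<std::string>> definitions_;
};
```

The second strategy corresponds to returning after the lookup succeeds without comparing anything at all.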

I would guess that a bigger problem would be multiple declarations
of globals. I don't have such "easy" answers for this case.
