Preprocessor isolation


g...@axiomatics.org

Feb 25, 2014, 8:53:09 PM
to mod...@isocpp.org, Bjarne.S...@morganstanley.com, g...@microsoft.com

[ This is a resend. Please, can anyone managing this group approve the
pending subscription requests? ]

Hi,

The various module design papers emphasize macro isolation as a key
benefit of modules. However, how much "isolation" is being considered?
Are pragmas also included? I would have expected so, but I can't
seem to locate any comprehensive discussion of the interactions of
modules with other CPP directives in any of the proposals. Any
pointers? -- Gaby

Richard Smith

Feb 26, 2014, 3:12:51 PM
to mod...@isocpp.org, Bjarne.S...@morganstanley.com, Gabriel Dos Reis
Which pragmas did you have in mind? Since we don't have any standard pragmas, I'd expect this to be up to the implementation. I would expect for the most part a model of "what happens in the module, stays in the module", but there are some pragmas that we want to transfer across. Some examples (MS #pragmas):

// If a pointer-to-member for an incomplete class type is formed, assume the class has single inheritance.
// If this appears in a module, it should probably apply only to that module.
#pragma pointers_to_members(single_inheritance)

// Instruct the linker to link in 'mylib' if it includes this code.
// If this appears in a module, we want it to apply to any program importing that module.
#pragma comment(lib, "mylib")

Gabriel Dos Reis

Feb 26, 2014, 3:31:41 PM
to mod...@isocpp.org, Bjarne.S...@morganstanley.com, g...@microsoft.com
Richard Smith <richar...@google.com> writes:

| On 25 February 2014 17:53, <g...@axiomatics.org> wrote:
|
|
| [ This is a resend.  Please, can anyone managing this group
| approve the
| pending subscription requests? ]
|
| Hi,
|
| The various module design papers emphasize macro isolation as key
| benefits of modules.   However, how much of "isolation" is being
| considered?
| Are pragmas also being included?  I would have expected so but I
| don't
| seem to locate any comprehensive discussion of the interactions of
| modules with other CPP directives in any of the proposals.  Any
| pointers? -- Gaby
|
|
| Which pragmas did you have in mind?

Instead of drawing up a list of pragmas (if we have 13, how do we know it
shouldn't be 14?), I am looking for a general rule/guideline.

| Since we don't have any standard pragmas, I'd expect this to be up to
| the implementation.

While most pragmas have implementation-defined semantics, it would be a
disaster if that is the best we can come up with.

As a strawman, a rule could be that the effects of a CPP directive are muted
outside of a module boundary. Then, it would be up to implementors to
decide whether they want a "compiler directive" to be expressed as a
pragma or as a command-line option. But just saying it is up to implementors to
decide the boundary of CPP directives is likely to lead to more
confusion and chaos than the include source file model we have today.

| I would expect for the most part a model of "what happens in the
| module, stays in the module", but there are some pragmas that we want
| to transfer across.
|
| Some examples (MS #pragmas):
|
| // If a pointer-to-member for an incomplete class type is formed,
| assume the class has single inheritance.
| // If this appears in a module, it should probably apply only to that
| module.
| #pragma pointers_to_members(single_inheritance)

I don't know whether this is the time or place to discuss
Microsoft-specific pragmas, but I would suggest this is an example where
I would rather see a general rule about CPP pragmas first, then discuss
whether we need exceptions at all and what form that takes.

| // Instruct the linker to link in 'mylib' if it includes this code.
| // If this appears in a module, we want it to apply to any program
| importing that module.
| #pragma comment(lib, "mylib")

It isn't clear to me that implementations wouldn't want to take
advantage of modules to provide better or different mechanisms to
express this. So, at this point in time, I don't think this example
should be the center of general design decisions/principles.

-- Gaby

James Dennett

Feb 26, 2014, 3:52:11 PM
to mod...@isocpp.org, Bjarne.S...@morganstanley.com, g...@microsoft.com
On Wed, Feb 26, 2014 at 12:31 PM, Gabriel Dos Reis <g...@axiomatics.org> wrote:
> Richard Smith <richar...@google.com> writes:
>
> | On 25 February 2014 17:53, <g...@axiomatics.org> wrote:
> |
> |
> | [ This is a resend. Please, can anyone managing this group
> | approve the
> | pending subscription requests? ]
> |
> | Hi,
> |
> | The various module design papers emphasize macro isolation as key
> | benefits of modules. However, how much of "isolation" is being
> | considered?
> | Are pragmas also being included? I would have expected so but I
> | don't
> | seem to locate any comprehensive discussion of the interactions of
> | modules with other CPP directives in any of the proposals. Any
> | pointers? -- Gaby
> |
> |
> | Which pragmas did you have in mind?
>
> Instead of drawing list of pragmas (if we have 13, how do we know it
> shouldn't be 14?) I am looking for a general rule/guideline.
>
> | Since we don't have any standard pragmas, I'd expect this to be up to
> | the implementation.
>
> While most pragmas have implementation-defined semantics,

[cpp.pragma] only says that pragmas all have implementation-defined
behavior, no? (I think that's stronger than "most pragmas...".)

> it would be a
> disaster if that is the best we can come up with.

How would this disaster manifest? Will modules make it worse than the
situation we've had for decades?
As a general position, I think it would be best not to say that the
effect of #pragmas is implementation-defined except that XYZ, for just
about any value of XYZ.

The purpose of pragmas is to allow implementations hooks to do things
that are often beyond the scope of things the standard even talks
about, or enable non-conforming features.

We could introduce a new notion of a module-local pragma (#pragma
module XXXX ?), but the default position should be that C++98-style
pragmas are simply implementation-defined.

Beyond that, any more complicated position should be justified with
actual examples. We should not generalize in ways that add complexity
without compelling reasons. Keeping the simple status quo, on the
other hand, doesn't really need to be driven by examples. Or, even
better, implementation experience; we can start out with a simple
"it's all implementation-defined" rule, and then standardize existing
practice if appropriate.

-- James

Gabriel Dos Reis

Feb 26, 2014, 4:26:01 PM
to mod...@isocpp.org, Bjarne.S...@morganstanley.com, g...@microsoft.com
James Dennett <jden...@google.com> writes:

| On Wed, Feb 26, 2014 at 12:31 PM, Gabriel Dos Reis <g...@axiomatics.org> wrote:
| > Richard Smith <richar...@google.com> writes:
| >
| > | On 25 February 2014 17:53, <g...@axiomatics.org> wrote:
| > |
| > |
| > | [ This is a resend. Please, can anyone managing this group
| > | approve the
| > | pending subscription requests? ]
| > |
| > | Hi,
| > |
| > | The various module design papers emphasize macro isolation as key
| > | benefits of modules. However, how much of "isolation" is being
| > | considered?
| > | Are pragmas also being included? I would have expected so but I
| > | don't
| > | seem to locate any comprehensive discussion of the interactions of
| > | modules with other CPP directives in any of the proposals. Any
| > | pointers? -- Gaby
| > |
| > |
| > | Which pragmas did you have in mind?
| >
| > Instead of drawing list of pragmas (if we have 13, how do we know it
| > shouldn't be 14?) I am looking for a general rule/guideline.
| >
| > | Since we don't have any standard pragmas, I'd expect this to be up to
| > | the implementation.
| >
| > While most pragmas have implementation-defined semantics,
|
| [cpp.pragma] only says that pragmas all have implementation-defined
| behavior, no? (I think that's stronger than "most pragmas...".)

Yes, but that does not change the fundamental point.

| > it would be a
| > disaster if that is the best we can come up with.
|
| How would this disaster manifest? Will modules make it worse than the
| situation we've had for decades?

Yes, by introducing a variability that did not exist before.

Today, pragmas are processed as part of translation units.
If you introduce modules, you have to say how they interact with
modules. If all you say is "oh, it is up to implementations" then you
have introduced with no apparent justification a portability problem
that did not exist before. You have to explain why that is necessary.
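
| The purpose of pragmas is to allow implementations hooks to do things
| that are often beyond the scope of things the standard even talks
| about, or enable non-conforming features.

Yes, but that is just one side of the story. Another side of the story
is that pragmas are processed as part of translation units.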

| We could introduce a new notion of a module-local pragma (#pragma
| module XXXX ?), but the default position should be that C++98-style
| pragmas are simply implementation-defined.

In a translation unit, yes. I don't think anybody is proposing to
change that.

| Beyond that, any more complicated position should be justified with
| actual examples. We should not generalize in ways that add complexity
| without compelling reasons. Keeping the simple status quo, on the
| other hand, doesn't really need to be driven by examples. Or, even
| better, implementation experience; we can start out with a simple
| "it's all implementation-defined" rule, and then standardize existing
| practice if appropriate.

You need to define "any more complicated position."

Note that we are talking about implications for implementations and uses.
For example, if you say that some pragmas may "escape" module
boundaries, then you are transferring the existing problems of include
order dependencies (that come from CPP effects) to module import order
dependencies. What this means is that, potentially, the statements

import A;
import B;

do not have the same effect as

import B;
import A;

therefore programmers have to remember in which order they have to
import (most rarely do, leading to over-inclusion).

From the implementation point of view, the compiler loses opportunities
to reuse/share translation contexts. E.g. the above two scenarios
represent potentially different translation contexts.

Can you explain again why that is desirable?

On the other hand, if you say CPP directives do not escape module
boundaries, the order of consecutive imports is immaterial (as should be
in sane module systems), programmers don't have to remember, and
opportunity for sharing translation contexts is increased and plenty.
It is simple, and the benefits are apparent.
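
To make the include-order problem concrete, here is a minimal sketch that simulates two headers inside a single file. The names a.h, b.h, SMALL_BUFFERS, and kBufferSize are hypothetical, invented only for illustration; the point is just that a macro defined by one header changes what the next header declares, so the two inclusion orders yield different programs.

```cpp
// Simulates `#include "a.h"` followed by `#include "b.h"`, and then the
// reverse order. Both "headers" and every name here are hypothetical.

namespace a_then_b {
// ---- contents of a.h ----
#define SMALL_BUFFERS
// ---- contents of b.h ----
#ifdef SMALL_BUFFERS
constexpr int kBufferSize = 64;
#else
constexpr int kBufferSize = 4096;
#endif
}  // namespace a_then_b

#undef SMALL_BUFFERS  // start the second scenario from a clean macro state

namespace b_then_a {
// ---- contents of b.h ----
#ifdef SMALL_BUFFERS
constexpr int kBufferSize = 64;
#else
constexpr int kBufferSize = 4096;
#endif
// ---- contents of a.h ----
#define SMALL_BUFFERS
}  // namespace b_then_a

// The same two "headers" give different declarations depending on order.
static_assert(a_then_b::kBufferSize == 64, "a.h first: b.h sees the macro");
static_assert(b_then_a::kBufferSize == 4096, "b.h first: the macro arrives too late");
```

If CPP state does not cross module boundaries, the corresponding pair of import sequences cannot diverge in this way.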

-- Gaby

James Widman

Feb 26, 2014, 4:58:53 PM
to mod...@isocpp.org

On Feb 26, 2014, at 4:26 PM, Gabriel Dos Reis <g...@axiomatics.org> wrote:

> What this means is that, potentially the statements
>
> import A;
> import B;
>
> do not have the same effect as
>
> import B;
> import A;
>
> therefore programmers have to remember in which order they have to
> import (most rarely do, leading to over-inclusion).


Hm… so, imagine the last line of module B is:

#pragma pack

... and suppose module A starts with a definition of a POD struct. Then would we get a different effect with:

import B;
import A;
struct X { /* members.. */ };

… vs:

import A;
import B;
struct X { /* members.. */ };

My understanding is that in Clang’s implementation, there is a completely separate invocation of the front end for each import statement, so unless you put something in its execution environment to tell it, “pack the first struct”, it will be as if #pragma pack was not there.

And if a vendor implements it that way, then they have to deliberately do some work in order to make the effect of an “uncollected” pragma cross the boundary (in either direction).

(I’m not suggesting we recommend any particular implementation model; I’m just making an observation.)

Of course, anything that affects attributes of a declared entity (like the size of a type, which could be altered by a pack directive), is probably intended to implicitly bubble all the way up. So, a “pack” in B that appears before the definition of a struct that is *also* in B could make that type smaller, and that could be observed by an importing TU. (But I’m guessing no one is worried about that kind of “cross-module” effect.)
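
For comparison, this is roughly what the textual inclusion model does today when a header leaves a pack directive open. It is a minimal sketch, assuming a compiler that supports the common `#pragma pack(push, n)` / `#pragma pack(pop)` extension (GCC, Clang and MSVC do) and a platform where a 32-bit integer is normally 4-byte aligned. The header names and struct names are made up, and both headers are simulated inline in one file.

```cpp
#include <cstdint>

// ---- tail of a hypothetical b.h: the matching pop is missing ----
#pragma pack(push, 1)
// ---- end of b.h ----

// ---- head of a hypothetical a.h, included right after b.h ----
struct FromA {
    char tag;
    std::uint32_t value;
};
// ---- end of a.h ----

#pragma pack(pop)  // restore default packing for the rest of this file

// The same member list, declared with default packing for reference.
struct Reference {
    char tag;
    std::uint32_t value;
};

// The directive left open by b.h silently changed the layout of a.h's struct.
static_assert(sizeof(FromA) == 5, "no padding under pack(1)");
static_assert(sizeof(FromA) < sizeof(Reference), "default packing adds padding");
```

With a separate front-end invocation per import, the open directive at the end of B would simply never reach A.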

So, if in 16.6 we were to insert wording to the effect of:

“If a pragma appears before or after an import-statement, the effect of the pragma on the translation of the imported module is implementation-defined. Likewise, if a pragma appears within an imported module, the effect of the pragma on the translation of the module that imports the imported module is implementation-defined. ”

… would that settle it? It doesn’t really change the status quo, but it does clarify that a vendor must document this stuff in order to be considered “conforming”.

—James

Gabriel Dos Reis

Feb 26, 2014, 5:18:15 PM
to mod...@isocpp.org, Bjarne.S...@morganstanley.com, g...@microsoft.com

[ please keep the CC: line; there are pending subscriptions that have
not been approved by the mailing list moderators yet. ]

James Widman <james....@gmail.com> writes:

| On Feb 26, 2014, at 4:26 PM, Gabriel Dos Reis <g...@axiomatics.org> wrote:
|
| > What this means is that, potentially the statements
| >
| > import A;
| > import B;
| >
| > do not have the same effect as
| >
| > import B;
| > import A;
| >
| > therefore programmers have to remember in which order they have to
| > import (most rarely do, leading to over-inclusion).
|
|
| Hm... so, imagine the last line of module B is:
|
| #pragma pack
|
| .. and suppose the first line of module A starts with a definition of a POD struct. Then would we get a different effect with:
|
| import B;
| import A;
| struct X { /* members.. */ };
|
| ... vs:
|
| import A;
| import B;
| struct X { /* members.. */ };
|
| My understanding is that in Clang's implementation, there is a completely separate invocation of the front end for each import statement, so unless you put something in its execution environment to tell it, "pack the first struct", it will be as if #pragma pack was not there.

Exactly right. That corresponds to the notion that CPP directives are
isolated. It is simple to teach, to use, and it is simple to implement
with noticeable benefits for build-time improvements.

However, it does have some consequences; and I would like that to be a
deliberate decision, considered along with its implications.

| And if a vendor implements it that way, then they have to deliberately do some work in order to make the effect of an "uncollected" pragma cross the boundary (in either direction).

Correct. That is more work, and not only for implementers. If you
consider build-time effects, it is not clear how much of a win we
would get for build-time performance compared to the existing
time-honored source file inclusion model. And it is adding another
complication on top of the existing mess.

| (I'm not suggesting we recommend any particular implementation model; I'm just making an observation.)
|
| Of course, anything that affects attributes of a declared entity (like
| the size of a type, which could be altered by a pack directive), is
| probably intended to implicitly bubble all the way up. So, a "pack" in
| B that appears before the definition of a struct that is *also* in B
| could make that type smaller, and that could be observed by an
| importing TU. (But I'm guessing no one is worried about that kind of
| "cross-module" effect.)

It is not clear to me that we want attribute effects in one module to
leak into another module by the mere fact of its being included before.
If that is the case, are you suggesting that every module be
retranslated every time it is imported because of possible attribute
effects leaking into its import? What are we gaining compared to the
current source file inclusion model?

| So, if in 16.6 we were to insert wording to the effect of:
|
| "If a pragma appears before or after an import-statement, the effect of the pragma on the translation of the imported module is implementation-defined. Likewise, if a pragma appears within an imported module, the effect of the pragma on the translation of the module that imports the imported module is implementation-defined. "
|
| ... would that settle it?

No. :-)

| It doesn't really change the status quo, but it does clarify that a
| vendor must document this stuff in order to be considered
| "conforming".

I don't understand what the "status quo" is: we don't have a spec yet.

How about #line or #file directives? I believe we should consider CPP
directives as whole instead of making lists.

I think we need a design discussion and carefully consider consequences,
before we decide that we have a "status quo."

-- Gaby

James Widman

Feb 26, 2014, 7:03:30 PM
to mod...@isocpp.org, Bjarne.S...@morganstanley.com, g...@microsoft.com

On Feb 26, 2014, at 5:18 PM, Gabriel Dos Reis <g...@axiomatics.org> wrote:

>
> [ please keep the CC: line; there are pending subscriptions that have
> not been approved by the mailing list moderators yet. ]

Right; sorry.
Ok.

<snip>
> | Of course, anything that affects attributes of a declared entity (like
> | the size of a type, which could be altered by a pack directive), is
> | probably intended to implicitly bubble all the way up. So, a "pack" in
> | B that appears before the definition of a struct that is *also* in B
> | could make that type smaller, and that could be observed by an
> | importing TU. (But I'm guessing no one is worried about that kind of
> | "cross-module" effect.)
>
> It is not clear to me that we want attributes effect in one module to
> leak to another module by the mere fact of its being included before.


So… suppose I had:

// file "a.h":
#pragma pack
struct A { /* members */ };

Suppose that, without the pragma, sizeof(A) is 32, and with the pragma, sizeof(A) is 24.

I think a user could reasonably expect that, when one replaces:

#include "a.h"

… with:

import a;

… where module a contains the exact same two lines of code above (plus an export thingy)…

… sizeof(A) is 24 when evaluated in the module that imports a.

(I.e., the user assumes that an instance of sizeof(A) (in the including/importing code) does not change when migrating from “#include” to “import”.)

Agreed?

Would you be ok with a future standard that permits a conforming implementation to behave that way?
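
One concrete struct with those sizes, as a sketch: the member list is an assumption of this example (the members are left unspecified above), and `#pragma pack(push, 4)` stands in for the bare `#pragma pack`. It assumes compiler support for the push/pop form and an 8-byte double. The 32-byte figure for the unpacked layout holds on typical 64-bit ABIs where double is 8-byte aligned, so it is noted in a comment rather than asserted.

```cpp
// Default layout. On a typical 64-bit ABI: c at 0, d at 8, e at 16,
// f at 24, for a total of 32 bytes.
struct Unpacked {
    char c;
    double d;
    char e;
    double f;
};

#pragma pack(push, 4)
// Member alignment capped at 4: c at 0, d at 4, e at 12, f at 16,
// for a total of 24 bytes.
struct A {
    char c;
    double d;
    char e;
    double f;
};
#pragma pack(pop)

static_assert(sizeof(double) == 8, "this sketch assumes an 8-byte double");
static_assert(sizeof(A) == 24, "packed layout");
static_assert(sizeof(A) <= sizeof(Unpacked), "packing never grows the struct");
```

The question is then only whether an importer of the module sees 24, exactly as an includer of the header would.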

> If that is the case, are you suggesting that every module be
> retranslated every time it is imported because of possible attributes
> effects leaking into its import?

No.

Basically, whenever the internally generated command-line switches passed to the front end change in a way that the implementation decides is significant, we should expect new object code and a new binary module file.

And if a vendor decides that a pragma in source code needs to have the effect of changing that sub-invocation, and if those sub-invocations for the same module differ at any two points in the program, then we may want that to be regarded as an ODR violation.

Richard, have you seen any counter-examples to that while implementing modules?

> What are we gaining compared to the
> current source file inclusion model?

Um… “good stuff”.

(speed, macro isolation, much, much shorter TUs, tools enabled by knowledge of the index of names that the compiler needs to make things fast)


> | So, if in 16.6 we were to insert wording to the effect of:
> |
> | "If a pragma appears before or after an import-statement, the effect of the pragma on the translation of the imported module is implementation-defined. Likewise, if a pragma appears within an imported module, the effect of the pragma on the translation of the module that imports the imported module is implementation-defined. "
> |
> | ... would that settle it?
>
> No. :-)

Ok…

<snip>
> How about #line or #file directives?

I don’t know; what about #line?

Each module has its own separate translation unit (separate from the TU of any other module, including any importing module).

So there is a separate preprocessing TU for each module (i.e., different preprocessor outputs—probably to be implemented as one “.i” file for each file indicated by an import-statement).

Where’s the problem?

> I believe we should consider CPP directives as whole instead of making lists.

Ok, I’m curious to see where this goes…

And if we can help to prevent implementors from doing stupid things at the intersection of pragmas and “import”, that would be good...

But if we never get to the point of talking about specific directives (including specific pragmas like “pack”), then I don’t know where you’re going with this.


—James


James Widman

Feb 26, 2014, 10:26:11 PM
to mod...@isocpp.org, Bjarne.S...@morganstanley.com, g...@microsoft.com
Apologies; in between other tasks I completely lost the context that I set up when I replied with:

On Feb 26, 2014, at 7:03 PM, James Widman <james....@gmail.com> wrote:

>
> <snip>
>> | Of course, anything that affects attributes of a declared entity (like
>> | the size of a type, which could be altered by a pack directive), is
>> | probably intended to implicitly bubble all the way up. So, a "pack" in
>> | B that appears before the definition of a struct that is *also* in B
>> | could make that type smaller, and that could be observed by an
>> | importing TU. (But I'm guessing no one is worried about that kind of
>> | "cross-module" effect.)
>>
>> It is not clear to me that we want attributes effect in one module to
>> leak to another module by the mere fact of its being included before.
>
>
> So… suppose I had:
<embarrassing example where I went off topic without noticing>

Let me try that again:

In:

// Translation unit that defines module Q:
import X;
import Y;

… where X does not import any module and Y does not import any module, we want to give the user some guarantees about the effect of a pragma...

1. in Q, on the translation of Q
2. in Q, on the translation of X
3. in Q, on the translation of Y

4. in X, on the translation of Q
5. in X, on the translation of X
6. in X, on the translation of Y

7. in Y, on the translation of Q
8. in Y, on the translation of X
9. in Y, on the translation of Y

I *now* think that when you wrote:

> On the other hand, if you say CPP directives do not escape module
> boundaries, the order of consecutive imports is immaterial (as should be
> in sane module systems), programmers don't have to remember, and
> opportunity for sharing translation contexts is increased and plenty.
> It is simple, and the benefits are apparent.

… you were mainly concerned with 6 and 8, but also to some extent 2,3,4, and 7. Is that correct?

(Incidentally, I was previously fixated on 2 and 4.)

So, starting over… here’s straw man version deux!

For 5 and 9 ( X-X, Y-Y):

“Whatever effect it had in C++14.”

For 6 and 8

“No effect.”

Or at least, if any implementor thinks they need something other than “none at all” for 6 and 8, we want them to speak up as soon as possible. Right?

And if we change the example to:

// Translation unit that defines module Q:
import Y;
import X;

… then that is guaranteed NOT to change the semantics of the program, regardless of any #pragma in X or Y (except that order-of-initialization might differ).

(But something like NDEBUG/assert() might be an exception.)

For 4:

“nothing direct, but if a pragma in X affects an attribute of an entity declared in X (e.g. “pack” on an exported struct can affect the size of that struct), and if the semantics in Q depend on that attribute, then in that limited sense, yes, there is an observable effect.”

Discussion of #pragma comment(lib, "mylib") belongs to this category. I agree it would be better to have a separate core language construct for this.

Discussion of NDEBUG/assert() also belongs to category 4. It might be nice if a module could export a macro like assert(). I vaguely recall that Doug might have run into practical reasons why some headers just need to remain headers, and <assert.h> might have been one of them (although I might be conflating this with the issue of system headers that are not properly guarded).

For 7:

ditto (replace “X” with “Y”).

For 2 & 3:

“Very little”

(Discussion of NDEBUG also belongs here.)

For 1:

Just like 5 and 9: no change (except that we’re now more blind to directives inside X and Y compared to the inclusion model).
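
The nine cases can be written down as a small table, which also makes the import-order claim checkable. This is only a sketch of the straw man as stated above: the enum names and the function `strawman` are hypothetical labels, not proposed wording.

```cpp
// Which translation unit contains the pragma, or is being translated.
enum class Unit { Q, X, Y };

// The four answers the straw man gives (labels are illustrative only).
enum class Effect {
    AsInCpp14,             // cases 1, 5, 9: pragma and translation in the same unit
    VeryLittle,            // cases 2, 3: pragma in the importer Q
    OnlyViaExportedEntity, // cases 4, 7: visible in Q only through exported entities
    None                   // cases 6, 8: between sibling modules X and Y
};

constexpr Effect strawman(Unit pragma_in, Unit translated) {
    return pragma_in == translated ? Effect::AsInCpp14
         : pragma_in == Unit::Q    ? Effect::VeryLittle
         : translated == Unit::Q   ? Effect::OnlyViaExportedEntity
                                   : Effect::None;
}

// Sibling modules never affect each other, in either direction, so
// swapping `import X;` and `import Y;` cannot change what a pragma does.
static_assert(strawman(Unit::X, Unit::Y) == Effect::None, "case 6");
static_assert(strawman(Unit::Y, Unit::X) == Effect::None, "case 8");
static_assert(strawman(Unit::X, Unit::Q) == strawman(Unit::Y, Unit::Q),
              "cases 4 and 7 get the same answer");
```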

—James

Gabriel Dos Reis

Feb 26, 2014, 10:46:09 PM
to mod...@isocpp.org, Bjarne.S...@morganstanley.com, g...@microsoft.com
| So... suppose I had:
|
| // file "a.h":
| #pragma pack
| struct A { /* members */ };
|
| Suppose that, without the pragma, sizeof(A) is 32, and with the
| pragma, sizeof(A) is 24.
|
| I think a user could reasonably expect that, when one replaces:
|
| #include "a.h"
|
| ... with:
|
| import a;
|
| ... where module a contains the exact same two lines of code above
| (plus an export thingy)...
|
| ... sizeof(A) is 24 when evaluated in the module that imports a.
|
| (I.e., the user assumes that an instance of sizeof(A) (in the
| including/importing code) does not change when migrating from
| "#include" to "import".)
|
| Agreed?

No.

"Reasonably expect" depends on what story we tell users. If we tell
them that a module is just a glorified preprocessed file, then yeah, I
can see how that can happen. But the stories (e.g. Daveed's paper,
various presentations at committee meetings) suggest something
completely different: one where the processed form of a module is closer
to a set of elaborations (fully type-checked declarations) than a sequence
of tokens to be reparsed over and over again.

In that story, I find it an unreasonable expectation, and quite
counterproductive, that the "pragma pack" should leak to the next module.

For a consistent story, we need a few things:
(a) what properties to expect from modules,
(b) how a module can be used, etc.
(c) how modules interact with classic compilation models
then we flesh out the rules to meet those expectations.

| Would you be ok with a future standard that permits a conforming
| implementation to behave that way?

At this point, I will answer "no": we are introducing a disruptive
technology and any non-portability of this magnitude (which affects both
users and implementations) ought to have its benefits clearly spelled
out and weighed against other alternatives.

| > If that is the case, are you suggesting that every module be
| > retranslated every time it is imported because of possible attributes
| > effects leaking into its import?
|
| No.

But, what you suggested earlier pretty much requires that, and you are
saying as much below:

| Basically, whenever the internally-generated command switches to the
| front end change in a way that the implementation decides is
| significant, we should expect new object code and a new binary module
| file.
|
| And if a vendor decides that a pragma in source code needs to have the
| effect of changing that sub-invocation, and if those sub-invocations
| for the same module differ at any two points in the program, then we
| may want that to be regarded as an ODR violation.

Please explain *why* we need this uncertainty about the boundaries of
CPP directives. The examples you've given so far suggest additional
work to preserve just the problems we have today with the source file
inclusion model.

|
| Richard, have you seen any counter-examples to that while implementing modules?
|
| > What are we gaining compared to the current source file inclusion model?
|
| Um... "good stuff".

What are they?

| (speed, macro isolation, much, much shorter TUs, tools enabled by
| knowledge of the index of names that the compiler needs to make things
| fast)

How? You stated earlier:

# My understanding is that in Clang's implementation, there is a
# completely separate invocation of the front end for each import
# statement, so unless you put something in its execution environment to
# tell it, "pack the first struct", it will be as if #pragma pack was
# not there.

which pretty much supports the design point that even Clang's implementation
you are talking about does pragma isolation.

| > | So, if in 16.6 we were to insert wording to the effect of:
| > |
| > | "If a pragma appears before or after an import-statement, the
| > | effect of the pragma on the translation of the imported module is
| > | implementation-defined. Likewise, if a pragma appears within an
| > | imported module, the effect of the pragma on the translation of
| > | the module that imports the imported module is
| > | implementation-defined. "
| > |
| > | ... would that settle it?
| >
| > No. :-)
|
| Ok...
|
| <snip>
| > How about #line or #file directives?
|
| I don't know; what about #line?

It is a directive that changes the state of the CPP processor.

| Each module has its own separate translation unit (separate from the
| TU of any other module, including any importing module).

Define "each module has its own separate translation unit".

| So there is a separate preprocessing TU for each module (i.e.,
| different preprocessor outputs--probably to be implemented as one ".i"
| file for each file indicated by an import-statement).
|
| Where's the problem?

What is "separate preprocessing TU for each module"?
The only reasonable interpretation I can see is that effects of CPP
directives are ignored outside of module boundaries.

| > I believe we should consider CPP directives as whole instead of making lists.
|
| Ok, I'm curious to see where this goes...
|
| And if we can help to prevent implementors from doing stupid things at
| the intersection of pragmas and "import", that would be good...

The question isn't so much preventing implementors from doing stupid
things as preventing ourselves from writing a sloppy and possibly stupid
spec. Implementors will implement a spec; I would like to see us give
more thought to these points.

| But if we never get to the point of talking about specific directives
| (including specific pragmas like "pack"), then I don't know where
| you're going with this.

I think "pragma directives" is specific enough for now. If we are
unable to say something coherent at that level and work out the
implications, then it is unlikely that we are going to produce a useful
spec for modules. Surely, we don't want the spec to be just a
documentation of the implementation du jour.

-- Gaby

Gabriel Dos Reis

Feb 26, 2014, 11:19:52 PM
to mod...@isocpp.org, Bjarne.S...@morganstanley.com, g...@microsoft.com
James Widman <james....@gmail.com> writes:

| Apologies; in between other tasks I completely lost the context that I
| set up when I replied with:

I just saw this new message, -after- I replied to the previous.

| On Feb 26, 2014, at 7:03 PM, James Widman <james....@gmail.com> wrote:
|
| >
| > <snip>
| >> | Of course, anything that affects attributes of a declared entity (like
| >> | the size of a type, which could be altered by a pack directive), is
| >> | probably intended to implicitly bubble all the way up. So, a "pack" in
| >> | B that appears before the definition of a struct that is *also* in B
| >> | could make that type smaller, and that could be observed by an
| >> | importing TU. (But I'm guessing no one is worried about that kind of
| >> | "cross-module" effect.)
| >>
| >> It is not clear to me that we want attributes effect in one module to
| >> leak to another module by the mere fact of its being included before.
| >
| >
| > So... suppose I had:
| <embarrassing example where I went off topic without noticing>
|
| Let me try that again:
|
| In:
|
| // Translation unit that defines module Q:
| import X;
| import Y;
|
| ... where X does not import any module and Y does not import any
| module, we want to give the user some guarantees about the effect of a
| pragma...
|
| 1. in Q, on the translation of Q
| 2. in Q, on the translation of X
| 3. in Q, on the translation of Y
|
| 4. in X, on the translation of Q
| 5. in X, on the translation of X
| 6. in X, on the translation of Y
|
| 7. in Y, on the translation of Q
| 8. in Y, on the translation of X
| 9. in Y, on the translation of Y
|
| I *now* think that when you wrote:
|
| > On the other hand, if you say CPP directives do not escape module
| > boundaries, the order of consecutive imports is immaterial (as should be
| > in sane module systems), programmers don't have to remember, and
| > opportunity for sharing translation contexts is increased and plenty.
| > It is simple, and the benefits are apparent.
|
| ... you were mainly concerned with 6 and 8, but also to some extent
| 2,3,4, and 7. Is that correct?

Yes, my emphasis was on 6, 7, and 8. But you are correct that I'm also
concerned with 2, 3, 4. Furthermore, my concern is independent of
whether X or Y imports a module or not -- otherwise, we are asking users
to understand the effects of the transitive closure of imports (while today
we tell them that any standard header not inherited from C is free to include
any other standard header.) That would be a regression in programming model.

| (Incidentally, I was previously fixated on 2 and 4.)
|
| So, starting over... here's straw man version deux!
|
| For 5 and 9 ( X-X, Y-Y):
| --
| "Whatever effect it had in C++14."

OK -- but I didn't think this was under dispute :-)

|
| For 6 and 8
| --
| "No effect."

Agreed.

But, I don't understand why we need to make a list of cases [you have 9,
I trust you have done your math, but I don't trust long lists of cases;
I'm pretty sure we've forgotten some :-)] when we can just make a
simple rule.

| Or at least, if any implementor thinks they need something other than
| "none at all" for 6 and 8, we want them to speak up as soon as
| possible. Right?

Yes, but most importantly this is a design question.

| And if we change the example to:
|
| // Translation unit that defines module Q:
| import Y;
| import X;
|
| ... then that is guaranteed NOT to change the semantics of the
| program, regardless of any #pragma in X or Y (except that
| order-of-initialization might differ).

Agreed.

| (But something like NDEBUG/assert() might be an exception.)

Well, we don't know that. We've taken for granted that we have macro
isolation and NDEBUG *is* a macro.

| For 4:
| --
| "nothing direct, but if a pragma in X affects an attribute of an
| entity declared in X (e.g. "pack" on an exported struct can affect the
| size of that struct), and if the semantics in Q depend on that
| attribute, then in that limited sense, yes, there is an observable
| effect."

Too complicated. Why do we need this? Why can't we just say "nothing"?
People are going to program with this, why do they need to know the long
list of contexts where something has an effect on a module inclusion
order or not?

| Discussion of #pragma comment(lib, "my lib") belongs to this
| category. I agree it would be better to have a separate core language
| construct for this.

I think that is drifting into a swamp. If this example is being used as
design criteria then I believe we may be driving the design with the
wrong tools. We don't know whether the implementor which introduced
this would like to retain this in a module world or not.

| Discussion of NDEBUG/assert() also belongs to category 4. It might be
| nice if a module could export a macro like assert(). I vaguely
| recall that Doug might have run into practical reasons why some
| headers just need to remain headers, and <assert.h> might have been
| one of them (although I might be conflating this with the issue of
| system headers that are not properly guarded).

If we don't have macro isolation, then we won't have the property that
order of consecutive module imports is immaterial -- which allows greater
sharing of previously and independently processed modules, good for
build performance.

It is a design question whether we want to deviate from that. If we
do, we should conduct careful analysis of the implications.

It is a separate question whether *all* standard headers must be turned
into modules, and if so whether the transformation should be one-to-one or
one-to-many.

| For 7:
| --
| ditto (replace "X" with "Y").
|
| For 2 & 3:
| --
| "Very little"

"none".

| (Discussion of NDEBUG also belongs here.)

NDEBUG is a macro. We have to decide whether we want macro isolation
for *modules* -- I thought we already did :-)

| For 1:
| --
| Just like 5 and 9: no change (except that we're now more blind to
| directives inside X and Y compared to the inclusion model).

Can a CPP directive in an importing module affect the imported module?
If yes, how much more precisely?

-- Gaby

Richard Smith

Feb 26, 2014, 11:55:12 PM
to mod...@isocpp.org, Bjarne.S...@morganstanley.com, Gabriel Dos Reis
I think it should be a fundamental principle that changing a module X does not affect the semantics of translating an unrelated module Y in any way. (Naturally changing X in a way that makes it conflict with Y would have consequences, but I don't count that here.)
 
| And if we change the example to:
|
| // Translation unit that defines module Q:
| import Y;
| import X;
|
| ... then that is guaranteed NOT to change the semantics of the
| program, regardless of any #pragma in X or Y (except that
| order-of-initialization might differ).

> Agreed.

I think this should be another fundamental principle -- order of import does not affect program validity (but sure, it might affect order of initialization).

| (But something like NDEBUG/assert() might be an exception.)

> Well, we don't know that. We've taken for granted that we have macro
> isolation and NDEBUG *is* a macro.

In our implementation, we view <assert.h> as being fundamentally non-modular, and so far treating it as a textually-included blob seems fine.
 
| For 4:
| --
|  "nothing direct, but if a pragma in X affects an attribute of an
| entity declared in X (e.g. "pack" on an exported struct can affect the
| size of that struct), and if the semantics in Q depend on that
| attribute, then in that limited sense, yes, there is an observable
| effect."

> Too complicated. Why do we need this? Why can't we just say "nothing"?
> People are going to program with this, why do they need to know the long
> list of contexts where something has an effect on a module inclusion
> order or not?

I think this is just a long way of saying "no effect". But I don't think that's entirely right. I think there might be some cases where an implementation wants a pragma in an imported module to affect the importing code (this would be the case for a pragma that fundamentally affects the whole program and not just a single translation unit or module). And pragmatically, we want modules to be able to export macros as part of their interface.
 
| Discussion of #pragma comment(lib, "my lib") belongs to this
| category. I agree it would be better to have a separate core language
| construct for this.

> I think that is drifting into a swamp. If this example is being used as
> design criteria then I believe we may be driving the design with the
> wrong tools. We don't know whether the implementor which introduced
> this would like to retain this in a module world or not.

That's the point, though. We don't know if an implementation will want to have a pragma in a module affect importing modules, so we shouldn't go out of our way to preclude it. Implementation-defined seems to continue to be the right choice for pragmas.
 
| Discussion of NDEBUG/assert() also belongs to category 4. It might be
| nice if a module could export a macro like assert().  I vaguely
| recall that Doug might have run into practical reasons why some
| headers just need to remain headers, and <assert.h> might have been
| one of them (although I might be conflating this with the issue of
| system headers that are not properly guarded).

It's valid to repeatedly #include <assert.h> with different settings for NDEBUG, and it *updates* its assert() macro based on that setting. "#include <assert.h>" is just an instruction to muck with the preprocessor state; I certainly don't want to support a module import which behaves that way.
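
To make the re-inclusion behaviour concrete, here is a minimal sketch. The two function names are invented for illustration; the only thing it relies on is the standard guarantee that each inclusion of <assert.h> redefines assert() according to whether NDEBUG is defined at that point.

```cpp
// Sketch: <assert.h> redefines assert() on every inclusion, depending on
// whether NDEBUG is defined at that point. Function names are illustrative.
#undef NDEBUG
#include <assert.h>

// assert() is active here, so its argument is evaluated.
inline int evaluated_with_checks() {
    int evaluated = 0;
    assert((evaluated = 1, true));
    return evaluated;
}

#define NDEBUG
#include <assert.h>   // same header, different result: assert() is now a no-op

// The argument is dropped during macro expansion and never evaluated.
inline int evaluated_without_checks() {
    int evaluated = 0;
    assert((evaluated = 1, true));
    return evaluated;
}

#undef NDEBUG
#include <assert.h>   // restore the checking version for any code that follows
```

The same header yields two different macros within one translation unit, so its meaning depends entirely on the includer's preprocessor state at each point of inclusion.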

> If we don't have macro isolation, then we won't have the property that
> order of consecutive module imports is immaterial -- which allows greater
> sharing of previously and independently processed modules, good for
> build performance.

We can make the import order immaterial and allow modules to export macros, but we need rules to describe how to behave if the macros conflict. (There are a few design choices in this area, and there are practicalities that guide our direction, such as allowing a C module to present an interface with macros, and allowing a C++ module to wrap that interface and remove some of the macros but keep other parts of it.)
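
As a rough illustration of what such a rule could look like, here is a sketch of one hypothetical choice: a macro defined differently by two imports is kept but flagged as ambiguous. Everything in it (MacroDef, MacroTable, merge_exports, check_order_independence) is made up for this example and does not describe any implementation's actual algorithm.

```cpp
// Sketch of one *hypothetical* conflict rule, not any compiler's algorithm.
#include <cassert>
#include <map>
#include <string>

struct MacroDef {
    std::string body;   // replacement text; empty once marked ambiguous
    bool ambiguous;     // true if imported definitions disagree
};

inline bool operator==(const MacroDef& x, const MacroDef& y) {
    return x.body == y.body && x.ambiguous == y.ambiguous;
}

using MacroTable = std::map<std::string, MacroDef>;

// Combine the macros exported by two imports. A name defined differently
// by the two sides is kept but flagged, so a later use can be diagnosed.
inline MacroTable merge_exports(const MacroTable& a, const MacroTable& b) {
    MacroTable out = a;
    for (const auto& entry : b) {
        auto result = out.insert(entry);          // keeps an existing entry
        MacroDef& merged = result.first->second;
        const bool clash = !result.second &&
            (merged.ambiguous || entry.second.ambiguous ||
             merged.body != entry.second.body);
        if (clash) {
            merged.body.clear();
            merged.ambiguous = true;
        }
    }
    return out;
}

// The property under discussion: swapping two imports changes nothing.
inline void check_order_independence(const MacroTable& a, const MacroTable& b) {
    assert(merge_exports(a, b) == merge_exports(b, a));
}
```

Because a conflict produces the same flagged entry whichever side is merged first, the result does not depend on import order. Other choices (for example, rejecting the second import outright) would need their own analysis.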

> It is a design question whether we want to deviate from that. If we
> do, we should conduct careful analysis of the implications.
>
> It is a separate question whether *all* standard headers must be turned
> into modules, and if so whether the transformation should be one-to-one or
> one-to-many.

| For 7:
| --
| ditto (replace "X" with "Y").
|
| For 2 & 3:
| --
| "Very little"

> "none".

Popular C standard libraries are parameterized by a number of macros such as _GNU_SOURCE that affect which symbols they provide. Practically, we need a way to get the same effect when including a module for the C standard library, but I don't know whether the standard needs to talk about that -- implementations can handle that detail for themselves (through configuration flags or command-line options or "module map files" or whatever else).

> Can a CPP directive in an importing module affect the imported module?
> If yes, how much more precisely?

Such directives should not affect the imported module at all. (But see above for a case where implementations might want to provide a mechanism to allow this.)

James Widman

Feb 27, 2014, 12:06:25 AM
to mod...@isocpp.org, Bjarne.S...@morganstanley.com, g...@microsoft.com

On Feb 26, 2014, at 11:19 PM, Gabriel Dos Reis <g...@axiomatics.org> wrote:

> | For 4:
> | --
> | "nothing direct, but if a pragma in X affects an attribute of an
> | entity declared in X (e.g. "pack" on an exported struct can affect the
> | size of that struct), and if the semantics in Q depend on that
> | attribute, then in that limited sense, yes, there is an observable
> | effect."
>
> Too complicated. Why do we need this?

Because I thought that, for implementations that already accept this:

#pragma pack(1)
struct S {
    int m;
    char c;
};
static_assert( sizeof(S) == 5, "" ); // Ok according to GCC, Clang, and MSVC, all targeting x86_64.

… then users would probably want the following to be well-formed (same compilers, same target):

// file x.cpp:
export X:
public:
#pragma pack( 1 )
struct S {
    int m;
    char c;
};
static_assert( sizeof(S) == 5, "" );
// end of file x.cpp here

// beginning of file q.cpp:
export Q:
public:
import X;
static_assert( sizeof(S) == 5, "" );
// end of file q.cpp here
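
For comparison, the single-file behaviour that this expectation builds on can be sketched in today's C++. This is an illustration only: the struct names are invented, #pragma pack is implementation-defined, and the sizes assume a target where int is 4 bytes with 4-byte alignment (as with GCC, Clang, and MSVC on x86_64). The layout is fixed on the type when it is defined; restoring the pragma state afterwards does not change it.

```cpp
// Sketch only: struct names are hypothetical; sizes assume a 4-byte,
// 4-byte-aligned int (GCC, Clang, and MSVC targeting x86_64).
#pragma pack(push, 1)      // save the current packing, then pack to 1 byte
struct PackedS {
    int  m;
    char c;
};
#pragma pack(pop)          // the pragma state is restored here...

struct DefaultS {          // ...so this type gets the default layout
    int  m;
    char c;
};

// ...but PackedS keeps the layout it was defined with.
static_assert(sizeof(PackedS) == 5, "packed: no tail padding");
static_assert(sizeof(DefaultS) == 8, "default: padded to the alignment of int");
```

In that sense the pragma itself does not need to reach the importer for sizeof(S) == 5 to hold there; only the already-laid-out type does.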


—James


Gabriel Dos Reis

Feb 27, 2014, 12:23:16 AM
to mod...@isocpp.org, Bjarne.S...@morganstanley.com, g...@microsoft.com
Richard Smith <richar...@google.com> writes:

[...]

| | For 7:
| | --
| | ditto (replace "X" with "Y").
| |
| | For 2 & 3:
| | --
| |  "Very little"
|
|
| "none".
|
|
| Popular C standard libraries are parameterized by a number of macros
| such as _GNU_SOURCE that affect which symbols they provide.
| Practically, we need a way to get the same effect when including a
| module for the C standard library, but I don't know whether the
| standard needs to talk about that -- implementations can handle that
| detail for themselves (through configuration flags or command-line
| options or "module map files" or whatever else).

Similar situations exist on our platforms too, with a prime example
being <windows.h>, which defines a bunch of declarations depending on the
set of macros active at the point of inclusion, and spits out a bunch of
macros. However, it is not clear to me (nor to the people in Windows
land I've talked to) that we want to systematically turn every header
file (the problem is acute, as you note, with C header files) into a
module in a one-to-one transformation. There are alternatives. For
example, one could retain the header files, but with parts that are
modularized. Those header files will continue to react the way they
used to, except that large fragments no longer need to be reprocessed
over and over.

| Can a CPP directive in an importing module affect the imported
| module?
|
| If yes, how much more precisely?
|
|
| Such directives should not affect the imported module at all. (But see
| above for a case where implementations might want to provide a
| mechanism to allow this.)

Agreed.

-- Gaby

Gabriel Dos Reis

Feb 27, 2014, 12:32:13 AM
to mod...@isocpp.org, Bjarne.S...@morganstanley.com, g...@microsoft.com
James Widman <james....@gmail.com> writes:

| On Feb 26, 2014, at 11:19 PM, Gabriel Dos Reis <g...@axiomatics.org> wrote:
|
| > | For 4:
| > | --
| > | "nothing direct, but if a pragma in X affects an attribute of an
| > | entity declared in X (e.g. "pack" on an exported struct can affect the
| > | size of that struct), and if the semantics in Q depend on that
| > | attribute, then in that limited sense, yes, there is an observable
| > | effect."
| >
| > Too complicated. Why do we need this?
|
| Because I thought that, for implementations that already accept this:
|
| #pragma pack(1)
| struct S {
| int m;
| char c;
| };
| static_assert( sizeof(S) == 5, "" ); // Ok according to GCC, Clang,
| and MSVC, all targeting x86_64.
|
| ... then users would probably want the following to be well-formed
| (same compilers, same target):
|
| // file x.cpp:
| export X:
| public:
| #pragma pack( 1 )
| struct S {
| int m;
| char c;
| };
| static_assert( sizeof(S) == 5, "" );
| // end of file x.cpp here
|
| // beginning of file q.cpp:
| export Q:
| public:
| import X;
| static_assert( sizeof(S) == 5, "" );
| // end of file q.cpp here

We are not going to get rid of #include, nor is it realistic to try to
make it synonymous with 'import'. Consequently, it will still be
available for programmers who deliberately want CPP directives like the
one in the example above to leak out into whatever translation unit
happens to contain that source file.

Based on feedback I have from folks here who have looked at internal
C-style header files, this isn't a particularly realistic or common
scenario. The reason is very simple: they ship library header files and
they have to protect themselves against random pragmas active at the
point of inclusion that may change calling conventions or structure
layout or other fundamental properties.

-- Gaby

James Widman

Feb 27, 2014, 6:35:43 AM
to mod...@isocpp.org, Bjarne.S...@morganstanley.com, g...@microsoft.com
Based on the discussion so far, and at least for starters, it seems like we want a rule like this:

In 2.2 [lex.phases], p1 modify phase 4 like so:

“””
4. Preprocessing directives are executed, macro invocations are expanded, and _Pragma unary operator expressions are executed. If a character sequence that matches the syntax of a universal-character-name is produced by token concatenation ([cpp.concat]), the behavior is undefined. A #include preprocessing directive causes the named header or source file to be processed from phase 1 through phase 4, recursively.

<insert>Unless otherwise specified, the execution of preprocessing directives that occurs for a translation unit U ignores all preprocessing directives from all translation units other than U. If U exports a module, the execution proceeds as if U is not referenced by any import-statement in the program. </insert>

All preprocessing directives are then deleted.

<insert>[Note: an import-statement [basic.mod] in U references a translation unit other than U. Consequently, there is no behavior that occurs **during phase 4** that can be altered by any import-statement in any translation unit. Example: an import-statement that references a module that contains an #include directive does not cause the execution of the #include directive within the TU that contains the aforementioned import-statement. Another example: when an imported module M is translated, the execution of a #pragma in M may cause the arrangement for special behavior to occur in a later phase of translation---e.g., in the form of additional arguments to a linker. —end note]</insert>
“””

To me, this seems to resolve the confusion I felt about the effect of a pragma that appears in X on the translation of Q in:

// file q.cpp:
export Q:
import X;
import Y;

For a #define directive that appears in X, the suggested insertion into phase 4 tells us that:

- During translation of X, the directive is executed during phase 4.

- During translation of Q, the directive is NOT executed during phase 4. (Furthermore, because directives are all deleted at the end of phase 4, the directive is not executed in Q at all.)

- HOWEVER, during translation of X, the directive can affect the semantics of a NON-MACRO entity exported from X (like the size of an exported type). If the semantics of Q depend on that entity from X, then the #define directive kind of has a "ripple effect” on the translation of Q—but not on anything that happens during phase 4 in Q.

Mentally replace “#define” with “#pragma”, and we have our answer for pragmas.
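
A single-TU analogy of the third bullet, as a sketch only (BUFFER_BYTES and Buffer are hypothetical names): the directive is executed once, shapes a non-macro entity, and then disappears, while its effect on that entity stays observable.

```cpp
// Sketch only: BUFFER_BYTES and Buffer are hypothetical names.
#define BUFFER_BYTES 16          // executed during phase 4 of this TU

struct Buffer {
    char data[BUFFER_BYTES];     // the macro is expanded here, once
};

#undef BUFFER_BYTES              // from here on the macro no longer exists

// The directive is gone, but its effect on the entity remains observable.
static_assert(sizeof(Buffer) == 16, "size fixed when Buffer was defined");

#ifdef BUFFER_BYTES
#error "BUFFER_BYTES should not be visible here"
#endif
```

Under the proposed rule, an importer of a module exporting Buffer would be in the position of the code after the #undef: it can observe sizeof(Buffer), but nothing in its own phase 4 depends on the #define.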

(I don’t think anyone suggested anything different; but previously, I might have been unclear about my position on the second bullet above.)

This would be the bare minimum; the next thing would be to establish guarantees about the null effect of re-ordering two adjacent import-statements.


Let controversy commence about what should be “otherwise specified”! (:


—James


James Widman

Feb 27, 2014, 1:07:33 PM
to Richard Smith, Daveed Vandevoorde, Bjarne.S...@morganstanley.com, Gabriel Dos Reis, mod...@isocpp.org, Doug Gregor
Richard & Daveed:

Has anyone started on core language wording?

I was struggling through this discussion until I found the anchor that is phase 4. So if people are generally ok with the effect of the <insert/> below, I’m going to take a crack at core language wording that specifies the effect of “export X:” and “import X;” in terms of phases of translation for two or more translation units.

(I would put it in a google doc so that people can follow the edits.)

We want wording that:

- defines a “module" as the result of translation through phase 7, for a given translation unit, that has a name given in an /export-directive/ (new grammar term): it’s a sequence of declarations, as well as the set of names of declared entities in their respective scopes. (In particular, a module is NOT a /declaration-seq/ (grammar term), because that would imply a sequence of tokens, and a module—as defined in the previous sentence---has no tokens.)

- updates the ODR to say that only one module in the whole program has any given name.

- says that at the point in a /declaration-seq/ where the /import-directive/ “import X;” appears, the implementation attempts to locate the module named “X” as previously produced by a compatible translation environment (where “compatible translation environment” implies several things, but e.g. sizeof(int) is the same in both TUs. And note, by “module”, I mean “sequence of declarations as produced by the completion of phase 7”.)

The implementation attempts to locate the module in an implementation-defined way. (But we can provide some hints in non-normative text; e.g. suggest that it live next to associated object code files, or next to associated DLLs. This is appropriate since both are produced in the same build config. Better yet, an implementation might zip them together so that it’s clear they should not be separated.)

If a compatible module cannot be located, then the current translation is suspended (paused) until after a compatible module is produced from another translation unit. (And the current translation has no effect on that sub-translation, other than to indicate a compatible translation environment.)

The implementation locates a primary source file for that sub-translation in an implementation-defined way (but we can provide hints in non-normative text).

When the first translation locates the module, its exported names (i.e., non-private names) are introduced into their respective scopes (including class and namespace scopes, for members of classes and namespaces defined in the module). The names are introduced as if each of their points of declaration is at the end of the /import-directive/.

From there, we can work our way to certain guarantees, like import order indifference.

Hopefully, we can guarantee that for a program image (end of phase 9), there is exactly one instance of each module (again, using definition of “module” above).

That would imply that, although you *can* run the compiler multiple times to produce a module, there is only one translation in the entire program for that module. The hope is that this could be an ODR requirement, with a diagnostic required if it is violated.

Of course, it only works if each and every entity in the program is provided by a module (or an instantiation in phase 8 that uses only modules).

But if we can say that, it means we can guarantee that a preprocessing directive in the primary source file of a module is executed exactly once for the translation of the program image. (Of course, it requires that said file is not the subject of an #include directive.)

Gabriel Dos Reis

Feb 27, 2014, 1:35:00 PM
to mod...@isocpp.org, Richard Smith, Daveed Vandevoorde, Bjarne.S...@morganstanley.com, Doug Gregor, g...@microsoft.com
James Widman <james....@gmail.com> writes:

| Richard & Daveed:
|
| Has anyone started on core language wording?
|
| I was struggling through this discussion until I found the anchor that
| is phase 4. So if people are generally ok with the effect of the
| <insert/> below, I'm going to take a crack at core language wording
| that specifies the effect of "export X:" and "import X;" in terms of
| phases of translation for two or more translation units.

Quick ack: I will respond to your earlier message later, but before you
start on wording, please note that I don't think we have a good grasp of
the design space yet. In particular, I don't think phase 4 of
translation is an appropriate anchor. We want processed modules to have
gone through at least phase 7; we need to explore interactions with
phase 8. For example, do we need phase 7.5 for 'module units'? Do
module units lead to instantiation units? Or are instantiation units
contained in modules (to provide a cache)? We need more discussion of
those issues before we start on wording.

-- Gaby

|
| (I would put it in a google doc so that people can follow the edits.)
|
| We want wording that:
|
| - defines a "module" as the result of translation through
| phase 7, for a given translation unit, that has a name given in an
| /export-directive/ (new grammar term): it's a sequence of
| declarations, as well as the set of names of declared entities in
| their respective scopes. (In particular, a module is NOT a
| /declaration-seq/ (grammar term), because that would imply a sequence
| of tokens, and a module--as defined in the previous sentence---has no
| > --end note]</insert>
| > """
|
|
|
| --
| You received this message because you are subscribed to the Google Groups "SG2 - Modules" group.

James Widman

Feb 27, 2014, 2:14:01 PM
to Gabriel Dos Reis, Richard Smith, Daveed Vandevoorde, Bjarne.S...@morganstanley.com, mod...@isocpp.org, Doug Gregor

On Feb 27, 2014, at 1:35 PM, Gabriel Dos Reis <g...@axiomatics.org> wrote:

>
> Quick ack: I will respond to your earlier message later, but before you
> start on wording; please note that I don't think we have a good grasp of
> the design space yet. In particular, I don't think phase 4 of
> translation is an appropriate anchor.

Well… it’s a point of interest, if only to declare that:

1) Each module is produced from a TU that is distinct from all other TUs (and in particular, it is distinct from each TU that imports the module).

2) No TU contains any preprocessing token from any other TU.

… so therefore no TU executes any preprocessing directive (phase 4) from any other TU.

> We want processed modules to have
> gone through at least phase 7;

Agreed.

> we need to explore interactions with phase 8.

Yes.

> For example, do we need phase 7.5 for 'module units’?

I guess it depends on what you mean by “7.5”...

Phase 7 includes the conversion of preprocessing tokens into tokens.

We don’t need no stinking tokens! (:

> Do
> module units lead to instantiation units? Or are instantiation units
> contained in modules (to provide a cache)? We need more discussions on
> those issues before start wording.

Well, look to phase 8: a module *is* a “translated translation unit” (as implementors think of that phrase today when they read that paragraph).

I’m uncertain about template instantiation semantics, but if we can *tentatively* specify this much (oh so tentatively!) then at least we have a common vocabulary to talk about the design.

—James


James Widman

Feb 28, 2014, 2:15:31 PM
to Gabriel Dos Reis, James Dennett, Bjarne.S...@morganstanley.com, mod...@isocpp.org
On Feb 27, 2014, at 2:14 PM, James Widman <james....@gmail.com> wrote:

>
> On Feb 27, 2014, at 1:35 PM, Gabriel Dos Reis <g...@axiomatics.org> wrote:
>
>>
>> Quick ack: I will respond to your earlier message later, but before you
>> start on wording; please note that I don't think we have a good grasp of
>> the design space yet. In particular, I don't think phase 4 of
>> translation is an appropriate anchor.
>
> Well… it’s a point of interest, if only to declare that:
>
> 1) Each module is produced from a TU that is distinct from all other TUs (and in particular, it is distinct from each TU that imports the module).
>
> 2) No TU contains any preprocessing token from any other TU.
>
> … so therefore no TU executes any preprocessing directive (phase 4) from any other TU.


BTW, I think this point means that, when you and James Dennett were arguing about how #pragmas should behave in an environment with modules, you were in "violent agreement”:

James’s point (IIUC) was that the effect of a pragma has always been “implementation-defined”, and that modules should not change that.

Your point (IIUC) was that we want a standard that clarifies that a #pragma must not “leak” from one module to the next.

But you’re both right:

**When a #pragma directive is executed** (during phase 4), the behavior is implementation-defined.

And if we have a rule that indicates that no module executes a preprocessor directive from another TU (and “import” only imports the end result of phase 7, which is well after #pragma and other CPP directives have been deleted), then an importing module never reaches the implementation-defined behavior that resulted from the execution of a #pragma that occurred during phase 4 of translation of the imported module.

So the only remaining possibility for “leakage” (for some definition of “leakage”) is if the implementation defines this:

“When #pragma ‘Foo' executes, the execution arranges for event ‘Bar' to happen *after* phase 7.”

(An example of this would be a #pragma that sets linker options.)

This clears up something else James suggested:

> We could introduce a new notion of a module-local pragma (#pragma
> module XXXX ?), but the default position should be that C++98-style
> pragmas are simply implementation-defined.


… because now, we can say that *all* pragmas that do nothing beyond phase 7 are “module-local” in terms of their execution. (At least, you can say that to users. Normative wording can give us that effect with less ambiguous verbiage, probably without saying it directly.)

James, Gaby: agreed?

—James


James Dennett

Feb 28, 2014, 11:04:52 PM
to James Widman, Gabriel Dos Reis, Bjarne.S...@morganstanley.com, mod...@isocpp.org
Yes, thank you: I think that covers it nicely. (And I just love it
when everyone gets to be "right".)

Pragmas in modules ought not to affect what happens to other modules
*except* that they might affect phases after phase 7 (i.e.,
translation of translation units and instantiation units is unaffected
by pragmas in other translation units). An argument could be made
that the phases of translation imply that already, but I wouldn't
oppose explicit wording to make it clearer.

-- James (Dennett)

Gabriel Dos Reis

Mar 3, 2014, 2:49:54 PM
to mod...@isocpp.org, Richard Smith, Daveed Vandevoorde, Bjarne.S...@morganstanley.com, Doug Gregor, g...@microsoft.com
James Widman <james....@gmail.com> writes:

| On Feb 27, 2014, at 1:35 PM, Gabriel Dos Reis <g...@axiomatics.org> wrote:
|
| >
| > Quick ack: I will respond to your earlier message later, but before you
| > start on wording; please note that I don't think we have a good grasp of
| > the design space yet. In particular, I don't think phase 4 of
| > translation is an appropriate anchor.
|
| Well... it's a point of interest, if only to declare that:
|
| 1) Each module is produced from a TU that is distinct from all
| other TUs (and in particular, it is distinct from each TU that imports
| the module).
|
| 2) No TU contains any preprocessing token from any other TU.
|
| ... so therefore no TU executes any preprocessing directive (phase 4)
| from any other TU.

I am still looking at what a module is. The design papers say that a
module can span several translation units; so a module isn't just a
translation unit.

Clang's documentation has a notion of submodule; are submodules part of
the enclosing modules? If yes, and assuming that modules can span
several translation units, how do the CPP directives from those
translation units interact together to form the module?

Also, note that Clang's documentation already has alternative ways of
achieving one of the examples brought up earlier

#pragma comment(lib, "linker stuff")

-- Gaby

Richard Smith

Mar 3, 2014, 2:57:26 PM
to Gabriel Dos Reis, mod...@isocpp.org, Daveed Vandevoorde, Bjarne.S...@morganstanley.com, Doug Gregor, Gabriel Dos Reis
As another example, Clang has:

  #pragma clang poison identifier 

... which causes any mention of that identifier after the pragma to be rejected. If you import a module that poisons an identifier, it is poisoned for you, too. (But obviously it is not poisoned in other modules that you import.)

This seems to violate James W's rules, because a preprocessing pragma in an imported module affects the preprocessing of the importing module.
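
For readers who have not used the pragma, here is a sketch of its ordinary single-file behaviour; legacy_answer and cached_answer are hypothetical names. It cannot show the cross-module propagation described above, only what "poisoned" means within one translation unit.

```cpp
// Sketch only: legacy_answer and cached_answer are hypothetical names.
constexpr int legacy_answer() { return 42; }

// Uses that appear before the pragma are unaffected.
constexpr int cached_answer = legacy_answer();

#pragma clang poison legacy_answer

// From here on, Clang rejects any mention of the identifier, e.g.
//     int x = legacy_answer();   // error under Clang: identifier is poisoned
// Compilers that do not recognise this pragma typically ignore it,
// possibly with a warning.
```

The pragma acts purely on preprocessing tokens, which is why carrying it from an imported module into the importer changes what happens during the importer's own preprocessing.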

Gabriel Dos Reis

Mar 3, 2014, 3:07:47 PM
to Richard Smith, mod...@isocpp.org, Daveed Vandevoorde, Bjarne.S...@morganstanley.com, Doug Gregor, g...@microsoft.com, mark...@microsoft.com
Indeed. I suggested in a message I just sent, that the effect of CPP
directives be nullified at the end of phase 7 (or 8). So a module that
was developed with an included source file that poisoned identifiers
won't leak it to importing modules.

-- Gaby

James Widman

Mar 3, 2014, 6:07:43 PM
to mod...@isocpp.org, Richard Smith, Daveed Vandevoorde, Bjarne.S...@morganstanley.com, Doug Gregor, g...@microsoft.com

On Mar 3, 2014, at 2:49 PM, Gabriel Dos Reis <g...@axiomatics.org> wrote:

> James Widman <james....@gmail.com> writes:
>
> | On Feb 27, 2014, at 1:35 PM, Gabriel Dos Reis <g...@axiomatics.org> wrote:
> |
> | >
> | > Quick ack: I will respond to your earlier message later, but before you
> | > start on wording; please note that I don't think we have a good grasp of
> | > the design space yet. In particular, I don't think phase 4 of
> | > translation is an appropriate anchor.
> |
> | Well... it's a point of interest, if only to declare that:
> |
> | 1) Each module is produced from a TU that is distinct from all
> | other TUs (and in particular, it is distinct from each TU that imports
> | the module).
> |
> | 2) No TU contains any preprocessing token from any other TU.
> |
> | ... so therefore no TU executes any preprocessing directive (phase 4)
> | from any other TU.
>
> I am still looking at what a module is. The design papers say that a
> module can span several translation units; so a module isn't just a
> translation unit.

Ah; true.

So, how about this:

“a module is a set of one or more translated translation units, with a subset of names therefrom made visible at the point of an import-directive that names the module.”

We can adjust as needed according to the precise rules of “public” and “private” at namespace scope W.R.T. visibility at the point of the import.


> Clang's documentation has a notion of submodule; are submodules part of
> the enclosing modules?

Yes.

> If yes, and assuming that modules can span
> several translation units, how do the CPP directives from those
> translation units interact together to form the module?


If a translation unit U1 defines a submodule s1 and imports (or is imported by, or is imported adjacent to) another submodule s2—defined by translation unit U2—then, during phase 4 of translation of U1, there is no preprocessing directive that executes as a consequence of anything in U2.


> Also, note that Clang's documentation already has alternative ways of
> achieving one of the examples brought up earlier
>
> #pragma comment(lib, "linker stuff")


Yep; noted and accounted for. (See my other email in this thread where I said “you’re both right” and commented on linker options.)

—James


James Widman

Mar 3, 2014, 6:26:36 PM3/3/14
to mod...@isocpp.org, Gabriel Dos Reis, Daveed Vandevoorde, Bjarne.S...@morganstanley.com, Doug Gregor, Gabriel Dos Reis
So… someone’s going to fix that bug in Clang, right? (:

I mean, no one actually *asked* for that behavior, did they? More likely, that behavior just fell out naturally because module importation uses most of the same serialized-AST-reading behavior as is used to implement precompiled headers…. right?

—James


Richard Smith

Mar 3, 2014, 6:29:09 PM3/3/14
to mod...@isocpp.org, Gabriel Dos Reis, Daveed Vandevoorde, Bjarne.S...@morganstanley.com, Doug Gregor, Gabriel Dos Reis
> So... someone's going to fix that bug in Clang, right? (:
>
> I mean, no one actually *asked* for that behavior, did they? More likely, that behavior just fell out naturally because module importation uses most of the same serialized-AST-reading behavior as is used to implement precompiled headers.... right?
 
No, this is deliberate.

James Widman

Mar 3, 2014, 9:04:18 PM3/3/14
to mod...@isocpp.org, Gabriel Dos Reis, Daveed Vandevoorde, Bjarne.S...@morganstanley.com, Gabriel Dos Reis, Doug Gregor

On Mar 3, 2014, at 6:29 PM, Richard Smith <richar...@google.com> wrote:

> On 3 March 2014 15:26, James Widman <james....@gmail.com> wrote:
>
>> On Mar 3, 2014, at 2:57 PM, Richard Smith <richar...@google.com> wrote:
[…]
>> > As another example, Clang has:
>> >
>> > #pragma clang poison identifier
>> >
>> > ... which causes any mention of that identifier after the pragma to be rejected. If you import a module that poisons an identifier, it is poisoned for you, too. (But obviously it is not poisoned in other modules that you import.)
>> >
>>
>> So... someone's going to fix that bug in Clang, right? (:
>>
>> I mean, no one actually *asked* for that behavior, did they? More likely, that behavior just fell out naturally because module importation uses most of the same serialized-AST-reading behavior as is used to implement precompiled headers.... right?
>
> No, this is deliberate.

Ok… this is happening in lib/Serialization/AST(Reader|Writer).cpp, and documented here:

http://clang.llvm.org/docs/PCHInternals.html

(Under “identifier table block”.)

So you set a bit in the id table (the PP id table) saying “foo” is poisoned.

If a library vendor put “#pragma clang poison foo” in their header file today, then they probably expect their users to get an error during phase 3. (And if you test this out with Clang today, you even get an error in -E mode.)

HOWEVER: In order to make this work, Clang does not pass PP directives back from the child to be executed in the parent process. So, it’s funny behavior, but it is consistent with what I described earlier.

—James


Richard Smith

Mar 3, 2014, 9:06:58 PM3/3/14
to Gabriel Dos Reis, mod...@isocpp.org, Daveed Vandevoorde, Bjarne.S...@morganstanley.com, Doug Gregor, Gabriel Dos Reis, mark...@microsoft.com
On 3 March 2014 17:58, Gabriel Dos Reis <g...@axiomatics.org> wrote:
> Which part is deliberate? Suppression of pragmas? Suppression of CPP directives?

The behavior I described, where "#pragma poison" is transmitted from an imported module to its importer, is intentional. The "poison" becomes part of that module's interface.

Richard Smith

Mar 3, 2014, 9:09:07 PM3/3/14
to mod...@isocpp.org, Gabriel Dos Reis, Daveed Vandevoorde, Bjarne.S...@morganstanley.com, Gabriel Dos Reis, Doug Gregor
On 3 March 2014 18:04, James Widman <james....@gmail.com> wrote:

On Mar 3, 2014, at 6:29 PM, Richard Smith <richar...@google.com> wrote:

> On 3 March 2014 15:26, James Widman <james....@gmail.com> wrote:
>
>> On Mar 3, 2014, at 2:57 PM, Richard Smith <richar...@google.com> wrote:
[...]

>> > As another example, Clang has:
>> >
>> >   #pragma clang poison identifier
>> >
>> > ... which causes any mention of that identifier after the pragma to be rejected. If you import a module that poisons an identifier, it is poisoned for you, too. (But obviously it is not poisoned in other modules that you import.)
>> >
>>
>> So... someone's going to fix that bug in Clang, right? (:
>>
>> I mean, no one actually *asked* for that behavior, did they? More likely, that behavior just fell out naturally because module importation uses most of the same serialized-AST-reading behavior as is used to implement precompiled headers.... right?
>
> No, this is deliberate.

Ok... this is happening in lib/Serialization/AST(Reader|Writer).cpp, and documented here:


http://clang.llvm.org/docs/PCHInternals.html

(Under "identifier table block".)

So you set a bit in the id table (the PP id table) saying "foo" is poisoned.

If a library vendor put "#pragma clang poison foo" in their header file today, then they probably expect their users to get an error during phase 3.  (And if you test this out with Clang today, you even get an error in -E mode.)

HOWEVER: In order to make this work, Clang does not pass PP directives back from the child to be executed in the parent process.  So, it's funny behavior, but it is consistent with what I described earlier.

Are you saying that this would not be OK if we implemented it by stashing a pragma in our binary module format and we processed that pragma when we imported the module? The standard has no business mandating implementation details like that.

Gabriel Dos Reis

Mar 3, 2014, 9:13:26 PM3/3/14
to James Widman, mod...@isocpp.org, Daveed Vandevoorde, Bjarne.S...@morganstanley.com, Doug Gregor, g...@microsoft.com, mark...@microsoft.com
James Widman <james....@gmail.com> writes:

| On Mar 3, 2014, at 6:29 PM, Richard Smith <richar...@google.com> wrote:
|
| > On 3 March 2014 15:26, James Widman <james....@gmail.com> wrote:
| >
| >> On Mar 3, 2014, at 2:57 PM, Richard Smith <richar...@google.com> wrote:
| […]
| >> > As another example, Clang has:
| >> >
| >> > #pragma clang poison identifier
| >> >
| >> > ... which causes any mention of that identifier after the pragma
| >> > to be rejected. If you import a module that poisons an
| >> > identifier, it is poisoned for you, too. (But obviously it is not
| >> > poisoned in other modules that you import.)
| >> >
| >>
| >> So... someone's going to fix that bug in Clang, right? (:
| >>
| >> I mean, no one actually *asked* for that behavior, did they? More
| >> likely, that behavior just fell out naturally because module
| >> importation uses most of the same serialized-AST-reading behavior
| >> as is used to implement precompiled headers.... right?
| >
| > No, this is deliberate.
|
| Ok… this is happening in lib/Serialization/AST(Reader|Writer).cpp, and
| documented here:
|
| http://clang.llvm.org/docs/PCHInternals.html
|
| (Under “identifier table block”.)

Thanks.
[Note: we shouldn't require people to read other people's implementation
before we can discuss what semantics we want -- for obvious reasons.
Meaning, I can't read that source code; please be considerate of that
constraint. ]

| So you set a bit in the id table (the PP id table) saying “foo” is poisoned.
|
| If a library vendor put “#pragma clang poison foo” in their header
| file today, then they probably expect their users to get an error
| during phase 3. (And if you test this out with Clang today, you even
| get an error in -E mode.)
|
| HOWEVER: In order to make this work, Clang does not pass PP directives
| back from the child to be executed in the parent process. So, it’s
| funny behavior, but it is consistent with what I described earlier.

What #pragma poison (which I suspect originated from GCC) achieves is to
essentially say that a declaration is deleted -- except that being a CPP
directive, it has no regard for scopes.

That may make sense for C. But, I suspect C++ has a better solution for
C++ purposes.

-- Gaby

Richard Smith

Mar 3, 2014, 9:16:06 PM3/3/14
to mod...@isocpp.org, James Widman, Daveed Vandevoorde, Bjarne.S...@morganstanley.com, Doug Gregor, Gabriel Dos Reis, mark...@microsoft.com
I don't find that particularly relevant -- the point of the example is solely that implementations may want to do all sorts of things with pragmas that won't necessarily fit into any rules we invent. So we should just leave them as implementation-defined. We don't need to say anything else.

James Widman

Mar 3, 2014, 9:19:02 PM3/3/14
to mod...@isocpp.org, Gabriel Dos Reis, Daveed Vandevoorde, Bjarne.S...@morganstanley.com, Gabriel Dos Reis, Doug Gregor
Ack.

And by “Ack,” I mean two things:

1) Acknowledged (you’re right): in the nested module, we hit the #pragma, which means we hit implementation-defined behavior, which means the only restriction is that you need to document it. Apart from that, the Standard actually has no say on what happens next. You could, for example, document arbitrary transformations in arbitrary TUs, or request that arbitrary TUs be re-translated with different pre-defined macros. (Read again: “The behavior might cause translation to fail **or cause the translator or the resulting program to behave in a non-conforming manner**.” ([cpp.pragma]))

2) Shit.

Sorry everyone. Wild goose chase.

Richard, thank you for your patience.

—James


James Widman

Mar 3, 2014, 10:31:14 PM3/3/14
to Gabriel Dos Reis, mod...@isocpp.org, Daveed Vandevoorde, Bjarne.S...@morganstanley.com, Doug Gregor, mark...@microsoft.com, g...@microsoft.com

On Mar 3, 2014, at 9:21 PM, Gabriel Dos Reis <g...@axiomatics.org> wrote:

> Richard Smith <richar...@google.com> writes:
>
> | On 3 March 2014 18:13, Gabriel Dos Reis <g...@axiomatics.org> wrote:
> |
[…]
> | [#pragma poison] may make sense for C. But, I suspect C++ has a better
> | solution for
> | C++ purposes.
> |
> |
> | I don't find that particularly relevant -- the point of the example is
> | solely that implementations may want to do all sorts of things with
> | pragmas that won't necessarily fit into any rules we invent. So we
> | should just leave them as implementation-defined. We don't need to say
> | anything else.
>
> On the contrary it is quite relevant because we are discussing what the
> boundaries of a module are and what should be expected. If you bring up an
> example for justification of uncertainty, it is relevant to discuss to
> what extent that example is conclusive or convincing.

I’ll try to reconcile these views:

When we see:

import M;

… we want to believe in a pure, post-phase-7 import.

But then we hear the response, “well, that’s not going to work, because of [fragment of a list of real-world pragmas that users will depend upon for the foreseeable future].”

Poison is a good example: nothing else causes a lex-time error for an arbitrary identifier.

And there are people who deliberately put that in their header files (I can only assume so; otherwise, why export the effect from a module?).

So:

- we can’t stop people from using pragmas in modules, and

- we can’t stop vendors from defining effects of a pragma outside the module where it appears.

However:

- perhaps [cpp.pragma] could require that implementations more precisely define the effect of #pragmas across import boundaries.

- By following implementation experience, we may learn more about demand for specific kinds of cross-import effects, which might inspire appropriate proposals for core language features that supplant those pragmas.

- Maybe more mileage might materialize in module maps. (Example: upon completion of preprocessing for a header, have all exported directives dumped into the header’s entry in the module map file. Instead of exporting directives from the binary module file, a directive read from the map file is a directive that the user may consider and selectively delete.)


In other words: since we can’t *force* people into the pure, post-phase-7 ideal, maybe we can do some investigations into the practical concerns keeping people away from that ideal, and come up with approaches that at least enable the -pedantic crowd to do their thing, while requirements incompatible with -pedantic are still met.

—James


James Widman

Mar 4, 2014, 1:24:39 PM3/4/14
to Gabriel Dos Reis, Bjarne.S...@morganstanley.com, mod...@isocpp.org, Daveed Vandevoorde, Doug Gregor, Mark Hall (VC++), Gabriel Dos Reis

On Mar 3, 2014, at 10:50 PM, Gabriel Dos Reis <g...@axiomatics.org> wrote:

> -pedantic is offensive and suggests a bias, not a reconciliation of views.

Well… ok, maybe a compiler switch with the same effect as “-pedantic”, which could have been given a spelling that is a little less snarky, like:

-ISO-only

I only meant to refer to the effect of the switch.

> Note that these concerns are being raised while we are contemplating an
> implementation to deal with hundreds of millions of lines of code and a
> particular concern for build-time performance and portability of
> semantics for our customers.

Noted and agreed, both about performance and portability-of-semantics.

Regarding performance: it’s true that we still don’t know the answer to this. There is good reason to expect that Clang's lazy-AST-loading approach will pay off, but until we have data, we can’t really say.

But for the question about *performance*, at least we can reasonably expect an answer.

By contrast, for the question about *portability-of-semantics*: because of #pragma, we do NOT know if we will ever have a complete answer.

Even if a future modules proposal is well-implemented & tested, other attempts to implement it (and attempts to use it with those other implementations) may still fail in unexpected ways because of unexpected or unavoidable interactions with unknown or under-documented pragmas.

Under these circumstances, and as of this morning, I don’t think anyone can reasonably submit a core language proposal for modules.

But even still, we want to try to make modules work.

Therefore we need solutions to the #pragma problem.

… which leads us to what Bjarne just wrote:

> We cannot force people, but we can - and should - encourage and enable a more modern, more modular style of code. If all we get from modules is faster compilations, we will be left with a constant demand for support for more sanitary code. Compilation speed is one aim of modules, cleaner code (meaning fewer bugs) is another.
>
> We will not get a second go at this. The default should be sanitary, and the bias in description should be strongly "pro sanitary." This is a bigger deal for C++ than for C, and we should not accept a solution that works well only for C and C-style programs.


Agreed on all points.

We need response efforts. (Plural.)

Here’s my own partial list (everyone: please append/comment as you wish):

1) With regard to normative wording in clauses 1 through 16 inclusive: we might not be able to use “shall”, but we can use “implementations are encouraged…”.

2) We should ask known implementors of C++11 features about how pragmas in their implementation(s) behave TODAY across boundaries between:

a) #include directives

b) translation units

c) libraries

… and ask them to talk (as much as NDAs allow) about cases where a pure-post-phase-7-import might not work, particularly in cases where a user might replace #include <foo.h> with “import foo;”.

This might be worded in the form of a straw-man core-wording document, where, instead of the weaker “encouraged” verbiage, inserted text uses “shall” (e.g.: “an import shall proceed as if any pragmas from the imported module do NOT execute in the importing translation unit.”). The question to implementors could then be: “please list #pragmas and other non-standard CPP directives for which this will not work, and give examples & use cases.”

3) We should independently test behavior of actual #pragmas across #include boundaries.

4) We should survey users of these pragmas (again, using carefully-selected words).

4.1) We should identify and publish cases where users do NOT want a #pragma execution to happen as the result of an import.

5) We should consider/discuss/document any potential alternative migration strategies.



—James

Gabriel Dos Reis

Mar 4, 2014, 2:25:52 PM3/4/14
to James Widman, Bjarne.S...@morganstanley.com, mod...@isocpp.org, Daveed Vandevoorde, Doug Gregor, Mark Hall (VC++), g...@microsoft.com
James Widman <james....@gmail.com> writes:

| On Mar 3, 2014, at 10:50 PM, Gabriel Dos Reis <g...@axiomatics.org> wrote:
|
| > -pedantic is offensive and suggests a bias, not a reconciliation of views.
|
| Well… ok, maybe a compiler switch with the same effect as “-pedantic”, which could have been given a spelling that is a little less snarky, like:
|
| -ISO-only
|
| I only meant to refer to the effect of the switch.

We are discussing the design and semantics of a feature intended to be
included in an ISO standard.

| > Note that these concerns are being raised while we are contemplating an
| > implementation to deal with hundreds of millions of lines of code and a
| > particular concern for build-time performance and portability of
| > semantics for our customers.
|
| Noted and agreed, both about performance and portability-of-semantics.
|
| Regarding performance: it’s true that we still don’t know the answer
| to this. There is good reason to expect that Clang's lazy-AST-loading
| approach will pay off, but until we have data, we can’t really say.

Note that lazy loading of AST is not the issue. I can tell you that
we've done some exploration of the implementation strategy space and
what you might call "lazy loading of AST" (I don't know the
implementation details of Clang) features itself in various forms in our
findings.

| But for the question about *performance*, at least we can reasonably
| expect an answer.
|
| By contrast, for the question about *portability-of-semantics*:
| because of #pragma, we do NOT know if we will ever have a complete
| answer.

We may not know the complete answer for everything, but we are
*designing* the module system. I have not seen any sustained argument
and explanation of why an importing module cannot be made agnostic of
what CPP directives do in an imported module, and vice-versa.

I would like us to have a design discussion of what it is we want, set out
a few simple design rules, map out what module usage is for users, and
focus less on diffs of the standards text just now.

One fundamental property that I would like to see for modules is that
CPP directives do not leak outside module boundaries -- this includes
pragmas, no matter how implementation-defined their semantics are.

[...]

| … and ask them to talk (as much as NDAs allow) about cases where a
| pure-post-phase-7-import might not work, particularly in cases where a
| user might replace #include <foo.h> with “import foo;”.

I think it is unrealistic to expect that '#include' is going away.
It is here to stay. I believe it is equally unrealistic to expect
that a design that makes 'import' synonymous with '#include' is going to
gain unanimous support.

#include has been around for far too long and has been the subject of
far too many creative uses. We shouldn't aim for watering down a module
design that supports modern and robust software engineering, just because
of all the arcane corners of #include. I suspect the creative uses of
#include will stay around forever.

What we can hope for is that reasonably good uses of CPP directives will
lend themselves to modularization. We have to work out solutions that
allow implementors to map included header files to modules, but we
shouldn't insist on an absolute one-to-one mapping.

The goal shouldn't be to make 'import' synonymous with '#include'.
Rather, it should be to make '#include' antiquated and redundant. We
can't achieve that if we insist on flying over its semantics.

What is good for C isn't necessarily good for C++.

-- Gaby

James Widman

Mar 4, 2014, 2:36:47 PM3/4/14
to Richard Smith, mod...@isocpp.org, Bjarne.S...@morganstanley.com, Daveed Vandevoorde, Doug Gregor, Mark Hall (VC++), Gabriel Dos Reis
Richard, I just noticed this:

On Feb 26, 2014, at 11:55 PM, Richard Smith <richar...@google.com> wrote:

> I think this should be another fundamental principle -- order of import does not affect program validity (but sure, it might affect order of initialization).

So, suppose we make normative  (“shall”) wording to specify that principle.

Then consider:

import Y; // ok
import X; // ok (but X’s TU contains “#pragma GCC poison Y”)

… swap like so:

import X;
import Y; // Error for use of poisoned identifier Y. (phase-3 error)

I don’t think [cpp.pragma] would allow an implementor to claim conformance if it issues an error in this case for Y.  (It says, “*might* cause translation to fail”, except we’re supposing a normative rule that says swapping doesn’t affect validity. I guess ‘poison’ could be redefined to be retroactive in this case, but I’m not sure that’s what you want.)
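
The order dependence can be shown with a toy model, sketched below under loud assumptions: it is not Clang's implementation and not proposed semantics, and `PoisonTable` and `first_rejected_import` are made-up names. Each module exports the set of identifiers its TU poisoned, and an import is rejected if its name token is already poisoned at that point.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model only: module name -> identifiers poisoned inside that module's TU.
using PoisonTable = std::map<std::string, std::set<std::string>>;

// Processes the imports in source order. Returns the index of the first
// import whose name is already poisoned, or -1 if every import is accepted.
int first_rejected_import(const PoisonTable& exported,
                          const std::vector<std::string>& imports) {
    std::set<std::string> poisoned;  // poison accumulated so far in the importer
    for (std::size_t i = 0; i < imports.size(); ++i) {
        if (poisoned.count(imports[i]) != 0) return static_cast<int>(i);
        auto it = exported.find(imports[i]);
        if (it != exported.end()) {
            // The imported module's poison becomes visible to the importer.
            poisoned.insert(it->second.begin(), it->second.end());
        }
    }
    return -1;
}
```

With X poisoning `Y`, the order (Y, X) is accepted in this model while (X, Y) is rejected at the second import, which is the conflict with an "order of import does not affect validity" rule.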

—James


Richard Smith

Mar 4, 2014, 3:04:08 PM3/4/14
to James Widman, mod...@isocpp.org, Bjarne.S...@morganstanley.com, Daveed Vandevoorde, Doug Gregor, Mark Hall (VC++), Gabriel Dos Reis
[cpp.pragma]p1: "The behavior might cause translation to fail or cause the translator or the resulting program to behave in a non-conforming manner."

Note that pragmas allow us to not conform.

It seems to me that the above rule, in effect, says that a program that uses a pragma is written in some implementation-defined language resembling C++, not in standard C++, so we cannot specify how pragmas behave; it would be a logical contradiction to do so.


Also, we can argue that this:

  import X;
  import Y;

... does not contain two import declarations, because the second line is ill-formed at the token level (the fifth token is a poison token, not the identifier 'Y'). Also, remember that we don't even have a syntax for module imports yet, so assuming that they're named with identifiers may be premature.


Finally, I honestly don't see why we're expending so much time talking about pragmas. They've got to be approximately the least interesting part of modules. Why can't we just leave them as being implementation-defined?

James Widman

Mar 4, 2014, 3:26:45 PM3/4/14
to Richard Smith, mod...@isocpp.org, Bjarne.S...@morganstanley.com, Daveed Vandevoorde, Doug Gregor, Mark Hall (VC++), Gabriel Dos Reis

On Mar 4, 2014, at 3:04 PM, Richard Smith <richar...@google.com> wrote:

> On 4 March 2014 11:36, James Widman <james....@gmail.com> wrote:
>> Richard, I just noticed this:
>>
>> On Feb 26, 2014, at 11:55 PM, Richard Smith <richar...@google.com> wrote:
>>
>>> I think this should be another fundamental principle -- order of import does not affect program validity (but sure, it might affect order of initialization).
>>
>> So, suppose we make normative (“shall”) wording to specify that principle.
>>
>> Then consider:
>>
>> import Y; // ok
>> import X; // ok (but X’s TU contains “#pragma GCC poison Y”)
>>
>> … swap like so:
>>
>> import X;
>> import Y; // Error for use of poisoned identifier Y. (phase-3 error)
>>
>> I don’t think [cpp.pragma] would allow an implementor to claim conformance if it issues an error in this case for Y. (It says, “*might* cause translation to fail”, except we’re supposing a normative rule that says swapping doesn’t affect validity. I guess ‘poison’ could be redefined to be retroactive in this case, but I’m not sure that’s what you want.)
>
>
> [cpp.pragma]p1: "The behavior might cause translation to fail or cause the translator or the resulting program to behave in a non-conforming manner."

It’s a funny bit of verbiage (“non-conforming manner”), isn’t it? (:

The more I read it, the less sure I am of what it means.

Does it mean that an implementation can call itself “conforming” even though it generates a non-conforming program image? (In which case, does the word “conforming” mean anything in the International Standard?)

Or does it mean that your implementation might not be a conforming implementation?

And if, in practice, all implementations use a #pragma to do something “non-conforming”…

I mean, given this view of pragmas (which I’m not saying is incorrect): I don’t think anyone has ever actually used a C++ implementation.

But having said that...

> Note that pragmas allow us to not conform.
>
> It seems to me that the above rule, in effect, says that a program that uses a pragma is written in some implementation-defined language resembling C++, not in standard C++, so we cannot specify how pragmas behave; it would be a logical contradiction to do so.
>
>
> Also, we can argue that this:
>
> import X;
> import Y;
>
> ... does not contain two import declarations, because the second line is ill-formed at the token level (the fifth token is a poison token, not the identifier 'Y').

Ah… I guess your lexer could adopt that world view (since we’re in “non-conforming” territory anyway)…


> Also, remember that we don't even have a syntax for module imports yet, so assuming that they're named with identifiers may be premature.

Right.

> Finally, I honestly don't see why we're expending so much time talking about pragmas. They've got to be approximately the least interesting part of modules. Why can't we just leave them as being implementation-defined?

It’s about portability.

Why is it a new concern? Because WRT #pragmas, the effect of an #include is usually negligible: translation behavior is pretty much as if the content of the included header appeared inline at the point of the #include directive.

So if you have two implementations (say, MSVC and GCC) that implement the same pragma that appears in an #include’d header with pretty much the same semantics, then from the user’s perspective, that’s a portable header.

By contrast, one implementation might, in effect, re-execute an imported module’s pragma at the point of an import while another implementation does NOT.

So there is a whole new opportunity for fragmentation of language semantics. I don’t know if it’ll be as bad as the 1990s, but it could still be pretty bad.
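
A concrete, if mundane, illustration of pragma state crossing a textual-inclusion boundary is `#pragma pack`, which MSVC, GCC and Clang all accept. The sketch below simulates a header's contents inline, since that is how `#include` behaves; the file name and struct names are hypothetical, and the sizes assume a mainstream ABI where `std::uint32_t` is 4-byte aligned.

```cpp
#include <cstdint>

// ---- imagine this part lives in "wire_format.h" (hypothetical) ----
#pragma pack(push, 1)
struct WireHeader {          // packed: no padding between members
    std::uint8_t  tag;
    std::uint32_t length;
};
// A header that forgot the matching pop would leave packing at 1 for
// everything the including file declares afterwards.
#pragma pack(pop)
// ---- end of the simulated header ----

struct LocalRecord {         // declared after the pop: default alignment
    std::uint8_t  tag;
    std::uint32_t length;
};

static_assert(sizeof(WireHeader) == 5, "pack(1) removes the padding");
static_assert(sizeof(LocalRecord) > sizeof(WireHeader),
              "default layout pads before the 32-bit member");
```

With `#include`, two compilers that agree on the pragma agree on this outcome. With an import, whether such state follows the importer is a separate decision each implementation would have to make.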


—James


Richard Smith

Mar 4, 2014, 5:37:30 PM3/4/14
to James Widman, mod...@isocpp.org, Bjarne.S...@morganstanley.com, Daveed Vandevoorde, Doug Gregor, Mark Hall (VC++), Gabriel Dos Reis
Pragmas are not portable, unless you're porting between implementations whose behavior on those pragmas is the same. Between such implementations, modules introduce no change in portability.
 
> Why is it a new concern? Because WRT #pragmas, the effect of an #include is usually negligible: translation behavior is pretty much as if the content of the included header appeared inline at the point of the #include directive.
>
> So if you have two implementations (say, MSVC and GCC) that implement the same pragma that appears in an #include’d header with pretty much the same semantics, then from the user’s perspective, that’s a portable header.
>
> By contrast, one implementation might, in effect, re-execute an imported module’s pragma at the point of an import while another implementation does NOT.

In such a case, the implementations would not have the same semantics for the pragma.

James Widman

Mar 4, 2014, 6:41:00 PM3/4/14
to Richard Smith, mod...@isocpp.org, Bjarne.S...@morganstanley.com, Daveed Vandevoorde, Doug Gregor, Mark Hall (VC++), Gabriel Dos Reis
Yes, obviously. But that would only be true upon the introduction of modules. (Well, any of a subset of possible designs for modules in ISO C++, anyway.)

So the point is: such a design for modules might exacerbate existing portability issues with pragmas, thereby making more programs less portable.

For some projects, it could mean that modules will be unusable.

If there is a migration approach like the one I describe in the “#include migration strategies” thread (posted roughly four hours ago), where directives are made available, but also user-separable from an import, would that make it any easier to accept any kind of non-leakage guarantee?

—James

James Widman

Mar 4, 2014, 8:19:58 PM3/4/14
to Richard Smith, mod...@isocpp.org, Bjarne.S...@morganstanley.com, Daveed Vandevoorde, Doug Gregor, Mark Hall (VC++), Gabriel Dos Reis
Almost forgot:

On Mar 4, 2014, at 3:04 PM, Richard Smith <richar...@google.com> wrote:

> It seems to me that [cpp.pragma], in effect, says that a program that uses a pragma is written in some implementation-defined language resembling C++, not in standard C++, so we cannot specify how pragmas behave; it would be a logical contradiction to do so.

Well, it certainly seems that way based on how 16.6 [cpp.pragma] is worded today, but a hypothetical modules proposal could change it to:

“[…] causes the implementation to behave in an implementation-defined manner<insert>, except that, at the point of a module import, [blah blah blah] shall [blah blah blah]. In cases where the aforementioned restrictions do not apply, </insert> the behavior might cause translation to fail […]”

[Going to core now to ask for a clarification on [cpp.pragma]’s current meaning.]

—James

