One of the problems I have with all examples of Modules that I've seen is that they use simple inline functions to demonstrate the feature. What does 'using Modules' look like for a shared library?
Steve:
>> I realize that the standard has nothing to say about buildsystems or even compiler drivers, but the impact that the standard has on those (particularly in the case of Modules) will affect adoption of the feature, so I think it makes sense to talk about it.
Agreed.
Steve:
>> It seems to me that large amounts of C++ library code would have to be rewritten to take advantage of them.
That hasn’t been my experience, nor is there any factual or logical reason for it to be so.
If your existing library/program is already architecturally modular, there is no reason to have to rewrite them. You may have
- to add a module declaration to state in code the symbolic name of your module (which was in the background of your architecture, but you had no way to express directly)
- to stick the keyword ‘export’ in front of the declarations that you intended to be part of the interface of your library
but that is a far cry from “have to be rewritten.”
Steve:
>> In the Microsoft implementation of Modules at least, it seems that users will have to create new ixx files
No, there is no requirement that you have to create new ixx files. You can invoke the compiler with the -module:interface command-line switch. Having an “ixx” is a convenience, not a requirement.
Steve:
>> One of the problems I have with all examples of Modules that I've seen is that they use simple inline functions to demonstrate the feature.
I don’t know that to be true. Certainly the design document (P0142R0) contains examples of non-inline functions.
If your larger suggestion is that we need more tutorial materials, I fully agree with that.
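In that spirit, a minimal non-inline example in the TS-era syntax used elsewhere in this thread might look like the following (the file names and the module/function names are invented for illustration; this is a multi-file sketch that needs a modules-capable compiler such as the VC++ build discussed above, not something buildable everywhere as written):

```cpp
// file: math.ixx (hypothetical) - module interface unit
module Math;                   // TS-era spelling; standard C++20 uses 'export module Math;'
export int add(int a, int b);  // exported, but not inline: the definition lives elsewhere

// file: math_impl.cpp (hypothetical) - module implementation unit
module Math;
int add(int a, int b) { return a + b; }

// file: main.cpp - a consumer
import Math;
int main() { return add(2, 3) == 5 ? 0 : 1; }
```

The point of the sketch is simply that ‘export’ marks the interface while the definition stays out of line, exactly as with a .h/.cpp pair today.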
Steve:
>> What does 'using Modules' look like for a shared library?
I know you know the distinction, but I want to take the opportunity here to clarify a common confusion. C++ modules are distinct from shared libraries or DLLs. There is no requirement that a C++ module be compiled to a single shared library, or that a shared library correspond to a single module. Furthermore, C++ export declarations are not necessarily linker-level export symbols (e.g. dllexport) for dynamic-linking purposes. It is most likely that a linker-level symbol that is dllexported ends up being declared ‘export’ at the C++ source level; but the converse does not necessarily hold.
Now, if you do export a declaration both in the C++ sense (using the keyword ‘export’) and in the linker sense (e.g. __declspec(dllexport)), VC++ will take the step of automatically marking the symbol as ‘dllimport’ on the ‘import side’. This relieves you from having to perform the confusing macro dance regarding when to put __declspec(dllexport) or __declspec(dllimport). All of that evaporates.
Steve:
>> Imagine I have 40 classes in a library, which today each has a .cpp and a .h file and I want to support modules. I am required to write one ixx file for the entire library, so I copy the contents of the 40 .h files into the MyLib.ixx and I add a few export keywords.
I am a bit puzzled. If your existing library has 40 classes in modular 40 headers, and that corresponds to the architecture you had in mind for your library, why would you want to smash them together into a single MyLib.ixx? Apparently you have determined that you wanted a single module instead of 40 modules (that would mirror your existing 40 headers). If that is the case, then why didn’t you have a super-header that included all of the 40 headers which would have been the official interface of your library? The question isn’t rhetorical; I’m trying to understand what has changed in your architecture to push you to do that?
By the way, you can also have this:
// file: MyLib.ixx
module MyLib;
#include "foo1.h"
// …
#include "foo40.h"
if you like the precept of one class per file. You can even go further, using module aggregation, if you go by the precept of one class per module.
Steve:
>> Question: Is that a deliberate outcome of the choices made in designing Modules? What impact does it have on maintainability of the code?
Yes. It helps maintenance by centralizing the definitions of helpers (shared by module units of the same module) in one place.
Steve:
No, not really. The actual issue here is the decision made to lump the contents of 40 headers together. That isn’t required if it does not match the architecture you had in mind for your library.
Steve:
>> which apparently produces an object file (what does it contain? Is that what I want?) and the MyLib.ifc file.
First of all, a module interface unit can contain definitions, so the result of compiling them generally needs to go somewhere. The traditional place is an object file. So, yes, that is what you want. :-)
What this means is that compiling a module interface file produces at least two outputs.
Steve:
>> My buildsystem knows to invoke that command before attempting to compile any of my cpp files. If I have multiple libraries in the same build and there are dependencies between them, my buildsystem knows to generate the .ifc files in the correct order.
That is awesome! Which version of CMake has this capability?
Steve:
>> Question: What is the impact of naming the module in the .ixx file? What synchronization is expected between the 'module M;' name and the name of the .ifc file generated? What if there was no 'module M;' syntax? Is there an alternate design without name redundancy?
There is no formal relationship between the pathname of the file containing a module interface unit and the module name itself.
“module M;” is what indicates to the compiler (from the C++ semantics point of view) that any declaration that follows is owned by the module M. If that module declaration is missing, then there are no module semantics.
In the VC++ case, the name of the IFC file is settable by the user via the command-line option “-module:output”. You can choose any name you want. If you import a module M in your source, you need to specify an IFC file for the compiler to find the metadata for the interface of M. You can do that via “-module:reference <pathname>”, where you specify a specific file, or via a directory search path with the option “-module:search <directory>”, in which case the compiler will try to associate the module M with a file M.ifc in that directory. I recommend being explicit, e.g. using the “-module:reference” option. See the sections “Consuming Modules” and “Module Search Path” at https://blogs.msdn.microsoft.com/vcblog/2015/12/03/c-modules-in-vs-2015-update-1/
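Put together, a two-module build with these switches might look like the following (the file names are invented, and the exact switch spellings, including whether an -experimental:module flag is also needed, are per the VC++ blog post above, so treat this as a sketch):

```
:: Util has no imports; MyLib imports Util; main.cpp imports MyLib.
cl -c -module:interface -module:output Util.ifc Util.ixx
cl -c -module:interface -module:output MyLib.ifc -module:reference Util.ifc MyLib.ixx
cl -c -module:reference MyLib.ifc -module:reference Util.ifc main.cpp
```

Note the ordering constraint this imposes on the build: each .ifc must exist before anything that imports it is compiled, which is exactly the buildsystem question discussed below.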
Steve:
>> What is the impact of Modules on buildsystems?
This is a good question.
The Windows engineering team upgraded its existing build system to include infrastructure for using modules. From what I understand, it was a smooth upgrade.
Thanks,
-- Gaby
Steve:
>> Question: Is that a deliberate outcome of the choices made in designing Modules? What impact does it have on maintainability of the code?
Yes. It helps maintenance by centralizing the definitions of helpers (shared by module units of the same module) in one place.
I'm not sure what you are referring to here.
Right. That was one of my misunderstandings I suppose.
Steve:
>> My buildsystem knows to invoke that command before attempting to compile any of my cpp files. If I have multiple libraries in the same build and there are dependencies between them, my buildsystem knows to generate the .ifc files in the correct order.
That is awesome! Which version of CMake has this capability?
I think you misunderstood me here. I'll try to clarify:
Speaking generally, CMake/Makefiles/Ninja need to know the output files which will be created by any command that is executed, in order to know which command to execute to create a particular file, and to determine the correct order to generate them. This has been part of those systems for years. I could create 'custom target' rules today with CMake to generate appropriate Makefiles/Ninja files that invoke cl.exe to generate ifc files.
Note though that this requires knowing/hardcoding the output files which will be created by the command.
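A hedged sketch of such a custom rule, with the outputs spelled out explicitly (the paths and the switch spellings are illustrative, following the first-generation VC++ options mentioned earlier in the thread):

```cmake
# The OUTPUT list must be hardcoded up front - the buildsystem cannot
# discover it from the source file on its own.
add_custom_command(
  OUTPUT ${CMAKE_BINARY_DIR}/MyLib.ifc ${CMAKE_BINARY_DIR}/MyLib.obj
  COMMAND cl -c -module:interface
             -module:output ${CMAKE_BINARY_DIR}/MyLib.ifc
             ${CMAKE_SOURCE_DIR}/MyLib.ixx
  DEPENDS ${CMAKE_SOURCE_DIR}/MyLib.ixx
  COMMENT "Generating MyLib.ifc")
add_custom_target(MyLib_ifc DEPENDS ${CMAKE_BINARY_DIR}/MyLib.ifc)
```

This works today, but it is exactly the knowing/hardcoding of outputs described above.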
Steve:
>> Question: What is the impact of naming the module in the .ixx file? What synchronization is expected between the 'module M;' name and the name of the .ifc file generated? What if there was no 'module M;' syntax? Is there an alternate design without name redundancy?
There is no formal relationship between the pathname of the file containing a module interface unit and the module name itself.
“module M;” is what indicates to the compiler (from the C++ semantics point of view) that any declaration that follows is owned by the module M. If that module declaration is missing, then there are no module semantics.
In the VC++ case, the name of the IFC file is settable by the user via the command-line option “-module:output”. You can choose any name you want. If you import a module M in your source, you need to specify an IFC file for the compiler to find the metadata for the interface of M. You can do that via “-module:reference <pathname>”, where you specify a specific file, or via a directory search path with the option “-module:search <directory>”, in which case the compiler will try to associate the module M with a file M.ifc in that directory. I recommend being explicit, e.g. using the “-module:reference” option.
This is probably made clear elsewhere but I've already forgotten. Do I need to transitively specify all modules? For example, if I want to use QtCore.QPushButton, do I need to specify module files in my buildsystem for QtWidgets.QWidget and QtCore.QObject (which are dependencies by inheritance)?
If I have a library or executable and I add a new user of QtWidgets.QLabel to one of the classes, do I have to change my buildsystem to add a module reference for that?
Or would I (as an upstream Qt maintainer), in a case like that, design the Qt modules system such that users specify directories instead, because it is more convenient and requires fewer changes to the buildsystem?
That ends up kind of similar to what we have with header files today. However, the buildsystem still needs to know what files get included, so that when the input header/module files change, the translation units depending on those headers/modules get recompiled. At least today GCC can output the used header files in a Makefile-compatible format (and Ninja can read those files too). I suppose compilers would need the same feature to output a list of used modules. Just something to think about as part of the impact of this.
Also, one of the things Manuel Klimek mentioned in his talk is that they hit command-line length limits in the buildsystem due to specifying all of the module files.
Steve:
>> What is the impact of Modules on buildsystems?
This is a good question.
The Windows engineering team upgraded its existing build system to include infrastructure for using modules. From what I understand, it was a smooth upgrade.
What I would find interesting is whether they need to parse the 'module modulename;' content from the source files, or whether they always need to specify the ifc output file name in the buildsystem when processing the ixx file (and if that is the case - what the value of the 'module modulename;' content is).
Regards
On Mon, Feb 6, 2017 at 12:55 AM Stephen Kelly <stev...@gmail.com> wrote:
In his talk, Manuel Klimek describes scalability problems encountered by having so many small modules:
https://www.youtube.com/watch?v=dHFNpBfemDI
In the end, they kept the design of having so many modules and optimized other areas to compensate, but there may be other reasonable approaches based on the idea of reducing the absolute number of modules.
However, you recommend that the latter is not an approach which modules are really designed for, right?
I thought my message was that the problems with scalability are modules that are too large :)
Also note that since that talk we've brought build times down significantly by automatically pruning unneeded modules (as exposed by the .d file of the modules build). Something like that would need to be implemented by CMake modules support for extra speed-ups for rebuilds.
Note though that this requires knowing/hardcoding the output files which will be created by the command.
Don't we do the exact same thing for object files today?
Similarly to "C++ source file generates object file", I'd expect we'll have "C++ module file generates module output file".
At least today GCC can output the used header files in a Makefile compatible format (and Ninja can read those files too). I suppose they would need the same feature to output a list of used modules. Just something to think about as part of the impact of this.
clang for example already does that.
Also, one of the things Manuel Klimek mentioned in his talk is that they hit command-line length limits in the buildsystem due to specifying all of the module files.
That is a very clang specific issue, though, and really unrelated to standardization issues :)
Steve:
>> What is the impact of Modules on buildsystems?
This is a good question.
The Windows engineering team upgraded its existing build system to include infrastructure for using modules. From what I understand, it was a smooth upgrade.
What I would find interesting is whether they need to parse the 'module modulename;' content from the source files, or whether they always need to specify the ifc output file name in the buildsystem when processing the ixx file (and if that is the case - what the value of the 'module modulename;' content is).
That is all very MS specific terminology.
Generally, I'd expect that build systems could, in a modules world, be able to parse the C++ code that specifies the module and its dependencies (for example by using the compiler with a switch that will run in reduced mode, similar to preprocessing, but without requiring transitive inputs) and build the dependency graph from that.
Alternatively, at least for the foreseeable future, I'd expect us to still specify library level dependencies, and have the build system use those to trigger the module builds;
in that case, your source code would need to match what's written in your code, but that's already the case today.
On 02/06/2017 03:07 PM, 'Manuel Klimek' via SG2 - Modules wrote:
I thought my message was that the problems with scalability are modules that are too large :)
I suppose I misremembered, sorry. I recall you mentioned several problems with different approaches, but it is not possible to search/skim the information in the video. I thought there was some problem of having to repeatedly cycle through all module files many times. Thanks for clarifying. Could you provide some of that information in a more raw form?
Nevertheless, I'm sure when people get their hands on this they will want to take many different approaches to solve the problems that occur, including along the spectrum from few large modules to many small ones.
Also note that since that talk we've brought build times down significantly by automatically pruning unneeded modules (as exposed by the .d file of the modules build). Something like that would need to be implemented by CMake modules support for extra speed-ups for rebuilds.
I think you did mention that in the talk already actually.
Note though that this requires knowing/hardcoding the output files which will be created by the command.
Don't we do the exact same thing for object files today?
I don't know. Is it? Do our source files contain the name of the object file to create? Or perhaps: does the source file contain information about how the linker should refer to the object file? I really don't know here. I'm no linker expert.
It seems odd/redundant/potentially problematic to me that the module name is specified in the source. I'm trying to convince someone to provide some rationale for that so that I can understand (and preferably help add modules to the git repo I posted so that we can experiment).
Similarly to "C++ source file generates object file", I'd expect we'll have "C++ module file generates module output file".
... "which must be referred to by tokens specified in the C++ module file". That is the part that I'm not aware of the precedence or rationale for, and I'm not sure if it's relevant to buildsystems. Maybe it's not relevant for them. I don't have the information.
At least today GCC can output the used header files in a Makefile compatible format (and Ninja can read those files too). I suppose they would need the same feature to output a list of used modules. Just something to think about as part of the impact of this.
clang for example already does that.
Interesting. As far as I was aware, clang doesn't support the 'import' syntax at all yet.
Is the feature of generating the list of imported modules documented on
https://clang.llvm.org/docs/Modules.html
or elsewhere yet?
Also, one of the things Manuel Klimek mentioned in his talk is that they hit command-line length limits in the buildsystem due to specifying all of the module files.
That is a very clang specific issue, though, and really unrelated to standardization issues :)
We're not discussing standardization anyway (there's nothing standardized about buildsystems, compiler switches, or file-name conventions).
If, as Gaby recommends, all used modules should be specified on the compile command line with MSVC, why do you call it a clang issue? Am I missing something here?
Steve:
>> What is the impact of Modules on buildsystems?
This is a good question.
The Windows engineering team upgraded its existing build system to include infrastructure for using modules. From what I understand, it was a smooth upgrade.
What I would find interesting is whether they need to parse the 'module modulename;' content from the source files, or whether they always need to specify the ifc output file name in the buildsystem when processing the ixx file (and if that is the case - what the value of the 'module modulename;' content is).
That is all very MS specific terminology.
What is? I don't know what part of the quote you are referring to. Parts of it are from the Modules TS, and I don't think file extensions are so relevant? (I'm making a guess about what your remark means.)
Generally, I'd expect that build systems could, in a modules world, be able to parse the C++ code that specifies the module and its dependencies (for example by using the compiler with a switch that will run in reduced mode, similar to preprocessing, but without requiring transitive inputs) and build the dependency graph from that.
This is the kind of input I was looking for with this thread. This is why I tried to ask a very open question, so that people like yourself can share the experience you have with modules, and your conclusions about how buildsystems and compiler drivers are affected.
Alternatively, at least for the foreseeable future, I'd expect us to still specify library level dependencies, and have the build system use those to trigger the module builds;
Can you give an example?
in that case, your source code would need to match what's written in your code, but that's already the case today.
I'm having trouble parsing "your source code would need to match what's written in your code". What do you mean?
On 6 February 2017 at 13:31, Stephen Kelly <stev...@gmail.com> wrote:
Nevertheless, I'm sure when people get their hands on this they will want to take many different approaches to solve the problems that occur, including along the spectrum from few large modules to many small ones.
This doesn't seem fundamentally different from people wanting anything on the spectrum from a large number of small header files to a small number of huge header files today.
And as with that choice, one of the things they may want to consider is the effect on their build performance when different parts of their codebase change (larger headers or modules will typically mean that changes to that interface cause more downstream targets to recompile).
Note though that this requires knowing/hardcoding the output files which will be created by the command.
Don't we do the exact same thing for object files today?
I don't know. Is it? Do our source files contain the name of the object file to create? Or perhaps: does the source file contain information about how the linker should refer to the object file? I really don't know here. I'm no linker expert.
It seems odd/redundant/potentially problematic to me that the module name is specified in the source. I'm trying to convince someone to provide some rationale for that so that I can understand (and preferably help add modules to the git repo I posted so that we can experiment).
Are you assuming that the name of the module output file would need to be in some way related to the name of the module in the source? I don't see any reason to assume that. If you want to build module Foo in bar.cppm to baz.pcm, I would expect that to work, just as if you wanted to build a definition of class Foo in bar.cpp to baz.o (although I would question the wisdom of some of those choices).
Also, one of the things Manuel Klimek mentioned in his talk is that they hit command-line length limits in the buildsystem due to specifying all of the module files.
That is a very clang specific issue, though, and really unrelated to standardization issues :)
If, as Gaby recommends, all used modules should be specified on the compile command line with MSVC, why do you call it a clang issue? Am I missing something here?
Some implementations (clang included) support the ability to pass command-line arguments via a "response file" instead of as actual command-line arguments. Command-line argument length limits vary between operating systems. An implementation could support a way for a module file to suggest the location of another module file if it's not explicitly specified (and clang supports such a mechanism).
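For example, with a driver that supports response files (the file name and module flags below are illustrative; clang's -fmodule-file= is one real spelling):

```
$ cat modules.rsp
-fmodule-file=QtCore.pcm
-fmodule-file=QtWidgets.pcm
$ clang++ -c app.cpp @modules.rsp
```

The @file argument sidesteps the OS command-line length limit, since the driver reads the remaining arguments from the named file.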
On 02/06/2017 09:45 PM, 'Richard Smith' via SG2 - Modules wrote:
This doesn't seem fundamentally different from people wanting anything on the spectrum from a large number of small header files to a small number of huge header files today.
I suppose. This is a topic because it is not clear to me what approach Qt should take to modules and what impact that has on the buildsystems of Qt users, for example. I was more specific about that in a previous email, including the requirement to make a change in the buildsystem every time I add 'import QPushButton' to my application's cpp file. See my previous email for more.
Manuel mentioned in his talk that Google requires specifying all headers in use anyway, so I guess it's no big change for you, but many projects don't do that. If that is needed, I wonder if it would harm adoption. Manuel mentioned that perhaps compilers could learn to parse the import statements and present that information to the buildsystem, but that also requires implementation by buildsystem implementors, which becomes the next adoption bottleneck.
Really, this whole thread for me is about trying to find out what impediments there are to C++ modules being successful, and what assumptions are being made about how tooling will make them successful. I know modules have been made to work for some large codebases, but I don't know how that will generalize to the rest of the world.
The more I think about it, the more I think one module per class wouldn't work well for Qt. To prevent requiring the user to specify the path to all module files that they use in their buildsystem, we would probably make Qt simply put all module files from, say, all classes in QtWidgets on your compile line if you use QtWidgets. Then you could 'import QPushButton' without changing the buildsystem, but there would be no advantage to having one C++ module per class. Instead, I think one C++ module per Qt library would be the only thing that makes sense.
Perhaps build performance is the thing that would cause it to swing the other way; I don't know.
Then the question remains how to specify such a C++ module for a Qt library in a maintainable way.
A separate issue is that, as compiled modules will probably not be distributable, the buildsystem will have to compile all the module files it depends on. I don't have any sense of how long it would take to compile module files for all classes in QtCore, QtGui and QtWidgets (or QtCore, QtGui, QtQml and QtQuick) just to build your first hello world Qt program.
And as with that choice, one of the things they may want to consider is the effect on their build performance when different parts of their codebase change (larger headers or modules will typically mean that changes to that interface cause more downstream targets to recompile).

Also note that since that talk we've brought build times down significantly by automatically pruning unneeded modules (as exposed by the .d file of the modules build). Something like that would need to be implemented by CMake's modules support for extra speed-ups for rebuilds.
I think you did mention that in the talk already actually.
Note though that this requires knowing/hardcoding the output files which will be created by the command.
Don't we do the exact same thing for object files today?
I don't know. Is it? Do our source files contain the name of the object file to create? Or perhaps: does the source file contain information about how the linker should refer to the object file? I really don't know here. I'm no linker expert.
It seems odd/redundant/potentially problematic to me that the module name is specified in the source. I'm trying to convince someone to provide some rationale for that so that I can understand (and preferably help add modules to the git repo I posted so that we can experiment).
Are you assuming that the name of the module output file would need to be in some way related to the name of the module in the source? I don't see any reason to assume that. If you want to build module Foo in bar.cppm to baz.pcm, I would expect that to work, just as if you wanted to build a definition of class Foo in bar.cpp to baz.o (although I would question the wisdom of some of those choices).

Ah, ok - that's the analogy.
That's why all the compiled module files need to be specified on the compile line of any translation unit using them. I'm assuming all files in the transitive closure need to be specified. I suppose that transitive closure would be computed by the buildsystem and mostly hidden from the user (as the particular names of object files generally are today). At least in the blog at
https://blogs.msdn.microsoft.com/vcblog/2015/12/03/c-modules-in-vs-2015-update-1/
there is an example of using a search directory and not specifying the module file name on the compile line. The 'import M' in bar.cpp still works. That can only work if either some correlation from module name to file name is assumed by the compiler, or the compiler simply pre-loads all modules in that directory regardless of file name and makes the module names available.
If that could be expected to work, CMake could compile module files into some internal directory within the build directory and simply pass the path to it to all compiles (CMake would still need to know somehow which module files to compile in that way though).
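As a sketch of what "the transitive closure would be computed by the buildsystem" could mean in practice (the module names, .pcm suffix, and flag spelling below are illustrative assumptions, not any tool's actual interface):

```python
def module_closure(roots, deps):
    """Collect the transitive closure of module dependencies.

    deps maps a module name to the set of modules it directly imports;
    the buildsystem would put a compiled module file for every member
    of the closure on the compile line of an importing translation unit.
    """
    seen, stack = set(), list(roots)
    while stack:
        m = stack.pop()
        if m not in seen:
            seen.add(m)
            stack.extend(deps.get(m, ()))
    return seen

# Illustrative dependency graph (hypothetical module layout):
deps = {"QtWidgets": {"QtGui"}, "QtGui": {"QtCore"}, "QtCore": set()}
closure = module_closure({"QtWidgets"}, deps)
flags = sorted(f"-fmodule-file={m}.pcm" for m in closure)
```

The user would only write 'import QtWidgets'; the two transitive entries are the part the buildsystem has to supply silently.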
Also, one of the things Manuel Klimek mentioned in his talk is that they hit command-line length limits in the buildsystem due to specifying all of the module files.
That is a very clang specific issue, though, and really unrelated to standardization issues :)

If, as Gaby recommends, all used modules should be specified on the compile command line with MSVC, why do you call it a clang issue? Am I missing something here?
Some implementations (clang included) support the ability to pass command-line arguments via a "response file" instead of as actual command-line arguments. Command-line argument length limits vary between operating systems. An implementation could support a way for a module file to suggest the location of another module file if it's not explicitly specified (and clang supports such a mechanism).
I still don't see anything 'very clang-specific' about this, so I guess I'm still missing something :). Doesn't seem important though.
On Tue, Feb 7, 2017 at 12:04 AM Stephen Kelly <stev...@gmail.com> wrote:
A separate issue is that, as compiled modules will probably not be distributable, the buildsystem will have to compile all the module files it depends on. I don't have any sense of how long it would take to compile module files for all classes in QtCore QtGui and QtWidgets (or QtCore QtGui QtQml and QtQuick) just to build your first hello world Qt program.
I think it's too early to really say how Qt should be laid out. It'll depend a lot on how specific compilers will need to have modules presented to them.
For example, we went back-and-forth multiple times between specifying all transitive modules on the command line (less non-parallelizable work for the build system) vs only handing in top-level modules (that is, not specifying modules on the command line that are in the transitive dependencies of a different module in the set of transitively used modules).
Currently, we are back to only specifying top-level modules, as that means clang can figure out which modules are actually used, writes only the used modules to the .d file, and we can use that information to prune the dependency graph of builds depending on that module (yea, it's somewhat complex, and we actually do an include-scanning step where we try to figure out which headers, and thus modules, are reachable at all from a source file).
A different consideration is that for large continuously integrated code bases, the triggered rebuilds by a header change in a core library are much more of a problem than for a user of Qt, for example, who will probably stick with a single version for a longer while.
If that could be expected to work, CMake could compile module files into some internal directory within the build directory and simply pass the path to it to all compiles (CMake would still need to know somehow which module files to compile in that way though).
I think a simple modules support in CMake could look this way: If you add a C++ library, you can specify interface h1.hm h2.hm h3.hm (no idea how module files will be called :), which would build a module out of these headers / C++ module definitions. The module will be passed to all libraries in the reverse transitive dependency closure. For clang, for example, we could also currently pass (for backwards compatibility) h.cppmap in interface, which will only be needed for the transitional period, and compile a module out of plain C++ headers.
That way, you could get incremental compilation benefits early, and we could get real world experience on modules builds for projects that are happy to be early testers.
On 02/07/2017 09:32 AM, 'Manuel Klimek' via SG2 - Modules wrote:
On Tue, Feb 7, 2017 at 12:04 AM Stephen Kelly <stev...@gmail.com> wrote:
A separate issue is that, as compiled modules will probably not be distributable, the buildsystem will have to compile all the module files it depends on. I don't have any sense of how long it would take to compile module files for all classes in QtCore QtGui and QtWidgets (or QtCore QtGui QtQml and QtQuick) just to build your first hello world Qt program.
I think it's too early to really say how Qt should be laid out. It'll depend a lot on how specific compilers will need to have modules presented to them.
The issue of what the interface to modules is, and how libraries such as (but not limited to) Qt and their users and buildsystems will make use of them, seems quite interdependent to me.
It is also interdependent with buildsystems. The Ninja buildsystem cannot currently build Fortran code because the way that Fortran modules are specified is not compatible with how Ninja currently works:
https://groups.google.com/forum/#!topic/ninja-build/tPOcu5EWXio
I don't know how fixable that is, but I think that's an important conversation, and one which is inseparable from the design of modules.
I think, rather than the question in the email title here, a better open question is:
What needs to happen in tooling and popular libraries for C++ modules to become a success?
That can include questions such as: the impact of naming a module in the source; the features Manuel mentioned would need to be added to all compilers, such as an import/module/export scanning mode; buildsystems being ported to use such a mode; the extra steps buildsystems may have to explicitly take where today 'compile and link' gets you most of the way, and how that will be exposed; what people who maintain the buildsystem for their project have to do; what the transition looks like; etc.
Currently, from what I read on reddit at least, people seem to think that C++ modules will be added to the standard, magic will happen, and then they will achieve nirvana. What needs to be done to make that true?
The modules proposal seems more interdependent with tooling and how libraries maintain/define their interface than any other C++ feature I'm aware of since C++11.
For example, we went back-and-forth multiple times between specifying all transitive modules on the command line (less non-parallelizable work for the build system) vs only handing in top-level modules (that is, not specifying modules on the command line that are in the transitive dependencies of a different module in the set of transitively used modules).
Not specifying all transitive (public) dependencies on the command line will only work if all module files are in well-known locations or the same location, right?
IOW, if Lib1 depends (publicly) on Lib2 and Exe depends on Lib1, you specify only the full path to the module file for Lib1 when compiling Exe.cpp. How is the module file for Lib2 found if it is in some other random location? Is the path hardcoded in the Lib1 module file?
Currently, we are back to only specifying top-level modules, as that means clang can figure out which modules are actually used, writes only the used modules to the .d file, and we can use that information to prune the dependency graph of builds depending on that module (yea, it's somewhat complex, and we actually do an include-scanning step where we try to figure out which headers, and thus modules, are reachable at all from a source file).
Whatever complexity you hit regarding your buildsystem with modules, every other buildsystem will eventually hit too.
A different consideration is that for large continuously integrated code bases, the triggered rebuilds by a header change in a core library are much more of a problem than for a user of Qt, for example, who will probably stick with a single version for a longer while.
That depends on whether you are a developer working on Qt :).
Anyway, that's just an example. Most large codebases I've seen have the code separated into multiple libraries, not unlike the way Qt is structured. They will want to use modules and will hit the same issues we hit with Qt and in the same way.
If that could be expected to work, CMake could compile module files into some internal directory within the build directory and simply pass the path to it to all compiles (CMake would still need to know somehow which module files to compile in that way though).
I think a simple modules support in CMake could look this way: If you add a C++ library, you can specify interface h1.hm h2.hm h3.hm (no idea how module files will be called :), which would build a module out of these headers / C++ module definitions. The module will be passed to all libraries in the reverse transitive dependency closure. For clang, for example, we could also currently pass (for backwards compatibility) h.cppmap in interface, which will only be needed for the transitional period, and compile a module out of plain C++ headers.
You use the singular 'a module' for a library, whereas other discussion was a recommendation to maintain one module per class in the library. I'm having trouble juggling all of the different options and what impact they have on libraries and buildsystems.
What you describe above is similar to what I had in mind at the beginning which is something akin to one module per library.
That way, you could get incremental compilation benefits early, and we could get real world experience on modules builds for projects that are happy to be early testers.
I'm sure there are many willing. I'm sure that will bring up even more questions too.
Thanks,
Steve.
On Wed, Feb 8, 2017 at 1:16 AM Stephen Kelly <stev...@gmail.com> wrote:
The modules proposal seems more interdependent with tooling and how libraries maintain/define their interface than any other C++ feature I'm aware of since C++11.
If I'm not mistaken, the modules TS currently still has open discussion points like "will modules support macros", "will modules support legacy #includes", etc. I expect that we'll only be able to answer your questions fully once that is settled. If you want to have influence on this, you'll probably need to get involved in the standardization process :)