This is an archive of the discontinued LLVM Phabricator instance.

[Driver] [C++20] [Modules] Support -fmodule-output= (2/2)
ClosedPublic

Authored by ChuanqiXu on Oct 31 2022, 12:55 AM.

Details

Summary

Successor of D137058. The intention is described as the second step in D137058. This is helpful if build systems want to generate these output files somewhere other than the location given by -o or the directory where the input file lives.


The discourse discussion is at: https://discourse.llvm.org/t/make-command-line-support-for-c-20-module-uniform-with-gcc/59144
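
As a rough illustration of the summary above, the following sketch assembles a one-phase clang++ command line that writes the object file and the module file to different directories. The helper name, file names and directory layout are hypothetical; the only compiler options used are -std=c++20, -c, -o and the -fmodule-output= option this revision adds.

```python
import shlex


def one_phase_command(source, obj, pcm):
    """Build one clang++ invocation that emits an object file and a module file.

    `source`, `obj` and `pcm` are hypothetical paths chosen by the build system.
    """
    return [
        "clang++",
        "-std=c++20",
        "-c", source,
        "-o", obj,                   # where the object file goes
        f"-fmodule-output={pcm}",    # where the .pcm goes, independent of -o
    ]


cmd = one_phase_command("Hello.cppm", "out/obj/Hello.o", "out/bmi/Hello.pcm")
print(shlex.join(cmd))
```

The point of the separate option is that the .pcm location no longer has to be derived from the -o path or from the input file's directory.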

Event Timeline

ChuanqiXu created this revision. Oct 31 2022, 12:55 AM
Herald added a project: Restricted Project. Oct 31 2022, 12:55 AM
ChuanqiXu requested review of this revision. Oct 31 2022, 12:55 AM
ChuanqiXu retitled this revision from [Driver] [Modules] Introduce -fsave-std-c++-module-file= to specify the path of the module file to [Driver] [Modules] Introduce -fsave-std-c++-module-file= to specify the path of the module file (2/2).

Could you link to the email/discourse discussion about supporting this mode (I think you've linked it in other discussions, be good to have it for reference here & Probably in the other review)? (I'm wondering if we need a new flag for this, or if it'll be OK to change the driver behavior to always coalesce the .cppm->.pcm->.o path into a single step, for instance - I realize this is a somewhat breaking change but may be acceptable given that modules aren't widely deployed yet)

Done. From my reading of that Discourse discussion, we were not talking about adding new flags. I added the flag because I don't want the .pcm file to pollute the user's space accidentally.

if it'll be OK to change the driver behavior to always coalesce the .cppm->.pcm->.o path into a single step

I am not sure what you mean. Are you talking about forbidding the original 2-phase compilation model? If so, I think it is definitely the wrong direction. The 2-phase compilation model should be the correct direction in the long term since it has higher parallelism.

iains added a comment. Nov 1 2022, 12:49 AM

I am not convinced about this second point as motivation for this direction; it comes with some significant resource tradeoffs (compared with the proposed [near] future version of producing the PCM and the object from one invocation of the FE):

  • it requires multiple instantiations of the FE
  • it blocks the objective of reducing the content of module interfaces (so that they only contain the information that pertains to the interface) - since requiring source -> pcm, pcm -> object means that the PCM has to contain all the information necessary to generate the object.
  • in terms of parallelism, the interface PCM has to be generated and distributed - the parsing and serialisation has to be complete before the PCM can be distributed; that process is the same regardless of whether the FE invocation also produces an object.

So, I would suggest that we move to a single invocation of the compiler to produce the PCM and object as the default; if the user has a specific reason to want to do the two jobs separately then they could still do so ( -fmodule-only / --precompile ) at the expense of two invocations as now.

(so that they only contain the information that pertains to the interface)

No, we can't do this. It hurts the performance.

it requires multiple instantiations of the FE

Agreed. But if we care about this, I think it may be best to allow only the current 2-phase compilation model and to forbid compiling a module unit directly to an object file. This is the cleanest approach.

in terms of parallelism, the interface PCM has to be generated and distributed - the parsing and serialisation has to be complete before the PCM can be distributed; that process is the same regardless of whether the FE invocation also produces an object.

I don't think distribution is relevant to parallelism. By parallelism I mean that, for scan-based build systems using one-phase compilation, the compilation of A must wait until the module B it depends on has been compiled all the way to an object file, which is significantly worse than with 2-phase compilation.
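
The parallelism claim above can be made concrete with a toy model. This is only a sketch under stated assumptions: the durations are invented placeholders rather than measurements, the function name is hypothetical, workers are unlimited, and in the one-phase case the build system is assumed to treat the BMI as available only once the whole invocation has finished.

```python
def last_object_time(chain_length, parse, codegen, two_phase):
    """Time at which the last object file in an import chain is ready.

    Module i imports module i-1. `parse` is the (hypothetical) cost of
    producing a BMI and `codegen` the (hypothetical) cost of producing
    the object file afterwards.
    """
    bmi_ready = 0      # when the previous module's BMI becomes usable
    last_object = 0
    for _ in range(chain_length):
        start = bmi_ready
        if two_phase:
            bmi_ready = start + parse        # BMI is its own build step
            obj = bmi_ready + codegen        # object is built from the BMI
        else:
            obj = start + parse + codegen    # one step produces both files
            bmi_ready = obj                  # importer waits for the whole step
        last_object = max(last_object, obj)
    return last_object


two = last_object_time(chain_length=2, parse=3, codegen=5, two_phase=True)
one = last_object_time(chain_length=2, parse=3, codegen=5, two_phase=False)
print(two, one)
```

The model deliberately ignores the cost of a second front-end invocation and of reloading the BMI, which are the trade-offs raised earlier in the thread, so it shows the shape of the argument rather than settling it.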


So, I would suggest that we would move to a single invocation of the compiler to produce the PCM and object as the default;

So the question is where the output should go, and whether we should offer an option that lets the user specify that location. This question is discussed in https://reviews.llvm.org/D137058.

iains added a comment. Nov 1 2022, 1:26 AM

Not sure what you mean here; If there is only one user of a PCM then it does not need to be produced (waste of disk space and CPU cycles);
If there are many uses of it (as we might expect in a massively parallel distributed build system) then distributing the PCM is important and its availability predicates progress of other builds - from previous discussions in WG21 there are users that care very much about the size of distributed artefacts.


So, I would suggest that we would move to a single invocation of the compiler to produce the PCM and object as the default;

So the question would be where is the destination place? And if we would offer an option to allow the user to specify the place? This question is discussed in https://reviews.llvm.org/D137058.

Having a mechanism to specify the place for the file is fine by me ( I was only commenting on the motivation point for separate pcm and object phases ).

(I think we should move this discussion somewhere else, again - unless it is considered a key factor in deciding on this patch, I have no further comments).

Yeah, agreed. Let's avoid repeating ourselves :)

There is another motivating factor for 1-phase: the build graph is far simpler. With 2-phase, CMake will have to write out rules to perform:

  • source -> .bmi
  • .bmi -> .withbmi.o
  • source -> .o

because we do not know if a BMI is needed or not. If it isn't, we use the last rule. If it is, we use the first two. Note that this also means we need 2 different .o filenames (as neither make nor ninja supports multiple rules making the same output). This also means that the collator needs to generate a response file for the linker to direct which .o file to use for each TU based on the contents.
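
The rule sets described above can be sketched as plain data. This is an illustration only: the Rule type, the helper names and the .cpp source extension are hypothetical, and the one-phase graph is assumed to be a single rule that emits both files.

```python
from collections import Counter
from typing import NamedTuple, Tuple


class Rule(NamedTuple):
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]


def two_phase_rules(stem):
    """The three rules listed above for one source file."""
    return [
        Rule((f"{stem}.cpp",), (f"{stem}.bmi",)),
        Rule((f"{stem}.bmi",), (f"{stem}.withbmi.o",)),
        Rule((f"{stem}.cpp",), (f"{stem}.o",)),
    ]


def one_phase_rules(stem):
    """Assumed one-phase shape: a single rule with two outputs."""
    return [Rule((f"{stem}.cpp",), (f"{stem}.bmi", f"{stem}.o"))]


def clashing_outputs(rules):
    """Outputs claimed by more than one rule (not allowed in make or ninja)."""
    counts = Counter(out for rule in rules for out in rule.outputs)
    return sorted(out for out, n in counts.items() if n > 1)


print(len(two_phase_rules("a")), len(one_phase_rules("a")))
print(clashing_outputs(two_phase_rules("a")))
```

If the second and third two-phase rules both wrote `a.o`, `clashing_outputs` would report it, which is the reason the thread mentions needing a separate `.withbmi.o` name.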

Also with 2-phase, it is an open question of whether it actually helps with distributed builds (or anywhere process execution and I/O are expensive compared to some minimal work unit such as, say, Windows compiling from a network drive). Since this is not a bright line, giving the option to say "I know that split BMI is better for me in this instance" and "please combine here" would be handy (depending on actual real-world perf results on real-world projects). Yes, this is a chicken-and-egg cycle :) .

In my mind, it is OK for CMake to support only the one-phase compilation model in the short term, and the fact that clang also supports 2-phase compilation wouldn't affect CMake. Do I understand correctly? I mean, if 2-phase compilation doesn't affect CMake, CMake should be able to ignore it. My thought is that there are many more build systems in the world, and I know many of them handle dependencies entirely by themselves instead of translating them into other build scripts such as make and ninja. So I think it would be good for the compiler to keep different possibilities open for different build systems (or tools).

Indeed. Even if everything supports 2-phase, I suspect there are cases where 1-phase might still be better. But again, this needs real world numbers and testing to actually perform (more because of build graph shapes than individual TU timings).

dblaikie added a subscriber: rsmith. Nov 3 2022, 3:48 PM

I /think/ from @rsmith's comments in the discourse thread, we're more likely to skip/remove the ability to go from ".bmi" -> ".o" and possibly have 2 path options (this is all from @rsmith's comments on discourse): either ".cppm -> {.pcm, .o}" or ".cppm -> .o" + ".cppm -> .pcm" - this'd avoid the need to maintain full vs. slim PCMs; there would never be a .pcm that could produce a .o, and a .pcm would only be sufficient for users, not for the implementation.

But yeah, maybe we end up with all 3 options in the interim. Though I'd really like to keep the surface area as small as possible, while still allowing room for experimentation. Perhaps experimentation via -Xclang flags until data shows the options are worthwhile beyond those experiments.

Pulling in your (Ben) comment from D137058:

Plus the other compilers offer controls over it; why does Clang have to be different?

Which compilers/flags are you referring to? Arguments from compatibility with GCC are relatively easy to make (though I still have more hesitance for these flags since there's not wide-scale adoption, and I think there's still room to shape the world we want to see and limit the width of the interface/variations we end up having to support long term) & might side-step some of the discussions here.

ChuanqiXu updated this revision to Diff 475664. Nov 15 2022, 7:11 PM

Use tests with -###

@ben.boeckel

I'm still curious about the details of other compilers - I think from the sounds of it, @iains suggested GCC doesn't support this yet so we'll need to pick/name the flag ourselves & he's happy to implement whatever we pick? I guess Microsoft's flag naming is sufficiently differently styled as to offer no useful inspiration? Though it wouldn't hurt to know what they name it.

Any other examples you had in mind, Ben?

GCC supports naming the output file by asking the "module mapper" where a module with a given name lives (also used for finding imported modules). MSVC uses the -ifcOutput flag to specify where to write any exported module data to. See this CMake code which handles the "module mapping" for the various compilers: https://gitlab.kitware.com/cmake/cmake/-/blob/master/Source/cmCxxModuleMapper.cxx

iains added a comment. Edited Dec 5 2022, 8:55 PM

This is using the 'P1184' interface for both tasks - which I think we should keep as a separate mechanism (at least mentally) since we can also use that with clang when we implement it. What the interface returns for the name (via 'P1184') is decoupled from how the name is determined (potentially by a command line argument or from some other build system component).

Currently, GCC does not have a command line spelling for specifying the output module name, which is why I say it's still "up for grabs" (actually it would be polite to ask on gcc@gcc.gnu.org for opinions on the spelling, since it would be crazy to have different spellings on at least these two platforms). The spelling of command line options is not IMO bike shedding; it affects day-to-day use of the tools.

Alrighty - started a thread here: https://gcc.gnu.org/pipermail/gcc/2022-December/240239.html

ChuanqiXu updated this revision to Diff 481504. Dec 8 2022, 7:32 PM
ChuanqiXu retitled this revision from [Driver] [Modules] Introduce -fsave-std-c++-module-file= to specify the path of the module file (2/2) to [Driver] [C++20] [Modules] Support -fmodule-output= (2/2).

Rename the option to -fmodule-output according to the discussion from https://gcc.gnu.org/pipermail/gcc/2022-December/240239.html

iains accepted this revision. Dec 9 2022, 2:07 AM

this LGTM ( but please wait for an ack from @dblaikie ) and again thanks for patience in seeing this through.

This revision is now accepted and ready to land. Dec 9 2022, 2:07 AM
chfast added a subscriber: chfast. Dec 11 2022, 5:55 AM
ChuanqiXu updated this revision to Diff 481978. Dec 11 2022, 7:29 PM

Update after D137058 updated.

Nathan had a few questions in the cross-project flag naming thread - could you check/reply to those? I think the right answer is probably "do whatever -o does" but would be good to verify that behavior makes sense, maybe explicitly test it in some way if it's significant enough/requires any work to support getting that behavior? (if it falls out naturally from some existing file IO handling we have, I'm less worried about testing it separately)

dblaikie accepted this revision. Dec 12 2022, 1:15 PM

Oh, I guess those questions ^ are really more applicable to the previous patch in the series, that introduces writing the file out anyway - the ability to specify the name of the file probably has no impact/shouldn't have any impact on the answers to the questions, so this patch looks good - please consider those issues in the other patch.

Update as the dependent one changes.

Update since the dependent one changed.

ChuanqiXu updated this revision to Diff 483075. Dec 14 2022, 7:52 PM

Update since the dependent one changes.

Update since the dependent one changed.

ChuanqiXu updated this revision to Diff 489426. Jan 15 2023, 9:51 PM

Update since the dependent one changes.

This revision was landed with ongoing or failed builds. Jan 15 2023, 10:02 PM
This revision was automatically updated to reflect the committed changes.