Download Raw Diff

Details

Reviewers

dblaikie
ben.boeckel
iains
tschuett

Commits

rG3e9e8d6ef442: [Driver] [C++20] [Modules] Support -fmodule-output= (2/2)

Summary

Successor of D137058. The intention is described as the second step in D137058. This is helpful if the build systems want to generate these output files in other places which is not the same with -o specified or the input file lived.

The discourse discussion is at: https://discourse.llvm.org/t/make-command-line-support-for-c-20-module-uniform-with-gcc/59144

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

ChuanqiXu created this revision.Oct 31 2022, 12:55 AM

Herald added a project: Restricted Project. · View Herald TranscriptOct 31 2022, 12:55 AM

ChuanqiXu requested review of this revision.Oct 31 2022, 12:55 AM

Herald added a project: Restricted Project. · View Herald TranscriptOct 31 2022, 12:55 AM

Herald added subscribers: cfe-commits, MaskRay. · View Herald Transcript

Harbormaster completed remote builds in B195204: Diff 471914.Oct 31 2022, 12:56 AM

ChuanqiXu mentioned this in D134267: [C++] [Modules] Support one phase compilation model for named modules.Oct 31 2022, 12:58 AM

ChuanqiXu retitled this revision from [Driver] [Modules] Introduce -fsave-std-c++-module-file= to specify the path of the module file to [Driver] [Modules] Introduce -fsave-std-c++-module-file= to specify the path of the module file (2/2).

Could you link to the email/discourse discussion about supporting this mode (I think you've linked it in other discussions, be good to have it for reference here & Probably in the other review)? (I'm wondering if we need a new flag for this, or if it'll be OK to change the driver behavior to always coalesce the .cppm->.pcm->.o path into a single step, for instance - I realize this is a somewhat breaking change but may be acceptable given that modules aren't widely deployed yet)

h-vetinari added a subscriber: h-vetinari.Oct 31 2022, 6:04 PM

ChuanqiXu added a parent revision: D137058: [Driver] [Modules] Support -fmodule-output (1/2).Oct 31 2022, 7:20 PM

ChuanqiXu edited the summary of this revision. (Show Details)

In D137059#3896661, @dblaikie wrote:

Could you link to the email/discourse discussion about supporting this mode (I think you've linked it in other discussions, be good to have it for reference here & Probably in the other review)? (I'm wondering if we need a new flag for this, or if it'll be OK to change the driver behavior to always coalesce the .cppm->.pcm->.o path into a single step, for instance - I realize this is a somewhat breaking change but may be acceptable given that modules aren't widely deployed yet)

Done. From my reading, in that discourse discussing, we're not talking about to add the new flags. I add the flag since I don't want the .pcm file pollutes the user space accidentally.

if it'll be OK to change the driver behavior to always coalesce the .cppm->.pcm->.o path into a single step

I am not sure what you mean. Do you talk about to forbidden the original 2-phase compilation model? If so, I think it is definitely the wrong direction. The 2-phase compilation model should be the correct direction in the long term since it has higher parallelism.

In D137059#3898239, @ChuanqiXu wrote:

In D137059#3896661, @dblaikie wrote:

Could you link to the email/discourse discussion about supporting this mode (I think you've linked it in other discussions, be good to have it for reference here & Probably in the other review)? (I'm wondering if we need a new flag for this, or if it'll be OK to change the driver behavior to always coalesce the .cppm->.pcm->.o path into a single step, for instance - I realize this is a somewhat breaking change but may be acceptable given that modules aren't widely deployed yet)

Done. From my reading, in that discourse discussing, we're not talking about to add the new flags. I add the flag since I don't want the .pcm file pollutes the user space accidentally.

if it'll be OK to change the driver behavior to always coalesce the .cppm->.pcm->.o path into a single step

I am not sure what you mean. Do you talk about to forbidden the original 2-phase compilation model? If so, I think it is definitely the wrong direction. The 2-phase compilation model should be the correct direction in the long term since it has higher parallelism.

I am not convinced about this second point as motivation for this direction; it comes with some significant resource tradeoffs (compared with the proposed [near] future version of producing the PCM and the object from one invocation of the FE):

it requires multiple instantiations of the FE
it blocks the objective of reducing the content of module interfaces (so that they only contain the information that pertains to the interface) - since requiring source -> pcm, pcm -> object means that the PCM has to contain all the information necessary to generate the object.
in terms of parallelism, the interface PCM has to be generated and distributed - the parsing and serialisation has to be complete before the PCM can be distributed; that process is the same regardless of whether the FE invocation also produces an object.

So, I would suggest that we would move to a single invocation of the compiler to produce the PCM and object as the default; if the user has a specific reason to want to do the two jobs separately then thay could still do so ( -fmodule-only / --precompile ) at the expense of two invocations as now,

In D137059#3898463, @iains wrote:

In D137059#3898239, @ChuanqiXu wrote:

In D137059#3896661, @dblaikie wrote:

Could you link to the email/discourse discussion about supporting this mode (I think you've linked it in other discussions, be good to have it for reference here & Probably in the other review)? (I'm wondering if we need a new flag for this, or if it'll be OK to change the driver behavior to always coalesce the .cppm->.pcm->.o path into a single step, for instance - I realize this is a somewhat breaking change but may be acceptable given that modules aren't widely deployed yet)

Done. From my reading, in that discourse discussing, we're not talking about to add the new flags. I add the flag since I don't want the .pcm file pollutes the user space accidentally.

if it'll be OK to change the driver behavior to always coalesce the .cppm->.pcm->.o path into a single step

I am not sure what you mean. Do you talk about to forbidden the original 2-phase compilation model? If so, I think it is definitely the wrong direction. The 2-phase compilation model should be the correct direction in the long term since it has higher parallelism.

I am not convinced about this second point as motivation for this direction; it comes with some significant resource tradeoffs (compared with the proposed [near] future version of producing the PCM and the object from one invocation of the FE):

it requires multiple instantiations of the FE

it blocks the objective of reducing the content of module interfaces (so that they only contain the information that pertains to the interface) - since requiring source -> pcm, pcm -> object means that the PCM has to contain all the information necessary to generate the object.

in terms of parallelism, the interface PCM has to be generated and distributed - the parsing and serialisation has to be complete before the PCM can be distributed; that process is the same regardless of whether the FE invocation also produces an object.

So, I would suggest that we would move to a single invocation of the compiler to produce the PCM and object as the default; if the user has a specific reason to want to do the two jobs separately then thay could still do so ( -fmodule-only / --precompile ) at the expense of two invocations as now,

(so that they only contain the information that pertains to the interface)

No, we can't do this. It hurts the performance.

it requires multiple instantiations of the FE

Agreed. But if we care about this, I think it may be best to allow the current 2 phase compilation model only. And we forbid the compilation from module unit to object files directly. This is cleanest approach.

in terms of parallelism, the interface PCM has to be generated and distributed - the parsing and serialisation has to be complete before the PCM can be distributed; that process is the same regardless of whether the FE invocation also produces an object.

I think the distribution doesn't matter with parallelism. For parallelism, I mean, for the scan-based build systems, the compilation of A must wait until the dependent module B compiles to object files, which is significantly worse than the 2 phase compilation.

So, I would suggest that we would move to a single invocation of the compiler to produce the PCM and object as the default;

So the question would be where is the destination place? And if we would offer an option to allow the user to specify the place? This question is discussed in https://reviews.llvm.org/D137058.

In D137059#3898482, @ChuanqiXu wrote:

In D137059#3898463, @iains wrote:

In D137059#3898239, @ChuanqiXu wrote:

In D137059#3896661, @dblaikie wrote:

Could you link to the email/discourse discussion about supporting this mode (I think you've linked it in other discussions, be good to have it for reference here & Probably in the other review)? (I'm wondering if we need a new flag for this, or if it'll be OK to change the driver behavior to always coalesce the .cppm->.pcm->.o path into a single step, for instance - I realize this is a somewhat breaking change but may be acceptable given that modules aren't widely deployed yet)

Done. From my reading, in that discourse discussing, we're not talking about to add the new flags. I add the flag since I don't want the .pcm file pollutes the user space accidentally.

if it'll be OK to change the driver behavior to always coalesce the .cppm->.pcm->.o path into a single step

I am not sure what you mean. Do you talk about to forbidden the original 2-phase compilation model? If so, I think it is definitely the wrong direction. The 2-phase compilation model should be the correct direction in the long term since it has higher parallelism.

I am not convinced about this second point as motivation for this direction; it comes with some significant resource tradeoffs (compared with the proposed [near] future version of producing the PCM and the object from one invocation of the FE):

it requires multiple instantiations of the FE

it blocks the objective of reducing the content of module interfaces (so that they only contain the information that pertains to the interface) - since requiring source -> pcm, pcm -> object means that the PCM has to contain all the information necessary to generate the object.

in terms of parallelism, the interface PCM has to be generated and distributed - the parsing and serialisation has to be complete before the PCM can be distributed; that process is the same regardless of whether the FE invocation also produces an object.

So, I would suggest that we would move to a single invocation of the compiler to produce the PCM and object as the default; if the user has a specific reason to want to do the two jobs separately then thay could still do so ( -fmodule-only / --precompile ) at the expense of two invocations as now,

(so that they only contain the information that pertains to the interface)

No, we can't do this. It hurts the performance.

it requires multiple instantiations of the FE

Agreed. But if we care about this, I think it may be best to allow the current 2 phase compilation model only. And we forbid the compilation from module unit to object files directly. This is cleanest approach.

in terms of parallelism, the interface PCM has to be generated and distributed - the parsing and serialisation has to be complete before the PCM can be distributed; that process is the same regardless of whether the FE invocation also produces an object.

I think the distribution doesn't matter with parallelism. For parallelism, I mean, for the scan-based build systems, the compilation of A must wait until the dependent module B compiles to object files, which is significantly worse than the 2 phase compilation.

Not sure what you mean here; If there is only one user of a PCM then it does not need to be produced (waste of disk space and CPU cycles);
If there are many uses of it (as we might expect in a massively parallel distributed build system) then distributing the PCM is important and its availability predicates progress of other builds - from previous discussions in WG21 there are users that care very much about the size of distributed artefacts.

So, I would suggest that we would move to a single invocation of the compiler to produce the PCM and object as the default;

So the question would be where is the destination place? And if we would offer an option to allow the user to specify the place? This question is discussed in https://reviews.llvm.org/D137058.

Having a mechanism to specify the place for the file is fine by me ( I was only commenting on the motivation point for separate pcm and object phases ).

(I think we should move this discussion somewhere else, again - unless it is considered a key factor in deciding on this patch, I have no further comments).

Having a mechanism to specify the place for the file is fine by me ( I was only commenting on the motivation point for separate pcm and object phases ).

(I think we should move this discussion somewhere else, again - unless it is considered a key factor in deciding on this patch, I have no further comments).

Yeah, agreed. Let's avoid the repeating ourselves : )

There is another motivating factor for 1-phase: the build graph is far simpler. With 2-phase, CMake will have to write out rules to perform:

source -> .bmi
.bmi -> .withbmi.o
source -> .o

because we do not know if a BMI is needed or not. If it isn't we use the latter. If it is, we use the former. Note that this also means we need 2 different .o filenames (as neither make nor ninja doesn't support multiple rules making the same output). This also means that the collator needs to generate a response file for the linker to direct which .o file to use for each TU based on the contents.

Also with 2-phase, it is an open question of whether it actually helps with distributed builds (or anywhere process execution and I/O are expensive compared to some minimal work unit such as, say, Windows compiling from a network drive). Since this is not a bright line, giving the option to say "I know that split BMI is better for me in this instance" and "please combine here" would be handy (depending on actual real-world perf results on real-world projects). Yes, this is a chicken-and-egg cycle :) .

In D137059#3899339, @ben.boeckel wrote:

There is another motivating factor for 1-phase: the build graph is far simpler. With 2-phase, CMake will have to write out rules to perform:

source -> .bmi

.bmi -> .withbmi.o

source -> .o

because we do not know if a BMI is needed or not. If it isn't we use the latter. If it is, we use the former. Note that this also means we need 2 different .o filenames (as neither make nor ninja doesn't support multiple rules making the same output). This also means that the collator needs to generate a response file for the linker to direct which .o file to use for each TU based on the contents.

Also with 2-phase, it is an open question of whether it actually helps with distributed builds (or anywhere process execution and I/O are expensive compared to some minimal work unit such as, say, Windows compiling from a network drive). Since this is not a bright line, giving the option to say "I know that split BMI is better for me in this instance" and "please combine here" would be handy (depending on actual real-world perf results on real-world projects). Yes, this is a chicken-and-egg cycle :) .

In my mind, it is OK for CMake to support one-phase compilation model in the short term. And the fact that clang also supports the 2-phase compilation wouldn't affect CMake. Do I understand right? I mean, if the 2-phase compilation wouldn't affect CMake, CMake should be able to ignore it. My thought is that there are many more build systems in the world. And I know many of them would handle the dependency them self fully instead of translate the dependency to other build scripts like make and ninja. And I think it should be good for the compiler to remain different possibilities for different build systems (or tools).

In D137059#3904256, @ChuanqiXu wrote:

In my mind, it is OK for CMake to support one-phase compilation model in the short term. And the fact that clang also supports the 2-phase compilation wouldn't affect CMake. Do I understand right? I mean, if the 2-phase compilation wouldn't affect CMake, CMake should be able to ignore it. My thought is that there are many more build systems in the world. And I know many of them would handle the dependency them self fully instead of translate the dependency to other build scripts like make and ninja. And I think it should be good for the compiler to remain different possibilities for different build systems (or tools).

Indeed. Even if everything supports 2-phase, I suspect there are cases where 1-phase might still be better. But again, this needs real world numbers and testing to actually perform (more because of build graph shapes than individual TU timings).

dblaikie mentioned this in D137058: [Driver] [Modules] Support -fmodule-output (1/2).Nov 3 2022, 3:29 PM

In D137059#3899339, @ben.boeckel wrote:

There is another motivating factor for 1-phase: the build graph is far simpler. With 2-phase, CMake will have to write out rules to perform:

source -> .bmi

.bmi -> .withbmi.o

source -> .o

because we do not know if a BMI is needed or not. If it isn't we use the latter. If it is, we use the former. Note that this also means we need 2 different .o filenames (as neither make nor ninja doesn't support multiple rules making the same output). This also means that the collator needs to generate a response file for the linker to direct which .o file to use for each TU based on the contents.

I /think/ from @rsmith's comments in the discourse thread, we're more likely to skip/remove the ability to go from ".bmi" -> ".o" and possibly have 2 path options (this is all from @rsmith's comments on discourse) either ".cppm -> {.pcm, .o}" or ".cppm -> .o" + ".cppm -> .pcm" - this'd avoid the need to maintain full V slim pcm, there would never be a pcm that could produce a .o, .pcm would only be sufficient for users, not implementation.

But yeah, maybe we end up with all 3 options in the interim. Though I'd really like to keep the surface area as small as possible, while still allowing room for experimentation. Perhaps experimentation via -Xclang flags until data shows the options are worthwhile beyond those experiments.

Pulling in your (Ben) comment from D137058:

Plus the other compilers offer controls over it; why does Clang have to be different?

Which compilers/flags are you referring to? Arguments from compatibility with GCC are relatively easy to make (though I still have more hesitance for these flags since there's not wide-scale adoption, and I think there's still room to shape the world we want to see and limit the width of the interface/variations we end up having to support long term) & might side-step some of the discussions here.

ChuanqiXu mentioned this in D137527: [C++20] [Modules] [ClangScanDeps] Add ClangScanDeps support for C++20 Named Modules in P1689 format (2/4).Nov 6 2022, 10:17 PM

@ben.boeckel gentle ping~

Use tests with -###

Harbormaster completed remote builds in B197895: Diff 475664.Nov 15 2022, 7:48 PM

@dblaikie gentle ping~

@ben.boeckel

Plus the other compilers offer controls over it; why does Clang have to be different?

Which compilers/flags are you referring to? Arguments from compatibility with GCC are relatively easy to make (though I still have more hesitance for these flags since there's not wide-scale adoption, and I think there's still room to shape the world we want to see and limit the width of the interface/variations we end up having to support long term) & might side-step some of the discussions here.

I'm still curious what about the details of other compilers - I think from the sounds of it, @iains suggested GCC doesn't support this yet so we'll need to pick/name the flag ourselves & he's happy to implement whatever we pick? I guess Microsoft's flag naming is sufficiently differently styled as to offer no useful inspiration? Though wouldn't hurt to know what they name it.

Any other examples you had in mind, Ben?

ben.boeckel mentioned this in D137534: [C++20] [Modules] [ClangScanDeps] Allow clang-scan-deps to without specified compilation database in P1689 (3/4).Dec 1 2022, 2:26 PM

In D137059#3934448, @dblaikie wrote:

I'm still curious what about the details of other compilers - I think from the sounds of it, @iains suggested GCC doesn't support this yet so we'll need to pick/name the flag ourselves & he's happy to implement whatever we pick? I guess Microsoft's flag naming is sufficiently differently styled as to offer no useful inspiration? Though wouldn't hurt to know what they name it.

Any other examples you had in mind, Ben?

GCC supports naming the output file by asking the "module mapper" where a module with a given name lives (also used for finding imported modules). MSVC uses the -ifcOutput flag to specify where to write any exported module data to. See this CMake code which handles the "module mapping" for the various compilers: https://gitlab.kitware.com/cmake/cmake/-/blob/master/Source/cmCxxModuleMapper.cxx

In D137059#3973016, @ben.boeckel wrote:

In D137059#3934448, @dblaikie wrote:

I'm still curious what about the details of other compilers - I think from the sounds of it, @iains suggested GCC doesn't support this yet so we'll need to pick/name the flag ourselves & he's happy to implement whatever we pick? I guess Microsoft's flag naming is sufficiently differently styled as to offer no useful inspiration? Though wouldn't hurt to know what they name it.

Any other examples you had in mind, Ben?

GCC supports naming the output file by asking the "module mapper" where a module with a given name lives (also used for finding imported modules).

This is using the 'P1184' interface for both tasks - which I think we should keep as a separate mechanism (at least mentally) since we can also use that with clang when we implement it. What the interface returns for the name (via 'P1184') is decoupled from how the name is determined (potentially by a command line argument or from some other build system component).

Currently, GCC does not have a command line spelling for specifying the output module name, which is why I say it's still "up for grabs" (actually it would be polite to ask on gcc@gcc.gnu.org for opinions on the spelling, since it would be crazy to have different on at least these two platforms). The spelling of command line options is not IMO bike shedding, it affects day-to-day use of the tools.

In D137059#3973096, @iains wrote:

In D137059#3973016, @ben.boeckel wrote:

In D137059#3934448, @dblaikie wrote:

I'm still curious what about the details of other compilers - I think from the sounds of it, @iains suggested GCC doesn't support this yet so we'll need to pick/name the flag ourselves & he's happy to implement whatever we pick? I guess Microsoft's flag naming is sufficiently differently styled as to offer no useful inspiration? Though wouldn't hurt to know what they name it.

Any other examples you had in mind, Ben?

GCC supports naming the output file by asking the "module mapper" where a module with a given name lives (also used for finding imported modules).

This is using the 'P1184' interface for both tasks - which I think we should keep as a separate mechanism (at least mentally) since we can also use that with clang when we implement it. What the interface returns for the name (via 'P1184') is decoupled from how the name is determined (potentially by a command line argument or from some other build system component).

Currently, GCC does not have a command line spelling for specifying the output module name, which is why I say it's still "up for grabs" (actually it would be polite to ask on gcc@gcc.gnu.org for opinions on the spelling, since it would be crazy to have different on at least these two platforms). The spelling of command line options is not IMO bike shedding, it affects day-to-day use of the tools.

Alrighty - started a thread here: https://gcc.gnu.org/pipermail/gcc/2022-December/240239.html

Rename the option to -fmodule-output according to the discussion from https://gcc.gnu.org/pipermail/gcc/2022-December/240239.html

Harbormaster completed remote builds in B202134: Diff 481504.Dec 8 2022, 10:32 PM

this LGTM ( but please wait for an ack from @dblaikie ) and again thanks for patience in seeing this through.

This revision is now accepted and ready to land.Dec 9 2022, 2:07 AM

chfast added a subscriber: chfast.Dec 11 2022, 5:55 AM

Update after D137058 updated.

Harbormaster completed remote builds in B202480: Diff 481978.Dec 11 2022, 8:23 PM

Nathan had a few questions in the cross-project flag naming thread - could you check/reply to those? I think the right answer is probably "do whatever -o does" but would be good to verify that behavior makes sense, maybe explicitly test it in some way if it's significant enough/requires any work to support getting that behavior? (if it falls out naturally from some existing file IO handling we have, I'm less worried about testing it separately)

In D137059#3989856, @dblaikie wrote:

Nathan had a few questions in the cross-project flag naming thread - could you check/reply to those? I think the right answer is probably "do whatever -o does" but would be good to verify that behavior makes sense, maybe explicitly test it in some way if it's significant enough/requires any work to support getting that behavior? (if it falls out naturally from some existing file IO handling we have, I'm less worried about testing it separately)

Oh, I guess those questions ^ are really more applicable to the previous patch in the series, that introduces writing the file out anyway - the ability to specify the name of the file probably has no impact/shouldn't have any impact on the answers to the questions, so this patch looks good - please consider those issues in the other patch.

Update as the dependent one changes.

Harbormaster completed remote builds in B202758: Diff 482360.Dec 12 2022, 10:58 PM

Update since the dependent one changed.

ChuanqiXu added a child revision: D140001: [Driver] [C++20] [Modules] Don't emit unused-argument warning for ` -fmodule-output=` and `-fmodule-output`.Dec 14 2022, 1:47 AM

Harbormaster completed remote builds in B203036: Diff 482739.Dec 14 2022, 1:53 AM

Update since the dependent one changes.

Harbormaster completed remote builds in B203275: Diff 483075.Dec 14 2022, 8:29 PM

Update since the dependent one changed.

Harbormaster completed remote builds in B203549: Diff 483449.Dec 16 2022, 1:27 AM

Arthapz added a subscriber: Arthapz.Dec 16 2022, 8:07 AM

Update since the dependent one changes.

This revision was landed with ongoing or failed builds.Jan 15 2023, 10:02 PM