This is an archive of the discontinued LLVM Phabricator instance.

[Driver] Enable to use C++20 modules standalone by -fcxx-modules
ClosedPublic

Authored by ChuanqiXu on Feb 24 2022, 10:43 PM.

Details

Summary

This patch allows user to use C++20 module by -fcxx-modules. Previously, we could only use it under -std=c++20. Given that user could use C++20 coroutine standalonel by -fcoroutines-ts. It makes sense to offer an option to use C++20 modules without enabling C++20.

Diff Detail

Event Timeline

ChuanqiXu created this revision.Feb 24 2022, 10:43 PM
ChuanqiXu requested review of this revision.Feb 24 2022, 10:43 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 3 2022, 10:44 PM
iains added a subscriber: Restricted Project.Mar 4 2022, 1:05 AM
iains added a comment.Mar 4 2022, 1:10 AM
  1. I agree 100% that the driver needs to be able to process in "c++20 modules mode"; there is some handling of sources that should be changed accordingly.
  1. I believe that it is a general objective of the tooling folks (roughly SG15 participants) that C++20 modules should (eventually) be considered automatic for C++20+
  1. There is at least one request from tooling folks that there should be an option to disable modules (even when in C++20 mode); this is a practical measure to avoid the case that it is impossible to build a TU because of some potential modules-related bug ...
  1. IMO it becomes increasingly important to try and decouple the clang modules from C++20 modules as much as possible.

So .. I was going to suggest that we might introduce -fmodules= {none, clang, c++20, ...}
with defaults picked:

 fmodules => clang (i.e. the current meaning)

!fmodules && C++20+ => c++20 (i.e. the objective of (2) above
Where there are other flags that imply C++20 that can switch c++20 mode as well

otherwise the default would be "none"

.. this provides for (3) ... since -std=c++20 -fmodules=none could be used.

I do not have a patch for this proposal as of this time (my current patches assume (2) but do not meet the objective of (3))

iains added a comment.Mar 4 2022, 1:18 AM

of course, this meets your objective too - since you could put -fmodules=cxx20 (even without -std=c++20) ..

  1. I agree 100% that the driver needs to be able to process in "c++20 modules mode";

Yeah, not all the projects could be able to run in c++20 mode. We use -std=c++17 + -fcoroutines-ts for C++20 coroutine before.

there is some handling of sources that should be changed accordingly.

If I don't misunderstand, I guess we don't. Since all the code about C++20 modules are controlled by CPlusPlusModules variable. I think there wouldn't be codes controlled by Cpp20 variables.

  1. I believe that it is a general objective of the tooling folks (roughly SG15 participants) that C++20 modules should (eventually) be considered automatic for C++20+

If I understand correctly, clang would enable C++20 modules by default if C++20 is enabled.

  1. There is at least one request from tooling folks that there should be an option to disable modules (even when in C++20 mode); this is a practical measure to avoid the case that it is impossible to build a TU because of some potential modules-related bug ...

I think -fno-cxxmodules could be the option now.

  1. IMO it becomes increasingly important to try and decouple the clang modules from C++20 modules as much as possible.

100% agree. I think it should be beneficial to decouple them from command line, implementation and comments. (Now many comments are not precise. For example, when we talk about a Module. we are referring a module unit indeed. There are other examples.)

So .. I was going to suggest that we might introduce -fmodules= {none, clang, c++20, ...}
with defaults picked:

 fmodules => clang (i.e. the current meaning)

!fmodules && C++20+ => c++20 (i.e. the objective of (2) above
Where there are other flags that imply C++20 that can switch c++20 mode as well

otherwise the default would be "none"

.. this provides for (3) ... since -std=c++20 -fmodules=none could be used.

I do not have a patch for this proposal as of this time (my current patches assume (2) but do not meet the objective of (3))

The proposal is good to me though. I sent a patch (https://reviews.llvm.org/D113391) before to forbid the mixed use of clang modules and c++20 modules since it might be confusing if we use the combination -fmodules -std=c++20. But the comment shows that the current users of Clang Modules (mainly Google and Apple) wish a smooth switch from clang module to c++20 module. So I think the proposal which would break the current use cases would be not easy to land.
(This patch wouldn't break any use case I think).

iains added a comment.Mar 4 2022, 2:00 AM
  1. I agree 100% that the driver needs to be able to process in "c++20 modules mode";

Yeah, not all the projects could be able to run in c++20 mode. We use -std=c++17 + -fcoroutines-ts for C++20 coroutine before.

there is some handling of sources that should be changed accordingly.

If I don't misunderstand, I guess we don't. Since all the code about C++20 modules are controlled by CPlusPlusModules variable. I think there wouldn't be codes controlled by Cpp20 variables.

Actually, the comment was intending to refer to the driver specifically - since (for example) we have to disambiguate PCH jobs from C++20 Header Unit jobs - in the Front End, as of now we have been using CPlusPlus modules as the indicator for C++20 modules. There is still scope for confusion if that is set at the same time as alternates...

  1. I believe that it is a general objective of the tooling folks (roughly SG15 participants) that C++20 modules should (eventually) be considered automatic for C++20+

If I understand correctly, clang would enable C++20 modules by default if C++20 is enabled.

Yes, exactly, as is the case right now.

  1. There is at least one request from tooling folks that there should be an option to disable modules (even when in C++20 mode); this is a practical measure to avoid the case that it is impossible to build a TU because of some potential modules-related bug ...

I think -fno-cxxmodules could be the option now.

so long as this is decoupled from any clang modules meanings?

  1. IMO it becomes increasingly important to try and decouple the clang modules from C++20 modules as much as possible.

100% agree. I think it should be beneficial to decouple them from command line, implementation and comments. (Now many comments are not precise. For example, when we talk about a Module. we are referring a module unit indeed. There are other examples.)

It would be friendly to the user to reject command line options that are not appropriate to the "current modules mode" - since there are ≈ 60+ modules-related command line options it is already very easy to be confused. To do this, the driver needs to establish the "current modules mode" early (even before it builds jobs, since as noted above some jobs build differently depending on the assumed mode).

So .. I was going to suggest that we might introduce -fmodules= {none, clang, c++20, ...}
with defaults picked:

 fmodules => clang (i.e. the current meaning)

!fmodules && C++20+ => c++20 (i.e. the objective of (2) above
Where there are other flags that imply C++20 that can switch c++20 mode as well

otherwise the default would be "none"

.. this provides for (3) ... since -std=c++20 -fmodules=none could be used.

I do not have a patch for this proposal as of this time (my current patches assume (2) but do not meet the objective of (3))

The proposal is good to me though. I sent a patch (https://reviews.llvm.org/D113391) before to forbid the mixed use of clang modules and c++20 modules since it might be confusing if we use the combination -fmodules -std=c++20. But the comment shows that the current users of Clang Modules (mainly Google and Apple) wish a smooth switch from clang module to c++20 module. So I think the proposal which would break the current use cases would be not easy to land.
(This patch wouldn't break any use case I think).

Well, I think neither proposal breaks current use-cases - the selection of defaults is designed to make current command lines do exactly the same as they do now.

My comments are not intended as a blocker for your patch - but simply to offer a suggestion for a more generic handling of the same objective.

  1. I agree 100% that the driver needs to be able to process in "c++20 modules mode";

Yeah, not all the projects could be able to run in c++20 mode. We use -std=c++17 + -fcoroutines-ts for C++20 coroutine before.

there is some handling of sources that should be changed accordingly.

If I don't misunderstand, I guess we don't. Since all the code about C++20 modules are controlled by CPlusPlusModules variable. I think there wouldn't be codes controlled by Cpp20 variables.

Actually, the comment was intending to refer to the driver specifically - since (for example) we have to disambiguate PCH jobs from C++20 Header Unit jobs - in the Front End, as of now we have been using CPlusPlus modules as the indicator for C++20 modules. There is still scope for confusion if that is set at the same time as alternates...

Oh, I didn't see Header Unit into details. I'm mainly focus on named module now. From my personal point, I feel header unit and named module are really not the same in many aspects so that we could handle them separately.

  1. I believe that it is a general objective of the tooling folks (roughly SG15 participants) that C++20 modules should (eventually) be considered automatic for C++20+

If I understand correctly, clang would enable C++20 modules by default if C++20 is enabled.

Yes, exactly, as is the case right now.

  1. There is at least one request from tooling folks that there should be an option to disable modules (even when in C++20 mode); this is a practical measure to avoid the case that it is impossible to build a TU because of some potential modules-related bug ...

I think -fno-cxxmodules could be the option now.

so long as this is decoupled from any clang modules meanings?

Yes, fcxx-modules only refers to C++20 modules now.

  1. IMO it becomes increasingly important to try and decouple the clang modules from C++20 modules as much as possible.

100% agree. I think it should be beneficial to decouple them from command line, implementation and comments. (Now many comments are not precise. For example, when we talk about a Module. we are referring a module unit indeed. There are other examples.)

It would be friendly to the user to reject command line options that are not appropriate to the "current modules mode" - since there are ≈ 60+ modules-related command line options it is already very easy to be confused. To do this, the driver needs to establish the "current modules mode" early (even before it builds jobs, since as noted above some jobs build differently depending on the assumed mode).

Totally agreed. It would be very hard process to disambuguate them since there are already users..

So .. I was going to suggest that we might introduce -fmodules= {none, clang, c++20, ...}
with defaults picked:

 fmodules => clang (i.e. the current meaning)

!fmodules && C++20+ => c++20 (i.e. the objective of (2) above
Where there are other flags that imply C++20 that can switch c++20 mode as well

otherwise the default would be "none"

.. this provides for (3) ... since -std=c++20 -fmodules=none could be used.

I do not have a patch for this proposal as of this time (my current patches assume (2) but do not meet the objective of (3))

The proposal is good to me though. I sent a patch (https://reviews.llvm.org/D113391) before to forbid the mixed use of clang modules and c++20 modules since it might be confusing if we use the combination -fmodules -std=c++20. But the comment shows that the current users of Clang Modules (mainly Google and Apple) wish a smooth switch from clang module to c++20 module. So I think the proposal which would break the current use cases would be not easy to land.
(This patch wouldn't break any use case I think).

Well, I think neither proposal breaks current use-cases - the selection of defaults is designed to make current command lines do exactly the same as they do now.

My comments are not intended as a blocker for your patch - but simply to offer a suggestion for a more generic handling of the same objective.

Got it, thanks. My objective is relatively smaller. I just want to enable the use of C++20 modules for actual projects. (I heard that there are projects couldn't upgrade to C++20 due to a ABI break in std::string). Given the command line design of coroutines, I think this might be acceptable.

urnathan resigned from this revision.Mar 9 2022, 10:54 AM
iains added a comment.Mar 15 2022, 1:59 AM

I am still concerned that there is an expectation that. the fcxx-modules option is connected with clang modules.
.. see, for example:

https://github.com/llvm/llvm-project/blob/d90d45fc9029cc7dbb6d44798f51131df6b2eef1/clang/lib/Driver/ToolChains/Clang.cpp#L3579

and

https://github.com/llvm/llvm-project/blob/875782bd9ea322445dd41213ca6bbf70169e741d/clang/include/clang/Driver/Options.td#L5549

(those are the two places that the option currently appears in the code; test cases that have the option seem to be connected with clang/implicit modules).

So I will defer to other folks to comment further.

I agree. My understanding is that -fcxx-modules enables Clang modules that don't interact with C++20 modules. @Bigcheese, any thoughts?

I agree. My understanding is that -fcxx-modules enables Clang modules that don't interact with C++20 modules. @Bigcheese, any thoughts?

Oh, if it is true, then it is in a chaos really. I mean the language variables set in -fcxx-modules is CPlusPlusModules. And we uses CPlusPlusModules as the induction variable for C++20 modules for a relative long time.

My observation is that:

  • CPlusPlusModules is true if we set -std=c++20 before this patch.
  • Modules is true if we set -fmodules or -fcxx-modules.
  • What I get from the previous patch is that we hope a smooth change from clang modules to C++20 modules.

So my conclusion might be that the -fcxx-modules is connected to both C++20 modules and Clang Modules. Is this intended or not?

iains added a comment.Mar 15 2022, 4:27 AM

I'm not sure it is exactly chaos, but it is certainly fragile and somewhat hard(er than necessary) to maintain.

We (@ChuanqiXu and I at least) agree that there should be some way to make "which modules mode" unambiguous in both the driver and the compiler (I think we're only debating how/which flag to use)

The end game objective is that "when the compiler is processing for C++20 or later, then it should default to C++20+ modules".
IMO, knowing "which modules mode" is in force does not alter this - if anything it ought to simplify checking whether we have completed migration (whatever that ends up meaning).

Presumably, we are never going to delete the other options (e.g. -fmodules etc) they would (perhaps) become deprecated and then "do nothing". OTOH, perhaps that mode will be needed by existing users for some considerable time.

@rsmith told me that the ideal situation would combine C++20 modules and clang modules together in D113391

I think the most important thing here is to get in consensus for the module status. Here might be some helpful questions for the goal:

  • Should C++20 modules and Clang modules be exclusive from each other?
    • If yes, we could take the idea -fmodules= {clang, c++20, none...} and forbid the combination of -fmodules -std=c++20. And we could use variable Modules to indicate clang modules and CPlusPlusModules to indicate c++20 modules.
  • If no, it implies that we could use c++20 modules and clang modules together. So the combination of -fmodules -std=c++20 or even -fmodules -fcxxmodules makes sense. It implies that we could use the grammar of clang module extension or c++20 modules. This is decision from D113391. Here are some further questions:
    • Would it be very hard to implement or maintain?
    • We lack a variable to indicate clang modules only. Currently, we couldn't use Modules to indicate clang modules since Modules is true if we turned C++20 modules on. Modules indicate either clang module or c++20 module is enabled. Or we could think it means the common parts of the two features.

The most important technical question might be Would it be very hard to implement or maintain?. From my experience, I think it is implementable. But I feel it is not easy to maintain. We don't have the experience since C++20 modules are not in the state of maintaining now.

I don't have strong opinions for the concrete decision. But I think it is very important to get in consensus. @iains @Bigcheese

iains added a comment.EditedMar 16 2022, 1:15 AM

@rsmith told me that the ideal situation would combine C++20 modules and clang modules together in D113391

Maybe I understand something slightly different; that we should migrate to a situation where compiling C++ code with no other contradicting options, would produce C++20 modules. Preferably, for most users that means no special options would be needed for the standardised modules.

I think the most important thing here is to get in consensus for the module status. Here might be some helpful questions for the goal:

  • Should C++20 modules and Clang modules be exclusive from each other?
    • If yes, we could take the idea -fmodules= {clang, c++20, none...} and forbid the combination of -fmodules -std=c++20. And we could use variable Modules to indicate clang modules and CPlusPlusModules to indicate c++20 modules.
  • If no, it implies that we could use c++20 modules and clang modules together. So the combination of -fmodules -std=c++20 or even -fmodules -fcxxmodules makes sense. It implies that we could use the grammar of clang module extension or c++20 modules. This is decision from D113391. Here are some further questions:

In my input, there was no intention to forbid -fmodules -std=c++20, quite the opposite (it would be OK).
I am suggesting that the presence of fmodules on the command line would be a statement that the user wanted clang modules (and that would be perfectly OK in conjunction with C++20).
... But that would switch off C++20 modules mode because there are incompatibilities at present (in particular it means treating headers differently - and possibly sub-module visibility). There might come a time when clang modules are identical to C++20 ones, but I could imagine that would be some considerable time in the future - because end users would have to migrate.

  • Would it be very hard to implement or maintain?

once we have a firm plan for which things are allowed together it is not hard to maintain - what is hard to maintain is a situation in which we are not sure of exactly what we should be producing ...

  • We lack a variable to indicate clang modules only. Currently, we couldn't use Modules to indicate clang modules since Modules is true if we turned C++20 modules on. Modules indicate either clang module or c++20 module is enabled. Or we could think it means the common parts of the two features.

we could make such a variable easily - if the decision was made to do it (what could be a little harder is to separate out the modules command line options into mutually-exclusive groups).

The most important technical question might be Would it be very hard to implement or maintain?. From my experience, I think it is implementable. But I feel it is not easy to maintain. We don't have the experience since C++20 modules are not in the state of maintaining now.

I don't have strong opinions for the concrete decision. But I think it is very important to get in consensus. @iains @Bigcheese

In this processes, I am not a major stakeholder - just an interested party as a coder and user - the call is down to @rsmith and @Bigcheese.

MaskRay added inline comments.Mar 31 2022, 11:49 PM
clang/test/Driver/modules.cpp
79

You need to specify a pre-20 -stdc to make the test not stale when clang defaults to gnu++20

clang/test/Modules/cxx-modules.cppm
3

ditto

ChuanqiXu updated this revision to Diff 419649.Apr 1 2022, 12:21 AM
ChuanqiXu added a reviewer: MaskRay.

Address comments.

@iains Do you agree to submit this one first? Since all the discussions/questions I see in this page now it not about the patch itself. The patch itself should be good personally I thought.

MaskRay accepted this revision.May 27 2022, 2:13 PM
This revision is now accepted and ready to land.May 27 2022, 2:13 PM
This revision was landed with ongoing or failed builds.May 29 2022, 7:25 PM
This revision was automatically updated to reflect the committed changes.
DanielMcIntosh-IBM added inline comments.
clang/test/Driver/modules.cpp
79–86
  1. Do we really need separate check prefixes for each of these? The general practice seems to be to re-use the check prefix if we are expecting the same output (e.g. in this very file we use CHECK-COMPILE twice and CHECK-PRECOMPILE three times)
  2. Is there some reason not to pass either --precompile or -S for any of these tests like we do with the rest of the existing tests? Without that, this test fails when a toolchain doesn't support linking. Based on the rest of this test, I would have guessed that linking is outside the scope of this test?
clang/test/Driver/modules.cpp
79–86

Or, instead of --precompile or -S, -c would also prevent this test from requiring linking support. I've created a patch to address these issues: D126669

This patch broke a whole bunch tests in the LLDB testsuite. I'm trying to figure out what exactly the semantics of the change are, particularly on Darwin, where -fmodules doesn't imply -fcxx-modules.

I think the problem might be that previously on Darwin -fcxx-modules was used to turn on C++ Clang modules (which was not implied by -fmodules) without turning of C++20 (ts) modules, and after this patch -fcxx-modules implies -fmodules-ts. Does that sound plausible?

Should we introduce a -fno-modules-ts option on top? Any better suggestions?

In any case, I would appreciate it if we could revert this patch until we found a solution!

iains added a comment.Jun 1 2022, 10:38 AM

as noted before, IMO this is all a bit tangled at the moment.

Probably, a good start would be to make the driver and the FE behave the same way for the flags -- at the moment C++20 jams on fmodules and fcxx-modules (meaning that there's no way to decouple them in the FE) - however the driver still recognises them as distinct.

AFAIU, we all agree that the end game should be that we can clearly define which flags are relevant to each "modules mode" .. but we're some way away from that , and trying to move in small steps that does not break stuff.

I see tests that have stuff like -fmodules -fmodule-ts .. but I am not sure what "modules mode" that really means? (and some combinations are actually not really doing anything, because of the jamming on from c++20)

sorry .. this is not in any way a solution - but more a statement that the problem is quite knotty.

I'm going to revert the patch now. This is not just breaking LLDb, but also clang itself on Darwin platforms. I think we need to be more careful to separate out the enabling of Clang C++ modules and C++20 modules. Either by having -fmodules-ts control the HaveModules flag, or by adding a way to explicitly turn them off. I'm happy to help with testing out any patches!

iains added a comment.Jun 1 2022, 12:16 PM

I guess Chuanqi's TZ is in sleep mode at the moment, so the revert makes sense.

Please let's not use -fmodules-ts to mean anything - we want to phase it out and make it a NOP (we are implementing the standardised modules; I do not see anyone putting effort into finishing the ts implementation).

my suggestion is to introduce -fmodules={cxx20, clang, etc. etc} where -fmodules is mapped in the driver to -fmodules=clang ... and to introduce an enumeration in the options that is visible in the FE so that we can be explicit in checking which kind of modules we mean at each stage.

unfortunately, right now I cannot volunteer to implement the suggestion ...

@aprantl Thanks for the reverting. I never image -fcxx-modules would refer to Clang C++ modules. So the problem becomes more complicated.

I guess Chuanqi's TZ is in sleep mode at the moment, so the revert makes sense.

Please let's not use -fmodules-ts to mean anything - we want to phase it out and make it a NOP (we are implementing the standardised modules; I do not see anyone putting effort into finishing the ts implementation).

my suggestion is to introduce -fmodules={cxx20, clang, etc. etc} where -fmodules is mapped in the driver to -fmodules=clang ... and to introduce an enumeration in the options that is visible in the FE so that we can be explicit in checking which kind of modules we mean at each stage.

unfortunately, right now I cannot volunteer to implement the suggestion ...

Now I feel your proposal looks better. Given the current complexity we saw, I feel it might be better to disambiguate the meaning of modules in clang first and I filed an issue: https://github.com/llvm/llvm-project/issues/55891. So that we could avoid discussing the same problem again and again in different review pages.