This is an archive of the discontinued LLVM Phabricator instance.

[docs] Add "Standard C++ Modules"
ClosedPublic

Authored by ChuanqiXu on Aug 8 2022, 3:14 AM.

Details

Summary

Successor of D131062. We create the new page due to the old one has too many review comments to review. So we decide to create the new page to review more clearly.


We get some standard C++ module things done in clang15.x. But we lack a user documentation for it. The implementation of standard C++ modules share a big part of codes with clang modules. But they have very different semantics and user interfaces, so I think it is necessary to add a document for Standard C++ modules. Previously, there were also some people ask the document for standard C++ Modules and I couldn't offer that time.

I wish we could land this before September and I can backport this to 15.x. This implies that even we solve any bug during the reviewing process. We shouldn't edit the document in time. We should land it and edit it.

Tested with grammarly and make docs-clang-html.

Diff Detail

Event Timeline

ChuanqiXu created this revision.Aug 8 2022, 3:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 8 2022, 3:14 AM
Herald added a subscriber: arphaman. · View Herald Transcript
ChuanqiXu requested review of this revision.Aug 8 2022, 3:14 AM

Thanks! Repeating a point that might have been overlooked from D131062 (this time not as comments in the diff to avoid the "pollution" that caused the move to this PR):

If you do open a new revision, please also consider breaking the lines at a length that phabricator doesn't overflow (seems to be 122 characters), and change all occurrences of "codes" to "code".

iains added a comment.Aug 8 2022, 3:41 AM

general comment.

Do we encourage contractions (don't, can't) etc. in documentation?
I would suggest that to assist in any translation process it is better to write "do not" or "can not" instead (but that's just an opinion, not a matter of correctness).

clang/docs/CPlusPlus20Modules.rst
20–22 ↗(On Diff #450747)

I do not think we need to justify the existence of the page, just identify its purpose - maybe like:

"Since C++20 modules have different semantics (and work flows) from `Clang modules this page describes the background and use of Clang with C++20 modules."

25–26 ↗(On Diff #450747)

this also includes C++20 header units, which are also covered in this document.

57 ↗(On Diff #450747)

must

219 ↗(On Diff #450747)

procedural point: do we ever actually remove an existing option - or does it just become a "NOP"?

259 ↗(On Diff #450747)

link

458 ↗(On Diff #450747)

The declarations in a module unit which are not in global module fragment have new linkage names.

592 ↗(On Diff #450747)

With the existing implementation `-fprebuilt-module-path cannot be used for header units (since they are nominally anonymous). For header units use -fmodule-file` to include the relevant PCM file for each header unit.

594–595 ↗(On Diff #450747)

Notes ( we have an internal implementation name, which can be used by planned tooling and/or mapping changes)

  • so maybe:

this is expect to be solved in future editions of the compiler either by the tooling finding and specifying the -fmodule-file or by the use of a module-mapper that understands how to map the header name to their PCMs.

iains added inline comments.Aug 8 2022, 3:56 AM
clang/docs/CPlusPlus20Modules.rst
673–674 ↗(On Diff #450747)

as an aside :

there is an open question in the implementation of p1184r2 as to whether one form of input that the module mapper could consume would be module map files (but, obviously, producing C++20 compliant output).

I wonder if this section is adding information that is useful to the user ?
(perhaps it is more documentation of implementation decisions?)

As noted before semantics of clang module headers != semantics of C++20 header units seems a sufficient reason for keeping them separate?

I am not sure what could change in the future to alter this, since existing code will have the current semantics, even if some change is later made to the semantics of clang header modules ?

ChuanqiXu marked 8 inline comments as done.Aug 8 2022, 8:00 PM

Thanks! Repeating a point that might have been overlooked from D131062 (this time not as comments in the diff to avoid the "pollution" that caused the move to this PR):

If you do open a new revision, please also consider breaking the lines at a length that phabricator doesn't overflow (seems to be 122 characters), and change all occurrences of "codes" to "code".

Yeah, I think I've addressed it.

general comment.

Do we encourage contractions (don't, can't) etc. in documentation?
I would suggest that to assist in any translation process it is better to write "do not" or "can not" instead (but that's just an opinion, not a matter of correctness).

Yes. I've used many must/should in the document. Maybe there is someplace missing.

clang/docs/CPlusPlus20Modules.rst
25–26 ↗(On Diff #450747)

I took half of the suggestion. Since I still want to clarify that the two terms modules and header units are different.

219 ↗(On Diff #450747)

According to the history, we did actually remove some option: https://github.com/llvm/llvm-project/commit/29f852a1516bcd3928dad74835965f238de34409

673–674 ↗(On Diff #450747)

I wonder if this section is adding information that is useful to the user ?

(perhaps it is more documentation of implementation decisions?)

I think this section is helpful to eliminate the confusion of the users. Since the users may be confused (or even complain) why don't we reuse the existing tools.

As noted before semantics of clang module headers != semantics of C++20 header units seems a sufficient reason for keeping them separate?

I am not sure what could change in the future to alter this, since existing code will have the current semantics, even if some change is later made to the semantics of clang header modules ?

To be honest, I think the differences mentioned above could be handled by compiler internally. (maybe a lot of if (isHeaderUnit)....). Or even the folks of clang modules could get involved in and say "OK, we could accept the cost." But it is a strong argument that it is possible to introduce new module mappers. So the current decision is much more comprehensible to me.

ChuanqiXu updated this revision to Diff 451030.Aug 8 2022, 8:01 PM
ChuanqiXu marked 2 inline comments as done.

Address comments.

ruoso added a subscriber: ruoso.Aug 9 2022, 8:18 AM
ruoso added inline comments.
clang/docs/CPlusPlus20Modules.rst
70–76 ↗(On Diff #451030)

The terminology here is a bit different than what we've been building the consensus on. Please take a look at sourcefiles.tex (or section [source.types] in the rendered version

ruoso added inline comments.Aug 9 2022, 8:29 AM
clang/docs/CPlusPlus20Modules.rst
225 ↗(On Diff #451030)

Likewise, here the term "Built Module Interface file", with the acronym "BMI" is what we're generally using when talking about the generated file.

238 ↗(On Diff #451030)

Is that actually true? Or does it require explicit arguments to explain how the compiler needs to translate the file?

243–244 ↗(On Diff #451030)

Is that the case for C++20 named modules as well? I thought this was just for clang modules. How does the lookup work?

250–253 ↗(On Diff #451030)

I wonder if it's easier to explain that a module implementation unit implicitly imports the primary module interface unit, and therefore it needs the BMI for that interface to be provided, just like it's the case for every other import statement.

ChuanqiXu added inline comments.Aug 9 2022, 8:42 AM
clang/docs/CPlusPlus20Modules.rst
70–76 ↗(On Diff #451030)

If there is a consensus already, we should follow.

(BTW, I thought we'll discuss module things in SG2 but it looks like we're discussing them in SG15... my bad)

225 ↗(On Diff #451030)

Yeah, this is what I was confused in the chat. In my mind, "BMI" describes a compatible interface format like ABI (like Itanium ABI). In another words, a BMI could be compiled by compiler which follows the BMI standard (like clang and gcc both accepts Itanium ABI). But currrently, a module file couldn't be compiled by clang in different versions.

So from my point of view, the term module file is more appropriate than BMI now.

243–244 ↗(On Diff #451030)

In this document, modules are referring to C++20 Named Modules by default. The option is borrowed from clang modules. The look up rule here is:

(1) When we import module M. The compiler would look up M.pcm in the directories specified by `-fprebuilt-module-interface`.
(2) When we import partition module unit M:P. The compiler would look up M-P.pcm in the directories specified by `-fprebuilt-module-interface`.

I thought this is clear enough. Do you have suggestion to improve? I am a little worried to be wordy.

250–253 ↗(On Diff #451030)

OK, I was afraid to cite too many standard paragraphs to scare readers. But this point looks necessary

dblaikie added inline comments.
clang/docs/CPlusPlus20Modules.rst
31 ↗(On Diff #451030)
122 ↗(On Diff #451030)
262 ↗(On Diff #451030)
312–321 ↗(On Diff #451030)

this sort of aside might be best left for a separate part of the document - an FAQ/side-notes (a footnote, perhaps?), etc to keep the rest of the document more focussed?

347 ↗(On Diff #451030)

Could probably skip the -f?

395–396 ↗(On Diff #451030)

I'm not sure I'm able to follow the example and how it justifies the rough theory as inadequate to explain the motivation for modules - could you clarify more directly (in comments, and then we can discuss how to word it) what the motivation for this section is/what you're trying to convey?

610 ↗(On Diff #451030)

Might need some more words here - I guess this means "there is no .o for a .pcm from a header unit" basically?

ruoso added inline comments.Aug 9 2022, 11:30 AM
clang/docs/CPlusPlus20Modules.rst
70–76 ↗(On Diff #451030)

Yes, please consider joining us for the sg15 meetings, we've been working through quite a few things related to C++ modules.

225 ↗(On Diff #451030)

BMI is definitely not as compatible as the ABI. I think there's some confusion here of what the term BMI is. In this case the pcm file is the BMI.

243–244 ↗(On Diff #451030)

Yes, thanks for the clarification. I think it would be worth including a description of how the lookup works as a part of the doc.

ChuanqiXu updated this revision to Diff 451361.Aug 9 2022, 11:35 PM
ChuanqiXu marked 13 inline comments as done.

Address comments.

ChuanqiXu added inline comments.
clang/docs/CPlusPlus20Modules.rst
31 ↗(On Diff #451030)

The was here is intended. I want to say I thought to not add the all other information and just write it like a manual, which might look like:

-unwindlib=<value>      Unwind library to use
-U <macro>              Undefine macro <macro>
--verify-debug-info     Verify the binary representation of debug output
-verify-pch             Load and verify that a pre-compiled header file is not stale
--version               Print version information
-v                      Show commands to run and use verbose output
-Wa,<arg>               Pass the comma separated arguments in <arg> to the assembler
-Wdeprecated            Enable warnings for deprecated constructs and define __DEPRECATED
-Wl,<arg>               Pass the comma separated arguments in <arg> to the linker
-working-directory <value>
                        Resolve file paths relative to the specified directory
-Wp,<arg>               Pass the comma separated arguments in <arg> to the preprocessor
-W<warning>             Enable the specified warning
-w                      Suppress all warnings
-Xanalyzer <arg>        Pass <arg> to the static analyzer
-Xarch_device <arg>     Pass <arg> to the CUDA/HIP device compilation
-Xarch_host <arg>       Pass <arg> to the CUDA/HIP host compilation
-Xassembler <arg>       Pass <arg> to the assembler
-Xclang=<arg>           Alias for -Xclang
-Xclang <arg>           Pass <arg> to clang -cc1
-Xcuda-fatbinary <arg>  Pass <arg> to fatbinary invocation
-Xcuda-ptxas <arg>      Pass <arg> to the ptxas assembler
-Xlinker <arg>          Pass <arg> to the linker
-Xoffload-linker<triple> <arg>
                        Pass <arg> to the offload linkers or the ones idenfied by -<triple>
-Xopenmp-target=<triple> <arg>
                        Pass <arg> to the target offloading toolchain identified by <triple>.
-Xopenmp-target <arg>   Pass <arg> to the target offloading toolchain.
-Xpreprocessor <arg>    Pass <arg> to the preprocessor
-x <language>           Treat subsequent input files as having type <language>
-z <arg>                Pass -z <arg> to the linker
70–76 ↗(On Diff #451030)

(Yeah, I've subscribed the sg15 mailing list.)

I've followed the terminology in the link except importable unit. In that link, importable unit may contain header units. But in this documentation, we want to split modules and header units. So I take importable module unit here. I guess it might not be a big deal.

225 ↗(On Diff #451030)

Oh, it confuses me. In my mind, BMI is corresponding to ABI (although the format of BMI is not standardized) and the .pcm file is corresponding to .o files naturally (to me). So generally, we don't say a .o file is the ABI. And similar, it is weird to me to say the .pcm files is the BMI.

I've added a sentence in the terminology section of module file to explain the term 'BMI'. I tried to copy it from the module-ecosystem-technical-report but I failed to find one. So I just put the simple explanation.

312–321 ↗(On Diff #451030)

I guess it is better to remain them here. Since the sentences tries to say the macro definitions are not language options, which is consistent to the above paragraphs to me.

347 ↗(On Diff #451030)

In my system, when I run rm M.cppm, it asks rm: remove regular file ‘M.cppm’? and I need to enter y then it would remove the file actually. So it looks better to add -f to me.

395–396 ↗(On Diff #451030)

Let me answer the motivation first. The motivation comes from my personal experience. I feel like when most people heard modules, they would ask "how much speedup could we get"? And there are some other questions like "why does modules speedup the compilation?". So I guess the readers of the document may have similar questions and I try to answer it here.

The complexity theory is correct but it may be too abstract to our users. Since the complexity theory is about the scaling. But for certain users, the scales of their codes are temporarily fixed. So when they try to use modules but find the speedup doesn't meet their expectation in O2. They may feel frustrated. And it doesn't work if I say, "hey, you'll get much better speedup if the your codes get 10x longer." I guess they won't buy in. So what I try to do here is to manage the user's expectation to avoid any misunderstanding.

Following off is about the explanation. For example, there are 1 module interface and 10 users. There is a function F in the module interface and the function is used by every users. And let's say we need a T time to compile the function F and each users without the function F.
In O0, the function F will get compiled completely once and get involved in the Sema part 10 times. Due to the Sema part is relatively fast and let's say the Sema part would take 0.1T. Given we compile them serially, we need 12T to compile the project.

But if we are with optimizations, each function F will get involved in optimizations and IPO in every users. And these optimizations are most time-consuming. Let's say these optimizations will consume 0.8T. And the time required will be 19T. It is easy to say the we need 20T to compile the project if we're using headers. So we could find the speedup with optimization is much slower.

BTW, if we write the required time with variables, it will be nT + mT + T*m*additional_compilation_part. The additional_compilation_part here corresponds to the time percentage of Sema or Optimizations. And since T and additional_compilation_part are both constant. So if we write them in O() form, it would be O(n+m).
So the theory is still correct.

610 ↗(On Diff #451030)

I guess this means "there is no .o for a .pcm from a header unit" basically?

Yes.

(I'll +1 to @ruoso's comments about using BMI - I think it might be best not to introduce a different term ("Module file") as it might confuse people as it's fairly close to "module source file", etc. BMI makes it clear/unambiguous that it's the binary file, not the source file)

clang/docs/CPlusPlus20Modules.rst
665 ↗(On Diff #451361)

could this use a strikethrough, perhaps? (not sure if you can strikethrough inside a code block)

347 ↗(On Diff #451030)

Perhaps your rm is aliased to rm -i? ( https://superuser.com/questions/345923/remove-file-without-asking ) - rm generally/without flags wouldn't prompt for removal of a writable file & encouraging people to get in the habit of using -f (especially when they probably don't need to - even if you do on your system) seems a bit problematic. I think not including the input/output of rm in case some users have it setup to require some interaction is OK - a plain rm is probably enough for users to understand what to do on their system.

395–396 ↗(On Diff #451030)

I think the message is getting a bit lost in the text (both in the proposed text, and the comment here).

"At -O0 implementations of non-inline functions defined in a module will not impact module users, but at higher optimization levels the definitions of such functions are provided to user compilations for the purposes of optimization (but definitions of these functions are still not included in the use's object file) - this means build speed at higher optimization levels may be lower than expected given -O0 experience, but does provide by more optimization opportunities"

ChuanqiXu marked 2 inline comments as done.

Address comments:

  • Change module file to BMI.
  • Replace rm -f with rm.
  • Add a conclusion paragraph to the How module speed up compilation section.
ChuanqiXu added inline comments.Aug 11 2022, 12:24 AM
clang/docs/CPlusPlus20Modules.rst
347 ↗(On Diff #451030)

Oh, got it.

395–396 ↗(On Diff #451030)

Yes, it is hard to talk clearly and briefly. In your suggested wording, you mentioned non-inline function, it is accurate but bring new information to this document. I'm worrying if the reader could understand it if the reader don't know c++ so much.

I put the suggested wording as the conclusion paragraph for the section and hope it could make the reader focus on the intention of the section.

ChuanqiXu added inline comments.Aug 11 2022, 12:24 AM
clang/docs/CPlusPlus20Modules.rst
665 ↗(On Diff #451361)

It looks like we can't do it in code blocks.. maybe we could only depend the comments above.

dblaikie added inline comments.Aug 11 2022, 2:00 PM
clang/docs/CPlusPlus20Modules.rst
395–396 ↗(On Diff #451030)

Maybe "non-inline" could be replaced by "module implementation details" (but "function bodies" sounds OK too)

I think the issue for me is that the current description seems to go into more detail about compiler implementation details than might be helpful for a document at this level. I was/am hoping maybe a one paragraph summary might be simpler/more approachable/sufficiently accurate for the audience.

Remove non-inline terms.

ChuanqiXu added inline comments.Aug 12 2022, 12:21 AM
clang/docs/CPlusPlus20Modules.rst
395–396 ↗(On Diff #451030)

Yeah, it is hard to control the balance between readability vs accuracy. From my personal experience, the 3-stage compilation model is relatively easy to be understood. I've explained the 3-stage compilation model for some our friends who are not majored in CS and all of them could understand it. But I know some programmers still think inline specifier is a optimization hint to the compiler..

After all, it is hard to tell if this is helpful for most readers. But I think the answer is yes from my personal experience.

ruoso added inline comments.Aug 12 2022, 8:53 AM
clang/docs/CPlusPlus20Modules.rst
309 ↗(On Diff #452095)

I think this is a bit confusing. The BMI is not linked...

Maybe something like: "Remember that modules still have an object counterpart to the BMI".

Specifically, it may be important to make a distinction between when the compiler is invoked to produce the bmi and when it's invoked to produce the object, and that you use the bmi when translating code that imports it, and the object when linking.

Actually, when it comes to diagrams - maybe what'd be good is a diagram of classic compilation and a diagram of modules and header modules

src1.cpp -+> clang++ src1.cpp --> src1.o ---, 
hdr1.h  --'                                 +-> clang++ src1.o src2.o ->  executable
hdr2.h  --,                                 |
src2.cpp -+> clang++ src2.cpp --> src2.o ---'
              src1.cpp ----------------------------------------+> clang++ src1.cpp -------> src1.o -, 
(header unit) hdr1.h    -> clang++ hdr1.h ...    -> hdr1.pcm --'                                    +-> clang++ src1.o mod1.o src2.o ->  executable
              mod1.cppm -> clang++ mod1.cppm ... -> mod1.pcm --,--> clang++ mod1.pcm ... -> mod1.o -+
              src2.cpp ----------------------------------------+> clang++ src2.cpp -------> src2.o -'

(no doubt could look a lot better - but would pretty quickly summarize what's going where - maybe the command lines are too long to include nicely in such a diagram and so the diagram could be a more simplified block diagram and the full command lines shown separately)

clang/docs/CPlusPlus20Modules.rst
395–396 ↗(On Diff #451030)

I still think it's a lot of text and diagrams that, to me, don't convey enough detail/clearly, to spend on what I'd consider a fairly side issue to the main discussion.

ChuanqiXu updated this revision to Diff 452610.Aug 15 2022, 2:18 AM
ChuanqiXu marked an inline comment as done.

Address comments.

ChuanqiXu updated this revision to Diff 452612.Aug 15 2022, 2:24 AM

Address comments.

Actually, when it comes to diagrams - maybe what'd be good is a diagram of classic compilation and a diagram of modules and header modules

src1.cpp -+> clang++ src1.cpp --> src1.o ---, 
hdr1.h  --'                                 +-> clang++ src1.o src2.o ->  executable
hdr2.h  --,                                 |
src2.cpp -+> clang++ src2.cpp --> src2.o ---'
              src1.cpp ----------------------------------------+> clang++ src1.cpp -------> src1.o -, 
(header unit) hdr1.h    -> clang++ hdr1.h ...    -> hdr1.pcm --'                                    +-> clang++ src1.o mod1.o src2.o ->  executable
              mod1.cppm -> clang++ mod1.cppm ... -> mod1.pcm --,--> clang++ mod1.pcm ... -> mod1.o -+
              src2.cpp ----------------------------------------+> clang++ src2.cpp -------> src2.o -'

(no doubt could look a lot better - but would pretty quickly summarize what's going where - maybe the command lines are too long to include nicely in such a diagram and so the diagram could be a more simplified block diagram and the full command lines shown separately)

Thanks for the diagrams! It is helpful. I added a sentence: (But we can't do this for the BMI from header units. See the later section for the definition of header units) since I'm not sure if all readers are familiar with header units.

clang/docs/CPlusPlus20Modules.rst
309 ↗(On Diff #452095)

Specifically, it may be important to make a distinction between when the compiler is invoked to produce the bmi and when it's invoked to produce the object, and that you use the bmi when translating code that imports it, and the object when linking.

I don't get the meaning here. The doc said the user need to compile *.cppm to *.pcm before imported it and the doc also said the user need to compile *.pcm to *.o to link. So I guess the doc has stated it clear?

395–396 ↗(On Diff #451030)

Got it. I still want to remain the parts since I'm sure that there will be readers who are confusing about it. But your words make sense too. I address your comments by moving the part into the end of the document in Possible Question section. Do you think this makes sense?

ruoso added inline comments.Aug 15 2022, 6:21 AM
clang/docs/CPlusPlus20Modules.rst
338–339 ↗(On Diff #452612)
ChuanqiXu updated this revision to Diff 452875.Aug 15 2022, 7:03 PM
ChuanqiXu marked an inline comment as done.

Address comments.

urnathan resigned from this revision.Aug 16 2022, 12:19 PM

Add description to use -fmodule-header to generate BMI for header units.

If no objection (not comments) come in, I want to land this in next Friday (8.25) since I want to backport this to 15.x and 15.x is going to release in September.

JohelEGP requested changes to this revision.Aug 19 2022, 5:00 AM
JohelEGP added inline comments.
clang/docs/CPlusPlus20Modules.rst
290 ↗(On Diff #453207)
457 ↗(On Diff #453207)
This revision now requires changes to proceed.Aug 19 2022, 5:00 AM
ChuanqiXu updated this revision to Diff 454012.Aug 19 2022, 8:19 AM
ChuanqiXu marked an inline comment as done.

Address comments.

ChuanqiXu added inline comments.Aug 19 2022, 8:20 AM
clang/docs/CPlusPlus20Modules.rst
457 ↗(On Diff #453207)

Thanks for reviewing. BTW, the request to change button is fair to use. But it scares me a little with typo comments..

ChuanqiXu updated this revision to Diff 455462.Aug 24 2022, 7:34 PM
ChuanqiXu retitled this revision from [docs] Add "C++20 Modules" to [docs] Add "Standard C++ Modules".
ChuanqiXu edited the summary of this revision. (Show Details)

Replace C++20 Modules with Standard C++ Modules since @ruoso pointed out that Modules is not specific to certain versions of C++ (for example, Modules may get some big changes in C++23, C++26, ...).

Replace C++20 Modules with Standard C++ Modules since @ruoso pointed out that Modules is not specific to certain versions of C++ (for example, Modules may get some big changes in C++23, C++26, ...).

The filename (and hence future doc URL) still contains the "20".

ruoso added a comment.Aug 25 2022, 6:18 AM

I just noticed that the document was implying that Header Units were separate from the Standard C++ Modules, but they are an integral part of the language in the specification of modules. The contrast that we do is between "Named Modules" and "Header Units". I did some edits to help convey that.

clang/docs/CPlusPlus20Modules.rst
24–26 ↗(On Diff #455462)
28–29 ↗(On Diff #455462)
285–287 ↗(On Diff #455462)

Which one takes precedence?

585–586 ↗(On Diff #455462)
728–729 ↗(On Diff #455462)

I just noticed that the document was implying that Header Units were separate from the Standard C++ Modules, but they are an integral part of the language in the specification of modules. The contrast that we do is between "Named Modules" and "Header Units". I did some edits to help convey that.

Yeah, I know header units are part of modules. Technically, the "header units" are special synthesized units . And the entities in header units part of the global module. And global module have no name. So here is the term Named Modules. I didn't mention this since I feel this bring some language details to the users but I feel like this is not helpful for users. For me, it is simpler and workable to think "header units" are special headers and treat "named modules" and "modules" interchangeably. But given you are doing a lot of modules documentation work, I'll follow your suggestion to make the terms consistency.

(BTW, it is a little bit weird to me since "header units" should be "units" literally but we call it modules)

ChuanqiXu marked 5 inline comments as done.

Address comments:

  • Change the file name to StandardCPlusPlusModules
  • Combine header units with modules more tightly.

I just noticed that the document was implying that Header Units were separate from the Standard C++ Modules, but they are an integral part of the language in the specification of modules. The contrast that we do is between "Named Modules" and "Header Units". I did some edits to help convey that.

Done. Although I wanted to land this today but I feel it will be better to be approved by you. So I'll wait longer.

ruoso accepted this revision.Aug 26 2022, 4:38 AM

Thanks for taking my last minute suggestions

Nice writeup! I appreciate the work that went into this.

clang/docs/StandardCPlusPlusModules.rst
28

Just matching the capitalization of the other sections and subsections.

88

This is a small tweak but it should clarify that using the same partition_name with entirely different module_names is fine.

90
92
244–245

Is there a tracking issue to revise this choice? I seem to recall that we settled on recommending a different file extension, *.ixx, for at least primary module interface units.

We didn't really discuss extensions for other units, but maybe we should if different toolchains are going to have different expectations. Of course, build systems can work around this sort of thing. But if the expectation of clang is that, practically speaking, you'll probably be using a build system, maybe the docs should clarify as much. We could document that parts of this document are intended for people configuring or implementing build systems.

FYI @Bigcheese

745

@ChuanqiXu Thanks a lot for this!

We are currently implementing this module logic in the Bazel ruleset rules_ll.
This page helps clarify the differences between clang modules and C++ modules a lot. If I knew about this earlier we'd probably have saved a lot of time 😅

After reading this doc, a few things came to my mind. I can add a patch for this as a follow-up diff when I better understand how things work.

  • The path logic that references the source files in BMIs is very important. When precompiling a module in one directory and then moving a header included in the global module fragment from the source file, things will break.
  • As a result of this, some unintuitive things can happen when using sandboxed compilation. E.g. precompile BMI in sandbox, move BMI out of that sandbox, then start another sandbox that loads the previously built BMI. If the BMI used headers in a global module fragment, and those headers were in the original sandboxe, the header in the new sandbox will be in a different location (because the new sandbox is at a new location) and things will break.
  • Clang modulemaps are cool, but for build systems that have to declare every output, implicit modulemaps seem like a bad idea. Clang's caching system, even with custom cache paths seems like a bad idea as well. It is probably helpful to mention caveats of implicit modules and how to disable them (or link to the Clang Module docs for this).
  • The doc describes how to build e.g. an iostream module using the C++ semantics, but doesn't mention that AFAIK there is no "native" C++ Module build for libcxx yet. One would currently actually use Clang modules with implicit modulemaps. It may be helpful to mention this to avoid confusion.
  • Issue https://github.com/llvm/llvm-project/issues/56770 and seems-to-be-a-consequence-issue https://github.com/llvm/llvm-project/issues/57293 may be relevant apart from the already mentioned issue about clang-scan-deps.
ChuanqiXu updated this revision to Diff 456244.Aug 28 2022, 8:45 PM
ChuanqiXu marked 5 inline comments as done.

Address comments.

Add contents for "When precompiling a module in one directory and then moving a header included in the global module fragment from the source file, things will break."

@ChuanqiXu Thanks a lot for this!

We are currently implementing this module logic in the Bazel ruleset rules_ll.
This page helps clarify the differences between clang modules and C++ modules a lot. If I knew about this earlier we'd probably have saved a lot of time 😅

Yeah, the reason why I want to land this faster is that I've seen some people confused about this too.

(I've watched rules_ll for releases. So it will be better to *release* it if you guys get something done)

After reading this doc, a few things came to my mind. I can add a patch for this as a follow-up diff when I better understand how things work.

  • The path logic that references the source files in BMIs is very important. When precompiling a module in one directory and then moving a header included in the global module fragment from the source file, things will break.

Yes and I mentioned it implicitly that we can workaround it by -Xclang -fmodules-embed-all-files options. And I add a paragraph at the end of Source content consistency section to make it explicitly. Would you like to take a look?

  • Clang modulemaps are cool, but for build systems that have to declare every output, implicit modulemaps seem like a bad idea. Clang's caching system, even with custom cache paths seems like a bad idea as well. It is probably helpful to mention caveats of implicit modules and how to disable them (or link to the Clang Module docs for this).
  • The doc describes how to build e.g. an iostream module using the C++ semantics, but doesn't mention that AFAIK there is no "native" C++ Module build for libcxx yet. One would currently actually use Clang modules with implicit modulemaps. It may be helpful to mention this to avoid confusion.

In fact, initially I want this document to be separate from clang module map modules. Although they share a lot in the implementation, they are totally different to the users. (Although they share some options too). To be clear, I'm not saying we'll abandon/disallow the mixed use of clang module map modules with standard c++ modules. The status quo is that every thing is still unclear. I'm not sure if they will work together correctly or what the cost we'll pay for to make them work together. So when I wrote this document, I intentionally skipped the part of how to use clang module map modules.

And I think it will be great to add a section "interaction with clang module map modules" once we have the use experience of mixed form.

I'm not sure. And for this document, I feel it is fine to not include all the issues. Otherwise it may be a little bit scary to people who reads it first.


@aaronmondal @bbrown105 Are you happy to land this document in the current status? So that I can backport this to 15.x in time. (clang 15.x is going to be released in the next week). Then the readers could read the document for 15.x at https://releases.llvm.org/15.0.0/tools/clang/docs/StandardCPlusPlusModules.html. Then we can keep maintaining it further. This should be helpful for version controlling.

clang/docs/StandardCPlusPlusModules.rst
244–245

Is there a tracking issue to revise this choice? I seem to recall that we settled on recommending a different file extension, *.ixx, for at least primary module interface units.

I remember there is not an existing one. And I just filed https://github.com/llvm/llvm-project/issues/57416. I think you can post the consensus of SG15 in that issue and we can follow it.

I edit the wording slightly (must -> should) and add a known problem section for it. I think we can edit the document further after we get more consensus.

But if the expectation of clang is that, practically speaking, you'll probably be using a build system, maybe the docs should clarify as much.

No, I am not using/developing a build system for modules (Currently I write command line options manually by Makefile). And the suffix expectation are required by the current clang15.x. And we can change it in further versions.

aaronmondal accepted this revision.Aug 29 2022, 2:39 AM

Thanks for addressing my comment. I think I overlooked the part about -fmodules-embed-all-files 😅

I'm fine with this revision 😊

aaronmondal requested changes to this revision.Aug 30 2022, 9:51 AM

I just noticed that there probably needs to be one change. The doc sometimes describes -fprebuilt-module-interface, which is not a valid option. I think the occurrences were meant to be -fpbrebuilt-module-path.

This revision now requires changes to proceed.Aug 30 2022, 9:51 AM
ChuanqiXu updated this revision to Diff 456844.Aug 30 2022, 7:31 PM

Address comments.

I just noticed that there probably needs to be one change. The doc sometimes describes -fprebuilt-module-interface, which is not a valid option. I think the occurrences were meant to be -fpbrebuilt-module-path.

Oh, thanks for noticing. It is weird that I made such a mistake...

@bbrown105 I am going to land this today. it looks like you're not objecting it and we don't have the time to wait for your formal approval...

ChuanqiXu accepted this revision.Aug 30 2022, 8:03 PM

I'm going to land this. Thanks for everyone who reviewed this!

This revision was not accepted when it landed; it landed in state Needs Review.Aug 30 2022, 8:11 PM
This revision was automatically updated to reflect the committed changes.