Page MenuHomePhabricator

[docs] Add "Standard C++ Modules"
ClosedPublic

Authored by ChuanqiXu on Aug 8 2022, 3:14 AM.

Details

Summary

Successor of D131062. We create the new page due to the old one has too many review comments to review. So we decide to create the new page to review more clearly.


We get some standard C++ module things done in clang15.x. But we lack a user documentation for it. The implementation of standard C++ modules share a big part of codes with clang modules. But they have very different semantics and user interfaces, so I think it is necessary to add a document for Standard C++ modules. Previously, there were also some people ask the document for standard C++ Modules and I couldn't offer that time.

I wish we could land this before September and I can backport this to 15.x. This implies that even we solve any bug during the reviewing process. We shouldn't edit the document in time. We should land it and edit it.

Tested with grammarly and make docs-clang-html.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
ChuanqiXu added inline comments.Aug 11 2022, 12:24 AM
clang/docs/CPlusPlus20Modules.rst
666

It looks like we can't do it in code blocks.. maybe we could only depend the comments above.

dblaikie added inline comments.Aug 11 2022, 2:00 PM
clang/docs/CPlusPlus20Modules.rst
396–397

Maybe "non-inline" could be replaced by "module implementation details" (but "function bodies" sounds OK too)

I think the issue for me is that the current description seems to go into more detail about compiler implementation details than might be helpful for a document at this level. I was/am hoping maybe a one paragraph summary might be simpler/more approachable/sufficiently accurate for the audience.

Remove non-inline terms.

ChuanqiXu added inline comments.Aug 12 2022, 12:21 AM
clang/docs/CPlusPlus20Modules.rst
396–397

Yeah, it is hard to control the balance between readability vs accuracy. From my personal experience, the 3-stage compilation model is relatively easy to be understood. I've explained the 3-stage compilation model for some our friends who are not majored in CS and all of them could understand it. But I know some programmers still think inline specifier is a optimization hint to the compiler..

After all, it is hard to tell if this is helpful for most readers. But I think the answer is yes from my personal experience.

ruoso added inline comments.Aug 12 2022, 8:53 AM
clang/docs/CPlusPlus20Modules.rst
310

I think this is a bit confusing. The BMI is not linked...

Maybe something like: "Remember that modules still have an object counterpart to the BMI".

Specifically, it may be important to make a distinction between when the compiler is invoked to produce the bmi and when it's invoked to produce the object, and that you use the bmi when translating code that imports it, and the object when linking.

Actually, when it comes to diagrams - maybe what'd be good is a diagram of classic compilation and a diagram of modules and header modules

src1.cpp -+> clang++ src1.cpp --> src1.o ---, 
hdr1.h  --'                                 +-> clang++ src1.o src2.o ->  executable
hdr2.h  --,                                 |
src2.cpp -+> clang++ src2.cpp --> src2.o ---'
              src1.cpp ----------------------------------------+> clang++ src1.cpp -------> src1.o -, 
(header unit) hdr1.h    -> clang++ hdr1.h ...    -> hdr1.pcm --'                                    +-> clang++ src1.o mod1.o src2.o ->  executable
              mod1.cppm -> clang++ mod1.cppm ... -> mod1.pcm --,--> clang++ mod1.pcm ... -> mod1.o -+
              src2.cpp ----------------------------------------+> clang++ src2.cpp -------> src2.o -'

(no doubt could look a lot better - but would pretty quickly summarize what's going where - maybe the command lines are too long to include nicely in such a diagram and so the diagram could be a more simplified block diagram and the full command lines shown separately)

clang/docs/CPlusPlus20Modules.rst
396–397

I still think it's a lot of text and diagrams that, to me, don't convey enough detail/clearly, to spend on what I'd consider a fairly side issue to the main discussion.

ChuanqiXu updated this revision to Diff 452610.Aug 15 2022, 2:18 AM
ChuanqiXu marked an inline comment as done.

Address comments.

ChuanqiXu updated this revision to Diff 452612.Aug 15 2022, 2:24 AM

Address comments.

Actually, when it comes to diagrams - maybe what'd be good is a diagram of classic compilation and a diagram of modules and header modules

src1.cpp -+> clang++ src1.cpp --> src1.o ---, 
hdr1.h  --'                                 +-> clang++ src1.o src2.o ->  executable
hdr2.h  --,                                 |
src2.cpp -+> clang++ src2.cpp --> src2.o ---'
              src1.cpp ----------------------------------------+> clang++ src1.cpp -------> src1.o -, 
(header unit) hdr1.h    -> clang++ hdr1.h ...    -> hdr1.pcm --'                                    +-> clang++ src1.o mod1.o src2.o ->  executable
              mod1.cppm -> clang++ mod1.cppm ... -> mod1.pcm --,--> clang++ mod1.pcm ... -> mod1.o -+
              src2.cpp ----------------------------------------+> clang++ src2.cpp -------> src2.o -'

(no doubt could look a lot better - but would pretty quickly summarize what's going where - maybe the command lines are too long to include nicely in such a diagram and so the diagram could be a more simplified block diagram and the full command lines shown separately)

Thanks for the diagrams! It is helpful. I added a sentence: (But we can't do this for the BMI from header units. See the later section for the definition of header units) since I'm not sure if all readers are familiar with header units.

clang/docs/CPlusPlus20Modules.rst
310

Specifically, it may be important to make a distinction between when the compiler is invoked to produce the bmi and when it's invoked to produce the object, and that you use the bmi when translating code that imports it, and the object when linking.

I don't get the meaning here. The doc said the user need to compile *.cppm to *.pcm before imported it and the doc also said the user need to compile *.pcm to *.o to link. So I guess the doc has stated it clear?

396–397

Got it. I still want to remain the parts since I'm sure that there will be readers who are confusing about it. But your words make sense too. I address your comments by moving the part into the end of the document in Possible Question section. Do you think this makes sense?

ruoso added inline comments.Aug 15 2022, 6:21 AM
clang/docs/CPlusPlus20Modules.rst
339–340
ChuanqiXu updated this revision to Diff 452875.Aug 15 2022, 7:03 PM
ChuanqiXu marked an inline comment as done.

Address comments.

urnathan resigned from this revision.Aug 16 2022, 12:19 PM

Add description to use -fmodule-header to generate BMI for header units.

If no objection (not comments) come in, I want to land this in next Friday (8.25) since I want to backport this to 15.x and 15.x is going to release in September.

JohelEGP requested changes to this revision.Aug 19 2022, 5:00 AM
JohelEGP added inline comments.
clang/docs/CPlusPlus20Modules.rst
291
458
This revision now requires changes to proceed.Aug 19 2022, 5:00 AM
ChuanqiXu updated this revision to Diff 454012.Aug 19 2022, 8:19 AM
ChuanqiXu marked an inline comment as done.

Address comments.

ChuanqiXu added inline comments.Aug 19 2022, 8:20 AM
clang/docs/CPlusPlus20Modules.rst
458

Thanks for reviewing. BTW, the request to change button is fair to use. But it scares me a little with typo comments..

ChuanqiXu updated this revision to Diff 455462.Aug 24 2022, 7:34 PM
ChuanqiXu retitled this revision from [docs] Add "C++20 Modules" to [docs] Add "Standard C++ Modules".
ChuanqiXu edited the summary of this revision. (Show Details)

Replace C++20 Modules with Standard C++ Modules since @ruoso pointed out that Modules is not specific to certain versions of C++ (for example, Modules may get some big changes in C++23, C++26, ...).

Replace C++20 Modules with Standard C++ Modules since @ruoso pointed out that Modules is not specific to certain versions of C++ (for example, Modules may get some big changes in C++23, C++26, ...).

The filename (and hence future doc URL) still contains the "20".

ruoso added a comment.Aug 25 2022, 6:18 AM

I just noticed that the document was implying that Header Units were separate from the Standard C++ Modules, but they are an integral part of the language in the specification of modules. The contrast that we do is between "Named Modules" and "Header Units". I did some edits to help convey that.

clang/docs/CPlusPlus20Modules.rst
24–26
28–29
285–287

Which one takes precedence?

585–586
728–729

I just noticed that the document was implying that Header Units were separate from the Standard C++ Modules, but they are an integral part of the language in the specification of modules. The contrast that we do is between "Named Modules" and "Header Units". I did some edits to help convey that.

Yeah, I know header units are part of modules. Technically, the "header units" are special synthesized units . And the entities in header units part of the global module. And global module have no name. So here is the term Named Modules. I didn't mention this since I feel this bring some language details to the users but I feel like this is not helpful for users. For me, it is simpler and workable to think "header units" are special headers and treat "named modules" and "modules" interchangeably. But given you are doing a lot of modules documentation work, I'll follow your suggestion to make the terms consistency.

(BTW, it is a little bit weird to me since "header units" should be "units" literally but we call it modules)

ChuanqiXu marked 5 inline comments as done.

Address comments:

  • Change the file name to StandardCPlusPlusModules
  • Combine header units with modules more tightly.

I just noticed that the document was implying that Header Units were separate from the Standard C++ Modules, but they are an integral part of the language in the specification of modules. The contrast that we do is between "Named Modules" and "Header Units". I did some edits to help convey that.

Done. Although I wanted to land this today but I feel it will be better to be approved by you. So I'll wait longer.

ruoso accepted this revision.Aug 26 2022, 4:38 AM

Thanks for taking my last minute suggestions

Nice writeup! I appreciate the work that went into this.

clang/docs/StandardCPlusPlusModules.rst
27 ↗(On Diff #455802)

Just matching the capitalization of the other sections and subsections.

87 ↗(On Diff #455802)

This is a small tweak but it should clarify that using the same partition_name with entirely different module_names is fine.

89 ↗(On Diff #455802)
91 ↗(On Diff #455802)
243–244 ↗(On Diff #455802)

Is there a tracking issue to revise this choice? I seem to recall that we settled on recommending a different file extension, *.ixx, for at least primary module interface units.

We didn't really discuss extensions for other units, but maybe we should if different toolchains are going to have different expectations. Of course, build systems can work around this sort of thing. But if the expectation of clang is that, practically speaking, you'll probably be using a build system, maybe the docs should clarify as much. We could document that parts of this document are intended for people configuring or implementing build systems.

FYI @Bigcheese

744 ↗(On Diff #455802)

@ChuanqiXu Thanks a lot for this!

We are currently implementing this module logic in the Bazel ruleset rules_ll.
This page helps clarify the differences between clang modules and C++ modules a lot. If I knew about this earlier we'd probably have saved a lot of time 😅

After reading this doc, a few things came to my mind. I can add a patch for this as a follow-up diff when I better understand how things work.

  • The path logic that references the source files in BMIs is very important. When precompiling a module in one directory and then moving a header included in the global module fragment from the source file, things will break.
  • As a result of this, some unintuitive things can happen when using sandboxed compilation. E.g. precompile BMI in sandbox, move BMI out of that sandbox, then start another sandbox that loads the previously built BMI. If the BMI used headers in a global module fragment, and those headers were in the original sandboxe, the header in the new sandbox will be in a different location (because the new sandbox is at a new location) and things will break.
  • Clang modulemaps are cool, but for build systems that have to declare every output, implicit modulemaps seem like a bad idea. Clang's caching system, even with custom cache paths seems like a bad idea as well. It is probably helpful to mention caveats of implicit modules and how to disable them (or link to the Clang Module docs for this).
  • The doc describes how to build e.g. an iostream module using the C++ semantics, but doesn't mention that AFAIK there is no "native" C++ Module build for libcxx yet. One would currently actually use Clang modules with implicit modulemaps. It may be helpful to mention this to avoid confusion.
  • Issue https://github.com/llvm/llvm-project/issues/56770 and seems-to-be-a-consequence-issue https://github.com/llvm/llvm-project/issues/57293 may be relevant apart from the already mentioned issue about clang-scan-deps.
ChuanqiXu updated this revision to Diff 456244.Aug 28 2022, 8:45 PM
ChuanqiXu marked 5 inline comments as done.

Address comments.

Add contents for "When precompiling a module in one directory and then moving a header included in the global module fragment from the source file, things will break."

@ChuanqiXu Thanks a lot for this!

We are currently implementing this module logic in the Bazel ruleset rules_ll.
This page helps clarify the differences between clang modules and C++ modules a lot. If I knew about this earlier we'd probably have saved a lot of time 😅

Yeah, the reason why I want to land this faster is that I've seen some people confused about this too.

(I've watched rules_ll for releases. So it will be better to *release* it if you guys get something done)

After reading this doc, a few things came to my mind. I can add a patch for this as a follow-up diff when I better understand how things work.

  • The path logic that references the source files in BMIs is very important. When precompiling a module in one directory and then moving a header included in the global module fragment from the source file, things will break.

Yes and I mentioned it implicitly that we can workaround it by -Xclang -fmodules-embed-all-files options. And I add a paragraph at the end of Source content consistency section to make it explicitly. Would you like to take a look?

  • Clang modulemaps are cool, but for build systems that have to declare every output, implicit modulemaps seem like a bad idea. Clang's caching system, even with custom cache paths seems like a bad idea as well. It is probably helpful to mention caveats of implicit modules and how to disable them (or link to the Clang Module docs for this).
  • The doc describes how to build e.g. an iostream module using the C++ semantics, but doesn't mention that AFAIK there is no "native" C++ Module build for libcxx yet. One would currently actually use Clang modules with implicit modulemaps. It may be helpful to mention this to avoid confusion.

In fact, initially I want this document to be separate from clang module map modules. Although they share a lot in the implementation, they are totally different to the users. (Although they share some options too). To be clear, I'm not saying we'll abandon/disallow the mixed use of clang module map modules with standard c++ modules. The status quo is that every thing is still unclear. I'm not sure if they will work together correctly or what the cost we'll pay for to make them work together. So when I wrote this document, I intentionally skipped the part of how to use clang module map modules.

And I think it will be great to add a section "interaction with clang module map modules" once we have the use experience of mixed form.

I'm not sure. And for this document, I feel it is fine to not include all the issues. Otherwise it may be a little bit scary to people who reads it first.


@aaronmondal @bbrown105 Are you happy to land this document in the current status? So that I can backport this to 15.x in time. (clang 15.x is going to be released in the next week). Then the readers could read the document for 15.x at https://releases.llvm.org/15.0.0/tools/clang/docs/StandardCPlusPlusModules.html. Then we can keep maintaining it further. This should be helpful for version controlling.

clang/docs/StandardCPlusPlusModules.rst
243–244 ↗(On Diff #455802)

Is there a tracking issue to revise this choice? I seem to recall that we settled on recommending a different file extension, *.ixx, for at least primary module interface units.

I remember there is not an existing one. And I just filed https://github.com/llvm/llvm-project/issues/57416. I think you can post the consensus of SG15 in that issue and we can follow it.

I edit the wording slightly (must -> should) and add a known problem section for it. I think we can edit the document further after we get more consensus.

But if the expectation of clang is that, practically speaking, you'll probably be using a build system, maybe the docs should clarify as much.

No, I am not using/developing a build system for modules (Currently I write command line options manually by Makefile). And the suffix expectation are required by the current clang15.x. And we can change it in further versions.

aaronmondal accepted this revision.Aug 29 2022, 2:39 AM

Thanks for addressing my comment. I think I overlooked the part about -fmodules-embed-all-files 😅

I'm fine with this revision 😊

aaronmondal requested changes to this revision.Aug 30 2022, 9:51 AM

I just noticed that there probably needs to be one change. The doc sometimes describes -fprebuilt-module-interface, which is not a valid option. I think the occurrences were meant to be -fpbrebuilt-module-path.

This revision now requires changes to proceed.Aug 30 2022, 9:51 AM
ChuanqiXu updated this revision to Diff 456844.Aug 30 2022, 7:31 PM

Address comments.

I just noticed that there probably needs to be one change. The doc sometimes describes -fprebuilt-module-interface, which is not a valid option. I think the occurrences were meant to be -fpbrebuilt-module-path.

Oh, thanks for noticing. It is weird that I made such a mistake...

@bbrown105 I am going to land this today. it looks like you're not objecting it and we don't have the time to wait for your formal approval...

ChuanqiXu accepted this revision.Aug 30 2022, 8:03 PM

I'm going to land this. Thanks for everyone who reviewed this!

This revision was not accepted when it landed; it landed in state Needs Review.Aug 30 2022, 8:11 PM
This revision was automatically updated to reflect the committed changes.