This is an archive of the discontinued LLVM Phabricator instance.

[docs] Add "C++20 Modules"
AbandonedPublic

Authored by ChuanqiXu on Aug 3 2022, 2:58 AM.

Details

Summary

We get some C++20 module things done in clang15.x. But we lack a user documentation for it. The implementation of c++20 modules share a big part of codes with clang modules. But they have very different semantics and user interfaces, so I think it is necessary to add a document for C++20 modules. Previously, there were also some people ask the document for C++20 Modules and I couldn't offer that time.

I wish we could land this before September and I can backport this to 15.x. This implies that even we solve any bug during the reviewing process. We shouldn't edit the document in time. We should land it and edit it.

Tested with grammarly and make docs-clang-html.

Diff Detail

Event Timeline

ChuanqiXu created this revision.Aug 3 2022, 2:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 3 2022, 2:58 AM
Herald added a subscriber: arphaman. · View Herald Transcript
ChuanqiXu requested review of this revision.Aug 3 2022, 2:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 3 2022, 2:58 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
ChuanqiXu added inline comments.Aug 3 2022, 3:02 AM
clang/docs/CPlusPlus20Modules.rst
378–394

I'm not sure if this could be rendered properly, if there is any suggestion?

586–588

@iains maybe we need to double check on this. I also filed an issue: https://github.com/llvm/llvm-project/issues/56899. My reading for [cpp.import]p5 is the point of undefinition is also a macro definition, so we should emit it. But I maybe wrong.

If I am wrong, I wish we could get an example to show differences (If any)

ChuanqiXu added inline comments.Aug 3 2022, 4:27 AM
clang/docs/CPlusPlus20Modules.rst
469–470

I am not sure if we should remain this. I feel both (remain or not remain) is OK.

ChuanqiXu updated this revision to Diff 449657.Aug 3 2022, 6:52 AM

Remove the discussion about the difference about macro in clang module and header units since we need more discussion about it. And the discussion is not necessary about the docs.

Thanks a lot for working on this!

I wonder whether it would be better to have some of the more technical information in a separate file and let this file mainly have an introduction to modules in general and how to get started using them in Clang.

I haven't tested the steps yet, but I intend to see whether the examples work for me.

I haven't validated whether the module explanation is correct; I'm sure there are other reviewers who have a deeper knowledge of modules.

clang/docs/CPlusPlus20Modules.rst
11

Clang is capitalized in its documentation.

18

Is Modules_ intended to be a link?

20
22

IMO we don't need to justify why we add a new page. WDYT?

35

How about mentioning the state of modules in Clang? In your GitHub repository you mention some limitations. (A link to a status page could be an alternative.)

56
79

Did you render the document? I would have expected double backticks.

117

Maybe make this hello world example a bit simpler by not using partitions and have a separate example with partitions. This example gives the impression a simple "hello modular world" is a lot harder to achieve than a normal "hello world".

133

Does import <iostream>; work? If not is that a Clang or libc++ limitation?

164

Maybe explain what all steps do.

178
183

I assume -std=c++2b works too. Maybe rephrase the wording like this.

184

I would remove this sentence, it's not too common to backport language features.

321
539

Maybe provide a link to this module map. (FYI the libc++ module map is generated by CMake using a .in file.)

ChuanqiXu updated this revision to Diff 449898.Aug 4 2022, 1:05 AM
ChuanqiXu marked 15 inline comments as done.
ChuanqiXu edited the summary of this revision. (Show Details)
  • Address comments.
  • Tested with make docs-clang-html.
  • Add Known Problems section.
clang/docs/CPlusPlus20Modules.rst
18
22

I guess it is not bad to add it. Since it is really confusing about clang modules and C++20 Modules at first for people who are not so familiar to these topics.

79

Yeah, I had rendered it in https://overbits.herokuapp.com/rsteditor/. But it is not good. I found http://rst.aaroniles.net which looks better.

133

Yes, import <iostream> works here. But I don't want to introduce too many information here. And I mentioned it indirectly later in Include translation section.

164

I made a simple explanation in the # part and I intended to explain them in more detail in the following sections. I feel it may be confusing for readers who reads at the first time?

183

I sent a backport patch to 15.x: https://github.com/llvm/llvm-project/issues/56803. So it -std=c++2b should be fine in 15.x

378–394

Now I've tested it with make docs-clang-html, and it looks not bad.

Thanks a lot for working on this!

I wonder whether it would be better to have some of the more technical information in a separate file and let this file mainly have an introduction to modules in general and how to get started using them in Clang.

What do you mean about the more technical information? Do you refer to the section How module speed up the compilation?

I added the How module speed up the compilation section since I know people are curious about why and how modules could speedup the compilation. And this section tries tell them that they may probably get what they want in O0 but not so much in O2 (and still some improvements after all) and also the reasons.

let this file mainly have an introduction to modules in general

If I don't misread, do you say that we need make this one like a modules tutorial? If yes, I think it is not what I thought. I treat this one like a toolchain documentation instead of a language feature documentation. Everything in the doc is about the building.
I put the language background since I want to make the doc self-contained. And everything else is related to the building. I understand it is good/needed to introduce about modules since it really brings many new concepts. Personally, I feel like it would be done by somebody else besides of clang docs. The compiler and languages are deeply connected but they are still different to me.

I haven't tested the steps yet, but I intend to see whether the examples work for me.

I haven't validated whether the module explanation is correct; I'm sure there are other reviewers who have a deeper knowledge of modules.

Thanks. I've tested all the examples. It is always good to have more testers. It should be good at trunk. If you want to test with 15.x, you may need to wait for the merge of https://github.com/llvm/llvm-project-release-prs/pull/40 and https://github.com/llvm/llvm-project-release-prs/pull/40

clang/docs/CPlusPlus20Modules.rst
35

Yeah, I've added the Known Problems section.

Aside from a couple typos, I was wondering if it would be welcome to do a pass w.r.t stylistic improvements (e.g. "Modules have a lot of meanings." --> "The term 'modules' has a lot of meanings"). I don't want to be too nit-picky - the sentences are all understandable, but sometimes slightly unusual to a native speaker in their formulation.

Aside from a couple typos, I was wondering if it would be welcome to do a pass w.r.t stylistic improvements (e.g. "Modules have a lot of meanings." --> "The term 'modules' has a lot of meanings"). I don't want to be too nit-picky - the sentences are all understandable, but sometimes slightly unusual to a native speaker in their formulation.

It would be greatly welcome for such comments!

It would be greatly welcome for such comments!

OK, here goes. Sorry for the large volume of comments. In addition to typos and stylistic improvements, I've had a few questions where the content wasn't clear to me (but note I'm not experienced with modules at all, so this may be obvious to others). I've also added a few line breaks where Phab was awkwardly overflowing. The position of the linebreaks is obviously irrelevant, but I thought it might help further review.

clang/docs/CPlusPlus20Modules.rst
12
14–15
16–17

Not 100% what the intention of the last sentence is - I presume it sets the goal for this document?

21–22

Due to the C++20 modules having very different semantics, it might be more friendly for users who care about C++20 modules only to create a new page.

Isn't "C++20 modules" what this page intends to do? Which new page are we talking about then?

25–27
32–36
50–52
65–67

The dot `.` in the name has no special meaning.

Not sure if this is intended to say that the dot in the regex is not needed, or that it has no semantic significance.

Also, when wanting to match a literal "." in a regex, I'd consider it beneficial for clarity to escape it ("\."), even though it does what's intended in the context of [].

79–81
95–99

This seems quite important for the terminology of the rest of the document, so I'd structure it in a way that stand out visually, e.g. the suggestion above.

148

Missing space & inconsistent spelling compared to above.

I'd personally write it as (including the quotes): "hello world" example

190
214–215
220
225–227

As a reader, I have no idea how to parse

must end with `.cppm (or .ccm, .cxxm`, etc)

On the one hand there's a clear restriction ("must"), on the other there's an open-ended list, whose contents are not clear to me.

229–231

Similarly here; can we describe the origin or the restrictions should/must - i.e. do they come from the standard or from the Clang implementation? Is there any enforcement, or do things just error otherwise?

236–237
239

Is this equivalent to "-fprebuilt-module-interface". Are there relevant differences, and if so, should the reader of this document know about them?

239

Moving this up from further down, but I might be getting things wrong.

241–242
248

I didn't understand what would be more convenient at first, but I'm guessing that one takes a directory and the other takes a file? I made a suggestion further up how to reflect this.

253–254
262–263
297
305
310
344
365
367–368
372–373
375–376
401–404
427–429

"It is not acceptable" -> "It would be very unfortunate"

Since it sounds like that's what will happen when switching on optimization?

450–451
459–460
465–467
474–479
499
507
512–514
524–529
565
575–576
642–645
647–648
650–652
ChuanqiXu marked 45 inline comments as done.

It would be greatly welcome for such comments!

OK, here goes. Sorry for the large volume of comments. In addition to typos and stylistic improvements, I've had a few questions where the content wasn't clear to me (but note I'm not experienced with modules at all, so this may be obvious to others). I've also added a few line breaks where Phab was awkwardly overflowing. The position of the linebreaks is obviously irrelevant, but I thought it might help further review.

Many thanks for the though review! It is very helpful.

I've had a few questions where the content wasn't clear to me (but note I'm not experienced with modules at all, so this may be obvious to others)

This is welcome. In fact, it is a failure if the document requires the reader to have experienced with modules.

clang/docs/CPlusPlus20Modules.rst
16–17

The intention was to describe the motivation of the document, but we have stated it again in the next paragraph. So the suggestion looks fine.

21–22

The new page here refers to the document. The intention here is that the users may be confused about clang modules and c++20 modules and there is already a document about clang modules. So I am wondering it might be helpful to explain why we need a new document for C++20 Modules instead of clang modules.

65–67

Not sure if this is intended to say that the dot in the regex is not needed, or that it has no semantic significance.

I mean that it has no semantic significance. For example, the user from python may feel like the dot . may be related to hierarchy. But it is not the case for C++.

225–227

What I want to say is for etc is usual c++ source file suffix + m. Currently, the list is .cppm, .ccm, .cxxm and .c++m. The reason why I put etc is that I am afraid of the list may goes longer in the future. But it looks better to state things clear now. I'll follow the suggestion.

229–231

The comes from the clang implementation. If the user don't follow the restrictions, then the clang may fail to build the module. For example, in the "hello world" example, if the name of module file is M.module instead of M.pcm, the the clang would fail to find the corresponding M module.

239

Yes, it is true that -fprebuilt-module-interface takes a directory and -fmodule-file takes a specific file. I'll follow this.

248

I'm guessing that one takes a directory and the other takes a file?

Yes.

I didn't understand what would be more convenient at first

I guess you have understood it. But let me explain it again to avoid any misunderstanding. If there are n dependent module files in the same directory, we may need to n -fmodule-file options, or a -fprebuilt-module-interface option. So I said -fprebuilt-module-interface is more convenient.

427–429

Since it sounds like that's what will happen when switching on optimization?

No, it wouldn't happen. I want to say the users of modules won't get worth performance; but they can't get so much speedup either.

What I want to say in the first sentence is that if the compilation process with optimization is the same with the one with debug, the users will get worth performance, which is unacceptable.

I'm not sure if the current wording eliminates the misunderstanding.

647–648

If we decide to Clang's modulemap

If we decide to reuse Clang's modulemap? It looks like it lacks a verb.

ChuanqiXu updated this revision to Diff 450692.Aug 7 2022, 8:21 PM
ChuanqiXu marked 8 inline comments as done.

Address comments.

It gets very confusing that phab now attaches the old review comments in the wrong place.

clang/docs/CPlusPlus20Modules.rst
23

OK that's fine. Then this should be changed from

Due to the C++20 modules having very different semantics, it might be more friendly for users who care about C++20 modules only to create a new page.

to something like

Due to the C++20 modules having very different semantics, this page deals with them separately.

50–52

Ugh, this should have been "These are".

53

I had corrected myself on second pass, but got confused with phab and ended up not submitting it....

65–68
234–236

The comes from the clang implementation. If the user don't follow the restrictions, then the clang may fail to build the module. For example, in the "hello world" example, if the name of module file is M.module instead of M.pcm, the the clang would fail to find the corresponding M module.

Can we add a sentence to this effect?

317–318

I realized afterwards that it's the inverse - debugging code might depend on the _absence_ of NDEBUG which switches of debug stuff.

472
ChuanqiXu marked 8 inline comments as done.Aug 8 2022, 1:43 AM
ChuanqiXu added inline comments.
clang/docs/CPlusPlus20Modules.rst
317–318

From my experience, NDEBUG is used more frequently in practice. We generally write:

C++
#ifndef NDEBUG
...
#endif

I guess this might not be a big deal.

ChuanqiXu updated this revision to Diff 450731.Aug 8 2022, 1:44 AM
ChuanqiXu marked an inline comment as done.

Address comments.

It gets very confusing that phab now attaches the old review comments in the wrong place.

Yeah, I am also worrying if the comments may block other reviewers to review. Do you know any method to ignore these comments in phab? If not, I guess I need to create a new page (but this is what we want?)

Yeah, I am also worrying if the comments may block other reviewers to review. Do you know any method to ignore these comments in phab? If not, I guess I need to create a new page (but this is what we want?)

I don't know phab well at all, but I understand that old review comments remain is a feature and not a bug. But purely for pragmatic reasons already, it might be better to open a new revision.

If you do open a new revision, please also consider breaking the lines at a length that phabricator doesn't overflow (seems to be 122 characters), and change all occurrences of "codes" to "code".

clang/docs/CPlusPlus20Modules.rst
677–679
ChuanqiXu added inline comments.Aug 8 2022, 2:16 AM
clang/docs/CPlusPlus20Modules.rst
677–679

Since it says in the future, if it is better to use may be or will be than are ?

h-vetinari added inline comments.Aug 8 2022, 2:44 AM
clang/docs/CPlusPlus20Modules.rst
677–679

Since it says in the future, if it is better to use may be or will be than are ?

The "we think" already contains built-in subjectiveness, so the "may be" is redundant in terms of uncertainty. It's not a big deal, but in general: either "they may be" or "we think they are".

It would also be possible to say something like:

So the final answer for why we don't reuse the interface of Clang modules for header units is that
there are some differences between header units and Clang modules and that ignoring those
differences now would likely become a problem in the future.

iains added a comment.Aug 8 2022, 2:49 AM

@h-vetinari you are right, this has become difficult to review - I will try and do some more later - just the one comment for now.

clang/docs/CPlusPlus20Modules.rst
677–679

Is it not simpler than this?

Since clang header modules have different semantics from c++20 header units, if we were to force c++20 semantics on clang header modules, that would break existing code (and therefore to support existing code, we would expect clang header modules to be retained indefinitely).

ChuanqiXu added inline comments.Aug 8 2022, 2:58 AM
clang/docs/CPlusPlus20Modules.rst
677–679

I think we need to express the concern about future changes. We may be able to handle the differences properly now.

h-vetinari added inline comments.Aug 8 2022, 3:27 AM
clang/docs/CPlusPlus20Modules.rst
65–68

This is marked as done, but is still unchanged in D131388