This is an archive of the discontinued LLVM Phabricator instance.

[C++20][Modules][HU 1/5] Introduce header units as a module type.
ClosedPublic

Authored by iains on Mar 7 2022, 2:22 AM.

Details

Summary

This is the first in a series of patches that introduce C++20 importable
header units.

These differ from clang header modules in that:
(a) they are identifiable by an internal name
(b) they represent the top level source for a single header - although that one might include or import other headers.

We name importable header units with the path by which they are specified
(although that need not be the absolute path for the file).

So "foo/bar.h" would have a name "foo/bar.h". Header units are made a
separate module type so that we can deal with diagnosing places where they
are permitted but a named module is not.

Diff Detail

Event Timeline

iains created this revision.Mar 7 2022, 2:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 7 2022, 2:22 AM
iains edited the summary of this revision. (Show Details)Mar 7 2022, 2:42 AM
iains added a comment.Mar 7 2022, 3:00 AM

This is a small patch set that implements the initial front end changes to support C++20 importable modules.
A second path series will follow that makes the driver changes to drive the fronted support.

This implementation makes the user-facing names, command line options and behaviour match the existing GCC implementation. There is no apparent engineering reason for divergence and this strategy minimises user learning curves and tooling differences.

1/5 - introduces the module type and build actions to compile them
2/5 - allows the user to specify that a header unit should be found from either the user or system search paths in force. In particular this avoids the user having to know the install path for system headers.
3/5 - handles emitting macros that are live at the end of the header unit TU, as required.
4/5 - handles pre-processed header unit sources
5/5 - introduces "-fdirectives-only" that can be combined with -E to produce pre-processor output that can be consumed as a second job to build an importable header (it can also be used to provide a scannable preprocesed file that has actioned any directives).

iains published this revision for review.Mar 7 2022, 4:39 AM

please see first comment for a description of the patch series.

Herald added a project: Restricted Project. · View Herald TranscriptMar 7 2022, 4:39 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
ChuanqiXu added inline comments.Mar 7 2022, 6:55 PM
clang/include/clang/Sema/Sema.h
2978–2980

From the implementation, I think it should be called only in ActOnStartOfTranslationUnit. So it would be better to move this function to private section to avoid accident calls. We should add such constraint as assumption or at least comment to tell it should be called by ActOnStartOfTranslationUnit.


The name ActOnStartOf* implies there should be a corresponding ActOnEndOf* methods. But there isn't one. So I would suggest to give another name to avoid ambiguity.

clang/lib/AST/Decl.cpp
1573–1574

How about

return InternalLinkage ? M->getTopLevelModule() : nullptr;
clang/lib/Parse/Parser.cpp
2476

The comment is not accurate. header unit import is pre-processor too. http://eel.is/c++draft/cpp.import

clang/lib/Sema/SemaModule.cpp
519

I would feel better if we add an assertion below to assert ModuleScopes.back().Module->isGlobalModule() is true only if Mod is Header Unit.

iains updated this revision to Diff 414888.Mar 12 2022, 3:16 PM
iains marked 4 inline comments as done.

rebased, address review comments.

clang/include/clang/Sema/Sema.h
2978–2980

From the implementation, I think it should be called only in ActOnStartOfTranslationUnit. So it would be better to move this function to private section to avoid accident calls. We should add such constraint as assumption or at least comment to tell it should be called by ActOnStartOfTranslationUnit.

OK, I've made this private and updated the comment to note that this is a helper for ActOnStartOfTranslationUnit


The name ActOnStartOf* implies there should be a corresponding ActOnEndOf* methods. But there isn't one. So I would suggest to give another name to avoid ambiguity.

Is this a rule?

I think that the name ActOnStartOfHeaderUnit() says exactly what we are doing, of course, at some time we might need an ActOnEndOfHeaderUnit() - but we should not add an empty function for that reason.

(this is not a sticking point; if consensus is that the name is confusing, I will change it).

clang/lib/AST/Decl.cpp
1573–1574

that would alter the behaviour of the existing code.

getTopLevelModule() will return the ultimate parent:

const Module *Result = this;
while (Result->Parent)
  Result = Result->Parent;

where the existing code only returns the immediate parent (perhaps that is unintended, but it should be fixed separately if so).

clang/lib/Parse/Parser.cpp
2476

The comment is not accurate. header unit import is pre-processor too.

the pre-processor 'import' is a pre-processor directive.

https://eel.is/c++draft/cpp.pre#1

I amended to clarify that we cannot import a named module in this position - only a header unit.

clang/lib/Sema/SemaModule.cpp
519

OK, done.
I also added a check for CPlusPlus modules, since modules-ts has an implicit GMF and slightly different rules.

ChuanqiXu accepted this revision.Mar 13 2022, 7:14 PM

LGTM with suggested changes.

clang/include/clang/Sema/Sema.h
2978–2980

I don't find the rule that there must be a pair for ActOnStartOf* and ActOnEndOf*. I just find almost all other methods does so. For the consistency problem, I would still suggest to change the name to something else.

clang/lib/AST/Decl.cpp
1573–1574

I am sure that in C++20 modules, there would be 2 levels of hierarchy at most. I am OK to fix it later (Hope we don't forget it.)

This revision is now accepted and ready to land.Mar 13 2022, 7:14 PM
iains updated this revision to Diff 416841.Mar 21 2022, 1:31 AM

rebased

iains updated this revision to Diff 418041.Mar 24 2022, 2:07 PM

rebased, renamed helper method.