This is an archive of the discontinued LLVM Phabricator instance.

[C++20][Modules] Implement include translation.
ClosedPublic

Authored by iains on Jul 1 2022, 3:55 AM.

Details

Summary

This addresses [cpp.include]/7

(when encountering #include header-name)

If the header identified by the header-name denotes an importable header, it
is implementation-defined whether the #include preprocessing directive is
instead replaced by an import directive.

In this implementation, include translation is performed _only_ for headers
in the Global Module fragment, so:

module;
 #include "will-be-translated.h" // IFF the header unit is available.

export module M;
 #include "will-not-be-translated.h" // even if the header unit is available

The reasoning is that, in general, includes in the module purview would not
be validly translatable (they would have to immediately follow the module
decl and without any other intervening decls). Otherwise that would violate
the rules on contiguous import directives.

This would be quite complex to track in the preprocessor, and for relatively
little gain (the user can 'import "will-not-be-translated.h";' instead.)

TODO: This is one area where it becomes increasingly difficult to disambiguate
clang modules in C++ from C++ standard modules. That needs to be addressed in
both the driver and the FE.

Diff Detail

Event Timeline

iains created this revision.Jul 1 2022, 3:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 1 2022, 3:55 AM
iains requested review of this revision.Jul 1 2022, 3:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 1 2022, 3:55 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
iains added a subscriber: Restricted Project.
iains updated this revision to Diff 441871.Jul 2 2022, 1:13 AM

fix a state transition after include translation.

This update fixes an (apparently pre-existing) bug in handling:

module;
 #include "translated-header.h"
 import "header-unit.h";

The translation process pushes a tok::annot_module_include which was being
handled as "misc" in the state machine, leading to the wrong starting state
for processing the 'import' following. The change here treats this token
as if it were the ';' that is notionally injected as part of the translation.

Updated the test case to test cases where header units can be imported.

ChuanqiXu added inline comments.
clang/include/clang/Lex/Preprocessor.h
434

What's the meaning of ATLTS?

clang/lib/Lex/PPDirectives.cpp
2226–2227

From what @rsmith said in https://reviews.llvm.org/D113391, it looks like the ideal direction is to use C++ clang modules and C++ standard modules together. So it looks like we couldn't disambiguate them from command line options.

clang/test/Modules/cxx20-include-translation.cpp
11

Maybe we need an example h5.h which is not pre-compiled as a header unit and it would be handled correctly.

iains marked 3 inline comments as done.Jul 4 2022, 2:13 AM
iains added inline comments.
clang/include/clang/Lex/Preprocessor.h
434

AfterTopLevelTokenSeq I guess it seemed a rather long name (but I do not mind to spell it out if that makes sense).

clang/lib/Lex/PPDirectives.cpp
2226–2227

Well, I think there is some more debate to have around how to solve this problem (i.e. it might be intended that clang++ modules and standard c++ modules converge but as things stand we still need to support the cases that they have different behaviour, or break existing users)
... - but please let us not have that debate in this patch :-)

clang/test/Modules/cxx20-include-translation.cpp
11

I am not sure if you mean that the un-converted header would be included in the GMF or in the module purview - IMO we already have test-cases that cover this, but I do not mind adding something extra if there's a missing case?

ChuanqiXu added inline comments.Jul 4 2022, 2:30 AM
clang/include/clang/Lex/Preprocessor.h
434

Oh, I feel better to spell it out.

clang/lib/Lex/PPDirectives.cpp
2226–2227

It is not good to break existing users. Generally, a breaking change patch could be reverted directly... We must care about it to avoid unnecessary efforts. And it looks like the current implementation wouldn't break any existing users, right? Since it uses isHeaderUnit(). I remember HeaderUnit is introduced by you so it shouldn't conflict with clang modules.

BTW, may I ask the behavior is consistency with GCC?

2335

nit: It looks like we prefer to use C++20 modules over standard modules, although standard modules must be the right term.

clang/lib/Parse/Parser.cpp
672

I think we could ignore getLangOpts().ModulesTS here.

clang/test/Modules/cxx20-include-translation.cpp
11

I am not sure if you mean that the un-converted header would be included in the GMF or in the module purview

At least in the GMF.

IMO we already have test-cases that cover this, but I do not mind adding something extra if there's a missing case?

Yeah, I feel better to have more coverage.

iains updated this revision to Diff 442085.Jul 4 2022, 6:51 AM
iains marked 7 inline comments as done.

rebased, addressed review comments.

iains marked an inline comment as done.Jul 4 2022, 6:52 AM
iains added inline comments.
clang/lib/Lex/PPDirectives.cpp
2226–2227

It is not good to break existing users. Generally, a breaking change patch could be reverted directly... We must care about it to avoid unnecessary efforts. And it looks like the current implementation wouldn't break any existing users, right? Since it uses isHeaderUnit(). I remember HeaderUnit is introduced by you so it shouldn't conflict with clang modules.

correct, in this case, the fact that Header Units are specific to the C++20 implementation (they are quite different from clang header modules) allows us to tell the difference.

BTW, may I ask the behavior is consistency with GCC?

yes, I discussed this with @urnathan (especially that it is very difficult to get consistent behaviour if we were to include-translate in the module purview).

2335

since we are heading for C++23 ... perhaps now would be a good time to start using "standard modules"? (I can change to C++20 modules if there's objection).

clang/lib/Parse/Parser.cpp
672

well, we agreed that (for the present) we would try to avoid changing the behaviour w.r.t modules-ts (but spend no effort to complete the implementation) - and to remove it from comments; we can then take a pass through to remove the modules-ts behaviour (assuming that no-one is using it!)

ChuanqiXu accepted this revision.Jul 4 2022, 7:08 PM

Then LGTM if all the comments addressed.

clang/lib/Lex/PPDirectives.cpp
2226–2227

Got it. Thanks.

2335

It is OK to me. But I feel better if we use C++ standard modules since modules is a feature lives everywhere.

clang/lib/Parse/Parser.cpp
672

Given the previous status of C++ modules, it is hard to believe there is existing users for it. So I think it should be fine to remove the support for module-ts.

This revision is now accepted and ready to land.Jul 4 2022, 7:08 PM
iains updated this revision to Diff 442242.Jul 5 2022, 2:50 AM

addressed review comments.

This revision was automatically updated to reflect the committed changes.

Hi @iains, upstream Clang crashes on the attached test case due to an assertion failure. Git bisect pointed me to this commit. Can you please take a look? Previously, the test would result in a warning of incomplete umbrella header.

// RUN: rm -rf %t && mkdir %t
// RUN: split-file %s %t

//--- frameworks/FW.framework/Modules/module.modulemap
framework module FW {
  umbrella header "FW.h"
  module * { export * }
}
//--- frameworks/FW.framework/Headers/FW.h
#include "One.h"
//--- frameworks/FW.framework/Headers/One.h
//--- frameworks/FW.framework/Headers/Two.h

//--- module.modulemap
module Mod { header "Mod.h" }
//--- Mod.h
#include "FW/Two.h"
//--- from_module.m
#include "Mod.h"

// RUN: %clang -fmodules -fmodules-cache-path=%t/cache -iframework %t/frameworks -c %t/from_module.m -o %t/from_module.o
iains added a comment.Aug 17 2022, 4:27 AM

Hi @iains, upstream Clang crashes on the attached test case due to an assertion failure. Git bisect pointed me to this commit. Can you please take a look? Previously, the test would result in a warning of incomplete umbrella header.

yes, I can repeat this - will take a look.

iains added a comment.Aug 19 2022, 1:33 AM

Hi @iains, upstream Clang crashes on the attached test case due to an assertion failure. Git bisect pointed me to this commit. Can you please take a look? Previously, the test would result in a warning of incomplete umbrella header.

yes, I can repeat this - will take a look.

I landed https://github.com/llvm/llvm-project/commit/a2dd6130d49777d63c2d1b641bd8e56f26fa0822 to fix this (assuming that there is no new fallout) - do you think we should back port to llvm-15?

Thank you! Yes, I think this would be a good candidate for backporting.