This is an archive of the discontinued LLVM Phabricator instance.

[C++20][Modules][Driver][HU 1/N] Initial handling for -xc++-{system,user}-header.
ClosedPublic

Authored by iains on Mar 14 2022, 4:17 AM.

Details

Summary

This adds file types and handling for three input types, representing a C++20
header unit source:

  1. When provided with a complete pathname for the header.
  2. For a header to be looked up (by the frontend) in the user search paths
  3. For a header to be looked up in the system search paths.

We also add a pre-processed file type (although that is a single type, regardless
of the original input type).

These types may be specified with -xc++-{user,system,header-unit}-header xxxx.

These types allow us to disambiguate header unit jobs from PCH ones, and thus
we handle these differently from other header jobs in two ways:

  1. The job construction is altered to build a C++20 header unit (rather than a PCH file, as would be the case for other headers).
  2. When the type is "user" or "system" we defer checking for the file until the front end is run, since we need to look up the header in the relevant paths which are not known at this point.

Diff Detail

Event Timeline

iains created this revision.Mar 14 2022, 4:17 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 14 2022, 4:17 AM
iains added a subscriber: Restricted Project.Mar 14 2022, 4:28 AM
iains published this revision for review.Mar 14 2022, 4:34 AM

This is a set of 4 patches that adds the driver-side support for C++20 header modules.

We use the same user-facing options as the existing implementation in GCC (there seems no engineering reason to deviate from this and it makes things easier for users and build system alike).

1/4 - this patch introduces file types for header unit headers to be searched for in the user or system paths (or to be given an absolute pathname).
2/4 - adds the -fmoduke-header{.=} command line options
3/4 - is a user-convenience change that allows "foo.h" without complaining about C code used in C++ compilations (since we know that the user has already specified the intent).
4/4 - adds the -fdirectives-only command that produces pre-processor output which retains the macro definitions/undefs that will be present in the built header unit.

Herald added a project: Restricted Project. · View Herald TranscriptMar 14 2022, 4:34 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
ChuanqiXu accepted this revision.Mar 14 2022, 8:05 PM

LGTM. Although I don't like these option names, it is more important to keep consistent with GCC.

This revision is now accepted and ready to land.Mar 14 2022, 8:05 PM
iains added a comment.Mar 15 2022, 2:14 AM

note: I do not plan to fix the formatting issue in clang/lib/Driver/Types.cpp, since I am adding one line and the format change would mean ≈ 110 lines of changes.

note: I do not plan to fix the formatting issue in clang/lib/Driver/Types.cpp, since I am adding one line and the format change would mean ≈ 110 lines of changes.

It is odd. I thought all things could be handled by clang-format-diff. @MyDeveloperDay could you help to take a look?

Sorry, I'm pretty ignorant in this area but based on

The job construction is altered to build a C++20 header unit (rather than a PCH file, as would be the case for other headers).

I want to clarify. The goal is to allow mixing PCH and PCM files, right? I haven't double checked and maybe that have changed already but I think -x c++-header started building .pcm instead of .pch with -std=c++20. That's where my confusion about mixing PCH and PCM comes from.

clang/include/clang/Driver/Types.def
67–71

Sorry, it's not really related to your change but do you have a rule where "ii" should go? It's just we have both "mii" and "iim" and I want to make sure it should be "iih" and not "hii". I haven't tried to find a pattern here myself, asking you first.

clang/lib/Frontend/FrontendOptions.cpp
30–31

Given the other branches in this StringSwitch I would expect .Cases("iih", "iim", InputKind(Language::CXX).getPreprocessed()). Is there a reason not to do that (like differences between iih and iim) or is it accidental?

clang/test/Driver/cxx20-header-units-01.cpp
8

What should happen in case of inconsistencies like %clang++ -### -std=c++20 -xc++-system-header %S/Inputs/header-unit-01.hh? Or -xc++-system-header foo.h?

iains marked 3 inline comments as done.EditedMar 16 2022, 1:56 AM

Sorry, I'm pretty ignorant in this area but based on

The job construction is altered to build a C++20 header unit (rather than a PCH file, as would be the case for other headers).

I want to clarify. The goal is to allow mixing PCH and PCM files, right? I haven't double checked and maybe that have changed already but I think -x c++-header started building .pcm instead of .pch with -std=c++20. That's where my confusion about mixing PCH and PCM comes from.

From my understanding of the modules group objectives:

The (my at least understand of) objective is that when the user produces a C++20+ command line with no contradicting options, then the output would be standardised C++20 modules.

As of this moment, clang modules and C++20 modules have some divergence (I understand that the goal is to unify - but that will take some time, I expect).

If the user wants clang modules, then they should add "-fmodules" to the command line - which would switch off C++20 modules and produce a PCH header module as before 9perhaps built from several headers, where a C++20 HU only reflects one header),

Perhaps we cannot make this objective yet - but I can say that the current choices in this code do not produce any regressions in the testsuite - so that is some indication that it could be possible, right now.

There is a quite along discussion on this in https://reviews.llvm.org/D120540

clang/include/clang/Driver/Types.def
67–71

Sorry, it's not really related to your change but do you have a rule where "ii" should go? It's just we have both "mii" and "iim" and I want to make sure it should be "iih" and not "hii". I haven't tried to find a pattern here myself, asking you first.

My understanding is this:

ObjectiveC/C++ append "I" and "ii" (mi and mii)

C++ Modules-related code pre-pends "ii".

so that a pre-processed header unit ==> "iih"
and a pre-processed module ==> "iim".

clang/lib/Frontend/FrontendOptions.cpp
30–31

accidental on someone's part, I expect .. but pre-existing
(I have not [intentionally, at least] modified the behaviour of non-header unit module in this patch set)

clang/test/Driver/cxx20-header-units-01.cpp
8

If the user states that %S/Inputs/header-unit-01.hh is a system header, I do not think that the driver is in a position to argue?

-xc++-system-header foo.h again the user has made an explicit statement..

I suppose that we can always diagnose these things with warnings - but I do not think we can easily reject them (and we risk producing unhelpful output).

iains marked 3 inline comments as done.Mar 16 2022, 4:19 AM
iains added inline comments.
clang/lib/Frontend/FrontendOptions.cpp
30–31

hmm.. sorry I misread .. I will update this to use Cases.

iains updated this revision to Diff 416332.Mar 17 2022, 2:47 PM

adress review comments, adjust test case for windows compatibility.

Thanks for explaining the desired interplay between PCH and PCM. The problem I'm seeing is that it can be hard to find correct compiler flags to achieve the desired result (what is enabled and what is disabled). But that is out of scope for this change.

clang/include/clang/Driver/Types.def
67–71

Thanks for the explanation. Now I see it should be "iih" and not "hii". One could argue that modular-related stuff can be expressed by appending "m", like ".ii" -> ".iim". But I don't know what that would mean for PP_CXXHeaderUnit ("iihm"?) and I think "iih" is reasonable.

clang/test/Driver/cxx20-header-units-01.cpp
8

What is the intended use of -xc++-user-header and -xc++-system-header? Are these supposed to be commonly-used flags or as an escape hatch of some sort? Because if it is commonly-used, I have concerns about usability that forces users to track user and system headers correctly, and duplicates the information already encoded in header search paths.

Also I think it is out of scope but for usability it is important to know the consequences of using a wrong flag and how one can diagnose it. Have no idea what you've already planned in this area, just mentioning it because I'm scarred by inscrutable clang behavior.

iains updated this revision to Diff 416364.Mar 17 2022, 5:41 PM

adjust testcase path separators

iains added inline comments.Mar 17 2022, 5:55 PM
clang/test/Driver/cxx20-header-units-01.cpp
8

These are "generally internal" marking to be specific about how files should be treated.

The usual user-facing commands would be -fmodule-header={user, system}

So the answer is no, they would not be commonly used - pretty much the same as -x c++ or so, only necessary if the file type is ambiguous to the driver in some way.

If the user wants a specific header to be searched for in the user or system paths, then it is reasonable that they have to indicate this. Actually, probably the largest benefit is that the user would not need to know where system headers were stored -- so when trying to build a header unit for, say, 'vector' it would greatly simplify the process,

I hope that the behaviour should not be in any way inscrutable - there are some things that the user (or build system) has to communicate to the driver, it cannot deduce them (and where to search for headers is in that category).


the consequences of using the wrong flag in this case would be a failure to find the header file (the diagnostic would be emitted by the FE rather than the driver, but that would be transparent to a regular user). The logic here is that using the lookup paths is not a job for the driver, because the FE already knows and has mechanisms for it.

iains added a comment.Mar 17 2022, 6:02 PM

Thanks for explaining the desired interplay between PCH and PCM. The problem I'm seeing is that it can be hard to find correct compiler flags to achieve the desired result (what is enabled and what is disabled). But that is out of scope for this change.

IMHO, that is an unfortunate consequence of having ≈ 60+ module-related command line flags; we do need to rationalise this. This is being discussed elsewhere [D120540] (and probably it's better to leave that conversation in one place).

vsapsai accepted this revision.Mar 17 2022, 6:47 PM

Thanks for your patience and for all the explanations, I have no other comments.

iains updated this revision to Diff 416853.Mar 21 2022, 1:38 AM

rebased.

This revision was landed with ongoing or failed builds.Apr 22 2022, 1:25 AM
This revision was automatically updated to reflect the committed changes.