This is an archive of the discontinued LLVM Phabricator instance.

[clang] Draft: Implement P1703R1
AbandonedPublic

Authored by tbaeder on Apr 1 2022, 2:14 AM.

Details

Summary

As in: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1703r1.html

Posting this mostly to get some feedback on the general path taken.

From reading the paper, I understand that the preprocessor should gain a "import-keyword" token, but that is already taken by the GNU #import extension.

The implementation here starts converting newlines to eod tokens when seeing a export or import token. However, that means ignoring the eod tokens when parsing an export block, an exported declaration or a non-header-unit module import.

Let me know what you think.

Diff Detail

Event Timeline

tbaeder created this revision.Apr 1 2022, 2:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 1 2022, 2:14 AM
tbaeder requested review of this revision.Apr 1 2022, 2:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 1 2022, 2:14 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

I can;t really tell what you're doing. C++20 lexes import (&module) declarations as preprocessor directives. the parser must also recognize only such declarations that came through from lexing such directives. It is unrelated to #import. GCC's implementation drops the preprocessor into directive mode if it sees 'module' or 'import' (optionally preceeded by 'export') at the start of a line[1]. Those keywords are converted to unspellable internal tokens that the parser recognizes to parse module and import declarations. An end-of-pragma token is injected at the end-of-line, and again the parser expects that after the terminating ';'. (it's leveraging the pragma machinery)

[1] it also peeks the next non-white-space token for '<', '", or <begin-identifier-char>, as specified in the std.

occurrences of 'module' and 'import' that are not lexed that way go through as regular identifier tokens, not keywords.

oh, also in 'import "bob";', "bob" needs to be lexed as a #include name, not a string. (and likewise 'import <bob>;') Hence treating these lines in the preprocessor in directive-processing-mode was the way GCC went.

clang/test/Modules/P1703R1.cpp
49–50

This confuses me. That's a stray 'export' token, followed by a well-formed import-directive

52

that second 'import' is not a token, it's an identifier. This is an ill-formed import-directive with unexpected tokens afte the first ';' You seem to be dropping into lex-import-mode not at the start of a line?

72

what is meant by 'normal'? many of these look ill-formed to me.

87

yup :) gcc tries to tell the user about the single logical-line requirement:

   inform (token->location, "perhaps insert a line break, or other"
	      " disambiguation, to prevent this being considered a"
	      " module control-line");

the expectation here is that this was pre c++20 code that fell into this lexing trap. Specifically things like:

template<typename T> class module {...};

module<mytype> x;

urnathan added a comment.EditedApr 4 2022, 5:53 AM

Ah, I'd had a thinko about what 1703 was. that's superseded by <strike>p1757</strike>wg21.link/p1857. That's what is needed. (your patch makes more sense in the 1703 context)

Ah, I'd had a thinko about what 1703 was. that's superseded by <strike>p1757</strike>wg21.link/p1857. That's what is needed. (your patch makes more sense in the 1703 context)

Ha, okay. Yes, that makes more sense. Both papers are linked in https://clang.llvm.org/cxx_status.html. Sorry for the confusion, I'll have a look at 1857R3 instead.

BTW, I think it is helpful to add @iains's patches as parent versions to avoid conflicting.

What patches are you talking about exactly?

What patches are you talking about exactly?

I mean D121588 D121589 D121590 D121591 D122119

iains added a comment.Apr 6 2022, 3:07 AM

FWIW probably these 5 patches are pretty independent of changes to the preprocessor...
... the first four of those are extremely unlikely to clash - since they are concerned with the driver.

urnathan requested changes to this revision.Apr 6 2022, 11:12 AM

Either update this when you're ready, or abandon and create a new one for 1857.

This revision now requires changes to proceed.Apr 6 2022, 11:12 AM
tbaeder abandoned this revision.Oct 7 2022, 5:09 AM