This is an archive of the discontinued LLVM Phabricator instance.

[clang][ABI] New C++20 module mangling scheme
ClosedPublic

Authored by urnathan on Mar 22 2022, 12:50 PM.

Details

Reviewers
iains
dblaikie
Group Reviewers
Restricted Project
Commits
rGae4dce8659f3: [clang][ABI] New C++20 module mangling scheme
Summary

The existing module symbol mangling scheme turns out to be undemangleable. It is also desirable to switch to the strong-ownership model as the hoped-for C++17 compatibility turns out to be fragile, and we also now have a better way of controlling that.

The issue is captured on the ABI list at:

https://github.com/itanium-cxx-abi/cxx-abi/issues/134

A document describing the issues and new mangling is at: (sigh, I wish I cold get a stable name here)
https://drive.google.com/file/d/1XAR2RLzrpAoWGBQcPkxP33s8SwB2y4vb/view?usp=sharing

This patch reimplements the bits of mangling that we already had, it does not implement the other bits mentioned in that document such as global initializer functions or unnamed enumerations (which itself needs adjusting from the doc).

GCC implements this mangling.

The old mangling is unceremoniously dropped. No backwards compatibility, no deprectated old-mangling flag. It was always labelled experimental. (Old and new manglings cannot be confused.)

Diff Detail

Event Timeline

urnathan created this revision.Mar 22 2022, 12:50 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 22 2022, 12:50 PM
urnathan requested review of this revision.Mar 22 2022, 12:50 PM

Not exactly my place of expertise, but perhaps others with more modules familiarity can take a look.

clang/lib/AST/ItaniumMangle.cpp
481

I see one instance of us passing around a DeclContext as a nullptr, what is the justification/mechanism for that? I would expect EVERYTHING to have a declaration context, so I would expect this to be able to be a reference.

h-vetinari added inline comments.Mar 22 2022, 5:23 PM
clang/test/CXX/modules-ts/basic/basic.def.odr/p4/module.cppm
7–8

That comment looks outdated...

Nice to see you back to continue work on this! Due to this one is important. I would have tried to continue the work. The problem of the previous patch should be trivial.
Generally we would continue the work in the previous page. But it doesn't matter if we feel the new page is cleaner.
BTW, would you like to review my patches for modules? Your review opinion is really important. But you resigned from all my patches a couple weeks ago and I don't know the reason until now...

urnathan edited the summary of this revision. (Show Details)Mar 23 2022, 4:11 AM
iains added inline comments.Mar 23 2022, 8:37 AM
clang/lib/AST/ItaniumMangle.cpp
1070

So we mangle symbols in 'M.P.Q:B" as _ZW1MW1PW1Qnn.......

That seems slightly odd given that '.' is not significant in a C++20 modules name?
perhaps a necessity given that one doe not know if the target can process periods in symbol names...

urnathan marked an inline comment as done.Mar 24 2022, 5:50 AM
urnathan added inline comments.
clang/lib/AST/ItaniumMangle.cpp
481

The rest of the mangler prefers passing must-be-non-null pointers around, rather than use references. for instance:

void mangleTemplateParamDecl(const NamedDecl *Decl);
void mangleLambda(const CXXRecordDecl *Lambda);
void mangleNestedName(GlobalDecl GD, const DeclContext *DC,
                      const AbiTagList *AdditionalAbiTags,
                      bool NoFunction=false);

It seems more jarring to use a reference here -- particularly juggling between DeclContext ptr & reference.

WRT passing nullptr -- that's in the closure-prefix mangling

// A <closure-prefix> represents a variable or field, not a regular
// DeclContext, so needs special handling. In this case we're mangling a
// limited form of <nested-name>:

I suppose we could get the context out of the GD, but we know it is not relevant to the unqualified name mangling call, so seems like unnecessary work.

1070

Correct, we want mangled names to be alphanumeric, (and .suffix has also become a thing in mangled names). Treating each part separately allows us to have a substitution mechanism similar to the scope substitution mechanism already there.

(In case it wasn't clear, the earlier module mangling also had this property)

urnathan updated this revision to Diff 418180.Mar 25 2022, 4:04 AM
urnathan marked an inline comment as done.

rebase, fix stale test comment

urnathan marked an inline comment as done.Mar 25 2022, 4:04 AM
iains added a comment.Mar 27 2022, 7:00 AM

I have no further comments, this LGTM (and it would be nice to see it landed).
Have you WIP for the module initialiser mangling?

I have no further comments, this LGTM (and it would be nice to see it landed).
Have you WIP for the module initialiser mangling?

Yes. I have that ready to go as soon as there's a user.

iains added a comment.Mar 28 2022, 5:01 AM

I have no further comments, this LGTM (and it would be nice to see it landed).
Have you WIP for the module initialiser mangling?

Yes. I have that ready to go as soon as there's a user.

I am actively working on module initialisers - so there's certainly soon to be a user (and having a patch to test my work on would be really helpful).

dblaikie accepted this revision.Mar 29 2022, 12:51 PM
dblaikie added a subscriber: rsmith.

I have no further comments, this LGTM (and it would be nice to see it landed).

Good enough for me. While I realize there's some missing continuity between a lot of the previous modules work (@rsmith) & the recent work - and I really don't have enough context to provide really very nuanced review (though I do try to follow along and happy to be included) - don't let that slow you folks down, If between you the direction seems reasonable, in the absence of any active concerns from elsewhere, you should feel empowered to approve each other's patches/press forward.

This revision is now accepted and ready to land.Mar 29 2022, 12:51 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMar 30 2022, 8:51 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
MaskRay added a subscriber: MaskRay.EditedMay 18 2022, 10:25 AM

The differential has the same intention/subject as D118352. In such a case reusing the preexisting differential would have been better. I have left more comments on https://reviews.llvm.org/D125825#3522911

Note: a revert was committed (https://reviews.llvm.org/D118352#3368921) in 11 hours after the problem was reported.
It was fairly legitimate. The user pushing the revert helps maintain build bots green. (Some groups may be more aggressive, e.g. they may leave a comment and perform a revert in 10 minutes.
Technically that is also fine, but sometimes can be done better.)
In this particular case, I believe the problem was just a Mach-O platform differential that Mach-O IR output doesn't use dso_local . This is very common when adding new tests to test/clang/CodeGen* and contributors often neglect it.
If I had seen the failure in time, I would have tried fixing the test as it is quite straightforward, instead of reverting it.

(Thanks for adding #clang-language-wg since this was a patch that some folks would like to subscribe to.)

Additional notes: I'd add ( (@_ZW6ModuleE24noninline_module_linkagev( instead of @_ZW6ModuleE24noninline_module_linkagev) to check the full mangled name. The single character has very low overhead but makes the test stricter/more robust.