This is an archive of the discontinued LLVM Phabricator instance.

[MS ABI] Fix mangling references to declarations.
ClosedPublic

Authored by bolshakov-a on Mar 19 2023, 11:03 AM.

Details

Summary

Several issues have been discovered and (hopefully) fixed here:

  • Reference NTTPs should be mangled in the same manner as pointer ones.
  • Pointer fields of class type NTTPs should be treated in the same

manner as reference ones.

  • Pointer-to-member fields of class type NTTPs should be treated

differently compared to pointer-to-member NTTPs. Tests on
pointer-to-member-function NTTP class fields added.

  • Correct mangling of pointers to anonymous union members.
  • A bug in mangling references to subobjects fixed.
  • Mangling array subscripts and base class members in references

to subobjects.

Reference NTTP mangling was done back in 2013
in e8fdc06e0dab2e7b98339425dbe369e27e2092a3, and Microsoft might change
mangling algorithm since then. But class type NTTPs are introduced only
in C++20, and the test was written in b637148ecb62b900872b34eedd78b923bb43c378.
It is strange if the MS ABI had been realy changed, because Microsoft
claims that they maintain ABI stability since VS 2015. I've tested both
on v142 and v143 MSVC toolsets, and they show the same behavior
on the test cases which are changed in this PR. But
pointer-to-member-function NTTP class field mangling has been actually
changed, because it was erroneous in v142, leading to name collisions.

Moreover, pointer-to-member mangling with conversions across class
hierarchy has been enabled.

Diff Detail

Event Timeline

bolshakov-a created this revision.Mar 19 2023, 11:03 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 19 2023, 11:03 AM
bolshakov-a requested review of this revision.Mar 19 2023, 11:03 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 19 2023, 11:03 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Added the code owners here. There doesn't seem to be anything questionable here to me, but I want to make sure someone who really knows what is going on takes a look.

efriedma added inline comments.
clang/lib/AST/MicrosoftMangle.cpp
723

It's not obvious to me why the inheritance model is relevant here. If you want to check if the class has virtual bases, please do that explicitly. (Note that the inheritance model can be messed with using attributes; see https://learn.microsoft.com/en-us/cpp/cpp/inheritance-keywords.)

1875

It might be more clear here to use APValue::isNullPointer(), instead of checking if getMemberPointerDecl() returns null. At first glance, it's not really obvious why getMemberPointerDecl() would return null. And in theory, it's possible to have a non-null APValue where getMemberPointerDecl() returns null (although in this context, such cases should get rejected earlier).

rjmccall added inline comments.Mar 20 2023, 5:36 PM
clang/lib/AST/MicrosoftMangle.cpp
723

It would at least be good to test what exactly this is based on.

If RD is the base class of the member pointer type, isn't RD followed by the unqualified name of VD potentially ambiguous when there's any inheritance at all? You don't even need multiple inheritance, just the presence of multiple classes in the hierarchy that declare a field with the same name. I mean, if that's what MSVC does then we need to match it, but we should make certain by testing member pointers at different positions in the hierarchy of the declared base.

(And this is assuming the MSVC model that restricts what subclass member pointers are allowed in superclass member pointer types.)

Add test cases on unspecified inheritance.

Mangle pointer-to-members with conversion.

Pointer-to-member-functions with conversions across hierarchy work fine, as I can see.

I don't still understand how to mangle nested unnamed tags in general.

clang/lib/AST/MicrosoftMangle.cpp
723

@efriedma, MS inheritance model overrides real inheritance. I've added a couple of test cases to reflect this. Btw, thanks for the link! I really didn't understand what "unspecified inheritance" is and how it is even possible.

@rjmccall, thank you! I've updated the mangling to a more correct algorithm.

1875

APValue::isNullPointer() works only with LValues, not with MemberPointers:
https://github.com/llvm/llvm-project/blob/llvmorg-17-init/clang/lib/AST/APValue.cpp#L1000

bolshakov-a edited the summary of this revision. (Show Details)Mar 26 2023, 12:02 PM

I don't still understand how to mangle nested unnamed tags in general.

According to some quick experiments, for the non-virtual case, you mangle a member of an unnamed union it the same way as a regular member, except you stick <unnamed-tag>@ into the mangling. Additional levels of nesting append 3's: <unnamed-tag>@3, <unnamed-tag>@33, etc.


The mangling with virtual inheritance seems pretty broken. The following produces a name collision on MSVC:

#pragma pointers_to_members(full_generality, virtual_inheritance)
struct Nested { int a; union { int k; int k2;}; };
struct DerivedVirtually : Nested { int a; };
struct D2 { int DerivedVirtually::*p; };
template<D2> void f() {}
template void f<D2{&Nested::k}>();
template void f<D2{&Nested::k2}>();

The following crashes MSVC:

struct A {};
struct Nested { int a; };
struct DerivedVirtually : virtual A, Nested { };
struct D2 { int DerivedVirtually::*p; };
template<D2> void f() {}
template void f<D2{&Nested::a}>();
clang/lib/AST/MicrosoftMangle.cpp
1250

maybe say "anonymous struct/union member pointer", if you don't implement this.

Additional levels of nesting append 3's: <unnamed-tag>@3, <unnamed-tag>@33, etc.

Yes, but if you have such a code:

struct A {
  union {
    union {
      struct B {};
      using T = B;
    };
  };
};
void f(A::T);

you have ?f@@YAXUB@<unnamed-tag>@2A@@@Z for f, and for such a code:

struct A {
  union {
    union {
      struct B { struct C {}; };
      using T = B::C;
    };
  };
};
void f(A::T);

you'd have ?f@@YAXUC@B@<unnamed-tag>@3A@@@Z. What are 2 and 3 in general? I don't know whether the aforementioned code is acceptable according to the standard, honestly. But if I put a code into the common place, I should figure out the common algorithm.

bolshakov-a added inline comments.Mar 26 2023, 7:27 PM
clang/lib/AST/MicrosoftMangle.cpp
1250

It is not specific only to member pointers, as the examples in my previous comment show.

The numbers are backreferences of the sort generated by mangleSourceName(), I think. If you nest deeply enough, MSVC stops using them; for example:

struct A {
union {union { union {union {
struct B {struct C {struct D {struct E {struct F {struct G {
struct H {struct I {struct J {struct K {
};};};};};};};};};};
using T = B::C::D::E::F::G::H::I::J::K;
T x;
using T2 = B::C::D::E::F::G::H::I::J;
T2 y;
};};};};
};
void f(decltype(A::x)) {}
void f2(decltype(A::y)) {}
void *g = (void*)&f;
void *g2 = (void*)&f2;
bolshakov-a edited the summary of this revision. (Show Details)

Mangling of anonymous unions. @eli.friedman, thank you!

I don't know whether the aforementioned code is acceptable according to the standard, honestly.

It is not acceptable because anonymous unions cannot contain nested types. In fact, they cannot contain even other anonymous unions, according to the C++ standard:
https://timsong-cpp.github.io/cppwp/n4868/class.union.anon#1
But clang and MSVC both accept such code, probably because recursively nested anonymous structs and unions are allowed in C17. Hence, this case is handled for generality, but I'm not sure if this behavior should be nailed down in a test.

The following produces a name collision on MSVC

Clang reproduces this buggy behavior. Should it produce an explicit error instead? Or, maybe, some temporary correct algorithm should be found? (It is pretty sad if clang cannot provide new language features for Windows programs just because MSVC doesn't provide them...)

bolshakov-a edited the summary of this revision. (Show Details)

Fix mangling of references to subobjects.

bolshakov-a edited the summary of this revision. (Show Details)

Mangle array subscripts and base class members in references to subobjects.

Minor refactoring. Don't hide function parameter with local variable.

efriedma added inline comments.Apr 27 2023, 2:48 PM
clang/lib/AST/MicrosoftMangle.cpp
1250

Weird indentation

1816

Please just use SmallVector instead of std::stack.

bolshakov-a added inline comments.Apr 28 2023, 12:52 PM
clang/lib/AST/MicrosoftMangle.cpp
1250

I agree. Don't know why clang-format does so. Should I put clang-format off here?

efriedma added inline comments.Apr 28 2023, 12:58 PM
clang/lib/AST/MicrosoftMangle.cpp
1250

If clang-format won't cooperate, you can probably make it figure out what you're doing with something like

if (const auto *ID = dyn_cast<IndirectFieldDecl>(ND)) {
  for (unsigned I = 1, IE = ID->getChainingSize(); I < IE; ++I)
    mangleSourceName("<unnamed-tag>");
}
bolshakov-a added inline comments.Apr 28 2023, 1:03 PM
clang/lib/AST/MicrosoftMangle.cpp
1250

It doesn't help. Moreover, clang-format corrupts formatting in this function even without my changes. You can check it by yourself.

efriedma added inline comments.Apr 28 2023, 1:16 PM
clang/lib/AST/MicrosoftMangle.cpp
1250

I just tried "clang-format -i" on the file with your patch, and it seems to compute the correct indentation here. This with a source tree and clang-format binary from feb 28 because that's what I had handy... but I don't think anything relevant has changed here?

bolshakov-a added inline comments.Apr 28 2023, 2:09 PM
clang/lib/AST/MicrosoftMangle.cpp
1250

Running clang-format on the whole file really seems to produce correct formatting... But it is a lot of unrelated changes. Running clang-format only on the mangleNestedName function (--lines=1185:1287 without my changes, --lines=1242:1349 with my changes) produces bad formatting.

1816

std::stack can be parameterized with container type (std::stack<char, SmallVector<char, 2>>). Or it should not be used for some other reason?

bolshakov-a added inline comments.Apr 28 2023, 2:17 PM
clang/lib/AST/MicrosoftMangle.cpp
1250

I could make a separate PR with formatting this file and fix this PR after landing that one.

efriedma added inline comments.Apr 28 2023, 2:26 PM
clang/lib/AST/MicrosoftMangle.cpp
1250

If clang-format --lines is broken, just fix the formatting by hand (and file a bug against clang-format).

1816

Using std::stack as a wrapper around SmallVector is sort of pointless; you can just explicitly push_back/pop_back.

SmallVector instead of std::stack; fixing formatting.

bolshakov-a added inline comments.Apr 30 2023, 6:36 AM
clang/lib/AST/MicrosoftMangle.cpp
1250

Manual formatting without clang-format off doesn't pass pre-merge checks, of course. But, probably, it is not very important for landing?

I've played for a while with clang-format. clang-format --lines just reproduces indentation from previous block of code (switch in mangleUnqualifiedName in this case). I'm not sure whether it is a bug or a feature.

efriedma accepted this revision.May 1 2023, 12:44 PM

LGTM

(Do you have commit access? If not, please specify the name/email you want for the "author" field.)

This revision is now accepted and ready to land.May 1 2023, 12:44 PM

No, I don't have commit access. You could use just Bolshakov <bolsh.andrey@yandex.ru>

This revision was landed with ongoing or failed builds.May 3 2023, 6:20 PM
This revision was automatically updated to reflect the committed changes.