This is an archive of the discontinued LLVM Phabricator instance.

Fix a couple of mangling canonicalizer corner case bugs.
ClosedPublic

Authored by rsmith on Aug 29 2018, 3:54 PM.

Details

Summary

The hash computed for an ArrayType was different when first constructed
versus when later profiled due to the constructor default argument, and
we were not tracking constructor / destructor variant as part of the
mangled name AST, leading to incorrect equivalences.

Diff Detail

Repository
rL LLVM

Event Timeline

rsmith created this revision.Aug 29 2018, 3:54 PM
erik.pilkington accepted this revision.Sep 4 2018, 11:45 AM

Huh. I think there may be some more (many?) similar bugs to this. Possible testcase: _Z1fI1SEiS0_T_ and _Z1fI1SEiS0_S0_. I think these will be considered equivalent to the canonicalizer because their representation is structurally equivalent, but they are in fact distinct. I think we can fix this with extra indirection. Finding and fixing these as-needed is probably fine, but something to keep in mind.

Otherwise, LGTM!

This revision is now accepted and ready to land.Sep 4 2018, 11:45 AM

I think there may be some more (many?) similar bugs to this. Possible testcase: _Z1fI1SEiS0_T_ and _Z1fI1SEiS0_S0_. I think these will be considered equivalent to the canonicalizer because their representation is structurally equivalent, but they are in fact distinct.

Yes, you're right; I think we should be preserving the fact that the type was named via a reference to a specific template parameter in the AST, since that's part of the semantics of the mangled name, and clients that aren't demanglers will sometimes care about the difference. (Also, I would like the demangler to have a mode where it /doesn't/ substitute template parameters for their values, as that's basically never what I personally want when demangling a name.)

Doing that has some complications: if we say that 1XIiE and 1YIiE are the same, does that mean that 1YIT_E is in the same equivalence class too, in the case where T_ is i? (I think the answer should be "no", but this does not seem obvious.)

The good news is that this patch is sufficient to ensure that all interesting function and variable manglings in at least clang and llvm demangle to non-equivalent trees. (We still have duplicate ASTs for virtual function thunks, but I'm certainly happy to defer addressing that until we have a need.)

This revision was automatically updated to reflect the committed changes.