This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Increase readability of typeinfo comparison of ARM64
ClosedPublic

Authored by ldionne on Mar 2 2021, 1:23 PM.

Details

Summary

We wasted a good deal of time trying to figure out whether our implementation was correct. In the end, it was, but it wasn't so easy to determine. This patch dumbs down the implementation and improves the documentation to make it easier to validate.

See https://lists.llvm.org/pipermail/libcxx-dev/2020-December/001060.html.

Diff Detail

Event Timeline

ldionne created this revision.Mar 2 2021, 1:23 PM
ldionne requested review of this revision.Mar 2 2021, 1:23 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 2 2021, 1:23 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript
ldionne planned changes to this revision.Mar 2 2021, 1:24 PM

This is a WIP:

  1. I'm not sure whether this change is actually correct, or whether we are mis-interpreting the ABI and the previous implementation was actually correct.
  2. I don't know how to test this yet.

I created a patch because I wanted to avoid forgetting about this message on the mailing list forever. I'll need to do some investigation before I can make progress on this.

ldionne added a subscriber: sberg.Mar 2 2021, 1:25 PM
miyuki added a subscriber: miyuki.Mar 3 2021, 1:52 AM
miyuki removed a subscriber: miyuki.

You're misinterpreting the condition. The "is non-unique" bit says that the RTTI is non-unique and thus its identity is string-based; otherwise it is pointer-based. A uniquely-emitted type is never the same as a non-uniquely-emitted type.

Usually people who are having problems with this are messing up type visibility so that it's inconsistent between libraries.

ldionne updated this revision to Diff 334532.Mar 31 2021, 2:10 PM

Updating according to a discussion with John McCall

ldionne edited the summary of this revision. (Show Details)Mar 31 2021, 2:11 PM
smeenai added inline comments.Mar 31 2021, 2:47 PM
libcxx/include/typeinfo
158–162

This reads a little confusingly, because the "Otherwise, if at least one of the RTTIs can't be assumed to be unique bit" makes you think that the case where on RTTI is unique and one isn't will do the deep string comparison, but then the next sentence changes the interpretation of that case. Maybe say "if both RTTIs can't be assumed to be unique", since the last sentence already handles the non-unique + unique case?

163–169

Thanks for explaining the motivation of the design and when it kicks in. I'd been casually curious about this before, and it's a neat design :)

257–258

I don't understand this. My reading of the two parameter __is_type_name_unique function was that it would return true if either typeinfo was unique, and that's also what you're doing in line 266 below. Over here, if both typeinfos are unique, this condition won't kick in, so wouldn't we incorrectly fall through to the strcmp below?

Also a nit: LLVM code style recommends not using an else after a return: https://llvm.org/docs/CodingStandards.html#don-t-use-else-after-a-return. I don't know if libc++ normally does this differently.

rjmccall added inline comments.Mar 31 2021, 3:06 PM
libcxx/include/typeinfo
151

"compiler"

162

This is not correct. We should only perform a string comparison if *both* RTTIs are non-unique. I still don't think this is a runtime bug.

ldionne updated this revision to Diff 334551.Mar 31 2021, 3:14 PM
ldionne marked 3 inline comments as done.
ldionne edited the summary of this revision. (Show Details)

Address comments.

libcxx/include/typeinfo
158–162

Actually, that isn't what I meant to write. I meant:

When comparing type_infos, if both RTTIs can be assumed to be unique, it suffices to compare their addresses. If both the RTTIs can't be assumed to be unique, we must perform a deep string comparison of the type names. However, if one of the RTTIs is guaranteed unique and the other one isn't, then both RTTIs are necessarily not to be considered equal.

Thanks for catching that!

curdeius added inline comments.
libcxx/include/typeinfo
258–262

To be consistent with __lt and so that it reads easier, I propose negating the condition.
Unless you think that the codegen can be worse.

ldionne updated this revision to Diff 334673.Apr 1 2021, 6:43 AM

Address Marek's comment. Turns out this patch is basically a NFC, just making
the code dumber and hence easier to follow.

ldionne retitled this revision from [libc++] Fix incorrect typeinfo comparison on ARM64 to [libc++] Increase readability of typeinfo comparison of ARM64.Apr 1 2021, 6:43 AM
ldionne edited the summary of this revision. (Show Details)
curdeius accepted this revision as: curdeius.Apr 1 2021, 7:20 AM

LGTM. You might want add "NFC" to the commit message.

ldionne accepted this revision as: Restricted Project.Apr 1 2021, 1:37 PM
This revision is now accepted and ready to land.Apr 1 2021, 1:37 PM
This revision was landed with ongoing or failed builds.Apr 1 2021, 1:38 PM
This revision was automatically updated to reflect the committed changes.
rsmith added a subscriber: rsmith.Apr 7 2021, 1:39 PM
rsmith added inline comments.
libcxx/include/typeinfo
265–269

This is not an ordering relation. For example, we could have non-unique C < unique B < non-unique A < non-unique C, where the first and second comparisons are address comparisons, and the third comparison is a string comparison.

I think perhaps something like this would work:

bool __lhs_unique = __is_type_name_unique(__lhs);
if (__lhs_unique != __is_type_name_unique(__rhs))
  return __lhs_unique;
if (__lhs_unique)
  return __lhs < __rhs;
return __builtin_strcmp(__type_name_to_string(__lhs), __type_name_to_string(__rhs)) < 0;

(That is: order all unique typeinfos before all non-unique ones, then order unique typeinfos by pointer and non-unique ones by string.)

Wow, yes, you're absolutely right. I think we can probably get away with fixing that without worrying about ODR problems because type_info ordered comparison is so uncommon.