
Details

Reviewers

rjmccall
curdeius

Group Reviewers

Restricted Project

Commits

rG17095dc86111: [libc++][NFC] Increase readability of typeinfo comparison of ARM64

Summary

We wasted a good deal of time trying to figure out whether our implementation was correct. In the end, it was, but it wasn't so easy to determine. This patch dumbs down the implementation and improves the documentation to make it easier to validate.

See https://lists.llvm.org/pipermail/libcxx-dev/2020-December/001060.html.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

ldionne created this revision. Mar 2 2021, 1:23 PM

Herald added subscribers: jkorous, kristof.beyls. Mar 2 2021, 1:23 PM

ldionne requested review of this revision. Mar 2 2021, 1:23 PM

Herald added a project: Restricted Project. Mar 2 2021, 1:23 PM

Herald added a reviewer: Restricted Project.

Herald added a subscriber: libcxx-commits.

This is a WIP:

I'm not sure whether this change is actually correct, or whether we are mis-interpreting the ABI and the previous implementation was actually correct.
I don't know how to test this yet.

I created a patch because I wanted to avoid forgetting about this message on the mailing list forever. I'll need to do some investigation before I can make progress on this.

ldionne added a subscriber: sberg. Mar 2 2021, 1:25 PM

Harbormaster completed remote builds in B91654: Diff 327563. Mar 2 2021, 5:31 PM

miyuki added a subscriber: miyuki. Mar 3 2021, 1:52 AM

miyuki removed a subscriber: miyuki.

You're misinterpreting the condition. The "is non-unique" bit says that the RTTI is non-unique and thus its identity is string-based; otherwise it is pointer-based. A uniquely-emitted type is never the same as a non-uniquely-emitted type.

Usually people who are having problems with this are messing up type visibility so that it's inconsistent between libraries.
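To make the flag concrete, here is a minimal sketch of how the high bit could be tested and stripped. The names `type_name_t`, `non_unique_bit`, `is_unique` and `name_address` are hypothetical stand-ins for libc++'s internal `__type_name_t`, `__non_unique_rtti_bit` and `__is_type_name_unique`. Masking the flag off to recover the string address is an assumption about what `__type_name_to_string` does, since its body is not shown in this review.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical stand-in for libc++'s __type_name_t (a packed pointer).
using type_name_t = std::uintptr_t;

// Top bit of the packed pointer: set means "this RTTI copy is non-unique".
constexpr type_name_t non_unique_bit =
    type_name_t(1) << (CHAR_BIT * sizeof(type_name_t) - 1);

// A type name is unique when the flag is clear.
constexpr bool is_unique(type_name_t v) { return (v & non_unique_bit) == 0; }

// Assumption: the address of the name string is the value with the flag removed.
constexpr type_name_t name_address(type_name_t v) { return v & ~non_unique_bit; }
```

With these helpers, a value such as `0x1000 | non_unique_bit` reports as non-unique, and `name_address` gives back `0x1000`.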

Updating according to a discussion with John McCall

Herald added a subscriber: smeenai. Mar 31 2021, 2:10 PM

ldionne edited the summary of this revision. Mar 31 2021, 2:11 PM

ldionne added a reviewer: rjmccall. Mar 31 2021, 2:24 PM

smeenai added inline comments. Mar 31 2021, 2:47 PM

libcxx/include/typeinfo
158–162	This reads a little confusingly, because the "Otherwise, if at least one of the RTTIs can't be assumed to be unique" bit makes you think that the case where one RTTI is unique and one isn't will do the deep string comparison, but then the next sentence changes the interpretation of that case. Maybe say "if both RTTIs can't be assumed to be unique", since the last sentence already handles the non-unique + unique case?
163–169	Thanks for explaining the motivation of the design and when it kicks in. I'd been casually curious about this before, and it's a neat design :)
257–258	I don't understand this. My reading of the two parameter `__is_type_name_unique` function was that it would return true if either typeinfo was unique, and that's also what you're doing in line 266 below. Over here, if both typeinfos are unique, this condition won't kick in, so wouldn't we incorrectly fall through to the strcmp below? Also a nit: LLVM code style recommends not using an `else` after a return: https://llvm.org/docs/CodingStandards.html#don-t-use-else-after-a-return. I don't know if libc++ normally does this differently.
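The question about the two-parameter `__is_type_name_unique` comes down to a truth table, which the following sketch checks at compile time. It reuses the two expressions shown in the diff under hypothetical names (`type_name_t`, `non_unique_bit`, `is_unique`), and the packed values are made up for illustration.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical stand-ins for __type_name_t and __non_unique_rtti_bit::value.
using type_name_t = std::uintptr_t;
constexpr type_name_t non_unique_bit =
    type_name_t(1) << (CHAR_BIT * sizeof(type_name_t) - 1);

// One-parameter form from the diff: unique when the flag is clear.
constexpr bool is_unique(type_name_t v) { return !(v & non_unique_bit); }

// Two-parameter form removed by the patch: ANDing the two values keeps the
// flag only when BOTH have it, so this is true if at least one is unique.
constexpr bool is_unique(type_name_t lhs, type_name_t rhs) {
  return !((lhs & rhs) & non_unique_bit);
}

// Made-up packed values: U* are unique, N* carry the non-unique flag.
constexpr type_name_t U1 = 0x1000, U2 = 0x2000;
constexpr type_name_t N1 = 0x3000 | non_unique_bit, N2 = 0x4000 | non_unique_bit;

static_assert(is_unique(U1, U2), "both unique");
static_assert(is_unique(U1, N1), "unique + non-unique");
static_assert(is_unique(N1, U1), "non-unique + unique");
static_assert(!is_unique(N1, N2), "both non-unique");
// Same result as the spelled-out condition used in the new code.
static_assert(is_unique(U1, N1) == (is_unique(U1) || is_unique(N1)), "");
static_assert(is_unique(N1, N2) == (is_unique(N1) || is_unique(N2)), "");
```

So the two-parameter form means "at least one is unique", and its negation is "both are non-unique", which is the only case where a string comparison is wanted.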

rjmccall added inline comments. Mar 31 2021, 3:06 PM

libcxx/include/typeinfo
151	"compiler"
162	This is not correct. We should only perform a string comparison if both RTTIs are non-unique. I still don't think this is a runtime bug.

Address comments.

libcxx/include/typeinfo

158–162

Actually, that isn't what I meant to write. I meant:

When comparing type_infos, if both RTTIs can be assumed to be unique, it suffices to compare their addresses. If both the RTTIs can't be assumed to be unique, we must perform a deep string comparison of the type names. However, if one of the RTTIs is guaranteed unique and the other one isn't, then both RTTIs are necessarily not to be considered equal.

Thanks for catching that!
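The three cases in that comment can be sketched as follows. This is a simplified model, not the libc++ code: the real implementation packs the flag into the pointer's high bit, whereas the hypothetical `TypeName` struct here keeps the address and the flag in separate fields. `rtti_equal` and the sample values are likewise invented for the example.

```cpp
#include <cstring>

// Hypothetical model of a packed type name: address plus uniqueness flag.
struct TypeName {
  const char* name;  // address of the type-name string
  bool unique;       // true when the non-unique bit would be clear
};

inline bool rtti_equal(TypeName lhs, TypeName rhs) {
  // Identical packed values (same address, same flag) are always equal.
  if (lhs.name == rhs.name && lhs.unique == rhs.unique)
    return true;
  // Both unique with different addresses, or one unique and one not: unequal.
  if (lhs.unique || rhs.unique)
    return false;
  // Both non-unique: compare the names themselves.
  return std::strcmp(lhs.name, rhs.name) == 0;
}

// Two distinct arrays holding the same name, as if two images each had a copy.
static const char copy_a[] = "1A";
static const char copy_b[] = "1A";

const TypeName unique_a{copy_a, true};
const TypeName unique_b{copy_b, true};
const TypeName non_unique_a{copy_a, false};
const TypeName non_unique_b{copy_b, false};
```

In this model `rtti_equal(non_unique_a, non_unique_b)` is true because the names match, while `rtti_equal(unique_a, unique_b)` and `rtti_equal(unique_a, non_unique_b)` are both false even though the strings are identical.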

Harbormaster completed remote builds in B96591: Diff 334532. Apr 1 2021, 12:14 AM

Harbormaster completed remote builds in B96607: Diff 334551. Apr 1 2021, 6:01 AM

curdeius added a subscriber: curdeius. Apr 1 2021, 6:33 AM

curdeius added inline comments.

libcxx/include/typeinfo
258–262	To be consistent with `__lt` and so that it reads easier, I propose negating the condition. Unless you think that the codegen can be worse.

Address Marek's comment. Turns out this patch is basically a NFC, just making
the code dumber and hence easier to follow.

ldionne retitled this revision from "[libc++] Fix incorrect typeinfo comparison on ARM64" to "[libc++] Increase readability of typeinfo comparison of ARM64". Apr 1 2021, 6:43 AM

ldionne edited the summary of this revision.

LGTM. You might want to add "NFC" to the commit message.

LGTM

ldionne accepted this revision as: Restricted Project. Apr 1 2021, 1:37 PM

This revision is now accepted and ready to land. Apr 1 2021, 1:37 PM

This revision was landed with ongoing or failed builds. Apr 1 2021, 1:38 PM

Closed by commit rG17095dc86111: [libc++][NFC] Increase readability of typeinfo comparison of ARM64 (authored by ldionne).

This revision was automatically updated to reflect the committed changes.

ldionne added a commit: rG17095dc86111: [libc++][NFC] Increase readability of typeinfo comparison of ARM64.

Harbormaster completed remote builds in B96704: Diff 334673. Apr 1 2021, 7:25 PM

rsmith added a subscriber: rsmith. Apr 7 2021, 1:39 PM

rsmith added inline comments.

libcxx/include/typeinfo
265–269	This is not an ordering relation. For example, we could have non-unique C < unique B < non-unique A < non-unique C, where the first and second comparisons are address comparisons, and the third comparison is a string comparison. I think perhaps something like this would work:

bool __lhs_unique = __is_type_name_unique(__lhs);
if (__lhs_unique != __is_type_name_unique(__rhs))
  return __lhs_unique;
if (__lhs_unique)
  return __lhs < __rhs;
return __builtin_strcmp(__type_name_to_string(__lhs), __type_name_to_string(__rhs)) < 0;

(That is: order all unique typeinfos before all non-unique ones, then order unique typeinfos by pointer and non-unique ones by string.)
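The cycle described in that comment can be reproduced in a small model. This is a sketch under stated assumptions: the hypothetical `TypeName` struct stores a made-up address separately from the uniqueness flag, so address comparisons ignore the flag, as the example in the comment does. It makes no claim about how the packed `__type_name_t` values actually compare. `lt_reviewed` mirrors the `__lt` logic in the reviewed diff and `lt_suggested` mirrors the proposed replacement; both names are invented here.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical model: address and uniqueness flag kept as separate fields.
struct TypeName {
  std::uintptr_t addr;  // made-up address of the name string
  const char* name;     // the name itself
  bool unique;          // true when the non-unique bit would be clear
};

// Logic of the reviewed __lt: address order if either side is unique,
// string order otherwise.
inline bool lt_reviewed(const TypeName& l, const TypeName& r) {
  if (l.unique || r.unique)
    return l.addr < r.addr;
  return std::strcmp(l.name, r.name) < 0;
}

// Suggested logic: all unique names first, then by address among unique
// names and by string among non-unique names.
inline bool lt_suggested(const TypeName& l, const TypeName& r) {
  if (l.unique != r.unique)
    return l.unique;
  if (l.unique)
    return l.addr < r.addr;
  return std::strcmp(l.name, r.name) < 0;
}

// The example from the comment: non-unique C, unique B, non-unique A.
const TypeName A{0x300, "A", false};
const TypeName B{0x200, "B", true};
const TypeName C{0x100, "C", false};
```

In this model `lt_reviewed` gives C < B, B < A and A < C, which is a cycle, while `lt_suggested` gives the consistent order B < A < C.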

Wow, yes, you're absolutely right. I think we can probably get away with fixing that without worrying about ODR problems because type_info ordered comparison is so uncommon.

ldionne mentioned this in D100134: [libc++] Fix std::type_info comparison. Apr 8 2021, 1:11 PM

Diff 334551

libcxx/include/typeinfo


  // ========================================================================== //
  // Implementations
  // ========================================================================== //
  // ------------------------------------------------------------------------- //
  // Unique
  // (_LIBCPP_TYPEINFO_COMPARISON_IMPLEMENTATION = 1)
  // ------------------------------------------------------------------------- //
  // This implementation of type_info assumes a unique copy of the RTTI for a
- // given type inside a program. This is a valid assumption when abiding to
+ // given type inside a program. This is a valid assumption when abiding to the
  // Itanium ABI (http://itanium-cxx-abi.github.io/cxx-abi/abi.html#vtable-components).
  // Under this assumption, we can always compare the addresses of the type names
  // to implement equality-comparison of type_infos instead of having to perform
  // a deep string comparison.
  // -------------------------------------------------------------------------- //
  // NonUnique
  // (_LIBCPP_TYPEINFO_COMPARISON_IMPLEMENTATION = 2)
  // -------------------------------------------------------------------------- //
  // This implementation of type_info does not assume there is always a unique
  // copy of the RTTI for a given type inside a program. For various reasons
  // the linker may have failed to merge every copy of a types RTTI
  // (For example: -Bsymbolic or llvm.org/PR37398). Under this assumption, two
  // type_infos are equal if their addresses are equal or if a deep string
  // comparison is equal.
  // -------------------------------------------------------------------------- //
  // NonUniqueARMRTTIBit
  // (_LIBCPP_TYPEINFO_COMPARISON_IMPLEMENTATION = 3)
  // -------------------------------------------------------------------------- //

+ // This implementation is specific to ARM64 on Apple platforms.
  // This implementation of type_info does not assume always a unique copy of
- // the RTTI for a given type inside a program. It packs the pointer to the
- // type name into a uintptr_t and reserves the high bit of that pointer (which
+ // the RTTI for a given type inside a program. When constructing the type_info,
+ // the compiler packs the pointer to the type name into a uintptr_t and reserves

rjmccallUnsubmitted

Done

"compiler"


- // is assumed to be free for use under the ABI in use) to represent whether
- // that specific copy of the RTTI can be assumed unique inside the program.
- // To implement equality-comparison of type_infos, we check whether BOTH
- // type_infos are guaranteed unique, and if so, we simply compare the addresses
- // of their type names instead of doing a deep string comparison, which is
- // faster. If at least one of the type_infos can't guarantee uniqueness, we
- // have no choice but to fall back to a deep string comparison.
+ // the high bit of that pointer, which is assumed to be free for use under that
+ // ABI. If that high bit is set, that specific copy of the RTTI can't be assumed
+ // to be unique within the program. If the high bit is unset, then the RTTI can
+ // be assumed to be unique within the program.
  //
- // This implementation is specific to ARM64 on Apple platforms.
+ // When comparing type_infos, if both RTTIs can be assumed to be unique, it
+ // suffices to compare their addresses. If both the RTTIs can't be assumed to
+ // be unique, we must perform a deep string comparison of the type names.
+ // However, if one of the RTTIs is guaranteed unique and the other one isn't,
+ // then both RTTIs are necessarily not to be considered equal.
  //

smeenaiUnsubmitted

Done

This reads a little confusingly, because the "Otherwise, if at least one of the RTTIs can't be assumed to be unique" bit makes you think that the case where one RTTI is unique and one isn't will do the deep string comparison, but then the next sentence changes the interpretation of that case. Maybe say "if both RTTIs can't be assumed to be unique", since the last sentence already handles the non-unique + unique case?


ldionneAuthorUnsubmitted

Done

Actually, that isn't what I meant to write. I meant:

When comparing type_infos, if both RTTIs can be assumed to be unique, it suffices to compare their addresses. If both the RTTIs can't be assumed to be unique, we must perform a deep string comparison of the type names. However, if one of the RTTIs is guaranteed unique and the other one isn't, then both RTTIs are necessarily not to be considered equal.

Thanks for catching that!


rjmccallUnsubmitted

Not Done

This is not correct. We should only perform a string comparison if *both* RTTIs are non-unique. I still don't think this is a runtime bug.


- // Note that the compiler is the one setting (or unsetting) the high bit of
- // the pointer when it constructs the type_info, depending on whether it can
- // guarantee uniqueness for that specific type_info.
+ // The intent of this design is to remove the need for weak symbols. Specifically,
+ // if a type would normally have a default-visibility RTTI emitted as a weak
+ // symbol, it is given hidden visibility instead and the non-unique bit is set.
+ // Otherwise, types declared with hidden visibility are always considered to have
+ // a unique RTTI: the RTTI is emitted with linkonce_odr linkage and is assumed
+ // to be deduplicated by the linker within the linked image. Across linked image
+ // boundaries, such types are thus considered different types.

smeenaiUnsubmitted

Done

Thanks for explaining the motivation of the design and when it kicks in. I'd been casually curious about this before, and it's a neat design :)


  // This value can be overriden in the __config_site. When it's not overriden,
  // we pick a default implementation based on the platform here.
  #ifndef _LIBCPP_TYPEINFO_COMPARISON_IMPLEMENTATION
  // Windows binaries can't merge typeinfos, so use the NonUnique implementation.
  # ifdef _LIBCPP_OBJECT_FORMAT_COFF
  # define _LIBCPP_TYPEINFO_COMPARISON_IMPLEMENTATION 2

(71 lines not shown)

  struct __non_unique_arm_rtti_bit_impl {

    static size_t __hash(__type_name_t __v) _NOEXCEPT {
      if (__is_type_name_unique(__v))
        return reinterpret_cast<size_t>(__v);
      return __non_unique_impl::__hash(__type_name_to_string(__v));
    }
    _LIBCPP_INLINE_VISIBILITY _LIBCPP_ALWAYS_INLINE
    static bool __eq(__type_name_t __lhs, __type_name_t __rhs) _NOEXCEPT {
      if (__lhs == __rhs)
        return true;
-     if (__is_type_name_unique(__lhs, __rhs))
+     if (!__is_type_name_unique(__lhs) && !__is_type_name_unique(__rhs))

smeenaiUnsubmitted

Not Done

I don't understand this. My reading of the two parameter __is_type_name_unique function was that it would return true if either typeinfo was unique, and that's also what you're doing in line 266 below. Over here, if both typeinfos are unique, this condition won't kick in, so wouldn't we incorrectly fall through to the strcmp below?

Also a nit: LLVM code style recommends not using an else after a return: https://llvm.org/docs/CodingStandards.html#don-t-use-else-after-a-return. I don't know if libc++ normally does this differently.


-       return false;
-     return __builtin_strcmp(__type_name_to_string(__lhs), __type_name_to_string(__rhs)) == 0;
+       return __builtin_strcmp(__type_name_to_string(__lhs), __type_name_to_string(__rhs)) == 0;
+     // Either both are unique and have a different address, or one of them
+     // is unique and the other one isn't. In both cases they are unequal.
+     return false;

curdeiusUnsubmitted

Not Done

return true;

- if (!__is_type_name_unique(__lhs) && !__is_type_name_unique(__rhs))

- return __builtin_strcmp(__type_name_to_string(__lhs), __type_name_to_string(__rhs)) == 0;

- // Either both are unique and have a different address, or one of them

- // is unique and the other one isn't. In both cases they are unequal.

- return false;

+ if (__is_type_name_unique(__lhs) || __is_type_name_unique(__rhs))

+ // Either both are unique and have a different address, or one of them

+ // is unique and the other one isn't. In both cases they are unequal.

+ return false;

+ return __builtin_strcmp(__type_name_to_string(__lhs), __type_name_to_string(__rhs)) == 0;

}

_LIBCPP_INLINE_VISIBILITY _LIBCPP_ALWAYS_INLINE

To be consistent with __lt and so that it reads easier, I propose negating the condition.
Unless you think that the codegen can be worse.


    }
    _LIBCPP_INLINE_VISIBILITY _LIBCPP_ALWAYS_INLINE
    static bool __lt(__type_name_t __lhs, __type_name_t __rhs) _NOEXCEPT {
-     if (__is_type_name_unique(__lhs, __rhs))
+     if (__is_type_name_unique(__lhs) || __is_type_name_unique(__rhs))
        return __lhs < __rhs;
      return __builtin_strcmp(__type_name_to_string(__lhs), __type_name_to_string(__rhs)) < 0;
    }

rsmithUnsubmitted

Not Done

This is not an ordering relation. For example, we could have non-unique C < unique B < non-unique A < non-unique C, where the first and second comparisons are address comparisons, and the third comparison is a string comparison.

I think perhaps something like this would work:

bool __lhs_unique = __is_type_name_unique(__lhs);
if (__lhs_unique != __is_type_name_unique(__rhs))
  return __lhs_unique;
if (__lhs_unique)
  return __lhs < __rhs;
return __builtin_strcmp(__type_name_to_string(__lhs), __type_name_to_string(__rhs)) < 0;

(That is: order all unique typeinfos before all non-unique ones, then order unique typeinfos by pointer and non-unique ones by string.)


  private:
    // The unique bit is the top bit. It is expected that __type_name_t is 64 bits when
    // this implementation is actually used.
    typedef integral_constant<__type_name_t,
      (1ULL << ((__CHAR_BIT__ * sizeof(__type_name_t)) - 1))> __non_unique_rtti_bit;
    _LIBCPP_INLINE_VISIBILITY
    static bool __is_type_name_unique(__type_name_t __lhs) _NOEXCEPT {
      return !(__lhs & __non_unique_rtti_bit::value);
    }
-   _LIBCPP_INLINE_VISIBILITY
-   static bool __is_type_name_unique(__type_name_t __lhs, __type_name_t __rhs) _NOEXCEPT {
-     return !((__lhs & __rhs) & __non_unique_rtti_bit::value);
-   }
  };
  typedef
  #if _LIBCPP_TYPEINFO_COMPARISON_IMPLEMENTATION == 1
    __unique_impl
  #elif _LIBCPP_TYPEINFO_COMPARISON_IMPLEMENTATION == 2
    __non_unique_impl
  #elif _LIBCPP_TYPEINFO_COMPARISON_IMPLEMENTATION == 3


This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Increase readability of typeinfo comparison of ARM64
ClosedPublic
