This doesn't only happen for incorrectly built apps, it also happens
for libraries dlopened with RTLD_LOCAL.
Details
- Reviewers
mclow.lists EricWF
Diff Detail
- Build Status
Buildable 11043 Build 11043: arc lint + arc unit
Event Timeline
From my testing, yes. Regardless, RTLD_LOCAL is how JNI libraries get loaded when System.loadLibrary is used, so the strcmp fallback is a requirement for that use case.
I'm also much out of my depth here, but I'm skeptical. You're changing the comments in the code from essentially saying "This workaround helps people with broken code" to essentially saying "This indispensable functionality helps people like me who use dlopen()." Are you 100% sure that you're not just a person with broken code?
In other words, what did this guy from 2013 get wrong? -- or, if "he got nothing wrong", then why can't you just follow his advice to eliminate the duplicate typeinfos from your code? http://www.russellmcc.com/posts/2013-08-03-rtti.html
Are you 100% sure that you're not just a person with broken code?
Absolutely, since it isn't my code ;) I maintain the toolchain and this is a behavioral change when switching from libstdc++ to libc++.
In other words, what did this guy from 2013 get wrong? -- or, if "he got nothing wrong", then why can't you just follow his advice to eliminate the duplicate typeinfos from your code? http://www.russellmcc.com/posts/2013-08-03-rtti.html
That post is all about binaries that were built with hidden visibility and says nothing about the case of dlopen. There's nothing wrong with the code as built. All the symbols have public visibility, all the type_infos are weak, everything works correctly if you explicitly link the libraries to the final executable or dlopen only one library (not always an option).
If anyone is still skeptical, libsupc++ went through this at one point too and eventually decided to use strcmp by default because otherwise RTTI doesn't work in plugin architectures (which is essentially what JNI is). https://gcc.gnu.org/ml/gcc-patches/2009-07/msg01239.html
I assume std::type_info::operator== also needs to be adjusted to compare string names? It looks like libc++'s version of the function does string comparisons for ARM64 iOS, but only on some classes (e.g. public(?) ones). Look for the _LIBCPP_HAS_NONUNIQUE_TYPEINFO and _LIBCPP_NONUNIQUE_RTTI_BIT flags. I wonder if ARM64 iOS also sets _LIBCXX_DYNAMIC_FALLBACK for libc++abi.
I noticed that Clang on GNU/Linux is treating two classes defined in separate anonymous namespaces as equal for std::type_info::operator== and dynamic_cast, which I *think* is a Clang bug. G++ marks the typeinfo names with an asterisk to disable the string comparisons. I'd expect that enabling _LIBCXX_DYNAMIC_FALLBACK would introduce the same bug into the Android toolchain.
Some relevant code links:
- https://github.com/llvm-mirror/clang/blob/8c9bf999aa40ab6077b958b5edcf587b9d76ce24/lib/CodeGen/ItaniumCXXABI.cpp#L359 ==> iOS64CXXABI overrides shouldRTTIBeUnique to return false.
- https://github.com/llvm-mirror/libcxx/blob/ca79c159d8bfbe190a6cbfce74eb2d050697d8b9/include/typeinfo#L176 ==> libc++ type_info::operator== uses string comparison, but only if _LIBCPP_NONUNIQUE_RTTI_BIT has been OR'ed into the __type_name pointers of both type_info objects
- https://github.com/llvm-mirror/libcxx/blob/ca79c159d8bfbe190a6cbfce74eb2d050697d8b9/include/__config#L879 ==> use the highest bit for ARM64 iOS
Didn't realize I could do multi binary test cases with this test runner. It'll be a little messy, but I'll try adding one.
Oh, also, any test I add is going to fail, since the case I'm trying to account for here is not the default behavior.
I could make the more invasive change and actually make libc++abi use the fallback by default like libsupc++ does, but I was willing to settle for just not spamming the user with warnings that they can't do anything about.
Added a (failing) test case. The test case will fail unless the default value of _LIBCXX_DYNAMIC_FALLBACK is changed.
Which should be set (along with -D_LIBCXX_DYNAMIC_FALLBACK (possibly renamed to _LIBCXXABI_DYNAMIC_FALLBACK)) via a CMake flag... We shouldn't just be defining macros like this in CMAKE_CXX_FLAGS.
Update the test with an XFAIL when _LIBCXX_DYNAMIC_FALLBACK is not set.
https://reviews.llvm.org/D38827 adds this cmake option to libc++.
I opted for adding this switch to libc++ instead. Like @rprichard points out, we'll need to do this in std::type_info::operator== as well. I thought that code wasn't in libc++ yet, but it seems it is. I'm going to rename this option in libc++abi to match the one in libc++.
Needs a docs entry for the new flag (in libcxx's BuildingLibcxx.rst). Other than that, all the stuff I've asked you to add LGTM. I'd still appreciate @EricWF / @mclow 's opinion on the meat of the functional change part of this though... I don't know all the implications relaxing the search here.
nbjoerg and zygoloid got me pointed in the right direction. Both Base and BaseImpl are missing their key functions, and that's the problem here. Patch should be unnecessary.
Fwiw, I wrote this code. All of that "fallback" stuff was written to make customer code that was incorrect, but working on OS X -version-that-used-libsupc++ continue to work. I.e. to be a crutch for incorrect code. It should all be removed if you no longer want to provide such a crutch.
Android may end up wanting to use it for the same reason. I've backed it out for now, but it may end up being something we need since a fair number of NDK users (majority, probably) use libsupc++ and that might be a point of pain in migrating to libc++abi. Other Linux distros may want it as well, so probably worth leaving the code around.
Ok. Well that's why it is under a #define: to make it easier to include or not, depending on the needs of the platform.
On OS X, dynamic vague linkage appears to work by default, and _LIBCXX_DYNAMIC_FALLBACK catches some code that slips through the cracks (e.g. code that does something odd like: specify visibility attributes, explicitly specify RTLD_LOCAL, link with -Bsymbolic, etc.). dlopen defaults to RTLD_GLOBAL, which merges weak definitions, even with an executable. System.loadLibrary uses RTLD_GLOBAL.
On Android, AFAICT, dynamic vague linkage doesn't work prior to M, and with M, it only works between shared objects loaded in one batch. Loading multiple C++ shared objects one-at-a-time generally results in duplicated type_info objects, which break the various RTTI-dependent features when libc++/libc++abi are used.
_LIBCXX_DYNAMIC_FALLBACK doesn't generally fix __dynamic_cast to account for duplicated type_info objects. Instead, it flags a specific situation that's easy to detect. If we have code like this...
Static *s = new Derived; dynamic_cast<Dest*>(s)
... __dynamic_cast will search the RTTI graph of Derived looking for Static and Dest. It expects to find at least one instance of Static. If it doesn't, there's obviously something wrong, because, assuming the C++ ABI has been followed, the s pointer is not really of type Static! This situation is cheap to detect (a couple extra comparisons). If Static is found, the fallback doesn't activate, but dynamic_cast can still fail, e.g.:
- If a Dest in the RTTI graph doesn't match &typeid(Dest) where dynamic_cast is used, then dynamic_cast can incorrectly return NULL.
- If there are two Dest objects in the RTTI graph with unequal addresses, then dynamic_cast can incorrectly return non-NULL when the cast should have been ambiguous. (Writing this test case is a fun little exercise.)
- Multiple Static objects with unequal addresses probably causes similar failures, as long as one of them matches &typeid(Static) where dynamic_cast is used.
I think _LIBCXX_DYNAMIC_FALLBACK is fine to enable on Android, but (1) it seems insufficient, (2) I'm skeptical about removing the diagnostic, and (3) a general solution to the libc++abi RTTI problem tends to make _LIBCXX_DYNAMIC_FALLBACK irrelevant. i.e. We may need to simply compare typeinfo strings like gnustl does (or the #ifdef _WIN32 code for dynamic_cast), which subsumes _LIBCXX_DYNAMIC_FALLBACK.
I'm still intrigued by the shouldRTTIBeUnique / _LIBCPP_NONUNIQUE_RTTI_BIT stuff I linked above. I'm a little surprised that 64-bit iOS would need that behavior. Maybe we need to turn it on for Android, but where is the corresponding flag for libcxxabi?