Page MenuHomePhabricator

Remove warnings for dynamic_cast fallback.
AbandonedPublic

Authored by danalbert on Oct 5 2017, 2:15 PM.

Details

Summary

This doesn't only happen for incorrectly built apps, it also happens
for libraries dlopened with RTLD_LOCAL.

Event Timeline

danalbert created this revision.Oct 5 2017, 2:15 PM
smeenai added a subscriber: smeenai.Oct 5 2017, 2:41 PM

Does dlopen cause issues even with RTLD_GLOBAL?

Does dlopen cause issues even with RTLD_GLOBAL?

From my testing, yes. Regardless, RTLD_LOCAL is how JNI libraries get loaded when System.loadLibrary is used, so the strcmp fallback is a requirement for that use case.

jroelofs resigned from this revision.Oct 9 2017, 1:31 PM

I'm not sure I'm the right person to review this.

I'm also much out of my depth here, but I'm skeptical. You're changing the comments in the code from essentially saying "This workaround helps people with broken code" to essentially saying "This indispensable functionality helps people like me who use dlopen()." Are you 100% sure that you're not just a person with broken code?

In other words, what did this guy from 2013 get wrong? -- or, if "he got nothing wrong", then why can't you just follow his advice to eliminate the duplicate typeinfos from your code? http://www.russellmcc.com/posts/2013-08-03-rtti.html

jroelofs edited reviewers, added: mclow.lists, EricWF; removed: jroelofs.Oct 9 2017, 3:03 PM

Are you 100% sure that you're not just a person with broken code?

Absolutely, since it isn't my code ;) I maintain the toolchain and this is a behavioral change when switching from libstdc++ to libc++.

In other words, what did this guy from 2013 get wrong? -- or, if "he got nothing wrong", then why can't you just follow his advice to eliminate the duplicate typeinfos from your code? http://www.russellmcc.com/posts/2013-08-03-rtti.html

That post is all about binaries that were built with hidden visibility and says nothing about the case of dlopen. There's nothing wrong with the code as built. All the symbols have public visibility, all the type_infos are weak, everything works correctly if you explicitly link the libraries to the final executable or dlopen only one library (not always an option).

If anyone is still skeptical, libsupc++ went through this at one point too and eventually decided to use strcmp by default because otherwise RTTI doesn't work in plugin architectures (which is essentially what JNI is). https://gcc.gnu.org/ml/gcc-patches/2009-07/msg01239.html

I assume std::type_info::operator== also needs to be adjusted to compare string names? It looks like libc++'s version of the function does string comparisons for ARM64 iOS, but only on some classes (e.g. public(?) ones). Look for the _LIBCPP_HAS_NONUNIQUE_TYPEINFO and _LIBCPP_NONUNIQUE_RTTI_BIT flags. I wonder if ARM64 iOS also sets _LIBCXX_DYNAMIC_FALLBACK for libc++abi.

I noticed that Clang on GNU/Linux is treating two classes defined in separate anonymous namespaces as equal for std::type_info::operator== and dynamic_cast, which I *think* is a Clang bug. G++ marks the typeinfo names with an asterisk to disable the string comparisons. I'd expect that enabling _LIBCXX_DYNAMIC_FALLBACK would introduce the same bug into the Android toolchain.

Some relevant code links:

That reminds me... this does need a testcase or two.

That reminds me... this does need a testcase or two.

Didn't realize I could do multi binary test cases with this test runner. It'll be a little messy, but I'll try adding one.

That reminds me... this does need a testcase or two.

Oh, also, any test I add is going to fail, since the case I'm trying to account for here is not the default behavior.

I could make the more invasive change and actually make libc++abi use the fallback by default like libsupc++ does, but I was willing to settle for just not spamming the user with warnings that they can't do anything about.

danalbert updated this revision to Diff 118502.Oct 10 2017, 3:56 PM
danalbert edited the summary of this revision. (Show Details)

Added a (failing) test case. The test case will fail unless the default value of _LIBCXX_DYNAMIC_FALLBACK is changed.

That reminds me... this does need a testcase or two.

Oh, also, any test I add is going to fail, since the case I'm trying to account for here is not the default behavior.

That's what an available_feature + // REQUIRES: is for.

That reminds me... this does need a testcase or two.

Oh, also, any test I add is going to fail, since the case I'm trying to account for here is not the default behavior.

That's what an available_feature + // REQUIRES: is for.

Which should be set (along with -D_LIBCXX_DYNAMIC_FALLBACK (possibly renamed to _LIBCXXABI_DYNAMIC_FALLBACK)) via a CMake flag... We shouldn't just be defining macros like this in CMAKE_CXX_FLAGS.

danalbert updated this revision to Diff 118711.Oct 11 2017, 3:39 PM

Update the test with an XFAIL when _LIBCXX_DYNAMIC_FALLBACK is not set.

https://reviews.llvm.org/D38827 adds this cmake option to libc++.

danalbert planned changes to this revision.Oct 11 2017, 3:52 PM

(possibly renamed to _LIBCXXABI_DYNAMIC_FALLBACK)

I opted for adding this switch to libc++ instead. Like @rprichard points out, we'll need to do this in std::type_info::operator== as well. I thought that code wasn't in libc++ yet, but it seems it is. I'm going to rename this option in libc++abi to match the one in libc++.

jroelofs added a comment.EditedOct 11 2017, 4:01 PM

Needs a docs entry for the new flag (in libcxx's BuildingLibcxx.rst). Other than that, all the stuff I've asked you to add LGTM. I'd still appreciate @EricWF / @mclow 's opinion on the meat of the functional change part of this though... I don't know all the implications relaxing the search here.

danalbert abandoned this revision.Oct 11 2017, 4:22 PM

nbjoerg and zygoloid got me pointed in the right direction. Both Base and BaseImpl are missing their key functions, and that's the problem here. Patch should be unnecessary.

Fwiw, I wrote this code. All of that "fallback" stuff was written to make customer code that was incorrect, but working on OS X -version-that-used-libsupc++ continue to work. I.e. to be a crutch for incorrect code. It should all be removed if you no longer want to provide such a crutch.

Fwiw, I wrote this code. All of that "fallback" stuff was written to make customer code that was incorrect, but working on OS X -version-that-used-libsupc++ continue to work. I.e. to be a crutch for incorrect code. It should all be removed if you no longer want to provide such a crutch.

Android may end up wanting to use it for the same reason. I've backed it out for now, but it may end up being something we need since a fair number of NDK users (majority, probably) use libsupc++ and that might be a point of pain in migrating to libc++abi. Other Linux distros may want it as well, so probably worth leaving the code around.

Ok. Well that's why it is under a #define: to make it easier to include or not, depending on the needs of the platform.

Fwiw, I wrote this code. All of that "fallback" stuff was written to make customer code that was incorrect, but working on OS X -version-that-used-libsupc++ continue to work. I.e. to be a crutch for incorrect code. It should all be removed if you no longer want to provide such a crutch.

Android may end up wanting to use it for the same reason. I've backed it out for now, but it may end up being something we need since a fair number of NDK users (majority, probably) use libsupc++ and that might be a point of pain in migrating to libc++abi. Other Linux distros may want it as well, so probably worth leaving the code around.

On OS X, dynamic vague linkage appears to work by default, and _LIBCXX_DYNAMIC_FALLBACK catches some code that slips through the cracks (e.g. code that does something odd like: specify visibility attributes, explicitly specify RTLD_LOCAL, link with -Bsymbolic, etc.). dlopen defaults to RTLD_GLOBAL, which merges weak definitions, even with an executable. System.loadLibrary uses RTLD_GLOBAL.

On Android, AFAICT, dynamic vague linkage doesn't work prior to M, and with M, it only works between shared objects loaded in one batch. Loading multiple C++ shared objects one-at-a-time generally results in duplicated type_info objects, which break the various RTTI-dependent features when libc++/libc++abi are used.

_LIBCXX_DYNAMIC_FALLBACK doesn't generally fix __dynamic_cast to account for duplicated type_info objects. Instead, it flags a specific situation that's easy to detect. If we have code like this...

Static *s = new Derived;
dynamic_cast<Dest*>(s)

... __dynamic_cast will search the RTTI graph of Derived looking for Static and Dest. It expects to find at least one instance of Static. If it doesn't, there's obviously something wrong, because, assuming the C++ ABI has been followed, the s pointer is not really of type Static! This situation is cheap to detect (a couple extra comparisons). If Static is found, the fallback doesn't activate, but dynamic_cast can still fail, e.g.:

  • If a Dest in the RTTI graph doesn't match &typeid(Dest) where dynamic_cast is used, then dynamic_cast can incorrectly return NULL.
  • If there are two Dest objects in the RTTI graph with unequal addresses, then dynamic_cast can incorrectly return non-NULL when the cast should have been ambiguous. (Writing this test case is a fun little exercise.)
  • Multiple Static objects with unequal addresses probably causes similar failures, as long as one of them matches &typeid(Static) where dynamic_cast is used.

I think _LIBCXX_DYNAMIC_FALLBACK is fine to enable on Android, but (1) it seems insufficient, (2) I'm skeptical about removing the diagnostic, and (3) a general solution to the libc++abi RTTI problem tends to make _LIBCXX_DYNAMIC_FALLBACK irrelevant. i.e. We may need to simply compare typeinfo strings like gnustl does (or the #ifdef _WIN32 code for dynamic_cast), which subsumes _LIBCXX_DYNAMIC_FALLBACK.

I'm still intrigued by the shouldRTTIBeUnique / _LIBCPP_NONUNIQUE_RTTI_BIT stuff I linked above. I'm a little surprised that 64-bit iOS would need that behavior. Maybe we need to turn it on for Android, but where is the corresponding flag for libcxxabi?