Fixes #44892. The example given there does indeed optimize away completely with my changes.
Diff Detail
Event Timeline
I think this could be implemented in a non ABI breaking way by using __attribute__((__gnu_inline__)). See https://godbolt.org/z/W83rTb714 for an example. If you want to allow it in ABIv1 this way you should probably just disable the magic in ABIv2.
libcxx/src/support/runtime/exception_pointer_cxxabi.ipp | ||
---|---|---|
17 | Is there a reason for this additional indirection? |
Sorry, but I haven't been able to get this to work. I don't really understand your example, and the documentation for gnu_inline is pretty sparse.
Yes, indeed, thanks for pointing this out!
I don't know what a sensible benchmark for this would look like, but have a look at the generated assembly:
Example 1 (from #44892): https://godbolt.org/z/YGq86cszM. With my changes, this compiles to just
0000000000000000 <_Z4testv>: 0: b0 01 mov al,0x1 2: c3 ret
Example 2 (from #39333): https://godbolt.org/z/oee13ennv. With my changes, this becomes
0000000000000000 <main>: 0: 31 c0 xor eax,eax 2: c3 ret
libcxx/src/support/runtime/exception_pointer_cxxabi.ipp | ||
---|---|---|
17 | There are two implementations of __incr/decr_refcount(), one here and one in exception_pointer_unimplemented.ipp. But maybe the latter one is unnecessary? I'm not sure. |
Adding the move constructor and assignment operator isn't ABI-breaking AFAICT. IIUC the ABI breaking part of this patch is that the functions don't get generated anymore in the dylib. Using __attribute__((__gnu_inline__)) allows us to suppress generating the functions if they're not inlined, but allows the compiler to inline the function. Now if we don't apply the attribute when building the library the functions will be emitted, but otherwise they won't. This way we have the same linkage and so on for the dylib function, but have all the optimization opportunities of an inline function.
Yes, of course you're right about the move constructor and assignment operator, I had gotten this mixed up.
The GCC documentation says about gnu_inline that
The way to use this is to put a function definition in a header file with this attribute, and put another copy of the function, without extern, in a library file.
and
In C++, this attribute does not depend on extern in any way, but it still requires the inline keyword to enable its special behavior.
So I don't understand how this is supposed to work in C++. Let's consider a small example:
a.h:
#pragma once struct A { int foo() { return 1; } };
a.cc:
#include "a.h" int A::foo() { return 1; }
Could you annotate this example with gnu_inline to illustrate your point? And, if you are going to use any #ifdefs, what should they check for in libc++? Thank you!
I think I'm abusing __gnu_inline__ here a bit, but hey:
First of all, you have to have both the declaration and the definition inside the header, because neither clang nor GCC are happy to have an out of line definition of an already defined function.
a.h:
#pragma once #ifdef EMIT_FUNCTION # define GNU_INLINE #else # define GNU_INLINE __attribute__((__gnu_inline__)) inline #endif struct A { GNU_INLINE int foo(); }; // Defining the function this way forces the compiler to emit the function when it's not marked inline, // but if it's maked '__attribute__((__gnu_inline__)) inline' it forces the compiler to _not_ emit the function. GNU_INLINE int A::foo() { return 1; }
a.cpp:
#define EMIT_FUNCTION #include "a.h"
You have to use a different name and suppress a warning, but I left that away for simplicity.
Thanks for your help @philnik. I have now revised my implementation so that the ABI break is no longer required.
I don't know what a sensible benchmark for this would look like, but have a look at the generated assembly:
Example 1 (from #44892): https://godbolt.org/z/YGq86cszM. With my changes, this compiles to just
0000000000000000 <_Z4testv>: 0: b0 01 mov al,0x1 2: c3 retExample 2 (from #39333): https://godbolt.org/z/oee13ennv. With my changes, this becomes
0000000000000000 <main>: 0: 31 c0 xor eax,eax 2: c3 ret
That assembly looks convincing, thanks.
libcxx/include/exception | ||
---|---|---|
144 | I would like some comment describing this _LIBCPP_EXCEPTION_PTR_GNU_INLINE magic. Reading the other comments in this review make it clear, could you add short summary of that information here? | |
145 | We should validate whether the compiler supports this attribute. | |
189 | Same at line 167. |
Thanks for reviewing this @Mordante! I think I've addressed all of your comments, and I've added a suppression for -Wgnu-inline-cpp-without-extern, which will hopefully fix the failing tests.
libcxx/include/exception | ||
---|---|---|
145 | The documentation seems to imply that either __GNUC_GNU_INLINE__ or __GNUC_STDC_INLINE__ will be defined if the __gnu_inline__ attribute is supported, so that's what I am now checking for. Let me know if you had something else in mind. |
libcxx/include/exception | ||
---|---|---|
150 | I think you mean the __gnu_inline__ attribute here. | |
153 | Do you mean _LIBCPP_EXCEPTION_PTR_GNU_INLINE here? If yes, please just put it here. It's a bit confusing otherwise. | |
154–155 | __gnu_inline__ doesn't force the functions to be inlined. It disables the generation of the functions. | |
158 | You can use __has_attribute(__gnu_inline__) to check if the compiler supports it. That also works for any other attribute. | |
250 | I don't think that the comment is necessary here, but I also don't object. | |
252–254 | ||
libcxx/src/exception.cpp | ||
9–11 | Why do you guard the macro definition here? Could you add a short comment here why you are defining this? |
libcxx/src/exception.cpp | ||
---|---|---|
9–11 | I think this was wrong; the macro definition can be unguarded and we unfortunately need more preprocessor guards in the exception header, because the glibcxx ABI doesn't have __incr/decr_refcount()... |
Are there still any remaining issues with this change?
As an user of std::exception_ptr (mostly for async callbacks and coroutines), I would love to see this land in LLVM 15 :)
The latest patch didn't pass our pre-commit CI. So that needs to be fixed and it would be good to rebase the patch against main.
@Mordante I think the main problem were the inconsistent abilist files, which I have attempted to fix here (unfortunately generate-cxx-abilist didn't really work for me, as it created lots of other entries for things I didn't touch, so I had to do it manually). I've also rebased the patch, though the rebase went cleanly.
This problem does indeed appear to be fixed now, judging by the latest CI run. But there are now some problems on Mac OS, which I cannot reproduce because I don't have such a machine. Perhaps someone else can have a look at this?
The problem is that you have introduced new symbols to the dylib. I think the easiest way to avoid this is to use __cxa_{increment, decrement}_exception_refcount in the header directly.
How is this supposed to work? Including cxxabi.h in the header breaks everything, and nowhere else in libcxx/include/ do any __cxa_* functions get called. Also, there must be a way to distinguish exception_pointer_cxxabi.ipp from exception_pointer_unimplemented.ipp in the header if we're not going to introduce any additional symbols; how should I go about doing this?
I think you should only enable the optimization for known-good C++ABIs (so only libc++abi currently) and to forward-declare the __cxa_ functions. That means the default still is calling the functions in exception_pointer_unimplemented.ipp.
Thanks for your help @philnik. I'm still unsure what exactly I should check for in the exception header; I went for
#if defined(_LIBCPP_ABI_VERSION) || defined(LIBCXXRT) || defined(LIBCXX_BUILDING_LIBCXXABI)
because the first two are used in libcxx/src/exception.cpp to indicate that the exception_pointer_cxxabi.ipp implementation is to be used, and it didn't work without the last one, but it doesn't really look right to me, so please comment.
I'm not actually sure what the right is here. I think @ldionne should take a look, as he is more familiar with these things.
I would like some comment describing this _LIBCPP_EXCEPTION_PTR_GNU_INLINE magic. Reading the other comments in this review make it clear, could you add short summary of that information here?