Page MenuHomePhabricator

In the libc++ unstable ABI, use [[no_unique_address]] instead of __compressed_pair when available.
Needs ReviewPublic

Authored by rsmith on Jun 24 2019, 4:13 PM.

Details

Reviewers
EricWF
mclow.lists
ldionne
jdoerfert
Group Reviewers
Restricted Project
Summary

Adds a new __config macro _LIBCPP_USE_NO_UNIQUE_ADDRESS that specifies when to
use [[no_unique_address]] (instead of __compressed_pair or similar layout
hacks). This macro is defined only when the unstable ABI is enabled and
[[no_unique_address]] is available.

This reduces the number of template instantiations, compile-time cost, size of
debug information, and number of copies performed when using various parts of
libc++. In one source file, we saw a 4% decrease in object file size from
switching std::function to use [[no_unique_address]] instead of
__compressed_pair stemming primarily from reduced debug info size.

This patch does not intend to provide any ABI compatibility between the new
mode and old modes (or between two different compilers with the unstable ABI
enabled, where one supports [[no_unique_address]] and the other does not).
If we want a hard guarantee that even the unstable ABI provides stability
between compilers, we'll need to provide user control over whether we apply
this optimization rather than doing it whenever possible in the unstable ABI.

Event Timeline

rsmith created this revision.Jun 24 2019, 4:13 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 24 2019, 4:13 PM
rsmith edited the summary of this revision. (Show Details)Jun 24 2019, 4:14 PM
rsmith updated this revision to Diff 206322.Jun 24 2019, 4:17 PM
  • Revert vestigial changes to __compressed_pair.
rsmith updated this revision to Diff 206323.Jun 24 2019, 4:18 PM
  • Revert one more vestigial change.
Harbormaster completed remote builds in B33836: Diff 206323.

A couple of general comments; still working through all the bits.

libcxx/include/__config
1019

This should really go with the rest of the ABI flags up around line #70, and let's give it an ABI name like _LIBCPP_ABI_USE_NO_UNIQUE_ADDRESS

libcxx/include/__hash_table
3068

Shouldn't these be wrapped in a #ifdef _LIBCPP_USE_NO_UNIQUE_ADDRESS block?

rsmith updated this revision to Diff 206346.Jun 24 2019, 5:49 PM
rsmith marked an inline comment as done.
  • Address a couple of review comments from Marshall.
rsmith marked an inline comment as done.Jun 24 2019, 5:50 PM
rsmith updated this revision to Diff 206347.Jun 24 2019, 5:55 PM
  • Rename __hash_ macro to avoid name collision.

Thanks for working on this! __compressed_pair is a menace which causes serious template and code bloat.
And the code without it is much clearer and cleaner.

But...

I'm concerned by the number and size of the conditional compilation blocks this introduces.
In my time on libc++ conditional compilation blocks have been a frequent source of bugs; and I've been working to remove as many as I can.

I haven't figured out a cleaner way to implement this.
Does anyone have ideas on how we might eat our cake and have it too?

libcxx/include/__hash_table
1042

I really don't like this #define approach. Perhaps we could create accessor functions instead?

ldionne requested changes to this revision.Jun 27 2019, 10:31 AM
ldionne added a subscriber: ldionne.

I like this, however I want to echo @EricWF 's concern about the conditional compilation this introduces. Do we get any benefit if we still use compressed_pair but switch THAT to use no_unique_address instead?

This revision now requires changes to proceed.Jun 27 2019, 10:31 AM
rsmith added a comment.EditedJun 27 2019, 12:56 PM

I like this, however I want to echo @EricWF 's concern about the conditional compilation this introduces. Do we get any benefit if we still use compressed_pair but switch THAT to use no_unique_address instead?

Some, but much much less. In my experiments, converting __alloc_func to use [[no_unique_address]] directly had four times as much impact as converting __compressed_pair to use [[no_unique_address]]. Also, probably minor, but we can't easily get rid of the redundant (eg) allocator move constructions unless we remove the extra layer of indirection.

rsmith updated this revision to Diff 206949.EditedJun 27 2019, 3:11 PM
rsmith marked 2 inline comments as done.
  • Replace #defines of field names in non-[[no_unique_address]] mode with accessor functions
libcxx/include/__hash_table
1042

OK. This will re-add some compilation overhead to the new mode that I was trying to eliminate, but it should be no more than we pay in the old mode. I certainly don't have any measurements to justify this ugliness.

phosek added a subscriber: phosek.Jul 11 2019, 6:49 PM

Ping. Is there any interest in pursuing this, or is the preprocessor complexity too great?

Ping. Is there any interest in pursuing this, or is the preprocessor complexity too great?

Ping.

thakis added inline comments.
libcxx/include/__config
121

Can we make this defined(_LIBCPP_ABI_UNSTABLE) && !defined(_LIBCPP_ABI_USE_NO_UNIQUE_ADDRESS?

We have a natviz file to tell msvc's debugger how to display libc++ types, and I don't know how well codeview can describe no_unique_address. Furthermore, the natviz files currently explicitly mention the compressed pairs.

So it'd be good if we could opt out of this for a while on windows after it lands, to prepare the visualizers.

IMO, this is a good idea, and should be committed evendespite the addition of ifdefs in the implementation.

We'd definitely want to do this in a brand new ABI, and there's not really any better way to accomplish it now.

libcxx/include/__config
122

IMO, the ABI shouldn't change based on a compiler feature-test, even in unstable-abi mode.

But I don't see why it'd be a problem to just enable it by default with _LIBCPP_ABI_UNSTABLE, especially if, per the request above, there's a way to explicitly disable it again.

Then, if _LIBCPP_ABI_USE_NO_UNIQUE_ADDRESS is enabled, it should just emit an #error when the compiler fails to support the attribute.

libcxx/include/memory
2611

ifdef in the middle of an expression isn't worth it. Just repeat the rest of the statement.

ldionne added a reviewer: Restricted Project.Nov 2 2020, 2:18 PM

If we assume that the compiler supports [[no_unique_address]], is this an ABI break? Does no_unique_address result in a different layout than the __compressed_pair?

If we assume that the compiler supports [[no_unique_address]], is this an ABI break? Does no_unique_address result in a different layout than the __compressed_pair?

This is only enabled in the unstable ABI so I think it's OK.

If we assume that the compiler supports [[no_unique_address]], is this an ABI break? Does no_unique_address result in a different layout than the __compressed_pair?

This is only enabled in the unstable ABI so I think it's OK.

The fact that it's only enabled in the unstable ABI causes all this #ifdef mess, which is the reason why this patch has not landed yet. I'm trying to figure out whether we can use __attribute__((no_unique_address)) in the stable ABI too. If we don't assume the compiler supports it then we can't (because then it would break ABI between compilers), but if we require the compiler to support it, then we might be okay. That's my question :-).

If we assume that the compiler supports [[no_unique_address]], is this an ABI break? Does no_unique_address result in a different layout than the __compressed_pair?

Yes, this creates an incompatible memory layout. I don't think there's any way around that, while getting the benefits of removing the intermediate object.

If we put [[no_unique_address]] on the elements inside __compressed_pair, I believe that probably creates the same memory layout, but that doesn't get the desired benefit.
As soon as you move the fields out of the intermediate object, you will lose some padding, and thus the layout will be different (consider class Foo {__compressed_pair<double, char> a; __compressed_pair<short, char> b;}; -- due to alignment, "a" has 7 bytes of padding on the end, resulting in a 24-byte object. Whereas, with class Foo {double a1; char a2; short b1; char b2; }; the total size of the object would only be 16 bytes. (I've omitted no_unique_address for brevity here, since none of the objects in this example are empty or possibly-empty).

If we assume that the compiler supports [[no_unique_address]], is this an ABI break? Does no_unique_address result in a different layout than the __compressed_pair?

This is only enabled in the unstable ABI so I think it's OK.

The fact that it's only enabled in the unstable ABI causes all this #ifdef mess, which is the reason why this patch has not landed yet. I'm trying to figure out whether we can use __attribute__((no_unique_address)) in the stable ABI too. If we don't assume the compiler supports it then we can't (because then it would break ABI between compilers), but if we require the compiler to support it, then we might be okay. That's my question :-).

I misunderstood what you were saying. Hmm, that's an interesting idea. I think you're right, they would always have the same layout. Because if is_empty_v<T> == true then __compressed_pair_elem acts as an empty base class, so it would have the same effect as a member with [[no_unique_address]]. This wouldn't hold true if any of the compilers we support don't always implement empty base optimizations, though.

@jyknight I think we could change the uses were __compressed_pair is the only/last member of an object. Then I'm not sure the padding could change.

If we assume that the compiler supports [[no_unique_address]], is this an ABI break? Does no_unique_address result in a different layout than the __compressed_pair?

Yes, this creates an incompatible memory layout. I don't think there's any way around that, while getting the benefits of removing the intermediate object.

[...]
As soon as you move the fields out of the intermediate object, you will lose some padding, and thus the layout will be different [...]

Ah, right, of course. In that case, though, we could do something like:

class Foo {
  double a1;
  char a2;
#if defined(_LIBCPP_ABI_STABLE)
  char __abi_stable_padding_blablabla[7];
#endif
  short b1;
  char b2;
#if defined(_LIBCPP_ABI_STABLE)
  char __abi_stable_padding2_blablabla[1];
#endif
};

And that would preserve the same layout. Or am I forgetting something else here?

In that case, we could have a utility that creates the right padding for translating from __compressed_pair<T, U> to a sequence of [[no_unique_address]] T; [[no_unique_address]] U;. The usage would be something along the lines of:

class Foo {
  double a1;
  char a2;
#if defined(_LIBCPP_ABI_STABLE)
  [[no_unique_address]] __old_compressed_pair_ABI_padding<double, char> __padding1;
#endif
  short b1;
  char b2;
#if defined(_LIBCPP_ABI_STABLE)
  [[no_unique_address]] __old_compressed_pair_ABI_padding<short, char> __padding2;
#endif
};

This would expand to the right amount of padding for preserving ABI, and it would make sure we don't have to hardcode number of bytes in the code (which is error prone because of platform differences).

And this would neatly preserve ABI while removing 90% of the noise in this patch. Only the member declarations would contain a bit of noise aimed specifically at preserving ABI. WDYT?

Somewhat related, I recently learned that no_unique_address can make clang assert in certain case. Test case + fix: D90622.

Hm, that might indeed be feasible. We'll need to potentially insert padding both before the first type and after the second type, but we can static_assert the correctness against the old layout, so that's not _too_ scary.

If we go this route, it will mean that libc++ is only usable in an ABI-stable manner on Clang 9+ or GCC 9+. On earlier compilers, and on MSVC, the attribute is ignored, and the layout will thus be incorrect/incompatible. Is it viable to require those compiler versions to both build and use libc++? The LLVM build itself only requires GCC 5+/Clang 3.5 at the moment, so that would mean we'd need to start requiring the use of the 2-stage "runtimes" build in order to build libc++, at the least.

Hm, that might indeed be feasible. We'll need to potentially insert padding both before the first type and after the second type, but we can static_assert the correctness against the old layout, so that's not _too_ scary.

If we go this route, it will mean that libc++ is only usable in an ABI-stable manner on Clang 9+ or GCC 9+. On earlier compilers, and on MSVC, the attribute is ignored, and the layout will thus be incorrect/incompatible. Is it viable to require those compiler versions to both build and use libc++? The LLVM build itself only requires GCC 5+/Clang 3.5 at the moment, so that would mean we'd need to start requiring the use of the 2-stage "runtimes" build in order to build libc++, at the least.

At which point, we should be asking ourselves, "is this worth doing?"

Hm, that might indeed be feasible. We'll need to potentially insert padding both before the first type and after the second type, but we can static_assert the correctness against the old layout, so that's not _too_ scary.

If we go this route, it will mean that libc++ is only usable in an ABI-stable manner on Clang 9+ or GCC 9+. On earlier compilers, and on MSVC, the attribute is ignored, and the layout will thus be incorrect/incompatible. Is it viable to require those compiler versions to both build and use libc++? The LLVM build itself only requires GCC 5+/Clang 3.5 at the moment, so that would mean we'd need to start requiring the use of the 2-stage "runtimes" build in order to build libc++, at the least.

I think this is entirely reasonable. I actually have a draft email I want to send to discuss the issue of libc++ compiler support, but its relevance is gated on making progress with the unified standalone build discussed at http://lists.llvm.org/pipermail/libcxx-dev/2020-October/001004.html.

TLDR: It's already wrong to build libc++ with older compilers if you truly care about ABI and having a robust setup, since those are not tested AND they can't always build the library properly (e.g. libc++abi might be missing RTTI for some fundamental types). I think a 2 stage runtimes build should be the default way that libc++ is built as part of releasing a toolchain. This is what libstdc++ does w/ GCC.

At which point, we should be asking ourselves, "is this worth doing?"

I think it's a good question, however according to my answer to the above, my personal opinion is that it's worth exploring, if there are true benefits to be had from using [[no_unique_address]]. I think the benefits are real.

MTC added a subscriber: MTC.Apr 20 2021, 2:08 AM