This is an archive of the discontinued LLVM Phabricator instance.

[clang] Mark `trivial_abi` types as "trivially relocatable".
ClosedPublic

Authored by devin.jeanpierre on Nov 29 2021, 12:06 PM.

Details

Summary

This change enables library code to skip paired move-construction and destruction for trivial_abi types, as if they were trivially-movable and trivially-destructible. This offers an extension to the performance fix offered by trivial_abi: rather than only offering trivial-type-like performance for pass-by-value, it also offers it for library code that moves values but not as arguments.

For example, if we use memcpy for trivially relocatable types inside of vector reallocation, and mark unique_ptr as trivial_abi (via _LIBCPP_ABI_ENABLE_UNIQUE_PTR_TRIVIAL_ABI / _LIBCPP_ABI_UNSTABLE / etc.), this would speed up vector<unique_ptr>::push_back by 40% on my benchmarks. (Though note that in this case, the compiler could have done this anyway, but happens not to due to the inlining horizon.)

If accepted, I intend to follow up with exactly such changes to library code, including and especially std::vector, making them use a trivial relocation operation on trivially relocatable types.

D50119 and P1144:

This change is very similar to D50119, which was rejected from Clang. (That change was an implementation of P1144, which is not yet part of the C++ standard.)

The intent of this change, rather than trying to pick a winning proposal for trivial relocation operations, is to extend the behavior of trivial_abi in a way that could be made compatible with any such proposal. If P1144 or any similar proposal were accepted, then trivial_abi, __is_trivially_relocatable, and everything else in this change would be redefined in terms of that.

Safety:

It's worth pointing out, specifically, that trivial_abi already implies trivial relocatability in a narrow sense: a trivial_abi type, when passed by value, has its constructor run in one location, and its destructor run in another, after the type has been trivially relocated (through registers).

Trivial relocatability optimizations could change the number of paired constructor/destructor calls, but this seems unlikely to matter for trivial_abi types.

Diff Detail

Event Timeline

devin.jeanpierre requested review of this revision.Nov 29 2021, 12:06 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 29 2021, 12:06 PM
devin.jeanpierre edited the summary of this revision. (Show Details)Nov 29 2021, 12:08 PM
devin.jeanpierre edited the summary of this revision. (Show Details)

Just a heads up, I think this is my first change to clang or llvm, and I'd appreciate any feedback you have on the code, review process, etc.

On the one hand, I like that someone else is pushing a patch that's basically the same as D50119, because maybe if enough people reimplement the same thing, eventually Clang will accept one of them. ;)
However, I'd like to point out:

  • D50119 has more extensive tests, and has been live on https://p1144.godbolt.org/z/axx9Wj3r5 for a couple years now; if the consensus is that Clang wants "part of D50119 but not all of it," then IMHO it would be reasonable to put some comments on D50119 with the goal of stripping it down, rather than trying to build up a smaller version of it in parallel.
  • (Major point of contention) To me, [[trivial_abi]] does not imply [[trivially_relocatable]] at all. I believe @rjmccall disagreed a few years ago: basically I said "The two attributes are orthogonal," with practical demonstration, and basically he said "yes, in practice you're right, but philosophically that's a bug and we do actually intend [[trivial_abi]] to imply [[trivially_relocatable]] even though the compiler doesn't enforce it" (but then AFAIK nothing was ever done about that).

Here are my practical examples: https://p1144.godbolt.org/z/EYcMM1nTW
With ATTR=[[clang::trivial_abi]], we see that take(g) passes in a register; but notice that it still calls the copy-ctor, and it will still call the dtor (inside take — the ABI changes to "callee-destroy" for arguments of these types). And we see that std::swap is completely unaffected: we know that the calling convention for passing T by value is altered, but we don't know anything semantic about the behavior of T itself (e.g. when it is relocated or swapped).
With ATTR=[[clang::trivially_relocatable]] (D50119), we see that take(g) still passes on the stack (p1144 trivial relocation is backward-compatible with C++20 by design: it doesn't alter the calling convention at all). But notice that because we know something semantic about the behavior of T (namely, that it is trivially relocatable), std::swap is able to skip calling its ctors and dtors and just swap the bits directly.

Therefore, I support something close to this patch, but I strongly believe that you need to split up the "trivially relocatable" attribute itself from "trivial abi". They are represented orthogonally in the AST, and they should be toggleable orthogonally by the end-user. The end-user might want a trivial-abi type that is not trivially-relocatable, or a trivially-relocatable type that is not trivial-abi.

Related reading: https://quuxplusone.github.io/blog/2018/05/02/trivial-abi-101/
Related watching, on p1144 [[trivially_relocatable]] and adjacent topics such as trivial_abi and move_relocates: https://www.youtube.com/watch?v=SGdfPextuAU

devin.jeanpierre added a comment.EditedNov 29 2021, 1:13 PM

Wow, thanks for the quick response! I really appreciate it.

  • (Major point of contention) To me, [[trivial_abi]] does not imply [[trivially_relocatable]] at all. I believe @rjmccall disagreed a few years ago: basically I said "The two attributes are orthogonal," with practical demonstration, and basically he said "yes, in practice you're right, but philosophically that's a bug and we do actually intend [[trivial_abi]] to imply [[trivially_relocatable]] even though the compiler doesn't enforce it" (but then AFAIK nothing was ever done about that).

I agree with you that [[clang::trivial_abi]] could be taken not to imply [[trivially_relocatable]]. However, I also think we can choose for it to imply it -- the meaning of [[clang::trivial_abi]] is up to us! It does not seem very useful to have a type which is trivial_abi, but not trivially relocatable: such a type can be relocated when being passed by value, but is leaving performance on the table everywhere else. It might be that such a type really depends on the number of paired constructor/destructor calls, but this seems unlikely to be a realistic concern, especially since the number of paired constructor/destructor calls can change as a result of implementation changes in e.g. std::vector's reallocation algorithm anyway.

As you noted, "nothing was ever done about that". This change seeks to address that problem. :B

Therefore, I support something close to this patch, but I strongly believe that you need to split up the "trivially relocatable" attribute itself from "trivial abi". They are represented orthogonally in the AST, and they should be toggleable orthogonally by the end-user. The end-user might want a trivial-abi type that is not trivially-relocatable, or a trivially-relocatable type that is not trivial-abi.

Leaving aside differences of opinion on how important it is, a version of the standards proposal could maintain linear independence, though not orthogonality: we could imagine writing struct [[clang::trivial_abi]] [[trivially_relocatable(false)]] Foo { ... }; and the like to achieve every combination.

I do agree that not every trivially relocatable type should be trivial for calls. This can produce bad performance outcomes in the case of e.g. non-inline functions which accept the trivial-for-calls value by pointer/reference, such as methods. Calling those can force the value to be put back in the stack, adding additional (trivial) copies where a non-trivial type would've had none. And so, even with this change, we would still want to be able to specify trivial relocatability separately, without implying trivial-for-calls.

A bigger problem is that my perception from reading that review thread was that any [[trivially_relocatable]] attribute proposal was a non-starter: this is the subject of a standards proposal which has not yet been accepted and clang doesn't want to "pick a winner". So my hope is that rather than picking a winner there, we extend the behavior of trivial_abi, in a way that is compatible with and desirable to have in combination with any accepted standards proposal. Hopefully, by being a bit less ambitious and complete, and reducing scope only to changes in behavior for trivial_abi, the resulting change to clang will be more acceptable.

This does mean that some performance is left on the table -- any types which can't be marked trivial_abi will not get the benefit of trivial relocatability -- but that's what the standards proposal can aim to fix.

rsmith added a comment.Dec 1 2021, 3:33 PM
  • D50119 has more extensive tests, and has been live on https://p1144.godbolt.org/z/axx9Wj3r5 for a couple years now; if the consensus is that Clang wants "part of D50119 but not all of it," then IMHO it would be reasonable to put some comments on D50119 with the goal of stripping it down, rather than trying to build up a smaller version of it in parallel.

I've compared the two patches, and as far as I can tell, this patch is effectively the subset of D50119 that includes only the new type trait and not the new attributes, with the following exceptions:

  • No manual __has_extension support. This is an improvement in this patch; the intended way to test for type trait primitives is with __has_builtin, and that has built-in knowledge of type traits. I think this may have changed since D50119 was first proposed; __has_builtin has not always supported type trait builtins.
  • This patch has an unnecessary check for trivial destruction of trivially copyable types (commented in this review).
  • This patch treats trivial-for-calls as implying trivially-relocatable, and in particular this means that [[trivial_abi]] types are implicitly trivially-relocatable, as discussed below.
  • -ast-dump support. I'd be inclined to omit that from this change; until / unless we have an attribute that makes a class trivially-relocatable but not trivial for calls, adding dump support for canPassInRegisters rather than isTriviallyRelocatable seems better-motivated, but wouldn't make sense as part of this patch.
  • Different set of test cases.

Even if we want to land all of the functionality in D50119 now, splitting out the new type trait and the new attribute into separate patches makes sense to me.

  • (Major point of contention) To me, [[trivial_abi]] does not imply [[trivially_relocatable]] at all. I believe @rjmccall disagreed a few years ago: basically I said "The two attributes are orthogonal," with practical demonstration, and basically he said "yes, in practice you're right, but philosophically that's a bug and we do actually intend [[trivial_abi]] to imply [[trivially_relocatable]] even though the compiler doesn't enforce it" (but then AFAIK nothing was ever done about that).

I agree with @rjmccall on this: [[trivial_abi]] is meant to imply both that it's correct to trivially relocate and that we should perform a trivial relocation when making a function call. The latter without the former doesn't make sense to me. I agree that the former without the latter does make sense, and I don't think this change is an impediment to supporting that.

Personally I'm more comfortable landing this subset of D50119 now than I would be with landing all of D50119, given that the type trait seems well-motivated and unlikely to have any serious conflicts with an eventually-standardized feature. John's concern that the semantics of an eventual [[trivially_relocatable]] attribute may differ from what's in D50119 seem well-founded to me.

clang/include/clang/Basic/AttrDocs.td
3211–3215

I think this documentation change has mixed together two different things. The revised wording says that [[trivial_abi]] implies trivially-relocatable and that trivially-relocatable implies passing using the C ABI, which is wrong: the second implication does not hold. What we should say is that [[trivial_abi]] (if we're not in the "has no effect" case described below) implies both that the type is trivially-relocatable, and that it is passed using the C ABI (that is, that we trivially relocate in argument passing).

Instead of the wording changes you have here, perhaps we could leave the old wording alone and add a paragraph that says that a type with the attribute is assumed to be trivially-relocatable for the purpose of __is_trivially_relocatable, but that Clang doesn't yet take advantage of this fact anywhere other than argument passing.

clang/lib/AST/Type.cpp
2500

You can simplify this a little by dropping this isDestructedType() check; trivially copyable implies trivially destructible.

It would be a little more precise to check for !isNonTrivialToPrimitiveDestructiveMove() && !isDestructedType() here; that'd treat __autoreleasing pointers as trivially-relocatable, but not __strong pointers (because "destructive move" there actually means "non-destructive move" and needs to leave the object in a valid but unspecified state, which means nulling out a moved-from __strong pointer). I think getting this "fully" right would require a bit more plumbing to properly handle the case of a struct containing a __strong pointer (which is trivially relocatable but not trivially anything else).

Suggested changes from code review.

(Sorry for delayed reply -- I made the mistake of signing up to phabricator with my personal email, which I don't check very well, apparently!)

clang/include/clang/Basic/AttrDocs.td
3211–3215

Yeah, this was too aggressive a wording change, and might easily change later when [[trivially_relocatable]] is added.

I did drop the "Clang doesn't yet take advantage of this fact anywhere other than argument passing.", because I feel like -- does that sort of sentiment belong instead in the docs for __is_trivially_relocatable? Maybe I should change __is_trivially_relocatable to note that this is never taken advantage of anywhere except in by-value argument passing (when the type is trivial for the purpose of calls) and library code.

clang/lib/AST/Type.cpp
2500

Oooh. isNonTrivialToPrimitiveDestructiveMove is perfect.

Actually, a struct containing a __strong pointer is already handled -- Clang already ensures they're passed in registers, and does all the appropriate logic. It's just __strong pointers themselves that I'd need to handle the same way here. Unfortunately, I really, *really* don't understand the source code dealing with __strong pointers in structs. (In particular, I don't understand the difference between GC::Strong and ObjCLifetime::OCL_Strong, plus I'm uncomfy with the global language opts.)

It's tempting to use the return value of isNonTrivialToPrimitiveDestructiveMove and check for Strong -- but now that's some *third* thing, and not mirroring the logic of when it's in a struct as a data member.

rsmith added inline comments.Dec 6 2021, 6:18 PM
clang/lib/AST/Type.cpp
2500

The existing handling only covers Objective-C++ and not (non-C++) Objective-C (the C and C++ codepaths for computing struct properties are entirely different, because the C++ side of things is primarily working out properties of special member functions). It looks like your test coverage doesn't cover the non-C++ side of this.

Don't worry about GC::Strong. ObjC GC is thoroughly dead (unless there's GNU-runtime interest in it?), and AFAIK we've never made any effort to incorporate it into our treatment of non-trivial structs.

Quuxplusone added inline comments.Dec 7 2021, 2:10 PM
clang/include/clang/Basic/AttrDocs.td
3211–3215

Richard's comment jogged my brain and made we worry about this again. IIUC, you're saying:

  1. All trivial-abi types are in fact trivially relocated, specifically when they are passed to functions.
  2. Therefore, all trivial-abi types are "trivially relocatable" in general.
  3. Therefore, all trivial-abi types are safe to optimize in all the ways depicted in D67524.

I agree with 1 for sure, but I'm skeptical of the logical soundness of 2 and 3. Are we willing to set it in stone?

If we're not willing to set that whole syllogism in stone, then I would object to this patch. Because of this syllogism:

  1. This patch makes Clang's __is_trivially_relocatable(T) return true for all trivial-abi types.
  2. D67524 (rightly) applies its optimizations to all types for which __is_trivially_relocatable(T) is true.
  3. Therefore, if we adopt this patch, we'd better be dang sure that all trivial-abi types are safe to optimize in all the ways depicted in D67524.

That is, we must set in stone the idea that all trivial-abi types can safely:

  • be swapped trivially, in terms of bytewise swap, inside e.g. std::sort
  • use trivial (bytewise) relocation inside vector reallocation
  • use trivial (bytewise) relocation when shifting inside vector::insert or vector::erase

If @devin.jeanpierre and @rsmith and @rjmccall and @ahatanak are all nodding along to this and saying "yeah, totally, every premise above is true and the syllogisms are all valid and those optimizations are all valid, for every trivial-abi type, forever," then I'm happy.

rsmith added inline comments.Dec 7 2021, 2:37 PM
clang/include/clang/Basic/AttrDocs.td
3211–3215

What use cases would we be giving up on if we go this way? What would a type look like for which trivial relocation when passing to functions is correct, but for which trivial relocation in general is either incorrect or undesirable?

I am perfectly happy accepting that syllogism. Passing a value in registers is a trivial relocation. If there is a reason your type shouldn't be trivially relocated, you should not make it trivial_abi.

Use PCK_ARCStrong to check for ObjC strong pointers, marking them as trivially relocatable as well.

devin.jeanpierre added a comment.EditedDec 7 2021, 5:55 PM

(Sorry, I think I'm doing threading wrong here due to lack of experience with phabricator. The reply buttons are grayed out!)

  1. All trivial-abi types are in fact trivially relocated, specifically when they are passed to functions.
  2. Therefore, all trivial-abi types are "trivially relocatable" in general.

This sounds mostly good so far. For example:

class [[clang::trivial_abi]] MyType { ... };

MyType Create() {
  MyType x;
  /* do stuff */
  return x;
}

void Consume(MyType) {}
int main() {
  Consume(Create());
}

Here, you can place x into any valid state for a MyType, and it will be trivially relocated in the call to Consume, without any intermediate steps, thanks to guaranteed copy elision. If it is not safe to trivially relocate in general, then this is not a sound use of trivial_abi.

So I think some version of 2 is correct: safe use of [[clang::trivial_abi]] requires that it be safe to relocate for all publicly reachable object states.

(This isn't watertight: perhaps a non-publicly reachable state, only achieved temporarily internally, is not trivially relocatable. Then such a type would still be safe to use via its public API, but not be trivially relocatable in general, and very dangerous to use within its private API. By adding the optimizations, we'd add more footguns to such a type inside these sections of code.)

  1. Therefore, all trivial-abi types are safe to optimize in all the ways depicted in D67524.

This is not quite right. It might be safe to relocate, in that adding a new relocation operation does not change program behavior, but it doesn't quite follow that it's safe to stop calling the move constructor and destructor. So e.g. some type which relies on exactly counting pairs move constructor / destructor calls might get upset.

(Edit: fixed wording on that last sentence, oops!)

Also, there is that parenthetical above: if a type can sometimes reach states internally that are not trivially relocatable, and in addition it calls routines like swap etc. while in such a state, then a relocation optimization would be unsound.

Nonetheless, my hope is that these can be considered contrived or unsound cases which we will not support for trivial_abi.

clang/lib/AST/Type.cpp
2500

I think this is fine. The keyword __is_trivially_relocatable only works in (Objective) C++ source files (KEYCXX), and so in order for it to be callable, the struct must have been defined in a C++ source file to begin with, right? At least, assuming ObjC works the same as C -- where the struct is defined in a header that's interpreted as (objective) C++ when it's used by (objective) C++ code. I might not be understanding any number of things, though. What would the test you're imagining look like?

2500

I switched to checking PCK_ARCStrong after all, since this is set by checking for Qualifiers::OCL_Strong, which is exactly the same as what sets structs to be passed in registers. (if we ignore the rest of the logic (and things like GC::Strong)). Thanks for the reassurance, @rjmccall !

That said, this still might not be what you would write if you were doing it, so let me know if this looks good.

Trivial relocation doesn't imply that types have to be safe against being suddenly relocated during the middle of operations while they're not in a safe internal state. That is not a consideration.

devin.jeanpierre added a comment.EditedDec 7 2021, 6:54 PM

Trivial relocation doesn't imply that types have to be safe against being suddenly relocated during the middle of operations while they're not in a safe internal state. That is not a consideration.

You're right. In fact, I want to walk back that concern altogether. As a simple example, I think swap() is allowed to be defined as so:

template <typename T>
void swap(T& lhs, T& rhs) {
    ([&](T tmp) {lhs = std::move(rhs); rhs = std::move(tmp);})(std::move(lhs));
}

It's really hard to argue that a type can sensibly call out to library code while in a state that is non-relocatable, despite self-labeling as [[clang::trivial_abi]]. If the library is allowed to move the value, it's allowed to move it into a parameter, causing a relocation! So I actually regret pointing them out, or thinking that they might be a concern. :)

(Demonstration code: https://godbolt.org/z/j88Kqd49a)

Quuxplusone added inline comments.Dec 7 2021, 6:55 PM
clang/include/clang/Basic/AttrDocs.td
3211–3215

! @rsmith wrote:
What would a type look like for which trivial relocation when passing to functions is correct, but for which trivial relocation in general is either incorrect or undesirable?

I have no examples in mind myself, but that (and my skepticism) are largely due to my lack of practical experience with [[trivial_abi]]. If I can't come up with any and you can't come up with any and... etc..., then that's good news for this PR.

! @rjmccall wrote:

I am perfectly happy accepting that syllogism. [...] If there is a reason your type [ever] shouldn't be trivially relocated, you should not make it trivial_abi.

Okay, that's good news for this PR also. :)

Trivial relocation doesn't imply that types have to be safe against being suddenly relocated during the middle of operations while they're not in a safe internal state. That is not a consideration.

+100.

rsmith accepted this revision.Jan 12 2022, 3:16 PM

LGTM

This revision is now accepted and ready to land.Jan 12 2022, 3:16 PM

Update to pass on Windows (untested right now).

On Windows, unlike Itanium ABI, the only special member functions used are the copy constructor and the destructor. If the copy constructor is deleted, then the type is not considered trivial for calls, even if it has a move constructor.

OK, while I'm struggling to set up a new Windows machine so I can make sure this works on Windows... @Quuxplusone, after this is merged, do you want to rebase D67524 on top of this, or should I? I can review it -- I think when I looked at it, I only had two ideas for changes:

  1. May need to guard all of these optimizations on the allocator being the standard allocator, since otherwise the change in the number of constructor/destructor calls would be observable to more than just the annotated type.
  1. changing std::swap to correctly handle potentially-overlapping-objects. My thought is we could test that there's no reusable tail padding.

First draft: has_unique_object_representations is conservative -- on the Itanium ABI, "POD for the purposes of layout" types can have padding which isn't reused when it's a potentially overlapping subobject.

Second draft: check by hand:

    
struct TestStruct {
  [[no_unique_address]] T x;
  // not sure if this needs to be a bitfield or anything like that, but the idea is this.
  char extra_byte;
};
bool has_padding = sizeof(TestStruct) == sizeof(T);

Third draft: we can improve on this further still by iterating over different numbers of characters to determine the exact size of the padding, and memcpy only the non-padding parts.


(D63620 could in some form also be ported over, but it needs to be guarded behind ABI stability, since [[clang::trivial_abi]] is an ABI breaking change. For example, the same way it was done for unique_ptr, with the same benefits.)

OK, while I'm struggling to set up a new Windows machine so I can make sure this works on Windows... @Quuxplusone, after this is merged, do you want to rebase D67524 on top of this, or should I? I can review it -- I think when I looked at it, I only had two ideas for changes:

  1. May need to guard all of these optimizations on the allocator being the standard allocator, since otherwise the change in the number of constructor/destructor calls would be observable to more than just the annotated type.

My intent is that the type-traits __has_trivial_construct and __has_trivial_destroy already handle that. The standard std::allocator has a trivial construct and destroy, but so do lots of user-defined allocators.

  1. changing std::swap to correctly handle potentially-overlapping-objects. My thought is we could test that there's no reusable tail padding.

First draft: has_unique_object_representations is conservative -- on the Itanium ABI, "POD for the purposes of layout" types can have padding which isn't reused when it's a potentially overlapping subobject.

I'd like something that triggers successfully even when there are internal padding bytes, e.g. struct TR { int x; std::string y; }. (But anything is better than nothing. I'm going to keep maintaining my branch for the foreseeable future, so I can keep doing my thing no matter what half-measure makes it into trunk.)

Second draft: check by hand:

    
struct TestStruct {
  [[no_unique_address]] T x;
  // not sure if this needs to be a bitfield or anything like that, but the idea is this.
  char extra_byte;
};
bool has_padding = sizeof(TestStruct) == sizeof(T);

If this works, then either my mental model of [[no_unique_address]] is wrong or my mental model of tail padding is wrong. I would expect more like

template<class T> struct DerivedFrom : public T { char extra_byte; };
template<class T> requires is_final_v<T> struct DerivedFrom<T> { char x[sizeof(T)+1]; }; // definitely no objects in the tail padding if it's final
bool has_padding = sizeof(DerivedFrom<T>) == sizeof(T);

If T is final, then it can't have stuff in its tail padding, I think: https://godbolt.org/z/69sjf15MY

Anyway, that seems like a reasonable path in the name of getting something merged. My personal druthers is that the Standard should just admit that std::swap was never meant to work on polymorphic objects, so therefore it can assume it's never swapping Derived objects via Base&, so therefore it can assume tail padding is not an issue, so therefore no metaprogramming is needed and it can just always do the efficient thing (which is therefore what I've implemented in D67524). But I know that's unlikely to ever happen.

(D63620 could in some form also be ported over, but it needs to be guarded behind ABI stability, since [[clang::trivial_abi]] is an ABI breaking change. For example, the same way it was done for unique_ptr, with the same benefits.)

The point of [[trivially_relocatable]] is that it doesn't change the ABI (and therefore also doesn't cause the lifetime-related pitfalls described on that web page). So I won't attempt to merge it with [[trivial_abi]], and I'm also personally uninterested in pursuing a [[trivial_abi]] std::string etc., because that gives no benefits to people stuck on libc++ ABIv1 (which is everyone, AFAICT). But if someone wants to implement _LIBCPP_ABI_ENABLE_STRING_TRIVIAL_ABI, _LIBCPP_ABI_ENABLE_VECTOR_TRIVIAL_ABI, etc. etc., for the benefit of ABIv2 users... hey, go for it.

clang/test/SemaCXX/attr-trivial-abi.cpp
9–11

(1) Why should Windows be different from everyone else here?
(2) AFAIK, a user-defined move ctor means the copy ctor is not implicitly declared, but it's not deleted; so I think this comment is slightly wrong.

117–121

Again, I don't see why Windows should be any different. CopyDeleted is trivial to move and trivial to destroy; that means (by definition) it's trivial to relocate.

136

I think you meant s/CopyDeleted/S20/ here.

clang/test/SemaCXX/type-traits.cpp
2862

It's mildly surprising that an incomplete type can be trivially relocatable; but it doesn't really matter, since nobody can be depending on this answer. (int[] is not movable or destructible, so it's not relocatable — never mind trivially relocatable.)

clang/test/SemaObjCXX/objc-weak-type-traits.mm
213–214

IIUC (which I probably don't): Both __strong id and __weak id are trivially relocatable in the p1144 sense, but only __strong id is trivial-for-purposes-of-ABI, and that's why only __strong id is being caught here. Yes/no?

Clarify Windows comments.

devin.jeanpierre added a comment.EditedJan 19 2022, 12:00 PM

CI test finished successfully before windows setup did 😢. My workplace's Windows VMs are a bit hosed at the moment...

OK, while I'm struggling to set up a new Windows machine so I can make sure this works on Windows... @Quuxplusone, after this is merged, do you want to rebase D67524 on top of this, or should I? I can review it -- I think when I looked at it, I only had two ideas for changes:

  1. May need to guard all of these optimizations on the allocator being the standard allocator, since otherwise the change in the number of constructor/destructor calls would be observable to more than just the annotated type.

My intent is that the type-traits __has_trivial_construct and __has_trivial_destroy already handle that. The standard std::allocator has a trivial construct and destroy, but so do lots of user-defined allocators.

Fair, I'm sorry for missing that detail.

  1. changing std::swap to correctly handle potentially-overlapping-objects. My thought is we could test that there's no reusable tail padding.

First draft: has_unique_object_representations is conservative -- on the Itanium ABI, "POD for the purposes of layout" types can have padding which isn't reused when it's a potentially overlapping subobject.

I'd like something that triggers successfully even when there are internal padding bytes, e.g. struct TR { int x; std::string y; }. (But anything is better than nothing. I'm going to keep maintaining my branch for the foreseeable future, so I can keep doing my thing no matter what half-measure makes it into trunk.)

Well, we must do something. In particular, consider this code:

// class to make it non-POD for the purpose of layout
class X { int64_t a; int8_t b;};
class Y : public X {int8_t c;};

Y y1 = ...;
Y y2 = ....;
X& x1 = y1;
X& x2 = y2;

std::swap(x1, x2);

Here, at least in some implementations, c will live inside the tail padding of X. Without any special guarding, if swap used a memcpy with sizeof(x1) etc., then it would overwrite the padding bytes, which include c. I don't think swap should do this.

Second draft: check by hand:

    
struct TestStruct {
  [[no_unique_address]] T x;
  // not sure if this needs to be a bitfield or anything like that, but the idea is this.
  char extra_byte;
};
bool has_padding = sizeof(TestStruct) == sizeof(T);

If this works, then either my mental model of [[no_unique_address]] is wrong or my mental model of tail padding is wrong. I would expect more like

template<class T> struct DerivedFrom : public T { char extra_byte; };
template<class T> requires is_final_v<T> struct DerivedFrom<T> { char x[sizeof(T)+1]; }; // definitely no objects in the tail padding if it's final
bool has_padding = sizeof(DerivedFrom<T>) == sizeof(T);

If T is final, then it can't have stuff in its tail padding, I think: https://godbolt.org/z/69sjf15MY

Unfortunately, even if T is final, it can have things in its tail padding: [[no_unique_address]] allows all the same optimizations as a base class, including reuse of tail padding and EBCO. See e.g. the cppreference page.

The reason your godbolt link shows them having the same size is that the A struct is "POD for the purpose of layout". In the Itanium ABI, POD types are laid out as normal, but their tail padding isn't reused in any circumstances (not for subclasses either -- so try e.g. removing final and using inheritance, and the assertion still passes: https://godbolt.org/z/MWrMoEbvb). If you make that a class instead of a struct, it isn't POD for the purpose of layout (due to private members), and the assertion fails now that the tail padding can be reused: https://godbolt.org/z/5edfqrK47

However, I think you're right that this isn't enough -- I don't believe it's guaranteed that inheritance and [[no_unique_address]] use padding in the exact same way, so you probably need to try both just to be safe.

Anyway, that seems like a reasonable path in the name of getting something merged. My personal druthers is that the Standard should just admit that std::swap was never meant to work on polymorphic objects, so therefore it can assume it's never swapping Derived objects via Base&, so therefore it can assume tail padding is not an issue, so therefore no metaprogramming is needed and it can just always do the efficient thing (which is therefore what I've implemented in D67524). But I know that's unlikely to ever happen.

I don't think it matters if they're polymorphic or not -- swapping the derived class's member variables when you're swapping the base class subobject (or a [[no_unique_address]] member) is unexpected and an observable change in behavior.

(D63620 could in some form also be ported over, but it needs to be guarded behind ABI stability, since [[clang::trivial_abi]] is an ABI breaking change. For example, the same way it was done for unique_ptr, with the same benefits.)

The point of [[trivially_relocatable]] is that it doesn't change the ABI (and therefore also doesn't cause the lifetime-related pitfalls described on that web page). So I won't attempt to merge it with [[trivial_abi]], and I'm also personally uninterested in pursuing a [[trivial_abi]] std::string etc., because that gives no benefits to people stuck on libc++ ABIv1 (which is everyone, AFAICT). But if someone wants to implement _LIBCPP_ABI_ENABLE_STRING_TRIVIAL_ABI, _LIBCPP_ABI_ENABLE_VECTOR_TRIVIAL_ABI, etc. etc., for the benefit of ABIv2 users... hey, go for it.

Yeah, I'm probably going to want to do that later, FWIW.

clang/test/SemaCXX/attr-trivial-abi.cpp
9–11

(2) AFAIK, a user-defined move ctor means the copy ctor is not implicitly declared, but it's not deleted; so I think this comment is slightly wrong.

Sorry, I get that confused a lot. :)

(1) Why should Windows be different from everyone else here?

In this change so far, an object is only considered trivially relocatable if it's trivial for calls, and Windows is non-conforming wrt what it considers trivial for calls: it won't consider something trivial for calls if it isn't trivially copyable.

Code: https://clang.llvm.org/doxygen/SemaDeclCXX_8cpp_source.html#l06558

It surprised me too. This is a good motivator IMO to support something like your proposal, and have types be trivially relocatable that aren't trivial for calls. This allows, as you mention elsewhere, for optimizations that aren't ABI-breaking, and for e.g. non-conforming platforms like this to still take advantage of trivial relocatability.

clang/test/SemaCXX/attr-trivial-abi.cpp
9–11

In this change so far, an object is only considered trivially relocatable if it's trivial for calls, and Windows [has a calling convention that's different from the Itanium C++ ABI's calling convention]: it won't consider something trivial for calls if it isn't trivially copyable.

Ah, I see, and I agree with your logic here. It's unintuitive but only for the same reasons we've already hashed over: we're introducing an __is_trivially_relocatable(T) that gives zero false-positives, but (for now) frequent false-negatives, and this just happens to be one additional source of false negatives on Win32 specifically. (And I keep tripping over it because this frequent-false-negative __is_trivially_relocatable(T) is named the same as D50119's "perfect" discriminator.)

Everyone's on board with the idea that we're promising to preserve the true positives forever, but at the same time we're expecting that we might someday fix the false negatives, right? I.e., nobody's expecting !__is_trivially_relocatable(a<int>) to remain true on Win32 forever? (I'm pretty sure we're all on board with this.)

Sorry, I missed your other comments. Let me know if there's anything else I didn't address.

clang/test/SemaCXX/attr-trivial-abi.cpp
136

Ugh, copy-paste. Thanks, done.

clang/test/SemaCXX/type-traits.cpp
2862

I think it is consistent with the other type traits in C++ -- for example, this assert would also pass:

static_assert(std::is_trivially_copyable_v<int[]>, "");

The important thing is that this line fails to compile:

struct Incomplete;
bool unused = __is_trivially_relocatable(Incomplete);
clang/test/SemaObjCXX/objc-weak-type-traits.mm
213–214

__weak can't be trivially relocatable in any proposal, because their address is registered in a runtime table, so that they can be nulled out when the object is deleted.

Fix copy-paste error.

Sorry, I missed your other comments. Let me know if there's anything else I didn't address.

I just noticed that there's a "Done" checkbox next to each comment, and it isn't automatically checked if I reply to it -- I suppose I should be checking these so that it's easier to keep track of what's left to do? Sorry, this is my first time using phabricator!

clang/test/SemaCXX/attr-trivial-abi.cpp
9–11

Everyone's on board with the idea that we're promising to preserve the true positives forever, but at the same time we're expecting that we might someday fix the false negatives, right? I.e., nobody's expecting !__is_trivially_relocatable(a<int>) to remain true on Win32 forever? (I'm pretty sure we're all on board with this.)

+1

My hope is that future changes can fix the false negatives -- either with explicit annotations (like [[trivially_relocatable]), or with improved automatic inference, etc.

(For example, we could imagine marking types as trivially relocatable if they have *a* trivial (for calls) copy or move constructor and a trivial (for calls) destructor, even if they also have a nontrivial move or copy destructor. This would be a superset of trivial-for-calls types on all platforms, and especially Windows.)

devin.jeanpierre marked 12 inline comments as done.Jan 19 2022, 1:50 PM
rsmith added inline comments.Jan 19 2022, 2:10 PM
clang/test/SemaCXX/attr-trivial-abi.cpp
9–11

Everyone's on board with the idea that we're promising to preserve the true positives forever, but at the same time we're expecting that we might someday fix the false negatives, right? I.e., nobody's expecting !__is_trivially_relocatable(a<int>) to remain true on Win32 forever? (I'm pretty sure we're all on board with this.)

+1 from me too. Users of __is_trivially_relocatable should expect it to start returning true for more trivially-relocatable types over time.

devin.jeanpierre marked an inline comment as done.Jan 19 2022, 3:08 PM

@rsmith I've pulled+rebased again to avoid the (looks like pre-existing) failure. Copy-pasting the wording on https://llvm.org/docs/MyFirstTypoFix.html#commit-by-proxy: I don’t have commit access, can you land this patch for me? Please use “Devin Jeanpierre jeanpierreda@google.com” to commit the change.

(I normally use the jeanpierreda email alias for external stuff, so that my internal username doesn't leak.)

This revision was automatically updated to reflect the committed changes.

Woooo! Thank you for submitting, and thanks everyone for the kind and helpful reviews. 😀

dyung added a subscriber: dyung.Feb 3 2022, 12:13 PM

Hi, your change is causing a test failure in the test attr-trivial-abi.cpp that you modified on the PS4 Windows box:

https://lab.llvm.org/staging/#/builders/204/builds/758/steps/7/logs/FAIL__Clang__attr-trivial-abi_cpp

******************** TEST 'Clang :: SemaCXX/attr-trivial-abi.cpp' FAILED ********************
Script:
--
: 'RUN: at line 1';   c:\buildbot-root\llvm-clang-x86_64-sie-win\build\bin\clang.exe -cc1 -internal-isystem c:\buildbot-root\llvm-clang-x86_64-sie-win\build\lib\clang\15.0.0\include -nostdsysteminc -fsyntax-only -verify C:\buildbot-root\llvm-clang-x86_64-sie-win\llvm-project\clang\test\SemaCXX\attr-trivial-abi.cpp -std=c++11
--
Exit Code: 1
Command Output (stdout):
--
$ ":" "RUN: at line 1"
$ "c:\buildbot-root\llvm-clang-x86_64-sie-win\build\bin\clang.exe" "-cc1" "-internal-isystem" "c:\buildbot-root\llvm-clang-x86_64-sie-win\build\lib\clang\15.0.0\include" "-nostdsysteminc" "-fsyntax-only" "-verify" "C:\buildbot-root\llvm-clang-x86_64-sie-win\llvm-project\clang\test\SemaCXX\attr-trivial-abi.cpp" "-std=c++11"
# command stderr:
error: 'error' diagnostics seen but not expected: 
  File C:\buildbot-root\llvm-clang-x86_64-sie-win\llvm-project\clang\test\SemaCXX\attr-trivial-abi.cpp Line 42: static_assert failed due to requirement '!__is_trivially_relocatable(S3_3)' ""
  File C:\buildbot-root\llvm-clang-x86_64-sie-win\llvm-project\clang\test\SemaCXX\attr-trivial-abi.cpp Line 114: static_assert failed due to requirement '!__is_trivially_relocatable(deletedCopyMoveConstructor::CopyMoveDeleted)' ""
  File C:\buildbot-root\llvm-clang-x86_64-sie-win\llvm-project\clang\test\SemaCXX\attr-trivial-abi.cpp Line 119: static_assert failed due to requirement '!__is_trivially_relocatable(deletedCopyMoveConstructor::S18)' ""
  File C:\buildbot-root\llvm-clang-x86_64-sie-win\llvm-project\clang\test\SemaCXX\attr-trivial-abi.cpp Line 141: static_assert failed due to requirement '!__is_trivially_relocatable(deletedCopyMoveConstructor::S19)' ""
4 errors generated.
error: command failed with exit status: 1
--
********************

Can you take a look?

Oops, sorry about that. What is the correct way to test/reproduce the change? Do I / can I submit builds to the buildbot manually for testing?

Also, should I be rolling back this change, or no? Not sure of the protocol here, this is my first change to Clang.

P.S. that breakage is disturbing -- I would not have expected this sort of failure on any platform, and I'm not sure if it reflects a bug on PS4. (It seems to be that structs are passed in registers, even if the member variables of that struct cannot be.)

I rolled it back while @devin.jeanpierre investigates.

Thank you @gribozavr2 ! I'll hopefully have a roll-forward ready for you either today or tomorrow.

dyung added a comment.Feb 3 2022, 2:00 PM

Oops, sorry about that. What is the correct way to test/reproduce the change? Do I / can I submit builds to the buildbot manually for testing?

Also, should I be rolling back this change, or no? Not sure of the protocol here, this is my first change to Clang.

P.S. that breakage is disturbing -- I would not have expected this sort of failure on any platform, and I'm not sure if it reflects a bug on PS4. (It seems to be that structs are passed in registers, even if the member variables of that struct cannot be.)

Reproducing it should be as simple as just rerunning the test with the PS4 triple, x86_64-scei-ps4 and comparing that output with what is expected. If it helps, I noticed that it seems to reproduce on both linux/windows.

The core difference of behavior is this, in the logic for setting canPassInRegisters:

// Clang <= 4 used the pre-C++11 rule, which ignores move operations.
// The PS4 platform ABI follows the behavior of Clang 3.2.
if (CCK == TargetInfo::CCK_ClangABI4OrPS4)
  return !D->hasNonTrivialDestructorForCall() &&
         !D->hasNonTrivialCopyConstructorForCall();

This is noticeably different from the non-PS4 behavior: not only does it ignore the move constructor, but it also does not require that a copy constructor or destructor even exist. And so all of the broken tests are structs where the copy constructor is deleted and the destructor is either trivial or deleted. On non-CCK_ClangABI4OrPS4, it is not passable in registers, but on CCK_ClangABI4OrPS4 it is.

Related, but separate: DR1734, which is the same thing, but for std::is_trivially_copyable.

I believe the fix here, like for Windows, is to special-case this platform in the test. (Or delete these tests.)