This is an archive of the discontinued LLVM Phabricator instance.

[ObjC] Enable __strong pointers in structs under ARC
ClosedPublic

Authored by ahatanak on Dec 14 2017, 1:24 AM.

Details

Summary

ObjectiveC ARC in C mode currently disallows having strong and weak pointers in structs. This patch takes the first step towards lifting that restriction. In order to properly manage the lifetimes of the objects owned by the structs, this patch modifies IRGen to synthesize special functions for structs with __strong pointers and call them when those structs have to be initialized, copied, and destructed similarly to what C++ special member functions of non-trivial classes do.

I plan to send a patch that allows __weak pointers in structs after this patch is committed.

Diff Detail

Event Timeline

ahatanak created this revision.Dec 14 2017, 1:24 AM

I just found that the code that creates the mangled name for a special function is not correct. Two structs with different record layouts can end up having functions that have the same name.

I'll fix the bug and update the patch today.

ahatanak updated this revision to Diff 127231.Dec 15 2017, 6:41 PM

Fixed a bug where the code gets stuck in an infinite loop when a struct with an array field is passed to FieldInfoToString. Encoded the offset of an array in a struct into the function name. Also, added comments that describe the structure of the mangled function name.

ahatanak updated this revision to Diff 127233.Dec 15 2017, 6:46 PM

Insert an underscore before src-alignment in the BNF rule for alignment-info.

<alignment-info> ::= <dst-alignment> ["_" <src-alignment>]

You should add a has_feature check for this (has_feature(objc_arc_fields)?), as well as documentation. The documentation would go in the ARC spec.

include/clang/AST/Decl.h
3575

Naming style would be "isNonTrivialToDefualtInitialize" and so on for all of these.

If these aren't guaranteed to be consistent with the C++ definitions (and the copying one almost certainly isn't, since there isn't a single definition of copying in C++), I think you probably need to name these in a way that makes it clear that it's the C bit that's being queried/changed. Perhaps adding "Primitive" to each one, e.g. isNonTrivialToPrimitiveCopy? Richard is probably opinionated about this.

lib/AST/ASTContext.cpp
5780

"Return true" is just a description of the code that follows. I would suggest something like "The block needs copy/destroy helpers if...".

lib/CodeGen/CGBlocks.cpp
2244

You've updated the code for __block captures, but I think you missed computeBlockInfo for direct captures.

lib/CodeGen/CGDecl.cpp
1421

This can only be getARCCleanupKind() if it's non-trivial to destroy solely because of strong members. Since the setup here is significantly more general than ARC, I think you need to default to the more-correct behavior; if you want to add a special-case check for only strong members, you ought to do that explicitly.

lib/CodeGen/CGExprAgg.cpp
315

Do these functions have a memcpy as a precondition? I would expect them to do the full copy (for code size if nothing else).

lib/CodeGen/CMakeLists.txt
73

I think CGNonTrivialStruct.cpp would be a better name.

I accidentally hit 'send' too early on my review, so here's part two. I still haven't fully reviewed the new IRGen file.

lib/CodeGen/CodeGenCStruct.cpp
28

There's no need to put this in an anonymous namespace.

This enum makes me wonder if it would make more sense to create equivalents to QualType::DestructionKind for each of these operations and all of the different primitive types. That would let you e.g. implement your only-strong-members check above much more easily, and it would also make it simpler to guarantee that all the places that need to exhaustively enumerate the reasons why something might need special initialization/copying logic don't miss an important new case.

lib/CodeGen/CodeGenFunction.cpp
1144

You're not actually checking the "is not passed indirectly" part of this, but I think that's okay, because I don't think we actually want the ownership aspects of the ABI to vary based on how the argument is passed. So you should just strike that part from the comment.

Also, this should really be done in EmitParmDecl.

lib/CodeGen/CodeGenFunction.h
3347

These methods should *definitely* be somehow namespaced for your new feature.

ahatanak marked 9 inline comments as done.Dec 18 2017, 6:26 PM

Address review comments.

lib/CodeGen/CGDecl.cpp
1421

I added a DestructionKind (DK_c_struct_strong_field) that is used just for structs that have strong fields (but doesn't have weak fields).

lib/CodeGen/CGExprAgg.cpp
315

Yes, there should always be a call to memcpy before the copy/move special functions are called. I don't think we want to fold the call to memcpy into the body of the special functions since another special function can be called from the body if the non-trivial C struct has a field whose type is a non-trivial C struct too (in which case, there will be a redundant memcpy to copy the C struct field).

For example, in the following code, there shouldn't be a call to memcpy to copy field "f0" of StrongOuter if there is already a memcpy that copies struct StrongOuter:

typedef struct {
  int f0;
  id f1;
} Strong;

typedef struct {
  Strong f0;
  id f1;
} StrongOuter;
lib/CodeGen/CodeGenCStruct.cpp
28

I'm not sure if I understood your point here. Do you mean there should be DestructionKinds for arrays or structs too or merge these enums into DestructionKind?

lib/CodeGen/CodeGenFunction.cpp
1144

To make ARC and ARC++ compatible, I think we'll have to change the ABI of C++ functions that are passed structs containing weak fields if we want to destruct non-trivial C arguments destructed in the callee. I guess that's OK since we have to break the ABI for structs containing strong fields too.

lib/CodeGen/CodeGenFunction.h
3347

I renamed these methods to make it clear that they are calling special functions for C structs. Is that what you mean by "namespaced"?

ahatanak updated this revision to Diff 127456.Dec 18 2017, 6:28 PM
ahatanak marked 5 inline comments as done.

Update patch.

rjmccall added inline comments.Dec 18 2017, 10:21 PM
lib/CodeGen/CGDecl.cpp
1421

Sure, that's a simple enough way to do it, and I think it's fairly warranted.

lib/CodeGen/CGExprAgg.cpp
315

Well, I guess I was imagining something more C++-ish where you don't necessarily have a struct-wide memcpy, and instead you just memcpy the parts where that's profitable and otherwise do something type-specific, which would mean recursing for a struct. Your approach is reasonable if the non-trivial copying is relatively sparse and the structure is large; on the other hand, if the non-trivial copying is dense, the memcpy itself might be mostly redundant. And it does mean a bigger code-size hit in the original place that kicks off the copy.

IIRC we do already have some code in copy-constructor emission that's capable of emitting sequences of field copies with a memcpy, if you wanted to try the C++ style, you could probably take advantage of that pretty easily. But I won't hold up the patch if you think the memcpy precondition is the right way to go.

lib/CodeGen/CodeGenCStruct.cpp
28

I was suggesting there should be equivalents to DestructionKind for some of the other operations, like a NonTrivialCopyKind, and queries on QualType analogous to isDestructedType(), e.g. isNonTrivialToCopy(). That way you can directly put this stuff in the AST, and then here you can just walk the struct fields and switch on the enum, at which point I think you don't need FieldKind at all. Remember that some types are non-trivial in only a subset of these dimensions.

lib/CodeGen/CodeGenFunction.cpp
1144

I hadn't been thinking of weak, just the difference between small and large structs, but this is a good point. I think I agree with you, we should also change the destruction ABI for structs containing weak fields; basically, the rule will be that only C++ destruction (under the Itanium ABI) requires destruction in the caller.

lib/CodeGen/CodeGenFunction.h
3347

Yes, thank you.

ahatanak updated this revision to Diff 127591.Dec 19 2017, 12:47 PM

Address review comments. Changed the mangling of special functions to make it a bit easier to read.

ahatanak marked 2 inline comments as done.Dec 19 2017, 12:53 PM
ahatanak added inline comments.
lib/CodeGen/CGExprAgg.cpp
315

I made changes so that either a load/store pair or a call to memcpy is emitted in the copy/move special functions to copy fields that are not non-trivial.

lib/CodeGen/CodeGenCStruct.cpp
28

I added enum NonTrivialCopyKind and function isNonTrivialToCopy to QualType and removed FieldKind from CGNonTrivialStruct.cpp. Since there are special functions other than copy constructor/assignment operator, should the enum's name be NonTrivialStructKind or something instead of NonTrivialCopyKind?

ahatanak marked 2 inline comments as done.Dec 19 2017, 1:08 PM
ahatanak added inline comments.
lib/CodeGen/CGExprAgg.cpp
315

I haven't implemented the optimization you suggested which merges a sequence of field copies into a single memcpy. I'll look into it later.

ahatanak updated this revision to Diff 127818.Dec 20 2017, 7:09 PM
  • Improved IRGen for copying trivial fields in a non-trivial C struct. IRGen in CGNonTrivialStruct.cpp now calls a single memcpy if there are consecutive trivial fields in a struct (it does something similar to what FieldMemcpyizer does)
  • IRGen for structs containing bitfields was not correct, so I fixed that too.
rjmccall added inline comments.Dec 20 2017, 7:29 PM
include/clang/AST/Type.h
1144

We don't actually distinguish arrays in DestructionKind. Is it important here? You can't actually do anything with that information.

Regarding your naming question, I wonder if it would be useful to just invent a new term here, something meaning "non-trivial" but without the C++ baggage. Maybe just "PrimitiveCopyKind", with the understanding that C++ can never do a "primitive" copy? And for consistency with DestructionKind, maybe you should lowercase the second words.

I'm not sure how isNonTrivialToCopy is supposed to be used. Where does the function come from? And why is a context required when it isn't required for isDestructedType?

ahatanak added inline comments.Dec 20 2017, 8:52 PM
include/clang/AST/Type.h
1144

Enum NonTrivialCopyKind and isNonTrivialToCopy() are used just to distinguish the different types of fields in a C struct. Currently, there are four types that are handled by visitField functions in CGNonTrivialStruct.cpp:

  • struct field
  • array field
  • __strong field
  • trivial field that can be copied using memcpy

I thought using enums would make the code easier to read, but of course it's possible to remove them and instead just check the field types calling functions like getAsArrayType or getAs<RecordType>. ASTContext is passed to isNonTrivialToCopy so that it can call ASTContext::getAsArrayType to determine whether the type is an array. Maybe there is another way to detect an array that doesn't require ASTContext?

As for the naming, PrimitiveCopyKind seems find if we want to distinguish it from C++ non-trivial classes (I don't have a better name). If we are going to call non-trivial C structs primitiveCopyKind, what should we call the C structs that are trivial (those that can be copied using memcpy)?

rjmccall added inline comments.Dec 20 2017, 9:25 PM
include/clang/AST/Type.h
1144

I think "trivial" is a reasonable kind of primitive-copy semantics. Basically, we're saying that all complete object types other than non-trivial C++ types have some sort of primitive-copy semantics, and this enum tells us what those semantics are.

DestructionKind has to deal with arrays as well, but it does so by just reporting the appropriate value for the underlying element type without mentioning that it's specifically an *array* of the type. I think that's a reasonable design, since most places do not need to distinguish arrays from non-arrays. IRGen does, but only when you're actually finally emitting code; prior to that (when e.g. deciding that a field is non-trivial to copy) you can just use the enum value.

You can do without getAsArrayType the same way that isDestructedType() does: the more complex operation is only necessary in order to preserve qualifiers on the element type, but that's unnecessary for this operation because the array type itself is considered to have the same qualifiers as its element type. That is, you can ask the original type whether it's strong/weak-qualified without having to specifically handle array types, and once you've done all of those queries, you can use the "unsafe" operations to drill down to the element type because you no longer care about qualification.

ahatanak added inline comments.Dec 21 2017, 1:12 PM
include/clang/AST/Type.h
1144

So QualType::hasNonTrivialToDefaultInitializeStruct should be renamed to QualType::hasPrimitiveDefaultInitializeStruct and RecordDecl::isNonTrivialToPrimitiveDefaultInitialize to RecordDecl::isPrimitiveDefaultInitialize, for example?

Also, I realized that, after I made the changes that removed the NonTrivial flags from RecordDecl, CGNonTrivialStruct.cpp is the only user of enum NonTrivialCopyKind and the following methods of QualType (Sema::isValidVarArgType calls nonTrivialToDestroy, but it should call methods like hasNonTrivialToDestroyStruct that detect non-trivial C structs instead):

nonTrivialToDefaultInitialize
nonTrivialToCopy
nonTrivialToDestructiveMove
nonTrivialToDestroy

Maybe we should make these enums and functions local to CGNonTrivialStruct.cpp to avoid adding something that is not used outside CGNonTrivialStruct.cpp?

rjmccall added inline comments.Dec 21 2017, 11:22 PM
include/clang/AST/Type.h
1144

I think the right name would be isNonTrivialToPrimitiveDefaultInitialize(), and like isDestructedType() it would return an enum that happens to have a zero value (and is thus false) for the trivial case.

I think we are likely to add more places that use these methods and enums. For example, if we want to add a diagnostic that warns about memcpy'ing non-trivial struct types, that diagnostic should probably include a note about which field is non-trivial, and that will still basically want to do this same investigation.

ahatanak updated this revision to Diff 128061.Dec 22 2017, 3:50 PM

Rename functions. Check whether a non-trivial type passed to Sema::isValidVarArgType is a struct.

ahatanak marked an inline comment as done.Dec 22 2017, 3:52 PM
ahatanak added inline comments.
include/clang/AST/Type.h
1144

I renamed the functions to names that start with isNonTrivialToPrimitive and removed the enumerator for "Array" from the enum too.

ahatanak updated this revision to Diff 129359.Jan 10 2018, 4:06 PM
ahatanak marked an inline comment as done.

In EmitCallArg and EmitParmDecl, use the existing code for Microsoft C++ ABI to determine whether the argument should be destructed in the callee. Also, add a test case that checks the destructor for a non-trivial C struct is called on the EH path.

rjmccall added inline comments.Jan 12 2018, 9:41 AM
include/clang/AST/Decl.h
3510

Is it interesting to track all the individual reasons that a struct is non-trivial at the struct level, as opposed to (like we do with CXXDefinitionData) just tracking the four aggregate properties you've described below?

include/clang/AST/Type.h
1100

I think this one should probably get its own enum. It's not too hard to imagine types (maybe relative references, or something else that's address-sensitive?) that require non-trivial copy semantics but isn't non-trivial to default-initialize.

1110

You need to define what you mean by "destructive move":

  • A C++-style move leaves the source in a still-initialized state, just potentially with a different value (such as nil for a strong reference). This is what we would do when e.g. loading from an r-value reference in C++. It's also what we have to do when moving a block variable to the heap, since there may still be pointers to the stack copy of the variable, and since the compiler will unconditionally attempt to destroy that stack copy when it goes out of scope. Since the source still needs to be destroyed, a non-trivially-copyable type will probably always have to do non-trivial work to do this. Because of that, a function for this query could reasonably just return PrimitiveCopyKind.
  • A truly primitive destructive move is something that actually leaves the source uninitialized. This is generally only possible in narrow circumstances where the compiler can guarantee that the previous value is never again going to be referenced, including to destroy it. I don't know if you have a reason to track this at all (i.e. whether there are copies you want to perform with a destructive move in IRGen), but if you do, you should use a different return type, because many types that are non-trivial to "C++-style move" are trivial to "destructively move": for example, __strong references are trivial to destructively move.
1115

Is there a reason we need this in addition to isDestructedType()? It seems to me that it's exactly the same except ruling out the possibility of a C++ destructor.

1127

I don't think you want to refer to the fact that the C struct specifically contains a __strong field here. As we add more reasons, would we create a new enumerator for each? What if a struct is non-trivial for multiple reasons? Just say that it's a non-trivial C struct.

1139

Are these four queries really important enough to provide API for them on QualType?

lib/AST/ASTContext.cpp
5782

I think you should use the non-struct-specific checks here. In addition to being cleaner, it would also let you completely short-circuit the rest of this function in ARC mode. In non-ARC mode, you do still have to check lifetime (only to see if it's unsafe_unretained; the strong and __weak cases would be unreachable), and you need the type-specific special cases.

ahatanak added inline comments.Jan 12 2018, 10:34 AM
include/clang/AST/Type.h
1127

I added an enumerator for DK_c_struct_strong_field since CodeGenFunction::needsEHCleanup distinguishes between __weak and __strong types. Is it not necessary to distinguish between a struct that has a __weak field and a struct that has a __strong field but not a __weak field?

rjmccall added inline comments.Jan 12 2018, 12:17 PM
include/clang/AST/Type.h
1127

Oh, that's right.

I... hmm. The problem is that that's really a special-case behavior, and we're designing a more general feature. If we try to handle the special case in the first patch, we'll end up trivializing the general case, and that will permeate the design in unfortunate ways.

So I recommend that we *not* treat them separately right now; just emit cleanups for them unconditionally. You're still planning to add support for __weak properties in a follow-up patch, right? Once that's in, it will make more sense to carve out the __strong-only behavior as a special case.

ahatanak updated this revision to Diff 133994.Feb 12 2018, 10:44 PM
ahatanak marked 8 inline comments as done.

Address review comments and feedback I got from John offline.

The main changes are in CGNonTrivialStruct.cpp. I cleaned up the class hierarchy and used variadic template functions in CGNonTrivialStruct.cpp to enable sharing code between the class creating the function name and the class emitting IR for the special functions. I also made changes so that memset is used instead of a loop to default-initialize a large array (see test_constructor_destructor_IDArray in test/CodeGenObjC/strong-in-c-struct.m).

rjmccall added inline comments.Feb 13 2018, 2:53 PM
include/clang/AST/Type.h
1099

This is unused.

lib/CodeGen/CGBlocks.cpp
1553

Please restructure this code to use the result of isNonTrivialToPrimitiveCopy(), preferably in a way that involves an exhaustive switch. That should be as easy as breaking out in the PCK_Strong case and in the PCK_None case when !isObjCRetainableType().

1754

Same as above with using the result of isDestructedType().

2040

Just isDestructedType() should be fine.

lib/CodeGen/CGDecl.cpp
1244

Again, it would be nice to restructure this code to use the result of isNonTrivialToPrimitiveDefaultInitialize() exhaustively.

lib/CodeGen/CGNonTrivialStruct.cpp
38 ↗(On Diff #133994)

These visitors would be re-usable components outside of IRGen if they just took a QualType and then forwarded an arbitrary set of other arguments. I would suggest structuring them like this:

template <class Derived, class RetTy> struct DestructedTypeVisitor {
  Derived &asDerived() { return static_cast<Derived&>(*this); }

  template <class... Ts>
  RetTy visit(QualType Ty, Ts &&...Args) {
    return asDerived().visit(Ty.isDestructedType(), Ty, std::forward<Ts>(Args)...);
  }

  template <class... Ts>
  RetTy visit(QualType::DestructionKind DK, QualType Ty, Ts &&...Args) {
    switch (DK) {
    case QualType::DK_none: return asDerived().visitTrivial(Ty, std::forward<Ts>(Args)...);
    case QualType::DK_cxx_destructor: return asDerived().visitCXXDestructor(Ty, std::forward<Ts>(Args)...);
    ...
    }
  }
};

You can then pretty easily add the special behavior for IsVolatile and arrays (when DK is not DK_none) in an IRGen-specific subclass. visitArray would take a DK and then recurse by calling back into visit with that same DK.

55 ↗(On Diff #133994)

Better not to have a default case here. You can add an assert for the C++ case that it should never get here.

92 ↗(On Diff #133994)

This entire template is dead.

110 ↗(On Diff #133994)

Please pass around offsets as CharUnits until you actually need to interact with LLVM.

131 ↗(On Diff #133994)

I think you should just call this CopyStructVisitor.

145 ↗(On Diff #133994)

This would also be a good place to check for arrays so you don't have to do it in every non-trivial case below.

There's a couple other places in this file where I would suggest the same thing.

289 ↗(On Diff #133994)

Offset should definitely be a CharUnits here.

432 ↗(On Diff #133994)

You should at least make sure this has the right function type; it's possible, even if not really allowed, to declare a C function that collides with one of these names.

lib/CodeGen/CodeGenFunction.h
3348

I think we're moving towards generally taking LValues instead of Addresses unless you're talking about a very low-level routine.

ahatanak updated this revision to Diff 134430.Feb 15 2018, 7:36 AM
ahatanak marked 14 inline comments as done.

Address review comments.

include/clang/AST/Type.h
1139

I removed these functions.

lib/CodeGen/CGDecl.cpp
1244

It looks like this code needs restructuring, but I didn't understand how I can use isNonTrivialToPrimitiveDefaultInitialize() here. The only change I made here is that drillIntoBlockVariable is called if the variable is byref. This fixes a bug where a non-trivial struct variable declared as __block wasn't initialized correctly.

Also, in the process, I found that when a c++ struct declared as a __block is captured by its initializer, an assertion fails (see test case in test/CodeGenObjCXX/arc-blocks.mm). I fixed the bug in CodeGenFunction::EmitExprAsInit, but I'm not sure that's the right way to fix it. I'm not sure what the comment "how can we delay here if D is captured by its initializer" means.

lib/CodeGen/CGNonTrivialStruct.cpp
38 ↗(On Diff #133994)

I created three visitor classes that can be used outside of IRGen,

55 ↗(On Diff #133994)

I added llvm_unreachable in StructVisitor's methods visitWeak and visitCXXDestructor.

rjmccall added inline comments.Feb 15 2018, 12:09 PM
include/clang/AST/Decl.h
3593

Please document that this means a C++-style destructive move that leaves the source object in a validly-initialized state.

include/clang/AST/Type.h
1091

Please call this ARCStrong, it'll be much clearer (and easier to visually distinguish from "struct').

1093

Nit-pick: please declare this with the methods that use it.

lib/AST/ASTContext.cpp
5802

Nit-pick: period.

lib/CodeGen/CGBlocks.cpp
1758

Should we just have a helper function like getBlockFieldFlagsForObjCObjectPointer(QualType);? You could just call it from all the object-specific places instead of having this Flags variable that's not valid in half the cases.

lib/CodeGen/CGDecl.cpp
1363

The "delay" is that we generally handle __block captures within initializers by delaying the drill-into until we're ready to actually store into the variable. Not delaying has the advantage of allowing us to just project out the stack storage instead of actually chasing the forwarding pointer, but this isn't safe if the forwarding pointer might no longer point to the stack storage, which is what happens when a block capturing a __block variable is copied to the heap.

Delaying the drill-into is easy when the stores can be completely separated from the initializer, as is true for scalar and complex types. It's not easy for aggregates because we need to emit the initializer in place, but it's possible that a block capturing the __block variable will be copied during the emission of that initializer (e.g. during a later element of a braced-init-list or within a C++ constructor call), meaning that we might be copying an object that's only partly initialized. This is particularly dangerous if there are destructors involved.

Probably the best solution is to forbid capturing __block variables of aggregate type within their own initializer. It's possible that that might break existing code that's not doing the specific dangerous things, though.

The more complex solution is to make IRGen force the __block variable to the heap before actually initializing it. (It can do this by directly calling _Block_object_assign.) We would then initialize the heap copy in-place, safely knowing that there is no possibility that it will move again. Effectively, the stack copy of the variable would never exist as an object; only its header would matter, and only enough to allow the runtime to "copy" it to the heap. We'd have to hack the __block variable's copy helper to not try to call the move-constructor, and we'd need to suppress the cleanup that destroys the stack copy of the variable. We already push a cleanup to release the __block variable when it goes out of scope, and that would stay.

1421

Okay, since you don't have a DK_c_struct_strong_field anymore, you either need to stop using getARCCleanupKind() or do the test for just __strong fields.

lib/CodeGen/CGExprAgg.cpp
82

Please use a boolean enum for this instead of a raw bool.

255

Hrm. Okay, I guess. This is specific to ignored results because otherwise we assume that the caller is going to manage the lifetime of the object?

lib/CodeGen/CGNonTrivialStruct.cpp
63 ↗(On Diff #134430)

DefaultInitialize*d*TypeVisitor would be better, I think.

77 ↗(On Diff #134430)

Use returns, please. And you can add an llvm_unreachable after the switch (in all of the visitors).

92 ↗(On Diff #134430)

CopiedTypeVisitor, I guess.

138 ↗(On Diff #134430)

These computations should be in uint64_t.

143 ↗(On Diff #134430)

ASTContext has a toCharUnitsFromBits method.

184 ↗(On Diff #134430)

You can add using StructVisitor<Derived>::asDerived; to the class to avoid having to write this a bunch.

193 ↗(On Diff #134430)

I feel like maybe volatile fields should be individually copied instead of being aggregated into a multi-field memcpy. This is a more natural interpretation of the C volatile rules than we currently do. In fact, arguably we should really add a PrimitiveCopyKind enumerator for volatile fields (that are otherwise trivially-copyable) and force all copies of structs with volatile fields into this path precisely so that we can make a point of copying the volatile fields this way. (Obviously that part is not something that's your responsibility to do.)

To get that right with bit-fields, you'll need to propagate the actual FieldDecl down. On the plus side, that should let you use EmitLValueForField to do the field projection in the common case.

341 ↗(On Diff #134430)

"N-ary" is the usual spelling.

ahatanak updated this revision to Diff 135309.Feb 21 2018, 12:04 PM
ahatanak marked 14 inline comments as done.

Address review comments.

lib/CodeGen/CGDecl.cpp
1363

Ah I see. I'll see if I can implement the first solution you suggested in a separate patch. If it turns out that doing so breaks too much code, we can consider implementing the second solution.

Is this something users do commonly?

lib/CodeGen/CGExprAgg.cpp
255

Yes, this is specific to ignored results. This is needed because otherwise function return values that are ignored won't get destructed by the caller.

lib/CodeGen/CGNonTrivialStruct.cpp
193 ↗(On Diff #134430)

I added method visitVolatileTrivial that copies volatile fields individually. Please see test case test_copy_constructor_Bitfield1 in test/CodeGenObjC/strong-in-c-struct.m.

rjmccall added inline comments.Feb 21 2018, 1:30 PM
include/clang/AST/Type.h
1105

These should all be /// documentation comments, and they mostly shouldn't talk about fields since this is a query on QualType, not FieldDecl. I would suggest something like:

for Trivial - The type does not fall into any of the following categories. Note that this case is zero-valued so that values of this enum can be used as a boolean condition for non-triviality.

for VolatileTrivial - The type would be trivial except that it is volatile-qualified. Types that fall into one of the other non-trivial cases may additionally be volatile-qualified.

for ARCStrong - The type is an Objective-C retainable pointer type that is qualified with the ARC __strong qualifier.

for Struct - The type is a struct containing a field whose type is not PCK_Trivial. Note that a C++ struct type does not necessarily match this; C++ copying semantics are too complex to express here, in part because they depend on the exact constructor or assignment operator that is chosen by overload resolution to do the copy.

1118

"truly"

Hmm. Now that I'm thinking more about it, I'm not sure there's any point in tracking non-triviality of a C++-style destructive move separately from the non-triviality of a copy. It's hard to imagine that there would ever be a non-C++ type that primitively has non-trivial copies but trivial C++-style moves or vice-versa. Type-based destructors imply that the type represents some kind of resource, and a C++-style move will always be non-trivial for resource types because ownership of the resource needs to be given up by the old location. Otherwise, a type might be non-trivial to copy but not destroy because there's something special about how it's stored (like volatility), but it's hard to imagine what could possibly cause it to be non-trivial to destroy but not copy.

If we were tracking non-triviality of an *unsafe* destructive move, one that leaves the source in an uninitialized state, that's quite different.

I think there are three reasonable options here:

  • Ignore the argument I just made about the types that we're *likely* to care about modeling and generalize your tracking to also distinguish construction from assignment. In such an environment, I think you can absolutely make an argument that it's still interesting to track C++-style moves separately from copies.
  • Drop the tracking of destructive moves completely. If you want to keep the method around, find, but it can just call isNonTrivialToPrimitiveCopy().
  • Change the tracking of *destructive* moves to instead track *deinitializing* moves. The implementation would stop considering __strong types to be non-trivial to move.

But as things stand today, I do not see any point in separately tracking triviality of C++-style destructive moves.

lib/AST/Type.cpp
2230

You can re-use this call to getQualifiers() in the check for volatile below.

3963

Please combine this check with the getAsCXXRecordDecl() check below. That is, (1) check whether it's a record type once, and if so, (2) it's a CXXRecordDecl, ask about its destructor, and (3) if it's not a CXXRecordDecl, ask whether it's non-trivial to primitive destroy.

lib/CodeGen/CGBlocks.cpp
1750

This entire switch can be within an #ifndef NDEBUG if you really feel it's important to assert.

1780

I think you should leave the byref case of that function here, but you can make it local to this if-clause.

lib/CodeGen/CGDecl.cpp
1363

There's an idiom of copying __block variables of *block pointer* type in their own initializers, as a way of constructing a recursive block. I don't know of any reason why someone would intentionally do it for some random struct type, but that's always a risk when locking down on something retroactively. Anyway, I agree that we can experiment to see if locking down on aggregate __block captures causes any actual problems.

1421

Ping about this again.

lib/CodeGen/CGExprAgg.cpp
255

Okay. I'm just a little worried about what this implies about when we push destructor cleanups in general. I know it's rather ad-hoc for C++ destructors right now.

lib/CodeGen/CGNonTrivialStruct.cpp
193 ↗(On Diff #134430)

Okay, great! I like the name.

Does this mean we're now copying all structs that contain volatile fields with one of these helper functions? If so, please add a C test case testing just that. Also, you should retitle this review and stress that we're changing how *all* non-trivial types are copied, and that that includes both volatile and ARC-qualified fields.

110 ↗(On Diff #135309)

Please call these methods visitARCStrong to go with the new PCK name.

ahatanak added inline comments.Feb 22 2018, 10:47 AM
include/clang/AST/Type.h
1118

The second option seems most reasonable to me. We can always make changes if someone comes up with a type that requires tracking destructive moves separately.

lib/CodeGen/CGNonTrivialStruct.cpp
193 ↗(On Diff #134430)

No, the current patch doesn't copy volatile fields of a struct individually unless the struct is a non-trivial type (which means its primitive copy kind is PCK_Struct). I'll look into today how I can force structs with volatile fields that are not non-trivial to be copied using the helper functions.

It seems like we would need a boolean flag in RecordDecl that tracks the presence of volatile fields in the struct or one of its subobjects. I assume we want to copy volatile fields individually in C++ too, in which case the flag needs to be set in both C and C++ mode. Is that right?

rjmccall added inline comments.Feb 22 2018, 11:55 AM
include/clang/AST/Type.h
1118

Well, we already have a type that would have a trivial deinitializing move: ARC __strong pointers. What we don't have is anything in IRGen that currently would take advantage of that fact. Anyway, I agree that we can wait to start tracking this until we add such code to the compiler.

lib/CodeGen/CGNonTrivialStruct.cpp
193 ↗(On Diff #134430)

Oh, I see, there's specific logic to not flag the struct trivial if a field is merely VolatileTrivial. Okay, that works for me; we can think about the right thing to do in a follow-up commit. It probably needs to be raised as an RFC on cfe-dev.

The C++ standard has explicit language about triviality that says that volatile fields don't make defaulted copy-constructors non-trivial. Whatever we do here can't break that. But that doesn't necessarily imply that we should continue to use a simple memcpy to copy the struct; we could still use the C logic to copy "trivial" C++ structs with volatile members.

If we do try to apply this rule to C++, you will need to propagate these bits appropriately for C++ types, including from bases.

lib/Sema/SemaDecl.cpp
15015

Please compute isNonTrivialToPrimitiveCopy once here, it's not trivial.

...arguably we should figure out a way to do all these type checks once and then set the bits from that. We can do that as a separate patch, though.

ahatanak updated this revision to Diff 135703.Feb 23 2018, 2:37 PM
ahatanak marked 9 inline comments as done.

Address review comments.

ahatanak added inline comments.Feb 23 2018, 2:39 PM
include/clang/AST/Type.h
1105

Thanks, I copied your comments verbatim except the comment on PCK_Struct: types that are PCK_Struct have a field that is neither PCK_Trivial nor PCK_VolatileTrivial. We can use the original comment once we start distinguishing PCK_Trivial structs that have volatile fields from those that don't.

lib/CodeGen/CGBlocks.cpp
1750

It's probably not important to assert here, so I removed the switch.

1780

I also simplified the QualType::DK_none case below.

lib/CodeGen/CGDecl.cpp
1421

Sorry I missed this one.

rjmccall accepted this revision.Feb 24 2018, 9:36 AM

Thanks! This looks ready; thank you for your patience in working this out.

include/clang/AST/Type.h
1105

Yes, of course, that makes sense.

This revision is now accepted and ready to land.Feb 24 2018, 9:36 AM
This revision was automatically updated to reflect the committed changes.
sberg added a subscriber: sberg.Apr 17 2018, 12:09 AM

It looks like this caused a clang-cl regression https://bugs.llvm.org/show_bug.cgi?id=37146 "clang-cl emits special functions for non-trivial C-structs ('__destructor_8') introduced for Objective-C".