This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Merge equivalent block copy/helper functions
ClosedPublic

Authored by ahatanak on Aug 1 2018, 12:10 PM.

Details

Summary

Currently, clang generates different copy or dispose helper functions for each block literal on the stack if the block has the possibility of being copied to the heap. This patch makes changes to merge equivalent copy and dispose helper functions and reduce code size.

To enable merging equivalent copy/dispose functions, the types and offsets of the captured objects are encoded into the helper function name. This allows IRGen to check whether an equivalent helper function has already been emitted and reuse the function instead of generating a new helper function whenever a block is defined. In addition, the helper functions are marked as linkonce_odr to enable merging helper functions that have the same name across translation units and marked as unnamed_addr to enable the linker's deduplication pass to merge functions that have different names but the same content.

rdar://problem/22950898

Diff Detail

Repository
rC Clang

Event Timeline

ahatanak created this revision.Aug 1 2018, 12:10 PM
ahatanak added inline comments.Aug 1 2018, 12:18 PM
test/CodeGenCXX/block-byref-cxx-objc.cpp
36

Should the second call to @_Block_object_dispose be an invoke if the destructor can throw? The FIXME in CodeGenFunction::BuildBlockRelease seems to imply that we should consider whether the destructor can throw.

mgrang added inline comments.Aug 1 2018, 12:19 PM
lib/CodeGen/CGBlocks.cpp
1599
ahatanak updated this revision to Diff 158640.Aug 1 2018, 2:29 PM
ahatanak marked an inline comment as done.

Use llvm::sort.

test/CodeGenCXX/block-byref-cxx-objc.cpp
36

I mean the first call, not the second call. If the call to A's destructor throws when the first object is being destructed, the second object should be destructed on the EH path.

rjmccall added inline comments.Aug 1 2018, 8:24 PM
lib/CodeGen/CGBlocks.cpp
1646

I forget whether this case has already filtered out trivial copy-constructors, but if not, you should consider doing that here.

1655

This doesn't always have to be a copy constructor. I mean, maybe this procedure works anyway, but the constructor used to copy an object isn't always a copy constructor. Blame constructor templates.

1659

The name of a C++ class is not unique; you need to use the mangler. Ideally, you would find a way to mangle all the C++ types in the context of the same mangler instance so that substitutions could be reused across it.

You also need to check for a type with non-external linkage and make this helper private if you find one. (You can still unique within the translation unit, for what it's worth.)

ahatanak updated this revision to Diff 158774.Aug 2 2018, 9:11 AM
ahatanak marked 3 inline comments as done.

Address review comments.

lib/CodeGen/CGBlocks.cpp
1646

E.Kind is set to CXXRecord only when the copy constructor is non-trivial (or the destructor is non-trivial if this is a destroy helper). Both computeCopyInfoForBlockCapture and computeDestroyInfoForBlockCapture filter out types with trivial copy constructors or destructors.

1655

I've made changes to use the name of the constructor function that is used to copy the capture instead of using the volatility of the copy constructor's argument.

I'm not sure when a function that is not a copy constructor would be used to copy a captured object though. The comment in function captureInBlock in SemaExpr.cpp says that the the blocks spec requires a const copy constructor.

1659

Good catch. test/CodeGenCXX/blocks.cpp checks that helper functions for blocks that capture non-external types have internal linkage.

rjmccall added inline comments.Aug 2 2018, 4:13 PM
lib/CodeGen/CGBlocks.cpp
497

You only need to do this if you're going to mangle the type name, i.e. if it's actually going to end up as a CXXRecord capture. It should be easy enough to just set this flag above in the clauses where you recognize interesting C++ record captures.

1646

Great.

1655

Okay. I think mangling the type of the capture is probably good enough (and likely to be a little smaller), because different compilations using the same capture type will always use the same constructor. That will include volatile if the captured variable was marked volatile.

The notable case where copy-construction doesn't use a copy constructor is when you have a constructor template like template <class T> ClassName(T &);, which will be preferred over the default copy-constructor for copying non-const objects. It's silly, but it's how it works.

You can mangle the type by just building an ItaniumMangleContext. I think it's better to stably use the Itanium mangling instead of using the target's native mangling.

ahatanak updated this revision to Diff 158892.Aug 2 2018, 7:40 PM
ahatanak marked 2 inline comments as done.

Address review comments.

lib/CodeGen/CGBlocks.cpp
497

I modified the test case in test/CodeGenObjCXX/arc-blocks.mm to check that capturing a trivial C++ class doesn't cause the helper functions to be internal.

rjmccall added inline comments.Aug 2 2018, 9:53 PM
lib/CodeGen/CGBlocks.cpp
1655

Itanium type manglings are self-delimiting, so you don't actually need this run-length encoding, but it doesn't really hurt, either. Certainly it makes it a lot easier for tools to parse one of these function names.

1693

I feel like this reads a little better if you write the quantity first. Also I think you can drop the underscore because all of your suffices are easy to compute the length of, so a block that captures 3 strong variables can just be 32s40s48s or something like that.

Oh, I guess if you do that you need to do something about how you handle the bit-mask for BlockObject, because that could run into the next number. Would it be better to just break this out into cases? The three current cases are (1) __block variables, which can throw, (2) blocks, and (3) object references in non-ARC modes (which for copy/dispose purposes are just s again, except we use a different entrypoint because we hadn't exposed objc_retain and objc_release yet).

test/CodeGenCXX/block-byref-cxx-objc.cpp
36

Yes, I think we should assume that the block runtime correctly propagates exceptions. The function should not be marked nothrow, but call sites can be marked nothrow depending on what they do.

This won't generally be a code-size problem because (1) we do assume that the ObjC retain/release and weak operations can't throw and (2) C++11 makes destructors noexcept by default.

ahatanak updated this revision to Diff 159250.Aug 6 2018, 12:19 AM
ahatanak marked 2 inline comments as done.

Address review comments.

lib/CodeGen/CGBlocks.cpp
1693

I broke this out into several cases based on the flag's value and whether the constructor or destructor of a C++ type can throw.

test/CodeGenCXX/block-byref-cxx-objc.cpp
36

I defined two functions, cxxDestructorCanThrow and cxxConstructorCanThrow, that are called to find out whether the __block object has a destructor or constructor that can throw.

These functions can return true even when the destructor or constructor cannot throw, but that should be conservatively correct ('invoke' is emitted in some cases even when the function doesn't throw).

rjmccall added inline comments.Aug 6 2018, 1:23 PM
lib/CodeGen/CGBlocks.cpp
1631

I'm pretty sure this step can't fail.

1644

Can you just ask Sema to check canThrow for the expression and pass it down?

ahatanak updated this revision to Diff 159388.Aug 6 2018, 1:55 PM
ahatanak marked 2 inline comments as done.

Address review comments.

lib/CodeGen/CGBlocks.cpp
1644

Since this changes the existing behavior, I made changes to test/CodeGenCXX/block-byref-cxx-objc.cpp to test it. Previously, IRGen would emit an invoke to call _Block_object_assign when the constructor was marked as noexcept.

ahatanak added inline comments.Aug 6 2018, 2:06 PM
lib/CodeGen/CGBlocks.cpp
1644

Perhaps I misunderstood your comment, should I have Sema set a flag or something in Expr when it calls a function that can throw?

rjmccall added inline comments.Aug 6 2018, 2:31 PM
lib/CodeGen/CGBlocks.cpp
1644

Sema has a canThrow predicate that it uses when checking things like the noexcept expression. I was thinking that you could pass that down with the copy expression in the AST for the block capture.

Constructors can have default-argument expressions that can throw even if the constructor itself can't, so it's important to do it that way.

You might want to ask Richard on IRC if there are caveats when using that for these purposes.

ahatanak updated this revision to Diff 159546.Aug 7 2018, 10:52 AM
ahatanak marked an inline comment as done.
ahatanak added a reviewer: rsmith.

Address review comments.

ahatanak added inline comments.Aug 7 2018, 10:53 AM
lib/CodeGen/CGBlocks.cpp
1644

I moved the code in lib/Sema/SemaExceptionSpec.cpp that is needed to compute the exception specification to a template class in include/AST so that both Sema and IRGen can use it. Also, I added a call to ResolveExceptionSpec in Sema::CheckCompleteVariableDeclaration to resolve the destructor's exception specification and removed the isUnresolvedExceptionSpec check in CodeGenFunction::cxxDestructorCanThrow. Richard pointed out that the constructor's and destructor's exception specifications should have been resolved by the time IRGen is run.

ahatanak updated this revision to Diff 159553.Aug 7 2018, 11:02 AM

Remove a stale comment and add an assertion to check the destructor's exception specification has been resolved.

That is a change that Richard should definitely sign off on. Also, I'm not sure this works — is it really okay to skip the work done by ResolveExceptionSpec in IRGen? What does that mean, that we're just somewhat more conservative than we would otherwise be? And why is this a better solution than just storing whether the copy-expression throws in BlockDecl::Capture?

I thought it was okay to skip the work done by ResolveExceptionSpec in IRGen as long as the exception specifications that are needed have already been resolved in Sema. But calling Sema::canThrow in Sema::CheckCompleteVariableDeclaration and storing the result in BlockDecl::Capture is clearly the better solution since it doesn't introduce the complexity introduced in the updated patch.

Since BlockVarCopyInits is a map with key VarDecl *, I think we want to add a flag to VarDecl (NonParmVarDeclBits) that indicates whether the copy expression can throw or not. Or is there a reason to store the bit in BlockDecl::Capture instead?

Since BlockVarCopyInits is a map with key VarDecl *, I think we want to add a flag to VarDecl (NonParmVarDeclBits) that indicates whether the copy expression can throw or not. Or is there a reason to store the bit in BlockDecl::Capture instead?

I was thinking about the non-__block capture case, sorry. For __block variables, I think the values of that map should just be... well, I'd make a struct wrapping it, but basically a PointerIntPair of an Expr* and the canThrow flag.

ahatanak updated this revision to Diff 159762.Aug 8 2018, 11:33 AM

Change the value type of BlockVarCopyInits to PointerIntPair<Expr *, 1, bool>. The boolean flag indicates whether the copy expression can throw. Serialize/deserialize the copy expression and the boolean flag and add a regression test to test/PCH.

rjmccall added inline comments.Aug 8 2018, 12:59 PM
include/clang/AST/ASTContext.h
248 ↗(On Diff #159762)

Maybe you should just make a type for this pairing. You can put this documentation there, and the access functions can take and return it.

lib/Sema/SemaDecl.cpp
11764

Can we just make this a parameter to setBlockVarCopyInits? The caller always needs to set this correctly, so making it a required parameter makes sense — and it avoids doing the hash-lookup twice.

ahatanak updated this revision to Diff 159800.Aug 8 2018, 2:40 PM
ahatanak marked 2 inline comments as done.

Modify getBlockVarCopyInits and setBlockVarCopyInits to get and set both the copy expression and the boolean flag that indicates whether the expression can throw or not.

rjmccall added inline comments.Aug 8 2018, 5:22 PM
include/clang/AST/ASTContext.h
158 ↗(On Diff #159800)

Using a PointerIntPair in the implementation of this is still a reasonable choice. Just make it a proper abstraction, with a constructor with the two arguments and getters for the two fields.

2680 ↗(On Diff #159800)

I know this is longstanding, but since you're changing all the call sites anyway, please remove the trailing s from these two method names.

248 ↗(On Diff #159762)

I don't think the typedef is needed here.

lib/AST/ASTContext.cpp
2511 ↗(On Diff #159800)

I think auto is okay for things like this.

lib/CodeGen/CGBlocks.cpp
1683

I don't think you need to add d to the name of a copy helper. It's a bit weird, but while copying a __block variable can cause its copy helper to run, destroying it immediately afterwards can never cause its destroy helper to run. That's because a newly-copied __block variable always has a reference count of 2: the new reference in the copy and the forwarding reference from the original.

I think that means you can just add a single letter which specifies whether the corresponding __block variable operation is known to be able to throw.

1689

Why rb for a captured block instead of some single-letter thing? You don't need to emulate the structure of the flags here.

1706

The underscore is necessary here because non-trivial destructor strings can start with a number? Worth a comment.

ahatanak updated this revision to Diff 159850.Aug 8 2018, 7:49 PM
ahatanak marked 5 inline comments as done.

Address review comments.

include/clang/AST/ASTContext.h
158 ↗(On Diff #159800)

I also needed a function that sets the pointer and flag.

lib/CodeGen/CGBlocks.cpp
1683

I added 'd' to the name of the copy helper functions only because IRGen generates different code depending on whether the destructor can throw or not.

For example, if I compile the following code with -DTHROWS, IRGen uses 'invoke' (which jumps to the terminate block) for the calls to _Block_object_dispose on the EH path whereas it uses 'call' if the destructor doesn't throw.

struct S {
  S();
#ifdef THROWS
  ~S() noexcept(false);
#else
  ~S() noexcept(true);
#endif
  S(const S &);
  int a;
};

void test() {
  __block S s0, s1, s2;
  ^{ (void)s0, (void)s1; (void)s2; };
}

It seems like IRGen doesn't have to use 'invoke' when emitting a call to _Block_object_dispose even when the class has a destructor that can throw, if I understood your explanation correctly?

1689

I can use a single letter here, but I'm already using 'b' for byref captures. Perhaps I can use 'o' for non-arc objects, instead of 'r', and use 'r' for byref?

1706

Yes, that's correct.

rjmccall added inline comments.Aug 8 2018, 8:03 PM
lib/CodeGen/CGBlocks.cpp
1683

Right. It's specifically only true when unwinding after a copy, which is very atypical for C++ code, but nonetheless it's true. We should make the call nounwind in these situations and leave a comment explaining why. Did my explanation make any sense?

1689

That seems reasonable.

ahatanak updated this revision to Diff 159926.Aug 9 2018, 7:49 AM
ahatanak marked 2 inline comments as done.

Address review comments.

lib/CodeGen/CGBlocks.cpp
1683

Yes, it makes sense. Since the reference count is two after the __block variable is copied, calling _Block_object_dispose on it won't cause the destructor to be called.

rjmccall added inline comments.Aug 9 2018, 4:19 PM
include/clang/AST/ASTContext.h
161 ↗(On Diff #159926)

These should all just be defined inline.

lib/CodeGen/CGBlocks.cpp
1746

Could you replace these two flags with something more semantic, like telling this function what the context of pushing the cleanup is — basically meaning, are we in the copy helper or the destroy helper? That will let you pull the comment explaining DisposeCannotThrow into this function, where it makes a lot more sense.

ahatanak updated this revision to Diff 160040.Aug 9 2018, 5:51 PM
ahatanak marked an inline comment as done.

Define functions inline in the header file.

lib/CodeGen/CGBlocks.cpp
1746

I thought about passing a single flag instead of passing two flags too. If we are going to pass a single flag, should we still use two variables inside the function, EHOnly and DisposeCannotThrow, to maintain the readability of the code?

rjmccall added inline comments.Aug 9 2018, 8:06 PM
lib/CodeGen/CGBlocks.cpp
1746

Computing EHOnly at the top makes sense to me. DisposeCannotThrow is really only interesting in the one case, and the analysis makes more sense in terms of a ForCopyHelper parameter than it would in terms of this more abstract concept.

ahatanak updated this revision to Diff 160053.Aug 9 2018, 9:00 PM
ahatanak marked an inline comment as done.

Replace the two flags, EHOnly and DisposeCannotThrow, passed to pushCaptureCleanup with a single flag ForCopyHelper.

rjmccall accepted this revision.Aug 9 2018, 10:28 PM

Thanks, LGTM.

This revision is now accepted and ready to land.Aug 9 2018, 10:28 PM
This revision was automatically updated to reflect the committed changes.