This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
clang/
-
lib/CodeGen/
-
CodeGen/
-
CGDecl.cpp
-
test/CodeGenCXX/
-
CodeGenCXX/
-
block-capture-own-init.cpp

Differential D89903

[CodeGen] Crash instead of generating broken code with self-capturing __block var
Needs ReviewPublic

Authored by • ille on Oct 21 2020, 11:32 AM.

Download Raw Diff

Details

Reviewers

jfb
rjmccall

Summary

This applies to situations where a __block variable's initializer references a
block that potentially captures the variable itself.

Clang special-cases this because the initializer might call Block_copy and
cause the variable to be moved to the heap while it's still being initialized.
For example, in this snippet:

int copyBlockAndReturnInt(void (^)(void));

__block int val = copyBlockAndReturnInt(^{
    printf("%p\n", val);
});

...if copyBlockAndReturnInt calls Block_copy, val will be moved to the
heap. When copyBlockAndReturnInt returns, the generated code needs to store
the value to the new location.

For scalar values like pointers, this is handled by deferring computation of
the variable's address until after the initializer has been evaluated (and just
before actually storing to that address).

But what about structs? This case would be hard:

struct Foo { int a, b; };
int copyBlockAndReturnInt(void (^)(void));

__block struct Foo foo = {42, copyBlockAndReturnInt(^{
    printf("%p\n", &foo);
})};

To support this, Clang would have to recalculate the location of foo
before writing to each struct field, in case the initializer of the
current field moved the struct. Block_copy would also end up copying a
half-initialized struct, although that's no big deal with C structs.

This C++ case, on the other hand, would be impossible:

struct Foo {
    void (^block_)(void);
    Foo(void (^block)(void)) : {
        printf("I am at %p\n", this);
        block_ = Block_copy(block);
        printf("I am now at %p\n", this);
    }
};

__block Foo foo(^{
    printf("%p\n", &foo);
});

To support *that*, Clang would have to magically move foo
mid-construction. Normally, Clang uses the move or copy constructor
when moving C++ classes to the heap, but C++ classes certainly don't
expect a move constructor to be called in the middle of another
constructor. memcpy might work in some cases, but it violates C++
classes' expectation that object addresses will remain stable.
Thus there's no real way to make this work, and ideally it would result in
a compile error (or, alternately, a runtime crash upon calling
Block_copy).

So how does Clang currently approach this situation? Well, there's just a
comment:

case TEK_Aggregate:
  [...]
    // TODO: how can we delay here if D is captured by its initializer?

Clang never calls drillIntoBlockVariable, and ends up writing the
variable to the beginning of the byref struct (not the corresponding
field in it), which is never the correct location. This leaves both the
variable and the byref struct corrupt - even if the block is never
actually copied.

This bug dates back to at least 2012, as I reproduced it on LLVM 3.1.

The current patch is intended to be a very minimal fix, for the sake of
landing sooner. It simply calls CGM.ErrorUnsupported, resulting in an
error followed by an intentional crash, if (and only if) it would
otherwise generate bad code.

Note that I'd like to follow this patch with another one that (at least):

detects the issue earlier in the pipeline, and emits a proper compile error; and
makes the "potentially self-capturing" analysis more precise, since it currently has a lot of false positives.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit TestsFailed

	Time	Test
	510 ms	linux > Clang.CodeGenCXX::block-capture-own-init.cpp
	160 ms	windows > Clang.CodeGenCXX::block-capture-own-init.cpp

Event Timeline

• ille created this revision.Oct 21 2020, 11:32 AM

Herald added a project: Restricted Project. · View Herald TranscriptOct 21 2020, 11:32 AM

• ille requested review of this revision.Oct 21 2020, 11:32 AM

It's not optimal, but an alternative would be to force the variable to the heap immediately rather than waiting for a potential block copy. The variable would actually be uninitialized during its "copy", so we'd need to give it a trivial copy helper. But once the variable's been moved to the heap, it's never copied again, so that's fine.

We can force it to the heap using the public ABI for __block variables by copying and then immediately disposing of it. (That would leave a reference from the stack, which is what we want.) It would be better to have some way of generating it on the heap to begin it, but there's no ABI support for that right now.

Of course, that wouldn't solve the general problem that the block might be getting used before its capture is fully initialized, but that's a general problem with uses within initializers in C.

Harbormaster completed remote builds in B75916: Diff 299763.Oct 21 2020, 12:06 PM

One downside of that approach is performance. It's somewhat idiosyncratic, but I work on codebases that use blocks heavily in performance-sensitive paths. Typically, the blocks are never copied (they don't escape) and are expected to be inlined. Any implicit heap allocations added by the compiler would have an undesirable performance cost, and would likely also hinder inlining. For those codebases, I would much prefer to be prompted to refactor the code to avoid the self-capture. I suppose this could be implemented as a warning, perhaps off by default.

Another, possibly more serious issue is allocation failure. Block_copy returns null rather than aborting on allocation failure. This is from the Darwin userland implementation:

// Its a stack block.  Make a copy.
if (!isGC) {
    struct Block_layout *result = malloc(aBlock->descriptor->size);
    if (!result) return NULL;

xnu also has a Block_copy implementation that can fail.

If Clang implicitly called Block_copy and it failed, it would have to either hand the user a null block, which would be unexpected, or abort the process. For most codebases, aborting the process would be fine, but there are codebases that attempt to recover cleanly from all allocation failures, and some of those may happen to use blocks. xnu is one example; I don't know of any others, but they might exist.

I suppose those codebases could also enable the hypothetical warning, but it seems like a footgun, at least if the warning is off by default.

Incidentally, I checked how this behaves under ARC. ARC has to deal with the same situation, of course, but Objective-C generally aborts on allocation failure, so having failed block copies do the same would be more acceptable and is the behavior I'd expect. However, objc_retainBlock, the function ARC generates calls to, simply forwards directly to Block_copy; it doesn't check for a null return value, nor does the ARC-generated code do so itself. So the user just gets a null block, which seems like a bug.

We do not actually support allocation failure for a lot of things around blocks. I don't think the copy-helper functions even have a way to propagate out a failure in copying a field. I have never seen any code in the wild that would handle Block_copy returning a null pointer. Effectively it is assumed to not happen.

It seems somewhat unlikely to me that anyone would actually write code like your example without copying the block and potentially triggering the __block variable to be moved to the heap, which is why I think pre-moving the variable might be acceptable.

With all that said, I agree that crashing and/or just not drilling into the variable is not acceptable.

We do not actually support allocation failure for a lot of things around blocks. I don't think the copy-helper functions even have a way to propagate out a failure in copying a field. I have never seen any code in the wild that would handle Block_copy returning a null pointer. Effectively it is assumed to not happen.

Fair – although if that's the case, perhaps xnu should not be using blocks. I suppose failures of small allocations like this are rare enough that the issue hasn't come up in practice, and same with the null blocks under ARC.

It seems somewhat unlikely to me that anyone would actually write code like your example without copying the block and potentially triggering the __block variable to be moved to the heap, which is why I think pre-moving the variable might be acceptable.

Also fair. I can implement this, but I do think there should be an associated warning flag.

With all that said, I agree that crashing and/or just not drilling into the variable is not acceptable.

The patch I submitted switches from not drilling into the variable to crashing. Are you saying I should submit a more thorough fix rather than trying to land this first?

Move the check to cover the atomic case as well.

Harbormaster completed remote builds in B76049: Diff 299998.Oct 22 2020, 9:21 AM

• ille mentioned this in D90434: [CodeGen] Correct codegen for self-capturing __block var.Oct 29 2020, 3:13 PM

VarDecl has a bit (EscapingByref) that indicates whether the __block variable is captured by an escaping block. Can that information be used to avoid pre-moving to the heap?

OK, I se the other patch is using the bit.

Revision Contents

Path

Size

clang/

lib/

CodeGen/

CGDecl.cpp

6 lines

test/

CodeGenCXX/

block-capture-own-init.cpp

9 lines

Diff 299998

clang/lib/CodeGen/CGDecl.cpp

Show First 20 Lines • Show All 1,907 Lines • ▼ Show 20 Lines	void CodeGenFunction::EmitExprAsInit(const Expr init, const ValueDecl D,
case TEK_Complex: {		case TEK_Complex: {
ComplexPairTy complex = EmitComplexExpr(init);		ComplexPairTy complex = EmitComplexExpr(init);
if (capturedByInit)		if (capturedByInit)
drillIntoBlockVariable(*this, lvalue, cast<VarDecl>(D));		drillIntoBlockVariable(*this, lvalue, cast<VarDecl>(D));
EmitStoreOfComplex(complex, lvalue, /init/ true);		EmitStoreOfComplex(complex, lvalue, /init/ true);
return;		return;
}		}
case TEK_Aggregate:		case TEK_Aggregate:
		if (capturedByInit) {
		// TODO: how can we delay here if D is captured by its initializer?
		CGM.ErrorUnsupported(
		init, "aggregate initialized with potentially self-capturing block");
		}
if (type->isAtomicType()) {		if (type->isAtomicType()) {
EmitAtomicInit(const_cast<Expr*>(init), lvalue);		EmitAtomicInit(const_cast<Expr*>(init), lvalue);
} else {		} else {
AggValueSlot::Overlap_t Overlap = AggValueSlot::MayOverlap;		AggValueSlot::Overlap_t Overlap = AggValueSlot::MayOverlap;
if (isa<VarDecl>(D))		if (isa<VarDecl>(D))
Overlap = AggValueSlot::DoesNotOverlap;		Overlap = AggValueSlot::DoesNotOverlap;
else if (auto *FD = dyn_cast<FieldDecl>(D))		else if (auto *FD = dyn_cast<FieldDecl>(D))
Overlap = getOverlapForFieldInit(FD);		Overlap = getOverlapForFieldInit(FD);
// TODO: how can we delay here if D is captured by its initializer?
EmitAggExpr(init, AggValueSlot::forLValue(		EmitAggExpr(init, AggValueSlot::forLValue(
lvalue, *this, AggValueSlot::IsDestructed,		lvalue, *this, AggValueSlot::IsDestructed,
AggValueSlot::DoesNotNeedGCBarriers,		AggValueSlot::DoesNotNeedGCBarriers,
AggValueSlot::IsNotAliased, Overlap));		AggValueSlot::IsNotAliased, Overlap));
}		}
return;		return;
}		}
llvm_unreachable("bad evaluation kind");		llvm_unreachable("bad evaluation kind");
▲ Show 20 Lines • Show All 654 Lines • Show Last 20 Lines

clang/test/CodeGenCXX/block-capture-own-init.cpp

This file was added.

				// RUN: not %clang_cc1 -x c++ -std=c++11 -fblocks -emit-llvm %s 2>&1 \| FileCheck %s

				void test_aggregate_captured_by_own_init() {
				struct foo { int a[100]; };
				extern foo get_foo(foo *(^)());
				// CHECK: error: cannot compile this aggregate initialized with potentially self-capturing block yet
				__block foo f = get_foo(^{ return &f; });
				}