This is an archive of the discontinued LLVM Phabricator instance.

Variable auto-init: fix __block initialization
ClosedPublic

Authored by jfb on Feb 5 2019, 5:03 PM.

Details

Summary

Automatic initialization [1] of __block variables was trampling over the block's headers after they'd been initialized, which caused self-init usage to crash, such as here:

typedef struct XYZ { void (^block)(); } *xyz_t;
__attribute__((noinline))
xyz_t create(void (^block)()) {
  xyz_t myself = malloc(sizeof(struct XYZ));
  myself->block = block;
  return myself;
}
int main() {
  __block xyz_t captured = create(^(){ (void)captured; });
}

This type of code shouldn't be broken by variable auto-init, even if it's sketchy.

[1] With -ftrivial-auto-var-init=pattern

rdar://problem/47798396

Diff Detail

Event Timeline

jfb created this revision.Feb 5 2019, 5:03 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 5 2019, 5:03 PM
jfb edited the summary of this revision. (Show Details)Feb 5 2019, 5:05 PM
rjmccall added inline comments.Feb 5 2019, 9:15 PM
lib/CodeGen/CGDecl.cpp
1611

Why don't you just initialize the payload right here, after we've drilled down to it?

1646

Does this check handle flexible arrays on zero-sized structures properly? I suppose they're always initialized.

1647

Can't you just drill to the byref storage right here and avoid all the other complexity and subtle ordering interactions?

1667

This is a late comment on the main patch, but there's an ASTContext::getAsVariableArrayType method.

rjmccall added inline comments.Feb 5 2019, 9:16 PM
lib/CodeGen/CGDecl.cpp
1611

I meant to delete this comment before submitting, sorry. I am suggesting the other place.

jfb marked 5 inline comments as done.Feb 6 2019, 2:59 PM
jfb added inline comments.
lib/CodeGen/CGDecl.cpp
1646

You mean something like

void foo(int size) {
  struct Z {};
  Z arr[size];
}

?

1647

We're in the lambda that does the initialization here. The tricky order part is that code that calls the lambda does:

  • Block (which was missing the early auto-init)
  • Trivial initializer (which has auto-init, then early exit)
  • Constant aggregate / constexpr (which might auto-init if it wasn't constant, and then early-exit)
  • The other stuff

I don't think here is the right place to do anything... and I'm not sure what you're suggesting.

1667

OK I can do in a separate follow-up.

rjmccall added inline comments.Feb 6 2019, 4:18 PM
lib/CodeGen/CGDecl.cpp
1647

Escaping __block variables are basically a fixed-layout header followed by storage of the variable's formal type. Anything you do at this point in the function to auto-initialize the header is a waste of time and code size because it is precisely at this point in the function that we perform a bunch of stores to initialize that header. You are mitigating nothing by covering the header. So what I'm saying is that, in this lambda which is meant to initialize the user-exposed storage of a variable, you should just make sure you're pointing at the user-exposed storage of the variable by calling emitBlockByrefAddress(false) (which just does a GEP), and then you can initialize only that.

jfb updated this revision to Diff 185883.Feb 7 2019, 3:48 PM
  • Only initialize __block's storage.
jfb marked 3 inline comments as done.Feb 7 2019, 3:48 PM
jfb added inline comments.
lib/CodeGen/CGDecl.cpp
1647

Thanks, I now understand what you meant. It's super simple indeed, sorry I took a while to grok. Updated patch should do this now.

rjmccall added inline comments.Feb 7 2019, 3:52 PM
lib/CodeGen/CGDecl.cpp
1605

Are these changes still needed?

1644

Thanks, this is what I was looking for.

jfb marked 2 inline comments as done.Feb 7 2019, 3:55 PM
jfb added inline comments.
lib/CodeGen/CGDecl.cpp
1726

Note that we still want this to be pulled out in this way because emitByrefStructureInit emits the call to the initializer (in test_block_self_init the call to create). Were we to leave this as it was before, one of the initializations below would be emitted, but it would be *after* the call.

Similarly, we also want the little dance I added to only initialize once.

jfb marked 2 inline comments as done.Feb 7 2019, 3:59 PM
jfb added inline comments.
lib/CodeGen/CGDecl.cpp
1605

I believe so, see concurrent comment here:
https://reviews.llvm.org/D57797#inline-512702

jfb marked 2 inline comments as done.Feb 7 2019, 4:00 PM
rjmccall added inline comments.Feb 7 2019, 4:00 PM
lib/CodeGen/CGDecl.cpp
1726

Oh, that's interesting, I hadn't remembered that. Alright, in that case LGTM, I guess.

...unless you want to refactor this so that emitByrefStructureInit only handles the header initialization and the bit about handling the initializer is handled more cleanly down in the rest of this code.

jfb updated this revision to Diff 185892.Feb 7 2019, 4:51 PM
jfb marked 4 inline comments as done.
  • Simplify patch greatly.
jfb marked an inline comment as done.Feb 7 2019, 4:51 PM
jfb added inline comments.
lib/CodeGen/CGDecl.cpp
1605

Actually I'm wrong, updated patch drops this and the test makes sure the call is after this initialization. Sorry for the confusion. This is much cleaner.

1646

Just to circle back, the above code lowers to:

%struct.Z = type { i8 }

@__const._Z3fooi.arr = private unnamed_addr constant %struct.Z { i8 -86 }, align 1

; Function Attrs: noinline nounwind optnone ssp uwtable
define void @_Z3fooi(i32 %size) #0 {
entry:
  %size.addr = alloca i32, align 4
  %saved_stack = alloca i8*, align 8
  %__vla_expr0 = alloca i64, align 8
  store i32 %size, i32* %size.addr, align 4
  %0 = load i32, i32* %size.addr, align 4
  %1 = zext i32 %0 to i64
  %2 = call i8* @llvm.stacksave()
  store i8* %2, i8** %saved_stack, align 8
  %vla = alloca %struct.Z, i64 %1, align 16
  store i64 %1, i64* %__vla_expr0, align 8
  %vla.iszerosized = icmp eq i64 %1, 0
  br i1 %vla.iszerosized, label %vla-init.cont, label %vla-setup.loop

vla-setup.loop:                                   ; preds = %entry
  %vla.begin = bitcast %struct.Z* %vla to i8*
  %vla.end = getelementptr inbounds i8, i8* %vla.begin, i64 %1
  br label %vla-init.loop

vla-init.loop:                                    ; preds = %vla-init.loop, %vla-setup.loop
  %vla.cur = phi i8* [ %vla.begin, %vla-setup.loop ], [ %vla.next, %vla-init.loop ]
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %vla.cur, i8* align 1 getelementptr inbounds (%struct.Z, %struct.Z* @__const._Z3fooi.arr, i32 0, i32 0), i64 1, i1 false)
  %vla.next = getelementptr inbounds i8, i8* %vla.cur, i64 1
  %vla-init.isdone = icmp eq i8* %vla.next, %vla.end
  br i1 %vla-init.isdone, label %vla-init.cont, label %vla-init.loop

vla-init.cont:                                    ; preds = %vla-init.loop, %entry
  %3 = load i8*, i8** %saved_stack, align 8
  call void @llvm.stackrestore(i8* %3)
  ret void
}

So yes it's handled.

1667
1726

As noted above, I was wrong here.

jfb edited the summary of this revision. (Show Details)Feb 7 2019, 4:53 PM
jfb updated this revision to Diff 185896.Feb 7 2019, 4:59 PM
  • Remove whitespace changes.
This revision was not accepted when it landed; it landed in state Needs Review.Feb 7 2019, 5:29 PM
This revision was automatically updated to reflect the committed changes.

SGTM, thanks!