This is an archive of the discontinued LLVM Phabricator instance.

[SROA] Drop lifetime.start/end intrinsics when they block promotion.
ClosedPublic

Authored by efriedma on Sep 22 2016, 5:54 PM.

Details

Reviewers

vitalybuka
chandlerc

Commits

rG5096775393b8: [SROA] Drop lifetime.start/end intrinsics when they block promotion.
rL288074: [SROA] Drop lifetime.start/end intrinsics when they block promotion.

Summary

Preserving lifetime markers isn't as important as allowing promotion, so just drop the lifetime markers if necessary.

This also fixes an assertion failure where other parts of SROA assumed that lifetime markers never block promotion.

Fixes https://llvm.org/bugs/show_bug.cgi?id=29139.

Diff Detail

Repository: rL LLVM

Event Timeline

efriedma updated this revision to Diff 72229. Sep 22 2016, 5:54 PM

efriedma retitled this revision to [SROA] Expand lifetime.start/end offset checks to a couple more places.

efriedma updated this object.

efriedma added a reviewer: chandlerc.

efriedma set the repository for this revision to rL LLVM.

efriedma added subscribers: llvm-commits, Ka-Ka.

This patch solves the problem I have in the "out of tree" backend I work with. Many thanks!

I applied this patch to our codebase and it seems to fix the assertion failure we were seeing in our smaller test cases, thanks!

Ping.

I understand that this fixes an assertion, but it does so by refusing to promote in a case where we can in fact promote.

As an alternative, have you looked at teaching the rewrite logic to "fix" (potentially by stripping) the lifetime markers during rewrite to allow the promotion cases?

New approach which drops lifetime intrinsics instead of trying to preserve them.
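
One way to picture the difference between the two approaches is a toy model in which an alloca is promotable only if none of its lifetime markers blocks promotion. This is a sketch under that assumption, not code from SROA.cpp; `LifetimeMarker` and `isPromotable` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of a lifetime marker on a (rewritten) alloca.
struct LifetimeMarker {
  std::uint64_t offset; // byte offset into the alloca
  std::uint64_t size;   // number of bytes the marker covers
};

// Returns true if the markers leave the alloca promotable.
//
// dropPartialMarkers == false models preserving every marker: a marker that
// covers only part of the alloca blocks promotion.
// dropPartialMarkers == true models the new approach: partial markers are
// deleted, so they can no longer block promotion.
inline bool isPromotable(const std::vector<LifetimeMarker> &markers,
                         std::uint64_t allocaSize, bool dropPartialMarkers) {
  for (const LifetimeMarker &m : markers) {
    // Assumption of this sketch: markers lie within the alloca.
    assert(m.offset + m.size <= allocaSize);
    const bool coversWholeAlloca = m.offset == 0 && m.size == allocaSize;
    if (!coversWholeAlloca && !dropPartialMarkers)
      return false;
  }
  return true;
}
```

In this model, a 16-byte alloca with one 16-byte marker and one 8-byte marker at offset 8 is not promotable when markers are preserved, and becomes promotable once partial markers are dropped.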

Awesome, I'm glad this was so easy. Tiny nit picks below.

lib/Transforms/Scalar/SROA.cpp
2891–2892	We also don't want to do this if the function is being instrumented by ASan as this will undermine scope checking using lifetime markers.
2893–2895	Since we just always can promote through lifetime intrinsics now, I would skip the variable entirely.

Don't drop lifetime markers when ASan is enabled.

The asan checks make the patch a bit more complicated. Hopefully what I'm doing here makes sense.
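
As a rough illustration of the ASan-aware rule in this revision of the patch, the decision for a single lifetime marker can be modelled as a three-way classification. This is a standalone sketch, not the SROA.cpp code: `ByteRange`, `MarkerAction`, and `classifyMarker` are hypothetical names, and only the condition (drop when the function is not ASan-instrumented and the marker does not cover the whole alloca) mirrors the diff under review.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical half-open byte range [begin, end); not an LLVM type.
struct ByteRange {
  std::uint64_t begin;
  std::uint64_t end;
};

enum class MarkerAction {
  Drop,           // marker deleted; promotion is not blocked
  KeepPromotable, // marker rewritten; it covers the whole alloca
  KeepBlocking,   // marker rewritten; partial coverage blocks promotion
};

// marker: the range the lifetime intrinsic covers.
// slot:   the range of the alloca being rewritten.
// sanitizeAddress: whether the enclosing function is ASan-instrumented.
inline MarkerAction classifyMarker(ByteRange marker, ByteRange slot,
                                   bool sanitizeAddress) {
  assert(marker.begin <= marker.end && slot.begin <= slot.end);
  const bool isWholeAlloca =
      marker.begin == slot.begin && marker.end == slot.end;
  // Outside ASan, a partial marker is simply dropped.
  if (!sanitizeAddress && !isWholeAlloca)
    return MarkerAction::Drop;
  // Otherwise the marker is kept, and only whole-alloca markers
  // are compatible with promotion.
  return isWholeAlloca ? MarkerAction::KeepPromotable
                       : MarkerAction::KeepBlocking;
}
```

The `KeepBlocking` outcome is the case the later discussion worries about: under ASan, a partial marker would keep the alloca in memory.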

Ping.

chandlerc added inline comments. Oct 27 2016, 12:23 AM

lib/Transforms/Scalar/SROA.cpp
1738–1744	This (and the bit below) seems like a really good change, and I can see why you might need to make it in order to make progress. But I think you'll need to split this out into a separate patch. First, this needs its own test case. It can be pretty boring and just check that asan functions keep lifetime markers and avoid promoting. But more importantly, this may have really significant knock-on effects for ASan. I just don't know. We should ask the ASan folks to test this and see what it does to things like the performance of ASan-instrumented code. My fear is that it may block so much promotion to registers that we actually make ASan unusably slow. It may be necessary to instead do something much more fancy for ASan: we should instead always promote to a register but synthesize the necessary checks for ASan's use-after-scope directly without relying on memory. If this is something you're interested in playing with, I'm happy to chat with you and the ASan folks about how this might work. Otherwise, I'd suggest sending them this patch as a WIP and they can follow up when it gets to the top of their priorities. See below for my suggestion on how to make progress in the interim though.
1954–1960	(this is the other change I would group with the above)
2897–2901	FWIW, I think this is sufficient to avoid regressing ASan while allowing promotion through partial lifetime markers. Is there something that breaks if you make this change and not the changes above? If so, I'd like to find a way to temporarily work around them as I think fully handling ASan + lifetime markers + promotable allocas is going to be a big project.

efriedma added inline comments. Oct 27 2016, 10:19 AM

lib/Transforms/Scalar/SROA.cpp
2897–2901	The assertion on 2262 ("assert(CanSROA);") fails without the other changes. I can probably narrow the other checks so they only trigger on partial lifetime markers, if you think that would be an improvement.

Ping.

I was debugging a testcase reduced from a PS4 title when I realized Eli had already provided a fix.
An alternative testcase, FWIW:

%myclass = type { [16 x i32] }

declare void @llvm.lifetime.start(i64, i8* nocapture)

define void @patatino() {
  %gb = alloca [2 x %myclass*]
  %tmp = bitcast [2 x %myclass*]* %gb to i8*
  call void @llvm.lifetime.start(i64 16, i8* %tmp)
  %tmp1 = getelementptr [2 x %myclass*], [2 x %myclass*]* %gb, i64 0, i64 0
  store %myclass* undef, %myclass** %tmp1
  %tmp3 = bitcast %myclass** %tmp1 to <4 x i64>*
  %tmp4 = load <4 x i64>, <4 x i64>* %tmp3
  ret void
}

ping @chandlerc

@chandlerc Ping.

chandlerc added inline comments. Nov 21 2016, 6:15 PM

lib/Transforms/Scalar/SROA.cpp
2897–2901	OK, but making the changes above will, I suspect, have a much more dramatic impact on ASan. I guess go with just always dropping partial alloca lifetime markers and leave a FIXME for the sanitizer folks to come in and re-work this. I suspect that the sanitizer folks will need to accept the false negatives here until they're able to move local scalar sanitizing to work in a non-shadow-memory way.

+Vitaly, as this may affect asan's stack-use-after-scope detection.

In D24854#602150, @kcc wrote:

+Vitaly, as this may affect asan's stack-use-after-scope detection.

In case following the thread is confusing: the suggested approach here may cause false *negatives* for ASan + O2 + stack-use-after-scope. But anything Eli does here otherwise will regress ASan + O2 performance dramatically and fix an unknown number of current false negatives.

So I suspect the ASan folks will want to pursue changes here to reduce false negatives in a separate change where you can both avoid the optimization regressions and control enabling these checks to not suddenly expose large numbers of new failures to a check that is already deployed.

If this patch can wait a couple of days I can investigate performance regression and amount of exposed false negatives.
If this is urgent, then it is probably safe to do as suggested, with false negatives, but without the risk of significant performance regression in ASan.

In D24854#602229, @vitalybuka wrote:

If this patch can wait a couple of days I can investigate performance regression and amount of exposed false negatives.
If this is urgent, then it is probably safe to do as suggested, with false negatives, but without the risk of significant performance regression in ASan.

I would rather go ahead. The false negatives added will be small and rare compared to those we already have IMO. And I think the risk of performance regressions and other things is really quite high. Plus, it would need a great deal more testing than we currently have, mostly around cases which ASan should catch. This patch isn't really about ASan, so it seems better to unblock it and come back and do a comprehensive job with ASan in mind later.

In D24854#602253, @chandlerc wrote:

I would rather go ahead.

SGTM

I've rerun all our tests with the patch tonight and I see no new reports.
So even without the hasFnAttribute(Attribute::SanitizeAddress) checks, the probability of false negatives should be low.

If I'm following the conversation correctly, I guess this means I should go ahead and commit https://reviews.llvm.org/D24854?id=73682 (the previous version of the patch)?

In D24854#603277, @efriedma wrote:

If I'm following the conversation correctly, I guess this means I should go ahead and commit https://reviews.llvm.org/D24854?id=73682 (the previous version of the patch)?

Yes. I would just remove the bool variable as I indicated in the comment on that revision. LGTM, go ahead and land the patch, and sorry for the back and forth, it wasn't clear to me how fast this would explode into a mess with my original ASan comment. =[

This revision is now accepted and ready to land. Nov 22 2016, 2:26 PM

For https://reviews.llvm.org/D24854?id=73682

Closed by commit rL288074: [SROA] Drop lifetime.start/end intrinsics when they block promotion. (authored by efriedma). Nov 28 2016, 2:00 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
lib/Transforms/Scalar/SROA.cpp	30 lines
test/Transforms/SROA/basictest.ll	38 lines

Diff 74556

lib/Transforms/Scalar/SROA.cpp

[1,729 lines not shown] static bool isVectorPromotionViableForSlice(Partition &P, const Slice &S,
   Use *U = S.getUse();

   if (MemIntrinsic *MI = dyn_cast<MemIntrinsic>(U->getUser())) {
     if (MI->isVolatile())
       return false;
     if (!S.isSplittable())
       return false; // Skip any unsplittable intrinsics.
   } else if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(U->getUser())) {
+    // lifetime.start and lifetime.end are promotable, but we avoid
+    // promoting them in some cases if asan is enabled.
     if (II->getIntrinsicID() != Intrinsic::lifetime_start &&
         II->getIntrinsicID() != Intrinsic::lifetime_end)
       return false;
+    if (II->getFunction()->hasFnAttribute(Attribute::SanitizeAddress))
+      return false;
   } else if (U->get()->getType()->getPointerElementType()->isStructTy()) {
     // Disable vector promotion when there are loads or stores of an FCA.
     return false;
   } else if (LoadInst *LI = dyn_cast<LoadInst>(U->getUser())) {
     if (LI->isVolatile())
       return false;
     Type *LTy = LI->getType();
     if (P.beginOffset() > S.beginOffset() || P.endOffset() < S.endOffset()) {
[193 lines not shown] if (IntegerType *ITy = dyn_cast<IntegerType>(ValueTy)) {
       return false;
     }
   } else if (MemIntrinsic *MI = dyn_cast<MemIntrinsic>(U->getUser())) {
     if (MI->isVolatile() || !isa<Constant>(MI->getLength()))
       return false;
     if (!S.isSplittable())
       return false; // Skip any unsplittable intrinsics.
   } else if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(U->getUser())) {
+    // lifetime.start and lifetime.end are promotable, but we avoid
+    // promoting them in some cases if asan is enabled.
     if (II->getIntrinsicID() != Intrinsic::lifetime_start &&
         II->getIntrinsicID() != Intrinsic::lifetime_end)
       return false;
+    if (II->getFunction()->hasFnAttribute(Attribute::SanitizeAddress))
+      return false;
   } else {
     return false;
   }

   return true;
 }

 /// \brief Test whether the given alloca partition's integer operations can be
[909 lines not shown] bool visitIntrinsicInst(IntrinsicInst &II) {
     assert(II.getIntrinsicID() == Intrinsic::lifetime_start ||
            II.getIntrinsicID() == Intrinsic::lifetime_end);
     DEBUG(dbgs() << " original: " << II << "\n");
     assert(II.getArgOperand(1) == OldPtr);

     // Record this instruction for deletion.
     Pass.DeadInsts.insert(&II);

+    // Lifetime intrinsics are only promotable if they cover the whole alloca.
+    // Therefore, we drop lifetime intrinsics which don't cover the whole
+    // alloca.
+    //
+    // As an exception to this rule, don't drop the intrinsics if asan is
+    // enabled, to enable more precise lifetime tracking.
+    //
+    // (In theory, intrinsics which partially cover an alloca could be
+    // promoted, but PromoteMemToReg doesn't handle that case.)
+    // FIXME: Check whether the alloca is promotable before dropping the
+    // lifetime intrinsics?
+    bool IsWholeAlloca = NewBeginOffset == NewAllocaBeginOffset &&
+                         NewEndOffset == NewAllocaEndOffset;
+    if (!II.getFunction()->hasFnAttribute(Attribute::SanitizeAddress) &&
+        !IsWholeAlloca)
+      return true;
+
     ConstantInt *Size =
         ConstantInt::get(cast<IntegerType>(II.getArgOperand(0)->getType()),
                          NewEndOffset - NewBeginOffset);
     Value *Ptr = getNewAllocaSlicePtr(IRB, OldPtr->getType());
     Value *New;
     if (II.getIntrinsicID() == Intrinsic::lifetime_start)
       New = IRB.CreateLifetimeStart(Ptr, Size);
     else
       New = IRB.CreateLifetimeEnd(Ptr, Size);

     (void)New;
     DEBUG(dbgs() << " to: " << *New << "\n");

-    // Lifetime intrinsics are only promotable if they cover the whole alloca.
-    // (In theory, intrinsics which partially cover an alloca could be
-    // promoted, but PromoteMemToReg doesn't handle that case.)
-    bool IsWholeAlloca = NewBeginOffset == NewAllocaBeginOffset &&
-                         NewEndOffset == NewAllocaEndOffset;
     return IsWholeAlloca;
   }

   bool visitPHINode(PHINode &PN) {
     DEBUG(dbgs() << " original: " << PN << "\n");
     assert(BeginOffset >= NewAllocaBeginOffset && "PHIs are unsplittable");
     assert(EndOffset <= NewAllocaEndOffset && "PHIs are unsplittable");

[1,387 lines not shown]

test/Transforms/SROA/basictest.ll

[1,666 lines not shown] entry:
   call void @llvm.lifetime.end(i64 16, i8* %0)
   ret void
 }

 declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture, i64, i32, i1) nounwind

 define void @PR27999() unnamed_addr {
 ; CHECK-LABEL: @PR27999(
+; CHECK: entry-block:
+; CHECK-NEXT: ret void
+entry-block:
+  %0 = alloca [2 x i64], align 8
+  %1 = bitcast [2 x i64]* %0 to i8*
+  call void @llvm.lifetime.start(i64 16, i8* %1)
+  %2 = getelementptr inbounds [2 x i64], [2 x i64]* %0, i32 0, i32 1
+  %3 = bitcast i64* %2 to i8*
+  call void @llvm.lifetime.end(i64 8, i8* %3)
+  ret void
+}
+
+define void @PR27999_asan() sanitize_address {
+; CHECK-LABEL: @PR27999_asan(
 ; CHECK: alloca [2 x i64], align 8
 ; CHECK: call void @llvm.lifetime.start(i64 16,
 ; CHECK: call void @llvm.lifetime.end(i64 8,
 entry-block:
   %0 = alloca [2 x i64], align 8
   %1 = bitcast [2 x i64]* %0 to i8*
   call void @llvm.lifetime.start(i64 16, i8* %1)
   %2 = getelementptr inbounds [2 x i64], [2 x i64]* %0, i32 0, i32 1
   %3 = bitcast i64* %2 to i8*
   call void @llvm.lifetime.end(i64 8, i8* %3)
   ret void
 }
+
+define void @PR29139() {
+; CHECK-LABEL: @PR29139(
+; CHECK: bb1:
+; CHECK-NEXT: ret void
+bb1:
+  %e.7.sroa.6.i = alloca i32, align 1
+  %e.7.sroa.6.0.load81.i = load i32, i32* %e.7.sroa.6.i, align 1
+  %0 = bitcast i32* %e.7.sroa.6.i to i8*
+  call void @llvm.lifetime.end(i64 2, i8* %0)
+  ret void
+}
+
+define void @PR29139_asan() sanitize_address {
+; CHECK-LABEL: @PR29139_asan(
+; CHECK: alloca i32
+; CHECK: call void @llvm.lifetime.end(i64 2
+bb1:
+  %e.7.sroa.6.i = alloca i32, align 1
+  %e.7.sroa.6.0.load81.i = load i32, i32* %e.7.sroa.6.i, align 1
+  %0 = bitcast i32* %e.7.sroa.6.i to i8*
+  call void @llvm.lifetime.end(i64 2, i8* %0)
+  ret void
+}