Download Raw Diff

Details

Reviewers

reames
eli.friedman
danielcdh
efriedma

Commits

rGc9acad12e974: [LICM] Allow promotion of some stores that are not guaranteed to execute.
rL291171: [LICM] Allow promotion of some stores that are not guaranteed to execute.

Summary

Promotion is always legal when a store within the loop is guaranteed to execute.

However, this is not a necessary condition - for promotion to be memory model semantics-preserving, it is enough to have a store that dominates every exit block. This is because if the store dominates every exit block, the fact the exit block was executed implies the original store was executed as well.

Diff Detail

Repository: rL LLVM

Event Timeline

mkuper updated this revision to Diff 82627.Dec 28 2016, 4:24 PM

mkuper retitled this revision from to [LICM] Allow promotion of some stores that are not guaranteed to execute.

mkuper updated this object.

mkuper added reviewers: reames, eli.friedman, danielcdh.

mkuper added a subscriber: llvm-commits.

I'm responding to the description. If the implementation splits a hair the description doesn't, my response may be off base.

I don't think this transform is legal. The exception being thrown might be what prevents the otherwise faulting load or store from executing.

Consider this small example:
for(.i = ...) {

throw_if_out_of_bounds(i);
a[i] += 5;

}

We can't speculatively execute the load from a[i] as it might be the case that it crosses a page boundary and dereferencing the memory will introduce a page fault which doesn't exist in the original program.

(I'm deliberately using a loop variant i in the example, but the same thing applies if i is loop invariant.)

lib/Transforms/Scalar/LICM.cpp
910 ↗	(On Diff #82627)	You don't need a store. A load is sufficient for this part.

This revision now requires changes to proceed.Dec 28 2016, 9:02 PM

reames added inline comments.Dec 28 2016, 9:06 PM

test/Transforms/LICM/scalar_promote.ll
209 ↗	(On Diff #82627)	Both of your examples have unconditional loads with conditional stores. This is interesting in that the dereferenceability would seem to be established by the load. Is this representative of the real example you're trying to address?

Thanks, Philip!

Yes, sorry, that was unclear from the description. I think it's clearer in the inline comment.

For promotion to be legal you need two conditions to hold:

You can unconditionally dereference the pointer in a hoisted load. This holds is if at least one store or load is guaranteed to execute - but that's not a necessary condition. You may know the pointer is dereferencible through other means.
You are allowed to sink the store, from the memory model point of view. The previous code would check that the store is either guaranteed to execute or, if not, stores to a thread local.

What I'm doing here is relaxing condition 2.
Assuming I got the logic right, this patch should not be introducing any cases where the load can fault. That's why my tests have an unconditional load.

This targets reduction loops like

for (int i = 0; i < n; ++i)
  for (int j = 0; j < m; ++j)
    local[i] += foo(i,j)

Where local[i] would otherwise be promotable within the loop, except that the store is not guaranteed to execute because foo() may throw. Note that the load precedes the call to bar.

Added more tests that hopefully alleviate Philip's concerns.

Ok, with the inline discussion, it looks like this is a valid and useful extension. Can you update the description of the patch to reflect the discussion?

Inline code comments below.

lib/Transforms/Scalar/LICM.cpp
253 ↗	(On Diff #82677)	You've got a scattering of unrelated trivial cleanup changes here (style fixes, clarifying comments on the original code, etc..). Please separate those and submit them. No further review needed, but I'd like to get them out of this patch for future rounds.
258 ↗	(On Diff #82677)	Changing from the or and exit scheme to the and scheme should be NFC. Please separate and land without further review. (In particular, if this turns out not to be NFC, I want to be able to bisect to the refactoring.)
268 ↗	(On Diff #82677)	You switching from a lazy to an eager computation of both exit blocks and insert pts. In this case, it shouldn't matter since hasDedicatedExits also does the same (relatively expensive) computation of the loop exits. At some point, we really do need to cache the loop exit walk.... (not for this change)
288 ↗	(On Diff #82677)	Use of the Changed variable here is overly conservative. We actually only need to run this if store promotion changed the loop. If you wanted to do that as a a separate change, it would be a useful cleanup and compile time win.
955 ↗	(On Diff #82677)	This loop is getting really complicated and hard to follow. I think we need to find an alternate structure here which makes your changes easier to reason about. One idea would be to introduce two variables for the two different things we're trying to establish. (i.e. LocationIsDereferencable, and CanInsertStoresOnExits.) The key simplification is that we don't need to track why we know we can insert the stores. I'm also open to other organizations here.
970 ↗	(On Diff #82677)	You still need a slightly modified version of this comment.
1034 ↗	(On Diff #82677)	I'm having a hard time following the semantic changes here intermixed with the refactoring. Please separate that and commit without further review. I suspect that once that's done, I'll spot a couple more things here.
test/Transforms/LICM/scalar_promote.ll
208 ↗	(On Diff #82677)	Looking at this test, it really should be handled by the thread local code we already have. Have you looked to see why it isn't? In fact, looking at all of your tests, the thread local case should handle it...? (Note: This is a very important question to that I'll want an answer to before we commit the semantics parts of this code.)

This revision now requires changes to proceed.Dec 29 2016, 12:36 PM

mkuper added inline comments.Dec 29 2016, 2:20 PM

lib/Transforms/Scalar/LICM.cpp
253 ↗	(On Diff #82677)	I'm not sure all of those changes makes sense while ExitBlocks are still computed lazily. I'll separate as many of them as reasonable, let me know if you still want things unbundled at the next iteration.
258 ↗	(On Diff #82677)	So, it's not NFC on its own, because it also requries the sinking the formLCSSARecursively() call out of the if. Note that: If Changed was true, then "Preheader \|\| L->hasDedicatedExits()" is true, because the arguments to the or guard the previous transofrmations. The "if (Changed)" on line 269 can be true even if promoteLoopAccessesToScalars() made no changes, since it's the same Changed variable, not one defined within the if block. So we end up running formLCSSARecursively() for any change. Which works, by accident, but is really counter-intuitive. And would breaks if DisablePromotion actually happend to be true. The two changes should be NFC when combined, though, so I'll commit that separately.
268 ↗	(On Diff #82677)	It's not just that - isGuaranteedToExecute() also does the same computation... But yes, that's a separate change.
288 ↗	(On Diff #82677)	I think that's actually wrong (and the comment is misleading, I should fix it)- see above for explanation. I mean, I'm not saying that we need to run it in other cases (although I'm pretty sure we do), I'm saying we did run it in other cases, and I'd prefer this part to be NFC.
955 ↗	(On Diff #82677)	I agree. I'll see if I can simplify.
910 ↗	(On Diff #82627)	No, if the load is guaranteed to be executed, but the store is under control flow, then it's not safe to promote, unless you have additional information (next paragraph). I can try to rewrite this whole thing, but I'm not sure I can make it both precise and clearer.
test/Transforms/LICM/scalar_promote.ll
208 ↗	(On Diff #82677)	Why would it be handled by the thread-local code? %local is an alloca, so it's not thread-local memory.

mkuper added inline comments.Dec 29 2016, 2:42 PM

lib/Transforms/Scalar/LICM.cpp
288 ↗	(On Diff #82677)	And looks like I was wrong, apparently this really is fine to run only for promotions, I guess the other transformations preserve LCSSA. I'll sort all of that out first, then update this patch.

mkuper mentioned this in rL290729: [LICM] Compute exit blocks for promotion eagerly. NFC..Dec 29 2016, 3:22 PM

Responding to the thread local discussion, but leaving the rest until you've separated the various pieces and landed them.

test/Transforms/LICM/scalar_promote.ll
208 ↗	(On Diff #82677)	An alloca is thread local until the first time it escapes and become visible to another thread. This is the same as a call to malloc(). I think the isAllocLikeFn check in the thread local case should become (isAllocLikeFn \|\| isa<AllocaInst>). I believe that will handle the cases you've mentioned.

mkuper added inline comments.Dec 29 2016, 4:17 PM

test/Transforms/LICM/scalar_promote.ll
208 ↗	(On Diff #82677)	An alloca is thread local until the first time it escapes and become visible to another thread. This is the same as a call to malloc(). Argh, right. I think the isAllocLikeFn check in the thread local case should become (isAllocLikeFn \|\| isa<AllocaInst>). I believe that will handle the cases you've mentioned. I guess you're right, that will solve most issues. Thanks for the (failed :-) ) sanity check. I think we may want both things, though - this can fire even if the alloca is captured (or if it's a global or a function argument instead of the alloca), if bar is argmemonly/readnone.

reames added inline comments.Dec 29 2016, 4:35 PM

test/Transforms/LICM/scalar_promote.ll
208 ↗	(On Diff #82677)	Completely agreed that we want both. Multiple ways to skin a cat are always welcome; sometimes one falls over where another doesn't. Why don't you spawn another thread for the thread local change and save this one for the current approach? I see that you've split out most of the pieces, so once we rebase this should be much easier to review. I'd really like to see both pieces land in the next day or so.

So, the patch is now fairly small, but I'm having trouble with the testcases.
In particular, I think what we have now is unsound, because isGuaranteedToTransferExecutionToSuccessor() returns true for argmemonly calls, which means that things that should only get promoted with my patch get promoted without.

I'll get back to this on Tuesday. For now, I'll upload the code change, without fixing the tests, in case you want to take a look at the semantic change.

mkuper updated this revision to Diff 82714.Dec 29 2016, 6:03 PM

mkuper edited edge metadata.

In D28147#632595, @mkuper wrote:

So, the patch is now fairly small, but I'm having trouble with the testcases.
In particular, I think what we have now is unsound, because isGuaranteedToTransferExecutionToSuccessor() returns true for argmemonly calls, which means that things that should only get promoted with my patch get promoted without.

We have a prevaling assumption that to throw an exception, you must write to some global (but unknown) location to do so. We effectively assume that any readonly call is also nounwind. As you point out, we also assume that argmemonly must be readonly with respect to that unknown magic location. That doesn't seem sound as the location written to might be reachable through an argument of the function. So, yes, I think this is wrong.

I'll get back to this on Tuesday. For now, I'll upload the code change, without fixing the tests, in case you want to take a look at the semantic change.

Thanks. I took a look and have only minor comments on what's left. Once we have the above issue fixed, this will be quick.

lib/Transforms/Scalar/LICM.cpp
928 ↗	(On Diff #82714)	if an exception is thrown on the first iteration.
1012 ↗	(On Diff #82714)	Expand this comment a bit. possibly: because it must have executed in the original program before reaching one of the inert points.

In D28147#632595, @mkuper wrote:

So, the patch is now fairly small, but I'm having trouble with the testcases.
In particular, I think what we have now is unsound, because isGuaranteedToTransferExecutionToSuccessor() returns true for argmemonly calls, which means that things that should only get promoted with my patch get promoted without.

That was an oversight, fixed in https://reviews.llvm.org/rL290794

I'll get back to this on Tuesday. For now, I'll upload the code change, without fixing the tests, in case you want to take a look at the semantic change.

After r290794, it looks like the tests are actually checking what they should.

efriedma added a subscriber: efriedma.Jan 3 2017, 12:27 PM

efriedma added inline comments.

test/Transforms/LICM/scalar_promote.ll
222 ↗	(On Diff #82919)	I'm pretty sure this transform is wrong. For example, `@capture` could create a mutex protecting `%local`, and lock it. Then `@opaque` could release the mutex, does some work, and lock the mutex. In that case, the value stored by `store i32 %x2, i32* %local` is visible to other threads in each iteration, so you can't sink the store out of the loop.

efriedma requested changes to this revision.Jan 3 2017, 12:27 PM

efriedma added a reviewer: efriedma.

This revision now requires changes to proceed.Jan 3 2017, 12:27 PM

efriedma added inline comments.Jan 3 2017, 12:32 PM

test/Transforms/LICM/scalar_promote.ll
222 ↗	(On Diff #82919)	Err, my example is overly complicated. You don't need multiple threads. The address of `%local` escapes, therefore `@opaque` could read it, therefore we have to update it every iteration of the loop.

efriedma resigned from this revision.Jan 3 2017, 12:34 PM

efriedma removed a reviewer: efriedma.

efriedma added inline comments.

test/Transforms/LICM/scalar_promote.ll
222 ↗	(On Diff #82919)	Err, sorry, ignore me, didn't notice the argmemonly annotation on `@opaque`. I need to think a bit.

mkuper added inline comments.Jan 3 2017, 12:57 PM

test/Transforms/LICM/scalar_promote.ll
222 ↗	(On Diff #82919)	I'm not actually changing this behavior. The transformation is already performed when @opaque is marked as nounwind. This patch only removes the nounwind requirement, which has no bearing on your scenario. In any case, I don't think the stores are required to be observable in this scenario (and you also need to pass the mutex as an argument, but, technically, that's possible), so I believe the transformation is still legal. If we reach the conclusion it's not, then it will still be legal if opaque is "argmemonly readonly", since readonly functions are not allowed to perform atomic writes. So we'll need to start requiring readonly (and not just argmemonly), but that's orthogonal to this patch.

I'm satisfied with this, with some minor tweaks.

lib/Transforms/Scalar/LICM.cpp
1019 ↗	(On Diff #82919)	This check is similar to llvm::isGuaranteedToExecute. Maybe worth noting the parallel here. If the loop throws, you're depending on the earlier check that the memory location is an alloca; you should explicitly note that here.
test/Transforms/LICM/scalar_promote.ll
222 ↗	(On Diff #82919)	Mmm... okay, I think I understand what's happening. From a memory model perspective, the rule is this: if the current thread performs a non-atomic store to `%local`, no other thread is allowed to read that memory until the current thread performs an atomic release (or equivalent). Therefore, if a loop doesn't contain an atomic release, and every exit is dominated by a non-atomic store, you can sink the store out of the loop. (An atomic release gets modeled by alias analysis as a read to all memory locations whose addresses have escaped, so the existence of the alias set implies the loop doesn't contain a relevant atomic release.) The wrinkle introduced by a call which throws is that there are additional exits which are not modeled in the CFG; the store might not actually dominate all exits. That said, if a loop throws, we only allow sinking stores to allocas, and allocas get deallocated when the loop throws, so we're still okay.

Thanks Eli.

Philip, do you have any more comments?

lib/Transforms/Scalar/LICM.cpp
1019 ↗	(On Diff #82919)	I'm not depending on it being an alloca. I'm only depending on it being dereferenceable - which is the DereferenceableInPH check and is documented. So I'm not sure how this can be more explicit.

efriedma added inline comments.Jan 4 2017, 3:19 PM

lib/Transforms/Scalar/LICM.cpp
1019 ↗	(On Diff #82919)	The implicit unwind edges from the loop aren't listed in ExitBlocks, so you aren't checking them. If the memory is an alloca, this doesn't matter because the unwind edge will end the lifetime of the alloca; if we somehow extended LICM so it could make implicit unwind edges explicit, the missing check might matter.

mkuper added inline comments.Jan 4 2017, 3:23 PM

lib/Transforms/Scalar/LICM.cpp
1019 ↗	(On Diff #82919)	Oh, got it, you mean the check on line 951. Right, will add a comment.

No further comments, LGTM

Closed by commit rL291171: [LICM] Allow promotion of some stores that are not guaranteed to execute. (authored by mkuper). · Explain WhyJan 5 2017, 12:52 PM

This revision was automatically updated to reflect the committed changes.

Diff 83289

llvm/trunk/lib/Transforms/Scalar/LICM.cpp

Show First 20 Lines • Show All 906 Lines • ▼ Show 20 Lines	bool llvm::promoteLoopAccessesToScalars(
// The safety property divides into two parts:		// The safety property divides into two parts:
// p1) The memory may not be dereferenceable on entry to the loop. In this		// p1) The memory may not be dereferenceable on entry to the loop. In this
// case, we can't insert the required load in the preheader.		// case, we can't insert the required load in the preheader.
// p2) The memory model does not allow us to insert a store along any dynamic		// p2) The memory model does not allow us to insert a store along any dynamic
// path which did not originally have one.		// path which did not originally have one.
//		//
// If at least one store is guaranteed to execute, both properties are		// If at least one store is guaranteed to execute, both properties are
// satisfied, and promotion is legal.		// satisfied, and promotion is legal.
		//
// This, however, is not a necessary condition. Even if no store/load is		// This, however, is not a necessary condition. Even if no store/load is
// guaranteed to execute, we can still establish these properties:		// guaranteed to execute, we can still establish these properties.
// (p1) by proving that hoisting the load into the preheader is		// We can establish (p1) by proving that hoisting the load into the preheader
// safe (i.e. proving dereferenceability on all paths through the loop). We		// is safe (i.e. proving dereferenceability on all paths through the loop). We
// can use any access within the alias set to prove dereferenceability,		// can use any access within the alias set to prove dereferenceability,
// since they're all must alias.		// since they're all must alias.
// (p2) by proving the memory is thread-local, so the memory model		//
		// There are two ways establish (p2):
		// a) Prove the location is thread-local. In this case the memory model
// requirement does not apply, and stores are safe to insert.		// requirement does not apply, and stores are safe to insert.
		// b) Prove a store dominates every exit block. In this case, if an exit
		// blocks is reached, the original dynamic path would have taken us through
		// the store, so inserting a store into the exit block is safe. Note that this
		// is different from the store being guaranteed to execute. For instance,
		// if an exception is thrown on the first iteration of the loop, the original
		// store is never executed, but the exit blocks are not executed either.

bool DereferenceableInPH = false;		bool DereferenceableInPH = false;
bool SafeToInsertStore = false;		bool SafeToInsertStore = false;

SmallVector<Instruction *, 64> LoopUses;		SmallVector<Instruction *, 64> LoopUses;
SmallPtrSet<Value *, 4> PointerMustAliases;		SmallPtrSet<Value *, 4> PointerMustAliases;

// We start with an alignment of one and try to find instructions that allow		// We start with an alignment of one and try to find instructions that allow
▲ Show 20 Lines • Show All 65 Lines • ▼ Show 20 Lines	for (User *U : ASIV->users()) {
(InstAlignment > Alignment)) {		(InstAlignment > Alignment)) {
if (isGuaranteedToExecute(*UI, DT, CurLoop, SafetyInfo)) {		if (isGuaranteedToExecute(*UI, DT, CurLoop, SafetyInfo)) {
DereferenceableInPH = true;		DereferenceableInPH = true;
SafeToInsertStore = true;		SafeToInsertStore = true;
Alignment = std::max(Alignment, InstAlignment);		Alignment = std::max(Alignment, InstAlignment);
}		}
}		}

		// If a store dominates all exit blocks, it is safe to sink.
		// As explained above, if an exit block was executed, a dominating
		// store must have been been executed at least once, so we are not
		// introducing stores on paths that did not have them.
		// Note that this only looks at explicit exit blocks. If we ever
		// start sinking stores into unwind edges (see above), this will break.
		if (!SafeToInsertStore)
		SafeToInsertStore = llvm::all_of(ExitBlocks, [&](BasicBlock *Exit) {
		return DT->dominates(Store->getParent(), Exit);
		});

// If the store is not guaranteed to execute, we may still get		// If the store is not guaranteed to execute, we may still get
// deref info through it.		// deref info through it.
if (!DereferenceableInPH) {		if (!DereferenceableInPH) {
DereferenceableInPH = isDereferenceableAndAlignedPointer(		DereferenceableInPH = isDereferenceableAndAlignedPointer(
Store->getPointerOperand(), Store->getAlignment(), MDL,		Store->getPointerOperand(), Store->getAlignment(), MDL,
Preheader->getTerminator(), DT);		Preheader->getTerminator(), DT);
}		}
} else		} else
▲ Show 20 Lines • Show All 177 Lines • Show Last 20 Lines

llvm/trunk/test/Transforms/LICM/scalar_promote.ll

	Show First 20 Lines • Show All 180 Lines • ▼ Show 20 Lines

	; CHECK: for.body.lr.ph:			; CHECK: for.body.lr.ph:
	; CHECK-NEXT: %gi.promoted = load i32, i32* %gi, align 4, !tbaa !0			; CHECK-NEXT: %gi.promoted = load i32, i32* %gi, align 4, !tbaa !0
	; CHECK: for.cond.for.end_crit_edge:			; CHECK: for.cond.for.end_crit_edge:
	; CHECK-NEXT: %[[LCSSAPHI:.*]] = phi i32 [ %inc			; CHECK-NEXT: %[[LCSSAPHI:.*]] = phi i32 [ %inc
	; CHECK-NEXT: store i32 %[[LCSSAPHI]], i32* %gi, align 4, !tbaa !0			; CHECK-NEXT: store i32 %[[LCSSAPHI]], i32* %gi, align 4, !tbaa !0
	}			}

				declare i32 @opaque(i32) argmemonly
				declare void @capture(i32*)

				; We can promote even if opaque may throw.
				define i32 @test7() {
				; CHECK-LABEL: @test7(
				; CHECK: entry:
				; CHECK-NEXT: %local = alloca
				; CHECK-NEXT: call void @capture(i32* %local)
				; CHECK-NEXT: load i32, i32* %local
				; CHECK-NEXT: br label %loop
				; CHECK: exit:
				; CHECK-NEXT: %[[LCSSAPHI:.*]] = phi i32 [ %x2, %loop ]
				; CHECK-NEXT: store i32 %[[LCSSAPHI]], i32* %local
				; CHECK-NEXT: %ret = load i32, i32* %local
				; CHECK-NEXT: ret i32 %ret
				entry:
				%local = alloca i32
				call void @capture(i32* %local)
				br label %loop

				loop:
				%j = phi i32 [ 0, %entry ], [ %next, %loop ]
				%x = load i32, i32* %local
				%x2 = call i32 @opaque(i32 %x) ; Note this does not capture %local
				store i32 %x2, i32* %local
				%next = add i32 %j, 1
				%cond = icmp eq i32 %next, 0
				br i1 %cond, label %exit, label %loop

				exit:
				%ret = load i32, i32* %local
				ret i32 %ret
				}

				; Make sure we don't promote if the store is really control-flow dependent.
				define i32 @test7bad() {
				; CHECK-LABEL: @test7bad(
				; CHECK: entry:
				; CHECK-NEXT: %local = alloca
				; CHECK-NEXT: call void @capture(i32* %local)
				; CHECK-NEXT: br label %loop
				; CHECK: if:
				; CHECK-NEXT: store i32 %x2, i32* %local
				; CHECK-NEXT: br label %else
				; CHECK: exit:
				; CHECK-NEXT: %ret = load i32, i32* %local
				; CHECK-NEXT: ret i32 %ret
				entry:
				%local = alloca i32
				call void @capture(i32* %local)
				br label %loop
				loop:
				%j = phi i32 [ 0, %entry ], [ %next, %else ]
				%x = load i32, i32* %local
				%x2 = call i32 @opaque(i32 %x) ; Note this does not capture %local
				%cmp = icmp eq i32 %x2, 0
				br i1 %cmp, label %if, label %else

				if:
				store i32 %x2, i32* %local
				br label %else

				else:
				%next = add i32 %j, 1
				%cond = icmp eq i32 %next, 0
				br i1 %cond, label %exit, label %loop

				exit:
				%ret = load i32, i32* %local
				ret i32 %ret
				}

				; Even if neither the load nor the store or guaranteed to execute because
				; opaque() may throw, we can still promote - the load not being guaranteed
				; doesn't block us, because %local is always dereferenceable.
				define i32 @test8() {
				; CHECK-LABEL: @test8(
				; CHECK: entry:
				; CHECK-NEXT: %local = alloca
				; CHECK-NEXT: call void @capture(i32* %local)
				; CHECK-NEXT: load i32, i32* %local
				; CHECK-NEXT: br label %loop
				; CHECK: exit:
				; CHECK-NEXT: %[[LCSSAPHI:.*]] = phi i32 [ %x2, %loop ]
				; CHECK-NEXT: store i32 %[[LCSSAPHI]], i32* %local
				; CHECK-NEXT: %ret = load i32, i32* %local
				; CHECK-NEXT: ret i32 %ret
				entry:
				%local = alloca i32
				call void @capture(i32* %local)
				br label %loop

				loop:
				%j = phi i32 [ 0, %entry ], [ %next, %loop ]
				%throwaway = call i32 @opaque(i32 %j)
				%x = load i32, i32* %local
				%x2 = call i32 @opaque(i32 %x)
				store i32 %x2, i32* %local
				%next = add i32 %j, 1
				%cond = icmp eq i32 %next, 0
				br i1 %cond, label %exit, label %loop

				exit:
				%ret = load i32, i32* %local
				ret i32 %ret
				}


				; If the store is "guaranteed modulo exceptions", and the load depends on
				; control flow, we can only promote if the pointer is otherwise known to be
				; dereferenceable
				define i32 @test9() {
				; CHECK-LABEL: @test9(
				; CHECK: entry:
				; CHECK-NEXT: %local = alloca
				; CHECK-NEXT: call void @capture(i32* %local)
				; CHECK-NEXT: load i32, i32* %local
				; CHECK-NEXT: br label %loop
				; CHECK: exit:
				; CHECK-NEXT: %[[LCSSAPHI:.*]] = phi i32 [ %x2, %else ]
				; CHECK-NEXT: store i32 %[[LCSSAPHI]], i32* %local
				; CHECK-NEXT: %ret = load i32, i32* %local
				; CHECK-NEXT: ret i32 %ret
				entry:
				%local = alloca i32
				call void @capture(i32* %local)
				br label %loop

				loop:
				%j = phi i32 [ 0, %entry ], [ %next, %else ]
				%j2 = call i32 @opaque(i32 %j)
				%cmp = icmp eq i32 %j2, 0
				br i1 %cmp, label %if, label %else

				if:
				%x = load i32, i32* %local
				br label %else

				else:
				%x2 = phi i32 [ 0, %loop ], [ %x, %if]
				store i32 %x2, i32* %local
				%next = add i32 %j, 1
				%cond = icmp eq i32 %next, 0
				br i1 %cond, label %exit, label %loop

				exit:
				%ret = load i32, i32* %local
				ret i32 %ret
				}

				define i32 @test9bad(i32 %i) {
				; CHECK-LABEL: @test9bad(
				; CHECK: entry:
				; CHECK-NEXT: %local = alloca
				; CHECK-NEXT: call void @capture(i32* %local)
				; CHECK-NEXT: %notderef = getelementptr
				; CHECK-NEXT: br label %loop
				; CHECK: if:
				; CHECK-NEXT: load i32, i32* %notderef
				; CHECK-NEXT: br label %else
				; CHECK: exit:
				; CHECK-NEXT: %ret = load i32, i32* %notderef
				; CHECK-NEXT: ret i32 %ret
				entry:
				%local = alloca i32
				call void @capture(i32* %local)
				%notderef = getelementptr i32, i32* %local, i32 %i
				br label %loop

				loop:
				%j = phi i32 [ 0, %entry ], [ %next, %else ]
				%j2 = call i32 @opaque(i32 %j)
				%cmp = icmp eq i32 %j2, 0
				br i1 %cmp, label %if, label %else

				if:
				%x = load i32, i32* %notderef
				br label %else

				else:
				%x2 = phi i32 [ 0, %loop ], [ %x, %if]
				store i32 %x2, i32* %notderef
				%next = add i32 %j, 1
				%cond = icmp eq i32 %next, 0
				br i1 %cond, label %exit, label %loop

				exit:
				%ret = load i32, i32* %notderef
				ret i32 %ret
				}

	!0 = !{!4, !4, i64 0}			!0 = !{!4, !4, i64 0}
	!1 = !{!"omnipotent char", !2}			!1 = !{!"omnipotent char", !2}
	!2 = !{!"Simple C/C++ TBAA"}			!2 = !{!"Simple C/C++ TBAA"}
	!3 = !{!5, !5, i64 0}			!3 = !{!5, !5, i64 0}
	!4 = !{!"int", !1}			!4 = !{!"int", !1}
	!5 = !{!"float", !1}			!5 = !{!"float", !1}

This is an archive of the discontinued LLVM Phabricator instance.

[LICM] Allow promotion of some stores that are not guaranteed to execute
ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 83289

llvm/trunk/lib/Transforms/Scalar/LICM.cpp

llvm/trunk/test/Transforms/LICM/scalar_promote.ll

This is an archive of the discontinued LLVM Phabricator instance.

[LICM] Allow promotion of some stores that are not guaranteed to executeClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 83289

llvm/trunk/lib/Transforms/Scalar/LICM.cpp

llvm/trunk/test/Transforms/LICM/scalar_promote.ll

[LICM] Allow promotion of some stores that are not guaranteed to execute
ClosedPublic