This is an archive of the discontinued LLVM Phabricator instance.

Differential D49425

[MemorySSAUpdater] Update Phi operands after trivial Phi elimination
ClosedPublic

Authored by labrinea on Jul 17 2018, 7:08 AM.

Details

Reviewers

llvm-commits
efriedma
george.burgess.iv

Commits

rGbf6009c234d3: [MemorySSAUpdater] Update Phi operands after trivial Phi elimination
rL337680: [MemorySSAUpdater] Update Phi operands after trivial Phi elimination

Summary

Bug fix for PR37445. The regression test is a reduced version of the original reproducer attached to the bug report. The underlying problem and its fix are similar to PR37808. The bug lies in MemorySSAUpdater::getPreviousDefRecursive(), where PhiOps is computed before the call to tryRemoveTrivialPhi() and it ends up being out of date, pointing to stale data.

Diff Detail

Repository: rL LLVM

Event Timeline

labrinea created this revision. Jul 17 2018, 7:08 AM

Herald added a subscriber: Prazek. Jul 17 2018, 7:08 AM

A few remarks:

SmallVector<WeakVH, 8> PhiOps fixes the bug on its own (without the rest of the changes), and I am wondering why.
When we mark a block as visited, why do we cache it? When the recursion ends we might trivially remove the Phi. In that case the second cache insertion for the same key block should fail, no?
Do we ever reach the PHIExistsButNeedsUpdate case? Is it when a Phi existed beforehand, meaning we did not create it? I can't think of another way to reach that state.
Interestingly enough the reproducer only made opt crash in bitcode form and not in IR form.

Thanks for this!

Looks like the MSSA that we're starting with here already has a redundant Phi (6 = MemoryPhi({_, liveOnEntry}, {_2, 6})), so it's unsurprising that we try to remove it. In general, I'm noticing that MSSA tries really hard to keep this nice minimal Phi form, but it's really easy for a user to inadvertently make a Phi trivially removable with a RAUW or similar. Fixing this is likely nontrivial + out of the scope of this patch, so I'm happy to support this for now. It's just a mildly sad state of affairs. :)

Do we ever reach the PHIExistsButNeedsUpdate case? Is it when a Phi existed beforehand, meaning we did not create it? I can't think of another way to reach that state.

Good point. In general, we'd hit that case if we called getPreviousDefRecursive on a BB with > 1 pred with an existing Phi, but I don't see how that can happen today. The only callers are:

getPreviousDef, which, unless it's given a Phi, is guaranteed to not call getPreviousDefRecursive if there's a Phi in the same BB as its argument.
getPreviousDefFromEnd, which has an identical guarantee (except it takes a BB instead of a MemoryAccess)

...And we only ever call getPreviousDef with Uses/Defs.

Replacing that code with llvm_unreachable gave 0 errors in a clang bootstrap, so I assume it's dead code. If you'd like to remove it (or replace it with an assertion), please do so in a separate patch.

SmallVector<WeakVH, 8> PhiOps fixes the bug on its own (without the rest of the changes), and I am wondering why.

Best guess: we end up calling tryRemoveTrivialPhi(nullptr, VectorOfWeakVH), which ends up seeing a vector consisting of {nullptr, liveOnEntry} (nullptr being a special value we use in tryRemoveTrivialPhi, so passing it in as an operand is really sketchy), which then "recurses" on and returns liveOnEntry. liveOnEntry != nullptr, so we don't try to create a phi/etc.

I doubt all of that is right, but ...
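To make the guess above easier to follow, here is a minimal standalone sketch of the operand scan, modeled only on the tryRemoveTrivialPhi loop shown in this revision's diff. `Access` and `scanForTrivialPhi` are hypothetical names, not LLVM types or APIs, and the real function also performs the RAUW, the removal and the recursion on users, which are left out here.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a MemoryAccess; not an LLVM type.
struct Access {
  const char *Name;
};

// Mirrors only the operand scan from the diff: returns Phi when the
// operands disagree, otherwise the single distinct operand, or
// LiveOnEntry when there is no non-self operand at all.
const Access *scanForTrivialPhi(const Access *Phi,
                                const std::vector<const Access *> &Operands,
                                const Access *LiveOnEntry) {
  assert(LiveOnEntry && "the toy model needs a liveOnEntry stand-in");
  const Access *Same = nullptr;
  for (const Access *Op : Operands) {
    // An operand equal to the Phi itself, or to the value seen so far,
    // does not prevent simplification. Note that a null operand compares
    // equal to a null Phi, which is the sketchy case discussed above.
    if (Op == Phi || Op == Same)
      continue;
    // A second distinct operand means the Phi is not trivial.
    if (Same)
      return Phi;
    Same = Op;
  }
  if (Same == nullptr)
    return LiveOnEntry;
  return Same;
}
```

Under this model, calling the scan with a null Phi and the operands {nullptr, liveOnEntry} skips the null operand and returns liveOnEntry, which matches the behavior guessed at above.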

When we mark a block as visited why do we cache it? When the recursion ends we might trivially remove the Phi. In that case the second cache insertion for the same key block should fail, no?

Not sure I understand the question. Do you mean "why do we cache the Phi we've just created if we're visiting a block for the second time?" If so, the cache will automatically track the removal if we do ultimately end up removing the phi. If not, we'll fill the Phi in when we wind our stack back to the bit that populates it, so the second cache insertion should be a no-op anyway.

Interestingly enough the reproducer only made opt crash in bitcode form and not in IR form.

Does the original test-case crash reliably as IR for you? If so, please use that instead. (Phab won't let me download the attached bitcode, but with asan, I see use-after-free crashes 100% of the time in the original repro).

lib/Analysis/MemorySSAUpdater.cpp
68 ↗	(On Diff #155876)	Should this be a `TrackingVH<MemoryAccess>` instead? That way, we don't need the "Phi ops may be out of date" loop below.
90 ↗	(On Diff #155876)	`cast_or_null`, please. We know this'll be a `MemoryAccess` if it's non-null.

Does the original test-case crash reliably as IR for you? If so, please use that instead. (Phab won't let me download the attached bitcode, but with asan, I see use-after-free crashes 100% of the time in the original repro).

It does, but with opt -S -O3 ./tc_memphi_gvnhoist.ll -enable-gvn-hoist. Running bugpoint on that command yields the bitcode I uploaded.

Not sure I understand the question. Do you mean "why do we cache the Phi we've just created if we're visiting a block for the second time?" If so, the cache will automatically track the removal if we do ultimately end up removing the phi. If not, we'll fill the Phi in when we wind our stack back to the bit that populates it, so the second cache insertion should be a no-op anyway.

Imagine we call getPreviousDefRecursive on a BB with > 1 pred. Assume there's no Phi yet. We mark the BB as visited and start collecting PhiOps with recursive calls via getPreviousDefFromEnd. By the time we reach BB again we create an empty Phi and cache it for the first time (line 62). As all the PhiOps are collected we call tryRemoveTrivialPhi which might have replaced the Phi with another MemoryAccess. Then we try to cache that MemoryAccess with the same BB key (line 110), which should return false.

Should this be a TrackingVH<MemoryAccess> instead? That way, we don't need the "Phi ops may be out of date" loop below.

Works, but still it's not clear to me why. What happens if the PhiOp is a deleted Phi itself? Do we still keep the instance of the dead Phi even though MSSA has deleted the corresponding MemoryAccess? How would this work without the updating loop?

If the bitcode is crashing but the textual IR isn't, you're probably getting bitten by use-list ordering. You can use the preserve-ll-uselistorder option for "opt" to preserve it in IR.

In D49425#1167032, @efriedma wrote:

If the bitcode is crashing but the textual IR isn't, you're probably getting bitten by use-list ordering. You can use the preserve-ll-uselistorder option for "opt" to preserve it in IR.

Yep, that worked.

Waiting for @george.burgess.iv's comments on the following to proceed accordingly.

In D49425#1166187, @labrinea wrote:

Not sure I understand the question. Do you mean "why do we cache the Phi we've just created if we're visiting a block for the second time?" If so, the cache will automatically track the removal if we do ultimately end up removing the phi. If not, we'll fill the Phi in when we wind our stack back to the bit that populates it, so the second cache insertion should be a no-op anyway.

Imagine we call getPreviousDefRecursive on a BB with > 1 pred. Assume there's no Phi yet. We mark the BB as visited and start collecting PhiOps with recursive calls via getPreviousDefFromEnd. By the time we reach BB again we create an empty Phi and cache it for the first time (line 62). As all the PhiOps are collected we call tryRemoveTrivialPhi which might have replaced the Phi with another MemoryAccess. Then we try to cache that MemoryAccess with the same BB key (line 110), which should return false.

Should this be a TrackingVH<MemoryAccess> instead? That way, we don't need the "Phi ops may be out of date" loop below.

Works, but still it's not clear to me why. What happens if the PhiOp is a deleted Phi itself? Do we still keep the instance of the dead Phi even though MSSA has deleted the corresponding MemoryAccess? How would this work without the updating loop?

As all the PhiOps are collected we call tryRemoveTrivialPhi which might have replaced the Phi with another MemoryAccess. Then we try to cache that MemoryAccess with the same BB key (line 110), which should return false

Thanks for the clarification. I don't think that can happen, since the only way we can remove a Phi is if we call tryRemoveTrivialPhi directly on it (which doesn't happen), or if we recursively call it on said Phi. We only recurse to Users of arbitrary MemoryAccesses, and the blank Phis we create are Users of nothing until we fill their operands in.

If you'd like, you're welcome to add something like assert(Inserted || CachedPreviousDef[BB] == Result); to be sure of this, but like I said, I don't see how it can happen.
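The suggested assertion relies on insert-if-absent semantics: a second insertion with the same key leaves the existing entry untouched and reports failure. A minimal sketch of that invariant, using std::map in place of the real DenseMap and plain integers in place of blocks and accesses (`cacheResult`, `BlockId` and `AccessId` are hypothetical names for illustration only):

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-ins: blocks and accesses are plain ints here.
using BlockId = int;
using AccessId = int;

// Records Result for BB unless an entry already exists. A failed
// insertion is only acceptable when the cached entry already agrees
// with Result, which is what the assertion checks.
bool cacheResult(std::map<BlockId, AccessId> &Cache, BlockId BB,
                 AccessId Result) {
  auto [It, Inserted] = Cache.insert({BB, Result});
  assert((Inserted || It->second == Result) &&
         "cached previous def disagrees with the new result");
  return Inserted;
}
```

In this toy the second call with the same key and value returns false and leaves the map unchanged. The real cache stores TrackingVH<MemoryAccess> values, so an entry can also change when the access it points at is replaced; the sketch does not model that.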

Works, but still it's not clear to me why. What happens if the PhiOp is a deleted Phi itself?

MSSA Phi deletion requires replacing all Uses (...in the LLVM sense of Use, not MemoryUse) of the to-be-deleted Phi with another MemoryAccess prior to us actually deleting the Phi. As a part of this replacement, the TrackingVHs will all automatically get pointed at this new MemoryAccess.
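As a rough illustration of why a tracking handle stays valid where a raw pointer goes stale, here is a self-contained toy model. `Value` and `TrackingHandle` below are simplified stand-ins written for this example; LLVM's actual Value and TrackingVH are implemented differently, and only the "handles follow replaceAllUsesWith" behavior is being sketched.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Value;

// Toy handle that registers itself with the value it points at.
class TrackingHandle {
public:
  explicit TrackingHandle(Value *V) { attach(V); }
  TrackingHandle(const TrackingHandle &Other) { attach(Other.Ptr); }
  TrackingHandle &operator=(const TrackingHandle &Other) {
    if (this != &Other) {
      detach();
      attach(Other.Ptr);
    }
    return *this;
  }
  ~TrackingHandle() { detach(); }
  Value *get() const { return Ptr; }

private:
  friend struct Value;
  void attach(Value *V);
  void detach();
  Value *Ptr = nullptr;
};

// Toy value that knows which handles currently point at it.
struct Value {
  std::vector<TrackingHandle *> Handles;

  // Repoint every registered handle at New (New must differ from this).
  void replaceAllUsesWith(Value *New) {
    assert(New != this && "cannot replace a value with itself");
    std::vector<TrackingHandle *> Old;
    Old.swap(Handles);
    for (TrackingHandle *H : Old)
      H->attach(New);
  }
};

void TrackingHandle::attach(Value *V) {
  Ptr = V;
  if (V)
    V->Handles.push_back(this);
}

void TrackingHandle::detach() {
  if (!Ptr)
    return;
  auto &H = Ptr->Handles;
  H.erase(std::remove(H.begin(), H.end(), this), H.end());
  Ptr = nullptr;
}
```

With a vector of raw `Value *` collected before the replacement, the entries still point at the old (soon to be deleted) value afterwards, whereas a vector of `TrackingHandle` built from the same values ends up pointing at the replacement. That is the property the PhiOps change in this revision depends on.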

Changes to prior revision.

Removed the update loop for PhiOps and used TrackingVH<MemoryAccess> instead.
Replaced the Bitcode reproducer with IR using -preserve-ll-uselistorder.

LGTM after one nit is addressed. Thanks again!

lib/Analysis/MemorySSAUpdater.cpp
90 ↗	(On Diff #156466)	`Phi` should always be non-null now

This revision is now accepted and ready to land. Jul 20 2018, 2:05 PM

Closed by commit rL337680: [MemorySSAUpdater] Update Phi operands after trivial Phi elimination (authored by alelab01). Jul 23 2018, 3:57 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/lib/Analysis/MemorySSAUpdater.cpp	28 lines
llvm/trunk/test/Transforms/GVNHoist/pr37445.ll	119 lines

Diff 156739

llvm/trunk/lib/Analysis/MemorySSAUpdater.cpp

@@ ... @@ if (VisitedBlocks.count(BB)) {
     // insert useless phis is if we have irreducible control flow.
     MemoryAccess *Result = MSSA->createMemoryPhi(BB);
     CachedPreviousDef.insert({BB, Result});
     return Result;
   }
 
   if (VisitedBlocks.insert(BB).second) {
     // Mark us visited so we can detect a cycle
-    SmallVector<MemoryAccess *, 8> PhiOps;
+    SmallVector<TrackingVH<MemoryAccess>, 8> PhiOps;
 
     // Recurse to get the values in our predecessors for placement of a
     // potential phi node. This will insert phi nodes if we cycle in order to
     // break the cycle and have an operand.
     for (auto *Pred : predecessors(BB))
       PhiOps.push_back(getPreviousDefFromEnd(Pred, CachedPreviousDef));
 
     // Now try to simplify the ops to avoid placing a phi.
     // This may return null if we never created a phi yet, that's okay
     MemoryPhi *Phi = dyn_cast_or_null<MemoryPhi>(MSSA->getMemoryAccess(BB));
-    bool PHIExistsButNeedsUpdate = false;
-    // See if the existing phi operands match what we need.
-    // Unlike normal SSA, we only allow one phi node per block, so we can't just
-    // create a new one.
-    if (Phi && Phi->getNumOperands() != 0)
-      if (!std::equal(Phi->op_begin(), Phi->op_end(), PhiOps.begin())) {
-        PHIExistsButNeedsUpdate = true;
-      }
 
     // See if we can avoid the phi by simplifying it.
     auto *Result = tryRemoveTrivialPhi(Phi, PhiOps);
     // If we couldn't simplify, we may have to create a phi
     if (Result == Phi) {
       if (!Phi)
         Phi = MSSA->createMemoryPhi(BB);
 
-      // These will have been filled in by the recursive read we did above.
-      if (PHIExistsButNeedsUpdate) {
-        std::copy(PhiOps.begin(), PhiOps.end(), Phi->op_begin());
-        std::copy(pred_begin(BB), pred_end(BB), Phi->block_begin());
+      // See if the existing phi operands match what we need.
+      // Unlike normal SSA, we only allow one phi node per block, so we can't just
+      // create a new one.
+      if (Phi->getNumOperands() != 0) {
+        // FIXME: Figure out whether this is dead code and if so remove it.
+        if (!std::equal(Phi->op_begin(), Phi->op_end(), PhiOps.begin())) {
+          // These will have been filled in by the recursive read we did above.
+          std::copy(PhiOps.begin(), PhiOps.end(), Phi->op_begin());
+          std::copy(pred_begin(BB), pred_end(BB), Phi->block_begin());
+        }
       } else {
         unsigned i = 0;
         for (auto *Pred : predecessors(BB))
-          Phi->addIncoming(PhiOps[i++], Pred);
+          Phi->addIncoming(&*PhiOps[i++], Pred);
         InsertedPHIs.push_back(Phi);
       }
       Result = Phi;
     }
 
     // Set ourselves up for the next variable by resetting visited state.
     VisitedBlocks.erase(BB);
     CachedPreviousDef.insert({BB, Result});
@@ ... @@ MemoryAccess *MemorySSAUpdater::tryRemoveTrivialPhi(MemoryPhi *Phi,
   MemoryAccess *Same = nullptr;
   for (auto &Op : Operands) {
     // If the same or self, good so far
     if (Op == Phi || Op == Same)
       continue;
     // not the same, return the phi since it's not eliminatable by us
     if (Same)
       return Phi;
-    Same = cast<MemoryAccess>(Op);
+    Same = cast<MemoryAccess>(&*Op);
   }
   // Never found a non-self reference, the phi is undef
   if (Same == nullptr)
     return MSSA->getLiveOnEntryDef();
   if (Phi) {
     Phi->replaceAllUsesWith(Same);
     removeMemoryAccess(Phi);
   }

llvm/trunk/test/Transforms/GVNHoist/pr37445.ll

; RUN: opt < %s -early-cse-memssa -gvn-hoist -S | FileCheck %s

; Make sure opt won't crash and that this pair of
; instructions (load, icmp) is hoisted successfully
; from bb45 and bb58 to bb41.

@g_10 = external global i32, align 4
@g_536 = external global i8*, align 8
@g_1629 = external global i32**, align 8
@g_963 = external global i32**, align 8
@g_1276 = external global i32**, align 8

;CHECK-LABEL: @func_22

define void @func_22(i32* %arg, i32* %arg1) {
bb:
  br label %bb12

bb12:
  %tmp3.0 = phi i32 [ undef, %bb ], [ %tmp40, %bb36 ]
  %tmp7.0 = phi i32 [ undef, %bb ], [ %spec.select, %bb36 ]
  %tmp14 = icmp eq i32 %tmp3.0, 6
  br i1 %tmp14, label %bb41, label %bb15

bb15:
  %tmp183 = trunc i16 0 to i8
  %tmp20 = load i8*, i8** @g_536, align 8
  %tmp21 = load i8, i8* %tmp20, align 1
  %tmp23 = or i8 %tmp21, %tmp183
  store i8 %tmp23, i8* %tmp20, align 1
  %tmp5.i = icmp eq i8 %tmp23, 0
  br i1 %tmp5.i, label %safe_div_func_uint8_t_u_u.exit, label %bb8.i

bb8.i:
  %0 = udiv i8 1, %tmp23
  br label %safe_div_func_uint8_t_u_u.exit

safe_div_func_uint8_t_u_u.exit:
  %tmp13.in.i = phi i8 [ %0, %bb8.i ], [ 1, %bb15 ]
  %tmp31 = icmp eq i8 %tmp13.in.i, 0
  %spec.select = select i1 %tmp31, i32 %tmp7.0, i32 53
  %tmp35 = icmp eq i32 %spec.select, 0
  br i1 %tmp35, label %bb36, label %bb41

bb36:
  %tmp38 = sext i32 %tmp3.0 to i64
  %tmp40 = trunc i64 %tmp38 to i32
  br label %bb12

;CHECK: bb41:
;CHECK: %tmp47 = load i32, i32* %arg1, align 4
;CHECK: %tmp48 = icmp eq i32 %tmp47, 0

bb41:
  %tmp43 = load i32, i32* %arg, align 4
  %tmp44 = icmp eq i32 %tmp43, 0
  br i1 %tmp44, label %bb52, label %bb45

;CHECK: bb45:
;CHECK-NOT: %tmp47 = load i32, i32* %arg1, align 4
;CHECK-NOT: %tmp48 = icmp eq i32 %tmp47, 0

bb45:
  %tmp47 = load i32, i32* %arg1, align 4
  %tmp48 = icmp eq i32 %tmp47, 0
  br i1 %tmp48, label %bb50, label %bb64

bb50:
  %tmp51 = load volatile i32**, i32*** @g_963, align 8
  unreachable

bb52:
  %tmp8.0 = phi i32 [ undef, %bb41 ], [ %tmp57, %bb55 ]
  %tmp54 = icmp slt i32 %tmp8.0, 3
  br i1 %tmp54, label %bb55, label %bb58

bb55:
  %tmp57 = add nsw i32 %tmp8.0, 1
  br label %bb52

;CHECK: bb58:
;CHECK-NOT: %tmp60 = load i32, i32* %arg1, align 4
;CHECK-NOT: %tmp61 = icmp eq i32 %tmp60, 0

bb58:
  %tmp60 = load i32, i32* %arg1, align 4
  %tmp61 = icmp eq i32 %tmp60, 0
  br i1 %tmp61, label %bb62, label %bb64

bb62:
  %tmp63 = load volatile i32**, i32*** @g_1276, align 8
  unreachable

bb64:
  %tmp65 = load volatile i32**, i32*** @g_1629, align 8
  unreachable

  ; uselistorder directives
  uselistorder i32 %spec.select, { 1, 0 }
  uselistorder i32* %arg1, { 1, 0 }
  uselistorder label %bb64, { 1, 0 }
  uselistorder label %bb52, { 1, 0 }
  uselistorder label %bb41, { 1, 0 }
  uselistorder label %safe_div_func_uint8_t_u_u.exit, { 1, 0 }
}

define zeroext i8 @safe_div_func_uint8_t_u_u(i8 zeroext %arg, i8 zeroext %arg1) {
bb:
  %tmp5 = icmp eq i8 %arg1, 0
  br i1 %tmp5, label %bb12, label %bb8

bb8:
  %0 = udiv i8 %arg, %arg1
  br label %bb12

bb12:
  %tmp13.in = phi i8 [ %0, %bb8 ], [ %arg, %bb ]
  ret i8 %tmp13.in
}