This is an archive of the discontinued LLVM Phabricator instance.

Differential D65204

[GVN] Also invalidate users of instructions replaced due to conditionals.
Abandoned · Public

Authored by fhahn on Jul 24 2019, 5:52 AM.

Details

Reviewers

john.brawn
efriedma
hfinkel
reames

Summary

We also need to invalidate the users of the replaced value, as they could
be GEPs indexed by the replaced value. In that case, the cache would
contain stale entries and return invalid results.

Fixes PR31651.

Diff Detail

Repository

rG LLVM Github Monorepo

Build Status

Buildable 35579
Build 35578: arc lint + arc unit

Event Timeline

fhahn created this revision. Jul 24 2019, 5:52 AM

Herald added a project: Restricted Project. Jul 24 2019, 5:52 AM

Herald added a subscriber: hiraditya.

Harbormaster completed remote builds in B35579: Diff 211479. Jul 24 2019, 5:52 AM

I'm not sure this fix is actually solving a real issue, as opposed to merely hiding the issue for the given testcase.

Replacing a use of an integer value with a use of an equal integer value could make alias analysis return different results, but the old result should still be correct; you haven't actually changed the alias properties of any memory operations. So reusing the result of any queries cached by MemDep should still produce a correct result (maybe an overly conservative result, but that should be okay). Or does MemDep caching not work like that for some reason?

In D65204#1599820, @efriedma wrote:

I'm not sure this fix is actually solving a real issue, as opposed to merely hiding the issue for the given testcase.

Replacing a use of an integer value with a use of an equal integer value could make alias analysis return different results, but the old result should still be correct; you haven't actually changed the alias properties of any memory operations. So reusing the result of any queries cached by MemDep should still produce a correct result (maybe an overly conservative result, but that should be okay). Or does MemDep caching not work like that for some reason?

My understanding of the non-local query caching is the following: MemDepAnalysis keeps a map from query pointer values to a list of discovered (BB, Dependency) pairs. It derives the pointer values by starting with the original pointer and then translating the address through PHIs in the predecessors. PHITranslateAddr tries to translate the pointer to an existing pointer by looking at the existing uses of the translated PHI. In the example, we translate %_tmp20 through %step1.7.0 to getelementptr [4 x i16], [4 x i16]* %ub.16, i16 0, i16 %step1.7.0, and it returns the first existing matching value, %_tmp47. Using the pointer, we find Defs %_tmp48 = load i16, i16* %_tmp47, align 2 and %_tmp58 = load i16, i16* %_tmp57, align 2 and cache them using the translated pointer (%_tmp47).

Now GVN comes along and replaces the use of %step1.7.0 in %_tmp47 with 0.

Later we translate %_tmp78 through %i.8.0 to getelementptr [4 x i16], [4 x i16]* %ub.16, i16 0, i16 0, find the equivalent value %_tmp47, and use that as the key into the cache. The cache now returns the defs found through %_tmp47 and %_tmp57, even though those pointers now access different addresses.

It seems like MemDepAnalysis is using syntactic equivalences for effective caching (and therefore assumes the IR is not modified under it), and GVN in this case destroys the syntactic equivalence. Unless I am missing something, invalidating the cache seems reasonable here (I think the same argument would also hold for replacing equivalent pointers).
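To make the failure mode concrete, here is a minimal, self-contained C++ sketch of a cache that is keyed by the identity of a pointer instruction rather than by the address it computes. It is a toy model under stated assumptions, not LLVM code: ToyGEP, ToyCache, findExisting and secondQueryHitsStaleEntry are hypothetical names, and the real MemDepAnalysis data structures are considerably more involved.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a GEP: a name plus one index operand, which is
// either an SSA value name or a constant, both stored as text.
struct ToyGEP {
  std::string Name;  // e.g. "%_tmp47"
  std::string Index; // e.g. "%step1.7.0" or "0"
};

// Hypothetical stand-in for the non-local cache: entries are keyed by the
// name of the pointer instruction, not by the address expression.
class ToyCache {
  std::map<std::string, std::vector<std::string>> Entries;

public:
  void record(const ToyGEP &P, std::vector<std::string> Defs) {
    Entries[P.Name] = std::move(Defs);
  }
  const std::vector<std::string> *lookup(const ToyGEP &P) const {
    auto It = Entries.find(P.Name);
    return It == Entries.end() ? nullptr : &It->second;
  }
  void invalidate(const ToyGEP &P) { Entries.erase(P.Name); }
};

// Return the first existing GEP whose index operand matches syntactically.
const ToyGEP *findExisting(const std::vector<ToyGEP> &GEPs,
                           const std::string &Index) {
  for (const ToyGEP &G : GEPs)
    if (G.Index == Index)
      return &G;
  return nullptr;
}

// Replays the sequence from the comment above in the toy model and reports
// whether the second query is answered from an entry recorded for a
// different address.
bool secondQueryHitsStaleEntry(bool InvalidateOnReplace) {
  std::vector<ToyGEP> GEPs;
  GEPs.push_back(ToyGEP{"%_tmp47", "%step1.7.0"});
  ToyCache Cache;

  // Query 1: the translated address is indexed by %step1.7.0.
  const ToyGEP *P1 = findExisting(GEPs, "%step1.7.0");
  assert(P1 != nullptr);
  Cache.record(*P1, {std::string("%_tmp48"), std::string("%_tmp58")});

  // GVN-like rewrite: the index operand of %_tmp47 becomes the constant 0.
  GEPs[0].Index = "0";
  if (InvalidateOnReplace)
    Cache.invalidate(GEPs[0]);

  // Query 2: the translated address is indexed by 0 and now matches %_tmp47.
  const ToyGEP *P2 = findExisting(GEPs, "0");
  return P2 != nullptr && Cache.lookup(*P2) != nullptr;
}
```

In this model, secondQueryHitsStaleEntry(false) returns true because the entry recorded for the index %step1.7.0 is served to a query about index 0, while secondQueryHitsStaleEntry(true) returns false because the entry was dropped when the operand changed.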

Now there is still a potential problem as we just invalidate values 2 levels deep (the value itself and its users). If MemDepAnalysis goes deeper than looking at the immediate operands to find equivalences, we still have invalid values in the cache. dberlin mentioned that he might have a test case for such a scenario.
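The depth concern can be illustrated with a small, self-contained C++ sketch on a toy use graph. It compares the set covered by "the value and its direct users" with the set reachable through any chain of uses. UseGraph, twoLevel, transitive and exampleChain are hypothetical names, the chain %idx -> %add -> %gep is an invented example, and whether MemDepAnalysis actually depends on deeper operands is exactly the open question raised above.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical use graph: maps a value name to the names of its users.
using UseGraph = std::map<std::string, std::vector<std::string>>;

// Models the scope of the patch in this toy setting: the value itself plus
// its direct users.
std::set<std::string> twoLevel(const UseGraph &Users, const std::string &V) {
  std::set<std::string> Out;
  Out.insert(V);
  auto It = Users.find(V);
  if (It != Users.end())
    Out.insert(It->second.begin(), It->second.end());
  return Out;
}

// Hypothetical alternative: everything reachable from V along use edges,
// collected with a worklist.
std::set<std::string> transitive(const UseGraph &Users, const std::string &V) {
  std::set<std::string> Seen;
  Seen.insert(V);
  std::vector<std::string> Worklist;
  Worklist.push_back(V);
  while (!Worklist.empty()) {
    std::string Cur = Worklist.back();
    Worklist.pop_back();
    auto It = Users.find(Cur);
    if (It == Users.end())
      continue;
    for (const std::string &U : It->second)
      if (Seen.insert(U).second) // true only for values not visited before
        Worklist.push_back(U);
  }
  return Seen;
}

// Invented chain: %add uses %idx, and %gep is indexed by %add.
UseGraph exampleChain() {
  UseGraph G;
  G["%idx"].push_back("%add");
  G["%add"].push_back("%gep");
  return G;
}
```

On the invented chain, twoLevel(exampleChain(), "%idx") covers %idx and %add but not %gep, whereas transitive also reaches %gep. The transitive variant trades extra invalidation work and more cache misses for not having to know how deep the syntactic matching looks.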

Oh, okay, that makes more sense.

cache them using the translated pointer (%_tmp47)

Does it really make sense to cache the load %_tmp58 using the pointer %_tmp47, as opposed to the pointer %_tmp57? There isn't any dominance relationship between %_tmp47 and %_tmp58.

Now there is still a potential problem as we just invalidate values 2 levels deep

This seems concerning, yes.

In D65204#1600091, @efriedma wrote:

Oh, okay, that makes more sense.

cache them using the translated pointer (%_tmp47)

Does it really make sense to cache the load %_tmp58 using the pointer %_tmp47, as opposed to the pointer %_tmp57? There isn't any dominance relationship between %_tmp47 and %_tmp58.

I think it makes sense in this use case, given we just want to find a single value to represent equivalent addresses and we are not using them to replace anything. But it requires careful cache invalidation and we get that wrong in a bunch of places.

We could ask for addresses that dominate the blocks we are translating through, but that would both limit the number of addresses we can translate and increase the number of cache misses (6 existing GVN test cases fail when requiring dominating addresses). What do you think? Should we fix the issues with invalidation or restrict the caching/translating?

Ping. Eli, do you think we should change the values we use as results of PHI translations, given my last comment?

uabelho added a subscriber: uabelho. Aug 10 2022, 1:00 AM

Herald added a project: Restricted Project. Aug 10 2022, 1:00 AM

IIUC the underlying issue has been fixed in a different patch.

Revision Contents

Path                                                     Size
llvm/lib/Transforms/Scalar/GVN.cpp                       20 lines
llvm/test/Transforms/GVN/condprop-md-invalidation2.ll    72 lines

Diff 211479

llvm/lib/Transforms/Scalar/GVN.cpp

[1,713 lines not shown; enclosing context: while (!Worklist.empty()) {]
     // have the simple case where the edge dominates the end.
     if (RootDominatesEnd && !isa<Instruction>(RHS))
       addToLeaderTable(LVN, RHS, Root.getEnd());
 
     // Replace all occurrences of 'LHS' with 'RHS' everywhere in the scope. As
     // LHS always has at least one use that is not dominated by Root, this will
     // never do anything if LHS has only one use.
     if (!LHS->hasOneUse()) {
+      // Cached information for anything that uses LHS will be invalid.
+      if (MD) {
+        for (auto *U : LHS->users())
+          MD->invalidateCachedPointerInfo(U);
+        MD->invalidateCachedPointerInfo(LHS);
+      }
+
       unsigned NumReplacements =
           DominatesByEdge
               ? replaceDominatedUsesWith(LHS, RHS, *DT, Root)
               : replaceDominatedUsesWith(LHS, RHS, *DT, Root.getStart());
 
       Changed |= NumReplacements > 0;
       NumGVNEqProp += NumReplacements;
-      // Cached information for anything that uses LHS will be invalid.
-      if (MD)
-        MD->invalidateCachedPointerInfo(LHS);
     }
 
     // Now try to deduce additional equalities from this one. For example, if
     // the known equality was "(A != B)" == "false" then it follows that A and B
     // are equal in the scope. Only boolean equalities with an explicit true or
     // false RHS are currently supported.
     if (!RHS->getType()->isIntegerTy(1))
       // Not a boolean equality - bail out.
[52 lines not shown; enclosing context: if (CmpInst *Cmp = dyn_cast<CmpInst>(LHS)) {]
       // appropriate instruction (if any).
       uint32_t NextNum = VN.getNextUnusedValueNumber();
       uint32_t Num = VN.lookupOrAddCmp(Cmp->getOpcode(), NotPred, Op0, Op1);
       // If the number we were assigned was brand new then there is no point in
       // looking for an instruction realizing it: there cannot be one!
       if (Num < NextNum) {
         Value *NotCmp = findLeader(Root.getEnd(), Num);
         if (NotCmp && isa<Instruction>(NotCmp)) {
+          // Cached information for anything that uses NotCmp will be invalid.
+          if (MD) {
+            for (auto *U : NotCmp->users())
+              MD->invalidateCachedPointerInfo(U);
+            MD->invalidateCachedPointerInfo(NotCmp);
+          }
+
           unsigned NumReplacements =
               DominatesByEdge
                   ? replaceDominatedUsesWith(NotCmp, NotVal, *DT, Root)
                   : replaceDominatedUsesWith(NotCmp, NotVal, *DT,
                                              Root.getStart());
           Changed |= NumReplacements > 0;
           NumGVNEqProp += NumReplacements;
-          // Cached information for anything that uses NotCmp will be invalid.
-          if (MD)
-            MD->invalidateCachedPointerInfo(NotCmp);
         }
       }
       // Ensure that any instruction in scope that gets the "A < B" value number
       // is replaced with false.
       // The leader table only tracks basic blocks, not edges. Only add to if we
       // have the simple case where the edge dominates the end.
       if (RootDominatesEnd)
         addToLeaderTable(Num, NotVal, Root.getEnd());
[772 lines not shown]

llvm/test/Transforms/GVN/condprop-md-invalidation2.ll

This file was added.

; RUN: opt -gvn %s -S | FileCheck %s

; Test case for PR31651.

target datalayout = "p:16:16"

declare void @CVAL_VERIFY_FUNC(i16)

declare i16 @gen()

define i16 @main() #1 {
  %ub.16 = alloca [4 x i16], align 2
  br label %bb1

bb1: ; preds = %bb12, %0
  %step1.7.0 = phi i16 [ 0, %0 ], [ %_tmp72, %bb12 ]
  %_tmp13 = icmp eq i16 %step1.7.0, 0
  br i1 %_tmp13, label %bb5, label %bb4

bb4: ; preds = %bb1
  %_tmp18 = add i16 %step1.7.0, -1
  %_tmp20 = getelementptr [4 x i16], [4 x i16]* %ub.16, i16 0, i16 %_tmp18
  %_tmp21 = load i16, i16* %_tmp20, align 2
  br label %bb5

bb5: ; preds = %bb1, %bb4
  %step1.7.0.sink = phi i16 [ %step1.7.0, %bb4 ], [ 0, %bb1 ]
  %_tmp22.sink = phi i16 [ %_tmp21, %bb4 ], [ 10, %bb1 ]
  %_tmp26 = getelementptr [4 x i16], [4 x i16]* %ub.16, i16 0, i16 %step1.7.0.sink
  store i16 %_tmp22.sink, i16* %_tmp26, align 2
  %_tmp28 = icmp eq i16 %step1.7.0, 0
  br i1 %_tmp28, label %bb7, label %bb8

; CHECK-LABEL: bb7:
; CHECK-NEXT: %_tmp47 = getelementptr [4 x i16], [4 x i16]* %ub.16, i16 0, i16 0
; CHECK-NEXT: %_tmp48 = load i16, i16* %_tmp47, align 2
bb7: ; preds = %bb_usw2
  %_tmp47 = getelementptr [4 x i16], [4 x i16]* %ub.16, i16 0, i16 %step1.7.0
  %_tmp48 = load i16, i16* %_tmp47, align 2
  call void @CVAL_VERIFY_FUNC(i16 %_tmp48)
  br label %bb12

bb8: ; preds = %bb_usw2
  %_tmp57 = getelementptr [4 x i16], [4 x i16]* %ub.16, i16 0, i16 %step1.7.0
  %_tmp58 = load i16, i16* %_tmp57, align 2
  call void @CVAL_VERIFY_FUNC(i16 %_tmp58)
  br label %bb12

bb12: ; preds = %bb9, %bb11, %bb10
  %_tmp72 = add i16 %step1.7.0, 1
  %_tmp74 = icmp slt i16 %_tmp72, 3
  br i1 %_tmp74, label %bb1, label %bb14


; CHECK-LABEL: bb14:
; CHECK-NOT: %_tmp79 = phi
; CHECK-NEXT: %i.8.0 = phi i16
; CHECK: %_tmp79 = load i16, i16* %_tmp78, align 2
bb14: ; preds = %bb12, %bb14
  %i.8.0 = phi i16 [ %_tmp91, %bb14 ], [ 0, %bb12 ]
  %_tmp78 = getelementptr [4 x i16], [4 x i16]* %ub.16, i16 0, i16 %i.8.0
  %_tmp79 = load i16, i16* %_tmp78, align 2
  call void @CVAL_VERIFY_FUNC(i16 %_tmp79)
  %_tmp91 = add i16 %i.8.0, 1
  %_tmp93 = icmp slt i16 %_tmp91, 101
  br i1 %_tmp93, label %bb14, label %bb17

bb17: ; preds = %bb14
  ret i16 0

  uselistorder [4 x i16]* %ub.16, { 4, 3, 2, 0, 1 }
}