This is an archive of the discontinued LLVM Phabricator instance.

Differential D92401

[BasicAA] Handle two unknown sizes for GEPs
ClosedPublic

Authored by nikic on Dec 1 2020, 9:43 AM.

Details

Reviewers

asbirlea
jdoerfert

Commits

rG8b1c4e310c2f: [BasicAA] Handle two unknown sizes for GEPs

Summary

If we have two unknown sizes and one GEP operand and one non-GEP operand, then we currently simply return MayAlias. The comment says we can't do anything useful ... but we can! We can still check that the underlying objects are different (and do so for the GEP-GEP case).

To reduce the compile-time impact, this a) checks this early, before doing the relatively expensive GEP decomposition that will not be used and b) doesn't do the check if the other operand is a phi or select. In that case, the phi/select will already recurse, so this would just do two slightly different recursive walks that arrive at the same roots.

Compile-time is still a bit of a mixed bag: https://llvm-compile-time-tracker.com/compare.php?from=624af932a808b363a888139beca49f57313d9a3b&to=845356e14adbe651a553ed11318ddb5e79a24bcd&stat=instructions On average this is a small improvement, but sqlite with ThinLTO has a 0.5% regression (lencod has a 1% improvement).

The BasicAA test case checks this by using two memsets with unknown size. However, the more interesting case where this is useful is the LoopVectorize test case, as analysis of accesses in loops tends to always use unknown sizes.
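The early exit described in the summary can be illustrated with a small standalone model. This is only a sketch and does not use LLVM's API: `Ptr`, `Kind`, `Size`, `underlying`, `aliasBases` and this `aliasGEP` are hypothetical names, and the rule that two distinct `Object` nodes never overlap is a simplifying assumption standing in for the real base-object query.

```cpp
#include <cassert>
#include <optional>

// Toy model only: these types are illustrative and are not LLVM's.
enum class Kind { Object, GEP, Phi, Select };
enum class Alias { NoAlias, MayAlias };

struct Ptr {
  Kind kind;
  const Ptr *base = nullptr; // set only for Kind::GEP
};

// Access size in bytes; std::nullopt models an unknown size.
using Size = std::optional<unsigned>;

// Strip GEPs until a non-GEP value is reached.
const Ptr *underlying(const Ptr *p) {
  while (p->kind == Kind::GEP) {
    assert(p->base && "a GEP must have a base pointer");
    p = p->base;
  }
  return p;
}

// Assumed toy rule: two distinct Object nodes never overlap.
Alias aliasBases(const Ptr *a, const Ptr *b) {
  bool distinctObjects =
      a != b && a->kind == Kind::Object && b->kind == Kind::Object;
  return distinctObjects ? Alias::NoAlias : Alias::MayAlias;
}

// Sketch of the early exit at the top of a GEP query.
Alias aliasGEP(const Ptr *gep, Size s1, const Ptr *v2, Size s2) {
  assert(gep->kind == Kind::GEP);
  if (!s1 && !s2) {
    // Phi/select operands are left to their own recursive handling.
    bool v2Recurses = v2->kind == Kind::Phi || v2->kind == Kind::Select;
    if (!v2Recurses &&
        aliasBases(underlying(gep), underlying(v2)) == Alias::NoAlias)
      return Alias::NoAlias;
    return Alias::MayAlias;
  }
  // Offset-based reasoning (GEP decomposition) would go here; elided.
  return Alias::MayAlias;
}
```

Under these toy rules, a GEP based on one object queried against a different object with both sizes unknown gives NoAlias, while a phi as the second operand gives MayAlias from this function, since that case is deferred to the phi handling.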

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

nikic created this revision. Dec 1 2020, 9:43 AM

Herald added a project: Restricted Project. Dec 1 2020, 9:43 AM

Herald added subscribers: llvm-commits, hiraditya.

nikic requested review of this revision. Dec 1 2020, 9:43 AM

Harbormaster completed remote builds in B80678: Diff 308687. Dec 1 2020, 10:28 AM

This kind of logic should always be valid, regardless if V1 is a GEP or not, right? Is there a way to do this check early or late for any query?

In D92401#2426402, @jdoerfert wrote:

This kind of logic should always be valid, regardless if V1 is a GEP or not, right? Is there a way to do this check early or late for any query?

The general idea is valid, but the way it is applied depends on the operation. For GEPs we just strip to the base pointer. For phis and selects, we recurse over the phi/select operands (with special casing for the case of identical control dependence).

In D92401#2426503, @nikic wrote:

The general idea is valid, but the way it is applied depends on the operation. For GEPs we just strip to the base pointer. For phis and selects, we recurse over the phi/select operands (with special casing for the case of identical control dependence).

Hm, I'm confused (but also not deep enough in the BasicAA code to argue). I assumed that this new logic, which is put in a function that has a precondition (isa<GEP>(V1)), could be placed in a function without that precondition. The logic does not utilize the fact that V1 is a GEP. Anyway, it was just a thought.

In D92401#2427794, @jdoerfert wrote:

Hm, I'm confused (but also not deep enough in the BasicAA code to argue). I assumed that this new logic, which is put in a function that has a precondition (isa<GEP>(V1)), could be placed in a function without that precondition. The logic does not utilize the fact that V1 is a GEP. Anyway, it was just a thought.

BasicAA first does a stripPointerCastsAndInvariantGroups(). After that operation we are left with either a GEP, a Phi, a Select, or an underlying object (in the sense of something that cannot be further inspected). The BasicAA code thus handles these three cases specially with recursive queries. The reason this is in the GEP code is that we need to strip away the GEP there to arrive at the underlying object. So while isa<GEP> is not directly used, if the value weren't a GEP, then this code wouldn't do anything useful (as we'd have already arrived at the underlying object).

Hope that makes some sense...
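To make the case split above concrete, here is a minimal standalone sketch of the "strip, then dispatch on GEP / phi / select / underlying object" shape. It is a toy model, not BasicAA: `Ptr`, `Kind`, `stripCasts` and `aliasWithObject` are hypothetical names, sizes and offsets are ignored entirely, and the pointer graph is assumed to be acyclic (real phis in loops need extra care that is not modeled here).

```cpp
#include <cassert>
#include <vector>

// Toy model only: illustrative types, not LLVM's.
enum class Kind { Object, Cast, GEP, Phi, Select };
enum class Alias { NoAlias, MayAlias };

struct Ptr {
  Kind kind;
  std::vector<const Ptr *> ops; // Cast/GEP: {base}; Phi/Select: incoming values
};

// Look through casts, which never change the pointed-to object.
const Ptr *stripCasts(const Ptr *p) {
  while (p->kind == Kind::Cast) {
    assert(p->ops.size() == 1);
    p = p->ops[0];
  }
  return p;
}

// Compares an arbitrary pointer against a known Object, ignoring offsets.
Alias aliasWithObject(const Ptr *v, const Ptr *obj) {
  assert(obj->kind == Kind::Object);
  v = stripCasts(v);
  switch (v->kind) {
  case Kind::GEP:
    // A GEP stays within its base, so strip it and ask again.
    assert(v->ops.size() == 1);
    return aliasWithObject(v->ops[0], obj);
  case Kind::Phi:
  case Kind::Select:
    // Every possible incoming value must be disjoint from obj.
    for (const Ptr *op : v->ops)
      if (aliasWithObject(op, obj) == Alias::MayAlias)
        return Alias::MayAlias;
    return Alias::NoAlias;
  case Kind::Object:
    // Already at the underlying object: nothing left to strip.
    return v == obj ? Alias::MayAlias : Alias::NoAlias;
  case Kind::Cast:
    break; // not reachable after stripCasts
  }
  return Alias::MayAlias;
}
```

In this sketch the `Object` case is where a base comparison on a non-GEP value would add nothing, because the value is already its own underlying object. That matches the reasoning for keeping the check in the GEP path.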

Ruin formatting using clang-format :(

ping :)

LGTM, with your explanation it makes sense to place it there. The logic seemed sound from the start.

This revision is now accepted and ready to land. Dec 10 2020, 12:42 PM

Closed by commit rG8b1c4e310c2f: [BasicAA] Handle two unknown sizes for GEPs (authored by nikic). Dec 11 2020, 9:46 AM

This revision was automatically updated to reflect the committed changes.

nikic added a commit: rG8b1c4e310c2f: [BasicAA] Handle two unknown sizes for GEPs.

With this change, it appears LLVM gets stuck in a loop when building MultiSource/Benchmarks/MiBench/consumer-lame with -O3 & LTO. To reproduce, build MultiSource/Benchmarks/MiBench/consumer-lame/consumer-lame with -O3 & LTO on X86.

I am not sure if there are any public bots that run with that configuration, but it is causing some issues with our internal testing. Given that, I am inclined to revert this change for now, unless there's a quick fix. I'll try to reduce the input IR for the hang.

fhahn added a reverting change: rGa74941da716d: Revert "[BasicAA] Handle two unknown sizes for GEPs". Dec 18 2020, 9:59 AM

I reverted the change for now in a74941da716d and filed https://bugs.llvm.org/show_bug.cgi?id=48553 which contains a smallish reproducer to show the increase in compile-time. I am not sure if the full input actually hangs or just takes a very long time to finish.

@fhahn Thank you! For this test case we end up recursing through a deep "gep of phi of gep of phi of ..." chain, where we split up into two branches at each phi, resulting in exponential runtime. Not recursing in this case, while somewhat arbitrary, ends up cutting off the recursion at the first "gep of phi" where we no longer have known sizes.

I think the right way to address this is to introduce a proper recursion limit in BasicAA, though that would take some care to ensure cache consistency.
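The blow-up and the effect of a recursion limit can be sketched with a standalone toy walk. This is not the BasicAA implementation and not the change proposed in any follow-up revision: `Ptr`, `buildChain` and `countVisits` are hypothetical names, the walk has no caching, and the cache-consistency concern mentioned above is not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <vector>

// Toy model only: illustrative types, not LLVM's.
enum class Kind { Object, GEP, Phi };

struct Ptr {
  Kind kind;
  std::vector<const Ptr *> ops; // GEP: {base}; Phi: incoming values
};

// Builds "gep of phi of gep of phi ..." with `levels` phis, each phi having
// two incoming GEPs based on the next phi (or on the root object at the end).
// Nodes live in `storage`; std::deque keeps addresses stable while growing.
const Ptr *buildChain(std::deque<Ptr> &storage, unsigned levels) {
  storage.push_back({Kind::Object, {}});
  const Ptr *inner = &storage.back();
  for (unsigned i = 0; i < levels; ++i) {
    storage.push_back({Kind::GEP, {inner}});
    const Ptr *gepA = &storage.back();
    storage.push_back({Kind::GEP, {inner}});
    const Ptr *gepB = &storage.back();
    storage.push_back({Kind::Phi, {gepA, gepB}});
    inner = &storage.back();
  }
  storage.push_back({Kind::GEP, {inner}});
  return &storage.back();
}

// Counts how many nodes a naive recursive walk touches. A depth of 0 stops
// the walk, which models giving up and answering MayAlias conservatively.
std::uint64_t countVisits(const Ptr *p, unsigned depthLeft) {
  assert(p && "null node");
  if (depthLeft == 0)
    return 1;
  std::uint64_t visits = 1;
  for (const Ptr *op : p->ops)
    visits += countVisits(op, depthLeft - 1);
  return visits;
}
```

In this model the walk splits in two at every phi, so each extra level roughly doubles the work: 10 levels cost 4094 visits and 11 levels cost 8190 when the depth budget is large enough to be irrelevant. With a depth budget of 6, the same walk over a 20-level chain stops after 22 visits, at the price of a conservative answer.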

nikic mentioned this in D96647: [BasicAA] Add depth limit. Feb 13 2021, 7:04 AM

nikic mentioned this in D97401: [basicaa] Recurse through a single phi input. Feb 24 2021, 1:20 PM

Revision Contents

Path                                                                          Size
llvm/lib/Analysis/BasicAliasAnalysis.cpp                                      20 lines
llvm/test/Analysis/BasicAA/phi-aa.ll                                          3 lines
llvm/test/Analysis/BasicAA/recphi.ll                                          3 lines
llvm/test/Transforms/LoopVectorize/ARM/pointer_iv.ll                          59 lines
llvm/test/Transforms/LoopVersioning/exit-block-dominates-rt-check-block.ll   6 lines

Diff 311262

llvm/lib/Analysis/BasicAliasAnalysis.cpp

@@ 1,092 lines not shown @@
 ///
 /// We know that V1 is a GEP, but we don't know anything about V2.
 /// UnderlyingV1 is getUnderlyingObject(GEP1), UnderlyingV2 is the same for
 /// V2.
 AliasResult BasicAAResult::aliasGEP(
     const GEPOperator *GEP1, LocationSize V1Size, const AAMDNodes &V1AAInfo,
     const Value *V2, LocationSize V2Size, const AAMDNodes &V2AAInfo,
     const Value *UnderlyingV1, const Value *UnderlyingV2, AAQueryInfo &AAQI) {
+  // If both accesses are unknown size, we can only check whether the
+  // underlying objects are different.
+  if (!V1Size.hasValue() && !V2Size.hasValue()) {
+    // If the other operand is a phi/select, let phi/select handling perform
+    // this check. Otherwise the same recursive walk is done twice.
+    if (!isa<PHINode>(V2) && !isa<SelectInst>(V2)) {
+      AliasResult BaseAlias =
+          aliasCheck(UnderlyingV1, LocationSize::beforeOrAfterPointer(),
+                     AAMDNodes(), UnderlyingV2,
+                     LocationSize::beforeOrAfterPointer(), AAMDNodes(), AAQI);
+      if (BaseAlias == NoAlias)
+        return NoAlias;
+    }
+    return MayAlias;
+  }
+
   DecomposedGEP DecompGEP1 = DecomposeGEPExpression(GEP1, DL, &AC, DT);
   DecomposedGEP DecompGEP2 = DecomposeGEPExpression(V2, DL, &AC, DT);

   // Don't attempt to analyze the decomposed GEP if index scale is not a
   // compile-time constant.
   if (!DecompGEP1.HasCompileTimeConstantScale ||
       !DecompGEP2.HasCompileTimeConstantScale)
     return MayAlias;
@@ 42 lines not shown @@ if (const GEPOperator *GEP2 = dyn_cast<GEPOperator>(V2)) {
     DecompGEP1.Offset -= DecompGEP2.Offset;
     GetIndexDifference(DecompGEP1.VarIndices, DecompGEP2.VarIndices);

   } else {
     // Check to see if these two pointers are related by the getelementptr
     // instruction. If one pointer is a GEP with a non-zero index of the other
     // pointer, we know they cannot alias.

-    // If both accesses are unknown size, we can't do anything useful here.
-    if (!V1Size.hasValue() && !V2Size.hasValue())
-      return MayAlias;
-
     AliasResult R = aliasCheck(
         UnderlyingV1, LocationSize::beforeOrAfterPointer(), AAMDNodes(),
         V2, V2Size, V2AAInfo, AAQI, nullptr, UnderlyingV2);
     if (R != MustAlias) {
       // If V2 may alias GEP base pointer, conservatively returns MayAlias.
       // If V2 is known not to alias GEP base pointer, then the two values
       // cannot alias per GEP semantics: "Any memory access must be done through
       // a pointer value associated with an address range of the memory access,
@@ 711 lines not shown @@

llvm/test/Analysis/BasicAA/phi-aa.ll

@@ 146 lines not shown @@ loop:
   %p1.next = getelementptr i32, i32* %p1, i64 1
   %p2.next = getelementptr i32, i32* %p1, i64 2
   store i32 0, i32* %p1
   store i32 0, i32* %p2
   br label %loop
 }

 ; CHECK-LABEL: phi_and_gep_unknown_size
-; CHECK: Just Mod: call void @llvm.memset.p0i8.i32(i8* %g, i8 0, i32 %size, i1 false) <-> call void @llvm.memset.p0i8.i32(i8* %z, i8 0, i32 %size, i1 false)
-; TODO: This should be NoModRef.
+; CHECK: NoModRef: call void @llvm.memset.p0i8.i32(i8* %g, i8 0, i32 %size, i1 false) <-> call void @llvm.memset.p0i8.i32(i8* %z, i8 0, i32 %size, i1 false)
 define void @phi_and_gep_unknown_size(i1 %c, i8* %x, i8* %y, i8* noalias %z, i32 %size) {
 entry:
   br i1 %c, label %true, label %false

 true:
   br label %exit

 false:
@@ 35 lines not shown @@

llvm/test/Analysis/BasicAA/recphi.ll

@@ 261 lines not shown @@ exit:
   ret void
 }

 ; Same as the previous test case, but avoiding phi of phi.
 ; CHECK-LABEL: Function: nested_loop2
 ; CHECK: NoAlias: i8* %a, i8* %p.base
 ; CHECK: NoAlias: i8* %a, i8* %p.outer
 ; CHECK: NoAlias: i8* %a, i8* %p.outer.next
-; CHECK: MayAlias: i8* %a, i8* %p.inner
+; CHECK: NoAlias: i8* %a, i8* %p.inner
 ; CHECK: NoAlias: i8* %a, i8* %p.inner.next
-; TODO: (a, p.inner) could be NoAlias
 define void @nested_loop2(i1 %c, i1 %c2, i8* noalias %p.base) {
 entry:
   %a = alloca i8
   br label %outer_loop

 outer_loop:
   %p.outer = phi i8* [ %p.base, %entry ], [ %p.outer.next, %outer_loop_latch ]
   %p.outer.next = getelementptr inbounds i8, i8* %p.outer, i64 10
@@ 15 lines not shown @@

llvm/test/Transforms/LoopVectorize/ARM/pointer_iv.ll

@@ 870 lines not shown @@ for.body:
   %inc = add nuw nsw i32 %i.07, 1
   %exitcond = icmp eq i32 %inc, 10000
   br i1 %exitcond, label %for.cond.cleanup, label %for.body, !llvm.loop !2
 }

 define hidden void @mult_ptr_iv(i8* noalias nocapture readonly %x, i8* noalias nocapture %z) {
 ; CHECK-LABEL: @mult_ptr_iv(
 ; CHECK-NEXT: entry:
-; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, i8* [[Z:%.*]], i32 3000
-; CHECK-NEXT: [[SCEVGEP1:%.*]] = getelementptr i8, i8* [[X:%.*]], i32 3000
-; CHECK-NEXT: [[BOUND0:%.*]] = icmp ugt i8* [[SCEVGEP1]], [[Z]]
-; CHECK-NEXT: [[BOUND1:%.*]] = icmp ugt i8* [[SCEVGEP]], [[X]]
-; CHECK-NEXT: [[FOUND_CONFLICT:%.*]] = and i1 [[BOUND0]], [[BOUND1]]
-; CHECK-NEXT: br i1 [[FOUND_CONFLICT]], label [[FOR_BODY:%.*]], label [[VECTOR_PH:%.*]]
-; CHECK: vector.ph:
-; CHECK-NEXT: [[IND_END:%.*]] = getelementptr i8, i8* [[X]], i32 3000
-; CHECK-NEXT: [[IND_END3:%.*]] = getelementptr i8, i8* [[Z]], i32 3000
 ; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
 ; CHECK: vector.body:
-; CHECK-NEXT: [[POINTER_PHI:%.*]] = phi i8* [ [[X]], [[VECTOR_PH]] ], [ [[PTR_IND:%.*]], [[VECTOR_BODY]] ]
-; CHECK-NEXT: [[POINTER_PHI5:%.*]] = phi i8* [ [[Z]], [[VECTOR_PH]] ], [ [[PTR_IND6:%.*]], [[VECTOR_BODY]] ]
-; CHECK-NEXT: [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
+; CHECK-NEXT: [[POINTER_PHI:%.*]] = phi i8* [ [[X:%.*]], [[ENTRY:%.*]] ], [ [[PTR_IND:%.*]], [[VECTOR_BODY]] ]
+; CHECK-NEXT: [[POINTER_PHI4:%.*]] = phi i8* [ [[Z:%.*]], [[ENTRY]] ], [ [[PTR_IND5:%.*]], [[VECTOR_BODY]] ]
+; CHECK-NEXT: [[INDEX:%.*]] = phi i32 [ 0, [[ENTRY]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
 ; CHECK-NEXT: [[TMP0:%.*]] = getelementptr i8, i8* [[POINTER_PHI]], <4 x i32> <i32 0, i32 3, i32 6, i32 9>
-; CHECK-NEXT: [[TMP1:%.*]] = getelementptr i8, i8* [[POINTER_PHI5]], <4 x i32> <i32 0, i32 3, i32 6, i32 9>
+; CHECK-NEXT: [[TMP1:%.*]] = getelementptr i8, i8* [[POINTER_PHI4]], <4 x i32> <i32 0, i32 3, i32 6, i32 9>
 ; CHECK-NEXT: [[TMP2:%.*]] = getelementptr inbounds i8, <4 x i8*> [[TMP0]], i32 1
-; CHECK-NEXT: [[WIDE_MASKED_GATHER:%.*]] = call <4 x i8> @llvm.masked.gather.v4i8.v4p0i8(<4 x i8*> [[TMP0]], i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x i8> undef), !alias.scope !26
+; CHECK-NEXT: [[WIDE_MASKED_GATHER:%.*]] = call <4 x i8> @llvm.masked.gather.v4i8.v4p0i8(<4 x i8*> [[TMP0]], i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x i8> undef)
 ; CHECK-NEXT: [[TMP3:%.*]] = getelementptr inbounds i8, <4 x i8*> [[TMP0]], i32 2
-; CHECK-NEXT: [[WIDE_MASKED_GATHER7:%.*]] = call <4 x i8> @llvm.masked.gather.v4i8.v4p0i8(<4 x i8*> [[TMP2]], i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x i8> undef), !alias.scope !26
-; CHECK-NEXT: [[WIDE_MASKED_GATHER8:%.*]] = call <4 x i8> @llvm.masked.gather.v4i8.v4p0i8(<4 x i8*> [[TMP3]], i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x i8> undef), !alias.scope !26
+; CHECK-NEXT: [[WIDE_MASKED_GATHER6:%.*]] = call <4 x i8> @llvm.masked.gather.v4i8.v4p0i8(<4 x i8*> [[TMP2]], i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x i8> undef)
+; CHECK-NEXT: [[WIDE_MASKED_GATHER7:%.*]] = call <4 x i8> @llvm.masked.gather.v4i8.v4p0i8(<4 x i8*> [[TMP3]], i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x i8> undef)
 ; CHECK-NEXT: [[TMP4:%.*]] = mul <4 x i8> [[WIDE_MASKED_GATHER]], <i8 10, i8 10, i8 10, i8 10>
-; CHECK-NEXT: [[TMP5:%.*]] = mul <4 x i8> [[WIDE_MASKED_GATHER]], [[WIDE_MASKED_GATHER7]]
-; CHECK-NEXT: [[TMP6:%.*]] = mul <4 x i8> [[WIDE_MASKED_GATHER]], [[WIDE_MASKED_GATHER8]]
+; CHECK-NEXT: [[TMP5:%.*]] = mul <4 x i8> [[WIDE_MASKED_GATHER]], [[WIDE_MASKED_GATHER6]]
+; CHECK-NEXT: [[TMP6:%.*]] = mul <4 x i8> [[WIDE_MASKED_GATHER]], [[WIDE_MASKED_GATHER7]]
 ; CHECK-NEXT: [[TMP7:%.*]] = getelementptr inbounds i8, <4 x i8*> [[TMP1]], i32 1
-; CHECK-NEXT: call void @llvm.masked.scatter.v4i8.v4p0i8(<4 x i8> [[TMP4]], <4 x i8*> [[TMP1]], i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>), !alias.scope !29, !noalias !26
+; CHECK-NEXT: call void @llvm.masked.scatter.v4i8.v4p0i8(<4 x i8> [[TMP4]], <4 x i8*> [[TMP1]], i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>)
 ; CHECK-NEXT: [[TMP8:%.*]] = getelementptr inbounds i8, <4 x i8*> [[TMP1]], i32 2
-; CHECK-NEXT: call void @llvm.masked.scatter.v4i8.v4p0i8(<4 x i8> [[TMP5]], <4 x i8*> [[TMP7]], i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>), !alias.scope !29, !noalias !26
-; CHECK-NEXT: call void @llvm.masked.scatter.v4i8.v4p0i8(<4 x i8> [[TMP6]], <4 x i8*> [[TMP8]], i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>), !alias.scope !29, !noalias !26
+; CHECK-NEXT: call void @llvm.masked.scatter.v4i8.v4p0i8(<4 x i8> [[TMP5]], <4 x i8*> [[TMP7]], i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>)
+; CHECK-NEXT: call void @llvm.masked.scatter.v4i8.v4p0i8(<4 x i8> [[TMP6]], <4 x i8*> [[TMP8]], i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>)
 ; CHECK-NEXT: [[INDEX_NEXT]] = add i32 [[INDEX]], 4
 ; CHECK-NEXT: [[TMP9:%.*]] = icmp eq i32 [[INDEX_NEXT]], 1000
 ; CHECK-NEXT: [[PTR_IND]] = getelementptr i8, i8* [[POINTER_PHI]], i32 12
-; CHECK-NEXT: [[PTR_IND6]] = getelementptr i8, i8* [[POINTER_PHI5]], i32 12
-; CHECK-NEXT: br i1 [[TMP9]], label [[END:%.*]], label [[VECTOR_BODY]], [[LOOP31:!llvm.loop !.*]]
+; CHECK-NEXT: [[PTR_IND5]] = getelementptr i8, i8* [[POINTER_PHI4]], i32 12
+; CHECK-NEXT: br i1 [[TMP9]], label [[END:%.*]], label [[VECTOR_BODY]], [[LOOP26:!llvm.loop !.*]]
-; CHECK: for.body:
-; CHECK-NEXT: [[X_ADDR_050:%.*]] = phi i8* [ [[INCDEC_PTR2:%.*]], [[FOR_BODY]] ], [ [[X]], [[ENTRY:%.*]] ]
-; CHECK-NEXT: [[Z_ADDR_049:%.*]] = phi i8* [ [[INCDEC_PTR34:%.*]], [[FOR_BODY]] ], [ [[Z]], [[ENTRY]] ]
-; CHECK-NEXT: [[I_048:%.*]] = phi i32 [ [[INC:%.*]], [[FOR_BODY]] ], [ 0, [[ENTRY]] ]
-; CHECK-NEXT: [[INCDEC_PTR:%.*]] = getelementptr inbounds i8, i8* [[X_ADDR_050]], i32 1
-; CHECK-NEXT: [[TMP10:%.*]] = load i8, i8* [[X_ADDR_050]], align 1
-; CHECK-NEXT: [[INCDEC_PTR1:%.*]] = getelementptr inbounds i8, i8* [[X_ADDR_050]], i32 2
-; CHECK-NEXT: [[TMP11:%.*]] = load i8, i8* [[INCDEC_PTR]], align 1
-; CHECK-NEXT: [[INCDEC_PTR2]] = getelementptr inbounds i8, i8* [[X_ADDR_050]], i32 3
-; CHECK-NEXT: [[TMP12:%.*]] = load i8, i8* [[INCDEC_PTR1]], align 1
-; CHECK-NEXT: [[MUL:%.*]] = mul i8 [[TMP10]], 10
-; CHECK-NEXT: [[MUL1:%.*]] = mul i8 [[TMP10]], [[TMP11]]
-; CHECK-NEXT: [[MUL2:%.*]] = mul i8 [[TMP10]], [[TMP12]]
-; CHECK-NEXT: [[INCDEC_PTR32:%.*]] = getelementptr inbounds i8, i8* [[Z_ADDR_049]], i32 1
-; CHECK-NEXT: store i8 [[MUL]], i8* [[Z_ADDR_049]], align 1
-; CHECK-NEXT: [[INCDEC_PTR33:%.*]] = getelementptr inbounds i8, i8* [[Z_ADDR_049]], i32 2
-; CHECK-NEXT: store i8 [[MUL1]], i8* [[INCDEC_PTR32]], align 1
-; CHECK-NEXT: [[INCDEC_PTR34]] = getelementptr inbounds i8, i8* [[Z_ADDR_049]], i32 3
-; CHECK-NEXT: store i8 [[MUL2]], i8* [[INCDEC_PTR33]], align 1
-; CHECK-NEXT: [[INC]] = add nuw i32 [[I_048]], 1
-; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i32 [[INC]], 1000
-; CHECK-NEXT: br i1 [[EXITCOND]], label [[END]], label [[FOR_BODY]], [[LOOP32:!llvm.loop !.*]]
 ; CHECK: end:
 ; CHECK-NEXT: ret void
 ;
 entry:
   br label %for.body

 for.body:
   %x.addr.050 = phi i8* [ %incdec.ptr2, %for.body ], [ %x, %entry ]
@@ 30 lines not shown @@

llvm/test/Transforms/LoopVersioning/exit-block-dominates-rt-check-block.ll

 ; This test ensures loop versioning does not produce an invalid dominator tree
 ; if the exit block of the loop (bb0) dominates the runtime check block
 ; (bb1 will become the runtime check block).

 ; RUN: opt -loop-distribute -enable-loop-distribute -verify-dom-info -S -o - %s > %t
 ; RUN: opt -loop-simplify -loop-distribute -enable-loop-distribute -verify-dom-info -S -o - %s > %t
 ; RUN: FileCheck --check-prefix CHECK-VERSIONING -input-file %t %s

 ; RUN: opt -loop-versioning -verify-dom-info -S -o - %s > %t
 ; RUN: opt -loop-simplify -loop-versioning -verify-dom-info -S -o - %s > %t
 ; RUN: FileCheck --check-prefix CHECK-VERSIONING -input-file %t %s

 @c1 = external global i16

-define void @f(i16 %a) {
+define void @f(i16 %a, [1 x i32]* %p) {
   br label %bb0

 bb0:
   br label %bb1

 bb1:
   %tmp1 = load i16, i16* @c1
   br label %bb2

 bb2:
   %tmp2 = phi i16 [ %tmp1, %bb1 ], [ %tmp3, %bb2 ]
-  %tmp4 = getelementptr inbounds [1 x i32], [1 x i32]* undef, i32 0, i32 4
+  %tmp4 = getelementptr inbounds [1 x i32], [1 x i32]* %p, i32 0, i32 4
   store i32 1, i32* %tmp4
-  %tmp5 = getelementptr inbounds [1 x i32], [1 x i32]* undef, i32 0, i32 9
+  %tmp5 = getelementptr inbounds [1 x i32], [1 x i32]* %p, i32 0, i32 9
   store i32 0, i32* %tmp5
   %tmp3 = add i16 %tmp2, 1
   store i16 %tmp2, i16* @c1
   %tmp6 = icmp sle i16 %tmp3, 0
   br i1 %tmp6, label %bb2, label %bb0
 }

 ; Simple check to make sure loop versioning happened.
 ; CHECK-VERSIONING: bb2.lver.check: