This is an archive of the discontinued LLVM Phabricator instance.

Differential D20058

[SCEV] No-wrap flags are not propagated when folding "{S,+,X}+T ==> {S+T,+,X}"
ClosedPublic

Authored by iid_iunknown on May 8 2016, 10:22 AM.

Details

Reviewers

sanjoy

Commits

rGeb4eccae5c14: [SCEV] No-wrap flags are not propagated when folding "{S,+,X}+T ==> {S+T,+,X}"
rL270695: [SCEV] No-wrap flags are not propagated when folding "{S,+,X}+T ==> {S+T,+,X}"

Summary

Description

Because the no-wrap flags are dropped by this fold, WidenIV::widenIVUse (IndVarSimplify.cpp) fails to widen narrow IV uses in some cases. As a result, IndVarSimplify may not eliminate narrow IVs even when it is possible to do so, which produces inefficient code.

When WidenIV::widenIVUse gets a NarrowUse such as {(-2 + %inc.lcssa),+,1}<nsw><%for.body3>, it first tries to get a wide recurrence for it via the getWideRecurrence call.
getWideRecurrence returns a recurrence like this: {(sext i32 (-2 + %inc.lcssa) to i64),+,1}<nsw><%for.body3>.

Then a wide use operation is generated by cloneIVUser. The generated wide use is evaluated to {(-2 + (sext i32 %inc.lcssa to i64))<nsw>,+,1}<nsw><%for.body3>, which is different from the getWideRecurrence result. cloneIVUser sees the difference and returns nullptr.

This patch also updates the existing LLVM tests whose expected output now contains additional <nsw> flags as a result of the correction.

Minimal reproducer:

int foo(int a, int b, int c);
int baz();

void bar()
{
   int arr[20];
   int i = 0;

   for (i = 0; i < 4; ++i)
     arr[i] = baz();

   for (; i < 20; ++i)
     arr[i] = foo(arr[i - 4], arr[i - 3], arr[i - 2]);
}

Clang command line:

clang++ -mllvm -debug -S -emit-llvm -O3 --target=aarch64-linux-elf test.cpp -o test.ir

Expected result:
The -mllvm -debug log shows that all the IVs for the second for loop have been eliminated.

Diff Detail

Repository: rL LLVM

Event Timeline

iid_iunknown updated this revision to Diff 56520. May 8 2016, 10:22 AM

iid_iunknown retitled this revision from to [SCEV] No-wrap flags are not propagated when folding "{S,+,X}+T ==> {S+T,+,X}".

iid_iunknown updated this object.

iid_iunknown added a reviewer: sanjoy.

iid_iunknown set the repository for this revision to rL LLVM.

iid_iunknown added a subscriber: llvm-commits.

Herald added subscribers: mzolotukhin, aemerson. May 8 2016, 10:22 AM

This needs at least one test case (preferably more than one) demonstrating the motivation. A simplified form of the "Minimal reproducer:" will do.

lib/Analysis/ScalarEvolution.cpp
2283–2286	Can you please add a comment here on why this is correct:

  // This follows from the fact that the no-wrap flags on the outer add
  // expression is applicable on the 0th iteration, when the add recurrence
  // will be equal to its start value.
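
The reasoning in the requested comment can be checked on a toy model. The sketch below is not LLVM code: `ToyAddRec`, `addNSW`, `outerAddIsNSW` and `foldInvariant` are hypothetical names introduced only for illustration, and the model uses plain 32-bit integers instead of SCEV expressions (it assumes C++17 for `std::optional`). It shows that if `{S,+,X} + T` does not sign-overflow on any iteration, then in particular it does not overflow on iteration 0, where the recurrence equals `S`, so `S + T` is itself a no-signed-wrap add.

```cpp
#include <cstdint>
#include <optional>

// Toy affine recurrence {Start,+,Step} over i32 (hypothetical, not SCEVAddRecExpr).
struct ToyAddRec {
  int32_t Start;
  int32_t Step;
  // Mathematical value on iteration N, computed in 64 bits so it cannot wrap here.
  int64_t at(int64_t N) const { return int64_t(Start) + int64_t(Step) * N; }
};

// Returns A + B if the result fits in i32 (i.e. the add is "nsw"), otherwise nullopt.
std::optional<int32_t> addNSW(int64_t A, int64_t B) {
  int64_t Wide = A + B;
  if (Wide < INT32_MIN || Wide > INT32_MAX)
    return std::nullopt;
  return static_cast<int32_t>(Wide);
}

// True if the outer add ({Start,+,Step} + T) is nsw on each of the first
// `Iterations` iterations.
bool outerAddIsNSW(const ToyAddRec &AR, int32_t T, int64_t Iterations) {
  for (int64_t N = 0; N < Iterations; ++N)
    if (!addNSW(AR.at(N), T))
      return false;
  return true;
}

// The fold {S,+,X} + T ==> {S+T,+,X}. It succeeds only when S + T is nsw,
// which is exactly what the outer add guarantees on iteration 0.
std::optional<ToyAddRec> foldInvariant(const ToyAddRec &AR, int32_t T) {
  std::optional<int32_t> NewStart = addNSW(AR.Start, T);
  if (!NewStart)
    return std::nullopt;
  return ToyAddRec{*NewStart, AR.Step};
}
```

For example, with `ToyAddRec{5, 1}` and `T = -2`, `outerAddIsNSW` holds for the first ten iterations and the folded recurrence starts at 3. With a start of `INT32_MIN` the outer add already overflows on iteration 0, so the premise of the fold does not hold.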

This revision now requires changes to proceed. May 8 2016, 11:36 PM

Addressing the review comments.

In D20058#424378, @sanjoy wrote:

This needs at least one test case (preferably more than one) demonstrating the motivation. A simplified form of the "Minimal reproducer:" will do.

Thanks for your comments!

I added the test case and reduced it a bit but its IR is still quite big. The problem is that any attempt to shorten it further seems to hide the problem - IV elimination starts working even without our fix. Could you check if such a test is acceptable, please?

The test case checks that the optimized IR contains neither the narrow IV nor its unnecessary sext. Another way to check this is to use opt -debug and look for "INDVARS: eliminating ..." in the output. Please let me know if this way is preferable.

Thanks!

asl added a subscriber: asl. May 16 2016, 7:08 AM

sanjoy requested changes to this revision. May 16 2016, 2:54 PM

sanjoy edited edge metadata.

sanjoy added inline comments.

test/Analysis/ScalarEvolution/iv_elimination.ll
1	(On Diff #57262)	We don't run `-On` in unit tests unless unavoidable. What I had in mind was something like:

  ; RUN: opt -analyze -scalar-evolution < %s | FileCheck %s
  < IR that -indvars sees; you can get this by breakpointing at IndVarSimplify::runOnLoop and dumping out the Module >

with `CHECK:` lines verifying that the `sext` instructions don't get mapped to `sext` SCEV expressions. This should also live in `nsw.ll`.

This revision now requires changes to proceed. May 16 2016, 2:54 PM

atrick added a subscriber: atrick. May 16 2016, 6:53 PM

Could you elaborate on this part of your comment, please?

with `CHECK:` lines verifying that the `sext` instructions don't get mapped to `sext` SCEV expressions.

A piece from my modified test that exposes the problem:

for.body3:
  %i.16 = phi i32 [ %inc5, %for.body3 ], [ %i.0.lcssa, %for.cond1.preheader ]
  %sub = add nsw i32 %i.16, -2
  %idxprom = sext i32 %sub to i64
  %arrayidx = getelementptr inbounds [10 x i32], [10 x i32]* %arr, i64 0, i64 %idxprom

The patch eliminates %idxprom and uses an introduced %indvars.iv for array indexing. I am not sure this is something that can be well tested with opt -analyze -scalar-evolution. Opt shows different outputs with and w/o the patch, but these differences mainly relate to moving the constants outside of 'sext'. E.g.:

Patched:

%idxprom = sext i32 %sub to i64
  -->  {(-2 + (sext i32 %i.0.lcssa to i64))<nsw>,+,1}<nsw><%for.body3> U: [-2147483650,4294967304) S: [-2147483650,4294967304)		Exits: (-2 + (zext i32 (9 + (-1 * %i.0.lcssa)) to i64) + (sext i32 %i.0.lcssa to i64))

No patch:

%idxprom = sext i32 %sub to i64
  -->  {(sext i32 (-2 + %i.0.lcssa) to i64),+,1}<nsw><%for.body3> U: [-2147483648,4294967306) S: [-2147483648,4294967306)		Exits: ((zext i32 (9 + (-1 * %i.0.lcssa)) to i64) + (sext i32 (-2 + %i.0.lcssa) to i64))

(Patch gives "(-2 + (sext i32 %i.0.lcssa to i64))<nsw>" instead of "(sext i32 (-2 + %i.0.lcssa) to i64)").

My idea was to check that %idxprom gets eliminated and the array is indexed by an expression w/o 'sext'. This can be done by opt -indvars (-analyze is not useful for -indvars as IndVarsSimplify::print() is not defined).

Is this what the test is expected to do, or do you have a different idea?

Thanks.

In D20058#436432, @iid_iunknown wrote:

Could you elaborate on this part of your comment, please?

with `CHECK:` lines verifying that the `sext` instructions don't get mapped to `sext` SCEV expressions.

A piece from my modified test that exposes the problem:

for.body3:
  %i.16 = phi i32 [ %inc5, %for.body3 ], [ %i.0.lcssa, %for.cond1.preheader ]
  %sub = add nsw i32 %i.16, -2
  %idxprom = sext i32 %sub to i64
  %arrayidx = getelementptr inbounds [10 x i32], [10 x i32]* %arr, i64 0, i64 %idxprom

The patch eliminates %idxprom and uses an introduced %indvars.iv for
array indexing. I am not sure this is something that can be well
tested with opt -analyze -scalar-evolution. Opt shows different
outputs with and w/o the patch, but these differences mainly relate to
moving the constants outside of 'sext'. E.g.:

Patched:

%idxprom = sext i32 %sub to i64
  -->  {(-2 + (sext i32 %i.0.lcssa to i64))<nsw>,+,1}<nsw><%for.body3> U: [-2147483650,4294967304) S: [-2147483650,4294967304)		Exits: (-2 + (zext i32 (9 + (-1 * %i.0.lcssa)) to i64) + (sext i32 %i.0.lcssa to i64))

No patch:

%idxprom = sext i32 %sub to i64
  -->  {(sext i32 (-2 + %i.0.lcssa) to i64),+,1}<nsw><%for.body3> U: [-2147483648,4294967306) S: [-2147483648,4294967306)		Exits: ((zext i32 (9 + (-1 * %i.0.lcssa)) to i64) + (sext i32 (-2 + %i.0.lcssa) to i64))

This _is_ the difference I'm suggesting we should test for. In the
unpatched case, SCEV has not been able to prove that `(-2 +
%i.0.lcssa)` won't sign overflow, so it cannot commute a sext into the
addition to transform `sext(-2 + %i.0.lcssa)` into `-2 +
sext(%i.0.lcssa)`. With your change SCEV is able to prove that `-2 +
%i.0.lcssa` does not sign overflow, so the sext can be commuted into
the addition, simplifying the expression to `(-2 + (sext i32
%i.0.lcssa to i64))` as you can see.
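
The arithmetic fact behind this can be illustrated outside of SCEV. The following is a minimal sketch with hypothetical helper names (`sext`, `addWrap`, `addIsNSW`, `sextCommutes`); it models an `i32` add with two's-complement wrap-around and checks when sign extension distributes over it.

```cpp
#include <cstdint>

// Sign-extend an i32 value to i64.
int64_t sext(int32_t V) { return static_cast<int64_t>(V); }

// i32 add with two's-complement wrap-around, like an IR `add` without nsw.
// The unsigned-to-signed conversion is modular in C++20 and on the usual
// two's-complement targets before that.
int32_t addWrap(int32_t A, int32_t B) {
  return static_cast<int32_t>(static_cast<uint32_t>(A) +
                              static_cast<uint32_t>(B));
}

// True if A + B does not overflow the signed i32 range.
bool addIsNSW(int32_t A, int32_t B) {
  int64_t Wide = sext(A) + sext(B);
  return Wide >= INT32_MIN && Wide <= INT32_MAX;
}

// Does sext(A + B) equal sext(A) + sext(B)?
bool sextCommutes(int32_t A, int32_t B) {
  return sext(addWrap(A, B)) == sext(A) + sext(B);
}
```

With `A = 10, B = -2` the add does not overflow and both sides agree. With `A = INT32_MIN, B = -2` the narrow add wraps around to a large positive value, so the two sides differ. This is why the rewrite is only valid once the narrow add is known not to sign-overflow.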

My idea was to check that %idxprom gets eliminated and the array is
indexed by an expression w/o 'sext'. This can be done by `opt
-indvars` (-analyze is not useful for -indvars as
IndVarsSimplify::print() is not defined).

That's a useful test too, and if you're more comfortable with that I
won't be opposed to it. But checking SCEV directly is more targeted.

Addressing the review comments from Sanjoy.

The test case was reduced and changed to check that the sext instructions don't get mapped to sext SCEV expressions.

lgtm with nits

Let me know if you need me to check this in for you.

test/Analysis/ScalarEvolution/nsw.ll
181	Nit: you don't need to specify `D20058` here. Also use a `CHECK-LABEL: <function name>` like in the previous tests.
201	Add newline at end of file?

This revision is now accepted and ready to land. May 24 2016, 9:35 AM

Corrections according to Sanjoy's comments.

In D20058#437833, @sanjoy wrote:

Let me know if you need me to check this in for you.

I have commit access and can commit it on my own.
Thanks!

test/Analysis/ScalarEvolution/nsw.ll
201	The tool I used for diff creation removed the trailing eol for some reason. Fixed.

iid_iunknown closed this revision. May 25 2016, 6:07 AM

iid_iunknown marked an inline comment as done.

Revision Contents

Path	Size
lib/Analysis/ScalarEvolution.cpp (revision 266669)	5 lines
test/Analysis/LoopAccessAnalysis/number-of-memchecks.ll (revision 266669)	10 lines
test/Analysis/LoopAccessAnalysis/reverse-memcheck-bounds.ll (revision 266669)	4 lines
test/Analysis/ScalarEvolution/flags-from-poison.ll (revision 266669)	6 lines
test/Analysis/ScalarEvolution/nsw-offset-assume.ll (revision 266669)	4 lines
test/Analysis/ScalarEvolution/nsw-offset.ll (revision 266669)	4 lines
test/Analysis/ScalarEvolution/nsw.ll (revision 266669)	30 lines

Diff 58171

lib/Analysis/ScalarEvolution.cpp

  for (; Idx < Ops.size() && isa<SCEVAddRecExpr>(Ops[Idx]); ++Idx) {
  [... lines not shown ...]
      // If we found some loop invariants, fold them into the recurrence.
      if (!LIOps.empty()) {
        // NLI + LI + {Start,+,Step} --> NLI + {LI+Start,+,Step}
        LIOps.push_back(AddRec->getStart());

        SmallVector<const SCEV *, 4> AddRecOps(AddRec->op_begin(),
                                               AddRec->op_end());
-       AddRecOps[0] = getAddExpr(LIOps);
+       // This follows from the fact that the no-wrap flags on the outer add
+       // expression is applicable on the 0th iteration, when the add recurrence
+       // will be equal to its start value.
+       AddRecOps[0] = getAddExpr(LIOps, Flags);

        // Build the new addrec. Propagate the NUW and NSW flags if both the
        // outer add and the inner addrec are guaranteed to have no overflow.
        // Always propagate NW.
        Flags = AddRec->getNoWrapFlags(setFlags(Flags, SCEV::FlagNW));
        const SCEV *NewRec = getAddRecExpr(AddRecOps, AddRecLoop, Flags);

        // If all of the other operands were loop invariant, we are done.
  [... lines not shown ...]
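
The flag handling in this hunk can be sketched as plain bit arithmetic. The model below is illustrative only: the constants and the helpers `setFlags`, `maskFlags` and `foldFlags` are stand-ins defined here, not the LLVM declarations, and it assumes that `getNoWrapFlags(Mask)` behaves like a bitwise AND of the recurrence's flags with `Mask`. Under that assumption, the new start expression receives the outer add's flags, while the new recurrence keeps NUW/NSW only when both the outer add and the original recurrence had them.

```cpp
// Toy no-wrap flag bits (values chosen for this sketch).
constexpr unsigned FlagAnyWrap = 0;
constexpr unsigned FlagNW = 1u << 0;
constexpr unsigned FlagNUW = 1u << 1;
constexpr unsigned FlagNSW = 1u << 2;

// Turn on the bits in `On`.
unsigned setFlags(unsigned Flags, unsigned On) { return Flags | On; }

// Keep only the bits that are also present in `Mask`.
unsigned maskFlags(unsigned Flags, unsigned Mask) { return Flags & Mask; }

struct FoldedFlags {
  unsigned StartFlags; // flags on the new start expression (LI + Start)
  unsigned RecFlags;   // flags on the new recurrence {LI+Start,+,Step}
};

// Models the patched hunk: the start add inherits the outer add's flags,
// and the recurrence keeps the intersection, with NW always allowed through
// if the original recurrence had it.
FoldedFlags foldFlags(unsigned OuterAddFlags, unsigned AddRecFlags) {
  unsigned StartFlags = OuterAddFlags; // before the patch: FlagAnyWrap
  unsigned RecFlags = maskFlags(AddRecFlags, setFlags(OuterAddFlags, FlagNW));
  return {StartFlags, RecFlags};
}
```

For instance, an outer add with NSW combined with a recurrence carrying NW and NSW yields NSW on the start and NW plus NSW on the result, which corresponds to the `(...)<nsw>,+,...}<nsw>` shape seen in the updated test expectations.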

test/Analysis/LoopAccessAnalysis/number-of-memchecks.ll

	Show First 20 Lines • Show All 90 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: Comparing group ({{.*}}[[ZERO]]):			; CHECK-NEXT: Comparing group ({{.*}}[[ZERO]]):
	; CHECK-NEXT: %arrayidxC1 = getelementptr inbounds i16, i16* %c, i64 %store_ind_inc			; CHECK-NEXT: %arrayidxC1 = getelementptr inbounds i16, i16* %c, i64 %store_ind_inc
	; CHECK-NEXT: %arrayidxC = getelementptr inbounds i16, i16* %c, i64 %store_ind			; CHECK-NEXT: %arrayidxC = getelementptr inbounds i16, i16* %c, i64 %store_ind
	; CHECK-NEXT: Against group ([[TWO:.+]]):			; CHECK-NEXT: Against group ([[TWO:.+]]):
	; CHECK-NEXT: %arrayidxB = getelementptr inbounds i16, i16* %b, i64 %ind			; CHECK-NEXT: %arrayidxB = getelementptr inbounds i16, i16* %b, i64 %ind
	; CHECK-NEXT: Grouped accesses:			; CHECK-NEXT: Grouped accesses:
	; CHECK-NEXT: Group {{.*}}[[ZERO]]:			; CHECK-NEXT: Group {{.*}}[[ZERO]]:
	; CHECK-NEXT: (Low: %c High: (78 + %c))			; CHECK-NEXT: (Low: %c High: (78 + %c))
	; CHECK-NEXT: Member: {(2 + %c),+,4}			; CHECK-NEXT: Member: {(2 + %c)<nsw>,+,4}
	; CHECK-NEXT: Member: {%c,+,4}			; CHECK-NEXT: Member: {%c,+,4}
	; CHECK-NEXT: Group {{.*}}[[ONE]]:			; CHECK-NEXT: Group {{.*}}[[ONE]]:
	; CHECK-NEXT: (Low: %a High: (40 + %a))			; CHECK-NEXT: (Low: %a High: (40 + %a))
	; CHECK-NEXT: Member: {(2 + %a),+,2}			; CHECK-NEXT: Member: {(2 + %a)<nsw>,+,2}
	; CHECK-NEXT: Member: {%a,+,2}			; CHECK-NEXT: Member: {%a,+,2}
	; CHECK-NEXT: Group {{.*}}[[TWO]]:			; CHECK-NEXT: Group {{.*}}[[TWO]]:
	; CHECK-NEXT: (Low: %b High: (38 + %b))			; CHECK-NEXT: (Low: %b High: (38 + %b))
	; CHECK-NEXT: Member: {%b,+,2}			; CHECK-NEXT: Member: {%b,+,2}

	define void @testg(i16* %a,			define void @testg(i16* %a,
	i16* %b,			i16* %b,
	i16* %c) {			i16* %c) {
	▲ Show 20 Lines • Show All 51 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: Comparing group ({{.*}}[[ZERO]]):			; CHECK-NEXT: Comparing group ({{.*}}[[ZERO]]):
	; CHECK-NEXT: %arrayidxC1 = getelementptr inbounds i16, i16* %c, i64 %store_ind_inc			; CHECK-NEXT: %arrayidxC1 = getelementptr inbounds i16, i16* %c, i64 %store_ind_inc
	; CHECK-NEXT: %arrayidxC = getelementptr inbounds i16, i16* %c, i64 %store_ind			; CHECK-NEXT: %arrayidxC = getelementptr inbounds i16, i16* %c, i64 %store_ind
	; CHECK-NEXT: Against group ([[TWO:.+]]):			; CHECK-NEXT: Against group ([[TWO:.+]]):
	; CHECK-NEXT: %arrayidxB = getelementptr i16, i16* %b, i64 %ind			; CHECK-NEXT: %arrayidxB = getelementptr i16, i16* %b, i64 %ind
	; CHECK-NEXT: Grouped accesses:			; CHECK-NEXT: Grouped accesses:
	; CHECK-NEXT: Group {{.*}}[[ZERO]]:			; CHECK-NEXT: Group {{.*}}[[ZERO]]:
	; CHECK-NEXT: (Low: %c High: (78 + %c))			; CHECK-NEXT: (Low: %c High: (78 + %c))
	; CHECK-NEXT: Member: {(2 + %c),+,4}			; CHECK-NEXT: Member: {(2 + %c)<nsw>,+,4}
	; CHECK-NEXT: Member: {%c,+,4}			; CHECK-NEXT: Member: {%c,+,4}
	; CHECK-NEXT: Group {{.*}}[[ONE]]:			; CHECK-NEXT: Group {{.*}}[[ONE]]:
	; CHECK-NEXT: (Low: %a High: (40 + %a))			; CHECK-NEXT: (Low: %a High: (40 + %a))
	; CHECK-NEXT: Member: {(2 + %a),+,2}			; CHECK-NEXT: Member: {(2 + %a),+,2}
	; CHECK-NEXT: Member: {%a,+,2}			; CHECK-NEXT: Member: {%a,+,2}
	; CHECK-NEXT: Group {{.*}}[[TWO]]:			; CHECK-NEXT: Group {{.*}}[[TWO]]:
	; CHECK-NEXT: (Low: %b High: (38 + %b))			; CHECK-NEXT: (Low: %b High: (38 + %b))
	; CHECK-NEXT: Member: {%b,+,2}			; CHECK-NEXT: Member: {%b,+,2}
	▲ Show 20 Lines • Show All 61 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: %arrayidxA1 = getelementptr i16, i16* %a, i64 %ind			; CHECK-NEXT: %arrayidxA1 = getelementptr i16, i16* %a, i64 %ind
	; CHECK-NEXT: Check 1:			; CHECK-NEXT: Check 1:
	; CHECK-NEXT: Comparing group ({{.*}}[[ZERO]]):			; CHECK-NEXT: Comparing group ({{.*}}[[ZERO]]):
	; CHECK-NEXT: %storeidx = getelementptr inbounds i16, i16* %a, i64 %store_ind			; CHECK-NEXT: %storeidx = getelementptr inbounds i16, i16* %a, i64 %store_ind
	; CHECK-NEXT: Against group ([[TWO:.+]]):			; CHECK-NEXT: Against group ([[TWO:.+]]):
	; CHECK-NEXT: %arrayidxA2 = getelementptr i16, i16* %a, i64 %ind2			; CHECK-NEXT: %arrayidxA2 = getelementptr i16, i16* %a, i64 %ind2
	; CHECK-NEXT: Grouped accesses:			; CHECK-NEXT: Grouped accesses:
	; CHECK-NEXT: Group {{.*}}[[ZERO]]:			; CHECK-NEXT: Group {{.*}}[[ZERO]]:
	; CHECK-NEXT: (Low: ((2 * %offset) + %a) High: (9998 + (2 * %offset) + %a))			; CHECK-NEXT: (Low: ((2 * %offset) + %a)<nsw> High: (9998 + (2 * %offset) + %a))
	; CHECK-NEXT: Member: {((2 * %offset) + %a),+,2}<nsw><%for.body>			; CHECK-NEXT: Member: {((2 * %offset) + %a)<nsw>,+,2}<nsw><%for.body>
	; CHECK-NEXT: Group {{.*}}[[ONE]]:			; CHECK-NEXT: Group {{.*}}[[ONE]]:
	; CHECK-NEXT: (Low: %a High: (9998 + %a))			; CHECK-NEXT: (Low: %a High: (9998 + %a))
	; CHECK-NEXT: Member: {%a,+,2}<%for.body>			; CHECK-NEXT: Member: {%a,+,2}<%for.body>
	; CHECK-NEXT: Group {{.*}}[[TWO]]:			; CHECK-NEXT: Group {{.*}}[[TWO]]:
	; CHECK-NEXT: (Low: (20000 + %a) High: (29998 + %a))			; CHECK-NEXT: (Low: (20000 + %a) High: (29998 + %a))
	; CHECK-NEXT: Member: {(20000 + %a),+,2}<%for.body>			; CHECK-NEXT: Member: {(20000 + %a),+,2}<%for.body>

	define void @testi(i16* %a,			define void @testi(i16* %a,
	Show All 29 Lines

test/Analysis/LoopAccessAnalysis/reverse-memcheck-bounds.ll

	Show All 9 Lines
	; for (i = 0; i < 10000; i++) {			; for (i = 0; i < 10000; i++) {
	; B[i] = A[15000 - i] * 3;			; B[i] = A[15000 - i] * 3;
	; }			; }

	target datalayout = "e-m:e-i64:64-i128:128-n32:64-S128"			target datalayout = "e-m:e-i64:64-i128:128-n32:64-S128"
	target triple = "aarch64--linux-gnueabi"			target triple = "aarch64--linux-gnueabi"

	; CHECK: function 'f':			; CHECK: function 'f':
	; CHECK: (Low: (20000 + %a) High: (60000 + %a))			; CHECK: (Low: (20000 + %a) High: (60000 + %a)<nsw>)

	@B = common global i32* null, align 8			@B = common global i32* null, align 8
	@A = common global i32* null, align 8			@A = common global i32* null, align 8

	define void @f() {			define void @f() {
	entry:			entry:
	%a = load i32, i32* @A, align 8			%a = load i32, i32* @A, align 8
	%b = load i32, i32* @B, align 8			%b = load i32, i32* @B, align 8
	Show All 26 Lines

	; for (i = 0; i < 10000; i++) {			; for (i = 0; i < 10000; i++) {
	; B[i] = A[15000 - step * i] * 3;			; B[i] = A[15000 - step * i] * 3;
	; }			; }

	; Here it is not obvious what the limits are, since 'step' could be negative.			; Here it is not obvious what the limits are, since 'step' could be negative.

	; CHECK: Low: (-1 + (-1 * ((-60001 + (-1 * %a)) umax (-60001 + (40000 * %step) + (-1 * %a)))))			; CHECK: Low: (-1 + (-1 * ((-60001 + (-1 * %a)) umax (-60001 + (40000 * %step) + (-1 * %a)))))
	; CHECK: High: ((60000 + %a) umax (60000 + (-40000 * %step) + %a))			; CHECK: High: ((60000 + %a)<nsw> umax (60000 + (-40000 * %step) + %a))

	define void @g(i64 %step) {			define void @g(i64 %step) {
	entry:			entry:
	%a = load i32, i32* @A, align 8			%a = load i32, i32* @A, align 8
	%b = load i32, i32* @B, align 8			%b = load i32, i32* @B, align 8
	br label %for.body			br label %for.body

	for.body: ; preds = %for.body, %entry			for.body: ; preds = %for.body, %entry
	Show All 20 Lines

test/Analysis/ScalarEvolution/flags-from-poison.ll

Show First 20 Lines • Show All 340 Lines • ▼ Show 20 Lines	; CHECK: --> {(2 + %offset),+,1}<nw>
%exitcond = icmp eq i32 %nexti, %numIterations		%exitcond = icmp eq i32 %nexti, %numIterations
br i1 %exitcond, label %exit, label %loop		br i1 %exitcond, label %exit, label %loop

loop:		loop:
%i = phi i32 [ %nexti, %loop2 ], [ 0, %entry ]		%i = phi i32 [ %nexti, %loop2 ], [ 0, %entry ]

%j = add nsw i32 %i, 1		%j = add nsw i32 %i, 1
; CHECK: %index32 =		; CHECK: %index32 =
; CHECK: --> {(1 + %offset),+,1}<nsw>		; CHECK: --> {(1 + %offset)<nsw>,+,1}<nsw>
%index32 = add nsw i32 %j, %offset		%index32 = add nsw i32 %j, %offset

%ptr = getelementptr inbounds float, float* %input, i32 %index32		%ptr = getelementptr inbounds float, float* %input, i32 %index32
%nexti = add nsw i32 %i, 1		%nexti = add nsw i32 %i, 1
store float 1.0, float* %ptr, align 4		store float 1.0, float* %ptr, align 4
br label %loop2		br label %loop2
exit:		exit:
ret void		ret void
▲ Show 20 Lines • Show All 125 Lines • ▼ Show 20 Lines
; CHECK-LABEL: @test-sub-nsw		; CHECK-LABEL: @test-sub-nsw
entry:		entry:
%halfsub = ashr i32 %sub, 1		%halfsub = ashr i32 %sub, 1
br label %loop		br label %loop
loop:		loop:
%i = phi i32 [ %nexti, %loop ], [ %start, %entry ]		%i = phi i32 [ %nexti, %loop ], [ %start, %entry ]

; CHECK: %index32 =		; CHECK: %index32 =
; CHECK: --> {((-1 * %halfsub)<nsw> + %start),+,1}<nsw>		; CHECK: --> {((-1 * %halfsub)<nsw> + %start)<nsw>,+,1}<nsw>
%index32 = sub nsw i32 %i, %halfsub		%index32 = sub nsw i32 %i, %halfsub
%index64 = sext i32 %index32 to i64		%index64 = sext i32 %index32 to i64

%ptr = getelementptr inbounds float, float* %input, i64 %index64		%ptr = getelementptr inbounds float, float* %input, i64 %index64
%nexti = add nsw i32 %i, 1		%nexti = add nsw i32 %i, 1
%f = load float, float* %ptr, align 4		%f = load float, float* %ptr, align 4
%exitcond = icmp eq i32 %nexti, %numIterations		%exitcond = icmp eq i32 %nexti, %numIterations
br i1 %exitcond, label %exit, label %loop		br i1 %exitcond, label %exit, label %loop
▲ Show 20 Lines • Show All 42 Lines • ▼ Show 20 Lines	; CHECK: --> {(2 + (-1 * %offset)),+,1}<nw>
%exitcond = icmp eq i32 %nexti, %numIterations		%exitcond = icmp eq i32 %nexti, %numIterations
br i1 %exitcond, label %exit, label %loop		br i1 %exitcond, label %exit, label %loop

loop:		loop:
%i = phi i32 [ %nexti, %loop2 ], [ 0, %entry ]		%i = phi i32 [ %nexti, %loop2 ], [ 0, %entry ]

%j = add nsw i32 %i, 1		%j = add nsw i32 %i, 1
; CHECK: %index32 =		; CHECK: %index32 =
; CHECK: --> {(1 + (-1 * %offset)),+,1}<nsw>		; CHECK: --> {(1 + (-1 * %offset))<nsw>,+,1}<nsw>
%index32 = sub nsw i32 %j, %offset		%index32 = sub nsw i32 %j, %offset

%ptr = getelementptr inbounds float, float* %input, i32 %index32		%ptr = getelementptr inbounds float, float* %input, i32 %index32
%nexti = add nsw i32 %i, 1		%nexti = add nsw i32 %i, 1
store float 1.0, float* %ptr, align 4		store float 1.0, float* %ptr, align 4
br label %loop2		br label %loop2
exit:		exit:
ret void		ret void
Show All 34 Lines

test/Analysis/ScalarEvolution/nsw-offset-assume.ll

Show All 33 Lines	; CHECK: --> {%d,+,16}<nsw><%bb>
%6 = load double, double* %5, align 8 ; <double> [#uses=1]		%6 = load double, double* %5, align 8 ; <double> [#uses=1]
%7 = or i32 %i.01, 1 ; <i32> [#uses=1]		%7 = or i32 %i.01, 1 ; <i32> [#uses=1]

; CHECK: %8 = sext i32 %7 to i64		; CHECK: %8 = sext i32 %7 to i64
; CHECK: --> {1,+,2}<nuw><nsw><%bb>		; CHECK: --> {1,+,2}<nuw><nsw><%bb>
%8 = sext i32 %7 to i64 ; <i64> [#uses=1]		%8 = sext i32 %7 to i64 ; <i64> [#uses=1]

; CHECK: %9 = getelementptr inbounds double, double* %q, i64 %8		; CHECK: %9 = getelementptr inbounds double, double* %q, i64 %8
; CHECK: {(8 + %q),+,16}<nsw><%bb>		; CHECK: {(8 + %q)<nsw>,+,16}<nsw><%bb>
%9 = getelementptr inbounds double, double* %q, i64 %8 ; <double*> [#uses=1]		%9 = getelementptr inbounds double, double* %q, i64 %8 ; <double*> [#uses=1]

; Artificially repeat the above three instructions, this time using		; Artificially repeat the above three instructions, this time using
; add nsw instead of or.		; add nsw instead of or.
%t7 = add nsw i32 %i.01, 1 ; <i32> [#uses=1]		%t7 = add nsw i32 %i.01, 1 ; <i32> [#uses=1]

; CHECK: %t8 = sext i32 %t7 to i64		; CHECK: %t8 = sext i32 %t7 to i64
; CHECK: --> {1,+,2}<nuw><nsw><%bb>		; CHECK: --> {1,+,2}<nuw><nsw><%bb>
%t8 = sext i32 %t7 to i64 ; <i64> [#uses=1]		%t8 = sext i32 %t7 to i64 ; <i64> [#uses=1]

; CHECK: %t9 = getelementptr inbounds double, double* %q, i64 %t8		; CHECK: %t9 = getelementptr inbounds double, double* %q, i64 %t8
; CHECK: {(8 + %q),+,16}<nsw><%bb>		; CHECK: {(8 + %q)<nsw>,+,16}<nsw><%bb>
%t9 = getelementptr inbounds double, double* %q, i64 %t8 ; <double*> [#uses=1]		%t9 = getelementptr inbounds double, double* %q, i64 %t8 ; <double*> [#uses=1]

%10 = load double, double* %9, align 8 ; <double> [#uses=1]		%10 = load double, double* %9, align 8 ; <double> [#uses=1]
%11 = fadd double %6, %10 ; <double> [#uses=1]		%11 = fadd double %6, %10 ; <double> [#uses=1]
%12 = fadd double %11, 3.200000e+00 ; <double> [#uses=1]		%12 = fadd double %11, 3.200000e+00 ; <double> [#uses=1]
%13 = fmul double %3, %12 ; <double> [#uses=1]		%13 = fmul double %3, %12 ; <double> [#uses=1]
%14 = sext i32 %i.01 to i64 ; <i64> [#uses=1]		%14 = sext i32 %i.01 to i64 ; <i64> [#uses=1]
%15 = getelementptr inbounds double, double* %d, i64 %14 ; <double*> [#uses=1]		%15 = getelementptr inbounds double, double* %d, i64 %14 ; <double*> [#uses=1]
Show All 21 Lines

test/Analysis/ScalarEvolution/nsw-offset.ll

Show All 31 Lines	; CHECK: --> {%d,+,16}<nsw><%bb>
%6 = load double, double* %5, align 8 ; <double> [#uses=1]		%6 = load double, double* %5, align 8 ; <double> [#uses=1]
%7 = or i32 %i.01, 1 ; <i32> [#uses=1]		%7 = or i32 %i.01, 1 ; <i32> [#uses=1]

; CHECK: %8 = sext i32 %7 to i64		; CHECK: %8 = sext i32 %7 to i64
; CHECK: --> {1,+,2}<nuw><nsw><%bb>		; CHECK: --> {1,+,2}<nuw><nsw><%bb>
%8 = sext i32 %7 to i64 ; <i64> [#uses=1]		%8 = sext i32 %7 to i64 ; <i64> [#uses=1]

; CHECK: %9 = getelementptr inbounds double, double* %q, i64 %8		; CHECK: %9 = getelementptr inbounds double, double* %q, i64 %8
; CHECK: {(8 + %q),+,16}<nsw><%bb>		; CHECK: {(8 + %q)<nsw>,+,16}<nsw><%bb>
%9 = getelementptr inbounds double, double* %q, i64 %8 ; <double*> [#uses=1]		%9 = getelementptr inbounds double, double* %q, i64 %8 ; <double*> [#uses=1]

; Artificially repeat the above three instructions, this time using		; Artificially repeat the above three instructions, this time using
; add nsw instead of or.		; add nsw instead of or.
%t7 = add nsw i32 %i.01, 1 ; <i32> [#uses=1]		%t7 = add nsw i32 %i.01, 1 ; <i32> [#uses=1]

; CHECK: %t8 = sext i32 %t7 to i64		; CHECK: %t8 = sext i32 %t7 to i64
; CHECK: --> {1,+,2}<nuw><nsw><%bb>		; CHECK: --> {1,+,2}<nuw><nsw><%bb>
%t8 = sext i32 %t7 to i64 ; <i64> [#uses=1]		%t8 = sext i32 %t7 to i64 ; <i64> [#uses=1]

; CHECK: %t9 = getelementptr inbounds double, double* %q, i64 %t8		; CHECK: %t9 = getelementptr inbounds double, double* %q, i64 %t8
; CHECK: {(8 + %q),+,16}<nsw><%bb>		; CHECK: {(8 + %q)<nsw>,+,16}<nsw><%bb>
%t9 = getelementptr inbounds double, double* %q, i64 %t8 ; <double*> [#uses=1]		%t9 = getelementptr inbounds double, double* %q, i64 %t8 ; <double*> [#uses=1]

%10 = load double, double* %9, align 8 ; <double> [#uses=1]		%10 = load double, double* %9, align 8 ; <double> [#uses=1]
%11 = fadd double %6, %10 ; <double> [#uses=1]		%11 = fadd double %6, %10 ; <double> [#uses=1]
%12 = fadd double %11, 3.200000e+00 ; <double> [#uses=1]		%12 = fadd double %11, 3.200000e+00 ; <double> [#uses=1]
%13 = fmul double %3, %12 ; <double> [#uses=1]		%13 = fmul double %3, %12 ; <double> [#uses=1]
%14 = sext i32 %i.01 to i64 ; <i64> [#uses=1]		%14 = sext i32 %i.01 to i64 ; <i64> [#uses=1]
%15 = getelementptr inbounds double, double* %d, i64 %14 ; <double*> [#uses=1]		%15 = getelementptr inbounds double, double* %d, i64 %14 ; <double*> [#uses=1]
Show All 17 Lines

test/Analysis/ScalarEvolution/nsw.ll

Show First 20 Lines • Show All 60 Lines • ▼ Show 20 Lines

for.body.i.i: ; preds = %for.body.i.i, %for.body.lr.ph.i.i		for.body.i.i: ; preds = %for.body.i.i, %for.body.lr.ph.i.i
%__first.addr.02.i.i = phi i32* [ %begin, %for.body.lr.ph.i.i ], [ %ptrincdec.i.i, %for.body.i.i ]		%__first.addr.02.i.i = phi i32* [ %begin, %for.body.lr.ph.i.i ], [ %ptrincdec.i.i, %for.body.i.i ]
; CHECK: %__first.addr.02.i.i		; CHECK: %__first.addr.02.i.i
; CHECK-NEXT: --> {%begin,+,4}<nuw><%for.body.i.i>		; CHECK-NEXT: --> {%begin,+,4}<nuw><%for.body.i.i>
store i32 0, i32* %__first.addr.02.i.i, align 4		store i32 0, i32* %__first.addr.02.i.i, align 4
%ptrincdec.i.i = getelementptr inbounds i32, i32* %__first.addr.02.i.i, i64 1		%ptrincdec.i.i = getelementptr inbounds i32, i32* %__first.addr.02.i.i, i64 1
; CHECK: %ptrincdec.i.i		; CHECK: %ptrincdec.i.i
; CHECK-NEXT: --> {(4 + %begin),+,4}<nuw><%for.body.i.i>		; CHECK-NEXT: --> {(4 + %begin)<nsw>,+,4}<nuw><%for.body.i.i>
%cmp.i.i = icmp eq i32* %ptrincdec.i.i, %end		%cmp.i.i = icmp eq i32* %ptrincdec.i.i, %end
br i1 %cmp.i.i, label %for.cond.for.end_crit_edge.i.i, label %for.body.i.i		br i1 %cmp.i.i, label %for.cond.for.end_crit_edge.i.i, label %for.body.i.i

for.cond.for.end_crit_edge.i.i: ; preds = %for.body.i.i		for.cond.for.end_crit_edge.i.i: ; preds = %for.body.i.i
br label %_ZSt4fillIPiiEvT_S1_RKT0_.exit		br label %_ZSt4fillIPiiEvT_S1_RKT0_.exit

_ZSt4fillIPiiEvT_S1_RKT0_.exit: ; preds = %entry, %for.cond.for.end_crit_edge.i.i		_ZSt4fillIPiiEvT_S1_RKT0_.exit: ; preds = %entry, %for.cond.for.end_crit_edge.i.i
ret void		ret void
Show All 9 Lines	for.body.i.i: ; preds = %entry, %for.body.i.i
%indvar.i.i = phi i64 [ %tmp, %for.body.i.i ], [ 0, %entry ]		%indvar.i.i = phi i64 [ %tmp, %for.body.i.i ], [ 0, %entry ]
; CHECK: %indvar.i.i		; CHECK: %indvar.i.i
; CHECK: {0,+,1}<nuw><nsw><%for.body.i.i>		; CHECK: {0,+,1}<nuw><nsw><%for.body.i.i>
%tmp = add nsw i64 %indvar.i.i, 1		%tmp = add nsw i64 %indvar.i.i, 1
; CHECK: %tmp =		; CHECK: %tmp =
; CHECK: {1,+,1}<nuw><nsw><%for.body.i.i>		; CHECK: {1,+,1}<nuw><nsw><%for.body.i.i>
%ptrincdec.i.i = getelementptr inbounds i32, i32* %begin, i64 %tmp		%ptrincdec.i.i = getelementptr inbounds i32, i32* %begin, i64 %tmp
; CHECK: %ptrincdec.i.i =		; CHECK: %ptrincdec.i.i =
; CHECK: {(4 + %begin),+,4}<nsw><%for.body.i.i>		; CHECK: {(4 + %begin)<nsw>,+,4}<nsw><%for.body.i.i>
%__first.addr.08.i.i = getelementptr inbounds i32, i32* %begin, i64 %indvar.i.i		%__first.addr.08.i.i = getelementptr inbounds i32, i32* %begin, i64 %indvar.i.i
; CHECK: %__first.addr.08.i.i		; CHECK: %__first.addr.08.i.i
; CHECK: {%begin,+,4}<nsw><%for.body.i.i>		; CHECK: {%begin,+,4}<nsw><%for.body.i.i>
store i32 0, i32* %__first.addr.08.i.i, align 4		store i32 0, i32* %__first.addr.08.i.i, align 4
%cmp.i.i = icmp eq i32* %ptrincdec.i.i, %end		%cmp.i.i = icmp eq i32* %ptrincdec.i.i, %end
br i1 %cmp.i.i, label %_ZSt4fillIPiiEvT_S1_RKT0_.exit, label %for.body.i.i		br i1 %cmp.i.i, label %_ZSt4fillIPiiEvT_S1_RKT0_.exit, label %for.body.i.i
; CHECK: Loop %for.body.i.i: backedge-taken count is ((-4 + (-1 * %begin) + %end) /u 4)		; CHECK: Loop %for.body.i.i: backedge-taken count is ((-4 + (-1 * %begin) + %end) /u 4)
; CHECK: Loop %for.body.i.i: max backedge-taken count is ((-4 + (-1 * %begin) + %end) /u 4)		; CHECK: Loop %for.body.i.i: max backedge-taken count is ((-4 + (-1 * %begin) + %end) /u 4)
Show All 15 Lines	greater:
br label %exit		br label %exit

exit:		exit:
%result = phi i32 [ %a, %entry ], [ %tmp2, %greater ]		%result = phi i32 [ %a, %entry ], [ %tmp2, %greater ]
ret i32 %result		ret i32 %result
}		}

; CHECK-LABEL: PR12375		; CHECK-LABEL: PR12375
; CHECK: --> {(4 + %arg),+,4}<nuw><%bb1>{{ U: [^ ]+ S: [^ ]+}}{{ *}}Exits: (8 + %arg)<nsw>		; CHECK: --> {(4 + %arg)<nsw>,+,4}<nuw><%bb1>{{ U: [^ ]+ S: [^ ]+}}{{ *}}Exits: (8 + %arg)<nsw>
define i32 @PR12375(i32* readnone %arg) {		define i32 @PR12375(i32* readnone %arg) {
bb:		bb:
%tmp = getelementptr inbounds i32, i32* %arg, i64 2		%tmp = getelementptr inbounds i32, i32* %arg, i64 2
br label %bb1		br label %bb1

bb1: ; preds = %bb1, %bb		bb1: ; preds = %bb1, %bb
%tmp2 = phi i32* [ %arg, %bb ], [ %tmp5, %bb1 ]		%tmp2 = phi i32* [ %arg, %bb ], [ %tmp5, %bb1 ]
%tmp3 = phi i32 [ 0, %bb ], [ %tmp4, %bb1 ]		%tmp3 = phi i32 [ 0, %bb ], [ %tmp4, %bb1 ]
%tmp4 = add nsw i32 %tmp3, 1		%tmp4 = add nsw i32 %tmp3, 1
%tmp5 = getelementptr inbounds i32, i32* %tmp2, i64 1		%tmp5 = getelementptr inbounds i32, i32* %tmp2, i64 1
%tmp6 = icmp ult i32* %tmp5, %tmp		%tmp6 = icmp ult i32* %tmp5, %tmp
br i1 %tmp6, label %bb1, label %bb7		br i1 %tmp6, label %bb1, label %bb7

bb7: ; preds = %bb1		bb7: ; preds = %bb1
ret i32 %tmp4		ret i32 %tmp4
}		}

; CHECK-LABEL: PR12376		; CHECK-LABEL: PR12376
; CHECK: --> {(4 + %arg),+,4}<nuw><%bb2>{{ U: [^ ]+ S: [^ ]+}}{{ }}Exits: (4 + (4 ((3 + (-1 * %arg) + (%arg umax %arg1)) /u 4)) + %arg)		; CHECK: --> {(4 + %arg)<nsw>,+,4}<nuw><%bb2>{{ U: [^ ]+ S: [^ ]+}}{{ }}Exits: (4 + (4 ((3 + (-1 * %arg) + (%arg umax %arg1)) /u 4)) + %arg)
define void @PR12376(i32* nocapture %arg, i32* nocapture %arg1) {		define void @PR12376(i32* nocapture %arg, i32* nocapture %arg1) {
bb:		bb:
br label %bb2		br label %bb2

bb2: ; preds = %bb2, %bb		bb2: ; preds = %bb2, %bb
%tmp = phi i32* [ %arg, %bb ], [ %tmp4, %bb2 ]		%tmp = phi i32* [ %arg, %bb ], [ %tmp4, %bb2 ]
%tmp3 = icmp ult i32* %tmp, %arg1		%tmp3 = icmp ult i32* %tmp, %arg1
%tmp4 = getelementptr inbounds i32, i32* %tmp, i64 1		%tmp4 = getelementptr inbounds i32, i32* %tmp, i64 1
Show All 17 Lines	for.body:
%inc = add nsw i32 %i.04, 1		%inc = add nsw i32 %i.04, 1
tail call void @f(i32 %i.04)		tail call void @f(i32 %i.04)
%cmp = icmp slt i32 %i.04, %add		%cmp = icmp slt i32 %i.04, %add
br i1 %cmp, label %for.body, label %for.end		br i1 %cmp, label %for.body, label %for.end

for.end:		for.end:
ret void		ret void
}		}

		; This test checks if no-wrap flags are propagated when folding {S,+,X}+T ==> {S+T,+,X} (D20058)
		; CHECK: %idxprom
		; CHECK-NEXT: --> {(-2 + (sext i32 %arg to i64))<nsw>,+,1}<nsw><%for.body>
		define void @test4(i32 %arg) {
		entry:
		%array = alloca [10 x i32], align 4
		br label %for.body

		for.body:
		%index = phi i32 [ %inc5, %for.body ], [ %arg, %entry ]
		%sub = add nsw i32 %index, -2
		%idxprom = sext i32 %sub to i64
		%arrayidx = getelementptr inbounds [10 x i32], [10 x i32]* %array, i64 0, i64 %idxprom
		%data = load i32, i32* %arrayidx, align 4
		%inc5 = add nsw i32 %index, 1
		%cmp2 = icmp slt i32 %inc5, 10
		br i1 %cmp2, label %for.body, label %for.end

		for.end:
		ret void
		}
		No newline at end of file
