This is an archive of the discontinued LLVM Phabricator instance.

Differential D11212

[SCEV] Apply NSW and NUW flags via poison value analysis
Closed, Public

Authored by broune on Jul 14 2015, 10:22 PM.

Details

Reviewers

eliben
atrick
sanjoy

Commits

rG42f1d67a45f3: [SCEV] Apply NSW and NUW flags via poison value analysis
rL243460: [SCEV] Apply NSW and NUW flags via poison value analysis

Summary

Make Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs in some cases. This is based on reasoning about when poison from instructions with these flags would trigger undefined behavior. This gives a 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX.

There does not seem to be clear agreement about when poison should be considered to propagate through instructions. In this analysis, poison propagates only in cases where that should be uncontroversial.
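The reasoning can be illustrated with a minimal toy model in C++. This is a sketch, not LLVM code: poison is modeled as an empty `std::optional`, the helper names `addNSW` and `storeThrough` are hypothetical, and undefined behavior is modeled as an exception so that it can be observed.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>
#include <stdexcept>

// Toy model only: an empty optional stands for a (fully) poison value.
using Val = std::optional<int32_t>;

// Models `add nsw`: yields poison on signed overflow or if either operand is
// poison. The sum is computed in 64 bits so the check itself cannot overflow.
Val addNSW(Val A, Val B) {
  if (!A || !B)
    return std::nullopt;
  int64_t Wide = static_cast<int64_t>(*A) + static_cast<int64_t>(*B);
  if (Wide < std::numeric_limits<int32_t>::min() ||
      Wide > std::numeric_limits<int32_t>::max())
    return std::nullopt;
  return static_cast<int32_t>(Wide);
}

// Models a store whose address depends on Index: using poison as an address
// is undefined behavior, represented here by throwing.
void storeThrough(Val Index) {
  if (!Index)
    throw std::runtime_error("UB: poison used as an address");
}
```

In a well-defined execution `storeThrough` never throws, so an `addNSW` result that feeds it cannot have overflowed. That is the kind of fact this change records as a no-wrap flag on the SCEV.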

This change makes LSR able to create induction variables for expressions like &ptr[i + offset] for loops like this:

for (int i = 0; i < limit; ++i) {
  sum += ptr[i + offset];
}

Here ptr is a 64-bit pointer and offset is a 32-bit integer. For NVPTX, LSR currently creates an induction variable for i + offset instead, which is not as fast. Improving this is what brings the 13% speed-up noted above.
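The widths matter because the 32-bit sum has to be sign-extended to 64 bits before it can index the pointer, and sext(i + offset) equals sext(i) + sext(offset) only when the 32-bit add does not wrap. The following sketch checks both cases. The helper names are illustrative and not taken from the patch, and the wrapping add is emulated in unsigned arithmetic so that the C++ itself has no signed overflow.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Wrapping 32-bit add (what `add` without nsw computes), done in unsigned
// arithmetic so that the C++ itself has no signed overflow.
int32_t wrappingAdd32(int32_t A, int32_t B) {
  uint32_t Sum = static_cast<uint32_t>(A) + static_cast<uint32_t>(B);
  int32_t Result;
  std::memcpy(&Result, &Sum, sizeof(Result));
  return Result;
}

// sext(i + offset): add in 32 bits, then widen.
int64_t extendAfterAdd(int32_t I, int32_t Offset) {
  return static_cast<int64_t>(wrappingAdd32(I, Offset));
}

// sext(i) + sext(offset): widen first, then add in 64 bits.
int64_t addAfterExtend(int32_t I, int32_t Offset) {
  return static_cast<int64_t>(I) + static_cast<int64_t>(Offset);
}
```

The two functions agree whenever the 32-bit add does not wrap and disagree when it does, for example for `INT32_MAX` and `1`. Knowing that the add is NSW is what allows the address computation to be split into a pointer induction variable.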

There are more details in this discussion on llvmdev.
June: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-June/thread.html#87234
July: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/thread.html#87392

Patch by Bjarke Roune

Diff Detail

Event Timeline

broune updated this revision to Diff 29752. Jul 14 2015, 10:22 PM

broune retitled this revision from to [SCEV] Apply NSW and NUW flags via poison value analysis.

broune updated this object.

broune added reviewers: eliben, atrick, sanjoy.

broune added subscribers: llvm-commits, meheff, jingyue.

First round of comments inline. I'll do a more detailed review in the next couple of days.

lib/Analysis/ScalarEvolution.cpp
4284	Same comment as earlier -- I don't think this `assert` adds much value. You could take `Value &V` if you really cared about not passing in `nullptr`, but if I were you I'd just remove the `assert`.
4287	Nit: propagate.
4307	Question: why isn't `undefinedBehaviorIsGuaranteedIfPoison` enough? IOW, why can't we have `if (undefinedBehaviorIsGuaranteedIfPoison(BinOp)) return Flags; else return SCEV::FlagAnyWrap;`?
4312	LLVM convention is `auto *AddRec`.
4372	LLVM convention is `auto *`.

sanjoy added inline comments. Jul 15 2015, 12:09 AM

lib/Analysis/ScalarEvolution.cpp
4095	I think `isGuaranteedToExecuteForEveryIteration`, `propagatesPoison`, `getOperandThatCausesUndefinedBehaviorIfPoison` and `undefinedBehaviorIsGuaranteedIfPoison` should live in `ValueTracking.h`, given that `ValueTracking` contains similar things like `isSafeToSpeculativelyExecute`.
4099	I don't think these asserts are adding much here. Do you think it will be cleaner to take `I` and `L` by reference instead, to indicate this invariant?
4114	Use `isa`. Also `mayThrow` implies `isa<CallInst> || isa<ResumeInst>` and since `ResumeInst` is a terminator, I think this check can just be `isa<CallInst>` with a comment that this checks for both infinite loops and throws.
4116	I'd swap this check with the previous one -- otherwise you'll always return `false` for calls.
4125	I agree these are fairly "obvious", but I'd like to run these by David Majnemer and Nuno Lopes to make sure that these rules are at least congruent on what they have in mind for poison's future.
4130	I don't think this `assert` adds much. It is fairly obvious that `I` is expected to be non null and if it is, we'll get a deterministic segfault in `I->getOpcode()`.
4210	Same comment as above about nullness of `I`.
4263	Same comment as in `isGuaranteedToExecuteForEveryIteration` for `CallInst` and `mayThrow`.
4272	You can directly iterate over `I->users()`.
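The point about swapping the two checks (comment on line 4116) can be shown with a small toy model. This is a sketch under stated assumptions: `Kind` and `reachedEveryTime` are hypothetical stand-ins, not LLVM types or functions, and only a call is assumed to be able to stop later instructions from running.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for instructions in a loop header; not LLVM's Instruction.
enum class Kind { Plain, Call };

// Returns true if the instruction at index Target is reached on every pass
// through Block, assuming only a Call can prevent later instructions from
// running (by throwing or never returning).
bool reachedEveryTime(const std::vector<Kind> &Block, std::size_t Target) {
  for (std::size_t Idx = 0; Idx < Block.size(); ++Idx) {
    // Check for the target first: a call that is itself the target is still
    // reached, it just may not complete.
    if (Idx == Target)
      return true;
    if (Block[Idx] == Kind::Call)
      return false;
  }
  return false; // Target is not in Block.
}
```

With the checks in the opposite order, a block such as `{Plain, Call, Plain}` would report the call at index 1 as not reached, even though nothing before it can prevent its execution.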

Address Sanjoy's initial comments.

lib/Analysis/ScalarEvolution.cpp
4099	I would have preferred references, but all the other functions in this file take pointers, so I wanted to be consistent. Asserting non-null is itself not consistent with the rest of this file, so I took it out.
4114	Done.
4116	Done.
4125	Thank you. I'm curious what their thoughts are on it.
4130	I removed it.
4210	I removed it.
4263	Done.
4272	Thanks for the tip.
4284	I removed the `assert`.
4287	Done.
4307	I added a comment on why the other conditions are necessary. There may be a way around isLoopInvariant, but I'm not 100% sure about that, so I left it in.
4372	Done.

Added and used isGuaranteedToTransferExecutionToSuccessor (is there a better name?). Also slight improvement to comments.

I checked all the instructions in the langref to see if any others might also not terminate. All I found is that while the langref doesn't explicitly say so, some atomics like atomicrmw do not necessarily terminate if another thread keeps interfering. Looking at the C++14 standard, some thread is guaranteed to make progress but I could not find a statement that any given thread is guaranteed to make progress, so I made isGuaranteedToTransferExecutionToSuccessor conservative on that point.
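A conservative predicate of this kind might look roughly like the sketch below. It is a toy model over a hypothetical `Op` enum, not the function from the patch, and it covers only a few instruction categories mentioned in this review.

```cpp
#include <cassert>

// Hypothetical instruction categories for illustration; not LLVM's opcodes.
enum class Op { Add, Load, Call, Invoke, AtomicRMW };

// Conservative: answer true only when a single execution of the instruction
// certainly finishes and hands control to a successor.
bool transfersExecutionToSuccessor(Op O) {
  switch (O) {
  case Op::Call:      // callee may loop forever or throw
  case Op::Invoke:    // callee may loop forever
  case Op::AtomicRMW: // may retry indefinitely under contention
    return false;
  case Op::Add:
  case Op::Load:
    return true;
  }
  return false; // not reached for valid enumerators; stay conservative
}
```

Answering false is always safe here: it only causes the caller to give up on proving that a later instruction executes.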

In D11212#206104, @broune wrote: [...]

I think we'll end up needing some additional attribute to let us get the C++ semantics at the IR level, and we do want them, but we also need to support languages like Java where infinite loops are (sadly) well defined. Regarding C++, 1.10p27 says:

The implementation may assume that any thread will eventually do one of the following:
  terminate
  make a call to a library I/O function
  access or modify a volatile object, or
  perform a synchronization operation or an atomic operation

 [Note: This is intended to allow compiler transformations such as removal of empty loops, even
  when termination cannot be proven. — end note ]

And, thus, when we can assume C++ semantics, any thread is guaranteed to make progress, or call some external function, or access a volatile/atomic variable.

majnemer added a subscriber: majnemer. Jul 15 2015, 7:35 PM

majnemer added inline comments.

lib/Analysis/ValueTracking.cpp
3369–3370 ↗	(On Diff #29858)	Left shift by poison is poison, not UB.
3418–3419 ↗	(On Diff #29858)	Call? Invoke?

In D11212#206138, @hfinkel wrote:

In D11212#206104, @broune wrote: [...]

And, thus, when we can assume C++ semantics, any thread is guaranteed to make progress, or call some external function, or access a volatile/atomic variable.

I don't think this is relevant here -- even assuming C++ semantics [edit: unless we can prove the call to be readonly / readnone] CallInst is not guaranteed to always return -- the called function could be stalled doing an infinite number of volatile accesses for instance.

Addresses David Majnemer's comment on shl.

In D11212#206138, @hfinkel wrote: [...]

From the LLVM atomics guide, we have a pass AtomicExpandPass that e.g. can expand atomicrmw into a loop with compare-exchange. There may be a reason that such a loop always terminates, but I'm not aware of one. The expanded loop does meet the requirement that it will continually perform an atomic operation (just not successfully). If that isn't guaranteed to terminate, and AtomicExpandPass is correct in choosing that implementation for atomicrmw, then it's not clear to me that we can assume that atomicrmw terminates.
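As a source-level analogue of that expansion (a sketch only, not what AtomicExpandPass emits), an atomic add can be written as a compare-exchange loop in C++. The function name `fetchAddViaCas` is made up for this illustration.

```cpp
#include <atomic>
#include <cassert>

// Sketch of an atomic add built from a compare-exchange loop. Each failed
// compare_exchange_weak reloads Expected and retries; nothing in the loop
// itself bounds the number of retries if other threads keep modifying Obj.
int fetchAddViaCas(std::atomic<int> &Obj, int Delta) {
  int Expected = Obj.load();
  while (!Obj.compare_exchange_weak(Expected, Expected + Delta)) {
    // Expected now holds the value observed by the failed attempt.
  }
  return Expected; // value before the add, like fetch_add
}
```

Every iteration performs an atomic operation, so the loop satisfies the forward-progress assumption quoted above, yet a particular thread may still retry without bound under contention. That is the scenario the conservative treatment of atomics guards against.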

It may be that we can assume that e.g. atomicrmw always terminates, I just haven't so far been able to convince myself of that, so I decided to be conservative. I also haven't looked into the Java semantics. If you're confident that it's not an issue, I'll go with that.

lib/Analysis/ValueTracking.cpp
3369–3370 ↗	(On Diff #29858)	You're right, thank you. From the langref on shl: "If op2 is (statically or dynamically) equal to or larger than the number of bits in op1, the result is undefined." I had read that as undefined behavior, but it only says undefined, which I'm thinking just means undef. That 0 << poison would be poison makes sense to me, so I'll go with that if no one objects.
3418–3419 ↗	(On Diff #29858)	I'm not sure what the question is. Are you suggesting that there are some intrinsics that it would be good to handle? This function is conservative, so returning false for Call and Invoke is correct.

majnemer added inline comments. Jul 16 2015, 1:54 PM

lib/Analysis/ValueTracking.cpp
3418–3419 ↗	(On Diff #29928)	I misread how `propagatesFullPoison` is supposed to work. I think the name needs some work.

broune added inline comments. Jul 16 2015, 2:13 PM

lib/Analysis/ValueTracking.cpp
3418–3419 ↗	(On Diff #29928)	What thing would you like the name to be clearer about? I'm happy to change it. I could do isGuaranteedToYieldFullPoisonIfGivenFullPoison; it's clearer even if not as succinct. Any suggestions?

Second round of review comments inline.

include/llvm/Analysis/ScalarEvolution.h
587	Minor & optional: I'd call this `getNoWrapFlagsFromUB`.
include/llvm/Analysis/ValueTracking.h
313 ↗	(On Diff #29928)	Is the indentation a little bit off here? Actually, I'll just assume you'll run this change through `clang-format` before checking in, and won't make any further whitespace related comments. Please let me know if you'd prefer otherwise.
327 ↗	(On Diff #29928)	I'd prefer if you were a little more terse -- perhaps `getGuaranteedNonPoisonOp`? (IOW, which operand is guaranteed to not be poison since if it were poison then the program is undefined)
339 ↗	(On Diff #29928)	Similarly, how about calling this `isKnownNotPoison`?
lib/Analysis/ScalarEvolution.cpp
4144	I'm not sure that this is correct. I think you need to prove that the instruction you used to justify UB if `V` was poison is what needs to execute on every loop iteration. This is not a problem currently because `undefinedBehaviorIsGuaranteedIfFullPoison` only looks at a single basic block, but semantically, the following program will be problematic:

  loop_header:
    %x = add nsw %p, %q
    ...

  outside_the_loop:
    ;; no other uses of %x
    store i32 42, GEP(@global, %x)

You can conclude that the `%x` does not overflow in the last iteration, but that's it -- even if `%x` was poison in all the other iterations your program is well defined.
lib/Analysis/ValueTracking.cpp
3327 ↗	(On Diff #29928)	If you're dealing with terminator instructions here (I'm not sure that that is necessary) why is `br` okay? Can't a `br` form an infinite loop?

Thanks for the comments, Sanjoy. I'll update the code with name changes Monday.

include/llvm/Analysis/ScalarEvolution.h
587	SGTM, I'll make the change.
include/llvm/Analysis/ValueTracking.h
313 ↗	(On Diff #29928)	Sorry, I need to find a better way to set up my editor. I'll run clang-format before checking in.
327 ↗	(On Diff #29928)	I wasn't so happy with it myself and I like your suggestion. Maybe getGuaranteedNonFullPoisonOp ? I'll change it Monday.
339 ↗	(On Diff #29928)	Nice. isKnownNotFullPoison? I'm concerned that it would be too easy to miss the distinction between poison and full-poison and the bugs from that would be hard to shake out.
lib/Analysis/ScalarEvolution.cpp
4144	As you point out, UB is not guaranteed on poison in this example, so even if `undefinedBehaviorIsGuaranteedIfFullPoison` is made stronger by considering more basic blocks, it should still return false here, right?
lib/Analysis/ValueTracking.cpp
3327 ↗	(On Diff #29928)	Yes, br can form an infinite loop. This function should still return false for br, as each single execution of br does terminate and then transfers execution to its successor (even if it is its own successor), but I suspect that that's not what you're getting at with this question. :)

Zooming out a bit, what I'm really using this function for is as a component of proving that one instruction B strongly post-dominates another instruction A. Part of that is to prove that there are no infinite paths from A to B, since in such a path we'd never actually get to B, even if B (non-strongly) post-dominates A. In the general case, yes, we'd also need to take into account loops between A and B within the same CFG/function and prove that they terminate. The main reason that I'm only considering one basic block at a time is to make things simpler by avoiding the need for that, since I think that this change is already complex enough as it is. Even then, we still need to look out for single instructions where a single invocation can be enough to prevent strong post-dominance (even within a basic block), and this function serves that purpose.

You're right that this overall feature could work with a function like this that did not consider terminators. I included terminators anyway just because it gives this function a cohesive responsibility that makes it easier to think about, and it's also what would/will be needed when reasoning about strong post-dominance across basic blocks. I don't have any strong objection to taking the terminators out for now, though.

sanjoy added inline comments. Jul 19 2015, 10:23 PM

lib/Analysis/ScalarEvolution.cpp
4144	In that case why do we bother with `isGuaranteedToExecuteForEveryIteration`? IOW, why not `if (undefinedBehaviorIsGuaranteedIfFullPoison(BinOp)) return Flags;`?
lib/Analysis/ValueTracking.cpp
3327 ↗	(On Diff #29928)	I think removing terminators and `assert(!isa<Terminator>(I) && "...");` will be an improvement. If we later decide to make the algorithm more precise around control flow, then we can consider adding terminators.

sanjoy added inline comments. Jul 19 2015, 10:41 PM

lib/Analysis/ValueTracking.cpp
3327 ↗	(On Diff #29928)	On second thought, I take back what I said and agree with your reasoning -- I think the function is fine as is.

broune added inline comments. Jul 19 2015, 10:56 PM

lib/Analysis/ScalarEvolution.cpp
4144	That would prove that if BinOp is executed, then what it calculates will not wrap. There is then still the possibility that BinOp is not executed on a given iteration, in which case we have no information about wrapping of the SCEV for that iteration. Then we cannot apply the flag to the SCEV as other instructions in the loop that map to the same SCEV would then also get the flag on the shared SCEV, but we have not actually proven that the shared SCEV does not wrap.
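The concern about the shared SCEV comes from expressions being uniqued: two instructions that compute the same expression map to one node, so a flag set on behalf of one is seen by the other. The toy model below sketches that effect. `Expr` and `ExprTable` are hypothetical and are not the ScalarEvolution API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Toy model of uniqued expressions; not the ScalarEvolution API.
struct Expr {
  bool NoSignedWrap = false;
};

class ExprTable {
  std::map<std::pair<std::string, std::string>, Expr> Adds;

public:
  // Returns the single shared node for LHS + RHS, creating it on first use.
  Expr &getAdd(const std::string &LHS, const std::string &RHS) {
    return Adds[{LHS, RHS}];
  }
};
```

If a guarded instruction and an unguarded one both map to `getAdd("i", "offset")`, setting `NoSignedWrap` because of the guarded one also changes what is assumed about the unguarded one. This is why the flag may only be applied when the justifying instruction is known to execute on every iteration.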

Minor nits inline. At this point I think this change is ready to go in once the style / naming fixes are done. However:

I'd like to take a look at the final change before it goes in.
I'd also like Andy to take a look before this goes in.

Side comment and optional: have you bootstrapped clang with this change? That's a good sanity check for this sort of change. You may consider bootstrapping with ubsan / asan too to get some extra coverage: the extra control flow the sanitizers add tends to shake out a lot of bugs.

include/llvm/Analysis/ValueTracking.h
293 ↗	(On Diff #29928)	This is borderline bikeshedding, but the rest of the file specifies behavior as a verb -- `Return true if ...`
311 ↗	(On Diff #29928)	Minor: I'd just say "Note that this currently only looks at the loop header". Specifically, I'd avoid using the term "analysis" since that has a specific meaning within LLVM.
327 ↗	(On Diff #29928)	SGTM.
339 ↗	(On Diff #29928)	SGTM.
lib/Analysis/ScalarEvolution.cpp
4113	Nitpick: elsewhere you use SCEV not "scev".
4144	Ah, right. You also have a large comment explaining the very same thing -- sorry for making you repeat yourself.
4379	The `Do an operation by itself if a no-wrap flag can be applied` bit did not parse for me.
lib/Analysis/ValueTracking.cpp
3397 ↗	(On Diff #29928)	Can't you use `for (Value *V : OBO->operands())` here?

Comments addressed and ran clang-format.

In D11212#208037, @sanjoy wrote: [...]

Side comment and optional: have you bootstrapped clang with this change? That's a good sanity check for this sort of change. You may consider bootstrapping with ubsan / asan too to get some extra coverage: the extra control flow the sanitizers add tends to shake out a lot of bugs.

I'm not sure what this entails, but I made a guess: I took a release-with-asserts build of llvm, built with my patch, and used the binaries from that with cmake like so:

CXX=path/to/release/with/asserts/clang/bin/clang++ CC=path/to/release/with/asserts/clang/bin/clang++ cmake -G Ninja ../llvm -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=Yes

then I did ninja check and ninja check-asan. If that's not what you had in mind, feel free to let me know. I didn't find a guide on how to test LLVM via bootstrapping.

lib/Analysis/ScalarEvolution.cpp
4379	I clarified the comment.

Change comments to avoid referring to what this change does as "analysis".

Looks good. Nice comments.

For the goal of reassociating array address computation, I would have tried a different approach, but I think this is still valid and generally adds more power to SCEV.

I trust that Sanjoy has tried to poke enough holes in the logic. Overall, it looks pretty solid to me. Thanks for the thorough review, Sanjoy.

This revision is now accepted and ready to land. Jul 27 2015, 12:44 AM

Improve handling of case where V is a ConstantExpr in createSCEV.

Some very minor non-semantic nits inline, otherwise looks good to me. Feel free to check in as is and fix the nits in a follow up change if you find that more convenient.

include/llvm/Analysis/ValueTracking.h
293 ↗	(On Diff #30751)	Please fix this in the comments for the other functions you added as well.
lib/Analysis/ScalarEvolution.cpp
4111	Nit: change "rec" to something more descriptive, perhaps "add recurrence"?
lib/Analysis/ValueTracking.cpp
3326 ↗	(On Diff #30751)	Minor nit: `invoke`s can also throw (see https://llvm.org/bugs/show_bug.cgi?id=24185, especially https://llvm.org/bugs/show_bug.cgi?id=24185). Btw, I got the idea for testing for 24185 bug from reading this function.

Tiny update to comments.

Thank you to Sanjoy and Andy for the review.

include/llvm/Analysis/ValueTracking.h
293 ↗	(On Diff #30769)	Argh, sorry about that. Done.
lib/Analysis/ValueTracking.cpp
3326 ↗	(On Diff #30751)	As I understand the bug, an invoke could throw somewhere other than to the landingpad successor in the CFG, if the landingpad is not a match for the exception thrown, so I updated the comment. I referred to the bug you mentioned.

jingyue updated this object. Jul 28 2015, 11:22 AM

jingyue closed this revision. Jul 28 2015, 11:23 AM

jingyue mentioned this in D12016: [SeparateConstOffsetFromGEP] sext(a)+sext(b) => sext(a+b) when a+b can't sign-overflow. Aug 13 2015, 10:50 AM

jingyue mentioned this in rL245003: [SeparateConstOffsetFromGEP] sext(a)+sext(b) => sext(a+b) when a+b can't sign…. Aug 13 2015, 7:02 PM

Revision Contents

Path	Size
include/llvm/Analysis/ScalarEvolution.h	8 lines
lib/Analysis/ScalarEvolution.cpp	296 lines
test/Analysis/Delinearization/multidim_ivs_and_integer_offsets_3d.ll	2 lines
test/Analysis/Delinearization/multidim_ivs_and_parameteric_offsets_3d.ll	2 lines
test/Analysis/ScalarEvolution/flags-from-poison.ll	358 lines
test/Transforms/LoopStrengthReduce/sext-ind-var.ll	36 lines

Diff 29752

include/llvm/Analysis/ScalarEvolution.h

[559 lines not shown] private:
     /// of LHS and RHS.
     ///
     bool isKnownPredicateWithRanges(ICmpInst::Predicate Pred,
                                     const SCEV *LHS, const SCEV *RHS);

     /// forgetMemoizedResults - Drop memoized information computed for S.
     void forgetMemoizedResults(const SCEV *S);

+    /// Return an existing SCEV for V if there is one, otherwise return nullptr.
+    const SCEV *getExistingSCEV(Value *V);
+
     /// Return false iff given SCEV contains a SCEVUnknown with NULL value-
     /// pointer.
     bool checkValidity(const SCEV *S) const;

     // Return true if `ExtendOpTy`({`Start`,+,`Step`}) can be proved to be equal
     // to {`ExtendOpTy`(`Start`),+,`ExtendOpTy`(`Step`)}. This is equivalent to
     // proving no signed (resp. unsigned) wrap in {`Start`,+,`Step`} if
     // `ExtendOpTy` is `SCEVSignExtendExpr` (resp. `SCEVZeroExtendExpr`).
     //
     template<typename ExtendOpTy>
     bool proveNoWrapByVaryingStart(const SCEV *Start, const SCEV *Step,
                                    const Loop *L);

+    // Return SCEV no-wrap flags that can be proven based on reasoning
+    // about how poison produced from no-wrap flags on this value
+    // (e.g. a nuw add) would trigger undefined behavior on overflow.
+    SCEV::NoWrapFlags getNoWrapFlagsFromPoison(const Value *V);
+
   public:
     static char ID; // Pass identification, replacement for typeid
     ScalarEvolution();

     LLVMContext &getContext() const { return F->getContext(); }

     /// isSCEVable - Test if values of the given type are analyzable within
     /// the SCEV framework. This primarily includes integer types, and it
[478 lines not shown]

lib/Analysis/ScalarEvolution.cpp

[2,930 lines not shown] ScalarEvolution::getGEPExpr(Type *PointeeType, const SCEV *BaseExpr,
                             const SmallVectorImpl<const SCEV *> &IndexExprs,
                             bool InBounds) {
   // getSCEV(Base)->getType() has the same address space as Base->getType()
   // because SCEV::getType() preserves the address space.
   Type *IntPtrTy = getEffectiveSCEVType(BaseExpr->getType());
   // FIXME(PR23527): Don't blindly transfer the inbounds flag from the GEP
   // instruction to its SCEV, because the Instruction may be guarded by control
   // flow and the no-overflow bits may not be valid for the expression in any
-  // context.
+  // context. This can be fixed similarly to how these flags are handled for
+  // adds.
   SCEV::NoWrapFlags Wrap = InBounds ? SCEV::FlagNSW : SCEV::FlagAnyWrap;

   const SCEV *TotalOffset = getConstant(IntPtrTy, 0);
   // The address space is unimportant. The first thing we do on CurTy is getting
   // its element type.
   Type *CurTy = PointerType::getUnqual(PointeeType);
   for (const SCEV *IndexExpr : IndexExprs) {
     // Compute the (potentially symbolic) offset in bytes for this index.
[362 lines not shown] bool ScalarEvolution::checkValidity(const SCEV *S) const {
   return !F.FindOne;
 }

 /// getSCEV - Return an existing SCEV if it exists, otherwise analyze the
 /// expression and create a new one.
 const SCEV *ScalarEvolution::getSCEV(Value *V) {
   assert(isSCEVable(V->getType()) && "Value is not SCEVable!");

+  const SCEV *S = getExistingSCEV(V);
+  if (S == nullptr) {
+    S = createSCEV(V);
+    ValueExprMap.insert(std::make_pair(SCEVCallbackVH(V, this), S));
+  }
+  return S;
+}
+
+const SCEV *ScalarEvolution::getExistingSCEV(Value *V) {
+  assert(isSCEVable(V->getType()) && "Value is not SCEVable!");
+
   ValueExprMapType::iterator I = ValueExprMap.find_as(V);
   if (I != ValueExprMap.end()) {
     const SCEV *S = I->second;
     if (checkValidity(S))
       return S;
-    else
-      ValueExprMap.erase(I);
+    ValueExprMap.erase(I);
   }
-  const SCEV *S = createSCEV(V);
-
-  // The process of creating a SCEV for V may have caused other SCEVs
-  // to have been created, so it's necessary to insert the new entry
-  // from scratch, rather than trying to remember the insert position
-  // above.
-  ValueExprMap.insert(std::make_pair(SCEVCallbackVH(V, this), S));
-  return S;
+  return nullptr;
 }

 /// getNegativeSCEV - Return a SCEV corresponding to -V = -1*V
 ///
 const SCEV *ScalarEvolution::getNegativeSCEV(const SCEV *V) {
   if (const SCEVConstant *VC = dyn_cast<SCEVConstant>(V))
     return getConstant(
                cast<ConstantInt>(ConstantExpr::getNeg(VC->getValue())));
[741 lines not shown] if (SignHint == ScalarEvolution::HINT_RANGE_UNSIGNED) {
                       APInt::getSignedMaxValue(BitWidth).ashr(NS - 1) + 1));
     }

     return setRange(U, SignHint, ConservativeResult);
   }

   return setRange(S, SignHint, ConservativeResult);
 }


		sanjoyUnsubmitted Done Reply Inline Actions I think `isGuaranteedToExecuteForEveryIteration`, `propagatesPoison`, `getOperandThatCausesUndefinedBehaviorIfPoison` and `undefinedBehaviorIsGuaranteedIfPoison` should live in `ValueTracking.h`, given that `ValueTracking` contains similar things like `isSafeToSpeculativelyExecute`. sanjoy: I think `isGuaranteedToExecuteForEveryIteration`, `propagatesPoison`…
/// createSCEV - We know that there is no SCEV for the specified value.		/// Returns true if I is guaranteed to be executed for every iteration of L.
/// Analyze the expression.		static bool isGuaranteedToExecuteForEveryIteration(const Instruction *I,
		const Loop *L) {
		assert(I);
		sanjoyUnsubmitted Done Reply Inline Actions I don't think these asserts are adding much here. Do you think it will be cleaner to take `I` and `L` by reference instead, to indicate this invariant? sanjoy: I don't think these asserts are adding much here. Do you think it will be cleaner to take `I`…
		brouneAuthorUnsubmitted Done Reply Inline Actions I would have preferred references, but all the other functions in this file take pointers, so I wanted to be consistent. Asserting non-null is itself not consistent with the rest of this file, so I took it out. broune: I would have preferred references, but all the other functions in this file take pointers, so I…
		assert(L);

		// The loop header is guaranteed to be executed for every iteration.
		//
		// FIXME: Relax this constraint to cover all basic blocks that are
		// guaranteed to be executed at every iteration.
		if (I->getParent() != L->getHeader())
		return false;

		for (const Instruction &LI : *L->getHeader()) {
		// The called function could contain an infinite loop and therefore not
		// return. That and instructions that can throw can prevent later
		sanjoyUnsubmitted Done Reply Inline Actions Nit: change "rec" to something more descriptive, perhaps "add recurrence"? sanjoy: Nit: change "rec" to something more descriptive, perhaps "add recurrence"?
		// instructions from being executed, even if those later instructions
		// (non-strongly) post-dominate I.
		sanjoy: Nitpick: elsewhere you use SCEV not "scev".
		if (dyn_cast<CallInst>(&LI) || LI.mayThrow())
		sanjoy: Use `isa`. Also `mayThrow` implies `isa<CallInst> || isa<ResumeInst>` and since `ResumeInst` is a terminator, I think this check can just be `isa<CallInst>` with a comment that this checks for both infinite loops and throws.
		broune: Done.
		return false;
		if (&LI == I)
		sanjoy: I'd swap this check with the previous one -- otherwise you'll always return `false` for calls.
		broune: Done.
		return true;
		}
		llvm_unreachable("Instruction not contained in its own parent basic block.");
		}

		/// Returns true if Op is guaranteed to yield poison (all bits poison) if at
		/// least one of its operands are poison (all bits poison).
		///
		/// The exact rules for how poison propagates through instructions have not
		sanjoy: I agree these are fairly "obvious", but I'd like to run these by David Majnemer and Nuno Lopes to make sure that these rules are at least congruent on what they have in mind for poison's future.
		broune: Thank you. I'm curious what their thoughts are on it.
		/// been settled as of 2015-07-10, so this function is conservative and only
		/// considers poison to be propagated in uncontroversial cases and does not
		/// attempt to track values that may be only partially poison.
		static bool propagatesPoison(const Instruction *I) {
		assert(I);
		sanjoy: I don't think this `assert` adds much. It is fairly obvious that `I` is expected to be non null and if it is, we'll get a deterministic segfault in `I->getOpcode()`.
		broune: I removed it.

		switch (I->getOpcode()) {
		case Instruction::Add:
		case Instruction::Sub:
		case Instruction::Xor:
		case Instruction::Trunc:
		case Instruction::BitCast:
		case Instruction::AddrSpaceCast:
		// These operations all propagate poison unconditionally. Note that poison
		// is not any particular value, so xor or subtraction of poison with
		// itself still yields poison, not zero.
		return true;

		case Instruction::AShr:
		sanjoy: I'm not sure that this is correct. I think you need to prove that the instruction you used to justify UB if `V` was poison is what needs to execute on every loop iteration. This is not a problem currently because `undefinedBehaviorIsGuaranteedIfFullPoison` only looks at a single basic block, but semantically, the following program will be problematic:
		    loop_header:
		      %x = add nsw %p, %q
		      ...
		    outside_the_loop:
		      ;; no other uses of %x
		      store i32 42, GEP(@global, %x)
		You can conclude that the `%x` does not overflow in the last iteration, but that's it -- even if `%x` was poison in all the other iterations your program is well defined.
		broune: As you point out, UB is not guaranteed on poison in this example, so even if `undefinedBehaviorIsGuaranteedIfFullPoison` is made stronger by considering more basic blocks, it should still return false here, right?
		sanjoy: In that case why do we bother with `isGuaranteedToExecuteForEveryIteration`? IOW, why not `if (undefinedBehaviorIsGuaranteedIfFullPoison(BinOp)) return Flags;`
		broune: That would prove that if BinOp is executed, then what it calculates will not wrap. There is then still the possibility that BinOp is not executed on a given iteration, in which case we have no information about wrapping of the SCEV for that iteration. Then we cannot apply the flag to the SCEV as other instructions in the loop that map to the same SCEV would then also get the flag on the shared SCEV, but we have not actually proven that the shared SCEV does not wrap.
		sanjoy: Ah, right. You also have a large comment explaining the very same thing -- sorry for making you repeat yourself.
		case Instruction::SExt:
		// For these operations, one bit of the input is replicated across
		// multiple output bits. A replicated poison bit is still poison.
		return true;

		case Instruction::Shl: {
		// Left shift by a poison value is undefined behavior, so we can assume
		// that that does not happen.
		//
		// The number of positions to shift is unsigned, so no negative values are
		// possible there. Left shift by zero places preserves poison. So we only
		// need to consider left shift by a positive number of places.
		//
		// A left shift by a positive number of places leaves the lowest order bit
		// non-poisoned. However, if such a shift has a no-wrap flag, then we can
		// make the poison operand violate that flag, yielding a fresh full-poison
		// value.
		auto *OBO = cast<OverflowingBinaryOperator>(I);
		return OBO->hasNoUnsignedWrap() || OBO->hasNoSignedWrap();
		}

		case Instruction::Mul: {
		// A multiplication by zero yields a non-poison zero result, so we need to
		// rule out zero as an operand. Conservatively, multiplication by a
		// non-zero constant is not multiplication by zero.
		//
		// Multiplication by a non-zero constant can leave some bits
		// non-poisoned. For example, a multiplication by 2 leaves the lowest
		// order bit unpoisoned. So we need to consider that.
		//
		// Multiplication by 1 preserves poison. If the multiplication has a
		// no-wrap flag, then we can make the poison operand violate that flag
		// when multiplied by any integer other than 0 and 1.
		auto *OBO = cast<OverflowingBinaryOperator>(I);
		if (OBO->hasNoUnsignedWrap() || OBO->hasNoSignedWrap()) {
		for (int OpIndex = 0; OpIndex < 2; ++OpIndex) {
		if (auto *CI = dyn_cast<ConstantInt>(OBO->getOperand(OpIndex))) {
		// A ConstantInt cannot yield poison, so we can assume that it is
		// the other operand that is poison.
		return !CI->isZero();
		}
		}
		}
		return false;
		}

		case Instruction::GetElementPtr:
		// A GEP implicitly represents a sequence of additions, subtractions,
		// truncations, sign extensions and multiplications. The multiplications
		// are by the non-zero sizes of some set of types, so we do not have to be
		// concerned with multiplication by zero. If the GEP is in-bounds, then
		// these operations are implicitly no-signed-wrap so poison is propagated
		// by the arguments above for Add, Sub, Trunc, SExt and Mul.
		return cast<GEPOperator>(I)->isInBounds();

		default:
		return false;
		}
		}
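The comments in `propagatesPoison` reason about individual poison bits even though the function itself only tracks full poison. As an illustration of that reasoning only, here is a minimal sketch of a toy bit-level model. `PoisonMask`, `shlMask` and `sext16To32Mask` are hypothetical names and are not part of LLVM or of this patch; a set bit in the mask stands for a poisoned bit.

```cpp
#include <cassert>
#include <cstdint>

// Toy model, not LLVM API: bit k of the mask is set when bit k of the
// modelled 32-bit value is poison.
using PoisonMask = std::uint32_t;

constexpr PoisonMask kFullPoison = 0xFFFFFFFFu;

// shl without nuw/nsw: poison bits move left and known zero bits are
// shifted in, so the low `amount` bits of the result are not poison.
// Oversized shifts are simply treated as full poison in this toy model.
constexpr PoisonMask shlMask(PoisonMask mask, unsigned amount) {
  return amount >= 32 ? kFullPoison : (mask << amount);
}

// sext from 16 to 32 bits: the sign bit is replicated into the upper half,
// so a poisoned sign bit poisons every bit it is copied into.
constexpr PoisonMask sext16To32Mask(std::uint16_t mask) {
  return static_cast<PoisonMask>(mask) |
         ((mask & 0x8000u) ? 0xFFFF0000u : 0u);
}
```

Under this model a flag-free `shl` by one place turns a fully poisoned input into a value whose lowest bit is not poison, which is why the patch requires a no-wrap flag before treating `Shl` (and `Mul` by a constant) as propagating full poison.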

		/// Returns either nullptr or an operand of I such that I will trigger
		/// undefined behavior if I is executed and that operand has a poison value
		/// (all bits poison).
		static const Value *
		getOperandThatCausesUndefinedBehaviorIfPoison(const Instruction *I) {
		assert(I);
		sanjoy: Same comment as above about nullness of `I`.
		broune: I removed it.

		switch (I->getOpcode()) {
		case Instruction::Store:
		return cast<StoreInst>(I)->getPointerOperand();

		case Instruction::Load:
		return cast<LoadInst>(I)->getPointerOperand();

		case Instruction::AtomicCmpXchg:
		return cast<AtomicCmpXchgInst>(I)->getPointerOperand();

		case Instruction::AtomicRMW:
		return cast<AtomicRMWInst>(I)->getPointerOperand();

		case Instruction::UDiv:
		case Instruction::SDiv:
		case Instruction::URem:
		case Instruction::SRem:
		return I->getOperand(1);

		default:
		return nullptr;
		}
		}

		// Returns true if this function can prove that if PoisonI is executed and
		// yields a poison value, then that will trigger undefined behavior.
		static bool undefinedBehaviorIsGuaranteedIfPoison(const Instruction *PoisonI) {
		// We currently only look for uses of poison values within the same basic
		// block, as that makes it easier to guarantee that the uses will be
		// executed.
		//
		// FIXME: Expand this to consider uses beyond the same basic block. To do
		// this, look out for the distinction between post-dominance and strong
		// post-dominance.
		const BasicBlock *BB = PoisonI->getParent();

		// Set of instructions that we have proved will yield poison if PoisonI
		// does.
		SmallSet<const Value *, 16> YieldsPoison;
		YieldsPoison.insert(PoisonI);

		for (const Instruction *I = PoisonI, *E = BB->end();
		I != E; I = I->getNextNode()) {

		// The called function could contain an infinite loop and therefore not
		// return. That and instructions that can throw can prevent later
		// instructions from being executed, even if those later instructions
		// (non-strongly) post-dominate I.
		//
		// PoisonI is assumed to yield poison, which implies that it did terminate
		// and did not throw.
		if ((isa<CallInst>(I) || I->mayThrow()) && I != PoisonI)
		sanjoy: Same comment as in `isGuaranteedToExecuteForEveryIteration` for `CallInst` and `mayThrow`.
		broune: Done.
		return false;

		const Value *UBIfPoison = getOperandThatCausesUndefinedBehaviorIfPoison(I);
		if (UBIfPoison != nullptr && YieldsPoison.count(UBIfPoison))
		return true;

		// Mark poison that propagates from I through uses of I.
		if (YieldsPoison.count(I)) {
		for (const Use &U : I->uses()) {
		sanjoy: You can directly iterate over `I->users()`.
		broune: Thanks for the tip.
		const Instruction *User = cast<Instruction>(U.getUser());
		if (User->getParent() == BB && propagatesPoison(User))
		YieldsPoison.insert(User);
		}
		}
		}
		return false;
		}
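The scan performed by `undefinedBehaviorIsGuaranteedIfPoison` can be sketched on a toy instruction list. This is only an illustration under simplifying assumptions: `Kind`, `Inst` and `ubGuaranteedIfPoison` are hypothetical names rather than LLVM classes, operands are indices into the same block, a load's pointer is its operand 0, and poison is propagated by checking each instruction's operands instead of walking the users of a poisoned instruction (equivalent here because definitions precede uses within one block).

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical mini-IR for a single basic block, not LLVM's classes.
enum class Kind { Add, GepInBounds, Gep, Call, Load, Other };

struct Inst {
  Kind kind;
  std::vector<std::size_t> operands; // indices of earlier instructions
};

// Conservative stand-in for propagatesPoison: only a few kinds qualify.
inline bool propagates(Kind k) {
  return k == Kind::Add || k == Kind::GepInBounds;
}

// Returns true if poison produced by block[start] is guaranteed to reach
// a load address later in the same block.
inline bool ubGuaranteedIfPoison(const std::vector<Inst> &block,
                                 std::size_t start) {
  std::set<std::size_t> yieldsPoison{start};
  for (std::size_t i = start; i < block.size(); ++i) {
    const Inst &inst = block[i];
    // A call might never return, so later instructions may not execute.
    if (inst.kind == Kind::Call && i != start)
      return false;
    // Loading through a poison address is undefined behavior.
    if (inst.kind == Kind::Load && !inst.operands.empty() &&
        yieldsPoison.count(inst.operands[0]))
      return true;
    // Mark this instruction if it forwards poison from an operand.
    if (propagates(inst.kind))
      for (std::size_t op : inst.operands)
        if (yieldsPoison.count(op)) {
          yieldsPoison.insert(i);
          break;
        }
  }
  return false;
}
```

An add feeding an in-bounds GEP feeding a load is accepted, while an intervening call or a GEP without `inbounds` makes the sketch give up, loosely mirroring the `test-add-nsw`, `test-add-call2` and `test-add-no-inbounds` cases in the new test file.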

		SCEV::NoWrapFlags
		ScalarEvolution::getNoWrapFlagsFromPoison(const Value *V) {
		assert(V);
		sanjoy: Same comment as earlier -- I don't think this `assert` adds much value. You could take `Value &V` if you really cared about not passing in `nullptr`, but if I were you I'd just remove the `assert`.
		broune: I removed the `assert`.
		const BinaryOperator *BinOp = cast<BinaryOperator>(V);

		// Return early if there are no flags to propagae to the SCEV.
		sanjoy: Nit: propagate.
		broune: Done.
		SCEV::NoWrapFlags Flags = SCEV::FlagAnyWrap;
		if (BinOp->hasNoUnsignedWrap())
		Flags = ScalarEvolution::setFlags(Flags, SCEV::FlagNUW);
		if (BinOp->hasNoSignedWrap())
		Flags = ScalarEvolution::setFlags(Flags, SCEV::FlagNSW);
		if (Flags == SCEV::FlagAnyWrap) {
		return SCEV::FlagAnyWrap;
		}

		// Here we check that BinOp is in the header of the innermost loop
		// containing BinOp, since we only deal with instructions in the loop
		// header. The actual loop we need to check later will come from a rec, but
		// getting that requires computing the scev of the operands, which can be
		// expensive. This check we can do cheaply to rule out some cases early.
		Loop *innermostContainingLoop = LI->getLoopFor(BinOp->getParent());
		if (innermostContainingLoop == nullptr ||
		innermostContainingLoop->getHeader() != BinOp->getParent())
		return SCEV::FlagAnyWrap;

		if (!undefinedBehaviorIsGuaranteedIfPoison(BinOp))
		sanjoy: Question: why isn't `undefinedBehaviorIsGuaranteedIfPoison` enough? IOW, why can't we have `if (undefinedBehaviorIsGuaranteedIfPoison(BinOp)) return Flags; else return SCEV::FlagAnyWrap;`
		broune: I added a comment on why the other conditions are necessary. There may be a way around isLoopInvariant, but I'm not 100% sure about that, so I left it in.
		return SCEV::FlagAnyWrap;

		for (int OpIndex = 0; OpIndex < 2; ++OpIndex) {
		const SCEV *Op = getSCEV(BinOp->getOperand(OpIndex));
		if (auto AddRec = dyn_cast<SCEVAddRecExpr>(Op)) {
		sanjoy: LLVM convention is `auto *AddRec`.
		const int OtherOpIndex = 1 - OpIndex;
		const SCEV *OtherOp = getSCEV(BinOp->getOperand(OtherOpIndex));
		if (isLoopInvariant(OtherOp, AddRec->getLoop()) &&
		isGuaranteedToExecuteForEveryIteration(BinOp, AddRec->getLoop()))
		return Flags;
		}
		}
		return SCEV::FlagAnyWrap;
		}

		/// createSCEV - We know that there is no SCEV for the specified value. Analyze
		/// the expression.
///		///
const SCEV *ScalarEvolution::createSCEV(Value *V) {		const SCEV *ScalarEvolution::createSCEV(Value *V) {
if (!isSCEVable(V->getType()))		if (!isSCEVable(V->getType()))
return getUnknown(V);		return getUnknown(V);

unsigned Opcode = Instruction::UserOp1;		unsigned Opcode = Instruction::UserOp1;
if (Instruction *I = dyn_cast<Instruction>(V)) {		if (Instruction *I = dyn_cast<Instruction>(V)) {
Opcode = I->getOpcode();		Opcode = I->getOpcode();
(20 unchanged lines not shown)
case Instruction::Add: {		case Instruction::Add: {
// The simple thing to do would be to just call getSCEV on both operands		// The simple thing to do would be to just call getSCEV on both operands
// and call getAddExpr with the result. However if we're looking at a		// and call getAddExpr with the result. However if we're looking at a
// bunch of things all added together, this can be quite inefficient,		// bunch of things all added together, this can be quite inefficient,
// because it leads to N-1 getAddExpr calls for N ultimate operands.		// because it leads to N-1 getAddExpr calls for N ultimate operands.
// Instead, gather up all the operands and make a single getAddExpr call.		// Instead, gather up all the operands and make a single getAddExpr call.
// LLVM IR canonical form means we need only traverse the left operands.		// LLVM IR canonical form means we need only traverse the left operands.
//		//
// Don't apply this instruction's NSW or NUW flags to the new		// FIXME: Expand this handling of NSW and NUW to other instructions, like
// expression. The instruction may be guarded by control flow that the		// sub and mul.
// no-wrap behavior depends on. Non-control-equivalent instructions can be
// mapped to the same SCEV expression, and it would be incorrect to transfer
// NSW/NUW semantics to those operations.
SmallVector<const SCEV *, 4> AddOps;		SmallVector<const SCEV *, 4> AddOps;
AddOps.push_back(getSCEV(U->getOperand(1)));		for (Value *Op = U; ; Op = U->getOperand(0)) {
for (Value *Op = U->getOperand(0); ; Op = U->getOperand(0)) {
unsigned Opcode = Op->getValueID() - Value::InstructionVal;		unsigned Opcode = Op->getValueID() - Value::InstructionVal;
if (Opcode != Instruction::Add && Opcode != Instruction::Sub)		if (Opcode != Instruction::Add && Opcode != Instruction::Sub) {
		assert(Op != V && "V should be an add");
		AddOps.push_back(getSCEV(Op));
break;		break;
		}

		if (auto OpSCEV = getExistingSCEV(Op)) {
		sanjoy: LLVM convention is `auto *`.
		broune: Done.
		AddOps.push_back(OpSCEV);
		break;
		}

U = cast<Operator>(Op);		U = cast<Operator>(Op);

		// Do an operation by itself if a no-wrap flag can be applied, since the
		sanjoy: The `Do an operation by itself if a no-wrap flag can be applied` bit did not parse for me.
		broune: I clarified the comment.
		// flag only applies to that particular operation.
		//
		// FIXME: Expand this to sub instructions.
		if (Opcode == Instruction::Add) {
		SCEV::NoWrapFlags Flags = getNoWrapFlagsFromPoison(U);
		if (Flags != SCEV::FlagAnyWrap) {
		AddOps.push_back(getAddExpr(getSCEV(U->getOperand(0)),
		getSCEV(U->getOperand(1)), Flags));
		break;
		}
		}

const SCEV *Op1 = getSCEV(U->getOperand(1));		const SCEV *Op1 = getSCEV(U->getOperand(1));
if (Opcode == Instruction::Sub)		if (Opcode == Instruction::Sub)
AddOps.push_back(getNegativeSCEV(Op1));		AddOps.push_back(getNegativeSCEV(Op1));
else		else
AddOps.push_back(Op1);		AddOps.push_back(Op1);
}		}
AddOps.push_back(getSCEV(U->getOperand(0)));
return getAddExpr(AddOps);		return getAddExpr(AddOps);
}		}

case Instruction::Mul: {		case Instruction::Mul: {
// Don't transfer NSW/NUW for the same reason as AddExpr.		// FIXME: Transfer NSW/NUW as in AddExpr.
SmallVector<const SCEV *, 4> MulOps;		SmallVector<const SCEV *, 4> MulOps;
MulOps.push_back(getSCEV(U->getOperand(1)));		MulOps.push_back(getSCEV(U->getOperand(1)));
for (Value *Op = U->getOperand(0);		for (Value *Op = U->getOperand(0);
Op->getValueID() == Instruction::Mul + Value::InstructionVal;		Op->getValueID() == Instruction::Mul + Value::InstructionVal;
Op = U->getOperand(0)) {		Op = U->getOperand(0)) {
U = cast<Operator>(Op);		U = cast<Operator>(Op);
MulOps.push_back(getSCEV(U->getOperand(1)));		MulOps.push_back(getSCEV(U->getOperand(1)));
}		}
(4,374 unchanged lines not shown)

test/Analysis/Delinearization/multidim_ivs_and_integer_offsets_3d.ll

	; RUN: opt < %s -analyze -delinearize | FileCheck %s			; RUN: opt < %s -analyze -delinearize | FileCheck %s

	; void foo(long n, long m, long o, double A[n][m][o]) {			; void foo(long n, long m, long o, double A[n][m][o]) {
	;			;
	; for (long i = 0; i < n; i++)			; for (long i = 0; i < n; i++)
	; for (long j = 0; j < m; j++)			; for (long j = 0; j < m; j++)
	; for (long k = 0; k < o; k++)			; for (long k = 0; k < o; k++)
	; A[i+3][j-4][k+7] = 1.0;			; A[i+3][j-4][k+7] = 1.0;
	; }			; }

	; AddRec: {{{(56 + (8 * (-4 + (3 * %m)) * %o) + %A),+,(8 * %m * %o)}<%for.i>,+,(8 * %o)}<%for.j>,+,8}<%for.k>			; AddRec: {{{(56 + (8 * (-4 + (3 * %m)) * %o) + %A),+,(8 * %m * %o)}<%for.i>,+,(8 * %o)}<%for.j>,+,8}<%for.k>
	; CHECK: Base offset: %A			; CHECK: Base offset: %A
	; CHECK: ArrayDecl[UnknownSize][%m][%o] with elements of 8 bytes.			; CHECK: ArrayDecl[UnknownSize][%m][%o] with elements of 8 bytes.
	; CHECK: ArrayRef[{3,+,1}<nw><%for.i>][{-4,+,1}<nw><%for.j>][{7,+,1}<nw><%for.k>]			; CHECK: ArrayRef[{3,+,1}<nw><%for.i>][{-4,+,1}<nw><%for.j>][{7,+,1}<nuw><nsw><%for.k>]

	define void @foo(i64 %n, i64 %m, i64 %o, double* %A) {			define void @foo(i64 %n, i64 %m, i64 %o, double* %A) {
	entry:			entry:
	br label %for.i			br label %for.i

	for.i:			for.i:
	%i = phi i64 [ 0, %entry ], [ %i.inc, %for.i.inc ]			%i = phi i64 [ 0, %entry ], [ %i.inc, %for.i.inc ]
	br label %for.j			br label %for.j
	(36 unchanged lines not shown)

test/Analysis/Delinearization/multidim_ivs_and_parameteric_offsets_3d.ll

	; RUN: opt < %s -analyze -delinearize | FileCheck %s			; RUN: opt < %s -analyze -delinearize | FileCheck %s

	; void foo(long n, long m, long o, double A[n][m][o], long p, long q, long r) {			; void foo(long n, long m, long o, double A[n][m][o], long p, long q, long r) {
	;			;
	; for (long i = 0; i < n; i++)			; for (long i = 0; i < n; i++)
	; for (long j = 0; j < m; j++)			; for (long j = 0; j < m; j++)
	; for (long k = 0; k < o; k++)			; for (long k = 0; k < o; k++)
	; A[i+p][j+q][k+r] = 1.0;			; A[i+p][j+q][k+r] = 1.0;
	; }			; }

	; AddRec: {{{((8 * ((((%m * %p) + %q) * %o) + %r)) + %A),+,(8 * %m * %o)}<%for.i>,+,(8 * %o)}<%for.j>,+,8}<%for.k>			; AddRec: {{{((8 * ((((%m * %p) + %q) * %o) + %r)) + %A),+,(8 * %m * %o)}<%for.i>,+,(8 * %o)}<%for.j>,+,8}<%for.k>
	; CHECK: Base offset: %A			; CHECK: Base offset: %A
	; CHECK: ArrayDecl[UnknownSize][%m][%o] with elements of 8 bytes.			; CHECK: ArrayDecl[UnknownSize][%m][%o] with elements of 8 bytes.
	; CHECK: ArrayRef[{%p,+,1}<nw><%for.i>][{%q,+,1}<nw><%for.j>][{%r,+,1}<nw><%for.k>]			; CHECK: ArrayRef[{%p,+,1}<nw><%for.i>][{%q,+,1}<nw><%for.j>][{%r,+,1}<nsw><%for.k>]

	define void @foo(i64 %n, i64 %m, i64 %o, double* %A, i64 %p, i64 %q, i64 %r) {			define void @foo(i64 %n, i64 %m, i64 %o, double* %A, i64 %p, i64 %q, i64 %r) {
	entry:			entry:
	br label %for.i			br label %for.i

	for.i:			for.i:
	%i = phi i64 [ 0, %entry ], [ %i.inc, %for.i.inc ]			%i = phi i64 [ 0, %entry ], [ %i.inc, %for.i.inc ]
	br label %for.j			br label %for.j
	(36 unchanged lines not shown)

test/Analysis/ScalarEvolution/flags-from-poison.ll

This file was added.

				; RUN: opt < %s -S -analyze -scalar-evolution | FileCheck %s

				; Positive and negative tests for inferring flags like nsw from
				; reasoning about how a poison value from overflow would trigger
				; undefined behavior.

				define void @foo() {
				ret void
				}

				; Example where an add should get the nsw flag, so that a sext can be
				; distributed over the add.
				define void @test-add-nsw(float* %input, i32 %offset, i32 %numIterations) {
				; CHECK-LABEL: @test-add-nsw
				entry:
				br label %loop
				loop:
				%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]

				; CHECK: %index32 =
				; CHECK: --> {%offset,+,1}<nsw>
				%index32 = add nsw i32 %i, %offset

				; CHECK: %index64 =
				; CHECK: --> {(sext i32 %offset to i64),+,1}<nsw>
				%index64 = sext i32 %index32 to i64

				%ptr = getelementptr inbounds float, float* %input, i64 %index64
				%nexti = add nsw i32 %i, 1
				%f = load float, float* %ptr, align 4
				call void @foo()
				%exitcond = icmp eq i32 %nexti, %numIterations
				br i1 %exitcond, label %exit, label %loop
				exit:
				ret void
				}
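The reason `nsw` matters for `test-add-nsw` is that sign extension distributes over an add only when the narrow add does not wrap. A minimal arithmetic sketch of that fact, using hypothetical helper names (`wrappingAdd32`, `sextDistributes`) that do not appear in the patch:

```cpp
#include <cassert>
#include <cstdint>

// 32-bit add with two's-complement wrap-around. The sum is computed in
// unsigned arithmetic to avoid signed-overflow UB in C++; the conversion
// back to int32_t is modular on all mainstream compilers (and by
// definition since C++20).
inline std::int32_t wrappingAdd32(std::int32_t a, std::int32_t b) {
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(a) +
                                   static_cast<std::uint32_t>(b));
}

// True when sext(i + offset) equals sext(i) + sext(offset), i.e. when
// widening may be moved from the result of the add onto its operands.
inline bool sextDistributes(std::int32_t i, std::int32_t offset) {
  std::int64_t addThenWiden = wrappingAdd32(i, offset);
  std::int64_t widenThenAdd =
      static_cast<std::int64_t>(i) + static_cast<std::int64_t>(offset);
  return addThenWiden == widenThenAdd;
}
```

The identity holds for ordinary inputs and fails exactly when the 32-bit add wraps, for example `INT32_MAX + 1`. Proving that such wrapping would lead to undefined behavior is what lets SCEV rewrite the `sext` of `{%offset,+,1}` as `{(sext i32 %offset to i64),+,1}<nsw>`.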

				; Example where an add should get the nuw flag.
				define void @test-add-nuw(float* %input, i32 %offset, i32 %numIterations) {
				; CHECK-LABEL: @test-add-nuw
				entry:
				br label %loop
				loop:
				%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]

				; CHECK: %index32 =
				; CHECK: --> {%offset,+,1}<nuw>
				%index32 = add nuw i32 %i, %offset

				%ptr = getelementptr inbounds float, float* %input, i32 %index32
				%nexti = add nuw i32 %i, 1
				%f = load float, float* %ptr, align 4
				%exitcond = icmp eq i32 %nexti, %numIterations
				br i1 %exitcond, label %exit, label %loop

				exit:
				ret void
				}

				; With no load to trigger UB from poison, we cannot infer nsw.
				define void @test-add-no-load(float* %input, i32 %offset, i32 %numIterations) {
				; CHECK-LABEL: @test-add-no-load
				entry:
				br label %loop
				loop:
				%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]

				; CHECK: %index32 =
				; CHECK: --> {%offset,+,1}<nw>
				%index32 = add nsw i32 %i, %offset

				%ptr = getelementptr inbounds float, float* %input, i32 %index32
				%nexti = add nuw i32 %i, 1
				%exitcond = icmp eq i32 %nexti, %numIterations
				br i1 %exitcond, label %exit, label %loop

				exit:
				ret void
				}

				; The current analysis is only supposed to look at the loop header, so
				; it should not infer nsw in this case, as that would require looking
				; outside the loop header.
				define void @test-add-not-header(float* %input, i32 %offset, i32 %numIterations) {
				; CHECK-LABEL: @test-add-not-header
				entry:
				br label %loop
				loop:
				%i = phi i32 [ %nexti, %loop2 ], [ 0, %entry ]
				br label %loop2
				loop2:

				; CHECK: %index32 =
				; CHECK: --> {%offset,+,1}<nw>
				%index32 = add nsw i32 %i, %offset

				%ptr = getelementptr inbounds float, float* %input, i32 %index32
				%nexti = add nsw i32 %i, 1
				%f = load float, float* %ptr, align 4
				%exitcond = icmp eq i32 %nexti, %numIterations
				br i1 %exitcond, label %exit, label %loop
				exit:
				ret void
				}

				; Same thing as test-add-not-header, but in this case only the load
				; instruction is outside the loop header.
				define void @test-add-not-header2(float* %input, i32 %offset, i32 %numIterations) {
				; CHECK-LABEL: @test-add-not-header2
				entry:
				br label %loop
				loop:
				%i = phi i32 [ %nexti, %loop2 ], [ 0, %entry ]

				; CHECK: %index32 =
				; CHECK: --> {%offset,+,1}<nw>
				%index32 = add nsw i32 %i, %offset

				%ptr = getelementptr inbounds float, float* %input, i32 %index32
				%nexti = add nsw i32 %i, 1
				br label %loop2
				loop2:
				%f = load float, float* %ptr, align 4
				%exitcond = icmp eq i32 %nexti, %numIterations
				br i1 %exitcond, label %exit, label %loop
				exit:
				ret void
				}

				; The call instruction makes it not guaranteed that the add will be
				; executed, since it could run forever or throw an exception, so we
				; cannot assume that the UB is realized.
				define void @test-add-call(float* %input, i32 %offset, i32 %numIterations) {
				; CHECK-LABEL: @test-add-call
				entry:
				br label %loop
				loop:
				%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]

				; CHECK: %index32 =
				; CHECK: --> {%offset,+,1}<nw>
				call void @foo()
				%index32 = add nsw i32 %i, %offset

				%ptr = getelementptr inbounds float, float* %input, i32 %index32
				%nexti = add nsw i32 %i, 1
				%f = load float, float* %ptr, align 4
				%exitcond = icmp eq i32 %nexti, %numIterations
				br i1 %exitcond, label %exit, label %loop
				exit:
				ret void
				}

				; Same issue as test-add-call, but this time the call is between the
				; producer of poison and the load that consumes it.
				define void @test-add-call2(float* %input, i32 %offset, i32 %numIterations) {
				; CHECK-LABEL: @test-add-call2
				entry:
				br label %loop
				loop:
				%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]

				; CHECK: %index32 =
				; CHECK: --> {%offset,+,1}<nw>
				%index32 = add nsw i32 %i, %offset

				%ptr = getelementptr inbounds float, float* %input, i32 %index32
				%nexti = add nsw i32 %i, 1
				call void @foo()
				%f = load float, float* %ptr, align 4
				%exitcond = icmp eq i32 %nexti, %numIterations
				br i1 %exitcond, label %exit, label %loop
				exit:
				ret void
				}

				; Without inbounds, GEP does not propagate poison in the very
				; conservative analysis used here.
				define void @test-add-no-inbounds(float* %input, i32 %offset, i32 %numIterations) {
				; CHECK-LABEL: @test-add-no-inbounds
				entry:
				br label %loop
				loop:
				%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]

				; CHECK: %index32 =
				; CHECK: --> {%offset,+,1}<nw>
				%index32 = add nsw i32 %i, %offset

				%ptr = getelementptr float, float* %input, i32 %index32
				%nexti = add nsw i32 %i, 1
				%f = load float, float* %ptr, align 4
				%exitcond = icmp eq i32 %nexti, %numIterations
				br i1 %exitcond, label %exit, label %loop
				exit:
				ret void
				}

				; Multiplication by a non-zero constant propagates poison if there is
				; a nuw or nsw flag on the multiplication.
				define void @test-add-mul-propagates(float* %input, i32 %offset, i32 %numIterations) {
				; CHECK-LABEL: @test-add-mul-propagates
				entry:
				br label %loop
				loop:
				%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]

				; CHECK: %index32 =
				; CHECK: --> {%offset,+,1}<nsw>
				%index32 = add nsw i32 %i, %offset

				%indexmul = mul nuw i32 %index32, 2
				%ptr = getelementptr inbounds float, float* %input, i32 %indexmul
				%nexti = add nsw i32 %i, 1
				%f = load float, float* %ptr, align 4
				%exitcond = icmp eq i32 %nexti, %numIterations
				br i1 %exitcond, label %exit, label %loop
				exit:
				ret void
				}

				; Multiplication by a non-constant should not propagate poison in the
				; very conservative analysis used here.
				define void @test-add-mul-no-propagation(float* %input, i32 %offset, i32 %numIterations) {
				; CHECK-LABEL: @test-add-mul-no-propagation
				entry:
				br label %loop
				loop:
				%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]

				; CHECK: %index32 =
				; CHECK: --> {%offset,+,1}<nw>
				%index32 = add nsw i32 %i, %offset

				%indexmul = mul nsw i32 %index32, %offset
				%ptr = getelementptr inbounds float, float* %input, i32 %indexmul
				%nexti = add nsw i32 %i, 1
				%f = load float, float* %ptr, align 4
				%exitcond = icmp eq i32 %nexti, %numIterations
				br i1 %exitcond, label %exit, label %loop
				exit:
				ret void
				}

				; Multiplication by a non-zero constant does not propagate poison
				; without a no-wrap flag.
				define void @test-add-mul-no-propagation2(float* %input, i32 %offset, i32 %numIterations) {
				; CHECK-LABEL: @test-add-mul-no-propagation2
				entry:
				br label %loop
				loop:
				%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]

				; CHECK: %index32 =
				; CHECK: --> {%offset,+,1}<nw>
				%index32 = add nsw i32 %i, %offset

				%indexmul = mul i32 %index32, 2
				%ptr = getelementptr inbounds float, float* %input, i32 %indexmul
				%nexti = add nsw i32 %i, 1
				%f = load float, float* %ptr, align 4
				%exitcond = icmp eq i32 %nexti, %numIterations
				br i1 %exitcond, label %exit, label %loop
				exit:
				ret void
				}

				; Division by poison triggers UB.
				define void @test-add-div(float* %input, i32 %offset, i32 %numIterations) {
				; CHECK-LABEL: @test-add-div
				entry:
				br label %loop
				loop:
				%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]

				; CHECK: %j =
				; CHECK: --> {%offset,+,1}<nsw>
				%j = add nsw i32 %i, %offset

				%q = sdiv i32 %numIterations, %j
				%nexti = add nsw i32 %i, 1
				%exitcond = icmp eq i32 %nexti, %numIterations
				br i1 %exitcond, label %exit, label %loop
				exit:
				ret void
				}

				; Remainder of poison by non-poison divisor does not trigger UB.
				define void @test-add-div2(float* %input, i32 %offset, i32 %numIterations) {
				; CHECK-LABEL: @test-add-div2
				entry:
				br label %loop
				loop:
				%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]

				; CHECK: %j =
				; CHECK: --> {%offset,+,1}<nw>
				%j = add nsw i32 %i, %offset

				%q = sdiv i32 %j, %numIterations
				%nexti = add nsw i32 %i, 1
				%exitcond = icmp eq i32 %nexti, %numIterations
				br i1 %exitcond, label %exit, label %loop
				exit:
				ret void
				}

				; Store to poison address triggers UB.
				define void @test-add-store(float* %input, i32 %offset, i32 %numIterations) {
				; CHECK-LABEL: @test-add-store
				entry:
				br label %loop
				loop:
				%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]

				; CHECK: %index32 =
				; CHECK: --> {%offset,+,1}<nsw>
				%index32 = add nsw i32 %i, %offset

				%ptr = getelementptr inbounds float, float* %input, i32 %index32
				%nexti = add nsw i32 %i, 1
				store float 1.0, float* %ptr, align 4
				%exitcond = icmp eq i32 %nexti, %numIterations
				br i1 %exitcond, label %exit, label %loop
				exit:
				ret void
				}

				; Three sequential adds where the middle add should have nsw. There is
				; a special case for sequential adds and this test covers that. We have to
				; put the final add first in the program since otherwise the special case
				; is not triggered, hence the strange basic block ordering.
				define void @test-add-twice(float* %input, i32 %offset, i32 %numIterations) {
				; CHECK-LABEL: @test-add-twice
				entry:
				br label %loop
				loop2:
				; CHECK: %seq =
				; CHECK: --> {(2 + %offset),+,1}<nw>
				%seq = add nsw nuw i32 %index32, 1
				%exitcond = icmp eq i32 %nexti, %numIterations
				br i1 %exitcond, label %exit, label %loop

				loop:
				%i = phi i32 [ %nexti, %loop2 ], [ 0, %entry ]

				%j = add nsw i32 %i, 1
				; CHECK: %index32 =
				; CHECK: --> {(1 + %offset),+,1}<nsw>
				%index32 = add nsw i32 %j, %offset

				%ptr = getelementptr inbounds float, float* %input, i32 %index32
				%nexti = add nsw i32 %i, 1
				store float 1.0, float* %ptr, align 4
				br label %loop2
				exit:
				ret void
				}
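
The div and store tests above come down to one question: does the instruction that consumes the `nsw` add trigger undefined behavior when that operand is poison? The Python sketch below restates only the three cases visible in these tests. The table `UB_ON_POISON` and the function `may_keep_nsw` are hypothetical names made up for illustration. They are not part of ScalarEvolution, and the patch's actual analysis is not shown here.

```python
# Hypothetical summary of the cases exercised by the tests above.
# Each entry is (opcode, role of the poison operand) for which the tests
# expect the <nsw> flag to be transferred to the SCEV.
UB_ON_POISON = {
    ("sdiv", "divisor"),   # test-add-div: division by poison triggers UB
    ("store", "address"),  # test-add-store: store to poison address triggers UB
}


def may_keep_nsw(opcode: str, role: str) -> bool:
    """Return True if poison in this operand position is treated as UB,
    so an nsw flag on the producing add may be kept on the SCEV."""
    return (opcode, role) in UB_ON_POISON


# test-add-div2: poison as the dividend does not trigger UB, so the
# expected addrec is only <nw>, not <nsw>.
print(may_keep_nsw("sdiv", "dividend"))
```

A lookup table is used here only because it mirrors the test comments one to one. A real analysis also has to establish that the consuming instruction is guaranteed to execute whenever the add does.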

test/Transforms/LoopStrengthReduce/sext-ind-var.ll

This file was added.

				; RUN: opt -loop-reduce -S < %s | FileCheck %s

				target datalayout = "e-i64:64-v16:16-v32:32-n16:32:64"
				target triple = "nvptx64-unknown-unknown"

				; LSR used not to be able to generate a float* induction variable in
				; these cases due to scalar evolution not propagating nsw from an
				; instruction to the SCEV, preventing distributing sext into the
				; corresponding addrec.

				define float @testadd(float* %input, i32 %offset, i32 %numIterations) {
				; CHECK-LABEL: @testadd
				; CHECK: sext i32 %offset to i64
				; CHECK: loop:
				; CHECK-DAG: phi float*
				; CHECK-DAG: phi i32
				; CHECK-NOT: sext

				entry:
				br label %loop

				loop:
				%i = phi i32 [ %nexti, %loop ], [ 0, %entry ]
				%sum = phi float [ %nextsum, %loop ], [ 0.000000e+00, %entry ]
				%index32 = add nuw nsw i32 %i, %offset
				%index64 = sext i32 %index32 to i64
				%ptr = getelementptr inbounds float, float* %input, i64 %index64
				%addend = load float, float* %ptr, align 4
				%nextsum = fadd float %sum, %addend
				%nexti = add nuw nsw i32 %i, 1
				%exitcond = icmp eq i32 %nexti, %numIterations
				br i1 %exitcond, label %exit, label %loop

				exit:
				ret float %nextsum
				}
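
The comment at the top of this test says that `nsw` is what allows `sext` to be distributed into the addrec. A minimal Python sketch of that arithmetic fact follows. It assumes 32-bit two's-complement wraparound, and the helper names (`wrap_i32`, `add_i32`, `sext_distributes`) are invented for illustration and do not correspond to any LLVM API.

```python
def wrap_i32(x: int) -> int:
    """Reduce an integer to a signed 32-bit value with two's-complement wraparound."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def add_i32(a: int, b: int) -> tuple[int, bool]:
    """Return (wrapped i32 sum, True if the signed 32-bit add overflowed)."""
    exact = a + b
    wrapped = wrap_i32(exact)
    return wrapped, wrapped != exact


def sext_distributes(i: int, offset: int) -> bool:
    """Check whether sext(i + offset) equals sext(i) + sext(offset).

    Sign extension preserves the value, and a 64-bit add of two values
    that fit in 32 bits cannot overflow, so the wide sum is exact."""
    narrow_sum, _ = add_i32(i, offset)
    wide_sum = i + offset
    return narrow_sum == wide_sum


# Without signed overflow the two forms agree; with overflow they differ.
print(sext_distributes(5, 7))
print(sext_distributes(2**31 - 1, 1))
```

When the 32-bit add is known not to overflow, the sign-extended index can be rewritten as a 64-bit recurrence. That is what lets LSR use a `float*` induction variable in `@testadd` instead of re-extending `%index32` on every iteration.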

Diff 29752
