This is an archive of the discontinued LLVM Phabricator instance.

That being said, the changes themselves look strange to me, and if we are just looking at doing CSE and such, I would just call the related passes at the right places.

Anyway, here are a couple of high level comments.

Thanks,
-Quentin

lib/Analysis/ScalarEvolutionExpander.cpp
1649	This seems for the users of the SCEV API to know that when all those conditions apply, it needs to create a new value.
lib/Transforms/Vectorize/LoopVectorize.cpp
2602	The formatting of the comment looks strange and the comment itself is hard to digest. Could you rephrase? Also, don’t we have a split method directly on the DT? Having to explicitly add the block to DT seems error prone to me.

Quentin, thanks for the review.

That being said, the changes themselves look strange to me, and if we are just looking at doing CSE and such, I would just call the related passes at the right places.

Yes, if CSE can fully clean up the redundancies, that way looks simpler and cleaner. I am not sure which is the best way to fix the problem. The patch is proposed just as an alternative I can think of to fully clean up such redundancies. I expect people with a better understanding of SCEV to provide some suggestions here.

lib/Analysis/ScalarEvolutionExpander.cpp
1649	Yes, that is a precise description about the use of the code above.
lib/Transforms/Vectorize/LoopVectorize.cpp
2602	Sorry, I will fix the comment. The comment is trying to say that the SCEV expansion may query the DT while the function createEmptyLoop is generating new bypass blocks. This happens before InnerLoopVectorizer::updateAnalysis updates the whole DT, so we need to maintain the DT incrementally. I don't know the common way to do that, so I just moved the code originally in InnerLoopVectorizer::updateAnalysis forward.

In D12090#226980, @wmi wrote:

Quentin, thanks for the review.

That being said, the changes themselves look strange to me, and if we are just looking at doing CSE and such, I would just call the related passes at the right places.

Yes, if CSE can fully clean up the redundancies, that way looks simpler and cleaner. I am not sure which is the best way to fix the problem. The patch is proposed just as an alternative I can think of to fully clean up such redundancies. I expect people with a better understanding of SCEV to provide some suggestions here.

It seems that CSE will only handle the more-trivial cases. If constant folding, or reassociation, etc. has changed the form of the expressions, then CSE won't help. I think handling at least some of this in SCEV does make sense.

That having been said, adding a Value* and the enum to every SCEV adds an overhead to every SCEV created, even though most of them are never expanded. Would it be better to keep this information on the side (in a DenseMap or similar)?

In D12090#234842, @hfinkel wrote:

It seems that CSE will only handle the more-trivial cases. If constant folding, or reassociation, etc. has changed the form of the expressions, then CSE won't help. I think handling at least some of this in SCEV does make sense.

That having been said, adding a Value* and the enum to every SCEV adds an overhead to every SCEV created, even though most of them are never expanded. Would it be better to keep this information on the side (in a DenseMap or similar)?

Yes, keeping the information in a map looks better. I will change it.

Move Value * and enum from the SCEV class to maps in the ScalarEvolution class.
The performance test results of the LLVM test suite are neutral on x86-64 Sandy Bridge.
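
The per-node versus side-table trade-off discussed above can be sketched in standalone C++. This is only an illustration under stated assumptions, not the code from the patch: `Value`, `Expr`, and `ExprInfoTable` are hypothetical stand-ins, and `std::unordered_map` stands in for `DenseMap`.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical stand-ins for llvm::Value and llvm::SCEV; not the real classes.
struct Value { int Id; };
struct Expr { int Kind; };  // Note: no Value* or flag is stored per node.

// Side table: only expressions that were actually produced from a Value
// pay for an entry, so nodes that are never expanded cost nothing extra.
class ExprInfoTable {
  std::unordered_map<const Expr *, Value *> OrigValue;

public:
  void record(const Expr *E, Value *V) {
    assert(E && V && "expression and value must be non-null");
    OrigValue.emplace(E, V);  // Keeps the first value recorded for E.
  }

  // Returns the recorded value, or nullptr if none was recorded.
  Value *lookup(const Expr *E) const {
    auto It = OrigValue.find(E);
    if (It == OrigValue.end())
      return nullptr;
    return It->second;
  }

  std::size_t size() const { return OrigValue.size(); }
};
```

The node type stays the same size; the cost moves to one hash lookup at expansion time.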

Herald added a subscriber: sanjoy. Sep 4 2015, 10:35 AM

hfinkel added inline comments. Sep 24 2015, 2:35 AM

include/llvm/Analysis/ScalarEvolution.h
105	This now looks only like an unnecessary formatting change.
lib/Analysis/ScalarEvolution.cpp
3374	This makes me a bit uncomfortable; you're relying on the fact that, if a Value* is removed, then no new Value* will be created in the same location, or if that does happen, the new Value* won't be used with getSCEV(). Nothing really guarantees this, however. One option is to hold WeakVH as the values in your map. Another option is to enhance the SCEVCallbackVH implementation to update the ExprValueMap directly. Given that you seem to have already done the SCEVCallbackVH update below, maybe the additional check is just unnecessary now.

This makes me a bit uncomfortable; you're relying on the fact that, if a Value* is removed, then no new Value* will be created in the same location, or if that does happen, the new Value* won't be used with getSCEV(). Nothing really guarantees this, however.
One option is to hold WeakVH as the values in your map. Another option is to enhance the SCEVCallbackVH implementation to update the ExprValueMap directly. Given that you seem to have already done the SCEVCallbackVH update below, maybe the additional check is just unnecessary now.

Thanks! It is a problem indeed. I took the first option and used WeakVH in the map.
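
The stale-pointer hazard and the weak-handle fix can be modeled outside LLVM. In the sketch below, `std::weak_ptr` plays the role of `WeakVH`, but only in the sense that the handle stops referring to the value once the value is destroyed; the real handle classes work differently internally. `Value`, `Expr`, and `ExprValueCache` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>

// Hypothetical stand-ins; not the LLVM classes.
struct Value { int Id; };
struct Expr { int Kind; };

// Cache from expression to the value it was created from. Storing a weak
// handle means a destroyed value can never be returned by mistake, even if
// a new value is later allocated at the same address.
class ExprValueCache {
  std::unordered_map<const Expr *, std::weak_ptr<Value>> Map;

public:
  void record(const Expr *E, const std::shared_ptr<Value> &V) {
    assert(E && V && "expression and value must be non-null");
    Map[E] = V;
  }

  // Returns the live value, or an empty pointer if the value is gone
  // or was never recorded.
  std::shared_ptr<Value> lookup(const Expr *E) const {
    auto It = Map.find(E);
    if (It == Map.end())
      return nullptr;
    return It->second.lock();
  }
};
```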

hfinkel added inline comments. Sep 25 2015, 3:27 PM

lib/Analysis/ScalarEvolutionExpander.cpp
1646	Why are we reusing an existing value only for AddRecs? I see in the summary that you say: "The intuition is, if only SCEV doesn't contain scAddRecExpr, using the original value to expand will not nullify valid loop transformations." But I don't see the downside to always reusing an available value from that.

Another potentially-problematic issue is that SCEVs may not be unique, and I'm a bit concerned about always taking only the first or last such value encountered, because it imposes an indirect constraint on the users of ScalarEvolution and the expander to ensure that they always visit all such values in some deterministic order. This is not currently the case. Moreover, it can be problematic if, for example, symbolically computing X-Y first and then calling getSCEV on a value that happens to be X-Y expands differently than calling getSCEV on that value first and doing the symbolic calculation later.

Also, if we are going to restrict these to a subclass of SCEVs, why wouldn't you only store the values/SCEV pair in the map if the SCEV satisfies hasAnyRec?

wmi added inline comments. Sep 25 2015, 11:12 PM

lib/Analysis/ScalarEvolutionExpander.cpp
1646	Why are we reusing an existing value only for AddRecs?

No, we are reusing an existing value only for SCEVs which are not scAddRecExpr. The downside of using an existing value for scAddRecExpr is that it may nullify some optimization done by LSR. (I think only scAddRecExpr-type SCEVs are substantially involved in LSR optimization.)

Another potentially-problematic issue is that SCEVs may not be unique, and I'm a bit concerned about always taking only the first or last such value encountered, because it imposes an indirect constraint on the users of ScalarEvolution and the expander to ensure that they always visit all such values in some deterministic order. This is not currently the case. Moreover, it can be problematic if, for example, symbolically computing X-Y first and then calling getSCEV on a value that happens to be X-Y expands differently than calling getSCEV on that value first and doing the symbolic calculation later.

I guess your point is: multiple values can be mapped to the same SCEV. Only one of those values will be recorded in ExprValueMap and will be used in expansion. To get the maximum optimization opportunity, the values must be encountered/expanded in a certain order. However, I think the same problem exists even without the patch, because the fact that multiple values can be mapped to the same SCEV is true with or without the patch (this is determined by ScalarEvolution::createSCEV, and we didn't change it in the patch).

value = X-Y ... To expand S1.

Suppose X-Y and X'-Y' will both be mapped to S1. Whether S1 will be expanded to X-Y or X'-Y' depends on which one is first encountered by getSCEV. This is true even without the patch.

Also, if we are going to restrict these to a subclass of SCEVs, why wouldn't you only store the values/SCEV pair in the map if the SCEV satisfies hasAnyRec?

I think you mean not to store the value/scAddRecExpr pair in the map. It can make the map smaller. I will change it.

hfinkel added inline comments. Sep 28 2015, 1:57 PM

lib/Analysis/ScalarEvolutionExpander.cpp
1646	No, we are reusing an existing value only for SCEVs which are not scAddRecExpr. The downside of using an existing value for scAddRecExpr is that it may nullify some optimization done by LSR. (I think only scAddRecExpr-type SCEVs are substantially involved in LSR optimization.)

I'm very afraid here of creating a quirky interface that, while appearing to offer a general set of facilities, contains a set of unexpected behaviors tailored to a specific consumer (LSR, in this case). Regardless, ScalarEvolutionExpander already contains a special 'LSRMode', and we should key any LSR-specific customizations off of that. In the general case, we should have a consistent behavior.

wmi added inline comments. Sep 28 2015, 11:13 PM

lib/Analysis/ScalarEvolutionExpander.cpp
1646	I rethought your previous comments about the uncertainty of the expansion result and found something I can improve. Thanks for those comments. A case is like this:

BBi:
  %sub2 = %x - %y;
  ...
BBj:
  %x = load %a;
  %y = load %b;
  %sub1 = %x - %y;
  ...
  %i_0 = %sub1;
for.body:
  %i_1 = PHI (%i_0, %i_next)
  %i_next = %i_1 + 1;
  %cmp = icmp slt %i_next, %z
  br i1 %cmp, label %for.body, label %for.end
for.end:

If we expand the SCEV when we try to get the backedge count, without the patch it will be "%z - (%x - %y) - 1"; with the patch it may be "%z - %sub1 - 1" if %sub1 is recorded in the SCEV of "%x - %y", or it may be "%z - %sub2 - 1" if %sub2 is recorded in the SCEV of "%x - %y". It is also possible that BBi cannot reach BBj, so the SCEV may also expand to "%z - (%x - %y) - 1" with the patch.

This can be improved. I can record the set of all possible Values mapped to the same SCEV in ExprValueMap. In SCEVExpander::expand, I will select one value from the set which will dominate the insert point, so in the case above, the SCEV will only be expanded to "%z - %sub1 - 1" if BBi cannot reach BBj, so reuse will be realized every time and there will be much less uncertainty in the expansion result.

As for the other concern you raised: although the existing behavior of SCEVExpander::expand is more consistent (it always literally generates all the computations the SCEV represents), it sacrifices the opportunity to reuse existing values. The patch introduces some inconsistency: we keep the expansion behavior of scAddRecExpr the same as without the patch, but may reuse existing values when expanding other kinds of SCEV (so the inconsistency here is actually an improvement when expanding non-scAddRecExpr SCEVs). As for the generality of the interface, besides LSR, other components can still use the interface the same as before and trust that it generates an equal value, without affecting correctness but possibly at lower cost.

And I think I can describe the inconsistency more clearly in comments to remove potential confusion for the users of the interface.
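
To make the backedge-count expression from the example concrete, here is a small C++ model of that loop. It is a sketch under the assumption that `%z` is greater than `%x - %y`; the function names are hypothetical and nothing here comes from the patch itself.

```cpp
#include <cassert>

// Simulates the loop from the example and counts how many times the
// back edge (for.body -> for.body) is taken.
long countBackedges(long X, long Y, long Z) {
  long Sub1 = X - Y;  // %sub1 = %x - %y
  long I = Sub1;      // %i_0 = %sub1
  long Taken = 0;
  while (true) {
    long INext = I + 1;  // %i_next = %i_1 + 1
    if (!(INext < Z))    // %cmp = icmp slt %i_next, %z
      break;             // branch to for.end
    ++Taken;             // branch back to for.body
    I = INext;
  }
  return Taken;
}

// The expanded expression "%z - %sub1 - 1", reusing the existing value
// %sub1 instead of recomputing %x - %y.
long backedgeCountReusingSub1(long Sub1, long Z) {
  assert(Z > Sub1 && "sketch only covers loops where %z > %x - %y");
  return Z - Sub1 - 1;
}
```

Whether the expander emits `%z - (%x - %y) - 1` or `%z - %sub1 - 1`, the computed value is the same; the difference is only in how many instructions are emitted.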

Update the patch (Sorry for not updating it for a long time).

This can be improved. I can record the set of all possible Values mapped to the same SCEV in ExprValueMap. In SCEVExpander::expand, I will select one value from the set which will dominate the insert point.

The improvement is implemented. Because there can be multiple Values mapping to the same SCEV, the mapping from SCEV to vector<WeakVH> is recorded in ExprValueMap. During SCEV expansion, one Value that dominates the insertPt is chosen from the vector. The update to the test test/Transforms/IndVarSimplify/udiv.ll reflects the effect of the change: IndVars no longer emits a udiv in the for.body.preheader BB after the change; %div1 is reused there.
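
The selection step described above can be sketched as follows. This is a simplified model, not the patch: it checks dominance only at basic-block granularity (real code must also consider instruction order within a block), and `DomTree`, `Candidate`, and `pickReusable` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature dominator tree: block 0 is the entry block, and
// IDom[B] is the immediate dominator of block B (IDom[0] == 0).
struct DomTree {
  std::vector<int> IDom;

  // True if block A dominates block B (every block dominates itself).
  bool dominates(int A, int B) const {
    int N = static_cast<int>(IDom.size());
    assert(A >= 0 && A < N && B >= 0 && B < N && "block out of range");
    while (true) {
      if (A == B)
        return true;
      if (B == 0)
        return false;  // Reached the entry block without meeting A.
      B = IDom[B];
    }
  }
};

// A recorded value and the block that defines it.
struct Candidate {
  int ValueId;
  int DefBlock;
};

// Returns the ValueId of the first candidate whose defining block dominates
// InsertBlock, or -1 if no candidate can be reused there.
int pickReusable(const DomTree &DT, const std::vector<Candidate> &Cands,
                 int InsertBlock) {
  for (const Candidate &C : Cands)
    if (DT.dominates(C.DefBlock, InsertBlock))
      return C.ValueId;
  return -1;
}
```

Scanning the candidates in a fixed order matters: it is what makes the chosen value reproducible from run to run.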

I'm very afraid here of creating a quirky interface that, while appearing to offer a general set of facilities, contains a set of unexpected behaviors tailored to a specific consumer (LSR, in this case). Regardless, ScalarEvolutionExpander already contains a special 'LSRMode', and we should key any LSR-specific customizations off of that. In the general case, we should have a consistent behavior.

This concern has not been addressed very well. I still don't have a good solution for it right now. What I have done for it is to add a comment describing the status before func SCEVExpander::expand.

Ping.

Thanks,
Wei.

Some minor nits inline.

Overall, I agree with Hal's judgement that any LSR specific behavior should be guarded on LSRMode; both so that the reason for the limitation is obvious, and also to not unnecessarily (and, to the end user, inexplicably) do a worse job than we could have done.

lib/Analysis/ScalarEvolution.cpp
3314	I'd rename this to `containsAddRecurrence`.
3317	Why not `return I->second;`?
3319	Can't you use a `SCEVTraversal` here?
lib/Analysis/ScalarEvolutionExpander.cpp
1650	Nit: LLVM naming style is `auto const &Ent : *Vec`.
lib/Transforms/Vectorize/LoopVectorize.cpp
2598	Nit: wrapping

Overall, I agree with Hal's judgement that any LSR specific behavior should be guarded on LSRMode;

Thanks for the explanation. I misunderstood Hal's comment and thought the existing use of LSRMode was already bad, so I shouldn't add another use of LSRMode (my bad English). That is why I said I could not figure out a better way to make the interface clean.

Now the behavior of SCEVExpander::expand is defined more clearly in its function header comment:

The expansion of SCEV will either reuse a previous Value in ExprValueMap,
or expand the SCEV literally. Specifically, if the expansion is in LSRMode,
and the SCEV contains any sub scAddRecExpr type SCEV, it will be expanded
literally, to prevent LSR transformed SCEV from being reverted. Otherwise,
the expansion will try to reuse Value from ExprValueMap, and only when it
fails, expand the SCEV literally.
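
The rule in that header comment reduces to a small decision function. The sketch below restates it in standalone C++; `ExpandQuery`, `ExpandAction`, and `chooseAction` are hypothetical names used only for illustration, and the real expander makes this decision inline rather than through such a helper.

```cpp
#include <cassert>

enum class ExpandAction { ReuseExisting, ExpandLiterally };

// Hypothetical summary of what the expander knows about one SCEV.
struct ExpandQuery {
  bool LSRMode;             // Expansion was requested by LSR.
  bool ContainsAddRec;      // The SCEV has an scAddRecExpr somewhere inside.
  int NumDominatingValues;  // Recorded values that dominate the insert point.
};

ExpandAction chooseAction(const ExpandQuery &Q) {
  assert(Q.NumDominatingValues >= 0 && "count cannot be negative");
  // In LSR mode, never undo LSR's work on add recurrences.
  if (Q.LSRMode && Q.ContainsAddRec)
    return ExpandAction::ExpandLiterally;
  // Otherwise prefer an existing value when one is usable.
  if (Q.NumDominatingValues > 0)
    return ExpandAction::ReuseExisting;
  // Fall back to literal expansion.
  return ExpandAction::ExpandLiterally;
}
```

Keying the special case off `LSRMode` keeps the general behavior uniform: outside LSR, add recurrences are eligible for reuse like any other expression.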

lib/Analysis/ScalarEvolution.cpp
3319	That is much better. Done.
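
For readers unfamiliar with the traversal idiom, here is a standalone sketch of a worklist walk with early exit and a per-root cache, which is the shape a `SCEVTraversal`-based check takes. It does not use the LLVM API; `Kind`, `Expr`, and `AddRecQuery` are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical expression tree; not the real SCEV hierarchy.
enum class Kind { Constant, Unknown, Add, Mul, AddRec };
struct Expr {
  Kind K;
  std::vector<const Expr *> Ops;
};

class AddRecQuery {
  std::unordered_map<const Expr *, bool> Cache;  // Result per queried root.

public:
  // Returns true if Root is, or contains, an AddRec node.
  bool containsAddRec(const Expr *Root) {
    assert(Root && "expression must be non-null");
    auto It = Cache.find(Root);
    if (It != Cache.end())
      return It->second;

    bool FoundOne = false;
    std::vector<const Expr *> Worklist{Root};
    std::unordered_set<const Expr *> Visited;
    while (!Worklist.empty() && !FoundOne) {
      const Expr *E = Worklist.back();
      Worklist.pop_back();
      if (!Visited.insert(E).second)
        continue;  // Shared subexpression already handled.
      if (E->K == Kind::AddRec) {
        FoundOne = true;  // Stop as soon as one is found.
      } else {
        for (const Expr *Op : E->Ops)
          Worklist.push_back(Op);
      }
    }
    Cache[Root] = FoundOne;
    return FoundOne;
  }

  std::size_t cacheSize() const { return Cache.size(); }
};
```

Compared with a hand-written switch over every expression kind, the generic walk does not need updating when a new kind is added.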

Addressed Hal and Sanjoy's comments.

Other changes:

Change vector to set in ExprValueMap.
Add test scev-expander-existing-value.ll.

hfinkel added inline comments. Dec 10 2015, 6:00 PM

include/llvm/Analysis/ScalarEvolution.h
252	store the analysis result about -> record
258	As I note later, you probably want a SetVector here, not a std::set. Also, std::set is generally much slower than DenseSet, so we should use the latter if possible (SetVector uses a DenseSet).
lib/Analysis/ScalarEvolutionExpander.cpp
1608	LSR -> LSR's
1648	LSR -> LSR's
1653	You're iterating over the elements of a set here, and those have WeakVH (i.e. pointer-valued) keys. That seems unlikely to be deterministic. SetVector seems like a better choice.
lib/Transforms/Vectorize/LoopVectorize.cpp
2600	func -> function (no need to abbreviate here)

Addressed Hal's comments. Changed std::set<WeakVH> to SetVector<WeakVH, std::vector<WeakVH>, DenseSet<WeakVH>>.
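
The determinism argument is easier to see with a minimal insertion-ordered set. The class below is written in the spirit of `SetVector` but is not its implementation; `OrderedSet` is a hypothetical name, and `std::unordered_set` stands in for `DenseSet`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Set semantics for membership, vector semantics for iteration order.
// Iteration follows insertion order, so it does not depend on hash values
// or on the numeric value of pointer keys.
template <typename T> class OrderedSet {
  std::vector<T> Order;        // Elements in insertion order.
  std::unordered_set<T> Seen;  // Fast membership test.

public:
  // Returns true if X was newly inserted.
  bool insert(const T &X) {
    if (!Seen.insert(X).second)
      return false;
    Order.push_back(X);
    return true;
  }

  // Returns true if X was present and has been removed.
  bool remove(const T &X) {
    if (Seen.erase(X) == 0)
      return false;
    auto It = std::find(Order.begin(), Order.end(), X);
    assert(It != Order.end() && "vector and set are out of sync");
    Order.erase(It);
    return true;
  }

  typename std::vector<T>::const_iterator begin() const { return Order.begin(); }
  typename std::vector<T>::const_iterator end() const { return Order.end(); }
  std::size_t size() const { return Order.size(); }
};
```

The price is a linear scan on removal, which is usually acceptable when the per-expression sets are small.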

hfinkel added inline comments. Dec 11 2015, 2:13 AM

include/llvm/Analysis/ScalarEvolution.h
210	SetVector is defined as:

template <typename T, typename Vector = std::vector<T>, typename Set = DenseSet<T>>
class SetVector { ...

and so `, std::vector<WeakVH>, DenseSet<WeakVH>` should be implied by the first template argument. If this can be simplified to:

typedef SetVector<WeakVH> WeakVHSetType;

then please do. (But you also need to change the WeakVH type; see below.)
include/llvm/IR/ValueHandle.h
177 ↗	(On Diff #42502)	I apologize, because I believe I was the one who implied this would work. But the fact that you had to add this here reminded me that it won't. The problem is that a WeakVH's value changes when the underlying Value is removed (it changes from the pointer value to nullptr). Thus, we can't use these as keys in a set (or map) because the key needs to remain fixed (otherwise it will be in the wrong bucket, or in the wrong order for a sorted set, after the change). We need instead to use a different kind of ValueHandle that can remove itself from its parent map once the underlying value goes away. The good news is that we already have implementations of this: We have a ValueMap class (include/llvm/IR/ValueMap.h), and we have SCEV's ValueExprMapType type SCEVCallbackVH.

All things considered, I think that just using raw pointers in the SetVectors is probably your best option, and enhance SCEVCallbackVH to also remove outdated Values (for every value in one of those vectors, we must already have an entry in ValueExprMap which should have the same lifetime). Thus, in SCEVCallbackVH's callback, you can use the associated SCEV* to look up the correct SetVector<Value*> and remove the necessary entry (SetVector has a convenient 'remove' member function for this purpose).

Thanks for detecting the potential error, and your suggestion to use SCEVCallbackVH's callback instead of WeakVH looks feasible.

I just looked at ScalarEvolution::getSCEV again and believe ExprValueMap[S].insert(WeakVH(V)) will be called only once for the same Value (two instances of the same Value will never be inserted into the set). So can we simply use a std::vector<WeakVH> instead of SetVector? It may be cheaper, because SetVector uses a std::vector inside of it.

I tried std::vector and found that it is possible for ExprValueMap to have duplicate Values in the vector, because createSCEV will be called multiple times for the same PHI Value. Another weakness is that there may be multiple WeakVHs with nullptr Values staying in the vector.

So I still follow Hal's suggestion to use SetVector<Value *>.

ScalarEvolution::eraseValueFromMap is created to ensure that whenever V->S is removed from ValueExprMap, V is also removed from the set ExprValueMap[S]. In this way, an entry in ValueExprMap will always have an equal or longer lifetime than the corresponding entry in ExprValueMap. So when V is deleted and V is in a SetVector of ExprValueMap, ValueExprMap[V] can always be used as the key into ExprValueMap.
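
The invariant described above (the reverse map never outlives the forward map) can be sketched with two ordinary maps. This is a standalone model: `ExprCache`, `record`, `eraseValue`, and `valuesFor` are hypothetical names, and a plain vector stands in for `SetVector<Value *>`.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins; not the LLVM classes.
struct Value { int Id; };
struct Expr { int Kind; };

class ExprCache {
  // Forward map: value -> expression (models ValueExprMap).
  std::unordered_map<const Value *, const Expr *> ValueExpr;
  // Reverse map: expression -> values (models ExprValueMap).
  std::unordered_map<const Expr *, std::vector<const Value *>> ExprValues;

public:
  void record(const Value *V, const Expr *E) {
    assert(V && E && "value and expression must be non-null");
    if (!ValueExpr.emplace(V, E).second)
      return;  // Already recorded; avoids duplicates in the reverse map.
    ExprValues[E].push_back(V);
  }

  // What a deletion callback would do: use the forward entry to find the
  // reverse entry, drop V from it, then drop the forward entry.
  void eraseValue(const Value *V) {
    auto It = ValueExpr.find(V);
    if (It == ValueExpr.end())
      return;
    auto RevIt = ExprValues.find(It->second);
    assert(RevIt != ExprValues.end() && "maps are out of sync");
    std::vector<const Value *> &Vec = RevIt->second;
    Vec.erase(std::remove(Vec.begin(), Vec.end(), V), Vec.end());
    if (Vec.empty())
      ExprValues.erase(RevIt);
    ValueExpr.erase(It);
  }

  // Returns the values recorded for E, or nullptr if there are none.
  const std::vector<const Value *> *valuesFor(const Expr *E) const {
    auto It = ExprValues.find(E);
    if (It == ExprValues.end())
      return nullptr;
    return &It->second;
  }
};
```

Because every removal goes through one function, raw pointers in the reverse map are safe: an entry there always has a live forward entry backing it.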

Ping.

hfinkel added inline comments. Feb 2 2016, 2:11 PM

lib/Analysis/ScalarEvolution.cpp
3319	I'd name this FoundOne (instead of FindOne), because it indicates whether or not an AddRec was found; it is not a directive for the future search.
3370	I appreciate this idea, but please don't do this by default. In a build with asserts, this check makes Value removal O(N^2). If you'd like to have this check, you'll need a separate flag. This reminds me of EnableExpensiveChecks in lib/CodeGen/SelectionDAG/LegalizeTypes.cpp.
lib/Transforms/Vectorize/LoopVectorize.cpp
2598	dominate -> dominator

wmi marked 2 inline comments as done. Feb 2 2016, 5:52 PM

wmi added inline comments.

lib/Analysis/ScalarEvolution.cpp
3370	I put the check under the VerifySCEVMap option (similar to the VerifySCEV option). I also moved the check from ScalarEvolution::eraseValueFromMap to ScalarEvolution::getSCEVValues, so it can ensure that no Value set returned by getSCEVValues has a dangling value inside of it.

wmi updated this revision to Diff 46730. Feb 2 2016, 5:53 PM

LGTM.

This revision is now accepted and ready to land. Feb 2 2016, 6:01 PM

Thank you for your patience and thank you for providing many helpful
suggestions!

Wei.

Closed by commit rL259662: [SCEV] Try to reuse existing value during SCEV expansion (authored by wmi). Feb 3 2016, 9:09 AM

This revision was automatically updated to reflect the committed changes.

mzolotukhin mentioned this in D15559: [SCEVExpander] Make findExistingExpansion smarter. Feb 4 2016, 12:16 PM

wmi mentioned this in D21313: Use ValueOffsetPair to enhance value reuse during SCEV expansion. Jun 13 2016, 4:07 PM

wmi mentioned this in rL276136: Use ValueOffsetPair to enhance value reuse during SCEV expansion. Jul 20 2016, 9:48 AM

wmi mentioned this in rL278160: Recommit "Use ValueOffsetPair to enhance value reuse during SCEV expansion". Aug 9 2016, 1:45 PM

Revision Contents

Path	Size
include/llvm/Analysis/ScalarEvolution.h	25 lines
lib/Analysis/ScalarEvolution.cpp	77 lines
lib/Analysis/ScalarEvolutionExpander.cpp	15 lines
lib/Transforms/Vectorize/LoopVectorize.cpp	12 lines
test/CodeGen/Thumb2/2009-12-01-LoopIVUsers.ll	1 line
test/Transforms/IRCE/decrementing-loop.ll	1 line
test/Transforms/IndVarSimplify/lftr-address-space-pointers.ll	4 lines
test/Transforms/IndVarSimplify/udiv.ll	6 lines
test/Transforms/IndVarSimplify/ult-sub-to-eq.ll	10 lines
test/Transforms/LoopStrengthReduce/post-inc-icmpzero.ll	5 lines

Diff 39695

include/llvm/Analysis/ScalarEvolution.h

[...]
    enum NoWrapFlags { FlagAnyWrap = 0,          // No guarantee.
                       FlagNW = (1 << 0),        // No self-wrap.
                       FlagNUW = (1 << 1),       // No unsigned wrap.
                       FlagNSW = (1 << 2),       // No signed wrap.
                       NoWrapMask = (1 << 3) -1 };

    explicit SCEV(const FoldingSetNodeIDRef ID, unsigned SCEVTy) :
      FastID(ID), SCEVType(SCEVTy), SubclassData(0) {}

    unsigned getSCEVType() const { return SCEVType; }

    /// getType - Return the LLVM type of this SCEV expression.
    ///
    Type *getType() const;

    /// isZero - Return true if the expression is a constant zero.
    ///
[...]
    setFlags(SCEV::NoWrapFlags Flags, SCEV::NoWrapFlags OnFlags) {
      return (SCEV::NoWrapFlags)(Flags | OnFlags);
    }
    static SCEV::NoWrapFlags LLVM_ATTRIBUTE_UNUSED_RESULT
    clearFlags(SCEV::NoWrapFlags Flags, SCEV::NoWrapFlags OffFlags) {
      return (SCEV::NoWrapFlags)(Flags & ~OffFlags);
    }

  private:
    /// SCEVCallbackVH - A CallbackVH to arrange for ScalarEvolution to be
    /// notified whenever a Value is deleted.
    class SCEVCallbackVH : public CallbackVH {
      ScalarEvolution *SE;
      void deleted() override;
      void allUsesReplacedWith(Value *New) override;
    public:
      SCEVCallbackVH(Value *V, ScalarEvolution *SE = nullptr);
[...]
    /// DT - The dominator tree.
    ///
    DominatorTree *DT;

    /// CouldNotCompute - This SCEV is used to represent unknown trip
    /// counts and things.
    SCEVCouldNotCompute CouldNotCompute;

+   /// HasRecMapType - The typedef for HasRecMap.
+   ///
+   typedef DenseMap<const SCEV *, bool> HasRecMapType;
+
+   /// HasRecMap -- This is a cache to store the analysis result about whether
+   /// a SCEV contains any scAddRecExpr.
+   HasRecMapType HasRecMap;
+
+   /// ExprValueMapType - The typedef for ExprValueMap.
+   ///
+   typedef DenseMap<const SCEV *, std::vector<WeakVH>> ExprValueMapType;
+
+   /// ExprValueMap -- This map records the original value from which
+   /// the SCEV expr is generated from.
+   ExprValueMapType ExprValueMap;
+
    /// ValueExprMapType - The typedef for ValueExprMap.
    ///
    typedef DenseMap<SCEVCallbackVH, const SCEV *, DenseMapInfo<Value *> >
      ValueExprMapType;

    /// ValueExprMap - This is a cache of the values we have analyzed so far.
    ///
    ValueExprMapType ValueExprMap;
[...]
    uint64_t getTypeSizeInBits(Type *Ty) const;

    /// getEffectiveSCEVType - Return a type with the same bitwidth as
    /// the given type and which represents how SCEV will treat the given
    /// type, for which isSCEVable must return true. For pointer types,
    /// this is the pointer-sized integer type.
    Type *getEffectiveSCEVType(Type *Ty) const;

+   /// hasAnyRec - Return true if the SCEV is a scAddRecExpr or it
+   /// contains scAddRecExpr. The result will be cached in HasRecMap.
+   ///
+   bool hasAnyRec(const SCEV *S);
+
+   /// getSCEVValue - Return the WeakVH vector from which the SCEV expr is
+   /// generated.
+   std::vector<WeakVH> *getSCEVValue(const SCEV *S);
+
    /// getSCEV - Return a SCEV expression for the full generality of the
    /// specified expression.
    const SCEV *getSCEV(Value *V);

    const SCEV *getConstant(ConstantInt *V);
    const SCEV *getConstant(const APInt& Val);
    const SCEV *getConstant(Type *Ty, uint64_t V, bool isSigned = false);
    const SCEV *getTruncateExpr(const SCEV *Op, Type *Ty);
[...]

lib/Analysis/ScalarEvolution.cpp

[...]
  bool ScalarEvolution::checkValidity(const SCEV *S) const {
    FindInvalidSCEVUnknown F;
    SCEVTraversal<FindInvalidSCEVUnknown> ST(F);
    ST.visitAll(S);

    return !F.FindOne;
  }

+ bool ScalarEvolution::hasAnyRec(const SCEV *S) {
+   HasRecMapType::iterator I = HasRecMap.find_as(S);
+   if (I != HasRecMap.end())
+     return I->second == true;
+
+   bool AnyRec = false;
+   switch (static_cast<SCEVTypes>(S->getSCEVType())) {
+   case scConstant: {
+     AnyRec = false;
+     break;
+   }
+   case scTruncate: {
+     const SCEVTruncateExpr *Trunc = cast<SCEVTruncateExpr>(S);
+     AnyRec = hasAnyRec(Trunc->getOperand());
+     break;
+   }
+   case scZeroExtend: {
+     const SCEVZeroExtendExpr *ZExt = cast<SCEVZeroExtendExpr>(S);
+     AnyRec = hasAnyRec(ZExt->getOperand());
+     break;
+   }
+   case scSignExtend: {
+     const SCEVSignExtendExpr *SExt = cast<SCEVSignExtendExpr>(S);
+     AnyRec = hasAnyRec(SExt->getOperand());
+     break;
+   }
+   case scAddRecExpr: {
+     AnyRec = true;
+     break;
+   }
+   case scAddExpr:
+   case scMulExpr:
+   case scUMaxExpr:
+   case scSMaxExpr: {
+     const SCEVNAryExpr *NAry = cast<SCEVNAryExpr>(S);
+     for (SCEVNAryExpr::op_iterator I = NAry->op_begin(), E = NAry->op_end();
+          I != E; ++I)
+       AnyRec = AnyRec || hasAnyRec(*I);
+     break;
+   }
+   case scUDivExpr: {
+     const SCEVUDivExpr *UDiv = cast<SCEVUDivExpr>(S);
+     AnyRec = hasAnyRec(UDiv->getLHS()) || hasAnyRec(UDiv->getRHS());
+     break;
+   }
+   case scUnknown:
+   case scCouldNotCompute: {
+     AnyRec = false;
+     break;
+   }
+   default:
+     llvm_unreachable("Unknown SCEV kind!");
+   }
+   HasRecMap.insert(std::make_pair(S, AnyRec));
+   return AnyRec;
+ }

				hfinkelUnsubmitted Not Done Reply Inline Actions I appreciate this idea, but please don't do this by default. In a build with asserts, this check makes Value removal O(N^2). If you'd like to have this check, you'll need a separate flag. This reminds me of EnableExpensiveChecks in lib/CodeGen/SelectionDAG/LegalizeTypes.cpp. hfinkel: I appreciate this idea, but please don't do this by default. In a build with asserts, this…
				wmiAuthorUnsubmitted Not Done Reply Inline Actions I put the check under VerifySCEVMap option (similar as VerifySCEV option). And I moved the check to ScalarEvolution::getSCEVValues from ScalarEvolution::eraseValueFromMap, so it can ensure every Value set returned by getSCEVValues don't have dangling value inside of it. wmi: I put the check under VerifySCEVMap option (similar as VerifySCEV option). And I moved the…
				/// getSCEVValue - Return the Value vector from S.
				std::vector<WeakVH> ScalarEvolution::getSCEVValue(const SCEV S) {
				ExprValueMapType::iterator SI = ExprValueMap.find_as(S);
				return (SI == ExprValueMap.end()) ? nullptr : &SI->second;
				hfinkelUnsubmitted Not Done Reply Inline Actions This makes me a bit uncomfortable; you're relying on the fact that, if a Value* is removed, then no new Value* will be created in the same location, or if that does happen, the new Value* won't be used with getSCEV(). Nothing really guarantees this, however. One option is to hold WeakVH as the values in your map. Another option is to enhance the SCEVCallbackVH implementation to update the ExprValueMap directly. Given that you seem to have already done the SCEVCallbackVH update below, maybe the additional check is just unnecessary now. hfinkel: This makes me a bit uncomfortable; you're relying on the fact that, if a Value* is removed…
				}
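
The address-reuse concern raised above can be illustrated outside LLVM. The following is a minimal sketch, not the patch's code: it assumes `std::weak_ptr` as a rough stand-in for `WeakVH`, and `Node`, `Handle`, and `ExprToValues` are hypothetical names introduced only for illustration. The point it shows is that a weak handle reports when its target is gone, so a reverse map never hands back a stale entry.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins: Node plays the role of llvm::Value and
// std::weak_ptr plays the role of WeakVH. This is not LLVM's API.
struct Node {
  std::string Name;
};

using Handle = std::weak_ptr<Node>;

// Reverse map from an expression id to the values that compute it.
class ExprToValues {
  std::map<int, std::vector<Handle>> Map;

public:
  void record(int ExprId, const std::shared_ptr<Node> &V) {
    assert(V && "cannot record a null value");
    Map[ExprId].push_back(V);
  }

  // Return the values still alive for ExprId, skipping expired handles.
  std::vector<std::shared_ptr<Node>> liveValues(int ExprId) const {
    std::vector<std::shared_ptr<Node>> Live;
    auto It = Map.find(ExprId);
    if (It == Map.end())
      return Live;
    for (const Handle &H : It->second)
      if (std::shared_ptr<Node> V = H.lock())
        Live.push_back(V);
    return Live;
  }
};
```

With raw pointers as map values, a deleted node's slot could be reused by an unrelated node and still compare equal; with handles, the expired entry is simply skipped.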

	/// getSCEV - Return an existing SCEV if it exists, otherwise analyze the			/// getSCEV - Return an existing SCEV if it exists, otherwise analyze the
	/// expression and create a new one.			/// expression and create a new one.
	const SCEV *ScalarEvolution::getSCEV(Value *V) {			const SCEV *ScalarEvolution::getSCEV(Value *V) {
	assert(isSCEVable(V->getType()) && "Value is not SCEVable!");			assert(isSCEVable(V->getType()) && "Value is not SCEVable!");

	const SCEV *S = getExistingSCEV(V);			const SCEV *S = getExistingSCEV(V);
	if (S == nullptr) {			if (S == nullptr) {
	S = createSCEV(V);			S = createSCEV(V);
	ValueExprMap.insert(std::make_pair(SCEVCallbackVH(V, this), S));			ValueExprMap.insert(std::make_pair(SCEVCallbackVH(V, this), S));
	}			}

				ExprValueMapType::iterator SI = ExprValueMap.find_as(S);
				if (SI == ExprValueMap.end()) {
				std::vector<WeakVH> Vec;
				Vec.push_back(WeakVH(V));
				ExprValueMap.insert(std::make_pair(S, Vec));
				} else {
				std::vector<WeakVH> &Vec = SI->second;
				Vec.push_back(WeakVH(V));
				}
	return S;			return S;
	}			}

	const SCEV *ScalarEvolution::getExistingSCEV(Value *V) {			const SCEV *ScalarEvolution::getExistingSCEV(Value *V) {
	assert(isSCEVable(V->getType()) && "Value is not SCEVable!");			assert(isSCEVable(V->getType()) && "Value is not SCEVable!");

	ValueExprMapType::iterator I = ValueExprMap.find_as(V);			ValueExprMapType::iterator I = ValueExprMap.find_as(V);
	if (I != ValueExprMap.end()) {			if (I != ValueExprMap.end()) {
	▲ Show 20 Lines • Show All 4,959 Lines • ▼ Show 20 Lines

	void ScalarEvolution::releaseMemory() {			void ScalarEvolution::releaseMemory() {
	// Iterate through all the SCEVUnknown instances and call their			// Iterate through all the SCEVUnknown instances and call their
	// destructors, so that they release their references to their values.			// destructors, so that they release their references to their values.
	for (SCEVUnknown *U = FirstUnknown; U; U = U->Next)			for (SCEVUnknown *U = FirstUnknown; U; U = U->Next)
	U->~SCEVUnknown();			U->~SCEVUnknown();
	FirstUnknown = nullptr;			FirstUnknown = nullptr;

				ExprValueMap.clear();
	ValueExprMap.clear();			ValueExprMap.clear();
				HasRecMap.clear();

	// Free any extra memory created for ExitNotTakenInfo in the unlikely event			// Free any extra memory created for ExitNotTakenInfo in the unlikely event
	// that a loop had multiple computable exits.			// that a loop had multiple computable exits.
	for (DenseMap<const Loop*, BackedgeTakenInfo>::iterator I =			for (DenseMap<const Loop*, BackedgeTakenInfo>::iterator I =
	BackedgeTakenCounts.begin(), E = BackedgeTakenCounts.end();			BackedgeTakenCounts.begin(), E = BackedgeTakenCounts.end();
	I != E; ++I) {			I != E; ++I) {
	I->second.clear();			I->second.clear();
	}			}
	▲ Show 20 Lines • Show All 338 Lines • ▼ Show 20 Lines
	}			}

	void ScalarEvolution::forgetMemoizedResults(const SCEV *S) {			void ScalarEvolution::forgetMemoizedResults(const SCEV *S) {
	ValuesAtScopes.erase(S);			ValuesAtScopes.erase(S);
	LoopDispositions.erase(S);			LoopDispositions.erase(S);
	BlockDispositions.erase(S);			BlockDispositions.erase(S);
	UnsignedRanges.erase(S);			UnsignedRanges.erase(S);
	SignedRanges.erase(S);			SignedRanges.erase(S);
				ExprValueMap.erase(S);
				HasRecMap.erase(S);

	for (DenseMap<const Loop*, BackedgeTakenInfo>::iterator I =			for (DenseMap<const Loop*, BackedgeTakenInfo>::iterator I =
	BackedgeTakenCounts.begin(), E = BackedgeTakenCounts.end(); I != E; ) {			BackedgeTakenCounts.begin(), E = BackedgeTakenCounts.end(); I != E; ) {
	BackedgeTakenInfo &BEInfo = I->second;			BackedgeTakenInfo &BEInfo = I->second;
	if (BEInfo.hasOperand(S, this)) {			if (BEInfo.hasOperand(S, this)) {
	BEInfo.clear();			BEInfo.clear();
	BackedgeTakenCounts.erase(I++);			BackedgeTakenCounts.erase(I++);
	}			}
	▲ Show 20 Lines • Show All 88 Lines • Show Last 20 Lines
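
The review discussion for this file asks that the map consistency check be opt-in rather than run on every Value removal in asserts builds. A minimal sketch of that idea follows, using a toy pair of integer-keyed maps instead of the real `ValueExprMap` and `ExprValueMap`; the flag `VerifyMapsOnErase` and the class `TwoWayMap` are hypothetical names, not the patch's `VerifySCEVMap` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>

// Hypothetical opt-in switch; off by default so erasure stays cheap.
static bool VerifyMapsOnErase = false;

// Toy forward/reverse map pair keyed by integer ids (not LLVM types).
class TwoWayMap {
  std::map<int, int> ValueToExpr;
  std::map<int, std::set<int>> ExprToValues;

public:
  void insert(int Value, int Expr) {
    assert(!ValueToExpr.count(Value) && "value already mapped");
    ValueToExpr[Value] = Expr;
    ExprToValues[Expr].insert(Value);
  }

  // Remove Value from both maps; the full scan runs only when requested.
  void eraseValue(int Value) {
    auto It = ValueToExpr.find(Value);
    if (It == ValueToExpr.end())
      return;
    auto RevIt = ExprToValues.find(It->second);
    if (RevIt != ExprToValues.end()) {
      RevIt->second.erase(Value);
      if (RevIt->second.empty())
        ExprToValues.erase(RevIt);
    }
    ValueToExpr.erase(It);
    if (VerifyMapsOnErase)
      assert(isConsistent() && "forward and reverse maps disagree");
  }

  // Linear scan over every recorded pair. Running it on each erase by
  // default would make N erasures quadratic overall.
  bool isConsistent() const {
    std::size_t Count = 0;
    for (const auto &Entry : ExprToValues) {
      for (int Value : Entry.second) {
        auto It = ValueToExpr.find(Value);
        if (It == ValueToExpr.end() || It->second != Entry.first)
          return false;
        ++Count;
      }
    }
    return Count == ValueToExpr.size();
  }

  std::size_t numValuesFor(int Expr) const {
    auto It = ExprToValues.find(Expr);
    return It == ExprToValues.end() ? 0 : It->second.size();
  }
};
```

Keeping the scan behind a flag means the default asserts build pays only for the two map erasures.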

lib/Analysis/ScalarEvolutionExpander.cpp

Show First 20 Lines • Show All 1,599 Lines • ▼ Show 20 Lines	if (Ty) {
V = InsertNoopCastOfTo(V, Ty);		V = InsertNoopCastOfTo(V, Ty);
}		}
return V;		return V;
}		}

Value *SCEVExpander::expand(const SCEV *S) {		Value *SCEVExpander::expand(const SCEV *S) {
// Compute an insertion point for this SCEV object. Hoist the instructions		// Compute an insertion point for this SCEV object. Hoist the instructions
// as far out in the loop nest as possible.		// as far out in the loop nest as possible.
Instruction *InsertPt = Builder.GetInsertPoint();		Instruction *InsertPt = Builder.GetInsertPoint();
		hfinkel: LSR -> LSR's
for (Loop *L = SE.LI->getLoopFor(Builder.GetInsertBlock()); ;		for (Loop *L = SE.LI->getLoopFor(Builder.GetInsertBlock()); ;
L = L->getParentLoop())		L = L->getParentLoop())
if (SE.isLoopInvariant(S, L)) {		if (SE.isLoopInvariant(S, L)) {
if (!L) break;		if (!L) break;
if (BasicBlock *Preheader = L->getLoopPreheader())		if (BasicBlock *Preheader = L->getLoopPreheader())
InsertPt = Preheader->getTerminator();		InsertPt = Preheader->getTerminator();
else {		else {
// LSR sets the insertion point for AddRec start/step values to the		// LSR sets the insertion point for AddRec start/step values to the
Show All 20 Lines	std::map<std::pair<const SCEV *, Instruction *>, TrackingVH<Value> >::iterator
I = InsertedExpressions.find(std::make_pair(S, InsertPt));		I = InsertedExpressions.find(std::make_pair(S, InsertPt));
if (I != InsertedExpressions.end())		if (I != InsertedExpressions.end())
return I->second;		return I->second;

BuilderType::InsertPointGuard Guard(Builder);		BuilderType::InsertPointGuard Guard(Builder);
Builder.SetInsertPoint(InsertPt->getParent(), InsertPt);		Builder.SetInsertPoint(InsertPt->getParent(), InsertPt);

// Expand the expression into instructions.		// Expand the expression into instructions.
Value *V = visit(S);		std::vector<WeakVH> *Vec = SE.getSCEVValue(S);
		Value *V = nullptr;
		hfinkel:
		Why are we reusing an existing value only for AddRecs? I see in the summary that you say:
		> The intuition is, if only SCEV doesn't contain scAddRecExpr, using the original value to expand will not nullify valid loop transformations.
		But I don't see the downside to always reusing an available value from that.
		Another potentially-problematic issue is that SCEVs may not be unique, and I'm a bit concerned about always taking only the first or last such value encountered, because it imposes an indirect constraint on the users of ScalarEvolution and the expander to ensure that they always visit all such values in some deterministic order. This is not currently the case. Moreover, it can be problematic if, for example, you symbolically compute X-Y first, and then call getSCEV on a value that happens to be X-Y, will expand differently than calling getSCEV on that value and doing the symbolic calculation later.
		Also, if we are going to restrict these to a subclass of SCEVs, why wouldn't you only store the values/SCEV pair in the map if the SCEV satisfies hasAnyRec?
		wmi:
		> Why are we reusing an existing value only for AddRecs?
		No, we are reusing an existing value only for SCEVs which are not scAddRecExpr. The downside of using an existing value for scAddRecExpr is that it may nullify some optimization done by LSR. (I think only scAddRecExpr-type SCEVs are substantially involved in LSR optimization.)
		> Another potentially-problematic issue is that SCEVs may not be unique, and I'm a bit concerned about always taking only the first or last such value encountered, because it imposes an indirect constraint on the users of ScalarEvolution and the expander to ensure that they always visit all such values in some deterministic order. This is not currently the case. Moreover, it can be problematic if, for example, you symbolically compute X-Y first, and then call getSCEV on a value that happens to be X-Y, will expand differently than calling getSCEV on that value and doing the symbolic calculation later.
		I guess your point is: multiple values can be mapped to the same SCEV. Only one of those values will be recorded in ExprValueMap and will be used in expansion. To get the maximum optimization opportunity, the values must be encountered/expanded in a certain order. However, I think the same problem exists even without the patch, because multiple values can be mapped to the same SCEV with or without the patch (this is determined by ScalarEvolution::createSCEV, which the patch does not change).
		value = X-Y ... To expand S1.
		Suppose X-Y and X'-Y' will both be mapped to S1. Whether S1 will be expanded to X-Y or X'-Y' depends on which one is first encountered by getSCEV. This is true even without the patch.
		> Also, if we are going to restrict these to a subclass of SCEVs, why wouldn't you only store the values/SCEV pair in the map if the SCEV satisfies hasAnyRec?
		I think you mean not to store the value/scAddRecExpr pair in the map. It can make the map smaller. I will change it.
		hfinkel:
		> No, we are reusing an existing value only for scevs which are not scAddRecExpr. The downside to use existing value for scAddRecExpr is that it may nullify some optimization done by LSR. (I think only scAddRecExpr type scev is substantially involved in LSR optimization)
		I'm very afraid here of creating a quirky interface that, while appearing to offer a general set of facilities, contains a set of unexpected behaviors tailored to a specific consumer (LSR, in this case). Regardless, ScalarEvolutionExpander already contains a special 'LSRMode', and we should key any LSR-specific customizations off of that. In the general case, we should have a consistent behavior.
		wmi:
		1. I rethought your previous comments about the uncertainty of the expansion result and found something I can improve. Thanks for those comments. A case is like this:
		BBi:
		%sub2 = %x - %y;
		...
		BBj:
		%x = load %a;
		%y = load %b;
		%sub1 = %x - %y;
		...
		%i_0 = %sub1;
		for.body:
		%i_1 = PHI (%i_0, %i_next)
		%i_next = %i_1 + 1;
		%cmp = icmp slt %i_next, %z
		br i1 %cmp, label %for.body, label %for.end
		for.end:
		If we expand the SCEV when we try to get the backedge count, without the patch it will be "%z - (%x - %y) - 1". With the patch it may be "%z - %sub1 - 1" if %sub1 is recorded in the SCEV of "%x - %y", or it may be "%z - %sub2 - 1" if %sub2 is recorded in the SCEV of "%x - %y". It is also possible that BBi cannot reach BBj, so the SCEV may also expand to "%z - (%x - %y) - 1" with the patch.
		This can be improved. I can record the set of all possible Values mapped to the same SCEV in ExprValueMap. In SCEVExpander::expand, I will select one value from the set which dominates the insert point, so in the case above, the SCEV will only be expanded to "%z - %sub1 - 1" if BBi cannot reach BBj. Reuse will then be realized every time, and there will be much less uncertainty in the expansion result.
		2. Regarding the other concern you raised: although the existing behavior of SCEVExpander::expand is more consistent (it always literally generates all the computations the SCEV represents), it sacrifices the opportunity to reuse existing values. The patch introduces some inconsistency: we keep the expansion behavior of scAddRecExpr the same as without the patch, but may reuse existing values when expanding other kinds of SCEV. So the inconsistency here is actually an improvement when expanding non-scAddRecExpr SCEVs.
		As for the generality of the interface: besides LSR, other components can still use the interface the same way as before and can rely on it to generate an equal value, without affecting correctness but possibly at lower cost. I think I can describe the inconsistency more clearly in comments to remove potential confusion for users of the interface.
		if (Vec && !SE.hasAnyRec(S)) {
		// Choose a Value from the vector which dominates the insertPt.
		hfinkel: LSR -> LSR's
		for (auto const &ent : *Vec) {
		qcolombet: This seems for the users of the SCEV API to know that when all those conditions apply, it needs to create a new value.
		wmi: Yes, that is a precise description about the use of the code above.
		if (ent && isa<Instruction>(ent) && S->getType() == ent->getType() &&
		sanjoy: Nit: LLVM naming style is `auto const &Ent : *Vec`.
		SE.DT->dominates(cast<Instruction>(ent), InsertPt)) {
		V = ent;
		break;
		hfinkel: You're iterating over the elements of a set here, and those have WeakVH (i.e. pointer-valued) keys. That seems unlikely to be deterministic. SetVector seems like a better choice.
		}
		}
		}
		if (!V)
		V = visit(S);

// Remember the expanded value for this SCEV at this location.		// Remember the expanded value for this SCEV at this location.
//		//
// This is independent of PostIncLoops. The mapped value simply materializes		// This is independent of PostIncLoops. The mapped value simply materializes
// the expression at this insertion point. If the mapped value happened to be		// the expression at this insertion point. If the mapped value happened to be
// a postinc expansion, it could be reused by a non-postinc user, but only if		// a postinc expansion, it could be reused by a non-postinc user, but only if
// its insertion point was already at the head of the loop.		// its insertion point was already at the head of the loop.
InsertedExpressions[std::make_pair(S, InsertPt)] = V;		InsertedExpressions[std::make_pair(S, InsertPt)] = V;
▲ Show 20 Lines • Show All 298 Lines • Show Last 20 Lines
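
The selection step discussed in this file (reuse a recorded value only if it dominates the insertion point, scanning candidates in a deterministic order) can be sketched with a toy model. This is an illustration under stated assumptions, not the patch: `ToyDomTree`, `Candidate`, and `pickReusable` are hypothetical names, dominance is checked at block granularity only, and candidates are kept in insertion order in a vector so the result does not depend on pointer values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy dominator tree: IDom[B] is the immediate dominator of block B, and
// the entry block is its own immediate dominator. Hypothetical model, not
// LLVM's DominatorTree.
struct ToyDomTree {
  std::vector<std::size_t> IDom;

  // True if block A dominates block B (every block dominates itself).
  bool dominates(std::size_t A, std::size_t B) const {
    assert(A < IDom.size() && B < IDom.size() && "unknown block");
    while (A != B) {
      if (IDom[B] == B)
        return false; // Reached the entry block without meeting A.
      B = IDom[B];
    }
    return true;
  }
};

// A recorded value and the block that defines it.
struct Candidate {
  int ValueId;
  std::size_t DefBlock;
};

// Return the first candidate, in recording order, whose defining block
// dominates InsertBlock, or nullptr if none qualifies.
const Candidate *pickReusable(const ToyDomTree &DT,
                              const std::vector<Candidate> &Candidates,
                              std::size_t InsertBlock) {
  for (const Candidate &C : Candidates)
    if (DT.dominates(C.DefBlock, InsertBlock))
      return &C;
  return nullptr;
}
```

When `pickReusable` returns `nullptr`, the caller would fall back to expanding the expression from scratch, which mirrors the `if (!V) V = visit(S);` fallback in the diff.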

lib/Transforms/Vectorize/LoopVectorize.cpp

Show First 20 Lines • Show All 2,589 Lines • ▼ Show 20 Lines	void InnerLoopVectorizer::createEmptyLoop() {
// The loop step is equal to the vectorization factor (num of SIMD elements)		// The loop step is equal to the vectorization factor (num of SIMD elements)
// times the unroll factor (num of SIMD instructions).		// times the unroll factor (num of SIMD instructions).
Constant *Step = ConstantInt::get(IdxTy, VF * UF);		Constant *Step = ConstantInt::get(IdxTy, VF * UF);

// Generate code to check that the loop's trip count that we computed by		// Generate code to check that the loop's trip count that we computed by
// adding one to the backedge-taken count will not overflow.		// adding one to the backedge-taken count will not overflow.
BasicBlock *NewVectorPH =		BasicBlock *NewVectorPH =
VectorPH->splitBasicBlock(VectorPH->getTerminator(), "overflow.checked");		VectorPH->splitBasicBlock(VectorPH->getTerminator(), "overflow.checked");
		// Update dominate tree immediately if the generated block is a
		sanjoy: Nit: wrapping
		hfinkel: dominate -> dominator
		// LoopBypassBlock
		// because SCEV expansions to generate loop bypass checks may query it before
		hfinkel: func -> function (no need to abbreviate here)
		// the current func is finished.
		DT->addNewBlock(NewVectorPH, VectorPH);
		qcolombet: The formatting of the comment looks strange and the comment itself is hard to digest. Could you rephrase? Also, don’t we have a split method directly on the DT? Having to explicitly add the block to DT seems error prone to me.
		wmi: Sorry, I will fix the comment. The comment is trying to say that the SCEV expansion may query the DT while the function createEmptyLoop is generating new bypass blocks. This happens before InnerLoopVectorizer::updateAnalysis updates the whole DT, so we need to maintain the DT incrementally. I don't know the common way to do that, so I just moved the code originally in InnerLoopVectorizer::updateAnalysis forward.
if (ParentLoop)		if (ParentLoop)
ParentLoop->addBasicBlockToLoop(NewVectorPH, *LI);		ParentLoop->addBasicBlockToLoop(NewVectorPH, *LI);
ReplaceInstWithInst(		ReplaceInstWithInst(
VectorPH->getTerminator(),		VectorPH->getTerminator(),
BranchInst::Create(ScalarPH, NewVectorPH, CheckBCOverflow));		BranchInst::Create(ScalarPH, NewVectorPH, CheckBCOverflow));
VectorPH = NewVectorPH;		VectorPH = NewVectorPH;

// This is the IR builder that we use to add all of the logic for bypassing		// This is the IR builder that we use to add all of the logic for bypassing
Show All 24 Lines	Value *IdxEndRoundDown = BypassBuilder.CreateAdd(CountRoundDown, StartIdx,
"end.idx.rnd.down");		"end.idx.rnd.down");

// Now, compare the new count to zero. If it is zero skip the vector loop and		// Now, compare the new count to zero. If it is zero skip the vector loop and
// jump to the scalar loop.		// jump to the scalar loop.
Value *Cmp =		Value *Cmp =
BypassBuilder.CreateICmpEQ(IdxEndRoundDown, StartIdx, "cmp.zero");		BypassBuilder.CreateICmpEQ(IdxEndRoundDown, StartIdx, "cmp.zero");
NewVectorPH =		NewVectorPH =
VectorPH->splitBasicBlock(VectorPH->getTerminator(), "vector.ph");		VectorPH->splitBasicBlock(VectorPH->getTerminator(), "vector.ph");
		DT->addNewBlock(NewVectorPH, VectorPH);
if (ParentLoop)		if (ParentLoop)
ParentLoop->addBasicBlockToLoop(NewVectorPH, *LI);		ParentLoop->addBasicBlockToLoop(NewVectorPH, *LI);
LoopBypassBlocks.push_back(VectorPH);		LoopBypassBlocks.push_back(VectorPH);
ReplaceInstWithInst(VectorPH->getTerminator(),		ReplaceInstWithInst(VectorPH->getTerminator(),
BranchInst::Create(MiddleBlock, NewVectorPH, Cmp));		BranchInst::Create(MiddleBlock, NewVectorPH, Cmp));
VectorPH = NewVectorPH;		VectorPH = NewVectorPH;

// Generate the code to check that the strides we assumed to be one are really		// Generate the code to check that the strides we assumed to be one are really
// one. We want the new basic block to start at the first instruction in a		// one. We want the new basic block to start at the first instruction in a
// sequence of instructions that form a check.		// sequence of instructions that form a check.
Instruction *StrideCheck;		Instruction *StrideCheck;
Instruction *FirstCheckInst;		Instruction *FirstCheckInst;
std::tie(FirstCheckInst, StrideCheck) =		std::tie(FirstCheckInst, StrideCheck) =
addStrideCheck(VectorPH->getTerminator());		addStrideCheck(VectorPH->getTerminator());
if (StrideCheck) {		if (StrideCheck) {
AddedSafetyChecks = true;		AddedSafetyChecks = true;
// Create a new block containing the stride check.		// Create a new block containing the stride check.
VectorPH->setName("vector.stridecheck");		VectorPH->setName("vector.stridecheck");
NewVectorPH =		NewVectorPH =
VectorPH->splitBasicBlock(VectorPH->getTerminator(), "vector.ph");		VectorPH->splitBasicBlock(VectorPH->getTerminator(), "vector.ph");
		DT->addNewBlock(NewVectorPH, VectorPH);
if (ParentLoop)		if (ParentLoop)
ParentLoop->addBasicBlockToLoop(NewVectorPH, *LI);		ParentLoop->addBasicBlockToLoop(NewVectorPH, *LI);
LoopBypassBlocks.push_back(VectorPH);		LoopBypassBlocks.push_back(VectorPH);

// Replace the branch into the memory check block with a conditional branch		// Replace the branch into the memory check block with a conditional branch
// for the "few elements case".		// for the "few elements case".
ReplaceInstWithInst(		ReplaceInstWithInst(
VectorPH->getTerminator(),		VectorPH->getTerminator(),
Show All 9 Lines	void InnerLoopVectorizer::createEmptyLoop() {
std::tie(FirstCheckInst, MemRuntimeCheck) =		std::tie(FirstCheckInst, MemRuntimeCheck) =
Legal->getLAI()->addRuntimeCheck(VectorPH->getTerminator());		Legal->getLAI()->addRuntimeCheck(VectorPH->getTerminator());
if (MemRuntimeCheck) {		if (MemRuntimeCheck) {
AddedSafetyChecks = true;		AddedSafetyChecks = true;
// Create a new block containing the memory check.		// Create a new block containing the memory check.
VectorPH->setName("vector.memcheck");		VectorPH->setName("vector.memcheck");
NewVectorPH =		NewVectorPH =
VectorPH->splitBasicBlock(VectorPH->getTerminator(), "vector.ph");		VectorPH->splitBasicBlock(VectorPH->getTerminator(), "vector.ph");
		DT->addNewBlock(NewVectorPH, VectorPH);
if (ParentLoop)		if (ParentLoop)
ParentLoop->addBasicBlockToLoop(NewVectorPH, *LI);		ParentLoop->addBasicBlockToLoop(NewVectorPH, *LI);
LoopBypassBlocks.push_back(VectorPH);		LoopBypassBlocks.push_back(VectorPH);

// Replace the branch into the memory check block with a conditional branch		// Replace the branch into the memory check block with a conditional branch
// for the "few elements case".		// for the "few elements case".
ReplaceInstWithInst(		ReplaceInstWithInst(
VectorPH->getTerminator(),		VectorPH->getTerminator(),
▲ Show 20 Lines • Show All 1,001 Lines • ▼ Show 20 Lines
void InnerLoopVectorizer::updateAnalysis() {		void InnerLoopVectorizer::updateAnalysis() {
// Forget the original basic block.		// Forget the original basic block.
SE->forgetLoop(OrigLoop);		SE->forgetLoop(OrigLoop);

// Update the dominator tree information.		// Update the dominator tree information.
assert(DT->properlyDominates(LoopBypassBlocks.front(), LoopExitBlock) &&		assert(DT->properlyDominates(LoopBypassBlocks.front(), LoopExitBlock) &&
"Entry does not dominate exit.");		"Entry does not dominate exit.");

for (unsigned I = 1, E = LoopBypassBlocks.size(); I != E; ++I)
DT->addNewBlock(LoopBypassBlocks[I], LoopBypassBlocks[I-1]);
DT->addNewBlock(LoopVectorPreHeader, LoopBypassBlocks.back());

// Due to if predication of stores we might create a sequence of "if(pred)		// Due to if predication of stores we might create a sequence of "if(pred)
// a[i] = ...; " blocks.		// a[i] = ...; " blocks.
for (unsigned i = 0, e = LoopVectorBody.size(); i != e; ++i) {		for (unsigned i = 0, e = LoopVectorBody.size(); i != e; ++i) {
if (i == 0)		if (i == 0)
DT->addNewBlock(LoopVectorBody[0], LoopVectorPreHeader);		DT->addNewBlock(LoopVectorBody[0], LoopVectorPreHeader);
else if (isPredicatedBlock(i)) {		else if (isPredicatedBlock(i)) {
DT->addNewBlock(LoopVectorBody[i], LoopVectorBody[i-1]);		DT->addNewBlock(LoopVectorBody[i], LoopVectorBody[i-1]);
} else {		} else {
▲ Show 20 Lines • Show All 1,693 Lines • Show Last 20 Lines
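
The incremental update discussed for this file (registering each newly split block with the dominator tree immediately, so that queries made before `updateAnalysis` see it) can be sketched with a toy tree. This is a simplified model with hypothetical names (`ToyDomTree`, string block names); it only mimics the shape of an `addNewBlock(NewBlock, Dominator)` call and makes no claim about how LLVM's `DominatorTree` answers queries for unregistered blocks.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy dominator tree keyed by block name. Each block maps to its
// immediate dominator; the entry block maps to itself.
class ToyDomTree {
  std::map<std::string, std::string> IDom;

public:
  explicit ToyDomTree(const std::string &Entry) { IDom[Entry] = Entry; }

  bool contains(const std::string &Block) const {
    return IDom.count(Block) != 0;
  }

  // Register NewBlock as immediately dominated by Dom.
  void addNewBlock(const std::string &NewBlock, const std::string &Dom) {
    assert(contains(Dom) && "dominator must already be in the tree");
    assert(!contains(NewBlock) && "block registered twice");
    IDom[NewBlock] = Dom;
  }

  // In this toy model, a query about an unregistered block answers false.
  bool dominates(const std::string &A, const std::string &B) const {
    if (!contains(A) || !contains(B))
      return false;
    std::string Cur = B;
    while (Cur != A) {
      const std::string &Parent = IDom.at(Cur);
      if (Parent == Cur)
        return false; // Reached the entry block without meeting A.
      Cur = Parent;
    }
    return true;
  }
};
```

In this model, a dominance query about a block that was split off but not yet registered gives an unhelpful answer, which is the situation the eager `addNewBlock` calls in the diff are meant to avoid.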

test/CodeGen/Thumb2/2009-12-01-LoopIVUsers.ll

	; RUN: opt < %s -O3 \| \			; RUN: opt < %s -O3 \| \
	; RUN: llc -mtriple=thumbv7-apple-darwin10 -mattr=+neon \| FileCheck %s			; RUN: llc -mtriple=thumbv7-apple-darwin10 -mattr=+neon \| FileCheck %s

	target datalayout = "e-p:32:32:32-i1:8:32-i8:8:32-i16:16:32-i32:32:32-i64:32:32-f32:32:32-f64:32:32-v64:64:64-v128:128:128-a0:0:32"			target datalayout = "e-p:32:32:32-i1:8:32-i8:8:32-i16:16:32-i32:32:32-i64:32:32-f32:32:32-f64:32:32-v64:64:64-v128:128:128-a0:0:32"

	define void @fred(i32 %three_by_three, i8* %in, double %dt1, i32 %x_size, i32 %y_size, i8* %bp) nounwind {			define void @fred(i32 %three_by_three, i8* %in, double %dt1, i32 %x_size, i32 %y_size, i8* %bp) nounwind {
	entry:			entry:
	; -- The loop following the load should only use a single add-literation			; -- The loop following the load should only use a single add-literation
	; instruction.			; instruction.
	; CHECK: vldr			; CHECK: vldr
	; CHECK: adds r{{[0-9]+.*}}#1
	; CHECK-NOT: adds			; CHECK-NOT: adds
	; CHECK: subsections_via_symbols			; CHECK: subsections_via_symbols


	%three_by_three_addr = alloca i32 ; <i32*> [#uses=2]			%three_by_three_addr = alloca i32 ; <i32*> [#uses=2]
	%in_addr = alloca i8* ; <i8**> [#uses=2]			%in_addr = alloca i8* ; <i8**> [#uses=2]
	%dt_addr = alloca float ; <float*> [#uses=4]			%dt_addr = alloca float ; <float*> [#uses=4]
	%x_size_addr = alloca i32 ; <i32*> [#uses=2]			%x_size_addr = alloca i32 ; <i32*> [#uses=2]
	▲ Show 20 Lines • Show All 111 Lines • Show Last 20 Lines

test/Transforms/IRCE/decrementing-loop.ll

	Show All 22 Lines

	out.of.bounds:			out.of.bounds:
	ret void			ret void

	exit:			exit:
	ret void			ret void

	; CHECK: loop.preheader:			; CHECK: loop.preheader:
	; CHECK: [[indvar_start:[^ ]+]] = add i32 %n, -1
	; CHECK: [[not_len:[^ ]+]] = sub i32 -1, %len			; CHECK: [[not_len:[^ ]+]] = sub i32 -1, %len
	; CHECK: [[not_n:[^ ]+]] = sub i32 -1, %n			; CHECK: [[not_n:[^ ]+]] = sub i32 -1, %n
	; CHECK: [[not_len_hiclamp_cmp:[^ ]+]] = icmp sgt i32 [[not_len]], [[not_n]]			; CHECK: [[not_len_hiclamp_cmp:[^ ]+]] = icmp sgt i32 [[not_len]], [[not_n]]
	; CHECK: [[not_len_hiclamp:[^ ]+]] = select i1 [[not_len_hiclamp_cmp]], i32 [[not_len]], i32 [[not_n]]			; CHECK: [[not_len_hiclamp:[^ ]+]] = select i1 [[not_len_hiclamp_cmp]], i32 [[not_len]], i32 [[not_n]]
	; CHECK: [[len_hiclamp:[^ ]+]] = sub i32 -1, [[not_len_hiclamp]]			; CHECK: [[len_hiclamp:[^ ]+]] = sub i32 -1, [[not_len_hiclamp]]
	; CHECK: [[not_exit_preloop_at_cmp:[^ ]+]] = icmp sgt i32 [[len_hiclamp]], 0			; CHECK: [[not_exit_preloop_at_cmp:[^ ]+]] = icmp sgt i32 [[len_hiclamp]], 0
	; CHECK: [[not_exit_preloop_at:[^ ]+]] = select i1 [[not_exit_preloop_at_cmp]], i32 [[len_hiclamp]], i32 0			; CHECK: [[not_exit_preloop_at:[^ ]+]] = select i1 [[not_exit_preloop_at_cmp]], i32 [[len_hiclamp]], i32 0
	; CHECK: %exit.preloop.at = add i32 [[not_exit_preloop_at]], -1			; CHECK: %exit.preloop.at = add i32 [[not_exit_preloop_at]], -1
	}			}

	!0 = !{i32 0, i32 2147483647}			!0 = !{i32 0, i32 2147483647}
	!1 = !{!"branch_weights", i32 64, i32 4}			!1 = !{!"branch_weights", i32 64, i32 4}

test/Transforms/IndVarSimplify/lftr-address-space-pointers.ll

	; RUN: opt -S -indvars -o - %s \| FileCheck %s			; RUN: opt -S -indvars -o - %s \| FileCheck %s
	target datalayout = "e-p:32:32:32-p1:64:64:64-p2:8:8:8-p3:16:16:16-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:32-n8:16:32:64"			target datalayout = "e-p:32:32:32-p1:64:64:64-p2:8:8:8-p3:16:16:16-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:32-n8:16:32:64"

	; Derived from ptriv in lftr-reuse.ll			; Derived from ptriv in lftr-reuse.ll
	define void @ptriv_as2(i8 addrspace(2)* %base, i32 %n) nounwind {			define void @ptriv_as2(i8 addrspace(2)* %base, i32 %n) nounwind {
	; CHECK-LABEL: @ptriv_as2(			; CHECK-LABEL: @ptriv_as2(
	entry:			entry:
	%idx.trunc = trunc i32 %n to i8			%idx.trunc = trunc i32 %n to i8
	%add.ptr = getelementptr inbounds i8, i8 addrspace(2)* %base, i8 %idx.trunc			%add.ptr = getelementptr inbounds i8, i8 addrspace(2)* %base, i8 %idx.trunc
	%cmp1 = icmp ult i8 addrspace(2)* %base, %add.ptr			%cmp1 = icmp ult i8 addrspace(2)* %base, %add.ptr
	br i1 %cmp1, label %for.body, label %for.end			br i1 %cmp1, label %for.body, label %for.end

	; Make sure the added GEP has the right index type			; Make sure the added GEP has the right index type
	; CHECK: %lftr.limit = getelementptr i8, i8 addrspace(2)* %base, i8 %0			; CHECK: %lftr.limit = getelementptr i8, i8 addrspace(2)* %base, i8 %idx.trunc

	; CHECK: for.body:			; CHECK: for.body:
	; CHECK: phi i8 addrspace(2)*			; CHECK: phi i8 addrspace(2)*
	; CHECK-NOT: phi			; CHECK-NOT: phi
	; CHECK-NOT: add{{^rspace}}			; CHECK-NOT: add{{^rspace}}
	; CHECK: icmp ne i8 addrspace(2)*			; CHECK: icmp ne i8 addrspace(2)*
	; CHECK: br i1			; CHECK: br i1
	for.body:			for.body:
	Show All 15 Lines
	; CHECK-LABEL: @ptriv_as3(			; CHECK-LABEL: @ptriv_as3(
	entry:			entry:
	%idx.trunc = trunc i32 %n to i16			%idx.trunc = trunc i32 %n to i16
	%add.ptr = getelementptr inbounds i8, i8 addrspace(3)* %base, i16 %idx.trunc			%add.ptr = getelementptr inbounds i8, i8 addrspace(3)* %base, i16 %idx.trunc
	%cmp1 = icmp ult i8 addrspace(3)* %base, %add.ptr			%cmp1 = icmp ult i8 addrspace(3)* %base, %add.ptr
	br i1 %cmp1, label %for.body, label %for.end			br i1 %cmp1, label %for.body, label %for.end

	; Make sure the added GEP has the right index type			; Make sure the added GEP has the right index type
	; CHECK: %lftr.limit = getelementptr i8, i8 addrspace(3)* %base, i16 %0			; CHECK: %lftr.limit = getelementptr i8, i8 addrspace(3)* %base, i16 %idx.trunc

	; CHECK: for.body:			; CHECK: for.body:
	; CHECK: phi i8 addrspace(3)*			; CHECK: phi i8 addrspace(3)*
	; CHECK-NOT: phi			; CHECK-NOT: phi
	; CHECK-NOT: add{{^rspace}}			; CHECK-NOT: add{{^rspace}}
	; CHECK: icmp ne i8 addrspace(3)*			; CHECK: icmp ne i8 addrspace(3)*
	; CHECK: br i1			; CHECK: br i1
	for.body:			for.body:
	Show All 15 Lines

test/Transforms/IndVarSimplify/udiv.ll

Show First 20 Lines • Show All 121 Lines • ▼ Show 20 Lines	while.end: ; preds = %while.cond.while.end_crit_edge, %while.cond.preheader
%call40 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([11 x i8], [11 x i8]* @.str, i64 0, i64 0), i32 %count.0.lcssa) nounwind ; <i32> [#uses=0]		%call40 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([11 x i8], [11 x i8]* @.str, i64 0, i64 0), i32 %count.0.lcssa) nounwind ; <i32> [#uses=0]
ret i32 0		ret i32 0
}		}

declare i32 @atoi(i8* nocapture) nounwind readonly		declare i32 @atoi(i8* nocapture) nounwind readonly

declare i32 @printf(i8* nocapture, ...) nounwind		declare i32 @printf(i8* nocapture, ...) nounwind

; IndVars shouldn't be afraid to emit a udiv here, since there's a udiv in		; IndVars doesn't emit a udiv in for.body.preheader since SCEVExpander::expand will
; the original code.		; find out there's already a udiv in the original code.

; CHECK-LABEL: @foo(		; CHECK-LABEL: @foo(
; CHECK: for.body.preheader:		; CHECK: for.body.preheader:
; CHECK-NEXT: udiv		; CHECK-NOT: udiv

define void @foo(double* %p, i64 %n) nounwind {		define void @foo(double* %p, i64 %n) nounwind {
entry:		entry:
%div0 = udiv i64 %n, 7 ; <i64> [#uses=1]		%div0 = udiv i64 %n, 7 ; <i64> [#uses=1]
%div1 = add i64 %div0, 1		%div1 = add i64 %div0, 1
%cmp2 = icmp ult i64 0, %div1 ; <i1> [#uses=1]		%cmp2 = icmp ult i64 0, %div1 ; <i1> [#uses=1]
br i1 %cmp2, label %for.body.preheader, label %for.end		br i1 %cmp2, label %for.body.preheader, label %for.end

Show All 19 Lines

test/Transforms/IndVarSimplify/ult-sub-to-eq.ll

Show All 26 Lines	for.body: ; preds = %entry, %for.body
%cmp = icmp ult i32 %3, %sub		%cmp = icmp ult i32 %3, %sub
br i1 %cmp, label %for.body, label %for.end		br i1 %cmp, label %for.body, label %for.end

for.end: ; preds = %for.body, %entry		for.end: ; preds = %for.body, %entry
ret void		ret void

; CHECK-LABEL: @test1(		; CHECK-LABEL: @test1(

; First check that we move the sub into the preheader, it doesn't have to be		; check that we turn the IV test into an eq.
; executed if %cmp4 == false
; CHECK: for.body.preheader:
; CHECK: sub i32 %data_len, %sample
; CHECK: br label %for.body

; Second, check that we turn the IV test into an eq.
; CHECK: %lftr.wideiv = trunc i64 %indvars.iv.next to i32		; CHECK: %lftr.wideiv = trunc i64 %indvars.iv.next to i32
; CHECK: %exitcond = icmp ne i32 %lftr.wideiv, %0		; CHECK: %exitcond = icmp ne i32 %lftr.wideiv, %sub
; CHECK: br i1 %exitcond, label %for.body, label %for.end.loopexit		; CHECK: br i1 %exitcond, label %for.body, label %for.end.loopexit
}		}

test/Transforms/LoopStrengthReduce/post-inc-icmpzero.ll

	; RUN: opt -loop-reduce -S < %s \| FileCheck %s			; RUN: opt -loop-reduce -S < %s \| FileCheck %s
	; PR9939			; PR9939

	; LSR should properly handle the post-inc offset when folding the			; LSR should properly handle the post-inc offset when folding the
	; non-IV operand of an icmp into the IV.			; non-IV operand of an icmp into the IV.

	; CHECK: [[r1:%[a-z0-9]+]] = sub i64 %sub.ptr.lhs.cast, %sub.ptr.rhs.cast			; CHECK: [[r1:%[a-z0-9\.]+]] = sub i64 %sub.ptr.lhs.cast, %sub.ptr.rhs.cast
	; CHECK: [[r2:%[a-z0-9]+]] = lshr i64 [[r1]], 1			; CHECK: [[r2:%[a-z0-9\.]+]] = lshr exact i64 [[r1]], 1
				; CHECK: for.body.lr.ph:
	; CHECK: [[r3:%[a-z0-9]+]] = shl i64 [[r2]], 1			; CHECK: [[r3:%[a-z0-9]+]] = shl i64 [[r2]], 1
	; CHECK: br label %for.body			; CHECK: br label %for.body
	; CHECK: for.body:			; CHECK: for.body:
	; CHECK: %lsr.iv2 = phi i64 [ %lsr.iv.next, %for.body ], [ [[r3]], %for.body.lr.ph ]			; CHECK: %lsr.iv2 = phi i64 [ %lsr.iv.next, %for.body ], [ [[r3]], %for.body.lr.ph ]
	; CHECK: %lsr.iv.next = add i64 %lsr.iv2, -2			; CHECK: %lsr.iv.next = add i64 %lsr.iv2, -2
	; CHECK: %lsr.iv.next3 = inttoptr i64 %lsr.iv.next to i16*			; CHECK: %lsr.iv.next3 = inttoptr i64 %lsr.iv.next to i16*
	; CHECK: %cmp27 = icmp eq i16* %lsr.iv.next3, null			; CHECK: %cmp27 = icmp eq i16* %lsr.iv.next3, null

	▲ Show 20 Lines • Show All 74 Lines • Show Last 20 Lines