This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
lib/Transforms/Vectorize/
-
Transforms/
-
Vectorize/
-
LoopVectorize.cpp
-
test/Transforms/LoopVectorize/
-
Transforms/
-
LoopVectorize/
-
consecutive-ptr-cg-bug.ll

Differential D39346

[LV] [ScalarEvolution] Fix PR34965 - Cache pointer stride information before LV code gen
AbandonedPublic

Authored by dcaballe on Oct 26 2017, 3:02 PM.

Download Raw Diff

Details

Reviewers

mkuper
hfinkel
rengolin
gilr
Ayal
hsaito

Summary

This patch fixes PR34965. I have some concerns, though, about the current solution. I'm attaching the information I posted in Bugzilla. You can skip the Investigation section to have a quick understanding of what's going on, what the current patch does, my concerns and alternative solutions.

Thanks,
Diego

Symptom

Return value of Legal->isConsecutivePtr(%11) during LV code gen has changed from 1 to 0 as a side effect of VPlan patch (revision r311077). This value 0 (meaning non-consecutive pointer) during code gen is inconsistent with value 1 (meaning stride-1 pointer) returned during legal/cost model.
For that reason, store using %11 is considered a stride-1 vector store during legal/cost model but a scatter during code gen, resulting in an assert.

Investigation

createVectorizedLoopSkeleton in LV code gen generates the vector loop, turns original scalar loop into the scalar remainder, introduces the middle and scalar.ph basic blocks after the vector loop and generates new phi nodes that feed the scalar remainder (the IR after createVectorizedLoopSkeleton is attached (after_createVectorizedLoopSkeleton.ll). After this IR modification, Legal->isConsecutivePtr() no longer returns 1 for the problematic pointer, now in the scalar remainder.

Legal->isConsecutivePtr() indirectly uses ScalarEvolution (SE) to determine the pointer stride. For this particular test, Legal->isUniform() also triggers the invalidation of some SCEVs in SE cache, including the one for %11. This bug is exposed by VPlan patch because it changes the execution order of Legal->isUniform() and Legal->isConsecutivePtr() happening before LV code gen. This is the invocation order:

Before VPlan patch: Legal->isUniform(…) Invalidates SCEV for %11. Legal->isConsecutivePtr(%11) Computes new SCEV for %11. It returns 1. createVectorizedLoopSkeleton() Modifies IR (middle, scalar.ph and new phis) Legal->isConsecutivePtr(%11) Uses cached SCEV for %11. It returns 1.
After VPlan patch: Legal->isConsecutivePtr(%11) Uses cached SCEV for %11. It returns 1. Legal->isUniform(…) Invalidates SCEV for %11. createVectorizedLoopSkeleton() Modifies IR (middle, scalar.ph and new phis) Legal->isConsecutivePtr(%11) Computes new SCEV for %11 using modified IR. It returns 0!

For this test, recomputing Legal->isConsecutivePtr(%11) after createVectorizedLoopSkeleton yields a different output because SE is not able to detect a recurrence. These are the SCEVs computed for the problematic pointer before and after createVectorizedLoopSkeleton:

Before (%11 in original IR): ({(8 + (4 * (zext i32 %.ph2 to i64)) + undef),+,4}<nsw><%6>). // SCEVAddRec
After (%21 in modified IR): ((4 * (zext i32 (2 + %18) to i64))<nuw><nsw> + undef)<nsw> // No recurrence

In order to return a SCEVAddRec for the latter, SE would have to be able to represent %bc.resume.val and %bc.resume.val1 in terms of more basic components to conclude that %bc.resume.val1 is always %bc.resume.val + 1. Unfortunately, there is no way to represent a non-inductive phi operation in SCEV at this point.

Root Cause

Current LV code gen is based on the fragile assumption that analyzing the scalar remainder loop will yield the same information as analyzing the input scalar loop. Unfortunately, this test demonstrates that this is not always the case so we cannot blindly rely on that assumption.

Tentative Fix

Pointer strides are cached in Legal during the analysis of the input scalar loop (following the same approach as for uniform and scalar values). Legal->isConsecutivePtr now uses cached pointer strides instead of blindly recomputing the same information every time. Pointer strides are recomputed when runtime checks are added as a requirement for vectorization (PSE.addPredicate in LV).

Concern: LAI-> replaceSymbolicStrideSCEV may also add predicates to PSE. This routine is called when pointer strides are collected but is also called from other analysis-like routines in LAI (hasComputableBounds, isNoWrap, getPTrStride, etc.). If a new predicate that changes pointer strides is added after collecting them, those changes won’t be taken into account when using Legal->isConsecutivePtr. This scenario could be unlikely given that LAI-> replaceSymbolicStrideSCEV is also used when collecting pointer strides but we need to be sure. I would need some feedback from someone with experience in PSE/LAI.

Alternative/Complementary Solutions

If Tentative Fix is not enough to deal with unexpected PSE->AddPredicate from LAI, we could revert changes in Legal->isConsecutivePtr() and collect Legal->isConsecutivePtr() output before going into LV code gen.

Extend SE so that a SCEVAddRec can be generated for this test.

Diff Detail

Event Timeline

dcaballe created this revision.Oct 26 2017, 3:02 PM

dcaballe edited the summary of this revision. (Show Details)

In reply to Hal Finkel's comments in llvm-dev:

"P.S. I agree with your conclusions based on your description of what's going on: We need to cache the results before we start transforming things. I also agree that we should potentially be concerned with maintaining the cache in the face of new predicates being added. Maybe we should instantiate the cached values as a separate setup after all predicates have been added?"

That is my concern. It's not clear to me when "all predicates have been added". replaceSymbolicStrideSCEV can be called from multiple routines, as I explained in Tentative Fix. I guess we could assume that all predicates have been added just before going into LV code gen. If that is the case, we could call 'collectPtrStride' only once, for example, in line 7668 (just before createVectorizedLoopSkeleton) and do something like the following in Legal->isConsecutivePtr():

if !PtrStrides.empty()
    // Compute isConsecutivePtr using PtrStride information
else
    // Compute isConsecutivePtr from scratch.

dcaballe added reviewers: mkuper, hfinkel.Oct 26 2017, 3:35 PM

rogfer01 added a subscriber: rogfer01.Oct 26 2017, 11:08 PM

a.elovikov added a subscriber: a.elovikov.Oct 28 2017, 9:48 AM

I implemented a new fix where the return value of Legal->isConsecutivePtr() is collected just before the original loop is modified. This approach seems simpler and less invasive than the one described in my previous comment.

Thanks!

Herald added a subscriber: bollu. · View Herald TranscriptNov 3 2017, 1:17 PM

In D39346#915527, @dcaballe wrote:

I implemented a new fix where the return value of Legal->isConsecutivePtr() is collected just before the original loop is modified. This approach seems simpler and less invasive than the one described in my previous comment.

I may not have understood the problem correctly, but I think we shouldn't moved cached analysis into new representations.

To me it seems that SCEV is not able to see something that has a dependency, which VPlan has just exposed. In this case, fixing/extending SCEV is the right thing to do.

I'm adding the VPlan folks to get a wider audience.

cheers,
--renato

In D39346#917815, @rengolin wrote:

In D39346#915527, @dcaballe wrote:

I implemented a new fix where the return value of Legal->isConsecutivePtr() is collected just before the original loop is modified. This approach seems simpler and less invasive than the one described in my previous comment.

I may not have understood the problem correctly, but I think we shouldn't moved cached analysis into new representations.

To me it seems that SCEV is not able to see something that has a dependency, which VPlan has just exposed. In this case, fixing/extending SCEV is the right thing to do.

I'm adding the VPlan folks to get a wider audience.

cheers,
--renato

There are two things. If we are expecting to see the same result coming back, we shouldn't re-analyze. If we are expecting to see better result, there is a good reason to re-analyze.
Memory reference kind information (unit-stride, strided, or gather) is utilized in multiple places in vectorizer, and failing to see the same/better result can lead to vectorizer's stability
problem. So, having such data handy (i.e., cache) is a good defensive programming. As we improve vectorizer's capability, we expect memory reference kind information becomes more
and more complex (e.g., unit-stride up to certain VF, piece-wise unit-stride beyond that). Re-analyzing that will become more and more undesirable. So, please consider the tentative fix
as a baby step towards that direction.

Fixing/extending SCEV is also a good thing to do ---- and if vectorizer is re-analyzing, we should add an assertion such that the resulting information hasn't degraded from the cached info.

Another thing that caught my attention was Legal->isUniform() invalidating SCEV for %11. I don't think it should. Having said that, Simon Moll (Saarland) and we (Intel team) are looking into bringing Divergence Analysis from RV project. As such, we currently do not have a good incentive to dig deeper into Legal->isUniform() to resolve this particular issue.

Thanks,
Hideki

Thanks for the feedback.

Before posting this review, I got some early feedback regarding fixing/extending SCEV for this case and it didn't look good. Maybe, we could special-case it but a generic solution, if I understood correctly, could require representing non-inductive phis in SCEV. This is not my area of expertise so, hopefully, we'll get more feedback about it. I'll ping llvm-dev if this doesn't happen in the short term.

In any case, I agree with Hideki and Hal. We should cache any analysis information before we start transforming the code. We shouldn't expect the analysis information from the original loop to be preserved after transforming the code. VPlan will address this issue in a better way but, in the meantime, this should be a reasonable fix.

Thanks,
Diego

Hi Diego,

Ok, I agree with Hideki that caching is good as well as re-evaluating and making sure that we don't degrade. However, as you both said, detecting degradation will require SCEV to go beyond its current design and new cases in the vectoriser we don't have yet.

From that point of view, I agree with Hideki that this can be a baby-step towards a better model, so the patch would be ok for now.

Can you please add some comments to collectIsConsecutivePtr to reflect that? Otherwise, LGTM. Thanks!

This revision is now accepted and ready to land.Nov 22 2017, 10:24 AM

Diego,

Nice catch! Continuing to use SCEV and expect consistent answers in the midst of restructuring the IR is indeed wrong; its cache is not meant to cover for cases that can no longer be analyzed as before.

We’d like to add few points one can learn from this issue, as you are now to support and develop VPlan going forward.

LV already records consecutivity of pointers, or rather the decision to widen the corresponding loads and stores, using CM_Widen. The only thing that seems to be missing, which required calling SCEV again when generating vector code, is to figure out if the stride is +1 or -1. This can be fixed by keeping CM_Widen for stride +1 and using, say, CM_Widen_Reverse for stride -1.
So, what change did VPlan’s patch do that triggered this issue? It refactored collectUniformsAndScalars() and collectInstsToScalarize() away from selectVectorizationFactor(), in order to facilitate building VPlans first and deciding on the best VF/VPlan later. It is still possible btw to move selectVectorizationFactor() back and embed these collect methods within expectedCost() as they worked before, to see their effect on the reproducer.
How does this refactoring cause the observed issue? Collecting uniforms fills SCEV cache entries (specifically setCostBasedWideningDecision()). Computing the cost to eventually select the best VF may invalidate certain cache entries when called the first time for VF = 1 (specifically getBackedgeTakenInfo()). That’s why first collecting uniforms for all VF’s, and then computing costs for all VF’s may leave the cache invalid, compared to interleaving the two for each VF – when more than VF = 1 is involved.
How come expectedCost() which eventually calls isUniform() after collectUniformsAndScalars() was invoked, causes any trouble (let alone invalidate SCEV cache entries) – haven’t we recorded its result already? Well, getInstructionCost() does consult the collected isUniformAfterVectorization() on each instruction, but later calls Legal->isUniform(Op2) on an operand – note that Op2 may reside outside the vectorized loop and thus not collected. Indeed the Op2 checked in the reproducer is outside the vectorized loop, and Legal->isUniform(Op2) calls LAI->isUniform() which in turn triggers SCEV’s isLoopInvariant(). Computing Op2’s AddRec SCEV triggers getBackedgeTakenInfo() of the enclosing loop, which invalidates the 1st-order-recurring AddRec inside the vectorized loop. LAI->isUniform() could simply return true for values outside TheLoop, and/or getInstructionCost() could check if Op2 is outside the vectorized loop or in the per-VF collected isUniformAfterVectorization() instead of calling Legal->isUniform(Op2).
Despite retaining the original scalar loop intact as remainder/default loop, information concerning it may be lost. Namely, values inside it which SCEV could originally analyze may now require flow-sensitive analysis (which, btw, Polyhedral Value & Memory Analysis may handle – see Johannes’s talk). This is inherent to LV, irrespective of VPlan, and may affect plans to further optimize this loop such as vectorizing it. One tentative solution to the reported case is to subtract step from the current value of an IV, instead of rewiring phi’s to access its value from the previous iteration (1st-order recurring AddRec). Worth leaving a TODO note.

Bearing in mind that all VPlan patches so far deliberately aimed to minimize functional changes, such intriguing and esoteric issues hopefully emphasize the importance of changing LV gradually and carefully, and the benefit of LLVM’s vast user-base and profound QA (e.g., PR35122 ;-).

Good Luck!
Gil and Ayal.

Thanks, Renato/Gil/Ayal for your feedback.

I think that the CM_Widen approach is a good idea but it might go beyond this bug fix because it implies extending the current design. I'll have a look to see how invasive the changes would be. If it's too invasive, I would suggest that we proceed with the current tentative fix and address the CM_Widen design extension in a separate thread.

Thanks,
Diego

dcaballe mentioned this in D40742: [LV] Fix PR34965 - Extend InstWidening with CM_Widen_Recursive.Dec 1 2017, 11:18 AM

I had a look at the CM_Widen approach that Ayal and Gil suggested and the fix was indeed very simple and cleaner than the one in this differential.
I created D40742 with the new fix so that you can compare and contrast with this one. IMO, D40742 seems a better option for this bug.

Ayal/Gil, thanks a lot for the feedback!

Diego.

Abandoning this differential. D40742 seems a better fix for this bug and was approved.
Thank you all for the feedback!

hfinkel mentioned this in rL320913: [LV] Extend InstWidening with CM_Widen_Recursive.Dec 15 2017, 6:56 PM

Revision Contents

Path

Size

lib/

Transforms/

Vectorize/

LoopVectorize.cpp

33 lines

test/

Transforms/

LoopVectorize/

consecutive-ptr-cg-bug.ll

52 lines

Diff 121535

lib/Transforms/Vectorize/LoopVectorize.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 516 Lines • ▼ Show 20 Lines	protected:
using ScalarParts = SmallVector<SmallVector<Value *, 4>, 2>;		using ScalarParts = SmallVector<SmallVector<Value *, 4>, 2>;

// When we if-convert we need to create edge masks. We have to cache values		// When we if-convert we need to create edge masks. We have to cache values
// so that we don't end up with exponential recursion/IR.		// so that we don't end up with exponential recursion/IR.
using EdgeMaskCacheTy =		using EdgeMaskCacheTy =
DenseMap<std::pair<BasicBlock , BasicBlock >, VectorParts>;		DenseMap<std::pair<BasicBlock , BasicBlock >, VectorParts>;
using BlockMaskCacheTy = DenseMap<BasicBlock *, VectorParts>;		using BlockMaskCacheTy = DenseMap<BasicBlock *, VectorParts>;

		/// Collect the return value of Legal->isConsecutivePtr() for each pointer in
		/// the loop. This is done before modifications to the loop which can affect
		/// the return value of Legal->isConsecutivePtr().
		void collectIsConsecutivePtr();

/// Set up the values of the IVs correctly when exiting the vector loop.		/// Set up the values of the IVs correctly when exiting the vector loop.
void fixupIVUsers(PHINode *OrigPhi, const InductionDescriptor &II,		void fixupIVUsers(PHINode *OrigPhi, const InductionDescriptor &II,
Value CountRoundDown, Value EndValue,		Value CountRoundDown, Value EndValue,
BasicBlock *MiddleBlock);		BasicBlock *MiddleBlock);

/// Create a new induction variable inside L.		/// Create a new induction variable inside L.
PHINode createInductionVariable(Loop L, Value Start, Value End,		PHINode createInductionVariable(Loop L, Value Start, Value End,
Value Step, Instruction DL);		Value Step, Instruction DL);
▲ Show 20 Lines • Show All 219 Lines • ▼ Show 20 Lines	protected:
LoopVectorizationCostModel *Cost;		LoopVectorizationCostModel *Cost;

// Record whether runtime checks are added.		// Record whether runtime checks are added.
bool AddedSafetyChecks = false;		bool AddedSafetyChecks = false;

// Holds the end values for each induction variable. We save the end values		// Holds the end values for each induction variable. We save the end values
// so we can later fix-up the external users of the induction variables.		// so we can later fix-up the external users of the induction variables.
DenseMap<PHINode , Value > IVEndValues;		DenseMap<PHINode , Value > IVEndValues;

		/// Holds the return value of Legal->isConsecutivePtr() for each pointer in
		/// the loop, computed before the loop is modified.
		DenseMap<Value *, int> IsConsecutivePtr;
};		};

class InnerLoopUnroller : public InnerLoopVectorizer {		class InnerLoopUnroller : public InnerLoopVectorizer {
public:		public:
InnerLoopUnroller(Loop *OrigLoop, PredicatedScalarEvolution &PSE,		InnerLoopUnroller(Loop *OrigLoop, PredicatedScalarEvolution &PSE,
LoopInfo LI, DominatorTree DT,		LoopInfo LI, DominatorTree DT,
const TargetLibraryInfo *TLI,		const TargetLibraryInfo *TLI,
const TargetTransformInfo TTI, AssumptionCache AC,		const TargetTransformInfo TTI, AssumptionCache AC,
▲ Show 20 Lines • Show All 2,332 Lines • ▼ Show 20 Lines	void InnerLoopVectorizer::vectorizeMemoryInstruction(Instruction *Instr) {
// target abi alignment in such a case.		// target abi alignment in such a case.
const DataLayout &DL = Instr->getModule()->getDataLayout();		const DataLayout &DL = Instr->getModule()->getDataLayout();
if (!Alignment)		if (!Alignment)
Alignment = DL.getABITypeAlignment(ScalarDataTy);		Alignment = DL.getABITypeAlignment(ScalarDataTy);
unsigned AddressSpace = getMemInstAddressSpace(Instr);		unsigned AddressSpace = getMemInstAddressSpace(Instr);

// Determine if the pointer operand of the access is either consecutive or		// Determine if the pointer operand of the access is either consecutive or
// reverse consecutive.		// reverse consecutive.
int ConsecutiveStride = Legal->isConsecutivePtr(Ptr);		assert(IsConsecutivePtr.count(Ptr) &&
		"Missing IsConsecutivePtr information for this pointer.");
		int ConsecutiveStride = IsConsecutivePtr[Ptr];
bool Reverse = ConsecutiveStride < 0;		bool Reverse = ConsecutiveStride < 0;
bool CreateGatherScatter =		bool CreateGatherScatter =
(Decision == LoopVectorizationCostModel::CM_GatherScatter);		(Decision == LoopVectorizationCostModel::CM_GatherScatter);

// Either Ptr feeds a vector load/store, or a vector GEP should feed a vector		// Either Ptr feeds a vector load/store, or a vector GEP should feed a vector
// gather/scatter. Otherwise Decision should have been to Scalarize.		// gather/scatter. Otherwise Decision should have been to Scalarize.
assert((ConsecutiveStride \|\| CreateGatherScatter) &&		assert((ConsecutiveStride \|\| CreateGatherScatter) &&
"The instruction should be scalarized");		"The instruction should be scalarized");
▲ Show 20 Lines • Show All 399 Lines • ▼ Show 20 Lines	BasicBlock *InnerLoopVectorizer::createVectorizedLoopSkeleton() {
*/		*/

BasicBlock *OldBasicBlock = OrigLoop->getHeader();		BasicBlock *OldBasicBlock = OrigLoop->getHeader();
BasicBlock *VectorPH = OrigLoop->getLoopPreheader();		BasicBlock *VectorPH = OrigLoop->getLoopPreheader();
BasicBlock *ExitBlock = OrigLoop->getExitBlock();		BasicBlock *ExitBlock = OrigLoop->getExitBlock();
assert(VectorPH && "Invalid loop structure");		assert(VectorPH && "Invalid loop structure");
assert(ExitBlock && "Must have an exit block");		assert(ExitBlock && "Must have an exit block");

		// Collect Legal->isConsecutivePtr() information for all the pointers in the
		// loop before it's modified.
		collectIsConsecutivePtr();

// Some loops have a single integer induction variable, while other loops		// Some loops have a single integer induction variable, while other loops
// don't. One example is c++ iterators that often have multiple pointer		// don't. One example is c++ iterators that often have multiple pointer
// induction variables. In the code below we also support a case where we		// induction variables. In the code below we also support a case where we
// don't have a single induction variable.		// don't have a single induction variable.
//		//
// We try to obtain an induction variable from the original loop as hard		// We try to obtain an induction variable from the original loop as hard
// as possible. However if we don't find one that:		// as possible. However if we don't find one that:
// - is an integer		// - is an integer
▲ Show 20 Lines • Show All 130 Lines • ▼ Show 20 Lines	if (MDNode *LID = OrigLoop->getLoopID())
Lp->setLoopID(LID);		Lp->setLoopID(LID);

LoopVectorizeHints Hints(Lp, true, *ORE);		LoopVectorizeHints Hints(Lp, true, *ORE);
Hints.setAlreadyVectorized();		Hints.setAlreadyVectorized();

return LoopVectorPreHeader;		return LoopVectorPreHeader;
}		}

		void InnerLoopVectorizer::collectIsConsecutivePtr() {
		assert(IsConsecutivePtr.empty() &&
		"IsConsecutivePtr information has been already collected.");

		for (auto *BB : OrigLoop->blocks())
		for (auto &I : *BB) {
		// If there's no pointer operand or it was visited before, there's
		// nothing to do.
		auto *Ptr = dyn_cast_or_null<Instruction>(getPointerOperand(&I));
		if (!Ptr \|\| IsConsecutivePtr.count(Ptr))
		continue;

		IsConsecutivePtr[Ptr] = Legal->isConsecutivePtr(Ptr);
		}
		}

// Fix up external users of the induction variable. At this point, we are		// Fix up external users of the induction variable. At this point, we are
// in LCSSA form, with all external PHIs that use the IV having one input value,		// in LCSSA form, with all external PHIs that use the IV having one input value,
// coming from the remainder loop. We need those PHIs to also have a correct		// coming from the remainder loop. We need those PHIs to also have a correct
// value for the IV when arriving directly from the middle block.		// value for the IV when arriving directly from the middle block.
void InnerLoopVectorizer::fixupIVUsers(PHINode *OrigPhi,		void InnerLoopVectorizer::fixupIVUsers(PHINode *OrigPhi,
const InductionDescriptor &II,		const InductionDescriptor &II,
Value CountRoundDown, Value EndValue,		Value CountRoundDown, Value EndValue,
BasicBlock *MiddleBlock) {		BasicBlock *MiddleBlock) {
▲ Show 20 Lines • Show All 5,140 Lines • Show Last 20 Lines

test/Transforms/LoopVectorize/consecutive-ptr-cg-bug.ll

				; RUN: opt -loop-vectorize -S < %s \| FileCheck %s

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128-ni:1"
				target triple = "x86_64-unknown-linux-gnu"

				; During LV code generation, after introducing middle and scalar.ph basic
				; blocks, we expected Legal->isConsecutivePtr() to be consistent and return the
				; same output as during legal/cost model phases. Unfortunately, there were some
				; corner cases where that didn't happen due to a limitation in SE/SLEV. This
				; test verifies that LV is able to handle those corner cases.

				; PR34965

				; Verify that store is vectorized as stride-1 memory access.

				; CHECK: vector.body:
				; CHECK: store <4 x i32>

				; Function Attrs: uwtable
				define void @test() {
				br label %.outer

				; <label>:1: ; preds = %2
				ret void

				; <label>:2: ; preds = %._crit_edge.loopexit
				%3 = add nsw i32 %.ph, -2
				br i1 undef, label %1, label %.outer

				.outer: ; preds = %2, %0
				%.ph = phi i32 [ %3, %2 ], [ 336, %0 ]
				%.ph2 = phi i32 [ 62, %2 ], [ 110, %0 ]
				%4 = and i32 %.ph, 30
				%5 = add i32 %.ph2, 1
				br label %6

				; <label>:6: ; preds = %6, %.outer
				%7 = phi i32 [ %5, %.outer ], [ %13, %6 ]
				%8 = phi i32 [ %.ph2, %.outer ], [ %7, %6 ]
				%9 = add i32 %8, 2
				%10 = zext i32 %9 to i64
				%11 = getelementptr inbounds i32, i32 addrspace(1)* undef, i64 %10
				%12 = ashr i32 undef, %4
				store i32 %12, i32 addrspace(1)* %11, align 4
				%13 = add i32 %7, 1
				%14 = icmp sgt i32 %13, 61
				br i1 %14, label %._crit_edge.loopexit, label %6

				._crit_edge.loopexit: ; preds = %._crit_edge.loopexit, %6
				br i1 undef, label %2, label %._crit_edge.loopexit
				}

This is an archive of the discontinued LLVM Phabricator instance.

[LV] [ScalarEvolution] Fix PR34965 - Cache pointer stride information before LV code genAbandonedPublic

Details

Symptom

Investigation

Root Cause

Tentative Fix

Alternative/Complementary Solutions

Diff Detail

Event Timeline

Revision Contents

Diff 121535

lib/Transforms/Vectorize/LoopVectorize.cpp

test/Transforms/LoopVectorize/consecutive-ptr-cg-bug.ll

[LV] [ScalarEvolution] Fix PR34965 - Cache pointer stride information before LV code gen
AbandonedPublic