This is an archive of the discontinued LLVM Phabricator instance.

Differential D52327

[Loop Vectorizer] Abandon vectorization when no integer IV found
ClosedPublic

Authored by wristow on Sep 20 2018, 3:32 PM.

Details

Reviewers

delena
mkuper
hsaito

Commits

rG4f27730eaf60: [Loop Vectorizer] Abandon vectorization when no integer IV found
rL342786: [Loop Vectorizer] Abandon vectorization when no integer IV found

Summary

Support for vectorizing loops with secondary floating-point induction variables was added in r276554. A primary integer IV is still required for vectorization. If an FP IV was found but no integer IV was found at all (primary or secondary), the vectorization attempt still went ahead, causing a compiler crash.

This fixes PR38800.
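
The bail-out described above can be modeled outside of LLVM. The following is a minimal sketch, not the actual LoopVectorizationLegality code: `InductionType`, `Verdict`, and `checkInductions` are hypothetical names introduced only for illustration, and the real pass also tracks a primary induction that this model ignores.

```cpp
#include <vector>

// Hypothetical stand-in for the type of an induction variable; not an LLVM class.
struct InductionType {
  bool IsInteger;
  unsigned Bits;
};

// Hypothetical result codes mirroring the two remarks in the patch.
enum class Verdict { Ok, NoInduction, NoIntegerInduction };

// Decide whether the set of detected inductions blocks vectorization.
Verdict checkInductions(const std::vector<InductionType> &Inductions) {
  // Widest integer induction seen so far; 0 means "none found".
  // FP inductions never contribute to the widest induction type.
  unsigned WidestIntBits = 0;
  for (const InductionType &Ty : Inductions)
    if (Ty.IsInteger && Ty.Bits > WidestIntBits)
      WidestIntBits = Ty.Bits;

  if (Inductions.empty())
    return Verdict::NoInduction; // no induction variable at all
  if (WidestIntBits == 0)
    return Verdict::NoIntegerInduction; // only FP inductions: the new bail-out
  return Verdict::Ok;
}
```

In this model a loop whose only induction is a `float` yields `Verdict::NoIntegerInduction`, which corresponds to the new "integer loop induction variable could not be identified" remark.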

Diff Detail

Repository: rL LLVM

Event Timeline

wristow created this revision. Sep 20 2018, 3:32 PM

Sorry, I totally forgot about my old patch https://reviews.llvm.org/D47216.

I like your two different message version better than changing Line 788 condition to if (!WidestIndTy). That's good.
Please add non-NULL assertion after "Type *IdxTy = Legal->getWidestInductionType();" in InnerLoopVectorizer::getOrCreateTripCount().
Please include LIT test from D47216.

Thanks,
Hideki

Why do we need an integer induction variable? If one doesn't exist, it should be straightforward to create one.

In D52327#1242411, @efriedma wrote:

Why do we need an integer induction variable? If one doesn't exist, it should be straightforward to create one.

Is there a practical need (i.e., beyond academic interest) to vectorize such code? Examples?

In D52327#1242351, @hsaito wrote:

Sorry, I totally forgot about my old patch https://reviews.llvm.org/D47216.

I like your two different message version better than changing Line 788 condition to if (!WidestIndTy). That's good.

Please add non-NULL assertion after "Type *IdxTy = Legal->getWidestInductionType();" in InnerLoopVectorizer::getOrCreateTripCount().

Please include LIT test from D47216.

Thanks Hideki! I'll add the non-NULL assertion for IdxTy. Regarding the LIT test from D47216, isn't that essentially the same as the LIT test I have here?

In D52327#1242411, @efriedma wrote:

Why do we need an integer induction variable? If one doesn't exist, it should be straightforward to create one.

Makes sense to me, but I'm not experienced in working on the vectorizer, so I'm hesitant to jump into that. And I'd view it as a separate patch to add that capability. As Hideki said in D47216:

Whether it's legal to convert FP primary induction to INT primary induction and if so under what conditions are debatable, but bailing
out when it's not proven safe (and currently never proven to be safe as far as LV's existing code is concerned) is a valid thing to do.

I don't know how difficult it is to extend it this way, but given that there are explicit expectations that the primary induction is of an integer type, such as:

assert((IV->getType()->isIntegerTy() || IV != OldInduction) &&
       "Primary induction variable must have an integer type");

it doesn't feel like it's a small tweak. (But again, I'm not experienced in this area.)
In short, tackling that seems like a fine idea, but a separate task.

In D52327#1242414, @hsaito wrote:

In D52327#1242411, @efriedma wrote:

Why do we need an integer induction variable? If one doesn't exist, it should be straightforward to create one.

Is there a practical need (i.e., beyond academic interest) to vectorize such code? Examples?

I think that's a separate question, i.e. improving the vectorizer without depending on real world benchmarks is a good thing (especially if it's not a fundamental change). So, it looks like the LV is currently limited in vectorizing FP loops where we don't have an integer IV. It's worthwhile to see whether adding the integer IV and related cost model changes allows us to vectorize such loops on some targets. Maybe the cost model will prove that even though vectorizing is possible, it may not be beneficial. Disclaimer: This is speculation, I have not done any analysis here.

That stated, I believe fixing the PR and the crash is a good change (and orthogonal to the question of adding integer IV).

If there's some actual implementation work involved in supporting it, fine, I guess... but it seems like it can't be much work to support properly: even just adding an unused i1 induction variable is enough to get around the crash for the given testcase.

In D52327#1242561, @efriedma wrote:

If there's some actual implementation work involved in supporting it, fine, I guess... but it seems like it can't be much work to support properly: even just adding an unused i1 induction variable is enough to get around the crash for the given testcase.

Interesting, although given that the loop runs for more than one iteration, would that i1 variable work correctly? The real question is: can we compute the trip count for these loops? I'd guess that we need to do that too.

As the discussion on the possibility of creating an integer IV goes on, I'm updating the patch to include the non-NULL assertion for IdxTy.

Interesting, although given that the loop runs for more than one iteration, would that i1 variable work correctly?

The i1 variable doesn't actually get used for anything; the vectorizer makes a new induction variable to track the iteration count.

can we compute the trip count for these loops?

SCEV can compute the trip count for this testcase, yes. See ScalarEvolution::computeExitCountExhaustively.
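
To illustrate the idea of computing a trip count by brute force, here is a minimal sketch that simply executes the loop header with a bounded iteration budget. It is not SCEV's implementation; `countIterationsBySimulation` and its parameters are hypothetical names used only for this example.

```cpp
#include <cassert>

// Hypothetical helper: counts how many times the body of
//   for (f = Start; f < Bound; f += Step)
// runs by executing the loop, giving up after MaxIterations.
// Returns -1 if the exit is not reached within the budget.
long countIterationsBySimulation(float Start, float Bound, float Step,
                                 long MaxIterations) {
  assert(MaxIterations >= 0 && "iteration budget must be non-negative");
  long Count = 0;
  for (float F = Start; F < Bound; F += Step) {
    if (Count == MaxIterations)
      return -1; // could not compute within the budget
    ++Count;
  }
  return Count;
}
```

For the PR38800 loop this would be called as `countIterationsBySimulation(0.1f, 1.0f, 0.01f, 1000)`. A zero step never reaches the bound, so the helper reports -1 instead of looping forever, which is the kind of case where a bail-out is still needed.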

In D52327#1242567, @efriedma wrote:

Interesting, although given that the loop runs for more than one iteration, would that i1 variable work correctly?

The i1 variable doesn't actually get used for anything; the vectorizer makes a new induction variable to track the iteration count.

Ah, okay. Interesting. I certainly see your point - if it is never used for anything, then we don't need to require it.

can we compute the trip count for these loops?

SCEV can compute the trip count for this testcase, yes. See ScalarEvolution::computeExitCountExhaustively.

Sure. That will do it.

In D52327#1242578, @hfinkel wrote:

In D52327#1242567, @efriedma wrote:

can we compute the trip count for these loops?

SCEV can compute the trip count for this testcase, yes. See ScalarEvolution::computeExitCountExhaustively.

Sure. That will do it.

Still doesn't move a needle on my priority list. We might be able to do this for fast-math mode, but that means loop trip count may be different depending on the FP optimizations performed. Not a very good program to begin with. Vectorizer has a long list of TODOs for optimizing practically-useful well-written code.

In any case, I think the proposed fix (or some other mechanism) is still needed since I don't think SCEV can compute trip count in all cases of FP primary inductions.
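
The concern about the trip count depending on FP optimizations can be sketched as follows. The two hypothetical functions below count iterations of the same source loop, once with the IV updated by repeated addition and once with the IV recomputed as `Start + I * Step`, a rewrite that changes rounding and is therefore only permitted under relaxed FP rules. Whether the two counts actually differ depends on the inputs and the target; no particular result is claimed here.

```cpp
// Trip count when the IV is updated by repeated addition, as written in
// the source. Assumes Step > 0 and large enough that F keeps increasing.
int tripCountRepeatedAdd(float Start, float Bound, float Step) {
  int Count = 0;
  for (float F = Start; F < Bound; F += Step)
    ++Count;
  return Count;
}

// Trip count when the value for iteration Count is recomputed as
// Start + Count * Step. Same assumptions on Step as above.
int tripCountClosedForm(float Start, float Bound, float Step) {
  int Count = 0;
  while (Start + static_cast<float>(Count) * Step < Bound)
    ++Count;
  return Count;
}
```

Because each form rounds differently, the last comparison against the bound can go either way, so a transformation that switches between them may change how many times the loop body runs.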

In D52327#1242439, @wristow wrote:

Regarding the LIT test from D47216, isn't that essentially the same as the LIT test I have here?

I say yes and no. One induction versus two inductions. Given that there were none before, it's nice to have more than one, especially so if another one is handily available. I don't insist, though.

LGTM. We can continue discussing about not bailing out for the subset of cases, but we don't have to let the compiler crash while we do that.

This revision is now accepted and ready to land. Sep 21 2018, 3:25 PM

In D52327#1242611, @hsaito wrote:

In D52327#1242439, @wristow wrote:

Regarding the LIT test from D47216, isn't that essentially the same as the LIT test I have here?

I say yes and no. One induction versus two inductions. Given that there were none before, it's nice to have more than one, especially so if another one is handily available. I don't insist, though.

Makes sense. I'll add that second test.

In D52327#1242612, @hsaito wrote:

LGTM. We can continue discussing about not bailing out for the subset of cases, but we don't have to let the compiler crash while we do that.

Thanks! I'll commit after adding that test.

Added test from D47216.

Closed by commit rL342786: [Loop Vectorizer] Abandon vectorization when no integer IV found (authored by wristow). Sep 21 2018, 4:07 PM

This revision was automatically updated to reflect the committed changes.

wristow mentioned this in rL342786: [Loop Vectorizer] Abandon vectorization when no integer IV found.

Revision Contents

Path	Size
llvm/trunk/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp	4 lines
llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp	1 line
llvm/trunk/test/Transforms/LoopVectorize/pr37515.ll	20 lines
llvm/trunk/test/Transforms/LoopVectorize/pr38800.ll	34 lines

Diff 166573

llvm/trunk/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp

@@ bool LoopVectorizationLegality::canVectorizeInstrs() {
   }

   if (!PrimaryInduction) {
     LLVM_DEBUG(dbgs() << "LV: Did not find one integer induction var.\n");
     if (Inductions.empty()) {
       ORE->emit(createMissedAnalysis("NoInductionVariable")
                 << "loop induction variable could not be identified");
       return false;
+    } else if (!WidestIndTy) {
+      ORE->emit(createMissedAnalysis("NoIntegerInductionVariable")
+                << "integer loop induction variable could not be identified");
+      return false;
     }
   }

   // Now we know the widest induction type, check if our found induction
   // is the same size. If it's not, unset it here and InnerLoopVectorizer
   // will create another.
   if (PrimaryInduction && WidestIndTy != PrimaryInduction->getType())
     PrimaryInduction = nullptr;

llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp

@@ Value *InnerLoopVectorizer::getOrCreateTripCount(Loop *L) {
   IRBuilder<> Builder(L->getLoopPreheader()->getTerminator());
   // Find the loop boundaries.
   ScalarEvolution *SE = PSE.getSE();
   const SCEV *BackedgeTakenCount = PSE.getBackedgeTakenCount();
   assert(BackedgeTakenCount != SE->getCouldNotCompute() &&
          "Invalid loop count");

   Type *IdxTy = Legal->getWidestInductionType();
+  assert(IdxTy && "No type for induction");

   // The exit count might have the type of i64 while the phi is i32. This can
   // happen if we have an induction variable that is sign extended before the
   // compare. The only way that we get a backedge taken count is that the
   // induction variable was signed and as such will not overflow. In such a case
   // truncation is legal.
   if (BackedgeTakenCount->getType()->getPrimitiveSizeInBits() >
       IdxTy->getPrimitiveSizeInBits())

llvm/trunk/test/Transforms/LoopVectorize/pr37515.ll

; RUN: opt -passes='loop-vectorize' -S -pass-remarks-missed=loop-vectorize < %s 2>&1 | FileCheck %s
;
; FP primary induction is not supported in LV. Make sure Legal bails out.
;
; CHECK: loop not vectorized

define void @PR37515() {
entry:
  br label %loop

loop:
  %p = phi float [ 19.0, %entry ], [ %a, %loop ]
  %a = fadd fast float %p, -1.0
  %m = fmul fast float %a, %a
  %c = fcmp fast ugt float %a, 2.0
  br i1 %c, label %loop, label %exit

exit:
  unreachable
}

llvm/trunk/test/Transforms/LoopVectorize/pr38800.ll

Property	Old Value	New Value
svn:executable	null	* \ No newline at end of property

; RUN: opt -loop-vectorize -force-vector-width=2 -pass-remarks-missed='loop-vectorize' -S < %s 2>&1 | FileCheck %s

; CHECK: remark: <unknown>:0:0: loop not vectorized: integer loop induction variable could not be identified

; Test-case ('-O2 -ffast-math') from PR38800.
; (Set '-force-vector-width=2' to enable vector code generation.)
;
; No integral induction variable in the source-code caused a compiler-crash
; when attempting to vectorize. With the fix, a remark indicating why it
; wasn't vectorized is produced
;
;void foo(float *ptr, float val) {
;  float f;
;  for (f = 0.1f; f < 1.0f; f += 0.01f)
;    *ptr += val;
;}

define void @foo(float* nocapture %ptr, float %val) local_unnamed_addr {
entry:
  %ptr.promoted = load float, float* %ptr, align 4
  br label %for.body

for.body:                                         ; preds = %entry, %for.body
  %add5 = phi float [ %ptr.promoted, %entry ], [ %add, %for.body ]
  %f.04 = phi float [ 0x3FB99999A0000000, %entry ], [ %add1, %for.body ]
  %add = fadd fast float %add5, %val
  %add1 = fadd fast float %f.04, 0x3F847AE140000000
  %cmp = fcmp fast olt float %add1, 1.000000e+00
  br i1 %cmp, label %for.body, label %for.end

for.end:                                          ; preds = %for.body
  store float %add, float* %ptr, align 4
  ret void
}