This is an archive of the discontinued LLVM Phabricator instance.

[LV] Prevent interleaving if computeMaxVF returned None.
ClosedPublic

Authored by fhahn on Feb 6 2019, 12:29 PM.

Details

Diff Detail

Repository
rL LLVM

Event Timeline

fhahn created this revision.Feb 6 2019, 12:29 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 6 2019, 12:29 PM
hsaito added a comment.Feb 6 2019, 2:00 PM

I have a little mental barrier in accepting this change as is. I think this feeling of mine is mainly due to the name and the "stated" functionality of computeMaxVF and "indirect inference" towards using it for suppressing interleaving. Maybe I'm just being too picky here. If so, my apologies ahead of time.

Other than just the function name and the associated comment:

  1. runtime ptr check && hasBranchDivergence interleave w/ VF=1 still okay?
  2. For interleaving, the remaining checks other than TC==0/1 (for better diagnostics) and OptForSize appear useless. Even if TC is evenly divisible by VF or canFoldTailByMasking, we still can't interleave under OptForSize.

I think we should start thinking in terms of separate "is vectorization feasible" and "is interleaving feasible" (plus possibly "is unrolling feasible"), even if we evaluate those feasibility in one function.

Possibly going to tangent, but while we look at computeMaxVF, at the bottom, we say use pragma, but we bail out of plan() w/o checking whether UserVF exists or not.

Hopefully, this conveys enough about my feeling of uneasiness.

rengolin added inline comments.Feb 6 2019, 2:01 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6101 ↗(On Diff #185615)

Nit: To avoid redundancy in the name, I'd just call it something like "Disabled"...

7323 ↗(On Diff #185615)

This pattern feels like it could be an utility function of VectorizationFactor...

7330 ↗(On Diff #185615)

None of the parameters depend on IC and this call will override the value, why initialise with 1 above?

Ayal added inline comments.Feb 6 2019, 2:05 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6116 ↗(On Diff #185615)

Worth updating the comment, e.g., Cases that should not to be vectorized nor interleaved. And while touching the line, worth shortening to if (!MaybeMaxVF).

7325 ↗(On Diff #185615)

UserIC should always be set to Hints.getInterleave().

llvm/test/Transforms/LoopVectorize/no-interleave-up-front.ll
6 ↗(On Diff #185615)

Document where this test came from?

fhahn marked 2 inline comments as done.Feb 6 2019, 2:32 PM

Thanks for all the comments!

Other than just the function name and the associated comment:

  1. runtime ptr check && hasBranchDivergence interleave w/ VF=1 still okay?

Yep I think so.

  1. For interleaving, the remaining checks other than TC==0/1 (for better diagnostics) and OptForSize appear useless. Even if TC is evenly divisible by VF or canFoldTailByMasking, we still can't interleave under OptForSize.

I think we should start thinking in terms of separate "is vectorization feasible" and "is interleaving feasible" (plus possibly "is unrolling feasible"), even if we evaluate those feasibility in one function.

Agreed, having it done implicitly in a function called computeMaxVF does not seem ideal. From the current behavior of computeMaxVF, I think what we actually want to decide whether to disable interleaving up front is just if we optimize for size or not. What do you think?

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
7325 ↗(On Diff #185615)

Ah I thought you meant that we should ignore UserIC, in case computeMaxVF returned None, similar to the way we ignore UserVF in that case. Did I miss something?

7330 ↗(On Diff #185615)

In case MaybeVF is None, we want to avoid interleaving, that is why it is initialized with 1 and only computed when we have a VF.

This comment was removed by dcaballe.
hsaito added a comment.Feb 6 2019, 3:14 PM

Agreed, having it done implicitly in a function called computeMaxVF does not seem ideal. From the current behavior of computeMaxVF, I think what we actually want to decide whether to disable interleaving up front is just if we optimize for size or not. What do you think?

That was my conclusion as well.

With that understanding, your current direction is

If VPlan is not built for vectorization purposes, also bail out on interleaving. ---- Work on enabling interleaving in a subsequent patch.

and I'm fine with that approach.

Ayal added a comment.Feb 6 2019, 3:27 PM

I have a little mental barrier in accepting this change as is. I think this feeling of mine is mainly due to the name and the "stated" functionality of computeMaxVF and "indirect inference" towards using it for suppressing interleaving. Maybe I'm just being too picky here. If so, my apologies ahead of time.

Other than just the function name and the associated comment:

  1. runtime ptr check && hasBranchDivergence interleave w/ VF=1 still okay?
  2. For interleaving, the remaining checks other than TC==0/1 (for better diagnostics) and OptForSize appear useless. Even if TC is evenly divisible by VF or canFoldTailByMasking, we still can't interleave under OptForSize.

I think we should start thinking in terms of separate "is vectorization feasible" and "is interleaving feasible" (plus possibly "is unrolling feasible"), even if we evaluate those feasibility in one function.

Possibly going to tangent, but while we look at computeMaxVF, at the bottom, we say use pragma, but we bail out of plan() w/o checking whether UserVF exists or not.

Hopefully, this conveys enough about my feeling of uneasiness.

This patch tries only to avoid running into the "no-VPlan-was-built-for-interleaving" problem, w/o modifying any other behavior or performing any refactoring. The latter indeed deserve to be addressed, but in separate patches.

For 1), a same runtime ptr check is needed for both vectorization and interleaving, so seems it should prevent both if one wishes to avoid introducing such branches before a loop when hasBranchDivergence. But I can't find a review explaining r309890.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
7325 ↗(On Diff #185615)

UserIC should remain intact. Only IC must decide not to interleave if no (suitable) VPlan was built.

Ayal accepted this revision.Feb 6 2019, 3:42 PM

This LGTM with the minor documentation comments and retention of UserIC.
Seems like we've reached a consensus, correct me if not.
Thanks!

This revision is now accepted and ready to land.Feb 6 2019, 3:42 PM
hsaito added a comment.Feb 6 2019, 4:18 PM

This LGTM with the minor documentation comments and retention of UserIC.
Seems like we've reached a consensus, correct me if not.
Thanks!

I'm fine addressing my concerns in updated comments and later commits.

LGTM (as you could see in my previous message :))

fhahn marked an inline comment as done.Feb 7 2019, 4:38 AM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
7325 ↗(On Diff #185615)

The problem is we pick either UserIC or the computed IC a bit later on, so if we initialize UserIC unconditionally, we still hit exactly the problem in the test case. We could move the code to choose IC or UserIC into the conditional (together with the diagnostic messages)

Ayal added inline comments.Feb 7 2019, 6:37 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
7325 ↗(On Diff #185615)

Ah, right, my bad. How about setting InterleaveLoop below to false, with suitable debug/report message. That is actually the only change really needed: always calling CM.selectInterleaveCount() would in most cases return 1 immediately anyhow due to OptForSize (including the test case), and perform redundant computation only for the runtime ptr check && hasBranchDivergence case. But it is logical to avoid calling CM.selectInterleaveCount() below; or leave it to a separate refactoring of how VF&IC are set, possibly noting the potential waste in a TODO.

fhahn updated this revision to Diff 185779.Feb 7 2019, 8:10 AM

Move selection of IC into conditional.

fhahn marked an inline comment as done.Feb 7 2019, 8:12 AM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
7325 ↗(On Diff #185615)

Right, I moved the code to pick IC or UserIC and the diagnostics into the conditional. I also moved the VF == 1 diagnostic and the requirements check, as we might get different diagnostic ordering otherwise. What do you think?

Ayal added inline comments.Feb 7 2019, 8:59 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
7325 ↗(On Diff #185615)

Above thought was simply to add another

if (!MaybeVF && UserIC > 1) {
  // Tell the user interleaving was avoided up-front, despite being explicitly requested. 
  LLVM_DEBUG(dbgs() << "LV: ...\n");
  IntDiagMsg = std::make_pair(...);
  InterleaveLoop = false;
} else if // other cases for setting InterleaveLoop to false;

following the previous structure of setting

UserIC = Hints.getInterleave();
if (MaybeVF) {
  VF = *MaybeVF;
  IC = CM.selectInterleaveCount(OptForSize, VF.Width, VF.Cost);
}

Sounds right?

fhahn updated this revision to Diff 185798.Feb 7 2019, 9:46 AM
fhahn marked an inline comment as done.

Simplified as per Ayal's comment, thanks.

fhahn marked an inline comment as done.Feb 7 2019, 9:48 AM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
7325 ↗(On Diff #185615)

Right, let's go with the less invasive change first. I've updated the code. Note that it sets UserIC = 1 if it was ignored, so we do not display any other UserIC related messages later on.

Ayal added inline comments.Feb 7 2019, 11:20 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
7329 ↗(On Diff #185798)

Separate with an empty line.

7325 ↗(On Diff #185615)

Original suggestion was to continue setting VF and IC above, and add a 3rd cascaded-if case that sets InterleaveCount to false below, w/o resetting UserIC, thereby retaining the current structure amap. But if you prefer fusing them together to check MaybeVF only once that's also ok.

fhahn added a comment.Feb 7 2019, 11:42 AM

Thanks Ayal! I'll add the line before committing. Are you ok for the latest iteration to go in?

Ayal added a comment.Feb 7 2019, 12:23 PM

Thanks Ayal! I'll add the line before committing. Are you ok for the latest iteration to go in?

Yes, I'm ok. Would have preferred the separate if (MaybeVF) { VF=..; IC=..; } and if (!MaybeVF && UserIC >0) { InterleaveCount = false; ... } else if ... version, but leave it for you to decide.

fhahn added a comment.Feb 7 2019, 12:31 PM

Thanks Ayal! I'll add the line before committing. Are you ok for the latest iteration to go in?

Yes, I'm ok. Would have preferred the separate if (MaybeVF) { VF=..; IC=..; } and if (!MaybeVF && UserIC >0) { InterleaveCount = false; ... } else if ... version, but leave it for you to decide.

I'll update it before committing.

This revision was automatically updated to reflect the committed changes.
Ayal added inline comments.Feb 7 2019, 1:00 PM
llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp
7360

This UserIC = 1 is now redundant, right?

Thanks!

fhahn marked an inline comment as done.Feb 7 2019, 1:25 PM
fhahn added inline comments.
llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp
7360

Of course, just moved the code around.... Removed in rGf557a94aa32c.