This is an archive of the discontinued LLVM Phabricator instance.

[LV] Use safe-divisor lowering for fixed vectors if profitable
ClosedPublic

Authored by reames on Aug 24 2022, 12:25 PM.

Details

Summary

This extends the safe-divisor widening scheme recently added for scalable vectors to handle fixed vectors as well.

Two notes:

  • Due to how early this is used during costing, I couldn't figure out how to intersect this with the minimum bitwidth handling. So, in theory, we could chose to widen when a narrower scalarization would have been profitable.
  • This is mostly completion-ism from my side. I think it makes sense, but I'm not strongly motivated to push this currently. If there are any serious cost model concerns raised, I'm likely to abandon the patch.

Diff Detail

Event Timeline

reames created this revision.Aug 24 2022, 12:25 PM
reames requested review of this revision.Aug 24 2022, 12:25 PM

I like the idea of this patch - I imagine it's a bit similar to the setCostBasedWideningDecision function where we make a widening decision before doing the actual full loop costing. I guess if you had concerns about enabling this by default for all targets you could limit this to a RISCV TTI hook initially?

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
4519

This is just a minor point, but I wonder if ScalarizationCost is a better name, to distinguish from an actual scalar cost (i.e. VF=1)?

4551

In order to avoid creating the vector type twice here and below, perhaps create a local variable, i.e.

Type VecTy = ToVectorTy(I->getType(), VF);

?

7154

It seems a bit unfortunate that we're calling getDivRemSpeculationCost twice here - once in this function and once in isScalarWithPredication. Would it be better to pull the switch statement out of isScalarWithPredication into a small, common function, i.e.:

bool isDivRemScalarWithPredication(Instruction *I, ElementCount VF) {
  switch (ForceSafeDivisor) {
  case cl::BOU_UNSET: {
    const auto [ScalarCost, SafeDivisorCost] = getDivRemSpeculationCost(I, VF);
    return ScalarCost< SafeDivisorCost;
  }
  case cl::BOU_TRUE:
    return false;
  case cl::BOU_FALSE:
    return true;
  };
}

Then isScalarWithPredication and this function can just call isDivRemScalarWithPredication. It also restricts the cost calculations to when ForceSafeDivisor is cl::BOU_UNSET.

I guess if you had concerns about enabling this by default for all targets you could limit this to a RISCV TTI hook initially?

I have no concerns with simply enabling this. We have no evidence of problems, and we can revert if actual performance issues are revealed.

If you, or another reviewer, has concerns I'll simply drop the patch. As I said up front, I'm not strongly motivated to push this.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
4519

Will change.

4551

Was in the code being moved already, but sure, will change.

7154

You're misreading the code. We need to return a cost regardless of the value of the cl::opt. The only thing that changes is which one we return,

david-arm added inline comments.Sep 1 2022, 1:06 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
7154

My apologies, yes you're right. However, my previous comment still stands - we're calculating the costs twice, which feels inefficient. In that case, it's still possible to refactor this to get the costs just once. For example,

case Instruction::UDiv:
case Instruction::SDiv:
case Instruction::URem:
case Instruction::SRem:
  const auto [ScalarCost, SafeDivisorCost] = getDivRemSpeculationCost(I, VF);
  return isDivRemScalarWithPredication(ScalarCost, SafeDivisorCost) ? ScalarCost : SafeDivisorCost;

where

bool isDivRemScalarWithPredication(InstructionCost ScalarCost, InstructionCost  SafeDivisorCost) {
  switch (ForceSafeDivisor) {
  case cl::BOU_UNSET:
    return ScalarCost< SafeDivisorCost;
  case cl::BOU_TRUE:
    return false;
  case cl::BOU_FALSE:
    return true;
  };
}

and do something similar for the div/rem case in isScalarWithPredication.

reames updated this revision to Diff 458207.Sep 6 2022, 10:24 AM

Address review style comments.

david-arm accepted this revision.Sep 8 2022, 5:24 AM

LGTM! Thanks for making the changes @reames.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1442

nit: I realise I asked you to add this function, but could you add a brief comment above this function before landing the patch?

This revision is now accepted and ready to land.Sep 8 2022, 5:24 AM
fhahn accepted this revision.Sep 8 2022, 6:31 AM

I have no concerns with simply enabling this. We have no evidence of problems, and we can revert if actual performance issues are revealed.

Enabling this by default should be fine, AFAICT.

LGTM, thanks! I think it would be good to add -force-widen-divrem-via-safe-divisor=0 to induction.ll and pr44488-predication.ll to make sure they test the same as before.

llvm/test/Transforms/LoopVectorize/induction.ll
2036

I think the division here is used intentionally to force uses of an IV in predicated blocks and with this change we are now not checking for that scenario any longer. This should probably use -force-widen-divrem-via-safe-divisor=0 to preserve the original behavior.

llvm/test/Transforms/LoopVectorize/pr44488-predication.ll
11

It looks like this test specially tests for a special case with branches and it should probably use -force-widen-divrem-via-safe-divisor=0 .

This revision was landed with ongoing or failed builds.Sep 8 2022, 9:16 AM
This revision was automatically updated to reflect the committed changes.