This is an archive of the discontinued LLVM Phabricator instance.

[SLP]Fix PR57322: vectorize constant float stores.
ClosedPublic

Authored by ABataev on Aug 26 2022, 10:45 AM.

Details

Summary

Stores for constant floats must be vectorized, improve analysis in SLP
vectorizer for stores.

Diff Detail

Event Timeline

ABataev created this revision.Aug 26 2022, 10:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 26 2022, 10:45 AM
ABataev requested review of this revision.Aug 26 2022, 10:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 26 2022, 10:45 AM
vdmitrie added inline comments.Aug 26 2022, 6:02 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5819

I'm not sure I understand why we may need this extra argument.
Is it for restraining purpose?
We can tell whether we dealing with FP or integer from type of the operand we are evaluating.
Besides, type of operation across VL have to be homogeneous so it makes sense to multi-version the method for better clarity.
And you don't need to enumerate VL if you process first element early.

Here is how the code could look:

const auto *I0 = cast<Instruction>(VL[0]);
Type *OpTy = I0->getOperand(OpIdx)->getType();
if (OpTy->isFloatingPointTy()) {
  for (Value *V : VL) {
    const auto *Inst = cast<Instruction>(V);
    assert(Inst->getOpcode() == I0->getOpcode() && "Expected same opcode");
    if (!isa<ConstantFP>(Inst->getOperand(OpIdx)))
      return {TTI::OK_AnyValue, TTI::OP_None};
  }
  return {TTI::OK_NonUniformConstantValue, TTI::OP_None};
}

// If all operands are exactly the same ConstantInt then set the
// operand kind to OK_UniformConstantValue.
// If instead not all operands are constants, then set the operand kind
// to OK_AnyValue. If all operands are constants but not the same,
// then set the operand kind to OK_NonUniformConstantValue.
ConstantInt *CInt0 = dyn_cast<ConstantInt>(I0->getOperand(OpIdx));
if (!CInt0)
  return {TTI::OK_AnyValue, TTI::OP_None};

TTI::OperandValueKind VK = TTI::OK_UniformConstantValue;
TTI::OperandValueProperties VP = TTI::OP_PowerOf2;
for (Value *V : VL.drop_front()) {
  const auto *Inst = cast<Instruction>(V);
  assert(Inst->getOpcode() == I0->getOpcode() && "Expected same opcode");
  auto *CInt = dyn_cast<ConstantInt>(Inst->getOperand(OpIdx));
  if (!CInt)
    return {TTI::OK_AnyValue, TTI::OP_None};
  if (VP == TTI::OP_PowerOf2 && !CInt->getValue().isPowerOf2())
    VP = TTI::OP_None;
  if (CInt0 != CInt)
    VK = TTI::OK_NonUniformConstantValue;
}
return {VK, VP};
vdmitrie added inline comments.Aug 26 2022, 6:10 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5819

Minor correction for the above (VP Initialization):

TTI::OperandValueProperties VP =
    CInt0->getValue().isPowerOf2() ? TTI::OP_PowerOf2 : TTI::OP_None;
ABataev added inline comments.Aug 26 2022, 6:27 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5819

Just we need to perform this abalysis only for stores but not for fadd, fneg, etc. I can add a check fir the kind of the instruction instead but I thought that it would be better to provide a flag explicitly and pass it explicitly for the required instructions.

vdmitrie added inline comments.Aug 26 2022, 6:39 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5819

only for stores but not for fadd, fneg

Is that because you are targeting stores only in this patch? How does it affect other operations?
It is not yet clear for me why you would like to avoid that analysis for them.

ABataev added inline comments.Aug 26 2022, 7:06 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5819

Yes, I just don't have enough time/resources for other ops analysis.

vdmitrie added inline comments.Aug 26 2022, 7:24 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5819

That's okay for incremental step. Then please don't declare the default value for the argument and
please add a TODO comment at the call site where it is 'false' saying that impact of enabling the analysis there is yet to be determined. Also renaming the argument to enableFP or likewise would make restraining purpose more explicit.

ABataev updated this revision to Diff 456322.Aug 29 2022, 5:44 AM

Address comments

vdmitrie accepted this revision.Aug 29 2022, 9:21 AM

Looks acceptable.

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5835

As I said earlier we should never have a situation of mixed types. So I do not quite understand why you are still sticking with packing the two flows into the single loop. This does not save anything but makes the code to be error prone. Just look: the condition at 5840 should probably be if (EnableFP && CFlt). You also do not use CFlt to access the object but only for null pointer check.
Technically yes, this code will work. But I appreciate if you address maintenance concern as well.

This revision is now accepted and ready to land.Aug 29 2022, 9:21 AM
ABataev added inline comments.Aug 29 2022, 9:42 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5835

Sure, just missed it.

This revision was automatically updated to reflect the committed changes.
reames added a subscriber: reames.Aug 29 2022, 11:08 AM

Adding this only for the store case seems undesirable both long term as it's clearly not the right long term goal and short term as it adds complexity for no good reason. I'm not actively requesting you don't land this, but I do strongly want to encourage you to either a) do the analysis for the non-stores and refresh this patch, or b) commit to doing a follow up in the near future.

Adding this only for the store case seems undesirable both long term as it's clearly not the right long term goal and short term as it adds complexity for no good reason. I'm not actively requesting you don't land this, but I do strongly want to encourage you to either a) do the analysis for the non-stores and refresh this patch, or b) commit to doing a follow up in the near future.

I can try to add OK_FPConstantValue value instead and change analysis using this value for now.

Adding this only for the store case seems undesirable both long term as it's clearly not the right long term goal and short term as it adds complexity for no good reason. I'm not actively requesting you don't land this, but I do strongly want to encourage you to either a) do the analysis for the non-stores and refresh this patch, or b) commit to doing a follow up in the near future.

I can try to add OK_FPConstantValue value instead and change analysis using this value for now.

Huh? Why do this?

I meant that we clearly want uniform constant handling to include float vectors in all cases long term. That's what I was asking you to follow up with.

Adding this only for the store case seems undesirable both long term as it's clearly not the right long term goal and short term as it adds complexity for no good reason. I'm not actively requesting you don't land this, but I do strongly want to encourage you to either a) do the analysis for the non-stores and refresh this patch, or b) commit to doing a follow up in the near future.

I can try to add OK_FPConstantValue value instead and change analysis using this value for now.

Huh? Why do this?

I meant that we clearly want uniform constant handling to include float vectors in all cases long term. That's what I was asking you to follow up with.

I see. I can send a followup patch with uniform floats analysis but I cannot test it thoroughly.