This is an archive of the discontinued LLVM Phabricator instance.

[LoopVectorize] Propagate fast-math flags for inloop reductions
ClosedPublic

Authored by RosieSumpter on Oct 26 2021, 8:15 AM.

Details

Summary

This patch updates VPReductionRecipe::execute so that the fast-math
flags associated with the underlying instruction of the VPReductionRecipe are
propagated through to the reductions which are created.

Depends on D112547

Diff Detail

Event Timeline

RosieSumpter created this revision.Oct 26 2021, 8:15 AM
RosieSumpter requested review of this revision.Oct 26 2021, 8:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 26 2021, 8:15 AM

Note: this patch builds on the NFC patch D112547

Thanks for this patch @RosieSumpter - seems like a nice fix! I wonder if it's also worth changing createOrderedReduction to be similar to createTargetReduction and set the flags in there too? Also, you've fixed up the VF=1 cases too.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
9752

I completely forgot we can actually just use a guard here, i.e.

IRBuilderBase::FastMathFlagGuard FMFGuard(State.Builder);
State.Builder.setFastMathFlags(RdxDesc->getFastMathFlags());

then you don't need the extra line at the bottom anymore to reset the flags.

9753

Maybe we should do this for all cases here, i.e. ordered and non-ordered?

9755

I just realised that you might be able to replace the whole if block with:

State.Builder.setFastMathFlags(RdxDesc->getFastMathFlags());

here.

9774

I just realised you're actually also fixing the case where VF=1. Is it worth having a test to ensure flags are propagated for VF=1 as well?

RosieSumpter retitled this revision from [LoopVectorize] Propagate fast-math flags for ordered reductions to [LoopVectorize] Propagate fast-math flags for inloop reductions.
  • Use IRBuilderBase::FastMathFlagGuard
  • Propagate fast-math flags for both ordered and non-ordered reductions
  • Updated tests
RosieSumpter marked 4 inline comments as done.Oct 28 2021, 7:43 AM
kmclaughlin accepted this revision.Oct 28 2021, 9:01 AM

Thank you @RosieSumpter, this patch looks good to me and I think all of the comments have been addressed.

llvm/test/Transforms/LoopVectorize/AArch64/strict-fadd.ll
854

Should fast here be replaced with nnan?

This revision is now accepted and ready to land.Oct 28 2021, 9:01 AM
RosieSumpter added inline comments.Oct 28 2021, 9:06 AM
llvm/test/Transforms/LoopVectorize/AArch64/strict-fadd.ll
854

Yes it should! I will change that before committing.

david-arm added inline comments.Nov 1 2021, 2:32 AM
llvm/test/Transforms/LoopVectorize/reduction-inloop-pred.ll
1020

Nice! Instcombine has folded the call and the fadd together into one!

Just adding an extra datapoint to do with as you wish.

llvm/test/Transforms/LoopVectorize/AArch64/strict-fadd.ll
854

Generally speaking it is better for CHECK-NOT lines to be as small/simple as possible so that you do not end up creating a passing test just because some minor detail has changed.

In this instance the CHECK-NOT does not care about the instruction's flags but rather wants to ensure there is never a call to llvm.vector.reduce.fadd. In this regard I think just having CHECK-UNORDERED-NOT: @llvm.vector.reduce.fadd would yield a more resilient test.

That said, given all the expected output is explicitly checked for I'm not sure what value these CHECK-NOT lines provide.

RosieSumpter added inline comments.Nov 1 2021, 9:43 AM
llvm/test/Transforms/LoopVectorize/AArch64/strict-fadd.ll
854

Thanks @paulwalker-arm, I will change the CHECK-NOT lines. Although I see what you mean about them not having value here, maybe it's best to remove them.