This is an archive of the discontinued LLVM Phabricator instance.

Relax constraints for reduction vectorization
ClosedPublic

Authored by sanjoy on Feb 4 2019, 5:35 PM.

Diff Detail

Event Timeline

sanjoy created this revision.Feb 4 2019, 5:35 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 4 2019, 5:35 PM
sanjoy added a subscriber: jmolloy.Feb 4 2019, 5:35 PM
sdesmalen added inline comments.Feb 20 2019, 8:19 AM
lib/Analysis/IVDescriptors.cpp
556

Is there a reason for the condition I->hasAllowContract()? As far as I can tell only hasAllowReassoc() seems required and thus hasAllowContract() would be overly restrictive.

As long as hasAllowReassoc() is true, I think the resulting reduction operations should have the same properties as the original instruction. For example, if an instruction has hasNoNaNs() == hasNoInfs() == false, the vectorised reduction retain those properties under reassocation.

When trying out this patch, the reduction block seems to assume fast instead of just reassoc, e.g.

middle.block:
%bin.rdx = fadd fast <4 x float> %9, %8

You'll probably want to fix that and extend the test to ensure the flags are retained.

sanjoy updated this revision to Diff 189097.Mar 3 2019, 2:00 PM

Propagate only correct FP fast-math flags.

I'd appreciate a careful review here since I'm not sure all of the ways we
codegen reductions in the loop vectorizer.

sanjoy marked an inline comment as done.
sanjoy updated this revision to Diff 189272.Mar 4 2019, 8:15 PM

Rebase on trunk.

sanjoy added a reviewer: Ayal.Mar 5 2019, 3:36 PM
sanjoy edited the summary of this revision. (Show Details)

Thanks for these changes to your patch @sanjoy! The patch is looking in good shape, just a few remaining nits mostly.

include/llvm/Analysis/IVDescriptors.h
244

nit: recurrenct -> recurrent

lib/Analysis/IVDescriptors.cpp
254

Is it worth adding a comment describing that FMF will be an intersection of the FastMathFlags from all the reduction operations (and thus needs to start with the full set of flags)?

302

nit: unnecessary curly braces

563

nit: since 'CanVectorizeReduction()' is only used once, it probably makes more sense to just expand it here and remove the function.

lib/CodeGen/ExpandReductions.cpp
124

Rather than choosing 'getFast() and going through 'IsOrdered' (which is set when the full 'fast' property is set), do we want to take the FastMathFlags from the IntrinsicInst directly?

124

Note that there is a test missing for this change.

lib/Transforms/Utils/LoopUtils.cpp
675

nit: unnecessary curly braces.

lib/Transforms/Vectorize/LoopVectorize.cpp
322

nit: unnecessary curly braces.

329

nit: unnecessary curly braces.

test/Transforms/LoopVectorize/reduction-fastmath.ll
48

Do you want to add a check for the 'fast' attribute on the reductions in the middle.block, similar to what you've done for reduction_sum_float_only_reassoc_and_contract ?

sanjoy updated this revision to Diff 189593.Mar 6 2019, 2:13 PM
sanjoy marked 9 inline comments as done.
  • Address review comments
lib/CodeGen/ExpandReductions.cpp
124

Rather than choosing 'getFast() and going through 'IsOrdered' (which is set when the full 'fast' property is set), do we want to take the FastMathFlags from the IntrinsicInst directly?

My rationale was that I wanted to keep this as obviously NFC as possible -- previously we would implicitly tag the instructions as fast, and this change just makes it explicit.

Note that there is a test missing for this change.

This change is not supposed to change any behavior (assuming no bugs :) ). Do you want me to add a test case to check existing behavior? CodeGen/Generic/expand-experimental-reductions.ll already tests for fast flags in the expansion.

sdesmalen accepted this revision.Mar 11 2019, 4:53 AM

Thanks for updating your patch @sanjoy ! LGTM (with a side-note on a follow up patch to take more specific flags in expandReductions).
You may want to update the commit message before you commit :)

lib/CodeGen/ExpandReductions.cpp
124

My rationale was that I wanted to keep this as obviously NFC as possible -- previously we would implicitly tag the instructions as fast, and this change just makes it explicit.

In a way that feels a bit artificial since the patch is already not NFC and the purpose of this patch is to be more specific about passing the exact flags. However, if you think it makes more sense to do this change in a separate patch, that's fine with me.

This revision is now accepted and ready to land.Mar 11 2019, 4:53 AM
sanjoy marked 2 inline comments as done.Mar 11 2019, 2:27 PM
sanjoy added inline comments.
lib/CodeGen/ExpandReductions.cpp
124

As I said above, we already have tests for this behavior (that we propagate fast-math flags into getShuffleReduction.

This revision was automatically updated to reflect the committed changes.