This is an archive of the discontinued LLVM Phabricator instance.

[LV] Recognize intrinsic min/max reductions
ClosedPublic

Authored by dmgreen on Sep 11 2021, 11:20 AM.

Details

Summary

This extends the reduction logic in the vectorizer to handle intrinsic versions of min and max, both the floating point variants already created by instcombine under fastmath and the integer variants from D98152.

As a bonus this allows it to match a chain of min or max operations into a single reduction, similar to how add/mul/etc work.

Diff Detail

Event Timeline

dmgreen created this revision.Sep 11 2021, 11:20 AM
dmgreen requested review of this revision.Sep 11 2021, 11:20 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 11 2021, 11:20 AM
fhahn added a comment.Sep 13 2021, 2:05 AM

Looks reasonable. Could you recommit the tests?

Looks reasonable. Could you recommit the tests?

Sure, I can do that. I'm not sure it helps much, the tests just go from "CHECK-NOT <2 x i32> @llvm.smin" to "CHECK: <2 x i32> @llvm.smin" :)

dmgreen updated this revision to Diff 372229.Sep 13 2021, 6:19 AM

On second thoughts, added checks to "llvm.vector.reduce.smax.v2i32" too, to show the reductions. Also added some "fmin/max without fast" reduction test cases, to show they are not vectorized.

I didn't realize we were still missing vectorization of these intrinsics. What is the difference between the patterns here vs. the tests that are already vectorized in llvm/test/Transforms/LoopVectorize/intrinsic.ll ?

llvm/include/llvm/Analysis/IVDescriptors.h
135

Should this be called "isMinMaxPattern" now since it matches both intrinsics and select-cmp?

llvm/test/Transforms/LoopVectorize/minmax_reduction.ll
877–1046

Either this test or the previous one could be just "nnan nsz" rather than "fast" so we are testing the minimum FMF (assuming that works as expected).

I didn't realize we were still missing vectorization of these intrinsics. What is the difference between the patterns here vs. the tests that are already vectorized in llvm/test/Transforms/LoopVectorize/intrinsic.ll ?

These ones are reductions (i.e https://godbolt.org/z/G9affdzsh, creating a llvm.vector.reduce.smax after the loop) as opposed to just plain vectorization (https://godbolt.org/z/K9jvT9rsv, which I believe are still working just fine). The code to match those reduction is what it changing here, requiring to look through the min/max intrinsics as opposed to a pair of icmp/select (which is actually a lot simpler, as you might imagine. The matching of reductions isn't the cleanest code in the world and I'm hoping that once min/max become canonical we can simplify it somewhat. At least in the meantime we have to match both though.)

dmgreen updated this revision to Diff 372335.Sep 13 2021, 1:43 PM

Rename to IsMinMaxPattern and make a test more precise, as suggested.

spatel added inline comments.Sep 13 2021, 1:52 PM
llvm/test/Transforms/LoopVectorize/minmax_reduction.ll
878

Does this call (and also fmin above) have the expected FMF too?

dmgreen updated this revision to Diff 372416.Sep 14 2021, 12:23 AM

Added FMF checks for reduction intrinsic.

spatel accepted this revision.Sep 14 2021, 4:50 AM

LGTM

llvm/test/Transforms/LoopVectorize/minmax_reduction.ll
877–1046

Sorry if I wasn't clear, but it would be better to have one of these tests have minimal FMF and one have extra FMF (ie, the original 'fast' was good) because that shows we are propagating all FMF as expected even if they are not required for the transform. Alternatively, we could add a 2nd test for maxnum with 'fast'.

This revision is now accepted and ready to land.Sep 14 2021, 4:50 AM
This revision was landed with ongoing or failed builds.Sep 15 2021, 2:45 AM
This revision was automatically updated to reflect the committed changes.