Page MenuHomePhabricator

[ARM} Add support for MVE vmaxv and vminv
ClosedPublic

Authored by samtebbs on Aug 19 2019, 5:26 AM.

Details

Summary

This patch adds vecreduce_smax, vecredude_umax, vecreduce_smin, vecreduce_umin and selection for vmaxv and minv.

Diff Detail

Repository
rL LLVM

Event Timeline

samtebbs created this revision.Aug 19 2019, 5:26 AM
dmgreen added inline comments.Aug 21 2019, 6:06 AM
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
1064 ↗(On Diff #215860)

Is there a VMINV patch too? Am I correct that this would enable both min and max reduction intrinsics?

llvm/test/CodeGen/Thumb2/mve-vmaxv.ll
14 ↗(On Diff #215860)

The old code for these was quite large, right? Much larger than needing to materialise the constant?

samtebbs marked 2 inline comments as done.Aug 21 2019, 7:23 AM
samtebbs added inline comments.
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
1064 ↗(On Diff #215860)

It would enable both. There isn't a separate patch for vminv so I could add it to this one if you'd like.

llvm/test/CodeGen/Thumb2/mve-vmaxv.ll
14 ↗(On Diff #215860)

Yep, in this case it was 78 instructions long! And the others varied from 15 to 61.

samtebbs marked 2 inline comments as done.Sep 2 2019, 1:34 AM
samparker added inline comments.Sep 2 2019, 5:46 AM
llvm/lib/Target/ARM/ARMInstrMVE.td
664 ↗(On Diff #215860)

Guess these need to be changed to MQPR.

samtebbs updated this revision to Diff 218350.Sep 2 2019, 6:13 AM
samtebbs marked an inline comment as done.
samparker added inline comments.Sep 3 2019, 2:54 AM
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
1064 ↗(On Diff #218350)

Is this actually needed for codegen?

samtebbs marked an inline comment as done.Sep 9 2019, 3:10 AM
samtebbs added inline comments.
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
1064 ↗(On Diff #218350)

It is, as otherwise the smax and umax vector reductions aren't generated and can't be selected on.

samparker added inline comments.Sep 9 2019, 3:13 AM
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
1064 ↗(On Diff #218350)

As in, the reductions get expanded somewhere and your tests would fail?

samtebbs marked an inline comment as done.Sep 9 2019, 3:53 AM
samtebbs added inline comments.
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
1064 ↗(On Diff #218350)

Indeed, the tests fail and we get markedly worse codegen.

65c65,70
< 	vmaxv.u32	r2, q0
---
> 	vmov.f32	s5, s3
> 	vmax.u32	q0, q0, q1
> 	vmov.32	r2, q0[1]
> 	vdup.32	q1, r2
> 	vmax.u32	q0, q0, q1
> 	vmov	r2, s0
samparker added inline comments.Sep 9 2019, 4:04 AM
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
1064 ↗(On Diff #218350)

Ok, cheers. Then I expect Dave is right that the vmin could now get generated and will then assert because it can't be selected...?

dmgreen added inline comments.Sep 9 2019, 4:13 AM
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
1064 ↗(On Diff #218350)

Yeah, I think there's two different places that these intrinsics can be expanded, once in a pre-isel pass if this returns false, or in ISEL if the instructions are expand.

So I think the selection would be OK, but it would be best to add vminv to make sure the vectoriser doesn't start generating them only for them the be messily expanded.

samtebbs updated this revision to Diff 219333.Sep 9 2019, 5:56 AM
samtebbs updated this revision to Diff 219338.Sep 9 2019, 6:10 AM
samtebbs updated this revision to Diff 219867.Sep 12 2019, 2:44 AM
samtebbs retitled this revision from [ARM} Add support for MVE vmaxv to [ARM} Add support for MVE vmaxv and vminv.
samtebbs edited the summary of this revision. (Show Details)

Added vminv support

dmgreen accepted this revision.Sep 12 2019, 8:29 AM

Thanks! Looks good to me.

This revision is now accepted and ready to land.Sep 12 2019, 8:29 AM
This revision was automatically updated to reflect the committed changes.