This patch adds vecreduce_smax, vecreduce_umax, vecreduce_smin and vecreduce_umin, plus selection for vmaxv and vminv.
llvm/lib/Target/ARM/ARMInstrMVE.td | ||
---|---|---|
673 | Guess these need to be changed to MQPR. |
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp | ||
---|---|---|
1064 | Is this actually needed for codegen? |
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp | ||
---|---|---|
1064 | It is, as otherwise the smax and umax vector reductions aren't generated and can't be selected on. |
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp | ||
---|---|---|
1064 | As in, the reductions get expanded somewhere and your tests would fail? |
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp | ||
---|---|---|
1064 | Indeed, the tests fail and we get markedly worse codegen. In the test output diff (`65c65,70`), the single `vmaxv.u32 r2, q0` becomes six instructions: `vmov.f32 s5, s3`; `vmax.u32 q0, q0, q1`; `vmov.32 r2, q0[1]`; `vdup.32 q1, r2`; `vmax.u32 q0, q0, q1`; `vmov r2, s0`. |
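To make the codegen difference above concrete, here is a minimal C++ sketch of the two ways a four-lane unsigned max reduction can be computed. It is a plain scalar model, not LLVM or MVE code: `Vec4`, `lanewise_umax`, `umax_reduce_expanded` and `umax_reduce_direct` are hypothetical names chosen for illustration, and the lane shuffles are an assumption about what the expanded sequence is doing.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 128-bit vector of four 32-bit lanes.
using Vec4 = std::array<std::uint32_t, 4>;

// Lane-wise unsigned max, modelling a single vmax.u32.
Vec4 lanewise_umax(const Vec4 &a, const Vec4 &b) {
    Vec4 r{};
    for (std::size_t i = 0; i < r.size(); ++i)
        r[i] = std::max(a[i], b[i]);
    return r;
}

// Expanded form: shuffle the high half down, take the lane-wise max,
// broadcast lane 1, take the lane-wise max again, then read lane 0.
std::uint32_t umax_reduce_expanded(const Vec4 &v) {
    Vec4 high = {v[2], v[3], v[2], v[3]};          // upper lanes are don't-care
    Vec4 step1 = lanewise_umax(v, high);           // lanes 0 and 1 hold partial maxima
    Vec4 dup = {step1[1], step1[1], step1[1], step1[1]};
    Vec4 step2 = lanewise_umax(step1, dup);        // lane 0 holds the full maximum
    return step2[0];
}

// Direct form: one across-vector reduction, which is what a single
// reduction instruction would provide.
std::uint32_t umax_reduce_direct(const Vec4 &v) {
    return *std::max_element(v.begin(), v.end());
}
```

Both functions return the same value for any input; the difference is only in how many steps it takes to get there.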
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp | ||
---|---|---|
1064 | Ok, cheers. Then I expect Dave is right that the vmin could now get generated and will then assert because it can't be selected...? |
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp | ||
---|---|---|
1064 | Yeah, I think there are two different places where these intrinsics can be expanded: in a pre-isel pass if this returns false, or in ISel if the operations are marked Expand. So I think the selection would be OK, but it would be best to add vminv to make sure the vectoriser doesn't start generating them only for them to be messily expanded. |
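The two expansion points described in the comment above can be summarised as a small decision function. This is only a sketch of the reasoning in this thread, not LLVM's actual API: `LegalizeAction`, `reduction_lowering` and the parameter names are hypothetical.

```cpp
#include <string>

// Hypothetical model of how the backend treats the reduction node.
enum class LegalizeAction { Legal, Expand };

// target_wants_intrinsic models the hook at line 1064 returning true.
std::string reduction_lowering(bool target_wants_intrinsic,
                               LegalizeAction action) {
    // First expansion point: a pass that runs before instruction selection.
    if (!target_wants_intrinsic)
        return "expanded before instruction selection";
    // Second expansion point: the node reaches ISel but is marked Expand.
    if (action == LegalizeAction::Expand)
        return "expanded during instruction selection";
    // Otherwise a pattern can match the reduction directly.
    return "selected as a single reduction instruction";
}
```

Under this model, a min reduction without a selection pattern would take the second path instead of asserting, which matches the point made above that selection would be OK but the resulting code would be poor.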