This is an archive of the discontinued LLVM Phabricator instance.

[ARM][MVE] Allow loops containing strides != 1 to be tail predicated with gather/scatters enabled
ClosedPublic

Authored by anwel on Aug 6 2020, 1:27 AM.

Details

Summary

There is a stride check in ARMTargetTransformInfo that decides whether a loop cannot be tail predicated by checking whether the strides in it are different from 1. To enable tail predication for loops containing gather/scatters, this patch takes a more detailed approach if the EnableMaskedGatherScatters flag is true, and also adds some more detailed debug messages.

Diff Detail

Event Timeline

anwel created this revision.Aug 6 2020, 1:27 AM
anwel requested review of this revision.Aug 6 2020, 1:27 AM
samparker added inline comments.Aug 6 2020, 1:58 AM
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
1634

else ifs?

1639

I think this would be easier to read if this was organised in stride order and separating the gather/scatter from the consecutive accesses. So when !EnableMaskedGatherScatter, getPtrStride should only ever be 1, right? So I don't think we have to track the 'NextStride' business.

dmgreen added inline comments.Aug 6 2020, 2:09 AM
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
1634

A load or a store can be vld2 or a vst2, neither of which can be tail folded unfortunately.

anwel updated this revision to Diff 285960.Aug 17 2020, 4:21 AM
anwel marked 3 inline comments as done.

Widened the range of allowed strides to also include loop invariant expressions.

llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
1634

Good point.

1634

Right, I forgot about the vstr2's. In that case we can never allow a stride of 2 here, as the only instructions that get us here are loads and stores.

1639

Good point. We may as well get rid of that.

dmgreen added inline comments.Aug 17 2020, 6:55 AM
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
1634–1639

Can you change this condition to something like

if (NextStride == -1 || (NextStride == 2 && MVEMaxSupportedInterleaveFactor >= 2) || (NextStride == 4 && MVEMaxSupportedInterleaveFactor >= 4))

That should hopefully make it more futureproof, and specifically rule out reverse loads even if the vectorizer changes to support them.

1649–1650

Perhaps

if (auto AR = dyn_cast<SCEVAddRecExpr>(PtrScev)) {
anwel updated this revision to Diff 286754.Aug 20 2020, 2:45 AM
anwel marked 2 inline comments as done.

I made the NextStride == 2 condition more future proof as requested by Dave's comment. This required to make the MVEMaxSupportedInterleaveFactor from ARMISelLowering non-static.

llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
1634–1639

Of course I can, that sounds like a good idea.

1649–1650

Sure

dmgreen accepted this revision.Aug 20 2020, 6:57 AM

Thanks. Looks good to me.

llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
31

Do you need to include this?

This revision is now accepted and ready to land.Aug 20 2020, 6:57 AM
anwel marked an inline comment as done.Aug 24 2020, 3:59 AM
anwel added inline comments.
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
31

Oops, no. Will remove for commit.

This revision was automatically updated to reflect the committed changes.
anwel marked an inline comment as done.