A stride check in ARMTargetTransformInfo rules out tail predication for a loop if any stride in it differs from 1. To enable tail predication for loops containing gathers/scatters, this patch takes a more detailed approach when the EnableMaskedGatherScatters flag is true, and also adds more detailed debug messages.
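As a rough illustration of the existing rule described above, here is a minimal standalone sketch. It does not use any LLVM types: the function name `allStridesAreOne` and the convention that 0 stands for "no constant stride" are assumptions made for this example only.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model: one entry per load/store in the loop body, holding its
// element stride per iteration. 0 is used here to mean "no constant stride".
// Returns true only if every access is consecutive (stride of exactly 1).
bool allStridesAreOne(const std::vector<int64_t> &Strides) {
  for (int64_t Stride : Strides)
    if (Stride != 1)
      return false; // any non-unit stride blocks tail predication
  return true;
}
```

Under this model a single strided or unknown access is enough to reject the whole loop, which is why loops that need gathers/scatters could not be tail predicated before the patch.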
- Repository
- rG LLVM Github Monorepo
Event Timeline
| llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp | ||
|---|---|---|
| 23 | else ifs? | |
| 28 | I think this would be easier to read if it were organised in stride order, with the gather/scatter cases separated from the consecutive accesses. When !EnableMaskedGatherScatters, getPtrStride should only ever be 1, right? So I don't think we have to track the 'NextStride' business. | |
| llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp | ||
|---|---|---|
| 23 | A load or a store can be a vld2 or a vst2, neither of which can be tail folded, unfortunately. | |
Widened the range of allowed strides to also include loop invariant expressions.
| llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp | ||
|---|---|---|
| 23 | Good point. | |
| 23 | Right, I forgot about the vst2s. In that case we can never allow a stride of 2 here, as the only instructions that get us here are loads and stores. | |
| 28 | Good point. We may as well get rid of that. | |
| llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp | ||
|---|---|---|
| 17–1 | Can you change this condition to something like `if (NextStride == -1 || (NextStride == 2 && MVEMaxSupportedInterleaveFactor >= 2) || (NextStride == 4 && MVEMaxSupportedInterleaveFactor >= 4))`? That should hopefully make it more futureproof, and specifically rule out reverse loads even if the vectorizer changes to support them. | |
| 26–27 | Perhaps `if (auto AR = dyn_cast<SCEVAddRecExpr>(PtrScev)) {` | |
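To make the outcome of the comments above easier to follow, here is a standalone sketch of how the stride classification might look once the suggestions are combined. It is a model, not the committed code: `AccessInfo`, `StrideVerdict`, and `classifyStride` are hypothetical names, the two boolean fields stand in for the `SCEVAddRecExpr` and loop-invariance queries, 0 is assumed to mean "no constant stride", and the ordering of the branches is an assumption based on the thread.

```cpp
#include <cstdint>

// Hypothetical summary of one load or store in the loop.
struct AccessInfo {
  int64_t ConstStride;      // constant element stride; 0 means none was found
  bool IsAddRec;            // pointer is an add-recurrence in this loop
  bool StepIsLoopInvariant; // the recurrence step does not change in the loop
};

enum class StrideVerdict { Ok, ReverseOrInterleaved, BadStride };

StrideVerdict classifyStride(const AccessInfo &A, unsigned MaxInterleaveFactor,
                             bool EnableMaskedGatherScatters) {
  const int64_t NextStride = A.ConstStride;

  // Consecutive accesses can always be tail predicated.
  if (NextStride == 1)
    return StrideVerdict::Ok;

  // The condition suggested in review: reverse accesses and strides that
  // would become vld2/vst2 or vld4/vst4 cannot be tail predicated.
  if (NextStride == -1 || (NextStride == 2 && MaxInterleaveFactor >= 2) ||
      (NextStride == 4 && MaxInterleaveFactor >= 4))
    return StrideVerdict::ReverseOrInterleaved;

  // With gathers/scatters enabled, a loop-invariant step is acceptable.
  if (EnableMaskedGatherScatters && A.IsAddRec && A.StepIsLoopInvariant)
    return StrideVerdict::Ok;

  return StrideVerdict::BadStride;
}
```

In this model a loop would be tail predicated only if every access classifies as `Ok`; keeping the two rejection reasons separate mirrors the more detailed debug messages the patch adds.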
Thanks. Looks good to me.
| llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp | ||
|---|---|---|
| 18 | Do you need to include this? | |
| llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp | ||
|---|---|---|
| 18 | Oops, no. Will remove for commit. | |