Extend TTI to access TLI.allowsMisalignedMemoryAccesses(). Check condition when vectorizing load and store chains.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
I'm not clear if the address space argument is the right one.
Looking into where to get the isFast info from.
Added Fast parameter.
Updated check on allowsMisaligned.
Updated test that now vectorizes.
include/llvm/Analysis/TargetTransformInfo.h | ||
---|---|---|
391 ↗ | (On Diff #62984) | Maybe a better comment would be Determine whether we can read BitWidth bits from memory with the given alignment (in bytes). That is, this isn't strictly for determining whether *misaligned* memory accesses are allowed -- if Alignment == BitWidth, the call is still allowed (right?). |
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
703 ↗ | (On Diff #62984) | Might as well set this to false, to guard against low-quality TTI overrides. |
704 ↗ | (On Diff #62984) | No == true. |
705 ↗ | (On Diff #62984) | I'm not sure why this is the right condition. "If the store is going to be misaligned, don't vectorize it." This is now, essentially, "if the store is going to be misaligned and misaligned stores of this witdh and alignment are not fast, don't vectorize it," right? So shouldn't the condition be `(Alignment % SzInBytes) != 0 && (!allowsMisaligned(...) || !Fast)` ? In fact, maybe it makes sense to centralize this logic, rename allowsMisaligned, and just do `memoryAccessIsAllowedAndFast(SzInBytes, AS, Alignment)` |
test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores.ll | ||
505 ↗ | (On Diff #62984) | I am surprised we can vectorize this...are you sure it's not a bug? |
include/llvm/Analysis/TargetTransformInfo.h | ||
---|---|---|
391 ↗ | (On Diff #63130) | (This suggestion may be wrong -- the callee may assume that the memory access is misaligned, even if Alignment would indicate it's not. I dunno.) |
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
---|---|---|
700 ↗ | (On Diff #63130) | Updated this with new condition. Now there are other tests that vectorize, the ones with "natural" alignment that I had updated previously. I'm looking into whether this is correct behavior. |
test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores.ll | ||
505 ↗ | (On Diff #63130) | This is no longer vectorized with the above changes. I will update it with the other tests. |
test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores.ll | ||
---|---|---|
505 ↗ | (On Diff #63130) | I recently enabled misaligned access depending on the triple, so this should probably be vectorized? |
test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores.ll | ||
---|---|---|
505 ↗ | (On Diff #63130) | Never mind, I misread which test this is |
test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores.ll | ||
---|---|---|
505 ↗ | (On Diff #63130) | I believe due to what you enabled, the tests ending in "_natural_align" are now vectorizing. They don't vectorize without the triple argument. |
test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores.ll | ||
---|---|---|
505 ↗ | (On Diff #63130) | I don't see the changes in the diff, but that should be fine |
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
---|---|---|
830 ↗ | (On Diff #63146) | ! instead of == false |
include/llvm/Analysis/TargetTransformInfo.h | ||
---|---|---|
391 ↗ | (On Diff #63146) | The definition should be the same as the one in TLI. It reads: /// \brief Determine if the target supports unaligned memory accesses. /// /// This function returns true if the target allows unaligned memory accesses /// of the specified type in the given address space. If true, it also returns /// whether the unaligned memory access is "fast" in the last argument by /// reference. This is used, for example, in situations where an array /// copy/move/set is converted to a sequence of store operations. Its use /// helps to ensure that such replacements don't generate code that causes an /// alignment error (trap) on the target machine. The difference in the API here is not having a type, but a size in bits. The brief description still applies. Perhaps use just that? |
include/llvm/Analysis/TargetTransformInfo.h | ||
---|---|---|
391 ↗ | (On Diff #63155) | sgtm |
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
702 ↗ | (On Diff #63155) | I still think it might make sense to move the (Alignment % SzInBytes) != 0 && (Alignment % TargetBaseAlign) != 0 checks into the helper function (currently called allowsMisalignedAndIsFast, would need a new name). But, up to you. |
include/llvm/Analysis/TargetTransformInfo.h | ||
---|---|---|
391 ↗ | (On Diff #63155) | done. |
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
702 ↗ | (On Diff #63155) | I debated a bit whether to do this. Ideally I would have liked a single point of usage (and future removal) for TargetBaseAlign. Still, the condition is cleaner and I can remove a nesting level. |