This is an archive of the discontinued LLVM Phabricator instance.

Add TLI.allowsMisalignedMemoryAccesses to LoadStoreVectorizer
ClosedPublic

Authored by asbirlea on Jul 1 2016, 1:42 PM.

Details

Summary

Extend TTI to access TLI.allowsMisalignedMemoryAccesses(). Check condition when vectorizing load and store chains.

Diff Detail

Repository
rL LLVM

Event Timeline

asbirlea updated this revision to Diff 62532.Jul 1 2016, 1:42 PM
asbirlea retitled this revision from to Add TLI.allowsMisalignedMemoryAccesses to LoadStoreVectorizer.
asbirlea updated this object.
asbirlea added reviewers: llvm-commits, jlebar.
asbirlea added a subscriber: arsenm.
arsenm added inline comments.Jul 1 2016, 1:55 PM
include/llvm/Analysis/TargetTransformInfo.h
392 ↗(On Diff #62532)

This is missing a lot of parameters compared to the TLI version. It at least needs and address space and the alignment value

392 ↗(On Diff #62532)

IsFast would be helpful too

asbirlea updated this revision to Diff 62564.Jul 1 2016, 4:08 PM

Adding AddressSpace and Alignment.

I'm not clear if the address space argument is the right one.

Looking into where to get the isFast info from.

asbirlea updated this revision to Diff 62984.Jul 6 2016, 3:16 PM

Added Fast parameter.
Updated check on allowsMisaligned.
Updated test that now vectorizes.

jlebar added inline comments.Jul 6 2016, 3:49 PM
include/llvm/Analysis/TargetTransformInfo.h
391 ↗(On Diff #62984)

Maybe a better comment would be

Determine whether we can read BitWidth bits from memory with the given alignment (in bytes).

That is, this isn't strictly for determining whether *misaligned* memory accesses are allowed -- if Alignment == BitWidth, the call is still allowed (right?).

lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
703 ↗(On Diff #62984)

Might as well set this to false, to guard against low-quality TTI overrides.

704 ↗(On Diff #62984)

No == true.

705 ↗(On Diff #62984)

I'm not sure why this is the right condition. "If the store is going to be misaligned, don't vectorize it." This is now, essentially, "if the store is going to be misaligned and misaligned stores of this witdh and alignment are not fast, don't vectorize it," right? So shouldn't the condition be

`(Alignment % SzInBytes) != 0 && (!allowsMisaligned(...) || !Fast)`

?

In fact, maybe it makes sense to centralize this logic, rename allowsMisaligned, and just do

`memoryAccessIsAllowedAndFast(SzInBytes, AS, Alignment)`
test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores.ll
505 ↗(On Diff #62984)

I am surprised we can vectorize this...are you sure it's not a bug?

asbirlea updated this revision to Diff 63130.Jul 7 2016, 1:37 PM
asbirlea edited edge metadata.

Update after formatting.

jlebar added inline comments.Jul 7 2016, 1:41 PM
include/llvm/Analysis/TargetTransformInfo.h
391 ↗(On Diff #63130)

(This suggestion may be wrong -- the callee may assume that the memory access is misaligned, even if Alignment would indicate it's not. I dunno.)

asbirlea added inline comments.Jul 7 2016, 1:44 PM
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
700 ↗(On Diff #63130)

Updated this with new condition. Now there are other tests that vectorize, the ones with "natural" alignment that I had updated previously. I'm looking into whether this is correct behavior.

test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores.ll
505 ↗(On Diff #63130)

This is no longer vectorized with the above changes. I will update it with the other tests.

arsenm added inline comments.Jul 7 2016, 1:54 PM
test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores.ll
505 ↗(On Diff #63130)

I recently enabled misaligned access depending on the triple, so this should probably be vectorized?

arsenm added inline comments.Jul 7 2016, 1:55 PM
test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores.ll
505 ↗(On Diff #63130)

Never mind, I misread which test this is

asbirlea added inline comments.Jul 7 2016, 2:59 PM
test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores.ll
505 ↗(On Diff #63130)

I believe due to what you enabled, the tests ending in "_natural_align" are now vectorizing. They don't vectorize without the triple argument.
I'm still not clear why/if the vectorization is correct. Would you mind taking a look to confirm?
I can update the (4) tests in that case.

arsenm added inline comments.Jul 7 2016, 3:08 PM
test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores.ll
505 ↗(On Diff #63130)

I don't see the changes in the diff, but that should be fine

asbirlea updated this revision to Diff 63146.Jul 7 2016, 3:13 PM

Updating tests.

arsenm added inline comments.Jul 7 2016, 3:16 PM
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
830 ↗(On Diff #63146)

! instead of == false

asbirlea updated this revision to Diff 63148.Jul 7 2016, 3:27 PM

Replace ==false with !

asbirlea updated this revision to Diff 63155.Jul 7 2016, 4:12 PM

clang-format.

asbirlea added inline comments.Jul 8 2016, 10:16 AM
include/llvm/Analysis/TargetTransformInfo.h
391 ↗(On Diff #63146)

The definition should be the same as the one in TLI. It reads:

/// \brief Determine if the target supports unaligned memory accesses.
///
/// This function returns true if the target allows unaligned memory accesses
/// of the specified type in the given address space. If true, it also returns
/// whether the unaligned memory access is "fast" in the last argument by
/// reference. This is used, for example, in situations where an array
/// copy/move/set is converted to a sequence of store operations. Its use
/// helps to ensure that such replacements don't generate code that causes an
/// alignment error (trap) on the target machine.

The difference in the API here is not having a type, but a size in bits. The brief description still applies. Perhaps use just that?

jlebar accepted this revision.Jul 8 2016, 11:01 AM
jlebar edited edge metadata.
jlebar added inline comments.
include/llvm/Analysis/TargetTransformInfo.h
391 ↗(On Diff #63155)

sgtm

lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
702 ↗(On Diff #63155)

I still think it might make sense to move the (Alignment % SzInBytes) != 0 && (Alignment % TargetBaseAlign) != 0 checks into the helper function (currently called allowsMisalignedAndIsFast, would need a new name). But, up to you.

This revision is now accepted and ready to land.Jul 8 2016, 11:01 AM
asbirlea updated this revision to Diff 63307.Jul 8 2016, 1:33 PM
asbirlea edited edge metadata.

Address comments.

asbirlea added inline comments.Jul 8 2016, 1:36 PM
include/llvm/Analysis/TargetTransformInfo.h
391 ↗(On Diff #63155)

done.

lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
702 ↗(On Diff #63155)

I debated a bit whether to do this. Ideally I would have liked a single point of usage (and future removal) for TargetBaseAlign. Still, the condition is cleaner and I can remove a nesting level.

This revision was automatically updated to reflect the committed changes.