This is an archive of the discontinued LLVM Phabricator instance.

LoadStoreVectorizer: Remove TargetBaseAlign. Keep alignment for stack adjustments.
ClosedPublic

Authored by asbirlea on Aug 2 2016, 9:57 AM.

Details

Summary

TargetBaseAlign is no longer required since LSV checks if target allows misaligned accesses.
A constant defining a base alignment is still needed for stack accesses where alignment can be adjusted.

Previous patch (D22936) was reverted because tests were failing. This patch also fixes the cause of those failures:

  • x86 failing tests either did not have the right target, or the right alignment.
  • NVPTX failing tests did not have the right alignment.
  • AMDGPU failing test (merge-stores) should allow vectorization with the given alignment but the target info considers <3xi32> a non-standard type and gives up early. This patch removes the condition and only checks for a maximum size allowed and relies on the next condition checking for %4 for correctness. This should be revisited to include 3xi32 as a MVT type (on arsenm's non-immediate todo list).

Note that checking the sizeInBits for a MVT is undefined (leads to an assertion failure),
so we need to create an EVT, hence the interface change in allowsMisaligned to include the Context.

Diff Detail

Repository
rL LLVM

Event Timeline

asbirlea updated this revision to Diff 66494.Aug 2 2016, 9:57 AM
asbirlea retitled this revision from to LoadStoreVectorizer: Remove TargetBaseAlign. Keep alignment for stack adjustments..
asbirlea updated this object.
asbirlea added reviewers: arsenm, jlebar.
asbirlea added a subscriber: llvm-commits.
jlebar added inline comments.Aug 2 2016, 10:28 AM
lib/Target/AMDGPU/SIISelLowering.cpp
451 ↗(On Diff #66494)

We don't generally include commented-out code like this. It makes things hard to read.

In this case it's particularly hard for me to read because there are two TODOs and it's not clear how they relate to one another.

Can we have just one comment that explains where we are today and where we want to be in the future?

451 ↗(On Diff #66494)

LLVM style is to leave off braces on single-line "if" statements. Unfortunately clang-tidy doesn't do this for you (such a waste of human time...).

479 ↗(On Diff #66494)

I don't think you meant to include this change?

lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
1038 ↗(On Diff #66494)

Please run clang-format.

1038 ↗(On Diff #66494)

We don't usually use {} for DEBUGs that just contain a single log statement. clang-format should be able to do something pretty here without the {}, I think.

test/Transforms/LoadStoreVectorizer/X86/preserve-order32.ll
21 ↗(On Diff #66494)

Why the change on this line? Maybe you don't need any align for this load?

test/Transforms/LoadStoreVectorizer/X86/preserve-order64.ll
25 ↗(On Diff #66494)

I thought x86 doesn't care about the alignment? Or, is it a question of the "fast" param?

asbirlea updated this revision to Diff 66512.Aug 2 2016, 11:46 AM
asbirlea edited edge metadata.

Address comments.

asbirlea marked 5 inline comments as done.Aug 2 2016, 11:49 AM
asbirlea added inline comments.
test/Transforms/LoadStoreVectorizer/X86/preserve-order32.ll
21 ↗(On Diff #66494)

This is not necessary. I can remove it completely for these 2 lines with the same outcome, but the one below for address 0 is required. I haven't looked in detail into what are the x86 requirements here.
I only changes it to 4 because it looked right considering we're using i32.

test/Transforms/LoadStoreVectorizer/X86/preserve-order64.ll
25 ↗(On Diff #66494)

Related to the comment in the 32bit version of this, this is required for x86, haven't investigated in detail why. The other alignment directives above (buff.p and buff.val) are not needed.

jlebar edited edge metadata.Aug 2 2016, 1:26 PM

lgtm, but I can't speak to the correctness of the amdgpu changes, so I'd like arsenm to sign off on them.

lib/Target/AMDGPU/SIISelLowering.cpp
443 ↗(On Diff #66512)

if the size

test/Transforms/LoadStoreVectorizer/X86/preserve-order32.ll
21 ↗(On Diff #66494)

I only changes it to 4 because it looked right considering we're using i32.

Eh, it's x86, it doesn't really care. I'd either (a) change as little as possible, or (b) make the tests as simple as possible (removing unnecessary alignments, although e.g. if you align arr[0] to 8 bytes, saying that arr[1] is aligned to 4b doesn't seem so unreasonable).

Not a big deal, though.

test/Transforms/LoadStoreVectorizer/X86/preserve-order64.ll
25 ↗(On Diff #66494)

It sounds like x86 is requiring us to align the whole vector. It's probably the "fast" thing. That's fine.

jlebar accepted this revision.Aug 2 2016, 1:27 PM
jlebar edited edge metadata.
This revision is now accepted and ready to land.Aug 2 2016, 1:27 PM
asbirlea updated this revision to Diff 66556.Aug 2 2016, 2:19 PM
asbirlea edited edge metadata.

Address comments. Simplify tests.

asbirlea marked an inline comment as done.Aug 2 2016, 2:20 PM
asbirlea added inline comments.
test/Transforms/LoadStoreVectorizer/X86/preserve-order32.ll
21 ↗(On Diff #66556)

I removed the alignments for arr[1] both for this and the 64bit variant of the test.

asbirlea added a comment.EditedAug 2 2016, 2:21 PM

I'll wait for arsenm input on the AMD lowering changes.

arsenm added inline comments.Aug 3 2016, 2:07 PM
lib/Target/AMDGPU/SIISelLowering.cpp
444 ↗(On Diff #66556)

Space before. Shouldn't need parentheses around the first condition. Should probably check VT.getStoreSize() > 16 (although I'm not sure that will actually matter anywhere)

asbirlea updated this revision to Diff 66742.Aug 3 2016, 5:50 PM

Address comment.

This revision was automatically updated to reflect the committed changes.