This is an archive of the discontinued LLVM Phabricator instance.

[LSV] Refactoring + supporting bitcasts to a type of different size
ClosedPublic

Authored by rtereshin on Jul 14 2018, 10:08 AM.

Details

Summary

This is mostly a preparation work for adding a limited support for select instructions.
It proved to be difficult to do due to size and irregularity of Vectorizer::isConsecutiveAccess,
this is fixed here I believe.

It also turned out that these changes make it simpler to finish one of the TODOs and fix a number of other small issues, namely:

  1. Looking through bitcasts to a type of a different size (requires careful tracking of the original load/store size and some math converting sizes in bytes to expected differences in indices of GEPs).
  2. Reusing partial analysis of pointers done by first attempt in proving them consecutive instead of starting from scratch. This added limited support for nested GEPs co-existing with difficult sext/zext instructions. This also required a careful handling of negative differences between constant parts of offsets.
  3. Handing a case where the first pointer index is not an add, but something else (a function parameter for instance).

I observe an increased number of successful vectorizations on a large set of shader programs. Only few shaders are affected, but those that are affected sport >5% less loads and stores than before the patch.

The selects patch is coming soon.

This is related to but independent of https://reviews.llvm.org/D48853 ("[SCEV] Add zext(C + x + ...) -> D + zext(C-D + x + ...)<nuw> transform"), also improving LoadStoreVectorizer.

Diff Detail

Repository
rL LLVM

Event Timeline

rtereshin created this revision.Jul 14 2018, 10:08 AM
rampitec added inline comments.Jul 16 2018, 10:32 AM
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
394–395

I do not think this can happen. There is a convention to have constant operand always at the last position in a commutative operation. Do you see it in a real world example or just in the artificially crafted testcase?

rtereshin added inline comments.Jul 16 2018, 10:54 AM
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
394–395

In the artificial one only. I will remove the case from the patch and the test.

rtereshin updated this revision to Diff 155762.Jul 16 2018, 2:18 PM
rtereshin edited the summary of this revision. (Show Details)

Removing the "4. Handling a case where the sext/zext's operand is not add %x, C, but add C, %x, where C is a constant." case handling code and updating the test correspondingly.

This revision is now accepted and ready to land.Jul 16 2018, 2:20 PM
arsenm added inline comments.Jul 16 2018, 2:24 PM
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
339

Needs better name

test/Transforms/LoadStoreVectorizer/AMDGPU/gep-bitcast.ll
62

Why was the new dropped here?

rtereshin added inline comments.Jul 16 2018, 2:55 PM
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
339

Any suggestions?

test/Transforms/LoadStoreVectorizer/AMDGPU/gep-bitcast.ll
62

We don't need to know that add i32 %base, C0 doesn't wrap if we know that add n[su]w i32 %base, C1 doesn't wrap and 0 <= C0 <= C1. Vectorizer::isConsecutiveAccess is able to notice and exploit this fact in some cases (w/ and w/o this patch). I don't want to regress on it. Perhaps it's better if we also change the offsets in this test from 0, 4, 8, 12 to something non-zero based, like 4, 8, 12, 16.

rtereshin updated this revision to Diff 155787.Jul 16 2018, 4:37 PM
  1. Renaming tryHarder to lookThroughComplexAddresses
  2. Adding nuw flag back to the first pointer in the pre-existing test

I've noticed that one of tests I've added partly tests what I wanted to test anyway so this is fine.