This is an archive of the discontinued LLVM Phabricator instance.

[LV] Teach vectorizer about variant value store into uniform address
ClosedPublic

Authored by anna on Sep 28 2018, 8:38 AM.

Details

Summary

Teach vectorizer about vectorizing variant value stores to uniform
address. Similar to rL343028, we do not allow vectorization if we have
multiple stores to the same uniform address.

Cost model already has the change for considering the extract
instruction cost for a variant value store. See added test cases for how
vectorization is done.
The patch also contains changes to the ORE messages.

Diff Detail

Repository
rL LLVM

Event Timeline

anna created this revision.Sep 28 2018, 8:38 AM
Ayal added inline comments.Oct 4 2018, 3:42 PM
include/llvm/Analysis/LoopAccessAnalysis.h
623 ↗(On Diff #167484)

there [are] multiple

lib/Analysis/LoopAccessAnalysis.cpp
2307 ↗(On Diff #167484)

was >> were

test/Analysis/LoopAccessAnalysis/memcheck-wrapping-pointers.ll
42 ↗(On Diff #167484)

was >> were (+ multiple occurrences below)

test/Transforms/LoopVectorize/X86/illegal-parallel-loop-uniform-write.ll
7 ↗(On Diff #167484)

CHECK how LV actually vectorizes the invariant store, rather than if it emits a vector type?

Would be good to include a test having two conditional invariant stores.

anna marked 3 inline comments as done.Oct 5 2018, 11:58 AM
anna added inline comments.
test/Transforms/LoopVectorize/X86/illegal-parallel-loop-uniform-write.ll
7 ↗(On Diff #167484)

CHECK how LV actually vectorizes the invariant store, rather than if it emits a vector type?

LV vectorizing the invariant store in this case is undefined behaviour because the user incorrectly tagged the loop as parallel. Pls see the PR15794 mentioned below.

Do we still need to see how LV vectorizes the invariant store because the code generated is anyway incorrect in the presence of output dependency. See test case below which is the same test case without the incorrect mem_parallel.loop metadata and we don't vectorize it.

Would be good to include a test having two conditional invariant stores.

Will do in the invariant-store-vectorization case below.

anna updated this revision to Diff 168501.Oct 5 2018, 12:02 PM

addressed review comments. Added one test for multiple uniform stores should not be vectorized.

Ayal accepted this revision.Oct 5 2018, 6:30 PM

Thanks, added only minor optional suggestions.

lib/Transforms/Vectorize/LoopVectorize.cpp
1181 ↗(On Diff #168501)

Check the cost estimates printed in debugging?

test/Transforms/LoopVectorize/X86/illegal-parallel-loop-uniform-write.ll
7 ↗(On Diff #167484)

CHECK how LV actually vectorizes the invariant store, rather than if it emits a vector type?

LV vectorizing the invariant store in this case is undefined behaviour because the user incorrectly tagged the loop as parallel. Pls see the PR15794 mentioned below.

Ahh, right; so indeed no point in checking how LV vectorizes the invariant store.

Would be good to include a test having two conditional invariant stores.

Will do in the invariant-store-vectorization case below.

Very good, thanks. It may also be good to test that LV does not vectorize two conditional stores (of variant values) to the same invariant address, in addition to @inv_val_store_to_inv_address_conditional_diff_values which checks for two stores of invariant values to the same invariant address.

This revision is now accepted and ready to land.Oct 5 2018, 6:30 PM
This revision was automatically updated to reflect the committed changes.
anna added inline comments.Oct 16 2018, 8:57 AM
test/Transforms/LoopVectorize/X86/illegal-parallel-loop-uniform-write.ll
7 ↗(On Diff #167484)

Very good, thanks. It may also be good to test that LV does not vectorize two conditional stores (of variant values) to the same invariant address, in addition to @inv_val_store_to_inv_address_conditional_diff_values which checks for two stores of invariant values to the same invariant address.

Two unconditional stores of variant values is in multiple_uniform_stores test. I'll add the multiple *conditional* stores of variant values case.