This is an archive of the discontinued LLVM Phabricator instance.

[SLP] Extend reordering data of tree entry to support PHI nodes
ClosedPublic

Authored by skc7 on Oct 26 2022, 4:59 AM.

Details

Summary

SLPVectorize pass uses the order of vectorizable phi nodes in the basic block to form a tree entry and decides the mask of the resulting shuffle vector.
If tree entry has insertelement instruction sequence, they will be also converted to shufflevector with mask based on insertelement instruction order in BB.

In the BB, if the result of vectorizable phi nodes is used by insertelement instructions, the resulting masks of shuffle vectors after vectorizing these two tree entries can mismatch. This happens if the order of phi nodes and the order of use of results of phi nodes by extractelement/insertelement mismatch.

There will be a scope for optimizing these resulting shuffle vectors in the subsequent passes in the pipeline, if their shuffle masks match.

This change reorders the phinodes in the tree entry, based on the use of the results of phi nodes by extractelement/insertelement.

Diff Detail

Event Timeline

skc7 created this revision.Oct 26 2022, 4:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 26 2022, 4:59 AM
skc7 requested review of this revision.Oct 26 2022, 4:59 AM
ABataev added inline comments.Oct 26 2022, 6:04 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
3904

&& TE.getOpcode() == Instruction::PHI

3906

Better to do:

if (V1->user_empty() || V2->user_empty())
  return false;
3909

I don't think the parent comparison should be here at all, I think you can drop it.

3912

getInsertIndex may return None, which is not derefereancable (if index is not constant, for example), need to check it to avoid crash.

3914–3916

I think you need to check for incoming values here rather than for the users.

3931

I think this is dangerous to do, TE may have its reordering order already, you ignore it here.

3934

What if different phis are inserted into the different vectors but using the same indices?

skc7 updated this revision to Diff 471943.Oct 31 2022, 3:07 AM

changes as per review comments.

skc7 added inline comments.Oct 31 2022, 3:20 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
3909

Removed it.

3912

Added this check to see the getInsertIndex() and getExtractIndex() are not NONE

3914–3916

In this patch I'm trying to reorder phis based on result use order(extractelement/insertelement order). Is there any check needed for incoming values of phis here?

3931

added check to see that reordering order is empty. TE.ReorderIndices.empty()

3934

For insertelements, areTwoInsertFromSameBuildVector() is used to check the inserts are to same vector. For extractelements, check is done to see that extract is from same vector.

ABataev added inline comments.Oct 31 2022, 5:55 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
3904

I believe this can be used not only for PHIs, but for other instructions too

3940–3941

I believe this can be combined with the buildvector order but we can skip it for now.

3949

Better to use stable_sort

4107

&& TE->getOpcode() == Instruction::PHI, but I think better to implement it for all possible instructions here

skc7 updated this revision to Diff 472015.Oct 31 2022, 8:22 AM

Use llvm::stable_sort for reordering phis.

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
3904

I have seen this issue with only PHIs in my code walkthrough. Haven't come across a case where ordering of other instructions in BB changes resulting shuffle mask.

ABataev added inline comments.Oct 31 2022, 8:44 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
3904

Ah, most probably because we consider phi nodes as key nodes, while others are not.

3953

I think we need to return {} here to point that we need to use identity order.

skc7 updated this revision to Diff 472037.Oct 31 2022, 9:04 AM

return {} when identity order.

This revision is now accepted and ready to land.Oct 31 2022, 9:48 AM

Hi @skc7, this has caused a test suite failure on Linaro's SVE vector length agnostic buildbot.

 #8 0x0000ffff9be1a490 __assert_fail_base /build/glibc-RIFKjK/glibc-2.31/assert/assert.c:89:7
 #9 0x0000ffff9be1a4f4 (/lib/aarch64-linux-gnu/libc.so.6+0x2d4f4)
#10 0x0000aaaae40da824 (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla/stage1.install/bin/clang+0x7287824)
#11 0x0000aaaae410ab68 areTwoInsertFromSameBuildVector(llvm::InsertElementInst*, llvm::InsertElementInst*, llvm::function_ref<llvm::Value* (llvm::InsertElementInst*)>) SLPVectorizer.cpp:0:0
#12 0x0000aaaae41284d8 llvm::slpvectorizer::BoUpSLP::getReorderingData(llvm::slpvectorizer::BoUpSLP::TreeEntry const&, bool)::$_10::operator()(llvm::Value*, llvm::Value*) const SLPVectorizer.cpp:0:0
#13 0x0000aaaae4129328 void std::__merge_adaptive<llvm::Value**, long, llvm::Value**, __gnu_cxx::__ops::_Iter_comp_iter<llvm::slpvectorizer::BoUpSLP::getReorderingData(llvm::slpvectorizer::BoUpSLP::TreeEntry const&, bool)::$_10>>(llvm::Value**, llvm::Value**, llvm::Value**, long, long, llvm::Value**, long, __gnu_cxx::__ops::_Iter_comp_iter<llvm::slpvectorizer::BoUpSLP::getReorderingData(llvm::slpvectorizer::BoUpSLP::TreeEntry const&, bool)::$_10>) SLPVectorizer.cpp:0:0

When building McCat eks (whatever that is).

https://lab.llvm.org/buildbot/#/builders/197/builds/3237

The logs are pretty jumbled, so we'll dig out the reproducer for you.

dmgreen added a subscriber: dmgreen.Nov 6 2022, 4:08 AM
dmgreen added inline comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
3906

This checks that V1 and V2 have some number of uses, but the code below then always looks at the first one, to see if it is an insert/extract. As far as I understand the order of use lists is not deterministic though, and we can't just rely on the first element always being the same item. It leads to non-determinism issues.

I hit this where clang was crashing whereas .ll files fed into opt were not (on the scalable vector issue @DavidSpickett mentioned). I'm not sure of the best way to fix this exactly, I've reverted the patch in 656f1d8b74df5d3f5f2bc75a5f2565df48340757. The SLP issue is hopefully easy enough to fix. There's a new test case in 0e9dfff37ef8f29cdda716d5ccb9d8e74d2a48fe.

https://godbolt.org/z/96axz5sbT and https://godbolt.org/z/e4WeTE7ve also have larger examples.

ABataev added inline comments.Nov 6 2022, 4:41 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
3906

Yep, better to check for hasOneUse here.

skc7 added inline comments.Nov 7 2022, 2:36 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
3906

Issue is with getInsertIndex() which fails with ScalableVectors. Created a new patch with the fix D137537. Please review.