This is an archive of the discontinued LLVM Phabricator instance.

[VectorCombine] convert scalar fneg with insert/extract to vector fneg
ClosedPublic

Authored by spatel on Oct 5 2022, 8:54 AM.

Details

Summary

insertelt DestVec, (fneg (extractelt SrcVec, Index)), Index --> shuffle DestVec, (fneg SrcVec), Mask

This is a specialized form of what could be a more general fold for a binop. It's also possible that fneg is overlooked by SLP in this kind of insert/extract pattern since it's a unary op.

This shows up in the motivating example from #issue 58139, but it won't solve it (that probably requires some x86-specific backend changes). There are also some small enhancements (see TODO comments) that can be done as follow-up patches.

Diff Detail

Event Timeline

spatel created this revision.Oct 5 2022, 8:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 5 2022, 8:54 AM
spatel requested review of this revision.Oct 5 2022, 8:54 AM
tschuett added inline comments.
llvm/lib/Transforms/Vectorize/VectorCombine.cpp
564

Tiniest nit: You could Mask.reserve(VecTy->getNumElements());

RKSimon accepted this revision.Oct 10 2022, 5:16 AM

LGTM

llvm/lib/Transforms/Vectorize/VectorCombine.cpp
564

If you wanted to you could go even further (assuming we know that Index < VecTy->getNumElements())

SmallVector<int> Mask(VecTy->getNumElements());
std::iota(Mask.begin(), Mask.end(), 0);
Mask[Index] = Index + e;
llvm/test/Transforms/VectorCombine/X86/extract-fneg-insert.ll
7

Are you going to look at this? If not please can you raise a bug.

This revision is now accepted and ready to land.Oct 10 2022, 5:16 AM
spatel marked an inline comment as done.Oct 10 2022, 5:56 AM
spatel added inline comments.
llvm/test/Transforms/VectorCombine/X86/extract-fneg-insert.ll
7

I'm not sure what the proper fix will be, so let's track it either way:
https://github.com/llvm/llvm-project/issues/58261

spatel updated this revision to Diff 466578.Oct 10 2022, 11:53 AM
spatel marked 3 inline comments as done.

Patch updated:
Replace shuffle mask loop with pre-allocated vector and std::iota().

Hello,

The following starts crashing with this patch:

opt -passes='vector-combine' bbi-74776.ll -S -o /dev/null

Hello,

The following starts crashing with this patch:

opt -passes='vector-combine' bbi-74776.ll -S -o /dev/null

Thanks! Taking a look now - probably need to restrict the pattern match to true fneg.

Hello,

The following starts crashing with this patch:

opt -passes='vector-combine' bbi-74776.ll -S -o /dev/null

Thanks! Taking a look now - probably need to restrict the pattern match to true fneg.

On 2nd thought, we can adjust the capture to handle this kind of pattern:
8d76fbb5f065