This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Bubble vector.reverse of select operands to their result.
ClosedPublic

Authored by paulwalker-arm on Dec 5 2022, 9:19 AM.

Details

Summary

This mirrors a similar shufflevector transformation so the same
effect is obtained for scalable vectors. The transformation is
only performed when it can be proven the number of resulting
reversals is not increased. By bubbling the reversals from operand
to result this should typically be the case and ideally leads to
back-back shuffles that can be elimitated entirely.

Diff Detail

Event Timeline

paulwalker-arm created this revision.Dec 5 2022, 9:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 5 2022, 9:19 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
paulwalker-arm requested review of this revision.Dec 5 2022, 9:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 5 2022, 9:19 AM
sdesmalen added inline comments.Dec 6 2022, 12:49 AM
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
2378

what is the reasoning behind either of them having a single use?

2381

FValSplat?

david-arm added inline comments.Dec 6 2022, 2:29 AM
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
2368

Would it be possible to modify one of the existing tests to have flags so we can correctly defend their propagation? We now have code paths where we depend upon the fast-math flags being set correctly on the select by the loop vectoriser.

2382

I don't know how far you want to go with this, but you can support more than just splats for the false and true vals. For example, we can also do the transformation when FVal is an arbitrary fixed-width vector constant, i.e.

define <4 x i32> @select_reverse(<4 x i1> %a, <4 x i32> %b, <4 x i32> %c) {
  %a.rev = tail call <4 x i1> @llvm.experimental.vector.reverse.v4i1(<4 x i1> %a)
  %b.rev = tail call <4 x i32> @llvm.experimental.vector.reverse.v4i32(<4 x i32> %b)
  %select = select <4 x i1> %a.rev, <4 x i32> %b.rev, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
  ret <4 x i32> %select
}

Fixed typo in comment and converted reverse_select_reverse to show flags get preserved.

paulwalker-arm marked 4 inline comments as done.Dec 6 2022, 4:38 AM
paulwalker-arm added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
2368

I've updated reverse_select_reverse.

2378

The only situation where the combine can introduce more reversals than the original input IR is when all operands have more than one use. So if one or more operands have a single use then we can be sure the output will be the same or better.

For the case where the number of reversals is the same the hope is that at some point the reversals will bubble to a point where back-back reversals are seen with both then removed.

2382

vector.reverse is not typically used for fixed length vectors as such intrinsics should be converted to shufflevector instructions where the existing combines will kick in.

For scalable vectors is there an idiom outside of splats where the operand can be rewritten without introducing an explicit reversal?

david-arm accepted this revision.Dec 6 2022, 5:09 AM

LGTM! C'est magnifique. :)

This revision is now accepted and ready to land.Dec 6 2022, 5:09 AM
sdesmalen accepted this revision.Dec 6 2022, 7:29 AM