Page MenuHomePhabricator

[AArch64][SVE] Lower vector.insert to predicated SEL
ClosedPublic

Authored by MattDevereau on Dec 7 2021, 9:02 AM.

Details

Summary

Use predicated SEL for vector.insert instead of going through memory

Diff Detail

Event Timeline

MattDevereau created this revision.Dec 7 2021, 9:02 AM
MattDevereau requested review of this revision.Dec 7 2021, 9:02 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 7 2021, 9:02 AM
paulwalker-arm added inline comments.Dec 7 2021, 9:36 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
10999–11010

Perhaps worth combining these blocks? Like

if (Idx == 0 && isPackedVectorType(VT, DAG)) {
  // This will be matched by custom code during ISelDAGToDAG.
  if (Vec0.isUndef())
    return Op;

  unsigned int PredPattern = ......
  ......
}
MattDevereau added inline comments.Dec 8 2021, 3:24 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
10999–11010

The two blocks differ slightly:
isPackedVectorType(VT, DAG)
and
isPackedVectorType(InVT, DAG)

I had to pass VT instead of InVT to get isPackedVector to return false on the <4 x bfloat> test. All tests pass when I combine the statements and use VT. Do you think there will be any side effects later down the line from doing this?

paulwalker-arm added inline comments.Dec 8 2021, 5:11 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
10999–11010

Oh that's interesting. I think the original code is wrong to use InVT and is likely the result of some refactoring for when this code path needed to handle scalable vectors.

isPackedVectorType will always return true for fixed length vectors and so given by this point we know InVT is fixed length the call is redundant. I believe the original code should be using VT because that is the only case that is handled within ISelDAGToDAG (i.e. inserting a fixed length vector into an undef packed scalable vector at index 0).

combined similar if statement logic

MattDevereau marked 2 inline comments as done.Dec 8 2021, 5:15 AM
paulwalker-arm added inline comments.Dec 8 2021, 5:45 AM
llvm/test/CodeGen/AArch64/insert-subvector-res-legalization.ll
171–172

These look like they should be CHECK-NEXT:?

Did update_llc_test_checks.py produce this output?

llvm/test/CodeGen/AArch64/sve-insert-vector.ll
44–45

These look like they should be CHECK-NEXT:?

499–500

Missing : for CHECK-NEXT:

ran tests through update_llc_test_checks.py

MattDevereau marked 3 inline comments as done.Dec 8 2021, 6:38 AM
peterwaller-arm added inline comments.Dec 8 2021, 7:02 AM
llvm/test/CodeGen/AArch64/split-vector-insert.ll
20

You've run update_llc_test_checks on this file, which did not previously use it but the intent of the test was not to check the output-- it was checking for 'legally typed node' in the debug output. I think you can keep the newer CHECK lines but this diff hunk needs discarding.

108

Same for this one.

removed update_llc_test_checks generation on split-vector-insert.ll

MattDevereau marked 2 inline comments as done.Dec 8 2021, 8:27 AM
peterwaller-arm accepted this revision.Dec 8 2021, 8:29 AM
This revision is now accepted and ready to land.Dec 8 2021, 8:29 AM
paulwalker-arm requested changes to this revision.Dec 8 2021, 11:05 AM
paulwalker-arm added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11009

This is why the tests are wrong. ISDOpcodes says Select(COND, TRUEVAL, FALSEVAL) and so Vec0 and ScalableVec1 need switching. Might be worth writing a runable test just to verify this to yourself.

llvm/test/CodeGen/AArch64/insert-subvector-res-legalization.ll
82

Is this not backwards? SEL says "Read active elements from the first source vector and inactive elements from the second source vector" so the subvector should be the first source?

This revision now requires changes to proceed.Dec 8 2021, 11:05 AM

Swap vector operands for vector.insert

MattDevereau marked 2 inline comments as done.Dec 9 2021, 2:09 AM
MattDevereau added inline comments.
llvm/test/CodeGen/AArch64/insert-subvector-res-legalization.ll
82

Yes, it was backwards. Swapping the operands has made it generate a MOV alias of SEL instead which seems more appropriate.

paulwalker-arm accepted this revision.Dec 9 2021, 3:31 AM
paulwalker-arm added inline comments.
llvm/test/CodeGen/AArch64/split-vector-insert.ll
56–60

Not sure why but I noticed that each new patch revision introduced a new blank line. Please remove them before committing.

109–113

As above.

This revision is now accepted and ready to land.Dec 9 2021, 3:31 AM
This revision was automatically updated to reflect the committed changes.
MattDevereau marked an inline comment as done.