This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE] Avoid AND operation if both side are splat of i1 or PTRUE
ClosedPublic

Authored by dtemirbulatov on Jan 5 2023, 4:14 AM.

Details

Summary

If both sides of AND operations are i1 splat_vectors or PTRUE node then we can produce just i1 splat_vector as the result.

Diff Detail

Event Timeline

dtemirbulatov created this revision.Jan 5 2023, 4:14 AM
dtemirbulatov requested review of this revision.Jan 5 2023, 4:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 5 2023, 4:14 AM

Changed proposed test in order to remove -O3 out of llc parameters.

sdesmalen added inline comments.Jan 5 2023, 8:31 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
16271

I think you can use the existing function isAllActivePredicate to check if either of the operands is an all active-predicate. This function already looks through reinterpret nodes and already checks both the splat/ptrue case.

llvm/test/CodeGen/AArch64/sve-splat-one-and-ptrue.ll
15 ↗(On Diff #486542)

It would also be good to have a test that uses a shufflevector to create the splat.

19 ↗(On Diff #486542)

Is this function call relevant?

david-arm added inline comments.Jan 5 2023, 8:36 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
16217

Rather than use a recursive function call you could just do:

static SDValue findRootNonReinterpretNode(SDValue Op) {
  while (Op.getOpcode() == AArch64ISD::REINTERPRET_CAST)
    Op = Op->getOperand(0);
  return Op;
}

Perhaps that's a bit friendlier to the compiler too?

dtemirbulatov marked 3 inline comments as done.

Addressed comments.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
16271

but I need to check and return ptrue/splat node and isAllActivePredicate() just checks.

georges added a subscriber: georges.Jan 6 2023, 8:20 AM
Matt added a subscriber: Matt.Jan 6 2023, 12:18 PM
david-arm added inline comments.Jan 9 2023, 1:38 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
16271

Hi @dtemirbulatov, I'm re-reading this code and isAllActivePredicate. I actually think @sdesmalen is right and we should use isAllActivePredicate here, because your current code doesn't take into account various cases such as:

  1. When reinterpreting from a type with fewer elements the "new" elements are not active, so I think we should bail out of this optimisation in this case.
  2. Your code below doesn't consider what type of pattern is being passed to ptrue, but we should be. isAllActivePredicate only returns true for AArch64SVEPredPattern::all.

For any cases where we bail out I think it's worth adding a negative test as well.

Addressed coments.

dtemirbulatov marked 2 inline comments as done.Jan 9 2023, 6:42 AM

Hi @dtemirbulatov, this is looking better now thanks! I just have a couple more comments ...

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
16269

I think this comment suggests we only support the case where both operands are i1 splat_vectors, but your code below supports only one operand being an all-active predicate. Can you update the comment to reflect that? For example, something like

// If either side of the AND operation is a i1 splat_vector then return the other
// operand.

For example, and %splat_i1, %b -> %b and and %a, %splat_i1 -> %a. Can you add twmo test cases to cover both of these cases too?

llvm/test/CodeGen/AArch64/sve-splat-one-and-ptrue.ll
35 ↗(On Diff #487413)

This looks like a negative test. Can you add a comment explaining why the DAG combine doesn't happen?

sdesmalen added inline comments.Jan 9 2023, 6:58 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
16217–16223

isAllInactivePredicate is unused now?

16225

Can you commit the moving of this function as a non-functional change (as a precursory patch)?

16271–16274

nit: can you replace Src with N->getOperand(0) here? It's equivalent, but it makes it a bit more intuitive to understand.

llvm/test/CodeGen/AArch64/sve-splat-one-and-ptrue.ll
40 ↗(On Diff #487413)

I would have expected all ptrues to have folded away and to only have:

ptrue p0.s

Why are these AND instructions not removed?

45 ↗(On Diff #487413)

nit: you can remove tail for these function calls.

sdesmalen added inline comments.Jan 9 2023, 7:33 AM
llvm/test/CodeGen/AArch64/sve-splat-one-and-ptrue.ll
40 ↗(On Diff #487413)

In second consideration, I see that only one of these AND's can be removed. The semantics of llvm.aarch64.sve.convert.to.svbool.nxv2i1 are such that it explicitly zeroes the inactive lanes, i.e. they don't map 1-1 to REINTERPRET_CAST.

dtemirbulatov marked 6 inline comments as done.

Added combine for AArch64ISD::REINTERPRET_CAST nodes, addressed remarks.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
16217–16223

No, still in use in performVSelectCombine.

sdesmalen added inline comments.Jan 10 2023, 7:52 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
16219–16221

This code would simplify:

t0: nxv2i1 ...
  t1: nxv16i1 REINTERPRET_CAST t0
    t2: nxv4i1 REINTERPRET_CAST t1
      t3: nxv8i1 REINTERPRET_CAST t2
        t4: nxv2i1 REINTERPRET_CAST t3

into: t0

But not something like this

t0: nxv2i1 ...
  t1: nxv16i1 REINTERPRET_CAST t0
    t2: nxv4i1 REINTERPRET_CAST t1
      t3: nxv8i1 REINTERPRET_CAST t2
        t4: nxv16i1 REINTERPRET_CAST t3

into:

t0: nxv2i1 ...
  t1: nxv16i1 REINTERPRET_CAST t0

I'm not sure if this is common, but it's probably more robust to do something like this:

while(Op.getOpcode() == AArch64ISD::REINTERPRET_CAST &&
      LeafOp.getValueType() != Op.getValueType())
  Op = Op->getOperand(0);
llvm/test/CodeGen/AArch64/sve-intrinsics-reinterpret.ll
128

Nice :)

llvm/test/CodeGen/AArch64/sve-splat-one-and-ptrue.ll
8 ↗(On Diff #487796)

Can you give these tests more meaningful names, e.g. @fold_away_ptrue_and_ptrue for @foo and @fold_away_ptrue_and_splat_predicate for bar.

david-arm added inline comments.Jan 10 2023, 8:56 AM
llvm/test/CodeGen/AArch64/sve-splat-one-and-ptrue.ll
1 ↗(On Diff #487796)

I think this is still missing tests for the more general case where only one AND operand is an all-active predicate. Your logic on line 16380 permits this more general behaviour so we ought to test it.

dtemirbulatov marked 3 inline comments as done.

Addressed comments.

sdesmalen accepted this revision.Jan 11 2023, 2:52 AM

LGTM with comment on the test addressed!

llvm/test/CodeGen/AArch64/sve-splat-one-and-ptrue.ll
36 ↗(On Diff #488116)

Can you give @foo and @bar some more meaningful names as well?

This revision is now accepted and ready to land.Jan 11 2023, 2:52 AM

Renamed two functions in sve-splat-one-and-ptrue.ll

This revision was landed with ongoing or failed builds.Jan 11 2023, 6:07 AM
This revision was automatically updated to reflect the committed changes.