This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen][SVE] Add patterns for whole vector predicate select
Closed, Public

Authored by sdesmalen on May 4 2020, 12:34 PM.

Details

Summary

Added patterns to implement select i1 %p, <vty> %a, <vty> %b, where a single scalar i1 condition chooses between two whole vectors.

Event Timeline

sdesmalen created this revision. May 4 2020, 12:34 PM

We use whilelo to lower VECTOR_SPLAT; I think that ends up being one fewer vector instruction. What's the tradeoff between that vs. dup+cmpne?

There's no specific reason to choose dup+cmpne, other than that in these patterns the GPR32 can be any non-zero value, where to use whilelo we'd need a sign-extended i1 value.
I'm happy to change it to use whilelo, but I think it then needs to be implemented in ISelLowering code where we can do analysis on the value of the predicate using computeKnownBits, rather than using patterns. Is that correct?
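The difference between the two sequences can be illustrated with a small model. This is only a sketch under simplifying assumptions: a fixed lane count stands in for the scalable vector length, the operands are 32 bits wide, and the function names whilelo, dup_cmpne and sext_i1 are hypothetical helpers, not LLVM or architecture APIs.

```python
def whilelo(start, limit, lanes, bits=32):
    """Simplified model of WHILELO: lane i is active while
    (start + i) < limit as an unsigned comparison. Once a lane is
    inactive, all later lanes stay inactive."""
    mask = (1 << bits) - 1
    active = True
    pred = []
    for i in range(lanes):
        active = active and ((start + i) & mask) < (limit & mask)
        pred.append(active)
    return pred


def dup_cmpne(value, lanes, bits=32):
    """Simplified model of DUP followed by CMPNE against zero:
    broadcast the scalar, then mark every non-zero lane as active."""
    mask = (1 << bits) - 1
    splat = [value & mask] * lanes
    return [lane != 0 for lane in splat]


def sext_i1(value, bits=32):
    """Sign-extend bit 0 of value to all-zeros or all-ones."""
    return (1 << bits) - 1 if value & 1 else 0


LANES = 4  # assumption: fixed lane count, for illustration only

# dup+cmpne accepts the raw value 1 and activates every lane.
all_true_from_raw = dup_cmpne(1, LANES)

# whilelo with the raw value 1 as the limit activates only lane 0.
partial_from_raw = whilelo(0, 1, LANES)

# After sign extension the limit is all-ones, so every lane is active.
all_true_from_sext = whilelo(0, sext_i1(1), LANES)
```

In this model a raw condition value of 1 gives an all-true predicate with dup+cmpne but only a single active lane with whilelo, which is why the whilelo form needs the sign-extended value.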

It's possible to write a TableGen pattern that explicitly generates an SBFM. In general that's what we would generate anyway. But you'd want to do custom lowering to give DAGCombine a chance to combine it away, yes.
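For reference, the sign extension mentioned here can be modelled as a signed bitfield extract (SBFX, which is an alias of SBFM). The sketch below is a plain arithmetic model of that operation, assuming a 32-bit register; the function name sbfx and its parameters are illustrative, not a TableGen pattern or an LLVM API.

```python
def sbfx(value, lsb, width, bits=32):
    """Model of a signed bitfield extract: take `width` bits of
    `value` starting at bit `lsb`, sign-extend them, and return the
    result as an unsigned `bits`-wide register value."""
    field = (value >> lsb) & ((1 << width) - 1)
    sign = 1 << (width - 1)
    signed = (field ^ sign) - sign  # standard sign-extension trick
    return signed & ((1 << bits) - 1)


# Extracting a 1-bit field at bit 0 turns an i1 into 0 or all-ones.
true_mask = sbfx(1, 0, 1)
false_mask = sbfx(0, 0, 1)
```

Only bit 0 of the input affects the result in the 1-bit case, so any upper bits left in the register are ignored.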

  • select now lowers to splat_vector for the predicate (which codegens to whilelo) plus vselect.
  • Updated tests and rebased patch.
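
The updated lowering can be sketched at the semantic level as follows. This is an illustrative model only: Python lists with a fixed length stand in for scalable vectors, and splat_vector, vselect and select_whole_vector are hypothetical helper names that mirror the node names used above rather than real LLVM functions.

```python
def splat_vector(p, lanes):
    """Broadcast a scalar condition to every lane of a predicate."""
    return [bool(p)] * lanes


def vselect(mask, a, b):
    """Per-lane select: take the lane from a where the mask is set,
    otherwise from b."""
    return [x if m else y for m, x, y in zip(mask, a, b)]


def select_whole_vector(p, a, b):
    """select i1 %p, <vty> %a, <vty> %b expressed as
    vselect(splat_vector(%p), %a, %b)."""
    return vselect(splat_vector(p, len(a)), a, b)


a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
picked_a = select_whole_vector(True, a, b)
picked_b = select_whole_vector(False, a, b)
```

Because the predicate is a splat, every lane makes the same choice, so the result is always exactly one of the two input vectors.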
This revision is now accepted and ready to land. May 11 2020, 1:05 PM
This revision was automatically updated to reflect the committed changes.