This is an archive of the discontinued LLVM Phabricator instance.

[SVE][AArch64] Enable first active true vector combine for INTRINSIC_WO_CHAIN
ClosedPublic

Authored by Allen on Mar 31 2022, 3:15 AM.

Details

Summary

WHILELO/LS insn is used very important for SVE loop, and itself
is a flag-setting operation, so add it.

Diff Detail

Event Timeline

Allen created this revision.Mar 31 2022, 3:15 AM
Herald added a project: Restricted Project. · View Herald Transcript
Allen requested review of this revision.Mar 31 2022, 3:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 31 2022, 3:15 AM
david-arm added inline comments.Mar 31 2022, 5:43 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
14550

Hi @Allen, don't you also have to check the first operand to see if it matches the intrinsic, i.e. aarch64_sve_whilelo or aarch64_sve_whilels?

llvm/test/CodeGen/AArch64/sve-cmp-folds.ll
102

Since you also want to support the DAG combine with whilels is it worth having a test for that too?

Allen updated this revision to Diff 419445.Mar 31 2022, 7:16 AM

update to check the aarch64_sve_whilelo or aarch64_sve_whilels

paulwalker-arm added inline comments.Mar 31 2022, 9:02 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
14550

I guess I should have asked this a couple of patches ago but being provocative what's the value in having any restriction here? I recall something about allowing hoisting from loops but is this really a necessary consideration at this phase of compilation? Are there other things to consider?

david-arm added inline comments.Mar 31 2022, 9:11 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
14550

So I think the MachineLICM pass runs after this phase, right? And if you introduce flag-setting operations they don't get hoisted out. Obviously if you had flag-setting operations to start with then it's no worse. This is what happens when you dup a i1 into a predicate register, for example see LowerSPLAT_VECTOR where we create a whilelo that then never gets hoisted out of the loop.

I'm happy to be proved wrong though as I may have missed something!

paulwalker-arm added inline comments.Mar 31 2022, 9:23 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
14550

Sure, that makes sense. I guess what I'm asking is what's the scenario where we are extracting from a loop invariant predicate vector that hasn't already been hoisted at the IR level.

Matt added a subscriber: Matt.Mar 31 2022, 9:30 AM
paulwalker-arm accepted this revision.Apr 4 2022, 5:08 AM

Whether it's worth having the CC restriction aside, the patch look good to me. I've mentioned some refactoring to consider.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
14538–14539

This name is less meaningful after this change.

14548–14551

Perhaps it's worth extracting this into a isPredicateCCSettingOp helper function (or lambda) as eventually we'll want to add all such operations if we're going to persist with this restriction.

This revision is now accepted and ready to land.Apr 4 2022, 5:08 AM
Allen updated this revision to Diff 420717.Apr 5 2022, 11:41 PM

update according review

  1. define a new helper function isPredicateCCSettingOp
  2. add all other type whileXX operations
  3. add related cases, and sve->sve2
This revision was automatically updated to reflect the committed changes.
Allen marked an inline comment as done.