This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Extend SVEVectorFuseMulAddSub to support newly added "undef" intrinsics.
ClosedPublic

Authored by paulwalker-arm on Feb 20 2023, 10:23 AM.

Details

Summary

D143767 will change the intrinsics used to lower floating-point
svadd_x, svmul_x and svsub_x builtins. This will result in the
combines added as part of D140200 to no longer fire in all cases.
This patch extends the existing combines for contraction to cover
fadd_u, fmul_u and fsub_u intrinsics.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptFeb 20 2023, 10:23 AM
paulwalker-arm requested review of this revision.Feb 20 2023, 10:23 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 20 2023, 10:23 AM
paulwalker-arm edited the summary of this revision. (Show Details)Feb 20 2023, 10:24 AM

There's likely other combines that will also need to be updated but this is the most important one blocking D143767. I'd like to take a more holistic look at the others as part of work to unify some code paths. For example, I want to canonicalise sve intrinsic calls where the predicate is all active to use the "undef" intrinsics so that some code duplication within the code generator can be removed. With that said, please shout if you know of something critical that should be handled before D143767 lands.

Matt added a subscriber: Matt.Feb 20 2023, 1:21 PM
MattDevereau added inline comments.Mar 9 2023, 11:14 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1622–1625

Is this case separate from instCombineSVEVectorAdd because the "undef" variant can't combine to fmla_u unless both the fmul and fadd are of the _u variants? Or because this case can't benefit from combines in instCombineSVEVectorBinOp?

paulwalker-arm added inline comments.Mar 10 2023, 3:04 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
1622–1625

The formerish. fadd_m(pg, a, fmul_u(pg, b, c)) expects the inactive elements to come from a, which fmla_u does not guarantee. It's worth pointing out the opposite is a valid transformation (i.e. fadd_u(pg, a, fmul_m(pg, b, c)) --> fmla_u(a, b, c) but that's new and I have half a thought it'll be better to soften the fmul_m to fmul_u rather than jumping straight to fmla_u.

This does mean we're not getting the benefit of instCombineSVEVectorBinOp but here my plan is to rewrite m instrinsics that take an all active predicate to their equivalent u intrinsic, to minimise duplication.

MattDevereau accepted this revision.Mar 10 2023, 3:15 AM
This revision is now accepted and ready to land.Mar 10 2023, 3:15 AM
This revision was landed with ongoing or failed builds.Mar 12 2023, 4:31 AM
This revision was automatically updated to reflect the committed changes.