This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE] Combine FADD and FMUL aarch64 intrinsics to FMLA
ClosedPublic

Authored by MattDevereau on Nov 3 2021, 4:39 AM.

Details

Diff Detail

Event Timeline

MattDevereau created this revision.Nov 3 2021, 4:39 AM
MattDevereau requested review of this revision.Nov 3 2021, 4:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 3 2021, 4:39 AM
peterwaller-arm added inline comments.Nov 3 2021, 6:28 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
706

Use m_Specific; m_Deferred is for when you want to match an earlier m_Value.

714
llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-fmla.ll
2 ↗(On Diff #384397)

Some ideas:

  1. Maybe generate the tests using llvm/utils/update_test_checks.py
  2. How about calling this file sve-intrinsic-muladd, so that we don't have to introduce a new file for each of FMLA, FMAD, FMLS etc?
  3. (Less important) I think I'd prefer to see names rather than anonymous instructions, so %p, %a, %b, etc. Makes it easier to conceptually line them up and means they don't have to be all renumbered should the test have something added to it in the future.
paulwalker-arm added inline comments.Nov 3 2021, 6:29 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
701

These should be upper case to match LLVM styling and be consistent with the other names using within this patch.

706

This can be m_Specific as there is nothing to defer.

718

Given by this point you've proven FAddFlags == FMulFlags you only need to check FAddFlags for allowContract which means you don't really need the FMulFlags temporary as they'll only be a single use remaining.

760–762

I think the following is more in keeping with LLVM styling.

if (auto FMLA = instCombineSVEVectorFMLA(IC, II))
  return FMLA;
MattDevereau added inline comments.Nov 3 2021, 7:53 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
701

I'm not 100% sure what you mean by "these" being consistent with the other names

p -> P
a -> A
b -> B
c -> C
FMul -> FMUL?

peterwaller-arm added inline comments.Nov 3 2021, 8:09 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
701

Paul is referring to this section of the style guide: https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

I think FMul is in keeping with the surroundings as it is. Your other suggestions look fine to me in the context, or you could name them more thoroughly (Pred/Predicate, OpA, OpB, OpC) for example.

MattDevereau updated this revision to Diff 384463.EditedNov 3 2021, 8:41 AM

Renamed variables in instCombineSVEVectorFMLA
Reformatted instCombineSVEVectorFAdd
Replaced incorrect m_Deferred with m_Specific
Renamed test sve_intrinsic_fmla.ll to sve_intrinsic_muladd.ll

MattDevereau marked 7 inline comments as done.Nov 3 2021, 8:43 AM
MattDevereau added inline comments.
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
701

I've changed FMul to FMUL to stay in line with FMLA further down in the function

MattDevereau marked 2 inline comments as done.Nov 3 2021, 8:45 AM
MattDevereau added inline comments.
llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-fmla.ll
2 ↗(On Diff #384397)

I've done all of the above

peterwaller-arm added inline comments.Nov 3 2021, 8:46 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
701

FMLA is an acronym so warrants full capitalization, FMul is more like PTest, and you can find other examples of things being called Mul. It's camel case, so I preferred it as you had it before (there are other examples in the file of 'Mul' for example).

MattDevereau marked an inline comment as done.

Renamed FMUL to FMul

Matt added a subscriber: Matt.Nov 3 2021, 10:02 AM
peterwaller-arm added inline comments.Nov 4 2021, 2:59 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
713–714

I think this comment needs work. The bar needs to be: if someone who is not us reads this in the future, does it help them understand what is going on? For example, the qualifier 'important' -- it is going to be difficult for a future reader to arrive at the same understanding as you have about what an 'important flag' is. Important for what? The set of flags in existence may be different in the future, for example. I've suggested an edit, please only accept if it makes sense to you and feel free to refine it further.

MattDevereau added inline comments.Nov 4 2021, 4:33 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
713–714

I'm not a fan of using language that refers to possible future work, i.e. "For the time being", if you want the reader to understand what's going on here with this comment then referring to work that might not even happen instead of whats literally happening seems a bit strange to me. Also I don't really like using less common synonyms like inhibit or elide. what about:

// FMLA must merge two sets of flags into one. Stop the combine if the flags on the inputs differ in case
// they allow other optimizations to skip the operation entirely.
david-arm added inline comments.Nov 4 2021, 7:09 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
713–714

For what it's worth I think I agree with @MattDevereau. To be honest, I thought the original comment seemed fine to me. Perhaps it's worth emphasising that combining the operations requires taking the common subset of flags and potentially dropping ones that would otherwise permit more powerful optimisations to take place? i.e. something like

// FMLA must merge two sets of flags into one, taking only the common subset.
// Stop the combine when the flags on the inputs differ in case dropping flags would lead to
// us missing out on more beneficial optimizations.
MattDevereau added inline comments.Nov 4 2021, 7:52 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
713–714

I like that suggestion, but for clarity on "taking only the common subset" - we aren't taking the common subset/intersect here, but rather if any flags at all aren't matching we're bailing out. On the previous attempt we talked about it here https://reviews.llvm.org/D111638#inline-1071530.

How about:

// FMLA must merge two equal sets of input flags into one.
// Stop the combine when the flags on the inputs differ in case dropping flags would lead to
// us missing out on more beneficial optimizations.
peterwaller-arm added inline comments.Nov 4 2021, 8:14 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
713–714

Agree with what is being said.

nit. "FMLA must merge two equal sets of input flags into one." => can be read as 'false', because we don't actually merge them, so it's not something it 'must' do.

I think you can potentially just drop that sentence (suggested edit). What I have in mind is that 'there were two slots', now 'there is only one slot', therefore information loss, which is where I started with my original suggestion.

MattDevereau marked 3 inline comments as done.
MattDevereau retitled this revision from Combine FADD and FMUL aarch64 intrinsics to FMLA to [AArch64][SVE] Combine FADD and FMUL aarch64 intrinsics to FMLA.

Updated flags comment

peterwaller-arm accepted this revision.Nov 4 2021, 9:33 AM
This revision is now accepted and ready to land.Nov 4 2021, 9:33 AM
paulwalker-arm accepted this revision.Nov 5 2021, 3:31 AM
paulwalker-arm added inline comments.
llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-muladd.ll
56–67 ↗(On Diff #384780)

I don't think this test offers any real value.

MattDevereau added inline comments.Nov 5 2021, 3:35 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
713–714

Sounds good to me. I'll update it with this

llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-muladd.ll
56–67 ↗(On Diff #384780)

I added this to double check only fmul's get combined. At some point my fmul match was off and it was combining double fadds. Should I remove it when I push?

paulwalker-arm added inline comments.Nov 5 2021, 3:38 AM
llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-muladd.ll
56–67 ↗(On Diff #384780)

Yes please.