This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE] Fold vselect into predicated fmul, fsub and fadd
ClosedPublic

Authored by MattDevereau on Jan 19 2022, 8:50 AM.

Details

Summary

[AArch64][SVE] Fold vselect into predicated fmul, fsub and fadd

Fold vselect with an unpredicated fmul/fsub/fadd operand into a predicated fmul/fsub/fadd:

(vselect (p) (op (a) (b)) (a)) => (op -> (p) (a) (b))

Diff Detail

Event Timeline

MattDevereau created this revision.Jan 19 2022, 8:50 AM
MattDevereau requested review of this revision.Jan 19 2022, 8:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 19 2022, 8:50 AM
peterwaller-arm added inline comments.
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
501–518

You might be able to condense it to something like this suggestion, by introducing another multiclass in InstrFormats SVE_VSelect_3_Op_Pat, analogous to SVE_3_Op_Pat_SelZero. What you have looks reasonable to me, you might want a second opinion from someone else as to whether my proposal here makes sense.

llvm/test/CodeGen/AArch64/sve-fp-vselect.ll
89

Register called %mul for an fsub -- looks like a few register names need updating.

paulwalker-arm added inline comments.Jan 20 2022, 4:20 AM
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
501–518

If possible I prefer the patterns to be added directly to the class that defines the instruction (i.e. sve_fp_2op_p_zds in this case)?

paulwalker-arm added inline comments.Jan 20 2022, 4:37 AM
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
501–518

Just to extend on this. Perhaps you can create a PatFrags that contains both the intrinsic and vselect DAG patterns? That way you won't need to change the sve_fp_2op_p_zds class, you'll just pass in a different op in place of int_aarch64_sve_fadd for example.

MattDevereau added inline comments.Jan 24 2022, 9:11 AM
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
501–518

I don't understand what you mean by the intrinsic. I've not used the int_aarch64_sve_fadd naming so far so I'm not sure which part that refers to, and the only DAG nodes I saw when working on this was AArch64fadd_p. Are you suggesting I use the PatFrags to add the match to defm FADD_ZPmZ on line 454 when it uses sve_fp_2op_p_zds?

paulwalker-arm added inline comments.Jan 25 2022, 5:23 AM
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
501–518

The sve_fp_2op_p_zds multiclass is used to define the FADD_ZPmZ instruction. It takes an operator that represents the instruction at the DAG level. Currently this is set to int_aarch64_sve_fadd. With this patch you're effectively saying there's another way to represent this instruction within the DAG (i.e. (vselect (p) (a) (fadd (a) (b)))) and so I'm thinking you can create a PatFrags that contains bother operators which is them passed into sve_fp_2op_p_zds in place of the existing int_aarch64_sve_fadd. This way you shouldn't need to create any extra isel patterns.

MattDevereau added inline comments.Jan 25 2022, 5:48 AM
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
501–518

Me and Peter had a dive into this, and we assumed you were getting at something like that.
We came up with this PatFrags:

def myfirstpatfrag : PatFrags<(ops node:$Pg, node:$Op1, node:$Op2), [
  (int_aarch64_sve_fmul node:$Pg, node:$Op1, node:$Op2),
  (vselect node:$Pg, node:$Op1, 
    (AArch64fmul_p (nxv2i1 (AArch64ptrue 31)), node:$Op1, node:$Op2)
  ),
]>;

but we haven't managed to find a way to generalise the nxv2i1 part so far.

paulwalker-arm added inline comments.Jan 25 2022, 7:54 AM
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
501–518

You should be able to drop the (nxv2i1 ) part as the predicate type should be inferable. I just gave this a quick try and it works but you also need to tighten up the definition of the predicated binops (i.e. SDT_AArch64Arith, which is used to define AArch64fmul_p)

I extended it to include SDTCVecEltisVT<1,i1>, SDTCisSameNumEltsAs<0,1>,...., which looks to do the trick.

Matt added a subscriber: Matt.Jan 25 2022, 3:15 PM
MattDevereau retitled this revision from [AArch64][SVE] Fold vselect into predicated fmul to [AArch64][SVE] Fold vselect into predicated fmul, fsub and fadd.
MattDevereau edited the summary of this revision. (Show Details)

Use patfrags
Update commit message

paulwalker-arm added inline comments.Jan 26 2022, 4:49 AM
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
168

Out of interest are the SDTCisVec<1>, SDTCisVec<2>, SDTCisVec<3> bits removable? I'm thinking they're redundant given the next line is now very specific about what the types of those operands are.

243

We already have a defined naming scheme for these operations, so these should be named AArch64fadd_m1, AArch64fmul_m1 & AArch64fsub_m1.

MattDevereau marked an inline comment as done.Jan 26 2022, 5:01 AM
MattDevereau added inline comments.
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
168

Removing them has no effect on the tests added in this patch or anything in the check-llvm target

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
246

Can the PatFrags be factored into a class with two parameters to avoid stamping it out repeatedly?

251

s/(AArch64ptrue 31)/(SVEAllActive)/g

Moved repeated patfrag into class

llvm/lib/Target/AArch64/SVEInstrFormats.td
8468 ↗(On Diff #403275)

I don't think this should mimic the form of the other (SVE_...) names, since those are for defining patterns, whereas this is a PatFrags, which is a different kind of thing. So in my view the name should be distinct to try and avoid catching unsuspecting readers into another trap.

How about EitherVSelectOrPassthruPatFrags. I'm thinking along the lines of trying to avoid the error of Or reading as the binary operation by prefacing it with Either.

Also, this doesn't look like the right place for this. I think a better location would be up near this, prehaps right at the end of that block before the getSVEPseudoMap definition. For good measure I might put a comment-block delimiter which separates it from the patterns above, and indicate in the comment that this "matches either an intrinsic, or a predicated operation with an all active predicate".

//===----------------------------------------------------------------------===//
// SVE pattern match helpers.
//===----------------------------------------------------------------------===//

Note also the missing trailing newline phabricator is complaining about.

llvm/lib/Target/AArch64/SVEInstrFormats.td
8464 ↗(On Diff #403275)

I think better names might be:
op => intrinsic
inst => sdnode

Beware copying the names form the other SVE_... classes, which I think would be confusing. Because in those classes, op is the thing to match for and 'inst' is the replacement, which is not the case for your PatFrags where, both cases are actually things to match.

Updated PatFrags class name and location

paulwalker-arm added inline comments.Jan 27 2022, 7:44 AM
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
169

I cannot say if it really matters (given SDTCisSameNumEltsAs<0,1> implies 0 must be a vector) but I'd be inclined to keep SDTCisVec<0>.

llvm/test/CodeGen/AArch64/sve-fp-vselect.ll
5–13

Are these tests correct? To me it looks like you've got the true and false operands the wrong way round. This IR is effectively saying "when the predicate it true return op1 else return op1 * op2".

The fact the generate code is what we're after suggests the same mistake exists for the new isel patterns.

MattDevereau added inline comments.Jan 28 2022, 7:37 AM
llvm/test/CodeGen/AArch64/sve-fp-vselect.ll
5–13

The tests aren't correct, I missed the fact that fcmeq in the sqrt example is creating a predicate of 1s. Is it possible to reverse the operands for specific patterns in the PatFrags list? If not I'll have to split up parts of the logic introduced in this patch

I'd focus on one thing at a time. This patch enables the isel required for the way merged arithmetic is modelled by the DAG. Once that is complete you can see what's needed to improve the reciprocal code. For that follow-on work I'm guessing you'll need a DAG combine for the select that inverts the comparison for the cases where this'll be beneficial.

MattDevereau edited the summary of this revision. (Show Details)

update commit message, flip operands for vselect pattern

paulwalker-arm accepted this revision.Feb 1 2022, 4:01 PM

The IR within the tests looks like it's been doubly indented and there's a typo in the commit message (should be (vselect (p) (op (a) (b)) (a)) => (op -> (p) (a) (b)) but otherwise looks good.

This revision is now accepted and ready to land.Feb 1 2022, 4:01 PM
peterwaller-arm accepted this revision.Feb 2 2022, 1:27 AM
MattDevereau edited the summary of this revision. (Show Details)Feb 3 2022, 5:46 AM