Page MenuHomePhabricator

[AArch64] Optimise min/max lowering in ISel
ClosedPublic

Authored by Rin on Jul 22 2021, 8:25 AM.

Details

Summary

Some code optimisations for lowering i64 vector types for the min/max instruction in ISel.

Diff Detail

Event Timeline

Rin created this revision.Jul 22 2021, 8:25 AM
Rin requested review of this revision.Jul 22 2021, 8:25 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 22 2021, 8:25 AM
Rin edited the summary of this revision. (Show Details)Jul 22 2021, 8:26 AM
Rin added reviewers: SjoerdMeijer, dmgreen.
david-arm added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
7159

I think you can move this if block to the start of the function, i.e.

unsigned Opcode = Op.getOpcode();
if (...) {
  switch (Opcode) {
  ...
  }
}
7160

I think you might be able to just do this:

if (useSVEForFixedLengthVectorVT(VT, /*OverrideNEON=*/true)) {

instead?

llvm/test/Analysis/CostModel/AArch64/min-max.ll
-87

nit: It might be easier to review the patch if you first introduced a NFC change to update the tests using utils/update_analyze_test_checks.py?

llvm/test/CodeGen/AArch64/vecreduce-umax-legalization.ll
90–94

Hi @Rin, at first glance this code looks like it might be worse than before? I realise the instruction count is the same, but I wonder if the cost of 'ext' might be higher?

dmgreen added inline comments.Jul 22 2021, 8:58 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
4833–4834

These can use fallthroughs to the next case, if they all call LowerMinMax in the same way.

7159

Is this VT check needed?

llvm/test/Analysis/CostModel/AArch64/min-max.ll
-87

I was updating this file the other day. I think if you rebase the changes may disappear.

dmgreen added inline comments.Jul 22 2021, 8:59 AM
llvm/test/CodeGen/AArch64/vecreduce-umax-legalization.ll
90–94

Why would the cost of an ext be higher?

david-arm added inline comments.Jul 22 2021, 9:02 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
7160

Hi @Rin, apologies please ignore my previous comment about changing the if statement. I see now we also want scalable vector types to fall into the block too!

llvm/test/CodeGen/AArch64/vecreduce-umax-legalization.ll
90–94

@dmgreen it might not be! I just wondered if ext was worse than the mov that's all.

Rin added inline comments.Jul 26 2021, 4:56 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
4833–4834

Good point, I'll change them

llvm/test/Analysis/CostModel/AArch64/min-max.ll
-87

Ah, fair, I'll rebase then

Rin added inline comments.Jul 26 2021, 5:51 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
7159

Is this VT check needed?

Nope, I'll get rid of it

Rin updated this revision to Diff 361666.Jul 26 2021, 8:06 AM
Rin marked 5 inline comments as done.
Rin edited the summary of this revision. (Show Details)

Address review comments

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
7160

No worries :)

dmgreen added inline comments.Jul 27 2021, 10:50 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1043

Can these go into a loop, looping over the UMAX/UMIN/...

llvm/test/Analysis/CostModel/AArch64/min-max.ll
-87

It turns out I had not updated that test yet.. it should now have the changes I mentioned, if you rebase the patch.

Can you make sure the instructions have a proper cost now too? They should be 2 I think. There is already some code in AArch64TTIImpl::getIntrinsicInstrCost for it.

Rin updated this revision to Diff 362379.Jul 28 2021, 7:39 AM

Fix Costmodel and loop over min/max when legalising instruction

Rin marked 3 inline comments as done.Jul 28 2021, 7:40 AM

Thanks for updating the costs.

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
222–223

This code can be removed now, it can fall through to the smin/smax cases.

229

Maybe add a comment about it being converted to a cmp+bif?

Rin added inline comments.Jul 28 2021, 7:44 AM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
222–223

Yeah, I was wondering about that. I'll remove it

229

Will do

Rin updated this revision to Diff 362389.Jul 28 2021, 8:01 AM

Add comment and remove unnecesary code

Rin marked 2 inline comments as done.Jul 28 2021, 8:02 AM
dmgreen accepted this revision.Jul 28 2021, 10:19 AM

Thanks. LGTM

This revision is now accepted and ready to land.Jul 28 2021, 10:19 AM
efriedma added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
7180

If I'm understanding correctly, if we mark both ISD::SMAX and ISD::VSELECT as "Expand", vector legalization decides to unroll it. So we mark ISD::SMAX as "Custom", then explicitly lower to a VSELECT to get the code we want.

This seems kind of silly, given VSELECT is equivalent to AArch64ISD::BSP. For the sake of making changes to target-independent code easier, maybe we should consider marking ISD::VSELECT "custom"? Or add a target hook to indicate whether the operation is cheap?

Matt added a subscriber: Matt.Jul 28 2021, 2:53 PM
dmgreen added inline comments.Jul 29 2021, 8:44 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
7180

Yes, we didn't mention that anywhere here. There are three ways we thought of fixing this - either make vselect custom/legal, adding a target hook for the expand code or custom lowering the type. Because it was only a single type, custom lowering seemed like the simplest route forward. The target hook felt messy for one type on one operation, and custom lowering vselect could be a much larger change - one that would easily cause regressions if a lot of other transforms were not added.

I guess this turned into more code than I expected, with it being mixed up with SVE lowering. Perhaps custom lowering vselect would be better in the long run if someone was to optimize the BSP's in all the cases it would need, but this patch seems fine to me for the problem it is solving.

efriedma added inline comments.Jul 29 2021, 11:46 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
7180

Okay.

I'm not really concerned about this patch on its own, just thinking about what we might want to do in the future.

This revision was landed with ongoing or failed builds.Aug 2 2021, 5:40 AM
This revision was automatically updated to reflect the committed changes.