This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombine][AArch64][CodeGen] Allow tranformable vectors to a legal for MULH lowering and use SVE's MULH for fixed vector types.
ClosedPublic

Authored by dtemirbulatov on Apr 13 2023, 7:06 AM.

Details

Summary

Allow 2x bigger vectors than legal for mulh lowering by relying on legalization to spilt combine input/output and use SVE's mulh for fixed vector types.

Diff Detail

Event Timeline

dtemirbulatov created this revision.Apr 13 2023, 7:06 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 13 2023, 7:06 AM
dtemirbulatov requested review of this revision.Apr 13 2023, 7:06 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 13 2023, 7:06 AM
RKSimon added inline comments.Apr 13 2023, 7:12 AM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
9948

if there any chance that odd element count vectors can get here?

llvm/test/CodeGen/AArch64/sve-fixed-length-int-mulh.ll
15–16

remove the FIXME?

149–150

FIXME?

sdesmalen added inline comments.Apr 13 2023, 8:28 AM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
9942–9954

I guess your point here is that if the vector is too wide (i.e. it has too many elements), then we can rely on legalisation of the UMULH/SMULH to do splitting of the operation/vectors, and this is preferred over legalising the extends themselves. So rather than only testing if the operation is Legal/Custom for NarrowVT or it's halved type, we actually want to check this for any legalised type of NarrowVT. That means you can use getTypeToTransformTo, e.g.

EVT TransformVT = TLI.getTypeToTransformTo(*DAG.getContext(), NarrowVT);
if (!TLI.isOperationLegalOrCustom(Opc, TransformVT))
  return SDValue();
dtemirbulatov marked 4 inline comments as done.

Replaced to a transformable to the legal type for MULH operation instead of 2x size check.

dtemirbulatov retitled this revision from [DAGCombine][AArch64][CodeGen] Allow 2x bigger vectors than legal for mulh lowering and use SVE's mulh for fixed vector types. to [DAGCombine][AArch64][CodeGen] Allow tranformable vectors to a legal for MULH lowering and use SVE's MULH for fixed vector types..Apr 13 2023, 11:32 AM
Matt added a subscriber: Matt.Apr 13 2023, 12:41 PM
sdesmalen added inline comments.Apr 13 2023, 2:40 PM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
9945

this check is now irrelevant, you can remove the if/else entirely.

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
170 ↗(On Diff #513318)

I don't believe using PromoteIntRes_SimpleIntBinOp here is correct, see my comments on @smulh_v2i16

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-mulh.ll
102

This doesn't seem right because this instruction is sign-extending i32 elements, not the i16 elements that are passed in as the arguments.

I would have expected smulh z0.h, p0/m, z0.h, z1.h instead.

dtemirbulatov marked an inline comment as done.

Fixed error in @smulh_v2i16 function.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
9945

Hmm, I have got several errors on scalar handling of MULH on RISCV and LoongArch targets without isVector() check.

dmgreen added inline comments.Apr 17 2023, 8:27 AM
llvm/test/CodeGen/Thumb2/mve-vmulh.ll
77 ↗(On Diff #514219)

This still looks wrong I'm afraid (and worse).

See https://reviews.llvm.org/D119075 (and https://reviews.llvm.org/D119556) for how I once thought of doing this with custom legalization from the target. That didn't get through review though, and we worked around it for some hadd/abd by using no wrap flags.

My understanding is that it will need to use the original type size for the mulh, otherwise the shift will be incorrect. You might be able to replace the NVCAST with bitcast providing it doesn't go incorrectly for big endian. I'm not sure all targets would like that though.

94 ↗(On Diff #514219)

This is just returning 0.

sdesmalen added inline comments.Apr 17 2023, 9:09 AM
llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-mulh.ll
102

I still don't think this is correct, because it's not keeping the correct part of the result. I think the promotion could work by shifting the operands to the left by 16bits, doing the smulh on 32-bit elements, and then shifting the result right again by 16bits. But this is probably less efficient than the original code (sign-extend + mul + shift-right), so it seems better to avoid combining this case entirely and in the DAGCombine just add a check that the transformed type's element type is the same as the element type of NarrowVT (otherwise that would imply the need for type promotion)

Added check for vector types in combineShiftToMULH() that NarrowVT element type is equal to TransformVT or ignore the tranform.

dtemirbulatov marked an inline comment as done.Apr 17 2023, 5:19 PM

Removed pointless check for !TLI.isOperationLegalOrCustom(MulhOpcode, NarrowVT), since we also check with TransformVT.

sdesmalen accepted this revision.Apr 19 2023, 12:49 AM
This revision is now accepted and ready to land.Apr 19 2023, 12:49 AM
This revision was landed with ongoing or failed builds.Apr 19 2023, 6:25 AM
This revision was automatically updated to reflect the committed changes.