Allow 2x bigger vectors than legal for mulh lowering by relying on legalization to spilt combine input/output and use SVE's mulh for fixed vector types.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
9942–9954 | I guess your point here is that if the vector is too wide (i.e. it has too many elements), then we can rely on legalisation of the UMULH/SMULH to do splitting of the operation/vectors, and this is preferred over legalising the extends themselves. So rather than only testing if the operation is Legal/Custom for NarrowVT or it's halved type, we actually want to check this for any legalised type of NarrowVT. That means you can use getTypeToTransformTo, e.g. EVT TransformVT = TLI.getTypeToTransformTo(*DAG.getContext(), NarrowVT); if (!TLI.isOperationLegalOrCustom(Opc, TransformVT)) return SDValue(); |
Replaced to a transformable to the legal type for MULH operation instead of 2x size check.
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
9945 | this check is now irrelevant, you can remove the if/else entirely. | |
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp | ||
170 ↗ | (On Diff #513318) | I don't believe using PromoteIntRes_SimpleIntBinOp here is correct, see my comments on @smulh_v2i16 |
llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-mulh.ll | ||
102 | This doesn't seem right because this instruction is sign-extending i32 elements, not the i16 elements that are passed in as the arguments. I would have expected smulh z0.h, p0/m, z0.h, z1.h instead. |
Fixed error in @smulh_v2i16 function.
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
9945 | Hmm, I have got several errors on scalar handling of MULH on RISCV and LoongArch targets without isVector() check. |
llvm/test/CodeGen/Thumb2/mve-vmulh.ll | ||
---|---|---|
77 ↗ | (On Diff #514219) | This still looks wrong I'm afraid (and worse). See https://reviews.llvm.org/D119075 (and https://reviews.llvm.org/D119556) for how I once thought of doing this with custom legalization from the target. That didn't get through review though, and we worked around it for some hadd/abd by using no wrap flags. My understanding is that it will need to use the original type size for the mulh, otherwise the shift will be incorrect. You might be able to replace the NVCAST with bitcast providing it doesn't go incorrectly for big endian. I'm not sure all targets would like that though. |
94 ↗ | (On Diff #514219) | This is just returning 0. |
llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-mulh.ll | ||
---|---|---|
102 | I still don't think this is correct, because it's not keeping the correct part of the result. I think the promotion could work by shifting the operands to the left by 16bits, doing the smulh on 32-bit elements, and then shifting the result right again by 16bits. But this is probably less efficient than the original code (sign-extend + mul + shift-right), so it seems better to avoid combining this case entirely and in the DAGCombine just add a check that the transformed type's element type is the same as the element type of NarrowVT (otherwise that would imply the need for type promotion) |
Added check for vector types in combineShiftToMULH() that NarrowVT element type is equal to TransformVT or ignore the tranform.
Removed pointless check for !TLI.isOperationLegalOrCustom(MulhOpcode, NarrowVT), since we also check with TransformVT.
this check is now irrelevant, you can remove the if/else entirely.