The shift libcalls have a shift amount parameter of MVT::i32, but
sometimes ExpandIntRes_Shift may be called with a node whose
second operand is a type that is larger than that. This leads to
an ABI mismatch, and for example causes a spurious zeroing of
a register in RV32 for 64-bit shifts. Note that at present regular
shift intstructions already have their shift amount operand adapted
at SelectionDAGBuilder::visitShift time, and funnelled shifts bypass that.
Details
Diff Detail
Event Timeline
Fixed clang-format.
Updated X86 and AArch64 shift_minsize test output, and refreshed labels for RISCV shifts test output.
llvm/test/CodeGen/RISCV/shifts.ll | ||
---|---|---|
656 | Can you test fshr.i128 as well? |
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp | ||
---|---|---|
4001 | AnyExt seem a bit weird as I assume the lib function will use all bits. I think getZExtOrTrunc would be a more correct choice here. |
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp | ||
---|---|---|
4001 | I agree that AnyExt is incorrect, however I'm wondering whether it should be ZExt or SExt. The source-level semantics seem to point towards SExt, because the libcalls are defined with a signed shift amount (si_int as opposed to su_int). Earlier code in this function uses ZExt (under the LegalOrCustom && TLI.shouldExpandShift(DAG, N) case), but it doesn't need to concern itself with the signature for __ashlti3 and friends. Note that this actually affects the output for llvm/test/CodeGen/X86/shift_minsize.ll functions shl128, ashr128, lshr128, in whether the movzbl is emitted or not. |
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp | ||
---|---|---|
4001 | My reasoning was that the shift amount can't be negative (that would be undefined both for the SelectionDAG nodes and afaik the lib calls (e.g. __ashlti3 expects the shift amount b to be in the range 0 <= b < "bits in a tword". And at least for rotates and funnel shifts the shift amount (in SelectionDAG nodes) is treated as an unsigned. So I kind of assumed that doing a zero extend would be safe, as it would avoid passing a negative shift amount to the lib function (e.g. if the shift amount is 15 and the type we should convert from is i4, then doing a zext to a 32 bit si_int (treated as signed) would still give the value 15, while doing a sext would result in passing -1 to the lib function). |
Alright, yes, that makes a lot of sense to me. The patch already uses ZExt, so I think it could be considered ready.
😄
I have no commit access, might I ask that you commit this on my behalf?
Author: Itay Bookstein <itay.bookstein@nextsilicon.com>
AnyExt seem a bit weird as I assume the lib function will use all bits. I think getZExtOrTrunc would be a more correct choice here.