Before, we checked <32 - probably assuming anything below 32 would be 16 bits.
However, odd integer types like i26 exist and are legal. Don't combine in those cases.
Details
- Reviewers
arsenm
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
What are you trying to solve here? alignbit isn't preferable to 32-bit shifts, and the shift doesn't seem to be wrong
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | ||
---|---|---|
3240 | Legal 16-bit isn't the point here, it's to avoid the 64-bit shift. Even if we didn't have 16-bit types we would want the combine. | |
llvm/test/CodeGen/AMDGPU/partial-shift-shrink.ll | ||
157 | The alignbit and shift are equally fast, and the shift is easier to understand |
This is trying to solve a miscompilation, but it might be the wrong fix. I know for sure this combine is responsible for a miscompilation in a sample, as disabling it/removing it/applying this fix resolves the issue.
This is the DAG being wrongly combined:
t26: i64 = srl t23, Constant:i32<26> t28: i64 = mul nuw nsw t22, Constant:i64<12345678> t29: i64 = add nuw nsw t26, t28 t32: i64 = srl t29, Constant:i32<25> t33: i26 = truncate t32 Combining: t33: i26 = truncate t32 Creating new node: t82: i32 = truncate t29 Creating new node: t83: i32 = srl t82, Constant:i32<25> Creating new node: t84: i26 = truncate t83 ... into: t84: i26 = truncate t83
srl is a shift-right, so it doesn't seem right to me that we truncate the source to 32 bits as it changes the output.
Maybe the error is in the VT of the first trunc? Shouldn't we have at least 2*K for shift-right?
Maybe this is a better fix? This also works to fix the miscompilation I'm seeing.
EVT MidScalarTy = MVT::i32; // For right shifts, ensure the VT of the shift source is wide // enough that we don't lose bits in the result. if(Src.getOpcode() == ISD::SRL || Src.getOpcode() == ISD::SRA) { // Don't risk losing info if we don't know the shift amount. if(!Known.isConstant()) return SDValue(); const uint64_t ScalarWidth = Known.getConstant().getZExtValue() * 2; if(ScalarWidth >= 64) return SDValue(); MidScalarTy = EVT::getIntegerVT(*DAG.getContext(), ScalarWidth); } EVT MidVT = VT.isVector() ? EVT::getVectorVT(*DAG.getContext(), MidScalarTy, VT.getVectorNumElements()) : MidScalarTy;
The checks are that the shift doesn't cross the 32-bit boundary. Here is the broken case: https://alive2.llvm.org/ce/z/P6iBze
The shift amount check seems to be wrong. I think the correct condition is ShiftAmt <= (32 - VT.getScalarSizeInBits()) https://alive2.llvm.org/ce/z/uYZ9tq
I'm not sure alive2 still has the abstract condition checks anymore like the old version
The shift amount check seems to be wrong. I think the correct condition is ShiftAmt <= (32 - VT.getScalarSizeInBits()) https://alive2.llvm.org/ce/z/uYZ9tq
For right shifts. For left shifts it's size.
GlobalISel also seems to have half ported this combine. For some reason it's only handling shl
Legal 16-bit isn't the point here, it's to avoid the 64-bit shift. Even if we didn't have 16-bit types we would want the combine.