As i64 types are not legal on 32-bit targets, insert these into a suitable zero vector and use the packed vXi64<->FP conversion instructions instead.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
16065 ↗ | (On Diff #146655) | This case is the same for SINT and UINT. May be put them in a function? |
25275 ↗ | (On Diff #146655) | VecVT = MVT::getVectorVT(VT, NumElts); |
25279 ↗ | (On Diff #146655) | Why do you need to insert into zero vector? Can you insert to undef? |
test/CodeGen/X86/scalar-fp-to-i64.ll | ||
541 ↗ | (On Diff #146655) | Can the memory operand be folded here? |
lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
25279 ↗ | (On Diff #146655) | I think so. I asked the same question before I commandeered it. It's probably no worse than the widening with undef we do for v2f32 legalization. |
test/CodeGen/X86/scalar-fp-to-i64.ll | ||
541 ↗ | (On Diff #146655) | We'd have to detect the load and the possibilty of folding it during this lowering code. Or we'd have to use undef for the upper elts and add a DAG combine to turn insert into undef into a broadcast if its foldable. |
Create helper function to share between SINT_TO_FP and UINT_TO_FP. Use scalar_to_vector instead of inserting into a zero vector.
I think it is simpler than load folding in td. You can open a bugzilla ticket for possible optimization.
lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
25279 ↗ | (On Diff #146655) | In the original patch I was just trying to be very sure there wasn't anything in the other source elements that could cause fp exceptions/overflow flags etc. |
Could you please add full context to the patch, Craig?
lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
25279 ↗ | (On Diff #146655) | I think the only possible side effect from the other source element being undef is raising "inexact". Do we care? |
16048 ↗ | (On Diff #146701) | I suspect we also want to use this vector sequence for unsigned i64 conventions on 64-bit. |
Thanks, Craig.
lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
25279 ↗ | (On Diff #146655) | Forget what I wrote. I was thinking of the INT-->FP case. For the FP-->INT case, why wouldn't we need to worry about raising spurious exceptions? Also, I'm probably missing something, but it looks like this code is expected to kick in for both 32-bit and 64-bit and both signed and unsigned FP-->i64. Is that intentional? For the 64-bit signed case, I would think we would prefer CVTTSx2SI. |
Use zeros for other elements for fp->int to avoid spurious invalid exceptions. Since these sorts of scalar conversions are common there's a chance someone could notice the spurious exceptions and complain. We seem to have generated the same code in the test cases either way. Since we ultimatley had to load two i32s into an xmm register. We need use a movd/movss that 0 the upper bits anyway.
Also addeded an assert for !Subtarget.is64Bit() to the fp->int conversion. That code is in ReplaceNodeResults is a hook for the type legalizer to legalize the result type so we should only get there when i64 isn't legal.
Looks good, Craig. And thanks for the explanation on the 64-bit cases. I had forgotten about the VCVTTSx2USI & VCVTUSI2Sx instructions, which we appear to be generating already.