D32391 added support for extension/truncation for both integer and float types,
but the handling of ISD::EXTRACT_VECTOR_ELT still did not expect FP vectors,
leading to an assertion failure in the test case when compiling at -O0.
Details
- Reviewers
bogner spatel RKSimon craig.topper
Diff Detail
- Repository
- rL LLVM
- Build Status
Buildable 19736 Build 19736: arc lint + arc unit
Event Timeline
test/CodeGen/X86/pr31088.ll | ||
---|---|---|
73 | What's causing the repeated fptrunc+fpext ? |
test/CodeGen/X86/pr31088.ll | ||
---|---|---|
73 | I have stared at the ISel traces for a long time, but I admit I don't have a good answer, as I am not familiar with the X86 backend. The final schedule kind of makes sense to me: *** Final schedule *** SU(6): t7: f32,ch = CopyFromReg t0, Register:f32 %3 SU(5): t25: v16i8 = COPY_TO_REGCLASS t7, TargetConstant:i32<74> SU(4): t27: v16i8 = VCVTPS2PHrr t25, TargetConstant:i32<4> SU(3): t28: v16i8 = VCVTPH2PSrr t27 SU(2): t20: f32 = COPY_TO_REGCLASS t28, TargetConstant:i32<25> SU(11): t2: f32,ch = CopyFromReg t0, Register:f32 %2 SU(10): t30: v16i8 = COPY_TO_REGCLASS t2, TargetConstant:i32<74> SU(9): t31: v16i8 = VCVTPS2PHrr t30, TargetConstant:i32<4> SU(8): t32: v16i8 = VCVTPH2PSrr t31 SU(7): t18: f32 = COPY_TO_REGCLASS t32, TargetConstant:i32<25> SU(1): t23: f32 = VADDSSrr t18, t20 SU(0): t16: ch = RET TargetConstant:i32<0>, Register:f32 $xmm0, t15, t15:1 t15: ch,glue = CopyToReg t0, Register:f32 $xmm0, t23 But the machine function dump immediately afterwards show the duplicated convert instructions: *** MachineFunction at end of ISel *** # Machine code for function ir_fadd_v1f16: IsSSA, TracksLiveness Function Live Ins: $xmm0 in %0, $xmm1 in %1 bb.0 (%ir-block.0): liveins: $xmm0, $xmm1 %1:fr32 = COPY $xmm1 %0:fr32 = COPY $xmm0 %4:vr128 = COPY %1:fr32 %5:vr128 = VCVTPS2PHrr killed %4:vr128, 4 %6:vr128 = VCVTPH2PSrr killed %5:vr128 %7:fr32 = COPY %6:vr128 %8:vr128 = COPY %0:fr32 %9:vr128 = VCVTPS2PHrr killed %8:vr128, 4 %10:vr128 = VCVTPH2PSrr killed %9:vr128 %11:fr32 = COPY %10:vr128 %3:fr32 = COPY %7:fr32 %2:fr32 = COPY %11:fr32 %12:vr128 = COPY %3:fr32 %13:vr128 = VCVTPS2PHrr killed %12:vr128, 4 %14:vr128 = VCVTPH2PSrr killed %13:vr128 %15:fr32 = COPY %14:vr128 %16:vr128 = COPY %2:fr32 %17:vr128 = VCVTPS2PHrr killed %16:vr128, 4 %18:vr128 = VCVTPH2PSrr killed %17:vr128 %19:fr32 = COPY %18:vr128 %20:fr32 = VADDSSrr killed %19:fr32, killed %15:fr32 $xmm0 = COPY %20:fr32 RET 0, $xmm0 # End machine code for function ir_fadd_v1f16. My best guess is that they had come from the expansion of VADDSSrr which somehow forced explicit conversions on its operands. |
There are actually two selection DAGs for that test. I'm not sure why yet. The final schedule is printed separately for each DAG.
I saw them too. I had assumed that the second DAG would override the first one, being a fallback of some sort, but now that you've mentioned it, the two DAGs joined together would indeed lead to the duplicated instructions.
What's causing the repeated fptrunc+fpext ?