This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAG] Fix promotion of extracted FP vector element
Needs ReviewPublic

Authored by bryanpkc on Jun 26 2018, 3:11 PM.

Details

Summary

D32391 added support for extension/truncation for both integer and float types,
but the handling of ISD::EXTRACT_VECTOR_ELT still did not expect FP vectors,
leading to an assertion failure in the test case when compiling at -O0.

Diff Detail

Event Timeline

bryanpkc created this revision.Jun 26 2018, 3:11 PM
RKSimon added inline comments.Jun 28 2018, 8:16 AM
test/CodeGen/X86/pr31088.ll
73

What's causing the repeated fptrunc+fpext ?

bryanpkc added inline comments.Jul 2 2018, 9:03 PM
test/CodeGen/X86/pr31088.ll
73

I have stared at the ISel traces for a long time, but I admit I don't have a good answer, as I am not familiar with the X86 backend. The final schedule kind of makes sense to me:

*** Final schedule ***
SU(6): t7: f32,ch = CopyFromReg t0, Register:f32 %3
SU(5): t25: v16i8 = COPY_TO_REGCLASS t7, TargetConstant:i32<74>
SU(4): t27: v16i8 = VCVTPS2PHrr t25, TargetConstant:i32<4>
SU(3): t28: v16i8 = VCVTPH2PSrr t27
SU(2): t20: f32 = COPY_TO_REGCLASS t28, TargetConstant:i32<25>
SU(11): t2: f32,ch = CopyFromReg t0, Register:f32 %2
SU(10): t30: v16i8 = COPY_TO_REGCLASS t2, TargetConstant:i32<74>
SU(9): t31: v16i8 = VCVTPS2PHrr t30, TargetConstant:i32<4>
SU(8): t32: v16i8 = VCVTPH2PSrr t31
SU(7): t18: f32 = COPY_TO_REGCLASS t32, TargetConstant:i32<25>
SU(1): t23: f32 = VADDSSrr t18, t20
SU(0): t16: ch = RET TargetConstant:i32<0>, Register:f32 $xmm0, t15, t15:1
    t15: ch,glue = CopyToReg t0, Register:f32 $xmm0, t23

But the machine function dump immediately afterwards show the duplicated convert instructions:

*** MachineFunction at end of ISel ***
# Machine code for function ir_fadd_v1f16: IsSSA, TracksLiveness
Function Live Ins: $xmm0 in %0, $xmm1 in %1

bb.0 (%ir-block.0):
  liveins: $xmm0, $xmm1
  %1:fr32 = COPY $xmm1
  %0:fr32 = COPY $xmm0
  %4:vr128 = COPY %1:fr32
  %5:vr128 = VCVTPS2PHrr killed %4:vr128, 4
  %6:vr128 = VCVTPH2PSrr killed %5:vr128
  %7:fr32 = COPY %6:vr128
  %8:vr128 = COPY %0:fr32
  %9:vr128 = VCVTPS2PHrr killed %8:vr128, 4
  %10:vr128 = VCVTPH2PSrr killed %9:vr128
  %11:fr32 = COPY %10:vr128
  %3:fr32 = COPY %7:fr32
  %2:fr32 = COPY %11:fr32
  %12:vr128 = COPY %3:fr32
  %13:vr128 = VCVTPS2PHrr killed %12:vr128, 4
  %14:vr128 = VCVTPH2PSrr killed %13:vr128
  %15:fr32 = COPY %14:vr128
  %16:vr128 = COPY %2:fr32
  %17:vr128 = VCVTPS2PHrr killed %16:vr128, 4
  %18:vr128 = VCVTPH2PSrr killed %17:vr128
  %19:fr32 = COPY %18:vr128
  %20:fr32 = VADDSSrr killed %19:fr32, killed %15:fr32
  $xmm0 = COPY %20:fr32
  RET 0, $xmm0

# End machine code for function ir_fadd_v1f16.

My best guess is that they had come from the expansion of VADDSSrr which somehow forced explicit conversions on its operands.

There are actually two selection DAGs for that test. I'm not sure why yet. The final schedule is printed separately for each DAG.

There are actually two selection DAGs for that test. I'm not sure why yet. The final schedule is printed separately for each DAG.

I saw them too. I had assumed that the second DAG would override the first one, being a fallback of some sort, but now that you've mentioned it, the two DAGs joined together would indeed lead to the duplicated instructions.

@bryanpkc Any movement on this please?

Herald added a project: Restricted Project. · View Herald TranscriptMar 29 2021, 3:33 AM
RKSimon resigned from this revision.Oct 2 2021, 2:32 PM
craig.topper resigned from this revision.Oct 7 2021, 11:14 AM