This is an archive of the discontinued LLVM Phabricator instance.

[RISCV]A bug when llc -O0 vfmv.f.s.ll
AbandonedPublic

Authored by StephenFan on Jan 22 2021, 1:18 AM.

Details

Summary
The SUBREG_TO_REG opcode can't be eliminated when llc's optimize level is -O0. For the code show in below:
```
define dso_local double @test(<vscale x 1 x double> %v) #0 {
entry:
  %0 = call double @llvm.riscv.vfmv.f.s.f64.nxv1f64(<vscale x 1 x double> %v)
  ret double %0
}
```

when I use llc -O0 to codegen this .ll file. The mir after PostRAPseudos pass is

body:             |
  bb.0.entry:
    liveins: $v8, $x1
  
    $x2 = frame-setup ADDI $x2, -16
    CFI_INSTRUCTION def_cfa_offset 16
    SD killed $x1, $x2, 8 :: (store 8 into %stack.0)
    SD killed $x8, $x2, 0 :: (store 8 into %stack.1)
    CFI_INSTRUCTION offset $x1, -8
    CFI_INSTRUCTION offset $x8, -16
    $x8 = frame-setup ADDI $x2, 16
    CFI_INSTRUCTION def_cfa $x8, 0
    dead $x0 = PseudoVSETVLI killed $x0, 88, implicit-def $vl, implicit-def $vtype
    renamable $f0_f = PseudoVFMV_F_S_M1 killed renamable $v8, -1, implicit $vl, implicit $vtype
    $f10_f = FSGNJ_S killed $f0_f, killed $f0_f, implicit-def $f10_d
    $x8 = LD $x2, 0 :: (load 8 from %stack.1)
    $x1 = LD $x2, 8 :: (load 8 from %stack.0)
    $x2 = frame-destroy ADDI $x2, 16
    PseudoRET implicit killed $f10_d

...

The PostRVPseudos pass will lower the SUBREG_TO_REG to a instruciton of DstSubReg = COPY InsReg. So the
FSGNJ_S instruction appear. But the FSGNJ_S is the f32 register move instruction. Which is not correct for
f64 move.

Diff Detail

Event Timeline

StephenFan created this revision.Jan 22 2021, 1:18 AM
StephenFan requested review of this revision.Jan 22 2021, 1:18 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 22 2021, 1:18 AM
StephenFan retitled this revision from fix a bug when llc -O0 vfmv.f.s.ll to A bug when llc -O0 vfmv.f.s.ll.Jan 22 2021, 1:21 AM
StephenFan retitled this revision from A bug when llc -O0 vfmv.f.s.ll to [RISCV]A bug when llc -O0 vfmv.f.s.ll.Jan 22 2021, 6:52 AM

The commit message could do with improving in many respects (all of spelling, grammar, formatting, style and content). Please clearly state what the actual issue is and why this change is correct given the comment that is already there to explain the current behaviour.

llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
3562

So is this comment just wrong? If so update it, if not your problem lies elsewhere.

HsiangKai added a comment.EditedJan 22 2021, 8:41 AM

In D95234, I defined different pseudo instructions for different floating-point register classes. The floating-point vector pseudo instructions have correct register class information in D95234. I think it also solves the issue you encountered.

The commit message could do with improving in many respects (all of spelling, grammar, formatting, style and content). Please clearly state what the actual issue is and why this change is correct given the comment that is already there to explain the current behaviour.

Thanks for your suggestion. I come from a non-English speaking country. I will try to improve my English and commit message.

In D95234, I defined different pseudo instructions for different floating-point register classes. The floating-point vector pseudo instructions have correct register class information in D95234. I think it also solves the issue you encountered.

Ok, fine.

StephenFan abandoned this revision.Jan 22 2021, 7:06 PM

@StephenFan thank you for the bug report. I had a feeling that converting FPR32 to FPR64 with a SUBREG_TO_REG could cause a problem.

@StephenFan thank you for the bug report. I had a feeling that converting FPR32 to FPR64 with a SUBREG_TO_REG could cause a problem.

I have a question that why not make the scalar float point as FPR64 (not FPR32) then when encounter the FPR32 or FPR16, use the EXTRACT_SUBREG opcode.

@StephenFan thank you for the bug report. I had a feeling that converting FPR32 to FPR64 with a SUBREG_TO_REG could cause a problem.

I have a question that why not make the scalar float point as FPR64 (not FPR32) then when encounter the FPR32 or FPR16, use the EXTRACT_SUBREG opcode.

If you look at the current output for the fpr-spill-scalar.ll test added in D95234, you'll see that we would generate a 8 byte spill slot for float and half if we use FPR64. But the instruction to store an 8 byte F register isn't supported without the D extension.

@StephenFan thank you for the bug report. I had a feeling that converting FPR32 to FPR64 with a SUBREG_TO_REG could cause a problem.

I have a question that why not make the scalar float point as FPR64 (not FPR32) then when encounter the FPR32 or FPR16, use the EXTRACT_SUBREG opcode.

@StephenFan thank you for the bug report. I had a feeling that converting FPR32 to FPR64 with a SUBREG_TO_REG could cause a problem.

I have a question that why not make the scalar float point as FPR64 (not FPR32) then when encounter the FPR32 or FPR16, use the EXTRACT_SUBREG opcode.

If you look at the current output for the fpr-spill-scalar.ll test added in D95234, you'll see that we would generate a 8 byte spill slot for float and half if we use FPR64. But the instruction to store an 8 byte F register isn't supported without the D extension.

@StephenFan thank you for the bug report. I had a feeling that converting FPR32 to FPR64 with a SUBREG_TO_REG could cause a problem.

I have a question that why not make the scalar float point as FPR64 (not FPR32) then when encounter the FPR32 or FPR16, use the EXTRACT_SUBREG opcode.

Get it.