This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Delete dead elideCopy code in InsertVSETVLI [nfc]
ClosedPublic

Authored by reames on Jun 16 2022, 11:26 AM.

Details

Summary

This code appears to be dead. A simple whole register copy of an IMPLICIT_DEF, is simply an IMPLICIT_DEF of it's own. (This would not be true for freeze, but is for copy.)

If we find a case which gets here with vector operand copy of an IMPLICIT_DEF, we most likely have an earlier missed optimization anyways.

Diff Detail

Event Timeline

reames created this revision.Jun 16 2022, 11:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 16 2022, 11:26 AM
reames requested review of this revision.Jun 16 2022, 11:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 16 2022, 11:26 AM

Looking at this more, this code may only be dead because we don't do the tied operand check if a policy operand is present. That's probably true for a significant number of instructions these days.

Ignoring that and looking at what the MIR looks like at the time we reach InsertVSETVLI for this test case

define <vscale x 8 x i64> @vpload_nxv8i64(<vscale x 8 x i64>* %ptr, <vscale x 8 x i1> %m, i32 zeroext %evl) #1 {
  %load = call <vscale x 8 x i64> @llvm.vp.load.nxv8i64.p0nxv8i64(<vscale x 8 x i64>* %ptr, <vscale x 8 x i1> %m, i32 %evl)
  ret <vscale x 8 x i64> %load                                                   
}

I see

# Machine code for function vpload_nxv8i64: IsSSA, TracksLiveness                
Function Live Ins: $x10 in %0, $v0 in %1, $x11 in %2                             
                                                                                 
bb.0 (%ir-block.0):                                                              
  liveins: $x10, $v0, $x11                                                       
  %2:gprnox0 = COPY $x11                                                         
  %1:vr = COPY $v0                                                               
  %0:gpr = COPY $x10                                                             
  $v0 = COPY %1:vr                                                               
  %4:vrm8 = IMPLICIT_DEF                                                         
  %5:vrm8nov0 = COPY %4:vrm8                                                     
  %3:vrm8nov0 = PseudoVLE64_V_M8_MASK %5:vrm8nov0(tied-def 0), %0:gpr, $v0, %2:gprnox0, 6, 1 :: (load unknown-size from %ir.ptr, align 64)
  $v8m8 = COPY %3:vrm8nov0                                                       
  PseudoRET implicit $v8m8                                                       
                                                                                 
# End machine code for function vpload_nxv8i64.

That shows an IMPLICIT_DEF behind a copy instruction. I think it is related to vrm8nov0 having 3 registers. It doesn't happen on an LMUL=4 test. I suspect it hit the MinRCSize limit of 4 in InstrEmitter::AddRegisterOperand.

Looking at this more, this code may only be dead because we don't do the tied operand check if a policy operand is present. That's probably true for a significant number of instructions these days.

Ignoring that and looking at what the MIR looks like at the time we reach InsertVSETVLI for this test case

define <vscale x 8 x i64> @vpload_nxv8i64(<vscale x 8 x i64>* %ptr, <vscale x 8 x i1> %m, i32 zeroext %evl) #1 {
  %load = call <vscale x 8 x i64> @llvm.vp.load.nxv8i64.p0nxv8i64(<vscale x 8 x i64>* %ptr, <vscale x 8 x i1> %m, i32 %evl)
  ret <vscale x 8 x i64> %load                                                   
}

I see

# Machine code for function vpload_nxv8i64: IsSSA, TracksLiveness                
Function Live Ins: $x10 in %0, $v0 in %1, $x11 in %2                             
                                                                                 
bb.0 (%ir-block.0):                                                              
  liveins: $x10, $v0, $x11                                                       
  %2:gprnox0 = COPY $x11                                                         
  %1:vr = COPY $v0                                                               
  %0:gpr = COPY $x10                                                             
  $v0 = COPY %1:vr                                                               
  %4:vrm8 = IMPLICIT_DEF                                                         
  %5:vrm8nov0 = COPY %4:vrm8                                                     
  %3:vrm8nov0 = PseudoVLE64_V_M8_MASK %5:vrm8nov0(tied-def 0), %0:gpr, $v0, %2:gprnox0, 6, 1 :: (load unknown-size from %ir.ptr, align 64)
  $v8m8 = COPY %3:vrm8nov0                                                       
  PseudoRET implicit $v8m8                                                       
                                                                                 
# End machine code for function vpload_nxv8i64.

That shows an IMPLICIT_DEF behind a copy instruction. I think it is related to vrm8nov0 having 3 registers. It doesn't happen on an LMUL=4 test. I suspect it hit the MinRCSize limit of 4 in InstrEmitter::AddRegisterOperand.

I think this might be a bug in InstrEmitter. InstrEmitter::getVR creates unique vreg for every IMPLICIT_DEF use, but then we aren't allowed to constrain it below MinRCSize.

Looking at this more, this code may only be dead because we don't do the tied operand check if a policy operand is present. That's probably true for a significant number of instructions these days.

Ignoring that and looking at what the MIR looks like at the time we reach InsertVSETVLI for this test case

define <vscale x 8 x i64> @vpload_nxv8i64(<vscale x 8 x i64>* %ptr, <vscale x 8 x i1> %m, i32 zeroext %evl) #1 {
  %load = call <vscale x 8 x i64> @llvm.vp.load.nxv8i64.p0nxv8i64(<vscale x 8 x i64>* %ptr, <vscale x 8 x i1> %m, i32 %evl)
  ret <vscale x 8 x i64> %load                                                   
}

I see

# Machine code for function vpload_nxv8i64: IsSSA, TracksLiveness                
Function Live Ins: $x10 in %0, $v0 in %1, $x11 in %2                             
                                                                                 
bb.0 (%ir-block.0):                                                              
  liveins: $x10, $v0, $x11                                                       
  %2:gprnox0 = COPY $x11                                                         
  %1:vr = COPY $v0                                                               
  %0:gpr = COPY $x10                                                             
  $v0 = COPY %1:vr                                                               
  %4:vrm8 = IMPLICIT_DEF                                                         
  %5:vrm8nov0 = COPY %4:vrm8                                                     
  %3:vrm8nov0 = PseudoVLE64_V_M8_MASK %5:vrm8nov0(tied-def 0), %0:gpr, $v0, %2:gprnox0, 6, 1 :: (load unknown-size from %ir.ptr, align 64)
  $v8m8 = COPY %3:vrm8nov0                                                       
  PseudoRET implicit $v8m8                                                       
                                                                                 
# End machine code for function vpload_nxv8i64.

That shows an IMPLICIT_DEF behind a copy instruction. I think it is related to vrm8nov0 having 3 registers. It doesn't happen on an LMUL=4 test. I suspect it hit the MinRCSize limit of 4 in InstrEmitter::AddRegisterOperand.

I think this might be a bug in InstrEmitter. InstrEmitter::getVR creates unique vreg for every IMPLICIT_DEF use, but then we aren't allowed to constrain it below MinRCSize.

https://reviews.llvm.org/D128005

This revision is now accepted and ready to land.Jun 16 2022, 1:42 PM
This revision was landed with ongoing or failed builds.Jun 17 2022, 9:58 AM
This revision was automatically updated to reflect the committed changes.