This is an archive of the discontinued LLVM Phabricator instance.

[mips] Fix RetCC_MipsN to promote types smaller than i64 to GPR-width.
AbandonedPublic

Authored by vkalintiris on Jan 15 2016, 7:09 AM.

Details

Reviewers
dsanders
Summary

In order to avoid a number of redundant sign/zero-extensions, we have
to custom combine AssertSext on our target and fold a generic pattern
in the target independent DAG combiner for the SIGN_EXTEND_INREG node.

Depends on D10970

Diff Detail

Event Timeline

vkalintiris retitled this revision from to [mips] Fix RetCC_MipsN to promote types smaller than i64 to GPR-width..
vkalintiris updated this object.
vkalintiris added a reviewer: dsanders.
vkalintiris added a subscriber: llvm-commits.
dsanders requested changes to this revision.Jan 29 2016, 9:30 AM
dsanders edited edge metadata.

Well caught. It turns out we've been falling through to the O32 handler in some cases.

There's a few changes that don't look right to me and I've commented on these below. I'd like to explain one set of the comments in advance. Up until select-flt.ll I've pointed out a number of redundant instructions. These look like regressions but it's possible that we've always emitted some of them and have just not mentioned them in the test. Could you check whether these are regressions or not?

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
6934–6935 ↗(On Diff #44983)

I think we can do better here. We really want to know whether the sign bit that sext_in_reg will replicate is any one of the sign-bits that ComputeNumSignBits reports so I believe the condition should be:

(N00VTBits - N00NSB) < EVTBits

In any case I'd suggest posting this as a separate patch without the '[mips]' tag since this affects all targets.

lib/Target/Mips/MipsCallingConv.td
183–185

Putting this here prevents the big-endian CCIfInReg path on line 205 from occurring. As a result small structs won't return correctly.

I think it belongs on line 206 at which point the little-endian CCIfInReg case could be folded into it

344

Could you also make this conditional on isABI_O32()? We'd have caught this sooner if we hadn't fallen out into the O32 implementation.

lib/Target/Mips/MipsISelLowering.cpp
825–826

What is 'asext'? I assume it's 'AssertSExt' but it's confusing when we already have 'aext' for any extend. Also, could you clarify which 'asext' you're referring to at the end of the sentence?

829–830

Optional: Would 'Outer' make more sense than 'Old'?

836

Isn't it always a trunc because NewVT has fewer bits than OldVT?

test/CodeGen/Mips/atomic.ll
100

Why did you remove the signext's from these tests?

test/CodeGen/Mips/llvm-ir/ashr.ll
43–59

indentation. Similarly below.

43–59

Something is odd here. The two GP32 cases don't sign-extend but the two GP64 cases do. None of them need to sign-extend because there's no way for ashr to remove the sign extension that was already present in the 'i8 signext' argument. Similarly below

Also, the andi instructions are still redundant because srav only reads the lowest 5 bits. Please restore the fixme.

117–132

Please restore the MIPS32R6 case.

120–123

Why did you remove the definition of the $[[BB1]] label?

179–181

Why did you remove the definition of the $[[BB1]] label?

test/CodeGen/Mips/llvm-ir/lshr.ll
80

This sign extend is redundant. The srlv already did it

test/CodeGen/Mips/llvm-ir/mul.ll
145–154

These three sign extends are redundant. For the M4 case, the mflo already did it and for the other two it was the mul

test/CodeGen/Mips/llvm-ir/ret.ll
121

I doubt you meant 'GP32' here.

Also why are the GP32 and GP64 cases different now?

test/CodeGen/Mips/llvm-ir/sdiv.ll
6–10

The prefixes contradict each other

21–26

The prefixes contradict each other

74–78

This is not correct for R6. This kind of divide instruction was removed from the ISA.

100–105

This is not correct for R6. This kind of divide instruction was removed from the ISA.

123–133

The two sll's are redundant. The mflo's already did it

test/CodeGen/Mips/llvm-ir/select-flt.ll
1

Regarding the splitting of select.ll: This is a good thing but it makes it difficult to see what's changed. Could we split it and change it in separate commits? Feel free to commit the split part of this without further review.

37

This andi and the similar ones below are redundant because %s can only be 0 or -1 and the branch test is for zero/not-zero

test/CodeGen/Mips/llvm-ir/srem.ll
73–78

Like in the div.ll test, this isn't correct for R6. Similarly below.

This revision now requires changes to proceed.Jan 29 2016, 9:30 AM
vkalintiris updated this revision to Diff 46642.Feb 2 2016, 5:28 AM
vkalintiris edited edge metadata.
vkalintiris marked 23 inline comments as done.

Addressed Daniel's comments.

As far as I can tell, almost all of our redundant sign-extensions come from the
usage of the 32-bit version of our instructions.

vkalintiris added inline comments.Feb 4 2016, 3:01 AM
lib/Target/Mips/MipsCallingConv.td
183–185

I don't think that the small structs case in big-endian is affected by this. Here's the relevant generated code:

if (LocVT == MVT::i1 ||                                                         
    LocVT == MVT::i8 ||                                                         
    LocVT == MVT::i16 ||                                                        
    LocVT == MVT::i32) {                                                        
  LocVT = MVT::i64;                                                             
  if (ArgFlags.isSExt())                                                        
      LocInfo = CCValAssign::SExt;                                              
  else if (ArgFlags.isZExt())                                                   
      LocInfo = CCValAssign::ZExt;                                              
  else                                                                          
      LocInfo = CCValAssign::AExt;                                              
}                                                                               
                                                                                
if (LocVT == MVT::i64) {                                                        
  if (static_cast<MipsCCState *>(&State)->WasOriginalArgF128(ValNo)) {          
    if (!RetCC_F128(ValNo, ValVT, LocVT, LocInfo, ArgFlags, State))             
      return false;                                                             
  }                                                                             
}                                                                               
                                                                                
if (static_cast<const MipsSubtarget&>(State.getMachineFunction().getSubtarget()).isLittle()) {
  if (LocVT == MVT::i8 ||                                                       
      LocVT == MVT::i16 ||                                                      
      LocVT == MVT::i32 ||                                                      
      LocVT == MVT::i64) {                                                      
    if (ArgFlags.isInReg()) {                                                   
      LocVT = MVT::i64;                                                         
      if (ArgFlags.isSExt())                                                    
              LocInfo = CCValAssign::SExt;                                      
      else if (ArgFlags.isZExt())                                               
              LocInfo = CCValAssign::ZExt;                                      
      else                                                                      
              LocInfo = CCValAssign::AExt;                                      
    }                                                                           
  }                                                                             
}                                                                               
                                                                                
if (!static_cast<const MipsSubtarget&>(State.getMachineFunction().getSubtarget()).isLittle()) {
  if (LocVT == MVT::i8 ||                                                       
      LocVT == MVT::i16 ||                                                      
      LocVT == MVT::i32 ||                                                      
      LocVT == MVT::i64) {                                                      
    if (ArgFlags.isInReg()) {                                                   
      LocVT = MVT::i64;                                                         
      if (ArgFlags.isSExt())                                                    
              LocInfo = CCValAssign::SExtUpper;                                 
      else if (ArgFlags.isZExt())                                               
              LocInfo = CCValAssign::ZExtUpper;                                 
      else                                                                      
              LocInfo = CCValAssign::AExtUpper;                                 
    }                                                                           
  }                                                                             
}

However, we can merge together this change with the small structs case, for little-endian, if we don't want to handle this explicitly.

344

This causes the 35 new failures shown below. Should I add a FIXME instead?

CodeGen/Mips/2008-07-16-SignExtInReg.ll
CodeGen/Mips/atomic.ll                 
CodeGen/Mips/cconv/return-float.ll     
CodeGen/Mips/cconv/return.ll           
CodeGen/Mips/cmov.ll                   
CodeGen/Mips/const-mult.ll             
CodeGen/Mips/countleading.ll           
CodeGen/Mips/ctlz-v.ll                 
CodeGen/Mips/cttz-v.ll                 
CodeGen/Mips/divrem.ll                 
CodeGen/Mips/fcmp.ll                   
CodeGen/Mips/llvm-ir/add.ll            
CodeGen/Mips/llvm-ir/and.ll            
CodeGen/Mips/llvm-ir/ashr.ll           
CodeGen/Mips/llvm-ir/call.ll           
CodeGen/Mips/llvm-ir/indirectbr.ll     
CodeGen/Mips/llvm-ir/load-atomic.ll    
CodeGen/Mips/llvm-ir/lshr.ll           
CodeGen/Mips/llvm-ir/mul.ll            
CodeGen/Mips/llvm-ir/or.ll             
CodeGen/Mips/llvm-ir/ret.ll            
CodeGen/Mips/llvm-ir/sdiv.ll           
CodeGen/Mips/llvm-ir/select.ll         
CodeGen/Mips/llvm-ir/shl.ll            
CodeGen/Mips/llvm-ir/srem.ll           
CodeGen/Mips/llvm-ir/sub.ll            
CodeGen/Mips/llvm-ir/udiv.ll           
CodeGen/Mips/llvm-ir/urem.ll           
CodeGen/Mips/llvm-ir/xor.ll            
CodeGen/Mips/load-store-left-right.ll  
CodeGen/Mips/mips64-f128.ll            
CodeGen/Mips/octeon_popcnt.ll          
CodeGen/Mips/select.ll                 
CodeGen/Mips/tailcall.ll               
CodeGen/Mips/zeroreg.ll
lib/Target/Mips/MipsISelLowering.cpp
836

You are right it's always a trunc. I though that getSExtOrTrunc() is just a convenience function.

test/CodeGen/Mips/atomic.ll
100

Because we generate an extra pair of dsll/dsra at the NO-SEB_SEH label. MipsISelLowering::emitAtomicBinaryPartword() uses only the 32-bit version of the instructions. The sll/sra instructions get emitted by that function and the dsll/dsra promote the result to i64. The review request D13649 tries to solve a similar problem for MipsTargetLowering::emitAtomicCmpSwapPartword().

test/CodeGen/Mips/llvm-ir/ashr.ll
43–59

For GP32 the initial DAG:

  t11: i8 = sra t7, t17   
t12: i32 = sign_extend t11

is transformed into:

  t19: i32 = sra t18, t17                        
t20: i32 = sign_extend_inreg t19, ValueType:ch:i8

after the initial type legalization. For GP64, the initial DAG:

t16: i8 = sra t12, t15
t17: i32 = sign_extend t16
  t18: i64 = sign_extend t16

is transformed into:

  t16: i8 = sra t28, t23
t18: i64 = sign_extend t16

However, after the second pass, we get a sext_inreg that "remembers" the original type (i8):

    t31: i32 = sra t29, t23
  t32: i64 = any_extend t31
t33: i64 = sign_extend_inreg t32, ValueType:ch:i8

Otherwise, we would get a simple 32 -> 64 bit extension as is the case with ashr_i32().

test/CodeGen/Mips/llvm-ir/lshr.ll
80

The correspondent ISD::SRL node is legalized to i32:

  t10: i32 = srl t5, t9
t11: i64 = sign_extend t10
test/CodeGen/Mips/llvm-ir/mul.ll
145–154

Similarly with the lshr.ll case the ISD::MUL is legalized to i32:

t10: i32 = mul t5, t9

t11: i64 = sign_extend t10

test/CodeGen/Mips/llvm-ir/ret.ll
121

Because with the changes of this patch, we are materializing an i64 and we consult MipsAnalyzeImmediate in order to get the instruction(s) we need (see D16290 for more info).

test/CodeGen/Mips/llvm-ir/sdiv.ll
74–78

The label should have been R2-R5. I fixed this.

123–133

We'll have to change the way we handle MFLO because we use it's 32-bit version:

%vreg4<def> = PseudoSDIV %vreg0:sub_32<kill>, %vreg1:sub_32; ACC64:%vreg4 GPR64:%vreg0,%vreg1
TEQ %vreg1:sub_32<kill>, %ZERO, 7; GPR64:%vreg1
%vreg5<def> = PseudoMFLO %vreg4<kill>; GPR32:%vreg5 ACC64:%vreg4
%vreg6<def> = SLL64_32 %vreg5<kill>; GPR64:%vreg6 GPR32:%vreg5
test/CodeGen/Mips/llvm-ir/select-flt.ll
37

The problem is in PromoteTargetBoolean(). It blindly calls getExtendForContent(), which returns the correct target-specific action, without considering any information from DAG's ComputeNumSignBits first. Should I add a FIXME?

vkalintiris abandoned this revision.Jun 5 2016, 10:40 AM