This is an archive of the discontinued LLVM Phabricator instance.

[MIPS] Fix uitofp and fptoui for vector and scalar types
Needs ReviewPublic

Authored by sdardis on Mar 6 2017, 10:23 PM.

Details

Reviewers
slthakur
Summary

This patch fixes the lowering of uitofp and fptoui. For out of range values to fptoui and uitofp we perform fixups so that the result value is correct.

This patch is based on work by Simon Dardis.

Diff Detail

Repository
rL LLVM

Event Timeline

slthakur created this revision.Mar 6 2017, 10:23 PM
slthakur added a subscriber: jaydeep.
sdardis requested changes to this revision.Apr 7 2017, 6:09 AM

Comments inlined. I'm concerned that the legalization action of fp_to_uint is incorrect given the differences between the output of LLVM and GCC, in addition to the difference in the lowering of the i32 and i64 types.

Additionally, some of the new emit*() functions are producing bad machine code due to differences in register classes/instructions for MIPS32 / MIPS64. You can find the problems by running the test cases with -verify-machineinstrs.

With -verify-machineinstrs, two of the bad machine code errors refer to mtc1 / mfc1. You can ignore them for the purposes of this patch, this is an outstanding bug with LLVM for MIPS, as those instructions need definitions for the AFGR64 and FGR64 register classes.

lib/Target/Mips/MipsISelLowering.cpp
343

This looks incorrect. The associated lowering calls lowerFP_TO_SINT for the conversion of a floating point value to an unsigned integer. It also wildly differs from the output of gcc.

2724–2735

clang-format this chunk. There's spurious whitespace and overly long lines.

lib/Target/Mips/MipsInstrFPU.td
252

The '[' should be aligned with the (outs), not with the template parameters.

833

Overly long line, place the FGR_64 under the MipsPat.

lib/Target/Mips/MipsMSAInstrInfo.td
3775

This is formatted incorrectly. It should be formatted like the pseudo description above.

lib/Target/Mips/MipsSEISelLowering.cpp
3716

Can you provide the expansion as a comment similar to the likes of emitFPEXTEND_PSEUDO ?

3727

This variable is unused.

3739

Can you provide the expansion as a comment similar to the likes of emitFPEXTEND_PSEUDO ?

3743

FAddOp.

3764

Add BranchProbability::getOne() to the call, we know that this is a guaranteed fall-through.

3778–3781

Rather than numbers, try to name variables so that their purpose is obvious.

E.g. FPVReg3 should be FPAddReg.

3793

If src is a GPR64, this needs to be BGEZ64.

3797

This needs to be LUi64 on if GPVReg1 is a GPR64.

3806

That should be Mips::ZERO. BuildPairF64_64 is for FGR64 on Mips32.

test/CodeGen/Mips/2008-07-07-Float2Int.ll
12–13

Change that line while you're here to CHECK-LABEL: fptouint:

test/CodeGen/Mips/msa/f16-llvm-ir.ll
162–165

FIXME: This shift may be spurious given the definition of fill.w in the MSA spec.

This revision now requires changes to proceed.Apr 7 2017, 6:09 AM
slthakur updated this revision to Diff 96534.Apr 25 2017, 5:26 AM
slthakur edited edge metadata.
slthakur marked 15 inline comments as done.

Addressed review comments

Comments inlined. I'm concerned about two things in this patch, the handling of FP_TO_UINT which lowers as if it was FP_TO_SINT and the sequences which perform correction for negative values.

lib/Target/Mips/MipsISelLowering.cpp
1242

This is incorrect. Instead we need to provide a custom expander that mimics the expansion of fp_to_sint for i32, i.e. expand it to subtraction and comparison returning the appropriate converted result.

2719

This should be "SDValue Ops[2] = {Lo, Hi};".

lib/Target/Mips/MipsMSAInstrInfo.td
3776

Spurious whitespace at the end of the line here.

lib/Target/Mips/MipsSEISelLowering.cpp
3750–3753

I have some concerns about this sequence.

Firstly, the addition of negative zero does not negate the result which I believe you're trying to achieve here. You either need to subtract negative zero from the result of the conversion to negate the result or as we can only reach here if the input is less than zero, use fneg.s directly.

Secondly, although the conversion of an unsigned integer whose value is greater than the range of values that a floating point type can represent is undefined behaviour, is this behaviour compatible with GCC's result or do we need that level of compatibility?

3758–3760

This sequence treats the input as a signed value.

3764

That's acceptable for mips64r2 or later. We also need to provide a version for mips3 to mips64 which would use shifts to truncate the value.

3765–3766

This treats in the input as an unsigned value as well.

3772–3777

See my comment about using fneg directly.

3782–3790

These treat their inputs as signed values.

3804

Missing the branch label here. Also the branch labels for N32/N64 are .L, not $.

3816

These two variables have uninitialized uses...

3854–3857

dext is a mips64r2 instruction, but we also need to support the likes of mips3, mips4 and mips64. You need to provide codepaths for those systems.

3888–3924

...here if (IsSrc64 || !IsFP64) is false.

sdardis requested changes to this revision.May 8 2017, 5:21 AM
This revision now requires changes to proceed.May 8 2017, 5:21 AM
slthakur updated this revision to Diff 102308.Jun 13 2017, 3:40 AM
slthakur edited edge metadata.
slthakur marked 13 inline comments as done.

Addressed review comments. Sorry for the delay.

lib/Target/Mips/MipsSEISelLowering.cpp
3750–3753

Firstly, the addition of negative zero does not negate the result which I believe you're trying to achieve here. You either need > to subtract negative zero from the result of the conversion to negate the result or as we can only reach here if the input is > less than zero, use fneg.s directly.

The convert operation is wrong here. We need to convert into a double precision and add a double precision correction value. I have fixed it now. Sorry for to confusion.

Secondly, although the conversion of an unsigned integer whose value is greater than the range of values that a floating point > type can represent is undefined behaviour, is this behaviour compatible with GCC's result or do we need that level of > compatibility?

Yes, this behavior is compatible with GCC's result.

3758–3760

The input value is supposed to be zero extended. The zero value needed to be in the higher part.

3764

Added a codepath for mips3 to mips64 below.

3765–3766

The input value is 32-bit which is represented in 64-bit format (zero extended). Since the range of 64-bit format covers all the negative 32-bit values in its positive side it ensures safe conversion to floating point single. Therefore we don't need to worry about negative values here.

3782–3790

The input value is 32-bit which is represented in 64-bit format (zero extended). Since the range of long format covers all the 32-bit values in its positive side it ensures safe conversion to floating point double. Therefore we don't need to worry about negative values here.

3888–3924

Restructured the code. The cases of FGR64onMips32 and FGR64onMips64 with 32-bit source input will he handled above. Rest all cases will be handled here.

test/CodeGen/Mips/msa/f16-llvm-ir.ll
162–165

From the dump, looks like sll is being selected from the truncate node:

SelectionDAG has 12 nodes:
  t0: ch = EntryToken
          t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0
        t3: i32 = truncate t2
      t4: f16 = uint_to_fp t3
        t12: i64 = MipsISD::Wrapper Register:i64 %vreg1, TargetGlobalAddress:i64<half* @h> 0 [TF=15]
      t13: i64,ch = load<LD8[GOT]> t0, t12, undef:i64
    t8: ch = store<ST2[@h]> t0, t4, t13, undef:i64
  t9: ch = MipsISD::Ret t8

ISEL: Starting pattern match on root node: t3: i32 = truncate t2

  Initial Opcode index to 29877
  Match failed at index 29880
  Continuing at 29924
  Created node: t16: i32 = EXTRACT_SUBREG t2, TargetConstant:i32<1>

  Morphed node: t3: i32 = SLL t16, TargetConstant:i32<0>
sdardis requested changes to this revision.Jun 16 2017, 4:34 AM

Comments inlined. Most of them are small nits. The big change required is that you need to provide a target DAGCombine to produce the pseudos for later expansion on mips32r2 with a 64 bit fpu.

lib/Target/Mips/MipsSEISelLowering.cpp
1052–1055

Nit: can you rename UINT_TO_FP_MSA to MSA_UINT_TO_FP ? and likewise for FP_TO_UINT?

It preserves the style of how the emit functions are named.

3745–3747

This comment is incorrect, it reflects MSA_UINT_TO_FP. It should reflect MSA_FP_TO_UINT.

3849–3850

hasMips64() -> hasMips3(). Mips3 was the first MIPS ISA to support double precision FPUs.

3867

Lower than createVirtualRegister call to the point of usage, so we aren't creating virtual registers unnecessarily.

3868–3874

These sequences are not quite correct. For Mips64r2, you should use DEXT directly rather DEXT64_32 as DEXT64_32 is marked as isCodegenOnly. This means that instruction doesn't participate the ISA mapping tables. We use these tables to convert standard MIPS to microMIPS64R6, so currently this code will produce broken objects.

You need to define a GPR64 register using IMPLICIT_DEF, then use that register along with INSERT_SUBREG and Src and the subregister index. You can then use DEXT directly.

That sequence I've outlined is also required for the DSLL / DSRL sequence. This more or less handles it:

unsigned GPImpDef = RegInfo.createVirtualRegister(&Mips::GPR64RegClass);                                          
unsigned GPRRes = RegInfo.createVirtualRegister(&Mips::GPR64RegClass);                                            
BuildMI(*BB, MI, DL, TII->get(Mips::IMPLICIT_DEF), GPImpDef);                                                     
BuildMI(*BB, MI, DL, TII->get(Mips::INSERT_SUBREG), GPRRes)                                                       
    .addReg(GPImpDef)                                                                                             
    .addReg(Src)                                                                                                  
    .addImm(Mips::sub_32);

You'll want to change the variable names around.

3919

Lower this to the point of usage to avoid creating unused virtual registers.

3923

hasMips64() -> hasMips3()

3964

Using FPAddResult here as the destination means that the addition and phi node instructions can be hoisted above the if block, simplifying the code here.

4004–4024

This sequence isn't being generated by LLVM. Instead, we always get the libcall expansion. Also :: emitFP_TO_UINT(...) doesn't seem able to produce the sequence with xor in it.

This occurs because the legalizer replaces the fp_to_uint node because the result type is not legal, before we can select the pseudo the generates this sequence.

What we want to do is to use setTargetDAGCombine(ISD::FP_TO_UINT) and provide a combine that replaces fp_to_uint with the correct pseudo when we're targeting mips32r2 -mfp64 and not mips3. That transformation will have to be guarded by DCI.isBeforeLegalizeOps(). See my commit fixing the multi-precision arithmetic and the optimization for madd / msub.

4037–4039

This sequence looks incorrect. It should be lui $2, 0x8000, as you're oring the value which has been truncated to 32 bits. Also, if this is FP_TO_UINT FGR64Opnd:$fs, GPR32Opnd:$rd, then the floating point operations should be on double precision values, not single precision values.

4067–4068

Double check this for endian dependant behaviour.

4082–4084

Either lui $2, 1 ; dsll $2, $2, 31 or li $2, 1; dsll32 $2, $2, 31.

4101–4102

hasMips64() -> hasMips3().

4171

This if condition looks spurious.

4173

DSLL32 with an immediate of 0.

4215–4242

Can you restructure this code so that it reads like:

if (IsDest64) {
  if (ISFGR64onMips32)
    ..
} else {
  ..
}

Rather than:

if (IsDest64) {
  ...
} else {
  ...
}

if (IsDest64) {
  ...
} else {
  ...
}
4220

This if condition looks spurious. If the destination is 64 bits in size, then we decide between mips32 or mips64.

4230–4238

FIXME: The delay slot filler fails to schedule LUi(64) into the delay slot of the BC1T.

test/CodeGen/Mips/llvm-ir/fpcvt.ll
6 ↗(On Diff #102308)

This needs mips3, mips64 as well.

test/CodeGen/Mips/msa/f16-llvm-ir.ll
162–165

I believe my original thought was this was a redundant sign extension, as my reading of the MSA spec leads me to believe that fill.w behaves differently to MIPS32 instructions in that it doesn't need to be guarded by a sign extension.

I believe that fill.[bhw] and copy_s.[bhw] when used around argument passing in/out and for returns, there is scope to eliminate some sign extensions. That's future work though as the optimization would be tedious to implement at the SelectionDAG layer.

This revision now requires changes to proceed.Jun 16 2017, 4:34 AM
slthakur updated this revision to Diff 107485.Jul 20 2017, 4:57 AM
slthakur edited edge metadata.
slthakur marked 18 inline comments as done.

Addressed review comments and re-synced with TOT.

sdardis commandeered this revision.Jul 25 2017, 6:10 AM
sdardis edited reviewers, added: slthakur; removed: sdardis.
sdardis added a subscriber: llvm-commits.

Taking over this work, as slthakur is no longer with IMG.