Page MenuHomePhabricator

Expand BuildPairF64 by spilling to stack slot and reloading from it
ClosedPublic

Authored by sstankovic on Jul 10 2014, 12:30 PM.

Details

Summary

[mips] Expand BuildPairF64 to a spill and reload when the O32 FPXX ABI is enabled and mthc1 and dmtc1 are not available (e.g. on MIPS32r1)

This prevents the upper 32-bits of a double precision value from being moved to the FPU with mtc1 to an odd-numbered FPU register. This is necessary to ensure that the code generated executes correctly regardless of the current FPU mode.

MIPS32r2 and above continues to use mtc1/mthc1, while MIPS-IV and above continue to use dmtc1.

Diff Detail

Repository
rL LLVM

Event Timeline

sstankovic retitled this revision from to Expand BuildPairF64 by spilling to stack slot and reloading from it.
sstankovic updated this object.
sstankovic edited the test plan for this revision. (Show Details)
sstankovic added a reviewer: dsanders.
sstankovic added a subscriber: Unknown Object (MLST).
dsanders edited edge metadata.Jul 10 2014, 2:06 PM

I think this is fairly close. There are quite a few comments below but they are all simple issues.

The commit message ought to be more informative than it currently is since most readers aren't going to know why this is necessary. Perhaps something like:

[mips] Expand BuildPairF64 to a spill and reload when the O32 FPXX ABI is enabled and mthc1 and dmtc1 are not available (e.g. on MIPS32r1)

This prevents the upper 32-bits of a double precision value from being moved to the FPU with mtc1 to an
odd-numbered FPU register. This is necessary to ensure that the code generated executes correctly
regardless of the current FPU mode.

MIPS32r2 and above continues to use mtc1/mthc1, while MIPS-IV and above continue to use dmtc1.

lib/Target/Mips/MipsSEFrameLowering.cpp
271–274 ↗(On Diff #11286)

I've been asked not to repeat the function name in the comments.

...when ABI is fpxx and mthc1 is not available...

It would be good to add a comment mentioning that the case where dmtc1 is available doesn't need to be handled here because it never creates a BuildPair node. It would be great if you could also add this comment to MipsSEInstrInfo::expandBuildPairF64() too.

On the subject of MipsSEInstrInfo::expandBuildPairF64(), it contains a comment about the spill and reload case. You need to update it to say that that case is handled by the frame lowering code. It would be a good idea to add an assertion there too just in case ExpandPseudo::expandBuildPairF64() isn't called for some reason.

282–283 ↗(On Diff #11286)

MipsSubTarget::hasMips32r2() is true for MIPS32r6 too so you could drop the explicit check for hasMips32r6(). Likewise for the same code in MipsSEInstrInfo::expandBuildPairF64().

295 ↗(On Diff #11286)

Should we hardcode this register class, or pick between AFGR64/FGR64 depending on whether we have 64-bit FPR's?
It should be impossible to have FGR64 on MIPS-II or MIPS32r1 (which are the two cases that trigger this code) but I don't think we have anything enforcing that at the moment.

If we do hardcode the class, we should at least assert(!TM.getSubtarget<MipsSubtarget>().isFP64()) so that failing to prevent -mfp64 on MIPS-II and MIPS32r1 elsewhere will fail loudly here.

297–299 ↗(On Diff #11286)

It should be fairly easy to re-use the stack slot. You can store FI in the MipsFunctionInfo which you can access with MF.getInfo<MipsFunctionInfo>

test/CodeGen/Mips/fpxx.ll
1–2 ↗(On Diff #11286)

It would be good to test the mtc1/mthc1 (mips32r2) and dmtc1 (mips4 or mips64r1) cases in this test. We cover them elsewhere as a side-effect of testing something else but it would be good to cover them deliberately in this test.

It just occurred to me that we haven't covered moves from the FPU. They need to do the same thing.

dsanders added inline comments.Jul 11 2014, 7:12 AM
lib/Target/Mips/MipsSEFrameLowering.cpp
300–301 ↗(On Diff #11286)

The 0 and 4 need to be the other way around for big-endian.

dsanders added inline comments.Jul 11 2014, 8:28 AM
lib/Target/Mips/MipsSEFrameLowering.cpp
300–301 ↗(On Diff #11286)

Also, LoReg and HiReg might not be killed in all cases. Query the operand with I->getOperand(1).isKill().

dsanders added inline comments.Jul 11 2014, 9:17 AM
lib/Target/Mips/MipsSEFrameLowering.cpp
300–301 ↗(On Diff #11286)

The 0 and 4 need to be the other way around for big-endian.

It turns out that I'm wrong about this. In big-endian mode, the hi/lo operands have already been swapped so we don't need to account for it in the offset. Sorry for the noise.

sstankovic updated this object.
sstankovic edited edge metadata.

Patch is updated.

sstankovic added inline comments.Jul 11 2014, 12:52 PM
lib/Target/Mips/MipsSEFrameLowering.cpp
271–274 ↗(On Diff #11286)

It would be good to add a comment mentioning that the case where dmtc1 is available doesn't need to be handled here because it never creates a BuildPair node. It would be great if you could also add this comment to MipsSEInstrInfo::expandBuildPairF64() too.

Done.

On the subject of MipsSEInstrInfo::expandBuildPairF64(), it contains a comment about the spill and reload case. You need to update it to say that that case is handled by the frame lowering code. It would be a good idea to add an assertion there too just in case ExpandPseudo::expandBuildPairF64() isn't called for some reason.

Done

282–283 ↗(On Diff #11286)

Done

295 ↗(On Diff #11286)

I hardcoded the class and added the assert.

297–299 ↗(On Diff #11286)

Done.

300–301 ↗(On Diff #11286)

Also, LoReg and HiReg might not be killed in all cases. Query the operand with I->getOperand(1).isKill().

Done

test/CodeGen/Mips/fpxx.ll
1–2 ↗(On Diff #11286)

I added the test for mips32r2. What should the mips64r1/mips4 case test?

It just occurred to me that we haven't covered moves from the FPU. They need to do the same thing.

Should this be part of this patch or a separate patch?

It just occurred to me that we haven't covered moves from the FPU. They need to do the same thing.

Should this be part of this patch or a separate patch?

It can be a separate patch. I have an unfinished implementation of it for FP64A in my working copy at the moment. The code will be very similar between FP64A and FPXX and has very similar test cases so I could do both in the same patch if you like.

lib/Target/Mips/MipsMachineFunction.h
144 ↗(On Diff #11324)

I see you are using -1 to mean invalid. -1 is normally a valid FrameIndex (negative numbers are used for stack objects with a fixed offset) so we should mention that -1 is used as an invalid value.

Also, please turn this comment into a doxygen comment using '///'.

test/CodeGen/Mips/fpxx.ll
1–2 ↗(On Diff #11286)

They should test that the NOFPXX variant produces a dmtc1 and that the FPXX variant reports an error saying that FPXX is not permitted with the N32/N64 ABI's.

It's not strictly necessary, but it would be good to leave a RUN-TODO: line for the O32 (-mattr=-n64,+o32) on MIPS4/MIPS64r1 with and without FPXX cases. We don't support O32 on 64-bit ISA's at the moment but it will be helpful to be able to find all the tests easily (with something like 'grep -r RUN-TODO . | grep o32') when we come to support it.

... so I could do both in the same patch if you like.

Just fixing some poor phrasing I noticed after clicking send. I mean I could add the FPXX code to handle move from FPU at the same time as I add the code to handle FP64A move to/from FPU.

sstankovic updated this revision to Diff 11345.Jul 13 2014, 3:16 AM

Patch is updated.

It can be a separate patch. I have an unfinished implementation of it for FP64A in my working copy at the moment. The code will be very similar between FP64A and FPXX and has very similar test cases so I could do both in the same patch if you like.

OK.

lib/Target/Mips/MipsMachineFunction.h
144 ↗(On Diff #11324)

Done.

test/CodeGen/Mips/fpxx.ll
1–2 ↗(On Diff #11286)

I added a new test for dtmc1. Existing tests when compiled for n64 receive double arguments in float registers, so dmtc1 is not used.

I also added RUN-TODO lines.

dsanders accepted this revision.Jul 13 2014, 4:14 AM
dsanders edited edge metadata.

LGTM with one last change to match the function labels in all the RUN lines of the testcase.

As discussed, I'll cover the move-from-fpu case for FPXX in my FP64A patch.

test/CodeGen/Mips/fpxx.ll
23 ↗(On Diff #11345)

You can make this apply to all the tests using multiple FileCheck prefixes (e.g. 'FileCheck %s -check-prefix=ALL -check-prefix=32-NOFPXX' and using 'ALL-LABEL:').

Likewise for the others below

This revision is now accepted and ready to land.Jul 13 2014, 4:14 AM
sstankovic edited edge metadata.

I added -check-prefix=ALL to all RUN lines except two "RUN: not" lines that check for error messages (otherwise test won't execute). I also moved checks for error messages to the start of the test file.

sstankovic closed this revision.Jul 14 2014, 2:49 AM
sstankovic updated this revision to Diff 11365.

Closed by commit rL212930 (authored by @sstankovic).