This is an archive of the discontinued LLVM Phabricator instance.


Differential D16353

[mips] MIPS32R6 compact branch support
ClosedPublic

Authored by sdardis on Jan 20 2016, 4:44 AM.

Details

Reviewers

dsanders
vkalintiris

Commits

rGe8efff373a51: [mips] MIPS32R6 compact branch support
rL263444: [mips] MIPS32R6 compact branch support

Summary

MIPSR6 introduces a class of branches called compact branches. Unlike the
traditional MIPS branches which have a delay slot, compact branches do not
have a delay slot. The instruction following the compact branch is only
executed if the branch is not taken and must not be a branch.

This patch implements support for compact branches on MIPS32R6.

Generate compact branches for MIPS32R6 when the delay slot filler cannot fill
a delay slot. Inspect the generated code for forbidden slot hazards (a compact
branch with an adjacent branch or other CTI) and insert nops to clear this
hazard.
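
As a rough illustration of the hazard described in the summary, the sketch below models a linear instruction stream and inserts a nop after any compact branch whose physical successor is a control transfer instruction. The `Kind` enum and `clearForbiddenSlotHazards` are hypothetical names for this toy model, and the two predicates only borrow their names from the patch; none of this is LLVM code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical instruction kinds for illustration only; these are not LLVM types.
enum class Kind { Plain, DelaySlotBranch, CompactBranch, OtherCTI, Nop };

// A compact branch has a forbidden slot: the instruction that follows it in memory.
bool hasForbiddenSlot(Kind K) { return K == Kind::CompactBranch; }

// In this toy model only plain instructions and nops may sit in a forbidden slot.
bool safeInForbiddenSlot(Kind K) { return K == Kind::Plain || K == Kind::Nop; }

// Returns a copy of Code with a Nop placed after every compact branch whose
// physical successor is unsafe in the forbidden slot.
std::vector<Kind> clearForbiddenSlotHazards(const std::vector<Kind> &Code) {
  std::vector<Kind> Out;
  for (std::size_t I = 0; I < Code.size(); ++I) {
    Out.push_back(Code[I]);
    bool HasNext = I + 1 < Code.size();
    if (hasForbiddenSlot(Code[I]) && HasNext &&
        !safeInForbiddenSlot(Code[I + 1]))
      Out.push_back(Kind::Nop);
  }
  return Out;
}
```

The check depends only on what sits next to the branch in memory, not on which path executes.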


Event Timeline

sdardis updated this revision to Diff 45374. Jan 20 2016, 4:44 AM

sdardis retitled this revision from to [mips] MIPSR6 compact branch support.

sdardis updated this object.

sdardis added a reviewer: vkalintiris.

sdardis added a subscriber: llvm-commits.

Herald added a subscriber: dsanders. Jan 20 2016, 4:44 AM

vkalintiris requested changes to this revision. Jan 20 2016, 5:15 PM

vkalintiris edited edge metadata.

vkalintiris added inline comments.

lib/Target/Mips/Mips.h
34 ↗	(On Diff #45374)	Do we really need this pass? Shouldn't we just take care so that the MipsDelaySlotFiller pass does not create any hazards in the first place?
lib/Target/Mips/MipsDelaySlotFiller.cpp
709–710	We can remove this once we add support for microMIPS in getEquivalentCompactForm().
730	The `STI.hasMips64r6()` will be true if `STI.hasMips32r6()` is true.
test/CodeGen/Mips/compact-branches.ll
2	Although it isn't absolutely necessary, it would be better to have a function per instruction check, ie. one function that tests bnec, another one for bgezc etc.

vkalintiris added inline comments. Jan 20 2016, 5:15 PM

lib/Target/Mips/MipsDelaySlotFiller.cpp
498–512	The same functionality is implemented in MipsInstrInfo::genInstrWithNewOpc(). You can expand the logic of that function, for branches that use the $zero register.
498–550	We should make this function a member of MipsSEInstrInfo. This is where we keep similar logic for branch instruction operations. Can you merge the first switch statement into the second? The check for the number of explicit operands becomes unnecessary once you've identified the opcode. Also, this is where we should provide the relevant logic for branches in microMIPS mode.
548	I think that we should return 0 here. Mips::ZERO is a positive number.

This revision now requires changes to proceed. Jan 20 2016, 5:15 PM

Addressed comments. Shifted the $zero detection for branches into MipsInstrInfo::genInstrWithNewOpc. I've elected to keep the separate hazard pass for the moment; see my reply there.

Thanks,
Simon

Apologies, comment didn't get posted, reproduced here:

Shouldn't we just take care so that the MipsDelaySlotFiller pass does not create any hazards in the first place?

For this patch that approach could be taken, but I think it's better to introduce a fairly specific pass to handle forbidden slot hazards.

Handling hazards as a separate pass introduces a high level of safety as then compact branches can be produced anywhere in the compilation pipeline without the subtly incorrect reliance that the delay slot filler will fix those hazards. It's a separation of concerns issue in my view.

Aside: GCC for MIPS uses a very similar hazard strategy for forbidden slot, hi/lo and delayed value hazards.

Thoughts?


In D16353#332475, @sdardis wrote:

Apologies, comment didn't get posted, reproduced here:

Shouldn't we just take care so that the MipsDelaySlotFiller pass does not create any hazards in the first place?

For this patch that approach could be taken, but I think it's better to introduce a fairly specific pass to handle forbidden slot hazards.

Handling hazards as a separate pass introduces a high level of safety as then compact branches can be produced anywhere in the compilation pipeline without the subtly incorrect reliance that the delay slot filler will fix those hazards. It's a separation of concerns issue in my view.

Aside: GCC for MIPS uses a very similar hazard strategy for forbidden slot, hi/lo and delayed value hazards.

Thoughts?

Following this approach means that the MipsDelaySlotFiller pass will no longer generate correct output code, even in the presence of correct input code. As far as I can tell, the code generator provides (or at least it should provide) correct input code to the MipsDelaySlotFiller pass as it does not generate any CTIs in any slot.

Even if we start emitting CTIs in forbidden slots for some unforeseeable reason in the future, conceptually, the MipsDelaySlotFiller pass would be the more appropriate place to remove them, as it is responsible for the handling of slots either way.

Having said that, I wouldn't be against a dedicated hazard handling pass if we couldn't avoid generating them from the code generator, or handle them at the right pass.

Shifted the hazard clearing logic into the delay slot filler.

vkalintiris added inline comments. Jan 25 2016, 5:17 AM

lib/Target/Mips/MipsDelaySlotFiller.cpp
506–510	If you update `Branch` with the return value of `genInstrWithNewOpc()` then deleting the previous branch is as simple as: `std::next(Branch)->eraseFromParent();`
533–575	I think that it would be useful to have these functions at `MipsInstrInfo` as we could call them from other passes too.
557	The cast is redundant.
595–659	There's no need to iterate over every instruction twice. We can check whether we have to add a NOP inside `replaceWithCompactBranch()` and bundle it together with the compact branch.
lib/Target/Mips/MipsInstrInfo.cpp
286	We don't have to check for the number of explicit operands.
287	You can use MipsABIInfo::GetZeroReg() to test with the right zero register. Also, the cast is unnecessary.
302–303	Shouldn't we set `ZeroOperandBranch` to false in the default case?
lib/Target/Mips/MipsSEInstrInfo.cpp
438	Redundant newline.
473–482	We will never reach this if `STI.hasMips32r6()` is true. We should merge the logic of opcode selection into the first switch statement.

Addressed outstanding comments except for the one regarding Filler::clearFSHazards. See my comment there.

Added a FIXME to MipsISelLowering.cpp: MipsTargetLowering::emitAtomicBinary on an outstanding code gen issue and to MipsInstrInfo::genInstrWithNewOpc, as they're related. Without checking for both Mips::ZERO and MipsABIInfo::GetZeroReg() we get more verbose asm than necessary.

Updated commit text to reflect that it's for MIPS32R6 only.

sdardis updated this object. Jan 26 2016, 2:58 AM

sdardis added inline comments.

lib/Target/Mips/MipsDelaySlotFiller.cpp
595–659	> We can check whether we have to add a NOP inside replaceWithCompactBranch() and bundle it together with the compact branch.

We can't. You've missed case C in the comment. Consider the following from function l() in the test case before delay slot filling (using assembly here to keep things clear):

	move $16, $2
	jal j
	bne $16, $2, $BB0_2
# BB#1: # %if.then
	addiu $4, $zero, -2
	jal f

For the first jal, the delay slot filler will insert the 'move' into its delay slot. The delay slot filler has no instruction to put in the delay slot of bne, so it transforms it into bnec. No nop will be inserted as a) there is no CTI following bnec in that basic block, and b) the first non-debug instruction in the physically following basic block is not a CTI. The delay filler then considers BB#1 and schedules the addiu into the delay slot of the jal. The delay filler has now created a forbidden slot hazard.

Handling this in the delay filler itself requires identifying such special cases, then inserting a nop into the basic block of the compact branch (which could be the previous basic block to the one we're working on). This is in addition to cases A and B described in the comment.

My reasoning for processing all the instructions again is that it trivializes all those cases (and possibly others) into an 'is the physical successor instruction of a compact branch a CTI?' examination rather than inserting checks wherever we could create a FS hazard. Inserting FS hazard handling logic in multiple places could, I think, lead to correctness issues as we'd need to ensure we cover all possible cases, rather than a simple (mini-)pass decoupled from the delay slot filling logic. Thoughts?
lib/Target/Mips/MipsInstrInfo.cpp
287	> You can use MipsABIInfo::GetZeroReg() to test with the right zero register.

Strangely enough, this doesn't work. MipsTargetLowering::emitAtomicBinary will generate a Mips::BEQ with Mips::ZERO as an operand on mips64 if the requested size is 4. For mips64 it appears we need to check for both MipsABIInfo::GetZeroReg() and Mips::ZERO for the moment. This appears to be an outstanding issue with emitAtomicBinary.
lib/Target/Mips/MipsSEInstrInfo.cpp
438	clang-format inserts the newline; should I remove it or keep it?
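
The cross-block situation in case C can be sketched with a toy model in which each basic block is a list of mnemonics in layout order. The sketch below looks only at block boundaries: if a block ends in a compact branch and the next block in layout starts with a CTI, it appends a nop to the block that holds the branch. `Block`, `isCompactBranch`, `isCTI` and `fixLayoutHazards` are hypothetical names, the mnemonic lists are deliberately incomplete, and empty successor blocks are simply skipped; this is not the pass from the patch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model: a block is a list of mnemonics in layout order.
// This is not LLVM's MachineBasicBlock.
using Block = std::vector<std::string>;

// Deliberately small, illustrative opcode lists.
bool isCompactBranch(const std::string &Op) {
  return Op == "bnec" || Op == "beqc";
}

bool isCTI(const std::string &Op) {
  return isCompactBranch(Op) || Op == "jal" || Op == "jr" || Op == "bne" ||
         Op == "beq";
}

// Appends a nop to every block that ends in a compact branch when the next
// block in layout order starts with a CTI. Returns the number of nops added.
int fixLayoutHazards(std::vector<Block> &Blocks) {
  int Inserted = 0;
  for (std::size_t B = 0; B + 1 < Blocks.size(); ++B) {
    if (Blocks[B].empty() || !isCompactBranch(Blocks[B].back()))
      continue;
    const Block &Next = Blocks[B + 1];
    if (!Next.empty() && isCTI(Next.front())) {
      Blocks[B].push_back("nop"); // the nop belongs to the branch's block
      ++Inserted;
    }
  }
  return Inserted;
}
```

For the example above, after delay slot filling the blocks would look like `{"jal", "move", "bnec"}` followed by `{"jal", "addiu"}`, and the scan adds one nop after the `bnec`.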

vkalintiris added inline comments. Jan 27 2016, 4:09 AM

lib/Target/Mips/MipsDelaySlotFiller.cpp
595–659	The (C) case will happen when filling the delay slot of a normal branch, i.e. inside the MipsDelaySlotFiller pass. We can teach the "search-backwards" part of the delay slot filler (DSF) algorithm to handle this by checking the last instruction of the BB's layout predecessor. If there are other issues that conceptually belong to the DSF pass, then we should add them too by modifying the pass accordingly.

I don't see how (B) can happen as we only emit compact branches at this pass for the time being. From my perspective, if a previous pass would like to add a compact branch, then I can think of two options: (a) insert the compact branch only if the next instruction is not a CTI hazard, and leave the task of fixing the corner cases that happen during the filling of normal branches to the DSF, or (b) insert a NOP and just offload everything to the DSF. I can't think of any reason for a post-DSF pass to insert new/additional compact branches or move non-CTI instructions from a forbidden slot.

Even if we find ourselves constrained by having the DSF pass handle everything, we can add a separate/final pass that fixes every case left over from previous passes. However, with this design we would achieve the minimum number of places where our code is wrong/invalid.

If we want to be on the really safe side and make sure that we are aware of every possible case that we might not have considered yet, then we can add a separate pass and enable it only for debug builds. It could assert upon finding a compact branch that contains a CTI in its forbidden slot.
lib/Target/Mips/MipsInstrInfo.cpp
287	Ok, if I understand correctly then we should just check for Mips::ZERO and keep the FIXME comment, ie. we can remove the `Subtarget.getABI().GetZeroReg()` check as it's not working at the moment.
lib/Target/Mips/MipsSEInstrInfo.cpp
438	It doesn't insert one for me, so I suggest that we remove the newline.

sdardis added inline comments. Jan 27 2016, 5:39 AM

lib/Target/Mips/MipsDelaySlotFiller.cpp
595–659	> Even if we find ourselves constrained by having the DSF pass handle everything, we can add a separate/final pass that fixes every case left over from previous passes. However, with this design we would achieve the minimum number of places where our code is wrong/invalid.

Rather than wiring FS hazard handling logic into the DSF, we could split it off into a separate pass and enable it after the delay slot filler, like the first diff. Thoughts, dsanders? It keeps the implementation safer and simpler.

Addressed comments. Moved the hazard scheduling back into its own pass. Are you OK with that change, Daniel?

Herald added a subscriber: MatzeB. Jan 28 2016, 9:11 AM

Simplify MipsHazardSchedule pass according to style guide.

Ping.

Most of these are nits but there are a couple of important ones.

Regarding testing this pass: ideally we would be using MIR, but I'm not going to require that since I haven't had a chance to try it myself.

Could you put the new tests in a subdirectory so that we keep forbidden slot tests together and can migrate them to MIR tests later?

lib/Target/Mips/MipsDelaySlotFiller.cpp
506	Mips16 uses the delay slot filler too as far as I know. If so, this could be a Mips16InstrInfo
660–727	The separation makes sense to me given that we're only using compact branches when delay slots go unfilled. There are a couple of cases I can think of where this may not be the right thing to do:

1. If we need a nop for both the delay slot and the forbidden slot then we need to decide which path should have the bubble. Ideally, it should be on the coldest path.
2. Some machines may generally prefer compact branches.

For #1, we should allow the hazard scheduler to revert the branch to a delay slot branch if the BB probabilities suggest that the taken path is colder than the not-taken path. For #2, I think it's reasonable to expect that such machines will exist in the future but I'm not aware of any yet. It shouldn't be difficult to deal with that if/when it arises.
667–669	This could be a Mips16InstrInfo
714–717	Could you put braces around this?
lib/Target/Mips/MipsHazardSchedule.cpp
11–43 ↗	(On Diff #46763)	Could you change this into doxygen documentation? (see '\file')
86 ↗	(On Diff #46763)	Don't repeat the function name in new code. Also, please use doxygen comments ('///').
97 ↗	(On Diff #46763)	We shouldn't have the '|| inMicroMipsMode()' here. microMIPSR6 has forbidden slots too.
104 ↗	(On Diff #46763)	Use the arrow operator instead of '(*FI).'. There's a few other cases of this below
122 ↗	(On Diff #46763)	We can only have one fallthrough. We should stop iterating once we find it.
125–128 ↗	(On Diff #46763)	Please factor out the duplicate code.
lib/Target/Mips/MipsInstrInfo.cpp
260–302	We ought to tablegen-erate these.
281	I read this as being a branch with no operands. Can you make it clearer?
287	We should test for both ZERO and ZERO_64 so that this is still correct when the atomics are fixed.
lib/Target/Mips/MipsInstrInfo.h
74 ↗	(On Diff #46763)	Please make this doxygen comment document the function
74–78 ↗	(On Diff #46763)	Please make the arguments references since nullptr is not permitted
lib/Target/Mips/MipsSEInstrInfo.cpp
451	This doesn't look equivalent to BEQ to me. What happens if the second operand of the comparison isn't $zero?
456	This doesn't look equivalent to BEQ to me. What happens if the second operand of the comparison isn't $zero?
lib/Target/Mips/MipsSEInstrInfo.h
69	The iterator should be const too.
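
The zero-register discussion on MipsInstrInfo.cpp line 287 can be illustrated with a small sketch. It assumes a toy two-operand branch and checks the second operand against both zero registers, so the answer is the same whether a 32-bit or a 64-bit zero register was emitted. The `Reg` values, `Branch`, `Form`, `isZeroReg` and `pickCompactForm` are hypothetical and only mirror the names used in the review; they are not the LLVM definitions.

```cpp
// Hypothetical register numbers; the values are arbitrary and are not the
// ones LLVM assigns to Mips::ZERO and Mips::ZERO_64.
enum Reg : unsigned { ZERO = 1, ZERO_64 = 2, V0 = 3, A0 = 4 };

// Accept either zero register, as suggested in the review.
bool isZeroReg(unsigned R) { return R == ZERO || R == ZERO_64; }

// Toy two-register conditional branch: compares Lhs against Rhs.
struct Branch {
  unsigned Lhs;
  unsigned Rhs;
};

// Which family of compact branch this toy model would pick.
enum class Form { TwoRegisterCompact, CompareWithZeroCompact };

// A branch whose second operand is a zero register can use the shorter
// compare-with-zero form; anything else keeps both register operands.
Form pickCompactForm(const Branch &B) {
  return isZeroReg(B.Rhs) ? Form::CompareWithZeroCompact
                          : Form::TwoRegisterCompact;
}
```

Checking a single zero register would miss the mips64 case described for emitAtomicBinary, where Mips::ZERO appears even though the ABI's zero register is the 64-bit one.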

This revision now requires changes to proceed. Feb 16 2016, 5:39 AM

Addressed nits. Instructions are now flagged as CTIs or having forbidden slots in the .td files using target specific flags. Moved getEquivalentCompactForm to MipsInstrInfo as it can actually apply to MIPS16, MIPSR6, microMIPS, microMIPSR6.

Rebased to ToT.

lib/Target/Mips/MipsDelaySlotFiller.cpp
660–727	For #1, I'm not quite following you here. We have to place the nop after the compact branch as part of the same basic block. For example:

BB1:
	<no instructions to go in a delay slot>
	bne v0,L3
BB2:
	jal g
	* move a0, v0
BB3:
	....

The bne here will be transformed into a compact form. Since the next instruction is unsafe in the forbidden slot, we have to put a nop in the slot. We can't put the nop in BB3 as the first instruction as that doesn't clear the hazard. The hazard is due to instruction layout, not execution.

For #2, it's somewhat trivial to block delay slot filling for branches which have corresponding delay slot forms.
lib/Target/Mips/MipsHazardSchedule.cpp
97 ↗	(On Diff #46763)	microMIPSR6 doesn't have forbidden slots. The MD00582-2B-microMIPS32-AFP-06.03 says that "Any instruction, including a branch or jump, may immediately follow a branch or jump, that is, delay slot restrictions do not apply in Release 6". Confusingly MIPSR6 ISA has them.
lib/Target/Mips/MipsSEInstrInfo.cpp
451	If the second operand of the comparison is not $zero, canUseMicroMipsBranches is false, as that explicitly checks for the conditions that beqz16 can be used in. However, if we have microMIPSR6 we can generate beqc (long version). If both canUseMicroMipsBranches and hasMips32r6() are false, we return 0 since there's no equivalent form. I've renamed canUseMicroMipsBranches to canUseShortMMBranches to better reflect what it's doing.
456	See above.

Re: MIR tests:

I have another ~6 patches for compact branches/jumps pending. Unfortunately, every one of them modifies compact-branches.ll. I haven't submitted them upstream yet because this one is the necessary groundwork for all of them. After submitting all of them, or in the last patch, I could redo test/CodeGen/Mips/compactbranches/compact-branches.ll as a MIR-based test.

Ping.

LGTM with a couple nits.

lib/Target/Mips/MipsDelaySlotFiller.cpp
660–728	For #1, I'm trying to say that, in some circumstances, delay-slot branches can be a better choice than compact branches when both are available. For example:

BB1:
	...
	bne $2, BB3
	nop
BB2:
	jal g
BB3:
	...

introduces a bubble on both the taken and not-taken paths while:

BB1:
	...
	bnec $2, BB3
	nop
BB2:
	jal g
BB3:
	...

introduces a bubble on the taken path and possibly two bubbles on the not-taken path depending on implementation. I'm told this kind of thing occurs between delay-slot returns and compact-branch returns but we need to verify that.

My comment was intended to be something to think about rather than something to do in this patch. Forbidden slots are a correctness issue, so it's important that we get a solution in place, and we can expand on performance decisions in later patches.

> The hazard is due to instruction layout, not execution.

That's right, but we aren't forced to choose code that has this hazard.
lib/Target/Mips/MipsHazardSchedule.cpp
98 ↗	(On Diff #48308)	Ok then. I'm pretty sure it had them at one point, but it doesn't now.
lib/Target/Mips/MipsInstrInfo.cpp
260–261	Repeated name in comment.
lib/Target/Mips/MipsInstrInfo.td
1118–1122 ↗	(On Diff #48308)	Why move this above DEI_FT?
1321–1325 ↗	(On Diff #48308)	Could you indent this since we have nested 'let' statements
lib/Target/Mips/MipsSEInstrInfo.cpp
451	Ok, thanks.
456	Ok, thanks
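
The performance idea raised for #1 (not part of this patch, and flagged in the review as still needing verification) could be sketched as a small decision function. It assumes the delay slot is already known to be unfillable, and that the caller knows whether the forbidden slot would need a nop and how likely the branch is to be taken. `BranchForm` and `chooseBranchForm` are hypothetical names, and the rule encodes only the suggestion from the review: fall back to the delay-slot form when the taken path is colder than the not-taken path.

```cpp
#include <cassert>

// Hypothetical result type for this sketch.
enum class BranchForm { DelaySlot, Compact };

// Assumes the delay slot could not be filled. TakenProb is the estimated
// probability, in [0, 1], that the branch is taken.
BranchForm chooseBranchForm(bool ForbiddenSlotNeedsNop, double TakenProb) {
  assert(TakenProb >= 0.0 && TakenProb <= 1.0 && "probability out of range");

  // Without a forbidden slot hazard the compact branch needs no nop at all.
  if (!ForbiddenSlotNeedsNop)
    return BranchForm::Compact;

  // Both forms need a nop. Following the suggestion in the review, revert
  // to the delay-slot form when the taken path is the colder one.
  double NotTakenProb = 1.0 - TakenProb;
  return TakenProb < NotTakenProb ? BranchForm::DelaySlot
                                  : BranchForm::Compact;
}
```

Whichever form is chosen, the forbidden slot hazard itself still has to be cleared for correctness; this heuristic would only affect where the bubble lands.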

Patch updated to address nits. Could you commit on my behalf? Thanks.

lib/Target/Mips/MipsInstrInfo.td
1118–1122 ↗	(On Diff #48308)	I was grouping it with the syscall/break/(d)eret classes so I could mark them all with one let isCTI = 1 {..}. Want me to move it back?

Could you commit on my behalf?

Sure

lib/Target/Mips/MipsInstrInfo.td
1118–1122 ↗	(On Diff #49944)	Ah ok. I was associating the '}' with the class rather than the let so it looked like there was no reason to move it.

Rebased and retested to ToT (r263428).

Closed by commit rL263444: [mips] MIPS32R6 compact branch support (authored by dsanders). Mar 14 2016, 9:29 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
lib/Target/Mips/MipsDelaySlotFiller.cpp	190 lines
lib/Target/Mips/MipsInstrInfo.cpp	29 lines
lib/Target/Mips/MipsSEInstrInfo.h	2 lines
lib/Target/Mips/MipsSEInstrInfo.cpp	68 lines
test/CodeGen/Mips/ (five files)	5, 28, 155, 8, and 18 lines

Diff 45690

lib/Target/Mips/MipsDelaySlotFiller.cpp

[9 lines not shown]
// Simple pass to fill delay slots with useful instructions.		// Simple pass to fill delay slots with useful instructions.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "MCTargetDesc/MipsMCNaCl.h"		#include "MCTargetDesc/MipsMCNaCl.h"
#include "Mips.h"		#include "Mips.h"
#include "MipsInstrInfo.h"		#include "MipsInstrInfo.h"
#include "MipsTargetMachine.h"		#include "MipsTargetMachine.h"
		#include "MipsSEInstrInfo.h"
#include "llvm/ADT/BitVector.h"		#include "llvm/ADT/BitVector.h"
#include "llvm/ADT/SmallPtrSet.h"		#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/Analysis/AliasAnalysis.h"		#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/ValueTracking.h"		#include "llvm/Analysis/ValueTracking.h"
#include "llvm/CodeGen/MachineBranchProbabilityInfo.h"		#include "llvm/CodeGen/MachineBranchProbabilityInfo.h"
#include "llvm/CodeGen/MachineFunctionPass.h"		#include "llvm/CodeGen/MachineFunctionPass.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"		#include "llvm/CodeGen/MachineInstrBuilder.h"
[143 lines not shown]	namespace {
public:		public:
Filler(TargetMachine &tm)		Filler(TargetMachine &tm)
: MachineFunctionPass(ID), TM(tm) { }		: MachineFunctionPass(ID), TM(tm) { }

const char *getPassName() const override {		const char *getPassName() const override {
return "Mips Delay Slot Filler";		return "Mips Delay Slot Filler";
}		}

bool runOnMachineFunction(MachineFunction &F) override {		bool runOnMachineFunction(MachineFunction &F) override;
bool Changed = false;
for (MachineFunction::iterator FI = F.begin(), FE = F.end();
FI != FE; ++FI)
Changed |= runOnMachineBasicBlock(*FI);

// This pass invalidates liveness information when it reorders
// instructions to fill delay slot. Without this, -verify-machineinstrs
// will fail.
if (Changed)
F.getRegInfo().invalidateLiveness();

return Changed;
}

void getAnalysisUsage(AnalysisUsage &AU) const override {		void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.addRequired<MachineBranchProbabilityInfo>();		AU.addRequired<MachineBranchProbabilityInfo>();
MachineFunctionPass::getAnalysisUsage(AU);		MachineFunctionPass::getAnalysisUsage(AU);
}		}

private:		private:
bool runOnMachineBasicBlock(MachineBasicBlock &MBB);		bool runOnMachineBasicBlock(MachineBasicBlock &MBB);
[42 lines not shown]	private:
/// Examine Pred and see if it is possible to insert an instruction into		/// Examine Pred and see if it is possible to insert an instruction into
/// one of its branches delay slot or its end.		/// one of its branches delay slot or its end.
bool examinePred(MachineBasicBlock &Pred, const MachineBasicBlock &Succ,		bool examinePred(MachineBasicBlock &Pred, const MachineBasicBlock &Succ,
RegDefsUses &RegDU, bool &HasMultipleSuccs,		RegDefsUses &RegDU, bool &HasMultipleSuccs,
BB2BrMap &BrMap) const;		BB2BrMap &BrMap) const;

bool terminateSearch(const MachineInstr &Candidate) const;		bool terminateSearch(const MachineInstr &Candidate) const;

		/// Clear any potential forbidden slot hazards in F.
		bool clearFSHazards(MachineFunction &F);

TargetMachine &TM;		TargetMachine &TM;

static char ID;		static char ID;
};		};
char Filler::ID = 0;		char Filler::ID = 0;
} // end of anonymous namespace		} // end of anonymous namespace

static bool hasUnoccupiedSlot(const MachineInstr *MI) {		static bool hasUnoccupiedSlot(const MachineInstr *MI) {
[242 lines not shown]	if (!isIdentifiedObject(V))
return false;		return false;

Objects.push_back(*I);		Objects.push_back(*I);
}		}

return true;		return true;
}		}

// Replace Branch with the compact branch instruction.		// Replace Branch with the compact branch instruction.
Iter Filler::replaceWithCompactBranch(MachineBasicBlock &MBB,		Iter Filler::replaceWithCompactBranch(MachineBasicBlock &MBB, Iter Branch,
Iter Branch, DebugLoc DL) {		DebugLoc DL) {
const MipsInstrInfo *TII =		const MipsSubtarget &STI = MBB.getParent()->getSubtarget<MipsSubtarget>();
MBB.getParent()->getSubtarget<MipsSubtarget>().getInstrInfo();		const MipsSEInstrInfo *TII =
		static_cast<const MipsSEInstrInfo *>(STI.getInstrInfo());
unsigned NewOpcode =
(((unsigned) Branch->getOpcode()) == Mips::BEQ) ? Mips::BEQZC_MM
: Mips::BNEZC_MM;

const MCInstrDesc &NewDesc = TII->get(NewOpcode);
MachineInstrBuilder MIB = BuildMI(MBB, Branch, DL, NewDesc);

MIB.addReg(Branch->getOperand(0).getReg());		unsigned NewOpcode = TII->getEquivalentCompactForm(Branch);
MIB.addMBB(Branch->getOperand(2).getMBB());		TII->genInstrWithNewOpc(NewOpcode, Branch);

Iter tmpIter = Branch;		Iter tmpIter = Branch;
Branch = std::prev(Branch);		Branch = std::prev(Branch);
MBB.erase(tmpIter);		MBB.erase(tmpIter);

return Branch;		return Branch;
}		}

// Replace Jumps with the compact jump instruction.		// Replace Jumps with the compact jump instruction.
Iter Filler::replaceWithCompactJump(MachineBasicBlock &MBB,		Iter Filler::replaceWithCompactJump(MachineBasicBlock &MBB,
Iter Jump, DebugLoc DL) {		Iter Jump, DebugLoc DL) {
const MipsInstrInfo *TII =		const MipsInstrInfo *TII =
MBB.getParent()->getSubtarget<MipsSubtarget>().getInstrInfo();		MBB.getParent()->getSubtarget<MipsSubtarget>().getInstrInfo();

const MCInstrDesc &NewDesc = TII->get(Mips::JRC16_MM);		const MCInstrDesc &NewDesc = TII->get(Mips::JRC16_MM);
MachineInstrBuilder MIB = BuildMI(MBB, Jump, DL, NewDesc);		MachineInstrBuilder MIB = BuildMI(MBB, Jump, DL, NewDesc);

MIB.addReg(Jump->getOperand(0).getReg());		MIB.addReg(Jump->getOperand(0).getReg());

Iter tmpIter = Jump;		Iter tmpIter = Jump;
Jump = std::prev(Jump);		Jump = std::prev(Jump);
MBB.erase(tmpIter);		MBB.erase(tmpIter);

return Jump;		return Jump;
}		}

		// Predicate for distinguishing between control transfer instructions and all
		// other instructions for handling forbidden slots. Consider inline assembly
		// as unsafe as well.
		static bool safeInForbiddenSlot(const MachineInstr *MI) {
		if (MI->isCall() || MI->isBranch() || MI->isReturn() || MI->isInlineAsm())
		return false;

		switch (MI->getOpcode()) {
		case Mips::ERET:
		case Mips::ERETNC:
		case Mips::DERET:
		case Mips::PAUSE:
		case Mips::WAIT:
		return false;
		default:
		return true;
		}
		}

		// Predicate for distinguishing instructions that have forbidden slots.
		static bool hasForbiddenSlot(const MachineInstr *MI) {
		if (!MI->isBranch())
		return false;

		switch ((unsigned)MI->getOpcode()) {
		case Mips::BEQC:
		case Mips::BNEC:
		case Mips::BLTC:
		case Mips::BGEC:
		case Mips::BLTUC:
		case Mips::BGEUC:
		case Mips::BEQZC:
		case Mips::BNEZC:
		case Mips::BGEZC:
		case Mips::BGTZC:
		case Mips::BLEZC:
		case Mips::BLTZC:
		return true;
		default:
		return false;
		}
		}

// For given opcode returns opcode of corresponding instruction with short		// For given opcode returns opcode of corresponding instruction with short
// delay slot.		// delay slot.
static int getEquivalentCallShort(int Opcode) {		static int getEquivalentCallShort(int Opcode) {
switch (Opcode) {		switch (Opcode) {
case Mips::BGEZAL:		case Mips::BGEZAL:
return Mips::BGEZALS_MM;		return Mips::BGEZALS_MM;
case Mips::BLTZAL:		case Mips::BLTZAL:
return Mips::BLTZALS_MM;		return Mips::BLTZALS_MM;
case Mips::JAL:		case Mips::JAL:
return Mips::JALS_MM;		return Mips::JALS_MM;
case Mips::JALR:		case Mips::JALR:
return Mips::JALRS_MM;		return Mips::JALRS_MM;
case Mips::JALR16_MM:		case Mips::JALR16_MM:
return Mips::JALRS16_MM;		return Mips::JALRS16_MM;
default:		default:
llvm_unreachable("Unexpected call instruction for microMIPS.");		llvm_unreachable("Unexpected call instruction for microMIPS.");
}		}
}		}

		// The forbidden slot is the instruction immediately following a compact
		// branch. A forbidden slot hazard occurs when a compact branch instruction
		// is executed and the adjacent instruction in memory is a control transfer
		// instruction such as a branch or jump, ERET, ERETNC, DERET, WAIT and
		// PAUSE.
		//
		// In such cases, the processor is required to signal a Reserved Instruction
		// exception. Forbidden slot hazards are defined for MIPSR6, not microMIPS.
		//
		// There are three sources of forbidden slot hazards:
		//
		// A) Transforming a delay slot branch into compact branch.
		// B) A previous pass has created a compact branch directly.
		// C) Filling a delay slot using a backwards search when the instruction
		// moved was in a forbidden slot. This case will create hazards in already
		// processed code.
		//
		bool Filler::clearFSHazards(MachineFunction &F) {
		bool Changed = false;
		const MipsSubtarget &STI = F.getSubtarget<MipsSubtarget>();
		const MipsSEInstrInfo *TII =
		static_cast<const MipsSEInstrInfo *>(STI.getInstrInfo());

		for (MachineFunction::iterator FI = F.begin(), FE = F.end(); FI != FE; ++FI) {
		for (Iter I = (*FI).begin(); I != (*FI).end(); ++I) {
		if (hasForbiddenSlot(I)) {
		if (std::next(I) != (*FI).end() &&
		!safeInForbiddenSlot(&*std::next(I))) {
		BuildMI((*FI), std::next(I), I->getDebugLoc(), TII->get(Mips::NOP));
		Changed = true;
		} else {
		for (auto Succ : (*FI).successors()) {
		if ((*FI).isLayoutSuccessor(Succ) &&
		(Succ->getFirstNonDebugInstr()) != Succ->end() &&
		!safeInForbiddenSlot((Succ->getFirstNonDebugInstr()))) {
		BuildMI(&(*FI), I->getDebugLoc(), TII->get(Mips::NOP));
		Changed = true;
		}
		}
		}
		}
		}
		}
		return Changed;
		}

		bool Filler::runOnMachineFunction(MachineFunction &F) {
		bool Changed = false;
		for (MachineFunction::iterator FI = F.begin(), FE = F.end(); FI != FE; ++FI)
		Changed |= runOnMachineBasicBlock(*FI);

		// Make a second pass over the instructions to clear hazards.
		const MipsSubtarget &STI = F.getSubtarget<MipsSubtarget>();
		if (STI.hasMips32r6() && !STI.inMicroMipsMode())
		Changed |= clearFSHazards(F);

		// This pass invalidates liveness information when it reorders
		// instructions to fill delay slot. Without this, -verify-machineinstrs
		// will fail.
		if (Changed)
		F.getRegInfo().invalidateLiveness();

		return Changed;
		}

		vkalintiris (Not Done): There's no need to iterate over every instruction twice. We can check whether we have to add a NOP inside `replaceWithCompactBranch()` and bundle it together with the compact branch.
		sdardis (Author, Not Done):
		> We can check whether we have to add a NOP inside replaceWithCompactBranch() and bundle it together with the compact branch.

		We can't. You've missed case C in the comment. Consider the following from function l() in the test case before delay slot filling (using assembly here to keep things clear):

		    move $16, $2
		    jal j
		    bne $16, $2, $BB0_2
		    # BB#1: # %if.then
		    addiu $4, $zero, -2
		    jal f

		For the first jal, the delay slot filler will insert the 'move' into its delay slot. The delay slot filler has no instruction to put in the delay slot of bne, so it transforms it into bnec. No nop will be inserted as a) there is no CTI following bnec in that basic block, and b) the first non-debug instruction in the physically following basic block is not a CTI.

		The delay filler then considers BB#1 and schedules the addiu into the delay slot of the jal. The delay filler has now created a forbidden slot hazard.

		Handling this in the delay filler itself requires identifying such special cases, then inserting a nop into the basic block of the compact branch (which could be the previous basic block to the one we're working on). This is in addition to cases A and B described in the comment.

		My reasoning for processing all the instructions again is that it trivializes all those cases (and possibly others) into a 'is the physical successor instruction of a compact branch a CTI?' examination rather than inserting checks wherever we could create a FS hazard.

		Inserting FS hazard handling logic in multiple places could, I think, lead to correctness issues as we'd need to ensure we cover all possible cases, rather than a simple (mini-)pass decoupled from the delay slot filling logic.

		Thoughts?
		vkalintiris (Not Done): The (C) case will happen when filling the delay slot of a normal branch, i.e. inside the MipsDelaySlotFiller pass. We can teach the "search-backwards" part of the delay slot filler (DSF) algorithm to handle this by checking the last instruction of the BB's layout predecessor. If there are other issues that conceptually belong to the DSF pass, then we should add them too by modifying the pass accordingly.

		I don't see how (B) can happen as we only emit compact branches at this pass for the time being. From my perspective, if a previous pass would like to add a compact-branch, then I can think of two options: (a) insert the compact branch only if the next instruction is not a CTI-hazard and leave the task of fixing the corner cases that happen during the filling of normal branches by the DSF, or (b) insert a NOP and just offload everything to the DSF. I can't think of any reason for a post-DSF pass to insert new/additional compact-branches or move non-CTI instructions from a forbidden slot.

		Even if we find ourselves constrained by having the DSF pass handling everything, then we can add a separate/final pass that fixes every case left over from previous passes. However, with this design we would achieve the minimum number of places where our code is wrong/invalid.

		If we'd want to be on the really safe side and make sure that we are aware of every possible case that we might not have considered yet, then we can add a separate pass and enable it only for debug builds. It could assert upon finding a compact-branch that contains a CTI in its forbidden slot.
		sdardis (Author, Not Done):
		> Even if we find ourselves constrained by having the DSF pass handling everything, then we can add a separate/final pass that fixes every case left over from previous passes. However, with this design we would achieve the minimum number of places were our code is wrong/invalid.

		Rather than wiring FS hazard handling logic into the DSF, we could split it off into a separate pass and enable it after the delay slot filler, like the first diff.

		Thoughts dsanders? It keeps the implementation safer and simpler.
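Editorial sketch: the "is the physical successor instruction of a compact branch a CTI?" check that sdardis argues for can be modelled outside LLVM in a few lines. Everything below (`ToyInstr`, `clearForbiddenSlotHazards`, the flag names) is hypothetical and simplified; it is not the patch's `clearFSHazards()` and it ignores basic block boundaries and debug instructions, which the real code has to handle.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a machine instruction. These are not
// LLVM types; the flags are assumptions made for illustration only.
struct ToyInstr {
  std::string Name;
  bool IsCompactBranch; // the next instruction in layout is its forbidden slot
  bool IsCTI;           // control transfer instruction (branch, jump, call)
};

// Walk the instructions in layout order and insert a nop after every compact
// branch whose physical successor is a CTI. Returns the number of nops added.
inline std::size_t clearForbiddenSlotHazards(std::vector<ToyInstr> &Code) {
  std::size_t Inserted = 0;
  for (std::size_t I = 0; I + 1 < Code.size(); ++I) {
    if (Code[I].IsCompactBranch && Code[I + 1].IsCTI) {
      Code.insert(Code.begin() + static_cast<std::ptrdiff_t>(I + 1),
                  ToyInstr{"nop", false, false});
      ++Inserted;
    }
  }
  return Inserted;
}
```

Fed the layout from the case C example after delay slot filling (jal j, move, bnec, jal f, addiu), the scan inserts a single nop between bnec and jal f, and a second run finds nothing left to fix. Because it only looks at layout order, it does not need to know which of cases A, B or C produced the hazard.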
/// runOnMachineBasicBlock - Fill in delay slots for the given basic block.		/// runOnMachineBasicBlock - Fill in delay slots for the given basic block.
/// We assume there is only one delay slot per delayed instruction.		/// We assume there is only one delay slot per delayed instruction. Also ensure
		/// that there are no forbidden slot hazards.
bool Filler::runOnMachineBasicBlock(MachineBasicBlock &MBB) {		bool Filler::runOnMachineBasicBlock(MachineBasicBlock &MBB) {
bool Changed = false;		bool Changed = false;
const MipsSubtarget &STI = MBB.getParent()->getSubtarget<MipsSubtarget>();		const MipsSubtarget &STI = MBB.getParent()->getSubtarget<MipsSubtarget>();
bool InMicroMipsMode = STI.inMicroMipsMode();		bool InMicroMipsMode = STI.inMicroMipsMode();
const MipsInstrInfo *TII = STI.getInstrInfo();		const MipsSEInstrInfo *TII =
		static_cast<const MipsSEInstrInfo *>(STI.getInstrInfo());

		dsanders (Done): This could be a Mips16InstrInfo.
for (Iter I = MBB.begin(); I != MBB.end(); ++I) {		for (Iter I = MBB.begin(); I != MBB.end(); ++I) {
if (!hasUnoccupiedSlot(&*I))		if (!hasUnoccupiedSlot(&*I))
continue;		continue;

++FilledSlots;		++FilledSlots;
Changed = true;		Changed = true;

// Delay slot filling is disabled at -O0.		// Delay slot filling is disabled at -O0.
if (!DisableDelaySlotFiller && (TM.getOptLevel() != CodeGenOpt::None)) {		if (!DisableDelaySlotFiller && (TM.getOptLevel() != CodeGenOpt::None)) {
bool Filled = false;		bool Filled = false;

if (searchBackward(MBB, I)) {		if (searchBackward(MBB, I)) {
Filled = true;		Filled = true;
} else if (I->isTerminator()) {		} else if (I->isTerminator()) {
if (searchSuccBBs(MBB, I)) {		if (searchSuccBBs(MBB, I)) {
Filled = true;		Filled = true;
}		}
} else if (searchForward(MBB, I)) {		} else if (searchForward(MBB, I)) {
Filled = true;		Filled = true;
}		}

if (Filled) {		if (Filled) {
// Get instruction with delay slot.		// Get instruction with delay slot.
MachineBasicBlock::instr_iterator DSI(I);		MachineBasicBlock::instr_iterator DSI(I);

if (InMicroMipsMode && TII->GetInstSizeInBytes(&*std::next(DSI)) == 2 &&		if (InMicroMipsMode && TII->GetInstSizeInBytes(&*std::next(DSI)) == 2 &&
DSI->isCall()) {		DSI->isCall()) {
// If instruction in delay slot is 16b change opcode to		// If instruction in delay slot is 16b change opcode to
// corresponding instruction with short delay slot.		// corresponding instruction with short delay slot.
DSI->setDesc(TII->get(getEquivalentCallShort(DSI->getOpcode())));		DSI->setDesc(TII->get(getEquivalentCallShort(DSI->getOpcode())));
}		}

continue;		continue;
}		}
}		}

// If instruction is BEQ or BNE with one ZERO register, then instead of		// If instruction is BEQ or BNE with one ZERO register, then instead of
// adding NOP replace this instruction with the corresponding compact		// adding NOP replace this instruction with the corresponding compact
// branch instruction, i.e. BEQZC or BNEZC.		// branch instruction, i.e. BEQZC or BNEZC.
unsigned Opcode = I->getOpcode();
if (InMicroMipsMode) {		if (InMicroMipsMode) {
switch (Opcode) {		if (TII->getEquivalentCompactForm(I)) {
		vkalintiris (Done): We can remove this once we add support for microMIPS in getEquivalentCompactForm().
case Mips::BEQ:
case Mips::BNE:
if (((unsigned) I->getOperand(1).getReg()) == Mips::ZERO) {
I = replaceWithCompactBranch(MBB, I, I->getDebugLoc());		I = replaceWithCompactBranch(MBB, I, I->getDebugLoc());
continue;		continue;
}		}
break;		if (I->isIndirectBranch() || I->isReturn())
case Mips::JR:		// For microMIPS the PseudoReturn and PseudoIndirectBranch are always
case Mips::PseudoReturn:
case Mips::PseudoIndirectBranch:
// For microMIPS the PseudoReturn and PseudoIndirectBranch are allways
// expanded to JR_MM, so they can be replaced with JRC16_MM.		// expanded to JR_MM, so they can be replaced with JRC16_MM.
I = replaceWithCompactJump(MBB, I, I->getDebugLoc());		I = replaceWithCompactJump(MBB, I, I->getDebugLoc());
		dsanders (Not Done): Could you put braces around this?
continue;		continue;
default:
break;
}		}

		// For MIPSR6 attempt to produce the corresponding compact (no delay slot)
		// form of the branch. This should save putting in a NOP.
		if ((STI.hasMips32r6()) && TII->getEquivalentCompactForm(I)) {
		I = replaceWithCompactBranch(MBB, I, I->getDebugLoc());
		continue;
}		}

		dsanders (Not Done): The separation makes sense to me given that we're only using compact branches when delay slots go unfilled. There are a couple cases I can think of where this may not be the right thing to do:

		1. If we need a nop for both the delay slot and the forbidden slot then we need to decide which path should have the bubble. Ideally, it should be on the coldest path.
		2. Some machines may generally prefer compact branches.

		For #1, we should allow the hazard scheduler to revert the branch to a delay slot branch if the BB probabilities suggest that the taken path is colder than the not-taken path.

		For #2, I think it's reasonable to expect that such machine will exist in the future but I'm not aware of any yet. It shouldn't be difficult to deal with that if/when it arises.
		sdardis (Author, Not Done): For #1, I'm not quite following you here. We have to place the nop after the compact branch as part of the same basic block. For example:

		    BB1:
		      <no instructions to go in a delay slot>
		      bne v0,L3
		    BB2:
		      jal g
		      * move a0, v0
		    BB3:
		      ....

		The bne here will be transformed into a compact form. Since the next instruction is unsafe in the forbidden slot, we have to put a nop in the slot. We can't put the nop in BB3 as the first instruction as that doesn't clear the hazard. The hazard is due to instruction layout, not execution.

		For #2, it's somewhat trivial to block delay slot filling for branches which have corresponding delay slot forms.
// Bundle the NOP to the instruction with the delay slot.		// Bundle the NOP to the instruction with the delay slot.
		dsanders (Not Done): For #1, I'm trying to say that, in some circumstances, delay-slot branches can be a better choice than compact branches when both are available. For example:

		    BB1:
		      ...
		      bne $2, BB3
		      nop
		    BB2:
		      jal g
		    BB3:
		      ...

		introduces a bubble on both the taken and not-taken paths while:

		    BB1:
		      ...
		      bnec $2, BB3
		      nop
		    BB2:
		      jal g
		    BB3:
		      ...

		introduces a bubble on the taken path and possibly two bubbles on the not-taken path depending on implementation. I'm told this kind of thing occurs between delay-slot returns and compact-branch returns but we need to verify that.

		My comment was intended to be something to think about rather than something to do in this patch. Forbidden slots is a correctness issue so it's important that we get a solution in place and we can expand on performance decisions in later patches.

		> The hazard is due to instruction layout, not execution.

		That's right, but we aren't forced to choose code that has this hazard.
BuildMI(MBB, std::next(I), I->getDebugLoc(), TII->get(Mips::NOP));		BuildMI(MBB, std::next(I), I->getDebugLoc(), TII->get(Mips::NOP));
MIBundleBuilder(MBB, I, std::next(I, 2));		MIBundleBuilder(MBB, I, std::next(I, 2));
		vkalintiris (Done): The `STI.hasMips64r6()` will be true if `STI.hasMips32r6()` is true.

		// For correct codegen ensure any loose compact branch does not induce a
		// forbidden slot hazard.
}		}

return Changed;		return Changed;
}		}

/// createMipsDelaySlotFillerPass - Returns a pass that fills in delay		/// createMipsDelaySlotFillerPass - Returns a pass that fills in delay
/// slots in Mips MachineFunctions		/// slots in Mips MachineFunctions
FunctionPass *llvm::createMipsDelaySlotFillerPass(MipsTargetMachine &tm) {		FunctionPass *llvm::createMipsDelaySlotFillerPass(MipsTargetMachine &tm) {

lib/Target/Mips/MipsInstrInfo.cpp

		if (LastOpc != UncondBrOpc)
return BT_None;		return BT_None;

AnalyzeCondBr(SecondLastInst, SecondLastOpc, TBB, Cond);		AnalyzeCondBr(SecondLastInst, SecondLastOpc, TBB, Cond);
FBB = LastInst->getOperand(0).getMBB();		FBB = LastInst->getOperand(0).getMBB();

return BT_CondUncond;		return BT_CondUncond;
}		}

/// Return the number of bytes of code the specified instruction may be.		/// Return the number of bytes of code the specified instruction may be.
unsigned MipsInstrInfo::GetInstSizeInBytes(const MachineInstr *MI) const {		unsigned MipsInstrInfo::GetInstSizeInBytes(const MachineInstr *MI) const {
		dsanders (Not Done): Repeated name in comment.
switch (MI->getOpcode()) {		switch (MI->getOpcode()) {
default:		default:
return MI->getDesc().getSize();		return MI->getDesc().getSize();
case TargetOpcode::INLINEASM: { // Inline Asm: Variable size.		case TargetOpcode::INLINEASM: { // Inline Asm: Variable size.
const MachineFunction *MF = MI->getParent()->getParent();		const MachineFunction *MF = MI->getParent()->getParent();
const char *AsmStr = MI->getOperand(0).getSymbolName();		const char *AsmStr = MI->getOperand(0).getSymbolName();
return getInlineAsmLength(AsmStr, *MF->getTarget().getMCAsmInfo());		return getInlineAsmLength(AsmStr, *MF->getTarget().getMCAsmInfo());
}		}
case Mips::CONSTPOOL_ENTRY:		case Mips::CONSTPOOL_ENTRY:
// If this machine instr is a constant pool entry, its size is recorded as		// If this machine instr is a constant pool entry, its size is recorded as
// operand #2.		// operand #2.
return MI->getOperand(2).getImm();		return MI->getOperand(2).getImm();
}		}
}		}

MachineInstrBuilder		MachineInstrBuilder
MipsInstrInfo::genInstrWithNewOpc(unsigned NewOpc,		MipsInstrInfo::genInstrWithNewOpc(unsigned NewOpc,
MachineBasicBlock::iterator I) const {		MachineBasicBlock::iterator I) const {
MachineInstrBuilder MIB;		MachineInstrBuilder MIB;
		bool ZeroOperandBranch = false;
		dsanders (Done): I read this as being a branch with no operands. Can you make it clearer?

		// Certain branches have two forms: e.g beq $1, $zero, dst vs beqz $1, dest
		// Pick the zero form of the branch for readable assembly and for greater
		// branch distance in non-microMIPS mode.
		if (I->isBranch() && I->getNumExplicitOperands() == 3 &&
		vkalintiris (Done): We don't have to check for the number of explicit operands.
		(unsigned)I->getOperand(1).getReg() == Mips::ZERO) {
		vkalintiris (Done): You can use MipsABIInfo::GetZeroReg() to test with the right zero register. Also, the cast is unnecessary.
		sdardis (Author, Not Done):
		> You can use MipsABIInfo::GetZeroReg() to test with the right zero register.

		Strangely enough, this doesn't work. MipsTargetLowering::emitAtomicBinary will generate a Mips::BEQ with Mips::ZERO as an operand on mips64 if the requested size is 4. For mips64 it appears we need to check for MipsABIInfo::GetZeroReg() and Mips::ZERO for the moment. This appears to be an outstanding issue with emitAtomicBinary.
		vkalintiris (Done): Ok, if I understand correctly then we should just check for Mips::ZERO and keep the FIXME comment, i.e. we can remove the `Subtarget.getABI().GetZeroReg()` check as it's not working at the moment.
		dsanders (Done): We should test for both ZERO and ZERO_64 so that this is still correct when the atomics are fixed.
		ZeroOperandBranch = true;
		switch (NewOpc) {
		case Mips::BEQC:
		NewOpc = Mips::BEQZC;
		break;
		case Mips::BNEC:
		NewOpc = Mips::BNEZC;
		break;
		case Mips::BGEC:
		NewOpc = Mips::BGEZC;
		break;
		case Mips::BLTC:
		NewOpc = Mips::BLTZC;
		break;
		default:
		dsanders (Done): We ought to tablegen-erate these.
		break;
		vkalintiris (Done): Shouldn't we set `ZeroOperandBranch` to false in the default case?
		}
		}

MIB = BuildMI(*I->getParent(), I, I->getDebugLoc(), get(NewOpc));		MIB = BuildMI(*I->getParent(), I, I->getDebugLoc(), get(NewOpc));

for (unsigned J = 0, E = I->getDesc().getNumOperands(); J < E; ++J)		for (unsigned J = 0, E = I->getDesc().getNumOperands(); J < E; ++J)
		if (!(ZeroOperandBranch && (J == 1)))
MIB.addOperand(I->getOperand(J));		MIB.addOperand(I->getOperand(J));

MIB.setMemRefs(I->memoperands_begin(), I->memoperands_end());		MIB.setMemRefs(I->memoperands_begin(), I->memoperands_end());
return MIB;		return MIB;
}		}
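Editorial sketch: two review points on genInstrWithNewOpc() interact. The second register operand should only be dropped when a compare-against-zero form actually exists (vkalintiris's `ZeroOperandBranch` remark), and the zero test should accept both ZERO and ZERO_64 (dsanders's remark). The toy model below shows one way to express that. The enums, `ToyBranch`, `isZeroReg`, `getZeroForm` and `narrowToZeroForm` are all hypothetical names for illustration, not LLVM APIs, and the branch target operand is left out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical opcode and register enums for illustration; not LLVM's.
enum ToyBranchOpc { BEQC, BNEC, BGEC, BLTC, BGEUC, BEQZC, BNEZC, BGEZC, BLTZC };
enum ToyReg { ZERO, ZERO_64, V0, A0 };

struct ToyBranch {
  ToyBranchOpc Opc;
  std::vector<ToyReg> Regs; // register operands only; the target is omitted
};

// Accept either width of the zero register, as suggested in the review.
inline bool isZeroReg(ToyReg R) { return R == ZERO || R == ZERO_64; }

// Returns true and sets ZeroForm only if Opc has a compare-against-zero form.
inline bool getZeroForm(ToyBranchOpc Opc, ToyBranchOpc &ZeroForm) {
  switch (Opc) {
  case BEQC: ZeroForm = BEQZC; return true;
  case BNEC: ZeroForm = BNEZC; return true;
  case BGEC: ZeroForm = BGEZC; return true;
  case BLTC: ZeroForm = BLTZC; return true;
  default:   return false; // e.g. BGEUC: keep both operands
  }
}

// Drop the second register operand only when the opcode really changes.
inline ToyBranch narrowToZeroForm(const ToyBranch &Br) {
  ToyBranchOpc ZeroForm = Br.Opc;
  if (Br.Regs.size() == 2 && isZeroReg(Br.Regs[1]) &&
      getZeroForm(Br.Opc, ZeroForm))
    return ToyBranch{ZeroForm, {Br.Regs[0]}};
  return Br;
}
```

With this shape, a BGEUC whose second operand happens to be the zero register keeps both operands, whereas the diff as shown above would set `ZeroOperandBranch` and skip operand 1 without changing the opcode.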

lib/Target/Mips/MipsSEInstrInfo.h

		void loadRegFromStack(MachineBasicBlock &MBB,
const TargetRegisterClass *RC,		const TargetRegisterClass *RC,
const TargetRegisterInfo *TRI,		const TargetRegisterInfo *TRI,
int64_t Offset) const override;		int64_t Offset) const override;

bool expandPostRAPseudo(MachineBasicBlock::iterator MI) const override;		bool expandPostRAPseudo(MachineBasicBlock::iterator MI) const override;

unsigned getOppositeBranchOpc(unsigned Opc) const override;		unsigned getOppositeBranchOpc(unsigned Opc) const override;

		unsigned getEquivalentCompactForm(MachineBasicBlock::iterator I) const;
		dsanders (Done): The iterator should be const too.

/// Adjust SP by Amount bytes.		/// Adjust SP by Amount bytes.
void adjustStackPtr(unsigned SP, int64_t Amount, MachineBasicBlock &MBB,		void adjustStackPtr(unsigned SP, int64_t Amount, MachineBasicBlock &MBB,
MachineBasicBlock::iterator I) const override;		MachineBasicBlock::iterator I) const override;

/// Emit a series of instructions to load an immediate. If NewImm is a		/// Emit a series of instructions to load an immediate. If NewImm is a
/// non-NULL parameter, the last instruction is not emitted, but instead		/// non-NULL parameter, the last instruction is not emitted, but instead
/// its immediate operand is returned in NewImm.		/// its immediate operand is returned in NewImm.
unsigned loadImmediate(int64_t Imm, MachineBasicBlock &MBB,		unsigned loadImmediate(int64_t Imm, MachineBasicBlock &MBB,

lib/Target/Mips/MipsSEInstrInfo.cpp

		unsigned MipsSEInstrInfo::getOppositeBranchOpc(unsigned Opc) const {
case Mips::BGTZ64: return Mips::BLEZ64;		case Mips::BGTZ64: return Mips::BLEZ64;
case Mips::BGEZ64: return Mips::BLTZ64;		case Mips::BGEZ64: return Mips::BLTZ64;
case Mips::BLTZ64: return Mips::BGEZ64;		case Mips::BLTZ64: return Mips::BGEZ64;
case Mips::BLEZ64: return Mips::BGTZ64;		case Mips::BLEZ64: return Mips::BGTZ64;
case Mips::BC1T: return Mips::BC1F;		case Mips::BC1T: return Mips::BC1F;
case Mips::BC1F: return Mips::BC1T;		case Mips::BC1F: return Mips::BC1T;
case Mips::BEQZC_MM: return Mips::BNEZC_MM;		case Mips::BEQZC_MM: return Mips::BNEZC_MM;
case Mips::BNEZC_MM: return Mips::BEQZC_MM;		case Mips::BNEZC_MM: return Mips::BEQZC_MM;
		case Mips::BEQZC: return Mips::BNEZC;
		case Mips::BNEZC: return Mips::BEQZC;
		case Mips::BEQC: return Mips::BNEC;
		case Mips::BNEC: return Mips::BEQC;
		case Mips::BGTZC: return Mips::BLEZC;
		case Mips::BGEZC: return Mips::BLTZC;
		case Mips::BLTZC: return Mips::BGEZC;
		case Mips::BLEZC: return Mips::BGTZC;
}		}
}		}

		/// getEquivalentCompactForm - Return the corresponding compact form of
		/// a branch.
		unsigned
		MipsSEInstrInfo::getEquivalentCompactForm(MachineBasicBlock::iterator I) const {

		vkalintiris (Done): Redundant newline.
		sdardis (Author, Not Done): clang-format inserts the newline, remove or keep it?
		vkalintiris (Done): It doesn't insert one for me, so I suggest that we remove the newline.
		const MipsSubtarget &STI =
		I->getParent()->getParent()->getSubtarget<MipsSubtarget>();
		unsigned Opcode = (unsigned)I->getOpcode();
		if (STI.hasMips32r6()) {
		switch (Opcode) {
		case Mips::B:
		return Mips::BC;
		case Mips::BAL:
		return Mips::BALC;
		case Mips::BEQ:
		return Mips::BEQC;
		case Mips::BNE:
		return Mips::BNEC;
		dsanders (Done): This doesn't look equivalent to BEQ to me. What happens if the second operand of the comparison isn't $zero?
		sdardis (Author, Not Done): If the second operand of the comparison is not $zero, canUseMicroMipsBranches is false as that explicitly checks for conditions that beqz16 can be used in. However, if we have microMIPSR6 we can generate beqc (long version). If both canUseMicroMipsBranches and hasMips32r6() are false, we return 0 since there's no equivalent form. I've renamed canUseMicroMipsBranches to canUseShortMMBranches to better reflect what it's doing.
		dsanders (Not Done): Ok, thanks.
		case Mips::BGE:
		return Mips::BGEC;
		case Mips::BGEU:
		return Mips::BGEUC;
		case Mips::BGEZ:
		dsanders (Done): This doesn't look equivalent to BEQ to me. What happens if the second operand of the comparison isn't $zero?
		sdardis (Author, Not Done): See above.
		dsanders (Not Done): Ok, thanks.
		return Mips::BGEZC;
		case Mips::BGTZ:
		return Mips::BGTZC;
		case Mips::BLEZ:
		return Mips::BLEZC;
		case Mips::BLT:
		return Mips::BLTC;
		case Mips::BLTU:
		return Mips::BLTUC;
		case Mips::BLTZ:
		return Mips::BLTZC;
		default:
		return 0;
		}
		}

		if (STI.inMicroMipsMode()) {
		switch (Opcode) {
		case Mips::BEQ:
		case Mips::BNE:
		if ((unsigned)I->getOperand(1).getReg() == Mips::ZERO)
		return Opcode == Mips::BEQ ? Mips::BEQZC_MM : Mips::BNEZC_MM;
		default:
		return 0;
		}
		}
		vkalintiris (Done): We will never reach this if `STI.hasMips32r6()` is true. We should merge the logic of opcode selection into the first switch statement.
		return 0;
		}
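Editorial sketch: one way to read the merged selection logic that vkalintiris asks for, combined with sdardis's description of canUseShortMMBranches, is a single switch where the short microMIPS form is tried first and the R6 compact form is the fallback. The code below is a toy model under that assumption. `ToyOpc`, `ToySubtarget` and `getCompactForm` are hypothetical names, only a handful of opcodes are covered, and it is not the function that was eventually committed.

```cpp
#include <cassert>

// Hypothetical opcode set for illustration; None means "no compact form".
enum class ToyOpc { None, B, BC, BEQ, BEQC, BNE, BNEC, BEQZC_MM, BNEZC_MM, JR };

// Hypothetical subtarget flags standing in for hasMips32r6() and
// inMicroMipsMode().
struct ToySubtarget {
  bool HasMips32r6;
  bool InMicroMipsMode;
};

// Single switch: prefer the short microMIPS branch when it is usable,
// otherwise fall back to the R6 compact branch, otherwise report no form.
inline ToyOpc getCompactForm(ToyOpc Opc, bool SecondOperandIsZero,
                             const ToySubtarget &STI) {
  const bool CanUseShortMMBranches =
      STI.InMicroMipsMode && SecondOperandIsZero;
  switch (Opc) {
  case ToyOpc::B:
    return STI.HasMips32r6 ? ToyOpc::BC : ToyOpc::None;
  case ToyOpc::BEQ:
    if (CanUseShortMMBranches)
      return ToyOpc::BEQZC_MM;
    return STI.HasMips32r6 ? ToyOpc::BEQC : ToyOpc::None;
  case ToyOpc::BNE:
    if (CanUseShortMMBranches)
      return ToyOpc::BNEZC_MM;
    return STI.HasMips32r6 ? ToyOpc::BNEC : ToyOpc::None;
  default:
    return ToyOpc::None;
  }
}
```

Under this arrangement no case is unreachable: a microMIPS R6 subtarget gets the short form when the second operand is zero and the long compact form otherwise, which is the behaviour sdardis describes in the thread above.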

/// Adjust SP by Amount bytes.		/// Adjust SP by Amount bytes.
void MipsSEInstrInfo::adjustStackPtr(unsigned SP, int64_t Amount,		void MipsSEInstrInfo::adjustStackPtr(unsigned SP, int64_t Amount,
MachineBasicBlock &MBB,		MachineBasicBlock &MBB,
MachineBasicBlock::iterator I) const {		MachineBasicBlock::iterator I) const {
MipsABIInfo ABI = Subtarget.getABI();		MipsABIInfo ABI = Subtarget.getABI();
DebugLoc DL;		DebugLoc DL;
unsigned ADDu = ABI.GetPtrAdduOp();		unsigned ADDu = ABI.GetPtrAdduOp();
unsigned ADDiu = ABI.GetPtrAddiuOp();		unsigned ADDiu = ABI.GetPtrAddiuOp();
}		}

unsigned MipsSEInstrInfo::getAnalyzableBrOpc(unsigned Opc) const {		unsigned MipsSEInstrInfo::getAnalyzableBrOpc(unsigned Opc) const {
return (Opc == Mips::BEQ || Opc == Mips::BNE || Opc == Mips::BGTZ ||		return (Opc == Mips::BEQ || Opc == Mips::BNE || Opc == Mips::BGTZ ||
Opc == Mips::BGEZ || Opc == Mips::BLTZ || Opc == Mips::BLEZ ||		Opc == Mips::BGEZ || Opc == Mips::BLTZ || Opc == Mips::BLEZ ||
Opc == Mips::BEQ64 || Opc == Mips::BNE64 || Opc == Mips::BGTZ64 ||		Opc == Mips::BEQ64 || Opc == Mips::BNE64 || Opc == Mips::BGTZ64 ||
Opc == Mips::BGEZ64 || Opc == Mips::BLTZ64 || Opc == Mips::BLEZ64 ||		Opc == Mips::BGEZ64 || Opc == Mips::BLTZ64 || Opc == Mips::BLEZ64 ||
Opc == Mips::BC1T || Opc == Mips::BC1F || Opc == Mips::B ||		Opc == Mips::BC1T || Opc == Mips::BC1F || Opc == Mips::B ||
Opc == Mips::J || Opc == Mips::BEQZC_MM || Opc == Mips::BNEZC_MM) ?		Opc == Mips::J || Opc == Mips::BEQZC_MM || Opc == Mips::BNEZC_MM ||
Opc : 0;		Opc == Mips::BEQC || Opc == Mips::BNEC || Opc == Mips::BLTC ||
		Opc == Mips::BGEC || Opc == Mips::BLTUC || Opc == Mips::BGEUC ||
		Opc == Mips::BGTZC || Opc == Mips::BLEZC || Opc == Mips::BGEZC ||
		Opc == Mips::BGTZC || Opc == Mips::BEQZC || Opc == Mips::BNEZC ||
		Opc == Mips::BC) ? Opc : 0;
}		}

void MipsSEInstrInfo::expandRetRA(MachineBasicBlock &MBB,		void MipsSEInstrInfo::expandRetRA(MachineBasicBlock &MBB,
MachineBasicBlock::iterator I) const {		MachineBasicBlock::iterator I) const {
if (Subtarget.isGP64bit())		if (Subtarget.isGP64bit())
BuildMI(MBB, I, I->getDebugLoc(), get(Mips::PseudoReturn64))		BuildMI(MBB, I, I->getDebugLoc(), get(Mips::PseudoReturn64))
.addReg(Mips::RA_64);		.addReg(Mips::RA_64);
else		else

test/CodeGen/Mips/analyzebranch.ll

	; FCC: nop			; FCC: nop

	; 32-GPR: mtc1 $zero, $[[Z:f[0-9]]]			; 32-GPR: mtc1 $zero, $[[Z:f[0-9]]]
	; 32-GPR: mthc1 $zero, $[[Z:f[0-9]]]			; 32-GPR: mthc1 $zero, $[[Z:f[0-9]]]
	; 64-GPR: dmtc1 $zero, $[[Z:f[0-9]]]			; 64-GPR: dmtc1 $zero, $[[Z:f[0-9]]]
	; GPR: cmp.lt.d $[[FGRCC:f[0-9]+]], $[[Z]], $f12			; GPR: cmp.lt.d $[[FGRCC:f[0-9]+]], $[[Z]], $f12
	; GPR: mfc1 $[[GPRCC:[0-9]+]], $[[FGRCC]]			; GPR: mfc1 $[[GPRCC:[0-9]+]], $[[FGRCC]]
	; GPR-NOT: not $[[GPRCC]], $[[GPRCC]]			; GPR-NOT: not $[[GPRCC]], $[[GPRCC]]
	; GPR: bnez $[[GPRCC]], $BB			; GPR: bnezc $[[GPRCC]], $BB

	%cmp = fcmp ogt double %a, 0.000000e+00			%cmp = fcmp ogt double %a, 0.000000e+00
	br i1 %cmp, label %if.end6, label %if.else			br i1 %cmp, label %if.end6, label %if.else

	if.else: ; preds = %entry			if.else: ; preds = %entry
	%cmp3 = fcmp ogt double %b, 0.000000e+00			%cmp3 = fcmp ogt double %b, 0.000000e+00
	br i1 %cmp3, label %if.end6, label %return			br i1 %cmp3, label %if.end6, label %return


	; FCC: bc1f $BB			; FCC: bc1f $BB
	; FCC: nop			; FCC: nop

	; GPR: mtc1 $zero, $[[Z:f[0-9]]]			; GPR: mtc1 $zero, $[[Z:f[0-9]]]
	; GPR: cmp.eq.s $[[FGRCC:f[0-9]+]], $f12, $[[Z]]			; GPR: cmp.eq.s $[[FGRCC:f[0-9]+]], $f12, $[[Z]]
	; GPR: mfc1 $[[GPRCC:[0-9]+]], $[[FGRCC]]			; GPR: mfc1 $[[GPRCC:[0-9]+]], $[[FGRCC]]
	; GPR-NOT: not $[[GPRCC]], $[[GPRCC]]			; GPR-NOT: not $[[GPRCC]], $[[GPRCC]]
	; GPR: beqz $[[GPRCC]], $BB			; 64-GPR beqzc $[[GPRCC]], $BB
				; 32-GPR beqz $[[GPRCC]], $BB

	%cmp = fcmp une float %f, 0.000000e+00			%cmp = fcmp une float %f, 0.000000e+00
	br i1 %cmp, label %if.then, label %if.end			br i1 %cmp, label %if.then, label %if.end

	if.then: ; preds = %entry			if.then: ; preds = %entry
	tail call void @abort() noreturn			tail call void @abort() noreturn
	unreachable			unreachable

	if.end: ; preds = %entry			if.end: ; preds = %entry
	tail call void (...) @f2() nounwind			tail call void (...) @f2() nounwind
	ret void			ret void
	}			}

	declare void @abort() noreturn nounwind			declare void @abort() noreturn nounwind

	declare void @f2(...)			declare void @f2(...)

test/CodeGen/Mips/atomic.ll

	; RUN: llc -march=mipsel --disable-machine-licm -mcpu=mips32 < %s | FileCheck %s -check-prefix=ALL -check-prefix=MIPS32-ANY -check-prefix=NO-SEB-SEH -check-prefix=CHECK-EL -check-prefix=NOT-MICROMIPS			; RUN: llc -march=mipsel --disable-machine-licm -mcpu=mips32 < %s | FileCheck %s -check-prefix=ALL -check-prefix=MIPS32-ANY -check-prefix=NO-SEB-SEH -check-prefix=CHECK-EL -check-prefix=NOT-MICROMIPS
	; RUN: llc -march=mipsel --disable-machine-licm -mcpu=mips32r2 < %s | FileCheck %s -check-prefix=ALL -check-prefix=MIPS32-ANY -check-prefix=HAS-SEB-SEH -check-prefix=CHECK-EL -check-prefix=NOT-MICROMIPS			; RUN: llc -march=mipsel --disable-machine-licm -mcpu=mips32r2 < %s | FileCheck %s -check-prefix=ALL -check-prefix=MIPS32-ANY -check-prefix=HAS-SEB-SEH -check-prefix=CHECK-EL -check-prefix=NOT-MICROMIPS
	; RUN: llc -march=mipsel --disable-machine-licm -mcpu=mips32r6 < %s | FileCheck %s -check-prefix=ALL -check-prefix=MIPS32-ANY -check-prefix=HAS-SEB-SEH -check-prefix=CHECK-EL -check-prefix=NOT-MICROMIPS			; RUN: llc -march=mipsel --disable-machine-licm -mcpu=mips32r6 < %s | FileCheck %s -check-prefix=ALL -check-prefix=MIPS32-ANY -check-prefix=HAS-SEB-SEH -check-prefix=CHECK-EL -check-prefix=MIPSR6
	; RUN: llc -march=mips64el --disable-machine-licm -mcpu=mips4 < %s | FileCheck %s -check-prefix=ALL -check-prefix=MIPS64-ANY -check-prefix=NO-SEB-SEH -check-prefix=CHECK-EL -check-prefix=NOT-MICROMIPS			; RUN: llc -march=mips64el --disable-machine-licm -mcpu=mips4 < %s | FileCheck %s -check-prefix=ALL -check-prefix=MIPS64-ANY -check-prefix=NO-SEB-SEH -check-prefix=CHECK-EL -check-prefix=NOT-MICROMIPS
	; RUN: llc -march=mips64el --disable-machine-licm -mcpu=mips64 < %s | FileCheck %s -check-prefix=ALL -check-prefix=MIPS64-ANY -check-prefix=NO-SEB-SEH -check-prefix=CHECK-EL -check-prefix=NOT-MICROMIPS			; RUN: llc -march=mips64el --disable-machine-licm -mcpu=mips64 < %s | FileCheck %s -check-prefix=ALL -check-prefix=MIPS64-ANY -check-prefix=NO-SEB-SEH -check-prefix=CHECK-EL -check-prefix=NOT-MICROMIPS
	; RUN: llc -march=mips64el --disable-machine-licm -mcpu=mips64r2 < %s | FileCheck %s -check-prefix=ALL -check-prefix=MIPS64-ANY -check-prefix=HAS-SEB-SEH -check-prefix=CHECK-EL -check-prefix=NOT-MICROMIPS			; RUN: llc -march=mips64el --disable-machine-licm -mcpu=mips64r2 < %s | FileCheck %s -check-prefix=ALL -check-prefix=MIPS64-ANY -check-prefix=HAS-SEB-SEH -check-prefix=CHECK-EL -check-prefix=NOT-MICROMIPS
	; RUN: llc -march=mips64el --disable-machine-licm -mcpu=mips64r6 < %s | FileCheck %s -check-prefix=ALL -check-prefix=MIPS64-ANY -check-prefix=HAS-SEB-SEH -check-prefix=CHECK-EL -check-prefix=NOT-MICROMIPS			; RUN: llc -march=mips64el --disable-machine-licm -mcpu=mips64r6 < %s | FileCheck %s -check-prefix=ALL -check-prefix=MIPS64-ANY -check-prefix=HAS-SEB-SEH -check-prefix=CHECK-EL -check-prefix=MIPSR6
	; RUN: llc -march=mipsel --disable-machine-licm -mcpu=mips32r2 -mattr=micromips < %s | FileCheck %s -check-prefix=ALL -check-prefix=MIPS32-ANY -check-prefix=HAS-SEB-SEH -check-prefix=CHECK-EL -check-prefix=MICROMIPS			; RUN: llc -march=mipsel --disable-machine-licm -mcpu=mips32r2 -mattr=micromips < %s | FileCheck %s -check-prefix=ALL -check-prefix=MIPS32-ANY -check-prefix=HAS-SEB-SEH -check-prefix=CHECK-EL -check-prefix=MICROMIPS

	; Keep one big-endian check so that we don't reduce testing, but don't add more			; Keep one big-endian check so that we don't reduce testing, but don't add more
	; since endianness doesn't affect the body of the atomic operations.			; since endianness doesn't affect the body of the atomic operations.
	; RUN: llc -march=mips --disable-machine-licm -mcpu=mips32 < %s | FileCheck %s -check-prefix=ALL -check-prefix=MIPS32-ANY -check-prefix=NO-SEB-SEH -check-prefix=CHECK-EB -check-prefix=NOT-MICROMIPS			; RUN: llc -march=mips --disable-machine-licm -mcpu=mips32 < %s | FileCheck %s -check-prefix=ALL -check-prefix=MIPS32-ANY -check-prefix=NO-SEB-SEH -check-prefix=CHECK-EB -check-prefix=NOT-MICROMIPS

	@x = common global i32 0, align 4			@x = common global i32 0, align 4

	define i32 @AtomicLoadAdd32(i32 signext %incr) nounwind {			define i32 @AtomicLoadAdd32(i32 signext %incr) nounwind {
	entry:			entry:
	%0 = atomicrmw add i32* @x, i32 %incr monotonic			%0 = atomicrmw add i32* @x, i32 %incr monotonic
	ret i32 %0			ret i32 %0

	; ALL-LABEL: AtomicLoadAdd32:			; ALL-LABEL: AtomicLoadAdd32:

	; MIPS32-ANY: lw $[[R0:[0-9]+]], %got(x)			; MIPS32-ANY: lw $[[R0:[0-9]+]], %got(x)
	; MIPS64-ANY: ld $[[R0:[0-9]+]], %got_disp(x)(			; MIPS64-ANY: ld $[[R0:[0-9]+]], %got_disp(x)(

	; ALL: $[[BB0:[A-Z_0-9]+]]:			; ALL: $[[BB0:[A-Z_0-9]+]]:
	; ALL: ll $[[R1:[0-9]+]], 0($[[R0]])			; ALL: ll $[[R1:[0-9]+]], 0($[[R0]])
	; ALL: addu $[[R2:[0-9]+]], $[[R1]], $4			; ALL: addu $[[R2:[0-9]+]], $[[R1]], $4
	; ALL: sc $[[R2]], 0($[[R0]])			; ALL: sc $[[R2]], 0($[[R0]])
	; NOT-MICROMIPS: beqz $[[R2]], $[[BB0]]			; NOT-MICROMIPS: beqz $[[R2]], $[[BB0]]
	; MICROMIPS: beqzc $[[R2]], $[[BB0]]			; MICROMIPS: beqzc $[[R2]], $[[BB0]]
				; MIPSR6: beqzc $[[R2]], $[[BB0]]
	}			}

	define i32 @AtomicLoadNand32(i32 signext %incr) nounwind {			define i32 @AtomicLoadNand32(i32 signext %incr) nounwind {
	entry:			entry:
	%0 = atomicrmw nand i32* @x, i32 %incr monotonic			%0 = atomicrmw nand i32* @x, i32 %incr monotonic
	ret i32 %0			ret i32 %0

	; ALL-LABEL: AtomicLoadNand32:			; ALL-LABEL: AtomicLoadNand32:

	; MIPS32-ANY: lw $[[R0:[0-9]+]], %got(x)			; MIPS32-ANY: lw $[[R0:[0-9]+]], %got(x)
	; MIPS64-ANY: ld $[[R0:[0-9]+]], %got_disp(x)(			; MIPS64-ANY: ld $[[R0:[0-9]+]], %got_disp(x)(

	; ALL: $[[BB0:[A-Z_0-9]+]]:			; ALL: $[[BB0:[A-Z_0-9]+]]:
	; ALL: ll $[[R1:[0-9]+]], 0($[[R0]])			; ALL: ll $[[R1:[0-9]+]], 0($[[R0]])
	; ALL: and $[[R3:[0-9]+]], $[[R1]], $4			; ALL: and $[[R3:[0-9]+]], $[[R1]], $4
	; ALL: nor $[[R2:[0-9]+]], $zero, $[[R3]]			; ALL: nor $[[R2:[0-9]+]], $zero, $[[R3]]
	; ALL: sc $[[R2]], 0($[[R0]])			; ALL: sc $[[R2]], 0($[[R0]])
	; NOT-MICROMIPS: beqz $[[R2]], $[[BB0]]			; NOT-MICROMIPS: beqz $[[R2]], $[[BB0]]
	; MICROMIPS: beqzc $[[R2]], $[[BB0]]			; MICROMIPS: beqzc $[[R2]], $[[BB0]]
				; MIPSR6: beqzc $[[R2]], $[[BB0]]
	}			}

	define i32 @AtomicSwap32(i32 signext %newval) nounwind {			define i32 @AtomicSwap32(i32 signext %newval) nounwind {
	entry:			entry:
	%newval.addr = alloca i32, align 4			%newval.addr = alloca i32, align 4
	store i32 %newval, i32* %newval.addr, align 4			store i32 %newval, i32* %newval.addr, align 4
	%tmp = load i32, i32* %newval.addr, align 4			%tmp = load i32, i32* %newval.addr, align 4
	%0 = atomicrmw xchg i32* @x, i32 %tmp monotonic			%0 = atomicrmw xchg i32* @x, i32 %tmp monotonic
	ret i32 %0			ret i32 %0

	; ALL-LABEL: AtomicSwap32:			; ALL-LABEL: AtomicSwap32:

	; MIPS32-ANY: lw $[[R0:[0-9]+]], %got(x)			; MIPS32-ANY: lw $[[R0:[0-9]+]], %got(x)
	; MIPS64-ANY: ld $[[R0:[0-9]+]], %got_disp(x)			; MIPS64-ANY: ld $[[R0:[0-9]+]], %got_disp(x)

	; ALL: $[[BB0:[A-Z_0-9]+]]:			; ALL: $[[BB0:[A-Z_0-9]+]]:
	; ALL: ll ${{[0-9]+}}, 0($[[R0]])			; ALL: ll ${{[0-9]+}}, 0($[[R0]])
	; ALL: sc $[[R2:[0-9]+]], 0($[[R0]])			; ALL: sc $[[R2:[0-9]+]], 0($[[R0]])
	; NOT-MICROMIPS: beqz $[[R2]], $[[BB0]]			; NOT-MICROMIPS: beqz $[[R2]], $[[BB0]]
	; MICROMIPS: beqzc $[[R2]], $[[BB0]]			; MICROMIPS: beqzc $[[R2]], $[[BB0]]
				; MIPSR6: beqzc $[[R2]], $[[BB0]]
	}			}

	define i32 @AtomicCmpSwap32(i32 signext %oldval, i32 signext %newval) nounwind {			define i32 @AtomicCmpSwap32(i32 signext %oldval, i32 signext %newval) nounwind {
	entry:			entry:
	%newval.addr = alloca i32, align 4			%newval.addr = alloca i32, align 4
	store i32 %newval, i32* %newval.addr, align 4			store i32 %newval, i32* %newval.addr, align 4
	%tmp = load i32, i32* %newval.addr, align 4			%tmp = load i32, i32* %newval.addr, align 4
	%0 = cmpxchg i32* @x, i32 %oldval, i32 %tmp monotonic monotonic			%0 = cmpxchg i32* @x, i32 %oldval, i32 %tmp monotonic monotonic
	%1 = extractvalue { i32, i1 } %0, 0			%1 = extractvalue { i32, i1 } %0, 0
	ret i32 %1			ret i32 %1

	; ALL-LABEL: AtomicCmpSwap32:			; ALL-LABEL: AtomicCmpSwap32:

	; MIPS32-ANY: lw $[[R0:[0-9]+]], %got(x)			; MIPS32-ANY: lw $[[R0:[0-9]+]], %got(x)
	; MIPS64-ANY: ld $[[R0:[0-9]+]], %got_disp(x)(			; MIPS64-ANY: ld $[[R0:[0-9]+]], %got_disp(x)(

	; ALL: $[[BB0:[A-Z_0-9]+]]:			; ALL: $[[BB0:[A-Z_0-9]+]]:
	; ALL: ll $2, 0($[[R0]])			; ALL: ll $2, 0($[[R0]])
	; ALL: bne $2, $4, $[[BB1:[A-Z_0-9]+]]			; NOT-MICROMIPS: bne $2, $4, $[[BB1:[A-Z_0-9]+]]
				; MICROMIPS: bne $2, $4, $[[BB1:[A-Z_0-9]+]]
				; MIPSR6: bnec $2, $4, $[[BB1:[A-Z_0-9]+]]
	; ALL: sc $[[R2:[0-9]+]], 0($[[R0]])			; ALL: sc $[[R2:[0-9]+]], 0($[[R0]])
	; NOT-MICROMIPS: beqz $[[R2]], $[[BB0]]			; NOT-MICROMIPS: beqz $[[R2]], $[[BB0]]
	; MICROMIPS: beqzc $[[R2]], $[[BB0]]			; MICROMIPS: beqzc $[[R2]], $[[BB0]]
				; MIPSR6: beqzc $[[R2]], $[[BB0]]
	; ALL: $[[BB1]]:			; ALL: $[[BB1]]:
	}			}



	@y = common global i8 0, align 1			@y = common global i8 0, align 1

	define signext i8 @AtomicLoadAdd8(i8 signext %incr) nounwind {			define signext i8 @AtomicLoadAdd8(i8 signext %incr) nounwind {
	Show All 21 Lines
	; ALL: ll $[[R10:[0-9]+]], 0($[[R2]])			; ALL: ll $[[R10:[0-9]+]], 0($[[R2]])
	; ALL: addu $[[R11:[0-9]+]], $[[R10]], $[[R9]]			; ALL: addu $[[R11:[0-9]+]], $[[R10]], $[[R9]]
	; ALL: and $[[R12:[0-9]+]], $[[R11]], $[[R7]]			; ALL: and $[[R12:[0-9]+]], $[[R11]], $[[R7]]
	; ALL: and $[[R13:[0-9]+]], $[[R10]], $[[R8]]			; ALL: and $[[R13:[0-9]+]], $[[R10]], $[[R8]]
	; ALL: or $[[R14:[0-9]+]], $[[R13]], $[[R12]]			; ALL: or $[[R14:[0-9]+]], $[[R13]], $[[R12]]
	; ALL: sc $[[R14]], 0($[[R2]])			; ALL: sc $[[R14]], 0($[[R2]])
	; NOT-MICROMIPS: beqz $[[R14]], $[[BB0]]			; NOT-MICROMIPS: beqz $[[R14]], $[[BB0]]
	; MICROMIPS: beqzc $[[R14]], $[[BB0]]			; MICROMIPS: beqzc $[[R14]], $[[BB0]]
				; MIPSR6: beqzc $[[R14]], $[[BB0]]

	; ALL: and $[[R15:[0-9]+]], $[[R10]], $[[R7]]			; ALL: and $[[R15:[0-9]+]], $[[R10]], $[[R7]]
	; ALL: srlv $[[R16:[0-9]+]], $[[R15]], $[[R5]]			; ALL: srlv $[[R16:[0-9]+]], $[[R15]], $[[R5]]

	; NO-SEB-SEH: sll $[[R17:[0-9]+]], $[[R16]], 24			; NO-SEB-SEH: sll $[[R17:[0-9]+]], $[[R16]], 24
	; NO-SEB-SEH: sra $2, $[[R17]], 24			; NO-SEB-SEH: sra $2, $[[R17]], 24

	; HAS-SEB-SEH: seb $2, $[[R16]]			; HAS-SEB-SEH: seb $2, $[[R16]]
	Show All 24 Lines
	; ALL: ll $[[R10:[0-9]+]], 0($[[R2]])			; ALL: ll $[[R10:[0-9]+]], 0($[[R2]])
	; ALL: subu $[[R11:[0-9]+]], $[[R10]], $[[R9]]			; ALL: subu $[[R11:[0-9]+]], $[[R10]], $[[R9]]
	; ALL: and $[[R12:[0-9]+]], $[[R11]], $[[R7]]			; ALL: and $[[R12:[0-9]+]], $[[R11]], $[[R7]]
	; ALL: and $[[R13:[0-9]+]], $[[R10]], $[[R8]]			; ALL: and $[[R13:[0-9]+]], $[[R10]], $[[R8]]
	; ALL: or $[[R14:[0-9]+]], $[[R13]], $[[R12]]			; ALL: or $[[R14:[0-9]+]], $[[R13]], $[[R12]]
	; ALL: sc $[[R14]], 0($[[R2]])			; ALL: sc $[[R14]], 0($[[R2]])
	; NOT-MICROMIPS: beqz $[[R14]], $[[BB0]]			; NOT-MICROMIPS: beqz $[[R14]], $[[BB0]]
	; MICROMIPS: beqzc $[[R14]], $[[BB0]]			; MICROMIPS: beqzc $[[R14]], $[[BB0]]
				; MIPSR6: beqzc $[[R14]], $[[BB0]]

	; ALL: and $[[R15:[0-9]+]], $[[R10]], $[[R7]]			; ALL: and $[[R15:[0-9]+]], $[[R10]], $[[R7]]
	; ALL: srlv $[[R16:[0-9]+]], $[[R15]], $[[R5]]			; ALL: srlv $[[R16:[0-9]+]], $[[R15]], $[[R5]]

	; NO-SEB-SEH: sll $[[R17:[0-9]+]], $[[R16]], 24			; NO-SEB-SEH: sll $[[R17:[0-9]+]], $[[R16]], 24
	; NO-SEB-SEH: sra $2, $[[R17]], 24			; NO-SEB-SEH: sra $2, $[[R17]], 24

	; HAS-SEB-SEH:seb $2, $[[R16]]			; HAS-SEB-SEH:seb $2, $[[R16]]
	Show All 25 Lines
	; ALL: and $[[R18:[0-9]+]], $[[R10]], $[[R9]]			; ALL: and $[[R18:[0-9]+]], $[[R10]], $[[R9]]
	; ALL: nor $[[R11:[0-9]+]], $zero, $[[R18]]			; ALL: nor $[[R11:[0-9]+]], $zero, $[[R18]]
	; ALL: and $[[R12:[0-9]+]], $[[R11]], $[[R7]]			; ALL: and $[[R12:[0-9]+]], $[[R11]], $[[R7]]
	; ALL: and $[[R13:[0-9]+]], $[[R10]], $[[R8]]			; ALL: and $[[R13:[0-9]+]], $[[R10]], $[[R8]]
	; ALL: or $[[R14:[0-9]+]], $[[R13]], $[[R12]]			; ALL: or $[[R14:[0-9]+]], $[[R13]], $[[R12]]
	; ALL: sc $[[R14]], 0($[[R2]])			; ALL: sc $[[R14]], 0($[[R2]])
	; NOT-MICROMIPS: beqz $[[R14]], $[[BB0]]			; NOT-MICROMIPS: beqz $[[R14]], $[[BB0]]
	; MICROMIPS: beqzc $[[R14]], $[[BB0]]			; MICROMIPS: beqzc $[[R14]], $[[BB0]]
				; MIPSR6: beqzc $[[R14]], $[[BB0]]

	; ALL: and $[[R15:[0-9]+]], $[[R10]], $[[R7]]			; ALL: and $[[R15:[0-9]+]], $[[R10]], $[[R7]]
	; ALL: srlv $[[R16:[0-9]+]], $[[R15]], $[[R5]]			; ALL: srlv $[[R16:[0-9]+]], $[[R15]], $[[R5]]

	; NO-SEB-SEH: sll $[[R17:[0-9]+]], $[[R16]], 24			; NO-SEB-SEH: sll $[[R17:[0-9]+]], $[[R16]], 24
	; NO-SEB-SEH: sra $2, $[[R17]], 24			; NO-SEB-SEH: sra $2, $[[R17]], 24

	; HAS-SEB-SEH: seb $2, $[[R16]]			; HAS-SEB-SEH: seb $2, $[[R16]]
	Show All 23 Lines
	; ALL: $[[BB0:[A-Z_0-9]+]]:			; ALL: $[[BB0:[A-Z_0-9]+]]:
	; ALL: ll $[[R10:[0-9]+]], 0($[[R2]])			; ALL: ll $[[R10:[0-9]+]], 0($[[R2]])
	; ALL: and $[[R18:[0-9]+]], $[[R9]], $[[R7]]			; ALL: and $[[R18:[0-9]+]], $[[R9]], $[[R7]]
	; ALL: and $[[R13:[0-9]+]], $[[R10]], $[[R8]]			; ALL: and $[[R13:[0-9]+]], $[[R10]], $[[R8]]
	; ALL: or $[[R14:[0-9]+]], $[[R13]], $[[R18]]			; ALL: or $[[R14:[0-9]+]], $[[R13]], $[[R18]]
	; ALL: sc $[[R14]], 0($[[R2]])			; ALL: sc $[[R14]], 0($[[R2]])
	; NOT-MICROMIPS: beqz $[[R14]], $[[BB0]]			; NOT-MICROMIPS: beqz $[[R14]], $[[BB0]]
	; MICROMIPS: beqzc $[[R14]], $[[BB0]]			; MICROMIPS: beqzc $[[R14]], $[[BB0]]
				; MIPSR6: beqzc $[[R14]], $[[BB0]]

	; ALL: and $[[R15:[0-9]+]], $[[R10]], $[[R7]]			; ALL: and $[[R15:[0-9]+]], $[[R10]], $[[R7]]
	; ALL: srlv $[[R16:[0-9]+]], $[[R15]], $[[R5]]			; ALL: srlv $[[R16:[0-9]+]], $[[R15]], $[[R5]]

	; NO-SEB-SEH: sll $[[R17:[0-9]+]], $[[R16]], 24			; NO-SEB-SEH: sll $[[R17:[0-9]+]], $[[R16]], 24
	; NO-SEB-SEH: sra $2, $[[R17]], 24			; NO-SEB-SEH: sra $2, $[[R17]], 24

	; HAS-SEB-SEH: seb $2, $[[R16]]			; HAS-SEB-SEH: seb $2, $[[R16]]
	Show All 23 Lines
	; ALL: andi $[[R9:[0-9]+]], $4, 255			; ALL: andi $[[R9:[0-9]+]], $4, 255
	; ALL: sllv $[[R10:[0-9]+]], $[[R9]], $[[R5]]			; ALL: sllv $[[R10:[0-9]+]], $[[R9]], $[[R5]]
	; ALL: andi $[[R11:[0-9]+]], $5, 255			; ALL: andi $[[R11:[0-9]+]], $5, 255
	; ALL: sllv $[[R12:[0-9]+]], $[[R11]], $[[R5]]			; ALL: sllv $[[R12:[0-9]+]], $[[R11]], $[[R5]]

	; ALL: $[[BB0:[A-Z_0-9]+]]:			; ALL: $[[BB0:[A-Z_0-9]+]]:
	; ALL: ll $[[R13:[0-9]+]], 0($[[R2]])			; ALL: ll $[[R13:[0-9]+]], 0($[[R2]])
	; ALL: and $[[R14:[0-9]+]], $[[R13]], $[[R7]]			; ALL: and $[[R14:[0-9]+]], $[[R13]], $[[R7]]
	; ALL: bne $[[R14]], $[[R10]], $[[BB1:[A-Z_0-9]+]]			; NOT-MICROMIPS: bne $[[R14]], $[[R10]], $[[BB1:[A-Z_0-9]+]]
				; MICROMIPS: bne $[[R14]], $[[R10]], $[[BB1:[A-Z_0-9]+]]
				; MIPSR6: bnec $[[R14]], $[[R10]], $[[BB1:[A-Z_0-9]+]]

	; ALL: and $[[R15:[0-9]+]], $[[R13]], $[[R8]]			; ALL: and $[[R15:[0-9]+]], $[[R13]], $[[R8]]
	; ALL: or $[[R16:[0-9]+]], $[[R15]], $[[R12]]			; ALL: or $[[R16:[0-9]+]], $[[R15]], $[[R12]]
	; ALL: sc $[[R16]], 0($[[R2]])			; ALL: sc $[[R16]], 0($[[R2]])
	; NOT-MICROMIPS: beqz $[[R16]], $[[BB0]]			; NOT-MICROMIPS: beqz $[[R16]], $[[BB0]]
	; MICROMIPS: beqzc $[[R16]], $[[BB0]]			; MICROMIPS: beqzc $[[R16]], $[[BB0]]
				; MIPSR6: beqzc $[[R16]], $[[BB0]]

	; ALL: $[[BB1]]:			; ALL: $[[BB1]]:
	; ALL: srlv $[[R17:[0-9]+]], $[[R14]], $[[R5]]			; ALL: srlv $[[R17:[0-9]+]], $[[R14]], $[[R5]]

	; NO-SEB-SEH: sll $[[R18:[0-9]+]], $[[R17]], 24			; NO-SEB-SEH: sll $[[R18:[0-9]+]], $[[R17]], 24
	; NO-SEB-SEH: sra $2, $[[R18]], 24			; NO-SEB-SEH: sra $2, $[[R18]], 24

	; HAS-SEB-SEH: seb $2, $[[R17]]			; HAS-SEB-SEH: seb $2, $[[R17]]
	Show All 18 Lines
	; ALL: andi $[[R9:[0-9]+]], $5, 255			; ALL: andi $[[R9:[0-9]+]], $5, 255
	; ALL: sllv $[[R10:[0-9]+]], $[[R9]], $[[R5]]			; ALL: sllv $[[R10:[0-9]+]], $[[R9]], $[[R5]]
	; ALL: andi $[[R11:[0-9]+]], $6, 255			; ALL: andi $[[R11:[0-9]+]], $6, 255
	; ALL: sllv $[[R12:[0-9]+]], $[[R11]], $[[R5]]			; ALL: sllv $[[R12:[0-9]+]], $[[R11]], $[[R5]]

	; ALL: $[[BB0:[A-Z_0-9]+]]:			; ALL: $[[BB0:[A-Z_0-9]+]]:
	; ALL: ll $[[R13:[0-9]+]], 0($[[R2]])			; ALL: ll $[[R13:[0-9]+]], 0($[[R2]])
	; ALL: and $[[R14:[0-9]+]], $[[R13]], $[[R7]]			; ALL: and $[[R14:[0-9]+]], $[[R13]], $[[R7]]
	; ALL: bne $[[R14]], $[[R10]], $[[BB1:[A-Z_0-9]+]]			; NOT-MICROMIPS: bne $[[R14]], $[[R10]], $[[BB1:[A-Z_0-9]+]]
				; MICROMIPS: bne $[[R14]], $[[R10]], $[[BB1:[A-Z_0-9]+]]
				; MIPSR6: bnec $[[R14]], $[[R10]], $[[BB1:[A-Z_0-9]+]]

	; ALL: and $[[R15:[0-9]+]], $[[R13]], $[[R8]]			; ALL: and $[[R15:[0-9]+]], $[[R13]], $[[R8]]
	; ALL: or $[[R16:[0-9]+]], $[[R15]], $[[R12]]			; ALL: or $[[R16:[0-9]+]], $[[R15]], $[[R12]]
	; ALL: sc $[[R16]], 0($[[R2]])			; ALL: sc $[[R16]], 0($[[R2]])
	; NOT-MICROMIPS: beqz $[[R16]], $[[BB0]]			; NOT-MICROMIPS: beqz $[[R16]], $[[BB0]]
	; MICROMIPS: beqzc $[[R16]], $[[BB0]]			; MICROMIPS: beqzc $[[R16]], $[[BB0]]
				; MIPSR6: beqzc $[[R16]], $[[BB0]]

	; ALL: $[[BB1]]:			; ALL: $[[BB1]]:
	; ALL: srlv $[[R17:[0-9]+]], $[[R14]], $[[R5]]			; ALL: srlv $[[R17:[0-9]+]], $[[R14]], $[[R5]]

	; NO-SEB-SEH: sll $[[R18:[0-9]+]], $[[R17]], 24			; NO-SEB-SEH: sll $[[R18:[0-9]+]], $[[R17]], 24
	; NO-SEB-SEH: sra $[[R19:[0-9]+]], $[[R18]], 24			; NO-SEB-SEH: sra $[[R19:[0-9]+]], $[[R18]], 24

	; HAS-SEB-SEH: seb $[[R19:[0-9]+]], $[[R17]]			; HAS-SEB-SEH: seb $[[R19:[0-9]+]], $[[R17]]
	Show All 30 Lines
	; ALL: ll $[[R10:[0-9]+]], 0($[[R2]])			; ALL: ll $[[R10:[0-9]+]], 0($[[R2]])
	; ALL: addu $[[R11:[0-9]+]], $[[R10]], $[[R9]]			; ALL: addu $[[R11:[0-9]+]], $[[R10]], $[[R9]]
	; ALL: and $[[R12:[0-9]+]], $[[R11]], $[[R7]]			; ALL: and $[[R12:[0-9]+]], $[[R11]], $[[R7]]
	; ALL: and $[[R13:[0-9]+]], $[[R10]], $[[R8]]			; ALL: and $[[R13:[0-9]+]], $[[R10]], $[[R8]]
	; ALL: or $[[R14:[0-9]+]], $[[R13]], $[[R12]]			; ALL: or $[[R14:[0-9]+]], $[[R13]], $[[R12]]
	; ALL: sc $[[R14]], 0($[[R2]])			; ALL: sc $[[R14]], 0($[[R2]])
	; NOT-MICROMIPS: beqz $[[R14]], $[[BB0]]			; NOT-MICROMIPS: beqz $[[R14]], $[[BB0]]
	; MICROMIPS: beqzc $[[R14]], $[[BB0]]			; MICROMIPS: beqzc $[[R14]], $[[BB0]]
				; MIPSR6: beqzc $[[R14]], $[[BB0]]

	; ALL: and $[[R15:[0-9]+]], $[[R10]], $[[R7]]			; ALL: and $[[R15:[0-9]+]], $[[R10]], $[[R7]]
	; ALL: srlv $[[R16:[0-9]+]], $[[R15]], $[[R5]]			; ALL: srlv $[[R16:[0-9]+]], $[[R15]], $[[R5]]

	; NO-SEB-SEH: sll $[[R17:[0-9]+]], $[[R16]], 16			; NO-SEB-SEH: sll $[[R17:[0-9]+]], $[[R16]], 16
	; NO-SEB-SEH: sra $2, $[[R17]], 16			; NO-SEB-SEH: sra $2, $[[R17]], 16

	; MIPS32R2: seh $2, $[[R16]]			; MIPS32R2: seh $2, $[[R16]]
	Show All 48 Lines

	; ALL: addiu $[[PTR:[0-9]+]], $[[R0]], 1024			; ALL: addiu $[[PTR:[0-9]+]], $[[R0]], 1024
	; ALL: $[[BB0:[A-Z_0-9]+]]:			; ALL: $[[BB0:[A-Z_0-9]+]]:
	; ALL: ll $[[R1:[0-9]+]], 0($[[PTR]])			; ALL: ll $[[R1:[0-9]+]], 0($[[PTR]])
	; ALL: addu $[[R2:[0-9]+]], $[[R1]], $4			; ALL: addu $[[R2:[0-9]+]], $[[R1]], $4
	; ALL: sc $[[R2]], 0($[[PTR]])			; ALL: sc $[[R2]], 0($[[PTR]])
	; NOT-MICROMIPS: beqz $[[R2]], $[[BB0]]			; NOT-MICROMIPS: beqz $[[R2]], $[[BB0]]
	; MICROMIPS: beqzc $[[R2]], $[[BB0]]			; MICROMIPS: beqzc $[[R2]], $[[BB0]]
				; MIPSR6: beqzc $[[R2]], $[[BB0]]
	}			}

test/CodeGen/Mips/compact-branches.ll

This file was added.

				; RUN: llc -march=mipsel -mcpu=mips32r6 -relocation-model=static < %s | FileCheck %s

				vkalintiris: Although it isn't absolutely necessary, it would be better to have one function per instruction check, i.e. one function that tests bnec, another one for bgezc, etc.
				; Function Attrs: nounwind
				define void @l() {
				entry:
				%call = tail call i32 @k()
				%call1 = tail call i32 @j()
				%cmp = icmp eq i32 %call, %call1
				; CHECK: bnec
				br i1 %cmp, label %if.then, label %if.end

				if.then: ; preds = %entry:
				; CHECK: nop
				; CHECK: jal
				tail call void @f(i32 signext -2)
				br label %if.end

				if.end: ; preds = %if.then, %entry
				ret void
				}

				declare i32 @k()

				declare i32 @j()

				declare void @f(i32 signext)

				; Function Attrs: nounwind
				define void @l2() {
				entry:
				%call = tail call i32 @k()
				%call1 = tail call i32 @i()
				%cmp = icmp eq i32 %call, %call1
				; CHECK: beqc
				br i1 %cmp, label %if.end, label %if.then

				if.then: ; preds = %entry:
				; CHECK: nop
				; CHECK: jal
				tail call void @f(i32 signext -1)
				br label %if.end

				if.end: ; preds = %entry, %if.then
				ret void
				}

				declare i32 @i()

				; Function Attrs: nounwind
				define void @l3() {
				entry:
				%call = tail call i32 @k()
				%cmp = icmp slt i32 %call, 0
				; CHECK: bgez
				br i1 %cmp, label %if.then, label %if.end

				if.then: ; preds = %entry:
				; CHECK: nop
				; CHECK: jal
				tail call void @f(i32 signext 0)
				br label %if.end

				if.end: ; preds = %if.then, %entry
				ret void
				}

				; Function Attrs: nounwind
				define void @l4() {
				entry:
				%call = tail call i32 @k()
				%cmp = icmp slt i32 %call, 1
				; CHECK: bgtzc
				br i1 %cmp, label %if.then, label %if.end

				if.then: ; preds = %entry:
				; CHECK: nop
				; CHECK: jal
				tail call void @f(i32 signext 1)
				br label %if.end

				if.end: ; preds = %if.then, %entry
				ret void
				}

				; Function Attrs: nounwind
				define void @l5() {
				entry:
				%call = tail call i32 @k()
				%cmp = icmp sgt i32 %call, 0
				; CHECK: blezc
				br i1 %cmp, label %if.then, label %if.end

				if.then: ; preds = %entry:
				; CHECK: nop
				; CHECK: jal
				tail call void @f(i32 signext 2)
				br label %if.end

				if.end: ; preds = %if.then, %entry
				ret void
				}

				; Function Attrs: nounwind
				define void @l6() {
				entry:
				%call = tail call i32 @k()
				%cmp = icmp sgt i32 %call, -1
				; CHECK: bltzc
				br i1 %cmp, label %if.then, label %if.end

				if.then: ; preds = %entry:
				; CHECK: nop
				; CHECK: jal
				tail call void @f(i32 signext 3)
				br label %if.end

				if.end: ; preds = %if.then, %entry
				ret void
				}

				; Function Attrs: nounwind
				define void @l7() {
				entry:
				%call = tail call i32 @k()
				%cmp = icmp eq i32 %call, 0
				; CHECK: bnezc
				br i1 %cmp, label %if.then, label %if.end

				if.then: ; preds = %entry:
				; CHECK: nop
				; CHECK: jal
				tail call void @f(i32 signext 4)
				br label %if.end

				if.end: ; preds = %if.then, %entry
				ret void
				}

				; Function Attrs: nounwind
				define void @l8() {
				entry:
				%call = tail call i32 @k()
				%cmp = icmp eq i32 %call, 0
				; CHECK: beqzc
				br i1 %cmp, label %if.end, label %if.then

				if.then: ; preds = %entry:
				; CHECK: nop
				; CHECK: jal
				tail call void @f(i32 signext 5)
				br label %if.end

				if.end: ; preds = %entry, %if.then
				ret void
				}
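
A minimal sketch of the per-instruction layout suggested in the inline comment on this file. It is illustrative only and not part of the patch: the function name `test_bltzc` is hypothetical, the body is copied from `@l6` above, and `CHECK-LABEL` is used on the assumption that each function should be matched in isolation so that a branch emitted for one function cannot satisfy a check written for another.

```llvm
; RUN: llc -march=mipsel -mcpu=mips32r6 -relocation-model=static < %s | FileCheck %s

declare i32 @k()

declare void @f(i32 signext)

; Hypothetical single-instruction test: only bltzc is checked here.
define void @test_bltzc() {
; CHECK-LABEL: test_bltzc:
; CHECK: bltzc
entry:
  %call = tail call i32 @k()
  %cmp = icmp sgt i32 %call, -1
  br i1 %cmp, label %if.then, label %if.end

if.then:                                          ; preds = %entry
  tail call void @f(i32 signext 3)
  br label %if.end

if.end:                                           ; preds = %if.then, %entry
  ret void
}
```

The same pattern would be repeated with one function per branch (bnec, beqc, bgezc, and so on), each guarded by its own `CHECK-LABEL` line.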

test/CodeGen/Mips/fcmp.ll

	Show All 744 Lines
	; 32-C-DAG: bc1t			; 32-C-DAG: bc1t

	; 32-CMP-DAG: add.s $[[T0:f[0-9]+]], $f14, $f12			; 32-CMP-DAG: add.s $[[T0:f[0-9]+]], $f14, $f12
	; 32-CMP-DAG: lwc1 $[[T1:f[0-9]+]], %lo($CPI32_0)(			; 32-CMP-DAG: lwc1 $[[T1:f[0-9]+]], %lo($CPI32_0)(
	; 32-CMP-DAG: cmp.le.s $[[T2:f[0-9]+]], $[[T0]], $[[T1]]			; 32-CMP-DAG: cmp.le.s $[[T2:f[0-9]+]], $[[T0]], $[[T1]]
	; 32-CMP-DAG: mfc1 $[[T3:[0-9]+]], $[[T2]]			; 32-CMP-DAG: mfc1 $[[T3:[0-9]+]], $[[T2]]
	; FIXME: This instruction is redundant.			; FIXME: This instruction is redundant.
	; 32-CMP-DAG: andi $[[T4:[0-9]+]], $[[T3]], 1			; 32-CMP-DAG: andi $[[T4:[0-9]+]], $[[T3]], 1
	; 32-CMP-DAG: bnez $[[T4]],			; 32-CMP-DAG: bnezc $[[T4]],

	; 64-C-DAG: add.s $[[T0:f[0-9]+]], $f13, $f12			; 64-C-DAG: add.s $[[T0:f[0-9]+]], $f13, $f12
	; 64-C-DAG: lwc1 $[[T1:f[0-9]+]], %got_ofst($CPI32_0)(			; 64-C-DAG: lwc1 $[[T1:f[0-9]+]], %got_ofst($CPI32_0)(
	; 64-C-DAG: c.ole.s $[[T0]], $[[T1]]			; 64-C-DAG: c.ole.s $[[T0]], $[[T1]]
	; 64-C-DAG: bc1t			; 64-C-DAG: bc1t

	; 64-CMP-DAG: add.s $[[T0:f[0-9]+]], $f13, $f12			; 64-CMP-DAG: add.s $[[T0:f[0-9]+]], $f13, $f12
	; 64-CMP-DAG: lwc1 $[[T1:f[0-9]+]], %got_ofst($CPI32_0)(			; 64-CMP-DAG: lwc1 $[[T1:f[0-9]+]], %got_ofst($CPI32_0)(
	; 64-CMP-DAG: cmp.le.s $[[T2:f[0-9]+]], $[[T0]], $[[T1]]			; 64-CMP-DAG: cmp.le.s $[[T2:f[0-9]+]], $[[T0]], $[[T1]]
	; 64-CMP-DAG: mfc1 $[[T3:[0-9]+]], $[[T2]]			; 64-CMP-DAG: mfc1 $[[T3:[0-9]+]], $[[T2]]
	; FIXME: This instruction is redundant.			; FIXME: This instruction is redundant.
	; 64-CMP-DAG: andi $[[T4:[0-9]+]], $[[T3]], 1			; 64-CMP-DAG: andi $[[T4:[0-9]+]], $[[T3]], 1
	; 64-CMP-DAG: bnez $[[T4]],			; 64-CMP-DAG: bnezc $[[T4]],

	%add = fadd fast float %at, %angle			%add = fadd fast float %at, %angle
	%cmp = fcmp ogt float %add, 1.000000e+00			%cmp = fcmp ogt float %add, 1.000000e+00
	br i1 %cmp, label %if.then, label %if.end			br i1 %cmp, label %if.then, label %if.end

	if.then:			if.then:
	%sub = fadd fast float %add, -1.000000e+00			%sub = fadd fast float %add, -1.000000e+00
	br label %if.end			br label %if.end
	Show All 14 Lines
	; 32-C-DAG: bc1t			; 32-C-DAG: bc1t

	; 32-CMP-DAG: add.d $[[T0:f[0-9]+]], $f14, $f12			; 32-CMP-DAG: add.d $[[T0:f[0-9]+]], $f14, $f12
	; 32-CMP-DAG: ldc1 $[[T1:f[0-9]+]], %lo($CPI33_0)(			; 32-CMP-DAG: ldc1 $[[T1:f[0-9]+]], %lo($CPI33_0)(
	; 32-CMP-DAG: cmp.le.d $[[T2:f[0-9]+]], $[[T0]], $[[T1]]			; 32-CMP-DAG: cmp.le.d $[[T2:f[0-9]+]], $[[T0]], $[[T1]]
	; 32-CMP-DAG: mfc1 $[[T3:[0-9]+]], $[[T2]]			; 32-CMP-DAG: mfc1 $[[T3:[0-9]+]], $[[T2]]
	; FIXME: This instruction is redundant.			; FIXME: This instruction is redundant.
	; 32-CMP-DAG: andi $[[T4:[0-9]+]], $[[T3]], 1			; 32-CMP-DAG: andi $[[T4:[0-9]+]], $[[T3]], 1
	; 32-CMP-DAG: bnez $[[T4]],			; 32-CMP-DAG: bnezc $[[T4]],

	; 64-C-DAG: add.d $[[T0:f[0-9]+]], $f13, $f12			; 64-C-DAG: add.d $[[T0:f[0-9]+]], $f13, $f12
	; 64-C-DAG: ldc1 $[[T1:f[0-9]+]], %got_ofst($CPI33_0)(			; 64-C-DAG: ldc1 $[[T1:f[0-9]+]], %got_ofst($CPI33_0)(
	; 64-C-DAG: c.ole.d $[[T0]], $[[T1]]			; 64-C-DAG: c.ole.d $[[T0]], $[[T1]]
	; 64-C-DAG: bc1t			; 64-C-DAG: bc1t

	; 64-CMP-DAG: add.d $[[T0:f[0-9]+]], $f13, $f12			; 64-CMP-DAG: add.d $[[T0:f[0-9]+]], $f13, $f12
	; 64-CMP-DAG: ldc1 $[[T1:f[0-9]+]], %got_ofst($CPI33_0)(			; 64-CMP-DAG: ldc1 $[[T1:f[0-9]+]], %got_ofst($CPI33_0)(
	; 64-CMP-DAG: cmp.le.d $[[T2:f[0-9]+]], $[[T0]], $[[T1]]			; 64-CMP-DAG: cmp.le.d $[[T2:f[0-9]+]], $[[T0]], $[[T1]]
	; 64-CMP-DAG: mfc1 $[[T3:[0-9]+]], $[[T2]]			; 64-CMP-DAG: mfc1 $[[T3:[0-9]+]], $[[T2]]
	; FIXME: This instruction is redundant.			; FIXME: This instruction is redundant.
	; 64-CMP-DAG: andi $[[T4:[0-9]+]], $[[T3]], 1			; 64-CMP-DAG: andi $[[T4:[0-9]+]], $[[T3]], 1
	; 64-CMP-DAG: bnez $[[T4]],			; 64-CMP-DAG: bnezc $[[T4]],

	%add = fadd fast double %at, %angle			%add = fadd fast double %at, %angle
	%cmp = fcmp ogt double %add, 1.000000e+00			%cmp = fcmp ogt double %add, 1.000000e+00
	br i1 %cmp, label %if.then, label %if.end			br i1 %cmp, label %if.then, label %if.end

	if.then:			if.then:
	%sub = fadd fast double %add, -1.000000e+00			%sub = fadd fast double %add, -1.000000e+00
	br label %if.end			br label %if.end

	if.end:			if.end:
	%theta.0 = phi double [ %sub, %if.then ], [ %add, %entry ]			%theta.0 = phi double [ %sub, %if.then ], [ %add, %entry ]
	ret double %theta.0			ret double %theta.0
	}			}

	attributes #0 = { nounwind readnone "no-nans-fp-math"="true" }			attributes #0 = { nounwind readnone "no-nans-fp-math"="true" }

test/CodeGen/Mips/fpbr.ll

	Show All 12 Lines
	; 64-FCC: c.eq.s $f12, $f13			; 64-FCC: c.eq.s $f12, $f13
	; FCC: bc1f $BB0_2			; FCC: bc1f $BB0_2

	; 32-GPR: cmp.eq.s $[[FGRCC:f[0-9]+]], $f12, $f14			; 32-GPR: cmp.eq.s $[[FGRCC:f[0-9]+]], $f12, $f14
	; 64-GPR: cmp.eq.s $[[FGRCC:f[0-9]+]], $f12, $f13			; 64-GPR: cmp.eq.s $[[FGRCC:f[0-9]+]], $f12, $f13
	; GPR: mfc1 $[[GPRCC:[0-9]+]], $[[FGRCC:f[0-9]+]]			; GPR: mfc1 $[[GPRCC:[0-9]+]], $[[FGRCC:f[0-9]+]]
	; FIXME: We ought to be able to transform not+bnez -> beqz			; FIXME: We ought to be able to transform not+bnez -> beqz
	; GPR: not $[[GPRCC]], $[[GPRCC]]			; GPR: not $[[GPRCC]], $[[GPRCC]]
	; GPR: bnez $[[GPRCC]], $BB0_2			; 32-GPR: bnez $[[GPRCC]], $BB0_2
				; 64-GPR: bnezc $[[GPRCC]], $BB0_2

	%cmp = fcmp oeq float %f2, %f3			%cmp = fcmp oeq float %f2, %f3
	br i1 %cmp, label %if.then, label %if.else			br i1 %cmp, label %if.then, label %if.else

	if.then: ; preds = %entry			if.then: ; preds = %entry
	tail call void (...) @g0() nounwind			tail call void (...) @g0() nounwind
	br label %if.end			br label %if.end

	Show All 16 Lines
	; 32-FCC: c.olt.s $f12, $f14			; 32-FCC: c.olt.s $f12, $f14
	; 64-FCC: c.olt.s $f12, $f13			; 64-FCC: c.olt.s $f12, $f13
	; FCC: bc1f $BB1_2			; FCC: bc1f $BB1_2

	; 32-GPR: cmp.ule.s $[[FGRCC:f[0-9]+]], $f14, $f12			; 32-GPR: cmp.ule.s $[[FGRCC:f[0-9]+]], $f14, $f12
	; 64-GPR: cmp.ule.s $[[FGRCC:f[0-9]+]], $f13, $f12			; 64-GPR: cmp.ule.s $[[FGRCC:f[0-9]+]], $f13, $f12
	; GPR: mfc1 $[[GPRCC:[0-9]+]], $[[FGRCC:f[0-9]+]]			; GPR: mfc1 $[[GPRCC:[0-9]+]], $[[FGRCC:f[0-9]+]]
	; GPR-NOT: not $[[GPRCC]], $[[GPRCC]]			; GPR-NOT: not $[[GPRCC]], $[[GPRCC]]
	; GPR: bnez $[[GPRCC]], $BB1_2			; 32-GPR: bnez $[[GPRCC]], $BB1_2
				; 64-GPR: bnezc $[[GPRCC]], $BB1_2

	%cmp = fcmp olt float %f2, %f3			%cmp = fcmp olt float %f2, %f3
	br i1 %cmp, label %if.then, label %if.else			br i1 %cmp, label %if.then, label %if.else

	if.then: ; preds = %entry			if.then: ; preds = %entry
	tail call void (...) @g0() nounwind			tail call void (...) @g0() nounwind
	br label %if.end			br label %if.end

	Show All 12 Lines
	; 32-FCC: c.ole.s $f12, $f14			; 32-FCC: c.ole.s $f12, $f14
	; 64-FCC: c.ole.s $f12, $f13			; 64-FCC: c.ole.s $f12, $f13
	; FCC: bc1t $BB2_2			; FCC: bc1t $BB2_2

	; 32-GPR: cmp.ult.s $[[FGRCC:f[0-9]+]], $f14, $f12			; 32-GPR: cmp.ult.s $[[FGRCC:f[0-9]+]], $f14, $f12
	; 64-GPR: cmp.ult.s $[[FGRCC:f[0-9]+]], $f13, $f12			; 64-GPR: cmp.ult.s $[[FGRCC:f[0-9]+]], $f13, $f12
	; GPR: mfc1 $[[GPRCC:[0-9]+]], $[[FGRCC:f[0-9]+]]			; GPR: mfc1 $[[GPRCC:[0-9]+]], $[[FGRCC:f[0-9]+]]
	; GPR-NOT: not $[[GPRCC]], $[[GPRCC]]			; GPR-NOT: not $[[GPRCC]], $[[GPRCC]]
	; GPR: beqz $[[GPRCC]], $BB2_2			; 32-GPR: beqz $[[GPRCC]], $BB2_2
				; 64-GPR: beqzc $[[GPRCC]], $BB2_2

	%cmp = fcmp ugt float %f2, %f3			%cmp = fcmp ugt float %f2, %f3
	br i1 %cmp, label %if.else, label %if.then			br i1 %cmp, label %if.else, label %if.then

	if.then: ; preds = %entry			if.then: ; preds = %entry
	tail call void (...) @g0() nounwind			tail call void (...) @g0() nounwind
	br label %if.end			br label %if.end

	Show All 13 Lines
	; 64-FCC: c.eq.d $f12, $f13			; 64-FCC: c.eq.d $f12, $f13
	; FCC: bc1f $BB3_2			; FCC: bc1f $BB3_2

	; 32-GPR: cmp.eq.d $[[FGRCC:f[0-9]+]], $f12, $f14			; 32-GPR: cmp.eq.d $[[FGRCC:f[0-9]+]], $f12, $f14
	; 64-GPR: cmp.eq.d $[[FGRCC:f[0-9]+]], $f12, $f13			; 64-GPR: cmp.eq.d $[[FGRCC:f[0-9]+]], $f12, $f13
	; GPR: mfc1 $[[GPRCC:[0-9]+]], $[[FGRCC:f[0-9]+]]			; GPR: mfc1 $[[GPRCC:[0-9]+]], $[[FGRCC:f[0-9]+]]
	; FIXME: We ought to be able to transform not+bnez -> beqz			; FIXME: We ought to be able to transform not+bnez -> beqz
	; GPR: not $[[GPRCC]], $[[GPRCC]]			; GPR: not $[[GPRCC]], $[[GPRCC]]
	; GPR: bnez $[[GPRCC]], $BB3_2			; 32-GPR: bnez $[[GPRCC]], $BB3_2
				; 64-GPR: bnezc $[[GPRCC]], $BB3_2

	%cmp = fcmp oeq double %f2, %f3			%cmp = fcmp oeq double %f2, %f3
	br i1 %cmp, label %if.then, label %if.else			br i1 %cmp, label %if.then, label %if.else

	if.then: ; preds = %entry			if.then: ; preds = %entry
	tail call void (...) @g0() nounwind			tail call void (...) @g0() nounwind
	br label %if.end			br label %if.end

	Show All 12 Lines
	; 32-FCC: c.olt.d $f12, $f14			; 32-FCC: c.olt.d $f12, $f14
	; 64-FCC: c.olt.d $f12, $f13			; 64-FCC: c.olt.d $f12, $f13
	; FCC: bc1f $BB4_2			; FCC: bc1f $BB4_2

	; 32-GPR: cmp.ule.d $[[FGRCC:f[0-9]+]], $f14, $f12			; 32-GPR: cmp.ule.d $[[FGRCC:f[0-9]+]], $f14, $f12
	; 64-GPR: cmp.ule.d $[[FGRCC:f[0-9]+]], $f13, $f12			; 64-GPR: cmp.ule.d $[[FGRCC:f[0-9]+]], $f13, $f12
	; GPR: mfc1 $[[GPRCC:[0-9]+]], $[[FGRCC:f[0-9]+]]			; GPR: mfc1 $[[GPRCC:[0-9]+]], $[[FGRCC:f[0-9]+]]
	; GPR-NOT: not $[[GPRCC]], $[[GPRCC]]			; GPR-NOT: not $[[GPRCC]], $[[GPRCC]]
	; GPR: bnez $[[GPRCC]], $BB4_2			; 32-GPR: bnez $[[GPRCC]], $BB4_2
				; 64-GPR: bnezc $[[GPRCC]], $BB4_2

	%cmp = fcmp olt double %f2, %f3			%cmp = fcmp olt double %f2, %f3
	br i1 %cmp, label %if.then, label %if.else			br i1 %cmp, label %if.then, label %if.else

	if.then: ; preds = %entry			if.then: ; preds = %entry
	tail call void (...) @g0() nounwind			tail call void (...) @g0() nounwind
	br label %if.end			br label %if.end

	Show All 12 Lines
	; 32-FCC: c.ole.d $f12, $f14			; 32-FCC: c.ole.d $f12, $f14
	; 64-FCC: c.ole.d $f12, $f13			; 64-FCC: c.ole.d $f12, $f13
	; FCC: bc1t $BB5_2			; FCC: bc1t $BB5_2

	; 32-GPR: cmp.ult.d $[[FGRCC:f[0-9]+]], $f14, $f12			; 32-GPR: cmp.ult.d $[[FGRCC:f[0-9]+]], $f14, $f12
	; 64-GPR: cmp.ult.d $[[FGRCC:f[0-9]+]], $f13, $f12			; 64-GPR: cmp.ult.d $[[FGRCC:f[0-9]+]], $f13, $f12
	; GPR: mfc1 $[[GPRCC:[0-9]+]], $[[FGRCC:f[0-9]+]]			; GPR: mfc1 $[[GPRCC:[0-9]+]], $[[FGRCC:f[0-9]+]]
	; GPR-NOT: not $[[GPRCC]], $[[GPRCC]]			; GPR-NOT: not $[[GPRCC]], $[[GPRCC]]
	; GPR: beqz $[[GPRCC]], $BB5_2			; 32-GPR: beqz $[[GPRCC]], $BB5_2
				; 64-GPR: beqzc $[[GPRCC]], $BB5_2

	%cmp = fcmp ugt double %f2, %f3			%cmp = fcmp ugt double %f2, %f3
	br i1 %cmp, label %if.else, label %if.then			br i1 %cmp, label %if.else, label %if.then

	if.then: ; preds = %entry			if.then: ; preds = %entry
	tail call void (...) @g0() nounwind			tail call void (...) @g0() nounwind
	br label %if.end			br label %if.end

	if.else: ; preds = %entry			if.else: ; preds = %entry
	tail call void (...) @g1() nounwind			tail call void (...) @g1() nounwind
	br label %if.end			br label %if.end

	if.end: ; preds = %if.else, %if.then			if.end: ; preds = %if.else, %if.then
	ret void			ret void
	}			}