This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/
-
lib/Target/RISCV/
-
Target/
-
RISCV/
-
RISCVInstrInfo.h
6/17
RISCVInstrInfo.cpp
-
RISCVTargetMachine.cpp
-
test/CodeGen/RISCV/
-
CodeGen/
-
RISCV/
3/9
machineoutliner.mir

Differential D66210

[RISCV] Enable the machine outliner for RISC-V
ClosedPublic

Authored by lewis-revill on Aug 14 2019, 3:53 AM.

Download Raw Diff

Details

Reviewers

asb
lenary
edward-jones
paquette
efriedma
simoncook
luismarques

Commits

rGa116f28a0d71: [RISCV] Enable the machine outliner for RISC-V

Summary

This patch enables the machine outliner for RISC-V and adds the necessary logic for checking whether sequences can be safely outlined, and describing how they should be outlined. Outlined functions are called using the register t0 (x5) as the return address register, which must be available for an occurrence of a sequence to be safely outlined.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

lewis-revill created this revision.Aug 14 2019, 3:53 AM

Herald added a project: Restricted Project. · View Herald TranscriptAug 14 2019, 3:53 AM

Herald added subscribers: llvm-commits, s.egerton, Jim and 20 others. · View Herald Transcript

lewis-revill mentioned this in D62686: [RISCV] Add support for save/restore of callee-saved registers via libcalls.Aug 14 2019, 3:57 AM

luismarques added a reviewer: luismarques.Aug 16 2019, 8:42 AM

luismarques requested changes to this revision.Aug 16 2019, 10:36 AM

luismarques added inline comments.

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
471	AArch64 has more checks, some of which seem like could be relevant for us. Checking for section markings, link once, etc.
481	If we are only going to support one possible register for now, shouldn't it be the one least likely to already be in use? Wouldn't that be t6 (x31)?
496	This might be pessimistic when the instructions are compressible. Suddenly the outlined amount vs the overhead math might not quite be realistic. I think it's worth devising a test to try to explore how much of a problem that is, and if we can compensate for it by, say, tweaking the overhead numbers, or determining if the instruction is likely to be compressible, or being conservative regarding the instruction size when targeting RVC.

This revision now requires changes to proceed.Aug 16 2019, 10:36 AM

efriedma added inline comments.Aug 16 2019, 1:37 PM

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
496	Wouldn't you want getInstSizeInBytes to account for compression for other reasons, anyway? For example, to avoid triggering branch relaxation when it isn't necessary.
503	RISCV::PseudoRET = 8 bytes. I don't understand the math here.

luismarques added inline comments.Aug 16 2019, 3:58 PM

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
496	@efriedma I was assuming it wouldn't be possible to make `getInstSizeInBytes` fully account for compression due to relaxations and/or compression being done by the linker, but if that is possible (or at least a good enough approximation) then that sounds like a good solution.

lewis-revill marked 2 inline comments as done.Aug 19 2019, 4:48 AM

lewis-revill added inline comments.

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
481	That's a sensible suggestion, I was using t0 to match how the save/restore libcalls behave so I presumed there was a good reason for using t0. Don't we also need to think about RV32E here though?
503	Good catch this is definitely an error.

Herald added a subscriber: pzheng. · View Herald TranscriptAug 19 2019, 4:48 AM

luismarques marked an inline comment as done.Aug 19 2019, 5:02 AM

luismarques added inline comments.

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
481	Good point about RV32E. I guess you can either always use `t2` or check the target and use `t6` when available, falling back to `t2` otherwise. If that's not trivial it might be worth checking how hard it would be to dynamically choose the register, like AArch64 does IIRC.

paquette added inline comments.Aug 20 2019, 9:07 AM

llvm/test/CodeGen/RISCV/machineoutliner.ll
1 ↗	(On Diff #215075)	I recommend writing a .mir test for this instead of a .ll test. (e.g, use -stop-before=machine-outliner -simplify-mir) That would make the test resilient against other code generation changes, and make it easier to test the instruction sequences you want to test.

efriedma added inline comments.Aug 20 2019, 12:36 PM

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
496	The linker only changes the size of calls and global address computation, currently. I mean, I guess it's theoretically possible for it to compress other branches, but that isn't implemented, as far as I know. So it should be possible to accurately compute the size of almost all compressible instructions. LLVM currently doesn't do any RISCV compression computation before the assembler runs, but it was only implemented that way to simplify the implementation; it could be done earlier, if it would be useful.

Rebased and addressed a couple of comments.

lewis-revill marked 3 inline comments as done.Aug 22 2019, 2:43 AM

lewis-revill added inline comments.

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
471	I've added the basic checks here, I will look into adding tests for this.
llvm/test/CodeGen/RISCV/machineoutliner.ll
1 ↗	(On Diff #215075)	I haven't used the tool in this way before, I've been trying to replicate the command used in the AArch64 run line, but I'm still getting 'error: expected top-level entity' when running the check.

efriedma added inline comments.Aug 22 2019, 2:06 PM

llvm/test/CodeGen/RISCV/machineoutliner.ll
1 ↗	(On Diff #215075)	"expected top-level entity" comes from the IR parser, so your testcase probably isn't getting treated as MIR at all. MIR testcases have to end with a ".mir" extension (or you can use "-x mir").

Updated the test to an MIR-based test, and added checks for the cases where outlining cannot be done.

paquette added inline comments.Sep 3 2019, 10:17 AM

llvm/test/CodeGen/RISCV/machineoutliner.mir
1–2	You probably want -verify-machineinstrs on each of these.
7–11	I don't think you actually need the body of the IR functions here. You can make each function void, and just return: e.g. define i32 @outline_0(i32 %a, i32 %b) #0 { ret void } If it weren't for the section checks, you wouldn't need the IR at all. It only needs to be there when you look back at the IR in some way in the pass.
59	I'm pretty sure you can remove this.
61	This isn't checked/used by the outliner, so you can remove it from the IR.
76	If you do remove the bulk of the IR code, you'll need to rename the BB labels to `bb.0`.
81	It would be good to check the contents of the outlined functions as well. E.g. OUTLINED_FUNCTION_0 should contain the ORI/ADDI/AND sequence. (It's possible that utils/update_mir_test_checks.py can generate all of the checks you want here, but I'm not sure if it can handle functions being created.)

I've been testing this on the Embench set of benchmarks (https://github.com/embench/embench-iot), and it appears that some programs cause execution failures with this enabled, which is a little concerning... I'm going to look into this and then also update the tests.

Don't allow return statements in outlined sequences until tail calls to outlined functions are properly implemented. Improve logic for checking if X5 is available for use at the insertion point of an outlined sequence, and update the test to be pure MIR.

lewis-revill marked 3 inline comments as done.Sep 11 2019, 9:28 AM

lewis-revill added inline comments.

llvm/test/CodeGen/RISCV/machineoutliner.mir
1–2	I think there is some issue with the generation of MIR, at least for RISC-V... If I do IR -> MIR -> Assembly with -verify-machineinstrs, I get the error: 'Function has NoVRegs property but there are VReg operands'. I don't get it for IR -> Assembly with -verify-machineinstrs, and I'm not at all sure what to do about it.
81	(It's possible that utils/update_mir_test_checks.py can generate all of the checks you want here, but I'm not sure if it can handle functions being created.) I'm not sure I can either! Inserting check lines at the end of the file causes parsing errors.

paquette added inline comments.Sep 11 2019, 11:08 AM

llvm/test/CodeGen/RISCV/machineoutliner.mir
1–2	The verifier should tell you which pass is causing this to happen, and in which function. What is probably happening is that it's because you're using `-enable-machine-outliner`. This is probably running passes you don't expect. Since this output was generated by `-stop-before=machine-outliner`, these passes are probably doing weird things. What you probably want is one of these: `-run-pass=machine-outliner`: Only run the outliner, and produce MIR output `-start-before=machine-outliner`: Run the outliner and all following passes, and produce assembly output I think that using `-run-pass` is usually better for these kinds of tests. The reason being that it localizes the test to a single pass, rather than several. The outliner is an extremely late pass though, so using `-start-before` is probably fine. If you use `-run-pass`, you'll have to rewrite your CHECK lines as MIR though.

Refactor the MIR test case again:

use -run-pass=machine-outliner to only test this pass (required removing the no-machineoutliner checks, they are redundant anyway)
add riscv64 checks for completeness?
add -verify-machineinstrs

Two small nits, other than that it's looking pretty good.

I presume you got to the bottom of the exception handling issues you saw, and they are now solved.

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
501	It would be useful to include the ABI name of x5 in this comment too.
503	Can you rename this to `CannotInsertCall`, I kept missing the `t` when reviewing, which inverts its meaning.

Updated to address comments.

LGTM. Please wait for a review from @asb before landing this.

We discussed this at the RISC-V meeting on 19 Sept 2019. @apazos says she is willing to test this on SPEC.

I have run a couple of standard workloads like SPEC.
There is no correctness issue when enabling the MO feature (except for spec2000/twolf which fails with/without MO).
The MO code size gains without compression are up to 7%. With compression enabled, most of the code size gain is gone, and I see more code size increase.
It is possible it has to do with the SequenceSize we are estimating.
The reason we enable compression late is to have all paths - coming from codegen (LLVM IR), parsing, assembling .s or inline asm - go through the same mechanism for compression.
This is similar/compatible with GCC behavior, which relies on the external assembler to implement compression.
We can better estimate SequenceSize by checking if an instruction is compressable. We can modify the tablegen backend for compression emitter to generate a function that does the isCompressable check, but instead of using MCInst and MCOperands for the checks, we need to use MachineInstr and MachineOperand types. I will try this solution. Another alternative is to compress LLVM IR code before running the machine outliner.

In D66210#1684230, @apazos wrote:

I have run a couple of standard workloads like SPEC.
There is no correctness issue when enabling the MO feature (except for spec2000/twolf which fails with/without MO).
The MO code size gains without compression are up to 7%. With compression enabled, most of the code size gain is gone, and I see more code size increase.
It is possible it has to do with the SequenceSize we are estimating.
The reason we enable compression late is to have all paths - coming from codegen (LLVM IR), parsing, assembling .s or inline asm - go through the same mechanism for compression.
This is similar/compatible with GCC behavior, which relies on the external assembler to implement compression.
We can better estimate SequenceSize by checking if an instruction is compressable. We can modify the tablegen backend for compression emitter to generate a function that does the isCompressable check, but instead of using MCInst and MCOperands for the checks, we need to use MachineInstr and MachineOperand types. I will try this solution. Another alternative is to compress LLVM IR code before running the machine outliner.

Hi Ana, thank you for this analysis, interestingly I still found significant benefits with compressed instructions enabled. It's worth noting that I was using Embench, which targets the embedded use case.

I'm not sure I quite follow what you mean when you say 'Another alternative is to compress LLVM IR code before running the machine outliner', But did you make any progress with the 'isCompressible' method?

Add missing bailout condition after potentially removing candidate locations.

Hi Lewis,
Here is the change: https://reviews.llvm.org/D68290
With this change I was able to remove the code size degradation I had observed.
Please try it out.

Lewis, this patch LGTM. You can go ahead and merge it.

To improve code size, I have updated the additional patch in https://reviews.llvm.org/D68290. No change is required in machine outliner code.

Rebased prior to commit; Will run this through the testsuite once more first.

sorear added a subscriber: sorear.Oct 9 2019, 2:54 AM

sorear added inline comments.

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
481	`ra` and `t0` (`x1` and `x5`) have special functionality in implementations with a return-address stack; `jr t6` will be treated as a general indirect branch, not a return, and is much more likely to mispredict. So this should probably prefer `t0` whenever possible.
528	Tangentially related, but in many of the cases where outlining makes sense, it would also make sense to generate 4-byte `jal t0, label` instructions; lld and ld.bfd would need to be taught to generate thunks for out of range `jal` (the Go linker already supports them).
532	Note that `jr <reg>` is a compressible instruction regardless of the register used.

In D66210#1699653, @lewis-revill wrote:

Rebased prior to commit; Will run this through the testsuite once more first.

I'm seeing some excess failures from this. However, the failures I've looked into so far seem to be timeouts from compilation being too slow. I'll hold off on committing until I've triaged all of them.

lewis-revill added inline comments.Oct 10 2019, 8:01 AM

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
532	Thanks, I'll try and incorporate this too.

I need to investigate some failures with this patch whereby sections of code which reference different temporary symbols still get outlined, with only one reference being used in the outlined function.

IE the following sequences should definitely not be outlined:

	lui	a0, %hi(.LCPI1_0)
	addi	a0, a0, %lo(.LCPI1_0)
        ...

...
	lui	a0, %hi(.LCPI2_0)
	addi	a0, a0, %lo(.LCPI2_0)
        ...

...
	lui	a0, %hi(.LCPI3_0)
	addi	a0, a0, %lo(.LCPI3_0)
        ...

Rebased. Added special case for size of jr t0 with compression enabled. Fixed failure in GCC testsuite.

The failure was caused by machine operands referencing constant pool indexes being outlined, even though these indexes are materialized as local labels. This was fixed by simply checking MO.isCPI().

Herald added a subscriber: sameer.abuasal. · View Herald TranscriptNov 1 2019, 1:32 PM

Rebased

Lewis, I tested the latest version of this patch (with and without https://reviews.llvm.org/D68290), I don't see any issue. LGTM.

I'm doing a final rebase and check on this before merging. I have a failure in the GCC testsuite which I'm triaging.

So the failure in the GCC testsuite occurs due to the following test:

/* PR rtl-optimization/48141 */
/* { dg-do compile } */
/* { dg-options "-O" } */

#define A i = 0;
#define B A A A A A A A A A A
#define C B B B B B B B B B B
#define D C C C C C C C C C C
#define E D D D D D D D D D D

int
foo (void)
{
  volatile int i = 0;
  E E E E E E E E E E E
  return 0;
}

Which generates a huge amount of work for the machine outliner algorithm to process. I get a segmentation fault when the machine outliner attempts to iterate over the tree to add suffix indices.

Obviously it would be nice for the compiler not to segfault and I'll carry on looking into this further. However I don't see that this indicates a problem with the RISC-V backend specific logic added by this patch, so if there's no objection I may go ahead with merging this later.

Rebased.

Can you open a bug for the machine outliner and maybe contact Jessica, may be she can help fix this quickly.

In D66210#1787974, @apazos wrote:

Can you open a bug for the machine outliner and maybe contact Jessica, may be she can help fix this quickly.

Sorry for the delay. Finding a testcase for X86 is proving extremely time consuming, although I have been able to prove to myself that the same situation is indeed possible. I just don't know enough about the X86 backend to invoke it.

So I think it is easier if I commit the RISC-V implementation, and we can then discuss how to approach the testcase which we already have.

Thanks Lewis for providing the test case. A bug can be opened for the machine outliner with that example. In my opinion you can go ahead and merge this patch.

This revision was not accepted when it landed; it landed in state Needs Review.Dec 19 2019, 8:43 AM

Closed by commit rGa116f28a0d71: [RISCV] Enable the machine outliner for RISC-V (authored by lewis-revill). · Explain Why

This revision was automatically updated to reflect the committed changes.

https://bugs.llvm.org/show_bug.cgi?id=44344

Pinging @paquette perhaps you have some insight into whether there's anything that can be done to avoid this issue (also discussed above).

• pcwang-thead mentioned this in D122634: [RISCV] Do not outline CFI instructions when they are needed in EH.Apr 12 2022, 11:31 PM

Revision Contents

Path

Size

llvm/

lib/

Target/

RISCV/

RISCVInstrInfo.h

30 lines

RISCVInstrInfo.cpp

137 lines

RISCVTargetMachine.cpp

3 lines

test/

CodeGen/

RISCV/

machineoutliner.mir

226 lines

Diff 218344

llvm/lib/Target/RISCV/RISCVInstrInfo.h

Show First 20 Lines • Show All 74 Lines • ▼ Show 20 Lines	public:
reverseBranchCondition(SmallVectorImpl<MachineOperand> &Cond) const override;		reverseBranchCondition(SmallVectorImpl<MachineOperand> &Cond) const override;

MachineBasicBlock *getBranchDestBlock(const MachineInstr &MI) const override;		MachineBasicBlock *getBranchDestBlock(const MachineInstr &MI) const override;

bool isBranchOffsetInRange(unsigned BranchOpc,		bool isBranchOffsetInRange(unsigned BranchOpc,
int64_t BrOffset) const override;		int64_t BrOffset) const override;

bool isAsCheapAsAMove(const MachineInstr &MI) const override;		bool isAsCheapAsAMove(const MachineInstr &MI) const override;

		// Return true if the function can safely be outlined from.
		virtual bool
		isFunctionSafeToOutlineFrom(MachineFunction &MF,
		bool OutlineFromLinkOnceODRs) const override;

		// Return true if MBB is safe to outline from, and return any target-specific
		// information in Flags.
		virtual bool isMBBSafeToOutlineFrom(MachineBasicBlock &MBB,
		unsigned &Flags) const override;

		// Calculate target-specific information for a set of outlining candidates.
		outliner::OutlinedFunction getOutliningCandidateInfo(
		std::vector<outliner::Candidate> &RepeatedSequenceLocs) const override;

		// Return if/how a given MachineInstr should be outlined.
		virtual outliner::InstrType
		getOutliningType(MachineBasicBlock::iterator &MBBI,
		unsigned Flags) const override;

		// Insert a custom frame for outlined functions.
		virtual void
		buildOutlinedFrame(MachineBasicBlock &MBB, MachineFunction &MF,
		const outliner::OutlinedFunction &OF) const override;

		// Insert a call to an outlined function into a given basic block.
		virtual MachineBasicBlock::iterator
		insertOutlinedCall(Module &M, MachineBasicBlock &MBB,
		MachineBasicBlock::iterator &It, MachineFunction &MF,
		const outliner::Candidate &C) const override;
};		};
}		}
#endif		#endif

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp

Show First 20 Lines • Show All 460 Lines • ▼ Show 20 Lines	default:
break;		break;
case RISCV::ADDI:		case RISCV::ADDI:
case RISCV::ORI:		case RISCV::ORI:
case RISCV::XORI:		case RISCV::XORI:
return (MI.getOperand(1).isReg() && MI.getOperand(1).getReg() == RISCV::X0);		return (MI.getOperand(1).isReg() && MI.getOperand(1).getReg() == RISCV::X0);
}		}
return MI.isAsCheapAsAMove();		return MI.isAsCheapAsAMove();
}		}

		bool RISCVInstrInfo::isFunctionSafeToOutlineFrom(
		MachineFunction &MF, bool OutlineFromLinkOnceODRs) const {
		luismarquesUnsubmitted Done Reply Inline Actions AArch64 has more checks, some of which seem like could be relevant for us. Checking for section markings, link once, etc. luismarques: AArch64 has more checks, some of which seem like could be relevant for us. Checking for section…
		lewis-revillAuthorUnsubmitted Done Reply Inline Actions I've added the basic checks here, I will look into adding tests for this. lewis-revill: I've added the basic checks here, I will look into adding tests for this.
		const Function &F = MF.getFunction();

		// Can F be deduplicated by the linker? If it can, don't outline from it.
		if (!OutlineFromLinkOnceODRs && F.hasLinkOnceODRLinkage())
		return false;

		// Don't outline from functions with section markings; the program could
		// expect that all the code is in the named section.
		if (F.hasSection())
		return false;
		luismarquesUnsubmitted Not Done Reply Inline Actions If we are only going to support one possible register for now, shouldn't it be the one least likely to already be in use? Wouldn't that be t6 (x31)? luismarques: If we are only going to support one possible register for now, shouldn't it be the one least…
		lewis-revillAuthorUnsubmitted Done Reply Inline Actions That's a sensible suggestion, I was using t0 to match how the save/restore libcalls behave so I presumed there was a good reason for using t0. Don't we also need to think about RV32E here though? lewis-revill: That's a sensible suggestion, I was using t0 to match how the save/restore libcalls behave so I…
		luismarquesUnsubmitted Done Reply Inline Actions Good point about RV32E. I guess you can either always use `t2` or check the target and use `t6` when available, falling back to `t2` otherwise. If that's not trivial it might be worth checking how hard it would be to dynamically choose the register, like AArch64 does IIRC. luismarques: Good point about RV32E. I guess you can either always use `t2` or check the target and use `t6`…
		sorearUnsubmitted Not Done Reply Inline Actions `ra` and `t0` (`x1` and `x5`) have special functionality in implementations with a return-address stack; `jr t6` will be treated as a general indirect branch, not a return, and is much more likely to mispredict. So this should probably prefer `t0` whenever possible. sorear: `ra` and `t0` (`x1` and `x5`) have [special functionality in implementations with a return…

		// It's safe to outline from MF.
		return true;
		}

		bool RISCVInstrInfo::isMBBSafeToOutlineFrom(MachineBasicBlock &MBB,
		unsigned &Flags) const {
		// Check if we will be able to insert a call through the X5 register.
		RegScavenger RS;
		RS.enterBasicBlock(MBB);
		return !RS.isRegUsed(RISCV::X5);
		}

		// Enum values indicating how an outlined call should be constructed.
		enum MachineOutlinerConstructionID {
		luismarquesUnsubmitted Not Done Reply Inline Actions This might be pessimistic when the instructions are compressible. Suddenly the outlined amount vs the overhead math might not quite be realistic. I think it's worth devising a test to try to explore how much of a problem that is, and if we can compensate for it by, say, tweaking the overhead numbers, or determining if the instruction is likely to be compressible, or being conservative regarding the instruction size when targeting RVC. luismarques: This might be pessimistic when the instructions are compressible. Suddenly the outlined amount…
		efriedmaUnsubmitted Not Done Reply Inline Actions Wouldn't you want getInstSizeInBytes to account for compression for other reasons, anyway? For example, to avoid triggering branch relaxation when it isn't necessary. efriedma: Wouldn't you want getInstSizeInBytes to account for compression for other reasons, anyway? For…
		luismarquesUnsubmitted Not Done Reply Inline Actions @efriedma I was assuming it wouldn't be possible to make `getInstSizeInBytes` fully account for compression due to relaxations and/or compression being done by the linker, but if that is possible (or at least a good enough approximation) then that sounds like a good solution. luismarques: @efriedma I was assuming it wouldn't be possible to make `getInstSizeInBytes` fully account for…
		efriedmaUnsubmitted Not Done Reply Inline Actions The linker only changes the size of calls and global address computation, currently. I mean, I guess it's theoretically possible for it to compress other branches, but that isn't implemented, as far as I know. So it should be possible to accurately compute the size of almost all compressible instructions. LLVM currently doesn't do any RISCV compression computation before the assembler runs, but it was only implemented that way to simplify the implementation; it could be done earlier, if it would be useful. efriedma: The linker only changes the size of calls and global address computation, currently. I mean, I…
		MachineOutlinerDefault
		};

		outliner::OutlinedFunction RISCVInstrInfo::getOutliningCandidateInfo(
		std::vector<outliner::Candidate> &RepeatedSequenceLocs) const {
		lenaryUnsubmitted Not Done Reply Inline Actions It would be useful to include the ABI name of x5 in this comment too. lenary: It would be useful to include the ABI name of x5 in this comment too.
		unsigned SequenceSize = 0;

		efriedmaUnsubmitted Done Reply Inline Actions RISCV::PseudoRET = 8 bytes. I don't understand the math here. efriedma: > RISCV::PseudoRET = 8 bytes. I don't understand the math here.
		lewis-revillAuthorUnsubmitted Done Reply Inline Actions Good catch this is definitely an error. lewis-revill: Good catch this is definitely an error.
		lenaryUnsubmitted Not Done Reply Inline Actions Can you rename this to `CannotInsertCall`, I kept missing the `t` when reviewing, which inverts its meaning. lenary: Can you rename this to `CannotInsertCall`, I kept missing the `t` when reviewing, which inverts…
		auto I = RepeatedSequenceLocs[0].front();
		auto E = std::next(RepeatedSequenceLocs[0].back());
		for (; I != E; ++I)
		SequenceSize += getInstSizeInBytes(*I);

		// RISCV::PseudoCALL = 8 bytes.
		unsigned CallOverhead = 8;
		for (auto &C : RepeatedSequenceLocs)
		C.setCallInfo(MachineOutlinerDefault, CallOverhead);

		// RISCV::PseudoRET = 4 bytes.
		unsigned FrameOverhead = 4;
		return outliner::OutlinedFunction(RepeatedSequenceLocs, SequenceSize,
		FrameOverhead, MachineOutlinerDefault);
		}

		outliner::InstrType
		RISCVInstrInfo::getOutliningType(MachineBasicBlock::iterator &MBBI,
		unsigned Flags) const {
		MachineInstr &MI = *MBBI;
		MachineBasicBlock *MBB = MI.getParent();
		const TargetRegisterInfo *TRI =
		MBB->getParent()->getSubtarget().getRegisterInfo();

		// Positions generally can't safely be outlined.
		sorearUnsubmitted Not Done Reply Inline Actions Tangentially related, but in many of the cases where outlining makes sense, it would also make sense to generate 4-byte `jal t0, label` instructions; lld and ld.bfd would need to be taught to generate thunks for out of range `jal` (the Go linker already supports them). sorear: Tangentially related, but in many of the cases where outlining makes sense, it would also make…
		if (MI.isPosition()) {
		// We can manually strip out CFI instructions later.
		if (MI.isCFIInstruction())
		return outliner::InstrType::Invisible;
		sorearUnsubmitted Not Done Reply Inline Actions Note that `jr <reg>` is a compressible instruction regardless of the register used. sorear: Note that `jr <reg>` is a compressible instruction regardless of the register used.
		lewis-revillAuthorUnsubmitted Not Done Reply Inline Actions Thanks, I'll try and incorporate this too. lewis-revill: Thanks, I'll try and incorporate this too.

		return outliner::InstrType::Illegal;
		}

		// Don't trust the user to write safe inline assembly.
		if (MI.isInlineAsm())
		return outliner::InstrType::Illegal;

		// We can't outline branches to other basic blocks.
		if (MI.isTerminator() && !MBB->succ_empty())
		return outliner::InstrType::Illegal;

		// Don't allow modifying the X5 register which we use for return addresses for
		// these outlined functions.
		if (MI.modifiesRegister(RISCV::X5, TRI) \|\|
		MI.getDesc().hasImplicitDefOfPhysReg(RISCV::X5))
		return outliner::InstrType::Illegal;

		// Make sure the operands don't reference something unsafe.
		for (const auto &MO : MI.operands())
		if (MO.isMBB() \|\| MO.isBlockAddress())
		return outliner::InstrType::Illegal;

		// Don't allow instructions which won't be materialized to impact outlining
		// analysis.
		if (MI.isMetaInstruction())
		return outliner::InstrType::Invisible;

		return outliner::InstrType::Legal;
		}

		void RISCVInstrInfo::buildOutlinedFrame(
		MachineBasicBlock &MBB, MachineFunction &MF,
		const outliner::OutlinedFunction &OF) const {

		// Strip out any CFI instructions, and any return statements.
		bool Changed = true;
		while (Changed) {
		Changed = false;
		auto I = MBB.begin();
		auto E = MBB.end();
		for (; I != E; ++I) {
		if (I->isCFIInstruction()) {
		I->removeFromParent();
		Changed = true;
		break;
		}
		if (I->isReturn()) {
		I->removeFromParent();
		Changed = true;
		break;
		}
		}
		}

		// Add in a return instruction to the end of the outlined frame.
		MBB.insert(MBB.end(), BuildMI(MF, DebugLoc(), get(RISCV::JALR))
		.addReg(RISCV::X0)
		.addReg(RISCV::X5)
		.addImm(0));
		}

		MachineBasicBlock::iterator RISCVInstrInfo::insertOutlinedCall(
		Module &M, MachineBasicBlock &MBB, MachineBasicBlock::iterator &It,
		MachineFunction &MF, const outliner::Candidate &C) const {

		// Add in a call instruction to the outlined function at the given location.
		It = MBB.insert(It,
		BuildMI(MF, DebugLoc(), get(RISCV::PseudoCALLReg), RISCV::X5)
		.addGlobalAddress(M.getNamedValue(MF.getName()), 0,
		RISCVII::MO_CALL));
		return It;
		}

llvm/lib/Target/RISCV/RISCVTargetMachine.cpp

Show First 20 Lines • Show All 60 Lines • ▼ Show 20 Lines	RISCVTargetMachine::RISCVTargetMachine(const Target &T, const Triple &TT,
Optional<CodeModel::Model> CM,		Optional<CodeModel::Model> CM,
CodeGenOpt::Level OL, bool JIT)		CodeGenOpt::Level OL, bool JIT)
: LLVMTargetMachine(T, computeDataLayout(TT), TT, CPU, FS, Options,		: LLVMTargetMachine(T, computeDataLayout(TT), TT, CPU, FS, Options,
getEffectiveRelocModel(TT, RM),		getEffectiveRelocModel(TT, RM),
getEffectiveCodeModel(CM, CodeModel::Small), OL),		getEffectiveCodeModel(CM, CodeModel::Small), OL),
TLOF(std::make_unique<RISCVELFTargetObjectFile>()),		TLOF(std::make_unique<RISCVELFTargetObjectFile>()),
Subtarget(TT, CPU, FS, Options.MCOptions.getABIName(), *this) {		Subtarget(TT, CPU, FS, Options.MCOptions.getABIName(), *this) {
initAsmInfo();		initAsmInfo();

		// RISC-V supports the MachineOutliner.
		setMachineOutliner(true);
}		}

TargetTransformInfo		TargetTransformInfo
RISCVTargetMachine::getTargetTransformInfo(const Function &F) {		RISCVTargetMachine::getTargetTransformInfo(const Function &F) {
return TargetTransformInfo(RISCVTTIImpl(this, F));		return TargetTransformInfo(RISCVTTIImpl(this, F));
}		}

namespace {		namespace {
▲ Show 20 Lines • Show All 68 Lines • Show Last 20 Lines

llvm/test/CodeGen/RISCV/machineoutliner.mir

This file was added.

				# RUN: llc -march riscv32 -x mir < %s \| FileCheck -check-prefix=RV32I %s
				# RUN: llc -march riscv32 -x mir -enable-machine-outliner < %s \
				paquetteUnsubmitted Not Done Reply Inline Actions You probably want -verify-machineinstrs on each of these. paquette: You probably want -verify-machineinstrs on each of these.
				lewis-revillAuthorUnsubmitted Not Done Reply Inline Actions I think there is some issue with the generation of MIR, at least for RISC-V... If I do IR -> MIR -> Assembly with -verify-machineinstrs, I get the error: 'Function has NoVRegs property but there are VReg operands'. I don't get it for IR -> Assembly with -verify-machineinstrs, and I'm not at all sure what to do about it. lewis-revill: I think there is some issue with the generation of MIR, at least for RISC-V... If I do IR ->…
				paquetteUnsubmitted Not Done Reply Inline Actions The verifier should tell you which pass is causing this to happen, and in which function. What is probably happening is that it's because you're using `-enable-machine-outliner`. This is probably running passes you don't expect. Since this output was generated by `-stop-before=machine-outliner`, these passes are probably doing weird things. What you probably want is one of these: `-run-pass=machine-outliner`: Only run the outliner, and produce MIR output `-start-before=machine-outliner`: Run the outliner and all following passes, and produce assembly output I think that using `-run-pass` is usually better for these kinds of tests. The reason being that it localizes the test to a single pass, rather than several. The outliner is an extremely late pass though, so using `-start-before` is probably fine. If you use `-run-pass`, you'll have to rewrite your CHECK lines as MIR though. paquette: The verifier should tell you which pass is causing this to happen, and in which function. What…
				# RUN: \| FileCheck -check-prefix=RV32I-MO %s

				--- \|
				define i32 @outline_0(i32 %a, i32 %b) #0 {
				entry:
				%0 = add i32 %a, 17
				%1 = or i32 %b, 1023
				%2 = and i32 %0, %1
				ret i32 %2
				paquetteUnsubmitted Done Reply Inline Actions I don't think you actually need the body of the IR functions here. You can make each function void, and just return: e.g. define i32 @outline_0(i32 %a, i32 %b) #0 { ret void } If it weren't for the section checks, you wouldn't need the IR at all. It only needs to be there when you look back at the IR in some way in the pass. paquette: I don't think you actually need the body of the IR functions here. You can make each function…
				}

				define i32 @outline_1(i32 %a, i32 %b) #0 {
				entry:
				%0 = add i32 %a, 17
				%1 = or i32 %b, 1023
				%2 = and i32 %0, %1
				ret i32 %2
				}

				define i32 @outline_2(i32 %a, i32 %b) #0 {
				entry:
				%0 = add i32 %a, 17
				%1 = or i32 %b, 1023
				%2 = and i32 %0, %1
				ret i32 %2
				}

				; Should not outline linkonce_odr functions which could be
				; deduplicated by the linker.
				define linkonce_odr i32 @dont_outline_0(i32 %a, i32 %b) #0 {
				entry:
				%0 = add i32 %a, 17
				%1 = or i32 %b, 1023
				%2 = and i32 %0, %1
				ret i32 %2
				}

				; Should not outline functions with named linker sections
				define i32 @dont_outline_1(i32 %a, i32 %b) #0 section "named" {
				entry:
				%0 = add i32 %a, 17
				%1 = or i32 %b, 1023
				%2 = and i32 %0, %1
				ret i32 %2
				}

				; Cannot outline if the X5 (t0) register is not free
				define i32 @dont_outline_2(i32 %a, i32 %b) #0 {
				entry:
				tail call void asm sideeffect "", "~{x5}"()
				%0 = add i32 %a, 17
				%1 = or i32 %b, 1023
				%2 = and i32 %0, %1
				ret i32 %2
				}

				declare void @llvm.stackprotector(i8, i8*) #0
				paquetteUnsubmitted Not Done Reply Inline Actions I'm pretty sure you can remove this. paquette: I'm pretty sure you can remove this.

				attributes #0 = { nounwind }
				paquetteUnsubmitted Done Reply Inline Actions This isn't checked/used by the outliner, so you can remove it from the IR. paquette: This isn't checked/used by the outliner, so you can remove it from the IR.

				...
				---
				name: outline_0
				alignment: 2
				tracksRegLiveness: true
				liveins:
				- { reg: '$x10' }
				- { reg: '$x11' }
				frameInfo:
				maxAlignment: 1
				maxCallFrameSize: 0
				machineFunctionInfo: {}
				body: \|
				bb.0.entry:
				paquetteUnsubmitted Done Reply Inline Actions If you do remove the bulk of the IR code, you'll need to rename the BB labels to `bb.0`. paquette: If you do remove the bulk of the IR code, you'll need to rename the BB labels to `bb.0`.
				; RV32I-LABEL: outline_0:
				; RV32I-NOT: call t0, OUTLINED_FUNCTION_0
				;
				; RV32I-MO-LABEL: outline_0:
				; RV32I-MO: call t0, OUTLINED_FUNCTION_0
				paquetteUnsubmitted Not Done Reply Inline Actions It would be good to check the contents of the outlined functions as well. E.g. OUTLINED_FUNCTION_0 should contain the ORI/ADDI/AND sequence. (It's possible that utils/update_mir_test_checks.py can generate all of the checks you want here, but I'm not sure if it can handle functions being created.) paquette: It would be good to check the contents of the outlined functions as well. E.g.
				lewis-revillAuthorUnsubmitted Not Done Reply Inline Actions (It's possible that utils/update_mir_test_checks.py can generate all of the checks you want here, but I'm not sure if it can handle functions being created.) I'm not sure I can either! Inserting check lines at the end of the file causes parsing errors. lewis-revill: > (It's possible that utils/update_mir_test_checks.py can generate all of the checks you want…
				liveins: $x10, $x11

				renamable $x11 = ORI killed renamable $x11, 1023
				renamable $x10 = ADDI killed renamable $x10, 17
				renamable $x10 = AND killed renamable $x10, killed renamable $x11
				CFI_INSTRUCTION def_cfa_offset 0
				PseudoRET implicit $x10

				...
				---
				name: outline_1
				alignment: 2
				tracksRegLiveness: true
				liveins:
				- { reg: '$x10' }
				- { reg: '$x11' }
				frameInfo:
				maxAlignment: 1
				maxCallFrameSize: 0
				machineFunctionInfo: {}
				body: \|
				bb.0.entry:
				; RV32I-LABEL: outline_1:
				; RV32I-NOT: call t0, OUTLINED_FUNCTION_0
				;
				; RV32I-MO-LABEL: outline_1:
				; RV32I-MO: call t0, OUTLINED_FUNCTION_0
				liveins: $x10, $x11

				renamable $x11 = ORI killed renamable $x11, 1023
				renamable $x10 = ADDI killed renamable $x10, 17
				renamable $x10 = AND killed renamable $x10, killed renamable $x11
				CFI_INSTRUCTION def_cfa_offset 0
				PseudoRET implicit $x10

				...
				---
				name: outline_2
				alignment: 2
				tracksRegLiveness: true
				liveins:
				- { reg: '$x10' }
				- { reg: '$x11' }
				frameInfo:
				maxAlignment: 1
				maxCallFrameSize: 0
				machineFunctionInfo: {}
				body: \|
				bb.0.entry:
				; RV32I-LABEL: outline_2:
				; RV32I-NOT: call t0, OUTLINED_FUNCTION_0
				;
				; RV32I-MO-LABEL: outline_2:
				; RV32I-MO: call t0, OUTLINED_FUNCTION_0
				liveins: $x10, $x11

				renamable $x11 = ORI killed renamable $x11, 1023
				renamable $x10 = ADDI killed renamable $x10, 17
				renamable $x10 = AND killed renamable $x10, killed renamable $x11
				CFI_INSTRUCTION def_cfa_offset 0
				PseudoRET implicit $x10

				...
				---
				name: dont_outline_0
				alignment: 2
				tracksRegLiveness: true
				liveins:
				- { reg: '$x10' }
				- { reg: '$x11' }
				frameInfo:
				maxAlignment: 1
				maxCallFrameSize: 0
				machineFunctionInfo: {}
				body: \|
				bb.0.entry:
				; RV32I-LABEL: dont_outline_0:
				; RV32I-NOT: call t0, OUTLINED_FUNCTION_0
				;
				; RV32I-MO-LABEL: dont_outline_0:
				; RV32I-MO-NOT: call t0, OUTLINED_FUNCTION_0
				liveins: $x10, $x11

				renamable $x11 = ORI killed renamable $x11, 1023
				renamable $x10 = ADDI killed renamable $x10, 17
				renamable $x10 = AND killed renamable $x10, killed renamable $x11
				CFI_INSTRUCTION def_cfa_offset 0
				PseudoRET implicit $x10

				...
				---
				name: dont_outline_1
				alignment: 2
				tracksRegLiveness: true
				liveins:
				- { reg: '$x10' }
				- { reg: '$x11' }
				frameInfo:
				maxAlignment: 1
				maxCallFrameSize: 0
				machineFunctionInfo: {}
				body: \|
				bb.0.entry:
				; RV32I-LABEL: dont_outline_1:
				; RV32I-NOT: call t0, OUTLINED_FUNCTION_0
				;
				; RV32I-MO-LABEL: dont_outline_1:
				; RV32I-MO-NOT: call t0, OUTLINED_FUNCTION_0
				liveins: $x10, $x11

				renamable $x11 = ORI killed renamable $x11, 1023
				renamable $x10 = ADDI killed renamable $x10, 17
				renamable $x10 = AND killed renamable $x10, killed renamable $x11
				CFI_INSTRUCTION def_cfa_offset 0
				PseudoRET implicit $x10

				...
				---
				name: dont_outline_2
				alignment: 2
				tracksRegLiveness: true
				liveins:
				- { reg: '$x10' }
				- { reg: '$x11' }
				frameInfo:
				maxAlignment: 1
				maxCallFrameSize: 0
				machineFunctionInfo: {}
				body: \|
				bb.0.entry:
				; RV32I-LABEL: dont_outline_2:
				; RV32I-NOT: call t0, OUTLINED_FUNCTION_0
				;
				; RV32I-MO-LABEL: dont_outline_2:
				; RV32I-MO-NOT: call t0, OUTLINED_FUNCTION_0
				liveins: $x10, $x11

				renamable $x11 = ORI killed renamable $x11, 1023
				renamable $x10 = ADDI killed renamable $x10, 17
				renamable $x10 = AND killed renamable $x10, killed renamable $x11
				INLINEASM &"", 1, 12, implicit-def dead early-clobber $x5
				CFI_INSTRUCTION def_cfa_offset 0
				PseudoRET implicit $x10

				...

This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Enable the machine outliner for RISC-VClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 218344

llvm/lib/Target/RISCV/RISCVInstrInfo.h

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp

llvm/lib/Target/RISCV/RISCVTargetMachine.cpp

llvm/test/CodeGen/RISCV/machineoutliner.mir

[RISCV] Enable the machine outliner for RISC-V
ClosedPublic