This is an archive of the discontinued LLVM Phabricator instance.

Add a shrink-wrapping pass to improve the placement of prologue and epilogue.
Closed, Public

Authored by qcolombet on Apr 22 2015, 3:50 PM.


Details

Reviewers

qcolombet
grosbach

Commits

rG61b305edfd86: [ShrinkWrap] Add (a simplified version) of shrink-wrapping.

Summary

Hi,

@Jim, putting you as reviewer as you seemed to have touched PEI more than most people :).

This patch introduces a new pass that computes the safe point to insert the prologue and epilogue of the function.
The goal is to find safe points that are cheaper than the entry and exit blocks.

Context **

Currently we insert the prologue and epilogue of the method/function in the entry and exit blocks. Although this is correct, we can do a better job when those are not immediately required and insert them at less frequently executed places.
The job of the shrink-wrapping pass is to identify such places.

Motivating example **

Let us consider the following function that performs a call in only one branch of an if:
define i32 @f(i32 %a, i32 %b) {

%tmp = alloca i32, align 4
%tmp2 = icmp slt i32 %a, %b
br i1 %tmp2, label %true, label %false

true:

store i32 %a, i32* %tmp, align 4
%tmp4 = call i32 @doSomething(i32 0, i32* %tmp)
br label %false

false:

%tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ]
ret i32 %tmp.0

}

On AArch64 this code generates (with the cfi directives removed for readability):
_f: ; @f
; BB#0:
stp x29, x30, [sp, #-16]!
mov x29, sp
sub sp, sp, #16 ; =16
cmp w0, w1
b.ge LBB0_2
; BB#1: ; %true
stur w0, [x29, #-4]
sub x1, x29, #4 ; =4
mov w0, wzr
bl _doSomething
LBB0_2: ; %false
mov sp, x29
ldp x29, x30, [sp], #16
ret

With shrink-wrapping we could generate:
_f: ; @f
; BB#0:
cmp w0, w1
b.ge LBB0_2
; BB#1: ; %true
stp x29, x30, [sp, #-16]!
mov x29, sp
sub sp, sp, #16 ; =16
stur w0, [x29, #-4]
sub x1, x29, #4 ; =4
mov w0, wzr
bl _doSomething
add sp, x29, #16 ; =16
ldp x29, x30, [sp], #16
LBB0_2: ; %false
ret

Therefore, we would pay the overhead of setting up/destroying the frame only if we actually do the call.

Proposed Solution **

This patch introduces a new machine pass that performs the shrink-wrapping analysis (see the comments at the beginning of ShrinkWrap.cpp for more details). It then stores the safe save and restore points into the MachineFrameInfo attached to the MachineFunction.
This information is then used by the PrologEpilogInserter (PEI) to place the related code at the right place. This pass runs right before the PEI.

Unlike the original paper of Chow from PLDI’88, this implementation of shrink-wrapping does not use expensive data-flow analysis and does not need hacks to properly avoid frequently executed points. Instead, it relies on dominance and loop properties.
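To make the dominance idea concrete, here is a minimal sketch on a toy CFG. It is not the code from ShrinkWrap.cpp: the `Graph` and `DomSets` types and the three helper functions are hypothetical, the blocks that need the frame are simply passed in as a list, and the loop handling mentioned above is left out. The assumption is that a save point can be taken as the nearest common dominator of those blocks, and a restore point as their nearest common post-dominator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG: Succs[B] lists the successors of block B (hypothetical stand-in
// for a MachineFunction; none of these names come from the patch).
using Graph = std::vector<std::vector<int>>;
using DomSets = std::vector<std::vector<bool>>;

// Flip every edge, so that dominators of the result are post-dominators.
Graph reversed(const Graph &Succs) {
  Graph Rev(Succs.size());
  for (std::size_t B = 0; B < Succs.size(); ++B)
    for (int S : Succs[B])
      Rev[S].push_back(static_cast<int>(B));
  return Rev;
}

// Dom[B][D] is true when D dominates B. Assumes every block is reachable
// from Root; uses the textbook iterative intersection over predecessors.
DomSets dominators(const Graph &Succs, int Root) {
  const std::size_t N = Succs.size();
  assert(Root >= 0 && static_cast<std::size_t>(Root) < N);
  const Graph Preds = reversed(Succs);
  DomSets Dom(N, std::vector<bool>(N, true));
  Dom[Root].assign(N, false);
  Dom[Root][Root] = true;
  for (bool Changed = true; Changed;) {
    Changed = false;
    for (std::size_t B = 0; B < N; ++B) {
      if (static_cast<int>(B) == Root)
        continue;
      std::vector<bool> New(N, true);
      for (int P : Preds[B])
        for (std::size_t D = 0; D < N; ++D)
          New[D] = New[D] && Dom[P][D];
      New[B] = true;
      if (New != Dom[B]) {
        Dom[B] = New;
        Changed = true;
      }
    }
  }
  return Dom;
}

// The common dominator of Blocks that is itself dominated by all the other
// common dominators, i.e. the one with the largest dominator set.
int nearestCommonDominator(const DomSets &Dom, const std::vector<int> &Blocks) {
  int Best = -1;
  std::ptrdiff_t BestDepth = 0;
  for (std::size_t D = 0; D < Dom.size(); ++D) {
    bool Common = true;
    for (int B : Blocks)
      Common = Common && Dom[B][D];
    std::ptrdiff_t Depth = std::count(Dom[D].begin(), Dom[D].end(), true);
    if (Common && Depth > BestDepth) {
      Best = static_cast<int>(D);
      BestDepth = Depth;
    }
  }
  return Best;
}
```

With the motivating example encoded as block 0 (entry), block 1 (%true) and block 2 (%false), only block 1 needs the frame, and both the dominator and the post-dominator query return block 1, which matches the placement shown in the shrink-wrapped assembly. If block 2 also needed the frame, the queries would fall back to the entry block and the exit block.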

The pass is off by default and each target can opt in by setting the EnableShrinkWrap boolean to true in their derived class of TargetPassConfig. This setting can also be overridden on the command line by using -enable-shrink-wrap.
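The interaction between the target default and the command-line flag can be sketched as a tri-state lookup. This is only an illustration under assumptions: `Override` and `PassConfigSketch` are hypothetical names, and the real `getEnableShrinkWrap()` declared in Passes.h takes no argument and reads the flag itself.

```cpp
// Hypothetical tri-state: flag not given, forced on, or forced off.
enum class Override { Unset, On, Off };

// Hypothetical stand-in for a target's TargetPassConfig subclass.
struct PassConfigSketch {
  // Target default; a target opts in by setting this to true.
  bool EnableShrinkWrap = false;

  // The command-line value wins when present, otherwise the target default.
  bool getEnableShrinkWrap(Override CommandLine) const {
    switch (CommandLine) {
    case Override::Unset:
      return EnableShrinkWrap;
    case Override::On:
      return true;
    case Override::Off:
      return false;
    }
    return EnableShrinkWrap; // not reached; keeps compilers quiet
  }
};
```

The point of the tri-state is that a user can both enable the pass on a target that has not opted in and disable it on a target that has.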

Before you try out the pass for your target, make sure you properly fix your emitProlog/emitEpilog/adjustForXXX methods to cope with basic blocks that are not necessarily the entry block.

Design Decisions **

ShrinkWrap is its own pass right now. It could frankly be merged into PEI but for debugging and clarity I thought it was best to have its own file.
Right now, we only support one save point and one restore point. At some point we can expand this to several save points and restore points; the impacted components would then be:
The pass itself: New algorithm needed.
MachineFrameInfo: Hold a list or set of Save/Restore points instead of one pointer.
PEI: Should loop over the save points and restore points.

Anyhow, at least for this first iteration, I do not believe it is worth supporting the complex cases. We should revisit that when we have motivating examples.

That being said, the target specific code should not change, which is another point for not blocking the optimization on that :).

Feedback Needed **

Right now, I haven’t added any new target hook, but I am wondering if some more would make sense.
In particular:

1. Should we have a target hook to be able to fix something on the entry block if this one was not the save point?
2. Same question with all the exit blocks?

For #2, for instance, ARM needs to expand the TCRETURN pseudo-instruction, but this can be done in the expand pseudo pass. Therefore, I do not know if this is actually needed.

What Is Next? **

I have patches to enable this for AArch64 and ARM and I think I will look into X86 as well. For the record, this implementation of shrink-wrapping applies to about 20% of the functions for both O3 and Os, with no regressions and a few improvements.
PGO would certainly help, but I haven’t tried.

Thanks for your feedback,
-Quentin


Event Timeline

qcolombet updated this revision to Diff 24260. Apr 22 2015, 3:50 PM

qcolombet retitled this revision to "Add a shrink-wrapping pass to improve the placement of prologue and epilogue."

qcolombet updated this object.

qcolombet edited the test plan for this revision.

qcolombet added a reviewer: grosbach.

qcolombet set the repository for this revision to rL LLVM.

qcolombet added a subscriber: Unknown Object (MLST).

Herald added subscribers: jholewinski, aemerson. Apr 22 2015, 3:50 PM

Out of curiosity, is there a reason you didn't just do the standard
AVAIL/ANTIN version of this like PEI does?

Hi Daniel,

I was looking at the old version, but ...

https://llvm.org/svn/llvm-project/llvm/tags/RELEASE_26/lib/CodeGen/ShrinkWrapping.cpp

This is essentially the Chow paper, but

1. The dataflow is not really expensive (it's O(N), the same as your
current algorithm). If it is, you should do it differently ;) It can
be computed really fast. Definitely faster than computing
post-dominators.

2. The Chow paper handles multiple save/restore points :)

If your concern is #1, that's solvable, easily.
If the concern is "avoiding placement into bad blocks", that's also
solvable. Chow does not use the other lazy code motion calculations
for some reason. If it did, it would not place it into loops :)

In fact, it will give you the placement points, and you can choose
which you want, and then it will give you the insertion/deletions to
make that happen.

Hi Quentin,

Thanks for working on this - I think shrink wrapping is a very useful optimization to have.
My comments are mostly nit picks - feel free to ignore them if they don't make sense as a result of me missing something.

Overall, I guess that this needs support in at least one backend, so regression tests can be added before this is in a state ready to commit?

Thanks,

Kristof

lib/CodeGen/MachineFunction.cpp
626–627	I don't understand what "... than do not cross Restore." means here. Should the comment just be "Starting from MBB, check if there is a path leading to Save"?
634–637	"Since we do not reach" -> "If we do not reach"?
lib/CodeGen/PrologEpilogInserter.cpp
385–386	"basic block" -> "basic blocks"
395–398	I'm wondering if the logic overall would get simpler if MFI->getSavePoint() would always return a MachineBasicBlock, i.e. returning Entry if shrink wrapping didn't change the default save point? That way, the logic in most functions doesn't need to handle the case separately of whether or not the save block and entry block are the same - generalizing the logic a bit more. But maybe there's a good reason to do it like this that I haven't noticed?
lib/CodeGen/ShrinkWrap.cpp
187–211	This seems like the most critical function to get right from a correctness point of view? I can't derive from just staring at this code whether this is also going to catch code generated for VLAs, or inline assembly fragments touching the frame, or registers such as stack pointer, frame pointer, base pointer. I'm guessing these could be good cases to add as regression tests, i.e. a case where only the presence of a VLA prevents the optimization and a case where only the presence of a piece of inline assembly prevents the optimization?

I have a patch that does this for Hexagon. It's implemented as a part of PEI, specifically in HexagonFrameLowering::insertPrologue.

The problem with multiple prologs is that this can increase code size, plus it's a lot harder to identify the set of registers that need to be saved in each prolog. You don't want to be saving more registers than you need.

One other concern I have is that the way PEI is implemented leaves little room for a target to decide what to do. With the shrink-wrapping in place, we either get none of it or all of it. For most applications on Hexagon, both code size and performance are important. The decision as to whether we want multiple prologs may be the result of an analysis of the entire function. As a matter of fact we may prefer to have a single prolog, but not in the entry block even if the shrink-wrap pass finds multiple locations.

Hi Kristof,

Thanks for your feedback.

Overall, I guess that this needs support in at least one backend, so regression tests can be added before this is in a state ready to commit?

Whatever people prefer. I have a patch that adds shrink-wrapping support for AArch64; I can just merge it with this one, with the path disabled by default.

Cheers,
-Quentin

lib/CodeGen/MachineFunction.cpp
626–627	That is a typo indeed, this is “that do not cross Restore” :). This is the important part in fact. Restore is the kill point of the region that needs CSR to be saved and past the kill point, there is nothing to track.
634–637	That works too!
lib/CodeGen/PrologEpilogInserter.cpp
395–398	The rationale was that this is homogeneous with the way we handle the Restore point. I.e., current shrink-wrapping just gives one Restore point, whereas PEI can have multiple of them. So I did not want to unbalance this. I guess I can also push the list of exit blocks into MFI if you think that is better.
lib/CodeGen/ShrinkWrap.cpp
187–211	If the callee-saved information is properly set, it should… Not sure how VLAs are handled, definitely a good regression test to add! For inline assembly, I guess you are right and we could play it safe. Additionally, we could add a target hook to check whether or not a given instruction should be after the prologue and before the epilogue. At the moment, I have not needed this, but if it proves to be useful we can definitely do that.
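The check discussed for lines 626–627 (a path from MBB to Save that does not cross Restore) can be sketched as a worklist walk over successors that stops at Restore. This is a toy version under assumptions: blocks are plain integers, `Graph` and `reachesSaveAvoidingRestore` are hypothetical names, and the treatment of the corner case MBB == Save is a choice made here, not something taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG: Succs[B] lists the successors of block B (hypothetical).
using Graph = std::vector<std::vector<int>>;

// Returns true if some path starting at MBB reaches Save without first
// passing through Restore. MBB == Save counts as reaching Save.
bool reachesSaveAvoidingRestore(const Graph &Succs, int MBB, int Save,
                                int Restore) {
  assert(MBB >= 0 && static_cast<std::size_t>(MBB) < Succs.size());
  std::vector<bool> Visited(Succs.size(), false);
  std::vector<int> WorkList{MBB};
  Visited[MBB] = true;
  while (!WorkList.empty()) {
    int Cur = WorkList.back();
    WorkList.pop_back();
    if (Cur == Save)
      return true;
    // Restore is the kill point of the region: do not look past it.
    if (Cur == Restore)
      continue;
    for (int S : Succs[Cur]) {
      if (!Visited[S]) {
        Visited[S] = true;
        WorkList.push_back(S);
      }
    }
  }
  return false;
}
```

The visited set keeps the walk linear in the size of the CFG and makes it terminate in the presence of loops.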

Hi Krzysztof,

Fix typos.
Add support for AArch64.
Add regression tests for AArch64.
Improve the debug messages.
Fix the handling of empty functions.

qcolombet added inline comments. Apr 24 2015, 6:42 PM

lib/CodeGen/MachineFunction.cpp
626–627	Do I still need to do something about the comment?
lib/CodeGen/Passes.cpp
58	clang-format related. I can back that out.
lib/CodeGen/PrologEpilogInserter.cpp
395–398	What do you think? Should I: (1) Keep Restore and Save as they are? (2) Create an asymmetry between their handling? (Don’t like that.) (3) Push everything into MachineFrameInfo?
lib/CodeGen/ShrinkWrap.cpp
187–211	The regression tests did not show any problem with the current implementation.

kristof.beyls added inline comments. Apr 27 2015, 5:04 AM

lib/CodeGen/MachineFunction.cpp
626–627	If the comment reads "Starting from MBB, check if there is a path leading to Save that does not cross Restore", that makes sense to me.
lib/CodeGen/PrologEpilogInserter.cpp
395–398	Looking further into the details of how PEI is implemented at the moment, I'm starting to think that the best option probably is: Get rid of the EntryBlock variable in class PEI, and replace it/change its name to "SaveBlock". Get rid of the ReturnBlocks variable in class PEI, and replace it/change its name to "RestoreBlocks". That way, the names of the variables reflect the intent better, i.e. the block where saves and restores need to be inserted. This can be done as a separate NFC-patch. After that, the shrink wrapping pass "just" optimizes the SaveBlock and RestoreBlocks values. With the current implementation, if the shrink wrapping actually does any optimization, RestoreBlocks will be a single-element vector. Doing it this way, I think PEI code should not have to special case (i.e. have if statements) based on whether or not the shrink wrapping optimization happened or not. I hope. I guess this corresponds to option 3, "push everything into MachineFrameInfo"?
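A minimal sketch of the SaveBlock/RestoreBlocks idea suggested above, assuming a null save point means "shrink-wrapping found nothing better". The `Block`, `FrameInfoSketch` and `SaveRestoreBlocks` types and the `calculateSaveRestoreBlocks` helper are hypothetical stand-ins for illustration and are not the code that was committed.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for MachineBasicBlock.
struct Block {
  bool IsReturn = false;
};

// Hypothetical stand-in for the two new MachineFrameInfo fields.
struct FrameInfoSketch {
  const Block *Save = nullptr;    // null: no better place for the prologue
  const Block *Restore = nullptr; // null: no better place for the epilogue
};

struct SaveRestoreBlocks {
  const Block *SaveBlock;
  std::vector<const Block *> RestoreBlocks;
};

// Use the shrink-wrapping result when there is one, otherwise fall back to
// the entry block and every return block.
SaveRestoreBlocks calculateSaveRestoreBlocks(const std::vector<Block> &Fn,
                                             const FrameInfoSketch &MFI) {
  assert(!Fn.empty() && "function without blocks");
  if (MFI.Save) {
    assert(MFI.Restore && "a save point needs a matching restore point");
    return {MFI.Save, {MFI.Restore}};
  }
  SaveRestoreBlocks Result{&Fn.front(), {}};
  for (const Block &B : Fn)
    if (B.IsReturn)
      Result.RestoreBlocks.push_back(&B);
  return Result;
}
```

Once the two values are computed this way, the code that emits prologues and epilogues only iterates over SaveBlock and RestoreBlocks and has no branch on whether shrink-wrapping happened, which is the simplification the comment above is after.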

kristof.beyls added inline comments. Apr 27 2015, 5:18 AM

lib/Target/AArch64/AArch64FrameLowering.cpp
547–555	I guess this code sequence will lead to DL sometimes being uninitialized. I'm not sure when "MBB.end() != MBBI", but in that situation, can't you get a reasonable debug location from somewhere else? If the RetOpcode variable is only used to detect tail call returns, maybe it's better to replace it with a bool variable called "isTailCallReturn" or something similar, and use that later on, instead of explicitly checking what the value is?

qcolombet added inline comments. Apr 27 2015, 9:52 AM

lib/Target/AArch64/AArch64FrameLowering.cpp
547–555	I guess I can use MBB.findDebugLoc() with some sensible iterator. I will check. Same for RetOpcode, thanks for the feedback.

Use boolean instead of unsigned to determine whether or not this is a tail call return.

qcolombet added inline comments. Apr 28 2015, 9:58 AM

lib/Target/AArch64/AArch64FrameLowering.cpp
547–555	I haven’t changed the handling of the debug information. It is indeed consistent with what is done in AArch64FrameLowering::spillCalleeSavedRegisters and AArch64FrameLowering::restoreCalleeSavedRegisters. Therefore, I think that if we want to fix that, we should do it consistently at these three places. What do you think?

Ping?

Thanks,
-Quentin

Hi Quentin,

As far as I can see, the only comment made that hasn't been addressed yet
is about renaming/replacing EntryBlock and ReturnBlocks in class PEI with
SaveBlock/RestoreBlocks. It seems PEI only needs to know what the Save and
Restore blocks are, and doesn't need to know which ones are the Entry or
Return blocks. Therefore, it doesn't seem a good idea to retain the EntryBlock
and ReturnBlocks variables in that class. But maybe you already looked
into that and there's a reason why EntryBlock and ReturnBlocks is needed
after all?

Apart from the above comment, which I think needs to be addressed, the patch
looks good to me.

Thanks,

Kristof

Hi Kristof,

Update the EntryBlock, ReturnBlocks fields in PEI.

Overall, it seems you've made a lot of changes in the Hexagon backend. I guess these are necessary to make the backend structured more like the other backends so that the SaveBlock/RestoreBlock changes in PEI can be done?
If so, are the changes basically refactoring, i.e. moving existing code to be placed behind certain interfaces, or did you need to write a lot of new code in the Hexagon backend that should be reviewed in detail?

lib/CodeGen/PrologEpilogInserter.cpp
142–157	Just a minor comment, feel free to ignore this one: I feel that it would be a slightly better separation of concerns if MFI->getSavePoint() would always be set, and PEI doesn't need to calculate a SavePoint itself when it isn't set. For this to happen, the ShrinkWrap pass should probably always be run, and when it's "switched off by default", it would just set MFI->SavePoint and MFI->Restore to be equal to FN.begin() and to the ReturnBlocks respectively. In other words, the second half of the code above would be executed in the ShrinkWrap pass. If you'd push ahead and implement it like this, the name of the pass also should be changed to "SaveRestorePointInserter/Calculator" or something similar. It's then a pass which always calculates the most appropriate Save and Restore blocks and happens to implement a ShrinkWrapping optimization. As I said, maybe this is pushing for too much separation of concerns. I'm not sure what the negative consequences would be to require the "SaveRestorePointInserterPass" to always be run, if any.
784–785	s/RetBlock/RestoreBlock/? The basic blocks iterated through are "Restore blocks", not necessarily "Return blocks"?
lib/Target/ARM/ARMFrameLowering.cpp
1692–1693	It's unclear to me why this change is needed? If only implementing the AArch64-backend-specific functionality, how come you need to make a change in the AArch32-backend?

Hi Kristof,

LGTM now!

Thanks all for the feedback.

Committed revision 236507.

Cheers,
-Quentin

This revision is now accepted and ready to land. May 5 2015, 10:43 AM

Committed in r236507.

Revision Contents

Path	Size
include/llvm/CodeGen/MachineFrameInfo.h	17 lines
include/llvm/CodeGen/Passes.h	10 lines
include/llvm/InitializePasses.h	1 line
include/llvm/Target/TargetFrameLowering.h	13 lines
lib/CodeGen/ (file name missing)	1 line
lib/CodeGen/ (file name missing)	1 line
lib/CodeGen/ (file name missing)	38 lines
lib/CodeGen/ (file name missing)	30 lines
lib/CodeGen/PrologEpilogInserter.cpp	114 lines
lib/CodeGen/ShrinkWrap.cpp	383 lines
lib/Target/AArch64/AArch64FrameLowering.h	2 lines
lib/Target/AArch64/AArch64FrameLowering.cpp	20 lines
lib/Target/ARM/ARMFrameLowering.h	5 lines
lib/Target/ARM/ARMFrameLowering.cpp	16 lines
lib/Target/ARM/Thumb1FrameLowering.h	2 lines
lib/Target/ARM/Thumb1FrameLowering.cpp	5 lines
lib/Target/BPF/BPFFrameLowering.h	2 lines
lib/Target/BPF/BPFFrameLowering.cpp	3 lines
lib/Target/Hexagon/HexagonFrameLowering.h	3 lines
lib/Target/Hexagon/HexagonFrameLowering.cpp	5 lines
lib/Target/MSP430/MSP430FrameLowering.h	2 lines
lib/Target/MSP430/MSP430FrameLowering.cpp	5 lines
lib/Target/Mips/Mips16FrameLowering.h	2 lines
lib/Target/Mips/Mips16FrameLowering.cpp	5 lines
lib/Target/Mips/MipsSEFrameLowering.h	2 lines
lib/Target/Mips/MipsSEFrameLowering.cpp	5 lines
lib/Target/NVPTX/NVPTXFrameLowering.h	2 lines
lib/Target/NVPTX/NVPTXFrameLowering.cpp	5 lines
lib/Target/NVPTX/NVPTXPrologEpilogPass.cpp	2 lines
lib/Target/PowerPC/PPCFrameLowering.h	2 lines
lib/Target/PowerPC/PPCFrameLowering.cpp	5 lines
lib/Target/R600/AMDGPUFrameLowering.h	2 lines
lib/Target/R600/AMDGPUFrameLowering.cpp	5 lines
lib/Target/Sparc/SparcFrameLowering.h	2 lines
lib/Target/Sparc/SparcFrameLowering.cpp	5 lines
lib/Target/SystemZ/SystemZFrameLowering.h	2 lines
lib/Target/SystemZ/SystemZFrameLowering.cpp	5 lines
lib/Target/X86/X86FrameLowering.h	8 lines
lib/Target/X86/X86FrameLowering.cpp	40 lines
lib/Target/XCore/XCoreFrameLowering.h	3 lines
lib/Target/XCore/XCoreFrameLowering.cpp	5 lines
test/CodeGen/AArch64/arm64-shrink-wrapping.ll	502 lines

Diff 24926

include/llvm/CodeGen/MachineFrameInfo.h

@@ ... @@ class MachineFrameInfo {
   bool HasInlineAsmWithSPAdjust;

   /// True if the function contains a call to the llvm.vastart intrinsic.
   bool HasVAStart;

   /// True if this is a varargs function that contains a musttail call.
   bool HasMustTailInVarArgFunc;

+  /// Not null, if shrink-wrapping found a better place for the prologue.
+  MachineBasicBlock *Save;
+  /// Not null, if shrink-wrapping found a better place for the epilogue.
+  MachineBasicBlock *Restore;
+
+  /// Check if it exists a path from \p MBB leading to the basic
+  /// block with a SavePoint (a.k.a. prologue).
+  bool isBeforeSavePoint(const MachineFunction &MF,
+                         const MachineBasicBlock &MBB) const;
+
 public:
   explicit MachineFrameInfo(unsigned StackAlign, bool isStackRealign,
                             bool RealignOpt)
       : StackAlignment(StackAlign), StackRealignable(isStackRealign),
         RealignOption(RealignOpt) {
     StackSize = NumFixedObjects = OffsetAdjustment = MaxAlignment = 0;
     HasVarSizedObjects = false;
     FrameAddressTaken = false;
     ReturnAddressTaken = false;
     HasStackMap = false;
     HasPatchPoint = false;
     AdjustsStack = false;
     HasCalls = false;
     StackProtectorIdx = -1;
     FunctionContextIdx = -1;
     MaxCallFrameSize = 0;
     CSIValid = false;
     LocalFrameSize = 0;
     LocalFrameMaxAlign = 0;
     UseLocalStackAllocationBlock = false;
     HasInlineAsmWithSPAdjust = false;
     HasVAStart = false;
     HasMustTailInVarArgFunc = false;
+    Save = nullptr;
+    Restore = nullptr;
   }

   /// hasStackObjects - Return true if there are any stack objects in this
   /// function.
   ///
   bool hasStackObjects() const { return !Objects.empty(); }

   /// hasVarSizedObjects - This method may be called any time after instruction
@@ ... @@ void setCalleeSavedInfo(const std::vector<CalleeSavedInfo> &CSI) {
     CSInfo = CSI;
   }

   /// isCalleeSavedInfoValid - Has the callee saved info been calculated yet?
   bool isCalleeSavedInfoValid() const { return CSIValid; }

   void setCalleeSavedInfoValid(bool v) { CSIValid = v; }

+  MachineBasicBlock *getSavePoint() const { return Save; }
+  void setSavePoint(MachineBasicBlock *NewSave) { Save = NewSave; }
+  MachineBasicBlock *getRestorePoint() const { return Restore; }
+  void setRestorePoint(MachineBasicBlock *NewRestore) { Restore = NewRestore; }
+
   /// getPristineRegs - Return a set of physical registers that are pristine on
   /// entry to the MBB.
   ///
   /// Pristine registers hold a value that is useless to the current function,
   /// but that must be preserved - they are callee saved registers that have not
   /// been saved yet.
   ///
   /// Before the PrologueEpilogueInserter has placed the CSR spill code, this

include/llvm/CodeGen/Passes.h

@@ ... @@ protected:
   // Target Pass Options
   // Targets provide a default setting, user flags override.
   //
   bool DisableVerify;

   /// Default setting for -enable-tail-merge on this target.
   bool EnableTailMerge;

+  /// Default setting for -enable-shrink-wrap on this target.
+  bool EnableShrinkWrap;
+
 public:
   TargetPassConfig(TargetMachine *tm, PassManagerBase &pm);
   // Dummy constructor.
   TargetPassConfig();

   ~TargetPassConfig() override;

   static char ID;
@@ ... @@ public:

   /// Return the pass substituted for StandardID by the target.
   /// If no substitution exists, return StandardID.
   IdentifyingPassPtr getPassSubstitution(AnalysisID StandardID) const;

   /// Return true if the optimized regalloc pipeline is enabled.
   bool getOptimizeRegAlloc() const;

+  /// Return true if shrink wrapping is enabled.
+  bool getEnableShrinkWrap() const;
+
   /// Return true if the default global register allocator is in use and
   /// has not be overriden on the command line with '-regalloc=...'
   bool usingDefaultRegAlloc() const;

   /// Add common target configurable passes that perform LLVM IR to IR
   /// transforms following machine independent optimization.
   virtual void addIRPasses();

@@ ... @@ /// MachineDominanaceFrontier - This pass is a machine dominators analysis pass.

   /// PostMachineScheduler - This pass schedules machine instructions postRA.
   extern char &PostMachineSchedulerID;

   /// SpillPlacement analysis. Suggest optimal placement of spill code between
   /// basic blocks.
   extern char &SpillPlacementID;

+  /// ShrinkWrap pass. Look for the best place to insert save and restore
+  // instruction and update the MachineFunctionInfo with that information.
+  extern char &ShrinkWrapID;
+
   /// VirtRegRewriter pass. Rewrite virtual registers to physical registers as
   /// assigned in VirtRegMap.
   extern char &VirtRegRewriterID;

   /// UnreachableMachineBlockElimination - This pass removes unreachable
   /// machine basic blocks.
   extern char &UnreachableMachineBlockElimID;

include/llvm/InitializePasses.h

@@ ... @@
 void initializeRegionViewerPass(PassRegistry&);
 void initializeRewriteStatepointsForGCPass(PassRegistry&);
 void initializeSCCPPass(PassRegistry&);
 void initializeSROAPass(PassRegistry&);
 void initializeSROA_DTPass(PassRegistry&);
 void initializeSROA_SSAUpPass(PassRegistry&);
 void initializeScalarEvolutionAliasAnalysisPass(PassRegistry&);
 void initializeScalarEvolutionPass(PassRegistry&);
+void initializeShrinkWrapPass(PassRegistry &);
 void initializeSimpleInlinerPass(PassRegistry&);
 void initializeShadowStackGCLoweringPass(PassRegistry&);
 void initializeRegisterCoalescerPass(PassRegistry&);
 void initializeSingleLoopExtractorPass(PassRegistry&);
 void initializeSinkingPass(PassRegistry&);
 void initializeSeparateConstOffsetFromGEPPass(PassRegistry &);
 void initializeSlotIndexesPass(PassRegistry&);
 void initializeSpillPlacementPass(PassRegistry&);

include/llvm/Target/TargetFrameLowering.h

@@ ... @@ public:
   /// responsible for rounding up the stack frame (probably at emitPrologue
   /// time).
   virtual bool targetHandlesStackFrameRounding() const {
     return false;
   }

   /// emitProlog/emitEpilog - These methods insert prolog and epilog code into
   /// the function.
-  virtual void emitPrologue(MachineFunction &MF) const = 0;
+  virtual void emitPrologue(MachineFunction &MF,
+                            MachineBasicBlock &MBB) const = 0;
   virtual void emitEpilogue(MachineFunction &MF,
                             MachineBasicBlock &MBB) const = 0;

   /// Adjust the prologue to have the function use segmented stacks. This works
   /// by adding a check even before the "normal" function prologue.
-  virtual void adjustForSegmentedStacks(MachineFunction &MF) const { }
+  virtual void adjustForSegmentedStacks(MachineFunction &MF,
+                                        MachineBasicBlock &PrologueMBB) const {}

   /// Adjust the prologue to add Erlang Run-Time System (ERTS) specific code in
   /// the assembly prologue to explicitly handle the stack.
-  virtual void adjustForHiPEPrologue(MachineFunction &MF) const { }
+  virtual void adjustForHiPEPrologue(MachineFunction &MF,
+                                     MachineBasicBlock &PrologueMBB) const {}

   /// Adjust the prologue to add an allocation at a fixed offset from the frame
   /// pointer.
-  virtual void adjustForFrameAllocatePrologue(MachineFunction &MF) const { }
+  virtual void
+  adjustForFrameAllocatePrologue(MachineFunction &MF,
+                                 MachineBasicBlock &PrologueMBB) const {}

   /// spillCalleeSavedRegisters - Issues instruction(s) to spill all callee
   /// saved registers and returns true if it isn't possible / profitable to do
   /// so by issuing a series of store instructions via
   /// storeRegToStackSlot(). Returns false otherwise.
   virtual bool spillCalleeSavedRegisters(MachineBasicBlock &MBB,
                                          MachineBasicBlock::iterator MI,
                                          const std::vector<CalleeSavedInfo> &CSI,

lib/CodeGen/CMakeLists.txt

@@ ... @@ add_llvm_library(LLVMCodeGen
   RegisterClassInfo.cpp
   RegisterCoalescer.cpp
   RegisterPressure.cpp
   RegisterScavenging.cpp
   ScheduleDAG.cpp
   ScheduleDAGInstrs.cpp
   ScheduleDAGPrinter.cpp
   ScoreboardHazardRecognizer.cpp
+  ShrinkWrap.cpp
   ShadowStackGC.cpp
   ShadowStackGCLowering.cpp
   SjLjEHPrepare.cpp
   SlotIndexes.cpp
   SpillPlacement.cpp
   SplitKit.cpp
   StackColoring.cpp
   StackProtector.cpp

lib/CodeGen/CodeGen.cpp

Show First 20 Lines • Show All 55 Lines • ▼ Show 20 Lines	void llvm::initializeCodeGen(PassRegistry &Registry) {
initializeOptimizePHIsPass(Registry);		initializeOptimizePHIsPass(Registry);
initializePEIPass(Registry);		initializePEIPass(Registry);
initializePHIEliminationPass(Registry);		initializePHIEliminationPass(Registry);
initializePeepholeOptimizerPass(Registry);		initializePeepholeOptimizerPass(Registry);
initializePostMachineSchedulerPass(Registry);		initializePostMachineSchedulerPass(Registry);
initializePostRASchedulerPass(Registry);		initializePostRASchedulerPass(Registry);
initializeProcessImplicitDefsPass(Registry);		initializeProcessImplicitDefsPass(Registry);
initializeRegisterCoalescerPass(Registry);		initializeRegisterCoalescerPass(Registry);
		initializeShrinkWrapPass(Registry);
initializeSlotIndexesPass(Registry);		initializeSlotIndexesPass(Registry);
initializeStackColoringPass(Registry);		initializeStackColoringPass(Registry);
initializeStackMapLivenessPass(Registry);		initializeStackMapLivenessPass(Registry);
initializeStackProtectorPass(Registry);		initializeStackProtectorPass(Registry);
initializeStackSlotColoringPass(Registry);		initializeStackSlotColoringPass(Registry);
initializeTailDuplicatePassPass(Registry);		initializeTailDuplicatePassPass(Registry);
initializeTargetPassConfigPass(Registry);		initializeTargetPassConfigPass(Registry);
initializeTwoAddressInstructionPassPass(Registry);		initializeTwoAddressInstructionPassPass(Registry);
[... 11 lines elided ...]

lib/CodeGen/MachineFunction.cpp

[... 594 lines elided ...]	MachineFrameInfo::getPristineRegs(const MachineBasicBlock *MBB) const {
// Before CSI is calculated, no registers are considered pristine. They can be		// Before CSI is calculated, no registers are considered pristine. They can be
// freely used and PEI will make sure they are saved.		// freely used and PEI will make sure they are saved.
if (!isCalleeSavedInfoValid())		if (!isCalleeSavedInfoValid())
return BV;		return BV;

for (const MCPhysReg *CSR = TRI->getCalleeSavedRegs(MF); CSR && *CSR; ++CSR)		for (const MCPhysReg *CSR = TRI->getCalleeSavedRegs(MF); CSR && *CSR; ++CSR)
BV.set(*CSR);		BV.set(*CSR);

// The entry MBB always has all CSRs pristine.		// Each MBB before the save point has all CSRs pristine.
if (MBB == &MF->front())		if (isBeforeSavePoint(*MF, *MBB))
return BV;		return BV;

// On other MBBs the saved CSRs are not pristine.		// On other MBBs the saved CSRs are not pristine.
const std::vector<CalleeSavedInfo> &CSI = getCalleeSavedInfo();		const std::vector<CalleeSavedInfo> &CSI = getCalleeSavedInfo();
for (std::vector<CalleeSavedInfo>::const_iterator I = CSI.begin(),		for (std::vector<CalleeSavedInfo>::const_iterator I = CSI.begin(),
E = CSI.end(); I != E; ++I)		E = CSI.end(); I != E; ++I)
BV.reset(I->getReg());		BV.reset(I->getReg());

return BV;		return BV;
}		}

		// Note: We could use some sort of caching mechanism, but we lack the ability
		// to know when the cache is invalid, i.e., the CFG changed.
		// Assuming we have that, we can simply compute all the set of MBBs
		// that are before the save point.
		bool MachineFrameInfo::isBeforeSavePoint(const MachineFunction &MF,
		const MachineBasicBlock &MBB) const {
		// Early exit if shrink-wrapping did not kick in.
		if (!Save)
		return &MBB == &MF.front();

		// Starting from MBB, check if there is a path leading to Save that do
		// not cross Restore.
		kristof.beyls: I don't understand what "... than do not cross Restore." means here. Should the comment just be "Starting from MBB, check if there is a path leading to Save"?
		qcolombet: That is a typo indeed, this is “that do not cross Restore” :). This is the important part in fact. Restore is the kill point of the region that needs CSR to be saved and past the kill point, there is nothing to track.
		qcolombet: Do I still need to do something about the comment?
		kristof.beyls: If the comment reads "Starting from MBB, check if there is a path leading to Save that does not cross Restore", that makes sense to me.
		SmallPtrSet<const MachineBasicBlock *, 8> Visited;
		SmallVector<const MachineBasicBlock *, 8> WorkList;
		WorkList.push_back(&MBB);
		Visited.insert(&MBB);
		do {
		const MachineBasicBlock *CurBB = WorkList.pop_back_val();
		// By construction, the region that is after the save point is
		// dominated by the Save and post-dominated by the Restore.
		// If we do not reach Restore and still reach Save, this
		// means MBB is before Save.
		kristof.beyls: "Since we do not reach" -> "If we do not reach"?
		qcolombet: That works too!
		if (CurBB == Save)
		return true;
		if (CurBB == Restore)
		continue;
		// Enqueue all the successors not already visited.
		for (MachineBasicBlock *SuccBB : CurBB->successors())
		if (Visited.insert(SuccBB).second)
		WorkList.push_back(SuccBB);
		} while (!WorkList.empty());
		return false;
		}

unsigned MachineFrameInfo::estimateStackSize(const MachineFunction &MF) const {		unsigned MachineFrameInfo::estimateStackSize(const MachineFunction &MF) const {
const TargetFrameLowering *TFI = MF.getSubtarget().getFrameLowering();		const TargetFrameLowering *TFI = MF.getSubtarget().getFrameLowering();
const TargetRegisterInfo *RegInfo = MF.getSubtarget().getRegisterInfo();		const TargetRegisterInfo *RegInfo = MF.getSubtarget().getRegisterInfo();
unsigned MaxAlign = getMaxAlignment();		unsigned MaxAlign = getMaxAlignment();
int Offset = 0;		int Offset = 0;

// This code is very, very similar to PEI::calculateFrameObjectOffsets().		// This code is very, very similar to PEI::calculateFrameObjectOffsets().
// It really should be refactored to share code. Until then, changes		// It really should be refactored to share code. Until then, changes
[... 349 lines elided ...]
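
The worklist traversal in isBeforeSavePoint above can be tried outside LLVM on a toy CFG. The following is a minimal standalone sketch, not the patch's code: blocks are plain integer indices, `CFG` is an adjacency list, and `isBeforeSave` is a hypothetical name chosen for illustration. It mirrors the logic of the patch: reaching Save answers true, while reaching Restore stops the exploration of that path.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG: block I has successors Succs[I]. Illustrative stand-in for
// MachineBasicBlock successor lists; not an LLVM type.
using CFG = std::vector<std::vector<std::size_t>>;

// Returns true if there is a path from Start to Save that does not cross
// Restore. Hypothetical helper mirroring the traversal in the patch.
bool isBeforeSave(const CFG &Succs, std::size_t Start, std::size_t Save,
                  std::size_t Restore) {
  assert(Start < Succs.size() && Save < Succs.size() &&
         Restore < Succs.size() && "block index out of range");
  std::vector<bool> Visited(Succs.size(), false);
  std::vector<std::size_t> WorkList{Start};
  Visited[Start] = true;
  while (!WorkList.empty()) {
    std::size_t Cur = WorkList.back();
    WorkList.pop_back();
    if (Cur == Save)
      return true;
    // Restore is the kill point of the region: nothing to track past it.
    if (Cur == Restore)
      continue;
    // Enqueue all the successors not already visited.
    for (std::size_t Succ : Succs[Cur])
      if (!Visited[Succ]) {
        Visited[Succ] = true;
        WorkList.push_back(Succ);
      }
  }
  return false;
}
```

With the CFG `{{1, 3}, {2}, {3}, {}}`, Save = 1 and Restore = 2, blocks 0 and 1 are reported as before the save point, while blocks 2 and 3 are not.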

lib/CodeGen/Passes.cpp

[... 46 lines elided ...]	static cl::opt<bool> DisableMachineDCE("disable-machine-dce", cl::Hidden,
cl::desc("Disable Machine Dead Code Elimination"));		cl::desc("Disable Machine Dead Code Elimination"));
static cl::opt<bool> DisableEarlyIfConversion("disable-early-ifcvt", cl::Hidden,		static cl::opt<bool> DisableEarlyIfConversion("disable-early-ifcvt", cl::Hidden,
cl::desc("Disable Early If-conversion"));		cl::desc("Disable Early If-conversion"));
static cl::opt<bool> DisableMachineLICM("disable-machine-licm", cl::Hidden,		static cl::opt<bool> DisableMachineLICM("disable-machine-licm", cl::Hidden,
cl::desc("Disable Machine LICM"));		cl::desc("Disable Machine LICM"));
static cl::opt<bool> DisableMachineCSE("disable-machine-cse", cl::Hidden,		static cl::opt<bool> DisableMachineCSE("disable-machine-cse", cl::Hidden,
cl::desc("Disable Machine Common Subexpression Elimination"));		cl::desc("Disable Machine Common Subexpression Elimination"));
static cl::opt<cl::boolOrDefault>		static cl::opt<cl::boolOrDefault>
OptimizeRegAlloc("optimize-regalloc", cl::Hidden,		EnableShrinkWrapOpt("enable-shrink-wrap", cl::Hidden,
		cl::desc("enable the shrink-wrapping pass"));
		static cl::opt<cl::boolOrDefault> OptimizeRegAlloc(
		"optimize-regalloc", cl::Hidden,
		qcolombet: clang-format related. I can back that out.
cl::desc("Enable optimized register allocation compilation path."));		cl::desc("Enable optimized register allocation compilation path."));
static cl::opt<bool> DisablePostRAMachineLICM("disable-postra-machine-licm",		static cl::opt<bool> DisablePostRAMachineLICM("disable-postra-machine-licm",
cl::Hidden,		cl::Hidden,
cl::desc("Disable Machine LICM"));		cl::desc("Disable Machine LICM"));
static cl::opt<bool> DisableMachineSink("disable-machine-sink", cl::Hidden,		static cl::opt<bool> DisableMachineSink("disable-machine-sink", cl::Hidden,
cl::desc("Disable Machine Sinking"));		cl::desc("Disable Machine Sinking"));
static cl::opt<bool> DisableLSR("disable-lsr", cl::Hidden,		static cl::opt<bool> DisableLSR("disable-lsr", cl::Hidden,
cl::desc("Disable Loop Strength Reduction Pass"));		cl::desc("Disable Loop Strength Reduction Pass"));
[... 137 lines elided ...]
// Out of line virtual method.		// Out of line virtual method.
TargetPassConfig::~TargetPassConfig() {		TargetPassConfig::~TargetPassConfig() {
delete Impl;		delete Impl;
}		}

// Out of line constructor provides default values for pass options and		// Out of line constructor provides default values for pass options and
// registers all common codegen passes.		// registers all common codegen passes.
TargetPassConfig::TargetPassConfig(TargetMachine *tm, PassManagerBase &pm)		TargetPassConfig::TargetPassConfig(TargetMachine *tm, PassManagerBase &pm)
: ImmutablePass(ID), PM(&pm), StartAfter(nullptr), StopAfter(nullptr),		: ImmutablePass(ID), PM(&pm), StartAfter(nullptr), StopAfter(nullptr),
Started(true), Stopped(false), AddingMachinePasses(false), TM(tm),		Started(true), Stopped(false), AddingMachinePasses(false), TM(tm),
Impl(nullptr), Initialized(false), DisableVerify(false),		Impl(nullptr), Initialized(false), DisableVerify(false),
EnableTailMerge(true) {		EnableTailMerge(true), EnableShrinkWrap(false) {

Impl = new PassConfigImpl();		Impl = new PassConfigImpl();

// Register all target independent codegen passes to activate their PassIDs,		// Register all target independent codegen passes to activate their PassIDs,
// including this pass itself.		// including this pass itself.
initializeCodeGen(*PassRegistry::getPassRegistry());		initializeCodeGen(*PassRegistry::getPassRegistry());

// Substitute Pseudo Pass IDs for real ones.		// Substitute Pseudo Pass IDs for real ones.
[... 298 lines elided ...]	if (getOptimizeRegAlloc())
addOptimizedRegAlloc(createRegAllocPass(true));		addOptimizedRegAlloc(createRegAllocPass(true));
else		else
addFastRegAlloc(createRegAllocPass(false));		addFastRegAlloc(createRegAllocPass(false));

// Run post-ra passes.		// Run post-ra passes.
addPostRegAlloc();		addPostRegAlloc();

// Insert prolog/epilog code. Eliminate abstract frame index references...		// Insert prolog/epilog code. Eliminate abstract frame index references...
		if (getEnableShrinkWrap())
		addPass(&ShrinkWrapID);
addPass(&PrologEpilogCodeInserterID);		addPass(&PrologEpilogCodeInserterID);

/// Add passes that optimize machine instructions after register allocation.		/// Add passes that optimize machine instructions after register allocation.
if (getOptLevel() != CodeGenOpt::None)		if (getOptLevel() != CodeGenOpt::None)
addMachineLateOptimization();		addMachineLateOptimization();

// Expand pseudo instructions before second scheduling pass.		// Expand pseudo instructions before second scheduling pass.
addPass(&ExpandPostRAPseudosID);		addPass(&ExpandPostRAPseudosID);
[... 59 lines elided ...]	void TargetPassConfig::addMachineSSAOptimization() {
addPass(&MachineSinkingID);		addPass(&MachineSinkingID);

addPass(&PeepholeOptimizerID, false);		addPass(&PeepholeOptimizerID, false);
// Clean-up the dead code that may have been generated by peephole		// Clean-up the dead code that may have been generated by peephole
// rewriting.		// rewriting.
addPass(&DeadMachineInstructionElimID);		addPass(&DeadMachineInstructionElimID);
}		}

		bool TargetPassConfig::getEnableShrinkWrap() const {
		switch (EnableShrinkWrapOpt) {
		case cl::BOU_UNSET:
		return EnableShrinkWrap && getOptLevel() != CodeGenOpt::None;
		// If EnableShrinkWrapOpt is set, it takes precedence over whatever the
		// target sets. The rationale is that we assume we want to test
		// something related to shrink-wrapping.
		case cl::BOU_TRUE:
		return true;
		case cl::BOU_FALSE:
		return false;
		}
		llvm_unreachable("Invalid shrink-wrapping state");
		}

//===---------------------------------------------------------------------===//		//===---------------------------------------------------------------------===//
/// Register Allocation Pass Configuration		/// Register Allocation Pass Configuration
//===---------------------------------------------------------------------===//		//===---------------------------------------------------------------------===//

bool TargetPassConfig::getOptimizeRegAlloc() const {		bool TargetPassConfig::getOptimizeRegAlloc() const {
switch (OptimizeRegAlloc) {		switch (OptimizeRegAlloc) {
case cl::BOU_UNSET: return getOptLevel() != CodeGenOpt::None;		case cl::BOU_UNSET: return getOptLevel() != CodeGenOpt::None;
case cl::BOU_TRUE: return true;		case cl::BOU_TRUE: return true;
[... 161 lines elided ...]
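
The precedence implemented by getEnableShrinkWrap above (command-line option first, then the target default gated on the optimization level) can be sketched in isolation. This is an illustrative stand-in only: `TriState` plays the role of cl::boolOrDefault, and `resolveShrinkWrap` and its parameters are hypothetical names, not part of the patch.

```cpp
#include <cassert>

// Stand-in for cl::boolOrDefault; illustrative only.
enum class TriState { Unset, True, False };

// CmdLine models the -enable-shrink-wrap option, TargetDefault models the
// EnableShrinkWrap member set by the target, and Optimizing models
// getOptLevel() != CodeGenOpt::None.
bool resolveShrinkWrap(TriState CmdLine, bool TargetDefault, bool Optimizing) {
  switch (CmdLine) {
  case TriState::Unset:
    // No explicit request: follow the target, but only when optimizing.
    return TargetDefault && Optimizing;
  case TriState::True:
    // An explicit request wins, even at -O0.
    return true;
  case TriState::False:
    return false;
  }
  assert(false && "Invalid shrink-wrapping state");
  return false;
}
```

Under this sketch an explicit option always overrides the target, which matches the stated intent of making the pass easy to test regardless of the target's default.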

lib/CodeGen/PrologEpilogInserter.cpp

[... 65 lines elided ...]

private:		private:
RegScavenger *RS;		RegScavenger *RS;

// MinCSFrameIndex, MaxCSFrameIndex - Keeps the range of callee saved		// MinCSFrameIndex, MaxCSFrameIndex - Keeps the range of callee saved
// stack frame indexes.		// stack frame indexes.
unsigned MinCSFrameIndex, MaxCSFrameIndex;		unsigned MinCSFrameIndex, MaxCSFrameIndex;

// Entry and return blocks of the current function.		// Save and Restore blocks of the current function.
MachineBasicBlock *EntryBlock;		MachineBasicBlock *SaveBlock;
SmallVector<MachineBasicBlock *, 4> ReturnBlocks;		SmallVector<MachineBasicBlock *, 4> RestoreBlocks;

// Flag to control whether to use the register scavenger to resolve		// Flag to control whether to use the register scavenger to resolve
// frame index materialization registers. Set according to		// frame index materialization registers. Set according to
// TRI->requiresFrameIndexScavenging() for the current function.		// TRI->requiresFrameIndexScavenging() for the current function.
bool FrameIndexVirtualScavenging;		bool FrameIndexVirtualScavenging;

void calculateSets(MachineFunction &Fn);		void calculateSets(MachineFunction &Fn);
void calculateCallsInformation(MachineFunction &Fn);		void calculateCallsInformation(MachineFunction &Fn);
[... 43 lines elided ...]
}		}

bool PEI::isReturnBlock(MachineBasicBlock* MBB) {		bool PEI::isReturnBlock(MachineBasicBlock* MBB) {
return (MBB && !MBB->empty() && MBB->back().isReturn());		return (MBB && !MBB->empty() && MBB->back().isReturn());
}		}

/// Compute the set of return blocks		/// Compute the set of return blocks
void PEI::calculateSets(MachineFunction &Fn) {		void PEI::calculateSets(MachineFunction &Fn) {
// Sets used to compute spill, restore placement sets.		const MachineFrameInfo *MFI = Fn.getFrameInfo();
const std::vector<CalleeSavedInfo> &CSI =
Fn.getFrameInfo()->getCalleeSavedInfo();

// If no CSRs used, we are done.		// Even when we do not change any CSR, we still want to insert the
if (CSI.empty())		// prologue and epilogue of the function.
		// So set the save points for those.

		// Use the points found by shrink-wrapping, if any.
		if (MFI->getSavePoint()) {
		SaveBlock = MFI->getSavePoint();
		assert(MFI->getRestorePoint() && "Both restore and save must be set");
		RestoreBlocks.push_back(MFI->getRestorePoint());
return;		return;
		}

// Save refs to entry and return blocks.		// Save refs to entry and return blocks.
EntryBlock = Fn.begin();		SaveBlock = Fn.begin();
for (MachineFunction::iterator MBB = Fn.begin(), E = Fn.end();		for (MachineFunction::iterator MBB = Fn.begin(), E = Fn.end();
MBB != E; ++MBB)		MBB != E; ++MBB)
if (isReturnBlock(MBB))		if (isReturnBlock(MBB))
ReturnBlocks.push_back(MBB);		RestoreBlocks.push_back(MBB);

return;		return;
		kristof.beyls: Just a minor comment, feel free to ignore this one: I feel that it would be a slightly better separation of concerns if MFI->getSavePoint() would always be set, and PEI doesn't need to calculate a SavePoint itself when it isn't set. For this to happen, the ShrinkWrap pass should probably always be run, and when it's "switched off by default", it would just set MFI->SavePoint and MFI->Restore to be equal to Fn.begin() and to the ReturnBlocks respectively. In other words, the second half of the code above would be executed in the ShrinkWrap pass. If you'd push ahead and implement it like this, the name of the pass also should be changed to "SaveRestorePointInserter/Calculator" or something similar. It's then a pass which always calculates the most appropriate Save and Restore blocks and happens to implement a ShrinkWrapping optimization. As I said, maybe this is pushing for too much separation of concerns - I'm not sure what the negative consequences would be to require the "SaveRestorePointInserterPass" to always be run, if any.
}		}

/// StackObjSet - A set of stack object indexes		/// StackObjSet - A set of stack object indexes
typedef SmallSetVector<int, 8> StackObjSet;		typedef SmallSetVector<int, 8> StackObjSet;

/// runOnMachineFunction - Insert prolog/epilog code and replace abstract		/// runOnMachineFunction - Insert prolog/epilog code and replace abstract
/// frame indexes with appropriate references.		/// frame indexes with appropriate references.
///		///
[... 61 lines elided ...]	bool PEI::runOnMachineFunction(MachineFunction &Fn) {
MachineFrameInfo *MFI = Fn.getFrameInfo();		MachineFrameInfo *MFI = Fn.getFrameInfo();
uint64_t StackSize = MFI->getStackSize();		uint64_t StackSize = MFI->getStackSize();
if (WarnStackSize.getNumOccurrences() > 0 && WarnStackSize < StackSize) {		if (WarnStackSize.getNumOccurrences() > 0 && WarnStackSize < StackSize) {
DiagnosticInfoStackSize DiagStackSize(*F, StackSize);		DiagnosticInfoStackSize DiagStackSize(*F, StackSize);
F->getContext().diagnose(DiagStackSize);		F->getContext().diagnose(DiagStackSize);
}		}

delete RS;		delete RS;
ReturnBlocks.clear();		RestoreBlocks.clear();
return true;		return true;
}		}

/// calculateCallsInformation - Calculate the MaxCallFrameSize and AdjustsStack		/// calculateCallsInformation - Calculate the MaxCallFrameSize and AdjustsStack
/// variables for the function's frame information and eliminate call frame		/// variables for the function's frame information and eliminate call frame
/// pseudo instructions.		/// pseudo instructions.
void PEI::calculateCallsInformation(MachineFunction &Fn) {		void PEI::calculateCallsInformation(MachineFunction &Fn) {
const TargetInstrInfo &TII = *Fn.getSubtarget().getInstrInfo();		const TargetInstrInfo &TII = *Fn.getSubtarget().getInstrInfo();
[... 129 lines elided ...]	for (std::vector<CalleeSavedInfo>::iterator I = CSI.begin(), E = CSI.end();

I->setFrameIdx(FrameIdx);		I->setFrameIdx(FrameIdx);
}		}
}		}

MFI->setCalleeSavedInfo(CSI);		MFI->setCalleeSavedInfo(CSI);
}		}

		/// Helper function to update the liveness information for the callee-saved
		/// registers.
		static void updateLiveness(MachineFunction &MF) {
		MachineFrameInfo *MFI = MF.getFrameInfo();
		// Visited will contain all the basic blocks that are in the region
		// where the callee saved registers are alive:
		kristof.beyls: "basic block" -> "basic blocks"
		// - Anything that is not Save or Restore -> LiveThrough.
		// - Save -> LiveIn.
		// - Restore -> LiveOut.
		// The live-out is not attached to the block, so no need to keep
		// Restore in this set.
		SmallPtrSet<MachineBasicBlock *, 8> Visited;
		SmallVector<MachineBasicBlock *, 8> WorkList;
		MachineBasicBlock *Entry = &MF.front();
		MachineBasicBlock *Save = MFI->getSavePoint();

		if (!Save)
		Save = Entry;
		kristof.beyls: I'm wondering if the logic overall would get simpler if MFI->getSavePoint() would always return a MachineBasicBlock, i.e. returning Entry if shrink wrapping didn't change the default save point? That way, the logic in most functions doesn't need to handle the case separately of whether or not the save block and entry block are the same - generalizing the logic a bit more. But maybe there's a good reason to do it like this that I haven't noticed?
		qcolombet: The rationale was that this is homogeneous with the way we handle the Restore point. I.e., current shrink-wrapping gives just one Restore point, whereas PEI can have multiple of them. So I did not want to unbalance this. I guess I can also push the list of exit blocks into MFI if you think that is better.
		qcolombet: What do you think? Should I:
		1. Keep Restore and Save as they are?
		2. Create an asymmetry between their handling? (Don’t like that.)
		3. Push everything into MachineFrameInfo?
		kristof.beyls: Looking further into the details of how PEI is implemented at the moment, I'm starting to think that the best option probably is:
		- Get rid of the EntryBlock variable in class PEI, and replace it/change its name to "SaveBlock".
		- Get rid of the ReturnBlocks variable in class PEI, and replace it/change its name to "RestoreBlocks".
		That way, the names of the variables reflect the intent better, i.e. the block where saves and restores need to be inserted. This can be done as a separate NFC-patch. After that, the shrink wrapping pass "just" optimizes the SaveBlock and RestoreBlocks values. With the current implementation, if the shrink wrapping actually does any optimization, RestoreBlocks will be a single-element vector. Doing it this way, I think PEI code should not have to special case (i.e. have if statements) based on whether or not the shrink wrapping optimization happened. I hope. I guess this corresponds to option 3, "push everything into MachineFrameInfo"?

		if (Entry != Save) {
		WorkList.push_back(Entry);
		Visited.insert(Entry);
		}
		Visited.insert(Save);

		MachineBasicBlock *Restore = MFI->getRestorePoint();
		if (Restore)
		// By construction Restore cannot be visited, otherwise it
		// means there exists a path to Restore that does not go
		// through Save.
		WorkList.push_back(Restore);

		while (!WorkList.empty()) {
		const MachineBasicBlock *CurBB = WorkList.pop_back_val();
		// By construction, the region that is after the save point is
		// dominated by the Save and post-dominated by the Restore.
		if (CurBB == Save)
		continue;
		// Enqueue all the successors not already visited.
		// Those are by construction either before Save or after Restore.
		for (MachineBasicBlock *SuccBB : CurBB->successors())
		if (Visited.insert(SuccBB).second)
		WorkList.push_back(SuccBB);
		}

		const std::vector<CalleeSavedInfo> &CSI = MFI->getCalleeSavedInfo();

		for (unsigned i = 0, e = CSI.size(); i != e; ++i) {
		for (MachineBasicBlock *MBB : Visited)
		// Add the callee-saved register as live-in.
		// It's killed at the spill.
		MBB->addLiveIn(CSI[i].getReg());
		}
		}

/// insertCSRSpillsAndRestores - Insert spill and restore code for		/// insertCSRSpillsAndRestores - Insert spill and restore code for
/// callee saved registers used in the function.		/// callee saved registers used in the function.
///		///
void PEI::insertCSRSpillsAndRestores(MachineFunction &Fn) {		void PEI::insertCSRSpillsAndRestores(MachineFunction &Fn) {
// Get callee saved register information.		// Get callee saved register information.
MachineFrameInfo *MFI = Fn.getFrameInfo();		MachineFrameInfo *MFI = Fn.getFrameInfo();
const std::vector<CalleeSavedInfo> &CSI = MFI->getCalleeSavedInfo();		const std::vector<CalleeSavedInfo> &CSI = MFI->getCalleeSavedInfo();

MFI->setCalleeSavedInfoValid(true);		MFI->setCalleeSavedInfoValid(true);

// Early exit if no callee saved registers are modified!		// Early exit if no callee saved registers are modified!
if (CSI.empty())		if (CSI.empty())
return;		return;

const TargetInstrInfo &TII = *Fn.getSubtarget().getInstrInfo();		const TargetInstrInfo &TII = *Fn.getSubtarget().getInstrInfo();
const TargetFrameLowering *TFI = Fn.getSubtarget().getFrameLowering();		const TargetFrameLowering *TFI = Fn.getSubtarget().getFrameLowering();
const TargetRegisterInfo *TRI = Fn.getSubtarget().getRegisterInfo();		const TargetRegisterInfo *TRI = Fn.getSubtarget().getRegisterInfo();
MachineBasicBlock::iterator I;		MachineBasicBlock::iterator I;

// Spill using target interface.		// Spill using target interface.
I = EntryBlock->begin();		I = SaveBlock->begin();
if (!TFI->spillCalleeSavedRegisters(*EntryBlock, I, CSI, TRI)) {		if (!TFI->spillCalleeSavedRegisters(*SaveBlock, I, CSI, TRI)) {
for (unsigned i = 0, e = CSI.size(); i != e; ++i) {		for (unsigned i = 0, e = CSI.size(); i != e; ++i) {
// Add the callee-saved register as live-in.
// It's killed at the spill.
EntryBlock->addLiveIn(CSI[i].getReg());

// Insert the spill to the stack frame.		// Insert the spill to the stack frame.
unsigned Reg = CSI[i].getReg();		unsigned Reg = CSI[i].getReg();
const TargetRegisterClass *RC = TRI->getMinimalPhysRegClass(Reg);		const TargetRegisterClass *RC = TRI->getMinimalPhysRegClass(Reg);
TII.storeRegToStackSlot(*EntryBlock, I, Reg, true, CSI[i].getFrameIdx(),		TII.storeRegToStackSlot(*SaveBlock, I, Reg, true, CSI[i].getFrameIdx(),
RC, TRI);		RC, TRI);
}		}
}		}
		// Update the live-in information of all the blocks up to the save point.
		updateLiveness(Fn);

// Restore using target interface.		// Restore using target interface.
for (unsigned ri = 0, re = ReturnBlocks.size(); ri != re; ++ri) {		for (MachineBasicBlock *MBB : RestoreBlocks) {
MachineBasicBlock *MBB = ReturnBlocks[ri];
I = MBB->end();		I = MBB->end();
--I;

// Skip over all terminator instructions, which are part of the return		// Skip over all terminator instructions, which are part of the return
// sequence.		// sequence.
MachineBasicBlock::iterator I2 = I;		MachineBasicBlock::iterator I2 = I;
while (I2 != MBB->begin() && (--I2)->isTerminator())		while (I2 != MBB->begin() && (--I2)->isTerminator())
I = I2;		I = I2;

bool AtStart = I == MBB->begin();		bool AtStart = I == MBB->begin();
[... 293 lines elided ...]
/// insertPrologEpilogCode - Scan the function for modified callee saved		/// insertPrologEpilogCode - Scan the function for modified callee saved
/// registers, insert spill code for these callee saved registers, then add		/// registers, insert spill code for these callee saved registers, then add
/// prolog and epilog code to the function.		/// prolog and epilog code to the function.
///		///
void PEI::insertPrologEpilogCode(MachineFunction &Fn) {		void PEI::insertPrologEpilogCode(MachineFunction &Fn) {
const TargetFrameLowering &TFI = *Fn.getSubtarget().getFrameLowering();		const TargetFrameLowering &TFI = *Fn.getSubtarget().getFrameLowering();

// Add prologue to the function...		// Add prologue to the function...
TFI.emitPrologue(Fn);		TFI.emitPrologue(Fn, *SaveBlock);

// Add epilogue to restore the callee-save registers in each exiting block		// Add epilogue to restore the callee-save registers in each exiting block.
for (MachineFunction::iterator I = Fn.begin(), E = Fn.end(); I != E; ++I) {		for (MachineBasicBlock *RetBlock : RestoreBlocks)
// If last instruction is a return instruction, add an epilogue		TFI.emitEpilogue(Fn, *RetBlock);
		kristof.beyls: s/RetBlock/RestoreBlock/? The basic blocks iterated through are "Restore blocks", not necessarily "Return blocks"?
if (!I->empty() && I->back().isReturn())
TFI.emitEpilogue(Fn, *I);
}

// Emit additional code that is required to support segmented stacks, if		// Emit additional code that is required to support segmented stacks, if
// we've been asked for it. This, when linked with a runtime with support		// we've been asked for it. This, when linked with a runtime with support
// for segmented stacks (libgcc is one), will result in allocating stack		// for segmented stacks (libgcc is one), will result in allocating stack
// space in small chunks instead of one large contiguous block.		// space in small chunks instead of one large contiguous block.
if (Fn.shouldSplitStack())		if (Fn.shouldSplitStack())
TFI.adjustForSegmentedStacks(Fn);		TFI.adjustForSegmentedStacks(Fn, *SaveBlock);

// Emit additional code that is required to explicitly handle the stack in		// Emit additional code that is required to explicitly handle the stack in
// HiPE native code (if needed) when loaded in the Erlang/OTP runtime. The		// HiPE native code (if needed) when loaded in the Erlang/OTP runtime. The
// approach is rather similar to that of Segmented Stacks, but it uses a		// approach is rather similar to that of Segmented Stacks, but it uses a
// different conditional check and another BIF for allocating more stack		// different conditional check and another BIF for allocating more stack
// space.		// space.
if (Fn.getFunction()->getCallingConv() == CallingConv::HiPE)		if (Fn.getFunction()->getCallingConv() == CallingConv::HiPE)
TFI.adjustForHiPEPrologue(Fn);		TFI.adjustForHiPEPrologue(Fn, *SaveBlock);
}		}

/// replaceFrameIndices - Replace all MO_FrameIndex operands with physical		/// replaceFrameIndices - Replace all MO_FrameIndex operands with physical
/// register references and actual offsets.		/// register references and actual offsets.
///		///
void PEI::replaceFrameIndices(MachineFunction &Fn) {		void PEI::replaceFrameIndices(MachineFunction &Fn) {
const TargetFrameLowering &TFI = *Fn.getSubtarget().getFrameLowering();		const TargetFrameLowering &TFI = *Fn.getSubtarget().getFrameLowering();
if (!TFI.needsFrameIndexResolution(Fn)) return;		if (!TFI.needsFrameIndexResolution(Fn)) return;
[... 262 lines elided ...]
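
The region computed by the updateLiveness helper in this file (the blocks that receive the callee-saved registers as live-ins) can be reproduced on a toy CFG. The sketch below is an illustration under stated assumptions, not the patch's code: blocks are integer indices, `NoBlock` is a hypothetical sentinel standing in for a null save or restore point, and `liveInBlocks` is a made-up name. As in the patch, the walk from Entry stops at Save, and the walk from Restore covers everything after the shrink-wrapped region.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy CFG: block I has successors Succs[I]. Illustrative only.
using CFG = std::vector<std::vector<std::size_t>>;
// Hypothetical sentinel meaning "shrink-wrapping did not set this point".
const std::size_t NoBlock = static_cast<std::size_t>(-1);

// Returns the blocks in which the callee-saved registers are live-in:
// every block before Save, Save itself, and every block after Restore.
std::set<std::size_t> liveInBlocks(const CFG &Succs, std::size_t Entry,
                                   std::size_t Save, std::size_t Restore) {
  assert(Entry < Succs.size() && "Entry must be a valid block");
  assert((Restore == NoBlock || Restore < Succs.size()) &&
         "Restore must be a valid block when set");
  if (Save == NoBlock)
    Save = Entry;
  std::set<std::size_t> Visited;
  std::vector<std::size_t> WorkList;
  if (Entry != Save) {
    WorkList.push_back(Entry);
    Visited.insert(Entry);
  }
  Visited.insert(Save);
  // Restore is enqueued but not recorded: the live-out is not attached to
  // the block, so it does not belong to the resulting set.
  if (Restore != NoBlock)
    WorkList.push_back(Restore);
  while (!WorkList.empty()) {
    std::size_t Cur = WorkList.back();
    WorkList.pop_back();
    if (Cur == Save)
      continue;
    // Enqueue all the successors not already visited.
    for (std::size_t Succ : Succs[Cur])
      if (Visited.insert(Succ).second)
        WorkList.push_back(Succ);
  }
  return Visited;
}
```

On the CFG `{{1, 3}, {2}, {3}, {}}` with Entry = 0, Save = 1 and Restore = 2, the result is {0, 1, 3}. When both points are `NoBlock`, only the entry block is returned, which corresponds to the behaviour before this patch.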

lib/CodeGen/ShrinkWrap.cpp

				//===-- ShrinkWrap.cpp - Compute safe point for prolog/epilog insertion ---===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This pass looks for safe points where the prologue and epilogue can be
				// inserted.
				// The safe point for the prologue (resp. epilogue) is called Save
				// (resp. Restore).
				// A point is safe for prologue (resp. epilogue) if and only if
				// 1) it dominates (resp. post-dominates) all the frame related operations and
				// 2) between two executions of the Save (resp. Restore) point there is an
				// execution of the Restore (resp. Save) point.
				//
				// For instance, the following points are safe:
				// for (int i = 0; i < 10; ++i) {
				// Save
				// ...
				// Restore
				// }
				// Indeed, the execution looks like Save -> Restore -> Save -> Restore ...
				// And the following points are not:
				// for (int i = 0; i < 10; ++i) {
				// Save
				// ...
				// }
				// for (int i = 0; i < 10; ++i) {
				// ...
				// Restore
				// }
				// Indeed, the execution looks like Save -> Save -> ... -> Restore -> Restore.
				//
				// This pass also ensures that the safe points are 3) cheaper than the regular
				// entry and exit blocks.
				//
				// Property #1 is ensured via the use of MachineDominatorTree and
				// MachinePostDominatorTree.
				// Property #2 is ensured via property #1 and MachineLoopInfo, i.e., both
				// points must be in the same loop.
				// Property #3 is ensured via the MachineBlockFrequencyInfo.
				//
				// If this pass finds points matching all these properties, then
				// MachineFrameInfo is updated with that information.
				//===----------------------------------------------------------------------===//
				#include "llvm/ADT/Statistic.h"
				// To check for profitability.
				#include "llvm/CodeGen/MachineBlockFrequencyInfo.h"
				// For property #1 for Save.
				#include "llvm/CodeGen/MachineDominators.h"
				#include "llvm/CodeGen/MachineFunctionPass.h"
				// To record the result of the analysis.
				#include "llvm/CodeGen/MachineFrameInfo.h"
				// For property #2.
				#include "llvm/CodeGen/MachineLoopInfo.h"
				// For property #1 for Restore.
				#include "llvm/CodeGen/MachinePostDominators.h"
				#include "llvm/CodeGen/Passes.h"
				// To know about callee-saved.
				#include "llvm/CodeGen/RegisterClassInfo.h"
				#include "llvm/Support/Debug.h"
				// To know about frame setup operation.
				#include "llvm/Target/TargetInstrInfo.h"
				// To access TargetInstrInfo.
				#include "llvm/Target/TargetSubtargetInfo.h"

				#define DEBUG_TYPE "shrink-wrap"

				using namespace llvm;

				STATISTIC(NumFunc, "Number of functions");
				STATISTIC(NumCandidates, "Number of shrink-wrapping candidates");
				STATISTIC(NumCandidatesDropped,
				"Number of shrink-wrapping candidates dropped because of frequency");

				namespace {
				/// \brief Class to determine where the safe points to insert the
				/// prologue and epilogue are.
				/// Unlike the paper from Fred C. Chow, PLDI'88, that introduces the
				/// shrink-wrapping term for prologue/epilogue placement, this pass
				/// does not rely on expensive data-flow analysis. Instead we use the
				/// dominance properties and loop information to decide which points
				/// are safe for such insertion.
				class ShrinkWrap : public MachineFunctionPass {
				/// Hold callee-saved information.
				RegisterClassInfo RCI;
				MachineDominatorTree *MDT;
				MachinePostDominatorTree *MPDT;
				/// Current safe point found for the prologue.
				/// The prologue will be inserted before the first instruction
				/// in this basic block.
				MachineBasicBlock *Save;
				/// Current safe point found for the epilogue.
				/// The epilogue will be inserted before the first terminator instruction
				/// in this basic block.
				MachineBasicBlock *Restore;
				/// Hold the information of the basic block frequency.
				/// Used to check the profitability of the new points.
				MachineBlockFrequencyInfo *MBFI;
				/// Hold the loop information. Used to determine if Save and Restore
				/// are in the same loop.
				MachineLoopInfo *MLI;
				/// Frequency of the Entry block.
				uint64_t EntryFreq;
				/// Current opcode for frame setup.
				int FrameSetupOpcode;
				/// Current opcode for frame destroy.
				int FrameDestroyOpcode;
				/// Entry block.
				const MachineBasicBlock *Entry;

				/// \brief Check if \p MI uses or defines a callee-saved register or
				/// a frame index. If this is the case, this means \p MI must happen
				/// after Save and before Restore.
				bool useOrDefCSROrFI(const MachineInstr &MI) const;

				/// \brief Update the Save and Restore points such that \p MBB is in
				/// the region that is dominated by Save and post-dominated by Restore
				/// and Save and Restore still match the safe point definition.
				/// Such point may not exist and Save and/or Restore may be null after
				/// this call.
				void updateSaveRestorePoints(MachineBasicBlock &MBB);

				/// \brief Initialize the pass for \p MF.
				void init(MachineFunction &MF) {
				RCI.runOnMachineFunction(MF);
				MDT = &getAnalysis<MachineDominatorTree>();
				MPDT = &getAnalysis<MachinePostDominatorTree>();
				Save = nullptr;
				Restore = nullptr;
				MBFI = &getAnalysis<MachineBlockFrequencyInfo>();
				MLI = &getAnalysis<MachineLoopInfo>();
				EntryFreq = MBFI->getEntryFreq();
				const TargetInstrInfo &TII = *MF.getSubtarget().getInstrInfo();
				FrameSetupOpcode = TII.getCallFrameSetupOpcode();
				FrameDestroyOpcode = TII.getCallFrameDestroyOpcode();
				Entry = &MF.front();

				++NumFunc;
				}

				/// Check whether or not Save and Restore points are still interesting for
				/// shrink-wrapping.
				bool ArePointsInteresting() const { return Save != Entry && Save && Restore; }

				public:
				static char ID;

				ShrinkWrap() : MachineFunctionPass(ID) {
				initializeShrinkWrapPass(*PassRegistry::getPassRegistry());
				}

				void getAnalysisUsage(AnalysisUsage &AU) const override {
				AU.setPreservesAll();
				AU.addRequired<MachineBlockFrequencyInfo>();
				AU.addRequired<MachineDominatorTree>();
				AU.addRequired<MachinePostDominatorTree>();
				AU.addRequired<MachineLoopInfo>();
				MachineFunctionPass::getAnalysisUsage(AU);
				}

				const char *getPassName() const override {
				return "Shrink Wrapping analysis";
				}

				/// \brief Perform the shrink-wrapping analysis and update
				/// the MachineFrameInfo attached to \p MF with the results.
				bool runOnMachineFunction(MachineFunction &MF) override;
				};
				} // End anonymous namespace.

				char ShrinkWrap::ID = 0;
				char &llvm::ShrinkWrapID = ShrinkWrap::ID;

				INITIALIZE_PASS_BEGIN(ShrinkWrap, "shrink-wrap", "Shrink Wrap Pass", false,
				false)
				INITIALIZE_PASS_DEPENDENCY(MachineBlockFrequencyInfo)
				INITIALIZE_PASS_DEPENDENCY(MachineDominatorTree)
				INITIALIZE_PASS_DEPENDENCY(MachinePostDominatorTree)
				INITIALIZE_PASS_DEPENDENCY(MachineLoopInfo)
				INITIALIZE_PASS_END(ShrinkWrap, "shrink-wrap", "Shrink Wrap Pass", false, false)

				bool ShrinkWrap::useOrDefCSROrFI(const MachineInstr &MI) const {
				if (MI.getOpcode() == FrameSetupOpcode ||
				MI.getOpcode() == FrameDestroyOpcode) {
				DEBUG(dbgs() << "Frame instruction: " << MI << '\n');
				return true;
				}
				for (const MachineOperand &MO : MI.operands()) {
				bool UseCSR = false;
				if (MO.isReg()) {
				unsigned PhysReg = MO.getReg();
				if (!PhysReg)
				continue;
				assert(TargetRegisterInfo::isPhysicalRegister(PhysReg) &&
				"Unallocated register?!");
				UseCSR = RCI.getLastCalleeSavedAlias(PhysReg);
				}
				// TODO: Handle regmask more accurately.
				// For now, be conservative about them.
				if (UseCSR || MO.isFI() || MO.isRegMask()) {
				DEBUG(dbgs() << "Use or define CSR(" << UseCSR << ") or FI(" << MO.isFI()
				<< "): " << MI << '\n');
				return true;
				}
				}
				return false;
				}
				kristof.beyls: This seems like the most critical function to get right from a correctness point-of-view? I can't derive from just staring at this code whether this is also going to catch code generated for VLAs, or inline assembly fragments touching the frame, or registers such as stack pointer, frame pointer, base pointer. I'm guessing these could be good cases to add as regression tests - i.e. a case where only the presence of a VLA prevents the optimization & a case where only the presence of a piece of inline assembly prevents the optimization?
				qcolombet (Author): If the callee-saved information is properly set, it should… Not sure how the VLA are handled, definitely a good regression test to add! For inline assembly, I guess you are right and we could play it safe. Additionally, we could add a target hook to check whether or not a given instruction should be after the prologue and before the epilogue. At the moment, I haven't needed this, but if it proves to be useful we can definitely do that.
				qcolombet (Author): The regression tests did not show any problem with the current implementation.
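
To make the discussion above concrete, here is a minimal, self-contained sketch of the conservative per-operand check. It is a toy model only: `ToyOperand`, `ToyInstr`, `OperandKind` and `mustBeInsideSaveRestore` are hypothetical names invented for illustration and are not LLVM types or APIs. The callee-saved set is passed in explicitly, whereas the patch queries `RegisterClassInfo`.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-ins for MachineOperand / MachineInstr; not LLVM types.
enum class OperandKind { Register, FrameIndex, RegMask, Immediate };

struct ToyOperand {
  OperandKind Kind;
  unsigned Reg; // Only meaningful for Register; 0 means "no register".
};

struct ToyInstr {
  bool IsFrameSetupOrDestroy;
  std::vector<ToyOperand> Operands;
};

// Same shape as useOrDefCSROrFI in the patch: a frame setup/destroy
// instruction, a callee-saved register, a frame index, or a register mask
// forces the instruction to sit between Save and Restore.
bool mustBeInsideSaveRestore(const ToyInstr &MI,
                             const std::set<unsigned> &CalleeSaved) {
  if (MI.IsFrameSetupOrDestroy)
    return true;
  for (const ToyOperand &MO : MI.Operands) {
    switch (MO.Kind) {
    case OperandKind::Register:
      if (MO.Reg != 0 && CalleeSaved.count(MO.Reg))
        return true;
      break;
    case OperandKind::FrameIndex:
    case OperandKind::RegMask: // Conservative, as in the patch's TODO.
      return true;
    case OperandKind::Immediate:
      break;
    }
  }
  return false;
}
```

In this model, an instruction that only touches registers outside the callee-saved set is not flagged. That is why the answer to the reviewer's question depends on how the target describes its callee-saved registers, and why dedicated regression tests for VLAs and inline assembly are worth having.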

				/// \brief Helper function to find the immediate (post) dominator.
				template <typename ListOfBBs, typename DominanceAnalysis>
				MachineBasicBlock *FindIDom(MachineBasicBlock &Block, ListOfBBs BBs,
				DominanceAnalysis &Dom) {
				MachineBasicBlock *IDom = &Block;
				for (MachineBasicBlock *BB : BBs) {
				IDom = Dom.findNearestCommonDominator(IDom, BB);
				if (!IDom)
				break;
				}
				return IDom;
				}

				void ShrinkWrap::updateSaveRestorePoints(MachineBasicBlock &MBB) {
				// Get rid of the easy cases first.
				if (!Save)
				Save = &MBB;
				else
				Save = MDT->findNearestCommonDominator(Save, &MBB);

				if (!Save) {
				DEBUG(dbgs() << "Found a block that is not reachable from Entry\n");
				return;
				}

				if (!Restore)
				Restore = &MBB;
				else
				Restore = MPDT->findNearestCommonDominator(Restore, &MBB);

				// Make sure we would be able to insert the restore code before the
				// terminator.
				if (Restore == &MBB) {
				for (const MachineInstr &Terminator : MBB.terminators()) {
				if (!useOrDefCSROrFI(Terminator))
				continue;
				// One of the terminators needs to happen before the restore point.
				if (MBB.succ_empty()) {
				Restore = nullptr;
				break;
				}
				// Look for a restore point that post-dominates all the successors.
				// The immediate post-dominator is what we are looking for.
				Restore = FindIDom<>(*Restore, Restore->successors(), *MPDT);
				break;
				}
				}

				if (!Restore) {
				DEBUG(dbgs() << "Restore point needs to be spanned on several blocks\n");
				return;
				}

				// Make sure Save and Restore are suitable for shrink-wrapping:
				// 1. all paths from Save need to lead to Restore before exiting.
				// 2. all paths to Restore need to go through Save from Entry.
				// We achieve that by making sure that:
				// A. Save dominates Restore.
				// B. Restore post-dominates Save.
				// C. Save and Restore are in the same loop.
				bool SaveDominatesRestore = false;
				bool RestorePostDominatesSave = false;
				while (Save && Restore &&
				(!(SaveDominatesRestore = MDT->dominates(Save, Restore)) ||
				!(RestorePostDominatesSave = MPDT->dominates(Restore, Save)) ||
				MLI->getLoopFor(Save) != MLI->getLoopFor(Restore))) {
				// Fix (A).
				if (!SaveDominatesRestore) {
				Save = MDT->findNearestCommonDominator(Save, Restore);
				continue;
				}
				// Fix (B).
				if (!RestorePostDominatesSave)
				Restore = MPDT->findNearestCommonDominator(Restore, Save);

				// Fix (C).
				if (Save && Restore && Save != Restore &&
				MLI->getLoopFor(Save) != MLI->getLoopFor(Restore)) {
				if (MLI->getLoopDepth(Save) > MLI->getLoopDepth(Restore))
				// Push Save outside of this loop.
				Save = FindIDom<>(*Save, Save->predecessors(), *MDT);
				else
				// Push Restore outside of this loop.
				Restore = FindIDom<>(*Restore, Restore->successors(), *MPDT);
				}
				}
				}

				bool ShrinkWrap::runOnMachineFunction(MachineFunction &MF) {
				if (MF.empty())
				return false;
				DEBUG(dbgs() << "**** Analysing " << MF.getName() << '\n');

				init(MF);

				for (MachineBasicBlock &MBB : MF) {
				DEBUG(dbgs() << "Look into: " << MBB.getNumber() << ' ' << MBB.getName()
				<< '\n');

				for (const MachineInstr &MI : MBB) {
				if (!useOrDefCSROrFI(MI))
				continue;
				// Save (resp. restore) point must dominate (resp. post dominate)
				// MI. Look for the proper basic block for those.
				updateSaveRestorePoints(MBB);
				// If we are at a point where we cannot improve the placement of
				// save/restore instructions, just give up.
				if (!ArePointsInteresting()) {
				DEBUG(dbgs() << "No Shrink wrap candidate found\n");
				return false;
				}
				// No need to look for other instructions, this basic block
				// will already be part of the handled region.
				break;
				}
				}
				if (!ArePointsInteresting()) {
				// If the points are not interesting at this point, then they must be null
				// because it means we did not encounter any frame/CSR related code.
				// Otherwise, we would have returned from the previous loop.
				assert(!Save && !Restore && "We miss a shrink-wrap opportunity?!");
				DEBUG(dbgs() << "Nothing to shrink-wrap\n");
				return false;
				}

				DEBUG(dbgs() << "\n Results \nFrequency of the Entry: " << EntryFreq
				<< '\n');

				do {
				DEBUG(dbgs() << "Shrink wrap candidates (#, Name, Freq):\nSave: "
				<< Save->getNumber() << ' ' << Save->getName() << ' '
				<< MBFI->getBlockFreq(Save).getFrequency() << "\nRestore: "
				<< Restore->getNumber() << ' ' << Restore->getName() << ' '
				<< MBFI->getBlockFreq(Restore).getFrequency() << '\n');

				bool IsSaveCheap;
				if ((IsSaveCheap = EntryFreq >= MBFI->getBlockFreq(Save).getFrequency()) &&
				EntryFreq >= MBFI->getBlockFreq(Restore).getFrequency())
				break;
				DEBUG(dbgs() << "New points are too expensive\n");
				MachineBasicBlock *NewBB;
				if (!IsSaveCheap) {
				Save = FindIDom<>(*Save, Save->predecessors(), *MDT);
				if (!Save)
				break;
				NewBB = Save;
				} else {
				// Restore is expensive.
				Restore = FindIDom<>(*Restore, Restore->successors(), *MPDT);
				if (!Restore)
				break;
				NewBB = Restore;
				}
				updateSaveRestorePoints(*NewBB);
				} while (Save && Restore);

				if (!ArePointsInteresting()) {
				++NumCandidatesDropped;
				return false;
				}

				DEBUG(dbgs() << "Final shrink wrap candidates:\nSave: " << Save->getNumber()
				<< ' ' << Save->getName() << "\nRestore: "
				<< Restore->getNumber() << ' ' << Restore->getName() << '\n');

				MachineFrameInfo *MFI = MF.getFrameInfo();
				MFI->setSavePoint(Save);
				MFI->setRestorePoint(Restore);
				++NumCandidates;
				return false;
				}
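
As an illustration of how the Save and Restore candidates are folded together, here is a small sketch over a hand-written dominator tree. Everything in it is an assumption made for the example: `ToyDomTree` and `foldCandidates` are hypothetical names, the block numbers (0 = entry, 1 = `%true`, 2 = `%false`) are chosen to match the motivating example, and the trees are written out by hand rather than computed. The patch itself relies on `MachineDominatorTree` and `MachinePostDominatorTree`.

```cpp
#include <cassert>
#include <vector>

// Toy dominator tree: Parent[B] is the immediate dominator of block B,
// and the root is its own parent. Block numbering is hypothetical.
struct ToyDomTree {
  std::vector<int> Parent;

  // Number of edges between B and the root of the tree.
  int depth(int B) const {
    int D = 0;
    while (Parent[B] != B) {
      B = Parent[B];
      ++D;
    }
    return D;
  }

  // Lift the deeper node to the depth of the other one, then walk both
  // up in lock-step until they meet.
  int nearestCommonDominator(int A, int B) const {
    int DA = depth(A), DB = depth(B);
    while (DA > DB) { A = Parent[A]; --DA; }
    while (DB > DA) { B = Parent[B]; --DB; }
    while (A != B) { A = Parent[A]; B = Parent[B]; }
    return A;
  }
};

// Fold every block that touches the frame into one candidate block, in the
// spirit of how updateSaveRestorePoints folds each new MBB into Save
// (with the dominator tree) or Restore (with the post-dominator tree).
int foldCandidates(const ToyDomTree &Tree, const std::vector<int> &Blocks) {
  int Candidate = -1; // -1 plays the role of a null Save/Restore.
  for (int B : Blocks)
    Candidate = Candidate < 0 ? B : Tree.nearestCommonDominator(Candidate, B);
  return Candidate;
}
```

For the motivating example, the dominator tree is `{0, 0, 0}` (entry dominates both other blocks) and the post-dominator tree is `{2, 2, 2}` (`%false` post-dominates both other blocks). If only block 1 touches the frame, both candidates fold to block 1, which is the shrink-wrapped placement. If block 2 also touched the frame, Save would fold to block 0, the entry block, which `ArePointsInteresting` rejects. The sketch leaves out the loop check and the frequency check.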

lib/Target/AArch64/AArch64FrameLowering.h

Show All 28 Lines	void emitCalleeSavedFrameMoves(MachineBasicBlock &MBB,
unsigned FramePtr) const;		unsigned FramePtr) const;

void eliminateCallFramePseudoInstr(MachineFunction &MF,		void eliminateCallFramePseudoInstr(MachineFunction &MF,
MachineBasicBlock &MBB,		MachineBasicBlock &MBB,
MachineBasicBlock::iterator I) const override;		MachineBasicBlock::iterator I) const override;

/// emitProlog/emitEpilog - These methods insert prolog and epilog code into		/// emitProlog/emitEpilog - These methods insert prolog and epilog code into
/// the function.		/// the function.
void emitPrologue(MachineFunction &MF) const override;		void emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const override;
void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;		void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;

int getFrameIndexOffset(const MachineFunction &MF, int FI) const override;		int getFrameIndexOffset(const MachineFunction &MF, int FI) const override;
int getFrameIndexReference(const MachineFunction &MF, int FI,		int getFrameIndexReference(const MachineFunction &MF, int FI,
unsigned &FrameReg) const override;		unsigned &FrameReg) const override;
int resolveFrameIndexReference(const MachineFunction &MF, int FI,		int resolveFrameIndexReference(const MachineFunction &MF, int FI,
unsigned &FrameReg,		unsigned &FrameReg,
bool PreferFP = false) const;		bool PreferFP = false) const;

lib/Target/AArch64/AArch64FrameLowering.cpp


static bool isCSSave(MachineInstr *MBBI) {		static bool isCSSave(MachineInstr *MBBI) {
return MBBI->getOpcode() == AArch64::STPXi ||		return MBBI->getOpcode() == AArch64::STPXi ||
MBBI->getOpcode() == AArch64::STPDi ||		MBBI->getOpcode() == AArch64::STPDi ||
MBBI->getOpcode() == AArch64::STPXpre ||		MBBI->getOpcode() == AArch64::STPXpre ||
MBBI->getOpcode() == AArch64::STPDpre;		MBBI->getOpcode() == AArch64::STPDpre;
}		}

void AArch64FrameLowering::emitPrologue(MachineFunction &MF) const {		void AArch64FrameLowering::emitPrologue(MachineFunction &MF,
MachineBasicBlock &MBB = MF.front(); // Prologue goes in entry BB.		MachineBasicBlock &MBB) const {
MachineBasicBlock::iterator MBBI = MBB.begin();		MachineBasicBlock::iterator MBBI = MBB.begin();
const MachineFrameInfo *MFI = MF.getFrameInfo();		const MachineFrameInfo *MFI = MF.getFrameInfo();
const Function *Fn = MF.getFunction();		const Function *Fn = MF.getFunction();
const AArch64RegisterInfo *RegInfo = static_cast<const AArch64RegisterInfo *>(		const AArch64RegisterInfo *RegInfo = static_cast<const AArch64RegisterInfo *>(
MF.getSubtarget().getRegisterInfo());		MF.getSubtarget().getRegisterInfo());
const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();		const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
MachineModuleInfo &MMI = MF.getMMI();		MachineModuleInfo &MMI = MF.getMMI();
AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();		AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
▲ Show 20 Lines • Show All 246 Lines • ▼ Show 20 Lines	static bool isCSRestore(MachineInstr MI, const MCPhysReg CSRegs) {
}		}

return false;		return false;
}		}

void AArch64FrameLowering::emitEpilogue(MachineFunction &MF,		void AArch64FrameLowering::emitEpilogue(MachineFunction &MF,
MachineBasicBlock &MBB) const {		MachineBasicBlock &MBB) const {
MachineBasicBlock::iterator MBBI = MBB.getLastNonDebugInstr();		MachineBasicBlock::iterator MBBI = MBB.getLastNonDebugInstr();
assert(MBBI->isReturn() && "Can only insert epilog into returning blocks");
MachineFrameInfo *MFI = MF.getFrameInfo();		MachineFrameInfo *MFI = MF.getFrameInfo();
const AArch64InstrInfo *TII =		const AArch64InstrInfo *TII =
static_cast<const AArch64InstrInfo *>(MF.getSubtarget().getInstrInfo());		static_cast<const AArch64InstrInfo *>(MF.getSubtarget().getInstrInfo());
const AArch64RegisterInfo *RegInfo = static_cast<const AArch64RegisterInfo *>(		const AArch64RegisterInfo *RegInfo = static_cast<const AArch64RegisterInfo *>(
MF.getSubtarget().getRegisterInfo());		MF.getSubtarget().getRegisterInfo());
DebugLoc DL = MBBI->getDebugLoc();		DebugLoc DL;
		bool IsTailCallReturn = false;
		if (MBB.end() != MBBI) {
		DL = MBBI->getDebugLoc();
unsigned RetOpcode = MBBI->getOpcode();		unsigned RetOpcode = MBBI->getOpcode();
		IsTailCallReturn = RetOpcode == AArch64::TCRETURNdi ||
		RetOpcode == AArch64::TCRETURNri;
		}
int NumBytes = MFI->getStackSize();		int NumBytes = MFI->getStackSize();
		kristof.beyls: I guess this code sequence will lead to DL sometimes being uninitialized. I'm not sure when "MBB.end() != MBBI", but in that situation, can't you get a reasonable debug location from somewhere else? If the RetOpcode variable is only used to detect tail call returns, maybe it's better to replace it with a bool variable called "isTailCallReturn" or something similar, and use that later on, instead of explicitly checking what the value is?
		qcolombet (Author): I guess I can use MBB.findDebugLoc() with some sensible iterator. I will check. Same for RetOpcode, thanks for the feedback.
		qcolombet (Author): I haven’t changed the handling of the debug information. It is indeed consistent with what is done in AArch64FrameLowering::spillCalleeSavedRegisters and AArch64FrameLowering::restoreCalleeSavedRegisters. Therefore, I think that if we want to fix that, we should do it consistently at these three places. What do you think?
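
The guarded pattern under discussion can be modelled in isolation. The sketch below is a toy, not the LLVM code: `ToyDebugLoc`, `ToyTerminator`, `EpilogueInfo` and `describeEpilogue` are hypothetical names, and an empty `std::vector` stands in for the case where `getLastNonDebugInstr()` returns `MBB.end()`. It assumes, as I understand LLVM's `DebugLoc`, that a default-constructed location is an empty "unknown" value rather than indeterminate memory.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins; not the LLVM classes.
struct ToyDebugLoc {
  std::string Where; // Empty string models an unknown location.
};

struct ToyTerminator {
  bool IsTailCall;
  ToyDebugLoc Loc;
};

struct EpilogueInfo {
  ToyDebugLoc DL;
  bool IsTailCallReturn;
};

// Mirrors the guarded pattern in the patched emitEpilogue: start from an
// unknown location and "not a tail call", and only read the last
// instruction when the block actually has one.
EpilogueInfo describeEpilogue(const std::vector<ToyTerminator> &Block) {
  EpilogueInfo Info{ToyDebugLoc{}, false};
  if (!Block.empty()) {
    Info.DL = Block.back().Loc;
    Info.IsTailCallReturn = Block.back().IsTailCall;
  }
  return Info;
}
```

Under that assumption, the fallback for a block without a usable last instruction is a well-defined but uninformative location, so the open question in the thread is about debug-info quality rather than about reading uninitialized state.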
const AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();		const AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();

// All calls are tail calls in GHC calling conv, and functions have no		// All calls are tail calls in GHC calling conv, and functions have no
// prologue/epilogue.		// prologue/epilogue.
if (MF.getFunction()->getCallingConv() == CallingConv::GHC)		if (MF.getFunction()->getCallingConv() == CallingConv::GHC)
return;		return;

// Initial and residual are named for consistency with the prologue. Note that		// Initial and residual are named for consistency with the prologue. Note that
// in the epilogue, the residual adjustment is executed first.		// in the epilogue, the residual adjustment is executed first.
uint64_t ArgumentPopSize = 0;		uint64_t ArgumentPopSize = 0;
if (RetOpcode == AArch64::TCRETURNdi || RetOpcode == AArch64::TCRETURNri) {		if (IsTailCallReturn) {
MachineOperand &StackAdjust = MBBI->getOperand(1);		MachineOperand &StackAdjust = MBBI->getOperand(1);

// For a tail-call in a callee-pops-arguments environment, some or all of		// For a tail-call in a callee-pops-arguments environment, some or all of
// the stack may actually be in use for the call's arguments, this is		// the stack may actually be in use for the call's arguments, this is
// calculated during LowerCall and consumed here...		// calculated during LowerCall and consumed here...
ArgumentPopSize = StackAdjust.getImm();		ArgumentPopSize = StackAdjust.getImm();
} else {		} else {
// ... otherwise the amount to pop is all of the argument space,		// ... otherwise the amount to pop is all of the argument space,
Show All 28 Lines	void AArch64FrameLowering::emitEpilogue(MachineFunction &MF,
// = StackSize + ArgumentPopSize		// = StackSize + ArgumentPopSize
//		//
// AArch64TargetLowering::LowerCall figures out ArgumentPopSize and keeps		// AArch64TargetLowering::LowerCall figures out ArgumentPopSize and keeps
// it as the 2nd argument of AArch64ISD::TC_RETURN.		// it as the 2nd argument of AArch64ISD::TC_RETURN.
NumBytes += ArgumentPopSize;		NumBytes += ArgumentPopSize;

unsigned NumRestores = 0;		unsigned NumRestores = 0;
// Move past the restores of the callee-saved registers.		// Move past the restores of the callee-saved registers.
MachineBasicBlock::iterator LastPopI = MBBI;		MachineBasicBlock::iterator LastPopI = MBB.getFirstTerminator();
const MCPhysReg *CSRegs = RegInfo->getCalleeSavedRegs(&MF);		const MCPhysReg *CSRegs = RegInfo->getCalleeSavedRegs(&MF);
if (LastPopI != MBB.begin()) {		if (LastPopI != MBB.begin()) {
do {		do {
++NumRestores;		++NumRestores;
--LastPopI;		--LastPopI;
} while (LastPopI != MBB.begin() && isCSRestore(LastPopI, CSRegs));		} while (LastPopI != MBB.begin() && isCSRestore(LastPopI, CSRegs));
if (!isCSRestore(LastPopI, CSRegs)) {		if (!isCSRestore(LastPopI, CSRegs)) {
++LastPopI;		++LastPopI;

lib/Target/ARM/ARMFrameLowering.h

protected:		protected:
const ARMSubtarget &STI;		const ARMSubtarget &STI;

public:		public:
explicit ARMFrameLowering(const ARMSubtarget &sti);		explicit ARMFrameLowering(const ARMSubtarget &sti);

/// emitProlog/emitEpilog - These methods insert prolog and epilog code into		/// emitProlog/emitEpilog - These methods insert prolog and epilog code into
/// the function.		/// the function.
void emitPrologue(MachineFunction &MF) const override;		void emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const override;
void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;		void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;

void fixTCReturn(MachineFunction &MF, MachineBasicBlock &MBB) const;		void fixTCReturn(MachineFunction &MF, MachineBasicBlock &MBB) const;

bool spillCalleeSavedRegisters(MachineBasicBlock &MBB,		bool spillCalleeSavedRegisters(MachineBasicBlock &MBB,
MachineBasicBlock::iterator MI,		MachineBasicBlock::iterator MI,
const std::vector<CalleeSavedInfo> &CSI,		const std::vector<CalleeSavedInfo> &CSI,
const TargetRegisterInfo *TRI) const override;		const TargetRegisterInfo *TRI) const override;
Show All 10 Lines	int getFrameIndexReference(const MachineFunction &MF, int FI,
unsigned &FrameReg) const override;		unsigned &FrameReg) const override;
int ResolveFrameIndexReference(const MachineFunction &MF, int FI,		int ResolveFrameIndexReference(const MachineFunction &MF, int FI,
unsigned &FrameReg, int SPAdj) const;		unsigned &FrameReg, int SPAdj) const;
int getFrameIndexOffset(const MachineFunction &MF, int FI) const override;		int getFrameIndexOffset(const MachineFunction &MF, int FI) const override;

void processFunctionBeforeCalleeSavedScan(MachineFunction &MF,		void processFunctionBeforeCalleeSavedScan(MachineFunction &MF,
RegScavenger *RS) const override;		RegScavenger *RS) const override;

void adjustForSegmentedStacks(MachineFunction &MF) const override;		void adjustForSegmentedStacks(MachineFunction &MF,
		MachineBasicBlock &MBB) const override;

private:		private:
void emitPushInst(MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,		void emitPushInst(MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,
const std::vector<CalleeSavedInfo> &CSI, unsigned StmOpc,		const std::vector<CalleeSavedInfo> &CSI, unsigned StmOpc,
unsigned StrOpc, bool NoGap,		unsigned StrOpc, bool NoGap,
bool(*Func)(unsigned, bool), unsigned NumAlignedDPRCS2Regs,		bool(*Func)(unsigned, bool), unsigned NumAlignedDPRCS2Regs,
unsigned MIFlags = 0) const;		unsigned MIFlags = 0) const;
void emitPopInst(MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,		void emitPopInst(MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,

lib/Target/ARM/ARMFrameLowering.cpp

Show First 20 Lines • Show All 272 Lines • ▼ Show 20 Lines	if (!AFI->isThumbFunction()) {
// should always be available.		// should always be available.
assert(CanUseBFC);		assert(CanUseBFC);
AddDefaultPred(BuildMI(MBB, MBBI, DL, TII.get(ARM::t2BFC), Reg)		AddDefaultPred(BuildMI(MBB, MBBI, DL, TII.get(ARM::t2BFC), Reg)
.addReg(Reg, RegState::Kill)		.addReg(Reg, RegState::Kill)
.addImm(~AlignMask));		.addImm(~AlignMask));
}		}
}		}

void ARMFrameLowering::emitPrologue(MachineFunction &MF) const {		void ARMFrameLowering::emitPrologue(MachineFunction &MF,
MachineBasicBlock &MBB = MF.front();		MachineBasicBlock &MBB) const {
		assert(&MBB == &MF.front() && "Shrink-wrapping not yet implemented");
MachineBasicBlock::iterator MBBI = MBB.begin();		MachineBasicBlock::iterator MBBI = MBB.begin();
MachineFrameInfo *MFI = MF.getFrameInfo();		MachineFrameInfo *MFI = MF.getFrameInfo();
ARMFunctionInfo *AFI = MF.getInfo<ARMFunctionInfo>();		ARMFunctionInfo *AFI = MF.getInfo<ARMFunctionInfo>();
MachineModuleInfo &MMI = MF.getMMI();		MachineModuleInfo &MMI = MF.getMMI();
MCContext &Context = MMI.getContext();		MCContext &Context = MMI.getContext();
const TargetMachine &TM = MF.getTarget();		const TargetMachine &TM = MF.getTarget();
const MCRegisterInfo *MRI = Context.getRegisterInfo();		const MCRegisterInfo *MRI = Context.getRegisterInfo();
const ARMBaseRegisterInfo *RegInfo = STI.getRegisterInfo();		const ARMBaseRegisterInfo *RegInfo = STI.getRegisterInfo();
▲ Show 20 Lines • Show All 1,392 Lines • ▼ Show 20 Lines	if (BigStack \|\| !CanEliminateFrame \|\| RegInfo->cannotEliminateFrame(MF)) {
// If stack and double are 8-byte aligned and we are spilling an odd number		// If stack and double are 8-byte aligned and we are spilling an odd number
// of GPRs, spill one extra callee save GPR so we won't have to pad between		// of GPRs, spill one extra callee save GPR so we won't have to pad between
// the integer and double callee save areas.		// the integer and double callee save areas.
unsigned TargetAlign = getStackAlignment();		unsigned TargetAlign = getStackAlignment();
if (TargetAlign >= 8 && (NumGPRSpills & 1)) {		if (TargetAlign >= 8 && (NumGPRSpills & 1)) {
if (CS1Spilled && !UnspilledCS1GPRs.empty()) {		if (CS1Spilled && !UnspilledCS1GPRs.empty()) {
for (unsigned i = 0, e = UnspilledCS1GPRs.size(); i != e; ++i) {		for (unsigned i = 0, e = UnspilledCS1GPRs.size(); i != e; ++i) {
unsigned Reg = UnspilledCS1GPRs[i];		unsigned Reg = UnspilledCS1GPRs[i];
// Don't spill high register if the function is thumb		// Don't spill high register if the function is thumb
if (!AFI->isThumbFunction() ||		if (!AFI->isThumbFunction() ||
		kristof.beyls: It's unclear to me why this change is needed? If only implementing the AArch64-backend-specific functionality, how come you need to make a change in the AArch32-backend?
isARMLowRegister(Reg) || Reg == ARM::LR) {		isARMLowRegister(Reg) || Reg == ARM::LR) {
MRI.setPhysRegUsed(Reg);		MRI.setPhysRegUsed(Reg);
if (!MRI.isReserved(Reg))		if (!MRI.isReserved(Reg))
ExtraCSSpill = true;		ExtraCSSpill = true;
break;		break;
}		}
}		}
} else if (!UnspilledCS2GPRs.empty() && !AFI->isThumb1OnlyFunction()) {		} else if (!UnspilledCS2GPRs.empty() && !AFI->isThumb1OnlyFunction()) {
▲ Show 20 Lines • Show All 155 Lines • ▼ Show 20 Lines
// Implementations of __morestack should use r4 to allocate a new stack, r5 to		// Implementations of __morestack should use r4 to allocate a new stack, r5 to
// place the arguments on to the new stack, and the 3-instruction knowledge to		// place the arguments on to the new stack, and the 3-instruction knowledge to
// jump directly to the body of the function when working on the new stack.		// jump directly to the body of the function when working on the new stack.
//		//
// An old (and possibly no longer compatible) implementation of __morestack for		// An old (and possibly no longer compatible) implementation of __morestack for
// ARM can be found at [1].		// ARM can be found at [1].
//		//
// [1] - https://github.com/mozilla/rust/blob/86efd9/src/rt/arch/arm/morestack.S		// [1] - https://github.com/mozilla/rust/blob/86efd9/src/rt/arch/arm/morestack.S
void ARMFrameLowering::adjustForSegmentedStacks(MachineFunction &MF) const {		void ARMFrameLowering::adjustForSegmentedStacks(
		MachineFunction &MF, MachineBasicBlock &PrologueMBB) const {
unsigned Opcode;		unsigned Opcode;
unsigned CFIIndex;		unsigned CFIIndex;
const ARMSubtarget *ST = &MF.getSubtarget<ARMSubtarget>();		const ARMSubtarget *ST = &MF.getSubtarget<ARMSubtarget>();
bool Thumb = ST->isThumb();		bool Thumb = ST->isThumb();

// Sadly, this currently doesn't support varargs, platforms other than		// Sadly, this currently doesn't support varargs, platforms other than
// android/linux. Note that thumb1/thumb2 are supported for android/linux.		// android/linux. Note that thumb1/thumb2 are supported for android/linux.
if (MF.getFunction()->isVarArg())		if (MF.getFunction()->isVarArg())
report_fatal_error("Segmented stacks do not support vararg functions.");		report_fatal_error("Segmented stacks do not support vararg functions.");
if (!ST->isTargetAndroid() && !ST->isTargetLinux())		if (!ST->isTargetAndroid() && !ST->isTargetLinux())
report_fatal_error("Segmented stacks not supported on this platform.");		report_fatal_error("Segmented stacks not supported on this platform.");

MachineBasicBlock &prologueMBB = MF.front();		assert(&PrologueMBB == &MF.front() && "Shrink-wrapping not yet implemented");
MachineFrameInfo *MFI = MF.getFrameInfo();		MachineFrameInfo *MFI = MF.getFrameInfo();
MachineModuleInfo &MMI = MF.getMMI();		MachineModuleInfo &MMI = MF.getMMI();
MCContext &Context = MMI.getContext();		MCContext &Context = MMI.getContext();
const MCRegisterInfo *MRI = Context.getRegisterInfo();		const MCRegisterInfo *MRI = Context.getRegisterInfo();
const ARMBaseInstrInfo &TII =		const ARMBaseInstrInfo &TII =
*static_cast<const ARMBaseInstrInfo *>(MF.getSubtarget().getInstrInfo());		*static_cast<const ARMBaseInstrInfo *>(MF.getSubtarget().getInstrInfo());
ARMFunctionInfo *ARMFI = MF.getInfo<ARMFunctionInfo>();		ARMFunctionInfo *ARMFI = MF.getInfo<ARMFunctionInfo>();
DebugLoc DL;		DebugLoc DL;
Show All 11 Lines	void ARMFrameLowering::adjustForSegmentedStacks(
uint64_t AlignedStackSize;		uint64_t AlignedStackSize;

MachineBasicBlock *PrevStackMBB = MF.CreateMachineBasicBlock();		MachineBasicBlock *PrevStackMBB = MF.CreateMachineBasicBlock();
MachineBasicBlock *PostStackMBB = MF.CreateMachineBasicBlock();		MachineBasicBlock *PostStackMBB = MF.CreateMachineBasicBlock();
MachineBasicBlock *AllocMBB = MF.CreateMachineBasicBlock();		MachineBasicBlock *AllocMBB = MF.CreateMachineBasicBlock();
MachineBasicBlock *GetMBB = MF.CreateMachineBasicBlock();		MachineBasicBlock *GetMBB = MF.CreateMachineBasicBlock();
MachineBasicBlock *McrMBB = MF.CreateMachineBasicBlock();		MachineBasicBlock *McrMBB = MF.CreateMachineBasicBlock();

for (MachineBasicBlock::livein_iterator i = prologueMBB.livein_begin(),		for (MachineBasicBlock::livein_iterator i = PrologueMBB.livein_begin(),
e = prologueMBB.livein_end();		e = PrologueMBB.livein_end();
i != e; ++i) {		i != e; ++i) {
AllocMBB->addLiveIn(*i);		AllocMBB->addLiveIn(*i);
GetMBB->addLiveIn(*i);		GetMBB->addLiveIn(*i);
McrMBB->addLiveIn(*i);		McrMBB->addLiveIn(*i);
PrevStackMBB->addLiveIn(*i);		PrevStackMBB->addLiveIn(*i);
PostStackMBB->addLiveIn(*i);		PostStackMBB->addLiveIn(*i);
}		}

▲ Show 20 Lines • Show All 236 Lines • ▼ Show 20 Lines	void ARMFrameLowering::adjustForSegmentedStacks(
BuildMI(PostStackMBB, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))		BuildMI(PostStackMBB, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
.addCFIIndex(CFIIndex);		.addCFIIndex(CFIIndex);
CFIIndex = MMI.addFrameInst(MCCFIInstruction::createSameValue(		CFIIndex = MMI.addFrameInst(MCCFIInstruction::createSameValue(
nullptr, MRI->getDwarfRegNum(ScratchReg1, true)));		nullptr, MRI->getDwarfRegNum(ScratchReg1, true)));
BuildMI(PostStackMBB, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))		BuildMI(PostStackMBB, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
.addCFIIndex(CFIIndex);		.addCFIIndex(CFIIndex);

// Organizing MBB lists		// Organizing MBB lists
PostStackMBB->addSuccessor(&prologueMBB);		PostStackMBB->addSuccessor(&PrologueMBB);

AllocMBB->addSuccessor(PostStackMBB);		AllocMBB->addSuccessor(PostStackMBB);

GetMBB->addSuccessor(PostStackMBB);		GetMBB->addSuccessor(PostStackMBB);
GetMBB->addSuccessor(AllocMBB);		GetMBB->addSuccessor(AllocMBB);

McrMBB->addSuccessor(GetMBB);		McrMBB->addSuccessor(GetMBB);

PrevStackMBB->addSuccessor(McrMBB);		PrevStackMBB->addSuccessor(McrMBB);

#ifdef XDEBUG		#ifdef XDEBUG
MF.verify();		MF.verify();
#endif		#endif
}		}

lib/Target/ARM/Thumb1FrameLowering.h

	Show All 21 Lines
	namespace llvm {			namespace llvm {

	class Thumb1FrameLowering : public ARMFrameLowering {			class Thumb1FrameLowering : public ARMFrameLowering {
	public:			public:
	explicit Thumb1FrameLowering(const ARMSubtarget &sti);			explicit Thumb1FrameLowering(const ARMSubtarget &sti);

	/// emitProlog/emitEpilog - These methods insert prolog and epilog code into			/// emitProlog/emitEpilog - These methods insert prolog and epilog code into
	/// the function.			/// the function.
	void emitPrologue(MachineFunction &MF) const override;			void emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const override;
	void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;			void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;

	bool spillCalleeSavedRegisters(MachineBasicBlock &MBB,			bool spillCalleeSavedRegisters(MachineBasicBlock &MBB,
	MachineBasicBlock::iterator MI,			MachineBasicBlock::iterator MI,
	const std::vector<CalleeSavedInfo> &CSI,			const std::vector<CalleeSavedInfo> &CSI,
	const TargetRegisterInfo *TRI) const override;			const TargetRegisterInfo *TRI) const override;
	bool restoreCalleeSavedRegisters(MachineBasicBlock &MBB,			bool restoreCalleeSavedRegisters(MachineBasicBlock &MBB,
	MachineBasicBlock::iterator MI,			MachineBasicBlock::iterator MI,
	Show All 14 Lines

lib/Target/ARM/Thumb1FrameLowering.cpp

Show First 20 Lines • Show All 76 Lines • ▼ Show 20 Lines	if (Amount != 0) {
assert(Opc == ARM::ADJCALLSTACKUP || Opc == ARM::tADJCALLSTACKUP);		assert(Opc == ARM::ADJCALLSTACKUP || Opc == ARM::tADJCALLSTACKUP);
emitSPUpdate(MBB, I, TII, dl, *RegInfo, Amount);		emitSPUpdate(MBB, I, TII, dl, *RegInfo, Amount);
}		}
}		}
}		}
MBB.erase(I);		MBB.erase(I);
}		}

void Thumb1FrameLowering::emitPrologue(MachineFunction &MF) const {		void Thumb1FrameLowering::emitPrologue(MachineFunction &MF,
MachineBasicBlock &MBB = MF.front();		MachineBasicBlock &MBB) const {
		assert(&MBB == &MF.front() && "Shrink-wrapping not yet implemented");
MachineBasicBlock::iterator MBBI = MBB.begin();		MachineBasicBlock::iterator MBBI = MBB.begin();
MachineFrameInfo *MFI = MF.getFrameInfo();		MachineFrameInfo *MFI = MF.getFrameInfo();
ARMFunctionInfo *AFI = MF.getInfo<ARMFunctionInfo>();		ARMFunctionInfo *AFI = MF.getInfo<ARMFunctionInfo>();
MachineModuleInfo &MMI = MF.getMMI();		MachineModuleInfo &MMI = MF.getMMI();
const MCRegisterInfo *MRI = MMI.getContext().getRegisterInfo();		const MCRegisterInfo *MRI = MMI.getContext().getRegisterInfo();
const ThumbRegisterInfo *RegInfo =		const ThumbRegisterInfo *RegInfo =
static_cast<const ThumbRegisterInfo *>(STI.getRegisterInfo());		static_cast<const ThumbRegisterInfo *>(STI.getRegisterInfo());
const Thumb1InstrInfo &TII =		const Thumb1InstrInfo &TII =
▲ Show 20 Lines • Show All 439 Lines • Show Last 20 Lines

lib/Target/BPF/BPFFrameLowering.h

	Show All 18 Lines
	namespace llvm {			namespace llvm {
	class BPFSubtarget;			class BPFSubtarget;

	class BPFFrameLowering : public TargetFrameLowering {			class BPFFrameLowering : public TargetFrameLowering {
	public:			public:
	explicit BPFFrameLowering(const BPFSubtarget &sti)			explicit BPFFrameLowering(const BPFSubtarget &sti)
	: TargetFrameLowering(TargetFrameLowering::StackGrowsDown, 8, 0) {}			: TargetFrameLowering(TargetFrameLowering::StackGrowsDown, 8, 0) {}

	void emitPrologue(MachineFunction &MF) const override;			void emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const override;
	void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;			void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;

	bool hasFP(const MachineFunction &MF) const override;			bool hasFP(const MachineFunction &MF) const override;
	void processFunctionBeforeCalleeSavedScan(MachineFunction &MF,			void processFunctionBeforeCalleeSavedScan(MachineFunction &MF,
	RegScavenger *RS) const override;			RegScavenger *RS) const override;

	void			void
	eliminateCallFramePseudoInstr(MachineFunction &MF, MachineBasicBlock &MBB,			eliminateCallFramePseudoInstr(MachineFunction &MF, MachineBasicBlock &MBB,
	MachineBasicBlock::iterator MI) const override {			MachineBasicBlock::iterator MI) const override {
	MBB.erase(MI);			MBB.erase(MI);
	}			}
	};			};
	}			}
	#endif			#endif

lib/Target/BPF/BPFFrameLowering.cpp

	Show All 17 Lines
	#include "llvm/CodeGen/MachineFunction.h"			#include "llvm/CodeGen/MachineFunction.h"
	#include "llvm/CodeGen/MachineInstrBuilder.h"			#include "llvm/CodeGen/MachineInstrBuilder.h"
	#include "llvm/CodeGen/MachineRegisterInfo.h"			#include "llvm/CodeGen/MachineRegisterInfo.h"

	using namespace llvm;			using namespace llvm;

	bool BPFFrameLowering::hasFP(const MachineFunction &MF) const { return true; }			bool BPFFrameLowering::hasFP(const MachineFunction &MF) const { return true; }

	void BPFFrameLowering::emitPrologue(MachineFunction &MF) const {}			void BPFFrameLowering::emitPrologue(MachineFunction &MF,
				MachineBasicBlock &MBB) const {}

	void BPFFrameLowering::emitEpilogue(MachineFunction &MF,			void BPFFrameLowering::emitEpilogue(MachineFunction &MF,
	MachineBasicBlock &MBB) const {}			MachineBasicBlock &MBB) const {}

	void BPFFrameLowering::processFunctionBeforeCalleeSavedScan(			void BPFFrameLowering::processFunctionBeforeCalleeSavedScan(
	MachineFunction &MF, RegScavenger *RS) const {			MachineFunction &MF, RegScavenger *RS) const {
	MachineRegisterInfo &MRI = MF.getRegInfo();			MachineRegisterInfo &MRI = MF.getRegInfo();

	MRI.setPhysRegUnused(BPF::R6);			MRI.setPhysRegUnused(BPF::R6);
	MRI.setPhysRegUnused(BPF::R7);			MRI.setPhysRegUnused(BPF::R7);
	MRI.setPhysRegUnused(BPF::R8);			MRI.setPhysRegUnused(BPF::R8);
	MRI.setPhysRegUnused(BPF::R9);			MRI.setPhysRegUnused(BPF::R9);
	}			}

lib/Target/Hexagon/HexagonFrameLowering.h

	Show All 20 Lines
	class HexagonFrameLowering : public TargetFrameLowering {			class HexagonFrameLowering : public TargetFrameLowering {
	public:			public:
	explicit HexagonFrameLowering()			explicit HexagonFrameLowering()
	: TargetFrameLowering(StackGrowsDown, 8, 0, 1, true) {}			: TargetFrameLowering(StackGrowsDown, 8, 0, 1, true) {}

	// All of the prolog/epilog functionality, including saving and restoring			// All of the prolog/epilog functionality, including saving and restoring
	// callee-saved registers is handled in emitPrologue. This is to have the			// callee-saved registers is handled in emitPrologue. This is to have the
	// logic for shrink-wrapping in one place.			// logic for shrink-wrapping in one place.
	void emitPrologue(MachineFunction &MF) const override;			void emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const
				override;
	void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const			void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const
	override {}			override {}
	bool spillCalleeSavedRegisters(MachineBasicBlock &MBB,			bool spillCalleeSavedRegisters(MachineBasicBlock &MBB,
	MachineBasicBlock::iterator MI, const std::vector<CalleeSavedInfo> &CSI,			MachineBasicBlock::iterator MI, const std::vector<CalleeSavedInfo> &CSI,
	const TargetRegisterInfo *TRI) const override {			const TargetRegisterInfo *TRI) const override {
	return true;			return true;
	}			}
	bool restoreCalleeSavedRegisters(MachineBasicBlock &MBB,			bool restoreCalleeSavedRegisters(MachineBasicBlock &MBB,
	▲ Show 20 Lines • Show All 66 Lines • Show Last 20 Lines

lib/Target/Hexagon/HexagonFrameLowering.cpp

Show First 20 Lines • Show All 338 Lines • ▼ Show 20 Lines	if (!MPT.dominates(PDomB, DomB)) {
return;		return;
}		}

// Finally, everything seems right.		// Finally, everything seems right.
PrologB = DomB;		PrologB = DomB;
EpilogB = PDomB;		EpilogB = PDomB;
}		}


/// Perform most of the PEI work here:		/// Perform most of the PEI work here:
/// - saving/restoring of the callee-saved registers,		/// - saving/restoring of the callee-saved registers,
/// - stack frame creation and destruction.		/// - stack frame creation and destruction.
/// Normally, this work is distributed among various functions, but doing it		/// Normally, this work is distributed among various functions, but doing it
/// in one place allows shrink-wrapping of the stack frame.		/// in one place allows shrink-wrapping of the stack frame.
void HexagonFrameLowering::emitPrologue(MachineFunction &MF) const {		void HexagonFrameLowering::emitPrologue(MachineFunction &MF,
		MachineBasicBlock &MBB) const {
auto &HST = static_cast<const HexagonSubtarget&>(MF.getSubtarget());		auto &HST = static_cast<const HexagonSubtarget&>(MF.getSubtarget());
auto &HRI = *HST.getRegisterInfo();		auto &HRI = *HST.getRegisterInfo();

		assert(&MF.front() == &MBB && "Shrink-wrapping not yet supported");
MachineFrameInfo *MFI = MF.getFrameInfo();		MachineFrameInfo *MFI = MF.getFrameInfo();
const std::vector<CalleeSavedInfo> &CSI = MFI->getCalleeSavedInfo();		const std::vector<CalleeSavedInfo> &CSI = MFI->getCalleeSavedInfo();

MachineBasicBlock *PrologB = &MF.front(), *EpilogB = nullptr;		MachineBasicBlock *PrologB = &MF.front(), *EpilogB = nullptr;
if (EnableShrinkWrapping)		if (EnableShrinkWrapping)
findShrunkPrologEpilog(MF, PrologB, EpilogB);		findShrunkPrologEpilog(MF, PrologB, EpilogB);

insertCSRSpillsInBlock(*PrologB, CSI, HRI);		insertCSRSpillsInBlock(*PrologB, CSI, HRI);
▲ Show 20 Lines • Show All 926 Lines • Show Last 20 Lines

lib/Target/MSP430/MSP430FrameLowering.h

	Show All 21 Lines
	protected:			protected:

	public:			public:
	explicit MSP430FrameLowering()			explicit MSP430FrameLowering()
	: TargetFrameLowering(TargetFrameLowering::StackGrowsDown, 2, -2, 2) {}			: TargetFrameLowering(TargetFrameLowering::StackGrowsDown, 2, -2, 2) {}

	/// emitProlog/emitEpilog - These methods insert prolog and epilog code into			/// emitProlog/emitEpilog - These methods insert prolog and epilog code into
	/// the function.			/// the function.
	void emitPrologue(MachineFunction &MF) const override;			void emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const override;
	void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;			void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;

	void eliminateCallFramePseudoInstr(MachineFunction &MF,			void eliminateCallFramePseudoInstr(MachineFunction &MF,
	MachineBasicBlock &MBB,			MachineBasicBlock &MBB,
	MachineBasicBlock::iterator I) const override;			MachineBasicBlock::iterator I) const override;

	bool spillCalleeSavedRegisters(MachineBasicBlock &MBB,			bool spillCalleeSavedRegisters(MachineBasicBlock &MBB,
	MachineBasicBlock::iterator MI,			MachineBasicBlock::iterator MI,
	Show All 16 Lines

lib/Target/MSP430/MSP430FrameLowering.cpp

Show All 33 Lines	return (MF.getTarget().Options.DisableFramePointerElim(MF) ||
MF.getFrameInfo()->hasVarSizedObjects() ||		MF.getFrameInfo()->hasVarSizedObjects() ||
MFI->isFrameAddressTaken());		MFI->isFrameAddressTaken());
}		}

bool MSP430FrameLowering::hasReservedCallFrame(const MachineFunction &MF) const {		bool MSP430FrameLowering::hasReservedCallFrame(const MachineFunction &MF) const {
return !MF.getFrameInfo()->hasVarSizedObjects();		return !MF.getFrameInfo()->hasVarSizedObjects();
}		}

void MSP430FrameLowering::emitPrologue(MachineFunction &MF) const {		void MSP430FrameLowering::emitPrologue(MachineFunction &MF,
MachineBasicBlock &MBB = MF.front(); // Prolog goes in entry BB		MachineBasicBlock &MBB) const {
		assert(&MF.front() == &MBB && "Shrink-wrapping not yet supported");
MachineFrameInfo *MFI = MF.getFrameInfo();		MachineFrameInfo *MFI = MF.getFrameInfo();
MSP430MachineFunctionInfo *MSP430FI = MF.getInfo<MSP430MachineFunctionInfo>();		MSP430MachineFunctionInfo *MSP430FI = MF.getInfo<MSP430MachineFunctionInfo>();
const MSP430InstrInfo &TII =		const MSP430InstrInfo &TII =
*static_cast<const MSP430InstrInfo *>(MF.getSubtarget().getInstrInfo());		*static_cast<const MSP430InstrInfo *>(MF.getSubtarget().getInstrInfo());

MachineBasicBlock::iterator MBBI = MBB.begin();		MachineBasicBlock::iterator MBBI = MBB.begin();
DebugLoc DL = MBBI != MBB.end() ? MBBI->getDebugLoc() : DebugLoc();		DebugLoc DL = MBBI != MBB.end() ? MBBI->getDebugLoc() : DebugLoc();

▲ Show 20 Lines • Show All 247 Lines • Show Last 20 Lines

lib/Target/Mips/Mips16FrameLowering.h

	Show All 17 Lines

	namespace llvm {			namespace llvm {
	class Mips16FrameLowering : public MipsFrameLowering {			class Mips16FrameLowering : public MipsFrameLowering {
	public:			public:
	explicit Mips16FrameLowering(const MipsSubtarget &STI);			explicit Mips16FrameLowering(const MipsSubtarget &STI);

	/// emitProlog/emitEpilog - These methods insert prolog and epilog code into			/// emitProlog/emitEpilog - These methods insert prolog and epilog code into
	/// the function.			/// the function.
	void emitPrologue(MachineFunction &MF) const override;			void emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const override;
	void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;			void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;

	bool spillCalleeSavedRegisters(MachineBasicBlock &MBB,			bool spillCalleeSavedRegisters(MachineBasicBlock &MBB,
	MachineBasicBlock::iterator MI,			MachineBasicBlock::iterator MI,
	const std::vector<CalleeSavedInfo> &CSI,			const std::vector<CalleeSavedInfo> &CSI,
	const TargetRegisterInfo *TRI) const override;			const TargetRegisterInfo *TRI) const override;

	bool restoreCalleeSavedRegisters(MachineBasicBlock &MBB,			bool restoreCalleeSavedRegisters(MachineBasicBlock &MBB,
	Show All 13 Lines

lib/Target/Mips/Mips16FrameLowering.cpp

	Show All 26 Lines
	#include "llvm/Support/CommandLine.h"			#include "llvm/Support/CommandLine.h"
	#include "llvm/Target/TargetOptions.h"			#include "llvm/Target/TargetOptions.h"

	using namespace llvm;			using namespace llvm;

	Mips16FrameLowering::Mips16FrameLowering(const MipsSubtarget &STI)			Mips16FrameLowering::Mips16FrameLowering(const MipsSubtarget &STI)
	: MipsFrameLowering(STI, STI.stackAlignment()) {}			: MipsFrameLowering(STI, STI.stackAlignment()) {}

	void Mips16FrameLowering::emitPrologue(MachineFunction &MF) const {			void Mips16FrameLowering::emitPrologue(MachineFunction &MF,
	MachineBasicBlock &MBB = MF.front();			MachineBasicBlock &MBB) const {
				assert(&MF.front() == &MBB && "Shrink-wrapping not yet supported");
	MachineFrameInfo *MFI = MF.getFrameInfo();			MachineFrameInfo *MFI = MF.getFrameInfo();
	const Mips16InstrInfo &TII =			const Mips16InstrInfo &TII =
	*static_cast<const Mips16InstrInfo *>(STI.getInstrInfo());			*static_cast<const Mips16InstrInfo *>(STI.getInstrInfo());
	MachineBasicBlock::iterator MBBI = MBB.begin();			MachineBasicBlock::iterator MBBI = MBB.begin();
	DebugLoc dl = MBBI != MBB.end() ? MBBI->getDebugLoc() : DebugLoc();			DebugLoc dl = MBBI != MBB.end() ? MBBI->getDebugLoc() : DebugLoc();
	uint64_t StackSize = MFI->getStackSize();			uint64_t StackSize = MFI->getStackSize();

	// No need to allocate space on the stack.			// No need to allocate space on the stack.
	▲ Show 20 Lines • Show All 127 Lines • Show Last 20 Lines

lib/Target/Mips/MipsSEFrameLowering.h

	Show All 18 Lines
	namespace llvm {			namespace llvm {

	class MipsSEFrameLowering : public MipsFrameLowering {			class MipsSEFrameLowering : public MipsFrameLowering {
	public:			public:
	explicit MipsSEFrameLowering(const MipsSubtarget &STI);			explicit MipsSEFrameLowering(const MipsSubtarget &STI);

	/// emitProlog/emitEpilog - These methods insert prolog and epilog code into			/// emitProlog/emitEpilog - These methods insert prolog and epilog code into
	/// the function.			/// the function.
	void emitPrologue(MachineFunction &MF) const override;			void emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const override;
	void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;			void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;

	bool spillCalleeSavedRegisters(MachineBasicBlock &MBB,			bool spillCalleeSavedRegisters(MachineBasicBlock &MBB,
	MachineBasicBlock::iterator MI,			MachineBasicBlock::iterator MI,
	const std::vector<CalleeSavedInfo> &CSI,			const std::vector<CalleeSavedInfo> &CSI,
	const TargetRegisterInfo *TRI) const override;			const TargetRegisterInfo *TRI) const override;

	bool hasReservedCallFrame(const MachineFunction &MF) const override;			bool hasReservedCallFrame(const MachineFunction &MF) const override;
	Show All 9 Lines

lib/Target/Mips/MipsSEFrameLowering.cpp

Show First 20 Lines • Show All 358 Lines • ▼ Show 20 Lines	bool ExpandPseudo::expandExtractElementF64(MachineBasicBlock &MBB,
}		}

return false;		return false;
}		}

MipsSEFrameLowering::MipsSEFrameLowering(const MipsSubtarget &STI)		MipsSEFrameLowering::MipsSEFrameLowering(const MipsSubtarget &STI)
: MipsFrameLowering(STI, STI.stackAlignment()) {}		: MipsFrameLowering(STI, STI.stackAlignment()) {}

void MipsSEFrameLowering::emitPrologue(MachineFunction &MF) const {		void MipsSEFrameLowering::emitPrologue(MachineFunction &MF,
MachineBasicBlock &MBB = MF.front();		MachineBasicBlock &MBB) const {
		assert(&MF.front() == &MBB && "Shrink-wrapping not yet supported");
MachineFrameInfo *MFI = MF.getFrameInfo();		MachineFrameInfo *MFI = MF.getFrameInfo();
MipsFunctionInfo *MipsFI = MF.getInfo<MipsFunctionInfo>();		MipsFunctionInfo *MipsFI = MF.getInfo<MipsFunctionInfo>();

const MipsSEInstrInfo &TII =		const MipsSEInstrInfo &TII =
*static_cast<const MipsSEInstrInfo *>(STI.getInstrInfo());		*static_cast<const MipsSEInstrInfo *>(STI.getInstrInfo());
const MipsRegisterInfo &RegInfo =		const MipsRegisterInfo &RegInfo =
*static_cast<const MipsRegisterInfo *>(STI.getRegisterInfo());		*static_cast<const MipsRegisterInfo *>(STI.getRegisterInfo());

▲ Show 20 Lines • Show All 270 Lines • Show Last 20 Lines

lib/Target/NVPTX/NVPTXFrameLowering.h

	Show All 17 Lines

	namespace llvm {			namespace llvm {
	class NVPTXSubtarget;			class NVPTXSubtarget;
	class NVPTXFrameLowering : public TargetFrameLowering {			class NVPTXFrameLowering : public TargetFrameLowering {
	public:			public:
	explicit NVPTXFrameLowering();			explicit NVPTXFrameLowering();

	bool hasFP(const MachineFunction &MF) const override;			bool hasFP(const MachineFunction &MF) const override;
	void emitPrologue(MachineFunction &MF) const override;			void emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const override;
	void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;			void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;

	void			void
	eliminateCallFramePseudoInstr(MachineFunction &MF, MachineBasicBlock &MBB,			eliminateCallFramePseudoInstr(MachineFunction &MF, MachineBasicBlock &MBB,
	MachineBasicBlock::iterator I) const override;			MachineBasicBlock::iterator I) const override;
	};			};

	} // End llvm namespace			} // End llvm namespace

	#endif			#endif

lib/Target/NVPTX/NVPTXFrameLowering.cpp

	Show All 25 Lines

	using namespace llvm;			using namespace llvm;

	NVPTXFrameLowering::NVPTXFrameLowering()			NVPTXFrameLowering::NVPTXFrameLowering()
	: TargetFrameLowering(TargetFrameLowering::StackGrowsUp, 8, 0) {}			: TargetFrameLowering(TargetFrameLowering::StackGrowsUp, 8, 0) {}

	bool NVPTXFrameLowering::hasFP(const MachineFunction &MF) const { return true; }			bool NVPTXFrameLowering::hasFP(const MachineFunction &MF) const { return true; }

	void NVPTXFrameLowering::emitPrologue(MachineFunction &MF) const {			void NVPTXFrameLowering::emitPrologue(MachineFunction &MF,
				MachineBasicBlock &MBB) const {
	if (MF.getFrameInfo()->hasStackObjects()) {			if (MF.getFrameInfo()->hasStackObjects()) {
	MachineBasicBlock &MBB = MF.front();			assert(&MF.front() == &MBB && "Shrink-wrapping not yet supported");
	// Insert "mov.u32 %SP, %Depot"			// Insert "mov.u32 %SP, %Depot"
	MachineBasicBlock::iterator MBBI = MBB.begin();			MachineBasicBlock::iterator MBBI = MBB.begin();
	// This instruction really occurs before first instruction			// This instruction really occurs before first instruction
	// in the BB, so giving it no debug location.			// in the BB, so giving it no debug location.
	DebugLoc dl = DebugLoc();			DebugLoc dl = DebugLoc();

	MachineRegisterInfo &MRI = MF.getRegInfo();			MachineRegisterInfo &MRI = MF.getRegInfo();

	Show All 36 Lines

lib/Target/NVPTX/NVPTXPrologEpilogPass.cpp

Show First 20 Lines • Show All 62 Lines • ▼ Show 20 Lines	for (MachineBasicBlock::iterator I = BB->begin(); I != BB->end(); ++I) {
continue;		continue;
TRI.eliminateFrameIndex(MI, 0, i, nullptr);		TRI.eliminateFrameIndex(MI, 0, i, nullptr);
Modified = true;		Modified = true;
}		}
}		}
}		}

// Add function prolog/epilog		// Add function prolog/epilog
TFI.emitPrologue(MF);		TFI.emitPrologue(MF, MF.front());

for (MachineFunction::iterator I = MF.begin(), E = MF.end(); I != E; ++I) {		for (MachineFunction::iterator I = MF.begin(), E = MF.end(); I != E; ++I) {
// If last instruction is a return instruction, add an epilogue		// If last instruction is a return instruction, add an epilogue
if (!I->empty() && I->back().isReturn())		if (!I->empty() && I->back().isReturn())
TFI.emitEpilogue(MF, *I);		TFI.emitEpilogue(MF, *I);
}		}

return Modified;		return Modified;
▲ Show 20 Lines • Show All 149 Lines • Show Last 20 Lines

lib/Target/PowerPC/PPCFrameLowering.h

Show All 32 Lines	public:
PPCFrameLowering(const PPCSubtarget &STI);		PPCFrameLowering(const PPCSubtarget &STI);

unsigned determineFrameLayout(MachineFunction &MF,		unsigned determineFrameLayout(MachineFunction &MF,
bool UpdateMF = true,		bool UpdateMF = true,
bool UseEstimate = false) const;		bool UseEstimate = false) const;

/// emitProlog/emitEpilog - These methods insert prolog and epilog code into		/// emitProlog/emitEpilog - These methods insert prolog and epilog code into
/// the function.		/// the function.
void emitPrologue(MachineFunction &MF) const override;		void emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const override;
void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;		void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;

bool hasFP(const MachineFunction &MF) const override;		bool hasFP(const MachineFunction &MF) const override;
bool needsFP(const MachineFunction &MF) const;		bool needsFP(const MachineFunction &MF) const;
void replaceFPWithRealFP(MachineFunction &MF) const;		void replaceFPWithRealFP(MachineFunction &MF) const;

void processFunctionBeforeCalleeSavedScan(MachineFunction &MF,		void processFunctionBeforeCalleeSavedScan(MachineFunction &MF,
RegScavenger *RS = nullptr) const override;		RegScavenger *RS = nullptr) const override;
▲ Show 20 Lines • Show All 49 Lines • Show Last 20 Lines

lib/Target/PowerPC/PPCFrameLowering.cpp

Show First 20 Lines • Show All 549 Lines • ▼ Show 20 Lines	for (MachineBasicBlock::iterator MBBI = BI->end(); MBBI != BI->begin(); ) {
MO.setReg(BP8Reg);		MO.setReg(BP8Reg);
break;		break;

}		}
}		}
}		}
}		}

void PPCFrameLowering::emitPrologue(MachineFunction &MF) const {		void PPCFrameLowering::emitPrologue(MachineFunction &MF,
MachineBasicBlock &MBB = MF.front(); // Prolog goes in entry BB		MachineBasicBlock &MBB) const {
		assert(&MF.front() == &MBB && "Shrink-wrapping not yet supported");
MachineBasicBlock::iterator MBBI = MBB.begin();		MachineBasicBlock::iterator MBBI = MBB.begin();
MachineFrameInfo *MFI = MF.getFrameInfo();		MachineFrameInfo *MFI = MF.getFrameInfo();
const PPCInstrInfo &TII =		const PPCInstrInfo &TII =
*static_cast<const PPCInstrInfo *>(Subtarget.getInstrInfo());		*static_cast<const PPCInstrInfo *>(Subtarget.getInstrInfo());
const PPCRegisterInfo *RegInfo =		const PPCRegisterInfo *RegInfo =
static_cast<const PPCRegisterInfo *>(Subtarget.getRegisterInfo());		static_cast<const PPCRegisterInfo *>(Subtarget.getRegisterInfo());

MachineModuleInfo &MMI = MF.getMMI();		MachineModuleInfo &MMI = MF.getMMI();
▲ Show 20 Lines • Show All 1,143 Lines • Show Last 20 Lines

lib/Target/R600/AMDGPUFrameLowering.h

Show All 31 Lines	public:
virtual ~AMDGPUFrameLowering();		virtual ~AMDGPUFrameLowering();

/// \returns The number of 32-bit sub-registers that are used when storing		/// \returns The number of 32-bit sub-registers that are used when storing
/// values to the stack.		/// values to the stack.
unsigned getStackWidth(const MachineFunction &MF) const;		unsigned getStackWidth(const MachineFunction &MF) const;
int getFrameIndexOffset(const MachineFunction &MF, int FI) const override;		int getFrameIndexOffset(const MachineFunction &MF, int FI) const override;
const SpillSlot *		const SpillSlot *
getCalleeSavedSpillSlots(unsigned &NumEntries) const override;		getCalleeSavedSpillSlots(unsigned &NumEntries) const override;
void emitPrologue(MachineFunction &MF) const override;		void emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const override;
void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;		void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;
bool hasFP(const MachineFunction &MF) const override;		bool hasFP(const MachineFunction &MF) const override;
};		};
} // namespace llvm		} // namespace llvm
#endif		#endif

lib/Target/R600/AMDGPUFrameLowering.cpp

Show First 20 Lines • Show All 93 Lines • ▼ Show 20 Lines	int AMDGPUFrameLowering::getFrameIndexOffset(const MachineFunction &MF,
return OffsetBytes / (getStackWidth(MF) * 4);		return OffsetBytes / (getStackWidth(MF) * 4);
}		}

const TargetFrameLowering::SpillSlot *		const TargetFrameLowering::SpillSlot *
AMDGPUFrameLowering::getCalleeSavedSpillSlots(unsigned &NumEntries) const {		AMDGPUFrameLowering::getCalleeSavedSpillSlots(unsigned &NumEntries) const {
NumEntries = 0;		NumEntries = 0;
return nullptr;		return nullptr;
}		}
void		void AMDGPUFrameLowering::emitPrologue(MachineFunction &MF,
AMDGPUFrameLowering::emitPrologue(MachineFunction &MF) const {		MachineBasicBlock &MBB) const {}
}
void		void
AMDGPUFrameLowering::emitEpilogue(MachineFunction &MF,		AMDGPUFrameLowering::emitEpilogue(MachineFunction &MF,
MachineBasicBlock &MBB) const {		MachineBasicBlock &MBB) const {
}		}

bool		bool
AMDGPUFrameLowering::hasFP(const MachineFunction &MF) const {		AMDGPUFrameLowering::hasFP(const MachineFunction &MF) const {
return false;		return false;
}		}

lib/Target/Sparc/SparcFrameLowering.h

	Show All 20 Lines

	class SparcSubtarget;			class SparcSubtarget;
	class SparcFrameLowering : public TargetFrameLowering {			class SparcFrameLowering : public TargetFrameLowering {
	public:			public:
	explicit SparcFrameLowering(const SparcSubtarget &ST);			explicit SparcFrameLowering(const SparcSubtarget &ST);

	/// emitProlog/emitEpilog - These methods insert prolog and epilog code into			/// emitProlog/emitEpilog - These methods insert prolog and epilog code into
	/// the function.			/// the function.
	void emitPrologue(MachineFunction &MF) const override;			void emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const override;
	void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;			void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;

	void			void
	eliminateCallFramePseudoInstr(MachineFunction &MF,			eliminateCallFramePseudoInstr(MachineFunction &MF,
	MachineBasicBlock &MBB,			MachineBasicBlock &MBB,
	MachineBasicBlock::iterator I) const override;			MachineBasicBlock::iterator I) const override;

	bool hasReservedCallFrame(const MachineFunction &MF) const override;			bool hasReservedCallFrame(const MachineFunction &MF) const override;
	Show All 23 Lines

lib/Target/Sparc/SparcFrameLowering.cpp

Show First 20 Lines • Show All 76 Lines • ▼ Show 20 Lines	void SparcFrameLowering::emitSPAdjustment(MachineFunction &MF,
BuildMI(MBB, MBBI, dl, TII.get(SP::SETHIi), SP::G1)		BuildMI(MBB, MBBI, dl, TII.get(SP::SETHIi), SP::G1)
.addImm(HIX22(NumBytes));		.addImm(HIX22(NumBytes));
BuildMI(MBB, MBBI, dl, TII.get(SP::XORri), SP::G1)		BuildMI(MBB, MBBI, dl, TII.get(SP::XORri), SP::G1)
.addReg(SP::G1).addImm(LOX10(NumBytes));		.addReg(SP::G1).addImm(LOX10(NumBytes));
BuildMI(MBB, MBBI, dl, TII.get(ADDrr), SP::O6)		BuildMI(MBB, MBBI, dl, TII.get(ADDrr), SP::O6)
.addReg(SP::O6).addReg(SP::G1);		.addReg(SP::O6).addReg(SP::G1);
}		}

void SparcFrameLowering::emitPrologue(MachineFunction &MF) const {		void SparcFrameLowering::emitPrologue(MachineFunction &MF,
		MachineBasicBlock &MBB) const {
SparcMachineFunctionInfo *FuncInfo = MF.getInfo<SparcMachineFunctionInfo>();		SparcMachineFunctionInfo *FuncInfo = MF.getInfo<SparcMachineFunctionInfo>();

MachineBasicBlock &MBB = MF.front();		assert(&MF.front() == &MBB && "Shrink-wrapping not yet supported");
MachineFrameInfo *MFI = MF.getFrameInfo();		MachineFrameInfo *MFI = MF.getFrameInfo();
const SparcInstrInfo &TII =		const SparcInstrInfo &TII =
*static_cast<const SparcInstrInfo *>(MF.getSubtarget().getInstrInfo());		*static_cast<const SparcInstrInfo *>(MF.getSubtarget().getInstrInfo());
MachineBasicBlock::iterator MBBI = MBB.begin();		MachineBasicBlock::iterator MBBI = MBB.begin();
DebugLoc dl = MBBI != MBB.end() ? MBBI->getDebugLoc() : DebugLoc();		DebugLoc dl = MBBI != MBB.end() ? MBBI->getDebugLoc() : DebugLoc();

// Get the number of bytes to allocate from the FrameInfo		// Get the number of bytes to allocate from the FrameInfo
int NumBytes = (int) MFI->getStackSize();		int NumBytes = (int) MFI->getStackSize();

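The change repeated across the target files above can be summarized in a small standalone sketch. `Block`, `Func`, and `supportsPrologueIn` are hypothetical stand-ins and not LLVM types or APIs. The sketch only illustrates that the caller now chooses the prologue block, while a target without shrink-wrapping support asserts that the chosen block is still the entry block.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine basic block.
struct Block {
  std::string Name;
  std::vector<std::string> Insts;
};

// Hypothetical stand-in for a machine function; the first block is the entry.
struct Func {
  std::list<Block> Blocks;
  Block &front() { return Blocks.front(); }
};

// Mirrors the guard added in the diff: only the entry block is accepted.
bool supportsPrologueIn(Func &F, Block &B) { return &F.front() == &B; }

// Before the patch the callee picked F.front() itself. After the patch the
// caller passes the block, and the target checks that it can handle it.
void emitPrologue(Func &F, Block &B) {
  assert(supportsPrologueIn(F, B) && "Shrink-wrapping not yet supported");
  B.Insts.insert(B.Insts.begin(), "prologue");
}
```

Passing any block other than the entry block trips the assertion in a debug build, which is how the unported targets signal that they cannot yet honor a shrink-wrapped placement.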
lib/Target/SystemZ/SystemZFrameLowering.h

Show All 34 Lines	bool spillCalleeSavedRegisters(MachineBasicBlock &MBB,
const TargetRegisterInfo *TRI) const override;		const TargetRegisterInfo *TRI) const override;
bool restoreCalleeSavedRegisters(MachineBasicBlock &MBB,		bool restoreCalleeSavedRegisters(MachineBasicBlock &MBB,
MachineBasicBlock::iterator MBBII,		MachineBasicBlock::iterator MBBII,
const std::vector<CalleeSavedInfo> &CSI,		const std::vector<CalleeSavedInfo> &CSI,
const TargetRegisterInfo *TRI) const		const TargetRegisterInfo *TRI) const
override;		override;
void processFunctionBeforeFrameFinalized(MachineFunction &MF,		void processFunctionBeforeFrameFinalized(MachineFunction &MF,
RegScavenger *RS) const override;		RegScavenger *RS) const override;
void emitPrologue(MachineFunction &MF) const override;		void emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const override;
void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;		void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;
bool hasFP(const MachineFunction &MF) const override;		bool hasFP(const MachineFunction &MF) const override;
int getFrameIndexOffset(const MachineFunction &MF, int FI) const override;		int getFrameIndexOffset(const MachineFunction &MF, int FI) const override;
bool hasReservedCallFrame(const MachineFunction &MF) const override;		bool hasReservedCallFrame(const MachineFunction &MF) const override;
void eliminateCallFramePseudoInstr(MachineFunction &MF,		void eliminateCallFramePseudoInstr(MachineFunction &MF,
MachineBasicBlock &MBB,		MachineBasicBlock &MBB,
MachineBasicBlock::iterator MI) const		MachineBasicBlock::iterator MI) const
override;		override;

lib/Target/SystemZ/SystemZFrameLowering.cpp

Show First 20 Lines • Show All 303 Lines • ▼ Show 20 Lines	while (NumBytes) {
MachineInstr *MI = BuildMI(MBB, MBBI, DL, TII->get(Opcode), Reg)		MachineInstr *MI = BuildMI(MBB, MBBI, DL, TII->get(Opcode), Reg)
.addReg(Reg).addImm(ThisVal);		.addReg(Reg).addImm(ThisVal);
// The CC implicit def is dead.		// The CC implicit def is dead.
MI->getOperand(3).setIsDead();		MI->getOperand(3).setIsDead();
NumBytes -= ThisVal;		NumBytes -= ThisVal;
}		}
}		}

void SystemZFrameLowering::emitPrologue(MachineFunction &MF) const {		void SystemZFrameLowering::emitPrologue(MachineFunction &MF,
MachineBasicBlock &MBB = MF.front();		MachineBasicBlock &MBB) const {
		assert(&MF.front() == &MBB && "Shrink-wrapping not yet supported");
MachineFrameInfo *MFFrame = MF.getFrameInfo();		MachineFrameInfo *MFFrame = MF.getFrameInfo();
auto *ZII =		auto *ZII =
static_cast<const SystemZInstrInfo *>(MF.getSubtarget().getInstrInfo());		static_cast<const SystemZInstrInfo *>(MF.getSubtarget().getInstrInfo());
SystemZMachineFunctionInfo *ZFI = MF.getInfo<SystemZMachineFunctionInfo>();		SystemZMachineFunctionInfo *ZFI = MF.getInfo<SystemZMachineFunctionInfo>();
MachineBasicBlock::iterator MBBI = MBB.begin();		MachineBasicBlock::iterator MBBI = MBB.begin();
MachineModuleInfo &MMI = MF.getMMI();		MachineModuleInfo &MMI = MF.getMMI();
const MCRegisterInfo *MRI = MMI.getContext().getRegisterInfo();		const MCRegisterInfo *MRI = MMI.getContext().getRegisterInfo();
const std::vector<CalleeSavedInfo> &CSI = MFFrame->getCalleeSavedInfo();		const std::vector<CalleeSavedInfo> &CSI = MFFrame->getCalleeSavedInfo();

lib/Target/X86/X86FrameLowering.h

Show All 29 Lines	static void emitStackProbeCall(MachineFunction &MF, MachineBasicBlock &MBB,
MachineBasicBlock::iterator MBBI, DebugLoc DL);		MachineBasicBlock::iterator MBBI, DebugLoc DL);

void emitCalleeSavedFrameMoves(MachineBasicBlock &MBB,		void emitCalleeSavedFrameMoves(MachineBasicBlock &MBB,
MachineBasicBlock::iterator MBBI,		MachineBasicBlock::iterator MBBI,
DebugLoc DL) const;		DebugLoc DL) const;

/// emitProlog/emitEpilog - These methods insert prolog and epilog code into		/// emitProlog/emitEpilog - These methods insert prolog and epilog code into
/// the function.		/// the function.
void emitPrologue(MachineFunction &MF) const override;		void emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const override;
void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;		void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override;

void adjustForSegmentedStacks(MachineFunction &MF) const override;		void adjustForSegmentedStacks(MachineFunction &MF,
		MachineBasicBlock &PrologueMBB) const override;

void adjustForHiPEPrologue(MachineFunction &MF) const override;		void adjustForHiPEPrologue(MachineFunction &MF,
		MachineBasicBlock &PrologueMBB) const override;

void processFunctionBeforeCalleeSavedScan(MachineFunction &MF,		void processFunctionBeforeCalleeSavedScan(MachineFunction &MF,
RegScavenger *RS = nullptr) const override;		RegScavenger *RS = nullptr) const override;

bool		bool
assignCalleeSavedSpillSlots(MachineFunction &MF,		assignCalleeSavedSpillSlots(MachineFunction &MF,
const TargetRegisterInfo *TRI,		const TargetRegisterInfo *TRI,
std::vector<CalleeSavedInfo> &CSI) const override;		std::vector<CalleeSavedInfo> &CSI) const override;

lib/Target/X86/X86FrameLowering.cpp

Show First 20 Lines • Show All 559 Lines • ▼ Show 20 Lines	[else]
.cfi_offset %<reg>, (offset from %rsp)		.cfi_offset %<reg>, (offset from %rsp)

Notes:		Notes:
- .seh directives are emitted only for Windows 64 ABI		- .seh directives are emitted only for Windows 64 ABI
- .cfi directives are emitted for all other ABIs		- .cfi directives are emitted for all other ABIs
- for 32-bit code, substitute %e?? registers for %r??		- for 32-bit code, substitute %e?? registers for %r??
*/		*/

void X86FrameLowering::emitPrologue(MachineFunction &MF) const {		void X86FrameLowering::emitPrologue(MachineFunction &MF,
MachineBasicBlock &MBB = MF.front(); // Prologue goes in entry BB.		MachineBasicBlock &MBB) const {
		assert(&MF.front() == &MBB && "Shrink-wrapping not yet supported");
MachineBasicBlock::iterator MBBI = MBB.begin();		MachineBasicBlock::iterator MBBI = MBB.begin();
MachineFrameInfo *MFI = MF.getFrameInfo();		MachineFrameInfo *MFI = MF.getFrameInfo();
const Function *Fn = MF.getFunction();		const Function *Fn = MF.getFunction();
const X86Subtarget &STI = MF.getSubtarget<X86Subtarget>();		const X86Subtarget &STI = MF.getSubtarget<X86Subtarget>();
const X86RegisterInfo *RegInfo = STI.getRegisterInfo();		const X86RegisterInfo *RegInfo = STI.getRegisterInfo();
const TargetInstrInfo &TII = *STI.getInstrInfo();		const TargetInstrInfo &TII = *STI.getInstrInfo();
MachineModuleInfo &MMI = MF.getMMI();		MachineModuleInfo &MMI = MF.getMMI();
X86MachineFunctionInfo *X86FI = MF.getInfo<X86MachineFunctionInfo>();		X86MachineFunctionInfo *X86FI = MF.getInfo<X86MachineFunctionInfo>();
▲ Show 20 Lines • Show All 1,007 Lines • ▼ Show 20 Lines	if (IsNested)
return Primary ? X86::EDX : X86::EAX;		return Primary ? X86::EDX : X86::EAX;
return Primary ? X86::ECX : X86::EAX;		return Primary ? X86::ECX : X86::EAX;
}		}

// The stack limit in the TCB is set to this many bytes above the actual stack		// The stack limit in the TCB is set to this many bytes above the actual stack
// limit.		// limit.
static const uint64_t kSplitStackAvailable = 256;		static const uint64_t kSplitStackAvailable = 256;

void		void X86FrameLowering::adjustForSegmentedStacks(
X86FrameLowering::adjustForSegmentedStacks(MachineFunction &MF) const {		MachineFunction &MF, MachineBasicBlock &PrologueMBB) const {
MachineBasicBlock &prologueMBB = MF.front();		assert(&PrologueMBB == &MF.front() &&
		"Shrink-wrapping is not implemented yet");
MachineFrameInfo *MFI = MF.getFrameInfo();		MachineFrameInfo *MFI = MF.getFrameInfo();
const X86Subtarget &STI = MF.getSubtarget<X86Subtarget>();		const X86Subtarget &STI = MF.getSubtarget<X86Subtarget>();
const TargetInstrInfo &TII = *STI.getInstrInfo();		const TargetInstrInfo &TII = *STI.getInstrInfo();
uint64_t StackSize;		uint64_t StackSize;
bool Is64Bit = STI.is64Bit();		bool Is64Bit = STI.is64Bit();
const bool IsLP64 = STI.isTarget64BitLP64();		const bool IsLP64 = STI.isTarget64BitLP64();
unsigned TlsReg, TlsOffset;		unsigned TlsReg, TlsOffset;
DebugLoc DL;		DebugLoc DL;
Show All 25 Lines	void X86FrameLowering::adjustForSegmentedStacks(

// We need to know if the function has a nest argument only in 64 bit mode.		// We need to know if the function has a nest argument only in 64 bit mode.
if (Is64Bit)		if (Is64Bit)
IsNested = HasNestArgument(&MF);		IsNested = HasNestArgument(&MF);

// The MOV R10, RAX needs to be in a different block, since the RET we emit in		// The MOV R10, RAX needs to be in a different block, since the RET we emit in
// allocMBB needs to be last (terminating) instruction.		// allocMBB needs to be last (terminating) instruction.

for (MachineBasicBlock::livein_iterator i = prologueMBB.livein_begin(),		for (MachineBasicBlock::livein_iterator i = PrologueMBB.livein_begin(),
e = prologueMBB.livein_end(); i != e; i++) {		e = PrologueMBB.livein_end();
		i != e; i++) {
allocMBB->addLiveIn(*i);		allocMBB->addLiveIn(*i);
checkMBB->addLiveIn(*i);		checkMBB->addLiveIn(*i);
}		}

if (IsNested)		if (IsNested)
allocMBB->addLiveIn(IsLP64 ? X86::R10 : X86::R10D);		allocMBB->addLiveIn(IsLP64 ? X86::R10 : X86::R10D);

MF.push_front(allocMBB);		MF.push_front(allocMBB);
▲ Show 20 Lines • Show All 97 Lines • ▼ Show 20 Lines	if (STI.isTargetLinux() || STI.isTargetWin32() || STI.isTargetWin64() ||

if (SaveScratch2)		if (SaveScratch2)
BuildMI(checkMBB, DL, TII.get(X86::POP32r), ScratchReg2);		BuildMI(checkMBB, DL, TII.get(X86::POP32r), ScratchReg2);
}		}
}		}

// This jump is taken if SP >= (Stacklet Limit + Stack Space required).		// This jump is taken if SP >= (Stacklet Limit + Stack Space required).
// It jumps to normal execution of the function body.		// It jumps to normal execution of the function body.
BuildMI(checkMBB, DL, TII.get(X86::JA_1)).addMBB(&prologueMBB);		BuildMI(checkMBB, DL, TII.get(X86::JA_1)).addMBB(&PrologueMBB);

// On 32 bit we first push the arguments size and then the frame size. On 64		// On 32 bit we first push the arguments size and then the frame size. On 64
// bit, we pass the stack frame size in r10 and the argument size in r11.		// bit, we pass the stack frame size in r10 and the argument size in r11.
if (Is64Bit) {		if (Is64Bit) {
// Functions with nested arguments use R10, so it needs to be saved across		// Functions with nested arguments use R10, so it needs to be saved across
// the call to _morestack		// the call to _morestack

const unsigned RegAX = IsLP64 ? X86::RAX : X86::EAX;		const unsigned RegAX = IsLP64 ? X86::RAX : X86::EAX;
▲ Show 20 Lines • Show All 50 Lines • ▼ Show 20 Lines	else
.addExternalSymbol("__morestack");		.addExternalSymbol("__morestack");
}		}

if (IsNested)		if (IsNested)
BuildMI(allocMBB, DL, TII.get(X86::MORESTACK_RET_RESTORE_R10));		BuildMI(allocMBB, DL, TII.get(X86::MORESTACK_RET_RESTORE_R10));
else		else
BuildMI(allocMBB, DL, TII.get(X86::MORESTACK_RET));		BuildMI(allocMBB, DL, TII.get(X86::MORESTACK_RET));

allocMBB->addSuccessor(&prologueMBB);		allocMBB->addSuccessor(&PrologueMBB);

checkMBB->addSuccessor(allocMBB);		checkMBB->addSuccessor(allocMBB);
checkMBB->addSuccessor(&prologueMBB);		checkMBB->addSuccessor(&PrologueMBB);

#ifdef XDEBUG		#ifdef XDEBUG
MF.verify();		MF.verify();
#endif		#endif
}		}

/// Erlang programs may need a special prologue to handle the stack size they		/// Erlang programs may need a special prologue to handle the stack size they
/// might need at runtime. That is because Erlang/OTP does not implement a C		/// might need at runtime. That is because Erlang/OTP does not implement a C
/// stack but uses a custom implementation of hybrid stack/heap architecture.		/// stack but uses a custom implementation of hybrid stack/heap architecture.
/// (for more information see Eric Stenman's Ph.D. thesis:		/// (for more information see Eric Stenman's Ph.D. thesis:
/// http://publications.uu.se/uu/fulltext/nbn_se_uu_diva-2688.pdf)		/// http://publications.uu.se/uu/fulltext/nbn_se_uu_diva-2688.pdf)
///		///
/// CheckStack:		/// CheckStack:
/// temp0 = sp - MaxStack		/// temp0 = sp - MaxStack
/// if( temp0 < SP_LIMIT(P) ) goto IncStack else goto OldStart		/// if( temp0 < SP_LIMIT(P) ) goto IncStack else goto OldStart
/// OldStart:		/// OldStart:
/// ...		/// ...
/// IncStack:		/// IncStack:
/// call inc_stack # doubles the stack space		/// call inc_stack # doubles the stack space
/// temp0 = sp - MaxStack		/// temp0 = sp - MaxStack
/// if( temp0 < SP_LIMIT(P) ) goto IncStack else goto OldStart		/// if( temp0 < SP_LIMIT(P) ) goto IncStack else goto OldStart
void X86FrameLowering::adjustForHiPEPrologue(MachineFunction &MF) const {		void X86FrameLowering::adjustForHiPEPrologue(
		MachineFunction &MF, MachineBasicBlock &PrologueMBB) const {
const X86Subtarget &STI = MF.getSubtarget<X86Subtarget>();		const X86Subtarget &STI = MF.getSubtarget<X86Subtarget>();
const TargetInstrInfo &TII = *STI.getInstrInfo();		const TargetInstrInfo &TII = *STI.getInstrInfo();
MachineFrameInfo *MFI = MF.getFrameInfo();		MachineFrameInfo *MFI = MF.getFrameInfo();
const unsigned SlotSize = STI.getRegisterInfo()->getSlotSize();		const unsigned SlotSize = STI.getRegisterInfo()->getSlotSize();
const bool Is64Bit = STI.is64Bit();		const bool Is64Bit = STI.is64Bit();
const bool IsLP64 = STI.isTarget64BitLP64();		const bool IsLP64 = STI.isTarget64BitLP64();
DebugLoc DL;		DebugLoc DL;
// HiPE-specific values		// HiPE-specific values
▲ Show 20 Lines • Show All 52 Lines • ▼ Show 20 Lines	for (MachineFunction::iterator MBBI = MF.begin(), MBBE = MF.end();
(HipeLeafWords - 1 - CalleeStkArity) * SlotSize);		(HipeLeafWords - 1 - CalleeStkArity) * SlotSize);
}		}
MaxStack += MoreStackForCalls;		MaxStack += MoreStackForCalls;
}		}

// If the stack frame needed is larger than the guaranteed then runtime checks		// If the stack frame needed is larger than the guaranteed then runtime checks
// and calls to "inc_stack_0" BIF should be inserted in the assembly prologue.		// and calls to "inc_stack_0" BIF should be inserted in the assembly prologue.
if (MaxStack > Guaranteed) {		if (MaxStack > Guaranteed) {
MachineBasicBlock &prologueMBB = MF.front();		assert(&PrologueMBB == &MF.front() &&
		"Shrink-wrapping is not implemented yet");
MachineBasicBlock *stackCheckMBB = MF.CreateMachineBasicBlock();		MachineBasicBlock *stackCheckMBB = MF.CreateMachineBasicBlock();
MachineBasicBlock *incStackMBB = MF.CreateMachineBasicBlock();		MachineBasicBlock *incStackMBB = MF.CreateMachineBasicBlock();

for (MachineBasicBlock::livein_iterator I = prologueMBB.livein_begin(),		for (MachineBasicBlock::livein_iterator I = PrologueMBB.livein_begin(),
E = prologueMBB.livein_end(); I != E; I++) {		E = PrologueMBB.livein_end();
		I != E; I++) {
stackCheckMBB->addLiveIn(*I);		stackCheckMBB->addLiveIn(*I);
incStackMBB->addLiveIn(*I);		incStackMBB->addLiveIn(*I);
}		}

MF.push_front(incStackMBB);		MF.push_front(incStackMBB);
MF.push_front(stackCheckMBB);		MF.push_front(stackCheckMBB);

unsigned ScratchReg, SPReg, PReg, SPLimitOffset;		unsigned ScratchReg, SPReg, PReg, SPLimitOffset;
Show All 19 Lines	assert(!MF.getRegInfo().isLiveIn(ScratchReg) &&
"HiPE prologue scratch register is live-in");		"HiPE prologue scratch register is live-in");

// Create new MBB for StackCheck:		// Create new MBB for StackCheck:
addRegOffset(BuildMI(stackCheckMBB, DL, TII.get(LEAop), ScratchReg),		addRegOffset(BuildMI(stackCheckMBB, DL, TII.get(LEAop), ScratchReg),
SPReg, false, -MaxStack);		SPReg, false, -MaxStack);
// SPLimitOffset is in a fixed heap location (pointed by BP).		// SPLimitOffset is in a fixed heap location (pointed by BP).
addRegOffset(BuildMI(stackCheckMBB, DL, TII.get(CMPop))		addRegOffset(BuildMI(stackCheckMBB, DL, TII.get(CMPop))
.addReg(ScratchReg), PReg, false, SPLimitOffset);		.addReg(ScratchReg), PReg, false, SPLimitOffset);
BuildMI(stackCheckMBB, DL, TII.get(X86::JAE_1)).addMBB(&prologueMBB);		BuildMI(stackCheckMBB, DL, TII.get(X86::JAE_1)).addMBB(&PrologueMBB);

// Create new MBB for IncStack:		// Create new MBB for IncStack:
BuildMI(incStackMBB, DL, TII.get(CALLop)).		BuildMI(incStackMBB, DL, TII.get(CALLop)).
addExternalSymbol("inc_stack_0");		addExternalSymbol("inc_stack_0");
addRegOffset(BuildMI(incStackMBB, DL, TII.get(LEAop), ScratchReg),		addRegOffset(BuildMI(incStackMBB, DL, TII.get(LEAop), ScratchReg),
SPReg, false, -MaxStack);		SPReg, false, -MaxStack);
addRegOffset(BuildMI(incStackMBB, DL, TII.get(CMPop))		addRegOffset(BuildMI(incStackMBB, DL, TII.get(CMPop))
.addReg(ScratchReg), PReg, false, SPLimitOffset);		.addReg(ScratchReg), PReg, false, SPLimitOffset);
BuildMI(incStackMBB, DL, TII.get(X86::JLE_1)).addMBB(incStackMBB);		BuildMI(incStackMBB, DL, TII.get(X86::JLE_1)).addMBB(incStackMBB);

stackCheckMBB->addSuccessor(&prologueMBB, 99);		stackCheckMBB->addSuccessor(&PrologueMBB, 99);
stackCheckMBB->addSuccessor(incStackMBB, 1);		stackCheckMBB->addSuccessor(incStackMBB, 1);
incStackMBB->addSuccessor(&prologueMBB, 99);		incStackMBB->addSuccessor(&PrologueMBB, 99);
incStackMBB->addSuccessor(incStackMBB, 1);		incStackMBB->addSuccessor(incStackMBB, 1);
}		}
#ifdef XDEBUG		#ifdef XDEBUG
MF.verify();		MF.verify();
#endif		#endif
}		}

void X86FrameLowering::		void X86FrameLowering::

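The CheckStack/IncStack comment in `adjustForHiPEPrologue` describes a small retry loop. The sketch below models it outside of LLVM under loudly stated assumptions: `StackModel`, `incStack`, and `checkStack` are hypothetical names, the stack is assumed to grow downward, and `inc_stack` is assumed to double the reserved space, as the comment says. It follows the `<` comparison written in the comment and is not the emitted machine code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a downward-growing stack; not HiPE's real layout.
struct StackModel {
  int64_t Base; // highest address of the stack area
  int64_t Size; // bytes currently reserved
  int64_t SP;   // current stack pointer
  // Plays the role of SP_LIMIT(P) in the comment.
  int64_t limit() const { return Base - Size; }
};

// Stand-in for the inc_stack call: assumed to double the reserved space.
void incStack(StackModel &S) { S.Size *= 2; }

// CheckStack: temp0 = sp - MaxStack; while temp0 < SP_LIMIT, call inc_stack.
// Returns how many times the stack had to be grown before reaching OldStart.
int checkStack(StackModel &S, int64_t MaxStack) {
  assert(S.Size > 0 && "doubling an empty stack would never terminate");
  int Calls = 0;
  while (S.SP - MaxStack < S.limit()) {
    incStack(S);
    ++Calls;
  }
  return Calls;
}
```

With a base of 1000, 200 bytes reserved, and SP at 900, a frame needing 500 bytes grows the model twice (limit 800, then 600, then 200) before the check passes; a frame needing 50 bytes passes immediately.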
lib/Target/XCore/XCoreFrameLowering.h

Show All 21 Lines	namespace llvm {
class XCoreSubtarget;		class XCoreSubtarget;

class XCoreFrameLowering: public TargetFrameLowering {		class XCoreFrameLowering: public TargetFrameLowering {
public:		public:
XCoreFrameLowering(const XCoreSubtarget &STI);		XCoreFrameLowering(const XCoreSubtarget &STI);

/// emitProlog/emitEpilog - These methods insert prolog and epilog code into		/// emitProlog/emitEpilog - These methods insert prolog and epilog code into
/// the function.		/// the function.
void emitPrologue(MachineFunction &MF) const override;		void emitPrologue(MachineFunction &MF,
		MachineBasicBlock &MBB) const override;
void emitEpilogue(MachineFunction &MF,		void emitEpilogue(MachineFunction &MF,
MachineBasicBlock &MBB) const override;		MachineBasicBlock &MBB) const override;

bool spillCalleeSavedRegisters(MachineBasicBlock &MBB,		bool spillCalleeSavedRegisters(MachineBasicBlock &MBB,
MachineBasicBlock::iterator MI,		MachineBasicBlock::iterator MI,
const std::vector<CalleeSavedInfo> &CSI,		const std::vector<CalleeSavedInfo> &CSI,
const TargetRegisterInfo *TRI) const override;		const TargetRegisterInfo *TRI) const override;
bool restoreCalleeSavedRegisters(MachineBasicBlock &MBB,		bool restoreCalleeSavedRegisters(MachineBasicBlock &MBB,

lib/Target/XCore/XCoreFrameLowering.cpp

Show First 20 Lines • Show All 214 Lines • ▼ Show 20 Lines	XCoreFrameLowering::XCoreFrameLowering(const XCoreSubtarget &sti)
// Do nothing		// Do nothing
}		}

bool XCoreFrameLowering::hasFP(const MachineFunction &MF) const {		bool XCoreFrameLowering::hasFP(const MachineFunction &MF) const {
return MF.getTarget().Options.DisableFramePointerElim(MF) ||		return MF.getTarget().Options.DisableFramePointerElim(MF) ||
MF.getFrameInfo()->hasVarSizedObjects();		MF.getFrameInfo()->hasVarSizedObjects();
}		}

void XCoreFrameLowering::emitPrologue(MachineFunction &MF) const {		void XCoreFrameLowering::emitPrologue(MachineFunction &MF,
MachineBasicBlock &MBB = MF.front(); // Prolog goes in entry BB		MachineBasicBlock &MBB) const {
		assert(&MF.front() == &MBB && "Shrink-wrapping not yet supported");
MachineBasicBlock::iterator MBBI = MBB.begin();		MachineBasicBlock::iterator MBBI = MBB.begin();
MachineFrameInfo *MFI = MF.getFrameInfo();		MachineFrameInfo *MFI = MF.getFrameInfo();
MachineModuleInfo *MMI = &MF.getMMI();		MachineModuleInfo *MMI = &MF.getMMI();
const MCRegisterInfo *MRI = MMI->getContext().getRegisterInfo();		const MCRegisterInfo *MRI = MMI->getContext().getRegisterInfo();
const XCoreInstrInfo &TII = *MF.getSubtarget<XCoreSubtarget>().getInstrInfo();		const XCoreInstrInfo &TII = *MF.getSubtarget<XCoreSubtarget>().getInstrInfo();
XCoreFunctionInfo *XFI = MF.getInfo<XCoreFunctionInfo>();		XCoreFunctionInfo *XFI = MF.getInfo<XCoreFunctionInfo>();
// Debug location must be unknown since the first debug location is used		// Debug location must be unknown since the first debug location is used
// to determine the end of the prologue.		// to determine the end of the prologue.

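The tests that follow expect the save and restore points to stay out of loops even when placing them there would be legal, and they attribute that to a cost model. One way to picture such a rule is the sketch below. The `Candidate` type, the `pickSavePoint` helper, and the frequency numbers are all hypothetical; they are not taken from the pass and only illustrate "move the prologue only to a legal block that is cheaper than the entry block".

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of a block that could host the prologue.
struct Candidate {
  std::string Name;
  uint64_t Freq; // assumed execution frequency estimate
  bool Legal;    // whether placing the prologue there would be correct
};

// Returns the index of the cheapest legal candidate that is strictly less
// frequent than the entry block, or -1 to keep the prologue in the entry.
int pickSavePoint(const std::vector<Candidate> &Cs, uint64_t EntryFreq) {
  int Best = -1;
  uint64_t BestFreq = EntryFreq;
  for (size_t I = 0; I < Cs.size(); ++I) {
    if (!Cs[I].Legal)
      continue;
    if (Cs[I].Freq < BestFreq) {
      Best = static_cast<int>(I);
      BestFreq = Cs[I].Freq;
    }
  }
  return Best;
}
```

Under these assumptions, a diamond whose call sits on a colder branch yields that branch as the save point, while a loop body that runs more often than the entry block is rejected and the prologue stays in the entry block.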
test/CodeGen/AArch64/arm64-shrink-wrapping.ll

				; RUN: llc %s -o - -enable-shrink-wrap=true | FileCheck %s --check-prefix=CHECK --check-prefix=ENABLE
				; RUN: llc %s -o - -enable-shrink-wrap=false | FileCheck %s --check-prefix=CHECK --check-prefix=DISABLE
				target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
				target triple = "arm64-apple-ios"


				; Initial motivating example: Simple diamond with a call just on one side.
				; CHECK-LABEL: foo:
				;
				; Compare the arguments and jump to exit.
				; No prologue needed.
				; ENABLE: cmp w0, w1
				; ENABLE-NEXT: b.ge [[EXIT_LABEL:LBB[0-9_]+]]
				;
				; Prologue code.
				; CHECK: stp [[SAVE_SP:x[0-9]+]], [[CSR:x[0-9]+]], [sp, #-16]!
				; CHECK-NEXT: mov [[SAVE_SP]], sp
				; CHECK-NEXT: sub sp, sp, #16
				;
				; Compare the arguments and jump to exit.
				; After the prologue is set.
				; DISABLE: cmp w0, w1
				; DISABLE-NEXT: b.ge [[EXIT_LABEL:LBB[0-9_]+]]
				;
				; Store %a in the alloca.
				; CHECK: stur w0, {{\[}}[[SAVE_SP]], #-4]
				; Set the alloca address in the second argument.
				; CHECK-NEXT: sub x1, [[SAVE_SP]], #4
				; Set the first argument to zero.
				; CHECK-NEXT: mov w0, wzr
				; CHECK-NEXT: bl _doSomething
				;
				; Without shrink-wrapping, epilogue is in the exit block.
				; DISABLE: [[EXIT_LABEL]]:
				; Epilogue code.
				; CHECK-NEXT: mov sp, [[SAVE_SP]]
				; CHECK-NEXT: ldp [[SAVE_SP]], [[CSR]], [sp], #16
				;
				; With shrink-wrapping, exit block is a simple return.
				; ENABLE: [[EXIT_LABEL]]:
				; CHECK-NEXT: ret
				define i32 @foo(i32 %a, i32 %b) {
				%tmp = alloca i32, align 4
				%tmp2 = icmp slt i32 %a, %b
				br i1 %tmp2, label %true, label %false

				true:
				store i32 %a, i32* %tmp, align 4
				%tmp4 = call i32 @doSomething(i32 0, i32* %tmp)
				br label %false

				false:
				%tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ]
				ret i32 %tmp.0
				}

				; Function Attrs: optsize
				declare i32 @doSomething(i32, i32*)


				; Check that we do not perform the restore inside the loop whereas the save
				; is outside.
				; CHECK-LABEL: freqSaveAndRestoreOutsideLoop:
				;
				; Shrink-wrapping allows skipping the prologue in the else case.
				; ENABLE: cbz w0, [[ELSE_LABEL:LBB[0-9_]+]]
				;
				; Prologue code.
				; CHECK: stp [[CSR1:x[0-9]+]], [[CSR2:x[0-9]+]], [sp, #-32]!
				; CHECK-NEXT: stp [[CSR3:x[0-9]+]], [[CSR4:x[0-9]+]], [sp, #16]
				; CHECK-NEXT: add [[NEW_SP:x[0-9]+]], sp, #16
				;
				; DISABLE: cbz w0, [[ELSE_LABEL:LBB[0-9_]+]]
				;
				; CHECK: mov [[SUM:w[0-9]+]], wzr
				; CHECK-NEXT: movz [[IV:w[0-9]+]], #0xa
				;
				; Next BB.
				; CHECK: [[LOOP:LBB[0-9_]+]]: ; %for.body
				; CHECK: bl _something
				; CHECK-NEXT: add [[SUM]], w0, [[SUM]]
				; CHECK-NEXT: sub [[IV]], [[IV]], #1
				; CHECK-NEXT: cbnz [[IV]], [[LOOP]]
				;
				; Next BB.
				; Copy SUM, shifted left by 3, into the return register.
				; CHECK: lsl w0, [[SUM]], #3
				;
				; Jump to epilogue.
				; DISABLE: b [[EPILOG_BB:LBB[0-9_]+]]
				;
				; DISABLE: [[ELSE_LABEL]]: ; %if.else
				; Shift second argument by one and store into returned register.
				; DISABLE: lsl w0, w1, #1
				; DISABLE: [[EPILOG_BB]]: ; %if.end
				;
				; Epilogue code.
				; CHECK: ldp [[CSR3]], [[CSR4]], [sp, #16]
				; CHECK-NEXT: ldp [[CSR1]], [[CSR2]], [sp], #32
				; CHECK-NEXT: ret
				;
				; ENABLE: [[ELSE_LABEL]]: ; %if.else
				; Shift second argument by one and store into returned register.
				; ENABLE: lsl w0, w1, #1
				; ENABLE: ret
				define i32 @freqSaveAndRestoreOutsideLoop(i32 %cond, i32 %N) {
				entry:
				%tobool = icmp eq i32 %cond, 0
				br i1 %tobool, label %if.else, label %for.body

				for.body: ; preds = %entry, %for.body
				%i.05 = phi i32 [ %inc, %for.body ], [ 0, %entry ]
				%sum.04 = phi i32 [ %add, %for.body ], [ 0, %entry ]
				%call = tail call i32 bitcast (i32 (...)* @something to i32 ()*)()
				%add = add nsw i32 %call, %sum.04
				%inc = add nuw nsw i32 %i.05, 1
				%exitcond = icmp eq i32 %inc, 10
				br i1 %exitcond, label %for.end, label %for.body

				for.end: ; preds = %for.body
				%shl = shl i32 %add, 3
				br label %if.end

				if.else: ; preds = %entry
				%mul = shl nsw i32 %N, 1
				br label %if.end

				if.end: ; preds = %if.else, %for.end
				%sum.1 = phi i32 [ %shl, %for.end ], [ %mul, %if.else ]
				ret i32 %sum.1
				}

				declare i32 @something(...)

				; Check that we do not perform the shrink-wrapping inside the loop even
				; though that would be legal. The cost model must prevent that.
				; CHECK-LABEL: freqSaveAndRestoreOutsideLoop2:
				; Prologue code.
				; CHECK: stp [[CSR1:x[0-9]+]], [[CSR2:x[0-9]+]], [sp, #-32]!
				; CHECK-NEXT: stp [[CSR3:x[0-9]+]], [[CSR4:x[0-9]+]], [sp, #16]
				; CHECK-NEXT: add [[NEW_SP:x[0-9]+]], sp, #16
				; CHECK: mov [[SUM:w[0-9]+]], wzr
				; CHECK-NEXT: movz [[IV:w[0-9]+]], #0xa
				; Next BB.
				; CHECK: [[LOOP_LABEL:LBB[0-9_]+]]: ; %for.body
				; CHECK: bl _something
				; CHECK-NEXT: add [[SUM]], w0, [[SUM]]
				; CHECK-NEXT: sub [[IV]], [[IV]], #1
				; CHECK-NEXT: cbnz [[IV]], [[LOOP_LABEL]]
				; Next BB.
				; CHECK: ; %for.end
				; CHECK: mov w0, [[SUM]]
				; CHECK-NEXT: ldp [[CSR3]], [[CSR4]], [sp, #16]
				; CHECK-NEXT: ldp [[CSR1]], [[CSR2]], [sp], #32
				; CHECK-NEXT: ret
				define i32 @freqSaveAndRestoreOutsideLoop2(i32 %cond) {
				entry:
				br label %for.body

				for.body: ; preds = %for.body, %entry
				%i.04 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
				%sum.03 = phi i32 [ 0, %entry ], [ %add, %for.body ]
				%call = tail call i32 bitcast (i32 (...)* @something to i32 ()*)()
				%add = add nsw i32 %call, %sum.03
				%inc = add nuw nsw i32 %i.04, 1
				%exitcond = icmp eq i32 %inc, 10
				br i1 %exitcond, label %for.end, label %for.body

				for.end: ; preds = %for.body
				ret i32 %add
				}

				; Check with a more complex case that we do not have a save within the loop
				; and a restore outside.
				; CHECK-LABEL: loopInfoSaveOutsideLoop:
				;
				; ENABLE: cbz w0, [[ELSE_LABEL:LBB[0-9_]+]]
				;
				; Prologue code.
				; CHECK: stp [[CSR1:x[0-9]+]], [[CSR2:x[0-9]+]], [sp, #-32]!
				; CHECK-NEXT: stp [[CSR3:x[0-9]+]], [[CSR4:x[0-9]+]], [sp, #16]
				; CHECK-NEXT: add [[NEW_SP:x[0-9]+]], sp, #16
				;
				; DISABLE: cbz w0, [[ELSE_LABEL:LBB[0-9_]+]]
				;
				; CHECK: mov [[SUM:w[0-9]+]], wzr
				; CHECK-NEXT: movz [[IV:w[0-9]+]], #0xa
				;
				; CHECK: [[LOOP_LABEL:LBB[0-9_]+]]: ; %for.body
				; CHECK: bl _something
				; CHECK-NEXT: add [[SUM]], w0, [[SUM]]
				; CHECK-NEXT: sub [[IV]], [[IV]], #1
				; CHECK-NEXT: cbnz [[IV]], [[LOOP_LABEL]]
				; Next BB.
				; CHECK: bl _somethingElse
				; CHECK-NEXT: lsl w0, [[SUM]], #3
				;
				; Jump to epilogue.
				; DISABLE: b [[EPILOG_BB:LBB[0-9_]+]]
				;
				; DISABLE: [[ELSE_LABEL]]: ; %if.else
				; Shift second argument by one and store into returned register.
				; DISABLE: lsl w0, w1, #1
				; DISABLE: [[EPILOG_BB]]: ; %if.end
				; Epilogue code.
				; CHECK-NEXT: ldp [[CSR3]], [[CSR4]], [sp, #16]
				; CHECK-NEXT: ldp [[CSR1]], [[CSR2]], [sp], #32
				; CHECK-NEXT: ret
				;
				; ENABLE: [[ELSE_LABEL]]: ; %if.else
				; Shift second argument by one and store into returned register.
				; ENABLE: lsl w0, w1, #1
				; ENABLE: ret
				define i32 @loopInfoSaveOutsideLoop(i32 %cond, i32 %N) {
				entry:
				%tobool = icmp eq i32 %cond, 0
				br i1 %tobool, label %if.else, label %for.body

				for.body: ; preds = %entry, %for.body
				%i.05 = phi i32 [ %inc, %for.body ], [ 0, %entry ]
				%sum.04 = phi i32 [ %add, %for.body ], [ 0, %entry ]
				%call = tail call i32 bitcast (i32 (...)* @something to i32 ()*)()
				%add = add nsw i32 %call, %sum.04
				%inc = add nuw nsw i32 %i.05, 1
				%exitcond = icmp eq i32 %inc, 10
				br i1 %exitcond, label %for.end, label %for.body

				for.end: ; preds = %for.body
				tail call void bitcast (void (...)* @somethingElse to void ()*)()
				%shl = shl i32 %add, 3
				br label %if.end

				if.else: ; preds = %entry
				%mul = shl nsw i32 %N, 1
				br label %if.end

				if.end: ; preds = %if.else, %for.end
				%sum.1 = phi i32 [ %shl, %for.end ], [ %mul, %if.else ]
				ret i32 %sum.1
				}

				declare void @somethingElse(...)

				; Check with a more complex case that we do not have a restore within the loop
				; and a save outside.
				; CHECK-LABEL: loopInfoRestoreOutsideLoop:
				;
				; ENABLE: cbz w0, [[ELSE_LABEL:LBB[0-9_]+]]
				;
				; CHECK: stp [[CSR1:x[0-9]+]], [[CSR2:x[0-9]+]], [sp, #-32]!
				; CHECK-NEXT: stp [[CSR3:x[0-9]+]], [[CSR4:x[0-9]+]], [sp, #16]
				; CHECK-NEXT: add [[NEW_SP:x[0-9]+]], sp, #16
				;
				; DISABLE: cbz w0, [[ELSE_LABEL:LBB[0-9_]+]]
				;
				; CHECK: bl _somethingElse
				; CHECK-NEXT: mov [[SUM:w[0-9]+]], wzr
				; CHECK-NEXT: movz [[IV:w[0-9]+]], #0xa
				;
				; CHECK: [[LOOP_LABEL:LBB[0-9_]+]]: ; %for.body
				; CHECK: bl _something
				; CHECK-NEXT: add [[SUM]], w0, [[SUM]]
				; CHECK-NEXT: sub [[IV]], [[IV]], #1
				; CHECK-NEXT: cbnz [[IV]], [[LOOP_LABEL]]
				; Next BB.
				; CHECK: lsl w0, [[SUM]], #3
				;
				; Jump to epilogue.
				; DISABLE: b [[EPILOG_BB:LBB[0-9_]+]]
				;
				; DISABLE: [[ELSE_LABEL]]: ; %if.else
				; Shift second argument by one and store into returned register.
				; DISABLE: lsl w0, w1, #1
				; DISABLE: [[EPILOG_BB]]: ; %if.end
				; Epilogue code.
				; CHECK: ldp [[CSR3]], [[CSR4]], [sp, #16]
				; CHECK-NEXT: ldp [[CSR1]], [[CSR2]], [sp], #32
				; CHECK-NEXT: ret
				;
				; ENABLE: [[ELSE_LABEL]]: ; %if.else
				; Shift second argument by one and store into returned register.
				; ENABLE: lsl w0, w1, #1
				; ENABLE: ret
				define i32 @loopInfoRestoreOutsideLoop(i32 %cond, i32 %N) #0 {
				entry:
				%tobool = icmp eq i32 %cond, 0
				br i1 %tobool, label %if.else, label %if.then

				if.then: ; preds = %entry
				tail call void bitcast (void (...)* @somethingElse to void ()*)()
				br label %for.body

				for.body: ; preds = %for.body, %if.then
				%i.05 = phi i32 [ 0, %if.then ], [ %inc, %for.body ]
				%sum.04 = phi i32 [ 0, %if.then ], [ %add, %for.body ]
				%call = tail call i32 bitcast (i32 (...)* @something to i32 ()*)()
				%add = add nsw i32 %call, %sum.04
				%inc = add nuw nsw i32 %i.05, 1
				%exitcond = icmp eq i32 %inc, 10
				br i1 %exitcond, label %for.end, label %for.body

				for.end: ; preds = %for.body
				%shl = shl i32 %add, 3
				br label %if.end

				if.else: ; preds = %entry
				%mul = shl nsw i32 %N, 1
				br label %if.end

				if.end: ; preds = %if.else, %for.end
				%sum.1 = phi i32 [ %shl, %for.end ], [ %mul, %if.else ]
				ret i32 %sum.1
				}

				; Check that we handle functions with no frame information correctly.
				; CHECK-LABEL: emptyFrame:
				; CHECK: ; %entry
				; CHECK-NEXT: mov w0, wzr
				; CHECK-NEXT: ret
				define i32 @emptyFrame() {
				entry:
				ret i32 0
				}

				; Check that we handle variadic functions correctly.
				; CHECK-LABEL: variadicFunc:
				;
				; ENABLE: cbz w0, [[ELSE_LABEL:LBB[0-9_]+]]
				;
				; Prologue code.
				; CHECK: sub sp, sp, #16
				; DISABLE: cbz w0, [[ELSE_LABEL:LBB[0-9_]+]]
				;
				; Sum is merged with the returned register.
				; CHECK: mov [[SUM:w0]], wzr
				; CHECK-NEXT: add [[VA_BASE:x[0-9]+]], sp, #16
				; CHECK-NEXT: str [[VA_BASE]], [sp, #8]
				; CHECK-NEXT: cmp w1, #1
				; CHECK-NEXT: b.lt [[IFEND_LABEL:LBB[0-9_]+]]
				;
				; CHECK: [[LOOP_LABEL:LBB[0-9_]+]]: ; %for.body
				; CHECK: ldr [[VA_ADDR:x[0-9]+]], [sp, #8]
				; CHECK-NEXT: add [[NEXT_VA_ADDR:x[0-9]+]], [[VA_ADDR]], #8
				; CHECK-NEXT: str [[NEXT_VA_ADDR]], [sp, #8]
				; CHECK-NEXT: ldr [[VA_VAL:w[0-9]+]], {{\[}}[[VA_ADDR]]]
				; CHECK-NEXT: add [[SUM]], [[SUM]], [[VA_VAL]]
				; CHECK-NEXT: sub w1, w1, #1
				; CHECK-NEXT: cbnz w1, [[LOOP_LABEL]]
				;
				; DISABLE-NEXT: b [[IFEND_LABEL]]
				; DISABLE: [[ELSE_LABEL]]: ; %if.else
				; DISABLE: lsl w0, w1, #1
				;
				; CHECK: [[IFEND_LABEL]]:
				; Epilogue code.
				; CHECK: add sp, sp, #16
				; CHECK-NEXT: ret
				;
				; ENABLE: [[ELSE_LABEL]]: ; %if.else
				; ENABLE: lsl w0, w1, #1
				; ENABLE-NEXT: ret
				define i32 @variadicFunc(i32 %cond, i32 %count, ...) #0 {
				entry:
				%ap = alloca i8*, align 8
				%tobool = icmp eq i32 %cond, 0
				br i1 %tobool, label %if.else, label %if.then

				if.then: ; preds = %entry
				%ap1 = bitcast i8** %ap to i8*
				call void @llvm.va_start(i8* %ap1)
				%cmp6 = icmp sgt i32 %count, 0
				br i1 %cmp6, label %for.body, label %for.end

				for.body: ; preds = %if.then, %for.body
				%i.08 = phi i32 [ %inc, %for.body ], [ 0, %if.then ]
				%sum.07 = phi i32 [ %add, %for.body ], [ 0, %if.then ]
				%0 = va_arg i8** %ap, i32
				%add = add nsw i32 %sum.07, %0
				%inc = add nuw nsw i32 %i.08, 1
				%exitcond = icmp eq i32 %inc, %count
				br i1 %exitcond, label %for.end, label %for.body

				for.end: ; preds = %for.body, %if.then
				%sum.0.lcssa = phi i32 [ 0, %if.then ], [ %add, %for.body ]
				call void @llvm.va_end(i8* %ap1)
				br label %if.end

				if.else: ; preds = %entry
				%mul = shl nsw i32 %count, 1
				br label %if.end

				if.end: ; preds = %if.else, %for.end
				%sum.1 = phi i32 [ %sum.0.lcssa, %for.end ], [ %mul, %if.else ]
				ret i32 %sum.1
				}

				declare void @llvm.va_start(i8*)

				declare void @llvm.va_end(i8*)

				; Check that we handle inline asm correctly.
				; CHECK-LABEL: inlineAsm:
				;
				; ENABLE: cbz w0, [[ELSE_LABEL:LBB[0-9_]+]]
				;
				; Prologue code.
				; Make sure we save the CSR used in the inline asm: x19.
				; CHECK: stp [[CSR1:x[0-9]+]], [[CSR2:x19]], [sp, #-16]!
				;
				; DISABLE: cbz w0, [[ELSE_LABEL:LBB[0-9_]+]]
				;
				; CHECK: movz [[IV:w[0-9]+]], #0xa
				;
				; CHECK: [[LOOP_LABEL:LBB[0-9_]+]]: ; %for.body
				; Inline asm statement.
				; CHECK: add x19, x19, #1
				; CHECK: sub [[IV]], [[IV]], #1
				; CHECK-NEXT: cbnz [[IV]], [[LOOP_LABEL]]
				; Next BB.
				; CHECK: mov w0, wzr
				; Epilogue code.
				; CHECK-NEXT: ldp [[CSR1]], [[CSR2]], [sp], #16
				; CHECK-NEXT: ret
				; Next BB.
				; CHECK: [[ELSE_LABEL]]: ; %if.else
				; CHECK-NEXT: lsl w0, w1, #1
				; Epilogue code.
				; DISABLE-NEXT: ldp [[CSR1]], [[CSR2]], [sp], #16
				; CHECK-NEXT: ret
				define i32 @inlineAsm(i32 %cond, i32 %N) {
				entry:
				%tobool = icmp eq i32 %cond, 0
				br i1 %tobool, label %if.else, label %for.body

				for.body: ; preds = %entry, %for.body
				%i.03 = phi i32 [ %inc, %for.body ], [ 0, %entry ]
				tail call void asm sideeffect "add x19, x19, #1", "~{x19}"()
				%inc = add nuw nsw i32 %i.03, 1
				%exitcond = icmp eq i32 %inc, 10
				br i1 %exitcond, label %if.end, label %for.body

				if.else: ; preds = %entry
				%mul = shl nsw i32 %N, 1
				br label %if.end

				if.end: ; preds = %for.body, %if.else
				%sum.0 = phi i32 [ %mul, %if.else ], [ 0, %for.body ]
				ret i32 %sum.0
				}

				; Check that we handle calls to variadic functions correctly.
				; CHECK-LABEL: callVariadicFunc:
				;
				; ENABLE: cbz w0, [[ELSE_LABEL:LBB[0-9_]+]]
				;
				; Prologue code.
				; CHECK: stp [[CSR1:x[0-9]+]], [[CSR2:x[0-9]+]], [sp, #-16]!
				; CHECK-NEXT: mov [[NEW_SP:x[0-9]+]], sp
				; CHECK-NEXT: sub sp, sp, #48
				;
				; DISABLE: cbz w0, [[ELSE_LABEL:LBB[0-9_]+]]
				; Setup of the varargs.
				; CHECK: stp x1, x1, [sp, #32]
				; CHECK-NEXT: stp x1, x1, [sp, #16]
				; CHECK-NEXT: stp x1, x1, [sp]
				; CHECK-NEXT: mov w0, w1
				; CHECK-NEXT: bl _someVariadicFunc
				; CHECK-NEXT: lsl w0, w0, #3
				;
				; DISABLE: b [[IFEND_LABEL:LBB[0-9_]+]]
				; DISABLE: [[ELSE_LABEL]]: ; %if.else
				; DISABLE-NEXT: lsl w0, w1, #1
				; DISABLE: [[IFEND_LABEL]]: ; %if.end
				;
				; Epilogue code.
				; CHECK: mov sp, [[NEW_SP]]
				; CHECK-NEXT: ldp [[CSR1]], [[CSR2]], [sp], #16
				; CHECK-NEXT: ret
				;
				; ENABLE: [[ELSE_LABEL]]: ; %if.else
				; ENABLE-NEXT: lsl w0, w1, #1
				; ENABLE-NEXT: ret
				define i32 @callVariadicFunc(i32 %cond, i32 %N) {
				entry:
				%tobool = icmp eq i32 %cond, 0
				br i1 %tobool, label %if.else, label %if.then

				if.then: ; preds = %entry
				%call = tail call i32 (i32, ...) @someVariadicFunc(i32 %N, i32 %N, i32 %N, i32 %N, i32 %N, i32 %N, i32 %N)
				%shl = shl i32 %call, 3
				br label %if.end

				if.else: ; preds = %entry
				%mul = shl nsw i32 %N, 1
				br label %if.end

				if.end: ; preds = %if.else, %if.then
				%sum.0 = phi i32 [ %shl, %if.then ], [ %mul, %if.else ]
				ret i32 %sum.0
				}

				declare i32 @someVariadicFunc(i32, ...)

Revision Contents

Diff 24926

include/llvm/CodeGen/MachineFrameInfo.h

include/llvm/CodeGen/Passes.h

include/llvm/InitializePasses.h

include/llvm/Target/TargetFrameLowering.h

lib/CodeGen/CMakeLists.txt

lib/CodeGen/CodeGen.cpp

lib/CodeGen/MachineFunction.cpp

lib/CodeGen/Passes.cpp

lib/CodeGen/PrologEpilogInserter.cpp

lib/CodeGen/ShrinkWrap.cpp

lib/Target/AArch64/AArch64FrameLowering.h

lib/Target/AArch64/AArch64FrameLowering.cpp

lib/Target/ARM/ARMFrameLowering.h

lib/Target/ARM/ARMFrameLowering.cpp

lib/Target/ARM/Thumb1FrameLowering.h

lib/Target/ARM/Thumb1FrameLowering.cpp

lib/Target/BPF/BPFFrameLowering.h

lib/Target/BPF/BPFFrameLowering.cpp

lib/Target/Hexagon/HexagonFrameLowering.h

lib/Target/Hexagon/HexagonFrameLowering.cpp

lib/Target/MSP430/MSP430FrameLowering.h

lib/Target/MSP430/MSP430FrameLowering.cpp

lib/Target/Mips/Mips16FrameLowering.h

lib/Target/Mips/Mips16FrameLowering.cpp

lib/Target/Mips/MipsSEFrameLowering.h

lib/Target/Mips/MipsSEFrameLowering.cpp

lib/Target/NVPTX/NVPTXFrameLowering.h

lib/Target/NVPTX/NVPTXFrameLowering.cpp

lib/Target/NVPTX/NVPTXPrologEpilogPass.cpp

lib/Target/PowerPC/PPCFrameLowering.h

lib/Target/PowerPC/PPCFrameLowering.cpp

lib/Target/R600/AMDGPUFrameLowering.h

lib/Target/R600/AMDGPUFrameLowering.cpp

lib/Target/Sparc/SparcFrameLowering.h

lib/Target/Sparc/SparcFrameLowering.cpp

lib/Target/SystemZ/SystemZFrameLowering.h

lib/Target/SystemZ/SystemZFrameLowering.cpp

lib/Target/X86/X86FrameLowering.h

lib/Target/X86/X86FrameLowering.cpp

lib/Target/XCore/XCoreFrameLowering.h

lib/Target/XCore/XCoreFrameLowering.cpp

test/CodeGen/AArch64/arm64-shrink-wrapping.ll
