This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contents

- llvm/trunk/
  - include/llvm/CodeGen/
    - MachineModuleInfo.h
  - lib/
    - CodeGen/AsmPrinter/
      - AsmPrinterDwarf.cpp
    - Target/X86/
      - X86CallFrameOptimization.cpp
      - X86FrameLowering.h
      - X86FrameLowering.cpp
  - test/CodeGen/X86/
    - debugloc-argsize.ll
    - fold-push.ll
    - pop-stack-cleanup.ll
    - push-cfi-debug.ll
    - push-cfi-obj.ll
    - push-cfi.ll

Differential D13767

[X86] Fix more -Os + EH issues
Closed Public

Authored by mkuper on Oct 15 2015, 2:03 AM.

Details

Reviewers

friss
joerg
kbsmith1
DavidKreitzer
rnk

Commits

rG4d5a9c1d171b: Using higher level interface to insert new arguments so arguments
rG73dc85293f00: [X86] Generate .cfi_adjust_cfa_offset correctly when pushing arguments
rL251904: [X86] Generate .cfi_adjust_cfa_offset correctly when pushing arguments

Summary

This does two things:

1. Disables push generation on Darwin whenever EH is used and FP is not present.
2. Generates .cfi_adjust_cfa_offset directives to make the CFA offset correct at each call site.

Generating more precise offset adjustments for debug info will be a separate patch.
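
A minimal sketch of the Darwin condition from the first point, written as a standalone predicate. The struct and function names are hypothetical stand-ins for the queries the pass makes (`isTargetDarwin()`, `getLandingPads()`, `needsUnwindTableEntry()`, `hasFP()`); the boolean logic follows the `isLegal` check in the diff for this revision, not any separate LLVM API.

```cpp
// Hypothetical stand-in for the facts the pass queries; not an LLVM type.
struct FunctionFacts {
  bool IsTargetDarwin;
  bool HasLandingPads;
  bool NeedsUnwindTableEntry;
  bool HasFP;
};

// Returns true when the mov-to-push transformation should be skipped because
// Darwin's compact unwind encoding could not describe the per-call-site
// GNU_args_size / CFA offset changes it would require.
bool mustAvoidPushes(const FunctionFacts &F) {
  if (!F.IsTargetDarwin)
    return false;
  return F.HasLandingPads || (F.NeedsUnwindTableEntry && !F.HasFP);
}
```

With a frame pointer the CFA is FP-based, so pushes do not move it; that is why the second clause only fires when `HasFP` is false.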

Diff Detail

Repository: rL LLVM

Event Timeline

mkuper updated this revision to Diff 37460. Oct 15 2015, 2:03 AM

mkuper retitled this revision to [X86] Fix more -Os + EH issues.

mkuper updated this object.

mkuper added reviewers: friss, DavidKreitzer, rnk, joerg, kbsmith1.

mkuper added a subscriber: llvm-commits.

Please see inline comments.

lib/Target/X86/X86FrameLowering.cpp
2108 ↗	(On Diff #37460)	I think this code needs to go after line 2128 in order to create the CFI adjust after the stack adjustment instruction has happened rather than before.
test/CodeGen/X86/push-cfi-obj.ll
27 ↗	(On Diff #37460)	I don't understand where the specific differences in these values in Section Data come from. Do you know?
test/CodeGen/X86/push-cfi.ll
14 ↗	(On Diff #37460)	I don't think this is really the correct spot for the cfi_adjust. I think it should occur after all the pushes have happened, so right before the call instruction. Once you get to having a different cfi_adjust per push, each of these should follow immediately after the instruction, not before. Now I understand that you are not yet getting to the point of emitting a cfi_adjust per instruction, but it seems like even if you are accumulating them so that they are only correct at the call, then they should be after all the pushes.

Thanks, Kevin!

lib/Target/X86/X86FrameLowering.cpp
2108 ↗	(On Diff #37460)	Since to handle EH we only care about it being correct at the call site, I don't think it matters whether it's before or after the adjustment. I could do it at line 2128, but that would be slightly more cumbersome, since then I'd need to keep track of both the original Amount and Amount - InternalAmt.
test/CodeGen/X86/push-cfi-obj.ll
27 ↗	(On Diff #37460)	More or less. :-) I mean, the part that didn't change encodes the GNU_ARGS_SIZE, and this is what this test is supposed to check (it was introduced by a previous commit that made some changes to the code that outputs the dwarf binary encoding). The newly added part encodes the def_cfa_offset. I didn't actually verify it's correct because I didn't change the encoding code for that. I'll parse it manually and see it really is correct.
test/CodeGen/X86/push-cfi.ll
14 ↗	(On Diff #37460)	I agree with you that it's not the correct spot, in theory - but in terms of being correct at the call site, it doesn't matter if it's before or after the pushes. The problem is that the way things work right now, eliminateCallFramePseudoInstr() doesn't actually get a pointer to the call, it gets the ADJDOWN/ADJUP pseudos. In theory, there doesn't even have to be a call between them. So, the easiest way is to just put the CFA adjust immediately after those pseudos. I could have the eliminate() code actively look for a call, but I'm not sure that's the best idea. What I would prefer for the debug-info version would be to put the CFA adjust for the sub where the current adjust is (well, after the sub - in line 2128), and the adjust for each push after the push. So that case wouldn't have to search for a call either.

rnk added inline comments. Oct 16 2015, 10:45 AM

lib/CodeGen/AsmPrinter/AsmPrinterDwarf.cpp
231–233 ↗	(On Diff #37460)	Hah, I guess we have MC support for .cfi_adjust_cfa_offset, but never used it from the compiler. :)
lib/Target/X86/X86CallFrameOptimization.cpp
140 ↗	(On Diff #37460)	Isn't `STI.getFrameLowering()` just `TFL`?
lib/Target/X86/X86FrameLowering.cpp
2097 ↗	(On Diff #37460)	Can we just assert up front that we don't need to produce UNWIND_INFO .xdata for Windows? This call frame optimization stuff only fires for 32-bit x86, which doesn't use UNWIND_INFO.
2110–2111 ↗	(On Diff #37460)	I think we should address this now. We know exactly where we're inserting the pushes in `X86CallFrameOptimization::adjustCallSequence`, so all we have to do is:
- Insert `.cfi_adjust_cfa_offset 4` after each push
- Insert `.cfi_adjust_cfa_offset InternalAmt` for frame destruction here
- Insert `.cfi_adjust_cfa_offset Amount` after `BuildStackAdjustment` below

Thanks, Reid!

lib/CodeGen/AsmPrinter/AsmPrinterDwarf.cpp
231–233 ↗	(On Diff #37460)	Yup. It's used by the assembler, though. :-)
lib/Target/X86/X86CallFrameOptimization.cpp
140 ↗	(On Diff #37460)	Right, thanks!
lib/Target/X86/X86FrameLowering.cpp
2097 ↗	(On Diff #37460)	Sure. Should I just assert on usesWindowsCFI()?
2110–2111 ↗	(On Diff #37460)	The question is whether we want to always have the precise adjusts or have two separate code paths for the precise version and the call-site-only version. This depends on whether we care about the eh_frame size for -Os, or not. If we want it to always be precise, then what you're suggesting is the right way to go. If we don't, then there should be two separate code paths, the one here, and the one you suggest, and I wanted them to be two separate patches.

mkuper added inline comments. Oct 19 2015, 1:50 AM

lib/Target/X86/X86FrameLowering.cpp
2097 ↗	(On Diff #37460)	On second thought, no, we can't. It's possible not to have a reserved call frame even without pushes (currently happens e.g. if there are variable-length arrays on the stack). In this case we'll also have an FP, so outgoing arguments will use ebp-based movs.

The patch now has both the precise and the precise-at-call-site code-paths.
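
The difference between the two code paths comes down to how much CFA adjustment is attached to the ADJDOWN/ADJUP pseudo. A rough sketch of that arithmetic, following the `CFAOffset` computation in the X86FrameLowering.cpp diff for this revision. The free function and its name are illustrative only and are not part of LLVM; the parameters mirror `Amount`, `InternalAmt` and `isDestroy` in `eliminateCallFramePseudoInstr()`.

```cpp
// Amount:      total (aligned) stack adjustment for the call sequence.
// InternalAmt: the part handled inside the sequence (bytes pushed on setup,
//              bytes popped by the callee on destroy).
// Precise:     true when precise unwind info is wanted (debug info).
int cfaAdjustAtPseudo(int Amount, int InternalAmt, bool IsDestroy,
                      bool Precise) {
  // Only this much is covered by the explicit sub/add instruction.
  int CFAOffset = Amount - InternalAmt;
  // In the call-site-only mode the pushes (or callee pops) are folded into
  // the same single adjustment instead of getting directives of their own.
  if (!Precise)
    CFAOffset += InternalAmt;
  return IsDestroy ? -CFAOffset : CFAOffset;
}
```

For a 16-byte aligned sequence with three pushed arguments (Amount = 16, InternalAmt = 12), the precise path attaches 4 to the setup pseudo and leaves the remaining 12 to the per-push directives, while the call-site-only path attaches all 16 at once.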

mkuper added inline comments. Oct 19 2015, 5:49 AM

lib/Target/X86/X86CallFrameOptimization.cpp
503 ↗	(On Diff #37742)	Argh, this should have been a cast<>, not a static_cast<>. Also, I can avoid this completely and copy the code from the BuildCFI() helper, but I think it's better this way.

friss added inline comments. Oct 19 2015, 12:52 PM

test/CodeGen/X86/push-cfi-obj.ll
27 ↗	(On Diff #37742)	I'm sorry I didn't look at the patch in more detail, but does the above mean that you emit GNU_args_size and def_cfa_offset in the same function? If that's the case, it seems wrong: if the offset is reflected by def_cfa_offsets, you don't need the GNU_args_size anymore.

rnk added inline comments. Oct 19 2015, 3:56 PM

lib/Target/X86/X86CallFrameOptimization.cpp
503 ↗	(On Diff #37742)	You could also change TFL to hold an X86FrameLowering. If you ask X86Subtarget for the framelowering, it actually uses covariant return types to return you an X86FrameLowering instead of a TargetFrameLowering.
test/CodeGen/X86/push-cfi-debug.ll
17–21 ↗	(On Diff #37742)	Can you add a test for a callee-cleanup convention like stdcall? I think with the code as written we'll get something like this on Linux, where the stack is 16-byte aligned:

  subl $4, %esp
  .cfi_adjust_cfa_offset 4
  pushl $3
  .cfi_adjust_cfa_offset 4
  pushl $2
  .cfi_adjust_cfa_offset 4
  pushl $1
  .cfi_adjust_cfa_offset 4
  calll _bar
  addl $4, %esp
  .cfi_adjust_cfa_offset -16

Which isn't strictly speaking correct, it should be:

  calll bar
  .cfi_adjust_cfa_offset -12
  addl $4, %esp
  .cfi_adjust_cfa_offset -4

We don't need to fix the issue in this change, I just like having test cases with FIXMEs.
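
To make the callee-cleanup concern concrete, here is a small sketch that replays both sequences and reports the first point where the CFA offset tracked by the directives disagrees with the real stack depth. Everything in it (the `Step` struct, `firstMismatch`, the two sequence builders) is hypothetical illustration code, and it assumes a stdcall-style callee that pops its 12 bytes of arguments on return.

```cpp
#include <vector>

// One step of a call sequence: how far the real stack depth moves in bytes
// (positive = deeper) and the .cfi_adjust_cfa_offset value that follows it
// (0 if no directive follows).
struct Step {
  int SpDelta;
  int CfiDelta;
};

// Returns the index of the first step after which the tracked CFA offset
// differs from the real stack depth, or -1 if they always agree.
int firstMismatch(const std::vector<Step> &Seq) {
  int Real = 0, Tracked = 0;
  for (int I = 0, E = static_cast<int>(Seq.size()); I != E; ++I) {
    Real += Seq[I].SpDelta;
    Tracked += Seq[I].CfiDelta;
    if (Real != Tracked)
      return I;
  }
  return -1;
}

// subl $4; three pushes; call (callee pops 12); addl $4 with one -16 adjust.
std::vector<Step> asWritten() {
  return {{4, 4}, {4, 4}, {4, 4}, {4, 4}, {-12, 0}, {-4, -16}};
}

// Same instructions, but with -12 after the call and -4 after the addl.
std::vector<Step> asSuggested() {
  return {{4, 4}, {4, 4}, {4, 4}, {4, 4}, {-12, -12}, {-4, -4}};
}
```

In this model the sequence as written is only wrong for the window between the call returning and the `addl`; both variants agree again once the sequence ends, which is consistent with treating it as a FIXME rather than a blocker.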

Thanks Reid, Frederic.

This addresses the comments from the last revision.
In particular, gnu_args_size is now only generated for fp-based CFA. Frederic, does that sound right to you?

In D13767#272094, @mkuper wrote:

In particular, gnu_args_size is now only generated for fp-based CFA. Frederic, does that sound right to you?

Hm, I think there might be an issue here. The cfa_adjust directives don't instruct the unwinder to adjust ESP before transferring to the landingpad, right? They just help indicate where the return address lives in memory.

The test case for this situation is something like:

int main() {
  // force stack realignment, FP usage, and use of SP to address local variables
  int __attribute__((aligned(32))) x = 42;
  try {
    throw 1;
  } catch (int) {
    return x;
  }
}

If ESP is wrong in the landingpad, we'll return the wrong value of x.

I would still have preferred that, in the "only correct for call-site" case, the cfi_adjust occurred after the pushes rather than before. But I don't feel strongly about that, so I am good with the changes now.

So, LGTM from me.

In D13767#272344, @rnk wrote:

In D13767#272094, @mkuper wrote:

In particular, gnu_args_size is now only generated for fp-based CFA. Frederic, does that sound right to you?

Hm, I think there might be an issue here. The cfa_adjust directives don't instruct the unwinder to adjust ESP before transferring to the landingpad, right? They just help indicate where the return address lives in memory.

I think you are right, sorry for the confusion. What are the exact semantics of GNU_args_size? Is it only to be used when branching to a landing pad and ignored when just traversing the function?

In D13767#272789, @friss wrote:

In D13767#272344, @rnk wrote:

In D13767#272094, @mkuper wrote:

In particular, gnu_args_size is now only generated for fp-based CFA. Frederic, does that sound right to you?

Hm, I think there might be an issue here. The cfa_adjust directives don't instruct the unwinder to adjust ESP before transferring to the landingpad, right? They just help indicate where the return address lives in memory.

I think you are right, sorry for the confusion. What are the exact semantics of GNU_args_size? Is it only to be used when branching to a landing pad and ignored when just traversing the function?

Yes, exactly. GNU_args_size is the delta between the stack pointer value immediately before the invoke that triggered the exception and the stack pointer value on entry to the landing pad. It is ignored for unwinding purposes.

The value that the unwinder substitutes for %esp/%rsp in CFA expressions & callee-save register location expressions is the value of %esp/%rsp immediately prior to the call/invoke.
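
Read as arithmetic, the description above could be sketched as follows. This is an assumption-laden illustration, not code from any unwinder: the function name is hypothetical, 32-bit addresses are assumed, and it assumes the delta is applied by adding it to the stack pointer recovered for the call site (the stack grows down on x86, so discarding outgoing arguments moves SP up).

```cpp
#include <cstdint>

// Illustrative only: the stack pointer installed on entry to a landing pad,
// given the SP value immediately before the call/invoke and the
// DW_CFA_GNU_args_size value in effect at that call site.
std::uint32_t landingPadSP(std::uint32_t SpBeforeCall,
                           std::uint32_t GnuArgsSize) {
  // Skip over the outgoing arguments that were pushed for the call.
  return SpBeforeCall + GnuArgsSize;
}
```

The CFA computation is unaffected by this value; only the SP handed to the landing pad changes, which is why both kinds of directive can be needed in the same function.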

lib/Target/X86/X86FrameLowering.cpp
2105 ↗	(On Diff #38000)	Why are you suppressing the GNU_ARGS_SIZE directives for !hasFP? I think they are still needed. If you agree, some corresponding changes will be needed in the tests.
2120 ↗	(On Diff #38000)	It would be nice to encapsulate the hasDebugInfo checks into a separate method, e.g. usePreciseUnwindInfo. That serves a useful documentation purpose and paves the way for supporting asynchronous EH, which also requires precise unwind info.

Thanks, everyone. Looks like I got confused about the semantics, will upload a new patch.
Sorry again for the amount of noise this is causing.

lib/Target/X86/X86FrameLowering.cpp
2105 ↗	(On Diff #38000)	Yes, this is a mistake, I misinterpreted Frederic's comment. Will make the appropriate changes, thanks.
2120 ↗	(On Diff #38000)	Right, will do.

rnk mentioned this in D14021: Added cfi instructions for correct CFA calculation in case when movpc instruction expands to call and pop. Oct 23 2015, 9:40 AM

Do *not* suppress gnu_args_size regardless of FP.
Also added the encapsulation Dave suggested.

ping

lgtm

This revision is now accepted and ready to land. Nov 2 2015, 1:40 PM

Closed by commit rL251904: [X86] Generate .cfi_adjust_cfa_offset correctly when pushing arguments (authored by mkuper). Nov 3 2015, 12:19 AM

This revision was automatically updated to reflect the committed changes.

DavidKreitzer mentioned this in D18046: [X86] Providing correct unwind info in function epilogue. May 2 2016, 3:41 PM

Renaud-K added a commit: rG4d5a9c1d171b: Using higher level interface to insert new arguments so arguments. Nov 9 2022, 11:06 AM

Herald added projects: Restricted Project, Restricted Project. Nov 9 2022, 11:06 AM

Herald added a subscriber: pengfei.

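Revision Contents

Path                                                      Size
llvm/trunk/include/llvm/CodeGen/MachineModuleInfo.h       5 lines
llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinterDwarf.cpp     3 lines
llvm/trunk/lib/Target/X86/X86CallFrameOptimization.cpp    40 lines
llvm/trunk/lib/Target/X86/X86FrameLowering.h              6 lines
llvm/trunk/lib/Target/X86/X86FrameLowering.cpp            45 lines
llvm/trunk/test/CodeGen/X86/debugloc-argsize.ll           2 lines
llvm/trunk/test/CodeGen/X86/fold-push.ll                  4 lines
llvm/trunk/test/CodeGen/X86/pop-stack-cleanup.ll          6 lines
llvm/trunk/test/CodeGen/X86/push-cfi-debug.ll             53 lines
llvm/trunk/test/CodeGen/X86/push-cfi-obj.ll               48 lines
llvm/trunk/test/CodeGen/X86/push-cfi.ll                   256 lines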

Diff 39033

llvm/trunk/include/llvm/CodeGen/MachineModuleInfo.h

@@ (239 lines hidden) @@ const Ty &getObjFileInfo() const {
     return const_cast<MachineModuleInfo*>(this)->getObjFileInfo<Ty>();
   }

   /// hasDebugInfo - Returns true if valid debug info is present.
   ///
   bool hasDebugInfo() const { return DbgInfoAvailable; }
   void setDebugInfoAvailability(bool avail) { DbgInfoAvailable = avail; }

+  // Returns true if we need to generate precise CFI. Currently
+  // this is equivalent to hasDebugInfo(), but if we ever implement
+  // async EH, it will require precise CFI as well.
+  bool usePreciseUnwindInfo() const { return hasDebugInfo(); }
+
   bool callsEHReturn() const { return CallsEHReturn; }
   void setCallsEHReturn(bool b) { CallsEHReturn = b; }

   bool callsUnwindInit() const { return CallsUnwindInit; }
   void setCallsUnwindInit(bool b) { CallsUnwindInit = b; }

   bool hasEHFunclets() const { return HasEHFunclets; }
   void setHasEHFunclets(bool V) { HasEHFunclets = V; }
@@ (183 lines hidden) @@

llvm/trunk/lib/CodeGen/AsmPrinter/AsmPrinterDwarf.cpp

@@ (210 lines hidden) @@

 void AsmPrinter::emitCFIInstruction(const MCCFIInstruction &Inst) const {
   switch (Inst.getOperation()) {
   default:
     llvm_unreachable("Unexpected instruction");
   case MCCFIInstruction::OpDefCfaOffset:
     OutStreamer->EmitCFIDefCfaOffset(Inst.getOffset());
     break;
+  case MCCFIInstruction::OpAdjustCfaOffset:
+    OutStreamer->EmitCFIAdjustCfaOffset(Inst.getOffset());
+    break;
   case MCCFIInstruction::OpDefCfa:
     OutStreamer->EmitCFIDefCfa(Inst.getRegister(), Inst.getOffset());
     break;
   case MCCFIInstruction::OpDefCfaRegister:
     OutStreamer->EmitCFIDefCfaRegister(Inst.getRegister());
     break;
   case MCCFIInstruction::OpOffset:
     OutStreamer->EmitCFIOffset(Inst.getRegister(), Inst.getOffset());
@@ (68 lines hidden) @@

llvm/trunk/lib/Target/X86/X86CallFrameOptimization.cpp

@@ (97 lines hidden) @@ private:
   InstClassification classifyInstruction(MachineBasicBlock &MBB,
                                          MachineBasicBlock::iterator MI,
                                          const X86RegisterInfo &RegInfo,
                                          DenseSet<unsigned int> &UsedRegs);

   const char *getPassName() const override { return "X86 Optimize Call Frame"; }

   const TargetInstrInfo *TII;
-  const TargetFrameLowering *TFL;
+  const X86FrameLowering *TFL;
+  const X86Subtarget *STI;
   const MachineRegisterInfo *MRI;
   static char ID;
 };

 char X86CallFrameOptimization::ID = 0;
 }

 FunctionPass *llvm::createX86CallFrameOptimization() {
   return new X86CallFrameOptimization();
 }

 // This checks whether the transformation is legal.
 // Also returns false in cases where it's potentially legal, but
 // we don't even want to try.
 bool X86CallFrameOptimization::isLegal(MachineFunction &MF) {
   if (NoX86CFOpt.getValue())
     return false;

   // We currently only support call sequences where all parameters.
   // are passed on the stack.
   // No point in running this in 64-bit mode, since some arguments are
   // passed in-register in all common calling conventions, so the pattern
   // we're looking for will never match.
-  const X86Subtarget &STI = MF.getSubtarget<X86Subtarget>();
-  if (STI.is64Bit())
+  if (STI->is64Bit())
     return false;

-  // We can't encode multiple DW_CFA_GNU_args_size in the compact
-  // unwind encoding that Darwin uses.
-  if (STI.isTargetDarwin() && !MF.getMMI().getLandingPads().empty())
+  // We can't encode multiple DW_CFA_GNU_args_size or DW_CFA_def_cfa_offset
+  // in the compact unwind encoding that Darwin uses. So, bail if there
+  // is a danger of that being generated.
+  if (STI->isTargetDarwin() &&
+      (!MF.getMMI().getLandingPads().empty() ||
+       (MF.getFunction()->needsUnwindTableEntry() && !TFL->hasFP(MF))))
     return false;

   // You would expect straight-line code between call-frame setup and
   // call-frame destroy. You would be wrong. There are circumstances (e.g.
   // CMOV_GR8 expansion of a select that feeds a function call!) where we can
   // end up with the setup and the destroy in different basic blocks.
   // This is bad, and breaks SP adjustment.
   // So, check that all of the frames in the function are closed inside
@@ (66 lines hidden) @@ if (!CC.UsePush) {
       Advantage += (CC.ExpectedDist / 4) * 3;
     }
   }

   return (Advantage >= 0);
 }

 bool X86CallFrameOptimization::runOnMachineFunction(MachineFunction &MF) {
-  TII = MF.getSubtarget().getInstrInfo();
-  TFL = MF.getSubtarget().getFrameLowering();
+  STI = &MF.getSubtarget<X86Subtarget>();
+  TII = STI->getInstrInfo();
+  TFL = STI->getFrameLowering();
   MRI = &MF.getRegInfo();

   if (!isLegal(MF))
     return false;

   unsigned FrameSetupOpcode = TII->getCallFrameSetupOpcode();

   bool Changed = false;
@@ (78 lines hidden) @@

 void X86CallFrameOptimization::collectCallInfo(MachineFunction &MF,
                                                MachineBasicBlock &MBB,
                                                MachineBasicBlock::iterator I,
                                                CallContext &Context) {
   // Check that this particular call sequence is amenable to the
   // transformation.
   const X86RegisterInfo &RegInfo = *static_cast<const X86RegisterInfo *>(
-      MF.getSubtarget().getRegisterInfo());
+      STI->getRegisterInfo());
   unsigned FrameDestroyOpcode = TII->getCallFrameDestroyOpcode();

   // We expect to enter this at the beginning of a call sequence
   assert(I->getOpcode() == TII->getCallFrameSetupOpcode());
   MachineBasicBlock::iterator FrameSetup = I++;
   Context.FrameSetup = FrameSetup;

   // How much do we adjust the stack? This puts an upper bound on
@@ (126 lines hidden) @@ bool X86CallFrameOptimization::adjustCallSequence(MachineFunction &MF,

   DebugLoc DL = FrameSetup->getDebugLoc();
   // Now, iterate through the vector in reverse order, and replace the movs
   // with pushes. MOVmi/MOVmr doesn't have any defs, so no need to
   // replace uses.
   for (int Idx = (Context.ExpectedDist / 4) - 1; Idx >= 0; --Idx) {
     MachineBasicBlock::iterator MOV = *Context.MovVector[Idx];
     MachineOperand PushOp = MOV->getOperand(X86::AddrNumOperands);
+    MachineBasicBlock::iterator Push = nullptr;
     if (MOV->getOpcode() == X86::MOV32mi) {
       unsigned PushOpcode = X86::PUSHi32;
       // If the operand is a small (8-bit) immediate, we can use a
       // PUSH instruction with a shorter encoding.
       // Note that isImm() may fail even though this is a MOVmi, because
       // the operand can also be a symbol.
       if (PushOp.isImm()) {
         int64_t Val = PushOp.getImm();
         if (isInt<8>(Val))
           PushOpcode = X86::PUSH32i8;
       }
-      BuildMI(MBB, Context.Call, DL, TII->get(PushOpcode)).addOperand(PushOp);
+      Push = BuildMI(MBB, Context.Call, DL, TII->get(PushOpcode))
+                 .addOperand(PushOp);
     } else {
       unsigned int Reg = PushOp.getReg();

       // If PUSHrmm is not slow on this target, try to fold the source of the
       // push into the instruction.
-      const X86Subtarget &ST = MF.getSubtarget<X86Subtarget>();
-      bool SlowPUSHrmm = ST.isAtom() || ST.isSLM();
+      bool SlowPUSHrmm = STI->isAtom() || STI->isSLM();

       // Check that this is legal to fold. Right now, we're extremely
       // conservative about that.
       MachineInstr *DefMov = nullptr;
       if (!SlowPUSHrmm && (DefMov = canFoldIntoRegPush(FrameSetup, Reg))) {
-        MachineInstr *Push =
-            BuildMI(MBB, Context.Call, DL, TII->get(X86::PUSH32rmm));
+        Push = BuildMI(MBB, Context.Call, DL, TII->get(X86::PUSH32rmm));

         unsigned NumOps = DefMov->getDesc().getNumOperands();
         for (unsigned i = NumOps - X86::AddrNumOperands; i != NumOps; ++i)
           Push->addOperand(DefMov->getOperand(i));

         DefMov->eraseFromParent();
       } else {
-        BuildMI(MBB, Context.Call, DL, TII->get(X86::PUSH32r))
+        Push = BuildMI(MBB, Context.Call, DL, TII->get(X86::PUSH32r))
             .addReg(Reg)
             .getInstr();
       }
     }

+    // For debugging, when using SP-based CFA, we need to adjust the CFA
+    // offset after each push.
+    if (!TFL->hasFP(MF) && MF.getMMI().usePreciseUnwindInfo())
+      TFL->BuildCFI(MBB, std::next(Push), DL,
+                    MCCFIInstruction::createAdjustCfaOffset(nullptr, 4));
+
     MBB.erase(MOV);
   }

   // The stack-pointer copy is no longer used in the call sequences.
   // There should not be any other users, but we can't commit to that, so:
   if (MRI->use_empty(Context.SPCopy->getOperand(0).getReg()))
     Context.SPCopy->eraseFromParent();

@@ (43 lines hidden) @@

llvm/trunk/lib/Target/X86/X86FrameLowering.h

@@ (119 lines hidden) @@ public:

   /// Check whether or not the given \p MBB can be used as a epilogue
   /// for the target.
   /// The epilogue will be inserted before the first terminator of that block.
   /// This method is used by the shrink-wrapping pass to decide if
   /// \p MBB will be correctly handled by the target.
   bool canUseAsEpilogue(const MachineBasicBlock &MBB) const override;

-private:
-  uint64_t calculateMaxStackAlign(const MachineFunction &MF) const;
-
   /// Wraps up getting a CFI index and building a MachineInstr for it.
   void BuildCFI(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
                 DebugLoc DL, MCCFIInstruction CFIInst) const;

+private:
+  uint64_t calculateMaxStackAlign(const MachineFunction &MF) const;
+
   /// Aligns the stack pointer by ANDing it with -MaxAlign.
   void BuildStackAlignAND(MachineBasicBlock &MBB,
                           MachineBasicBlock::iterator MBBI, DebugLoc DL,
                           uint64_t MaxAlign) const;

   /// Make small positive stack adjustments using POPs.
   bool adjustStackWithPops(MachineBasicBlock &MBB,
                            MachineBasicBlock::iterator MBBI, DebugLoc DL,
@@ (21 lines hidden) @@

llvm/trunk/lib/Target/X86/X86FrameLowering.cpp

Show First 20 Lines • Show All 2,099 Lines • ▼ Show 20 Lines	if (!reserveCallFrame) {
// adjcallstackdown instruction into 'add ESP, <amt>'		// adjcallstackdown instruction into 'add ESP, <amt>'

// We need to keep the stack aligned properly. To do this, we round the		// We need to keep the stack aligned properly. To do this, we round the
// amount of space needed for the outgoing arguments up to the next		// amount of space needed for the outgoing arguments up to the next
// alignment boundary.		// alignment boundary.
unsigned StackAlign = getStackAlignment();		unsigned StackAlign = getStackAlignment();
Amount = RoundUpToAlignment(Amount, StackAlign);		Amount = RoundUpToAlignment(Amount, StackAlign);

		MachineModuleInfo &MMI = MF.getMMI();
		const Function *Fn = MF.getFunction();
		bool WindowsCFI = MF.getTarget().getMCAsmInfo()->usesWindowsCFI();
		bool DwarfCFI = !WindowsCFI &&
		(MMI.hasDebugInfo() \|\| Fn->needsUnwindTableEntry());

// If we have any exception handlers in this function, and we adjust		// If we have any exception handlers in this function, and we adjust
// the SP before calls, we may need to indicate this to the unwinder,		// the SP before calls, we may need to indicate this to the unwinder
// using GNU_ARGS_SIZE. Note that this may be necessary		// using GNU_ARGS_SIZE. Note that this may be necessary even when
// even when Amount == 0, because the preceding function may have		// Amount == 0, because the preceding function may have set a non-0
// set a non-0 GNU_ARGS_SIZE.		// GNU_ARGS_SIZE.
// TODO: We don't need to reset this between subsequent functions,		// TODO: We don't need to reset this between subsequent functions,
// if it didn't change.		// if it didn't change.
bool HasDwarfEHHandlers =		bool HasDwarfEHHandlers = !WindowsCFI &&
!MF.getTarget().getMCAsmInfo()->usesWindowsCFI() &&
!MF.getMMI().getLandingPads().empty();		!MF.getMMI().getLandingPads().empty();

if (HasDwarfEHHandlers && !isDestroy &&		if (HasDwarfEHHandlers && !isDestroy &&
MF.getInfo<X86MachineFunctionInfo>()->getHasPushSequences())		MF.getInfo<X86MachineFunctionInfo>()->getHasPushSequences())
BuildCFI(MBB, I, DL,		BuildCFI(MBB, I, DL,
MCCFIInstruction::createGnuArgsSize(nullptr, Amount));		MCCFIInstruction::createGnuArgsSize(nullptr, Amount));

if (Amount == 0)		if (Amount == 0)
return;		return;

// Factor out the amount that gets handled inside the sequence		// Factor out the amount that gets handled inside the sequence
// (Pushes of argument for frame setup, callee pops for frame destroy)		// (Pushes of argument for frame setup, callee pops for frame destroy)
Amount -= InternalAmt;		Amount -= InternalAmt;

		// If this is a callee-pop calling convention, and we're emitting precise
		// SP-based CFI, emit a CFA adjust for the amount the callee popped.
		if (isDestroy && InternalAmt && DwarfCFI && !hasFP(MF) &&
		MMI.usePreciseUnwindInfo())
		BuildCFI(MBB, I, DL,
		MCCFIInstruction::createAdjustCfaOffset(nullptr, -InternalAmt));

if (Amount) {		if (Amount) {
// Add Amount to SP to destroy a frame, and subtract to setup.		// Add Amount to SP to destroy a frame, and subtract to setup.
int Offset = isDestroy ? Amount : -Amount;		int Offset = isDestroy ? Amount : -Amount;

if (!(MF.getFunction()->optForMinSize() &&		if (!(Fn->optForMinSize() &&
adjustStackWithPops(MBB, I, DL, Offset)))		adjustStackWithPops(MBB, I, DL, Offset)))
BuildStackAdjustment(MBB, I, DL, Offset, /InEpilogue=/false);		BuildStackAdjustment(MBB, I, DL, Offset, /InEpilogue=/false);
}		}

		if (DwarfCFI && !hasFP(MF)) {
		// If we don't have FP, but need to generate unwind information,
		// we need to set the correct CFA offset after the stack adjustment.
		// How much we adjust the CFA offset depends on whether we're emitting
		// CFI only for EH purposes or for debugging. EH only requires the CFA
		// offset to be correct at each call site, while for debugging we want
		// it to be more precise.
		int CFAOffset = Amount;
		if (!MMI.usePreciseUnwindInfo())
		CFAOffset += InternalAmt;
		CFAOffset = isDestroy ? -CFAOffset : CFAOffset;
		BuildCFI(MBB, I, DL,
		MCCFIInstruction::createAdjustCfaOffset(nullptr, CFAOffset));
		}

return;		return;
}		}

if (isDestroy && InternalAmt) {		if (isDestroy && InternalAmt) {
// If we are performing frame pointer elimination and if the callee pops		// If we are performing frame pointer elimination and if the callee pops
// something off the stack pointer, add it back. We do this until we have		// something off the stack pointer, add it back. We do this until we have
// more advanced stack pointer tracking ability.		// more advanced stack pointer tracking ability.
// We are not tracking the stack pointer adjustment by the callee, so make		// We are not tracking the stack pointer adjustment by the callee, so make
// ... (97 lines not shown) ...

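A note on the no-FP CFA adjustment in the hunk above. The following is a minimal stand-alone sketch, not LLVM code, of how that branch appears to choose the operand of .cfi_adjust_cfa_offset. The function name and the plain integer parameters are hypothetical; only the arithmetic is meant to mirror the diff (Amount -= InternalAmt, then the CFAOffset computation).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the operand passed to createAdjustCfaOffset after the
// stack adjustment, for a function without a frame pointer.
//   Amount      - total size of the call frame in bytes
//   InternalAmt - bytes handled inside the sequence (argument pushes on
//                 setup, callee pops on destroy)
//   IsDestroy   - true for frame destroy, false for frame setup
//   Precise     - stands in for MMI.usePreciseUnwindInfo()
int64_t cfaAdjustAfterStackAdjust(int64_t Amount, int64_t InternalAmt,
                                  bool IsDestroy, bool Precise) {
  // Mirrors "Amount -= InternalAmt" in the diff.
  int64_t Remaining = Amount - InternalAmt;

  // EH-only CFI folds the internal amount into one adjustment, so the offset
  // is only guaranteed to be right at the call site. Precise CFI leaves the
  // internal amount to be described by separate directives.
  int64_t CFAOffset = Remaining;
  if (!Precise)
    CFAOffset += InternalAmt;

  // Setup grows the frame (positive adjust), destroy shrinks it.
  return IsDestroy ? -CFAOffset : CFAOffset;
}
```

Under these assumptions, the push-cfi.ll case (16 bytes of arguments, all pushed, EH-only CFI) yields +16 on setup and -16 on destroy for a caller-cleanup call, while the precise mode exercised by push-cfi-debug.ll yields +8 for the subl $8 that precedes two pushes.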
llvm/trunk/test/CodeGen/X86/debugloc-argsize.ll

	; ... (24 lines not shown) ...
	declare void @_Z3bariii(i32, i32, i32) #0			declare void @_Z3bariii(i32, i32, i32) #0

	declare i32 @__gxx_personality_v0(...)			declare i32 @__gxx_personality_v0(...)

	declare i8* @__cxa_begin_catch(i8*)			declare i8* @__cxa_begin_catch(i8*)

	declare void @__cxa_end_catch()			declare void @__cxa_end_catch()

	attributes #0 = { optsize "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="pentium4" "target-features"="+sse,+sse2" "unsafe-fp-math"="false" "use-soft-float"="false" }			attributes #0 = { optsize "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="pentium4" "target-features"="+sse,+sse2" "unsafe-fp-math"="false" "use-soft-float"="false" }
	attributes #1 = { optsize }			attributes #1 = { optsize }
	attributes #2 = { nounwind }			attributes #2 = { nounwind }

	!llvm.dbg.cu = !{!0}			!llvm.dbg.cu = !{!0}
	!llvm.module.flags = !{!7, !8}			!llvm.module.flags = !{!7, !8}
	!llvm.ident = !{!9}			!llvm.ident = !{!9}

	!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !1, producer: "clang version 3.8.0 (trunk 249520)", isOptimized: true, runtimeVersion: 0, emissionKind: 1, enums: !2, subprograms: !3)			!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !1, producer: "clang version 3.8.0 (trunk 249520)", isOptimized: true, runtimeVersion: 0, emissionKind: 1, enums: !2, subprograms: !3)
	; ... (17 lines not shown) ...

llvm/trunk/test/CodeGen/X86/fold-push.ll

	; RUN: llc < %s -mtriple=i686-windows | FileCheck %s -check-prefix=CHECK -check-prefix=NORMAL			; RUN: llc < %s -mtriple=i686-windows | FileCheck %s -check-prefix=CHECK -check-prefix=NORMAL
	; RUN: llc < %s -mtriple=i686-windows -mattr=call-reg-indirect | FileCheck %s -check-prefix=CHECK -check-prefix=SLM			; RUN: llc < %s -mtriple=i686-windows -mattr=call-reg-indirect | FileCheck %s -check-prefix=CHECK -check-prefix=SLM

	declare void @foo(i32 %r)			declare void @foo(i32 %r)

	define void @test(i32 %a, i32 %b) optsize {			define void @test(i32 %a, i32 %b) optsize nounwind {
	; CHECK-LABEL: test:			; CHECK-LABEL: test:
	; CHECK: movl [[EAX:%e..]], (%esp)			; CHECK: movl [[EAX:%e..]], (%esp)
	; CHECK-NEXT: pushl [[EAX]]			; CHECK-NEXT: pushl [[EAX]]
	; CHECK-NEXT: calll			; CHECK-NEXT: calll
	; CHECK-NEXT: addl $4, %esp			; CHECK-NEXT: addl $4, %esp
	; CHECK: nop			; CHECK: nop
	; NORMAL: pushl (%esp)			; NORMAL: pushl (%esp)
	; SLM: movl (%esp), [[RELOAD:%e..]]			; SLM: movl (%esp), [[RELOAD:%e..]]
	; SLM-NEXT: pushl [[RELOAD]]			; SLM-NEXT: pushl [[RELOAD]]
	; CHECK: calll			; CHECK: calll
	; CHECK-NEXT: addl $4, %esp			; CHECK-NEXT: addl $4, %esp
	%c = add i32 %a, %b			%c = add i32 %a, %b
	call void @foo(i32 %c)			call void @foo(i32 %c)
	call void asm sideeffect "nop", "~{ax},~{bx},~{cx},~{dx},~{bp},~{si},~{di}"()			call void asm sideeffect "nop", "~{ax},~{bx},~{cx},~{dx},~{bp},~{si},~{di}"()
	call void @foo(i32 %c)			call void @foo(i32 %c)
	ret void			ret void
	}			}

	define void @test_min(i32 %a, i32 %b) minsize {			define void @test_min(i32 %a, i32 %b) minsize nounwind {
	; CHECK-LABEL: test_min:			; CHECK-LABEL: test_min:
	; CHECK: movl [[EAX:%e..]], (%esp)			; CHECK: movl [[EAX:%e..]], (%esp)
	; CHECK-NEXT: pushl [[EAX]]			; CHECK-NEXT: pushl [[EAX]]
	; CHECK-NEXT: calll			; CHECK-NEXT: calll
	; CHECK-NEXT: popl			; CHECK-NEXT: popl
	; CHECK: nop			; CHECK: nop
	; CHECK: pushl (%esp)			; CHECK: pushl (%esp)
	; CHECK: calll			; CHECK: calll
	; CHECK-NEXT: popl			; CHECK-NEXT: popl
	%c = add i32 %a, %b			%c = add i32 %a, %b
	call void @foo(i32 %c)			call void @foo(i32 %c)
	call void asm sideeffect "nop", "~{ax},~{bx},~{cx},~{dx},~{bp},~{si},~{di}"()			call void asm sideeffect "nop", "~{ax},~{bx},~{cx},~{dx},~{bp},~{si},~{di}"()
	call void @foo(i32 %c)			call void @foo(i32 %c)
	ret void			ret void
	}			}

llvm/trunk/test/CodeGen/X86/pop-stack-cleanup.ll

	; RUN: llc < %s -mtriple=i686-windows | FileCheck %s -check-prefix=CHECK			; RUN: llc < %s -mtriple=i686-windows | FileCheck %s -check-prefix=CHECK
	; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s -check-prefix=LINUX64			; RUN: llc < %s -mtriple=x86_64-linux | FileCheck %s -check-prefix=LINUX64

	declare void @param1(i32 %a)			declare void @param1(i32 %a)
	declare i32 @param2_ret(i32 %a, i32 %b)			declare i32 @param2_ret(i32 %a, i32 %b)
	declare i64 @param2_ret64(i32 %a, i32 %b)			declare i64 @param2_ret64(i32 %a, i32 %b)
	declare void @param2(i32 %a, i32 %b)			declare void @param2(i32 %a, i32 %b)
	declare void @param3(i32 %a, i32 %b, i32 %c)			declare void @param3(i32 %a, i32 %b, i32 %c)
	declare void @param8(i64, i64, i64, i64, i64, i64, i64, i64)			declare void @param8(i64, i64, i64, i64, i64, i64, i64, i64)


	define void @test() minsize {			define void @test() minsize nounwind {
	; CHECK-LABEL: test:			; CHECK-LABEL: test:
	; CHECK: calll _param1			; CHECK: calll _param1
	; CHECK-NEXT: popl %eax			; CHECK-NEXT: popl %eax
	; CHECK: calll _param2			; CHECK: calll _param2
	; CHECK-NEXT: popl %eax			; CHECK-NEXT: popl %eax
	; CHECK-NEXT: popl %ecx			; CHECK-NEXT: popl %ecx
	; CHECK: calll _param2_ret			; CHECK: calll _param2_ret
	; CHECK-NEXT: popl %ecx			; CHECK-NEXT: popl %ecx
	; ... (22 lines not shown) ...
	; CHECK-NEXT: movl %ebp, %esp			; CHECK-NEXT: movl %ebp, %esp
	%v = alloca i32, i32 %k			%v = alloca i32, i32 %k
	call void @param1(i32 1)			call void @param1(i32 1)
	call void @param2(i32 1, i32 2)			call void @param2(i32 1, i32 2)
	call void @param3(i32 1, i32 2, i32 3)			call void @param3(i32 1, i32 2, i32 3)
	ret void			ret void
	}			}

	define void @spill(i32 inreg %a, i32 inreg %b, i32 inreg %c) minsize {			define void @spill(i32 inreg %a, i32 inreg %b, i32 inreg %c) minsize nounwind {
	; CHECK-LABEL: spill:			; CHECK-LABEL: spill:
	; CHECK-DAG: movl %ecx,			; CHECK-DAG: movl %ecx,
	; CHECK-DAG: movl %edx,			; CHECK-DAG: movl %edx,
	; CHECK: calll _param2_ret			; CHECK: calll _param2_ret
	; CHECK-NEXT: popl %ecx			; CHECK-NEXT: popl %ecx
	; CHECK-NEXT: popl %edx			; CHECK-NEXT: popl %edx
	; CHECK-DAG: movl {{.*}}, %ecx			; CHECK-DAG: movl {{.*}}, %ecx
	; CHECK-DAG: movl {{.*}}, %edx			; CHECK-DAG: movl {{.*}}, %edx
	; CHECK: calll _spill			; CHECK: calll _spill
	%i = call i32 @param2_ret(i32 1, i32 2)			%i = call i32 @param2_ret(i32 1, i32 2)
	call void @spill(i32 %a, i32 %b, i32 %c)			call void @spill(i32 %a, i32 %b, i32 %c)
	ret void			ret void
	}			}

	define void @test_linux64(i32 %size) minsize {			define void @test_linux64(i32 %size) minsize nounwind {
	; LINUX64-LABEL: test_linux64:			; LINUX64-LABEL: test_linux64:
	; LINUX64: pushq %rbp			; LINUX64: pushq %rbp
	; LINUX64: callq param8			; LINUX64: callq param8
	; LINUX64-NEXT: popq %rax			; LINUX64-NEXT: popq %rax
	; LINUX64-NEXT: popq %rcx			; LINUX64-NEXT: popq %rcx

	%a = alloca i64, i32 %size, align 8			%a = alloca i64, i32 %size, align 8
	call void @param8(i64 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8)			call void @param8(i64 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8)
	ret void			ret void
	}			}

llvm/trunk/test/CodeGen/X86/push-cfi-debug.ll

				; RUN: llc < %s -mtriple=i686-pc-linux | FileCheck %s


				; Function Attrs: optsize
				declare void @foo(i32, i32) #0
				declare x86_stdcallcc void @stdfoo(i32, i32) #0

				; CHECK-LABEL: test1:
				; CHECK: subl $8, %esp
				; CHECK: .cfi_adjust_cfa_offset 8
				; CHECK: pushl $2
				; CHECK: .cfi_adjust_cfa_offset 4
				; CHECK: pushl $1
				; CHECK: .cfi_adjust_cfa_offset 4
				; CHECK: calll foo
				; CHECK: addl $16, %esp
				; CHECK: .cfi_adjust_cfa_offset -16
				; CHECK: subl $8, %esp
				; CHECK: .cfi_adjust_cfa_offset 8
				; CHECK: pushl $4
				; CHECK: .cfi_adjust_cfa_offset 4
				; CHECK: pushl $3
				; CHECK: .cfi_adjust_cfa_offset 4
				; CHECK: calll stdfoo
				; CHECK: .cfi_adjust_cfa_offset -8
				; CHECK: addl $8, %esp
				; CHECK: .cfi_adjust_cfa_offset -8
				define void @test1() #0 {
				entry:
				tail call void @foo(i32 1, i32 2) #1, !dbg !10
				tail call x86_stdcallcc void @stdfoo(i32 3, i32 4) #1, !dbg !11
				ret void, !dbg !12
				}

				attributes #0 = { nounwind optsize }

				!llvm.dbg.cu = !{!0}
				!llvm.module.flags = !{!7, !8}
				!llvm.ident = !{!9}

				!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 3.8.0 (trunk 250289)", isOptimized: true, runtimeVersion: 0, emissionKind: 1, enums: !2, subprograms: !3)
				!1 = !DIFile(filename: "foo.c", directory: "foo")
				!2 = !{}
				!3 = !{!4}
				!4 = distinct !DISubprogram(name: "test1", scope: !1, file: !1, line: 3, type: !5, isLocal: false, isDefinition: true, scopeLine: 3, isOptimized: true, function: void ()* @test1, variables: !2)
				!5 = !DISubroutineType(types: !6)
				!6 = !{null}
				!7 = !{i32 2, !"Dwarf Version", i32 4}
				!8 = !{i32 2, !"Debug Info Version", i32 3}
				!9 = !{!"clang version 3.8.0 (trunk 250289)"}
				!10 = !DILocation(line: 4, column: 3, scope: !4)
				!11 = !DILocation(line: 5, column: 3, scope: !4)
				!12 = !DILocation(line: 6, column: 1, scope: !4)
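
A note on the directives checked in test1 above: with precise unwind info, each .cfi_adjust_cfa_offset is expected to cancel the %esp movement of the instruction before it, and each call sequence should leave the CFA offset where it started. The following is a minimal sketch of that invariant. The Step type and the cfaTracksSp function are hypothetical names used only for illustration, and the return address pushed by calll and popped by the callee's return is treated as net zero and not modeled.

```cpp
#include <cassert>
#include <vector>

// One instruction together with the directive that follows it
// (hypothetical model type, not part of LLVM).
struct Step {
  int SpDelta;   // change in %esp; negative when the stack grows
  int CfaAdjust; // operand of the matching .cfi_adjust_cfa_offset
};

// Returns true if every CFA adjustment exactly cancels the %esp movement it
// follows and the whole sequence nets out to zero.
bool cfaTracksSp(const std::vector<Step> &Steps) {
  int Net = 0;
  for (const Step &S : Steps) {
    if (S.CfaAdjust != -S.SpDelta)
      return false;
    Net += S.CfaAdjust;
  }
  return Net == 0;
}
```

For the call to foo the steps would be (-8, 8), (-4, 4), (-4, 4), (16, -16). For the stdcall call to stdfoo the callee pops 8 bytes itself, so the steps would be (-8, 8), (-4, 4), (-4, 4), (8, -8) for the call and (8, -8) for the trailing addl. Both sequences satisfy the check.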

llvm/trunk/test/CodeGen/X86/push-cfi-obj.ll

	; RUN: llc < %s -mtriple=i686-pc-linux -filetype=obj | llvm-readobj -s -sr -sd | FileCheck %s			; RUN: llc < %s -mtriple=i686-pc-linux -filetype=obj | llvm-readobj -s -sr -sd | FileCheck %s -check-prefix=LINUX
	; RUN: llc < %s -mtriple=i686-darwin-macosx10.7 -filetype=obj | llvm-readobj -sections | FileCheck -check-prefix=DARWIN %s			; RUN: llc < %s -mtriple=i686-darwin-macosx10.7 -filetype=obj | llvm-readobj -sections | FileCheck -check-prefix=DARWIN %s

	; On darwin, check that we manage to generate the compact unwind section			; On darwin, check that we manage to generate the compact unwind section
	; DARWIN: Name: __compact_unwind			; DARWIN: Name: __compact_unwind
	; DARWIN: Segment: __LD			; DARWIN: Segment: __LD

	; CHECK: Index: 8			; LINUX: Index: 8
	; CHECK-NEXT: Name: .eh_frame (41)			; LINUX-NEXT: Name: .eh_frame (41)
	; CHECK-NEXT: Type: SHT_PROGBITS (0x1)			; LINUX-NEXT: Type: SHT_PROGBITS (0x1)
	; CHECK-NEXT: Flags [ (0x2)			; LINUX-NEXT: Flags [ (0x2)
	; CHECK-NEXT: SHF_ALLOC (0x2)			; LINUX-NEXT: SHF_ALLOC (0x2)
	; CHECK-NEXT: ]			; LINUX-NEXT: ]
	; CHECK-NEXT: Address: 0x0			; LINUX-NEXT: Address: 0x0
	; CHECK-NEXT: Offset: 0x64			; LINUX-NEXT: Offset: 0x68
	; CHECK-NEXT: Size: 60			; LINUX-NEXT: Size: 64
	; CHECK-NEXT: Link: 0			; LINUX-NEXT: Link: 0
	; CHECK-NEXT: Info: 0			; LINUX-NEXT: Info: 0
	; CHECK-NEXT: AddressAlignment: 4			; LINUX-NEXT: AddressAlignment: 4
	; CHECK-NEXT: EntrySize: 0			; LINUX-NEXT: EntrySize: 0
	; CHECK-NEXT: Relocations [			; LINUX-NEXT: Relocations [
	; CHECK-NEXT: ]			; LINUX-NEXT: ]
	; CHECK-NEXT: SectionData (			; LINUX-NEXT: SectionData (
	; CHECK-NEXT: 0000: 1C000000 00000000 017A504C 5200017C |.........zPLR..||			; LINUX-NEXT: 0000: 1C000000 00000000 017A504C 5200017C |.........zPLR..||
	; CHECK-NEXT: 0010: 08070000 00000000 1B0C0404 88010000 |................|			; LINUX-NEXT: 0010: 08070000 00000000 1B0C0404 88010000 |................|
	; CHECK-NEXT: 0020: 18000000 24000000 00000000 19000000 |....$...........|			; LINUX-NEXT: 0020: 1C000000 24000000 00000000 1D000000 |....$...........|
	; CHECK-NEXT: 0030: 04000000 00430E10 2E100000 |.....C......|			; LINUX-NEXT: 0030: 04000000 00410E08 8502420D 05432E10 |.....A....B..C..|
	; CHECK-NEXT: )			; LINUX-NEXT: )

	declare i32 @__gxx_personality_v0(...)			declare i32 @__gxx_personality_v0(...)
	declare void @good(i32 %a, i32 %b, i32 %c, i32 %d)			declare void @good(i32 %a, i32 %b, i32 %c, i32 %d)

	define void @test() optsize personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {			define void @test() #0 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
	entry:			entry:
	invoke void @good(i32 1, i32 2, i32 3, i32 4)			invoke void @good(i32 1, i32 2, i32 3, i32 4)
	to label %continue unwind label %cleanup			to label %continue unwind label %cleanup
	continue:			continue:
	ret void			ret void
	cleanup:			cleanup:
	landingpad { i8*, i32 }			landingpad { i8*, i32 }
	cleanup			cleanup
	ret void			ret void
	}			}

				attributes #0 = { optsize "no-frame-pointer-elim"="true" }
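
A note on the SectionData change above: the trailing bytes of the FDE are DWARF call frame instructions, so decoding them shows where the difference comes from. The following is a minimal sketch of a decoder that handles only the opcodes appearing in this test (DW_CFA_advance_loc, DW_CFA_offset, DW_CFA_nop, DW_CFA_def_cfa_register, DW_CFA_def_cfa_offset, and the GNU extension DW_CFA_GNU_args_size). The function names are hypothetical. It also assumes that the first five bytes at offset 0x30 are the augmentation data length and a 4-byte LSDA pointer, so the instructions start after them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Reads one unsigned LEB128 value starting at Pos and advances Pos past it.
static uint64_t readULEB128(const std::vector<uint8_t> &Bytes, size_t &Pos) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  while (Pos < Bytes.size() && Shift < 64) {
    uint8_t Byte = Bytes[Pos++];
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      break;
    Shift += 7;
  }
  return Value;
}

// Decodes a small subset of DWARF CFI opcodes into readable strings.
// Decoding stops at the first opcode outside that subset.
std::vector<std::string> decodeCFI(const std::vector<uint8_t> &Bytes) {
  std::vector<std::string> Out;
  size_t Pos = 0;
  while (Pos < Bytes.size()) {
    uint8_t Op = Bytes[Pos++];
    uint8_t High = Op & 0xc0; // top two bits select the primary opcode
    uint8_t Low = Op & 0x3f;  // low six bits carry its inline operand
    if (High == 0x40) {
      Out.push_back("advance_loc " + std::to_string(Low));
    } else if (High == 0x80) {
      uint64_t Factored = readULEB128(Bytes, Pos); // factored offset
      Out.push_back("offset r" + std::to_string(Low) + " " +
                    std::to_string(Factored));
    } else if (Op == 0x00) {
      Out.push_back("nop");
    } else if (Op == 0x0d) {
      Out.push_back("def_cfa_register r" +
                    std::to_string(readULEB128(Bytes, Pos)));
    } else if (Op == 0x0e) {
      Out.push_back("def_cfa_offset " +
                    std::to_string(readULEB128(Bytes, Pos)));
    } else if (Op == 0x2e) {
      Out.push_back("GNU_args_size " +
                    std::to_string(readULEB128(Bytes, Pos)));
    } else {
      Out.push_back("unknown opcode");
      break;
    }
  }
  return Out;
}
```

Under those assumptions, the old bytes 43 0E 10 2E 10 00 00 decode to advance_loc 3, def_cfa_offset 16, GNU_args_size 16 and two nops of padding. The new bytes 41 0E 08 85 02 42 0D 05 43 2E 10 decode to def_cfa_offset 8, offset r5 2, def_cfa_register r5 (r5 being %ebp in the i386 System V DWARF numbering) and then GNU_args_size 16, which is consistent with the test function now carrying "no-frame-pointer-elim"="true" and setting up a frame pointer.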

llvm/trunk/test/CodeGen/X86/push-cfi.ll

	; RUN: llc < %s -mtriple=i686-pc-linux | FileCheck %s			; RUN: llc < %s -mtriple=i686-pc-linux | FileCheck %s -check-prefix=LINUX -check-prefix=CHECK
				; RUN: llc < %s -mtriple=i686-apple-darwin | FileCheck %s -check-prefix=DARWIN -check-prefix=CHECK

	declare i32 @__gxx_personality_v0(...)			declare i32 @__gxx_personality_v0(...)
	declare void @good(i32 %a, i32 %b, i32 %c, i32 %d)			declare void @good(i32 %a, i32 %b, i32 %c, i32 %d)
	declare void @large(i32 %a, i32 %b, i32 %c, i32 %d, i32 %e, i32 %f)			declare void @large(i32 %a, i32 %b, i32 %c, i32 %d, i32 %e, i32 %f)
	declare void @empty()			declare void @empty()

	; We use an invoke, and expect a .cfi_escape GNU_ARGS_SIZE with size 16			; When we use an invoke, and have FP, we expect a .cfi_escape GNU_ARGS_SIZE
	; before the invocation			; with size 16 before the invocation. Without FP, we expect .cfi_adjust_cfa_offset
	; CHECK-LABEL: test1:			; before and after.
	; CHECK: .cfi_escape 0x2e, 0x10			; Darwin should not generate pushes in either circumstance.
	; CHECK-NEXT: pushl $4			; CHECK-LABEL: test1_nofp:
	; CHECK-NEXT: pushl $3			; LINUX: .cfi_escape 0x2e, 0x10
	; CHECK-NEXT: pushl $2			; LINUX: .cfi_adjust_cfa_offset 16
	; CHECK-NEXT: pushl $1			; LINUX-NEXT: pushl $4
	; CHECK-NEXT: call			; LINUX-NEXT: pushl $3
	; CHECK-NEXT: addl $16, %esp			; LINUX-NEXT: pushl $2
	define void @test1() optsize personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {			; LINUX-NEXT: pushl $1
				; LINUX-NEXT: call
				; LINUX-NEXT: addl $16, %esp
				; LINUX: .cfi_adjust_cfa_offset -16
				; DARWIN-NOT: .cfi_escape
				; DARWIN-NOT: pushl
				define void @test1_nofp() #0 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
				entry:
				invoke void @good(i32 1, i32 2, i32 3, i32 4)
				to label %continue unwind label %cleanup
				continue:
				ret void
				cleanup:
				landingpad { i8*, i32 }
				cleanup
				ret void
				}

				; CHECK-LABEL: test1_fp:
				; LINUX: .cfi_escape 0x2e, 0x10
				; LINUX-NEXT: pushl $4
				; LINUX-NEXT: pushl $3
				; LINUX-NEXT: pushl $2
				; LINUX-NEXT: pushl $1
				; LINUX-NEXT: call
				; LINUX-NEXT: addl $16, %esp
				; DARWIN: pushl %ebp
				; DARWIN-NOT: .cfi_escape
				; DARWIN-NOT: pushl
				define void @test1_fp() #1 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
	entry:			entry:
	invoke void @good(i32 1, i32 2, i32 3, i32 4)			invoke void @good(i32 1, i32 2, i32 3, i32 4)
	to label %continue unwind label %cleanup			to label %continue unwind label %cleanup
	continue:			continue:
	ret void			ret void
	cleanup:			cleanup:
	landingpad { i8*, i32 }			landingpad { i8*, i32 }
	cleanup			cleanup
	ret void			ret void
	}			}

	; If the function has no handlers, we don't need to generate GNU_ARGS_SIZE,			; If the function has no handlers, we don't need to generate GNU_ARGS_SIZE,
	; even if it has an unwind table.			; even if it has an unwind table. Without FP, we still need cfi_adjust_cfa_offset,
	; CHECK-LABEL: test2:			; so darwin should not generate pushes.
				; CHECK-LABEL: test2_nofp:
				; LINUX-NOT: .cfi_escape
				; LINUX: .cfi_adjust_cfa_offset 16
				; LINUX-NEXT: pushl $4
				; LINUX-NEXT: pushl $3
				; LINUX-NEXT: pushl $2
				; LINUX-NEXT: pushl $1
				; LINUX-NEXT: call
				; LINUX-NEXT: addl $16, %esp
				; LINUX: .cfi_adjust_cfa_offset -16
				; DARWIN-NOT: .cfi_escape
				; DARWIN-NOT: pushl
				define void @test2_nofp() #0 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
				entry:
				call void @good(i32 1, i32 2, i32 3, i32 4)
				ret void
				}

				; CHECK-LABEL: test2_fp:
	; CHECK-NOT: .cfi_escape			; CHECK-NOT: .cfi_escape
				; CHECK-NOT: .cfi_adjust_cfa_offset
	; CHECK: pushl $4			; CHECK: pushl $4
	; CHECK-NEXT: pushl $3			; CHECK-NEXT: pushl $3
	; CHECK-NEXT: pushl $2			; CHECK-NEXT: pushl $2
	; CHECK-NEXT: pushl $1			; CHECK-NEXT: pushl $1
	; CHECK-NEXT: call			; CHECK-NEXT: call
	; CHECK-NEXT: addl $16, %esp			; CHECK-NEXT: addl $24, %esp
	define void @test2() optsize personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {			define void @test2_fp() #1 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
	entry:			entry:
	call void @good(i32 1, i32 2, i32 3, i32 4)			call void @good(i32 1, i32 2, i32 3, i32 4)
	ret void			ret void
	}			}

	; If we did not end up using any pushes, no need for GNU_ARGS_SIZE anywhere			; If we did not end up using any pushes, no need for GNU_ARGS_SIZE or
	; CHECK-LABEL: test3:			; cfi_adjust_cfa_offset.
	; CHECK-NOT: .cfi_escape			; CHECK-LABEL: test3_nofp:
	; CHECK-NOT: pushl			; LINUX-NOT: .cfi_escape
	; CHECK: retl			; LINUX-NOT: .cfi_adjust_cfa_offset
	define void @test3() optsize personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {			; LINUX-NOT: pushl
				; LINUX: retl
				define void @test3_nofp() #0 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
				entry:
				invoke void @empty()
				to label %continue unwind label %cleanup
				continue:
				ret void
				cleanup:
				landingpad { i8*, i32 }
				cleanup
				ret void
				}

				; If we did not end up using any pushes, no need for GNU_ARGS_SIZE or
				; cfi_adjust_cfa_offset.
				; CHECK-LABEL: test3_fp:
				; LINUX: pushl %ebp
				; LINUX-NOT: .cfi_escape
				; LINUX-NOT: .cfi_adjust_cfa_offset
				; LINUX-NOT: pushl
				; LINUX: retl
				define void @test3_fp() #1 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
	entry:			entry:
	invoke void @empty()			invoke void @empty()
	to label %continue unwind label %cleanup			to label %continue unwind label %cleanup
	continue:			continue:
	ret void			ret void
	cleanup:			cleanup:
	landingpad { i8*, i32 }			landingpad { i8*, i32 }
	cleanup			cleanup
	ret void			ret void
	}			}

	; Different sized stacks need different GNU_ARGS_SIZEs			; Different sized stacks need different GNU_ARGS_SIZEs
	; CHECK-LABEL: test4:			; CHECK-LABEL: test4:
	; CHECK: .cfi_escape 0x2e, 0x10			; LINUX: .cfi_escape 0x2e, 0x10
	; CHECK-NEXT: pushl $4			; LINUX-NEXT: pushl $4
	; CHECK-NEXT: pushl $3			; LINUX-NEXT: pushl $3
	; CHECK-NEXT: pushl $2			; LINUX-NEXT: pushl $2
	; CHECK-NEXT: pushl $1			; LINUX-NEXT: pushl $1
	; CHECK-NEXT: call			; LINUX-NEXT: call
	; CHECK-NEXT: addl $16, %esp			; LINUX-NEXT: addl $16, %esp
	; CHECK: .cfi_escape 0x2e, 0x20			; LINUX: .cfi_escape 0x2e, 0x20
	; CHECK-NEXT: subl $8, %esp			; LINUX: subl $8, %esp
	; CHECK-NEXT: pushl $11			; LINUX-NEXT: pushl $11
	; CHECK-NEXT: pushl $10			; LINUX-NEXT: pushl $10
	; CHECK-NEXT: pushl $9			; LINUX-NEXT: pushl $9
	; CHECK-NEXT: pushl $8			; LINUX-NEXT: pushl $8
	; CHECK-NEXT: pushl $7			; LINUX-NEXT: pushl $7
	; CHECK-NEXT: pushl $6			; LINUX-NEXT: pushl $6
	; CHECK-NEXT: calll large			; LINUX-NEXT: calll large
	; CHECK-NEXT: addl $32, %esp			; LINUX-NEXT: addl $32, %esp
	define void @test4() optsize personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {			define void @test4() #1 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
	entry:			entry:
	invoke void @good(i32 1, i32 2, i32 3, i32 4)			invoke void @good(i32 1, i32 2, i32 3, i32 4)
	to label %continue1 unwind label %cleanup			to label %continue1 unwind label %cleanup
	continue1:			continue1:
	invoke void @large(i32 6, i32 7, i32 8, i32 9, i32 10, i32 11)			invoke void @large(i32 6, i32 7, i32 8, i32 9, i32 10, i32 11)
	to label %continue2 unwind label %cleanup			to label %continue2 unwind label %cleanup
	continue2:			continue2:
	ret void			ret void
	cleanup:			cleanup:
	landingpad { i8*, i32 }			landingpad { i8*, i32 }
	cleanup			cleanup
	ret void			ret void
	}			}

	; If we did use pushes, we need to reset GNU_ARGS_SIZE before a call			; If we did use pushes, we need to reset GNU_ARGS_SIZE before a call
	; without parameters			; without parameters, but don't need to adjust the cfa offset
	; CHECK-LABEL: test5:			; CHECK-LABEL: test5_nofp:
	; CHECK: .cfi_escape 0x2e, 0x10			; LINUX: .cfi_escape 0x2e, 0x10
	; CHECK-NEXT: pushl $4			; LINUX: .cfi_adjust_cfa_offset 16
	; CHECK-NEXT: pushl $3			; LINUX-NEXT: pushl $4
	; CHECK-NEXT: pushl $2			; LINUX-NEXT: pushl $3
	; CHECK-NEXT: pushl $1			; LINUX-NEXT: pushl $2
	; CHECK-NEXT: call			; LINUX-NEXT: pushl $1
	; CHECK-NEXT: addl $16, %esp			; LINUX-NEXT: call
	; CHECK: .cfi_escape 0x2e, 0x00			; LINUX-NEXT: addl $16, %esp
	; CHECK-NEXT: call			; LINUX: .cfi_adjust_cfa_offset -16
	define void @test5() optsize personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {			; LINUX-NOT: .cfi_adjust_cfa_offset
				; LINUX: .cfi_escape 0x2e, 0x00
				; LINUX-NOT: .cfi_adjust_cfa_offset
				; LINUX: call
				define void @test5_nofp() #0 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
				entry:
				invoke void @good(i32 1, i32 2, i32 3, i32 4)
				to label %continue1 unwind label %cleanup
				continue1:
				invoke void @empty()
				to label %continue2 unwind label %cleanup
				continue2:
				ret void
				cleanup:
				landingpad { i8*, i32 }
				cleanup
				ret void
				}

				; CHECK-LABEL: test5_fp:
				; LINUX: .cfi_escape 0x2e, 0x10
				; LINUX-NEXT: pushl $4
				; LINUX-NEXT: pushl $3
				; LINUX-NEXT: pushl $2
				; LINUX-NEXT: pushl $1
				; LINUX-NEXT: call
				; LINUX-NEXT: addl $16, %esp
				; LINUX: .cfi_escape 0x2e, 0x00
				; LINUX-NOT: .cfi_adjust_cfa_offset
				; LINUX: call
				define void @test5_fp() #1 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
	entry:			entry:
	invoke void @good(i32 1, i32 2, i32 3, i32 4)			invoke void @good(i32 1, i32 2, i32 3, i32 4)
	to label %continue1 unwind label %cleanup			to label %continue1 unwind label %cleanup
	continue1:			continue1:
	invoke void @empty()			invoke void @empty()
	to label %continue2 unwind label %cleanup			to label %continue2 unwind label %cleanup
	continue2:			continue2:
	ret void			ret void
	cleanup:			cleanup:
	landingpad { i8*, i32 }			landingpad { i8*, i32 }
	cleanup			cleanup
	ret void			ret void
	}			}

	; This is actually inefficient - we don't need to repeat the .cfi_escape twice.			; FIXME: This is actually inefficient - we don't need to repeat the .cfi_escape twice.
	; CHECK-LABEL: test6:			; CHECK-LABEL: test6:
	; CHECK: .cfi_escape 0x2e, 0x10			; LINUX: .cfi_escape 0x2e, 0x10
	; CHECK: call			; LINUX: call
	; CHECK: .cfi_escape 0x2e, 0x10			; LINUX: .cfi_escape 0x2e, 0x10
	; CHECK: call			; LINUX: call
	define void @test6() optsize personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {			define void @test6() #1 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
	entry:			entry:
	invoke void @good(i32 1, i32 2, i32 3, i32 4)			invoke void @good(i32 1, i32 2, i32 3, i32 4)
	to label %continue1 unwind label %cleanup			to label %continue1 unwind label %cleanup
	continue1:			continue1:
	invoke void @good(i32 5, i32 6, i32 7, i32 8)			invoke void @good(i32 5, i32 6, i32 7, i32 8)
	to label %continue2 unwind label %cleanup			to label %continue2 unwind label %cleanup
	continue2:			continue2:
	ret void			ret void
	cleanup:			cleanup:
	landingpad { i8*, i32 }			landingpad { i8*, i32 }
	cleanup			cleanup
	ret void			ret void
	}			}

				; Darwin should generate pushes in the presence of FP and an unwind table,
				; but not FP and invoke.
				; CHECK-LABEL: test7:
				; DARWIN: pushl %ebp
				; DARWIN: movl %esp, %ebp
				; DARWIN: .cfi_def_cfa_register %ebp
				; DARWIN-NOT: .cfi_adjust_cfa_offset
				; DARWIN: pushl $4
				; DARWIN-NEXT: pushl $3
				; DARWIN-NEXT: pushl $2
				; DARWIN-NEXT: pushl $1
				; DARWIN-NEXT: call
				define void @test7() #1 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
				entry:
				call void @good(i32 1, i32 2, i32 3, i32 4)
				ret void
				}

				; CHECK-LABEL: test8:
				; DARWIN: pushl %ebp
				; DARWIN: movl %esp, %ebp
				; DARWIN-NOT: .cfi_adjust_cfa_offset
				; DARWIN-NOT: pushl
				define void @test8() #1 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
				entry:
				invoke void @good(i32 1, i32 2, i32 3, i32 4)
				to label %continue unwind label %cleanup
				continue:
				ret void
				cleanup:
				landingpad { i8*, i32 }
				cleanup
				ret void
				}

				attributes #0 = { optsize }
				attributes #1 = { optsize "no-frame-pointer-elim"="true" }

Diff 39033