
Details

Reviewers

mkuper
rnk
javed.absar

Commits

rG681ad6103cbb: Merging r298179:
rGac6081cb672a: Make library calls sensitive to regparm module flag (Fixes PR3997).
rL304242: Merging r298179:
rL298179: Make library calls sensitive to regparm module flag (Fixes PR3997).

Summary

Cause all constructions of library calls in ISel to hook into a target-specific attribute labeller. This allows libcalls to be sensitive to
the default regparm flag.

Diff Detail

Build Status

Buildable 1545
Build 1545: arc lint + arc unit

Event Timeline

niravd updated this revision to Diff 79087. Nov 23 2016, 8:18 AM

niravd retitled this revision from to [X86] Add explicit regparm flag for X86-32 calling convention..

niravd updated this object.

niravd added reviewers: rnk, mkuper.

niravd added a subscriber: llvm-commits.

Herald added subscribers: rengolin, aemerson. Nov 23 2016, 8:18 AM

Add missing testcase

niravd added a child revision: D27051: [X86] Add NumRegisterParameters Module Flag. Nov 23 2016, 8:21 AM

IIRC, The MCU ABI has at least one other difference from -mregparm=3:

MCU will continue passing parameters in-register even if an earlier parameter ends up on the stack, as long as free registers are available. -mregparm will bail once a parameter doesn't fit in a register. So, for something like foo(int a, int b, long long c, int d), MCU will pass {a, b, c} inreg, and mregparm will only do it for {a,b}.

(Interestingly enough, it looks like GCC gets this wrong, and has the -mregparm behavior for -miamcu... :-\ )

include/llvm/CodeGen/CommandFlags.h
48	I really don't think we should have this (except, possibly, for debugging, in which case it probably needs to be hidden). As I mentioned on D27051, the information should be contained in the IR.

I meant "MCU will pass {a, b, d} inreg, and mregparm will only do it for {a,b}."
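As a rough illustration of the difference described above, the following self-contained sketch models the two policies on foo(int a, int b, long long c, int d) with three argument registers. It is a toy model, not LLVM code: `assignInReg` and its parameters are hypothetical names, and the only rules it encodes are the ones stated in this comment (a 64-bit argument needs two free registers, and regparm-style assignment stops at the first argument that does not fit).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model (hypothetical, not LLVM code): decide which arguments are passed
// in registers. Each entry of RegsNeeded is the number of 32-bit registers an
// argument requires (1 for int, 2 for long long).
// If ContinueAfterSpill is true, later arguments may still use free registers
// after an earlier one went to the stack (the MCU behavior described above);
// otherwise assignment stops at the first spill (the -mregparm behavior).
std::vector<bool> assignInReg(const std::vector<unsigned> &RegsNeeded,
                              unsigned NumRegs, bool ContinueAfterSpill) {
  std::vector<bool> InReg(RegsNeeded.size(), false);
  unsigned Free = NumRegs;
  for (std::size_t I = 0; I < RegsNeeded.size(); ++I) {
    assert(RegsNeeded[I] >= 1 && "every argument needs at least one slot");
    if (RegsNeeded[I] <= Free) {
      Free -= RegsNeeded[I];
      InReg[I] = true;
    } else if (!ContinueAfterSpill) {
      break; // regparm-style: no register arguments after the first spill
    }
  }
  return InReg;
}
```

With RegsNeeded = {1, 1, 2, 1} and three registers, the model puts {a, b, d} in registers under the MCU policy and only {a, b} under the regparm policy, which is the corrected statement above.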

There are two cases:

1. The calling convention for functions named in the IR. This can be set separately for each function by the frontend. Today, it's done by putting "inreg" attributes on appropriate parameters.
2. The calling convention used for library calls generated by LLVM. Today, this is not handled.

The former can be specified by __attribute__((regparm(N))), and the -mregparm command-line flag specifies both 2 and the default for 1.

This patch adds support for 2, without changing the way 1 works. But, ISTM, both of these should be really handled similarly -- I don't much like that this patch is adding new calling convention code for 2, but leaving the handling for 1 in clang's decision about whether to use "inreg".

I think, instead, that 3 new calling conventions, "x86_regparm_{1,2,3}" should be defined -- variants of the C calling convention, with a different number of args passed in registers.

Then:

- Clang can emit the right cc for functions it defines/declares, rather than choosing where to place inreg.
- Which of the calling conventions to use for libcalls can be set by a subtarget feature (selected by clang based on the -mregparm=N command-line option, and examined by X86TargetLowering to determine what to pass to setLibcallCallingConv).

Rework with new calling conventions. Leave the MCU ABI unchanged.

Herald added a subscriber: mehdi_amini. Dec 1 2016, 10:34 AM

This new version leaves the old inreg computation in place but adds library-call-only calling conventions to deal with libcalls (one for C and one for StdCall). Not ideal, but it does work.

A quick search didn't turn up a definitive description of the MCU ABI to verify that gcc is wrong, so I've returned MCU back to its original behavior for now.

This is the most definitive description of the MCU ABI I'm aware of:
https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/iamcu-psABI-0.7.pdf

See the example in table 2.5.
To the best of my understanding, -mregparm=3 should pass "s" on the stack, not in register.

niravd updated this object. Dec 2 2016, 6:44 AM

You touched stdcall, and that reminded me of -mrtd, which has the same problem as -mregparm=: calls to library functions generated by LLVM aren't handled correctly.

Does anybody have any objections to doing this with module flags? They seem like the right kind of thing here for communicating global settings to the backend, and they have the nice property that mismatches get diagnosed during LTO. If you use function attributes as you've done in this patch, it's still possible to LTO a non-regparm TU with a regparm TU and get weird results.
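To make the "mismatches get diagnosed" property concrete, here is a small self-contained sketch of flag merging with an error-on-conflict rule. It is a toy model under stated assumptions: `FlagMap` and `mergeFlags` are hypothetical names and do not use the LLVM API; only the flag name NumRegisterParameters is taken from the child revision D27051.

```cpp
#include <map>
#include <string>

// Toy model (hypothetical types, not the LLVM API): a module carries named
// integer flags, and linking two modules fails if both define the same flag
// with different values.
using FlagMap = std::map<std::string, unsigned>;

// Merges Src into Dst. Returns false and stores the offending flag name in
// Conflict if the two maps disagree; Dst is left untouched in that case.
bool mergeFlags(FlagMap &Dst, const FlagMap &Src, std::string &Conflict) {
  // First pass: look for conflicts so Dst is not modified on failure.
  for (const auto &Entry : Src) {
    auto It = Dst.find(Entry.first);
    if (It != Dst.end() && It->second != Entry.second) {
      Conflict = Entry.first;
      return false;
    }
  }
  // Second pass: no conflicts, so copy over any flags Dst does not have yet.
  // std::map::insert never overwrites an existing key.
  Dst.insert(Src.begin(), Src.end());
  return true;
}
```

In this model, merging a module built with NumRegisterParameters = 3 into one built with 0 is rejected, whereas per-function attributes would let both translation units link silently.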

Switch from Subtarget features to module flag.

Harbormaster completed remote builds in B3319: Diff 85926. Jan 26 2017, 9:04 AM

Herald added a subscriber: igorb. Jan 26 2017, 9:04 AM

niravd edited the summary of this revision. Jan 30 2017, 7:06 AM

In D27050#610810, @niravd wrote:

This new version leaves the old inreg computation in place but adds library-call-only calling conventions to deal with libcalls (one for C and one for StdCall). Not ideal, but it does work.

Can you elaborate on why we need these new calling conventions? I thought that just the module flag would be enough to get things right.

Can you elaborate on why we need these new calling conventions? I thought that just the module flag would be enough to get things right.

The module flag makes it clear how to do the lowering for intrinsics. The calling convention is there to mark which calls are to intrinsics. At call time we only have the type / arg attributes and the calling convention, and given the C regparm attribute, we can't rely on the reg arg attributes being zero for just intrinsics. The obvious solutions for this are:

1. Explicitly label all library calls as being of a special calling convention that always uses the module flag value. This uses two calling conventions because of StdCall.
2. Calculate which parameters should be passed in registers at intrinsic lowering time. This seems brittle and more awkward than it's worth.
3. Mark each intrinsic function with a new "isIntrinsic" function-level attribute. Attributes seem a more valuable resource than calling conventions, but this would work as well.

I'm disinclined to implement 2 but 3 might be reasonable.

What's hard about approach number 2? There are only 47 or so places where we construct a CallLoweringInfo to feed to LowerCallTo. I bet you could add a method to CallLoweringInfo that handles all the common cases of making a call to an intrinsic, similar to how setCallee is supposed to work.
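The suggestion above can be sketched as a builder-style helper that sits next to setCallee and applies the register-parameter count in one central place. This is only an illustration under assumptions: `ArgEntry`, `CallInfo` and `setLibCallee` here are simplified, hypothetical stand-ins rather than LLVM's real classes, and the marking rule (one register per 4-byte argument, two per 8-byte argument, stop when one no longer fits) is an assumption made for the example.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins (hypothetical, much simpler than LLVM's real classes).
struct ArgEntry {
  unsigned SizeInBytes = 4; // 4 for i32/pointer, 8 for i64
  bool IsInReg = false;
};

struct CallInfo {
  std::string Callee;
  std::vector<ArgEntry> Args;
  bool IsLibCall = false;

  // Ordinary calls: argument attributes come from the IR and are left alone.
  CallInfo &setCallee(std::string Name, std::vector<ArgEntry> A) {
    Callee = std::move(Name);
    Args = std::move(A);
    return *this;
  }

  // Library calls: same as setCallee, plus one central place where the
  // module-wide register-parameter count is applied to the arguments.
  CallInfo &setLibCallee(std::string Name, std::vector<ArgEntry> A,
                         unsigned NumRegParams) {
    setCallee(std::move(Name), std::move(A));
    IsLibCall = true;
    unsigned Free = NumRegParams;
    for (ArgEntry &Arg : Args) {
      unsigned Needed = Arg.SizeInBytes > 4 ? 2 : 1;
      if (Needed > Free)
        break; // stop marking once an argument no longer fits
      Free -= Needed;
      Arg.IsInReg = true;
    }
    return *this;
  }
};
```

The point of the design is that each site constructing a library call only swaps one setter for another, instead of repeating the inreg computation at every call site.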

Replace calling conventions with special casing of setcall in CallLoweringInfo.

Herald added a reviewer: javed.absar. Feb 22 2017, 11:36 AM

Herald added a subscriber: nemanjai.

niravd retitled this revision from [X86] Add explicit regparm flag for X86-32 calling convention. to [X86] Make library calls sensitive to regparm module flag (Fixes PR3997). Feb 22 2017, 12:22 PM

niravd edited the summary of this revision.

Minor comments, I like the approach, sorry for the delay

include/llvm/Target/TargetLowering.h
167 ↗	(On Diff #89391)	I think we can probably leave the `SDValue Node` field null when doing fast isel and get away without any subclassing. It simplifies the code that needs to take the address of all the arglistentries as well.
2636–2638 ↗	(On Diff #89391)	Avoiding the subclassing would save this work
lib/CodeGen/SelectionDAG/FastISel.cpp
879 ↗	(On Diff #89391)	ditto
lib/CodeGen/SelectionDAG/SelectionDAG.cpp
2509 ↗	(On Diff #89391)	Unrelated change?

Fold all ArgListEntry types together and cleanup.

Harbormaster completed remote builds in B4788: Diff 91892. Mar 15 2017, 9:31 AM

niravd marked an inline comment as done. Mar 15 2017, 9:36 AM

niravd added inline comments.

lib/CodeGen/SelectionDAG/SelectionDAG.cpp
2509 ↗	(On Diff #89391)	Unrelated diff from rebasing.

lgtm

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
7658 ↗	(On Diff #91892)	Please commit the `s/is/Is/` change first as a separate change to reduce the diff size and simplify future source history archaeology.

This revision is now accepted and ready to land. Mar 17 2017, 8:37 AM

Closed by commit rL298179: Make library calls sensitive to regparm module flag (Fixes PR3997). (authored by niravd). Mar 17 2017, 5:56 PM

This revision was automatically updated to reflect the committed changes.

niravd marked an inline comment as done.

Diff 79088

include/llvm/CodeGen/CommandFlags.h

Show All 39 Lines	MCPU("mcpu",
cl::init(""));		cl::init(""));

cl::list<std::string>		cl::list<std::string>
MAttrs("mattr",		MAttrs("mattr",
cl::CommaSeparated,		cl::CommaSeparated,
cl::desc("Target specific attributes (-mattr=help for details)"),		cl::desc("Target specific attributes (-mattr=help for details)"),
cl::value_desc("a1,+a2,-a3,..."));		cl::value_desc("a1,+a2,-a3,..."));

		cl::opt<unsigned> RegParm(
		"regparm", cl::desc("set number of register parameters (X86 only)"),
		cl::value_desc("[0-3]"),
		cl::init(0));

cl::opt<Reloc::Model> RelocModel(		cl::opt<Reloc::Model> RelocModel(
"relocation-model", cl::desc("Choose relocation model"),		"relocation-model", cl::desc("Choose relocation model"),
cl::values(		cl::values(
clEnumValN(Reloc::Static, "static", "Non-relocatable code"),		clEnumValN(Reloc::Static, "static", "Non-relocatable code"),
clEnumValN(Reloc::PIC_, "pic",		clEnumValN(Reloc::PIC_, "pic",
"Fully relocatable, position independent code"),		"Fully relocatable, position independent code"),
clEnumValN(Reloc::DynamicNoPIC, "dynamic-no-pic",		clEnumValN(Reloc::DynamicNoPIC, "dynamic-no-pic",
"Relocatable external references, non-relocatable code"),		"Relocatable external references, non-relocatable code"),
▲ Show 20 Lines • Show All 222 Lines • ▼ Show 20 Lines	static inline TargetOptions InitTargetOptionsFromCodeGenFlags() {
Options.NoInfsFPMath = EnableNoInfsFPMath;		Options.NoInfsFPMath = EnableNoInfsFPMath;
Options.NoNaNsFPMath = EnableNoNaNsFPMath;		Options.NoNaNsFPMath = EnableNoNaNsFPMath;
Options.NoTrappingFPMath = EnableNoTrappingFPMath;		Options.NoTrappingFPMath = EnableNoTrappingFPMath;
Options.FPDenormalMode = DenormalMode;		Options.FPDenormalMode = DenormalMode;
Options.HonorSignDependentRoundingFPMathOption =		Options.HonorSignDependentRoundingFPMathOption =
EnableHonorSignDependentRoundingFPMath;		EnableHonorSignDependentRoundingFPMath;
if (FloatABIForCalls != FloatABI::Default)		if (FloatABIForCalls != FloatABI::Default)
Options.FloatABIType = FloatABIForCalls;		Options.FloatABIType = FloatABIForCalls;
		Options.RegParm = RegParm;
Options.NoZerosInBSS = DontPlaceZerosInBSS;		Options.NoZerosInBSS = DontPlaceZerosInBSS;
Options.GuaranteedTailCallOpt = EnableGuaranteedTailCallOpt;		Options.GuaranteedTailCallOpt = EnableGuaranteedTailCallOpt;
Options.StackAlignmentOverride = OverrideStackAlignment;		Options.StackAlignmentOverride = OverrideStackAlignment;
Options.StackSymbolOrdering = StackSymbolOrdering;		Options.StackSymbolOrdering = StackSymbolOrdering;
Options.UseInitArray = !UseCtors;		Options.UseInitArray = !UseCtors;
Options.DataSections = DataSections;		Options.DataSections = DataSections;
Options.FunctionSections = FunctionSections;		Options.FunctionSections = FunctionSections;
Options.UniqueSectionNames = UniqueSectionNames;		Options.UniqueSectionNames = UniqueSectionNames;
▲ Show 20 Lines • Show All 90 Lines • Show Last 20 Lines

include/llvm/Target/TargetOptions.h

Show First 20 Lines • Show All 102 Lines • ▼ Show 20 Lines	TargetOptions()
UnsafeFPMath(false), NoInfsFPMath(false), NoNaNsFPMath(false),		UnsafeFPMath(false), NoInfsFPMath(false), NoNaNsFPMath(false),
NoTrappingFPMath(false),		NoTrappingFPMath(false),
HonorSignDependentRoundingFPMathOption(false), NoZerosInBSS(false),		HonorSignDependentRoundingFPMathOption(false), NoZerosInBSS(false),
GuaranteedTailCallOpt(false), StackAlignmentOverride(0),		GuaranteedTailCallOpt(false), StackAlignmentOverride(0),
StackSymbolOrdering(true), EnableFastISel(false), UseInitArray(false),		StackSymbolOrdering(true), EnableFastISel(false), UseInitArray(false),
DisableIntegratedAS(false), CompressDebugSections(false),		DisableIntegratedAS(false), CompressDebugSections(false),
RelaxELFRelocations(false), FunctionSections(false),		RelaxELFRelocations(false), FunctionSections(false),
DataSections(false), UniqueSectionNames(true), TrapUnreachable(false),		DataSections(false), UniqueSectionNames(true), TrapUnreachable(false),
EmulatedTLS(false), EnableIPRA(false),		EmulatedTLS(false), EnableIPRA(false), RegParm(0),
FloatABIType(FloatABI::Default),		FloatABIType(FloatABI::Default),
AllowFPOpFusion(FPOpFusion::Standard),		AllowFPOpFusion(FPOpFusion::Standard),
ThreadModel(ThreadModel::POSIX),		ThreadModel(ThreadModel::POSIX),
EABIVersion(EABI::Default), DebuggerTuning(DebuggerKind::Default),		EABIVersion(EABI::Default), DebuggerTuning(DebuggerKind::Default),
FPDenormalMode(FPDenormal::IEEE),		FPDenormalMode(FPDenormal::IEEE),
ExceptionModel(ExceptionHandling::None) {}		ExceptionModel(ExceptionHandling::None) {}

/// PrintMachineCode - This flag is enabled when the -print-machineinstrs		/// PrintMachineCode - This flag is enabled when the -print-machineinstrs
/// option is specified on the command line, and should enable debugging		/// option is specified on the command line, and should enable debugging
▲ Show 20 Lines • Show All 99 Lines • ▼ Show 20 Lines	public:

/// EmulatedTLS - This flag enables emulated TLS model, using emutls		/// EmulatedTLS - This flag enables emulated TLS model, using emutls
/// function in the runtime library..		/// function in the runtime library..
unsigned EmulatedTLS : 1;		unsigned EmulatedTLS : 1;

/// This flag enables InterProcedural Register Allocation (IPRA).		/// This flag enables InterProcedural Register Allocation (IPRA).
unsigned EnableIPRA : 1;		unsigned EnableIPRA : 1;

		/// RegParm - The initial RegParm Value
		unsigned RegParm;

/// FloatABIType - This setting is set by -float-abi=xxx option is specfied		/// FloatABIType - This setting is set by -float-abi=xxx option is specfied
/// on the command line. This setting may either be Default, Soft, or Hard.		/// on the command line. This setting may either be Default, Soft, or Hard.
/// Default selects the target's default behavior. Soft selects the ABI for		/// Default selects the target's default behavior. Soft selects the ABI for
/// software floating point, but does not indicate that FP hardware may not		/// software floating point, but does not indicate that FP hardware may not
/// be used. Such a combination is unfortunately popular (e.g.		/// be used. Such a combination is unfortunately popular (e.g.
/// arm-apple-darwin). Hard presumes that the normal FP ABI is used.		/// arm-apple-darwin). Hard presumes that the normal FP ABI is used.
FloatABI::ABIType FloatABIType;		FloatABI::ABIType FloatABIType;

▲ Show 20 Lines • Show All 78 Lines • Show Last 20 Lines

lib/Target/X86/X86CallingConv.h

	Show All 12 Lines
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#ifndef LLVM_LIB_TARGET_X86_X86CALLINGCONV_H			#ifndef LLVM_LIB_TARGET_X86_X86CALLINGCONV_H
	#define LLVM_LIB_TARGET_X86_X86CALLINGCONV_H			#define LLVM_LIB_TARGET_X86_X86CALLINGCONV_H

	#include "MCTargetDesc/X86MCTargetDesc.h"			#include "MCTargetDesc/X86MCTargetDesc.h"
	#include "llvm/CodeGen/CallingConvLower.h"			#include "llvm/CodeGen/CallingConvLower.h"
	#include "llvm/IR/CallingConv.h"			#include "llvm/IR/CallingConv.h"
				#include "llvm/Target/TargetMachine.h"
				#include "llvm/Target/TargetOptions.h"
				#include "X86TargetMachine.h"

	namespace llvm {			namespace llvm {

	inline bool CC_X86_32_VectorCallIndirect(unsigned &ValNo, MVT &ValVT,			inline bool CC_X86_32_VectorCallIndirect(unsigned &ValNo, MVT &ValVT,
	MVT &LocVT,			MVT &LocVT,
	CCValAssign::LocInfo &LocInfo,			CCValAssign::LocInfo &LocInfo,
	ISD::ArgFlagsTy &ArgFlags,			ISD::ArgFlagsTy &ArgFlags,
	CCState &State) {			CCState &State) {
	Show All 16 Lines

	inline bool CC_X86_RegCall_Error(unsigned &, MVT &, MVT &,			inline bool CC_X86_RegCall_Error(unsigned &, MVT &, MVT &,
	CCValAssign::LocInfo &, ISD::ArgFlagsTy &,			CCValAssign::LocInfo &, ISD::ArgFlagsTy &,
	CCState &) {			CCState &) {
	report_fatal_error("LLVM x86 RegCall calling convention implementation" \			report_fatal_error("LLVM x86 RegCall calling convention implementation" \
	" doesn't support long double and mask types yet.");			" doesn't support long double and mask types yet.");
	}			}

	inline bool CC_X86_32_MCUInReg(unsigned &ValNo, MVT &ValVT,			inline bool CC_X86_32_AssignToReg_NoSplit(unsigned &ValNo, MVT &ValVT,
	MVT &LocVT,			MVT &LocVT,
	CCValAssign::LocInfo &LocInfo,			CCValAssign::LocInfo &LocInfo,
	ISD::ArgFlagsTy &ArgFlags,			ISD::ArgFlagsTy &ArgFlags,
	CCState &State) {			CCState &State) {
	// This is similar to CCAssignToReg<[EAX, EDX, ECX]>, but makes sure			// If the argument is InAlloc or ByVal bail.
	// not to split i64 and double between a register and stack			if (ArgFlags.isInAlloca() || ArgFlags.isByVal())
	static const MCPhysReg RegList[] = {X86::EAX, X86::EDX, X86::ECX};			return false;
	static const unsigned NumRegs = sizeof(RegList)/sizeof(RegList[0]);
				// Similiar to AssignToReg, but do not split multi-reg args
				// (i64/double) between a register and stack.
				MCPhysReg RegList[] = {X86::EAX, X86::EDX, X86::ECX};
				static const unsigned MaxRegs = sizeof(RegList)/sizeof(RegList[0]);

				auto NumRegs = State.getMachineFunction().getTarget().Options.RegParm;
				if (static_cast<const X86Subtarget&>(State.getMachineFunction().getSubtarget()).isTargetMCU())
				NumRegs = MaxRegs;

				assert(NumRegs <= MaxRegs && "More register parameters than registers");

	SmallVectorImpl<CCValAssign> &PendingMembers = State.getPendingLocs();			SmallVectorImpl<CCValAssign> &PendingMembers = State.getPendingLocs();

				unsigned FirstFree = std::min(NumRegs, State.getFirstUnallocated(RegList));

	// If this is the first part of an double/i64/i128, or if we're already			// If this is the first part of an double/i64/i128, or if we're already
	// in the middle of a split, add to the pending list. If this is not			// in the middle of a split, add to the pending list. If this is not
	// the end of the split, return, otherwise go on to process the pending			// the end of the split, return, otherwise go on to process the pending
	// list			// list
	if (ArgFlags.isSplit() || !PendingMembers.empty()) {			if (ArgFlags.isSplit() || !PendingMembers.empty()) {
	PendingMembers.push_back(			PendingMembers.push_back(
	CCValAssign::getPending(ValNo, ValVT, LocVT, LocInfo));			CCValAssign::getPending(ValNo, ValVT, LocVT, LocInfo));
	if (!ArgFlags.isSplitEnd())			if (!ArgFlags.isSplitEnd())
	return true;			return true;
	}			}

	// If there are no pending members, we are not in the middle of a split,			// If there are no pending members, we are not in the middle of a split,
	// so do the usual inreg stuff.			// so do the usual inreg stuff.
	if (PendingMembers.empty()) {			if (PendingMembers.empty()) {
	if (unsigned Reg = State.AllocateReg(RegList)) {			if (FirstFree < NumRegs)
				if (unsigned Reg = State.AllocateReg(RegList[FirstFree++])) {
	State.addLoc(CCValAssign::getReg(ValNo, ValVT, Reg, LocVT, LocInfo));			State.addLoc(CCValAssign::getReg(ValNo, ValVT, Reg, LocVT, LocInfo));
	return true;			return true;
	}			}
	return false;			return false;
	}			}

	assert(ArgFlags.isSplitEnd());			assert(ArgFlags.isSplitEnd());

	// We now have the entire original argument in PendingMembers, so decide			// We now have the entire original argument in PendingMembers, so decide
	// whether to use registers or the stack.			// whether to use registers or the stack.
	// Per the MCU ABI:
	// a) To use registers, we need to have enough of them free to contain			// a) To use registers, we need to have enough of them free to contain
	// the entire argument.			// the entire argument.
	// b) We never want to use more than 2 registers for a single argument.			// b) We never want to use more than 2 registers for a single argument.

	unsigned FirstFree = State.getFirstUnallocated(RegList);
	bool UseRegs = PendingMembers.size() <= std::min(2U, NumRegs - FirstFree);			bool UseRegs = PendingMembers.size() <= std::min(2U, NumRegs - FirstFree);

	for (auto &It : PendingMembers) {			for (auto &It : PendingMembers) {
				// If available, always allocate register so subsequent
				// arguments cannot use them.
	if (UseRegs)			if (UseRegs)
	It.convertToReg(State.AllocateReg(RegList[FirstFree++]));			It.convertToReg(State.AllocateReg(RegList[FirstFree++]));
				else if (FirstFree < MaxRegs)
				It.convertToMem(State.AllocateStack(4, 4, RegList[FirstFree++]));
	else			else
	It.convertToMem(State.AllocateStack(4, 4));			It.convertToMem(State.AllocateStack(4, 4));
	State.addLoc(It);			State.addLoc(It);
	}			}

	PendingMembers.clear();			PendingMembers.clear();

	return true;			return true;
	}			}

	} // End llvm namespace			} // End llvm namespace

	#endif			#endif

lib/Target/X86/X86CallingConv.td

Show First 20 Lines • Show All 790 Lines • ▼ Show 20 Lines	def CC_X86_32_C : CallingConv<[
CCIfType<[i1, i8, i16], CCPromoteToType<i32>>,		CCIfType<[i1, i8, i16], CCPromoteToType<i32>>,

// The 'nest' parameter, if any, is passed in ECX.		// The 'nest' parameter, if any, is passed in ECX.
CCIfNest<CCAssignToReg<[ECX]>>,		CCIfNest<CCAssignToReg<[ECX]>>,

// The first 3 integer arguments, if marked 'inreg' and if the call is not		// The first 3 integer arguments, if marked 'inreg' and if the call is not
// a vararg call, are passed in integer registers.		// a vararg call, are passed in integer registers.
CCIfNotVarArg<CCIfInReg<CCIfType<[i32], CCAssignToReg<[EAX, EDX, ECX]>>>>,		CCIfNotVarArg<CCIfInReg<CCIfType<[i32], CCAssignToReg<[EAX, EDX, ECX]>>>>,
		// Assign to Reg if RegParm flag
		CCIfNotVarArg<CCIfType<[i32], CCCustom<"CC_X86_32_AssignToReg_NoSplit">>>,

// Otherwise, same as everything else.		// Otherwise, same as everything else.
CCDelegateTo<CC_X86_32_Common>		CCDelegateTo<CC_X86_32_Common>
]>;		]>;

def CC_X86_32_MCU : CallingConv<[
// Handles byval parameters. Note that, like FastCC, we can't rely on
// the delegation to CC_X86_32_Common because that happens after code that
// puts arguments in registers.
CCIfByVal<CCPassByVal<4, 4>>,

		def CC_X86_32_MCU : CallingConv<[
// Promote i1/i8/i16 arguments to i32.		// Promote i1/i8/i16 arguments to i32.
CCIfType<[i1, i8, i16], CCPromoteToType<i32>>,		CCIfType<[i1, i8, i16], CCPromoteToType<i32>>,

// If the call is not a vararg call, some arguments may be passed		// If the call is not a vararg call, some arguments may be passed
// in integer registers.		// in integer registers.
CCIfNotVarArg<CCIfType<[i32], CCCustom<"CC_X86_32_MCUInReg">>>,		CCIfNotVarArg<CCIfType<[i32], CCCustom<"CC_X86_32_AssignToReg_NoSplit">>>,

// Otherwise, same as everything else.		// Otherwise, same as everything else.
CCDelegateTo<CC_X86_32_Common>		CCDelegateTo<CC_X86_32_Common>
]>;		]>;

def CC_X86_32_FastCall : CallingConv<[		def CC_X86_32_FastCall : CallingConv<[
// Promote i1/i8/i16 arguments to i32.		// Promote i1/i8/i16 arguments to i32.
CCIfType<[i1, i8, i16], CCPromoteToType<i32>>,		CCIfType<[i1, i8, i16], CCPromoteToType<i32>>,

▲ Show 20 Lines • Show All 160 Lines • ▼ Show 20 Lines	def CC_X86_32 : CallingConv<[
CCIfSubtarget<"isTargetMCU()", CCDelegateTo<CC_X86_32_MCU>>,		CCIfSubtarget<"isTargetMCU()", CCDelegateTo<CC_X86_32_MCU>>,
CCIfCC<"CallingConv::X86_FastCall", CCDelegateTo<CC_X86_32_FastCall>>,		CCIfCC<"CallingConv::X86_FastCall", CCDelegateTo<CC_X86_32_FastCall>>,
CCIfCC<"CallingConv::X86_VectorCall", CCDelegateTo<CC_X86_32_VectorCall>>,		CCIfCC<"CallingConv::X86_VectorCall", CCDelegateTo<CC_X86_32_VectorCall>>,
CCIfCC<"CallingConv::X86_ThisCall", CCDelegateTo<CC_X86_32_ThisCall>>,		CCIfCC<"CallingConv::X86_ThisCall", CCDelegateTo<CC_X86_32_ThisCall>>,
CCIfCC<"CallingConv::Fast", CCDelegateTo<CC_X86_32_FastCC>>,		CCIfCC<"CallingConv::Fast", CCDelegateTo<CC_X86_32_FastCC>>,
CCIfCC<"CallingConv::GHC", CCDelegateTo<CC_X86_32_GHC>>,		CCIfCC<"CallingConv::GHC", CCDelegateTo<CC_X86_32_GHC>>,
CCIfCC<"CallingConv::HiPE", CCDelegateTo<CC_X86_32_HiPE>>,		CCIfCC<"CallingConv::HiPE", CCDelegateTo<CC_X86_32_HiPE>>,
CCIfCC<"CallingConv::X86_RegCall", CCDelegateTo<CC_X86_32_RegCall>>,		CCIfCC<"CallingConv::X86_RegCall", CCDelegateTo<CC_X86_32_RegCall>>,

// Otherwise, drop to normal X86-32 CC		// Otherwise, drop to normal X86-32 CC
CCDelegateTo<CC_X86_32_C>		CCDelegateTo<CC_X86_32_C>
]>;		]>;

// This is the root argument convention for the X86-64 backend.		// This is the root argument convention for the X86-64 backend.
def CC_X86_64 : CallingConv<[		def CC_X86_64 : CallingConv<[
CCIfCC<"CallingConv::GHC", CCDelegateTo<CC_X86_64_GHC>>,		CCIfCC<"CallingConv::GHC", CCDelegateTo<CC_X86_64_GHC>>,
CCIfCC<"CallingConv::HiPE", CCDelegateTo<CC_X86_64_HiPE>>,		CCIfCC<"CallingConv::HiPE", CCDelegateTo<CC_X86_64_HiPE>>,
▲ Show 20 Lines • Show All 127 Lines • Show Last 20 Lines

lib/Target/X86/X86ISelLowering.cpp



//===-- X86ISelLowering.cpp - X86 DAG Lowering Implementation -------------===//		//===-- X86ISelLowering.cpp - X86 DAG Lowering Implementation -------------===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
▲ Show 20 Lines • Show All 90 Lines • ▼ Show 20 Lines	X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
// Bypass expensive divides on Atom when compiling with O2.		// Bypass expensive divides on Atom when compiling with O2.
if (TM.getOptLevel() >= CodeGenOpt::Default) {		if (TM.getOptLevel() >= CodeGenOpt::Default) {
if (Subtarget.hasSlowDivide32())		if (Subtarget.hasSlowDivide32())
addBypassSlowDiv(32, 8);		addBypassSlowDiv(32, 8);
if (Subtarget.hasSlowDivide64() && Subtarget.is64Bit())		if (Subtarget.hasSlowDivide64() && Subtarget.is64Bit())
addBypassSlowDiv(64, 16);		addBypassSlowDiv(64, 16);
}		}

		// Set all builtin calling conventions to BuiltinCC.
		auto BuiltinCC = CallingConv::C;
		for (int i = 0; i < RTLIB::UNKNOWN_LIBCALL; ++i)
		setLibcallCallingConv((RTLIB::Libcall)i, BuiltinCC);

if (Subtarget.isTargetKnownWindowsMSVC() ||		if (Subtarget.isTargetKnownWindowsMSVC() ||
Subtarget.isTargetWindowsItanium()) {		Subtarget.isTargetWindowsItanium()) {
// Setup Windows compiler runtime calls.		// Setup Windows compiler runtime calls.
setLibcallName(RTLIB::SDIV_I64, "_alldiv");		setLibcallName(RTLIB::SDIV_I64, "_alldiv");
setLibcallName(RTLIB::UDIV_I64, "_aulldiv");		setLibcallName(RTLIB::UDIV_I64, "_aulldiv");
setLibcallName(RTLIB::SREM_I64, "_allrem");		setLibcallName(RTLIB::SREM_I64, "_allrem");
setLibcallName(RTLIB::UREM_I64, "_aullrem");		setLibcallName(RTLIB::UREM_I64, "_aullrem");
setLibcallName(RTLIB::MUL_I64, "_allmul");		setLibcallName(RTLIB::MUL_I64, "_allmul");
▲ Show 20 Lines • Show All 32,759 Lines • Show Last 20 Lines

test/CodeGen/X86/pr18415.ll

This file was added.

				; RUN: llc %s -mtriple i386-unknown-linux-gnu -regparm 0 -o - | FileCheck %s -check-prefix CHECK0
				; RUN: llc %s -mtriple i386-unknown-linux-gnu -regparm 3 -o - | FileCheck %s -check-prefix CHECK3

				; ModuleID = '/usr/local/google/home/niravd/pr18415.c'
				target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
				target triple = "i386-unknown-linux-gnu"

				; Function Attrs: nounwind
				define void @use_foo(i8* inreg %dest, i8* inreg %src, i32 inreg %n) #0 {

				; CHECK0-LABEL: @use_foo
				; CHECK0-NOT: pushl
				; CHECK0: jmp foo
				; CHECK0-NOT: retl
				; CHECK3-LABEL: @use_foo
				; CHECK3-NOT: pushl
				; CHECK3: jmp foo
				; CHECK3-NOT: retl
				%1 = tail call i8* @foo(i8* %dest, i8* %src, i32 %n) #4
				ret void
				}

				declare i8* @foo(i8* inreg, i8* inreg, i32 inreg) #1

				; Function Attrs: norecurse nounwind
				define void @use_memcpy(i8* inreg nocapture %dest, i8* inreg nocapture readonly %src, i32 inreg %n) #2 {
				; CHECK0-LABEL: @use_memcpy
				; CHECK0: pushl %ecx
				; CHECK0: pushl %edx
				; CHECK0: pushl %eax
				; CHECK0: calll memcpy
				; CHECK0: retl
				; CHECK3-LABEL: @use_memcpy
				; CHECK3-NOT: pushl
				; CHECK3: jmp memcpy
				; CHECK3-NOT: retl
				tail call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 %n, i32 1, i1 false)
				ret void
				}

				; Function Attrs: argmemonly nounwind
				declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture, i8* nocapture readonly, i32, i32, i1) #3

				attributes #0 = { nounwind "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="pentium4" "target-features"="+fxsr,+mmx,+sse,+sse2" "unsafe-fp-math"="false" "use-soft-float"="false" }
				attributes #1 = { "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="pentium4" "target-features"="+fxsr,+mmx,+sse,+sse2" "unsafe-fp-math"="false" "use-soft-float"="false" }
				attributes #2 = { norecurse nounwind "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="pentium4" "target-features"="+fxsr,+mmx,+sse,+sse2" "unsafe-fp-math"="false" "use-soft-float"="false" "regparm"="2"}
				attributes #3 = { argmemonly nounwind}
				attributes #4 = { nounwind }

				!llvm.ident = !{!0}

				!0 = !{!"clang version 3.8.0-2ubuntu3~trusty4 (tags/RELEASE_380/final)"}

test/CodeGen/X86/regparm.ll

This file was added.

				; RUN: llc %s -mtriple i386-unknown-linux-gnu -regparm 0 -o - | FileCheck %s -check-prefix CHECK0
				; RUN: llc %s -mtriple i386-unknown-linux-gnu -regparm 1 -o - | FileCheck %s -check-prefix CHECK1
				; RUN: llc %s -mtriple i386-unknown-linux-gnu -regparm 2 -o - | FileCheck %s -check-prefix CHECK2
				; RUN: llc %s -mtriple i386-unknown-linux-gnu -regparm 3 -o - | FileCheck %s -check-prefix CHECK3
				target triple = "i386-unknown-linux-gnu"

				;CHECK0-LABEL: @test0
				;CHECK0: movl 4(%esp), %eax
				;CHECK0-NEXT: addl 8(%esp), %eax
				;CHECK0-NEXT: addl 12(%esp), %eax
				;CHECK0-NEXT: retl

				;CHECK1-LABEL: @test0
				;CHECK1: addl 4(%esp), %eax
				;CHECK1-NEXT: addl 8(%esp), %eax
				;CHECK1-NEXT: retl

				;CHECK2-LABEL: @test0
				;CHECK2: addl %edx, %eax
				;CHECK2-NEXT: addl 4(%esp), %eax
				;CHECK2-NEXT: retl

				;CHECK3-LABEL: @test0
				;CHECK3: addl %edx, %eax
				;CHECK3-NEXT: addl %ecx, %eax
				;CHECK3-NEXT: retl

				define i32 @test0(i32 %a, i32 %b, i32 %c, i32 %d) {
				%1 = add i32 %a, %b
				%2 = add i32 %1, %c
				%3 = add i32 %2, %d
				ret i32 %2
				}

				; i64 requires 2 registers. If it does not fit, the 1 register is still allocated.

				;CHECK1-LABEL: @test1
				;CHECK1: movl 4(%esp), %eax
				;CHECK1-NEXT: addl 12(%esp), %eax
				;CHECK1-NEXT: addl 8(%esp), %eax
				;CHECK1-NEXT: addl 16(%esp), %eax
				;CHECK1-NEXT: addl 20(%esp), %eax
				;CHECK1-NEXT: retl

				;CHECK2-LABEL: @test1
				;CHECK2: addl 4(%esp), %eax
				;CHECK2-NEXT: addl %edx, %eax
				;CHECK2-NEXT: addl 8(%esp), %eax
				;CHECK2-NEXT: addl 12(%esp), %eax
				;CHECK2-NEXT: retl

				;CHECK3-LABEL: @test1
				;CHECK3: addl %ecx, %eax
				;CHECK3-NEXT: addl %edx, %eax
				;CHECK3-NEXT: addl 4(%esp), %eax
				;CHECK3-NEXT: addl 8(%esp), %eax
				;CHECK3-NEXT: retl

				define i32 @test1(i64 %a, i32 %b, i32 %c, i32 %d) {
				%shr = lshr i64 %a, 32
				%conv = trunc i64 %shr to i32
				%conv1 = trunc i64 %a to i32
				%add = add i32 %conv1, %b
				%add2 = add i32 %add, %conv
				%add3 = add i32 %add2, %c
				%add4 = add i32 %add3, %d
				ret i32 %add4
				}

				;CHECK1-LABEL: @test2
				;CHECK1: addl 4(%esp), %eax
				;CHECK1-NEXT: addl 8(%esp), %eax
				;CHECK1-NEXT: addl 12(%esp), %eax
				;CHECK1-NEXT: addl 16(%esp), %eax
				;CHECK1-NEXT: retl
				;CHECK2-LABEL: @test2
				;CHECK2: addl 4(%esp), %eax
				;CHECK2-NEXT: addl 8(%esp), %eax
				;CHECK2-NEXT: addl 12(%esp), %eax
				;CHECK2-NEXT: addl 16(%esp), %eax
				;CHECK2-NEXT: retl
				;CHECK3-LABEL: @test2
				;CHECK3: addl %edx, %eax
				;CHECK3-NEXT: addl %ecx, %eax
				;CHECK3-NEXT: addl 4(%esp), %eax
				;CHECK3-NEXT: addl 8(%esp), %eax
				;CHECK3-NEXT: retl

				define i32 @test2(i32 %b, i64 %a, i32 %c, i32 %d) {
				%shr = lshr i64 %a, 32
				%conv = trunc i64 %shr to i32
				%conv1 = trunc i64 %a to i32
				%add = add i32 %conv1, %b
				%add2 = add i32 %add, %conv
				%add3 = add i32 %add2, %c
				%add4 = add i32 %add3, %d
				ret i32 %add4
				}
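The stack offsets in the CHECK lines of regparm.ll can be reproduced with a small model of the assignment rule the test comment describes: an i32 takes one register, an i64 takes two, and an i64 that does not fit goes to the stack while still using up the remaining register. The sketch below is a hypothetical helper written for this explanation (`stackBytes` is not part of the patch), and it only counts bytes of stack arguments rather than modeling the actual lowering.

```cpp
#include <cassert>
#include <vector>

// Toy model (hypothetical, not LLVM code) of the argument assignment that the
// CHECK lines above expect. ArgBits holds each argument's width (32 or 64).
// Returns the number of bytes of arguments left on the stack.
unsigned stackBytes(const std::vector<unsigned> &ArgBits, unsigned RegParm) {
  assert(RegParm <= 3 && "only EAX, EDX and ECX are available");
  unsigned Free = RegParm;
  unsigned Bytes = 0;
  for (unsigned Bits : ArgBits) {
    unsigned Needed = Bits / 32; // registers (or stack words) required
    if (Needed <= Free) {
      Free -= Needed;
    } else {
      Bytes += 4 * Needed;
      if (Needed > 1)
        Free = 0; // a spilled i64 still uses up the remaining register
    }
  }
  return Bytes;
}
```

For test1 (i64, i32, i32, i32) the model gives 20, 12 and 8 bytes for regparm 1, 2 and 3, and for test2 (i32, i64, i32, i32) it gives 16, 16 and 8, which lines up with the highest %esp offsets read in the corresponding CHECK1, CHECK2 and CHECK3 blocks.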

This is an archive of the discontinued LLVM Phabricator instance.

[X86] Make library calls sensitive to regparm module flag (Fixes PR3997).
ClosedPublic

