This is an archive of the discontinued LLVM Phabricator instance.

Differential D27392

Vectorcall Calling Convention - Adding CodeGen Complete Support
Closed, Public

Authored by oren_ben_simhon on Dec 4 2016, 2:09 AM.

Details

Reviewers

zvi
majnemer
igorb
rnk
aaboud

Commits

rG3b9515709011: [X86] Vectorcall Calling Convention - Adding CodeGen Complete Support
rL290240: [X86] Vectorcall Calling Convention - Adding CodeGen Complete Support

Summary

The vectorcall calling convention specifies that arguments to functions are to be passed in registers, when possible.
vectorcall uses more registers for arguments than fastcall or the default x64 calling convention use.
The vectorcall calling convention is only supported in native code on x86 and x64 processors that include Streaming SIMD Extensions 2 (SSE2) and above.

The current implementation does not handle Homogeneous Vector Aggregates (HVAs) correctly, and this review attempts to fix that.
The review also adds lit tests to better cover HVA corner cases.
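To make the HVA problem concrete, here is a toy model of x64 vectorcall register assignment. It is a sketch under stated assumptions, not the patch's code: `Kind`, `Arg`, and `assignVectorcall` are hypothetical names, and the model assumes that integer and vector arguments are pinned to registers by argument position in a first pass, while the elements of each HVA take the lowest still-free XMM0-XMM5 registers in a second pass. Shadow stack space, scalar floating-point arguments, return values, and the 32-bit rules are ignored.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical argument description for a toy x64 vectorcall model.
enum class Kind { Int, Vector, HVA };
struct Arg {
  Kind K;
  unsigned NumElts; // Number of vector elements; only meaningful for HVAs.
};

// Returns one location string per argument ("memory" means not in registers).
std::vector<std::string> assignVectorcall(const std::vector<Arg> &Args) {
  static const char *const GPRs[] = {"RCX", "RDX", "R8", "R9"};
  std::array<bool, 6> XMMUsed{}; // Tracks XMM0..XMM5.
  std::vector<std::string> Locs(Args.size());

  // First pass: integers and vectors are pinned by argument position.
  for (std::size_t I = 0; I < Args.size(); ++I) {
    if (Args[I].K == Kind::Int) {
      Locs[I] = I < 4 ? GPRs[I] : "memory";
    } else if (Args[I].K == Kind::Vector) {
      if (I < 6) {
        XMMUsed[I] = true;
        Locs[I] = "XMM" + std::to_string(I);
      } else {
        Locs[I] = "memory";
      }
    }
  }

  // Second pass: each HVA takes the lowest free XMM registers, but only if
  // enough registers remain for the whole aggregate.
  for (std::size_t I = 0; I < Args.size(); ++I) {
    if (Args[I].K != Kind::HVA)
      continue;
    assert(Args[I].NumElts >= 1 && Args[I].NumElts <= 4 && "bad HVA size");
    unsigned Free = 0;
    for (bool Used : XMMUsed)
      Free += !Used;
    if (Free < Args[I].NumElts) {
      Locs[I] = "memory";
      continue;
    }
    unsigned Remaining = Args[I].NumElts;
    for (unsigned R = 0; R < 6 && Remaining; ++R) {
      if (XMMUsed[R])
        continue;
      XMMUsed[R] = true;
      Locs[I] += (Locs[I].empty() ? "XMM" : ",XMM") + std::to_string(R);
      --Remaining;
    }
  }
  return Locs;
}
```

In this model the resulting locations are no longer produced in argument order, since HVA locations are only decided after every other argument has been placed. That ordering issue is what much of the review discussion below is about.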

Diff Detail

Repository: rL LLVM

Event Timeline

oren_ben_simhon updated this revision to Diff 80200. Dec 4 2016, 2:09 AM

oren_ben_simhon retitled this revision to Vectorcall Calling Convention - Adding CodeGen Complete Support.

oren_ben_simhon updated this object.

oren_ben_simhon added reviewers: rnk, zvi, aaboud, igorb.

oren_ben_simhon set the repository for this revision to rL LLVM.

oren_ben_simhon added a subscriber: llvm-commits.

oren_ben_simhon updated this object. Dec 4 2016, 3:20 AM

majnemer added a subscriber: majnemer. Dec 7 2016, 8:58 PM

majnemer added inline comments.

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
7741 ↗	(On Diff #80200)	Please use `isa` here.
8039 ↗	(On Diff #80200)	Ditto.
lib/Target/X86/X86CallingConv.cpp
95 ↗	(On Diff #80200)	Range-based for loop?
197–199 ↗	(On Diff #80200)	This comment seems garbled.
206–207 ↗	(On Diff #80200)	else after return is not in the LLVM style.

oren_ben_simhon marked 5 inline comments as done. Dec 8 2016, 8:49 AM

Implemented comments posted by David (Thank You)

majnemer added inline comments. Dec 8 2016, 9:09 AM

include/llvm/CodeGen/CallingConvLower.h
327 ↗	(On Diff #80761)	Comments should end with a period.
329 ↗	(On Diff #80761)	Variables start with an uppercase letter.
include/llvm/Target/TargetCallingConv.h
54–62 ↗	(On Diff #80761)	Can we align this code to match its neighbors?
lib/CodeGen/CallingConvLower.cpp
74 ↗	(On Diff #80200)	Range-based for loop?
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
7740–7741 ↗	(On Diff #80761)	Formatting looks strange.
lib/Target/X86/X86CallingConv.cpp
95 ↗	(On Diff #80761)	Formatting looks strange.
104 ↗	(On Diff #80761)	I'd lose these parens.
132 ↗	(On Diff #80761)	I don't think these parens are adding anything.
145–181 ↗	(On Diff #80761)	I wonder if this can be refactored to reduce redundancy:
    if (!ArgFlags.isHva() || ArgFlags.isHvaStart()) {
      // Assign shadow GPR register.
      (void)State.AllocateReg(CC_X86_64_VectorCallGetGPRs());

      // Assign XMM register.
      if (unsigned Reg = State.AllocateReg(CC_X86_VectorCallGetSSEs(ValVT))) {
        if (!ArgFlags.isHva())
          State.addLoc(CCValAssign::getReg(ValNo, ValVT, Reg, LocVT, LocInfo));

        // In Vectorcall Calling convention, additional shadow stack can be
        // created on top of the basic 32 bytes of win64.
        // It can happen if the fifth or sixth argument is vector type or HVA.
        // At that case for each argument a shadow stack of 8 bytes is allocated.
        if (Reg == X86::XMM4 || Reg == X86::XMM5)
          State.AllocateStack(8, 8);
      }
      return true;
    }
162–163 ↗	(On Diff #80761)	Else after return doesn't conform to LLVM style.
189 ↗	(On Diff #80761)	Ditto.
201 ↗	(On Diff #80761)	I don't think these parens add anything.
206 ↗	(On Diff #80761)	Comments should end with a period.
208 ↗	(On Diff #80761)	Ditto.
214 ↗	(On Diff #80761)	Ditto.
lib/Target/X86/X86ISelLowering.cpp
3276–3277 ↗	(On Diff #80761)	I'd capitalize these variable names.

rnk added inline comments. Dec 8 2016, 11:07 AM

lib/Target/X86/X86CallingConv.cpp
65 ↗	(On Diff #80200)	ArrayRefs are implicitly constructable from C arrays. You should be able to just return RegListZMM here and throughout this file. If not, makeArrayRef takes C arrays and will do the right thing.
130 ↗	(On Diff #80200)	mojibake
lib/Target/X86/X86CallingConv.td
631 ↗	(On Diff #80200)	Ouch. I locally confirmed this is correct, but why design a new calling convention that doesn't handle the latest vector types... =P
test/CodeGen/X86/vectorcall.ll
75 ↗	(On Diff #80761)	This was testing that %r is passed indirectly in the first integer register parameter, but I guess that's incorrect because of the way __vectorcall pins arguments to registers based on their exact argument position. We should check for the load off the stack to capture the correct behavior. Something like this:
    mov{{[l|q}} {{[0-9]+}}(%rsp), %[[r_reg:[^ ]*]]
    movaps %[[r_reg]], %xmm0
75 ↗	(On Diff #80200)	This comment has been deleted.
75 ↗	(On Diff #80200)	ignore the comment above, I can't seem to delete it from phab. :(
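
rnk's remark about `ArrayRef` relies on an implicit converting constructor that deduces the array length. The sketch below shows that mechanism in isolation. `ArrayRefSketch` is a hypothetical stand-in written for this note, not `llvm::ArrayRef` itself, and the register numbers are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for llvm::ArrayRef<T>: a non-owning (pointer, length)
// view over contiguous elements.
template <typename T> class ArrayRefSketch {
  const T *Data = nullptr;
  std::size_t Length = 0;

public:
  ArrayRefSketch() = default;

  // Implicit conversion from a C array; the length N is deduced.
  template <std::size_t N>
  ArrayRefSketch(const T (&Arr)[N]) : Data(Arr), Length(N) {}

  std::size_t size() const { return Length; }
  const T &operator[](std::size_t I) const {
    assert(I < Length && "index out of range");
    return Data[I];
  }
};

// Made-up register numbers standing in for the X86::ZMM* enumerators.
static const std::uint16_t RegListZMM[] = {100, 101, 102, 103, 104, 105};

// Returning the array by name is enough; no explicit wrapping is needed.
ArrayRefSketch<std::uint16_t> getZMMRegs() { return RegListZMM; }
```

The view does not own its elements, so this only works because the array has static storage duration and outlives every returned view.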

aaboud added inline comments. Dec 9 2016, 6:49 AM

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
7747 ↗	(On Diff #80761)	you already checked this case.
lib/Target/X86/X86ISelLowering.cpp
2807 ↗	(On Diff #80761)	You have a typo: "the the". Also, can you explain why you are sorting the ArgLoc even for non-VectorCall calling convention? Is that needed?
test/CodeGen/X86/vectorcall.ll
71 ↗	(On Diff #80761)	Do you really want to remove this test? Cannot you just fix it?

oren_ben_simhon marked 21 inline comments as done. Dec 11 2016, 5:20 AM

oren_ben_simhon added inline comments.

lib/Target/X86/X86CallingConv.cpp
65 ↗	(On Diff #80200)	You are right. Thank you.
95 ↗	(On Diff #80761)	I ran clang format before uploading the file. I added comments, hopefully it makes the formatting look better.
132 ↗	(On Diff #80761)	Consider a case in which ValVT is a vector type. If I remove these parens, I will enter the if statement block. However, I do not want to enter this block for vector types. That is why these parens are not redundant. I will reword the condition slightly to make it clearer.
lib/Target/X86/X86CallingConv.td
631 ↗	(On Diff #80200)	If you refer to 512 bit vector types, Microsoft probably forgot to mention that it should be assigned to ZMM here.
lib/Target/X86/X86ISelLowering.cpp
2807 ↗	(On Diff #80761)	The next loop assumes that the locations are in the same order as the input arguments, so the order should be kept for all calling conventions. AFAIK, vectorcall is currently the only one that changes the argument order, but other calling conventions might do the same. Anyway, I will add a comment to clarify this.
test/CodeGen/X86/vectorcall.ll
71 ↗	(On Diff #80761)	AFAIK, this test checks something that cannot happen. There is no case in which a function will return {double, double, double, double, double}. If it wanted to return such a structure, it would be returned through a pointer passed as the first argument to the function:
    define x86_vectorcallcc void @test_fp_4(%struct.five_doubles* sret %a)

Implemented comments submitted until 12/10 (Thank you, David/Reid/Amjad)

majnemer added inline comments. Dec 11 2016, 9:46 PM

lib/Target/X86/X86CallingConv.cpp
163–166 ↗	(On Diff #81019)	This is just `return ArgFlags.isHva();`

oren_ben_simhon marked an inline comment as done. Dec 12 2016, 5:18 AM

Implemented changes posted until 12/12 (Thank you David)

rnk added inline comments. Dec 13 2016, 9:46 AM

include/llvm/CodeGen/CallingConvLower.h
326 ↗	(On Diff #81070)	typo on "allocated"
lib/CodeGen/CallingConvLower.cpp
74 ↗	(On Diff #80200)	This creates a copy of a CCValAssign, which is unnecessary.
lib/Target/X86/X86CallingConv.cpp
130 ↗	(On Diff #80200)	I still see "type庸or example" with an Asian character in there.
lib/Target/X86/X86CallingConv.td
631 ↗	(On Diff #80200)	Actually, I think I just misunderstood. It looks like you allocate things to AVX registers elsewhere. I thought with this change you made it so that AVX registers would never be used with vectorcall.
lib/Target/X86/X86ISelLowering.cpp
2799 ↗	(On Diff #81070)	Unnecessary copy
2810 ↗	(On Diff #81070)	The stable_sort is only needed if the CC was vectorcall. I think the code would actually be clearer if we used std::merge. After the first pass, ArgLocs should be all the non-HVA argument locations sorted by argument number. The second pass appends additional sorted HVA argument locations. Then you merge those two sorted lists. You'll need a temporary ArgLocs vector to make this work. Please factor out this two pass vectorcall code into a template function that operates on a SmallVectorImpl<T>, where T can be InputArg or OutputArg.
test/CodeGen/X86/vectorcall.ll
71 ↗	(On Diff #80761)	This was really just a whitebox test to show that LLVM does not crash when you return 5 doubles. We should keep the test if we don't fatal error. Any behavior is fine. For example, it might trigger LLVM's logic to demote return by value to sret.

oren_ben_simhon marked 6 inline comments as done. Dec 14 2016, 3:14 AM

oren_ben_simhon added inline comments.

lib/Target/X86/X86ISelLowering.cpp
2810 ↗	(On Diff #81070)	I created a template function as you suggested. The next loop assumes that the locations are sorted. Today the vectorcall CC changes the order; tomorrow some other functionality or CC could change it too. Since we already do the sort, we might as well do it for all CCs and avoid future issues. I agree that a merge could be faster than stable_sort, but the overhead of using it is big: not only will I need a temporary ArgLocs, I will also need a temporary CCState. IMHO, the current solution is preferred.
test/CodeGen/X86/vectorcall.ll
71 ↗	(On Diff #80761)	There is no fatal error. I am fine with leaving the test.

Implemented comments posted until 12/13 (Thank you Reid)

rnk added inline comments. Dec 14 2016, 1:43 PM

lib/Target/X86/X86ISelLowering.cpp
2810 ↗	(On Diff #81070)	I had envisioned that the sort or merge logic would live in the factored out second pass logic that is specific to vectorcall. All conventions other than vectorcall already preserve the invariant that argument locations are ordered by their IR position, which is why I see this sort/merge as being specific to vectorcall. I'd expect readers to be surprised that this sort is necessary, so I'd like to bind it closely with the code that makes the list unsorted. If you want to document the invariant of the loop below, we can do that with an assertion. Also, the merge isn't that bad. You should be able to do this:
    unsigned NumFirstPassLocs = ArgLocs.size();
    CCState.AnalyzeFormalArguments(...);
    decltype(ArgLocs) TmpArgLocs;
    std::swap(TmpArgLocs, ArgLocs);
    auto B = TmpArgLocs.begin(), E = TmpArgLocs.end();
    std::merge(B, B + NumFirstPassLocs, B + NumFirstPassLocs, E, ArgLocs.begin());
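
The swap-then-merge idea can be tried outside LLVM with plain standard containers. The following is a minimal sketch, assuming a hypothetical `Loc` struct in place of `CCValAssign` and a free function `mergePasses` invented for this illustration. It expects the first `NumFirstPassLocs` entries and the remaining entries to each be sorted by `ValNo` already.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical, stripped-down stand-in for CCValAssign.
struct Loc {
  unsigned ValNo; // Position of the argument in the IR signature.
  unsigned Reg;   // Assigned register number (arbitrary here).
};

// Locs holds NumFirstPassLocs first-pass entries followed by the second-pass
// entries. Both runs are sorted by ValNo; merge them into one sorted list.
void mergePasses(std::vector<Loc> &Locs, std::size_t NumFirstPassLocs) {
  assert(NumFirstPassLocs <= Locs.size() && "first run longer than list");
  std::vector<Loc> Tmp;
  std::swap(Tmp, Locs); // Locs is now empty; Tmp owns both sorted runs.
  auto B = Tmp.begin(), E = Tmp.end();
  auto Mid = B + NumFirstPassLocs;
  std::merge(B, Mid, Mid, E, std::back_inserter(Locs),
             [](const Loc &L, const Loc &R) { return L.ValNo < R.ValNo; });
}
```

Because the destination is empty after the swap, the sketch writes through `std::back_inserter` rather than `begin()`. `std::merge` is linear in the total number of elements and keeps the relative order of equal keys, with elements from the first run placed first.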

oren_ben_simhon marked an inline comment as done. Dec 15 2016, 7:56 AM

Implemented comments submitted until 12/14 (Thank you Reid)

Looks good. Please fix the remaining style issues before committing. Thanks for the patch!

include/llvm/CodeGen/CallingConvLower.h
543 ↗	(On Diff #81586)	Thanks! I think this looks a lot better factored here.
lib/Target/X86/X86ISelLowering.cpp
2755 ↗	(On Diff #81586)	This should have a more descriptive name in LLVM style. isSortedByValNo or something?
2811 ↗	(On Diff #81586)	formatting
3322 ↗	(On Diff #81586)	formatting

This revision is now accepted and ready to land. Dec 15 2016, 10:50 AM

majnemer added inline comments. Dec 15 2016, 11:23 AM

include/llvm/CodeGen/CallingConvLower.h
542–543 ↗	(On Diff #81586)	Please clang-format this. Also, I think `Args` should be an `ArrayRef<T>`.
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
7740 ↗	(On Diff #81586)	Please remove the extra parens.
8038 ↗	(On Diff #81586)	Ditto.
lib/Target/X86/X86ISelLowering.cpp
2756–2759 ↗	(On Diff #81586)	This doesn't look correctly formatted.

oren_ben_simhon marked 8 inline comments as done. Dec 18 2016, 12:38 AM

oren_ben_simhon added inline comments.

include/llvm/CodeGen/CallingConvLower.h
542–543 ↗	(On Diff #81586)	Thank you Reid and David. I reran clang format on the latest patch. Other functions that receive Args (Like, AnalyzeCallOperands, CheckReturn, etc.) use the SmallVector representation of the data. So, I prefer to be consistent with them and leave it as SmallVectorImpl.

Implemented comments submitted until 12/16.

Thank you Reid, David and Amjad for a very constructive and fruitful code review.
I will leave the review open for a couple of days (in case you have additional comments).

Closed by commit rL290240: [X86] Vectorcall Calling Convention - Adding CodeGen Complete Support (authored by orenb). Dec 21 2016, 12:42 AM

This revision was automatically updated to reflect the committed changes.

erichkeane mentioned this in rL291041: Correct Vectorcall Register passing and HVA Behavior. Jan 4 2017, 4:31 PM

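Revision Contents

Path	Size
llvm/trunk/include/llvm/CodeGen/CallingConvLower.h	48 lines
llvm/trunk/include/llvm/Target/TargetCallingConv.h	18 lines
llvm/trunk/lib/CodeGen/CallingConvLower.cpp	18 lines
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp	26 lines
llvm/trunk/lib/Target/X86/X86CallingConv.h	31 lines
llvm/trunk/lib/Target/X86/X86CallingConv.cpp	154 lines
llvm/trunk/lib/Target/X86/X86CallingConv.td	64 lines
llvm/trunk/lib/Target/X86/X86ISelLowering.cpp	42 lines
llvm/trunk/test/CodeGen/X86/vectorcall.ll	142 lines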

Diff 82206

llvm/trunk/include/llvm/CodeGen/CallingConvLower.h

@@ (290 lines not shown) @@ bool isAllocated(unsigned Reg) const {
     return UsedRegs[Reg/32] & (1 << (Reg&31));
   }

   /// AnalyzeFormalArguments - Analyze an array of argument values,
   /// incorporating info about the formals into this state.
   void AnalyzeFormalArguments(const SmallVectorImpl<ISD::InputArg> &Ins,
                               CCAssignFn Fn);

+  /// The function will invoke AnalyzeFormalArguments.
+  void AnalyzeArguments(const SmallVectorImpl<ISD::InputArg> &Ins,
+                        CCAssignFn Fn) {
+    AnalyzeFormalArguments(Ins, Fn);
+  }
+
   /// AnalyzeReturn - Analyze the returned values of a return,
   /// incorporating info about the result values into this state.
   void AnalyzeReturn(const SmallVectorImpl<ISD::OutputArg> &Outs,
                      CCAssignFn Fn);

   /// CheckReturn - Analyze the return values of a function, returning
   /// true if the return can be performed without sret-demotion, and
   /// false otherwise.
   bool CheckReturn(const SmallVectorImpl<ISD::OutputArg> &ArgsFlags,
                    CCAssignFn Fn);

   /// AnalyzeCallOperands - Analyze the outgoing arguments to a call,
   /// incorporating info about the passed values into this state.
   void AnalyzeCallOperands(const SmallVectorImpl<ISD::OutputArg> &Outs,
                            CCAssignFn Fn);

   /// AnalyzeCallOperands - Same as above except it takes vectors of types
   /// and argument flags.
   void AnalyzeCallOperands(SmallVectorImpl<MVT> &ArgVTs,
                            SmallVectorImpl<ISD::ArgFlagsTy> &Flags,
                            CCAssignFn Fn);

+  /// The function will invoke AnalyzeCallOperands.
+  void AnalyzeArguments(const SmallVectorImpl<ISD::OutputArg> &Outs,
+                        CCAssignFn Fn) {
+    AnalyzeCallOperands(Outs, Fn);
+  }
+
   /// AnalyzeCallResult - Analyze the return values of a call,
   /// incorporating info about the passed values into this state.
   void AnalyzeCallResult(const SmallVectorImpl<ISD::InputArg> &Ins,
                          CCAssignFn Fn);

+  /// A shadow allocated register is a register that was allocated
+  /// but wasn't added to the location list (Locs).
+  /// \returns true if the register was allocated as shadow or false otherwise.
+  bool IsShadowAllocatedReg(unsigned Reg) const;
+
   /// AnalyzeCallResult - Same as above except it's specialized for calls which
   /// produce a single value.
   void AnalyzeCallResult(MVT VT, CCAssignFn Fn);

   /// getFirstUnallocated - Return the index of the first unallocated register
   /// in the set, or Regs.size() if they are all allocated.
   unsigned getFirstUnallocated(ArrayRef<MCPhysReg> Regs) const {
     for (unsigned i = 0; i < Regs.size(); ++i)
@@ (182 lines not shown) @@ public:
   /// Returns true if the results of the two calling conventions are compatible.
   /// This is usually part of the check for tailcall eligibility.
   static bool resultsCompatible(CallingConv::ID CalleeCC,
                                 CallingConv::ID CallerCC, MachineFunction &MF,
                                 LLVMContext &C,
                                 const SmallVectorImpl<ISD::InputArg> &Ins,
                                 CCAssignFn CalleeFn, CCAssignFn CallerFn);

+  /// The function runs an additional analysis pass over function arguments.
+  /// It will mark each argument with the attribute flag SecArgPass.
+  /// After running, it will sort the locs list.
+  template <class T>
+  void AnalyzeArgumentsSecondPass(const SmallVectorImpl<T> &Args,
+                                  CCAssignFn Fn) {
+    unsigned NumFirstPassLocs = Locs.size();
+
+    /// Creates similar argument list to \p Args in which each argument is
+    /// marked using SecArgPass flag.
+    SmallVector<T, 16> SecPassArg;
+    // SmallVector<ISD::InputArg, 16> SecPassArg;
+    for (auto Arg : Args) {
+      Arg.Flags.setSecArgPass();
+      SecPassArg.push_back(Arg);
+    }
+
+    // Run the second argument pass
+    AnalyzeArguments(SecPassArg, Fn);
+
+    // Sort the locations of the arguments according to their original position.
+    SmallVector<CCValAssign, 16> TmpArgLocs;
+    std::swap(TmpArgLocs, Locs);
+    auto B = TmpArgLocs.begin(), E = TmpArgLocs.end();
+    std::merge(B, B + NumFirstPassLocs, B + NumFirstPassLocs, E,
+               std::back_inserter(Locs),
+               [](const CCValAssign &A, const CCValAssign &B) -> bool {
+                 return A.getValNo() < B.getValNo();
+               });
+  }
+
 private:
   /// MarkAllocated - Mark a register and all of its aliases as allocated.
   void MarkAllocated(unsigned Reg);
 };



 } // end namespace llvm

 #endif

llvm/trunk/include/llvm/Target/TargetCallingConv.h

@@ (45 lines not shown) @@ private:
     static const uint64_t InAlloca = 1ULL<<12; ///< Passed with inalloca
     static const uint64_t InAllocaOffs = 12;
     static const uint64_t SplitEnd = 1ULL<<13; ///< Last part of a split
     static const uint64_t SplitEndOffs = 13;
     static const uint64_t SwiftSelf = 1ULL<<14; ///< Swift self parameter
     static const uint64_t SwiftSelfOffs = 14;
     static const uint64_t SwiftError = 1ULL<<15; ///< Swift error parameter
     static const uint64_t SwiftErrorOffs = 15;
+    static const uint64_t Hva = 1ULL << 16; ///< HVA field for
+                                            ///< vectorcall
+    static const uint64_t HvaOffs = 16;
+    static const uint64_t HvaStart = 1ULL << 17; ///< HVA structure start
+                                                 ///< for vectorcall
+    static const uint64_t HvaStartOffs = 17;
+    static const uint64_t SecArgPass = 1ULL << 18; ///< Second argument
+                                                   ///< pass for vectorcall
+    static const uint64_t SecArgPassOffs = 18;
     static const uint64_t OrigAlign = 0x1FULL<<27;
     static const uint64_t OrigAlignOffs = 27;
     static const uint64_t ByValSize = 0x3fffffffULL<<32; ///< Struct size
     static const uint64_t ByValSizeOffs = 32;
     static const uint64_t InConsecutiveRegsLast = 0x1ULL<<62; ///< Struct size
     static const uint64_t InConsecutiveRegsLastOffs = 62;
     static const uint64_t InConsecutiveRegs = 0x1ULL<<63; ///< Struct size
     static const uint64_t InConsecutiveRegsOffs = 63;
@@ (24 lines not shown) @@ public:
     void setInAlloca() { Flags |= One << InAllocaOffs; }

     bool isSwiftSelf() const { return Flags & SwiftSelf; }
     void setSwiftSelf() { Flags |= One << SwiftSelfOffs; }

     bool isSwiftError() const { return Flags & SwiftError; }
     void setSwiftError() { Flags |= One << SwiftErrorOffs; }

+    bool isHva() const { return Flags & Hva; }
+    void setHva() { Flags |= One << HvaOffs; }
+
+    bool isHvaStart() const { return Flags & HvaStart; }
+    void setHvaStart() { Flags |= One << HvaStartOffs; }
+
+    bool isSecArgPass() const { return Flags & SecArgPass; }
+    void setSecArgPass() { Flags |= One << SecArgPassOffs; }
+
     bool isNest() const { return Flags & Nest; }
     void setNest() { Flags |= One << NestOffs; }

     bool isReturned() const { return Flags & Returned; }
     void setReturned() { Flags |= One << ReturnedOffs; }

     bool isInConsecutiveRegs() const { return Flags & InConsecutiveRegs; }
     void setInConsecutiveRegs() { Flags |= One << InConsecutiveRegsOffs; }
@@ (111 lines not shown) @@
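
The three new flags follow the existing mask-plus-offset pattern of `ArgFlagsTy`. The sketch below isolates that pattern and the marking rule this patch applies to an HVA's values (every value gets `Hva`, only value 0 gets `HvaStart`). `ArgFlagsSketch` and `flagsForHvaElement` are hypothetical names for illustration; only the bit positions 16, 17, and 18 are taken from the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical miniature of ISD::ArgFlagsTy holding only the three new bits.
class ArgFlagsSketch {
  static constexpr std::uint64_t One = 1;
  static constexpr std::uint64_t HvaOffs = 16;
  static constexpr std::uint64_t HvaStartOffs = 17;
  static constexpr std::uint64_t SecArgPassOffs = 18;

  std::uint64_t Flags = 0;

public:
  bool isHva() const { return (Flags & (One << HvaOffs)) != 0; }
  void setHva() { Flags |= One << HvaOffs; }

  bool isHvaStart() const { return (Flags & (One << HvaStartOffs)) != 0; }
  void setHvaStart() { Flags |= One << HvaStartOffs; }

  bool isSecArgPass() const { return (Flags & (One << SecArgPassOffs)) != 0; }
  void setSecArgPass() { Flags |= One << SecArgPassOffs; }
};

// Flags for the Value-th element of a struct passed inreg under vectorcall:
// every element is marked Hva, and only the first is marked HvaStart.
ArgFlagsSketch flagsForHvaElement(unsigned Value) {
  ArgFlagsSketch F;
  if (Value == 0)
    F.setHvaStart();
  F.setHva();
  return F;
}
```

Keeping `HvaStart` separate from `Hva` lets a calling-convention routine act once per aggregate, for example to reserve a shadow register, while still recognizing every element as part of an HVA.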

llvm/trunk/lib/CodeGen/CallingConvLower.cpp

@@ (17 lines not shown) @@
 #include "llvm/IR/DataLayout.h"
 #include "llvm/Support/Debug.h"
 #include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/SaveAndRestore.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Target/TargetLowering.h"
 #include "llvm/Target/TargetRegisterInfo.h"
 #include "llvm/Target/TargetSubtargetInfo.h"
+#include <algorithm>

 using namespace llvm;

 CCState::CCState(CallingConv::ID CC, bool isVarArg, MachineFunction &mf,
                  SmallVectorImpl<CCValAssign> &locs, LLVMContext &C)
     : CallingConv(CC), IsVarArg(isVarArg), MF(mf),
       TRI(*MF.getSubtarget().getRegisterInfo()), Locs(locs), Context(C),
       CallOrPrologue(Unknown) {
   // No stack is used.
@@ (25 lines not shown) @@
 }

 /// Mark a register and all of its aliases as allocated.
 void CCState::MarkAllocated(unsigned Reg) {
   for (MCRegAliasIterator AI(Reg, &TRI, true); AI.isValid(); ++AI)
     UsedRegs[*AI/32] |= 1 << (*AI&31);
 }

+bool CCState::IsShadowAllocatedReg(unsigned Reg) const {
+  if (!isAllocated(Reg))
+    return false;
+
+  for (auto const &ValAssign : Locs) {
+    if (ValAssign.isRegLoc()) {
+      for (MCRegAliasIterator AI(ValAssign.getLocReg(), &TRI, true);
+           AI.isValid(); ++AI) {
+        if (*AI == Reg)
+          return false;
+      }
+    }
+  }
+  return true;
+}
+
 /// Analyze an array of argument values,
 /// incorporating info about the formals into this state.
 void
 CCState::AnalyzeFormalArguments(const SmallVectorImpl<ISD::InputArg> &Ins,
                                 CCAssignFn Fn) {
   unsigned NumArgs = Ins.size();

   for (unsigned i = 0; i != NumArgs; ++i) {
@@ (213 lines not shown) @@
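
The notion of a shadow-allocated register (reserved in the bitmap but never recorded as a location) can be modeled without any LLVM types. This is a minimal sketch under simplifying assumptions: `RegStateSketch` and its methods are hypothetical, registers are plain integers, and register aliases, which the real code walks with `MCRegAliasIterator`, are ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical miniature of CCState: a bitmap of allocated registers plus the
// list of registers that were actually recorded as argument locations.
class RegStateSketch {
  std::vector<std::uint32_t> UsedRegs; // One bit per register.
  std::vector<unsigned> LocRegs;       // Registers that have a location.

  void markAllocated(unsigned Reg) { UsedRegs[Reg / 32] |= 1u << (Reg & 31); }

public:
  explicit RegStateSketch(unsigned NumRegs) : UsedRegs((NumRegs + 31) / 32) {}

  bool isAllocated(unsigned Reg) const {
    return (UsedRegs[Reg / 32] & (1u << (Reg & 31))) != 0;
  }

  // Reserve a register without recording a location (a shadow allocation).
  void allocateShadow(unsigned Reg) { markAllocated(Reg); }

  // Reserve a register and record it as an argument location.
  void allocateWithLoc(unsigned Reg) {
    markAllocated(Reg);
    LocRegs.push_back(Reg);
  }

  // True if Reg is reserved but no recorded location refers to it.
  bool isShadowAllocated(unsigned Reg) const {
    if (!isAllocated(Reg))
      return false;
    for (unsigned R : LocRegs)
      if (R == Reg)
        return false;
    return true;
  }
};
```

The lookup scans every recorded location, so it is linear in the number of locations per query. That is usually acceptable for argument lists, which tend to be short.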

llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

@@ (7,726 lines not shown) @@ for (unsigned Value = 0, NumValues = ValueVTs.size(); Value != NumValues;
                         Args[i].Node.getResNo() + Value);
       ISD::ArgFlagsTy Flags;
       unsigned OriginalAlignment = DL.getABITypeAlignment(ArgTy);

       if (Args[i].isZExt)
         Flags.setZExt();
       if (Args[i].isSExt)
         Flags.setSExt();
-      if (Args[i].isInReg)
+      if (Args[i].isInReg) {
+        // If we are using vectorcall calling convention, a structure that is
+        // passed InReg - is surely an HVA
+        if (CLI.CallConv == CallingConv::X86_VectorCall &&
+            isa<StructType>(FinalType)) {
+          // The first value of a structure is marked
+          if (0 == Value)
+            Flags.setHvaStart();
+          Flags.setHva();
+        }
+        // Set InReg Flag
         Flags.setInReg();
+      }
       if (Args[i].isSRet)
         Flags.setSRet();
       if (Args[i].isSwiftSelf)
         Flags.setSwiftSelf();
       if (Args[i].isSwiftError)
         Flags.setSwiftError();
       if (Args[i].isByVal)
         Flags.setByVal();
@@ (269 lines not shown) @@ for (unsigned Value = 0, NumValues = ValueVTs.size();
       Type *ArgTy = VT.getTypeForEVT(*DAG.getContext());
       ISD::ArgFlagsTy Flags;
       unsigned OriginalAlignment = DL.getABITypeAlignment(ArgTy);

       if (F.getAttributes().hasAttribute(Idx, Attribute::ZExt))
         Flags.setZExt();
       if (F.getAttributes().hasAttribute(Idx, Attribute::SExt))
         Flags.setSExt();
-      if (F.getAttributes().hasAttribute(Idx, Attribute::InReg))
+      if (F.getAttributes().hasAttribute(Idx, Attribute::InReg)) {
+        // If we are using vectorcall calling convention, a structure that is
+        // passed InReg - is surely an HVA
+        if (F.getCallingConv() == CallingConv::X86_VectorCall &&
+            isa<StructType>(I->getType())) {
+          // The first value of a structure is marked
+          if (0 == Value)
+            Flags.setHvaStart();
+          Flags.setHva();
+        }
+        // Set InReg Flag
         Flags.setInReg();
+      }
       if (F.getAttributes().hasAttribute(Idx, Attribute::StructRet))
         Flags.setSRet();
       if (F.getAttributes().hasAttribute(Idx, Attribute::SwiftSelf))
         Flags.setSwiftSelf();
       if (F.getAttributes().hasAttribute(Idx, Attribute::SwiftError))
         Flags.setSwiftError();
       if (F.getAttributes().hasAttribute(Idx, Attribute::ByVal))
         Flags.setByVal();
@@ (1,277 lines not shown) @@

llvm/trunk/lib/Target/X86/X86CallingConv.h

@@ (18 lines not shown) @@
 #include "llvm/CodeGen/CallingConvLower.h"
 #include "llvm/IR/CallingConv.h"

 namespace llvm {

 /// When regcall calling convention compiled to 32 bit arch, special treatment
 /// is required for 64 bit masks.
 /// The value should be assigned to two GPRs.
-/// @return true if registers were allocated and false otherwise
+/// \return true if registers were allocated and false otherwise.
 bool CC_X86_32_RegCall_Assign2Regs(unsigned &ValNo, MVT &ValVT, MVT &LocVT,
                                    CCValAssign::LocInfo &LocInfo,
                                    ISD::ArgFlagsTy &ArgFlags, CCState &State);

-inline bool CC_X86_32_VectorCallIndirect(unsigned &ValNo, MVT &ValVT,
-                                         MVT &LocVT,
-                                         CCValAssign::LocInfo &LocInfo,
-                                         ISD::ArgFlagsTy &ArgFlags,
-                                         CCState &State) {
-  // Similar to CCPassIndirect, with the addition of inreg.
-  LocVT = MVT::i32;
-  LocInfo = CCValAssign::Indirect;
-  ArgFlags.setInReg();
-  return false; // Continue the search, but now for i32.
-}
+/// Vectorcall calling convention has special handling for vector types or
+/// HVA for 64 bit arch.
+/// For HVAs shadow registers might be allocated on the first pass
+/// and actual XMM registers are allocated on the second pass.
+/// For vector types, actual XMM registers are allocated on the first pass.
+/// \return true if registers were allocated and false otherwise.
+bool CC_X86_64_VectorCall(unsigned &ValNo, MVT &ValVT, MVT &LocVT,
+                          CCValAssign::LocInfo &LocInfo,
+                          ISD::ArgFlagsTy &ArgFlags, CCState &State);
+
+/// Vectorcall calling convention has special handling for vector types or
+/// HVA for 32 bit arch.
+/// For HVAs actual XMM registers are allocated on the second pass.
+/// For vector types, actual XMM registers are allocated on the first pass.
+/// \return true if registers were allocated and false otherwise.
+bool CC_X86_32_VectorCall(unsigned &ValNo, MVT &ValVT, MVT &LocVT,
+                          CCValAssign::LocInfo &LocInfo,
+                          ISD::ArgFlagsTy &ArgFlags, CCState &State);

 inline bool CC_X86_AnyReg_Error(unsigned &, MVT &, MVT &,
                                 CCValAssign::LocInfo &, ISD::ArgFlagsTy &,
                                 CCState &) {
   llvm_unreachable("The AnyReg calling convention is only supported by the " \
                    "stackmap and patchpoint intrinsics.");
   // gracefully fallback to X86 C calling convention on Release builds.
   return false;
@@ (64 lines not shown) @@

llvm/trunk/lib/Target/X86/X86CallingConv.cpp

//=== X86CallingConv.cpp - X86 Custom Calling Convention Impl -- C++ --===//		//=== X86CallingConv.cpp - X86 Custom Calling Convention Impl -- C++ --===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This file contains the implementation of custom routines for the X86		// This file contains the implementation of custom routines for the X86
// Calling Convention that aren't done by tablegen.		// Calling Convention that aren't done by tablegen.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "MCTargetDesc/X86MCTargetDesc.h"		#include "MCTargetDesc/X86MCTargetDesc.h"
		#include "X86Subtarget.h"
#include "llvm/CodeGen/CallingConvLower.h"		#include "llvm/CodeGen/CallingConvLower.h"
#include "llvm/IR/CallingConv.h"		#include "llvm/IR/CallingConv.h"

namespace llvm {		namespace llvm {

bool CC_X86_32_RegCall_Assign2Regs(unsigned &ValNo, MVT &ValVT, MVT &LocVT,		bool CC_X86_32_RegCall_Assign2Regs(unsigned &ValNo, MVT &ValVT, MVT &LocVT,
CCValAssign::LocInfo &LocInfo,		CCValAssign::LocInfo &LocInfo,
ISD::ArgFlagsTy &ArgFlags, CCState &State) {		ISD::ArgFlagsTy &ArgFlags, CCState &State) {
Show All 10 Lines	for (auto Reg : RegList) {
if (!State.isAllocated(Reg))		if (!State.isAllocated(Reg))
AvailableRegs.push_back(Reg);		AvailableRegs.push_back(Reg);
}		}

const size_t RequiredGprsUponSplit = 2;		const size_t RequiredGprsUponSplit = 2;
if (AvailableRegs.size() < RequiredGprsUponSplit)		if (AvailableRegs.size() < RequiredGprsUponSplit)
return false; // Not enough free registers - continue the search.		return false; // Not enough free registers - continue the search.

// Allocating the available registers		// Allocating the available registers.
for (unsigned I = 0; I < RequiredGprsUponSplit; I++) {		for (unsigned I = 0; I < RequiredGprsUponSplit; I++) {

// Marking the register as located		// Marking the register as located.
unsigned Reg = State.AllocateReg(AvailableRegs[I]);		unsigned Reg = State.AllocateReg(AvailableRegs[I]);

// Since we previously made sure that 2 registers are available		// Since we previously made sure that 2 registers are available
// we expect that a real register number will be returned		// we expect that a real register number will be returned.
assert(Reg && "Expecting a register will be available");		assert(Reg && "Expecting a register will be available");

// Assign the value to the allocated register		// Assign the value to the allocated register
State.addLoc(CCValAssign::getCustomReg(ValNo, ValVT, Reg, LocVT, LocInfo));		State.addLoc(CCValAssign::getCustomReg(ValNo, ValVT, Reg, LocVT, LocInfo));
}		}

// Successful in allocating regsiters - stop scanning next rules.		// Successful in allocating regsiters - stop scanning next rules.
return true;		return true;
}		}

		static ArrayRef<MCPhysReg> CC_X86_VectorCallGetSSEs(const MVT &ValVT) {
		if (ValVT.is512BitVector()) {
		static const MCPhysReg RegListZMM[] = {X86::ZMM0, X86::ZMM1, X86::ZMM2,
		X86::ZMM3, X86::ZMM4, X86::ZMM5};
		return RegListZMM;
		}

		if (ValVT.is256BitVector()) {
		static const MCPhysReg RegListYMM[] = {X86::YMM0, X86::YMM1, X86::YMM2,
		X86::YMM3, X86::YMM4, X86::YMM5};
		return RegListYMM;
		}

		static const MCPhysReg RegListXMM[] = {X86::XMM0, X86::XMM1, X86::XMM2,
		X86::XMM3, X86::XMM4, X86::XMM5};
		return RegListXMM;
		}

		static ArrayRef<MCPhysReg> CC_X86_64_VectorCallGetGPRs() {
		static const MCPhysReg RegListGPR[] = {X86::RCX, X86::RDX, X86::R8, X86::R9};
		return RegListGPR;
		}

		static bool CC_X86_VectorCallAssignRegister(unsigned &ValNo, MVT &ValVT,
		MVT &LocVT,
		CCValAssign::LocInfo &LocInfo,
		ISD::ArgFlagsTy &ArgFlags,
		CCState &State) {

		ArrayRef<MCPhysReg> RegList = CC_X86_VectorCallGetSSEs(ValVT);
		bool Is64bit = static_cast<const X86Subtarget &>(
		State.getMachineFunction().getSubtarget())
		.is64Bit();

		for (auto Reg : RegList) {
		// If the register is not marked as allocated - assign to it.
		if (!State.isAllocated(Reg)) {
		unsigned AssigedReg = State.AllocateReg(Reg);
		assert(AssigedReg == Reg && "Expecting a valid register allocation");
		State.addLoc(
		CCValAssign::getReg(ValNo, ValVT, AssigedReg, LocVT, LocInfo));
		return true;
		}
		// If the register is marked as shadow allocated - assign to it.
		if (Is64bit && State.IsShadowAllocatedReg(Reg)) {
		State.addLoc(CCValAssign::getReg(ValNo, ValVT, Reg, LocVT, LocInfo));
		return true;
		}
		}

		llvm_unreachable("Clang should ensure that hva marked vectors will have "
		"an available register.");
		return false;
		}

		bool CC_X86_64_VectorCall(unsigned &ValNo, MVT &ValVT, MVT &LocVT,
		CCValAssign::LocInfo &LocInfo,
		ISD::ArgFlagsTy &ArgFlags, CCState &State) {
		// On the second pass, go through the HVAs only.
		if (ArgFlags.isSecArgPass()) {
		if (ArgFlags.isHva())
		return CC_X86_VectorCallAssignRegister(ValNo, ValVT, LocVT, LocInfo,
		ArgFlags, State);
		return true;
		}

		// Process only vector types as defined by vectorcall spec:
		// "A vector type is either a floating-point type, for example,
		// a float or double, or an SIMD vector type, for example, __m128 or __m256".
		if (!(ValVT.isFloatingPoint() ||
		(ValVT.isVector() && ValVT.getSizeInBits() >= 128))) {
		// If R9 was already assigned it means that we are after the fourth element
		// and because this is not an HVA / Vector type, we need to allocate
		// shadow XMM register.
		if (State.isAllocated(X86::R9)) {
		// Assign shadow XMM register.
		(void)State.AllocateReg(CC_X86_VectorCallGetSSEs(ValVT));
		}

		return false;
		}

		if (!ArgFlags.isHva() || ArgFlags.isHvaStart()) {
		// Assign shadow GPR register.
		(void)State.AllocateReg(CC_X86_64_VectorCallGetGPRs());

		// Assign XMM register - (shadow for HVA and non-shadow for non HVA).
		if (unsigned Reg = State.AllocateReg(CC_X86_VectorCallGetSSEs(ValVT))) {
		// In Vectorcall Calling convention, additional shadow stack can be
		// created on top of the basic 32 bytes of win64.
		// It can happen if the fifth or sixth argument is vector type or HVA.
		// At that case for each argument a shadow stack of 8 bytes is allocated.
		if (Reg == X86::XMM4 || Reg == X86::XMM5)
		State.AllocateStack(8, 8);

		if (!ArgFlags.isHva()) {
		State.addLoc(CCValAssign::getReg(ValNo, ValVT, Reg, LocVT, LocInfo));
		return true; // Allocated a register - Stop the search.
		}
		}
		}

		// If this is an HVA - Stop the search,
		// otherwise continue the search.
		return ArgFlags.isHva();
		}

		bool CC_X86_32_VectorCall(unsigned &ValNo, MVT &ValVT, MVT &LocVT,
		CCValAssign::LocInfo &LocInfo,
		ISD::ArgFlagsTy &ArgFlags, CCState &State) {
		// On the second pass, go through the HVAs only.
		if (ArgFlags.isSecArgPass()) {
		if (ArgFlags.isHva())
		return CC_X86_VectorCallAssignRegister(ValNo, ValVT, LocVT, LocInfo,
		ArgFlags, State);
		return true;
		}

		// Process only vector types as defined by vectorcall spec:
		// "A vector type is either a floating point type, for example,
		// a float or double, or an SIMD vector type, for example, __m128 or __m256".
		if (!(ValVT.isFloatingPoint() ||
		(ValVT.isVector() && ValVT.getSizeInBits() >= 128))) {
		return false;
		}

		if (ArgFlags.isHva())
		return true; // If this is an HVA - Stop the search.

		// Assign XMM register.
		if (unsigned Reg = State.AllocateReg(CC_X86_VectorCallGetSSEs(ValVT))) {
		State.addLoc(CCValAssign::getReg(ValNo, ValVT, Reg, LocVT, LocInfo));
		return true;
		}

		// In case we did not find an available XMM register for a vector -
		// pass it indirectly.
		// It is similar to CCPassIndirect, with the addition of inreg.
		if (!ValVT.isFloatingPoint()) {
		LocVT = MVT::i32;
		LocInfo = CCValAssign::Indirect;
		ArgFlags.setInReg();
		}

		return false; // No register was assigned - Continue the search.
		}

} // End llvm namespace		} // End llvm namespace
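
The two-pass scheme in the custom handlers above can be hard to follow from the diff alone. The following is a minimal standalone sketch of the x64 register order it aims for, assuming a simplified model: plain vector arguments claim the XMM register that matches their argument position, and each HVA then takes the lowest registers still unused. `Arg`, `ArgKind`, `Assignment`, and `assignX64VectorcallRegs` are hypothetical names invented for this illustration and are not LLVM APIs. Shadow GPRs, shadow stack slots, YMM/ZMM registers, and the 32-bit rules are left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified argument description; not an LLVM type.
enum class ArgKind { Integer, Vector, Hva };

struct Arg {
  ArgKind Kind;
  unsigned NumElts; // Number of vector members; only meaningful for Hva.
};

// For each argument, the XMM register indices it received (empty if none).
using Assignment = std::vector<std::vector<unsigned>>;

constexpr unsigned NumVectorRegs = 6; // Models XMM0-XMM5.

Assignment assignX64VectorcallRegs(const std::vector<Arg> &Args) {
  Assignment Result(Args.size());
  std::array<bool, NumVectorRegs> Used{};

  // First pass: a plain vector argument in one of the first six positions
  // claims the register with the same index as its position.
  for (std::size_t I = 0; I < Args.size() && I < NumVectorRegs; ++I) {
    if (Args[I].Kind == ArgKind::Vector) {
      Used[I] = true;
      Result[I].push_back(static_cast<unsigned>(I));
    }
  }

  // Second pass: each HVA takes the lowest registers that are still unused,
  // but only if there are enough of them for the whole aggregate.
  for (std::size_t I = 0; I < Args.size(); ++I) {
    if (Args[I].Kind != ArgKind::Hva)
      continue;
    std::vector<unsigned> Free;
    for (unsigned R = 0; R < NumVectorRegs && Free.size() < Args[I].NumElts; ++R)
      if (!Used[R])
        Free.push_back(R);
    if (Free.size() < Args[I].NumElts)
      continue; // Not enough registers: this model leaves the HVA unassigned.
    assert(Free.size() == Args[I].NumElts && "collected exactly one reg per member");
    for (unsigned R : Free)
      Used[R] = true;
    Result[I] = Free;
  }
  return Result;
}
```

Under this model, the x64 signature of `test_mixed_5` in vectorcall.ll (an HVA3, a pointer, a `<4 x float>`, and an HVA2) gives the vector XMM2, the HVA3 XMM0, XMM1, and XMM3, and the HVA2 XMM4 and XMM5. That is consistent with the `movaps %xmm5` check in that test. The real handlers reach the same order through shadow register allocation rather than an explicit position index.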

llvm/trunk/lib/Target/X86/X86CallingConv.td

Show First 20 Lines • Show All 302 Lines • ▼ Show 20 Lines
def RetCC_X86_32_HiPE : CallingConv<[		def RetCC_X86_32_HiPE : CallingConv<[
// Promote all types to i32		// Promote all types to i32
CCIfType<[i8, i16], CCPromoteToType<i32>>,		CCIfType<[i8, i16], CCPromoteToType<i32>>,

// Return: HP, P, VAL1, VAL2		// Return: HP, P, VAL1, VAL2
CCIfType<[i32], CCAssignToReg<[ESI, EBP, EAX, EDX]>>		CCIfType<[i32], CCAssignToReg<[ESI, EBP, EAX, EDX]>>
]>;		]>;

// X86-32 HiPE return-value convention.		// X86-32 Vectorcall return-value convention.
def RetCC_X86_32_VectorCall : CallingConv<[		def RetCC_X86_32_VectorCall : CallingConv<[
// Vector types are returned in XMM0,XMM1,XMMM2 and XMM3.		// Floating Point types are returned in XMM0,XMM1,XMMM2 and XMM3.
CCIfType<[f32, f64, v16i8, v8i16, v4i32, v2i64, v4f32, v2f64],		CCIfType<[f32, f64, f128],
CCAssignToReg<[XMM0,XMM1,XMM2,XMM3]>>,		CCAssignToReg<[XMM0,XMM1,XMM2,XMM3]>>,

// 256-bit FP vectors
CCIfType<[v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
CCAssignToReg<[YMM0,YMM1,YMM2,YMM3]>>,

// 512-bit FP vectors
CCIfType<[v64i8, v32i16, v16i32, v8i64, v16f32, v8f64],
CCAssignToReg<[ZMM0,ZMM1,ZMM2,ZMM3]>>,

// Return integers in the standard way.		// Return integers in the standard way.
CCDelegateTo<RetCC_X86Common>		CCDelegateTo<RetCC_X86Common>
]>;		]>;

// X86-64 C return-value convention.		// X86-64 C return-value convention.
def RetCC_X86_64_C : CallingConv<[		def RetCC_X86_64_C : CallingConv<[
// The X86-64 calling convention always returns FP values in XMM0.		// The X86-64 calling convention always returns FP values in XMM0.
CCIfType<[f32], CCAssignToReg<[XMM0, XMM1]>>,		CCIfType<[f32], CCAssignToReg<[XMM0, XMM1]>>,
Show All 12 Lines
def RetCC_X86_Win64_C : CallingConv<[		def RetCC_X86_Win64_C : CallingConv<[
// The X86-Win64 calling convention always returns __m64 values in RAX.		// The X86-Win64 calling convention always returns __m64 values in RAX.
CCIfType<[x86mmx], CCBitConvertToType<i64>>,		CCIfType<[x86mmx], CCBitConvertToType<i64>>,

// Otherwise, everything is the same as 'normal' X86-64 C CC.		// Otherwise, everything is the same as 'normal' X86-64 C CC.
CCDelegateTo<RetCC_X86_64_C>		CCDelegateTo<RetCC_X86_64_C>
]>;		]>;

		// X86-64 vectorcall return-value convention.
		def RetCC_X86_64_Vectorcall : CallingConv<[
		// Vectorcall calling convention always returns FP values in XMMs.
		CCIfType<[f32, f64, f128],
		CCAssignToReg<[XMM0, XMM1, XMM2, XMM3]>>,

		// Otherwise, everything is the same as Windows X86-64 C CC.
		CCDelegateTo<RetCC_X86_Win64_C>
		]>;

// X86-64 HiPE return-value convention.		// X86-64 HiPE return-value convention.
def RetCC_X86_64_HiPE : CallingConv<[		def RetCC_X86_64_HiPE : CallingConv<[
// Promote all types to i64		// Promote all types to i64
CCIfType<[i8, i16, i32], CCPromoteToType<i64>>,		CCIfType<[i8, i16, i32], CCPromoteToType<i64>>,

// Return: HP, P, VAL1, VAL2		// Return: HP, P, VAL1, VAL2
CCIfType<[i64], CCAssignToReg<[R15, RBP, RAX, RDX]>>		CCIfType<[i64], CCAssignToReg<[R15, RBP, RAX, RDX]>>
]>;		]>;
▲ Show 20 Lines • Show All 81 Lines • ▼ Show 20 Lines	def RetCC_X86_64 : CallingConv<[

// Handle Swift calls.		// Handle Swift calls.
CCIfCC<"CallingConv::Swift", CCDelegateTo<RetCC_X86_64_Swift>>,		CCIfCC<"CallingConv::Swift", CCDelegateTo<RetCC_X86_64_Swift>>,

// Handle explicit CC selection		// Handle explicit CC selection
CCIfCC<"CallingConv::X86_64_Win64", CCDelegateTo<RetCC_X86_Win64_C>>,		CCIfCC<"CallingConv::X86_64_Win64", CCDelegateTo<RetCC_X86_Win64_C>>,
CCIfCC<"CallingConv::X86_64_SysV", CCDelegateTo<RetCC_X86_64_C>>,		CCIfCC<"CallingConv::X86_64_SysV", CCDelegateTo<RetCC_X86_64_C>>,

		// Handle Vectorcall CC
		CCIfCC<"CallingConv::X86_VectorCall", CCDelegateTo<RetCC_X86_64_Vectorcall>>,

// Handle HHVM calls.		// Handle HHVM calls.
CCIfCC<"CallingConv::HHVM", CCDelegateTo<RetCC_X86_64_HHVM>>,		CCIfCC<"CallingConv::HHVM", CCDelegateTo<RetCC_X86_64_HHVM>>,

CCIfCC<"CallingConv::X86_RegCall",		CCIfCC<"CallingConv::X86_RegCall",
CCIfSubtarget<"isTargetWin64()",		CCIfSubtarget<"isTargetWin64()",
CCDelegateTo<RetCC_X86_Win64_RegCall>>>,		CCDelegateTo<RetCC_X86_Win64_RegCall>>>,
CCIfCC<"CallingConv::X86_RegCall", CCDelegateTo<RetCC_X86_SysV64_RegCall>>,		CCIfCC<"CallingConv::X86_RegCall", CCDelegateTo<RetCC_X86_SysV64_RegCall>>,

▲ Show 20 Lines • Show All 163 Lines • ▼ Show 20 Lines	def CC_X86_Win64_C : CallingConv<[
CCIfType<[i32, i64, f32, f64], CCAssignToStack<8, 8>>,		CCIfType<[i32, i64, f32, f64], CCAssignToStack<8, 8>>,

// Long doubles get stack slots whose size and alignment depends on the		// Long doubles get stack slots whose size and alignment depends on the
// subtarget.		// subtarget.
CCIfType<[f80], CCAssignToStack<0, 0>>		CCIfType<[f80], CCAssignToStack<0, 0>>
]>;		]>;

def CC_X86_Win64_VectorCall : CallingConv<[		def CC_X86_Win64_VectorCall : CallingConv<[
// The first 6 floating point and vector types of 128 bits or less use		CCCustom<"CC_X86_64_VectorCall">,
// XMM0-XMM5.
CCIfType<[f32, f64, v16i8, v8i16, v4i32, v2i64, v4f32, v2f64],
CCAssignToReg<[XMM0, XMM1, XMM2, XMM3, XMM4, XMM5]>>,

// 256-bit vectors use YMM registers.
CCIfType<[v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
CCAssignToReg<[YMM0, YMM1, YMM2, YMM3, YMM4, YMM5]>>,

// 512-bit vectors use ZMM registers.
CCIfType<[v64i8, v32i16, v16i32, v8i64, v16f32, v8f64],
CCAssignToReg<[ZMM0, ZMM1, ZMM2, ZMM3, ZMM4, ZMM5]>>,

// Delegate to fastcall to handle integer types.		// Delegate to fastcall to handle integer types.
CCDelegateTo<CC_X86_Win64_C>		CCDelegateTo<CC_X86_Win64_C>
]>;		]>;


def CC_X86_64_GHC : CallingConv<[		def CC_X86_64_GHC : CallingConv<[
// Promote i8/i16/i32 arguments to i64.		// Promote i8/i16/i32 arguments to i64.
▲ Show 20 Lines • Show All 193 Lines • ▼ Show 20 Lines	def CC_X86_32_FastCall : CallingConv<[

// The first 2 integer arguments are passed in ECX/EDX		// The first 2 integer arguments are passed in ECX/EDX
CCIfInReg<CCIfType<[i32], CCAssignToReg<[ECX, EDX]>>>,		CCIfInReg<CCIfType<[i32], CCAssignToReg<[ECX, EDX]>>>,

// Otherwise, same as everything else.		// Otherwise, same as everything else.
CCDelegateTo<CC_X86_32_Common>		CCDelegateTo<CC_X86_32_Common>
]>;		]>;

def CC_X86_32_VectorCall : CallingConv<[		def CC_X86_Win32_VectorCall : CallingConv<[
// The first 6 floating point and vector types of 128 bits or less use		// Pass floating point in XMMs
// XMM0-XMM5.		CCCustom<"CC_X86_32_VectorCall">,
CCIfType<[f32, f64, v16i8, v8i16, v4i32, v2i64, v4f32, v2f64],
CCAssignToReg<[XMM0, XMM1, XMM2, XMM3, XMM4, XMM5]>>,

// 256-bit vectors use YMM registers.
CCIfType<[v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
CCAssignToReg<[YMM0, YMM1, YMM2, YMM3, YMM4, YMM5]>>,

// 512-bit vectors use ZMM registers.
CCIfType<[v64i8, v32i16, v16i32, v8i64, v16f32, v8f64],
CCAssignToReg<[ZMM0, ZMM1, ZMM2, ZMM3, ZMM4, ZMM5]>>,

// Otherwise, pass it indirectly.
CCIfType<[v16i8, v8i16, v4i32, v2i64, v4f32, v2f64,
v32i8, v16i16, v8i32, v4i64, v8f32, v4f64,
v64i8, v32i16, v16i32, v8i64, v16f32, v8f64],
CCCustom<"CC_X86_32_VectorCallIndirect">>,

// Delegate to fastcall to handle integer types.		// Delegate to fastcall to handle integer types.
CCDelegateTo<CC_X86_32_FastCall>		CCDelegateTo<CC_X86_32_FastCall>
]>;		]>;

def CC_X86_32_ThisCall_Common : CallingConv<[		def CC_X86_32_ThisCall_Common : CallingConv<[
// The first integer argument is passed in ECX		// The first integer argument is passed in ECX
CCIfType<[i32], CCAssignToReg<[ECX]>>,		CCIfType<[i32], CCAssignToReg<[ECX]>>,
▲ Show 20 Lines • Show All 117 Lines • ▼ Show 20 Lines

// This is the root argument convention for the X86-32 backend.		// This is the root argument convention for the X86-32 backend.
def CC_X86_32 : CallingConv<[		def CC_X86_32 : CallingConv<[
// X86_INTR calling convention is valid in MCU target and should override the		// X86_INTR calling convention is valid in MCU target and should override the
// MCU calling convention. Thus, this should be checked before isTargetMCU().		// MCU calling convention. Thus, this should be checked before isTargetMCU().
CCIfCC<"CallingConv::X86_INTR", CCDelegateTo<CC_X86_32_Intr>>,		CCIfCC<"CallingConv::X86_INTR", CCDelegateTo<CC_X86_32_Intr>>,
CCIfSubtarget<"isTargetMCU()", CCDelegateTo<CC_X86_32_MCU>>,		CCIfSubtarget<"isTargetMCU()", CCDelegateTo<CC_X86_32_MCU>>,
CCIfCC<"CallingConv::X86_FastCall", CCDelegateTo<CC_X86_32_FastCall>>,		CCIfCC<"CallingConv::X86_FastCall", CCDelegateTo<CC_X86_32_FastCall>>,
CCIfCC<"CallingConv::X86_VectorCall", CCDelegateTo<CC_X86_32_VectorCall>>,		CCIfCC<"CallingConv::X86_VectorCall", CCDelegateTo<CC_X86_Win32_VectorCall>>,
CCIfCC<"CallingConv::X86_ThisCall", CCDelegateTo<CC_X86_32_ThisCall>>,		CCIfCC<"CallingConv::X86_ThisCall", CCDelegateTo<CC_X86_32_ThisCall>>,
CCIfCC<"CallingConv::Fast", CCDelegateTo<CC_X86_32_FastCC>>,		CCIfCC<"CallingConv::Fast", CCDelegateTo<CC_X86_32_FastCC>>,
CCIfCC<"CallingConv::GHC", CCDelegateTo<CC_X86_32_GHC>>,		CCIfCC<"CallingConv::GHC", CCDelegateTo<CC_X86_32_GHC>>,
CCIfCC<"CallingConv::HiPE", CCDelegateTo<CC_X86_32_HiPE>>,		CCIfCC<"CallingConv::HiPE", CCDelegateTo<CC_X86_32_HiPE>>,
CCIfCC<"CallingConv::X86_RegCall", CCDelegateTo<CC_X86_32_RegCall>>,		CCIfCC<"CallingConv::X86_RegCall", CCDelegateTo<CC_X86_32_RegCall>>,

// Otherwise, drop to normal X86-32 CC		// Otherwise, drop to normal X86-32 CC
CCDelegateTo<CC_X86_32_C>		CCDelegateTo<CC_X86_32_C>
▲ Show 20 Lines • Show All 133 Lines • Show Last 20 Lines

llvm/trunk/lib/Target/X86/X86ISelLowering.cpp

Show All 11 Lines
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "X86ISelLowering.h"		#include "X86ISelLowering.h"
#include "Utils/X86ShuffleDecode.h"		#include "Utils/X86ShuffleDecode.h"
#include "X86CallingConv.h"		#include "X86CallingConv.h"
#include "X86FrameLowering.h"		#include "X86FrameLowering.h"
#include "X86InstrBuilder.h"		#include "X86InstrBuilder.h"
		#include "X86IntrinsicsInfo.h"
#include "X86MachineFunctionInfo.h"		#include "X86MachineFunctionInfo.h"
#include "X86ShuffleDecodeConstantPool.h"		#include "X86ShuffleDecodeConstantPool.h"
#include "X86TargetMachine.h"		#include "X86TargetMachine.h"
#include "X86TargetObjectFile.h"		#include "X86TargetObjectFile.h"
#include "llvm/ADT/SmallBitVector.h"		#include "llvm/ADT/SmallBitVector.h"
#include "llvm/ADT/SmallSet.h"		#include "llvm/ADT/SmallSet.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/ADT/StringExtras.h"		#include "llvm/ADT/StringExtras.h"
Show All 20 Lines
#include "llvm/MC/MCContext.h"		#include "llvm/MC/MCContext.h"
#include "llvm/MC/MCExpr.h"		#include "llvm/MC/MCExpr.h"
#include "llvm/MC/MCSymbol.h"		#include "llvm/MC/MCSymbol.h"
#include "llvm/Support/CommandLine.h"		#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/MathExtras.h"		#include "llvm/Support/MathExtras.h"
#include "llvm/Target/TargetOptions.h"		#include "llvm/Target/TargetOptions.h"
#include "X86IntrinsicsInfo.h"		#include <algorithm>
#include <bitset>		#include <bitset>
#include <numeric>
#include <cctype>		#include <cctype>
		#include <numeric>
using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "x86-isel"		#define DEBUG_TYPE "x86-isel"

STATISTIC(NumTailCalls, "Number of tail calls");		STATISTIC(NumTailCalls, "Number of tail calls");

static cl::opt<bool> ExperimentalVectorWideningLegalization(		static cl::opt<bool> ExperimentalVectorWideningLegalization(
"x86-experimental-vector-widening-legalization", cl::init(false),		"x86-experimental-vector-widening-legalization", cl::init(false),
▲ Show 20 Lines • Show All 2,708 Lines • ▼ Show 20 Lines	static ArrayRef<MCPhysReg> get64BitArgumentXMMs(MachineFunction &MF,

static const MCPhysReg XMMArgRegs64Bit[] = {		static const MCPhysReg XMMArgRegs64Bit[] = {
X86::XMM0, X86::XMM1, X86::XMM2, X86::XMM3,		X86::XMM0, X86::XMM1, X86::XMM2, X86::XMM3,
X86::XMM4, X86::XMM5, X86::XMM6, X86::XMM7		X86::XMM4, X86::XMM5, X86::XMM6, X86::XMM7
};		};
return makeArrayRef(std::begin(XMMArgRegs64Bit), std::end(XMMArgRegs64Bit));		return makeArrayRef(std::begin(XMMArgRegs64Bit), std::end(XMMArgRegs64Bit));
}		}

		static bool isSortedByValueNo(const SmallVectorImpl<CCValAssign> &ArgLocs) {
		return std::is_sorted(ArgLocs.begin(), ArgLocs.end(),
		[](const CCValAssign &A, const CCValAssign &B) -> bool {
		return A.getValNo() < B.getValNo();
		});
		}

SDValue X86TargetLowering::LowerFormalArguments(		SDValue X86TargetLowering::LowerFormalArguments(
SDValue Chain, CallingConv::ID CallConv, bool isVarArg,		SDValue Chain, CallingConv::ID CallConv, bool isVarArg,
const SmallVectorImpl<ISD::InputArg> &Ins, const SDLoc &dl,		const SmallVectorImpl<ISD::InputArg> &Ins, const SDLoc &dl,
SelectionDAG &DAG, SmallVectorImpl<SDValue> &InVals) const {		SelectionDAG &DAG, SmallVectorImpl<SDValue> &InVals) const {
MachineFunction &MF = DAG.getMachineFunction();		MachineFunction &MF = DAG.getMachineFunction();
X86MachineFunctionInfo *FuncInfo = MF.getInfo<X86MachineFunctionInfo>();		X86MachineFunctionInfo *FuncInfo = MF.getInfo<X86MachineFunctionInfo>();
const TargetFrameLowering &TFI = *Subtarget.getFrameLowering();		const TargetFrameLowering &TFI = *Subtarget.getFrameLowering();

Show All 18 Lines	if (CallConv == CallingConv::X86_INTR) {
if (!isLegal)		if (!isLegal)
report_fatal_error("X86 interrupts may take one or two arguments");		report_fatal_error("X86 interrupts may take one or two arguments");
}		}

// Assign locations to all of the incoming arguments.		// Assign locations to all of the incoming arguments.
SmallVector<CCValAssign, 16> ArgLocs;		SmallVector<CCValAssign, 16> ArgLocs;
CCState CCInfo(CallConv, isVarArg, MF, ArgLocs, *DAG.getContext());		CCState CCInfo(CallConv, isVarArg, MF, ArgLocs, *DAG.getContext());

// Allocate shadow area for Win64		// Allocate shadow area for Win64.
if (IsWin64)		if (IsWin64)
CCInfo.AllocateStack(32, 8);		CCInfo.AllocateStack(32, 8);

CCInfo.AnalyzeFormalArguments(Ins, CC_X86);		CCInfo.AnalyzeArguments(Ins, CC_X86);

		// In vectorcall calling convention a second pass is required for the HVA
		// types.
		if (CallingConv::X86_VectorCall == CallConv) {
		CCInfo.AnalyzeArgumentsSecondPass(Ins, CC_X86);
		}

		// The next loop assumes that the locations are in the same order of the
		// input arguments.
		assert(isSortedByValueNo(ArgLocs) &&
		"Argument Location list must be sorted before lowering");

SDValue ArgValue;		SDValue ArgValue;
for (unsigned I = 0, InsIndex = 0, E = ArgLocs.size(); I != E;		for (unsigned I = 0, InsIndex = 0, E = ArgLocs.size(); I != E;
++I, ++InsIndex) {		++I, ++InsIndex) {
assert(InsIndex < Ins.size() && "Invalid Ins index");		assert(InsIndex < Ins.size() && "Invalid Ins index");
CCValAssign &VA = ArgLocs[I];		CCValAssign &VA = ArgLocs[I];

if (VA.isRegLoc()) {		if (VA.isRegLoc()) {
▲ Show 20 Lines • Show All 427 Lines • ▼ Show 20 Lines	X86TargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,

assert(!(isVarArg && canGuaranteeTCO(CallConv)) &&		assert(!(isVarArg && canGuaranteeTCO(CallConv)) &&
"Var args not supported with calling convention fastcc, ghc or hipe");		"Var args not supported with calling convention fastcc, ghc or hipe");

// Analyze operands of the call, assigning locations to each operand.		// Analyze operands of the call, assigning locations to each operand.
SmallVector<CCValAssign, 16> ArgLocs;		SmallVector<CCValAssign, 16> ArgLocs;
CCState CCInfo(CallConv, isVarArg, MF, ArgLocs, *DAG.getContext());		CCState CCInfo(CallConv, isVarArg, MF, ArgLocs, *DAG.getContext());

// Allocate shadow area for Win64		// Allocate shadow area for Win64.
if (IsWin64)		if (IsWin64)
CCInfo.AllocateStack(32, 8);		CCInfo.AllocateStack(32, 8);

CCInfo.AnalyzeCallOperands(Outs, CC_X86);		CCInfo.AnalyzeArguments(Outs, CC_X86);

		// In vectorcall calling convention a second pass is required for the HVA
		// types.
		if (CallingConv::X86_VectorCall == CallConv) {
		CCInfo.AnalyzeArgumentsSecondPass(Outs, CC_X86);
		}

// Get a count of how many bytes are to be pushed on the stack.		// Get a count of how many bytes are to be pushed on the stack.
unsigned NumBytes = CCInfo.getAlignedCallFrameSize();		unsigned NumBytes = CCInfo.getAlignedCallFrameSize();
if (IsSibcall)		if (IsSibcall)
// This is a sibcall. The memory operands are available in caller's		// This is a sibcall. The memory operands are available in caller's
// own caller's stack.		// own caller's stack.
NumBytes = 0;		NumBytes = 0;
else if (MF.getTarget().Options.GuaranteedTailCallOpt &&		else if (MF.getTarget().Options.GuaranteedTailCallOpt &&
Show All 38 Lines	X86TargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
if (isTailCall && FPDiff)		if (isTailCall && FPDiff)
Chain = EmitTailCallLoadRetAddr(DAG, RetAddrFrIdx, Chain, isTailCall,		Chain = EmitTailCallLoadRetAddr(DAG, RetAddrFrIdx, Chain, isTailCall,
Is64Bit, FPDiff, dl);		Is64Bit, FPDiff, dl);

SmallVector<std::pair<unsigned, SDValue>, 8> RegsToPass;		SmallVector<std::pair<unsigned, SDValue>, 8> RegsToPass;
SmallVector<SDValue, 8> MemOpChains;		SmallVector<SDValue, 8> MemOpChains;
SDValue StackPtr;		SDValue StackPtr;

		// The next loop assumes that the locations are in the same order of the
		// input arguments.
		assert(isSortedByValueNo(ArgLocs) &&
		"Argument Location list must be sorted before lowering");

// Walk the register/memloc assignments, inserting copies/loads. In the case		// Walk the register/memloc assignments, inserting copies/loads. In the case
// of tail call optimization arguments are handle later.		// of tail call optimization arguments are handle later.
const X86RegisterInfo *RegInfo = Subtarget.getRegisterInfo();		const X86RegisterInfo *RegInfo = Subtarget.getRegisterInfo();
for (unsigned I = 0, OutIndex = 0, E = ArgLocs.size(); I != E;		for (unsigned I = 0, OutIndex = 0, E = ArgLocs.size(); I != E;
++I, ++OutIndex) {		++I, ++OutIndex) {
assert(OutIndex < Outs.size() && "Invalid Out index");		assert(OutIndex < Outs.size() && "Invalid Out index");
// Skip inalloca arguments, they have already been written.		// Skip inalloca arguments, they have already been written.
ISD::ArgFlagsTy Flags = Outs[OutIndex].Flags;		ISD::ArgFlagsTy Flags = Outs[OutIndex].Flags;
▲ Show 20 Lines • Show All 30,971 Lines • Show Last 20 Lines
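
The `isSortedByValueNo` asserts added above guard an ordering assumption: the lowering loops walk `ArgLocs` in step with `Ins` or `Outs`, so the locations produced by the second pass must end up interleaved with the first-pass locations by value number. How the real code restores that order is not shown in this part of the diff. The sketch below assumes a merge of two already-sorted lists purely for illustration. `Loc`, `isSortedByValNo`, and `mergeByValNo` are hypothetical stand-ins, not LLVM APIs.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical stand-in for CCValAssign: a value number and a register id.
struct Loc {
  unsigned ValNo;
  unsigned Reg;
};

static bool lessByValNo(const Loc &A, const Loc &B) {
  return A.ValNo < B.ValNo;
}

// Mirrors the shape of the check in the diff, on the simplified Loc type.
bool isSortedByValNo(const std::vector<Loc> &Locs) {
  return std::is_sorted(Locs.begin(), Locs.end(), lessByValNo);
}

// Merges first-pass and second-pass locations, each already sorted by value
// number, into one list that is sorted by value number.
std::vector<Loc> mergeByValNo(const std::vector<Loc> &FirstPass,
                              const std::vector<Loc> &SecondPass) {
  std::vector<Loc> Out;
  Out.reserve(FirstPass.size() + SecondPass.size());
  std::merge(FirstPass.begin(), FirstPass.end(), SecondPass.begin(),
             SecondPass.end(), std::back_inserter(Out), lessByValNo);
  return Out;
}
```

Simply appending the second-pass list after the first-pass list would fail the sortedness check whenever an HVA precedes a non-HVA argument, which is the situation the asserts are there to catch.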

llvm/trunk/test/CodeGen/X86/vectorcall.ll

	; RUN: llc -mtriple=i686-pc-win32 -mattr=+sse2 < %s | FileCheck %s --check-prefix=CHECK --check-prefix=X86			; RUN: llc -mtriple=i686-pc-win32 -mattr=+sse2 < %s | FileCheck %s --check-prefix=CHECK --check-prefix=X86
	; RUN: llc -mtriple=x86_64-pc-win32 < %s | FileCheck %s --check-prefix=CHECK --check-prefix=X64			; RUN: llc -mtriple=x86_64-pc-win32 < %s | FileCheck %s --check-prefix=CHECK --check-prefix=X64

	; Test integer arguments.			; Test integer arguments.

	define x86_vectorcallcc i32 @test_int_1() {			define x86_vectorcallcc i32 @test_int_1() {
	ret i32 0			ret i32 0
	}			}

	; CHECK-LABEL: {{^}}test_int_1@@0:			; CHECK-LABEL: {{^}}test_int_1@@0:
	; CHECK: xorl %eax, %eax			; CHECK: xorl %eax, %eax

	define x86_vectorcallcc i32 @test_int_2(i32 inreg %a) {			define x86_vectorcallcc i32 @test_int_2(i32 inreg %a) {
	ret i32 %a			ret i32 %a
	}			}

	; X86-LABEL: {{^}}test_int_2@@4:			; X86-LABEL: {{^}}test_int_2@@4:
	; X64-LABEL: {{^}}test_int_2@@8:			; X64-LABEL: {{^}}test_int_2@@8:
	; CHECK: movl %ecx, %eax			; CHECK: movl %ecx, %eax

	define x86_vectorcallcc i32 @test_int_3(i64 inreg %a) {			define x86_vectorcallcc i32 @test_int_3(i64 inreg %a) {
	%at = trunc i64 %a to i32			%at = trunc i64 %a to i32
	ret i32 %at			ret i32 %at
	}			}

	; X86-LABEL: {{^}}test_int_3@@8:			; X86-LABEL: {{^}}test_int_3@@8:
	; X64-LABEL: {{^}}test_int_3@@8:			; X64-LABEL: {{^}}test_int_3@@8:
	; CHECK: movl %ecx, %eax			; CHECK: movl %ecx, %eax

	define x86_vectorcallcc i32 @test_int_4(i32 inreg %a, i32 inreg %b) {			define x86_vectorcallcc i32 @test_int_4(i32 inreg %a, i32 inreg %b) {
	%s = add i32 %a, %b			%s = add i32 %a, %b
	ret i32 %s			ret i32 %s
	}			}

	; X86-LABEL: {{^}}test_int_4@@8:			; X86-LABEL: {{^}}test_int_4@@8:
	; X86: leal (%ecx,%edx), %eax			; X86: leal (%ecx,%edx), %eax

	; X64-LABEL: {{^}}test_int_4@@16:			; X64-LABEL: {{^}}test_int_4@@16:
	; X64: leal (%rcx,%rdx), %eax			; X64: leal (%rcx,%rdx), %eax

	define x86_vectorcallcc i32 @"\01test_int_5"(i32, i32) {			define x86_vectorcallcc i32 @"\01test_int_5"(i32, i32) {
	ret i32 0			ret i32 0
	}			}
	; CHECK-LABEL: {{^}}test_int_5:			; CHECK-LABEL: {{^}}test_int_5:

	Show All 39 Lines
	; CHECK-LABEL: {{^}}test_vec_1@@32:			; CHECK-LABEL: {{^}}test_vec_1@@32:
	; CHECK: movaps %xmm1, %xmm0			; CHECK: movaps %xmm1, %xmm0

	define x86_vectorcallcc <16 x i8> @test_vec_2(			define x86_vectorcallcc <16 x i8> @test_vec_2(
	double, <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8> %r) {			double, <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8> %r) {
	ret <16 x i8> %r			ret <16 x i8> %r
	}			}
	; CHECK-LABEL: {{^}}test_vec_2@@104:			; CHECK-LABEL: {{^}}test_vec_2@@104:
	; CHECK: movaps (%{{[re]}}cx), %xmm0			; x64: movq {{[0-9]*}}(%rsp), %rax
				; CHECK: movaps (%{{rax|ecx}}), %xmm0

				%struct.HVA5 = type { <4 x float>, <4 x float>, <4 x float>, <4 x float>, <4 x float> }
				%struct.HVA4 = type { <4 x float>, <4 x float>, <4 x float>, <4 x float> }
				%struct.HVA3 = type { <4 x float>, <4 x float>, <4 x float> }
				%struct.HVA2 = type { <4 x float>, <4 x float> }

				define x86_vectorcallcc <4 x float> @test_mixed_1(i32 %a, %struct.HVA4 inreg %bb, i32 %c) {
				entry:
				%b = alloca %struct.HVA4, align 16
				store %struct.HVA4 %bb, %struct.HVA4* %b, align 16
				%w1 = getelementptr inbounds %struct.HVA4, %struct.HVA4* %b, i32 0, i32 1
				%0 = load <4 x float>, <4 x float>* %w1, align 16
				ret <4 x float> %0
				}
				; CHECK-LABEL: test_mixed_1
				; CHECK: movaps %xmm1, 16(%{{(e|r)}}sp)
				; CHECK: movaps 16(%{{(e|r)}}sp), %xmm0
				; CHECK: ret{{q|l}}

				define x86_vectorcallcc <4 x float> @test_mixed_2(%struct.HVA4 inreg %a, %struct.HVA4* %b, <4 x float> %c) {
				entry:
				%c.addr = alloca <4 x float>, align 16
				store <4 x float> %c, <4 x float>* %c.addr, align 16
				%0 = load <4 x float>, <4 x float>* %c.addr, align 16
				ret <4 x float> %0
				}
				; CHECK-LABEL: test_mixed_2
				; X86: movaps %xmm0, (%esp)
				; X64: movaps %xmm2, %xmm0
				; CHECK: ret{{[ql]}}

				define x86_vectorcallcc <4 x float> @test_mixed_3(<4 x float> %a, <4 x float> %b, <4 x float> %c, <4 x float> %d, <4 x float> %e, %struct.HVA2* %f) {
				entry:
				%x = getelementptr inbounds %struct.HVA2, %struct.HVA2* %f, i32 0, i32 0
				%0 = load <4 x float>, <4 x float>* %x, align 16
				ret <4 x float> %0
				}
				; CHECK-LABEL: test_mixed_3
				; CHECK: movaps (%{{[re][ac]}}x), %xmm0
				; CHECK: ret{{[ql]}}

				define x86_vectorcallcc <4 x float> @test_mixed_4(%struct.HVA4 inreg %a, %struct.HVA2* %bb, <4 x float> %c) {
				entry:
				%y4 = getelementptr inbounds %struct.HVA2, %struct.HVA2* %bb, i32 0, i32 1
				%0 = load <4 x float>, <4 x float>* %y4, align 16
				ret <4 x float> %0
				}
				; CHECK-LABEL: test_mixed_4
				; X86: movaps 16(%eax), %xmm0
				; X64: movaps 16(%rdx), %xmm0
				; CHECK: ret{{[ql]}}

				define x86_vectorcallcc <4 x float> @test_mixed_5(%struct.HVA3 inreg %a, %struct.HVA3* %b, <4 x float> %c, %struct.HVA2 inreg %dd) {
				entry:
				%d = alloca %struct.HVA2, align 16
				store %struct.HVA2 %dd, %struct.HVA2* %d, align 16
				%y5 = getelementptr inbounds %struct.HVA2, %struct.HVA2* %d, i32 0, i32 1
				%0 = load <4 x float>, <4 x float>* %y5, align 16
				ret <4 x float> %0
				}
				; CHECK-LABEL: test_mixed_5
				; CHECK: movaps %xmm5, 16(%{{(e|r)}}sp)
				; CHECK: movaps 16(%{{(e|r)}}sp), %xmm0
				; CHECK: ret{{[ql]}}

				define x86_vectorcallcc %struct.HVA4 @test_mixed_6(%struct.HVA4 inreg %a, %struct.HVA4* %b) {
				entry:
				%retval = alloca %struct.HVA4, align 16
				%0 = bitcast %struct.HVA4* %retval to i8*
				%1 = bitcast %struct.HVA4* %b to i8*
				call void @llvm.memcpy.p0i8.p0i8.i32(i8* %0, i8* %1, i32 64, i32 16, i1 false)
				%2 = load %struct.HVA4, %struct.HVA4* %retval, align 16
				ret %struct.HVA4 %2
				}
				; CHECK-LABEL: test_mixed_6
				; CHECK: movaps (%{{[re]}}sp), %xmm0
				; CHECK: movaps 16(%{{[re]}}sp), %xmm1
				; CHECK: movaps 32(%{{[re]}}sp), %xmm2
				; CHECK: movaps 48(%{{[re]}}sp), %xmm3
				; CHECK: ret{{[ql]}}

				declare void @llvm.memset.p0i8.i64(i8* nocapture writeonly, i8, i64, i32, i1)
				declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly, i8* nocapture readonly, i64, i32, i1)
				declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture writeonly, i8* nocapture readonly, i32, i32, i1)

				define x86_vectorcallcc void @test_mixed_7(%struct.HVA5* noalias sret %agg.result) {
				entry:
				%a = alloca %struct.HVA5, align 16
				%0 = bitcast %struct.HVA5* %a to i8*
				call void @llvm.memset.p0i8.i64(i8* %0, i8 0, i64 80, i32 16, i1 false)
				%1 = bitcast %struct.HVA5* %agg.result to i8*
				%2 = bitcast %struct.HVA5* %a to i8*
				call void @llvm.memcpy.p0i8.p0i8.i64(i8* %1, i8* %2, i64 80, i32 16, i1 false)
				ret void
				}
				; CHECK-LABEL: test_mixed_7
				; CHECK: movaps %xmm{{[0-9]}}, 64(%{{rcx|eax}})
				; CHECK: movaps %xmm{{[0-9]}}, 48(%{{rcx|eax}})
				; CHECK: movaps %xmm{{[0-9]}}, 32(%{{rcx|eax}})
				; CHECK: movaps %xmm{{[0-9]}}, 16(%{{rcx|eax}})
				; CHECK: movaps %xmm{{[0-9]}}, (%{{rcx|eax}})
				; X64: mov{{[ql]}} %rcx, %rax
				; CHECK: ret{{[ql]}}

				define x86_vectorcallcc <4 x float> @test_mixed_8(<4 x float> %a, <4 x float> %b, <4 x float> %c, <4 x float> %d, i32 %e, <4 x float> %f) {
				entry:
				%f.addr = alloca <4 x float>, align 16
				store <4 x float> %f, <4 x float>* %f.addr, align 16
				%0 = load <4 x float>, <4 x float>* %f.addr, align 16
				ret <4 x float> %0
				}
				; CHECK-LABEL: test_mixed_8
				; X86: movaps %xmm4, %xmm0
				; X64: movaps %xmm5, %xmm0
				; CHECK: ret{{[ql]}}

				%struct.HFA4 = type { double, double, double, double }
				declare x86_vectorcallcc double @test_mixed_9_callee(%struct.HFA4 %x, double %y)

				define x86_vectorcallcc double @test_mixed_9_caller(%struct.HFA4 inreg %b) {
				entry:
				%call = call x86_vectorcallcc double @test_mixed_9_callee(%struct.HFA4 inreg %b, double 3.000000e+00)
				%add = fadd double 1.000000e+00, %call
				ret double %add
				}
				; CHECK-LABEL: test_mixed_9_caller
				; CHECK: movaps %xmm3, %xmm4
				; CHECK: movaps %xmm2, %xmm3
				; CHECK: movaps %xmm1, %xmm2
				; X32: movasd %xmm0, %xmm1
				; X64: movapd %xmm5, %xmm1
				; CHECK: call{{l|q}} test_mixed_9_callee@@40
				; CHECK: addsd {{.*}}, %xmm0
				; CHECK: ret{{l|q}}

Revision Contents

Diff 82206