This is an archive of the discontinued LLVM Phabricator instance.

Differential D72110

[X86] ABI compat bugfix for MSVC vectorcall
Closed, Public

Authored by rnk on Jan 2 2020, 2:03 PM.

Details

Reviewers

erichkeane
craig.topper

Commits

rG8e780252a728: [X86] ABI compat bugfix for MSVC vectorcall

Summary

Before this change, X86_32ABIInfo::classifyArgumentType would be called
twice on vector arguments to vectorcall functions. This function has
side effects to track GPR register usage, and this would lead to
incorrect GPR usage in some cases. The specific case I noticed is from
running out of XMM registers with mixed FP and vector arguments and no
aggregates of any kind. Consider this prototype:

void __vectorcall vectorcall_indirect_vec(
    double xmm0, double xmm1, double xmm2, double xmm3, double xmm4,
    __m128 xmm5,
    __m128 ecx,
    int edx,
    __m128 mem);

classifyArgumentType has no effect when called on a plain FP type, but when
called on a vector type, it modifies FreeRegs to model GPR consumption.
However, this should not happen during the vectorcall first pass.
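
The hazard can be illustrated with a toy model. This is only a sketch under stated assumptions, not Clang's code: ArgKind, ToyState, classifyToy and assignToy are hypothetical names, the model tracks just two GPRs and six vector registers, and the ClassifyInFirstPass flag stands in for "the first pass calls a classifier that touches FreeRegs". It is not meant to reproduce the exact output of the old implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: every name below is hypothetical, none is a Clang API.
enum class ArgKind { FP, Vector, Int };

struct ToyState {
  unsigned FreeRegs = 2;    // models ECX and EDX
  unsigned FreeSSERegs = 6; // models XMM0-5
};

// Stand-in for the general classifier; note the side effect on FreeRegs.
std::string classifyToy(ArgKind K, ToyState &S) {
  if (K == ArgKind::FP)
    return "stack"; // simplifying assumption for FP values without a register
  if (S.FreeRegs == 0)
    return K == ArgKind::Int ? "stack" : "indirect-on-stack";
  --S.FreeRegs;
  return K == ArgKind::Int ? "gpr" : "indirect-in-gpr";
}

std::vector<std::string> assignToy(const std::vector<ArgKind> &Args,
                                   bool ClassifyInFirstPass) {
  ToyState S;
  std::vector<std::string> Out(Args.size());
  std::vector<bool> IsPreassigned(Args.size(), false);

  // First pass: FP and vector arguments claim vector registers.
  for (std::size_t I = 0; I < Args.size(); ++I) {
    if (Args[I] == ArgKind::Int)
      continue;
    if (S.FreeSSERegs > 0) {
      --S.FreeSSERegs;
      Out[I] = "xmm";
      IsPreassigned[I] = true;
    } else if (ClassifyInFirstPass) {
      // Hazard: GPR state changes before earlier integers are seen.
      Out[I] = classifyToy(Args[I], S);
      IsPreassigned[I] = true;
    }
  }

  // Second pass: classify everything not yet assigned, left to right.
  for (std::size_t I = 0; I < Args.size(); ++I) {
    if (!IsPreassigned[I])
      Out[I] = classifyToy(Args[I], S);
    assert(!Out[I].empty() && "every argument must end up classified");
  }
  return Out;
}
```

Encoding the prototype above as five FP arguments followed by Vector, Vector, Int, Vector, assignToy(Args, false) gives "xmm" for the first six arguments, then "indirect-in-gpr", "gpr" and "indirect-on-stack", which lines up with the parameter names xmm5, ecx, edx and mem. With ClassifyInFirstPass set to true, the last vector takes the second GPR during the first pass and the int ends up on the stack.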

I refactored the code to unify vectorcall HVA logic with regcall HVA
logic. The conventions pass HVAs in registers differently (expanded vs.
not expanded), but if they do not fit in registers, they both pass them
indirectly by address.
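
The shared decision can be sketched as follows. This is a simplified illustration, not the code in TargetInfo.cpp: Conv, HvaState and passHva are hypothetical names, the result is a plain string rather than an ABIArgInfo, and only true aggregates are modeled (plain FP and vector values are left out).

```cpp
#include <string>

// Hypothetical names; a sketch of the decision described above.
enum class Conv { RegCall, VectorCall };

struct HvaState {
  unsigned FreeSSERegs; // vector registers still available
};

// Decide how an HVA with NumElts elements is passed in this toy model.
std::string passHva(Conv CC, unsigned NumElts, HvaState &S) {
  if (S.FreeSSERegs >= NumElts) {
    S.FreeSSERegs -= NumElts;
    // Vectorcall keeps the aggregate whole; regcall expands it.
    return CC == Conv::VectorCall ? "direct-hva" : "expand";
  }
  // Not enough registers: both conventions pass the HVA by address.
  return "indirect";
}
```

Starting from six free registers, a four-element HVA under VectorCall is passed as "direct-hva" and leaves two registers; a second four-element HVA then falls back to "indirect" under either convention without consuming anything.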

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

rnk created this revision. Jan 2 2020, 2:03 PM

Herald added a project: Restricted Project. Jan 2 2020, 2:03 PM

Unit tests: pass. 61175 tests passed, 0 failed and 729 were skipped.

clang-tidy: pass.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Harbormaster failed remote builds in B43196: Diff 235946! Jan 2 2020, 2:19 PM

rnk mentioned this in D72114: [MS] Overhaul how clang passes overaligned args on x86_32. Jan 2 2020, 3:25 PM

I don't see anything I have a problem with, but I still want @ctopper to have a bit of time to take a look. Ping me Monday on both if he doesn't respond.

craig.topper added inline comments. Jan 3 2020, 10:25 AM

clang/lib/CodeGen/TargetInfo.cpp
1648	What about ZMM?

rnk marked an inline comment as done. Jan 3 2020, 11:55 AM

rnk added inline comments.

clang/lib/CodeGen/TargetInfo.cpp
1648	This comment was pre-existing, but we can say [XYZ]MM0-5.

ZYXMM

Unit tests: pass. 61177 tests passed, 0 failed and 729 were skipped.

clang-tidy: fail. Please fix clang-tidy findings.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Harbormaster failed remote builds in B43266: Diff 236115! Jan 3 2020, 1:20 PM

LGTM

This revision is now accepted and ready to land. Jan 5 2020, 9:13 PM

Closed by commit rG8e780252a728: [X86] ABI compat bugfix for MSVC vectorcall (authored by rnk). Jan 14 2020, 6:00 PM

This revision was automatically updated to reflect the committed changes.

rnk mentioned this in rG2af74e27ed7d: [MS] Overhaul how clang passes overaligned args on x86_32. Jan 23 2020, 4:12 PM

Revision Contents

Path	Size
clang/include/clang/CodeGen/CGFunctionInfo.h	11 lines
clang/lib/CodeGen/TargetInfo.cpp	136 lines
clang/test/CodeGen/vectorcall.c	21 lines

Diff 238154

clang/include/clang/CodeGen/CGFunctionInfo.h

Show First 20 Lines • Show All 557 Lines • ▼ Show 20 Lines	public:
}		}
size_t numTrailingObjects(OverloadToken<ExtParameterInfo>) const {		size_t numTrailingObjects(OverloadToken<ExtParameterInfo>) const {
return (HasExtParameterInfos ? NumArgs : 0);		return (HasExtParameterInfos ? NumArgs : 0);
}		}

typedef const ArgInfo *const_arg_iterator;		typedef const ArgInfo *const_arg_iterator;
typedef ArgInfo *arg_iterator;		typedef ArgInfo *arg_iterator;

typedef llvm::iterator_range<arg_iterator> arg_range;		MutableArrayRef<ArgInfo> arguments() {
typedef llvm::iterator_range<const_arg_iterator> const_arg_range;		return MutableArrayRef<ArgInfo>(arg_begin(), NumArgs);
		}
arg_range arguments() { return arg_range(arg_begin(), arg_end()); }		ArrayRef<ArgInfo> arguments() const {
const_arg_range arguments() const {		return ArrayRef<ArgInfo>(arg_begin(), NumArgs);
return const_arg_range(arg_begin(), arg_end());
}		}

const_arg_iterator arg_begin() const { return getArgsBuffer() + 1; }		const_arg_iterator arg_begin() const { return getArgsBuffer() + 1; }
const_arg_iterator arg_end() const { return getArgsBuffer() + 1 + NumArgs; }		const_arg_iterator arg_end() const { return getArgsBuffer() + 1 + NumArgs; }
arg_iterator arg_begin() { return getArgsBuffer() + 1; }		arg_iterator arg_begin() { return getArgsBuffer() + 1; }
arg_iterator arg_end() { return getArgsBuffer() + 1 + NumArgs; }		arg_iterator arg_end() { return getArgsBuffer() + 1 + NumArgs; }

unsigned arg_size() const { return NumArgs; }		unsigned arg_size() const { return NumArgs; }
▲ Show 20 Lines • Show All 134 Lines • Show Last 20 Lines

clang/lib/CodeGen/TargetInfo.cpp

Show All 16 Lines
#include "CGCXXABI.h"		#include "CGCXXABI.h"
#include "CGValue.h"		#include "CGValue.h"
#include "CodeGenFunction.h"		#include "CodeGenFunction.h"
#include "clang/AST/Attr.h"		#include "clang/AST/Attr.h"
#include "clang/AST/RecordLayout.h"		#include "clang/AST/RecordLayout.h"
#include "clang/Basic/CodeGenOptions.h"		#include "clang/Basic/CodeGenOptions.h"
#include "clang/CodeGen/CGFunctionInfo.h"		#include "clang/CodeGen/CGFunctionInfo.h"
#include "clang/CodeGen/SwiftCallingConv.h"		#include "clang/CodeGen/SwiftCallingConv.h"
		#include "llvm/ADT/SmallBitVector.h"
#include "llvm/ADT/StringExtras.h"		#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringSwitch.h"		#include "llvm/ADT/StringSwitch.h"
#include "llvm/ADT/Triple.h"		#include "llvm/ADT/Triple.h"
#include "llvm/ADT/Twine.h"		#include "llvm/ADT/Twine.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Type.h"		#include "llvm/IR/Type.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include <algorithm> // std::sort		#include <algorithm> // std::sort
▲ Show 20 Lines • Show All 958 Lines • ▼ Show 20 Lines
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// X86-32 ABI Implementation		// X86-32 ABI Implementation
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// Similar to llvm::CCState, but for Clang.		/// Similar to llvm::CCState, but for Clang.
struct CCState {		struct CCState {
CCState(unsigned CC) : CC(CC), FreeRegs(0), FreeSSERegs(0) {}		CCState(CGFunctionInfo &FI)
		: IsPreassigned(FI.arg_size()), CC(FI.getCallingConvention()) {}

unsigned CC;		llvm::SmallBitVector IsPreassigned;
unsigned FreeRegs;		unsigned CC = CallingConv::CC_C;
unsigned FreeSSERegs;		unsigned FreeRegs = 0;
		unsigned FreeSSERegs = 0;
};		};

enum {		enum {
// Vectorcall only allows the first 6 parameters to be passed in registers.		// Vectorcall only allows the first 6 parameters to be passed in registers.
VectorcallMaxParamNumAsReg = 6		VectorcallMaxParamNumAsReg = 6
};		};

/// X86_32ABIInfo - The X86-32 ABI information.		/// X86_32ABIInfo - The X86-32 ABI information.
▲ Show 20 Lines • Show All 54 Lines • ▼ Show 20 Lines	class X86_32ABIInfo : public SwiftABIInfo {

/// Rewrite the function info so that all memory arguments use		/// Rewrite the function info so that all memory arguments use
/// inalloca.		/// inalloca.
void rewriteWithInAlloca(CGFunctionInfo &FI) const;		void rewriteWithInAlloca(CGFunctionInfo &FI) const;

void addFieldToArgStruct(SmallVector<llvm::Type *, 6> &FrameFields,		void addFieldToArgStruct(SmallVector<llvm::Type *, 6> &FrameFields,
CharUnits &StackOffset, ABIArgInfo &Info,		CharUnits &StackOffset, ABIArgInfo &Info,
QualType Type) const;		QualType Type) const;
void computeVectorCallArgs(CGFunctionInfo &FI, CCState &State,		void runVectorCallFirstPass(CGFunctionInfo &FI, CCState &State) const;
bool &UsedInAlloca) const;

public:		public:

void computeInfo(CGFunctionInfo &FI) const override;		void computeInfo(CGFunctionInfo &FI) const override;
Address EmitVAArg(CodeGenFunction &CGF, Address VAListAddr,		Address EmitVAArg(CodeGenFunction &CGF, Address VAListAddr,
QualType Ty) const override;		QualType Ty) const override;

X86_32ABIInfo(CodeGen::CodeGenTypes &CGT, bool DarwinVectorABI,		X86_32ABIInfo(CodeGen::CodeGenTypes &CGT, bool DarwinVectorABI,
▲ Show 20 Lines • Show All 551 Lines • ▼ Show 20 Lines	if (State.CC == llvm::CallingConv::X86_FastCall ||

return (Ty->isIntegralOrEnumerationType() || Ty->isPointerType() ||		return (Ty->isIntegralOrEnumerationType() || Ty->isPointerType() ||
Ty->isReferenceType());		Ty->isReferenceType());
}		}

return true;		return true;
}		}

		void X86_32ABIInfo::runVectorCallFirstPass(CGFunctionInfo &FI, CCState &State) const {
		// Vectorcall x86 works subtly different than in x64, so the format is
		// a bit different than the x64 version. First, all vector types (not HVAs)
		// are assigned, with the first 6 ending up in the [XYZ]MM0-5 registers.
		// This differs from the x64 implementation, where the first 6 by INDEX get
		// registers.
		// In the second pass over the arguments, HVAs are passed in the remaining
		// vector registers if possible, or indirectly by address. The address will be
		// passed in ECX/EDX if available. Any other arguments are passed according to
		// the usual fastcall rules.
		MutableArrayRef<CGFunctionInfoArgInfo> Args = FI.arguments();
		for (int I = 0, E = Args.size(); I < E; ++I) {
		const Type *Base = nullptr;
		uint64_t NumElts = 0;
		const QualType &Ty = Args[I].type;
		if ((Ty->isVectorType() || Ty->isBuiltinType()) &&
		isHomogeneousAggregate(Ty, Base, NumElts)) {
		if (State.FreeSSERegs >= NumElts) {
		State.FreeSSERegs -= NumElts;
		Args[I].info = ABIArgInfo::getDirect();
		State.IsPreassigned.set(I);
		}
		}
		}
		}

ABIArgInfo X86_32ABIInfo::classifyArgumentType(QualType Ty,		ABIArgInfo X86_32ABIInfo::classifyArgumentType(QualType Ty,
CCState &State) const {		CCState &State) const {
// FIXME: Set alignment on indirect arguments.		// FIXME: Set alignment on indirect arguments.
		bool IsFastCall = State.CC == llvm::CallingConv::X86_FastCall;
		bool IsRegCall = State.CC == llvm::CallingConv::X86_RegCall;
		bool IsVectorCall = State.CC == llvm::CallingConv::X86_VectorCall;

Ty = useFirstFieldIfTransparentUnion(Ty);		Ty = useFirstFieldIfTransparentUnion(Ty);

// Check with the C++ ABI first.		// Check with the C++ ABI first.
const RecordType *RT = Ty->getAs<RecordType>();		const RecordType *RT = Ty->getAs<RecordType>();
if (RT) {		if (RT) {
CGCXXABI::RecordArgABI RAA = getRecordArgABI(RT, getCXXABI());		CGCXXABI::RecordArgABI RAA = getRecordArgABI(RT, getCXXABI());
if (RAA == CGCXXABI::RAA_Indirect) {		if (RAA == CGCXXABI::RAA_Indirect) {
return getIndirectResult(Ty, false, State);		return getIndirectResult(Ty, false, State);
} else if (RAA == CGCXXABI::RAA_DirectInMemory) {		} else if (RAA == CGCXXABI::RAA_DirectInMemory) {
// The field index doesn't matter, we'll fix it up later.		// The field index doesn't matter, we'll fix it up later.
return ABIArgInfo::getInAlloca(/*FieldIndex=*/0);		return ABIArgInfo::getInAlloca(/*FieldIndex=*/0);
}		}
}		}

// Regcall uses the concept of a homogenous vector aggregate, similar		// Regcall uses the concept of a homogenous vector aggregate, similar
// to other targets.		// to other targets.
const Type *Base = nullptr;		const Type *Base = nullptr;
uint64_t NumElts = 0;		uint64_t NumElts = 0;
if (State.CC == llvm::CallingConv::X86_RegCall &&		if ((IsRegCall || IsVectorCall) &&
isHomogeneousAggregate(Ty, Base, NumElts)) {		isHomogeneousAggregate(Ty, Base, NumElts)) {

if (State.FreeSSERegs >= NumElts) {		if (State.FreeSSERegs >= NumElts) {
State.FreeSSERegs -= NumElts;		State.FreeSSERegs -= NumElts;

		// Vectorcall passes HVAs directly and does not flatten them, but regcall
		// does.
		if (IsVectorCall)
		return getDirectX86Hva();

if (Ty->isBuiltinType() || Ty->isVectorType())		if (Ty->isBuiltinType() || Ty->isVectorType())
return ABIArgInfo::getDirect();		return ABIArgInfo::getDirect();
return ABIArgInfo::getExpand();		return ABIArgInfo::getExpand();
}		}
return getIndirectResult(Ty, /*ByVal=*/false, State);		return getIndirectResult(Ty, /*ByVal=*/false, State);
}		}

if (isAggregateTypeForABI(Ty)) {		if (isAggregateTypeForABI(Ty)) {
Show All 25 Lines	if (isAggregateTypeForABI(Ty)) {
// of those arguments will match the struct. This is important because the		// of those arguments will match the struct. This is important because the
// LLVM backend isn't smart enough to remove byval, which inhibits many		// LLVM backend isn't smart enough to remove byval, which inhibits many
// optimizations.		// optimizations.
// Don't do this for the MCU if there are still free integer registers		// Don't do this for the MCU if there are still free integer registers
// (see X86_64 ABI for full explanation).		// (see X86_64 ABI for full explanation).
if (getContext().getTypeSize(Ty) <= 4 * 32 &&		if (getContext().getTypeSize(Ty) <= 4 * 32 &&
(!IsMCUABI || State.FreeRegs == 0) && canExpandIndirectArgument(Ty))		(!IsMCUABI || State.FreeRegs == 0) && canExpandIndirectArgument(Ty))
return ABIArgInfo::getExpandWithPadding(		return ABIArgInfo::getExpandWithPadding(
State.CC == llvm::CallingConv::X86_FastCall ||		IsFastCall || IsVectorCall || IsRegCall, PaddingType);
State.CC == llvm::CallingConv::X86_VectorCall ||
State.CC == llvm::CallingConv::X86_RegCall,
PaddingType);

return getIndirectResult(Ty, true, State);		return getIndirectResult(Ty, true, State);
}		}

if (const VectorType *VT = Ty->getAs<VectorType>()) {		if (const VectorType *VT = Ty->getAs<VectorType>()) {
// On Darwin, some vectors are passed in memory, we handle this by passing		// On Darwin, some vectors are passed in memory, we handle this by passing
// it as an i8/i16/i32/i64.		// it as an i8/i16/i32/i64.
if (IsDarwinVectorABI) {		if (IsDarwinVectorABI) {
Show All 22 Lines	if (Ty->isPromotableIntegerType()) {
return ABIArgInfo::getExtend(Ty);		return ABIArgInfo::getExtend(Ty);
}		}

if (InReg)		if (InReg)
return ABIArgInfo::getDirectInReg();		return ABIArgInfo::getDirectInReg();
return ABIArgInfo::getDirect();		return ABIArgInfo::getDirect();
}		}

void X86_32ABIInfo::computeVectorCallArgs(CGFunctionInfo &FI, CCState &State,
bool &UsedInAlloca) const {
// Vectorcall x86 works subtly different than in x64, so the format is
// a bit different than the x64 version. First, all vector types (not HVAs)
// are assigned, with the first 6 ending up in the YMM0-5 or XMM0-5 registers.
// This differs from the x64 implementation, where the first 6 by INDEX get
// registers.
// After that, integers AND HVAs are assigned Left to Right in the same pass.
// Integers are passed as ECX/EDX if one is available (in order). HVAs will
// first take up the remaining YMM/XMM registers. If insufficient registers
// remain but an integer register (ECX/EDX) is available, it will be passed
// in that, else, on the stack.
for (auto &I : FI.arguments()) {
// First pass do all the vector types.
const Type *Base = nullptr;
uint64_t NumElts = 0;
const QualType& Ty = I.type;
if ((Ty->isVectorType() || Ty->isBuiltinType()) &&
isHomogeneousAggregate(Ty, Base, NumElts)) {
if (State.FreeSSERegs >= NumElts) {
State.FreeSSERegs -= NumElts;
I.info = ABIArgInfo::getDirect();
} else {
I.info = classifyArgumentType(Ty, State);
}
UsedInAlloca |= (I.info.getKind() == ABIArgInfo::InAlloca);
}
}

for (auto &I : FI.arguments()) {
// Second pass, do the rest!
const Type *Base = nullptr;
uint64_t NumElts = 0;
const QualType& Ty = I.type;
bool IsHva = isHomogeneousAggregate(Ty, Base, NumElts);

if (IsHva && !Ty->isVectorType() && !Ty->isBuiltinType()) {
// Assign true HVAs (non vector/native FP types).
if (State.FreeSSERegs >= NumElts) {
State.FreeSSERegs -= NumElts;
I.info = getDirectX86Hva();
} else {
I.info = getIndirectResult(Ty, /*ByVal=*/false, State);
}
} else if (!IsHva) {
// Assign all Non-HVAs, so this will exclude Vector/FP args.
I.info = classifyArgumentType(Ty, State);
UsedInAlloca |= (I.info.getKind() == ABIArgInfo::InAlloca);
}
}
}

void X86_32ABIInfo::computeInfo(CGFunctionInfo &FI) const {		void X86_32ABIInfo::computeInfo(CGFunctionInfo &FI) const {
CCState State(FI.getCallingConvention());		CCState State(FI);
if (IsMCUABI)		if (IsMCUABI)
State.FreeRegs = 3;		State.FreeRegs = 3;
else if (State.CC == llvm::CallingConv::X86_FastCall)		else if (State.CC == llvm::CallingConv::X86_FastCall)
State.FreeRegs = 2;		State.FreeRegs = 2;
else if (State.CC == llvm::CallingConv::X86_VectorCall) {		else if (State.CC == llvm::CallingConv::X86_VectorCall) {
State.FreeRegs = 2;		State.FreeRegs = 2;
State.FreeSSERegs = 6;		State.FreeSSERegs = 6;
} else if (FI.getHasRegParm())		} else if (FI.getHasRegParm())
Show All 15 Lines	if (State.FreeRegs) {
FI.getReturnInfo().setInReg(true);		FI.getReturnInfo().setInReg(true);
}		}
}		}

// The chain argument effectively gives us another free register.		// The chain argument effectively gives us another free register.
if (FI.isChainCall())		if (FI.isChainCall())
++State.FreeRegs;		++State.FreeRegs;

		// For vectorcall, do a first pass over the arguments, assigning FP and vector
		// arguments to XMM registers as available.
		if (State.CC == llvm::CallingConv::X86_VectorCall)
		runVectorCallFirstPass(FI, State);

bool UsedInAlloca = false;		bool UsedInAlloca = false;
if (State.CC == llvm::CallingConv::X86_VectorCall) {		MutableArrayRef<CGFunctionInfoArgInfo> Args = FI.arguments();
computeVectorCallArgs(FI, State, UsedInAlloca);		for (int I = 0, E = Args.size(); I < E; ++I) {
} else {		// Skip arguments that have already been assigned.
// If not vectorcall, revert to normal behavior.		if (State.IsPreassigned.test(I))
for (auto &I : FI.arguments()) {		continue;
I.info = classifyArgumentType(I.type, State);
UsedInAlloca |= (I.info.getKind() == ABIArgInfo::InAlloca);		Args[I].info = classifyArgumentType(Args[I].type, State);
}		UsedInAlloca |= (Args[I].info.getKind() == ABIArgInfo::InAlloca);
}		}

// If we needed to use inalloca for any argument, do a second pass and rewrite		// If we needed to use inalloca for any argument, do a second pass and rewrite
// all the memory arguments to use inalloca.		// all the memory arguments to use inalloca.
if (UsedInAlloca)		if (UsedInAlloca)
rewriteWithInAlloca(FI);		rewriteWithInAlloca(FI);
}		}

▲ Show 20 Lines • Show All 5,734 Lines • ▼ Show 20 Lines
namespace {		namespace {
class LanaiABIInfo : public DefaultABIInfo {		class LanaiABIInfo : public DefaultABIInfo {
public:		public:
LanaiABIInfo(CodeGen::CodeGenTypes &CGT) : DefaultABIInfo(CGT) {}		LanaiABIInfo(CodeGen::CodeGenTypes &CGT) : DefaultABIInfo(CGT) {}

bool shouldUseInReg(QualType Ty, CCState &State) const;		bool shouldUseInReg(QualType Ty, CCState &State) const;

void computeInfo(CGFunctionInfo &FI) const override {		void computeInfo(CGFunctionInfo &FI) const override {
CCState State(FI.getCallingConvention());		CCState State(FI);
// Lanai uses 4 registers to pass arguments unless the function has the		// Lanai uses 4 registers to pass arguments unless the function has the
// regparm attribute set.		// regparm attribute set.
if (FI.getHasRegParm()) {		if (FI.getHasRegParm()) {
State.FreeRegs = FI.getRegParm();		State.FreeRegs = FI.getRegParm();
} else {		} else {
State.FreeRegs = 4;		State.FreeRegs = 4;
}		}

▲ Show 20 Lines • Show All 959 Lines • ▼ Show 20 Lines	else if (Info.isDirect() && Info.getInReg()) {
if (sz < State.FreeRegs)		if (sz < State.FreeRegs)
State.FreeRegs -= sz;		State.FreeRegs -= sz;
else		else
State.FreeRegs = 0;		State.FreeRegs = 0;
}		}
}		}

void computeInfo(CGFunctionInfo &FI) const override {		void computeInfo(CGFunctionInfo &FI) const override {
CCState State(FI.getCallingConvention());		CCState State(FI);
// ARC uses 8 registers to pass arguments.		// ARC uses 8 registers to pass arguments.
State.FreeRegs = 8;		State.FreeRegs = 8;

if (!getCXXABI().classifyReturnType(FI))		if (!getCXXABI().classifyReturnType(FI))
FI.getReturnInfo() = classifyReturnType(FI.getReturnType());		FI.getReturnInfo() = classifyReturnType(FI.getReturnType());
updateState(FI.getReturnInfo(), FI.getReturnType(), State);		updateState(FI.getReturnInfo(), FI.getReturnType(), State);
for (auto &I : FI.arguments()) {		for (auto &I : FI.arguments()) {
I.info = classifyArgumentType(I.type, State.FreeRegs);		I.info = classifyArgumentType(I.type, State.FreeRegs);
▲ Show 20 Lines • Show All 1,501 Lines • Show Last 20 Lines

clang/test/CodeGen/vectorcall.c

	Show First 20 Lines • Show All 110 Lines • ▼ Show 20 Lines
	// even if it is not one of the first 6 arguments. First pass puts p4 into a			// even if it is not one of the first 6 arguments. First pass puts p4 into a
	// register on both. p9 ends up in a register in x86 only. Second pass puts p1			// register on both. p9 ends up in a register in x86 only. Second pass puts p1
	// in a register, does NOT put p7 in a register (since theres no room), then puts			// in a register, does NOT put p7 in a register (since theres no room), then puts
	// p8 in a register.			// p8 in a register.
	void __vectorcall HVAAnywhere(struct HFA2 p1, int p2, int p3, float p4, int p5, int p6, struct HFA4 p7, struct HFA2 p8, float p9){}			void __vectorcall HVAAnywhere(struct HFA2 p1, int p2, int p3, float p4, int p5, int p6, struct HFA4 p7, struct HFA2 p8, float p9){}
	// X32: define dso_local x86_vectorcallcc void @"\01HVAAnywhere@@88"(%struct.HFA2 inreg %p1.coerce, i32 inreg %p2, i32 inreg %p3, float %p4, i32 %p5, i32 %p6, %struct.HFA4* %p7, %struct.HFA2 inreg %p8.coerce, float %p9)			// X32: define dso_local x86_vectorcallcc void @"\01HVAAnywhere@@88"(%struct.HFA2 inreg %p1.coerce, i32 inreg %p2, i32 inreg %p3, float %p4, i32 %p5, i32 %p6, %struct.HFA4* %p7, %struct.HFA2 inreg %p8.coerce, float %p9)
	// X64: define dso_local x86_vectorcallcc void @"\01HVAAnywhere@@112"(%struct.HFA2 inreg %p1.coerce, i32 %p2, i32 %p3, float %p4, i32 %p5, i32 %p6, %struct.HFA4* %p7, %struct.HFA2 inreg %p8.coerce, float %p9)			// X64: define dso_local x86_vectorcallcc void @"\01HVAAnywhere@@112"(%struct.HFA2 inreg %p1.coerce, i32 %p2, i32 %p3, float %p4, i32 %p5, i32 %p6, %struct.HFA4* %p7, %struct.HFA2 inreg %p8.coerce, float %p9)

				#ifndef __x86_64__
				// This covers the three ways XMM values can be passed on 32-bit x86:
				// - directly in XMM register (xmm5)
				// - indirectly by address, address in GPR (ecx)
				// - indirectly by address, address on stack
				void __vectorcall vectorcall_indirect_vec(
				double xmm0, double xmm1, double xmm2, double xmm3, double xmm4,
				v4f32 xmm5, v4f32 ecx, int edx, v4f32 mem) {
				}

				// X32: define dso_local x86_vectorcallcc void @"\01vectorcall_indirect_vec@@{{[0-9]+}}"
				// X32-SAME: (double %xmm0,
				// X32-SAME: double %xmm1,
				// X32-SAME: double %xmm2,
				// X32-SAME: double %xmm3,
				// X32-SAME: double %xmm4,
				// X32-SAME: <4 x float> %xmm5,
				// X32-SAME: <4 x float>* inreg %0,
				// X32-SAME: i32 inreg %edx,
				// X32-SAME: <4 x float>* %1)
				#endif
