This is an archive of the discontinued LLVM Phabricator instance.

Complex Long Double classification In RegCall calling convention
ClosedPublic

Authored by eandrews on Jul 11 2017, 8:39 AM.

Details

Reviewers

rnk
oren_ben_simhon
erichkeane

Commits

rGde1b2a93758a: Complex Long Double classification In RegCall calling convention
rC308769: Complex Long Double classification In RegCall calling convention
rL308769: Complex Long Double classification In RegCall calling convention

Summary

This change is part of the RegCall calling convention support for LLVM. It extends the existing RegCall implementation to handle the complex long double type correctly: values of this type should be passed and returned in memory, not on the register stack. This patch implements that behavior.

Diff Detail

Repository: rL LLVM

Event Timeline

eandrews created this revision. Jul 11 2017, 8:39 AM

Oren discovered this miss to the original implementation. I'd reviewed this internally quite a bit.

The reason for the Win32ABI test is that MSVC 'long double' is actually small enough for SSE registers in this case. I DO now (looking again, sorry Elizabeth) wonder if there is a better way to exclude the extended-length LongDouble type? Could we use the 'length' of it instead? @rnk : opinion?

In D35259#805284, @erichkeane wrote:

Oren discovered this miss to the original implementation. I'd reviewed this internally quite a bit.

The reason for the Win32ABI test is that MSVC 'long double' is actually small enough for SSE registers in this case. I DO now (looking again, sorry Elizabeth) wonder if there is a better way to exclude the extended-length LongDouble type? Could we use the 'length' of it instead? @rnk : opinion?

Yeah, you can ask clang::TargetInfo for the format of most basic FP types. The code to do that looks like:

&TI.getLongDoubleFormat() == &llvm::APFloat::x87DoubleExtended()

Any reason you can't just add that condition to isX86VectorTypeForVectorCall? I assume we don't want to pass x86_fp80s in SSE registers for vectorcall either, right? That would eliminate the need for the isRegCallReturnableHA helper and the IsWin32StructABI parameter, which is a poorly named variable.
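
The format distinction discussed here can also be observed from ordinary C++, outside Clang's internals. The following is only a sketch, not Clang code: both helper names are hypothetical, and it assumes the x87 double-extended format can be recognized by a 64-bit significand and a maximum binary exponent of 16384.

```cpp
#include <limits>

// Hypothetical helper (not part of Clang): reports whether the host's
// `long double` looks like the 80-bit x87 double-extended format.
// Assumption: a 64-bit significand plus a maximum binary exponent of
// 16384 is enough to identify that format.
constexpr bool longDoubleIsX87Extended() {
  using L = std::numeric_limits<long double>;
  return L::radix == 2 && L::digits == 64 && L::max_exponent == 16384;
}

// Hypothetical helper: true when `long double` has the same precision and
// exponent range as `double`, which is the MSVC case mentioned above.
constexpr bool longDoubleIsPlainDouble() {
  using L = std::numeric_limits<long double>;
  using D = std::numeric_limits<double>;
  return L::digits == D::digits && L::max_exponent == D::max_exponent;
}
```

On a target where `longDoubleIsPlainDouble()` holds, a `long double` fits an SSE register just like a `double`, which is why the Win32/Win64 test expectations differ from the Linux ones.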

In D35259#805409, @rnk wrote:

Yeah, you can ask clang::TargetInfo for the format of most basic FP types. The code to do that looks like:
&TI.getLongDoubleFormat() == &llvm::APFloat::x87DoubleExtended()
Any reason you can't just add that condition to isX86VectorTypeForVectorCall? I assume we don't want to pass x86_fp80s in SSE registers for vectorcall either, right? That would eliminate the need for the isRegCallReturnableHA helper and the IsWin32StructABI parameter, which is a poorly named variable.

It actually WOULD make sense to apply this to vectorcall as well, wouldn't it? I presumed we didn't want to change its behavior; however, vectorcall is MSVC-only (other than us), so the long-double issue isn't a thing over there. @eandrews?

In D35259#805415, @erichkeane wrote:
It actually WOULD make sense to apply this to vectorcall as well, wouldn't it? I presumed we didn't want to change its behavior; however, vectorcall is MSVC-only (other than us), so the long-double issue isn't a thing over there. @eandrews?

I can apply this condition to isX86VectorTypeForVectorCall like Reid suggested. The only reason I did not do that originally was that I wasn't sure how to differentiate between the calling conventions without changing the function declaration. If I don't need to explicitly differentiate between RegCall and VectorCall and can use the 'length' instead, I think this works better. Like Reid mentioned, it would eliminate the need for isRegCallReturnableHA and IsWin32StructABI.

As per revision comments, I moved the condition for extended precision floating type to isX86VectorTypeForVectorCall. This update will now alter behavior for complex long double type under vectorcall calling convention as well. Returns/parameters will be passed in memory.
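
The shape of the updated predicate can be sketched with a small standalone model. This is not the Clang implementation: `ToyType`, `Kind`, `FloatFormat`, and `toyIsVectorCallSSEType` are all hypothetical stand-ins, and the model only covers the cases named in this review.

```cpp
// Toy stand-ins for Clang's type machinery; every name here is hypothetical.
enum class FloatFormat { IEEEdouble, X87DoubleExtended };
enum class Kind { Half, Float, Double, LongDouble, Vector, Int };

struct ToyType {
  Kind kind;
  unsigned bits; // size in bits; only consulted for vectors
};

// Mirrors the shape of the updated check: floating-point types may use SSE
// registers unless they are Half, or a long double whose target format is
// x87 double-extended. Vectors qualify only at XMM/YMM/ZMM widths.
bool toyIsVectorCallSSEType(const ToyType &ty, FloatFormat longDoubleFormat) {
  switch (ty.kind) {
  case Kind::Float:
  case Kind::Double:
    return true;
  case Kind::LongDouble:
    return longDoubleFormat != FloatFormat::X87DoubleExtended;
  case Kind::Vector:
    return ty.bits == 128 || ty.bits == 256 || ty.bits == 512;
  case Kind::Half:
  case Kind::Int:
    return false;
  }
  return false;
}
```

Because the decision depends only on the target's long double format, the same predicate serves both RegCall and VectorCall without a calling-convention parameter.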

rnk added inline comments. Jul 19 2017, 11:48 AM

lib/CodeGen/TargetInfo.cpp
3516 ↗	(On Diff #107343)	This is incorrect. We should consult the CXXABI first, and then only do these regcall-specific things if it returns false. Consider adding this test case to find the bug:

struct NonTrivial { int x, y; ~NonTrivial(); };
NonTrivial __regcall f() { return NonTrivial(); }

This should not return in registers, but it does today.

3526 ↗	(On Diff #107343)	This isn't necessarily a bug, but please always do C++ ABI classifications first so it's easy to spot the bug above.

RegCall-specific checks for Lin64 now occur only if the CXXABI classification returns false. An existing test has also been modified to verify behavior with non-trivial destructors.
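
The ordering requested in the inline comments can be illustrated with a standalone model. This is a sketch under stated assumptions, not the code in TargetInfo.cpp: `toyCxxAbiClassifies`, `toyClassifyRegCallReturn`, and `ReturnKind` are hypothetical names, and the model assumes a non-trivial destructor alone is enough for the C++ ABI to claim the type (the real rule also looks at copy and move constructors).

```cpp
#include <type_traits>

// The struct from the review comment, with the destructor defined inline so
// the example is self-contained.
struct NonTrivial {
  int x, y;
  ~NonTrivial() {}
};

struct Trivial {
  int x, y;
};

enum class ReturnKind { Indirect, RegCallStruct };

// Hypothetical stand-in for the C++ ABI hook: returns true when it has
// classified the return type itself. Assumption: a non-trivial destructor
// forces an indirect (in-memory) return.
template <typename T> bool toyCxxAbiClassifies(ReturnKind &out) {
  if (!std::is_trivially_destructible<T>::value) {
    out = ReturnKind::Indirect;
    return true;
  }
  return false;
}

// Consult the C++ ABI first; fall through to the regcall-specific
// classification only when the ABI hook declines.
template <typename T> ReturnKind toyClassifyRegCallReturn() {
  ReturnKind kind = ReturnKind::Indirect;
  if (toyCxxAbiClassifies<T>(kind))
    return kind;
  return ReturnKind::RegCallStruct; // regcall-specific path
}
```

With this ordering, `toyClassifyRegCallReturn<NonTrivial>()` yields `ReturnKind::Indirect`, while `toyClassifyRegCallReturn<Trivial>()` reaches the regcall-specific path.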

lgtm

This revision is now accepted and ready to land. Jul 21 2017, 10:59 AM

Closed by commit rL308769: Complex Long Double classification In RegCall calling convention (authored by erichkeane). Jul 21 2017, 11:51 AM

This revision was automatically updated to reflect the committed changes.

erichkeane mentioned this in rG74ef6a11478a: Fix X86_64 complex-returns for regcall.. May 19 2020, 1:46 PM

Revision Contents

Path                                     Size
cfe/trunk/lib/CodeGen/TargetInfo.cpp     41 lines
cfe/trunk/test/CodeGenCXX/regcall.cpp    12 lines

Diff 107696

cfe/trunk/lib/CodeGen/TargetInfo.cpp

@@ static llvm::Type* X86AdjustInlineAsmType(CodeGen::CodeGenFunction &CGF,
   // No operation needed
   return Ty;
 }

 /// Returns true if this type can be passed in SSE registers with the
 /// X86_VectorCall calling convention. Shared between x86_32 and x86_64.
 static bool isX86VectorTypeForVectorCall(ASTContext &Context, QualType Ty) {
   if (const BuiltinType *BT = Ty->getAs<BuiltinType>()) {
-    if (BT->isFloatingPoint() && BT->getKind() != BuiltinType::Half)
+    if (BT->isFloatingPoint() && BT->getKind() != BuiltinType::Half) {
+      if (BT->getKind() == BuiltinType::LongDouble) {
+        if (&Context.getTargetInfo().getLongDoubleFormat() ==
+            &llvm::APFloat::x87DoubleExtended())
+          return false;
+      }
       return true;
+    }
   } else if (const VectorType *VT = Ty->getAs<VectorType>()) {
     // vectorcall can pass XMM, YMM, and ZMM vectors. We don't pass SSE1 MMX
     // registers specially.
     unsigned VecSize = Context.getTypeSize(VT);
     if (VecSize == 128 || VecSize == 256 || VecSize == 512)
       return true;
   }
   return false;

@@ void X86_64ABIInfo::computeInfo(CGFunctionInfo &FI) const {

   bool IsRegCall = FI.getCallingConvention() == llvm::CallingConv::X86_RegCall;

   // Keep track of the number of assigned registers.
   unsigned FreeIntRegs = IsRegCall ? 11 : 6;
   unsigned FreeSSERegs = IsRegCall ? 16 : 8;
   unsigned NeededInt, NeededSSE;

-  if (IsRegCall && FI.getReturnType()->getTypePtr()->isRecordType() &&
-      !FI.getReturnType()->getTypePtr()->isUnionType()) {
-    FI.getReturnInfo() =
-        classifyRegCallStructType(FI.getReturnType(), NeededInt, NeededSSE);
-    if (FreeIntRegs >= NeededInt && FreeSSERegs >= NeededSSE) {
-      FreeIntRegs -= NeededInt;
-      FreeSSERegs -= NeededSSE;
-    } else {
-      FI.getReturnInfo() = getIndirectReturnResult(FI.getReturnType());
-    }
-  } else if (!getCXXABI().classifyReturnType(FI))
-    FI.getReturnInfo() = classifyReturnType(FI.getReturnType());
+  if (!getCXXABI().classifyReturnType(FI)) {
+    if (IsRegCall && FI.getReturnType()->getTypePtr()->isRecordType() &&
+        !FI.getReturnType()->getTypePtr()->isUnionType()) {
+      FI.getReturnInfo() =
+          classifyRegCallStructType(FI.getReturnType(), NeededInt, NeededSSE);
+      if (FreeIntRegs >= NeededInt && FreeSSERegs >= NeededSSE) {
+        FreeIntRegs -= NeededInt;
+        FreeSSERegs -= NeededSSE;
+      } else {
+        FI.getReturnInfo() = getIndirectReturnResult(FI.getReturnType());
+      }
+    } else if (IsRegCall && FI.getReturnType()->getAs<ComplexType>()) {
+      // Complex Long Double Type is passed in Memory when Regcall
+      // calling convention is used.
+      const ComplexType *CT = FI.getReturnType()->getAs<ComplexType>();
+      if (getContext().getCanonicalType(CT->getElementType()) ==
+          getContext().LongDoubleTy)
+        FI.getReturnInfo() = getIndirectReturnResult(FI.getReturnType());
+    } else
+      FI.getReturnInfo() = classifyReturnType(FI.getReturnType());
+  }

   // If the return value is indirect, then the hidden argument is consuming one
   // integer register.
   if (FI.getReturnInfo().isIndirect())
     --FreeIntRegs;

   // The chain argument effectively gives us another free register.
   if (FI.isChainCall())

cfe/trunk/test/CodeGenCXX/regcall.cpp

@@ #endif
   // CHECK-WIN32-DAG: define linkonce_odr x86_regcallcc void @"\01??$freeTempFunc@H@@YwXH@Z"
 };

 bool __regcall operator ==(const test_class&, const test_class&){ --x; return false;}
 // CHECK-LIN-DAG: define x86_regcallcc zeroext i1 @_ZeqRK10test_classS1_
 // CHECK-WIN64-DAG: define x86_regcallcc zeroext i1 @"\01??8@Yw_NAEBVtest_class@@0@Z"
 // CHECK-WIN32-DAG: define x86_regcallcc zeroext i1 @"\01??8@Yw_NABVtest_class@@0@Z"

 test_class __regcall operator""_test_class (unsigned long long) { ++x; return test_class{};}
-// CHECK-LIN64-DAG: define x86_regcallcc %class.test_class @_Zli11_test_classy(i64)
+// CHECK-LIN64-DAG: define x86_regcallcc void @_Zli11_test_classy(%class.test_class* noalias sret %agg.result, i64)
 // CHECK-LIN32-DAG: define x86_regcallcc void @_Zli11_test_classy(%class.test_class* inreg noalias sret %agg.result, i64)
 // CHECK-WIN64-DAG: \01??__K_test_class@@Yw?AVtest_class@@_K@Z"
 // CHECK-WIN32-DAG: \01??__K_test_class@@Yw?AVtest_class@@_K@Z"

 template<typename T>
 void __regcall freeTempFunc(T i){}
 // CHECK-LIN-DAG: define linkonce_odr x86_regcallcc void @_Z24__regcall3__freeTempFuncIiEvT_
 // CHECK-WIN64-DAG: define linkonce_odr x86_regcallcc void @"\01??$freeTempFunc@H@@YwXH@Z"
 // CHECK-WIN32-DAG: define linkonce_odr x86_regcallcc void @"\01??$freeTempFunc@H@@YwXH@Z"

 // class to force generation of functions
 void force_gen() {
   test_class t;
   test_class t2 = 12_test_class;
   t += t2;
   auto t3 = 100_test_class;
   t3.tempFunc(1);
   freeTempFunc(1);
   t3.do_thing();
 }
+
+long double _Complex __regcall foo(long double _Complex f) {
+  return f;
+}
+// CHECK-LIN64-DAG: define x86_regcallcc void @_Z15__regcall3__fooCe({ x86_fp80, x86_fp80 }* noalias sret %agg.result, { x86_fp80, x86_fp80 }* byval align 16 %f)
+// CHECK-LIN32-DAG: define x86_regcallcc void @_Z15__regcall3__fooCe({ x86_fp80, x86_fp80 }* inreg noalias sret %agg.result, { x86_fp80, x86_fp80 }* byval align 4 %f)
+// CHECK-WIN64-DAG: define x86_regcallcc { double, double } @"\01?foo@@YwU?$_Complex@O@__clang@@U12@@Z"(double %f.0, double %f.1)
+// CHECK-WIN32-DAG: define x86_regcallcc { double, double } @"\01?foo@@YwU?$_Complex@O@__clang@@U12@@Z"(double %f.0, double %f.1)