This is an archive of the discontinued LLVM Phabricator instance.

Differential D100225

[Clang][AArch64] Coerce integer return values through an undef vector
Abandoned · Public

Authored by asavonic on Apr 9 2021, 12:59 PM.

Details

Reviewers

rjmccall
dmgreen
t.p.northover
ostannard
sdesmalen
momchil.velikov
SjoerdMeijer

Summary

If the target ABI requires coercion to a larger type, the higher bits of
the resulting value are supposed to be undefined. However, before this
patch Clang CodeGen generated a zext instruction to coerce a value
to a larger type, forcing the higher bits to zero.

This is problematic in some cases:

struct st {
  int i;
};
struct st foo(int i);
struct st bar(int x) {
  return foo(x);
}

For AArch64 Clang generates the following LLVM IR:

define i64 @bar(i32 %x) {
  %call = call i64 @foo(i32 %0)
  %coerce.val.ii = trunc i64 %call to i32
  ;; ... store to alloca and load back
  %coerce.val.ii2 = zext i32 %1 to i64
  ret i64 %coerce.val.ii2
}

Coercion is done with a trunc and a zext. After optimizations we
get the following:

define i64 @bar(i32 %x) local_unnamed_addr #0 {
entry:
  %call = tail call i64 @foo(i32 %x)
  %coerce.val.ii2 = and i64 %call, 4294967295
  ret i64 %coerce.val.ii2
}

The compiler has to preserve the semantics of the zext instruction, even
though no extension or truncation is required in this case.
The extra and instruction also prevents tail call optimization.
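
Why the optimizer ends up with an and can be illustrated outside of LLVM. The following is a minimal C++ sketch, not code from the patch; the function names are made up for illustration. It models the trunc/zext pair and the mask and checks that they agree on a few sample values.

```cpp
#include <cassert>
#include <cstdint>

// Models `trunc i64 %v to i32` followed by `zext i32 ... to i64`.
std::uint64_t truncThenZext(std::uint64_t v) {
  return static_cast<std::uint64_t>(static_cast<std::uint32_t>(v));
}

// Models `and i64 %v, 4294967295`.
std::uint64_t maskLow32(std::uint64_t v) {
  return v & 4294967295ULL;
}

// Checks the equivalence on a handful of sample values.
void checkFoldOnSamples() {
  const std::uint64_t samples[] = {0ULL, 1ULL, 0xFFFFFFFFULL,
                                   0x123456789ABCDEF0ULL, ~0ULL};
  for (std::uint64_t v : samples)
    assert(truncThenZext(v) == maskLow32(v));
}
```

Because the mask must clear the upper 32 bits of whatever foo returned, the result of the call can no longer be forwarded unchanged.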

In order to keep information about undefined higher bits, the patch
replaces zext with a sequence of an insertelement and a bitcast:

define i64 @_Z3bari(i32 %x) local_unnamed_addr #0 {
entry:
  %call = tail call i64 @_Z3fooi(i32 %x) #2
  %coerce.val.ii = trunc i64 %call to i32
  %coerce.val.vec = insertelement <2 x i32> undef, i32 %coerce.val.ii, i8 0
  %coerce.val.vec.ii = bitcast <2 x i32> %coerce.val.vec to i64
  ret i64 %coerce.val.vec.ii
}

InstCombine can then fold this sequence into a no-op, which allows tail
call optimization (see D100227).
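
The effect of the insertelement/bitcast pair can be sketched in plain C++ under the assumption of a little-endian layout, where element 0 of the vector lands in the low bits of the i64. The names below are hypothetical, and the undefLane parameter merely stands in for whatever the undef element happens to contain.

```cpp
#include <cassert>
#include <cstdint>

// Little-endian model of:
//   %vec = insertelement <2 x i32> undef, i32 %val, i8 0
//   %res = bitcast <2 x i32> %vec to i64
// `undefLane` is a stand-in for the contents of the undef element.
std::uint64_t coerceThroughVector(std::uint32_t val, std::uint32_t undefLane) {
  return (static_cast<std::uint64_t>(undefLane) << 32) | val;
}

// Models the `trunc i64 ... to i32` that reads the value back.
std::uint32_t lowHalf(std::uint64_t v) {
  return static_cast<std::uint32_t>(v);
}

// The low half survives regardless of what the undef lane holds.
void checkLowHalfPreserved() {
  const std::uint32_t lanes[] = {0u, 0xFFFFFFFFu, 0x12345678u};
  for (std::uint32_t lane : lanes)
    assert(lowHalf(coerceThroughVector(0xDEADBEEFu, lane)) == 0xDEADBEEFu);
}
```

Since any value of the upper lane is acceptable, the optimizer is free to pick the bits that are already there, which is what makes the whole sequence removable.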

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

asavonic created this revision. (Apr 9 2021, 12:59 PM)

Herald added subscribers: danielkiss, kristof.beyls. (Apr 9 2021, 12:59 PM)

asavonic requested review of this revision. (Apr 9 2021, 12:59 PM)

Herald added a project: Restricted Project. (Apr 9 2021, 12:59 PM)

Herald added a subscriber: cfe-commits.

asavonic mentioned this in D100227: [AArch64][InstCombine] Optimize coercion through an undef vector. (Apr 9 2021, 1:16 PM)

Harbormaster completed remote builds in B98055: Diff 336547. (Apr 9 2021, 1:50 PM)

asavonic edited the summary of this revision. (Apr 11 2021, 8:08 AM)

asavonic added reviewers: rjmccall, dmgreen.

Why does the ABI "require" this to be returned as an i64 if some of the bits are undefined?

In D100225#2681809, @rjmccall wrote:

Why does the ABI "require" this to be returned as an i64 if some of the bits are undefined?

AArch64 ABI requires return values (of composite types) to be rounded up to 64 bits (see AArch64ABIInfo::classifyReturnType). I assume that if a value, say i16, is rounded up to i64, then the upper 48 bits can be arbitrary (undefined). I think this is aligned with the description of CreateCoercedLoad:

/// CreateCoercedLoad - Create a load from \arg SrcPtr interpreted as
/// a pointer to an object of type \arg Ty, known to be aligned to
/// \arg SrcAlign bytes.
///
/// This safely handles the case when the src type is smaller than the
/// destination type; in this situation the values of bits which not
/// present in the src are undefined.
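
The arithmetic behind the "upper 48 bits" remark can be written out as a small C++ sketch. The helper names are hypothetical and this is not the code in AArch64ABIInfo::classifyReturnType; it only assumes, as stated above, that the size is rounded up to a multiple of 64 bits.

```cpp
#include <cassert>

// Rounds a size in bits up to the next multiple of 64 (assumed rounding rule).
constexpr unsigned roundUpTo64(unsigned bits) {
  return (bits + 63u) / 64u * 64u;
}

// Number of bits in the rounded-up value that do not come from the source.
constexpr unsigned unspecifiedHighBits(unsigned bits) {
  return roundUpTo64(bits) - bits;
}

// A 16-bit value rounded up to 64 bits leaves 48 bits unspecified.
static_assert(roundUpTo64(16) == 64, "i16 is returned as i64");
static_assert(unspecifiedHighBits(16) == 48, "upper 48 bits are arbitrary");

void checkRounding() {
  assert(unspecifiedHighBits(8) == 56);
  assert(unspecifiedHighBits(32) == 32);
  assert(unspecifiedHighBits(64) == 0);
}
```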

AArch64 only has 64-bit integer registers, so of course the algorithm is specified that way. LLVM could nonetheless choose to return an i32.

So we can just remove this rounding from classifyReturnType?
Thanks a lot John! I will upload this change as a separate review.

You should probably talk it over with AArch64 backend folks, but yes, abstractly it seems reasonable. CC'ing Tim Northover.

cjdb mentioned this in D100277: [libc++] [test] Detect an improperly noexcept'ed __decay_copy. (Apr 11 2021, 4:21 PM)

dmgreen added reviewers: ostannard, sdesmalen, momchil.velikov, SjoerdMeijer. (Apr 12 2021, 1:21 AM)

After reading the summary/intent of the patch, I thought the same thing as @rjmccall. Simply returning an i32 for the above example and removing the rounding-up seems right to me.

asavonic mentioned this in D100591: [Clang][AArch64] Disable rounding of return values for AArch64. (Apr 15 2021, 12:05 PM)

asavonic mentioned this in rGb451ecd86e13: [Clang][AArch64] Disable rounding of return values for AArch64. (May 4 2021, 10:29 AM)

asavonic abandoned this revision. (Nov 9 2021, 5:00 AM)

Revision Contents

Path                                     Size
clang/lib/CodeGen/CGCall.cpp             51 lines
clang/test/CodeGen/arm64-arguments.c     60 lines
clang/test/CodeGenCXX/trivial_abi.cpp    3 lines

Diff 336547

clang/lib/CodeGen/CGCall.cpp

[1,165 lines not shown]

/// CoerceIntOrPtrToIntOrPtr - Convert a value Val to the specific Ty where both		/// CoerceIntOrPtrToIntOrPtr - Convert a value Val to the specific Ty where both
/// are either integers or pointers. This does a truncation of the value if it		/// are either integers or pointers. This does a truncation of the value if it
/// is too large or a zero extension if it is too small.		/// is too large or a zero extension if it is too small.
///		///
/// This behaves as if the value were coerced through memory, so on big-endian		/// This behaves as if the value were coerced through memory, so on big-endian
/// targets the high bits are preserved in a truncation, while little-endian		/// targets the high bits are preserved in a truncation, while little-endian
/// targets preserve the low bits.		/// targets preserve the low bits.
static llvm::Value *CoerceIntOrPtrToIntOrPtr(llvm::Value *Val,		static llvm::Value *CoerceIntOrPtrToIntOrPtr(llvm::Value *Val, llvm::Type *Ty,
llvm::Type *Ty,		bool UndefUnusedBits,
CodeGenFunction &CGF) {		CodeGenFunction &CGF) {
if (Val->getType() == Ty)		if (Val->getType() == Ty)
return Val;		return Val;

if (isa<llvm::PointerType>(Val->getType())) {		if (isa<llvm::PointerType>(Val->getType())) {
// If this is Pointer->Pointer avoid conversion to and from int.		// If this is Pointer->Pointer avoid conversion to and from int.
if (isa<llvm::PointerType>(Ty))		if (isa<llvm::PointerType>(Ty))
return CGF.Builder.CreateBitCast(Val, Ty, "coerce.val");		return CGF.Builder.CreateBitCast(Val, Ty, "coerce.val");
[17 lines not shown]	if (DL.isBigEndian()) {
if (SrcSize > DstSize) {		if (SrcSize > DstSize) {
Val = CGF.Builder.CreateLShr(Val, SrcSize - DstSize, "coerce.highbits");		Val = CGF.Builder.CreateLShr(Val, SrcSize - DstSize, "coerce.highbits");
Val = CGF.Builder.CreateTrunc(Val, DestIntTy, "coerce.val.ii");		Val = CGF.Builder.CreateTrunc(Val, DestIntTy, "coerce.val.ii");
} else {		} else {
Val = CGF.Builder.CreateZExt(Val, DestIntTy, "coerce.val.ii");		Val = CGF.Builder.CreateZExt(Val, DestIntTy, "coerce.val.ii");
Val = CGF.Builder.CreateShl(Val, DstSize - SrcSize, "coerce.highbits");		Val = CGF.Builder.CreateShl(Val, DstSize - SrcSize, "coerce.highbits");
}		}
} else {		} else {
		llvm::IntegerType *OrigType = cast<llvm::IntegerType>(Val->getType());
		unsigned OrigWidth = OrigType->getBitWidth();
		unsigned DestWidth = cast<llvm::IntegerType>(DestIntTy)->getBitWidth();
		if (UndefUnusedBits && DestWidth > OrigWidth &&
		DestWidth % OrigWidth == 0) {
		// Insert the value in an undef vector, and then bitcast the vector to
		// the destination type. Unused vector elements and the corresponding
		// bits of the destination value can be treated as undef by
		// optimizations.
		llvm::VectorType *VecType =
		llvm::VectorType::get(OrigType, DestWidth / OrigWidth,
		/*Scalable=*/false);
		llvm::Value *Vec = CGF.Builder.CreateInsertElement(
		llvm::UndefValue::get(VecType), Val, CGF.Builder.getInt8(0),
		"coerce.val.vec");
		Val = CGF.Builder.CreateBitCast(Vec, DestIntTy, "coerce.val.vec.ii");
		} else {
// Little-endian targets preserve the low bits. No shifts required.		// Little-endian targets preserve the low bits. No shifts required.
Val = CGF.Builder.CreateIntCast(Val, DestIntTy, false, "coerce.val.ii");		Val = CGF.Builder.CreateIntCast(Val, DestIntTy, false, "coerce.val.ii");
}		}
}		}
		}

if (isa<llvm::PointerType>(Ty))		if (isa<llvm::PointerType>(Ty))
Val = CGF.Builder.CreateIntToPtr(Val, Ty, "coerce.val.ip");		Val = CGF.Builder.CreateIntToPtr(Val, Ty, "coerce.val.ip");
return Val;		return Val;
}		}
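
The doc comment on CoerceIntOrPtrToIntOrPtr describes coercion "as if through memory": a truncation keeps the high bits on big-endian targets and the low bits on little-endian ones. A minimal C++ sketch of that rule, using hypothetical helper names and plain integers in place of llvm::Value, might look as follows. It mirrors the lshr/trunc and zext/shl pairs in the big-endian branch and the plain integer cast in the little-endian fallback; the undef-vector path added by this patch is not modeled.

```cpp
#include <cassert>
#include <cstdint>

// Mask selecting the low `bits` bits of a 64-bit value (1 <= bits <= 64).
std::uint64_t lowMask(unsigned bits) {
  return bits >= 64 ? ~0ULL : ((1ULL << bits) - 1);
}

// Little-endian rule: truncation keeps the low bits, extension zero-extends.
std::uint64_t coerceLittleEndian(std::uint64_t val, unsigned srcBits,
                                 unsigned dstBits) {
  assert(srcBits >= 1 && srcBits <= 64 && dstBits >= 1 && dstBits <= 64);
  return val & lowMask(srcBits) & lowMask(dstBits);
}

// Big-endian rule: truncation keeps the high bits (shift right, then
// truncate); extension places the value in the high bits (extend, then
// shift left).
std::uint64_t coerceBigEndian(std::uint64_t val, unsigned srcBits,
                              unsigned dstBits) {
  assert(srcBits >= 1 && srcBits <= 64 && dstBits >= 1 && dstBits <= 64);
  val &= lowMask(srcBits);
  if (srcBits > dstBits)
    return (val >> (srcBits - dstBits)) & lowMask(dstBits);
  return (val << (dstBits - srcBits)) & lowMask(dstBits);
}
```

For example, coercing the 32-bit value 0x11223344 to 16 bits gives 0x1122 under the big-endian rule and 0x3344 under the little-endian rule.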



/// CreateCoercedLoad - Create a load from \arg SrcPtr interpreted as		/// CreateCoercedLoad - Create a load from \arg SrcPtr interpreted as
/// a pointer to an object of type \arg Ty, known to be aligned to		/// a pointer to an object of type \arg Ty, known to be aligned to
/// \arg SrcAlign bytes.		/// \arg SrcAlign bytes.
///		///
/// This safely handles the case when the src type is smaller than the		/// This safely handles the case when the src type is smaller than the
/// destination type; in this situation the values of bits which not		/// destination type; in this situation the values of bits which not
/// present in the src are undefined.		/// present in the src are undefined.
static llvm::Value *CreateCoercedLoad(Address Src, llvm::Type *Ty,		static llvm::Value *CreateCoercedLoad(Address Src, llvm::Type *Ty,
		bool UndefUnusedBits,
CodeGenFunction &CGF) {		CodeGenFunction &CGF) {
llvm::Type *SrcTy = Src.getElementType();		llvm::Type *SrcTy = Src.getElementType();

// If SrcTy and Ty are the same, just do a load.		// If SrcTy and Ty are the same, just do a load.
if (SrcTy == Ty)		if (SrcTy == Ty)
return CGF.Builder.CreateLoad(Src);		return CGF.Builder.CreateLoad(Src);

llvm::TypeSize DstSize = CGF.CGM.getDataLayout().getTypeAllocSize(Ty);		llvm::TypeSize DstSize = CGF.CGM.getDataLayout().getTypeAllocSize(Ty);

if (llvm::StructType *SrcSTy = dyn_cast<llvm::StructType>(SrcTy)) {		if (llvm::StructType *SrcSTy = dyn_cast<llvm::StructType>(SrcTy)) {
Src = EnterStructPointerForCoercedAccess(Src, SrcSTy,		Src = EnterStructPointerForCoercedAccess(Src, SrcSTy,
DstSize.getFixedSize(), CGF);		DstSize.getFixedSize(), CGF);
SrcTy = Src.getElementType();		SrcTy = Src.getElementType();
}		}

llvm::TypeSize SrcSize = CGF.CGM.getDataLayout().getTypeAllocSize(SrcTy);		llvm::TypeSize SrcSize = CGF.CGM.getDataLayout().getTypeAllocSize(SrcTy);

// If the source and destination are integer or pointer types, just do an		// If the source and destination are integer or pointer types, just do an
// extension or truncation to the desired type.		// extension or truncation to the desired type.
if ((isa<llvm::IntegerType>(Ty) || isa<llvm::PointerType>(Ty)) &&		if ((isa<llvm::IntegerType>(Ty) || isa<llvm::PointerType>(Ty)) &&
(isa<llvm::IntegerType>(SrcTy) || isa<llvm::PointerType>(SrcTy))) {		(isa<llvm::IntegerType>(SrcTy) || isa<llvm::PointerType>(SrcTy))) {
llvm::Value *Load = CGF.Builder.CreateLoad(Src);		llvm::Value *Load = CGF.Builder.CreateLoad(Src);
return CoerceIntOrPtrToIntOrPtr(Load, Ty, CGF);		return CoerceIntOrPtrToIntOrPtr(Load, Ty, UndefUnusedBits, CGF);
}		}

// If load is legal, just bitcast the src pointer.		// If load is legal, just bitcast the src pointer.
if (!SrcSize.isScalable() && !DstSize.isScalable() &&		if (!SrcSize.isScalable() && !DstSize.isScalable() &&
SrcSize.getFixedSize() >= DstSize.getFixedSize()) {		SrcSize.getFixedSize() >= DstSize.getFixedSize()) {
// Generally SrcSize is never greater than DstSize, since this means we are		// Generally SrcSize is never greater than DstSize, since this means we are
// losing bits. However, this can happen in cases where the structure has		// losing bits. However, this can happen in cases where the structure has
// additional padding, for example due to a user specified alignment.		// additional padding, for example due to a user specified alignment.
[49 lines not shown]
}		}

/// CreateCoercedStore - Create a store to \arg DstPtr from \arg Src,		/// CreateCoercedStore - Create a store to \arg DstPtr from \arg Src,
/// where the source and destination may have different types. The		/// where the source and destination may have different types. The
/// destination is known to be aligned to \arg DstAlign bytes.		/// destination is known to be aligned to \arg DstAlign bytes.
///		///
/// This safely handles the case when the src type is larger than the		/// This safely handles the case when the src type is larger than the
/// destination type; the upper bits of the src will be lost.		/// destination type; the upper bits of the src will be lost.
static void CreateCoercedStore(llvm::Value *Src,		static void CreateCoercedStore(llvm::Value *Src, Address Dst,
Address Dst,		bool DstIsVolatile, bool UndefUnusedBits,
bool DstIsVolatile,
CodeGenFunction &CGF) {		CodeGenFunction &CGF) {
llvm::Type *SrcTy = Src->getType();		llvm::Type *SrcTy = Src->getType();
llvm::Type *DstTy = Dst.getElementType();		llvm::Type *DstTy = Dst.getElementType();
if (SrcTy == DstTy) {		if (SrcTy == DstTy) {
CGF.Builder.CreateStore(Src, Dst, DstIsVolatile);		CGF.Builder.CreateStore(Src, Dst, DstIsVolatile);
return;		return;
}		}

[13 lines not shown]	if (SrcPtrTy && DstPtrTy &&
CGF.Builder.CreateStore(Src, Dst, DstIsVolatile);		CGF.Builder.CreateStore(Src, Dst, DstIsVolatile);
return;		return;
}		}

// If the source and destination are integer or pointer types, just do an		// If the source and destination are integer or pointer types, just do an
// extension or truncation to the desired type.		// extension or truncation to the desired type.
if ((isa<llvm::IntegerType>(SrcTy) || isa<llvm::PointerType>(SrcTy)) &&		if ((isa<llvm::IntegerType>(SrcTy) || isa<llvm::PointerType>(SrcTy)) &&
(isa<llvm::IntegerType>(DstTy) || isa<llvm::PointerType>(DstTy))) {		(isa<llvm::IntegerType>(DstTy) || isa<llvm::PointerType>(DstTy))) {
Src = CoerceIntOrPtrToIntOrPtr(Src, DstTy, CGF);		Src = CoerceIntOrPtrToIntOrPtr(Src, DstTy, UndefUnusedBits, CGF);
CGF.Builder.CreateStore(Src, Dst, DstIsVolatile);		CGF.Builder.CreateStore(Src, Dst, DstIsVolatile);
return;		return;
}		}

llvm::TypeSize DstSize = CGF.CGM.getDataLayout().getTypeAllocSize(DstTy);		llvm::TypeSize DstSize = CGF.CGM.getDataLayout().getTypeAllocSize(DstTy);

// If store is legal, just bitcast the src pointer.		// If store is legal, just bitcast the src pointer.
if (isa<llvm::ScalableVectorType>(SrcTy) ||		if (isa<llvm::ScalableVectorType>(SrcTy) ||
[1,510 lines not shown]	case ABIArgInfo::Direct: {
Builder.CreateMemCpy(Ptr, AddrToStoreInto, DstSize);		Builder.CreateMemCpy(Ptr, AddrToStoreInto, DstSize);
}		}

} else {		} else {
// Simple case, just do a coerced store of the argument into the alloca.		// Simple case, just do a coerced store of the argument into the alloca.
assert(NumIRArgs == 1);		assert(NumIRArgs == 1);
auto AI = Fn->getArg(FirstIRArg);		auto AI = Fn->getArg(FirstIRArg);
AI->setName(Arg->getName() + ".coerce");		AI->setName(Arg->getName() + ".coerce");
CreateCoercedStore(AI, Ptr, /*DstIsVolatile=*/false, *this);		CreateCoercedStore(AI, Ptr, /*DstIsVolatile=*/false,
		/*UndefUnusedBits=*/false, *this);
}		}

// Match to what EmitParmDecl is expecting for this type.		// Match to what EmitParmDecl is expecting for this type.
if (CodeGenFunction::hasScalarEvaluationKind(Ty)) {		if (CodeGenFunction::hasScalarEvaluationKind(Ty)) {
llvm::Value *V =		llvm::Value *V =
EmitLoadOfScalar(Alloca, false, Ty, Arg->getBeginLoc());		EmitLoadOfScalar(Alloca, false, Ty, Arg->getBeginLoc());
if (isPromoted)		if (isPromoted)
V = emitArgumentDemotion(*this, Arg, V);		V = emitArgumentDemotion(*this, Arg, V);
[568 lines not shown]	if (RetAI.getCoerceToType() == ConvertType(RetTy) &&
// Otherwise, we have to do a simple load.		// Otherwise, we have to do a simple load.
} else {		} else {
RV = Builder.CreateLoad(ReturnValue);		RV = Builder.CreateLoad(ReturnValue);
}		}
} else {		} else {
// If the value is offset in memory, apply the offset now.		// If the value is offset in memory, apply the offset now.
Address V = emitAddressAtOffset(*this, ReturnValue, RetAI);		Address V = emitAddressAtOffset(*this, ReturnValue, RetAI);

RV = CreateCoercedLoad(V, RetAI.getCoerceToType(), *this);		RV = CreateCoercedLoad(V, RetAI.getCoerceToType(),
		/*UndefUnusedBits=*/true, *this);
}		}

// In ARC, end functions that return a retainable type with a call		// In ARC, end functions that return a retainable type with a call
// to objc_autoreleaseReturnValue.		// to objc_autoreleaseReturnValue.
if (AutoreleaseResult) {		if (AutoreleaseResult) {
#ifndef NDEBUG		#ifndef NDEBUG
// Type::isObjCRetainabletype has to be called on a QualType that hasn't		// Type::isObjCRetainabletype has to be called on a QualType that hasn't
// been stripped of the typedefs, so we cannot use RetTy here. Get the		// been stripped of the typedefs, so we cannot use RetTy here. Get the
[1,453 lines not shown]	case ABIArgInfo::Direct: {
for (unsigned i = 0, e = STy->getNumElements(); i != e; ++i) {		for (unsigned i = 0, e = STy->getNumElements(); i != e; ++i) {
Address EltPtr = Builder.CreateStructGEP(Src, i);		Address EltPtr = Builder.CreateStructGEP(Src, i);
llvm::Value *LI = Builder.CreateLoad(EltPtr);		llvm::Value *LI = Builder.CreateLoad(EltPtr);
IRCallArgs[FirstIRArg + i] = LI;		IRCallArgs[FirstIRArg + i] = LI;
}		}
} else {		} else {
// In the simple case, just pass the coerced loaded value.		// In the simple case, just pass the coerced loaded value.
assert(NumIRArgs == 1);		assert(NumIRArgs == 1);
llvm::Value *Load =		llvm::Value *Load = CreateCoercedLoad(Src, ArgInfo.getCoerceToType(),
CreateCoercedLoad(Src, ArgInfo.getCoerceToType(), *this);		/*UndefUnusedBits=*/false, *this);

if (CallInfo.isCmseNSCall()) {		if (CallInfo.isCmseNSCall()) {
// For certain parameter types, clear padding bits, as they may reveal		// For certain parameter types, clear padding bits, as they may reveal
// sensitive information.		// sensitive information.
// Small struct/union types are passed as integer arrays.		// Small struct/union types are passed as integer arrays.
auto *ATy = dyn_cast<llvm::ArrayType>(Load->getType());		auto *ATy = dyn_cast<llvm::ArrayType>(Load->getType());
if (ATy != nullptr && isa<RecordType>(I->Ty.getCanonicalType()))		if (ATy != nullptr && isa<RecordType>(I->Ty.getCanonicalType()))
Load = EmitCMSEClearRecord(Load, ATy, I->Ty);		Load = EmitCMSEClearRecord(Load, ATy, I->Ty);
[460 lines not shown]	case ABIArgInfo::Direct: {

if (!DestPtr.isValid()) {		if (!DestPtr.isValid()) {
DestPtr = CreateMemTemp(RetTy, "coerce");		DestPtr = CreateMemTemp(RetTy, "coerce");
DestIsVolatile = false;		DestIsVolatile = false;
}		}

// If the value is offset in memory, apply the offset now.		// If the value is offset in memory, apply the offset now.
Address StorePtr = emitAddressAtOffset(*this, DestPtr, RetAI);		Address StorePtr = emitAddressAtOffset(*this, DestPtr, RetAI);
CreateCoercedStore(CI, StorePtr, DestIsVolatile, *this);		CreateCoercedStore(CI, StorePtr, DestIsVolatile, /*UndefUnusedBits=*/true,
		*this);

return convertTempToRValue(DestPtr, RetTy, SourceLocation());		return convertTempToRValue(DestPtr, RetTy, SourceLocation());
}		}

case ABIArgInfo::Expand:		case ABIArgInfo::Expand:
case ABIArgInfo::IndirectAliased:		case ABIArgInfo::IndirectAliased:
llvm_unreachable("Invalid ABI kind for return argument");		llvm_unreachable("Invalid ABI kind for return argument");
}		}
[45 lines not shown]

clang/test/CodeGen/arm64-arguments.c

[739 lines not shown]	// CHECK: bitcast i8* [[ALIGNED_LIST]] to %struct.HFAv3*
return r.arr[2];		return r.arr[2];
}		}

float32x3_t test_hva_v3_call(HFAv3 *a) {		float32x3_t test_hva_v3_call(HFAv3 *a) {
// CHECK-LABEL: define{{.*}} <3 x float> @test_hva_v3_call(%struct.HFAv3* %a)		// CHECK-LABEL: define{{.*}} <3 x float> @test_hva_v3_call(%struct.HFAv3* %a)
// CHECK: call <3 x float> (i32, ...) @test_hva_v3(i32 1, [4 x <4 x float>] {{.*}})		// CHECK: call <3 x float> (i32, ...) @test_hva_v3(i32 1, [4 x <4 x float>] {{.*}})
return test_hva_v3(1, *a);		return test_hva_v3(1, *a);
}		}

		char ret_coerce1(void) {
		// CHECK-LABEL: i8 @ret_coerce1
		// CHECK: alloca i8
		// CHECK-NEXT: load i8
		// CHECK-NEXT: ret i8
		}

		short ret_coerce2(void) {
		// CHECK-LABEL: i16 @ret_coerce2
		// CHECK: alloca i16
		// CHECK-NEXT: load i16
		// CHECK-NEXT: ret i16
		}

		int ret_coerce3(void) {
		// CHECK-LABEL: i32 @ret_coerce3
		// CHECK: alloca i32
		// CHECK-NEXT: load i32
		// CHECK-NEXT: ret i32
		}

		struct ret_coerce_char {
		char f0;
		};
		struct ret_coerce_char ret_coerce4(void) {
		// CHECK-LABEL: i64 @ret_coerce4
		// CHECK: %[[ALLOCA:.*]] = alloca %struct.ret_coerce_char
		// CHECK: %[[GEP:.*]] = getelementptr {{.*}} %[[ALLOCA]], i32 0, i32 0
		// CHECK: %[[LOAD:.*]] = load i8, i8* %[[GEP]]
		// CHECK: %[[VEC:.*]] = insertelement <8 x i8> undef, i8 %[[LOAD]], i8 0
		// CHECK: %[[CAST:.*]] = bitcast <8 x i8> %[[VEC]] to i64
		// CHECK: ret i64 %[[CAST]]
		}

		struct ret_coerce_short {
		short f0;
		};
		struct ret_coerce_short ret_coerce5(void) {
		// CHECK-LABEL: i64 @ret_coerce5
		// CHECK: %[[ALLOCA:.*]] = alloca %struct.ret_coerce_short
		// CHECK: %[[GEP:.*]] = getelementptr {{.*}} %[[ALLOCA]], i32 0, i32 0
		// CHECK: %[[LOAD:.*]] = load i16, i16* %[[GEP]]
		// CHECK: %[[VEC:.*]] = insertelement <4 x i16> undef, i16 %[[LOAD]], i8 0
		// CHECK: %[[CAST:.*]] = bitcast <4 x i16> %[[VEC]] to i64
		// CHECK: ret i64 %[[CAST]]
		}

		struct ret_coerce_int {
		int f0;
		};
		struct ret_coerce_int ret_coerce6(void) {
		// CHECK-LABEL: i64 @ret_coerce6
		// CHECK: %[[ALLOCA:.*]] = alloca %struct.ret_coerce_int
		// CHECK: %[[GEP:.*]] = getelementptr {{.*}} %[[ALLOCA]], i32 0, i32 0
		// CHECK: %[[LOAD:.*]] = load i32, i32* %[[GEP]]
		// CHECK: %[[VEC:.*]] = insertelement <2 x i32> undef, i32 %[[LOAD]], i8 0
		// CHECK: %[[CAST:.*]] = bitcast <2 x i32> %[[VEC]] to i64
		// CHECK: ret i64 %[[CAST]]
		}
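
The vector types expected by ret_coerce4 through ret_coerce6 (<8 x i8>, <4 x i16>, <2 x i32>) follow from the width check added to CoerceIntOrPtrToIntOrPtr: the undef-vector path is taken only when the destination is wider than the source and its width is an exact multiple of the source width. The C++ sketch below restates that check with a hypothetical helper that is not part of Clang; it assumes UndefUnusedBits is set and returns 0 where the patch would fall back to the plain integer cast.

```cpp
#include <cassert>

// Hypothetical helper, not part of Clang: element count of the undef vector
// used for the coercion, or 0 when the width check in the patch fails.
unsigned undefVectorLanes(unsigned origWidth, unsigned destWidth) {
  assert(origWidth > 0 && "integer types have a non-zero bit width");
  if (destWidth > origWidth && destWidth % origWidth == 0)
    return destWidth / origWidth;
  return 0;
}
```

With a 64-bit destination, an 8-bit source gives 8 lanes, a 16-bit source gives 4, and a 32-bit source gives 2. A 24-bit source does not divide 64 evenly and therefore keeps the zext-based path.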

clang/test/CodeGenCXX/trivial_abi.cpp

	[196 lines not shown]
	void testIgnoredLarge() {			void testIgnoredLarge() {
	testReturnLarge();			testReturnLarge();
	}			}

	// CHECK: define{{.*}} i64 @_Z20testReturnHasTrivialv()			// CHECK: define{{.*}} i64 @_Z20testReturnHasTrivialv()
	// CHECK: %[[RETVAL:.*]] = alloca %[[STRUCT_TRIVIAL:.*]], align 4			// CHECK: %[[RETVAL:.*]] = alloca %[[STRUCT_TRIVIAL:.*]], align 4
	// CHECK: %[[COERCE_DIVE:.*]] = getelementptr inbounds %[[STRUCT_TRIVIAL]], %[[STRUCT_TRIVIAL]]* %[[RETVAL]], i32 0, i32 0			// CHECK: %[[COERCE_DIVE:.*]] = getelementptr inbounds %[[STRUCT_TRIVIAL]], %[[STRUCT_TRIVIAL]]* %[[RETVAL]], i32 0, i32 0
	// CHECK: %[[V0:.*]] = load i32, i32* %[[COERCE_DIVE]], align 4			// CHECK: %[[V0:.*]] = load i32, i32* %[[COERCE_DIVE]], align 4
	// CHECK: %[[COERCE_VAL_II:.*]] = zext i32 %[[V0]] to i64			// CHECK: %[[COERCE_VAL_VEC:.*]] = insertelement <2 x i32> undef, i32 %[[V0]], i8 0
				// CHECK: %[[COERCE_VAL_II:.*]] = bitcast <2 x i32> %[[COERCE_VAL_VEC]] to i64
	// CHECK: ret i64 %[[COERCE_VAL_II]]			// CHECK: ret i64 %[[COERCE_VAL_II]]
	// CHECK: }			// CHECK: }

	Trivial testReturnHasTrivial() {			Trivial testReturnHasTrivial() {
	Trivial t;			Trivial t;
	return t;			return t;
	}			}

	[69 lines not shown]