This is an archive of the discontinued LLVM Phabricator instance.

Adjust coercion of aggregates on RenderScript
ClosedPublic

Authored by pirama on Jul 26 2016, 10:47 AM.

Details

Summary

In RenderScript, the size of the argument or return value emitted in the
IR is expected to be the same as the size of the corresponding qualified
type. For ARM and AArch64, the coercion performed by Clang can
change the parameter or return value to a type whose size is different
(usually larger) than the original aggregate type. Specifically, this
can happen in the following cases:

  • Aggregate parameters of size <= 64 bytes and return values smaller than 4 bytes on ARM
  • Aggregate parameters and return values smaller than 16 bytes on AArch64

This patch coerces the cases above to an integer array that is the same
size and alignment as the original aggregate. A new field is added to
TargetInfo to detect a RenderScript target and limit this coercion just
to that case.

Tests added to test/CodeGen/renderscript.c

Diff Detail

Repository
rL LLVM

Event Timeline

pirama updated this revision to Diff 65553.Jul 26 2016, 10:47 AM
pirama retitled this revision from to Adjust coercion of aggregates on RenderScript.
pirama updated this object.
pirama added a reviewer: rsmith.
pirama added subscribers: llvm-commits, srhines.
pirama updated this revision to Diff 65555.Jul 26 2016, 10:51 AM

Fix typo in test

Are you aware of how inefficient the resulting ABI is once it hits CodeGen? For example, using [3 x i8] will waste 3 full 64-bit registers for that struct. struct { char arr[16] }; barely avoids crashing the backend, it's so bad (LLVM demotes it to sret at the last moment).

I suppose the real question is why do RenderScript passes need the types to have the same size, and have you really considered all other options? This ABI mangling would be an absolute last resort if I was trying to add support.

test/CodeGen/renderscript.c
138–139 ↗(On Diff #65555)

Shouldn't these be sret? (And above).

Are you aware of how inefficient the resulting ABI is once it hits CodeGen? For example, using [3 x i8] will waste 3 full 64-bit registers for that struct. struct { char arr[16] }; barely avoids crashing the backend, it's so bad (LLVM demotes it to sret at the last moment).

I suppose the real question is why do RenderScript passes need the types to have the same size, and have you really considered all other options? This ABI mangling would be an absolute last resort if I was trying to add support.

The size requirement is imposed by the RenderScript runtime. To run a compute kernel, the RenderScript runtime iterates over its input and output buffers using a stride equal to the size of the underlying type and invokes the kernel function on entries in the buffer. Disagreement between the kernel and the runtime on the sizes of these types can lead to incorrect output.

An alternative we considered is for the runtime to duplicate this coercion logic, but that would be complex to implement in the runtime and harder to maintain.

test/CodeGen/renderscript.c
138–139 ↗(On Diff #65555)

Yes, these should be sret. I was lazy and did not check the whole signature. I'll upload an update shortly.

pirama updated this revision to Diff 65613.Jul 26 2016, 2:55 PM

Update 'CHECK's in test to check the entire function signature

To run a compute kernel, the RenderScript runtime iterates over its input and output buffers using a stride equal to the size of the underlying type and invokes the kernel function on entries in the buffer. Disagreement between the kernel and the runtime on the sizes of these types can lead to incorrect output.

What language is the runtime written in? I assume LLVM-aware C++ (so it inspects the llvm::Function's parameters before JITing some appropriate loop to do the iteration code)? I think it would be best to pass the stride information through a side channel rather than constraining the front-end to produce sub-optimal code (perhaps a global variable with a special name related to the kernel?).

The runtime dictating the ABI like that seems backwards to me. Ideally you'd start with some given set of functions, decide what the fastest way to run them is, and make sure the runtime is capable of supporting that.

Cheers.

Tim.

To run a compute kernel, the RenderScript runtime iterates over its input and output buffers using a stride equal to the size of the underlying type and invokes the kernel function on entries in the buffer. Disagreement between the kernel and the runtime on the sizes of these types can lead to incorrect output.

What language is the runtime written in? I assume LLVM-aware C++ (so it inspects the llvm::Function's parameters before JITing some appropriate loop to do the iteration code)? I think it would be best to pass the stride information through a side channel rather than constraining the front-end to produce sub-optimal code (perhaps a global variable with a special name related to the kernel?).

The runtime is written in a combination of LLVM-aware C++ and raw LLVM bitcode. The problem is that the source language is LLVM bitcode that was/is already lowered. Once Clang has decided to expand these types, the actual backend can't determine the original data size. If we could go back and side-channel the size information, we certainly would, but I don't think that is possible right now. I can talk to the rest of the RS team to see if they would mind passing the information differently going forward, but existing code/devices have to work this way.

The runtime dictating the ABI like that seems backwards to me. Ideally you'd start with some given set of functions, decide what the fastest way to run them is, and make sure the runtime is capable of supporting that.

Cheers.

Tim.

I can talk to the rest of the RS team to see if they would mind passing the information differently going forward,

It seems like it would improve RenderScript if they could. I'm not sure how common arguments like this are, but pessimizing them would be a shame.

but existing code/devices have to work this way.

If the RenderScript runtime changes, that happens without putting this patch into Clang, doesn't it?

If you've got IR already then Clang isn't involved, and trying to generate IR to run on an older runtime isn't going to work for many other reasons besides this (LLVM IR isn't compatible like that).

I can talk to the rest of the RS team to see if they would mind passing the information differently going forward,

It seems like it would improve RenderScript if they could. I'm not sure how common arguments like this are, but pessimizing them would be a shame.

but existing code/devices have to work this way.

If the RenderScript runtime changes, that happens without putting this patch into Clang, doesn't it?

Actually no. We could fix it for some things going forward, but the Clang that they use will always need some form of this patch for generating IR to run on older devices.

If you've got IR already then Clang isn't involved, and trying to generate IR to run on an older runtime isn't going to work for many other reasons besides this (LLVM IR isn't compatible like that).

Clang has been involved in generating RS IR for the past 6 years. This patch is merely upstreaming an existing Android-only modification to Clang. I believe this is the last such instance of an Android-specific patch. As for LLVM IR compatibility, RS has a bitcode translator that converts between the present-day in-memory IR representation and LLVM 3.2 bitcode (among other things). That part will remain outside of upstream LLVM because it is somewhat separable. This change, however, cannot be made in such a way that it can be layered on top of a pristine version of upstream Clang, hence our reason for upstreaming it.

Actually no. We could fix it for some things going forward, but the Clang that they use will always need some form of this patch for generating IR to run on older devices.

So do you have any kind of deprecation policy for older RS runtimes? It would be good to know that we won't have to live with this hack forever (and have some kind of comment in the code telling us when it can be nuked).

Actually no. We could fix it for some things going forward, but the Clang that they use will always need some form of this patch for generating IR to run on older devices.

So do you have any kind of deprecation policy for older RS runtimes? It would be good to know that we won't have to live with this hack forever (and have some kind of comment in the code telling us when it can be nuked).

There is no deprecation policy for old runtimes because we support everything back to the original release of RS at this point. Any change made here would apply only going forward (again assuming that the team agrees that this needs to go away, and that they aren't willing to put up with the performance issues of having it forever). Considering that all the code here is bracketed by RenderScript checks, it seems like it should be easy to remove any such section when an appropriate time comes. We can add more obvious comments that these checks are specifically for RS ABI issues (at least through Android N at this point).

t.p.northover accepted this revision.Jul 27 2016, 10:01 AM
t.p.northover added a reviewer: t.p.northover.

Oh well, I expect I've been responsible for worse hacks (and will be again in the future). Given the constraints, it seems unavoidable.

This revision is now accepted and ready to land.Jul 27 2016, 10:01 AM
pirama updated this revision to Diff 65785.Jul 27 2016, 12:07 PM
pirama edited edge metadata.

Clarify comment.

Tim, thanks for the review and acceptance. I've slightly updated the comment.

This revision was automatically updated to reflect the committed changes.