This is an archive of the discontinued LLVM Phabricator instance.

Add support to promote f16 to f32
ClosedPublic

Authored by pirama on Mar 31 2015, 4:35 PM.

Details

Reviewers

t.p.northover
ab
srhines

Commits

rGdb7c07e2bf91: Add support to promote f16 to f32
rL235215: Add support to promote f16 to f32

Summary

This patch adds legalization support to operate on FP16 as a load/store type
and do operations on it as floats.

Generic pass/fail tests have been added to
test/CodeGen/Generic/fp16-promote.ll.

Diff Detail

Repository: rL LLVM

Event Timeline

pirama updated this revision to Diff 23019. Mar 31 2015, 4:35 PM

pirama retitled this revision from to Add support to promote f16 to f32.

pirama updated this object.

pirama edited the test plan for this revision.

pirama added reviewers: srhines, t.p.northover.

This patch is rough around the edges and still needs to address the following:

- The right approach to expose this as a command-line option. Right now, the option is added to TargetOptions.h, but that might not be the right place.
- I do not have a reasonable way of finding out whether ISD::FP_TO_FP16 and ISD::FP16_TO_FP are supported by the target. If they are unsupported, we have to fall back to making a libcall to __gnu_h2f_ieee. Right now, this is a hard-coded branch in LegalizeFloatTypes.cpp:GetPromotedValue, but I need suggestions on how to handle this.
- Once the high-level issues are ironed out, I'll add target-specific tests to verify instructions for this promotion.
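
To make the "f16 as a load/store type, arithmetic as float" idea concrete, here is a minimal standalone sketch in plain C++. It is not the patch's code and not the libcall's implementation: the helper names (halfBitsToFloat, floatToHalfBits, halfAddInMemory) are made up for illustration, and round-to-nearest-even is assumed for the narrowing direction.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper (illustration only): decode IEEE 754 binary16 bits.
float halfBitsToFloat(std::uint16_t h) {
  const std::uint32_t sign = (h >> 15) & 0x1u;
  const int exp = (h >> 10) & 0x1F;
  const std::uint32_t mant = h & 0x3FFu;

  float magnitude;
  if (exp == 0) {
    // Zero or subnormal: mant * 2^-24.
    magnitude = std::ldexp(static_cast<float>(mant), -24);
  } else if (exp == 0x1F) {
    magnitude = mant ? std::numeric_limits<float>::quiet_NaN()
                     : std::numeric_limits<float>::infinity();
  } else {
    // Normal: (1024 + mant) * 2^(exp - 25).
    magnitude = std::ldexp(static_cast<float>(mant | 0x400u), exp - 25);
  }
  return sign ? -magnitude : magnitude;
}

// Hypothetical helper: encode a float as binary16 bits, assuming
// round-to-nearest-even.
std::uint16_t floatToHalfBits(float f) {
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  const std::uint32_t sign = (bits >> 16) & 0x8000u;
  const std::uint32_t mag = bits & 0x7FFFFFFFu;

  if (mag > 0x7F800000u)   // NaN maps to a quiet NaN.
    return static_cast<std::uint16_t>(sign | 0x7E00u);
  if (mag >= 0x47800000u)  // Magnitude >= 2^16, or infinity.
    return static_cast<std::uint16_t>(sign | 0x7C00u);

  const int e = static_cast<int>(mag >> 23) - 127;  // Unbiased exponent.
  if (e < -25)             // Too small even for a half subnormal.
    return static_cast<std::uint16_t>(sign);

  const std::uint32_t mant = (mag & 0x7FFFFFu) | 0x800000u;  // 24-bit significand.
  std::uint32_t half, rem, halfway;
  if (e >= -14) {
    // Normal half: keep the top 10 fraction bits, round on the low 13.
    half = (static_cast<std::uint32_t>(e + 15) << 10) | ((mant >> 13) & 0x3FFu);
    rem = mant & 0x1FFFu;
    halfway = 0x1000u;
  } else {
    // Subnormal half: express the value in units of 2^-24.
    const int shift = -e - 1;
    assert(shift >= 14 && shift <= 24);
    half = mant >> shift;
    rem = mant & ((1u << shift) - 1u);
    halfway = 1u << (shift - 1);
  }
  // A carry out of the fraction bumps the exponent (or overflows to
  // infinity), which is the desired result.
  if (rem > halfway || (rem == halfway && (half & 1u)))
    ++half;
  return static_cast<std::uint16_t>(sign | half);
}

// Model of the promoted form: operands live in memory as 16-bit patterns,
// and the arithmetic itself happens in float.
void halfAddInMemory(const std::uint16_t *a, const std::uint16_t *b,
                     std::uint16_t *out) {
  const float sum = halfBitsToFloat(*a) + halfBitsToFloat(*b);
  *out = floatToHalfBits(sum);
}
```

In this model, widening and then narrowing a non-NaN half value returns the original bit pattern, which is the property a later fold of a narrowing conversion applied directly to a widening conversion relies on.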

pirama added a subscriber: Unknown Object (MLST). Mar 31 2015, 4:43 PM

I forgot to subscribe llvm-commits. So, I'm re-sending the Summary and the raw diff to the mailing list.

Awesome, I wanted to do this next, thanks for working on this =)

I'll look into this more closely, but a couple of very high-level not-really-questions first:

- What about other types? I just removed the generic f32->f64 promotion a few days ago (in anticipation of such a PromoteFloat legalizer). I'm not sure we can test that in-tree, so there's probably not much point in worrying about this.
- Might I ask: what's your use case? In particular, do you intend to change clang in any way?

In the meantime, you might be interested in http://reviews.llvm.org/D8648 for tests, as I tried to cover as many IR/generic ISD opcodes as possible. The patch itself just focuses on ops promotion on AArch64 (where half is legal).

-Ahmed

ab added inline comments. Apr 1 2015, 3:56 PM

lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
1591 ↗	(On Diff #23019)	I don't think you need this, just always pick some opcode, and let the operation legalizer decide it doesn't like a type combination.
1596 ↗	(On Diff #23019)	'else' with '{'. Same elsewhere
1604 ↗	(On Diff #23019)	llvm_unreachable, or better still, report_fatal_error, as this is actually reachable from purpose-crafted IR? Same below.
1608 ↗	(On Diff #23019)	As I said on the ML, not sure if it'll work, but you should just let the ops legalizer pick a libcall for the FP_TO_FP nodes.
1626 ↗	(On Diff #23019)	I didn't notice this, and found the name a bit confusing, thinking this was some magic legalizer method. Might be good to have a more descriptive name, making it explicit we actually promote here, rather than get a previously promoted value as GetPromotedInt/Float does. The fact that it goes both ways is also a bit confusing, but I get that avoids duplication. For such a short function, it might be worth doing two separate versions, perhaps?
1700 ↗	(On Diff #23019)	I'm not sure what to think of this. Could we catch this in a DAGCombine instead?
1701 ↗	(On Diff #23019)	'Op->getOpcode()' instead, as you already do for getValueType. Same elsewhere
1704 ↗	(On Diff #23019)	Dead
1753 ↗	(On Diff #23019)	Should these optimizations be DAGCombines instead?
1804 ↗	(On Diff #23019)	What about it? :P
1809 ↗	(On Diff #23019)	This isn't a binop.
1982 ↗	(On Diff #23019)	I don't follow. This is about rounding to integers, and has nothing to do with FP, right? Why not simply GetPromotedFloat? Or, for that matter, just reusing PromoteFloatRes_UnaryOp.
lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
260 ↗	(On Diff #23019)	The line break seems odd, should there be more in the first line?
262 ↗	(On Diff #23019)	Whether the fallthrough is intended or not, I think an explicit 'break' or comment would be useful.
lib/CodeGen/SelectionDAG/LegalizeTypes.h
512 ↗	(On Diff #23019)	Move this to the .cpp?

Thanks for the code review ab. I'll make your suggestions, as well as remove the command line option.

Looks like the Mips backend isn't setting the right operation action for FP_TO_FP16. I'll fix that in a different patch.

pirama added inline comments. Apr 2 2015, 1:38 PM

lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
1604 ↗	(On Diff #23019)	I'll switch to report_fatal_error.
1608 ↗	(On Diff #23019)	http://reviews.llvm.org/D8804 should let the Op legalizer do the right thing for Mips.
1626 ↗	(On Diff #23019)	I'll remove this function and directly use DAG.getNode at its callers - since libcalls are no longer handled here.
1700 ↗	(On Diff #23019)	I'll attempt to do this in the DAGCombiner.
1982 ↗	(On Diff #23019)	I couldn't find any documentation or reference for this opcode, so I assumed it was similar to float rounding, while wondering what the difference between FROUND and FTRUNC is. I'll reuse PromoteFloatRes_UnaryOp for FTRUNC. The above function still applies to FROUND. I am trying to trim the extra precision, as this node would be an explicit operation to reduce precision.
lib/CodeGen/SelectionDAG/LegalizeTypes.h
512 ↗	(On Diff #23019)	I'll leave it here to be consistent with the other GetPromotedInteger and similar functions in this file.

ab added inline comments. Apr 2 2015, 2:10 PM

lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
1982 ↗	(On Diff #23019)	So, FP_ROUND is the opcode used to do fp->fp rounding (i.e., trim the extra precision). FTRUNC and FROUND are equivalent to the trunc and round libm functions, i.e., rounding to integer. IIRC the difference is that one rounds to infinity, and the other to zero. (-0.1 -> -1 vs 0).
lib/CodeGen/SelectionDAG/LegalizeTypes.h
512 ↗	(On Diff #23019)	Fair enough, I asked because I scrolled up a bit to see whether the others did the same. I saw GetExpandedFloat was in the .cpp, but I was too quick to assume they would be consistent. Either way I'm fine; a follow-up patch to put them consistently in one or the other would be nice.
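
For reference, a small standalone sketch of the libm behaviour that FTRUNC and FROUND correspond to. Per the C standard, trunc rounds toward zero and round rounds to the nearest integer, with halfway cases going away from zero. The struct and function names below are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: the two round-to-integer flavours side by side.
struct IntegerRoundings {
  double towardZero;  // std::trunc, the libm counterpart of FTRUNC
  double nearest;     // std::round, the libm counterpart of FROUND
};

IntegerRoundings roundToInteger(double x) {
  return {std::trunc(x), std::round(x)};
}
```

Neither operation changes the floating-point type, which is why both can be promoted like any other unary operation; FP_ROUND is the node that narrows the type.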

Removed the command-line option; PromoteFloat is instead enabled on targets where f32 is legal.
Updated LegalizeFloatTypes.cpp to address ab's comments.
Added a comprehensive test of fp16 promotion to test/CodeGen/ARM/fp16-promote.ll.

In the ARM-32 ABI, f16 arguments and return values get promoted to f32. To keep the calling convention out of the picture, the ARM codegen tests take half* arguments, loading inputs and storing results through them.

ab mentioned this in D8648: [AArch64] Promote all/most f16 ops to f32. Apr 9 2015, 4:59 PM

Looking better, thanks!

lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
1817 ↗	(On Diff #23435)	This needs tests (also for insert_vector_elt ?)
1932–1933 ↗	(On Diff #23435)	Stale reference to FROUND?
test/CodeGen/ARM/fp16-promote.ll
1 ↗	(On Diff #23435)	We should also have tests for non-native conversions with the libcall.
test/CodeGen/Generic/fp16-promote.ll
1 ↗	(On Diff #23435)	Some of the tests have half parameters and return types. I'm not sure that's expected to work on all targets, is it? If it's not, I still see value in a generic sanity check like this, so dealing with pointers instead is probably fine.

ab added inline comments. Apr 13 2015, 1:48 PM

lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
1817 ↗	(On Diff #23435)	Which makes me think: I haven't tested structs in D8648, but they have a few interesting operations that we should also deal with, mostly insert/extract element. By-value returns/arguments have the same problems as scalar returns/arguments; making sure the call lowering handles them fine is still useful (as you do for scalar f16).
test/CodeGen/ARM/fp16-promote.ll
127–128 ↗	(On Diff #23435)	This could still use some CHECK-lines, if only to demonstrate they are indeed handled as f32.

pirama added inline comments. Apr 14 2015, 10:34 AM

test/CodeGen/Generic/fp16-promote.ll
1 ↗	(On Diff #23435)	I agree that the tests here should take pointers. I actually uploaded this as a placeholder for my initial patch. If it's ok, I'll delete the Generic tests in my next upload, and add them later. I need to commit the Mips patch (http://reviews.llvm.org/D8804) and upload a similar one for X86 before the tests will pass.

Removed test/Generic to be added at a later time. Some backends need patches before they can pass these tests.
Added tests for libcall conversions
Added insert_vector_elt and extract_vector_elt tests
Added struct tests (pass-by-value, return-by-value, extractvalue, insertvalue)
Added CHECKs for half args and returns

I don't see anything missing; LGTM with a couple nits.

Thanks!

lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
1595 ↗	(On Diff #23747)	report_fatal_error?
test/CodeGen/ARM/fp16-promote.ll
1164 ↗	(On Diff #23747)	tored -> stored

This revision is now accepted and ready to land. Apr 17 2015, 10:18 AM

ab added inline comments. Apr 17 2015, 10:19 AM

lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
1701 ↗	(On Diff #23747)	dl -> DL ? (here and elsewhere)

Thanks for the review and LGTM, ab.

Minor changes based on prior comments
There are a few instances of 'dl' in the softening and expansion routines in the same file. That can be cleaned up as a separate patch.

Minor fixups. Remove a todo that had no functional impact.

srhines accepted this revision. Apr 17 2015, 11:35 AM

srhines edited edge metadata.

Closed by commit rL235215: Add support to promote f16 to f32 (authored by pirama). Apr 17 2015, 11:39 AM

This revision was automatically updated to reflect the committed changes.

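Revision Contents

Path                                                            Size
llvm/trunk/include/llvm/Target/TargetLowering.h                 3 lines
llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp             17 lines
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp      417 lines
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp    18 lines
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h             43 lines
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp           19 lines
llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp                   17 lines
llvm/trunk/test/CodeGen/ARM/fp16-promote.ll                     1287 lines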

Diff 23953

llvm/trunk/include/llvm/Target/TargetLowering.h

[...] public:
   enum LegalizeTypeAction {
     TypeLegal,           // The target natively supports this type.
     TypePromoteInteger,  // Replace this integer with a larger one.
     TypeExpandInteger,   // Split this integer into two of half the size.
     TypeSoftenFloat,     // Convert this float to a same size integer type.
     TypeExpandFloat,     // Split this float into two of half the size.
     TypeScalarizeVector, // Replace this one-element vector with its element.
     TypeSplitVector,     // Split this vector into two of half the size.
-    TypeWidenVector      // This vector should be widened into a larger vector.
+    TypeWidenVector,     // This vector should be widened into a larger vector.
+    TypePromoteFloat     // Replace this float with a larger one.
   };

   /// LegalizeKind holds the legalization kind that needs to happen to EVT
   /// in order to type-legalize it.
   typedef std::pair<LegalizeTypeAction, EVT> LegalizeKind;

   /// Enum that describes how the target represents true/false values.
   enum BooleanContent {
[...]

llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

[...] private:
     SDValue visitBUILD_VECTOR(SDNode *N);
     SDValue visitCONCAT_VECTORS(SDNode *N);
     SDValue visitEXTRACT_SUBVECTOR(SDNode *N);
     SDValue visitVECTOR_SHUFFLE(SDNode *N);
     SDValue visitSCALAR_TO_VECTOR(SDNode *N);
     SDValue visitINSERT_SUBVECTOR(SDNode *N);
     SDValue visitMLOAD(SDNode *N);
     SDValue visitMSTORE(SDNode *N);
+    SDValue visitFP_TO_FP16(SDNode *N);

     SDValue XformToShuffleWithZero(SDNode *N);
     SDValue ReassociateOps(unsigned Opc, SDLoc DL, SDValue LHS, SDValue RHS);

     SDValue visitShiftByConstant(SDNode *N, ConstantSDNode *Amt);

     bool SimplifySelectOps(SDNode *SELECT, SDValue LHS, SDValue RHS);
     SDValue SimplifyBinOpWithSameOpcodeHands(SDNode *N);
[...] SDValue DAGCombiner::visit(SDNode *N) {
   case ISD::BUILD_VECTOR:       return visitBUILD_VECTOR(N);
   case ISD::CONCAT_VECTORS:     return visitCONCAT_VECTORS(N);
   case ISD::EXTRACT_SUBVECTOR:  return visitEXTRACT_SUBVECTOR(N);
   case ISD::VECTOR_SHUFFLE:     return visitVECTOR_SHUFFLE(N);
   case ISD::SCALAR_TO_VECTOR:   return visitSCALAR_TO_VECTOR(N);
   case ISD::INSERT_SUBVECTOR:   return visitINSERT_SUBVECTOR(N);
   case ISD::MLOAD:              return visitMLOAD(N);
   case ISD::MSTORE:             return visitMSTORE(N);
+  case ISD::FP_TO_FP16:         return visitFP_TO_FP16(N);
   }
   return SDValue();
 }

 SDValue DAGCombiner::combine(SDNode *N) {
   SDValue RV = visit(N);

   // If nothing happened, try a target-specific DAG combine.
[...] SDValue DAGCombiner::visitFP_EXTEND(SDNode *N) {
   if (N->hasOneUse() &&
       N->use_begin()->getOpcode() == ISD::FP_ROUND)
     return SDValue();

   // fold (fp_extend c1fp) -> c1fp
   if (isConstantFPBuildVectorOrConstantFP(N0))
     return DAG.getNode(ISD::FP_EXTEND, SDLoc(N), VT, N0);

+  // fold (fp_extend (fp16_to_fp op)) -> (fp16_to_fp op)
+  if (N0.getOpcode() == ISD::FP16_TO_FP &&
+      TLI.getOperationAction(ISD::FP16_TO_FP, VT) == TargetLowering::Legal)
+    return DAG.getNode(ISD::FP16_TO_FP, SDLoc(N), VT, N0.getOperand(0));
+
   // Turn fp_extend(fp_round(X, 1)) -> x since the fp_round doesn't affect the
   // value of X.
   if (N0.getOpcode() == ISD::FP_ROUND
       && N0.getNode()->getConstantOperandVal(1) == 1) {
     SDValue In = N0.getOperand(0);
     if (In.getValueType() == VT) return In;
     if (VT.bitsLT(In.getValueType()))
       return DAG.getNode(ISD::FP_ROUND, SDLoc(N), VT,
[...] if (N0.getOpcode() == ISD::CONCAT_VECTORS &&
     if (InsIdx == VT.getVectorNumElements()/2)
       return DAG.getNode(ISD::CONCAT_VECTORS, SDLoc(N), VT,
                          N0.getOperand(0), N->getOperand(1));
   }

   return SDValue();
 }

+SDValue DAGCombiner::visitFP_TO_FP16(SDNode *N) {
+  SDValue N0 = N->getOperand(0);
+
+  // fold (fp_to_fp16 (fp16_to_fp op)) -> op
+  if (N0->getOpcode() == ISD::FP16_TO_FP)
+    return N0->getOperand(0);
+
+  return SDValue();
+}
+
 /// Returns a vector_shuffle if it able to transform an AND to a vector_shuffle
 /// with the destination vector and a zero vector.
 /// e.g. AND V, <0xffffffff, 0, 0xffffffff, 0>. ==>
 ///      vector_shuffle V, Zero, <0, 4, 2, 4>
 SDValue DAGCombiner::XformToShuffleWithZero(SDNode *N) {
   EVT VT = N->getValueType(0);
   SDValue LHS = N->getOperand(0);
   SDValue RHS = N->getOperand(1);
[...]

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp

[...] SDValue DAGTypeLegalizer::ExpandFloatOp_STORE(SDNode *N, unsigned OpNo) {
   (void)NVT;

   SDValue Lo, Hi;
   GetExpandedOp(ST->getValue(), Lo, Hi);

   return DAG.getTruncStore(Chain, SDLoc(N), Hi, Ptr,
                            ST->getMemoryVT(), ST->getMemOperand());
 }

		//===----------------------------------------------------------------------===//
		// Float Operand Promotion
		//===----------------------------------------------------------------------===//
		//

		static ISD::NodeType GetPromotionOpcode(EVT OpVT, EVT RetVT) {
		if (OpVT == MVT::f16) {
		return ISD::FP16_TO_FP;
		} else if (RetVT == MVT::f16) {
		return ISD::FP_TO_FP16;
		}

		report_fatal_error("Attempt at an invalid promotion-related conversion");
		}

		bool DAGTypeLegalizer::PromoteFloatOperand(SDNode *N, unsigned OpNo) {
		SDValue R = SDValue();

		// Nodes that use a promotion-requiring floating point operand, but doesn't
		// produce a promotion-requiring floating point result, need to be legalized
		// to use the promoted float operand. Nodes that produce at least one
		// promotion-requiring floating point result have their operands legalized as
		// a part of PromoteFloatResult.
		switch (N->getOpcode()) {
		default:
		llvm_unreachable("Do not know how to promote this operator's operand!");

		case ISD::BITCAST: R = PromoteFloatOp_BITCAST(N, OpNo); break;
		case ISD::FCOPYSIGN: R = PromoteFloatOp_FCOPYSIGN(N, OpNo); break;
		case ISD::FP_TO_SINT:
		case ISD::FP_TO_UINT: R = PromoteFloatOp_FP_TO_XINT(N, OpNo); break;
		case ISD::FP_EXTEND: R = PromoteFloatOp_FP_EXTEND(N, OpNo); break;
		case ISD::SELECT_CC: R = PromoteFloatOp_SELECT_CC(N, OpNo); break;
		case ISD::SETCC: R = PromoteFloatOp_SETCC(N, OpNo); break;
		case ISD::STORE: R = PromoteFloatOp_STORE(N, OpNo); break;
		}

		if (R.getNode())
		ReplaceValueWith(SDValue(N, 0), R);
		return false;
		}

		SDValue DAGTypeLegalizer::PromoteFloatOp_BITCAST(SDNode *N, unsigned OpNo) {
		SDValue Op = N->getOperand(0);
		EVT OpVT = Op->getValueType(0);

		EVT VT = N->getValueType(0);
		EVT IVT = EVT::getIntegerVT(*DAG.getContext(), OpVT.getSizeInBits());
		assert (IVT == VT && "Bitcast to type of different size");

		SDValue Promoted = GetPromotedFloat(N->getOperand(0));
		EVT PromotedVT = Promoted->getValueType(0);

		// Convert the promoted float value to the desired IVT.
		return DAG.getNode(GetPromotionOpcode(PromotedVT, OpVT), SDLoc(N), IVT,
		Promoted);
		}

		// Promote Operand 1 of FCOPYSIGN. Operand 0 ought to be handled by
		// PromoteFloatRes_FCOPYSIGN.
		SDValue DAGTypeLegalizer::PromoteFloatOp_FCOPYSIGN(SDNode *N, unsigned OpNo) {
		assert (OpNo == 1 && "Only Operand 1 must need promotion here");
		SDValue Op1 = GetPromotedFloat(N->getOperand(1));

		return DAG.getNode(N->getOpcode(), SDLoc(N), N->getValueType(0),
		N->getOperand(0), Op1);
		}

		// Convert the promoted float value to the desired integer type
		SDValue DAGTypeLegalizer::PromoteFloatOp_FP_TO_XINT(SDNode *N, unsigned OpNo) {
		SDValue Op = GetPromotedFloat(N->getOperand(0));
		return DAG.getNode(N->getOpcode(), SDLoc(N), N->getValueType(0), Op);
		}

		SDValue DAGTypeLegalizer::PromoteFloatOp_FP_EXTEND(SDNode *N, unsigned OpNo) {
		SDValue Op = GetPromotedFloat(N->getOperand(0));
		EVT VT = N->getValueType(0);

		// Desired VT is same as promoted type. Use promoted float directly.
		if (VT == Op->getValueType(0))
		return Op;

		// Else, extend the promoted float value to the desired VT.
		return DAG.getNode(ISD::FP_EXTEND, SDLoc(N), VT, Op);
		}

		// Promote the float operands used for comparison. The true- and false-
		// operands have the same type as the result and are promoted, if needed, by
		// PromoteFloatRes_SELECT_CC
		SDValue DAGTypeLegalizer::PromoteFloatOp_SELECT_CC(SDNode *N, unsigned OpNo) {
		SDValue LHS = GetPromotedFloat(N->getOperand(0));
		SDValue RHS = GetPromotedFloat(N->getOperand(1));

		return DAG.getNode(ISD::SELECT_CC, SDLoc(N), N->getValueType(0),
		LHS, RHS, N->getOperand(2), N->getOperand(3),
		N->getOperand(4));
		}

		// Construct a SETCC that compares the promoted values and sets the conditional
		// code.
		SDValue DAGTypeLegalizer::PromoteFloatOp_SETCC(SDNode *N, unsigned OpNo) {
		EVT VT = N->getValueType(0);
		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), VT);
		SDValue Op0 = GetPromotedFloat(N->getOperand(0));
		SDValue Op1 = GetPromotedFloat(N->getOperand(1));
		ISD::CondCode CCCode = cast<CondCodeSDNode>(N->getOperand(2))->get();

		return DAG.getSetCC(SDLoc(N), NVT, Op0, Op1, CCCode);

		}

		// Lower the promoted Float down to the integer value of same size and construct
		// a STORE of the integer value.
		SDValue DAGTypeLegalizer::PromoteFloatOp_STORE(SDNode *N, unsigned OpNo) {
		StoreSDNode *ST = cast<StoreSDNode>(N);
		SDValue Val = ST->getValue();
		SDLoc DL(N);

		SDValue Promoted = GetPromotedFloat(Val);
		EVT VT = ST->getOperand(1)->getValueType(0);
		EVT IVT = EVT::getIntegerVT(*DAG.getContext(), VT.getSizeInBits());

		SDValue NewVal;
		NewVal = DAG.getNode(GetPromotionOpcode(Promoted.getValueType(), VT), DL,
		IVT, Promoted);

		return DAG.getStore(ST->getChain(), DL, NewVal, ST->getBasePtr(),
		ST->getMemOperand());
		}

		//===----------------------------------------------------------------------===//
		// Float Result Promotion
		//===----------------------------------------------------------------------===//

		void DAGTypeLegalizer::PromoteFloatResult(SDNode *N, unsigned ResNo) {
		SDValue R = SDValue();

		switch (N->getOpcode()) {
		// These opcodes cannot appear if promotion of FP16 is done in the backend
		// instead of Clang
		case ISD::FP16_TO_FP:
		case ISD::FP_TO_FP16:
		default:
		llvm_unreachable("Do not know how to promote this operator's result!");

		case ISD::BITCAST: R = PromoteFloatRes_BITCAST(N); break;
		case ISD::ConstantFP: R = PromoteFloatRes_ConstantFP(N); break;
		case ISD::EXTRACT_VECTOR_ELT:
		R = PromoteFloatRes_EXTRACT_VECTOR_ELT(N); break;
		case ISD::FCOPYSIGN: R = PromoteFloatRes_FCOPYSIGN(N); break;

		// Unary FP Operations
		case ISD::FABS:
		case ISD::FCEIL:
		case ISD::FCOS:
		case ISD::FEXP:
		case ISD::FEXP2:
		case ISD::FFLOOR:
		case ISD::FLOG:
		case ISD::FLOG2:
		case ISD::FLOG10:
		case ISD::FNEARBYINT:
		case ISD::FNEG:
		case ISD::FRINT:
		case ISD::FROUND:
		case ISD::FSIN:
		case ISD::FSQRT:
		case ISD::FTRUNC: R = PromoteFloatRes_UnaryOp(N); break;

		// Binary FP Operations
		case ISD::FADD:
		case ISD::FDIV:
		case ISD::FMAXNUM:
		case ISD::FMINNUM:
		case ISD::FMUL:
		case ISD::FPOW:
		case ISD::FREM:
		case ISD::FSUB: R = PromoteFloatRes_BinOp(N); break;

		case ISD::FMA: // FMA is same as FMAD
		case ISD::FMAD: R = PromoteFloatRes_FMAD(N); break;

		case ISD::FPOWI: R = PromoteFloatRes_FPOWI(N); break;

		case ISD::FP_ROUND: R = PromoteFloatRes_FP_ROUND(N); break;
		case ISD::LOAD: R = PromoteFloatRes_LOAD(N); break;
		case ISD::SELECT: R = PromoteFloatRes_SELECT(N); break;
		case ISD::SELECT_CC: R = PromoteFloatRes_SELECT_CC(N); break;

		case ISD::SINT_TO_FP:
		case ISD::UINT_TO_FP: R = PromoteFloatRes_XINT_TO_FP(N); break;
		case ISD::UNDEF: R = PromoteFloatRes_UNDEF(N); break;

		}

		if (R.getNode())
		SetPromotedFloat(SDValue(N, ResNo), R);
		}

		// Bitcast from i16 to f16: convert the i16 to a f32 value instead.
		// At this point, it is not possible to determine if the bitcast value is
		// eventually stored to memory or promoted to f32 or promoted to a floating
		// point at a higher precision. Some of these cases are handled by FP_EXTEND,
		// STORE promotion handlers.
		SDValue DAGTypeLegalizer::PromoteFloatRes_BITCAST(SDNode *N) {
		EVT VT = N->getValueType(0);
		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), VT);
		return DAG.getNode(GetPromotionOpcode(VT, NVT), SDLoc(N), NVT,
		N->getOperand(0));
		}

		SDValue DAGTypeLegalizer::PromoteFloatRes_ConstantFP(SDNode *N) {
		ConstantFPSDNode *CFPNode = cast<ConstantFPSDNode>(N);
		EVT VT = N->getValueType(0);

		// Get the (bit-cast) APInt of the APFloat and build an integer constant
		EVT IVT = EVT::getIntegerVT(*DAG.getContext(), VT.getSizeInBits());
		SDValue C = DAG.getConstant(CFPNode->getValueAPF().bitcastToAPInt(),
		IVT);

		// Convert the Constant to the desired FP type
		// FIXME We might be able to do the conversion during compilation and get rid
		// of it from the object code
		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), VT);
		return DAG.getNode(GetPromotionOpcode(VT, NVT), SDLoc(N), NVT, C);
		}

		// If the Index operand is a constant, try to redirect the extract operation to
		// the correct legalized vector. If not, bit-convert the input vector to
		// equivalent integer vector. Extract the element as an (bit-cast) integer
		// value and convert it to the promoted type.
		SDValue DAGTypeLegalizer::PromoteFloatRes_EXTRACT_VECTOR_ELT(SDNode *N) {
		SDLoc DL(N);

		// If the index is constant, try to extract the value from the legalized
		// vector type.
		if (isa<ConstantSDNode>(N->getOperand(1))) {
		SDValue Vec = N->getOperand(0);
		SDValue Idx = N->getOperand(1);
		EVT VecVT = Vec->getValueType(0);
		EVT EltVT = VecVT.getVectorElementType();

		uint64_t IdxVal = cast<ConstantSDNode>(Idx)->getZExtValue();

		switch (getTypeAction(VecVT)) {
		default: break;
		case TargetLowering::TypeScalarizeVector: {
		SDValue Res = GetScalarizedVector(N->getOperand(0));
		ReplaceValueWith(SDValue(N, 0), Res);
		return SDValue();
		}
		case TargetLowering::TypeWidenVector: {
		Vec = GetWidenedVector(Vec);
		SDValue Res = DAG.getNode(N->getOpcode(), DL, EltVT, Vec, Idx);
		ReplaceValueWith(SDValue(N, 0), Res);
		return SDValue();
		}
		case TargetLowering::TypeSplitVector: {
		SDValue Lo, Hi;
		GetSplitVector(Vec, Lo, Hi);

		uint64_t LoElts = Lo.getValueType().getVectorNumElements();
		SDValue Res;
		if (IdxVal < LoElts)
		Res = DAG.getNode(N->getOpcode(), DL, EltVT, Lo, Idx);
		else
		Res = DAG.getNode(N->getOpcode(), DL, EltVT, Hi,
		DAG.getConstant(IdxVal - LoElts,
		Idx.getValueType()));
		ReplaceValueWith(SDValue(N, 0), Res);
		return SDValue();
		}

		}
		}

		// Bit-convert the input vector to the equivalent integer vector
		SDValue NewOp = BitConvertVectorToIntegerVector(N->getOperand(0));
		EVT IVT = NewOp.getValueType().getVectorElementType();

		// Extract the element as an (bit-cast) integer value
		SDValue NewVal = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, DL, IVT,
		NewOp, N->getOperand(1));

		// Convert the element to the desired FP type
		EVT VT = N->getValueType(0);
		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), VT);
		return DAG.getNode(GetPromotionOpcode(VT, NVT), SDLoc(N), NVT, NewVal);
		}

		// FCOPYSIGN(X, Y) returns the value of X with the sign of Y. If the result
		// needs promotion, so does the argument X. Note that Y, if needed, will be
		// handled during operand promotion.
		SDValue DAGTypeLegalizer::PromoteFloatRes_FCOPYSIGN(SDNode *N) {
		EVT VT = N->getValueType(0);
		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), VT);
		SDValue Op0 = GetPromotedFloat(N->getOperand(0));

		SDValue Op1 = N->getOperand(1);

		return DAG.getNode(N->getOpcode(), SDLoc(N), NVT, Op0, Op1);
		}

		// Unary operation where the result and the operand have PromoteFloat type
		// action. Construct a new SDNode with the promoted float value of the old
		// operand.
		SDValue DAGTypeLegalizer::PromoteFloatRes_UnaryOp(SDNode *N) {
		EVT VT = N->getValueType(0);
		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), VT);
		SDValue Op = GetPromotedFloat(N->getOperand(0));

		return DAG.getNode(N->getOpcode(), SDLoc(N), NVT, Op);
		}

		// Binary operations where the result and both operands have PromoteFloat type
		// action. Construct a new SDNode with the promoted float values of the old
		// operands.
		SDValue DAGTypeLegalizer::PromoteFloatRes_BinOp(SDNode *N) {
		EVT VT = N->getValueType(0);
		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), VT);
		SDValue Op0 = GetPromotedFloat(N->getOperand(0));
		SDValue Op1 = GetPromotedFloat(N->getOperand(1));

		return DAG.getNode(N->getOpcode(), SDLoc(N), NVT, Op0, Op1);
		}

		SDValue DAGTypeLegalizer::PromoteFloatRes_FMAD(SDNode *N) {
		EVT VT = N->getValueType(0);
		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), VT);
		SDValue Op0 = GetPromotedFloat(N->getOperand(0));
		SDValue Op1 = GetPromotedFloat(N->getOperand(1));
		SDValue Op2 = GetPromotedFloat(N->getOperand(2));

		return DAG.getNode(N->getOpcode(), SDLoc(N), NVT, Op0, Op1, Op2);
		}

		// Promote the Float (first) operand and retain the Integer (second) operand
		SDValue DAGTypeLegalizer::PromoteFloatRes_FPOWI(SDNode *N) {
		EVT VT = N->getValueType(0);
		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), VT);
		SDValue Op0 = GetPromotedFloat(N->getOperand(0));
		SDValue Op1 = N->getOperand(1);

		return DAG.getNode(N->getOpcode(), SDLoc(N), NVT, Op0, Op1);
		}

		// Explicit operation to reduce precision. Reduce the value to half precision
		// and promote it back to the legal type.
		SDValue DAGTypeLegalizer::PromoteFloatRes_FP_ROUND(SDNode *N) {
		SDLoc DL(N);

		SDValue Op = N->getOperand(0);
		EVT VT = N->getValueType(0);
		EVT OpVT = Op->getValueType(0);
		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
		EVT IVT = EVT::getIntegerVT(*DAG.getContext(), VT.getSizeInBits());

		// Round promoted float to desired precision
		SDValue Round = DAG.getNode(GetPromotionOpcode(OpVT, VT), DL, IVT, Op);
		// Promote it back to the legal output type
		return DAG.getNode(GetPromotionOpcode(VT, NVT), DL, NVT, Round);
		}

		SDValue DAGTypeLegalizer::PromoteFloatRes_LOAD(SDNode *N) {
		LoadSDNode *L = cast<LoadSDNode>(N);
		EVT VT = N->getValueType(0);

		// Load the value as an integer value with the same number of bits
		EVT IVT = EVT::getIntegerVT(*DAG.getContext(), VT.getSizeInBits());
		SDValue newL = DAG.getLoad(L->getAddressingMode(), L->getExtensionType(),
		IVT, SDLoc(N), L->getChain(), L->getBasePtr(),
		L->getOffset(), L->getPointerInfo(), IVT, L->isVolatile(),
		L->isNonTemporal(), false, L->getAlignment(),
		L->getAAInfo());
		// Legalize the chain result by replacing uses of the old value chain with the
		// new one
		ReplaceValueWith(SDValue(N, 1), newL.getValue(1));

		// Convert the integer value to the desired FP type
		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), VT);
		return DAG.getNode(GetPromotionOpcode(VT, NVT), SDLoc(N), NVT, newL);
		}

		// Construct a new SELECT node with the promoted true- and false- values.
		SDValue DAGTypeLegalizer::PromoteFloatRes_SELECT(SDNode *N) {
		SDValue TrueVal = GetPromotedFloat(N->getOperand(1));
		SDValue FalseVal = GetPromotedFloat(N->getOperand(2));

		return DAG.getNode(ISD::SELECT, SDLoc(N), TrueVal->getValueType(0),
		N->getOperand(0), TrueVal, FalseVal);
		}

		// Construct a new SELECT_CC node with the promoted true- and false- values.
		// The operands used for comparison are promoted by PromoteFloatOp_SELECT_CC.
		SDValue DAGTypeLegalizer::PromoteFloatRes_SELECT_CC(SDNode *N) {
		SDValue TrueVal = GetPromotedFloat(N->getOperand(2));
		SDValue FalseVal = GetPromotedFloat(N->getOperand(3));

		return DAG.getNode(ISD::SELECT_CC, SDLoc(N), N->getValueType(0),
		N->getOperand(0), N->getOperand(1), TrueVal, FalseVal,
		N->getOperand(4));
		}

		// Construct a SDNode that transforms the SINT or UINT operand to the promoted
		// float type.
		SDValue DAGTypeLegalizer::PromoteFloatRes_XINT_TO_FP(SDNode *N) {
		EVT VT = N->getValueType(0);
		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), VT);
		return DAG.getNode(N->getOpcode(), SDLoc(N), NVT, N->getOperand(0));
		}

		SDValue DAGTypeLegalizer::PromoteFloatRes_UNDEF(SDNode *N) {
		return DAG.getUNDEF(TLI.getTypeToTransformTo(*DAG.getContext(),
		N->getValueType(0)));
		}
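
The load path above reads the half as a 16-bit integer and only then converts it to the promoted float type. A minimal standalone sketch of that value-level behaviour follows. It is not LLVM code: `halfBitsToFloat` and `loadPromotedHalf` are hypothetical names, the conversion assumes IEEE 754 binary16 encoding, and NaN payloads are not preserved.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper (not an LLVM API): widen IEEE 754 binary16 bits to float.
float halfBitsToFloat(std::uint16_t Bits) {
  const bool Negative = (Bits & 0x8000u) != 0;
  const int Exponent = (Bits >> 10) & 0x1F; // 5-bit biased exponent
  const int Mantissa = Bits & 0x3FF;        // 10-bit fraction

  float Magnitude;
  if (Exponent == 0x1F) {
    // All-ones exponent encodes infinity (zero fraction) or NaN.
    Magnitude = Mantissa == 0 ? std::numeric_limits<float>::infinity()
                              : std::numeric_limits<float>::quiet_NaN();
  } else if (Exponent == 0) {
    // Zero or subnormal: value = fraction * 2^-24.
    Magnitude = std::ldexp(static_cast<float>(Mantissa), -24);
  } else {
    // Normal: value = (1024 + fraction) * 2^(exponent - 25).
    Magnitude = std::ldexp(static_cast<float>(Mantissa + 1024), Exponent - 25);
  }
  return Negative ? -Magnitude : Magnitude;
}

// Mirrors the shape of PromoteFloatRes_LOAD: integer load first, then the
// integer-to-float promotion step.
float loadPromotedHalf(const std::uint16_t *Ptr) {
  const std::uint16_t Raw = *Ptr; // load as an integer of the same width
  return halfBitsToFloat(Raw);    // convert to the promoted type
}
```

Every binary16 value is exactly representable as a float, so this widening step never rounds. Rounding only happens on the way back down to half precision.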

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp

[245 lines not shown] context: SDValue DAGTypeLegalizer::PromoteIntRes_BITCAST(SDNode *N) {
case TargetLowering::TypePromoteInteger:
if (NOutVT.bitsEq(NInVT) && !NOutVT.isVector() && !NInVT.isVector())
// The input promotes to the same size. Convert the promoted value.
return DAG.getNode(ISD::BITCAST, dl, NOutVT, GetPromotedInteger(InOp));
break;
case TargetLowering::TypeSoftenFloat:
// Promote the integer operand by hand.
return DAG.getNode(ISD::ANY_EXTEND, dl, NOutVT, GetSoftenedFloat(InOp));
		case TargetLowering::TypePromoteFloat: {
		// Convert the promoted float by hand.
		if (NOutVT.bitsEq(NInVT)) {
		SDValue PromotedOp = GetPromotedFloat(InOp);
		SDValue Trunc = DAG.getNode(ISD::FP_TO_FP16, dl, NOutVT, PromotedOp);
		return DAG.getNode(ISD::AssertZext, dl, NOutVT, Trunc,
		DAG.getValueType(OutVT));
		}
		break;
		}
case TargetLowering::TypeExpandInteger:
case TargetLowering::TypeExpandFloat:
break;
case TargetLowering::TypeScalarizeVector:
// Convert the element to an integer and promote it by hand.
if (!NOutVT.isVector())
return DAG.getNode(ISD::ANY_EXTEND, dl, NOutVT,
BitConvertToInteger(GetScalarizedVector(InOp)));
[1,578 lines not shown] context: Lo = DAG.getSelect(dl, NVT, LoNotZero, LoLZ,
DAG.getConstant(NVT.getSizeInBits(), NVT)));
Hi = DAG.getConstant(0, NVT);
}

void DAGTypeLegalizer::ExpandIntRes_FP_TO_SINT(SDNode *N, SDValue &Lo,
SDValue &Hi) {
SDLoc dl(N);
EVT VT = N->getValueType(0);

SDValue Op = N->getOperand(0);
		if (getTypeAction(Op.getValueType()) == TargetLowering::TypePromoteFloat)
		Op = GetPromotedFloat(Op);

RTLIB::Libcall LC = RTLIB::getFPTOSINT(Op.getValueType(), VT);
assert(LC != RTLIB::UNKNOWN_LIBCALL && "Unexpected fp-to-sint conversion!");
SplitInteger(TLI.makeLibCall(DAG, LC, VT, &Op, 1, true/*irrelevant*/,
dl).first,
Lo, Hi);
}

void DAGTypeLegalizer::ExpandIntRes_FP_TO_UINT(SDNode *N, SDValue &Lo,
SDValue &Hi) {
SDLoc dl(N);
EVT VT = N->getValueType(0);

SDValue Op = N->getOperand(0);
		if (getTypeAction(Op.getValueType()) == TargetLowering::TypePromoteFloat)
		Op = GetPromotedFloat(Op);

RTLIB::Libcall LC = RTLIB::getFPTOUINT(Op.getValueType(), VT);
assert(LC != RTLIB::UNKNOWN_LIBCALL && "Unexpected fp-to-uint conversion!");
SplitInteger(TLI.makeLibCall(DAG, LC, VT, &Op, 1, false/*irrelevant*/,
dl).first,
Lo, Hi);
}

void DAGTypeLegalizer::ExpandIntRes_LOAD(LoadSDNode *N,
[1,214 lines not shown]

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h

[87 lines not shown] context: private:
/// ExpandedIntegers - For integer nodes that need to be expanded this map
/// indicates which operands are the expanded version of the input.
SmallDenseMap<SDValue, std::pair<SDValue, SDValue>, 8> ExpandedIntegers;

/// SoftenedFloats - For floating point nodes converted to integers of
/// the same size, this map indicates the converted value to use.
SmallDenseMap<SDValue, SDValue, 8> SoftenedFloats;

		/// PromotedFloats - For floating point nodes that have a smaller precision
		/// than the smallest supported precision, this map indicates what promoted
		/// value to use.
		SmallDenseMap<SDValue, SDValue, 8> PromotedFloats;

/// ExpandedFloats - For float nodes that need to be expanded this map
/// indicates which operands are the expanded version of the input.
SmallDenseMap<SDValue, std::pair<SDValue, SDValue>, 8> ExpandedFloats;

/// ScalarizedVectors - For nodes that are <1 x ty>, this map indicates the
/// scalar value of type 'ty' to use.
SmallDenseMap<SDValue, SDValue, 8> ScalarizedVectors;

[390 lines not shown] context: private:
SDValue ExpandFloatOp_FP_TO_UINT(SDNode *N);
SDValue ExpandFloatOp_SELECT_CC(SDNode *N);
SDValue ExpandFloatOp_SETCC(SDNode *N);
SDValue ExpandFloatOp_STORE(SDNode *N, unsigned OpNo);

void FloatExpandSetCCOperands(SDValue &NewLHS, SDValue &NewRHS,
ISD::CondCode &CCCode, SDLoc dl);


		//===--------------------------------------------------------------------===//
		// Float promotion support: LegalizeFloatTypes.cpp
		//===--------------------------------------------------------------------===//

		SDValue GetPromotedFloat(SDValue Op) {
		SDValue &PromotedOp = PromotedFloats[Op];
		RemapValue(PromotedOp);
		assert(PromotedOp.getNode() && "Operand wasn't promoted?");
		return PromotedOp;
		}
		void SetPromotedFloat(SDValue Op, SDValue Result);

		void PromoteFloatResult(SDNode *N, unsigned ResNo);
		SDValue PromoteFloatRes_BITCAST(SDNode *N);
		SDValue PromoteFloatRes_BinOp(SDNode *N);
		SDValue PromoteFloatRes_ConstantFP(SDNode *N);
		SDValue PromoteFloatRes_EXTRACT_VECTOR_ELT(SDNode *N);
		SDValue PromoteFloatRes_FCOPYSIGN(SDNode *N);
		SDValue PromoteFloatRes_FMAD(SDNode *N);
		SDValue PromoteFloatRes_FPOWI(SDNode *N);
		SDValue PromoteFloatRes_FP_ROUND(SDNode *N);
		SDValue PromoteFloatRes_LOAD(SDNode *N);
		SDValue PromoteFloatRes_SELECT(SDNode *N);
		SDValue PromoteFloatRes_SELECT_CC(SDNode *N);
		SDValue PromoteFloatRes_UnaryOp(SDNode *N);
		SDValue PromoteFloatRes_UNDEF(SDNode *N);
		SDValue PromoteFloatRes_XINT_TO_FP(SDNode *N);

		bool PromoteFloatOperand(SDNode *N, unsigned ResNo);
		SDValue PromoteFloatOp_BITCAST(SDNode *N, unsigned OpNo);
		SDValue PromoteFloatOp_FCOPYSIGN(SDNode *N, unsigned OpNo);
		SDValue PromoteFloatOp_FP_EXTEND(SDNode *N, unsigned OpNo);
		SDValue PromoteFloatOp_FP_TO_XINT(SDNode *N, unsigned OpNo);
		SDValue PromoteFloatOp_STORE(SDNode *N, unsigned OpNo);
		SDValue PromoteFloatOp_SELECT_CC(SDNode *N, unsigned OpNo);
		SDValue PromoteFloatOp_SETCC(SDNode *N, unsigned OpNo);

//===--------------------------------------------------------------------===//
// Scalarization Support: LegalizeVectorTypes.cpp
//===--------------------------------------------------------------------===//

/// GetScalarizedVector - Given a processed one-element vector Op which was
/// scalarized to its element type, this returns the element. For example,
/// if Op is a v1i32, Op = < i32 val >, this method returns val, an i32.
SDValue GetScalarizedVector(SDValue Op) {
[266 lines not shown]

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp

[253 lines not shown] context: for (unsigned i = 0, NumResults = N->getNumValues(); i < NumResults; ++i) {
case TargetLowering::TypeSplitVector:
SplitVectorResult(N, i);
Changed = true;
goto NodeDone;
case TargetLowering::TypeWidenVector:
WidenVectorResult(N, i);
Changed = true;
goto NodeDone;
		case TargetLowering::TypePromoteFloat:
		PromoteFloatResult(N, i);
		Changed = true;
		goto NodeDone;
}
}

ScanOperands:
// Scan the operand list for the node, handling any nodes with operands that
// are illegal.
{
unsigned NumOperands = N->getNumOperands();
[33 lines not shown] context: for (i = 0; i != NumOperands; ++i) {
case TargetLowering::TypeSplitVector:
NeedsReanalyzing = SplitVectorOperand(N, i);
Changed = true;
break;
case TargetLowering::TypeWidenVector:
NeedsReanalyzing = WidenVectorOperand(N, i);
Changed = true;
break;
		case TargetLowering::TypePromoteFloat:
		NeedsReanalyzing = PromoteFloatOperand(N, i);
		Changed = true;
		break;
}
break;
}

// The sub-method updated N in place. Check to see if any operands are new,
// and if so, mark them. If the node needs revisiting, don't add all users
// to the worklist etc.
if (NeedsReanalyzing) {
[429 lines not shown] context: assert(Result.getValueType() ==
"Invalid type for softened float");
AnalyzeNewValue(Result);

SDValue &OpEntry = SoftenedFloats[Op];
assert(!OpEntry.getNode() && "Node is already converted to integer!");
OpEntry = Result;
}

		void DAGTypeLegalizer::SetPromotedFloat(SDValue Op, SDValue Result) {
		assert(Result.getValueType() ==
		TLI.getTypeToTransformTo(*DAG.getContext(), Op.getValueType()) &&
		"Invalid type for promoted float");
		AnalyzeNewValue(Result);

		SDValue &OpEntry = PromotedFloats[Op];
		assert(!OpEntry.getNode() && "Node is already promoted!");
		OpEntry = Result;
		}

void DAGTypeLegalizer::SetScalarizedVector(SDValue Op, SDValue Result) {
// Note that in some cases vector operation operands may be greater than
// the vector element type. For example BUILD_VECTOR of type <1 x i1> with
// a constant i8 operand.
assert(Result.getValueType().getSizeInBits() >=
Op.getValueType().getVectorElementType().getSizeInBits() &&
"Invalid type for scalarized vector");
AnalyzeNewValue(Result);
[362 lines not shown]
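
SetPromotedFloat and GetPromotedFloat follow the same bookkeeping pattern as the other legalizer maps: each original value is recorded exactly once, and a lookup asserts that the promotion has already happened. The sketch below models only that invariant. `PromotionTable` and its string keys are hypothetical stand-ins for `SDValue`, and the `RemapValue` and `AnalyzeNewValue` steps of the real code are deliberately left out.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the PromotedFloats map; strings play the role of
// SDValue so the example stays self-contained.
class PromotionTable {
  std::map<std::string, std::string> Promoted;

public:
  // Record the promoted replacement for Op. A value may be promoted only once.
  void set(const std::string &Op, const std::string &Result) {
    std::string &Entry = Promoted[Op];
    assert(Entry.empty() && "Node is already promoted!");
    Entry = Result;
  }

  // Look up the promoted replacement. Asking before promotion is a bug.
  const std::string &get(const std::string &Op) const {
    auto It = Promoted.find(Op);
    assert(It != Promoted.end() && "Operand wasn't promoted?");
    return It->second;
  }

  bool contains(const std::string &Op) const {
    return Promoted.count(Op) != 0;
  }
};
```

For example, after `Table.set("a:f16", "a:f32")`, a call to `Table.get("a:f16")` returns `"a:f32"`, while a second `set` on the same key trips the assertion.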

llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp

[1,250 lines not shown] context: void TargetLoweringBase::computeRegisterProperties(
if (!isTypeLegal(MVT::f32)) {
NumRegistersForVT[MVT::f32] = NumRegistersForVT[MVT::i32];
RegisterTypeForVT[MVT::f32] = RegisterTypeForVT[MVT::i32];
TransformToType[MVT::f32] = MVT::i32;
ValueTypeActions.setTypeAction(MVT::f32, TypeSoftenFloat);
}

if (!isTypeLegal(MVT::f16)) {
		// If the target has native f32 support, promote f16 operations to f32. If
		// f32 is not supported, generate soft float library calls.
		if (isTypeLegal(MVT::f32)) {
		NumRegistersForVT[MVT::f16] = NumRegistersForVT[MVT::f32];
		RegisterTypeForVT[MVT::f16] = RegisterTypeForVT[MVT::f32];
		TransformToType[MVT::f16] = MVT::f32;
		ValueTypeActions.setTypeAction(MVT::f16, TypePromoteFloat);
		} else {
NumRegistersForVT[MVT::f16] = NumRegistersForVT[MVT::i16];
RegisterTypeForVT[MVT::f16] = RegisterTypeForVT[MVT::i16];
TransformToType[MVT::f16] = MVT::i16;
ValueTypeActions.setTypeAction(MVT::f16, TypeSoftenFloat);
}
		}

// Loop over all of the vector value types to see which need transformations.
for (unsigned i = MVT::FIRST_VECTOR_VALUETYPE;
i <= (unsigned)MVT::LAST_VECTOR_VALUETYPE; ++i) {
MVT VT = (MVT::SimpleValueType) i;
if (isTypeLegal(VT))
continue;

[366 lines not shown]
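
The f16 handling in computeRegisterProperties reduces to a three-way choice that depends on which float types the target declares legal. A minimal sketch of that decision, assuming a hypothetical `TargetFPSupport` struct in place of the real `isTypeLegal` queries:

```cpp
// Hypothetical model of the type action chosen for f16; the enumerator names
// echo TargetLowering's TypePromoteFloat / TypeSoftenFloat but are local here.
enum class F16Action { Legal, PromoteFloat, SoftenFloat };

// Hypothetical description of a target's floating-point support.
struct TargetFPSupport {
  bool F16Legal;
  bool F32Legal;
};

F16Action chooseF16Action(const TargetFPSupport &Target) {
  if (Target.F16Legal)
    return F16Action::Legal;        // nothing to legalize
  if (Target.F32Legal)
    return F16Action::PromoteFloat; // operate on f16 values as f32
  return F16Action::SoftenFloat;    // fall back to i16 and soft-float calls
}
```

A target with hardware f32 but no f16 registers therefore gets `PromoteFloat`, which is the case the ARM test below exercises.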

llvm/trunk/test/CodeGen/ARM/fp16-promote.ll

				; RUN: llc -asm-verbose=false < %s -mattr=+vfp3,+fp16 | FileCheck %s -check-prefix=CHECK-FP16 -check-prefix=CHECK-ALL
				; RUN: llc -asm-verbose=false < %s | FileCheck %s -check-prefix=CHECK-LIBCALL -check-prefix=CHECK-ALL

				target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n32"
				target triple = "armv7-eabihf"

				; CHECK-FP16-LABEL: test_fadd:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: ldrh r2, [r0]
				; CHECK-FP16-NEXT: ldrh r1, [r1]
				; CHECK-FP16-NEXT: vmov s0, r1
				; CHECK-FP16-NEXT: vmov s2, r2
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s2, s2
				; CHECK-FP16-NEXT: vadd.f32 s0, s2, s0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r1, s0
				; CHECK-FP16-NEXT: strh r1, [r0]
				; CHECK-FP16-NEXT: bx lr
				; CHECK-LIBCALL-LABEL: test_fadd:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: vadd.f32
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_fadd(half* %p, half* %q) #0 {
				%a = load half, half* %p, align 2
				%b = load half, half* %q, align 2
				%r = fadd half %a, %b
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_fsub:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: ldrh r2, [r0]
				; CHECK-FP16-NEXT: ldrh r1, [r1]
				; CHECK-FP16-NEXT: vmov s0, r1
				; CHECK-FP16-NEXT: vmov s2, r2
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s2, s2
				; CHECK-FP16-NEXT: vsub.f32 s0, s2, s0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r1, s0
				; CHECK-FP16-NEXT: strh r1, [r0]
				; CHECK-FP16-NEXT: bx lr
				; CHECK-LIBCALL-LABEL: test_fsub:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: vsub.f32
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_fsub(half* %p, half* %q) #0 {
				%a = load half, half* %p, align 2
				%b = load half, half* %q, align 2
				%r = fsub half %a, %b
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_fmul:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: ldrh r2, [r0]
				; CHECK-FP16-NEXT: ldrh r1, [r1]
				; CHECK-FP16-NEXT: vmov s0, r1
				; CHECK-FP16-NEXT: vmov s2, r2
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s2, s2
				; CHECK-FP16-NEXT: vmul.f32 s0, s2, s0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r1, s0
				; CHECK-FP16-NEXT: strh r1, [r0]
				; CHECK-FP16-NEXT: bx lr
				; CHECK-LIBCALL-LABEL: test_fmul:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: vmul.f32
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_fmul(half* %p, half* %q) #0 {
				%a = load half, half* %p, align 2
				%b = load half, half* %q, align 2
				%r = fmul half %a, %b
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_fdiv:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: ldrh r2, [r0]
				; CHECK-FP16-NEXT: ldrh r1, [r1]
				; CHECK-FP16-NEXT: vmov s0, r1
				; CHECK-FP16-NEXT: vmov s2, r2
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s2, s2
				; CHECK-FP16-NEXT: vdiv.f32 s0, s2, s0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r1, s0
				; CHECK-FP16-NEXT: strh r1, [r0]
				; CHECK-FP16-NEXT: bx lr
				; CHECK-LIBCALL-LABEL: test_fdiv:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: vdiv.f32
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_fdiv(half* %p, half* %q) #0 {
				%a = load half, half* %p, align 2
				%b = load half, half* %q, align 2
				%r = fdiv half %a, %b
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_frem:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r1]
				; CHECK-FP16-NEXT: ldrh r1, [r4]
				; CHECK-FP16-NEXT: vmov s2, r0
				; CHECK-FP16-NEXT: vmov s0, r1
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s2, s2
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: vmov r1, s2
				; CHECK-FP16-NEXT: bl fmodf
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_frem:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl fmodf
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_frem(half* %p, half* %q) #0 {
				%a = load half, half* %p, align 2
				%b = load half, half* %q, align 2
				%r = frem half %a, %b
				store half %r, half* %p
				ret void
				}

				; CHECK-ALL-LABEL: test_load_store:
				; CHECK-ALL-NEXT: .fnstart
				; CHECK-ALL-NEXT: ldrh r0, [r0]
				; CHECK-ALL-NEXT: strh r0, [r1]
				; CHECK-ALL-NEXT: bx lr
				define void @test_load_store(half* %p, half* %q) #0 {
				%a = load half, half* %p, align 2
				store half %a, half* %q
				ret void
				}

				; Test only that function calls compile successfully. In the ARM ABI, half
				; args and returns are handled as f32.

				declare half @test_callee(half %a, half %b) #0

				; CHECK-ALL-LABEL: test_call:
				; CHECK-ALL-NEXT: .fnstart
				; CHECK-ALL-NEXT: push {r11, lr}
				; CHECK-ALL-NEXT: bl test_callee
				; CHECK-ALL-NEXT: pop {r11, pc}
				define half @test_call(half %a, half %b) #0 {
				%r = call half @test_callee(half %a, half %b)
				ret half %r
				}

				; CHECK-ALL-LABEL: test_call_flipped:
				; CHECK-ALL-NEXT: .fnstart
				; CHECK-ALL-NEXT: push {r11, lr}
				; CHECK-ALL-NEXT: mov r2, r0
				; CHECK-ALL-NEXT: mov r0, r1
				; CHECK-ALL-NEXT: mov r1, r2
				; CHECK-ALL-NEXT: bl test_callee
				; CHECK-ALL-NEXT: pop {r11, pc}
				define half @test_call_flipped(half %a, half %b) #0 {
				%r = call half @test_callee(half %b, half %a)
				ret half %r
				}

				; CHECK-ALL-LABEL: test_tailcall_flipped:
				; CHECK-ALL-NEXT: .fnstart
				; CHECK-ALL-NEXT: mov r2, r0
				; CHECK-ALL-NEXT: mov r0, r1
				; CHECK-ALL-NEXT: mov r1, r2
				; CHECK-ALL-NEXT: b test_callee
				define half @test_tailcall_flipped(half %a, half %b) #0 {
				%r = tail call half @test_callee(half %b, half %a)
				ret half %r
				}

				; The optimizer picks %p or %q based on %c and loads only that value, so
				; no conversion is needed.
				; CHECK-ALL-LABEL: test_select:
				; CHECK-ALL-NEXT: .fnstart
				; CHECK-ALL-NEXT: cmp r2, #0
				; CHECK-ALL-NEXT: movne r1, r0
				; CHECK-ALL-NEXT: ldrh r1, [r1]
				; CHECK-ALL-NEXT: strh r1, [r0]
				; CHECK-ALL-NEXT: bx lr
				define void @test_select(half* %p, half* %q, i1 zeroext %c) #0 {
				%a = load half, half* %p, align 2
				%b = load half, half* %q, align 2
				%r = select i1 %c, half %a, half %b
				store half %r, half* %p
				ret void
				}

				; Test only two variants of fcmp. These get translated to f32 vcmpe
				; instructions anyway.
				; CHECK-FP16-LABEL: test_fcmp_une:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: ldrh r2, [r0]
				; CHECK-FP16-NEXT: ldrh r0, [r1]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vmov s2, r2
				; CHECK-FP16-NEXT: mov r0, #0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s2, s2
				; CHECK-FP16-NEXT: vcmpe.f32 s2, s0
				; CHECK-FP16-NEXT: vmrs APSR_nzcv, fpscr
				; CHECK-FP16-NEXT: movwne r0, #1
				; CHECK-FP16-NEXT: bx lr
				; CHECK-LIBCALL-LABEL: test_fcmp_une:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: vcmpe.f32
				; CHECK-LIBCALL: movwne
				define i1 @test_fcmp_une(half* %p, half* %q) #0 {
				%a = load half, half* %p, align 2
				%b = load half, half* %q, align 2
				%r = fcmp une half %a, %b
				ret i1 %r
				}

				; CHECK-FP16-LABEL: test_fcmp_ueq:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: ldrh r2, [r0]
				; CHECK-FP16-NEXT: ldrh r0, [r1]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vmov s2, r2
				; CHECK-FP16-NEXT: mov r0, #0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s2, s2
				; CHECK-FP16-NEXT: vcmpe.f32 s2, s0
				; CHECK-FP16-NEXT: vmrs APSR_nzcv, fpscr
				; CHECK-FP16-NEXT: movweq r0, #1
				; CHECK-FP16-NEXT: movwvs r0, #1
				; CHECK-FP16-NEXT: bx lr
				; CHECK-LIBCALL-LABEL: test_fcmp_ueq:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: vcmpe.f32
				; CHECK-LIBCALL: movweq
				define i1 @test_fcmp_ueq(half* %p, half* %q) #0 {
				%a = load half, half* %p, align 2
				%b = load half, half* %q, align 2
				%r = fcmp ueq half %a, %b
				ret i1 %r
				}

				; CHECK-FP16-LABEL: test_br_cc:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: ldrh r0, [r0]
				; CHECK-FP16-NEXT: ldrh r1, [r1]
				; CHECK-FP16-NEXT: vmov s0, r1
				; CHECK-FP16-NEXT: vmov s2, r0
				; CHECK-FP16-NEXT: mov r0, #0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s2, s2
				; CHECK-FP16-NEXT: vcmpe.f32 s2, s0
				; CHECK-FP16-NEXT: vmrs APSR_nzcv, fpscr
				; CHECK-FP16-NEXT: strmi r0, [r3]
				; CHECK-FP16-NEXT: strpl r0, [r2]
				; CHECK-FP16-NEXT: bx lr
				; CHECK-LIBCALL-LABEL: test_br_cc:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: vcmpe.f32
				; CHECK-LIBCALL: strmi
				; CHECK-LIBCALL: strpl
				define void @test_br_cc(half* %p, half* %q, i32* %p1, i32* %p2) #0 {
				%a = load half, half* %p, align 2
				%b = load half, half* %q, align 2
				%c = fcmp uge half %a, %b
				br i1 %c, label %then, label %else
				then:
				store i32 0, i32* %p1
				ret void
				else:
				store i32 0, i32* %p2
				ret void
				}

				declare i1 @test_dummy(half* %p) #0
				; CHECK-FP16-LABEL: test_phi:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: vpush {d8, d9}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r4]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s18, s0
				; CHECK-FP16-NEXT: [[LOOP:.LBB[1-9_]+]]:
				; CHECK-FP16-NEXT: ldrh r0, [r4]
				; CHECK-FP16-NEXT: vmov.f32 s16, s18
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: mov r0, r4
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s18, s0
				; CHECK-FP16-NEXT: bl test_dummy
				; CHECK-FP16-NEXT: tst r0, #1
				; CHECK-FP16-NEXT: bne [[LOOP]]
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s16
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-LIBCALL-LABEL: test_phi:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: [[LOOP:.LBB[1-9_]+]]:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl test_dummy
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_phi(half* %p) #0 {
				entry:
				%a = load half, half* %p
				br label %loop
				loop:
				%r = phi half [%a, %entry], [%b, %loop]
				%b = load half, half* %p
				%c = call i1 @test_dummy(half* %p)
				br i1 %c, label %loop, label %return
				return:
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_fptosi_i32:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: ldrh r0, [r0]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vcvt.s32.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: bx
				; CHECK-LIBCALL-LABEL: test_fptosi_i32:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: vcvt.s32.f32
				define i32 @test_fptosi_i32(half* %p) #0 {
				%a = load half, half* %p, align 2
				%r = fptosi half %a to i32
				ret i32 %r
				}

				; CHECK-FP16-LABEL: test_fptosi_i64:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r11, lr}
				; CHECK-FP16-NEXT: ldrh r0, [r0]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: __aeabi_f2lz
				; CHECK-FP16-NEXT: pop {r11, pc}
				; CHECK-LIBCALL-LABEL: test_fptosi_i64:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl __aeabi_f2lz
				define i64 @test_fptosi_i64(half* %p) #0 {
				%a = load half, half* %p, align 2
				%r = fptosi half %a to i64
				ret i64 %r
				}

				; CHECK-FP16-LABEL: test_fptoui_i32:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: ldrh r0, [r0]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vcvt.u32.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: bx
				; CHECK-LIBCALL-LABEL: test_fptoui_i32:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: vcvt.u32.f32
				define i32 @test_fptoui_i32(half* %p) #0 {
				%a = load half, half* %p, align 2
				%r = fptoui half %a to i32
				ret i32 %r
				}

				; CHECK-FP16-LABEL: test_fptoui_i64:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r11, lr}
				; CHECK-FP16-NEXT: ldrh r0, [r0]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: __aeabi_f2ulz
				; CHECK-FP16-NEXT: pop {r11, pc}
				; CHECK-LIBCALL-LABEL: test_fptoui_i64:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl __aeabi_f2ulz
				define i64 @test_fptoui_i64(half* %p) #0 {
				%a = load half, half* %p, align 2
				%r = fptoui half %a to i64
				ret i64 %r
				}

				; CHECK-FP16-LABEL: test_sitofp_i32:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvt.f32.s32 s0, s0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r1]
				; CHECK-FP16-NEXT: bx
				; CHECK-LIBCALL-LABEL: test_sitofp_i32:
				; CHECK-LIBCALL: vcvt.f32.s32
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_sitofp_i32(i32 %a, half* %p) #0 {
				%r = sitofp i32 %a to half
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_uitofp_i32:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvt.f32.u32 s0, s0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r1]
				; CHECK-FP16-NEXT: bx
				; CHECK-LIBCALL-LABEL: test_uitofp_i32:
				; CHECK-LIBCALL: vcvt.f32.u32
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_uitofp_i32(i32 %a, half* %p) #0 {
				%r = uitofp i32 %a to half
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_sitofp_i64:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r2
				; CHECK-FP16-NEXT: bl __aeabi_l2f
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_sitofp_i64:
				; CHECK-LIBCALL: bl __aeabi_l2f
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_sitofp_i64(i64 %a, half* %p) #0 {
				%r = sitofp i64 %a to half
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_uitofp_i64:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r2
				; CHECK-FP16-NEXT: bl __aeabi_ul2f
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_uitofp_i64:
				; CHECK-LIBCALL: bl __aeabi_ul2f
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_uitofp_i64(i64 %a, half* %p) #0 {
				%r = uitofp i64 %a to half
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_fptrunc_float:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r1]
				; CHECK-FP16-NEXT: bx
				; CHECK-LIBCALL-LABEL: test_fptrunc_float:
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_fptrunc_float(float %f, half* %p) #0 {
				%a = fptrunc float %f to half
				store half %a, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_fptrunc_double:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r2
				; CHECK-FP16-NEXT: bl __aeabi_d2h
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_fptrunc_double:
				; CHECK-LIBCALL: bl __aeabi_d2h
				define void @test_fptrunc_double(double %d, half* %p) #0 {
				%a = fptrunc double %d to half
				store half %a, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_fpextend_float:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: ldrh r0, [r0]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: bx lr
				; CHECK-LIBCALL-LABEL: test_fpextend_float:
				; CHECK-LIBCALL: b __gnu_h2f_ieee
				define float @test_fpextend_float(half* %p) {
				%a = load half, half* %p, align 2
				%r = fpext half %a to float
				ret float %r
				}

				; CHECK-FP16-LABEL: test_fpextend_double:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: ldrh r0, [r0]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vcvt.f64.f32 d16, s0
				; CHECK-FP16-NEXT: vmov r0, r1, d16
				; CHECK-FP16-NEXT: bx lr
				; CHECK-LIBCALL-LABEL: test_fpextend_double:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: vcvt.f64.f32
				define double @test_fpextend_double(half* %p) {
				%a = load half, half* %p, align 2
				%r = fpext half %a to double
				ret double %r
				}

				; CHECK-BOTH-LABEL: test_bitcast_halftoi16:
				; CHECK-BOTH-NEXT: .fnstart
				; CHECK-BOTH-NEXT: ldrh r0, [r0]
				; CHECK-BOTH-NEXT: bx lr
				define i16 @test_bitcast_halftoi16(half* %p) #0 {
				%a = load half, half* %p, align 2
				%r = bitcast half %a to i16
				ret i16 %r
				}

				; CHECK-BOTH-LABEL: test_bitcast_i16tohalf:
				; CHECK-BOTH-NEXT: .fnstart
				; CHECK-BOTH-NEXT: strh r0, [r1]
				; CHECK-BOTH-NEXT: bx lr
				define void @test_bitcast_i16tohalf(i16 %a, half* %p) #0 {
				%r = bitcast i16 %a to half
				store half %r, half* %p
				ret void
				}

				declare half @llvm.sqrt.f16(half %a) #0
				declare half @llvm.powi.f16(half %a, i32 %b) #0
				declare half @llvm.sin.f16(half %a) #0
				declare half @llvm.cos.f16(half %a) #0
				declare half @llvm.pow.f16(half %a, half %b) #0
				declare half @llvm.exp.f16(half %a) #0
				declare half @llvm.exp2.f16(half %a) #0
				declare half @llvm.log.f16(half %a) #0
				declare half @llvm.log10.f16(half %a) #0
				declare half @llvm.log2.f16(half %a) #0
				declare half @llvm.fma.f16(half %a, half %b, half %c) #0
				declare half @llvm.fabs.f16(half %a) #0
				declare half @llvm.minnum.f16(half %a, half %b) #0
				declare half @llvm.maxnum.f16(half %a, half %b) #0
				declare half @llvm.copysign.f16(half %a, half %b) #0
				declare half @llvm.floor.f16(half %a) #0
				declare half @llvm.ceil.f16(half %a) #0
				declare half @llvm.trunc.f16(half %a) #0
				declare half @llvm.rint.f16(half %a) #0
				declare half @llvm.nearbyint.f16(half %a) #0
				declare half @llvm.round.f16(half %a) #0
				declare half @llvm.fmuladd.f16(half %a, half %b, half %c) #0

				; CHECK-FP16-LABEL: test_sqrt:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: ldrh r1, [r0]
				; CHECK-FP16-NEXT: vmov s0, r1
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vsqrt.f32 s0, s0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r1, s0
				; CHECK-FP16-NEXT: strh r1, [r0]
				; CHECK-FP16-NEXT: bx lr
				; CHECK-LIBCALL-LABEL: test_sqrt:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: vsqrt.f32
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_sqrt(half* %p) #0 {
				%a = load half, half* %p, align 2
				%r = call half @llvm.sqrt.f16(half %a)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_fpowi:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r4]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: bl __powisf2
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_fpowi:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl __powisf2
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_fpowi(half* %p, i32 %b) #0 {
				%a = load half, half* %p, align 2
				%r = call half @llvm.powi.f16(half %a, i32 %b)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_sin:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r4]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: bl sinf
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_sin:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl sinf
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_sin(half* %p) #0 {
				%a = load half, half* %p, align 2
				%r = call half @llvm.sin.f16(half %a)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_cos:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r4]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: bl cosf
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_cos:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl cosf
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_cos(half* %p) #0 {
				%a = load half, half* %p, align 2
				%r = call half @llvm.cos.f16(half %a)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_pow:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r1]
				; CHECK-FP16-NEXT: ldrh r1, [r4]
				; CHECK-FP16-NEXT: vmov s2, r0
				; CHECK-FP16-NEXT: vmov s0, r1
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s2, s2
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: vmov r1, s2
				; CHECK-FP16-NEXT: bl powf
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_pow:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl powf
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_pow(half* %p, half* %q) #0 {
				%a = load half, half* %p, align 2
				%b = load half, half* %q, align 2
				%r = call half @llvm.pow.f16(half %a, half %b)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_exp:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r4]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: bl expf
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_exp:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl expf
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_exp(half* %p) #0 {
				%a = load half, half* %p, align 2
				%r = call half @llvm.exp.f16(half %a)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_exp2:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r4]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: bl exp2f
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_exp2:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl exp2f
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_exp2(half* %p) #0 {
				%a = load half, half* %p, align 2
				%r = call half @llvm.exp2.f16(half %a)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_log:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r4]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: bl logf
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_log:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl logf
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_log(half* %p) #0 {
				%a = load half, half* %p, align 2
				%r = call half @llvm.log.f16(half %a)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_log10:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r4]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: bl log10f
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_log10:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl log10f
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_log10(half* %p) #0 {
				%a = load half, half* %p, align 2
				%r = call half @llvm.log10.f16(half %a)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_log2:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r4]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: bl log2f
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_log2:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl log2f
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_log2(half* %p) #0 {
				%a = load half, half* %p, align 2
				%r = call half @llvm.log2.f16(half %a)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_fma:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r2]
				; CHECK-FP16-NEXT: ldrh r1, [r1]
				; CHECK-FP16-NEXT: ldrh r2, [r4]
				; CHECK-FP16-NEXT: vmov s2, r1
				; CHECK-FP16-NEXT: vmov s4, r0
				; CHECK-FP16-NEXT: vmov s0, r2
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s2, s2
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s4, s4
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: vmov r1, s2
				; CHECK-FP16-NEXT: vmov r2, s4
				; CHECK-FP16-NEXT: bl fmaf
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_fma:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl fmaf
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_fma(half* %p, half* %q, half* %r) #0 {
				%a = load half, half* %p, align 2
				%b = load half, half* %q, align 2
				%c = load half, half* %r, align 2
				%v = call half @llvm.fma.f16(half %a, half %b, half %c)
				store half %v, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_fabs:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: ldrh r1, [r0]
				; CHECK-FP16-NEXT: vmov s0, r1
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vabs.f32 s0, s0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r1, s0
				; CHECK-FP16-NEXT: strh r1, [r0]
				; CHECK-FP16-NEXT: bx lr
				; CHECK-LIBCALL-LABEL: test_fabs:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bfc
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_fabs(half* %p) {
				%a = load half, half* %p, align 2
				%r = call half @llvm.fabs.f16(half %a)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_minnum:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r1]
				; CHECK-FP16-NEXT: ldrh r1, [r4]
				; CHECK-FP16-NEXT: vmov s2, r0
				; CHECK-FP16-NEXT: vmov s0, r1
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s2, s2
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: vmov r1, s2
				; CHECK-FP16-NEXT: bl fminf
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_minnum:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl fminf
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_minnum(half* %p, half* %q) #0 {
				%a = load half, half* %p, align 2
				%b = load half, half* %q, align 2
				%r = call half @llvm.minnum.f16(half %a, half %b)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_maxnum:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r1]
				; CHECK-FP16-NEXT: ldrh r1, [r4]
				; CHECK-FP16-NEXT: vmov s2, r0
				; CHECK-FP16-NEXT: vmov s0, r1
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s2, s2
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: vmov r1, s2
				; CHECK-FP16-NEXT: bl fmaxf
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_maxnum:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl fmaxf
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_maxnum(half* %p, half* %q) #0 {
				%a = load half, half* %p, align 2
				%b = load half, half* %q, align 2
				%r = call half @llvm.maxnum.f16(half %a, half %b)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_copysign:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: ldrh r1, [r1]
				; CHECK-FP16-NEXT: ldrh r2, [r0]
				; CHECK-FP16-NEXT: vmov.i32 d2, #0x80000000
				; CHECK-FP16-NEXT: vmov s0, r2
				; CHECK-FP16-NEXT: vmov s2, r1
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s2, s2
				; CHECK-FP16-NEXT: vbsl d2, d1, d0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s4
				; CHECK-FP16-NEXT: vmov r1, s0
				; CHECK-FP16-NEXT: strh r1, [r0]
				; CHECK-FP16-NEXT: bx lr
				; CHECK-LIBCALL-LABEL: test_copysign:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: vbsl
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_copysign(half* %p, half* %q) #0 {
				%a = load half, half* %p, align 2
				%b = load half, half* %q, align 2
				%r = call half @llvm.copysign.f16(half %a, half %b)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_floor:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r4]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: bl floorf
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_floor:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl floorf
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_floor(half* %p) {
				%a = load half, half* %p, align 2
				%r = call half @llvm.floor.f16(half %a)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_ceil:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r4]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: bl ceilf
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_ceil:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl ceilf
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_ceil(half* %p) {
				%a = load half, half* %p, align 2
				%r = call half @llvm.ceil.f16(half %a)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_trunc:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r4]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: bl truncf
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_trunc:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl truncf
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_trunc(half* %p) {
				%a = load half, half* %p, align 2
				%r = call half @llvm.trunc.f16(half %a)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_rint:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r4]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: bl rintf
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_rint:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl rintf
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_rint(half* %p) {
				%a = load half, half* %p, align 2
				%r = call half @llvm.rint.f16(half %a)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_nearbyint:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r4]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: bl nearbyintf
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_nearbyint:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl nearbyintf
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_nearbyint(half* %p) {
				%a = load half, half* %p, align 2
				%r = call half @llvm.nearbyint.f16(half %a)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_round:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: push {r4, lr}
				; CHECK-FP16-NEXT: mov r4, r0
				; CHECK-FP16-NEXT: ldrh r0, [r4]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: bl roundf
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s0
				; CHECK-FP16-NEXT: vmov r0, s0
				; CHECK-FP16-NEXT: strh r0, [r4]
				; CHECK-FP16-NEXT: pop {r4, pc}
				; CHECK-LIBCALL-LABEL: test_round:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl roundf
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_round(half* %p) {
				%a = load half, half* %p, align 2
				%r = call half @llvm.round.f16(half %a)
				store half %r, half* %p
				ret void
				}

				; CHECK-FP16-LABEL: test_fmuladd:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: ldrh r2, [r2]
				; CHECK-FP16-NEXT: ldrh r3, [r0]
				; CHECK-FP16-NEXT: ldrh r1, [r1]
				; CHECK-FP16-NEXT: vmov s0, r1
				; CHECK-FP16-NEXT: vmov s2, r3
				; CHECK-FP16-NEXT: vmov s4, r2
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s2, s2
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s4, s4
				; CHECK-FP16-NEXT: vmla.f32 s4, s2, s0
				; CHECK-FP16-NEXT: vcvtb.f16.f32 s0, s4
				; CHECK-FP16-NEXT: vmov r1, s0
				; CHECK-FP16-NEXT: strh r1, [r0]
				; CHECK-FP16-NEXT: bx lr
				; CHECK-LIBCALL-LABEL: test_fmuladd:
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: bl __gnu_h2f_ieee
				; CHECK-LIBCALL: vmla.f32
				; CHECK-LIBCALL: bl __gnu_f2h_ieee
				define void @test_fmuladd(half* %p, half* %q, half* %r) #0 {
				%a = load half, half* %p, align 2
				%b = load half, half* %q, align 2
				%c = load half, half* %r, align 2
				%v = call half @llvm.fmuladd.f16(half %a, half %b, half %c)
				store half %v, half* %p
				ret void
				}

				; f16 vectors are not legal in the backend. Vector elements are not assigned
				; to registers but are stored on the stack instead. Hence insertelement
				; and extractelement need these extra loads and stores.

				; CHECK-ALL-LABEL: test_insertelement:
				; CHECK-ALL-NEXT: .fnstart
				; CHECK-ALL-NEXT: sub sp, sp, #8
				; CHECK-ALL-NEXT: ldrh r3, [r1, #6]
				; CHECK-ALL-NEXT: strh r3, [sp, #6]
				; CHECK-ALL-NEXT: ldrh r3, [r1, #4]
				; CHECK-ALL-NEXT: strh r3, [sp, #4]
				; CHECK-ALL-NEXT: ldrh r3, [r1, #2]
				; CHECK-ALL-NEXT: strh r3, [sp, #2]
				; CHECK-ALL-NEXT: ldrh r3, [r1]
				; CHECK-ALL-NEXT: strh r3, [sp]
				; CHECK-ALL-NEXT: mov r3, sp
				; CHECK-ALL-NEXT: ldrh r0, [r0]
				; CHECK-ALL-NEXT: add r2, r3, r2, lsl #1
				; CHECK-ALL-NEXT: strh r0, [r2]
				; CHECK-ALL-NEXT: ldrh r0, [sp, #6]
				; CHECK-ALL-NEXT: strh r0, [r1, #6]
				; CHECK-ALL-NEXT: ldrh r0, [sp, #4]
				; CHECK-ALL-NEXT: strh r0, [r1, #4]
				; CHECK-ALL-NEXT: ldrh r0, [sp, #2]
				; CHECK-ALL-NEXT: strh r0, [r1, #2]
				; CHECK-ALL-NEXT: ldrh r0, [sp]
				; CHECK-ALL-NEXT: strh r0, [r1]
				; CHECK-ALL-NEXT: add sp, sp, #8
				; CHECK-ALL-NEXT: bx lr
				define void @test_insertelement(half* %p, <4 x half>* %q, i32 %i) #0 {
				%a = load half, half* %p, align 2
				%b = load <4 x half>, <4 x half>* %q, align 8
				%c = insertelement <4 x half> %b, half %a, i32 %i
				store <4 x half> %c, <4 x half>* %q
				ret void
				}

				; CHECK-ALL-LABEL: test_extractelement:
				; CHECK-ALL-NEXT: .fnstart
				; CHECK-ALL-NEXT: sub sp, sp, #8
				; CHECK-ALL-NEXT: ldrh r12, [r1, #2]
				; CHECK-ALL-NEXT: ldrh r3, [r1]
				; CHECK-ALL-NEXT: orr r3, r3, r12, lsl #16
				; CHECK-ALL-NEXT: str r3, [sp]
				; CHECK-ALL-NEXT: ldrh r3, [r1, #6]
				; CHECK-ALL-NEXT: ldrh r1, [r1, #4]
				; CHECK-ALL-NEXT: orr r1, r1, r3, lsl #16
				; CHECK-ALL-NEXT: str r1, [sp, #4]
				; CHECK-ALL-NEXT: mov r1, sp
				; CHECK-ALL-NEXT: add r1, r1, r2, lsl #1
				; CHECK-ALL-NEXT: ldrh r1, [r1]
				; CHECK-ALL-NEXT: strh r1, [r0]
				; CHECK-ALL-NEXT: add sp, sp, #8
				; CHECK-ALL-NEXT: bx lr
				define void @test_extractelement(half* %p, <4 x half>* %q, i32 %i) #0 {
				%a = load <4 x half>, <4 x half>* %q, align 8
				%b = extractelement <4 x half> %a, i32 %i
				store half %b, half* %p
				ret void
				}

				; Test struct operations.

				%struct.dummy = type { i32, half }

				; CHECK-ALL-LABEL: test_insertvalue:
				; CHECK-ALL-NEXT: .fnstart
				; CHECK-ALL-NEXT: ldr r2, [r0]
				; CHECK-ALL-NEXT: ldrh r1, [r1]
				; CHECK-ALL-NEXT: strh r1, [r0, #4]
				; CHECK-ALL-NEXT: str r2, [r0]
				; CHECK-ALL-NEXT: bx lr
				define void @test_insertvalue(%struct.dummy* %p, half* %q) {
				%a = load %struct.dummy, %struct.dummy* %p
				%b = load half, half* %q
				%c = insertvalue %struct.dummy %a, half %b, 1
				store %struct.dummy %c, %struct.dummy* %p
				ret void
				}

				; CHECK-ALL-LABEL: test_extractvalue:
				; CHECK-ALL-NEXT: .fnstart
				; CHECK-ALL-NEXT: ldrh r0, [r0, #4]
				; CHECK-ALL-NEXT: strh r0, [r1]
				; CHECK-ALL-NEXT: bx lr
				define void @test_extractvalue(%struct.dummy* %p, half* %q) {
				%a = load %struct.dummy, %struct.dummy* %p
				%b = extractvalue %struct.dummy %a, 1
				store half %b, half* %q
				ret void
				}

				; CHECK-FP16-LABEL: test_struct_return:
				; CHECK-FP16-NEXT: .fnstart
				; CHECK-FP16-NEXT: ldr r2, [r0]
				; CHECK-FP16-NEXT: ldrh r0, [r0, #4]
				; CHECK-FP16-NEXT: vmov s0, r0
				; CHECK-FP16-NEXT: mov r0, r2
				; CHECK-FP16-NEXT: vcvtb.f32.f16 s0, s0
				; CHECK-FP16-NEXT: vmov r1, s0
				; CHECK-FP16-NEXT: bx lr
				; CHECK-LIBCALL-LABEL: test_struct_return:
				; CHECK-LIBCALL-NEXT: .fnstart
				; CHECK-LIBCALL-NEXT: push {r4, lr}
				; CHECK-LIBCALL-NEXT: ldr r4, [r0]
				; CHECK-LIBCALL-NEXT: ldrh r0, [r0, #4]
				; CHECK-LIBCALL-NEXT: bl __gnu_h2f_ieee
				; CHECK-LIBCALL-NEXT: mov r1, r0
				; CHECK-LIBCALL-NEXT: mov r0, r4
				; CHECK-LIBCALL-NEXT: pop {r4, pc}
				define %struct.dummy @test_struct_return(%struct.dummy* %p) {
				%a = load %struct.dummy, %struct.dummy* %p
				ret %struct.dummy %a
				}

				; CHECK-ALL-LABEL: test_struct_arg:
				; CHECK-ALL-NEXT: .fnstart
				; CHECK-ALL-NEXT: mov r0, r1
				; CHECK-ALL-NEXT: bx lr
				define half @test_struct_arg(%struct.dummy %p) {
				%a = extractvalue %struct.dummy %p, 1
				ret half %a
				}

				attributes #0 = { nounwind }