This is an archive of the discontinued LLVM Phabricator instance.


Differential D27803

[ARM] GlobalISel: Load i1, i8 and i16 args from stack
ClosedPublic

Authored by rovka on Dec 15 2016, 6:26 AM.


Details

Reviewers

qcolombet
rengolin
t.p.northover
ab

Commits

rG278c722e6d01: [ARM] GlobalISel: Load i1, i8 and i16 args from stack
rL293163: [ARM] GlobalISel: Load i1, i8 and i16 args from stack

Summary

Add support for loading i1, i8 and i16 arguments from the stack, with or without
the ABI extension flags.

When the ABI extension flags are present, we load a 4-byte value, otherwise we
preserve the size of the load and let the instruction selector replace it with a
LDRB/LDRH. This generates the same thing as DAGISel (it's not entirely clear to
me which calling conventions and in what circumstances may be lacking the ext
flags, feel free to point it out).
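
The size rule described above can be modelled in a few lines. This is only a sketch, not the patch's code: `LocInfo` is a hypothetical stand-in for `CCValAssign::LocInfo`, and `stackLoadSize` is a made-up helper name.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for CCValAssign::LocInfo; only the cases relevant
// to this summary are modelled.
enum class LocInfo { Full, SExt, ZExt };

// Returns how many bytes the callee should load for an integer argument
// passed on the stack, given the byte size of its IR type.
inline std::uint64_t stackLoadSize(std::uint64_t ArgSize, LocInfo Info) {
  assert((ArgSize == 1 || ArgSize == 2 || ArgSize == 4) && "Unsupported size");

  // With signext/zeroext the caller has already widened the value to a full
  // 32-bit slot, so the whole slot is loaded.
  if (Info == LocInfo::SExt || Info == LocInfo::ZExt)
    return 4;

  // Without extension flags the load keeps the argument's own size and the
  // instruction selector later picks a byte or halfword load.
  return ArgSize;
}
```

For example, an `i8 signext` stack argument yields a 4-byte load, while a plain `i16` stack argument yields a 2-byte load.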

Diff Detail

Repository: rL LLVM

Event Timeline

rovka updated this revision to Diff 81569. Dec 15 2016, 6:26 AM

rovka retitled this revision to [ARM] GlobalISel: Load i1, i8 and i16 args from stack.

rovka updated this object.

rovka added reviewers: ab, qcolombet, t.p.northover.

rovka added subscribers: llvm-commits, rengolin, dsanders.

Herald added subscribers: dberris, vkalintiris, aemerson. Dec 15 2016, 6:26 AM

rovka added a parent revision: D27706: [ARM] GlobalISel: Support i1 add and ABI extensions. Dec 15 2016, 6:26 AM

t.p.northover added inline comments. Dec 16 2016, 11:55 AM

lib/Target/ARM/ARMCallLowering.cpp
134–142 ↗	(On Diff #81569)	Is this necessary? Just because the caller stored 4 bytes doesn't mean we have to load 4. You might make some kind of efficiency argument if you expect most arithmetic to be done on i32 (as in C code), but I don't think this really helps there anyway because the G_SEXT/G_ZEXT will still exist in the MIR. Also, if this really is necessary the `MMO` is misrecording the size of the memory operation.

lib/Target/ARM/ARMCallLowering.cpp
134–142 ↗	(On Diff #81569)	Well, actually we're not generating a G_ZEXT or G_SEXT here... Should we? It looks like the simplest thing to do, but OTOH it's the caller's job to generate the extension, so if we did anything here we'd just be duplicating the caller's work (and introducing a potential source of bugs). I'm not sure what the best course of action is:
- Keep the size and generate the G_LOAD + G_ZEXT/G_SEXT instructions
- Generate G_LOAD with size 4
- Generate LDRi12 directly as in this patch
Are we breaking any assumptions if we choose one of the last 2? Would that be premature optimization?

I've been thinking about this a bit more and I think it's better to generate a G_LOAD, but for a size of 4 so that we preserve the extension performed by the caller.
This is achieved by changing the type of the virtual register that we're supposed to load into to a 32-bit scalar (which is sufficient for now, since we only support scalars at the moment). I've also fixed the size of the memory operand, which was wrong in the previous versions of the patch.
Let me know what you think.
Thanks.
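
To illustrate why a 4-byte load preserves the extension performed by the caller, here is a small model. It is a sketch under stated assumptions: the stack slot is treated as a single 32-bit value (which sidesteps byte order), and all function names are hypothetical, not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Caller side: widen an 8-bit value to the 32-bit stack slot.
inline std::uint32_t callerStoreSExt(std::int8_t V) {
  return static_cast<std::uint32_t>(static_cast<std::int32_t>(V));
}
inline std::uint32_t callerStoreZExt(std::uint8_t V) { return V; }

// Callee side, byte load: only the low 8 bits are read and the result is
// zero-extended, which is what LDRB does.
inline std::uint32_t calleeLoadByte(std::uint32_t Slot) { return Slot & 0xFFu; }

// Callee side, word load: the whole slot is read, so whatever extension the
// caller applied is still there.
inline std::uint32_t calleeLoadWord(std::uint32_t Slot) { return Slot; }

// Small demonstration with a sign-extended -1.
inline void demo() {
  const std::uint32_t Slot = callerStoreSExt(-1);
  assert(calleeLoadWord(Slot) == 0xFFFFFFFFu); // sign bits preserved
  assert(calleeLoadByte(Slot) == 0xFFu);       // sign bits dropped
}
```

With a word load the callee needs no extension of its own, so the two sides cannot drift out of sync.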

Herald added a subscriber: kristof.beyls. Jan 19 2017, 12:27 PM

Ping.

rengolin added inline comments. Jan 23 2017, 9:02 AM

lib/Target/ARM/ARMCallLowering.cpp
134–142 ↗	(On Diff #81569)	I believe we should start with generating the right load (B/H/W) and Z/S-extend. It may duplicate the caller's work (since it was their job), but it's also safer if the caller is not following the ABI. Why would that matter, well, because we support multiple ABIs, and I don't claim to know all of them by heart. So, we implement the safe option now, and later on optimise for specific ABIs if we can guarantee it'll be safe there, too.

rovka added inline comments. Jan 25 2017, 2:18 AM

lib/Target/ARM/ARMCallLowering.cpp
134–142 ↗	(On Diff #81569)	I'm not sure I understand why you think it's safer to duplicate the caller's work. Wherever there's duplication, there's the possibility of going out-of-sync between the two sides, and I don't think guarding against the caller not following the ABI is worth that risk. Also, by doing the extension to 32 bits we're already assuming something about the ABI. Are you thinking about something in particular about an ABI that would break if we just load those 32 bits? FWIW, the current instruction selection just loads 32 bits for every calling convention supported by ARM (except for ghccc, which just asserts).

rengolin accepted this revision. Jan 25 2017, 3:21 AM

rengolin added inline comments.

lib/Target/ARM/ARMCallLowering.cpp
134–142 ↗	(On Diff #81569)	I was under the impression that Tim had knowledge of some ABI (Apple) that didn't follow that rule. Re-reading his comment now, it seems it was just a point about optimisation, which should not be our worry for now. If the current selector does this, then not changing behaviour is the way to go. Bonus points for following the ABI. :)

This revision is now accepted and ready to land. Jan 25 2017, 3:21 AM

Closed by commit rL293163: [ARM] GlobalISel: Load i1, i8 and i16 args from stack (authored by rovka). Jan 26 2017, 1:32 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

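Path	Size
llvm/trunk/lib/Target/ARM/ARMCallLowering.cpp	23 lines
llvm/trunk/lib/Target/ARM/ARMInstructionSelector.cpp	32 lines
llvm/trunk/lib/Target/ARM/ARMLegalizerInfo.cpp	3 lines
llvm/trunk/test/CodeGen/ARM/GlobalISel/arm-instruction-select.mir	15 lines
llvm/trunk/test/CodeGen/ARM/GlobalISel/arm-irtranslator.ll	40 lines
llvm/trunk/test/CodeGen/ARM/GlobalISel/arm-isel.ll	37 lines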

Diff 85872

llvm/trunk/lib/Target/ARM/ARMCallLowering.cpp

[... 116 lines not shown ...]
 namespace {
 struct FormalArgHandler : public CallLowering::ValueHandler {
   FormalArgHandler(MachineIRBuilder &MIRBuilder, MachineRegisterInfo &MRI,
                    CCAssignFn AssignFn)
       : ValueHandler(MIRBuilder, MRI, AssignFn) {}

   unsigned getStackAddress(uint64_t Size, int64_t Offset,
                            MachinePointerInfo &MPO) override {
-    assert(Size == 4 && "Unsupported size");
+    assert((Size == 1 || Size == 2 || Size == 4) && "Unsupported size");

     auto &MFI = MIRBuilder.getMF().getFrameInfo();

     int FI = MFI.CreateFixedObject(Size, Offset, true);
     MPO = MachinePointerInfo::getFixedStack(MIRBuilder.getMF(), FI);

     unsigned AddrReg =
         MRI.createGenericVirtualRegister(LLT::pointer(MPO.getAddrSpace(), 32));
     MIRBuilder.buildFrameIndex(AddrReg, FI);

     return AddrReg;
   }

   void assignValueToAddress(unsigned ValVReg, unsigned Addr, uint64_t Size,
                             MachinePointerInfo &MPO, CCValAssign &VA) override {
-    assert(Size == 4 && "Unsupported size");
+    assert((Size == 1 || Size == 2 || Size == 4) && "Unsupported size");
+
+    if (VA.getLocInfo() == CCValAssign::SExt ||
+        VA.getLocInfo() == CCValAssign::ZExt) {
+      // If the argument is zero- or sign-extended by the caller, its size
+      // becomes 4 bytes, so that's what we should load.
+      Size = 4;
+      assert(MRI.getType(ValVReg).isScalar() && "Only scalars supported atm");
+      MRI.setType(ValVReg, LLT::scalar(32));
+    }

     auto MMO = MIRBuilder.getMF().getMachineMemOperand(
         MPO, MachineMemOperand::MOLoad, Size, /* Alignment */ 0);
     MIRBuilder.buildLoad(ValVReg, Addr, *MMO);
   }

   void assignValueToReg(unsigned ValVReg, unsigned PhysReg,
                         CCValAssign &VA) override {
[... 22 lines not shown ...]
 bool ARMCallLowering::lowerFormalArguments(MachineIRBuilder &MIRBuilder,

   auto DL = MIRBuilder.getMF().getDataLayout();
   auto &TLI = *getTLI<ARMTargetLowering>();

   if (TLI.getSubtarget()->isThumb())
     return false;

   auto &Args = F.getArgumentList();
-  unsigned ArgIdx = 0;
-  for (auto &Arg : Args) {
-    ArgIdx++;
+  for (auto &Arg : Args)
     if (!isSupportedType(DL, TLI, Arg.getType()))
       return false;

-    // FIXME: This check as well as ArgIdx are going away as soon as we support
-    // loading values < 32 bits.
-    if (ArgIdx > 4 && Arg.getType()->getIntegerBitWidth() != 32)
-      return false;
-  }
-
   CCAssignFn *AssignFn =
       TLI.CCAssignFnForCall(F.getCallingConv(), F.isVarArg());

   SmallVector<ArgInfo, 8> ArgInfos;
   unsigned Idx = 0;
   for (auto &Arg : Args) {
     ArgInfo AInfo(VRegs[Idx], Arg.getType());
     setArgFlags(AInfo, Idx + 1, DL, F);
     ArgInfos.push_back(AInfo);
     Idx++;
   }

   FormalArgHandler ArgHandler(MIRBuilder, MIRBuilder.getMF().getRegInfo(),
                               AssignFn);
   return handleAssignments(MIRBuilder, ArgInfos, ArgHandler);
 }

llvm/trunk/lib/Target/ARM/ARMInstructionSelector.cpp

[... 79 lines not shown ...]
   if (Opc == G_SEXT)
     return Size == 8 ? ARM::SXTB : ARM::SXTH;

   if (Opc == G_ZEXT)
     return Size == 8 ? ARM::UXTB : ARM::UXTH;

   llvm_unreachable("Unsupported opcode");
 }

+/// Select the opcode for simple loads. For types smaller than 32 bits, the
+/// value will be zero extended.
+static unsigned selectLoadOpCode(unsigned Size) {
+  switch (Size) {
+  case 1:
+  case 8:
+    return ARM::LDRBi12;
+  case 16:
+    return ARM::LDRH;
+  case 32:
+    return ARM::LDRi12;
+  }
+
+  llvm_unreachable("Unsupported size");
+}
+
 bool ARMInstructionSelector::select(MachineInstr &I) const {
   assert(I.getParent() && "Instruction should be in a basic block!");
   assert(I.getParent()->getParent() && "Instruction should be in a function!");

   auto &MBB = *I.getParent();
   auto &MF = *MBB.getParent();
   auto &MRI = MF.getRegInfo();

[... 66 lines not shown ...]
   case G_ADD:
     MIB.add(predOps(ARMCC::AL)).add(condCodeOp());
     break;
   case G_FRAME_INDEX:
     // Add 0 to the given frame index and hope it will eventually be folded into
     // the user(s).
     I.setDesc(TII.get(ARM::ADDri));
     MIB.addImm(0).add(predOps(ARMCC::AL)).add(condCodeOp());
     break;
-  case G_LOAD:
-    I.setDesc(TII.get(ARM::LDRi12));
+  case G_LOAD: {
+    LLT ValTy = MRI.getType(I.getOperand(0).getReg());
+    const auto ValSize = ValTy.getSizeInBits();
+
+    if (ValSize != 32 && ValSize != 16 && ValSize != 8 && ValSize != 1)
+      return false;
+
+    const auto NewOpc = selectLoadOpCode(ValSize);
+    I.setDesc(TII.get(NewOpc));
+
+    if (NewOpc == ARM::LDRH)
+      // LDRH has a funny addressing mode (there's already a FIXME for it).
+      MIB.addReg(0);
     MIB.addImm(0).add(predOps(ARMCC::AL));
     break;
+  }
   default:
     return false;
   }

   return constrainSelectedInstRegOperands(I, TII, TRI, RBI);
 }
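
The G_LOAD case above can be modelled outside LLVM to make the operand layout explicit. This is a hedged sketch: `LoadOpc`, `pickLoadOpc` and `trailingOperands` are hypothetical names, operands are plain strings, and "noreg" stands for register 0 (printed as `_` in the MIR checks).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical tags standing in for ARM::LDRBi12, ARM::LDRH and ARM::LDRi12;
// these are not the real LLVM enumerators.
enum class LoadOpc { LDRBi12, LDRH, LDRi12, Unsupported };

// Mirrors the size switch in the patch: s1 and s8 use a byte load, s16 a
// halfword load and s32 a word load.
inline LoadOpc pickLoadOpc(unsigned SizeInBits) {
  switch (SizeInBits) {
  case 1:
  case 8:
    return LoadOpc::LDRBi12;
  case 16:
    return LoadOpc::LDRH;
  case 32:
    return LoadOpc::LDRi12;
  default:
    return LoadOpc::Unsupported;
  }
}

// Operands appended after the destination and base-address registers.
inline std::vector<std::string> trailingOperands(LoadOpc Opc) {
  assert(Opc != LoadOpc::Unsupported && "Unsupported size");
  std::vector<std::string> Ops;
  if (Opc == LoadOpc::LDRH)
    Ops.push_back("noreg"); // models MIB.addReg(0): empty register-offset slot
  Ops.push_back("0");       // immediate offset
  Ops.push_back("14");      // predOps(ARMCC::AL): condition code AL
  Ops.push_back("noreg");   // predOps(ARMCC::AL): predicate register
  return Ops;
}
```

In this model `trailingOperands(pickLoadOpc(32))` gives `0, 14, noreg`, matching the `LDRi12 ..., 0, 14, _` check in the MIR test, while the halfword form carries one extra leading operand.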

llvm/trunk/lib/Target/ARM/ARMLegalizerInfo.cpp

[... 29 lines not shown ...]
 ARMLegalizerInfo::ARMLegalizerInfo() {

   const LLT s1 = LLT::scalar(1);
   const LLT s8 = LLT::scalar(8);
   const LLT s16 = LLT::scalar(16);
   const LLT s32 = LLT::scalar(32);

   setAction({G_FRAME_INDEX, p0}, Legal);

-  setAction({G_LOAD, s32}, Legal);
+  for (auto Ty : {s1, s8, s16, s32})
+    setAction({G_LOAD, Ty}, Legal);
   setAction({G_LOAD, 1, p0}, Legal);

   for (auto Ty : {s1, s8, s16, s32})
     setAction({G_ADD, Ty}, Legal);

   for (auto Op : {G_SEXT, G_ZEXT}) {
     setAction({Op, s32}, Legal);
     for (auto Ty : {s1, s8, s16})
       setAction({Op, 1, Ty}, Legal);
   }

   computeTables();
 }

llvm/trunk/test/CodeGen/ARM/GlobalISel/arm-instruction-select.mir

[... 227 lines not shown ...]
 registers:
   - { id: 1, class: gprb }
   - { id: 2, class: gprb }
   - { id: 3, class: gprb }
 # CHECK-DAG: id: 0, class: gpr
 # CHECK-DAG: id: 1, class: gpr
 # CHECK-DAG: id: 2, class: gpr
 # CHECK-DAG: id: 3, class: gpr
 fixedStack:
-  - { id: 0, offset: 0, size: 4, alignment: 4, isImmutable: true, isAliased: false }
+  - { id: 0, offset: 0, size: 1, alignment: 4, isImmutable: true, isAliased: false }
   - { id: 1, offset: 4, size: 4, alignment: 4, isImmutable: true, isAliased: false }
   - { id: 2, offset: 8, size: 4, alignment: 4, isImmutable: true, isAliased: false }
-# CHECK: id: [[FRAME_INDEX:[0-9]+]], offset: 8
+# CHECK-DAG: id: [[FI1:[0-9]+]], offset: 0
+# CHECK-DAG: id: [[FI32:[0-9]+]], offset: 8
 body: |
   bb.0:
     liveins: %r0, %r1, %r2, %r3

     %0(p0) = G_FRAME_INDEX %fixed-stack.2
-    ; CHECK: [[FIVREG:%[0-9]+]] = ADDri %fixed-stack.[[FRAME_INDEX]], 0, 14, _, _
+    ; CHECK: [[FI32VREG:%[0-9]+]] = ADDri %fixed-stack.[[FI32]], 0, 14, _, _

     %1(s32) = G_LOAD %0(p0)
-    ; CHECK: {{%[0-9]+}} = LDRi12 [[FIVREG]], 0, 14, _
+    ; CHECK: {{%[0-9]+}} = LDRi12 [[FI32VREG]], 0, 14, _
+
+    %2(p0) = G_FRAME_INDEX %fixed-stack.0
+    ; CHECK: [[FI1VREG:%[0-9]+]] = ADDri %fixed-stack.[[FI1]], 0, 14, _, _
+
+    %3(s1) = G_LOAD %2(p0)
+    ; CHECK: {{%[0-9]+}} = LDRBi12 [[FI1VREG]], 0, 14, _

     BX_RET 14, _
     ; CHECK: BX_RET 14, _
 ...

llvm/trunk/test/CodeGen/ARM/GlobalISel/arm-irtranslator.ll

[... 76 lines not shown ...]
 ; CHECK: [[SUM:%[0-9]+]](s32) = G_ADD [[VREGX]], [[VREGY]]
 ; CHECK: %r0 = COPY [[SUM]](s32)
 ; CHECK: BX_RET 14, _, implicit %r0
 entry:
   %sum = add i32 %x, %y
   ret i32 %sum
 }

-define i32 @test_many_args(i32 %p0, i32 %p1, i32 %p2, i32 %p3, i32 %p4, i32 %p5) {
-; CHECK-LABEL: name: test_many_args
+define i32 @test_stack_args(i32 %p0, i32 %p1, i32 %p2, i32 %p3, i32 %p4, i32 %p5) {
+; CHECK-LABEL: name: test_stack_args
 ; CHECK: fixedStack:
 ; CHECK-DAG: id: [[P4:[0-9]]]{{.}}offset: 0{{.}}size: 4
 ; CHECK-DAG: id: [[P5:[0-9]]]{{.}}offset: 4{{.}}size: 4
 ; CHECK: liveins: %r0, %r1, %r2, %r3
 ; CHECK: [[VREGP2:%[0-9]+]]{{.*}} = COPY %r2
 ; CHECK: [[FIP5:%[0-9]+]]{{.*}} = G_FRAME_INDEX %fixed-stack.[[P5]]
 ; CHECK: [[VREGP5:%[0-9]+]]{{.*}} = G_LOAD [[FIP5]]
 ; CHECK: [[SUM:%[0-9]+]]{{.*}} = G_ADD [[VREGP2]], [[VREGP5]]
 ; CHECK: %r0 = COPY [[SUM]]
 ; CHECK: BX_RET 14, _, implicit %r0
 entry:
   %sum = add i32 %p2, %p5
   ret i32 %sum
 }
+
+define i16 @test_stack_args_signext(i32 %p0, i16 %p1, i8 %p2, i1 %p3,
+                                    i8 signext %p4, i16 signext %p5) {
+; CHECK-LABEL: name: test_stack_args_signext
+; CHECK: fixedStack:
+; CHECK-DAG: id: [[P4:[0-9]]]{{.}}offset: 0{{.}}size: 1
+; CHECK-DAG: id: [[P5:[0-9]]]{{.}}offset: 4{{.}}size: 2
+; CHECK: liveins: %r0, %r1, %r2, %r3
+; CHECK: [[VREGP1:%[0-9]+]]{{.*}} = COPY %r1
+; CHECK: [[FIP5:%[0-9]+]]{{.*}} = G_FRAME_INDEX %fixed-stack.[[P5]]
+; CHECK: [[VREGP5:%[0-9]+]]{{.*}} = G_LOAD [[FIP5]](p0)
+; CHECK: [[SUM:%[0-9]+]]{{.*}} = G_ADD [[VREGP1]], [[VREGP5]]
+; CHECK: %r0 = COPY [[SUM]]
+; CHECK: BX_RET 14, _, implicit %r0
+entry:
+  %sum = add i16 %p1, %p5
+  ret i16 %sum
+}
+
+define i8 @test_stack_args_zeroext(i32 %p0, i16 %p1, i8 %p2, i1 %p3,
+                                   i8 zeroext %p4, i16 zeroext %p5) {
+; CHECK-LABEL: name: test_stack_args_zeroext
+; CHECK: fixedStack:
+; CHECK-DAG: id: [[P4:[0-9]]]{{.}}offset: 0{{.}}size: 1
+; CHECK-DAG: id: [[P5:[0-9]]]{{.}}offset: 4{{.}}size: 2
+; CHECK: liveins: %r0, %r1, %r2, %r3
+; CHECK: [[VREGP2:%[0-9]+]]{{.*}} = COPY %r2
+; CHECK: [[FIP4:%[0-9]+]]{{.*}} = G_FRAME_INDEX %fixed-stack.[[P4]]
+; CHECK: [[VREGP4:%[0-9]+]]{{.*}} = G_LOAD [[FIP4]](p0)
+; CHECK: [[SUM:%[0-9]+]]{{.*}} = G_ADD [[VREGP2]], [[VREGP4]]
+; CHECK: %r0 = COPY [[SUM]]
+; CHECK: BX_RET 14, _, implicit %r0
+entry:
+  %sum = add i8 %p2, %p4
+  ret i8 %sum
+}

llvm/trunk/test/CodeGen/ARM/GlobalISel/arm-isel.ll

[... 61 lines not shown ...]
 ; CHECK-LABEL: test_add_i32:
 ; CHECK: add r0, r0, r1
 ; CHECK: bx lr
 entry:
   %sum = add i32 %x, %y
   ret i32 %sum
 }

-define i32 @test_many_args(i32 %p0, i32 %p1, i32 %p2, i32 %p3, i32 %p4, i32 %p5) {
-; CHECK-LABEL: test_many_args:
+define i32 @test_stack_args_i32(i32 %p0, i32 %p1, i32 %p2, i32 %p3, i32 %p4, i32 %p5) {
+; CHECK-LABEL: test_stack_args_i32:
 ; CHECK: add [[P5ADDR:r[0-9]+]], sp, #4
 ; CHECK: ldr [[P5:r[0-9]+]], {{.*}}[[P5ADDR]]
 ; CHECK: add r0, r2, [[P5]]
 ; CHECK: bx lr
 entry:
   %sum = add i32 %p2, %p5
   ret i32 %sum
 }
+
+define i16 @test_stack_args_mixed(i32 %p0, i16 %p1, i8 %p2, i1 %p3, i8 %p4, i16 %p5) {
+; CHECK-LABEL: test_stack_args_mixed:
+; CHECK: add [[P5ADDR:r[0-9]+]], sp, #4
+; CHECK: ldrh [[P5:r[0-9]+]], {{.*}}[[P5ADDR]]
+; CHECK: add r0, r1, [[P5]]
+; CHECK: bx lr
+entry:
+  %sum = add i16 %p1, %p5
+  ret i16 %sum
+}
+
+define i16 @test_stack_args_zeroext(i32 %p0, i16 %p1, i8 %p2, i1 %p3, i16 zeroext %p4) {
+; CHECK-LABEL: test_stack_args_zeroext:
+; CHECK: mov [[P4ADDR:r[0-9]+]], sp
+; CHECK: ldr [[P4:r[0-9]+]], {{.*}}[[P4ADDR]]
+; CHECK: add r0, r1, [[P4]]
+; CHECK: bx lr
+entry:
+  %sum = add i16 %p1, %p4
+  ret i16 %sum
+}
+
+define i8 @test_stack_args_signext(i32 %p0, i16 %p1, i8 %p2, i1 %p3, i8 signext %p4) {
+; CHECK-LABEL: test_stack_args_signext:
+; CHECK: mov [[P4ADDR:r[0-9]+]], sp
+; CHECK: ldr [[P4:r[0-9]+]], {{.*}}[[P4ADDR]]
+; CHECK: add r0, r2, [[P4]]
+; CHECK: bx lr
+entry:
+  %sum = add i8 %p2, %p4
+  ret i8 %sum
+}