This is an archive of the discontinued LLVM Phabricator instance.

Differential D61259

AArch64: support arm64_32, an ILP32 slice for watchOS.
Closed · Public

Authored by t.p.northover on Apr 29 2019, 6:36 AM.

Details

Reviewers

eli.friedman
kristof.beyls
rovka
fhahn
aadg
aemerson
thegameg
paquette

Summary

This is the main CodeGen patch to support the arm64_32 watchOS ABI on the LLVM side.

FastISel is mostly disabled for now since it would generate incorrect code for ILP32. Disabling it also allows for a more incremental approach to implementing that side since we have sensible fallbacks in place.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

t.p.northover created this revision. Apr 29 2019, 6:36 AM

Herald added a project: Restricted Project. Apr 29 2019, 6:36 AM

Herald added subscribers: dang, jfb, arphaman and 8 others.

t.p.northover added parent revisions: D61258: AArch64: support binutils-like things on arm64_32., D58982: DAG: allow DAG pointer size different from memory representation. Apr 29 2019, 6:37 AM

Ping.

Looks like there's no reviewers set here. @t.p.northover who should review this?

> Looks like there's no reviewers set here. @t.p.northover who should review this?

Anyone reasonably active in the AArch64 backend. The primary review location is still the llvm-commits mailing list, so I don't tend to bother adding specific reviewers unless there's a completely obvious candidate.

Ping.

+ reviewers to move this forward hopefully

Herald added a subscriber: jsji. Jun 26 2019, 1:24 PM

carlokok added a subscriber: carlokok. Jun 27 2019, 4:14 AM

Rebased patch.

efriedma added a subscriber: efriedma. Jul 2 2019, 4:58 PM

efriedma added inline comments.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
9847	Does this have any practical effect for other targets?
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3811–3812	What's changing here? Does it make sense to add any comments?
llvm/test/CodeGen/AArch64/tail-call.ll
5–6	What are you trying to do here?
llvm/test/CodeGen/AArch64/win64_vararg.ll
272	There isn't any obvious reason for this test to change?

Thanks for upstreaming this, Tim.

At a high level, my main concern here is that what is upstreamed doesn't conflict with adding support at some point in the future for the AArch64 ILP32 ABI, for which a beta-quality specification from Arm exists.

I browsed through the code changes. Without being a deep expert in a lot of the areas touched, I just shared a few thoughts in the few places where I saw some change that I couldn't easily come up with a plausible explanation for.

I'm assuming that there is nothing in here that would prevent adding support for non-Darwin AArch64 ILP32 later?

llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
1206–1207	I assume this patch introduces support for the Darwin-specific arm64_32 ILP32 ABI. That makes me wonder why there is a need to define/create the "getTheAArch64_32Target()". Doesn't "getTheARM64_32Target()" suffice to enable support for the Darwin-specific arm64_32 ILP32 ABI?
llvm/lib/Target/AArch64/AArch64ISelLowering.h
264–266	I found it surprising that this method doesn't look at whether ILP32 is targeted or not to decide whether the pointer type is a 32 or 64 bit integer. Would it be worthwhile to add a small comment here explaining why the integer pointer type is always 64 bits? Maybe I'm missing something completely trivial though...

Herald added a subscriber: wuzish. Jul 5 2019, 3:55 AM

> At a high level, my main concern here is that what is upstreamed doesn't conflict with adding support at some point in the future for the AArch64 ILP32 ABI, for which a beta-quality specification from Arm exists.

I think most of the changes should be reusable.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
9847	I don't think so. It's a correctness issue so if other targets were affected then they'd start seeing vregs without definitions and things. I believe the driving arm64_32 feature is the fact that pointers are extended beyond 32-bits by the caller, which leads to some weird (but valid) DAG nodes trying to communicate that fact. Other targets only assertzext from invalid types.
llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
1206–1207	I think it would be enough for production purposes, but the extra lines allow someone to write triple strings starting with `aarch64_32` in tests and so on if they want to. I still have a long-term goal to switch Clang over to using `aarch64` as the canonical name in LLVM (`aarch64-apple-ios14.0` etc). I was prevented when I tried years ago because ld64 wasn't ready, but I think I fixed that and I should try again some time.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3811–3812	We use the ability to look up whether a register has already been added (use at line 3801 in this diff). If it has then we're trying to combine two parts of (say) a `[2 x i32]` into a single register for compatibility with armv7k IR and need to do the bitwise arithmetic to make that happen. I don't know that a comment here would work (it would either be a historic note, or pre-empting what comes later). I'll try to do something to call it out unobtrusively at the use-point.
llvm/lib/Target/AArch64/AArch64ISelLowering.h
264–266	A comment would definitely be worthwhile. This is the function that enables the large change I upstreamed earlier so that pointer types in the DAG can remain 64-bits (to exploit addressing-modes available) and get truncated to 32-bits in memory.
llvm/test/CodeGen/AArch64/tail-call.ll
5–6	These functions are using the arrays purely to consume register space during a call, but after this change `[8 x i32]` only uses x0-x3 (in full). It's only an IR-level change (i.e. it doesn't affect C or C++ ABI), but I'll limit it to the arm64_32 target when I update the diff.
llvm/test/CodeGen/AArch64/win64_vararg.ll
272	I think it's because of the `std::map` change you called out above. It implicitly sorts the list of registers that get copied, perturbing the DAG and scheduling. I think I'll switch it back to `SmallVector` and use `std::find_if` to handle the (rare) ARM compatibility instead. It ought to be faster in the common case and won't have this side-effect.
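
The reply on AArch64ISelLowering.h above says pointers stay 64 bits wide in the DAG and are truncated to 32 bits in memory. A minimal standalone sketch of that register/memory split is shown here. `storePointer` and `loadPointer` are hypothetical names, not LLVM functions, and zero-extension on load is an assumption taken from the AssertZext remark in the same thread.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical illustration: a pointer lives in a 64-bit register but
// occupies only a 4-byte slot in memory.
inline void storePointer(unsigned char *Slot, uint64_t RegValue) {
  assert(Slot != nullptr);
  // Truncate to the 32-bit in-memory representation.
  uint32_t InMemory = static_cast<uint32_t>(RegValue);
  std::memcpy(Slot, &InMemory, sizeof(InMemory));
}

inline uint64_t loadPointer(const unsigned char *Slot) {
  assert(Slot != nullptr);
  uint32_t InMemory;
  std::memcpy(&InMemory, Slot, sizeof(InMemory));
  // Assumed: the upper 32 bits are zero once the value is back in a register.
  return static_cast<uint64_t>(InMemory);
}
```

Any bits above bit 31 are dropped by the store, so a round trip only preserves values that already fit in 32 bits.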

1. Switched back to SmallVector to track registers used, reducing code perturbation. I decided to add a SmallSet too for queries instead of using std::find_if directly because searches actually happen regardless.
2. Disabled special handling for [N x i32] except on arm64_32 MachO.
3. Noticed a bug in 2 above, where we allocated 2N registers anyway and fixed it.
4. Added getPointerType comment.
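
The SmallVector-plus-SmallSet arrangement from the first item can be sketched in standard C++ as below. This is only an illustration: `RegTracker` and its members are hypothetical names, and `std::vector`/`std::set` stand in for LLVM's `SmallVector`/`SmallSet`. The vector keeps registers in first-use order, so the copy order is not re-sorted the way iterating a `std::map` would re-sort it, while the set answers the "already used?" query.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for the register bookkeeping discussed above.
class RegTracker {
public:
  // Records Reg and returns true if it had already been recorded.
  bool add(unsigned Reg) {
    bool Seen = !Used.insert(Reg).second;
    if (!Seen)
      Order.push_back(Reg); // first use: remember insertion order
    assert(Order.size() == Used.size() && "containers must stay in sync");
    return Seen;
  }

  bool contains(unsigned Reg) const { return Used.count(Reg) != 0; }

  // Registers in the order they were first added (not sorted).
  const std::vector<unsigned> &order() const { return Order; }

private:
  std::vector<unsigned> Order;
  std::set<unsigned> Used;
};
```

A caller would test the return value of `add` to decide whether a second value has to be merged into a register that is already in use.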

t.p.northover marked an inline comment as done. Jul 8 2019, 5:29 AM

t.p.northover added inline comments.

llvm/test/CodeGen/AArch64/win64_vararg.ll
272	Well, as you can see that accounted for a lot of the differences, but the `fmov`s still get reordered w.r.t. the store. I have no idea why this is: the DAG is identical and I'm reasonably sure it's harmless so I blame gremlins.

AlexDenisov added a subscriber: AlexDenisov. Jul 13 2019, 5:55 AM

AlexDenisov added inline comments.

llvm/lib/Target/AArch64/AArch64Subtarget.h
442	When I compile LLVM with this patch applied I'm getting the error:
	AArch64Subtarget.h:430:34: error: only virtual member functions can be marked 'override'
	bool addrSinkUsingGEPs() const override {
	^~~~~~~~~
	Removing the override keyword fixes it, but I'm curious where it comes from? I cannot see any usage of this method across the code base.

Ah sorry, it was part of an NFC change in a different commit. I've rolled it into this diff; it's splitting off GEP sinking from the useAA callback since they're not really related.

Gerolf added reviewers: aemerson, thegameg, paquette. Aug 21 2019, 2:11 PM

aemerson added inline comments. Aug 22 2019, 12:48 AM

llvm/lib/Target/AArch64/AArch64CallingConvention.cpp
114–119	Comment here explaining why we need this? I'm guessing for passing arrays of i32s & compatibility with armv7k?
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3472	Why does this happen at all?
3811–3812	Looks like we still need some form of documentation here at RegUsed use?
llvm/test/CodeGen/AArch64/arm64-collect-loh.ll
163	Can you add a comment here explaining that inbounds is needed for arm64_32 to produce the same code.
llvm/test/CodeGen/AArch64/arm64-zero-cycle-zeroing.ll
22 ↗	(On Diff #213303)	It's odd that this test changed?
llvm/test/CodeGen/AArch64/arm64_32-neon.ll
7	Is anything really expected to change with NEON & arm64_32?

t.p.northover marked 7 inline comments as done. Sep 9 2019, 4:13 AM

t.p.northover added inline comments.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3472	When lowering a return of (say) [2 x i32], the two components will get mapped to (X0, AExtUpper) and (X0, ZExt), a duplication of X0.
llvm/test/CodeGen/AArch64/arm64-collect-loh.ll
163	I'd kind of prefer not to. It's incidental to this test but the main complication of arm64_32 CodeGen, so I don't think it's really what a random reader of this test is going to be looking for.
llvm/test/CodeGen/AArch64/arm64-zero-cycle-zeroing.ll
22 ↗	(On Diff #213303)	Yes, it's not needed now (I have no idea why it was) so I've reverted these changes.
llvm/test/CodeGen/AArch64/arm64_32-neon.ll
7	Not to change particularly, but this is in some sense the parts of NEON that could be affected by arm64_32: ABI boundaries and load/store addressing-modes. Testing it separately avoids duplicating the AArch64 tests that really aren't different (e.g. arithmetic) but still gives us the coverage.
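
The (X0, AExtUpper) and (X0, ZExt) mapping mentioned for a `[2 x i32]` return amounts to packing two 32-bit values into one 64-bit register. The sketch below shows that bitwise arithmetic in isolation. All function names are hypothetical, and which array element lands in which half is an assumption made for illustration; the thread does not state it.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: one element zero-extended into the low half (ZExt),
// the other shifted into the upper half (AExtUpper).
inline uint64_t packPair(uint32_t Low, uint32_t High) {
  return (static_cast<uint64_t>(High) << 32) | static_cast<uint64_t>(Low);
}

inline uint32_t extractLow(uint64_t Reg) {
  return static_cast<uint32_t>(Reg); // truncate away the upper half
}

inline uint32_t extractHigh(uint64_t Reg) {
  return static_cast<uint32_t>(Reg >> 32); // shift the upper half down
}

// Registers needed for an even-length i32 array when two elements share
// one 64-bit register. Only the even case is covered by the thread.
inline unsigned regsForI32Array(unsigned NumElts) {
  assert(NumElts % 2 == 0 && "odd lengths are not covered by this sketch");
  return NumElts / 2;
}
```

With this packing, eight i32 elements occupy four registers, which matches the earlier remark that `[8 x i32]` uses only x0-x3.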

Added comments mostly, and undid some changes to tests.

Everything else seems reasonable to me.

This revision is now accepted and ready to land. Sep 11 2019, 4:20 PM

Thanks Amara, it's r371722.

Amanieu mentioned this in D94143: [AArch64] Add support for the GNU ILP32 ABI. Jan 5 2021, 6:22 PM

Amanieu mentioned this in rG21bfd068b32e: [AArch64] Add support for the GNU ILP32 ABI. Jan 20 2021, 5:36 AM

loladiro added a subscriber: loladiro. Feb 24 2021, 12:58 PM

loladiro added inline comments.

llvm/lib/Target/AArch64/AArch64FastISel.cpp
531	@t.p.northover Out of curiosity, what is this assertion guarding against? In our frontend we use non-zero address spaces to indicate GC-tracked pointers, which we expect to be ignored by the backend (stripping them is possible of course, but the performance impact is surprisingly high). As far as I can tell this is the only place in the AArch64 backend that looks at address spaces and we encountered it when trying to port to Apple Silicon (not quite sure why nobody complained on Linux, but maybe people didn't run with assertions).

Herald added subscribers: danielkiss, pengfei. Feb 24 2021, 12:58 PM

t.p.northover added inline comments. Feb 25 2021, 2:16 AM

llvm/lib/Target/AArch64/AArch64FastISel.cpp
531	I'm afraid I don't remember, but it looks overcautious to me now as well. I've just removed the check from `main`.

DavidSpickett mentioned this in D104123: [llvm][AArch64] Handle arrays of struct properly (from IR). Jun 11 2021, 8:17 AM

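Revision Contents

Path	Size
llvm/include/llvm/CodeGen/CallingConvLower.h	1 line
llvm/include/llvm/Target/TargetCallingConv.td	6 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp	4 lines
llvm/lib/CodeGen/TargetLoweringBase.cpp	1 line
llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp	1 line
llvm/lib/ExecutionEngine/Orc/IndirectionUtils.cpp	4 lines
llvm/lib/ExecutionEngine/Orc/LazyReexports.cpp	1 line
llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp	3 lines
llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp	2 lines
llvm/lib/LTO/LTOCodeGenerator.cpp	3 lines
llvm/lib/LTO/LTOModule.cpp	3 lines
llvm/lib/LTO/ThinLTOCodeGenerator.cpp	3 lines
llvm/lib/MC/MCObjectFileInfo.cpp	7 lines
llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp	2 lines
llvm/lib/Target/AArch64/AArch64CallLowering.cpp	10 lines
llvm/lib/Target/AArch64/AArch64CallingConvention.h	3 lines
llvm/lib/Target/AArch64/AArch64CallingConvention.cpp	30 lines
llvm/lib/Target/AArch64/AArch64CallingConvention.td	34 lines
llvm/lib/Target/AArch64/AArch64CollectLOH.cpp	22 lines
llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp	22 lines
llvm/lib/Target/AArch64/AArch64FastISel.cpp	55 lines
llvm/lib/Target/AArch64/AArch64ISelLowering.h	8 lines
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp	174 lines
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp	39 lines
llvm/lib/Target/AArch64/AArch64SelectionDAGInfo.cpp	2 lines
llvm/lib/Target/AArch64/AArch64Subtarget.h	8 lines
llvm/lib/Target/AArch64/AArch64TargetMachine.cpp	10 lines
llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCAsmInfo.h	2 lines
llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCAsmInfo.cpp	5 lines
llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCTargetDesc.cpp	2 lines
llvm/lib/Target/X86/X86FastISel.cpp	1 line
llvm/test/CodeGen/AArch64/arm64-aapcs.ll	2 lines
llvm/test/CodeGen/AArch64/arm64-collect-loh-garbage-crash.ll	1 line
llvm/test/CodeGen/AArch64/arm64-collect-loh-str.ll	1 line
llvm/test/CodeGen/AArch64/arm64-collect-loh.ll	115 lines
llvm/test/CodeGen/AArch64/arm64-indexed-memory.ll	1 line
llvm/test/CodeGen/AArch64/arm64-stacksave.ll	4 lines
llvm/test/CodeGen/AArch64/arm64_32-addrs.ll	44 lines
llvm/test/CodeGen/AArch64/arm64_32-atomics.ll	261 lines
llvm/test/CodeGen/AArch64/arm64_32-fastisel.ll	28 lines
llvm/test/CodeGen/AArch64/arm64_32-frame-pointers.ll	26 lines
llvm/test/CodeGen/AArch64/arm64_32-gep-sink.ll	61 lines
llvm/test/CodeGen/AArch64/arm64_32-memcpy.ll	66 lines
llvm/test/CodeGen/AArch64/arm64_32-neon.ll	198 lines
llvm/test/CodeGen/AArch64/arm64_32-null.ll	29 lines
llvm/test/CodeGen/AArch64/arm64_32-pointer-extend.ll	49 lines
llvm/test/CodeGen/AArch64/arm64_32-stack-pointers.ll	13 lines
llvm/test/CodeGen/AArch64/arm64_32-tls.ll	22 lines
llvm/test/CodeGen/AArch64/arm64_32-va.ll	56 lines
llvm/test/CodeGen/AArch64/arm64_32.ll	715 lines
llvm/test/CodeGen/AArch64/fastcc-reserved.ll	8 lines
llvm/test/CodeGen/AArch64/fastcc.ll	22 lines
llvm/test/CodeGen/AArch64/jump-table-32.ll	42 lines
llvm/test/CodeGen/AArch64/sibling-call.ll	24 lines
llvm/test/CodeGen/AArch64/swift-return.ll	4 lines
llvm/test/CodeGen/AArch64/swiftcc.ll	2 lines
llvm/test/CodeGen/AArch64/swifterror.ll	129 lines
llvm/test/CodeGen/AArch64/swiftself.ll	29 lines
llvm/test/CodeGen/AArch64/tail-call.ll	24 lines
llvm/test/CodeGen/AArch64/umulo-128-legalisation-lowering.ll	4 lines
llvm/test/CodeGen/AArch64/win64_vararg.ll	2 lines
llvm/test/MC/AArch64/arm64_32-compact-unwind.s	15 lines
llvm/utils/TableGen/CallingConvEmitter.cpp	4 lines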

Diff 219319

llvm/include/llvm/CodeGen/CallingConvLower.h

@@ 38 lines hidden @@ enum LocInfo {
     AExt,      // The value is extended with undefined upper bits.
     SExtUpper, // The value is in the upper bits of the location and should be
                // sign extended when retrieved.
     ZExtUpper, // The value is in the upper bits of the location and should be
                // zero extended when retrieved.
     AExtUpper, // The value is in the upper bits of the location and should be
                // extended with undefined upper bits when retrieved.
     BCvt,      // The value is bit-converted in the location.
+    Trunc,     // The value is truncated in the location.
     VExt,      // The value is vector-widened in the location.
                // FIXME: Not implemented yet. Code that uses AExt to mean
                // vector-widen should be fixed to use VExt instead.
     FPExt,     // The floating-point value is fp-extended in the location.
     Indirect   // The location contains pointer to the value.
     // TODO: a subset of the value is in the location.
   };

@@ 522 lines hidden @@

llvm/include/llvm/Target/TargetCallingConv.td

@@ 146 lines hidden @@
 }

 /// CCBitConvertToType - If applied, this bitconverts the specified current
 /// value to the specified type.
 class CCBitConvertToType<ValueType destTy> : CCAction {
   ValueType DestTy = destTy;
 }

+/// CCTruncToType - If applied, this truncates the specified current value to
+/// the specified type.
+class CCTruncToType<ValueType destTy> : CCAction {
+  ValueType DestTy = destTy;
+}
+
 /// CCPassIndirect - If applied, this stores the value to stack and passes the pointer
 /// as normal argument.
 class CCPassIndirect<ValueType destTy> : CCAction {
   ValueType DestTy = destTy;
 }

 /// CCDelegateTo - This action invokes the specified sub-calling-convention. It
 /// is successful if the specified CC matches.
@@ 38 lines hidden @@

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

@@ 9,836 lines hidden @@ if (!TM.Options.EnableFastISel && Res.getOpcode() == ISD::BUILD_PAIR) {
       unsigned LowAddressOp = DAG.getDataLayout().isBigEndian() ? 1 : 0;
       if (LoadSDNode *LNode =
           dyn_cast<LoadSDNode>(Res.getOperand(LowAddressOp).getNode()))
         if (FrameIndexSDNode *FI =
             dyn_cast<FrameIndexSDNode>(LNode->getBasePtr().getNode()))
           FuncInfo->setArgumentFrameIndex(&Arg, FI->getIndex());
     }

+    // Analyses past this point are naive and don't expect an assertion.
+    if (Res.getOpcode() == ISD::AssertZext)
+      Res = Res.getOperand(0);
+
(inline) efriedma: Does this have any practical effect for other targets?
(inline) t.p.northover: I don't think so. It's a correctness issue so if other targets were affected then they'd start seeing vregs without definitions and things. I believe the driving arm64_32 feature is the fact that pointers are extended beyond 32-bits by the caller, which leads to some weird (but valid) DAG nodes trying to communicate that fact. Other targets only assertzext from invalid types.

     // Update the SwiftErrorVRegDefMap.
     if (Res.getOpcode() == ISD::CopyFromReg && isSwiftErrorArg) {
       unsigned Reg = cast<RegisterSDNode>(Res.getOperand(1))->getReg();
       if (Register::isVirtualRegister(Reg))
         SwiftError->setCurrentVReg(FuncInfo->MBB, SwiftError->getFunctionArg(),
                                    Reg);
     }

@@ 700 lines hidden @@

llvm/lib/CodeGen/TargetLoweringBase.cpp

@@ 161 lines hidden @@ if (TT.isOSDarwin()) {
     // Some darwins have an optimized __bzero/bzero function.
     switch (TT.getArch()) {
     case Triple::x86:
     case Triple::x86_64:
       if (TT.isMacOSX() && !TT.isMacOSXVersionLT(10, 6))
         setLibcallName(RTLIB::BZERO, "__bzero");
       break;
     case Triple::aarch64:
+    case Triple::aarch64_32:
       setLibcallName(RTLIB::BZERO, "bzero");
       break;
     default:
       break;
     }

     if (darwinHasSinCos(TT)) {
       setLibcallName(RTLIB::SINCOS_STRET_F32, "__sincosf_stret");
@@ 1,800 lines hidden @@

llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp

@@ 149 lines hidden @@ case Triple::hexagon:
     if (isPositionIndependent()) {
       PersonalityEncoding |= dwarf::DW_EH_PE_indirect | dwarf::DW_EH_PE_pcrel;
       LSDAEncoding |= dwarf::DW_EH_PE_pcrel;
       TTypeEncoding |= dwarf::DW_EH_PE_indirect | dwarf::DW_EH_PE_pcrel;
     }
     break;
   case Triple::aarch64:
   case Triple::aarch64_be:
+  case Triple::aarch64_32:
     // The small model guarantees static code/data size < 4GB, but not where it
     // will be in memory. Most of these could end up >2GB away so even a signed
     // pc-relative 32-bit address is insufficient, theoretically.
     if (isPositionIndependent()) {
       PersonalityEncoding = dwarf::DW_EH_PE_indirect | dwarf::DW_EH_PE_pcrel |
         dwarf::DW_EH_PE_sdata8;
       LSDAEncoding = dwarf::DW_EH_PE_pcrel | dwarf::DW_EH_PE_sdata8;
       TTypeEncoding = dwarf::DW_EH_PE_indirect | dwarf::DW_EH_PE_pcrel |
@@ 1,736 lines hidden @@

llvm/lib/ExecutionEngine/Orc/IndirectionUtils.cpp

@@ 114 lines hidden @@
 Expected<std::unique_ptr<JITCompileCallbackManager>>
 createLocalCompileCallbackManager(const Triple &T, ExecutionSession &ES,
                                   JITTargetAddress ErrorHandlerAddress) {
   switch (T.getArch()) {
   default:
     return make_error<StringError>(
         std::string("No callback manager available for ") + T.str(),
         inconvertibleErrorCode());
-  case Triple::aarch64: {
+  case Triple::aarch64:
+  case Triple::aarch64_32: {
     typedef orc::LocalJITCompileCallbackManager<orc::OrcAArch64> CCMgrT;
     return CCMgrT::Create(ES, ErrorHandlerAddress);
   }

   case Triple::x86: {
     typedef orc::LocalJITCompileCallbackManager<orc::OrcI386> CCMgrT;
     return CCMgrT::Create(ES, ErrorHandlerAddress);
   }
@@ 31 lines hidden @@ createLocalIndirectStubsManagerBuilder(const Triple &T) {
   switch (T.getArch()) {
   default:
     return [](){
       return std::make_unique<
           orc::LocalIndirectStubsManager<orc::OrcGenericABI>>();
     };

   case Triple::aarch64:
+  case Triple::aarch64_32:
     return [](){
       return std::make_unique<
           orc::LocalIndirectStubsManager<orc::OrcAArch64>>();
     };

   case Triple::x86:
     return [](){
       return std::make_unique<
@@ 194 lines hidden @@

llvm/lib/ExecutionEngine/Orc/LazyReexports.cpp

@@ 84 lines hidden @@ createLocalLazyCallThroughManager(const Triple &T, ExecutionSession &ES,
                                   JITTargetAddress ErrorHandlerAddr) {
   switch (T.getArch()) {
   default:
     return make_error<StringError>(
         std::string("No callback manager available for ") + T.str(),
         inconvertibleErrorCode());

   case Triple::aarch64:
+  case Triple::aarch64_32:
     return LocalLazyCallThroughManager::Create<OrcAArch64>(ES,
                                                            ErrorHandlerAddr);

   case Triple::x86:
     return LocalLazyCallThroughManager::Create<OrcI386>(ES, ErrorHandlerAddr);

   case Triple::mips:
     return LocalLazyCallThroughManager::Create<OrcMips32Be>(ES,
@@ 109 lines hidden @@

llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp

@@ 913 lines hidden @@ if (Loc == GlobalSymbolTable.end()) {
     const auto &SymInfo = Loc->second;
     RECopy.Addend += SymInfo.getOffset();
     Relocations[SymInfo.getSectionID()].push_back(RECopy);
   }
 }

 uint8_t *RuntimeDyldImpl::createStubFunction(uint8_t *Addr,
                                              unsigned AbiVariant) {
-  if (Arch == Triple::aarch64 || Arch == Triple::aarch64_be) {
+  if (Arch == Triple::aarch64 || Arch == Triple::aarch64_be ||
+      Arch == Triple::aarch64_32) {
     // This stub has to be able to access the full address space,
     // since symbol lookup won't necessarily find a handy, in-range,
     // PLT stub for functions which could be anywhere.
     // Stub can use ip0 (== x16) to calculate address
     writeBytesUnaligned(0xd2e00010, Addr, 4); // movz ip0, #:abs_g3:<addr>
     writeBytesUnaligned(0xf2c00010, Addr+4, 4); // movk ip0, #:abs_g2_nc:<addr>
     writeBytesUnaligned(0xf2a00010, Addr+8, 4); // movk ip0, #:abs_g1_nc:<addr>
     writeBytesUnaligned(0xf2800010, Addr+12, 4); // movk ip0, #:abs_g0_nc:<addr>
@@ 494 lines hidden @@
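
The stub in the hunk above builds a full 64-bit address in ip0 from four 16-bit immediates, one movz followed by three movk instructions. The standalone sketch below shows how those four instruction words relate to a target address. `patchStub` and `decodeStub` are hypothetical helpers written for illustration only; RuntimeDyld itself fills the immediates through relocation processing, not through code like this. The sketch relies on the imm16 field of MOVZ/MOVK occupying bits 20:5 of the instruction word.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Opcode templates copied from the stub above; each targets x16 (ip0)
// and has an empty imm16 field.
constexpr std::array<uint32_t, 4> StubTemplate = {
    0xd2e00010, // movz ip0, #imm16, lsl #48
    0xf2c00010, // movk ip0, #imm16, lsl #32
    0xf2a00010, // movk ip0, #imm16, lsl #16
    0xf2800010, // movk ip0, #imm16
};

// Hypothetical helper: writes each 16-bit slice of Target into the
// imm16 field (bits 20:5) of the corresponding instruction.
inline std::array<uint32_t, 4> patchStub(uint64_t Target) {
  std::array<uint32_t, 4> Words = StubTemplate;
  for (unsigned I = 0; I < 4; ++I) {
    unsigned Shift = 48 - 16 * I;
    uint32_t Imm16 = static_cast<uint32_t>((Target >> Shift) & 0xffff);
    assert((Words[I] & (0xffffu << 5)) == 0 && "imm16 field must be empty");
    Words[I] |= Imm16 << 5;
  }
  return Words;
}

// Hypothetical helper: recovers the address the patched stub would
// materialise in ip0.
inline uint64_t decodeStub(const std::array<uint32_t, 4> &Words) {
  uint64_t Addr = 0;
  for (unsigned I = 0; I < 4; ++I) {
    uint64_t Imm16 = (Words[I] >> 5) & 0xffff;
    Addr |= Imm16 << (48 - 16 * I);
  }
  return Addr;
}
```

For a target address that fits in 32 bits, the first two immediates come out as zero and only the last two instructions contribute.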

llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldMachO.cpp

@@ 351 lines hidden @@ RuntimeDyldMachO::create(Triple::ArchType Arch,
   switch (Arch) {
   default:
     llvm_unreachable("Unsupported target for RuntimeDyldMachO.");
     break;
   case Triple::arm:
     return std::make_unique<RuntimeDyldMachOARM>(MemMgr, Resolver);
   case Triple::aarch64:
     return std::make_unique<RuntimeDyldMachOAArch64>(MemMgr, Resolver);
+  case Triple::aarch64_32:
+    return std::make_unique<RuntimeDyldMachOAArch64>(MemMgr, Resolver);
   case Triple::x86:
     return std::make_unique<RuntimeDyldMachOI386>(MemMgr, Resolver);
   case Triple::x86_64:
     return std::make_unique<RuntimeDyldMachOX86_64>(MemMgr, Resolver);
   }
 }

 std::unique_ptr<RuntimeDyld::LoadedObjectInfo>
@@ 13 lines hidden @@

llvm/lib/LTO/LTOCodeGenerator.cpp

@@ 359 lines hidden @@ bool LTOCodeGenerator::determineTarget() {
   Features.getDefaultSubtargetFeatures(Triple);
   FeatureStr = Features.getString();
   // Set a default CPU for Darwin triples.
   if (MCpu.empty() && Triple.isOSDarwin()) {
     if (Triple.getArch() == llvm::Triple::x86_64)
       MCpu = "core2";
     else if (Triple.getArch() == llvm::Triple::x86)
       MCpu = "yonah";
-    else if (Triple.getArch() == llvm::Triple::aarch64)
+    else if (Triple.getArch() == llvm::Triple::aarch64 ||
+             Triple.getArch() == llvm::Triple::aarch64_32)
       MCpu = "cyclone";
   }

   TargetMach = createTargetMachine();
   return true;
 }

 std::unique_ptr<TargetMachine> LTOCodeGenerator::createTargetMachine() {
@@ 343 lines hidden @@

llvm/lib/LTO/LTOModule.cpp

Show First 20 Lines • Show All 214 Lines • ▼ Show 20 Lines	LTOModule::makeLTOModule(MemoryBufferRef Buffer, const TargetOptions &options,
std::string FeatureStr = Features.getString();		std::string FeatureStr = Features.getString();
// Set a default CPU for Darwin triples.		// Set a default CPU for Darwin triples.
std::string CPU;		std::string CPU;
if (Triple.isOSDarwin()) {		if (Triple.isOSDarwin()) {
if (Triple.getArch() == llvm::Triple::x86_64)		if (Triple.getArch() == llvm::Triple::x86_64)
CPU = "core2";		CPU = "core2";
else if (Triple.getArch() == llvm::Triple::x86)		else if (Triple.getArch() == llvm::Triple::x86)
CPU = "yonah";		CPU = "yonah";
else if (Triple.getArch() == llvm::Triple::aarch64)		else if (Triple.getArch() == llvm::Triple::aarch64 \|\|
		Triple.getArch() == llvm::Triple::aarch64_32)
CPU = "cyclone";		CPU = "cyclone";
}		}

TargetMachine *target =		TargetMachine *target =
march->createTargetMachine(TripleStr, CPU, FeatureStr, options, None);		march->createTargetMachine(TripleStr, CPU, FeatureStr, options, None);

std::unique_ptr<LTOModule> Ret(new LTOModule(std::move(M), Buffer, target));		std::unique_ptr<LTOModule> Ret(new LTOModule(std::move(M), Buffer, target));
Ret->parseSymbols();		Ret->parseSymbols();

llvm/lib/LTO/ThinLTOCodeGenerator.cpp

Show First 20 Lines • Show All 483 Lines • ▼ Show 20 Lines	static void initTMBuilder(TargetMachineBuilder &TMBuilder,
const Triple &TheTriple) {		const Triple &TheTriple) {
// Set a default CPU for Darwin triples (copied from LTOCodeGenerator).		// Set a default CPU for Darwin triples (copied from LTOCodeGenerator).
// FIXME this looks pretty terrible...		// FIXME this looks pretty terrible...
if (TMBuilder.MCpu.empty() && TheTriple.isOSDarwin()) {		if (TMBuilder.MCpu.empty() && TheTriple.isOSDarwin()) {
if (TheTriple.getArch() == llvm::Triple::x86_64)		if (TheTriple.getArch() == llvm::Triple::x86_64)
TMBuilder.MCpu = "core2";		TMBuilder.MCpu = "core2";
else if (TheTriple.getArch() == llvm::Triple::x86)		else if (TheTriple.getArch() == llvm::Triple::x86)
TMBuilder.MCpu = "yonah";		TMBuilder.MCpu = "yonah";
else if (TheTriple.getArch() == llvm::Triple::aarch64)		else if (TheTriple.getArch() == llvm::Triple::aarch64 ||
		TheTriple.getArch() == llvm::Triple::aarch64_32)
TMBuilder.MCpu = "cyclone";		TMBuilder.MCpu = "cyclone";
}		}
TMBuilder.TheTriple = std::move(TheTriple);		TMBuilder.TheTriple = std::move(TheTriple);
}		}

} // end anonymous namespace		} // end anonymous namespace

void ThinLTOCodeGenerator::addModule(StringRef Identifier, StringRef Data) {		void ThinLTOCodeGenerator::addModule(StringRef Identifier, StringRef Data) {

llvm/lib/MC/MCObjectFileInfo.cpp

Show All 22 Lines
using namespace llvm;		using namespace llvm;

static bool useCompactUnwind(const Triple &T) {		static bool useCompactUnwind(const Triple &T) {
// Only on darwin.		// Only on darwin.
if (!T.isOSDarwin())		if (!T.isOSDarwin())
return false;		return false;

// aarch64 always has it.		// aarch64 always has it.
if (T.getArch() == Triple::aarch64)		if (T.getArch() == Triple::aarch64 || T.getArch() == Triple::aarch64_32)
return true;		return true;

// armv7k always has it.		// armv7k always has it.
if (T.isWatchABI())		if (T.isWatchABI())
return true;		return true;

// Use it on newer version of OS X.		// Use it on newer version of OS X.
if (T.isMacOSX() && !T.isMacOSXVersionLT(10, 6))		if (T.isMacOSX() && !T.isMacOSXVersionLT(10, 6))
Show All 12 Lines	void MCObjectFileInfo::initMachOMCObjectFileInfo(const Triple &T) {
SupportsWeakOmittedEHFrame = false;		SupportsWeakOmittedEHFrame = false;

EHFrameSection = Ctx->getMachOSection(		EHFrameSection = Ctx->getMachOSection(
"__TEXT", "__eh_frame",		"__TEXT", "__eh_frame",
MachO::S_COALESCED | MachO::S_ATTR_NO_TOC |		MachO::S_COALESCED | MachO::S_ATTR_NO_TOC |
MachO::S_ATTR_STRIP_STATIC_SYMS | MachO::S_ATTR_LIVE_SUPPORT,		MachO::S_ATTR_STRIP_STATIC_SYMS | MachO::S_ATTR_LIVE_SUPPORT,
SectionKind::getReadOnly());		SectionKind::getReadOnly());

if (T.isOSDarwin() && T.getArch() == Triple::aarch64)		if (T.isOSDarwin() &&
		(T.getArch() == Triple::aarch64 || T.getArch() == Triple::aarch64_32))
SupportsCompactUnwindWithoutEHFrame = true;		SupportsCompactUnwindWithoutEHFrame = true;

if (T.isWatchABI())		if (T.isWatchABI())
OmitDwarfIfHaveCompactUnwind = true;		OmitDwarfIfHaveCompactUnwind = true;

FDECFIEncoding = dwarf::DW_EH_PE_pcrel;		FDECFIEncoding = dwarf::DW_EH_PE_pcrel;

// .comm doesn't support alignment before Leopard.		// .comm doesn't support alignment before Leopard.
▲ Show 20 Lines • Show All 119 Lines • ▼ Show 20 Lines	void MCObjectFileInfo::initMachOMCObjectFileInfo(const Triple &T) {

if (useCompactUnwind(T)) {		if (useCompactUnwind(T)) {
CompactUnwindSection =		CompactUnwindSection =
Ctx->getMachOSection("__LD", "__compact_unwind", MachO::S_ATTR_DEBUG,		Ctx->getMachOSection("__LD", "__compact_unwind", MachO::S_ATTR_DEBUG,
SectionKind::getReadOnly());		SectionKind::getReadOnly());

if (T.getArch() == Triple::x86_64 || T.getArch() == Triple::x86)		if (T.getArch() == Triple::x86_64 || T.getArch() == Triple::x86)
CompactUnwindDwarfEHFrameOnly = 0x04000000; // UNWIND_X86_64_MODE_DWARF		CompactUnwindDwarfEHFrameOnly = 0x04000000; // UNWIND_X86_64_MODE_DWARF
else if (T.getArch() == Triple::aarch64)		else if (T.getArch() == Triple::aarch64 || T.getArch() == Triple::aarch64_32)
CompactUnwindDwarfEHFrameOnly = 0x03000000; // UNWIND_ARM64_MODE_DWARF		CompactUnwindDwarfEHFrameOnly = 0x03000000; // UNWIND_ARM64_MODE_DWARF
else if (T.getArch() == Triple::arm || T.getArch() == Triple::thumb)		else if (T.getArch() == Triple::arm || T.getArch() == Triple::thumb)
CompactUnwindDwarfEHFrameOnly = 0x04000000; // UNWIND_ARM_MODE_DWARF		CompactUnwindDwarfEHFrameOnly = 0x04000000; // UNWIND_ARM_MODE_DWARF
}		}

// Debug Information.		// Debug Information.
DwarfDebugNamesSection =		DwarfDebugNamesSection =
Ctx->getMachOSection("__DWARF", "__debug_names", MachO::S_ATTR_DEBUG,		Ctx->getMachOSection("__DWARF", "__debug_names", MachO::S_ATTR_DEBUG,

llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp

Show First 20 Lines • Show All 1,197 Lines • ▼ Show 20 Lines	void AArch64AsmPrinter::EmitInstruction(const MachineInstr *MI) {
EmitToStreamer(*OutStreamer, TmpInst);		EmitToStreamer(*OutStreamer, TmpInst);
}		}

// Force static initialization.		// Force static initialization.
extern "C" void LLVMInitializeAArch64AsmPrinter() {		extern "C" void LLVMInitializeAArch64AsmPrinter() {
RegisterAsmPrinter<AArch64AsmPrinter> X(getTheAArch64leTarget());		RegisterAsmPrinter<AArch64AsmPrinter> X(getTheAArch64leTarget());
RegisterAsmPrinter<AArch64AsmPrinter> Y(getTheAArch64beTarget());		RegisterAsmPrinter<AArch64AsmPrinter> Y(getTheAArch64beTarget());
RegisterAsmPrinter<AArch64AsmPrinter> Z(getTheARM64Target());		RegisterAsmPrinter<AArch64AsmPrinter> Z(getTheARM64Target());
		RegisterAsmPrinter<AArch64AsmPrinter> W(getTheARM64_32Target());
		RegisterAsmPrinter<AArch64AsmPrinter> V(getTheAArch64_32Target());
		kristof.beyls: I assume this patch introduces support for the Darwin-specific arm64_32 ILP32 ABI. That makes me wonder why there is a need to define/create the "getTheAArch64_32Target()". Doesn't "getTheARM64_32Target()" suffice to enable support for the Darwin-specific arm64_32 ILP32 ABI?
		t.p.northover (author): I think it would be enough for production purposes, but the extra lines allow someone to write triple strings starting with `aarch64_32` in tests and so on if they want to. I still have a long-term goal to switch Clang over to using `aarch64` as the canonical name in LLVM (`aarch64-apple-ios14.0` etc). I was prevented when I tried years ago because ld64 wasn't ready, but I think I fixed that and I should try again some time.
}		}

llvm/lib/Target/AArch64/AArch64CallLowering.cpp

Show First 20 Lines • Show All 384 Lines • ▼ Show 20 Lines	bool AArch64CallLowering::lowerFormalArguments(
CCAssignFn *AssignFn =		CCAssignFn *AssignFn =
TLI.CCAssignFnForCall(F.getCallingConv(), /*IsVarArg=*/false);		TLI.CCAssignFnForCall(F.getCallingConv(), /*IsVarArg=*/false);

FormalArgHandler Handler(MIRBuilder, MRI, AssignFn);		FormalArgHandler Handler(MIRBuilder, MRI, AssignFn);
if (!handleAssignments(MIRBuilder, SplitArgs, Handler))		if (!handleAssignments(MIRBuilder, SplitArgs, Handler))
return false;		return false;

if (F.isVarArg()) {		if (F.isVarArg()) {
if (!MF.getSubtarget<AArch64Subtarget>().isTargetDarwin()) {		auto &Subtarget = MF.getSubtarget<AArch64Subtarget>();
		if (!Subtarget.isTargetDarwin()) {
// FIXME: we need to reimplement saveVarArgsRegisters from		// FIXME: we need to reimplement saveVarArgsRegisters from
// AArch64ISelLowering.		// AArch64ISelLowering.
return false;		return false;
}		}

// We currently pass all varargs at 8-byte alignment.		// We currently pass all varargs at 8-byte alignment, or 4 in ILP32.
uint64_t StackOffset = alignTo(Handler.StackUsed, 8);		uint64_t StackOffset =
		alignTo(Handler.StackUsed, Subtarget.isTargetILP32() ? 4 : 8);

auto &MFI = MIRBuilder.getMF().getFrameInfo();		auto &MFI = MIRBuilder.getMF().getFrameInfo();
AArch64FunctionInfo *FuncInfo = MF.getInfo<AArch64FunctionInfo>();		AArch64FunctionInfo *FuncInfo = MF.getInfo<AArch64FunctionInfo>();
FuncInfo->setVarArgsStackIndex(MFI.CreateFixedObject(4, StackOffset, true));		FuncInfo->setVarArgsStackIndex(MFI.CreateFixedObject(4, StackOffset, true));
}		}

auto &Subtarget = MF.getSubtarget<AArch64Subtarget>();		auto &Subtarget = MF.getSubtarget<AArch64Subtarget>();
if (Subtarget.hasCustomCallingConv())		if (Subtarget.hasCustomCallingConv())
▲ Show 20 Lines • Show All 272 Lines • Show Last 20 Lines
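
The vararg slot rounding changed in the hunk above can be illustrated outside LLVM. The following is only a sketch: `alignUp` and `varArgsStackOffset` are hypothetical names standing in for `llvm::alignTo` and the inline expression in `lowerFormalArguments`, and the boolean parameter stands in for `Subtarget.isTargetILP32()`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::alignTo: round Value up to the next
// multiple of Align (Align must be non-zero).
constexpr uint64_t alignUp(uint64_t Value, uint64_t Align) {
  return (Value + Align - 1) / Align * Align;
}

// Model of the patched computation: the first variadic argument starts at
// the next 4-byte boundary on ILP32 and the next 8-byte boundary otherwise.
constexpr uint64_t varArgsStackOffset(uint64_t StackUsed, bool IsILP32) {
  return alignUp(StackUsed, IsILP32 ? 4 : 8);
}
```

With 12 bytes of fixed arguments already on the stack, this model places the first variadic slot at offset 12 under ILP32 and at offset 16 under LP64.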

llvm/lib/Target/AArch64/AArch64CallingConvention.h

Show All 19 Lines	bool CC_AArch64_AAPCS(unsigned ValNo, MVT ValVT, MVT LocVT,
CCValAssign::LocInfo LocInfo, ISD::ArgFlagsTy ArgFlags,		CCValAssign::LocInfo LocInfo, ISD::ArgFlagsTy ArgFlags,
CCState &State);		CCState &State);
bool CC_AArch64_DarwinPCS_VarArg(unsigned ValNo, MVT ValVT, MVT LocVT,		bool CC_AArch64_DarwinPCS_VarArg(unsigned ValNo, MVT ValVT, MVT LocVT,
CCValAssign::LocInfo LocInfo,		CCValAssign::LocInfo LocInfo,
ISD::ArgFlagsTy ArgFlags, CCState &State);		ISD::ArgFlagsTy ArgFlags, CCState &State);
bool CC_AArch64_DarwinPCS(unsigned ValNo, MVT ValVT, MVT LocVT,		bool CC_AArch64_DarwinPCS(unsigned ValNo, MVT ValVT, MVT LocVT,
CCValAssign::LocInfo LocInfo,		CCValAssign::LocInfo LocInfo,
ISD::ArgFlagsTy ArgFlags, CCState &State);		ISD::ArgFlagsTy ArgFlags, CCState &State);
		bool CC_AArch64_DarwinPCS_ILP32_VarArg(unsigned ValNo, MVT ValVT, MVT LocVT,
		CCValAssign::LocInfo LocInfo,
		ISD::ArgFlagsTy ArgFlags, CCState &State);
bool CC_AArch64_Win64_VarArg(unsigned ValNo, MVT ValVT, MVT LocVT,		bool CC_AArch64_Win64_VarArg(unsigned ValNo, MVT ValVT, MVT LocVT,
CCValAssign::LocInfo LocInfo,		CCValAssign::LocInfo LocInfo,
ISD::ArgFlagsTy ArgFlags, CCState &State);		ISD::ArgFlagsTy ArgFlags, CCState &State);
bool CC_AArch64_WebKit_JS(unsigned ValNo, MVT ValVT, MVT LocVT,		bool CC_AArch64_WebKit_JS(unsigned ValNo, MVT ValVT, MVT LocVT,
CCValAssign::LocInfo LocInfo,		CCValAssign::LocInfo LocInfo,
ISD::ArgFlagsTy ArgFlags, CCState &State);		ISD::ArgFlagsTy ArgFlags, CCState &State);
bool CC_AArch64_GHC(unsigned ValNo, MVT ValVT, MVT LocVT,		bool CC_AArch64_GHC(unsigned ValNo, MVT ValVT, MVT LocVT,
CCValAssign::LocInfo LocInfo, ISD::ArgFlagsTy ArgFlags,		CCValAssign::LocInfo LocInfo, ISD::ArgFlagsTy ArgFlags,
Show All 10 Lines

llvm/lib/Target/AArch64/AArch64CallingConvention.cpp

	Show First 20 Lines • Show All 73 Lines • ▼ Show 20 Lines
	}			}

	/// Given an [N x Ty] block, it should be passed in a consecutive sequence of			/// Given an [N x Ty] block, it should be passed in a consecutive sequence of
	/// registers. If no such sequence is available, mark the rest of the registers			/// registers. If no such sequence is available, mark the rest of the registers
	/// of that type as used and place the argument on the stack.			/// of that type as used and place the argument on the stack.
	static bool CC_AArch64_Custom_Block(unsigned &ValNo, MVT &ValVT, MVT &LocVT,			static bool CC_AArch64_Custom_Block(unsigned &ValNo, MVT &ValVT, MVT &LocVT,
	CCValAssign::LocInfo &LocInfo,			CCValAssign::LocInfo &LocInfo,
	ISD::ArgFlagsTy &ArgFlags, CCState &State) {			ISD::ArgFlagsTy &ArgFlags, CCState &State) {
				const AArch64Subtarget &Subtarget = static_cast<const AArch64Subtarget &>(
				State.getMachineFunction().getSubtarget());
				bool IsDarwinILP32 = Subtarget.isTargetILP32() && Subtarget.isTargetMachO();

	// Try to allocate a contiguous block of registers, each of the correct			// Try to allocate a contiguous block of registers, each of the correct
	// size to hold one member.			// size to hold one member.
	ArrayRef<MCPhysReg> RegList;			ArrayRef<MCPhysReg> RegList;
	if (LocVT.SimpleTy == MVT::i64)			if (LocVT.SimpleTy == MVT::i64 || (IsDarwinILP32 && LocVT.SimpleTy == MVT::i32))
	RegList = XRegList;			RegList = XRegList;
	else if (LocVT.SimpleTy == MVT::f16)			else if (LocVT.SimpleTy == MVT::f16)
	RegList = HRegList;			RegList = HRegList;
	else if (LocVT.SimpleTy == MVT::f32 || LocVT.is32BitVector())			else if (LocVT.SimpleTy == MVT::f32 || LocVT.is32BitVector())
	RegList = SRegList;			RegList = SRegList;
	else if (LocVT.SimpleTy == MVT::f64 || LocVT.is64BitVector())			else if (LocVT.SimpleTy == MVT::f64 || LocVT.is64BitVector())
	RegList = DRegList;			RegList = DRegList;
	else if (LocVT.SimpleTy == MVT::f128 || LocVT.is128BitVector())			else if (LocVT.SimpleTy == MVT::f128 || LocVT.is128BitVector())
	RegList = QRegList;			RegList = QRegList;
	else {			else {
	// Not an array we want to split up after all.			// Not an array we want to split up after all.
	return false;			return false;
	}			}

	SmallVectorImpl<CCValAssign> &PendingMembers = State.getPendingLocs();			SmallVectorImpl<CCValAssign> &PendingMembers = State.getPendingLocs();

	// Add the argument to the list to be allocated once we know the size of the			// Add the argument to the list to be allocated once we know the size of the
	// block.			// block.
	PendingMembers.push_back(			PendingMembers.push_back(
	CCValAssign::getPending(ValNo, ValVT, LocVT, LocInfo));			CCValAssign::getPending(ValNo, ValVT, LocVT, LocInfo));

	if (!ArgFlags.isInConsecutiveRegsLast())			if (!ArgFlags.isInConsecutiveRegsLast())
	return true;			return true;

	unsigned RegResult = State.AllocateRegBlock(RegList, PendingMembers.size());			// [N x i32] arguments get packed into x-registers on Darwin's arm64_32
	if (RegResult) {			// because that's how the armv7k Clang front-end emits small structs.
				unsigned EltsPerReg = (IsDarwinILP32 && LocVT.SimpleTy == MVT::i32) ? 2 : 1;
				unsigned RegResult = State.AllocateRegBlock(
				RegList, alignTo(PendingMembers.size(), EltsPerReg) / EltsPerReg);
				if (RegResult && EltsPerReg == 1) {
				aemerson: Comment here explaining why we need this? I'm guessing for passing arrays of i32s & compatibility with armv7k?
	for (auto &It : PendingMembers) {			for (auto &It : PendingMembers) {
	It.convertToReg(RegResult);			It.convertToReg(RegResult);
	State.addLoc(It);			State.addLoc(It);
	++RegResult;			++RegResult;
	}			}
	PendingMembers.clear();			PendingMembers.clear();
	return true;			return true;
				} else if (RegResult) {
				assert(EltsPerReg == 2 && "unexpected ABI");
				bool UseHigh = false;
				CCValAssign::LocInfo Info;
				for (auto &It : PendingMembers) {
				Info = UseHigh ? CCValAssign::AExtUpper : CCValAssign::ZExt;
				State.addLoc(CCValAssign::getReg(It.getValNo(), MVT::i32, RegResult,
				MVT::i64, Info));
				UseHigh = !UseHigh;
				if (!UseHigh)
				++RegResult;
				}
				PendingMembers.clear();
				return true;
	}			}

	// Mark all regs in the class as unavailable			// Mark all regs in the class as unavailable
	for (auto Reg : RegList)			for (auto Reg : RegList)
	State.AllocateReg(Reg);			State.AllocateReg(Reg);

	const AArch64Subtarget &Subtarget = static_cast<const AArch64Subtarget &>(
	State.getMachineFunction().getSubtarget());
	unsigned SlotAlign = Subtarget.isTargetDarwin() ? 1 : 8;			unsigned SlotAlign = Subtarget.isTargetDarwin() ? 1 : 8;

	return finishStackBlock(PendingMembers, LocVT, ArgFlags, State, SlotAlign);			return finishStackBlock(PendingMembers, LocVT, ArgFlags, State, SlotAlign);
	}			}

	// TableGen provides definitions of the calling convention analysis entry			// TableGen provides definitions of the calling convention analysis entry
	// points.			// points.
	#include "AArch64GenCallingConv.inc"			#include "AArch64GenCallingConv.inc"
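
The `[N x i32]` packing branch added to `CC_AArch64_Custom_Block` above can be modelled without any LLVM types. This is a rough sketch under stated assumptions: `Half`, `regsNeeded` and `packI32Block` are hypothetical names, registers are plain indices, and the low/high halves correspond to the `ZExt` and `AExtUpper` location infos used in the patch.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Which half of a 64-bit x-register an i32 element lands in.
enum class Half { Low, High };

// Number of x-registers needed for NumElts i32 members at two per register.
// Same value as alignTo(NumElts, 2) / 2 in the patch.
inline unsigned regsNeeded(unsigned NumElts) { return (NumElts + 1) / 2; }

// Assign each element a (register index, half) pair starting at FirstReg,
// mirroring the UseHigh toggle: low half first, then high half, then move
// on to the next register.
inline std::vector<std::pair<unsigned, Half>>
packI32Block(unsigned FirstReg, unsigned NumElts) {
  std::vector<std::pair<unsigned, Half>> Locs;
  unsigned Reg = FirstReg;
  bool UseHigh = false;
  for (unsigned I = 0; I < NumElts; ++I) {
    Locs.emplace_back(Reg, UseHigh ? Half::High : Half::Low);
    UseHigh = !UseHigh;
    if (!UseHigh)
      ++Reg;
  }
  return Locs;
}
```

In this model a `[3 x i32]` block starting at register 0 occupies two registers: elements 0 and 1 share register 0, and element 2 takes the low half of register 1.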

llvm/lib/Target/AArch64/AArch64CallingConvention.td

Show All 11 Lines

/// CCIfAlign - Match of the original alignment of the arg		/// CCIfAlign - Match of the original alignment of the arg
class CCIfAlign<string Align, CCAction A> :		class CCIfAlign<string Align, CCAction A> :
CCIf<!strconcat("ArgFlags.getOrigAlign() == ", Align), A>;		CCIf<!strconcat("ArgFlags.getOrigAlign() == ", Align), A>;
/// CCIfBigEndian - Match only if we're in big endian mode.		/// CCIfBigEndian - Match only if we're in big endian mode.
class CCIfBigEndian<CCAction A> :		class CCIfBigEndian<CCAction A> :
CCIf<"State.getMachineFunction().getDataLayout().isBigEndian()", A>;		CCIf<"State.getMachineFunction().getDataLayout().isBigEndian()", A>;

		class CCIfILP32<CCAction A> :
		CCIf<"State.getMachineFunction().getDataLayout().getPointerSize() == 4", A>;


//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// ARM AAPCS64 Calling Convention		// ARM AAPCS64 Calling Convention
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

let Entry = 1 in		let Entry = 1 in
def CC_AArch64_AAPCS : CallingConv<[		def CC_AArch64_AAPCS : CallingConv<[
CCIfType<[iPTR], CCBitConvertToType<i64>>,		CCIfType<[iPTR], CCBitConvertToType<i64>>,
CCIfType<[v2f32], CCBitConvertToType<v2i32>>,		CCIfType<[v2f32], CCBitConvertToType<v2i32>>,
▲ Show 20 Lines • Show All 90 Lines • ▼ Show 20 Lines
]>;		]>;

let Entry = 1 in		let Entry = 1 in
def RetCC_AArch64_AAPCS : CallingConv<[		def RetCC_AArch64_AAPCS : CallingConv<[
CCIfType<[iPTR], CCBitConvertToType<i64>>,		CCIfType<[iPTR], CCBitConvertToType<i64>>,
CCIfType<[v2f32], CCBitConvertToType<v2i32>>,		CCIfType<[v2f32], CCBitConvertToType<v2i32>>,
CCIfType<[v2f64, v4f32], CCBitConvertToType<v2i64>>,		CCIfType<[v2f64, v4f32], CCBitConvertToType<v2i64>>,

		CCIfConsecutiveRegs<CCCustom<"CC_AArch64_Custom_Block">>,
CCIfSwiftError<CCIfType<[i64], CCAssignToRegWithShadow<[X21], [W21]>>>,		CCIfSwiftError<CCIfType<[i64], CCAssignToRegWithShadow<[X21], [W21]>>>,

// Big endian vectors must be passed as if they were 1-element vectors so that		// Big endian vectors must be passed as if they were 1-element vectors so that
// their lanes are in a consistent order.		// their lanes are in a consistent order.
CCIfBigEndian<CCIfType<[v2i32, v2f32, v4i16, v4f16, v8i8],		CCIfBigEndian<CCIfType<[v2i32, v2f32, v4i16, v4f16, v8i8],
CCBitConvertToType<f64>>>,		CCBitConvertToType<f64>>>,
CCIfBigEndian<CCIfType<[v2i64, v2f64, v4i32, v4f32, v8i16, v8f16, v16i8],		CCIfBigEndian<CCIfType<[v2i64, v2f64, v4i32, v4f32, v8i16, v8f16, v16i8],
CCBitConvertToType<f128>>>,		CCBitConvertToType<f128>>>,
▲ Show 20 Lines • Show All 82 Lines • ▼ Show 20 Lines	CCIfType<[v1i64, v2i32, v4i16, v8i8, v1f64, v2f32, v4f16],
[Q0, Q1, Q2, Q3, Q4, Q5, Q6, Q7]>>,		[Q0, Q1, Q2, Q3, Q4, Q5, Q6, Q7]>>,
CCIfType<[v2i64, v4i32, v8i16, v16i8, v4f32, v2f64, v8f16],		CCIfType<[v2i64, v4i32, v8i16, v16i8, v4f32, v2f64, v8f16],
CCAssignToReg<[Q0, Q1, Q2, Q3, Q4, Q5, Q6, Q7]>>,		CCAssignToReg<[Q0, Q1, Q2, Q3, Q4, Q5, Q6, Q7]>>,

// If more than will fit in registers, pass them on the stack instead.		// If more than will fit in registers, pass them on the stack instead.
CCIf<"ValVT == MVT::i1 || ValVT == MVT::i8", CCAssignToStack<1, 1>>,		CCIf<"ValVT == MVT::i1 || ValVT == MVT::i8", CCAssignToStack<1, 1>>,
CCIf<"ValVT == MVT::i16 || ValVT == MVT::f16", CCAssignToStack<2, 2>>,		CCIf<"ValVT == MVT::i16 || ValVT == MVT::f16", CCAssignToStack<2, 2>>,
CCIfType<[i32, f32], CCAssignToStack<4, 4>>,		CCIfType<[i32, f32], CCAssignToStack<4, 4>>,

		// Re-demote pointers to 32-bits so we don't end up storing 64-bit
		// values and clobbering neighbouring stack locations. Not very pretty.
		CCIfPtr<CCIfILP32<CCTruncToType<i32>>>,
		CCIfPtr<CCIfILP32<CCAssignToStack<4, 4>>>,

CCIfType<[i64, f64, v1f64, v2f32, v1i64, v2i32, v4i16, v8i8, v4f16],		CCIfType<[i64, f64, v1f64, v2f32, v1i64, v2i32, v4i16, v8i8, v4f16],
CCAssignToStack<8, 8>>,		CCAssignToStack<8, 8>>,
CCIfType<[v2i64, v4i32, v8i16, v16i8, v4f32, v2f64, v8f16],		CCIfType<[v2i64, v4i32, v8i16, v16i8, v4f32, v2f64, v8f16],
CCAssignToStack<16, 16>>		CCAssignToStack<16, 16>>
]>;		]>;

let Entry = 1 in		let Entry = 1 in
def CC_AArch64_DarwinPCS_VarArg : CallingConv<[		def CC_AArch64_DarwinPCS_VarArg : CallingConv<[
Show All 11 Lines	def CC_AArch64_DarwinPCS_VarArg : CallingConv<[
// i128 is split to two i64s, and its stack alignment is 16 bytes.		// i128 is split to two i64s, and its stack alignment is 16 bytes.
CCIfType<[i64], CCIfSplit<CCAssignToStack<8, 16>>>,		CCIfType<[i64], CCIfSplit<CCAssignToStack<8, 16>>>,
CCIfType<[i64, f64, v1i64, v2i32, v4i16, v8i8, v1f64, v2f32, v4f16],		CCIfType<[i64, f64, v1i64, v2i32, v4i16, v8i8, v1f64, v2f32, v4f16],
CCAssignToStack<8, 8>>,		CCAssignToStack<8, 8>>,
CCIfType<[v2i64, v4i32, v8i16, v16i8, v4f32, v2f64, v8f16],		CCIfType<[v2i64, v4i32, v8i16, v16i8, v4f32, v2f64, v8f16],
CCAssignToStack<16, 16>>		CCAssignToStack<16, 16>>
]>;		]>;

		// In the ILP32 world, the minimum stack slot size is 4 bytes. Otherwise the
		// same as the normal Darwin VarArgs handling.
		let Entry = 1 in
		def CC_AArch64_DarwinPCS_ILP32_VarArg : CallingConv<[
		CCIfType<[v2f32], CCBitConvertToType<v2i32>>,
		CCIfType<[v2f64, v4f32, f128], CCBitConvertToType<v2i64>>,

		// Handle all scalar types as either i32 or f32.
		CCIfType<[i8, i16], CCPromoteToType<i32>>,
		CCIfType<[f16], CCPromoteToType<f32>>,

		// Everything is on the stack.
		// i128 is split to two i64s, and its stack alignment is 16 bytes.
		CCIfPtr<CCIfILP32<CCTruncToType<i32>>>,
		CCIfType<[i32, f32], CCAssignToStack<4, 4>>,
		CCIfType<[i64], CCIfSplit<CCAssignToStack<8, 16>>>,
		CCIfType<[i64, f64, v1i64, v2i32, v4i16, v8i8, v1f64, v2f32, v4f16],
		CCAssignToStack<8, 8>>,
		CCIfType<[v2i64, v4i32, v8i16, v16i8, v4f32, v2f64, v8f16],
		CCAssignToStack<16, 16>>
		]>;


// The WebKit_JS calling convention only passes the first argument (the callee)		// The WebKit_JS calling convention only passes the first argument (the callee)
// in register and the remaining arguments on stack. We allow 32bit stack slots,		// in register and the remaining arguments on stack. We allow 32bit stack slots,
// so that WebKit can write partial values in the stack and define the other		// so that WebKit can write partial values in the stack and define the other
// 32bit quantity as undef.		// 32bit quantity as undef.
let Entry = 1 in		let Entry = 1 in
def CC_AArch64_WebKit_JS : CallingConv<[		def CC_AArch64_WebKit_JS : CallingConv<[
// Handle i1, i8, i16, i32, and i64 passing in register X0 (W0).		// Handle i1, i8, i16, i32, and i64 passing in register X0 (W0).
CCIfType<[i1, i8, i16], CCPromoteToType<i32>>,		CCIfType<[i1, i8, i16], CCPromoteToType<i32>>,

llvm/lib/Target/AArch64/AArch64CollectLOH.cpp

Show First 20 Lines • Show All 97 Lines • ▼ Show 20 Lines
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "AArch64.h"		#include "AArch64.h"
#include "AArch64InstrInfo.h"		#include "AArch64InstrInfo.h"
#include "AArch64MachineFunctionInfo.h"		#include "AArch64MachineFunctionInfo.h"
#include "llvm/ADT/BitVector.h"		#include "llvm/ADT/BitVector.h"
#include "llvm/ADT/DenseMap.h"		#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/MapVector.h"		#include "llvm/ADT/MapVector.h"
		#include "llvm/ADT/SmallSet.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/CodeGen/MachineBasicBlock.h"		#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/CodeGen/MachineFunctionPass.h"		#include "llvm/CodeGen/MachineFunctionPass.h"
#include "llvm/CodeGen/MachineInstr.h"		#include "llvm/CodeGen/MachineInstr.h"
#include "llvm/CodeGen/TargetRegisterInfo.h"		#include "llvm/CodeGen/TargetRegisterInfo.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
▲ Show 20 Lines • Show All 62 Lines • ▼ Show 20 Lines	static bool canDefBePartOfLOH(const MachineInstr &MI) {
switch (MI.getOpcode()) {		switch (MI.getOpcode()) {
default:		default:
return false;		return false;
case AArch64::ADRP:		case AArch64::ADRP:
return true;		return true;
case AArch64::ADDXri:		case AArch64::ADDXri:
return canAddBePartOfLOH(MI);		return canAddBePartOfLOH(MI);
case AArch64::LDRXui:		case AArch64::LDRXui:
		case AArch64::LDRWui:
// Check immediate to see if the immediate is an address.		// Check immediate to see if the immediate is an address.
switch (MI.getOperand(2).getType()) {		switch (MI.getOperand(2).getType()) {
default:		default:
return false;		return false;
case MachineOperand::MO_GlobalAddress:		case MachineOperand::MO_GlobalAddress:
return MI.getOperand(2).getTargetFlags() & AArch64II::MO_GOT;		return MI.getOperand(2).getTargetFlags() & AArch64II::MO_GOT;
}		}
}		}
▲ Show 20 Lines • Show All 115 Lines • ▼ Show 20 Lines	if (isCandidateLoad(MI)) {
Info.Type = MCLOH_AdrpAddStr;		Info.Type = MCLOH_AdrpAddStr;
Info.IsCandidate = true;		Info.IsCandidate = true;
Info.MI0 = &MI;		Info.MI0 = &MI;
Info.MI1 = nullptr;		Info.MI1 = nullptr;
} else if (MI.getOpcode() == AArch64::ADDXri) {		} else if (MI.getOpcode() == AArch64::ADDXri) {
Info.Type = MCLOH_AdrpAdd;		Info.Type = MCLOH_AdrpAdd;
Info.IsCandidate = true;		Info.IsCandidate = true;
Info.MI0 = &MI;		Info.MI0 = &MI;
} else if (MI.getOpcode() == AArch64::LDRXui &&		} else if ((MI.getOpcode() == AArch64::LDRXui ||
		MI.getOpcode() == AArch64::LDRWui) &&
MI.getOperand(2).getTargetFlags() & AArch64II::MO_GOT) {		MI.getOperand(2).getTargetFlags() & AArch64II::MO_GOT) {
Info.Type = MCLOH_AdrpLdrGot;		Info.Type = MCLOH_AdrpLdrGot;
Info.IsCandidate = true;		Info.IsCandidate = true;
Info.MI0 = &MI;		Info.MI0 = &MI;
}		}
}		}

/// Update state \p Info given the tracked register is clobbered.		/// Update state \p Info given the tracked register is clobbered.
Show All 28 Lines	if (OpInfo.Type == MCLOH_AdrpLdr) {
return true;		return true;
} else if (OpInfo.Type == MCLOH_AdrpAddStr && OpInfo.MI1 == nullptr) {		} else if (OpInfo.Type == MCLOH_AdrpAddStr && OpInfo.MI1 == nullptr) {
OpInfo.Type = MCLOH_AdrpAddStr;		OpInfo.Type = MCLOH_AdrpAddStr;
OpInfo.IsCandidate = true;		OpInfo.IsCandidate = true;
OpInfo.MI1 = &MI;		OpInfo.MI1 = &MI;
return true;		return true;
}		}
} else {		} else {
assert(MI.getOpcode() == AArch64::LDRXui && "Expect LDRXui");		assert((MI.getOpcode() == AArch64::LDRXui ||
		MI.getOpcode() == AArch64::LDRWui) &&
		"Expect LDRXui or LDRWui");
assert((MI.getOperand(2).getTargetFlags() & AArch64II::MO_GOT) &&		assert((MI.getOperand(2).getTargetFlags() & AArch64II::MO_GOT) &&
"Expected GOT relocation");		"Expected GOT relocation");
if (OpInfo.Type == MCLOH_AdrpAddStr && OpInfo.MI1 == nullptr) {		if (OpInfo.Type == MCLOH_AdrpAddStr && OpInfo.MI1 == nullptr) {
OpInfo.Type = MCLOH_AdrpLdrGotStr;		OpInfo.Type = MCLOH_AdrpLdrGotStr;
OpInfo.IsCandidate = true;		OpInfo.IsCandidate = true;
OpInfo.MI1 = &MI;		OpInfo.MI1 = &MI;
return true;		return true;
} else if (OpInfo.Type == MCLOH_AdrpLdr) {		} else if (OpInfo.Type == MCLOH_AdrpLdr) {
▲ Show 20 Lines • Show All 100 Lines • ▼ Show 20 Lines	for (const MachineOperand &MO : MI.operands()) {
if (!MO.isReg() || !MO.isDef())		if (!MO.isReg() || !MO.isDef())
continue;		continue;
int Idx = mapRegToGPRIndex(MO.getReg());		int Idx = mapRegToGPRIndex(MO.getReg());
if (Idx < 0)		if (Idx < 0)
continue;		continue;
handleClobber(LOHInfos[Idx]);		handleClobber(LOHInfos[Idx]);
}		}
// Handle uses.		// Handle uses.

		SmallSet<int, 4> UsesSeen;
for (const MachineOperand &MO : MI.uses()) {		for (const MachineOperand &MO : MI.uses()) {
if (!MO.isReg() || !MO.readsReg())		if (!MO.isReg() || !MO.readsReg())
continue;		continue;
int Idx = mapRegToGPRIndex(MO.getReg());		int Idx = mapRegToGPRIndex(MO.getReg());
if (Idx < 0)		if (Idx < 0)
continue;		continue;

		// Multiple uses of the same register within a single instruction don't
		// count as MultiUser or block optimization. This is especially important on
		// arm64_32, where any memory operation is likely to be an explicit use of
		// xN and an implicit use of wN (the base address register).
		if (!UsesSeen.count(Idx)) {
handleUse(MI, MO, LOHInfos[Idx]);		handleUse(MI, MO, LOHInfos[Idx]);
		UsesSeen.insert(Idx);
		}
}		}
}		}

bool AArch64CollectLOH::runOnMachineFunction(MachineFunction &MF) {		bool AArch64CollectLOH::runOnMachineFunction(MachineFunction &MF) {
if (skipFunction(MF.getFunction()))		if (skipFunction(MF.getFunction()))
return false;		return false;

LLVM_DEBUG(dbgs() << "******** AArch64 Collect LOH ********\n"		LLVM_DEBUG(dbgs() << "******** AArch64 Collect LOH ********\n"
Show All 15 Lines	for (const MachineBasicBlock &MBB : MF) {

// Walk the basic block backwards and update the per register state machine		// Walk the basic block backwards and update the per register state machine
// in the process.		// in the process.
for (const MachineInstr &MI : make_range(MBB.rbegin(), MBB.rend())) {		for (const MachineInstr &MI : make_range(MBB.rbegin(), MBB.rend())) {
unsigned Opcode = MI.getOpcode();		unsigned Opcode = MI.getOpcode();
switch (Opcode) {		switch (Opcode) {
case AArch64::ADDXri:		case AArch64::ADDXri:
case AArch64::LDRXui:		case AArch64::LDRXui:
		case AArch64::LDRWui:
if (canDefBePartOfLOH(MI)) {		if (canDefBePartOfLOH(MI)) {
const MachineOperand &Def = MI.getOperand(0);		const MachineOperand &Def = MI.getOperand(0);
const MachineOperand &Op = MI.getOperand(1);		const MachineOperand &Op = MI.getOperand(1);
assert(Def.isReg() && Def.isDef() && "Expected reg def");		assert(Def.isReg() && Def.isDef() && "Expected reg def");
assert(Op.isReg() && Op.isUse() && "Expected reg use");		assert(Op.isReg() && Op.isUse() && "Expected reg use");
int DefIdx = mapRegToGPRIndex(Def.getReg());		int DefIdx = mapRegToGPRIndex(Def.getReg());
int OpIdx = mapRegToGPRIndex(Op.getReg());		int OpIdx = mapRegToGPRIndex(Op.getReg());
if (DefIdx >= 0 && OpIdx >= 0 &&		if (DefIdx >= 0 && OpIdx >= 0 &&
Show All 24 Lines
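
The `UsesSeen` change above makes each GPR index count at most once per instruction. A minimal standalone model, assuming operands have already been mapped to GPR indices (negative meaning "not a GPR", as `mapRegToGPRIndex` returns) and using the hypothetical name `countHandledUses` with `std::set` in place of `SmallSet`:

```cpp
#include <cassert>
#include <set>
#include <vector>

// Returns how many times a per-register use handler would run for a single
// instruction whose use operands map to the given GPR indices.
inline unsigned countHandledUses(const std::vector<int> &OperandGPRIdx) {
  std::set<int> UsesSeen;
  unsigned Handled = 0;
  for (int Idx : OperandGPRIdx) {
    if (Idx < 0)
      continue; // Not a tracked GPR.
    // insert() reports whether the index was new; repeats are skipped.
    if (UsesSeen.insert(Idx).second)
      ++Handled;
  }
  return Handled;
}
```

An arm64_32 load that uses xN explicitly and wN implicitly maps both operands to the same index, so in this model the handler runs once rather than twice.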

llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp

Show First 20 Lines • Show All 489 Lines • ▼ Show 20 Lines	if (MF->getTarget().getCodeModel() == CodeModel::Tiny) {
MIB.addExternalSymbol(MO1.getSymbolName(), Flags);		MIB.addExternalSymbol(MO1.getSymbolName(), Flags);
} else {		} else {
assert(MO1.isCPI() &&		assert(MO1.isCPI() &&
"Only expect globals, externalsymbols, or constant pools");		"Only expect globals, externalsymbols, or constant pools");
MIB.addConstantPoolIndex(MO1.getIndex(), MO1.getOffset(), Flags);		MIB.addConstantPoolIndex(MO1.getIndex(), MO1.getOffset(), Flags);
}		}
} else {		} else {
// Small codemodel expand into ADRP + LDR.		// Small codemodel expand into ADRP + LDR.
		MachineFunction &MF = *MI.getParent()->getParent();
		DebugLoc DL = MI.getDebugLoc();
MachineInstrBuilder MIB1 =		MachineInstrBuilder MIB1 =
BuildMI(MBB, MBBI, MI.getDebugLoc(), TII->get(AArch64::ADRP), DstReg);		BuildMI(MBB, MBBI, MI.getDebugLoc(), TII->get(AArch64::ADRP), DstReg);
MachineInstrBuilder MIB2 =
BuildMI(MBB, MBBI, MI.getDebugLoc(), TII->get(AArch64::LDRXui))		MachineInstrBuilder MIB2;
		if (MF.getSubtarget<AArch64Subtarget>().isTargetILP32()) {
		auto TRI = MBB.getParent()->getSubtarget().getRegisterInfo();
		unsigned Reg32 = TRI->getSubReg(DstReg, AArch64::sub_32);
		unsigned DstFlags = MI.getOperand(0).getTargetFlags();
		MIB2 = BuildMI(MBB, MBBI, MI.getDebugLoc(), TII->get(AArch64::LDRWui))
		.addDef(Reg32, RegState::Dead)
		.addReg(DstReg, RegState::Kill)
		.addReg(DstReg, DstFlags | RegState::Implicit);
		} else {
		unsigned DstReg = MI.getOperand(0).getReg();
		MIB2 = BuildMI(MBB, MBBI, DL, TII->get(AArch64::LDRXui))
.add(MI.getOperand(0))		.add(MI.getOperand(0))
.addReg(DstReg);		.addUse(DstReg, RegState::Kill);
		}

if (MO1.isGlobal()) {		if (MO1.isGlobal()) {
MIB1.addGlobalAddress(MO1.getGlobal(), 0, Flags | AArch64II::MO_PAGE);		MIB1.addGlobalAddress(MO1.getGlobal(), 0, Flags | AArch64II::MO_PAGE);
MIB2.addGlobalAddress(MO1.getGlobal(), 0,		MIB2.addGlobalAddress(MO1.getGlobal(), 0,
Flags | AArch64II::MO_PAGEOFF | AArch64II::MO_NC);		Flags | AArch64II::MO_PAGEOFF | AArch64II::MO_NC);
} else if (MO1.isSymbol()) {		} else if (MO1.isSymbol()) {
MIB1.addExternalSymbol(MO1.getSymbolName(), Flags | AArch64II::MO_PAGE);		MIB1.addExternalSymbol(MO1.getSymbolName(), Flags | AArch64II::MO_PAGE);
MIB2.addExternalSymbol(MO1.getSymbolName(), Flags |		MIB2.addExternalSymbol(MO1.getSymbolName(), Flags |

llvm/lib/Target/AArch64/AArch64FastISel.cpp

Show First 20 Lines • Show All 468 Lines • ▼ Show 20 Lines	unsigned AArch64FastISel::materializeGV(const GlobalValue *GV) {
unsigned ResultReg;		unsigned ResultReg;

if (OpFlags & AArch64II::MO_GOT) {		if (OpFlags & AArch64II::MO_GOT) {
// ADRP + LDRX		// ADRP + LDRX
BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc, TII.get(AArch64::ADRP),		BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc, TII.get(AArch64::ADRP),
ADRPReg)		ADRPReg)
.addGlobalAddress(GV, 0, AArch64II::MO_PAGE | OpFlags);		.addGlobalAddress(GV, 0, AArch64II::MO_PAGE | OpFlags);

		unsigned LdrOpc;
		if (Subtarget->isTargetILP32()) {
		ResultReg = createResultReg(&AArch64::GPR32RegClass);
		LdrOpc = AArch64::LDRWui;
		} else {
ResultReg = createResultReg(&AArch64::GPR64RegClass);		ResultReg = createResultReg(&AArch64::GPR64RegClass);
BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc, TII.get(AArch64::LDRXui),		LdrOpc = AArch64::LDRXui;
		}
		BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc, TII.get(LdrOpc),
ResultReg)		ResultReg)
.addReg(ADRPReg)		.addReg(ADRPReg)
.addGlobalAddress(GV, 0,		.addGlobalAddress(GV, 0, AArch64II::MO_GOT | AArch64II::MO_PAGEOFF |
AArch64II::MO_PAGEOFF | AArch64II::MO_NC | OpFlags);		AArch64II::MO_NC | OpFlags);
		if (!Subtarget->isTargetILP32())
		return ResultReg;

		// LDRWui produces a 32-bit register, but pointers in-register are 64-bits
		// so we must extend the result on ILP32.
		unsigned Result64 = createResultReg(&AArch64::GPR64RegClass);
		BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc,
		TII.get(TargetOpcode::SUBREG_TO_REG))
		.addDef(Result64)
		.addImm(0)
		.addReg(ResultReg, RegState::Kill)
		.addImm(AArch64::sub_32);
		return Result64;
} else {		} else {
// ADRP + ADDX		// ADRP + ADDX
BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc, TII.get(AArch64::ADRP),		BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc, TII.get(AArch64::ADRP),
ADRPReg)		ADRPReg)
.addGlobalAddress(GV, 0, AArch64II::MO_PAGE | OpFlags);		.addGlobalAddress(GV, 0, AArch64II::MO_PAGE | OpFlags);

ResultReg = createResultReg(&AArch64::GPR64spRegClass);		ResultReg = createResultReg(&AArch64::GPR64spRegClass);
BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc, TII.get(AArch64::ADDXri),		BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc, TII.get(AArch64::ADDXri),
ResultReg)		ResultReg)
.addReg(ADRPReg)		.addReg(ADRPReg)
.addGlobalAddress(GV, 0,		.addGlobalAddress(GV, 0,
AArch64II::MO_PAGEOFF | AArch64II::MO_NC | OpFlags)		AArch64II::MO_PAGEOFF | AArch64II::MO_NC | OpFlags)
.addImm(0);		.addImm(0);
}		}
return ResultReg;		return ResultReg;
}		}

unsigned AArch64FastISel::fastMaterializeConstant(const Constant *C) {		unsigned AArch64FastISel::fastMaterializeConstant(const Constant *C) {
EVT CEVT = TLI.getValueType(DL, C->getType(), true);		EVT CEVT = TLI.getValueType(DL, C->getType(), true);

// Only handle simple types.		// Only handle simple types.
if (!CEVT.isSimple())		if (!CEVT.isSimple())
return 0;		return 0;
MVT VT = CEVT.getSimpleVT();		MVT VT = CEVT.getSimpleVT();
		// arm64_32 has 32-bit pointers held in 64-bit registers. Because of that,
		// 'null' pointers need to have a somewhat special treatment.
		if (const auto *CPN = dyn_cast<ConstantPointerNull>(C)) {
		(void)CPN;
		assert(CPN->getType()->getPointerAddressSpace() == 0 &&
		"Unexpected address space");
		loladiro: @t.p.northover Out of curiosity, what is this assertion guarding against? In our frontend we use non-zero address spaces to indicate GC-tracked pointers, which we expect the backend to ignore (stripping them is possible, of course, but the performance impact is surprisingly high). As far as I can tell this is the only place in the AArch64 backend that looks at address spaces, and we encountered it when trying to port to Apple Silicon (not quite sure why nobody complained on Linux, but maybe people didn't run with assertions).
		t.p.northover (author): I'm afraid I don't remember, but it looks overcautious to me now as well. I've just removed the check from `main`.
		assert(VT == MVT::i64 && "Expected 64-bit pointers");
		return materializeInt(ConstantInt::get(Type::getInt64Ty(*Context), 0), VT);
		}

if (const auto *CI = dyn_cast<ConstantInt>(C))		if (const auto *CI = dyn_cast<ConstantInt>(C))
return materializeInt(CI, VT);		return materializeInt(CI, VT);
else if (const ConstantFP *CFP = dyn_cast<ConstantFP>(C))		else if (const ConstantFP *CFP = dyn_cast<ConstantFP>(C))
return materializeFP(CFP, VT);		return materializeFP(CFP, VT);
else if (const GlobalValue *GV = dyn_cast<GlobalValue>(C))		else if (const GlobalValue *GV = dyn_cast<GlobalValue>(C))
return materializeGV(GV);		return materializeGV(GV);

▲ Show 20 Lines • Show All 426 Lines • ▼ Show 20 Lines	bool AArch64FastISel::computeCallAddress(const Value *V, Address &Addr) {
}		}

return false;		return false;
}		}

bool AArch64FastISel::isTypeLegal(Type *Ty, MVT &VT) {		bool AArch64FastISel::isTypeLegal(Type *Ty, MVT &VT) {
EVT evt = TLI.getValueType(DL, Ty, true);		EVT evt = TLI.getValueType(DL, Ty, true);

		if (Subtarget->isTargetILP32() && Ty->isPointerTy())
		return false;

// Only handle simple types.		// Only handle simple types.
if (evt == MVT::Other || !evt.isSimple())		if (evt == MVT::Other || !evt.isSimple())
return false;		return false;
VT = evt.getSimpleVT();		VT = evt.getSimpleVT();

// This is a legal type, but it's not something we handle in fast-isel.		// This is a legal type, but it's not something we handle in fast-isel.
if (VT == MVT::f128)		if (VT == MVT::f128)
return false;		return false;
Show All 26 Lines	bool AArch64FastISel::isValueAvailable(const Value *V) const {
if (!isa<Instruction>(V))		if (!isa<Instruction>(V))
return true;		return true;

const auto *I = cast<Instruction>(V);		const auto *I = cast<Instruction>(V);
return FuncInfo.MBBMap[I->getParent()] == FuncInfo.MBB;		return FuncInfo.MBBMap[I->getParent()] == FuncInfo.MBB;
}		}

bool AArch64FastISel::simplifyAddress(Address &Addr, MVT VT) {		bool AArch64FastISel::simplifyAddress(Address &Addr, MVT VT) {
		if (Subtarget->isTargetILP32())
		return false;

unsigned ScaleFactor = getImplicitScaleFactor(VT);		unsigned ScaleFactor = getImplicitScaleFactor(VT);
if (!ScaleFactor)		if (!ScaleFactor)
return false;		return false;

bool ImmediateOffsetNeedsLowering = false;		bool ImmediateOffsetNeedsLowering = false;
bool RegisterOffsetNeedsLowering = false;		bool RegisterOffsetNeedsLowering = false;
int64_t Offset = Addr.getOffset();		int64_t Offset = Addr.getOffset();
if (((Offset < 0) || (Offset & (ScaleFactor - 1))) && !isInt<9>(Offset))		if (((Offset < 0) || (Offset & (ScaleFactor - 1))) && !isInt<9>(Offset))
▲ Show 20 Lines • Show All 2,161 Lines • ▼ Show 20 Lines	bool AArch64FastISel::fastLowerCall(CallLoweringInfo &CLI) {

if (!Callee && !Symbol)		if (!Callee && !Symbol)
return false;		return false;

// Allow SelectionDAG isel to handle tail calls.		// Allow SelectionDAG isel to handle tail calls.
if (IsTailCall)		if (IsTailCall)
return false;		return false;

		// FIXME: we could and should support this, but for now correctness at -O0 is
		// more important.
		if (Subtarget->isTargetILP32())
		return false;

CodeModel::Model CM = TM.getCodeModel();		CodeModel::Model CM = TM.getCodeModel();
// Only support the small-addressing and large code models.		// Only support the small-addressing and large code models.
if (CM != CodeModel::Large && !Subtarget->useSmallAddressing())		if (CM != CodeModel::Large && !Subtarget->useSmallAddressing())
return false;		return false;

// FIXME: Add large code model support for ELF.		// FIXME: Add large code model support for ELF.
if (CM == CodeModel::Large && !Subtarget->isTargetMachO())		if (CM == CodeModel::Large && !Subtarget->isTargetMachO())
return false;		return false;
▲ Show 20 Lines • Show All 615 Lines • ▼ Show 20 Lines

bool AArch64FastISel::selectRet(const Instruction *I) {		bool AArch64FastISel::selectRet(const Instruction *I) {
const ReturnInst *Ret = cast<ReturnInst>(I);		const ReturnInst *Ret = cast<ReturnInst>(I);
const Function &F = *I->getParent()->getParent();		const Function &F = *I->getParent()->getParent();

if (!FuncInfo.CanLowerReturn)		if (!FuncInfo.CanLowerReturn)
return false;		return false;

		// FIXME: in principle it could. Mostly just a case of zero extending outgoing
		// pointers.
		if (Subtarget->isTargetILP32())
		return false;

if (F.isVarArg())		if (F.isVarArg())
return false;		return false;

if (TLI.supportSwiftError() &&		if (TLI.supportSwiftError() &&
F.getAttributes().hasAttrSomewhere(Attribute::SwiftError))		F.getAttributes().hasAttrSomewhere(Attribute::SwiftError))
return false;		return false;

if (TLI.supportSplitCSR(FuncInfo.MF))		if (TLI.supportSplitCSR(FuncInfo.MF))

llvm/lib/Target/AArch64/AArch64ISelLowering.h

Show First 20 Lines • Show All 255 Lines • ▼ Show 20 Lines	public:

/// Determine which of the bits specified in Mask are known to be either zero		/// Determine which of the bits specified in Mask are known to be either zero
/// or one and return them in the KnownZero/KnownOne bitsets.		/// or one and return them in the KnownZero/KnownOne bitsets.
void computeKnownBitsForTargetNode(const SDValue Op, KnownBits &Known,		void computeKnownBitsForTargetNode(const SDValue Op, KnownBits &Known,
const APInt &DemandedElts,		const APInt &DemandedElts,
const SelectionDAG &DAG,		const SelectionDAG &DAG,
unsigned Depth = 0) const override;		unsigned Depth = 0) const override;

		MVT getPointerTy(const DataLayout &DL, uint32_t AS = 0) const override {
		// Returning i64 unconditionally here (i.e. even for ILP32) means that the
		// DAG representation of pointers will always be 64-bits. They will be
		// truncated and extended when transferred to memory, but the 64-bit DAG
		// allows us to use AArch64's addressing modes much more easily.
		kristof.beyls: I found it surprising that this method doesn't check whether ILP32 is targeted to decide whether the pointer type is a 32-bit or 64-bit integer. Would it be worthwhile to add a small comment here explaining why the integer pointer type is always 64 bits? Maybe I'm missing something completely trivial though...
		t.p.northover (author): A comment would definitely be worthwhile. This is the function that enables the large change I upstreamed earlier, so that pointer types in the DAG can remain 64 bits (to exploit the available addressing modes) and get truncated to 32 bits in memory.
		return MVT::getIntegerVT(64);
		}
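
To make the exchange above concrete, here is a minimal sketch in plain C++ (not LLVM API) of the round trip it describes: a pointer is held as a 64-bit value in a register, occupies only 32 bits in memory, and is zero-extended when reloaded. The names `storePointer` and `loadPointer` are hypothetical and exist only for this illustration; the precondition reflects the patch's assumption that valid ILP32 pointers lie in the low 4GB.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: write an in-register (64-bit) pointer value to a
// 4-byte memory slot. Only the low 32 bits are stored.
void storePointer(unsigned char *slot, std::uint64_t regValue) {
  // Assumption for this sketch: a valid ILP32 pointer has zero upper bits.
  assert((regValue >> 32) == 0 && "ILP32 pointers fit in the low 4GB");
  std::uint32_t truncated = static_cast<std::uint32_t>(regValue);
  std::memcpy(slot, &truncated, sizeof(truncated));
}

// Hypothetical helper: read the 4-byte slot back and zero-extend it to the
// 64-bit in-register representation.
std::uint64_t loadPointer(const unsigned char *slot) {
  std::uint32_t narrow;
  std::memcpy(&narrow, slot, sizeof(narrow));
  return static_cast<std::uint64_t>(narrow);
}
```

Keeping the register form at 64 bits is what lets address arithmetic use the ordinary 64-bit addressing modes; only the memory form is narrow.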

bool targetShrinkDemandedConstant(SDValue Op, const APInt &Demanded,		bool targetShrinkDemandedConstant(SDValue Op, const APInt &Demanded,
TargetLoweringOpt &TLO) const override;		TargetLoweringOpt &TLO) const override;

MVT getScalarShiftAmountTy(const DataLayout &DL, EVT) const override;		MVT getScalarShiftAmountTy(const DataLayout &DL, EVT) const override;

/// Returns true if the target allows unaligned memory accesses of the		/// Returns true if the target allows unaligned memory accesses of the
/// specified type.		/// specified type.
bool allowsMisalignedMemoryAccesses(		bool allowsMisalignedMemoryAccesses(

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

#include "AArch64RegisterInfo.h"		#include "AArch64RegisterInfo.h"
#include "AArch64Subtarget.h"		#include "AArch64Subtarget.h"
#include "MCTargetDesc/AArch64AddressingModes.h"		#include "MCTargetDesc/AArch64AddressingModes.h"
#include "Utils/AArch64BaseInfo.h"		#include "Utils/AArch64BaseInfo.h"
#include "llvm/ADT/APFloat.h"		#include "llvm/ADT/APFloat.h"
#include "llvm/ADT/APInt.h"		#include "llvm/ADT/APInt.h"
#include "llvm/ADT/ArrayRef.h"		#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
		#include "llvm/ADT/SmallSet.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/StringSwitch.h"		#include "llvm/ADT/StringSwitch.h"
#include "llvm/ADT/Triple.h"		#include "llvm/ADT/Triple.h"
#include "llvm/ADT/Twine.h"		#include "llvm/ADT/Twine.h"
#include "llvm/Analysis/VectorUtils.h"		#include "llvm/Analysis/VectorUtils.h"
#include "llvm/CodeGen/CallingConvLower.h"		#include "llvm/CodeGen/CallingConvLower.h"
▲ Show 20 Lines • Show All 1,014 Lines • ▼ Show 20 Lines	void AArch64TargetLowering::computeKnownBitsForTargetNode(
case AArch64ISD::CSEL: {		case AArch64ISD::CSEL: {
KnownBits Known2;		KnownBits Known2;
Known = DAG.computeKnownBits(Op->getOperand(0), Depth + 1);		Known = DAG.computeKnownBits(Op->getOperand(0), Depth + 1);
Known2 = DAG.computeKnownBits(Op->getOperand(1), Depth + 1);		Known2 = DAG.computeKnownBits(Op->getOperand(1), Depth + 1);
Known.Zero &= Known2.Zero;		Known.Zero &= Known2.Zero;
Known.One &= Known2.One;		Known.One &= Known2.One;
break;		break;
}		}
		case AArch64ISD::LOADgot:
		case AArch64ISD::ADDlow: {
		if (!Subtarget->isTargetILP32())
		break;
		// In ILP32 mode all valid pointers are in the low 4GB of the address-space.
		Known.Zero = APInt::getHighBitsSet(64, 32);
		break;
		}
case ISD::INTRINSIC_W_CHAIN: {		case ISD::INTRINSIC_W_CHAIN: {
ConstantSDNode *CN = cast<ConstantSDNode>(Op->getOperand(1));		ConstantSDNode *CN = cast<ConstantSDNode>(Op->getOperand(1));
Intrinsic::ID IntID = static_cast<Intrinsic::ID>(CN->getZExtValue());		Intrinsic::ID IntID = static_cast<Intrinsic::ID>(CN->getZExtValue());
switch (IntID) {		switch (IntID) {
default: return;		default: return;
case Intrinsic::aarch64_ldaxr:		case Intrinsic::aarch64_ldaxr:
case Intrinsic::aarch64_ldxr: {		case Intrinsic::aarch64_ldxr: {
unsigned BitWidth = Known.getBitWidth();		unsigned BitWidth = Known.getBitWidth();
▲ Show 20 Lines • Show All 2,002 Lines • ▼ Show 20 Lines	CCAssignFn *AArch64TargetLowering::CCAssignFnForCall(CallingConv::ID CC,
case CallingConv::Fast:		case CallingConv::Fast:
case CallingConv::PreserveMost:		case CallingConv::PreserveMost:
case CallingConv::CXX_FAST_TLS:		case CallingConv::CXX_FAST_TLS:
case CallingConv::Swift:		case CallingConv::Swift:
if (Subtarget->isTargetWindows() && IsVarArg)		if (Subtarget->isTargetWindows() && IsVarArg)
return CC_AArch64_Win64_VarArg;		return CC_AArch64_Win64_VarArg;
if (!Subtarget->isTargetDarwin())		if (!Subtarget->isTargetDarwin())
return CC_AArch64_AAPCS;		return CC_AArch64_AAPCS;
return IsVarArg ? CC_AArch64_DarwinPCS_VarArg : CC_AArch64_DarwinPCS;		if (!IsVarArg)
		return CC_AArch64_DarwinPCS;
		return Subtarget->isTargetILP32() ? CC_AArch64_DarwinPCS_ILP32_VarArg
		: CC_AArch64_DarwinPCS_VarArg;
case CallingConv::Win64:		case CallingConv::Win64:
return IsVarArg ? CC_AArch64_Win64_VarArg : CC_AArch64_AAPCS;		return IsVarArg ? CC_AArch64_Win64_VarArg : CC_AArch64_AAPCS;
case CallingConv::AArch64_VectorCall:		case CallingConv::AArch64_VectorCall:
return CC_AArch64_AAPCS;		return CC_AArch64_AAPCS;
}		}
}		}

CCAssignFn *		CCAssignFn *
AArch64TargetLowering::CCAssignFnForReturn(CallingConv::ID CC) const {		AArch64TargetLowering::CCAssignFnForReturn(CallingConv::ID CC) const {
return CC == CallingConv::WebKit_JS ? RetCC_AArch64_WebKit_JS		return CC == CallingConv::WebKit_JS ? RetCC_AArch64_WebKit_JS
: RetCC_AArch64_AAPCS;		: RetCC_AArch64_AAPCS;
}		}

SDValue AArch64TargetLowering::LowerFormalArguments(		SDValue AArch64TargetLowering::LowerFormalArguments(
SDValue Chain, CallingConv::ID CallConv, bool isVarArg,		SDValue Chain, CallingConv::ID CallConv, bool isVarArg,
const SmallVectorImpl<ISD::InputArg> &Ins, const SDLoc &DL,		const SmallVectorImpl<ISD::InputArg> &Ins, const SDLoc &DL,
SelectionDAG &DAG, SmallVectorImpl<SDValue> &InVals) const {		SelectionDAG &DAG, SmallVectorImpl<SDValue> &InVals) const {
MachineFunction &MF = DAG.getMachineFunction();		MachineFunction &MF = DAG.getMachineFunction();
MachineFrameInfo &MFI = MF.getFrameInfo();		MachineFrameInfo &MFI = MF.getFrameInfo();
bool IsWin64 = Subtarget->isCallingConvWin64(MF.getFunction().getCallingConv());		bool IsWin64 = Subtarget->isCallingConvWin64(MF.getFunction().getCallingConv());

// Assign locations to all of the incoming arguments.		// Assign locations to all of the incoming arguments.
SmallVector<CCValAssign, 16> ArgLocs;		SmallVector<CCValAssign, 16> ArgLocs;
		DenseMap<unsigned, SDValue> CopiedRegs;
CCState CCInfo(CallConv, isVarArg, DAG.getMachineFunction(), ArgLocs,		CCState CCInfo(CallConv, isVarArg, DAG.getMachineFunction(), ArgLocs,
*DAG.getContext());		*DAG.getContext());

// At this point, Ins[].VT may already be promoted to i32. To correctly		// At this point, Ins[].VT may already be promoted to i32. To correctly
// handle passing i8 as i8 instead of i32 on stack, we pass in both i32 and		// handle passing i8 as i8 instead of i32 on stack, we pass in both i32 and
// i8 to CC_AArch64_AAPCS with i32 being ValVT and i8 being LocVT.		// i8 to CC_AArch64_AAPCS with i32 being ValVT and i8 being LocVT.
// Since AnalyzeFormalArguments uses Ins[].VT for both ValVT and LocVT, here		// Since AnalyzeFormalArguments uses Ins[].VT for both ValVT and LocVT, here
// we use a special version of AnalyzeFormalArguments to pass in ValVT and		// we use a special version of AnalyzeFormalArguments to pass in ValVT and
Show All 40 Lines	if (Ins[i].Flags.isByVal()) {
unsigned FrameIdx =		unsigned FrameIdx =
MFI.CreateFixedObject(8 * NumRegs, VA.getLocMemOffset(), false);		MFI.CreateFixedObject(8 * NumRegs, VA.getLocMemOffset(), false);
SDValue FrameIdxN = DAG.getFrameIndex(FrameIdx, PtrVT);		SDValue FrameIdxN = DAG.getFrameIndex(FrameIdx, PtrVT);
InVals.push_back(FrameIdxN);		InVals.push_back(FrameIdxN);

continue;		continue;
}		}

		SDValue ArgValue;
if (VA.isRegLoc()) {		if (VA.isRegLoc()) {
// Arguments stored in registers.		// Arguments stored in registers.
EVT RegVT = VA.getLocVT();		EVT RegVT = VA.getLocVT();

SDValue ArgValue;
const TargetRegisterClass *RC;		const TargetRegisterClass *RC;

if (RegVT == MVT::i32)		if (RegVT == MVT::i32)
RC = &AArch64::GPR32RegClass;		RC = &AArch64::GPR32RegClass;
else if (RegVT == MVT::i64)		else if (RegVT == MVT::i64)
RC = &AArch64::GPR64RegClass;		RC = &AArch64::GPR64RegClass;
else if (RegVT == MVT::f16)		else if (RegVT == MVT::f16)
RC = &AArch64::FPR16RegClass;		RC = &AArch64::FPR16RegClass;
Show All 28 Lines	if (VA.isRegLoc()) {
"Only scalable vectors can be passed indirectly");		"Only scalable vectors can be passed indirectly");
llvm_unreachable("Spilling of SVE vectors not yet implemented");		llvm_unreachable("Spilling of SVE vectors not yet implemented");
case CCValAssign::BCvt:		case CCValAssign::BCvt:
ArgValue = DAG.getNode(ISD::BITCAST, DL, VA.getValVT(), ArgValue);		ArgValue = DAG.getNode(ISD::BITCAST, DL, VA.getValVT(), ArgValue);
break;		break;
case CCValAssign::AExt:		case CCValAssign::AExt:
case CCValAssign::SExt:		case CCValAssign::SExt:
case CCValAssign::ZExt:		case CCValAssign::ZExt:
// SelectionDAGBuilder will insert appropriate AssertZExt & AssertSExt		break;
// nodes after our lowering.		case CCValAssign::AExtUpper:
assert(RegVT == Ins[i].VT && "incorrect register location selected");		ArgValue = DAG.getNode(ISD::SRL, DL, RegVT, ArgValue,
		DAG.getConstant(32, DL, RegVT));
		ArgValue = DAG.getZExtOrTrunc(ArgValue, DL, VA.getValVT());
break;		break;
}		}

InVals.push_back(ArgValue);

} else { // VA.isRegLoc()		} else { // VA.isRegLoc()
assert(VA.isMemLoc() && "CCValAssign is neither reg nor mem");		assert(VA.isMemLoc() && "CCValAssign is neither reg nor mem");
unsigned ArgOffset = VA.getLocMemOffset();		unsigned ArgOffset = VA.getLocMemOffset();
unsigned ArgSize = VA.getValVT().getSizeInBits() / 8;		unsigned ArgSize = VA.getValVT().getSizeInBits() / 8;

uint32_t BEAlign = 0;		uint32_t BEAlign = 0;
if (!Subtarget->isLittleEndian() && ArgSize < 8 &&		if (!Subtarget->isLittleEndian() && ArgSize < 8 &&
!Ins[i].Flags.isInConsecutiveRegs())		!Ins[i].Flags.isInConsecutiveRegs())
BEAlign = 8 - ArgSize;		BEAlign = 8 - ArgSize;

int FI = MFI.CreateFixedObject(ArgSize, ArgOffset + BEAlign, true);		int FI = MFI.CreateFixedObject(ArgSize, ArgOffset + BEAlign, true);

// Create load nodes to retrieve arguments from the stack.		// Create load nodes to retrieve arguments from the stack.
SDValue FIN = DAG.getFrameIndex(FI, getPointerTy(DAG.getDataLayout()));		SDValue FIN = DAG.getFrameIndex(FI, getPointerTy(DAG.getDataLayout()));
SDValue ArgValue;

// For NON_EXTLOAD, generic code in getLoad assert(ValVT == MemVT)		// For NON_EXTLOAD, generic code in getLoad assert(ValVT == MemVT)
ISD::LoadExtType ExtType = ISD::NON_EXTLOAD;		ISD::LoadExtType ExtType = ISD::NON_EXTLOAD;
MVT MemVT = VA.getValVT();		MVT MemVT = VA.getValVT();

switch (VA.getLocInfo()) {		switch (VA.getLocInfo()) {
default:		default:
break;		break;
		case CCValAssign::Trunc:
case CCValAssign::BCvt:		case CCValAssign::BCvt:
MemVT = VA.getLocVT();		MemVT = VA.getLocVT();
break;		break;
case CCValAssign::Indirect:		case CCValAssign::Indirect:
assert(VA.getValVT().isScalableVector() &&		assert(VA.getValVT().isScalableVector() &&
"Only scalable vectors can be passed indirectly");		"Only scalable vectors can be passed indirectly");
llvm_unreachable("Spilling of SVE vectors not yet implemented");		llvm_unreachable("Spilling of SVE vectors not yet implemented");
case CCValAssign::SExt:		case CCValAssign::SExt:
ExtType = ISD::SEXTLOAD;		ExtType = ISD::SEXTLOAD;
break;		break;
case CCValAssign::ZExt:		case CCValAssign::ZExt:
ExtType = ISD::ZEXTLOAD;		ExtType = ISD::ZEXTLOAD;
break;		break;
case CCValAssign::AExt:		case CCValAssign::AExt:
ExtType = ISD::EXTLOAD;		ExtType = ISD::EXTLOAD;
break;		break;
}		}

ArgValue = DAG.getExtLoad(		ArgValue = DAG.getExtLoad(
ExtType, DL, VA.getLocVT(), Chain, FIN,		ExtType, DL, VA.getLocVT(), Chain, FIN,
MachinePointerInfo::getFixedStack(DAG.getMachineFunction(), FI),		MachinePointerInfo::getFixedStack(DAG.getMachineFunction(), FI),
MemVT);		MemVT);

InVals.push_back(ArgValue);
}		}
		if (Subtarget->isTargetILP32() && Ins[i].Flags.isPointer())
		ArgValue = DAG.getNode(ISD::AssertZext, DL, ArgValue.getValueType(),
		ArgValue, DAG.getValueType(MVT::i32));
		InVals.push_back(ArgValue);
}		}

// varargs		// varargs
AArch64FunctionInfo *FuncInfo = MF.getInfo<AArch64FunctionInfo>();		AArch64FunctionInfo *FuncInfo = MF.getInfo<AArch64FunctionInfo>();
if (isVarArg) {		if (isVarArg) {
if (!Subtarget->isTargetDarwin() || IsWin64) {		if (!Subtarget->isTargetDarwin() || IsWin64) {
// The AAPCS variadic function ABI is identical to the non-variadic		// The AAPCS variadic function ABI is identical to the non-variadic
// one. As a result there may be more arguments in registers and we should		// one. As a result there may be more arguments in registers and we should
// save them for future reference.		// save them for future reference.
// Win64 variadic functions also pass arguments in registers, but all float		// Win64 variadic functions also pass arguments in registers, but all float
// arguments are passed in integer registers.		// arguments are passed in integer registers.
saveVarArgRegisters(CCInfo, DAG, DL, Chain);		saveVarArgRegisters(CCInfo, DAG, DL, Chain);
}		}

// This will point to the next argument passed via stack.		// This will point to the next argument passed via stack.
unsigned StackOffset = CCInfo.getNextStackOffset();		unsigned StackOffset = CCInfo.getNextStackOffset();
// We currently pass all varargs at 8-byte alignment.		// We currently pass all varargs at 8-byte alignment, or 4 for ILP32
StackOffset = ((StackOffset + 7) & ~7);		StackOffset = alignTo(StackOffset, Subtarget->isTargetILP32() ? 4 : 8);
FuncInfo->setVarArgsStackIndex(MFI.CreateFixedObject(4, StackOffset, true));		FuncInfo->setVarArgsStackIndex(MFI.CreateFixedObject(4, StackOffset, true));
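
The change from `((StackOffset + 7) & ~7)` to `alignTo(...)` keeps the same round-up arithmetic but makes the alignment depend on the target: 8 bytes normally, 4 on ILP32. A small standalone sketch of that arithmetic follows; `alignUp` is a hypothetical stand-in restricted to power-of-two alignments, and `varArgStackStart` is an illustrative name, not a function in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an align-up helper, limited to power-of-two
// alignments so that the mask trick below is valid.
std::uint64_t alignUp(std::uint64_t value, std::uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0 && "power of two expected");
  return (value + align - 1) & ~(align - 1);
}

// Offset of the first variadic stack slot: rounded up to 4 bytes on ILP32
// and to 8 bytes otherwise, mirroring the choice made in the diff.
std::uint64_t varArgStackStart(std::uint64_t nextStackOffset, bool isILP32) {
  return alignUp(nextStackOffset, isILP32 ? 4 : 8);
}
```

For example, a next-stack offset of 12 stays at 12 under ILP32 but is rounded to 16 otherwise.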

if (MFI.hasMustTailInVarArgFunc()) {		if (MFI.hasMustTailInVarArgFunc()) {
SmallVector<MVT, 2> RegParmTypes;		SmallVector<MVT, 2> RegParmTypes;
RegParmTypes.push_back(MVT::i64);		RegParmTypes.push_back(MVT::i64);
RegParmTypes.push_back(MVT::f128);		RegParmTypes.push_back(MVT::f128);
// Compute the set of forwarded registers. The rest are scratch.		// Compute the set of forwarded registers. The rest are scratch.
SmallVectorImpl<ForwardedRegister> &Forwards =		SmallVectorImpl<ForwardedRegister> &Forwards =
▲ Show 20 Lines • Show All 146 Lines • ▼ Show 20 Lines	SDValue AArch64TargetLowering::LowerCallResult(
const SmallVectorImpl<ISD::InputArg> &Ins, const SDLoc &DL,		const SmallVectorImpl<ISD::InputArg> &Ins, const SDLoc &DL,
SelectionDAG &DAG, SmallVectorImpl<SDValue> &InVals, bool isThisReturn,		SelectionDAG &DAG, SmallVectorImpl<SDValue> &InVals, bool isThisReturn,
SDValue ThisVal) const {		SDValue ThisVal) const {
CCAssignFn *RetCC = CallConv == CallingConv::WebKit_JS		CCAssignFn *RetCC = CallConv == CallingConv::WebKit_JS
? RetCC_AArch64_WebKit_JS		? RetCC_AArch64_WebKit_JS
: RetCC_AArch64_AAPCS;		: RetCC_AArch64_AAPCS;
// Assign locations to each value returned by this call.		// Assign locations to each value returned by this call.
SmallVector<CCValAssign, 16> RVLocs;		SmallVector<CCValAssign, 16> RVLocs;
		DenseMap<unsigned, SDValue> CopiedRegs;
CCState CCInfo(CallConv, isVarArg, DAG.getMachineFunction(), RVLocs,		CCState CCInfo(CallConv, isVarArg, DAG.getMachineFunction(), RVLocs,
*DAG.getContext());		*DAG.getContext());
CCInfo.AnalyzeCallResult(Ins, RetCC);		CCInfo.AnalyzeCallResult(Ins, RetCC);

// Copy all of the result registers out of their specified physreg.		// Copy all of the result registers out of their specified physreg.
for (unsigned i = 0; i != RVLocs.size(); ++i) {		for (unsigned i = 0; i != RVLocs.size(); ++i) {
CCValAssign VA = RVLocs[i];		CCValAssign VA = RVLocs[i];

// Pass 'this' value directly from the argument to return value, to avoid		// Pass 'this' value directly from the argument to return value, to avoid
// reg unit interference		// reg unit interference
if (i == 0 && isThisReturn) {		if (i == 0 && isThisReturn) {
assert(!VA.needsCustom() && VA.getLocVT() == MVT::i64 &&		assert(!VA.needsCustom() && VA.getLocVT() == MVT::i64 &&
"unexpected return calling convention register assignment");		"unexpected return calling convention register assignment");
InVals.push_back(ThisVal);		InVals.push_back(ThisVal);
continue;		continue;
}		}

SDValue Val =		// Avoid copying a physreg twice since RegAllocFast is incompetent and only
		// allows one use of a physreg per block.
		aemerson: Why does this happen at all?
		t.p.northover (author): When lowering a return of (say) [2 x i32], the two components will get mapped to (X0, AExtUpper) and (X0, ZExt), a duplication of X0.
		SDValue Val = CopiedRegs.lookup(VA.getLocReg());
		if (!Val) {
		Val =
DAG.getCopyFromReg(Chain, DL, VA.getLocReg(), VA.getLocVT(), InFlag);		DAG.getCopyFromReg(Chain, DL, VA.getLocReg(), VA.getLocVT(), InFlag);
Chain = Val.getValue(1);		Chain = Val.getValue(1);
InFlag = Val.getValue(2);		InFlag = Val.getValue(2);
		CopiedRegs[VA.getLocReg()] = Val;
		}

switch (VA.getLocInfo()) {		switch (VA.getLocInfo()) {
default:		default:
llvm_unreachable("Unknown loc info!");		llvm_unreachable("Unknown loc info!");
case CCValAssign::Full:		case CCValAssign::Full:
break;		break;
case CCValAssign::BCvt:		case CCValAssign::BCvt:
Val = DAG.getNode(ISD::BITCAST, DL, VA.getValVT(), Val);		Val = DAG.getNode(ISD::BITCAST, DL, VA.getValVT(), Val);
break;		break;
		case CCValAssign::AExtUpper:
		Val = DAG.getNode(ISD::SRL, DL, VA.getLocVT(), Val,
		DAG.getConstant(32, DL, VA.getLocVT()));
		LLVM_FALLTHROUGH;
		case CCValAssign::AExt:
		LLVM_FALLTHROUGH;
		case CCValAssign::ZExt:
		Val = DAG.getZExtOrTrunc(Val, DL, VA.getValVT());
		break;
}		}

InVals.push_back(Val);		InVals.push_back(Val);
}		}

return Chain;		return Chain;
}		}
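
The `AExtUpper` handling above amounts to sharing one 64-bit register between two 32-bit values: the callee side shifts left by 32 to place a value in the upper half, and the receiving side shifts right by 32 and truncates to recover it. The sketch below shows that bit arithmetic with plain integers. The helper names are hypothetical, and which element of a `[2 x i32]` lands in which half is decided by the calling-convention tables, so no particular order is assumed here.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Combine two 32-bit values into one 64-bit register image. The "hi" value
// is extended and shifted left by 32, as in the AExtUpper case of LowerCall.
std::uint64_t packPair(std::uint32_t lo, std::uint32_t hi) {
  return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// Split a 64-bit register image back into its two 32-bit halves.
std::pair<std::uint32_t, std::uint32_t> unpackPair(std::uint64_t x0) {
  // Low half: plain truncation (the AExt/ZExt cases).
  std::uint32_t lo = static_cast<std::uint32_t>(x0);
  // High half: logical shift right by 32, then truncation (AExtUpper).
  std::uint32_t hi = static_cast<std::uint32_t>(x0 >> 32);
  return {lo, hi};
}

// Sanity check that packing followed by unpacking is lossless.
void checkRoundTrip(std::uint32_t lo, std::uint32_t hi) {
  std::pair<std::uint32_t, std::uint32_t> parts = unpackPair(packPair(lo, hi));
  assert(parts.first == lo && parts.second == hi);
}
```

Because both halves come from the same physical register, reading it once and deriving both values from that single copy is the plain-integer analogue of the `CopiedRegs` cache.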

▲ Show 20 Lines • Show All 296 Lines • ▼ Show 20 Lines	AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
// Adjust the stack pointer for the new arguments...		// Adjust the stack pointer for the new arguments...
// These operations are automatically eliminated by the prolog/epilog pass		// These operations are automatically eliminated by the prolog/epilog pass
if (!IsSibCall)		if (!IsSibCall)
Chain = DAG.getCALLSEQ_START(Chain, NumBytes, 0, DL);		Chain = DAG.getCALLSEQ_START(Chain, NumBytes, 0, DL);

SDValue StackPtr = DAG.getCopyFromReg(Chain, DL, AArch64::SP,		SDValue StackPtr = DAG.getCopyFromReg(Chain, DL, AArch64::SP,
getPointerTy(DAG.getDataLayout()));		getPointerTy(DAG.getDataLayout()));

SmallVector<std::pair<unsigned, SDValue>, 8> RegsToPass;		SmallVector<std::pair<unsigned, SDValue>, 8> RegsToPass;
		SmallSet<unsigned, 8> RegsUsed;
		efriedma: What's changing here? Does it make sense to add any comments?
		t.p.northover (author): We use the ability to look up whether a register has already been added (use at line 3801 in this diff). If it has, then we're trying to combine two parts of (say) a `[2 x i32]` into a single register for compatibility with armv7k IR, and need to do the bitwise arithmetic to make that happen. I don't know that a comment here would work (it would either be a historic note, or pre-empt what comes later). I'll try to do something to call it out unobtrusively at the use-point.
		aemerson: Looks like we still need some form of documentation here at the RegsUsed use?
SmallVector<SDValue, 8> MemOpChains;		SmallVector<SDValue, 8> MemOpChains;
auto PtrVT = getPointerTy(DAG.getDataLayout());		auto PtrVT = getPointerTy(DAG.getDataLayout());

if (IsVarArg && CLI.CS && CLI.CS.isMustTailCall()) {		if (IsVarArg && CLI.CS && CLI.CS.isMustTailCall()) {
const auto &Forwards = FuncInfo->getForwardedMustTailRegParms();		const auto &Forwards = FuncInfo->getForwardedMustTailRegParms();
for (const auto &F : Forwards) {		for (const auto &F : Forwards) {
SDValue Val = DAG.getCopyFromReg(Chain, DL, F.VReg, F.VT);		SDValue Val = DAG.getCopyFromReg(Chain, DL, F.VReg, F.VT);
RegsToPass.push_back(std::make_pair(unsigned(F.PReg), Val));		RegsToPass.emplace_back(F.PReg, Val);
}		}
}		}

// Walk the register/memloc assignments, inserting copies/loads.		// Walk the register/memloc assignments, inserting copies/loads.
for (unsigned i = 0, realArgIdx = 0, e = ArgLocs.size(); i != e;		for (unsigned i = 0, realArgIdx = 0, e = ArgLocs.size(); i != e;
++i, ++realArgIdx) {		++i, ++realArgIdx) {
CCValAssign &VA = ArgLocs[i];		CCValAssign &VA = ArgLocs[i];
SDValue Arg = OutVals[realArgIdx];		SDValue Arg = OutVals[realArgIdx];
Show All 14 Lines	for (unsigned i = 0, realArgIdx = 0, e = ArgLocs.size(); i != e;
case CCValAssign::AExt:		case CCValAssign::AExt:
if (Outs[realArgIdx].ArgVT == MVT::i1) {		if (Outs[realArgIdx].ArgVT == MVT::i1) {
// AAPCS requires i1 to be zero-extended to 8-bits by the caller.		// AAPCS requires i1 to be zero-extended to 8-bits by the caller.
Arg = DAG.getNode(ISD::TRUNCATE, DL, MVT::i1, Arg);		Arg = DAG.getNode(ISD::TRUNCATE, DL, MVT::i1, Arg);
Arg = DAG.getNode(ISD::ZERO_EXTEND, DL, MVT::i8, Arg);		Arg = DAG.getNode(ISD::ZERO_EXTEND, DL, MVT::i8, Arg);
}		}
Arg = DAG.getNode(ISD::ANY_EXTEND, DL, VA.getLocVT(), Arg);		Arg = DAG.getNode(ISD::ANY_EXTEND, DL, VA.getLocVT(), Arg);
break;		break;
		case CCValAssign::AExtUpper:
		assert(VA.getValVT() == MVT::i32 && "only expect 32 -> 64 upper bits");
		Arg = DAG.getNode(ISD::ANY_EXTEND, DL, VA.getLocVT(), Arg);
		Arg = DAG.getNode(ISD::SHL, DL, VA.getLocVT(), Arg,
		DAG.getConstant(32, DL, VA.getLocVT()));
		break;
case CCValAssign::BCvt:		case CCValAssign::BCvt:
Arg = DAG.getNode(ISD::BITCAST, DL, VA.getLocVT(), Arg);		Arg = DAG.getBitcast(VA.getLocVT(), Arg);
		break;
		case CCValAssign::Trunc:
		Arg = DAG.getZExtOrTrunc(Arg, DL, VA.getLocVT());
break;		break;
case CCValAssign::FPExt:		case CCValAssign::FPExt:
Arg = DAG.getNode(ISD::FP_EXTEND, DL, VA.getLocVT(), Arg);		Arg = DAG.getNode(ISD::FP_EXTEND, DL, VA.getLocVT(), Arg);
break;		break;
case CCValAssign::Indirect:		case CCValAssign::Indirect:
assert(VA.getValVT().isScalableVector() &&		assert(VA.getValVT().isScalableVector() &&
"Only scalable vectors can be passed indirectly");		"Only scalable vectors can be passed indirectly");
llvm_unreachable("Spilling of SVE vectors not yet implemented");		llvm_unreachable("Spilling of SVE vectors not yet implemented");
}		}

if (VA.isRegLoc()) {		if (VA.isRegLoc()) {
if (realArgIdx == 0 && Flags.isReturned() && !Flags.isSwiftSelf() &&		if (realArgIdx == 0 && Flags.isReturned() && !Flags.isSwiftSelf() &&
Outs[0].VT == MVT::i64) {		Outs[0].VT == MVT::i64) {
assert(VA.getLocVT() == MVT::i64 &&		assert(VA.getLocVT() == MVT::i64 &&
"unexpected calling convention register assignment");		"unexpected calling convention register assignment");
assert(!Ins.empty() && Ins[0].VT == MVT::i64 &&		assert(!Ins.empty() && Ins[0].VT == MVT::i64 &&
"unexpected use of 'returned'");		"unexpected use of 'returned'");
IsThisReturn = true;		IsThisReturn = true;
}		}
RegsToPass.push_back(std::make_pair(VA.getLocReg(), Arg));		if (RegsUsed.count(VA.getLocReg())) {
		// If this register has already been used then we're trying to pack
		// parts of an [N x i32] into an X-register. The extension type will
		// take care of putting the two halves in the right place but we have to
		// combine them.
		SDValue &Bits =
		std::find_if(RegsToPass.begin(), RegsToPass.end(),
		[=](const std::pair<unsigned, SDValue> &Elt) {
		return Elt.first == VA.getLocReg();
		})
		->second;
		Bits = DAG.getNode(ISD::OR, DL, Bits.getValueType(), Bits, Arg);
		} else {
		RegsToPass.emplace_back(VA.getLocReg(), Arg);
		RegsUsed.insert(VA.getLocReg());
		}
} else {		} else {
assert(VA.isMemLoc());		assert(VA.isMemLoc());

SDValue DstAddr;		SDValue DstAddr;
MachinePointerInfo DstInfo;		MachinePointerInfo DstInfo;

// FIXME: This works on big-endian for composite byvals, which are the		// FIXME: This works on big-endian for composite byvals, which are the
// common case. It should also work for fundamental types too.		// common case. It should also work for fundamental types too.
▲ Show 20 Lines • Show All 216 Lines • ▼ Show 20 Lines	CCAssignFn *RetCC = CallConv == CallingConv::WebKit_JS
: RetCC_AArch64_AAPCS;		: RetCC_AArch64_AAPCS;
SmallVector<CCValAssign, 16> RVLocs;		SmallVector<CCValAssign, 16> RVLocs;
CCState CCInfo(CallConv, isVarArg, DAG.getMachineFunction(), RVLocs,		CCState CCInfo(CallConv, isVarArg, DAG.getMachineFunction(), RVLocs,
*DAG.getContext());		*DAG.getContext());
CCInfo.AnalyzeReturn(Outs, RetCC);		CCInfo.AnalyzeReturn(Outs, RetCC);

// Copy the result values into the output registers.		// Copy the result values into the output registers.
SDValue Flag;		SDValue Flag;
SmallVector<SDValue, 4> RetOps(1, Chain);		SmallVector<std::pair<unsigned, SDValue>, 4> RetVals;
		SmallSet<unsigned, 4> RegsUsed;
for (unsigned i = 0, realRVLocIdx = 0; i != RVLocs.size();		for (unsigned i = 0, realRVLocIdx = 0; i != RVLocs.size();
++i, ++realRVLocIdx) {		++i, ++realRVLocIdx) {
CCValAssign &VA = RVLocs[i];		CCValAssign &VA = RVLocs[i];
assert(VA.isRegLoc() && "Can only return in registers!");		assert(VA.isRegLoc() && "Can only return in registers!");
SDValue Arg = OutVals[realRVLocIdx];		SDValue Arg = OutVals[realRVLocIdx];

switch (VA.getLocInfo()) {		switch (VA.getLocInfo()) {
default:		default:
llvm_unreachable("Unknown loc info!");		llvm_unreachable("Unknown loc info!");
case CCValAssign::Full:		case CCValAssign::Full:
if (Outs[i].ArgVT == MVT::i1) {		if (Outs[i].ArgVT == MVT::i1) {
// AAPCS requires i1 to be zero-extended to i8 by the producer of the		// AAPCS requires i1 to be zero-extended to i8 by the producer of the
// value. This is strictly redundant on Darwin (which uses "zeroext		// value. This is strictly redundant on Darwin (which uses "zeroext
// i1"), but will be optimised out before ISel.		// i1"), but will be optimised out before ISel.
Arg = DAG.getNode(ISD::TRUNCATE, DL, MVT::i1, Arg);		Arg = DAG.getNode(ISD::TRUNCATE, DL, MVT::i1, Arg);
Arg = DAG.getNode(ISD::ZERO_EXTEND, DL, VA.getLocVT(), Arg);		Arg = DAG.getNode(ISD::ZERO_EXTEND, DL, VA.getLocVT(), Arg);
}		}
break;		break;
case CCValAssign::BCvt:		case CCValAssign::BCvt:
Arg = DAG.getNode(ISD::BITCAST, DL, VA.getLocVT(), Arg);		Arg = DAG.getNode(ISD::BITCAST, DL, VA.getLocVT(), Arg);
break;		break;
		case CCValAssign::AExt:
		case CCValAssign::ZExt:
		Arg = DAG.getZExtOrTrunc(Arg, DL, VA.getLocVT());
		break;
		case CCValAssign::AExtUpper:
		assert(VA.getValVT() == MVT::i32 && "only expect 32 -> 64 upper bits");
		Arg = DAG.getZExtOrTrunc(Arg, DL, VA.getLocVT());
		Arg = DAG.getNode(ISD::SHL, DL, VA.getLocVT(), Arg,
		DAG.getConstant(32, DL, VA.getLocVT()));
		break;
		}

		if (RegsUsed.count(VA.getLocReg())) {
		SDValue &Bits =
		std::find_if(RetVals.begin(), RetVals.end(),
		[=](const std::pair<unsigned, SDValue> &Elt) {
		return Elt.first == VA.getLocReg();
		})
		->second;
		Bits = DAG.getNode(ISD::OR, DL, Bits.getValueType(), Bits, Arg);
		} else {
		RetVals.emplace_back(VA.getLocReg(), Arg);
		RegsUsed.insert(VA.getLocReg());
		}
}		}

Chain = DAG.getCopyToReg(Chain, DL, VA.getLocReg(), Arg, Flag);		SmallVector<SDValue, 4> RetOps(1, Chain);
		for (auto &RetVal : RetVals) {
		Chain = DAG.getCopyToReg(Chain, DL, RetVal.first, RetVal.second, Flag);
Flag = Chain.getValue(1);		Flag = Chain.getValue(1);
RetOps.push_back(DAG.getRegister(VA.getLocReg(), VA.getLocVT()));		RetOps.push_back(
		DAG.getRegister(RetVal.first, RetVal.second.getValueType()));
}		}

// Windows AArch64 ABIs require that for returning structs by value we copy		// Windows AArch64 ABIs require that for returning structs by value we copy
// the sret argument into X0 for the return.		// the sret argument into X0 for the return.
// We saved the argument into a virtual register in the entry block,		// We saved the argument into a virtual register in the entry block,
// so now we copy the value out and into X0.		// so now we copy the value out and into X0.
if (unsigned SRetReg = FuncInfo->getSRetReturnReg()) {		if (unsigned SRetReg = FuncInfo->getSRetReturnReg()) {
SDValue Val = DAG.getCopyFromReg(RetOps[0], DL, SRetReg,		SDValue Val = DAG.getCopyFromReg(RetOps[0], DL, SRetReg,
▲ Show 20 Lines • Show All 177 Lines • ▼ Show 20 Lines
SDValue		SDValue
AArch64TargetLowering::LowerDarwinGlobalTLSAddress(SDValue Op,		AArch64TargetLowering::LowerDarwinGlobalTLSAddress(SDValue Op,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {
assert(Subtarget->isTargetDarwin() &&		assert(Subtarget->isTargetDarwin() &&
"This function expects a Darwin target");		"This function expects a Darwin target");

SDLoc DL(Op);		SDLoc DL(Op);
MVT PtrVT = getPointerTy(DAG.getDataLayout());		MVT PtrVT = getPointerTy(DAG.getDataLayout());
		MVT PtrMemVT = getPointerMemTy(DAG.getDataLayout());
const GlobalValue *GV = cast<GlobalAddressSDNode>(Op)->getGlobal();		const GlobalValue *GV = cast<GlobalAddressSDNode>(Op)->getGlobal();

SDValue TLVPAddr =		SDValue TLVPAddr =
DAG.getTargetGlobalAddress(GV, DL, PtrVT, 0, AArch64II::MO_TLS);		DAG.getTargetGlobalAddress(GV, DL, PtrVT, 0, AArch64II::MO_TLS);
SDValue DescAddr = DAG.getNode(AArch64ISD::LOADgot, DL, PtrVT, TLVPAddr);		SDValue DescAddr = DAG.getNode(AArch64ISD::LOADgot, DL, PtrVT, TLVPAddr);

// The first entry in the descriptor is a function pointer that we must call		// The first entry in the descriptor is a function pointer that we must call
// to obtain the address of the variable.		// to obtain the address of the variable.
SDValue Chain = DAG.getEntryNode();		SDValue Chain = DAG.getEntryNode();
SDValue FuncTLVGet = DAG.getLoad(		SDValue FuncTLVGet = DAG.getLoad(
MVT::i64, DL, Chain, DescAddr,		PtrMemVT, DL, Chain, DescAddr,
MachinePointerInfo::getGOT(DAG.getMachineFunction()),		MachinePointerInfo::getGOT(DAG.getMachineFunction()),
/* Alignment = */ 8,		/* Alignment = */ PtrMemVT.getSizeInBits() / 8,
MachineMemOperand::MOInvariant | MachineMemOperand::MODereferenceable);		MachineMemOperand::MOInvariant | MachineMemOperand::MODereferenceable);
Chain = FuncTLVGet.getValue(1);		Chain = FuncTLVGet.getValue(1);

		// Extend loaded pointer if necessary (i.e. if ILP32) to DAG pointer.
		FuncTLVGet = DAG.getZExtOrTrunc(FuncTLVGet, DL, PtrVT);

MachineFrameInfo &MFI = DAG.getMachineFunction().getFrameInfo();		MachineFrameInfo &MFI = DAG.getMachineFunction().getFrameInfo();
MFI.setAdjustsStack(true);		MFI.setAdjustsStack(true);

// TLS calls preserve all registers except those that absolutely must be		// TLS calls preserve all registers except those that absolutely must be
// trashed: X0 (it takes an argument), LR (it's a call) and NZCV (let's not be		// trashed: X0 (it takes an argument), LR (it's a call) and NZCV (let's not be
// silly).		// silly).
const AArch64RegisterInfo *TRI = Subtarget->getRegisterInfo();		const AArch64RegisterInfo *TRI = Subtarget->getRegisterInfo();
const uint32_t *Mask = TRI->getTLSCallPreservedMask();		const uint32_t *Mask = TRI->getTLSCallPreservedMask();
▲ Show 20 Lines • Show All 859 Lines • ▼ Show 20 Lines
SDValue AArch64TargetLowering::LowerDarwin_VASTART(SDValue Op,		SDValue AArch64TargetLowering::LowerDarwin_VASTART(SDValue Op,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {
AArch64FunctionInfo *FuncInfo =		AArch64FunctionInfo *FuncInfo =
DAG.getMachineFunction().getInfo<AArch64FunctionInfo>();		DAG.getMachineFunction().getInfo<AArch64FunctionInfo>();

SDLoc DL(Op);		SDLoc DL(Op);
SDValue FR = DAG.getFrameIndex(FuncInfo->getVarArgsStackIndex(),		SDValue FR = DAG.getFrameIndex(FuncInfo->getVarArgsStackIndex(),
getPointerTy(DAG.getDataLayout()));		getPointerTy(DAG.getDataLayout()));
		FR = DAG.getZExtOrTrunc(FR, DL, getPointerMemTy(DAG.getDataLayout()));
const Value *SV = cast<SrcValueSDNode>(Op.getOperand(2))->getValue();		const Value *SV = cast<SrcValueSDNode>(Op.getOperand(2))->getValue();
return DAG.getStore(Op.getOperand(0), DL, FR, Op.getOperand(1),		return DAG.getStore(Op.getOperand(0), DL, FR, Op.getOperand(1),
MachinePointerInfo(SV));		MachinePointerInfo(SV));
}		}

SDValue AArch64TargetLowering::LowerWin64_VASTART(SDValue Op,		SDValue AArch64TargetLowering::LowerWin64_VASTART(SDValue Op,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {
AArch64FunctionInfo *FuncInfo =		AArch64FunctionInfo *FuncInfo =
▲ Show 20 Lines • Show All 90 Lines • ▼ Show 20 Lines	else
return LowerAAPCS_VASTART(Op, DAG);		return LowerAAPCS_VASTART(Op, DAG);
}		}

SDValue AArch64TargetLowering::LowerVACOPY(SDValue Op,		SDValue AArch64TargetLowering::LowerVACOPY(SDValue Op,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {
// AAPCS has three pointers and two ints (= 32 bytes), Darwin has single		// AAPCS has three pointers and two ints (= 32 bytes), Darwin has single
// pointer.		// pointer.
SDLoc DL(Op);		SDLoc DL(Op);
unsigned VaListSize =		unsigned PtrSize = Subtarget->isTargetILP32() ? 4 : 8;
Subtarget->isTargetDarwin() || Subtarget->isTargetWindows() ? 8 : 32;		unsigned VaListSize = (Subtarget->isTargetDarwin() ||
		Subtarget->isTargetWindows()) ? PtrSize : 32;
const Value *DestSV = cast<SrcValueSDNode>(Op.getOperand(3))->getValue();		const Value *DestSV = cast<SrcValueSDNode>(Op.getOperand(3))->getValue();
const Value *SrcSV = cast<SrcValueSDNode>(Op.getOperand(4))->getValue();		const Value *SrcSV = cast<SrcValueSDNode>(Op.getOperand(4))->getValue();

return DAG.getMemcpy(Op.getOperand(0), DL, Op.getOperand(1),		return DAG.getMemcpy(Op.getOperand(0), DL, Op.getOperand(1), Op.getOperand(2),
Op.getOperand(2),		DAG.getConstant(VaListSize, DL, MVT::i32), PtrSize,
DAG.getConstant(VaListSize, DL, MVT::i32),		false, false, false, MachinePointerInfo(DestSV),
8, false, false, false, MachinePointerInfo(DestSV),
MachinePointerInfo(SrcSV));		MachinePointerInfo(SrcSV));
}		}

SDValue AArch64TargetLowering::LowerVAARG(SDValue Op, SelectionDAG &DAG) const {		SDValue AArch64TargetLowering::LowerVAARG(SDValue Op, SelectionDAG &DAG) const {
assert(Subtarget->isTargetDarwin() &&		assert(Subtarget->isTargetDarwin() &&
"automatic va_arg instruction only works on Darwin");		"automatic va_arg instruction only works on Darwin");

const Value *V = cast<SrcValueSDNode>(Op.getOperand(2))->getValue();		const Value *V = cast<SrcValueSDNode>(Op.getOperand(2))->getValue();
EVT VT = Op.getValueType();		EVT VT = Op.getValueType();
SDLoc DL(Op);		SDLoc DL(Op);
SDValue Chain = Op.getOperand(0);		SDValue Chain = Op.getOperand(0);
SDValue Addr = Op.getOperand(1);		SDValue Addr = Op.getOperand(1);
unsigned Align = Op.getConstantOperandVal(3);		unsigned Align = Op.getConstantOperandVal(3);
		unsigned MinSlotSize = Subtarget->isTargetILP32() ? 4 : 8;
auto PtrVT = getPointerTy(DAG.getDataLayout());		auto PtrVT = getPointerTy(DAG.getDataLayout());
		auto PtrMemVT = getPointerMemTy(DAG.getDataLayout());
SDValue VAList = DAG.getLoad(PtrVT, DL, Chain, Addr, MachinePointerInfo(V));		SDValue VAList =
		DAG.getLoad(PtrMemVT, DL, Chain, Addr, MachinePointerInfo(V));
Chain = VAList.getValue(1);		Chain = VAList.getValue(1);
		VAList = DAG.getZExtOrTrunc(VAList, DL, PtrVT);

if (Align > 8) {		if (Align > MinSlotSize) {
assert(((Align & (Align - 1)) == 0) && "Expected Align to be a power of 2");		assert(((Align & (Align - 1)) == 0) && "Expected Align to be a power of 2");
VAList = DAG.getNode(ISD::ADD, DL, PtrVT, VAList,		VAList = DAG.getNode(ISD::ADD, DL, PtrVT, VAList,
DAG.getConstant(Align - 1, DL, PtrVT));		DAG.getConstant(Align - 1, DL, PtrVT));
VAList = DAG.getNode(ISD::AND, DL, PtrVT, VAList,		VAList = DAG.getNode(ISD::AND, DL, PtrVT, VAList,
DAG.getConstant(-(int64_t)Align, DL, PtrVT));		DAG.getConstant(-(int64_t)Align, DL, PtrVT));
}		}

Type ArgTy = VT.getTypeForEVT(DAG.getContext());		Type ArgTy = VT.getTypeForEVT(DAG.getContext());
uint64_t ArgSize = DAG.getDataLayout().getTypeAllocSize(ArgTy);		unsigned ArgSize = DAG.getDataLayout().getTypeAllocSize(ArgTy);

// Scalar integer and FP values smaller than 64 bits are implicitly extended		// Scalar integer and FP values smaller than 64 bits are implicitly extended
// up to 64 bits. At the very least, we have to increase the striding of the		// up to 64 bits. At the very least, we have to increase the striding of the
// vaargs list to match this, and for FP values we need to introduce		// vaargs list to match this, and for FP values we need to introduce
// FP_ROUND nodes as well.		// FP_ROUND nodes as well.
if (VT.isInteger() && !VT.isVector())		if (VT.isInteger() && !VT.isVector())
ArgSize = 8;		ArgSize = std::max(ArgSize, MinSlotSize);
bool NeedFPTrunc = false;		bool NeedFPTrunc = false;
if (VT.isFloatingPoint() && !VT.isVector() && VT != MVT::f64) {		if (VT.isFloatingPoint() && !VT.isVector() && VT != MVT::f64) {
ArgSize = 8;		ArgSize = 8;
NeedFPTrunc = true;		NeedFPTrunc = true;
}		}

// Increment the pointer, VAList, to the next vaarg		// Increment the pointer, VAList, to the next vaarg
SDValue VANext = DAG.getNode(ISD::ADD, DL, PtrVT, VAList,		SDValue VANext = DAG.getNode(ISD::ADD, DL, PtrVT, VAList,
DAG.getConstant(ArgSize, DL, PtrVT));		DAG.getConstant(ArgSize, DL, PtrVT));
		VANext = DAG.getZExtOrTrunc(VANext, DL, PtrMemVT);

// Store the incremented VAList to the legalized pointer		// Store the incremented VAList to the legalized pointer
SDValue APStore =		SDValue APStore =
DAG.getStore(Chain, DL, VANext, Addr, MachinePointerInfo(V));		DAG.getStore(Chain, DL, VANext, Addr, MachinePointerInfo(V));

// Load the actual argument out of the pointer VAList		// Load the actual argument out of the pointer VAList
if (NeedFPTrunc) {		if (NeedFPTrunc) {
// Load the value as an f64.		// Load the value as an f64.
SDValue WideFP =		SDValue WideFP =
Show All 13 Lines	SDValue AArch64TargetLowering::LowerFRAMEADDR(SDValue Op,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {
MachineFrameInfo &MFI = DAG.getMachineFunction().getFrameInfo();		MachineFrameInfo &MFI = DAG.getMachineFunction().getFrameInfo();
MFI.setFrameAddressIsTaken(true);		MFI.setFrameAddressIsTaken(true);

EVT VT = Op.getValueType();		EVT VT = Op.getValueType();
SDLoc DL(Op);		SDLoc DL(Op);
unsigned Depth = cast<ConstantSDNode>(Op.getOperand(0))->getZExtValue();		unsigned Depth = cast<ConstantSDNode>(Op.getOperand(0))->getZExtValue();
SDValue FrameAddr =		SDValue FrameAddr =
DAG.getCopyFromReg(DAG.getEntryNode(), DL, AArch64::FP, VT);		DAG.getCopyFromReg(DAG.getEntryNode(), DL, AArch64::FP, MVT::i64);
while (Depth--)		while (Depth--)
FrameAddr = DAG.getLoad(VT, DL, DAG.getEntryNode(), FrameAddr,		FrameAddr = DAG.getLoad(VT, DL, DAG.getEntryNode(), FrameAddr,
MachinePointerInfo());		MachinePointerInfo());

		if (Subtarget->isTargetILP32())
		FrameAddr = DAG.getNode(ISD::AssertZext, DL, MVT::i64, FrameAddr,
		DAG.getValueType(VT));

return FrameAddr;		return FrameAddr;
}		}

SDValue AArch64TargetLowering::LowerSPONENTRY(SDValue Op,		SDValue AArch64TargetLowering::LowerSPONENTRY(SDValue Op,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {
MachineFrameInfo &MFI = DAG.getMachineFunction().getFrameInfo();		MachineFrameInfo &MFI = DAG.getMachineFunction().getFrameInfo();

EVT VT = getPointerTy(DAG.getDataLayout());		EVT VT = getPointerTy(DAG.getDataLayout());
▲ Show 20 Lines • Show All 6,848 Lines • Show Last 20 Lines
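The pointer arithmetic in the `LowerVAARG` hunk above can be followed on plain integers. This is a rough sketch under stated assumptions, not code from the patch: `VaArgStep` and `stepVaList` are hypothetical names, the boolean flags stand in for the `VT` queries, and the final mask stands in for the `getZExtOrTrunc` to the in-memory pointer type on ILP32.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical result type: where the argument is read and where va_list
// points afterwards.
struct VaArgStep {
  uint64_t ArgAddr;
  uint64_t NextAddr;
};

// Hypothetical model of the slot computation: realign if the argument needs
// more than the minimum slot, widen small scalars, then advance the pointer.
inline VaArgStep stepVaList(uint64_t VAList, unsigned Align, unsigned ArgSize,
                            bool IsScalarInt, bool IsNonF64ScalarFP,
                            bool IsILP32) {
  const unsigned MinSlotSize = IsILP32 ? 4 : 8;
  if (Align > MinSlotSize) {
    assert((Align & (Align - 1)) == 0 && "Expected Align to be a power of 2");
    // Same effect as adding Align - 1 and masking with -(int64_t)Align.
    VAList = (VAList + (Align - 1)) & ~static_cast<uint64_t>(Align - 1);
  }
  if (IsScalarInt)
    ArgSize = std::max(ArgSize, MinSlotSize);
  if (IsNonF64ScalarFP)
    ArgSize = 8; // passed as an f64 and rounded back afterwards
  uint64_t Next = VAList + ArgSize;
  if (IsILP32)
    Next &= 0xFFFFFFFFull; // stored back as a 32-bit pointer
  return {VAList, Next};
}
```

For an `i32` argument starting at `0x1000`, this model advances by 4 bytes when `IsILP32` is true and by 8 bytes otherwise, which is the difference the `MinSlotSize` change introduces.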

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp

Show First 20 Lines • Show All 1,465 Lines • ▼ Show 20 Lines
}		}

bool AArch64InstrInfo::expandPostRAPseudo(MachineInstr &MI) const {		bool AArch64InstrInfo::expandPostRAPseudo(MachineInstr &MI) const {
if (MI.getOpcode() != TargetOpcode::LOAD_STACK_GUARD &&		if (MI.getOpcode() != TargetOpcode::LOAD_STACK_GUARD &&
MI.getOpcode() != AArch64::CATCHRET)		MI.getOpcode() != AArch64::CATCHRET)
return false;		return false;

MachineBasicBlock &MBB = *MI.getParent();		MachineBasicBlock &MBB = *MI.getParent();
		auto &Subtarget = MBB.getParent()->getSubtarget<AArch64Subtarget>();
		auto TRI = Subtarget.getRegisterInfo();
DebugLoc DL = MI.getDebugLoc();		DebugLoc DL = MI.getDebugLoc();

if (MI.getOpcode() == AArch64::CATCHRET) {		if (MI.getOpcode() == AArch64::CATCHRET) {
// Skip to the first instruction before the epilog.		// Skip to the first instruction before the epilog.
const TargetInstrInfo *TII =		const TargetInstrInfo *TII =
MBB.getParent()->getSubtarget().getInstrInfo();		MBB.getParent()->getSubtarget().getInstrInfo();
MachineBasicBlock *TargetMBB = MI.getOperand(0).getMBB();		MachineBasicBlock *TargetMBB = MI.getOperand(0).getMBB();
auto MBBI = MachineBasicBlock::iterator(MI);		auto MBBI = MachineBasicBlock::iterator(MI);
Show All 19 Lines	const GlobalValue *GV =
cast<GlobalValue>((*MI.memoperands_begin())->getValue());		cast<GlobalValue>((*MI.memoperands_begin())->getValue());
const TargetMachine &TM = MBB.getParent()->getTarget();		const TargetMachine &TM = MBB.getParent()->getTarget();
unsigned OpFlags = Subtarget.ClassifyGlobalReference(GV, TM);		unsigned OpFlags = Subtarget.ClassifyGlobalReference(GV, TM);
const unsigned char MO_NC = AArch64II::MO_NC;		const unsigned char MO_NC = AArch64II::MO_NC;

if ((OpFlags & AArch64II::MO_GOT) != 0) {		if ((OpFlags & AArch64II::MO_GOT) != 0) {
BuildMI(MBB, MI, DL, get(AArch64::LOADgot), Reg)		BuildMI(MBB, MI, DL, get(AArch64::LOADgot), Reg)
.addGlobalAddress(GV, 0, OpFlags);		.addGlobalAddress(GV, 0, OpFlags);
		if (Subtarget.isTargetILP32()) {
		unsigned Reg32 = TRI->getSubReg(Reg, AArch64::sub_32);
		BuildMI(MBB, MI, DL, get(AArch64::LDRWui))
		.addDef(Reg32, RegState::Dead)
		.addUse(Reg, RegState::Kill)
		.addImm(0)
		.addMemOperand(*MI.memoperands_begin())
		.addDef(Reg, RegState::Implicit);
		} else {
BuildMI(MBB, MI, DL, get(AArch64::LDRXui), Reg)		BuildMI(MBB, MI, DL, get(AArch64::LDRXui), Reg)
.addReg(Reg, RegState::Kill)		.addReg(Reg, RegState::Kill)
.addImm(0)		.addImm(0)
.addMemOperand(*MI.memoperands_begin());		.addMemOperand(*MI.memoperands_begin());
		}
} else if (TM.getCodeModel() == CodeModel::Large) {		} else if (TM.getCodeModel() == CodeModel::Large) {
		assert(!Subtarget.isTargetILP32() && "how can large exist in ILP32?");
BuildMI(MBB, MI, DL, get(AArch64::MOVZXi), Reg)		BuildMI(MBB, MI, DL, get(AArch64::MOVZXi), Reg)
.addGlobalAddress(GV, 0, AArch64II::MO_G0 | MO_NC)		.addGlobalAddress(GV, 0, AArch64II::MO_G0 | MO_NC)
.addImm(0);		.addImm(0);
BuildMI(MBB, MI, DL, get(AArch64::MOVKXi), Reg)		BuildMI(MBB, MI, DL, get(AArch64::MOVKXi), Reg)
.addReg(Reg, RegState::Kill)		.addReg(Reg, RegState::Kill)
.addGlobalAddress(GV, 0, AArch64II::MO_G1 | MO_NC)		.addGlobalAddress(GV, 0, AArch64II::MO_G1 | MO_NC)
.addImm(16);		.addImm(16);
BuildMI(MBB, MI, DL, get(AArch64::MOVKXi), Reg)		BuildMI(MBB, MI, DL, get(AArch64::MOVKXi), Reg)
Show All 10 Lines	BuildMI(MBB, MI, DL, get(AArch64::LDRXui), Reg)
.addMemOperand(*MI.memoperands_begin());		.addMemOperand(*MI.memoperands_begin());
} else if (TM.getCodeModel() == CodeModel::Tiny) {		} else if (TM.getCodeModel() == CodeModel::Tiny) {
BuildMI(MBB, MI, DL, get(AArch64::ADR), Reg)		BuildMI(MBB, MI, DL, get(AArch64::ADR), Reg)
.addGlobalAddress(GV, 0, OpFlags);		.addGlobalAddress(GV, 0, OpFlags);
} else {		} else {
BuildMI(MBB, MI, DL, get(AArch64::ADRP), Reg)		BuildMI(MBB, MI, DL, get(AArch64::ADRP), Reg)
.addGlobalAddress(GV, 0, OpFlags | AArch64II::MO_PAGE);		.addGlobalAddress(GV, 0, OpFlags | AArch64II::MO_PAGE);
unsigned char LoFlags = OpFlags | AArch64II::MO_PAGEOFF | MO_NC;		unsigned char LoFlags = OpFlags | AArch64II::MO_PAGEOFF | MO_NC;
		if (Subtarget.isTargetILP32()) {
		unsigned Reg32 = TRI->getSubReg(Reg, AArch64::sub_32);
		BuildMI(MBB, MI, DL, get(AArch64::LDRWui))
		.addDef(Reg32, RegState::Dead)
		.addUse(Reg, RegState::Kill)
		.addGlobalAddress(GV, 0, LoFlags)
		.addMemOperand(*MI.memoperands_begin())
		.addDef(Reg, RegState::Implicit);
		} else {
BuildMI(MBB, MI, DL, get(AArch64::LDRXui), Reg)		BuildMI(MBB, MI, DL, get(AArch64::LDRXui), Reg)
.addReg(Reg, RegState::Kill)		.addReg(Reg, RegState::Kill)
.addGlobalAddress(GV, 0, LoFlags)		.addGlobalAddress(GV, 0, LoFlags)
.addMemOperand(*MI.memoperands_begin());		.addMemOperand(*MI.memoperands_begin());
}		}
		}

MBB.erase(MI);		MBB.erase(MI);

return true;		return true;
}		}

// Return true if this instruction simply sets its single destination register		// Return true if this instruction simply sets its single destination register
// to zero. This is equivalent to a register rename of the zero-register.		// to zero. This is equivalent to a register rename of the zero-register.
▲ Show 20 Lines • Show All 4,269 Lines • Show Last 20 Lines

llvm/lib/Target/AArch64/AArch64SelectionDAGInfo.cpp

Show All 26 Lines	SDValue AArch64SelectionDAGInfo::EmitTargetCodeForMemset(
const char *bzeroName = (V && V->isNullValue())		const char *bzeroName = (V && V->isNullValue())
? DAG.getTargetLoweringInfo().getLibcallName(RTLIB::BZERO) : nullptr;		? DAG.getTargetLoweringInfo().getLibcallName(RTLIB::BZERO) : nullptr;
// For small size (< 256), it is not beneficial to use bzero		// For small size (< 256), it is not beneficial to use bzero
// instead of memset.		// instead of memset.
if (bzeroName && (!SizeValue || SizeValue->getZExtValue() > 256)) {		if (bzeroName && (!SizeValue || SizeValue->getZExtValue() > 256)) {
const AArch64TargetLowering &TLI = *STI.getTargetLowering();		const AArch64TargetLowering &TLI = *STI.getTargetLowering();

EVT IntPtr = TLI.getPointerTy(DAG.getDataLayout());		EVT IntPtr = TLI.getPointerTy(DAG.getDataLayout());
Type IntPtrTy = DAG.getDataLayout().getIntPtrType(DAG.getContext());		Type IntPtrTy = Type::getInt8PtrTy(DAG.getContext());
TargetLowering::ArgListTy Args;		TargetLowering::ArgListTy Args;
TargetLowering::ArgListEntry Entry;		TargetLowering::ArgListEntry Entry;
Entry.Node = Dst;		Entry.Node = Dst;
Entry.Ty = IntPtrTy;		Entry.Ty = IntPtrTy;
Args.push_back(Entry);		Args.push_back(Entry);
Entry.Node = Size;		Entry.Node = Size;
Args.push_back(Entry);		Args.push_back(Entry);
TargetLowering::CallLoweringInfo CLI(DAG);		TargetLowering::CallLoweringInfo CLI(DAG);
▲ Show 20 Lines • Show All 103 Lines • Show Last 20 Lines

llvm/lib/Target/AArch64/AArch64Subtarget.h

Show First 20 Lines • Show All 405 Lines • ▼ Show 20 Lines	public:
bool isTargetWindows() const { return TargetTriple.isOSWindows(); }		bool isTargetWindows() const { return TargetTriple.isOSWindows(); }
bool isTargetAndroid() const { return TargetTriple.isAndroid(); }		bool isTargetAndroid() const { return TargetTriple.isAndroid(); }
bool isTargetFuchsia() const { return TargetTriple.isOSFuchsia(); }		bool isTargetFuchsia() const { return TargetTriple.isOSFuchsia(); }

bool isTargetCOFF() const { return TargetTriple.isOSBinFormatCOFF(); }		bool isTargetCOFF() const { return TargetTriple.isOSBinFormatCOFF(); }
bool isTargetELF() const { return TargetTriple.isOSBinFormatELF(); }		bool isTargetELF() const { return TargetTriple.isOSBinFormatELF(); }
bool isTargetMachO() const { return TargetTriple.isOSBinFormatMachO(); }		bool isTargetMachO() const { return TargetTriple.isOSBinFormatMachO(); }

		bool isTargetILP32() const { return TargetTriple.isArch32Bit(); }

bool useAA() const override { return UseAA; }		bool useAA() const override { return UseAA; }

bool hasVH() const { return HasVH; }		bool hasVH() const { return HasVH; }
bool hasPAN() const { return HasPAN; }		bool hasPAN() const { return HasPAN; }
bool hasLOR() const { return HasLOR; }		bool hasLOR() const { return HasLOR; }

bool hasPsUAO() const { return HasPsUAO; }		bool hasPsUAO() const { return HasPsUAO; }
bool hasPAN_RWV() const { return HasPAN_RWV; }		bool hasPAN_RWV() const { return HasPAN_RWV; }
Show All 10 Lines	public:
bool hasDIT() const { return HasDIT; }		bool hasDIT() const { return HasDIT; }
bool hasTRACEV8_4() const { return HasTRACEV8_4; }		bool hasTRACEV8_4() const { return HasTRACEV8_4; }
bool hasAM() const { return HasAM; }		bool hasAM() const { return HasAM; }
bool hasSEL2() const { return HasSEL2; }		bool hasSEL2() const { return HasSEL2; }
bool hasTLB_RMI() const { return HasTLB_RMI; }		bool hasTLB_RMI() const { return HasTLB_RMI; }
bool hasFMI() const { return HasFMI; }		bool hasFMI() const { return HasFMI; }
bool hasRCPC_IMMO() const { return HasRCPC_IMMO; }		bool hasRCPC_IMMO() const { return HasRCPC_IMMO; }

		bool addrSinkUsingGEPs() const override {
		AlexDenisov: When I compile LLVM with this patch applied I'm getting the error:
		AArch64Subtarget.h:430:34: error: only virtual member functions can be marked 'override'
		bool addrSinkUsingGEPs() const override {
		^~~~~~~~~
		Removing the override keyword fixes it, but I'm curious where it comes from? I cannot see any usage of this method across the code base.
		// Keeping GEPs inbounds is important for exploiting AArch64
		// addressing-modes in ILP32 mode.
		return useAA() || isTargetILP32();
		}

bool useSmallAddressing() const {		bool useSmallAddressing() const {
switch (TLInfo.getTargetMachine().getCodeModel()) {		switch (TLInfo.getTargetMachine().getCodeModel()) {
case CodeModel::Kernel:		case CodeModel::Kernel:
// Kernel is currently allowed only for Fuchsia targets,		// Kernel is currently allowed only for Fuchsia targets,
// where it is the same as Small for almost all purposes.		// where it is the same as Small for almost all purposes.
case CodeModel::Small:		case CodeModel::Small:
return true;		return true;
default:		default:
▲ Show 20 Lines • Show All 41 Lines • Show Last 20 Lines

llvm/lib/Target/AArch64/AArch64TargetMachine.cpp

Show First 20 Lines • Show All 151 Lines • ▼ Show 20 Lines	EnableBranchTargets("aarch64-enable-branch-targets", cl::Hidden,
cl::desc("Enable the AAcrh64 branch target pass"),		cl::desc("Enable the AAcrh64 branch target pass"),
cl::init(true));		cl::init(true));

extern "C" void LLVMInitializeAArch64Target() {		extern "C" void LLVMInitializeAArch64Target() {
// Register the target.		// Register the target.
RegisterTargetMachine<AArch64leTargetMachine> X(getTheAArch64leTarget());		RegisterTargetMachine<AArch64leTargetMachine> X(getTheAArch64leTarget());
RegisterTargetMachine<AArch64beTargetMachine> Y(getTheAArch64beTarget());		RegisterTargetMachine<AArch64beTargetMachine> Y(getTheAArch64beTarget());
RegisterTargetMachine<AArch64leTargetMachine> Z(getTheARM64Target());		RegisterTargetMachine<AArch64leTargetMachine> Z(getTheARM64Target());
		RegisterTargetMachine<AArch64leTargetMachine> W(getTheARM64_32Target());
		RegisterTargetMachine<AArch64leTargetMachine> V(getTheAArch64_32Target());
auto PR = PassRegistry::getPassRegistry();		auto PR = PassRegistry::getPassRegistry();
initializeGlobalISel(*PR);		initializeGlobalISel(*PR);
initializeAArch64A53Fix835769Pass(*PR);		initializeAArch64A53Fix835769Pass(*PR);
initializeAArch64A57FPLoadBalancingPass(*PR);		initializeAArch64A57FPLoadBalancingPass(*PR);
initializeAArch64AdvSIMDScalarPass(*PR);		initializeAArch64AdvSIMDScalarPass(*PR);
initializeAArch64BranchTargetsPass(*PR);		initializeAArch64BranchTargetsPass(*PR);
initializeAArch64CollectLOHPass(*PR);		initializeAArch64CollectLOHPass(*PR);
initializeAArch64CompressJumpTablesPass(*PR);		initializeAArch64CompressJumpTablesPass(*PR);
Show All 28 Lines
}		}

// Helper function to build a DataLayout string		// Helper function to build a DataLayout string
static std::string computeDataLayout(const Triple &TT,		static std::string computeDataLayout(const Triple &TT,
const MCTargetOptions &Options,		const MCTargetOptions &Options,
bool LittleEndian) {		bool LittleEndian) {
if (Options.getABIName() == "ilp32")		if (Options.getABIName() == "ilp32")
return "e-m:e-p:32:32-i8:8-i16:16-i64:64-S128";		return "e-m:e-p:32:32-i8:8-i16:16-i64:64-S128";
if (TT.isOSBinFormatMachO())		if (TT.isOSBinFormatMachO()) {
		if (TT.getArch() == Triple::aarch64_32)
		return "e-m:o-p:32:32-i64:64-i128:128-n32:64-S128";
return "e-m:o-i64:64-i128:128-n32:64-S128";		return "e-m:o-i64:64-i128:128-n32:64-S128";
		}
if (TT.isOSBinFormatCOFF())		if (TT.isOSBinFormatCOFF())
return "e-m:w-p:64:64-i32:32-i64:64-i128:128-n32:64-S128";		return "e-m:w-p:64:64-i32:32-i64:64-i128:128-n32:64-S128";
if (LittleEndian)		if (LittleEndian)
return "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128";		return "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128";
return "E-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128";		return "E-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128";
}		}
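
Reviewer note: the only difference between the two Mach-O layout strings above is the `p:32:32` pointer spec. The sketch below, which is not part of the patch, pulls the pointer width out of a layout string to make that visible. `pointerBitsFromLayout` is a hypothetical helper, not an LLVM API. It only understands the `p:` / `p0:` spec and assumes the documented DataLayout default of 64-bit pointers when no spec is present.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not an LLVM API): returns the width in bits of
// address-space-0 pointers described by a DataLayout string. Falls back to
// 64, the documented default, when the string carries no pointer spec.
unsigned pointerBitsFromLayout(const std::string &Layout) {
  std::istringstream Specs(Layout);
  std::string Spec;
  while (std::getline(Specs, Spec, '-')) {
    std::size_t Start;
    if (Spec.rfind("p:", 0) == 0)
      Start = 2; // "p:<size>:<abi>"
    else if (Spec.rfind("p0:", 0) == 0)
      Start = 3; // "p0:<size>:<abi>"
    else
      continue; // Not a pointer spec for address space 0.
    // std::stoul stops at the first non-digit, i.e. the ':' after <size>.
    return static_cast<unsigned>(std::stoul(Spec.substr(Start)));
  }
  return 64;
}
```

Fed the two Mach-O strings from `computeDataLayout`, the helper should report 32 for the `aarch64_32` string and 64 for the plain arm64 one, which has no `p` spec at all.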

static Reloc::Model getEffectiveRelocModel(const Triple &TT,		static Reloc::Model getEffectiveRelocModel(const Triple &TT,
▲ Show 20 Lines • Show All 60 Lines • ▼ Show 20 Lines	if (getMCAsmInfo()->usesWindowsCFI()) {
// is a call.		// is a call.
//		//
// FIXME: We could elide the trap if the next instruction would be in		// FIXME: We could elide the trap if the next instruction would be in
// the same region anyway.		// the same region anyway.
this->Options.TrapUnreachable = true;		this->Options.TrapUnreachable = true;
}		}

// Enable GlobalISel at or below EnableGlobalISelAt0.		// Enable GlobalISel at or below EnableGlobalISelAt0.
if (getOptLevel() <= EnableGlobalISelAtO) {		if (getOptLevel() <= EnableGlobalISelAtO &&
		TT.getArch() != Triple::aarch64_32) {
setGlobalISel(true);		setGlobalISel(true);
setGlobalISelAbort(GlobalISelAbortMode::Disable);		setGlobalISelAbort(GlobalISelAbortMode::Disable);
}		}

// AArch64 supports the MachineOutliner.		// AArch64 supports the MachineOutliner.
setMachineOutliner(true);		setMachineOutliner(true);

// AArch64 supports default outlining behaviour.		// AArch64 supports default outlining behaviour.
▲ Show 20 Lines • Show All 328 Lines • Show Last 20 Lines

llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCAsmInfo.h

	Show All 17 Lines
	#include "llvm/MC/MCAsmInfoELF.h"			#include "llvm/MC/MCAsmInfoELF.h"

	namespace llvm {			namespace llvm {
	class MCStreamer;			class MCStreamer;
	class Target;			class Target;
	class Triple;			class Triple;

	struct AArch64MCAsmInfoDarwin : public MCAsmInfoDarwin {			struct AArch64MCAsmInfoDarwin : public MCAsmInfoDarwin {
	explicit AArch64MCAsmInfoDarwin();			explicit AArch64MCAsmInfoDarwin(bool IsILP32);
	const MCExpr *			const MCExpr *
	getExprForPersonalitySymbol(const MCSymbol *Sym, unsigned Encoding,			getExprForPersonalitySymbol(const MCSymbol *Sym, unsigned Encoding,
	MCStreamer &Streamer) const override;			MCStreamer &Streamer) const override;
	};			};

	struct AArch64MCAsmInfoELF : public MCAsmInfoELF {			struct AArch64MCAsmInfoELF : public MCAsmInfoELF {
	explicit AArch64MCAsmInfoELF(const Triple &T);			explicit AArch64MCAsmInfoELF(const Triple &T);
	};			};
	Show All 12 Lines

llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCAsmInfo.cpp

	Show All 24 Lines
	};			};

	static cl::opt<AsmWriterVariantTy> AsmWriterVariant(			static cl::opt<AsmWriterVariantTy> AsmWriterVariant(
	"aarch64-neon-syntax", cl::init(Default),			"aarch64-neon-syntax", cl::init(Default),
	cl::desc("Choose style of NEON code to emit from AArch64 backend:"),			cl::desc("Choose style of NEON code to emit from AArch64 backend:"),
	cl::values(clEnumValN(Generic, "generic", "Emit generic NEON assembly"),			cl::values(clEnumValN(Generic, "generic", "Emit generic NEON assembly"),
	clEnumValN(Apple, "apple", "Emit Apple-style NEON assembly")));			clEnumValN(Apple, "apple", "Emit Apple-style NEON assembly")));

	AArch64MCAsmInfoDarwin::AArch64MCAsmInfoDarwin() {			AArch64MCAsmInfoDarwin::AArch64MCAsmInfoDarwin(bool IsILP32) {
	// We prefer NEON instructions to be printed in the short, Apple-specific			// We prefer NEON instructions to be printed in the short, Apple-specific
	// form when targeting Darwin.			// form when targeting Darwin.
	AssemblerDialect = AsmWriterVariant == Default ? Apple : AsmWriterVariant;			AssemblerDialect = AsmWriterVariant == Default ? Apple : AsmWriterVariant;

	PrivateGlobalPrefix = "L";			PrivateGlobalPrefix = "L";
	PrivateLabelPrefix = "L";			PrivateLabelPrefix = "L";
	SeparatorString = "%%";			SeparatorString = "%%";
	CommentString = ";";			CommentString = ";";
	CodePointerSize = CalleeSaveStackSlotSize = 8;			CalleeSaveStackSlotSize = 8;
				CodePointerSize = IsILP32 ? 4 : 8;

	AlignmentIsInBytes = false;			AlignmentIsInBytes = false;
	UsesELFSectionDirectiveForBSS = true;			UsesELFSectionDirectiveForBSS = true;
	SupportsDebugInformation = true;			SupportsDebugInformation = true;
	UseDataRegionDirectives = true;			UseDataRegionDirectives = true;

	ExceptionsType = ExceptionHandling::DwarfCFI;			ExceptionsType = ExceptionHandling::DwarfCFI;
	}			}
	▲ Show 20 Lines • Show All 85 Lines • Show Last 20 Lines

llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCTargetDesc.cpp

Show First 20 Lines • Show All 235 Lines • ▼ Show 20 Lines	static MCRegisterInfo *createAArch64MCRegisterInfo(const Triple &Triple) {
AArch64_MC::initLLVMToCVRegMapping(X);		AArch64_MC::initLLVMToCVRegMapping(X);
return X;		return X;
}		}

static MCAsmInfo *createAArch64MCAsmInfo(const MCRegisterInfo &MRI,		static MCAsmInfo *createAArch64MCAsmInfo(const MCRegisterInfo &MRI,
const Triple &TheTriple) {		const Triple &TheTriple) {
MCAsmInfo *MAI;		MCAsmInfo *MAI;
if (TheTriple.isOSBinFormatMachO())		if (TheTriple.isOSBinFormatMachO())
MAI = new AArch64MCAsmInfoDarwin();		MAI = new AArch64MCAsmInfoDarwin(TheTriple.getArch() == Triple::aarch64_32);
else if (TheTriple.isWindowsMSVCEnvironment())		else if (TheTriple.isWindowsMSVCEnvironment())
MAI = new AArch64MCAsmInfoMicrosoftCOFF();		MAI = new AArch64MCAsmInfoMicrosoftCOFF();
else if (TheTriple.isOSBinFormatCOFF())		else if (TheTriple.isOSBinFormatCOFF())
MAI = new AArch64MCAsmInfoGNUCOFF();		MAI = new AArch64MCAsmInfoGNUCOFF();
else {		else {
assert(TheTriple.isOSBinFormatELF() && "Invalid target");		assert(TheTriple.isOSBinFormatELF() && "Invalid target");
MAI = new AArch64MCAsmInfoELF(TheTriple);		MAI = new AArch64MCAsmInfoELF(TheTriple);
}		}
▲ Show 20 Lines • Show All 160 Lines • Show Last 20 Lines

llvm/lib/Target/X86/X86FastISel.cpp

Show First 20 Lines • Show All 3,381 Lines • ▼ Show 20 Lines	for (unsigned i = 0, e = ArgLocs.size(); i != e; ++i) {
case CCValAssign::VExt:		case CCValAssign::VExt:
// VExt has not been implemented, so this should be impossible to reach		// VExt has not been implemented, so this should be impossible to reach
// for now. However, fallback to Selection DAG isel once implemented.		// for now. However, fallback to Selection DAG isel once implemented.
return false;		return false;
case CCValAssign::AExtUpper:		case CCValAssign::AExtUpper:
case CCValAssign::SExtUpper:		case CCValAssign::SExtUpper:
case CCValAssign::ZExtUpper:		case CCValAssign::ZExtUpper:
case CCValAssign::FPExt:		case CCValAssign::FPExt:
		case CCValAssign::Trunc:
llvm_unreachable("Unexpected loc info!");		llvm_unreachable("Unexpected loc info!");
case CCValAssign::Indirect:		case CCValAssign::Indirect:
// FIXME: Indirect doesn't need extending, but fast-isel doesn't fully		// FIXME: Indirect doesn't need extending, but fast-isel doesn't fully
// support this.		// support this.
return false;		return false;
}		}

if (VA.isRegLoc()) {		if (VA.isRegLoc()) {
▲ Show 20 Lines • Show All 609 Lines • Show Last 20 Lines

llvm/test/CodeGen/AArch64/arm64-aapcs.ll

Show All 19 Lines	; CHECK-DAG: str w3, [{{x[0-9]+}}, :lo12:var]
ret [2 x i64] %arg		ret [2 x i64] %arg
; CHECK-DAG: mov x0, x1		; CHECK-DAG: mov x0, x1
; CHECK: mov x1, x2		; CHECK: mov x1, x2
}		}

@var64 = global i64 0, align 8		@var64 = global i64 0, align 8

; Check stack slots are 64-bit at all times.		; Check stack slots are 64-bit at all times.
define void @test_stack_slots([8 x i32], i1 %bool, i8 %char, i16 %short,		define void @test_stack_slots([8 x i64], i1 %bool, i8 %char, i16 %short,
i32 %int, i64 %long) {		i32 %int, i64 %long) {
; CHECK-LABEL: test_stack_slots:		; CHECK-LABEL: test_stack_slots:
; CHECK-DAG: ldr w[[ext1:[0-9]+]], [sp, #24]		; CHECK-DAG: ldr w[[ext1:[0-9]+]], [sp, #24]
; CHECK-DAG: ldrh w[[ext2:[0-9]+]], [sp, #16]		; CHECK-DAG: ldrh w[[ext2:[0-9]+]], [sp, #16]
; CHECK-DAG: ldrb w[[ext3:[0-9]+]], [sp, #8]		; CHECK-DAG: ldrb w[[ext3:[0-9]+]], [sp, #8]
; CHECK-DAG: ldr x[[ext4:[0-9]+]], [sp, #32]		; CHECK-DAG: ldr x[[ext4:[0-9]+]], [sp, #32]
; CHECK-DAG: ldrb w[[ext5:[0-9]+]], [sp]		; CHECK-DAG: ldrb w[[ext5:[0-9]+]], [sp]
; CHECK-DAG: and x[[ext5]], x[[ext5]], #0x1		; CHECK-DAG: and x[[ext5]], x[[ext5]], #0x1
▲ Show 20 Lines • Show All 128 Lines • Show Last 20 Lines

llvm/test/CodeGen/AArch64/arm64-collect-loh-garbage-crash.ll

	; RUN: llc -o - %s -mtriple=arm64-apple-ios -O3 -aarch64-enable-collect-loh | FileCheck %s			; RUN: llc -o - %s -mtriple=arm64-apple-ios -O3 -aarch64-enable-collect-loh | FileCheck %s
				; RUN: llc -o - %s -mtriple=arm64_32-apple-watchos -O3 -aarch64-enable-collect-loh | FileCheck %s
	; Check that the LOH analysis does not crash when the analysed chained			; Check that the LOH analysis does not crash when the analysed chained
	; contains instructions that are filtered out.			; contains instructions that are filtered out.
	;			;
	; Before the fix for <rdar://problem/16041712>, these cases were removed			; Before the fix for <rdar://problem/16041712>, these cases were removed
	; from the main container. Now, the deterministic container does not allow			; from the main container. Now, the deterministic container does not allow
	; to remove arbitrary values, so we have to live with garbage values.			; to remove arbitrary values, so we have to live with garbage values.
	; <rdar://problem/16041712>			; <rdar://problem/16041712>

	Show All 28 Lines

llvm/test/CodeGen/AArch64/arm64-collect-loh-str.ll

	; RUN: llc -o - %s -mtriple=arm64-apple-ios -O2 | FileCheck %s			; RUN: llc -o - %s -mtriple=arm64-apple-ios -O2 | FileCheck %s
				; RUN: llc -o - %s -mtriple=arm64_32-apple-ios -O2 | FileCheck %s
	; Test case for <rdar://problem/15942912>.			; Test case for <rdar://problem/15942912>.
	; AdrpAddStr cannot be used when the store uses same			; AdrpAddStr cannot be used when the store uses same
	; register as address and value. Indeed, the related			; register as address and value. Indeed, the related
	; if applied, may completely remove the definition or			; if applied, may completely remove the definition or
	; at least provide a wrong one (with the offset folded			; at least provide a wrong one (with the offset folded
	; into the definition).			; into the definition).

	%struct.anon = type { i32, i32* }			%struct.anon = type { i32, i32* }
	Show All 14 Lines

llvm/test/CodeGen/AArch64/arm64-collect-loh.ll

	; RUN: llc -o - %s -mtriple=arm64-apple-ios -O2 | FileCheck %s			; RUN: llc -o - %s -mtriple=arm64-apple-ios -O2 | FileCheck %s
				; RUN: llc -o - %s -mtriple=arm64_32-apple-watchos -O2 | FileCheck %s
	; RUN: llc -o - %s -mtriple=arm64-linux-gnu -O2 | FileCheck %s --check-prefix=CHECK-ELF			; RUN: llc -o - %s -mtriple=arm64-linux-gnu -O2 | FileCheck %s --check-prefix=CHECK-ELF

	; CHECK-ELF-NOT: .loh			; CHECK-ELF-NOT: .loh
	; CHECK-ELF-NOT: AdrpAdrp			; CHECK-ELF-NOT: AdrpAdrp
	; CHECK-ELF-NOT: AdrpAdd			; CHECK-ELF-NOT: AdrpAdd
	; CHECK-ELF-NOT: AdrpLdrGot			; CHECK-ELF-NOT: AdrpLdrGot

	@a = internal unnamed_addr global i32 0, align 4			@a = internal unnamed_addr global i32 0, align 4
	▲ Show 20 Lines • Show All 45 Lines • ▼ Show 20 Lines
	@C = common global i32 0, align 4			@C = common global i32 0, align 4

	; Check that we catch AdrpLdrGotLdr case when we have a simple chain:			; Check that we catch AdrpLdrGotLdr case when we have a simple chain:
	; adrp -> ldrgot -> ldr.			; adrp -> ldrgot -> ldr.
	; CHECK-LABEL: _getC			; CHECK-LABEL: _getC
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _C@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _C@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _C@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _C@GOTPAGEOFF]
	; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr w0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: ldr w0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]			; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]
	define i32 @getC() {			define i32 @getC() {
	%res = load i32, i32* @C, align 4			%res = load i32, i32* @C, align 4
	ret i32 %res			ret i32 %res
	}			}
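
Reviewer note: the updated CHECK lines capture only the register number in `LDRGOT_REG`. The GOT load may then name either an `x` or a `w` register, while the following use always addresses through `x<N>`. Presumably this is because the GOT entry is loaded as a 32-bit value on arm64_32. The sketch below mimics that pair of checks with `std::regex`. `gotLoadThenUse` is a hypothetical helper written for illustration, not FileCheck itself.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: approximates the updated CHECK pair for _getC.
// First line: "ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], [xN, _C@GOTPAGEOFF]".
// Second line: "ldr w0, [x[[LDRGOT_REG]]]", reusing the captured number.
bool gotLoadThenUse(const std::string &GotLoad, const std::string &Use) {
  static const std::regex GotRe(
      R"(ldr [xw]([0-9]+), \[x[0-9]+, _C@GOTPAGEOFF\])");
  std::smatch M;
  if (!std::regex_search(GotLoad, M, GotRe))
    return false;
  // The use must address memory through the 64-bit view of the same register.
  const std::string Expected = "ldr w0, [x" + M[1].str() + "]";
  return Use.find(Expected) != std::string::npos;
}
```

With this shape, one set of CHECK lines can serve both the arm64 and the arm64_32 RUN lines. The old pattern bound the whole `x<N>` name and so could not match a `w` destination.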

	; LDRSW supports loading from a literal.			; LDRSW supports loading from a literal.
	; Make sure we emit AdrpLdrGotLdr for those.			; Make sure we emit AdrpLdrGotLdr for those.
	; CHECK-LABEL: _getSExtC			; CHECK-LABEL: _getSExtC
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _C@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _C@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _C@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _C@GOTPAGEOFF]
	; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldrsw x0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: ldrsw x0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]			; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]
	define i64 @getSExtC() {			define i64 @getSExtC() {
	%res = load i32, i32* @C, align 4			%res = load i32, i32* @C, align 4
	%sextres = sext i32 %res to i64			%sextres = sext i32 %res to i64
	ret i64 %sextres			ret i64 %sextres
	}			}

	; It may not be safe to fold the literal in the load if the address is			; It may not be safe to fold the literal in the load if the address is
	; used several times.			; used several times.
	; Make sure we emit AdrpLdrGot for those.			; Make sure we emit AdrpLdrGot for those.
	; CHECK-LABEL: _getSeveralC			; CHECK-LABEL: _getSeveralC
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _C@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _C@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _C@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _C@GOTPAGEOFF]
	; CHECK-NEXT: ldr [[LOAD:w[0-9]+]], {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: ldr [[LOAD:w[0-9]+]], [x[[LDRGOT_REG]]]
	; CHECK-NEXT: add [[ADD:w[0-9]+]], [[LOAD]], w0			; CHECK-NEXT: add [[ADD:w[0-9]+]], [[LOAD]], w0
	; CHECK-NEXT: str [[ADD]], {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: str [[ADD]], [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGot [[ADRP_LABEL]], [[LDRGOT_LABEL]]			; CHECK: .loh AdrpLdrGot [[ADRP_LABEL]], [[LDRGOT_LABEL]]
	define void @getSeveralC(i32 %t) {			define void @getSeveralC(i32 %t) {
	entry:			entry:
	%tmp = load i32, i32* @C, align 4			%tmp = load i32, i32* @C, align 4
	%add = add nsw i32 %tmp, %t			%add = add nsw i32 %tmp, %t
	store i32 %add, i32* @C, align 4			store i32 %add, i32* @C, align 4
	ret void			ret void
	}			}

	; Make sure we catch that:			; Make sure we catch that:
	; adrp -> ldrgot -> str.			; adrp -> ldrgot -> str.
	; CHECK-LABEL: _setC			; CHECK-LABEL: _setC
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _C@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _C@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _C@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _C@GOTPAGEOFF]
	; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: str w0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: str w0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotStr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]			; CHECK: .loh AdrpLdrGotStr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]
	define void @setC(i32 %t) {			define void @setC(i32 %t) {
	entry:			entry:
	store i32 %t, i32* @C, align 4			store i32 %t, i32* @C, align 4
	ret void			ret void
	}			}

	Show All 9 Lines
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _InternalC@PAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _InternalC@PAGE
	; CHECK-NEXT: [[ADDGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[ADDGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: add [[ADDGOT_REG:x[0-9]+]], [[ADRP_REG]], _InternalC@PAGEOFF			; CHECK-NEXT: add [[ADDGOT_REG:x[0-9]+]], [[ADRP_REG]], _InternalC@PAGEOFF
	; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr w0, {{\[}}[[ADDGOT_REG]], #16]			; CHECK-NEXT: ldr w0, {{\[}}[[ADDGOT_REG]], #16]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpAddLdr [[ADRP_LABEL]], [[ADDGOT_LABEL]], [[LDR_LABEL]]			; CHECK: .loh AdrpAddLdr [[ADRP_LABEL]], [[ADDGOT_LABEL]], [[LDR_LABEL]]
	define i32 @getInternalCPlus4() {			define i32 @getInternalCPlus4() {
	%addr = getelementptr i32, i32* @InternalC, i32 4			%addr = getelementptr inbounds i32, i32* @InternalC, i32 4
	%res = load i32, i32* %addr, align 4			%res = load i32, i32* %addr, align 4
	ret i32 %res			ret i32 %res
	}			}

	; LDRSW supports loading from a literal.			; LDRSW supports loading from a literal.
	; Make sure we emit AdrpLdrGotLdr for those.			; Make sure we emit AdrpLdrGotLdr for those.
	; CHECK-LABEL: _getSExtInternalCPlus4			; CHECK-LABEL: _getSExtInternalCPlus4
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _InternalC@PAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _InternalC@PAGE
	; CHECK-NEXT: [[ADDGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[ADDGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: add [[ADDGOT_REG:x[0-9]+]], [[ADRP_REG]], _InternalC@PAGEOFF			; CHECK-NEXT: add [[ADDGOT_REG:x[0-9]+]], [[ADRP_REG]], _InternalC@PAGEOFF
	; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldrsw x0, {{\[}}[[ADDGOT_REG]], #16]			; CHECK-NEXT: ldrsw x0, {{\[}}[[ADDGOT_REG]], #16]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpAddLdr [[ADRP_LABEL]], [[ADDGOT_LABEL]], [[LDR_LABEL]]			; CHECK: .loh AdrpAddLdr [[ADRP_LABEL]], [[ADDGOT_LABEL]], [[LDR_LABEL]]
	define i64 @getSExtInternalCPlus4() {			define i64 @getSExtInternalCPlus4() {
	%addr = getelementptr i32, i32* @InternalC, i32 4			%addr = getelementptr inbounds i32, i32* @InternalC, i32 4
				aemerson: Can you add a comment here explaining that inbounds is needed for arm64_32 to produce the same code?
				t.p.northover (author): I'd kind of prefer not to. It's incidental to this test but the main complication of arm64_32 CodeGen, so I don't think it's really what a random reader of this test is going to be looking for.
	%res = load i32, i32* %addr, align 4			%res = load i32, i32* %addr, align 4
	%sextres = sext i32 %res to i64			%sextres = sext i32 %res to i64
	ret i64 %sextres			ret i64 %sextres
	}			}
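
Reviewer note on the `inbounds` exchange above: one plausible reading, which the thread itself does not spell out, is that ILP32 pointer arithmetic wraps at 32 bits while the hardware addressing mode adds the offset in 64 bits. Folding `#16` into the load is then only safe if the 32-bit addition cannot wrap, and `inbounds` lets the backend assume that. The sketch below compares the two computations. Both function names are hypothetical and nothing here is taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Address under ILP32 pointer semantics: base + offset wraps modulo 2^32.
std::uint64_t addrWrapped32(std::uint32_t Base, std::int32_t Offset) {
  return static_cast<std::uint32_t>(Base + static_cast<std::uint32_t>(Offset));
}

// Address a 64-bit "[xN, #imm]" addressing mode would compute from the
// zero-extended base: the addition is carried out in 64 bits.
std::uint64_t addrFolded64(std::uint32_t Base, std::int32_t Offset) {
  return static_cast<std::uint64_t>(Base) +
         static_cast<std::uint64_t>(static_cast<std::int64_t>(Offset));
}
```

For an ordinary base such as `0x1000` the two agree. For a base near the top of the 32-bit range, such as `0xFFFFFFFC` with offset 16, the wrapped form gives `0xC` while the folded form gives `0x10000000C`, so the fold would address different memory.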

	; It may not be safe to fold the literal in the load if the address is			; It may not be safe to fold the literal in the load if the address is
	; used several times.			; used several times.
	; Make sure we emit AdrpAdd for those.			; Make sure we emit AdrpAdd for those.
	; CHECK-LABEL: _getSeveralInternalCPlus4			; CHECK-LABEL: _getSeveralInternalCPlus4
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _InternalC@PAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _InternalC@PAGE
	; CHECK-NEXT: [[ADDGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[ADDGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: add [[ADDGOT_REG:x[0-9]+]], [[ADRP_REG]], _InternalC@PAGEOFF			; CHECK-NEXT: add [[ADDGOT_REG:x[0-9]+]], [[ADRP_REG]], _InternalC@PAGEOFF
	; CHECK-NEXT: ldr [[LOAD:w[0-9]+]], {{\[}}[[ADDGOT_REG]], #16]			; CHECK-NEXT: ldr [[LOAD:w[0-9]+]], {{\[}}[[ADDGOT_REG]], #16]
	; CHECK-NEXT: add [[ADD:w[0-9]+]], [[LOAD]], w0			; CHECK-NEXT: add [[ADD:w[0-9]+]], [[LOAD]], w0
	; CHECK-NEXT: str [[ADD]], {{\[}}[[ADDGOT_REG]], #16]			; CHECK-NEXT: str [[ADD]], {{\[}}[[ADDGOT_REG]], #16]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpAdd [[ADRP_LABEL]], [[ADDGOT_LABEL]]			; CHECK: .loh AdrpAdd [[ADRP_LABEL]], [[ADDGOT_LABEL]]
	define void @getSeveralInternalCPlus4(i32 %t) {			define void @getSeveralInternalCPlus4(i32 %t) {
	entry:			entry:
	%addr = getelementptr i32, i32* @InternalC, i32 4			%addr = getelementptr inbounds i32, i32* @InternalC, i32 4
	%tmp = load i32, i32* %addr, align 4			%tmp = load i32, i32* %addr, align 4
	%add = add nsw i32 %tmp, %t			%add = add nsw i32 %tmp, %t
	store i32 %add, i32* %addr, align 4			store i32 %add, i32* %addr, align 4
	ret void			ret void
	}			}

	; Make sure we catch that:			; Make sure we catch that:
	; adrp -> add -> str.			; adrp -> add -> str.
	; CHECK-LABEL: _setInternalCPlus4			; CHECK-LABEL: _setInternalCPlus4
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _InternalC@PAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _InternalC@PAGE
	; CHECK-NEXT: [[ADDGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[ADDGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: add [[ADDGOT_REG:x[0-9]+]], [[ADRP_REG]], _InternalC@PAGEOFF			; CHECK-NEXT: add [[ADDGOT_REG:x[0-9]+]], [[ADRP_REG]], _InternalC@PAGEOFF
	; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: str w0, {{\[}}[[ADDGOT_REG]], #16]			; CHECK-NEXT: str w0, {{\[}}[[ADDGOT_REG]], #16]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpAddStr [[ADRP_LABEL]], [[ADDGOT_LABEL]], [[LDR_LABEL]]			; CHECK: .loh AdrpAddStr [[ADRP_LABEL]], [[ADDGOT_LABEL]], [[LDR_LABEL]]
	define void @setInternalCPlus4(i32 %t) {			define void @setInternalCPlus4(i32 %t) {
	entry:			entry:
	%addr = getelementptr i32, i32* @InternalC, i32 4			%addr = getelementptr inbounds i32, i32* @InternalC, i32 4
	store i32 %t, i32* %addr, align 4			store i32 %t, i32* %addr, align 4
	ret void			ret void
	}			}

	; Check that we catch AdrpAddLdr case when we have a simple chain:			; Check that we catch AdrpAddLdr case when we have a simple chain:
	; adrp -> ldr.			; adrp -> ldr.
	; CHECK-LABEL: _getInternalC			; CHECK-LABEL: _getInternalC
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	▲ Show 20 Lines • Show All 59 Lines • ▼ Show 20 Lines
	@D = common global i8 0, align 4			@D = common global i8 0, align 4

	; LDRB does not support loading from a literal.			; LDRB does not support loading from a literal.
	; Make sure we emit AdrpLdrGot and not AdrpLdrGotLdr for those.			; Make sure we emit AdrpLdrGot and not AdrpLdrGotLdr for those.
	; CHECK-LABEL: _getD			; CHECK-LABEL: _getD
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _D@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _D@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _D@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _D@GOTPAGEOFF]
	; CHECK-NEXT: ldrb w0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: ldrb w0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGot [[ADRP_LABEL]], [[LDRGOT_LABEL]]			; CHECK: .loh AdrpLdrGot [[ADRP_LABEL]], [[LDRGOT_LABEL]]
	define i8 @getD() {			define i8 @getD() {
	%res = load i8, i8* @D, align 4			%res = load i8, i8* @D, align 4
	ret i8 %res			ret i8 %res
	}			}

	; CHECK-LABEL: _setD			; CHECK-LABEL: _setD
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _D@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _D@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _D@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _D@GOTPAGEOFF]
	; CHECK-NEXT: [[STR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[STR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: strb w0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: strb w0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotStr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[STR_LABEL]]			; CHECK: .loh AdrpLdrGotStr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[STR_LABEL]]
	define void @setD(i8 %t) {			define void @setD(i8 %t) {
	store i8 %t, i8* @D, align 4			store i8 %t, i8* @D, align 4
	ret void			ret void
	}			}

	; LDRSB supports loading from a literal.			; LDRSB supports loading from a literal.
	; Make sure we emit AdrpLdrGotLdr for those.			; Make sure we emit AdrpLdrGotLdr for those.
	; CHECK-LABEL: _getSExtD			; CHECK-LABEL: _getSExtD
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _D@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _D@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _D@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _D@GOTPAGEOFF]
	; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldrsb w0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: ldrsb w0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]			; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]
	define i32 @getSExtD() {			define i32 @getSExtD() {
	%res = load i8, i8* @D, align 4			%res = load i8, i8* @D, align 4
	%sextres = sext i8 %res to i32			%sextres = sext i8 %res to i32
	ret i32 %sextres			ret i32 %sextres
	}			}

	; LDRSB supports loading from a literal.			; LDRSB supports loading from a literal.
	; Make sure we emit AdrpLdrGotLdr for those.			; Make sure we emit AdrpLdrGotLdr for those.
	; CHECK-LABEL: _getSExt64D			; CHECK-LABEL: _getSExt64D
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _D@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _D@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _D@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _D@GOTPAGEOFF]
	; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldrsb x0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: ldrsb x0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]			; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]
	define i64 @getSExt64D() {			define i64 @getSExt64D() {
	%res = load i8, i8* @D, align 4			%res = load i8, i8* @D, align 4
	%sextres = sext i8 %res to i64			%sextres = sext i8 %res to i64
	ret i64 %sextres			ret i64 %sextres
	}			}

	@E = common global i16 0, align 4			@E = common global i16 0, align 4

	; LDRH does not support loading from a literal.			; LDRH does not support loading from a literal.
	; Make sure we emit AdrpLdrGot and not AdrpLdrGotLdr for those.			; Make sure we emit AdrpLdrGot and not AdrpLdrGotLdr for those.
	; CHECK-LABEL: _getE			; CHECK-LABEL: _getE
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _E@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _E@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _E@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _E@GOTPAGEOFF]
	; CHECK-NEXT: ldrh w0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: ldrh w0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGot [[ADRP_LABEL]], [[LDRGOT_LABEL]]			; CHECK: .loh AdrpLdrGot [[ADRP_LABEL]], [[LDRGOT_LABEL]]
	define i16 @getE() {			define i16 @getE() {
	%res = load i16, i16* @E, align 4			%res = load i16, i16* @E, align 4
	ret i16 %res			ret i16 %res
	}			}

	; LDRSH supports loading from a literal.			; LDRSH supports loading from a literal.
	; Make sure we emit AdrpLdrGotLdr for those.			; Make sure we emit AdrpLdrGotLdr for those.
	; CHECK-LABEL: _getSExtE			; CHECK-LABEL: _getSExtE
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _E@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _E@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _E@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _E@GOTPAGEOFF]
	; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldrsh w0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: ldrsh w0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]			; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]
	define i32 @getSExtE() {			define i32 @getSExtE() {
	%res = load i16, i16* @E, align 4			%res = load i16, i16* @E, align 4
	%sextres = sext i16 %res to i32			%sextres = sext i16 %res to i32
	ret i32 %sextres			ret i32 %sextres
	}			}

	; CHECK-LABEL: _setE			; CHECK-LABEL: _setE
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _E@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _E@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _E@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _E@GOTPAGEOFF]
	; CHECK-NEXT: [[STR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[STR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: strh w0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: strh w0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotStr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[STR_LABEL]]			; CHECK: .loh AdrpLdrGotStr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[STR_LABEL]]
	define void @setE(i16 %t) {			define void @setE(i16 %t) {
	store i16 %t, i16* @E, align 4			store i16 %t, i16* @E, align 4
	ret void			ret void
	}			}

	; LDRSH supports loading from a literal.			; LDRSH supports loading from a literal.
	; Make sure we emit AdrpLdrGotLdr for those.			; Make sure we emit AdrpLdrGotLdr for those.
	; CHECK-LABEL: _getSExt64E			; CHECK-LABEL: _getSExt64E
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _E@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _E@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _E@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _E@GOTPAGEOFF]
	; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldrsh x0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: ldrsh x0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]			; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]
	define i64 @getSExt64E() {			define i64 @getSExt64E() {
	%res = load i16, i16* @E, align 4			%res = load i16, i16* @E, align 4
	%sextres = sext i16 %res to i64			%sextres = sext i16 %res to i64
	ret i64 %sextres			ret i64 %sextres
	}			}

	@F = common global i64 0, align 4			@F = common global i64 0, align 4

	; LDR supports loading from a literal.			; LDR supports loading from a literal.
	; Make sure we emit AdrpLdrGotLdr for those.			; Make sure we emit AdrpLdrGotLdr for those.
	; CHECK-LABEL: _getF			; CHECK-LABEL: _getF
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _F@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _F@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _F@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _F@GOTPAGEOFF]
	; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr x0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: ldr x0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]			; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]
	define i64 @getF() {			define i64 @getF() {
	%res = load i64, i64* @F, align 4			%res = load i64, i64* @F, align 4
	ret i64 %res			ret i64 %res
	}			}

	; CHECK-LABEL: _setF			; CHECK-LABEL: _setF
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _F@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _F@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _F@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _F@GOTPAGEOFF]
	; CHECK-NEXT: [[STR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[STR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: str x0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: str x0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotStr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[STR_LABEL]]			; CHECK: .loh AdrpLdrGotStr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[STR_LABEL]]
	define void @setF(i64 %t) {			define void @setF(i64 %t) {
	store i64 %t, i64* @F, align 4			store i64 %t, i64* @F, align 4
	ret void			ret void
	}			}

	@G = common global float 0.0, align 4			@G = common global float 0.0, align 4

	; LDR float supports loading from a literal.			; LDR float supports loading from a literal.
	; Make sure we emit AdrpLdrGotLdr for those.			; Make sure we emit AdrpLdrGotLdr for those.
	; CHECK-LABEL: _getG			; CHECK-LABEL: _getG
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _G@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _G@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _G@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _G@GOTPAGEOFF]
	; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr s0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: ldr s0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]			; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]
	define float @getG() {			define float @getG() {
	%res = load float, float* @G, align 4			%res = load float, float* @G, align 4
	ret float %res			ret float %res
	}			}

	; CHECK-LABEL: _setG			; CHECK-LABEL: _setG
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _G@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _G@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _G@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _G@GOTPAGEOFF]
	; CHECK-NEXT: [[STR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[STR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: str s0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: str s0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotStr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[STR_LABEL]]			; CHECK: .loh AdrpLdrGotStr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[STR_LABEL]]
	define void @setG(float %t) {			define void @setG(float %t) {
	store float %t, float* @G, align 4			store float %t, float* @G, align 4
	ret void			ret void
	}			}

	@H = common global half 0.0, align 4			@H = common global half 0.0, align 4

	; LDR half supports loading from a literal.			; LDR half supports loading from a literal.
	; Make sure we emit AdrpLdrGotLdr for those.			; Make sure we emit AdrpLdrGotLdr for those.
	; CHECK-LABEL: _getH			; CHECK-LABEL: _getH
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _H@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _H@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _H@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _H@GOTPAGEOFF]
	; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr h0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: ldr h0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]			; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]
	define half @getH() {			define half @getH() {
	%res = load half, half* @H, align 4			%res = load half, half* @H, align 4
	ret half %res			ret half %res
	}			}

	; CHECK-LABEL: _setH			; CHECK-LABEL: _setH
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _H@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _H@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _H@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _H@GOTPAGEOFF]
	; CHECK-NEXT: [[STR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[STR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: str h0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: str h0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotStr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[STR_LABEL]]			; CHECK: .loh AdrpLdrGotStr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[STR_LABEL]]
	define void @setH(half %t) {			define void @setH(half %t) {
	store half %t, half* @H, align 4			store half %t, half* @H, align 4
	ret void			ret void
	}			}

	@I = common global double 0.0, align 4			@I = common global double 0.0, align 4

	; LDR double supports loading from a literal.			; LDR double supports loading from a literal.
	; Make sure we emit AdrpLdrGotLdr for those.			; Make sure we emit AdrpLdrGotLdr for those.
	; CHECK-LABEL: _getI			; CHECK-LABEL: _getI
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _I@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _I@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _I@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _I@GOTPAGEOFF]
	; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr d0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: ldr d0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]			; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]
	define double @getI() {			define double @getI() {
	%res = load double, double* @I, align 4			%res = load double, double* @I, align 4
	ret double %res			ret double %res
	}			}

	; CHECK-LABEL: _setI			; CHECK-LABEL: _setI
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _I@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _I@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _I@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _I@GOTPAGEOFF]
	; CHECK-NEXT: [[STR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[STR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: str d0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: str d0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotStr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[STR_LABEL]]			; CHECK: .loh AdrpLdrGotStr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[STR_LABEL]]
	define void @setI(double %t) {			define void @setI(double %t) {
	store double %t, double* @I, align 4			store double %t, double* @I, align 4
	ret void			ret void
	}			}

	@J = common global <2 x i32> <i32 0, i32 0>, align 4			@J = common global <2 x i32> <i32 0, i32 0>, align 4

	; LDR 64-bit vector supports loading from a literal.			; LDR 64-bit vector supports loading from a literal.
	; Make sure we emit AdrpLdrGotLdr for those.			; Make sure we emit AdrpLdrGotLdr for those.
	; CHECK-LABEL: _getJ			; CHECK-LABEL: _getJ
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _J@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _J@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _J@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _J@GOTPAGEOFF]
	; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr d0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: ldr d0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]			; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]
	define <2 x i32> @getJ() {			define <2 x i32> @getJ() {
	%res = load <2 x i32>, <2 x i32>* @J, align 4			%res = load <2 x i32>, <2 x i32>* @J, align 4
	ret <2 x i32> %res			ret <2 x i32> %res
	}			}

	; CHECK-LABEL: _setJ			; CHECK-LABEL: _setJ
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _J@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _J@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _J@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _J@GOTPAGEOFF]
	; CHECK-NEXT: [[STR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[STR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: str d0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: str d0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotStr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[STR_LABEL]]			; CHECK: .loh AdrpLdrGotStr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[STR_LABEL]]
	define void @setJ(<2 x i32> %t) {			define void @setJ(<2 x i32> %t) {
	store <2 x i32> %t, <2 x i32>* @J, align 4			store <2 x i32> %t, <2 x i32>* @J, align 4
	ret void			ret void
	}			}

	@K = common global <4 x i32> <i32 0, i32 0, i32 0, i32 0>, align 4			@K = common global <4 x i32> <i32 0, i32 0, i32 0, i32 0>, align 4

	; LDR 128-bit vector supports loading from a literal.			; LDR 128-bit vector supports loading from a literal.
	; Make sure we emit AdrpLdrGotLdr for those.			; Make sure we emit AdrpLdrGotLdr for those.
	; CHECK-LABEL: _getK			; CHECK-LABEL: _getK
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _K@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _K@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _K@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _K@GOTPAGEOFF]
	; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr q0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: ldr q0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]			; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]
	define <4 x i32> @getK() {			define <4 x i32> @getK() {
	%res = load <4 x i32>, <4 x i32>* @K, align 4			%res = load <4 x i32>, <4 x i32>* @K, align 4
	ret <4 x i32> %res			ret <4 x i32> %res
	}			}

	; CHECK-LABEL: _setK			; CHECK-LABEL: _setK
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _K@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _K@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _K@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _K@GOTPAGEOFF]
	; CHECK-NEXT: [[STR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[STR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: str q0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: str q0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotStr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[STR_LABEL]]			; CHECK: .loh AdrpLdrGotStr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[STR_LABEL]]
	define void @setK(<4 x i32> %t) {			define void @setK(<4 x i32> %t) {
	store <4 x i32> %t, <4 x i32>* @K, align 4			store <4 x i32> %t, <4 x i32>* @K, align 4
	ret void			ret void
	}			}

	@L = common global <1 x i8> <i8 0>, align 4			@L = common global <1 x i8> <i8 0>, align 4

	; LDR 8-bit vector supports loading from a literal.			; LDR 8-bit vector supports loading from a literal.
	; Make sure we emit AdrpLdrGotLdr for those.			; Make sure we emit AdrpLdrGotLdr for those.
	; CHECK-LABEL: _getL			; CHECK-LABEL: _getL
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _L@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _L@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _L@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _L@GOTPAGEOFF]
	; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDR_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr b0, {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: ldr b0, [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]			; CHECK: .loh AdrpLdrGotLdr [[ADRP_LABEL]], [[LDRGOT_LABEL]], [[LDR_LABEL]]
	define <1 x i8> @getL() {			define <1 x i8> @getL() {
	%res = load <1 x i8>, <1 x i8>* @L, align 4			%res = load <1 x i8>, <1 x i8>* @L, align 4
	ret <1 x i8> %res			ret <1 x i8> %res
	}			}

	; CHECK-LABEL: _setL			; CHECK-LABEL: _setL
	; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:			; CHECK: [[ADRP_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _L@GOTPAGE			; CHECK-NEXT: adrp [[ADRP_REG:x[0-9]+]], _L@GOTPAGE
	; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:			; CHECK-NEXT: [[LDRGOT_LABEL:Lloh[0-9]+]]:
	; CHECK-NEXT: ldr [[LDRGOT_REG:x[0-9]+]], {{\[}}[[ADRP_REG]], _L@GOTPAGEOFF]			; CHECK-NEXT: ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _L@GOTPAGEOFF]
	; CHECK-NEXT: ; kill			; CHECK-NEXT: ; kill
	; Ultimately we should generate str b0, but right now, we match the vector			; Ultimately we should generate str b0, but right now, we match the vector
	; variant which does not allow to fold the immediate into the store.			; variant which does not allow to fold the immediate into the store.
	; CHECK-NEXT: st1.b { v0 }[0], {{\[}}[[LDRGOT_REG]]]			; CHECK-NEXT: st1.b { v0 }[0], [x[[LDRGOT_REG]]]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	; CHECK: .loh AdrpLdrGot [[ADRP_LABEL]], [[LDRGOT_LABEL]]			; CHECK: .loh AdrpLdrGot [[ADRP_LABEL]], [[LDRGOT_LABEL]]
	define void @setL(<1 x i8> %t) {			define void @setL(<1 x i8> %t) {
	store <1 x i8> %t, <1 x i8>* @L, align 4			store <1 x i8> %t, <1 x i8>* @L, align 4
	ret void			ret void
	}			}

	; Make sure we do not assert when we do not track			; Make sure we do not assert when we do not track
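The right-hand CHECK lines above capture only the register number in `LDRGOT_REG` and accept either an `x` or a `w` prefix on the GOT load, presumably because on arm64_32 the GOT entry is a 32-bit pointer loaded into a w register while the later access still uses the x register as its base. A minimal sketch of that matching idea, using Python's `re` as a stand-in for FileCheck (the helper name `store_matches` and the simplified patterns are illustrative assumptions, not part of the patch):

```python
import re

# Simplified stand-in for:
#   ldr {{[xw]}}[[LDRGOT_REG:[0-9]+]], {{\[}}[[ADRP_REG]], _E@GOTPAGEOFF]
# Only the register *number* is captured, so either width prefix matches.
LDRGOT = re.compile(r"ldr [xw](?P<num>[0-9]+), \[x[0-9]+, _E@GOTPAGEOFF\]")


def store_matches(ldr_line, str_line):
    """Return True if str_line uses x<num> as its base, where <num> is the
    register number captured from the GOT load in ldr_line."""
    m = LDRGOT.search(ldr_line)
    if m is None:
        return False
    # Stand-in for: strh w0, [x[[LDRGOT_REG]]]
    expected = "strh w0, [x{}]".format(m.group("num"))
    return expected in str_line
```

With this shape, both `ldr x8, ...` followed by `strh w0, [x8]` and `ldr w8, ...` followed by `strh w0, [x8]` are accepted, while a store through a different register is rejected.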

llvm/test/CodeGen/AArch64/arm64-indexed-memory.ll

	; RUN: llc < %s -mtriple=arm64-eabi -aarch64-redzone | FileCheck %s			; RUN: llc < %s -mtriple=arm64-eabi -aarch64-redzone | FileCheck %s
				; RUN: llc < %s -mtriple=arm64_32-apple-ios -aarch64-redzone | FileCheck %s

	define i64* @store64(i64* %ptr, i64 %index, i64 %spacing) {			define i64* @store64(i64* %ptr, i64 %index, i64 %spacing) {
	; CHECK-LABEL: store64:			; CHECK-LABEL: store64:
	; CHECK: str x{{[0-9+]}}, [x{{[0-9+]}}], #8			; CHECK: str x{{[0-9+]}}, [x{{[0-9+]}}], #8
	; CHECK: ret			; CHECK: ret
	%incdec.ptr = getelementptr inbounds i64, i64* %ptr, i64 1			%incdec.ptr = getelementptr inbounds i64, i64* %ptr, i64 1
	store i64 %spacing, i64* %ptr, align 4			store i64 %spacing, i64* %ptr, align 4
	ret i64* %incdec.ptr			ret i64* %incdec.ptr

llvm/test/CodeGen/AArch64/arm64-stacksave.ll

	; RUN: llc < %s -verify-coalescing			; RUN: llc -mtriple=arm64-apple-macosx10.8.0 < %s -verify-coalescing
				; RUN: llc -mtriple=arm64_32-apple-ios9.0 < %s -verify-coalescing
	; <rdar://problem/11522048>			; <rdar://problem/11522048>
	target triple = "arm64-apple-macosx10.8.0"

	; Verify that we can handle spilling the stack pointer without attempting			; Verify that we can handle spilling the stack pointer without attempting
	; spilling it directly.			; spilling it directly.
	; CHECK: f			; CHECK: f
	; CHECK: mov [[X0:x[0-9]+]], sp			; CHECK: mov [[X0:x[0-9]+]], sp
	; CHECK: str [[X0]]			; CHECK: str [[X0]]
	; CHECK: inlineasm			; CHECK: inlineasm
	define void @f() nounwind ssp {			define void @f() nounwind ssp {

llvm/test/CodeGen/AArch64/arm64_32-addrs.ll

This file was added.

				; RUN: llc -mtriple=arm64_32-apple-ios %s -o - | FileCheck %s

				; If %base < 96 then the sum will not wrap (in an unsigned sense), but "ldr w0,
				; [x0, #-96]" would.
				define i32 @test_valid_wrap(i32 %base) {
				; CHECK-LABEL: test_valid_wrap:
				; CHECK: sub w[[ADDR:[0-9]+]], w0, #96
				; CHECK: ldr w0, [x[[ADDR]]]

				%newaddr = add nuw i32 %base, -96
				%ptr = inttoptr i32 %newaddr to i32*
				%val = load i32, i32* %ptr
				ret i32 %val
				}

				define i8 @test_valid_wrap_optimizable(i8* %base) {
				; CHECK-LABEL: test_valid_wrap_optimizable:
				; CHECK: ldurb w0, [x0, #-96]

				%newaddr = getelementptr inbounds i8, i8* %base, i32 -96
				%val = load i8, i8* %newaddr
				ret i8 %val
				}

				define i8 @test_valid_wrap_optimizable1(i8* %base, i32 %offset) {
				; CHECK-LABEL: test_valid_wrap_optimizable1:
				; CHECK: ldrb w0, [x0, w1, sxtw]

				%newaddr = getelementptr inbounds i8, i8* %base, i32 %offset
				%val = load i8, i8* %newaddr
				ret i8 %val
				}

				;
				define i8 @test_valid_wrap_optimizable2(i8* %base, i32 %offset) {
				; CHECK-LABEL: test_valid_wrap_optimizable2:
				; CHECK: sxtw x[[OFFSET:[0-9]+]], w1
				; CHECK: mov w[[BASE:[0-9]+]], #-100
				; CHECK: ldrb w0, [x[[OFFSET]], x[[BASE]]]

				%newaddr = getelementptr inbounds i8, i8* inttoptr(i32 -100 to i8*), i32 %offset
				%val = load i8, i8* %newaddr
				ret i8 %val
				}
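The comment on `test_valid_wrap` turns on the difference between 32-bit and 64-bit address arithmetic. A rough Python model of the two computations follows; the function names are hypothetical and the masks simply stand in for register widths:

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def addr_via_32bit_sub(base):
    """Models `sub wN, w0, #96` followed by `ldr w0, [xN]`: the subtraction
    wraps modulo 2**32 and the result is zero-extended into the x register."""
    return (base - 96) & MASK32


def addr_via_64bit_offset(base):
    """Models `ldr w0, [x0, #-96]`: the offset is applied to the full
    64-bit register, so a small base wraps modulo 2**64 instead."""
    return ((base & MASK32) - 96) & MASK64
```

For a base of 10 the first form yields 0xFFFFFFAA, which is still a 32-bit address, whereas the second yields 0xFFFFFFFFFFFFFFAA. For bases of at least 96 the two agree, which is why the folded form is only a problem for small bases.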

llvm/test/CodeGen/AArch64/arm64_32-atomics.ll

This file was added.

				; RUN: llc -mtriple=arm64_32-apple-ios7.0 -o - %s | FileCheck %s

				define i8 @test_load_8(i8* %addr) {
				; CHECK-LABEL: test_load_8:
				; CHECK: ldarb w0, [x0]
				%val = load atomic i8, i8* %addr seq_cst, align 1
				ret i8 %val
				}

				define i16 @test_load_16(i16* %addr) {
				; CHECK-LABEL: test_load_16:
				; CHECK: ldarh w0, [x0]
				%val = load atomic i16, i16* %addr acquire, align 2
				ret i16 %val
				}

				define i32 @test_load_32(i32* %addr) {
				; CHECK-LABEL: test_load_32:
				; CHECK: ldar w0, [x0]
				%val = load atomic i32, i32* %addr seq_cst, align 4
				ret i32 %val
				}

				define i64 @test_load_64(i64* %addr) {
				; CHECK-LABEL: test_load_64:
				; CHECK: ldar x0, [x0]
				%val = load atomic i64, i64* %addr seq_cst, align 8
				ret i64 %val
				}

				define i8* @test_load_ptr(i8** %addr) {
				; CHECK-LABEL: test_load_ptr:
				; CHECK: ldar w0, [x0]
				%val = load atomic i8*, i8** %addr seq_cst, align 8
				ret i8* %val
				}

				define void @test_store_8(i8* %addr) {
				; CHECK-LABEL: test_store_8:
				; CHECK: stlrb wzr, [x0]
				store atomic i8 0, i8* %addr seq_cst, align 1
				ret void
				}

				define void @test_store_16(i16* %addr) {
				; CHECK-LABEL: test_store_16:
				; CHECK: stlrh wzr, [x0]
				store atomic i16 0, i16* %addr seq_cst, align 2
				ret void
				}

				define void @test_store_32(i32* %addr) {
				; CHECK-LABEL: test_store_32:
				; CHECK: stlr wzr, [x0]
				store atomic i32 0, i32* %addr seq_cst, align 4
				ret void
				}

				define void @test_store_64(i64* %addr) {
				; CHECK-LABEL: test_store_64:
				; CHECK: stlr xzr, [x0]
				store atomic i64 0, i64* %addr seq_cst, align 8
				ret void
				}

				define void @test_store_ptr(i8** %addr) {
				; CHECK-LABEL: test_store_ptr:
				; CHECK: stlr wzr, [x0]
				store atomic i8* null, i8** %addr seq_cst, align 8
				ret void
				}

				declare i64 @llvm.aarch64.ldxr.p0i8(i8* %addr)
				declare i64 @llvm.aarch64.ldxr.p0i16(i16* %addr)
				declare i64 @llvm.aarch64.ldxr.p0i32(i32* %addr)
				declare i64 @llvm.aarch64.ldxr.p0i64(i64* %addr)

				define i8 @test_ldxr_8(i8* %addr) {
				; CHECK-LABEL: test_ldxr_8:
				; CHECK: ldxrb w0, [x0]

				%val = call i64 @llvm.aarch64.ldxr.p0i8(i8* %addr)
				%val8 = trunc i64 %val to i8
				ret i8 %val8
				}

				define i16 @test_ldxr_16(i16* %addr) {
				; CHECK-LABEL: test_ldxr_16:
				; CHECK: ldxrh w0, [x0]

				%val = call i64 @llvm.aarch64.ldxr.p0i16(i16* %addr)
				%val16 = trunc i64 %val to i16
				ret i16 %val16
				}

				define i32 @test_ldxr_32(i32* %addr) {
				; CHECK-LABEL: test_ldxr_32:
				; CHECK: ldxr w0, [x0]

				%val = call i64 @llvm.aarch64.ldxr.p0i32(i32* %addr)
				%val32 = trunc i64 %val to i32
				ret i32 %val32
				}

				define i64 @test_ldxr_64(i64* %addr) {
				; CHECK-LABEL: test_ldxr_64:
				; CHECK: ldxr x0, [x0]

				%val = call i64 @llvm.aarch64.ldxr.p0i64(i64* %addr)
				ret i64 %val
				}

				declare i64 @llvm.aarch64.ldaxr.p0i8(i8* %addr)
				declare i64 @llvm.aarch64.ldaxr.p0i16(i16* %addr)
				declare i64 @llvm.aarch64.ldaxr.p0i32(i32* %addr)
				declare i64 @llvm.aarch64.ldaxr.p0i64(i64* %addr)

				define i8 @test_ldaxr_8(i8* %addr) {
				; CHECK-LABEL: test_ldaxr_8:
				; CHECK: ldaxrb w0, [x0]

				%val = call i64 @llvm.aarch64.ldaxr.p0i8(i8* %addr)
				%val8 = trunc i64 %val to i8
				ret i8 %val8
				}

				define i16 @test_ldaxr_16(i16* %addr) {
				; CHECK-LABEL: test_ldaxr_16:
				; CHECK: ldaxrh w0, [x0]

				%val = call i64 @llvm.aarch64.ldaxr.p0i16(i16* %addr)
				%val16 = trunc i64 %val to i16
				ret i16 %val16
				}

				define i32 @test_ldaxr_32(i32* %addr) {
				; CHECK-LABEL: test_ldaxr_32:
				; CHECK: ldaxr w0, [x0]

				%val = call i64 @llvm.aarch64.ldaxr.p0i32(i32* %addr)
				%val32 = trunc i64 %val to i32
				ret i32 %val32
				}

				define i64 @test_ldaxr_64(i64* %addr) {
				; CHECK-LABEL: test_ldaxr_64:
				; CHECK: ldaxr x0, [x0]

				%val = call i64 @llvm.aarch64.ldaxr.p0i64(i64* %addr)
				ret i64 %val
				}

				declare i32 @llvm.aarch64.stxr.p0i8(i64, i8*)
				declare i32 @llvm.aarch64.stxr.p0i16(i64, i16*)
				declare i32 @llvm.aarch64.stxr.p0i32(i64, i32*)
				declare i32 @llvm.aarch64.stxr.p0i64(i64, i64*)

				define i32 @test_stxr_8(i8* %addr, i8 %val) {
				; CHECK-LABEL: test_stxr_8:
				; CHECK: stxrb [[TMP:w[0-9]+]], w1, [x0]
				; CHECK: mov w0, [[TMP]]

				%extval = zext i8 %val to i64
				%success = call i32 @llvm.aarch64.stxr.p0i8(i64 %extval, i8* %addr)
				ret i32 %success
				}

				define i32 @test_stxr_16(i16* %addr, i16 %val) {
				; CHECK-LABEL: test_stxr_16:
				; CHECK: stxrh [[TMP:w[0-9]+]], w1, [x0]
				; CHECK: mov w0, [[TMP]]

				%extval = zext i16 %val to i64
				%success = call i32 @llvm.aarch64.stxr.p0i16(i64 %extval, i16* %addr)
				ret i32 %success
				}

				define i32 @test_stxr_32(i32* %addr, i32 %val) {
				; CHECK-LABEL: test_stxr_32:
				; CHECK: stxr [[TMP:w[0-9]+]], w1, [x0]
				; CHECK: mov w0, [[TMP]]

				%extval = zext i32 %val to i64
				%success = call i32 @llvm.aarch64.stxr.p0i32(i64 %extval, i32* %addr)
				ret i32 %success
				}

				define i32 @test_stxr_64(i64* %addr, i64 %val) {
				; CHECK-LABEL: test_stxr_64:
				; CHECK: stxr [[TMP:w[0-9]+]], x1, [x0]
				; CHECK: mov w0, [[TMP]]

				%success = call i32 @llvm.aarch64.stxr.p0i64(i64 %val, i64* %addr)
				ret i32 %success
				}

				declare i32 @llvm.aarch64.stlxr.p0i8(i64, i8*)
				declare i32 @llvm.aarch64.stlxr.p0i16(i64, i16*)
				declare i32 @llvm.aarch64.stlxr.p0i32(i64, i32*)
				declare i32 @llvm.aarch64.stlxr.p0i64(i64, i64*)

				define i32 @test_stlxr_8(i8* %addr, i8 %val) {
				; CHECK-LABEL: test_stlxr_8:
				; CHECK: stlxrb [[TMP:w[0-9]+]], w1, [x0]
				; CHECK: mov w0, [[TMP]]

				%extval = zext i8 %val to i64
				%success = call i32 @llvm.aarch64.stlxr.p0i8(i64 %extval, i8* %addr)
				ret i32 %success
				}

				define i32 @test_stlxr_16(i16* %addr, i16 %val) {
				; CHECK-LABEL: test_stlxr_16:
				; CHECK: stlxrh [[TMP:w[0-9]+]], w1, [x0]
				; CHECK: mov w0, [[TMP]]

				%extval = zext i16 %val to i64
				%success = call i32 @llvm.aarch64.stlxr.p0i16(i64 %extval, i16* %addr)
				ret i32 %success
				}

				define i32 @test_stlxr_32(i32* %addr, i32 %val) {
				; CHECK-LABEL: test_stlxr_32:
				; CHECK: stlxr [[TMP:w[0-9]+]], w1, [x0]
				; CHECK: mov w0, [[TMP]]

				%extval = zext i32 %val to i64
				%success = call i32 @llvm.aarch64.stlxr.p0i32(i64 %extval, i32* %addr)
				ret i32 %success
				}

				define i32 @test_stlxr_64(i64* %addr, i64 %val) {
				; CHECK-LABEL: test_stlxr_64:
				; CHECK: stlxr [[TMP:w[0-9]+]], x1, [x0]
				; CHECK: mov w0, [[TMP]]

				%success = call i32 @llvm.aarch64.stlxr.p0i64(i64 %val, i64* %addr)
				ret i32 %success
				}

				define {i8*, i1} @test_cmpxchg_ptr(i8** %addr, i8* %cmp, i8* %new) {
				; CHECK-LABEL: test_cmpxchg_ptr:
				; CHECK: [[LOOP:LBB[0-9]+_[0-9]+]]:
				; CHECK: ldaxr [[OLD:w[0-9]+]], [x0]
				; CHECK: cmp [[OLD]], w1
				; CHECK: b.ne [[DONE:LBB[0-9]+_[0-9]+]]
				; CHECK: stlxr [[SUCCESS:w[0-9]+]], w2, [x0]
				; CHECK: cbnz [[SUCCESS]], [[LOOP]]

				; CHECK: mov w1, #1
				; CHECK: mov w0, [[OLD]]
				; CHECK: ret

				; CHECK: [[DONE]]:
				; CHECK: clrex
				; CHECK: mov w1, wzr
				; CHECK: mov w0, [[OLD]]
				; CHECK: ret
				%res = cmpxchg i8** %addr, i8* %cmp, i8* %new acq_rel acquire
				ret {i8*, i1} %res
				}
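The CHECK lines for `test_cmpxchg_ptr` describe a load-exclusive/store-exclusive retry loop. As a rough illustration of that control flow only, here is a single-threaded Python model. The `ExclusiveMemory` class, its one-entry monitor, and the `spurious_failures` knob are hypothetical simplifications and say nothing about real hardware behaviour:

```python
class ExclusiveMemory:
    """Hypothetical one-word memory with a one-entry exclusive monitor."""

    def __init__(self, value, spurious_failures=0):
        self.value = value
        self.monitor_armed = False
        # Number of store-exclusive attempts that should fail artificially.
        self.spurious_failures = spurious_failures

    def load_exclusive(self):
        """Models ldaxr: read the value and arm the monitor."""
        self.monitor_armed = True
        return self.value

    def store_exclusive(self, new):
        """Models stlxr: returns 0 on success, 1 on failure."""
        if not self.monitor_armed:
            return 1
        self.monitor_armed = False
        if self.spurious_failures > 0:
            self.spurious_failures -= 1
            return 1
        self.value = new
        return 0

    def clear_exclusive(self):
        """Models clrex: drop the reservation without storing."""
        self.monitor_armed = False


def cmpxchg(mem, expected, new):
    """Returns (old_value, success), mirroring the {value, i1} result."""
    while True:
        old = mem.load_exclusive()          # ldaxr
        if old != expected:                 # cmp / b.ne
            mem.clear_exclusive()           # clrex
            return old, False
        if mem.store_exclusive(new) == 0:   # stlxr / cbnz
            return old, True
```

A failed store-exclusive sends control back to the load, matching the `cbnz [[SUCCESS]], [[LOOP]]` line, while a failed comparison exits through the `clrex` path.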

llvm/test/CodeGen/AArch64/arm64_32-fastisel.ll

This file was added.

				; RUN: llc -mtriple=arm64_32-apple-ios -O0 -fast-isel %s -o - | FileCheck %s
				@var = global i8* null

				define void @test_store_release_ptr() {
				; CHECK-LABEL: test_store_release_ptr
				; CHECK: mov [[ZERO:w[0-9]+]], wzr
				; CHECK: stlr [[ZERO]]
				store atomic i8* null, i8** @var release, align 4
				br label %next

				next:
				ret void
				}

				declare [2 x i32] @callee()

				define void @test_struct_return(i32* %addr) {
				; CHECK-LABEL: test_struct_return:
				; CHECK: bl _callee
				; CHECK-DAG: lsr [[HI:x[0-9]+]], x0, #32
				; CHECK-DAG: mov [[LO:w[0-9]+]], w0
				%res = call [2 x i32] @callee()
				%res.0 = extractvalue [2 x i32] %res, 0
				store i32 %res.0, i32* %addr
				%res.1 = extractvalue [2 x i32] %res, 1
				store i32 %res.1, i32* %addr
				ret void
				}
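The CHECK-DAG lines in `test_struct_return` suggest that both `i32` elements of the `[2 x i32]` result come back packed in `x0` and are then split with a shift and a 32-bit move. A small Python sketch of that unpacking follows; the helper names are hypothetical, and placing element 0 in the low half is an assumption read off the CHECK lines rather than a statement of the ABI:

```python
MASK32 = 0xFFFFFFFF


def pack_pair(lo, hi):
    """Assumed layout: element 0 in bits 0..31, element 1 in bits 32..63."""
    return ((hi & MASK32) << 32) | (lo & MASK32)


def unpack_pair(x0):
    """Split a 64-bit register value into its two 32-bit halves."""
    lo = x0 & MASK32           # corresponds to `mov wLO, w0`
    hi = (x0 >> 32) & MASK32   # corresponds to `lsr xHI, x0, #32`
    return lo, hi
```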

llvm/test/CodeGen/AArch64/arm64_32-frame-pointers.ll

This file was added.

				; RUN: llc -mtriple=arm64_32-apple-ios8.0 %s -o - | FileCheck %s

				; We're provoking LocalStackSlotAllocation to create some shared frame bases
				; here: it wants multiple <fi#N> using instructions that can be satisfied by a
				; single base, but not within the addressing-mode.
				;
				; When that happens it's important that we don't mix our pointer sizes
				; (e.g. try to create an ldr from a w-register base).
				define i8 @test_register_wrangling() {
				; CHECK-LABEL: test_register_wrangling:
				; CHECK: add [[TMP:x[0-9]+]], sp,
				; CHECK: add x[[BASE:[0-9]+]], [[TMP]],
				; CHECK: ldrb {{w[0-9]+}}, [x[[BASE]], #1]
				; CHECK: ldrb {{w[0-9]+}}, [x[[BASE]]]

				%var1 = alloca i8, i32 4100
				%var3 = alloca i8
				%dummy = alloca i8, i32 4100

				%var1p1 = getelementptr i8, i8* %var1, i32 1
				%val1 = load i8, i8* %var1
				%val2 = load i8, i8* %var3

				%sum = add i8 %val1, %val2
				ret i8 %sum
				}

llvm/test/CodeGen/AArch64/arm64_32-gep-sink.ll

This file was added.

				; RUN: opt -codegenprepare -mtriple=arm64_32-apple-ios %s -S -o - | FileCheck %s

				define void @test_simple_sink(i1* %base, i64 %offset) {
				; CHECK-LABEL: @test_simple_sink
				; CHECK: next:
				; CHECK: [[BASE8:%.*]] = bitcast i1* %base to i8*
				; CHECK: [[ADDR8:%.*]] = getelementptr i8, i8* [[BASE8]], i64 %offset
				; CHECK: [[ADDR:%.*]] = bitcast i8* [[ADDR8]] to i1*
				; CHECK: load volatile i1, i1* [[ADDR]]
				%addr = getelementptr i1, i1* %base, i64 %offset
				%tst = load i1, i1* %addr
				br i1 %tst, label %next, label %end

				next:
				load volatile i1, i1* %addr
				ret void

				end:
				ret void
				}

				define void @test_inbounds_sink(i1* %base, i64 %offset) {
				; CHECK-LABEL: @test_inbounds_sink
				; CHECK: next:
				; CHECK: [[BASE8:%.*]] = bitcast i1* %base to i8*
				; CHECK: [[ADDR8:%.*]] = getelementptr inbounds i8, i8* [[BASE8]], i64 %offset
				; CHECK: [[ADDR:%.*]] = bitcast i8* [[ADDR8]] to i1*
				; CHECK: load volatile i1, i1* [[ADDR]]
				%addr = getelementptr inbounds i1, i1* %base, i64 %offset
				%tst = load i1, i1* %addr
				br i1 %tst, label %next, label %end

				next:
				load volatile i1, i1* %addr
				ret void

				end:
				ret void
				}

				; No address derived via an add can be guaranteed inbounds
				define void @test_add_sink(i1* %base, i64 %offset) {
				; CHECK-LABEL: @test_add_sink
				; CHECK: next:
				; CHECK: [[BASE8:%.*]] = bitcast i1* %base to i8*
				; CHECK: [[ADDR8:%.*]] = getelementptr i8, i8* [[BASE8]], i64 %offset
				; CHECK: [[ADDR:%.*]] = bitcast i8* [[ADDR8]] to i1*
				; CHECK: load volatile i1, i1* [[ADDR]]
				%base64 = ptrtoint i1* %base to i64
				%addr64 = add nsw nuw i64 %base64, %offset
				%addr = inttoptr i64 %addr64 to i1*
				%tst = load i1, i1* %addr
				br i1 %tst, label %next, label %end

				next:
				load volatile i1, i1* %addr
				ret void

				end:
				ret void
				}

llvm/test/CodeGen/AArch64/arm64_32-memcpy.ll

This file was added.

				; RUN: llc -mtriple=arm64_32-apple-ios9.0 -o - %s | FileCheck %s

				define i64 @test_memcpy(i64* %addr, i8* %src, i1 %tst) minsize {
				; CHECK-LABEL: test_memcpy:
				; CHECK: ldr [[VAL64:x[0-9]+]], [x0]
				; [...]
				; CHECK: and x0, [[VAL64]], #0xffffffff
				; CHECK: bl _memcpy

				%val64 = load i64, i64* %addr
				br i1 %tst, label %true, label %false

				true:
				ret i64 %val64

				false:
				%val32 = trunc i64 %val64 to i32
				%val.ptr = inttoptr i32 %val32 to i8*
				call void @llvm.memcpy.p0i8.p0i8.i32(i8* %val.ptr, i8* %src, i32 128, i32 0, i1 1)
				ret i64 undef
				}

				define i64 @test_memmove(i64* %addr, i8* %src, i1 %tst) minsize {
				; CHECK-LABEL: test_memmove:
				; CHECK: ldr [[VAL64:x[0-9]+]], [x0]
				; [...]
				; CHECK: and x0, [[VAL64]], #0xffffffff
				; CHECK: bl _memmove

				%val64 = load i64, i64* %addr
				br i1 %tst, label %true, label %false

				true:
				ret i64 %val64

				false:
				%val32 = trunc i64 %val64 to i32
				%val.ptr = inttoptr i32 %val32 to i8*
				call void @llvm.memmove.p0i8.p0i8.i32(i8* %val.ptr, i8* %src, i32 128, i32 0, i1 1)
				ret i64 undef
				}

				define i64 @test_memset(i64* %addr, i8* %src, i1 %tst) minsize {
				; CHECK-LABEL: test_memset:
				; CHECK: ldr [[VAL64:x[0-9]+]], [x0]
				; [...]
				; CHECK: and x0, [[VAL64]], #0xffffffff
				; CHECK: bl _memset

				%val64 = load i64, i64* %addr
				br i1 %tst, label %true, label %false

				true:
				ret i64 %val64

				false:
				%val32 = trunc i64 %val64 to i32
				%val.ptr = inttoptr i32 %val32 to i8*
				call void @llvm.memset.p0i8.i32(i8* %val.ptr, i8 42, i32 256, i32 0, i1 1)
				ret i64 undef
				}

				declare void @llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
				declare void @llvm.memmove.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
				declare void @llvm.memset.p0i8.i32(i8*, i8, i32, i32, i1)

llvm/test/CodeGen/AArch64/arm64_32-neon.ll

This file was added.

				; RUN: llc -mtriple=arm64_32-apple-ios7.0 -mcpu=cyclone %s -o - | FileCheck %s

				define <2 x double> @test_insert_elt(<2 x double> %vec, double %val) {
				; CHECK-LABEL: test_insert_elt:
				; CHECK: mov.d v0[0], v1[0]
				%res = insertelement <2 x double> %vec, double %val, i32 0
				ret <2 x double> %res
				aemerson: Is anything really expected to change with NEON & arm64_32?
				t.p.northover (Author): Not to change particularly, but this is in some sense the parts of NEON that could be affected by arm64_32: ABI boundaries and load/store addressing-modes. Testing it separately avoids duplicating the AArch64 tests that really aren't different (e.g. arithmetic) but still gives us the coverage.
				}

				define void @test_split_16B(<4 x float> %val, <4 x float>* %addr) {
				; CHECK-LABEL: test_split_16B:
				; CHECK: str q0, [x0]
				store <4 x float> %val, <4 x float>* %addr, align 8
				ret void
				}

				define void @test_split_16B_splat(<4 x i32>, <4 x i32>* %addr) {
				; CHECK-LABEL: test_split_16B_splat:
				; CHECK: str {{q[0-9]+}}

				%vec.tmp0 = insertelement <4 x i32> undef, i32 42, i32 0
				%vec.tmp1 = insertelement <4 x i32> %vec.tmp0, i32 42, i32 1
				%vec.tmp2 = insertelement <4 x i32> %vec.tmp1, i32 42, i32 2
				%vec = insertelement <4 x i32> %vec.tmp2, i32 42, i32 3

				store <4 x i32> %vec, <4 x i32>* %addr, align 8
				ret void
				}


				%vec = type <2 x double>

				declare {%vec, %vec} @llvm.aarch64.neon.ld2r.v2f64.p0i8(i8*)
				define {%vec, %vec} @test_neon_load(i8* %addr) {
				; CHECK-LABEL: test_neon_load:
				; CHECK: ld2r.2d { v0, v1 }, [x0]
				%res = call {%vec, %vec} @llvm.aarch64.neon.ld2r.v2f64.p0i8(i8* %addr)
				ret {%vec, %vec} %res
				}

				declare {%vec, %vec} @llvm.aarch64.neon.ld2lane.v2f64.p0i8(%vec, %vec, i64, i8*)
				define {%vec, %vec} @test_neon_load_lane(i8* %addr, %vec %in1, %vec %in2) {
				; CHECK-LABEL: test_neon_load_lane:
				; CHECK: ld2.d { v0, v1 }[0], [x0]
				%res = call {%vec, %vec} @llvm.aarch64.neon.ld2lane.v2f64.p0i8(%vec %in1, %vec %in2, i64 0, i8* %addr)
				ret {%vec, %vec} %res
				}

				declare void @llvm.aarch64.neon.st2.v2f64.p0i8(%vec, %vec, i8*)
				define void @test_neon_store(i8* %addr, %vec %in1, %vec %in2) {
				; CHECK-LABEL: test_neon_store:
				; CHECK: st2.2d { v0, v1 }, [x0]
				call void @llvm.aarch64.neon.st2.v2f64.p0i8(%vec %in1, %vec %in2, i8* %addr)
				ret void
				}

				declare void @llvm.aarch64.neon.st2lane.v2f64.p0i8(%vec, %vec, i64, i8*)
				define void @test_neon_store_lane(i8* %addr, %vec %in1, %vec %in2) {
				; CHECK-LABEL: test_neon_store_lane:
				; CHECK: st2.d { v0, v1 }[1], [x0]
				call void @llvm.aarch64.neon.st2lane.v2f64.p0i8(%vec %in1, %vec %in2, i64 1, i8* %addr)
				ret void
				}

				declare {%vec, %vec} @llvm.aarch64.neon.ld2.v2f64.p0i8(i8*)
				define {{%vec, %vec}, i8*} @test_neon_load_post(i8* %addr, i32 %offset) {
				; CHECK-LABEL: test_neon_load_post:
				; CHECK-DAG: sxtw [[OFFSET:x[0-9]+]], w1
				; CHECK: ld2.2d { v0, v1 }, [x0], [[OFFSET]]

				%vecs = call {%vec, %vec} @llvm.aarch64.neon.ld2.v2f64.p0i8(i8* %addr)

				%addr.new = getelementptr inbounds i8, i8* %addr, i32 %offset

				%res.tmp = insertvalue {{%vec, %vec}, i8*} undef, {%vec, %vec} %vecs, 0
				%res = insertvalue {{%vec, %vec}, i8*} %res.tmp, i8* %addr.new, 1
				ret {{%vec, %vec}, i8*} %res
				}

				define {{%vec, %vec}, i8*} @test_neon_load_post_lane(i8* %addr, i32 %offset, %vec %in1, %vec %in2) {
				; CHECK-LABEL: test_neon_load_post_lane:
				; CHECK-DAG: sxtw [[OFFSET:x[0-9]+]], w1
				; CHECK: ld2.d { v0, v1 }[1], [x0], [[OFFSET]]

				%vecs = call {%vec, %vec} @llvm.aarch64.neon.ld2lane.v2f64.p0i8(%vec %in1, %vec %in2, i64 1, i8* %addr)

				%addr.new = getelementptr inbounds i8, i8* %addr, i32 %offset

				%res.tmp = insertvalue {{%vec, %vec}, i8*} undef, {%vec, %vec} %vecs, 0
				%res = insertvalue {{%vec, %vec}, i8*} %res.tmp, i8* %addr.new, 1
				ret {{%vec, %vec}, i8*} %res
				}

				define i8* @test_neon_store_post(i8* %addr, i32 %offset, %vec %in1, %vec %in2) {
				; CHECK-LABEL: test_neon_store_post:
				; CHECK-DAG: sxtw [[OFFSET:x[0-9]+]], w1
				; CHECK: st2.2d { v0, v1 }, [x0], [[OFFSET]]

				call void @llvm.aarch64.neon.st2.v2f64.p0i8(%vec %in1, %vec %in2, i8* %addr)

				%addr.new = getelementptr inbounds i8, i8* %addr, i32 %offset

				ret i8* %addr.new
				}

				define i8* @test_neon_store_post_lane(i8* %addr, i32 %offset, %vec %in1, %vec %in2) {
				; CHECK-LABEL: test_neon_store_post_lane:
				; CHECK: sxtw [[OFFSET:x[0-9]+]], w1
				; CHECK: st2.d { v0, v1 }[0], [x0], [[OFFSET]]

				call void @llvm.aarch64.neon.st2lane.v2f64.p0i8(%vec %in1, %vec %in2, i64 0, i8* %addr)

				%addr.new = getelementptr inbounds i8, i8* %addr, i32 %offset

				ret i8* %addr.new
				}

				; ld1 is slightly different because it goes via ISelLowering of normal IR ops
				; rather than an intrinsic.
				define {%vec, double*} @test_neon_ld1_post_lane(double* %addr, i32 %offset, %vec %in) {
				; CHECK-LABEL: test_neon_ld1_post_lane:
				; CHECK: sbfiz [[OFFSET:x[0-9]+]], x1, #3, #32
				; CHECK: ld1.d { v0 }[0], [x0], [[OFFSET]]

				%loaded = load double, double* %addr, align 8
				%newvec = insertelement %vec %in, double %loaded, i32 0

				%addr.new = getelementptr inbounds double, double* %addr, i32 %offset

				%res.tmp = insertvalue {%vec, double*} undef, %vec %newvec, 0
				%res = insertvalue {%vec, double*} %res.tmp, double* %addr.new, 1

				ret {%vec, double*} %res
				}

				define {{%vec, %vec}, i8*} @test_neon_load_post_exact(i8* %addr) {
				; CHECK-LABEL: test_neon_load_post_exact:
				; CHECK: ld2.2d { v0, v1 }, [x0], #32

				%vecs = call {%vec, %vec} @llvm.aarch64.neon.ld2.v2f64.p0i8(i8* %addr)

				%addr.new = getelementptr inbounds i8, i8* %addr, i32 32

				%res.tmp = insertvalue {{%vec, %vec}, i8*} undef, {%vec, %vec} %vecs, 0
				%res = insertvalue {{%vec, %vec}, i8*} %res.tmp, i8* %addr.new, 1
				ret {{%vec, %vec}, i8*} %res
				}

				define {%vec, double*} @test_neon_ld1_post_lane_exact(double* %addr, %vec %in) {
				; CHECK-LABEL: test_neon_ld1_post_lane_exact:
				; CHECK: ld1.d { v0 }[0], [x0], #8

				%loaded = load double, double* %addr, align 8
				%newvec = insertelement %vec %in, double %loaded, i32 0

				%addr.new = getelementptr inbounds double, double* %addr, i32 1

				%res.tmp = insertvalue {%vec, double*} undef, %vec %newvec, 0
				%res = insertvalue {%vec, double*} %res.tmp, double* %addr.new, 1

				ret {%vec, double*} %res
				}

				; As in the general load/store case, this GEP has defined semantics when the
				; address wraps. We cannot use post-indexed addressing.
				define {%vec, double*} @test_neon_ld1_notpost_lane_exact(double* %addr, %vec %in) {
				; CHECK-LABEL: test_neon_ld1_notpost_lane_exact:
				; CHECK-NOT: ld1.d { {{v[0-9]+}} }[0], [{{x[0-9]+|sp}}], #8
				; CHECK: add w0, w0, #8
				; CHECK: ret

				%loaded = load double, double* %addr, align 8
				%newvec = insertelement %vec %in, double %loaded, i32 0

				%addr.new = getelementptr double, double* %addr, i32 1

				%res.tmp = insertvalue {%vec, double*} undef, %vec %newvec, 0
				%res = insertvalue {%vec, double*} %res.tmp, double* %addr.new, 1

				ret {%vec, double*} %res
				}

				define {%vec, double*} @test_neon_ld1_notpost_lane(double* %addr, i32 %offset, %vec %in) {
				; CHECK-LABEL: test_neon_ld1_notpost_lane:
				; CHECK-NOT: ld1.d { {{v[0-9]+}} }[0], [{{x[0-9]+|sp}}], {{x[0-9]+|sp}}
				; CHECK: add w0, w0, w1, lsl #3
				; CHECK: ret

				%loaded = load double, double* %addr, align 8
				%newvec = insertelement %vec %in, double %loaded, i32 0

				%addr.new = getelementptr double, double* %addr, i32 %offset

				%res.tmp = insertvalue {%vec, double*} undef, %vec %newvec, 0
				%res = insertvalue {%vec, double*} %res.tmp, double* %addr.new, 1

				ret {%vec, double*} %res
				}

llvm/test/CodeGen/AArch64/arm64_32-null.ll

This file was added.

				; RUN: llc -fast-isel=true -global-isel=false -O0 -mtriple=arm64_32-apple-ios %s -o - | FileCheck %s
				; RUN: llc -fast-isel=false -global-isel=false -O0 -mtriple=arm64_32-apple-ios %s -o - | FileCheck %s

				define void @test_store(i8** %p) {
				; CHECK-LABEL: test_store:
				; CHECK: mov [[R1:w[0-9]+]], wzr
				; CHECK: str [[R1]], [x0]

				store i8* null, i8** %p
				ret void
				}

				define void @test_phi(i8** %p) {
				; CHECK-LABEL: test_phi:
				; CHECK: mov [[R1:x[0-9]+]], xzr
				; CHECK: str [[R1]], [sp]
				; CHECK: b [[BB:LBB[0-9_]+]]
				; CHECK: [[BB]]:
				; CHECK: ldr x0, [sp]
				; CHECK: mov [[R2:w[0-9]+]], w0
				; CHECK: str [[R2]], [x{{.*}}]

				bb0:
				br label %bb1
				bb1:
				%tmp0 = phi i8* [ null, %bb0 ]
				store i8* %tmp0, i8** %p
				ret void
				}

llvm/test/CodeGen/AArch64/arm64_32-pointer-extend.ll

This file was added.

				; RUN: llc -mtriple=arm64_32-apple-ios7.0 %s -o - | FileCheck %s

				define void @pass_pointer(i64 %in) {
				; CHECK-LABEL: pass_pointer:
				; CHECK: and x0, x0, #0xffffffff
				; CHECK: bl _take_pointer

				%in32 = trunc i64 %in to i32
				%ptr = inttoptr i32 %in32 to i8*
				call i64 @take_pointer(i8* %ptr)
				ret void
				}

				define i64 @take_pointer(i8* %ptr) nounwind {
				; CHECK-LABEL: take_pointer:
				; CHECK-NEXT: %bb.0
				; CHECK-NEXT: ret

				%val = ptrtoint i8* %ptr to i32
				%res = zext i32 %val to i64
				ret i64 %res
				}

				define i32 @callee_ptr_stack_slot([8 x i64], i8*, i32 %val) {
				; CHECK-LABEL: callee_ptr_stack_slot:
				; CHECK: ldr w0, [sp, #4]

				ret i32 %val
				}

				define void @caller_ptr_stack_slot(i8* %ptr) {
				; CHECK-LABEL: caller_ptr_stack_slot:
				; CHECK-DAG: mov [[VAL:w[0-9]]], #42
				; CHECK: stp w0, [[VAL]], [sp]

				call i32 @callee_ptr_stack_slot([8 x i64] undef, i8* %ptr, i32 42)
				ret void
				}

				define i8* @return_ptr(i64 %in, i64 %r) {
				; CHECK-LABEL: return_ptr:
				; CHECK: sdiv [[VAL64:x[0-9]+]], x0, x1
				; CHECK: and x0, [[VAL64]], #0xffffffff

				%sum = sdiv i64 %in, %r
				%sum32 = trunc i64 %sum to i32
				%res = inttoptr i32 %sum32 to i8*
				ret i8* %res
				}

llvm/test/CodeGen/AArch64/arm64_32-stack-pointers.ll

This file was added.

				; RUN: llc -mtriple=arm64_32-apple-ios9.0 -o - %s | FileCheck %s

				declare void @callee([8 x i64], i8*, i8*)

				; Make sure we don't accidentally store X0 or XZR, which might well
				; clobber other arguments or data.
				define void @test_stack_ptr_32bits(i8* %in) {
				; CHECK-LABEL: test_stack_ptr_32bits:
				; CHECK-DAG: stp wzr, w0, [sp]

				call void @callee([8 x i64] undef, i8* null, i8* %in)
				ret void
				}

llvm/test/CodeGen/AArch64/arm64_32-tls.ll

This file was added.

				; RUN: llc -mtriple=arm64_32-apple-ios %s -o - | FileCheck %s

				define i32 @test_thread_local() {
				; CHECK-LABEL: test_thread_local:
				; CHECK: adrp x[[TMP:[0-9]+]], _var@TLVPPAGE
				; CHECK: ldr w0, [x[[TMP]], _var@TLVPPAGEOFF]
				; CHECK: ldr w[[DEST:[0-9]+]], [x0]
				; CHECK: blr x[[DEST]]

				%val = load i32, i32* @var
				ret i32 %val
				}

				@var = thread_local global i32 zeroinitializer

				; CHECK: .tbss _var$tlv$init, 4, 2

				; CHECK-LABEL: __DATA,__thread_vars
				; CHECK: _var:
				; CHECK: .long __tlv_bootstrap
				; CHECK: .long 0
				; CHECK: .long _var$tlv$init

llvm/test/CodeGen/AArch64/arm64_32-va.ll

This file was added.

				; RUN: llc -mtriple=arm64_32-apple-ios %s -o - | FileCheck %s

				define void @test_va_copy(i8* %dst, i8* %src) {
				; CHECK-LABEL: test_va_copy:
				; CHECK: ldr [[PTR:w[0-9]+]], [x1]
				; CHECK: str [[PTR]], [x0]

				call void @llvm.va_copy(i8* %dst, i8* %src)
				ret void
				}

				define void @test_va_start(i32, ...) {
				; CHECK-LABEL: test_va_start
				; CHECK: add x[[LIST:[0-9]+]], sp, #16
				; CHECK: str w[[LIST]],
				%slot = alloca i8*, align 4
				%list = bitcast i8** %slot to i8*
				call void @llvm.va_start(i8* %list)
				ret void
				}

				define void @test_va_start_odd([8 x i64], i32, ...) {
				; CHECK-LABEL: test_va_start_odd:
				; CHECK: add x[[LIST:[0-9]+]], sp, #20
				; CHECK: str w[[LIST]],
				%slot = alloca i8*, align 4
				%list = bitcast i8** %slot to i8*
				call void @llvm.va_start(i8* %list)
				ret void
				}

				define i8* @test_va_arg(i8** %list) {
				; CHECK-LABEL: test_va_arg:
				; CHECK: ldr w[[LOC:[0-9]+]], [x0]
				; CHECK: add [[NEXTLOC:w[0-9]+]], w[[LOC]], #4
				; CHECK: str [[NEXTLOC]], [x0]
				; CHECK: ldr w0, [x[[LOC]]]
				%res = va_arg i8** %list, i8*
				ret i8* %res
				}

				define i8* @really_test_va_arg(i8** %list, i1 %tst) {
				; CHECK-LABEL: really_test_va_arg:
				; CHECK: ldr w[[LOC:[0-9]+]], [x0]
				; CHECK: add [[NEXTLOC:w[0-9]+]], w[[LOC]], #4
				; CHECK: str [[NEXTLOC]], [x0]
				; CHECK: ldr w[[VAARG:[0-9]+]], [x[[LOC]]]
				; CHECK: csel x0, x[[VAARG]], xzr
				%tmp = va_arg i8** %list, i8*
				%res = select i1 %tst, i8* %tmp, i8* null
				ret i8* %res
				}

				declare void @llvm.va_start(i8*)

				declare void @llvm.va_copy(i8*, i8*)

llvm/test/CodeGen/AArch64/arm64_32.ll

This file was added.

				; RUN: llc -mtriple=arm64_32-apple-ios7.0 %s -filetype=obj -o - -disable-post-ra -frame-pointer=all | \
				; RUN: llvm-objdump -private-headers - | \
				; RUN: FileCheck %s --check-prefix=CHECK-MACHO
				; RUN: llc -mtriple=arm64_32-apple-ios7.0 %s -o - -aarch64-enable-atomic-cfg-tidy=0 -disable-post-ra -frame-pointer=all | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-OPT
				; RUN: llc -mtriple=arm64_32-apple-ios7.0 %s -o - -fast-isel -aarch64-enable-atomic-cfg-tidy=0 -disable-post-ra -frame-pointer=all | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-FAST

				; CHECK-MACHO: Mach header
				; CHECK-MACHO: MH_MAGIC ARM64_32 V8

				@var64 = global i64 zeroinitializer, align 8
				@var32 = global i32 zeroinitializer, align 4

				@var_got = external global i8

				define i32* @test_global_addr() {
				; CHECK-LABEL: test_global_addr:
				; CHECK: adrp [[PAGE:x[0-9]+]], _var32@PAGE
				; CHECK: add x0, [[PAGE]], _var32@PAGEOFF
				ret i32* @var32
				}

				; ADRP is necessarily 64-bit. The important point to check is that, however that
				; gets truncated to 32-bits, it's free. No need to zero out higher bits of that
				; register.
				define i64 @test_global_addr_extension() {
				; CHECK-LABEL: test_global_addr_extension:
				; CHECK: adrp [[PAGE:x[0-9]+]], _var32@PAGE
				; CHECK: add x0, [[PAGE]], _var32@PAGEOFF
				; CHECK-NOT: and
				; CHECK: ret

				ret i64 ptrtoint(i32* @var32 to i64)
				}

				define i32 @test_global_value() {
				; CHECK-LABEL: test_global_value:
				; CHECK: adrp x[[PAGE:[0-9]+]], _var32@PAGE
				; CHECK: ldr w0, [x[[PAGE]], _var32@PAGEOFF]
				%val = load i32, i32* @var32, align 4
				ret i32 %val
				}

				; Because the addition may wrap, it is not safe to use "ldr w0, [xN, #32]" here.
				define i32 @test_unsafe_indexed_add() {
				; CHECK-LABEL: test_unsafe_indexed_add:
				; CHECK: add x[[VAR32:[0-9]+]], {{x[0-9]+}}, _var32@PAGEOFF
				; CHECK: add w[[ADDR:[0-9]+]], w[[VAR32]], #32
				; CHECK: ldr w0, [x[[ADDR]]]
				%addr_int = ptrtoint i32* @var32 to i32
				%addr_plus_32 = add i32 %addr_int, 32
				%addr = inttoptr i32 %addr_plus_32 to i32*
				%val = load i32, i32* %addr, align 4
				ret i32 %val
				}

				; Since we've promised there is no unsigned overflow, @var32 must be at least
				; 32-bytes below 2^32, and we can use the load this time.
				define i32 @test_safe_indexed_add() {
				; CHECK-LABEL: test_safe_indexed_add:
				; CHECK: add x[[VAR32:[0-9]+]], {{x[0-9]+}}, _var32@PAGEOFF
				; CHECK: add w[[ADDR:[0-9]+]], w[[VAR32]], #32
				; CHECK: ldr w0, [x[[ADDR]]]
				%addr_int = ptrtoint i32* @var32 to i64
				%addr_plus_32 = add nuw i64 %addr_int, 32
				%addr = inttoptr i64 %addr_plus_32 to i32*
				%val = load i32, i32* %addr, align 4
				ret i32 %val
				}

				define i32 @test_safe_indexed_or(i32 %in) {
				; CHECK-LABEL: test_safe_indexed_or:
				; CHECK: and [[TMP:w[0-9]+]], {{w[0-9]+}}, #0xfffffff0
				; CHECK: orr w[[ADDR:[0-9]+]], [[TMP]], #0x4
				; CHECK: ldr w0, [x[[ADDR]]]
				%addr_int = and i32 %in, -16
				%addr_plus_4 = or i32 %addr_int, 4
				%addr = inttoptr i32 %addr_plus_4 to i32*
				%val = load i32, i32* %addr, align 4
				ret i32 %val
				}


				; Promising nsw is not sufficient because the addressing mode basically
				; calculates "zext(base) + zext(offset)" and nsw only guarantees
				; "sext(base) + sext(offset) == base + offset".
				define i32 @test_unsafe_nsw_indexed_add() {
				; CHECK-LABEL: test_unsafe_nsw_indexed_add:
				; CHECK: add x[[VAR32:[0-9]+]], {{x[0-9]+}}, _var32@PAGEOFF
				; CHECK: add w[[ADDR:[0-9]+]], w[[VAR32]], #32
				; CHECK-NOT: ubfx
				; CHECK: ldr w0, [x[[ADDR]]]
				%addr_int = ptrtoint i32* @var32 to i32
				%addr_plus_32 = add nsw i32 %addr_int, 32
				%addr = inttoptr i32 %addr_plus_32 to i32*
				%val = load i32, i32* %addr, align 4
				ret i32 %val
				}

				; Because the addition may wrap, it is not safe to use "ldr w0, [xN, #32]" here.
				define i32 @test_unsafe_unscaled_add() {
				; CHECK-LABEL: test_unsafe_unscaled_add:
				; CHECK: add x[[VAR32:[0-9]+]], {{x[0-9]+}}, _var32@PAGEOFF
				; CHECK: add w[[ADDR:[0-9]+]], w[[VAR32]], #3
				; CHECK: ldr w0, [x[[ADDR]]]
				%addr_int = ptrtoint i32* @var32 to i32
				%addr_plus_3 = add i32 %addr_int, 3
				%addr = inttoptr i32 %addr_plus_3 to i32*
				%val = load i32, i32* %addr, align 1
				ret i32 %val
				}

				; Since we've promised there is no unsigned overflow, @var32 must be at least
				; 32-bytes below 2^32, and we can use the load this time.
				define i32 @test_safe_unscaled_add() {
				; CHECK-LABEL: test_safe_unscaled_add:
				; CHECK: add x[[VAR32:[0-9]+]], {{x[0-9]+}}, _var32@PAGEOFF
				; CHECK: add w[[ADDR:[0-9]+]], w[[VAR32]], #3
				; CHECK: ldr w0, [x[[ADDR]]]
				%addr_int = ptrtoint i32* @var32 to i32
				%addr_plus_3 = add nuw i32 %addr_int, 3
				%addr = inttoptr i32 %addr_plus_3 to i32*
				%val = load i32, i32* %addr, align 1
				ret i32 %val
				}

				; Promising nsw is not sufficient because the addressing mode basically
				; calculates "zext(base) + zext(offset)" and nsw only guarantees
				; "sext(base) + sext(offset) == base + offset".
				define i32 @test_unsafe_nsw_unscaled_add() {
				; CHECK-LABEL: test_unsafe_nsw_unscaled_add:
				; CHECK: add x[[VAR32:[0-9]+]], {{x[0-9]+}}, _var32@PAGEOFF
				; CHECK: add w[[ADDR:[0-9]+]], w[[VAR32]], #3
				; CHECK-NOT: ubfx
				; CHECK: ldr w0, [x[[ADDR]]]
				%addr_int = ptrtoint i32* @var32 to i32
				%addr_plus_3 = add nsw i32 %addr_int, 3
				%addr = inttoptr i32 %addr_plus_3 to i32*
				%val = load i32, i32* %addr, align 1
				ret i32 %val
				}

				; Because the addition may wrap, it is not safe to use "ldur w0, [xN, #-3]"
				; here.
				define i32 @test_unsafe_negative_unscaled_add() {
				; CHECK-LABEL: test_unsafe_negative_unscaled_add:
				; CHECK: add x[[VAR32:[0-9]+]], {{x[0-9]+}}, _var32@PAGEOFF
				; CHECK: sub w[[ADDR:[0-9]+]], w[[VAR32]], #3
				; CHECK: ldr w0, [x[[ADDR]]]
				%addr_int = ptrtoint i32* @var32 to i32
				%addr_minus_3 = add i32 %addr_int, -3
				%addr = inttoptr i32 %addr_minus_3 to i32*
				%val = load i32, i32* %addr, align 1
				ret i32 %val
				}

				define i8* @test_got_addr() {
				; CHECK-LABEL: test_got_addr:
				; CHECK: adrp x[[PAGE:[0-9]+]], _var_got@GOTPAGE
				; CHECK: ldr w0, [x[[PAGE]], _var_got@GOTPAGEOFF]
				ret i8* @var_got
				}

				define float @test_va_arg_f32(i8** %list) {
				; CHECK-LABEL: test_va_arg_f32:

				; CHECK: ldr w[[START:[0-9]+]], [x0]
				; CHECK: add [[AFTER:w[0-9]+]], w[[START]], #8
				; CHECK: str [[AFTER]], [x0]

				; Floating point arguments get promoted to double as per C99.
				; CHECK: ldr [[DBL:d[0-9]+]], [x[[START]]]
				; CHECK: fcvt s0, [[DBL]]
				%res = va_arg i8** %list, float
				ret float %res
				}

				; Interesting point is that the slot is 4 bytes.
				define i8 @test_va_arg_i8(i8** %list) {
				; CHECK-LABEL: test_va_arg_i8:

				; CHECK: ldr w[[START:[0-9]+]], [x0]
				; CHECK: add [[AFTER:w[0-9]+]], w[[START]], #4
				; CHECK: str [[AFTER]], [x0]

				; i8 gets promoted to int (again, as per C99).
				; CHECK: ldr w0, [x[[START]]]

				%res = va_arg i8** %list, i8
				ret i8 %res
				}

				; Interesting point is that the slot needs aligning (again, min size is 4
				; bytes).
				define i64 @test_va_arg_i64(i64** %list) {
				; CHECK-LABEL: test_va_arg_i64:

				; Update the list for the next user (minimum slot size is 4, but the actual
				; argument is 8 which had better be reflected!)
				; CHECK: ldr w[[UNALIGNED_START:[0-9]+]], [x0]
				; CHECK: add [[ALIGN_TMP:x[0-9]+]], x[[UNALIGNED_START]], #7
				; CHECK: and x[[START:[0-9]+]], [[ALIGN_TMP]], #0x1fffffff8
				; CHECK: add w[[AFTER:[0-9]+]], w[[START]], #8
				; CHECK: str w[[AFTER]], [x0]

				; CHECK: ldr x0, [x[[START]]]

				%res = va_arg i64** %list, i64
				ret i64 %res
				}

				declare void @bar(...)
				define void @test_va_call(i8 %l, i8 %r, float %in, i8* %ptr) {
				; CHECK-LABEL: test_va_call:
				; CHECK: add [[SUM:w[0-9]+]], {{w[0-9]+}}, w1

				; CHECK-DAG: str w2, [sp, #32]
				; CHECK-DAG: str xzr, [sp, #24]
				; CHECK-DAG: str s0, [sp, #16]
				; CHECK-DAG: str xzr, [sp, #8]
				; CHECK-DAG: str [[SUM]], [sp]

				; Add them to ensure real promotion occurs.
				%sum = add i8 %l, %r
				call void(...) @bar(i8 %sum, i64 0, float %in, double 0.0, i8* %ptr)
				ret void
				}

				declare i8* @llvm.frameaddress(i32)

				define i8* @test_frameaddr() {
				; CHECK-LABEL: test_frameaddr:
				; CHECK: ldr {{w0|x0}}, [x29]
				%val = call i8* @llvm.frameaddress(i32 1)
				ret i8* %val
				}

				declare i8* @llvm.returnaddress(i32)

				define i8* @test_toplevel_returnaddr() {
				; CHECK-LABEL: test_toplevel_returnaddr:
				; CHECK: mov x0, x30
				%val = call i8* @llvm.returnaddress(i32 0)
				ret i8* %val
				}

				define i8* @test_deep_returnaddr() {
				; CHECK-LABEL: test_deep_returnaddr:
				; CHECK: ldr x[[FRAME_REC:[0-9]+]], [x29]
				; CHECK: ldr x0, [x[[FRAME_REC]], #8]
				%val = call i8* @llvm.returnaddress(i32 1)
				ret i8* %val
				}

				define void @test_indirect_call(void()* %func) {
				; CHECK-LABEL: test_indirect_call:
				; CHECK: blr x0
				call void() %func()
				ret void
				}

				; Safe to use the unextended address here
				define void @test_indirect_safe_call(i32* %weird_funcs) {
				; CHECK-LABEL: test_indirect_safe_call:
				; CHECK: add w[[ADDR32:[0-9]+]], w0, #4
				; CHECK-OPT-NOT: ubfx
				; CHECK: blr x[[ADDR32]]
				%addr = getelementptr i32, i32* %weird_funcs, i32 1
				%func = bitcast i32* %addr to void()*
				call void() %func()
				ret void
				}

				declare void @simple()
				define void @test_simple_tail_call() {
				; CHECK-LABEL: test_simple_tail_call:
				; CHECK: b _simple
				tail call void @simple()
				ret void
				}

				define void @test_indirect_tail_call(void()* %func) {
				; CHECK-LABEL: test_indirect_tail_call:
				; CHECK: br x0
				tail call void() %func()
				ret void
				}

				; Safe to use the unextended address here
				define void @test_indirect_safe_tail_call(i32* %weird_funcs) {
				; CHECK-LABEL: test_indirect_safe_tail_call:
				; CHECK: add w[[ADDR32:[0-9]+]], w0, #4
				; CHECK-OPT-NOT: ubfx
				; CHECK-OPT: br x[[ADDR32]]
				%addr = getelementptr i32, i32* %weird_funcs, i32 1
				%func = bitcast i32* %addr to void()*
				tail call void() %func()
				ret void
				}

				; For the "armv7k" slice, Clang will be emitting some small structs as [N x
				; i32]. For ABI compatibility with arm64_32 these need to be passed in X
				; registers (e.g. [2 x i32] would be packed into a single register).

				define i32 @test_in_smallstruct_low([3 x i32] %in) {
				; CHECK-LABEL: test_in_smallstruct_low:
				; CHECK: mov x0, x1
				%val = extractvalue [3 x i32] %in, 2
				ret i32 %val
				}

				define i32 @test_in_smallstruct_high([3 x i32] %in) {
				; CHECK-LABEL: test_in_smallstruct_high:
				; CHECK: lsr x0, x0, #32
				%val = extractvalue [3 x i32] %in, 1
				ret i32 %val
				}

				; The 64-bit DarwinPCS ABI has the quirk that structs on the stack are always
				; 64-bit aligned. This must not happen for arm64_32 since otherwise va_arg will
				; be incompatible with the armv7k ABI.
				define i32 @test_in_smallstruct_stack([8 x i64], i32, [3 x i32] %in) {
				; CHECK-LABEL: test_in_smallstruct_stack:
				; CHECK: ldr w0, [sp, #4]
				%val = extractvalue [3 x i32] %in, 0
				ret i32 %val
				}

				define [2 x i32] @test_ret_smallstruct([3 x i32] %in) {
				; CHECK-LABEL: test_ret_smallstruct:
				; CHECK: mov x0, #1
				; CHECK: movk x0, #2, lsl #32

				ret [2 x i32] [i32 1, i32 2]
				}

				declare void @smallstruct_callee([4 x i32])
				define void @test_call_smallstruct() {
				; CHECK-LABEL: test_call_smallstruct:
				; CHECK: mov x0, #1
				; CHECK: movk x0, #2, lsl #32
				; CHECK: mov x1, #3
				; CHECK: movk x1, #4, lsl #32
				; CHECK: bl _smallstruct_callee

				call void @smallstruct_callee([4 x i32] [i32 1, i32 2, i32 3, i32 4])
				ret void
				}

				declare void @smallstruct_callee_stack([8 x i64], i32, [2 x i32])
				define void @test_call_smallstruct_stack() {
				; CHECK-LABEL: test_call_smallstruct_stack:
				; CHECK: mov [[VAL:x[0-9]+]], #1
				; CHECK: movk [[VAL]], #2, lsl #32
				; CHECK: stur [[VAL]], [sp, #4]

				call void @smallstruct_callee_stack([8 x i64] undef, i32 undef, [2 x i32] [i32 1, i32 2])
				ret void
				}

				declare [3 x i32] @returns_smallstruct()
				define i32 @test_use_smallstruct_low() {
				; CHECK-LABEL: test_use_smallstruct_low:
				; CHECK: bl _returns_smallstruct
				; CHECK: mov x0, x1

				%struct = call [3 x i32] @returns_smallstruct()
				%val = extractvalue [3 x i32] %struct, 2
				ret i32 %val
				}

				define i32 @test_use_smallstruct_high() {
				; CHECK-LABEL: test_use_smallstruct_high:
				; CHECK: bl _returns_smallstruct
				; CHECK: lsr x0, x0, #32

				%struct = call [3 x i32] @returns_smallstruct()
				%val = extractvalue [3 x i32] %struct, 1
				ret i32 %val
				}

				; If a small struct can't be allocated to x0-x7, the remaining registers should
				; be marked as unavailable and subsequent GPR arguments should also be on the
				; stack. Obviously the struct itself should be passed entirely on the stack.
				define i32 @test_smallstruct_padding([7 x i64], [4 x i32] %struct, i32 %in) {
				; CHECK-LABEL: test_smallstruct_padding:
				; CHECK-DAG: ldr [[IN:w[0-9]+]], [sp, #16]
				; CHECK-DAG: ldr [[LHS:w[0-9]+]], [sp]
				; CHECK: add w0, [[LHS]], [[IN]]
				%lhs = extractvalue [4 x i32] %struct, 0
				%sum = add i32 %lhs, %in
				ret i32 %sum
				}

				declare void @take_small_smallstruct(i64, [1 x i32])
				define void @test_small_smallstruct() {
				; CHECK-LABEL: test_small_smallstruct:
				; CHECK-DAG: mov w0, #1
				; CHECK-DAG: mov w1, #2
				; CHECK: bl _take_small_smallstruct
				call void @take_small_smallstruct(i64 1, [1 x i32] [i32 2])
				ret void
				}

				define void @test_bare_frameaddr(i8** %addr) {
				; CHECK-LABEL: test_bare_frameaddr:
				; CHECK: add x[[LOCAL:[0-9]+]], sp, #{{[0-9]+}}
				; CHECK: str w[[LOCAL]],

				%ptr = alloca i8
				store i8* %ptr, i8** %addr, align 4
				ret void
				}

				define void @test_sret_use([8 x i64]* sret %out) {
				; CHECK-LABEL: test_sret_use:
				; CHECK: str xzr, [x8]
				%addr = getelementptr [8 x i64], [8 x i64]* %out, i32 0, i32 0
				store i64 0, i64* %addr
				ret void
				}

				define i64 @test_sret_call() {
				; CHECK-LABEL: test_sret_call:
				; CHECK: mov x8, sp
				; CHECK: bl _test_sret_use
				%arr = alloca [8 x i64]
				call void @test_sret_use([8 x i64]* sret %arr)

				%addr = getelementptr [8 x i64], [8 x i64]* %arr, i32 0, i32 0
				%val = load i64, i64* %addr
				ret i64 %val
				}

				define double @test_constpool() {
				; CHECK-LABEL: test_constpool:
				; CHECK: adrp x[[PAGE:[0-9]+]], [[POOL:lCPI[0-9]+_[0-9]+]]@PAGE
				; CHECK: ldr d0, [x[[PAGE]], [[POOL]]@PAGEOFF]
				ret double 1.0e-6
				}

				define i8* @test_blockaddress() {
				; CHECK-LABEL: test_blockaddress:
				; CHECK: [[BLOCK:Ltmp[0-9]+]]:
				; CHECK: adrp [[PAGE:x[0-9]+]], [[BLOCK]]@PAGE
				; CHECK: add x0, [[PAGE]], [[BLOCK]]@PAGEOFF
				br label %dest
				dest:
				ret i8* blockaddress(@test_blockaddress, %dest)
				}

				define i8* @test_indirectbr(i8* %dest) {
				; CHECK-LABEL: test_indirectbr:
				; CHECK: br x0
				indirectbr i8* %dest, [label %true, label %false]

				true:
				ret i8* blockaddress(@test_indirectbr, %true)
				false:
				ret i8* blockaddress(@test_indirectbr, %false)
				}

				; ISelDAGToDAG tries to fold an offset FI load (in this case var+4) into the
				; actual load instruction. This needs to be done slightly carefully since we
				; claim the FI in the process -- it doesn't need extending.
				define float @test_frameindex_offset_load() {
				; CHECK-LABEL: test_frameindex_offset_load:
				; CHECK: ldr s0, [sp, #4]
				%arr = alloca float, i32 4, align 8
				%addr = getelementptr inbounds float, float* %arr, i32 1

				%val = load float, float* %addr, align 4
				ret float %val
				}

				define void @test_unaligned_frameindex_offset_store() {
				; CHECK-LABEL: test_unaligned_frameindex_offset_store:
				; CHECK: mov x[[TMP:[0-9]+]], sp
				; CHECK: orr w[[ADDR:[0-9]+]], w[[TMP]], #0x2
				; CHECK: mov [[VAL:w[0-9]+]], #42
				; CHECK: str [[VAL]], [x[[ADDR]]]
				%arr = alloca [4 x i32]

				%addr.int = ptrtoint [4 x i32]* %arr to i32
				%addr.nextint = add nuw i32 %addr.int, 2
				%addr.next = inttoptr i32 %addr.nextint to i32*
				store i32 42, i32* %addr.next
				ret void
				}


				define {i64, i64*} @test_pre_idx(i64* %addr) {
				; CHECK-LABEL: test_pre_idx:

				; CHECK: add w[[ADDR:[0-9]+]], w0, #8
				; CHECK: ldr x0, [x[[ADDR]]]
				%addr.int = ptrtoint i64* %addr to i32
				%addr.next.int = add nuw i32 %addr.int, 8
				%addr.next = inttoptr i32 %addr.next.int to i64*
				%val = load i64, i64* %addr.next

				%tmp = insertvalue {i64, i64*} undef, i64 %val, 0
				%res = insertvalue {i64, i64*} %tmp, i64* %addr.next, 1

				ret {i64, i64*} %res
				}

				; Forming a post-indexed load is invalid here since the GEP needs to work when
				; %addr wraps round to 0.
				define {i64, i64*} @test_invalid_pre_idx(i64* %addr) {
				; CHECK-LABEL: test_invalid_pre_idx:
				; CHECK: add w1, w0, #8
				; CHECK: ldr x0, [x1]
				%addr.next = getelementptr i64, i64* %addr, i32 1
				%val = load i64, i64* %addr.next

				%tmp = insertvalue {i64, i64*} undef, i64 %val, 0
				%res = insertvalue {i64, i64*} %tmp, i64* %addr.next, 1

				ret {i64, i64*} %res
				}

				declare void @callee([8 x i32]*)
				define void @test_stack_guard() ssp {
				; CHECK-LABEL: test_stack_guard:
				; CHECK: adrp x[[GUARD_GOTPAGE:[0-9]+]], ___stack_chk_guard@GOTPAGE
				; CHECK: ldr w[[GUARD_ADDR:[0-9]+]], [x[[GUARD_GOTPAGE]], ___stack_chk_guard@GOTPAGEOFF]
				; CHECK: ldr [[GUARD_VAL:w[0-9]+]], [x[[GUARD_ADDR]]]
				; CHECK: stur [[GUARD_VAL]], [x29, #[[GUARD_OFFSET:-[0-9]+]]]

				; CHECK: add x0, sp, #{{[0-9]+}}
				; CHECK: bl _callee

				; CHECK-OPT: adrp x[[GUARD_GOTPAGE:[0-9]+]], ___stack_chk_guard@GOTPAGE
				; CHECK-OPT: ldr w[[GUARD_ADDR:[0-9]+]], [x[[GUARD_GOTPAGE]], ___stack_chk_guard@GOTPAGEOFF]
				; CHECK-OPT: ldr [[GUARD_VAL:w[0-9]+]], [x[[GUARD_ADDR]]]
				; CHECK-OPT: ldur [[NEW_VAL:w[0-9]+]], [x29, #[[GUARD_OFFSET]]]
				; CHECK-OPT: cmp [[GUARD_VAL]], [[NEW_VAL]]
				; CHECK-OPT: b.ne [[FAIL:LBB[0-9]+_[0-9]+]]

				; CHECK-OPT: [[FAIL]]:
				; CHECK-OPT-NEXT: bl ___stack_chk_fail
				%arr = alloca [8 x i32]
				call void @callee([8 x i32]* %arr)
				ret void
				}

				declare i32 @__gxx_personality_v0(...)
				declare void @eat_landingpad_args(i32, i8*, i32)
				@_ZTI8Whatever = external global i8
				define void @test_landingpad_marshalling() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
				; CHECK-LABEL: test_landingpad_marshalling:
				; CHECK-OPT: mov x2, x1
				; CHECK-OPT: mov x1, x0
				; CHECK: bl _eat_landingpad_args
				invoke void @callee([8 x i32]* undef) to label %done unwind label %lpad

				lpad: ; preds = %entry
				%exc = landingpad { i8*, i32 }
				catch i8* @_ZTI8Whatever
				%pointer = extractvalue { i8*, i32 } %exc, 0
				%selector = extractvalue { i8*, i32 } %exc, 1
				call void @eat_landingpad_args(i32 undef, i8* %pointer, i32 %selector)
				ret void

				done:
				ret void
				}

				define void @test_dynamic_stackalloc() {
				; CHECK-LABEL: test_dynamic_stackalloc:
				; CHECK: sub [[REG:x[0-9]+]], sp, #32
				; CHECK: mov sp, [[REG]]
				; CHECK-OPT-NOT: ubfx
				; CHECK: bl _callee
				br label %next

				next:
				%val = alloca [8 x i32]
				call void @callee([8 x i32]* %val)
				ret void
				}

				define void @test_asm_memory(i32* %base.addr) {
				; CHECK-LABEL: test_asm_memory:
				; CHECK: add w[[ADDR:[0-9]+]], w0, #4
				; CHECK: str wzr, [x[[ADDR]]]
				%addr = getelementptr i32, i32* %base.addr, i32 1
				call void asm sideeffect "str wzr, $0", "*m"(i32* %addr)
				ret void
				}

				define void @test_unsafe_asm_memory(i64 %val) {
				; CHECK-LABEL: test_unsafe_asm_memory:
				; CHECK: and x[[ADDR:[0-9]+]], x0, #0xffffffff
				; CHECK: str wzr, [x[[ADDR]]]
				%addr_int = trunc i64 %val to i32
				%addr = inttoptr i32 %addr_int to i32*
				call void asm sideeffect "str wzr, $0", "*m"(i32* %addr)
				ret void
				}

				define [9 x i8*] @test_demoted_return(i8* %in) {
				; CHECK-LABEL: test_demoted_return:
				; CHECK: str w0, [x8, #32]
				%res = insertvalue [9 x i8*] undef, i8* %in, 8
				ret [9 x i8*] %res
				}

				define i8* @test_inttoptr(i64 %in) {
				; CHECK-LABEL: test_inttoptr:
				; CHECK: and x0, x0, #0xffffffff
				%res = inttoptr i64 %in to i8*
				ret i8* %res
				}

				declare i32 @llvm.get.dynamic.area.offset.i32()
				define i32 @test_dynamic_area() {
				; CHECK-LABEL: test_dynamic_area:
				; CHECK: mov w0, wzr
				%res = call i32 @llvm.get.dynamic.area.offset.i32()
				ret i32 %res
				}

				define void @test_pointer_vec_store(<2 x i8*>* %addr) {
				; CHECK-LABEL: test_pointer_vec_store:
				; CHECK: str xzr, [x0]
				; CHECK-NOT: str
				; CHECK-NOT: stp

				store <2 x i8*> zeroinitializer, <2 x i8*>* %addr, align 16
				ret void
				}

				define <2 x i8*> @test_pointer_vec_load(<2 x i8*>* %addr) {
				; CHECK-LABEL: test_pointer_vec_load:
				; CHECK: ldr d[[TMP:[0-9]+]], [x0]
				; CHECK: ushll.2d v0, v[[TMP]], #0
				%val = load <2 x i8*>, <2 x i8*>* %addr, align 16
				ret <2 x i8*> %val
				}

				define void @test_inline_asm_mem_pointer(i32* %in) {
				; CHECK-LABEL: test_inline_asm_mem_pointer:
				; CHECK: str w0,
				tail call void asm sideeffect "ldr x0, $0", "rm"(i32* %in)
				ret void
				}


				define void @test_struct_hi(i32 %hi) nounwind {
				; CHECK-LABEL: test_struct_hi:
				; CHECK: mov w[[IN:[0-9]+]], w0
				; CHECK: bl _get_int
				; CHECK-NEXT: bfi x0, x[[IN]], #32, #32
				; CHECK-NEXT: bl _take_pair
				%val.64 = call i64 @get_int()
				%val.32 = trunc i64 %val.64 to i32

				%pair.0 = insertvalue [2 x i32] undef, i32 %val.32, 0
				%pair.1 = insertvalue [2 x i32] %pair.0, i32 %hi, 1
				call void @take_pair([2 x i32] %pair.1)

				ret void
				}
				declare void @take_pair([2 x i32])
				declare i64 @get_int()

				define i1 @test_icmp_ptr(i8* %in) {
				; CHECK-LABEL: test_icmp_ptr
				; CHECK: ubfx x0, x0, #31, #1
				%res = icmp slt i8* %in, null
				ret i1 %res
				}

				define void @test_multiple_icmp_ptr(i8* %l, i8* %r) {
				; CHECK-LABEL: test_multiple_icmp_ptr:
				; CHECK: tbnz w0, #31, [[FALSEBB:LBB[0-9]+_[0-9]+]]
				; CHECK: tbnz w1, #31, [[FALSEBB]]
				%tst1 = icmp sgt i8* %l, inttoptr (i32 -1 to i8*)
				%tst2 = icmp sgt i8* %r, inttoptr (i32 -1 to i8*)
				%tst = and i1 %tst1, %tst2
				br i1 %tst, label %true, label %false

				true:
				call void(...) @bar()
				ret void

				false:
				ret void
				}

				define { [18 x i8] }* @test_gep_nonpow2({ [18 x i8] }* %a0, i32 %a1) {
				; CHECK-LABEL: test_gep_nonpow2:
				; CHECK: mov w[[SIZE:[0-9]+]], #18
				; CHECK-NEXT: smaddl x0, w1, w[[SIZE]], x0
				; CHECK-NEXT: ret
				%tmp0 = getelementptr inbounds { [18 x i8] }, { [18 x i8] }* %a0, i32 %a1
				ret { [18 x i8] }* %tmp0
				}

				define void @test_bzero(i64 %in) {
				; CHECK-LABEL: test_bzero:
				; CHECK-DAG: lsr x1, x0, #32
				; CHECK-DAG: and x0, x0, #0xffffffff
				; CHECK: bl _bzero

				%ptr.i32 = trunc i64 %in to i32
				%size.64 = lshr i64 %in, 32
				%size = trunc i64 %size.64 to i32
				%ptr = inttoptr i32 %ptr.i32 to i8*
				tail call void @llvm.memset.p0i8.i32(i8* align 4 %ptr, i8 0, i32 %size, i1 false)
				ret void
				}

				declare void @llvm.memset.p0i8.i32(i8* nocapture writeonly, i8, i32, i1)

llvm/test/CodeGen/AArch64/fastcc-reserved.ll

	; RUN: llc -verify-machineinstrs < %s -mtriple=aarch64-none-linux-gnu -tailcallopt | FileCheck %s			; RUN: llc -verify-machineinstrs < %s -mtriple=aarch64-none-linux-gnu -tailcallopt | FileCheck %s

	; This test is designed to be run in the situation where the			; This test is designed to be run in the situation where the
	; call-frame is not reserved (hence disable-fp-elim), but where			; call-frame is not reserved (hence disable-fp-elim), but where
	; callee-pop can occur (hence tailcallopt).			; callee-pop can occur (hence tailcallopt).

	declare fastcc void @will_pop([8 x i32], i32 %val)			declare fastcc void @will_pop([8 x i64], i32 %val)

	define fastcc void @foo(i32 %in) {			define fastcc void @foo(i32 %in) {
	; CHECK-LABEL: foo:			; CHECK-LABEL: foo:

	%addr = alloca i8, i32 %in			%addr = alloca i8, i32 %in

	; Normal frame setup stuff:			; Normal frame setup stuff:
	; CHECK: stp x29, x30, [sp, #-16]!			; CHECK: stp x29, x30, [sp, #-16]!
	; CHECK: mov x29, sp			; CHECK: mov x29, sp

	; Reserve space for call-frame:			; Reserve space for call-frame:
	; CHECK: str w{{[0-9]+}}, [sp, #-16]!			; CHECK: str w{{[0-9]+}}, [sp, #-16]!

	call fastcc void @will_pop([8 x i32] undef, i32 42)			call fastcc void @will_pop([8 x i64] undef, i32 42)
	; CHECK: bl will_pop			; CHECK: bl will_pop

	; Since @will_pop is fastcc with tailcallopt, it will put the stack			; Since @will_pop is fastcc with tailcallopt, it will put the stack
	; back where it needs to be, we shouldn't duplicate that			; back where it needs to be, we shouldn't duplicate that
	; CHECK-NOT: sub sp, sp, #16			; CHECK-NOT: sub sp, sp, #16
	; CHECK-NOT: add sp, sp,			; CHECK-NOT: add sp, sp,

	; CHECK: mov sp, x29			; CHECK: mov sp, x29
	; CHECK: ldp x29, x30, [sp], #16			; CHECK: ldp x29, x30, [sp], #16
	ret void			ret void
	}			}

	declare void @wont_pop([8 x i32], i32 %val)			declare void @wont_pop([8 x i64], i32 %val)

	define void @foo1(i32 %in) {			define void @foo1(i32 %in) {
	; CHECK-LABEL: foo1:			; CHECK-LABEL: foo1:

	%addr = alloca i8, i32 %in			%addr = alloca i8, i32 %in
	; Normal frame setup again			; Normal frame setup again
	; CHECK: stp x29, x30, [sp, #-16]!			; CHECK: stp x29, x30, [sp, #-16]!
	; CHECK: mov x29, sp			; CHECK: mov x29, sp

	; Reserve space for call-frame			; Reserve space for call-frame
	; CHECK: str w{{[0-9]+}}, [sp, #-16]!			; CHECK: str w{{[0-9]+}}, [sp, #-16]!

	call void @wont_pop([8 x i32] undef, i32 42)			call void @wont_pop([8 x i64] undef, i32 42)
	; CHECK: bl wont_pop			; CHECK: bl wont_pop

	; This time we do need to unreserve the call-frame			; This time we do need to unreserve the call-frame
	; CHECK: add sp, sp, #16			; CHECK: add sp, sp, #16

	; Check for epilogue (primarily to make sure sp spotted above wasn't			; Check for epilogue (primarily to make sure sp spotted above wasn't
	; part of it).			; part of it).
	; CHECK: mov sp, x29			; CHECK: mov sp, x29
	; CHECK: ldp x29, x30, [sp], #16			; CHECK: ldp x29, x30, [sp], #16
	ret void			ret void
	}			}

llvm/test/CodeGen/AArch64/fastcc.ll


; CHECK-TAIL-LABEL: func_stack0:		; CHECK-TAIL-LABEL: func_stack0:
; CHECK-TAIL: sub sp, sp, #48		; CHECK-TAIL: sub sp, sp, #48
; CHECK-TAIL-NEXT: stp x29, x30, [sp, #32]		; CHECK-TAIL-NEXT: stp x29, x30, [sp, #32]
; CHECK-TAIL-NEXT: add x29, sp, #32		; CHECK-TAIL-NEXT: add x29, sp, #32
; CHECK-TAIL: str w{{[0-9]+}}, [sp]		; CHECK-TAIL: str w{{[0-9]+}}, [sp]


call fastcc void @func_stack8([8 x i32] undef, i32 42)		call fastcc void @func_stack8([8 x i64] undef, i32 42)
; CHECK: bl func_stack8		; CHECK: bl func_stack8
; CHECK-NOT: sub sp, sp,		; CHECK-NOT: sub sp, sp,
; CHECK-NOT: [sp, #{{[-0-9]+}}]!		; CHECK-NOT: [sp, #{{[-0-9]+}}]!
; CHECK-NOT: [sp], #{{[-0-9]+}}		; CHECK-NOT: [sp], #{{[-0-9]+}}

; CHECK-TAIL: bl func_stack8		; CHECK-TAIL: bl func_stack8
; CHECK-TAIL: stp xzr, xzr, [sp, #-16]!		; CHECK-TAIL: stp xzr, xzr, [sp, #-16]!


call fastcc void @func_stack32([8 x i32] undef, i128 0, i128 9)		call fastcc void @func_stack32([8 x i64] undef, i128 0, i128 9)
; CHECK: bl func_stack32		; CHECK: bl func_stack32
; CHECK-NOT: sub sp, sp,		; CHECK-NOT: sub sp, sp,


; CHECK-TAIL: bl func_stack32		; CHECK-TAIL: bl func_stack32
; CHECK-TAIL: sub sp, sp, #32		; CHECK-TAIL: sub sp, sp, #32


; CHECK-NEXT: ret		; CHECK-NEXT: ret


; CHECK-TAIL: ldp x29, x30, [sp, #32]		; CHECK-TAIL: ldp x29, x30, [sp, #32]
; CHECK-TAIL-NEXT: add sp, sp, #48		; CHECK-TAIL-NEXT: add sp, sp, #48
; CHECK-TAIL-NEXT: ret		; CHECK-TAIL-NEXT: ret
}		}

define fastcc void @func_stack8([8 x i32], i32 %stacked) {		define fastcc void @func_stack8([8 x i64], i32 %stacked) {
; CHECK-LABEL: func_stack8:		; CHECK-LABEL: func_stack8:
; CHECK: sub sp, sp, #48		; CHECK: sub sp, sp, #48
; CHECK: stp x29, x30, [sp, #32]		; CHECK: stp x29, x30, [sp, #32]
; CHECK: add x29, sp, #32		; CHECK: add x29, sp, #32
; CHECK: str w{{[0-9]+}}, [sp]		; CHECK: str w{{[0-9]+}}, [sp]


; CHECK-TAIL-LABEL: func_stack8:		; CHECK-TAIL-LABEL: func_stack8:
; CHECK-TAIL: sub sp, sp, #48		; CHECK-TAIL: sub sp, sp, #48
; CHECK-TAIL: stp x29, x30, [sp, #32]		; CHECK-TAIL: stp x29, x30, [sp, #32]
; CHECK-TAIL: add x29, sp, #32		; CHECK-TAIL: add x29, sp, #32
; CHECK-TAIL: str w{{[0-9]+}}, [sp]		; CHECK-TAIL: str w{{[0-9]+}}, [sp]


call fastcc void @func_stack8([8 x i32] undef, i32 42)		call fastcc void @func_stack8([8 x i64] undef, i32 42)
; CHECK: bl func_stack8		; CHECK: bl func_stack8
; CHECK-NOT: sub sp, sp,		; CHECK-NOT: sub sp, sp,
; CHECK-NOT: [sp, #{{[-0-9]+}}]!		; CHECK-NOT: [sp, #{{[-0-9]+}}]!
; CHECK-NOT: [sp], #{{[-0-9]+}}		; CHECK-NOT: [sp], #{{[-0-9]+}}


; CHECK-TAIL: bl func_stack8		; CHECK-TAIL: bl func_stack8
; CHECK-TAIL: stp xzr, xzr, [sp, #-16]!		; CHECK-TAIL: stp xzr, xzr, [sp, #-16]!


call fastcc void @func_stack32([8 x i32] undef, i128 0, i128 9)		call fastcc void @func_stack32([8 x i64] undef, i128 0, i128 9)
; CHECK: bl func_stack32		; CHECK: bl func_stack32
; CHECK-NOT: sub sp, sp,		; CHECK-NOT: sub sp, sp,


; CHECK-TAIL: bl func_stack32		; CHECK-TAIL: bl func_stack32
; CHECK-TAIL: sub sp, sp, #32		; CHECK-TAIL: sub sp, sp, #32


; CHECK-NEXT: ret		; CHECK-NEXT: ret


; CHECK-TAIL: ldp x29, x30, [sp, #32]		; CHECK-TAIL: ldp x29, x30, [sp, #32]
; CHECK-TAIL-NEXT: add sp, sp, #64		; CHECK-TAIL-NEXT: add sp, sp, #64
; CHECK-TAIL-NEXT: ret		; CHECK-TAIL-NEXT: ret
}		}

define fastcc void @func_stack32([8 x i32], i128 %stacked0, i128 %stacked1) {		define fastcc void @func_stack32([8 x i64], i128 %stacked0, i128 %stacked1) {
; CHECK-LABEL: func_stack32:		; CHECK-LABEL: func_stack32:
; CHECK: add x29, sp, #32		; CHECK: add x29, sp, #32

; CHECK-TAIL-LABEL: func_stack32:		; CHECK-TAIL-LABEL: func_stack32:
; CHECK-TAIL: add x29, sp, #32		; CHECK-TAIL: add x29, sp, #32


call fastcc void @func_stack8([8 x i32] undef, i32 42)		call fastcc void @func_stack8([8 x i64] undef, i32 42)
; CHECK: bl func_stack8		; CHECK: bl func_stack8
; CHECK-NOT: sub sp, sp,		; CHECK-NOT: sub sp, sp,
; CHECK-NOT: [sp, #{{[-0-9]+}}]!		; CHECK-NOT: [sp, #{{[-0-9]+}}]!
; CHECK-NOT: [sp], #{{[-0-9]+}}		; CHECK-NOT: [sp], #{{[-0-9]+}}

; CHECK-TAIL: bl func_stack8		; CHECK-TAIL: bl func_stack8
; CHECK-TAIL: stp xzr, xzr, [sp, #-16]!		; CHECK-TAIL: stp xzr, xzr, [sp, #-16]!


call fastcc void @func_stack32([8 x i32] undef, i128 0, i128 9)		call fastcc void @func_stack32([8 x i64] undef, i128 0, i128 9)
; CHECK: bl func_stack32		; CHECK: bl func_stack32
; CHECK-NOT: sub sp, sp,		; CHECK-NOT: sub sp, sp,


; CHECK-TAIL: bl func_stack32		; CHECK-TAIL: bl func_stack32
; CHECK-TAIL: sub sp, sp, #32		; CHECK-TAIL: sub sp, sp, #32


; CHECK-NEXT: ret		; CHECK-NEXT: ret

; CHECK-TAIL: ldp x29, x30, [sp, #32]		; CHECK-TAIL: ldp x29, x30, [sp, #32]
; CHECK-TAIL-NEXT: add sp, sp, #80		; CHECK-TAIL-NEXT: add sp, sp, #80
; CHECK-TAIL-NEXT: ret		; CHECK-TAIL-NEXT: ret
}		}

; Check that arg stack pop is done after callee-save restore when no frame pointer is used.		; Check that arg stack pop is done after callee-save restore when no frame pointer is used.
define fastcc void @func_stack32_leaf([8 x i32], i128 %stacked0, i128 %stacked1) {		define fastcc void @func_stack32_leaf([8 x i64], i128 %stacked0, i128 %stacked1) {
; CHECK-LABEL: func_stack32_leaf:		; CHECK-LABEL: func_stack32_leaf:
; CHECK: str x20, [sp, #-16]!		; CHECK: str x20, [sp, #-16]!
; CHECK: nop		; CHECK: nop
; CHECK-NEXT: //NO_APP		; CHECK-NEXT: //NO_APP
; CHECK-NEXT: ldr x20, [sp], #16		; CHECK-NEXT: ldr x20, [sp], #16
; CHECK-NEXT: ret		; CHECK-NEXT: ret

; CHECK-TAIL-LABEL: func_stack32_leaf:		; CHECK-TAIL-LABEL: func_stack32_leaf:
; CHECK-TAIL-RZ-NEXT: ret		; CHECK-TAIL-RZ-NEXT: ret

; Make sure there is a callee-save register to save/restore.		; Make sure there is a callee-save register to save/restore.
call void asm sideeffect "nop", "~{x20}"() nounwind		call void asm sideeffect "nop", "~{x20}"() nounwind
ret void		ret void
}		}

; Check that arg stack pop is done after callee-save restore when no frame pointer is used.		; Check that arg stack pop is done after callee-save restore when no frame pointer is used.
define fastcc void @func_stack32_leaf_local([8 x i32], i128 %stacked0, i128 %stacked1) {		define fastcc void @func_stack32_leaf_local([8 x i64], i128 %stacked0, i128 %stacked1) {
; CHECK-LABEL: func_stack32_leaf_local:		; CHECK-LABEL: func_stack32_leaf_local:
; CHECK: sub sp, sp, #32		; CHECK: sub sp, sp, #32
; CHECK-NEXT: str x20, [sp, #16]		; CHECK-NEXT: str x20, [sp, #16]
; CHECK: nop		; CHECK: nop
; CHECK-NEXT: //NO_APP		; CHECK-NEXT: //NO_APP
; CHECK-NEXT: ldr x20, [sp, #16]		; CHECK-NEXT: ldr x20, [sp, #16]
; CHECK-NEXT: add sp, sp, #32		; CHECK-NEXT: add sp, sp, #32
; CHECK-NEXT: ret		; CHECK-NEXT: ret
; CHECK-TAIL-RZ-NEXT: ret		; CHECK-TAIL-RZ-NEXT: ret
%val0 = alloca [2 x i64], align 8		%val0 = alloca [2 x i64], align 8

; Make sure there is a callee-save register to save/restore.		; Make sure there is a callee-save register to save/restore.
call void asm sideeffect "nop", "~{x20}"() nounwind		call void asm sideeffect "nop", "~{x20}"() nounwind
ret void		ret void
}		}

; Check that arg stack pop is done after callee-save restore when no frame pointer is used.		; Check that arg stack pop is done after callee-save restore when no frame pointer is used.
define fastcc void @func_stack32_leaf_local_nocs([8 x i32], i128 %stacked0, i128 %stacked1) {		define fastcc void @func_stack32_leaf_local_nocs([8 x i64], i128 %stacked0, i128 %stacked1) {
; CHECK-LABEL: func_stack32_leaf_local_nocs:		; CHECK-LABEL: func_stack32_leaf_local_nocs:
; CHECK: sub sp, sp, #16		; CHECK: sub sp, sp, #16
; CHECK: add sp, sp, #16		; CHECK: add sp, sp, #16
; CHECK-NEXT: ret		; CHECK-NEXT: ret

; CHECK-TAIL-LABEL: func_stack32_leaf_local_nocs:		; CHECK-TAIL-LABEL: func_stack32_leaf_local_nocs:
; CHECK-TAIL: sub sp, sp, #16		; CHECK-TAIL: sub sp, sp, #16
; CHECK-TAIL: add sp, sp, #48		; CHECK-TAIL: add sp, sp, #48

llvm/test/CodeGen/AArch64/jump-table-32.ll

This file was added.

				; RUN: llc -verify-machineinstrs -o - %s -mtriple=arm64_32-apple-ios7.0 -aarch64-enable-atomic-cfg-tidy=0 | FileCheck %s

				define i32 @test_jumptable(i32 %in) {
				; CHECK: test_jumptable

				switch i32 %in, label %def [
				i32 0, label %lbl1
				i32 1, label %lbl2
				i32 2, label %lbl3
				i32 4, label %lbl4
				]
				; CHECK: adrp [[JTPAGE:x[0-9]+]], LJTI0_0@PAGE
				; CHECK: mov w[[INDEX:[0-9]+]], w0
				; CHECK: add x[[JT:[0-9]+]], [[JTPAGE]], LJTI0_0@PAGEOFF
				; CHECK: adr [[BASE_BLOCK:x[0-9]+]], LBB0_2
				; CHECK: ldrb w[[OFFSET:[0-9]+]], [x[[JT]], x[[INDEX]]]
				; CHECK: add [[DEST:x[0-9]+]], [[BASE_BLOCK]], x[[OFFSET]], lsl #2
				; CHECK: br [[DEST]]

				def:
				ret i32 0

				lbl1:
				ret i32 1

				lbl2:
				ret i32 2

				lbl3:
				ret i32 4

				lbl4:
				ret i32 8

				}

				; CHECK: LJTI0_0:
				; CHECK-NEXT: .byte
				; CHECK-NEXT: .byte
				; CHECK-NEXT: .byte
				; CHECK-NEXT: .byte
				; CHECK-NEXT: .byte

llvm/test/CodeGen/AArch64/sibling-call.ll

	; RUN: llc -verify-machineinstrs < %s -mtriple=aarch64-none-linux-gnu -aarch64-enable-ldst-opt=0 | FileCheck %s			; RUN: llc -verify-machineinstrs < %s -mtriple=aarch64-none-linux-gnu -aarch64-enable-ldst-opt=0 | FileCheck %s

	declare void @callee_stack0()			declare void @callee_stack0()
	declare void @callee_stack8([8 x i32], i64)			declare void @callee_stack8([8 x i64], i64)
	declare void @callee_stack16([8 x i32], i64, i64)			declare void @callee_stack16([8 x i64], i64, i64)

	define void @caller_to0_from0() nounwind {			define void @caller_to0_from0() nounwind {
	; CHECK-LABEL: caller_to0_from0:			; CHECK-LABEL: caller_to0_from0:
	; CHECK-NEXT: // %bb.			; CHECK-NEXT: // %bb.
	tail call void @callee_stack0()			tail call void @callee_stack0()
	ret void			ret void
	; CHECK-NEXT: b callee_stack0			; CHECK-NEXT: b callee_stack0
	}			}

	define void @caller_to0_from8([8 x i32], i64) nounwind{			define void @caller_to0_from8([8 x i64], i64) nounwind{
	; CHECK-LABEL: caller_to0_from8:			; CHECK-LABEL: caller_to0_from8:
	; CHECK-NEXT: // %bb.			; CHECK-NEXT: // %bb.

	tail call void @callee_stack0()			tail call void @callee_stack0()
	ret void			ret void
	; CHECK-NEXT: b callee_stack0			; CHECK-NEXT: b callee_stack0
	}			}

	define void @caller_to8_from0() {			define void @caller_to8_from0() {
	; CHECK-LABEL: caller_to8_from0:			; CHECK-LABEL: caller_to8_from0:

	; Caller isn't going to clean up any extra stack we allocate, so it			; Caller isn't going to clean up any extra stack we allocate, so it
	; can't be a tail call.			; can't be a tail call.
	tail call void @callee_stack8([8 x i32] undef, i64 42)			tail call void @callee_stack8([8 x i64] undef, i64 42)
	ret void			ret void
	; CHECK: bl callee_stack8			; CHECK: bl callee_stack8
	}			}

	define void @caller_to8_from8([8 x i32], i64 %a) {			define void @caller_to8_from8([8 x i64], i64 %a) {
	; CHECK-LABEL: caller_to8_from8:			; CHECK-LABEL: caller_to8_from8:
	; CHECK-NOT: sub sp, sp,			; CHECK-NOT: sub sp, sp,

	; This should reuse our stack area for the 42			; This should reuse our stack area for the 42
	tail call void @callee_stack8([8 x i32] undef, i64 42)			tail call void @callee_stack8([8 x i64] undef, i64 42)
	ret void			ret void
	; CHECK: str {{x[0-9]+}}, [sp]			; CHECK: str {{x[0-9]+}}, [sp]
	; CHECK-NEXT: b callee_stack8			; CHECK-NEXT: b callee_stack8
	}			}

	define void @caller_to16_from8([8 x i32], i64 %a) {			define void @caller_to16_from8([8 x i64], i64 %a) {
	; CHECK-LABEL: caller_to16_from8:			; CHECK-LABEL: caller_to16_from8:

	; Shouldn't be a tail call: we can't use SP+8 because our caller might			; Shouldn't be a tail call: we can't use SP+8 because our caller might
	; have something there. This may sound obvious but implementation does			; have something there. This may sound obvious but implementation does
	; some funky aligning.			; some funky aligning.
	tail call void @callee_stack16([8 x i32] undef, i64 undef, i64 undef)			tail call void @callee_stack16([8 x i64] undef, i64 undef, i64 undef)
	; CHECK: bl callee_stack16			; CHECK: bl callee_stack16
	ret void			ret void
	}			}

	define void @caller_to8_from24([8 x i32], i64 %a, i64 %b, i64 %c) {			define void @caller_to8_from24([8 x i64], i64 %a, i64 %b, i64 %c) {
	; CHECK-LABEL: caller_to8_from24:			; CHECK-LABEL: caller_to8_from24:
	; CHECK-NOT: sub sp, sp			; CHECK-NOT: sub sp, sp

	; Reuse our area, putting "42" at incoming sp			; Reuse our area, putting "42" at incoming sp
	tail call void @callee_stack8([8 x i32] undef, i64 42)			tail call void @callee_stack8([8 x i64] undef, i64 42)
	ret void			ret void
	; CHECK: str {{x[0-9]+}}, [sp]			; CHECK: str {{x[0-9]+}}, [sp]
	; CHECK-NEXT: b callee_stack8			; CHECK-NEXT: b callee_stack8
	}			}

	define void @caller_to16_from16([8 x i32], i64 %a, i64 %b) {			define void @caller_to16_from16([8 x i64], i64 %a, i64 %b) {
	; CHECK-LABEL: caller_to16_from16:			; CHECK-LABEL: caller_to16_from16:
	; CHECK-NOT: sub sp, sp,			; CHECK-NOT: sub sp, sp,

	; Here we want to make sure that both loads happen before the stores:			; Here we want to make sure that both loads happen before the stores:
	; otherwise either %a or %b will be wrongly clobbered.			; otherwise either %a or %b will be wrongly clobbered.
	tail call void @callee_stack16([8 x i32] undef, i64 %b, i64 %a)			tail call void @callee_stack16([8 x i64] undef, i64 %b, i64 %a)
	ret void			ret void

	; CHECK: ldr [[VAL0:x[0-9]+]],			; CHECK: ldr [[VAL0:x[0-9]+]],
	; CHECK: ldr [[VAL1:x[0-9]+]],			; CHECK: ldr [[VAL1:x[0-9]+]],
	; CHECK: str [[VAL0]],			; CHECK: str [[VAL0]],
	; CHECK: str [[VAL1]],			; CHECK: str [[VAL1]],

	; CHECK-NOT: add sp, sp,			; CHECK-NOT: add sp, sp,

llvm/test/CodeGen/AArch64/swift-return.ll

	; RUN: llc -verify-machineinstrs -mtriple=aarch64-apple-ios -o - %s | FileCheck %s			; RUN: llc -verify-machineinstrs -mtriple=aarch64-apple-ios -o - %s | FileCheck %s
	; RUN: llc -O0 -fast-isel -verify-machineinstrs -mtriple=aarch64-apple-ios -o - %s | FileCheck %s --check-prefix=CHECK-O0			; RUN: llc -O0 -fast-isel -verify-machineinstrs -mtriple=aarch64-apple-ios -o - %s | FileCheck %s --check-prefix=CHECK-O0
				; RUN: llc -verify-machineinstrs -mtriple=arm64_32-apple-ios -o - %s | FileCheck %s
				; RUN: llc -O0 -fast-isel -verify-machineinstrs -mtriple=arm64_32-apple-ios -o - %s | FileCheck %s --check-prefix=CHECK-O0

	; CHECK-LABEL: test1			; CHECK-LABEL: test1
	; CHECK: bl _gen			; CHECK: bl _gen
	; CHECK: sxth [[TMP:w.*]], w0			; CHECK: sxth [[TMP:w.*]], w0
	; CHECK: add w0, [[TMP]], w1, sxtb			; CHECK: add w0, [[TMP]], w1, sxtb
	; CHECK-O0-LABEL: test1			; CHECK-O0-LABEL: test1
	; CHECK-O0: bl _gen			; CHECK-O0: bl _gen
	; CHECK-O0: sxth [[TMP:w.*]], w0			; CHECK-O0: sxth [[TMP:w.*]], w0
	; CHECK-O0: add w8, [[TMP]], w1, sxtb			; CHECK-O0: add {{w[0-9]+}}, [[TMP]], w1, sxtb
	define i16 @test1(i32) {			define i16 @test1(i32) {
	entry:			entry:
	%call = call swiftcc { i16, i8 } @gen(i32 %0)			%call = call swiftcc { i16, i8 } @gen(i32 %0)
	%v3 = extractvalue { i16, i8 } %call, 0			%v3 = extractvalue { i16, i8 } %call, 0
	%v1 = sext i16 %v3 to i32			%v1 = sext i16 %v3 to i32
	%v5 = extractvalue { i16, i8 } %call, 1			%v5 = extractvalue { i16, i8 } %call, 1
	%v2 = sext i8 %v5 to i32			%v2 = sext i8 %v5 to i32
	%add = add nsw i32 %v1, %v2			%add = add nsw i32 %v1, %v2

llvm/test/CodeGen/AArch64/swiftcc.ll

	; RUN: llc -verify-machineinstrs -mtriple=aarch64-apple-ios -o - %s | FileCheck %s			; RUN: llc -verify-machineinstrs -mtriple=aarch64-apple-ios -o - %s | FileCheck %s
	; RUN: llc -O0 -verify-machineinstrs -mtriple=aarch64-apple-ios -o - %s | FileCheck %s			; RUN: llc -O0 -verify-machineinstrs -mtriple=aarch64-apple-ios -o - %s | FileCheck %s
				; RUN: llc -verify-machineinstrs -mtriple=arm64_32-apple-ios -o - %s | FileCheck %s
				; RUN: llc -O0 -verify-machineinstrs -mtriple=arm64_32-apple-ios -o - %s | FileCheck %s

	; CHECK: t1			; CHECK: t1
	; CHECK: fadd s0, s0, s1			; CHECK: fadd s0, s0, s1
	; CHECK: ret			; CHECK: ret
	define swiftcc float @t1(float %a, float %b) {			define swiftcc float @t1(float %a, float %b) {
	entry:			entry:
	%add = fadd float %a, %b			%add = fadd float %a, %b
	ret float %add			ret float %add
	}			}

llvm/test/CodeGen/AArch64/swifterror.ll

; RUN: llc -fast-isel-sink-local-values -verify-machineinstrs -frame-pointer=all -enable-shrink-wrap=false < %s -mtriple=aarch64-apple-ios -disable-post-ra | FileCheck -allow-deprecated-dag-overlap --check-prefix=CHECK-APPLE %s		; RUN: llc -fast-isel-sink-local-values -verify-machineinstrs -frame-pointer=all -enable-shrink-wrap=false < %s -mtriple=aarch64-apple-ios -disable-post-ra | FileCheck -allow-deprecated-dag-overlap --check-prefix=CHECK-APPLE --check-prefix=CHECK-APPLE-AARCH64 %s
; RUN: llc -fast-isel-sink-local-values -verify-machineinstrs -frame-pointer=all -O0 -fast-isel < %s -mtriple=aarch64-apple-ios -disable-post-ra | FileCheck -allow-deprecated-dag-overlap --check-prefix=CHECK-O0 %s		; RUN: llc -fast-isel-sink-local-values -verify-machineinstrs -frame-pointer=all -O0 -fast-isel < %s -mtriple=aarch64-apple-ios -disable-post-ra | FileCheck -allow-deprecated-dag-overlap --check-prefix=CHECK-O0 --check-prefix=CHECK-O0-AARCH64 %s
		; RUN: llc -fast-isel-sink-local-values -verify-machineinstrs -frame-pointer=all -enable-shrink-wrap=false < %s -mtriple=arm64_32-apple-ios -disable-post-ra | FileCheck -allow-deprecated-dag-overlap --check-prefix=CHECK-APPLE --check-prefix=CHECK-APPLE-ARM64_32 %s
		; RUN: llc -fast-isel-sink-local-values -verify-machineinstrs -O0 -fast-isel < %s -mtriple=arm64_32-apple-ios -disable-post-ra | FileCheck -allow-deprecated-dag-overlap --check-prefix=CHECK-O0 --check-prefix=CHECK-O0-ARM64_32 %s

declare i8* @malloc(i64)		declare i8* @malloc(i64)
declare void @free(i8*)		declare void @free(i8*)
%swift_error = type {i64, i8}		%swift_error = type {i64, i8}

; This tests the basic usage of a swifterror parameter. "foo" is the function		; This tests the basic usage of a swifterror parameter. "foo" is the function
; that takes a swifterror parameter and "caller" is the caller of "foo".		; that takes a swifterror parameter and "caller" is the caller of "foo".
define float @foo(%swift_error** swifterror %error_ptr_ref) {		define float @foo(%swift_error** swifterror %error_ptr_ref) {

; "caller" calls "foo" that takes a swifterror parameter.		; "caller" calls "foo" that takes a swifterror parameter.
define float @caller(i8* %error_ref) {		define float @caller(i8* %error_ref) {
; CHECK-APPLE-LABEL: caller:		; CHECK-APPLE-LABEL: caller:
; CHECK-APPLE: mov [[ID:x[0-9]+]], x0		; CHECK-APPLE: mov [[ID:x[0-9]+]], x0
; CHECK-APPLE: mov x21, xzr		; CHECK-APPLE: mov x21, xzr
; CHECK-APPLE: bl {{.*}}foo		; CHECK-APPLE: bl {{.*}}foo
; CHECK-APPLE: mov x0, x21		; CHECK-APPLE: mov x0, x21
; CHECK-APPLE: cbnz x21		; CHECK-APPLE-AARCH64: cbnz x21
		; CHECK-APPLE-ARM64_32: cbnz w0
; Access part of the error object and save it to error_ref		; Access part of the error object and save it to error_ref
; CHECK-APPLE: ldrb [[CODE:w[0-9]+]], [x0, #8]		; CHECK-APPLE: ldrb [[CODE:w[0-9]+]], [x0, #8]
; CHECK-APPLE: strb [[CODE]], [{{.*}}[[ID]]]		; CHECK-APPLE: strb [[CODE]], [{{.*}}[[ID]]]
; CHECK-APPLE: bl {{.*}}free		; CHECK-APPLE: bl {{.*}}free

; CHECK-O0-LABEL: caller:		; CHECK-O0-LABEL: caller:
; CHECK-O0: mov x21		; CHECK-O0: mov x21
; CHECK-O0: bl {{.*}}foo		; CHECK-O0: bl {{.*}}foo
; CHECK-O0: mov [[ID:x[0-9]+]], x21		; CHECK-O0: mov [[ID:x[0-9]+]], x21
; CHECK-O0: cbnz x21		; CHECK-O0-AARCH64: cbnz x21
		; CHECK-O0-ARM64_32: cmp x21, #0
entry:		entry:
%error_ptr_ref = alloca swifterror %swift_error*		%error_ptr_ref = alloca swifterror %swift_error*
store %swift_error* null, %swift_error** %error_ptr_ref		store %swift_error* null, %swift_error** %error_ptr_ref
%call = call float @foo(%swift_error** swifterror %error_ptr_ref)		%call = call float @foo(%swift_error** swifterror %error_ptr_ref)
%error_from_foo = load %swift_error*, %swift_error** %error_ptr_ref		%error_from_foo = load %swift_error*, %swift_error** %error_ptr_ref
%had_error_from_foo = icmp ne %swift_error* %error_from_foo, null		%had_error_from_foo = icmp ne %swift_error* %error_from_foo, null
%tmp = bitcast %swift_error* %error_from_foo to i8*		%tmp = bitcast %swift_error* %error_from_foo to i8*
br i1 %had_error_from_foo, label %handler, label %cont		br i1 %had_error_from_foo, label %handler, label %cont

; "caller2" is the caller of "foo", it calls "foo" inside a loop.		; "caller2" is the caller of "foo", it calls "foo" inside a loop.
define float @caller2(i8* %error_ref) {		define float @caller2(i8* %error_ref) {
; CHECK-APPLE-LABEL: caller2:		; CHECK-APPLE-LABEL: caller2:
; CHECK-APPLE: mov [[ID:x[0-9]+]], x0		; CHECK-APPLE: mov [[ID:x[0-9]+]], x0
; CHECK-APPLE: fmov [[CMP:s[0-9]+]], #1.0		; CHECK-APPLE: fmov [[CMP:s[0-9]+]], #1.0
; CHECK-APPLE: mov x21, xzr		; CHECK-APPLE: mov x21, xzr
; CHECK-APPLE: bl {{.*}}foo		; CHECK-APPLE: bl {{.*}}foo
; CHECK-APPLE: cbnz x21		; CHECK-APPLE-AARCH64: cbnz x21
		; CHECK-APPLE-ARM64_32: cbnz w21
; CHECK-APPLE: fcmp s0, [[CMP]]		; CHECK-APPLE: fcmp s0, [[CMP]]
; CHECK-APPLE: b.le		; CHECK-APPLE: b.le
; Access part of the error object and save it to error_ref		; Access part of the error object and save it to error_ref
; CHECK-APPLE: ldrb [[CODE:w[0-9]+]], [x21, #8]		; CHECK-APPLE: ldrb [[CODE:w[0-9]+]], [x21, #8]
; CHECK-APPLE: strb [[CODE]], [{{.*}}[[ID]]]		; CHECK-APPLE: strb [[CODE]], [{{.*}}[[ID]]]
; CHECK-APPLE: mov x0, x21		; CHECK-APPLE: mov x0, x21
; CHECK-APPLE: bl {{.*}}free		; CHECK-APPLE: bl {{.*}}free

; CHECK-O0-LABEL: caller2:		; CHECK-O0-LABEL: caller2:
; CHECK-O0: mov x21		; CHECK-O0: mov x21
; CHECK-O0: bl {{.*}}foo		; CHECK-O0: bl {{.*}}foo
; CHECK-O0: mov [[ID:x[0-9]+]], x21		; CHECK-O0: mov [[ID:x[0-9]+]], x21
; CHECK-O0: cbnz x21		; CHECK-O0-AARCH64: cbnz x21
		; CHECK-O0-ARM64_32: cmp x21, #0
entry:		entry:
%error_ptr_ref = alloca swifterror %swift_error*		%error_ptr_ref = alloca swifterror %swift_error*
br label %bb_loop		br label %bb_loop
bb_loop:		bb_loop:
store %swift_error* null, %swift_error** %error_ptr_ref		store %swift_error* null, %swift_error** %error_ptr_ref
%call = call float @foo(%swift_error** swifterror %error_ptr_ref)		%call = call float @foo(%swift_error** swifterror %error_ptr_ref)
%error_from_foo = load %swift_error*, %swift_error** %error_ptr_ref		%error_from_foo = load %swift_error*, %swift_error** %error_ptr_ref
%had_error_from_foo = icmp ne %swift_error* %error_from_foo, null		%had_error_from_foo = icmp ne %swift_error* %error_from_foo, null
; CHECK-APPLE: mov w0, #16		; CHECK-APPLE: mov w0, #16
; CHECK-APPLE: malloc		; CHECK-APPLE: malloc
; CHECK-APPLE: strb w{{.*}}, [x0, #8]		; CHECK-APPLE: strb w{{.*}}, [x0, #8]
; CHECK-APPLE: fcmp		; CHECK-APPLE: fcmp
; CHECK-APPLE: b.le		; CHECK-APPLE: b.le
; CHECK-APPLE: mov x21, x0		; CHECK-APPLE: mov x21, x0
; CHECK-APPLE: ret		; CHECK-APPLE: ret

; CHECK-O0-LABEL: foo_loop:		; CHECK-O0-AARCH64-LABEL: foo_loop:
; spill x21		; spill x21
; CHECK-O0: str x21, [sp, [[SLOT:#[0-9]+]]]		; CHECK-O0-AARCH64: str x21, [sp, [[SLOT:#[0-9]+]]]
; CHECK-O0: b [[BB1:[A-Za-z0-9_]*]]		; CHECK-O0-AARCH64: b [[BB1:[A-Za-z0-9_]*]]
; CHECK-O0: [[BB1]]:		; CHECK-O0-AARCH64: [[BB1]]:
; CHECK-O0: ldr x0, [sp, [[SLOT]]]		; CHECK-O0-AARCH64: ldr x0, [sp, [[SLOT]]]
; CHECK-O0: str x0, [sp, [[SLOT2:#[0-9]+]]]		; CHECK-O0-AARCH64: str x0, [sp, [[SLOT2:#[0-9]+]]]
; CHECK-O0: cbz {{.*}}, [[BB2:[A-Za-z0-9_]*]]		; CHECK-O0-AARCH64: cbz {{.*}}, [[BB2:[A-Za-z0-9_]*]]
; CHECK-O0: mov w{{.*}}, #16		; CHECK-O0-AARCH64: mov w{{.*}}, #16
; CHECK-O0: malloc		; CHECK-O0-AARCH64: malloc
; CHECK-O0: mov [[ID:x[0-9]+]], x0		; CHECK-O0-AARCH64: mov [[ID:x[0-9]+]], x0
; CHECK-O0: strb w{{.*}}, [{{.*}}[[ID]], #8]		; CHECK-O0-AARCH64: strb w{{.*}}, [{{.*}}[[ID]], #8]
; spill x0		; spill x0
; CHECK-O0: str x0, [sp, [[SLOT2]]]		; CHECK-O0-AARCH64: str x0, [sp, [[SLOT2]]]
; CHECK-O0:[[BB2]]:		; CHECK-O0-AARCH64:[[BB2]]:
; CHECK-O0: ldr x0, [sp, [[SLOT2]]]		; CHECK-O0-AARCH64: ldr x0, [sp, [[SLOT2]]]
; CHECK-O0: fcmp		; CHECK-O0-AARCH64: fcmp
; CHECK-O0: str x0, [sp]		; CHECK-O0-AARCH64: str x0, [sp]
; CHECK-O0: b.le [[BB1]]		; CHECK-O0-AARCH64: b.le [[BB1]]
; reload from stack		; reload from stack
; CHECK-O0: ldr [[ID3:x[0-9]+]], [sp]		; CHECK-O0-AARCH64: ldr [[ID3:x[0-9]+]], [sp]
; CHECK-O0: mov x21, [[ID3]]		; CHECK-O0-AARCH64: mov x21, [[ID3]]
; CHECK-O0: ret		; CHECK-O0-AARCH64: ret

		; CHECK-O0-ARM64_32-LABEL: foo_loop:
		; spill x21
		; CHECK-O0-ARM64_32: str x21, [sp, [[SLOT:#[0-9]+]]]
		; CHECK-O0-ARM64_32: b [[BB1:[A-Za-z0-9_]*]]
		; CHECK-O0-ARM64_32: [[BB1]]:
		; CHECK-O0-ARM64_32: ldr x0, [sp, [[SLOT]]]
		; CHECK-O0-ARM64_32: str x0, [sp, [[SLOT2:#[0-9]+]]]
		; CHECK-O0-ARM64_32: cbz {{.*}}, [[BB2:[A-Za-z0-9_]*]]
		; CHECK-O0-ARM64_32: mov w{{.*}}, #16
		; CHECK-O0-ARM64_32: malloc
		; CHECK-O0-ARM64_32: mov {{.*}}, x0
		; CHECK-O0-ARM64_32: strb w{{.*}},
		; spill x0
		; CHECK-O0-ARM64_32: str [[ID2]], [sp, [[SLOT2]]]
		; CHECK-O0-ARM64_32:[[BB2]]:
		; CHECK-O0-ARM64_32: ldr x0, [sp, [[SLOT2]]]
		; CHECK-O0-ARM64_32: fcmp
		; CHECK-O0-ARM64_32: str x0, [sp[[OFFSET:.*]]]
		; CHECK-O0-ARM64_32: b.le [[BB1]]
		; reload from stack
		; CHECK-O0-ARM64_32: ldr [[ID3:x[0-9]+]], [sp[[OFFSET]]]
		; CHECK-O0-ARM64_32: mov x21, [[ID3]]
		; CHECK-O0-ARM64_32: ret

entry:		entry:
br label %bb_loop		br label %bb_loop

bb_loop:		bb_loop:
%cond = icmp ne i32 %cc, 0		%cond = icmp ne i32 %cc, 0
br i1 %cond, label %gen_error, label %bb_cont		br i1 %cond, label %gen_error, label %bb_cont

gen_error:		gen_error:
[51 lines not shown]

; "caller3" calls "foo_sret" that takes a swifterror parameter.		; "caller3" calls "foo_sret" that takes a swifterror parameter.
define float @caller3(i8* %error_ref) {		define float @caller3(i8* %error_ref) {
; CHECK-APPLE-LABEL: caller3:		; CHECK-APPLE-LABEL: caller3:
; CHECK-APPLE: mov [[ID:x[0-9]+]], x0		; CHECK-APPLE: mov [[ID:x[0-9]+]], x0
; CHECK-APPLE: mov x21, xzr		; CHECK-APPLE: mov x21, xzr
; CHECK-APPLE: bl {{.*}}foo_sret		; CHECK-APPLE: bl {{.*}}foo_sret
; CHECK-APPLE: mov x0, x21		; CHECK-APPLE: mov x0, x21
; CHECK-APPLE: cbnz x21		; CHECK-APPLE-AARCH64: cbnz x21
		; CHECK-APPLE-ARM64_32: cbnz w0
; Access part of the error object and save it to error_ref		; Access part of the error object and save it to error_ref
; CHECK-APPLE: ldrb [[CODE:w[0-9]+]], [x0, #8]		; CHECK-APPLE: ldrb [[CODE:w[0-9]+]], [x0, #8]
; CHECK-APPLE: strb [[CODE]], [{{.*}}[[ID]]]		; CHECK-APPLE: strb [[CODE]], [{{.*}}[[ID]]]
; CHECK-APPLE: bl {{.*}}free		; CHECK-APPLE: bl {{.*}}free

; CHECK-O0-LABEL: caller3:		; CHECK-O0-LABEL: caller3:
; spill x0		; spill x0
; CHECK-O0: str x0		; CHECK-O0: str x0
; CHECK-O0: mov x21		; CHECK-O0: mov x21
; CHECK-O0: bl {{.*}}foo_sret		; CHECK-O0: bl {{.*}}foo_sret
; CHECK-O0: mov [[ID2:x[0-9]+]], x21		; CHECK-O0: mov [[ID2:x[0-9]+]], x21
; CHECK-O0: cbnz x21		; CHECK-O0-AARCH64: cbnz x21
		; CHECK-O0-ARM64_32: cmp x21, #0
; Access part of the error object and save it to error_ref		; Access part of the error object and save it to error_ref
; reload from stack		; reload from stack
; CHECK-O0: ldrb [[CODE:w[0-9]+]]		; CHECK-O0: ldrb [[CODE:w[0-9]+]]
; CHECK-O0: ldr [[ID:x[0-9]+]]		; CHECK-O0: ldr [[ID:x[0-9]+]]
; CHECK-O0: strb [[CODE]], [{{.*}}[[ID]]]		; CHECK-O0: strb [[CODE]], [{{.*}}[[ID]]]
; CHECK-O0: bl {{.*}}free		; CHECK-O0: bl {{.*}}free
entry:		entry:
%s = alloca %struct.S, align 8		%s = alloca %struct.S, align 8
[16 lines not shown]

; "foo_vararg" is a function that takes a swifterror parameter, it also has		; "foo_vararg" is a function that takes a swifterror parameter, it also has
; variable number of arguments.		; variable number of arguments.
declare void @llvm.va_start(i8*) nounwind		declare void @llvm.va_start(i8*) nounwind
define float @foo_vararg(%swift_error** swifterror %error_ptr_ref, ...) {		define float @foo_vararg(%swift_error** swifterror %error_ptr_ref, ...) {
; CHECK-APPLE-LABEL: foo_vararg:		; CHECK-APPLE-LABEL: foo_vararg:
; CHECK-APPLE: mov w0, #16		; CHECK-APPLE: mov w0, #16
; CHECK-APPLE: malloc		; CHECK-APPLE: malloc
; CHECK-APPLE-DAG: mov [[ID:w[0-9]+]], #1
; CHECK-APPLE-DAG: add [[ARGS:x[0-9]+]], [[TMP:x[0-9]+]], #16
; CHECK-APPLE-DAG: strb [[ID]], [x0, #8]

; First vararg		; First vararg
; CHECK-APPLE-DAG: ldr {{w[0-9]+}}, [{{.*}}[[TMP]], #16]		; CHECK-APPLE-AARCH64: ldr {{w[0-9]+}}, [{{.*}}[[TMP:x[0-9]+]], #16]
		; CHECK-APPLE-AARCH64: mov [[ID:w[0-9]+]], #1
		; CHECK-APPLE-AARCH64: add [[ARGS:x[0-9]+]], [[TMP]], #16
		; CHECK-APPLE-AARCH64: strb [[ID]], [x0, #8]
; Second vararg		; Second vararg
; CHECK-APPLE-DAG: ldr {{w[0-9]+}}, [{{.*}}[[TMP]], #24]		; CHECK-APPLE-AARCH64: ldr {{w[0-9]+}}, [{{.*}}[[TMP]], #24]
; CHECK-APPLE-DAG: add {{x[0-9]+}}, {{x[0-9]+}}, #16
; Third vararg		; Third vararg
; CHECK-APPLE-DAG: ldr {{w[0-9]+}}, [{{.*}}[[TMP]], #32]		; CHECK-APPLE-AARCH64: ldr {{w[0-9]+}}, [{{.*}}[[TMP]], #32]

		; CHECK-APPLE-ARM64_32: mov [[ID:w[0-9]+]], #1
		; CHECK-APPLE-ARM64_32: add [[ARGS:x[0-9]+]], [[TMP:x[0-9]+]], #16
		; CHECK-APPLE-ARM64_32: strb [[ID]], [x0, #8]


; CHECK-APPLE: mov x21, x0
; CHECK-APPLE-NOT: x21
entry:		entry:
%call = call i8* @malloc(i64 16)		%call = call i8* @malloc(i64 16)
%call.0 = bitcast i8* %call to %swift_error*		%call.0 = bitcast i8* %call to %swift_error*
store %swift_error* %call.0, %swift_error** %error_ptr_ref		store %swift_error* %call.0, %swift_error** %error_ptr_ref
%tmp = getelementptr inbounds i8, i8* %call, i64 8		%tmp = getelementptr inbounds i8, i8* %call, i64 8
store i8 1, i8* %tmp		store i8 1, i8* %tmp

%args = alloca i8*, align 8		%args = alloca i8*, align 8
[11 lines not shown; enclosing context: entry:]

ret float 1.0		ret float 1.0
}		}

; "caller4" calls "foo_vararg" that takes a swifterror parameter.		; "caller4" calls "foo_vararg" that takes a swifterror parameter.
define float @caller4(i8* %error_ref) {		define float @caller4(i8* %error_ref) {
; CHECK-APPLE-LABEL: caller4:		; CHECK-APPLE-LABEL: caller4:

; CHECK-APPLE: mov [[ID:x[0-9]+]], x0		; CHECK-APPLE-AARCH64: mov [[ID:x[0-9]+]], x0
; CHECK-APPLE: stp {{x[0-9]+}}, {{x[0-9]+}}, [sp, #8]		; CHECK-APPLE-AARCH64: stp {{x[0-9]+}}, {{x[0-9]+}}, [sp, #8]
; CHECK-APPLE: str {{x[0-9]+}}, [sp]		; CHECK-APPLE-AARCH64: str {{x[0-9]+}}, [sp]

; CHECK-APPLE: mov x21, xzr		; CHECK-APPLE-AARCH64: mov x21, xzr
; CHECK-APPLE: bl {{.*}}foo_vararg		; CHECK-APPLE-AARCH64: bl {{.*}}foo_vararg
; CHECK-APPLE: mov x0, x21		; CHECK-APPLE-AARCH64: mov x0, x21
; CHECK-APPLE: cbnz x21		; CHECK-APPLE-AARCH64: cbnz x21
; Access part of the error object and save it to error_ref		; Access part of the error object and save it to error_ref
; CHECK-APPLE: ldrb [[CODE:w[0-9]+]], [x0, #8]		; CHECK-APPLE-AARCH64: ldrb [[CODE:w[0-9]+]], [x0, #8]
; CHECK-APPLE: strb [[CODE]], [{{.*}}[[ID]]]		; CHECK-APPLE-AARCH64: strb [[CODE]], [{{.*}}[[ID]]]
; CHECK-APPLE: bl {{.*}}free		; CHECK-APPLE-AARCH64: bl {{.*}}free
entry:		entry:
%error_ptr_ref = alloca swifterror %swift_error*		%error_ptr_ref = alloca swifterror %swift_error*
store %swift_error* null, %swift_error** %error_ptr_ref		store %swift_error* null, %swift_error** %error_ptr_ref

%a10 = alloca i32, align 4		%a10 = alloca i32, align 4
%a11 = alloca i32, align 4		%a11 = alloca i32, align 4
%a12 = alloca i32, align 4		%a12 = alloca i32, align 4
store i32 10, i32* %a10, align 4		store i32 10, i32* %a10, align 4
[252 lines not shown]

llvm/test/CodeGen/AArch64/swiftself.ll

; RUN: llc -verify-machineinstrs -mtriple=aarch64-apple-ios -o - %s | FileCheck --check-prefix=CHECK --check-prefix=OPT %s		; RUN: llc -verify-machineinstrs -mtriple=aarch64-apple-ios -o - %s | FileCheck --check-prefix=CHECK --check-prefix=OPT --check-prefix=OPTAARCH64 %s
; RUN: llc -O0 -fast-isel -verify-machineinstrs -mtriple=aarch64-apple-ios -o - %s | FileCheck %s		; RUN: llc -O0 -fast-isel -verify-machineinstrs -mtriple=aarch64-apple-ios -o - %s | FileCheck %s
; RUN: llc -verify-machineinstrs -mtriple=aarch64-unknown-linux-gnu -o - %s | FileCheck --check-prefix=CHECK --check-prefix=OPT %s		; RUN: llc -verify-machineinstrs -mtriple=aarch64-unknown-linux-gnu -o - %s | FileCheck --check-prefix=CHECK --check-prefix=OPT --check-prefix=OPTAARCH64 %s
		; RUN: llc -verify-machineinstrs -mtriple=arm64_32-apple-ios -o - %s | FileCheck --check-prefix=CHECK --check-prefix=OPT --check-prefix=OPTARM64_32 %s

; Parameter with swiftself should be allocated to x20.		; Parameter with swiftself should be allocated to x20.
; CHECK-LABEL: swiftself_param:		; CHECK-LABEL: swiftself_param:
; CHECK: mov x0, x20		; CHECK: mov x0, x20
; CHECK-NEXT: ret		; CHECK-NEXT: ret
define i8* @swiftself_param(i8* swiftself %addr0) {		define i8* @swiftself_param(i8* swiftself %addr0) {
ret i8 *%addr0		ret i8 *%addr0
}		}
[31 lines not shown; enclosing context: define void @swiftself_passthrough(i8* swiftself %addr0) {]
call i8* @swiftself_param(i8* swiftself %addr0)		call i8* @swiftself_param(i8* swiftself %addr0)
call i8* @swiftself_param(i8* swiftself %addr0)		call i8* @swiftself_param(i8* swiftself %addr0)
ret void		ret void
}		}

; We can use a tail call if the callee swiftself is the same as the caller one.		; We can use a tail call if the callee swiftself is the same as the caller one.
; This should also work with fast-isel.		; This should also work with fast-isel.
; CHECK-LABEL: swiftself_tail:		; CHECK-LABEL: swiftself_tail:
; CHECK: b {{_?}}swiftself_param		; OPTAARCH64: b {{_?}}swiftself_param
; CHECK-NOT: ret		; OPTAARCH64-NOT: ret
		; OPTARM64_32: bl {{_?}}swiftself_param
define i8* @swiftself_tail(i8* swiftself %addr0) {		define i8* @swiftself_tail(i8* swiftself %addr0) {
call void asm sideeffect "", "~{x20}"()		call void asm sideeffect "", "~{x20}"()
%res = tail call i8* @swiftself_param(i8* swiftself %addr0)		%res = tail call i8* @swiftself_param(i8* swiftself %addr0)
ret i8* %res		ret i8* %res
}		}

; We can not use a tail call if the callee swiftself is not the same as the		; We can not use a tail call if the callee swiftself is not the same as the
; caller one.		; caller one.
; CHECK-LABEL: swiftself_notail:		; CHECK-LABEL: swiftself_notail:
; CHECK: mov x20, x0		; CHECK: mov x20, x0
; CHECK: bl {{_?}}swiftself_param		; CHECK: bl {{_?}}swiftself_param
; CHECK: ret		; CHECK: ret
define i8* @swiftself_notail(i8* swiftself %addr0, i8* %addr1) nounwind {		define i8* @swiftself_notail(i8* swiftself %addr0, i8* %addr1) nounwind {
%res = tail call i8* @swiftself_param(i8* swiftself %addr1)		%res = tail call i8* @swiftself_param(i8* swiftself %addr1)
ret i8* %res		ret i8* %res
}		}

; We cannot pretend that 'x0' is alive across the thisreturn_attribute call as		; We cannot pretend that 'x0' is alive across the thisreturn_attribute call as
; we normally would. We marked the first parameter with swiftself which means it		; we normally would. We marked the first parameter with swiftself which means it
; will no longer be passed in x0.		; will no longer be passed in x0.
declare swiftcc i8* @thisreturn_attribute(i8* returned swiftself)		declare swiftcc i8* @thisreturn_attribute(i8* returned swiftself)
; OPT-LABEL: swiftself_nothisreturn:		; OPTAARCH64-LABEL: swiftself_nothisreturn:
; OPT-DAG: ldr x20, [x20]		; OPTAARCH64-DAG: ldr x20, [x20]
; OPT-DAG: mov [[CSREG:x[1-9].*]], x8		; OPTAARCH64-DAG: mov [[CSREG:x[1-9].*]], x8
; OPT: bl {{_?}}thisreturn_attribute		; OPTAARCH64: bl {{_?}}thisreturn_attribute
; OPT: str x0, {{\[}}[[CSREG]]		; OPTAARCH64: str x0, {{\[}}[[CSREG]]
; OPT: ret		; OPTAARCH64: ret

		; OPTARM64_32-LABEL: swiftself_nothisreturn:
		; OPTARM64_32-DAG: ldr w20, [x20]
		; OPTARM64_32-DAG: mov [[CSREG:x[1-9].*]], x8
		; OPTARM64_32: bl {{_?}}thisreturn_attribute
		; OPTARM64_32: str w0, {{\[}}[[CSREG]]
		; OPTARM64_32: ret
define hidden swiftcc void @swiftself_nothisreturn(i8** noalias nocapture sret, i8** noalias nocapture readonly swiftself) {		define hidden swiftcc void @swiftself_nothisreturn(i8** noalias nocapture sret, i8** noalias nocapture readonly swiftself) {
entry:		entry:
%2 = load i8*, i8** %1, align 8		%2 = load i8*, i8** %1, align 8
%3 = tail call swiftcc i8* @thisreturn_attribute(i8* swiftself %2)		%3 = tail call swiftcc i8* @thisreturn_attribute(i8* swiftself %2)
store i8* %3, i8** %0, align 8		store i8* %3, i8** %0, align 8
ret void		ret void
}		}

llvm/test/CodeGen/AArch64/tail-call.ll

	; RUN: llc -verify-machineinstrs < %s -mtriple=aarch64-none-linux-gnu -tailcallopt | FileCheck %s --check-prefixes=SDAG,COMMON			; RUN: llc -verify-machineinstrs < %s -mtriple=aarch64-none-linux-gnu -tailcallopt | FileCheck %s --check-prefixes=SDAG,COMMON
	; RUN: llc -global-isel -global-isel-abort=2 -verify-machineinstrs < %s -mtriple=aarch64-none-linux-gnu -tailcallopt | FileCheck %s --check-prefixes=GISEL,COMMON			; RUN: llc -global-isel -global-isel-abort=2 -verify-machineinstrs < %s -mtriple=aarch64-none-linux-gnu -tailcallopt | FileCheck %s --check-prefixes=GISEL,COMMON

	declare fastcc void @callee_stack0()			declare fastcc void @callee_stack0()
	declare fastcc void @callee_stack8([8 x i32], i64)			declare fastcc void @callee_stack8([8 x i64], i64)
	declare fastcc void @callee_stack16([8 x i32], i64, i64)			declare fastcc void @callee_stack16([8 x i64], i64, i64)
				efriedma: What are you trying to do here?
				t.p.northover (author): These functions use the arrays purely to consume register space during a call, but after this change `[8 x i32]` only uses x0-x3 (in full). It's only an IR-level change (i.e. it doesn't affect the C or C++ ABI), but I'll limit it to the arm64_32 target when I update the diff.
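
The register arithmetic in the reply above can be sketched in a few lines of C++. This is an illustrative model only, not LLVM's calling-convention code: `gprsForArray`, `nextArgGoesOnStack`, and the `packed` switch are hypothetical names, and the model assumes eight 64-bit argument registers (x0-x7) with array elements either taking one register each or being laid out back to back.

```cpp
#include <cassert>

// Illustrative model only: counts how many 64-bit GPRs an IR array argument
// would occupy. `packed` is a hypothetical switch between the two behaviours
// discussed in the thread; it does not correspond to a real LLVM flag.
unsigned gprsForArray(unsigned numElts, unsigned eltBits, bool packed) {
  const unsigned gprBits = 64;
  if (!packed)
    return numElts; // one register per element, upper bits unused
  // Elements laid out back to back, rounded up to whole registers.
  return (numElts * eltBits + gprBits - 1) / gprBits;
}

// With eight argument registers (x0-x7), a following argument is pushed to
// the stack only if the padding array has consumed all of them.
bool nextArgGoesOnStack(unsigned numElts, unsigned eltBits, bool packed) {
  const unsigned numArgGPRs = 8;
  return gprsForArray(numElts, eltBits, packed) >= numArgGPRs;
}
```

Under this model `[8 x i32]` fills only four registers once elements are packed, so the trailing `i64` arguments would no longer land on the stack, whereas `[8 x i64]` fills all eight either way. That is consistent with the test switching its padding arrays to `[8 x i64]`.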
	declare extern_weak fastcc void @callee_weak()			declare extern_weak fastcc void @callee_weak()

	define fastcc void @caller_to0_from0() nounwind {			define fastcc void @caller_to0_from0() nounwind {
	; COMMON-LABEL: caller_to0_from0:			; COMMON-LABEL: caller_to0_from0:
	; COMMON-NEXT: // %bb.			; COMMON-NEXT: // %bb.

	tail call fastcc void @callee_stack0()			tail call fastcc void @callee_stack0()
	ret void			ret void

	; COMMON-NEXT: b callee_stack0			; COMMON-NEXT: b callee_stack0
	}			}

	define fastcc void @caller_to0_from8([8 x i32], i64) {			define fastcc void @caller_to0_from8([8 x i64], i64) {
	; COMMON-LABEL: caller_to0_from8:			; COMMON-LABEL: caller_to0_from8:

	tail call fastcc void @callee_stack0()			tail call fastcc void @callee_stack0()
	ret void			ret void

	; COMMON: add sp, sp, #16			; COMMON: add sp, sp, #16
	; COMMON-NEXT: b callee_stack0			; COMMON-NEXT: b callee_stack0
	}			}

	define fastcc void @caller_to8_from0() {			define fastcc void @caller_to8_from0() {
	; COMMON-LABEL: caller_to8_from0:			; COMMON-LABEL: caller_to8_from0:
	; COMMON: sub sp, sp, #32			; COMMON: sub sp, sp, #32

	; Key point is that the "42" should go #16 below incoming stack			; Key point is that the "42" should go #16 below incoming stack
	; pointer (we didn't have arg space to reuse).			; pointer (we didn't have arg space to reuse).
	tail call fastcc void @callee_stack8([8 x i32] undef, i64 42)			tail call fastcc void @callee_stack8([8 x i64] undef, i64 42)
	ret void			ret void

	; COMMON: str {{x[0-9]+}}, [sp, #16]!			; COMMON: str {{x[0-9]+}}, [sp, #16]!
	; COMMON-NEXT: b callee_stack8			; COMMON-NEXT: b callee_stack8
	}			}

	define fastcc void @caller_to8_from8([8 x i32], i64 %a) {			define fastcc void @caller_to8_from8([8 x i64], i64 %a) {
	; COMMON-LABEL: caller_to8_from8:			; COMMON-LABEL: caller_to8_from8:
	; COMMON: sub sp, sp, #16			; COMMON: sub sp, sp, #16

	; Key point is that the "%a" should go where at SP on entry.			; Key point is that the "%a" should go where at SP on entry.
	tail call fastcc void @callee_stack8([8 x i32] undef, i64 42)			tail call fastcc void @callee_stack8([8 x i64] undef, i64 42)
	ret void			ret void

	; COMMON: str {{x[0-9]+}}, [sp, #16]!			; COMMON: str {{x[0-9]+}}, [sp, #16]!
	; COMMON-NEXT: b callee_stack8			; COMMON-NEXT: b callee_stack8
	}			}

	define fastcc void @caller_to16_from8([8 x i32], i64 %a) {			define fastcc void @caller_to16_from8([8 x i64], i64 %a) {
	; COMMON-LABEL: caller_to16_from8:			; COMMON-LABEL: caller_to16_from8:
	; COMMON: sub sp, sp, #16			; COMMON: sub sp, sp, #16

	; Important point is that the call reuses the "dead" argument space			; Important point is that the call reuses the "dead" argument space
	; above %a on the stack. If it tries to go below incoming-SP then the			; above %a on the stack. If it tries to go below incoming-SP then the
	; callee will not deallocate the space, even in fastcc.			; callee will not deallocate the space, even in fastcc.
	tail call fastcc void @callee_stack16([8 x i32] undef, i64 42, i64 2)			tail call fastcc void @callee_stack16([8 x i64] undef, i64 42, i64 2)

	; COMMON: stp {{x[0-9]+}}, {{x[0-9]+}}, [sp, #16]!			; COMMON: stp {{x[0-9]+}}, {{x[0-9]+}}, [sp, #16]!
	; COMMON-NEXT: b callee_stack16			; COMMON-NEXT: b callee_stack16
	ret void			ret void
	}			}


	define fastcc void @caller_to8_from24([8 x i32], i64 %a, i64 %b, i64 %c) {			define fastcc void @caller_to8_from24([8 x i64], i64 %a, i64 %b, i64 %c) {
	; COMMON-LABEL: caller_to8_from24:			; COMMON-LABEL: caller_to8_from24:
	; COMMON: sub sp, sp, #16			; COMMON: sub sp, sp, #16

	; Key point is that the "%a" should go where at #16 above SP on entry.			; Key point is that the "%a" should go where at #16 above SP on entry.
	tail call fastcc void @callee_stack8([8 x i32] undef, i64 42)			tail call fastcc void @callee_stack8([8 x i64] undef, i64 42)
	ret void			ret void

	; COMMON: str {{x[0-9]+}}, [sp, #32]!			; COMMON: str {{x[0-9]+}}, [sp, #32]!
	; COMMON-NEXT: b callee_stack8			; COMMON-NEXT: b callee_stack8
	}			}


	define fastcc void @caller_to16_from16([8 x i32], i64 %a, i64 %b) {			define fastcc void @caller_to16_from16([8 x i64], i64 %a, i64 %b) {
	; COMMON-LABEL: caller_to16_from16:			; COMMON-LABEL: caller_to16_from16:
	; COMMON: sub sp, sp, #16			; COMMON: sub sp, sp, #16

	; Here we want to make sure that both loads happen before the stores:			; Here we want to make sure that both loads happen before the stores:
	; otherwise either %a or %b will be wrongly clobbered.			; otherwise either %a or %b will be wrongly clobbered.
	tail call fastcc void @callee_stack16([8 x i32] undef, i64 %b, i64 %a)			tail call fastcc void @callee_stack16([8 x i64] undef, i64 %b, i64 %a)
	ret void			ret void

	; COMMON: ldp {{x[0-9]+}}, {{x[0-9]+}}, [sp, #16]			; COMMON: ldp {{x[0-9]+}}, {{x[0-9]+}}, [sp, #16]
	; COMMON: stp {{x[0-9]+}}, {{x[0-9]+}}, [sp, #16]!			; COMMON: stp {{x[0-9]+}}, {{x[0-9]+}}, [sp, #16]!
	; COMMON-NEXT: b callee_stack16			; COMMON-NEXT: b callee_stack16
	}			}


	[53 lines not shown]

llvm/test/CodeGen/AArch64/umulo-128-legalisation-lowering.ll

	[21 lines not shown]
	; AARCH-NEXT: and w10, w13, w12			; AARCH-NEXT: and w10, w13, w12
	; AARCH-NEXT: cset w12, ne			; AARCH-NEXT: cset w12, ne
	; AARCH-NEXT: cmp xzr, x11			; AARCH-NEXT: cmp xzr, x11
	; AARCH-NEXT: orr w10, w10, w12			; AARCH-NEXT: orr w10, w10, w12
	; AARCH-NEXT: cset w11, ne			; AARCH-NEXT: cset w11, ne
	; AARCH-NEXT: orr w10, w10, w11			; AARCH-NEXT: orr w10, w10, w11
	; AARCH-NEXT: orr w9, w10, w9			; AARCH-NEXT: orr w9, w10, w9
	; AARCH-NEXT: mul x0, x0, x2			; AARCH-NEXT: mul x0, x0, x2
	; AARCH-NEXT: mov x1, x8			; AARCH-DAG: mov x1, x8
	; AARCH-NEXT: mov w2, w9			; AARCH-DAG: mov w2, w9
	; AARCH-NEXT: ret			; AARCH-NEXT: ret
	start:			start:
	%0 = tail call { i128, i1 } @llvm.umul.with.overflow.i128(i128 %l, i128 %r) #2			%0 = tail call { i128, i1 } @llvm.umul.with.overflow.i128(i128 %l, i128 %r) #2
	%1 = extractvalue { i128, i1 } %0, 0			%1 = extractvalue { i128, i1 } %0, 0
	%2 = extractvalue { i128, i1 } %0, 1			%2 = extractvalue { i128, i1 } %0, 1
	%3 = zext i1 %2 to i8			%3 = zext i1 %2 to i8
	%4 = insertvalue { i128, i8 } undef, i128 %1, 0			%4 = insertvalue { i128, i8 } undef, i128 %1, 0
	%5 = insertvalue { i128, i8 } %4, i8 %3, 1			%5 = insertvalue { i128, i8 } %4, i8 %3, 1
	[9 lines not shown]

llvm/test/CodeGen/AArch64/win64_vararg.ll

[255 lines not shown; enclosing context: define i32 @snprintf(i8*, i64, i8*, ...) local_unnamed_addr #5 {]
ret i32 %12		ret i32 %12
}		}

; CHECK-LABEL: fixed_params		; CHECK-LABEL: fixed_params
; CHECK: sub sp, sp, #32		; CHECK: sub sp, sp, #32
; CHECK-DAG: mov w6, w3		; CHECK-DAG: mov w6, w3
; CHECK-DAG: mov [[REG1:w[0-9]+]], w2		; CHECK-DAG: mov [[REG1:w[0-9]+]], w2
; CHECK: mov w2, w1		; CHECK: mov w2, w1
; CHECK: str w4, [sp]
; CHECK: fmov x1, d0		; CHECK: fmov x1, d0
; CHECK: fmov x3, d1		; CHECK: fmov x3, d1
; CHECK: fmov x5, d2		; CHECK: fmov x5, d2
; CHECK: fmov x7, d3		; CHECK: fmov x7, d3
		; CHECK: str w4, [sp]
; CHECK: mov w4, [[REG1]]		; CHECK: mov w4, [[REG1]]
; CHECK: str x30, [sp, #16]		; CHECK: str x30, [sp, #16]
; CHECK: str d4, [sp, #8]		; CHECK: str d4, [sp, #8]
; CHECK: bl varargs		; CHECK: bl varargs
		efriedma: There isn't any obvious reason for this test to change?
		t.p.northover (author): I think it's because of the `std::map` change you called out above. It implicitly sorts the list of registers that get copied, perturbing the DAG and scheduling. I think I'll switch it back to `SmallVector` and use `std::find_if` to handle the (rare) ARM compatibility case instead. It ought to be faster in the common case and won't have this side-effect.
		t.p.northover (author): Well, as you can see, that accounted for a lot of the differences, but the `fmov`s still get reordered w.r.t. the store. I have no idea why this is: the DAG is identical and I'm reasonably sure it's harmless, so I blame gremlins.
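
The container trade-off described in the replies above can be illustrated with a small standalone C++ sketch. It is not the code from the patch: `std::vector` stands in for `SmallVector`, and `RegPair`, `orderViaMap`, and `lookup` are hypothetical names chosen for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

using RegPair = std::pair<unsigned, int>; // (register number, payload)

// std::map iterates in key order, so the order in which registers were
// added is lost once they have been inserted.
std::vector<unsigned> orderViaMap(const std::vector<RegPair> &added) {
  std::map<unsigned, int> m(added.begin(), added.end());
  std::vector<unsigned> order;
  for (const auto &kv : m)
    order.push_back(kv.first);
  return order;
}

// A plain vector keeps insertion order; std::find_if covers the occasional
// lookup by register number with a linear scan.
int *lookup(std::vector<RegPair> &regs, unsigned reg) {
  auto it = std::find_if(regs.begin(), regs.end(),
                         [reg](const RegPair &p) { return p.first == reg; });
  return it == regs.end() ? nullptr : &it->second;
}
```

Adding registers 6, 2, 4 in that order yields 2, 4, 6 when iterating the map, while the vector still begins with 6. Any consumer that walks the container therefore sees a different sequence, which is the kind of perturbation the author attributes to the `std::map` version.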
; CHECK: ldr x30, [sp, #16]		; CHECK: ldr x30, [sp, #16]
; CHECK: add sp, sp, #32		; CHECK: add sp, sp, #32
; CHECK: ret		; CHECK: ret
define void @fixed_params(i32, double, i32, double, i32, double, i32, double, i32, double) nounwind {		define void @fixed_params(i32, double, i32, double, i32, double, i32, double, i32, double) nounwind {
tail call void (i32, ...) @varargs(i32 %0, double %1, i32 %2, double %3, i32 %4, double %5, i32 %6, double %7, i32 %8, double %9)		tail call void (i32, ...) @varargs(i32 %0, double %1, i32 %2, double %3, i32 %4, double %5, i32 %6, double %7, i32 %8, double %9)
ret void		ret void
}		}

declare void @varargs(i32, ...) local_unnamed_addr		declare void @varargs(i32, ...) local_unnamed_addr

llvm/test/MC/AArch64/arm64_32-compact-unwind.s

This file was added.

				; RUN: llvm-mc -triple=arm64_32-ios7.0 -filetype=obj %s -o %t
				; RUN: llvm-objdump -s %t | FileCheck %s

				; The compact unwind format in ILP32 mode is pretty much the same, except
				; references to addresses (function, personality, LSDA) are pointer-sized.

				; CHECK: Contents of section __compact_unwind:
				; CHECK-NEXT: 0004 00000000 04000000 00000002 00000000
				; CHECK-NEXT: 0014 00000000
				.globl _test_compact_unwind
				.align 2
				_test_compact_unwind:
				.cfi_startproc
				ret
				.cfi_endproc
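
As a rough illustration of the comment in this test, the sketch below computes the size of one `__compact_unwind` entry for a given pointer width and decodes the bytes from the CHECK lines. It assumes the conventional field order (function, length, encoding, personality, LSDA) with only the three address fields being pointer-sized; `compactUnwindEntrySize`, `readLE32`, and `kEntry` are hypothetical names for this example, not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Size of one __compact_unwind entry for a given pointer width, assuming the
// field order: function, length, encoding, personality, LSDA.
constexpr std::size_t compactUnwindEntrySize(std::size_t ptrBytes) {
  return ptrBytes     // function start address
         + 4          // function length
         + 4          // compact encoding
         + ptrBytes   // personality
         + ptrBytes;  // LSDA
}

// Reads a little-endian 32-bit value from raw section bytes.
std::uint32_t readLE32(const std::uint8_t *p) {
  return std::uint32_t(p[0]) | std::uint32_t(p[1]) << 8 |
         std::uint32_t(p[2]) << 16 | std::uint32_t(p[3]) << 24;
}

// Bytes matching the CHECK lines of the test (20 bytes in total).
const std::uint8_t kEntry[20] = {
    0x00, 0x00, 0x00, 0x00,  // function address (zero in the dump)
    0x04, 0x00, 0x00, 0x00,  // length: a single 4-byte `ret`
    0x00, 0x00, 0x00, 0x02,  // encoding
    0x00, 0x00, 0x00, 0x00,  // personality
    0x00, 0x00, 0x00, 0x00}; // LSDA
```

With 4-byte pointers the entry is 20 bytes, matching the 16 + 4 bytes in the dump. The same layout with 8-byte pointers would be 32 bytes.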

llvm/utils/TableGen/CallingConvEmitter.cpp

[258 lines not shown; enclosing context: if (Action->isSubClassOf("CCDelegateTo")) {]
<< IndentStr << IndentStr << "LocInfo = CCValAssign::ZExtUpper;\n"		<< IndentStr << IndentStr << "LocInfo = CCValAssign::ZExtUpper;\n"
<< IndentStr << "else\n"		<< IndentStr << "else\n"
<< IndentStr << IndentStr << "LocInfo = CCValAssign::AExtUpper;\n";		<< IndentStr << IndentStr << "LocInfo = CCValAssign::AExtUpper;\n";
}		}
} else if (Action->isSubClassOf("CCBitConvertToType")) {		} else if (Action->isSubClassOf("CCBitConvertToType")) {
Record *DestTy = Action->getValueAsDef("DestTy");		Record *DestTy = Action->getValueAsDef("DestTy");
O << IndentStr << "LocVT = " << getEnumName(getValueType(DestTy)) <<";\n";		O << IndentStr << "LocVT = " << getEnumName(getValueType(DestTy)) <<";\n";
O << IndentStr << "LocInfo = CCValAssign::BCvt;\n";		O << IndentStr << "LocInfo = CCValAssign::BCvt;\n";
		} else if (Action->isSubClassOf("CCTruncToType")) {
		Record *DestTy = Action->getValueAsDef("DestTy");
		O << IndentStr << "LocVT = " << getEnumName(getValueType(DestTy)) <<";\n";
		O << IndentStr << "LocInfo = CCValAssign::Trunc;\n";
} else if (Action->isSubClassOf("CCPassIndirect")) {		} else if (Action->isSubClassOf("CCPassIndirect")) {
Record *DestTy = Action->getValueAsDef("DestTy");		Record *DestTy = Action->getValueAsDef("DestTy");
O << IndentStr << "LocVT = " << getEnumName(getValueType(DestTy)) <<";\n";		O << IndentStr << "LocVT = " << getEnumName(getValueType(DestTy)) <<";\n";
O << IndentStr << "LocInfo = CCValAssign::Indirect;\n";		O << IndentStr << "LocInfo = CCValAssign::Indirect;\n";
} else if (Action->isSubClassOf("CCPassByVal")) {		} else if (Action->isSubClassOf("CCPassByVal")) {
int Size = Action->getValueAsInt("Size");		int Size = Action->getValueAsInt("Size");
int Align = Action->getValueAsInt("Align");		int Align = Action->getValueAsInt("Align");
O << IndentStr		O << IndentStr
[23 lines not shown]
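
To show what the new `CCTruncToType` branch prints, here is a minimal standalone sketch of the same two output statements. `emitTruncToType` is a hypothetical helper written for this example. The real emitter gets the type name from `getEnumName(getValueType(DestTy))`, and the `MVT::i32` string used in the usage note is an assumed example of such a name.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for the emitter branch: writes the two assignments
// that the CCTruncToType case in the diff prints for a destination type.
std::string emitTruncToType(const std::string &indent,
                            const std::string &destTyEnum) {
  std::ostringstream O;
  O << indent << "LocVT = " << destTyEnum << ";\n";
  O << indent << "LocInfo = CCValAssign::Trunc;\n";
  return O.str();
}
```

Calling `emitTruncToType("  ", "MVT::i32")` produces the lines `LocVT = MVT::i32;` and `LocInfo = CCValAssign::Trunc;`, mirroring the neighbouring `CCBitConvertToType` branch except for the `LocInfo` value.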


Diff 219319
