- Removed the __sync_* libcall names from the default set of libcalls, so they cannot be invoked accidentally unless a target requests them. Targets call initSyncLibcalls() to request them where they are supported and required (currently ARM and MIPS16).
- Deleted the 'enableAtomicExpand' TargetLowering function: the pass is always enabled once it is added to the pipeline.
- Removed the unnecessary selection DAG expansions of ATOMIC_LOAD into ATOMIC_CMP_SWAP and of ATOMIC_STORE into ATOMIC_SWAP. Targets without native atomic load/store support are now handled by AtomicExpandPass, which translates these operations into atomic_load/atomic_store, rather than by expanding them into unnatural operations.
- Cleaned up conditionals in target ISel code that handled too-large atomic operations; such operations are now expanded before they reach the target's code.
ARM is the most complex target, as it's the only one which mixes all
the different options for atomic lowering, depending on the exact
subarchitecture and OS in use:
- AtomicExpandPass expansion to LL/SC instructions, where available natively (ARMv6+ in ARM mode, ARMv7+ in Thumb mode).
- Passthrough of atomicrmw and cmpxchg in AtomicExpandPass, followed by ISel expansion to __sync_* library calls, when native instructions are not available but the OS provides a kernel-supported pseudo-atomic sequence (currently Linux and Darwin).
- AtomicExpandPass expansion to __atomic_* library calls, if neither of the above are possible for a given size on the given CPU architecture.
So, naturally, ARM required the most changes here.
Also of note are the XCore changes. I removed all the code handling
ATOMIC_STORE/ATOMIC_LOAD because, as far as I can tell, XCore has no
cmpxchg support and thus cannot claim to implement atomics of any
size.
It would be nice to document what it does support otherwise.