This is an archive of the discontinued LLVM Phabricator instance.

Differential D15471

[IR] Add support for floating pointer atomic loads and stores
Closed, Public

Authored by reames on Dec 11 2015, 4:15 PM.

Details

Reviewers

jyknight
jfb
hfinkel

Commits

rG61a24ab6cc30: [IR] Add support for floating pointer atomic loads and stores
rL255737: [IR] Add support for floating pointer atomic loads and stores

Summary

This patch allows atomic loads and stores of floating point values to be specified in the IR and adds an adapter to allow them to be lowered via existing backend support for the bitcast-to-equivalent-integer idiom.

Previously, the only way to specify an atomic float operation was to bitcast the pointer to an i32 pointer, load the value as an i32, then bitcast to a float. At its most basic, this patch simply moves this expansion step to the point we start lowering to the backend.
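For readers who think in C++ rather than LLVM IR, the bitcast idiom described above can be sketched at the source level: the float is never accessed atomically as a float, its bit pattern is moved through a 32-bit integer atomic instead. This is only an illustration, not code from the patch. The helper names (`floatToBits`, `bitsToFloat`, `storeFloatViaInt`, `loadFloatViaInt`) are hypothetical, `memory_order_relaxed` is merely the nearest C++ stand-in for the weak orderings used in the tests, and the sketch assumes a 32-bit `float`.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// Assumption of this sketch: float and uint32_t have the same size.
static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Reinterpret the bits of a float as a 32-bit integer (the "bitcast").
inline std::uint32_t floatToBits(float V) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &V, sizeof(Bits));
  return Bits;
}

// Reinterpret a 32-bit integer as a float (the reverse "bitcast").
inline float bitsToFloat(std::uint32_t Bits) {
  float V;
  std::memcpy(&V, &Bits, sizeof(V));
  return V;
}

// Atomic float store expressed as an atomic store of the integer bit pattern.
inline void storeFloatViaInt(std::atomic<std::uint32_t> &Slot, float V) {
  Slot.store(floatToBits(V), std::memory_order_relaxed);
}

// Atomic float load expressed as an atomic integer load plus a bitcast.
inline float loadFloatViaInt(const std::atomic<std::uint32_t> &Slot) {
  return bitsToFloat(Slot.load(std::memory_order_relaxed));
}
```

The only atomic operation in the sketch is the integer one; the conversions on either side are plain bit copies. That is the shape the patch lets a frontend stop writing by hand.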

This patch does not add canonicalization rules to convert the bitcast idioms to the appropriate atomic loads. I plan to do that in the future, but for now, let's simply add the support. I'd like to get instruction selection working through at least one backend (x86-64) without the bitcast conversion before canonicalizing into this form.

Similarly, I haven't yet added the target hooks to opt out of the lowering step I added to AtomicExpand. I figured it would make more sense to add those once at least one backend (x86) was ready to actually opt out.

As you can see from the included tests, the generated code quality is not great. I plan on submitting some patches to fix this, but help from others along that line would be very welcome. I'm not super familiar with the backend and my ramp up time may be material.

Diff Detail

Repository: rL LLVM

Event Timeline

reames updated this revision to Diff 42599. Dec 11 2015, 4:15 PM

reames retitled this revision to [IR] Add support for floating pointer and vector atomic loads and stores.

reames updated this object.

reames added reviewers: jfb, hfinkel, jyknight.

reames added a subscriber: llvm-commits.

Minor drop by comments inline.

docs/LangRef.rst
6994 ↗	(On Diff #42599)	Minor & optional language suggestion: right now this can be read as "is >= 2^8". Also, I'm not sure if the "greater than or equal to a target specific limit" is better than "greater than a target specific limit". Might be clearer to phrase this as: ... be a type whose bit width is a power of two, is not less than 8, and is greater than a target specific limit ...
lib/CodeGen/AtomicExpandPass.cpp
199 ↗	(On Diff #42599)	Why not have `IntegerType *` as the return type?
209 ↗	(On Diff #42599)	`F` is a `llvm::Module`, so please call it `M`. Also, there is an `Instruction::getModule`.
218 ↗	(On Diff #42599)	Very minor: spacing before `=`.
292 ↗	(On Diff #42599)	Again, I think `auto *M = SI->getModule()` is better.

Address Sanjoy's review.

Does this have any implications for alias analysis, especially type-based? The C++ model means memory locations are always atomic, so this isn't un-aliasing non-atomics further, but do LLVM's AAs understand that atomics can be non-integral now?

As explained in one of my comments, I think this patch should only do FP and not vectors.

docs/LangRef.rst
6995 ↗	(On Diff #42613)	This isn't correct because it can't be any type: it has to be an integer, FP or vector. I'm not sure I'm really enthused by the addition of vector here because it has odd ramifications: (1) What if the vector's size isn't a power of 2? (2) What about vector atomicity and alignment? Some use cases may be OK with element-wise atomicity only (with undefined ordering). (3) What if the vector contains pointers? Right now you can't have atomic load/store of pointers, it seems odd to allow vectors of pointers. I'd drop vectors from this patch, and clarify the documentation to mention integer and floating-point.
lib/CodeGen/AtomicExpandPass.cpp
142 ↗	(On Diff #42613)	Add a string: `assert(foo && "bar");`
153 ↗	(On Diff #42613)	Ditto.
202 ↗	(On Diff #42613)	Shouldn't this be `getStoreSizeInBits()`? If `getStoreSizeInBits() != getSizeInBits()` then we'll have problems because the alignment may be wrong. I'd assert that they're the same, as well as checking the number is a power of 2 (because lulz fp80).

In D15471#309318, @jfb wrote:

> Does this have any implications for alias analysis, especially type-based? The C++ model means memory locations are always atomic, so this isn't un-aliasing non-atomics further, but do LLVM's AAs understand that atomics can be non-integral now?

I'm not sure what you're trying to ask here specifically, but I have no reason to believe this influences AA in any way. Any AA which is using the *llvm type* to prove no alias is wrong and should be fixed. TBAA is entirely orthogonal and unaffected.

> As explained in one of my comments, I think this patch should only do FP and not vectors.

I'm okay with this for the moment. Will upload a simplified patch shortly.

reames added inline comments. Dec 14 2015, 1:05 PM

docs/LangRef.rst
6995 ↗	(On Diff #42613)	Let's move this to the llvm-dev thread. For the record, your point 3 is based on a wrong assumption. We do support atomic loads and stores of pointers.
lib/CodeGen/AtomicExpandPass.cpp
142 ↗	(On Diff #42613)	Will do. For the record, I feel the message adds absolutely nothing here given the code context, but I don't care enough to argue the point.
202 ↗	(On Diff #42613)	I'll switch methods and add the first assert since it's slightly non-obvious. The power of two is enforced by the verifier.
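The size rules in this exchange can be checked with a few lines of arithmetic. The sketch below is not LLVM code: `isValidAtomicBitWidth` copies the shape of the verifier condition shown in the Verifier.cpp diff (`Size >= 8 && !(Size & (Size - 1))`), while `storeSizeInBits` is a hypothetical helper that assumes the store size is the bit width rounded up to whole bytes. The target-specific upper limit mentioned in LangRef is not modelled.

```cpp
// Sketch of the size rules discussed above; names are illustrative only.

// Same shape as the verifier's check: at least 8 bits and a power of two.
constexpr bool isValidAtomicBitWidth(unsigned Size) {
  return Size >= 8 && (Size & (Size - 1)) == 0;
}

// Assumption: store size is the bit width rounded up to whole bytes.
constexpr unsigned storeSizeInBits(unsigned SizeInBits) {
  return ((SizeInBits + 7) / 8) * 8;
}

static_assert(isValidAtomicBitWidth(16), "half passes the check");
static_assert(isValidAtomicBitWidth(32), "float passes the check");
static_assert(isValidAtomicBitWidth(64), "double passes the check");
static_assert(isValidAtomicBitWidth(128), "fp128 passes this check");
static_assert(!isValidAtomicBitWidth(80), "x86_fp80: 80 is not a power of two");

// Under the rounding assumption, the two sizes agree for byte-multiple widths
// and differ for widths that are not a multiple of eight.
static_assert(storeSizeInBits(32) == 32, "float: sizes agree");
static_assert(storeSizeInBits(1) == 8, "a 1-bit value occupies a full byte");
```

Under these assumptions an 80-bit type is rejected by the power-of-two test alone, which matches the statement that the verifier enforces it. The equality assert requested in the review guards the separate case where the stored size is wider than the value.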

Address JF's comments and remove the vector support for the moment.

Forgot the new LangRef changes in the last update.

bcraig added a subscriber: bcraig. Dec 14 2015, 1:25 PM

bcraig added inline comments.

test/CodeGen/X86/atomic-non-integer.ll
49 ↗	(On Diff #42760)	All of your llc tests are currently testing unordered accesses. The interesting code gen on X86 is with seq_cst stores. I recommend adding tests for those, and ensuring that you get the appropriate [lock] xchg operations.
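As a rough source-level analogy for the distinction raised here, the two stores below differ only in their memory order. This is a sketch, not something taken from the patch or its tests: `storeRelaxed` and `storeSeqCst` are hypothetical names, LLVM's `unordered` has no exact C++ counterpart (`memory_order_relaxed` is the nearest), and nothing here claims which IR a particular frontend emits for `std::atomic<float>`.

```cpp
#include <atomic>

// Weakly ordered store; on x86 this kind of store can be a plain move.
inline void storeRelaxed(std::atomic<float> &Slot, float V) {
  Slot.store(V, std::memory_order_relaxed);
}

// Sequentially consistent store; on x86 this typically needs an xchg
// or a store followed by a fence, which is why it is the interesting case.
inline void storeSeqCst(std::atomic<float> &Slot, float V) {
  Slot.store(V, std::memory_order_seq_cst);
}
```

Both functions store the same value. Only the ordering guarantee, and therefore the instruction sequence a backend has to pick, changes between them.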

Add a couple of seq_cst tests per Ben's request.

LGTM... but that doesn't mean much. Thanks for adding the extra tests.

For x86, can you add a negative test for fp80 to make sure it doesn't work? Or will that test only hit an assert? I'm assuming that it should fail validation.

Also, add fp128. For now I'm assuming we'll do lock cmpxchg16b on processors which support it, and a call to the runtime lock from compiler-rt otherwise (__atomic_load_16 and __atomic_store_16 from lib/builtins/atomic.c which I'm not sure work?).

lib/CodeGen/AtomicExpandPass.cpp
207 ↗	(On Diff #42788)	"integral"
284 ↗	(On Diff #42788)	"integral"
test/CodeGen/X86/atomic-non-integer.ll
1 ↗	(On Diff #42788)	Add a reference to: Also add a reference to: https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
6 ↗	(On Diff #42788)	Why does this involve an f2h conversion? Is it a calling convention thing where half is passed as its integer equivalent? It would be worth documenting, I find it surprising.
32 ↗	(On Diff #42788)	Same.
54 ↗	(On Diff #42788)	For each of the `xchg` below, can you also `CHECK-NOT: lock` since the x86 manual states that the `lock` prefix is implicit.
test/Transforms/AtomicExpand/X86/expand-atomic-non-integer.ll
5 ↗	(On Diff #42788)	Also test `volatile` and `addressspace` combinations.

In D15471#311041, @jfb wrote:

> For x86, can you add a negative test for fp80 to make sure it doesn't work? Or will that test only hit an assert? I'm assuming that it should fail validation.

This will be caught by the verifier. Will add a test.

> Also, add fp128. For now I'm assuming we'll do lock cmpxchg16b on processors which support it, and a call to the runtime lock from compiler-rt otherwise (__atomic_load_16 and __atomic_store_16 from lib/builtins/atomic.c which I'm not sure work?).

I haven't checked, but the details of the lowering (including correctness!) are beyond the scope of this change. The lowering will be whatever you would have gotten with the previous bitcast idiom. *All* this change is doing is removing the need for that idiom. If we need to revise lowering, that should and will be a separate change. I will add a test to check that we lower to *something*, but that's all that should be part of this change.

I'll make the testing changes requested. With the assumption those will be done before submission, can I get an official LGTM?

I'm noticing a trend for this patch to keep snowballing to include more and more lowering questions; I specifically want to cut that off. This change *shouldn't* be changing lowering in any way from what frontends get today. That's the entire point of the patch.

lib/CodeGen/AtomicExpandPass.cpp
284 ↗	(On Diff #42788)	Will fix.
test/CodeGen/X86/atomic-non-integer.ll
1 ↗	(On Diff #42788)	This is, or should be, documented in the code. Since I didn't write the lowering code for this and don't know what reasoning led to this particular emission, I'd rather not falsely cite something due to the later confusion it might cause.
6 ↗	(On Diff #42788)	Frankly, I have no clue. We emit the same conversion when doing a non-atomic store, so it's not related to the changes in this patch.
54 ↗	(On Diff #42788)	This is an implementation detail that is irrelevant to this functionality. This is a) tested elsewhere, and b) irrelevant to the correctness of the code generation here.

In D15471#311072, @reames wrote:

> In D15471#311041, @jfb wrote:
>
> > For x86, can you add a negative test for fp80 to make sure it doesn't work? Or will that test only hit an assert? I'm assuming that it should fail validation.
>
> This will be caught by the verifier. Will add a test.

OK ty.

> > Also, add fp128. For now I'm assuming we'll do lock cmpxchg16b on processors which support it, and a call to the runtime lock from compiler-rt otherwise (__atomic_load_16 and __atomic_store_16 from lib/builtins/atomic.c which I'm not sure work?).
>
> I haven't checked, but the details of the lowering (including correctness!) are beyond the scope of this change. The lowering will be whatever you would have gotten with the previous bitcast idiom. *All* this change is doing is removing the need for that idiom. If we need to revise lowering, that should and will be a separate change. I will add a test to check that we lower to *something*, but that's all that should be part of this change.

Oh yeah I just want to know that we generate something. If it's wrong the fix should definitely be separate.

> I'll make the testing changes requested. With the assumption those will be done before submission, can I get an official LGTM?

Yes.

> I'm noticing a trend for this patch to keep snowballing to include more and more lowering questions; I specifically want to cut that off. This change *shouldn't* be changing lowering in any way from what frontends get today. That's the entire point of the patch.

I want to improve our test coverage and understand what the tests are doing. I agree that lowering improvements are separate.

test/CodeGen/X86/atomic-non-integer.ll
1 ↗	(On Diff #42788)	Could you document this at the top of the test? It's not clear from the name of the test that you're not checking correctness of the generated code.
6 ↗	(On Diff #42788)	I'd rather know if the test is correct, and fix elsewhere if not: otherwise it's hard to reason about this test when fixing bugs. I can't find anything about `half` or `fp16` in the calling convention, but the signature (`short __gnu_f2h_ieee(float)`) leads me to believe that the ABI passes things as `float` and then converts to `short` as a proxy for `half`. I think a cleaner test wouldn't pass a `half` in registers but would instead pass two pointers. That makes the test way easier to understand IMO.
54 ↗	(On Diff #42788)	Oh you're right, `test/CodeGen/X86/atomic_mi.ll` tests this and `git blame` shows a familiar name on that!

add various requested tests

address JF's comments with regards to halfs.

lgtm after the two "integral" typo fixes.

test/Verifier/atomics.ll
4 ↗	(On Diff #42920)	Ha, that's not the best error message since `x86_mmx` is a floating-point type. I'll rework it in D15512 if that's OK with you.

This revision is now accepted and ready to land. Dec 15 2015, 4:15 PM

reames added inline comments. Dec 15 2015, 4:48 PM

test/CodeGen/X86/atomic-non-integer.ll
2 ↗	(On Diff #42920)	I added a note to the top of the file. Let me know if you want something more.
7 ↗	(On Diff #42920)	This comment doesn't really parse for me. If you want to suggest a change here, please raise it on the mailing lists so that someone more knowledgeable than I can comment.
test/Verifier/atomics.ll
4 ↗	(On Diff #42920)	Not according to LLVM's isFloatingPointTy it's not. Surprised me too. Fixing that might end up being a much larger change though.

Closed by commit rL255737: [IR] Add support for floating pointer atomic loads and stores (authored by reames). Dec 15 2015, 4:52 PM

This revision was automatically updated to reflect the committed changes.

jfb mentioned this in D85900: [mlir] do not use llvm.cmpxchg with floats. Aug 13 2020, 3:00 PM

danilaml mentioned this in D60394: [X86] Add patterns for using movss/movsd for atomic load/store of f32/64. Remove atomic fadd pseudos use isel patterns instead.. Dec 30 2021, 4:55 AM

Revision Contents

Path	Size
llvm/trunk/docs/LangRef.rst	25 lines
llvm/trunk/lib/CodeGen/AtomicExpandPass.cpp	96 lines
llvm/trunk/lib/IR/Verifier.cpp	8 lines
llvm/trunk/test/CodeGen/X86/atomic-non-integer.ll	108 lines
llvm/trunk/test/Transforms/AtomicExpand/X86/expand-atomic-non-integer.ll	82 lines
llvm/trunk/test/Verifier/atomics.ll	14 lines

Diff 42941

llvm/trunk/docs/LangRef.rst

	Show First 20 Lines • Show All 6,838 Lines • ▼ Show 20 Lines
	execution of this ``load`` with other :ref:`volatile			execution of this ``load`` with other :ref:`volatile
	operations <volatile>`.			operations <volatile>`.

	If the ``load`` is marked as ``atomic``, it takes an extra			If the ``load`` is marked as ``atomic``, it takes an extra
	:ref:`ordering <ordering>` and optional ``singlethread`` argument. The			:ref:`ordering <ordering>` and optional ``singlethread`` argument. The
	``release`` and ``acq_rel`` orderings are not valid on ``load``			``release`` and ``acq_rel`` orderings are not valid on ``load``
	instructions. Atomic loads produce :ref:`defined <memmodel>` results			instructions. Atomic loads produce :ref:`defined <memmodel>` results
	when they may see multiple atomic stores. The type of the pointee must			when they may see multiple atomic stores. The type of the pointee must
	be an integer type whose bit width is a power of two greater than or			be an integer or floating point type whose bit width is a power of two,
	equal to eight and less than or equal to a target-specific size limit.			greater than or equal to eight, and less than or equal to a
	``align`` must be explicitly specified on atomic loads, and the load has			target-specific size limit. ``align`` must be explicitly specified on
	undefined behavior if the alignment is not set to a value which is at			atomic loads, and the load has undefined behavior if the alignment is
	least the size in bytes of the pointee. ``!nontemporal`` does not have			not set to a value which is at least the size in bytes of the pointee.
	any defined semantics for atomic loads.			``!nontemporal`` does not have any defined semantics for atomic loads.

	The optional constant ``align`` argument specifies the alignment of the			The optional constant ``align`` argument specifies the alignment of the
	operation (that is, the alignment of the memory address). A value of 0			operation (that is, the alignment of the memory address). A value of 0
	or an omitted ``align`` argument means that the operation has the ABI			or an omitted ``align`` argument means that the operation has the ABI
	alignment for the target. It is the responsibility of the code emitter			alignment for the target. It is the responsibility of the code emitter
	to ensure that the alignment information is correct. Overestimating the			to ensure that the alignment information is correct. Overestimating the
	alignment results in undefined behavior. Underestimating the alignment			alignment results in undefined behavior. Underestimating the alignment
	may produce less efficient code. An alignment of 1 is always safe. The			may produce less efficient code. An alignment of 1 is always safe. The
	▲ Show 20 Lines • Show All 103 Lines • ▼ Show 20 Lines
	execution of this ``store`` with other :ref:`volatile			execution of this ``store`` with other :ref:`volatile
	operations <volatile>`.			operations <volatile>`.

	If the ``store`` is marked as ``atomic``, it takes an extra			If the ``store`` is marked as ``atomic``, it takes an extra
	:ref:`ordering <ordering>` and optional ``singlethread`` argument. The			:ref:`ordering <ordering>` and optional ``singlethread`` argument. The
	``acquire`` and ``acq_rel`` orderings aren't valid on ``store``			``acquire`` and ``acq_rel`` orderings aren't valid on ``store``
	instructions. Atomic loads produce :ref:`defined <memmodel>` results			instructions. Atomic loads produce :ref:`defined <memmodel>` results
	when they may see multiple atomic stores. The type of the pointee must			when they may see multiple atomic stores. The type of the pointee must
	be an integer type whose bit width is a power of two greater than or			be an integer or floating point type whose bit width is a power of two,
	equal to eight and less than or equal to a target-specific size limit.			greater than or equal to eight, and less than or equal to a
	``align`` must be explicitly specified on atomic stores, and the store			target-specific size limit. ``align`` must be explicitly specified
	has undefined behavior if the alignment is not set to a value which is			on atomic stores, and the store has undefined behavior if the alignment
	at least the size in bytes of the pointee. ``!nontemporal`` does not			is not set to a value which is at least the size in bytes of the
	have any defined semantics for atomic stores.			pointee. ``!nontemporal`` does not have any defined semantics for
				atomic stores.

	The optional constant ``align`` argument specifies the alignment of the			The optional constant ``align`` argument specifies the alignment of the
	operation (that is, the alignment of the memory address). A value of 0			operation (that is, the alignment of the memory address). A value of 0
	or an omitted ``align`` argument means that the operation has the ABI			or an omitted ``align`` argument means that the operation has the ABI
	alignment for the target. It is the responsibility of the code emitter			alignment for the target. It is the responsibility of the code emitter
	to ensure that the alignment information is correct. Overestimating the			to ensure that the alignment information is correct. Overestimating the
	alignment results in undefined behavior. Underestimating the			alignment results in undefined behavior. Underestimating the
	alignment may produce less efficient code. An alignment of 1 is always			alignment may produce less efficient code. An alignment of 1 is always
	▲ Show 20 Lines • Show All 5,084 Lines • Show Last 20 Lines

llvm/trunk/lib/CodeGen/AtomicExpandPass.cpp

//===-- AtomicExpandPass.cpp - Expand atomic instructions -------===//		//===-- AtomicExpandPass.cpp - Expand atomic instructions -------===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This file contains a pass (at IR level) to replace atomic instructions with		// This file contains a pass (at IR level) to replace atomic instructions with
// either (intrinsic-based) load-linked/store-conditional loops or		// target specific instruction which implement the same semantics in a way
// AtomicCmpXchg.		// which better fits the target backend. This can include the use of either
		// (intrinsic-based) load-linked/store-conditional loops, AtomicCmpXchg, or
		// type coercions.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/CodeGen/AtomicExpandUtils.h"		#include "llvm/CodeGen/AtomicExpandUtils.h"
#include "llvm/CodeGen/Passes.h"		#include "llvm/CodeGen/Passes.h"
#include "llvm/IR/Function.h"		#include "llvm/IR/Function.h"
#include "llvm/IR/IRBuilder.h"		#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/InstIterator.h"		#include "llvm/IR/InstIterator.h"
Show All 20 Lines	explicit AtomicExpand(const TargetMachine *TM = nullptr)
initializeAtomicExpandPass(*PassRegistry::getPassRegistry());		initializeAtomicExpandPass(*PassRegistry::getPassRegistry());
}		}

bool runOnFunction(Function &F) override;		bool runOnFunction(Function &F) override;

private:		private:
bool bracketInstWithFences(Instruction *I, AtomicOrdering Order,		bool bracketInstWithFences(Instruction *I, AtomicOrdering Order,
bool IsStore, bool IsLoad);		bool IsStore, bool IsLoad);
		IntegerType *getCorrespondingIntegerType(Type *T, const DataLayout &DL);
		LoadInst *convertAtomicLoadToIntegerType(LoadInst *LI);
bool tryExpandAtomicLoad(LoadInst *LI);		bool tryExpandAtomicLoad(LoadInst *LI);
bool expandAtomicLoadToLL(LoadInst *LI);		bool expandAtomicLoadToLL(LoadInst *LI);
bool expandAtomicLoadToCmpXchg(LoadInst *LI);		bool expandAtomicLoadToCmpXchg(LoadInst *LI);
		StoreInst *convertAtomicStoreToIntegerType(StoreInst *SI);
bool expandAtomicStore(StoreInst *SI);		bool expandAtomicStore(StoreInst *SI);
bool tryExpandAtomicRMW(AtomicRMWInst *AI);		bool tryExpandAtomicRMW(AtomicRMWInst *AI);
bool expandAtomicOpToLLSC(		bool expandAtomicOpToLLSC(
Instruction *I, Value *Addr, AtomicOrdering MemOpOrder,		Instruction *I, Value *Addr, AtomicOrdering MemOpOrder,
std::function<Value *(IRBuilder<> &, Value *)> PerformOp);		std::function<Value *(IRBuilder<> &, Value *)> PerformOp);
bool expandAtomicCmpXchg(AtomicCmpXchgInst *CI);		bool expandAtomicCmpXchg(AtomicCmpXchgInst *CI);
bool isIdempotentRMW(AtomicRMWInst *AI);		bool isIdempotentRMW(AtomicRMWInst *AI);
bool simplifyIdempotentRMW(AtomicRMWInst *AI);		bool simplifyIdempotentRMW(AtomicRMWInst *AI);
▲ Show 20 Lines • Show All 65 Lines • ▼ Show 20 Lines	if (TLI->getInsertFencesForAtomic()) {
}		}

if (FenceOrdering != Monotonic) {		if (FenceOrdering != Monotonic) {
MadeChange |= bracketInstWithFences(I, FenceOrdering, IsStore, IsLoad);		MadeChange |= bracketInstWithFences(I, FenceOrdering, IsStore, IsLoad);
}		}
}		}

if (LI) {		if (LI) {
		if (LI->getType()->isFloatingPointTy()) {
		// TODO: add a TLI hook to control this so that each target can
		// convert to lowering the original type one at a time.
		LI = convertAtomicLoadToIntegerType(LI);
		assert(LI->getType()->isIntegerTy() && "invariant broken");
		MadeChange = true;
		}

MadeChange |= tryExpandAtomicLoad(LI);		MadeChange |= tryExpandAtomicLoad(LI);
} else if (SI && TLI->shouldExpandAtomicStoreInIR(SI)) {		} else if (SI) {
		if (SI->getValueOperand()->getType()->isFloatingPointTy()) {
		// TODO: add a TLI hook to control this so that each target can
		// convert to lowering the original type one at a time.
		SI = convertAtomicStoreToIntegerType(SI);
		assert(SI->getValueOperand()->getType()->isIntegerTy() &&
		"invariant broken");
		MadeChange = true;
		}

		if (TLI->shouldExpandAtomicStoreInIR(SI))
MadeChange |= expandAtomicStore(SI);		MadeChange |= expandAtomicStore(SI);
} else if (RMWI) {		} else if (RMWI) {
// There are two different ways of expanding RMW instructions:		// There are two different ways of expanding RMW instructions:
// - into a load if it is idempotent		// - into a load if it is idempotent
// - into a Cmpxchg/LL-SC loop otherwise		// - into a Cmpxchg/LL-SC loop otherwise
// we try them in that order.		// we try them in that order.

if (isIdempotentRMW(RMWI) && simplifyIdempotentRMW(RMWI)) {		if (isIdempotentRMW(RMWI) && simplifyIdempotentRMW(RMWI)) {
MadeChange = true;		MadeChange = true;
Show All 23 Lines	bool AtomicExpand::bracketInstWithFences(Instruction *I, AtomicOrdering Order,
if (TrailingFence) {		if (TrailingFence) {
TrailingFence->removeFromParent();		TrailingFence->removeFromParent();
TrailingFence->insertAfter(I);		TrailingFence->insertAfter(I);
}		}

return (LeadingFence || TrailingFence);		return (LeadingFence || TrailingFence);
}		}

		/// Get the iX type with the same bitwidth as T.
		IntegerType *AtomicExpand::getCorrespondingIntegerType(Type *T,
		const DataLayout &DL) {
		EVT VT = TLI->getValueType(DL, T);
		unsigned BitWidth = VT.getStoreSizeInBits();
		assert(BitWidth == VT.getSizeInBits() && "must be a power of two");
		return IntegerType::get(T->getContext(), BitWidth);
		}

		/// Convert an atomic load of a non-integral type to an integer load of the
		/// equivelent bitwidth. See the function comment on
		/// convertAtomicStoreToIntegerType for background.
		LoadInst *AtomicExpand::convertAtomicLoadToIntegerType(LoadInst *LI) {
		auto *M = LI->getModule();
		Type *NewTy = getCorrespondingIntegerType(LI->getType(),
		M->getDataLayout());

		IRBuilder<> Builder(LI);

		Value *Addr = LI->getPointerOperand();
		Type *PT = PointerType::get(NewTy,
		Addr->getType()->getPointerAddressSpace());
		Value *NewAddr = Builder.CreateBitCast(Addr, PT);

		auto *NewLI = Builder.CreateLoad(NewAddr);
		NewLI->setAlignment(LI->getAlignment());
		NewLI->setVolatile(LI->isVolatile());
		NewLI->setAtomic(LI->getOrdering(), LI->getSynchScope());
		DEBUG(dbgs() << "Replaced " << *LI << " with " << *NewLI << "\n");

		Value *NewVal = Builder.CreateBitCast(NewLI, LI->getType());
		LI->replaceAllUsesWith(NewVal);
		LI->eraseFromParent();
		return NewLI;
		}

bool AtomicExpand::tryExpandAtomicLoad(LoadInst *LI) {		bool AtomicExpand::tryExpandAtomicLoad(LoadInst *LI) {
switch (TLI->shouldExpandAtomicLoadInIR(LI)) {		switch (TLI->shouldExpandAtomicLoadInIR(LI)) {
case TargetLoweringBase::AtomicExpansionKind::None:		case TargetLoweringBase::AtomicExpansionKind::None:
return false;		return false;
case TargetLoweringBase::AtomicExpansionKind::LLSC:		case TargetLoweringBase::AtomicExpansionKind::LLSC:
return expandAtomicOpToLLSC(		return expandAtomicOpToLLSC(
LI, LI->getPointerOperand(), LI->getOrdering(),		LI, LI->getPointerOperand(), LI->getOrdering(),
[](IRBuilder<> &Builder, Value *Loaded) { return Loaded; });		[](IRBuilder<> &Builder, Value *Loaded) { return Loaded; });
Show All 34 Lines	bool AtomicExpand::expandAtomicLoadToCmpXchg(LoadInst *LI) {
Value *Loaded = Builder.CreateExtractValue(Pair, 0, "loaded");		Value *Loaded = Builder.CreateExtractValue(Pair, 0, "loaded");

LI->replaceAllUsesWith(Loaded);		LI->replaceAllUsesWith(Loaded);
LI->eraseFromParent();		LI->eraseFromParent();

return true;		return true;
}		}

		/// Convert an atomic store of a non-integral type to an integer store of the
		/// equivelent bitwidth. We used to not support floating point or vector
		/// atomics in the IR at all. The backends learned to deal with the bitcast
		/// idiom because that was the only way of expressing the notion of a atomic
		/// float or vector store. The long term plan is to teach each backend to
		/// instruction select from the original atomic store, but as a migration
		/// mechanism, we convert back to the old format which the backends understand.
		/// Each backend will need individual work to recognize the new format.
		StoreInst *AtomicExpand::convertAtomicStoreToIntegerType(StoreInst *SI) {
		IRBuilder<> Builder(SI);
		auto *M = SI->getModule();
		Type *NewTy = getCorrespondingIntegerType(SI->getValueOperand()->getType(),
		M->getDataLayout());
		Value *NewVal = Builder.CreateBitCast(SI->getValueOperand(), NewTy);

		Value *Addr = SI->getPointerOperand();
		Type *PT = PointerType::get(NewTy,
		Addr->getType()->getPointerAddressSpace());
		Value *NewAddr = Builder.CreateBitCast(Addr, PT);

		StoreInst *NewSI = Builder.CreateStore(NewVal, NewAddr);
		NewSI->setAlignment(SI->getAlignment());
		NewSI->setVolatile(SI->isVolatile());
		NewSI->setAtomic(SI->getOrdering(), SI->getSynchScope());
		DEBUG(dbgs() << "Replaced " << *SI << " with " << *NewSI << "\n");
		SI->eraseFromParent();
		return NewSI;
		}

bool AtomicExpand::expandAtomicStore(StoreInst *SI) {		bool AtomicExpand::expandAtomicStore(StoreInst *SI) {
// This function is only called on atomic stores that are too large to be		// This function is only called on atomic stores that are too large to be
// atomic if implemented as a native store. So we replace them by an		// atomic if implemented as a native store. So we replace them by an
// atomic swap, that can be implemented for example as a ldrex/strex on ARM		// atomic swap, that can be implemented for example as a ldrex/strex on ARM
// or lock cmpxchg8/16b on X86, as these are atomic for larger sizes.		// or lock cmpxchg8/16b on X86, as these are atomic for larger sizes.
// It is the responsibility of the target to only signal expansion via		// It is the responsibility of the target to only signal expansion via
// shouldExpandAtomicRMW in cases where this is required and possible.		// shouldExpandAtomicRMW in cases where this is required and possible.
IRBuilder<> Builder(SI);		IRBuilder<> Builder(SI);
▲ Show 20 Lines • Show All 369 Lines • Show Last 20 Lines

llvm/trunk/lib/IR/Verifier.cpp

Show First 20 Lines • Show All 2,726 Lines • ▼ Show 20 Lines	void Verifier::visitLoadInst(LoadInst &LI) {
Assert(LI.getAlignment() <= Value::MaximumAlignment,		Assert(LI.getAlignment() <= Value::MaximumAlignment,
"huge alignment values are unsupported", &LI);		"huge alignment values are unsupported", &LI);
if (LI.isAtomic()) {		if (LI.isAtomic()) {
Assert(LI.getOrdering() != Release && LI.getOrdering() != AcquireRelease,		Assert(LI.getOrdering() != Release && LI.getOrdering() != AcquireRelease,
"Load cannot have Release ordering", &LI);		"Load cannot have Release ordering", &LI);
Assert(LI.getAlignment() != 0,		Assert(LI.getAlignment() != 0,
"Atomic load must specify explicit alignment", &LI);		"Atomic load must specify explicit alignment", &LI);
if (!ElTy->isPointerTy()) {		if (!ElTy->isPointerTy()) {
Assert(ElTy->isIntegerTy(), "atomic load operand must have integer type!",		Assert(ElTy->isIntegerTy() || ElTy->isFloatingPointTy(),
		"atomic load operand must have integer or floating point type!",
&LI, ElTy);		&LI, ElTy);
unsigned Size = ElTy->getPrimitiveSizeInBits();		unsigned Size = ElTy->getPrimitiveSizeInBits();
Assert(Size >= 8 && !(Size & (Size - 1)),		Assert(Size >= 8 && !(Size & (Size - 1)),
"atomic load operand must be power-of-two byte-sized integer", &LI,		"atomic load operand must be power-of-two byte-sized integer", &LI,
ElTy);		ElTy);
}		}
} else {		} else {
Assert(LI.getSynchScope() == CrossThread,		Assert(LI.getSynchScope() == CrossThread,
Show All 12 Lines	void Verifier::visitStoreInst(StoreInst &SI) {
Assert(SI.getAlignment() <= Value::MaximumAlignment,		Assert(SI.getAlignment() <= Value::MaximumAlignment,
"huge alignment values are unsupported", &SI);		"huge alignment values are unsupported", &SI);
if (SI.isAtomic()) {		if (SI.isAtomic()) {
Assert(SI.getOrdering() != Acquire && SI.getOrdering() != AcquireRelease,		Assert(SI.getOrdering() != Acquire && SI.getOrdering() != AcquireRelease,
"Store cannot have Acquire ordering", &SI);		"Store cannot have Acquire ordering", &SI);
Assert(SI.getAlignment() != 0,		Assert(SI.getAlignment() != 0,
"Atomic store must specify explicit alignment", &SI);		"Atomic store must specify explicit alignment", &SI);
if (!ElTy->isPointerTy()) {		if (!ElTy->isPointerTy()) {
Assert(ElTy->isIntegerTy(),		Assert(ElTy->isIntegerTy() \|\| ElTy->isFloatingPointTy(),
"atomic store operand must have integer type!", &SI, ElTy);		"atomic store operand must have integer or floating point type!",
		&SI, ElTy);
unsigned Size = ElTy->getPrimitiveSizeInBits();		unsigned Size = ElTy->getPrimitiveSizeInBits();
Assert(Size >= 8 && !(Size & (Size - 1)),		Assert(Size >= 8 && !(Size & (Size - 1)),
"atomic store operand must be power-of-two byte-sized integer",		"atomic store operand must be power-of-two byte-sized integer",
&SI, ElTy);		&SI, ElTy);
}		}
} else {		} else {
Assert(SI.getSynchScope() == CrossThread,		Assert(SI.getSynchScope() == CrossThread,
"Non-atomic store cannot have SynchronizationScope specified", &SI);		"Non-atomic store cannot have SynchronizationScope specified", &SI);
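
The size condition in the Verifier.cpp hunk above is independent of the new type check: a type that passes `isFloatingPointTy()` can still be rejected if its bit width is not a power of two of at least 8. The following is a minimal standalone sketch of that predicate, not code from the patch. The names `isValidAtomicSizeInBits` and `demoAtomicSizes` are hypothetical and exist only for illustration.

```cpp
#include <cassert>

// Hypothetical helper mirroring the verifier's size condition:
//   Size >= 8 && !(Size & (Size - 1))
// i.e. at least one byte wide and exactly one bit set (a power of two).
constexpr bool isValidAtomicSizeInBits(unsigned Size) {
  return Size >= 8 && !(Size & (Size - 1));
}

// Usage sketch: the bit widths below are those of the named IR types.
inline void demoAtomicSizes() {
  assert(isValidAtomicSizeInBits(16));   // half
  assert(isValidAtomicSizeInBits(32));   // float, i32
  assert(isValidAtomicSizeInBits(64));   // double, i64
  assert(isValidAtomicSizeInBits(128));  // fp128
  assert(!isValidAtomicSizeInBits(80));  // x86_fp80: not a power of two
  assert(!isValidAtomicSizeInBits(1));   // i1: smaller than a byte
}
```

Under this check, `half`, `float`, `double` and `fp128` are accepted, which matches the types exercised by the tests in this revision, while an 80-bit type would still be diagnosed.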

llvm/trunk/test/CodeGen/X86/atomic-non-integer.ll

				; RUN: llc < %s -mtriple=x86_64-linux-generic -verify-machineinstrs -mattr=sse2 | FileCheck %s

				; Note: This test is testing that the lowering for atomics matches what we
				; currently emit for non-atomics + the atomic restriction. The presence of
				; particular lowering detail in these tests should not be read as requiring
				; that detail for correctness unless it's related to the atomicity itself.
				; (Specifically, there were reviewer questions about the lowering for halfs
				; and their calling convention which remain unresolved.)

				define void @store_half(half* %fptr, half %v) {
				; CHECK-LABEL: @store_half
				; CHECK: movq %rdi, %rbx
				; CHECK: callq __gnu_f2h_ieee
				; CHECK: movw %ax, (%rbx)
				store atomic half %v, half* %fptr unordered, align 2
				ret void
				}

				define void @store_float(float* %fptr, float %v) {
				; CHECK-LABEL: @store_float
				; CHECK: movd %xmm0, %eax
				; CHECK: movl %eax, (%rdi)
				store atomic float %v, float* %fptr unordered, align 4
				ret void
				}

				define void @store_double(double* %fptr, double %v) {
				; CHECK-LABEL: @store_double
				; CHECK: movd %xmm0, %rax
				; CHECK: movq %rax, (%rdi)
				store atomic double %v, double* %fptr unordered, align 8
				ret void
				}

				define void @store_fp128(fp128* %fptr, fp128 %v) {
				; CHECK-LABEL: @store_fp128
				; CHECK: callq __sync_lock_test_and_set_16
				store atomic fp128 %v, fp128* %fptr unordered, align 16
				ret void
				}

				define half @load_half(half* %fptr) {
				; CHECK-LABEL: @load_half
				; CHECK: movw (%rdi), %ax
				; CHECK: movzwl %ax, %edi
				; CHECK: jmp __gnu_h2f_ieee
				%v = load atomic half, half* %fptr unordered, align 2
				ret half %v
				}

				define float @load_float(float* %fptr) {
				; CHECK-LABEL: @load_float
				; CHECK: movl (%rdi), %eax
				; CHECK: movd %eax, %xmm0
				%v = load atomic float, float* %fptr unordered, align 4
				ret float %v
				}

				define double @load_double(double* %fptr) {
				; CHECK-LABEL: @load_double
				; CHECK: movq (%rdi), %rax
				; CHECK: movd %rax, %xmm0
				%v = load atomic double, double* %fptr unordered, align 8
				ret double %v
				}

				define fp128 @load_fp128(fp128* %fptr) {
				; CHECK-LABEL: @load_fp128
				; CHECK: callq __sync_val_compare_and_swap_16
				%v = load atomic fp128, fp128* %fptr unordered, align 16
				ret fp128 %v
				}


				; sanity check the seq_cst lowering since that's the
				; interesting one from an ordering perspective on x86.

				define void @store_float_seq_cst(float* %fptr, float %v) {
				; CHECK-LABEL: @store_float_seq_cst
				; CHECK: movd %xmm0, %eax
				; CHECK: xchgl %eax, (%rdi)
				store atomic float %v, float* %fptr seq_cst, align 4
				ret void
				}

				define void @store_double_seq_cst(double* %fptr, double %v) {
				; CHECK-LABEL: @store_double_seq_cst
				; CHECK: movd %xmm0, %rax
				; CHECK: xchgq %rax, (%rdi)
				store atomic double %v, double* %fptr seq_cst, align 8
				ret void
				}

				define float @load_float_seq_cst(float* %fptr) {
				; CHECK-LABEL: @load_float_seq_cst
				; CHECK: movl (%rdi), %eax
				; CHECK: movd %eax, %xmm0
				%v = load atomic float, float* %fptr seq_cst, align 4
				ret float %v
				}

				define double @load_double_seq_cst(double* %fptr) {
				; CHECK-LABEL: @load_double_seq_cst
				; CHECK: movq (%rdi), %rax
				; CHECK: movd %rax, %xmm0
				%v = load atomic double, double* %fptr seq_cst, align 8
				ret double %v
				}

llvm/trunk/test/Transforms/AtomicExpand/X86/expand-atomic-non-integer.ll

				; RUN: opt -S %s -atomic-expand -mtriple=x86_64-linux-gnu | FileCheck %s

				; This file tests the functions `llvm::convertAtomicLoadToIntegerType` and
				; `llvm::convertAtomicStoreToIntegerType`. If X86 stops using this
				; functionality, please move this test to a target which still is.

				define float @float_load_expand(float* %ptr) {
				; CHECK-LABEL: @float_load_expand
				; CHECK: %1 = bitcast float* %ptr to i32*
				; CHECK: %2 = load atomic i32, i32* %1 unordered, align 4
				; CHECK: %3 = bitcast i32 %2 to float
				; CHECK: ret float %3
				%res = load atomic float, float* %ptr unordered, align 4
				ret float %res
				}

				define float @float_load_expand_seq_cst(float* %ptr) {
				; CHECK-LABEL: @float_load_expand_seq_cst
				; CHECK: %1 = bitcast float* %ptr to i32*
				; CHECK: %2 = load atomic i32, i32* %1 seq_cst, align 4
				; CHECK: %3 = bitcast i32 %2 to float
				; CHECK: ret float %3
				%res = load atomic float, float* %ptr seq_cst, align 4
				ret float %res
				}

				define float @float_load_expand_vol(float* %ptr) {
				; CHECK-LABEL: @float_load_expand_vol
				; CHECK: %1 = bitcast float* %ptr to i32*
				; CHECK: %2 = load atomic volatile i32, i32* %1 unordered, align 4
				; CHECK: %3 = bitcast i32 %2 to float
				; CHECK: ret float %3
				%res = load atomic volatile float, float* %ptr unordered, align 4
				ret float %res
				}

				define float @float_load_expand_addr1(float addrspace(1)* %ptr) {
				; CHECK-LABEL: @float_load_expand_addr1
				; CHECK: %1 = bitcast float addrspace(1)* %ptr to i32 addrspace(1)*
				; CHECK: %2 = load atomic i32, i32 addrspace(1)* %1 unordered, align 4
				; CHECK: %3 = bitcast i32 %2 to float
				; CHECK: ret float %3
				%res = load atomic float, float addrspace(1)* %ptr unordered, align 4
				ret float %res
				}

				define void @float_store_expand(float* %ptr, float %v) {
				; CHECK-LABEL: @float_store_expand
				; CHECK: %1 = bitcast float %v to i32
				; CHECK: %2 = bitcast float* %ptr to i32*
				; CHECK: store atomic i32 %1, i32* %2 unordered, align 4
				store atomic float %v, float* %ptr unordered, align 4
				ret void
				}

				define void @float_store_expand_seq_cst(float* %ptr, float %v) {
				; CHECK-LABEL: @float_store_expand_seq_cst
				; CHECK: %1 = bitcast float %v to i32
				; CHECK: %2 = bitcast float* %ptr to i32*
				; CHECK: store atomic i32 %1, i32* %2 seq_cst, align 4
				store atomic float %v, float* %ptr seq_cst, align 4
				ret void
				}

				define void @float_store_expand_vol(float* %ptr, float %v) {
				; CHECK-LABEL: @float_store_expand_vol
				; CHECK: %1 = bitcast float %v to i32
				; CHECK: %2 = bitcast float* %ptr to i32*
				; CHECK: store atomic volatile i32 %1, i32* %2 unordered, align 4
				store atomic volatile float %v, float* %ptr unordered, align 4
				ret void
				}

				define void @float_store_expand_addr1(float addrspace(1)* %ptr, float %v) {
				; CHECK-LABEL: @float_store_expand_addr1
				; CHECK: %1 = bitcast float %v to i32
				; CHECK: %2 = bitcast float addrspace(1)* %ptr to i32 addrspace(1)*
				; CHECK: store atomic i32 %1, i32 addrspace(1)* %2 unordered, align 4
				store atomic float %v, float addrspace(1)* %ptr unordered, align 4
				ret void
				}
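
The CHECK lines above all follow one pattern: the float access becomes an atomic access to an integer of the same width, with a bitcast on either side. As a rough source-level analogue, the sketch below does the same thing in C++ with `std::atomic<std::uint32_t>` and `std::memcpy`. It is an illustration under stated assumptions, not the AtomicExpand implementation: the helper names are hypothetical, and `std::memory_order_relaxed` is used as the closest C++ ordering because C++ has no equivalent of LLVM's `unordered`.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Hypothetical helper: analogue of
//   load atomic i32 ... ; bitcast i32 to float
inline float loadFloatViaInt(
    const std::atomic<std::uint32_t> &Slot,
    std::memory_order Order = std::memory_order_relaxed) {
  std::uint32_t Bits = Slot.load(Order);       // the atomic integer load
  float Result;
  std::memcpy(&Result, &Bits, sizeof Result);  // the value bitcast
  return Result;
}

// Hypothetical helper: analogue of
//   bitcast float to i32 ; store atomic i32 ...
inline void storeFloatViaInt(
    std::atomic<std::uint32_t> &Slot, float V,
    std::memory_order Order = std::memory_order_relaxed) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &V, sizeof Bits);  // the value bitcast
  Slot.store(Bits, Order);              // the atomic integer store
}

// Usage sketch: a value survives the round trip through the integer slot.
inline void demoRoundTrip() {
  std::atomic<std::uint32_t> Slot{0};
  storeFloatViaInt(Slot, 1.5f);
  assert(loadFloatViaInt(Slot) == 1.5f);
}
```

The ordering is passed through unchanged, just as the expanded IR keeps the `unordered` or `seq_cst` ordering and the `volatile` flag of the original instruction.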

llvm/trunk/test/Verifier/atomics.ll

				; RUN: not opt -verify < %s 2>&1 | FileCheck %s

				; CHECK: atomic store operand must have integer or floating point type!
				; CHECK: atomic load operand must have integer or floating point type!

				define void @foo(x86_mmx* %P, x86_mmx %v) {
				store atomic x86_mmx %v, x86_mmx* %P unordered, align 8
				ret void
				}

				define x86_mmx @bar(x86_mmx* %P) {
				%v = load atomic x86_mmx, x86_mmx* %P unordered, align 8
				ret x86_mmx %v
				}