This is an archive of the discontinued LLVM Phabricator instance.

Differential D97224

Use Address for CGBuilder's CreateAtomicRMW and CreateAtomicCmpXchg.
Accepted · Public

Authored by jyknight on Feb 22 2021, 1:29 PM.

Details

Reviewers

gchatelet
jfb
rjmccall

Summary

Following the LLVM change to add an alignment argument to the
IRBuilder calls, switch Clang's CGBuilder variants to take an Address
type. Then, update all callers to pass through the Address.

There is one interesting exception: Microsoft's
InterlockedCompareExchange128 family of functions are documented to
require (and assume) 16-byte alignment, despite the argument type
being only long long*.
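
As a rough illustration of the shape of this change, here is a standalone sketch that does not use Clang's real classes. `SketchAddress`, `SketchAtomicOp`, `createAtomicRMWFromPointer`, and `createAtomicRMWFromAddress` are hypothetical names invented for this example. It only models the idea that a builder given a pointer-plus-alignment value can forward the statically known alignment, while a builder given a raw pointer has to fall back on the natural alignment of the value type.

```cpp
#include <cstddef>

// Hypothetical stand-in for Clang's Address: a pointer together with the
// alignment the frontend statically knows for it.
struct SketchAddress {
  void *Pointer;
  std::size_t AlignBytes;
};

// Hypothetical stand-in for an emitted atomic instruction. It records only
// the operand pointer and the alignment attached to the operation.
struct SketchAtomicOp {
  void *Pointer;
  std::size_t AlignBytes;
};

// Raw-pointer shape: no alignment travels with the pointer, so the operation
// is tagged with the natural alignment of the value type.
inline SketchAtomicOp createAtomicRMWFromPointer(void *Ptr,
                                                 std::size_t NaturalAlignBytes) {
  return SketchAtomicOp{Ptr, NaturalAlignBytes};
}

// Address shape: the known alignment is forwarded unchanged, even when it is
// smaller than the natural alignment.
inline SketchAtomicOp createAtomicRMWFromAddress(SketchAddress Addr) {
  return SketchAtomicOp{Addr.Pointer, Addr.AlignBytes};
}
```

With an `int` known only to be 1-byte aligned, the raw-pointer form would record an alignment of `sizeof(int)`, whereas the address form records 1.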

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jyknight requested review of this revision. (Feb 22 2021, 1:29 PM)

jyknight created this revision.

Herald added a project: Restricted Project. (Feb 22 2021, 1:29 PM)

Herald added a subscriber: cfe-commits.

Harbormaster completed remote builds in B90275: Diff 325555. (Feb 22 2021, 1:31 PM)

jyknight mentioned this in rG24539f1ef247: Add Alignment argument to IRBuilder CreateAtomicRMW and CreateAtomicCmpXchg. (Feb 25 2021, 3:30 PM)

Do we really consider the existing atomic intrinsics to not impose added alignment restrictions? I'm somewhat concerned that we're going to produce IR here that's either really suboptimal or, worse, unimplementable, just because we over-interpreted some cue about alignment. I guess it would only be a significant problem on a target where types are frequently under-aligned for what atomics need, which is not typical, or when the user is doing atomics on a field of something like a packed struct.

clang/lib/CodeGen/CGBuiltin.cpp
362–366	Since you're changing this code anyway, please make this do `CreateElementBitCast(DestAddr, Int128Ty)` so that it's address-space-correct. There are a lot of other lines in the patch that would benefit from the same thing.
3819	This should be using `EmitPointerWithAlignment` instead of assuming an alignment of 1.

Address review comments.

In D97224#2596410, @rjmccall wrote:

Do we really consider the existing atomic intrinsics to not impose added alignment restrictions? I'm somewhat concerned that we're going to produce IR here that's either really suboptimal or, worse, unimplementable, just because we over-interpreted some cue about alignment. I guess it would only be a significant problem on a target where types are frequently under-aligned for what atomics need, which is not typical, or when the user is doing atomics on a field of something like a packed struct.

For the __atomic_* intrinsics, we don't consider those as imposing additional alignment restrictions -- currently, we avoid generating the LLVM IR instruction if it's below natural alignment, since we couldn't specify alignment on the IR instruction. (Now that we have alignment on the LLVM IR operations, I'd like to eventually get rid of that logic from Clang, since it's also handled by LLVM.)

For other intrinsics (e.g. MSVCIntrin::_InterlockedAnd, Builtin::BIsync_fetch_and_add_4, NVPTX::BInvvm_atom_add_gen_i, and the others in those 3 function families), we currently entirely ignore the alignment, and simply assume the argument is naturally-aligned. So, yes, this change could potentially affect the behavior for underaligned types.

So, I could change these to explicitly increase the assumed alignment of their arguments, like I did for InterlockedCompareExchange128. My inclination is not to do so, however...it doesn't seem like a good idea in general to ignore type alignment. But, I'd not be opposed to doing that, if there's a good reason.

In D97224#2604069, @jyknight wrote:

In D97224#2596410, @rjmccall wrote:

Do we really consider the existing atomic intrinsics to not impose added alignment restrictions? I'm somewhat concerned that we're going to produce IR here that's either really suboptimal or, worse, unimplementable, just because we over-interpreted some cue about alignment. I guess it would only be a significant problem on a target where types are frequently under-aligned for what atomics need, which is not typical, or when the user is doing atomics on a field of something like a packed struct.

For the __atomic_* intrinsics, we don't consider those as imposing additional alignment restrictions -- currently, we avoid generating the LLVM IR instruction if it's below natural alignment, since we couldn't specify alignment on the IR instruction. (Now that we have alignment on the LLVM IR operations, I'd like to eventually get rid of that logic from Clang, since it's also handled by LLVM.)

Frontends ultimately have the responsibility of making sure they emit code that follows the platform ABI for atomics. In most other parts of the ABI, we usually find that it is possible (even necessary) to delegate *part* of that ABI responsibility down to LLVM (e.g. to emit inline atomic sequences, which I suppose frontends could do with inline assembly, but there are obvious reasons to prefer a more semantic IR), but that at least in some cases, there is information that we cannot pass down and so must handle in the frontend. I am somewhat skeptical that atomics are going to prove an exception here. At the very least, there will always be *some* operations that we have to expand into compare-exchange loops in the frontend, for want of a sufficiently powerful instruction/intrinsic. That said, if you find that you can actually free Clang from having to make certain decisions here, that's great; I just want you to understand that usually we find that there are limits to what we can usefully delegate to LLVM, and ultimately the responsibility is ours.

For other intrinsics (e.g. MSVCIntrin::_InterlockedAnd, Builtin::BIsync_fetch_and_add_4, NVPTX::BInvvm_atom_add_gen_i, and the others in those 3 function families), we currently entirely ignore the alignment, and simply assume the argument is naturally-aligned. So, yes, this change could potentially affect the behavior for underaligned types.

So, I could change these to explicitly increase the assumed alignment of their arguments, like I did for InterlockedCompareExchange128. My inclination is not to do so, however...it doesn't seem like a good idea in general to ignore type alignment. But, I'd not be opposed to doing that, if there's a good reason.

The idea here is not to "ignore type alignment". EmitPointerWithAlignment will sometimes return an alignment for a pointer that's less than the alignment of the pointee type, e.g. because you're taking the address of a packed struct field. The critical question is whether the atomic builtins ought to make an effort to honor that reduced alignment, even if it leads to terrible code, or if we should treat the use of the builtin as a user promise that the pointer is actually more aligned than you might think from the information statically available. That has to be resolved by the semantics of the builtin, and unfortunately intrinsic documentation is often spotty on this point.

For example, the MSDN documentation for InterlockedIncrement says that it requires 32-bit alignment, but the documentation for InterlockedAdd doesn't. It seems extremely unlikely that that is meant to be read as a statement that InterlockedAdd is actually more permissive. I would say that all of the Interlocked APIs ought to be read as guaranteeing the natural, full-width alignment for their operation.

I'm less certain about what to do with the __atomic_* builtins, because maybe we should take this as an opportunity to try to "do the right thing" with under-aligned atomics; on the other hand, that assumes that we always *can* "do the right thing", and I don't want Clang to start miscompiling or refusing to compile code because we're trying to do something above and beyond the normal language semantics. It might be more reasonable to always use the type alignment as a minimum.

The idea here is not to "ignore type alignment". EmitPointerWithAlignment will sometimes return an alignment for a pointer that's less than the alignment of the pointee type, e.g. because you're taking the address of a packed struct field. The critical question is whether the atomic builtins ought to make an effort to honor that reduced alignment, even if it leads to terrible code, or if we should treat the use of the builtin as a user promise that the pointer is actually more aligned than you might think from the information statically available.

Agreed -- that is the question. In general, I'd like to default to basing decisions only on the statically-known alignments, because I think that'll typically be the best choice for users. Where there's something like a packed struct, it's likely that the values will end up under-aligned in fact, not just in the compiler's understanding.
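
To illustrate the packed-struct case discussed above, here is a small sketch. `PackedRecord`, `PlainRecord`, and `fieldIsNaturallyAligned` are hypothetical names made up for this example, and `__attribute__((packed))` is a GCC/Clang extension. The sketch shows that a 4-byte field in a packed struct can sit at offset 1, so nothing the compiler knows statically guarantees natural alignment for it.

```cpp
#include <cstddef>
#include <cstdint>

// GCC/Clang extension: padding is removed, so Value lands at offset 1 and
// the struct itself is only 1-byte aligned.
struct __attribute__((packed)) PackedRecord {
  std::uint8_t Tag;
  std::uint32_t Value;
};

// The same fields with the default layout, for comparison.
struct PlainRecord {
  std::uint8_t Tag;
  std::uint32_t Value;
};

// True when a field at Offset inside an object aligned to ObjectAlign is
// guaranteed to be aligned to Size bytes (its natural alignment).
constexpr bool fieldIsNaturallyAligned(std::size_t ObjectAlign,
                                       std::size_t Offset, std::size_t Size) {
  return ObjectAlign % Size == 0 && Offset % Size == 0;
}
```

Evaluating `fieldIsNaturallyAligned(alignof(PackedRecord), offsetof(PackedRecord, Value), sizeof(std::uint32_t))` gives false, which is the situation where a pointer's static alignment is below the pointee type's usual alignment.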

For example, the MSDN documentation for InterlockedIncrement says that it requires 32-bit alignment [...]. I would say that all of the Interlocked APIs ought to be read as guaranteeing the natural, full-width alignment for their operation.

I had missed that it was documented in some of the other functions (beyond InterlockedCompareExchange128). I'll change the remainder of the _Interlocked APIs to assume at least natural alignment.
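
The "assume at least natural alignment" rule can be modelled in isolation as follows. This is a sketch on plain integers, not the Clang helper itself; `assumeNaturalAlignment` and its parameter names are hypothetical.

```cpp
#include <cstddef>

// Hypothetical model: given the alignment statically known for a pointer and
// the store size of the value operated on, return the alignment to assume.
// The result is never smaller than the value's size and never lowers an
// alignment that is already larger.
constexpr std::size_t assumeNaturalAlignment(std::size_t KnownAlignBytes,
                                             std::size_t ValueSizeBytes) {
  return KnownAlignBytes < ValueSizeBytes ? ValueSizeBytes : KnownAlignBytes;
}
```

For instance, a 16-byte operation through a pointer typed as an 8-byte-aligned `long long *` would be treated as 16-byte aligned under this rule, while an already over-aligned pointer keeps its alignment.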

I'm less certain about what to do with the __atomic_* builtins

The __atomic builtins have already been supporting under-aligned pointers all along -- and that behavior is unchanged by this patch.

Harbormaster completed remote builds in B92108: Diff 328219. (Mar 4 2021, 11:59 PM)

Use natural alignment for _Interlocked* intrinsics.

Harbormaster completed remote builds in B93387: Diff 330088. (Mar 11 2021, 9:37 PM)

In D97224#2604537, @jyknight wrote:

I'm less certain about what to do with the __atomic_* builtins

The __atomic builtins have already been supporting under-aligned pointers all along -- and that behavior is unchanged by this patch.

How so? Clang hasn't been passing down an alignment, which means that it's been building atomicrmw instructions with the natural alignment for the IR type, which means we've been assuming that all pointers have at least that alignment. The LLVM documentation even says that atomicrmw doesn't allow under-alignment.

In D97224#2621328, @rjmccall wrote:

In D97224#2604537, @jyknight wrote:

I'm less certain about what to do with the __atomic_* builtins

The __atomic builtins have already been supporting under-aligned pointers all along -- and that behavior is unchanged by this patch.

How so? Clang hasn't been passing down an alignment, which means that it's been building atomicrmw instructions with the natural alignment for the IR type, which means we've been assuming that all pointers have at least that alignment. The LLVM documentation even says that atomicrmw doesn't allow under-alignment.

We construct a libcall to __atomic_* routines in the frontend upon seeing an underaligned argument, instead of letting the backend handle it -- there's a bunch of code at https://github.com/llvm/llvm-project/blob/bc4a5bdce4af82a5522869d8a154e9e15cf809df/clang/lib/CodeGen/CGAtomic.cpp#L790 to handle that. I'd like to rip most of that out in the future, and just let the backend handle it in more cases.

E.g.

typedef int __attribute__((aligned(1))) unaligned_int;
bool cmpxchg_u(unaligned_int *x) {
    int expected = 2;
    return __atomic_compare_exchange_n(x, &expected, 2, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

generates a libcall to __atomic_compare_exchange_4 (in IR, generated in the Clang code), instead of creating a cmpxchg IR instruction, due to the under-alignment. (Although, sigh, I've only just noticed we actually have a problem here -- the __atomic_*_SIZE libcalls are supposed to require an aligned argument -- so we should be using __atomic_compare_exchange (without size suffix) instead. Gah.)
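
The frontend decision described here (inline instruction versus libcall) can be approximated by the sketch that follows. This is a deliberately simplified model, not the logic in CGAtomic.cpp: `needsAtomicLibcall` and `MaxInlineBytes` are hypothetical names, and the real check is target-dependent and may consider more than these three inputs.

```cpp
#include <cstddef>

// Hypothetical simplification: fall back to a library call when the access
// is under-aligned for its size, is not a power-of-two size, or is wider
// than the target can handle inline.
constexpr bool needsAtomicLibcall(std::size_t SizeBytes, std::size_t AlignBytes,
                                  std::size_t MaxInlineBytes) {
  const bool PowerOfTwo = SizeBytes != 0 && (SizeBytes & (SizeBytes - 1)) == 0;
  return !PowerOfTwo || AlignBytes < SizeBytes || SizeBytes > MaxInlineBytes;
}
```

Under this model, a 4-byte value with 1-byte alignment (like `unaligned_int` above) takes the libcall path, while a naturally aligned 4-byte `int` does not, assuming an 8-byte inline limit.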

Ping. I think this is correct, and would like to commit.

Alright, well, this does look cleaner.

This revision is now accepted and ready to land. (Jun 23 2021, 12:54 PM)

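Revision Contents

Path                                                Size
clang/lib/CodeGen/CGAtomic.cpp                      8 lines
clang/lib/CodeGen/CGBuilder.h                       17 lines
clang/lib/CodeGen/CGBuiltin.cpp                     285 lines
clang/lib/CodeGen/CGExprScalar.cpp                  6 lines
clang/lib/CodeGen/CGStmtOpenMP.cpp                  2 lines
clang/test/CodeGen/atomic-ops.c                     4 lines
clang/test/CodeGen/ms-intrinsics-underaligned.c     99 lines
clang/test/CodeGen/ms-intrinsics.c                  4 lines
clang/test/OpenMP/parallel_reduction_codegen.cpp    8 lines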

Diff 330088

clang/lib/CodeGen/CGAtomic.cpp

@@ static void emitAtomicCmpXchg(CodeGenFunction &CGF, AtomicExpr *E, bool IsWeak,
                               llvm::AtomicOrdering SuccessOrder,
                               llvm::AtomicOrdering FailureOrder,
                               llvm::SyncScope::ID Scope) {
   // Note that cmpxchg doesn't support weak cmpxchg, at least at the moment.
   llvm::Value *Expected = CGF.Builder.CreateLoad(Val1);
   llvm::Value *Desired = CGF.Builder.CreateLoad(Val2);

   llvm::AtomicCmpXchgInst *Pair = CGF.Builder.CreateAtomicCmpXchg(
-      Ptr.getPointer(), Expected, Desired, SuccessOrder, FailureOrder,
-      Scope);
+      Ptr, Expected, Desired, SuccessOrder, FailureOrder, Scope);
   Pair->setVolatile(E->isVolatile());
   Pair->setWeak(IsWeak);

   // Cmp holds the result of the compare-exchange operation: true on success,
   // false on failure.
   llvm::Value *Old = CGF.Builder.CreateExtractValue(Pair, 0);
   llvm::Value *Cmp = CGF.Builder.CreateExtractValue(Pair, 1);

@@ case AtomicExpr::AO__atomic_nand_fetch:
     LLVM_FALLTHROUGH;
   case AtomicExpr::AO__atomic_fetch_nand:
     Op = llvm::AtomicRMWInst::Nand;
     break;
   }

   llvm::Value *LoadVal1 = CGF.Builder.CreateLoad(Val1);
   llvm::AtomicRMWInst *RMWI =
-      CGF.Builder.CreateAtomicRMW(Op, Ptr.getPointer(), LoadVal1, Order, Scope);
+      CGF.Builder.CreateAtomicRMW(Op, Ptr, LoadVal1, Order, Scope);
   RMWI->setVolatile(E->isVolatile());

   // For __atomic_*_fetch operations, perform the operation again to
   // determine the value which was written.
   llvm::Value *Result = RMWI;
   if (PostOpMinMax)
     Result = EmitPostAtomicMinMax(CGF.Builder, E->getOp(),
                                   E->getValueType()->isSignedIntegerType(),
@@ llvm::Value *AtomicInfo::convertRValueToInt(RValue RVal) const {
   return CGF.Builder.CreateLoad(Addr);
 }

 std::pair<llvm::Value *, llvm::Value *> AtomicInfo::EmitAtomicCompareExchangeOp(
     llvm::Value *ExpectedVal, llvm::Value *DesiredVal,
     llvm::AtomicOrdering Success, llvm::AtomicOrdering Failure, bool IsWeak) {
   // Do the atomic store.
   Address Addr = getAtomicAddressAsAtomicIntPointer();
-  auto *Inst = CGF.Builder.CreateAtomicCmpXchg(Addr.getPointer(),
-                                               ExpectedVal, DesiredVal,
-                                               Success, Failure);
+  auto *Inst = CGF.Builder.CreateAtomicCmpXchg(Addr, ExpectedVal, DesiredVal,
+                                               Success, Failure);
   // Other decoration.
   Inst->setVolatile(LVal.isVolatileQualified());
   Inst->setWeak(IsWeak);

   // Okay, turn that back into the original value type.
   auto *PreviousVal = CGF.Builder.CreateExtractValue(Inst, /*Idxs=*/0);
   auto *SuccessFailureVal = CGF.Builder.CreateExtractValue(Inst, /*Idxs=*/1);

clang/lib/CodeGen/CGBuilder.h

@@ public:
   }

   /// Emit a store to an i1 flag variable.
   llvm::StoreInst *CreateFlagStore(bool Value, llvm::Value *Addr) {
     assert(Addr->getType()->getPointerElementType() == getInt1Ty());
     return CreateAlignedStore(getInt1(Value), Addr, CharUnits::One());
   }

-  // Temporarily use old signature; clang will be updated to an Address overload
-  // in a subsequent patch.
   llvm::AtomicCmpXchgInst *
-  CreateAtomicCmpXchg(llvm::Value *Ptr, llvm::Value *Cmp, llvm::Value *New,
+  CreateAtomicCmpXchg(Address Addr, llvm::Value *Cmp, llvm::Value *New,
                       llvm::AtomicOrdering SuccessOrdering,
                       llvm::AtomicOrdering FailureOrdering,
                       llvm::SyncScope::ID SSID = llvm::SyncScope::System) {
     return CGBuilderBaseTy::CreateAtomicCmpXchg(
-        Ptr, Cmp, New, llvm::MaybeAlign(), SuccessOrdering, FailureOrdering,
-        SSID);
+        Addr.getPointer(), Cmp, New, Addr.getAlignment().getAsAlign(),
+        SuccessOrdering, FailureOrdering, SSID);
   }

-  // Temporarily use old signature; clang will be updated to an Address overload
-  // in a subsequent patch.
   llvm::AtomicRMWInst *
-  CreateAtomicRMW(llvm::AtomicRMWInst::BinOp Op, llvm::Value *Ptr,
-                  llvm::Value *Val, llvm::AtomicOrdering Ordering,
+  CreateAtomicRMW(llvm::AtomicRMWInst::BinOp Op, Address Addr, llvm::Value *Val,
+                  llvm::AtomicOrdering Ordering,
                   llvm::SyncScope::ID SSID = llvm::SyncScope::System) {
-    return CGBuilderBaseTy::CreateAtomicRMW(Op, Ptr, Val, llvm::MaybeAlign(),
+    return CGBuilderBaseTy::CreateAtomicRMW(Op, Addr.getPointer(), Val,
+                                            Addr.getAlignment().getAsAlign(),
                                             Ordering, SSID);
   }

   using CGBuilderBaseTy::CreateBitCast;
   Address CreateBitCast(Address Addr, llvm::Type *Ty,
                         const llvm::Twine &Name = "") {
     return Address(CreateBitCast(Addr.getPointer(), Ty, Name),
                    Addr.getAlignment());

clang/lib/CodeGen/CGBuiltin.cpp

@@ static Value *EmitFromInt(CodeGenFunction &CGF, llvm::Value *V,

   if (ResultType->isPointerTy())
     return CGF.Builder.CreateIntToPtr(V, ResultType);

   assert(V->getType() == ResultType);
   return V;
 }

+// This helper function returns a new Address, based on Addr, but with its
+// alignment set to at least as large as the type's size.
+static Address ForceNaturalAlignment(CodeGenFunction &CGF, Address Addr) {
+  CharUnits TypeSize = CharUnits::fromQuantity(
+      CGF.CGM.getDataLayout().getTypeStoreSize(Addr.getElementType()));
+  if (Addr.getAlignment() < TypeSize)
+    return Address(Addr.getPointer(), TypeSize);
+  return Addr;
+}
+
 /// Utility to insert an atomic instruction based on Intrinsic::ID
 /// and the expression node.
 static Value *MakeBinaryAtomicValue(
     CodeGenFunction &CGF, llvm::AtomicRMWInst::BinOp Kind, const CallExpr *E,
-    AtomicOrdering Ordering = AtomicOrdering::SequentiallyConsistent) {
+    AtomicOrdering Ordering = AtomicOrdering::SequentiallyConsistent,
+    bool RequireNaturalAlignment = false) {
   QualType T = E->getType();
   assert(E->getArg(0)->getType()->isPointerType());
   assert(CGF.getContext().hasSameUnqualifiedType(T,
                                   E->getArg(0)->getType()->getPointeeType()));
   assert(CGF.getContext().hasSameUnqualifiedType(T, E->getArg(1)->getType()));

-  llvm::Value *DestPtr = CGF.EmitScalarExpr(E->getArg(0));
-  unsigned AddrSpace = DestPtr->getType()->getPointerAddressSpace();
+  Address DestAddr = CGF.EmitPointerWithAlignment(E->getArg(0));

   llvm::IntegerType *IntType =
     llvm::IntegerType::get(CGF.getLLVMContext(),
                            CGF.getContext().getTypeSize(T));
-  llvm::Type *IntPtrType = IntType->getPointerTo(AddrSpace);

-  llvm::Value *Args[2];
-  Args[0] = CGF.Builder.CreateBitCast(DestPtr, IntPtrType);
-  Args[1] = CGF.EmitScalarExpr(E->getArg(1));
-  llvm::Type *ValueType = Args[1]->getType();
-  Args[1] = EmitToInt(CGF, Args[1], T, IntType);
+  DestAddr = CGF.Builder.CreateElementBitCast(DestAddr, IntType);
+  llvm::Value *Val = CGF.EmitScalarExpr(E->getArg(1));
+  llvm::Type *ValueType = Val->getType();
+  Val = EmitToInt(CGF, Val, T, IntType);

-  llvm::Value *Result = CGF.Builder.CreateAtomicRMW(
-      Kind, Args[0], Args[1], Ordering);
+  if (RequireNaturalAlignment)
+    DestAddr = ForceNaturalAlignment(CGF, DestAddr);
+
+  llvm::Value *Result =
+      CGF.Builder.CreateAtomicRMW(Kind, DestAddr, Val, Ordering);
   return EmitFromInt(CGF, Result, T, ValueType);
 }

 static Value *EmitNontemporalStore(CodeGenFunction &CGF, const CallExpr *E) {
   Value *Val = CGF.EmitScalarExpr(E->getArg(0));
   Value *Address = CGF.EmitScalarExpr(E->getArg(1));

   // Convert the type of the pointer to a pointer to the stored type.
@@ static RValue EmitBinaryAtomicPost(CodeGenFunction &CGF,
                                    Instruction::BinaryOps Op,
                                    bool Invert = false) {
   QualType T = E->getType();
   assert(E->getArg(0)->getType()->isPointerType());
   assert(CGF.getContext().hasSameUnqualifiedType(T,
                                   E->getArg(0)->getType()->getPointeeType()));
   assert(CGF.getContext().hasSameUnqualifiedType(T, E->getArg(1)->getType()));

-  llvm::Value *DestPtr = CGF.EmitScalarExpr(E->getArg(0));
-  unsigned AddrSpace = DestPtr->getType()->getPointerAddressSpace();
+  Address DestAddr = CGF.EmitPointerWithAlignment(E->getArg(0));

   llvm::IntegerType *IntType =
     llvm::IntegerType::get(CGF.getLLVMContext(),
                            CGF.getContext().getTypeSize(T));
-  llvm::Type *IntPtrType = IntType->getPointerTo(AddrSpace);

-  llvm::Value *Args[2];
-  Args[1] = CGF.EmitScalarExpr(E->getArg(1));
-  llvm::Type *ValueType = Args[1]->getType();
-  Args[1] = EmitToInt(CGF, Args[1], T, IntType);
-  Args[0] = CGF.Builder.CreateBitCast(DestPtr, IntPtrType);
+  DestAddr = CGF.Builder.CreateElementBitCast(DestAddr, IntType);
+  llvm::Value *Val = CGF.EmitScalarExpr(E->getArg(1));
+  llvm::Type *ValueType = Val->getType();
+  Val = EmitToInt(CGF, Val, T, IntType);

   llvm::Value *Result = CGF.Builder.CreateAtomicRMW(
-      Kind, Args[0], Args[1], llvm::AtomicOrdering::SequentiallyConsistent);
-  Result = CGF.Builder.CreateBinOp(Op, Result, Args[1]);
+      Kind, DestAddr, Val, llvm::AtomicOrdering::SequentiallyConsistent);
+  Result = CGF.Builder.CreateBinOp(Op, Result, Val);
   if (Invert)
     Result =
         CGF.Builder.CreateBinOp(llvm::Instruction::Xor, Result,
                                 llvm::ConstantInt::getAllOnesValue(IntType));
   Result = EmitFromInt(CGF, Result, T, ValueType);
   return RValue::get(Result);
 }

@@
 ///
 /// @returns result of cmpxchg, according to ReturnBool
 ///
 /// Note: In order to lower Microsoft's _InterlockedCompareExchange* intrinsics
 /// invoke the function EmitAtomicCmpXchgForMSIntrin.
 static Value *MakeAtomicCmpXchgValue(CodeGenFunction &CGF, const CallExpr *E,
                                      bool ReturnBool) {
   QualType T = ReturnBool ? E->getArg(1)->getType() : E->getType();
-  llvm::Value *DestPtr = CGF.EmitScalarExpr(E->getArg(0));
-  unsigned AddrSpace = DestPtr->getType()->getPointerAddressSpace();
+  Address DestAddr = CGF.EmitPointerWithAlignment(E->getArg(0));

   llvm::IntegerType *IntType = llvm::IntegerType::get(
       CGF.getLLVMContext(), CGF.getContext().getTypeSize(T));
-  llvm::Type *IntPtrType = IntType->getPointerTo(AddrSpace);

-  Value *Args[3];
-  Args[0] = CGF.Builder.CreateBitCast(DestPtr, IntPtrType);
-  Args[1] = CGF.EmitScalarExpr(E->getArg(1));
-  llvm::Type *ValueType = Args[1]->getType();
-  Args[1] = EmitToInt(CGF, Args[1], T, IntType);
-  Args[2] = EmitToInt(CGF, CGF.EmitScalarExpr(E->getArg(2)), T, IntType);
+  DestAddr = CGF.Builder.CreateElementBitCast(DestAddr, IntType);
+  Value *Cmp = CGF.EmitScalarExpr(E->getArg(1));
+  llvm::Type *ValueType = Cmp->getType();
+  Cmp = EmitToInt(CGF, Cmp, T, IntType);
+  Value *New = EmitToInt(CGF, CGF.EmitScalarExpr(E->getArg(2)), T, IntType);

   Value *Pair = CGF.Builder.CreateAtomicCmpXchg(
-      Args[0], Args[1], Args[2], llvm::AtomicOrdering::SequentiallyConsistent,
+      DestAddr, Cmp, New, llvm::AtomicOrdering::SequentiallyConsistent,
       llvm::AtomicOrdering::SequentiallyConsistent);
   if (ReturnBool)
     // Extract boolean success flag and zext it to int.
     return CGF.Builder.CreateZExt(CGF.Builder.CreateExtractValue(Pair, 1),
                                   CGF.ConvertType(E->getType()));
   else
     // Extract old value and emit it using the same type as compare value.
     return EmitFromInt(CGF, CGF.Builder.CreateExtractValue(Pair, 0), T,
                        ValueType);
 }

/// This function should be invoked to emit atomic cmpxchg for Microsoft's		/// This function should be invoked to emit atomic cmpxchg for Microsoft's
/// _InterlockedCompareExchange* intrinsics which have the following signature:		/// _InterlockedCompareExchange* intrinsics which have the following signature:
/// T _InterlockedCompareExchange(T volatile *Destination,		/// T _InterlockedCompareExchange(T volatile *Destination,
/// T Exchange,		/// T Exchange,
/// T Comparand);		/// T Comparand);
///		///
/// Whereas the llvm 'cmpxchg' instruction has the following syntax:		/// Whereas the llvm 'cmpxchg' instruction has the following syntax:
/// cmpxchg *Destination, Comparand, Exchange.		/// cmpxchg *Destination, Comparand, Exchange.
/// So we need to swap Comparand and Exchange when invoking		/// So we need to swap Comparand and Exchange when invoking
/// CreateAtomicCmpXchg. That is the reason we could not use the above utility		/// CreateAtomicCmpXchg. That is the reason we could not use the above utility
/// function MakeAtomicCmpXchgValue since it expects the arguments to be		/// function MakeAtomicCmpXchgValue since it expects the arguments to be
/// already swapped.		/// already swapped.
		///
		/// Note that the MSVC 'Interlocked' intrinsics always assume the destination
		/// address has been aligned to their size.
static		static
Value EmitAtomicCmpXchgForMSIntrin(CodeGenFunction &CGF, const CallExpr E,		Value EmitAtomicCmpXchgForMSIntrin(CodeGenFunction &CGF, const CallExpr E,
AtomicOrdering SuccessOrdering = AtomicOrdering::SequentiallyConsistent) {		AtomicOrdering SuccessOrdering = AtomicOrdering::SequentiallyConsistent) {
assert(E->getArg(0)->getType()->isPointerType());		assert(E->getArg(0)->getType()->isPointerType());
assert(CGF.getContext().hasSameUnqualifiedType(		assert(CGF.getContext().hasSameUnqualifiedType(
E->getType(), E->getArg(0)->getType()->getPointeeType()));		E->getType(), E->getArg(0)->getType()->getPointeeType()));
assert(CGF.getContext().hasSameUnqualifiedType(E->getType(),		assert(CGF.getContext().hasSameUnqualifiedType(E->getType(),
E->getArg(1)->getType()));		E->getArg(1)->getType()));
assert(CGF.getContext().hasSameUnqualifiedType(E->getType(),		assert(CGF.getContext().hasSameUnqualifiedType(E->getType(),
E->getArg(2)->getType()));		E->getArg(2)->getType()));

auto *Destination = CGF.EmitScalarExpr(E->getArg(0));		Address DestAddr = CGF.EmitPointerWithAlignment(E->getArg(0));
		DestAddr = ForceNaturalAlignment(CGF, DestAddr);

auto *Comparand = CGF.EmitScalarExpr(E->getArg(2));		auto *Comparand = CGF.EmitScalarExpr(E->getArg(2));
auto *Exchange = CGF.EmitScalarExpr(E->getArg(1));		auto *Exchange = CGF.EmitScalarExpr(E->getArg(1));

// For Release ordering, the failure ordering should be Monotonic.		// For Release ordering, the failure ordering should be Monotonic.
auto FailureOrdering = SuccessOrdering == AtomicOrdering::Release ?		auto FailureOrdering = SuccessOrdering == AtomicOrdering::Release ?
AtomicOrdering::Monotonic :		AtomicOrdering::Monotonic :
SuccessOrdering;		SuccessOrdering;

// The atomic instruction is marked volatile for consistency with MSVC. This		// The atomic instruction is marked volatile for consistency with MSVC. This
// blocks the few atomics optimizations that LLVM has. If we want to optimize		// blocks the few atomics optimizations that LLVM has. If we want to optimize
// _Interlocked* operations in the future, we will have to remove the volatile		// _Interlocked* operations in the future, we will have to remove the volatile
// marker.		// marker.
auto *Result = CGF.Builder.CreateAtomicCmpXchg(		auto *Result = CGF.Builder.CreateAtomicCmpXchg(
Destination, Comparand, Exchange,		DestAddr, Comparand, Exchange, SuccessOrdering, FailureOrdering);
SuccessOrdering, FailureOrdering);
Result->setVolatile(true);		Result->setVolatile(true);
return CGF.Builder.CreateExtractValue(Result, 0);		return CGF.Builder.CreateExtractValue(Result, 0);
}		}
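
The failure-ordering rule used in the function above ("For Release ordering, the failure ordering should be Monotonic") can be sketched outside of Clang with `std::memory_order`. This is only an illustration under stated assumptions: `failureOrderFor` and `compareExchange` are hypothetical names, not part of the patch, and the `acq_rel` case is an addition that the diff itself never needs.

```cpp
#include <atomic>

// Hypothetical helper (not part of the patch). A failed compare-exchange
// performs no store, so a release success ordering is demoted to relaxed,
// which LLVM calls Monotonic. Other orderings are reused unchanged.
// acq_rel is mapped to acquire only because the C++ library does not accept
// it as a failure ordering; the diff never passes it.
constexpr std::memory_order failureOrderFor(std::memory_order success) {
  return success == std::memory_order_release   ? std::memory_order_relaxed
         : success == std::memory_order_acq_rel ? std::memory_order_acquire
                                                : success;
}

// Compare-and-swap that always returns the value observed at dest, in the
// style of the _InterlockedCompareExchange family (argument order:
// destination, exchange, comparand).
inline long compareExchange(std::atomic<long> &dest, long exchange,
                            long comparand, std::memory_order success) {
  long observed = comparand;
  // On failure, compare_exchange_strong writes the current value to observed.
  dest.compare_exchange_strong(observed, exchange, success,
                               failureOrderFor(success));
  return observed;
}
```

Returning `observed` corresponds to the `CreateExtractValue(Result, 0)` in the diff: element 0 of the `cmpxchg` result is the loaded value, whether or not the exchange happened.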

// 64-bit Microsoft platforms support 128 bit cmpxchg operations. They are		// 64-bit Microsoft platforms support 128 bit cmpxchg operations. They are
// prototyped like this:		// prototyped like this:
//		//
// unsigned char _InterlockedCompareExchange128...(		// unsigned char _InterlockedCompareExchange128...(
// __int64 volatile * _Destination,		// __int64 volatile * _Destination,
// __int64 _ExchangeHigh,		// __int64 _ExchangeHigh,
// __int64 _ExchangeLow,		// __int64 _ExchangeLow,
// __int64 * _ComparandResult);		// __int64 * _ComparandResult);
		//
		// Note that Destination is assumed to be at least 16-byte aligned, despite
		// being typed int64.

static Value *EmitAtomicCmpXchg128ForMSIntrin(CodeGenFunction &CGF,		static Value *EmitAtomicCmpXchg128ForMSIntrin(CodeGenFunction &CGF,
const CallExpr *E,		const CallExpr *E,
AtomicOrdering SuccessOrdering) {		AtomicOrdering SuccessOrdering) {
assert(E->getNumArgs() == 4);		assert(E->getNumArgs() == 4);
llvm::Value *Destination = CGF.EmitScalarExpr(E->getArg(0));		Address DestAddr = CGF.EmitPointerWithAlignment(E->getArg(0));
llvm::Value *ExchangeHigh = CGF.EmitScalarExpr(E->getArg(1));		llvm::Value *ExchangeHigh = CGF.EmitScalarExpr(E->getArg(1));
llvm::Value *ExchangeLow = CGF.EmitScalarExpr(E->getArg(2));		llvm::Value *ExchangeLow = CGF.EmitScalarExpr(E->getArg(2));
llvm::Value *ComparandPtr = CGF.EmitScalarExpr(E->getArg(3));		Address ComparandAddr = CGF.EmitPointerWithAlignment(E->getArg(3));

assert(Destination->getType()->isPointerTy());		assert(DestAddr.getType()->isPointerTy());
assert(!ExchangeHigh->getType()->isPointerTy());		assert(!ExchangeHigh->getType()->isPointerTy());
assert(!ExchangeLow->getType()->isPointerTy());		assert(!ExchangeLow->getType()->isPointerTy());
assert(ComparandPtr->getType()->isPointerTy());		assert(ComparandAddr.getType()->isPointerTy());

// For Release ordering, the failure ordering should be Monotonic.		// For Release ordering, the failure ordering should be Monotonic.
auto FailureOrdering = SuccessOrdering == AtomicOrdering::Release		auto FailureOrdering = SuccessOrdering == AtomicOrdering::Release
? AtomicOrdering::Monotonic		? AtomicOrdering::Monotonic
: SuccessOrdering;		: SuccessOrdering;

// Convert to i128 pointers and values.		// Convert to i128 pointers and values.
llvm::Type *Int128Ty = llvm::IntegerType::get(CGF.getLLVMContext(), 128);		llvm::Type *Int128Ty = llvm::IntegerType::get(CGF.getLLVMContext(), 128);
llvm::Type *Int128PtrTy = Int128Ty->getPointerTo();		DestAddr = CGF.Builder.CreateElementBitCast(DestAddr, Int128Ty);
Destination = CGF.Builder.CreateBitCast(Destination, Int128PtrTy);		ComparandAddr = CGF.Builder.CreateElementBitCast(ComparandAddr, Int128Ty);
Address ComparandResult(CGF.Builder.CreateBitCast(ComparandPtr, Int128PtrTy),
CGF.getContext().toCharUnitsFromBits(128));		// Force alignment after casting (thus forcing 16-byte alignment)
		DestAddr = ForceNaturalAlignment(CGF, DestAddr);
		rjmccall (inline comment, marked Done): Since you're changing this code anyway, please make this do `CreateElementBitCast(DestAddr, Int128Ty)` so that it's address-space-correct. There are a lot of other lines in the patch that would benefit from the same thing.

// (((i128)hi) << 64) | ((i128)lo)		// (((i128)hi) << 64) | ((i128)lo)
ExchangeHigh = CGF.Builder.CreateZExt(ExchangeHigh, Int128Ty);		ExchangeHigh = CGF.Builder.CreateZExt(ExchangeHigh, Int128Ty);
ExchangeLow = CGF.Builder.CreateZExt(ExchangeLow, Int128Ty);		ExchangeLow = CGF.Builder.CreateZExt(ExchangeLow, Int128Ty);
ExchangeHigh =		ExchangeHigh =
CGF.Builder.CreateShl(ExchangeHigh, llvm::ConstantInt::get(Int128Ty, 64));		CGF.Builder.CreateShl(ExchangeHigh, llvm::ConstantInt::get(Int128Ty, 64));
llvm::Value *Exchange = CGF.Builder.CreateOr(ExchangeHigh, ExchangeLow);		llvm::Value *Exchange = CGF.Builder.CreateOr(ExchangeHigh, ExchangeLow);

// Load the comparand for the instruction.		// Load the comparand for the instruction.
llvm::Value *Comparand = CGF.Builder.CreateLoad(ComparandResult);		llvm::Value *Comparand = CGF.Builder.CreateLoad(ComparandAddr);

auto *CXI = CGF.Builder.CreateAtomicCmpXchg(Destination, Comparand, Exchange,		auto *CXI = CGF.Builder.CreateAtomicCmpXchg(DestAddr, Comparand, Exchange,
SuccessOrdering, FailureOrdering);		SuccessOrdering, FailureOrdering);

// The atomic instruction is marked volatile for consistency with MSVC. This		// The atomic instruction is marked volatile for consistency with MSVC. This
// blocks the few atomics optimizations that LLVM has. If we want to optimize		// blocks the few atomics optimizations that LLVM has. If we want to optimize
// _Interlocked* operations in the future, we will have to remove the volatile		// _Interlocked* operations in the future, we will have to remove the volatile
// marker.		// marker.
CXI->setVolatile(true);		CXI->setVolatile(true);

// Store the result as an outparameter.		// Store the result as an outparameter.
CGF.Builder.CreateStore(CGF.Builder.CreateExtractValue(CXI, 0),		CGF.Builder.CreateStore(CGF.Builder.CreateExtractValue(CXI, 0),
ComparandResult);		ComparandAddr);

// Get the success boolean and zero extend it to i8.		// Get the success boolean and zero extend it to i8.
Value *Success = CGF.Builder.CreateExtractValue(CXI, 1);		Value *Success = CGF.Builder.CreateExtractValue(CXI, 1);
return CGF.Builder.CreateZExt(Success, CGF.Int8Ty);		return CGF.Builder.CreateZExt(Success, CGF.Int8Ty);
}		}
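
As a rough illustration of what the function above computes, here is a non-atomic reference model of the value flow: pack the two 64-bit halves with zero extension, compare against the comparand, conditionally store, write the observed value back through the out-parameter, and return the success flag as a byte. It is a sketch under assumptions, not the patch's code: `packHiLo` and `modelCompareExchange128` are hypothetical names, `unsigned __int128` is a GCC/Clang extension, and the layout (element 0 is the low half) assumes a little-endian target.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the compiler provides unsigned __int128 (GCC/Clang extension).
using u128 = unsigned __int128;

// (((i128)hi) << 64) | ((i128)lo), with both halves zero-extended as in the diff.
inline u128 packHiLo(std::int64_t hi, std::int64_t lo) {
  return (static_cast<u128>(static_cast<std::uint64_t>(hi)) << 64) |
         static_cast<u128>(static_cast<std::uint64_t>(lo));
}

// Non-atomic model of the data flow only. destination and comparandResult
// each point at two adjacent 64-bit words: [0] = low half, [1] = high half.
inline unsigned char modelCompareExchange128(std::int64_t *destination,
                                             std::int64_t exchangeHigh,
                                             std::int64_t exchangeLow,
                                             std::int64_t *comparandResult) {
  // The intrinsic assumes a 16-byte aligned destination even though the
  // parameter is only typed as a pointer to a 64-bit integer.
  assert(reinterpret_cast<std::uintptr_t>(destination) % 16 == 0);

  const u128 exchange = packHiLo(exchangeHigh, exchangeLow);
  const u128 comparand = packHiLo(comparandResult[1], comparandResult[0]);
  const u128 observed = packHiLo(destination[1], destination[0]);

  const bool success = observed == comparand;
  if (success) {
    destination[0] = static_cast<std::int64_t>(static_cast<std::uint64_t>(exchange));
    destination[1] = static_cast<std::int64_t>(static_cast<std::uint64_t>(exchange >> 64));
  }

  // Store the observed value as an out-parameter, on success and on failure.
  comparandResult[0] = static_cast<std::int64_t>(static_cast<std::uint64_t>(observed));
  comparandResult[1] = static_cast<std::int64_t>(static_cast<std::uint64_t>(observed >> 64));

  // The success boolean is widened to a byte, like the zext to i8 in the diff.
  return success ? 1 : 0;
}
```

The model is deliberately not atomic; in the patch the compare and the store are a single `cmpxchg` on an `i128`, which is why the destination `Address` has its alignment forced after the cast.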

static Value *EmitAtomicIncrementValue(CodeGenFunction &CGF, const CallExpr *E,		static Value *EmitAtomicIncrementValue(
AtomicOrdering Ordering = AtomicOrdering::SequentiallyConsistent) {		CodeGenFunction &CGF, const CallExpr *E,
		AtomicOrdering Ordering = AtomicOrdering::SequentiallyConsistent,
		bool RequireNaturalAlignment = false) {
assert(E->getArg(0)->getType()->isPointerType());		assert(E->getArg(0)->getType()->isPointerType());

auto *IntTy = CGF.ConvertType(E->getType());		auto *IntTy = CGF.ConvertType(E->getType());
		Address DestAddr = CGF.EmitPointerWithAlignment(E->getArg(0));
		if (RequireNaturalAlignment)
		DestAddr = ForceNaturalAlignment(CGF, DestAddr);
auto *Result = CGF.Builder.CreateAtomicRMW(		auto *Result = CGF.Builder.CreateAtomicRMW(
AtomicRMWInst::Add,		AtomicRMWInst::Add, DestAddr, ConstantInt::get(IntTy, 1), Ordering);
CGF.EmitScalarExpr(E->getArg(0)),
ConstantInt::get(IntTy, 1),
Ordering);
return CGF.Builder.CreateAdd(Result, ConstantInt::get(IntTy, 1));		return CGF.Builder.CreateAdd(Result, ConstantInt::get(IntTy, 1));
}		}

static Value *EmitAtomicDecrementValue(CodeGenFunction &CGF, const CallExpr *E,		static Value *EmitAtomicDecrementValue(
AtomicOrdering Ordering = AtomicOrdering::SequentiallyConsistent) {		CodeGenFunction &CGF, const CallExpr *E,
		AtomicOrdering Ordering = AtomicOrdering::SequentiallyConsistent,
		bool RequireNaturalAlignment = false) {
assert(E->getArg(0)->getType()->isPointerType());		assert(E->getArg(0)->getType()->isPointerType());

auto *IntTy = CGF.ConvertType(E->getType());		auto *IntTy = CGF.ConvertType(E->getType());
		Address DestAddr = CGF.EmitPointerWithAlignment(E->getArg(0));
		if (RequireNaturalAlignment)
		DestAddr = ForceNaturalAlignment(CGF, DestAddr);
auto *Result = CGF.Builder.CreateAtomicRMW(		auto *Result = CGF.Builder.CreateAtomicRMW(
AtomicRMWInst::Sub,		AtomicRMWInst::Sub, DestAddr, ConstantInt::get(IntTy, 1), Ordering);
CGF.EmitScalarExpr(E->getArg(0)),
ConstantInt::get(IntTy, 1),
Ordering);
return CGF.Builder.CreateSub(Result, ConstantInt::get(IntTy, 1));		return CGF.Builder.CreateSub(Result, ConstantInt::get(IntTy, 1));
}		}
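
Both helpers above apply the `+ 1` or `- 1` a second time after the `atomicrmw`, because `atomicrmw` yields the value that was in memory before the operation, while the increment and decrement intrinsics return the updated value. A minimal sketch of the same pattern with `std::atomic` follows; `modelInterlockedIncrement` and `modelInterlockedDecrement` are hypothetical names used only for illustration.

```cpp
#include <atomic>

// fetch_add returns the value held before the addition (like atomicrmw add),
// so 1 is added once more to produce the incremented value.
inline long modelInterlockedIncrement(
    std::atomic<long> &dest,
    std::memory_order order = std::memory_order_seq_cst) {
  return dest.fetch_add(1, order) + 1;
}

// Same pattern for decrement: fetch_sub returns the old value.
inline long modelInterlockedDecrement(
    std::atomic<long> &dest,
    std::memory_order order = std::memory_order_seq_cst) {
  return dest.fetch_sub(1, order) - 1;
}
```

The second addition operates on the returned value only; memory is modified exactly once, by the atomic read-modify-write.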

// Build a plain volatile load.		// Build a plain volatile load.
static Value *EmitISOVolatileLoad(CodeGenFunction &CGF, const CallExpr *E) {		static Value *EmitISOVolatileLoad(CodeGenFunction &CGF, const CallExpr *E) {
Value *Ptr = CGF.EmitScalarExpr(E->getArg(0));		Value *Ptr = CGF.EmitScalarExpr(E->getArg(0));
QualType ElTy = E->getArg(0)->getType()->getPointeeType();		QualType ElTy = E->getArg(0)->getType()->getPointeeType();
CharUnits LoadSize = CGF.getContext().getTypeSizeInChars(ElTy);		CharUnits LoadSize = CGF.getContext().getTypeSizeInChars(ElTy);
[... 543 lines hidden ...]	static llvm::Value *EmitBitTestIntrinsic(CodeGenFunction &CGF,
if (Ordering != llvm::AtomicOrdering::NotAtomic) {		if (Ordering != llvm::AtomicOrdering::NotAtomic) {
// Emit a combined atomicrmw load/store operation for the interlocked		// Emit a combined atomicrmw load/store operation for the interlocked
// intrinsics.		// intrinsics.
llvm::AtomicRMWInst::BinOp RMWOp = llvm::AtomicRMWInst::Or;		llvm::AtomicRMWInst::BinOp RMWOp = llvm::AtomicRMWInst::Or;
if (BT.Action == BitTest::Reset) {		if (BT.Action == BitTest::Reset) {
Mask = CGF.Builder.CreateNot(Mask);		Mask = CGF.Builder.CreateNot(Mask);
RMWOp = llvm::AtomicRMWInst::And;		RMWOp = llvm::AtomicRMWInst::And;
}		}
OldByte = CGF.Builder.CreateAtomicRMW(RMWOp, ByteAddr.getPointer(), Mask,		OldByte = CGF.Builder.CreateAtomicRMW(RMWOp, ByteAddr, Mask, Ordering);
Ordering);
} else {		} else {
// Emit a plain load for the non-interlocked intrinsics.		// Emit a plain load for the non-interlocked intrinsics.
OldByte = CGF.Builder.CreateLoad(ByteAddr, "bittest.byte");		OldByte = CGF.Builder.CreateLoad(ByteAddr, "bittest.byte");
Value *NewByte = nullptr;		Value *NewByte = nullptr;
switch (BT.Action) {		switch (BT.Action) {
case BitTest::TestOnly:		case BitTest::TestOnly:
// Don't store anything.		// Don't store anything.
break;		break;
[... 495 lines hidden ...]	case MSVCIntrin::_BitScanReverse: {
}		}
Builder.CreateBr(End);		Builder.CreateBr(End);
Result->addIncoming(ResOne, NotZero);		Result->addIncoming(ResOne, NotZero);

Builder.SetInsertPoint(End);		Builder.SetInsertPoint(End);
return Result;		return Result;
}		}
case MSVCIntrin::_InterlockedAnd:		case MSVCIntrin::_InterlockedAnd:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::And, E);		return MakeBinaryAtomicValue(*this, AtomicRMWInst::And, E,
		AtomicOrdering::SequentiallyConsistent,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedExchange:		case MSVCIntrin::_InterlockedExchange:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::Xchg, E);		return MakeBinaryAtomicValue(*this, AtomicRMWInst::Xchg, E,
		AtomicOrdering::SequentiallyConsistent,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedExchangeAdd:		case MSVCIntrin::_InterlockedExchangeAdd:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::Add, E);		return MakeBinaryAtomicValue(*this, AtomicRMWInst::Add, E,
		AtomicOrdering::SequentiallyConsistent,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedExchangeSub:		case MSVCIntrin::_InterlockedExchangeSub:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::Sub, E);		return MakeBinaryAtomicValue(*this, AtomicRMWInst::Sub, E,
		AtomicOrdering::SequentiallyConsistent,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedOr:		case MSVCIntrin::_InterlockedOr:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::Or, E);		return MakeBinaryAtomicValue(*this, AtomicRMWInst::Or, E,
		AtomicOrdering::SequentiallyConsistent,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedXor:		case MSVCIntrin::_InterlockedXor:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::Xor, E);		return MakeBinaryAtomicValue(*this, AtomicRMWInst::Xor, E,
		AtomicOrdering::SequentiallyConsistent,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedExchangeAdd_acq:		case MSVCIntrin::_InterlockedExchangeAdd_acq:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::Add, E,		return MakeBinaryAtomicValue(*this, AtomicRMWInst::Add, E,
AtomicOrdering::Acquire);		AtomicOrdering::Acquire,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedExchangeAdd_rel:		case MSVCIntrin::_InterlockedExchangeAdd_rel:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::Add, E,		return MakeBinaryAtomicValue(*this, AtomicRMWInst::Add, E,
AtomicOrdering::Release);		AtomicOrdering::Release,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedExchangeAdd_nf:		case MSVCIntrin::_InterlockedExchangeAdd_nf:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::Add, E,		return MakeBinaryAtomicValue(*this, AtomicRMWInst::Add, E,
AtomicOrdering::Monotonic);		AtomicOrdering::Monotonic,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedExchange_acq:		case MSVCIntrin::_InterlockedExchange_acq:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::Xchg, E,		return MakeBinaryAtomicValue(*this, AtomicRMWInst::Xchg, E,
AtomicOrdering::Acquire);		AtomicOrdering::Acquire,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedExchange_rel:		case MSVCIntrin::_InterlockedExchange_rel:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::Xchg, E,		return MakeBinaryAtomicValue(*this, AtomicRMWInst::Xchg, E,
AtomicOrdering::Release);		AtomicOrdering::Release,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedExchange_nf:		case MSVCIntrin::_InterlockedExchange_nf:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::Xchg, E,		return MakeBinaryAtomicValue(*this, AtomicRMWInst::Xchg, E,
AtomicOrdering::Monotonic);		AtomicOrdering::Monotonic,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedCompareExchange_acq:		case MSVCIntrin::_InterlockedCompareExchange_acq:
return EmitAtomicCmpXchgForMSIntrin(*this, E, AtomicOrdering::Acquire);		return EmitAtomicCmpXchgForMSIntrin(*this, E, AtomicOrdering::Acquire);
case MSVCIntrin::_InterlockedCompareExchange_rel:		case MSVCIntrin::_InterlockedCompareExchange_rel:
return EmitAtomicCmpXchgForMSIntrin(*this, E, AtomicOrdering::Release);		return EmitAtomicCmpXchgForMSIntrin(*this, E, AtomicOrdering::Release);
case MSVCIntrin::_InterlockedCompareExchange_nf:		case MSVCIntrin::_InterlockedCompareExchange_nf:
return EmitAtomicCmpXchgForMSIntrin(*this, E, AtomicOrdering::Monotonic);		return EmitAtomicCmpXchgForMSIntrin(*this, E, AtomicOrdering::Monotonic);
case MSVCIntrin::_InterlockedCompareExchange128:		case MSVCIntrin::_InterlockedCompareExchange128:
return EmitAtomicCmpXchg128ForMSIntrin(		return EmitAtomicCmpXchg128ForMSIntrin(
*this, E, AtomicOrdering::SequentiallyConsistent);		*this, E, AtomicOrdering::SequentiallyConsistent);
case MSVCIntrin::_InterlockedCompareExchange128_acq:		case MSVCIntrin::_InterlockedCompareExchange128_acq:
return EmitAtomicCmpXchg128ForMSIntrin(*this, E, AtomicOrdering::Acquire);		return EmitAtomicCmpXchg128ForMSIntrin(*this, E, AtomicOrdering::Acquire);
case MSVCIntrin::_InterlockedCompareExchange128_rel:		case MSVCIntrin::_InterlockedCompareExchange128_rel:
return EmitAtomicCmpXchg128ForMSIntrin(*this, E, AtomicOrdering::Release);		return EmitAtomicCmpXchg128ForMSIntrin(*this, E, AtomicOrdering::Release);
case MSVCIntrin::_InterlockedCompareExchange128_nf:		case MSVCIntrin::_InterlockedCompareExchange128_nf:
return EmitAtomicCmpXchg128ForMSIntrin(*this, E, AtomicOrdering::Monotonic);		return EmitAtomicCmpXchg128ForMSIntrin(*this, E, AtomicOrdering::Monotonic);
case MSVCIntrin::_InterlockedOr_acq:		case MSVCIntrin::_InterlockedOr_acq:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::Or, E,		return MakeBinaryAtomicValue(*this, AtomicRMWInst::Or, E,
AtomicOrdering::Acquire);		AtomicOrdering::Acquire,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedOr_rel:		case MSVCIntrin::_InterlockedOr_rel:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::Or, E,		return MakeBinaryAtomicValue(*this, AtomicRMWInst::Or, E,
AtomicOrdering::Release);		AtomicOrdering::Release,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedOr_nf:		case MSVCIntrin::_InterlockedOr_nf:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::Or, E,		return MakeBinaryAtomicValue(*this, AtomicRMWInst::Or, E,
AtomicOrdering::Monotonic);		AtomicOrdering::Monotonic,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedXor_acq:		case MSVCIntrin::_InterlockedXor_acq:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::Xor, E,		return MakeBinaryAtomicValue(*this, AtomicRMWInst::Xor, E,
AtomicOrdering::Acquire);		AtomicOrdering::Acquire,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedXor_rel:		case MSVCIntrin::_InterlockedXor_rel:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::Xor, E,		return MakeBinaryAtomicValue(*this, AtomicRMWInst::Xor, E,
AtomicOrdering::Release);		AtomicOrdering::Release,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedXor_nf:		case MSVCIntrin::_InterlockedXor_nf:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::Xor, E,		return MakeBinaryAtomicValue(*this, AtomicRMWInst::Xor, E,
AtomicOrdering::Monotonic);		AtomicOrdering::Monotonic,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedAnd_acq:		case MSVCIntrin::_InterlockedAnd_acq:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::And, E,		return MakeBinaryAtomicValue(*this, AtomicRMWInst::And, E,
AtomicOrdering::Acquire);		AtomicOrdering::Acquire,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedAnd_rel:		case MSVCIntrin::_InterlockedAnd_rel:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::And, E,		return MakeBinaryAtomicValue(*this, AtomicRMWInst::And, E,
AtomicOrdering::Release);		AtomicOrdering::Release,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedAnd_nf:		case MSVCIntrin::_InterlockedAnd_nf:
return MakeBinaryAtomicValue(*this, AtomicRMWInst::And, E,		return MakeBinaryAtomicValue(*this, AtomicRMWInst::And, E,
AtomicOrdering::Monotonic);		AtomicOrdering::Monotonic,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedIncrement_acq:		case MSVCIntrin::_InterlockedIncrement_acq:
return EmitAtomicIncrementValue(*this, E, AtomicOrdering::Acquire);		return EmitAtomicIncrementValue(*this, E, AtomicOrdering::Acquire,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedIncrement_rel:		case MSVCIntrin::_InterlockedIncrement_rel:
return EmitAtomicIncrementValue(*this, E, AtomicOrdering::Release);		return EmitAtomicIncrementValue(*this, E, AtomicOrdering::Release,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedIncrement_nf:		case MSVCIntrin::_InterlockedIncrement_nf:
return EmitAtomicIncrementValue(*this, E, AtomicOrdering::Monotonic);		return EmitAtomicIncrementValue(*this, E, AtomicOrdering::Monotonic,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedDecrement_acq:		case MSVCIntrin::_InterlockedDecrement_acq:
return EmitAtomicDecrementValue(*this, E, AtomicOrdering::Acquire);		return EmitAtomicDecrementValue(*this, E, AtomicOrdering::Acquire,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedDecrement_rel:		case MSVCIntrin::_InterlockedDecrement_rel:
return EmitAtomicDecrementValue(*this, E, AtomicOrdering::Release);		return EmitAtomicDecrementValue(*this, E, AtomicOrdering::Release,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedDecrement_nf:		case MSVCIntrin::_InterlockedDecrement_nf:
return EmitAtomicDecrementValue(*this, E, AtomicOrdering::Monotonic);		return EmitAtomicDecrementValue(*this, E, AtomicOrdering::Monotonic,
		/*RequireNaturalAlignment=*/true);

case MSVCIntrin::_InterlockedDecrement:		case MSVCIntrin::_InterlockedDecrement:
return EmitAtomicDecrementValue(*this, E);		return EmitAtomicDecrementValue(*this, E,
		AtomicOrdering::SequentiallyConsistent,
		/*RequireNaturalAlignment=*/true);
case MSVCIntrin::_InterlockedIncrement:		case MSVCIntrin::_InterlockedIncrement:
return EmitAtomicIncrementValue(*this, E);		return EmitAtomicIncrementValue(*this, E,
		AtomicOrdering::SequentiallyConsistent,
		/*RequireNaturalAlignment=*/true);

case MSVCIntrin::__fastfail: {		case MSVCIntrin::__fastfail: {
// Request immediate process termination from the kernel. The instruction		// Request immediate process termination from the kernel. The instruction
// sequences to do this are documented on MSDN:		// sequences to do this are documented on MSDN:
// https://msdn.microsoft.com/en-us/library/dn774154.aspx		// https://msdn.microsoft.com/en-us/library/dn774154.aspx
llvm::Triple::ArchType ISA = getTarget().getTriple().getArch();		llvm::Triple::ArchType ISA = getTarget().getTriple().getArch();
StringRef Asm, Constraints;		StringRef Asm, Constraints;
switch (ISA) {		switch (ISA) {
[... 2,183 lines hidden ...]

case Builtin::BI__atomic_test_and_set: {		case Builtin::BI__atomic_test_and_set: {
// Look at the argument type to determine whether this is a volatile		// Look at the argument type to determine whether this is a volatile
// operation. The parameter type is always volatile.		// operation. The parameter type is always volatile.
QualType PtrTy = E->getArg(0)->IgnoreImpCasts()->getType();		QualType PtrTy = E->getArg(0)->IgnoreImpCasts()->getType();
bool Volatile =		bool Volatile =
PtrTy->castAs<PointerType>()->getPointeeType().isVolatileQualified();		PtrTy->castAs<PointerType>()->getPointeeType().isVolatileQualified();

Value *Ptr = EmitScalarExpr(E->getArg(0));		Address PtrAddr = EmitPointerWithAlignment(E->getArg(0));
unsigned AddrSpace = Ptr->getType()->getPointerAddressSpace();		PtrAddr = Builder.CreateElementBitCast(PtrAddr, Int8Ty);
		rjmccall (inline comment, marked Done): This should be using `EmitPointerWithAlignment` instead of assuming an alignment of 1.
Ptr = Builder.CreateBitCast(Ptr, Int8Ty->getPointerTo(AddrSpace));
Value *NewVal = Builder.getInt8(1);		Value *NewVal = Builder.getInt8(1);
Value *Order = EmitScalarExpr(E->getArg(1));		Value *Order = EmitScalarExpr(E->getArg(1));
if (isa<llvm::ConstantInt>(Order)) {		if (isa<llvm::ConstantInt>(Order)) {
int ord = cast<llvm::ConstantInt>(Order)->getZExtValue();		int ord = cast<llvm::ConstantInt>(Order)->getZExtValue();
AtomicRMWInst *Result = nullptr;		AtomicRMWInst *Result = nullptr;
switch (ord) {		switch (ord) {
case 0: // memory_order_relaxed		case 0: // memory_order_relaxed
default: // invalid order		default: // invalid order
Result = Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg, Ptr, NewVal,		Result =
		Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg, PtrAddr, NewVal,
llvm::AtomicOrdering::Monotonic);		llvm::AtomicOrdering::Monotonic);
break;		break;
case 1: // memory_order_consume		case 1: // memory_order_consume
case 2: // memory_order_acquire		case 2: // memory_order_acquire
Result = Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg, Ptr, NewVal,		Result = Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg, PtrAddr,
llvm::AtomicOrdering::Acquire);		NewVal, llvm::AtomicOrdering::Acquire);
break;		break;
case 3: // memory_order_release		case 3: // memory_order_release
Result = Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg, Ptr, NewVal,		Result = Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg, PtrAddr,
llvm::AtomicOrdering::Release);		NewVal, llvm::AtomicOrdering::Release);
break;		break;
case 4: // memory_order_acq_rel		case 4: // memory_order_acq_rel

Result = Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg, Ptr, NewVal,		Result =
		Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg, PtrAddr, NewVal,
llvm::AtomicOrdering::AcquireRelease);		llvm::AtomicOrdering::AcquireRelease);
break;		break;
case 5: // memory_order_seq_cst		case 5: // memory_order_seq_cst
Result = Builder.CreateAtomicRMW(		Result = Builder.CreateAtomicRMW(
llvm::AtomicRMWInst::Xchg, Ptr, NewVal,		llvm::AtomicRMWInst::Xchg, PtrAddr, NewVal,
llvm::AtomicOrdering::SequentiallyConsistent);		llvm::AtomicOrdering::SequentiallyConsistent);
break;		break;
}		}
Result->setVolatile(Volatile);		Result->setVolatile(Volatile);
return RValue::get(Builder.CreateIsNotNull(Result, "tobool"));		return RValue::get(Builder.CreateIsNotNull(Result, "tobool"));
}		}

llvm::BasicBlock *ContBB = createBasicBlock("atomic.continue", CurFn);		llvm::BasicBlock *ContBB = createBasicBlock("atomic.continue", CurFn);
[... 14 lines hidden ...]	case Builtin::BI__atomic_test_and_set: {
llvm::SwitchInst *SI = Builder.CreateSwitch(Order, BBs[0]);		llvm::SwitchInst *SI = Builder.CreateSwitch(Order, BBs[0]);

Builder.SetInsertPoint(ContBB);		Builder.SetInsertPoint(ContBB);
PHINode *Result = Builder.CreatePHI(Int8Ty, 5, "was_set");		PHINode *Result = Builder.CreatePHI(Int8Ty, 5, "was_set");

for (unsigned i = 0; i < 5; ++i) {		for (unsigned i = 0; i < 5; ++i) {
Builder.SetInsertPoint(BBs[i]);		Builder.SetInsertPoint(BBs[i]);
AtomicRMWInst *RMW = Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg,		AtomicRMWInst *RMW = Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg,
Ptr, NewVal, Orders[i]);		PtrAddr, NewVal, Orders[i]);
RMW->setVolatile(Volatile);		RMW->setVolatile(Volatile);
Result->addIncoming(RMW, BBs[i]);		Result->addIncoming(RMW, BBs[i]);
Builder.CreateBr(ContBB);		Builder.CreateBr(ContBB);
}		}

SI->addCase(Builder.getInt32(0), BBs[0]);		SI->addCase(Builder.getInt32(0), BBs[0]);
SI->addCase(Builder.getInt32(1), BBs[1]);		SI->addCase(Builder.getInt32(1), BBs[1]);
SI->addCase(Builder.getInt32(2), BBs[1]);		SI->addCase(Builder.getInt32(2), BBs[1]);
[... 439 lines hidden ...]	case Builtin::BI_InterlockedExchangePointer:
return RValue::get(		return RValue::get(
EmitMSVCBuiltinExpr(MSVCIntrin::_InterlockedExchange, E));		EmitMSVCBuiltinExpr(MSVCIntrin::_InterlockedExchange, E));
case Builtin::BI_InterlockedCompareExchangePointer:		case Builtin::BI_InterlockedCompareExchangePointer:
case Builtin::BI_InterlockedCompareExchangePointer_nf: {		case Builtin::BI_InterlockedCompareExchangePointer_nf: {
llvm::Type *RTy;		llvm::Type *RTy;
llvm::IntegerType *IntType =		llvm::IntegerType *IntType =
IntegerType::get(getLLVMContext(),		IntegerType::get(getLLVMContext(),
getContext().getTypeSize(E->getType()));		getContext().getTypeSize(E->getType()));
llvm::Type *IntPtrType = IntType->getPointerTo();

llvm::Value *Destination =		Address DestAddr = Builder.CreateElementBitCast(
Builder.CreateBitCast(EmitScalarExpr(E->getArg(0)), IntPtrType);		EmitPointerWithAlignment(E->getArg(0)), IntType);
		DestAddr = ForceNaturalAlignment(*this, DestAddr);

llvm::Value *Exchange = EmitScalarExpr(E->getArg(1));		llvm::Value *Exchange = EmitScalarExpr(E->getArg(1));
RTy = Exchange->getType();		RTy = Exchange->getType();
Exchange = Builder.CreatePtrToInt(Exchange, IntType);		Exchange = Builder.CreatePtrToInt(Exchange, IntType);

llvm::Value *Comparand =		llvm::Value *Comparand =
Builder.CreatePtrToInt(EmitScalarExpr(E->getArg(2)), IntType);		Builder.CreatePtrToInt(EmitScalarExpr(E->getArg(2)), IntType);

auto Ordering =		auto Ordering =
BuiltinID == Builtin::BI_InterlockedCompareExchangePointer_nf ?		BuiltinID == Builtin::BI_InterlockedCompareExchangePointer_nf ?
AtomicOrdering::Monotonic : AtomicOrdering::SequentiallyConsistent;		AtomicOrdering::Monotonic : AtomicOrdering::SequentiallyConsistent;

auto Result = Builder.CreateAtomicCmpXchg(Destination, Comparand, Exchange,		auto Result = Builder.CreateAtomicCmpXchg(DestAddr, Comparand, Exchange,
		Lint (pre-merge checks): clang-tidy: warning: 'auto Result' can be declared as 'auto *Result' [llvm-qualified-auto]
Ordering, Ordering);		Ordering, Ordering);
Result->setVolatile(true);		Result->setVolatile(true);

return RValue::get(Builder.CreateIntToPtr(Builder.CreateExtractValue(Result,		return RValue::get(Builder.CreateIntToPtr(Builder.CreateExtractValue(Result,
0),		0),
RTy));		RTy));
}		}
case Builtin::BI_InterlockedCompareExchange8:		case Builtin::BI_InterlockedCompareExchange8:
[... 5,915 lines hidden ...]
case NEON::BI__builtin_neon_vgetq_lane_bf16:		case NEON::BI__builtin_neon_vgetq_lane_bf16:
case NEON::BI__builtin_neon_vduph_laneq_bf16:		case NEON::BI__builtin_neon_vduph_laneq_bf16:
case NEON::BI__builtin_neon_vduph_laneq_f16: {		case NEON::BI__builtin_neon_vduph_laneq_f16: {
return Builder.CreateExtractElement(Ops[0], EmitScalarExpr(E->getArg(1)),		return Builder.CreateExtractElement(Ops[0], EmitScalarExpr(E->getArg(1)),
"vgetq_lane");		"vgetq_lane");
}		}

case AArch64::BI_InterlockedAdd: {		case AArch64::BI_InterlockedAdd: {
Value *Arg0 = EmitScalarExpr(E->getArg(0));		Address DestAddr = EmitPointerWithAlignment(E->getArg(0));
Value *Arg1 = EmitScalarExpr(E->getArg(1));		DestAddr = ForceNaturalAlignment(*this, DestAddr);
AtomicRMWInst *RMWI = Builder.CreateAtomicRMW(		Value *Val = EmitScalarExpr(E->getArg(1));
AtomicRMWInst::Add, Arg0, Arg1,
		AtomicRMWInst *RMWI =
		Builder.CreateAtomicRMW(AtomicRMWInst::Add, DestAddr, Val,
llvm::AtomicOrdering::SequentiallyConsistent);		llvm::AtomicOrdering::SequentiallyConsistent);
return Builder.CreateAdd(RMWI, Arg1);		return Builder.CreateAdd(RMWI, Val);
}		}
}		}

llvm::FixedVectorType *VTy = GetNeonType(this, Type);		llvm::FixedVectorType *VTy = GetNeonType(this, Type);
llvm::Type *Ty = VTy;		llvm::Type *Ty = VTy;
if (!Ty)		if (!Ty)
return nullptr;		return nullptr;

[... 6,075 lines hidden ...]	CodeGenFunction::EmitNVPTXBuiltinExpr(unsigned BuiltinID, const CallExpr *E) {
case NVPTX::BI__nvvm_atom_cas_gen_l:		case NVPTX::BI__nvvm_atom_cas_gen_l:
case NVPTX::BI__nvvm_atom_cas_gen_ll:		case NVPTX::BI__nvvm_atom_cas_gen_ll:
// __nvvm_atom_cas_gen_* should return the old value rather than the		// __nvvm_atom_cas_gen_* should return the old value rather than the
// success flag.		// success flag.
return MakeAtomicCmpXchgValue(*this, E, /*ReturnBool=*/false);		return MakeAtomicCmpXchgValue(*this, E, /*ReturnBool=*/false);

case NVPTX::BI__nvvm_atom_add_gen_f:		case NVPTX::BI__nvvm_atom_add_gen_f:
case NVPTX::BI__nvvm_atom_add_gen_d: {		case NVPTX::BI__nvvm_atom_add_gen_d: {
Value *Ptr = EmitScalarExpr(E->getArg(0));		Address DestAddr = EmitPointerWithAlignment(E->getArg(0));
Value *Val = EmitScalarExpr(E->getArg(1));		Value *Val = EmitScalarExpr(E->getArg(1));
return Builder.CreateAtomicRMW(llvm::AtomicRMWInst::FAdd, Ptr, Val,
		return Builder.CreateAtomicRMW(llvm::AtomicRMWInst::FAdd, DestAddr, Val,
AtomicOrdering::SequentiallyConsistent);		AtomicOrdering::SequentiallyConsistent);
}		}

case NVPTX::BI__nvvm_atom_inc_gen_ui: {		case NVPTX::BI__nvvm_atom_inc_gen_ui: {
Value *Ptr = EmitScalarExpr(E->getArg(0));		Value *Ptr = EmitScalarExpr(E->getArg(0));
Value *Val = EmitScalarExpr(E->getArg(1));		Value *Val = EmitScalarExpr(E->getArg(1));
Function *FnALI32 =		Function *FnALI32 =
CGM.getIntrinsic(Intrinsic::nvvm_atomic_load_inc_32, Ptr->getType());		CGM.getIntrinsic(Intrinsic::nvvm_atomic_load_inc_32, Ptr->getType());
[... 1,618 lines hidden ...]

clang/lib/CodeGen/CGExprScalar.cpp

[... 2,424 lines hidden ...]	if (isInc && type->isBooleanType()) {
if (isPre) {		if (isPre) {
Builder.CreateStore(True, LV.getAddress(CGF), LV.isVolatileQualified())		Builder.CreateStore(True, LV.getAddress(CGF), LV.isVolatileQualified())
->setAtomic(llvm::AtomicOrdering::SequentiallyConsistent);		->setAtomic(llvm::AtomicOrdering::SequentiallyConsistent);
return Builder.getTrue();		return Builder.getTrue();
}		}
// For atomic bool increment, we just store true and return it for		// For atomic bool increment, we just store true and return it for
// preincrement, do an atomic swap with true for postincrement		// preincrement, do an atomic swap with true for postincrement
return Builder.CreateAtomicRMW(		return Builder.CreateAtomicRMW(
llvm::AtomicRMWInst::Xchg, LV.getPointer(CGF), True,		llvm::AtomicRMWInst::Xchg, LV.getAddress(CGF), True,
llvm::AtomicOrdering::SequentiallyConsistent);		llvm::AtomicOrdering::SequentiallyConsistent);
}		}
// Special case for atomic increment / decrement on integers, emit		// Special case for atomic increment / decrement on integers, emit
// atomicrmw instructions. We skip this if we want to be doing overflow		// atomicrmw instructions. We skip this if we want to be doing overflow
// checking, and fall into the slow path with the atomic cmpxchg loop.		// checking, and fall into the slow path with the atomic cmpxchg loop.
if (!type->isBooleanType() && type->isIntegerType() &&		if (!type->isBooleanType() && type->isIntegerType() &&
!(type->isUnsignedIntegerType() &&		!(type->isUnsignedIntegerType() &&
CGF.SanOpts.has(SanitizerKind::UnsignedIntegerOverflow)) &&		CGF.SanOpts.has(SanitizerKind::UnsignedIntegerOverflow)) &&
CGF.getLangOpts().getSignedOverflowBehavior() !=		CGF.getLangOpts().getSignedOverflowBehavior() !=
LangOptions::SOB_Trapping) {		LangOptions::SOB_Trapping) {
llvm::AtomicRMWInst::BinOp aop = isInc ? llvm::AtomicRMWInst::Add :		llvm::AtomicRMWInst::BinOp aop = isInc ? llvm::AtomicRMWInst::Add :
llvm::AtomicRMWInst::Sub;		llvm::AtomicRMWInst::Sub;
llvm::Instruction::BinaryOps op = isInc ? llvm::Instruction::Add :		llvm::Instruction::BinaryOps op = isInc ? llvm::Instruction::Add :
llvm::Instruction::Sub;		llvm::Instruction::Sub;
llvm::Value *amt = CGF.EmitToMemory(		llvm::Value *amt = CGF.EmitToMemory(
llvm::ConstantInt::get(ConvertType(type), 1, true), type);		llvm::ConstantInt::get(ConvertType(type), 1, true), type);
llvm::Value *old =		llvm::Value *old =
Builder.CreateAtomicRMW(aop, LV.getPointer(CGF), amt,		Builder.CreateAtomicRMW(aop, LV.getAddress(CGF), amt,
llvm::AtomicOrdering::SequentiallyConsistent);		llvm::AtomicOrdering::SequentiallyConsistent);
return isPre ? Builder.CreateBinOp(op, old, amt) : old;		return isPre ? Builder.CreateBinOp(op, old, amt) : old;
}		}
value = EmitLoadOfLValue(LV, E->getExprLoc());		value = EmitLoadOfLValue(LV, E->getExprLoc());
input = value;		input = value;
// For every other atomic operation, we need to emit a load-op-cmpxchg loop		// For every other atomic operation, we need to emit a load-op-cmpxchg loop
llvm::BasicBlock *startBB = Builder.GetInsertBlock();		llvm::BasicBlock *startBB = Builder.GetInsertBlock();
llvm::BasicBlock *opBB = CGF.createBasicBlock("atomic_op", CGF.CurFn);		llvm::BasicBlock *opBB = CGF.createBasicBlock("atomic_op", CGF.CurFn);
▲ Show 20 Lines • Show All 555 Lines • ▼ Show 20 Lines	if (!type->isBooleanType() && type->isIntegerType() &&
llvm_unreachable("Invalid compound assignment type");		llvm_unreachable("Invalid compound assignment type");
}		}
if (AtomicOp != llvm::AtomicRMWInst::BAD_BINOP) {		if (AtomicOp != llvm::AtomicRMWInst::BAD_BINOP) {
llvm::Value *Amt = CGF.EmitToMemory(		llvm::Value *Amt = CGF.EmitToMemory(
EmitScalarConversion(OpInfo.RHS, E->getRHS()->getType(), LHSTy,		EmitScalarConversion(OpInfo.RHS, E->getRHS()->getType(), LHSTy,
E->getExprLoc()),		E->getExprLoc()),
LHSTy);		LHSTy);
Value *OldVal = Builder.CreateAtomicRMW(		Value *OldVal = Builder.CreateAtomicRMW(
AtomicOp, LHSLV.getPointer(CGF), Amt,		AtomicOp, LHSLV.getAddress(CGF), Amt,
llvm::AtomicOrdering::SequentiallyConsistent);		llvm::AtomicOrdering::SequentiallyConsistent);

// Since operation is atomic, the result type is guaranteed to be the		// Since operation is atomic, the result type is guaranteed to be the
// same as the input in LLVM terms.		// same as the input in LLVM terms.
Result = Builder.CreateBinOp(Op, OldVal, Amt);		Result = Builder.CreateBinOp(Op, OldVal, Amt);
return LHSLV;		return LHSLV;
}		}
}		}
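
A note on the CGExprScalar.cpp hunks above: atomicrmw yields the value held before the update, which is why the pre-increment path recomputes `old op amt` and the post-increment path returns `old` unchanged. Below is a minimal C11 sketch of the same semantics. It uses <stdatomic.h> rather than Clang's CodeGen API, and every function name in it is hypothetical and not part of this patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Post-increment: the fetch-and-add already returns the old value. */
static int post_increment(atomic_int *p) {
  return atomic_fetch_add(p, 1);
}

/* Pre-increment: redo the addition on the returned old value,
   mirroring `isPre ? CreateBinOp(op, old, amt) : old`. */
static int pre_increment(atomic_int *p) {
  int old = atomic_fetch_add(p, 1);
  return old + 1;
}

/* Atomic bool, post-increment: swap in true and return the previous value. */
static bool post_increment_bool(atomic_bool *p) {
  return atomic_exchange(p, true);
}

/* Atomic bool, pre-increment: store true; the result is always true. */
static bool pre_increment_bool(atomic_bool *p) {
  atomic_store(p, true);
  return true;
}
```

The <stdatomic.h> generic functions default to sequentially consistent ordering, matching the SequentiallyConsistent ordering passed in the hunks above.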

clang/lib/CodeGen/CGStmtOpenMP.cpp


Show First 20 Lines • Show All 5,144 Lines • ▼ Show 20 Lines	static std::pair<bool, RValue> emitOMPAtomicRMW(CodeGenFunction &CGF, LValue X,
}		}
llvm::Value *UpdateVal = Update.getScalarVal();		llvm::Value *UpdateVal = Update.getScalarVal();
if (auto *IC = dyn_cast<llvm::ConstantInt>(UpdateVal)) {		if (auto *IC = dyn_cast<llvm::ConstantInt>(UpdateVal)) {
UpdateVal = CGF.Builder.CreateIntCast(		UpdateVal = CGF.Builder.CreateIntCast(
IC, X.getAddress(CGF).getElementType(),		IC, X.getAddress(CGF).getElementType(),
X.getType()->hasSignedIntegerRepresentation());		X.getType()->hasSignedIntegerRepresentation());
}		}
llvm::Value *Res =		llvm::Value *Res =
CGF.Builder.CreateAtomicRMW(RMWOp, X.getPointer(CGF), UpdateVal, AO);		CGF.Builder.CreateAtomicRMW(RMWOp, X.getAddress(CGF), UpdateVal, AO);
return std::make_pair(true, RValue::get(Res));		return std::make_pair(true, RValue::get(Res));
}		}

std::pair<bool, RValue> CodeGenFunction::EmitOMPAtomicSimpleUpdateExpr(		std::pair<bool, RValue> CodeGenFunction::EmitOMPAtomicSimpleUpdateExpr(
LValue X, RValue E, BinaryOperatorKind BO, bool IsXLHSInRHSPart,		LValue X, RValue E, BinaryOperatorKind BO, bool IsXLHSInRHSPart,
llvm::AtomicOrdering AO, SourceLocation Loc,		llvm::AtomicOrdering AO, SourceLocation Loc,
const llvm::function_ref<RValue(RValue)> CommonGen) {		const llvm::function_ref<RValue(RValue)> CommonGen) {
// Update expressions are allowed to have the following forms:		// Update expressions are allowed to have the following forms:

clang/test/CodeGen/atomic-ops.c

Show First 20 Lines • Show All 649 Lines • ▼ Show 20 Lines	void test_underaligned() {
__atomic_compare_exchange(&underaligned_a, &underaligned_b, &underaligned_c, 1, memory_order_seq_cst, memory_order_seq_cst);		__atomic_compare_exchange(&underaligned_a, &underaligned_b, &underaligned_c, 1, memory_order_seq_cst, memory_order_seq_cst);

__attribute__((aligned)) struct Underaligned aligned_a, aligned_b, aligned_c;		__attribute__((aligned)) struct Underaligned aligned_a, aligned_b, aligned_c;

// CHECK: load atomic i64, {{.*}}, align 16		// CHECK: load atomic i64, {{.*}}, align 16
__atomic_load(&aligned_a, &aligned_b, memory_order_seq_cst);		__atomic_load(&aligned_a, &aligned_b, memory_order_seq_cst);
// CHECK: store atomic i64 {{.*}}, align 16		// CHECK: store atomic i64 {{.*}}, align 16
__atomic_store(&aligned_a, &aligned_b, memory_order_seq_cst);		__atomic_store(&aligned_a, &aligned_b, memory_order_seq_cst);
// CHECK: atomicrmw xchg i64* {{.*}}, align 8		// CHECK: atomicrmw xchg i64* {{.*}}, align 16
__atomic_exchange(&aligned_a, &aligned_b, &aligned_c, memory_order_seq_cst);		__atomic_exchange(&aligned_a, &aligned_b, &aligned_c, memory_order_seq_cst);
// CHECK: cmpxchg weak i64* {{.*}}, align 8		// CHECK: cmpxchg weak i64* {{.*}}, align 16
__atomic_compare_exchange(&aligned_a, &aligned_b, &aligned_c, 1, memory_order_seq_cst, memory_order_seq_cst);		__atomic_compare_exchange(&aligned_a, &aligned_b, &aligned_c, 1, memory_order_seq_cst, memory_order_seq_cst);
}		}

void test_c11_minmax(_Atomic(int) * si, _Atomic(unsigned) * ui, _Atomic(short) * ss, _Atomic(unsigned char) * uc, _Atomic(long long) * sll) {		void test_c11_minmax(_Atomic(int) * si, _Atomic(unsigned) * ui, _Atomic(short) * ss, _Atomic(unsigned char) * uc, _Atomic(long long) * sll) {
// CHECK-LABEL: @test_c11_minmax		// CHECK-LABEL: @test_c11_minmax

// CHECK: atomicrmw max i32* {{.*}} acquire, align 4		// CHECK: atomicrmw max i32* {{.*}} acquire, align 4
*si = __c11_atomic_fetch_max(si, 42, memory_order_acquire);		*si = __c11_atomic_fetch_max(si, 42, memory_order_acquire);

clang/test/CodeGen/ms-intrinsics-underaligned.c

This file was added.

				// RUN: %clang_cc1 -ffreestanding -fms-extensions -fms-compatibility -fms-compatibility-version=17.00 \
				// RUN: -triple x86_64--windows -Oz -emit-llvm -target-feature +cx16 %s -o - \
				// RUN:    | FileCheck %s

				// Ensure that we emit _Interlocked atomic operations specifying natural
				// alignment, even when clang's usual alignment derivation would result in a
				// lower alignment value.

				// intrin.h needs size_t, but -ffreestanding prevents us from getting it from
				// stddef.h. Work around it with this typedef.
				typedef __SIZE_TYPE__ size_t;

				#include <intrin.h>

				#pragma pack(1)
				typedef struct {
				char a;
				short b;
				long c;
				long long d;
				void *p;
				} X;

				_Static_assert(sizeof(X) == 23, "");
				_Static_assert(__alignof__(X) == 1, "");

				// CHECK-LABEL: @test_InterlockedExchangePointer(
				// CHECK: atomicrmw {{.*}} align 8
				void *test_InterlockedExchangePointer(X *x) {
				return _InterlockedExchangePointer(&x->p, 0);
				}

				// CHECK-LABEL: @test_InterlockedExchange8(
				// CHECK: atomicrmw {{.*}} align 1
				char test_InterlockedExchange8(X *x) {
				return _InterlockedExchange8(&x->a, 0);
				}

				// CHECK-LABEL: @test_InterlockedExchange16(
				// CHECK: atomicrmw {{.*}} align 2
				short test_InterlockedExchange16(X *x) {
				return _InterlockedExchange16(&x->b, 0);
				}

				// CHECK-LABEL: @test_InterlockedExchange(
				// CHECK: atomicrmw {{.*}} align 4
				long test_InterlockedExchange(X *x) {
				return _InterlockedExchange(&x->c, 0);
				}

				// CHECK-LABEL: @test_InterlockedExchange64(
				// CHECK: atomicrmw {{.*}} align 8
				long long test_InterlockedExchange64(X *x) {
				return _InterlockedExchange64(&x->d, 0);
				}

				// CHECK-LABEL: @test_InterlockedIncrement(
				// CHECK: atomicrmw {{.*}} align 4
				long test_InterlockedIncrement(X *x) {
				return _InterlockedIncrement(&x->c);
				}

				// CHECK-LABEL: @test_InterlockedDecrement16(
				// CHECK: atomicrmw {{.*}} align 2
				short test_InterlockedDecrement16(X *x) {
				return _InterlockedDecrement16(&x->b);
				}


				// CHECK-LABEL: @test_InterlockedCompareExchangePointer(
				// CHECK: cmpxchg {{.*}} align 8
				void *test_InterlockedCompareExchangePointer(X *x) {
				return _InterlockedCompareExchangePointer(&x->p, 0, 0);
				}

				// CHECK-LABEL: @test_InterlockedCompareExchange8(
				// CHECK: cmpxchg {{.*}} align 1
				char test_InterlockedCompareExchange8(X *x) {
				return _InterlockedCompareExchange8(&x->a, 0, 0);
				}

				// CHECK-LABEL: @test_InterlockedCompareExchange16(
				// CHECK: cmpxchg {{.*}} align 2
				short test_InterlockedCompareExchange16(X *x) {
				return _InterlockedCompareExchange16(&x->b, 0, 0);
				}

				// CHECK-LABEL: @test_InterlockedCompareExchange(
				// CHECK: cmpxchg {{.*}} align 4
				long test_InterlockedCompareExchange(X *x) {
				return _InterlockedCompareExchange(&x->c, 0, 0);
				}

				// CHECK-LABEL: @test_InterlockedCompareExchange64(
				// CHECK: cmpxchg {{.*}} align 8
				long long test_InterlockedCompareExchange64(X *x) {
				return _InterlockedCompareExchange64(&x->d, 0, 0);
				}
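
The packed struct is what makes the derived alignment in this test lower than the natural alignment of each member. Below is a rough, portable C sketch of that layout, assuming a compiler that accepts `#pragma pack` (GCC, Clang, MSVC) and C11 `_Alignof`. `PackedX` and the helper names are hypothetical, and `int` stands in for the test's 4-byte Windows `long`.

```c
#include <stddef.h>

#pragma pack(push, 1)
typedef struct {
  char a;
  short b;
  int c;        /* stand-in for the 4-byte `long` used on Windows */
  long long d;
  void *p;
} PackedX;
#pragma pack(pop)

/* With pack(1) no padding is inserted, so d sits at 1 + 2 + 4 = 7
   (assuming a 2-byte short and a 4-byte int). */
static size_t packed_offset_of_d(void) { return offsetof(PackedX, d); }

/* The struct as a whole is only 1-byte aligned. */
static size_t packed_struct_align(void) { return _Alignof(PackedX); }

/* Nonzero if d's offset is a multiple of long long's natural alignment. */
static int d_is_naturally_aligned(void) {
  return offsetof(PackedX, d) % _Alignof(long long) == 0;
}
```

Alignment derived from the member access would therefore be 1 for every field, while the CHECK lines above expect the natural alignment of the operand type (1, 2, 4, or 8).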

clang/test/CodeGen/ms-intrinsics.c

	Show First 20 Lines • Show All 444 Lines • ▼ Show 20 Lines
	// CHECK-64: %inc1 = add nsw i64 %ExchangeLow, 1			// CHECK-64: %inc1 = add nsw i64 %ExchangeLow, 1
	// CHECK-64: %incdec.ptr2 = getelementptr inbounds i64, i64* %ComparandResult, i64 1			// CHECK-64: %incdec.ptr2 = getelementptr inbounds i64, i64* %ComparandResult, i64 1
	// CHECK-64: [[DST:%[0-9]+]] = bitcast i64* %incdec.ptr to i128*			// CHECK-64: [[DST:%[0-9]+]] = bitcast i64* %incdec.ptr to i128*
	// CHECK-64: [[CNR:%[0-9]+]] = bitcast i64* %incdec.ptr2 to i128*			// CHECK-64: [[CNR:%[0-9]+]] = bitcast i64* %incdec.ptr2 to i128*
	// CHECK-64: [[EH:%[0-9]+]] = zext i64 %inc to i128			// CHECK-64: [[EH:%[0-9]+]] = zext i64 %inc to i128
	// CHECK-64: [[EL:%[0-9]+]] = zext i64 %inc1 to i128			// CHECK-64: [[EL:%[0-9]+]] = zext i64 %inc1 to i128
	// CHECK-64: [[EHS:%[0-9]+]] = shl nuw i128 [[EH]], 64			// CHECK-64: [[EHS:%[0-9]+]] = shl nuw i128 [[EH]], 64
	// CHECK-64: [[EXP:%[0-9]+]] = or i128 [[EHS]], [[EL]]			// CHECK-64: [[EXP:%[0-9]+]] = or i128 [[EHS]], [[EL]]
	// CHECK-64: [[ORG:%[0-9]+]] = load i128, i128* [[CNR]], align 16			// CHECK-64: [[ORG:%[0-9]+]] = load i128, i128* [[CNR]], align 8
	// CHECK-64: [[RES:%[0-9]+]] = cmpxchg volatile i128* [[DST]], i128 [[ORG]], i128 [[EXP]] seq_cst seq_cst, align 16			// CHECK-64: [[RES:%[0-9]+]] = cmpxchg volatile i128* [[DST]], i128 [[ORG]], i128 [[EXP]] seq_cst seq_cst, align 16
	// CHECK-64: [[OLD:%[0-9]+]] = extractvalue { i128, i1 } [[RES]], 0			// CHECK-64: [[OLD:%[0-9]+]] = extractvalue { i128, i1 } [[RES]], 0
	// CHECK-64: store i128 [[OLD]], i128* [[CNR]], align 16			// CHECK-64: store i128 [[OLD]], i128* [[CNR]], align 8
	// CHECK-64: [[SUC1:%[0-9]+]] = extractvalue { i128, i1 } [[RES]], 1			// CHECK-64: [[SUC1:%[0-9]+]] = extractvalue { i128, i1 } [[RES]], 1
	// CHECK-64: [[SUC8:%[0-9]+]] = zext i1 [[SUC1]] to i8			// CHECK-64: [[SUC8:%[0-9]+]] = zext i1 [[SUC1]] to i8
	// CHECK-64: ret i8 [[SUC8]]			// CHECK-64: ret i8 [[SUC8]]
	// CHECK-64: }			// CHECK-64: }
	#endif			#endif

	#if defined(__aarch64__)			#if defined(__aarch64__)
	unsigned char test_InterlockedCompareExchange128_acq(			unsigned char test_InterlockedCompareExchange128_acq(

clang/test/OpenMP/parallel_reduction_codegen.cpp

Show First 20 Lines • Show All 192 Lines • ▼ Show 20 Lines	#pragma omp parallel reduction(+:g)
// LAMBDA: [[G_VAL:%.+]] = load i32, i32* [[G_REF]]		// LAMBDA: [[G_VAL:%.+]] = load i32, i32* [[G_REF]]
// LAMBDA: [[G_PRIV_VAL:%.+]] = load i32, i32* [[G_PRIVATE_ADDR]]		// LAMBDA: [[G_PRIV_VAL:%.+]] = load i32, i32* [[G_PRIVATE_ADDR]]
// LAMBDA: [[ADD:%.+]] = add nsw i32 [[G_VAL]], [[G_PRIV_VAL]]		// LAMBDA: [[ADD:%.+]] = add nsw i32 [[G_VAL]], [[G_PRIV_VAL]]
// LAMBDA: store i32 [[ADD]], i32* [[G_REF]]		// LAMBDA: store i32 [[ADD]], i32* [[G_REF]]
// LAMBDA: call void @__kmpc_end_reduce_nowait(		// LAMBDA: call void @__kmpc_end_reduce_nowait(
// LAMBDA: br label %[[REDUCTION_DONE]]		// LAMBDA: br label %[[REDUCTION_DONE]]
// LAMBDA: [[CASE2]]		// LAMBDA: [[CASE2]]
// LAMBDA: [[G_PRIV_VAL:%.+]] = load i32, i32* [[G_PRIVATE_ADDR]]		// LAMBDA: [[G_PRIV_VAL:%.+]] = load i32, i32* [[G_PRIVATE_ADDR]]
// LAMBDA: atomicrmw add i32* [[G_REF]], i32 [[G_PRIV_VAL]] monotonic, align 4		// LAMBDA: atomicrmw add i32* [[G_REF]], i32 [[G_PRIV_VAL]] monotonic, align 128
// LAMBDA: br label %[[REDUCTION_DONE]]		// LAMBDA: br label %[[REDUCTION_DONE]]
// LAMBDA: [[REDUCTION_DONE]]		// LAMBDA: [[REDUCTION_DONE]]
// LAMBDA: ret void		// LAMBDA: ret void
[&]() {		[&]() {
// LAMBDA: define {{.+}} void [[INNER_LAMBDA]](%{{.+}}* {{[^,]*}} [[ARG_PTR:%.+]])		// LAMBDA: define {{.+}} void [[INNER_LAMBDA]](%{{.+}}* {{[^,]*}} [[ARG_PTR:%.+]])
// LAMBDA: store %{{.+}}* [[ARG_PTR]], %{{.+}}** [[ARG_PTR_REF:%.+]],		// LAMBDA: store %{{.+}}* [[ARG_PTR]], %{{.+}}** [[ARG_PTR_REF:%.+]],
g = 2;		g = 2;
// LAMBDA: [[ARG_PTR:%.+]] = load %{{.+}}, %{{.+}}* [[ARG_PTR_REF]]		// LAMBDA: [[ARG_PTR:%.+]] = load %{{.+}}, %{{.+}}* [[ARG_PTR_REF]]
Show All 40 Lines	#pragma omp parallel reduction(-:g)
// BLOCKS: [[G_VAL:%.+]] = load i32, i32* [[G_REF]]		// BLOCKS: [[G_VAL:%.+]] = load i32, i32* [[G_REF]]
// BLOCKS: [[G_PRIV_VAL:%.+]] = load i32, i32* [[G_PRIVATE_ADDR]]		// BLOCKS: [[G_PRIV_VAL:%.+]] = load i32, i32* [[G_PRIVATE_ADDR]]
// BLOCKS: [[ADD:%.+]] = add nsw i32 [[G_VAL]], [[G_PRIV_VAL]]		// BLOCKS: [[ADD:%.+]] = add nsw i32 [[G_VAL]], [[G_PRIV_VAL]]
// BLOCKS: store i32 [[ADD]], i32* [[G_REF]]		// BLOCKS: store i32 [[ADD]], i32* [[G_REF]]
// BLOCKS: call void @__kmpc_end_reduce_nowait(		// BLOCKS: call void @__kmpc_end_reduce_nowait(
// BLOCKS: br label %[[REDUCTION_DONE]]		// BLOCKS: br label %[[REDUCTION_DONE]]
// BLOCKS: [[CASE2]]		// BLOCKS: [[CASE2]]
// BLOCKS: [[G_PRIV_VAL:%.+]] = load i32, i32* [[G_PRIVATE_ADDR]]		// BLOCKS: [[G_PRIV_VAL:%.+]] = load i32, i32* [[G_PRIVATE_ADDR]]
// BLOCKS: atomicrmw add i32* [[G_REF]], i32 [[G_PRIV_VAL]] monotonic, align 4		// BLOCKS: atomicrmw add i32* [[G_REF]], i32 [[G_PRIV_VAL]] monotonic, align 128
// BLOCKS: br label %[[REDUCTION_DONE]]		// BLOCKS: br label %[[REDUCTION_DONE]]
// BLOCKS: [[REDUCTION_DONE]]		// BLOCKS: [[REDUCTION_DONE]]
// BLOCKS: ret void		// BLOCKS: ret void
^{		^{
// BLOCKS: define {{.+}} void {{@.+}}(i8*		// BLOCKS: define {{.+}} void {{@.+}}(i8*
g = 2;		g = 2;
// BLOCKS-NOT: [[G]]{{[[^:word:]]}}		// BLOCKS-NOT: [[G]]{{[[^:word:]]}}
// BLOCKS: store i{{[0-9]+}} 2, i{{[0-9]+}}*		// BLOCKS: store i{{[0-9]+}} 2, i{{[0-9]+}}*
▲ Show 20 Lines • Show All 499 Lines • ▼ Show 20 Lines
// CHECK: call void @__kmpc_end_reduce_nowait(%{{.+}}* [[REDUCTION_LOC]], i32 [[GTID]], [8 x i32]* [[REDUCTION_LOCK]])		// CHECK: call void @__kmpc_end_reduce_nowait(%{{.+}}* [[REDUCTION_LOC]], i32 [[GTID]], [8 x i32]* [[REDUCTION_LOCK]])

// break;		// break;
// CHECK: br label %[[RED_DONE]]		// CHECK: br label %[[RED_DONE]]

// case 2:		// case 2:
// t_var += t_var_reduction;		// t_var += t_var_reduction;
// CHECK: [[T_VAR_PRIV_VAL:%.+]] = load i{{[0-9]+}}, i{{[0-9]+}}* [[T_VAR_PRIV]]		// CHECK: [[T_VAR_PRIV_VAL:%.+]] = load i{{[0-9]+}}, i{{[0-9]+}}* [[T_VAR_PRIV]]
// CHECK: atomicrmw add i32* [[T_VAR_REF]], i32 [[T_VAR_PRIV_VAL]] monotonic, align 4		// CHECK: atomicrmw add i32* [[T_VAR_REF]], i32 [[T_VAR_PRIV_VAL]] monotonic, align 128

// var = var.operator &(var_reduction);		// var = var.operator &(var_reduction);
// CHECK: call void @__kmpc_critical(		// CHECK: call void @__kmpc_critical(
// CHECK: [[UP:%.+]] = call nonnull align 4 dereferenceable(4) [[S_INT_TY]]* @{{.+}}([[S_INT_TY]]* {{[^,]*}} [[VAR_REF]], [[S_INT_TY]]* nonnull align 4 dereferenceable(4) [[VAR_PRIV]])		// CHECK: [[UP:%.+]] = call nonnull align 4 dereferenceable(4) [[S_INT_TY]]* @{{.+}}([[S_INT_TY]]* {{[^,]*}} [[VAR_REF]], [[S_INT_TY]]* nonnull align 4 dereferenceable(4) [[VAR_PRIV]])
// CHECK: [[BC1:%.+]] = bitcast [[S_INT_TY]]* [[VAR_REF]] to i8*		// CHECK: [[BC1:%.+]] = bitcast [[S_INT_TY]]* [[VAR_REF]] to i8*
// CHECK: [[BC2:%.+]] = bitcast [[S_INT_TY]]* [[UP]] to i8*		// CHECK: [[BC2:%.+]] = bitcast [[S_INT_TY]]* [[UP]] to i8*
// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 128 [[BC1]], i8* align 4 [[BC2]], i64 4, i1 false)		// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 128 [[BC1]], i8* align 4 [[BC2]], i64 4, i1 false)
// CHECK: call void @__kmpc_end_critical(		// CHECK: call void @__kmpc_end_critical(
Show All 13 Lines
// CHECK: call void @{{.+}}([[S_INT_TY]]* {{[^,]*}} [[COND_LVALUE:%.+]], i32 [[CONV]])		// CHECK: call void @{{.+}}([[S_INT_TY]]* {{[^,]*}} [[COND_LVALUE:%.+]], i32 [[CONV]])
// CHECK: [[BC1:%.+]] = bitcast [[S_INT_TY]]* [[VAR1_REF]] to i8*		// CHECK: [[BC1:%.+]] = bitcast [[S_INT_TY]]* [[VAR1_REF]] to i8*
// CHECK: [[BC2:%.+]] = bitcast [[S_INT_TY]]* [[COND_LVALUE]] to i8*		// CHECK: [[BC2:%.+]] = bitcast [[S_INT_TY]]* [[COND_LVALUE]] to i8*
// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 128 [[BC1]], i8* align 4 [[BC2]], i64 4, i1 false)		// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 128 [[BC1]], i8* align 4 [[BC2]], i64 4, i1 false)
// CHECK: call void @__kmpc_end_critical(		// CHECK: call void @__kmpc_end_critical(

// t_var1 = min(t_var1, t_var1_reduction);		// t_var1 = min(t_var1, t_var1_reduction);
// CHECK: [[T_VAR1_PRIV_VAL:%.+]] = load i{{[0-9]+}}, i{{[0-9]+}}* [[T_VAR1_PRIV]]		// CHECK: [[T_VAR1_PRIV_VAL:%.+]] = load i{{[0-9]+}}, i{{[0-9]+}}* [[T_VAR1_PRIV]]
// CHECK: atomicrmw min i32* [[T_VAR1_REF]], i32 [[T_VAR1_PRIV_VAL]] monotonic, align 4		// CHECK: atomicrmw min i32* [[T_VAR1_REF]], i32 [[T_VAR1_PRIV_VAL]] monotonic, align 128

// break;		// break;
// CHECK: br label %[[RED_DONE]]		// CHECK: br label %[[RED_DONE]]
// CHECK: [[RED_DONE]]		// CHECK: [[RED_DONE]]

// CHECK-DAG: call {{.*}} [[S_INT_TY_DESTR]]([[S_INT_TY]]* {{[^,]*}} [[VAR_PRIV]])		// CHECK-DAG: call {{.*}} [[S_INT_TY_DESTR]]([[S_INT_TY]]* {{[^,]*}} [[VAR_PRIV]])
// CHECK-DAG: call {{.*}} [[S_INT_TY_DESTR]]([[S_INT_TY]]*		// CHECK-DAG: call {{.*}} [[S_INT_TY_DESTR]]([[S_INT_TY]]*
// CHECK: ret void		// CHECK: ret void