This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contents

clang/lib/CodeGen/CGAtomic.cpp
clang/lib/CodeGen/CGBuilder.h
clang/lib/CodeGen/CGBuiltin.cpp (inline comments: 2/2)
clang/lib/CodeGen/CGExprScalar.cpp
clang/lib/CodeGen/CGStmtOpenMP.cpp
clang/test/CodeGen/atomic-ops.c
clang/test/CodeGen/ms-intrinsics.c
clang/test/OpenMP/parallel_reduction_codegen.cpp
Differential D97224

Use Address for CGBuilder's CreateAtomicRMW and CreateAtomicCmpXchg.
Accepted, Public

Authored by jyknight on Feb 22 2021, 1:29 PM.


Details

Reviewers

gchatelet
jfb
rjmccall

Summary

Following the LLVM change to add an alignment argument to the
IRBuilder calls, switch Clang's CGBuilder variants to take an Address
type. Then, update all callers to pass through the Address.

There is one interesting exception: Microsoft's
InterlockedCompareExchange128 family of functions are documented to
require (and assume) 16-byte alignment, despite the argument type
being only long long*.
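
To make the InterlockedCompareExchange128 exception concrete, here is a minimal, self-contained sketch of raising a pointer's assumed alignment to a documented minimum. `ToyAddress` and `assumeAtLeast` are hypothetical names used only for illustration; they are not Clang's `Address` class or any CGBuilder API.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical stand-in for a "pointer plus known alignment" pair.
struct ToyAddress {
  void *Pointer;
  std::size_t Alignment; // statically known alignment, in bytes
};

// Raise the assumed alignment to at least MinAlign, never lowering it.
// This models treating a documented API requirement as a promise from
// the caller, independent of the declared pointee type.
ToyAddress assumeAtLeast(ToyAddress Addr, std::size_t MinAlign) {
  return {Addr.Pointer, std::max(Addr.Alignment, MinAlign)};
}
```

With this model, an address known only to be 8-byte aligned is treated as 16-byte aligned after `assumeAtLeast(Addr, 16)`, while an address already known to be 32-byte aligned keeps its alignment.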

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jyknight requested review of this revision. (Feb 22 2021, 1:29 PM)

jyknight created this revision.

Herald added a project: Restricted Project. (Feb 22 2021, 1:29 PM)

Herald added a subscriber: cfe-commits.

Harbormaster completed remote builds in B90275: Diff 325555. (Feb 22 2021, 1:31 PM)

jyknight mentioned this in rG24539f1ef247: Add Alignment argument to IRBuilder CreateAtomicRMW and CreateAtomicCmpXchg. (Feb 25 2021, 3:30 PM)

Do we really consider the existing atomic intrinsics to not impose added alignment restrictions? I'm somewhat concerned that we're going to produce IR here that's either really suboptimal or, worse, unimplementable, just because we over-interpreted some cue about alignment. I guess it would only be a significant problem on a target where types are frequently under-aligned for what atomics need, which is not typical, or when the user is doing atomics on a field of something like a packed struct.

clang/lib/CodeGen/CGBuiltin.cpp
354	Since you're changing this code anyway, please make this do `CreateElementBitCast(DestAddr, Int128Ty)` so that it's address-space-correct. There are a lot of other lines in the patch that would benefit from the same thing.
3765	This should be using `EmitPointerWithAlignment` instead of assuming an alignment of 1.

Address review comments.

In D97224#2596410, @rjmccall wrote:

> Do we really consider the existing atomic intrinsics to not impose added alignment restrictions? I'm somewhat concerned that we're going to produce IR here that's either really suboptimal or, worse, unimplementable, just because we over-interpreted some cue about alignment. I guess it would only be a significant problem on a target where types are frequently under-aligned for what atomics need, which is not typical, or when the user is doing atomics on a field of something like a packed struct.

For the __atomic_* intrinsics, we don't consider those as imposing additional alignment restrictions -- currently, we avoid generating the LLVM IR instruction if it's below natural alignment, since we couldn't specify alignment on the IR instruction. (Now that we have alignment on the LLVM IR operations, I'd like to eventually get rid of that logic from Clang, since it's also handled by LLVM.)

For other intrinsics (e.g. MSVCIntrin::_InterlockedAnd, Builtin::BIsync_fetch_and_add_4, NVPTX::BInvvm_atom_add_gen_i, and the others in those 3 function families), we currently entirely ignore the alignment, and simply assume the argument is naturally-aligned. So, yes, this change could potentially affect the behavior for underaligned types.

So, I could change these to explicitly increase the assumed alignment of their arguments, like I did for InterlockedCompareExchange128. My inclination is not to do so, however...it doesn't seem like a good idea in general to ignore type alignment. But, I'd not be opposed to doing that, if there's a good reason.
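
The rule described above for the __atomic_* builtins can be modeled very roughly as follows. This is a simplified sketch, assuming the only inputs are the access size and the statically known alignment; `Lowering` and `chooseLowering` are hypothetical names, and the actual logic in CGAtomic.cpp may take more into account.

```cpp
#include <cstddef>

enum class Lowering { InlineInstruction, LibCall };

// Simplified model: emit the IR atomic instruction only when the known
// alignment is at least the natural alignment (taken here to equal the
// access size); otherwise fall back to a library call.
constexpr Lowering chooseLowering(std::size_t SizeInBytes,
                                  std::size_t KnownAlignInBytes) {
  return KnownAlignInBytes >= SizeInBytes ? Lowering::InlineInstruction
                                          : Lowering::LibCall;
}
```

Under this model, a 4-byte access through a pointer known to be 4-byte aligned takes the inline path, while the same access through a 1-byte-aligned pointer takes the libcall path.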

In D97224#2604069, @jyknight wrote:

> For the __atomic_* intrinsics, we don't consider those as imposing additional alignment restrictions -- currently, we avoid generating the LLVM IR instruction if it's below natural alignment, since we couldn't specify alignment on the IR instruction. (Now that we have alignment on the LLVM IR operations, I'd like to eventually get rid of that logic from Clang, since it's also handled by LLVM.)

Frontends ultimately have the responsibility of making sure they emit code that follows the platform ABI for atomics. In most other parts of the ABI, we usually find that it is possible (even necessary) to delegate *part* of that ABI responsibility down to LLVM (e.g. to emit inline atomic sequences, which I suppose frontends could do with inline assembly, but there are obvious reasons to prefer a more semantic IR), but that at least in some cases, there is information that we cannot pass down and so must handle in the frontend. I am somewhat skeptical that atomics are going to prove an exception here. At the very least, there will always be *some* operations that we have to expand into compare-exchange loops in the frontend, for want of a sufficiently powerful instruction/intrinsic. That said, if you find that you can actually free Clang from having to make certain decisions here, that's great; I just want you to understand that usually we find that there are limits to what we can usefully delegate to LLVM, and ultimately the responsibility is ours.

> For other intrinsics (e.g. MSVCIntrin::_InterlockedAnd, Builtin::BIsync_fetch_and_add_4, NVPTX::BInvvm_atom_add_gen_i, and the others in those 3 function families), we currently entirely ignore the alignment, and simply assume the argument is naturally-aligned. So, yes, this change could potentially affect the behavior for underaligned types.
>
> So, I could change these to explicitly increase the assumed alignment of their arguments, like I did for InterlockedCompareExchange128. My inclination is not to do so, however...it doesn't seem like a good idea in general to ignore type alignment. But, I'd not be opposed to doing that, if there's a good reason.

The idea here is not to "ignore type alignment". EmitPointerWithAlignment will sometimes return an alignment for a pointer that's less than the alignment of the pointee type, e.g. because you're taking the address of a packed struct field. The critical question is whether the atomic builtins ought to make an effort to honor that reduced alignment, even if it leads to terrible code, or if we should treat the use of the builtin as a user promise that the pointer is actually more aligned than you might think from the information statically available. That has to be resolved by the semantics of the builtin, and unfortunately intrinsic documentation is often spotty on this point. For example, the MSDN documentation for InterlockedIncrement says that it requires 32-bit alignment, but the documentation for InterlockedAdd doesn't. It seems extremely unlikely that that is meant to be read as a statement that InterlockedAdd is actually more permissive. I would say that all of the Interlocked APIs ought to be read as guaranteeing the natural, full-width alignment for their operation. I'm less certain about what to do with the __atomic_* builtins, because maybe we should take this as an opportunity to try to "do the right thing" with under-aligned atomics; on the other hand, that assumes that we always *can* "do the right thing", and I don't want Clang to start miscompiling or refusing to compile code because we're trying to do something above and beyond the normal language semantics. It might be more reasonable to always use the type alignment as a minimum.

> The idea here is not to "ignore type alignment". EmitPointerWithAlignment will sometimes return an alignment for a pointer that's less than the alignment of the pointee type, e.g. because you're taking the address of a packed struct field. The critical question is whether the atomic builtins ought to make an effort to honor that reduced alignment, even if it leads to terrible code, or if we should treat the use of the builtin as a user promise that the pointer is actually more aligned than you might think from the information statically available.

Agreed -- that is the question. In general, I'd like to default to basing decisions only on the statically-known alignments, because I think that'll typically be the best choice for users. Where there's something like a packed struct, it's likely that the values will end up under-aligned in fact, not just in the compiler's understanding.
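
A small illustration of the packed-struct case under discussion. This sketch assumes the GCC/Clang `__attribute__((packed))` extension and a target where `alignof(std::int32_t)` is 4; `PackedRecord` and the constant names are made up for the example.

```cpp
#include <cstddef>
#include <cstdint>

// A packed struct removes padding, so Counter lands at offset 1.
struct __attribute__((packed)) PackedRecord {
  char Tag;
  std::int32_t Counter;
};

// Offset of the field inside the struct, and the alignment its type
// would normally have.
constexpr std::size_t CounterOffset = offsetof(PackedRecord, Counter);
constexpr std::size_t TypeAlign = alignof(std::int32_t);

// True when the field's position is not a multiple of its type alignment,
// i.e. the field is under-aligned in fact, not just in the compiler's view.
constexpr bool CounterIsUnderAligned = (CounterOffset % TypeAlign) != 0;
```

Here a pointer to `Counter` has a statically known alignment of 1 even though its pointee type is normally 4-byte aligned, which is the situation where the statically known alignment and the type alignment disagree.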

> For example, the MSDN documentation for InterlockedIncrement says that it requires 32-bit alignment [...]. I would say that all of the Interlocked APIs ought to be read as guaranteeing the natural, full-width alignment for their operation.

I had missed that it was documented in some of the other functions (beyond InterlockedCompareExchange128). I'll change the remainder of the _Interlocked APIs to assume at least natural alignment.

> I'm less certain about what to do with the __atomic_* builtins

The __atomic builtins have already been supporting under-aligned pointers all along -- and that behavior is unchanged by this patch.

Harbormaster completed remote builds in B92108: Diff 328219. (Mar 4 2021, 11:59 PM)

Use natural alignment for _Interlocked* intrinsics.

Harbormaster completed remote builds in B93387: Diff 330088. (Mar 11 2021, 9:37 PM)

In D97224#2604537, @jyknight wrote:

> The __atomic builtins have already been supporting under-aligned pointers all along -- and that behavior is unchanged by this patch.

How so? Clang hasn't been passing down an alignment, which means that it's been building atomicrmw instructions with the natural alignment for the IR type, which means we've been assuming that all pointers have at least that alignment. The LLVM documentation even says that atomicrmw doesn't allow under-alignment.

In D97224#2621328, @rjmccall wrote:

> How so? Clang hasn't been passing down an alignment, which means that it's been building atomicrmw instructions with the natural alignment for the IR type, which means we've been assuming that all pointers have at least that alignment. The LLVM documentation even says that atomicrmw doesn't allow under-alignment.

We construct a libcall to __atomic_* routines in the frontend upon seeing an underaligned argument, instead of letting the backend handle it -- there's a bunch of code at https://github.com/llvm/llvm-project/blob/bc4a5bdce4af82a5522869d8a154e9e15cf809df/clang/lib/CodeGen/CGAtomic.cpp#L790 to handle that. I'd like to rip most of that out in the future, and just let the backend handle it in more cases.

E.g.

typedef int __attribute__((aligned(1))) unaligned_int;
bool cmpxchg_u(unaligned_int *x) {
    int expected = 2;
    return __atomic_compare_exchange_n(x, &expected, 2, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

generates a libcall to __atomic_compare_exchange_4 (in IR, generated in the Clang code), instead of creating a cmpxchg IR instruction, due to the under-alignment. (Although, sigh, I've only just noticed we actually have a problem here -- the __atomic_*_SIZE libcalls are supposed to require an aligned argument -- so we should be using __atomic_compare_exchange (without size suffix) instead. Gah.)
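
For comparison, the generic (unsuffixed) GCC/Clang builtin takes pointers to the expected and desired values instead of passing the desired value directly. The sketch below uses it on an ordinarily aligned int, so it only shows the call shape; it does not demonstrate which libcall Clang emits for an under-aligned object. `cmpxchg_generic` is a hypothetical name.

```cpp
// Generic form of the builtin: expected and desired are passed by pointer.
// Returns true if *x equaled 2 and was (re)written with 2.
bool cmpxchg_generic(int *x) {
  int expected = 2;
  int desired = 2;
  return __atomic_compare_exchange(x, &expected, &desired, /*weak=*/false,
                                   __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```

Called on an int holding 2, the exchange succeeds; called on an int holding any other value, it fails and leaves the object unchanged.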

Ping. I think this is correct, and would like to commit.

Alright, well, this does look cleaner.

This revision is now accepted and ready to land. (Jun 23 2021, 12:54 PM)

Revision Contents

Path                                                Size
clang/lib/CodeGen/ (five files, names not shown)    8, 17, 145, 6, and 2 lines
clang/test/CodeGen/atomic-ops.c                     4 lines
clang/test/CodeGen/ms-intrinsics.c                  4 lines
clang/test/OpenMP/parallel_reduction_codegen.cpp    8 lines

Diff 325555

clang/lib/CodeGen/CGAtomic.cpp

@@ 368 unchanged lines hidden @@ static void emitAtomicCmpXchg(CodeGenFunction &CGF, AtomicExpr *E, bool IsWeak,
                               llvm::AtomicOrdering SuccessOrder,
                               llvm::AtomicOrdering FailureOrder,
                               llvm::SyncScope::ID Scope) {
   // Note that cmpxchg doesn't support weak cmpxchg, at least at the moment.
   llvm::Value *Expected = CGF.Builder.CreateLoad(Val1);
   llvm::Value *Desired = CGF.Builder.CreateLoad(Val2);

   llvm::AtomicCmpXchgInst *Pair = CGF.Builder.CreateAtomicCmpXchg(
-      Ptr.getPointer(), Expected, Desired, SuccessOrder, FailureOrder,
-      Scope);
+      Ptr, Expected, Desired, SuccessOrder, FailureOrder, Scope);
   Pair->setVolatile(E->isVolatile());
   Pair->setWeak(IsWeak);

   // Cmp holds the result of the compare-exchange operation: true on success,
   // false on failure.
   llvm::Value *Old = CGF.Builder.CreateExtractValue(Pair, 0);
   llvm::Value *Cmp = CGF.Builder.CreateExtractValue(Pair, 1);

@@ 284 unchanged lines hidden @@ case AtomicExpr::AO__atomic_nand_fetch:
     LLVM_FALLTHROUGH;
   case AtomicExpr::AO__atomic_fetch_nand:
     Op = llvm::AtomicRMWInst::Nand;
     break;
   }

   llvm::Value *LoadVal1 = CGF.Builder.CreateLoad(Val1);
   llvm::AtomicRMWInst *RMWI =
-      CGF.Builder.CreateAtomicRMW(Op, Ptr.getPointer(), LoadVal1, Order, Scope);
+      CGF.Builder.CreateAtomicRMW(Op, Ptr, LoadVal1, Order, Scope);
   RMWI->setVolatile(E->isVolatile());

   // For __atomic_*_fetch operations, perform the operation again to
   // determine the value which was written.
   llvm::Value *Result = RMWI;
   if (PostOpMinMax)
     Result = EmitPostAtomicMinMax(CGF.Builder, E->getOp(),
                                   E->getValueType()->isSignedIntegerType(),
@@ 986 unchanged lines hidden @@ llvm::Value *AtomicInfo::convertRValueToInt(RValue RVal) const {
   return CGF.Builder.CreateLoad(Addr);
 }

 std::pair<llvm::Value *, llvm::Value *> AtomicInfo::EmitAtomicCompareExchangeOp(
     llvm::Value *ExpectedVal, llvm::Value *DesiredVal,
     llvm::AtomicOrdering Success, llvm::AtomicOrdering Failure, bool IsWeak) {
   // Do the atomic store.
   Address Addr = getAtomicAddressAsAtomicIntPointer();
-  auto *Inst = CGF.Builder.CreateAtomicCmpXchg(Addr.getPointer(),
-                                               ExpectedVal, DesiredVal,
+  auto *Inst = CGF.Builder.CreateAtomicCmpXchg(Addr, ExpectedVal, DesiredVal,
                                                Success, Failure);
   // Other decoration.
   Inst->setVolatile(LVal.isVolatileQualified());
   Inst->setWeak(IsWeak);

   // Okay, turn that back into the original value type.
   auto *PreviousVal = CGF.Builder.CreateExtractValue(Inst, /*Idxs=*/0);
   auto *SuccessFailureVal = CGF.Builder.CreateExtractValue(Inst, /*Idxs=*/1);
@@ 430 unchanged lines hidden @@

clang/lib/CodeGen/CGBuilder.h

@@ 126 unchanged lines hidden @@ public:
   }

   /// Emit a store to an i1 flag variable.
   llvm::StoreInst *CreateFlagStore(bool Value, llvm::Value *Addr) {
     assert(Addr->getType()->getPointerElementType() == getInt1Ty());
     return CreateAlignedStore(getInt1(Value), Addr, CharUnits::One());
   }

-  // Temporarily use old signature; clang will be updated to an Address overload
-  // in a subsequent patch.
   llvm::AtomicCmpXchgInst *
-  CreateAtomicCmpXchg(llvm::Value *Ptr, llvm::Value *Cmp, llvm::Value *New,
+  CreateAtomicCmpXchg(Address Addr, llvm::Value *Cmp, llvm::Value *New,
                       llvm::AtomicOrdering SuccessOrdering,
                       llvm::AtomicOrdering FailureOrdering,
                       llvm::SyncScope::ID SSID = llvm::SyncScope::System) {
     return CGBuilderBaseTy::CreateAtomicCmpXchg(
-        Ptr, Cmp, New, llvm::MaybeAlign(), SuccessOrdering, FailureOrdering,
-        SSID);
+        Addr.getPointer(), Cmp, New, Addr.getAlignment().getAsAlign(),
+        SuccessOrdering, FailureOrdering, SSID);
   }

-  // Temporarily use old signature; clang will be updated to an Address overload
-  // in a subsequent patch.
   llvm::AtomicRMWInst *
-  CreateAtomicRMW(llvm::AtomicRMWInst::BinOp Op, llvm::Value *Ptr,
-                  llvm::Value *Val, llvm::AtomicOrdering Ordering,
+  CreateAtomicRMW(llvm::AtomicRMWInst::BinOp Op, Address Addr, llvm::Value *Val,
+                  llvm::AtomicOrdering Ordering,
                   llvm::SyncScope::ID SSID = llvm::SyncScope::System) {
-    return CGBuilderBaseTy::CreateAtomicRMW(Op, Ptr, Val, llvm::MaybeAlign(),
+    return CGBuilderBaseTy::CreateAtomicRMW(Op, Addr.getPointer(), Val,
+                                            Addr.getAlignment().getAsAlign(),
                                             Ordering, SSID);
   }

   using CGBuilderBaseTy::CreateBitCast;
   Address CreateBitCast(Address Addr, llvm::Type *Ty,
                         const llvm::Twine &Name = "") {
     return Address(CreateBitCast(Addr.getPointer(), Ty, Name),
                    Addr.getAlignment());
@@ 184 unchanged lines hidden @@

clang/lib/CodeGen/CGBuiltin.cpp


@@ 137 unchanged lines hidden @@ static Value *MakeBinaryAtomicValue(
     CodeGenFunction &CGF, llvm::AtomicRMWInst::BinOp Kind, const CallExpr *E,
     AtomicOrdering Ordering = AtomicOrdering::SequentiallyConsistent) {
   QualType T = E->getType();
   assert(E->getArg(0)->getType()->isPointerType());
   assert(CGF.getContext().hasSameUnqualifiedType(T,
                                   E->getArg(0)->getType()->getPointeeType()));
   assert(CGF.getContext().hasSameUnqualifiedType(T, E->getArg(1)->getType()));

-  llvm::Value *DestPtr = CGF.EmitScalarExpr(E->getArg(0));
-  unsigned AddrSpace = DestPtr->getType()->getPointerAddressSpace();
+  Address DestAddr = CGF.EmitPointerWithAlignment(E->getArg(0));
+  unsigned AddrSpace = DestAddr.getAddressSpace();

   llvm::IntegerType *IntType =
     llvm::IntegerType::get(CGF.getLLVMContext(),
                            CGF.getContext().getTypeSize(T));
   llvm::Type *IntPtrType = IntType->getPointerTo(AddrSpace);

-  llvm::Value *Args[2];
-  Args[0] = CGF.Builder.CreateBitCast(DestPtr, IntPtrType);
-  Args[1] = CGF.EmitScalarExpr(E->getArg(1));
-  llvm::Type *ValueType = Args[1]->getType();
-  Args[1] = EmitToInt(CGF, Args[1], T, IntType);
+  DestAddr = CGF.Builder.CreateBitCast(DestAddr, IntPtrType);
+  llvm::Value *Val = CGF.EmitScalarExpr(E->getArg(1));
+  llvm::Type *ValueType = Val->getType();
+  Val = EmitToInt(CGF, Val, T, IntType);

-  llvm::Value *Result = CGF.Builder.CreateAtomicRMW(
-      Kind, Args[0], Args[1], Ordering);
+  llvm::Value *Result =
+      CGF.Builder.CreateAtomicRMW(Kind, DestAddr, Val, Ordering);
   return EmitFromInt(CGF, Result, T, ValueType);
 }

 static Value *EmitNontemporalStore(CodeGenFunction &CGF, const CallExpr *E) {
   Value *Val = CGF.EmitScalarExpr(E->getArg(0));
   Value *Address = CGF.EmitScalarExpr(E->getArg(1));

   // Convert the type of the pointer to a pointer to the stored type.
@@ 29 unchanged lines hidden @@ static RValue EmitBinaryAtomicPost(CodeGenFunction &CGF,
                                    Instruction::BinaryOps Op,
                                    bool Invert = false) {
   QualType T = E->getType();
   assert(E->getArg(0)->getType()->isPointerType());
   assert(CGF.getContext().hasSameUnqualifiedType(T,
                                   E->getArg(0)->getType()->getPointeeType()));
   assert(CGF.getContext().hasSameUnqualifiedType(T, E->getArg(1)->getType()));

-  llvm::Value *DestPtr = CGF.EmitScalarExpr(E->getArg(0));
-  unsigned AddrSpace = DestPtr->getType()->getPointerAddressSpace();
+  Address DestAddr = CGF.EmitPointerWithAlignment(E->getArg(0));
+  unsigned AddrSpace = DestAddr.getAddressSpace();

   llvm::IntegerType *IntType =
     llvm::IntegerType::get(CGF.getLLVMContext(),
                            CGF.getContext().getTypeSize(T));
   llvm::Type *IntPtrType = IntType->getPointerTo(AddrSpace);

-  llvm::Value *Args[2];
-  Args[1] = CGF.EmitScalarExpr(E->getArg(1));
-  llvm::Type *ValueType = Args[1]->getType();
-  Args[1] = EmitToInt(CGF, Args[1], T, IntType);
-  Args[0] = CGF.Builder.CreateBitCast(DestPtr, IntPtrType);
+  DestAddr = CGF.Builder.CreateBitCast(DestAddr, IntPtrType);
+  llvm::Value *Val = CGF.EmitScalarExpr(E->getArg(1));
+  llvm::Type *ValueType = Val->getType();
+  Val = EmitToInt(CGF, Val, T, IntType);

   llvm::Value *Result = CGF.Builder.CreateAtomicRMW(
-      Kind, Args[0], Args[1], llvm::AtomicOrdering::SequentiallyConsistent);
-  Result = CGF.Builder.CreateBinOp(Op, Result, Args[1]);
+      Kind, DestAddr, Val, llvm::AtomicOrdering::SequentiallyConsistent);
+  Result = CGF.Builder.CreateBinOp(Op, Result, Val);
   if (Invert)
     Result =
         CGF.Builder.CreateBinOp(llvm::Instruction::Xor, Result,
                                 llvm::ConstantInt::getAllOnesValue(IntType));
   Result = EmitFromInt(CGF, Result, T, ValueType);
   return RValue::get(Result);
 }

@@ 9 unchanged lines hidden @@
 ///
 /// @returns result of cmpxchg, according to ReturnBool
 ///
 /// Note: In order to lower Microsoft's _InterlockedCompareExchange* intrinsics
 /// invoke the function EmitAtomicCmpXchgForMSIntrin.
 static Value *MakeAtomicCmpXchgValue(CodeGenFunction &CGF, const CallExpr *E,
                                      bool ReturnBool) {
   QualType T = ReturnBool ? E->getArg(1)->getType() : E->getType();
-  llvm::Value *DestPtr = CGF.EmitScalarExpr(E->getArg(0));
-  unsigned AddrSpace = DestPtr->getType()->getPointerAddressSpace();
+  Address DestAddr = CGF.EmitPointerWithAlignment(E->getArg(0));
+  unsigned AddrSpace = DestAddr.getAddressSpace();

   llvm::IntegerType *IntType = llvm::IntegerType::get(
       CGF.getLLVMContext(), CGF.getContext().getTypeSize(T));
   llvm::Type *IntPtrType = IntType->getPointerTo(AddrSpace);

-  Value *Args[3];
-  Args[0] = CGF.Builder.CreateBitCast(DestPtr, IntPtrType);
-  Args[1] = CGF.EmitScalarExpr(E->getArg(1));
-  llvm::Type *ValueType = Args[1]->getType();
-  Args[1] = EmitToInt(CGF, Args[1], T, IntType);
-  Args[2] = EmitToInt(CGF, CGF.EmitScalarExpr(E->getArg(2)), T, IntType);
+  DestAddr = CGF.Builder.CreateBitCast(DestAddr, IntPtrType);
+  Value *Cmp = CGF.EmitScalarExpr(E->getArg(1));
+  llvm::Type *ValueType = Cmp->getType();
+  Cmp = EmitToInt(CGF, Cmp, T, IntType);
+  Value *New = EmitToInt(CGF, CGF.EmitScalarExpr(E->getArg(2)), T, IntType);

   Value *Pair = CGF.Builder.CreateAtomicCmpXchg(
-      Args[0], Args[1], Args[2], llvm::AtomicOrdering::SequentiallyConsistent,
+      DestAddr, Cmp, New, llvm::AtomicOrdering::SequentiallyConsistent,
       llvm::AtomicOrdering::SequentiallyConsistent);
   if (ReturnBool)
     // Extract boolean success flag and zext it to int.
     return CGF.Builder.CreateZExt(CGF.Builder.CreateExtractValue(Pair, 1),
                                   CGF.ConvertType(E->getType()));
   else
     // Extract old value and emit it using the same type as compare value.
     return EmitFromInt(CGF, CGF.Builder.CreateExtractValue(Pair, 0), T,
@@ 19 unchanged lines hidden @@ Value *EmitAtomicCmpXchgForMSIntrin(CodeGenFunction &CGF, const CallExpr *E,
   assert(E->getArg(0)->getType()->isPointerType());
   assert(CGF.getContext().hasSameUnqualifiedType(
       E->getType(), E->getArg(0)->getType()->getPointeeType()));
   assert(CGF.getContext().hasSameUnqualifiedType(E->getType(),
                                                  E->getArg(1)->getType()));
   assert(CGF.getContext().hasSameUnqualifiedType(E->getType(),
                                                  E->getArg(2)->getType()));

-  auto *Destination = CGF.EmitScalarExpr(E->getArg(0));
+  Address DestAddr = CGF.EmitPointerWithAlignment(E->getArg(0));
   auto *Comparand = CGF.EmitScalarExpr(E->getArg(2));
   auto *Exchange = CGF.EmitScalarExpr(E->getArg(1));

   // For Release ordering, the failure ordering should be Monotonic.
   auto FailureOrdering = SuccessOrdering == AtomicOrdering::Release ?
                          AtomicOrdering::Monotonic :
                          SuccessOrdering;

   // The atomic instruction is marked volatile for consistency with MSVC. This
   // blocks the few atomics optimizations that LLVM has. If we want to optimize
   // _Interlocked* operations in the future, we will have to remove the volatile
   // marker.
   auto *Result = CGF.Builder.CreateAtomicCmpXchg(
-      Destination, Comparand, Exchange,
-      SuccessOrdering, FailureOrdering);
+      DestAddr, Comparand, Exchange, SuccessOrdering, FailureOrdering);
   Result->setVolatile(true);
   return CGF.Builder.CreateExtractValue(Result, 0);
 }

 // 64-bit Microsoft platforms support 128 bit cmpxchg operations. They are
 // prototyped like this:
 //
 // unsigned char _InterlockedCompareExchange128...(
 //     __int64 volatile * _Destination,
 //     __int64 _ExchangeHigh,
 //     __int64 _ExchangeLow,
 //     __int64 * _ComparandResult);
+//
+// Note that Destination is assumed to be 16-byte aligned, despite being typed
+// int64.

static Value *EmitAtomicCmpXchg128ForMSIntrin(CodeGenFunction &CGF,		static Value *EmitAtomicCmpXchg128ForMSIntrin(CodeGenFunction &CGF,
const CallExpr *E,		const CallExpr *E,
AtomicOrdering SuccessOrdering) {		AtomicOrdering SuccessOrdering) {
assert(E->getNumArgs() == 4);		assert(E->getNumArgs() == 4);
llvm::Value *Destination = CGF.EmitScalarExpr(E->getArg(0));		Address DestAddr = CGF.EmitPointerWithAlignment(E->getArg(0));
llvm::Value *ExchangeHigh = CGF.EmitScalarExpr(E->getArg(1));		llvm::Value *ExchangeHigh = CGF.EmitScalarExpr(E->getArg(1));
llvm::Value *ExchangeLow = CGF.EmitScalarExpr(E->getArg(2));		llvm::Value *ExchangeLow = CGF.EmitScalarExpr(E->getArg(2));
llvm::Value *ComparandPtr = CGF.EmitScalarExpr(E->getArg(3));		Address ComparandAddr = CGF.EmitPointerWithAlignment(E->getArg(3));

		// Force at least 16-byte alignment.
		if (DestAddr.getAlignment() < CharUnits::fromQuantity(16))
		DestAddr = Address(DestAddr.getPointer(), CharUnits::fromQuantity(16));

assert(Destination->getType()->isPointerTy());		assert(DestAddr.getType()->isPointerTy());
assert(!ExchangeHigh->getType()->isPointerTy());		assert(!ExchangeHigh->getType()->isPointerTy());
assert(!ExchangeLow->getType()->isPointerTy());		assert(!ExchangeLow->getType()->isPointerTy());
assert(ComparandPtr->getType()->isPointerTy());		assert(ComparandAddr.getType()->isPointerTy());

// For Release ordering, the failure ordering should be Monotonic.		// For Release ordering, the failure ordering should be Monotonic.
auto FailureOrdering = SuccessOrdering == AtomicOrdering::Release		auto FailureOrdering = SuccessOrdering == AtomicOrdering::Release
? AtomicOrdering::Monotonic		? AtomicOrdering::Monotonic
: SuccessOrdering;		: SuccessOrdering;

// Convert to i128 pointers and values.		// Convert to i128 pointers and values.
llvm::Type *Int128Ty = llvm::IntegerType::get(CGF.getLLVMContext(), 128);		llvm::Type *Int128Ty = llvm::IntegerType::get(CGF.getLLVMContext(), 128);
llvm::Type *Int128PtrTy = Int128Ty->getPointerTo();		llvm::Type *Int128PtrTy = Int128Ty->getPointerTo();
Destination = CGF.Builder.CreateBitCast(Destination, Int128PtrTy);		DestAddr = CGF.Builder.CreateBitCast(DestAddr, Int128PtrTy);
		rjmccall (inline comment, marked Done): Since you're changing this code anyway, please make this do `CreateElementBitCast(DestAddr, Int128Ty)` so that it's address-space-correct. There are a lot of other lines in the patch that would benefit from the same thing.
Address ComparandResult(CGF.Builder.CreateBitCast(ComparandPtr, Int128PtrTy),		ComparandAddr = CGF.Builder.CreateBitCast(ComparandAddr, Int128PtrTy);
CGF.getContext().toCharUnitsFromBits(128));

// (((i128)hi) << 64) | ((i128)lo)		// (((i128)hi) << 64) | ((i128)lo)
ExchangeHigh = CGF.Builder.CreateZExt(ExchangeHigh, Int128Ty);		ExchangeHigh = CGF.Builder.CreateZExt(ExchangeHigh, Int128Ty);
ExchangeLow = CGF.Builder.CreateZExt(ExchangeLow, Int128Ty);		ExchangeLow = CGF.Builder.CreateZExt(ExchangeLow, Int128Ty);
ExchangeHigh =		ExchangeHigh =
CGF.Builder.CreateShl(ExchangeHigh, llvm::ConstantInt::get(Int128Ty, 64));		CGF.Builder.CreateShl(ExchangeHigh, llvm::ConstantInt::get(Int128Ty, 64));
llvm::Value *Exchange = CGF.Builder.CreateOr(ExchangeHigh, ExchangeLow);		llvm::Value *Exchange = CGF.Builder.CreateOr(ExchangeHigh, ExchangeLow);

// Load the comparand for the instruction.		// Load the comparand for the instruction.
llvm::Value *Comparand = CGF.Builder.CreateLoad(ComparandResult);		llvm::Value *Comparand = CGF.Builder.CreateLoad(ComparandAddr);

auto *CXI = CGF.Builder.CreateAtomicCmpXchg(Destination, Comparand, Exchange,		auto *CXI = CGF.Builder.CreateAtomicCmpXchg(DestAddr, Comparand, Exchange,
SuccessOrdering, FailureOrdering);		SuccessOrdering, FailureOrdering);

// The atomic instruction is marked volatile for consistency with MSVC. This		// The atomic instruction is marked volatile for consistency with MSVC. This
// blocks the few atomics optimizations that LLVM has. If we want to optimize		// blocks the few atomics optimizations that LLVM has. If we want to optimize
// _Interlocked* operations in the future, we will have to remove the volatile		// _Interlocked* operations in the future, we will have to remove the volatile
// marker.		// marker.
CXI->setVolatile(true);		CXI->setVolatile(true);

// Store the result as an outparameter.		// Store the result as an outparameter.
CGF.Builder.CreateStore(CGF.Builder.CreateExtractValue(CXI, 0),		CGF.Builder.CreateStore(CGF.Builder.CreateExtractValue(CXI, 0),
ComparandResult);		ComparandAddr);

// Get the success boolean and zero extend it to i8.		// Get the success boolean and zero extend it to i8.
Value *Success = CGF.Builder.CreateExtractValue(CXI, 1);		Value *Success = CGF.Builder.CreateExtractValue(CXI, 1);
return CGF.Builder.CreateZExt(Success, CGF.Int8Ty);		return CGF.Builder.CreateZExt(Success, CGF.Int8Ty);
}		}
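
Reviewer-style note on the helper above: the failure-ordering choice and the write-back through the comparand pointer can be sketched outside Clang's CodeGen API. This is only an illustration on a 64-bit `std::atomic`, not the 128-bit operation the patch emits. The names `failureOrderFor` and `compareExchange` are hypothetical and do not appear in the patch, and the sketch assumes the success ordering is never `acq_rel`.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper mirroring the FailureOrdering selection in the diff:
// a release success ordering uses relaxed (LLVM "monotonic") on failure,
// every other ordering is reused unchanged.
inline std::memory_order failureOrderFor(std::memory_order success) {
  // Assumption of this sketch: acq_rel is never requested as the success
  // ordering, since it would not be a valid failure ordering.
  assert(success != std::memory_order_acq_rel);
  return success == std::memory_order_release ? std::memory_order_relaxed
                                              : success;
}

// Compare-exchange that leaves the observed value in `expected`, similar to
// how the emitted code stores the old value back through the comparand.
// Returns true when the exchange happened.
inline bool compareExchange(std::atomic<long long> &dest, long long &expected,
                            long long desired, std::memory_order success) {
  return dest.compare_exchange_strong(expected, desired, success,
                                      failureOrderFor(success));
}
```

On a failed exchange, `expected` holds the value that was actually found in `dest`, which is what a caller of the intrinsic would read back from the comparand buffer.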

static Value *EmitAtomicIncrementValue(CodeGenFunction &CGF, const CallExpr *E,		static Value *EmitAtomicIncrementValue(CodeGenFunction &CGF, const CallExpr *E,
AtomicOrdering Ordering = AtomicOrdering::SequentiallyConsistent) {		AtomicOrdering Ordering = AtomicOrdering::SequentiallyConsistent) {
assert(E->getArg(0)->getType()->isPointerType());		assert(E->getArg(0)->getType()->isPointerType());

auto *IntTy = CGF.ConvertType(E->getType());		auto *IntTy = CGF.ConvertType(E->getType());
		Address DestAddr = CGF.EmitPointerWithAlignment(E->getArg(0));
auto *Result = CGF.Builder.CreateAtomicRMW(		auto *Result = CGF.Builder.CreateAtomicRMW(
AtomicRMWInst::Add,		AtomicRMWInst::Add, DestAddr, ConstantInt::get(IntTy, 1), Ordering);
CGF.EmitScalarExpr(E->getArg(0)),
ConstantInt::get(IntTy, 1),
Ordering);
return CGF.Builder.CreateAdd(Result, ConstantInt::get(IntTy, 1));		return CGF.Builder.CreateAdd(Result, ConstantInt::get(IntTy, 1));
}		}
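
Reviewer-style note: the helper above emits an `atomicrmw add`, which yields the value held before the update, and then adds 1 again so that the caller receives the new value. A minimal standalone sketch of the same pattern with `std::atomic` follows; the function names are hypothetical and the sketch does not use Clang's CodeGen API.

```cpp
#include <atomic>

// fetch_add returns the value held *before* the update, like atomicrmw add.
// Adding 1 to that result reproduces the "return the new value" behaviour.
inline long incrementAndGet(std::atomic<long> &dest) {
  long old = dest.fetch_add(1, std::memory_order_seq_cst);
  return old + 1;
}

// Same pattern for the decrement helper: fetch_sub, then subtract 1.
inline long decrementAndGet(std::atomic<long> &dest) {
  long old = dest.fetch_sub(1, std::memory_order_seq_cst);
  return old - 1;
}
```

The second arithmetic step operates on the returned value only, so it does not touch memory and does not need to be atomic.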

static Value *EmitAtomicDecrementValue(CodeGenFunction &CGF, const CallExpr *E,		static Value *EmitAtomicDecrementValue(CodeGenFunction &CGF, const CallExpr *E,
AtomicOrdering Ordering = AtomicOrdering::SequentiallyConsistent) {		AtomicOrdering Ordering = AtomicOrdering::SequentiallyConsistent) {
assert(E->getArg(0)->getType()->isPointerType());		assert(E->getArg(0)->getType()->isPointerType());

auto *IntTy = CGF.ConvertType(E->getType());		auto *IntTy = CGF.ConvertType(E->getType());
		Address DestAddr = CGF.EmitPointerWithAlignment(E->getArg(0));
auto *Result = CGF.Builder.CreateAtomicRMW(		auto *Result = CGF.Builder.CreateAtomicRMW(
AtomicRMWInst::Sub,		AtomicRMWInst::Sub, DestAddr, ConstantInt::get(IntTy, 1), Ordering);
CGF.EmitScalarExpr(E->getArg(0)),
ConstantInt::get(IntTy, 1),
Ordering);
return CGF.Builder.CreateSub(Result, ConstantInt::get(IntTy, 1));		return CGF.Builder.CreateSub(Result, ConstantInt::get(IntTy, 1));
}		}

// Build a plain volatile load.		// Build a plain volatile load.
static Value *EmitISOVolatileLoad(CodeGenFunction &CGF, const CallExpr *E) {		static Value *EmitISOVolatileLoad(CodeGenFunction &CGF, const CallExpr *E) {
Value *Ptr = CGF.EmitScalarExpr(E->getArg(0));		Value *Ptr = CGF.EmitScalarExpr(E->getArg(0));
QualType ElTy = E->getArg(0)->getType()->getPointeeType();		QualType ElTy = E->getArg(0)->getType()->getPointeeType();
CharUnits LoadSize = CGF.getContext().getTypeSizeInChars(ElTy);		CharUnits LoadSize = CGF.getContext().getTypeSizeInChars(ElTy);
[543 lines not shown]	static llvm::Value *EmitBitTestIntrinsic(CodeGenFunction &CGF,
if (Ordering != llvm::AtomicOrdering::NotAtomic) {		if (Ordering != llvm::AtomicOrdering::NotAtomic) {
// Emit a combined atomicrmw load/store operation for the interlocked		// Emit a combined atomicrmw load/store operation for the interlocked
// intrinsics.		// intrinsics.
llvm::AtomicRMWInst::BinOp RMWOp = llvm::AtomicRMWInst::Or;		llvm::AtomicRMWInst::BinOp RMWOp = llvm::AtomicRMWInst::Or;
if (BT.Action == BitTest::Reset) {		if (BT.Action == BitTest::Reset) {
Mask = CGF.Builder.CreateNot(Mask);		Mask = CGF.Builder.CreateNot(Mask);
RMWOp = llvm::AtomicRMWInst::And;		RMWOp = llvm::AtomicRMWInst::And;
}		}
OldByte = CGF.Builder.CreateAtomicRMW(RMWOp, ByteAddr.getPointer(), Mask,		OldByte = CGF.Builder.CreateAtomicRMW(RMWOp, ByteAddr, Mask, Ordering);
Ordering);
} else {		} else {
// Emit a plain load for the non-interlocked intrinsics.		// Emit a plain load for the non-interlocked intrinsics.
OldByte = CGF.Builder.CreateLoad(ByteAddr, "bittest.byte");		OldByte = CGF.Builder.CreateLoad(ByteAddr, "bittest.byte");
Value *NewByte = nullptr;		Value *NewByte = nullptr;
switch (BT.Action) {		switch (BT.Action) {
case BitTest::TestOnly:		case BitTest::TestOnly:
// Don't store anything.		// Don't store anything.
break;		break;
[2,785 lines not shown]	case Builtin::BI__atomic_test_and_set: {
// operation. The parameter type is always volatile.		// operation. The parameter type is always volatile.
QualType PtrTy = E->getArg(0)->IgnoreImpCasts()->getType();		QualType PtrTy = E->getArg(0)->IgnoreImpCasts()->getType();
bool Volatile =		bool Volatile =
PtrTy->castAs<PointerType>()->getPointeeType().isVolatileQualified();		PtrTy->castAs<PointerType>()->getPointeeType().isVolatileQualified();

Value *Ptr = EmitScalarExpr(E->getArg(0));		Value *Ptr = EmitScalarExpr(E->getArg(0));
unsigned AddrSpace = Ptr->getType()->getPointerAddressSpace();		unsigned AddrSpace = Ptr->getType()->getPointerAddressSpace();
Ptr = Builder.CreateBitCast(Ptr, Int8Ty->getPointerTo(AddrSpace));		Ptr = Builder.CreateBitCast(Ptr, Int8Ty->getPointerTo(AddrSpace));
		Address PtrAddr(Ptr, CharUnits::One());
		rjmccall (inline comment, marked Done): This should be using `EmitPointerWithAlignment` instead of assuming an alignment of 1.
Value *NewVal = Builder.getInt8(1);		Value *NewVal = Builder.getInt8(1);
Value *Order = EmitScalarExpr(E->getArg(1));		Value *Order = EmitScalarExpr(E->getArg(1));
if (isa<llvm::ConstantInt>(Order)) {		if (isa<llvm::ConstantInt>(Order)) {
int ord = cast<llvm::ConstantInt>(Order)->getZExtValue();		int ord = cast<llvm::ConstantInt>(Order)->getZExtValue();
AtomicRMWInst *Result = nullptr;		AtomicRMWInst *Result = nullptr;
switch (ord) {		switch (ord) {
case 0: // memory_order_relaxed		case 0: // memory_order_relaxed
default: // invalid order		default: // invalid order
Result = Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg, Ptr, NewVal,		Result =
		Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg, PtrAddr, NewVal,
llvm::AtomicOrdering::Monotonic);		llvm::AtomicOrdering::Monotonic);
break;		break;
case 1: // memory_order_consume		case 1: // memory_order_consume
case 2: // memory_order_acquire		case 2: // memory_order_acquire
Result = Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg, Ptr, NewVal,		Result = Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg, PtrAddr,
llvm::AtomicOrdering::Acquire);		NewVal, llvm::AtomicOrdering::Acquire);
break;		break;
case 3: // memory_order_release		case 3: // memory_order_release
Result = Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg, Ptr, NewVal,		Result = Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg, PtrAddr,
llvm::AtomicOrdering::Release);		NewVal, llvm::AtomicOrdering::Release);
break;		break;
case 4: // memory_order_acq_rel		case 4: // memory_order_acq_rel

Result = Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg, Ptr, NewVal,		Result =
		Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg, PtrAddr, NewVal,
llvm::AtomicOrdering::AcquireRelease);		llvm::AtomicOrdering::AcquireRelease);
break;		break;
case 5: // memory_order_seq_cst		case 5: // memory_order_seq_cst
Result = Builder.CreateAtomicRMW(		Result = Builder.CreateAtomicRMW(
llvm::AtomicRMWInst::Xchg, Ptr, NewVal,		llvm::AtomicRMWInst::Xchg, PtrAddr, NewVal,
llvm::AtomicOrdering::SequentiallyConsistent);		llvm::AtomicOrdering::SequentiallyConsistent);
break;		break;
}		}
Result->setVolatile(Volatile);		Result->setVolatile(Volatile);
return RValue::get(Builder.CreateIsNotNull(Result, "tobool"));		return RValue::get(Builder.CreateIsNotNull(Result, "tobool"));
}		}
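
Reviewer-style note: the constant-order branch above maps the integer memory-order argument to an ordering, treating consume like acquire and any unrecognised value like relaxed, then performs an exchange with 1 and reports whether the old byte was nonzero. A standalone sketch of that mapping with `std::atomic` follows; `testAndSetOrder` and `testAndSet` are hypothetical names, not functions from the patch.

```cpp
#include <atomic>

// Map the integer order argument (0..5) to a memory order, following the
// switch in the diff: consume is handled as acquire, and relaxed is the
// fallback for 0 and for any invalid value.
inline std::memory_order testAndSetOrder(int ord) {
  switch (ord) {
  case 1: // memory_order_consume
  case 2: // memory_order_acquire
    return std::memory_order_acquire;
  case 3: // memory_order_release
    return std::memory_order_release;
  case 4: // memory_order_acq_rel
    return std::memory_order_acq_rel;
  case 5: // memory_order_seq_cst
    return std::memory_order_seq_cst;
  case 0: // memory_order_relaxed
  default: // invalid order
    return std::memory_order_relaxed;
  }
}

// Exchange the byte with 1 and report whether it was already set.
inline bool testAndSet(std::atomic<unsigned char> &flag, int ord) {
  unsigned char old = flag.exchange(1, testAndSetOrder(ord));
  return old != 0;
}
```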

llvm::BasicBlock *ContBB = createBasicBlock("atomic.continue", CurFn);		llvm::BasicBlock *ContBB = createBasicBlock("atomic.continue", CurFn);
[14 lines not shown]	case Builtin::BI__atomic_test_and_set: {
llvm::SwitchInst *SI = Builder.CreateSwitch(Order, BBs[0]);		llvm::SwitchInst *SI = Builder.CreateSwitch(Order, BBs[0]);

Builder.SetInsertPoint(ContBB);		Builder.SetInsertPoint(ContBB);
PHINode *Result = Builder.CreatePHI(Int8Ty, 5, "was_set");		PHINode *Result = Builder.CreatePHI(Int8Ty, 5, "was_set");

for (unsigned i = 0; i < 5; ++i) {		for (unsigned i = 0; i < 5; ++i) {
Builder.SetInsertPoint(BBs[i]);		Builder.SetInsertPoint(BBs[i]);
AtomicRMWInst *RMW = Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg,		AtomicRMWInst *RMW = Builder.CreateAtomicRMW(llvm::AtomicRMWInst::Xchg,
Ptr, NewVal, Orders[i]);		PtrAddr, NewVal, Orders[i]);
RMW->setVolatile(Volatile);		RMW->setVolatile(Volatile);
Result->addIncoming(RMW, BBs[i]);		Result->addIncoming(RMW, BBs[i]);
Builder.CreateBr(ContBB);		Builder.CreateBr(ContBB);
}		}

SI->addCase(Builder.getInt32(0), BBs[0]);		SI->addCase(Builder.getInt32(0), BBs[0]);
SI->addCase(Builder.getInt32(1), BBs[1]);		SI->addCase(Builder.getInt32(1), BBs[1]);
SI->addCase(Builder.getInt32(2), BBs[1]);		SI->addCase(Builder.getInt32(2), BBs[1]);
[441 lines not shown]
case Builtin::BI_InterlockedCompareExchangePointer:		case Builtin::BI_InterlockedCompareExchangePointer:
case Builtin::BI_InterlockedCompareExchangePointer_nf: {		case Builtin::BI_InterlockedCompareExchangePointer_nf: {
llvm::Type *RTy;		llvm::Type *RTy;
llvm::IntegerType *IntType =		llvm::IntegerType *IntType =
IntegerType::get(getLLVMContext(),		IntegerType::get(getLLVMContext(),
getContext().getTypeSize(E->getType()));		getContext().getTypeSize(E->getType()));
llvm::Type *IntPtrType = IntType->getPointerTo();		llvm::Type *IntPtrType = IntType->getPointerTo();

llvm::Value *Destination =		Address DestAddr = Builder.CreateBitCast(
Builder.CreateBitCast(EmitScalarExpr(E->getArg(0)), IntPtrType);		EmitPointerWithAlignment(E->getArg(0)), IntPtrType);

llvm::Value *Exchange = EmitScalarExpr(E->getArg(1));		llvm::Value *Exchange = EmitScalarExpr(E->getArg(1));
RTy = Exchange->getType();		RTy = Exchange->getType();
Exchange = Builder.CreatePtrToInt(Exchange, IntType);		Exchange = Builder.CreatePtrToInt(Exchange, IntType);

llvm::Value *Comparand =		llvm::Value *Comparand =
Builder.CreatePtrToInt(EmitScalarExpr(E->getArg(2)), IntType);		Builder.CreatePtrToInt(EmitScalarExpr(E->getArg(2)), IntType);

auto Ordering =		auto Ordering =
BuiltinID == Builtin::BI_InterlockedCompareExchangePointer_nf ?		BuiltinID == Builtin::BI_InterlockedCompareExchangePointer_nf ?
AtomicOrdering::Monotonic : AtomicOrdering::SequentiallyConsistent;		AtomicOrdering::Monotonic : AtomicOrdering::SequentiallyConsistent;

auto Result = Builder.CreateAtomicCmpXchg(Destination, Comparand, Exchange,		auto Result = Builder.CreateAtomicCmpXchg(DestAddr, Comparand, Exchange,
Ordering, Ordering);		Ordering, Ordering);
Result->setVolatile(true);		Result->setVolatile(true);

return RValue::get(Builder.CreateIntToPtr(Builder.CreateExtractValue(Result,		return RValue::get(Builder.CreateIntToPtr(Builder.CreateExtractValue(Result,
0),		0),
RTy));		RTy));
}		}
case Builtin::BI_InterlockedCompareExchange8:		case Builtin::BI_InterlockedCompareExchange8:
[5,912 lines not shown]
case NEON::BI__builtin_neon_vgetq_lane_bf16:		case NEON::BI__builtin_neon_vgetq_lane_bf16:
case NEON::BI__builtin_neon_vduph_laneq_bf16:		case NEON::BI__builtin_neon_vduph_laneq_bf16:
case NEON::BI__builtin_neon_vduph_laneq_f16: {		case NEON::BI__builtin_neon_vduph_laneq_f16: {
return Builder.CreateExtractElement(Ops[0], EmitScalarExpr(E->getArg(1)),		return Builder.CreateExtractElement(Ops[0], EmitScalarExpr(E->getArg(1)),
"vgetq_lane");		"vgetq_lane");
}		}

case AArch64::BI_InterlockedAdd: {		case AArch64::BI_InterlockedAdd: {
Value *Arg0 = EmitScalarExpr(E->getArg(0));		Address DestAddr = EmitPointerWithAlignment(E->getArg(0));
Value *Arg1 = EmitScalarExpr(E->getArg(1));		Value *Val = EmitScalarExpr(E->getArg(1));
AtomicRMWInst *RMWI = Builder.CreateAtomicRMW(
AtomicRMWInst::Add, Arg0, Arg1,		AtomicRMWInst *RMWI =
		Builder.CreateAtomicRMW(AtomicRMWInst::Add, DestAddr, Val,
llvm::AtomicOrdering::SequentiallyConsistent);		llvm::AtomicOrdering::SequentiallyConsistent);
return Builder.CreateAdd(RMWI, Arg1);		return Builder.CreateAdd(RMWI, Val);
}		}
}		}

llvm::FixedVectorType *VTy = GetNeonType(this, Type);		llvm::FixedVectorType *VTy = GetNeonType(this, Type);
llvm::Type *Ty = VTy;		llvm::Type *Ty = VTy;
if (!Ty)		if (!Ty)
return nullptr;		return nullptr;

[6,075 lines not shown]	CodeGenFunction::EmitNVPTXBuiltinExpr(unsigned BuiltinID, const CallExpr *E) {
case NVPTX::BI__nvvm_atom_cas_gen_l:		case NVPTX::BI__nvvm_atom_cas_gen_l:
case NVPTX::BI__nvvm_atom_cas_gen_ll:		case NVPTX::BI__nvvm_atom_cas_gen_ll:
// __nvvm_atom_cas_gen_* should return the old value rather than the		// __nvvm_atom_cas_gen_* should return the old value rather than the
// success flag.		// success flag.
return MakeAtomicCmpXchgValue(*this, E, /*ReturnBool=*/false);		return MakeAtomicCmpXchgValue(*this, E, /*ReturnBool=*/false);

case NVPTX::BI__nvvm_atom_add_gen_f:		case NVPTX::BI__nvvm_atom_add_gen_f:
case NVPTX::BI__nvvm_atom_add_gen_d: {		case NVPTX::BI__nvvm_atom_add_gen_d: {
Value *Ptr = EmitScalarExpr(E->getArg(0));		Address DestAddr = EmitPointerWithAlignment(E->getArg(0));
Value *Val = EmitScalarExpr(E->getArg(1));		Value *Val = EmitScalarExpr(E->getArg(1));
return Builder.CreateAtomicRMW(llvm::AtomicRMWInst::FAdd, Ptr, Val,
		return Builder.CreateAtomicRMW(llvm::AtomicRMWInst::FAdd, DestAddr, Val,
AtomicOrdering::SequentiallyConsistent);		AtomicOrdering::SequentiallyConsistent);
}		}

case NVPTX::BI__nvvm_atom_inc_gen_ui: {		case NVPTX::BI__nvvm_atom_inc_gen_ui: {
Value *Ptr = EmitScalarExpr(E->getArg(0));		Value *Ptr = EmitScalarExpr(E->getArg(0));
Value *Val = EmitScalarExpr(E->getArg(1));		Value *Val = EmitScalarExpr(E->getArg(1));
Function *FnALI32 =		Function *FnALI32 =
CGM.getIntrinsic(Intrinsic::nvvm_atomic_load_inc_32, Ptr->getType());		CGM.getIntrinsic(Intrinsic::nvvm_atomic_load_inc_32, Ptr->getType());
[1,398 lines not shown]

clang/lib/CodeGen/CGExprScalar.cpp

[2,424 lines not shown]	if (isInc && type->isBooleanType()) {
if (isPre) {		if (isPre) {
Builder.CreateStore(True, LV.getAddress(CGF), LV.isVolatileQualified())		Builder.CreateStore(True, LV.getAddress(CGF), LV.isVolatileQualified())
->setAtomic(llvm::AtomicOrdering::SequentiallyConsistent);		->setAtomic(llvm::AtomicOrdering::SequentiallyConsistent);
return Builder.getTrue();		return Builder.getTrue();
}		}
// For atomic bool increment, we just store true and return it for		// For atomic bool increment, we just store true and return it for
// preincrement, do an atomic swap with true for postincrement		// preincrement, do an atomic swap with true for postincrement
return Builder.CreateAtomicRMW(		return Builder.CreateAtomicRMW(
llvm::AtomicRMWInst::Xchg, LV.getPointer(CGF), True,		llvm::AtomicRMWInst::Xchg, LV.getAddress(CGF), True,
llvm::AtomicOrdering::SequentiallyConsistent);		llvm::AtomicOrdering::SequentiallyConsistent);
}		}
// Special case for atomic increment / decrement on integers, emit		// Special case for atomic increment / decrement on integers, emit
// atomicrmw instructions. We skip this if we want to be doing overflow		// atomicrmw instructions. We skip this if we want to be doing overflow
// checking, and fall into the slow path with the atomic cmpxchg loop.		// checking, and fall into the slow path with the atomic cmpxchg loop.
if (!type->isBooleanType() && type->isIntegerType() &&		if (!type->isBooleanType() && type->isIntegerType() &&
!(type->isUnsignedIntegerType() &&		!(type->isUnsignedIntegerType() &&
CGF.SanOpts.has(SanitizerKind::UnsignedIntegerOverflow)) &&		CGF.SanOpts.has(SanitizerKind::UnsignedIntegerOverflow)) &&
CGF.getLangOpts().getSignedOverflowBehavior() !=		CGF.getLangOpts().getSignedOverflowBehavior() !=
LangOptions::SOB_Trapping) {		LangOptions::SOB_Trapping) {
llvm::AtomicRMWInst::BinOp aop = isInc ? llvm::AtomicRMWInst::Add :		llvm::AtomicRMWInst::BinOp aop = isInc ? llvm::AtomicRMWInst::Add :
llvm::AtomicRMWInst::Sub;		llvm::AtomicRMWInst::Sub;
llvm::Instruction::BinaryOps op = isInc ? llvm::Instruction::Add :		llvm::Instruction::BinaryOps op = isInc ? llvm::Instruction::Add :
llvm::Instruction::Sub;		llvm::Instruction::Sub;
llvm::Value *amt = CGF.EmitToMemory(		llvm::Value *amt = CGF.EmitToMemory(
llvm::ConstantInt::get(ConvertType(type), 1, true), type);		llvm::ConstantInt::get(ConvertType(type), 1, true), type);
llvm::Value *old =		llvm::Value *old =
Builder.CreateAtomicRMW(aop, LV.getPointer(CGF), amt,		Builder.CreateAtomicRMW(aop, LV.getAddress(CGF), amt,
llvm::AtomicOrdering::SequentiallyConsistent);		llvm::AtomicOrdering::SequentiallyConsistent);
return isPre ? Builder.CreateBinOp(op, old, amt) : old;		return isPre ? Builder.CreateBinOp(op, old, amt) : old;
}		}
value = EmitLoadOfLValue(LV, E->getExprLoc());		value = EmitLoadOfLValue(LV, E->getExprLoc());
input = value;		input = value;
// For every other atomic operation, we need to emit a load-op-cmpxchg loop		// For every other atomic operation, we need to emit a load-op-cmpxchg loop
llvm::BasicBlock *startBB = Builder.GetInsertBlock();		llvm::BasicBlock *startBB = Builder.GetInsertBlock();
llvm::BasicBlock *opBB = CGF.createBasicBlock("atomic_op", CGF.CurFn);		llvm::BasicBlock *opBB = CGF.createBasicBlock("atomic_op", CGF.CurFn);
[555 lines not shown]	if (!type->isBooleanType() && type->isIntegerType() &&
llvm_unreachable("Invalid compound assignment type");		llvm_unreachable("Invalid compound assignment type");
}		}
if (AtomicOp != llvm::AtomicRMWInst::BAD_BINOP) {		if (AtomicOp != llvm::AtomicRMWInst::BAD_BINOP) {
llvm::Value *Amt = CGF.EmitToMemory(		llvm::Value *Amt = CGF.EmitToMemory(
EmitScalarConversion(OpInfo.RHS, E->getRHS()->getType(), LHSTy,		EmitScalarConversion(OpInfo.RHS, E->getRHS()->getType(), LHSTy,
E->getExprLoc()),		E->getExprLoc()),
LHSTy);		LHSTy);
Value *OldVal = Builder.CreateAtomicRMW(		Value *OldVal = Builder.CreateAtomicRMW(
AtomicOp, LHSLV.getPointer(CGF), Amt,		AtomicOp, LHSLV.getAddress(CGF), Amt,
llvm::AtomicOrdering::SequentiallyConsistent);		llvm::AtomicOrdering::SequentiallyConsistent);

// Since operation is atomic, the result type is guaranteed to be the		// Since operation is atomic, the result type is guaranteed to be the
// same as the input in LLVM terms.		// same as the input in LLVM terms.
Result = Builder.CreateBinOp(Op, OldVal, Amt);		Result = Builder.CreateBinOp(Op, OldVal, Amt);
return LHSLV;		return LHSLV;
}		}
}		}
[2,033 lines not shown]

clang/lib/CodeGen/CGStmtOpenMP.cpp

[5,144 lines not shown]	static std::pair<bool, RValue> emitOMPAtomicRMW(CodeGenFunction &CGF, LValue X,
}		}
llvm::Value *UpdateVal = Update.getScalarVal();		llvm::Value *UpdateVal = Update.getScalarVal();
if (auto *IC = dyn_cast<llvm::ConstantInt>(UpdateVal)) {		if (auto *IC = dyn_cast<llvm::ConstantInt>(UpdateVal)) {
UpdateVal = CGF.Builder.CreateIntCast(		UpdateVal = CGF.Builder.CreateIntCast(
IC, X.getAddress(CGF).getElementType(),		IC, X.getAddress(CGF).getElementType(),
X.getType()->hasSignedIntegerRepresentation());		X.getType()->hasSignedIntegerRepresentation());
}		}
llvm::Value *Res =		llvm::Value *Res =
CGF.Builder.CreateAtomicRMW(RMWOp, X.getPointer(CGF), UpdateVal, AO);		CGF.Builder.CreateAtomicRMW(RMWOp, X.getAddress(CGF), UpdateVal, AO);
return std::make_pair(true, RValue::get(Res));		return std::make_pair(true, RValue::get(Res));
}		}

std::pair<bool, RValue> CodeGenFunction::EmitOMPAtomicSimpleUpdateExpr(		std::pair<bool, RValue> CodeGenFunction::EmitOMPAtomicSimpleUpdateExpr(
LValue X, RValue E, BinaryOperatorKind BO, bool IsXLHSInRHSPart,		LValue X, RValue E, BinaryOperatorKind BO, bool IsXLHSInRHSPart,
llvm::AtomicOrdering AO, SourceLocation Loc,		llvm::AtomicOrdering AO, SourceLocation Loc,
const llvm::function_ref<RValue(RValue)> CommonGen) {		const llvm::function_ref<RValue(RValue)> CommonGen) {
// Update expressions are allowed to have the following forms:		// Update expressions are allowed to have the following forms:
[1,608 lines not shown]

clang/test/CodeGen/atomic-ops.c

[649 lines not shown]	void test_underaligned() {
__atomic_compare_exchange(&underaligned_a, &underaligned_b, &underaligned_c, 1, memory_order_seq_cst, memory_order_seq_cst);		__atomic_compare_exchange(&underaligned_a, &underaligned_b, &underaligned_c, 1, memory_order_seq_cst, memory_order_seq_cst);

__attribute__((aligned)) struct Underaligned aligned_a, aligned_b, aligned_c;		__attribute__((aligned)) struct Underaligned aligned_a, aligned_b, aligned_c;

// CHECK: load atomic i64, {{.*}}, align 16		// CHECK: load atomic i64, {{.*}}, align 16
__atomic_load(&aligned_a, &aligned_b, memory_order_seq_cst);		__atomic_load(&aligned_a, &aligned_b, memory_order_seq_cst);
// CHECK: store atomic i64 {{.*}}, align 16		// CHECK: store atomic i64 {{.*}}, align 16
__atomic_store(&aligned_a, &aligned_b, memory_order_seq_cst);		__atomic_store(&aligned_a, &aligned_b, memory_order_seq_cst);
// CHECK: atomicrmw xchg i64* {{.*}}, align 8		// CHECK: atomicrmw xchg i64* {{.*}}, align 16
__atomic_exchange(&aligned_a, &aligned_b, &aligned_c, memory_order_seq_cst);		__atomic_exchange(&aligned_a, &aligned_b, &aligned_c, memory_order_seq_cst);
// CHECK: cmpxchg weak i64* {{.*}}, align 8		// CHECK: cmpxchg weak i64* {{.*}}, align 16
__atomic_compare_exchange(&aligned_a, &aligned_b, &aligned_c, 1, memory_order_seq_cst, memory_order_seq_cst);		__atomic_compare_exchange(&aligned_a, &aligned_b, &aligned_c, 1, memory_order_seq_cst, memory_order_seq_cst);
}		}

void test_c11_minmax(_Atomic(int) * si, _Atomic(unsigned) * ui, _Atomic(short) * ss, _Atomic(unsigned char) * uc, _Atomic(long long) * sll) {		void test_c11_minmax(_Atomic(int) * si, _Atomic(unsigned) * ui, _Atomic(short) * ss, _Atomic(unsigned char) * uc, _Atomic(long long) * sll) {
// CHECK-LABEL: @test_c11_minmax		// CHECK-LABEL: @test_c11_minmax

// CHECK: atomicrmw max i32* {{.*}} acquire, align 4		// CHECK: atomicrmw max i32* {{.*}} acquire, align 4
*si = __c11_atomic_fetch_max(si, 42, memory_order_acquire);		*si = __c11_atomic_fetch_max(si, 42, memory_order_acquire);
[73 lines not shown]

clang/test/CodeGen/ms-intrinsics.c

	[444 lines not shown]
	// CHECK-64: %inc1 = add nsw i64 %ExchangeLow, 1			// CHECK-64: %inc1 = add nsw i64 %ExchangeLow, 1
	// CHECK-64: %incdec.ptr2 = getelementptr inbounds i64, i64* %ComparandResult, i64 1			// CHECK-64: %incdec.ptr2 = getelementptr inbounds i64, i64* %ComparandResult, i64 1
	// CHECK-64: [[DST:%[0-9]+]] = bitcast i64* %incdec.ptr to i128*			// CHECK-64: [[DST:%[0-9]+]] = bitcast i64* %incdec.ptr to i128*
	// CHECK-64: [[CNR:%[0-9]+]] = bitcast i64* %incdec.ptr2 to i128*			// CHECK-64: [[CNR:%[0-9]+]] = bitcast i64* %incdec.ptr2 to i128*
	// CHECK-64: [[EH:%[0-9]+]] = zext i64 %inc to i128			// CHECK-64: [[EH:%[0-9]+]] = zext i64 %inc to i128
	// CHECK-64: [[EL:%[0-9]+]] = zext i64 %inc1 to i128			// CHECK-64: [[EL:%[0-9]+]] = zext i64 %inc1 to i128
	// CHECK-64: [[EHS:%[0-9]+]] = shl nuw i128 [[EH]], 64			// CHECK-64: [[EHS:%[0-9]+]] = shl nuw i128 [[EH]], 64
	// CHECK-64: [[EXP:%[0-9]+]] = or i128 [[EHS]], [[EL]]			// CHECK-64: [[EXP:%[0-9]+]] = or i128 [[EHS]], [[EL]]
	// CHECK-64: [[ORG:%[0-9]+]] = load i128, i128* [[CNR]], align 16			// CHECK-64: [[ORG:%[0-9]+]] = load i128, i128* [[CNR]], align 8
	// CHECK-64: [[RES:%[0-9]+]] = cmpxchg volatile i128* [[DST]], i128 [[ORG]], i128 [[EXP]] seq_cst seq_cst, align 16			// CHECK-64: [[RES:%[0-9]+]] = cmpxchg volatile i128* [[DST]], i128 [[ORG]], i128 [[EXP]] seq_cst seq_cst, align 16
	// CHECK-64: [[OLD:%[0-9]+]] = extractvalue { i128, i1 } [[RES]], 0			// CHECK-64: [[OLD:%[0-9]+]] = extractvalue { i128, i1 } [[RES]], 0
	// CHECK-64: store i128 [[OLD]], i128* [[CNR]], align 16			// CHECK-64: store i128 [[OLD]], i128* [[CNR]], align 8
	// CHECK-64: [[SUC1:%[0-9]+]] = extractvalue { i128, i1 } [[RES]], 1			// CHECK-64: [[SUC1:%[0-9]+]] = extractvalue { i128, i1 } [[RES]], 1
	// CHECK-64: [[SUC8:%[0-9]+]] = zext i1 [[SUC1]] to i8			// CHECK-64: [[SUC8:%[0-9]+]] = zext i1 [[SUC1]] to i8
	// CHECK-64: ret i8 [[SUC8]]			// CHECK-64: ret i8 [[SUC8]]
	// CHECK-64: }			// CHECK-64: }
	#endif			#endif

	#if defined(__aarch64__)			#if defined(__aarch64__)
	unsigned char test_InterlockedCompareExchange128_acq(			unsigned char test_InterlockedCompareExchange128_acq(
	[951 lines not shown]

clang/test/OpenMP/parallel_reduction_codegen.cpp

[192 lines not shown]	#pragma omp parallel reduction(+:g)
// LAMBDA: [[G_VAL:%.+]] = load i32, i32* [[G_REF]]		// LAMBDA: [[G_VAL:%.+]] = load i32, i32* [[G_REF]]
// LAMBDA: [[G_PRIV_VAL:%.+]] = load i32, i32* [[G_PRIVATE_ADDR]]		// LAMBDA: [[G_PRIV_VAL:%.+]] = load i32, i32* [[G_PRIVATE_ADDR]]
// LAMBDA: [[ADD:%.+]] = add nsw i32 [[G_VAL]], [[G_PRIV_VAL]]		// LAMBDA: [[ADD:%.+]] = add nsw i32 [[G_VAL]], [[G_PRIV_VAL]]
// LAMBDA: store i32 [[ADD]], i32* [[G_REF]]		// LAMBDA: store i32 [[ADD]], i32* [[G_REF]]
// LAMBDA: call void @__kmpc_end_reduce_nowait(		// LAMBDA: call void @__kmpc_end_reduce_nowait(
// LAMBDA: br label %[[REDUCTION_DONE]]		// LAMBDA: br label %[[REDUCTION_DONE]]
// LAMBDA: [[CASE2]]		// LAMBDA: [[CASE2]]
// LAMBDA: [[G_PRIV_VAL:%.+]] = load i32, i32* [[G_PRIVATE_ADDR]]		// LAMBDA: [[G_PRIV_VAL:%.+]] = load i32, i32* [[G_PRIVATE_ADDR]]
// LAMBDA: atomicrmw add i32* [[G_REF]], i32 [[G_PRIV_VAL]] monotonic, align 4		// LAMBDA: atomicrmw add i32* [[G_REF]], i32 [[G_PRIV_VAL]] monotonic, align 128
// LAMBDA: br label %[[REDUCTION_DONE]]		// LAMBDA: br label %[[REDUCTION_DONE]]
// LAMBDA: [[REDUCTION_DONE]]		// LAMBDA: [[REDUCTION_DONE]]
// LAMBDA: ret void		// LAMBDA: ret void
[&]() {		[&]() {
// LAMBDA: define {{.+}} void [[INNER_LAMBDA]](%{{.+}}* {{[^,]*}} [[ARG_PTR:%.+]])		// LAMBDA: define {{.+}} void [[INNER_LAMBDA]](%{{.+}}* {{[^,]*}} [[ARG_PTR:%.+]])
// LAMBDA: store %{{.+}}* [[ARG_PTR]], %{{.+}}** [[ARG_PTR_REF:%.+]],		// LAMBDA: store %{{.+}}* [[ARG_PTR]], %{{.+}}** [[ARG_PTR_REF:%.+]],
g = 2;		g = 2;
// LAMBDA: [[ARG_PTR:%.+]] = load %{{.+}}, %{{.+}}* [[ARG_PTR_REF]]		// LAMBDA: [[ARG_PTR:%.+]] = load %{{.+}}, %{{.+}}* [[ARG_PTR_REF]]
[40 lines not shown]	#pragma omp parallel reduction(-:g)
// BLOCKS: [[G_VAL:%.+]] = load i32, i32* [[G_REF]]		// BLOCKS: [[G_VAL:%.+]] = load i32, i32* [[G_REF]]
// BLOCKS: [[G_PRIV_VAL:%.+]] = load i32, i32* [[G_PRIVATE_ADDR]]		// BLOCKS: [[G_PRIV_VAL:%.+]] = load i32, i32* [[G_PRIVATE_ADDR]]
// BLOCKS: [[ADD:%.+]] = add nsw i32 [[G_VAL]], [[G_PRIV_VAL]]		// BLOCKS: [[ADD:%.+]] = add nsw i32 [[G_VAL]], [[G_PRIV_VAL]]
// BLOCKS: store i32 [[ADD]], i32* [[G_REF]]		// BLOCKS: store i32 [[ADD]], i32* [[G_REF]]
// BLOCKS: call void @__kmpc_end_reduce_nowait(		// BLOCKS: call void @__kmpc_end_reduce_nowait(
// BLOCKS: br label %[[REDUCTION_DONE]]		// BLOCKS: br label %[[REDUCTION_DONE]]
// BLOCKS: [[CASE2]]		// BLOCKS: [[CASE2]]
// BLOCKS: [[G_PRIV_VAL:%.+]] = load i32, i32* [[G_PRIVATE_ADDR]]		// BLOCKS: [[G_PRIV_VAL:%.+]] = load i32, i32* [[G_PRIVATE_ADDR]]
// BLOCKS: atomicrmw add i32* [[G_REF]], i32 [[G_PRIV_VAL]] monotonic, align 4		// BLOCKS: atomicrmw add i32* [[G_REF]], i32 [[G_PRIV_VAL]] monotonic, align 128
// BLOCKS: br label %[[REDUCTION_DONE]]		// BLOCKS: br label %[[REDUCTION_DONE]]
// BLOCKS: [[REDUCTION_DONE]]		// BLOCKS: [[REDUCTION_DONE]]
// BLOCKS: ret void		// BLOCKS: ret void
^{		^{
// BLOCKS: define {{.+}} void {{@.+}}(i8*		// BLOCKS: define {{.+}} void {{@.+}}(i8*
g = 2;		g = 2;
// BLOCKS-NOT: [[G]]{{[[^:word:]]}}		// BLOCKS-NOT: [[G]]{{[[^:word:]]}}
// BLOCKS: store i{{[0-9]+}} 2, i{{[0-9]+}}*		// BLOCKS: store i{{[0-9]+}} 2, i{{[0-9]+}}*
[499 lines not shown]
 // CHECK: call void @__kmpc_end_reduce_nowait(%{{.+}}* [[REDUCTION_LOC]], i32 [[GTID]], [8 x i32]* [[REDUCTION_LOCK]])

 // break;
 // CHECK: br label %[[RED_DONE]]

 // case 2:
 // t_var += t_var_reduction;
 // CHECK: [[T_VAR_PRIV_VAL:%.+]] = load i{{[0-9]+}}, i{{[0-9]+}}* [[T_VAR_PRIV]]
-// CHECK: atomicrmw add i32* [[T_VAR_REF]], i32 [[T_VAR_PRIV_VAL]] monotonic, align 4
+// CHECK: atomicrmw add i32* [[T_VAR_REF]], i32 [[T_VAR_PRIV_VAL]] monotonic, align 128

 // var = var.operator &(var_reduction);
 // CHECK: call void @__kmpc_critical(
 // CHECK: [[UP:%.+]] = call nonnull align 4 dereferenceable(4) [[S_INT_TY]]* @{{.+}}([[S_INT_TY]]* {{[^,]}} [[VAR_REF]], [[S_INT_TY]] nonnull align 4 dereferenceable(4) [[VAR_PRIV]])
 // CHECK: [[BC1:%.+]] = bitcast [[S_INT_TY]]* [[VAR_REF]] to i8*
 // CHECK: [[BC2:%.+]] = bitcast [[S_INT_TY]]* [[UP]] to i8*
 // CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 128 [[BC1]], i8* align 4 [[BC2]], i64 4, i1 false)
 // CHECK: call void @__kmpc_end_critical(
 // CHECK: call void @{{.+}}([[S_INT_TY]]* {{[^,]*}} [[COND_LVALUE:%.+]], i32 [[CONV]])
 // CHECK: [[BC1:%.+]] = bitcast [[S_INT_TY]]* [[VAR1_REF]] to i8*
 // CHECK: [[BC2:%.+]] = bitcast [[S_INT_TY]]* [[COND_LVALUE]] to i8*
 // CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 128 [[BC1]], i8* align 4 [[BC2]], i64 4, i1 false)
 // CHECK: call void @__kmpc_end_critical(

 // t_var1 = min(t_var1, t_var1_reduction);
 // CHECK: [[T_VAR1_PRIV_VAL:%.+]] = load i{{[0-9]+}}, i{{[0-9]+}}* [[T_VAR1_PRIV]]
-// CHECK: atomicrmw min i32* [[T_VAR1_REF]], i32 [[T_VAR1_PRIV_VAL]] monotonic, align 4
+// CHECK: atomicrmw min i32* [[T_VAR1_REF]], i32 [[T_VAR1_PRIV_VAL]] monotonic, align 128

 // break;
 // CHECK: br label %[[RED_DONE]]
 // CHECK: [[RED_DONE]]

 // CHECK-DAG: call {{.}} [[S_INT_TY_DESTR]]([[S_INT_TY]] {{[^,]*}} [[VAR_PRIV]])
 // CHECK-DAG: call {{.}} [[S_INT_TY_DESTR]]([[S_INT_TY]]
 // CHECK: ret void