This patch changes the AtomicExpand pass to set branch weights when it creates new basic blocks. The weights are chosen to optimize for the case where contention is low.
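A minimal sketch of the idea, assuming a cmpxchg retry loop whose exit branch tests the success flag (the helper name and the concrete weight values are illustrative, not necessarily what the patch uses):

// Sketch only: when AtomicExpand emits the cmpxchg retry loop, tag its
// exit branch so the "success" edge is treated as far more likely than
// the "retry" edge. Weight values here are placeholders.
#include "llvm/IR/Instructions.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/MDBuilder.h"
using namespace llvm;

static void weightAtomicLoopBranch(BranchInst *LoopBr) {
  // LoopBr is assumed to be: br i1 %success, label %exit, label %retry
  MDBuilder MDB(LoopBr->getContext());
  LoopBr->setMetadata(LLVMContext::MD_prof,
                      MDB.createBranchWeights(/*TrueWeight=*/2000,
                                              /*FalseWeight=*/1));
}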
Event Timeline
First, please update with full context. I need to see the surrounding code to interpret a few of your changes.
I'm a bit concerned about the general approach. Without profiling data, unconditionally predicting no contention seems questionable. Can you quantify the performance impact here for both the contended and the uncontended case?
Inline comment on lib/CodeGen/AtomicExpandPass.cpp, line 388:
Please extract a helper function for this shared code.
Defined a new function which returns weights for both destination BBs.
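The shape I have in mind is roughly the following; the name and the constants are placeholders, not necessarily what is in the updated diff:

#include <cstdint>
#include <utility>

// Hypothetical shape of the helper: returns {weight for the success
// destination, weight for the failure/retry destination} of the
// loop-exit branch.
static std::pair<uint32_t, uint32_t> getAtomicLoopBranchWeights() {
  return {2000, 1}; // Strongly favor the uncontended (success) path.
}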
The rationale for optimizing for the no-contention case is that if there is a lot of contention, the program won't run fast anyway. However, I don't have data to support that decision.
Do you have any suggestions for benchmark programs that I can run to evaluate the changes I made in my patch?
For the performance question, even a simple micro-benchmark would be sufficient. What is the difference between a contended and an uncontended access to a single word value, with and without your change? Is the difference material enough to worry about getting it wrong when the access is contended?
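A minimal sketch of the kind of measurement being asked for; the thread count, iteration count, and the relaxed memory ordering below are arbitrary choices for illustration:

#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>
#include <vector>

std::atomic<unsigned> Shared{0};

// Each thread performs Iters atomic increments, either all on the shared
// word (contended) or on its own local word (uncontended).
static long long runMs(unsigned Threads, unsigned Iters, bool Contended) {
  auto Start = std::chrono::steady_clock::now();
  std::vector<std::thread> Pool;
  for (unsigned T = 0; T < Threads; ++T)
    Pool.emplace_back([=] {
      std::atomic<unsigned> Local{0};
      std::atomic<unsigned> &C = Contended ? Shared : Local;
      for (unsigned I = 0; I < Iters; ++I)
        C.fetch_add(1, std::memory_order_relaxed);
    });
  for (auto &Th : Pool)
    Th.join();
  auto End = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::milliseconds>(End - Start)
      .count();
}

int main() {
  std::printf("contended:   %lld ms\n", runMs(16, 10000000, true));
  std::printf("uncontended: %lld ms\n", runMs(16, 10000000, false));
}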
Inline comment on include/llvm/Analysis/BranchProbabilityInfo.h, line 118:
I know you're copying the StackProtector code here, but I'm not sure I see the need for this function to be exposed on BPI. I'd just make it a static helper in the transform file.
Sorry for not replying sooner.
I wrote several micro-benchmarks, then compiled and ran them to evaluate the impact of my patch on performance. For most of the benchmarks, the patch made no difference in the generated code or no measurable difference in performance, but one benchmark showed a performance degradation when a large number of threads (16) were contending for access to an atomic variable.
Given that I couldn't demonstrate that optimizing for the low-contention case generates code at least as fast as the code currently generated, I think I should retract this patch for now and resubmit it later, once I figure out a way to set the weights that actually makes the code run faster.
Seems like a reasonable plan.
I'm going to list a couple of ideas for alternate approaches below, but it's been long enough I've completely lost context on this patch. Feel free to ignore if my suggestions don't seem worthwhile.
- We could add a piece of metadata to record how contended a given RMW operation is. This could be as simple as "uncontended" or a more generic "contention_count(x of y)". With that in place, your change could be applied only to uncontended accesses. (I'd also be open to switching the default to uncontended and then having the metadata for the contended case. Your reported data made this sound plausible; you just need to make the argument on llvm-dev.) A rough sketch of what reading such a hint might look like follows after this list.
- You could investigate why the branch weights cause a slowdown in the contended case. Looking at the loop structure, I find it slightly odd that they have any impact at all, since they're not likely to affect code placement. There might be an independent tweak that could be made here.
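To illustrate the first idea, a rough sketch of reading such a hint, assuming an invented metadata kind named "atomic.contention" (neither the kind name nor the string values exist in LLVM today):

#include "llvm/IR/Instruction.h"
#include "llvm/IR/Metadata.h"
using namespace llvm;

// Sketch only: the expansion would apply the optimistic branch weights
// only when the atomic access is explicitly tagged as uncontended.
static bool isMarkedUncontended(const Instruction *I) {
  if (MDNode *MD = I->getMetadata("atomic.contention"))
    if (MD->getNumOperands() == 1)
      if (auto *S = dyn_cast<MDString>(MD->getOperand(0)))
        return S->getString() == "uncontended";
  return false; // No hint: leave the branch weights alone.
}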
Also, can you explain *why* you expected the branch weights to help in the first place? (i.e. what part of the optimizer/micro-architecture were you trying to exploit?) Maybe that will spark an idea for another approach.
Judging from the code clang generates, machine block placement is the pass that is making the difference. The basic blocks are laid out in a way that appears to hurt performance in the contended cases (more backward and forward branches to non-consecutive blocks).
This is part of the program that ran slower because of my patch. You can probably see the difference if you compile it for arm64 or aarch64.
#include <atomic>

struct Node {
  Node(int ii, Node *nn) : i(ii), next(nn) {}
  Node() : i(0), next(nullptr) {}
  unsigned i;
  Node *next;
};

struct List {
  std::atomic<Node *> head;
};

List list1;
std::atomic<unsigned> flag;
unsigned sum = 0;

// Push nodes for values [b, e) onto the shared list with a CAS retry loop.
void addNodes(unsigned b, unsigned e) {
  while (b != e) {
    Node *n = new Node(b++, nullptr);
    Node *expected = list1.head;
    do {
      n->next = expected;
    } while (!std::atomic_compare_exchange_weak(&list1.head, &expected, n));
  }
}
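For completeness, a minimal driver for this snippet; the thread count follows the 16-thread case mentioned above, while the per-thread range is my own arbitrary choice, not the original harness:

#include <thread>
#include <vector>

int main() {
  // 16 threads all contending on list1.head.
  const unsigned NumThreads = 16, PerThread = 100000;
  std::vector<std::thread> Pool;
  for (unsigned T = 0; T < NumThreads; ++T)
    Pool.emplace_back(addNodes, T * PerThread, (T + 1) * PerThread);
  for (auto &Th : Pool)
    Th.join();
  return 0;
}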
> Also, can you explain *why* you expected the branch weights to help in the first place? (i.e. what part of the optimizer/micro-architecture were you trying to exploit?) Maybe that will spark an idea for another approach.
I was simply assuming that real workloads would typically have little contention, and that if I set a low weight for the "failure" branch and a high weight for the "success" branch, the optimization passes would generate code that runs faster in the uncontended case. I wasn't trying to exploit any optimization pass in particular; I was just expecting it to make a difference in the generated code.