This is an archive of the discontinued LLVM Phabricator instance.

Differential D134308

AtomicExpand: Use llvm.ptrmask instead of ptrtoint
Closed · Public

Authored by arsenm on Sep 20 2022, 1:27 PM.

Details

Reviewers

reames
jyknight
jdoerfert

Summary

This removes the ptrtoint from the load's pointer operand, although we
can't entirely eliminate these to get the LSB shift. In a future
patch, this will avoid ptrtoint in the case where the atomic is
overaligned to the word size.
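
The two quantities mentioned in the summary (the word-aligned address that the load uses, and the low bits that feed the LSB shift) can be illustrated on plain integers. This is only a sketch of the arithmetic, not the pass itself: `PartwordAddr` and `splitAddress` are hypothetical names, and the code assumes a flat address space where a pointer converts to `std::uintptr_t` without surprises.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the two values computed for a narrow atomic.
struct PartwordAddr {
  std::uintptr_t alignedAddr; // address of the containing word (what ptrmask yields)
  std::uintptr_t ptrLSB;      // byte offset of the value inside that word
};

// Splits an address into its word-aligned part and its low bits.
// minWordSize must be a nonzero power of two.
inline PartwordAddr splitAddress(std::uintptr_t addr,
                                 std::uintptr_t minWordSize) {
  assert(minWordSize != 0 && (minWordSize & (minWordSize - 1)) == 0);
  const std::uintptr_t lowBits = minWordSize - 1;
  return {addr & ~lowBits, addr & lowBits};
}
```

In the patch, the first value is produced directly on the pointer by llvm.ptrmask, while the second still needs the pointer as an integer, which is why one ptrtoint remains.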

Diff Detail

Build Status

Buildable 188424

Event Timeline

arsenm created this revision. Sep 20 2022, 1:27 PM

Herald added a project: Restricted Project. Sep 20 2022, 1:27 PM

Herald added subscribers: kosarev, kerbowa, jrtc27 and 3 others.

arsenm requested review of this revision. Sep 20 2022, 1:27 PM

Herald added a subscriber: wdng.

Harbormaster completed remote builds in B187823: Diff 461680. Sep 20 2022, 1:28 PM

VE test update

Harbormaster completed remote builds in B187824: Diff 461683. Sep 20 2022, 1:34 PM

In a future patch, this will also allow eliminating all ptrtoints in the case where the atomic is sufficiently aligned.

Can you explain what effect you expect this to have? It removes all the inttoptr -- maybe that's useful in itself?

Is the remaining ptrtoint to get the low bits harmful? Being able to omit the ptrtoint when the value is sufficiently aligned seems likely to be nearly useless -- this code is only used when the width of the atomic operation requested is smaller than the smallest size of atomic supported by the hardware. And sure, the smaller value could be overaligned in some cases, but...

llvm/lib/CodeGen/AtomicExpandPass.cpp
707	This "if" seems extraneous -- Builder.CreateBitCast is already a no-op internally if it's asked to cast between opaque pointers, right?
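
For readers unfamiliar with the code path discussed above: when the requested atomic is narrower than the smallest atomic the hardware supports, the operation is performed on the containing word with a compare-exchange loop. Below is a minimal sketch of that idea for a 16-bit exchange inside a 32-bit word, written with `std::atomic` rather than LLVM IR. The function names are hypothetical, and the shift amount is passed in instead of being derived from an address.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Exchanges the 16-bit lane at bit offset shiftAmt (0 or 16) of a 32-bit
// word and returns the previous lane value. The other lane is preserved.
inline std::uint16_t exchangeLane16(std::atomic<std::uint32_t> &word,
                                    unsigned shiftAmt, std::uint16_t value) {
  assert(shiftAmt == 0 || shiftAmt == 16);
  const std::uint32_t mask = static_cast<std::uint32_t>(0xFFFFu) << shiftAmt;
  const std::uint32_t invMask = ~mask;
  const std::uint32_t shifted = static_cast<std::uint32_t>(value) << shiftAmt;

  std::uint32_t loaded = word.load();
  // On failure compare_exchange_weak refreshes `loaded`, so the desired
  // value is recomputed from the latest word contents on every attempt.
  while (!word.compare_exchange_weak(loaded, (loaded & invMask) | shifted)) {
  }
  return static_cast<std::uint16_t>(loaded >> shiftAmt);
}

// Convenience wrapper: runs one exchange on a fresh word and returns the
// resulting word contents.
inline std::uint32_t wordAfterExchange(std::uint32_t initial,
                                       unsigned shiftAmt,
                                       std::uint16_t value) {
  std::atomic<std::uint32_t> word{initial};
  exchangeLane16(word, shiftAmt, value);
  return word.load();
}
```

The mask, inverted mask, and shifted operand here play the same roles as the Mask, Inv_Mask, and ValOperand_Shifted values in the expanded IR.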

In D134308#3804295, @jyknight wrote:

Can you explain what effect you expect this to have? It removes all the inttoptr -- maybe that's useful in itself?

Moving towards the goal of never having compiler introduced inttoptr

In D134308#3804305, @arsenm wrote:

Moving towards the goal of never having compiler introduced inttoptr

That was sort of my guess -- but, then, this change already DOES fix (this part of) the problem, right? We have a ptrtoint remaining, but not an inttoptr -- that should be fine?

arsenm added a child revision: D134323: AtomicExpand: Avoid some operations if the atomic is overaligned. Sep 20 2022, 3:45 PM

In D134308#3804313, @jyknight wrote:

That was sort of my guess -- but, then, this change already DOES fix (this part of) the problem, right? We have a ptrtoint remaining, but not an inttoptr -- that should be fine?

It half fixes it. It's still introducing an inttoptr in order to get the low bits. We could introduce a second ptrmask with the inverted mask, but we still need to inttoptr in order to shift that into the Inv_Mask position

In D134308#3804439, @arsenm wrote:

It half fixes it. It's still introducing an inttoptr in order to get the low bits. We could introduce a second ptrmask with the inverted mask, but we still need to inttoptr in order to shift that into the Inv_Mask position

OK, now I'm confused again...

My understanding:

converting from integer to pointer ("inttoptr") is undesirable to introduce in transformations. (Such an instruction may be present in the original input, but shouldn't be added if it was not.)
converting from pointer to integer ("ptrtoint") is OK.

In the current AtomicExpandPass.cpp, we have 3 mentions of CreateIntToPtr:

convertAtomicXchgToIntegerType
convertCmpXchgToIntegerType
createMaskInstrs

The first two are about pointer-valued atomicrmw and cmpxchg operands, which are unrelated to this change, so we'll ignore that. The third is removed by this patch.

So, ISTM this patch does entirely fix the problem it sets out to fix -- it has removed the only CreateIntToPtr from the address computation. There remains a CreatePtrToInt to extract the value shift/mask -- but that's not a problem. Right?

In D134308#3808650, @jyknight wrote:

So, ISTM this patch does entirely fix the problem it sets out to fix -- it has removed the only CreateIntToPtr from the address computation. There remains a CreatePtrToInt to extract the value shift/mask -- but that's not a problem. Right?

We don't want inttoptr or ptrtoint anywhere, since they still are capturing, but this is the limit of what is possible with current IR. We could push one of the ptrtoints later after doing the mask to get the LSB with the intrinsic, but you still need the ptrtoint to do the shift

In D134308#3808652, @arsenm wrote:

We don't want inttoptr or ptrtoint anywhere, since they still are capturing, but this is the limit of what is possible with current IR. We could push one of the ptrtoints later after doing the mask to get the LSB with the intrinsic, but you still need the ptrtoint to do the shift

OK -- it sounds like the root of the confusion here is that

My understanding:

converting from integer to pointer ("inttoptr") is undesirable to introduce in transformations. (Such an instruction may be present in the original input, but shouldn't be added if it was not.)

converting from pointer to integer ("ptrtoint") is OK.

is incorrect. Sorry to be a pain, but do you have a link that describes the issues here? I may well not be remembering properly, but I had thought previous discussions and changes were about pointer provenance issues that arise from the inttoptr direction.

So I'm afraid it's not at all clear to me why the ptrtoint is problematic -- and in particular, why you wrote in the change description that this might not actually be worth doing since we cannot remove the ptrtoint.

In D134308#3808841, @jyknight wrote:

is incorrect. Sorry to be a pain, but do you have a link that describes the issues here? I may well not be remembering properly, but I had thought previous discussions and changes were about pointer provenance issues that arise from the inttoptr direction.

ptrtoint is the most basic capture

So I'm afraid it's not at all clear to me why the ptrtoint is problematic -- and in particular, why you wrote in the change description that this might not actually be worth doing since we cannot remove the ptrtoint.

This does fully eliminate all ptrtoint in the overaligned case. I'd still rather move towards reducing uses of it to the point where it's required (which moves it to the shift)

In D134308#3808877, @arsenm wrote:

ptrtoint is the most basic capture

Also, just see the langref description for llvm.ptrmask

In D134308#3809031, @arsenm wrote:

ptrtoint is the most basic capture

OK.

Also, just see the langref description for llvm.ptrmask

I looked at that first thing. It does not -- at least, not obviously -- discuss this side of the issue. It describes its semantics as being equivalent to getelementptr ptr, (ptrtoint(ptr) & mask) - ptrtoint(ptr). It also says it preserves more information about the resulting pointer than ptrtoint+inttoptr. Both these statements imply that the issue it's solving is the loss of information that occurs via creation of a new pointer with inttoptr -- and that ptrmask exists primarily because it's a much simpler canonical form to deal with than the equivalent GEP -- which requires extraneous addition/subtraction operations.

Anyways, I have no issue with this change itself (only the 1 minor nit) -- just with the description.
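
The equivalence quoted from the LangRef in the comment above, getelementptr ptr, (ptrtoint(ptr) & mask) - ptrtoint(ptr), can be mimicked in C++ to show why the result stays derived from the original pointer instead of being rebuilt from an integer. This is a rough analogy only: `maskPointer` and `maskedPointerDemo` are hypothetical names, the code assumes a flat address space, and C++ pointer arithmetic is only defined here because the masked address stays inside the same array.

```cpp
#include <cassert>
#include <cstdint>

// Applies a mask to a pointer by offsetting the original pointer rather than
// converting an integer back into a pointer.
inline const unsigned char *maskPointer(const unsigned char *p,
                                        std::uintptr_t mask) {
  const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
  // addr & mask can never exceed addr, so the offset is a non-negative
  // number of bytes to step backwards (the GEP form uses a negative offset).
  const std::uintptr_t bytesBack = addr - (addr & mask);
  return p - bytesBack;
}

// Checks that masking a pointer 5 bytes into an 8-byte-aligned buffer with
// ~7 lands on the start of the buffer.
inline bool maskedPointerDemo() {
  alignas(8) static unsigned char buffer[16] = {};
  const unsigned char *inside = buffer + 5;
  const std::uintptr_t mask = ~static_cast<std::uintptr_t>(7);
  return maskPointer(inside, mask) == buffer;
}
```

Note that the pointer-to-integer conversion is still present in this form; what disappears is the integer-to-pointer conversion, which matches the distinction being debated in the thread.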

arsenm added inline comments. Sep 23 2022, 9:01 AM

llvm/lib/CodeGen/AtomicExpandPass.cpp
707	Yes, but when opaque pointers are removed, how will we track down all the unnecessary CreateBitCast calls?

Remove opaque pointer check

Harbormaster completed remote builds in B188424: Diff 462512. Sep 23 2022, 9:03 AM

LGTM, thanks for the discussion.

I'd appreciate if you modify the commit to have a clearer description before pushing.

llvm/lib/CodeGen/AtomicExpandPass.cpp
707	I'd start by looking at all the calls to `Type::getPointerTo(AS)`, `Type::get*PtrTy(...)`, `PointerType::get(Ty, AS)`, etc. Many such calls can probably be removed entirely, along with their use, being used only as input to some cast creation. Others would still be required, and changed to `PointerType::get(Ctx, AS)` instead. I suspect that'll get rid of most of the redundant casts as a side-effect. Then, I'd make a temporary local modification where attempting a no-op cast from ptr to ptr will assert-fail, run tests, and examine all the failure locations, to see which other casts ought to be removed.

This revision is now accepted and ready to land. Sep 23 2022, 9:52 AM

In D134308#3812077, @jyknight wrote:

LGTM, thanks for the discussion.

I'd appreciate if you modify the commit to have a clearer description before pushing.

I just modified it with the last revision

a61c3455c02867f6ef74017e705f238648ef47ca

Revision Contents

Path                                                             Size
llvm/lib/CodeGen/AtomicExpandPass.cpp                            18 lines
llvm/test/CodeGen/VE/Scalar/atomic_cmp_swap.ll                   154 lines
llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i16.ll    512 lines
llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i8.ll     376 lines
llvm/test/Transforms/AtomicExpand/SPARC/partword.ll              254 lines

Diff 462512

llvm/lib/CodeGen/AtomicExpandPass.cpp

   if (PMV.ValueType == PMV.WordType) {
     PMV.ShiftAmt = ConstantInt::get(PMV.ValueType, 0);
     PMV.Mask = ConstantInt::get(PMV.ValueType, ~0, /*isSigned*/ true);
     return PMV;
   }

   assert(ValueSize < MinWordSize);

   PointerType *PtrTy = cast<PointerType>(Addr->getType());
-  Type *WordPtrType = PMV.WordType->getPointerTo(PtrTy->getAddressSpace());
+  IntegerType *IntTy = DL.getIntPtrType(Ctx, PtrTy->getAddressSpace());

   // TODO: we could skip some of this if AddrAlign >= MinWordSize.
-  Value *AddrInt = Builder.CreatePtrToInt(
-      Addr, DL.getIntPtrType(Ctx, PtrTy->getAddressSpace()));
-  PMV.AlignedAddr = Builder.CreateIntToPtr(
-      Builder.CreateAnd(AddrInt, ~(uint64_t)(MinWordSize - 1)), WordPtrType,
-      "AlignedAddr");
+  PMV.AlignedAddr = Builder.CreateIntrinsic(
+      Intrinsic::ptrmask, {PtrTy, IntTy},
+      {Addr, ConstantInt::get(IntTy, ~(uint64_t)(MinWordSize - 1))}, nullptr,
+      "AlignedAddr");
+
+  Type *WordPtrType = PMV.WordType->getPointerTo(PtrTy->getAddressSpace());
+
+  // Cast for typed pointers.
+  PMV.AlignedAddr =
+      Builder.CreateBitCast(PMV.AlignedAddr, WordPtrType, "AlignedAddr");

   PMV.AlignedAddrAlignment = Align(MinWordSize);

+  Value *AddrInt = Builder.CreatePtrToInt(Addr, IntTy);
   Value *PtrLSB = Builder.CreateAnd(AddrInt, MinWordSize - 1, "PtrLSB");
   if (DL.isLittleEndian()) {
     // turn bytes into bits
     PMV.ShiftAmt = Builder.CreateShl(PtrLSB, 3);
   } else {
     // turn bytes into bits, and count from the other side.
     PMV.ShiftAmt = Builder.CreateShl(
         Builder.CreateXor(PtrLSB, MinWordSize - ValueSize), 3);
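
The shift computation at the end of the hunk above (shift PtrLSB left by 3 on little-endian targets, XOR with MinWordSize - ValueSize first on big-endian ones) can be checked on small numbers. The sketch below restates that arithmetic in plain C++; `shiftAmount` is a hypothetical helper, not part of AtomicExpandPass.cpp.

```cpp
#include <cassert>

// Bit offset of a valueSize-byte value located ptrLSB bytes into a
// minWordSize-byte word. Sizes are in bytes; the result is in bits.
inline unsigned shiftAmount(unsigned ptrLSB, unsigned minWordSize,
                            unsigned valueSize, bool littleEndian) {
  assert(valueSize < minWordSize);
  assert(ptrLSB < minWordSize);
  if (littleEndian) {
    // turn bytes into bits
    return ptrLSB << 3;
  }
  // turn bytes into bits, and count from the other side
  return (ptrLSB ^ (minWordSize - valueSize)) << 3;
}
```

For example, an i16 at byte offset 2 of a 4-byte word sits in bits 16..31 on a little-endian target and in bits 0..15 on a big-endian one.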

llvm/test/CodeGen/VE/Scalar/atomic_cmp_swap.ll

; CHECK-NEXT: b.l.t (, %s10)		; CHECK-NEXT: b.l.t (, %s10)
%9 = zext i1 %8 to i128		%9 = zext i1 %8 to i128
call void @llvm.lifetime.end.p0i8(i64 16, i8* nonnull %5)		call void @llvm.lifetime.end.p0i8(i64 16, i8* nonnull %5)
ret i128 %9		ret i128 %9
}		}

; Function Attrs: nofree norecurse nounwind mustprogress		; Function Attrs: nofree norecurse nounwind mustprogress
define zeroext i1 @_Z29atomic_cmp_swap_relaxed_gv_i1Rbb(i8* nocapture nonnull align 1 dereferenceable(1) %0, i1 zeroext %1) {		define zeroext i1 @_Z29atomic_cmp_swap_relaxed_gv_i1Rbb(i8* nocapture nonnull align 1 dereferenceable(1) %0, i1 zeroext %1) {
; CHECK-LABEL: _Z29atomic_cmp_swap_relaxed_gv_i1Rbb:		; CHECK-LABEL: _Z29atomic_cmp_swap_relaxed_gv_i1Rbb:
; CHECK: # %bb.0:		; CHECK: # %bb.0: # %partword.cmpxchg.loop
; CHECK-NEXT: and %s2, %s1, (32)0		; CHECK-NEXT: lea %s2, gv_i1@lo
; CHECK-NEXT: lea %s1, gv_i1@lo		; CHECK-NEXT: and %s2, %s2, (32)0
; CHECK-NEXT: and %s1, %s1, (32)0		; CHECK-NEXT: lea.sl %s2, gv_i1@hi(, %s2)
; CHECK-NEXT: lea.sl %s1, gv_i1@hi(, %s1)		; CHECK-NEXT: and %s2, -4, %s2
; CHECK-NEXT: and %s1, -4, %s1		; CHECK-NEXT: ldl.zx %s3, (, %s2)
; CHECK-NEXT: ldl.zx %s4, (, %s1)		; CHECK-NEXT: lea %s4, -256
; CHECK-NEXT: ld1b.zx %s3, (, %s0)		; CHECK-NEXT: ld1b.zx %s5, (, %s0)
; CHECK-NEXT: lea %s5, -256
; CHECK-NEXT: and %s5, %s5, (32)0
; CHECK-NEXT: and %s4, %s4, %s5
; CHECK-NEXT: and %s4, %s4, (32)0		; CHECK-NEXT: and %s4, %s4, (32)0
; CHECK-NEXT: or %s2, %s4, %s2		; CHECK-NEXT: and %s3, %s3, %s4
; CHECK-NEXT: or %s3, %s4, %s3		; CHECK-NEXT: or %s1, %s3, %s1
; CHECK-NEXT: cas.w %s2, (%s1), %s3		; CHECK-NEXT: or %s3, %s3, %s5
; CHECK-NEXT: cmps.w.sx %s3, %s2, %s3		; CHECK-NEXT: cas.w %s1, (%s2), %s3
; CHECK-NEXT: or %s1, 0, (0)1		; CHECK-NEXT: cmps.w.sx %s4, %s1, %s3
; CHECK-NEXT: cmov.w.eq %s1, (63)0, %s3		; CHECK-NEXT: or %s2, 0, (0)1
; CHECK-NEXT: brne.w 0, %s1, .LBB44_2		; CHECK-NEXT: cmov.w.eq %s2, (63)0, %s4
		; CHECK-NEXT: breq.w %s1, %s3, .LBB44_2
; CHECK-NEXT: # %bb.1:		; CHECK-NEXT: # %bb.1:
; CHECK-NEXT: st1b %s2, (, %s0)		; CHECK-NEXT: st1b %s1, (, %s0)
; CHECK-NEXT: .LBB44_2:		; CHECK-NEXT: .LBB44_2:
; CHECK-NEXT: adds.w.zx %s0, %s1, (0)1		; CHECK-NEXT: adds.w.zx %s0, %s2, (0)1
; CHECK-NEXT: b.l.t (, %s10)		; CHECK-NEXT: b.l.t (, %s10)
%3 = zext i1 %1 to i8		%3 = zext i1 %1 to i8
%4 = load i8, i8* %0, align 1		%4 = load i8, i8* %0, align 1
%5 = cmpxchg weak i8* getelementptr inbounds (%"struct.std::__1::atomic", %"struct.std::__1::atomic"* @gv_i1, i64 0, i32 0, i32 0, i32 0, i32 0), i8 %4, i8 %3 monotonic monotonic		%5 = cmpxchg weak i8* getelementptr inbounds (%"struct.std::__1::atomic", %"struct.std::__1::atomic"* @gv_i1, i64 0, i32 0, i32 0, i32 0, i32 0), i8 %4, i8 %3 monotonic monotonic
%6 = extractvalue { i8, i1 } %5, 1		%6 = extractvalue { i8, i1 } %5, 1
br i1 %6, label %9, label %7		br i1 %6, label %9, label %7

7: ; preds = %2		7: ; preds = %2
%8 = extractvalue { i8, i1 } %5, 0		%8 = extractvalue { i8, i1 } %5, 0
store i8 %8, i8* %0, align 1		store i8 %8, i8* %0, align 1
br label %9		br label %9

9: ; preds = %2, %7		9: ; preds = %2, %7
ret i1 %6		ret i1 %6
}		}

; Function Attrs: nofree norecurse nounwind mustprogress		; Function Attrs: nofree norecurse nounwind mustprogress
define signext i8 @_Z29atomic_cmp_swap_relaxed_gv_i8Rcc(i8* nocapture nonnull align 1 dereferenceable(1) %0, i8 signext %1) {		define signext i8 @_Z29atomic_cmp_swap_relaxed_gv_i8Rcc(i8* nocapture nonnull align 1 dereferenceable(1) %0, i8 signext %1) {
; CHECK-LABEL: _Z29atomic_cmp_swap_relaxed_gv_i8Rcc:		; CHECK-LABEL: _Z29atomic_cmp_swap_relaxed_gv_i8Rcc:
; CHECK: # %bb.0:		; CHECK: # %bb.0: # %partword.cmpxchg.loop
; CHECK-NEXT: ld1b.zx %s2, (, %s0)		; CHECK-NEXT: lea %s2, gv_i8@lo
; CHECK-NEXT: and %s3, %s1, (56)0		; CHECK-NEXT: and %s2, %s2, (32)0
; CHECK-NEXT: lea %s1, gv_i8@lo		; CHECK-NEXT: lea.sl %s2, gv_i8@hi(, %s2)
; CHECK-NEXT: and %s1, %s1, (32)0		; CHECK-NEXT: and %s2, -4, %s2
; CHECK-NEXT: lea.sl %s1, gv_i8@hi(, %s1)		; CHECK-NEXT: and %s1, %s1, (56)0
; CHECK-NEXT: and %s1, -4, %s1		; CHECK-NEXT: ldl.zx %s3, (, %s2)
; CHECK-NEXT: ldl.zx %s4, (, %s1)		; CHECK-NEXT: lea %s4, -256
; CHECK-NEXT: and %s3, %s3, (32)0		; CHECK-NEXT: ld1b.zx %s5, (, %s0)
; CHECK-NEXT: lea %s5, -256
; CHECK-NEXT: and %s5, %s5, (32)0
; CHECK-NEXT: and %s4, %s4, %s5
; CHECK-NEXT: and %s4, %s4, (32)0		; CHECK-NEXT: and %s4, %s4, (32)0
; CHECK-NEXT: or %s3, %s4, %s3		; CHECK-NEXT: and %s3, %s3, %s4
; CHECK-NEXT: or %s2, %s4, %s2		; CHECK-NEXT: or %s1, %s3, %s1
; CHECK-NEXT: cas.w %s3, (%s1), %s2		; CHECK-NEXT: or %s3, %s3, %s5
; CHECK-NEXT: cmps.w.sx %s2, %s3, %s2		; CHECK-NEXT: cas.w %s1, (%s2), %s3
; CHECK-NEXT: or %s1, 0, (0)1		; CHECK-NEXT: cmps.w.sx %s4, %s1, %s3
; CHECK-NEXT: cmov.w.eq %s1, (63)0, %s2		; CHECK-NEXT: or %s2, 0, (0)1
; CHECK-NEXT: brne.w 0, %s1, .LBB45_2		; CHECK-NEXT: cmov.w.eq %s2, (63)0, %s4
		; CHECK-NEXT: breq.w %s1, %s3, .LBB45_2
; CHECK-NEXT: # %bb.1:		; CHECK-NEXT: # %bb.1:
; CHECK-NEXT: st1b %s3, (, %s0)		; CHECK-NEXT: st1b %s1, (, %s0)
; CHECK-NEXT: .LBB45_2:		; CHECK-NEXT: .LBB45_2:
; CHECK-NEXT: adds.w.zx %s0, %s1, (0)1		; CHECK-NEXT: adds.w.zx %s0, %s2, (0)1
; CHECK-NEXT: b.l.t (, %s10)		; CHECK-NEXT: b.l.t (, %s10)
%3 = load i8, i8* %0, align 1		%3 = load i8, i8* %0, align 1
%4 = cmpxchg weak i8* getelementptr inbounds (%"struct.std::__1::atomic.0", %"struct.std::__1::atomic.0"* @gv_i8, i64 0, i32 0, i32 0, i32 0, i32 0, i32 0), i8 %3, i8 %1 monotonic monotonic		%4 = cmpxchg weak i8* getelementptr inbounds (%"struct.std::__1::atomic.0", %"struct.std::__1::atomic.0"* @gv_i8, i64 0, i32 0, i32 0, i32 0, i32 0, i32 0), i8 %3, i8 %1 monotonic monotonic
%5 = extractvalue { i8, i1 } %4, 1		%5 = extractvalue { i8, i1 } %4, 1
br i1 %5, label %8, label %6		br i1 %5, label %8, label %6

6: ; preds = %2		6: ; preds = %2
%7 = extractvalue { i8, i1 } %4, 0		%7 = extractvalue { i8, i1 } %4, 0
store i8 %7, i8* %0, align 1		store i8 %7, i8* %0, align 1
br label %8		br label %8

8: ; preds = %2, %6		8: ; preds = %2, %6
%9 = zext i1 %5 to i8		%9 = zext i1 %5 to i8
ret i8 %9		ret i8 %9
}		}

; Function Attrs: nofree norecurse nounwind mustprogress		; Function Attrs: nofree norecurse nounwind mustprogress
define zeroext i8 @_Z29atomic_cmp_swap_relaxed_gv_u8Rhh(i8* nocapture nonnull align 1 dereferenceable(1) %0, i8 zeroext %1) {		define zeroext i8 @_Z29atomic_cmp_swap_relaxed_gv_u8Rhh(i8* nocapture nonnull align 1 dereferenceable(1) %0, i8 zeroext %1) {
; CHECK-LABEL: _Z29atomic_cmp_swap_relaxed_gv_u8Rhh:		; CHECK-LABEL: _Z29atomic_cmp_swap_relaxed_gv_u8Rhh:
; CHECK: # %bb.0:		; CHECK: # %bb.0: # %partword.cmpxchg.loop
; CHECK-NEXT: and %s2, %s1, (32)0		; CHECK-NEXT: lea %s2, gv_u8@lo
; CHECK-NEXT: lea %s1, gv_u8@lo		; CHECK-NEXT: and %s2, %s2, (32)0
; CHECK-NEXT: and %s1, %s1, (32)0		; CHECK-NEXT: lea.sl %s2, gv_u8@hi(, %s2)
; CHECK-NEXT: lea.sl %s1, gv_u8@hi(, %s1)		; CHECK-NEXT: and %s2, -4, %s2
; CHECK-NEXT: and %s1, -4, %s1		; CHECK-NEXT: ldl.zx %s3, (, %s2)
; CHECK-NEXT: ldl.zx %s4, (, %s1)		; CHECK-NEXT: lea %s4, -256
; CHECK-NEXT: ld1b.zx %s3, (, %s0)		; CHECK-NEXT: ld1b.zx %s5, (, %s0)
; CHECK-NEXT: lea %s5, -256
; CHECK-NEXT: and %s5, %s5, (32)0
; CHECK-NEXT: and %s4, %s4, %s5
; CHECK-NEXT: and %s4, %s4, (32)0		; CHECK-NEXT: and %s4, %s4, (32)0
; CHECK-NEXT: or %s2, %s4, %s2		; CHECK-NEXT: and %s3, %s3, %s4
; CHECK-NEXT: or %s3, %s4, %s3		; CHECK-NEXT: or %s1, %s3, %s1
; CHECK-NEXT: cas.w %s2, (%s1), %s3		; CHECK-NEXT: or %s3, %s3, %s5
; CHECK-NEXT: cmps.w.sx %s3, %s2, %s3		; CHECK-NEXT: cas.w %s1, (%s2), %s3
; CHECK-NEXT: or %s1, 0, (0)1		; CHECK-NEXT: cmps.w.sx %s4, %s1, %s3
; CHECK-NEXT: cmov.w.eq %s1, (63)0, %s3		; CHECK-NEXT: or %s2, 0, (0)1
; CHECK-NEXT: brne.w 0, %s1, .LBB46_2		; CHECK-NEXT: cmov.w.eq %s2, (63)0, %s4
		; CHECK-NEXT: breq.w %s1, %s3, .LBB46_2
; CHECK-NEXT: # %bb.1:		; CHECK-NEXT: # %bb.1:
; CHECK-NEXT: st1b %s2, (, %s0)		; CHECK-NEXT: st1b %s1, (, %s0)
; CHECK-NEXT: .LBB46_2:		; CHECK-NEXT: .LBB46_2:
; CHECK-NEXT: adds.w.zx %s0, %s1, (0)1		; CHECK-NEXT: adds.w.zx %s0, %s2, (0)1
; CHECK-NEXT: b.l.t (, %s10)		; CHECK-NEXT: b.l.t (, %s10)
%3 = load i8, i8* %0, align 1		%3 = load i8, i8* %0, align 1
%4 = cmpxchg weak i8* getelementptr inbounds (%"struct.std::__1::atomic.5", %"struct.std::__1::atomic.5"* @gv_u8, i64 0, i32 0, i32 0, i32 0, i32 0, i32 0), i8 %3, i8 %1 monotonic monotonic		%4 = cmpxchg weak i8* getelementptr inbounds (%"struct.std::__1::atomic.5", %"struct.std::__1::atomic.5"* @gv_u8, i64 0, i32 0, i32 0, i32 0, i32 0, i32 0), i8 %3, i8 %1 monotonic monotonic
%5 = extractvalue { i8, i1 } %4, 1		%5 = extractvalue { i8, i1 } %4, 1
br i1 %5, label %8, label %6		br i1 %5, label %8, label %6

6: ; preds = %2		6: ; preds = %2
%7 = extractvalue { i8, i1 } %4, 0		%7 = extractvalue { i8, i1 } %4, 0
store i8 %7, i8* %0, align 1		store i8 %7, i8* %0, align 1
br label %8		br label %8

8: ; preds = %2, %6		8: ; preds = %2, %6
%9 = zext i1 %5 to i8		%9 = zext i1 %5 to i8
ret i8 %9		ret i8 %9
}		}

; Function Attrs: nofree norecurse nounwind mustprogress		; Function Attrs: nofree norecurse nounwind mustprogress
define signext i16 @_Z30atomic_cmp_swap_relaxed_gv_i16Rss(i16* nocapture nonnull align 2 dereferenceable(2) %0, i16 signext %1) {		define signext i16 @_Z30atomic_cmp_swap_relaxed_gv_i16Rss(i16* nocapture nonnull align 2 dereferenceable(2) %0, i16 signext %1) {
; CHECK-LABEL: _Z30atomic_cmp_swap_relaxed_gv_i16Rss:		; CHECK-LABEL: _Z30atomic_cmp_swap_relaxed_gv_i16Rss:
; CHECK: # %bb.0:		; CHECK: # %bb.0: # %partword.cmpxchg.loop
; CHECK-NEXT: lea %s2, gv_i16@lo		; CHECK-NEXT: lea %s2, gv_i16@lo
; CHECK-NEXT: and %s2, %s2, (32)0		; CHECK-NEXT: and %s2, %s2, (32)0
; CHECK-NEXT: lea.sl %s2, gv_i16@hi(, %s2)		; CHECK-NEXT: lea.sl %s2, gv_i16@hi(, %s2)
; CHECK-NEXT: and %s2, -4, %s2		; CHECK-NEXT: and %s2, -4, %s2
; CHECK-NEXT: ld2b.zx %s4, 2(, %s2)		; CHECK-NEXT: ld2b.zx %s3, 2(, %s2)
; CHECK-NEXT: ld2b.zx %s3, (, %s0)		; CHECK-NEXT: ld2b.zx %s4, (, %s0)
; CHECK-NEXT: and %s1, %s1, (48)0		; CHECK-NEXT: and %s1, %s1, (48)0
; CHECK-NEXT: and %s1, %s1, (32)0		; CHECK-NEXT: sla.w.sx %s3, %s3, 16
; CHECK-NEXT: sla.w.sx %s4, %s4, 16		; CHECK-NEXT: or %s1, %s3, %s1
; CHECK-NEXT: or %s1, %s4, %s1		; CHECK-NEXT: or %s3, %s3, %s4
; CHECK-NEXT: or %s3, %s4, %s3
; CHECK-NEXT: cas.w %s1, (%s2), %s3		; CHECK-NEXT: cas.w %s1, (%s2), %s3
; CHECK-NEXT: cmps.w.sx %s3, %s1, %s3		; CHECK-NEXT: cmps.w.sx %s4, %s1, %s3
; CHECK-NEXT: or %s2, 0, (0)1		; CHECK-NEXT: or %s2, 0, (0)1
; CHECK-NEXT: cmov.w.eq %s2, (63)0, %s3		; CHECK-NEXT: cmov.w.eq %s2, (63)0, %s4
; CHECK-NEXT: brne.w 0, %s2, .LBB47_2		; CHECK-NEXT: breq.w %s1, %s3, .LBB47_2
; CHECK-NEXT: # %bb.1:		; CHECK-NEXT: # %bb.1:
; CHECK-NEXT: st2b %s1, (, %s0)		; CHECK-NEXT: st2b %s1, (, %s0)
; CHECK-NEXT: .LBB47_2:		; CHECK-NEXT: .LBB47_2:
; CHECK-NEXT: adds.w.zx %s0, %s2, (0)1		; CHECK-NEXT: adds.w.zx %s0, %s2, (0)1
; CHECK-NEXT: b.l.t (, %s10)		; CHECK-NEXT: b.l.t (, %s10)
%3 = load i16, i16* %0, align 2		%3 = load i16, i16* %0, align 2
%4 = cmpxchg weak i16* getelementptr inbounds (%"struct.std::__1::atomic.10", %"struct.std::__1::atomic.10"* @gv_i16, i64 0, i32 0, i32 0, i32 0, i32 0, i32 0), i16 %3, i16 %1 monotonic monotonic		%4 = cmpxchg weak i16* getelementptr inbounds (%"struct.std::__1::atomic.10", %"struct.std::__1::atomic.10"* @gv_i16, i64 0, i32 0, i32 0, i32 0, i32 0, i32 0), i16 %3, i16 %1 monotonic monotonic
%5 = extractvalue { i16, i1 } %4, 1		%5 = extractvalue { i16, i1 } %4, 1
br i1 %5, label %8, label %6		br i1 %5, label %8, label %6

6: ; preds = %2		6: ; preds = %2
%7 = extractvalue { i16, i1 } %4, 0		%7 = extractvalue { i16, i1 } %4, 0
store i16 %7, i16* %0, align 2		store i16 %7, i16* %0, align 2
br label %8		br label %8

8: ; preds = %2, %6		8: ; preds = %2, %6
%9 = zext i1 %5 to i16		%9 = zext i1 %5 to i16
ret i16 %9		ret i16 %9
}		}

; Function Attrs: nofree norecurse nounwind mustprogress		; Function Attrs: nofree norecurse nounwind mustprogress
define zeroext i16 @_Z30atomic_cmp_swap_relaxed_gv_u16Rtt(i16* nocapture nonnull align 2 dereferenceable(2) %0, i16 zeroext %1) {		define zeroext i16 @_Z30atomic_cmp_swap_relaxed_gv_u16Rtt(i16* nocapture nonnull align 2 dereferenceable(2) %0, i16 zeroext %1) {
; CHECK-LABEL: _Z30atomic_cmp_swap_relaxed_gv_u16Rtt:		; CHECK-LABEL: _Z30atomic_cmp_swap_relaxed_gv_u16Rtt:
; CHECK: # %bb.0:		; CHECK: # %bb.0: # %partword.cmpxchg.loop
; CHECK-NEXT: lea %s2, gv_u16@lo		; CHECK-NEXT: lea %s2, gv_u16@lo
; CHECK-NEXT: and %s2, %s2, (32)0		; CHECK-NEXT: and %s2, %s2, (32)0
; CHECK-NEXT: lea.sl %s2, gv_u16@hi(, %s2)		; CHECK-NEXT: lea.sl %s2, gv_u16@hi(, %s2)
; CHECK-NEXT: and %s2, -4, %s2		; CHECK-NEXT: and %s2, -4, %s2
; CHECK-NEXT: ld2b.zx %s4, 2(, %s2)		; CHECK-NEXT: ld2b.zx %s3, 2(, %s2)
; CHECK-NEXT: ld2b.zx %s3, (, %s0)		; CHECK-NEXT: ld2b.zx %s4, (, %s0)
; CHECK-NEXT: and %s1, %s1, (32)0		; CHECK-NEXT: sla.w.sx %s3, %s3, 16
; CHECK-NEXT: sla.w.sx %s4, %s4, 16		; CHECK-NEXT: or %s1, %s3, %s1
; CHECK-NEXT: or %s1, %s4, %s1		; CHECK-NEXT: or %s3, %s3, %s4
; CHECK-NEXT: or %s3, %s4, %s3
; CHECK-NEXT: cas.w %s1, (%s2), %s3		; CHECK-NEXT: cas.w %s1, (%s2), %s3
; CHECK-NEXT: cmps.w.sx %s3, %s1, %s3		; CHECK-NEXT: cmps.w.sx %s4, %s1, %s3
; CHECK-NEXT: or %s2, 0, (0)1		; CHECK-NEXT: or %s2, 0, (0)1
; CHECK-NEXT: cmov.w.eq %s2, (63)0, %s3		; CHECK-NEXT: cmov.w.eq %s2, (63)0, %s4
; CHECK-NEXT: brne.w 0, %s2, .LBB48_2		; CHECK-NEXT: breq.w %s1, %s3, .LBB48_2
; CHECK-NEXT: # %bb.1:		; CHECK-NEXT: # %bb.1:
; CHECK-NEXT: st2b %s1, (, %s0)		; CHECK-NEXT: st2b %s1, (, %s0)
; CHECK-NEXT: .LBB48_2:		; CHECK-NEXT: .LBB48_2:
; CHECK-NEXT: adds.w.zx %s0, %s2, (0)1		; CHECK-NEXT: adds.w.zx %s0, %s2, (0)1
; CHECK-NEXT: b.l.t (, %s10)		; CHECK-NEXT: b.l.t (, %s10)
%3 = load i16, i16* %0, align 2		%3 = load i16, i16* %0, align 2
%4 = cmpxchg weak i16* getelementptr inbounds (%"struct.std::__1::atomic.15", %"struct.std::__1::atomic.15"* @gv_u16, i64 0, i32 0, i32 0, i32 0, i32 0, i32 0), i16 %3, i16 %1 monotonic monotonic		%4 = cmpxchg weak i16* getelementptr inbounds (%"struct.std::__1::atomic.15", %"struct.std::__1::atomic.15"* @gv_u16, i64 0, i32 0, i32 0, i32 0, i32 0, i32 0), i16 %3, i16 %1 monotonic monotonic
%5 = extractvalue { i16, i1 } %4, 1		%5 = extractvalue { i16, i1 } %4, 1

llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i16.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -mtriple=amdgcn-amd-amdhsa -S -atomic-expand %s | FileCheck %s			; RUN: opt -mtriple=amdgcn-amd-amdhsa -S -atomic-expand %s | FileCheck %s
	; RUN: opt -mtriple=r600-mesa-mesa3d -S -atomic-expand %s | FileCheck %s			; RUN: opt -mtriple=r600-mesa-mesa3d -S -atomic-expand %s | FileCheck %s

	target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5"			target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5"

	define i16 @test_atomicrmw_xchg_i16_global(i16 addrspace(1)* %ptr, i16 %value) {			define i16 @test_atomicrmw_xchg_i16_global(i16 addrspace(1)* %ptr, i16 %value) {
	; CHECK-LABEL: @test_atomicrmw_xchg_i16_global(			; CHECK-LABEL: @test_atomicrmw_xchg_i16_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16 addrspace(1)* @llvm.ptrmask.p1i16.i64(i16 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[TMP5:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[TMP7:%.*]] = or i32 [[TMP6]], [[VALOPERAND_SHIFTED]]			; CHECK-NEXT: [[TMP6:%.*]] = or i32 [[TMP5]], [[VALOPERAND_SHIFTED]]
	; CHECK-NEXT: [[TMP8:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[TMP7]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP7:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[TMP6]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP8]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP7]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP8]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP7]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: ret i16 [[EXTRACTED]]			; CHECK-NEXT: ret i16 [[EXTRACTED]]
	;			;
	%res = atomicrmw xchg i16 addrspace(1)* %ptr, i16 %value seq_cst			%res = atomicrmw xchg i16 addrspace(1)* %ptr, i16 %value seq_cst
	ret i16 %res			ret i16 %res
	}			}

	define i16 @test_atomicrmw_xchg_i16_global_align4(i16 addrspace(1)* %ptr, i16 %value) {			define i16 @test_atomicrmw_xchg_i16_global_align4(i16 addrspace(1)* %ptr, i16 %value) {
	; CHECK-LABEL: @test_atomicrmw_xchg_i16_global_align4(			; CHECK-LABEL: @test_atomicrmw_xchg_i16_global_align4(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16 addrspace(1)* @llvm.ptrmask.p1i16.i64(i16 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[TMP5:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[TMP7:%.*]] = or i32 [[TMP6]], [[VALOPERAND_SHIFTED]]			; CHECK-NEXT: [[TMP6:%.*]] = or i32 [[TMP5]], [[VALOPERAND_SHIFTED]]
	; CHECK-NEXT: [[TMP8:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[TMP7]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP7:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[TMP6]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP8]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP7]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP8]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP7]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: ret i16 [[EXTRACTED]]			; CHECK-NEXT: ret i16 [[EXTRACTED]]
	;			;
	%res = atomicrmw xchg i16 addrspace(1)* %ptr, i16 %value seq_cst, align 4			%res = atomicrmw xchg i16 addrspace(1)* %ptr, i16 %value seq_cst, align 4
	ret i16 %res			ret i16 %res
	}			}

	define i16 @test_atomicrmw_add_i16_global(i16 addrspace(1)* %ptr, i16 %value) {			define i16 @test_atomicrmw_add_i16_global(i16 addrspace(1)* %ptr, i16 %value) {
	; CHECK-LABEL: @test_atomicrmw_add_i16_global(			; CHECK-LABEL: @test_atomicrmw_add_i16_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16 addrspace(1)* @llvm.ptrmask.p1i16.i64(i16 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[NEW:%.*]] = add i32 [[LOADED]], [[VALOPERAND_SHIFTED]]			; CHECK-NEXT: [[NEW:%.*]] = add i32 [[LOADED]], [[VALOPERAND_SHIFTED]]
	; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[NEW]], [[MASK]]			; CHECK-NEXT: [[TMP5:%.*]] = and i32 [[NEW]], [[MASK]]
	; CHECK-NEXT: [[TMP7:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[TMP8:%.*]] = or i32 [[TMP7]], [[TMP6]]			; CHECK-NEXT: [[TMP7:%.*]] = or i32 [[TMP6]], [[TMP5]]
	; CHECK-NEXT: [[TMP9:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[TMP8]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP8:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[TMP7]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP9]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP8]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP9]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP8]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: ret i16 [[EXTRACTED]]			; CHECK-NEXT: ret i16 [[EXTRACTED]]
	;			;
	%res = atomicrmw add i16 addrspace(1)* %ptr, i16 %value seq_cst			%res = atomicrmw add i16 addrspace(1)* %ptr, i16 %value seq_cst
	ret i16 %res			ret i16 %res
	}			}

	define i16 @test_atomicrmw_add_i16_global_align4(i16 addrspace(1)* %ptr, i16 %value) {			define i16 @test_atomicrmw_add_i16_global_align4(i16 addrspace(1)* %ptr, i16 %value) {
	; CHECK-LABEL: @test_atomicrmw_add_i16_global_align4(			; CHECK-LABEL: @test_atomicrmw_add_i16_global_align4(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16 addrspace(1)* @llvm.ptrmask.p1i16.i64(i16 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[NEW:%.*]] = add i32 [[LOADED]], [[VALOPERAND_SHIFTED]]			; CHECK-NEXT: [[NEW:%.*]] = add i32 [[LOADED]], [[VALOPERAND_SHIFTED]]
	; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[NEW]], [[MASK]]			; CHECK-NEXT: [[TMP5:%.*]] = and i32 [[NEW]], [[MASK]]
	; CHECK-NEXT: [[TMP7:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[TMP8:%.*]] = or i32 [[TMP7]], [[TMP6]]			; CHECK-NEXT: [[TMP7:%.*]] = or i32 [[TMP6]], [[TMP5]]
	; CHECK-NEXT: [[TMP9:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[TMP8]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP8:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[TMP7]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP9]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP8]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP9]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP8]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: ret i16 [[EXTRACTED]]			; CHECK-NEXT: ret i16 [[EXTRACTED]]
	;			;
	%res = atomicrmw add i16 addrspace(1)* %ptr, i16 %value seq_cst, align 4			%res = atomicrmw add i16 addrspace(1)* %ptr, i16 %value seq_cst, align 4
	ret i16 %res			ret i16 %res
	}			}

	define i16 @test_atomicrmw_sub_i16_global(i16 addrspace(1)* %ptr, i16 %value) {			define i16 @test_atomicrmw_sub_i16_global(i16 addrspace(1)* %ptr, i16 %value) {
	; CHECK-LABEL: @test_atomicrmw_sub_i16_global(			; CHECK-LABEL: @test_atomicrmw_sub_i16_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16 addrspace(1)* @llvm.ptrmask.p1i16.i64(i16 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[NEW:%.*]] = sub i32 [[LOADED]], [[VALOPERAND_SHIFTED]]			; CHECK-NEXT: [[NEW:%.*]] = sub i32 [[LOADED]], [[VALOPERAND_SHIFTED]]
	; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[NEW]], [[MASK]]			; CHECK-NEXT: [[TMP5:%.*]] = and i32 [[NEW]], [[MASK]]
	; CHECK-NEXT: [[TMP7:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[TMP8:%.*]] = or i32 [[TMP7]], [[TMP6]]			; CHECK-NEXT: [[TMP7:%.*]] = or i32 [[TMP6]], [[TMP5]]
	; CHECK-NEXT: [[TMP9:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[TMP8]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP8:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[TMP7]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP9]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP8]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP9]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP8]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: ret i16 [[EXTRACTED]]			; CHECK-NEXT: ret i16 [[EXTRACTED]]
	;			;
	%res = atomicrmw sub i16 addrspace(1)* %ptr, i16 %value seq_cst			%res = atomicrmw sub i16 addrspace(1)* %ptr, i16 %value seq_cst
	ret i16 %res			ret i16 %res
	}			}

	define i16 @test_atomicrmw_and_i16_global(i16 addrspace(1)* %ptr, i16 %value) {			define i16 @test_atomicrmw_and_i16_global(i16 addrspace(1)* %ptr, i16 %value) {
	; CHECK-LABEL: @test_atomicrmw_and_i16_global(			; CHECK-LABEL: @test_atomicrmw_and_i16_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16 addrspace(1)* @llvm.ptrmask.p1i16.i64(i16 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[ANDOPERAND:%.*]] = or i32 [[INV_MASK]], [[VALOPERAND_SHIFTED]]			; CHECK-NEXT: [[ANDOPERAND:%.*]] = or i32 [[INV_MASK]], [[VALOPERAND_SHIFTED]]
	; CHECK-NEXT: [[TMP5:%.*]] = atomicrmw and i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[ANDOPERAND]] seq_cst, align 4			; CHECK-NEXT: [[TMP4:%.*]] = atomicrmw and i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[ANDOPERAND]] seq_cst, align 4
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP5]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP4]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: ret i16 [[EXTRACTED]]			; CHECK-NEXT: ret i16 [[EXTRACTED]]
	;			;
	%res = atomicrmw and i16 addrspace(1)* %ptr, i16 %value seq_cst			%res = atomicrmw and i16 addrspace(1)* %ptr, i16 %value seq_cst
	ret i16 %res			ret i16 %res
	}			}

	define i16 @test_atomicrmw_nand_i16_global(i16 addrspace(1)* %ptr, i16 %value) {			define i16 @test_atomicrmw_nand_i16_global(i16 addrspace(1)* %ptr, i16 %value) {
	; CHECK-LABEL: @test_atomicrmw_nand_i16_global(			; CHECK-LABEL: @test_atomicrmw_nand_i16_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16 addrspace(1)* @llvm.ptrmask.p1i16.i64(i16 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[LOADED]], [[VALOPERAND_SHIFTED]]			; CHECK-NEXT: [[TMP5:%.*]] = and i32 [[LOADED]], [[VALOPERAND_SHIFTED]]
	; CHECK-NEXT: [[NEW:%.*]] = xor i32 [[TMP6]], -1			; CHECK-NEXT: [[NEW:%.*]] = xor i32 [[TMP5]], -1
	; CHECK-NEXT: [[TMP7:%.*]] = and i32 [[NEW]], [[MASK]]			; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[NEW]], [[MASK]]
	; CHECK-NEXT: [[TMP8:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[TMP7:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[TMP9:%.*]] = or i32 [[TMP8]], [[TMP7]]			; CHECK-NEXT: [[TMP8:%.*]] = or i32 [[TMP7]], [[TMP6]]
	; CHECK-NEXT: [[TMP10:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[TMP9]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP9:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[TMP8]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP10]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP9]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP10]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP9]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: ret i16 [[EXTRACTED]]			; CHECK-NEXT: ret i16 [[EXTRACTED]]
	;			;
	%res = atomicrmw nand i16 addrspace(1)* %ptr, i16 %value seq_cst			%res = atomicrmw nand i16 addrspace(1)* %ptr, i16 %value seq_cst
	ret i16 %res			ret i16 %res
	}			}

	define i16 @test_atomicrmw_or_i16_global(i16 addrspace(1)* %ptr, i16 %value) {			define i16 @test_atomicrmw_or_i16_global(i16 addrspace(1)* %ptr, i16 %value) {
	; CHECK-LABEL: @test_atomicrmw_or_i16_global(			; CHECK-LABEL: @test_atomicrmw_or_i16_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16 addrspace(1)* @llvm.ptrmask.p1i16.i64(i16 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = atomicrmw or i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[VALOPERAND_SHIFTED]] seq_cst, align 4			; CHECK-NEXT: [[TMP4:%.*]] = atomicrmw or i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[VALOPERAND_SHIFTED]] seq_cst, align 4
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP5]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP4]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: ret i16 [[EXTRACTED]]			; CHECK-NEXT: ret i16 [[EXTRACTED]]
	;			;
	%res = atomicrmw or i16 addrspace(1)* %ptr, i16 %value seq_cst			%res = atomicrmw or i16 addrspace(1)* %ptr, i16 %value seq_cst
	ret i16 %res			ret i16 %res
	}			}

	define i16 @test_atomicrmw_xor_i16_global(i16 addrspace(1)* %ptr, i16 %value) {			define i16 @test_atomicrmw_xor_i16_global(i16 addrspace(1)* %ptr, i16 %value) {
	; CHECK-LABEL: @test_atomicrmw_xor_i16_global(			; CHECK-LABEL: @test_atomicrmw_xor_i16_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16 addrspace(1)* @llvm.ptrmask.p1i16.i64(i16 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = atomicrmw xor i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[VALOPERAND_SHIFTED]] seq_cst, align 4			; CHECK-NEXT: [[TMP4:%.*]] = atomicrmw xor i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[VALOPERAND_SHIFTED]] seq_cst, align 4
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP5]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP4]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: ret i16 [[EXTRACTED]]			; CHECK-NEXT: ret i16 [[EXTRACTED]]
	;			;
	%res = atomicrmw xor i16 addrspace(1)* %ptr, i16 %value seq_cst			%res = atomicrmw xor i16 addrspace(1)* %ptr, i16 %value seq_cst
	ret i16 %res			ret i16 %res
	}			}

	define i16 @test_atomicrmw_max_i16_global(i16 addrspace(1)* %ptr, i16 %value) {			define i16 @test_atomicrmw_max_i16_global(i16 addrspace(1)* %ptr, i16 %value) {
	; CHECK-LABEL: @test_atomicrmw_max_i16_global(			; CHECK-LABEL: @test_atomicrmw_max_i16_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16 addrspace(1)* @llvm.ptrmask.p1i16.i64(i16 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[LOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[LOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: [[TMP6:%.*]] = icmp sgt i16 [[EXTRACTED]], [[VALUE]]			; CHECK-NEXT: [[TMP5:%.*]] = icmp sgt i16 [[EXTRACTED]], [[VALUE]]
	; CHECK-NEXT: [[NEW:%.*]] = select i1 [[TMP6]], i16 [[EXTRACTED]], i16 [[VALUE]]			; CHECK-NEXT: [[NEW:%.*]] = select i1 [[TMP5]], i16 [[EXTRACTED]], i16 [[VALUE]]
	; CHECK-NEXT: [[EXTENDED:%.*]] = zext i16 [[NEW]] to i32			; CHECK-NEXT: [[EXTENDED:%.*]] = zext i16 [[NEW]] to i32
	; CHECK-NEXT: [[SHIFTED1:%.*]] = shl nuw i32 [[EXTENDED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED2:%.*]] = shl nuw i32 [[EXTENDED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[UNMASKED:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[UNMASKED:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[INSERTED:%.*]] = or i32 [[UNMASKED]], [[SHIFTED1]]			; CHECK-NEXT: [[INSERTED:%.*]] = or i32 [[UNMASKED]], [[SHIFTED2]]
	; CHECK-NEXT: [[TMP7:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[INSERTED]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP6:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[INSERTED]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP7]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP6]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP7]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP6]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED2:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED3:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED3:%.*]] = trunc i32 [[SHIFTED2]] to i16			; CHECK-NEXT: [[EXTRACTED4:%.*]] = trunc i32 [[SHIFTED3]] to i16
	; CHECK-NEXT: ret i16 [[EXTRACTED3]]			; CHECK-NEXT: ret i16 [[EXTRACTED4]]
	;			;
	%res = atomicrmw max i16 addrspace(1)* %ptr, i16 %value seq_cst			%res = atomicrmw max i16 addrspace(1)* %ptr, i16 %value seq_cst
	ret i16 %res			ret i16 %res
	}			}

	define i16 @test_atomicrmw_min_i16_global(i16 addrspace(1)* %ptr, i16 %value) {			define i16 @test_atomicrmw_min_i16_global(i16 addrspace(1)* %ptr, i16 %value) {
	; CHECK-LABEL: @test_atomicrmw_min_i16_global(			; CHECK-LABEL: @test_atomicrmw_min_i16_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16 addrspace(1)* @llvm.ptrmask.p1i16.i64(i16 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[LOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[LOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: [[TMP6:%.*]] = icmp sle i16 [[EXTRACTED]], [[VALUE]]			; CHECK-NEXT: [[TMP5:%.*]] = icmp sle i16 [[EXTRACTED]], [[VALUE]]
	; CHECK-NEXT: [[NEW:%.*]] = select i1 [[TMP6]], i16 [[EXTRACTED]], i16 [[VALUE]]			; CHECK-NEXT: [[NEW:%.*]] = select i1 [[TMP5]], i16 [[EXTRACTED]], i16 [[VALUE]]
	; CHECK-NEXT: [[EXTENDED:%.*]] = zext i16 [[NEW]] to i32			; CHECK-NEXT: [[EXTENDED:%.*]] = zext i16 [[NEW]] to i32
	; CHECK-NEXT: [[SHIFTED1:%.*]] = shl nuw i32 [[EXTENDED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED2:%.*]] = shl nuw i32 [[EXTENDED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[UNMASKED:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[UNMASKED:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[INSERTED:%.*]] = or i32 [[UNMASKED]], [[SHIFTED1]]			; CHECK-NEXT: [[INSERTED:%.*]] = or i32 [[UNMASKED]], [[SHIFTED2]]
	; CHECK-NEXT: [[TMP7:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[INSERTED]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP6:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[INSERTED]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP7]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP6]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP7]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP6]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED2:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED3:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED3:%.*]] = trunc i32 [[SHIFTED2]] to i16			; CHECK-NEXT: [[EXTRACTED4:%.*]] = trunc i32 [[SHIFTED3]] to i16
	; CHECK-NEXT: ret i16 [[EXTRACTED3]]			; CHECK-NEXT: ret i16 [[EXTRACTED4]]
	;			;
	%res = atomicrmw min i16 addrspace(1)* %ptr, i16 %value seq_cst			%res = atomicrmw min i16 addrspace(1)* %ptr, i16 %value seq_cst
	ret i16 %res			ret i16 %res
	}			}

	define i16 @test_atomicrmw_umax_i16_global(i16 addrspace(1)* %ptr, i16 %value) {			define i16 @test_atomicrmw_umax_i16_global(i16 addrspace(1)* %ptr, i16 %value) {
	; CHECK-LABEL: @test_atomicrmw_umax_i16_global(			; CHECK-LABEL: @test_atomicrmw_umax_i16_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16 addrspace(1)* @llvm.ptrmask.p1i16.i64(i16 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[LOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[LOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: [[TMP6:%.*]] = icmp ugt i16 [[EXTRACTED]], [[VALUE]]			; CHECK-NEXT: [[TMP5:%.*]] = icmp ugt i16 [[EXTRACTED]], [[VALUE]]
	; CHECK-NEXT: [[NEW:%.*]] = select i1 [[TMP6]], i16 [[EXTRACTED]], i16 [[VALUE]]			; CHECK-NEXT: [[NEW:%.*]] = select i1 [[TMP5]], i16 [[EXTRACTED]], i16 [[VALUE]]
	; CHECK-NEXT: [[EXTENDED:%.*]] = zext i16 [[NEW]] to i32			; CHECK-NEXT: [[EXTENDED:%.*]] = zext i16 [[NEW]] to i32
	; CHECK-NEXT: [[SHIFTED1:%.*]] = shl nuw i32 [[EXTENDED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED2:%.*]] = shl nuw i32 [[EXTENDED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[UNMASKED:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[UNMASKED:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[INSERTED:%.*]] = or i32 [[UNMASKED]], [[SHIFTED1]]			; CHECK-NEXT: [[INSERTED:%.*]] = or i32 [[UNMASKED]], [[SHIFTED2]]
	; CHECK-NEXT: [[TMP7:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[INSERTED]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP6:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[INSERTED]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP7]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP6]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP7]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP6]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED2:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED3:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED3:%.*]] = trunc i32 [[SHIFTED2]] to i16			; CHECK-NEXT: [[EXTRACTED4:%.*]] = trunc i32 [[SHIFTED3]] to i16
	; CHECK-NEXT: ret i16 [[EXTRACTED3]]			; CHECK-NEXT: ret i16 [[EXTRACTED4]]
	;			;
	%res = atomicrmw umax i16 addrspace(1)* %ptr, i16 %value seq_cst			%res = atomicrmw umax i16 addrspace(1)* %ptr, i16 %value seq_cst
	ret i16 %res			ret i16 %res
	}			}

	define i16 @test_atomicrmw_umin_i16_global(i16 addrspace(1)* %ptr, i16 %value) {			define i16 @test_atomicrmw_umin_i16_global(i16 addrspace(1)* %ptr, i16 %value) {
	; CHECK-LABEL: @test_atomicrmw_umin_i16_global(			; CHECK-LABEL: @test_atomicrmw_umin_i16_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16 addrspace(1)* @llvm.ptrmask.p1i16.i64(i16 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[LOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[LOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: [[TMP6:%.*]] = icmp ule i16 [[EXTRACTED]], [[VALUE]]			; CHECK-NEXT: [[TMP5:%.*]] = icmp ule i16 [[EXTRACTED]], [[VALUE]]
	; CHECK-NEXT: [[NEW:%.*]] = select i1 [[TMP6]], i16 [[EXTRACTED]], i16 [[VALUE]]			; CHECK-NEXT: [[NEW:%.*]] = select i1 [[TMP5]], i16 [[EXTRACTED]], i16 [[VALUE]]
	; CHECK-NEXT: [[EXTENDED:%.*]] = zext i16 [[NEW]] to i32			; CHECK-NEXT: [[EXTENDED:%.*]] = zext i16 [[NEW]] to i32
	; CHECK-NEXT: [[SHIFTED1:%.*]] = shl nuw i32 [[EXTENDED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED2:%.*]] = shl nuw i32 [[EXTENDED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[UNMASKED:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[UNMASKED:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[INSERTED:%.*]] = or i32 [[UNMASKED]], [[SHIFTED1]]			; CHECK-NEXT: [[INSERTED:%.*]] = or i32 [[UNMASKED]], [[SHIFTED2]]
	; CHECK-NEXT: [[TMP7:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[INSERTED]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP6:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[INSERTED]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP7]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP6]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP7]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP6]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED2:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED3:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED3:%.*]] = trunc i32 [[SHIFTED2]] to i16			; CHECK-NEXT: [[EXTRACTED4:%.*]] = trunc i32 [[SHIFTED3]] to i16
	; CHECK-NEXT: ret i16 [[EXTRACTED3]]			; CHECK-NEXT: ret i16 [[EXTRACTED4]]
	;			;
	%res = atomicrmw umin i16 addrspace(1)* %ptr, i16 %value seq_cst			%res = atomicrmw umin i16 addrspace(1)* %ptr, i16 %value seq_cst
	ret i16 %res			ret i16 %res
	}			}

	define i16 @test_cmpxchg_i16_global(i16 addrspace(1)* %out, i16 %in, i16 %old) {			define i16 @test_cmpxchg_i16_global(i16 addrspace(1)* %out, i16 %in, i16 %old) {
	; CHECK-LABEL: @test_cmpxchg_i16_global(			; CHECK-LABEL: @test_cmpxchg_i16_global(
	; CHECK-NEXT: [[GEP:%.*]] = getelementptr i16, i16 addrspace(1)* [[OUT:%.*]], i64 4			; CHECK-NEXT: [[GEP:%.*]] = getelementptr i16, i16 addrspace(1)* [[OUT:%.*]], i64 4
				; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16 addrspace(1)* @llvm.ptrmask.p1i16.i64(i16 addrspace(1)* [[GEP]], i64 -4)
				; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[GEP]] to i64			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(1)* [[GEP]] to i64
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[IN:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[IN:%.*]] to i32
	; CHECK-NEXT: [[TMP5:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[TMP4:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP6:%.*]] = zext i16 [[OLD:%.*]] to i32			; CHECK-NEXT: [[TMP5:%.*]] = zext i16 [[OLD:%.*]] to i32
	; CHECK-NEXT: [[TMP7:%.*]] = shl i32 [[TMP6]], [[SHIFTAMT]]			; CHECK-NEXT: [[TMP6:%.*]] = shl i32 [[TMP5]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP8:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP7:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: [[TMP9:%.*]] = and i32 [[TMP8]], [[INV_MASK]]			; CHECK-NEXT: [[TMP8:%.*]] = and i32 [[TMP7]], [[INV_MASK]]
	; CHECK-NEXT: br label [[PARTWORD_CMPXCHG_LOOP:%.*]]			; CHECK-NEXT: br label [[PARTWORD_CMPXCHG_LOOP:%.*]]
	; CHECK: partword.cmpxchg.loop:			; CHECK: partword.cmpxchg.loop:
	; CHECK-NEXT: [[TMP10:%.*]] = phi i32 [ [[TMP9]], [[TMP0:%.*]] ], [ [[TMP16:%.*]], [[PARTWORD_CMPXCHG_FAILURE:%.*]] ]			; CHECK-NEXT: [[TMP9:%.*]] = phi i32 [ [[TMP8]], [[TMP0:%.*]] ], [ [[TMP15:%.*]], [[PARTWORD_CMPXCHG_FAILURE:%.*]] ]
	; CHECK-NEXT: [[TMP11:%.*]] = or i32 [[TMP10]], [[TMP5]]			; CHECK-NEXT: [[TMP10:%.*]] = or i32 [[TMP9]], [[TMP4]]
	; CHECK-NEXT: [[TMP12:%.*]] = or i32 [[TMP10]], [[TMP7]]			; CHECK-NEXT: [[TMP11:%.*]] = or i32 [[TMP9]], [[TMP6]]
	; CHECK-NEXT: [[TMP13:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[TMP12]], i32 [[TMP11]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP12:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[TMP11]], i32 [[TMP10]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[TMP14:%.*]] = extractvalue { i32, i1 } [[TMP13]], 0			; CHECK-NEXT: [[TMP13:%.*]] = extractvalue { i32, i1 } [[TMP12]], 0
	; CHECK-NEXT: [[TMP15:%.*]] = extractvalue { i32, i1 } [[TMP13]], 1			; CHECK-NEXT: [[TMP14:%.*]] = extractvalue { i32, i1 } [[TMP12]], 1
	; CHECK-NEXT: br i1 [[TMP15]], label [[PARTWORD_CMPXCHG_END:%.*]], label [[PARTWORD_CMPXCHG_FAILURE]]			; CHECK-NEXT: br i1 [[TMP14]], label [[PARTWORD_CMPXCHG_END:%.*]], label [[PARTWORD_CMPXCHG_FAILURE]]
	; CHECK: partword.cmpxchg.failure:			; CHECK: partword.cmpxchg.failure:
	; CHECK-NEXT: [[TMP16]] = and i32 [[TMP14]], [[INV_MASK]]			; CHECK-NEXT: [[TMP15]] = and i32 [[TMP13]], [[INV_MASK]]
	; CHECK-NEXT: [[TMP17:%.*]] = icmp ne i32 [[TMP10]], [[TMP16]]			; CHECK-NEXT: [[TMP16:%.*]] = icmp ne i32 [[TMP9]], [[TMP15]]
	; CHECK-NEXT: br i1 [[TMP17]], label [[PARTWORD_CMPXCHG_LOOP]], label [[PARTWORD_CMPXCHG_END]]			; CHECK-NEXT: br i1 [[TMP16]], label [[PARTWORD_CMPXCHG_LOOP]], label [[PARTWORD_CMPXCHG_END]]
	; CHECK: partword.cmpxchg.end:			; CHECK: partword.cmpxchg.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP14]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP13]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: [[TMP18:%.*]] = insertvalue { i16, i1 } undef, i16 [[EXTRACTED]], 0			; CHECK-NEXT: [[TMP17:%.*]] = insertvalue { i16, i1 } undef, i16 [[EXTRACTED]], 0
	; CHECK-NEXT: [[TMP19:%.*]] = insertvalue { i16, i1 } [[TMP18]], i1 [[TMP15]], 1			; CHECK-NEXT: [[TMP18:%.*]] = insertvalue { i16, i1 } [[TMP17]], i1 [[TMP14]], 1
	; CHECK-NEXT: [[EXTRACT:%.*]] = extractvalue { i16, i1 } [[TMP19]], 0			; CHECK-NEXT: [[EXTRACT:%.*]] = extractvalue { i16, i1 } [[TMP18]], 0
	; CHECK-NEXT: ret i16 [[EXTRACT]]			; CHECK-NEXT: ret i16 [[EXTRACT]]
	;			;
	%gep = getelementptr i16, i16 addrspace(1)* %out, i64 4			%gep = getelementptr i16, i16 addrspace(1)* %out, i64 4
	%res = cmpxchg i16 addrspace(1)* %gep, i16 %old, i16 %in seq_cst seq_cst			%res = cmpxchg i16 addrspace(1)* %gep, i16 %old, i16 %in seq_cst seq_cst
	%extract = extractvalue {i16, i1} %res, 0			%extract = extractvalue {i16, i1} %res, 0
	ret i16 %extract			ret i16 %extract
	}			}

	define i16 @test_atomicrmw_xchg_i16_local(i16 addrspace(3)* %ptr, i16 %value) {			define i16 @test_atomicrmw_xchg_i16_local(i16 addrspace(3)* %ptr, i16 %value) {
	; CHECK-LABEL: @test_atomicrmw_xchg_i16_local(			; CHECK-LABEL: @test_atomicrmw_xchg_i16_local(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(3)* [[PTR:%.*]] to i32			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16 addrspace(3)* @llvm.ptrmask.p3i16.i32(i16 addrspace(3)* [[PTR:%.*]], i32 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i32 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16 addrspace(3)* [[ALIGNEDADDR]] to i32 addrspace(3)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i32 [[TMP2]] to i32 addrspace(3)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(3)* [[PTR]] to i32
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i32 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i32 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i32 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i32 [[PTRLSB]], 3
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[TMP3]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[TMP2]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[TMP3]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[TMP2]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32 addrspace(3)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32 addrspace(3)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[TMP5:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[TMP7:%.*]] = or i32 [[TMP6]], [[VALOPERAND_SHIFTED]]			; CHECK-NEXT: [[TMP6:%.*]] = or i32 [[TMP5]], [[VALOPERAND_SHIFTED]]
	; CHECK-NEXT: [[TMP8:%.*]] = cmpxchg i32 addrspace(3)* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[TMP7]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP7:%.*]] = cmpxchg i32 addrspace(3)* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[TMP6]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP8]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP7]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP8]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP7]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[TMP3]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[TMP2]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: ret i16 [[EXTRACTED]]			; CHECK-NEXT: ret i16 [[EXTRACTED]]
	;			;
	%res = atomicrmw xchg i16 addrspace(3)* %ptr, i16 %value seq_cst			%res = atomicrmw xchg i16 addrspace(3)* %ptr, i16 %value seq_cst
	ret i16 %res			ret i16 %res
	}			}

	define i16 @test_cmpxchg_i16_local(i16 addrspace(3)* %out, i16 %in, i16 %old) {			define i16 @test_cmpxchg_i16_local(i16 addrspace(3)* %out, i16 %in, i16 %old) {
	; CHECK-LABEL: @test_cmpxchg_i16_local(			; CHECK-LABEL: @test_cmpxchg_i16_local(
	; CHECK-NEXT: [[GEP:%.*]] = getelementptr i16, i16 addrspace(3)* [[OUT:%.*]], i64 4			; CHECK-NEXT: [[GEP:%.*]] = getelementptr i16, i16 addrspace(3)* [[OUT:%.*]], i64 4
				; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16 addrspace(3)* @llvm.ptrmask.p3i16.i32(i16 addrspace(3)* [[GEP]], i32 -4)
				; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16 addrspace(3)* [[ALIGNEDADDR]] to i32 addrspace(3)*
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(3)* [[GEP]] to i32			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i16 addrspace(3)* [[GEP]] to i32
	; CHECK-NEXT: [[TMP2:%.*]] = and i32 [[TMP1]], -4
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i32 [[TMP2]] to i32 addrspace(3)*
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i32 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i32 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i32 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i32 [[PTRLSB]], 3
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[TMP3]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[TMP2]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[IN:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[IN:%.*]] to i32
	; CHECK-NEXT: [[TMP5:%.*]] = shl i32 [[TMP4]], [[TMP3]]			; CHECK-NEXT: [[TMP4:%.*]] = shl i32 [[TMP3]], [[TMP2]]
	; CHECK-NEXT: [[TMP6:%.*]] = zext i16 [[OLD:%.*]] to i32			; CHECK-NEXT: [[TMP5:%.*]] = zext i16 [[OLD:%.*]] to i32
	; CHECK-NEXT: [[TMP7:%.*]] = shl i32 [[TMP6]], [[TMP3]]			; CHECK-NEXT: [[TMP6:%.*]] = shl i32 [[TMP5]], [[TMP2]]
	; CHECK-NEXT: [[TMP8:%.*]] = load i32, i32 addrspace(3)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP7:%.*]] = load i32, i32 addrspace(3)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: [[TMP9:%.*]] = and i32 [[TMP8]], [[INV_MASK]]			; CHECK-NEXT: [[TMP8:%.*]] = and i32 [[TMP7]], [[INV_MASK]]
	; CHECK-NEXT: br label [[PARTWORD_CMPXCHG_LOOP:%.*]]			; CHECK-NEXT: br label [[PARTWORD_CMPXCHG_LOOP:%.*]]
	; CHECK: partword.cmpxchg.loop:			; CHECK: partword.cmpxchg.loop:
	; CHECK-NEXT: [[TMP10:%.*]] = phi i32 [ [[TMP9]], [[TMP0:%.*]] ], [ [[TMP16:%.*]], [[PARTWORD_CMPXCHG_FAILURE:%.*]] ]			; CHECK-NEXT: [[TMP9:%.*]] = phi i32 [ [[TMP8]], [[TMP0:%.*]] ], [ [[TMP15:%.*]], [[PARTWORD_CMPXCHG_FAILURE:%.*]] ]
	; CHECK-NEXT: [[TMP11:%.*]] = or i32 [[TMP10]], [[TMP5]]			; CHECK-NEXT: [[TMP10:%.*]] = or i32 [[TMP9]], [[TMP4]]
	; CHECK-NEXT: [[TMP12:%.*]] = or i32 [[TMP10]], [[TMP7]]			; CHECK-NEXT: [[TMP11:%.*]] = or i32 [[TMP9]], [[TMP6]]
	; CHECK-NEXT: [[TMP13:%.*]] = cmpxchg i32 addrspace(3)* [[ALIGNEDADDR]], i32 [[TMP12]], i32 [[TMP11]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP12:%.*]] = cmpxchg i32 addrspace(3)* [[ALIGNEDADDR1]], i32 [[TMP11]], i32 [[TMP10]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[TMP14:%.*]] = extractvalue { i32, i1 } [[TMP13]], 0			; CHECK-NEXT: [[TMP13:%.*]] = extractvalue { i32, i1 } [[TMP12]], 0
	; CHECK-NEXT: [[TMP15:%.*]] = extractvalue { i32, i1 } [[TMP13]], 1			; CHECK-NEXT: [[TMP14:%.*]] = extractvalue { i32, i1 } [[TMP12]], 1
	; CHECK-NEXT: br i1 [[TMP15]], label [[PARTWORD_CMPXCHG_END:%.*]], label [[PARTWORD_CMPXCHG_FAILURE]]			; CHECK-NEXT: br i1 [[TMP14]], label [[PARTWORD_CMPXCHG_END:%.*]], label [[PARTWORD_CMPXCHG_FAILURE]]
	; CHECK: partword.cmpxchg.failure:			; CHECK: partword.cmpxchg.failure:
	; CHECK-NEXT: [[TMP16]] = and i32 [[TMP14]], [[INV_MASK]]			; CHECK-NEXT: [[TMP15]] = and i32 [[TMP13]], [[INV_MASK]]
	; CHECK-NEXT: [[TMP17:%.*]] = icmp ne i32 [[TMP10]], [[TMP16]]			; CHECK-NEXT: [[TMP16:%.*]] = icmp ne i32 [[TMP9]], [[TMP15]]
	; CHECK-NEXT: br i1 [[TMP17]], label [[PARTWORD_CMPXCHG_LOOP]], label [[PARTWORD_CMPXCHG_END]]			; CHECK-NEXT: br i1 [[TMP16]], label [[PARTWORD_CMPXCHG_LOOP]], label [[PARTWORD_CMPXCHG_END]]
	; CHECK: partword.cmpxchg.end:			; CHECK: partword.cmpxchg.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP14]], [[TMP3]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP13]], [[TMP2]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: [[TMP18:%.*]] = insertvalue { i16, i1 } undef, i16 [[EXTRACTED]], 0			; CHECK-NEXT: [[TMP17:%.*]] = insertvalue { i16, i1 } undef, i16 [[EXTRACTED]], 0
	; CHECK-NEXT: [[TMP19:%.*]] = insertvalue { i16, i1 } [[TMP18]], i1 [[TMP15]], 1			; CHECK-NEXT: [[TMP18:%.*]] = insertvalue { i16, i1 } [[TMP17]], i1 [[TMP14]], 1
	; CHECK-NEXT: [[EXTRACT:%.*]] = extractvalue { i16, i1 } [[TMP19]], 0			; CHECK-NEXT: [[EXTRACT:%.*]] = extractvalue { i16, i1 } [[TMP18]], 0
	; CHECK-NEXT: ret i16 [[EXTRACT]]			; CHECK-NEXT: ret i16 [[EXTRACT]]
	;			;
	%gep = getelementptr i16, i16 addrspace(3)* %out, i64 4			%gep = getelementptr i16, i16 addrspace(3)* %out, i64 4
	%res = cmpxchg i16 addrspace(3)* %gep, i16 %old, i16 %in seq_cst seq_cst			%res = cmpxchg i16 addrspace(3)* %gep, i16 %old, i16 %in seq_cst seq_cst
	%extract = extractvalue {i16, i1} %res, 0			%extract = extractvalue {i16, i1} %res, 0
	ret i16 %extract			ret i16 %extract
	}			}
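
For readers following the CHECK lines above, the part-word expansion they test can be modeled outside of IR. The sketch below is only an illustration under stated assumptions: the names `PartwordInfo`, `partword_info`, and `atomic_umin_u16` are hypothetical and do not appear in AtomicExpandPass.cpp, the target is assumed little-endian (so the shift is simply the low address bits times 8, as in `shl [[PTRLSB]], 3`), and `std::atomic<uint32_t>` stands in for the 32-bit `cmpxchg` on the aligned address.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical model of the values computed in the entry block of the
// expanded i16 atomics (ALIGNEDADDR, SHIFTAMT, MASK, INV_MASK).
struct PartwordInfo {
    uintptr_t aligned;  // address with the low two bits cleared (mask of -4)
    uint32_t shift;     // bit offset of the i16 inside its 32-bit word
    uint32_t mask;      // 65535 shifted into position
    uint32_t inv_mask;  // every bit outside the i16 lane
};

inline PartwordInfo partword_info(uintptr_t addr) {
    PartwordInfo p;
    p.aligned = addr & ~uintptr_t{3};               // what ptrmask(ptr, -4) yields
    p.shift = static_cast<uint32_t>(addr & 3) * 8;  // PTRLSB << 3 (little-endian assumption)
    p.mask = uint32_t{0xFFFF} << p.shift;
    p.inv_mask = ~p.mask;
    return p;
}

// Models the atomicrmw.start loop for `atomicrmw umin i16`: operate on the
// containing 32-bit word and retry the compare-exchange until it succeeds.
// Returns the previous value of the 16-bit lane.
inline uint16_t atomic_umin_u16(std::atomic<uint32_t>& word, uint32_t shift,
                                uint16_t value) {
    assert(shift == 0 || shift == 16);  // a naturally aligned i16 lane
    const uint32_t inv_mask = ~(uint32_t{0xFFFF} << shift);
    uint32_t loaded = word.load();  // initial load before the loop
    for (;;) {
        uint16_t extracted = static_cast<uint16_t>(loaded >> shift);
        uint16_t next = extracted <= value ? extracted : value;  // icmp ule + select
        uint32_t inserted = (loaded & inv_mask) | (uint32_t{next} << shift);
        // On failure, compare_exchange_strong stores the current word in
        // `loaded`, which plays the role of NEWLOADED feeding the phi.
        if (word.compare_exchange_strong(loaded, inserted))
            return static_cast<uint16_t>(loaded >> shift);
    }
}
```

In this model only `partword_info` needs the integer value of the address, and only for `shift`; the aligned address itself could be produced without any integer round trip, which is the part of the expansion this revision switches to `llvm.ptrmask`.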

llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i8.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -mtriple=amdgcn-amd-amdhsa -S -atomic-expand %s | FileCheck %s			; RUN: opt -mtriple=amdgcn-amd-amdhsa -S -atomic-expand %s | FileCheck %s
	; RUN: opt -mtriple=r600-mesa-mesa3d -S -atomic-expand %s | FileCheck %s			; RUN: opt -mtriple=r600-mesa-mesa3d -S -atomic-expand %s | FileCheck %s

	define i8 @test_atomicrmw_xchg_i8_global(i8 addrspace(1)* %ptr, i8 %value) {			define i8 @test_atomicrmw_xchg_i8_global(i8 addrspace(1)* %ptr, i8 %value) {
	; CHECK-LABEL: @test_atomicrmw_xchg_i8_global(			; CHECK-LABEL: @test_atomicrmw_xchg_i8_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i8 addrspace(1)* @llvm.ptrmask.p1i8.i64(i8 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i8 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i8 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i8 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[TMP5:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[TMP7:%.*]] = or i32 [[TMP6]], [[VALOPERAND_SHIFTED]]			; CHECK-NEXT: [[TMP6:%.*]] = or i32 [[TMP5]], [[VALOPERAND_SHIFTED]]
	; CHECK-NEXT: [[TMP8:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[TMP7]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP7:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[TMP6]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP8]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP7]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP8]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP7]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8
	; CHECK-NEXT: ret i8 [[EXTRACTED]]			; CHECK-NEXT: ret i8 [[EXTRACTED]]
	;			;
	%res = atomicrmw xchg i8 addrspace(1)* %ptr, i8 %value seq_cst			%res = atomicrmw xchg i8 addrspace(1)* %ptr, i8 %value seq_cst
	ret i8 %res			ret i8 %res
	}			}

	define i8 @test_atomicrmw_add_i8_global(i8 addrspace(1)* %ptr, i8 %value) {			define i8 @test_atomicrmw_add_i8_global(i8 addrspace(1)* %ptr, i8 %value) {
	; CHECK-LABEL: @test_atomicrmw_add_i8_global(			; CHECK-LABEL: @test_atomicrmw_add_i8_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i8 addrspace(1)* @llvm.ptrmask.p1i8.i64(i8 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i8 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i8 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i8 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[NEW:%.*]] = add i32 [[LOADED]], [[VALOPERAND_SHIFTED]]			; CHECK-NEXT: [[NEW:%.*]] = add i32 [[LOADED]], [[VALOPERAND_SHIFTED]]
	; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[NEW]], [[MASK]]			; CHECK-NEXT: [[TMP5:%.*]] = and i32 [[NEW]], [[MASK]]
	; CHECK-NEXT: [[TMP7:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[TMP8:%.*]] = or i32 [[TMP7]], [[TMP6]]			; CHECK-NEXT: [[TMP7:%.*]] = or i32 [[TMP6]], [[TMP5]]
	; CHECK-NEXT: [[TMP9:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[TMP8]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP8:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[TMP7]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP9]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP8]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP9]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP8]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8
	; CHECK-NEXT: ret i8 [[EXTRACTED]]			; CHECK-NEXT: ret i8 [[EXTRACTED]]
	;			;
	%res = atomicrmw add i8 addrspace(1)* %ptr, i8 %value seq_cst			%res = atomicrmw add i8 addrspace(1)* %ptr, i8 %value seq_cst
	ret i8 %res			ret i8 %res
	}			}

	define i8 @test_atomicrmw_sub_i8_global(i8 addrspace(1)* %ptr, i8 %value) {			define i8 @test_atomicrmw_sub_i8_global(i8 addrspace(1)* %ptr, i8 %value) {
	; CHECK-LABEL: @test_atomicrmw_sub_i8_global(			; CHECK-LABEL: @test_atomicrmw_sub_i8_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i8 addrspace(1)* @llvm.ptrmask.p1i8.i64(i8 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i8 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i8 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i8 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[NEW:%.*]] = sub i32 [[LOADED]], [[VALOPERAND_SHIFTED]]			; CHECK-NEXT: [[NEW:%.*]] = sub i32 [[LOADED]], [[VALOPERAND_SHIFTED]]
	; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[NEW]], [[MASK]]			; CHECK-NEXT: [[TMP5:%.*]] = and i32 [[NEW]], [[MASK]]
	; CHECK-NEXT: [[TMP7:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[TMP8:%.*]] = or i32 [[TMP7]], [[TMP6]]			; CHECK-NEXT: [[TMP7:%.*]] = or i32 [[TMP6]], [[TMP5]]
	; CHECK-NEXT: [[TMP9:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[TMP8]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP8:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[TMP7]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP9]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP8]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP9]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP8]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8
	; CHECK-NEXT: ret i8 [[EXTRACTED]]			; CHECK-NEXT: ret i8 [[EXTRACTED]]
	;			;
	%res = atomicrmw sub i8 addrspace(1)* %ptr, i8 %value seq_cst			%res = atomicrmw sub i8 addrspace(1)* %ptr, i8 %value seq_cst
	ret i8 %res			ret i8 %res
	}			}

	define i8 @test_atomicrmw_and_i8_global(i8 addrspace(1)* %ptr, i8 %value) {			define i8 @test_atomicrmw_and_i8_global(i8 addrspace(1)* %ptr, i8 %value) {
	; CHECK-LABEL: @test_atomicrmw_and_i8_global(			; CHECK-LABEL: @test_atomicrmw_and_i8_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i8 addrspace(1)* @llvm.ptrmask.p1i8.i64(i8 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i8 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i8 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i8 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[ANDOPERAND:%.*]] = or i32 [[INV_MASK]], [[VALOPERAND_SHIFTED]]			; CHECK-NEXT: [[ANDOPERAND:%.*]] = or i32 [[INV_MASK]], [[VALOPERAND_SHIFTED]]
	; CHECK-NEXT: [[TMP5:%.*]] = atomicrmw and i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[ANDOPERAND]] seq_cst, align 4			; CHECK-NEXT: [[TMP4:%.*]] = atomicrmw and i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[ANDOPERAND]] seq_cst, align 4
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP5]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP4]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8
	; CHECK-NEXT: ret i8 [[EXTRACTED]]			; CHECK-NEXT: ret i8 [[EXTRACTED]]
	;			;
	%res = atomicrmw and i8 addrspace(1)* %ptr, i8 %value seq_cst			%res = atomicrmw and i8 addrspace(1)* %ptr, i8 %value seq_cst
	ret i8 %res			ret i8 %res
	}			}

	define i8 @test_atomicrmw_nand_i8_global(i8 addrspace(1)* %ptr, i8 %value) {			define i8 @test_atomicrmw_nand_i8_global(i8 addrspace(1)* %ptr, i8 %value) {
	; CHECK-LABEL: @test_atomicrmw_nand_i8_global(			; CHECK-LABEL: @test_atomicrmw_nand_i8_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i8 addrspace(1)* @llvm.ptrmask.p1i8.i64(i8 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i8 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i8 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i8 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[LOADED]], [[VALOPERAND_SHIFTED]]			; CHECK-NEXT: [[TMP5:%.*]] = and i32 [[LOADED]], [[VALOPERAND_SHIFTED]]
	; CHECK-NEXT: [[NEW:%.*]] = xor i32 [[TMP6]], -1			; CHECK-NEXT: [[NEW:%.*]] = xor i32 [[TMP5]], -1
	; CHECK-NEXT: [[TMP7:%.*]] = and i32 [[NEW]], [[MASK]]			; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[NEW]], [[MASK]]
	; CHECK-NEXT: [[TMP8:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[TMP7:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[TMP9:%.*]] = or i32 [[TMP8]], [[TMP7]]			; CHECK-NEXT: [[TMP8:%.*]] = or i32 [[TMP7]], [[TMP6]]
	; CHECK-NEXT: [[TMP10:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[TMP9]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP9:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[TMP8]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP10]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP9]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP10]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP9]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8
	; CHECK-NEXT: ret i8 [[EXTRACTED]]			; CHECK-NEXT: ret i8 [[EXTRACTED]]
	;			;
	%res = atomicrmw nand i8 addrspace(1)* %ptr, i8 %value seq_cst			%res = atomicrmw nand i8 addrspace(1)* %ptr, i8 %value seq_cst
	ret i8 %res			ret i8 %res
	}			}

	define i8 @test_atomicrmw_or_i8_global(i8 addrspace(1)* %ptr, i8 %value) {			define i8 @test_atomicrmw_or_i8_global(i8 addrspace(1)* %ptr, i8 %value) {
	; CHECK-LABEL: @test_atomicrmw_or_i8_global(			; CHECK-LABEL: @test_atomicrmw_or_i8_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i8 addrspace(1)* @llvm.ptrmask.p1i8.i64(i8 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i8 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i8 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i8 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = atomicrmw or i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[VALOPERAND_SHIFTED]] seq_cst, align 4			; CHECK-NEXT: [[TMP4:%.*]] = atomicrmw or i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[VALOPERAND_SHIFTED]] seq_cst, align 4
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP5]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP4]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8
	; CHECK-NEXT: ret i8 [[EXTRACTED]]			; CHECK-NEXT: ret i8 [[EXTRACTED]]
	;			;
	%res = atomicrmw or i8 addrspace(1)* %ptr, i8 %value seq_cst			%res = atomicrmw or i8 addrspace(1)* %ptr, i8 %value seq_cst
	ret i8 %res			ret i8 %res
	}			}

	define i8 @test_atomicrmw_xor_i8_global(i8 addrspace(1)* %ptr, i8 %value) {			define i8 @test_atomicrmw_xor_i8_global(i8 addrspace(1)* %ptr, i8 %value) {
	; CHECK-LABEL: @test_atomicrmw_xor_i8_global(			; CHECK-LABEL: @test_atomicrmw_xor_i8_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i8 addrspace(1)* @llvm.ptrmask.p1i8.i64(i8 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i8 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i8 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i8 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = atomicrmw xor i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[VALOPERAND_SHIFTED]] seq_cst, align 4			; CHECK-NEXT: [[TMP4:%.*]] = atomicrmw xor i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[VALOPERAND_SHIFTED]] seq_cst, align 4
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP5]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP4]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8
	; CHECK-NEXT: ret i8 [[EXTRACTED]]			; CHECK-NEXT: ret i8 [[EXTRACTED]]
	;			;
	%res = atomicrmw xor i8 addrspace(1)* %ptr, i8 %value seq_cst			%res = atomicrmw xor i8 addrspace(1)* %ptr, i8 %value seq_cst
	ret i8 %res			ret i8 %res
	}			}

	define i8 @test_atomicrmw_max_i8_global(i8 addrspace(1)* %ptr, i8 %value) {			define i8 @test_atomicrmw_max_i8_global(i8 addrspace(1)* %ptr, i8 %value) {
	; CHECK-LABEL: @test_atomicrmw_max_i8_global(			; CHECK-LABEL: @test_atomicrmw_max_i8_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i8 addrspace(1)* @llvm.ptrmask.p1i8.i64(i8 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i8 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i8 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i8 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[LOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[LOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8
	; CHECK-NEXT: [[TMP6:%.*]] = icmp sgt i8 [[EXTRACTED]], [[VALUE]]			; CHECK-NEXT: [[TMP5:%.*]] = icmp sgt i8 [[EXTRACTED]], [[VALUE]]
	; CHECK-NEXT: [[NEW:%.*]] = select i1 [[TMP6]], i8 [[EXTRACTED]], i8 [[VALUE]]			; CHECK-NEXT: [[NEW:%.*]] = select i1 [[TMP5]], i8 [[EXTRACTED]], i8 [[VALUE]]
	; CHECK-NEXT: [[EXTENDED:%.*]] = zext i8 [[NEW]] to i32			; CHECK-NEXT: [[EXTENDED:%.*]] = zext i8 [[NEW]] to i32
	; CHECK-NEXT: [[SHIFTED1:%.*]] = shl nuw i32 [[EXTENDED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED2:%.*]] = shl nuw i32 [[EXTENDED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[UNMASKED:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[UNMASKED:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[INSERTED:%.*]] = or i32 [[UNMASKED]], [[SHIFTED1]]			; CHECK-NEXT: [[INSERTED:%.*]] = or i32 [[UNMASKED]], [[SHIFTED2]]
	; CHECK-NEXT: [[TMP7:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[INSERTED]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP6:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[INSERTED]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP7]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP6]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP7]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP6]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED2:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED3:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED3:%.*]] = trunc i32 [[SHIFTED2]] to i8			; CHECK-NEXT: [[EXTRACTED4:%.*]] = trunc i32 [[SHIFTED3]] to i8
	; CHECK-NEXT: ret i8 [[EXTRACTED3]]			; CHECK-NEXT: ret i8 [[EXTRACTED4]]
	;			;
	%res = atomicrmw max i8 addrspace(1)* %ptr, i8 %value seq_cst			%res = atomicrmw max i8 addrspace(1)* %ptr, i8 %value seq_cst
	ret i8 %res			ret i8 %res
	}			}

	define i8 @test_atomicrmw_min_i8_global(i8 addrspace(1)* %ptr, i8 %value) {			define i8 @test_atomicrmw_min_i8_global(i8 addrspace(1)* %ptr, i8 %value) {
	; CHECK-LABEL: @test_atomicrmw_min_i8_global(			; CHECK-LABEL: @test_atomicrmw_min_i8_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i8 addrspace(1)* @llvm.ptrmask.p1i8.i64(i8 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i8 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i8 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i8 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[LOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[LOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8
	; CHECK-NEXT: [[TMP6:%.*]] = icmp sle i8 [[EXTRACTED]], [[VALUE]]			; CHECK-NEXT: [[TMP5:%.*]] = icmp sle i8 [[EXTRACTED]], [[VALUE]]
	; CHECK-NEXT: [[NEW:%.*]] = select i1 [[TMP6]], i8 [[EXTRACTED]], i8 [[VALUE]]			; CHECK-NEXT: [[NEW:%.*]] = select i1 [[TMP5]], i8 [[EXTRACTED]], i8 [[VALUE]]
	; CHECK-NEXT: [[EXTENDED:%.*]] = zext i8 [[NEW]] to i32			; CHECK-NEXT: [[EXTENDED:%.*]] = zext i8 [[NEW]] to i32
	; CHECK-NEXT: [[SHIFTED1:%.*]] = shl nuw i32 [[EXTENDED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED2:%.*]] = shl nuw i32 [[EXTENDED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[UNMASKED:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[UNMASKED:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[INSERTED:%.*]] = or i32 [[UNMASKED]], [[SHIFTED1]]			; CHECK-NEXT: [[INSERTED:%.*]] = or i32 [[UNMASKED]], [[SHIFTED2]]
	; CHECK-NEXT: [[TMP7:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[INSERTED]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP6:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[INSERTED]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP7]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP6]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP7]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP6]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED2:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED3:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED3:%.*]] = trunc i32 [[SHIFTED2]] to i8			; CHECK-NEXT: [[EXTRACTED4:%.*]] = trunc i32 [[SHIFTED3]] to i8
	; CHECK-NEXT: ret i8 [[EXTRACTED3]]			; CHECK-NEXT: ret i8 [[EXTRACTED4]]
	;			;
	%res = atomicrmw min i8 addrspace(1)* %ptr, i8 %value seq_cst			%res = atomicrmw min i8 addrspace(1)* %ptr, i8 %value seq_cst
	ret i8 %res			ret i8 %res
	}			}

	define i8 @test_atomicrmw_umax_i8_global(i8 addrspace(1)* %ptr, i8 %value) {			define i8 @test_atomicrmw_umax_i8_global(i8 addrspace(1)* %ptr, i8 %value) {
	; CHECK-LABEL: @test_atomicrmw_umax_i8_global(			; CHECK-LABEL: @test_atomicrmw_umax_i8_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i8 addrspace(1)* @llvm.ptrmask.p1i8.i64(i8 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i8 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i8 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i8 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[LOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[LOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8
	; CHECK-NEXT: [[TMP6:%.*]] = icmp ugt i8 [[EXTRACTED]], [[VALUE]]			; CHECK-NEXT: [[TMP5:%.*]] = icmp ugt i8 [[EXTRACTED]], [[VALUE]]
	; CHECK-NEXT: [[NEW:%.*]] = select i1 [[TMP6]], i8 [[EXTRACTED]], i8 [[VALUE]]			; CHECK-NEXT: [[NEW:%.*]] = select i1 [[TMP5]], i8 [[EXTRACTED]], i8 [[VALUE]]
	; CHECK-NEXT: [[EXTENDED:%.*]] = zext i8 [[NEW]] to i32			; CHECK-NEXT: [[EXTENDED:%.*]] = zext i8 [[NEW]] to i32
	; CHECK-NEXT: [[SHIFTED1:%.*]] = shl nuw i32 [[EXTENDED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED2:%.*]] = shl nuw i32 [[EXTENDED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[UNMASKED:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[UNMASKED:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[INSERTED:%.*]] = or i32 [[UNMASKED]], [[SHIFTED1]]			; CHECK-NEXT: [[INSERTED:%.*]] = or i32 [[UNMASKED]], [[SHIFTED2]]
	; CHECK-NEXT: [[TMP7:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[INSERTED]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP6:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[INSERTED]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP7]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP6]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP7]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP6]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED2:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED3:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED3:%.*]] = trunc i32 [[SHIFTED2]] to i8			; CHECK-NEXT: [[EXTRACTED4:%.*]] = trunc i32 [[SHIFTED3]] to i8
	; CHECK-NEXT: ret i8 [[EXTRACTED3]]			; CHECK-NEXT: ret i8 [[EXTRACTED4]]
	;			;
	%res = atomicrmw umax i8 addrspace(1)* %ptr, i8 %value seq_cst			%res = atomicrmw umax i8 addrspace(1)* %ptr, i8 %value seq_cst
	ret i8 %res			ret i8 %res
	}			}

	define i8 @test_atomicrmw_umin_i8_global(i8 addrspace(1)* %ptr, i8 %value) {			define i8 @test_atomicrmw_umin_i8_global(i8 addrspace(1)* %ptr, i8 %value) {
	; CHECK-LABEL: @test_atomicrmw_umin_i8_global(			; CHECK-LABEL: @test_atomicrmw_umin_i8_global(
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i8 addrspace(1)* @llvm.ptrmask.p1i8.i64(i8 addrspace(1)* [[PTR:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i8 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[PTR]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i8 [[VALUE:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i8 [[VALUE:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[TMP0:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[LOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[LOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8
	; CHECK-NEXT: [[TMP6:%.*]] = icmp ule i8 [[EXTRACTED]], [[VALUE]]			; CHECK-NEXT: [[TMP5:%.*]] = icmp ule i8 [[EXTRACTED]], [[VALUE]]
	; CHECK-NEXT: [[NEW:%.*]] = select i1 [[TMP6]], i8 [[EXTRACTED]], i8 [[VALUE]]			; CHECK-NEXT: [[NEW:%.*]] = select i1 [[TMP5]], i8 [[EXTRACTED]], i8 [[VALUE]]
	; CHECK-NEXT: [[EXTENDED:%.*]] = zext i8 [[NEW]] to i32			; CHECK-NEXT: [[EXTENDED:%.*]] = zext i8 [[NEW]] to i32
	; CHECK-NEXT: [[SHIFTED1:%.*]] = shl nuw i32 [[EXTENDED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED2:%.*]] = shl nuw i32 [[EXTENDED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[UNMASKED:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[UNMASKED:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[INSERTED:%.*]] = or i32 [[UNMASKED]], [[SHIFTED1]]			; CHECK-NEXT: [[INSERTED:%.*]] = or i32 [[UNMASKED]], [[SHIFTED2]]
	; CHECK-NEXT: [[TMP7:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[INSERTED]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP6:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[INSERTED]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP7]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP6]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP7]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP6]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED2:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED3:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED3:%.*]] = trunc i32 [[SHIFTED2]] to i8			; CHECK-NEXT: [[EXTRACTED4:%.*]] = trunc i32 [[SHIFTED3]] to i8
	; CHECK-NEXT: ret i8 [[EXTRACTED3]]			; CHECK-NEXT: ret i8 [[EXTRACTED4]]
	;			;
	%res = atomicrmw umin i8 addrspace(1)* %ptr, i8 %value seq_cst			%res = atomicrmw umin i8 addrspace(1)* %ptr, i8 %value seq_cst
	ret i8 %res			ret i8 %res
	}			}

	define i8 @test_cmpxchg_i8_global(i8 addrspace(1)* %out, i8 %in, i8 %old) {			define i8 @test_cmpxchg_i8_global(i8 addrspace(1)* %out, i8 %in, i8 %old) {
	; CHECK-LABEL: @test_cmpxchg_i8_global(			; CHECK-LABEL: @test_cmpxchg_i8_global(
	; CHECK-NEXT: [[GEP:%.*]] = getelementptr i8, i8 addrspace(1)* [[OUT:%.*]], i64 4			; CHECK-NEXT: [[GEP:%.*]] = getelementptr i8, i8 addrspace(1)* [[OUT:%.*]], i64 4
				; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i8 addrspace(1)* @llvm.ptrmask.p1i8.i64(i8 addrspace(1)* [[GEP]], i64 -4)
				; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i8 addrspace(1)* [[ALIGNEDADDR]] to i32 addrspace(1)*
	; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[GEP]] to i64			; CHECK-NEXT: [[TMP1:%.*]] = ptrtoint i8 addrspace(1)* [[GEP]] to i64
	; CHECK-NEXT: [[TMP2:%.*]] = and i64 [[TMP1]], -4
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP2]] to i32 addrspace(1)*
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP1]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i8 [[IN:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i8 [[IN:%.*]] to i32
	; CHECK-NEXT: [[TMP5:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[TMP4:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP6:%.*]] = zext i8 [[OLD:%.*]] to i32			; CHECK-NEXT: [[TMP5:%.*]] = zext i8 [[OLD:%.*]] to i32
	; CHECK-NEXT: [[TMP7:%.*]] = shl i32 [[TMP6]], [[SHIFTAMT]]			; CHECK-NEXT: [[TMP6:%.*]] = shl i32 [[TMP5]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP8:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP7:%.*]] = load i32, i32 addrspace(1)* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: [[TMP9:%.*]] = and i32 [[TMP8]], [[INV_MASK]]			; CHECK-NEXT: [[TMP8:%.*]] = and i32 [[TMP7]], [[INV_MASK]]
	; CHECK-NEXT: br label [[PARTWORD_CMPXCHG_LOOP:%.*]]			; CHECK-NEXT: br label [[PARTWORD_CMPXCHG_LOOP:%.*]]
	; CHECK: partword.cmpxchg.loop:			; CHECK: partword.cmpxchg.loop:
	; CHECK-NEXT: [[TMP10:%.*]] = phi i32 [ [[TMP9]], [[TMP0:%.*]] ], [ [[TMP16:%.*]], [[PARTWORD_CMPXCHG_FAILURE:%.*]] ]			; CHECK-NEXT: [[TMP9:%.*]] = phi i32 [ [[TMP8]], [[TMP0:%.*]] ], [ [[TMP15:%.*]], [[PARTWORD_CMPXCHG_FAILURE:%.*]] ]
	; CHECK-NEXT: [[TMP11:%.*]] = or i32 [[TMP10]], [[TMP5]]			; CHECK-NEXT: [[TMP10:%.*]] = or i32 [[TMP9]], [[TMP4]]
	; CHECK-NEXT: [[TMP12:%.*]] = or i32 [[TMP10]], [[TMP7]]			; CHECK-NEXT: [[TMP11:%.*]] = or i32 [[TMP9]], [[TMP6]]
	; CHECK-NEXT: [[TMP13:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR]], i32 [[TMP12]], i32 [[TMP11]] seq_cst seq_cst, align 4			; CHECK-NEXT: [[TMP12:%.*]] = cmpxchg i32 addrspace(1)* [[ALIGNEDADDR1]], i32 [[TMP11]], i32 [[TMP10]] seq_cst seq_cst, align 4
	; CHECK-NEXT: [[TMP14:%.*]] = extractvalue { i32, i1 } [[TMP13]], 0			; CHECK-NEXT: [[TMP13:%.*]] = extractvalue { i32, i1 } [[TMP12]], 0
	; CHECK-NEXT: [[TMP15:%.*]] = extractvalue { i32, i1 } [[TMP13]], 1			; CHECK-NEXT: [[TMP14:%.*]] = extractvalue { i32, i1 } [[TMP12]], 1
	; CHECK-NEXT: br i1 [[TMP15]], label [[PARTWORD_CMPXCHG_END:%.*]], label [[PARTWORD_CMPXCHG_FAILURE]]			; CHECK-NEXT: br i1 [[TMP14]], label [[PARTWORD_CMPXCHG_END:%.*]], label [[PARTWORD_CMPXCHG_FAILURE]]
	; CHECK: partword.cmpxchg.failure:			; CHECK: partword.cmpxchg.failure:
	; CHECK-NEXT: [[TMP16]] = and i32 [[TMP14]], [[INV_MASK]]			; CHECK-NEXT: [[TMP15]] = and i32 [[TMP13]], [[INV_MASK]]
	; CHECK-NEXT: [[TMP17:%.*]] = icmp ne i32 [[TMP10]], [[TMP16]]			; CHECK-NEXT: [[TMP16:%.*]] = icmp ne i32 [[TMP9]], [[TMP15]]
	; CHECK-NEXT: br i1 [[TMP17]], label [[PARTWORD_CMPXCHG_LOOP]], label [[PARTWORD_CMPXCHG_END]]			; CHECK-NEXT: br i1 [[TMP16]], label [[PARTWORD_CMPXCHG_LOOP]], label [[PARTWORD_CMPXCHG_END]]
	; CHECK: partword.cmpxchg.end:			; CHECK: partword.cmpxchg.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP14]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP13]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8
	; CHECK-NEXT: [[TMP18:%.*]] = insertvalue { i8, i1 } undef, i8 [[EXTRACTED]], 0			; CHECK-NEXT: [[TMP17:%.*]] = insertvalue { i8, i1 } undef, i8 [[EXTRACTED]], 0
	; CHECK-NEXT: [[TMP19:%.*]] = insertvalue { i8, i1 } [[TMP18]], i1 [[TMP15]], 1			; CHECK-NEXT: [[TMP18:%.*]] = insertvalue { i8, i1 } [[TMP17]], i1 [[TMP14]], 1
	; CHECK-NEXT: [[EXTRACT:%.*]] = extractvalue { i8, i1 } [[TMP19]], 0			; CHECK-NEXT: [[EXTRACT:%.*]] = extractvalue { i8, i1 } [[TMP18]], 0
	; CHECK-NEXT: ret i8 [[EXTRACT]]			; CHECK-NEXT: ret i8 [[EXTRACT]]
	;			;
	%gep = getelementptr i8, i8 addrspace(1)* %out, i64 4			%gep = getelementptr i8, i8 addrspace(1)* %out, i64 4
	%res = cmpxchg i8 addrspace(1)* %gep, i8 %old, i8 %in seq_cst seq_cst			%res = cmpxchg i8 addrspace(1)* %gep, i8 %old, i8 %in seq_cst seq_cst
	%extract = extractvalue {i8, i1} %res, 0			%extract = extractvalue {i8, i1} %res, 0
	ret i8 %extract			ret i8 %extract
	}			}
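
Reviewer's note (illustrative, not part of the patch): the address arithmetic that these i8 checks pin down can be sketched in a few lines of Python. The function name `partword_layout_le` and the fixed 4-byte word are assumptions made for the illustration; the constants (-4, 3, shift by 3, 255) are taken from the CHECK lines above. The sketch models a little-endian layout, which is what the absence of an `xor` on PTRLSB in these checks suggests.

```python
WORD_BYTES = 4  # assumed minimum cmpxchg width (the i32 seen in the checks)

def partword_layout_le(addr, value_bytes=1):
    """Model the little-endian part-word setup for an address.

    Returns (aligned_addr, shift_amt, mask, inv_mask), mirroring the
    ALIGNEDADDR, SHIFTAMT, MASK and INV_MASK values in the checks.
    """
    aligned_addr = addr & -WORD_BYTES        # what ptrmask(ptr, -4) yields
    ptr_lsb = addr & (WORD_BYTES - 1)        # and i64 ..., 3
    shift_amt = ptr_lsb * 8                  # shl i64 ..., 3
    value_mask = (1 << (8 * value_bytes)) - 1
    mask = (value_mask << shift_amt) & 0xFFFFFFFF
    inv_mask = ~mask & 0xFFFFFFFF            # xor i32 MASK, -1
    return aligned_addr, shift_amt, mask, inv_mask

# A byte at offset 3 inside its word lands in the top 8 bits.
print([hex(v) for v in partword_layout_le(0x1007)])
```

Only the aligned address needs to remain a pointer; the shift amount is derived from the low bits alone, which is why one `ptrtoint` still survives in the new output.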

llvm/test/Transforms/AtomicExpand/SPARC/partword.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -S %s -atomic-expand | FileCheck %s			; RUN: opt -S %s -atomic-expand | FileCheck %s

	;; Verify the cmpxchg and atomicrmw expansions where sub-word-size			;; Verify the cmpxchg and atomicrmw expansions where sub-word-size
	;; instructions are not available.			;; instructions are not available.

	;;; NOTE: this test is mostly target-independent -- any target which			;;; NOTE: this test is mostly target-independent -- any target which
	;;; doesn't support cmpxchg of sub-word sizes would do.			;;; doesn't support cmpxchg of sub-word sizes would do.
	target datalayout = "E-m:e-i64:64-n32:64-S128"			target datalayout = "E-m:e-i64:64-n32:64-S128"
	target triple = "sparcv9-unknown-unknown"			target triple = "sparcv9-unknown-unknown"

	define i8 @test_cmpxchg_i8(i8* %arg, i8 %old, i8 %new) {			define i8 @test_cmpxchg_i8(i8* %arg, i8 %old, i8 %new) {
	; CHECK-LABEL: @test_cmpxchg_i8(			; CHECK-LABEL: @test_cmpxchg_i8(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: fence seq_cst			; CHECK-NEXT: fence seq_cst
	; CHECK-NEXT: [[TMP0:%.*]] = ptrtoint i8* [[ARG:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i8* @llvm.ptrmask.p0i8.i64(i8* [[ARG:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i8* [[ALIGNEDADDR]] to i32*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP1]] to i32*			; CHECK-NEXT: [[TMP0:%.*]] = ptrtoint i8* [[ARG]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP0]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP0]], 3
	; CHECK-NEXT: [[TMP2:%.*]] = xor i64 [[PTRLSB]], 3			; CHECK-NEXT: [[TMP1:%.*]] = xor i64 [[PTRLSB]], 3
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[TMP2]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[TMP1]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 255, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i8 [[NEW:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i8 [[NEW:%.*]] to i32
	; CHECK-NEXT: [[TMP5:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[TMP4:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP6:%.*]] = zext i8 [[OLD:%.*]] to i32			; CHECK-NEXT: [[TMP5:%.*]] = zext i8 [[OLD:%.*]] to i32
	; CHECK-NEXT: [[TMP7:%.*]] = shl i32 [[TMP6]], [[SHIFTAMT]]			; CHECK-NEXT: [[TMP6:%.*]] = shl i32 [[TMP5]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP8:%.*]] = load i32, i32* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP7:%.*]] = load i32, i32* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: [[TMP9:%.*]] = and i32 [[TMP8]], [[INV_MASK]]			; CHECK-NEXT: [[TMP8:%.*]] = and i32 [[TMP7]], [[INV_MASK]]
	; CHECK-NEXT: br label [[PARTWORD_CMPXCHG_LOOP:%.*]]			; CHECK-NEXT: br label [[PARTWORD_CMPXCHG_LOOP:%.*]]
	; CHECK: partword.cmpxchg.loop:			; CHECK: partword.cmpxchg.loop:
	; CHECK-NEXT: [[TMP10:%.*]] = phi i32 [ [[TMP9]], [[ENTRY:%.*]] ], [ [[TMP16:%.*]], [[PARTWORD_CMPXCHG_FAILURE:%.*]] ]			; CHECK-NEXT: [[TMP9:%.*]] = phi i32 [ [[TMP8]], [[ENTRY:%.*]] ], [ [[TMP15:%.*]], [[PARTWORD_CMPXCHG_FAILURE:%.*]] ]
	; CHECK-NEXT: [[TMP11:%.*]] = or i32 [[TMP10]], [[TMP5]]			; CHECK-NEXT: [[TMP10:%.*]] = or i32 [[TMP9]], [[TMP4]]
	; CHECK-NEXT: [[TMP12:%.*]] = or i32 [[TMP10]], [[TMP7]]			; CHECK-NEXT: [[TMP11:%.*]] = or i32 [[TMP9]], [[TMP6]]
	; CHECK-NEXT: [[TMP13:%.*]] = cmpxchg i32* [[ALIGNEDADDR]], i32 [[TMP12]], i32 [[TMP11]] monotonic monotonic, align 4			; CHECK-NEXT: [[TMP12:%.*]] = cmpxchg i32* [[ALIGNEDADDR1]], i32 [[TMP11]], i32 [[TMP10]] monotonic monotonic, align 4
	; CHECK-NEXT: [[TMP14:%.*]] = extractvalue { i32, i1 } [[TMP13]], 0			; CHECK-NEXT: [[TMP13:%.*]] = extractvalue { i32, i1 } [[TMP12]], 0
	; CHECK-NEXT: [[TMP15:%.*]] = extractvalue { i32, i1 } [[TMP13]], 1			; CHECK-NEXT: [[TMP14:%.*]] = extractvalue { i32, i1 } [[TMP12]], 1
	; CHECK-NEXT: br i1 [[TMP15]], label [[PARTWORD_CMPXCHG_END:%.*]], label [[PARTWORD_CMPXCHG_FAILURE]]			; CHECK-NEXT: br i1 [[TMP14]], label [[PARTWORD_CMPXCHG_END:%.*]], label [[PARTWORD_CMPXCHG_FAILURE]]
	; CHECK: partword.cmpxchg.failure:			; CHECK: partword.cmpxchg.failure:
	; CHECK-NEXT: [[TMP16]] = and i32 [[TMP14]], [[INV_MASK]]			; CHECK-NEXT: [[TMP15]] = and i32 [[TMP13]], [[INV_MASK]]
	; CHECK-NEXT: [[TMP17:%.*]] = icmp ne i32 [[TMP10]], [[TMP16]]			; CHECK-NEXT: [[TMP16:%.*]] = icmp ne i32 [[TMP9]], [[TMP15]]
	; CHECK-NEXT: br i1 [[TMP17]], label [[PARTWORD_CMPXCHG_LOOP]], label [[PARTWORD_CMPXCHG_END]]			; CHECK-NEXT: br i1 [[TMP16]], label [[PARTWORD_CMPXCHG_LOOP]], label [[PARTWORD_CMPXCHG_END]]
	; CHECK: partword.cmpxchg.end:			; CHECK: partword.cmpxchg.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP14]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP13]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i8
	; CHECK-NEXT: [[TMP18:%.*]] = insertvalue { i8, i1 } undef, i8 [[EXTRACTED]], 0			; CHECK-NEXT: [[TMP17:%.*]] = insertvalue { i8, i1 } undef, i8 [[EXTRACTED]], 0
	; CHECK-NEXT: [[TMP19:%.*]] = insertvalue { i8, i1 } [[TMP18]], i1 [[TMP15]], 1			; CHECK-NEXT: [[TMP18:%.*]] = insertvalue { i8, i1 } [[TMP17]], i1 [[TMP14]], 1
	; CHECK-NEXT: fence seq_cst			; CHECK-NEXT: fence seq_cst
	; CHECK-NEXT: [[RET:%.*]] = extractvalue { i8, i1 } [[TMP19]], 0			; CHECK-NEXT: [[RET:%.*]] = extractvalue { i8, i1 } [[TMP18]], 0
	; CHECK-NEXT: ret i8 [[RET]]			; CHECK-NEXT: ret i8 [[RET]]
	;			;
	entry:			entry:
	%ret_succ = cmpxchg i8* %arg, i8 %old, i8 %new seq_cst monotonic			%ret_succ = cmpxchg i8* %arg, i8 %old, i8 %new seq_cst monotonic
	%ret = extractvalue { i8, i1 } %ret_succ, 0			%ret = extractvalue { i8, i1 } %ret_succ, 0
	ret i8 %ret			ret i8 %ret
	}			}

	define i16 @test_cmpxchg_i16(i16* %arg, i16 %old, i16 %new) {			define i16 @test_cmpxchg_i16(i16* %arg, i16 %old, i16 %new) {
	; CHECK-LABEL: @test_cmpxchg_i16(			; CHECK-LABEL: @test_cmpxchg_i16(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: fence seq_cst			; CHECK-NEXT: fence seq_cst
	; CHECK-NEXT: [[TMP0:%.*]] = ptrtoint i16* [[ARG:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16* @llvm.ptrmask.p0i16.i64(i16* [[ARG:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16* [[ALIGNEDADDR]] to i32*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP1]] to i32*			; CHECK-NEXT: [[TMP0:%.*]] = ptrtoint i16* [[ARG]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP0]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP0]], 3
	; CHECK-NEXT: [[TMP2:%.*]] = xor i64 [[PTRLSB]], 2			; CHECK-NEXT: [[TMP1:%.*]] = xor i64 [[PTRLSB]], 2
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[TMP2]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[TMP1]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[NEW:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[NEW:%.*]] to i32
	; CHECK-NEXT: [[TMP5:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[TMP4:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP6:%.*]] = zext i16 [[OLD:%.*]] to i32			; CHECK-NEXT: [[TMP5:%.*]] = zext i16 [[OLD:%.*]] to i32
	; CHECK-NEXT: [[TMP7:%.*]] = shl i32 [[TMP6]], [[SHIFTAMT]]			; CHECK-NEXT: [[TMP6:%.*]] = shl i32 [[TMP5]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP8:%.*]] = load i32, i32* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP7:%.*]] = load i32, i32* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: [[TMP9:%.*]] = and i32 [[TMP8]], [[INV_MASK]]			; CHECK-NEXT: [[TMP8:%.*]] = and i32 [[TMP7]], [[INV_MASK]]
	; CHECK-NEXT: br label [[PARTWORD_CMPXCHG_LOOP:%.*]]			; CHECK-NEXT: br label [[PARTWORD_CMPXCHG_LOOP:%.*]]
	; CHECK: partword.cmpxchg.loop:			; CHECK: partword.cmpxchg.loop:
	; CHECK-NEXT: [[TMP10:%.*]] = phi i32 [ [[TMP9]], [[ENTRY:%.*]] ], [ [[TMP16:%.*]], [[PARTWORD_CMPXCHG_FAILURE:%.*]] ]			; CHECK-NEXT: [[TMP9:%.*]] = phi i32 [ [[TMP8]], [[ENTRY:%.*]] ], [ [[TMP15:%.*]], [[PARTWORD_CMPXCHG_FAILURE:%.*]] ]
	; CHECK-NEXT: [[TMP11:%.*]] = or i32 [[TMP10]], [[TMP5]]			; CHECK-NEXT: [[TMP10:%.*]] = or i32 [[TMP9]], [[TMP4]]
	; CHECK-NEXT: [[TMP12:%.*]] = or i32 [[TMP10]], [[TMP7]]			; CHECK-NEXT: [[TMP11:%.*]] = or i32 [[TMP9]], [[TMP6]]
	; CHECK-NEXT: [[TMP13:%.*]] = cmpxchg i32* [[ALIGNEDADDR]], i32 [[TMP12]], i32 [[TMP11]] monotonic monotonic, align 4			; CHECK-NEXT: [[TMP12:%.*]] = cmpxchg i32* [[ALIGNEDADDR1]], i32 [[TMP11]], i32 [[TMP10]] monotonic monotonic, align 4
	; CHECK-NEXT: [[TMP14:%.*]] = extractvalue { i32, i1 } [[TMP13]], 0			; CHECK-NEXT: [[TMP13:%.*]] = extractvalue { i32, i1 } [[TMP12]], 0
	; CHECK-NEXT: [[TMP15:%.*]] = extractvalue { i32, i1 } [[TMP13]], 1			; CHECK-NEXT: [[TMP14:%.*]] = extractvalue { i32, i1 } [[TMP12]], 1
	; CHECK-NEXT: br i1 [[TMP15]], label [[PARTWORD_CMPXCHG_END:%.*]], label [[PARTWORD_CMPXCHG_FAILURE]]			; CHECK-NEXT: br i1 [[TMP14]], label [[PARTWORD_CMPXCHG_END:%.*]], label [[PARTWORD_CMPXCHG_FAILURE]]
	; CHECK: partword.cmpxchg.failure:			; CHECK: partword.cmpxchg.failure:
	; CHECK-NEXT: [[TMP16]] = and i32 [[TMP14]], [[INV_MASK]]			; CHECK-NEXT: [[TMP15]] = and i32 [[TMP13]], [[INV_MASK]]
	; CHECK-NEXT: [[TMP17:%.*]] = icmp ne i32 [[TMP10]], [[TMP16]]			; CHECK-NEXT: [[TMP16:%.*]] = icmp ne i32 [[TMP9]], [[TMP15]]
	; CHECK-NEXT: br i1 [[TMP17]], label [[PARTWORD_CMPXCHG_LOOP]], label [[PARTWORD_CMPXCHG_END]]			; CHECK-NEXT: br i1 [[TMP16]], label [[PARTWORD_CMPXCHG_LOOP]], label [[PARTWORD_CMPXCHG_END]]
	; CHECK: partword.cmpxchg.end:			; CHECK: partword.cmpxchg.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP14]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[TMP13]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: [[TMP18:%.*]] = insertvalue { i16, i1 } undef, i16 [[EXTRACTED]], 0			; CHECK-NEXT: [[TMP17:%.*]] = insertvalue { i16, i1 } undef, i16 [[EXTRACTED]], 0
	; CHECK-NEXT: [[TMP19:%.*]] = insertvalue { i16, i1 } [[TMP18]], i1 [[TMP15]], 1			; CHECK-NEXT: [[TMP18:%.*]] = insertvalue { i16, i1 } [[TMP17]], i1 [[TMP14]], 1
	; CHECK-NEXT: fence seq_cst			; CHECK-NEXT: fence seq_cst
	; CHECK-NEXT: [[RET:%.*]] = extractvalue { i16, i1 } [[TMP19]], 0			; CHECK-NEXT: [[RET:%.*]] = extractvalue { i16, i1 } [[TMP18]], 0
	; CHECK-NEXT: ret i16 [[RET]]			; CHECK-NEXT: ret i16 [[RET]]
	;			;
	entry:			entry:
	%ret_succ = cmpxchg i16* %arg, i16 %old, i16 %new seq_cst monotonic			%ret_succ = cmpxchg i16* %arg, i16 %old, i16 %new seq_cst monotonic
	%ret = extractvalue { i16, i1 } %ret_succ, 0			%ret = extractvalue { i16, i1 } %ret_succ, 0
	ret i16 %ret			ret i16 %ret
	}			}
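
Reviewer's note (illustrative, not part of the patch): the SPARC checks differ from the AMDGPU ones by an extra `xor` on PTRLSB (3 for i8, 2 for i16), which is consistent with a big-endian layout where the byte at the lowest address occupies the most significant bits of the word. A minimal Python sketch of that shift computation follows; the helper name `shift_amount_be` is hypothetical, and the formula `word_bytes - value_bytes` is inferred from the two constants visible in the checks rather than quoted from the pass.

```python
def shift_amount_be(addr, value_bytes, word_bytes=4):
    """Bit offset of a value inside its containing big-endian word.

    Mirrors: PTRLSB = addr & 3; xor with (word - value); shl by 3.
    """
    ptr_lsb = addr & (word_bytes - 1)
    return (ptr_lsb ^ (word_bytes - value_bytes)) * 8

# i8 at offset 0 sits in the top byte; i16 at offset 2 sits in the low half.
print(shift_amount_be(0x1000, 1), shift_amount_be(0x1002, 2))
```

The `xor` works as a subtraction here only because naturally aligned offsets never set bits outside the constant: an i16 offset is 0 or 2, so `offset ^ 2` equals `2 - offset`.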

	define i16 @test_add_i16(i16* %arg, i16 %val) {			define i16 @test_add_i16(i16* %arg, i16 %val) {
	; CHECK-LABEL: @test_add_i16(			; CHECK-LABEL: @test_add_i16(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: fence seq_cst			; CHECK-NEXT: fence seq_cst
	; CHECK-NEXT: [[TMP0:%.*]] = ptrtoint i16* [[ARG:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16* @llvm.ptrmask.p0i16.i64(i16* [[ARG:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16* [[ALIGNEDADDR]] to i32*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP1]] to i32*			; CHECK-NEXT: [[TMP0:%.*]] = ptrtoint i16* [[ARG]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP0]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP0]], 3
	; CHECK-NEXT: [[TMP2:%.*]] = xor i64 [[PTRLSB]], 2			; CHECK-NEXT: [[TMP1:%.*]] = xor i64 [[PTRLSB]], 2
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[TMP2]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[TMP1]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[VAL:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[VAL:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[ENTRY:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[ENTRY:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[NEW:%.*]] = add i32 [[LOADED]], [[VALOPERAND_SHIFTED]]			; CHECK-NEXT: [[NEW:%.*]] = add i32 [[LOADED]], [[VALOPERAND_SHIFTED]]
	; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[NEW]], [[MASK]]			; CHECK-NEXT: [[TMP5:%.*]] = and i32 [[NEW]], [[MASK]]
	; CHECK-NEXT: [[TMP7:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[TMP8:%.*]] = or i32 [[TMP7]], [[TMP6]]			; CHECK-NEXT: [[TMP7:%.*]] = or i32 [[TMP6]], [[TMP5]]
	; CHECK-NEXT: [[TMP9:%.*]] = cmpxchg i32* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[TMP8]] monotonic monotonic, align 4			; CHECK-NEXT: [[TMP8:%.*]] = cmpxchg i32* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[TMP7]] monotonic monotonic, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP9]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP8]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP9]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP8]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: fence seq_cst			; CHECK-NEXT: fence seq_cst
	; CHECK-NEXT: ret i16 [[EXTRACTED]]			; CHECK-NEXT: ret i16 [[EXTRACTED]]
	;			;
	entry:			entry:
	%ret = atomicrmw add i16* %arg, i16 %val seq_cst			%ret = atomicrmw add i16* %arg, i16 %val seq_cst
	ret i16 %ret			ret i16 %ret
	}			}
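
Reviewer's note (illustrative, not part of the patch): the body of the `atomicrmw.start` loop for `add` computes the new word in three steps: add the shifted operand to the whole word, keep only the bits under MASK, and merge them with the untouched bits under INV_MASK. The Python sketch below models one iteration of that computation on plain integers; `partword_add_step` is a hypothetical name, and the cmpxchg retry itself is not modelled.

```python
def partword_add_step(loaded, val, shift_amt, value_bits=16):
    """Compute the word a part-word 'add' would try to store.

    loaded:    current 32-bit word (LOADED)
    val:       the narrow operand, before shifting
    shift_amt: bit offset of the narrow value inside the word (SHIFTAMT)
    """
    mask = (((1 << value_bits) - 1) << shift_amt) & 0xFFFFFFFF
    inv_mask = ~mask & 0xFFFFFFFF
    val_shifted = (val << shift_amt) & 0xFFFFFFFF   # VALOPERAND_SHIFTED
    new = (loaded + val_shifted) & 0xFFFFFFFF       # add i32 LOADED, ...
    # Masking NEW discards any carry into the neighbouring half-word.
    return (loaded & inv_mask) | (new & mask)

# 0xFFFF + 1 wraps to 0 in the low half; the high half stays 0xAAAA.
print(hex(partword_add_step(0xAAAAFFFF, 1, 0)))
```

This also shows why the `xor` and `or` tests need no masking step: with a zero-extended operand, those operations cannot change bits outside the shifted field.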

	define i16 @test_xor_i16(i16* %arg, i16 %val) {			define i16 @test_xor_i16(i16* %arg, i16 %val) {
	; CHECK-LABEL: @test_xor_i16(			; CHECK-LABEL: @test_xor_i16(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: fence seq_cst			; CHECK-NEXT: fence seq_cst
	; CHECK-NEXT: [[TMP0:%.*]] = ptrtoint i16* [[ARG:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16* @llvm.ptrmask.p0i16.i64(i16* [[ARG:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16* [[ALIGNEDADDR]] to i32*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP1]] to i32*			; CHECK-NEXT: [[TMP0:%.*]] = ptrtoint i16* [[ARG]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP0]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP0]], 3
	; CHECK-NEXT: [[TMP2:%.*]] = xor i64 [[PTRLSB]], 2			; CHECK-NEXT: [[TMP1:%.*]] = xor i64 [[PTRLSB]], 2
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[TMP2]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[TMP1]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[VAL:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[VAL:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[ENTRY:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[ENTRY:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[NEW:%.*]] = xor i32 [[LOADED]], [[VALOPERAND_SHIFTED]]			; CHECK-NEXT: [[NEW:%.*]] = xor i32 [[LOADED]], [[VALOPERAND_SHIFTED]]
	; CHECK-NEXT: [[TMP6:%.*]] = cmpxchg i32* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[NEW]] monotonic monotonic, align 4			; CHECK-NEXT: [[TMP5:%.*]] = cmpxchg i32* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[NEW]] monotonic monotonic, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP6]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP5]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP6]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP5]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: fence seq_cst			; CHECK-NEXT: fence seq_cst
	; CHECK-NEXT: ret i16 [[EXTRACTED]]			; CHECK-NEXT: ret i16 [[EXTRACTED]]
	;			;
	entry:			entry:
	%ret = atomicrmw xor i16* %arg, i16 %val seq_cst			%ret = atomicrmw xor i16* %arg, i16 %val seq_cst
	ret i16 %ret			ret i16 %ret
	}			}

	define i16 @test_or_i16(i16* %arg, i16 %val) {			define i16 @test_or_i16(i16* %arg, i16 %val) {
	; CHECK-LABEL: @test_or_i16(			; CHECK-LABEL: @test_or_i16(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: fence seq_cst			; CHECK-NEXT: fence seq_cst
	; CHECK-NEXT: [[TMP0:%.*]] = ptrtoint i16* [[ARG:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16* @llvm.ptrmask.p0i16.i64(i16* [[ARG:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16* [[ALIGNEDADDR]] to i32*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP1]] to i32*			; CHECK-NEXT: [[TMP0:%.*]] = ptrtoint i16* [[ARG]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP0]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP0]], 3
	; CHECK-NEXT: [[TMP2:%.*]] = xor i64 [[PTRLSB]], 2			; CHECK-NEXT: [[TMP1:%.*]] = xor i64 [[PTRLSB]], 2
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[TMP2]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[TMP1]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[VAL:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[VAL:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[ENTRY:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[ENTRY:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[NEW:%.*]] = or i32 [[LOADED]], [[VALOPERAND_SHIFTED]]			; CHECK-NEXT: [[NEW:%.*]] = or i32 [[LOADED]], [[VALOPERAND_SHIFTED]]
	; CHECK-NEXT: [[TMP6:%.*]] = cmpxchg i32* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[NEW]] monotonic monotonic, align 4			; CHECK-NEXT: [[TMP5:%.*]] = cmpxchg i32* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[NEW]] monotonic monotonic, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP6]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP5]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP6]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP5]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: fence seq_cst			; CHECK-NEXT: fence seq_cst
	; CHECK-NEXT: ret i16 [[EXTRACTED]]			; CHECK-NEXT: ret i16 [[EXTRACTED]]
	;			;
	entry:			entry:
	%ret = atomicrmw or i16* %arg, i16 %val seq_cst			%ret = atomicrmw or i16* %arg, i16 %val seq_cst
	ret i16 %ret			ret i16 %ret
	}			}

	define i16 @test_and_i16(i16* %arg, i16 %val) {			define i16 @test_and_i16(i16* %arg, i16 %val) {
	; CHECK-LABEL: @test_and_i16(			; CHECK-LABEL: @test_and_i16(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: fence seq_cst			; CHECK-NEXT: fence seq_cst
	; CHECK-NEXT: [[TMP0:%.*]] = ptrtoint i16* [[ARG:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16* @llvm.ptrmask.p0i16.i64(i16* [[ARG:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16* [[ALIGNEDADDR]] to i32*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP1]] to i32*			; CHECK-NEXT: [[TMP0:%.*]] = ptrtoint i16* [[ARG]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP0]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP0]], 3
	; CHECK-NEXT: [[TMP2:%.*]] = xor i64 [[PTRLSB]], 2			; CHECK-NEXT: [[TMP1:%.*]] = xor i64 [[PTRLSB]], 2
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[TMP2]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[TMP1]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[VAL:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[VAL:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[ANDOPERAND:%.*]] = or i32 [[INV_MASK]], [[VALOPERAND_SHIFTED]]			; CHECK-NEXT: [[ANDOPERAND:%.*]] = or i32 [[INV_MASK]], [[VALOPERAND_SHIFTED]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[ENTRY:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[ENTRY:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[NEW:%.*]] = and i32 [[LOADED]], [[ANDOPERAND]]			; CHECK-NEXT: [[NEW:%.*]] = and i32 [[LOADED]], [[ANDOPERAND]]
	; CHECK-NEXT: [[TMP6:%.*]] = cmpxchg i32* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[NEW]] monotonic monotonic, align 4			; CHECK-NEXT: [[TMP5:%.*]] = cmpxchg i32* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[NEW]] monotonic monotonic, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP6]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP5]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP6]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP5]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: fence seq_cst			; CHECK-NEXT: fence seq_cst
	; CHECK-NEXT: ret i16 [[EXTRACTED]]			; CHECK-NEXT: ret i16 [[EXTRACTED]]
	;			;
	entry:			entry:
	%ret = atomicrmw and i16* %arg, i16 %val seq_cst			%ret = atomicrmw and i16* %arg, i16 %val seq_cst
	ret i16 %ret			ret i16 %ret
	}			}

	define i16 @test_min_i16(i16* %arg, i16 %val) {			define i16 @test_min_i16(i16* %arg, i16 %val) {
	; CHECK-LABEL: @test_min_i16(			; CHECK-LABEL: @test_min_i16(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: fence seq_cst			; CHECK-NEXT: fence seq_cst
	; CHECK-NEXT: [[TMP0:%.*]] = ptrtoint i16* [[ARG:%.*]] to i64			; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = call i16* @llvm.ptrmask.p0i16.i64(i16* [[ARG:%.*]], i64 -4)
	; CHECK-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], -4			; CHECK-NEXT: [[ALIGNEDADDR1:%.*]] = bitcast i16* [[ALIGNEDADDR]] to i32*
	; CHECK-NEXT: [[ALIGNEDADDR:%.*]] = inttoptr i64 [[TMP1]] to i32*			; CHECK-NEXT: [[TMP0:%.*]] = ptrtoint i16* [[ARG]] to i64
	; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP0]], 3			; CHECK-NEXT: [[PTRLSB:%.*]] = and i64 [[TMP0]], 3
	; CHECK-NEXT: [[TMP2:%.*]] = xor i64 [[PTRLSB]], 2			; CHECK-NEXT: [[TMP1:%.*]] = xor i64 [[PTRLSB]], 2
	; CHECK-NEXT: [[TMP3:%.*]] = shl i64 [[TMP2]], 3			; CHECK-NEXT: [[TMP2:%.*]] = shl i64 [[TMP1]], 3
	; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP3]] to i32			; CHECK-NEXT: [[SHIFTAMT:%.*]] = trunc i64 [[TMP2]] to i32
	; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]			; CHECK-NEXT: [[MASK:%.*]] = shl i32 65535, [[SHIFTAMT]]
	; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1			; CHECK-NEXT: [[INV_MASK:%.*]] = xor i32 [[MASK]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i16 [[VAL:%.*]] to i32			; CHECK-NEXT: [[TMP3:%.*]] = zext i16 [[VAL:%.*]] to i32
	; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP4]], [[SHIFTAMT]]			; CHECK-NEXT: [[VALOPERAND_SHIFTED:%.*]] = shl i32 [[TMP3]], [[SHIFTAMT]]
	; CHECK-NEXT: [[TMP5:%.*]] = load i32, i32* [[ALIGNEDADDR]], align 4			; CHECK-NEXT: [[TMP4:%.*]] = load i32, i32* [[ALIGNEDADDR1]], align 4
	; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]			; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]]
	; CHECK: atomicrmw.start:			; CHECK: atomicrmw.start:
	; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP5]], [[ENTRY:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]			; CHECK-NEXT: [[LOADED:%.*]] = phi i32 [ [[TMP4]], [[ENTRY:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ]
	; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[LOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED:%.*]] = lshr i32 [[LOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16			; CHECK-NEXT: [[EXTRACTED:%.*]] = trunc i32 [[SHIFTED]] to i16
	; CHECK-NEXT: [[TMP6:%.*]] = icmp sle i16 [[EXTRACTED]], [[VAL]]			; CHECK-NEXT: [[TMP5:%.*]] = icmp sle i16 [[EXTRACTED]], [[VAL]]
	; CHECK-NEXT: [[NEW:%.*]] = select i1 [[TMP6]], i16 [[EXTRACTED]], i16 [[VAL]]			; CHECK-NEXT: [[NEW:%.*]] = select i1 [[TMP5]], i16 [[EXTRACTED]], i16 [[VAL]]
	; CHECK-NEXT: [[EXTENDED:%.*]] = zext i16 [[NEW]] to i32			; CHECK-NEXT: [[EXTENDED:%.*]] = zext i16 [[NEW]] to i32
	; CHECK-NEXT: [[SHIFTED1:%.*]] = shl nuw i32 [[EXTENDED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED2:%.*]] = shl nuw i32 [[EXTENDED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[UNMASKED:%.*]] = and i32 [[LOADED]], [[INV_MASK]]			; CHECK-NEXT: [[UNMASKED:%.*]] = and i32 [[LOADED]], [[INV_MASK]]
	; CHECK-NEXT: [[INSERTED:%.*]] = or i32 [[UNMASKED]], [[SHIFTED1]]			; CHECK-NEXT: [[INSERTED:%.*]] = or i32 [[UNMASKED]], [[SHIFTED2]]
	; CHECK-NEXT: [[TMP7:%.*]] = cmpxchg i32* [[ALIGNEDADDR]], i32 [[LOADED]], i32 [[INSERTED]] monotonic monotonic, align 4			; CHECK-NEXT: [[TMP6:%.*]] = cmpxchg i32* [[ALIGNEDADDR1]], i32 [[LOADED]], i32 [[INSERTED]] monotonic monotonic, align 4
	; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP7]], 1			; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP6]], 1
	; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP7]], 0			; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i32, i1 } [[TMP6]], 0
	; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]			; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
	; CHECK: atomicrmw.end:			; CHECK: atomicrmw.end:
	; CHECK-NEXT: [[SHIFTED2:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]			; CHECK-NEXT: [[SHIFTED3:%.*]] = lshr i32 [[NEWLOADED]], [[SHIFTAMT]]
	; CHECK-NEXT: [[EXTRACTED3:%.*]] = trunc i32 [[SHIFTED2]] to i16			; CHECK-NEXT: [[EXTRACTED4:%.*]] = trunc i32 [[SHIFTED3]] to i16
	; CHECK-NEXT: fence seq_cst			; CHECK-NEXT: fence seq_cst
	; CHECK-NEXT: ret i16 [[EXTRACTED3]]			; CHECK-NEXT: ret i16 [[EXTRACTED4]]
	;			;
	entry:			entry:
	%ret = atomicrmw min i16* %arg, i16 %val seq_cst			%ret = atomicrmw min i16* %arg, i16 %val seq_cst
	ret i16 %ret			ret i16 %ret
	}			}