This is an archive of the discontinued LLVM Phabricator instance.

Differential D29542

[TargetInfo] Adjust x86-32 atomic support to the CPU used
Needs Review · Public

Authored by mgorny on Feb 4 2017, 10:24 AM.

Details

Reviewers

dim
jyknight
chandlerc
eli.friedman
jlebar
Bigcheese
hfinkel
rsmith

Summary

Set the maximum width of atomic operations on x86-32 based on the target
CPU. The 64-bit inline atomics require cmpxchg8b which is an i586
instruction. Other inline atomics require cmpxchg which is an i486
instruction.

This fixes the incorrect value of __GCC_ATOMIC_LLONG_LOCK_FREE
and __atomic_always_lock_free() on FreeBSD, where clang defaults to the
i486 CPU (PR#31864).

For CUDA device builds, assume i586+. This matches the default CPUs for
all x86-32 targets on systems supporting CUDA.
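
As a rough illustration of the rule in the summary, the sketch below maps a CPU level to the two width limits and derives a lock-free macro value from them. The names `CpuLevel`, `AtomicLimits`, `limitsFor` and `lockFreeValue` are hypothetical and are not clang's API. The macro rule is deliberately simplified: it compares only the type width against the inline limit and ignores alignment.

```cpp
#include <cassert>

// Hypothetical CPU ordering for illustration; clang's real CPU enum is much larger.
enum class CpuLevel { I386, I486, I586, I686 };

struct AtomicLimits {
  unsigned maxPromoteWidth; // widest atomic the target is assumed to ever support, in bits
  unsigned maxInlineWidth;  // widest atomic operation emitted inline for this CPU, in bits
};

// cmpxchg8b (i586+) is needed for 64-bit inline atomics. Below that, assume
// cmpxchg (i486+) and cap both limits at 32 bits.
constexpr AtomicLimits limitsFor(CpuLevel cpu) {
  return cpu >= CpuLevel::I586 ? AtomicLimits{64, 64} : AtomicLimits{32, 32};
}

// Simplified assumption: 2 = always lock-free, 1 = sometimes lock-free.
constexpr int lockFreeValue(unsigned typeWidthBits, AtomicLimits limits) {
  return typeWidthBits <= limits.maxInlineWidth ? 2 : 1;
}

static_assert(lockFreeValue(64, limitsFor(CpuLevel::I486)) == 1,
              "64-bit atomics are not always lock-free on i486");
static_assert(lockFreeValue(64, limitsFor(CpuLevel::I686)) == 2,
              "64-bit atomics are always lock-free on i686");
static_assert(lockFreeValue(32, limitsFor(CpuLevel::I486)) == 2,
              "32-bit atomics are always lock-free on i486");
```

Under these assumptions the values agree with the expectations in test/Sema/atomic-ops.c from this revision: the long long macro is 1 for i486 and 2 for i686.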

Diff Detail

Event Timeline

mgorny created this revision. Feb 4 2017, 10:24 AM

Herald added a subscriber: emaste. Feb 4 2017, 10:24 AM

Could someone help me figure out what is the cause and correct solution to that failure? @jlebar?

The test is checking that the macros have the same value when compiling for CUDA host and device. That is, if we're compiling for an x86 CPU and an NVPTX GPU, we invoke cc1 twice, and the macros should have the same values both times. Which, I know, is a lie. But because when we're compiling for NVPTX we still parse all of the CPU code, macros generally need to have the same values otherwise we get into Big Trouble. NVPTX atomics are controlled separately.

You can see in NVPTXTargetInfo that we read properties from the host targetinfo so that we export the same macros. The problem here seems to be that we're mutating the x86 targetinfo after the nvptx targetinfo reads its properties.

Does that give you enough context to fix the problem?
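
To make the ordering problem concrete, here is a minimal sketch in which a device description snapshots a value from a host description. `HostInfo`, `DeviceInfo` and `macrosAgree` are hypothetical stand-ins, not the real `TargetInfo` classes. The point is only that a copy taken before the host's CPU is selected goes stale.

```cpp
#include <cassert>

// Hypothetical stand-in for the host (x86) target description.
struct HostInfo {
  unsigned maxAtomicInlineWidth = 32; // value before any CPU is selected
  void selectI586OrLater() { maxAtomicInlineWidth = 64; }
};

// Hypothetical stand-in for the device (NVPTX) target description.
struct DeviceInfo {
  unsigned maxAtomicInlineWidth = 0;
  // Snapshot the host value so both compilations would define identical macros.
  void copyFromHost(const HostInfo &host) {
    maxAtomicInlineWidth = host.maxAtomicInlineWidth;
  }
};

// Returns true when host and device end up agreeing on the atomic width.
inline bool macrosAgree(bool selectCpuBeforeCopy) {
  HostInfo host;
  DeviceInfo device;
  if (selectCpuBeforeCopy)
    host.selectI586OrLater();
  device.copyFromHost(host);
  if (!selectCpuBeforeCopy)
    host.selectI586OrLater(); // mutation after the snapshot: device is now stale
  return device.maxAtomicInlineWidth == host.maxAtomicInlineWidth;
}
```

In this model `macrosAgree(true)` holds and `macrosAgree(false)` does not, which is the mismatch the failing test detects.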

dim added inline comments. Feb 4 2017, 11:32 AM

lib/Basic/Targets.cpp
4251	Are you purposefully not setting `MaxAtomicInlineWidth` here? (It seems from `TargetInfo` that the default value is zero.)
4273	As far as I can see, in the constructor this call is _always_ made with `CPU` set to `CK_Generic`, i.e. zero. Therefore, the "allow locked atomics up to 4 bytes" path in `setAtomic` is always chosen. Maybe it is clearer to just initialize `MaxAtomicPromoteWidth` to 32 directly here, instead?

In D29542#666831, @jlebar wrote:

Could someone help me figure out what is the cause and correct solution to that failure? @jlebar?

You can see in NVPTXTargetInfo that we read properties from the host targetinfo so that we export the same macros. The problem here seems to be that we're mutating the x86 targetinfo after the nvptx targetinfo reads its properties.

Does that give you enough context to fix the problem?

Thanks. I'll try to find a reasonably sane solution ;-).

lib/Basic/Targets.cpp
4251	Yes. I've based this on what's done for ARM. Unless I misunderstood something, this means that on 'plain' i386 there is no inline atomics support but we still want to do atomics-via-locking up to 32-bit types. I'm not sure about 32/64 here to match i486.
4273	Well, I just copied the idea from ARM. I thought of it more like 'make sure it is initialized to some value, possibly update it later when setting CPU'. I'm fine either way.

Ok, this CUDA fix should be reasonable, I think. It simply assumes i586+ (i.e. all inline atomics enabled) for CUDA target builds. I seriously doubt it's technically possible that anyone will ever use CUDA on <i586 ;-).

jlebar added inline comments. Feb 5 2017, 2:07 PM

lib/Basic/Targets.cpp
1784	Someone else is calling setCPU on the HostTarget. If they're calling it after we call it here, this is obviously not going to work. Since it does work, I presume they are calling it before we get here. In which case, can we not just set `MaxAtomicPromoteWidth=HostTarget->MaxAtomicPromoteWidth` right after we set `MaxAtomicInlineWidth = HostTarget->getMaxAtomicInlineWidth();` below? (Note that this, as written, definitely isn't right because it assumes that HostTarget is non-null.)

CUDA: added the MaxAtomicPromoteWidth setting, and moved the CPU setting a little lower to ensure that it doesn't get called with null HostTarget.

jlebar added inline comments. Feb 5 2017, 4:59 PM

lib/Basic/Targets.cpp
1808	Okay, is this still needed now?

mgorny added inline comments. Feb 6 2017, 12:19 AM

lib/Basic/Targets.cpp
1808	Yes. I've specifically tested with it commented out, and the CPU gets initialized to generic (=no inline atomics) then.

jlebar added inline comments. Feb 6 2017, 8:37 AM

lib/Basic/Targets.cpp
1808	Yes, but is that a bug? Does that break the test? I thought the problem we were trying to solve here was that CUDA host and device builds did not define the same macros. And I thought that setCPU modified the values for MaxAtomicInlineWidth and MaxAtomicPromoteWidth. Moreover I thought that we called HostTarget->setCPU before calling this function. If all of those things are true, I don't see what problem we're solving by calling HostTarget->setCPU("i586") here.

At this point, I don't think there is any use in pretending that i386-as-default makes sense. So I would request that the i386 case should be made explicit or just dropped, with a preference for the latter.

mgorny added inline comments. Feb 6 2017, 9:32 AM

lib/Basic/Targets.cpp
1808	Well, the thing is, we don't call `HostTarget->setCPU()` before this function. We just call `AllocateTarget()`, and it does not set the CPU. Normally the CPU is set in Driver, based on `-march` etc. if provided, with fallback to platform-specific defaults. In the case of host-side CUDA build, the Driver sets x86-specific CPU. While the defaults differ per platform, for all platforms supporting CUDA it's i586+. Now, for the target-side, the Driver creates NVPTX target, and sets NVPTX-specific CPU. The `HostTarget` instance is only created within `NVPTXTargetInfo`, and so we need to `setCPU()` explicitly. Since we can reliably assume that the host-side will be i586+, we use `i586` here. So far this didn't matter since all atomic properties were defined statically. However, this patch changes them to adjust to the CPU used, and so if the `X8632TargetInfo` instance is allocated without an explicit `setCPU()` call, it defaults to generic x86 (= no inline atomics available) which is different from the host platform default. As a result, different macros are defined and the test fails.
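
The behaviour described in this comment can be modelled with a small sketch. `X8632Model` is a hypothetical, heavily simplified class (the numeric CPU levels and the three accepted names are assumptions made for the example). It shows a target that starts out generic and only widens its atomic limit when `setCPU()` is called explicitly.

```cpp
#include <cassert>
#include <string>

// Hypothetical, heavily simplified model of an x86-32 target description.
class X8632Model {
  int level_ = 0;          // 0 = generic, i.e. no CPU chosen yet
  unsigned maxInline_ = 0; // widest inline atomic, in bits

  // Recompute the limit from the current CPU level.
  void setAtomic() { maxInline_ = level_ >= 586 ? 64 : 32; }

public:
  X8632Model() { setAtomic(); } // initialized for the generic CPU

  bool setCPU(const std::string &name) {
    int level;
    if (name == "i486")
      level = 486;
    else if (name == "i586")
      level = 586;
    else if (name == "i686")
      level = 686;
    else
      return false; // unknown CPU: leave the state untouched
    level_ = level;
    setAtomic(); // the limit follows the CPU, as in the patch's setCPU override
    return true;
  }

  unsigned maxAtomicInlineWidth() const { return maxInline_; }
};
```

An instance that nobody calls `setCPU()` on reports 32 bits here, while one configured with `"i586"` reports 64. That is the difference between the freshly allocated auxiliary host target and the driver-configured one.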

In D29542#667814, @joerg wrote:

At this point, I don't think there is any use in pretending that i386-as-default makes sense. So I would request that the i386 case should be made explicit or just dropped, with a preference for the latter.

By the former, do you mean making CK_Generic imply i486+ or i586+? What about line ~3947 where the same conditions are used to control other definitions? Should they be changed too?

Generic should imply i486+. I don't think any general purpose system supports i386 at this point, simply because it has an annoying number of bugs in critical components. The i486 (esp. the non-crippled ones) are reasonably easy to support and there are still people around with hardware, esp. clones.

ahatanak added a subscriber: ahatanak. Feb 6 2017, 10:03 AM

ahatanak added inline comments.

lib/Basic/Targets.cpp
4251	If there isn't a test case for plain i386, is it possible to add one (perhaps in test/Sema/atomic-ops.c)?

mgorny added inline comments. Feb 6 2017, 10:11 AM

lib/Basic/Targets.cpp
4251	I could do that. However, @joerg suggested dropping i386 branch entirely, and assuming i486+.

ahatanak added inline comments. Feb 6 2017, 10:16 AM

lib/Basic/Targets.cpp
4251	OK, thanks. In that case, we don't need the test.

Well, the thing is, we don't call HostTarget->setCPU() before this function. We just call AllocateTarget(), and it does not set the CPU.

Ah, got it. LGTM for the nvptx changes.

There's still something strange here. If I compile the following on i386-freebsd12, with clang -march=i486 -O2 -S:

_Atomic(long long) ll;

void f(void)
{
  ++ll;
}

the result is:

	.globl	f
	.p2align	4, 0x90
	.type	f,@function
f:                                      # @f
# BB#0:                                 # %entry
	pushl	%ebp
	movl	%esp, %ebp
	pushl	%ebx
	movl	ll+4, %edx
	movl	ll, %eax
	.p2align	4, 0x90
.LBB0_1:                                # %atomicrmw.start
                                        # =>This Inner Loop Header: Depth=1
	movl	%eax, %ebx
	addl	$1, %ebx
	movl	%edx, %ecx
	adcl	$0, %ecx
	lock		cmpxchg8b	ll
	jne	.LBB0_1
# BB#2:                                 # %atomicrmw.end
	popl	%ebx
	popl	%ebp
	retl
.Lfunc_end0:
	.size	f, .Lfunc_end0-f

	.type	ll,@object              # @ll
	.comm	ll,8,4

So what gives? It's still inserting a cmpxchg8b! AFAIK it should now insert a call to __atomic_fetch_add_8 instead.

Note that this changes if you use C++ atomics, e.g.:

#include <atomic>

void f(std::atomic<long long>& x)
{
  ++x;
}

compiles to:

	.globl	_Z1fRNSt3__16atomicIxEE
	.p2align	4, 0x90
	.type	_Z1fRNSt3__16atomicIxEE,@function
_Z1fRNSt3__16atomicIxEE:                # @_Z1fRNSt3__16atomicIxEE
.Lfunc_begin0:
	.cfi_sections .debug_frame
	.cfi_startproc
	.cfi_personality 0, __gxx_personality_v0
	.cfi_lsda 0, .Lexception0
# BB#0:                                 # %entry
	pushl	%ebp
	movl	%esp, %ebp
	subl	$16, %esp
	movl	8(%ebp), %eax
.Ltmp0:
	movl	%eax, (%esp)
	movl	$5, 12(%esp)
	movl	$0, 8(%esp)
	movl	$1, 4(%esp)
	calll	__atomic_fetch_add_8
.Ltmp1:
# BB#1:                                 # %_ZNSt3__113__atomic_baseIxLb1EEppEv.exit
	addl	$16, %esp
	popl	%ebp
	retl
.LBB0_2:                                # %lpad.i.i
.Ltmp2:
	movl	%eax, (%esp)
	calll	__cxa_call_unexpected
	subl	$4, %esp
.Lfunc_end0:
	.size	_Z1fRNSt3__16atomicIxEE, .Lfunc_end0-_Z1fRNSt3__16atomicIxEE
	.cfi_endproc

I think you're running into https://llvm.org/bugs/show_bug.cgi?id=31620 .
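
One way to see what the frontend reports, independently of what the backend emits, is to inspect the standard `ATOMIC_LLONG_LOCK_FREE` macro from `<atomic>`. The helpers `describeLockFree` and `llongLockFreedom` below are hypothetical names for this sketch. It only reads the macro and makes no claim about the instructions generated.

```cpp
#include <atomic>
#include <cassert>
#include <string>

// Translate an ATOMIC_*_LOCK_FREE value into words.
// The standard defines the values 0, 1 and 2.
inline std::string describeLockFree(int value) {
  switch (value) {
  case 0:
    return "never lock-free";
  case 1:
    return "sometimes lock-free";
  case 2:
    return "always lock-free";
  default:
    return "unexpected value";
  }
}

// What the compiler promises for long long on the current target and CPU setting.
inline std::string llongLockFreedom() {
  return describeLockFree(ATOMIC_LLONG_LOCK_FREE);
}
```

Building this for the same triple with different CPU settings and comparing the result of `llongLockFreedom()` would show whether the macro follows the CPU, which is what this revision is about. Whether the generated code matches the macro is the separate question raised above.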

Removed the i386 branch. Now the i486+ are used unconditionally.

Le gentle ping.

+jyknight, who had a similar patch in http://reviews.llvm.org/D17933 (see also r291477 and PR31864)

test/CodeGen/atomic-ops.c
1	Naive question: why is the i686- part of the triple not sufficient here; why is -target-cpu needed?

mgorny added inline comments. Feb 23 2017, 9:32 PM

test/CodeGen/atomic-ops.c
1	It's because triple is not really meaningful on most of the systems (e.g. many Linux distros use i386, *BSD use i486), and the default CPU logic is applied in the Driver, while cc1 is called directly here.
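
The split described here can be sketched as two small functions. Both are hypothetical models, not clang code: `driverTargetCpu` stands for the driver picking a platform default when no CPU is requested, and `cc1TargetCpu` stands for cc1 taking whatever it is given and otherwise staying generic. The platform default is passed in by the caller because the actual per-platform table is not reproduced here.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the driver: an explicit request wins, otherwise the
// caller-supplied platform default is used.
inline std::string driverTargetCpu(const std::string &requestedCpu,
                                   const std::string &platformDefault) {
  return requestedCpu.empty() ? platformDefault : requestedCpu;
}

// Hypothetical model of cc1: no platform lookup happens here, so without an
// explicit -target-cpu the CPU stays generic.
inline std::string cc1TargetCpu(const std::string &targetCpuFlag) {
  return targetCpuFlag.empty() ? "generic" : targetCpuFlag;
}
```

In this model a test that invokes cc1 directly and omits `-target-cpu` ends up generic regardless of the triple, which is why the RUN lines in this revision spell the CPU out.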

A gentle ping.

mgorny added a child revision: D28213: [Frontend] Correct values of ATOMIC_*_LOCK_FREE to match builtin. Mar 23 2017, 10:26 AM

mgorny added a reviewer: hfinkel.

tstellar added a subscriber: tstellar. Jul 15 2021, 10:10 PM

Herald added subscribers: pengfei, jfb, krytarowski, arichardson. Jul 15 2021, 10:10 PM

Maybe this change is obsolete now that D59566 is merged?

Herald added a project: Restricted Project. May 23 2022, 6:51 PM

rprichard mentioned this in D28213: [Frontend] Correct values of ATOMIC_*_LOCK_FREE to match builtin. May 26 2022, 6:14 PM

rprichard mentioned this in D127267: [NVPTX] Add setAuxTarget override rather than make a new TargetInfo. Jun 7 2022, 7:06 PM

rprichard removed a child revision: D28213: [Frontend] Correct values of ATOMIC_*_LOCK_FREE to match builtin. Jun 7 2022, 8:07 PM

Revision Contents

Path	Size
lib/Basic/Targets.cpp	33 lines
test/CodeGen/atomic-ops.c	6 lines
test/CodeGen/ms-volatile.c	2 lines
test/CodeGenCXX/atomicinit.cpp	2 lines
test/Sema/atomic-ops.c	27 lines

Diff 87311

lib/Basic/Targets.cpp

[... 1,775 lines not shown ...]	NVPTXTargetInfo(const llvm::Triple &Triple, const TargetOptions &Opts,

// If possible, get a TargetInfo for our host triple, so we can match its		// If possible, get a TargetInfo for our host triple, so we can match its
// types.		// types.
llvm::Triple HostTriple(Opts.HostTriple);		llvm::Triple HostTriple(Opts.HostTriple);
if (!HostTriple.isNVPTX())		if (!HostTriple.isNVPTX())
HostTarget.reset(AllocateTarget(llvm::Triple(Opts.HostTriple), Opts));		HostTarget.reset(AllocateTarget(llvm::Triple(Opts.HostTriple), Opts));

// If no host target, make some guesses about the data layout and return.		// If no host target, make some guesses about the data layout and return.
if (!HostTarget) {		if (!HostTarget) {
LongWidth = LongAlign = TargetPointerWidth;		LongWidth = LongAlign = TargetPointerWidth;
PointerWidth = PointerAlign = TargetPointerWidth;		PointerWidth = PointerAlign = TargetPointerWidth;
switch (TargetPointerWidth) {		switch (TargetPointerWidth) {
case 32:		case 32:
SizeType = TargetInfo::UnsignedInt;		SizeType = TargetInfo::UnsignedInt;
PtrDiffType = TargetInfo::SignedInt;		PtrDiffType = TargetInfo::SignedInt;
IntPtrType = TargetInfo::SignedInt;		IntPtrType = TargetInfo::SignedInt;
break;		break;
case 64:		case 64:
SizeType = TargetInfo::UnsignedLong;		SizeType = TargetInfo::UnsignedLong;
PtrDiffType = TargetInfo::SignedLong;		PtrDiffType = TargetInfo::SignedLong;
IntPtrType = TargetInfo::SignedLong;		IntPtrType = TargetInfo::SignedLong;
break;		break;
default:		default:
llvm_unreachable("TargetPointerWidth must be 32 or 64");		llvm_unreachable("TargetPointerWidth must be 32 or 64");
}		}
return;		return;
}		}

		// If targeting x86-32, set the CPU to i586 to enable all inline
		// atomics. This matches the defaults for the systems using CUDA.
		// TODO: pass the host target CPU
		if (HostTriple.getArch() == llvm::Triple::x86)
		HostTarget->setCPU("i586");

// Copy properties from host target.		// Copy properties from host target.
PointerWidth = HostTarget->getPointerWidth(/* AddrSpace = */ 0);		PointerWidth = HostTarget->getPointerWidth(/* AddrSpace = */ 0);
PointerAlign = HostTarget->getPointerAlign(/* AddrSpace = */ 0);		PointerAlign = HostTarget->getPointerAlign(/* AddrSpace = */ 0);
BoolWidth = HostTarget->getBoolWidth();		BoolWidth = HostTarget->getBoolWidth();
BoolAlign = HostTarget->getBoolAlign();		BoolAlign = HostTarget->getBoolAlign();
IntWidth = HostTarget->getIntWidth();		IntWidth = HostTarget->getIntWidth();
IntAlign = HostTarget->getIntAlign();		IntAlign = HostTarget->getIntAlign();
HalfWidth = HostTarget->getHalfWidth();		HalfWidth = HostTarget->getHalfWidth();
[... 28 lines not shown ...]	NVPTXTargetInfo(const llvm::Triple &Triple, const TargetOptions &Opts,
UseExplicitBitFieldAlignment = HostTarget->useExplicitBitFieldAlignment();		UseExplicitBitFieldAlignment = HostTarget->useExplicitBitFieldAlignment();
ZeroLengthBitfieldBoundary = HostTarget->getZeroLengthBitfieldBoundary();		ZeroLengthBitfieldBoundary = HostTarget->getZeroLengthBitfieldBoundary();

// This is a bit of a lie, but it controls __GCC_ATOMIC_XXX_LOCK_FREE, and		// This is a bit of a lie, but it controls __GCC_ATOMIC_XXX_LOCK_FREE, and
// we need those macros to be identical on host and device, because (among		// we need those macros to be identical on host and device, because (among
// other things) they affect which standard library classes are defined, and		// other things) they affect which standard library classes are defined, and
// we need all classes to be defined on both the host and device.		// we need all classes to be defined on both the host and device.
MaxAtomicInlineWidth = HostTarget->getMaxAtomicInlineWidth();		MaxAtomicInlineWidth = HostTarget->getMaxAtomicInlineWidth();
		MaxAtomicPromoteWidth = HostTarget->getMaxAtomicPromoteWidth();

// Properties intentionally not copied from host:		// Properties intentionally not copied from host:
// - LargeArrayMinWidth, LargeArrayAlign: Not visible across the		// - LargeArrayMinWidth, LargeArrayAlign: Not visible across the
// host/device boundary.		// host/device boundary.
// - SuitableAlign: Not visible across the host/device boundary, and may		// - SuitableAlign: Not visible across the host/device boundary, and may
// correctly be different on host/device, e.g. if host has wider vector		// correctly be different on host/device, e.g. if host has wider vector
// types than device.		// types than device.
// - LongDoubleWidth, LongDoubleAlign: nvptx's long double type is the same		// - LongDoubleWidth, LongDoubleAlign: nvptx's long double type is the same
[... 582 lines not shown ...]	const TargetInfo::AddlRegName AddlRegNames[] = {
{ { "r13d", "r13w", "r13b" }, 43 },		{ { "r13d", "r13w", "r13b" }, 43 },
{ { "r14d", "r14w", "r14b" }, 44 },		{ { "r14d", "r14w", "r14b" }, 44 },
{ { "r15d", "r15w", "r15b" }, 45 },		{ { "r15d", "r15w", "r15b" }, 45 },
};		};

// X86 target abstract base class; x86-32 and x86-64 are very close, so		// X86 target abstract base class; x86-32 and x86-64 are very close, so
// most of the implementation can be shared.		// most of the implementation can be shared.
class X86TargetInfo : public TargetInfo {		class X86TargetInfo : public TargetInfo {
		protected:
enum X86SSEEnum {		enum X86SSEEnum {
NoSSE, SSE1, SSE2, SSE3, SSSE3, SSE41, SSE42, AVX, AVX2, AVX512F		NoSSE, SSE1, SSE2, SSE3, SSSE3, SSE41, SSE42, AVX, AVX2, AVX512F
} SSELevel = NoSSE;		} SSELevel = NoSSE;
enum MMX3DNowEnum {		enum MMX3DNowEnum {
NoMMX3DNow, MMX, AMD3DNow, AMD3DNowAthlon		NoMMX3DNow, MMX, AMD3DNow, AMD3DNowAthlon
} MMX3DNowLevel = NoMMX3DNow;		} MMX3DNowLevel = NoMMX3DNow;
enum XOPEnum {		enum XOPEnum {
NoXOP,		NoXOP,
[... 1,772 lines not shown ...]	case 'Y':
LLVM_FALLTHROUGH;		LLVM_FALLTHROUGH;
default:		default:
return std::string(1, *Constraint);		return std::string(1, *Constraint);
}		}
}		}

// X86-32 generic target		// X86-32 generic target
class X86_32TargetInfo : public X86TargetInfo {		class X86_32TargetInfo : public X86TargetInfo {
		void setAtomic() {
		// cmpxchg8b is required to support 64-bit inline atomics.
		if (CPU >= CK_i586) {
		MaxAtomicPromoteWidth = 64;
		MaxAtomicInlineWidth = 64;
		} else {
		// Note: strictly speaking, cmpxchg (i.e. i486+) is required
		// for any inline atomics at all.
		MaxAtomicPromoteWidth = 32;
		MaxAtomicInlineWidth = 32;
		}
		}

public:		public:
X86_32TargetInfo(const llvm::Triple &Triple, const TargetOptions &Opts)		X86_32TargetInfo(const llvm::Triple &Triple, const TargetOptions &Opts)
: X86TargetInfo(Triple, Opts) {		: X86TargetInfo(Triple, Opts) {
DoubleAlign = LongLongAlign = 32;		DoubleAlign = LongLongAlign = 32;
LongDoubleWidth = 96;		LongDoubleWidth = 96;
LongDoubleAlign = 32;		LongDoubleAlign = 32;
SuitableAlign = 128;		SuitableAlign = 128;
resetDataLayout("e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128");		resetDataLayout("e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128");
SizeType = UnsignedInt;		SizeType = UnsignedInt;
PtrDiffType = SignedInt;		PtrDiffType = SignedInt;
IntPtrType = SignedInt;		IntPtrType = SignedInt;
RegParmMax = 3;		RegParmMax = 3;

// Use fpret for all types.		// Use fpret for all types.
RealTypeUsesObjCFPRet = ((1 << TargetInfo::Float) |		RealTypeUsesObjCFPRet = ((1 << TargetInfo::Float) |
(1 << TargetInfo::Double) |		(1 << TargetInfo::Double) |
(1 << TargetInfo::LongDouble));		(1 << TargetInfo::LongDouble));

// x86-32 has atomics up to 8 bytes		setAtomic();
// FIXME: Check that we actually have cmpxchg8b before setting
// MaxAtomicInlineWidth. (cmpxchg8b is an i586 instruction.)
MaxAtomicPromoteWidth = MaxAtomicInlineWidth = 64;
}		}
BuiltinVaListKind getBuiltinVaListKind() const override {		BuiltinVaListKind getBuiltinVaListKind() const override {
return TargetInfo::CharPtrBuiltinVaList;		return TargetInfo::CharPtrBuiltinVaList;
}		}

		bool setCPU(const std::string &Name) override {
		bool ret = X86TargetInfo::setCPU(Name);
		if (ret)
		setAtomic();
		return ret;
		}

int getEHDataRegisterNumber(unsigned RegNo) const override {		int getEHDataRegisterNumber(unsigned RegNo) const override {
if (RegNo == 0) return 0;		if (RegNo == 0) return 0;
if (RegNo == 1) return 2;		if (RegNo == 1) return 2;
return -1;		return -1;
}		}
bool validateOperandSize(StringRef Constraint,		bool validateOperandSize(StringRef Constraint,
unsigned Size) const override {		unsigned Size) const override {
switch (Constraint[0]) {		switch (Constraint[0]) {
[... 4,783 lines not shown ...]

test/CodeGen/atomic-ops.c

	// RUN: %clang_cc1 %s -emit-llvm -o - -ffreestanding -ffake-address-space-map -triple=i686-apple-darwin9 | FileCheck %s			// RUN: %clang_cc1 %s -emit-llvm -o - -ffreestanding -ffake-address-space-map -triple=i686-apple-darwin9 -target-cpu i686 | FileCheck %s
	// REQUIRES: x86-registered-target			// REQUIRES: x86-registered-target

	// Also test serialization of atomic operations here, to avoid duplicating the			// Also test serialization of atomic operations here, to avoid duplicating the
	// test.			// test.
	// RUN: %clang_cc1 %s -emit-pch -o %t -ffreestanding -ffake-address-space-map -triple=i686-apple-darwin9			// RUN: %clang_cc1 %s -emit-pch -o %t -ffreestanding -ffake-address-space-map -triple=i686-apple-darwin9 -target-cpu i686
	// RUN: %clang_cc1 %s -include-pch %t -ffreestanding -ffake-address-space-map -triple=i686-apple-darwin9 -emit-llvm -o - | FileCheck %s			// RUN: %clang_cc1 %s -include-pch %t -ffreestanding -ffake-address-space-map -triple=i686-apple-darwin9 -target-cpu i686 -emit-llvm -o - | FileCheck %s
	#ifndef ALREADY_INCLUDED			#ifndef ALREADY_INCLUDED
	#define ALREADY_INCLUDED			#define ALREADY_INCLUDED

	#include <stdatomic.h>			#include <stdatomic.h>

	// Basic IRGen tests for __c11_atomic_* and GNU __atomic_*			// Basic IRGen tests for __c11_atomic_* and GNU __atomic_*

	int fi1(_Atomic(int) *i) {			int fi1(_Atomic(int) *i) {
	[... 610 lines not shown ...]

test/CodeGen/ms-volatile.c

	// RUN: %clang_cc1 -triple i386-pc-win32 -fms-extensions -emit-llvm -fms-volatile -o - < %s | FileCheck %s			// RUN: %clang_cc1 -triple i386-pc-win32 -target-cpu i686 -fms-extensions -emit-llvm -fms-volatile -o - < %s | FileCheck %s
	struct foo {			struct foo {
	volatile int x;			volatile int x;
	};			};
	struct bar {			struct bar {
	int x;			int x;
	};			};
	typedef _Complex float __declspec(align(8)) baz;			typedef _Complex float __declspec(align(8)) baz;

	[... 78 lines not shown ...]

test/CodeGenCXX/atomicinit.cpp

	// RUN: %clang_cc1 %s -emit-llvm -O1 -o - -triple=i686-apple-darwin9 -std=c++11 | FileCheck %s			// RUN: %clang_cc1 %s -emit-llvm -O1 -o - -triple=i686-apple-darwin9 -target-cpu i686 -std=c++11 | FileCheck %s

	// CHECK-DAG: @PR22043 = local_unnamed_addr global i32 0, align 4			// CHECK-DAG: @PR22043 = local_unnamed_addr global i32 0, align 4
	typedef _Atomic(int) AtomicInt;			typedef _Atomic(int) AtomicInt;
	AtomicInt PR22043 = AtomicInt();			AtomicInt PR22043 = AtomicInt();

	// CHECK-DAG: @_ZN7PR180978constant1aE = local_unnamed_addr global { i16, i8 } { i16 1, i8 6 }, align 4			// CHECK-DAG: @_ZN7PR180978constant1aE = local_unnamed_addr global { i16, i8 } { i16 1, i8 6 }, align 4
	// CHECK-DAG: @_ZN7PR180978constant1bE = local_unnamed_addr global { i16, i8 } { i16 2, i8 6 }, align 4			// CHECK-DAG: @_ZN7PR180978constant1bE = local_unnamed_addr global { i16, i8 } { i16 2, i8 6 }, align 4
	// CHECK-DAG: @_ZN7PR180978constant1cE = local_unnamed_addr global { i16, i8 } { i16 3, i8 6 }, align 4			// CHECK-DAG: @_ZN7PR180978constant1cE = local_unnamed_addr global { i16, i8 } { i16 3, i8 6 }, align 4
	[... 97 lines not shown ...]

test/Sema/atomic-ops.c

	// RUN: %clang_cc1 %s -verify -ffreestanding -fsyntax-only -triple=i686-linux-gnu -std=c11			// RUN: %clang_cc1 %s -verify -ffreestanding -fsyntax-only -triple=i686-linux-gnu -target-cpu i686 -std=c11
				// RUN: %clang_cc1 %s -verify -ffreestanding -fsyntax-only -triple=i486-linux-gnu -target-cpu i486 -std=c11

	// Basic parsing/Sema tests for __c11_atomic_*			// Basic parsing/Sema tests for __c11_atomic_*

	#include <stdatomic.h>			#include <stdatomic.h>

	struct S { char c[3]; };			struct S { char c[3]; };

	_Static_assert(__GCC_ATOMIC_BOOL_LOCK_FREE == 2, "");			_Static_assert(__GCC_ATOMIC_BOOL_LOCK_FREE == 2, "");
	_Static_assert(__GCC_ATOMIC_CHAR_LOCK_FREE == 2, "");			_Static_assert(__GCC_ATOMIC_CHAR_LOCK_FREE == 2, "");
 _Static_assert(__GCC_ATOMIC_CHAR16_T_LOCK_FREE == 2, "");
 _Static_assert(__GCC_ATOMIC_CHAR32_T_LOCK_FREE == 2, "");
 _Static_assert(__GCC_ATOMIC_WCHAR_T_LOCK_FREE == 2, "");
 _Static_assert(__GCC_ATOMIC_SHORT_LOCK_FREE == 2, "");
 _Static_assert(__GCC_ATOMIC_INT_LOCK_FREE == 2, "");
 _Static_assert(__GCC_ATOMIC_LONG_LOCK_FREE == 2, "");
+#if defined(__i486__)
+_Static_assert(__GCC_ATOMIC_LLONG_LOCK_FREE == 1, "");
+#else
 _Static_assert(__GCC_ATOMIC_LLONG_LOCK_FREE == 2, "");
+#endif
 _Static_assert(__GCC_ATOMIC_POINTER_LOCK_FREE == 2, "");

 _Static_assert(__c11_atomic_is_lock_free(1), "");
 _Static_assert(__c11_atomic_is_lock_free(2), "");
 _Static_assert(__c11_atomic_is_lock_free(3), ""); // expected-error {{not an integral constant expression}}
 _Static_assert(__c11_atomic_is_lock_free(4), "");
+#if defined(__i486__)
+_Static_assert(__c11_atomic_is_lock_free(8), ""); // expected-error {{not an integral constant expression}}
+#else
 _Static_assert(__c11_atomic_is_lock_free(8), "");
+#endif
 _Static_assert(__c11_atomic_is_lock_free(16), ""); // expected-error {{not an integral constant expression}}
 _Static_assert(__c11_atomic_is_lock_free(17), ""); // expected-error {{not an integral constant expression}}

 _Static_assert(__atomic_is_lock_free(1, 0), "");
 _Static_assert(__atomic_is_lock_free(2, 0), "");
 _Static_assert(__atomic_is_lock_free(3, 0), ""); // expected-error {{not an integral constant expression}}
 _Static_assert(__atomic_is_lock_free(4, 0), "");
+#if defined(__i486__)
+_Static_assert(__atomic_is_lock_free(8, 0), ""); // expected-error {{not an integral constant expression}}
+#else
 _Static_assert(__atomic_is_lock_free(8, 0), "");
+#endif
 _Static_assert(__atomic_is_lock_free(16, 0), ""); // expected-error {{not an integral constant expression}}
 _Static_assert(__atomic_is_lock_free(17, 0), ""); // expected-error {{not an integral constant expression}}

 _Static_assert(atomic_is_lock_free((atomic_char*)0), "");
 _Static_assert(atomic_is_lock_free((atomic_short*)0), "");
 _Static_assert(atomic_is_lock_free((atomic_int*)0), "");
 _Static_assert(atomic_is_lock_free((atomic_long*)0), "");
 // expected-error@+1 {{__int128 is not supported on this target}}
[... 10 lines not shown ...]
 _Static_assert(__atomic_is_lock_free(1, &i64), "");
 _Static_assert(__atomic_is_lock_free(2, &i8), ""); // expected-error {{not an integral constant expression}}
 _Static_assert(__atomic_is_lock_free(2, &i16), "");
 _Static_assert(__atomic_is_lock_free(2, &i64), "");
 _Static_assert(__atomic_is_lock_free(4, &i16), ""); // expected-error {{not an integral constant expression}}
 _Static_assert(__atomic_is_lock_free(4, &i32), "");
 _Static_assert(__atomic_is_lock_free(4, &i64), "");
 _Static_assert(__atomic_is_lock_free(8, &i32), ""); // expected-error {{not an integral constant expression}}
+#if defined(__i486__)
+_Static_assert(__atomic_is_lock_free(8, &i64), ""); // expected-error {{not an integral constant expression}}
+#else
 _Static_assert(__atomic_is_lock_free(8, &i64), "");
+#endif

 _Static_assert(__atomic_always_lock_free(1, 0), "");
 _Static_assert(__atomic_always_lock_free(2, 0), "");
 _Static_assert(!__atomic_always_lock_free(3, 0), "");
 _Static_assert(__atomic_always_lock_free(4, 0), "");
+#if defined(__i486__)
+_Static_assert(!__atomic_always_lock_free(8, 0), "");
+#else
 _Static_assert(__atomic_always_lock_free(8, 0), "");
+#endif
 _Static_assert(!__atomic_always_lock_free(16, 0), "");
 _Static_assert(!__atomic_always_lock_free(17, 0), "");

 _Static_assert(__atomic_always_lock_free(1, incomplete), "");
 _Static_assert(!__atomic_always_lock_free(2, incomplete), "");
 _Static_assert(!__atomic_always_lock_free(4, incomplete), "");

 _Static_assert(__atomic_always_lock_free(1, &i8), "");
 _Static_assert(__atomic_always_lock_free(1, &i64), "");
 _Static_assert(!__atomic_always_lock_free(2, &i8), "");
 _Static_assert(__atomic_always_lock_free(2, &i16), "");
 _Static_assert(__atomic_always_lock_free(2, &i64), "");
 _Static_assert(!__atomic_always_lock_free(4, &i16), "");
 _Static_assert(__atomic_always_lock_free(4, &i32), "");
 _Static_assert(__atomic_always_lock_free(4, &i64), "");
 _Static_assert(!__atomic_always_lock_free(8, &i32), "");
+#if defined(__i486__)
+_Static_assert(!__atomic_always_lock_free(8, &i64), "");
+#else
 _Static_assert(__atomic_always_lock_free(8, &i64), "");
+#endif

 #define _AS1 __attribute__((address_space(1)))
 #define _AS2 __attribute__((address_space(2)))

 void f(_Atomic(int) i, const _Atomic(int) ci,
 _Atomic(int) p, _Atomic(float) *d,
 int I, const int CI,
 int *P, float D, struct S s1, struct S s2) {
[... 417 lines not shown ...]