This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Make __GCC_ATOMIC_XXX_LOCK_FREE macros the same on host/device.
Closed, Public

Authored by jlebar on Sep 9 2016, 10:07 AM.

Details

Reviewers

Commits

rG5057f17716f0: [CUDA] Make __GCC_ATOMIC_XXX_LOCK_FREE macros the same on host/device.
rC281089: [CUDA] Make __GCC_ATOMIC_XXX_LOCK_FREE macros the same on host/device.
rL281089: [CUDA] Make __GCC_ATOMIC_XXX_LOCK_FREE macros the same on host/device.

Summary

This fixes a bug where we were unable to compile the following CUDA
file with libstdc++ (didn't try libc++):

#include <future>
void foo() { std::shared_future<int> x; }

The problem is that libstdc++'s <future> only defines std::shared_future if
__GCC_ATOMIC_INT_LOCK_FREE > 1. When we compiled this file for device,
the macro was set to 1, so the class was not defined at all.
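The failure mode described in the summary can be reproduced in miniature. The sketch below is not libstdc++ code: DEMO_ATOMIC_INT_LOCK_FREE is a hypothetical stand-in for __GCC_ATOMIC_INT_LOCK_FREE, and demo::shared_future_like is a hypothetical stand-in for std::shared_future. It only illustrates how a class guarded by such a macro disappears entirely when the macro is 1.

```cpp
#include <cassert>

// Hypothetical stand-in for the compiler-provided __GCC_ATOMIC_INT_LOCK_FREE.
// Build with -DDEMO_ATOMIC_INT_LOCK_FREE=1 to imitate the device-side value
// reported in the summary; the default of 2 imitates the host-side value.
#ifndef DEMO_ATOMIC_INT_LOCK_FREE
#define DEMO_ATOMIC_INT_LOCK_FREE 2
#endif

namespace demo {
#if DEMO_ATOMIC_INT_LOCK_FREE > 1
// Only declared when int atomics are reported as always lock-free.
template <typename T> struct shared_future_like { T value{}; };
#endif
} // namespace demo

// Reports whether the guarded class exists in this compilation.
constexpr bool hasSharedFutureLike() { return DEMO_ATOMIC_INT_LOCK_FREE > 1; }

#if DEMO_ATOMIC_INT_LOCK_FREE > 1
// Mirrors `void foo() { std::shared_future<int> x; }` from the summary.
// With the macro set to 1 this function would have to be removed as well,
// because the type it names would not exist.
int useSharedFutureLike() {
  demo::shared_future_like<int> x;
  assert(x.value == 0); // value-initialized member
  return x.value;
}
#endif
```

In a CUDA compilation the same source is preprocessed once per side, so a guard like this has to evaluate the same way on host and device for shared code to compile on both.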

Event Timeline

jlebar updated this revision to Diff 70857. Sep 9 2016, 10:07 AM

jlebar retitled this revision to [CUDA] Make __GCC_ATOMIC_XXX_LOCK_FREE macros the same on host/device.

jlebar updated this object.

jlebar added a reviewer: tra.

jlebar added subscribers: jhen, cfe-commits.

LGTM

This revision is now accepted and ready to land. Sep 9 2016, 10:19 AM

Closed by commit rL281089: [CUDA] Make __GCC_ATOMIC_XXX_LOCK_FREE macros the same on host/device. (authored by jlebar). Sep 9 2016, 1:44 PM

This revision was automatically updated to reflect the committed changes.

rprichard mentioned this in D127267: [NVPTX] Add setAuxTarget override rather than make a new TargetInfo. Jun 9 2022, 6:46 PM

Revision Contents

Path                                     Size
clang/lib/Basic/Targets.cpp              6 lines
clang/test/Preprocessor/cuda-types.cu    39 lines

Diff 70857

clang/lib/Basic/Targets.cpp

NVPTXTargetInfo(const llvm::Triple &Triple, const TargetOptions &Opts)
     ProcessIDType = HostTarget->getProcessIDType();

     UseBitFieldTypeAlignment = HostTarget->useBitFieldTypeAlignment();
     UseZeroLengthBitfieldAlignment =
         HostTarget->useZeroLengthBitfieldAlignment();
     UseExplicitBitFieldAlignment = HostTarget->useExplicitBitFieldAlignment();
     ZeroLengthBitfieldBoundary = HostTarget->getZeroLengthBitfieldBoundary();

+    // This is a bit of a lie, but it controls __GCC_ATOMIC_XXX_LOCK_FREE, and
+    // we need those macros to be identical on host and device, because (among
+    // other things) they affect which standard library classes are defined, and
+    // we need all classes to be defined on both the host and device.
+    MaxAtomicInlineWidth = HostTarget->getMaxAtomicInlineWidth();
+
     // Properties intentionally not copied from host:
     // - LargeArrayMinWidth, LargeArrayAlign: Not visible across the
     // host/device boundary.
     // - SuitableAlign: Not visible across the host/device boundary, and may
     // correctly be different on host/device, e.g. if host has wider vector
     // types than device.
     // - LongDoubleWidth, LongDoubleAlign: nvptx's long double type is the same
     // as its double type, but that's not necessarily true on the host.
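The added comment says MaxAtomicInlineWidth controls the __GCC_ATOMIC_XXX_LOCK_FREE macros. As a rough sketch of why copying it from the host changes the macro value, the function below models the classification as I understand Clang's preprocessor setup to do it: a type is reported as 2 (always lock-free) only when it is fully aligned, a power of two in width, and no wider than the target's inline atomic width; otherwise it is reported as 1. The function name, parameter names, and the concrete widths (0 for the device before this patch, 64 for an x86-64 host) are assumptions for illustration, not values taken from this revision.

```cpp
#include <cassert>

// Illustrative model only; not the actual Clang function.
// Returns 2 ("always lock-free") or 1 ("sometimes lock-free").
constexpr int lockFreeValue(unsigned typeWidthBits, unsigned typeAlignBits,
                            unsigned maxAtomicInlineWidthBits) {
  return (typeWidthBits == typeAlignBits &&
          typeWidthBits != 0 &&
          (typeWidthBits & (typeWidthBits - 1)) == 0 && // power of two
          typeWidthBits <= maxAtomicInlineWidthBits)
             ? 2
             : 1;
}

void runLockFreeExample() {
  // Assumed device-side inline width before the patch: 0 bits, so a 32-bit
  // int is classified as 1, matching the value reported in the summary.
  assert(lockFreeValue(32, 32, 0) == 1);
  // Assumed host-side inline width of 64 bits: the same int is classified
  // as 2, which is what the device now reports after copying the host value.
  assert(lockFreeValue(32, 32, 64) == 2);
}
```

Under this model the macro depends only on the type's layout and the inline width, which is why copying a single field from the host target is enough to make the host and device values agree.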

clang/test/Preprocessor/cuda-types.cu

-// Check that types, widths, etc. match on the host and device sides of CUDA
-// compilations. Note that we filter out long double, as this is intentionally
-// different on host and device.
+// Check that types, widths, __GCC_ATOMIC* macros, etc. match on the host and
+// device sides of CUDA compilations. Note that we filter out long double, as
+// this is intentionally different on host and device.
+//
+// FIXME: We really should make __GCC_HAVE_SYNC_COMPARE_AND_SWAP identical on
+// host and device, but architecturally this is difficult at the moment.

-// RUN: %clang --cuda-host-only -nocudainc -target i386-unknown-linux-gnu -x cuda -E -dM -o - /dev/null > %T/i386-host-defines
-// RUN: %clang --cuda-device-only -nocudainc -nocudalib -target i386-unknown-linux-gnu -x cuda -E -dM -o - /dev/null > %T/i386-device-defines
-// RUN: grep 'define __[^ ]*\(TYPE\\|MAX\\|SIZEOF\|WIDTH\)' %T/i386-host-defines \| grep -v '__LDBL\\|_LONG_DOUBLE' > %T/i386-host-defines-filtered
-// RUN: grep 'define __[^ ]*\(TYPE\\|MAX\\|SIZEOF\|WIDTH\)' %T/i386-device-defines \| grep -v '__LDBL\\|_LONG_DOUBLE' > %T/i386-device-defines-filtered
+// RUN: %clang --cuda-host-only -nocudainc -target i386-unknown-linux-gnu -x cuda -E -dM -o - /dev/null \
+// RUN: \| grep 'define __[^ ]*\(TYPE\\|MAX\\|SIZEOF\|WIDTH\)\\|define __GCC_ATOMIC' \
+// RUN: \| grep -v '__LDBL\\|_LONG_DOUBLE' > %T/i386-host-defines-filtered
+// RUN: %clang --cuda-device-only -nocudainc -nocudalib -target i386-unknown-linux-gnu -x cuda -E -dM -o - /dev/null \
+// RUN: \| grep 'define __[^ ]*\(TYPE\\|MAX\\|SIZEOF\|WIDTH\)\\|define __GCC_ATOMIC' \
+// RUN: \| grep -v '__LDBL\\|_LONG_DOUBLE' > %T/i386-device-defines-filtered
 // RUN: diff %T/i386-host-defines-filtered %T/i386-device-defines-filtered

-// RUN: %clang --cuda-host-only -nocudainc -target x86_64-unknown-linux-gnu -x cuda -E -dM -o - /dev/null > %T/x86_64-host-defines
-// RUN: %clang --cuda-device-only -nocudainc -nocudalib -target x86_64-unknown-linux-gnu -x cuda -E -dM -o - /dev/null > %T/x86_64-device-defines
-// RUN: grep 'define __[^ ]*\(TYPE\\|MAX\\|SIZEOF\\|WIDTH\)' %T/x86_64-host-defines \| grep -v '__LDBL\\|_LONG_DOUBLE' > %T/x86_64-host-defines-filtered
-// RUN: grep 'define __[^ ]*\(TYPE\\|MAX\\|SIZEOF\\|WIDTH\)' %T/x86_64-device-defines \| grep -v '__LDBL\\|_LONG_DOUBLE' > %T/x86_64-device-defines-filtered
+// RUN: %clang --cuda-host-only -nocudainc -target x86_64-unknown-linux-gnu -x cuda -E -dM -o - /dev/null \
+// RUN: \| grep 'define __[^ ]*\(TYPE\\|MAX\\|SIZEOF\|WIDTH\)\\|define __GCC_ATOMIC' \
+// RUN: \| grep -v '__LDBL\\|_LONG_DOUBLE' > %T/x86_64-host-defines-filtered
+// RUN: %clang --cuda-device-only -nocudainc -nocudalib -target x86_64-unknown-linux-gnu -x cuda -E -dM -o - /dev/null \
+// RUN: \| grep 'define __[^ ]*\(TYPE\\|MAX\\|SIZEOF\|WIDTH\)\\|define __GCC_ATOMIC' \
+// RUN: \| grep -v '__LDBL\\|_LONG_DOUBLE' > %T/x86_64-device-defines-filtered
 // RUN: diff %T/x86_64-host-defines-filtered %T/x86_64-device-defines-filtered

-// RUN: %clang --cuda-host-only -nocudainc -target powerpc64-unknown-linux-gnu -x cuda -E -dM -o - /dev/null > %T/powerpc64-host-defines
-// RUN: %clang --cuda-device-only -nocudainc -nocudalib -target powerpc64-unknown-linux-gnu -x cuda -E -dM -o - /dev/null > %T/powerpc64-device-defines
-// RUN: grep 'define __[^ ]*\(TYPE\\|MAX\\|SIZEOF\\|WIDTH\)' %T/powerpc64-host-defines \| grep -v '__LDBL\\|_LONG_DOUBLE' > %T/powerpc64-host-defines-filtered
-// RUN: grep 'define __[^ ]*\(TYPE\\|MAX\\|SIZEOF\\|WIDTH\)' %T/powerpc64-device-defines \| grep -v '__LDBL\\|_LONG_DOUBLE' > %T/powerpc64-device-defines-filtered
+// RUN: %clang --cuda-host-only -nocudainc -target powerpc64-unknown-linux-gnu -x cuda -E -dM -o - /dev/null \
+// RUN: \| grep 'define __[^ ]*\(TYPE\\|MAX\\|SIZEOF\|WIDTH\)\\|define __GCC_ATOMIC' \
+// RUN: \| grep -v '__LDBL\\|_LONG_DOUBLE' > %T/powerpc64-host-defines-filtered
+// RUN: %clang --cuda-device-only -nocudainc -nocudalib -target powerpc64-unknown-linux-gnu -x cuda -E -dM -o - /dev/null \
+// RUN: \| grep 'define __[^ ]*\(TYPE\\|MAX\\|SIZEOF\|WIDTH\)\\|define __GCC_ATOMIC' \
+// RUN: \| grep -v '__LDBL\\|_LONG_DOUBLE' > %T/powerpc64-device-defines-filtered
 // RUN: diff %T/powerpc64-host-defines-filtered %T/powerpc64-device-defines-filtered
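The RUN lines above dump the predefined macros for each side, keep the type, limit, size, width, and __GCC_ATOMIC definitions, drop anything related to long double, and diff the results. The sketch below restates that logic in C++ for readers who find the grep pipeline hard to follow. It is an illustration, not part of the test: the function names are hypothetical, and the regular expression uses the alternation that the grep pattern appears to intend.

```cpp
#include <cassert>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Keeps the "#define" lines the test compares, in their original order.
std::vector<std::string> filterDefines(const std::string &macroDump) {
  // Assumed intent of the grep pattern in the RUN lines.
  static const std::regex keep(
      R"(define __[^ ]*(TYPE|MAX|SIZEOF|WIDTH)|define __GCC_ATOMIC)");
  // long double is intentionally different on host and device.
  static const std::regex drop(R"(__LDBL|_LONG_DOUBLE)");

  std::vector<std::string> kept;
  std::istringstream in(macroDump);
  std::string line;
  while (std::getline(in, line))
    if (std::regex_search(line, keep) && !std::regex_search(line, drop))
      kept.push_back(line);
  return kept;
}

// Plays the role of the final `diff`: true when both sides agree.
bool definesMatch(const std::string &hostDump, const std::string &deviceDump) {
  return filterDefines(hostDump) == filterDefines(deviceDump);
}

void runDefinesExample() {
  // Made-up macro dumps; the values are for illustration only.
  const std::string host = "#define __INT_MAX__ 2147483647\n"
                           "#define __GCC_ATOMIC_INT_LOCK_FREE 2\n"
                           "#define __LDBL_MAX__ 1\n";
  const std::string deviceBefore = "#define __INT_MAX__ 2147483647\n"
                                   "#define __GCC_ATOMIC_INT_LOCK_FREE 1\n"
                                   "#define __LDBL_MAX__ 9\n";
  const std::string deviceAfter = "#define __INT_MAX__ 2147483647\n"
                                  "#define __GCC_ATOMIC_INT_LOCK_FREE 2\n"
                                  "#define __LDBL_MAX__ 9\n";
  assert(!definesMatch(host, deviceBefore)); // lock-free macro differs
  assert(definesMatch(host, deviceAfter));   // only long double differs
}
```

Comparing ordered vectors rather than sets mirrors `diff`, which is sensitive to line order as well as content.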