This is an archive of the discontinued LLVM Phabricator instance.


Differential D130030

[OpenMP][DeviceRTL] Remove `atomic::store`
Abandoned · Public

Authored by tianshilei1992 on Jul 18 2022, 11:28 AM.

Details

Reviewers

jdoerfert
jhuber6

Summary

atomic::store is currently used by AMDGPU to implement the named barrier.
Internally it calls the __atomic_store_n compiler builtin. However, the NVPTX
backend doesn't support the AtomicStore instruction, which causes a backend
crash when llc is run on the device runtime directly, where atomic::store is
not optimized out. This patch removes atomic::store from the public interface
and, as a workaround, uses __atomic_store_n directly where it is needed. We can
revisit this in the future if atomic::store is needed somewhere else.
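
As a rough illustration of the workaround described in the summary, the sketch below calls the builtin directly at the one place that needs a store instead of going through a public wrapper. The names `BarrierFlag`, `publishBarrierFlag`, and `readBarrierFlag` are hypothetical and do not come from the patch. The read side mirrors the `__atomic_fetch_add(Address, 0U, ...)` idiom that the runtime's `atomicLoad` uses. This is a host-compilable sketch, not the actual AMDGPU named-barrier code.

```cpp
#include <cstdint>

// Hypothetical stand-in for the state a named barrier would publish.
static uint32_t BarrierFlag = 0;

// Hypothetical helper: call the compiler builtin directly at the use site
// rather than exposing a generic atomic::store wrapper.
inline void publishBarrierFlag(uint32_t *Address, uint32_t Val) {
  __atomic_store_n(Address, Val, __ATOMIC_RELEASE);
}

// Read the flag with a fetch-add of zero, mirroring how atomicLoad is
// written in Synchronization.cpp.
inline uint32_t readBarrierFlag(uint32_t *Address) {
  return __atomic_fetch_add(Address, 0U, __ATOMIC_SEQ_CST);
}
```

Because the store is confined to target-specific code, a backend that cannot lower it never sees it, which is the point of the workaround.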

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

tianshilei1992 created this revision. Jul 18 2022, 11:28 AM

Herald added a project: Restricted Project. Jul 18 2022, 11:28 AM

Herald added subscribers: guansong, yaxunl.

tianshilei1992 requested review of this revision. Jul 18 2022, 11:28 AM

Herald added subscribers: openmp-commits, sstefan1.

LG, though the backend should be able to handle this, IMHO

This revision is now accepted and ready to land. Jul 18 2022, 11:38 AM

In D130030#3660553, @jdoerfert wrote:

LG, though the backend should be able to handle this, IMHO

Yeah, __atomic_exchange_n is expanded based on the ordering, and one of the expansions is still an atomic store, which brings us back to the current situation. atomic::store is only used by AMDGPU to implement the named barrier. I'm wondering whether we want to drop atomic::store and use __atomic_store_n wherever it is needed.

refine it

This revision is now accepted and ready to land. Jul 18 2022, 11:59 AM

tianshilei1992 edited the summary of this revision. Jul 18 2022, 12:00 PM

Herald added subscribers: kosarev, tpr. Jul 18 2022, 12:00 PM

tianshilei1992 retitled this revision from [OpenMP][DeviceRTL] Use `__atomic_exchange_n` to implement atomicStore to [OpenMP][DeviceRTL] Remove `atomic::store`. Jul 18 2022, 12:00 PM

tianshilei1992 requested review of this revision. Jul 18 2022, 12:09 PM

Harbormaster completed remote builds in B176091: Diff 445590. Jul 18 2022, 1:19 PM

No, this is strictly worse. If anything, we can introduce a switch to ensure the ordering is known statically. We can use a macro and do it for all of the atomic ops.

In D130030#3663558, @jdoerfert wrote:

No, this is strictly worse. If anything, we can introduce a switch to ensure the ordering is known statically. We can use a macro and do it for all of the atomic ops.

It doesn't work because __atomic_exchange_n with __ATOMIC_RELEASE will be lowered to atomicLoad, which again causes the issue.
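
For reference, the switch-plus-macro idea proposed above could look roughly like the following sketch. The macro names `ORDERING_CASE` and `RETURN_WITH_STATIC_ORDERING`, the local `StaticOrdering`, and the two wrapper functions are hypothetical and are not taken from the runtime. The switch turns a runtime `Ordering` into call sites where the ordering is a compile-time constant. Per the reply above, this alone would not avoid the NVPTX lowering problem.

```cpp
#include <cstdint>

// Hypothetical helper: one switch case in which the ordering is a constant.
#define ORDERING_CASE(ORDER, EXPR)                                             \
  case ORDER: {                                                                \
    constexpr int StaticOrdering = ORDER;                                      \
    return (EXPR);                                                             \
  }

// Hypothetical macro: evaluate EXPR with StaticOrdering bound to a constant
// that matches the runtime Ordering; unknown values fall back to seq_cst.
#define RETURN_WITH_STATIC_ORDERING(Ordering, EXPR)                            \
  switch (Ordering) {                                                          \
    ORDERING_CASE(__ATOMIC_RELAXED, EXPR)                                      \
    ORDERING_CASE(__ATOMIC_ACQUIRE, EXPR)                                      \
    ORDERING_CASE(__ATOMIC_RELEASE, EXPR)                                      \
    ORDERING_CASE(__ATOMIC_ACQ_REL, EXPR)                                      \
  default: {                                                                   \
    constexpr int StaticOrdering = __ATOMIC_SEQ_CST;                           \
    return (EXPR);                                                             \
  }                                                                            \
  }

// The same macro can wrap several atomic operations.
inline uint32_t atomicAddStatic(uint32_t *Address, uint32_t Val, int Ordering) {
  RETURN_WITH_STATIC_ORDERING(
      Ordering, __atomic_fetch_add(Address, Val, StaticOrdering));
}

inline uint32_t atomicExchangeStatic(uint32_t *Address, uint32_t Val,
                                     int Ordering) {
  RETURN_WITH_STATIC_ORDERING(
      Ordering, __atomic_exchange_n(Address, Val, StaticOrdering));
}
```

A store wrapper would need its own case list, since only relaxed, release, and sequentially consistent orderings are valid for `__atomic_store_n`.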

tianshilei1992 abandoned this revision. Sep 4 2022, 12:17 PM

Revision Contents

| Path | Size |
| openmp/libomptarget/DeviceRTL/src/Synchronization.cpp | 2 lines |

Diff 445573

openmp/libomptarget/DeviceRTL/src/Synchronization.cpp

[30 lines not shown]
 /// NOTE: This function needs to be implemented by every target.
 uint32_t atomicInc(uint32_t *Address, uint32_t Val, int Ordering);

 uint32_t atomicLoad(uint32_t *Address, int Ordering) {
   return __atomic_fetch_add(Address, 0U, __ATOMIC_SEQ_CST);
 }

 void atomicStore(uint32_t *Address, uint32_t Val, int Ordering) {
-  __atomic_store_n(Address, Val, Ordering);
+  (void)__atomic_exchange_n(Address, Val, Ordering);
 }

 uint32_t atomicAdd(uint32_t *Address, uint32_t Val, int Ordering) {
   return __atomic_fetch_add(Address, Val, Ordering);
 }
 uint32_t atomicMax(uint32_t *Address, uint32_t Val, int Ordering) {
   return __atomic_fetch_max(Address, Val, Ordering);
 }
[379 lines not shown]