This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/lib/Target/SystemZ/
-
lib/
-
Target/
-
SystemZ/
-
SystemZISelLowering.cpp

Differential D141187

[SystemZ] Set setMaxAtomicSizeInBitsSupported
AbandonedPublic

Authored by brad on Jan 7 2023, 12:24 AM.

Download Raw Diff

Details

Reviewers

jonpa
uweigand

Summary

Set setMaxAtomicSizeInBitsSupported for SystemZ.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit TestsFailed

	Time	Test
	60,080 ms	x64 debian > ThreadSanitizer-x86_64.ThreadSanitizer-x86_64::restore_stack.cpp
	60,060 ms	x64 debian > libFuzzer.libFuzzer::fuzzer-leak.test
	60,060 ms	x64 debian > libFuzzer.libFuzzer::minimize_crash.test

Event Timeline

brad created this revision.Jan 7 2023, 12:24 AM

Herald added a project: Restricted Project. · View Herald TranscriptJan 7 2023, 12:24 AM

Herald added a subscriber: hiraditya. · View Herald Transcript

brad requested review of this revision.Jan 7 2023, 12:24 AM

Harbormaster completed remote builds in B206257: Diff 487061.Jan 7 2023, 1:15 AM

I don't think this is quite correct, as we do support 128-bit atomics on properly aligned addresses. In any case, the back-end implements atomic load, store, and cmpxchg for i128. We do not currently implement atomicrmw, but that could be handled in terms of cmpxchg of course. For non-aligned addresses, we do need to fall back to the library routine, however.

Interestingly enough, the SystemZ back-end does not even use the AtomicExpandPass at all at the moment, so the setMaxAtomicSizeInBitsSupported change probably doesn't even have any effect? Everything still seems to work, however ...

@jonpa we don't seem to have test cases for i128 atomicrmw at the moment - could you have a loop at what happens here, and if there's maybe something we need to change? Thanks!

LegalizeTypes expands the i128 result of e.g. an AtomicLoadAdd node, which is done by DAGTypeLegalizer::ExpandAtomic(), which results in a library call. There are no tests for the atomic operations with i128, but it should work due to the TypeLegalizer. i128 is not a legal type so I guess therefore they are all expanded this way.

I could imagine that it might be beneficial to use the AtomicExpandPass to use the target instructions whenever possible, and libcalls otherwise. However, I am unsure about this: The AtomicExpandPass seems to do this (emit the libcalls only if needed, after checking alignment as well), but I see this in the Atomics document:

"One very important property of the atomic operations is that if your backend supports any inline lock-free atomic operations of a given size, you should support ALL operations of that size in a lock-free manner...it is invalid to implement atomic load using the native instruction, but cmpxchg using a library call to a function that uses a mutex, atomic load must also expand to a library call on such architectures, so that it can remain atomic with regards to a simultaneous cmpxchg, by using the same mutex."

I guess I must be missing something somehow...

No, you don't. You cannot mix lock-free and mutex atomics. E.g, if loads are lock-free and stores use a mutex. Then you have lost atomicity. Your mutex version could do stores one byte at a time.

In D141187#4054105, @jonpa wrote:

LegalizeTypes expands the i128 result of e.g. an AtomicLoadAdd node, which is done by DAGTypeLegalizer::ExpandAtomic(), which results in a library call. There are no tests for the atomic operations with i128, but it should work due to the TypeLegalizer. i128 is not a legal type so I guess therefore they are all expanded this way.

I could imagine that it might be beneficial to use the AtomicExpandPass to use the target instructions whenever possible, and libcalls otherwise. However, I am unsure about this: The AtomicExpandPass seems to do this (emit the libcalls only if needed, after checking alignment as well), but I see this in the Atomics document:

"One very important property of the atomic operations is that if your backend supports any inline lock-free atomic operations of a given size, you should support ALL operations of that size in a lock-free manner...it is invalid to implement atomic load using the native instruction, but cmpxchg using a library call to a function that uses a mutex, atomic load must also expand to a library call on such architectures, so that it can remain atomic with regards to a simultaneous cmpxchg, by using the same mutex."

I guess I must be missing something somehow...

Note that this is handled by the particular implementation of the 16-byte atomic library routines on s390x: they all perform a *runtime* check for proper alignment, and if the address actually *is* 16-byte aligned, the library routine will in fact use the appropriate 16-byte atomic instructions to implement the operation (instead of a separate mutex). That is why it *is* safe for the compiler to emit those instructions itself *if* it can prove at compile-time that the address will in fact be properly aligned at run-time. (That's also done by GCC, for example.)

I have experimented with this here: https://reviews.llvm.org/D146425

brad abandoned this revision.Jul 9 2023, 5:41 PM

Revision Contents

Path

Size

llvm/

lib/

Target/

SystemZ/

SystemZISelLowering.cpp

2 lines

Diff 487061

llvm/lib/Target/SystemZ/SystemZISelLowering.cpp

Show First 20 Lines • Show All 123 Lines • ▼ Show 20 Lines	SystemZTargetLowering::SystemZTargetLowering(const TargetMachine &TM,
// LLVM's current latency-oriented scheduler can't handle physreg definitions		// LLVM's current latency-oriented scheduler can't handle physreg definitions
// such as SystemZ has with CC, so set this to the register-pressure		// such as SystemZ has with CC, so set this to the register-pressure
// scheduler, because it can.		// scheduler, because it can.
setSchedulingPreference(Sched::RegPressure);		setSchedulingPreference(Sched::RegPressure);

setBooleanContents(ZeroOrOneBooleanContent);		setBooleanContents(ZeroOrOneBooleanContent);
setBooleanVectorContents(ZeroOrNegativeOneBooleanContent);		setBooleanVectorContents(ZeroOrNegativeOneBooleanContent);

		setMaxAtomicSizeInBitsSupported(64);

// Instructions are strings of 2-byte aligned 2-byte values.		// Instructions are strings of 2-byte aligned 2-byte values.
setMinFunctionAlignment(Align(2));		setMinFunctionAlignment(Align(2));
// For performance reasons we prefer 16-byte alignment.		// For performance reasons we prefer 16-byte alignment.
setPrefFunctionAlignment(Align(16));		setPrefFunctionAlignment(Align(16));

// Handle operations that are handled in a similar way for all types.		// Handle operations that are handled in a similar way for all types.
for (unsigned I = MVT::FIRST_INTEGER_VALUETYPE;		for (unsigned I = MVT::FIRST_INTEGER_VALUETYPE;
I <= MVT::LAST_FP_VALUETYPE;		I <= MVT::LAST_FP_VALUETYPE;
▲ Show 20 Lines • Show All 991 Lines • Show Last 20 Lines