This is an archive of the discontinued LLVM Phabricator instance.

Differential D135628

[clang][codegen] Don't emit atomic loads for threadsafe init if they aren't inline
Closed · Public

Authored by efriedma on Oct 10 2022, 3:49 PM.

Details

Reviewers

rjmccall
nikic

Commits

rG1079662d2fff: [clang][codegen] Don't emit atomic loads for threadsafe init if they aren't…

Summary

Performing a load before calling __cxa_guard_acquire is supposed to be an optimization, but it isn't much of one if the load itself becomes a call to __atomic_load_1. In that case, just skip the load and let __cxa_guard_acquire do whatever it wants.

(In practice, on such targets, the C++ library is just built with threading turned off, so the result isn't actually threadsafe, but there's not really anything clang can do about that.)

The alternative here is that we try to define some ABI for threadsafe init that allows the speculative load without full atomics. Almost any target without full atomics has a load that's "atomic enough" for this purpose. But it's not clear how we emit an "atomic enough" load in LLVM IR, and there isn't any ABI document we can refer to.

Or I guess we could turn off -fthreadsafe-statics by default on Cortex-M0, but that seems like it would be surprising.

Fixes https://github.com/llvm/llvm-project/issues/58184
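As a rough illustration of the two code shapes the summary contrasts, here is a minimal C++ sketch. It is not what clang emits and not the runtime's implementation: `Guard`, `guard_acquire`, `guard_release`, `get_with_fast_path` and `get_without_fast_path` are hypothetical stand-ins, and the mutex-based locking is an assumption made only to keep the sketch self-contained.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical model of a guard variable. The real guard is owned by the
// C++ runtime; the mutex here is only an assumption for illustration.
struct Guard {
  std::atomic<std::uint8_t> initialized{0}; // models the first guard byte
  std::mutex lock;                          // models the runtime's locking
};

// Stand-in for __cxa_guard_acquire: returns true (with the lock held) if
// the caller has to run the initializer, false if it already ran.
bool guard_acquire(Guard &g) {
  g.lock.lock();
  if (g.initialized.load(std::memory_order_relaxed) != 0) {
    g.lock.unlock();
    return false;
  }
  return true;
}

// Stand-in for __cxa_guard_release: publish the flag, then drop the lock.
void guard_release(Guard &g) {
  g.initialized.store(1, std::memory_order_release);
  g.lock.unlock();
}

int init_calls = 0;
int f() { ++init_calls; return 42; }

Guard guard_a;
int value_a;

// Shape used when inline atomics exist: an acquire load of the guard byte
// skips the runtime call once the variable is initialized.
int get_with_fast_path() {
  if (guard_a.initialized.load(std::memory_order_acquire) == 0) {
    if (guard_acquire(guard_a)) {
      value_a = f();
      guard_release(guard_a);
    }
  }
  return value_a;
}

Guard guard_b;
int value_b;

// Shape used when the load would be a libcall anyway: always call the
// acquire function and let it decide whether initialization is needed.
int get_without_fast_path() {
  if (guard_acquire(guard_b)) {
    value_b = f();
    guard_release(guard_b);
  }
  return value_b;
}
```

Both shapes run the initializer exactly once; the second trades the cheap check for one fewer memory access pattern that the target cannot perform inline.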

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

efriedma created this revision. Oct 10 2022, 3:49 PM

Herald added a project: Restricted Project. Oct 10 2022, 3:49 PM

efriedma requested review of this revision. Oct 10 2022, 3:49 PM

efriedma edited the summary of this revision. Oct 10 2022, 3:50 PM

Upload correct version of patch

Harbormaster completed remote builds in B191394: Diff 466657. Oct 10 2022, 4:24 PM

Makes sense. Nice code-size optimization at any rate.

Hmm, actually, along those lines, should we do the same thing at -Oz?

This revision is now accepted and ready to land. Oct 10 2022, 6:03 PM

This revision was landed with ongoing or failed builds. Oct 11 2022, 2:01 PM

Closed by commit rG1079662d2fff: [clang][codegen] Don't emit atomic loads for threadsafe init if they aren't… (authored by efriedma).

This revision was automatically updated to reflect the committed changes.

efriedma added a commit: rG1079662d2fff: [clang][codegen] Don't emit atomic loads for threadsafe init if they aren't….

Revision Contents

Path                                                      Size
clang/lib/CodeGen/ItaniumCXXABI.cpp                       97 lines
clang/test/CodeGenCXX/threadsafe-statics-no-atomic.cpp    21 lines

Diff 466910

clang/lib/CodeGen/ItaniumCXXABI.cpp

Hunk context: void ItaniumCXXABI::EmitGuardedInit(CodeGenFunction &CGF,
(+ added, - removed; whitespace-only changes are not marked)

   // } catch (...) {
   // __cxa_guard_abort (&obj_guard);
   // throw;
   // }
   // ... queue object destructor with __cxa_atexit() ...;
   // __cxa_guard_release (&obj_guard);
   // }
   // }
+  //
+  // If threadsafe statics are enabled, but we don't have inline atomics, just
+  // call __cxa_guard_acquire unconditionally. The "inline" check isn't
+  // actually inline, and the user might not expect calls to __atomic libcalls.

+  unsigned MaxInlineWidthInBits = CGF.getTarget().getMaxAtomicInlineWidth();
+  llvm::BasicBlock *EndBlock = CGF.createBasicBlock("init.end");
+  if (!threadsafe || MaxInlineWidthInBits) {
     // Load the first byte of the guard variable.
     llvm::LoadInst *LI =
         Builder.CreateLoad(Builder.CreateElementBitCast(guardAddr, CGM.Int8Ty));

     // Itanium ABI:
     // An implementation supporting thread-safety on multiprocessor
     // systems must also guarantee that references to the initialized
     // object do not occur before the load of the initialization flag.
     //
     // In LLVM, we do this by marking the load Acquire.
     if (threadsafe)
       LI->setAtomic(llvm::AtomicOrdering::Acquire);

     // For ARM, we should only check the first bit, rather than the entire byte:
     //
     // ARM C++ ABI 3.2.3.1:
     // To support the potential use of initialization guard variables
     // as semaphores that are the target of ARM SWP and LDREX/STREX
     // synchronizing instructions we define a static initialization
     // guard variable to be a 4-byte aligned, 4-byte word with the
     // following inline access protocol.
     // #define INITIALIZED 1
     // if ((obj_guard & INITIALIZED) != INITIALIZED) {
     // if (__cxa_guard_acquire(&obj_guard))
     // ...
     // }
     //
     // and similarly for ARM64:
     //
     // ARM64 C++ ABI 3.2.2:
     // This ABI instead only specifies the value bit 0 of the static guard
     // variable; all other bits are platform defined. Bit 0 shall be 0 when the
     // variable is not initialized and 1 when it is.
     llvm::Value *V =
         (UseARMGuardVarABI && !useInt8GuardVariable)
             ? Builder.CreateAnd(LI, llvm::ConstantInt::get(CGM.Int8Ty, 1))
             : LI;
     llvm::Value *NeedsInit = Builder.CreateIsNull(V, "guard.uninitialized");

     llvm::BasicBlock *InitCheckBlock = CGF.createBasicBlock("init.check");
-    llvm::BasicBlock *EndBlock = CGF.createBasicBlock("init.end");

     // Check if the first byte of the guard variable is zero.
     CGF.EmitCXXGuardedInitBranch(NeedsInit, InitCheckBlock, EndBlock,
                                  CodeGenFunction::GuardKind::VariableGuard, &D);

     CGF.EmitBlock(InitCheckBlock);
+  }

   // Variables used when coping with thread-safe statics and exceptions.
   if (threadsafe) {
     // Call __cxa_guard_acquire.
     llvm::Value *V
       = CGF.EmitNounwindRuntimeCall(getGuardAcquireFn(CGM, guardPtrTy), guard);

     llvm::BasicBlock *InitBlock = CGF.createBasicBlock("init");
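The two decisions in this hunk can be restated as plain predicates. The sketch below is only a model of the logic, not code from clang: `emitsInlineGuardCheck` and `guardNeedsInit` are hypothetical names, and the parameters stand in for `threadsafe`, the value returned by `getMaxAtomicInlineWidth()`, and the `UseARMGuardVarABI && !useInt8GuardVariable` condition.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical restatement of `!threadsafe || MaxInlineWidthInBits`:
// the guard-byte load is emitted unless threadsafe statics are on and the
// target reports no inline atomic support at all (width 0).
bool emitsInlineGuardCheck(bool threadsafe, unsigned maxAtomicInlineWidthBits) {
  return !threadsafe || maxAtomicInlineWidthBits != 0;
}

// Hypothetical restatement of the "guard.uninitialized" test: under the ARM
// guard variable ABI only bit 0 of the loaded byte is significant, otherwise
// the whole first byte is compared against zero.
bool guardNeedsInit(std::uint8_t firstByte, bool checkBitZeroOnly) {
  const std::uint8_t v =
      checkBitZeroOnly ? static_cast<std::uint8_t>(firstByte & 1u) : firstByte;
  return v == 0;
}
```

Under this model, a non-threadsafe build always keeps the plain load, and a threadsafe build on a target with a maximum inline width of 0 falls through directly to the `__cxa_guard_acquire` call, which is the shape the added thumbv6m test appears to check for.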

clang/test/CodeGenCXX/threadsafe-statics-no-atomic.cpp

This file was added.

// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
// RUN: %clang_cc1 -emit-llvm -triple=thumbv6m-eabi -o - %s | FileCheck %s

int f();

// CHECK-LABEL: @_Z1gv(
// CHECK-NEXT: entry:
// CHECK-NEXT: [[TMP0:%.*]] = call i32 @__cxa_guard_acquire(ptr @_ZGVZ1gvE1a) #[[ATTR1:[0-9]+]]
// CHECK-NEXT: [[TOBOOL:%.*]] = icmp ne i32 [[TMP0]], 0
// CHECK-NEXT: br i1 [[TOBOOL]], label [[INIT:%.*]], label [[INIT_END:%.*]]
// CHECK: init:
// CHECK-NEXT: [[CALL:%.*]] = call noundef i32 @_Z1fv()
// CHECK-NEXT: store i32 [[CALL]], ptr @_ZZ1gvE1a, align 4
// CHECK-NEXT: call void @__cxa_guard_release(ptr @_ZGVZ1gvE1a) #[[ATTR1]]
// CHECK-NEXT: br label [[INIT_END]]
// CHECK: init.end:
// CHECK-NEXT: ret void
//
void g() {
  static int a = f();
}