
Details

Reviewers

jdoerfert
JonChesterfield
arsenm
yaxunl
sivachandra
michaelrj
MaskRay
rampitec

Commits

rGa1da7461571c: [AMDGPU] Place global constructors in .init_array and .fini_array

Summary

For the GPU, we emit external kernels that call the initializers and
constructors. However, if we had a persistent kernel, like the _start
kernel in the libc project, we could call the constructors the standard
way. This patch adds new global variables containing pointers to the
constructors to be called. If these are placed in the .init_array and
.fini_array sections, the backend handles them specially. The linker
then provides the __init_array_start / __init_array_end and
__fini_array_start / __fini_array_end symbols needed to traverse them.
An implementation would look like this.

#include <cstdint>

// Provided by the linker; they bracket the .init_array / .fini_array entries.
extern uintptr_t __init_array_start[];
extern uintptr_t __init_array_end[];
extern uintptr_t __fini_array_start[];
extern uintptr_t __fini_array_end[];

using InitCallback = void(int, char **, char **);
using FiniCallback = void(void);

extern "C" [[gnu::visibility("protected"), clang::amdgpu_kernel]] void
_start(int argc, char **argv, char **envp) {
  // Call every constructor in the order the linker placed them.
  uint64_t init_array_size = __init_array_end - __init_array_start;
  for (uint64_t i = 0; i < init_array_size; ++i)
    reinterpret_cast<InitCallback *>(__init_array_start[i])(argc, argv, envp);
  // Then call every destructor.
  uint64_t fini_array_size = __fini_array_end - __fini_array_start;
  for (uint64_t i = 0; i < fini_array_size; ++i)
    reinterpret_cast<FiniCallback *>(__fini_array_start[i])();
}
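
For readers who want to try the traversal without a GPU toolchain, here is a minimal host-side sketch of the same loop. It assumes nothing about the real libc implementation: `fake_init_array`, `ctor_a`, `ctor_b`, `run_init_array` and `demo_run` are hypothetical names, and a local array stands in for the linker-provided `__init_array_start` / `__init_array_end` range.

```cpp
#include <cassert>
#include <cstdint>

using std::uintptr_t;

// Same callback signature as in the summary above.
using InitCallback = void(int, char **, char **);

static int g_calls = 0;      // how many callbacks have run
static int g_last_argc = -1; // argc value seen by the most recent callback

// Two hypothetical constructors that only record that they were called.
static void ctor_a(int argc, char **, char **) { ++g_calls; g_last_argc = argc; }
static void ctor_b(int argc, char **, char **) { ++g_calls; g_last_argc = argc; }

// Walks [begin, end) and calls each entry, mirroring the loop in _start.
static void run_init_array(const uintptr_t *begin, const uintptr_t *end,
                           int argc, char **argv, char **envp) {
  for (const uintptr_t *it = begin; it != end; ++it)
    reinterpret_cast<InitCallback *>(*it)(argc, argv, envp);
}

// Builds a two-entry stand-in array, runs it, and returns the call count.
static int demo_run() {
  g_calls = 0;
  const uintptr_t fake_init_array[] = {
      reinterpret_cast<uintptr_t>(&ctor_a),
      reinterpret_cast<uintptr_t>(&ctor_b),
  };
  run_init_array(fake_init_array, fake_init_array + 2, 3, nullptr, nullptr);
  return g_calls;
}
```

Calling `demo_run()` invokes both stand-in constructors once each. The half-open `[begin, end)` form is only a stylistic choice; it is equivalent to the size-based loop in the summary.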

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jhuber6 created this revision. Apr 27 2023, 6:07 AM

Herald added a project: Restricted Project. Apr 27 2023, 6:07 AM

Herald added subscribers: kosarev, foad, kerbowa and 5 others.

jhuber6 requested review of this revision. Apr 27 2023, 6:07 AM

Herald added a project: Restricted Project. Apr 27 2023, 6:07 AM

Herald added subscribers: llvm-commits, wdng.

Add section check in test.

Harbormaster completed remote builds in B228539: Diff 517534. Apr 27 2023, 6:37 AM

I am curious how other targets handle @llvm.global_ctors. Is there some generic LLVM pass to change them to .init_array ?

In D149340#4301953, @yaxunl wrote:

I am curious how other targets handle @llvm.global_ctors. Is there some generic LLVM pass to change them to .init_array ?

I don't think we could fold this AMDGPU lowering pass into a generic one, since it needs to do its own ctor / dtor handling to create the kernels. I don't actually know where the lowering happens for generic targets; I could look if it's important.

Add priority

Harbormaster completed remote builds in B228555: Diff 517556. Apr 27 2023, 7:57 AM

Rebase

Remove leftover debug prints.

Does this work for non-AMD hardware?

Is this just adding the globals and not actually using this mechanism?

In D149340#4302728, @jdoerfert wrote:

Does this work for non-AMD hardware?

Is this just adding the globals and not actually using this mechanism?

This is all that's required on a platform with a functioning linker, Nvidia need not apply.

In D149340#4302728, @jdoerfert wrote:

Does this work for non-AMD hardware?

Is this just adding the globals and not actually using this mechanism?

If you're talking about the expected usage, it's in the commit header. I'm planning on adding that to libc in a separate patch.

So we have different schemes for AMD and NVIDIA? That does not sound good.

In D149340#4303525, @jdoerfert wrote:

So we have different schemes for AMD and NVIDIA? That does not sound good.

There's no choice: Nvidia offers no way to place variables in named sections and ships an incomplete ELF linker. I'd rather have AMDGPU do what every other target does and have Nvidia be the black sheep.

I know little about GPUs. The generic code that emits constructors to assembly is AsmPrinter::emitSpecialLLVMGlobal and AsmPrinter::emitXXStructorList. It will use .init_array.

The priority works this way. Note that a bare .init_array (no .N suffix) is treated as priority 65535, the highest priority value, so it is placed last. https://maskray.me/blog/2021-11-07-init-ctors-init-array

a.o:(.init_array.1) b.o:(.init_array.1)
a.o:(.init_array.2) b.o:(.init_array.2)
...
a.o:(.init_array.65533) b.o:(.init_array.65533)
a.o:(.init_array.65534) b.o:(.init_array.65534)
a.o:(.init_array) b.o:(.init_array)
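
The ordering in the listing above can be sketched in a few lines of C++. This is only an illustration of the sort key, not lld's actual code: `initArrayPriority` and `sortByInitPriority` are hypothetical helpers, and treating a bare `.init_array` as priority 65535 is an assumption taken from the linked post.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Returns the sort key for an .init_array input section name.
// Assumption: a bare ".init_array" behaves like ".init_array.65535".
static unsigned initArrayPriority(const std::string &name) {
  const std::string prefix = ".init_array.";
  if (name.compare(0, prefix.size(), prefix) != 0)
    return 65535;
  return static_cast<unsigned>(std::stoul(name.substr(prefix.size())));
}

// Orders section names by ascending priority; ties keep their input order,
// which models "a.o before b.o" for sections with the same suffix.
static std::vector<std::string>
sortByInitPriority(std::vector<std::string> names) {
  std::stable_sort(names.begin(), names.end(),
                   [](const std::string &a, const std::string &b) {
                     return initArrayPriority(a) < initArrayPriority(b);
                   });
  return names;
}
```

Given `.init_array`, `.init_array.2`, `.init_array.1` and `.init_array.65534`, the helper returns them in the order `.1`, `.2`, `.65534`, bare, matching the listing.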

In D149340#4303678, @MaskRay wrote:

I know little about GPUs. The generic code that emits constructors to assembly is AsmPrinter::emitSpecialLLVMGlobal and AsmPrinter::emitXXStructorList. It will use .init_array.

This pass currently deletes the global list before making it to the assembly printer, maybe we will get some of this behaviour if we don't do that?

The priority works this way. Note that a bare .init_array (no .N suffix) is treated as priority 65535, the highest priority value, so it is placed last. https://maskray.me/blog/2021-11-07-init-ctors-init-array
a.o:(.init_array.1) b.o:(.init_array.1)
a.o:(.init_array.2) b.o:(.init_array.2)
...
a.o:(.init_array.65533) b.o:(.init_array.65533)
a.o:(.init_array.65534) b.o:(.init_array.65534)
a.o:(.init_array) b.o:(.init_array)

AMD's linker is lld and targets ELF, so it should follow the exact same rules as you are familiar with.

Update: it turns out that if we don't delete the ctor / dtor globals here,
the backend takes the default approach, which is exactly what we want. The
only change required is making sure we don't create duplicate kernels.
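
The "no duplicate kernels" requirement amounts to making the lowering idempotent: look up the kernel before creating it, and skip callees that are already called from it. The sketch below models that idea with plain containers instead of LLVM IR. `ToyModule` and `registerCtor` are hypothetical names used only for illustration and are not part of the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a module: kernel name -> callees in call order.
struct ToyModule {
  std::map<std::string, std::vector<std::string>> kernels;
};

// Get-or-create the kernel, then append the callee only if it is not already
// called from that kernel. Returns true if the module was modified.
static bool registerCtor(ToyModule &m, const std::string &kernel,
                         const std::string &callee) {
  std::vector<std::string> &calls = m.kernels[kernel]; // creates if missing
  if (std::find(calls.begin(), calls.end(), callee) != calls.end())
    return false; // already registered, so a second run changes nothing
  calls.push_back(callee);
  return true;
}
```

Running `registerCtor` twice with the same arguments leaves one kernel with one call. That is the property the test running the pass twice (`-passes=amdgpu-lower-ctor-dtor,amdgpu-lower-ctor-dtor`) checks for the real pass.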

jhuber6 added a child revision: D149398: [libc] Add support for global ctors / dtors for AMDGPU. Apr 27 2023, 6:25 PM

jhuber6 mentioned this in D149451: [NVPTX] Add NVPTXCtorDtorLoweringPass to handle global ctors / dtors. Apr 28 2023, 8:13 AM

LGTM. Thanks.

llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp
76	I noticed these functions are not called in priority order. However, I guess that is out of the scope of this patch.

This revision is now accepted and ready to land. Apr 28 2023, 8:53 AM

jhuber6 added inline comments. Apr 28 2023, 9:13 AM

llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp
76	True, it was like that when I got here. Do you know who the current user is for this feature?

yaxunl added inline comments. Apr 28 2023, 9:22 AM

llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp
76	Currently, all HIP programs use this feature when -fsanitize=address is used. But they do not care about priority yet.

jhuber6 added inline comments. Apr 28 2023, 9:23 AM

llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp
76	FWIW we could change this to be a single kernel that calls an ASAN library function, which implements the method in the commit header to traverse the list in priority order.

yaxunl added inline comments. Apr 28 2023, 9:34 AM

llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp
76	We want to keep this feature as a generic approach to support dynamic initialization.

jhuber6 added inline comments. Apr 28 2023, 9:35 AM

llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp
76	This will also cause a duplicate symbol if someone compiles without monolithic LTO. I'm assuming that's the expected behavior.

yaxunl added inline comments. Apr 28 2023, 9:44 AM

llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp
76	That is fine. HIP runtime will launch all of them as long as they have init kernel metadata. It is not based on the kernel name.
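
The priority concern raised in this thread could be handled by sorting the entries before emitting the calls. Below is a minimal sketch under stated assumptions: `StructorEntry` and `callOrder` are hypothetical, an integer id stands in for the function pointer, and the ordering (ascending priority for constructors, descending for destructors) follows my reading of the LLVM language reference rather than anything in this patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical mirror of one llvm.global_ctors / llvm.global_dtors element.
struct StructorEntry {
  std::uint32_t priority; // first field of the { i32, ptr, ptr } struct
  int id;                 // stands in for the function pointer
};

// Returns entry ids in call order: ascending priority for constructors,
// descending priority for destructors. Ties keep their original order.
static std::vector<int> callOrder(std::vector<StructorEntry> entries,
                                  bool isCtor) {
  std::stable_sort(entries.begin(), entries.end(),
                   [isCtor](const StructorEntry &a, const StructorEntry &b) {
                     return isCtor ? a.priority < b.priority
                                   : a.priority > b.priority;
                   });
  std::vector<int> order;
  for (const StructorEntry &e : entries)
    order.push_back(e.id);
  return order;
}
```

A pass could iterate over the sorted list instead of `GA->operands()` when creating the calls. Whether that is worth doing depends on whether any current user cares about priority, which the thread suggests is not yet the case.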

This revision was landed with ongoing or failed builds. Apr 29 2023, 6:40 AM

Closed by commit rGa1da7461571c: [AMDGPU] Place global constructors in .init_array and .fini_array (authored by jhuber6).

This revision was automatically updated to reflect the committed changes.

jhuber6 added a commit: rGa1da7461571c: [AMDGPU] Place global constructors in .init_array and .fini_array.

jhuber6 mentioned this in rGf05ce9045af4: [NVPTX] Add NVPTXCtorDtorLoweringPass to handle global ctors / dtors. May 4 2023, 5:13 AM

jhuber6 mentioned this in D150565: [AMDGPU] Add an option to disable manual ctor / dtor lowering. May 15 2023, 6:04 AM

jhuber6 mentioned this in rG4a1236e0f663: [AMDGPU] Add an option to disable manual ctor / dtor lowering. May 23 2023, 7:03 AM

Diff 518162

llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp

(25 unchanged lines not shown)
 #define DEBUG_TYPE "amdgpu-lower-ctor-dtor"

 namespace {

 static Function *createInitOrFiniKernelFunction(Module &M, bool IsCtor) {
   StringRef InitOrFiniKernelName = "amdgcn.device.init";
   if (!IsCtor)
     InitOrFiniKernelName = "amdgcn.device.fini";
+  if (Function *F = M.getFunction(InitOrFiniKernelName))
+    return F;

   Function *InitOrFiniKernel = Function::createWithDefaultAttr(
       FunctionType::get(Type::getVoidTy(M.getContext()), false),
       GlobalValue::ExternalLinkage, 0, InitOrFiniKernelName, &M);
   BasicBlock *InitOrFiniKernelBB =
       BasicBlock::Create(M.getContext(), "", InitOrFiniKernel);
   ReturnInst::Create(M.getContext(), InitOrFiniKernelBB);

(16 unchanged lines not shown; enclosing function: static bool createInitOrFiniKernel(Module &M, StringRef GlobalName,)

   Function *InitOrFiniKernel = createInitOrFiniKernelFunction(M, IsCtor);
   IRBuilder<> IRB(InitOrFiniKernel->getEntryBlock().getTerminator());

   FunctionType *ConstructorTy = InitOrFiniKernel->getFunctionType();

   for (Value *V : GA->operands()) {
     auto *CS = cast<ConstantStruct>(V);
+    bool AlreadyRegistered =
+        llvm::any_of(CS->getOperand(1)->uses(), [=](Use &U) {
+          if (auto *CB = dyn_cast<CallBase>(U.getUser()))
+            if (CB->getCaller() == InitOrFiniKernel)
+              return true;
+          return false;
+        });
+    if (!AlreadyRegistered)
-    IRB.CreateCall(ConstructorTy, CS->getOperand(1));
+      IRB.CreateCall(ConstructorTy, CS->getOperand(1));
   }

   appendToUsed(M, {InitOrFiniKernel});

-  GV->eraseFromParent();
   return true;
 }

 static bool lowerCtorsAndDtors(Module &M) {
   bool Modified = false;
   Modified |= createInitOrFiniKernel(M, "llvm.global_ctors", /*IsCtor =*/true);
   Modified |= createInitOrFiniKernel(M, "llvm.global_dtors", /*IsCtor =*/false);
   return Modified;
 }

 class AMDGPUCtorDtorLoweringLegacy final : public ModulePass {
 public:
   static char ID;
   AMDGPUCtorDtorLoweringLegacy() : ModulePass(ID) {}
-  bool runOnModule(Module &M) override {
-    return lowerCtorsAndDtors(M);
-  }
+  bool runOnModule(Module &M) override { return lowerCtorsAndDtors(M); }
 };

 } // End anonymous namespace

 PreservedAnalyses AMDGPUCtorDtorLoweringPass::run(Module &M,
                                                   ModuleAnalysisManager &AM) {
   return lowerCtorsAndDtors(M) ? PreservedAnalyses::none()
                                : PreservedAnalyses::all();
(11 unchanged lines not shown)

llvm/test/CodeGen/AMDGPU/lower-ctor-dtor-constexpr-alias.ll

(12 unchanged lines not shown)
 ; Check a constantexpr addrspacecast
 @llvm.global_dtors = appending addrspace(1) global [1 x { i32, ptr, ptr }] [
   { i32, ptr, ptr } { i32 1, ptr addrspacecast (ptr addrspace(1) @bar to ptr), i8* null }
 ]

 @foo.alias = hidden alias void (), ptr @foo

 ;.
-; CHECK-NOT: @llvm.global_ctors
-; CHECK-NOT: @llvm.global_dtors
 ; CHECK: @llvm.used = appending global [2 x ptr] [ptr @amdgcn.device.init, ptr @amdgcn.device.fini], section "llvm.metadata"
 ; CHECK: @foo.alias = hidden alias void (), ptr @foo
 ;.
 define void @foo() {
 ; CHECK-LABEL: @foo(
 ; CHECK-NEXT: ret void
 ;
   ret void
(remaining unchanged lines not shown)

llvm/test/CodeGen/AMDGPU/lower-ctor-dtor-existing.ll

 ; RUN: opt -S -mtriple=amdgcn-- -passes=amdgpu-lower-ctor-dtor < %s | FileCheck %s
 ; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx700 -filetype=obj -o - < %s | llvm-readelf -s - 2>&1 | FileCheck %s -check-prefix=CHECK-VIS

 ; Make sure there's no crash or error if amdgcn.device.init or
 ; amdgcn.device.fini already exist.

 @llvm.global_ctors = appending addrspace(1) global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 1, ptr @foo, ptr null }]
 @llvm.global_dtors = appending addrspace(1) global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 1, ptr @bar, ptr null }]

-; CHECK-NOT: @llvm.global_ctors
-; CHECK-NOT: @llvm.global_dtors
-
-; CHECK-LABEL: amdgpu_kernel void @amdgcn.device.init() #0
-; CHECK-NEXT: store
-; CHECK-NEXT: ret void
-
-; CHECK-LABEL: amdgpu_kernel void @amdgcn.device.fini() #1
-; CHECK-NEXT: store
-; CHECK-NEXT: ret void
-
-
-; CHECK-LABEL: amdgpu_kernel void @amdgcn.device.init.1() #0
-; CHECK-NEXT: call void @foo
+; CHECK-LABEL: amdgpu_kernel void @amdgcn.device.init() #0 {
+; CHECK-NEXT: store volatile i32 1, ptr addrspace(1) null
+; CHECK-NEXT: call void @foo()
 ; CHECK-NEXT: ret void
+; CHECK-NEXT: }

-; CHECK-LABEL: amdgpu_kernel void @amdgcn.device.fini.2() #1
-; CHECK-NEXT: call void @bar
+; CHECK-LABEL: define amdgpu_kernel void @amdgcn.device.fini() #1 {
+; CHECK-NEXT: store volatile i32 0, ptr addrspace(1) null
+; CHECK-NEXT: call void @bar()
 ; CHECK-NEXT: ret void
+; CHECK-NEXT: }

 ; CHECK-NOT: amdgcn.device.

 ; CHECK-VIS: FUNC GLOBAL PROTECTED {{.*}} amdgcn.device.init{{$}}
 ; CHECK-VIS: OBJECT GLOBAL DEFAULT {{.*}} amdgcn.device.init.kd{{$}}
 ; CHECK-VIS: FUNC GLOBAL PROTECTED {{.*}} amdgcn.device.fini{{$}}
 ; CHECK-VIS: OBJECT GLOBAL DEFAULT {{.*}} amdgcn.device.fini.kd{{$}}

-; CHECK-VIS: FUNC GLOBAL PROTECTED {{.*}} amdgcn.device.init.1{{$}}
-; CHECK-VIS: OBJECT GLOBAL DEFAULT {{.*}} amdgcn.device.init.1.kd{{$}}
-; CHECK-VIS: FUNC GLOBAL PROTECTED {{.*}} amdgcn.device.fini.2{{$}}
-; CHECK-VIS: OBJECT GLOBAL DEFAULT {{.*}} amdgcn.device.fini.2.kd{{$}}

 define internal void @foo() {
   ret void
 }

 define internal void @bar() {
   ret void
 }

(14 unchanged lines not shown)

llvm/test/CodeGen/AMDGPU/lower-ctor-dtor.ll

 ; RUN: opt -S -mtriple=amdgcn-- -amdgpu-lower-ctor-dtor < %s | FileCheck %s
 ; RUN: opt -S -mtriple=amdgcn-- -passes=amdgpu-lower-ctor-dtor < %s | FileCheck %s

 ; Make sure we get the same result if we run multiple times
 ; RUN: opt -S -mtriple=amdgcn-- -passes=amdgpu-lower-ctor-dtor,amdgpu-lower-ctor-dtor < %s | FileCheck %s
 ; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx700 -filetype=obj -o - < %s | llvm-readelf -s - 2>&1 | FileCheck %s -check-prefix=VISIBILITY
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx700 -filetype=obj -o - < %s | llvm-readelf -S - 2>&1 | FileCheck %s -check-prefix=SECTION

 @llvm.global_ctors = appending addrspace(1) global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 1, ptr @foo, ptr null }]
 @llvm.global_dtors = appending addrspace(1) global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 1, ptr @bar, ptr null }]

-; CHECK-NOT: @llvm.global_ctors
-; CHECK-NOT: @llvm.global_dtors
+; CHECK: @llvm.used = appending global [2 x ptr] [ptr @amdgcn.device.init, ptr @amdgcn.device.fini]

 ; CHECK-LABEL: amdgpu_kernel void @amdgcn.device.init() #0
 ; CHECK-NEXT: call void @foo
 ; CHECK-NEXT: ret void

 ; CHECK-LABEL: amdgpu_kernel void @amdgcn.device.fini() #1
 ; CHECK-NEXT: call void @bar
 ; CHECK-NEXT: ret void

 ; CHECK-NOT: amdgcn.device.

 ; VISIBILITY: FUNC GLOBAL PROTECTED {{.*}} amdgcn.device.init
 ; VISIBILITY: OBJECT GLOBAL DEFAULT {{.*}} amdgcn.device.init.kd
 ; VISIBILITY: FUNC GLOBAL PROTECTED {{.*}} amdgcn.device.fini
 ; VISIBILITY: OBJECT GLOBAL DEFAULT {{.*}} amdgcn.device.fini.kd
+; SECTION: .init_array.1 INIT_ARRAY {{.*}} {{.*}} 000008 00 WA 0 0 8
+; SECTION: .fini_array.1 FINI_ARRAY {{.*}} {{.*}} 000008 00 WA 0 0 8

 define internal void @foo() {
   ret void
 }

 define internal void @bar() {
   ret void
 }

 ; CHECK: attributes #0 = { "device-init" }
 ; CHECK: attributes #1 = { "device-fini" }

This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Place global constructors in .init_array and .fini_array
ClosedPublic
