This is an archive of the discontinued LLVM Phabricator instance.

Differential D88978

[WIP] Attach debug intrinsics to allocas, and use correct address space
Abandoned, Public

Authored by scott.linder on Oct 7 2020, 9:38 AM.

Details

Reviewers

jdoerfert
ABataev
yaxunl
echristo
probinson
arsenm
kzhuravl
t-tye
ramana-nvr
aprantl

Summary

A dbg.declare for a local/parameter describes the hardware location of
the source variable's value. This matches up with the semantics of the
alloca for the variable, whereas any addrspacecast inserted in order to
implement some source-level notion of address spaces does not.

When creating the dbg.declare intrinsic, attach it directly to the
alloca, not to any addrspacecast.

Update the DIExpression with the address space of the alloca, rather
than use the address space associated with the source level type.
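
As a rough illustration of the last point, the sketch below builds the kind of DIExpression suffix that the updated test in this review expects (`DW_OP_constu, <AS>, DW_OP_swap, DW_OP_xderef`). It is a standalone C++ model, not the Clang implementation: `toDwarfAddressSpace` and `appendAddressSpaceXDeref` are hypothetical names, and the mapping from IR address space to DWARF address space (5 to 1 for private, 3 to 2 for shared) is an assumption inferred from the AMDGPU test expectations in this review. The opcode values are the standard DWARF encodings.

```cpp
#include <cstdint>
#include <vector>

// Standard DWARF expression opcode encodings.
constexpr int64_t DW_OP_constu = 0x10;
constexpr int64_t DW_OP_swap = 0x16;
constexpr int64_t DW_OP_xderef = 0x18;

// Hypothetical stand-in for a target hook. Returns true and sets DwarfAS when
// the IR address space needs an explicit DWARF address space. The numbers are
// assumptions chosen to match the test expectations in this review.
bool toDwarfAddressSpace(unsigned IRAddressSpace, unsigned &DwarfAS) {
  switch (IRAddressSpace) {
  case 5: // assumed: private (the address space of the alloca)
    DwarfAS = 1;
    return true;
  case 3: // assumed: shared
    DwarfAS = 2;
    return true;
  default: // no conversion needed, leave the expression empty
    return false;
  }
}

// Appends "DW_OP_constu <AS>, DW_OP_swap, DW_OP_xderef" to Expr when the
// storage's address space has a DWARF mapping; otherwise appends nothing.
void appendAddressSpaceXDeref(unsigned IRAddressSpace,
                              std::vector<int64_t> &Expr) {
  unsigned DwarfAS = 0;
  if (!toDwarfAddressSpace(IRAddressSpace, DwarfAS))
    return;
  Expr.push_back(DW_OP_constu);
  Expr.push_back(DwarfAS);
  Expr.push_back(DW_OP_swap);
  Expr.push_back(DW_OP_xderef);
}
```

The point of the change is which address space gets passed in: the one taken from the storage pointer (the alloca), rather than the one derived from the source-level type.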

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

scott.linder created this revision. Oct 7 2020, 9:38 AM

Herald added a project: Restricted Project. Oct 7 2020, 9:38 AM

Herald added a subscriber: cfe-commits.

scott.linder requested review of this revision. Oct 7 2020, 9:38 AM

Herald added a reviewer: jdoerfert. Oct 7 2020, 9:38 AM

Herald added a subscriber: sstefan1.

scott.linder added a parent revision: D88976: [clang] Use correct address space for global variable debug info. Oct 7 2020, 9:38 AM

Harbormaster completed remote builds in B74298: Diff 296713. Oct 7 2020, 9:38 AM

scott.linder added reviewers: ABataev, yaxunl, echristo, probinson, arsenm, kzhuravl, t-tye, ramana-nvr. Oct 7 2020, 9:39 AM

Herald added a subscriber: wdng. Oct 7 2020, 9:39 AM

I need to add more tests, but I wanted to float the idea of the change and get feedback first.

scott.linder added a reviewer: aprantl. Oct 7 2020, 9:45 AM

aprantl added inline comments. Oct 9 2020, 11:56 AM

clang/lib/CodeGen/CGDecl.cpp
1579	This is unintuitive — can you add a comment explaining why it may not be valid and why address should only be used then?

scott.linder added inline comments. Oct 9 2020, 2:08 PM

clang/lib/CodeGen/CGDecl.cpp
1579	This is kind of a cop-out on my part. The only path where this occurs is for OpenMP, and I think I just need to understand better what is happening. This also occurs for NRVO, but that is explicitly called out just below this. I'll try to understand this more completely and see if I can represent the possibilities more directly. Somewhat related, it is a bit unsettling reading through this, as the invariant seems to be that `address.isValid()` holds by the time `setAddrOfLocalVar` is called, which makes sense but isn't explicit anywhere in the multiple nested `if`s. I'll also add an assert of that.
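
The fallback and the invariant discussed here can be modeled in isolation. The following is a minimal C++ sketch, assuming a simplified stand-in for Clang's `Address` type; `pickDebugAddr` is a hypothetical helper that does not exist in the patch, which writes the same conditional inline.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for clang::CodeGen::Address.
struct Address {
  const void *Pointer;
  explicit Address(const void *P = nullptr) : Pointer(P) {}
  bool isValid() const { return Pointer != nullptr; }
  static Address invalid() { return Address(); }
};

// Prefer the raw alloca for debug info. Fall back to the (possibly cast)
// variable address when no alloca was recorded, which per the discussion
// happens on the OpenMP path.
Address pickDebugAddr(Address AllocaAddr, Address address) {
  // The invariant mentioned above, made explicit.
  assert(address.isValid() && "local variable must have an address by now");
  return AllocaAddr.isValid() ? AllocaAddr : address;
}
```

Writing the invariant as an assert documents why the fallback is always safe to take: `address` is expected to be valid whenever the debug address is chosen.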

@ABataev Sorry if I'm pulling you in without enough context/work on my end, but I wanted to ask how the Clang codegen for OpenMP locals works at a high level?

Is the idea that instead of an alloc the frontend can insert calls into the runtime in some cases, like __kmpc_alloc (e.g. for firstprivate as in https://reviews.llvm.org/D5140)?

If that is the case, I assume there is no equivalent to SROA/Mem2Reg here? I am trying to understand conceptually where the debug info for the source level local should be tied to in the IR, and at least for locals which use alloc it has turned out to be much simpler to tie the variable directly to the alloc itself rather than bitcasts and things which obscure the relationship.

In D88978#2325982, @scott.linder wrote:

@ABataev Sorry if I'm pulling you in without enough context/work on my end, but I wanted to ask how the Clang codegen for OpenMP locals works at a high level?

Is the idea that instead of an alloc the frontend can insert calls into the runtime in some cases, like __kmpc_alloc (e.g. for firstprivate as in https://reviews.llvm.org/D5140)?

Yes, right.

If that is the case, I assume there is no equivalent to SROA/Mem2Reg here?

I assume, no.

I am trying to understand conceptually where the debug info for the source level local should be tied to in the IR, and at least for locals which use alloc it has turned out to be much simpler to tie the variable directly to the alloc itself rather than bitcasts and things which obscure the relationship.

In D88978#2325991, @ABataev wrote:

In D88978#2325982, @scott.linder wrote:

@ABataev Sorry if I'm pulling you in without enough context/work on my end, but I wanted to ask how the Clang codegen for OpenMP locals works at a high level?

Is the idea that instead of an alloc the frontend can insert calls into the runtime in some cases, like __kmpc_alloc (e.g. for firstprivate as in https://reviews.llvm.org/D5140)?

Yes, right.

If that is the case, I assume there is no equivalent to SROA/Mem2Reg here?

I assume, no.

I am trying to understand conceptually where the debug info for the source level local should be tied to in the IR, and at least for locals which use alloc it has turned out to be much simpler to tie the variable directly to the alloc itself rather than bitcasts and things which obscure the relationship.

Ok, thank you! I think the simplest thing is for me to update the patch to tie the debug info to the call to the runtime allocator, then.

I have no idea from the commit message what this has to do with the OpenMP codegen, but I saw this randomly while looking for something else, so here we go:

In D88978#2326036, @scott.linder wrote:

In D88978#2325991, @ABataev wrote:

In D88978#2325982, @scott.linder wrote:

@ABataev Sorry if I'm pulling you in without enough context/work on my end, but I wanted to ask how the Clang codegen for OpenMP locals works at a high level?

Is the idea that instead of an alloc the frontend can insert calls into the runtime in some cases, like __kmpc_alloc (e.g. for firstprivate as in https://reviews.llvm.org/D5140)?

Yes, right.

The frontend does *not* insert __kmpc_alloc calls for firstprivate, or almost anything else for that matter. Grep clang/lib and you can find 2 uses, both in very specialized cases not related to "regular" user allocations. alloca is used, as with basically everything else: https://clang.godbolt.org/z/z8fEqG

If that is the case, I assume there is no equivalent to SROA/Mem2Reg here?

I assume, no.

I am trying to understand conceptually where the debug info for the source level local should be tied to in the IR, and at least for locals which use alloc it has turned out to be much simpler to tie the variable directly to the alloc itself rather than bitcasts and things which obscure the relationship.

Ok, thank you! I think the simplest thing is for me to update the patch to tie the debug info to the call to the runtime allocator, then.

SROA/mem2reg is happening as you expect it to. FWIW, we also have heap2stack and argument-promotion + constant prop for parallel regions implemented in the Attributor. That means we would/will apply SROA/mem2reg even if you have a runtime alloca and if the value is nominally "shared" but could be made firstprivate.

In D88978#2343484, @jdoerfert wrote:

I have no idea from the commit message what this has to do with the OpenMP codegen, but I saw this randomly while looking for something else, so here we go:

In D88978#2326036, @scott.linder wrote:

In D88978#2325991, @ABataev wrote:

In D88978#2325982, @scott.linder wrote:

@ABataev Sorry if I'm pulling you in without enough context/work on my end, but I wanted to ask how the Clang codegen for OpenMP locals works at a high level?

Is the idea that instead of an alloc the frontend can insert calls into the runtime in some cases, like __kmpc_alloc (e.g. for firstprivate as in https://reviews.llvm.org/D5140)?

Yes, right.

The frontend does *not* insert __kmpc_alloc calls for firstprivate, or almost anything else for that matter. Grep clang/lib and you can find 2 uses, both in very specialized cases not related to "regular" user allocations. alloca is used, as with basically everything else: https://clang.godbolt.org/z/z8fEqG

They are inserted for pragma allocate and privates with allocate clauses.

If that is the case, I assume there is no equivalent to SROA/Mem2Reg here?

I assume, no.

I am trying to understand conceptually where the debug info for the source level local should be tied to in the IR, and at least for locals which use alloc it has turned out to be much simpler to tie the variable directly to the alloc itself rather than bitcasts and things which obscure the relationship.

Ok, thank you! I think the simplest thing is for me to update the patch to tie the debug info to the call to the runtime allocator, then.

SROA/mem2reg is happening as you expect it to. FWIW, we also have heap2stack and argument-promotion + constant prop for parallel regions implemented in the Attributor. That means we would/will apply SROA/mem2reg even if you have a runtime alloca and if the value is nominally "shared" but could be made firstprivate.

Is this still needed?

In D88978#2660274, @arsenm wrote:

Is this still needed?

Yes, I just got a little bogged down in the OMP code and haven't gotten back to it to finish it up. I anticipate needing to do this soon, though.

Is this still needed?

Herald added a project: Restricted Project. Nov 16 2022, 4:23 PM

Please rebase if still relevant

This revision now requires changes to proceed. Dec 14 2022, 6:06 AM

Herald added a subscriber: arichardson. Dec 14 2022, 6:06 AM

I'll open a new review as part of upstreaming all of the debug info work

Revision Contents

Path	Size
clang/lib/CodeGen/CGDebugInfo.cpp	3 lines
clang/lib/CodeGen/CGDecl.cpp	11 lines
clang/test/CodeGenHIP/debug-info-address-class.hip	7 lines

Diff 296713

clang/lib/CodeGen/CGDebugInfo.cpp

@@ llvm::DILocalVariable *CGDebugInfo::EmitDeclare(const VarDecl *VD,
   }
   SmallVector<int64_t, 13> Expr;
   llvm::DINode::DIFlags Flags = llvm::DINode::FlagZero;
   if (VD->isImplicit())
     Flags |= llvm::DINode::FlagArtificial;

   auto Align = getDeclAlignIfRequired(VD, CGM.getContext());

-  unsigned AddressSpace = CGM.getContext().getTargetAddressSpace(VD->getType());
+  unsigned AddressSpace =
+      llvm::cast<llvm::PointerType>(Storage->getType())->getAddressSpace();
   AppendAddressSpaceXDeref(AddressSpace, Expr);

   // If this is implicit parameter of CXXThis or ObjCSelf kind, then give it an
   // object pointer flag.
   if (const auto *IPD = dyn_cast<ImplicitParamDecl>(VD)) {
     if (IPD->getParameterKind() == ImplicitParamDecl::CXXThis ||
         IPD->getParameterKind() == ImplicitParamDecl::ObjCSelf)
       Flags |= llvm::DINode::FlagObjectPointer;

clang/lib/CodeGen/CGDecl.cpp

@@ CodeGenFunction::EmitAutoVarAlloca(const VarDecl &D) {
   }

   setAddrOfLocalVar(&D, address);
   emission.Addr = address;
   emission.AllocaAddr = AllocaAddr;

   // Emit debug info for local var declaration.
   if (EmitDebugInfo && HaveInsertPoint()) {
-    Address DebugAddr = address;
+    Address DebugAddr = AllocaAddr.isValid() ? AllocaAddr : address;
     bool UsePointerValue = NRVO && ReturnValuePointer.isValid();
     DI->setLocation(D.getLocation());

     // If NRVO, use a pointer to the return address.
     if (UsePointerValue)
       DebugAddr = ReturnValuePointer;

     (void)DI->EmitDeclareOfAutoVariable(&D, DebugAddr.getPointer(), Builder,
@@ if (BlockInfo) {
           ? Builder.CreateLoad(Arg.getIndirectAddress())
           : Arg.getDirectValue();
       setBlockContextParameter(IPD, ArgNo, V);
       return;
     }
   }

   Address DeclPtr = Address::invalid();
+  Address DebugAddr = Address::invalid();
   bool DoStore = false;
   bool IsScalar = hasScalarEvaluationKind(Ty);
   // If we already have a pointer to the argument, reuse the input pointer.
   if (Arg.isIndirect()) {
-    DeclPtr = Arg.getIndirectAddress();
+    DeclPtr = DebugAddr = Arg.getIndirectAddress();
     // If we have a prettier pointer type at this point, bitcast to that.
     unsigned AS = DeclPtr.getType()->getAddressSpace();
     llvm::Type *IRTy = ConvertTypeForMem(Ty)->getPointerTo(AS);
     if (DeclPtr.getType() != IRTy)
       DeclPtr = Builder.CreateBitCast(DeclPtr, IRTy, D.getName());
     // Indirect argument is in alloca address space, which may be different
     // from the default address space.
     auto AllocaAS = CGM.getASTAllocaAddressSpace();
@@ if (Arg.isIndirect()) {
     }
   } else {
     // Check if the parameter address is controlled by OpenMP runtime.
     Address OpenMPLocalAddr =
         getLangOpts().OpenMP
             ? CGM.getOpenMPRuntime().getAddressOfLocalVariable(*this, &D)
             : Address::invalid();
     if (getLangOpts().OpenMP && OpenMPLocalAddr.isValid()) {
-      DeclPtr = OpenMPLocalAddr;
+      DeclPtr = DebugAddr = OpenMPLocalAddr;
     } else {
       // Otherwise, create a temporary to hold the value.
       DeclPtr = CreateMemTemp(Ty, getContext().getDeclAlign(&D),
-                              D.getName() + ".addr");
+                              D.getName() + ".addr", &DebugAddr);
     }
     DoStore = true;
   }

   llvm::Value *ArgVal = (DoStore ? Arg.getDirectValue() : nullptr);

   LValue lv = MakeAddrLValue(DeclPtr, Ty);
   if (IsScalar) {
@@ void CodeGenFunction::EmitParmDecl(const VarDecl &D, ParamValue Arg,
     if (DoStore)
       EmitStoreOfScalar(ArgVal, lv, /* isInitialization */ true);

   setAddrOfLocalVar(&D, DeclPtr);

   // Emit debug info for param declarations in non-thunk functions.
   if (CGDebugInfo *DI = getDebugInfo()) {
     if (CGM.getCodeGenOpts().hasReducedDebugInfo() && !CurFuncIsThunk) {
-      DI->EmitDeclareOfArgVariable(&D, DeclPtr.getPointer(), ArgNo, Builder);
+      DI->EmitDeclareOfArgVariable(&D, DebugAddr.getPointer(), ArgNo, Builder);
     }
   }

   if (D.hasAttr<AnnotateAttr>())
     EmitVarAnnotations(&D, DeclPtr.getPointer());

   // We can only check return value nullability if all arguments to the
   // function satisfy their nullability preconditions. This makes it necessary

clang/test/CodeGenHIP/debug-info-address-class.hip

@@
 // CHECK-DAG: ![[FILEVAR1:[0-9]+]] = distinct !DIGlobalVariable(name: "FileVar1", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: !{{[0-9]+}}, isLocal: false, isDefinition: true)
 // CHECK-DAG: !DIGlobalVariableExpression(var: ![[FILEVAR1]], expr: !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef))
 __device__ __shared__ int FileVar1;
 // CHECK-DAG: ![[FILEVAR2:[0-9]+]] = distinct !DIGlobalVariable(name: "FileVar2", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: !{{[0-9]+}}, isLocal: false, isDefinition: true)
 // CHECK-DAG: !DIGlobalVariableExpression(var: ![[FILEVAR2]], expr: !DIExpression())
 __device__ __constant__ int FileVar2;

 __device__ void kernel1(
-    // FIXME This should be in the private address space.
     // CHECK-DAG: ![[ARG:[0-9]+]] = !DILocalVariable(name: "Arg", arg: {{[0-9]+}}, scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: !{{[0-9]+}})
-    // CHECK-DAG: call void @llvm.dbg.declare(metadata i32* {{.*}}, metadata ![[ARG]], metadata !DIExpression()), !dbg !{{[0-9]+}}
+    // CHECK-DAG: call void @llvm.dbg.declare(metadata i32 addrspace(5)* {{.*}}, metadata ![[ARG]], metadata !DIExpression(DW_OP_constu, 1, DW_OP_swap, DW_OP_xderef)), !dbg !{{[0-9]+}}
     int Arg) {
   // CHECK-DAG: ![[FUNCVAR0:[0-9]+]] = distinct !DIGlobalVariable(name: "FuncVar0", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: !{{[0-9]+}}, isLocal: true, isDefinition: true)
   // CHECK-DAG: !DIGlobalVariableExpression(var: ![[FUNCVAR0]], expr: !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef))
   __shared__ int FuncVar0;

-  // FIXME This should be in the private address space.
   // CHECK-DAG: ![[FUNCVAR1:[0-9]+]] = !DILocalVariable(name: "FuncVar1", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: !{{[0-9]+}})
-  // CHECK-DAG: call void @llvm.dbg.declare(metadata i32* {{.*}}, metadata ![[FUNCVAR1]], metadata !DIExpression()), !dbg !{{[0-9]+}}
+  // CHECK-DAG: call void @llvm.dbg.declare(metadata i32 addrspace(5)* {{.*}}, metadata ![[FUNCVAR1]], metadata !DIExpression(DW_OP_constu, 1, DW_OP_swap, DW_OP_xderef)), !dbg !{{[0-9]+}}
   int FuncVar1;
 }