This is an archive of the discontinued LLVM Phabricator instance.

Differential D88976

[clang] Use correct address space for global variable debug info
Accepted, Public

Authored by scott.linder on Oct 7 2020, 9:07 AM.

Details

Reviewers

ABataev
yaxunl
echristo
probinson
arsenm
kzhuravl
t-tye
ramana-nvr
aprantl

Summary

The target needs to be queried here, but previously we seemed to only
duplicate CUDA's (and so HIP's) behavior, and only partially. Use the
same function as codegen to find the correct address space.
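
As a rough illustration of the problem described in the summary, the standalone sketch below places a debug-info path that re-implements only part of the attribute handling next to one that defers to the codegen query. It does not use clang: `AS`, `Decl`, and `codegenAddressSpace` are hypothetical stand-ins, and the rule that an unattributed device global is placed in a global address space is an assumption made purely for illustration.

```cpp
#include <cassert>

// Hypothetical language-level address spaces (not clang's LangAS).
enum class AS { Default, Global, Shared, Constant };

// Hypothetical declaration: the address space written in the type, plus
// the attributes that can override it.
struct Decl {
  AS TypeAS = AS::Default;
  bool IsShared = false;
  bool IsConstant = false;
};

// Stand-in for the query codegen uses to place a global variable.
// Assumption: unattributed device globals are placed in Global.
AS codegenAddressSpace(const Decl &D) {
  if (D.IsShared)
    return AS::Shared;
  if (D.IsConstant)
    return AS::Constant;
  if (D.TypeAS != AS::Default)
    return D.TypeAS;
  return AS::Global;
}

// Old style: start from the type and patch in some attributes by hand.
AS debugInfoAddressSpaceOld(const Decl &D) {
  AS Result = D.TypeAS;
  if (D.IsShared)
    Result = AS::Shared;
  else if (D.IsConstant)
    Result = AS::Constant;
  return Result;
}

// New style: ask the same function codegen asks.
AS debugInfoAddressSpaceNew(const Decl &D) { return codegenAddressSpace(D); }
```

In this model the two paths agree for shared and constant variables but diverge for a plain device variable, which is the kind of partial duplication the summary refers to.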

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

scott.linder created this revision. Oct 7 2020, 9:07 AM

Herald added a project: Restricted Project. Oct 7 2020, 9:07 AM

Herald added subscribers: cfe-commits, jholewinski.

scott.linder requested review of this revision. Oct 7 2020, 9:07 AM

scott.linder added reviewers: ABataev, yaxunl, echristo, probinson, arsenm, kzhuravl, t-tye, ramana-nvr. Oct 7 2020, 9:08 AM

Herald added a subscriber: wdng. Oct 7 2020, 9:08 AM

I'm not certain I fully understand NVPTX's relationship with its debugger, but from https://reviews.llvm.org/D57162 I gather that the "default" address space in the debugger is global, so the frontend omits it rather than mentioning it explicitly. I think it would be simpler to carry this information throughout the compiler and only strip it late in the backend, as a quirk controllable via some "optimize for NVPTX debugger" option, but in the patch as it currently stands I instead just update NVPTXDWARFAddrSpaceMap.

Edit: Concerning auto variables, when coming back to the patch to post it I had missed the next patch in the series, which addresses them by referring directly to the corresponding alloca rather than to the addrspacecast to the default address space. I'll post that patch shortly to address the "FIXME" in this one.
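
To make the "omit it" behavior concrete, here is a minimal standalone model (C++17) of how a `-1` entry in a table like NVPTXDWARFAddrSpaceMap leads to no address-class operations being appended. It is a sketch, not the clang implementation: `dwarfAddressClass` and `appendAddressSpaceXDeref` are hypothetical helpers, the table is assumed to be indexed by target address space number, and only the opcode values are taken from the DWARF specification.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// DWARF opcode values from the DWARF specification.
constexpr int64_t DW_OP_constu = 0x10;
constexpr int64_t DW_OP_swap = 0x16;
constexpr int64_t DW_OP_xderef = 0x18;

// Copy of the patched table from this revision; -1 means "no DWARF address
// class". Assumption: the index is the target address space number.
constexpr int DWARFAddrSpaceMap[] = {
    -1, // Default, opencl_private or opencl_generic - not defined
    -1, // opencl_global
    -1,
    8, // opencl_local or cuda_shared
    4, // opencl_constant or cuda_constant
};

// Hypothetical lookup: returns nothing for out-of-range or -1 entries.
std::optional<unsigned> dwarfAddressClass(unsigned TargetAS) {
  constexpr unsigned N =
      sizeof(DWARFAddrSpaceMap) / sizeof(DWARFAddrSpaceMap[0]);
  if (TargetAS >= N || DWARFAddrSpaceMap[TargetAS] < 0)
    return std::nullopt;
  return static_cast<unsigned>(DWARFAddrSpaceMap[TargetAS]);
}

// Appends "DW_OP_constu <class> DW_OP_swap DW_OP_xderef" when a class
// exists, and leaves the expression untouched otherwise.
void appendAddressSpaceXDeref(unsigned TargetAS, std::vector<int64_t> &Expr) {
  std::optional<unsigned> Class = dwarfAddressClass(TargetAS);
  if (!Class)
    return;
  Expr.push_back(DW_OP_constu);
  Expr.push_back(*Class);
  Expr.push_back(DW_OP_swap);
  Expr.push_back(DW_OP_xderef);
}
```

With this table, index 1 yields an empty expression, while index 3 yields the same four-operation shape that the test in this revision checks for.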

scott.linder added a child revision: D88978: [WIP] Attach debug intrinsics to allocas, and use correct address space. Oct 7 2020, 9:38 AM

scott.linder added a reviewer: aprantl. Oct 7 2020, 9:45 AM

Harbormaster completed remote builds in B74296: Diff 296705. Oct 7 2020, 10:55 AM

That looks much nicer.

clang/test/CodeGenHIP/debug-info-address-class.hip
Line 9: They are more convenient, but having very many CHECK-DAGs is also really slow. Would it be feasible to reorder them and use CHECKs, perhaps by running clang/FileCheck twice with different sets of CHECK lines?

This revision is now accepted and ready to land. Oct 9 2020, 11:54 AM

Replace uses of CHECK-DAG, use more meaningful names in test

scott.linder added inline comments. Oct 9 2020, 12:35 PM

clang/test/CodeGenHIP/debug-info-address-class.hip
Line 9: The test is pretty short, so I just reordered the checks to match the order in which they appear in the output (and used more descriptive names to make it easier to follow). This does mean the test relies on the order in which these things are traversed. Some of that order is a bit surprising, such as the metadata for the `__shared__` auto variable coming before the argument, but I don't imagine it is liable to change often or accidentally.

Harbormaster completed remote builds in B74641: Diff 297309. Oct 9 2020, 1:25 PM

scott.linder added inline comments. Oct 14 2020, 12:10 PM

clang/lib/Basic/Targets/NVPTX.h
Line 47: Does anyone have any thoughts on this change specifically? Is someone more familiar with NVPTX willing to weigh in on whether it makes more sense to carry the address space throughout the compiler explicitly and "drop" it late in the DWARF emission, or to do what I did in the current patch (drop it early)? I would lean towards updating the patch to do the former, but I wanted to get feedback before plunging off to do it.
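
For comparison, the "drop it late" alternative raised in this comment could look roughly like the standalone C++17 sketch below. Everything in it is hypothetical: `DebuggerTuning` and `classToEmit` do not exist in LLVM, and the value 5 for the global class is taken only from the table entry that this patch replaces with -1.

```cpp
#include <cassert>
#include <optional>

// Hypothetical emission-time setting: the address class the debugger
// assumes when none is stated. Empty means the debugger assumes nothing.
struct DebuggerTuning {
  std::optional<unsigned> ImplicitClass;
};

// Late drop: the frontend always passes the explicit class, and the emitter
// omits it only when it matches what the debugger would assume anyway.
std::optional<unsigned> classToEmit(unsigned ExplicitClass,
                                    const DebuggerTuning &Tuning) {
  if (Tuning.ImplicitClass && *Tuning.ImplicitClass == ExplicitClass)
    return std::nullopt;
  return ExplicitClass;
}
```

Under this design the table would keep its explicit global entry, and the omission would become a property of the emitter rather than of the frontend mapping.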

ping?

Herald added a project: Restricted Project. Sep 28 2022, 2:27 PM

Herald added subscribers: mattd, gchakrabarti, asavonic.

LGTM. Thanks.

Revision Contents

Path                                                  Size
clang/lib/Basic/Targets/NVPTX.h                       2 lines
clang/lib/CodeGen/CGDebugInfo.cpp                     10 lines
clang/test/CodeGenHIP/debug-info-address-class.hip    37 lines

Diff 297309

clang/lib/Basic/Targets/NVPTX.h

 //===--- NVPTX.h - Declare NVPTX target feature support ---------- C++ --===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===----------------------------------------------------------------------===//
 //
 // This file declares NVPTX TargetInfo objects.
[29 lines not shown; context: static const unsigned NVPTXAddrSpaceMap[] = {]
     0, // ptr32_uptr
     0 // ptr64
 };

 /// The DWARF address class. Taken from
 /// https://docs.nvidia.com/cuda/archive/10.0/ptx-writers-guide-to-interoperability/index.html#cuda-specific-dwarf
 static const int NVPTXDWARFAddrSpaceMap[] = {
     -1, // Default, opencl_private or opencl_generic - not defined
-    5, // opencl_global
+    -1, // opencl_global
     -1,
     8, // opencl_local or cuda_shared
     4, // opencl_constant or cuda_constant
 };

 class LLVM_LIBRARY_VISIBILITY NVPTXTargetInfo : public TargetInfo {
   static const char *const GCCRegNames[];
   static const Builtin::Info BuiltinInfo[];
   CudaArch GPU;
   uint32_t PTXVersion;
[113 lines not shown]

clang/lib/CodeGen/CGDebugInfo.cpp

[4,661 lines not shown; context: if (T->isUnionType() && DeclName.empty()) {]
       assert(RD->isAnonymousStructOrUnion() &&
              "unnamed non-anonymous struct or union?");
       GVE = CollectAnonRecordDecls(RD, Unit, LineNo, LinkageName, Var, DContext);
     } else {
       auto Align = getDeclAlignIfRequired(D, CGM.getContext());

       SmallVector<int64_t, 4> Expr;
       unsigned AddressSpace =
-          CGM.getContext().getTargetAddressSpace(D->getType());
-      if (CGM.getLangOpts().CUDA && CGM.getLangOpts().CUDAIsDevice) {
-        if (D->hasAttr<CUDASharedAttr>())
-          AddressSpace =
-              CGM.getContext().getTargetAddressSpace(LangAS::cuda_shared);
-        else if (D->hasAttr<CUDAConstantAttr>())
-          AddressSpace =
-              CGM.getContext().getTargetAddressSpace(LangAS::cuda_constant);
-      }
+          CGM.getContext().getTargetAddressSpace(CGM.GetGlobalVarAddressSpace(D));
       AppendAddressSpaceXDeref(AddressSpace, Expr);

       GVE = DBuilder.createGlobalVariableExpression(
           DContext, DeclName, LinkageName, Unit, LineNo, getOrCreateType(T, Unit),
           Var->hasLocalLinkage(), true,
           Expr.empty() ? nullptr : DBuilder.createExpression(Expr),
           getOrCreateStaticDataMemberDeclarationOrNull(D), TemplateParameters,
           Align);
[346 lines not shown]

clang/test/CodeGenHIP/debug-info-address-class.hip

This file was added.

				// REQUIRES: amdgpu-registered-target
				// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -x hip -emit-llvm -fcuda-is-device -debug-info-kind=limited -dwarf-version=4 -o - %s | FileCheck %s

				#define __device__ __attribute__((device))
				#define __shared__ __attribute__((shared))
				#define __constant__ __attribute__((constant))

				__device__ int FileVarDevice;
				__device__ __shared__ int FileVarDeviceShared;
				__device__ __constant__ int FileVarDeviceConstant;

				__device__ void kernel1(
				// FIXME This should be in the private address space.
				// CHECK: call void @llvm.dbg.declare(metadata i32* {{.*}}, metadata ![[ARG:[0-9]+]], metadata !DIExpression()), !dbg !{{[0-9]+}}
				int Arg) {
				__shared__ int FuncVarShared;

				// FIXME This should be in the private address space.
				// CHECK: call void @llvm.dbg.declare(metadata i32* {{.*}}, metadata ![[FUNC_VAR:[0-9]+]], metadata !DIExpression()), !dbg !{{[0-9]+}}
				int FuncVar;
				}

				// CHECK: !DIGlobalVariableExpression(var: ![[FILE_VAR_DEVICE:[0-9]+]], expr: !DIExpression())
				// CHECK: ![[FILE_VAR_DEVICE]] = distinct !DIGlobalVariable(name: "FileVarDevice", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: !{{[0-9]+}}, isLocal: false, isDefinition: true)

				// CHECK: !DIGlobalVariableExpression(var: ![[FILE_VAR_DEVICE_SHARED:[0-9]+]], expr: !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef))
				// CHECK: ![[FILE_VAR_DEVICE_SHARED]] = distinct !DIGlobalVariable(name: "FileVarDeviceShared", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: !{{[0-9]+}}, isLocal: false, isDefinition: true)

				// CHECK: !DIGlobalVariableExpression(var: ![[FILE_VAR_DEVICE_CONSTANT:[0-9]+]], expr: !DIExpression())
				// CHECK: ![[FILE_VAR_DEVICE_CONSTANT]] = distinct !DIGlobalVariable(name: "FileVarDeviceConstant", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: !{{[0-9]+}}, isLocal: false, isDefinition: true)

				// CHECK: !DIGlobalVariableExpression(var: ![[FUNC_VAR_SHARED:[0-9]+]], expr: !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef))
				// CHECK: ![[FUNC_VAR_SHARED]] = distinct !DIGlobalVariable(name: "FuncVarShared", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: !{{[0-9]+}}, isLocal: true, isDefinition: true)

				// CHECK: ![[ARG]] = !DILocalVariable(name: "Arg", arg: {{[0-9]+}}, scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: !{{[0-9]+}})

				// CHECK: ![[FUNC_VAR]] = !DILocalVariable(name: "FuncVar", scope: !{{[0-9]+}}, file: !{{[0-9]+}}, line: {{[0-9]+}}, type: !{{[0-9]+}})