This is an archive of the discontinued LLVM Phabricator instance.

Differential D155991

[DWARF] Make sure file entry for artificial functions has an MD5 checksum
ClosedPublic

Authored by probinson on Jul 21 2023, 1:39 PM.

Details

Reviewers

dblaikie
aprantl

Commits

rG7abb5fc618ce: [DWARF] Make sure file entry for artificial functions has an MD5 checksum

Summary

The DIFile cache was keyed on a string pointer instead of string content,
which was causing misses and resulted in an entry without a checksum.
In DWARF v5 if any checksum is missing, we can't write any to the output
file, so this had consequences.
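
To illustrate the failure mode described in the summary, here is a minimal sketch using standard-library containers instead of `llvm::DenseMap` and `llvm::StringMap`. The names `FileEntry`, `PointerKeyedCache`, `ContentKeyedCache`, and the helper functions are hypothetical and are not taken from the patch; the sketch only assumes that the same filename can reach the cache through two different buffers.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a cached DIFile node; only the checksum flag
// matters for this illustration.
struct FileEntry {
  bool HasChecksum;
};

// Keyed on the address of the name: two buffers with equal text are
// different keys.
using PointerKeyedCache = std::unordered_map<const char *, FileEntry>;
// Keyed on the characters of the name: equal text is always the same key.
using ContentKeyedCache = std::unordered_map<std::string, FileEntry>;

// Inserts a name, then looks it up through a second copy of the same text.
inline bool pointerCacheHitsOnCopy() {
  std::string Original = "context.cpp";
  std::string Copy = Original; // same characters, separate storage
  PointerKeyedCache Cache;
  Cache.emplace(Original.c_str(), FileEntry{true});
  return Cache.count(Copy.c_str()) != 0;
}

// Same experiment with a content-keyed cache.
inline bool contentCacheHitsOnCopy() {
  std::string Original = "context.cpp";
  std::string Copy = Original;
  ContentKeyedCache Cache;
  Cache.emplace(Original, FileEntry{true});
  return Cache.count(Copy) != 0;
}

// The all-or-nothing rule from the summary: checksums can be written only
// if every file entry has one.
inline bool canEmitChecksums(const std::vector<FileEntry> &Files) {
  return std::all_of(Files.begin(), Files.end(),
                     [](const FileEntry &F) { return F.HasChecksum; });
}
```

In this sketch the pointer-keyed lookup misses on the copy, which is the kind of miss that could lead to a second entry being created without a checksum; a single such entry is enough for `canEmitChecksums` to return false.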

Fixes https://github.com/llvm/llvm-project/issues/63955

Event Timeline

probinson created this revision. Jul 21 2023, 1:39 PM

Herald added a project: Restricted Project. Jul 21 2023, 1:39 PM

probinson requested review of this revision. Jul 21 2023, 1:39 PM

The fix is straightforward, but the test was surprisingly tricky; I had to add the -main-file-name option to keep one of the DIFiles from ending up as <stdin>.
I'm still befuddled about why there are two DIFile entries, when the DIFileCache clearly has only one. The metadata remains mysterious to me.

Harbormaster completed remote builds in B247306: Diff 543044. Jul 21 2023, 7:02 PM

aprantl accepted this revision. Jul 24 2023, 9:24 AM

This revision is now accepted and ready to land. Jul 24 2023, 9:24 AM

This revision was landed with ongoing or failed builds. Jul 24 2023, 10:53 AM

Closed by commit rG7abb5fc618ce: [DWARF] Make sure file entry for artificial functions has an MD5 checksum (authored by probinson).

This revision was automatically updated to reflect the committed changes.

probinson added a commit: rG7abb5fc618ce: [DWARF] Make sure file entry for artificial functions has an MD5 checksum.

Herald added a project: Restricted Project. Jul 24 2023, 10:53 AM

Any memory usage measurements to check this doesn't have a significant adverse impact by copying all the strings? (or performance by having to do string rather than pointer equality comparisons?)

Could/should we do the lookup on the CU filename before it goes into the DI metadata, and store that FileID somewhere for later use? Rather than requerying it later/making all queries string comparisons/storing strings, etc? (we could put an "expensive checks" assertion in that our queries always return in agreement with the pointer equality - so in case there are other things that trip over this issue we'll know about them sooner)
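
The "expensive checks" idea in this comment could look roughly like the sketch below: keep the fast pointer-keyed lookup, and in a checking build compare it against a slow content-based scan. This is only an illustration with standard-library containers; `PointerKeyedFileCache`, its members, and the `int` node id are hypothetical and do not correspond to anything in CGDebugInfo.

```cpp
#include <cassert>
#include <cstring>
#include <unordered_map>

// Hypothetical pointer-keyed cache; the mapped int stands in for a node id.
struct PointerKeyedFileCache {
  std::unordered_map<const char *, int> Entries;

  // Fast path: lookup by pointer identity.
  const int *findByPointer(const char *Name) const {
    auto It = Entries.find(Name);
    return It == Entries.end() ? nullptr : &It->second;
  }

  // Slow path: linear scan comparing the characters of each key.
  const int *findByContent(const char *Name) const {
    for (const auto &KV : Entries)
      if (std::strcmp(KV.first, Name) == 0)
        return &KV.second;
    return nullptr;
  }

  // True when both paths return the same entry (or both miss). A false
  // result means a name reached the cache through a different buffer.
  bool lookupsAgree(const char *Name) const {
    return findByPointer(Name) == findByContent(Name);
  }
};
```

A checking build could wrap `lookupsAgree` in an assertion at each query, so that any other caller passing a copied filename would be noticed early.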

FYI this is a 0.5% compile-time regression on O0 builds (https://llvm-compile-time-tracker.com/compare.php?from=69593aa5c054cec6be6b822c073ccdc63748a68d&to=7abb5fc618cec66841a8280d2a099a4c9c8cb91b&stat=instructions:u). Is that expected?

> Any memory usage measurements to check this doesn't have a significant adverse impact by copying all the strings?

Not actual measurements, no; but intuitively the size should not be much greater than the size of the filename entries in the .debug_line section for the CU, which should be KB not MB. (Plus StringMap entry overhead, which is constant per node.) Therefore I didn't take the time to measure. But see below for a possible alternate approach which would avoid even that much.

> Could/should we do the lookup on the CU filename before it goes into the DI metadata, and store that FileID somewhere for later use?

There's a note in issue 63955 to the effect that I can't find an API to turn a filename into a FileID. If there is one that I didn't find, we could use FileIDs instead of pointers to name strings.

> FYI this is a 0.5% compile-time regression on O0 builds (https://llvm-compile-time-tracker.com/compare.php?from=69593aa5c054cec6be6b822c073ccdc63748a68d&to=7abb5fc618cec66841a8280d2a099a4c9c8cb91b&stat=instructions:u). Is that expected?

That's higher than I expected.

It might be feasible to do this a different way: When an invalid loc comes in, search the DIFileCache for a matching string, and use that instead of using the CU's copy of the filename. Then we can go back to using pointers as the keys. Might eliminate the duplicate DIFile as well. I'll look into that. Then the time cost would be incurred only when invalid locs come in, which is a minority of lookups (depends on the number of artificial functions generated).
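
A rough sketch of that alternative, under the assumption that the cache stays pointer-keyed and only the invalid-location path pays for a string comparison. The class `FileCache`, the type `FileNode`, and the method names are hypothetical; this is not the code from D156571.

```cpp
#include <cassert>
#include <cstring>
#include <unordered_map>

// Hypothetical stand-in for a DIFile node.
struct FileNode {
  bool HasChecksum;
};

class FileCache {
  // Pointer keys, as in the cache before this patch.
  std::unordered_map<const char *, FileNode> Nodes;

public:
  void insert(const char *Name, FileNode N) { Nodes.emplace(Name, N); }

  // Common case (valid location): cheap lookup by pointer identity.
  const FileNode *lookup(const char *Name) const {
    auto It = Nodes.find(Name);
    return It == Nodes.end() ? nullptr : &It->second;
  }

  // Invalid-location case: CUName is the CU's own copy of the filename, so
  // scan for an existing entry with the same text and reuse it.
  const FileNode *lookupForInvalidLoc(const char *CUName) const {
    for (const auto &KV : Nodes)
      if (std::strcmp(KV.first, CUName) == 0)
        return &KV.second;
    return nullptr;
  }
};
```

The linear scan runs only for invalid locations, so its cost should scale with the number of artificial functions rather than with every lookup.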

See https://reviews.llvm.org/D156571

probinson mentioned this in rGca1295c5a15f: [DebugInfo] Alternate (more efficient) MD5 fix. Aug 17 2023, 7:04 AM

probinson mentioned this in rG2e4d2d800b9c: Reapply "[DebugInfo] Alternate (more efficient) MD5 fix". Aug 18 2023, 5:24 AM

probinson mentioned this in rG1fcc2bc31bb9: Reapply "[DebugInfo] Alternate (more efficient) MD5 fix". Aug 18 2023, 9:21 AM

Revision Contents

Path                                                     Size
clang/lib/CodeGen/CGDebugInfo.h                          2 lines
clang/test/CodeGenCXX/debug-info-function-context.cpp    16 lines

Diff 543044

clang/lib/CodeGen/CGDebugInfo.h

[142 lines not shown; viewer context: #include "clang/Basic/WebAssemblyReferenceTypes.def"]
   /// function.
   std::vector<unsigned> FnBeginRegionCount;

   /// This is a storage for names that are constructed on demand. For
   /// example, C++ destructors, C++ operators etc..
   llvm::BumpPtrAllocator DebugInfoNames;
   StringRef CWDName;

-  llvm::DenseMap<const char *, llvm::TrackingMDRef> DIFileCache;
+  llvm::StringMap<llvm::TrackingMDRef> DIFileCache;
   llvm::DenseMap<const FunctionDecl *, llvm::TrackingMDRef> SPCache;
   /// Cache declarations relevant to DW_TAG_imported_declarations (C++
   /// using declarations and global alias variables) that aren't covered
   /// by other more specific caches.
   llvm::DenseMap<const Decl *, llvm::TrackingMDRef> DeclCache;
   llvm::DenseMap<const Decl *, llvm::TrackingMDRef> ImportedDeclCache;
   llvm::DenseMap<const NamespaceDecl *, llvm::TrackingMDRef> NamespaceCache;
   llvm::DenseMap<const NamespaceAliasDecl *, llvm::TrackingMDRef>
[733 lines not shown]

clang/test/CodeGenCXX/debug-info-function-context.cpp

-// RUN: %clang_cc1 -emit-llvm -debug-info-kind=limited -triple x86_64-pc-linux-gnu %s -o - | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -debug-info-kind=limited -triple x86_64-pc-linux-gnu %s \
+// RUN: -dwarf-version=5 -main-file-name %s -o - | FileCheck %s

 struct C {
   void member_function();
   static int static_member_function();
   static int static_member_variable;
 };

 int C::static_member_variable = 0;

 void C::member_function() { static_member_variable = 0; }

 int C::static_member_function() { return static_member_variable; }

 C global_variable;

 int global_function() { return -1; }

 namespace ns {
 void global_namespace_function() { global_variable.member_function(); }
 int global_namespace_variable = 1;
 }

+// Generate the artificial global functions to initialize a global.
+int global_initialized_variable = C::static_member_function();
+
 // Check that the functions that belong to C have C as a context and the
 // functions that belong to the namespace have it as a context, and the global
-// function has the file as a context.
+// functions (user-defined and artificial) have the file as a context.

-// CHECK: ![[FILE:[0-9]+]] = !DIFile(filename: "{{.*}}context.cpp",
+// The first DIFile is for the CU, the second is what everything else uses.
+// We're using DWARF v5 so both should have MD5 checksums.
+// CHECK: !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5
+// CHECK: ![[FILE:[0-9]+]] = !DIFile(filename: "{{.*}}context.cpp",{{.*}} checksumkind: CSK_MD5
 // CHECK: ![[C:[0-9]+]] = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "C",
 // CHECK: ![[NS:.*]] = !DINamespace(name: "ns"
 // CHECK: !DISubprogram(name: "member_function",{{.*}} scope: ![[C]],{{.*}} DISPFlagDefinition

 // CHECK: !DISubprogram(name: "static_member_function",{{.*}} scope: ![[C]],{{.*}} DISPFlagDefinition

 // CHECK: !DISubprogram(name: "global_function",{{.*}} scope: ![[FILE]],{{.*}} DISPFlagDefinition

 // CHECK: !DISubprogram(name: "global_namespace_function",{{.*}} scope: ![[NS]],{{.*}} DISPFlagDefinition
+
+// CHECK: !DISubprogram(name: "__cxx_global_var_init",{{.*}} scope: ![[FILE]],{{.*}} DISPFlagDefinition
+// CHECK: !DISubprogram(linkageName: "_GLOBAL__sub_I_{{.*}}",{{.*}} scope: ![[FILE]],{{.*}} DISPFlagDefinition