This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
clang/
-
lib/CodeGen/
-
CodeGen/
-
CodeGenFunction.cpp
-
test/CodeGen/
-
CodeGen/
-
attr-nodebug2.c

Differential D104777

PR50767: clear non-distinct debuginfo for function with nodebug definition after undecorated declaration
ClosedPublic

Authored by brunodefraine on Jun 23 2021, 4:30 AM.

Download Raw Diff

Details

Reviewers

ychen
dblaikie

Commits

rG4d8871a898b3: PR50767: clear non-distinct debuginfo for function with nodebug definition…

Summary

Fix suggested by Yuanfang Chen:

Non-distinct debuginfo is attached to the function due to the undecorated declaration. Later, when seeing the function definition and nodebug attribute, the non-distinct debuginfo should be cleared.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

brunodefraine requested review of this revision.Jun 23 2021, 4:30 AM

brunodefraine created this revision.

Herald added a subscriber: cfe-commits. · View Herald TranscriptJun 23 2021, 4:30 AM

Harbormaster completed remote builds in B110598: Diff 353914.Jun 23 2021, 5:05 AM

jeroen.dobbelaere added a subscriber: jeroen.dobbelaere.Jun 23 2021, 6:27 AM

Fix issues from Windows/clang-format buildbot. Fix mistake in code comment.

Harbormaster completed remote builds in B110616: Diff 353949.Jun 23 2021, 7:18 AM

@aaron.ballman Do we have attributes/infrastructure for attributes that have to be the same from their first declaration or at least from their first call? I'm wondering if it might be simpler/better to require nodebug to be provided earlier, rather than fixing it up after the fact like this.

(for instance, requiring it to be attributed on the declaration would ensure we don't create call_site descriptions for calls to nodebug functions - which would save some debug info size)

In D104777#2835965, @dblaikie wrote:

@aaron.ballman Do we have attributes/infrastructure for attributes that have to be the same from their first declaration or at least from their first call? I'm wondering if it might be simpler/better to require nodebug to be provided earlier, rather than fixing it up after the fact like this.

(for instance, requiring it to be attributed on the declaration would ensure we don't create call_site descriptions for calls to nodebug functions - which would save some debug info size)

We don't have any attributes that have to be the same "from their first call" that I'm aware of. But we do have attributes that need to be the same on all declarations. IIRC, the section and code_seg attributes both have to be specified on the initial declaration in some circumstances, and [[noreturn]] needs to be on the first declaration we see. We don't have any tablegen machinery to enforce this, the logic needs to be manually written. I think it'd live somewhere around Sema::mergeDeclAttributes() or Sema::MergeFunctionDecl() most likely.

Fixing up after the fact would be a bit strange though. A declaration is typically considered marked by an attribute from the point of declaration including the attribute onward, not backwards. e.g., https://godbolt.org/z/4E1Ya764n I won't say we *never* do this back propagation, but I would say it's a sign that the attribute is venturing into novel, perhaps dangerous territory. For example with this patch, an AST matcher won't see the nodebug attribute on the initial declaration of t1() and so it won't react to it the same way that codegen is being changed.

In D104777#2836327, @aaron.ballman wrote:

In D104777#2835965, @dblaikie wrote:

@aaron.ballman Do we have attributes/infrastructure for attributes that have to be the same from their first declaration or at least from their first call? I'm wondering if it might be simpler/better to require nodebug to be provided earlier, rather than fixing it up after the fact like this.

(for instance, requiring it to be attributed on the declaration would ensure we don't create call_site descriptions for calls to nodebug functions - which would save some debug info size)

We don't have any attributes that have to be the same "from their first call" that I'm aware of. But we do have attributes that need to be the same on all declarations. IIRC, the section and code_seg attributes both have to be specified on the initial declaration in some circumstances, and [[noreturn]] needs to be on the first declaration we see. We don't have any tablegen machinery to enforce this, the logic needs to be manually written. I think it'd live somewhere around Sema::mergeDeclAttributes() or Sema::MergeFunctionDecl() most likely.

Yeah, all that sounds reasonable to me - @brunodefraine could you look into supporting nodebug in a similar way as @aaron.ballman has described here?

Fixing up after the fact would be a bit strange though. A declaration is typically considered marked by an attribute from the point of declaration including the attribute onward, not backwards. e.g., https://godbolt.org/z/4E1Ya764n I won't say we *never* do this back propagation, but I would say it's a sign that the attribute is venturing into novel, perhaps dangerous territory. For example with this patch, an AST matcher won't see the nodebug attribute on the initial declaration of t1() and so it won't react to it the same way that codegen is being changed.

Yeah, fair concerns for sure - thanks for the tips!

In D104777#2836669, @dblaikie wrote:

Yeah, all that sounds reasonable to me - @brunodefraine could you look into supporting nodebug in a similar way as @aaron.ballman has described here?

Since the debuginfo for use() is slightly affected by the nodebug version of t1() that follows it, I can see how this back propagation is perhaps dangerous. Checking that nodebug is the same on all declarations of a function is a way to prevent this.

But when discussing the PR, @probinson wrote "I'm inclined to think we want this to work" and I can see what he means from the use case where I observed the bug. If you don't want debuginfo for the implementation of t1(), it should be fine to annotate just the function definition in an implementation file, not the declaration in a header, since the debuginfo of the implementation is not of the caller's concern. But nodebug as it exists does affect the debuginfo of callers as well, so I cannot really express that I don't want debuginfo for the implementation of a function and leave its callers unaffected?

In D104777#2837347, @brunodefraine wrote:

In D104777#2836669, @dblaikie wrote:

Yeah, all that sounds reasonable to me - @brunodefraine could you look into supporting nodebug in a similar way as @aaron.ballman has described here?

Since the debuginfo for use() is slightly affected by the nodebug version of t1() that follows it, I can see how this back propagation is perhaps dangerous. Checking that nodebug is the same on all declarations of a function is a way to prevent this.

But when discussing the PR, @probinson wrote "I'm inclined to think we want this to work" and I can see what he means from the use case where I observed the bug. If you don't want debuginfo for the implementation of t1(), it should be fine to annotate just the function definition in an implementation file, not the declaration in a header, since the debuginfo of the implementation is not of the caller's concern. But nodebug as it exists does affect the debuginfo of callers as well, so I cannot really express that I don't want debuginfo for the implementation of a function and leave its callers unaffected?

I can see the convenience there, to be sure, being able to put the attribute directly on the function you want to debug - but consistency in how attributes are handled (admitedly this isn't a strong consistency - some are handled this way, some aren't) & consistently seeing the same state for an attribute for a given function seems useful.

@probinson - thoughts?

In D104777#2837794, @dblaikie wrote:

In D104777#2837347, @brunodefraine wrote:

In D104777#2836669, @dblaikie wrote:

Yeah, all that sounds reasonable to me - @brunodefraine could you look into supporting nodebug in a similar way as @aaron.ballman has described here?

Since the debuginfo for use() is slightly affected by the nodebug version of t1() that follows it, I can see how this back propagation is perhaps dangerous. Checking that nodebug is the same on all declarations of a function is a way to prevent this.

But when discussing the PR, @probinson wrote "I'm inclined to think we want this to work" and I can see what he means from the use case where I observed the bug. If you don't want debuginfo for the implementation of t1(), it should be fine to annotate just the function definition in an implementation file, not the declaration in a header, since the debuginfo of the implementation is not of the caller's concern. But nodebug as it exists does affect the debuginfo of callers as well, so I cannot really express that I don't want debuginfo for the implementation of a function and leave its callers unaffected?

I can see the convenience there, to be sure, being able to put the attribute directly on the function you want to debug - but consistency in how attributes are handled (admitedly this isn't a strong consistency - some are handled this way, some aren't) & consistently seeing the same state for an attribute for a given function seems useful.

@probinson - thoughts?

To me, this is the key bit:

But nodebug as it exists does affect the debuginfo of callers as well, so I cannot really express that I don't want debuginfo for the implementation of a function and leave its callers unaffected?

because it does affect the callers, the programmer introducing the API should be aware of that. Making this case an error helps them to understand that this attribute is actually a part of their API and not an implementation detail, and silently applying the attribute may cause hard-to-debug problems for them after deployment of their API.

In D104777#2838211, @aaron.ballman wrote:

In D104777#2837794, @dblaikie wrote:

In D104777#2837347, @brunodefraine wrote:

In D104777#2836669, @dblaikie wrote:

Yeah, all that sounds reasonable to me - @brunodefraine could you look into supporting nodebug in a similar way as @aaron.ballman has described here?

Since the debuginfo for use() is slightly affected by the nodebug version of t1() that follows it, I can see how this back propagation is perhaps dangerous. Checking that nodebug is the same on all declarations of a function is a way to prevent this.

But when discussing the PR, @probinson wrote "I'm inclined to think we want this to work" and I can see what he means from the use case where I observed the bug. If you don't want debuginfo for the implementation of t1(), it should be fine to annotate just the function definition in an implementation file, not the declaration in a header, since the debuginfo of the implementation is not of the caller's concern. But nodebug as it exists does affect the debuginfo of callers as well, so I cannot really express that I don't want debuginfo for the implementation of a function and leave its callers unaffected?

I can see the convenience there, to be sure, being able to put the attribute directly on the function you want to debug - but consistency in how attributes are handled (admitedly this isn't a strong consistency - some are handled this way, some aren't) & consistently seeing the same state for an attribute for a given function seems useful.

@probinson - thoughts?

To me, this is the key bit:

But nodebug as it exists does affect the debuginfo of callers as well, so I cannot really express that I don't want debuginfo for the implementation of a function and leave its callers unaffected?

because it does affect the callers, the programmer introducing the API should be aware of that. Making this case an error helps them to understand that this attribute is actually a part of their API and not an implementation detail, and silently applying the attribute may cause hard-to-debug problems for them after deployment of their API.

Yeah - the effect on callers is fairly minimal when it comes to DWARF. It changes whether DW_TAG_call_sites are emitted for the function - those call sites can help describe tail calls (which isn't nothing - in the absence of this info you can have missing stack frames, I think) but also mostly helps by describing "entry values" (if a caller stashes a value before calling a function, the call site can record where the extra copy was stored - then the callee might be able to describe the value of its parameters in terms of those entry values even if the callee's copy has been clobbered) - that these features would be dropped only in the translation unit that's compiling the implementation of the function, but not in every calling TU, is not a great loss - adding the attribute to the header would remove this extra information from everywhere (but at the expense of a longer build, having to rebuild all the dependencies).

Hmm - yeah, nodebug is more the sort of thing you put on a function because it's not an abstraction you want to debug - so I /think/ it makes sense to put it on the declaration in the header, rather than only on the implementation (I was going to say I thought it'd be the thing someone might do more iteratively/interactively - but that's "optnone" - where you'd want to put it on a function during a test/debug loop and wouldn't want to have to rebuild the whole project because it's more one-off, but "nodebug" isn't intended for that same sort of one-off use case)

Eh, I could see it going either way...

Following suggestion by @dblaikie and @aaron.ballman I'm preparing a new diff where a function redeclaration that adds nodebug is flagged as an error in Sema::mergeDeclAttributes.

This triggers test failures because of violations in builtin header files, for example:

In file included from clang/test/CodeGen/ms-intrinsics.c:18:
build/lib/clang/13.0.0/include/intrin.h:451:24: error: function declared with 'nodebug' attribute was previously declared without the 'nodebug' attribute
static __inline__ void __DEFAULT_FN_ATTRS __movsb(unsigned char *__dst,
                       ^
build/lib/clang/13.0.0/include/intrin.h:37:62: note: expanded from macro '__DEFAULT_FN_ATTRS'
#define __DEFAULT_FN_ATTRS __attribute__((__always_inline__, __nodebug__))
                                                             ^
build/lib/clang/13.0.0/include/intrin.h:80:6: note: previous declaration is here
void __movsb(unsigned char *, unsigned char const *, size_t);
     ^

From what I understand from clang/lib/Headers/intrin.h the functions are declared in general at the beginning of the headers, then implemented for specific architectures as inline function containing assembly. I do not know if it is appropriate to mark the general declarations as nodebug...

That I already encounter this in builtin headers is probably an indication that the new error will break some code in the wild. A pragmatic solution would be to only raise the error in Sema::mergeDeclAttributes if Old->isUsed(), i.e. an earlier declaration without nodebug can be forgiven if not yet used. I think that would still make it impossible to trigger the crash?

In D104777#2844077, @brunodefraine wrote:

That I already encounter this in builtin headers is probably an indication that the new error will break some code in the wild.

Fair point, breaking backwards compatibility isn't great.

A pragmatic solution would be to only raise the error in Sema::mergeDeclAttributes if Old->isUsed(), i.e. an earlier declaration without nodebug can be forgiven if not yet used. I think that would still make it impossible to trigger the crash?

I'm not too fussed about the original patch - just figured it might be nicer/cleaner to require a consistent view of the attribute value. I'm not sure it's worth splitting hairs on "it can be inconsistent but only in this narrow way".

I can appreciate that only invalidating the cases that were crashes before avoids increasing the feature surface area more than necessary - but I'm not sure it makes for a more ergonomic feature. (though, equally - the idea that adding a definition to a translation unit would modify the DWARF produced for calls to the function that were previously fine is weird too - but I guess we already had that property if you added the definition at the start of the translation unit)

Let's just go with the original patch then - thanks for your patience/working through the alternatives!

This revision is now accepted and ready to land.Jun 28 2021, 8:45 AM

Slightly improved test case (both C and C++).

Harbormaster completed remote builds in B111309: Diff 354922.Jun 28 2021, 9:57 AM

Closed by commit rG4d8871a898b3: PR50767: clear non-distinct debuginfo for function with nodebug definition… (authored by brunodf, committed by jeroen.dobbelaere). · Explain WhyJun 29 2021, 1:34 AM

This revision was automatically updated to reflect the committed changes.

jeroen.dobbelaere added a commit: rG4d8871a898b3: PR50767: clear non-distinct debuginfo for function with nodebug definition….

Revision Contents

Path

Size

clang/

lib/

CodeGen/

CodeGenFunction.cpp

10 lines

test/

CodeGen/

attr-nodebug2.c

34 lines

Diff 355141

clang/lib/CodeGen/CodeGenFunction.cpp

Show First 20 Lines • Show All 1,298 Lines • ▼ Show 20 Lines	void CodeGenFunction::GenerateCode(GlobalDecl GD, llvm::Function *Fn,
const CGFunctionInfo &FnInfo) {		const CGFunctionInfo &FnInfo) {
const FunctionDecl *FD = cast<FunctionDecl>(GD.getDecl());		const FunctionDecl *FD = cast<FunctionDecl>(GD.getDecl());
CurGD = GD;		CurGD = GD;

FunctionArgList Args;		FunctionArgList Args;
QualType ResTy = BuildFunctionArgList(GD, Args);		QualType ResTy = BuildFunctionArgList(GD, Args);

// Check if we should generate debug info for this function.		// Check if we should generate debug info for this function.
if (FD->hasAttr<NoDebugAttr>())		if (FD->hasAttr<NoDebugAttr>()) {
DebugInfo = nullptr; // disable debug info indefinitely for this function		// Clear non-distinct debug info that was possibly attached to the function
		// due to an earlier declaration without the nodebug attribute
		if (Fn)
		Fn->setSubprogram(nullptr);
		// Disable debug info indefinitely for this function
		DebugInfo = nullptr;
		}

// The function might not have a body if we're generating thunks for a		// The function might not have a body if we're generating thunks for a
// function declaration.		// function declaration.
SourceRange BodyRange;		SourceRange BodyRange;
if (Stmt *Body = FD->getBody())		if (Stmt *Body = FD->getBody())
BodyRange = Body->getSourceRange();		BodyRange = Body->getSourceRange();
else		else
BodyRange = FD->getLocation();		BodyRange = FD->getLocation();
▲ Show 20 Lines • Show All 1,380 Lines • Show Last 20 Lines

clang/test/CodeGen/attr-nodebug2.c

This file was added.

				// RUN: %clang_cc1 -x c -debug-info-kind=limited -debugger-tuning=gdb -dwarf-version=4 -O -disable-llvm-passes -emit-llvm -o - %s \| FileCheck %s
				// RUN: %clang_cc1 -x c++ -debug-info-kind=limited -debugger-tuning=gdb -dwarf-version=4 -O -disable-llvm-passes -emit-llvm -o - %s \| FileCheck %s

				#ifdef __cplusplus
				extern "C" {
				#endif

				void t1();

				void use() { t1(); }

				__attribute__((nodebug)) void t1() {
				int a = 10;
				a++;
				}

				#ifdef __cplusplus
				}
				#endif

				// CHECK-LABEL: define{{.*}} void @use()
				// CHECK-SAME: !dbg
				// CHECK-SAME: {
				// CHECK: !dbg
				// CHECK: }

				// PR50767 Function __attribute__((nodebug)) inconsistency causes crash
				// illegal (non-distinct) !dbg metadata was being added to _Z2t1v definition

				// CHECK-LABEL: define{{.*}} void @t1()
				// CHECK-NOT: !dbg
				// CHECK-SAME: {
				// CHECK-NOT: !dbg
				// CHECK: }