This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Add flag to only emit referenced member functions
Needs ReviewPublic

Authored by dblaikie on Jun 2 2023, 11:22 AM.

Details

Summary

[DebugInfo] Add flag to only emit referenced member functions

Complete C++ type information can be quite expensive - and there's limited value in representing every member function, even those that can't be called (we don't do similarly for every non-member function anyway). So add a flag to opt out of this behavior for experimenting with this more terse behavior.

I think Sony already does this by default, so perhaps with a change to the defaults, Sony can migrate to this rather than a downstream patch.

This breaks current debuggers in some expected ways - but those breakages are visible without this feature too. Consider member function template instantiations - they can't be consistently enumerated in every translation unit:

a.h:

struct t1 {
  template <int i>
  static int f1() {
    return i;
  }
};
namespace ns {
template <int i>
int f1() {
  return i;
}
}  // namespace ns

a.cpp:

void f1() {
  t1::f1<0>();
  ns::f1<0>();
}

b.cpp:

void f1();
int main() {
  f1();
  t1::f1<1>();
  ns::f1<1>();
}
(gdb) p ns::f1<0>()
$1 = 0
(gdb) p ns::f1<1>()
$2 = 1
(gdb) p t1::f1<0>()
Couldn't find method t1::f1<0>
(gdb) p t1::f1<1>()
$3 = 1
(gdb) s
f1 () at a.cpp:3
3         t1::f1<0>();
(gdb) p t1::f1<0>()
$4 = 0
(gdb) p t1::f1<1>()
Couldn't find method t1::f1<1>
(gdb)

(other similar non-canonical features are implicit special members (copy/move ctor/assignment operator, default ctor) and nested types (eg: pimpl idiom, where the nested type is declared-but-not-defined in one TU, and defined in another TU))

lldb can't parse the template expressions above, so I'm not sure how to test it there, but I'd guess it has similar problems. ( https://stackoverflow.com/questions/64602475/how-to-print-value-returned-by-template-member-function-in-gdb-lldb-debugging so... I guess that's just totally not supported in lldb, how unfortunate. And implicit special members are instantiated implicitly by lldb, so missing those doesn't tickle the same issue)

Some very rudimentary numbers for a clang debug build:
.debug_info section size:
-g: 476MiB
-g -fdebug-types-section: 357MiB
-g -gincomplete-types: 340MiB

Though it also means a major reduction in .debug_str size, -fdebug-types-section doesn't reduce string usage (so the first two examples have the same .debug_str size, 247MiB), down to 175MiB.

Total debug info section size reduction is 13% - though different workloads will have different tradeoffs. (& haven't tested with compression enabled - since most of the win is in strings, and strings are quite compressible, this might not account for much in that case - though not having to decompress inputs to the debugger, etc, is valuable)

Also open to any riffing on the flag name for sure.

@probinson - would this be an accurate upstreaming of your internal handling/would you use this functionality? If it wouldn't be useful to you, it's maybe not worth adding upstream yet - not sure we'll use it at Google, but if it was useful to you folks and meant other folks could test with it it seemed maybe useful.

Diff Detail

Event Timeline

dblaikie created this revision.Jun 2 2023, 11:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 2 2023, 11:22 AM
dblaikie requested review of this revision.Jun 2 2023, 11:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 2 2023, 11:22 AM
dblaikie updated this revision to Diff 527920.Jun 2 2023, 11:30 AM

Updating commit message with extra details

dblaikie edited the summary of this revision. (Show Details)Jun 2 2023, 11:34 AM
dblaikie added a reviewer: rnk.
dblaikie edited the summary of this revision. (Show Details)Jun 2 2023, 11:47 AM

I'm traveling but will look at this on Monday.

rnk added a comment.Jun 2 2023, 1:47 PM

This is pretty cool, even if it's just for data gathering. It's interesting that 29% of .debug_info and .debug_str is for describing class methods, that's pretty significant, even if it's only a 13% reduction across all debug info sections.

I experimented with this. Looks like it emits info only for defined methods, and not used methods. That is, if you change the test to say
void t1::f1() { f2(); }
you get DWARF for f1 but not f2. The way Sony does it, you get DWARF for f1 and f2.
(Neither case sees info for f3, which is declared but not defined or used.)

Is that what you intended?

I experimented with this. Looks like it emits info only for defined methods, and not used methods. That is, if you change the test to say
void t1::f1() { f2(); }
you get DWARF for f1 but not f2. The way Sony does it, you get DWARF for f1 and f2.
(Neither case sees info for f3, which is declared but not defined or used.)

Ah, that's good to understand - didn't realize that's what you folks were doing. Fair enough then.

Is that what you intended?

It is.

What's the particular goal/value in including called-but-not-defined functions? Are your users generally building only parts of their program with debug info & you want it to be complete-ish in the parts that do have debug info? Because if they were building the whole program with debug info, /some/ translation unit would have the definition of the function, and the debug info for it.

oh, side note: We don't emit DWARF descriptions of called non-member functions, generally (well, more often these days with optimized debug info call_site info we do need to emit function declarations for called functions) - so by analogy I'd have thought doing similarly for member functions would be suitable

What's the particular goal/value in including called-but-not-defined functions? Are your users generally building only parts of their program with debug info & you want it to be complete-ish in the parts that do have debug info? Because if they were building the whole program with debug info, /some/ translation unit would have the definition of the function, and the debug info for it.

Actually the goal was to match the behavior of proprietary compilers for previous consoles, knowing that the debugger would be fine with that. I think it's worth taking this idea (only defined methods) back to them, and see what they think. Because your patch is seriously simpler than ours!

My debugger guy says "this shouldn't be a problem."

Given that, my request is that -gincomplete-types should be default-true for DebuggerTuning == SCE if you want to commit this; otherwise I'll redo our downstream patch to match yours.

My debugger guy says "this shouldn't be a problem."

Given that, my request is that -gincomplete-types should be default-true for DebuggerTuning == SCE if you want to commit this; otherwise I'll redo our downstream patch to match yours.

Happy to do the SCE tuning in a follow-up patch.

Is the name good for you? (not that you'd need to interact with it much if it's the default for your target anyway)

I have to say I'm not super excited about "-gincomplete-types" given that "incomplete type" means something different to a C++ user.

"-gdefined-methods-only" ? reads awkwardly in the no- form.
"-gsuppress-undefined-methods" ? which is similar to what we called it.
"-gundefined-methods" ? you'd default-true in that case, the no form would suppress them.

rnk added a comment.Jun 6 2023, 11:23 AM

I tend to think that enabled-by-default options which have names that you are supposed to prefix with no are awkward[1], but the compiler ecosystem (GCC & Clang) has lots of those, and perhaps we should lean into it. I like -g[no-]undefined-methods the best of Paul's suggestions.

[1] -fdelete-null-pointer-checks, anyone? Yes, that's a feature I want in my compiler, please delete my null checks, I didn't need those, sign me up!

Oh, -fstandalone-debug should override this? In the Sony implementation, it does. (Sorry for not remembering that sooner.)

Oh, -fstandalone-debug should override this? In the Sony implementation, it does. (Sorry for not remembering that sooner.)

I'd think of this as orthogonal to that, potentially. by analogy with non-member functions. Even in standalone debug, we wouldn't emit a declaration for every function declared in a translation unit. Ah, because we include the mangled names on member function declarations, having that would still allow you to call the function from another translation unit, if it's available, even without DWARF for that TU, I suppose.

So, while it's inconsistent with non-member functions, which doesn't seem like a good principle - yeah, guess we could imply this option is disabled by -fstandalone-debug...

I tend to think that enabled-by-default options which have names that you are supposed to prefix with no are awkward[1], but the compiler ecosystem (GCC & Clang) has lots of those, and perhaps we should lean into it. I like -g[no-]undefined-methods the best of Paul's suggestions.

[1] -fdelete-null-pointer-checks, anyone? Yes, that's a feature I want in my compiler, please delete my null checks, I didn't need those, sign me up!

Yeah, mixed feelings about a default-on flag, so you have to use -gno-*. Though one name in that form, might be -gno-canonical-types.
(pedantically C++ calls these things "member functions" (& it seems nice to use consistent naming) rather than "methods", but -g[no-]undefined-member-functions would be a bit of a mouthful... )

rnk added a comment.Jun 6 2023, 4:36 PM

Yeah, mixed feelings about a default-on flag, so you have to use -gno-*. Though one name in that form, might be -gno-canonical-types.
(pedantically C++ calls these things "member functions" (& it seems nice to use consistent naming) rather than "methods", but -g[no-]undefined-member-functions would be a bit of a mouthful... )

We don't have great project-wide policy in Clang about whether to use colloquial terminology or standardese in user visible text. Some diagnostics go to great lengths to use the WG21 terminology, and others less so. Our AST node is a CXXMethodDecl, so I'd vote for using "methods" over "member functions".

Could always go with -gsuppress-undefined-methods if you're not happy about default-on options. I don't have a strong opinion.

Basing this on "defined" or "undefined" member functions isn't /perfectly/ correct (though might be correct enough for this flag - though I do prefer something about "canonicality" of types, but perhaps it's too inside-baseball for the average user - though these debug info flags are pretty esoteric at the best of times anyway) - we might emit member function declarations for call sites in DWARF, for instance (seems currently we don't emit call sites for member function calls, but I don't think it's a fundamental issue - just something no one's implemented yet/maybe a heuristic for what's worth describing call sites for) - also, while we don't do it currently, I've just been thinking about template info completeness and we don't currently reference DIEs for non-type-template parameters of pointer type, including function/member function pointers, and it'd be nice if we did point to the actual DIE describing the thing we're pointing to, in which case we'd need to emit some more function declarations there too.

Thoughts?

DianQK added a subscriber: DianQK.Jun 22 2023, 7:03 PM

we might emit member function declarations for call sites in DWARF, for instance

So are you now leaning more toward wanting to emit the "used" declarations as well? I'm sure you don't want to implement it the way we did...

I *think* the template parameter thing has more to do with not supporting expressions in all their glory, although I'm fuzzy on that--last time I looked at that stuff was a number of years ago.

we might emit member function declarations for call sites in DWARF, for instance

So are you now leaning more toward wanting to emit the "used" declarations as well? I'm sure you don't want to implement it the way we did...

There's already infrastructure to do this for non-member functions, and if we do it the way we do for things like member function template instantiations (they don't go in the "members" list, they just list the type as their "scope"/parent) - it should be relatively easy/the same as for non-member functions, I think.

I *think* the template parameter thing has more to do with not supporting expressions in all their glory, although I'm fuzzy on that--last time I looked at that stuff was a number of years ago.

Yeah, there are some cases that are really hairy with whole expression mangling, which is unfortunate and I'm not sure what we could do there short of embed a string with the expression in it... :/ but there are some more cases we could do better at with more semantic information than that, such as the case I mentioned like a non-type template parameter of pointer type pointing to a member function.