This is an archive of the discontinued LLVM Phabricator instance.

Add a key method to Sema to optimize debug info size
ClosedPublic

Authored by rnk on Nov 15 2019, 2:00 PM.

Details

Summary

It turns out that the debug info describing the Sema class is an
appreciable percentage of the total object file size of objects in Sema.
By adding a key function, clang is able to optimize the debug info size
by emitting a forward declaration in TUs that do not define the key
function.

On Windows, with clang-cl, these are the total object file sizes before
and after this change when compiling with optimizations and debug info:

before: 335,012 KB
after:  278,116 KB
delta:  -56,896 KB
percent: -17.0%

The effect on link time was negligible, despite having ~56MB less input.

On Linux, with clang, these are the same sizes using DWARF -g and
optimizations:

before: 603,756 KB
after:  515,340 KB
delta:  -88,416 KB
percent: -14.6%

I didn't use type units, DWARF-5, fission, or any other special flags.

Event Timeline

rnk created this revision. Nov 15 2019, 2:00 PM
Herald added a project: Restricted Project. Nov 15 2019, 2:00 PM
Herald added a subscriber: aprantl.
thakis accepted this revision. Nov 15 2019, 11:47 PM

I don't see any reason not to do this. What's there to discuss? I'm probably missing something obvious.

dblaikie added anchor functions in many places a while ago (but iirc for vtables, not debug info).

This revision is now accepted and ready to land. Nov 15 2019, 11:47 PM

PS: nice find!

I don't see any reason not to do this. What's there to discuss? I'm probably missing something obvious.

Eh, it's a bit quirky - adds production code (albeit a very small amount) only to improve debug build properties. I'm not super averse to it - though would like @rsmith to weigh in before committing to it.

dblaikie added anchor functions in many places a while ago (but iirc for vtables, not debug info).

Yeah, that was just following the rules (& a little pedantry/boredom): https://llvm.org/docs/CodingStandards.html#provide-a-virtual-method-anchor-for-classes-in-headers - it'd be interesting to see how much those are actually worth in object size with and without debug info.

rnk added a comment. Nov 17 2019, 8:32 AM

I don't see any reason not to do this. What's there to discuss? I'm probably missing something obvious.

I guess I was thinking about enabling this only in +asserts builds, so we pay zero overhead in release builds. I was also thinking that if we do implement the "constructor is key for class debug info" flag in the near term, this becomes obsolete. But it's not that much code churn, and it reduces DWARF size with GCC. I guess we could land it after all. :)

In D70340#1748975, @rnk wrote:

I guess I was thinking about enabling this only in +asserts builds, so we pay zero overhead in release builds. I was also thinking that if we do implement the "constructor is key for class debug info" flag in the near term, this becomes obsolete. But it's not that much code churn, and it reduces DWARF size with GCC. I guess we could land it after all. :)

With the overhead being the cost of a single vtable with one entry? Or is there more?

rnk added a comment. Nov 18 2019, 3:51 PM

With the overhead being the cost of a single vtable with one entry? Or is there more?

I guess I worry about the extra dead vtable pointer in Sema. But, I don't think it matters. I think we should do this. I'll re-upload with comments and update the description.
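
For a rough sense of the overhead discussed here, a minimal sketch compares object sizes with and without a single virtual method. `Plain` and `WithAnchor` are hypothetical stand-ins, not Sema itself. On common ABIs the per-object cost is one pointer, plus one small vtable in the binary.

```cpp
#include <cstddef>

// Hypothetical stand-in for a class with no virtual methods.
struct Plain {
  void *Payload;
};

// Same data plus one virtual method. It is defined inline here only to keep
// the sketch in a single file; a real key method is defined out of line.
struct WithAnchor {
  virtual void anchor() {}
  void *Payload;
};

// Per-object cost of the hidden vtable pointer. Assumption: this equals
// sizeof(void *) on mainstream ABIs, but the standard does not guarantee it.
constexpr std::size_t VPtrOverhead = sizeof(WithAnchor) - sizeof(Plain);
```

For a class like Sema that is instantiated roughly once per compilation, a cost of this size would be hard to notice.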

rnk edited the summary of this revision. Nov 18 2019, 3:51 PM
rnk updated this revision to Diff 229943. Nov 18 2019, 3:53 PM
  • comment
hans added a comment. Nov 19 2019, 1:24 AM

Nice!

Silly questions, but for my own education: I thought the key function concept only existed in the Itanium ABI, but from your numbers it sounds like it's a concept, at least for debug info, also on Windows?

clang/include/clang/Sema/Sema.h
335

I worry that this is going to look obscure to most readers passing through. Maybe it could be expanded to more explicitly spell out that it reduces the size of the debug info?

rnk added a comment. Nov 19 2019, 11:38 AM

Nice!

Silly questions, but for my own education: I thought the key function concept only existed in the Itanium ABI, but from your numbers it sounds like it's a concept, at least for debug info, also on Windows?

There's sort of two things going on:

  • -flimit-debug-info: if a type has a vtable, debug info for the class is only emitted where the vtable is emitted, on the assumption that we believe the vtable will be in the program somewhere.
  • key functions in the ABI: these optimize object file size by avoiding the need to emit the vtable in as many places.

The -flimit-debug-info behavior is cross-platform and happens regardless of whether the class has a key function. So, clang only emits a forward declaration of Foo in the debug info for this program, regardless of target:

struct Foo {
  Foo();
  ~Foo();
  virtual void f() {}
};
Foo *makeFoo() { return new Foo(); }

-flimit-debug-info would emit complete type info if the constructor (which touches the vtable) was inline.
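
To make the two mechanisms concrete, here is a minimal sketch of the pattern, not the actual Sema code. `Widget`, `anchor`, and the file names in the comments are hypothetical. The class needs no virtual dispatch; the single virtual method exists only so the class has a vtable, and defining that method (and the constructor) out of line gives the vtable one home TU.

```cpp
#include <type_traits>

// --- widget.h (hypothetical header, included by many TUs) ---
class Widget final {
  // Key method: declared here, defined out of line in exactly one TU.
  // Its only job is to give the class a vtable with a single home.
  virtual void anchor();

public:
  Widget();  // out of line, so other TUs never reference the vtable
  int value() const { return Value; }

private:
  int Value;
};

// --- widget.cpp (hypothetical, the one TU that owns the vtable) ---
// Under -flimit-debug-info the complete class description is expected to be
// emitted only in this TU; other TUs get a forward declaration.
Widget::Widget() : Value(42) {}
void Widget::anchor() {}

// The anchor makes the class polymorphic even though nothing overrides it.
static_assert(std::is_polymorphic<Widget>::value, "Widget has a vtable");
```

One way to check the effect, assuming a DWARF target, is to compile a TU that only includes the header with -g and look for `DW_AT_declaration` on the class in the llvm-dwarfdump output.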


I'll try to land this today, I think it's worth doing. If anyone thinks it's too much of a hack, let me know.

rnk added a comment. Nov 19 2019, 12:37 PM

Oh, yeah, I forgot this causes tons of -Wdelete-non-virtual-dtor warnings, so I'll have to look into that before landing.
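
A minimal sketch of why the warning fires and how `final` (added in the next diff) avoids it. `Engine` and `useEngine` are hypothetical names. Once a class has a virtual method but no virtual destructor, deleting it through a pointer is flagged because the pointer could refer to a derived object; marking the class `final` rules that out.

```cpp
#include <memory>
#include <type_traits>

// `final` guarantees no derived class exists, so deleting through Engine*
// always runs the right destructor. Without `final`, the delete inside
// unique_ptr below is the kind of code -Wdelete-non-virtual-dtor flags.
class Engine final {
  virtual void anchor();  // key method, present only for debug info size

public:
  int id() const { return 7; }
};

void Engine::anchor() {}

// Owns and destroys an Engine through a plain Engine pointer.
inline int useEngine() {
  std::unique_ptr<Engine> E(new Engine());
  return E->id();
}
```

The alternative would be a virtual destructor, which adds a second vtable entry for a class that is never used polymorphically.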

rnk updated this revision to Diff 230130. Nov 19 2019, 12:43 PM
  • add final, tweak comment
rnk marked an inline comment as done. Nov 19 2019, 12:44 PM
rnk added inline comments.
clang/include/clang/Sema/Sema.h
335

I want to keep it concise: most readers shouldn't need to know what this is, and they can look up technical terms like "key method". I'll say "debug info" instead of "type info", though, since that should be more obvious.

This revision was automatically updated to reflect the committed changes.

I guess my point is: a better comment would have saved me some time. Basically point out that the 'debug' info for the whole type is emitted with a virtual method, and that non-virtual types have it emitted in every TU. Also that this causes it to be emitted in only 1 place, since now there is only a single virtual method definition in a single TU.

clang/include/clang/Sema/Sema.h
335

FWIW, I just ran into this and did a double/triple take: a 'virtual' function in a 'final' type that doesn't inherit from anything looked like nonsense.

The only way I found out what this meant (googling "key method" did very little for me here) was to run 'git blame', which led me to this review. The ONLY place that explains what is happening here is the comment you made here: https://reviews.llvm.org/D70340#1752192

rnk added a subscriber: akhuang. Apr 26 2021, 6:39 PM
rnk added inline comments.
clang/include/clang/Sema/Sema.h
335

Sorry, I went ahead and wrote better comments in rG6d78c38986fa0974ea0b37e66f8cb89b256f4e0d.

Re: key functions, this is where the idea is documented:
https://itanium-cxx-abi.github.io/cxx-abi/abi.html#vague-vtable
They control where the vtable is emitted. We have this style rule to take advantage of them:
https://llvm.org/docs/CodingStandards.html#provide-a-virtual-method-anchor-for-classes-in-headers
However, the existing rule has to do with RTTI and vtables, which doesn't make any sense for Sema.

The idea that class debug info is tied to the vtable is "known", but not well documented. It is mentioned maybe once in the user manual:
https://clang.llvm.org/docs/UsersManual.html#cmdoption-fstandalone-debug
I couldn't find any GCC documentation about this behavior, so we're doing better. :)

@akhuang has been working on the constructor homing feature announced here:
https://blog.llvm.org/posts/2021-04-05-constructor-homing-for-debug-info/
So maybe in the near future we won't need this hack.

erichkeane added inline comments. Apr 27 2021, 5:54 AM
clang/include/clang/Sema/Sema.h
335

Thanks! That at least makes it clear that a virtual function in a non-inheriting final type is intentional and not just silliness. I appreciate the change!