This is an archive of the discontinued LLVM Phabricator instance.

[DWARF] Fix dwarf5-index-is-used.cpp
ClosedPublic

Authored by aleksandr.urakov on Aug 24 2018, 1:22 AM.

Details

Summary

dwarf5-index-is-used.cpp have been failing after rL340206, because clang have stopped to emit pubnames by default after that change. Current patch adds -gpubnames option to the clang command line in the test to emit pubnames.

Diff Detail

Repository
rL LLVM

Event Timeline

As far as the strict intention of this test goes, the change is fine, as it is meant to check that the accelerator tables get used *when* they are generated. How do you achieve them being generated is not important.

However, I am not sure now under what circumstances will the accelerator tables be emitted in the first place. David, does this mean that we will now not emit .debug-names even if one specifies -glldb? Was that intentional?

As far as the strict intention of this test goes, the change is fine, as it is meant to check that the accelerator tables get used *when* they are generated. How do you achieve them being generated is not important.

However, I am not sure now under what circumstances will the accelerator tables be emitted in the first place. David, does this mean that we will now not emit .debug-names even if one specifies -glldb? Was that intentional?

Not especially intentional - but I clearly didn't give it quite enough thought. Mostly I was modelling the behavior of GCC (no pubnames by default - you can opt-in to them (& split-dwarf opts in by default, but one thing GCC didn't do was allow them to be turned off again - which is what I wanted)).

As for the default behavior for DWARFv5 - have you run much in the way of performance numbers on how much debug_names speeds things up? From what I could see with a gdb-index (sort of like debug_names - but linker generated, so it's a single table for the whole program) it didn't seem to take GDB long to parse/build up its own index compared to using the one in the file. So it seemed to me like the extra link time, object size, etc, wasn't worth it in the normal case. The really bad case for me is split-DWARF (worse with a distributed filesystem) - where the debugger has to read all the .dwo files & might have a high latency filesystem for each file it reads... super slow. But if the debug info was either in the executable (not split) or in a DWP (split then packaged), it seemed pretty brief.

But if LLDB has different performance characteristics, or the default should be different for other reasons - I'm fine with that. I think I left it on for Apple so as not to mess with their stuff because of the MachO/dsym sort of thing that's a bit different from the environments I'm looking at.

As far as the strict intention of this test goes, the change is fine, as it is meant to check that the accelerator tables get used *when* they are generated. How do you achieve them being generated is not important.

However, I am not sure now under what circumstances will the accelerator tables be emitted in the first place. David, does this mean that we will now not emit .debug-names even if one specifies -glldb? Was that intentional?

Not especially intentional - but I clearly didn't give it quite enough thought. Mostly I was modelling the behavior of GCC (no pubnames by default - you can opt-in to them (& split-dwarf opts in by default, but one thing GCC didn't do was allow them to be turned off again - which is what I wanted)).

As for the default behavior for DWARFv5 - have you run much in the way of performance numbers on how much debug_names speeds things up? From what I could see with a gdb-index (sort of like debug_names - but linker generated, so it's a single table for the whole program) it didn't seem to take GDB long to parse/build up its own index compared to using the one in the file. So it seemed to me like the extra link time, object size, etc, wasn't worth it in the normal case. The really bad case for me is split-DWARF (worse with a distributed filesystem) - where the debugger has to read all the .dwo files & might have a high latency filesystem for each file it reads... super slow. But if the debug info was either in the executable (not split) or in a DWP (split then packaged), it seemed pretty brief.

But if LLDB has different performance characteristics, or the default should be different for other reasons - I'm fine with that. I think I left it on for Apple so as not to mess with their stuff because of the MachO/dsym sort of thing that's a bit different from the environments I'm looking at.

These are the numbers from my llvm-dev email in June:

setting a breakpoint on a non-existing function without the use of
accelerator tables:
real 0m5.554s
user 0m43.764s
sys 0m6.748s

setting a breakpoint on a non-existing function with accelerator tables:
real 0m3.517s
user 0m3.136s
sys 0m0.376s

This is an extreme case, because practically the only thing we are doing is building the symbol index, but it's nice for demonstrating the amount of work that lldb needs to do without it. In practice, the ratio will not be this huge most of the time, because we will usually find some matches, and then will have to do some extra work, which will add a constant overhead to both sides. However, this means that the no-accel case will take even longer. I am not sure how this compares to gdb numbers, but I think the difference here is significant.

Also, I am pretty sure the Apple folks, who afaik are in the process of switching to debug_names, will want to have them on by default for their targets (ping @aprantl, @JDevlieghere). I think the cleanest way (and one that best reflects the reality) to achieve that would be to have -glldb imply -gpubnames. As for whether we should emit debug_names for DWARF 5 by default (-gdwarf-5 => -gpubnames) is a more tricky question, and I don't have a clear opinion on that (however, @probinson might).

(In either case, I agree that there are circumstances in which having debug_names is not beneficial, so having a flag to control them is a good idea).

But if LLDB has different performance characteristics, or the default should be different for other reasons - I'm fine with that. I think I left it on for Apple so as not to mess with their stuff because of the MachO/dsym sort of thing that's a bit different from the environments I'm looking at.

These are the numbers from my llvm-dev email in June:

setting a breakpoint on a non-existing function without the use of
accelerator tables:
real 0m5.554s
user 0m43.764s
sys 0m6.748s

setting a breakpoint on a non-existing function with accelerator tables:
real 0m3.517s
user 0m3.136s
sys 0m0.376s

This is an extreme case,

What was being tested here? Is it a realistic case (ie: not a pathalogical case with an absurd number of symbols, etc)? Using ELF? Fission or not?

How's it compare to GDB performance, I wonder? Perhaps LLDB hasn't been optimized for the non-indexed case and could be - though that'd still potentially motivate turning on indexes by default for LLDB until someone has a motivation to do any non-indexed performance tuning/improvements.

because practically the only thing we are doing is building the symbol index, but it's nice for demonstrating the amount of work that lldb needs to do without it. In practice, the ratio will not be this huge most of the time, because we will usually find some matches, and then will have to do some extra work, which will add a constant overhead to both sides. However, this means that the no-accel case will take even longer. I am not sure how this compares to gdb numbers, but I think the difference here is significant.

Also, I am pretty sure the Apple folks, who afaik are in the process of switching to debug_names, will want to have them on by default for their targets (ping @aprantl, @JDevlieghere). I think the cleanest way (and one that best reflects the reality) to achieve that would be to have -glldb imply -gpubnames. As for whether we should emit debug_names for DWARF 5 by default (-gdwarf-5 => -gpubnames) is a more tricky question, and I don't have a clear opinion on that (however, @probinson might).

(In either case, I agree that there are circumstances in which having debug_names is not beneficial, so having a flag to control them is a good idea).

*nod* Happy if you want to (or I can) have clang set pubnames on by default when tuning for LLDB, while allowing -gno-pubnames to turn them off. (& maybe we should have another alias for that flag, since debug_names isn't "pubnames", but that's pretty low-priority)

But if LLDB has different performance characteristics, or the default should be different for other reasons - I'm fine with that. I think I left it on for Apple so as not to mess with their stuff because of the MachO/dsym sort of thing that's a bit different from the environments I'm looking at.

These are the numbers from my llvm-dev email in June:

setting a breakpoint on a non-existing function without the use of
accelerator tables:
real 0m5.554s
user 0m43.764s
sys 0m6.748s

setting a breakpoint on a non-existing function with accelerator tables:
real 0m3.517s
user 0m3.136s
sys 0m0.376s

This is an extreme case,

What was being tested here? Is it a realistic case (ie: not a pathalogical case with an absurd number of symbols, etc)? Using ELF? Fission or not?

How's it compare to GDB performance, I wonder? Perhaps LLDB hasn't been optimized for the non-indexed case and could be - though that'd still potentially motivate turning on indexes by default for LLDB until someone has a motivation to do any non-indexed performance tuning/improvements.

The performance is huge for large binaries. LLDB doesn't need to manually index all DWARF if the accelerator tables are there, it does if they are not. Many people complain about the time it takes to index the DWARF, so avoiding will be a huge improvement. We have some server binaries that statically link in libc, libstdc++ and many other binaries and they are currently 11GB, with over 10GB of that being DWARF. It makes debugging with GDB and LLDB unusable when manual indexing of all that DWARF is needed. They have .debug_types enabled where they can on these binaries, but LTO tends to be the culprit that can't take advantage of the .debug_types so the binaries are still huge.

GDB performance is hard to compare to due to the way they import types. They import all types in a compile unit at the same time and have 3 layers of resolution as they parse. Their indexes, if created by a linker, just say "foo" exists in this compile unit somewhere, not the exact DIE offset, so their indexes are useful, but they don't help LLDB out as much as the DWARF 5 versions do.

because practically the only thing we are doing is building the symbol index, but it's nice for demonstrating the amount of work that lldb needs to do without it. In practice, the ratio will not be this huge most of the time, because we will usually find some matches, and then will have to do some extra work, which will add a constant overhead to both sides. However, this means that the no-accel case will take even longer. I am not sure how this compares to gdb numbers, but I think the difference here is significant.

Also, I am pretty sure the Apple folks, who afaik are in the process of switching to debug_names, will want to have them on by default for their targets (ping @aprantl, @JDevlieghere). I think the cleanest way (and one that best reflects the reality) to achieve that would be to have -glldb imply -gpubnames. As for whether we should emit debug_names for DWARF 5 by default (-gdwarf-5 => -gpubnames) is a more tricky question, and I don't have a clear opinion on that (however, @probinson might).

(In either case, I agree that there are circumstances in which having debug_names is not beneficial, so having a flag to control them is a good idea).

*nod* Happy if you want to (or I can) have clang set pubnames on by default when tuning for LLDB, while allowing -gno-pubnames to turn them off. (& maybe we should have another alias for that flag, since debug_names isn't "pubnames", but that's pretty low-priority)

.debug_pubnames and .debug_pubtypes are not useful for LLDB or any debugger in general so please don't enable for LLDB. They only include public functions (no statics functions for functions in the anonymous namespace) and types. This means LLDB ignores these sections and manually indexes the DWARF, just like GDB ignores them.

I would vote to enable the DWARF5 accelerator tables for -glldb by default if possible.

labath added a comment.Sep 1 2018, 8:05 AM

But if LLDB has different performance characteristics, or the default should be different for other reasons - I'm fine with that. I think I left it on for Apple so as not to mess with their stuff because of the MachO/dsym sort of thing that's a bit different from the environments I'm looking at.

These are the numbers from my llvm-dev email in June:

setting a breakpoint on a non-existing function without the use of
accelerator tables:
real 0m5.554s
user 0m43.764s
sys 0m6.748s

setting a breakpoint on a non-existing function with accelerator tables:
real 0m3.517s
user 0m3.136s
sys 0m0.376s

This is an extreme case,

What was being tested here? Is it a realistic case (ie: not a pathalogical case with an absurd number of symbols, etc)? Using ELF? Fission or not?

These numbers are from linux (so ELF), without fission, and with -fstandalone-debug. The binary being debugged is a debug build of clang. The only somewhat pathological aspect of this is that I deliberately chose a function name not present in the binary when setting the breakpoint.

How's it compare to GDB performance, I wonder? Perhaps LLDB hasn't been optimized for the non-indexed case and could be - though that'd still potentially motivate turning on indexes by default for LLDB until someone has a motivation to do any non-indexed performance tuning/improvements.

Yes, I could very well be that LLDB is optimized for the indexed case (or perhaps the other way around, maybe the accelerator tables are optimized for the kind of searches that lldb likes to do). Though I don't mean to say that the non-indexed case is not optimized too -- a number of people over the years have spent a lot of time trying to make the index building step as fast as possible. However, these were all very low-level optimizations (the "improve parallelism by reducing lock contention" kind) and it's possible we might come up with something with better performance characteristics if we looked at a bigger picture. However, that would probably require redesigning a lot of things.

So, with that in mind, I agree we should enable debug_names for -glldb. I'll create a patch for that.

labath accepted this revision.Sep 1 2018, 8:10 AM

As for the original patch, which we've kind of hijacked, I think it is a good idea to commit this for now.

Once the clang patch is in, I'll exchange these flags for -glldb.

This revision is now accepted and ready to land.Sep 1 2018, 8:10 AM
This revision was automatically updated to reflect the committed changes.