[ELF] Add --strip-debug-non-line option
Needs Review · Public

Authored by luciang on May 8 2018, 10:16 PM.

Details

Summary

Add support for gold's --strip-debug-non-line option, which was added to gold in May 2008 (see: https://sourceware.org/ml/binutils/2008-05/msg00232.html)

Stripping strategy:

  • .debug_info
    • .debug_info is usually the largest debug section, but for file:lineno info we don't need most of the DIEs in it.
    • The top-level DIE corresponds to the CU and carries the path to the source file (DW_AT_comp_dir/DW_AT_name).
    • Keep top-level compilation unit DIEs and skip children (recursively).
  • .debug_abbrev - only keep DW_TAG_compile_unit abbreviations for these top-level compilation unit DIEs
  • .debug_aranges - update the offsets into .debug_info, which change once the children of each CU DIE are stripped.
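
As a rough illustration of the .debug_aranges fix-up (a sketch, not the code in this patch): if each reduced unit is just its header followed by its CU DIE, the new unit offsets are a running sum of the reduced sizes. The names UnitInfo and remapUnitOffsets are hypothetical, and the sizes are assumed to be known from an earlier parsing step.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical description of one compilation unit in the input .debug_info.
struct UnitInfo {
  uint64_t OldOffset;  // offset of the unit header before stripping
  uint64_t HeaderSize; // 11 for a DWARF v4, 32-bit unit header
  uint64_t CuDieSize;  // size of the top-level DW_TAG_compile_unit DIE only
};

// Maps each old unit offset to its offset in the reduced section, assuming
// every reduced unit is exactly header + CU DIE with no children kept.
std::map<uint64_t, uint64_t>
remapUnitOffsets(const std::vector<UnitInfo> &Units) {
  std::map<uint64_t, uint64_t> Remap;
  uint64_t NewOffset = 0;
  for (const UnitInfo &U : Units) {
    assert(Remap.count(U.OldOffset) == 0 && "duplicate unit offset");
    Remap[U.OldOffset] = NewOffset;
    NewOffset += U.HeaderSize + U.CuDieSize;
  }
  return Remap;
}
```

With two units that each have an 11-byte header and a 31-byte CU DIE, the second unit lands at 0x2a, which agrees with the llvm-dwarfdump output in the test plan below. Each debug_info_offset field in .debug_aranges would then be rewritten through this map.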

Test Plan: check-lld: added single-CU and multi-CU tests

Also tested manually that gdb is able to print file:lineno info.

$ cat foo.cpp
#include <stdexcept>
void foo() { throw std::exception(); }

$ cat bar.cpp
void foo();
int main() { foo(); }
$ clang++ -g -gdwarf-aranges   -c -o bar.o bar.cpp
$ clang++ -g -gdwarf-aranges   -c -o foo.o foo.cpp
$ clang++ foo.o bar.o -fuse-ld=lld -Xlinker --no-threads -Xlinker --strip-debug-non-line -o bar-lld-strip-debug-non-line
$ llvm-dwarfdump bar-lld-strip-debug-non-line
bar-lld-strip-debug-non-line:	file format ELF64-x86-64

.debug_info contents:
0x00000000: Compile Unit: length = 0x00000026 version = 0x0004 abbr_offset = 0x0000 addr_size = 0x08 (next unit at 0x0000002a)

0x0000000b: DW_TAG_compile_unit
              DW_AT_producer	("clang version 9.0.0 (https://github.com/llvm/llvm-project 01a99c0aa5ae5be47ea62bd6c87ca6bb63f5a454)")
              DW_AT_language	(DW_LANG_C_plus_plus)
              DW_AT_name	("bar.cpp")
              DW_AT_stmt_list	(0x00000000)
              DW_AT_comp_dir	("/home/lucian/local/github/llvm-project/test")
              DW_AT_low_pc	(0x0000000000201100)
              DW_AT_high_pc	(0x000000000020110d)
0x0000002a: Compile Unit: length = 0x00000026 version = 0x0004 abbr_offset = 0x0000 addr_size = 0x08 (next unit at 0x00000054)

0x00000035: DW_TAG_compile_unit
              DW_AT_producer	("clang version 9.0.0 (https://github.com/llvm/llvm-project 01a99c0aa5ae5be47ea62bd6c87ca6bb63f5a454)")
              DW_AT_language	(DW_LANG_C_plus_plus)
              DW_AT_name	("foo.cpp")
              DW_AT_stmt_list	(0x00000042)
              DW_AT_comp_dir	("/home/lucian/local/github/llvm-project/test")
              DW_AT_low_pc	(0x0000000000000000)
              DW_AT_ranges	(0x00000000
                 [0x0000000000201110, 0x0000000000201154)
                 [0x0000000000201160, 0x0000000000201181))
$ llvm-dwarfdump --verify bar-lld-strip-debug-non-line  | less
Verifying bar-lld-strip-debug-non-line:	file format ELF64-x86-64
Verifying .debug_abbrev...
Verifying .debug_info Unit Header Chain...
Verifying .debug_info references...
Verifying .debug_types Unit Header Chain...
No errors.
$ gdb ./bar-lld-strip-debug-non-line-compress-debug-sections -batch  --ex run --ex bt --ex quit
3.6.3rc1+ (default, Feb  5 2019, 15:51:57)
[GCC 5.x 20180625 (Facebook) 5.5.0+]
Script information not found in binary, assuming oldest version
Type "fbload" to load fb-specific gdb extensions.
terminate called after throwing an instance of 'std::exception'
  what():  std::exception

Program received signal SIGABRT, Aborted.
0x00007ffff7225277 in raise () from /lib64/libc.so.6
#0  0x00007ffff7225277 in raise () from /lib64/libc.so.6
#1  0x00007ffff7226968 in abort () from /lib64/libc.so.6
#2  0x00007ffff7b347d5 in __gnu_cxx::__verbose_terminate_handler() () from /lib64/libstdc++.so.6
#3  0x00007ffff7b32746 in ?? () from /lib64/libstdc++.so.6
#4  0x00007ffff7b32773 in std::terminate() () from /lib64/libstdc++.so.6
#5  0x00007ffff7b32993 in __cxa_throw () from /lib64/libstdc++.so.6
#6  0x0000000000201154 in foo() () at foo.cpp:2
#7  0x0000000000201109 in main () at bar.cpp:2
A debugging session is active.

	Inferior 1 [process 3496864] will be killed.

Quit anyway? (y or n) [answered Y; input not from terminal]

Event Timeline

Does gold really preserve .debug_info and .debug_abbrev? Generally .debug_info is by far the largest DWARF section and so the one you most likely want to remove.

Sorry for letting this languish a bit. I took some time to experiment. Long story short I think llvm-dwarfdump and lldb aren't able to verify this option at the moment.

As for validating the debug info, what about using 'llvm-dwarfdump -verify'? In fact running that on the output of this new option resulted in errors, so I'll address those.

It turns out that both gold --strip-debug-non-line and the implementation of ld.lld --strip-debug-non-line in this patch output debug info that (1) llvm-dwarfdump -verify reports errors with, and (2) causes SIGSEGVs in lldb.

I'll demonstrate with program bar.cpp:

#include <stdexcept>

void foo() {
  throw std::exception();
}

int main() {
  foo();
}

Compiling and linking this program with Clang trunk and GNU gold version 2.25.1 results in invalid DWARF info, according to llvm-dwarfdump:

$ clang++ bar.cpp -g -fuse-ld=gold -Xlinker --strip-debug-non-line -o bar-gold-strip-debug-non-line
$ llvm-dwarfdump -verify bar-gold-strip-debug-non-line
Verifying bar-gold-strip-debug-non-line:        file format ELF64-x86-64
Verifying .debug_abbrev...
Verifying .debug_info Unit Header Chain...
error: DW_AT_ranges offset is beyond .debug_ranges bounds:

0x0000000b: DW_TAG_compile_unit
              DW_AT_producer    ("clang version 7.0.0 (http://llvm.org/git/clang.git 1aad2818adcb106eb0b350c8c9028b75a055647e) (http://llvm.org/git/llvm.git e7343867622e65294892168ec85edf426e5c3430)")
              DW_AT_language    (DW_LANG_C_plus_plus)
              DW_AT_name        ("bar.cpp")
              DW_AT_stmt_list   (0x00000000)
              DW_AT_comp_dir    ("/data/users/bgesiak/Source/fb/llvm/build")
              DW_AT_GNU_pubnames        (true)
              DW_AT_low_pc      (0x0000000000000000)
              DW_AT_ranges      (0x00000000)

Verifying .debug_info references...
Errors detected.

The exact same error occurs with this patch:

$ clang++ bar.cpp -g -fuse-ld=lld -Xlinker --strip-debug-non-line -o bar-lld-strip-debug-non-line
$ llvm-dwarfdump -verify bar-lld-strip-debug-non-line
Verifying bar-lld-strip-debug-non-line: file format ELF64-x86-64
Verifying .debug_abbrev...
Verifying .debug_info Unit Header Chain...
error: DW_AT_ranges offset is beyond .debug_ranges bounds:

0x0000000b: DW_TAG_compile_unit
              DW_AT_producer    ("clang version 7.0.0 (http://llvm.org/git/clang.git 1aad2818adcb106eb0b350c8c9028b75a055647e) (http://llvm.org/git/llvm.git e7343867622e65294892168ec85edf426e5c3430)")
              DW_AT_language    (DW_LANG_C_plus_plus)
              DW_AT_name        ("bar.cpp")
              DW_AT_stmt_list   (0x00000000)
              DW_AT_comp_dir    ("/data/users/bgesiak/Source/fb/llvm/build")
              DW_AT_GNU_pubnames        (true)
              DW_AT_low_pc      (0x0000000000000000)
              DW_AT_ranges      (0x00000000)

Verifying .debug_info references...
Errors detected.

I think llvm-dwarfdump points to a legitimate error in both of these cases, because when I attempt to use lldb trunk to place a breakpoint in the program, lldb crashes:

$ lldb -- bar-gold-strip-debug-non-line
(lldb) target create "bar-gold-strip-debug-non-line"
Current executable set to 'bar-gold-strip-debug-non-line' (x86_64).
(lldb) b main
Stack dump:
0.      HandleCommand(command = "b main")
1.      HandleCommand(command = "breakpoint set --name 'main'")
fish: “bin/lldb -- bar-gold-strip-debu…” terminated by signal SIGSEGV (Address boundary error)

$ lldb -- bar-lld-strip-debug-non-line
(lldb) target create "bar-lld-strip-debug-non-line"
Current executable set to 'bar-lld-strip-debug-non-line' (x86_64).
(lldb) b main
Stack dump:
0.      HandleCommand(command = "b main")
1.      HandleCommand(command = "breakpoint set --name 'main'")
fish: “bin/lldb -- bar-lld-strip-debug…” terminated by signal SIGSEGV (Address boundary error)

gdb works just fine with both of these programs, however. It doesn't print line numbers in its backtrace, even if the program is linked with gold, like I would have expected. But the line numbers are still present in the debug info, and symbolizers like this one are capable of producing stack traces with those line numbers included.

If dwarfdump can't find a problem with the output (and there is a problem with it) then that's a bug in dwarfdump. If you want to do functional testing, you'll probably want to pipe the result through something like llvm-symbolizer or lldb as an end-to-end test.

On the contrary, it turns out that llvm-dwarfdump does find problems with the output, and if crashing lldb is any indication, there are actually problems with the output -- both with this patch, and with gold. Would the correct next step be to patch lldb and llvm-dwarfdump? Is there something else I've neglected to consider? Feedback welcome!

Does gold really preserve .debug_info and .debug_abbrev? Generally .debug_info is by far the largest DWARF section and so the one you most likely want to remove.

Good point. gold doesn't remove those sections completely, but it does prune them. For example, here's the difference between gold and gold --strip-debug-non-line for a simple program: https://reviews.llvm.org/P8082

modocache updated this revision to Diff 148153. May 22 2018, 8:38 PM

Thanks again for the reviews. I removed the extraneous else, and reverted the string switch back to what I had originally.

Does gold really preserve .debug_info and .debug_abbrev? Generally .debug_info is by far the largest DWARF section and so the one you most likely want to remove.

Good point. gold doesn't remove those sections completely, but it does prune them. For example, here's the difference between gold and gold --strip-debug-non-line for a simple program: https://reviews.llvm.org/P8082

Ah ha. And if the compile-unit DIE has DW_AT_ranges but .debug_ranges has been eliminated, that's a verifier error. LLDB probably won't like it either. Maybe the stripping function needs to become a little smarter.

christylee commandeered this revision. Nov 30 2018, 10:07 AM
christylee added a reviewer: modocache.
christylee added a subscriber: christylee.

I spoke with @modocache and I'm going to take a crack at this.

luciang commandeered this revision. Jun 21 2019, 1:55 PM
luciang added a reviewer: christylee.
Herald added a project: Restricted Project. Jun 21 2019, 1:55 PM
Herald added a subscriber: MaskRay.
luciang updated this revision to Diff 206079. Jun 21 2019, 2:22 PM

Update to strip data from .debug_abbrev, .debug_info, .debug_aranges

Excellent, thank you! One of the comments on this diff mentioned using llvm-dwarfdump --verify to test whether the debug info generated by this option is valid. Have you tried doing so? Could you add a test case to this patch?

modocache requested changes to this revision. Jun 21 2019, 2:53 PM
This revision now requires changes to proceed. Jun 21 2019, 2:53 PM
luciang updated this revision to Diff 206148. Jun 23 2019, 5:53 PM

add single-cu and multi-cu tests

luciang edited the summary of this revision. Jun 23 2019, 6:10 PM
luciang edited the summary of this revision. Jun 23 2019, 6:15 PM
MaskRay added inline comments. Jun 23 2019, 7:40 PM
lld/ELF/Config.h
47

Since there is a total order, this can be ordered by the strip level: None, DebugNonLine, Debug, All.
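
A small sketch of what that ordering buys, assuming the enumerators are declared from least to most aggressive. The enum name StripLevel and the helper stripsAtLeast are made up for illustration and are not taken from the patch.

```cpp
#include <cassert>

// Hypothetical enum, declared in increasing order of how much is stripped.
enum class StripLevel { None, DebugNonLine, Debug, All };

// With a total order, "is this level at least as aggressive as X" becomes a
// single comparison instead of a chain of equality tests.
inline bool stripsAtLeast(StripLevel Actual, StripLevel Wanted) {
  return Actual >= Wanted;
}
```

For example, a check such as "drop the non-line debug sections" would hold for DebugNonLine, Debug, and All, but not for None.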

lld/ELF/Driver.cpp
1851

Do you need .debug_rnglists?

lld/ELF/OutputSections.cpp
234

Consider using the double dash form since that is what will be in the manpage.

lld/ELF/OutputSections.h
116

Is size 1 ReducedDebugData common? If not, a container other than SmallVector<char, 1> may be better.

lld/ELF/Writer.cpp
595

double-dash form

lld/test/ELF/Inputs/strip-debug-non-line-multi-cu-bar.s
7

The comment markers are misaligned. Do you mix tabs and spaces? That'll render the comments badly on Phabricator.

lld/test/ELF/strip-debug-non-line-multi-cu.s
117

misaligned

lld/test/ELF/strip-debug-non-line.s
56

misaligned

209

.ident, .note.GNU-stack are unnecessary

MaskRay removed a subscriber: MaskRay.
luciang marked 4 inline comments as done. Jun 23 2019, 8:37 PM

lld/ELF/Driver.cpp
1851

I'll add a test for dwarf 5.

The gdb I have locally didn't print line number info with dwarf 5. I'll try a newer version of gdb.

lld/ELF/OutputSections.h
116

I used the same container as for CompressedData below. I'll use std::vector. Anything is fine here as I only allocate memory once.

lld/test/ELF/Inputs/strip-debug-non-line-multi-cu-bar.s
7

This file was created using clang++. I'll convert the tabs to spaces. I thought it was preferred to keep this as close to the original clang output as possible.

lld/test/ELF/strip-debug-non-line.s
209

I got this from clang output.

I thought I saw similar in other tests.

I didn't want to edit strings to avoid messing up offsets, but these are fair game to remove. Will do.

MaskRay added inline comments. Jun 23 2019, 9:13 PM
lld/test/ELF/strip-debug-non-line.s
209

If it is not lots of trouble, it'd be nice to minimize the tests. That will emphasize the important parts and make tests easier to understand. Thanks!

MaskRay added inline comments. Jun 23 2019, 9:14 PM
lld/test/ELF/strip-debug-non-line.s
211

.addrsig is also not necessary (it cannot be consumed by GNU as).

MaskRay added inline comments. Jun 24 2019, 1:11 AM
lld/ELF/Driver.cpp
1853
if (S->Name.consume_front(".debug_") || S->Name.consume_front(".zdebug_")) {
  ...
}
lld/ELF/OutputSections.cpp
205

Neither vector::clear nor SmallVector::clear deallocates the storage. Did you intend to call shrink_to_fit()?
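
To make the distinction concrete, here is a minimal standalone sketch using std::vector (the function and struct names are hypothetical). clear() leaves the capacity alone, shrink_to_fit() is only a non-binding request, and swapping with an empty temporary hands the old storage to the temporary, which frees it when it is destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Capacities observed at two points, for illustration only.
struct CapacityReport {
  size_t AfterClear;
  size_t AfterSwap;
};

CapacityReport demoRelease(size_t N) {
  std::vector<char> Buf(N, 'x');
  Buf.clear(); // size becomes 0, the allocation is kept
  size_t AfterClear = Buf.capacity();
  std::vector<char>().swap(Buf); // Buf takes over the empty temporary's storage
  return {AfterClear, Buf.capacity()};
}
```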

lld/ELF/Writer.cpp
1073

Delete this helper.

Use endian::write(OS, UINT32_C(0xffffffff), Endian); below.

(this utility is defined in llvm/Support/EndianStream.h)
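
For readers without the header at hand, this is roughly what such a write amounts to. It is a standalone sketch that appends to a std::string rather than a raw_ostream; writeU32 is a hypothetical name, not the LLVM utility itself.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Appends V to Out as 4 bytes, least significant byte first when IsLE is
// true and most significant byte first otherwise.
void writeU32(std::string &Out, uint32_t V, bool IsLE) {
  for (int I = 0; I < 4; ++I) {
    int Shift = IsLE ? 8 * I : 8 * (3 - I);
    Out.push_back(static_cast<char>((V >> Shift) & 0xff));
  }
}
```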

1078

NonLine. The option name treats non-line separately.

1083
for (OutputSection *Sec : OutputSections) {
  if (Sec->Name == ".debug_abbrev")
    AbbrSec = Sec;
  else if ...
}
1103

Config->IsLE

1161

SmallString<0>

The allocation is likely unavoidable. Don't waste stack space.

1181

NextInfoOffset is only used once. Just InfoOffset += Length below.

1219

delete {}

lld/test/ELF/strip-debug-non-line-multi-cu.s
6

--long (instead of -long) is preferred in llvm-readobj.

73

.byte 8 # DW_FORM_string allows you to inline the DW_AT_producer string. Or delete it if it is not necessary (the clang version string is too long but probably not relevant to this feature)

lld/test/ELF/strip-debug-non-line.s
3

-o %t.o

It is more natural to use .o for object files. Then use %t for the executable.

6

llvm-dwarfdump -verify returns 1 if there is any error, so you can omit | FileCheck.

11

.zdebug_str is not possible in the output. So you can just check Name: .debug_str

16

The *-NOT directive checks a string doesn’t occur between two matches. This doesn't guarantee .debug_macinfo is absent.

You may need: --implicit-check-not=.debug_macinfo --implicit-check-not=.debug_types

72

The # string offset= comments are not necessary - the literal offsets are not referenced in this file.

MaskRay added a comment. Edited Jun 24 2019, 2:11 AM

It seems --strip-debug-non-line is not popular (internally we have never used this option...nor can I find its use case in any open-source project). The full output section .debug_info is produced then reduced, so I bet it will not help with the memory usage. So the benefit is a smaller output size.

However, there is another way to retain line table information, at the compiler level: -gmlt. I wonder whether -gmlt + regular lld link will be a better alternative than -g + lld --strip-debug-non-line... Another thing is that if we care about output sizes, we will likely use -gsplit-dwarf. I don't know how this option will interact with -gsplit-dwarf. .debug*.dwo sections are ignored by the linkers... This option may be useless when -gsplit-dwarf is used.

Note, this task does not have to be done by the linker (no input section is read): ld + objcopy -R .debug_foo -R .debug_bar can discard unnecessary debug sections as well, though a separate tool will be needed if we want to reduce .debug_abbrev and .debug_info.

@luciang @modocache @christylee Have you thought about these options and compared the output sizes/memory usage of them?

luciang updated this revision to Diff 206180. Jun 24 2019, 3:20 AM
luciang marked an inline comment as done.
luciang edited the summary of this revision.

convert tabs to spaces, align comments, remove some noise from tests, add dwarf5 test

luciang updated this revision to Diff 206198. Jun 24 2019, 5:16 AM
luciang marked 30 inline comments as done.

So the benefit is a smaller output size.

Yes. That's what we're currently using gold + strip-debug-non-line for: to reduce binary sizes in a context where a very large number of very large binaries are produced and executed.

We've also considered the items you brought up (building and deploying a separate tool to strip out the debug info after creation, creating .o files with a minimal set of debug info), but went with this approach for these reasons:

  • lld feature parity with gold
    • based on the initial response to an earlier version of the diff, it seemed desirable
    • the flag is useful on its own
  • producing large binaries, then reading and rewriting them at a reduced size, is a waste of IO / time
  • the patch seemed reasonable in size and intrusiveness -- I intentionally kept the implementation in a single function, avoided adding new types or passing maps or structs between phases, and leveraged llvm libraries as much as I could.
  • the input .o files come from a shared cache and have rich debug info, and we're not using split-dwarf
luciang updated this revision to Diff 206199. Jun 24 2019, 5:25 AM

simplify tests more

Then how about -gmlt? If you compile the program (say a.c) twice, once with -g and once with -gmlt, then you link the program twice. The -g link gets full debug info, while the -gmlt link naturally gets smaller input and produces smaller output.

If you compile the program once with -g and expect to get two programs, one with full debug info and the other with just enough debug info to retain line tables, you can link it once and then postprocess the program with another tool.

In neither case is a linker option needed.

If you compile with -g but never use the full debug info, that is the case where --strip-debug-non-line becomes handy. However, why can't the program be compiled with -gmlt in the first place?

I tried digging up some history and it seems that --strip-debug-non-line may be a (legacy) solution that predates -gmlt (-g1) and -gsplit-dwarf. My understanding is that with either compiler option, the linker option becomes significantly less useful.

lld/ELF/Driver.cpp
1853

This is not done.

lld/ELF/Writer.cpp
1073

This is not done.

1181

This is not done.

Sorry, I forgot I didn't submit comments.

lld/ELF/Driver.cpp
1853

Can't use consume_front: that changes the Name of the section.

/// Returns true if this StringRef has the given prefix and removes that
/// prefix.
bool consume_front(StringRef Prefix) {
  if (!startswith(Prefix))
    return false;
 
  *this = drop_front(Prefix.size());
  return true;
}
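
A non-mutating check avoids the problem. The sketch below uses std::string so it stands alone; hasPrefix and isDebugSectionName are hypothetical names, and in lld the equivalent would presumably be built on StringRef::startswith, which appears in the snippet above.

```cpp
#include <cassert>
#include <string>

// Returns true if S begins with Prefix. S is left untouched.
bool hasPrefix(const std::string &S, const std::string &Prefix) {
  return S.compare(0, Prefix.size(), Prefix) == 0;
}

// True for both regular and zlib-gnu style compressed debug section names.
bool isDebugSectionName(const std::string &Name) {
  return hasPrefix(Name, ".debug_") || hasPrefix(Name, ".zdebug_");
}
```
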
lld/ELF/OutputSections.cpp
205

I used clear here not to free memory, but to signal that ReducedDebugData doesn't hold information anymore -- it has been moved to CompressedData.

I did call shrink_to_fit as you suggested in the std::string version, but later switched to SmallString and swapped with a temporary to free memory.

lld/ELF/OutputSections.h
116

I can't use a std::vector as I mentioned above. I couldn't find a raw_ostream implementation that prints to a std::vector.

I only found https://llvm.org/doxygen/classllvm_1_1raw__string__ostream.html

  • raw_svector_ostream which prints to a SmallVector
  • raw_string_ostream which prints to a std::string

I'll go with a std::string here.

lld/ELF/Writer.cpp
1073

Unfortunately I can't include that header as it defines another Writer class which clashes with the one defined in this file:

This is what happens if I include it:

$ git diff
diff --git a/lld/ELF/Writer.cpp b/lld/ELF/Writer.cpp
index 6f16cf21daf..b3c54b47251 100644
--- a/lld/ELF/Writer.cpp
+++ b/lld/ELF/Writer.cpp
@@ -28,6 +28,7 @@
 #include "llvm/BinaryFormat/Dwarf.h"
 #include "llvm/DebugInfo/DWARF/DWARFFormValue.h"
 #include "llvm/Support/DataExtractor.h"
+#include "llvm/Support/EndianStream.h"
 #include "llvm/Support/LEB128.h"
 #include "llvm/Support/RandomNumberGenerator.h"
 #include "llvm/Support/SHA1.h"
/home/lucian/local/github/llvm-project/lld/ELF/Writer.cpp: In function ‘void lld::elf::writeResult()’:
/home/lucian/local/github/llvm-project/lld/ELF/Writer.cpp:152:49: error: reference to ‘Writer’ is ambiguous
 template <class ELFT> void elf::writeResult() { Writer<ELFT>().run(); }
                                                 ^
/home/lucian/local/github/llvm-project/lld/ELF/Writer.cpp:51:29: note: candidates are: template<class ELFT> class {anonymous}::Writer
 template <class ELFT> class Writer {
                             ^
In file included from /home/lucian/local/github/llvm-project/lld/ELF/Writer.cpp:31:0:
/home/lucian/local/github/llvm-project/llvm/include/llvm/Support/EndianStream.h:51:8: note:                 struct llvm::support::endian::Writer
 struct Writer {
        ^
/home/lucian/local/github/llvm-project/lld/ELF/Writer.cpp:152:60: error: expected primary-expression before ‘>’ token
 template <class ELFT> void elf::writeResult() { Writer<ELFT>().run(); }
                                                            ^
/home/lucian/local/github/llvm-project/lld/ELF/Writer.cpp:152:62: error: expected primary-expression before ‘)’ token
 template <class ELFT> void elf::writeResult() { Writer<ELFT>().run(); }
                                                              ^

I could re-name the Writer class in this file, but wanted to keep the diff focused on the task.

1161

I had all of these as SmallString<1, char> earlier and switched to std::string after a previous comment. I misunderstood your intent there. I switched all of them to SmallString<0>.

1181

InfoOffset is updated by the getU8 & co. calls.

http://dwarfstd.org/doc/DWARF5.pdf
7.5.1.1 Full and Partial Compilation Unit Headers

unit_length (initial length)
A 4-byte or 12-byte unsigned integer representing the length of the
.debug_info contribution for that compilation unit, not including the length
field itself. In the 32-bit DWARF format, this is a 4-byte unsigned integer
(which must be less than 0xfffffff0); in the 64-bit DWARF format, this
consists of the 4-byte value 0xffffffff followed by an 8-byte unsigned
integer that gives the actual length (see Section 7.4 on page 196).

Length is the length of this CU + all its children. I don't copy the children into the reduced buffer, just the DIE for the CU.

At the end of the loop below, InfoOffset will only point to the end of the DIE corresponding to this CU. I need it to skip the children too.

As written here NextInfoOffset will point to the next CU. I'll keep this behavior.
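
The arithmetic from the quoted spec text can be sketched like this. It assumes a little-endian section held in a byte vector; readLE and nextUnitOffset are hypothetical helpers, not functions from the patch or from LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Reads a Size-byte little-endian unsigned integer starting at Off.
uint64_t readLE(const std::vector<uint8_t> &Data, uint64_t Off, unsigned Size) {
  assert(Off + Size <= Data.size() && "read past end of section");
  uint64_t V = 0;
  for (unsigned I = 0; I < Size; ++I)
    V |= static_cast<uint64_t>(Data[Off + I]) << (8 * I);
  return V;
}

// Returns the offset of the unit that follows the one starting at Off.
// unit_length does not count the length field itself, so the field size
// (4 bytes, or 12 bytes in the 64-bit DWARF format) is added back.
uint64_t nextUnitOffset(const std::vector<uint8_t> &Info, uint64_t Off) {
  uint64_t Length = readLE(Info, Off, 4);
  if (Length == 0xffffffffu) // 64-bit DWARF escape value
    return Off + 12 + readLE(Info, Off + 4, 8);
  return Off + 4 + Length;
}
```

This agrees with the dump in the summary, where a unit at 0x0 with length = 0x26 is followed by the next unit at 0x2a.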

Then how about -gmlt?

Sorry, I tried to reply in this bullet point but wasn't very clear:

input .o come from a shared cache and have rich debug info

We have two kinds of .o:

  • always built from source - we could use -gmlt here (even though it would be preferable to fetch them from a shared cache).
  • prebuilt .o with full debug info - (eg. prebuilt external third-party code managed by an *inflexible* tp management system).

This option is a bit weird to me because:

  • In binutils-gdb, only two commits were specific to gold/reduced_debug_output.cc. This suggests to me that either the feature works really reliably, or it has stayed unused since then.
  • This feature requires partial DWARF rewriting. I think this is against the spirit of ELF. .debug_abbrev and .debug_info get rewritten, then references to them get repaired: fortunately there aren't many! Probably only .debug_info and .debug_aranges can reference .debug_info, and for line tables to work, only .debug_aranges needs adjustment.

I believe the following is the list of debug sections that gold --strip-debug-non-line keeps:

static const char* lines_only_debug_sections[] =
{
  "abbrev",
  // "addr",      // Fission extension
  // "aranges",   // not used by gdb as of 7.4
  // "frame",
  // "gdb_scripts",
  "info",
  // "types",
  "line",
  // "loc",
  // "macinfo",
  // "macro",
  // "pubnames",  // not used by gdb as of 7.4
  // "pubtypes",  // not used by gdb as of 7.4
  // "gnu_pubnames",  // Fission extension
  // "gnu_pubtypes",  // Fission extension
  // "ranges",
  "str",
  "str_offsets",  // Fission extension
};

Note the aranges line that is filtered out. I don't know how users of this feature expect to get a fast symbolizer...

See this part of my question:

If you compile the program (say a.c) twice, once with -g and once with -gmlt, then you link the program twice. The -g link gets full debug info, while the -gmlt link naturally gets smaller input and produces smaller output.

If you compile the program once with -g and expect to get two programs, one with full debug info and the other with just enough debug info to retain line tables, you can link it once and then postprocess the program with another tool.

In neither case is a linker option needed.

luciang added a comment. Edited Jun 24 2019, 6:32 AM

Here's where we use aranges: instead of linearly scanning .debug_info you can jump to the correct CU DIE using info from .debug_aranges

We have internal changes to gold to keep .debug_aranges.
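
A sketch of that lookup, assuming the .debug_aranges tuples have already been decoded into a list sorted by start address. AddrRange and findCuOffset are hypothetical names. A symbolizer would then parse only the CU at the returned offset to reach its DW_AT_stmt_list.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One decoded address range [Low, High) and the .debug_info offset of the
// unit that owns it.
struct AddrRange {
  uint64_t Low;
  uint64_t High;
  uint64_t CuOffset;
};

// Ranges must be sorted by Low and must not overlap. On a hit, stores the
// owning unit's offset in CuOffset and returns true.
bool findCuOffset(const std::vector<AddrRange> &Ranges, uint64_t Addr,
                  uint64_t &CuOffset) {
  // First range whose Low is greater than Addr; the candidate precedes it.
  auto It = std::upper_bound(
      Ranges.begin(), Ranges.end(), Addr,
      [](uint64_t A, const AddrRange &R) { return A < R.Low; });
  if (It == Ranges.begin())
    return false;
  --It;
  if (Addr >= It->High)
    return false;
  CuOffset = It->CuOffset;
  return true;
}
```

Using the ranges from the summary's dump, 0x201109 resolves to the bar.cpp unit at 0x0 and 0x201153 to the foo.cpp unit at 0x2a, in O(log n) comparisons rather than a walk over every unit.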

prebuilt .o with full debug info - (eg. prebuilt external third-party code managed by an *inflexible* tp management system).

Let's assume the .o files will continue to have non-line debug info due to the inflexible tp management system -- properly supporting -gmlt would considerably delay adoption of lld.

You can link it once and then postprocess the program with another tool.

There are a few reasons we chose to do this in lld:

  • RAM use reduction: debug sections are processed and reduced before creating the output file, and the ReducedDebugData are small (from 2-3GiB -> tens of MiB).
  • inefficiency: wasted IO / time: writing huge binaries twice
  • complexity of integrating binary shrinking into the build system

If you do the following before:

# compile once
clang -g -c a.c
# link twice
clang a.o -o a.full
clang a.o -Wl,--strip-debug-non-line -o a.line  # will be changed

I wonder if you can change the second link to reduce-tool a.full -o a.line.

If you do:

# compile twice
clang -g -c a.c -o a.full.o
clang -g1 -c a.c -o a.line.o
# link twice
clang a.full.o -o a.full
clang a.line.o -o a.line   # no -Wl,--strip-debug-non-line is necessary

In neither case is a linker option necessary.

If the tp management system is so inflexible that it can't even change -g to -g1, then you can add a -g1 to override the debug level of -g:

clang -g -c a.c -o a.line.o -g1   # the debug level is decided by the last -g*

On the GCC side, GCC started to emit minimum line tables at -g1 with this patch https://github.com/gcc-mirror/gcc/commit/7fa9fa16198d84fe9354a1adc644c84b3b4dba79
I think it is included in GCC 4.9 though it is not included in the release log.

luciang marked 5 inline comments as done. Jun 25 2019, 12:41 PM

Some clarifications: our third-party management system stores a single version of .o and doesn't allow you to choose between flavors like a.full.o + a.line.o.

Other companies' third-party code is always built from source -- applying whatever compiler options the author needs at that point (-g1 or -g2 as needed), but ours doesn't :(

If I change our TP to use -g1 we lose debug info for TP code.

In many contexts we need rich debug info (e.g. when debugging something or shipping to production).

In other contexts (e.g. in a continuous integration system) we build binaries with minimal debug info (stacktraces + file:line info are sufficient).

  • code built from source uses -g1
  • but not all code is built from source: a significant amount comes from this central TP store, where .o files are built with -g2 and can't be easily adapted to use -g1.

compile once + link twice

clang a.o -o a.full
clang a.o -Wl,--strip-debug-non-line -o a.line # will be changed
I wonder if you can change the second link to reduce-tool a.full -o a.line.

We don't do that.

  • In the CI system we don't create both a.full and a.line binaries: we only create a.line
  • when creating release binaries or binaries for local debugging we create a.full

compile twice + link twice
clang -g -c a.c -o a.full.o
clang -g1 -c a.c -o a.line.o

As I mentioned above the TP system doesn't support a.full.o + a.line.o flavors, just a.o.

luciang added a comment. Edited Jun 25 2019, 12:42 PM

GCC with -g1 produces extra information not necessary for stacktrace + file:line info production:

https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html

Level 1 produces minimal information, enough for making backtraces in parts of the program that you don’t plan to debug. This includes descriptions of functions and external variables, and line number tables, but no information about local variables.

Using GCC 7: note the generation of DW_TAG_subprogram -- so --strip-debug-non-line is still useful in reducing that info

$ echo "int foo() { return 0; } int bar() { return foo(); }" | g++ -g1 -S -x c - -o - > /tmp/test/gcc.s
$ llvm-mc -filetype=obj -triple=x86_64-unknown-linux /tmp/test/gcc.s -o /tmp/test/gcc.o
$ llvm-dwarfdump --debug-info /tmp/test/gcc.o
/tmp/test/gcc.o:	file format ELF64-x86-64

.debug_info contents:
0x00000000: Compile Unit: length = 0x0000005a version = 0x0004 abbr_offset = 0x0000 addr_size = 0x08 (next unit at 0x0000005e)

0x0000000b: DW_TAG_compile_unit
              DW_AT_producer	("GNU C17 7.x 20190403 (Facebook) 8.x -mtune=generic -march=x86-64 -g1")
              DW_AT_language	(DW_LANG_C99)
              DW_AT_comp_dir	("/tmp/test")
              DW_AT_low_pc	(0x0000000000000000)
              DW_AT_high_pc	(0x000000000000001b)
              DW_AT_stmt_list	(0x00000000)

0x00000029:   DW_TAG_subprogram
                DW_AT_external	(true)
                DW_AT_name	("bar")
                DW_AT_decl_file	("/tmp/test/<stdin>")
                DW_AT_decl_line	(1)
                DW_AT_decl_column	(0x1d)
                DW_AT_low_pc	(0x000000000000000b)
                DW_AT_high_pc	(0x000000000000001b)
                DW_AT_frame_base	(DW_OP_call_frame_cfa)
                DW_AT_unknown_2116	(true)

0x00000043:   DW_TAG_subprogram
                DW_AT_external	(true)
                DW_AT_name	("foo")
                DW_AT_decl_file	("/tmp/test/<stdin>")
                DW_AT_decl_line	(1)
                DW_AT_decl_column	(0x05)
                DW_AT_low_pc	(0x0000000000000000)
                DW_AT_high_pc	(0x000000000000000b)
                DW_AT_frame_base	(DW_OP_call_frame_cfa)
                DW_AT_GNU_all_call_sites	(true)

0x0000005d:   NULL

clang does better (I tried clang-7, clang-8) and only produces DW_TAG_compile_unit:

$ echo "int foo() { return 0; } int bar() { return foo(); }" | clang++ -Os -g1 -S -x c - -Xclang -fdebug-compilation-dir -Xclang . -o - > /tmp/test/clang.s
$ llvm-mc -filetype=obj -triple=x86_64-unknown-linux /tmp/test/clang.s -o /tmp/test/clang.o
$ llvm-dwarfdump --debug-info /tmp/test/clang.o
/tmp/test/clang.o:	file format ELF64-x86-64

.debug_info contents:
0x00000000: Compile Unit: length = 0x00000026 version = 0x0004 abbr_offset = 0x0000 addr_size = 0x08 (next unit at 0x0000002a)

0x0000000b: DW_TAG_compile_unit
              DW_AT_producer	("clang version 8.0.20181009 ")
              DW_AT_language	(DW_LANG_C99)
              DW_AT_name	("-")
              DW_AT_stmt_list	(0x00000000)
              DW_AT_comp_dir	(".")
              DW_AT_low_pc	(0x0000000000000000)
              DW_AT_high_pc	(0x0000000000000006)
ot added a subscriber: ot.Jun 25 2019, 3:57 PM

I found that clang 9 (and previous versions) also generates non-CU info in .debug_info with -gmlt:

$ echo 'struct A { A() {} int a = 0; }; struct B { B(); A a; }; B::B() {}' | clang++ -x c++ -O2 -gmlt -o clang-02-gmlt.o -c - && llvm-dwarfdump --debug-info clang-02-gmlt.o
clang-02-gmlt.o:	file format ELF64-x86-64

.debug_info contents:
0x00000000: Compile Unit: length = 0x00000051 version = 0x0004 abbr_offset = 0x0000 addr_size = 0x08 (next unit at 0x00000055)

0x0000000b: DW_TAG_compile_unit
              DW_AT_producer	("clang version 9.0.0 (https://github.com/llvm/llvm-project 01a99c0aa5ae5be47ea62bd6c87ca6bb63f5a454)")
              DW_AT_language	(DW_LANG_C_plus_plus)
              DW_AT_name	("-")
              DW_AT_stmt_list	(0x00000000)
              DW_AT_comp_dir	("/tmp/test")
              DW_AT_low_pc	(0x0000000000000000)
              DW_AT_high_pc	(0x0000000000000007)

0x0000002a:   DW_TAG_subprogram
                DW_AT_name	("A")

0x0000002f:   DW_TAG_subprogram
                DW_AT_low_pc	(0x0000000000000000)
                DW_AT_high_pc	(0x0000000000000007)
                DW_AT_name	("B")

0x00000040:     DW_TAG_inlined_subroutine
                  DW_AT_abstract_origin	(0x0000002a "A")
                  DW_AT_low_pc	(0x0000000000000000)
                  DW_AT_high_pc	(0x0000000000000006)
                  DW_AT_call_file	("/tmp/test/<stdin>")
                  DW_AT_call_line	(1)

0x00000053:     NULL

0x00000054:   NULL

Note that with -O0 or -O1 the DW_TAG_subprogram and DW_TAG_inlined_subroutine DIEs are not generated in .debug_info; with -Os, -O2 and -O3 they are.

$ echo 'struct A { A() {} int a = 0; }; struct B { B(); A a; }; B::B() {}' | clang++ -x c++ -O0 -gmlt -o clang-02-gmlt.o -c - && llvm-dwarfdump --debug-info clang-02-gmlt.o
clang-02-gmlt.o:	file format ELF64-x86-64

.debug_info contents:
0x00000000: Compile Unit: length = 0x00000026 version = 0x0004 abbr_offset = 0x0000 addr_size = 0x08 (next unit at 0x0000002a)

0x0000000b: DW_TAG_compile_unit
              DW_AT_producer	("clang version 9.0.0 (https://github.com/llvm/llvm-project 01a99c0aa5ae5be47ea62bd6c87ca6bb63f5a454)")
              DW_AT_language	(DW_LANG_C_plus_plus)
              DW_AT_name	("-")
              DW_AT_stmt_list	(0x00000000)
              DW_AT_comp_dir	("/tmp/test")
              DW_AT_low_pc	(0x0000000000000000)
              DW_AT_ranges	(0x00000000
                 [0x0000000000000000, 0x000000000000001b)
                 [0x0000000000000000, 0x0000000000000014))

The proposed --strip-debug-non-line is an improvement on top of what current gcc/clang produce with -g1 or -gmlt:

$ echo 'struct A { A() {} int a = 0; }; struct B { B(); A a; }; B::B() {}; int main(){ return 0;}' | clang++ -x c++ -O2 -gmlt -o clang-02-gmlt-strip -fuse-ld=lld -Wl,--strip-debug-non-line - && llvm-dwarfdump --debug-info clang-02-gmlt-strip
clang-02-gmlt-strip:	file format ELF64-x86-64

.debug_info contents:
0x00000000: Compile Unit: length = 0x00000026 version = 0x0004 abbr_offset = 0x0000 addr_size = 0x08 (next unit at 0x0000002a)

0x0000000b: DW_TAG_compile_unit
              DW_AT_producer	("clang version 9.0.0 (https://github.com/llvm/llvm-project 01a99c0aa5ae5be47ea62bd6c87ca6bb63f5a454)")
              DW_AT_language	(DW_LANG_C_plus_plus)
              DW_AT_name	("-")
              DW_AT_stmt_list	(0x00000000)
              DW_AT_comp_dir	("/tmp/test")
              DW_AT_low_pc	(0x0000000000201100)
              DW_AT_high_pc	(0x0000000000201113)

vs

$ echo 'struct A { A() {} int a = 0; }; struct B { B(); A a; }; B::B() {}; int main(){ return 0;}' | clang++ -x c++ -O2 -gmlt -o clang-02-gmlt -fuse-ld=lld  - && llvm-dwarfdump --debug-info clang-02-gmlt
clang-02-gmlt:	file format ELF64-x86-64

.debug_info contents:
0x00000000: Compile Unit: length = 0x00000051 version = 0x0004 abbr_offset = 0x0000 addr_size = 0x08 (next unit at 0x00000055)

0x0000000b: DW_TAG_compile_unit
              DW_AT_producer	("clang version 9.0.0 (https://github.com/llvm/llvm-project 01a99c0aa5ae5be47ea62bd6c87ca6bb63f5a454)")
              DW_AT_language	(DW_LANG_C_plus_plus)
              DW_AT_name	("-")
              DW_AT_stmt_list	(0x00000000)
              DW_AT_comp_dir	("/tmp/test")
              DW_AT_low_pc	(0x0000000000201100)
              DW_AT_high_pc	(0x0000000000201113)

0x0000002a:   DW_TAG_subprogram
                DW_AT_name	("A")

0x0000002f:   DW_TAG_subprogram
                DW_AT_low_pc	(0x0000000000201100)
                DW_AT_high_pc	(0x0000000000201107)
                DW_AT_name	("B")

0x00000040:     DW_TAG_inlined_subroutine
                  DW_AT_abstract_origin	(0x0000002a "A")
                  DW_AT_low_pc	(0x0000000000201100)
                  DW_AT_high_pc	(0x0000000000201106)
                  DW_AT_call_file	("/tmp/test/<stdin>")
                  DW_AT_call_line	(1)

0x00000053:     NULL

0x00000054:   NULL

(I'll look separately into what causes -gmlt + -O2 to generate the extra DIEs.)
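To compare dumps like the ones above without reading them by eye, one option is to tally the DIE tags in the text that llvm-dwarfdump --debug-info prints. This is only a sketch: count_die_tags is a hypothetical helper, and the regular expression assumes the "0x<offset>: DW_TAG_..." line layout shown in the dumps in this thread.

```python
import re
from collections import Counter

# Matches DIE header lines such as "0x0000002a:   DW_TAG_subprogram".
# NULL terminators and "Compile Unit:" header lines do not match.
DIE_LINE = re.compile(r"^0x[0-9a-fA-F]+:\s+(DW_TAG_\w+)", re.MULTILINE)


def count_die_tags(dump_text: str) -> Counter:
    """Count DIEs per tag in llvm-dwarfdump --debug-info text output."""
    return Counter(DIE_LINE.findall(dump_text))


sample = """\
0x00000000: Compile Unit: length = 0x00000051 version = 0x0004

0x0000000b: DW_TAG_compile_unit

0x0000002a:   DW_TAG_subprogram

0x0000002f:   DW_TAG_subprogram

0x00000040:     DW_TAG_inlined_subroutine

0x00000053:     NULL

0x00000054:   NULL
"""

for tag, n in sorted(count_die_tags(sample).items()):
    print(tag, n)
```

A binary linked with --strip-debug-non-line should report only DW_TAG_compile_unit entries, one per CU.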

All of this (the behavior of -gmlt) seems unrelated to this patch, really. I agree with and understand your description: if you really want to be able to compile things once and link them into debug-ish and non-debug-ish forms, something like --strip-debug-non-line sounds nice. I can also understand some push-back against it, since it's necessarily going to produce broken DWARF, except in DWARFv5, where there's an intentional way to support line-table-only debug info: the line table has its own string table, so you can strip /everything/ (including debug_info) except debug_line and debug_line_str, I believe. Perhaps that'd be a good way to implement this in a more principled way: work for DWARFv5 only (it wouldn't necessarily need or want to check the DWARF version, but it would be designed to work correctly for DWARFv5 and be weird/bad/problematic before that).

So, while the behavior of -gmlt doesn't seem too relevant, I can explain what it does and why GCC's and Clang's implementations are designed the way they are. Their goal is to be able to produce correct backtraces, which means describing inlined stack frames. Both GCC and Clang used to describe all functions, but I made changes to Clang to reduce the cost of this feature by only describing functions that have inlining in them (or that are inlined and need to share a description for that purpose). So the simplest test case that always produces more than just the CU DIE is:

void f1();
__attribute__((always_inline)) void f2() {
  f1();
}
void f3() {
  f2();
}

Preserving ".debug_line*" should be fully usable for DWARFv5, and the idea is that ".debug_line*" should be sufficient in future DWARF versions that might invent other new sections related to the line table. For DWARFv4, preserving just .debug_line would lose the explicit address-size (likely you can infer that from other characteristics of the object file) and the compilation directory/root file, which might or might not be referenced within the line table.
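As a rough illustration of the ".debug_line*" rule above, the sketch below filters a list of section names by prefix. The helper name keep_for_line_tables is hypothetical, and leaving non-debug sections untouched is an assumption of the sketch, not something specified in this thread.

```python
def keep_for_line_tables(section_names):
    """Keep non-debug sections and any debug section named .debug_line*."""
    kept = []
    for name in section_names:
        is_debug = name.startswith(".debug_")
        # .debug_line and .debug_line_str both match; a future
        # line-table-related .debug_line* section would match as well.
        if not is_debug or name.startswith(".debug_line"):
            kept.append(name)
    return kept


sections = [".text", ".debug_info", ".debug_abbrev",
            ".debug_line", ".debug_line_str", ".debug_str"]
print(keep_for_line_tables(sections))
```

For DWARFv4 input this filter would also drop .debug_info, which is where the address size and compilation directory mentioned above live, so it is only adequate on its own for DWARFv5.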

Given that part of the use-case is pre-built 3rd party objects with full debug info, you probably can't assume everything is DWARFv5, however, so a solution that works for prior versions is probably the right way to go. It should be comparatively easy to trim a .debug_info compilation unit down to just the DW_TAG_compile_unit DIE, which gets back the missing info I mentioned in the previous paragraph.

For a traceback, this gets you source attributions but not nice subprogram names, which would require the additional inlined subprogram information that David was talking about.
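The trimming of a compilation unit down to its DW_TAG_compile_unit DIE, as described above, mostly comes down to cutting the unit at the offset of the first child DIE and patching unit_length. The sketch below is not the patch's implementation; it assumes a 32-bit DWARF v2-v4 unit header (11 bytes) on a little-endian target, and it assumes the caller already knows the offset of the first child DIE. The name trim_cu_to_root_die is hypothetical. It also does not touch .debug_abbrev: the root DIE's abbreviation would still need to be marked as having no children (or a NULL terminator appended), and offsets in sections such as .debug_aranges would need updating.

```python
import struct

# 32-bit DWARF v2-v4 CU header: unit_length, version,
# debug_abbrev_offset, address_size (little-endian, no padding).
CU_HEADER_FMT = "<IHIB"
CU_HEADER_SIZE = struct.calcsize(CU_HEADER_FMT)  # 11 bytes


def trim_cu_to_root_die(cu: bytes, first_child_offset: int) -> bytes:
    """Return the unit cut just before its first child DIE.

    cu is one whole unit (header included); first_child_offset is
    relative to the start of the unit.
    """
    unit_length, version, abbrev_off, addr_size = struct.unpack_from(
        CU_HEADER_FMT, cu, 0)
    if unit_length == 0xFFFFFFFF:
        raise ValueError("64-bit DWARF is not handled in this sketch")
    if version not in (2, 3, 4):
        raise ValueError("unit header layout differs for this version")
    if not CU_HEADER_SIZE < first_child_offset <= len(cu):
        raise ValueError("first child offset outside the unit")
    # unit_length does not count its own 4 bytes.
    new_length = first_child_offset - 4
    header = struct.pack(CU_HEADER_FMT, new_length, version,
                         abbrev_off, addr_size)
    return header + cu[CU_HEADER_SIZE:first_child_offset]


# Sizes taken from the -O2 -gmlt dump in this thread: length 0x51,
# first child at 0x2a. The DIE bytes here are zero-filled placeholders.
fake_cu = struct.pack(CU_HEADER_FMT, 0x51, 4, 0, 8) + bytes(0x55 - 11)
trimmed = trim_cu_to_root_die(fake_cu, 0x2A)
print(hex(struct.unpack_from("<I", trimmed, 0)[0]))  # 0x26
```

The resulting length of 0x26 matches the stripped dump shown earlier in the thread (length = 0x00000026, next unit at 0x0000002a).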