This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy] Implement --only-keep-debug
Abandoned, Public

Authored by arichardson on Feb 19 2018, 10:45 AM.

Details

Summary

This implementation removes much more than either GNU binutils or
elftoolchain. The only downside seems to be that GDB prints lots of
warnings like this: warning: section .init not found in ../clang.debug
when loading the file but debugging seems to work just fine.

This fixes https://bugs.llvm.org/show_bug.cgi?id=36266

Event Timeline

arichardson created this revision. Feb 19 2018, 10:45 AM
jakehehrlich added a comment. Edited Feb 19 2018, 12:02 PM

Hmmm, I don't know enough about how debuggers work to know if this is OK. I spoke with Roland McGrath about this a while back and he seemed to think the NOBITS sections were needed. This is coming from the guy who championed the --strip-sections feature, which is a pretty aggressive form of stripping. If this more aggressive approach is OK then we should do it this way, but otherwise I'll have to reject this change unless it does the conversion to NOBITS. If we do go with the more aggressive approach, we should probably get rid of the warnings in GDB.

There might be a better way to do this now that we have the Writer setup. Instead of converting to NOBITS we can just not write those sections out and we can just write things out a little bit differently.

For reference, Jake's original version was here: https://reviews.llvm.org/D40523

Not being someone who knows much about GDB or other debuggers, I can't really comment on the validity of this patch. What I will say is that the predicate here is significantly different (and more complex) than the original predicate Jake's version used (which marked only allocatable segments for removal/replacement). It's also quite a bit different from the inverse of the --strip-debug switch, although that might just indicate a problem with that switch, rather than with the implementation here.

It seems like changing all alloc sections to NOBITS is probably the correct way of doing this. So I guess D40523 adjusted to always keep the named debug sections is better.
I don't think any of them are SHF_ALLOC anyway but checking the names shouldn't do any harm.
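
A minimal sketch of that rule, assuming a simplified in-memory model of section headers. The `Section` class, the `only_keep_debug` function and the default `keep_prefixes` value are hypothetical names for illustration, not llvm-objcopy's API; only the `SHT_*`/`SHF_*` values come from the ELF specification:

```python
from dataclasses import dataclass, replace

# ELF constants from the generic ABI.
SHT_PROGBITS = 1
SHT_NOBITS = 8
SHF_ALLOC = 0x2


@dataclass(frozen=True)
class Section:
    """Hypothetical, simplified stand-in for an ELF section header."""
    name: str
    sh_type: int
    sh_flags: int


def only_keep_debug(sections, keep_prefixes=(".debug",)):
    """Return a copy of `sections` with allocated sections turned into NOBITS.

    Sections whose name starts with one of `keep_prefixes` are left alone,
    even if they happen to carry SHF_ALLOC. The default prefix is only a
    placeholder for whatever name list the tool settles on.
    """
    result = []
    for sec in sections:
        is_named_debug = sec.name.startswith(tuple(keep_prefixes))
        if (sec.sh_flags & SHF_ALLOC) and not is_named_debug:
            # Keep the header (name, flags) but mark it as having no file contents.
            sec = replace(sec, sh_type=SHT_NOBITS)
        result.append(sec)
    return result
```

The name check is applied before the flag check takes effect, so a debug section that unexpectedly has SHF_ALLOC set would still keep its contents.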

Looking at the source code it seems like bfd keeps sections starting with: ".debug", ".zdebug", ".gdb_index", ".line", ".stab", ".gnu.linkonce.wi."
and elftoolchain keeps: ".apple_", ".debug", ".gnu.linkonce.wi.", ".line", ".stab"
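
To make the difference between the two lists concrete, here is a small sketch of a prefix-based predicate. The prefixes are copied from the comment above and have not been re-checked against either code base; the function and constant names are made up for illustration:

```python
# Prefixes as quoted in the review comment (assumption: transcribed correctly).
BFD_DEBUG_PREFIXES = (
    ".debug", ".zdebug", ".gdb_index", ".line", ".stab", ".gnu.linkonce.wi.",
)
ELFTOOLCHAIN_DEBUG_PREFIXES = (
    ".apple_", ".debug", ".gnu.linkonce.wi.", ".line", ".stab",
)


def is_debug_section(name, prefixes):
    """True if the section name starts with any of the given prefixes."""
    return name.startswith(prefixes)


def prefix_differences():
    """Return (only_in_bfd, only_in_elftoolchain) as sorted lists."""
    bfd = set(BFD_DEBUG_PREFIXES)
    etc = set(ELFTOOLCHAIN_DEBUG_PREFIXES)
    return sorted(bfd - etc), sorted(etc - bfd)
```

With these lists, a compressed section such as `.zdebug_info` matches only the bfd prefixes, while `.apple_names` matches only the elftoolchain ones.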

I can ask some people about this stuff and get back to everyone on this. Like James, I have no clue if this is valid or why the old tools converted allocated sections to NOBITS. As for --strip-debug, we should do some testing to see if .zdebug, .gdb_index, etc. are removed by it. Making --strip-debug more aggressive where possible is probably a good thing. Funny anecdote: my main use case for --strip-debug is to tell other people to use it so that they can send me large binaries via email and I can still see the symbol table and relocations.

The following is speculation:
I think there are two use cases for --only-keep-debug that require the NOBITS sections.

  1. If you pre-link the stripped binary then things might be relocated and thus the debug information would need to have access to the original sections in order to translate. I vaguely remember Roland talking about a case like this.
  2. If you use --strip-all/--strip-sections with the right --keep options you might still be able to debug the stripped binary due to extra information in the debug binary. If this is possible then --strip-debug should become as aggressive as it can be without actually preventing this from happening.

So in my cursory findings I have confirmed that prelinking is a known use case for this. Also, the fact that debuggers emit warnings on this is reason enough to not accept this sort of change. It seems like there should in theory be ways to make something like this work, but it would need a) debugger support and b) a design proposal. I'm happy to participate in new inventions, but we can't just make --only-keep-debug do something which is known to behave differently from GNU objcopy in ways that are trivially visible.

So the basic method here will have to be rejected. Paths to resolve this include at least the following:

  1. Propose a new standard for how debugging information should be collected by the debugger. Get community approval. Decide on a new flag name, and implement that.
  2. Modify llvm-objcopy to properly handle the conversion to NOBITS. I think this can now be done more simply by not performing the conversion in memory but instead laying out the file differently and writing out the section header table a bit differently.
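
A rough sketch of what option 2 could look like, under a simplified model that is not llvm-objcopy's actual Writer. `SectionLayout`, `align_up` and `layout` are hypothetical names. The idea is that a section chosen for stripping keeps its size in the section header but consumes no bytes in the output file:

```python
from dataclasses import dataclass

SHT_PROGBITS = 1
SHT_NOBITS = 8  # value from the ELF generic ABI


@dataclass
class SectionLayout:
    """Hypothetical, simplified view of the header fields that layout touches."""
    name: str
    sh_type: int
    size: int        # sh_size, preserved even when written as NOBITS
    align: int = 1   # sh_addralign
    offset: int = 0  # sh_offset, assigned by layout()


def align_up(value, align):
    """Round `value` up to the next multiple of `align` (align <= 1 is a no-op)."""
    if align <= 1:
        return value
    return (value + align - 1) // align * align


def layout(sections, start_offset, stripped_names):
    """Assign file offsets in order and return the end offset of the contents.

    Sections named in `stripped_names` are emitted as NOBITS: their header
    still records the original size, but no file bytes are reserved for them.
    """
    offset = start_offset
    for sec in sections:
        if sec.name in stripped_names:
            sec.sh_type = SHT_NOBITS
        if sec.sh_type == SHT_NOBITS:
            sec.offset = offset  # recorded, but nothing is written here
            continue
        offset = align_up(offset, sec.align)
        sec.offset = offset
        offset += sec.size
    return offset
```

In this model the section contents in memory are never modified; only the layout pass and the header that gets written differ.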
arichardson abandoned this revision. Apr 20 2018, 1:05 AM

This approach is wrong and I currently don't have time to work on a correct solution.