This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy] Add support for dwarf fission
ClosedPublic

Authored by jakehehrlich on Oct 23 2017, 3:12 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

jakehehrlich created this revision.Oct 23 2017, 3:12 PM

I wanted to put this up for review a bit early for people to look at. I haven't confirmed that gdb/lldb are happy with this output yet but I think my basic test here is correct. It only appears to disagree with GNU objcopy on points that don't really matter.

My understanding of dwarf-fission is that the .dwo file should wind up with all the .dwo sections, the section header table, the .note section and nothing else (including the symbol table). I don't really have a spec to work off of to confirm that though. I'm just guessing based on what I've seen GNU objcopy and clang spit out. My understanding is also that the stripped file should wind up with all the sections it had before minus the .dwo sections. If anyone could confirm/deny or point me towards a better spec that would be awesome. If that is the case then I think this is behavior is correct.

aprantl added inline comments.Oct 23 2017, 3:34 PM
tools/llvm-objcopy/llvm-objcopy.cpp
72 ↗(On Diff #119942)

equivliant
gsplit-dwaf

jhenderson edited edge metadata.Oct 24 2017, 6:06 AM

I've made a number of inline comments, but it's mostly fine-detail. The overall approach looks fine to me.

test/tools/llvm-objcopy/drawf-fission.test
6 ↗(On Diff #119942)

I'm not sure you need the -rs here? Also, do you really need to check the output? Doesn't diff exit with a non-zero exit code if the files differ?

9 ↗(On Diff #119942)

I think explicitly listing all the sections here and their properties is unnecessary here. Probably better would be to a) check the number of sections is as expected, and then simply do something like:

#DWARF: Name: (0)
#DWARF: Name: .debug_loc.dwo
#DWARF: Name: .debug_str.dwo
...

If you want to check specific properties (e.g. the Size), then just do Name, Size, Name, Size checks without the NEXT, since that will capture the ones that are needed in the right places (checking the number of sections means that you don't accidentally start checking another section's properties).

tools/llvm-objcopy/llvm-objcopy.cpp
68 ↗(On Diff #119942)

"non DWARF" should be hyphenated, I believe: "non-DWARF".

71 ↗(On Diff #119942)

What is "only-keep-dwo"?

76 ↗(On Diff #119942)

The has the -> This has the

87–90 ↗(On Diff #119942)

Why do we need this? According to the DWARF 5 appendix, there are tags in the debug information that allow matching the two up. It certainly doesn't mention anything about the SHT_NOTE sections. Note that there are many different kinds of SHT_NOTE sections as well, and not just one single one.

96 ↗(On Diff #119942)

File should be a StringRef, I believe?

I'd also avoid the unnecessary abbreviation here. Either call the function WriteObject or WriteObjectFile.

112 ↗(On Diff #119942)

StringRef

113 ↗(On Diff #119942)

"... for the DWO sections", maybe? Also full stop.

166 ↗(On Diff #119942)

Do you need the explicit template argument list here? Can't it derive it from (*Obj)'s type?

  1. Fixed typos and StringRef/std::string mixup
  2. Changed the way that SHT_NOTE sections were handled.
  3. Made the test shorter because I included a ridiculous amount of information in that test.
  4. Responded to a few other little things from James
jakehehrlich added inline comments.Oct 26 2017, 1:37 PM
test/tools/llvm-objcopy/drawf-fission.test
6 ↗(On Diff #119942)

For some reason, until this comment, I was under the impression that llvm-lit and FileCheck had a special relationship and that if you wanted a test to fail the failure had to come from FileCheck but thinking about it that makes no sense. I prefer doing things this way much better.

tools/llvm-objcopy/llvm-objcopy.cpp
71 ↗(On Diff #119942)

I had the option wrong to begin with and this is a remnant of that mistake. I meant "extract-dwo"

87–90 ↗(On Diff #119942)

Yeah I think I mislead myself on that one reading into this further. Nothing even seems to produce a SHT_NOTE section. I'll remove that.

How do you expect -gsplit-dwarf to interact with the other two options? In other words, if a user specifies it with other or both of the other options, what do you expect the result to be?

I ask because I think the behaviour might be a little unexpected in some combinations of these three switches.

test/tools/llvm-objcopy/drawf-fission.test
8 ↗(On Diff #120479)

I think you've gone too far in stripping things down, because now there's no way of knowing that only the expected sections are in the output. You also need to check the number of sections as well (i.e. the corresponding elf header field). This test might still pass if objcopy were to continue to emit the other sections!

tools/llvm-objcopy/llvm-objcopy.cpp
66 ↗(On Diff #120479)

This help text is incorrect - we don't remove all DWARF info, just the DWO sections, leaving the skeleton debug information.

68 ↗(On Diff #120479)

There's a similar issue with this help text - we do remove some DWARF information, but keep the DWO info.

75 ↗(On Diff #120479)

Obj is an unneeded argument (and therefore so is the templating, I believe).

Also, I think this function may be better off being called IsDWOSection or similar, since that's all it's doing. It's more general purpose than a predicate for some other function.

Given that, the comment probably needs moving somewhere else.

83 ↗(On Diff #120479)

Dwo should be capitalized for consistency, since it is an abbreviation.

87 ↗(On Diff #120479)

I think this comment needs rewording, since there may not be another file, if using extract-dwo directly.

71 ↗(On Diff #119942)

Looking at the online documentation, gsplit-dwarf is a compiler switch, rather than an objcopy switch. I want to be sure with this switch name that you have chosen it deliberately to match this, because I'd have probably just called it "split-dwarf" or even "split-dwo" (to match the other two options).

It's also not clear to me what the "gsplit-dwarf file" is. I think you need to reword the help text here.

How do you expect -gsplit-dwarf to interact with the other two options? In other words, if a user specifies it with other or both of the other options, what do you expect the result to be?

I ask because I think the behaviour might be a little unexpected in some combinations of these three switches.

I'm fine renaming this to split-dwo. I think that name is better. I took the name gsplit-dwarf from clang but there really is no reason to do that.

As for using these in conjunction...man I have no clue what I expect out of that. Even if I just think about -split-dwo and -extract-dwo being used in combination I'm a bit confused as to what should happen. GNU objcopy's answer depends on the order that you give the arguments in. I would interpret the result of requesting both be done as a contradiction. e.g. it is demanded that the result of --strip-dwo should not have any .dwo sections. Likewise the result of --extract-dwo should be as small as possible while still maintaining any .dwo sections. I suppose one consistent way of interpreting this is that just the string table should be left. The other way would be to say this is an error. I think throwing errors like that checking for consistent sets of flags is kind of a nightmare and in this case there is a consistent result it's just somewhat unexpected (but so is what GNU objcopy does IMO). Right now as is llvm-objcopy will just give you the string table which seems consistent with "--strip-dwo guarantees no .dwo sections and --extract-dwo will preserve any .dwo sections it can but nothing else". I'm happy with that.

The other issue is how options in general should effect things like gsplit-dwarf (or now split-dwo). One option is to make no options apply to output to a separate file. This would mean such file outputting options would never compose with other options. In the case of GNU objcopy's --dump-section this is acceptable. The other idea is to make all options apply to them. This has the consequence of being too course and not allowing different options to apply to each file. I also think this could produce some extremely unexpected results now that you point it out. Neither of these seem like good ideas to me. In fact this seems like a general reason not to support these kind of options (options that output object files). Is removing this flag fine with you or do you think one of these options is consistent? GNU objcopy just has --strip-dwo and --extract-dwo. I think that's fine even though it's somewhat non-ideal in the number of invocations the compiler has to make.

  1. Added section count to tests to avoid false positives.
  2. Fixed option descriptions.
  3. Repalced "Dwo" with "DWO" everywhere including in cl::opt types.
  4. Changed "-gsplit-dwarf" to "-split-dwo" but I think this should actually be removed. I'm waiting to hear back before I remove it.
  5. Moved some comments around and altered them.
  6. Removed extraneous passing of Obj and additionally removed the template.

As for using these in conjunction...man I have no clue what I expect out of that. Even if I just think about -split-dwo and -extract-dwo being used in combination I'm a bit confused as to what should happen. GNU objcopy's answer depends on the order that you give the arguments in. I would interpret the result of requesting both be done as a contradiction. e.g. it is demanded that the result of --strip-dwo should not have any .dwo sections. Likewise the result of --extract-dwo should be as small as possible while still maintaining any .dwo sections. I suppose one consistent way of interpreting this is that just the string table should be left. The other way would be to say this is an error. I think throwing errors like that checking for consistent sets of flags is kind of a nightmare and in this case there is a consistent result it's just somewhat unexpected (but so is what GNU objcopy does IMO). Right now as is llvm-objcopy will just give you the string table which seems consistent with "--strip-dwo guarantees no .dwo sections and --extract-dwo will preserve any .dwo sections it can but nothing else". I'm happy with that.

I agree. This sounds like a reasonable plan.

The other issue is how options in general should effect things like gsplit-dwarf (or now split-dwo). One option is to make no options apply to output to a separate file. This would mean such file outputting options would never compose with other options. In the case of GNU objcopy's --dump-section this is acceptable. The other idea is to make all options apply to them. This has the consequence of being too course and not allowing different options to apply to each file. I also think this could produce some extremely unexpected results now that you point it out. Neither of these seem like good ideas to me. In fact this seems like a general reason not to support these kind of options (options that output object files). Is removing this flag fine with you or do you think one of these options is consistent? GNU objcopy just has --strip-dwo and --extract-dwo. I think that's fine even though it's somewhat non-ideal in the number of invocations the compiler has to make.

I think it would be a shame to drop the extra switch, if we can come up with some sort of sensible scheme to implement it. I think my preference of the two ideas (apply all or no options to the secondary output) is for the options to not be applied to the DWO file. If a user wants to apply additional options to it, they can invoke objcopy again on the file.

However, that all being said, it could be dropped for now, if you prefer, since there isn't a compelling use case, so that the other switches can go in, since those are uncontroversially useful.

tools/llvm-objcopy/llvm-objcopy.cpp
107 ↗(On Diff #121089)

If you keep split-dwo, then GSplitObj -> SplitObj, or perhaps more clearly DWOFile.

71 ↗(On Diff #119942)

Still doesn't read right. Maybe "equivalent to -strip-dwo on the input file to the specified file, then -extract-dwo on the input file to the output file". Does the command-line code have the ability to set the display name of the option's argument? If so, it might be nice to use that in the help text instead of "specified file" to make it unambiguous. I don't think there is a "strip-dwo file".

Also, if you keep this option, you should rename the variable to SplitDWO, to match its new name.

  1. Cleaned up command line argument description. I decided to keep the split-dwo option. For the one use case of dwarf fission that I'm aware of (clang) that's a better option to use. I agree that not applying options to the DWO file is more sensible.
  2. removed/replaced all references to "GSplitDwarf"

Couple more small things, and then I think we're done.

tools/llvm-objcopy/llvm-objcopy.cpp
72–73 ↗(On Diff #121163)

Reading again, I think these are the wrong way around - the extract needs to happen first, otherwise there's nothing to extract!

Text should read "equivalent to extract-dwo on the input file to <dwo-file>, then strip-dwo on the input file".

148–149 ↗(On Diff #121163)

To be absolutely clear, I'd move this clause to before the RemovePred assignments, either to about line 128 (in the current diff), or to line 137, depending on your preference. The reason for this is to maintain the conceptual order of things - split the DWO into a separate file needs to happen before the stripping. Really, it's an independent operation, so should not be mixed in with all the remove section stuff.

Also, as more stripping options become available I think coming up with a system to handle the removal predicate would be nice. I might write that patch soon.

jhenderson accepted this revision.Nov 3 2017, 2:27 AM

LGTM. The change as it stands is fine, but I don't understand why you've made a couple of changes, that I don't remember seeing before (see inline comments). If there's a good reason for them, feel free to put the changes in as part of this, or to remove them if they're not supposed to be there.

Also, as more stripping options become available I think coming up with a system to handle the removal predicate would be nice. I might write that patch soon.

Sounds good.

tools/llvm-objcopy/llvm-objcopy.cpp
81 ↗(On Diff #121422)

Why the removal of static here and in the following lines?

139 ↗(On Diff #121422)

Why the change here?

This revision is now accepted and ready to land.Nov 3 2017, 2:27 AM

It would appear some kind stranger made some sweeping improvements to the code base that caused conflicts with the code I wrote. I never added those 'static' clauses (but they're good) and I never used the "using" keyword (but I support it). See here: https://reviews.llvm.org/rL317123

I'll make sure those improvements are kept.

This revision was automatically updated to reflect the committed changes.