This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objdump] Add --archive-headers (-a) option
ClosedPublic

Authored by paulsemel on Jul 3 2018, 4:35 PM.

Details

Diff Detail

Repository
rL LLVM

Event Timeline

paulsemel created this revision.Jul 3 2018, 4:35 PM

Fun fact: the GNU objdump documentation uses --archive-headers in its initial switch list, but --archive-header in its description! In fact, it supports all three of "-a", "--archive-header" and "--archive-headers", due to its inexact matching of switches (i.e. it also supports --archive-head as an alias, for example).

Also, I noticed that GNU objdump prints a brief summary of the file before each entry in the dump - something like:

test.o:      file format elf64-x86-64
<header dump>

test 2.o:      file format elf64-x86-64

I see that an equivalent of this is printed as part of DumpObject. This leads me to think three things:

  1. The printing of the information you are adding here should be printed inside DumpObject (and first within that function before anything else).
  2. llvm-objdump prints the name as test.a(test.o), which means that the "In Archive ..." printing is unnecessary.
  3. We need a test for combining -a with another switch, e.g. -d.

What are your thoughts on this?

test/tools/llvm-objdump/archive-headers.test
2

As noted in D48810, please add a duplicate line using the long form as well.

Also, please put a space between the '#' and the RUN/CHECK directives, as it makes it a little easier to read.

tools/llvm-objdump/llvm-objdump.cpp
2129–2130

Why not just use report_error to print the appropriate message?

I assume that testing this would be non-trivial?

2134–2136

Using an admittedly old version of GNU objdump (2.24), this dash doesn't appear at all when I use --archive-headers. Maybe we shouldn't output it either?

2153

Why %3? FWIW, GNU objdump appears to to do no padding on this column.

2154

Please put a blank line before this line.

2159

Please put a blank line before this block too.

2162

GNU objdump uses 6 as the minimum width of this column. Why 5?

2170

cime(3) -> ctime(3) presumably?

I think what you are doing here is fine, but again, it's worth noting that GNU objdump doesn't print the day (i.e. "Sun" for your example).

2179

I'm not sure I like us just throwing away this error, since it might indicate a problem that is not clear from the raw file name. Maybe it's worth warning if that error is reported, but the raw name does not result in an error. What do you think?

2180

This can, on high warning levels on some compilers, lead to warnings, I believe. I'd suggest just calling it RawNameOrErr for clarity.

2186–2187

You can avoid duplicating the printing code here, by doing something like:

StringRef Name = "";
Expected<StringRef> NameOrErr = C.getName();
if (!NameOrErr) {
  ...
  Expected<StringRef> RawNameOrErr = C.getRawName();
  ...
  Name = RawNameOrErr.get();
} else {
  Name = NameOrErr.get();
}
outs() << Name << "\n";

or possibly better yet, if you are able to reuse NameOrErr:

Expected<StringRef> NameOrErr = C.getName();
if (!NameOrErr) {
  ...
  NameOrErr = C.getRawName();
  ...
}
outs() << NameOrErr.get() << "\n";
2191

I'm not sure this is a good name for this function. Possibly better would be "printArchiveSummary" or similar, since it's not actually doing any header printing. Or probably better, just place this inline, since it's only one line and is only called in one place.

paulsemel updated this revision to Diff 154099.Jul 4 2018, 6:22 AM
paulsemel marked 10 inline comments as done.

Added Jame's suggestions.
Totally agree with all your points James (except the error one, as it might break the whole behavior)

Added Jame's suggestions.
Totally agree with all your points James (except the error one, as it might break the whole behavior)

Could you explain more about why you think that the error might break behaviour?

Please also add a test as I described earlier:

We need a test for combining -a with another switch, e.g. -d.

tools/llvm-objdump/MachODump.cpp
78–86

As this is no longer a MachO-specific option, surely it makes more sense for this to be in llvm-objdump.cpp, with the rest of the switches?

tools/llvm-objdump/llvm-objdump.cpp
2156

This should change as well, if you're not padding the first one.

2191

If you get a moment, could you either NFC change the variable names to all be capitalised. Either that, or 'C' should be 'c'

paulsemel updated this revision to Diff 154138.Jul 4 2018, 10:42 AM
paulsemel marked 3 inline comments as done.

Added Jame's suggestions.

paulsemel added a comment.EditedJul 4 2018, 10:42 AM

Alright :)
The fact is that this error is often occurring on the first entry when we create the archive file with GNU ar (I don't know why, I have to admit it). So, if we throw an error when we encounter this, we are simply not printing any entry of those files at all... (but the other entries are ok !)
That's actually what GNU objdump is doing. It is just reporting the file as not parsable in the archive, and going to the next file.
I think this is a good behavior. What do you think ?

tools/llvm-objdump/llvm-objdump.cpp
2129–2130

No, the fact is that, sometimes, only the first entry is ill-formed, and we still want to print the remaining part of the archive. I first did an error_report, but this leads to a non-functional option.
I think we should let it like this.

2191

let's remain this lowercase for this patch, I will do another one to correct all the variables in the file.

Alright :)
The fact is that this error is often occurring on the first entry when we create the archive file with GNU ar (I don't know why, I have to admit it). So, if we throw an error when we encounter this, we are simply not printing any entry of those files at all... (but the other entries are ok !)
That's actually what GNU objdump is doing. It is just reporting the file as not parsable in the archive, and going to the next file.
I think this is a good behavior. What do you think ?

Okay, that makes sense to me. Does GNU objdump print it on stdout, along with the rest of the information, or stderr? We should do whatever it does. We should also test it - I think a good way to do so would be to add a non-object (e.g. a text file) to the archive used for the main test, and show that the "ill-formed" message is printed, but then the rest of the check continues. What do you think?

test/tools/llvm-objdump/archive-headers-disas.test
3 ↗(On Diff #154138)

I'd add before each CHECK group a check showing the file name/format line (e.g. "test.a(test.o): file format ELF64-x86-64).

6–11 ↗(On Diff #154138)

I'd probably omit the actual disassembly part of the CHECK to make it less fragile to minor changes in how it is printed. It probably makes sense to just keep the first three CHECKs for each member (i.e. the --archive-headers line, the "Disassembly of..." line and the symbol name.

test/tools/llvm-objdump/archive-headers.test
5

Similar to the other test, I think it's worth checking the file name line that precedes each dump.

tools/llvm-objdump/llvm-objdump.cpp
205

This help text needs updating to remove reference to Mach-O.

paulsemel updated this revision to Diff 154193.Jul 5 2018, 3:40 AM
paulsemel marked 4 inline comments as done.

Ok, so it appears that I can't trigger the error anymore, because I move the call to printArchiveChild to the DumpObject function.
In the DumpArchive function, when the object is ill-formed, we are just ignoring and going to the next file.
I think we can let it like this, as it is barely not possible to test the behavior now..

Ok, so it appears that I can't trigger the error anymore, because I move the call to printArchiveChild to the DumpObject function.
In the DumpArchive function, when the object is ill-formed, we are just ignoring and going to the next file.
I think we can let it like this, as it is barely not possible to test the behavior now..

Okay, that's fine. I'm not sure if I agree with the DumpArchive function ignoring errors completely, but that's not relevant to this change, so don't worry about it.

I've got one more question, and one small request, then I think this will be good to go on.

test/tools/llvm-objdump/archive-headers-disas.test
6–11 ↗(On Diff #154138)

Sorry, maybe I confused you - I meant each member should have a disassembly check, so the later members should have it just like the earlier ones, but only the first part, like you've done here. Otherwise, it looks like the first few cases are different and special.

tools/llvm-objdump/llvm-objdump.cpp
2201–2202

Maybe a silly question, but why is this case special? Why not always print the double new line?

paulsemel added inline comments.Jul 5 2018, 5:42 AM
test/tools/llvm-objdump/archive-headers-disas.test
6–11 ↗(On Diff #154138)

If there is not disassembly part, that means that there is nothing to disassemble :)
I think just putting this line for the disassembly part is ok, as this is not what we are checking here (but this still ensures that both options are working correctly together).

tools/llvm-objdump/llvm-objdump.cpp
2201–2202

When combined with other options, GNU objdump is not putting a new line, and effectively I find it nicer without it. What do you think ?

jhenderson added inline comments.Jul 5 2018, 6:11 AM
test/tools/llvm-objdump/archive-headers-disas.test
6–11 ↗(On Diff #154138)

Sorry, just to be clear, do the other members have no text section then?

tools/llvm-objdump/llvm-objdump.cpp
2201–2202

Okay, that does sound sensible, but would you mind showing what you mean a bit more, with some output from both versions? Why is ArchiveHeaders special, and not, say, Disassemble?

paulsemel updated this revision to Diff 154221.Jul 5 2018, 6:41 AM

Put back the two newlines

test/tools/llvm-objdump/archive-headers-disas.test
6–11 ↗(On Diff #154138)

Yes that there is not code sections (you can test with objdump, the output is the same :) )

tools/llvm-objdump/llvm-objdump.cpp
2201–2202

Ok, I double checked, and it seems that GNU objdump is putting the two newlines too.. let's stick with what was here before then :)

jhenderson accepted this revision.Jul 5 2018, 6:44 AM

Okay, LGTM. Go ahead and land when you're ready.

This revision is now accepted and ready to land.Jul 5 2018, 6:44 AM
paulsemel closed this revision.Jul 5 2018, 7:49 AM

Arf.. Again, I forgot to mention the revision in the commit.. I'll try to think about it next time :)

Arf.. @jhenderson forgot that the time was timezone dependent.. can I push a patch ?

Arf.. @jhenderson forgot that the time was timezone dependent.. can I push a patch ?

Go ahead.

I assume you're going to use some form of regex? It would be good to not just do {{.*}}, but rather to do one that matches the format, if the format is consistent between timezone settings.

Ok, for an unknown reason, the disassembly part is not working on some bots..
I removed it, but will work on it tomorrow.
Again, I'm really sorry about it..