Page MenuHomePhabricator

[llvm-readelf] Dump the stack sizes sections with --stack-sizes
ClosedPublic

Authored by wolfgangp on Jul 25 2019, 6:43 PM.

Details

Summary

This is a GNU-style implementation of the --stack-sizes option. It dumps the stack sizes sections in the format
StackSizes:

Size        Function

<Stacksize> <FunctionName>

...                      ...

LLVM style dumping is not implemented yet.

Diff Detail

Repository
rL LLVM

Event Timeline

wolfgangp created this revision.Jul 25 2019, 6:43 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 25 2019, 6:43 PM

The first comments are below.

llvm/test/tools/llvm-readobj/stack-sizes.test
30 ↗(On Diff #211863)

This needs a comment I think, i.e. what does this content describe.

40 ↗(On Diff #211863)

##

113 ↗(On Diff #211863)

You do not need the last line.
We usually either separate the comment from RUN: with just an empry line (though sometimes not, just be consistent).

163 ↗(On Diff #211863)

Please align like you did above.

186 ↗(On Diff #211863)

Do you need section symbol?

286 ↗(On Diff #211863)

##

295 ↗(On Diff #211863)

We usually use 2>&1. I am not sure it worth to chech the output separatelly here.

323 ↗(On Diff #211863)

I'd expect to see "# MULTIPLE:" lines right here. You can probably test ARCHIVE just separatelly.

359 ↗(On Diff #211863)

Again, probably I'd not test errors separatelly. (i.e. would use 2>&1)

392 ↗(On Diff #211863)

Add a description?

395 ↗(On Diff #211863)

Perhaps you should print the relocation name too.

llvm/tools/llvm-readobj/ELFDumper.cpp
4315 ↗(On Diff #211863)

This looks worrying. Why do you need this?

4334 ↗(On Diff #211863)

This probably need a test (I think I didn't see one for testing demangled names)

4336 ↗(On Diff #211863)

Why?

4378 ↗(On Diff #211863)

Why?

4398 ↗(On Diff #211863)

Why?

4416 ↗(On Diff #211863)

This method is very long. If there is any way to make it shorter, that would be great.

4433 ↗(On Diff #211863)

Seems you can use ELFObjectFile::toDRI() to hide work with DRI.p and move this out to make it static helper probably?

4486 ↗(On Diff #211863)

Its a bit strange to see using here. You use SectionMap only once
and it is not common do do in the missle of function I think.
I'd just remove.

4550 ↗(On Diff #211863)

Supports probably not a very good name for function. May be IsSupportedFn?

5436 ↗(On Diff #211863)

I think it worth adding the llvm-readobj calls to your test case too. So that it will be visible what result we have and what is the place to change when we implement this.

grimar added inline comments.Jul 26 2019, 2:45 AM
llvm/tools/llvm-readobj/ELFDumper.cpp
4486 ↗(On Diff #211863)

"do do in the missle of function" -> "to do in the middle of function"

wolfgangp marked 24 inline comments as done.
  • Undertook some refactoring of the change with the following motivation: The printStackSize* methods do mostly extracting and validating, printing the output is only a minor part. Yet they were methods of the GNUStyle class, which would have made it more difficult to implement LLVM style dumping. So I moved them mostly into the base class (DumpStyle), and delegated the printing mostly to a new method printStackSizeEntry(), which is style specific.
  • Addressed the formatting comments
  • Added 2 test cases (demangling and verifiying the 'not implemented' message we get from llvm-readobj.
  • Explained the rationale using consumeError().
llvm/test/tools/llvm-readobj/stack-sizes.test
186 ↗(On Diff #211863)

No, it's not needed.

295 ↗(On Diff #211863)

It was suggested by a different reviewer to check the output as well. If you do not feel strongly about it, I would like to keep it in place.

323 ↗(On Diff #211863)

I collapsed the 2 test cases and removed some unnecessary yaml2obj invocations.
I'd like to intertwine the ARCHIVE and MULTIPLE check lines to check the proper
ordering. Please let me know if that's OK.

359 ↗(On Diff #211863)

Testing the output separately was suggested by someone else. Please let me know if you feel strongly about it.

llvm/tools/llvm-readobj/ELFDumper.cpp
4315 ↗(On Diff #211863)

I am scanning the symbol table to figure out which function symbol the stack size entry corresponds to. Symbol.getAddress() returns an 'Expected', which means that it may possibly return an error. I need to test the Expected and once I do that, and an error has been returned, the 'Expected' interface mandates that we do something with this error, i.e. either report it (by calling takeError()) or ignore it (by calling consumeError()). If we don't do that, the 'Expected' destructor asserts and we get a compiler crash.

As far as choosing whether to report or consume the error, I have decided that unless the error is directly related to stack size dumping, I would rather ignore it and leave the reporting to other aspects of dumping, e.g. when explicitly dumping the symbol table. Hence the consumeError() here.

This also goes for all the other occurrences of consumeError().

4336 ↗(On Diff #211863)

Please see my comment above.

4378 ↗(On Diff #211863)

Please see my comment above.

4398 ↗(On Diff #211863)

Please see my comment above.

4416 ↗(On Diff #211863)

This morphed into a somewhat larger refactoring. I will explain this in detail in the commit message.

4433 ↗(On Diff #211863)

I extracted it into a helper due to the refactoring, but unfortunately ELFObjectFile::toDRI() is protected, so I can't use it here. Now, this could be refactored, but I don't want to lump it into the same patch.

4550 ↗(On Diff #211863)

But it's so cool to be able to say

if (Supports(...))

Well, if you insist...

I'm not familiar with the .stack_sizes functionality in general, but it seems useful & could have saved me countless hours of debugging (large stack frames => stack overflows in recursive parsers) if I knew more about how it worked.

llvm/test/tools/llvm-readobj/stack-sizes.test
8–12 ↗(On Diff #212209)

Out of curiousity, how did you decide on this format? Does this number of columns line up with other readelf options? (There is no GNU readelf --stack-size option I'm aware of that this could be emulating).

llvm/tools/llvm-readobj/ELFDumper.cpp
391–397 ↗(On Diff #212209)

Why two methods with the same name? It's confusing trying to figure out what is calling what.

4441 ↗(On Diff #212209)

Why is this check here? This check happens before the method is called.

If you want to keep it in (I'm not sure I care either way), put it in an assert instead.

4459 ↗(On Diff #212209)

Should true be Obj->isLittleEndian()?

4496–4501 ↗(On Diff #212209)

I may just be missing something obvious, but I don't see why this wildly differs from the previous method. Why does the previous method go through all these hoops:

const Elf_Shdr *ElfSec = Obj->getSection(Sec.getRawDataRefImpl());
Expected<StringRef> NameOrErr = EF->getSectionName(ElfSec);
// Ignore sections whose name we cannot find.
if (!NameOrErr) {
  consumeError(NameOrErr.takeError());
  continue;
}
StringRef SectionName = *NameOrErr;
if (!SectionName.startswith(".stack_sizes"))

... but then this one skips all that and just calls getName(SectionName)?

4315 ↗(On Diff #211863)

I think this is fine, but instead of just having the comment for this one consumeError call, it might be better to have one at the top of each method that explains the blanket exception

llvm/tools/llvm-readobj/llvm-readobj.cpp
58 ↗(On Diff #212209)

--stack-sizes is not a GNU readelf options, so I don't think it should be implied when running llvm-readelf -a

737–740 ↗(On Diff #212209)

(I guess my comment earlier actually applies here)

grimar added inline comments.Jul 31 2019, 1:26 AM
llvm/test/tools/llvm-readobj/stack-sizes.test
295 ↗(On Diff #211863)

Sure, NP.

323 ↗(On Diff #211863)

Looks good.

llvm/tools/llvm-readobj/ELFDumper.cpp
4554 ↗(On Diff #212209)

So you're using RelocSectionName only for the error reporting.
And reportError is no return function, so it is called only once.
Then you can just move these lines right to the place where you construct the error.

4550 ↗(On Diff #211863)

Well, may be other reviewers have a different opinion here. Supports just doesn't look like a function name for me.

wolfgangp updated this revision to Diff 212676.Jul 31 2019, 3:39 PM
wolfgangp marked 12 inline comments as done.

Addressed review comments:

  • renamed one of the printStackSize() functions.
  • removed superfluous check for nonrelocatable object
  • Used SectionRef::getName() instead of the explicit sequence to extract a section name.
  • Fixed the calls to the DataExtractor() constructor to use Obj->isLittleEndian()
  • Added some comments to explain the error handling policy.
  • Moved a SectionName variable definition closer to the use point.

Not resolved:
Left the question about whether --stack-sizes should be include in --all open. This is still a discussion point.

Left the question about whether --stack-sizes should be include in --all open. This is still a discussion point.

Do you have a discussion going on somewhere you can link to?

Personally, I think it's neat & wouldn't mind it being included as part of --all for personal use, but generally we want to make llvm-readelf as close to GNU readelf as possible. (This has been feedback given at llvm dev meetings too).

Maybe it could be part of --all only when invoked as llvm-readobj? Though it doesn't make much since *now* because this only has the GNU-style dumper implemented, and in fact prints a warning for the LLVM style -- but in the future, we could do that.

Left the question about whether --stack-sizes should be include in --all open. This is still a discussion point.

Do you have a discussion going on somewhere you can link to?

Not really. I'd like to ask @jhenderson for input on this, but your argument below about being as close to GNU readelf as possible sounds convincing.

Personally, I think it's neat & wouldn't mind it being included as part of --all for personal use, but generally we want to make llvm-readelf as close to GNU readelf as possible. (This has been feedback given at llvm dev meetings too).

Maybe it could be part of --all only when invoked as llvm-readobj? Though it doesn't make much since *now* because this only has the GNU-style dumper implemented, and in fact prints a warning for the LLVM style -- but in the future, we could do that.

llvm/test/tools/llvm-readobj/stack-sizes.test
8–12 ↗(On Diff #212209)

I settled on this format in consultation with @jhenderson . Basically to maximize readability: The size is right-justified in its column, the function name left-justified. That leaves some room on the left for larger stack sizes and room on the right for long function names (or signatures). Other readelf options weren't really under consideration.

Any suggestions are of course welcome.

llvm/tools/llvm-readobj/ELFDumper.cpp
391–397 ↗(On Diff #212209)

My design thought was that one was called from a relocatable context, and the other from a non-relocatable context, so the relocatable one (the one with the 'Rel' parameter), would resolve the relocation and delegate the rest to the non-relocation function. There is no reason for them to have the same name, though. I'll change it.

4441 ↗(On Diff #212209)

Thanks, oversight during a refactor.

4496–4501 ↗(On Diff #212209)

You're not missing anything, I didn't discover until later that SectionRef::getName() encapsulates what I'm doing in the segment above and forgot to update it. Thanks for pointing this out.

4315 ↗(On Diff #211863)

Added a comment to each function where consumeError() is used.

llvm/tools/llvm-readobj/llvm-readobj.cpp
58 ↗(On Diff #212209)

I have no particularly strong opinion on this, but since the stack size functionality originated in LLVM, it's not surprising that GNU readelf would not support it. However, it does seem reasonable to include it with an 'all' option, unless the goal is exact compatibility with GNU.

grimar added inline comments.Aug 2 2019, 12:57 AM
llvm/tools/llvm-readobj/ELFDumper.cpp
4403 ↗(On Diff #212676)

3 comments/suggestions:

  1. Instead of manually constructing the message with a file name, you can use the same approach as already used now for reporting a errors.

i.e. just introduce a method:

void reportWarning(StringRef Input, Error Err) {
  if (Input == "-")
    Input = "<stdin>";
  warn(createFileError(Input, std::move(Err)));
}

(see my D65515 for reference).

  1. With that you do not need to call consumeError after reportWarning.
  1. I think everywhere where you used consumeError you should probably report a warning.

Like in the code above:

if (NameOrErr)
  SymName = *NameOrErr;
else
  consumeError(NameOrErr.takeError());

i.e. call reportWarning from (1) instead.

What do you think?

I've only done a fairly brief review, so don't wait for me for an LGTM if anybody else gives it. I'm happy to defer to others judgment on the --all behaviour. My instinct there is that the --all behaviour change is a wider issue beyond --stack-sizes. For example, --addrsig, --hash-symbols etc are switches that don't exist in GNU readelf, but might be printed if they did.

llvm/test/tools/llvm-readobj/stack-sizes.test
194 ↗(On Diff #212676)

"with a complete" sounds like there's a mistake here. Should it be "within"?

434 ↗(On Diff #212676)

Let's tidy up the spacing in this block here so that it better matches what is printed (and is therefore more readable).

llvm/tools/llvm-readobj/ELFDumper.cpp
4317–4318 ↗(On Diff #212676)

If I'm not mistaken, we tend to write comments in the third person i.e. "This function ignores..."

4380 ↗(On Diff #212676)

See my comment above.

4405–4406 ↗(On Diff #212676)

Perhaps rephrase this "correct section and report its stack size anyway."

llvm/tools/llvm-readobj/llvm-readobj.cpp
58 ↗(On Diff #212209)

I'm in two minds about this. My hope is that we could ignore GNU compatibility on this, and indeed I'd expect people to not be parsing --all output in such a way as it can't work if more stuff was emitted, but I could be wrong. Absolutely it should be printed for LLVM output. I'm happy to defer to others on this one.

wolfgangp updated this revision to Diff 213104.Aug 2 2019, 11:55 AM
wolfgangp marked 4 inline comments as done.

Addressed review comments:

--all does not imply --stack-sizes any more

Changed some comments to use third person instead of first person plural.

Some formatting adjustments in the test.

llvm/test/tools/llvm-readobj/stack-sizes.test
194 ↗(On Diff #212676)

I've been struggling with the phrasing of this from the beginning. I wanted to say "ends with an incomplete stack size entry".

llvm/tools/llvm-readobj/ELFDumper.cpp
4403 ↗(On Diff #212676)

I like the idea of adapting the reporting of warnings to the one with errors.

However, I feel this should really be done in a different patch, and then we can refactor all uses of reportWarning in ELFDumper.cpp and ObjDumper.cpp. I'd be happy to do this afterwards, while it's still fresh in my mind.

The other caveat is that with reporting warnings on all malformed input we would now create messages on, for example, malformed symbols that have nothing to do with what we are reporting on (since we're scanning the symbol table). If you feel that's acceptable, I have no problem doing it, but I wanted to point it out.

jhenderson accepted this revision.Aug 5 2019, 6:14 AM

I'm happy with the state of this, but I'd like @grimar/@rupprecht to approve too.

This revision is now accepted and ready to land.Aug 5 2019, 6:14 AM
grimar accepted this revision.Aug 5 2019, 6:27 AM

I am OK with this. I'd like to refactor how the warnings are reported. We can do it separatelly indeed.

This revision was automatically updated to reflect the committed changes.

Sorry, I had some draft comments yesterday but had to leave early, here are post-commit comments now:

llvm/tools/llvm-readobj/ELFDumper.cpp
4524 ↗(On Diff #213104)

Is this iteration order well defined/stable? I tried going down the rabbit hole of making sure this isn't sorting based on arbitrary pointer values, but couldn't figure out what it's keyed off of. Is it sorting by section name/index?

(llvm::MapVector itself is deterministic/sorted on the key, but if the key is a pointer, it's effectively random)

llvm/tools/llvm-readobj/llvm-readobj.cpp
737–740 ↗(On Diff #213104)

I think this block is obsolete now? The default for flags is false, so this just means that llvm-readobj --all --stack-sizes won't print stack sizes, but llvm-readobj --stack-sizes will (and will print the warning).

58 ↗(On Diff #212209)

Hmm, good point; definitely there are countless users that invoke+parse the output of readelf -a (when probably they only care about a few things, but use -a as a lazy way to just get everything). But there's so much output it produces, any strict parser that would break if an additional block is produced is probably so fragile anyway.

Still, might be good to submit this without being included in --all (for readelf), and we can add it there that as a separate change after having this feature live in the wild for a bit/get some more opinions.

(Also just want to say the print*RelocatableStackSizes methods look much better now, thanks for cleaning that up!)