This is an archive of the discontinued LLVM Phabricator instance.

[MC] Function stack size section.
ClosedPublic

Authored by seaneveson on Nov 8 2017, 3:27 AM.

Details

Summary

Original RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-August/117028.html

I wasn't sure who to put as reviewers, so please add/remove people as appropriate.

This change adds a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag. The section contains pairs of function symbol references (8 byte) and stack sizes (unsigned LEB128).

The contents of this section can be used to measure changes to stack sizes between different versions of the compiler or a source base. The advantage of having a section is that we can extract this information when examining binaries that we didn't build, and it allows users and tools easy access to that information just by referencing the binary.

There is a follow up change to add an option to clang.

Thanks.

Event Timeline

seaneveson created this revision.Nov 8 2017, 3:27 AM
seaneveson updated this revision to Diff 122049.Nov 8 2017, 3:30 AM

Add context to diff. The arc tool doesn't seem to do that for me anymore.

asb added a subscriber: asb.Nov 8 2017, 3:33 AM
MatzeB edited edge metadata.Nov 15 2017, 10:43 AM

I'd defer to others to comment on the highlevel whether this is a good or helpful idea. Some comments though:

  • The format of the new section ought to be documented properly in the llvm docs and not just in the commit message and sourcecode.
  • I think this should report a special "unknown" value for functions with dynamic stack allocations.
  • I'm not 100% sure that MachineFrameInfo::getStackSize() captures all possible allocations; it probably corresponds to what is allocated in the prologue), however I wonder if we can have cases where callframes are not allocated as part of the prologue. Probably need a stronger wording on what "StackSize" in MachineFrameInfo means if you want to output it to the user in this form.

I'd defer to others to comment on the highlevel whether this is a good or helpful idea. Some comments though:

  • The format of the new section ought to be documented properly in the llvm docs and not just in the commit message and sourcecode.
  • I think this should report a special "unknown" value for functions with dynamic stack allocations.
  • I'm not 100% sure that MachineFrameInfo::getStackSize() captures all possible allocations; it probably corresponds to what is allocated in the prologue), however I wonder if we can have cases where callframes are not allocated as part of the prologue. Probably need a stronger wording on what "StackSize" in MachineFrameInfo means if you want to output it to the user in this form.

To spin this more positively: There seems to be interest for the feature, the code changes appear minimal so maintenance shouldn't be too bad. So I wouldn't object going ahead with this.

However we have to be very careful in managing expectations here! I would expect this to need several iterations, bugfixes and testing/fuzzing until I would trust this information in all cases. Nothing is stopping targets from emitting stack pointer modifying code without recording information in MachineFrameInfo.

Thanks for the feedback!

The format of the new section ought to be documented properly in the llvm docs and not just in the commit message and sourcecode.

OK.

I think this should report a special "unknown" value for functions with dynamic stack allocations.

I agree that would be useful. With the value being ULEB128 we could have 0 as "unknown" and encode everything else as value + 1, or we could not emit those functions (Neither is ideal).

I'm not 100% sure that MachineFrameInfo::getStackSize() captures all possible allocations; it probably corresponds to what is allocated in the prologue

Yes I think that is the case.

However we have to be very careful in managing expectations here! I would expect this to need several iterations, bugfixes and testing/fuzzing until I would trust this information in all cases. Nothing is stopping targets from emitting stack pointer modifying code without recording information in MachineFrameInfo.

Agreed. We would like to produce the maximum size each function might be on the stack (where possible). I imagine there will be a lot of cases to account for and we won't be able to foresee all of them (especially for all targets).

  • Add documentation to the llc CommandGuide
  • Skip functions with dynamic stack allocation

Ping.

I'd defer to others to comment on the highlevel whether this is a good or helpful idea.

Does anyone have any feedback on that front?

MatzeB added inline comments.Nov 27 2017, 11:57 AM
docs/CommandGuide/llc.rst
135–141 ↗(On Diff #123330)

I'd rather move the documentation to a new (sub)section in CodeGenerator.rst as the main purpose would be to document the name and format of the new section.

Documenting the llc flag in addition to that is fine too of course.

lib/MC/MCObjectFileInfo.cpp
597–598

I'm no expert with ELF, but PROGBITS feels wrong to me as I would expect it to get loaded/memory mapped. Is NOTE a better fit?

test/CodeGen/X86/stack-size-section.ll
4–15

I'd move the CHECK lines in front of the functions.

  • Rebase
  • Add documentation to CodeGenerator.rst
  • Move test checks above corresponding functions
seaneveson marked 2 inline comments as done.Nov 29 2017, 2:55 AM
seaneveson added inline comments.
lib/MC/MCObjectFileInfo.cpp
597–598

PROGBITS sections are only loaded when the SHF_ALLOC attribute flag is set, for example the .comment and .debug* sections are PROGBITS.

NOTE has a specific format (array of 4-byte words), which this section doesn't meet.

MatzeB accepted this revision.Nov 29 2017, 10:57 AM

Let's go ahead with this and fix problems as we go along. LGTM, thanks.

This revision is now accepted and ready to land.Nov 29 2017, 10:57 AM
seaneveson closed this revision.Nov 30 2017, 4:01 AM