This is an archive of the discontinued LLVM Phabricator instance.

[codeview] Emit S_FRAMEPROC and use S_DEFRANGE_FRAMEPOINTER_REL
ClosedPublic

Authored by rnk on Sep 17 2018, 6:35 PM.

Details

Summary

Before this change, LLVM would always describe locals on the stack as
being relative to some specific register, RSP, ESP, EBP, ESI, etc.
Variables in stack memory are pretty common, so there is a special
S_DEFRANGE_FRAMEPOINTER_REL symbol for them. This change uses it to
reduce the size of our debug info.

On top of the size savings, there are cases on 32-bit x86 where local
variables are addressed from ESP, but ESP changes across the function.
Unlike in DWARF, there is no FPO data to describe the stack adjustments
made to push arguments onto the stack and pop them off after the call,
which makes it hard for the debugger to find the local variables in
frames further up the stack.

To handle this, CodeView has a special VFRAME register, which
corresponds to the $T0 variable set by our FPO data in 32-bit. Offsets
to local variables are instead relative to this value.

This is part of PR38857.

Diff Detail

Repository
rL LLVM

Event Timeline

rnk created this revision.Sep 17 2018, 6:35 PM
zturner added inline comments.Sep 18 2018, 9:44 AM
llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
108 ↗(On Diff #165875)

Similarly here, shouldn't this just be assert(false)?

llvm/lib/DebugInfo/CodeView/SymbolRecordMapping.cpp
496 ↗(On Diff #165875)

This should probably be an assert(false) as opposed to an llvm_unreachable. This function operates on user input, so it can technically be any value that fits in the underlying type, and we want the program to at least work with bad user input. If we use llvm_unreachable then entering this codepath would be undefined behavior.

rnk added inline comments.Sep 18 2018, 10:45 AM
llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
108 ↗(On Diff #165875)

Why? Assertions are for internal errors. We want to report an error even if assertions are disabled if a user somehow manages to generate code for ppc64-windows-msvc and enable codeview. That's not an internal error, and there's no checking up until this point that will error out for it.

llvm/lib/DebugInfo/CodeView/SymbolRecordMapping.cpp
496 ↗(On Diff #165875)

Why would we ever prefer assert(false) to llvm_unreachable? The unreachable is usually preferred because more static analysis tools know what it means.

We also already assert that EncodedReg < 4 above, so this is already pretty trivially dead.

And, the places where we would get this from user data, it's extracted from a 2-bit bitfield. We've covered all the possible values it can be, unless there are bugs in our own code.

zturner added inline comments.Sep 18 2018, 10:52 AM
llvm/lib/DebugInfo/CodeView/SymbolRecordMapping.cpp
496 ↗(On Diff #165875)

If an unreachable path executes, it's basically undefined behavior. We don't want undefined behavior executing conditionally based on the value of user input. That should never happen.

aprantl added inline comments.Sep 18 2018, 10:59 AM
llvm/lib/DebugInfo/CodeView/SymbolRecordMapping.cpp
496 ↗(On Diff #165875)

If the switch can trigger an unhandled case, because of invalid user input, that case must be handled by code that is enabled in NDEBUG. If the we are just guarding against programming errors with an internal consistency check, then the unreachable is preferred over the assert.

rnk added inline comments.Sep 18 2018, 11:19 AM
llvm/lib/DebugInfo/CodeView/SymbolRecordMapping.cpp
496 ↗(On Diff #165875)

If an unreachable path executes, it's basically undefined behavior. We don't want undefined behavior executing conditionally based on the value of user input. That should never happen.

I think it's just a consequence of current implementation details that llvm_unreachable is immediate UB and a failed assert is... probably-UB-soon. I think if it was easy for us to say, if (!assert_cond) __builtin_unreachable(); in the implementation of assert.h, we would do it.

I think the important thing is whether the condition is triggerable by user input changes or not. It's currently impossible. All the callers do Input & 3U before calling here, so invalid values are impossible. If that changes, that'd be a bug and we'll find it with assertions like we always do.

zturner added inline comments.Sep 18 2018, 11:25 AM
llvm/lib/DebugInfo/CodeView/SymbolRecordMapping.cpp
496 ↗(On Diff #165875)

This code gets run in llvm-pdbutil when we dump. If we get an invalid PDB or one with a frame pointer value which was valid but we just didn't know about, we would crash here. We should print something sane in that case, like "<unknown>" or "<none>".

But the point is that this is all in generic library code which is intended to be consumed by debuggers and the like. And they may have to deal with bad debug info which could have changed in various ways across different compiler versions.

We could maybe return an Expected<RegisterId> here, or even an Optional<RegisterId> if Expected is too cumbersome.

rnk added a comment.Sep 24 2018, 9:50 AM

Any other thoughts on this patch?

llvm/lib/DebugInfo/CodeView/SymbolRecordMapping.cpp
496 ↗(On Diff #165875)

I've explained my position and I don't intend to make this change. This program point is in fact unreachable, it's implied by the assertion in this same function.

zturner accepted this revision.Sep 24 2018, 11:48 AM
zturner added inline comments.
llvm/lib/DebugInfo/CodeView/SymbolRecordMapping.cpp
496 ↗(On Diff #165875)

Hmm, I somehow missed that it’s asserted on earlier :-/

This revision is now accepted and ready to land.Sep 24 2018, 11:48 AM
This revision was automatically updated to reflect the committed changes.