Page MenuHomePhabricator

Add "Undeclared registers are marked as Undefined" to UnwindPlan, adopt it in arch default unwind plans
ClosedPublic

Authored by jasonmolenda on Feb 16 2021, 6:48 PM.

Details

Summary

The ABI plugins can provide an architectural default unwind plan -- an UnwindPlan that will get you up the stack when you don't have any unwind information or function start addresses. The unwinder uses these arch default unwind plans in two cases: No information for this stack frame, or as a fast unwind plan to walk the stack quickly, when you're not retrieving additional registers. (the unwinder will often also try this as a "fallback" unwind plan when it thinks there might be more stack to discover, giving us the ever-awesome double-start() that we see on Darwin processes at the end of the backtrace, sigh)

The big important use of an arch default unwind plan is jitted code -- where lldb doesn't have any start addresses for the functions or unwind information. We can't detect where registers are spilled to the stack, but at least we can walk the stack.

However, when we have a stack 5 frames deep,

0 func1()
1 func2()
2 <no function, jitted code>
3 func3()
4 main()

If we navigate to frame 3 (func3) and say "p/x $r12", lldb will descend down the stack (ultimately to frame 0, func1) looking for a place where r12 was saved to stack before it was reused. If no saves/spills are found, the live r12 value in frame 0 is used.

The problem here is that we have no view into what happened in frame 2, the jitted code. It may have spilled r12 and modified the register. The rule in lldb is that we don't provide register values that are *possibly* modified -- the same reason why lldb won't pass caller-spilled aka volatile register values up the stack, e.g. rax on SysV x86_64, where a function is free to modify it without saving the old value. We cannot be sure that the value of these registers haven't been changed, so lldb will refuse to print $rax for frame 3 in this backtrace.

This patchset adds a new bool to UnwindPlan::Row which specifies that any register not explicitly defined should return as Undefined. This means that the saved-register-finder method should NOT continue down the stack looking for a saved value - we cannot know what happened in this stack frame, so it's not safe to continue searching and pass a register value up the stack.

The second use case of the arch default UnwindPlan is as a "fast Unwind Plan" when lldb only needs to retrieve frame pointer & pc values when doing a quick stack walk. In this case, when lldb is searching for a register in RegisterContextUnwind::SavedLocationForRegister, it first tries the fast unwind plan, and if the register isn't mentioned there, it tries to find a location in the Full unwind plan. I have one small modification here to treat an Undefined register in the *fast* unwind plan as meaning that we should look in the full unwind plan, instead of stopping here. The full unwind plan may have a real register save location; if it says the register is Undefined, then that will be used.

The big risk with this patch, obviously, is that I'm updating all of the architectural default unwind plans to mark undefined registers as unavailable. I worry that there's some target that is relying on architectural default unwind plans heavily and previously the frame 0 register values were passed up the stack ... and maybe it was right sometimes, but now lldb won't send them up any more. I'll need to watch the bots closely when this lands. I could have done the cowardly thing and only updated the SysV x86_64 and Darwin AArch64 ABI plugins, but this is a change I've been wanting to make for a while.

Testing this is something I don't have a good answer for. I could unit-test that the SysV x86_64 ABI's arch default UnwindPlan sets SetUnspecifiedRegistersAreUndefined, but that's not anything really useful. Until recently you could jit an expression that stopped at a breakpoint and you'd have a wonderfully unsymbolicated stack frame in the middle of the backtrace but some rascal like Raphael seems to have changed it so we have a Module for the jitted code with a function start address recently. :/ Short of a corefile or a test using a scripted process plugin (basically the same thing), I don't think I can simulate the use case here. On the up side, bugs show up real fast in the testsuite run as I found when I didn't originally remember how the fast unwind plan fallback needed to work when I wrote this patch. :)

Diff Detail

Event Timeline

jasonmolenda created this revision.Feb 16 2021, 6:48 PM
jasonmolenda requested review of this revision.Feb 16 2021, 6:48 PM

I marked Pavel as the reviewer for this because he's maybe the most likely to have an opinion about this change; I don't mean to sign you up for review work if you don't have time right now. I don't know if the specifics of the patch are especially interesting, but the testability problem and the decision to update all of the arch default unwind plans seem the most notable/discussable to me.

Makes sense, LGTM.

This revision was not accepted when it landed; it landed in state Needs Review.Feb 17 2021, 11:52 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.