This is an archive of the discontinued LLVM Phabricator instance.

Recognize __builtin_debugtrap on arm64, advance pc past it so users can continue easily
ClosedPublic

Authored by jasonmolenda on Nov 11 2020, 12:05 AM.

Details

Summary

This is a change I need on Darwin systems, so I'm trying to decide whether I put the test case in API/functionalities or in macosx, but I think it may apply on Linux as well.

With a builtin_debugtrap() in a program, we want the debugger to stop execution there, but we want the user to get past it with a 'continue' or next/step. With a builtin_trap(), we want the debugger to stop on that instruction and not advance unless the user rewrites $pc manually or something.

On x86, __builtin_debugtrap() is 0xcc (the breakpoint instruction); when you hit that, the pc has advanced past the 0xcc. In debugserver (DNBArchImplX86_64::NotifyException) when we've hit an 0xcc that was NOT a breakpoint debugserver inserted, it leaves the $pc past the 0xcc, so continuing will work.

On arm64, __builtin_debugtrap is 'brk #0xf000', this patch recognizes that specific instruction in DNBArchMachARM64::NotifyException and advances the pc past it so we get the same behavior.

The test case hits a builtin_debugtrap(), continues past it, hits a builtin_trap(), and checks that it cannot advance past that. With this debugserver patch, that's how lldb behaves on both x86 darwin and arm64 darwin.

Pretty simple stuff; the only real question is whether we should make this a macos-only tested behavior, or include Linux as well. Anyone know how this works with lldb-server on linux?

Diff Detail

Event Timeline

jasonmolenda created this revision.Nov 11 2020, 12:05 AM
jasonmolenda requested review of this revision.Nov 11 2020, 12:05 AM

Including Jim, as he was involved in the discussion last time this came around http://lists.llvm.org/pipermail/lldb-dev/2020-March/016017.html.

This functionality would not work on linux/lldb-server currently, though I would love if it did -- I've found the x86 behavior of continuing after an int3 to be pretty handy, and was annoyed by the fact it does not work the same way on arm.

That said, I'm not sure that implementing this in the server is the right approach. The previous discussion seemed to converge on doing that in the client, and I think the reasoning behind that is sound (like, ensuring that the PC we report will be the same as the one extracted by the native crash/backtrace tools when the process when the process hits that instruction without the debugger present). It would also mean that this behavior would automatically work with all stubs, even third party ones we have no control over.

Maybe we don't even have to do the really fancy thing of adjusting the PC just before we continue. The initial implementation could just adjust the PC immediately after a pc. That way, if this will be a client feature, we could easily go back and implement something more fancy...

Another possible way to do this would be to make a new stop reason in LLDB. debugserver currently can return a "reason" whose values are one of "trap", "breakpoint", "watchpoint", "exception", or "exec". What if we added "debug-trap"? Then back in LLDB we could add "eStopReasonDebugTrap" to "lldb::StopReason". Then the ABI plug-in or some arch specific plug-in could do something in response to this on stop, or on continue. x86 would know that is doesn't need to do anything because the PC is already past the trap instructions. But ARM64 it could advance the PC when the process goes to start running again so it skips the trap instruction. I would presonally like to see the program tell the truth when you hit a "brk #0xf000" and show the PC being on this instruction, and then continue from it correctly when you resume. Your solution is much easier though, so either way works.

Yeah, I don't know where it's best to centralize the logic of (1) recognize a builtin_debugtrap instruction, and (2) set the pc so we step past it, know how to step past it when the user asks to resume control. debugserver (and lldb-server I'm sure) are already lying a *little* with a normal x86 user breakpoint - when you execute 0xcc at addr 0x100, the stop pc is 0x101 and someone is rolling that back to 0x100 to report the breakpoint hit (debugserver does this, I'm sure lldb-server does the same). So we're just adding more lies into the remote stub.

You could argue that the stub should always set $pc to be the instruction that hit the breakpoint/undefined instruction (so on x86, back up the pc by 1) and lldb should look at the breakpoint list / instruction bytes and decide if this was a user breakpoint or a builtin_trap of some kind. And when it's a builtin_debugtrap, and the user continues/steps/nexts, lldb could advance $pc past the instruction and then do the nexting.

I don't have much of an opinion about where this knowledge and pc adjusting should happen. Jim might though.

Yeah, I don't know where it's best to centralize the logic of (1) recognize a builtin_debugtrap instruction, and (2) set the pc so we step past it, know how to step past it when the user asks to resume control. debugserver (and lldb-server I'm sure) are already lying a *little* with a normal x86 user breakpoint - when you execute 0xcc at addr 0x100, the stop pc is 0x101 and someone is rolling that back to 0x100 to report the breakpoint hit (debugserver does this, I'm sure lldb-server does the same). So we're just adding more lies into the remote stub.

You could argue that the stub should always set $pc to be the instruction that hit the breakpoint/undefined instruction (so on x86, back up the pc by 1) and lldb should look at the breakpoint list / instruction bytes and decide if this was a user breakpoint or a builtin_trap of some kind. And when it's a builtin_debugtrap, and the user continues/steps/nexts, lldb could advance $pc past the instruction and then do the nexting.

I don't have much of an opinion about where this knowledge and pc adjusting should happen. Jim might though.

No worries, this solution is simple and will work just fine. I have no issues either way. It will be interesting to see what Jim suggests.

The attraction of having stub fix up the PC in the stop after hitting the trap for breakpoints (in this case by moving the PC before the trap on architectures that execute the trap) was that this decision could be made simply in the stub, but if lldb had to check DECR_PC_AFTER_BREAK to understand a breakpoint stop, that would need to happen in a bunch of different places (as it did in gdb) and would be easier to get wrong.

But the only job we would need to do to proceed past unrecognized traps is to check, when we are about to continue, whether the next instruction is a trap. If it is a trap we would push a "step over trap" plan before whatever else we were going to do. And in fact, we already do that for breakpoints because the user might have put a breakpoint on the next instruction before hitting "continue" and we really don't want to stop at that breakpoint. We always check this last thing before the continue. So implementing proceed past unrecognized trap in lldb would be very contained. We'd just have to expand the "is there a breakpoint at the current PC" to "is there a breakpoint or any other architecture defined 'trap' instruction at the current PC."

That weakens the argument for doing it in the stub rather than in lldb.

Doing it in lldb is a little nicer too, since on systems that execute their traps, you would just never see a trap instruction at the PC when you were about to continue, so the same solution would work. And as Pavel says, then we wouldn't have to change lldb-server as well.

So provided lldb has an easy way to answer the question "is this a trap I see before me" I'm mildly in favor of extending lldb's "should I push a step-over-breakpoint plan now" logic to check both "is there a breakpoint here" and "is there any other architecture defined trap here". But not enough to say you shouldn't do it this way...

Minor update to the patch to move the test case under API/macosx for now, because it depends on debugserver behavior. Add comments to note that we're all in agreement this should really be handled up in lldb - but that's more a TODO for a future day, for now.

This revision was not accepted when it landed; it landed in state Needs Review.Nov 12 2020, 11:31 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.