This is an archive of the discontinued LLVM Phabricator instance.

Advance the return-address breakpoint location to the end of the next source line, or the next branching instruction, when stepping over a func call
ClosedPublic

Authored by jasonmolenda on Dec 21 2015, 7:59 PM.

Details

Reviewers
jingham
Summary

When lldb is instruction stepping (or fast-stepping / instruction stepping) over a source line, and it instruction steps into a function call, and the function is not "interesting" to the user, lldb sets a breakpoint on the return address and continues the process until that breakpoint is hit on the current thread.

lldb has a fast-stepping approach to avoid stopping at every instruction in a source line range - it only stops on instructions that can branch / return / call functions.

So we have one extra stop when nexting over a function call. We step into the uninteresting-function, put a breakpoint on the return address, continue, hit the breakpoint on the return address and then we fast-step to the next branch instruction. That was four extra gdb-remote protocol packets for every function call in a source line.

This patch advances the stop breakpoint to the first branching instruction after we return, or to the end of the source line.

It passes the testsuite. I'll be doing more by-hand testing in the following week or two when I have time, but it's a straightforward change, it shouldn't cause problems unless I've missed something. I added the new Process::AdvanceAddressToNextBranchInstruction() method but I'm only calling it from ThreadPlanStepOut::ThreadPlanStepOut. I thought I'd be able to unify this new function with the code in ThreadPlanStepRange::SetNextBranchBreakpoint but I don't think the few lines I could remove from ThreadPlanStepRange::SetNextBranchBreakpoint would be worth the change. Jim might disagree with that.

The one point where this would be incorrect is a command like "finish" which displays the return value after the function exits. On an architecture where the return values are passed in volatile registers (x86, arm), that register may be overwritten after the function return so we must stop on the return address.

Diff Detail

Repository
rL LLVM

Event Timeline

jasonmolenda retitled this revision from to Advance the return-address breakpoint location to the end of the next source line, or the next branching instruction, when stepping over a func call.
jasonmolenda updated this object.
jasonmolenda added a reviewer: jingham.
jasonmolenda set the repository for this revision to rL LLVM.
jasonmolenda added a subscriber: lldb-commits.
jingham requested changes to this revision.Jan 4 2016, 4:32 PM
jingham edited edge metadata.

I don't so much mind that you couldn't reuse AdvanceAddressToNextBranchInstruction, we wouldn't be using the "get the disassembly" part of it, which is the biggest bit, since that's already done in GetInstructionsForAddress which also handles the cache of disassembled instructions. If the disassembly part of this stepping ever shows up as a big time sink, then we can move the cache of instruction fragments to the process, and get it from there.

The only bit I'm concerned with is why you needed to call Clear on the disassembler's instruction list. That seems odd to me.

include/lldb/Target/Process.h
3182–3183

"The" should be "the".

include/lldb/Target/Thread.h
912–913

Can you call this something that says what it does, like "continue_to_next_branch"? private_step_out seems too generic.

Also, should this be defaulted to false?

source/Target/Process.cpp
6578–6579

I don't understand this comment, and this seems like an odd thing to have to do. The disassembler is going to get destroyed when we leave this function. Why should we have to clear its instruction list manually?

This revision now requires changes to proceed.Jan 4 2016, 4:32 PM

Thanks for the comments. As for calling Clear() on the instruction list manually, I was aping the dtor on ThreadPlanStepRange which reads,

// FIXME: The DisassemblerLLVMC has a reference cycle and won't go away if it has any active instructions.
// I'll fix that but for now, just clear the list and it will go away nicely.
for (size_t i = 0; i < num_instruction_ranges; i++)
{
    if (m_instruction_ranges[i])
        m_instruction_ranges[i]->GetInstructionList().Clear();
}

I should probably check that this is the case before I copy the hack.

I'll see if I can consolidate the disassembly into one big of shared code.

jasonmolenda edited edge metadata.

Updated to address Jim's first round of comments.

  1. fix typeo in comment
  1. update change argument name from "private_step_out" to "continue_to_next_branch"
  1. Change continue_to_next_branch argument to have a default value of false (I'm always a little reluctant to use these default arguments because it can make it more difficult to understand which method is being called with overloaded methods, or if when it gets out of hand, which arguments are going where -- but I'm sure it won't be a problem in this case)
  1. The call to InstructionList::Clear is necessary, letting the Disassembler dtor execute is not sufficient (I did a quick test to make sure the comment was still right). This comment:

// The DisassemblerLLVMC has a reference cycle and won't go away if it has any active instructions.

appears in six places around the codebase already (one of them being in the ThreadPlanStepRange dtor). I updated to use the same comment.

  1. The code duplication in ThreadPlanStepRange::GetInstructionsForAddress and Process::AdvanceAddressToNextBranchInstruction (where we disassemble an address range and identify the next branching instruction) is not great, but when you strip out the correctness checks on the inputs, the total of the duplication is

    ExecutionContext exe_ctx (this); const char *plugin_name = nullptr; const char *flavor = nullptr; const bool prefer_file_cache = true; disassembler_sp = Disassembler::DisassembleRange(target.GetArchitecture(), plugin_name, flavor, exe_ctx, range_bounds, prefer_file_cache);

so I just left them as separate code. (over in ThreadPlanStepRange, the method to get the InstructionList is in GetInstructionsForAddress and then the logic to get the next non-branching instruction is in SetNextBranchBreakpoint. In Process, I'm doing both in one method). ThreadPlanStepRange saves the Disassemblers (one per AddressRange for the multiple parts of a source line) in a vector; I'm not caching anything over in Process because we're unlikely to use the same InstructionList again in this same context.

I can go either way on this one. It was enough of a difference in behavior, and a small enough bit of code, that I didn't try to unify them, but I'm not wedded to that.

jingham accepted this revision.Jan 7 2016, 10:17 AM
jingham edited edge metadata.

I agree default arguments should be used with care. But in cases where some argument should generally be used one way and only have another value when used for special purposes by them as knows what they are doing, then the default argument expresses this.

Can you file a bug about the need to clear the instruction list. That's just a little time bomb we're leaving around. You don't need to fix it for this patch, but we should remind ourselves to fix it when we have a spare moment.

The code duplication doesn't bother me either, it is not significant.

This revision is now accepted and ready to land.Jan 7 2016, 10:17 AM
jasonmolenda closed this revision.Jan 7 2016, 4:10 PM

Landed in r257117.

Sending include/lldb/Target/Process.h
Sending include/lldb/Target/Thread.h
Sending include/lldb/Target/ThreadPlanStepOut.h
Sending source/Target/Process.cpp
Sending source/Target/Thread.cpp
Sending source/Target/ThreadPlanShouldStopHere.cpp
Sending source/Target/ThreadPlanStepOut.cpp
Sending source/Target/ThreadPlanStepOverRange.cpp
Transmitting file data ........
Committed revision 257117.