This is an archive of the discontinued LLVM Phabricator instance.

LLDB bug 30863: "Step" doesn't stop with conditional breakpoint on the next line
AcceptedPublic

Authored by boris.ulasevich on Nov 10 2016, 5:03 AM.

Details

Reviewers
jingham
Summary

The issue is that

  • breakpoint handler issues the breakpoint event and discards step plan (ThreadPlanBase::ShouldStop method)
  • breakpoint event handler skips breakpoint (because its condition is false) and resumes the execution

Final fix includes:

  • additional check for completed plans while discarding stale plans
  • GetStopInfo algorithm rework
  • reset StopInfo on breakpoint condition fail

Diff Detail

Event Timeline

boris.ulasevich retitled this revision from to LLDB bug 30863: "Step" doesn't stop with conditional breakpoint on the next line.
boris.ulasevich updated this object.
boris.ulasevich added a project: Restricted Project.
jingham edited edge metadata.Nov 10 2016, 11:03 AM

This isn't quite right. If the condition was false, the breakpoint formally didn't get hit. That means its commands shouldn't get run. So you can't just fall through in the case where you want to treat the "unhit" breakpoint hit as a step end. You need to have that location vote to stop, but then skip all the rest of the handling of that location.

Also, instead of calling it "WasThreadStepPlanDiscarded" I would prefer to name it by what you are using it for, like "StepPlanOverridesBreakpoint".

Also, why is the plan you are checking for on the discarded plan stack? That seems wrong to me. If the step was successful, it should be on the completed plan stack. A step plan will end up getting discarded, for instance, if you were stepping over a function that threw an exception, and you ended up stopping in a frame above where you were stepping. In that case, the plan would return IsStale = false, and get discarded. But if the step completes it should be in the completed plan stack.

jingham requested changes to this revision.Nov 10 2016, 11:03 AM
jingham edited edge metadata.

I'm supposed to also say "request changes"...

This revision now requires changes to proceed.Nov 10 2016, 11:03 AM
boris.ulasevich edited edge metadata.

Thank you for the review. I really appreciate it.

You are right about breakpoint commands, my bad, breakpoint commands should not run. I moved the check later.

I fixed function name. Actually I don't like both names, but I do not see better option :)

About the discarded plan stack: current logic is the following. Remaining plans are discarded intentionally because breakpoint stop is expected (as I can understand, we can't check breakpoint condition in PrivateState thread), and user will not be happy to get more than one stop:

ThreadPlanBase:112: ThreadPlanBase::ShouldStop: eStopReasonBreakpoint -> m_thread.DiscardThreadPlans(false)
Thread.cpp:908: Thread::ShouldStop: should_stop -> DiscardThreadPlansUpToPlan(examined_plan)

So for me it is quite Ok to check discarded plan stack on this unusual condition.

jingham requested changes to this revision.Nov 16 2016, 11:04 AM
jingham edited edge metadata.

It is not the case in general that a breakpoint hit will discard thread plans. Suppose I'm doing a "step over", and in stepping over a function call I hit a breakpoint in the function. The "step over plan" doesn't get discarded, it stays on the plan stack so if you do a "continue" you will complete the step over. The "force" argument in ThreadPlanBasic says "only discard plans that say they can't persist over a breakpoint and other unrelated stop messages - and that's what the comment you adjusted in ThreadPlanBase.cpp is saying.

You can see this in practice:

(lldb) br s -n main
Breakpoint 1: where = foo`main + 15 at foo.c:11, address = 0x0000000100000f5f
(lldb) br s -p "Set a breakpoint here"
Breakpoint 2: where = foo`func + 11 at foo.c:5, address = 0x0000000100000f3b
(lldb) run
Process 16708 launched: '/private/tmp/foo' (x86_64)
Process 16708 stopped
* thread #1: tid = 0x9b1f6d, function: main , stop reason = breakpoint 1.1
    frame #0: 0x0000000100000f5f foo`main at foo.c:11
   8   	int
   9   	main()
   10  	{
-> 11  	  func();
   12  	  return 0;
   13  	} 
(lldb) n
Process 16708 stopped
* thread #1: tid = 0x9b1f6d, function: func , stop reason = breakpoint 2.1
    frame #0: 0x0000000100000f3b foo`func at foo.c:5
   2   	
   3   	int func()
   4   	{
-> 5   	  return printf("Set a breakpoint here\n");
   6   	}
   7   	
   8   	int
(lldb) thread plan list
thread #1: tid = 0x9b1f6d:
  Active plan stack:
    Element 0: Base thread plan.
    Element 1: Stepping over line foo.c:11.

The step over plan has NOT been discarded.

In this particular case, you are seeing an artifact of our choosing to discard the step plan when you hit a breakpoint when the step plan was done. That was done to make the breakpoint hit be the one reported rather than the step plan completion. But since you are now treating the step plan completion as the primary event, you may need to reverse that decision. If you can't easily figure out where that's being done, I'll have a look.

BTW, it's actually commands that we can't evaluate in the private state thread. We actually could test the conditions, since it turns out there are a bunch of instances which require expression evaluation on the private state thread, so I had to make that work. But it makes more sense to handle conditions & commands all together. The problem with running commands on the private state thread is side-stepped by having the PerformAction get run as part of the event's DoOnRemoval, which always happens on the client thread that's running the event loop. I don't think that is relevant.

This revision now requires changes to proceed.Nov 16 2016, 11:04 AM

Thank you for guiding me though this stuff. For years I was responsible for similar debugging issues in Mobile Java, and my motivation now is to learn how it works in lldb world.

I see that not all plans discarded by breakpoint, thank you. But our stepOver plan is discarded, and the reason is that it is stale (PC is out of the range):

Thread.cpp:908: Thread::ShouldStop: if should_stop, if stale -> DiscardThreadPlansUpToPlan(examined_plan)

I do not think it is correct to treat step plan completion as the primary event. Breakpoint have bump counter and commands, and we should not skip its processing.

reverse that decision. If you can't easily figure out where that's being done

It can be done in Thread.cpp:908: call PopPlan instead of DiscardThreadPlans, then the stale step plan will be put to completed stack and next it will be treated as stop reason by event handler. But, again, for me it looks incorrect.

apilipenko added inline comments.
lldb/source/Target/ThreadPlanBase.cpp
104–105 ↗(On Diff #78169)

This type of changes can go in separately without review. Having them in the review request only increases visual litter.

boris.ulasevich edited edge metadata.

Here I come with another fix. Additional code in ThreadPlanStepRange::IsPlanStale() checks if we are exactly on the next instruction just after the range, and sets plan Complete in this case. So for the case we have both breakpoint hit and step plan complete event, and further processing threats step plan complete as primary event (Thread::GetStopInfo), and stop works Ok. Please note that breakpoint's actions (condition check, bump count increase, commands) are skipped when step plan is treated as main event.

I want also to add lldb test for this case, but for now I did not succeed with correct work of lldb test suite on my Mac.

boris.ulasevich edited edge metadata.

Stop reason priorities was reworked, step plans gets a last chance to mark themselves complete while cleaning up stale plans, and breakpoint's PerformAction learned to take Completed Plans into account.

Test suite passed. New test was added to verify breakpoint vs stepping interaction.

jingham requested changes to this revision.Dec 14 2016, 10:49 AM
jingham edited edge metadata.

The logic looks right to me. I suggested a few cleanups to the test case (for which, thanks!) and a couple of other little details. And I requested an overall comment for the logic rework GetStopInfo 'cause that is a little confusing. You can get the logic somewhat by looking at the variable names, but I think some discursive explanation would be really helpful.

packages/Python/lldbsuite/test/functionalities/breakpoint/step_over_breakpoint/TestStepOverBreakpoint.py
52–56 ↗(On Diff #80991)

You can do this check more compactly - and check that you stopped at the right breakpoint - with:

self.thread = lldbutil.get_one_thread_stopped_at_breakpoint(self.breakpoint_1)
self.assertIsNotNone(self.thread, "Didn't stop at breakpoint 1.")

103–104 ↗(On Diff #80991)

If you use assertTrue, you should print the wrong value in the error message, otherwise you'll end up with some weird failure that happens occasionally and you will really wish you knew what the actual stop reason was.

But in this case, I think you can use assertEquals for the stop reason test, and that will automatically print both the expected & actual values.

108–109 ↗(On Diff #80991)

This doesn't actually check that you are stopped at breakpoint 4, just at some breakpoint. For breakpoint stop reasons, the stop info holds the BP ID and the location ID as the 0th and 1st data elements, so the bp id is:

self.thread.GetStopReasonDataAtIndex(0)

Or you can use get_one_thread_stopped_at_breakpoint, and make sure the thread is your thread.

source/Target/StopInfo.cpp
537–538 ↗(On Diff #80991)

typo:

additionalluy -> additionally

539 ↗(On Diff #80991)

I don't understand this comment.

source/Target/Thread.cpp
383 ↗(On Diff #80991)

We use _sp suffix for shared pointers.

387–399 ↗(On Diff #80991)

This bit of logic needs a comment to explain what it is doing.

920–921 ↗(On Diff #80991)

Can you move this comment to before the while loop? It is explaining what the whole loops logic is doing, so it's easier to read if it comes before you dive into reading the loop.

source/Target/ThreadPlanStepInstruction.cpp
97 ↗(On Diff #80991)

You mean "reach" not "stay on", right? Stay on makes it sound like we didn't move the pc from one run to the other.

source/Target/ThreadPlanStepRange.cpp
464

Again, "reach" not "stay on"?

This revision now requires changes to proceed.Dec 14 2016, 10:49 AM
boris.ulasevich marked 10 inline comments as done.Dec 17 2016, 5:08 AM
boris.ulasevich added inline comments.
source/Target/StopInfo.cpp
539 ↗(On Diff #80991)

new comment:

// Here we clean preset stop info.
// Neхt GetStopInfo call will find another stop info,
// and it should be stop info related to the completed plan
source/Target/Thread.cpp
387–399 ↗(On Diff #80991)

Let me add a overall comment for the if-else-else block:

// Here we select the stop info according to priorirty:
// - m_stop_info_sp (if not trace) - preset value
// - completed plan stop info - new value with plan from completed plan stack
// - m_stop_info_sp (trace stop reason is OK now)
// - ask GetPrivateStopInfo to set stop info
920–921 ↗(On Diff #80991)

Sure! New comment for the block will be:

// Discard stale plans and all plans below them in the stack,
// plus move completed plans to completed plan stack
jingham accepted this revision.Dec 19 2016, 11:03 AM
jingham edited edge metadata.

I made some trivial grammar fixes, but other than that this looks good. Thanks for working on this.

source/Target/StopInfo.cpp
539 ↗(On Diff #80991)

Maybe:

Here we clean the preset stop info so the next
GetStopInfo call will find the appropriate stop info,
which should be the stop info related to the completed plan

source/Target/Thread.cpp
387–399 ↗(On Diff #80991)

Great.

920–921 ↗(On Diff #80991)

Fine, but missing some articles:

Discard THE stale plans...
plus move THE completed plans to THE ...

This revision is now accepted and ready to land.Dec 19 2016, 11:03 AM
boris.ulasevich edited edge metadata.