This is an archive of the discontinued LLVM Phabricator instance.

Add completed_plan_stack to LLDB ThreadStateCheckpoint
AcceptedPublic

Authored by boris.ulasevich on Jan 20 2017, 4:32 AM.

Details

Reviewers
jingham
Summary

Here is a fix for Ubuntu tests fails after the recent change D26497 (https://reviews.llvm.org/D26497). The cause of those fails is in internal call to mmap fuction which destroys completed_plan_stack content which in its turn leads to incorrect stop reason discovering.

The solution is to save/restore completed_plan_stack along with existing ThreadStateCheckpoint functionality.

Diff Detail

Event Timeline

labath added a subscriber: labath.

Jim should probably review this change.

That said, I don't see any failures on linux currently. Can you elaborate on which tests are failing and why? It sounds like there is an opportunity here to write a more robust test...

Jim should probably review this change.

Yes, thank you. We are in contact with Jim. And I see now I had put Jim to Subscribers list instead of Revievers list :)

That said, I don't see any failures on linux currently. Can you elaborate on which tests are failing and why?

Let me explain it. A month ago I got ready-to-land resolution for D26497 and committed it. Very soon I got tests fail notification from build robot: several tests (including my new test) failed on Ubuntu. The decision was to revert the change temporary, and study the issue without a rush. And now I am back with the fix for this issue. I am going to commit D26497 back after this fix.

jingham requested changes to this revision.Jan 20 2017, 11:14 AM

The restoration of the stop reason works for the most part. For instance, when we call a user expression the stop reason is correctly restored. This was done by saving and restoring the stop info. There's one other little two-step you have to do to produce the "cooked" stop reason. For instance, if you stop because of a breakpoint hit that completed a "step-out" plan, you have to check the completed plan stack to figure out what the user visible stop reason was. But before Boris' change the state of the completed plan stack wasn't important once this was done. Boris' change started using the completed plan stack for more than just the initial determination of the cooked stop reason, so you do need to preserve it as well.

I don't think that letting the completed plans live a little longer should pose any problems. The plans should do whatever cleanup they are going to do when they get popped, and should not rely on the destructors to do this work. But it would be worth a quick audit to make sure I didn't get sloppy about that before we make this change, and we should document that requirement in the ThreadPlan dissertation at the beginning of ThreadPlan.h.

Can you do that, and then we should be fine to go in?

This revision now requires changes to proceed.Jan 20 2017, 11:14 AM

BTW, this shows one important difference between OS X and Linux that pops up from time to time. On OS X, we can allocate memory in the target process without having to call functions in the target. We don't do that by hand in lldb (that wouldn't work for remote debugging) rather we have a packet we send to debugserver, and it does the allocation. But the Linux lldb-server doesn't have this ability so on Linux, lldb ends up having to make and run a ThreadPlanCallFunction from awkward places in processing stops. This works for the most part, but it sometimes requires tweaks like this.

This functionality gets tested pretty rigorously just by using lldb on Linux, since this gets done as a side effect of many operations. So it is tested indirectly by the test suite. I'm not sure how you would test this directly, however.

Thank you both for explaining the situation. The "test failure on ubuntu" part threw me off -- I did not realize you were talking about hypothetical failure should a different change go in. I was worried we have some very environment-dependent tests.

Looks good from my side then.

The plans should do whatever cleanup they are going to do when they get popped, and should not rely on the destructors to do this work.

Yes, you already made a note a time ago:

r123869 | jingham | 2011-01-20

Back up both the register AND the stop state when calling functions.

123869 jingham N.B. Don't wait to do clean up target state till the destructor, since that will usually get called when
123869 jingham
the target resumes, and you want to leave the target state correct for new plans in the time between when
123869 jingham // your plan gets unshipped and the next resume.

I don't think that letting the completed plans live a little longer should pose any problems

It is not something extraordinary, but just a life frame the plans intended to live without unexpected internall call.

By the way, I fixed my issue by restoring specific Thread's property - don't you think it is a potential bug for somebody who will use another property (destroyed plans?) without proper treatment in Thread State Checkpoint?

But it would be worth a quick audit

As I see, actually destructors clears breakpoints - it is important, but not urgent job.

we should document that requirement in the ThreadPlan dissertation at the beginning of ThreadPlan.h

For me the ThreadPlan dissertation is like the Bible: I feel it is very old and complicated, a lot on wisdom is hidden there, but nobody around have ever read it attentively :)

I would add words about Thread State there and define stack direction to make words 'below' and 'higher' more definite. Please see new diff.

boris.ulasevich edited edge metadata.

description update

jingham accepted this revision.Jan 23 2017, 12:18 PM

I can't think of a case where you would need the discarded plans to stick around across a function call, I don't think you need to preserve that stack.

I do worry a bit about how the completed plans go away, mostly as you say because of breakpoints which could be left without their ThreadPlan handlers. The simplest proceeding is to convert all the Destructor cleanups to WillPop and then just call that in the destructor. That way the real cleanup goes on when the plan is popped and the destructor is just a safety backup. Leaving the plans around longer increases the risk of this: especially when we know we are changing behavior to leave completed plans around while running a function call thread plan.

I did that conversion for the Function call thread plans a while back, but must have gotten interrupted on the way to doing them all. Anyway, I doesn't make sense to gate this change on completing that piece of work.

This revision is now accepted and ready to land.Jan 23 2017, 12:18 PM