This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Clear children of ValueObject on value update
Needs ReviewPublic

Authored by werat on Jul 6 2021, 3:25 AM.

Details

Reviewers
teemperor
jingham
Summary

Children of ValueObject automatically update themselves when they detect
the state of the process has changed, which typically happens when the
parent value is updated. However, if in case of updating
ValueObjectConstResult the process state is unchanged and the children
remain stale.

Explicitly clear the children upon the parent update, so that they're
re-calculated afterwards.

Diff Detail

Event Timeline

werat requested review of this revision.Jul 6 2021, 3:25 AM
werat created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptJul 6 2021, 3:25 AM
werat updated this revision to Diff 356663.Jul 6 2021, 3:27 AM

Simplify the test case

teemperor accepted this revision.Jul 6 2021, 4:23 AM

So IIUC we can do things here:

  1. We disallow giving ValueObjectConstResult a new value via SetData/SetValueFromCString. Both functions have a way of signaling error to the user, so that could work. I do wonder how many users right now try to reassign a ValueObjectConstResult a new value and expect that to work (or do the error handling).
  2. This patch which fixes the side-effects of giving ValueObjectConstResult a new value (which we already allow in LLDB).

I think this solution here is fine as it seems reasonable thing to do from a user's perspective (which don't know that they actually have a ConstResult underneath). Also given that ValueObjectConstResult is more like a "ValueObjectHostCopy" this doesn't seem too outlandish of a feature. And it doesn't break any existing code.

I'll accept this but please wait until Jim had a chance to have a look (just because having another pair of eyes on ValueObject patches seems like a good idea)

lldb/test/API/python_api/value/change_values/TestChangeValueAPI.py
140

Can you assert that this returns True?

lldb/test/API/python_api/value/change_values/main.c
28

Could this just point to local/global struct bar variables instead? I would prefer not adding more references to external code (malloc) in this test if possible and it would prevent the memory leak in this test.

This revision is now accepted and ready to land.Jul 6 2021, 4:23 AM
werat updated this revision to Diff 356677.Jul 6 2021, 4:30 AM

Use local variables in test
Assert for SetValueFromCString

werat marked 2 inline comments as done.Jul 6 2021, 4:31 AM

Jim, can you take a look, please?

This change is wrong for ValueObjectConstResult's. The original point of ValueObjectConstResult's was to store the results of expressions so that, even if the process is not at the original stop point, you could still check the value. It should have the value it had at the time the expression was evaluated. Updating the value is exactly what you don't want to do. So I think you need to make sure that we can't change their value after they are created.

Note, ExpressionResult variables are different from ExpressionPersistentVariables. The former should not be updated, and after the process moves on any children that weren't gathered become "unknown". But ExpressionPersistentVariables should be able to be assigned to, and when the reference target object, it should update them live. I think this is a useful distinction, but it's somewhat messily expressed in the current state of this code. That should get cleaned up IMO but in the meantime we shouldn't make it muddier...

The other thing to check about this patch is whether it defeats detecting changed values in child elements. If I make a ValueObject that tracks a value in the target - i.e. ValueObjectVariable - and fetch some of its children, then step, I should be able to call ValueObject::GetValueDidChange on the child element and get a correct result. It seems to me that if you throw away the children at the beginning of the update I don't see how you would have anything to compare to. There are a few tests for GetValueDidChange (python_api/value_var_update/TestValueVarUpdate.py). It would be good to first understand whether they just didn't test thoroughly, or if this IS still working, how. It wouldn't be good if this continued working by some accident that was going to break later.

werat added a comment.Jul 9 2021, 3:51 AM

Thanks for the prompt review!

This change is wrong for ValueObjectConstResult's. The original point of ValueObjectConstResult's was to store the results of expressions so that, even if the process is not at the original stop point, you could still check the value. It should have the value it had at the time the expression was evaluated. Updating the value is exactly what you don't want to do. So I think you need to make sure that we can't change their value after they are created.

Looking at the current state of things and the available APIs, I don't think this principle holds anymore. ValueObjectConstResult is also used for values created via SBTarget::CreateValueFromData. As a user I don't see any reason why I shouldn't be allowed to modify the value of an object I created myself earlier. I would argue the same for the results of the expression evaluation. Why are some values different from others?

Note, ExpressionResult variables are different from ExpressionPersistentVariables. The former should not be updated, and after the process moves on any children that weren't gathered become "unknown". But ExpressionPersistentVariables should be able to be assigned to, and when the reference target object, it should update them live. .

Please, correct me if I'm wrong. In my example ExpressionResult is a value returned by EvaluateExpression (i.e. b) and ExpressionPersistentVariable is variable created by the expression evaluator (i.e. $b_0), right? As far as I can tell, they're both ValueObjectConstResult and both have the problem outlined in this patch:

frame0.EvaluateExpression("auto $b_0 = b1")
b = frame0.FindValue("$b_0", lldb.eValueTypeConstResult)
...
# Same problem, updating the value of `b` doesn't invalidate children.

The other thing to check about this patch is whether it defeats detecting changed values in child elements.

This patch invalidates children only in ValueObject::SetNeedsUpdate() which is called only from SetData/SetValueFromCString as far as I understand. If you have a ValueObjectVariable which tracks some variable that can be modified by the process, then it will continue to work fine. TestValueVarUpdate.py passes successfully, but I didn't look to0 closely yet whether it actually tests the scenario you described.


Looking at this from the user perspective, I would prefer to be able to update any value, regardless of which API it came from. In my use case I rely on this heavily -- https://werat.dev/blog/blazing-fast-expression-evaluation-for-c-in-lldb/#maintaning-state.
I could potentially live with the results of EvaluateExpression being immutable, but being able to modify values created by CreateValueFromData or persistent values like $b_0 is necessary for me.

werat updated this revision to Diff 357477.Jul 9 2021, 4:37 AM

Add a test for a value created via SBTarget::CreateValueFromData.

werat updated this revision to Diff 357478.Jul 9 2021, 4:38 AM

Add a test for a value created via SBTarget::CreateValueFromData.

Thanks for the prompt review!

This change is wrong for ValueObjectConstResult's. The original point of ValueObjectConstResult's was to store the results of expressions so that, even if the process is not at the original stop point, you could still check the value. It should have the value it had at the time the expression was evaluated. Updating the value is exactly what you don't want to do. So I think you need to make sure that we can't change their value after they are created.

Looking at the current state of things and the available APIs, I don't think this principle holds anymore. ValueObjectConstResult is also used for values created via SBTarget::CreateValueFromData. As a user I don't see any reason why I shouldn't be allowed to modify the value of an object I created myself earlier. I would argue the same for the results of the expression evaluation. Why are some values different from others?

Expression results store the results of an expression evaluation. As such, it doesn't really make sense to change their values. Moreover, I'd argue it is confusing to update them. If I get the value of an object at time A and the try to fetch a child of it at time B, the value I would get is incoherent with the state of the object when I fetched it. So I don't think those operations make much sense for ExpressionResults, and allowing them to change when you look at them again defeats their purpose, which is to store the results of the expression at the time it was evaluated.

Convenience variables made in the expression parser have a whole different meaning. They are made by and their values controlled by the user of the expression parser. You should be able to change them at will. And since you might have handed them off to the program (for instance storing a convenience variable held object in some collection managed by the program, they need to be updated when program state changes.

Note, ExpressionResult variables are different from ExpressionPersistentVariables. The former should not be updated, and after the process moves on any children that weren't gathered become "unknown". But ExpressionPersistentVariables should be able to be assigned to, and when the reference target object, it should update them live. .

Please, correct me if I'm wrong. In my example ExpressionResult is a value returned by EvaluateExpression (i.e. b) and ExpressionPersistentVariable is variable created by the expression evaluator (i.e. $b_0), right? As far as I can tell, they're both ValueObjectConstResult and both have the problem outlined in this patch:

frame0.EvaluateExpression("auto $b_0 = b1")
b = frame0.FindValue("$b_0", lldb.eValueTypeConstResult)
...
# Same problem, updating the value of `b` doesn't invalidate children.

You are right in everything bug that ExpressionResult variables should not be updated when program state changes.

> The other thing to check about this patch is whether it defeats detecting changed values in child elements. 

This patch invalidates children only in `ValueObject::SetNeedsUpdate()` which is called only from `SetData/SetValueFromCString` as far as I understand. If you have a `ValueObjectVariable` which tracks some variable that can be modified by the process, then it will continue to work fine. `TestValueVarUpdate.py` passes successfully, but I didn't look to0 closely yet whether it actually tests the scenario you described.

The fact that SetNeedsUpdate only gets called from SetData/SetValueFromCString seems to me to be just an accident, nothing enforces not using these when the program state changes, for instance.


Looking at this from the user perspective, I would prefer to be able to update any value, regardless of which API it came from. In my use case I rely on this heavily -- https://werat.dev/blog/blazing-fast-expression-evaluation-for-c-in-lldb/#maintaning-state.
I could potentially live with the results of EvaluateExpression being immutable, but being able to modify values created by CreateValueFromData or persistent values like $b_0 is necessary for me.

I'm less certain about updating "CreateValueFromData" variables. Those tend to be used by Synthetic Child Providers, for instance when you've found an element of an NSDictionary, you use CreateValueFromData to produce the ValueObject that represents it. But in these cases, the data formatters generally recreate these objects rather than update them. If they were going to update them using SetValueFromCString and the like, you would then need to preserve "IsChanged" information in that process.

But for certain expression result variables should be changeable.

jingham requested changes to this revision.Jul 9 2021, 9:42 AM
This revision now requires changes to proceed.Jul 9 2021, 9:42 AM
werat updated this revision to Diff 357710.Jul 10 2021, 6:17 AM

Add more test cases

werat added a comment.EditedJul 10 2021, 6:18 AM

Thanks for the explanation! But at this point I feel I'm a bit confused about how it all _supposed_ to work in the current design :)

If I understand correctly, there are four "types" of values from the user (API) perspective:

  1. ExpressionResult -- value returned by SBFrame::EvaluateExpression()
  2. ExpressionPersistentVariable -- value created by the expression via auto $name = ... syntax. Can be obtained by SBFrame::FindValue("$name", lldb::eValueTypeConstResult).
  3. "Const value" -- value created by SBTarget::CreateValueFromData() or SBTarget::CreateValueFromAddress
  4. "Variable reference" -- value returned by SBFrame::FindVariable()

For which of these value the following test is supposed to work?

struct Foo { int x; };
Foo* f1 = { .x = 1}
Foo* f2 = { .x = 2}  # pseudo-C for simplicity

f1_ref = ...  # Get a value that holds the value of `f1` using one of the four methods described above
print(f1_ref.GetChild(0))  # '1'
f1_ref.SetValueFromCString(frame.FindVariable('f2').value)
print(f1_ref.GetChild(0))  # '2'

My experiments show that it works for "variable references" and "const values" created by
CreateValueFromAddress (but _not_ CreateValueFromData).
UPD: it seems values created CreateValueFromAddress actually behave like "variable references". Modifying their value will modify the underlying data directly.

If I understand your comment correctly, you're saying it should work only for ExpressionPersistentVariable values (#2). Is that right?

I don't have the full picture about the internal implementation and all the use cases, but as a user I would expect it to work for at least #2, #3 and #4. Afaik there's no API to fully distinguish between these kinds of values, so I find it confusing why SBValue::SetData() would be allowed for some values and not allowed for others. If I can create a value using CreateValueFromData and then there's a method SetValueFromCString, then I don't see why it should not be allowed (apart from implementation complexity/consistency reasons).

What do you think? How should we proceed with this?

werat updated this revision to Diff 361649.Jul 26 2021, 7:33 AM
This comment was removed by werat.

Thanks for the explanation! But at this point I feel I'm a bit confused about how it all _supposed_ to work in the current design :)

Apparently I lost track of this review...

If I understand correctly, there are four "types" of values from the user (API) perspective:

  1. ExpressionResult -- value returned by SBFrame::EvaluateExpression()

This is the one that tis expected to be "constant". After all, expression results aren't necessarily anything in the target, they could be an int returned by a function call that is only in the expression result.
Moreover, this ValueObject represents the "results of an expression". It really doesn't make sense to change that. And even if the expression result is just some local variable, we want users to be able to refer back to the results of expressions even when they have left the frame where the expression was evaluated. So these ones need to be "frozen" values.

Note there is a bunch of code in the ValueObjectConstResult that tracks the "live address" if the expression result was something in memory. That's so that you can do things like *$1 and have then do the right thing.

And, there's another complexity which is that the ValueObjectConstResult class got reused for a bunch of things that don't have this restriction, so the internal policy is pretty confusing. I'm trying to find some time to clean this up a bit in the near term.

  1. ExpressionPersistentVariable -- value created by the expression via auto $name = ... syntax. Can be obtained by SBFrame::FindValue("$name", lldb::eValueTypeConstResult).

These ones are supposed to act as though the user had declared an exported global variable of this name and type. So it should be modifiable. As I said above, it's really confusing that this is a ConstResult, which it clearly isn't...

  1. "Const value" -- value created by SBTarget::CreateValueFromData() or SBTarget::CreateValueFromAddress

Creating a value from Data is making a ValueObject that only makes sense to lldb. This is data in lldb's memory. But these entities are really for use by other ValueObject entities, and they would know what the right policy for recalculation should be.

CreateValueFromAddress is like a the Expression Persistent Variables. It refers to an address in the target, and isn't associated with a frame, so it should just reflect what's in that memory location, and writing makes sense as well.

  1. "Variable reference" -- value returned by SBFrame::FindVariable()

These ones are pretty self evident. We do allow you to modify the state of locals, so that should be hooked up. If the underlying variable is a stack local, you it doesn't make sense to update or change them once their frame has been pushed off the stack. But if the variable is a global, then it should be valid till the library it lives in gets unloaded. These all have update points saying what the criteria for their validity are.

For which of these value the following test is supposed to work?

struct Foo { int x; };
Foo* f1 = { .x = 1}
Foo* f2 = { .x = 2}  # pseudo-C for simplicity

f1_ref = ...  # Get a value that holds the value of `f1` using one of the four methods described above
print(f1_ref.GetChild(0))  # '1'
f1_ref.SetValueFromCString(frame.FindVariable('f2').value)
print(f1_ref.GetChild(0))  # '2'

My experiments show that it works for "variable references" and "const values" created by
CreateValueFromAddress (but _not_ CreateValueFromData).
UPD: it seems values created CreateValueFromAddress actually behave like "variable references". Modifying their value will modify the underlying data directly.

If I understand your comment correctly, you're saying it should work only for ExpressionPersistentVariable values (#2). Is that right?

Sorry if I wasn't clear. From the comments above, you should be able to update everything but expression result variables. Updating the From Data ones is a little less clear, but they are mostly for internal uses - like for handing out Synthetic Children.

I don't have the full picture about the internal implementation and all the use cases, but as a user I would expect it to work for at least #2, #3 and #4. Afaik there's no API to fully distinguish between these kinds of values, so I find it confusing why SBValue::SetData() would be allowed for some values and not allowed for others. If I can create a value using CreateValueFromData and then there's a method SetValueFromCString, then I don't see why it should not be allowed (apart from implementation complexity/consistency reasons).

What do you think? How should we proceed with this?

Note, however, that there is one more detail of importance in this system. For variables that represent valid entities in the target, every time you stop we should be able to ask whether the value has changed. We don't try to record every value in full depth, so if on stop A you looked at foo, foo->bar & foo->bar.baz, but not foo->other_struct, then we only report "IsChanged" for foo, foo->bar & foo->bar.baz. But that means you can't willy-nilly throw away the children when you stop or you won't be able to reconstruct the previous value.

Anyway, except for the Data version, which is more an implementation for some other presentation, we seem to agree on what should happen:

ValueObjectVariables should refresh themselves on every stop until they are no longer valid. And they need to be able to report "IsChanged".

ExpressionPersistentVariables and FromAddress ValueObjects should work like ValueObjectVariables that reflect globals, except we made up the globals. The former get memory allocations we made, so they won't get externally changed unless you hand a pointer to them to the target somehow. The address ones are just some random address, so there's no guarantee about their behavior when the target is running. I don't actually remember whether IsChanged is hooked up for them, I don't think that gets shown anywhere.