This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Fix SBValue::Persist() for constant values
Needs ReviewPublic

Authored by werat on Mar 10 2021, 1:43 PM.

Details

Reviewers
teemperor
jingham
Summary

Right now SBValue::Persist() works properly only for values that refere to variables (refer to unit-tests for an example). Constant values (e.g. values created by EvaluateExpression or CreateValueFromData) can be persisted, but using them in the expressions leads to an error:

>>> v = lldb.frame.EvaluateExpression("1+2")
>>> vp = v.Persist()
>>> vp.GetName()
'$1'
>>> v = lldb.frame.EvaluateExpression("$1")
>>> v.GetValue()
'3'
>>> v = lldb.frame.EvaluateExpression("$1+1")
>>> v.GetError().GetCString()
"error: supposed to interpret, but failed: Interpreter couldn't read from memory\n"

In this patch we mark constant values as "required materialization" and fix up the dematerialization logic to apply side-effects.

Also move Windows-failing use cases to a separate test, so that other tests can be executed on Windows too.

Diff Detail

Event Timeline

werat requested review of this revision.Mar 10 2021, 1:43 PM
werat created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptMar 10 2021, 1:43 PM
werat added a comment.Mar 10 2021, 1:50 PM

Hi @teemperor , here's an attempt to fix SBValue::Persist method. I've highlighted a few moments in the patch I'm not so sure about, please let me know what you think. Thanks!

lldb/source/Expression/Materializer.cpp
301

Writing data directly to m_persistent_variable_sp->GetValueBytes() is not enough, because the underlying ValueObject also stores the data in m_value and I didn't find a way to update that one too. Overall there seems to be an assumption that ValueObjectConstResult doesn't change (which makes sense), so creating a new one here seems more in line with the design.

303–304

I am not sure how to deal with this. "ConstResult" variables needs to be dematerialized every time, so they should either have EVNeedsFreezeDry flag, or the check should be different. Removing this flag doesn't seem to break anything though...

Sorry for the delay!

From my playing around it look like if you do all the same steps as you show in your example, except that you leave out the call to "Persist" and instead directly use the SBValue from the expression directly, then everything works correctly.

If that's true then the original persistent variable we made for the expression result was working correctly, and something about Persist messed it up. That's not 100% surprising, since we're "re-persisting" an already persistent variable...

If persisting already persistent variables is indeed the only failure case, then I wonder if it wouldn't be more straightforward to just see if the ValueObject is already a persistent variable and have Persist just return the incoming variable.

werat added a comment.EditedApr 21 2021, 3:41 AM

If persisting already persistent variables is indeed the only failure case, then I wonder if it wouldn't be more straightforward to just see if the ValueObject is already a persistent variable and have Persist just return the incoming variable.

Persisting already persistent variables is not the only valid use case, one might also want to persist variables created via SBTarget::CreateValueFromData().

I guess it is currently not clear from the API, but I would expect SBValue::Persist() to produces a new value every time:

auto v1 = value.Persist();
auto v2 = value.Persist();

assert(v1.GetName() != v2.GetName())  // modulo comparing char*...

If persisting already persistent variables is indeed the only failure case, then I wonder if it wouldn't be more straightforward to just see if the ValueObject is already a persistent variable and have Persist just return the incoming variable.

Persisting already persistent variables is not the only valid use case, one might also want to persist variables created via SBTarget::CreateValueFromData().

Sure. But but when I was poking around at it a little bit, it seems like the other use cases already work, and the only one that was failing was the case where you call persist on a persistent variable. If that is really true, then maybe we should fix the failing case directly.

I guess it is currently not clear from the API, but I would expect SBValue::Persist() to produces a new value every time:

auto v1 = value.Persist();
auto v2 = value.Persist();

assert(v1.GetName() != v1.GetName())  // modulo comparing char*...

Not sure why? The API is a request: "I made a variable somehow, and I would like you to make it persist so I can use its value later on even if the underlying data has changed." Why do you care whether you get a copy of an already persistent or just a shared value?

werat added a comment.Apr 22 2021, 7:43 AM

Sure. But but when I was poking around at it a little bit, it seems like the other use cases already work, and the only one that was failing was the case where you call persist on a persistent variable. If that is really true, then maybe we should fix the failing case directly.

Right now Persist() doesn't really work for values created via CreateValueFromData. You can read them, but can't modify:

>>> data = lldb.SBData.CreateDataFromUInt64Array(lldb.process.GetByteOrder(), lldb.process.GetAddressByteSize(), [42])
>>> v = lldb.target.CreateValueFromData('v', data, lldb.target.GetBasicType(lldb.eBasicTypeUnsignedLong))
>>> v.value
'42'
>>> vp = v.Persist()
>>> vp.name
'$3'
>>> lldb.frame.EvaluateExpression('$3').value
'42'
>>> lldb.frame.EvaluateExpression('++$3 + 1').value
>>> lldb.frame.EvaluateExpression('++$3 + 1').error.GetCString()
"error: supposed to interpret, but failed: Interpreter couldn't read from memory\n"

However I realized my patch doesn't completely fixes it either:

>>> data = lldb.SBData.CreateDataFromUInt64Array(lldb.process.GetByteOrder(), lldb.process.GetAddressByteSize(), [42])
>>> v = lldb.target.CreateValueFromData('v', data, lldb.target.GetBasicType(lldb.eBasicTypeUnsignedLong))
>>> vp = v.Persist()
>>> vp.name
'$0'
>>> lldb.frame.EvaluateExpression('$0').value
'42'
>>> lldb.frame.EvaluateExpression('++$0').value
'43'
>>> lldb.frame.EvaluateExpression('++$0').value
'44'
>>> vp.value
'42'

Not sure why? The API is a request: "I made a variable somehow, and I would like you to make it persist so I can use its value later on even if the underlying data has changed." Why do you care whether you get a copy of an already persistent or just a shared value?

You're right, I got confused by something else. I don't care if I get a new name/copy, as long as I can use it by the returned name it's fine. However I want to point out that the current API does generate a new name every time (but the it points to the same data):

>>> x = lldb.frame.FindVariable('x')
>>> x.value
'1'
>>> xp1 = x.Persist()
>>> xp1.name
'$0'
>>> xp2 = x.Persist()
>>> xp2.name
'$1'
>>> lldb.frame.EvaluateExpression('++$0 + ++$1').value
'3'
>>> xp1.value
'3'
>>> xp2.value
'3'

Sure. But but when I was poking around at it a little bit, it seems like the other use cases already work, and the only one that was failing was the case where you call persist on a persistent variable. If that is really true, then maybe we should fix the failing case directly.

Right now Persist() doesn't really work for values created via CreateValueFromData. You can read them, but can't modify:

>>> data = lldb.SBData.CreateDataFromUInt64Array(lldb.process.GetByteOrder(), lldb.process.GetAddressByteSize(), [42])
>>> v = lldb.target.CreateValueFromData('v', data, lldb.target.GetBasicType(lldb.eBasicTypeUnsignedLong))
>>> v.value
'42'
>>> vp = v.Persist()
>>> vp.name
'$3'
>>> lldb.frame.EvaluateExpression('$3').value
'42'
>>> lldb.frame.EvaluateExpression('++$3 + 1').value
>>> lldb.frame.EvaluateExpression('++$3 + 1').error.GetCString()
"error: supposed to interpret, but failed: Interpreter couldn't read from memory\n"

However I realized my patch doesn't completely fixes it either:

>>> data = lldb.SBData.CreateDataFromUInt64Array(lldb.process.GetByteOrder(), lldb.process.GetAddressByteSize(), [42])
>>> v = lldb.target.CreateValueFromData('v', data, lldb.target.GetBasicType(lldb.eBasicTypeUnsignedLong))
>>> vp = v.Persist()
>>> vp.name
'$0'
>>> lldb.frame.EvaluateExpression('$0').value
'42'
>>> lldb.frame.EvaluateExpression('++$0').value
'43'
>>> lldb.frame.EvaluateExpression('++$0').value
'44'
>>> vp.value
'42'

Not sure why? The API is a request: "I made a variable somehow, and I would like you to make it persist so I can use its value later on even if the underlying data has changed." Why do you care whether you get a copy of an already persistent or just a shared value?

You're right, I got confused by something else. I don't care if I get a new name/copy, as long as I can use it by the returned name it's fine. However I want to point out that the current API does generate a new name every time (but the it points to the same data):

>>> x = lldb.frame.FindVariable('x')
>>> x.value
'1'
>>> xp1 = x.Persist()
>>> xp1.name
'$0'
>>> xp2 = x.Persist()
>>> xp2.name
'$1'
>>> lldb.frame.EvaluateExpression('++$0 + ++$1').value
'3'
>>> xp1.value
'3'
>>> xp2.value
'3'

Be careful here. There are really two kinds of persistent variables: "expression results" and variables created in the expression parser. The former are by design constant. The idea is that you can use them to checkpoint the state of a value and refer to it later. You can use their values in expressions, but they aren't supposed to be modifiable. Those are the ones called $NUM.

The ones you make in an expression, like:

(lldb) expr int $myVar = 20

on the other hand are modifiable.

werat added a comment.Apr 22 2021, 1:42 PM

Be careful here. There are really two kinds of persistent variables: "expression results" and variables created in the expression parser. The former are by design constant. The idea is that you can use them to checkpoint the state of a value and refer to it later. You can use their values in expressions, but they aren't supposed to be modifiable. Those are the ones called $NUM.

The ones you make in an expression, like:

(lldb) expr int $myVar = 20

on the other hand are modifiable.

What about the values created via SBValue::Persist() then? They have names like $NUM, but they can be either of the two you mentioned and more (values created via CreateValueFromData or values acquired via FindVariable):

v = frame.EvaluateExpression("...")  # `v` is "$0" -- it's an "expression result" by your classification
vp = v.Persist()  # `vp` is "$1" -- I assume it should point the same value as "$0" and be non-modifiable

--

v = frame.EvaluateExpression("$myVar = 20")  # `v` is "$0" -- it's an "expression result" by your classification
vp = v.Persist()  # `vp` is "$1" -- I assume it should point the same value as "$0" and be non-modifiable

myVar = frame.EvaluateExpression("$myVar")  # now `myVar` is "$myVar" -- it's a "variables created in the expression parser"
myVarP= myVar.Persist()  # `myVarP` is "$2" -- ?? Should it point the same value as `myVar` and be modifiable?

--

v = target.CreateValueFromData(...)
vp = v.Persist()  # `vp` is "$0" -- ?? Should it be modifiable? Should it point to the original `v`?

I think at this point I should explain the problem I'm solving from the beginning.

I have built an expression evaluator for LLDB (and based on LLDB) -- https://github.com/google/lldb-eval. It's much faster than LLDB's EvaluateExpression, but has some limitations -- it doesn't support user-defined function calls, doesn't allows to evaluate arbitrary C++ code, etc.
It is used in Stadia for Visual Studio debugger (https://github.com/googlestadia/vsi-lldb) complementing LLDB. Expressions that are not supported by lldb-eval are evaluated by LLDB, thus from user perspective everything "just works". In some situations we need to maintain some "state", which can be read/written by multiple consecutive expressions. Simplified example:

EvaluateExpression("$index += 1");
v1 = EvaluateExpression("array[$index]")

EvaluateExpression("$index += 1");
v2 = EvaluateExpression("array[$index]")

In case of LLDB we create $index via something like expr $index = 0 and then the expressions can modify it. In lldb-eval, however, we don't want to use LLDB's persistent variables, so the state is represented via regular map of SBValue objects (https://werat.dev/blog/blazing-fast-expression-evaluation-for-c-in-lldb/#maintaning-state). Basically the API is something like this:

state = {"index": some_sbvalue}

EvaluateExpression("index += 1", state);
v1 = EvaluateExpression("array[index]", state)

EvaluateExpression("index += 1", state);
v2 = EvaluateExpression("array[index]", state)

some_sbvalue.GetValue()  # == 2

But as I mentioned before, our debugger shell is "smart" and can fallback to LLDB if lldb-eval fails to evaluate the expression. If there was a state involved, it needs to be converted to LLDB's persistent variables. Right now we workaround this issues by always allocating "state" values in LLDB (https://github.com/googlestadia/vsi-lldb/blob/0fcbe30a498fac70230efdb815125ee5f6f087bb/YetiVSI/DebugEngine/NatvisEngine/NatvisExpressionEvaluator.cs#L315) -- this way we don't need to convert it back. However it would be better to convert only if needed and for that I was hoping to use SBValue::Persist().

The example below doesn't work, but that's what I would like to fix (code is simplified):

// We can assume here the result is created via something like `CreateValueFromData()`
// 
lldb::SBValue counter = lldb_eval::EvaluateExpression("10");

lldb_eval::ContextVariableList vars = { {"counter", &counter } };

while (counter.GetValueAsUnsigned() > 5) {
  lldb_eval::EvaluateExpression("--counter", vars);
}

lldb::SBValue p = counter.Persist();  // Assume `p.GetName()` is "$1"
lldb::SBValue v1 = lldb::EvaluateExpression("$1");    // v1.GetValue() == 5
lldb::SBValue v2 = lldb::EvaluateExpression("--$1");  // v2.GetValue() == 4

Sorry for the long message, I hope it's more clear what I'm trying to do here :)

Be careful here. There are really two kinds of persistent variables: "expression results" and variables created in the expression parser. The former are by design constant. The idea is that you can use them to checkpoint the state of a value and refer to it later. You can use their values in expressions, but they aren't supposed to be modifiable. Those are the ones called $NUM.

The ones you make in an expression, like:

(lldb) expr int $myVar = 20

on the other hand are modifiable.

What about the values created via SBValue::Persist() then? They have names like $NUM, but they can be either of the two you mentioned and more (values created via CreateValueFromData or values acquired via FindVariable):

I am guessing those names are chosen just because that was a handy routine for auto-generating a variable name. It doesn't follow with our general naming conventions, we should probably distinguish them from result variables by giving them some other name (like $P0...) or something.

v = frame.EvaluateExpression("...")  # `v` is "$0" -- it's an "expression result" by your classification
vp = v.Persist()  # `vp` is "$1" -- I assume it should point the same value as "$0" and be non-modifiable

--

v = frame.EvaluateExpression("$myVar = 20")  # `v` is "$0" -- it's an "expression result" by your classification
vp = v.Persist()  # `vp` is "$1" -- I assume it should point the same value as "$0" and be non-modifiable

myVar = frame.EvaluateExpression("$myVar")  # now `myVar` is "$myVar" -- it's a "variables created in the expression parser"
myVarP= myVar.Persist()  # `myVarP` is "$2" -- ?? Should it point the same value as `myVar` and be modifiable?

--

v = target.CreateValueFromData(...)
vp = v.Persist()  # `vp` is "$0" -- ?? Should it be modifiable? Should it point to the original `v`?

I think at this point I should explain the problem I'm solving from the beginning.

I have built an expression evaluator for LLDB (and based on LLDB) -- https://github.com/google/lldb-eval. It's much faster than LLDB's EvaluateExpression, but has some limitations -- it doesn't support user-defined function calls, doesn't allows to evaluate arbitrary C++ code, etc.
It is used in Stadia for Visual Studio debugger (https://github.com/googlestadia/vsi-lldb) complementing LLDB. Expressions that are not supported by lldb-eval are evaluated by LLDB, thus from user perspective everything "just works". In some situations we need to maintain some "state", which can be read/written by multiple consecutive expressions. Simplified example:

EvaluateExpression("$index += 1");
v1 = EvaluateExpression("array[$index]")

EvaluateExpression("$index += 1");
v2 = EvaluateExpression("array[$index]")

In case of LLDB we create $index via something like expr $index = 0 and then the expressions can modify it. In lldb-eval, however, we don't want to use LLDB's persistent variables, so the state is represented via regular map of SBValue objects (https://werat.dev/blog/blazing-fast-expression-evaluation-for-c-in-lldb/#maintaning-state). Basically the API is something like this:

state = {"index": some_sbvalue}

EvaluateExpression("index += 1", state);
v1 = EvaluateExpression("array[index]", state)

EvaluateExpression("index += 1", state);
v2 = EvaluateExpression("array[index]", state)

some_sbvalue.GetValue()  # == 2

But as I mentioned before, our debugger shell is "smart" and can fallback to LLDB if lldb-eval fails to evaluate the expression. If there was a state involved, it needs to be converted to LLDB's persistent variables. Right now we workaround this issues by always allocating "state" values in LLDB (https://github.com/googlestadia/vsi-lldb/blob/0fcbe30a498fac70230efdb815125ee5f6f087bb/YetiVSI/DebugEngine/NatvisEngine/NatvisExpressionEvaluator.cs#L315) -- this way we don't need to convert it back. However it would be better to convert only if needed and for that I was hoping to use SBValue::Persist().

The example below doesn't work, but that's what I would like to fix (code is simplified):

// We can assume here the result is created via something like `CreateValueFromData()`
// 
lldb::SBValue counter = lldb_eval::EvaluateExpression("10");

lldb_eval::ContextVariableList vars = { {"counter", &counter } };

while (counter.GetValueAsUnsigned() > 5) {
  lldb_eval::EvaluateExpression("--counter", vars);
}

lldb::SBValue p = counter.Persist();  // Assume `p.GetName()` is "$1"
lldb::SBValue v1 = lldb::EvaluateExpression("$1");    // v1.GetValue() == 5
lldb::SBValue v2 = lldb::EvaluateExpression("--$1");  // v2.GetValue() == 4

Sorry for the long message, I hope it's more clear what I'm trying to do here :)

NP, good to know the context...

From playing around with the current state of things, it looks like the Persisted version of an SBValue should behave exactly like it's underlying SBValue, the only thing Persist is supposed to do is give it a name that can be referred to in expressions. For instance, if you get an SBValue from SBFrame.FindVariable, call Persist on it, and then call SetValueFromCString on the persisted version, or use it in an expression and it actually changes the value of the local variable, just as it would if you called that on the FrameVariable one.

So it looks to me like you can't just key off of whether the underlying m_frozen_sp is a ValueObjectConstResult. That's just the object container that mostly always gets used to hold the frozen value. Instead we should always set the copy back policies based on the kind of SBValue. That happens automatically for expression & expression result variables because the system gets to make them the right way when it sets them up. From what I can see, inheriting the original SBValue's behavior when calling Persist on a FrameVariable SBValue also sets it up to track the underlying value & be able to change it/change with it.

Have you implemented the part where instead of using an expression result for your extra values you make one with SBTarget.CreateValueFromData? Do SBValues made that way also fail to change value like the expression result ones do? It doesn't seem to me like you actually care whether expression results are changeable - you don't actually plan to use those right? You just want to make sure that Persist-ed CreateValueFromData SBValues are changeable (and get materialized properly when used). So I'd start with seeing if you can change SBValues made that way. If they just work, then Party!

Otherwise, Persist is going to have to have some way to tell these SBValues apart from Expression Result Variables, so that it can fix them up appropriately. Maybe we should be stricter about the rule should be that $<NN> variables are always considered expression result. We'd probably want to add a parallel naming queue for Persisted variables to keep them distinct? And then you could do pretty much what you are doing here, but only to non-result variables.