This is an archive of the discontinued LLVM Phabricator instance.

Improve data formatter for libstdcpp unique_ptr
AbandonedPublic

Authored by tberghammer on Feb 22 2017, 1:49 PM.

Details

Reviewers
jingham
labath
Summary

Improve data formatter for libstdcpp unique_ptr

  • Fix infinite loop when there is a reference loop created from smart pointers
  • Respect pointer depth argument in frame variable command
  • Support dereferencing unique_ptr in the frame variable command
  • Support operator-> for unique_ptr in the frame variable command

Note: After submitting this change I plan to add the same functionality for the other smart pointers as well (shared_ptr/weak_ptr, libcpp/libstdcpp)

Diff Detail

Event Timeline

tberghammer created this revision.Feb 22 2017, 1:49 PM
jingham requested changes to this revision.Feb 22 2017, 4:10 PM
jingham added a subscriber: jingham.

It seems a little wrong to me to hook "IsPointerLikeObject" up to the ValueObject, and not to the Synthetic Child Provider for the type. After all, you are using the Synthetic Child Provider to provide the pointee object, so it should be the synthetic child provider that says it is presenting the bare type as a pointer-like thing. That's also more in line with the way the formatters work. They add structure to an underlying object independent of it's own structure. And the "can be treated like a pointer" bit is just another instance of this.

Also, if you did it that way, you could then add an "is_pointer_like" method to the Scripted Synthetic Child Provider, and this could be adopted by formatters for types lldb knows nothing about.

This revision now requires changes to proceed.Feb 22 2017, 4:10 PM
labath edited edge metadata.Feb 23 2017, 2:44 AM

Jim's comment definitely makes sense. If we happen to not use pretty-printer for the unique_ptr, it will not behave as a pointer-like object, and it should be formatted as a normal struct (I have no idea how much of a challenge it would be to implement it that way).

source/DataFormatters/ValueObjectPrinter.cpp
515

Is there are reason you are bundling the is-pointer-like with the is-ref flag instead of the is-ptr one (which would be more logical)? If there is one it certainly isn't obvious.

source/Target/StackFrame.cpp
637

SetErrorStringWithFormatv (and then you can drop the .AsCString from the deref_error).

My original plan was to add it to the synthetic child provider but after some thoughts and experimenting I concluded it doesn't really belongs to there because it should be responsible for generating new children while here we are modifying the way the parent object is printed. Also for this reason the implementation of it would be fairly complex because we would have to execute the synthetic child providers when the parent is referenced instead of when a child of it is queried. Also I think it won't be a good idea to add a new method to the synthetic child provider called "IsPointerLike" as it seems like introducing a very specific method to an interface what is currently quite generic. On the other hand I see the need for making the list of pointer like objects easily extensible from python.

One solution I can possibly see (haven't tried it out) is to add the following 2 new method to the value object:

  • IsDeceremntPointerDepth: By default returns true for pointers and references and will return true for children generated for smart pointers
  • IsDereferenceOfParent: By default returns false and will return true for the first child of the smart pointers

If we add these methods then the synthetic child provider can create a child where they will be returning true and then rely on them to implement this functionality. I think this solution would be a fairly clean design but would introduce a lot of new API for a small amount of extra functionality.

The third possible option is to somehow put the IsPointerLikeObject method into the summary provider as I think that would be the best place for it but I am not sure how to do it as it currently takes a ValueObject and then returns a string without too much flexibility. Saying that it should just flip some bit on the ValueObject itself seems like a quite big hack and would make the API hard to maintain so I don't really like this implementation.

What do you think?

source/DataFormatters/ValueObjectPrinter.cpp
515

Because I want to treat a unique_ptr the same way as a reference so if you just write so it is automatically dereferenced at the root level while not at other levels (we don't dereference pointers by default at any level). I can split it out to a separate bool if you think that helps

My original plan was to add it to the synthetic child provider but after some thoughts and experimenting I concluded it doesn't really belongs to there because it should be responsible for generating new children while here we are modifying the way the parent object is printed.

That is true, however. It still seems that this property should be somehow tied to the child provider, because it is only in the presence of the provider that the unique_ptr obtains its magic dereferencing properties. For example, if I do a frame variable -R X Y, where X is a std::unique_ptr and Y is a variable that happens to have the same layout (but a different name) as X, I would expect them to be printed identically, but it seems this will break that.

Also for this reason the implementation of it would be fairly complex because we would have to execute the synthetic child providers when the parent is referenced instead of when a child of it is queried. Also I think it won't be a good idea to add a new method to the synthetic child provider called "IsPointerLike" as it seems like introducing a very specific method to an interface what is currently quite generic. On the other hand I see the need for making the list of pointer like objects easily extensible from python.

One solution I can possibly see (haven't tried it out) is to add the following 2 new method to the value object:

  • IsDeceremntPointerDepth: By default returns true for pointers and references and will return true for children generated for smart pointers
  • IsDereferenceOfParent: By default returns false and will return true for the first child of the smart pointers

If we add these methods then the synthetic child provider can create a child where they will be returning true and then rely on them to implement this functionality. I think this solution would be a fairly clean design but would introduce a lot of new API for a small amount of extra functionality.

I like where this is going. I can see one possible problem here: the "IsDecrementPointerDepth"(*) property would now be a property of the child, whereas it really seems to be a property of the parent-child relationship. So we could have paths arriving at the same object and one will have this bit set and the other one will not. However this may not be a major problem as the ValueObjects seem to already know their parent anyway. I don't think the API blowup would be big. It seems you have had to add the IsPointerLikeObject method to quite a lot of classes for this implementation, and presumably these would not be needed if we go that way.

  • I would try to keep "pointer depth" out of the name as this isn't something the values should need to worry about. Maybe something like IsObtainedByDereferencingAPointer, only shorter :)
source/DataFormatters/ValueObjectPrinter.cpp
515

Splitting out would definitely help as that gives you space to explain why you chose to treat this differently. That said, I am not entirely sure this needs special treatment. If the smart pointer should behave like a pointer then maybe we short format it as such too (i.e., display the pointed-to value only if it was explicitly asked for).

jingham added a comment.EditedFeb 23 2017, 11:46 AM

It definitely does not belong in the summary provider. So far as I can tell the main benefit of the IsPointerLike part of this thing is that you can do:

(lldb) frame var isPointerLikeThing->field_of_pointee->field_of_field_of_pointee

which the summary wouldn't help with.

I don't understand your argument about it not belonging in the summary provider. Maybe you can clarify?

It seems to me clear this is something that people who write synthetic child providers would want to be able to specify. So hard-coding it in the ValueObject with some special logic to turn it on in the CPP language runtime seems the wrong way to go.

You can also imagine wanting to write a synthetic provider that supports a more complex object - with several interesting synthetic-fields that also has a -> operator. For such an object, you would want:

(lldb) frame var such_an_object

to print all the synthetic fields, but then:

(lldb) frame var such_an_object->field_of_pointee->field_of_field_pointee

would also work. In that case you need to ask somebody which one of the synthetic fields is the "pointer like" one. That clearly has to be the parent's synthetic child provider, nobody else can know. Seems to me, the fact that you had to hard-code this to the 0'th synthetic child demonstrates that this was the wrong approach.

Thank you the comments. Based on them I have the following proposal:

  • Add a new property to value object with name "m_is_dereference_of_parent"
  • Add a new static method named "CreateCopy(name, valobj, is_dereference_of_parent)" to ValueObject and to SBValue what will create a new value object with the same type and data as the original value object with m_is_dereference_of_parent is set (I am open to suggestions for a better name). The reason I need to create a copy of the value object instead of modifying it in place is that if the dereferenced object is referenced without using a synthetic child provider then this flag shouldn't be set and the cleanest way to achieve this is to have 2 separate value object instances.
  • From the synthetic child providers when we want to create a child what is the dereference of the parent (e.g. object for unique_ptr) then we will use the above CreateCopy method
  • We will decrement the pointer depth during display every time we go through an object marked as m_is_dereference_of_parent
  • In "frame variable" when the user uses "operator*" or "operator->" we return the first child with m_is_dereference_of_parent (possibly return an error if multiple child has it set)

I think this approach satisfies all the requirements we had (extensibility, dereference support, loop detection) and makes the API fairly clean. What do you think?

I also thought about having the synthetic child provider mark up the special child value objects when it made them. It bugs me a little but that you couldn't, for instance, have a synthetic child provider that doesn't display a child that is the result of the dereference in its view, but provides one when you do the dereference. And it looks like we're creating a system where we have somebody - the SCP - that knows what the dereference means to it, but then we lose the connection to that knowledge when we hand it out.

Explain a bit more why you need to create a copy? Anything that makes two versions of the same value that we have to keep in sync makes me nervous. For instance, one of the things we want to do in the future is make write-back possible for synthetic children. That would allow you to change values of std::vector elements in the Locals view of a GUI debugger (which would naturally be based on SBValues) just like you can with the non-synthetic SBValues. Also, how would the SCP know to update this copy when you've stepped in the debugger and the dereference now points to another pointee?

I also thought about having the synthetic child provider mark up the special child value objects when it made them. It bugs me a little but that you couldn't, for instance, have a synthetic child provider that doesn't display a child that is the result of the dereference in its view, but provides one when you do the dereference. And it looks like we're creating a system where we have somebody - the SCP - that knows what the dereference means to it, but then we lose the connection to that knowledge when we hand it out.

The current system already supports it to have a synthetic child what doesn't displayed by default (can be referenced by name) so we can rely on that functionality to have a child what is the result of the dereference while not in view. We might have to gave it a special name (e.g. "$dereference") to make it easy to access it from the value object but I am not sure if it will be necessary (would certainly make the code simpler and more efficient).

I thought quite a bit about adding a Dereference method the the synthetic child provider but I am not convinced it is a good idea as it would complicate the API for it and I don't see how can we solve the problem I hit with the infinite smart pointer loop when I added a synthetic child displayed by default what contained the dereference of the pointed object without annotating the children itself.

Explain a bit more why you need to create a copy? Anything that makes two versions of the same value that we have to keep in sync makes me nervous. For instance, one of the things we want to do in the future is make write-back possible for synthetic children. That would allow you to change values of std::vector elements in the Locals view of a GUI debugger (which would naturally be based on SBValues) just like you can with the non-synthetic SBValues. Also, how would the SCP know to update this copy when you've stepped in the debugger and the dereference now points to another pointee?

Take the following (bit strange) example:

struct Foo {
  int x;
  int& operator*() { return x; }
}

If I implement a synthetic child provider for it then it will return the same value object as we would return when accessing the "x" member variable. If we annotate that value object as dereference of parent then it will effect the way we are displaying the "x" member variable even when it is not used through the "operator*". Creating a second separate ValueObject solves this problem by separating the 2 solving this issue. For updating it we will rely on the same logic what is currently used to update other synthetic children after a step what is not backed by an actual variable. My current idea is to create a new ValueObject with the memory address and compiler type copied from the original value object and it will be updated using the Update method of the synthetic child provider.

tberghammer abandoned this revision.Mar 25 2017, 4:26 AM

I wasn't able to get this approach working so I propose an alternative solution at https://reviews.llvm.org/D31366 for the infinite loop issue and I will create a separate CL for dereferencing smart pointers in "frame variable".