This is an archive of the discontinued LLVM Phabricator instance.

[SBValue] Add a method GetNumChildren(uint32_t max)
ClosedPublic

Authored by sivachandra on Oct 15 2015, 11:16 AM.

Details

Summary

Along with this, support for an optional argument to the "num_children"
method of a Python synthetic child provider has also been added. These have
been added with the following use case in mind:

Synthetic child providers currently have a method "has_children" and
"num_children". While the former is good enough to know if there are
children, it does not give any insight into how many children there are.
Though the latter serves this purpose, calculating the number for children
of a data structure could be an O(N) operation if the data structure has N
children. The new method added in this change provide a middle ground.
One can call GetNumChildren(K) to know if a child exists at an index K
which can be as large as the callers tolerance can be. If the caller wants
to know about children beyond K, it can make an other call with 2K. If the
synthetic child provider maintains state about it counting till K
previosly, then the next call is only an O(K) operation. Infact, all
calls made progressively with steps of K will be O(K) operations.

Diff Detail

Event Timeline

sivachandra retitled this revision from to [SBValue] Add a method HasChildAtIndex..
sivachandra updated this object.
sivachandra added a subscriber: lldb-commits.

If the general idea in this patch is OK, I will update it with tests.

What do you plan to use this has_child_at_index call for?

My concern is that it adds complication to the synthetic child provider interface. Yes, I understand it is optional, but it seems like yet another thing in the "any children?" realm that I would rather much avoid if possible. Which is why my question. Let's start from the use case and see if we can come up with anything else that could cover it.

As I mentioned in the first post, this is the use case:

Synthetic child providers currently have a method "has_children" and
"get_num_children". While the former is good enough to know if there are
children, it does not give any insight into how many children there are.
Though the latter serves this purpose, calculating the number for children
of a data structure could be an O(N) operation if the data structure has N
children. The new methods added in this change provide a middle ground.
One can call HasChildAtIndex to know if a child exists at an index K which
can be as large as the callers tolerance can be. If the caller wants to
know about children beyond K, it can make an other call with 2K. If the
synthetic child provider maintains state about it counting till K
previosly, then the next call is only an O(K) operation. Infact, all
calls made progressively with steps of K will be O(K) operations.

Is it possible to ask for concrete examples?

As in, for example, do you plan to change ValueObjectPrinter to use this API in some way?
Or do you have any specific synthetic providers that you would like to use this API in?

Yes, we would like to use this is specific synthetic providers. It will not be used by any part of LLDB internally.

granata.enrico requested changes to this revision.Oct 15 2015, 11:49 AM
granata.enrico edited edge metadata.

If so, why do you need this to be exposed as a call that LLDB knows about?

You are free to provide any methods you want in your synthetic child providers. Adding them to the interface as this patch does is only necessary if you want LLDB to use the new methods as part of its interaction with the child provider.

This revision now requires changes to proceed.Oct 15 2015, 11:49 AM

Ah, we want this exposed in the SBValue API. What I mean't in the earlier message was that no other part of LLDB will functionally depend/lookup/expect this "feature".

So, if I understand this correctly, this HasChildAtIndex() API returns a three-state int:

1 == yes I do
0 == no I do not
-1 == maybe, can't tell

When would you ever run into the -1 case? It seems you're mostly using it as an invalid marker. Would 0 not be a better choice for those invalid cases? Then you could just use a boolean which seems a better answer for an inherently yes/no question (invalid objects do not have a child at index for any index, do they?)

Also, if you're only using this from inside a synthetic child provider, why does it need to be in SBValue? Or is your argument that you want to write scripts against LLDB that use this feature?

Truth be told, I am not particularly excited about this feature, and the fact that this seems to be introduced in a vacuum does nothing to assuage my original concern that is is a potentially avoidable complication to the model.

clayborg edited edge metadata.Oct 15 2015, 1:17 PM

I would almost rather change the GetNumChildren() to take a "max" param:

class SBValue
{
     uint32_t
     GetNumChildren ();
     uint32_t
     GetNumChildren (uint32_t max = UINT32_MAX);
};

We need to keep the other one so that our public API still has the old call, but the second version could limit the number of children to verify to at most "max".

I did consider the option of adding a new method GetNumChildren (uint32_t max = UINT32_MAX). This if fine from SBValue point of view. But from a the scripted synthetic provider point of view, I am not sure there is an easy/clean way of checking whether the num_children method takes the max cap argument. Are you suggesting that we have different method like "num_children_with_cap"?

One alternative would be to have the num_children method on the synthetic provider itself be able to take an optional argument that is the max_count. That is, instead of defining

def num_children(self):
  ..

You would instead define

def num_children(self, max_count):
  ..

Of course, you would need to continue supporting both possible signatures in order to maintain compatibility with existing formatters. But we do have some support for that. If you look at LLDBSwigPythonCallCommand:

PyCallable::argc argc = pfunc.GetNumArguments();
if (argc.num_args == 5 || argc.varargs == true)
    pvalue = pfunc(debugger_sb, args, exe_ctx_sb, &cmd_retobj_sb, session_dict = FindSessionDictionary(session_dictionary_name));
else
    pvalue = pfunc(debugger_sb, args, &cmd_retobj_sb, session_dict = FindSessionDictionary(session_dictionary_name));

num_children could start doing something similar for 1 vs. 2 arguments

Ah, I did not notice PyCallable::GetNumArguments before you drew my attention to it. It seems to be using internals of PyCodeObject. When I said "no easy/clean way", I was talking about avoiding such a usage. If that is acceptable, I am fine with it. I will modify this patch to add SBValue::GetNumArguments(uint32 max=UINT32_MAX).

I did this experiment before setting off:

class A
{
public:
  int method();
  int method(int a = 10);
};

int
A::method()
{
  return 10;
}

int
A::method(int a)
{
  return a + 10;
}

int
main()
{
  A a;
  return a.method();
}

The invocation a.method() is ambiguous:

$> clang++-3.5 -g methods.cc 
methods.cc:25:12: error: call to member function 'method' is ambiguous
  return a.method();
         ~~^~~~~~
methods.cc:10:4: note: candidate function
A::method()
   ^
methods.cc:16:4: note: candidate function
A::method(int a)
   ^
1 error generated.

So, what is wrong if I just add SBValue::GetNumChildren(int max) without a default argument?

For the public API you will need both and not have a default arg.

labath added a subscriber: labath.Oct 16 2015, 1:19 AM

Not really my area, but couldn't the needed functionality be implemented on top of GetChildAtIndex(). I mean, if GetChildAtIndex(50) returns a valid SBValue, then the container has at least 50 elements, right?
Then HasAtLeastNChildren(n) == GetChildAtIndex(n).IsValid() and we don't need to add anything to the public API. Or am I missing something here... (?)

There could be an error reading the k-th element for k < n.In which case, we will wrongly conclude that there are only k-1 elements.

How big of an error does it have to be for that to happen? Is that something we would consider an "lldb bug"/"data formatter bug" or it can be caused by something out of our control, like missing debug info on one of the children?

When such an error occurs, it is unlikely to be an lldb bug, or data-formatter bug. One can come up with a data structure for which reading a child element would require complete debug info, but reading the # of child elements need not require complete debug info (there could be an instance variable storing the size). Another could be that a child is improperly initialized but the data structure itself is properly initialized.

sivachandra edited edge metadata.

Address comments.

sivachandra retitled this revision from [SBValue] Add a method HasChildAtIndex. to [SBValue] Add a method GetNumChildren(uint32_t max).Oct 16 2015, 1:45 PM
sivachandra updated this object.
sivachandra edited edge metadata.

Pinging to ensure that it did not get lost in the title change.

When such an error occurs, it is unlikely to be an lldb bug, or data-formatter bug. One can come up with a data structure for which reading a child element would require complete debug info, but reading the # of child elements need not require complete debug info (there could be an instance variable storing the size). Another could be that a child is improperly initialized but the data structure itself is properly initialized.

Ok, that makes sense. No objections from my side.

source/API/SBValue.cpp
1285

You can replace this by return GetNumChildren(UINT32_MAX);

clayborg accepted this revision.Oct 20 2015, 4:54 PM
clayborg edited edge metadata.

Accepted as long as Enrico is OK with this.

granata.enrico accepted this revision.Oct 20 2015, 4:58 PM
granata.enrico edited edge metadata.

Looks fine

This revision is now accepted and ready to land.Oct 20 2015, 4:58 PM
sivachandra closed this revision.Oct 21 2015, 12:30 PM