This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Introduce StackFrameRecognizer
ClosedPublic

Authored by kubamracek on Mar 17 2018, 4:45 PM.

Details

Summary

This patch introduces a concept of "frame recognizer" and "recognized frame". This should be an extensible mechanism that retrieves information about special frames based on ABI, arguments or other special properties of that frame, even without source code. A few examples where that could be useful could be 1) objc_exception_throw, where we'd like to get the current exception, 2) terminate_with_reason and extracting the current terminate string, 3) recognizing Objective-C frames and automatically extracting the receiver+selector, or perhaps all arguments (based on selector).

Diff Detail

Event Timeline

kubamracek created this revision.Mar 17 2018, 4:45 PM

This seems like a good place to start. Certainly producing synthetic function arguments is one of the fixed things these recognizers can do, and we can build on that as we go. The inputs are all types that we know how to bind to Python, so it will be easy to make wrappers and eventually an affordance for making Python based frame recognizers. I don't think that needs to be a part of this patch, though that would make writing tests easier. How were you planning to add tests for this?

include/lldb/Target/StackFrame.h
558–559

Do we need m_recognized? Wouldn't it be enough to check !m_recognized_frame?

Also, we prepend SharedPointers with _sp as a general rule since it's really handy to know that w/o having to look up the definition.

source/Plugins/LanguageRuntime/ObjC/AppleObjCRuntime/AppleObjCStackFrameRecognizer.cpp
27–28 ↗(On Diff #138835)

In the rest of the Apple support for ObjC we use either AppleObjCRuntimeV1 and AppleObjCRuntimeV2, and then use AppleObjCRuntime for support that crosses versions. It would be better not to introduce another name here.

source/Target/StackFrameRecognizer.cpp
27–31

Because of the way you search, you're doing first one in wins for any conflicts. I'm not sure we need to overthink conflicts at this point, but at least it should be last one in wins. And maybe start out returning an Status so that if there are conflicts we can warn about them?

kubamracek edited the summary of this revision. (Show Details)

Updating StackFrameRecognizer with a Python plugin implementation and a testcase using a Python recognizer. Also removing the Darwin-specific parts.

@jingham, @jasonmolenda, I haven't addresses the comments yet, but I'll do that in the next round. I'd seeking approval of the high-level approach first. The patch is getting a bit large, but the added test (and the Python implementation of a recognizer) should showcase what I'm trying to achieve.

kubamracek added inline comments.Jun 11 2018, 10:07 PM
source/Commands/CommandObjectFrame.cpp
833

Note to self: We should also require -m and -n (otherwise the default is that the recognizer will be called on every single stop point).

This is going as I imagined it should, looks great!

We probably want to turn this on by default for "frame var" or no one will ever discover it. The IDE folks can decide on their own what to do from the SB API's. Since you are doing module filtering and this only triggers when you explicitly run "frame var" I think having it on by default will not be a big deal.

Does this work with "frame var paths" as well? That is, if I do:

(lldb) frame var
(struct Foo *) recognized_variable = 0x12345

Will we find it again if I do:

(lldb) frame var *recognized_variable

or

(lldb) frame var recognized_variable.some_interesting_ivar

I think this should just work, but it's worth checking.

This is going as I imagined it should, looks great!
We probably want to turn this on by default for "frame var" or no one will ever discover it. The IDE folks can decide on their own what to do from the SB API's. Since you are doing module filtering and this only triggers when you explicitly run "frame var" I think having it on by default will not be a big deal.
Does this work with "frame var paths" as well? That is, if I do:

It does not work. The reason is that the current patch is not creating VariableSP objects, so things like m_variable_list_sp (in StackFrame) don't contain these new "recognized arguments". Right now, I'm just creating ValueObjects and GetVariableList/GetInScopeVariableList doesn't know about them. Do you think we should go that way instead? In that case I think we'd need to extend or subclass Variable so that we can actually create Variable objects that are not backed by DWARF and SymbolContextScope and stuff like that.

If you think that makes sense, I'll be happy to work on that. Also, if you think it's not necessary for an initial commit, I can send a separate follow-up patch (to avoid growing this already large patch).

I think ultimately we should make this presentation of variables as symmetrical with locals as we can or it's going to confuse users.

I don't think it is necessary for the first version. The first uses, I would imagine, are to make available some variables in situations where none existed before - mostly digging info out of frames with no debug info - so it's strictly better than what was there before already...

Cleaned up patch, clang-formatted, respects 80-columns now. Expanded test case. Implemented all "frame recognizer" commands (add, delete, clear, list). Changed iteration order on the list of active recognizers (so that last added wins). Hopefully addressed all comments now.

Rebasing to current master.

This code looks fine to me. The docs are a little thin.

You don't say how to make a recognizer anywhere except in the test example. There should probably be an example in the long help for CommandObjectFrameRecognizerAdd. Look for instance at the help for "breakpoint command add" which tells you the signature of the Python function you need to add and gives an example. You should also add a section to www/python-reference.html since that's where people go to see "how can I extend lldb using the Python interface."

Switched on showing recognized args by default in "frame variable". Switched "-t" to mean "omit" instead of "include".
Added documentation and an example into "help frame recognizer add" and into python-reference.html.

Fixing up test (it was using the "-t" flag).
Removing a leftover printf().

friendly ping :)

jingham requested changes to this revision.Jul 9 2018, 5:31 PM

This is good. I had a few inline comments, mostly about the command syntax. I think you should switch "-m" to "-s" since that's what we use in the other similar places.

For the "type summary" etc. commands which have a similar registry & match function, Enrico added:

(lldb) type summary info expression

that tell you what summary formatter will match the given expression. It might be nice to have:

(lldb) recognizer info <FRAME_NUMBER>

that would tell you what recognizer is going to be in force for this frame. The isn't terribly important if there's only a few recognizers, but if there are more, or when you're trying to get one to work, having this info is really handy. That's not required - and I'm fine with doing that as a follow-on, however.

source/API/SBFrame.cpp
1062

You can say:

for (auto &rec_value_sp : recognized_arg_list->GetObjects()) {
   SBValue value_sb;
  value_sb.SetSP(rec_value_sp, use_dynamic);
   value_list.Append(value_sb);
}

That's a little easier to read.

source/Commands/CommandObjectFrame.cpp
721

Should be able to use GetObjects here too.

761

In most of the other places where we limit a search to a shared library (e.g. breakpoint set, source list & target var) we use "s" and "shlib" in the command line. It would be better to stay consistent with that.

Also the type of the argument should be eArgTypeShlibName (though these types don't do anything yet, they should eventually get hooked up to completers. But since they don't you can also add the module completer by setting:

CommandCompletions::eModuleCompletion

in the completers slot. See for instance the setting of "shlib" in g_breakpoint_set_options in CommandObjectBreakpoint.cpp.

The few places that don't follow this rule (like image list) take a list of shared libraries as arguments, so don't fall within this rule.

Also, for module and name entries I've been allowing multiple instances of the option, and then accumulating them in a list. So for instance, you can set:

(lldb) br s -n foo -n main -s Sketch -s libsystem_c.dylib

There's no automatic way to indicate that you are doing this, I always note it explicitly in the help:

(lldb) help break set
...
       -n <function-name> ( --name <function-name> )
            Set the breakpoint by function name.  Can be repeated multiple times to make one breakpoint for multiple
            names

That might also be useful here.

858–859

Why?

www/python-reference.html
749 ↗(On Diff #152017)

Grammatically it would be better to say "Frame recognizers allow FOR retrieving..."

752–753 ↗(On Diff #152017)

They can also be used to augment extant function arguments, so this isn't exactly right. It's just the most obvious use. Might be good to be clear about that?

754 ↗(On Diff #152017)

I'd say "Adding a custom frame recognizer is DONE by implementing..."

You are at this point telling somebody how to do it, not what can be done...

776–777 ↗(On Diff #152017)

Again, say why. Presumably this is for performance reasons, but from this comment I can't tell if it would crash otherwise or what bad thing might happen...

This revision now requires changes to proceed.Jul 9 2018, 5:31 PM

Updating patch, addressing most comments. Changed '-m' to '-s'. Added 'frame recognizer info' subcommand. Improved wording in documentation.

I didn't add the possibility to specify multiple '-s' and '-n' args. Do you think I should add this to the first iteration of this or can we leave it as a future improvement?

aprantl added inline comments.
include/lldb/Target/StackFrameRecognizer.h
26

Would you mind adding doxygen comments to each of the new classes to explain what they are good for?

This is good. The addition of the "info" command will be helpful for people trying to debug their recognizers. It's okay to add multiple -s and -n's later - though the fact that you don't allow "apply to all frames" may make us want the ability to provide more than one option sooner rather than later...

Sorry for this thought coming late, but I worry a little bit about the fact that the class name is the recognizers identity, particularly for deleting. I might have a good recognizer class that I want to add to one library, then later in the session want to apply it to another library as well. The second addition will mean that the name now exists twice but with different shared libraries, and then since I only provide the name to "delete" I can't tell which one I've deleted.

It also makes the API's a little odd, since the name of the recognizer is important to the commands for handling it, but it isn't clear from the API's that I have to provide a unique name for them.

I think it would be better to have the RecognizerManager keep an ID for each recognizer, and have list print and delete take the index.

Also, it might be convenient to have "frame recognizer delete" with no arguments query "Do you want to delete all recognizers" and then delete them. That's the way "break delete" works so it's an accepted pattern in lldb. This could be done as a follow-on, however.

packages/Python/lldbsuite/test/functionalities/frame-recognizer/TestFrameRecognizer.py
55

Would you mind using:

lldbutil.run_break_set_by_symbol(self, "foo")

instead of this runCmd. The lldbutil routine will check that the breakpoint got set and error out here if it didn't. That will reduce the head-scratching if for some reason we fail to set the breakpoint and then the test falls over downstream.

92

Also here.

Adding class comments, adding IDs to recognizers.

jingham accepted this revision.Oct 30 2018, 11:31 AM

LGTM. Thanks for working on this!

This revision is now accepted and ready to land.Oct 30 2018, 11:31 AM
This revision was automatically updated to reflect the committed changes.
kubamracek reopened this revision.Oct 30 2018, 5:31 PM
This revision is now accepted and ready to land.Oct 30 2018, 5:31 PM
This revision was automatically updated to reflect the committed changes.
kubamracek reopened this revision.Oct 30 2018, 6:25 PM
This revision is now accepted and ready to land.Oct 30 2018, 6:25 PM
This revision was automatically updated to reflect the committed changes.