This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Convert script native types to StructuredData counterpart
ClosedPublic

Authored by mib on Jul 13 2023, 1:37 AM.

Details

Summary

This patch adds the ability to pass native types from the script
interpreter to methods that use a {SB,}StructuredData argument.

To do so, this patch changes the ScriptedObject struture that holds
the pointer to the script object as well as the originating script
interpreter language. It also exposes that to the SB API via a new class
called SBScriptObject.

This structure allows the debugger to parse the script object and
convert it to a StructuredData object. If the type is not compatible
with the StructuredData types, we will store its pointer in a
StructuredData::Generic object.

This patch also adds some SWIG typemaps that checks the input argument to
ensure it's either an SBStructuredData object, in which case it just
passes it throught, or a python object that is NOT another SB type, to
provide some guardrails for the user.

rdar://111467140

Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>

Diff Detail

Event Timeline

mib created this revision.Jul 13 2023, 1:37 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 13 2023, 1:37 AM
mib requested review of this revision.Jul 13 2023, 1:37 AM

Very interesting, I think this will be pretty useful for the ergonomics of the SBAPI from python!

lldb/bindings/python/python-typemaps.swig
74–75

Two things about this part:

  1. llvm::toString(Error) returns a std::string, so taking a StringRef to this thing is not safe. It's pointing to a temporary, the moment you get to line 75 err_msg will be pointing to unmanaged memory.
  2. PyErr_SetString takes a const char * as its second parameter. In general, it is not safe to pass a StringRef's data field in these scenarios because you can't always guarantee that it points to a null-terminated string. If the StringRef is constructed from a std::string it's usually okay, but it's difficult to reason about these kinds of things when you come across them.
81–82

Same as above here.

91–92

Same here

98–99

Same here

104–105

This may be as simple as:

std::string error_msg = "Input type is invalid: " + type_name.get();
PyErr_SetString(PyExc_TypeError, error_msg.c_str());

Right?

108

Do we actually have to do any conversions here? Like to turn it into an SBStructuredData from a python list or dictionary?

lldb/include/lldb/API/SBStructuredData.h
38–40

Why do we offer both this and SBDebugger::CreateStructuredDataFromScriptObject? They do the exact same thing, when would you want to use one over the other?

lldb/source/API/SBDebugger.cpp
1709–1710 ↗(On Diff #539891)

Can you document what true is supposed to represent here?

lldb/source/API/SBStructuredData.cpp
74–76

I already asked why this is necessary, but if it does stick around you'll want to add LLDB_INSTRUMENT_VA here too yeah?

mib marked 2 inline comments as done.Jul 13 2023, 1:36 PM
mib added inline comments.
lldb/bindings/python/python-typemaps.swig
108

No, if the input is a lldb.SBStructuredData python object, SWIG takes care of doing the conversion to the C++ type (line 87), but here we just made sure that the input is not an SB type and leave the type conversion to be done in lldb by the Script Interpreter code.

lldb/include/lldb/API/SBStructuredData.h
38–40

This actually calls the SBDebugger method, but I tried to put myself in the user's shoes and I think I'd go look at the SBStructuredData help if I wanted to create a one from a script object. This one is for better discoverability and the other one (SBDebugger) is for better ergonomics (since we don't have to pass a debugger).

lldb/source/API/SBStructuredData.cpp
74–76

This is a static method and I don't think we call LLDB_INSTRUMENT_VA in that case. Might be wrong though @JDevlieghere ?

JDevlieghere added inline comments.Jul 13 2023, 2:20 PM
lldb/source/API/SBStructuredData.cpp
74–76

We do, see SBDebugger::Create for example. we just don't pass this.

mib updated this revision to Diff 542217.Jul 19 2023, 3:13 PM
mib marked 10 inline comments as done.
mib edited the summary of this revision. (Show Details)

Address @bulbazord comments.

I'm good with this approach. One thing to note is that this change is explicitly ABI breaking. Specifically this change removes lldb::ScriptedObject so SBProcess::GetScriptedImplementation's return value changes. lldb::ScriptedObject was originally added in February 2023 (see: c1928033047409f977b26ffc938d59188f1ced97) so this has not made it into an LLDB release so far. Even though this does break ABI, from a release perspective this should be okay to do since it's technically additive when compared to the previous release.

I'll defer to @JDevlieghere for approval.

lldb/bindings/python/python-typemaps.swig
144–152
mib added a comment.Jul 19 2023, 3:49 PM

I'm good with this approach. One thing to note is that this change is explicitly ABI breaking. Specifically this change removes lldb::ScriptedObject so SBProcess::GetScriptedImplementation's return value changes. lldb::ScriptedObject was originally added in February 2023 (see: c1928033047409f977b26ffc938d59188f1ced97) so this has not made it into an LLDB release so far. Even though this does break ABI, from a release perspective this should be okay to do since it's technically additive when compared to the previous release.

That is true, this change is ABI breaking, but as you mentioned it, this hasn't been distributed to other client and should only be used by us. Also, I've changed the typemap so that this change is transparent to people who were using this API from python (you might have notices that I didn't have to change the test for GetScriptedImplementation).

JDevlieghere requested changes to this revision.Jul 21 2023, 9:08 AM

I'm okay with the ABI break given that this (1) hasn't made it into a release yet, (2) it is under active development by Ismail himself and (3) has existing adopters beyond the ones motivating the original and current change.

However, now that it's more than a typedef, we should rename ScriptObject to SBScriptObject, move it to its own file and hide its implementation with PIMPL. There is no precedent for having small structs in lldb-types and I don't see why we would want lock ourselves into the current implementation. If anything this patch itself serves as an example that we want the ability to change this class without breaking the ABI (again).

This revision now requires changes to proceed.Jul 21 2023, 9:08 AM
mib updated this revision to Diff 543101.Jul 21 2023, 3:38 PM
mib marked an inline comment as done.
mib edited the summary of this revision. (Show Details)

Address @JDevlieghere & @bulbazord comments:

  • Add SBScriptObject with PIMPL
JDevlieghere added inline comments.Jul 21 2023, 4:13 PM
lldb/include/lldb/API/SBScriptObject.h
48

Does this need to be a shared pointer as opposed to a unique pointer? If so please add a comment why.

lldb/include/lldb/Utility/ScriptObject.h
9 ↗(On Diff #543101)

Not sure if this really belongs in Utility. What about putting it under Plugins/ScriptInterpreter/ or Interpreter/ next to ScriptInterpreter.h. The header guard seems to suggest that it was there at some point?

lldb/source/API/SBScriptObject.cpp
47

Missing newline.

mib updated this revision to Diff 543122.Jul 21 2023, 4:45 PM
mib marked 3 inline comments as done.

Address @JDevlieghere last comments.

  • Use unique_ptr instead of shared_ptr for PIMPL
  • Add missing new line after instrumentation macros
  • Move ScriptObject from Utility to Interpreter
JDevlieghere accepted this revision.Jul 21 2023, 6:28 PM

Thanks for bearing with me. LGTM.

This revision is now accepted and ready to land.Jul 21 2023, 6:28 PM