This is an archive of the discontinued LLVM Phabricator instance.

Fix crash on SBCommandReturnObject & assignment
ClosedPublic

Authored by jankratochvil on Sep 14 2019, 2:07 PM.

Details

Summary

I was writing an SB API client and it was crashing on:

bool DoExecute(SBDebugger dbg, char **command, SBCommandReturnObject &result) {
  result = subcommand(dbg, "help");

That is because SBCommandReturnObject &result gets initialized inside LLDB by:

bool DoExecute(Args &command, CommandReturnObject &result) override {
  // std::unique_ptr gets initialized here from `&result`!!!
  SBCommandReturnObject sb_return(&result);
  DoExecute(...);
  sb_return.Release();

Diff Detail

Repository
rL LLVM

Event Timeline

jankratochvil created this revision.Sep 14 2019, 2:07 PM

+ Jim for SB API.

I ran into this quirk of SBCommandReturnObject a couple of weeks ago, and then ran away screaming.

I agree that this approach isn't nice, but probably there isn't a nice approach to this at this point. One possibility you could consider is storing a shared_ptr<CommandRO> inside SBCommandRO and encoding the ownership into the deleter of the shared_ptr (regular deleter => owned, noop deleter => unowned).

lldb/include/lldb/API/SBCommandReturnObject.h
24 ↗(On Diff #220231)

Add a comment to explain the purpose of this function?

lldb/packages/Python/lldbsuite/test/api/command-return-object/main.cpp
18 ↗(On Diff #220231)

Is that intentional? If so, why?

23 ↗(On Diff #220231)

This looks weird. Please use a separate declaration for the variable and the class.

clayborg added inline comments.
lldb/source/API/SBCommandInterpreter.cpp
165–172 ↗(On Diff #220231)

Could this code just create a local SBCommandReturnObject and then copy the CommandReturnObject back into "result"?

 bool DoExecute(Args &command, CommandReturnObject &result) override {
    SBCommandReturnObject sb_return;
    SBCommandInterpreter sb_interpreter(&m_interpreter);
    SBDebugger debugger_sb(m_interpreter.GetDebugger().shared_from_this());
    bool ret = m_backend->DoExecute(
        debugger_sb, (char **)command.GetArgumentVector(), sb_return);
    std::swap(result, sb_return.ref());
    return ret;
}
jankratochvil marked 3 inline comments as done.Sep 17 2019, 8:00 AM

I agree that this approach isn't nice, but probably there isn't a nice approach to this at this point. One possibility you could consider is storing a shared_ptr<CommandRO> inside SBCommandRO and encoding the ownership into the deleter of the shared_ptr (regular deleter => owned, noop deleter => unowned).

Maybe also possibly cheaper llvm::PointerIntPair representing an associated bool for the deletion.

lldb/packages/Python/lldbsuite/test/api/command-return-object/main.cpp
18 ↗(On Diff #220231)

It has no purpose there for this bugfix but when already writing a testcase I wanted to test this generally fragile functionality.

lldb/source/API/SBCommandInterpreter.cpp
165–172 ↗(On Diff #220231)

I made it that way first but it breaks the testsuite. For example due to:

lldb/source/Commands/CommandObjectCommands.cpp:
1236	  bool DoExecute(llvm::StringRef raw_command_line,
1237	                 CommandReturnObject &result) override {
...
1242	    result.SetStatus(eReturnStatusInvalid);
 - ctored CommandReturnObject has eReturnStatusStarted.
1244	    if (!scripter ||
1245	        !scripter->RunScriptBasedCommand(m_function_name.c_str(),
1246	                                         raw_command_line, m_synchro, result,
1247	                                         error, m_exe_ctx)) {
1248	      result.AppendError(error.AsCString());
1249	      result.SetStatus(eReturnStatusFailed);
1250	    } else {
1251	      // Don't change the status if the command already set it...
1252	      if (result.GetStatus() == eReturnStatusInvalid) {
1253	        if (result.GetOutputData().empty())
1254	          result.SetStatus(eReturnStatusSuccessFinishNoResult);
1255	        else
1256	          result.SetStatus(eReturnStatusSuccessFinishResult);
1257	      }
1258	    }

So it would be possible but that would require refactorization more of the code in callers which I find outside of the scope of this patch.
This patch tries to make this bugfix fully transparent to existing code.

jankratochvil marked 3 inline comments as done.
jankratochvil marked an inline comment as done.Sep 17 2019, 8:08 AM
jankratochvil added inline comments.
lldb/packages/Python/lldbsuite/test/api/command-return-object/main.cpp
18 ↗(On Diff #220231)

If you mean the line:

+     if (!result.Succeeded())

That is not really needed for this testcase but I think this is how a real world implementation should behave so why not here, just 2 lines of code.

clayborg added inline comments.Sep 17 2019, 8:50 AM
lldb/include/lldb/API/SBCommandReturnObject.h
22–27 ↗(On Diff #220505)

Easier to just make SBCommandReturnObject::ref() protected and friend any users or just make it public?

99–100 ↗(On Diff #220505)

Remove if we make SBCommandReturnObject::ref() protected and friend any users or just make it public?

108 ↗(On Diff #220505)

Easier to just make SBCommandReturnObject::ref() protected and friend any users or just make it public?

lldb/source/API/SBCommandReturnObject.cpp
21–24 ↗(On Diff #220505)

Easier to just make SBCommandReturnObject::ref() protected and friend any users or just make it public?

I agree that this approach isn't nice, but probably there isn't a nice approach to this at this point. One possibility you could consider is storing a shared_ptr<CommandRO> inside SBCommandRO and encoding the ownership into the deleter of the shared_ptr (regular deleter => owned, noop deleter => unowned).

Maybe also possibly cheaper llvm::PointerIntPair representing an associated bool for the deletion.

We couldn't use PointerIntPair due to the our abi stability promise (http://lldb.llvm.org/resources/sbapi.html). One could try to implement something like that by hand, but I don't think it's worth the trouble.

lldb/packages/Python/lldbsuite/test/api/command-return-object/main.cpp
18 ↗(On Diff #220231)

Nope, I meant the "result = result" line. Testing self-assignment is fine, but you may want to call that out explicitly, as otherwise someone may just come along and delete the "obviously useless" code.

jankratochvil marked 6 inline comments as done.Sep 18 2019, 5:18 AM

We couldn't use PointerIntPair due to the our abi stability promise (http://lldb.llvm.org/resources/sbapi.html).

Thanks for this document! I did not know it. But isn't the document trying to keep ABI compatibility despite it is not explicitly stated there? Or otherwise why are there so many restrictions.

As then even using std::shared_ptr I think violates the ABI compatibility promise there as currently SBCommandReturnObject contains std::unique_ptr<lldb_private::CommandReturnObject> m_opaque_up. But maybe this discussion goes too far from this patch.

lldb/include/lldb/API/SBCommandReturnObject.h
22–27 ↗(On Diff #220505)

I tried to make everyone a friend with SBCommandReturnObject but I could not. The problem is these two

SWIGEXPORT bool LLDBSwigPythonCallCommand
SWIGEXPORT bool LLDBSwigPythonCallCommandObject

from lldb/scripts/Python/python-wrapper.swig have SWIGEXPORT which is defined only in build directory tools/lldb/scripts/LLDBWrapPython.cpp` so one cannot access it from lldb/include/lldb/API/SBCommandReturnObject.h:

error: unknown type name 'SWIGEXPORT'

And when I remove SWIGEXPORT there I get:

llvm-monorepo-clangassert/tools/lldb/scripts/LLDBWrapPython.cpp:78030:1: error: declaration of 'LLDBSwigPythonCallCommandObject' has a different language linkage
LLDBSwigPythonCallCommandObject
llvm-monorepo/lldb/include/lldb/API/SBCommandReturnObject.h:20:2: note: previous declaration is here
 LLDBSwigPythonCallCommandObject

But it is true making it public is safe enough, lldb_private::CommandReturnObject is opaque to SB API users anyway. I found it too bold to write it that way myself, thanks.

jankratochvil marked an inline comment as done.
clayborg added a comment.EditedSep 18 2019, 11:02 AM

We couldn't use PointerIntPair due to the our abi stability promise (http://lldb.llvm.org/resources/sbapi.html).

Thanks for this document! I did not know it. But isn't the document trying to keep ABI compatibility despite it is not explicitly stated there? Or otherwise why are there so many restrictions.

We are making all efforts to vend a stable C++ API. Thus the restrictions. If anyone makes a binary that contains a lldb::SB object we can't change the size of the object. That means no inheritance (size of object could change), no virtual functions (vtable size would change), and no changing the ivars since their size defines the size of the object. Then all function calls to the LLDB API are just fancy long mangled name lookups just like a C API, and thus our API is stable. Then people can have our classes as ivars in their classes, and if liblldb.so gets updated they can still run without having to re-link.

As then even using std::shared_ptr I think violates the ABI compatibility promise there as currently SBCommandReturnObject contains std::unique_ptr<lldb_private::CommandReturnObject> m_opaque_up. But maybe this discussion goes too far from this patch.

We can replace the std::unique_ptr<lldb_private::CommandReturnObject> m_opaque_up with another unique pointer to an internal object (std::unique_ptr<lldb_private::CommandReturnObjectImpl> m_opaque_up;). We could make a struct:

lldb_private::CommandReturnObjectImpl {
  bool owned;
  std::unique_ptr<lldb_private::CommandReturnObject> m_opaque_up;
};

And then release the object without freeing it if needed in the CommandReturnObjectImpl destructor. This keeps the ABI stable since the size of lldb::SBCommandReturnObject doesn't change. The constructors aren't inlined for just this purpose (another API thing).

We are making all efforts to vend a stable C++ API.

IIUC you mean "stable C++ ABI" here. Thanks for the clarification. Maybe http://lldb.llvm.org/resources/sbapi.html could say that and it would be much more clear.

lldb_private::CommandReturnObjectImpl {
  bool owned;
  std::unique_ptr<lldb_private::CommandReturnObject> m_opaque_up;
};

Is this a request to rework this patch this way? If so isn't it safer / more clear to do it rather this way?

lldb_private::CommandReturnObjectImpl {
  bool owned;
  lldb_private::CommandReturnObject *m_opaque_ptr;
  ~CommandReturnObjectImpl() { if (owned) delete m_opaque_ptr; }
};

Maybe http://lldb.llvm.org/resources/sbapi.html could say that and it would be much more clear.

That's probably a good idea, though I would still keep the list of restrictions spelled out as ABI can be broken in a lot of subtle and unobvious ways (at least to a person who has never tried to maintain a stable abi).

lldb_private::CommandReturnObjectImpl {
  bool owned;
  std::unique_ptr<lldb_private::CommandReturnObject> m_opaque_up;
};

Is this a request to rework this patch this way? If so isn't it safer / more clear to do it rather this way?

lldb_private::CommandReturnObjectImpl {
  bool owned;
  lldb_private::CommandReturnObject *m_opaque_ptr;
  ~CommandReturnObjectImpl() { if (owned) delete m_opaque_ptr; }
};

I think I would like that better than the swap trick. Since you're now inside a pimpl, you can replace the two members with a llvm::PointerIntPair, if you are so inclined.

jankratochvil planned changes to this revision.Sep 24 2019, 5:59 AM
jankratochvil edited the summary of this revision. (Show Details)

Changed @clayborg's lldb_private::CommandReturnObjectImpl to lldb_private::SBCommandReturnObjectImpl - both variants are used in LLDB SB*.cpp and the SB* looks more logical to me.
Used @clayborg's lldb_private::SBCommandReturnObjectImpl::m_owned with conditional in dtor but personally I would use two inherited classes + virtual dtor instead.

I think I would like that better than the swap trick. Since you're now inside a pimpl, you can replace the two members with a llvm::PointerIntPair, if you are so inclined.

Not done. I proposed llvm::PointerIntPair originally to match your goal of keeping single pointer in SBCommandReturnObject. But when there is already a new instance of lldb_private::CommandReturnObjectImpl then llvm::PointerIntPair therein would be more a burden both in the source code and for CPU.

m_opaque_up typeABI compatibleCPU overheadformal http://lldb.llvm.org/resources/sbapi.html compliance
shared_ptr with two deletersno (size 16)high ('lock' instructions)yes (a bug in the sbapi.html doc?)
llvm::PointerIntPairyes (size 8)lowno (sort of)
unique_ptr<SBCommandReturnObjectImpl>yes (size 8)high (extra object allocation)yes

But I am fine with this patch implementing SBCommandReturnObjectImpl as this is not any performance critical part of code.

labath accepted this revision.Sep 27 2019, 12:19 AM

This looks good to me, but please do give a chance for other to look this over too. Thanks for cleaning this up.

m_opaque_up typeABI compatible
llvm::PointerIntPairyes (size 8)

This is true, but it also inflicts the SB stability requirements onto the PointerIntPair class. While I don't imagine the sizeof(PointerIntPair) will ever change, I don't think this is a good thing to do, just on principle.

lldb/packages/Python/lldbsuite/test/api/command-return-object/Makefile
1 ↗(On Diff #222021)

I'm pretty sure this line is not needed.

lldb/source/API/SBCommandReturnObject.cpp
142 ↗(On Diff #222021)

I think get()->Clear() or ref().Clear() would be easier on the eye.

This revision is now accepted and ready to land.Sep 27 2019, 12:19 AM
jankratochvil marked 2 inline comments as done.EditedSep 27 2019, 2:58 AM
This comment has been deleted.
lldb/packages/Python/lldbsuite/test/api/command-return-object/Makefile
1 ↗(On Diff #222021)

Removed in current testsuite: rL373061

jankratochvil marked 2 inline comments as done.
shafik added a subscriber: shafik.EditedSep 27 2019, 10:23 AM

@jankratochvil it looks like this PR broke the green dragon build: http://lab.llvm.org:8080/green/view/LLDB/job/lldb-cmake/1926/

The specific test that is failing is TestFunctionStarts.py see the more detailed log here: http://lab.llvm.org:8080/green/view/LLDB/job/lldb-cmake/1926/testReport/junit/lldb-Suite/macosx_function-starts/TestFunctionStarts_py/

To clarify, reverting this change fixes the test on my side.

@jankratochvil it looks like this PR broke the green dragon build: http://lab.llvm.org:8080/green/view/LLDB/job/lldb-cmake/1926/

This PR is not yet checked-in. Let's move this discussion to rL373061.

jankratochvil marked an inline comment as done.Sep 27 2019, 11:56 AM
jankratochvil added inline comments.
lldb/packages/Python/lldbsuite/test/api/command-return-object/Makefile
1 ↗(On Diff #222021)

So I do not understand how MAKE_DSYM works but global removal of MAKE_DSYM := NO in rL373061 broke OSX so I have reverted it by rL373110. Keeping MAKE_DSYM := NO stays omitted for this testcase.

labath added inline comments.Sep 30 2019, 5:50 AM
lldb/packages/Python/lldbsuite/test/api/command-return-object/Makefile
1 ↗(On Diff #222021)

Yeah, the test which failed is doing pretty funky stuff, so I can imagine it can get hurt by removing MAKE_DSYM. Omitting it for this test should be safe though. (Probably all other tests except TestFunctionStarts.py too, but I don't think you have to bother with that.)

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptOct 4 2019, 12:32 PM

Had to disable the testcase on Windows: rL373787

labath added a comment.Oct 7 2019, 1:53 AM

(Ideally, all of these tests would be just (gtest) unit tests. There's no need to pull in python to do something our build system already knows perfectly well to do.)