This is an archive of the discontinued LLVM Phabricator instance.

[LLDB] Fix Python GIL-not-held issues
ClosedPublic

Authored by labath on Nov 29 2021, 9:58 AM.

Details

Summary

The GIL must be held when calling any Python C API functions. In multithreaded applications that use callbacks this requirement can easily be violated by accident. A general tool to ensure GIL health is not available, but patching Python Py_INCREF to add an assert provides a basic health check:

+int PyGILState_Check(void); /* Include/internal/pystate.h */
+
 #define Py_INCREF(op) (                         \
+    assert(PyGILState_Check()),                 \
     _Py_INC_REFTOTAL  _Py_REF_DEBUG_COMMA       \
     ((PyObject *)(op))->ob_refcnt++)

 #define Py_DECREF(op)                                   \
     do {                                                \
+        assert(PyGILState_Check());                     \
         PyObject *_py_decref_tmp = (PyObject *)(op);    \
         if (_Py_DEC_REFTOTAL  _Py_REF_DEBUG_COMMA       \
         --(_py_decref_tmp)->ob_refcnt != 0)             \

Adding this assertion causes around 50 test failures in LLDB. Adjusting the scope of things guarded by py_lock fixes them.

More background: https://docs.python.org/3/glossary.html#term-global-interpreter-lock

Patch by Ralf Grosse-Kunstleve

Diff Detail

Event Timeline

rupprecht requested review of this revision.Nov 29 2021, 9:58 AM
rupprecht created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptNov 29 2021, 9:58 AM
labath added inline comments.
lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.h
101

This usage of auto isn't very llvm-ish.

277–285

As I mentioned internally, I think this should be placed inside the ScriptInterpreterPythonImpl destructor (and elsewhere, if needed), as that's the level at which we do our normal locking. There it can become a regular Locker object, since that function will be called from SBDebugger::Terminate (or maybe even earlier, in either case it will be before any global destructors run).

The reason this can't be done with StructuredPythonObject, is because this one gets passed on to python code, which can delete it (or not) at pretty much arbitrary time. PythonObjects OTOH, are just C++ handles to python objects, and we should be able to keep track of their lifetimes reasonably well.

We can put an assert here to ensure the callers obey this contract.

lldb/source/Plugins/ScriptInterpreter/Python/ScriptInterpreterPython.cpp
1440–1441

Since the only reason this scope was introduced was to run the return statement in an unlocked state, I think it would be better to just remove it now. Same goes for other patterns like this.

rwgk added a subscriber: rwgk.Dec 6 2021, 7:47 AM
labath commandeered this revision.Jan 12 2022, 5:44 AM
labath edited reviewers, added: rupprecht; removed: labath.
labath added a reviewer: mib.
labath added inline comments.
lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.h
277–285

It turns out this is not as easy as I made it sound. I thought one could just put the lock around the body of the destructor, but (of course) destructors for subobjects run only after the body of the main object desctructor terminates. I mean, it would still be possible to make it work that way, but it wouldn't be as pretty. So, after some soul-searching, I decided to keep the current implementation.

labath updated this revision to Diff 399302.Jan 12 2022, 5:44 AM

This just fixes the style issues.

JDevlieghere accepted this revision.Jan 12 2022, 11:28 AM

LGTM

lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.h
277–285

Makes sense!

This revision is now accepted and ready to land.Jan 12 2022, 11:28 AM
This revision was automatically updated to reflect the committed changes.

Seems that it broke apt.llvm.org on ubuntu bionic with Python 3.6:
I reported this issue: https://github.com/llvm/llvm-project/issues/53291