This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Fix that the embedded Python REPL crashes if it receives SIGINT
ClosedPublic

Authored by JDevlieghere on Jun 24 2021, 4:49 PM.

Details

Summary

When LLDB receives a SIGINT while running the embedded Python REPL it currently just
crashes in ScriptInterpreterPythonImpl::Interrupt with an error such as the one below:

Fatal Python error: PyThreadState_Get: the function must be called with the GIL held, but the GIL is released (the current Python thread state is NULL)

The faulty code that causes this error is this part of ScriptInterpreterPythonImpl::Interrupt:

PyThreadState *state = PyThreadState_GET();
if (!state)
  state = GetThreadState();
if (state) {
  long tid = state->thread_id;
  PyThreadState_Swap(state);
  int num_threads = PyThreadState_SetAsyncExc(tid, PyExc_KeyboardInterrupt);

The obvious fix I tried is to just acquire the GIL before this code is running which fixes the
crash but the KeyboardInterrupt we want to raise immediately is actually just queued
and would only be raised once the next line of input has been parsed (which e.g. won't
interrupt Python code that is currently waiting on a timer or IO from what I can see).
Also none of the functions we call here is marked as safe to be called from a signal
handler from what I can see, so we might still end up crashing here with some bad timing.

Python 3.2 introduced PyErr_SetInterrupt to solve this and the function takes care of
all the details and avoids doing anything that isn't safe to do inside a signal handler.
The only thing we need to do is to manually setup our own fake SIGINT handler that
behaves the same way as the standalone Python REPL signal handler (which raises
a KeyboardInterrupt).

From what I understand the old code used to work with Python 2 so I kept the old
code around until we officially drop support for Python 2.

There is a small gap here with Python 3.0->3.1 where we might still be crashing, but
those versions have reached their EOL more than a decade ago so I think we don't need
to bother about them.

Diff Detail

Event Timeline

teemperor requested review of this revision.Jun 24 2021, 4:49 PM
teemperor created this revision.
JDevlieghere accepted this revision.Jun 24 2021, 5:50 PM

Nice, thanks for figuring this one out. LGTM.

This revision is now accepted and ready to land.Jun 24 2021, 5:50 PM
teemperor updated this revision to Diff 354445.Jun 25 2021, 1:19 AM
  • Removed unused include and revert some unintentional whitespace change.
teemperor updated this revision to Diff 354937.Jun 28 2021, 9:37 AM
  • Don't pretend to have handled the interrupt if we're not running Python at the moment.
  • Make sure that Python doesn't overwrite the LLDB process signal handlers.
  • Extend tests to cover aborting running Python code and sending it while Python is waiting for user input.
teemperor requested review of this revision.Jun 28 2021, 9:39 AM

Sending back as I changed quite a few things to handle more corner cases.

JDevlieghere accepted this revision.Jun 28 2021, 10:26 AM
JDevlieghere added inline comments.
lldb/source/Plugins/ScriptInterpreter/Python/ScriptInterpreterPython.cpp
3233–3235

IIRC signal handlers are per process, so there's no such thing as just changing the Python one. I feel a bit uneasy about swapping out signal handlers, but I don't know a better way to do this. Do you happen to know how this worked with Python 2? Was it already swapping out the signal handler behind our back?

This revision is now accepted and ready to land.Jun 28 2021, 10:26 AM
teemperor added inline comments.Jun 28 2021, 2:05 PM
lldb/source/Plugins/ScriptInterpreter/Python/ScriptInterpreterPython.cpp
3233–3235

Python maintains its own set of internal signal handler objects which is what we're setting below in the code snippet. Those Python-internal 'signal handlers' are what the SetInterrupt thingy above is calling. But cpython is also hooking up those Python object signal handlers to the process signal handlers when calling signal.signal. I'm trying to revert that change to our process signal handlers while still creating the Python signal handler objects (which we need so that we can call SetInterrupt). I don't think there is a way to just create the internal signal handlers.

In Python 2 we didn't create any signal handlers from Python. We just had no SIGINT signal handler at all beside the LLDB one. And the LLDB signal handler just tried to create the KeyboardInterrupt exception object. But that code was neither signal safe nor did it interrupt IO events, at least it's not so in Python3 (but the documentation is also rather minimalistic.

I'll update the docs here. Most of this is just a product of reading cpython's init/signal module implementation so it makes sense to better explain what's going on here.

This revision was landed with ongoing or failed builds.Nov 12 2021, 5:19 AM
This revision was automatically updated to reflect the committed changes.

@labath raised some concerns in IRC about the setup code being run in a potential multithreaded env (which makes sense). Feel free to revert (not sure when/if I'll get around to address the issues)

This broke the build on Windows. The buildbot was, sadly, already red because of another test failure:

https://lab.llvm.org/buildbot/#/builders/83/builds/11931

@labath raised some concerns in IRC about the setup code being run in a potential multithreaded env (which makes sense). Feel free to revert (not sure when/if I'll get around to address the issues)

Yes, I multithreaded code (and code in libraries in particular) should be very careful what it's doing with signals and signal handlers. That said, after taking a closer look at this. I don't think this patch is as bad as it originally seemed to me. The signal handlers are only changed for a bried period, and most of the signal code in the patch is there to undo the handlers that python sets.

Probably the only thing bothering me now is that all of this work happens lazily, whenever we happen to use the python interpreter for the first time. It would be much better if this happened in the initialization phase, as most applications will likely still be single-threaded at that point (and even if they weren't, it any global effects of the SBDebugger::Ininitalize call would be less surprising (more predictable)). I wouldn't call that a blocker though, since it would require a "how we initialize python" discussion, and it does not add any technical debt there (it would be straightforward to move this to the initialize call).

lldb/test/API/iohandler/sigint/TestIOHandlerPythonREPLSigint.py
58–60

This is probably a consequence of (a bug in) our libedit/readline compatibility workarounds in source/Plugins/ScriptInterpreter/Python/PythonReadline.cpp

JDevlieghere reopened this revision.Jan 11 2022, 7:13 PM
This revision is now accepted and ready to land.Jan 11 2022, 7:13 PM
JDevlieghere commandeered this revision.Jan 11 2022, 7:14 PM
JDevlieghere edited reviewers, added: teemperor; removed: JDevlieghere.
This revision now requires review to proceed.Jan 11 2022, 7:14 PM
lldb/source/Plugins/ScriptInterpreter/Python/ScriptInterpreterPython.cpp
73

Were these not used on Windows at all before?

This revision is now accepted and ready to land.Jan 13 2022, 11:16 AM