This is an archive of the discontinued LLVM Phabricator instance.

[libclang] Control whether crash recovery is enabled/disabled using function argument.
Needs ReviewPublic

Authored by rs on Aug 18 2016, 7:18 AM.

Details

Reviewers
akyrtzi
Summary

I'm using libclang from Java via the Java Native Interface (JNI). When calling the clang_createIndex function the first thing it does is check whether the environment variable 'LIBCLANG_DISABLE_CRASH_RECOVERY' is set if it isn't then it enables crash recovery.

// We use crash recovery to make some of our APIs more reliable, implicitly
// enable it.
CXIndex clang_createIndex(int excludeDeclarationsFromPCH,
                        int displayDiagnostics) {
if (!getenv("LIBCLANG_DISABLE_CRASH_RECOVERY"))
  llvm::CrashRecoveryContext::Enable();
...

The crash recovery code installs it's own signal handler for catching several signals one of them being the segmentation fault signal 11. If the handler CrashRecoverySignalHandler is triggered and it doesn't recognise the signal coming from a call in LLVM then it reinstalls the previous handler (by calling CrashRecoveryContext::Disable()) for the signals and reraises the signal:

static void CrashRecoverySignalHandler(int Signal) {
// Lookup the current thread local recovery object.
const CrashRecoveryContextImpl *CRCI = CurrentContext->get();

if (!CRCI) {
  // We didn't find a crash recovery context -- this means either we got a
  // signal on a thread we didn't expect it on, the application got a signal
  // outside of a crash recovery context, or something else went horribly
  // wrong.
  //
  // Disable crash recovery and raise the signal again. The assumption here is
  // that the enclosing application will terminate soon, and we won't want to
  // attempt crash recovery again.
  //
  // This call of Disable isn't thread safe, but it doesn't actually matter.
  CrashRecoveryContext::Disable();
  raise(Signal);

  // The signal will be thrown once the signal mask is restored.
  return;
}

The JVM installs it's own segfault handler which catches operations on null objects and raises a NullPointerException that produces a Java stacktrace. When my Java application calls clang_createIndex() with crash recovery enabled it replaces the JVM's segfault handler with CrashRecoverySignalHandler and now this handler gets all the segfault signals that would have normally been sent to the JVM and when it does and tries to restore the previous segfault hanlder (which is the JVMs) it doesn't install the right one because the JVM ends up crashing and producing a core dump. Solution to this problem is to disable crash recovery when using libclang from Java.

I could disable crash recovery by calling clang_toggleCrashRecovery(false) after clang_createIndex has been called but this doesn't work because the right JVM handler isn't reinstalled so I still get a core dump when the JVM receives a segfault signal which it would have normally treated as a NullPointerException.

The only way to correctly solve my problem is to set the environment variable 'LIBCLANG_DISABLE_CRASH_RECOVERY' but I find this to not be a very nice way to control the program behaviour so I've got a patch below which allows users to control whether it's enabled/disable by setting a function argument. I've kept the old check in for the environment variable LIBCLANG_DISABLE_CRASH_RECOVERY for backwards compatibility.

Diff Detail

Repository
rL LLVM

Event Timeline

rs updated this revision to Diff 68535.Aug 18 2016, 7:18 AM
rs retitled this revision from to [libclang] Control whether crash recovery is enabled/disabled using function argument..
rs updated this object.
rs set the repository for this revision to rL LLVM.
rs updated this object.
rs updated this object.
rs updated this object.
rs updated this object.Aug 18 2016, 7:22 AM
rs added a subscriber: cfe-commits.Aug 18 2016, 7:26 AM
rs added a reviewer: akyrtzi.Aug 18 2016, 7:28 AM

When my Java application calls clang_createIndex() with crash recovery enabled it replaces the JVM's segfault handler with CrashRecoverySignalHandler and now this handler gets all the segfault signals that would have normally been sent to the JVM and when it does and tries to restore the previous segfault hanlder (which is the JVMs) it doesn't install the right one because the JVM ends up crashing and producing a core dump.

Surely the fix then is to make sure CrashRecoveryContext::Disable does reinstall the right signal handler?

The only way to correctly solve my problem is to set the environment variable 'LIBCLANG_DISABLE_CRASH_RECOVERY' but I find this to not be a very nice way to control the program behaviour

Why not? Also I notice that using environment variables to control behaviour is used in a bunch of places in libclang so you're introducing some inconsistency here.

wrmsr added a subscriber: wrmsr.Sep 27 2016, 4:16 PM

I just lost a few days diagnosing what wound up being this :(

Surely the fix then is to make sure CrashRecoveryContext::Disable does reinstall the right signal handler?

The fix is to not do this by default as this breaks host environments. Libraries that don’t explicitly relate to signal handling have no business overriding signal handlers.

Why not? Also I notice that using environment variables to control behaviour is used in a bunch of places in libclang so you're introducing some inconsistency here.

There are already API exports controlling crash recovery so the API is already inconsistent in addition to being broken by default. I understand env vars controlling environment related variability (dev vs prod) but not host-process and callsite related variability (cpython vs java).

akyrtzi edited edge metadata.Sep 28 2016, 9:25 AM

I could disable crash recovery by calling clang_toggleCrashRecovery(false) after clang_createIndex has been called but this doesn't work because the right JVM handler isn't reinstalled

Could you explain more why this doesn't work ? Is it a bug with the crash handling logic or something else ?

If we are going to have a 'clang_createIndex2' function to be able to introduce a new flag, it should be made to accept OR'ed enums for flags, so we can add more flags in the future and not get in a similar situation and have to add 'clang_createIndex3'..