This is an archive of the discontinued LLVM Phabricator instance.

[libFuzzer] Diff 8 - Improve synchronization.
AbandonedPublic

Authored by mpividori on Nov 29 2016, 4:35 PM.

Details

Reviewers
kcc
zturner
Summary

+ Move the SIGTERM, SIGINT and SIGALRM handlers into a separate thread for Posix implemention (instead of asynchronous signal handler). This is more appropriate for 2 reasons:

  • InterruptCallback() and AlarmCallback() use functions like printf, and access to global data, which is not appropriate for a signal handler. By using a separate thread, we are able to synchronize with the main thread.
  • Is the same approach than Windows, which makes code simpler, since we are sure that InterruptCallback() and AlarmCallback() will be excecuted in a separate thread in both Windows and Posix systems. Under this assumption, we can safely use locks to synchronize that thread with the main thread. (We couldn't do that if the same code could be executed asynchronously by a signal handler in Posix systems and by a separated thread in Windows).

+ I added a new flag RunningCB to know if the Fuzzer's main thread is running the CB function, instead of using ! CurrentUnitSize. ! CurrentUnitSize doesn't work properly. For example, in FuzzerLoop.cpp, line 452, we execute the callback with size 0. Previous implementation failed to detect timeouts in that execution.

+ Add a mutex RunningCBMtx to synchronize the access to the Fuzzer's data between different threads:
All the information related to the state of the fuzzer is only modified by the main thread, when it is not running the callback function.
So, in order to consistently access to the Fuzzer's data, we should lock the RunningCBMtx and make sure RunningCB is true (the main thread is running the CB).
This is used to synchronize the thread which manages the timers, and the one which supervises rss limits, with the main thread.
In the callbacks: Fuzzer::InterruptCallback(), Fuzzer::AlarmCallback() and Fuzzer::RssLimitCallback(), I lock the mutex RunningCBMtx, before accessing the Fuzzer's data.
I can do that because I am sure that callbacks are executed by a separate thread, both in Posix and Windows implementations.
For the special case of: Fuzzer::CrashCallback(), I can't use the mutex because the callback could be executed asynchronously by the main thread in Posix, where we use signal handlers for crash signals such as: SIGSEGV, SIGILL, etc. So, I do my best and only check for the RunningCB flag.

Diff Detail

Repository
rL LLVM

Event Timeline

mpividori updated this revision to Diff 79672.Nov 29 2016, 4:35 PM
mpividori retitled this revision from to [libFuzzer] Diff 8 - Improve synchronization..
mpividori updated this object.
mpividori added reviewers: kcc, zturner.
mpividori set the repository for this revision to rL LLVM.
mpividori added a subscriber: llvm-commits.
mpividori updated this object.Nov 29 2016, 4:36 PM
kcc edited edge metadata.Nov 30 2016, 6:57 PM

OMG.
Some of this error handling used to be in a separate thread and I pushed it into the same thread to avoid synchronization problems.
This deserves more discussion, I am not convinced yet.

lib/Fuzzer/FuzzerLoop.cpp
544

please use std::lock_guard

zturner edited edge metadata.Dec 1 2016, 9:24 AM

It seems to me like the motivation for this change is that timers on Windows have their callback executed in an arbitrary thread context. If we could fix that -- force the timer to execute its callback in the main thread -- then this problem would go away. Is that right? All other signals (posix) and exceptions (windows) happen on the main thread already, so those are not the problem, only the timer, and only on Windows.

So would it be possible to just implement the timer logic in the main fuzzer loop? Use std::chrono::high_resolution_clock for example. We wouldn't need to treat the callback specially either, when it fired you could use the Win32 API RaiseException which would still trigger the handler you've installed via AddVectoredExceptionHandler.

Thoughts?

mpividori updated this revision to Diff 79930.Dec 1 2016, 9:42 AM
mpividori edited edge metadata.

+ Use lock_guard instead of lock() and unlock().

Hi @kcc @zturner , I will split my explanation in 2 parts:

+) Previous implementation for Posix was not appropriate: (this reason is not related to Windows at all)

It registers signals handlers:

  • SIGSEGV SIGFPE SIGBUS SIGILL SIGABRT, call the function CrashCallback().
  • SIGINT SIGTERM, call the function InterruptCallback().
  • SIGALRM, call the function AlarmCallback().

The implementation of that callbacks include functions like printf() (it isn't async-safe), and access to global data, which is not appropriate for a signal handler that will be executed asynchronously.

Also, that functions are not necessarily executed by the main thread. For example, they could be executed by the thread that supervises rss limit.
In POSIX it is unspecified which thread handle signals. So, while executing that callbacks, we don't know if the main thread continues its execution modifying the fuzzer's state and making it inconsistent.

Because of that, I think it is better to have a separate thread to handle the signals: SIGINT, SIGTERM and SIGALRM (this signals are sent to the process as a whole, we don't know which thread). We will block that signals to all the threads in the actual process, using pthread_sigmask(), and have only one thread waiting for incoming signals with sigwait(). We can be sure that signals will be handled only by that thread and we can synchronize that thread with the main thread to access to the Fuzzer's data.

For signals SIGSEGV SIGFPE SIGBUS SIGILL SIGABRT, (which are sent to a specific thread that generated the given problem, not to the process as a whole), we can't use a separate thread. They will be probably executed by the main thread, when that problems are caused inside the execution of the CB function being fuzzed. Because of that, I don't use the lock inside CrashCallback(), because it will be called by the main thread. I can't lock, because locks functions are not async-safe, but at the same time, I am sure the main thread will not continue executing and modifying the fuzzer data (because it is executing this callback), so all I have to do is check the flag RunningCB to know if the fuzzer's data is consistent.

So, this is the main reason of that changes for Posix implementation. Move all of the signal handling into a separate thread when possible.

+) In Windows, we implement similar functionalities using:

  • Vectored Exception Handling for exceptions similar to the signals: SIGSEGV SIGFPE SIGBUS SIGILL.
  • Signal for: SIGABRT.
  • SetConsoleCtrlHandler(), for behaviour similar to: SIGINT, SIGTERM.
  • Timer Queues for timer implementation, similar to: SIGALRM

The callbacks InterruptCallback() registered by SetConsoleCtrlHandler(), and AlarmCallback() registered by CreateTimerQueueTimer(), will be executed in a separate thread.
So, we need to synchronize the access to the fuzzer's data by that threads. The main thread could continue its execution and make the data inconsistent.

So, with these changes, we make code simpler, since we are sure that InterruptCallback() and AlarmCallback() will be excecuted in a separate thread in both Windows and Posix systems. Under this assumption, we can safely use locks to synchronize that thread with the main thread.

mpividori abandoned this revision.Dec 5 2016, 10:36 AM

After personally discussing this with @kcc and @zturner , we decided that we will continue with the previous implementation, even if it is not completely correct.
Main reasons:

  • It was mentioned that this implementation makes code more complex.
  • It was mentioned that relying in signal masking is not a good idea because that implementation is generally full of bugs in posix*s systems.

Thanks.