Before this change, the FDR mode implementation relied on at thread-exit
handling to return buffers back to the (global) buffer queue. This
introduces issues with the initialisation of the thread_local objects
which, even through the use of pthread_setspecific(...) may eventually
call into an allocation function. Similar to previous changes in this
line, we're finding that there is a huge potential for deadlocks when
initialising these thread-locals when the memory allocation
implementation is also xray-instrumented.
In this change, we limit the call to pthread_setspecific(...) to provide
a non-null value to associate to the key created with
pthread_key_create(...). While this doesn't completely eliminate the
potential for the deadlock(s), it does allow us to still clean up at
thread exit when we need to. The change is that we don't need to do more
work when starting and ending a thread's lifetime. We also have a test
to make sure that we actually can safely recycle the buffers in case we
end up re-using the buffer(s) available from the queue on multiple
This change cuts across both LLVM and compiler-rt to allow us to update
both the XRay runtime implementation as well as the library support for
loading these new versions of the FDR mode logging. Version 2 of the FDR
logging implementation makes the following changes:
- Introduction of a new 'BufferExtents' metadata record that's outside of the buffer's contents but are written before the actual buffer. This data is associated to the Buffer handed out by the BufferQueue rather than a record that occupies bytes in the actual buffer.
- Removal of the "end of buffer" records. This is in-line with the changes we described above, to allow for optimistic logging without explicit record writing at thread exit.
The optimistic logging model operates under the following assumptions:
- Threads writing to the buffers will potentially race with the thread attempting to flush the log. To avoid this situation from occuring, we make sure that when we've finalized the logging implementation, that threads will see this finalization state on the next write, and either choose to not write records the thread would have written or write the record(s) in two phases -- first write the record(s), then update the extents metadata.
- We change the buffer queue implementation so that once it's handed out a buffer to a thread, that we assume that buffer is marked "used" to be able to capture partial writes. None of this will be safe to handle if threads are racing to write the extents records and the reader thread is attempting to flush the log. The optimism comes from the finalization routine being required to complete before we attempt to flush the log.
This is a fairly significant semantics change for the FDR
implementation. This is why we've decided to update the version number
for FDR mode logs. The tools, however, still need to be able to support
older versions of the log until we finally deprecate those earlier
- Move per-thread key initialisation to when we initialise the logging implementation
- Define a grace period for finalisation (default to 100ms).
- Use aligned allocations for the buffers handed out by the BufferQueue.