Index: llvm/docs/XRayFramework.rst =================================================================== --- /dev/null +++ llvm/docs/XRayFramework.rst @@ -0,0 +1,469 @@ +============== +XRay Framework +============== + +:Version: 1 as of 2019-01-02 + +.. contents:: + :local: + +Introduction +============ + +XRay is a function call tracing system combining compiler-inserted +instrumentation and a runtime library for intercepting these function calls +(and other kinds of events). This document describes the runtime framework +through which the XRay runtime implementations included in the compiler-rt +project can be used to implement user-defined logging implementations. + +This document is structured both as a reference guide as well as a tutorial. It +is intended to be read in order, presenting the APIs and the hooks for how an +implementation can be plugged into the framework. + +Background +========== + +XRay's runtime library is made up of the following parts: + +* A set of APIs for handling the management and life cycle of a logging + implementation. +* The instrumentation point patching/unpatching routines. +* A set of APIs for controlling the functions invoked when the instrumentation + points are enabled. + +We're first going to start with the typical life cycle of a logging +implementation. This is the user-facing interface that you'll typically find +integrated in some form of control surface (through some control mechanism, +usually something more integral to the application being traced). + +In this document we will walk through how an example mode could be implemented +and registered through the logging implementation. We'll refer to +implementations plugged in through the XRay framework as "modes" throughout +this document. There are three modes that come with the LLVM XRay distribution: +Basic, Flight Data Recorder (FDR), and Profiling. + +Controlling Modes +================= + +We can control the various modes programmatically using functions exposed in +the ``xray_log_interface.h`` header. This header comes with the compiler-rt +project. To initialize a mode, we can refer to them through mode identifier +strings and calling the ``__xray_log_init_mode(...)`` function. + +:: + + #include + + int main(int argc, char* argv[]) { + // Initialize a mode called "xray-basic", with options. + auto init_status = __xray_log_init_mode( + "xray-basic", "func_duration_threshold_us=0"); + if (init_status != XRayLogRegisterStatus::XRAY_REGISTRATION_OK) { + // We have encountered an error here. + } + + // ... continued ... + +Once we've initialised the mode we want, we can then "patch" the currently +running program using a call to ``__xray_patch()``. + +:: + + // ... continuation of main (...) ... + + auto patch_status = __xray_patch(); + if (patch_status != XRayPatchingStatus::SUCCESS) { + // We have failed to patch the currently running program. + } + + // ... continued ... + + +The patching operation will modify the instrumentation points in the program to +intercept function entry and exit events, along with other special "custom" and +typed events that might be used. Because we've initialised the specific mode +with a specific set of options, once the patched functions run, the handlers +associated with the mode will be invoked when these events are encountered. + +At this point, we can wait, spawn threads, perform the core function of +the program, and then eventually, we can decide to finalize and flush the logs. +To do that, we can call the appropriate functions: + +:: + // ... continuation of main (...) ... + + auto finalize_status = __xray_log_finalize(); + if (finalize_status != XRayLogInitStatus::XRAY_LOG_FINALIZED) { + // We have filed to finalize the implementation. + } + + auto flush_status = __xray_log_flushLog(); + if (flush_status != XRayLogFlushStatus::XRAY_LOG_FLUSHED) { + // We have failed to flush the log. + } + + // ... continued ... + +From here, we can also call ``__xray_unpatch()`` to undo the patching we've done +earlier, which partially undoes the effects of the patching operation. + +:: + + // ... continuation of main (...) ... + patch_status = __xray_unpatch(); + if (patch_status != XRayPatchingStatus::SUCCESS) { + // We have failed to unpatch the currently running program. + } + + } // end of main(...). + +This is the basic flow of control for using modes that have been registered and +known to the XRay framework. + +Registering Modes +================= + +The key data structure for modes we can register to the implementation is the +``XRayLogImpl`` type, defined in ``xray_log_interface.h``. More thorough +documentation can be found in the header, but here we'll provide an abridged +version of the struct, and dive into the details of some of the +assumptions/requirements of each of the pluggable functions described in the +struct. + +:: + + struct XRayLogImpl { + XRayLogInitStatus (*log_init)(size_t, size_t, void*, size_t); + XRayLogInitStatus (*log_finalize)(); + void (*handle_arg0)(int32_t, XRayEntryType); + XRayLogFlushStatus (*flush_log)(); + } + +All the function pointers in the struct are required, when providing a minimal +implementation of a mode. The four functions are: + +* ``log_init``: a function pointer which takes two deprecated and unused + ``size_t`` arguments, a pointer to data provided through either + ``__xray_log_init_mode(...)`` or ``__xray_log_init_mode_bin(...)``. This + function must return a value of ``XRayLogInitStatus``. +* ``log_finalize``: a function pointer which is invoked when + ``__xray_log_finalize()`` is invoked. +* ``handle_arg0``: a function pointer which is invoked when a function-related + event is encountered by the tracing framework. This function must be all of + the following: + + * *Re-entrant* -- this function must handle the case when it's being called + even while it is already executing (as in the case of an instrumented + signal handler). + * *Async-signal safe* -- this function may be invoked in a signal handler, + while there are other signals that are queued up. + * *Thread-safe* -- this function may be called in multiple concurrent + threads. + +* ``flush_log``: a function pointer which is invoked when + ``__xray_log_flushLog()`` is invoked. + +These functions are the minimum required function pointers for any mode +implementation that can be registered with the XRay framework. However, +implementations that would like to provide some optional functionality can call +some special extension functions in ``xray_interface.h`` and +``xray_log_interface.h`` in the mode initialisation function provided in the +XRayLogImpl structure. These are: + +* ``__xray_set_handler_arg1(...)`` which takes a function pointer which handles + function events for functions that have the first argument to be logged. +* ``__xray_set_customevent_handler(...)`` which takes a function pointer which + is invoked when a custom event is encountered. Custom events are those that + appear in source code as either C/C++ calls to ``__xray_customevent(...)`` or + in LLVM IR as ``llvm.xray.customevent(...)``. +* ``__xray_log_set_buffer_iterator(...)`` lets modes provide an "iterator" + function which returns pointers and extents for buffers (if there are any) + associated with the mode. + +When modes register an iterator function for the buffer iterators, users can +call the ``__xray_log_process_buffers(...)`` function providing a function +pointer which is invoked for every buffer provided by the iterator functions. + +Modes can then register a mode through the ``__xray_log_register_mode(...)`` +function. One example of this might look like: + +:: + + #include + #include + + XRayLogInitStatus exampleLogInit(size_t, size_t, void*, size_t); + XRayLogInitStatus exampleFinalize(); + void exampleHandleArg0(int32_t, XRayEntryType); + XRayLogFlushStatus exampleFlushLog(); + + static bool Init = [] [[clang::xray_never_instrument]] { + auto register_mode = __xray_log_register_mode("example-mode", { + exampleLogInit, exampleFinalize, exampleHandleArg0, exampleFlushLog + }); + return register_mode != XRayLogRegisterStatus::XRAY_REGISTRATION_OK; + }(); + +This registration is global, and once a mode has been registered in this manner, +it can be initialised and used similar to how the provided modes can be used. +If this code is compiled into a library archive (for static linking) then the +global initializer for ``Init`` runs before the ``main()`` function. This way +the modes can be registered just by adding a linker dependency to an XRay +instrumented binary. + +Also note that all the functions that are meant to be invoked by the XRay +framework are best attributed with the ``[[clang::xray_never_instrument]]`` +C++11 attribute, or using the ``__attribute__((xray_never_instrument))`` +syntax. Since we're using C++11 in the implementation here, we'll be using the +C++11 style attributes throughout the document. + +Example Mode Implementation +=========================== + +An example mode that might be useful is a simple function coverage mode +implementation. This is meant as an exercise in exercising the features we've +referred to above. + +Mode Initialization +------------------- + +The first thing we do for this example implementation is to define the +initialization function. For the purposes of illustration, we'll use C++ and +components in the C++ standard library. It's recommended however that modes not +use or depend on types and functions that may be XRay instrumented to reduce the +potential overheads of the implementations. The modes that come with the +compiler-rt project in XRay make these concessions to allow for full portability +of the implementation. + +That said, for our example mode, we'll set up a global vector of booleans sized +to the number of functions we have instrumented in the running program. To do +this we use the ``__xray_max_function_id()`` call to determine how many +functions we should account the coverage information for. We will synchronise +this global boolean with a mutex for correct concurrent operation -- however, we +should note that this is only a demonstration, that in general if you are +implementing a mode that needs to be efficient, you may consider using +alternative means of avoiding unnecessary contention/overheads when tracing. + +Here's one way of implementing this initialization function: + +:: + + #include + #include + #include + #include + #include + + static std::atomic example_initialized{0}; + static std::mutex global_mutex; + static std::vector function_coverage; + static std::atomic finalized{false}; + + XRayLogInitStatus exampleLogInit(size_t, size_t, void*, size_t) + [[clang::xray_never_instrument]] { + // TODO: Parse the options if we ever need to support any customisation. + int expected = 0; + if (!example_initialized.compare_exchange_strong(expected, 1)) + return expected == 1 ? + XRayLogInitStatus::XRAY_LOG_INITIALIZING : + XRayLogInitStatus::XRAY_LOG_INITIALIZED; + + std::lock_guard guard(global_mutex); + + // Set up the function coverage with enough storage to represent coverage + // information for all XRay instrumented functions. + function_coverage.resize(__xray_max_function_id()); + + // We reset the contents of the bitmap. + function_coverage.clear(); + + // Then we set the global state to indicate that we're done initializing + // this mode. + example_initialized.store(2); + + // We also ensure that we are no longer in a finalized state, if we were + // previously finalized. + finalized.store(false, std::memory_order_release); + return XRayLogInitStatus::XRAY_LOG_INITIALIZED; + } + +Note that we needed to account for the case when multiple threads might be +invoking the initialization function at the same time for the same mode. + +Handling Function Events +------------------------ + +We can then move onto the functions which handle the cases when function events +are encountered. In this implementation we don't need to differentiate the +actions for the kinds of events we encounter, but some modes (like the ones that +come with the compiler-rt XRay runtimes) will want to differentiate function +entry from exit events, etc. + +What we do for function coverage is two-fold: + +1. We first mark the function as "covered" in the global bitmap. +1. We then "unpatch" the function we've just covered. + +Again, this implementation is not the most efficient way of doing this, but for +the sake of brevity and simplicity, this will get us what we need. + +:: + + thread_local std::atomic reentrance_guard{0}; + + void exampleHandleArg0(int32_t function_id, XRayEntryType) + [[clang::xray_never_instrument]] { + // Prevent recursion into this function either through a signal handler or + // if we're calling a function that is also instrumented from within this + // handler. + if (reentrance_guard.exchange(1, std::memory_order_acq_rel) == 1) return; + + // Stop immediately if the mode has already been finalized. + if (finalized.load(std::memory_order_acquire)) return; + + // Hold the global lock to ensure that updates to the coverage bitmap are + // thread-safe. + std::lock_guard guard(global_mutex); + function_coverage[function_id] = 1; + + // We specifically ignore the result of unpatching a specific function. + __xray_unpatch_function(function_id); + + // Then we reset the reentrance guard to "0". + reentrance_guard.store(0, std::memory_order_release); + } + +Notice that we're using a thread-local reentrance guard in the function to +ensure that, even in the event of signal handlers being invoked (or the lock +implementation being instrumented) we do nothing in case the current thread is +already handling a function event. + +We also check whether the implementation has been finalized before we proceed +with any modifications we're about to make to global data structures. + + +Finalization and Cleanup +------------------------ + +There are several ways we can provide access to the data. In this example, we +provide this through explicitly writing out a file which contains the bitmap of +function coverage. This is the simplest thing that will possibly work, but +sometimes users might want to expose this data through a separate interface (via +the network or processing the data in-process). + +The first part we need to implement is finalization. Notice how when we're +handling function events, we check first whether the implementation has been +finalized -- when we are finalizing, we want to ensure that we set a flag that +stops all threads from attempting to make any more modifications to the global +state. The finalization routine is also usually the place where we set the +iterator functions for in-memory processing. + +:: + + XRayBuffer exampleBufferIterator(XRayBuffer buffer) + [[clang::xray_never_instrument]] { + static std::string global_buffer; + if (buffer.Data == nullptr && buffer.Size == 0) { + // This is the first invocation of the iterator. Because our global + // structure is a std::vector we cannot just provide access to the + // underlying data. To do this properly we create a single global buffer + // and serialize the bitmap into that. This is only for the purposes of + // demonstration -- a real implementation may use a more efficient + // representation of the bitmap (run-length encoding, compression, etc.). + global_buffer.resize(__xray_max_function_id()); + global_buffer.clear(); + { + std::lock_guard guard(global_mutex); + for (const auto fn : function_coverage) + global_buffer.push_back(fn ? '1' : '0'); + } + return {global_buffer.c_str(), global_buffer.size()}; + } + + if (buffer.Data == global_buffer.data() && + buffer.Size == global_buffer.size()) { + // Clear the buffer, since we no longer need it. + std::string().swap(global_buffer); + return {nullptr, 0}; + } + + // If we get here we've encountered a bug. + assert(false && "Invalid argument to exampleBufferIterator!"); + } + + XRayLogInitStatus exampleFinalize() [[clang::xray_never_instrument]] { + // Set the global flag to "finalized", and if it's already been finalized + // before this call, we do nothing else. This way, only the first thread + // which successfully sets the finalized global flag from 0 -> 1 gets to + // proceed. + if (finalized.exchange(1, std::memory_order_acq_rel) == 1) + XRayLogInitStatus::XRAY_LOG_FINALIZED; + + // We then set up the buffer iterators to allow us to access the global + // state when users call __xray_log_process_buffers(...). + __xray_log_set_buffer_iterator(exampleBufferIterator); + + return XRayLogInitStatus::XRAY_LOG_FINALIZED; + } + +The next part we need to implement is the log flushing function. Because we +already have a means of accessing the data through a serialized buffer, we can +just have that implementation write out the data to a file. In this +implementation we pick an arbitrary filename, but you can imagine using various +approaches for configuring the filename of the output instead. + +:: + + XRayLogFlushStatus exampleFlushLog() [[clang::xray_never_instrument]] { + if (!finalized.load(std::memory_order_acquire)) + return XRayLogFlushStatus::XRAY_LOG_NOT_FLUSHING; + + static std::atomic flushing{false}; + if (flushing.exchange(true, std::memory_order_acq_rel)) + return XRayLogFlushStatus::XRAY_LOG_FLUSHING; + + // Open a file and write out the contents there. We use a function-local + // static pointer so that it can be accessed as a global variable by the + // lambda we provide to the call to __xray_log_process_buffers(...). + // This is necessary because __xray_log_process_buffers(...) takes a + // function pointer, and in C++ we cannot convert a lambda with + // captured state into a function pointer. + static FILE* tmpf = 0; + tmpf = std::tmpfile(); + auto result = __xray_log_process_buffers( + +[](const char *, XRayBuffer buffer) { + // write out the data raw into the temporary file. + std::fputs(buffer.Buffer, tmpf); + }); + fclose(tmpf); + + // Indicate that we are no longer flushing. + flushing.store(false, std::memory_order_release); + + // Now that we've flushed the contents of the global data structure, we then + // re-set the global "initialized" flag to 0, which indicates we can now be + // re-initialized. + example_initialized.store(0, std::memory_order_release); + return result; + } + +Further Work +============ + +There are two main ways of controlling an XRay instrumented binary at the +moment: through environment variables described in the `XRay Documentation +` and programmatically as described above. In the future we may +consider providing a helper library which dedicates a thread of execution, or +uses ``ptrace`` or similar functionality to control the XRay tracing framework +externally. That should make enabling/disabling XRay more convenient than +having to bake in the control surface to an application/binary's code. + +There is also potential for exposing some of the helper types and functions +used by the XRay runtimes, to make it more convenient to implement modes more +correctly. + +Contact Information +=================== + +Please send questions on details of implementing your own XRay modes to the +`llvm-dev mailing list `. Index: llvm/docs/index.rst =================================================================== --- llvm/docs/index.rst +++ llvm/docs/index.rst @@ -300,6 +300,7 @@ XRay XRayExample XRayFDRFormat + XRayFramework PDB/index CFIVerify SpeculativeLoadHardening @@ -431,6 +432,10 @@ :doc:`XRayExample` An example of how to debug an application with XRay. +:doc:`XRayFramework` + A detailed walk-through of how to plug in a new XRay mode through the XRay + Framework. + :doc:`The Microsoft PDB File Format ` A detailed description of the Microsoft PDB (Program Database) file format.