This ports the profiling runtime on Fuchsia and enables the
instrumentation. Unlike on other platforms, Fuchsia doesn't use
files to dump the instrumentation data since on Fuchsia, filesystem
may not be accessible to the instrumented process. We instead use
the data sink to pass the profiling data to the system the same
sanitizer runtimes do.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
compiler-rt/lib/profile/InstrProfilingPlatformFuchsia.c | ||
---|---|---|
31 ↗ | (On Diff #148052) | This is only needed because runtime registration invokes it: https://reviews.llvm.org/diffusion/L/browse/compiler-rt/trunk/lib/profile/InstrProfilingRuntime.cc;333007$24, if we did our own registration we could avoid this function altogether and we could also name __llvm_profile_register_write_file_atexit more appropriately (since there're no files involved) but I'm not sure whether it's worth the duplication? |
compiler-rt/lib/profile/InstrProfilingPlatformLinux.c | ||
10 ↗ | (On Diff #148052) | This file seems more related to the file format and linker support than platform. Would there be any objections against renaming InstrProfilingPlatformLinux.c to InstrProfilingPlatformELF.c and similarly InstrProfilingPlatformDarwin.c to InstrProfilingPlatformMachO.c? |
This looks good at a high level. How is it tested?
compiler-rt/lib/profile/InstrProfilingPlatformFuchsia.c | ||
---|---|---|
31 ↗ | (On Diff #148052) | Having an imperfect name sounds a bit better to me than copying the code for the static initializer. |
compiler-rt/lib/profile/InstrProfilingPlatformLinux.c | ||
10 ↗ | (On Diff #148052) | No objection here. |
compiler-rt/lib/profile/InstrProfilingPlatformFuchsia.c | ||
---|---|---|
37 ↗ | (On Diff #148052) | Info notes out of date? |
55 ↗ | (On Diff #148052) | Add a comment here. |
102 ↗ | (On Diff #148052) | Probably don't need this. In fact, buffer API does not yet support value profile data dump, so value profiling can be turned off by default for Fusia. |
compiler-rt/lib/profile/InstrProfilingPlatformLinux.c | ||
10 ↗ | (On Diff #148052) | if you want to do this, please do it in a separate patch. |
compiler-rt/lib/profile/InstrProfilingPlatformLinux.c | ||
---|---|---|
10 ↗ | (On Diff #148052) | There are no other MachO platforms other than Darwin anyway. |
lgtm
compiler-rt/lib/profile/InstrProfilingPlatformFuchsia.c | ||
---|---|---|
102 ↗ | (On Diff #148052) | The problem is builds that only depend on buffer APIs can not depend on libc (for dynamic memory allocation). Value profiling dump requires dynamically allocated memory. |
I don't have enough context on how the profiling runtime works to properly review this implementation strategy for Fuchsia. Perhaps adequate comments here would tell me enough. Certainly it needs a lot more comments.
But probably we just need to talk through the implementation plan in person first.
clang/lib/Driver/ToolChains/Fuchsia.cpp | ||
---|---|---|
106 ↗ | (On Diff #148783) | addSanitizerRuntimes should also be after AddLinkerInputs. |
compiler-rt/lib/profile/InstrProfilingPlatformFuchsia.c | ||
1 ↗ | (On Diff #148783) | s/Linux/Fuchsia/ |
9 ↗ | (On Diff #148783) | This needs an overview comment explaining the implementation scheme. Overall the code needs a lot more comments. |
14 ↗ | (On Diff #148783) | Why are these needed? stdlib.h is for atexit. <sys/types.h> should never be used in new code that is not POSIXy. |
24 ↗ | (On Diff #148783) | This needs comments explaining why this is a reliable upper bound. |
25 ↗ | (On Diff #148783) | This should be static. |
36 ↗ | (On Diff #148783) | C does not require such casts from void *. If LLVM C style does require them, you can leave it. |
37 ↗ | (On Diff #148783) | Use C99 style: declare only at first initialization. |
47 ↗ | (On Diff #148783) | You should use only the _zx_* entry points. |
92 ↗ | (On Diff #148783) | Move decls to first use. |
109 ↗ | (On Diff #148783) | %zu is for size_t. KOIDs are uint64_t, so use "%" PRIu64 (<inttypes.h>). |
compiler-rt/lib/profile/InstrProfilingPlatformFuchsia.c | ||
---|---|---|
147 ↗ | (On Diff #148783) | I cannot do that since this function is always invoked by RegisterRuntime() during initialization (https://reviews.llvm.org/source/compiler-rt/browse/compiler-rt/trunk/lib/profile/InstrProfilingRuntime.cc;333870$24), otherwise I'd avoid providing this function altogether. |
compiler-rt/lib/profile/CMakeLists.txt | ||
---|---|---|
62 ↗ | (On Diff #155070) | Can we add these .c files unconditionally and just handle it with global __Fuchsia__ preprocessor switch? This is already a common practice in compiler-rt. |
LGTM with these few substantive nits and then a whole lot more comments (mostly a more thorough overview covering linkage and output model).
clang/lib/Driver/ToolChains/Fuchsia.cpp | ||
---|---|---|
106 ↗ | (On Diff #148783) | still relevant, though could be an unrelated change |
compiler-rt/lib/profile/InstrProfilingPlatformFuchsia.c | ||
33 ↗ | (On Diff #155095) | Each symbol defined in this file without static or COMPILER_RT_VISIBILITY (hidden) needs to have an explicit comment clearly saying that this symbol is exported in each dynamic executable and DSO that statically links in the profiling runtime, and why that is desireable and what the protocol is for sharing a single set of definitions from one of those at runtime. I think the VMO and offset are the only things we actually want to share. I think it is exactly just those two symbols that should be exported, so they can have a short comment here and a longer explanatory comment about the whole inter-module sharing model can be part of the overview comment at the top of the file. |
34 ↗ | (On Diff #155095) | s/should written to/should be written next/ |
70 ↗ | (On Diff #155095) | So, this needs some overall comments on the methodology of this function and when/how it's called. The first time through, it will allocate the VMO and use llvm_profile_get_size_for_buffer() as the initial size. After that, and also on any subsequent calls, it will resize the VMO to cover the incoming __llvm_profile_offset value plus the total being written in this call. So, each call from the first onwards serially appends an arbitrary amount to the VMO, which has been published on creation. But actually "first call" and "subsequent calls" are misnomers. Each incarnation of this function will only ever be called once. Each module (executable or DSO) statically links in the whole profile runtime to satisfy the calls from its instrumented code. Several modules in the same program might be separately compiled and even use different versions of the instrumentation ABI and data format. All they share in common is the VMO and the offset, which live in exported globals so that exactly one definition will be shared across all modules. (Ideally they'd be exported as STB_GNU_UNIQUE even, so that multiple isolated modules loaded by dlopen into a program with no global-scope modules exporting these two globals would still share a VMO. But that case is unlikely, and Fuchsia's dynamic linker doesn't usefully support STB_GNU_UNIQUE so far.) Each module has its own independent runtime that registers its own atexit hook to write its own data. The data format is sufficiently chunked and self-describing that simply concatenating the data streams from several unrelated incarnations of the runtime (that might even be from different versions of the compiler instrumentation) is sufficient to sort the data out easily offline later. The alternative to the shared globals approach is that each module's atexit hook is totally unaware of the others and each creates its own VMO and does its own __sanitizer_publish_data call. That wouldn't be the worst thing, and that's the "failure mode" of the pathological dlopen case mentioned in the aside above. It just means that the data sink receiver has to be ready to handle multiple VMOs for the same sink name that originate from the same process. As things stand, that means multiple VMOs with the same name. I think that should actually be fine and there's no need for this code to try to mitigate it (e.g. by also putting the VMO KOID into the name). Rather, every implementation of the data sink service ought to be prepared for arbitrary (or no) names in the VMOs, so e.g. ones that write files might always choose the file name to be "${sink_name}.${vmo_koid}.${vmo_name}" so as to be guaranteed unique (by dint of the VMO KOID) as well as informative. Most of this background belongs in the overview comment at the top of the file. Comments within this function should be clear about how multiple incarnations of this function are interacting in the "first time" if branch and later. |
86 ↗ | (On Diff #155095) | Make sure to always use the _zx_* names. |
109 ↗ | (On Diff #155095) | PROF_NOTE and PROF_ERR and PROF_WARN are fortunately so far confinsed to InstrProfilingFile.c, which is code not used for Fuchsia. |
149 ↗ | (On Diff #155095) | I'd use <stdbool.h> bool here (and in general C11, or at least C99, freely and modern style preferentially), unless there's a C portability constraint I don't know about for lib/profile. |
compiler-rt/lib/profile/InstrProfilingPlatformFuchsia.c | ||
---|---|---|
19 ↗ | (On Diff #156909) | typo: "provided" |
67 ↗ | (On Diff #156909) | This should log a message including a sanitizer markup dumpfile element that refers to the VMO name. |
120 ↗ | (On Diff #156909) | s/get/got/ |