This is an archive of the discontinued LLVM Phabricator instance.

Differential D44620

[XRay][profiler] Part 4: Profiler Mode Wiring
Closed · Public

Authored by dberris on Mar 18 2018, 10:24 PM.

Details

Reviewers

eizan
kpw
pelikan

Commits

rGcfd7eec3d83e: [XRay][profiler] Part 4: Profiler Mode Wiring
rCRT334469: [XRay][profiler] Part 4: Profiler Mode Wiring
rL334469: [XRay][profiler] Part 4: Profiler Mode Wiring

Summary

This is part of the larger XRay Profiling Mode effort.

This patch implements the wiring required to enable us to actually
select the xray-profiling mode, and install the handlers to start
measuring the time and frequency of the function calls in call stacks.
The current way to get the profile information is by working with the
XRay API to __xray_process_buffers(...).
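As a rough illustration of what "measuring the time and frequency of the function calls in call stacks" could look like, here is a minimal, self-contained sketch of a call trie driven by entry/exit events. It is not the code from this patch: `SketchTrie`, `SketchNode`, `findRoot`, and the use of `std::map` and `std::unique_ptr` are assumptions made for illustration (the real `FunctionCallTrie` is built on XRay's own allocators and segmented arrays). Only the `enterFunction(FId, TSC)` / `exitFunction(FId, TSC)` shape is taken from the unit tests in this revision. Creating a node on entry, and unwinding the shadow stack on an exit that does not match the top entry, are choices of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <vector>

// Hypothetical node type: one node per distinct call path.
struct SketchNode {
  int32_t FId = 0;
  int64_t CallCount = 0;     // completed calls along this path
  int64_t CumulativeTSC = 0; // TSC ticks between entry and exit
  std::map<int32_t, std::unique_ptr<SketchNode>> Callees;
};

class SketchTrie {
  struct StackEntry {
    uint64_t EntryTSC;
    SketchNode *Node;
  };
  std::map<int32_t, std::unique_ptr<SketchNode>> Roots;
  std::vector<StackEntry> ShadowStack;

public:
  // Finds or creates the node for FId under the current top of stack
  // (or among the roots) and pushes it onto the shadow stack.
  void enterFunction(int32_t FId, uint64_t TSC) {
    auto &Siblings =
        ShadowStack.empty() ? Roots : ShadowStack.back().Node->Callees;
    std::unique_ptr<SketchNode> &Slot = Siblings[FId];
    if (!Slot) {
      Slot.reset(new SketchNode());
      Slot->FId = FId;
    }
    ShadowStack.push_back(StackEntry{TSC, Slot.get()});
  }

  // Pops entries until the one matching FId is found, accounting for each
  // popped entry. This tolerates missing intermediate exits.
  void exitFunction(int32_t FId, uint64_t TSC) {
    while (!ShadowStack.empty()) {
      StackEntry Top = ShadowStack.back();
      ShadowStack.pop_back();
      assert(TSC >= Top.EntryTSC && "sketch assumes non-decreasing TSCs");
      Top.Node->CallCount += 1;
      Top.Node->CumulativeTSC += static_cast<int64_t>(TSC - Top.EntryTSC);
      if (Top.Node->FId == FId)
        break;
    }
  }

  // Returns the root node for FId, or nullptr if FId was never a root.
  const SketchNode *findRoot(int32_t FId) const {
    auto It = Roots.find(FId);
    return It == Roots.end() ? nullptr : It->second.get();
  }
};
```

In this sketch the time charged to a node includes the time spent in its callees; separating local time from tree time is left out for brevity.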

In subsequent changes we'll implement profile saving to files, similar
to how the FDR and basic modes operate, as well as means for converting
this format into those that can be loaded/visualised as flame graphs. We
will also be extending the accounting tool in LLVM to support
stack-based function call accounting.

We also continue with the implementation to support building small
histograms of latencies for the FunctionCallTrie::Node type, to allow
us to actually approximate the distribution of latencies per function.
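The summary does not say how the per-node latency histograms are laid out, so the following is only one plausible shape: a small fixed-size histogram with power-of-two buckets. The names `LatencyHistogram`, `bucketFor`, and `record`, the bucket count of 16, and the bucketing rule are all assumptions of this sketch, not something taken from the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-size histogram. Bucket I counts latencies in
// [2^I, 2^(I+1)) TSC ticks; bucket 0 also takes a latency of 0, and the
// last bucket absorbs everything larger.
struct LatencyHistogram {
  static constexpr std::size_t NumBuckets = 16;
  std::array<uint32_t, NumBuckets> Buckets{};

  // Computes floor(log2(Ticks)), clamped to the last bucket.
  static std::size_t bucketFor(uint64_t Ticks) {
    std::size_t B = 0;
    while (Ticks > 1 && B + 1 < NumBuckets) {
      Ticks >>= 1;
      ++B;
    }
    return B;
  }

  void record(uint64_t Ticks) { ++Buckets[bucketFor(Ticks)]; }
};
```

A fixed array keeps the update on the function-exit path free of allocation, at the cost of coarse resolution for long latencies.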

Depends on D45758 and D46998.

Diff Detail

Repository: rL LLVM

Event Timeline

dberris created this revision. Mar 18 2018, 10:24 PM

Herald added a subscriber: mgorny. Mar 18 2018, 10:24 PM

ping @echristo or @kpw?

dberris mentioned this in D45474: [XRay][clang+compiler-rt] Support build-time mode selection. Apr 9 2018, 11:24 PM

dberris mentioned this in rL329772: [XRay][clang+compiler-rt] Support build-time mode selection. Apr 10 2018, 6:31 PM

dberris mentioned this in rC329772: [XRay][clang+compiler-rt] Support build-time mode selection.

dberris mentioned this in rCRT329772: [XRay][clang+compiler-rt] Support build-time mode selection.

Sorry for the late review, and for destroying your diff.

What I think would also make interesting test cases:

when the call sequence is A → B → C → setjmp(3) ↓ B ↓ A → D → longjmp(3), where an exit from A would be after N times the loop ran
or (alternatively) strange situations involving C++ exceptions

compiler-rt/lib/xray/tests/unit/allocator_test.cc
36 ↗	(On Diff #138879)	ASSERT_EQ(A.Counter, 1);
39 ↗	(On Diff #138879)	ASSERT_EQ(A.Counter, 1);
compiler-rt/lib/xray/tests/unit/function_call_trie_test.cc
39–40 ↗	(On Diff #138879)	For readability, can we have the TSCs to be 100, 200, 300 etc.? Now the numbers look the same. (and I see a test below does that already)
compiler-rt/lib/xray/xray_allocator.h
12–13 ↗	(On Diff #138879)	I would at least add a TODO for adding support for replacing this allocator with any of the security-checking ones. (ubsan? efence? valgrind? I keep forgetting the names of them. compiler-rt IIRC has some "secure" allocator too.) We don't want to become another OpenSSL.
66 ↗	(On Diff #138879)	static_assert(Size <= 16); to make sure we don't run out of bits when a CPU manufacturer goes crazy?
compiler-rt/lib/xray/xray_function_call_trie.h
24–27 ↗	(On Diff #138879)	I'd reword/reorder this slightly, to make it clearer what's actually being stored. FunctionCallTrie represents stack traces of XRay instrumented functions that we've encountered, where a node corresponds to a function call and the path from the root to that node represents its stack trace.
84 ↗	(On Diff #138879)	IIUC, a comma before "then", or a new sentence.
90 ↗	(On Diff #138879)	New line not necessary?
116–117 ↗	(On Diff #138879)	Why are these not unsigned? There's no such thing as a negative count or negative time spent in a function. ShadowStackEntry has the time as u64. (I'd make the function ID unsigned too.)
117 ↗	(On Diff #138879)	Please put something like "// TSC ticks" at the end of the line, or introduce u64 typedefs for TSC ticks/deltas to avoid someone mistaking it for nanoseconds or the like.
131 ↗	(On Diff #138879)	It's not clear to me why this needs FId when the Node pointer below has it as well. Please add a brief comment if it's really necessary.
152 ↗	(On Diff #138879)	Um, why is this line necessary? :-) Line 230 should work without Allocators:: too.
255–256 ↗	(On Diff #138879)	Why is the comment related to the function not above the function's first line? Same for exitFunction().
315–319 ↗	(On Diff #138879)	It's not clear to me from this comment that this function, unlike mergeInto, may create duplicate entries.
compiler-rt/lib/xray/xray_profile_collector.cc
31 ↗	(On Diff #138879)	I'm not sure a spinning lock is the best idea when deepCopying a huge tree, but have no data to prove anything.
129 ↗	(On Diff #138879)	Why not just "auto FId"?
134–138 ↗	(On Diff #138879)	Are you sure "static" is needed here? With just a local zero variable, the compiler may inline the memcpy and turn it into memset, whereas you're telling it to load a thing far away in memory. That said, I reckon "internal_memset(NextPtr, 0, 4); NextPtr += 4;" would be both easier to read and faster.
151–154 ↗	(On Diff #138879)	Same here, memset(NextPtr, 0, 8); NextPtr += 8;
compiler-rt/lib/xray/xray_profile_collector.h
10 ↗	(On Diff #138879)	instruementation has an typoe in it -> instrumentation Please fix other files as well.
31–32 ↗	(On Diff #138879)	Why is this a class with public static methods, when it doesn't have any data or subclasses and therefore should be a namespace?
compiler-rt/test/xray/TestCases/Posix/profiling-single-threaded.cc
15–16 ↗	(On Diff #138879)	These local macros don't make it that much shorter or more readable. Consider either removing "XRAY_" or dropping them.
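The suggestion on xray_profile_collector.cc (lines 134–138 and 151–154) to write the zero sentinels with a memset and then advance the cursor can be sketched as below. `writeZeroSentinel` is a hypothetical helper name, and `std::memset` stands in for sanitizer_common's `internal_memset` so that the snippet compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Writes Width zero bytes at NextPtr and returns the advanced cursor.
// std::memset is a stand-in for internal_memset in this sketch.
inline char *writeZeroSentinel(char *NextPtr, std::size_t Width) {
  assert(NextPtr != nullptr);
  std::memset(NextPtr, 0, Width);
  return NextPtr + Width;
}
```

Usage would then read `NextPtr = writeZeroSentinel(NextPtr, 4);` for the 4-byte sentinel and `NextPtr = writeZeroSentinel(NextPtr, 8);` for the 8-byte one, with no static zero-valued variable to load from.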

pelikan added inline comments. Apr 11 2018, 12:33 PM

compiler-rt/lib/xray/tests/unit/function_call_trie_test.cc
91–92 ↗	(On Diff #138879)	f0 → f1 → setjmp ↓(f1→f0) longjmp ↓(f1↓f0) longjmp ↓(f1↓f0) should generate lots of exits but only one entry. Or when the profiling starts in a signal handler. So it shouldn't be "impossible", just "infrequent" :-)
106–110 ↗	(On Diff #138879)	Test name says "MissingIntermediaryEntry" but this is missing an intermediary exit.
145–150 ↗	(On Diff #138879)	Same, please use multiples of 100 for TSCs. I wonder whether we should test the TSC time series not being non-decreasing due to TSC mismatches when rescheduling among poorly synchronized CPU packages.
compiler-rt/lib/xray/tests/unit/profile_collector_test.cc
21 ↗	(On Diff #138879)	did you mean: "the only one we actually care about"? If we "only care" about it, what more can we do about it? :-)
47 ↗	(On Diff #138879)	I would at least assert the buffer's size is within some reasonable bounds - has the trailing bit and a function list. Maybe also the zero sentinels in places.
77 ↗	(On Diff #138879)	Again, some assertions to make sure both threads are reflected in that buffer would be nice. Doesn't have to be too strict.
compiler-rt/lib/xray/tests/unit/segmented_array_test.cc
71 ↗	(On Diff #138879)	Please also test what happens when you do Array <TestData> data; auto it = data.begin(); it--; Because I think you'll find Offset will be SIZE_MAX. Not sure we want that.
compiler-rt/lib/xray/xray_allocator.h
94 ↗	(On Diff #138879)	assume NewChain == nullptr. (or BackingStore for that matter)
compiler-rt/lib/xray/xray_profile_collector.cc
50 ↗	(On Diff #138879)	Why do we need the volatile? It's a global, there's very little optimization the compiler can do anyway... I'd like to see what I missed, thinking it'd be OK without it.
compiler-rt/lib/xray/xray_segmented_array.h
42 ↗	(On Diff #138879)	So, actually, I never liked linked lists where the prev/next pointers are in a separate region of memory, because that tends to worsen the cache miss rate when you walk through the list, and when the points at which these Chunk things are allocated are reasonably randomized along with the actual data allocations to confuse the CPU prefetcher. Which is why I've always been using LIST/TAILQ versions from queue(3) as they embed these to the structures they're listing. I'm not saying you should rewrite all of this now, but have you thought about putting the prev/next into the T somehow? Is that even possible to do with C++ templates?
44 ↗	(On Diff #138879)	Why is this necessary, and we can't just use N?
61 ↗	(On Diff #138879)	InternalAlloc can return nullptr.
153 ↗	(On Diff #138879)	I suppose these were your debugging statements which can go away (and below).
242 ↗	(On Diff #138879)	tautology
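The concern raised for segmented_array_test.cc line 71 (decrementing an iterator obtained from `begin()` on an empty array) comes down to unsigned wraparound. The sketch below uses a hypothetical `SketchIterator` with a bare `std::size_t` offset; it does not reproduce the iterator in xray_segmented_array.h, and `tryDecrement` is an invented name showing one possible guard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical iterator state: just an unsigned offset into a container.
struct SketchIterator {
  std::size_t Offset = 0;

  // Unchecked decrement: at Offset == 0 the unsigned arithmetic wraps to
  // SIZE_MAX, which is not a meaningful position in any container.
  SketchIterator &operator--() {
    --Offset;
    return *this;
  }

  // Guarded alternative: refuses to step before the beginning.
  bool tryDecrement() {
    if (Offset == 0)
      return false;
    --Offset;
    return true;
  }
};
```

Whether a container should pay for such a guard, or treat the decrement as a contract violation by the caller, is a design decision for the container's author.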

fixup: Address comments

compiler-rt/lib/xray/tests/unit/allocator_test.cc
36 ↗	(On Diff #138879)	This is not a valid assertion, since `Counter` is not relevant to the allocator's observable properties.
39 ↗	(On Diff #138879)	Same.
compiler-rt/lib/xray/tests/unit/function_call_trie_test.cc
39–40 ↗	(On Diff #138879)	Can you explain better why using 100, 200, 300 as opposed to 1, 2, 3 is better?
91–92 ↗	(On Diff #138879)	Right, changed to "rare". For now, we've not made special support for setjmp/longjmp instrumentation. While there's a possibility we can do that in the future, we're not counting on being able to differentiate that for now. The signal handler case is precisely the one we're looking to support here, but that's just one case.
145–150 ↗	(On Diff #138879)	Good point. I'll leave a TODO for that. In particular we actually need to keep track of the CPU ID instead of just the TSC when we're building the shadow stack. That will let us track the migration of the thread(s). It's still not clear to me why using multiples of 100 is important. These are just arbitrary numbers anyway, it shouldn't matter what order of magnitude they are.
compiler-rt/lib/xray/tests/unit/profile_collector_test.cc
21 ↗	(On Diff #138879)	I don't see the difference. English is hard. "the only one we care about" == "the one we actually only care about" There are other use-cases for this collection API, some of which we don't cover in this unit test (yet). In particular, we could be collecting snapshots of the function call trie for a function every so often, and associating a timestamp to that, so we can show profiles over time instead of a single profile.
47 ↗	(On Diff #138879)	Some of these details aren't really relevant to the unit test. For example: The size of the block is dependent on how we've decided to serialise the data. Yes we can assert that the size is not zero (doing that now). The function list could be in any order. It's not a relevant feature of the API. What we care about is that we're able to get the data. The concern of parsing this data is not really at this level of the unit test (I'd rather we have an actual end-to-end test that would get this information). We're testing that we can get a buffer that's not the empty buffer, which tells us enough information to say that this API in particular is holding its promises based on the preconditions and postconditions.
compiler-rt/lib/xray/tests/unit/segmented_array_test.cc
71 ↗	(On Diff #138879)	That's technically testing for undefined behaviour -- i.e. outside of the contract of the container. :)
compiler-rt/lib/xray/xray_allocator.h
12–13 ↗	(On Diff #138879)	We already do that, because we're relying on the underlying allocator for sanitizer_common. There's already a way to provide alternate implementations of those in that regard. All the backing store we have is gotten from sanitizer_common as opposed to using our own calls to mmap directly.
compiler-rt/lib/xray/xray_function_call_trie.h
116–117 ↗	(On Diff #138879)	There are some potentially subtle issues with using unsigned in these variables. Some of them are: Forcing a value to be unsigned causes the compiler to implement modular arithmetic, even if we don't ever expect that values will wrap-around. Doing zero-sign extension is not cheap. We want to make these values as cheap as possible to update. Also, we cannot make the function ID unsigned, because the value we're getting from XRay is a signed number (int32_t). The conversion will not be faithful, and we've avoided unsigned in those cases for similar reasons. If we decide in the future that we can actually get away with unsigned values for function ids (which I think we can) then we can change all the XRay implementations to take unsigned values for the function id, etc. -- most of which is not really worth the cost.
131 ↗	(On Diff #138879)	This is an optimisation, so that we don't actually need to reach into the pointer just to get the function id of the function at the top of the stack.
152 ↗	(On Diff #138879)	This is necessary because users of this nested type need to access these exported types. In this case, because NodeRefAllocatorType is part of the FunctionCallTrie type, we're re-exporting this type through Allocators which is a public type.
255–256 ↗	(On Diff #138879)	Because the comment is an implementation detail, it's explaining what it's doing rather than what users need to expect (i.e. it's not documentation of the contract, it's documentation of the implementation detail).
315–319 ↗	(On Diff #138879)	Good point. Updated the comment and the implementation to make it clear that we're not destroying the state of the FunctionCallTrie in `O`.
compiler-rt/lib/xray/xray_profile_collector.cc
31 ↗	(On Diff #138879)	Note, the intent here is to use the `GlobalMutex` to lock operations on the `ThreadTries` vector. We shouldn't be holding a lock on the `GlobalMutex` while in the process of copying the FunctionCallTrie.
compiler-rt/lib/xray/xray_profile_collector.h
31–32 ↗	(On Diff #138879)	Good point. It started as a class that had member variables, until it evolved to a global implementation, which is better just as a namespace.
compiler-rt/lib/xray/xray_segmented_array.h
42 ↗	(On Diff #138879)	What you're talking about is intrusive lists. These work only if you're doing a linked list of elements, but in this case it's the chunks we're linking together. Each chunk will have a block, which is what we're managing. All the chunks come from the same region of memory.
44 ↗	(On Diff #138879)	Usability -- because you can do: Chunk C; assert(C.Size > 0);
compiler-rt/test/xray/TestCases/Posix/profiling-single-threaded.cc
15–16 ↗	(On Diff #138879)	Yeah, unfortunately without these clang-format gets confused. :D
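The reply about `GlobalMutex` (xray_profile_collector.cc line 31) describes holding the lock only for operations on the `ThreadTries` vector, not while copying a trie. A minimal sketch of that locking discipline follows. `SketchCollector`, `SketchProfile`, `post`, and `takeAll` are hypothetical names, and `std::mutex` replaces the spin mutex used in the patch so that the example is self-contained.

```cpp
#include <cassert>
#include <mutex>
#include <utility>
#include <vector>

// Stand-in for a deep-copied per-thread FunctionCallTrie.
struct SketchProfile {
  std::vector<int> Data;
};

class SketchCollector {
  std::mutex GlobalMutex;
  std::vector<SketchProfile> ThreadTries;

public:
  // The expensive copy happens when the argument is constructed, before the
  // lock is taken; the critical section only moves it into the vector.
  void post(SketchProfile P) {
    std::lock_guard<std::mutex> Lock(GlobalMutex);
    ThreadTries.push_back(std::move(P));
  }

  // Swaps the vector out under the lock so that any further processing of
  // the profiles can run without holding GlobalMutex.
  std::vector<SketchProfile> takeAll() {
    std::vector<SketchProfile> Out;
    {
      std::lock_guard<std::mutex> Lock(GlobalMutex);
      Out.swap(ThreadTries);
    }
    return Out;
  }
};
```

With the critical section reduced to a move or a swap, the choice between a spinning and a blocking lock matters much less than it would if the deep copy ran under the lock.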

Harbormaster completed remote builds in B17101: Diff 142590. Apr 15 2018, 8:13 PM

Can this possibly be split up? It's way too long to easily review and seems to at least have two sets of functionality.

Split into smaller parts.

Harbormaster completed remote builds in B17169: Diff 142893. Apr 18 2018, 1:18 AM

Retitled, and updated to reflect breakup into smaller parts.

Adding back llvm-commits.

Harbormaster completed remote builds in B17172: Diff 142896. Apr 18 2018, 1:34 AM

dberris added a child revision: D45998: [XRay][profiler] Part 5: Profiler File Writing. Apr 23 2018, 10:22 PM

This one is more straightforward than the previous in the change. The main ideas all fall into place nicely, but I have pointed out some details that could use some attention.

compiler-rt/lib/xray/xray_profiler.cc
41 ↗	(On Diff #142896)	Can you expand the initialism to thread-local data in a comment? I always think TopLevelDomain when I see this and after jumping to definition that would help refresh my memory.
51 ↗	(On Diff #142896)	Don't you need pthread_key_create first for ProfilingKey?
105 ↗	(On Diff #142896)	Seems to me this should be memory_order_acq_rel if we want mutual exclusion of profileCollectorService::reset() from another thread.
142–143 ↗	(On Diff #142896)	Maybe check verbosity and Report?
156–168 ↗	(On Diff #142896)	I think this should be wrapped in an "if (TLD.FCT)" block. It's definitely an edge case, but if ProfileLogStatus is INITIALIZED, and a context switch and an update to FINALIZING happen after the atomic load but before the next statement, then the TLD can be uninitialized, but the status check won't know about the FINALIZING state and we'll dereference a null TLD.FCT. I think there is a similar problem with moving the GetTLD() before the atomic_load for the transition from FINALIZED to INITIALIZING. You could solve it by having GetTLD return the TLD reference and status from the load. It also might be worth documenting that InternalAlloc won't return null (as opposed to the FCT allocator), so we're relying on that.
189–190 ↗	(On Diff #142896)	Not scoped to profiling mode: This strategy still makes me uncomfortable for cases where not much is instrumented (e.g. event tracing), but I don't have a better solution fleshed out. It kind of feels like a cooperative scheduling problem where threads should check periodically if they're cancelled. Maybe we could have an option to add sleds that just do a finalizing check without instrumenting so that sparse instrumentation is able to respect the grace period.
198 ↗	(On Diff #142896)	Is this protected from simultaneous calls to postCurrentThreadFCT if other threads take a while to see they're finalized?
210–213 ↗	(On Diff #142896)	Do you have that graph of valid state transitions? I thought it was OK to go from FINALIZED back to INITIALIZED without going back to UNINITIALIZED.
229–230 ↗	(On Diff #142896)	Can these error for bad/missing flags?
240 ↗	(On Diff #142896)	Ahh. Can you just make a comment near the pthread_setspecific that INITIALIZE is responsible for calling pthread_key_create?
compiler-rt/test/xray/TestCases/Posix/profiling-multi-threaded.cc
4–5 ↗	(On Diff #142896)	Is it xray-profiler or xray-profiling? Does the flag not match the mode string in code? You assert on xray-profiling and set the mode to that in code below.
8 ↗	(On Diff #142896)	Do you need a thing to exclude Windows, since you're calling readtsc()?
29 ↗	(On Diff #142896)	Yes please. ;)
compiler-rt/test/xray/TestCases/Posix/profiling-single-threaded.cc
5 ↗	(On Diff #142896)	Similar flag confusion.
47 ↗	(On Diff #142896)	Could be illustrative and increase coverage to have a test case that verifies that profiling mode can turn back on after a "round."
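The comments on xray_profiler.cc above (lines 105, 156–168, and 210–213) all concern the status state machine and how threads observe its transitions. The sketch below shows one way to express a guarded transition with a compare-and-swap using acquire-release ordering. It is an illustration, not the patch's code: `SketchStatus` and `transition` are hypothetical names, `std::atomic` is used in place of the sanitizer_common atomics, and only the five state names come from the discussion.

```cpp
#include <atomic>
#include <cassert>

// The state names follow the review discussion; the enum is a stand-in.
enum SketchStatus : int {
  UNINITIALIZED,
  INITIALIZING,
  INITIALIZED,
  FINALIZING,
  FINALIZED
};

// Attempts the transition From -> To atomically. Returns false if the status
// was not From, for example because another thread changed it first.
inline bool transition(std::atomic<int> &Status, int From, int To) {
  return Status.compare_exchange_strong(From, To, std::memory_order_acq_rel,
                                        std::memory_order_acquire);
}
```

A handler that needs both the status and the thread-local data would still have to re-check whatever it dereferences after the load, since the status can change immediately after a successful read.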

in the chain* I meant.

fixup: Use updated name for flags
fixup: address comments by kpw@, rename to use profiling instead of profiler

dberris planned changes to this revision. Jun 1 2018, 2:19 AM

dberris added inline comments.

compiler-rt/lib/xray/xray_profiler.cc
41 ↗	(On Diff #142896)	Renamed to ProfilingData instead.
142–143 ↗	(On Diff #142896)	Yeah, we'll need to refactor this to instead use the same reentrance guard across all the implementations (FDR and Basic). I'll add a dependency to the C++ ABI changes which has those changes.
189–190 ↗	(On Diff #142896)	Yep, we talked about this offline and we'll need to fix this across the implementations anyway. Let's fix that later.
198 ↗	(On Diff #142896)	serialize() has internal synchronisation, so we're relying on that synchronisation to do the right thing.
210–213 ↗	(On Diff #142896)	Yeah, unfortunately it seems that we're going to need to make it so that once an implementation has flushed, it should go back to UNINITIALIZED. Either that, or we're going to have to be a bit more clever about this.
229–230 ↗	(On Diff #142896)	Yes, but we're really just ignoring them here -- the parser will already report if the verbosity is high enough.
compiler-rt/test/xray/TestCases/Posix/profiling-multi-threaded.cc
4–5 ↗	(On Diff #142896)	There's two parts here -- there's the mode, which in this case is a name for the implementation. We're using "profiler" to be consistent with "flight data recorder" and "basic". We could just make this "xray-profiling" all throughout, which I think would make it much simpler -- and pretend that "FDR" is "flight data recording" instead. ;)
8 ↗	(On Diff #142896)	Good point. Yes, need to require Linux for now.
compiler-rt/test/xray/TestCases/Posix/profiling-single-threaded.cc
47 ↗	(On Diff #142896)	Good call, let me do that in the next round.

Harbormaster completed remote builds in B18809: Diff 149413. Jun 1 2018, 2:19 AM

dberris edited the summary of this revision. Jun 1 2018, 2:22 AM

dberris removed a reviewer: echristo.

dberris added a parent revision: D46998: [XRay][compiler-rt] Remove reliance on C++ ABI features.

Rebase after removing ABI reliance and refactoring of RecursionGuard.

fixup: s/__sanitizer:://g + clang-format
fixup: use the common recursion guard
fixup: remove C++ ABI dependency
fixup: do two rounds of profiling

This is now ready for a look @kpw.

I'm going to have to download the renamed files to diff locally. Is there a way to do this in Differential that I'm missing?

compiler-rt/test/xray/TestCases/Posix/profiling-multi-threaded.cc
8 ↗	(On Diff #142896)	I think this still must be done before you submit.

In D44620#1127115, @kpw wrote:

I'm going to have to download the renamed files to diff locally. Is there a way to do this in Differential that I'm missing?

I'm not sure, but it looks like Differential already sees the rename/merge -- is there something else you're looking for?

compiler-rt/test/xray/TestCases/Posix/profiling-multi-threaded.cc
8 ↗	(On Diff #142896)	I looked into this and we already only enable all the tests for Linux from the lit configurations.

kpw accepted this revision. Jun 11 2018, 6:35 PM

This revision is now accepted and ready to land. Jun 11 2018, 6:35 PM

Closed by commit rL334469: [XRay][profiler] Part 4: Profiler Mode Wiring (authored by dberris). Jun 11 2018, 8:34 PM

This revision was automatically updated to reflect the committed changes.

dberris marked an inline comment as done.

Herald added a subscriber: delcypher. Jun 11 2018, 8:34 PM

Revision Contents

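Path	Size
compiler-rt/trunk/lib/xray/CMakeLists.txt	32 lines
compiler-rt/trunk/lib/xray/tests/CMakeLists.txt	8 lines
compiler-rt/trunk/lib/xray/tests/unit/function_call_trie_test.cc	12 lines
compiler-rt/trunk/lib/xray/tests/unit/profile_collector_test.cc	8 lines
compiler-rt/trunk/lib/xray/xray_function_call_trie.h	14 lines
compiler-rt/trunk/lib/xray/xray_profile_collector.cc	8 lines
compiler-rt/trunk/lib/xray/xray_profiler_flags.h	39 lines
compiler-rt/trunk/lib/xray/xray_profiler_flags.cc	40 lines
compiler-rt/trunk/lib/xray/xray_profiler_flags.inc	26 lines
compiler-rt/trunk/lib/xray/xray_profiling.cc	291 lines
compiler-rt/trunk/lib/xray/xray_profiling_flags.h	39 lines
compiler-rt/trunk/lib/xray/xray_profiling_flags.cc	40 lines
compiler-rt/trunk/lib/xray/xray_profiling_flags.inc	26 lines
compiler-rt/trunk/test/xray/TestCases/Posix/c-test.cc	2 lines
compiler-rt/trunk/test/xray/TestCases/Posix/profiling-multi-threaded.cc	53 lines
compiler-rt/trunk/test/xray/TestCases/Posix/profiling-single-threaded.cc	58 lines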

Diff 150886

compiler-rt/trunk/lib/xray/CMakeLists.txt

Show All 12 Lines	set(XRAY_FDR_MODE_SOURCES
xray_fdr_flags.cc		xray_fdr_flags.cc
xray_buffer_queue.cc		xray_buffer_queue.cc
xray_fdr_logging.cc)		xray_fdr_logging.cc)

set(XRAY_BASIC_MODE_SOURCES		set(XRAY_BASIC_MODE_SOURCES
xray_basic_flags.cc		xray_basic_flags.cc
xray_basic_logging.cc)		xray_basic_logging.cc)

set(XRAY_PROFILER_MODE_SOURCES		set(XRAY_PROFILING_MODE_SOURCES
xray_profile_collector.cc		xray_profile_collector.cc
xray_profiler_flags.cc)		xray_profiling.cc
		xray_profiling_flags.cc)

# Implementation files for all XRay architectures.		# Implementation files for all XRay architectures.
set(x86_64_SOURCES		set(x86_64_SOURCES
xray_x86_64.cc		xray_x86_64.cc
xray_trampoline_x86_64.S)		xray_trampoline_x86_64.S)

set(arm_SOURCES		set(arm_SOURCES
xray_arm.cc		xray_arm.cc
▲ Show 20 Lines • Show All 68 Lines • ▼ Show 20 Lines	add_compiler_rt_object_libraries(RTXrayFDR
CFLAGS ${XRAY_CFLAGS}		CFLAGS ${XRAY_CFLAGS}
DEFS ${XRAY_COMMON_DEFINITIONS})		DEFS ${XRAY_COMMON_DEFINITIONS})
add_compiler_rt_object_libraries(RTXrayBASIC		add_compiler_rt_object_libraries(RTXrayBASIC
OS ${XRAY_SUPPORTED_OS}		OS ${XRAY_SUPPORTED_OS}
ARCHS ${XRAY_SUPPORTED_ARCH}		ARCHS ${XRAY_SUPPORTED_ARCH}
SOURCES ${XRAY_BASIC_MODE_SOURCES}		SOURCES ${XRAY_BASIC_MODE_SOURCES}
CFLAGS ${XRAY_CFLAGS}		CFLAGS ${XRAY_CFLAGS}
DEFS ${XRAY_COMMON_DEFINITIONS})		DEFS ${XRAY_COMMON_DEFINITIONS})
add_compiler_rt_object_libraries(RTXrayPROFILER		add_compiler_rt_object_libraries(RTXrayPROFILING
OS ${XRAY_SUPPORTED_OS}		OS ${XRAY_SUPPORTED_OS}
ARCHS ${XRAY_SUPPORTED_ARCH}		ARCHS ${XRAY_SUPPORTED_ARCH}
SOURCES ${XRAY_PROFILER_MODE_SOURCES}		SOURCES ${XRAY_PROFILING_MODE_SOURCES}
CFLAGS ${XRAY_CFLAGS}		CFLAGS ${XRAY_CFLAGS}
DEFS ${XRAY_COMMON_DEFINITIONS})		DEFS ${XRAY_COMMON_DEFINITIONS})

# We only support running on osx for now.		# We only support running on osx for now.
add_compiler_rt_runtime(clang_rt.xray		add_compiler_rt_runtime(clang_rt.xray
STATIC		STATIC
OS ${XRAY_SUPPORTED_OS}		OS ${XRAY_SUPPORTED_OS}
ARCHS ${XRAY_SUPPORTED_ARCH}		ARCHS ${XRAY_SUPPORTED_ARCH}
Show All 20 Lines	add_compiler_rt_runtime(clang_rt.xray-basic
OS ${XRAY_SUPPORTED_OS}		OS ${XRAY_SUPPORTED_OS}
ARCHS ${XRAY_SUPPORTED_ARCH}		ARCHS ${XRAY_SUPPORTED_ARCH}
OBJECT_LIBS RTXrayBASIC		OBJECT_LIBS RTXrayBASIC
CFLAGS ${XRAY_CFLAGS}		CFLAGS ${XRAY_CFLAGS}
DEFS ${XRAY_COMMON_DEFINITIONS}		DEFS ${XRAY_COMMON_DEFINITIONS}
LINK_FLAGS ${SANITIZER_COMMON_LINK_FLAGS} ${WEAK_SYMBOL_LINK_FLAGS}		LINK_FLAGS ${SANITIZER_COMMON_LINK_FLAGS} ${WEAK_SYMBOL_LINK_FLAGS}
LINK_LIBS ${XRAY_LINK_LIBS}		LINK_LIBS ${XRAY_LINK_LIBS}
PARENT_TARGET xray)		PARENT_TARGET xray)
add_compiler_rt_runtime(clang_rt.xray-profiler		add_compiler_rt_runtime(clang_rt.xray-profiling
STATIC		STATIC
OS ${XRAY_SUPPORTED_OS}		OS ${XRAY_SUPPORTED_OS}
ARCHS ${XRAY_SUPPORTED_ARCH}		ARCHS ${XRAY_SUPPORTED_ARCH}
OBJECT_LIBS RTXrayPROFILER		OBJECT_LIBS RTXrayPROFILING
CFLAGS ${XRAY_CFLAGS}		CFLAGS ${XRAY_CFLAGS}
DEFS ${XRAY_COMMON_DEFINITIONS}		DEFS ${XRAY_COMMON_DEFINITIONS}
LINK_FLAGS ${SANITIZER_COMMON_LINK_FLAGS} ${WEAK_SYMBOL_LINK_FLAGS}		LINK_FLAGS ${SANITIZER_COMMON_LINK_FLAGS} ${WEAK_SYMBOL_LINK_FLAGS}
LINK_LIBS ${XRAY_LINK_LIBS}		LINK_LIBS ${XRAY_LINK_LIBS}
PARENT_TARGET xray)		PARENT_TARGET xray)
else() # not Apple		else() # not Apple
foreach(arch ${XRAY_SUPPORTED_ARCH})		foreach(arch ${XRAY_SUPPORTED_ARCH})
if(NOT CAN_TARGET_${arch})		if(NOT CAN_TARGET_${arch})
continue()		continue()
endif()		endif()
add_compiler_rt_object_libraries(RTXray		add_compiler_rt_object_libraries(RTXray
ARCHS ${arch}		ARCHS ${arch}
SOURCES ${XRAY_SOURCES} ${${arch}_SOURCES} CFLAGS ${XRAY_CFLAGS}		SOURCES ${XRAY_SOURCES} ${${arch}_SOURCES} CFLAGS ${XRAY_CFLAGS}
DEFS ${XRAY_COMMON_DEFINITIONS})		DEFS ${XRAY_COMMON_DEFINITIONS})
add_compiler_rt_object_libraries(RTXrayFDR		add_compiler_rt_object_libraries(RTXrayFDR
ARCHS ${arch}		ARCHS ${arch}
SOURCES ${XRAY_FDR_MODE_SOURCES} CFLAGS ${XRAY_CFLAGS}		SOURCES ${XRAY_FDR_MODE_SOURCES} CFLAGS ${XRAY_CFLAGS}
DEFS ${XRAY_COMMON_DEFINITIONS})		DEFS ${XRAY_COMMON_DEFINITIONS})
add_compiler_rt_object_libraries(RTXrayBASIC		add_compiler_rt_object_libraries(RTXrayBASIC
ARCHS ${arch}		ARCHS ${arch}
SOURCES ${XRAY_BASIC_MODE_SOURCES} CFLAGS ${XRAY_CFLAGS}		SOURCES ${XRAY_BASIC_MODE_SOURCES} CFLAGS ${XRAY_CFLAGS}
DEFS ${XRAY_COMMON_DEFINITIONS})		DEFS ${XRAY_COMMON_DEFINITIONS})
add_compiler_rt_object_libraries(RTXrayPROFILER		add_compiler_rt_object_libraries(RTXrayPROFILING
ARCHS ${arch}		ARCHS ${arch}
SOURCES ${XRAY_PROFILER_MODE_SOURCES} CFLAGS ${XRAY_CFLAGS}		SOURCES ${XRAY_PROFILING_MODE_SOURCES} CFLAGS ${XRAY_CFLAGS}
DEFS ${XRAY_COMMON_DEFINITIONS})		DEFS ${XRAY_COMMON_DEFINITIONS})

# Common XRay archive for instrumented binaries.		# Common XRay archive for instrumented binaries.
add_compiler_rt_runtime(clang_rt.xray		add_compiler_rt_runtime(clang_rt.xray
STATIC		STATIC
ARCHS ${arch}		ARCHS ${arch}
CFLAGS ${XRAY_CFLAGS}		CFLAGS ${XRAY_CFLAGS}
DEFS ${XRAY_COMMON_DEFINITIONS}		DEFS ${XRAY_COMMON_DEFINITIONS}
Show All 10 Lines	foreach(arch ${XRAY_SUPPORTED_ARCH})
# Basic mode runtime archive (addon for clang_rt.xray)		# Basic mode runtime archive (addon for clang_rt.xray)
add_compiler_rt_runtime(clang_rt.xray-basic		add_compiler_rt_runtime(clang_rt.xray-basic
STATIC		STATIC
ARCHS ${arch}		ARCHS ${arch}
CFLAGS ${XRAY_CFLAGS}		CFLAGS ${XRAY_CFLAGS}
DEFS ${XRAY_COMMON_DEFINITIONS}		DEFS ${XRAY_COMMON_DEFINITIONS}
OBJECT_LIBS RTXrayBASIC		OBJECT_LIBS RTXrayBASIC
PARENT_TARGET xray)		PARENT_TARGET xray)
add_compiler_rt_runtime(clang_rt.xray-profiler		# Profiler Mode runtime
		add_compiler_rt_runtime(clang_rt.xray-profiling
STATIC		STATIC
ARCHS ${arch}		ARCHS ${arch}
CFLAGS ${XRAY_CFLAGS}		CFLAGS ${XRAY_CFLAGS}
DEFS ${XRAY_COMMON_DEFINITIONS}		DEFS ${XRAY_COMMON_DEFINITIONS}
OBJECT_LIBS RTXrayPROFILER		OBJECT_LIBS RTXrayPROFILING
PARENT_TARGET xray)		PARENT_TARGET xray)
endforeach()		endforeach()
endif() # not Apple		endif() # not Apple

if(COMPILER_RT_INCLUDE_TESTS)		if(COMPILER_RT_INCLUDE_TESTS)
add_subdirectory(tests)		add_subdirectory(tests)
endif()		endif()

compiler-rt/trunk/lib/xray/tests/CMakeLists.txt

Show All 27 Lines	set(XRAY_IMPL_FILES
../../xray_interface.cc		../../xray_interface.cc
../../xray_interface_internal.h		../../xray_interface_internal.h
../../xray_log_interface.cc		../../xray_log_interface.cc
../../xray_mips64.cc		../../xray_mips64.cc
../../xray_mips.cc		../../xray_mips.cc
../../xray_powerpc64.cc		../../xray_powerpc64.cc
../../xray_profile_collector.cc		../../xray_profile_collector.cc
../../xray_profile_collector.h		../../xray_profile_collector.h
../../xray_profiler_flags.cc		../../xray_profiling_flags.cc
../../xray_profiler_flags.h		../../xray_profiling_flags.h
../../xray_recursion_guard.h		../../xray_recursion_guard.h
../../xray_segmented_array.h		../../xray_segmented_array.h
../../xray_trampoline_powerpc64.cc		../../xray_trampoline_powerpc64.cc
../../xray_tsc.h		../../xray_tsc.h
../../xray_utils.cc		../../xray_utils.cc
../../xray_utils.h		../../xray_utils.h
../../xray_x86_64.cc)		../../xray_x86_64.cc)

▲ Show 20 Lines • Show All 55 Lines • ▼ Show 20 Lines	macro(add_xray_unittest testname)
endif()		endif()
endmacro()		endmacro()

if(COMPILER_RT_CAN_EXECUTE_TESTS)		if(COMPILER_RT_CAN_EXECUTE_TESTS)
if (APPLE)		if (APPLE)
add_xray_lib("RTXRay.test.osx"		add_xray_lib("RTXRay.test.osx"
$<TARGET_OBJECTS:RTXray.osx>		$<TARGET_OBJECTS:RTXray.osx>
$<TARGET_OBJECTS:RTXrayFDR.osx>		$<TARGET_OBJECTS:RTXrayFDR.osx>
$<TARGET_OBJECTS:RTXrayPROFILER.osx>		$<TARGET_OBJECTS:RTXrayPROFILING.osx>
$<TARGET_OBJECTS:RTSanitizerCommon.osx>		$<TARGET_OBJECTS:RTSanitizerCommon.osx>
$<TARGET_OBJECTS:RTSanitizerCommonLibc.osx>)		$<TARGET_OBJECTS:RTSanitizerCommonLibc.osx>)
else()		else()
foreach(arch ${XRAY_SUPPORTED_ARCH})		foreach(arch ${XRAY_SUPPORTED_ARCH})
add_xray_lib("RTXRay.test.${arch}"		add_xray_lib("RTXRay.test.${arch}"
$<TARGET_OBJECTS:RTXray.${arch}>		$<TARGET_OBJECTS:RTXray.${arch}>
$<TARGET_OBJECTS:RTXrayFDR.${arch}>		$<TARGET_OBJECTS:RTXrayFDR.${arch}>
$<TARGET_OBJECTS:RTXrayPROFILER.${arch}>		$<TARGET_OBJECTS:RTXrayPROFILING.${arch}>
$<TARGET_OBJECTS:RTSanitizerCommon.${arch}>		$<TARGET_OBJECTS:RTSanitizerCommon.${arch}>
$<TARGET_OBJECTS:RTSanitizerCommonLibc.${arch}>)		$<TARGET_OBJECTS:RTSanitizerCommonLibc.${arch}>)
endforeach()		endforeach()
endif()		endif()
add_subdirectory(unit)		add_subdirectory(unit)
endif()		endif()

compiler-rt/trunk/lib/xray/tests/unit/function_call_trie_test.cc

Show All 21 Lines	TEST(FunctionCallTrieTest, Construction) {
// We want to make sure that we can create one of these without the set of		// We want to make sure that we can create one of these without the set of
// allocators we need. This will by default use the global allocators.		// allocators we need. This will by default use the global allocators.
FunctionCallTrie Trie;		FunctionCallTrie Trie;
}		}

TEST(FunctionCallTrieTest, ConstructWithTLSAllocators) {		TEST(FunctionCallTrieTest, ConstructWithTLSAllocators) {
// FIXME: Support passing in configuration for allocators in the allocator		// FIXME: Support passing in configuration for allocators in the allocator
// constructors.		// constructors.
profilerFlags()->setDefaults();		profilingFlags()->setDefaults();
FunctionCallTrie::Allocators Allocators = FunctionCallTrie::InitAllocators();		FunctionCallTrie::Allocators Allocators = FunctionCallTrie::InitAllocators();
FunctionCallTrie Trie(Allocators);		FunctionCallTrie Trie(Allocators);
}		}

TEST(FunctionCallTrieTest, EnterAndExitFunction) {		TEST(FunctionCallTrieTest, EnterAndExitFunction) {
profilerFlags()->setDefaults();		profilingFlags()->setDefaults();
auto A = FunctionCallTrie::InitAllocators();		auto A = FunctionCallTrie::InitAllocators();
FunctionCallTrie Trie(A);		FunctionCallTrie Trie(A);

Trie.enterFunction(1, 1);		Trie.enterFunction(1, 1);
Trie.exitFunction(1, 2);		Trie.exitFunction(1, 2);

// We need a way to pull the data out. At this point, until we get a data		// We need a way to pull the data out. At this point, until we get a data
// collection service implemented, we're going to export the data as a list of		// collection service implemented, we're going to export the data as a list of
Show All 21 Lines	TEST(FunctionCallTrieTest, MissingFunctionExit) {
FunctionCallTrie Trie(A);		FunctionCallTrie Trie(A);
Trie.enterFunction(1, 1);		Trie.enterFunction(1, 1);
const auto &R = Trie.getRoots();		const auto &R = Trie.getRoots();

ASSERT_TRUE(R.empty());		ASSERT_TRUE(R.empty());
}		}

TEST(FunctionCallTrieTest, MultipleRoots) {		TEST(FunctionCallTrieTest, MultipleRoots) {
profilerFlags()->setDefaults();		profilingFlags()->setDefaults();
auto A = FunctionCallTrie::InitAllocators();		auto A = FunctionCallTrie::InitAllocators();
FunctionCallTrie Trie(A);		FunctionCallTrie Trie(A);

// Enter and exit FId = 1.		// Enter and exit FId = 1.
Trie.enterFunction(1, 1);		Trie.enterFunction(1, 1);
Trie.exitFunction(1, 2);		Trie.exitFunction(1, 2);

// Enter and exit FId = 2.		// Enter and exit FId = 2.
Show All 26 Lines
//		//
// f0@t0 -> f1@t1 -> f2@t2		// f0@t0 -> f1@t1 -> f2@t2
//		//
// If for whatever reason we see an exit for `f2` @ t3, followed by an exit for		// If for whatever reason we see an exit for `f2` @ t3, followed by an exit for
// `f0` @ t4 (i.e. no `f1` exit in between) then we need to handle the case of		// `f0` @ t4 (i.e. no `f1` exit in between) then we need to handle the case of
// accounting local time to `f2` from d = (t3 - t2), then local time to `f1`		// accounting local time to `f2` from d = (t3 - t2), then local time to `f1`
// as d' = (t3 - t1) - d, and then local time to `f0` as d'' = (t3 - t0) - d'.		// as d' = (t3 - t1) - d, and then local time to `f0` as d'' = (t3 - t0) - d'.
TEST(FunctionCallTrieTest, MissingIntermediaryExit) {		TEST(FunctionCallTrieTest, MissingIntermediaryExit) {
profilerFlags()->setDefaults();		profilingFlags()->setDefaults();
auto A = FunctionCallTrie::InitAllocators();		auto A = FunctionCallTrie::InitAllocators();
FunctionCallTrie Trie(A);		FunctionCallTrie Trie(A);

Trie.enterFunction(1, 0);		Trie.enterFunction(1, 0);
Trie.enterFunction(2, 100);		Trie.enterFunction(2, 100);
Trie.enterFunction(3, 200);		Trie.enterFunction(3, 200);
Trie.exitFunction(3, 300);		Trie.exitFunction(3, 300);
Trie.exitFunction(1, 400);		Trie.exitFunction(1, 400);
Show All 25 Lines	TEST(FunctionCallTrieTest, MissingIntermediaryExit) {
EXPECT_EQ(F3.CumulativeLocalTime, 100);		EXPECT_EQ(F3.CumulativeLocalTime, 100);
EXPECT_EQ(F2.CumulativeLocalTime, 300);		EXPECT_EQ(F2.CumulativeLocalTime, 300);
EXPECT_EQ(F1.CumulativeLocalTime, 100);		EXPECT_EQ(F1.CumulativeLocalTime, 100);
}		}

// TODO: Test that we can handle cross-CPU migrations, where TSCs are not		// TODO: Test that we can handle cross-CPU migrations, where TSCs are not
// guaranteed to be synchronised.		// guaranteed to be synchronised.
TEST(FunctionCallTrieTest, DeepCopy) {		TEST(FunctionCallTrieTest, DeepCopy) {
profilerFlags()->setDefaults();		profilingFlags()->setDefaults();
auto A = FunctionCallTrie::InitAllocators();		auto A = FunctionCallTrie::InitAllocators();
FunctionCallTrie Trie(A);		FunctionCallTrie Trie(A);

Trie.enterFunction(1, 0);		Trie.enterFunction(1, 0);
Trie.enterFunction(2, 1);		Trie.enterFunction(2, 1);
Trie.exitFunction(2, 2);		Trie.exitFunction(2, 2);
Trie.enterFunction(3, 3);		Trie.enterFunction(3, 3);
Trie.exitFunction(3, 4);		Trie.exitFunction(3, 4);
Show All 24 Lines	const auto &F1Copy =
.find_element(		.find_element(
[](const FunctionCallTrie::NodeIdPair &R) { return R.FId == 2; })		[](const FunctionCallTrie::NodeIdPair &R) { return R.FId == 2; })
->NodePtr;		->NodePtr;
EXPECT_EQ(&R0Orig, F1Orig.Parent);		EXPECT_EQ(&R0Orig, F1Orig.Parent);
EXPECT_EQ(&R0Copy, F1Copy.Parent);		EXPECT_EQ(&R0Copy, F1Copy.Parent);
}		}

TEST(FunctionCallTrieTest, MergeInto) {		TEST(FunctionCallTrieTest, MergeInto) {
profilerFlags()->setDefaults();		profilingFlags()->setDefaults();
auto A = FunctionCallTrie::InitAllocators();		auto A = FunctionCallTrie::InitAllocators();
FunctionCallTrie T0(A);		FunctionCallTrie T0(A);
FunctionCallTrie T1(A);		FunctionCallTrie T1(A);

// 1 -> 2 -> 3		// 1 -> 2 -> 3
T0.enterFunction(1, 0);		T0.enterFunction(1, 0);
T0.enterFunction(2, 1);		T0.enterFunction(2, 1);
T0.enterFunction(3, 2);		T0.enterFunction(3, 2);
▲ Show 20 Lines • Show All 45 Lines • Show Last 20 Lines
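
Note on the MissingIntermediaryExit test above: the expected values (F3 = 100, F2 = 300, F1 = 100) follow from unwinding the shadow stack when an exit arrives for a function that is not on top. The sketch below is only an illustration that reproduces those numbers. It is not the FunctionCallTrie implementation from this patch; `Frame`, `LocalTimeSketch` and `runMissingIntermediaryExit` are hypothetical names, and it assumes each unwound frame is charged its elapsed time minus the elapsed time of the frame popped just before it in the same exit call.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for a shadow stack entry; not the type from the patch.
struct Frame {
  int32_t FId;
  uint64_t EntryTSC;
};

struct LocalTimeSketch {
  std::vector<Frame> ShadowStack;
  std::map<int32_t, uint64_t> CumulativeLocalTime;

  void enterFunction(int32_t FId, uint64_t TSC) {
    ShadowStack.push_back({FId, TSC});
  }

  // Pop frames until the one matching FId has been popped. Frames that never
  // saw their own exit are closed at the same TSC as the matching frame.
  void exitFunction(int32_t FId, uint64_t TSC) {
    uint64_t ChildElapsed = 0; // elapsed time of the frame popped just before
    while (!ShadowStack.empty()) {
      Frame Top = ShadowStack.back();
      ShadowStack.pop_back();
      assert(TSC >= Top.EntryTSC && "sketch assumes a monotonic TSC");
      uint64_t Elapsed = TSC - Top.EntryTSC;
      CumulativeLocalTime[Top.FId] += Elapsed - ChildElapsed;
      ChildElapsed = Elapsed;
      if (Top.FId == FId)
        break;
    }
  }
};

// Replays the call sequence used by the MissingIntermediaryExit test.
inline LocalTimeSketch runMissingIntermediaryExit() {
  LocalTimeSketch S;
  S.enterFunction(1, 0);
  S.enterFunction(2, 100);
  S.enterFunction(3, 200);
  S.exitFunction(3, 300);
  S.exitFunction(1, 400); // no exit was recorded for function 2
  return S;
}
```

The sketch only subtracts time for frames unwound in the same exit call; children that exited normally earlier are not tracked here.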

compiler-rt/trunk/lib/xray/tests/unit/profile_collector_test.cc

//===-- profile_collector_test.cc -----------------------------------------===//		//===-- profile_collector_test.cc -----------------------------------------===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This file is a part of XRay, a function call tracing system.		// This file is a part of XRay, a function call tracing system.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
#include "gtest/gtest.h"		#include "gtest/gtest.h"

#include "xray_profile_collector.h"		#include "xray_profile_collector.h"
#include "xray_profiler_flags.h"		#include "xray_profiling_flags.h"
#include <cstdint>		#include <cstdint>
#include <thread>		#include <thread>
#include <utility>		#include <utility>
#include <vector>		#include <vector>

namespace __xray {		namespace __xray {
namespace {		namespace {

static constexpr auto kHeaderSize = 16u;		static constexpr auto kHeaderSize = 16u;

void ValidateBlock(XRayBuffer B) {		void ValidateBlock(XRayBuffer B) {
profilerFlags()->setDefaults();		profilingFlags()->setDefaults();
ASSERT_NE(static_cast<const void *>(B.Data), nullptr);		ASSERT_NE(static_cast<const void *>(B.Data), nullptr);
ASSERT_NE(B.Size, 0u);		ASSERT_NE(B.Size, 0u);
ASSERT_GE(B.Size, kHeaderSize);		ASSERT_GE(B.Size, kHeaderSize);
// We look at the block size, the block number, and the thread ID to ensure		// We look at the block size, the block number, and the thread ID to ensure
// that none of them are zero (or that the header data is laid out as we		// that none of them are zero (or that the header data is laid out as we
// expect).		// expect).
char LocalBuffer[kHeaderSize] = {};		char LocalBuffer[kHeaderSize] = {};
internal_memcpy(LocalBuffer, B.Data, kHeaderSize);		internal_memcpy(LocalBuffer, B.Data, kHeaderSize);
▲ Show 20 Lines • Show All 42 Lines • ▼ Show 20 Lines	std::tuple<Profile, const char *> ParseProfile(const char *P) {

// Then read the CumulativeLocalTime.		// Then read the CumulativeLocalTime.
internal_memcpy(&Result.CumulativeLocalTime, P, sizeof(int64_t));		internal_memcpy(&Result.CumulativeLocalTime, P, sizeof(int64_t));
P += sizeof(int64_t);		P += sizeof(int64_t);
return std::make_tuple(std::move(Result), P);		return std::make_tuple(std::move(Result), P);
}		}

TEST(profileCollectorServiceTest, PostSerializeCollect) {		TEST(profileCollectorServiceTest, PostSerializeCollect) {
profilerFlags()->setDefaults();		profilingFlags()->setDefaults();
// The most basic use-case (the one we actually only care about) is the one		// The most basic use-case (the one we actually only care about) is the one
// where we ensure that we can post FunctionCallTrie instances, which are then		// where we ensure that we can post FunctionCallTrie instances, which are then
// destroyed but serialized properly.		// destroyed but serialized properly.
//		//
// First, we initialise a set of allocators in the local scope. This ensures		// First, we initialise a set of allocators in the local scope. This ensures
// that we're able to copy the contents of the FunctionCallTrie that uses		// that we're able to copy the contents of the FunctionCallTrie that uses
// the local allocators.		// the local allocators.
auto Allocators = FunctionCallTrie::InitAllocators();		auto Allocators = FunctionCallTrie::InitAllocators();
▲ Show 20 Lines • Show All 56 Lines • ▼ Show 20 Lines	void threadProcessing() {
T.enterFunction(2, 2);		T.enterFunction(2, 2);
T.exitFunction(2, 3);		T.exitFunction(2, 3);
T.exitFunction(1, 4);		T.exitFunction(1, 4);

profileCollectorService::post(T, GetTid());		profileCollectorService::post(T, GetTid());
}		}

TEST(profileCollectorServiceTest, PostSerializeCollectMultipleThread) {		TEST(profileCollectorServiceTest, PostSerializeCollectMultipleThread) {
profilerFlags()->setDefaults();		profilingFlags()->setDefaults();
std::thread t1(threadProcessing);		std::thread t1(threadProcessing);
std::thread t2(threadProcessing);		std::thread t2(threadProcessing);

t1.join();		t1.join();
t2.join();		t2.join();

// At this point, t1 and t2 are already done with what they were doing.		// At this point, t1 and t2 are already done with what they were doing.
profileCollectorService::serialize();		profileCollectorService::serialize();
Show All 11 Lines

compiler-rt/trunk/lib/xray/xray_function_call_trie.h

Show All 9 Lines
// This file is a part of XRay, a dynamic runtime instrumentation system.		// This file is a part of XRay, a dynamic runtime instrumentation system.
//		//
// This file defines the interface for a function call trie.		// This file defines the interface for a function call trie.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
#ifndef XRAY_FUNCTION_CALL_TRIE_H		#ifndef XRAY_FUNCTION_CALL_TRIE_H
#define XRAY_FUNCTION_CALL_TRIE_H		#define XRAY_FUNCTION_CALL_TRIE_H

#include "xray_profiler_flags.h"		#include "xray_profiling_flags.h"
#include "xray_segmented_array.h"		#include "xray_segmented_array.h"
#include <utility>		#include <utility>
#include <memory> // For placement new.		#include <memory> // For placement new.

namespace __xray {		namespace __xray {

/// A FunctionCallTrie represents the stack traces of XRay instrumented		/// A FunctionCallTrie represents the stack traces of XRay instrumented
/// functions that we've encountered, where a node corresponds to a function and		/// functions that we've encountered, where a node corresponds to a function and
▲ Show 20 Lines • Show All 191 Lines • ▼ Show 20 Lines	public:
};		};

// TODO: Support configuration of options through the arguments.		// TODO: Support configuration of options through the arguments.
static Allocators InitAllocators() {		static Allocators InitAllocators() {
Allocators A;		Allocators A;
auto NodeAllocator = reinterpret_cast<Allocators::NodeAllocatorType *>(		auto NodeAllocator = reinterpret_cast<Allocators::NodeAllocatorType *>(
InternalAlloc(sizeof(Allocators::NodeAllocatorType)));		InternalAlloc(sizeof(Allocators::NodeAllocatorType)));
new (NodeAllocator) Allocators::NodeAllocatorType(		new (NodeAllocator) Allocators::NodeAllocatorType(
profilerFlags()->per_thread_allocator_max, 0);		profilingFlags()->per_thread_allocator_max, 0);
A.NodeAllocator = NodeAllocator;		A.NodeAllocator = NodeAllocator;

auto RootAllocator = reinterpret_cast<Allocators::RootAllocatorType *>(		auto RootAllocator = reinterpret_cast<Allocators::RootAllocatorType *>(
InternalAlloc(sizeof(Allocators::RootAllocatorType)));		InternalAlloc(sizeof(Allocators::RootAllocatorType)));
new (RootAllocator) Allocators::RootAllocatorType(		new (RootAllocator) Allocators::RootAllocatorType(
profilerFlags()->per_thread_allocator_max, 0);		profilingFlags()->per_thread_allocator_max, 0);
A.RootAllocator = RootAllocator;		A.RootAllocator = RootAllocator;

auto ShadowStackAllocator =		auto ShadowStackAllocator =
reinterpret_cast<Allocators::ShadowStackAllocatorType *>(		reinterpret_cast<Allocators::ShadowStackAllocatorType *>(
InternalAlloc(sizeof(Allocators::ShadowStackAllocatorType)));		InternalAlloc(sizeof(Allocators::ShadowStackAllocatorType)));
new (ShadowStackAllocator) Allocators::ShadowStackAllocatorType(		new (ShadowStackAllocator) Allocators::ShadowStackAllocatorType(
profilerFlags()->per_thread_allocator_max, 0);		profilingFlags()->per_thread_allocator_max, 0);
A.ShadowStackAllocator = ShadowStackAllocator;		A.ShadowStackAllocator = ShadowStackAllocator;

auto NodeIdPairAllocator = reinterpret_cast<NodeIdPairAllocatorType *>(		auto NodeIdPairAllocator = reinterpret_cast<NodeIdPairAllocatorType *>(
InternalAlloc(sizeof(NodeIdPairAllocatorType)));		InternalAlloc(sizeof(NodeIdPairAllocatorType)));
new (NodeIdPairAllocator)		new (NodeIdPairAllocator)
NodeIdPairAllocatorType(profilerFlags()->per_thread_allocator_max, 0);		NodeIdPairAllocatorType(profilingFlags()->per_thread_allocator_max, 0);
A.NodeIdPairAllocator = NodeIdPairAllocator;		A.NodeIdPairAllocator = NodeIdPairAllocator;
return A;		return A;
}		}

private:		private:
NodeArray Nodes;		NodeArray Nodes;
RootArray Roots;		RootArray Roots;
ShadowStackArray ShadowStack;		ShadowStackArray ShadowStack;
▲ Show 20 Lines • Show All 101 Lines • ▼ Show 20 Lines	for (const auto Root : getRoots()) {
// nodes we push in as we're traversing depth-first down the call tree.		// nodes we push in as we're traversing depth-first down the call tree.
struct NodeAndParent {		struct NodeAndParent {
FunctionCallTrie::Node *Node;		FunctionCallTrie::Node *Node;
FunctionCallTrie::Node *NewNode;		FunctionCallTrie::Node *NewNode;
};		};
using Stack = Array<NodeAndParent>;		using Stack = Array<NodeAndParent>;

typename Stack::AllocatorType StackAllocator(		typename Stack::AllocatorType StackAllocator(
profilerFlags()->stack_allocator_max, 0);		profilingFlags()->stack_allocator_max, 0);
Stack DFSStack(StackAllocator);		Stack DFSStack(StackAllocator);

// TODO: Figure out what to do if we fail to allocate any more stack		// TODO: Figure out what to do if we fail to allocate any more stack
// space. Maybe warn or report once?		// space. Maybe warn or report once?
DFSStack.Append(NodeAndParent{Root, NewRoot});		DFSStack.Append(NodeAndParent{Root, NewRoot});
while (!DFSStack.empty()) {		while (!DFSStack.empty()) {
NodeAndParent NP = DFSStack.back();		NodeAndParent NP = DFSStack.back();
DCHECK_NE(NP.Node, nullptr);		DCHECK_NE(NP.Node, nullptr);
Show All 21 Lines	public:
// synchronisation of both "this" and |O|.		// synchronisation of both "this" and |O|.
void mergeInto(FunctionCallTrie &O) const {		void mergeInto(FunctionCallTrie &O) const {
struct NodeAndTarget {		struct NodeAndTarget {
FunctionCallTrie::Node *OrigNode;		FunctionCallTrie::Node *OrigNode;
FunctionCallTrie::Node *TargetNode;		FunctionCallTrie::Node *TargetNode;
};		};
using Stack = Array<NodeAndTarget>;		using Stack = Array<NodeAndTarget>;
typename Stack::AllocatorType StackAllocator(		typename Stack::AllocatorType StackAllocator(
profilerFlags()->stack_allocator_max, 0);		profilingFlags()->stack_allocator_max, 0);
Stack DFSStack(StackAllocator);		Stack DFSStack(StackAllocator);

for (const auto Root : getRoots()) {		for (const auto Root : getRoots()) {
Node *TargetRoot = nullptr;		Node *TargetRoot = nullptr;
auto R = O.Roots.find_element(		auto R = O.Roots.find_element(
[&](const Node *Node) { return Node->FId == Root->FId; });		[&](const Node *Node) { return Node->FId == Root->FId; });
if (R == nullptr) {		if (R == nullptr) {
TargetRoot = O.Nodes.AppendEmplace(nullptr, *O.NodeIdPairAllocator, 0,		TargetRoot = O.Nodes.AppendEmplace(nullptr, *O.NodeIdPairAllocator, 0,
Show All 36 Lines

compiler-rt/trunk/lib/xray/xray_profile_collector.cc

Show All 9 Lines
// This file is a part of XRay, a dynamic runtime instrumentation system.		// This file is a part of XRay, a dynamic runtime instrumentation system.
//		//
// This implements the interface for the profileCollectorService.		// This implements the interface for the profileCollectorService.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
#include "xray_profile_collector.h"		#include "xray_profile_collector.h"
#include "sanitizer_common/sanitizer_common.h"		#include "sanitizer_common/sanitizer_common.h"
#include "sanitizer_common/sanitizer_vector.h"		#include "sanitizer_common/sanitizer_vector.h"
#include "xray_profiler_flags.h"		#include "xray_profiling_flags.h"
#include <pthread.h>		#include <pthread.h>
#include <memory>		#include <memory>
#include <utility>		#include <utility>

namespace __xray {		namespace __xray {
namespace profileCollectorService {		namespace profileCollectorService {

namespace {		namespace {
▲ Show 20 Lines • Show All 96 Lines • ▼ Show 20 Lines

// Walk a depth-first traversal of each root of the FunctionCallTrie to generate		// Walk a depth-first traversal of each root of the FunctionCallTrie to generate
// the path(s) and the data associated with the path.		// the path(s) and the data associated with the path.
static void populateRecords(ProfileRecordArray &PRs,		static void populateRecords(ProfileRecordArray &PRs,
ProfileRecord::PathAllocator &PA,		ProfileRecord::PathAllocator &PA,
const FunctionCallTrie &Trie) {		const FunctionCallTrie &Trie) {
using StackArray = Array<const FunctionCallTrie::Node *>;		using StackArray = Array<const FunctionCallTrie::Node *>;
using StackAllocator = typename StackArray::AllocatorType;		using StackAllocator = typename StackArray::AllocatorType;
StackAllocator StackAlloc(profilerFlags()->stack_allocator_max, 0);		StackAllocator StackAlloc(profilingFlags()->stack_allocator_max, 0);
StackArray DFSStack(StackAlloc);		StackArray DFSStack(StackAlloc);
for (const auto R : Trie.getRoots()) {		for (const auto R : Trie.getRoots()) {
DFSStack.Append(R);		DFSStack.Append(R);
while (!DFSStack.empty()) {		while (!DFSStack.empty()) {
auto Node = DFSStack.back();		auto Node = DFSStack.back();
DFSStack.trim(1);		DFSStack.trim(1);
auto Record = PRs.AppendEmplace(PA, Node);		auto Record = PRs.AppendEmplace(PA, Node);
DCHECK_NE(Record, nullptr);		DCHECK_NE(Record, nullptr);
▲ Show 20 Lines • Show All 53 Lines • ▼ Show 20 Lines	void serialize() {
ProfileBuffers.Reset();		ProfileBuffers.Reset();

if (ThreadTries.Size() == 0)		if (ThreadTries.Size() == 0)
return;		return;

// Then repopulate the global ProfileBuffers.		// Then repopulate the global ProfileBuffers.
for (u32 I = 0; I < ThreadTries.Size(); ++I) {		for (u32 I = 0; I < ThreadTries.Size(); ++I) {
using ProfileRecordAllocator = typename ProfileRecordArray::AllocatorType;		using ProfileRecordAllocator = typename ProfileRecordArray::AllocatorType;
ProfileRecordAllocator PRAlloc(profilerFlags()->global_allocator_max, 0);		ProfileRecordAllocator PRAlloc(profilingFlags()->global_allocator_max, 0);
ProfileRecord::PathAllocator PathAlloc(		ProfileRecord::PathAllocator PathAlloc(
profilerFlags()->global_allocator_max, 0);		profilingFlags()->global_allocator_max, 0);
ProfileRecordArray ProfileRecords(PRAlloc);		ProfileRecordArray ProfileRecords(PRAlloc);

// First, we want to compute the amount of space we're going to need. We'll		// First, we want to compute the amount of space we're going to need. We'll
// use a local allocator and an __xray::Array<...> to store the intermediary		// use a local allocator and an __xray::Array<...> to store the intermediary
// data, then compute the size as we're going along. Then we'll allocate the		// data, then compute the size as we're going along. Then we'll allocate the
// contiguous space to contain the thread buffer data.		// contiguous space to contain the thread buffer data.
const auto &Trie = *ThreadTries[I].Trie;		const auto &Trie = *ThreadTries[I].Trie;
if (Trie.getRoots().empty())		if (Trie.getRoots().empty())
▲ Show 20 Lines • Show All 75 Lines • Show Last 20 Lines

compiler-rt/trunk/lib/xray/xray_profiler_flags.h

	//===-- xray_profiler_flags.h ----------------------------------- C++ --===//
	//
	// The LLVM Compiler Infrastructure
	//
	// This file is distributed under the University of Illinois Open Source
	// License. See LICENSE.TXT for details.
	//
	//===----------------------------------------------------------------------===//
	//
	// This file is a part of XRay, a dynamic runtime instrumentation system.
	//
	// XRay profiler runtime flags.
	//===----------------------------------------------------------------------===//

	#ifndef XRAY_PROFILER_FLAGS_H
	#define XRAY_PROFILER_FLAGS_H

	#include "sanitizer_common/sanitizer_flag_parser.h"
	#include "sanitizer_common/sanitizer_internal_defs.h"

	namespace __xray {

	struct ProfilerFlags {
	#define XRAY_FLAG(Type, Name, DefaultValue, Description) Type Name;
	#include "xray_profiler_flags.inc"
	#undef XRAY_FLAG

	void setDefaults();
	};

	extern ProfilerFlags xray_profiler_flags_dont_use_directly;
	inline ProfilerFlags *profilerFlags() {
	return &xray_profiler_flags_dont_use_directly;
	}
	void registerProfilerFlags(FlagParser *P, ProfilerFlags *F);

	} // namespace __xray

	#endif // XRAY_PROFILER_FLAGS_H

compiler-rt/trunk/lib/xray/xray_profiler_flags.cc

	//===-- xray_flags.h -------------------------------------------- C++ --===//
	//
	// The LLVM Compiler Infrastructure
	//
	// This file is distributed under the University of Illinois Open Source
	// License. See LICENSE.TXT for details.
	//
	//===----------------------------------------------------------------------===//
	//
	// This file is a part of XRay, a dynamic runtime instrumentation system.
	//
	// XRay runtime flags.
	//===----------------------------------------------------------------------===//

	#include "xray_profiler_flags.h"
	#include "sanitizer_common/sanitizer_common.h"
	#include "sanitizer_common/sanitizer_flag_parser.h"
	#include "sanitizer_common/sanitizer_libc.h"
	#include "xray_defs.h"

	namespace __xray {

	// Storage for the profiler flags.
	ProfilerFlags xray_profiler_flags_dont_use_directly;

	void ProfilerFlags::setDefaults() XRAY_NEVER_INSTRUMENT {
	#define XRAY_FLAG(Type, Name, DefaultValue, Description) Name = DefaultValue;
	#include "xray_profiler_flags.inc"
	#undef XRAY_FLAG
	}

	void registerProfilerFlags(FlagParser *P,
	ProfilerFlags *F) XRAY_NEVER_INSTRUMENT {
	#define XRAY_FLAG(Type, Name, DefaultValue, Description) \
	RegisterFlag(P, #Name, Description, &F->Name);
	#include "xray_profiler_flags.inc"
	#undef XRAY_FLAG
	}

	} // namespace __xray
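
Note on the XRAY_FLAG pattern used by these flag files: one list of flags is expanded several times under different macro definitions, once to declare the struct members, once to assign defaults, and once to register names and descriptions. Below is a minimal self-contained sketch of that idea. It is an assumption-laden illustration: it uses a single list macro (`DEMO_FLAGS`, hypothetical) instead of a separate `.inc` file, `std::uintptr_t` in place of the sanitizer `uptr` type, and a hypothetical `describeFlag` helper instead of the real flag parser. The flag names and defaults are copied from the `.inc` file in this diff.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical flag list; the patch keeps the equivalent in a .inc file.
#define DEMO_FLAGS(FLAG)                                                       \
  FLAG(std::uintptr_t, per_thread_allocator_max, 2 << 20,                      \
       "Maximum size of any single per-thread allocator.")                     \
  FLAG(std::uintptr_t, global_allocator_max, 2 << 24,                          \
       "Maximum size of the global allocator for profile storage.")            \
  FLAG(int, grace_period_ms, 100, "Grace period in milliseconds.")

struct DemoFlags {
  // First expansion: one data member per flag.
#define DECLARE_FIELD(Type, Name, DefaultValue, Description) Type Name;
  DEMO_FLAGS(DECLARE_FIELD)
#undef DECLARE_FIELD

  // Second expansion: one assignment per flag.
  void setDefaults() {
#define ASSIGN_DEFAULT(Type, Name, DefaultValue, Description)                  \
  Name = DefaultValue;
    DEMO_FLAGS(ASSIGN_DEFAULT)
#undef ASSIGN_DEFAULT
  }
};

// Third expansion: look up a flag's description by name, or return nullptr.
inline const char *describeFlag(const char *Name) {
#define MATCH_NAME(Type, FlagName, DefaultValue, Description)                  \
  if (std::strcmp(Name, #FlagName) == 0)                                       \
    return Description;
  DEMO_FLAGS(MATCH_NAME)
#undef MATCH_NAME
  return nullptr;
}

// Checks the defaults: 2 << 20 is 2 MiB and 2 << 24 is 32 MiB.
inline void checkDefaults() {
  DemoFlags F;
  F.setDefaults();
  assert(F.per_thread_allocator_max == 2097152u);
  assert(F.global_allocator_max == 33554432u);
  assert(F.grace_period_ms == 100);
}
```

Adding a flag then means touching only the list; the struct, the defaults and the lookup pick it up on the next build.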

compiler-rt/trunk/lib/xray/xray_profiler_flags.inc

	//===-- xray_flags.inc ------------------------------------------- C++ --===//
	//
	// The LLVM Compiler Infrastructure
	//
	// This file is distributed under the University of Illinois Open Source
	// License. See LICENSE.TXT for details.
	//
	//===----------------------------------------------------------------------===//
	//
	// XRay profiling runtime flags.
	//
	//===----------------------------------------------------------------------===//
	#ifndef XRAY_FLAG
	#error "Define XRAY_FLAG prior to including this file!"
	#endif

	XRAY_FLAG(uptr, per_thread_allocator_max, 2 << 20,
	"Maximum size of any single per-thread allocator.")
	XRAY_FLAG(uptr, global_allocator_max, 2 << 24,
	"Maximum size of the global allocator for profile storage.")
	XRAY_FLAG(uptr, stack_allocator_max, 2 << 24,
	"Maximum size of the traversal stack allocator.")
	XRAY_FLAG(int, grace_period_ms, 100,
	"Profile collection will wait this much time in milliseconds before "
	"resetting the global state. This gives a chance to threads to "
	"notice that the profiler has been finalized and clean up.")

compiler-rt/trunk/lib/xray/xray_profiling.cc

				//===-- xray_profiling.cc ---------------------------------------- C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file is a part of XRay, a dynamic runtime instrumentation system.
				//
				// This is the implementation of a profiling handler.
				//
				//===----------------------------------------------------------------------===//
				#include <memory>

				#include "sanitizer_common/sanitizer_atomic.h"
				#include "sanitizer_common/sanitizer_flags.h"
				#include "xray/xray_interface.h"
				#include "xray/xray_log_interface.h"

				#include "xray_flags.h"
				#include "xray_profile_collector.h"
				#include "xray_profiling_flags.h"
				#include "xray_recursion_guard.h"
				#include "xray_tsc.h"
				#include "xray_utils.h"
				#include <pthread.h>

				namespace __xray {

				namespace {

				atomic_sint32_t ProfilerLogFlushStatus = {
				XRayLogFlushStatus::XRAY_LOG_NOT_FLUSHING};

				atomic_sint32_t ProfilerLogStatus = {XRayLogInitStatus::XRAY_LOG_UNINITIALIZED};

				SpinMutex ProfilerOptionsMutex;

				struct alignas(64) ProfilingData {
				FunctionCallTrie::Allocators *Allocators = nullptr;
				FunctionCallTrie *FCT = nullptr;
				};

				static pthread_key_t ProfilingKey;

				ProfilingData &getThreadLocalData() XRAY_NEVER_INSTRUMENT {
				thread_local std::aligned_storage<sizeof(ProfilingData)>::type ThreadStorage;
				if (pthread_getspecific(ProfilingKey) == NULL) {
				new (&ThreadStorage) ProfilingData{};
				pthread_setspecific(ProfilingKey, &ThreadStorage);
				}

				auto &TLD = *reinterpret_cast<ProfilingData *>(&ThreadStorage);

				// We need to check whether the global flag to finalizing/finalized has been
				// switched. If it is, then we ought to not actually initialise the data.
				auto Status = atomic_load(&ProfilerLogStatus, memory_order_acquire);
				if (Status == XRayLogInitStatus::XRAY_LOG_FINALIZING ||
				Status == XRayLogInitStatus::XRAY_LOG_FINALIZED)
				return TLD;

				// If we're live, then we (re-)initialize TLD if the pointers are null.
				if (UNLIKELY(TLD.Allocators == nullptr && TLD.FCT == nullptr)) {
				TLD.Allocators = reinterpret_cast<FunctionCallTrie::Allocators *>(
				InternalAlloc(sizeof(FunctionCallTrie::Allocators)));
				new (TLD.Allocators) FunctionCallTrie::Allocators();
				*TLD.Allocators = FunctionCallTrie::InitAllocators();
				TLD.FCT = reinterpret_cast<FunctionCallTrie *>(
				InternalAlloc(sizeof(FunctionCallTrie)));
				new (TLD.FCT) FunctionCallTrie(*TLD.Allocators);
				}

				return TLD;
				}

				} // namespace

				const char *profilingCompilerDefinedFlags() XRAY_NEVER_INSTRUMENT {
				#ifdef XRAY_PROFILER_DEFAULT_OPTIONS
				return SANITIZER_STRINGIFY(XRAY_PROFILER_DEFAULT_OPTIONS);
				#else
				return "";
				#endif
				}

				atomic_sint32_t ProfileFlushStatus = {
				XRayLogFlushStatus::XRAY_LOG_NOT_FLUSHING};

				XRayLogFlushStatus profilingFlush() XRAY_NEVER_INSTRUMENT {
				// When flushing, all we really do is reset the global state, and only when
				// the log has already been finalized.
				if (atomic_load(&ProfilerLogStatus, memory_order_acquire) !=
				XRayLogInitStatus::XRAY_LOG_FINALIZED) {
				if (Verbosity())
				Report("Not flushing profiles, profiling not been finalized.\n");
				return XRayLogFlushStatus::XRAY_LOG_NOT_FLUSHING;
				}

				s32 Result = XRayLogFlushStatus::XRAY_LOG_NOT_FLUSHING;
				if (!atomic_compare_exchange_strong(&ProfilerLogFlushStatus, &Result,
				XRayLogFlushStatus::XRAY_LOG_FLUSHING,
				memory_order_acq_rel)) {
				if (Verbosity())
				Report("Not flushing profiles, implementation still finalizing.\n");
				}

				profileCollectorService::reset();

				atomic_store(&ProfilerLogStatus, XRayLogFlushStatus::XRAY_LOG_FLUSHED,
				memory_order_release);

				return XRayLogFlushStatus::XRAY_LOG_FLUSHED;
				}

				namespace {

				thread_local atomic_uint8_t ReentranceGuard{0};

				void postCurrentThreadFCT(ProfilingData &TLD) {
				if (TLD.Allocators == nullptr || TLD.FCT == nullptr)
				return;

				profileCollectorService::post(*TLD.FCT, GetTid());
				TLD.FCT->~FunctionCallTrie();
				TLD.Allocators->~Allocators();
				InternalFree(TLD.FCT);
				InternalFree(TLD.Allocators);
				TLD.FCT = nullptr;
				TLD.Allocators = nullptr;
				}

				} // namespace

				void profilingHandleArg0(int32_t FuncId,
				XRayEntryType Entry) XRAY_NEVER_INSTRUMENT {
				unsigned char CPU;
				auto TSC = readTSC(CPU);
				RecursionGuard G(ReentranceGuard);
				if (!G)
				return;

				auto Status = atomic_load(&ProfilerLogStatus, memory_order_acquire);
				auto &TLD = getThreadLocalData();
				if (UNLIKELY(Status == XRayLogInitStatus::XRAY_LOG_FINALIZED ||
				Status == XRayLogInitStatus::XRAY_LOG_FINALIZING)) {
				postCurrentThreadFCT(TLD);
				return;
				}

				switch (Entry) {
				case XRayEntryType::ENTRY:
				case XRayEntryType::LOG_ARGS_ENTRY:
				TLD.FCT->enterFunction(FuncId, TSC);
				break;
				case XRayEntryType::EXIT:
				case XRayEntryType::TAIL:
				TLD.FCT->exitFunction(FuncId, TSC);
				break;
				default:
				// FIXME: Handle bugs.
				break;
				}
				}

				void profilingHandleArg1(int32_t FuncId, XRayEntryType Entry,
				uint64_t) XRAY_NEVER_INSTRUMENT {
				return profilingHandleArg0(FuncId, Entry);
				}
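
Note on the guard at the top of profilingHandleArg0: the handler bails out early if it is re-entered on the same thread, for example when something it calls is itself instrumented. The following is a minimal sketch of such an RAII guard. It is not the `RecursionGuard` from `xray_recursion_guard.h`; `ReentryGuard`, `InHandler` and `handler` are hypothetical names, and a plain `thread_local bool` is assumed in place of the atomic used in the patch.

```cpp
#include <cassert>

// Hypothetical RAII guard: sets the flag on construction if it was clear and
// clears it again on destruction, but only if this instance set it.
class ReentryGuard {
  bool &Flag;
  bool Acquired;

public:
  explicit ReentryGuard(bool &F) : Flag(F), Acquired(!F) {
    if (Acquired)
      Flag = true;
  }
  ~ReentryGuard() {
    if (Acquired)
      Flag = false;
  }
  ReentryGuard(const ReentryGuard &) = delete;
  ReentryGuard &operator=(const ReentryGuard &) = delete;

  // True when this instance owns the flag, i.e. we are not re-entering.
  explicit operator bool() const { return Acquired; }
};

thread_local bool InHandler = false;

// Stand-in for a handler that calls itself again. Returns the depth at which
// the guard refused entry.
inline int handler(int Depth) {
  ReentryGuard G(InHandler);
  if (!G)
    return Depth; // re-entered on this thread: do nothing
  return handler(Depth + 1);
}

inline void checkGuard() {
  assert(handler(0) == 1); // the nested call is rejected immediately
  assert(!InHandler);      // the flag is released once the outer call returns
}
```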

				XRayLogInitStatus profilingFinalize() XRAY_NEVER_INSTRUMENT {
				s32 CurrentStatus = XRayLogInitStatus::XRAY_LOG_INITIALIZED;
				if (!atomic_compare_exchange_strong(&ProfilerLogStatus, &CurrentStatus,
				XRayLogInitStatus::XRAY_LOG_FINALIZING,
				memory_order_release)) {
				if (Verbosity())
				Report("Cannot finalize profile, the profiling is not initialized.\n");
				return static_cast<XRayLogInitStatus>(CurrentStatus);
				}

				// Wait a grace period to allow threads to see that we're finalizing.
				SleepForMillis(profilingFlags()->grace_period_ms);

				// We also want to make sure that the current thread's data is cleaned up,
				// if we have any.
				auto &TLD = getThreadLocalData();
				postCurrentThreadFCT(TLD);

				// Then we force serialize the log data.
				profileCollectorService::serialize();

				atomic_store(&ProfilerLogStatus, XRayLogInitStatus::XRAY_LOG_FINALIZED,
				memory_order_release);
				return XRayLogInitStatus::XRAY_LOG_FINALIZED;
				}

				XRayLogInitStatus
				profilingLoggingInit(size_t BufferSize, size_t BufferMax, void *Options,
				size_t OptionsSize) XRAY_NEVER_INSTRUMENT {
				if (BufferSize != 0 || BufferMax != 0) {
				if (Verbosity())
				Report("__xray_log_init() being used, and is unsupported. Use "
				"__xray_log_init_mode(...) instead. Bailing out.");
				return XRayLogInitStatus::XRAY_LOG_UNINITIALIZED;
				}

				s32 CurrentStatus = XRayLogInitStatus::XRAY_LOG_UNINITIALIZED;
				if (!atomic_compare_exchange_strong(&ProfilerLogStatus, &CurrentStatus,
				XRayLogInitStatus::XRAY_LOG_INITIALIZING,
				memory_order_release)) {
				if (Verbosity())
				Report("Cannot initialize already initialised profiling "
				"implementation.\n");
				return static_cast<XRayLogInitStatus>(CurrentStatus);
				}

				{
				SpinMutexLock Lock(&ProfilerOptionsMutex);
				FlagParser ConfigParser;
				auto *F = profilingFlags();
				F->setDefaults();
				registerProfilerFlags(&ConfigParser, F);
				const char *ProfilerCompileFlags = profilingCompilerDefinedFlags();
				ConfigParser.ParseString(ProfilerCompileFlags);
				ConfigParser.ParseString(static_cast<const char *>(Options));
				if (Verbosity())
				ReportUnrecognizedFlags();
				}

				// We need to reset the profile data collection implementation now.
				profileCollectorService::reset();

				// We need to set up the at-thread-exit handler.
				static pthread_once_t Once = PTHREAD_ONCE_INIT;
				pthread_once(&Once, +[] {
				pthread_key_create(&ProfilingKey, +[](void *P) {
				// This is the thread-exit handler.
				auto &TLD = *reinterpret_cast<ProfilingData *>(P);
				if (TLD.Allocators == nullptr && TLD.FCT == nullptr)
				return;

				postCurrentThreadFCT(TLD);
				});
				});

				__xray_log_set_buffer_iterator(profileCollectorService::nextBuffer);
				__xray_set_handler(profilingHandleArg0);
				__xray_set_handler_arg1(profilingHandleArg1);

				atomic_store(&ProfilerLogStatus, XRayLogInitStatus::XRAY_LOG_INITIALIZED,
				memory_order_release);
				if (Verbosity())
				Report("XRay Profiling init successful.\n");

				return XRayLogInitStatus::XRAY_LOG_INITIALIZED;
				}

				bool profilingDynamicInitializer() XRAY_NEVER_INSTRUMENT {
				// Set up the flag defaults from the static defaults and the
				// compiler-provided defaults.
				{
				SpinMutexLock Lock(&ProfilerOptionsMutex);
				auto *F = profilingFlags();
				F->setDefaults();
				FlagParser ProfilingParser;
				registerProfilerFlags(&ProfilingParser, F);
				const char *ProfilerCompileFlags = profilingCompilerDefinedFlags();
				ProfilingParser.ParseString(ProfilerCompileFlags);
				}

				XRayLogImpl Impl{
				profilingLoggingInit,
				profilingFinalize,
				profilingHandleArg0,
				profilingFlush,
				};
				auto RegistrationResult = __xray_log_register_mode("xray-profiling", Impl);
				if (RegistrationResult != XRayLogRegisterStatus::XRAY_REGISTRATION_OK &&
				Verbosity())
				Report("Cannot register XRay Profiling mode to 'xray-profiling'; error = "
				"%d\n",
				RegistrationResult);
				if (!internal_strcmp(flags()->xray_mode, "xray-profiling"))
				__xray_set_log_impl(Impl);
				return true;
				}

				} // namespace __xray

				static auto UNUSED Unused = __xray::profilingDynamicInitializer();
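
Note on the status handling above: both profilingLoggingInit() and profilingFinalize() guard their work with a compare-and-swap on ProfilerLogStatus, so only one caller can move the status forward and every other caller gets back the status it observed. A minimal, self-contained sketch of that pattern follows. It uses std::atomic rather than the sanitizer atomics, and the names Status, SketchStatus, transition, sketchInit and sketchFinalize are hypothetical, not part of the patch.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for XRayLogInitStatus; the values are illustrative only.
enum Status : int {
  UNINITIALIZED,
  INITIALIZING,
  INITIALIZED,
  FINALIZING,
  FINALIZED
};

static std::atomic<int> SketchStatus{UNINITIALIZED};

// Attempts From -> Via, runs DoWork, then publishes To. If the status is not
// From, nothing is changed and the status that was observed is returned.
template <class Work>
Status transition(Status From, Status Via, Status To, Work DoWork) {
  int Expected = From;
  if (!SketchStatus.compare_exchange_strong(Expected, Via,
                                            std::memory_order_acq_rel))
    return static_cast<Status>(Expected);
  DoWork();
  SketchStatus.store(To, std::memory_order_release);
  return To;
}

Status sketchInit() {
  return transition(UNINITIALIZED, INITIALIZING, INITIALIZED, [] {});
}

Status sketchFinalize() {
  return transition(INITIALIZED, FINALIZING, FINALIZED, [] {});
}
```

In this sketch a second sketchInit() after a successful one returns INITIALIZED without redoing any work, which mirrors the "Cannot initialize already initialised profiling implementation" branch. Nothing here resets the status back to UNINITIALIZED; in the patch that is presumably the job of the flush path, which is not shown in this part of the diff.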

compiler-rt/trunk/lib/xray/xray_profiling_flags.h

				//===-- xray_profiling_flags.h ----------------------------------*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file is a part of XRay, a dynamic runtime instrumentation system.
				//
				// XRay profiling runtime flags.
				//===----------------------------------------------------------------------===//

				#ifndef XRAY_PROFILER_FLAGS_H
				#define XRAY_PROFILER_FLAGS_H

				#include "sanitizer_common/sanitizer_flag_parser.h"
				#include "sanitizer_common/sanitizer_internal_defs.h"

				namespace __xray {

				struct ProfilerFlags {
				#define XRAY_FLAG(Type, Name, DefaultValue, Description) Type Name;
				#include "xray_profiling_flags.inc"
				#undef XRAY_FLAG

				void setDefaults();
				};

				extern ProfilerFlags xray_profiling_flags_dont_use_directly;
				inline ProfilerFlags *profilingFlags() {
				return &xray_profiling_flags_dont_use_directly;
				}
				void registerProfilerFlags(FlagParser *P, ProfilerFlags *F);

				} // namespace __xray

				#endif // XRAY_PROFILER_FLAGS_H

compiler-rt/trunk/lib/xray/xray_profiling_flags.cc

				//===-- xray_profiling_flags.cc ---------------------------------*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file is a part of XRay, a dynamic runtime instrumentation system.
				//
				// XRay profiling runtime flags.
				//===----------------------------------------------------------------------===//

				#include "xray_profiling_flags.h"
				#include "sanitizer_common/sanitizer_common.h"
				#include "sanitizer_common/sanitizer_flag_parser.h"
				#include "sanitizer_common/sanitizer_libc.h"
				#include "xray_defs.h"

				namespace __xray {

				// Storage for the profiling flags.
				ProfilerFlags xray_profiling_flags_dont_use_directly;

				void ProfilerFlags::setDefaults() XRAY_NEVER_INSTRUMENT {
				#define XRAY_FLAG(Type, Name, DefaultValue, Description) Name = DefaultValue;
				#include "xray_profiling_flags.inc"
				#undef XRAY_FLAG
				}

				void registerProfilerFlags(FlagParser *P,
				ProfilerFlags *F) XRAY_NEVER_INSTRUMENT {
				#define XRAY_FLAG(Type, Name, DefaultValue, Description) \
				RegisterFlag(P, #Name, Description, &F->Name);
				#include "xray_profiling_flags.inc"
				#undef XRAY_FLAG
				}

				} // namespace __xray

compiler-rt/trunk/lib/xray/xray_profiling_flags.inc

				//===-- xray_profiling_flags.inc --------------------------------*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// XRay profiling runtime flags.
				//
				//===----------------------------------------------------------------------===//
				#ifndef XRAY_FLAG
				#error "Define XRAY_FLAG prior to including this file!"
				#endif

				XRAY_FLAG(uptr, per_thread_allocator_max, 2 << 20,
				"Maximum size of any single per-thread allocator.")
				XRAY_FLAG(uptr, global_allocator_max, 2 << 24,
				"Maximum size of the global allocator for profile storage.")
				XRAY_FLAG(uptr, stack_allocator_max, 2 << 24,
				"Maximum size of the traversal stack allocator.")
				XRAY_FLAG(int, grace_period_ms, 100,
				"Profile collection will wait this much time in milliseconds before "
				"resetting the global state. This gives a chance to threads to "
				"notice that the profiler has been finalized and clean up.")

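Note on the flag files above: the header, the .cc file and the .inc file together form an X-macro. The flag list is written once, and each includer defines XRAY_FLAG differently to generate struct members, default assignments, or parser registrations. The sketch below shows the same idea in a single translation unit. It assumes an in-file list macro (SKETCH_FLAGS) in place of the .inc include, and SketchFlags and describeFlag are hypothetical names; the defaults are copied from the listing, with the shift expressions spelled out as byte counts in the static_asserts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical in-file flag list standing in for xray_profiling_flags.inc.
#define SKETCH_FLAGS(X)                                                        \
  X(std::size_t, per_thread_allocator_max, 2 << 20, "per-thread cap")          \
  X(std::size_t, global_allocator_max, 2 << 24, "global cap")                  \
  X(int, grace_period_ms, 100, "grace period in ms")

struct SketchFlags {
  // First expansion: one data member per flag.
#define X(Type, Name, Default, Desc) Type Name;
  SKETCH_FLAGS(X)
#undef X

  // Second expansion: one assignment per flag.
  void setDefaults() {
#define X(Type, Name, Default, Desc) Name = Default;
    SKETCH_FLAGS(X)
#undef X
  }
};

// Third expansion: look up a flag's description by its stringified name.
// Returns nullptr when the name is not in the list.
inline const char *describeFlag(const char *Wanted) {
#define X(Type, Name, Default, Desc)                                           \
  if (std::strcmp(Wanted, #Name) == 0)                                         \
    return Desc;
  SKETCH_FLAGS(X)
#undef X
  return nullptr;
}

// The defaults in the listing, spelled out: 2 << 20 is 2 MiB, 2 << 24 is 32 MiB.
static_assert((2 << 20) == 2 * 1024 * 1024, "2 << 20 is 2 MiB");
static_assert((2 << 24) == 32 * 1024 * 1024, "2 << 24 is 32 MiB");
```

The patch keeps the list in a separate .inc file so that the header and the .cc file can each include it with their own XRAY_FLAG definition; the #error guard at the top of the .inc catches an include without one.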
compiler-rt/trunk/test/xray/TestCases/Posix/c-test.cc

				-// RUN: %clang_xray -g -o %t %s
				+// RUN: %clang_xray -g -fxray-modes=xray-basic,xray-fdr,xray-profiling -o %t %s
				 // RUN: rm xray-log.c-test.* || true
				 // RUN: XRAY_OPTIONS=patch_premain=true:verbosity=1:xray_mode=xray-basic %t \
				 // RUN: 2>&1 | FileCheck %s
				 // RUN: rm xray-log.c-test.* || true
				 //
				 // REQUIRES: x86_64-target-arch
				 // REQUIRES: built-in-llvm-tree
				 __attribute__((xray_always_instrument)) void always() {}

				 int main() {
				 always();
				 }

				 // CHECK: =={{[0-9].*}}==XRay: Log file in '{{.*}}'

compiler-rt/trunk/test/xray/TestCases/Posix/profiling-multi-threaded.cc

				// Check that we can get a profile from a multi-threaded application, on
				// demand through the XRay logging implementation API.
				//
				// FIXME: Make -fxray-modes=xray-profiling part of the default?
				// RUN: %clangxx_xray -std=c++11 %s -o %t -fxray-modes=xray-profiling
				// RUN: %run %t
				//
				// UNSUPPORTED: target-is-mips64,target-is-mips64el

				#include "xray/xray_interface.h"
				#include "xray/xray_log_interface.h"
				#include <cassert>
				#include <cstdio>
				#include <string>
				#include <thread>

				#define XRAY_ALWAYS_INSTRUMENT [[clang::xray_always_instrument]]
				#define XRAY_NEVER_INSTRUMENT [[clang::xray_never_instrument]]

				XRAY_ALWAYS_INSTRUMENT void f2() { return; }
				XRAY_ALWAYS_INSTRUMENT void f1() { f2(); }
				XRAY_ALWAYS_INSTRUMENT void f0() { f1(); }

				using namespace std;

				volatile int buffer_counter = 0;

				XRAY_NEVER_INSTRUMENT void process_buffer(const char *, XRayBuffer) {
				// FIXME: Actually assert the contents of the buffer.
				++buffer_counter;
				}

				XRAY_ALWAYS_INSTRUMENT int main(int, char **) {
				assert(__xray_log_select_mode("xray-profiling") ==
				XRayLogRegisterStatus::XRAY_REGISTRATION_OK);
				assert(__xray_log_get_current_mode() != nullptr);
				std::string current_mode = __xray_log_get_current_mode();
				assert(current_mode == "xray-profiling");
				assert(__xray_patch() == XRayPatchingStatus::SUCCESS);
				assert(__xray_log_init(0, 0, nullptr, 0) ==
				XRayLogInitStatus::XRAY_LOG_INITIALIZED);
				std::thread t0([] { f0(); });
				std::thread t1([] { f0(); });
				f0();
				t0.join();
				t1.join();
				assert(__xray_log_finalize() == XRayLogInitStatus::XRAY_LOG_FINALIZED);
				assert(__xray_log_process_buffers(process_buffer) ==
				XRayLogFlushStatus::XRAY_LOG_FLUSHED);
				// We're running three threads, so we expect three buffers.
				assert(buffer_counter == 3);
				assert(__xray_log_flushLog() == XRayLogFlushStatus::XRAY_LOG_FLUSHED);
				}
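
Note on the buffer count in this test: profilingLoggingInit() registers a pthread key whose destructor posts a thread's data when that thread exits, and profilingFinalize() posts the calling thread's data explicitly. The sketch below illustrates only the thread-exit half using plain POSIX calls. SketchThreadData, recordCall, runWorkers and PostedBuffers are hypothetical names, and a counter stands in for the real postCurrentThreadFCT().

```cpp
#include <atomic>
#include <cassert>
#include <pthread.h>
#include <thread>
#include <vector>

// Hypothetical per-thread record; the real ProfilingData holds a call trie.
struct SketchThreadData {
  int Calls = 0;
};

static pthread_key_t SketchKey;
static pthread_once_t SketchOnce = PTHREAD_ONCE_INIT;
static std::atomic<int> PostedBuffers{0};

// Thread-exit handler: runs once per exiting thread whose key value is non-null.
static void onThreadExit(void *P) {
  auto *TLD = static_cast<SketchThreadData *>(P);
  if (TLD->Calls > 0)
    PostedBuffers.fetch_add(1, std::memory_order_relaxed);
  delete TLD;
}

static void createKey() { pthread_key_create(&SketchKey, onThreadExit); }

// Records one "function call" in the calling thread's private data,
// creating that data (and the key) lazily on first use.
void recordCall() {
  pthread_once(&SketchOnce, createKey);
  auto *TLD = static_cast<SketchThreadData *>(pthread_getspecific(SketchKey));
  if (TLD == nullptr) {
    TLD = new SketchThreadData;
    pthread_setspecific(SketchKey, TLD);
  }
  ++TLD->Calls;
}

// Starts N worker threads, joins them, and returns how many of them posted
// their data from the thread-exit handler.
int runWorkers(int N) {
  int Before = PostedBuffers.load();
  std::vector<std::thread> Workers;
  for (int I = 0; I < N; ++I)
    Workers.emplace_back([] { recordCall(); });
  for (auto &W : Workers)
    W.join();
  return PostedBuffers.load() - Before;
}
```

In this sketch the thread that calls runWorkers() never runs the handler while the program is still going, so its own data is not counted. That is the gap profilingFinalize() closes by calling postCurrentThreadFCT(TLD) for the current thread, and it is how two worker threads plus main add up to the three buffers asserted above.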

compiler-rt/trunk/test/xray/TestCases/Posix/profiling-single-threaded.cc

				// Check that we can get a profile from a single-threaded application, on
				// demand through the XRay logging implementation API.
				//
				// FIXME: Make -fxray-modes=xray-profiling part of the default?
				// RUN: %clangxx_xray -std=c++11 %s -o %t -fxray-modes=xray-profiling
				// RUN: %run %t
				//
				// UNSUPPORTED: target-is-mips64,target-is-mips64el

				#include "xray/xray_interface.h"
				#include "xray/xray_log_interface.h"
				#include <cassert>
				#include <cstdio>
				#include <string>

				[[clang::xray_always_instrument]] void f2() { return; }
				[[clang::xray_always_instrument]] void f1() { f2(); }
				[[clang::xray_always_instrument]] void f0() { f1(); }

				using namespace std;

				volatile int buffer_counter = 0;

				[[clang::xray_never_instrument]] void process_buffer(const char *, XRayBuffer) {
				// FIXME: Actually assert the contents of the buffer.
				++buffer_counter;
				}

				[[clang::xray_always_instrument]] int main(int, char **) {
				assert(__xray_log_select_mode("xray-profiling") ==
				XRayLogRegisterStatus::XRAY_REGISTRATION_OK);
				assert(__xray_log_get_current_mode() != nullptr);
				std::string current_mode = __xray_log_get_current_mode();
				assert(current_mode == "xray-profiling");
				assert(__xray_patch() == XRayPatchingStatus::SUCCESS);
				assert(__xray_log_init_mode("xray-profiling", "") ==
				XRayLogInitStatus::XRAY_LOG_INITIALIZED);
				f0();
				assert(__xray_log_finalize() == XRayLogInitStatus::XRAY_LOG_FINALIZED);
				f0();
				assert(__xray_log_process_buffers(process_buffer) ==
				XRayLogFlushStatus::XRAY_LOG_FLUSHED);
				assert(buffer_counter == 1);
				assert(__xray_log_flushLog() == XRayLogFlushStatus::XRAY_LOG_FLUSHED);

				// Let's reset the counter.
				buffer_counter = 0;

				assert(__xray_log_init_mode("xray-profiling", "") ==
				XRayLogInitStatus::XRAY_LOG_INITIALIZED);
				f0();
				assert(__xray_log_finalize() == XRayLogInitStatus::XRAY_LOG_FINALIZED);
				f0();
				assert(__xray_log_process_buffers(process_buffer) ==
				XRayLogFlushStatus::XRAY_LOG_FLUSHED);
				assert(buffer_counter == 1);
				assert(__xray_log_flushLog() == XRayLogFlushStatus::XRAY_LOG_FLUSHED);
				}
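
Note on buffer processing in these tests: the implementation installs profileCollectorService::nextBuffer as the buffer iterator, and the tests count how many times process_buffer is invoked. The sketch below mocks that flow so the counting can be followed without the XRay runtime. It assumes the iterator protocol is "pass in the previous buffer, get back the next one, with an empty buffer marking both the start and the end"; that protocol, and the names SketchBuffer, SketchProfiles, sketchNextBuffer, sketchProcessBuffers and countBuffer, are assumptions for illustration rather than the runtime's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mirror of XRayBuffer: a pointer plus a size.
struct SketchBuffer {
  const void *Data;
  std::size_t Size;
};

// Stand-in for the serialized per-thread profiles held by the collector.
// Each entry is assumed to be non-empty.
static std::vector<std::vector<char>> SketchProfiles;

// Returns the buffer following Prev. An empty Prev asks for the first buffer;
// an empty result means there are no more buffers.
SketchBuffer sketchNextBuffer(SketchBuffer Prev) {
  std::size_t Next = 0;
  if (Prev.Data != nullptr) {
    while (Next < SketchProfiles.size() &&
           SketchProfiles[Next].data() != Prev.Data)
      ++Next;
    ++Next; // Step past the buffer we were handed.
  }
  if (Next >= SketchProfiles.size())
    return {nullptr, 0};
  return {SketchProfiles[Next].data(), SketchProfiles[Next].size()};
}

// Walks every buffer through the iterator and hands each one to Processor.
// Returns the number of buffers delivered.
int sketchProcessBuffers(void (*Processor)(const char *, SketchBuffer)) {
  int Count = 0;
  for (SketchBuffer B = sketchNextBuffer({nullptr, 0}); B.Data != nullptr;
       B = sketchNextBuffer(B)) {
    Processor("sketch-mode", B);
    ++Count;
  }
  return Count;
}

// Counting callback in the spirit of the tests' process_buffer.
static int SketchSeen = 0;
void countBuffer(const char *, SketchBuffer) { ++SketchSeen; }
```

With two stored profiles, sketchProcessBuffers(countBuffer) delivers two buffers; with one, as in the single-threaded test, it delivers one.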
