This is an archive of the discontinued LLVM Phabricator instance.

Differential D45756

[XRay][profiler] Part 1: XRay Allocator and Array Implementations
ClosedPublic

Authored by dberris on Apr 18 2018, 12:53 AM.

Details

Reviewers

echristo
pelikan
kpw

Commits

rG26e81209ef84: [XRay][profiler] Part 1: XRay Allocator and Array Implementations
rCRT331141: [XRay][profiler] Part 1: XRay Allocator and Array Implementations
rL331141: [XRay][profiler] Part 1: XRay Allocator and Array Implementations

Summary

This change is part of the larger XRay Profiling Mode effort.

Here we implement an arena allocator for the fixed-size buffers used in a
segmented array implementation. This change also adds the segmented array
data structure, which relies on the allocator to provide and maintain
its storage.

Key features of the Allocator type:

It uses cache-aligned blocks, intended to host the actual data. These blocks are cache-line-size multiples of contiguous bytes.

The Allocator has a maximum memory budget, set at construction time. This allows us to cap the amount of data each specific Allocator instance is responsible for.

Upon destruction, the Allocator will clean up the storage it's used, handing it back to the internal allocator used in sanitizer_common.
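
The three Allocator properties above can be illustrated with a minimal sketch. It is not the code under review: it assumes a 64-byte cache line, uses `std::aligned_alloc`/`std::free` (C++17) in place of the sanitizer_common internal allocator, and the names `kAssumedCacheLine`, `RoundUpToCacheLine` and `BudgetedArena` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Assumed cache line size; the patch itself uses kCacheLineSize.
constexpr std::size_t kAssumedCacheLine = 64;

// Smallest multiple of the cache line size that is >= N.
constexpr std::size_t RoundUpToCacheLine(std::size_t N) {
  return kAssumedCacheLine *
         ((N / kAssumedCacheLine) + (N % kAssumedCacheLine ? 1 : 0));
}

// Hypothetical stand-in for the Allocator described above.
template <std::size_t N> class BudgetedArena {
  static_assert(N > 0, "blocks must hold at least one byte");

public:
  static constexpr std::size_t BlockSize = RoundUpToCacheLine(N);

  explicit BudgetedArena(std::size_t Budget) : MaxBytes(Budget) {
    assert(Budget > 0 && "a zero budget can never hand out a block");
  }
  BudgetedArena(const BudgetedArena &) = delete;
  BudgetedArena &operator=(const BudgetedArena &) = delete;

  // Returns a cache-line-aligned block, or nullptr once the budget is spent.
  // The check is made before allocating, so the block that crosses the
  // budget is still handed out.
  void *Allocate() {
    if (Blocks.size() * BlockSize >= MaxBytes)
      return nullptr;
    void *P = std::aligned_alloc(kAssumedCacheLine, BlockSize);
    if (P != nullptr)
      Blocks.push_back(P);
    return P;
  }

  // All storage is released in one go when the arena is destroyed.
  ~BudgetedArena() {
    for (void *P : Blocks)
      std::free(P);
  }

private:
  std::size_t MaxBytes;
  std::vector<void *> Blocks;
};
```

With a budget of 16 bytes and `N = 16`, the first `Allocate()` succeeds and the second returns `nullptr`, which is the behaviour the `OverAllocate` unit test in this revision checks for.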

Key features of the Array type:

Each segmented array is always backed by an Allocator, which is either user-provided or uses a global allocator.

When an Array grows, it grows by appending a segment that's fixed-sized. The size of each segment is computed by the number of elements of type T that can fit into cache line multiples.

An Array does not return memory to the Allocator, but it can keep track of the current number of "live" objects it stores.

When an Array is destroyed, it will not return memory to the Allocator. Users should clean up the Allocator independently of the Array.
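
The growth and trimming behaviour described above can be sketched as follows. This is a simplified illustration rather than the Array in this revision: it keeps its segments in a `std::vector` of `std::unique_ptr<T[]>` instead of drawing them from an Allocator, it requires `T` to be default-constructible and copy-assignable, and the name `SegmentedArray` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical segmented array: storage grows one fixed-size segment at a
// time and is never handed back when elements are trimmed.
template <typename T, std::size_t N> class SegmentedArray {
  static_assert(N > 0, "a segment must hold at least one element");

public:
  // Appends a copy of E and returns a pointer to the stored element.
  T *Append(const T &E) {
    if (Size == Segments.size() * N) // every existing segment is full
      Segments.push_back(std::make_unique<T[]>(N));
    T &Slot = Segments[Size / N][Size % N];
    Slot = E;
    ++Size;
    return &Slot;
  }

  T &operator[](std::size_t I) {
    assert(I < Size && "index out of range");
    return Segments[I / N][I % N];
  }

  // Drops up to Elements "live" objects from the end; segments are kept.
  void trim(std::size_t Elements) {
    Size -= (Elements < Size ? Elements : Size);
  }

  std::size_t size() const { return Size; }
  std::size_t segments() const { return Segments.size(); }

private:
  std::vector<std::unique_ptr<T[]>> Segments;
  std::size_t Size = 0;
};
```

Because existing segments are never moved or freed, pointers returned by `Append` stay valid while the array grows, and appending after a `trim` reuses the segments that are already there.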

These basic data structures are used by the XRay Profiling Mode
implementation to implement efficient and cache-aware storage for data
that's typically read-and-write heavy for tracking latency information.
We're relying on the cache line characteristics of the architecture to
provide us good data isolation and cache friendliness, when we're
performing operations like searching for elements and/or updating data
hosted in these cache lines.

Diff Detail

Build Status

Buildable 17270
Build 17270: arc lint + arc unit

Event Timeline

dberris created this revision. Apr 18 2018, 12:53 AM

Herald added a subscriber: mgorny. Apr 18 2018, 12:53 AM

Harbormaster completed remote builds in B17165: Diff 142886. Apr 18 2018, 12:53 AM

dberris added a child revision: D45757: [XRay][profiler] Part 2: XRay Function Call Trie. Apr 18 2018, 1:03 AM

Haven't looked at segmented array yet, but I had a good look at the allocator. I think I caught one serious bug, and the rest is just suggestions.

compiler-rt/lib/xray/CMakeLists.txt
21–52	Didn't like Martin's attempt to alphabetize? ;)
compiler-rt/lib/xray/tests/unit/allocator_test.cc
31	Expect not null? I don't grok this syntax.
compiler-rt/lib/xray/xray_allocator.h
36	Can you document the template parameter in the class level comment? Minimum block size?
37–39	If you document the template parameter in the class comment, you can replace this with something to the effect of "A block is a piece of memory handed out by the Allocator".
53	This name collides with the cryptographic immutable content-address data structure. I would try to avoid that. BlockListNode?
58–60	Could you make this more clear? Maybe even name it BlockPtrsPerCacheLine instead of Size. I had to read it a few times.
67	sizeof(Block)? Should be equivalent since Block just wraps void, but could be more clear. I see this type of punning show up in a few other places, so maybe it's alright to be consistent.
72–74	If you're punting on the freelist you might be able to squeeze some more data into here. UsedBits can actually be disposed of if you rely on always advancing the Chain when Counter % BlockChain::Size == 0 instead of wrap around detection. Similarly, *Next is never actually used.
87	Maybe call this Tail to communicate that it is updated to refer to the node to pull new blocks from and not the head?
102–104	Did you consider having this function accept the tail of the list? Then it could write forward links in addition to backward links. It also seems fishy (at least if you want to do a free list) that this always points back to the Chain member.
124	I prefer 'auto*' in cases like this.
131	NullLink is static right? This seems messy, although I know you don't intend to install multiple allocators.
139	Pretty sure you meant & instead of |.
146–148	DCHECK that the ChainOffset == 0 and BitMask == 1?
160	auto* C please.
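
The suggestion on lines 72–74 (advance the chain whenever `Counter % BlockChain::Size == 0` instead of tracking used bits) can be sketched in isolation. The helper below is hypothetical and only shows the arithmetic; `Slot` and `SlotFor` are illustrative names, and the value 7 used in the note afterwards assumes a 64-byte cache line holding eight 8-byte pointers, one of which is the back link.

```cpp
#include <cassert>
#include <cstddef>

// Where the next block lives, derived purely from the allocation counter.
struct Slot {
  std::size_t Link;    // index of the chain link that holds the block
  std::size_t Offset;  // position of the block within that link
  bool NeedsNewLink;   // true when a fresh link must be allocated first
};

// Hypothetical helper: no per-block "used" bits are needed, because a new
// link is required exactly when the counter is a multiple of BlocksPerLink.
inline Slot SlotFor(std::size_t Counter, std::size_t BlocksPerLink) {
  assert(BlocksPerLink > 0 && "a link must hold at least one block");
  const std::size_t Offset = Counter % BlocksPerLink;
  return Slot{Counter / BlocksPerLink, Offset, Offset == 0};
}
```

With 7 blocks per link, counters 0 and 7 both report `NeedsNewLink`, so the very first allocation needs no special case of its own.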

fixup: Address comments by kpw@

compiler-rt/lib/xray/CMakeLists.txt
21–52	Oh, I just thought it was harder to track which files were in which bucket by having them all on the same line. Visually spelling these out this way allows me to determine whether there are missing files quickly.
compiler-rt/lib/xray/xray_allocator.h
53	Renamed to `BlockLink`.
72–74	Good call! Yeah, I kind-of stopped short of cleaning this up further. Definitely something to think about if we ever need the freelist, but not really required at this time.
102–104	It could be, but now that we're not actually using the forward links (thanks for catching that), we can simplify this a lot now.
131	Yeah, NullLink is used as a sentinel value, which simplifies a lot of the operations for determining whether it's the end. It could as easily have been a nullptr though.

kpw added inline comments. Apr 19 2018, 5:26 PM

compiler-rt/lib/xray/xray_allocator.h
53	BlockLink is a good name. Sorry to be a stickler. ;)
65–66	This comment is out of date since you've cleaned up the Next ptr and the bitmap.
72–73	Bitmap doesn't exist any more.
123	I don't think this is a special case now that you've made changes. You should only need two arms in your conditional. In one case, ChainOffset == 0 and you have to allocate a new BlockLink. In the other arm, you can return a Block from the existing BlockLink arena.
131	Now that you're not keeping a forward pointer, having a static as the first link isn't problematic.
135–139	It looks like both branches of this conditional allocate a new BlockLink.
compiler-rt/lib/xray/xray_segmented_array.h
35–36	For the N default value, you can't pack an entire T if the remaining bytes are non-zero. I think it can be simplified to kCacheLineSize / sizeof(T)
47–48	Did you consider the tradeoffs of these being external to Block versus invasive to the Block?
105	Can you call out which iterator category this can be used as, even though it doesn't satisfy the complete concept?
110	O -> Offset otherwise it kind of looked like a default zero in the second parameter.
125	Shouldn't you check against the sentinel chunk instead (here and above)?
126–132	This doesn't seem right. There is still a value at offset 0 within the current chunk when going backward. What you want to check is whether Offset ends up being in the N - 1 congruence class after decrement.
139–141	Can't you just invoke operator++() here?
147–149	Similarly reuse operator--() code?
154	This is fine, but will a compiler generate a memcmp equivalent?
172–173	Haven't looked past here yet. Leaving a note so I know where to come back to.
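
The point made on lines 126–132 about decrementing can be shown with a small sketch. It assumes the iterator tracks a global element offset plus the chunk it is currently in; here the chunk is an index rather than the chunk pointer the real iterator walks, and `Cursor`, `Advance` and `Retreat` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical iterator state: a chunk index plus a global element offset.
struct Cursor {
  std::size_t Chunk;
  std::size_t Offset;
};

// Moving forward enters the next chunk when the new offset is a multiple of N.
template <std::size_t N> Cursor Advance(Cursor C) {
  ++C.Offset;
  if (C.Offset % N == 0)
    ++C.Chunk;
  return C;
}

// Moving backward must test the offset after the decrement: offset 0 of the
// current chunk still holds a value, so the previous chunk is entered only
// when the new offset lands in the N - 1 congruence class.
template <std::size_t N> Cursor Retreat(Cursor C) {
  assert(C.Offset > 0 && "cannot retreat past the first element");
  --C.Offset;
  if (C.Offset % N == N - 1)
    --C.Chunk;
  return C;
}
```

For `N = 4`, retreating from offset 5 stays in chunk 1 (offset 4 is that chunk's first element), while retreating from offset 4 moves to chunk 0.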

fixup: Address comments by kpw@.

Thank you for the thorough review, @kpw!

compiler-rt/lib/xray/xray_segmented_array.h
35–36	Here's the problem with that: if sizeof(T) > kCacheLineSize, then N must not be 0. Essentially we want to figure out how many T's will fit in blocks that are cache-line-size multiples. The computation I wanted to express is something like this: given sizeof(T), how many T's can I fit in cache-line-_multiple_-sized blocks if each cache line is kCacheLineSize bytes? For a cache line of 64 bytes and a T of 16 bytes, for example, we can fit 4 T's just fine. For a 24-byte T we can only fit 2 in one cache line, but if we asked for 8 then we can put them in three cache lines. It might not be great without alignment information (i.e. we really should be treating the 24-byte object as a 32-byte object) but that's something that can be fixed by the definition of T. Thinking this through, we really should be using the least common multiple of sizeof(T) and kCacheLineSize, divided by sizeof(T), to figure out how many elements to store in cache-line-multiple-sized blocks for the default N. I've updated the implementation to do just that. :)
47–48	I did think about using the Block storage as an opaque set of bytes and intrusively putting the pointers in there too (ala an intrusive list). I think that makes the implementation much more bug prone and it made my head hurt just trying to keep the aliasing and alignment rules in my head of where the pointers should be, whether the addresses we're getting are properly aligned, etc. -- so I punted and left it this way instead. :) Putting a TODO to explore that possibility in the future.
126–132	Oh my, yes you're right! Good catch, thanks.
154	I don't think `memcmp` is actually what we want, but I'm happy for the compiler to do it for me if it can.
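
The least-common-multiple computation described in the reply to lines 35–36 can be written out as a short sketch. The function names are hypothetical and the 64-byte default is an assumption standing in for kCacheLineSize; the updated diff may spell this differently.

```cpp
#include <cassert>
#include <cstddef>

// Greatest common divisor (Euclid), usable in constant expressions.
constexpr std::size_t Gcd(std::size_t A, std::size_t B) {
  return B == 0 ? A : Gcd(B, A % B);
}

// Least common multiple; dividing first keeps the intermediate value small.
constexpr std::size_t Lcm(std::size_t A, std::size_t B) {
  return A / Gcd(A, B) * B;
}

// Hypothetical default for N: the number of elements of ElemSize bytes that
// exactly fill a whole number of cache lines.
constexpr std::size_t DefaultElementsPerSegment(std::size_t ElemSize,
                                                std::size_t CacheLine = 64) {
  return Lcm(ElemSize, CacheLine) / ElemSize;
}

// The two cases worked through in the comment above.
static_assert(DefaultElementsPerSegment(16) == 4, "16-byte T: one cache line");
static_assert(DefaultElementsPerSegment(24) == 8, "24-byte T: three cache lines");
```

Unlike `kCacheLineSize / sizeof(T)`, this never yields 0: a 128-byte element gives 1, because the least common multiple is always at least `sizeof(T)`.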

Harbormaster completed remote builds in B17269: Diff 143230. Apr 19 2018, 11:38 PM

fixup: Use iterators in find_element implementation

Harbormaster completed remote builds in B17270: Diff 143236. Apr 19 2018, 11:50 PM

Finally reached the end. Phew!

I think there is some substantial complexity here, and you'll probably want to think about how to get some test coverage of the implementation.

One edge case I didn't quite reason through: trimming down to size zero might, for instance, invalidate the special-case sentinel checks.

compiler-rt/lib/xray/xray_segmented_array.h
283–290	You want to reuse the allocated chunks when growing the list rather than allocate new ones right?
295–296	Isn't it peculiar that these don't return a const type? Will this be a problem in practice?
299–300	Can you explain this statement please? My template foo is a bit rusty. Is this a templated member definition and which version of the standard does it require? LLVM is still on C++11, IIUC.

Rebase
- squash: Add a freelist to the Array

Harbormaster completed remote builds in B17318: Diff 143500. Apr 22 2018, 9:58 PM

dberris added inline comments. Apr 22 2018, 9:58 PM

compiler-rt/lib/xray/xray_segmented_array.h
283–290	Yes, that's correct. Good call, fixed.
295–296	In practice, for our use-case, it shouldn't be a problem, although ideally the type of the Iterator should be different for the non-const and const case. For us here, it shouldn't matter (yet). The way to fix this would be to make the Iterator type itself a template, so that we can determine what the result of the dereference and arrow operators are, whether they are `const T` or `T`. I've made the change to make it more "technically correct", which is the best kind of correct. :)
299–300	So, what happens here is we're telling the compiler that if the type `Array<T, N>` is ever instantiated in a translation unit, that we're going to get storage for the static data member. This is because `SentinelChunk` is a static data member of the `Array<T,N>` type, and it needs static (program duration) storage to be defined across translation units. This is how you spell this in C++98, AFAICT.
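
The iterator change described in the reply to lines 295–296 (making the Iterator type itself a template so that dereferencing yields `T` or `const T`) can be sketched as below. This is an illustration over a plain fixed-size buffer rather than the segmented Array, and `BasicIterator` and `FixedArray` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// One iterator template serves both cases: U is either T or const T, and
// that choice determines what operator* and operator-> give back.
template <typename U> class BasicIterator {
public:
  explicit BasicIterator(U *P) : Ptr(P) {}
  U &operator*() const { return *Ptr; }
  U *operator->() const { return Ptr; }
  BasicIterator &operator++() {
    ++Ptr;
    return *this;
  }
  friend bool operator==(const BasicIterator &L, const BasicIterator &R) {
    return L.Ptr == R.Ptr;
  }
  friend bool operator!=(const BasicIterator &L, const BasicIterator &R) {
    return !(L == R);
  }

private:
  U *Ptr;
};

// Hypothetical container: const access hands out BasicIterator<const T>.
template <typename T, std::size_t N> struct FixedArray {
  using iterator = BasicIterator<T>;
  using const_iterator = BasicIterator<const T>;

  iterator begin() { return iterator(Data); }
  iterator end() { return iterator(Data + N); }
  const_iterator begin() const { return const_iterator(Data); }
  const_iterator end() const { return const_iterator(Data + N); }

  T Data[N];
};
```

Writing through an iterator obtained from a `const FixedArray` then fails to compile, while the non-const overloads still allow mutation.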

This looks better now, although there were a few issues that slipped through the tests and I wonder if there are others lurking.

If we're going to maintain this cache-line aware storage structure, do you have a plan to measure the performance impact on the GWP implementation versus using a more simple container like sanitizer/vector.h?
This seems important to me because certain kinds of mistakes might still leave the interface operating correctly, but screw up the performance details.

compiler-rt/lib/xray/xray_segmented_array.h
299–300	Oh. I see. This is outside of the Array class, so it's an out-of-line static member definition with default initialization. The templates obscure that a bit, but I get it now.
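
For readers unfamiliar with the construct discussed on lines 299–300, here is a minimal sketch of an out-of-line definition of a static data member of a class template. It is not the code from the diff: `SketchArray` and its `Chunk` layout are hypothetical, and only the shape of the final definition mirrors what the thread describes.

```cpp
#include <cassert>

template <typename T, int N> struct SketchArray {
  struct Chunk {
    Chunk *Prev;
    Chunk *Next;
  };

  // Declaration only: this does not reserve any storage.
  static Chunk SentinelChunk;

  Chunk *Head = &SentinelChunk;
  bool empty() const { return Head == &SentinelChunk; }
};

// Out-of-line definition, written outside the class. Each instantiation of
// SketchArray<T, N> gets its own sentinel with static storage duration, and
// because it has static storage it is zero-initialized before use.
template <typename T, int N>
typename SketchArray<T, N>::Chunk SketchArray<T, N>::SentinelChunk;
```

All objects of one specialization share a single sentinel, whereas `SketchArray<int, 4>` and `SketchArray<long, 4>` each have a distinct one.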

This revision is now accepted and ready to land. Apr 28 2018, 8:24 PM

In D45756#1082146, @kpw wrote:

This looks better now, although there were a few issues that slipped through the tests and I wonder if there are others lurking.

If we're going to maintain this cache-line aware storage structure, do you have a plan to measure the performance impact on the GWP implementation versus using a more simple container like sanitizer/vector.h?
This seems important to me because certain kinds of mistakes might still leave the interface operating correctly, but screw up the performance details.

Yes, I have that on the list of things to do (add microbenchmarks to the test-suite).

I had some initial measurements on the cost of an early implementation, which basically show that the cost of reaching for additional memory and copying elements dominate. Those are important things to measure and make sure we can track going forward. It's also the whole point of us going through this route, to mitigate the problems we've encountered in the early prototype of this. :)

Closed by commit rL331141: [XRay][profiler] Part 1: XRay Allocator and Array Implementations (authored by dberris). Apr 29 2018, 6:50 AM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: delcypher. Apr 29 2018, 6:50 AM

dberris mentioned this in rCRT332212: [XRay][compiler-rt] Relocate a DCHECK to the correct location. May 13 2018, 9:25 PM

dberris mentioned this in rL332212: [XRay][compiler-rt] Relocate a DCHECK to the correct location..

Your usage of constexpr appears to crash GCC 4.8.5:

[871/917] Building CXX object lib/xray/CMakeFiles/RTXrayPROFILER.x86_64.dir/xray_profile_collector.cc.o
FAILED: lib/xray/CMakeFiles/RTXrayPROFILER.x86_64.dir/xray_profile_collector.cc.o 
/b/c/b/ToTLinux/src/third_party/llvm-build-tools/gcc485precise/bin/g++  ...  /b/c/b/ToTLinux/src/third_party/llvm/compiler-rt/lib/xray/xray_profile_collector.cc
In file included from /b/c/b/ToTLinux/src/third_party/llvm/compiler-rt/lib/xray/xray_segmented_array.h:20:0,
                 from /b/c/b/ToTLinux/src/third_party/llvm/compiler-rt/lib/xray/xray_function_call_trie.h:19,
                 from /b/c/b/ToTLinux/src/third_party/llvm/compiler-rt/lib/xray/xray_profile_collector.h:21,
                 from /b/c/b/ToTLinux/src/third_party/llvm/compiler-rt/lib/xray/xray_profile_collector.cc:15:
/b/c/b/ToTLinux/src/third_party/llvm/compiler-rt/lib/xray/xray_allocator.h: In instantiation of ‘__xray::Allocator<320ul>::BlockLink __xray::Allocator<320ul>::NullLink’:
/b/c/b/ToTLinux/src/third_party/llvm/compiler-rt/lib/xray/xray_allocator.h:133:31:   required from ‘__xray::Allocator<N>::~Allocator() [with long unsigned int N = 320ul]’
/b/c/b/ToTLinux/src/third_party/llvm/compiler-rt/lib/xray/xray_function_call_trie.h:202:43:   required from here
/b/c/b/ToTLinux/src/third_party/llvm/compiler-rt/lib/xray/xray_allocator.h:57:10: internal compiler error: Segmentation fault
   struct BlockLink {
          ^
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.

I will try to find a workaround. @thakis

I committed rL333678, which should start the process of healing the bots on https://ci.chromium.org/p/chromium/g/chromium.clang/console.

Wow, that's... interesting.

Thanks Reid!

Revision Contents

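Path	Size
compiler-rt/lib/xray/CMakeLists.txt	56 lines
compiler-rt/lib/xray/tests/unit/CMakeLists.txt	4 lines
compiler-rt/lib/xray/tests/unit/allocator_test.cc	42 lines
compiler-rt/lib/xray/tests/unit/segmented_array_test.cc	110 lines
compiler-rt/lib/xray/xray_allocator.h	149 lines
compiler-rt/lib/xray/xray_segmented_array.h	304 lines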

Diff 143236

compiler-rt/lib/xray/CMakeLists.txt

	# Build for all components of the XRay runtime support library.			# Build for all components of the XRay runtime support library.

	# XRay runtime library implementation files.			# XRay runtime library implementation files.
	set(XRAY_SOURCES			set(XRAY_SOURCES
	xray_init.cc			xray_init.cc
	xray_flags.cc			xray_flags.cc
	xray_interface.cc			xray_interface.cc
	xray_log_interface.cc			xray_log_interface.cc
	xray_utils.cc)			xray_utils.cc)

	# Implementation files for all XRay modes.			# Implementation files for all XRay modes.
	set(XRAY_FDR_MODE_SOURCES			set(XRAY_FDR_MODE_SOURCES
	xray_buffer_queue.cc			xray_buffer_queue.cc
	xray_fdr_logging.cc)			xray_fdr_logging.cc)

	set(XRAY_BASIC_MODE_SOURCES			set(XRAY_BASIC_MODE_SOURCES
	xray_inmemory_log.cc)			xray_inmemory_log.cc)


	# Implementation files for all XRay architectures.			# Implementation files for all XRay architectures.
	set(aarch64_SOURCES xray_AArch64.cc xray_trampoline_AArch64.S)			set(x86_64_SOURCES
	set(arm_SOURCES xray_arm.cc xray_trampoline_arm.S)			xray_x86_64.cc
	set(armhf_SOURCES ${arm_SOURCES})			xray_trampoline_x86_64.S)
	set(mips_SOURCES xray_mips.cc xray_trampoline_mips.S)
	set(mipsel_SOURCES xray_mips.cc xray_trampoline_mips.S)			set(arm_SOURCES
	set(mips64_SOURCES xray_mips64.cc xray_trampoline_mips64.S)			xray_arm.cc
	set(mips64el_SOURCES xray_mips64.cc xray_trampoline_mips64.S)			xray_trampoline_arm.S)

				set(armhf_SOURCES
				${arm_SOURCES})

				set(aarch64_SOURCES
				xray_AArch64.cc
				xray_trampoline_AArch64.S)

				set(mips_SOURCES
				xray_mips.cc
				xray_trampoline_mips.S)

				set(mipsel_SOURCES
				xray_mips.cc
				xray_trampoline_mips.S)

				set(mips64_SOURCES
				xray_mips64.cc
				xray_trampoline_mips64.S)

				set(mips64el_SOURCES
				xray_mips64.cc
				xray_trampoline_mips64.S)

	set(powerpc64le_SOURCES			set(powerpc64le_SOURCES
				kpw: Didn't like Martin's attempt to alphabetize? ;)
				dberris: Oh, I just thought it was harder to track which files were in which bucket by having them all on the same line. Visually spelling these out this way allows me to determine whether there are missing files quickly.
	xray_powerpc64.cc			xray_powerpc64.cc
	xray_trampoline_powerpc64.cc			xray_trampoline_powerpc64.cc
	xray_trampoline_powerpc64_asm.S)			xray_trampoline_powerpc64_asm.S)
	set(x86_64_SOURCES xray_x86_64.cc xray_trampoline_x86_64.S)

	# Now put it all together...			# Now put it all together...
	include_directories(..)			include_directories(..)
	include_directories(../../include)			include_directories(../../include)

	set(XRAY_CFLAGS ${SANITIZER_COMMON_CFLAGS})			set(XRAY_CFLAGS ${SANITIZER_COMMON_CFLAGS})
	set(XRAY_COMMON_DEFINITIONS XRAY_HAS_EXCEPTIONS=1)			set(XRAY_COMMON_DEFINITIONS XRAY_HAS_EXCEPTIONS=1)
	append_list_if(			append_list_if(

compiler-rt/lib/xray/tests/unit/CMakeLists.txt

	add_xray_unittest(XRayBufferQueueTest SOURCES			add_xray_unittest(XRayBufferQueueTest SOURCES
	buffer_queue_test.cc xray_unit_test_main.cc)			buffer_queue_test.cc xray_unit_test_main.cc)
	add_xray_unittest(XRayFDRLoggingTest SOURCES			add_xray_unittest(XRayFDRLoggingTest SOURCES
	fdr_logging_test.cc xray_unit_test_main.cc)			fdr_logging_test.cc xray_unit_test_main.cc)
				add_xray_unittest(XRayAllocatorTest SOURCES
				allocator_test.cc xray_unit_test_main.cc)
				add_xray_unittest(XRaySegmentedArrayTest SOURCES
				segmented_array_test.cc xray_unit_test_main.cc)

compiler-rt/lib/xray/tests/unit/allocator_test.cc

This file was added.

				//===-- allocator_test.cc -------------------------------------------------===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file is a part of XRay, a function call tracing system.
				//
				//===----------------------------------------------------------------------===//

				#include "xray_allocator.h"
				#include "gtest/gtest.h"

				namespace __xray {
				namespace {

				struct TestData {
				s64 First;
				s64 Second;
				};

				TEST(AllocatorTest, Construction) { Allocator<sizeof(TestData)> A(2 << 11, 0); }

				TEST(AllocatorTest, Allocate) {
				Allocator<sizeof(TestData)> A(2 << 11, 0);
				auto B = A.Allocate();
				ASSERT_NE(B.Data, nullptr);
				}
				kpw: Expect not null? I don't grok this syntax.

				TEST(AllocatorTest, OverAllocate) {
				Allocator<sizeof(TestData)> A(sizeof(TestData), 0);
				auto B1 = A.Allocate();
				(void)B1;
				auto B2 = A.Allocate();
				ASSERT_EQ(B2.Data, nullptr);
				}

				} // namespace
				} // namespace __xray

compiler-rt/lib/xray/tests/unit/segmented_array_test.cc

This file was added.

				#include "xray_segmented_array.h"
				#include "gtest/gtest.h"

				namespace __xray {
				namespace {

				struct TestData {
				s64 First;
				s64 Second;

				// Need a constructor for emplace operations.
				TestData(s64 F, s64 S) : First(F), Second(S) {}
				};

				TEST(SegmentedArrayTest, Construction) {
				Array<TestData> Data;
				(void)Data;
				}

				TEST(SegmentedArrayTest, ConstructWithAllocator) {
				using AllocatorType = typename Array<TestData>::AllocatorType;
				AllocatorType A(1 << 4, 0);
				Array<TestData> Data(A);
				(void)Data;
				}

				TEST(SegmentedArrayTest, ConstructAndPopulate) {
				Array<TestData> data;
				ASSERT_NE(data.Append(TestData{0, 0}), nullptr);
				ASSERT_NE(data.Append(TestData{1, 1}), nullptr);
				ASSERT_EQ(data.size(), 2u);
				}

				TEST(SegmentedArrayTest, ConstructPopulateAndLookup) {
				Array<TestData> data;
				ASSERT_NE(data.Append(TestData{0, 1}), nullptr);
				ASSERT_EQ(data.size(), 1u);
				ASSERT_EQ(data[0].First, 0);
				ASSERT_EQ(data[0].Second, 1);
				}

				TEST(SegmentedArrayTest, PopulateWithMoreElements) {
				Array<TestData> data;
				static const auto kMaxElements = 100u;
				for (auto I = 0u; I < kMaxElements; ++I) {
				ASSERT_NE(data.Append(TestData{I, I + 1}), nullptr);
				}
				ASSERT_EQ(data.size(), kMaxElements);
				for (auto I = 0u; I < kMaxElements; ++I) {
				ASSERT_EQ(data[I].First, I);
				ASSERT_EQ(data[I].Second, I + 1);
				}
				}

				TEST(SegmentedArrayTest, AppendEmplace) {
				Array<TestData> data;
				ASSERT_NE(data.AppendEmplace(1, 1), nullptr);
				ASSERT_EQ(data[0].First, 1);
				ASSERT_EQ(data[0].Second, 1);
				}

				TEST(SegmentedArrayTest, AppendAndTrim) {
				Array<TestData> data;
				ASSERT_NE(data.AppendEmplace(1, 1), nullptr);
				ASSERT_EQ(data.size(), 1u);
				data.trim(1);
				ASSERT_EQ(data.size(), 0u);
				ASSERT_TRUE(data.empty());
				}

				TEST(SegmentedArrayTest, IteratorAdvance) {
				Array<TestData> data;
				ASSERT_TRUE(data.empty());
				ASSERT_EQ(data.begin(), data.end());
				auto I0 = data.begin();
				ASSERT_EQ(I0++, data.begin());
				ASSERT_NE(I0, data.begin());
				for (const auto &D : data) {
				(void)D;
				FAIL();
				}
				ASSERT_NE(data.AppendEmplace(1, 1), nullptr);
				ASSERT_EQ(data.size(), 1u);
				ASSERT_NE(data.begin(), data.end());
				auto &D0 = *data.begin();
				ASSERT_EQ(D0.First, 1);
				ASSERT_EQ(D0.Second, 1);
				}

				TEST(SegmentedArrayTest, IteratorRetreat) {
				Array<TestData> data;
				ASSERT_TRUE(data.empty());
				ASSERT_EQ(data.begin(), data.end());
				ASSERT_NE(data.AppendEmplace(1, 1), nullptr);
				ASSERT_EQ(data.size(), 1u);
				ASSERT_NE(data.begin(), data.end());
				auto &D0 = *data.begin();
				ASSERT_EQ(D0.First, 1);
				ASSERT_EQ(D0.Second, 1);

				auto I0 = data.end();
				ASSERT_EQ(I0--, data.end());
				ASSERT_NE(I0, data.end());
				ASSERT_EQ(I0, data.begin());
				ASSERT_EQ(I0->First, 1);
				ASSERT_EQ(I0->Second, 1);
				}

				} // namespace
				} // namespace __xray

compiler-rt/lib/xray/xray_allocator.h

This file was added.

				//===-- xray_allocator.h ----------------------------------------*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file is a part of XRay, a dynamic runtime instrumentation system.
				//
				// Defines the allocator interface for an arena allocator, used primarily for
				// the profiling runtime.
				//
				//===----------------------------------------------------------------------===//
				#ifndef XRAY_ALLOCATOR_H
				#define XRAY_ALLOCATOR_H

				#include "sanitizer_common/sanitizer_allocator_internal.h"
				#include "sanitizer_common/sanitizer_common.h"
				#include "sanitizer_common/sanitizer_mutex.h"
				#include <cstddef>
				#include <cstdint>

				#include "sanitizer_common/sanitizer_internal_defs.h"

				namespace __xray {

				/// The Allocator type hands out fixed-sized chunks of memory that are
				/// cache-line aligned and sized. This is useful for placement of
				/// performance-sensitive data in memory that's frequently accessed. The
				/// allocator also self-limits the peak memory usage to a dynamically defined
				/// maximum.
				///
				/// N is the lower-bound size of the block of memory to return from the
				/// allocation function. N is used to compute the size of a block, which is
				kpw: Can you document the template parameter in the class level comment? Minimum block size?
				/// cache-line-size multiples worth of memory. We compute the size of a block by
				/// determining how many cache lines worth of memory is required to subsume N.
				template <size_t N> struct Allocator {
				kpw: If you document the template parameter in the class comment, you can replace this with something to the effect of "A block is a piece of memory handed out by the Allocator".
				// The Allocator returns memory as Block instances.
				struct Block {
				/// Compute the minimum cache-line size multiple that is >= N.
				static constexpr auto Size =
				kCacheLineSize * ((N / kCacheLineSize) + (N % kCacheLineSize ? 1 : 0));
				void *Data = nullptr;
				};

				private:
				// A BlockLink will contain a fixed number of blocks, each with an identifier
				// to specify whether it's been handed out or not. We keep track of BlockLink
				// iterators, which are basically a pointer to the link and an offset into
				// the fixed set of blocks associated with a link. The iterators are
				// bidirectional.
				kpw: This name collides with the cryptographic immutable content-address data structure. I would try to avoid that. BlockListNode?
				dberris: Renamed to `BlockLink`.
				kpw: BlockLink is a good name. Sorry to be a stickler. ;)
				//
				// We're calling it a "link" in the context of seeing these as a chain of
				// block pointer containers (i.e. links in a chain).
				struct BlockLink {
				static_assert(kCacheLineSize % sizeof(void *) == 0,
				"Cache line size is not divisible by size of void*; none of "
				"the assumptions of the BlockLink will hold.");
				kpw: Could you make this more clear? Maybe even name it BlockPtrsPerCacheLine instead of Size. I had to read it a few times.

				// We compute the number of pointers to the areas in memory that we treat as
				// individual blocks we've allocated. To ensure that instances of the
				// BlockLink object are cache-line sized, we deduct one additional
				// pointer's worth, representing the pointer to the previous link.
				//
				kpw: This comment is out of date since you've cleaned up the Next ptr and the bitmap.
				// This structure corresponds to the following layout:
				kpw: sizeof(Block)? Should be equivalent since Block just wraps void, but could be more clear. I see this type of punning show up in a few other places, so maybe it's alright to be consistent.
				//
				// Blocks [ 0, 1, 2, .., BlockPtrCount - 1]
				//
				static constexpr auto BlockPtrCount =
				(kCacheLineSize / sizeof(Block *)) - 1;

				kpw: Bitmap doesn't exist any more.
				// FIXME: Align this to cache-line address boundaries?
				kpw: If you're punting on the freelist you might be able to squeeze some more data into here. UsedBits can actually be disposed of if you rely on always advancing the Chain when Counter % BlockChain::Size == 0 instead of wrap around detection. Similarly, *Next is never actually used.
				dberris: Good call! Yeah, I kind-of stopped short of cleaning this up further. Definitely something to think about if we ever need the freelist, but not really required at this time.
				Block Blocks[BlockPtrCount]{};
				BlockLink *Prev = nullptr;
				};

				static_assert(sizeof(BlockLink) == kCacheLineSize,
				"BlockLink instances must be cache-line-sized.");

				static BlockLink NullLink;

				// FIXME: Implement a freelist, in case we actually do intend to return memory
				// to the allocator, as opposed to just de-allocating everything in one go?

				size_t MaxMemory;
				kpw: Maybe call this Tail to communicate that it is updated to refer to the node to pull new blocks from and not the head?
				SpinMutex Mutex{};
				BlockLink *Tail = &NullLink;
				size_t Counter = 0;

				BlockLink *NewChainLink() {
				auto NewChain = reinterpret_cast<BlockLink *>(
				InternalAlloc(sizeof(BlockLink), nullptr, kCacheLineSize));
				auto BackingStore = reinterpret_cast<char *>(InternalAlloc(
				BlockLink::BlockPtrCount * Block::Size, nullptr, kCacheLineSize));
				size_t Offset = 0;
				DCHECK_NE(NewChain, nullptr);
				DCHECK_NE(BackingStore, nullptr);
				for (auto &B : NewChain->Blocks) {
				B.Data = BackingStore + Offset;
				Offset += Block::Size;
				}
				NewChain->Prev = Tail;
				kpw: Did you consider having this function accept the tail of the list? Then it could write forward links in addition to backward links. It also seems fishy (at least if you want to do a free list) that this always points back to the Chain member.
				dberris: It could be, but now that we're not actually using the forward links (thanks for catching that), we can simplify this a lot now.
				return NewChain;
				}

				public:
				Allocator(size_t M, size_t PreAllocate) : MaxMemory(M) {
				// FIXME: Implement PreAllocate support!
				}

				Block Allocate() {
				SpinMutexLock Lock(&Mutex);
				// Check whether we're over quota.
				if (Counter * Block::Size >= MaxMemory)
				return {};

				size_t ChainOffset = Counter % BlockLink::BlockPtrCount;

				Block B{};
				BlockLink *Link = Tail;
				if (UNLIKELY(Counter == 0 || ChainOffset == 0))
				kpw (Not Done): I don't think this is a special case now that you've made changes. You should only need two arms in your conditional. In one case, ChainOffset == 0 and you have to allocate a new BlockLink. In the other arm, you can return a Block from the existing BlockLink arena.
				Tail = Link = NewChainLink();
				kpw (Done): I prefer 'auto*' in cases like this.

				B = Link->Blocks[ChainOffset];
				++Counter;
				return B;
				}

				~Allocator() NOEXCEPT {
				kpw (Done): NullLink is static, right? This seems messy, although I know you don't intend to install multiple allocators.
				dberris (author, Not Done): Yeah, NullLink is used as a sentinel value, which simplifies a lot of the operations for determining whether it's the end. It could as easily have been a nullptr though.
				kpw (Not Done): Now that you're not keeping a forward pointer, having a static as the first link isn't problematic.
				// We need to deallocate all the blocks, including the chain links.
				for (auto *C = Tail; C != &NullLink;) {
				// We know that the data block is a large contiguous page, we deallocate
				// that at once.
				InternalFree(C->Blocks[0].Data);
				auto Prev = C->Prev;
				InternalFree(C);
				C = Prev;
				kpw (Done): Pretty sure you meant & instead of |.
				kpw (Done): It looks like both branches of this conditional allocate a new BlockLink.
				}
				}
				}; // namespace __xray

				// Storage for the NullLink sentinel.
				template <size_t N> typename Allocator<N>::BlockLink Allocator<N>::NullLink;

				} // namespace __xray

				kpw (Done): DCHECK that the ChainOffset == 0 and BitMask == 1?
				#endif // XRAY_ALLOCATOR_H
				kpw (Done): auto* C please.
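
The two-arm allocation path that kpw describes above (allocate a new link only when ChainOffset == 0, otherwise hand out the next block from the current link) can be sketched as follows. This is a simplified illustration and not the code under review: `SketchArena`, `Link`, and `BlocksPerLink` are hypothetical names, plain `new`/`delete` stand in for InternalAlloc/InternalFree, and the SpinMutex and cache-line alignment are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical, single-threaded stand-in for the reviewed Allocator.
template <std::size_t BlockSize, std::size_t BlocksPerLink = 4>
class SketchArena {
  // One link owns a contiguous backing store for BlocksPerLink blocks.
  struct Link {
    char *Store = nullptr;
    Link *Prev = nullptr;
  };

  std::size_t MaxMemory;   // Budget in bytes, fixed at construction.
  Link *Tail = nullptr;    // Most recently allocated link.
  std::size_t Counter = 0; // Number of blocks handed out so far.

public:
  explicit SketchArena(std::size_t Max) : MaxMemory(Max) {}
  SketchArena(const SketchArena &) = delete;
  SketchArena &operator=(const SketchArena &) = delete;

  // Returns a pointer to BlockSize bytes, or nullptr when over budget
  // or when the underlying allocation fails.
  void *Allocate() {
    if (Counter * BlockSize >= MaxMemory)
      return nullptr;

    const std::size_t Offset = Counter % BlocksPerLink;
    if (Offset == 0) {
      // First arm: the current link is exhausted (or there is none yet),
      // so chain a new link in front of the old tail.
      Link *L = new (std::nothrow) Link;
      if (L == nullptr)
        return nullptr;
      L->Store = new (std::nothrow) char[BlockSize * BlocksPerLink];
      if (L->Store == nullptr) {
        delete L;
        return nullptr;
      }
      L->Prev = Tail;
      Tail = L;
    }
    // Second arm (and fall-through): carve the block out of the tail link.
    ++Counter;
    return Tail->Store + Offset * BlockSize;
  }

  // Walk the backward links and release every backing store and link.
  ~SketchArena() {
    while (Tail != nullptr) {
      Link *Prev = Tail->Prev;
      delete[] Tail->Store;
      delete Tail;
      Tail = Prev;
    }
  }
};
```

With a budget of six 64-byte blocks, the first four calls to `Allocate()` are served from one link, the fifth triggers a second link, and the seventh returns `nullptr`. Using `nullptr` instead of a static sentinel here is only a simplification; the review above settles on keeping NullLink.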

compiler-rt/lib/xray/xray_segmented_array.h

This file was added.

				//===-- xray_segmented_array.h ---------------------------------*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file is a part of XRay, a dynamic runtime instrumentation system.
				//
				// Defines the implementation of a segmented array, with fixed-size chunks
				// backing the segments.
				//
				//===----------------------------------------------------------------------===//
				#ifndef XRAY_SEGMENTED_ARRAY_H
				#define XRAY_SEGMENTED_ARRAY_H

				#include "sanitizer_common/sanitizer_allocator.h"
				#include "xray_allocator.h"
				#include <type_traits>
				#include <utility>

				namespace __xray {

				namespace {

				constexpr size_t gcd(size_t a, size_t b) {
				return (b == 0) ? a : gcd(b, a % b);
				}

				constexpr size_t lcm(size_t a, size_t b) { return a * b / gcd(a, b); }

				} // namespace

				/// The Array type provides an interface similar to std::vector<...> but does
				kpw (Not Done): For the N default value, you can't pack an entire T if the remaining bytes are non-zero. I think it can be simplified to kCacheLineSize / sizeof(T).
				dberris (author, Not Done): Here's the problem with that: if sizeof(T) > kCacheLineSize, then N must not be 0. Essentially we want to figure out how many T's will fit in blocks that are cache-line size multiples. The computation I wanted to express is something like this: given sizeof(T), how many T's can I fit in cache-line _multiple_ sized blocks if each cache line is kCacheLineSize bytes? For a cache line of 64 bytes, and T being 16 bytes for example, it seems that we can fit 4 T's just fine. But for something with 24 bytes, we can only fit 2 in one cache line, but if we asked for 8 then we can put them in three cache lines. It might not be great without alignment information (i.e. we really should be treating the 24-byte object as a 32-byte object) but that's something that can be fixed by the definition of T. Thinking this through, we really should be using the least common multiple of sizeof(T) and kCacheLineSize, and divide that by sizeof(T) to figure out how many elements to store in cache-line-multiple-sized blocks for the default N. I've updated the implementation to do just that. :)
				/// not shrink in size. Once constructed, elements can be appended but cannot be
				/// removed. The implementation is heavily dependent on the contract provided by
				/// the Allocator type, in that all memory will be released when the Allocator
				/// is destroyed. When an Array is destroyed, it will destroy elements in the
				/// backing store but will not free the memory. The parameter N defines how many
				/// elements of T there should be in a single block.
				///
				/// We compute the least common multiple of the size of T and the cache line
				/// size, to allow us to maximise the number of T objects we can place in
				/// cache-line multiple sized blocks. To get back the number of T's, we divide
				/// this least common multiple by the size of T.
				template <class T, size_t N = lcm(sizeof(T), kCacheLineSize) / sizeof(T)>
				kpw (Done): Did you consider the tradeoffs of these being external to Block versus invasive to the Block?
				dberris (author, Not Done): I did think about using the Block storage as an opaque set of bytes and intrusively putting the pointers in there too (à la an intrusive list). I think that makes the implementation much more bug-prone, and it made my head hurt just trying to keep track of the aliasing and alignment rules: where the pointers should be, whether the addresses we're getting are properly aligned, etc. So I punted and left it this way instead. :) Putting a TODO to explore that possibility in the future.
				struct Array {
				static constexpr size_t AllocatorChunkSize = sizeof(T) * N;
				using AllocatorType = Allocator<AllocatorChunkSize>;
				static_assert(std::is_trivially_destructible<T>::value,
				"T must be trivially destructible.");

				private:
				// TODO: Consider co-locating the chunk information with the data in the
				// Block, as in an intrusive list -- i.e. putting the next and previous
				// pointer values inside the Block storage.
				struct Chunk {
				typename AllocatorType::Block Block;
				static constexpr size_t Size = N;
				Chunk *Prev = nullptr;
				Chunk *Next = nullptr;
				};

				static Chunk SentinelChunk;

				AllocatorType *Allocator;
				Chunk *Head = &SentinelChunk;
				Chunk *Tail = &SentinelChunk;
				size_t Size = 0;

				Chunk *NewChunk() {
				auto Block = Allocator->Allocate();
				if (Block.Data == nullptr)
				return nullptr;
				// TODO: Maybe use a separate managed allocator for Chunk instances?
				auto C = reinterpret_cast<Chunk *>(InternalAlloc(sizeof(Chunk)));
				if (C == nullptr)
				return nullptr;
				C->Block = Block;
				return C;
				}

				static AllocatorType &GetGlobalAllocator() {
				static AllocatorType *const GlobalAllocator = [] {
				AllocatorType *A = reinterpret_cast<AllocatorType *>(
				InternalAlloc(sizeof(AllocatorType)));
				new (A) AllocatorType(2 << 10, 0);
				return A;
				}();

				return *GlobalAllocator;
				}

				Chunk *InitHeadAndTail() {
				DCHECK_EQ(Head, &SentinelChunk);
				DCHECK_EQ(Tail, &SentinelChunk);
				auto Chunk = NewChunk();
				if (Chunk == nullptr)
				return nullptr;
				Chunk->Prev = &SentinelChunk;
				Chunk->Next = &SentinelChunk;
				Head = Chunk;
				Tail = Chunk;
				kpw (Done): Can you call out which iterator category this can be used as, even though it doesn't satisfy the complete concept?
				return Chunk;
				}

				Chunk *AppendNewChunk() {
				auto Chunk = NewChunk();
				kpw (Done): O -> Offset, otherwise it kind of looked like a default zero in the second parameter.
				if (Chunk == nullptr)
				return nullptr;
				Tail->Next = Chunk;
				Chunk->Prev = Tail;
				Chunk->Next = &SentinelChunk;
				Tail = Chunk;
				return Chunk;
				}

				// This Iterator models a BidirectionalIterator.
				class Iterator {
				Chunk *C = nullptr;
				size_t Offset = 0;

				public:
				kpw (Done): Shouldn't you check against the sentinel chunk instead (here and above)?
				Iterator(Chunk *IC, size_t Off) : C(IC), Offset(Off) {}

				Iterator &operator++() {
				DCHECK_NE(C, &SentinelChunk);
				if (++Offset % N)
				return *this;

				kpw (Done): This doesn't seem right. There is still a value at offset 0 within the current chunk when going backward. What you want to check is whether Offset ends up being in the N - 1 congruence class after decrement.
				dberris (author, Not Done): Oh my, yes you're right! Good catch, thanks.
				// At this point, we know that Offset % N == 0, so we must advance the
				// chunk pointer.
				DCHECK_EQ(Offset % N, 0);
				C = C->Next;
				return *this;
				}

				Iterator &operator--() {
				DCHECK_NE(C, &SentinelChunk);
				kpw (Done): Can't you just invoke operator++() here?
				DCHECK_GT(Offset, 0);

				// We check whether the offset was on a boundary before decrement, to see
				// whether we need to retreat to the previous chunk.
				if ((Offset-- % N) == 0)
				C = C->Prev;
				return *this;
				}
				kpw (Done): Similarly reuse operator--() code?

				Iterator operator++(int) {
				Iterator Copy(*this);
				++(*this);
				return Copy;
				kpw (Done): This is fine, but will a compiler generate a memcmp equivalent?
				dberris (author, Not Done): I don't think `memcmp` is actually what we want, but I'm happy for the compiler to do it for me if it can.
				}

				Iterator operator--(int) {
				Iterator Copy(*this);
				--(*this);
				return Copy;
				}

				friend bool operator==(const Iterator &L, const Iterator &R) {
				return L.C == R.C && L.Offset == R.Offset;
				}

				friend bool operator!=(const Iterator &L, const Iterator &R) {
				return !(L == R);
				}

				const T &operator*() const {
				DCHECK_NE(C, &SentinelChunk);
				return reinterpret_cast<const T *>(C->Block.Data)[Offset % N];
				kpw (Done): Haven't looked past here yet. Leaving a note so I know where to come back to.
				}

				T &operator*() {
				DCHECK_NE(C, &SentinelChunk);
				return reinterpret_cast<T *>(C->Block.Data)[Offset % N];
				}

				T *operator->() {
				return reinterpret_cast<T *>(C->Block.Data) + (Offset % N);
				}

				const T *operator->() const {
				return reinterpret_cast<const T *>(C->Block.Data) + (Offset % N);
				}
				};

				public:
				explicit Array(AllocatorType &A) : Allocator(&A) {}
				Array() : Array(GetGlobalAllocator()) {}

				Array(const Array &) = delete;
				Array(Array &&O) NOEXCEPT : Allocator(O.Allocator),
				Head(O.Head),
				Tail(O.Tail),
				Size(O.Size) {
				O.Head = &SentinelChunk;
				O.Tail = &SentinelChunk;
				O.Size = 0;
				}

				bool empty() const { return Size == 0; }

				AllocatorType &allocator() const {
				DCHECK_NE(Allocator, nullptr);
				return *Allocator;
				}

				size_t size() const { return Size; }

				T *Append(const T &E) {
				if (UNLIKELY(Head == &SentinelChunk))
				if (InitHeadAndTail() == nullptr)
				return nullptr;

				auto Offset = Size % N;
				if (UNLIKELY(Size != 0 && Offset == 0))
				if (AppendNewChunk() == nullptr)
				return nullptr;

				auto Position = reinterpret_cast<T *>(Tail->Block.Data) + Offset;
				*Position = E;
				++Size;
				return Position;
				}

				template <class... Args> T *AppendEmplace(Args &&... args) {
				if (UNLIKELY(Head == &SentinelChunk))
				if (InitHeadAndTail() == nullptr)
				return nullptr;

				auto Offset = Size % N;
				if (UNLIKELY(Size != 0 && Offset == 0))
				if (AppendNewChunk() == nullptr)
				return nullptr;

				auto Position = reinterpret_cast<T *>(Tail->Block.Data) + Offset;
				// In-place construct at Position.
				new (Position) T(std::forward<Args>(args)...);
				++Size;
				return Position;
				}

				T &operator[](size_t Offset) const {
				DCHECK_LE(Offset, Size);
				// We need to traverse the array enough times to find the element at Offset.
				auto C = Head;
				while (Offset >= N) {
				C = C->Next;
				Offset -= N;
				DCHECK_NE(C, &SentinelChunk);
				}
				auto Position = reinterpret_cast<T *>(C->Block.Data) + Offset;
				return *Position;
				}

				T &front() const {
				DCHECK_NE(Head, &SentinelChunk);
				DCHECK_NE(Size, 0u);
				return *reinterpret_cast<T *>(Head->Block.Data);
				}

				T &back() const {
				DCHECK_NE(Tail, &SentinelChunk);
				auto Offset = (Size - 1) % N;
				return *(reinterpret_cast<T *>(Tail->Block.Data) + Offset);
				}

				template <class Predicate> T *find_element(Predicate P) const {
				if (empty())
				return nullptr;

				auto E = end();
				for (auto I = begin(); I != E; ++I)
				if (P(*I))
				return &(*I);

				return nullptr;
				}

				/// Remove N Elements from the end.
				void trim(size_t Elements) {
				// TODO: Figure out whether we need to actually destroy the objects that are
				// going to be trimmed -- determine whether assuming that trivially
				// destructible objects are OK to discard, from the XRay implementation.
				DCHECK_LE(Elements, Size);
				Size -= Elements;
				}
				kpw (Done): You want to reuse the allocated chunks when growing the list rather than allocate new ones, right?
				dberris (author, Not Done): Yes, that's correct. Good call, fixed.

				// Provide iterators.
				Iterator begin() const { return Iterator(Head, 0); }
				Iterator end() const { return Iterator(Tail, Size); }
				Iterator cbegin() const { return Iterator(Head, 0); }
				Iterator cend() const { return Iterator(Tail, Size); }
				kpw (Done): Isn't it peculiar that these don't return a const type? Will this be a problem in practice?
				dberris (author, Not Done): In practice, for our use-case, it shouldn't be a problem, although ideally the type of the Iterator should be different for the non-const and const case. For us here, it shouldn't matter (yet). The way to fix this would be to make the Iterator type itself a template, so that we can determine what the result of the dereference and arrow operators are, whether they are `const T` or `T`. I've made the change to make it more "technically correct", which is the best kind of correct. :)
				};

				template <class T, size_t N>
				typename Array<T, N>::Chunk Array<T, N>::SentinelChunk;
				kpw (Done): Can you explain this statement please? My template foo is a bit rusty. Is this a templated member definition, and which version of the standard does it require? LLVM is still on C++11, IIUC.
				dberris (author, Not Done): So, what happens here is we're telling the compiler that if the type `Array<T, N>` is ever instantiated in a translation unit, we're going to get storage for the static data member. This is because `SentinelChunk` is a static data member of the `Array<T,N>` type, and it needs static (program duration) storage to be defined across translation units. This is how you spell this in C++98, AFAICT.
				kpw (Not Done): Oh. I see. This is outside of the Array class, so it's an out-of-line static member definition with default initialization. The templates obscure that a bit, but I get it now.
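
For readers who share kpw's question, here is a minimal sketch of the same pattern outside of XRay: a class template declares a static data member, and a templated out-of-line definition provides its storage. `SketchList` and `Node` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class template with a per-instantiation sentinel node.
template <class T, std::size_t N> struct SketchList {
  struct Node {
    Node *Prev = nullptr;
    Node *Next = nullptr;
  };

  // Declaration only: this does not reserve storage.
  static Node Sentinel;

  Node *Head = &Sentinel;

  bool empty() const { return Head == &Sentinel; }
};

// Out-of-line definition of the static data member. Each distinct
// instantiation SketchList<T, N> gets exactly one Sentinel object,
// even if this header is included from several translation units.
template <class T, std::size_t N>
typename SketchList<T, N>::Node SketchList<T, N>::Sentinel;
```

Two `SketchList<int, 4>` objects share one sentinel, while `SketchList<char, 4>` gets its own, which is why the sentinel comparison in the reviewed code is only meaningful between arrays of the same `Array<T, N>` type.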

				} // namespace __xray

				#endif // XRAY_SEGMENTED_ARRAY_H
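
The default value of N discussed in the review, lcm(sizeof(T), kCacheLineSize) / sizeof(T), can be checked with a small compile-time sketch. The 64-byte cache line is an assumption taken from the example in the review thread; `kAssumedCacheLineSize` and `DefaultElementsPerSegment` are hypothetical names, not part of the patch.

```cpp
#include <cassert>
#include <cstddef>

// Assumed cache line size for illustration; the patch uses kCacheLineSize.
constexpr std::size_t kAssumedCacheLineSize = 64;

constexpr std::size_t Gcd(std::size_t A, std::size_t B) {
  return B == 0 ? A : Gcd(B, A % B);
}

constexpr std::size_t Lcm(std::size_t A, std::size_t B) {
  return A / Gcd(A, B) * B;
}

// Number of elements of ElemSize bytes that exactly fill a whole number
// of cache lines.
constexpr std::size_t DefaultElementsPerSegment(std::size_t ElemSize) {
  return Lcm(ElemSize, kAssumedCacheLineSize) / ElemSize;
}

// 16-byte elements: four fit in one 64-byte line.
static_assert(DefaultElementsPerSegment(16) == 4, "16-byte case");
// 24-byte elements: eight fill three lines (192 bytes) with no padding.
static_assert(DefaultElementsPerSegment(24) == 8, "24-byte case");
// Elements larger than a cache line still yield a non-zero N.
static_assert(DefaultElementsPerSegment(128) == 1, "128-byte case");
```

The last assertion is the case dberris raises against kCacheLineSize / sizeof(T), which would evaluate to 0 for a 128-byte element.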