This is an archive of the discontinued LLVM Phabricator instance.

[XRay][compiler-rt] Avoid InternalAlloc(...) in Profiling Mode
Closed Public

Authored by dberris on Aug 15 2018, 8:10 AM.

Details

Summary

We avoid using dynamic memory allocated with the internal allocator in
the profile collection service used by profiling mode. We use aligned
storage for globals and in-struct storage of objects we dynamically
initialize.

We also remove the dependency on Vector<...>, which internally uses the
dynamic allocator in sanitizer_common (InternalAlloc), in favour of the
XRay allocator and segmented array implementation.

This change addresses llvm.org/PR38577.

Event Timeline

dberris created this revision. Aug 15 2018, 8:10 AM
dberris updated this revision to Diff 160822. Aug 15 2018, 9:02 AM
dberris edited the summary of this revision.

Remove dependency on Vector<...> as well.

dberris updated this revision to Diff 160830. Aug 15 2018, 9:25 AM

Update to add correct includes for Array<...> and Allocator<...>.

eizan added a comment. Aug 16 2018, 5:25 AM

I'm trying to understand the memory allocation/deallocation dynamics of this module. It looks like all of the Array<...> objects only have their memory allocation increase because the XRay segmented array doesn't support having its allocation shrink. However, the memory buffers that each cell in ProfileBuffers points to do get deallocated inside serialize(). Is this correct? What makes these buffers special, such that they should be deallocated rather than kept around for reuse later like all the other memory?

compiler-rt/lib/xray/xray_profile_collector.cc
63

Do you mean dynamic memory allocation? Why does alignment affect whether it's done?

83

fd argument should be set to -1

216

suggest keeping the comments, as this one went away, but the "then repopulate..." stayed.

dberris updated this revision to Diff 161028. Aug 16 2018, 7:45 AM
dberris marked 3 inline comments as done.

Address comments by eizan@.

I'm trying to understand the memory allocation/deallocation dynamics of this module. It looks like all of the Array<...> objects only have their memory allocation increase because the XRay segmented array doesn't support having its allocation shrink.

The Array<...> objects only grow, yes, but we're able to trim them and re-use the memory that's in the internal freelist for the array segments.
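As a rough illustration of the trim-and-reuse behaviour described above, here is a minimal sketch of a segmented container whose trimmed segments go onto a freelist and are handed out again before any fresh segment is taken. It is not the XRay Array<...> implementation: the class name, the fixed backing pool, and the int element type are all assumptions made to keep the example small.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch, not the XRay Array<...>: a grow-only segmented
// container over a fixed pool, with a freelist for trimmed segments.
template <std::size_t SegmentSize, std::size_t MaxSegments>
class IntSegments {
  struct Segment {
    Segment *Prev;          // previous live segment, or next free segment
    int Items[SegmentSize];
  };

  Segment Pool[MaxSegments]; // backing storage, never returned while in use
  std::size_t PoolUsed = 0;  // number of fresh segments handed out so far
  Segment *Tail = nullptr;   // last live segment
  Segment *Freelist = nullptr;
  std::size_t Size = 0;

  Segment *newSegment() {
    // Prefer a previously trimmed segment over fresh storage.
    if (Freelist != nullptr) {
      Segment *S = Freelist;
      Freelist = S->Prev;
      return S;
    }
    if (PoolUsed == MaxSegments)
      return nullptr;
    return &Pool[PoolUsed++];
  }

public:
  bool append(int V) {
    std::size_t Offset = Size % SegmentSize;
    if (Offset == 0) { // tail is full (or there is no tail yet)
      Segment *S = newSegment();
      if (S == nullptr)
        return false;
      S->Prev = Tail;
      Tail = S;
    }
    Tail->Items[Offset] = V;
    ++Size;
    return true;
  }

  // Drop up to Count elements from the end; emptied segments move to the
  // freelist instead of being released.
  void trim(std::size_t Count) {
    while (Count > 0 && Size > 0) {
      --Size;
      --Count;
      if (Size % SegmentSize == 0) { // tail segment is now empty
        Segment *S = Tail;
        Tail = S->Prev;
        S->Prev = Freelist;
        Freelist = S;
      }
    }
  }

  std::size_t size() const { return Size; }
  std::size_t freshSegments() const { return PoolUsed; }
};
```

The point of the sketch is that freshSegments() does not increase when the container regrows after a trim, which is the sense in which the storage only grows but is still reused.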

However, the memory buffers that each cell in ProfileBuffers points to do get deallocated inside serialize(). Is this correct? What makes these buffers special, such that they should be deallocated rather than kept around for reuse later like all the other memory?

The actual buffers are obtained through mmap, and they are not fixed-size -- the size of each buffer depends on how large the serialised version of the function call tries will be. We can't re-use these buffers across multiple profile collection sessions. The ProfileBuffers array hosts structs that are fixed-size (each is a pointer and a size).
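To make the buffer lifetime concrete, here is a minimal sketch of mapping a buffer of exactly the serialised size and unmapping it afterwards, using plain POSIX mmap/munmap with MAP_ANONYMOUS. The struct and function names are hypothetical, and the actual patch may go through different wrappers. The fd argument is -1, as the inline review comment suggests for anonymous mappings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sys/mman.h>

// Hypothetical fixed-size record: a pointer and a size, as described above.
struct ProfileBuffer {
  void *Data;
  std::size_t Size;
};

// Map an anonymous, private, read-write region of exactly Size bytes
// (the kernel rounds the mapping up to whole pages internally).
ProfileBuffer mapBuffer(std::size_t Size) {
  void *P = mmap(nullptr, Size, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, /*fd=*/-1, /*offset=*/0);
  if (P == MAP_FAILED)
    return {nullptr, 0};
  return {P, Size};
}

// Return the region to the system and clear the record.
void unmapBuffer(ProfileBuffer &B) {
  if (B.Data != nullptr)
    munmap(B.Data, B.Size);
  B = {nullptr, 0};
}
```

Because each buffer is sized for one particular serialised profile, unmapping it is simpler than trying to keep it around for a later session whose profile may need a different size.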

Note that in the reset() function, we destroy the allocators and re-initialize them (through placement new). The static storage for the allocators and the arrays gets effectively re-used, without having to reach for memory from the heap (all of the storage for the Array<...> instances will be obtained through the Allocator<...> instances). If you look at Allocator<...>, the destructor will return the memory to the system as well.
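The destroy-then-reconstruct step can be sketched as follows. This is only an illustration of the pattern, assuming a hypothetical ArenaAllocator type and resetAllocator function; the real Allocator<...> has a different interface and its destructor actually returns memory to the system.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for an allocator with a memory budget.
struct ArenaAllocator {
  explicit ArenaAllocator(std::size_t Max) : MaxMemory(Max) {}
  // A real allocator would return its memory to the system here; this
  // sketch only tracks a byte count.
  ~ArenaAllocator() { Allocated = 0; }

  bool allocate(std::size_t N) {
    if (Allocated + N > MaxMemory)
      return false;
    Allocated += N;
    return true;
  }

  std::size_t MaxMemory;
  std::size_t Allocated = 0;
};

// Static, suitably aligned raw storage; no heap allocation is involved.
alignas(ArenaAllocator) static unsigned char
    AllocatorStorage[sizeof(ArenaAllocator)];
static ArenaAllocator *GlobalAllocator = nullptr;

// Destroy the current allocator (if any) and construct a new one in the
// same static storage through placement new.
void resetAllocator(std::size_t Max) {
  if (GlobalAllocator != nullptr)
    GlobalAllocator->~ArenaAllocator();
  GlobalAllocator = new (AllocatorStorage) ArenaAllocator(Max);
}
```

After a reset the allocator lives at the same address as before, with fresh state, so repeated resets do not consume additional storage.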

compiler-rt/lib/xray/xray_profile_collector.cc
63

I meant dynamic initialisation. The alignment is important so that the pointers we get when we reinterpret-cast are suitably aligned for the objects we construct in that storage. We need that for the placement new calls to have well-defined semantics.

We need this to be global program-duration (static) storage only, because we want to avoid relying on C++ ABI functions for registering dynamic initialisation and de-initialisation (constructor and destructor) routines.
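A minimal sketch of the idea, with hypothetical names (Pool, initPool, destroyPool) that do not come from the patch: the global is a raw byte array with the alignment of the object type, so it needs no dynamic initialiser or registered destructor, and the object is constructed and destroyed explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical type with a non-trivial constructor.
struct Pool {
  explicit Pool(std::size_t Max) : MaxBytes(Max) {}
  std::size_t MaxBytes;
  std::size_t Used = 0;
};

// Zero-initialised static storage: a plain byte array has no constructor
// to run at startup and no destructor to register for exit. alignas makes
// the bytes suitably aligned for a Pool.
alignas(Pool) static unsigned char PoolStorage[sizeof(Pool)];
static Pool *GlobalPool = nullptr;

// Construct the object on demand inside the static storage.
Pool *initPool(std::size_t Max) {
  GlobalPool = new (PoolStorage) Pool(Max);
  return GlobalPool;
}

// Explicitly run the destructor; the storage itself stays in place.
void destroyPool() {
  if (GlobalPool != nullptr) {
    GlobalPool->~Pool();
    GlobalPool = nullptr;
  }
}
```

Declaring a global Pool object directly would instead require the compiler to emit a dynamic initialiser, which is the dependency the comment above wants to avoid.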

eizan added a comment. Aug 16 2018, 5:57 PM

When/how often does serialize() get called during runtime? Should we be worried about making a number of munmap/mmap system calls while the task being profiled is running?

This is controlled by calls to __xray_log_finalize(...), which only happens on demand and when the process is shutting down. I have no reason to believe that making a number of mmap/munmap syscalls in the normal course of serializing profiles is a huge concern, because they're not in the critical path.

eizan accepted this revision. Aug 16 2018, 6:48 PM
This revision is now accepted and ready to land. Aug 16 2018, 6:48 PM
This revision was automatically updated to reflect the committed changes.