This is an archive of the discontinued LLVM Phabricator instance.

[compiler-rt][profiling] Add an incremental buffer writing mode to libprofile

Authored by paquette on Apr 15 2023, 12:44 AM.



In environments with tight memory constraints, users may not be able to actually allocate a buffer large enough to fit all of their program's profiling data.

In this case, a workflow like this would be helpful:

while (!__llvm_profile_incremental_writer_done()) {
  char Buffer[BUFFER_SIZE];
  if (__llvm_profile_write_buffer_incremental(Buffer, BUFFER_SIZE) == -1)
    return -1;
  if (putBufferSomewhere(Buffer, BUFFER_SIZE))
    return -1;

That is, make a smaller buffer, fill it with BUFFER_SIZE bytes of data, transfer that somewhere else, and repeat the process until all of the profiling data has been transferred out.

This patch adds the state and APIs necessary to make the above workflow possible.

The added APIs are as follows:

  • int __llvm_profile_init_incremental_writer_state(void) - Initializes internal structures to remember where we are in the process of writing profiling data
  • __llvm_profile_write_buffer_incremental(char *Buffer, uint64_t WriteSizeInBytes) - Writes a certain number of bytes into a buffer
  • int __llvm_profile_incremental_writer_done(void) - Returns whether or not all profiling bytes have been written

The added state is a struct (IncrementalProfileWriterState) that keeps track of

  • How many sources we need to write data from
  • What those sources are
  • Which source we're writing from, and where in it we should write from

The struct reuses the existing ProfDataIOVec structure used throughout the rest of libprofile.

Currently, binary ID writing is not supported, since it seems very platform-specific. This will emit a warning on platforms that enable binary ID writing.

This patch also adds a couple refactors/error message improvements:

  • Factor out some of the header construction code in lprofWriteDataImpl
  • Add a PROF_ERR_INTERNAL to communicate when an error is likely not a user error, but rather a bug of some sort

Diff Detail

Event Timeline

paquette created this revision.Apr 15 2023, 12:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 15 2023, 12:44 AM
paquette requested review of this revision.Apr 15 2023, 12:44 AM
paquette updated this revision to Diff 515923.Apr 21 2023, 2:43 PM
paquette edited the summary of this revision. (Show Details)
  • Rebase
  • Don't assume dst buffer is 0 filled; use memset for padding
  • Make __llvm_profile_write_buffer_incremental return the number of bytes actually written on success. -1 is still the error.
aemerson added inline comments.Apr 27 2023, 9:56 AM

I think this can just take a pointer to ProfDataIOVec? It's a small struct but seems unnecessary to copy since the call site is already dealing with a ptr.

paquette updated this revision to Diff 518561.May 1 2023, 2:10 PM

Make isPadding take a pointer.

aemerson accepted this revision.May 8 2023, 3:13 PM

I'm not sure I'm the best person to approve but so far LGTM from my shallow understanding.

This revision is now accepted and ready to land.May 8 2023, 3:13 PM
gulfem added a comment.May 8 2023, 3:29 PM

I did not have a change to finish reviewing this patch, but it breaks binary-id.c and binary-id-lookup.c tests that you can see in the pre-merge checks.

LLVM Profile Error: Failed to write file "/var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/profile/Profile-x86_64/Linux/Output/binary-id.c.tmp.profraw": File exists
error: /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/profile/Profile-x86_64/Linux/Output/binary-id.c.tmp.profraw: invalid instrumentation profile data (file header is corrupt)

tellenbach added inline comments.Jul 9 2023, 12:07 AM

Seems like this function is missing a return in case the branch is not taken. This is UB.

jroelofs added inline comments.Jul 9 2023, 10:49 AM

Should this be a uint64_t, since the struct's NumBytesWritten has that type? Likewise, shouldn't the return type be changed to match?


I think the comment should elaborate a little on why this isn't supported yet, what makes it tricky or a lot of work or whatever.

phosek added a comment.Jul 9 2023, 2:55 PM

The runtime memory overhead of profile instrumentation comes from the different allocated sections: __llvm_prf_cnts, __llvm_prf_data, __llvm_prf_vnds, __llvm_prf_names; let's call all of these together A. When using the buffer mode, we'll need a buffer to hold all the profile data, let's call it B. The total memory overhead is therefore A+B. This change replaces B with a fixed size buffer k, so the memory overhead is going to decrease to A+k.

My concern is that this change adds a lot of complexity to the profile runtime, which already suffers from poor code quality and is in a dire need of a substantial rewrite/refactoring, but the impact is somewhat limited because this change does nothing to reduce the size of A.

Have you considered alternatives such as using instrumentation function groups? For example, using -fprofile-function-groups=4 would reduce the memory overhead to (A+B)/4 which should be strictly better than A+k. Would that work for your use case?

Another direction we discussed for reducing the memory overhead is making __llvm_prf_data, __llvm_prf_vnds, __llvm_prf_names non-allocated. Our observation is that the only thing we really need in the profile are the header and the counters, everything else can be recovered from the binary (which becomes easier if the profile has binary ID and you can use debuginfod to retrieve the binary). That's direction we'd like to pursue to support profiling in embedded environments.


This struct is fairly large and would be always allocated, even if we aren't using the incremental mode unless it can be GC'ed by the linker (which requires compiling with GC enabled which isn't enabled by default on all platforms).

Could we move all the symbols related to the incremental mode into a separate TU so if the incremental mode isn't being used, those symbols won't be pulled into the link in in the first place?


I think we might need to rethink the error reporting. The callees will already print their own error message, but so will the callers, so we'll end up with reporting the error multiple times.

I think we should either omit the error message here, and rely on callees to do the error reporting, or change the callees to return a return code and do all the error reporting here (but not both).


Would it be possible to split this change and land it separately?


Since this is C and not C++, do you know if all compilers supported by compiler-rt perform RVO in C? It might be safer to use an output parameter here to avoid a copy.


Could this be an assert?


Could this be an assert?


Even though Binary ID is only supported for ELF, we have a plan for how to support it for COFF and Mach-O as well, it just hasn't been implemented yet.

Do you have an idea for how we could support Binary ID with incremental mode? I'm a little wary about introducing features that are mutually incompatible without some long term plan for resolving those incompatibilities.