This is an archive of the discontinued LLVM Phabricator instance.

[memprof] Add a raw binary format to serialize memprof profiles.
ClosedPublic

Authored by snehasish on Nov 5 2021, 1:06 PM.

Details

Summary

This change implements the raw binary format discussed in
https://lists.llvm.org/pipermail/llvm-dev/2021-September/153007.html

Summary of changes

  • Add a new memprof option to choose binary or text (default) format.
  • Add a rawprofile library which serializes the MIB map to profile.
  • Add a unit test for rawprofile.
  • Mark sanitizer procmaps methods as virtual to be able to mock them.
  • Extend memprof_profile_dump regression test.

Diff Detail

Event Timeline

snehasish created this revision.Nov 5 2021, 1:06 PM
snehasish requested review of this revision.Nov 5 2021, 1:06 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 5 2021, 1:06 PM
Herald added a subscriber: Restricted Project. · View Herald Transcript

This is raw format only change, right? Probably update the summary to reflect that.

snehasish updated this revision to Diff 385192.Nov 5 2021, 2:13 PM

Update description, rebase on stacked changes.

snehasish retitled this revision from [memprof] Add a binary format for serialization of Memprof profiles. to [memprof] Add a raw binary format to serialize memprof profiles..Nov 5 2021, 2:19 PM
snehasish edited the summary of this revision. (Show Details)
snehasish edited the summary of this revision. (Show Details)
davidxl added inline comments.Nov 5 2021, 10:48 PM
compiler-rt/lib/memprof/memprof_allocator.cpp
260

Document 'true' argument with comment. Also should it be controlled by a flag?

266

Add a comment for the function.

compiler-rt/lib/memprof/memprof_flags.inc
38

should it be false by default?

compiler-rt/lib/memprof/memprof_rawprofile.cpp
18

Does it need to be packed?

26

is the assertion supposed be on?

64

Add as function comments about the stack section layout.

compiler-rt/test/memprof/TestCases/memprof_profile_dump.cpp
28

This depends on endianess.

28

Should more info be checked?

tejohnson added inline comments.Nov 8 2021, 10:54 AM
compiler-rt/lib/memprof/memprof_allocator.cpp
17

Is this change related to this patch?

232

We're now printing this twice afaict (here and in FinishAndWrite) - is this intentional?

293

In D111676 it was suggested to move the unlock until after the ForEach PrintCallback to address a race, but this effectively removes that change. Intentional?

compiler-rt/lib/memprof/memprof_rawprofile.cpp
65

Comment about the first value being counted here.

95

Nit: suggest reversing the above 2 lines, logic flows better.

98

Why multiply by sizeof(Count) and not just directly sizeof(uptr) or sizeof(u64)? It seems a little odd to use the size of the Count variable, which happens to be u64 but isn't necessarily related to the size of the pc values.

106

Suggest being more explicit about what value you are talking about. I.e. you don't touch the provided MIB value. Does it need to be marked as unused to avoid warnings (looks like there is an UNUSED macro defined in sanitizer_internal_defs.h that you could use).

113

Comment about first value being counted here.

117

Comment about u64 value being written with each MIB.

187

How much extra memory do we carry by essentially duplicating the necessary allocations (e.g. we have allocated memory for the segment/mibinfo/stack already, then allocate it a second time here)? Since the 3 functions that allocate these individual buffers have 2 parts to them (compute # bytes, then write bytes), should we consider splitting each of those into 2 separate functions, call the computation parts first, then do a single allocation here, then write the individual components into this Buffer directly?

compiler-rt/test/memprof/TestCases/interface_test.cpp
5 ↗(On Diff #385192)

Seems like an unrelated fix - can you just commit it NFC separately and remove from here?

snehasish updated this revision to Diff 385639.Nov 8 2021, 2:57 PM
snehasish marked 8 inline comments as done.
snehasish edited the summary of this revision. (Show Details)

Address review comments.

Also

  • Update test to use od instead of xxd since its not available on debian buildbots.
  • Some smaller changes to address warnings.

Thanks for the prompt review! PTAL.

compiler-rt/lib/memprof/memprof_allocator.cpp
17

No, reverted.

232

No, its leftover from a merge. Reverted.

260

Done. I think setting it to true is more conservative - forces an update of the cache and then reads the proc maps again to check if anything was missed (i.e. due to concurrent updates). Setting it to false could be faster but the gains are probably negligible so I think hard-coding the value is fine.

293

Not intentional. Thinking about this some more - the deallocation path is guarded by the destructing var - so print/merge conflict should not happen. Thus we can remove the locks from this path altogether. I updated the code - converted destructing and constructed to atomics and removed the ForceLock/ForceUnlock calls since they are not used anywhere else. Wdyt?

compiler-rt/lib/memprof/memprof_flags.inc
38

Not planning on making it the default in this change since it would mean updating the tests which check the printed out output. I plan on adding this flag to the tests in compiler-rt and making binary the default in a future patch.

compiler-rt/lib/memprof/memprof_rawprofile.cpp
18

Yes, it is packed to make it easy to serialize/deserialize. I plan on reusing this typedef in the llvm/ProfileData reader code by moving it to the shared include header in the future.

26

Removed commented code. It would be nice to have this static check but we don't have the c++ standard library here. There are some sanitizer defined trait checks but is_pod isn't implemented.

64

Added a comment - the contents are the same as the format description in the SerializeMIBInfo function. Let me know if you think I should clarify further.

187

I started with an approach which precomputed the necessary space and performed a single allocation but then abandoned the approach in favour of the current implementation which IMO is easier to read (and test individually though I didn't take advantage of this fact).

For a clang workload (eg ~35K contexts), the overhead would be ~ equal to the size of the raw profile on disk, the largest I observed was ~5M. For a larger internal workload with ~1M observed contexts, I expect the overhead to be ~100M. I don't have a strong opinion and I can rewrite this patch to precompute and only allocate once if you would like.

compiler-rt/test/memprof/TestCases/interface_test.cpp
5 ↗(On Diff #385192)

Pushed in rG9305e3b6d7e7

compiler-rt/test/memprof/TestCases/memprof_profile_dump.cpp
28

This depends on endianess.

Yes. The sanitizer tests are only enabled on x86_64 [1]. I don't think we intend to support memprof on big endian arch. So I think we can leave this as is?

Should more info be checked?

I have a more robust test for contents in the unit test (compiler-rt/lib/memprof/tests/rawprofile.cpp). Once we have a llvm-profdata reader interface to dump as text we can add something to cross-project-tests? If you feel strongly I could add more checks here for the binary octets we expect.

https://git.io/JXgFi

davidxl added inline comments.Nov 9 2021, 10:01 AM
compiler-rt/lib/memprof/tests/rawprofile.cpp
35

Perhaps make the make contain two stacks and two MIB entries?

66

Can offset be larger than start?

100

Can the actual size be checked?

130

Why using NE instead of comparing with actual id with EQ?

compiler-rt/test/memprof/TestCases/memprof_profile_dump.cpp
28

Ok, perhaps add a comment about endianess assumption

tejohnson added inline comments.Nov 9 2021, 12:52 PM
compiler-rt/lib/memprof/memprof_allocator.cpp
293

When we call this from __memprof_profile_dump we aren't destructing, so wouldn't we have an issue then?

compiler-rt/lib/memprof/memprof_rawprofile.cpp
106

Suggest changing comment to be more explicit, e.g. s/the value/the MIB value/

187

It seems large enough that I think it would be worth splitting if it isn't too much trouble.

snehasish updated this revision to Diff 386036.Nov 9 2021, 6:40 PM

Address review comments.

  • Enhance the unit test with multiple entries and strict checks.
  • Re-introduce the allocator locks.
  • Rewrite to memprof_rawprofile.cpp to allocate memory only once.
snehasish updated this revision to Diff 386039.Nov 9 2021, 6:48 PM
snehasish marked 6 inline comments as done.

Updated fake offset in test to match start.

Rewrote the serialization logic to remove unnecessary allocations and addressed issues in the test. PTAL, thanks!

compiler-rt/lib/memprof/memprof_allocator.cpp
293

Yes it will be. I missed that case when I was reasoning about this. I reintroduced the check and updated the code here. While working on this I also realized that the malloc call memprof_rawprofile was being instrumented to and caused a deadlock when attempting to serialize the rawprofile. I converted that to use InternalAlloc. I left the constructed and destructing vars as atomics since its stricter.

compiler-rt/lib/memprof/tests/rawprofile.cpp
66

Not in practice. Updated to be the same.

100

Done here and all other checks.

tejohnson accepted this revision.Nov 10 2021, 3:33 PM

lgtm with small fix noted below. Wait to see if @davidxl has more comments too

compiler-rt/lib/memprof/memprof_rawprofile.cpp
93

Include a meaningful message instead of "" (here and in other similar CHECKs below).

This revision is now accepted and ready to land.Nov 10 2021, 3:33 PM
snehasish updated this revision to Diff 386342.Nov 10 2021, 4:09 PM

Fill in missing messages for CHECKs.

snehasish marked an inline comment as done.Nov 10 2021, 4:17 PM

The fuzzer test failure seems to be unrelated so I'll go ahead and push this set of patches.

This revision was landed with ongoing or failed builds.Nov 11 2021, 11:31 AM
This revision was automatically updated to reflect the committed changes.