This is an archive of the discontinued LLVM Phabricator instance.

llvm-profdata: Indirect infrequently used fields to reduce memory usage
ClosedPublic

Authored by dblaikie on Jun 27 2017, 8:41 AM.

Details

Summary

Examining a large profile example, it seems relatively few records have
non-empty IndirectCall and MemOP data, so indirecting these through a
unique_ptr (non-null only when they are non-empty) Reduces memory usage
on this particular example from 14GB to 10GB according to valgrind's
massif.

I suspect it'd still be worth moving InstrProfWriter to its own data
structure that had Counts and the indirected IndirectCall+MemOP, and did
not include the Name, Hash, or Error fields. This would reduce the size
of this dominant data structure by half of this new, lower amount.
(Name(2), Hash(1), Error(1) ~= Counts(vector, 3), ValueProfData
(unique_ptr, 1))

Event Timeline

dblaikie created this revision.Jun 27 2017, 8:41 AM

Open questions:

Is this relatively low incidence of IndirectCall and MemOP collections representative, or is it likely to be more common in the future/in other use cases? (I haven't really looked at the code/what these mean in any great detail at all)

The copyability of InstrProfRecord might be able to be avoided with more work - there were certainly a bunch of cases where they didn't need to be copied that I found before I hit a more difficult one that seemed like more work than was worth it for the prototype of this change at least, but happy to go back to that if it seems worthwhile.

If InstrProfWriter's going to need to move away from InstrProfRecord anyway (to avoid the Name/Hash/Error values) this change may be a dead end not worth committing (since it'd change the data structure, only to not end up using the data structure in this codepath anyway).

davidxl edited edge metadata.Jun 27 2017, 9:46 AM

Yes, in general, the profile data is very sparse for large applications. Only a small percentage of modules have non-zero profile counts. Unlike edge/block profile which is statically allocated, the value profile data is created on-demand, this makes it possible to take advantage the sparsity as your patch does.

Regarding the followup change, here is my suggestion:

  1. refactor InstrProfRecord into two parts, one part is the identification related including name, hash, etc, and the second part includes only counters -- InstrProfRecord becomes a wrapper class.
  2. InstrProfWriter can use the counter record and maintains its own mapping.
include/llvm/ProfileData/InstrProf.h
717–718

Can we change the interface to return ArrayRef so that

  1. we don't need empty vector
  2. no need to duplicate the const/non-const version of the method?
dblaikie updated this revision to Diff 104276.Jun 27 2017, 2:31 PM

Move to 'getOrCreateValueSitesForKind' for accessing the mutable std::vector reference.

davidxl added inline comments.Jun 27 2017, 2:55 PM
include/llvm/ProfileData/InstrProf.h
698

Is the const_cast here needed? Non const pointer can be used with const method right?

lib/ProfileData/InstrProf.cpp
510–511

This one should use the read-only version of the API.

537–538

Should use the read-only version.

dblaikie updated this revision to Diff 104524.Jun 28 2017, 3:15 PM

Address review comments - using get instead of getOrCreate in a couple of places.

dblaikie updated this revision to Diff 104525.Jun 28 2017, 3:18 PM

Range-for-ify a loop.

dblaikie updated this revision to Diff 104526.Jun 28 2017, 3:20 PM

Remove unneeded bounds checking.

include/llvm/ProfileData/InstrProf.h
698

Yep - otherwise this function would infinitely recurse - calling back into this non-const function.

Ideally we'd have an "implicit_cast<T>" function template that would make it more obvious this was only adding const, and not a dangerous thing like const_cast that can take away const.

lib/ProfileData/InstrProf.cpp
510–511

Ah, well, it does need to use the non-const, but not the 'creating' kind. 'merge' takes the InstrProfRecord by non-const ref to call sort on it, it seems, just in case it's not already sorted.

537–538

Right right - like the other one, it doesn't need the 'orCreate' part, but it does need the non-const (MutableArrayRef) to scale the elements.

davidxl accepted this revision.Jun 28 2017, 3:47 PM

lgtm

lib/ProfileData/InstrProf.cpp
510–511

Ok. can you put a comment there (re. using the non-const version)?

This revision is now accepted and ready to land.Jun 28 2017, 3:47 PM
This revision was automatically updated to reflect the committed changes.