This is an archive of the discontinued LLVM Phabricator instance.

Prototype: Reduce llvm-profdata merge memory usage further
ClosedPublic

Authored by dblaikie on Jun 29 2017, 3:05 PM.

Details

Summary

The InstrProfWriter already stores the name and hash of the record in
the nested maps it uses for lookup while merging - this data is
duplicated in the value within the maps.

Refactor the InstrProfRecord to use a nested struct for the counters
themselves so that InstrProfWriter can use this nested struct alone
without the name or hash duplicated there.

This work is incomplete, but enough to demonstrate the value (around a
50% decrease in memory usage for a large test case (10GB -> 5GB)).
Though most of that decrease is probably from removing the
SoftInstrProfError as well, but I haven't implemented a replacement for
it yet. (it needs to go with the counters, because the operations on the
counters - merging, etc, are where the failures are - unlike the
name/hash which are totally unused by those counter-related operations
and thus easy to split out)

Ongoing discussion about removing SoftInstrProfError as a field of the
InstrProfRecord is happening on the thread that added it - including
the possibility of moving back towards an earlier version of that
proposed patch that passed SoftInstrProfError through the various APIs,
rather than as a member of InstrProfRecord.

Event Timeline

dblaikie created this revision.Jun 29 2017, 3:05 PM

I should say this doesn't currently pass the llvm-profdata regression tests - in part because it totally breaks/ignores the error handling (as mentioned in the description, there's ongoing list discussion about how to handle/address that). But enough of a rough prototype to handle some review feedback like naming, etc, I think. I haven't done a byte-for-byte comparison, but it seems to do similar work to the ToT version in my (error-free) test case.

davidxl added inline comments.Jun 29 2017, 3:20 PM
include/llvm/ProfileData/InstrProf.h
598

nit: maybe just Counters? Holder does not provide additional information.

lib/ProfileData/InstrProf.cpp
504

Keep this until the discussion about its fate is resolved.

lib/ProfileData/InstrProfWriter.cpp
152

Summary Builder does not need name, hash etc. so perfhaps fixing API is the right way.

dblaikie updated this revision to Diff 104783.Jun 29 2017, 5:05 PM

Address code review feedback, including leaving the error handling in, fixing the SummaryBuilder API, and fixing a few bugs I hadn't addressed - check-all now works (& I have a patch for the clang API fixes needed).

davidxl edited edge metadata.Jun 29 2017, 5:18 PM

Instead of exposing Counter members widely (and fixing tests), it is better to add wrapper methods in InstrProfRecord to forward the call to the nested member struct.

In summary, I think this patch should be minimized by

  1. sinking the minimal number of interfaces into Counter subclass
  2. for any sunk interface, provide a wrapper in the parent class InstrProfRecord
  3. avoid exposing internal members to client side other than InstrProfWriters (may be making those method private and making Writer a friend ?)
  4. split out unrelated changes
include/llvm/ProfileData/InstrProf.h
253

unrelated format change?

701

format change?

include/llvm/ProfileData/InstrProfData.inc
317

Why this change? IMO, we should reduce exposing CounterHolder if possible and maintain/use APIs at the InstrProfRecord level.

361

Same here.

530

Can this be split into different patch?

include/llvm/ProfileData/InstrProfWriter.h
66

There is no need to change this interface. You are passing name, hash etc, so might as well create a InstrProfRecord.

lib/ProfileData/InstrProf.cpp
641

These naming changes are not necessary

651

The interface change here is not necessary.

lib/ProfileData/InstrProfWriter.cpp
152

On second thought, to reduce exposing implementation details and minimizing interface changes, it is cleaner to just do

SummaryBuilder->addRecord(InstrProfRecord(K, ProfileData.first, Counters));

Similarly for the SerializeFrom call below and ValueProfData::getSize() call above.

dblaikie updated this revision to Diff 105374.Jul 5 2017, 11:16 PM

Try inverting the API change - move the name/hash out into NamedInstrProfRecord

Limit NamedInstrProfRecord to the reader file seems ok -- but this class also looks like a useful convenience class. We can probably discuss that in a different thread if it is desirable.

lib/ProfileData/InstrProfWriter.cpp
182

I don't find the comment here helping. Perhaps just mention the move construction happens later when assigned to the Dest?

dblaikie updated this revision to Diff 105492.Jul 6 2017, 11:22 AM

Refactor code accessing a potentially std::move'd object to avoid the need for a comment.

davidxl accepted this revision.Jul 6 2017, 11:24 AM

lgtm

This revision is now accepted and ready to land.Jul 6 2017, 11:24 AM
This revision was automatically updated to reflect the committed changes.