This is an archive of the discontinued LLVM Phabricator instance.

[PGO] Profile Summary Index profile writer and reader support
ClosedPublic

Authored by davidxl on Jan 15 2016, 10:22 PM.

Details

Summary

Profile summary can be expensive to recompute, so it is better to compute it once during profile write and save the summary in the indexed profile data. This patch adds the support for summary writing and reading.

For older versions of the profile data, the reader is taught to re-synthesize the summary data on the fly. Since profileSummary includes MaxFunctionCount already, existing fields in InstrProfReader and Writer are removed. The MaxFunctionCount computation code is also removed (redundant). The Header still includes MaxFunctionCount field to minimize layout change. This field is deprecated and can be repurposed for something else in the future.

Diff Detail

Repository
rL LLVM

Event Timeline

davidxl updated this revision to Diff 45072.Jan 15 2016, 10:22 PM
davidxl retitled this revision from to [PGO] Profile Summary Index profile writer and reader support.
davidxl updated this object.
davidxl added a reviewer: vsk.
davidxl added subscribers: llvm-commits, eraman, xur.
vsk edited edge metadata.Jan 21 2016, 1:26 AM

Thanks, comments inline --

include/llvm/ProfileData/InstrProf.h
673 ↗(On Diff #45072)

Tiny nit, could you move one line up?

709 ↗(On Diff #45072)

Nit, do you mind keeping the original comment ('The maximal execution count among all functions.')?

Wdyt of changing getMaxFunctionCount() to return an Optional<uint64_t> in a later commit?

714 ↗(On Diff #45072)

Missing comment.

726 ↗(On Diff #45072)

Why not use a SmallVector<uint64_t, N>? Then there's no need for NumEntries, getSize(), allocSummary() etc.

lib/ProfileData/InstrProf.cpp
642 ↗(On Diff #45072)

A range-based for loop would be nicer here.

644 ↗(On Diff #45072)

If you add a constructor to ProfileSummaryEntry, you can take advantage of emplace_back() here.

lib/ProfileData/InstrProfReader.cpp
567 ↗(On Diff #45072)

In this scenario I'd prefer using a different name to avoid confusion and extra 'this->'s. Maybe 'SummaryInHE'?

582 ↗(On Diff #45072)

As written, constructing the Cutoffs vector takes linear time. Passing it into the ProfileSummary constructor is also linear time. ISTM the ProfileSummary constructor shouldn't accept a vector of cutoffs, it should use ArrayRef.

If you agree, I can take care of it in a separate patch.

583 ↗(On Diff #45072)

Ignoring the result of a getter is a bit confusing. Maybe you could call computeDetailedSummary() directly?

lib/ProfileData/InstrProfWriter.cpp
78 ↗(On Diff #45072)

These static variables break the use-case of running llvm in a multi-threaded environment! If you know of any other instances like this, they need to be fixed.

Please pass a summary pointer into EmitData directly.

194 ↗(On Diff #45072)

Using a vector-like container for Entries turns this loop into a resize() + std::copy.

eraman added inline comments.Jan 21 2016, 4:28 PM
include/llvm/ProfileData/InstrProf.h
698 ↗(On Diff #45072)

it's useful to add a cutoff of 999999 to identify cold blocks. IMO these are more useful than the cutoffs in the 1-80% range.

davidxl marked 8 inline comments as done.Jan 22 2016, 1:57 PM
davidxl added inline comments.
include/llvm/ProfileData/InstrProf.h
726 ↗(On Diff #45072)

This data structure is used for serialization describing on disk/buffer data. The in-memory data structure is ProfileSummary

lib/ProfileData/InstrProf.cpp
642 ↗(On Diff #45072)

The number of entries is unknown statically.

lib/ProfileData/InstrProfReader.cpp
582 ↗(On Diff #45072)

I did not notice -- I think making the ProfileSummary to take a const reference to vector should do. Or even better to avoid the the copy from Cutoffs to DetailedSummaryCutoffs member, a unique_ptr to the cutoffs can be created by the creator of ProfileSummary and the ownership of the cutoffs can be passed in.

Feel free to improve this part.

lib/ProfileData/InstrProfWriter.cpp
78 ↗(On Diff #45072)

Fixed.

194 ↗(On Diff #45072)

The target data structure is serialization/persistence so it is in 'raw' form.

davidxl updated this revision to Diff 45755.Jan 22 2016, 2:46 PM
davidxl edited edge metadata.
davidxl marked an inline comment as done.

Update patch by incorporating vsk and eraman's comments.

davidxl updated this revision to Diff 45881.Jan 25 2016, 10:36 AM

Misc updates of the patch:

  1. renamed unused header field
  2. ensure summary data to be zeroalized
  3. added more comments
  4. added reserved fields in summary for future extension.

Any more comments on this patch?

eraman added inline comments.Jan 28 2016, 2:28 PM
include/llvm/ProfileData/InstrProf.h
743 ↗(On Diff #45881)

This is not very useful. In practice we should expect MaxInternalBlockCount == MaxBlockCount

include/llvm/ProfileData/InstrProfReader.h
345 ↗(On Diff #45881)

Nit: pass -> past

davidxl added inline comments.Jan 28 2016, 2:37 PM
include/llvm/ProfileData/InstrProf.h
743 ↗(On Diff #45881)

This field is introduced to match the corresponding field in ProfileSummary -- it is used by the current dumper. I prefer keeping it (just 8 byte space).

include/llvm/ProfileData/InstrProfReader.h
345 ↗(On Diff #45881)

will change.

davidxl updated this revision to Diff 46351.Jan 28 2016, 10:01 PM

Fix a typo.

vsk added inline comments.Feb 1 2016, 10:50 AM
include/llvm/ProfileData/InstrProf.h
751 ↗(On Diff #46351)

I'm not sure about this Reserved field. IMO any new future functionality should come with a format version bump, even if the on-disk layout stays the same. That means we can teach the reader to skip straight to NumEntries for v4 files.

lib/ProfileData/InstrProfWriter.cpp
197 ↗(On Diff #46351)

I see. Is it possible to make the Entries array an array of ProfileSummaryEntry? We'd have to make the Cutoff field 64-bit, but it would help keep things in sync. That also lets us use the default copy-constructor here, i.e TheSummary->Entries[I] = Res[I].

xur added inline comments.Feb 1 2016, 11:10 AM
include/llvm/ProfileData/InstrProf.h
751 ↗(On Diff #46351)

Do we need to use uint64_t for TotalNumFunctions and TotalNumBlocks. Will uint32_t enough?

davidxl added inline comments.Feb 1 2016, 11:17 AM
include/llvm/ProfileData/InstrProf.h
751 ↗(On Diff #46351)

Yes it is true that the version number will always bump up when new functionality is added -- however keeping the same layout can greatly simplify the reader change.

To remove the reserved field, I think it might be better to split the summary into two parts: summary header plus the Summary entry array. I will make that change and remove the reserved fields.

lib/ProfileData/InstrProfWriter.cpp
197 ↗(On Diff #46351)

It is tempting to do that -- but ProfileSummaryEntry data structure is intended to also be shared with sample PGO and its contents may be subject to unrelated changes in the future which may affect layout. I would like to avoid it.

davidxl added inline comments.Feb 1 2016, 11:20 AM
include/llvm/ProfileData/InstrProf.h
751 ↗(On Diff #46351)

uint64_t is consistently used in all fields in the header.

davidxl updated this revision to Diff 46591.Feb 1 2016, 4:31 PM

Rebase and address vsk's comment (no more reserved fields)

vsk added a comment.Feb 2 2016, 1:16 PM

Thanks, lgtm

This revision was automatically updated to reflect the committed changes.