This is an archive of the discontinued LLVM Phabricator instance.

[PGO] Make indexed value profile data more compact and add structural definitions for the data format
ClosedPublic

Authored by davidxl on Nov 5 2015, 3:54 PM.

Details

Summary

The patch include the following changes:

  • Make indexed value profile data more compact by peeling out the per-site value count field into its own smaller sized array.
  • Introduced formal data structure definitions to specify value profile data layout in indexed format. Previously the layout of the data is only assumed in the client code (scattered in three different places : size computation, EmitData, and ReadData) which can be out of sync.
  • The bulk of reading/writing is the implementation of serialization/deserialization of the value profile data: interfaces that convert between on-disk ValueProfData and InstrProfRecord class.
  • The new data structure also serves as a central place for layout documentation.

Diff Detail

Repository
rL LLVM

Event Timeline

davidxl updated this revision to Diff 39431.Nov 5 2015, 3:54 PM
davidxl retitled this revision from to [PGO] Make indexed value profile data more compact and add structural definitions for the data format.
davidxl updated this object.
davidxl added reviewers: betulb, bogner.
davidxl added a subscriber: llvm-commits.
davidxl updated this revision to Diff 39478.Nov 5 2015, 11:10 PM

Force llvm style when using clang-format.

betulb edited edge metadata.Nov 6 2015, 8:59 AM

It looks fine to me overall. I'll need to do the testing when I'm rebasing my pending CL's.

davidxl added a subscriber: davidxl.Nov 6 2015, 9:03 AM

thanks. Let me update the patch first -- there were some conflicting format
changes.

David

davidxl updated this revision to Diff 39547.Nov 6 2015, 9:09 AM
davidxl edited edge metadata.

Rebase patch.

vsk added a subscriber: vsk.Nov 6 2015, 11:21 AM

Thanks! I have a few inline comments.

As I understand it, we need to maintain backwards compatibility for the indexed data format. Could you commit a binary blob which uses the old indexed format (in which value site counts are right next to value kinds), and show that we can still read it?

include/llvm/ProfileData/InstrProf.h
415 ↗(On Diff #39478)

recordfor -> record for

419 ↗(On Diff #39478)

SiteCountArray has length NumValueSites, but it's declared here with length 1. Your patch also stores uint32_t into this array, which is an array of uint8_t.

I was confused by this when I first read through your patch. Could you change this to uint32_t SiteCountArray[]? If you mean for this to be an array of uint8_t, please change the return type of getNumValueDataForSite.

421 ↗(On Diff #39478)

I think the convention is to use #if 0 for this in llvm.

469 ↗(On Diff #39478)

Same nit as above.

543 ↗(On Diff #39478)

Do you have any plans for pulling in this definition into compiler-rt, and removing the duplicate definition there?

lib/ProfileData/InstrProfReader.cpp
25 ↗(On Diff #39478)

I think this line would have to change to sizeof(ValueProfRecord) + sizeof(uint8_t) * NumValueSites if you take my earlier suggestion. I think it would be cleaner this way.

davidxl marked 3 inline comments as done.Nov 6 2015, 12:35 PM

There is no need for binary blob testing -- the runtime and raw format change is still pending, so this is no need to keep the backward compatibility. Besides the patch basically removes the old format support, it will break with the test.

include/llvm/ProfileData/InstrProf.h
423 ↗(On Diff #39547)

uint8_t is an implementation detail (where the packing comes from). In practice, most of the value sites are cold with zero value tracked and large number of tracked values will be useless for compiler and is a waste of space, this is why the runtime will guarantee (change by Betul under review) the value count for each site is limited.

However I think the interface should still use uint32_t. Also note that the SiteCountArray is variable length, thus it is declared with element == 1.

546 ↗(On Diff #39547)

Yes -- there is a plan for that.

lib/ProfileData/InstrProfReader.cpp
25 ↗(On Diff #39547)

There was a bug there that may waste space. Changed to use offsetof.

davidxl updated this revision to Diff 39577.Nov 6 2015, 12:36 PM

Update the patch according to comments.

vsk added a comment.Nov 6 2015, 1:09 PM

IIUC, when the raw format and runtime changes land, we will need a backwards compatibility test for the indexed format?

A few more inline comments --

include/llvm/ProfileData/InstrProf.h
423 ↗(On Diff #39577)

Ah ok.

When Betul's patch lands, we should really consider being consistent about using uint8_t for site counts everywhere, as it's a bit more accurate.

I see that SiteCountArray is variable length, but that's not obvious from a superficial reading of the code. It's declared as an array of length 1, which is surprising.

Declaring it as [] conveys the meaning more directly. Alternatively, you could define a method getSiteCountArray() { return ((uint8_t *)this) + sizeof(ValueProfRecord); }, and move the SiteCountArray[] declaration into the #if 0 block.

lib/ProfileData/InstrProfReader.cpp
121 ↗(On Diff #39577)

The use of sizeof(uint64_t) in this function isn't self-explanatory. Afaict sizeof(ValueProfData) would work at least as well, and is a lot clearer.

126 ↗(On Diff #39577)

Can you remove this union, or lift it into a more descriptive comment?

133 ↗(On Diff #39577)

This would be much easier to understand as if (TotalSize % sizeof(ValueProfData)), which gets optimized properly later.

davidxl marked an inline comment as done.Nov 6 2015, 1:28 PM
davidxl added inline comments.
include/llvm/ProfileData/InstrProf.h
423 ↗(On Diff #39577)

The comment says the number of element is NumValueSites. Declaring the array with 1 element serves the purpose of indicating that there is at least one site profiled. Just added a comment that there is one element guaranteed.

lib/ProfileData/InstrProfReader.cpp
126 ↗(On Diff #39577)

Added a comment.

133 ↗(On Diff #39577)

changed to TotalSize % sizeof(uint64_t) and added comment.

yes -- binary blob testing can be added once the feature is turned on (with all pieces ready and possibly after a major public clang release with the feature is announced) -- until then, we still have some flexibility to tune the layout without the need to worry about incompatible changes.

davidxl updated this revision to Diff 39580.Nov 6 2015, 1:35 PM

Update the patch according to Vedant's comments.

vsk added a comment.Nov 6 2015, 3:01 PM

It'd be nice to have blob testing done before a major release, but that sounds like a discussion for llvm-dev. Lgtm with a tiny nit.

include/llvm/ProfileData/InstrProf.h
424 ↗(On Diff #39580)

nit: inthe -> in the

davidxl updated this revision to Diff 39635.Nov 6 2015, 11:22 PM
  1. addressed Justin's comments
  2. consolidate byte swap code for reader/writer
  3. added byte swap related interfaces purely for the purpose of testing
  1. increased test coverage by added a new unittest that streams in/out value profile data in big endian order and force byte swapping to happen.
davidxl updated this revision to Diff 39664.Nov 8 2015, 9:02 PM

Update the patch:

  1. Add tags in doxygen comments
  2. Move ValueProfData implementation from InstrProfReader.cpp to InstrProf.cpp as it is a shared structure between reader and writer.
This revision was automatically updated to reflect the committed changes.