This is an archive of the discontinued LLVM Phabricator instance.

[SampleFDO] Add ExtBinary format to support extension of binary profile
ClosedPublic

Authored by wmi on Aug 20 2019, 9:06 PM.

Details

Summary

This is a patch split from https://reviews.llvm.org/D66374. It tries to add a new format of profile called ExtBinary. The format adds a section header table to the profile and organize the profile in sections, so the future extension like adding a new section or extending an existing section will be easier while keeping backward compatiblity feasible.

Diff Detail

Repository
rL LLVM

Event Timeline

wmi created this revision.Aug 20 2019, 9:06 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 20 2019, 9:06 PM

llvm-profdata should also have a round trip test:

text format --> ext-format-> text format --> compare text format before and after

text-format-> binary-format->ext-format-> text format --> compare text format.

Also for the the funcProfile section, should it also have a field (in the header) defining the profile kind, such as LBR samples, llc misses etc. Another question, is multiple such funcProfile section supported?

include/llvm/ProfileData/SampleProfReader.h
511

extend --> extended

521

why retiring the existing section? A natural way is to append a new section of a new type. The format of a new section can even be 'opaque' and the section reader plugin can be provided by the client.

lib/ProfileData/SampleProfWriter.cpp
89

I think it is better to explicit pass SectionStart as the parameter. It should also return a new offset.

101

let this function return an offset that can be passed to addNewSection

tools/llvm-profdata/llvm-profdata.cpp
40

is Compact_Ext_Binary also a choice?

wmi marked 4 inline comments as done.Aug 21 2019, 11:37 AM

llvm-profdata should also have a round trip test:

text format --> ext-format-> text format --> compare text format before and after

text-format-> binary-format->ext-format-> text format --> compare text format.

I will add it.

Also for the the funcProfile section, should it also have a field (in the header) defining the profile kind, such as LBR samples, llc misses etc.

My plan was to use different section types to represent different funcProfile sections: secFuncProfile_LBR, secFuncProfile_LBR_LLCM, ... Each will have its own section reader.

Another question, is multiple such funcProfile section supported?

It is supported. Currently reader only read each section according to section tag. It can be multiple sections with the same type. As long as the section reader supports being called multiple times, there is no problem. (FunctionProfile section reader 'readFuncProfile' can be called multiple times)

include/llvm/ProfileData/SampleProfReader.h
511

Will fix.

521

If we want to append cache misses information, we need to express the parent symbol and line offset again in the form of [line offset: cache miss count], that will introduce redundency because we already express line offset in existing function profile. What we want is to append the information directly in the current format like: [line offset: lbr sample count, cache miss count]. That is why I say we may retire the current function profile section v1 and create a new function profile section v2. But the reader of function profile section v1 can be kept if backward compatibility is required.

The format of a new section can even be 'opaque' and the section reader plugin can be provided by the client.

Agreed.

lib/ProfileData/SampleProfWriter.cpp
89

will fix.

tools/llvm-profdata/llvm-profdata.cpp
40

Currently Compact_Binary uses md5 to represent string. Compact_Ext_Binary may confuse people to think the ext format also use md5 to represent string.

In the future, we can port the implementation of Compact_Binary to inherit from Ext_Binary too.

wmi updated this revision to Diff 216677.Aug 22 2019, 11:24 AM

Address David's comment.
Add ExtBinaryBase class for the basics of extensible format and ExtBinary inherits from it, so we can add other types of profiles in extensible format separately.
Some refactoring and comment changes.

LLC miss info should be very sparse, so mixing it with LBR may not bring a lot of benefit. Having mixure of different types of profile data can lead to combinatorial growth of the number of readers. As the number of profile types increases, it can cause the reader to be unmaintaintable. That is why I suggest making it clean -- one type per section.

Regarding compact format. Does it mean when LLC misses are included, there is no way to use the compact representation?

wmi updated this revision to Diff 216740.Aug 22 2019, 4:40 PM

Address David's comment. Rename some section enums and reserve some enum values.

davidxl accepted this revision.Aug 22 2019, 4:46 PM

lgtm

This revision is now accepted and ready to land.Aug 22 2019, 4:46 PM
This revision was automatically updated to reflect the committed changes.
lib/ProfileData/SampleProfWriter.cpp