This is a patch split from https://reviews.llvm.org/D66374. It tries to add a new format of profile called ExtBinary. The format adds a section header table to the profile and organize the profile in sections, so the future extension like adding a new section or extending an existing section will be easier while keeping backward compatiblity feasible.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
llvm-profdata should also have a round trip test:
text format --> ext-format-> text format --> compare text format before and after
text-format-> binary-format->ext-format-> text format --> compare text format.
Also for the the funcProfile section, should it also have a field (in the header) defining the profile kind, such as LBR samples, llc misses etc. Another question, is multiple such funcProfile section supported?
include/llvm/ProfileData/SampleProfReader.h | ||
---|---|---|
463 | extend --> extended | |
473 | why retiring the existing section? A natural way is to append a new section of a new type. The format of a new section can even be 'opaque' and the section reader plugin can be provided by the client. | |
lib/ProfileData/SampleProfWriter.cpp | ||
89 | I think it is better to explicit pass SectionStart as the parameter. It should also return a new offset. | |
101 | let this function return an offset that can be passed to addNewSection | |
tools/llvm-profdata/llvm-profdata.cpp | ||
40 | is Compact_Ext_Binary also a choice? |
I will add it.
Also for the the funcProfile section, should it also have a field (in the header) defining the profile kind, such as LBR samples, llc misses etc.
My plan was to use different section types to represent different funcProfile sections: secFuncProfile_LBR, secFuncProfile_LBR_LLCM, ... Each will have its own section reader.
Another question, is multiple such funcProfile section supported?
It is supported. Currently reader only read each section according to section tag. It can be multiple sections with the same type. As long as the section reader supports being called multiple times, there is no problem. (FunctionProfile section reader 'readFuncProfile' can be called multiple times)
include/llvm/ProfileData/SampleProfReader.h | ||
---|---|---|
463 | Will fix. | |
473 | If we want to append cache misses information, we need to express the parent symbol and line offset again in the form of [line offset: cache miss count], that will introduce redundency because we already express line offset in existing function profile. What we want is to append the information directly in the current format like: [line offset: lbr sample count, cache miss count]. That is why I say we may retire the current function profile section v1 and create a new function profile section v2. But the reader of function profile section v1 can be kept if backward compatibility is required.
Agreed. | |
lib/ProfileData/SampleProfWriter.cpp | ||
89 | will fix. | |
tools/llvm-profdata/llvm-profdata.cpp | ||
40 | Currently Compact_Binary uses md5 to represent string. Compact_Ext_Binary may confuse people to think the ext format also use md5 to represent string. In the future, we can port the implementation of Compact_Binary to inherit from Ext_Binary too. |
Address David's comment.
Add ExtBinaryBase class for the basics of extensible format and ExtBinary inherits from it, so we can add other types of profiles in extensible format separately.
Some refactoring and comment changes.
LLC miss info should be very sparse, so mixing it with LBR may not bring a lot of benefit. Having mixure of different types of profile data can lead to combinatorial growth of the number of readers. As the number of profile types increases, it can cause the reader to be unmaintaintable. That is why I suggest making it clean -- one type per section.
Regarding compact format. Does it mean when LLC misses are included, there is no way to use the compact representation?
extend --> extended