This is an archive of the discontinued LLVM Phabricator instance.

[NFC][SampleFDO] Move some common stuff from SampleProfileReaderExtBinary/SampleProfileWriterExtBinary to their parent classes.
ClosedPublic

Authored by wmi on Oct 15 2020, 7:51 PM.

Details

Summary

SampleProfileReaderExtBinary/SampleProfileWriterExtBinary specify the typical section layout currently used by SampleFDO. Currently a lot of section readers/writers stay in the two classes. However, as we expect to have more types of SampleFDO profiles, we hope those new types of profiles can share the common sections while configuring their own sections easily with minimal change. That is why I move some common stuff from SampleProfileReaderExtBinary/SampleProfileWriterExtBinary to SampleProfileReaderExtBinaryBase/SampleProfileWriterExtBinaryBase so new profiles class inheriting from the base class can reuse them.

Diff Detail

Event Timeline

wmi created this revision.Oct 15 2020, 7:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 15 2020, 7:51 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
wmi requested review of this revision.Oct 15 2020, 7:51 PM
hoy added a subscriber: hoy.Oct 17 2020, 8:38 PM
hoy added inline comments.
llvm/lib/ProfileData/SampleProfWriter.cpp
232

Why is this separated from writeOneSection?

247

Thanks for the refactoring which looks much cleaner.

254

Can this be moved into writeProfileSymbolListSection?

Thanks for the refactoring. I'm wondering what is most flexible way for us to take advantage of extended binary format. I thought being able to mix and use different section is really flexible, but if we tie non-common section to specific new format/sub class, it may restrict the way we use this format. E.g. if ExtFormatA has SecProfileSymbolList, and ExtFormatB has something like SecFuncMetadata (which is what have for CSSPGO internally), then consider if there's a need for a profile to have both SecProfileSymbolList and SecFuncMetadata - implementing that through inheritance could be a bit cumbersome.

I'm thinking about two alternatives:

  1. Follow similar model like ELF, adding or removing a section does not change the format "type". With that, we can pass in SectionHdrLayout as parameter to ctor of reader and writer, and use the layout to drive the reading and writing. In this case, we don't need to use inheritance and all new section and its reader/writer functions are always registered in a single class.
  2. Still use inheritance to allow new combination of sections as a new sub format, but always register all new section and its reader/writer functions in the base class. The subclasses only define unique SectionHdrLayout. With this, the example above will not need to inherit from both ExtFormatA and ExtFormatB.

It's nice to keep the ability to use sections orthogonally, instead of in hierarchical manner. What do you think?

wmi added a comment.Oct 19 2020, 10:00 AM

Thanks for the refactoring. I'm wondering what is most flexible way for us to take advantage of extended binary format. I thought being able to mix and use different section is really flexible, but if we tie non-common section to specific new format/sub class, it may restrict the way we use this format. E.g. if ExtFormatA has SecProfileSymbolList, and ExtFormatB has something like SecFuncMetadata (which is what have for CSSPGO internally), then consider if there's a need for a profile to have both SecProfileSymbolList and SecFuncMetadata - implementing that through inheritance could be a bit cumbersome.

I'm thinking about two alternatives:

  1. Follow similar model like ELF, adding or removing a section does not change the format "type". With that, we can pass in SectionHdrLayout as parameter to ctor of reader and writer, and use the layout to drive the reading and writing. In this case, we don't need to use inheritance and all new section and its reader/writer functions are always registered in a single class.
  2. Still use inheritance to allow new combination of sections as a new sub format, but always register all new section and its reader/writer functions in the base class. The subclasses only define unique SectionHdrLayout. With this, the example above will not need to inherit from both ExtFormatA and ExtFormatB.

It's nice to keep the ability to use sections orthogonally, instead of in hierarchical manner. What do you think?

Good point. Thanks to raise it up. I considered those two alternatives and the current solution. They all have their own advantages and disadvantages.

For 1, it is flexible. Any Reader/Writer users can define their layout they want to have without having to define a new subclass. The disadvantage is the layout reflects the internal logic about how the sections will be implemented and used. Passing in SectionHdrLayout means to expose many details to the code using the reader/writer. Another disadvantage is that putting all the section reader/writer in a single class may be cumbersome.

For 2, it is also very flexible. The disadvantage is putting all the section read/writers in a class.

The current patch is an extension of 2. We still have inheritance and we register the section reader/writers at a proper level that all the subclasses can get/share the support they want (Each class can register their section reader/writer in readOtherSection/writeOtherSection). We may put a section in a subclass at the beginning but move the section up when more formats are added.

A common disadvantage of 2 and 3 are every time we need to define a new subclass, there is some routinary code we need to add.

So current patch is a compromise of the above consideration. I like to hear your opinions since CSSPGO will have some immediate requirement to define some new sections.

wmi added inline comments.Oct 19 2020, 10:11 AM
llvm/lib/ProfileData/SampleProfWriter.cpp
232

That is somewhat explained in the reply to Wenlei's mail. I want to put a section to an inheritance level when foreseeably new subclasses created in other inheritance branch will not use it. Surely whether SecProfileSymbolList will not be used by other new formats are questionable. I just use it as an example and it will be easy to move the section up.

254

It cannot. Whether the section will be compressed or not has to be decided before markSectionStart is called.

hoy added inline comments.Oct 20 2020, 2:39 PM
llvm/lib/ProfileData/SampleProfWriter.cpp
232

I see. Can I ask if you are going to introduce a new ExtBinary variant? Since we are also adding a new section (for pseudo probe) , I'm not sure if it should be treated as an optional section of ExtBinaryBase or a required section of a new variant.

wmi added inline comments.Oct 20 2020, 5:47 PM
llvm/lib/ProfileData/SampleProfWriter.cpp
232

Yes, I am going to add a new ExtBinary variant.

To introduce it a little bit, we have a synthesized profile which is merged from many sample profiles and we use it for targets without regular sample profile. The merged profile is quite large. Previously inline instances in the synthesized profile are all flattened, i.e., all inline instances in the profile are extracted and merged to their outline instance. We found selectively flattening instead of fully flattening the synthesized profile is helpful for performance, however it may increase compile time significantly in the postlink phase during profile loading. To solve that, we are trying to split the profile into two parts, one part containing inline instance and the other part not containing inline instance. When thinlto is used, the second profile loading pass in postlink phase will only have to load the part containing inline instance, which will minimize the compile time impact.

For the new variant we are going to propose, it doesn't need SecProfileSymbolList for the moment because the synthesized profile doesn't need SecProfileSymbolList. But we find in some initial experiment that we can selectively flatten regular sampleFDO profile without performance impact but with much smaller profile size, so the new variant could be possibly useful for regular sampleFDO too in the future. In that case, SecProfileSymbolList will be needed.

With that being said, maybe SecProfileSymbolList is not a good example to be put in a subclass, we can move SecProfileSymbolList to the Base class if you agree.

hoy added inline comments.Oct 20 2020, 6:31 PM
llvm/lib/ProfileData/SampleProfWriter.cpp
232

Thanks for providing more context which is helpful. I'm not particularly concerned with where to place SecProfileSymbolList. I'm still trying to understand class hierarchy. Are we separating the variants so that a section ends up being required but not optional in every variant? If a section can be optional, do we still need the inheritance?

wmi added inline comments.Oct 20 2020, 10:09 PM
llvm/lib/ProfileData/SampleProfWriter.cpp
232

Currently if a section is registered in SectionHdrLayout of a class, it means the variant will have a place for the section in section table in the file header, but if you don't need it, just don't write anything to the section and it will be an empty section. That make section support essentially optional.

I tend to keep inheritance is because if we just use one class, we end up putting the requirement of every variant in the same class and I am worried that may be too much for the single class as the formats evolve.

Surely we also don't want to have too many classes either. So maybe we can support multiple layouts for a certain subclass if the layouts are similar. Only if the new variant is significantly different from existing one, we create a new class. What do you think?

Thanks for the context! So the intention is to use subclass to hide extra complexity of uncommon sections/variants and its reader/writer into subclass. Then this refactoring makes sense.

Just to confirm, this use case you described is the partial profile that was introduced a while ago, correct? And new variant will have new sections for flattened and hierarchical profile?

For SecProfileSymbolList specifically, I think it's common enough that justifies it to be in the base.

Surely we also don't want to have too many classes either. So maybe we can support multiple layouts for a certain subclass if the layouts are similar. Only if the new variant is significantly different from existing one, we create a new class. What do you think?

Sounds good to me.

hoy added inline comments.Oct 20 2020, 11:39 PM
llvm/lib/ProfileData/SampleProfWriter.cpp
232

I see your point now. Yes, it's reasonable to not mix custom section implementation with the common base class.

Should the base implementation SampleProfileWriterExtBinaryBase::writeOtherSection be also called here since a subclass inherits all sections from the base class?

Nit: consider naming it writeCustomSection if that sounds good to you?

wmi added a comment.Oct 21 2020, 9:22 AM

Thanks for the context! So the intention is to use subclass to hide extra complexity of uncommon sections/variants and its reader/writer into subclass. Then this refactoring makes sense.

Just to confirm, this use case you described is the partial profile that was introduced a while ago, correct? And new variant will have new sections for flattened and hierarchical profile?

Yes, that is the partial profile. Previously the partial profile have all flattened profile in SecLBRProfile section. The new variant will try to split the flattened and hierarchical profile into two sections, and so as the SecNameTable and SecFuncOffsetTable sections.

llvm/lib/ProfileData/SampleProfWriter.cpp
232

SampleProfileWriterExtBinaryBase is a pure virtual class so I expect it will not have customized sections. I think I should change writeOtherSection to a pure virtual function instead of leaving it as an empty function.

writeCustomSection is a better name. Thanks!

hoy added inline comments.Oct 21 2020, 9:47 AM
llvm/lib/ProfileData/SampleProfWriter.cpp
232

Yeah, a pure virtual function sounds good.

wmi updated this revision to Diff 299848.Oct 21 2020, 8:13 PM

Address Hongtao and Wenlei's comments.

wenlei accepted this revision.Oct 21 2020, 8:20 PM

LGTM. Thanks!

This revision is now accepted and ready to land.Oct 21 2020, 8:20 PM
hoy accepted this revision.Oct 21 2020, 9:56 PM
hoy removed a reviewer: hoyFB.
This revision was automatically updated to reflect the committed changes.