Page MenuHomePhabricator

[PGO] Profile guided code size optimization (continued).
Needs ReviewPublic

Authored by yamauchi on Sep 3 2019, 11:23 AM.



This follows up D59514.

Enable more of the existing size optimizations for colder code under PGO.

Up to ~11% code size savings and up to ~2.5% performance improvement (due to
reduced icache misses) for large internal apps.

This depends on D67377.

Event Timeline

yamauchi created this revision.Sep 3 2019, 11:23 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 3 2019, 11:23 AM
Herald added subscribers: jsji, tpr, MaskRay and 15 others. · View Herald Transcript

Can you split the patches into smaller ones -- at least 3 : ProfileSummary API changes, TTI API changes, and the rest (may need to be split too, but we will see).

Will split.

Split the PSI changes into

It looks like the TTI part can't be separated due to dependencies and would pull in the whole thing:

The TTI files (TargetTransformInfo.h, TargetTransformInfoImpl.h, BasicTTIImpl.h, TargetTransformInfo.cpp) would pull in TargetLowering.h for the isSuitableForJumpTable change, which then would pull in SizeOpts.h/cpp for the llvm::shouldOptimizeForSize change. In addition, TargetLowering.h would also pull in SwitchLoweringUtils.cpp for the isSuitableForJumpTable change, which would pull in isSuitableForJumpTable.h for findJumpTables change, which would pull in SelectionDAGBuilder.cpp and IRTranslator.cpp for the findJumpTables changes,...etc.

Did you think of a different way to split the TTI part?

yamauchi updated this revision to Diff 220749.Sep 18 2019, 2:07 PM

Update to sync with D67377.

yamauchi updated this revision to Diff 221824.Wed, Sep 25, 1:06 PM

Rebased past D67377.

This patch should add test cases for each affected codegen passes.


Mark as unused args?

yamauchi updated this revision to Diff 224474.Thu, Oct 10, 2:11 PM
yamauchi marked an inline comment as done.

Added tests.

yamauchi added inline comments.Thu, Oct 10, 2:31 PM

Done (note according to the comment for LLVM_ATTRIBUTE_UNUSED, a cast-to-void is preferred for unused variables.)

yamauchi updated this revision to Diff 224495.Thu, Oct 10, 3:38 PM

Fix the summary.

yamauchi edited the summary of this revision. (Show Details)Thu, Oct 10, 3:42 PM

SizeOpts and MachineSizeOpts changes can also be extracted into its own patch.

After this is done, the TTI change has one fewer dependency and we might figure out a way to isolate that part too..


I've looked at SizeOpts.h, SizeOpts.cpp, MachineSizeOpts.cpp files. I think we should refactorize the code using template to avoid duplicate logic.

This should be done similarly to BlockFrequencyInfoImpl

template <typename FuncT, typename BFIT> bool shouldOptimizeForSize(FuncT *F, ....) {


isColdBlock should probably refactorized in the similar way.

The code can be broken down and contributed in the following order:

  1. SizeOpts related change
  2. TargetLowering change (isSuitableForJumpTable)
  3. TargetTransformation related changes -- depending on 1) and 2)
  4. SwitchLoweringUtils (findJumpTable depends on isSuitableForJumpTable in 2)
  5. the rest of the changes (can be further broken down per-pass).
yamauchi marked an inline comment as done.Wed, Oct 16, 2:15 PM

Split the SizeOpts part into