This is an archive of the discontinued LLVM Phabricator instance.

[PGO][PGSO] Add profile guided size optimization to loop vectorization legality.
AcceptedPublic

Authored by hjyamauchi on Jul 7 2020, 11:35 AM.

Diff Detail

Event Timeline

hjyamauchi created this revision.Jul 7 2020, 11:35 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 7 2020, 11:35 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
davidxl added inline comments.Jul 14 2020, 11:14 AM
llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
418

should the check of F->hasOptSize() also folded into llvm::shouldOptimizeForSize()?

hjyamauchi marked an inline comment as done.Jul 15 2020, 8:52 AM
hjyamauchi added inline comments.
llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
418

That's actually been suggested before and seems to make sense, and is in my todo list for a while. But it'd be a different patch because it'd be an API change and apply to many sites that have been instrumented with llvm::shouldOptimizeForSize() so far.

davidxl accepted this revision.Jul 15 2020, 8:54 AM

lgtm

This revision is now accepted and ready to land.Jul 15 2020, 8:54 AM
This revision was automatically updated to reflect the committed changes.

I've bisected a Chromium ThinLTO failing assert to this change, could you take a look?
https://crbug.com/1106813

Reproduced https://crbug.com/1106813.

The problem was that we were querying the profile of the newly created loop, as opposed to the original loop in InnerLoopVectorizer.

Querying against the original loop (and saving it in the constructor before the transformations because the profile of the original loop could change as it is used as the fallback loop) should fix the issue.

hjyamauchi reopened this revision.Jul 20 2020, 3:32 PM
This revision is now accepted and ready to land.Jul 20 2020, 3:32 PM

PTAL. This is the diff of the patches before/after the latest fix.

70c70,76
< +        BFI(BFI), PSI(PSI) {}
---
> +        BFI(BFI), PSI(PSI) {
> +    // Query this against the original loop and save it here because the profile
> +    // of the original loop header may change as the transformation happens.
> +    OptForSizeBasedOnProfile = llvm::shouldOptimizeForSize(
> +        OrigLoop->getHeader(), PSI, BFI, PGSOQueryType::IRPass);
> +  }
> +
74c80
< @@ -779,6 +781,10 @@ protected:
---
> @@ -779,6 +787,14 @@ protected:
81a88,91
> +
> +  // Whether this loop should be optimized for size based on profile guided size
> +  // optimizatios.
> +  bool OptForSizeBasedOnProfile;
85c95
< @@ -789,9 +795,10 @@ public:
---
> @@ -789,9 +805,10 @@ public:
98c108
< @@ -2754,7 +2761,9 @@ void InnerLoopVectorizer::emitSCEVChecks(Loop *L, BasicBlock *Bypass) {
---
> @@ -2754,7 +2771,8 @@ void InnerLoopVectorizer::emitSCEVChecks(Loop *L, BasicBlock *Bypass) {
104,105c114
< +           llvm::shouldOptimizeForSize(L->getHeader(), PSI, BFI,
< +                                       PGSOQueryType::IRPass)) &&
---
> +           OptForSizeBasedOnProfile) &&
109c118
< @@ -2800,7 +2809,9 @@ void InnerLoopVectorizer::emitMemRuntimeChecks(Loop *L, BasicBlock *Bypass) {
---
> @@ -2800,7 +2818,7 @@ void InnerLoopVectorizer::emitMemRuntimeChecks(Loop *L, BasicBlock *Bypass) {
114,116c123
< +  if (MemCheckBlock->getParent()->hasOptSize() ||
< +      llvm::shouldOptimizeForSize(L->getHeader(), PSI, BFI,
< +                                  PGSOQueryType::IRPass)) {
---
> +  if (MemCheckBlock->getParent()->hasOptSize() || OptForSizeBasedOnProfile) {

thanks for the fix. looks good.