This is an archive of the discontinued LLVM Phabricator instance.

[PGO][PGSO][LV] Fix loop not vectorized issue under profile guided size opts.
ClosedPublic

Authored by hjyamauchi on Aug 11 2020, 2:22 PM.

Details

Summary

D81345 appears to accidentally disables vectorization when explicitly
enabled. As PGSO isn't currently accessible from LoopAccessInfo, revert back to
the vectorization with versioning-for-unit-stride for PGSO.

Diff Detail

Event Timeline

hjyamauchi created this revision.Aug 11 2020, 2:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 11 2020, 2:22 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
hjyamauchi requested review of this revision.Aug 11 2020, 2:22 PM

If shouldOptimizeForSize is not available for loop analysis, should the check of be removed unconditionally instead of gated upon whether there is a hint?

If shouldOptimizeForSize is not available for loop analysis, should the check of be removed unconditionally instead of gated upon whether there is a hint?

Which check are you referring to to be removed? Can you express in terms of the code?

I mean the check llvm::shouldOptimizeForSize(L->getHeader(), PSI, BFI,

PGSOQueryType::IRPass) on loop header.

If shouldOptimizeForSize is not available for loop analysis, should the check of be removed unconditionally instead of gated upon whether there is a hint?

I think no, as it'd be an unintended removal of this piece of PGSO, which can be avoided by this patch.

Here's a longer description of my understanding.

D81345 changed the vectorization scheme for -Os from "vectorization with versioning" to "vectorization without versioning" (saving some code size). It didn't change the fact that the explicit vectorization (via the pragma) takes precedence over the -Os flag (ie. still vectorize with pragma even with -Os.) But as it moved the size opt check (hasOptSize) from LoopVectorize.cpp to LoopAccessAnalysis.cpp (by not collecting stride accesses for hasOptSize cases earlier in LoopAccessAnalysis), but does not do similar for the PGSO cases (which is not easy), it inadvertently changed it so that the explicit vectorization does no longer take precedence over PGSO (as it removes the "Hints.getForce() != LoopVectorizeHints::FK_Enabled" check.) This is the unintended behavior change that's manifested as (internal) b/163165125.

Since shouldOptimizeForSize is not available for loop analysis (due to the stated reasons), this patch simply recovers the old behavior just under PGSO, which is "vectorize with versioning" if explicitly enabled, rather than "not vectorize" (which is unexpected), while keeping the new behavior for -Os.

This revision is now accepted and ready to land.Aug 18 2020, 10:20 AM