This is an archive of the discontinued LLVM Phabricator instance.

[GlobalOpt] Ignore only loaded / only stored global parts in global SRA heuristic
ClosedPublic

Authored by nikic on Jul 15 2022, 7:45 AM.

Details

Summary

When limiting the number of parts we split a global into, ignore any parts that are either only loaded or only stored, because we expect these to be optimized away after SRA.

This is intended as an alternative to D129525.

Diff Detail

Event Timeline

nikic created this revision.Jul 15 2022, 7:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 15 2022, 7:45 AM
nikic requested review of this revision.Jul 15 2022, 7:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 15 2022, 7:45 AM

I just tried this on the original reproducer and it's not sufficient unfortunately. I'll extract an updated reproducer.

fhahn added a comment.Sep 2 2022, 5:05 AM

I added a reduced test case from https://github.com/llvm/llvm-project/issues/56762 which has a single store to the global which makes the logic here not trigger unfortunately: https://github.com/llvm/llvm-project/commit/91e67c074922cc667fa1c43fc1f01acb96faa0f9

To solve this in general, we could probably track which fields of a global are never modified and then ignore such loads from the count. I'm still wondering if we should re-introduce the previous logic until we are able to completely address this properly to avoid regressions.

nikic updated this revision to Diff 467067.Oct 12 2022, 2:08 AM
nikic retitled this revision from [GlobalOpt] Ignore never loaded global parts in global SRA heuristic to [GlobalOpt] Ignore only loaded / only stored global parts in global SRA heuristic.
nikic edited the summary of this revision. (Show Details)
nikic added a comment.Oct 12 2022, 2:09 AM

I've updated the patch to treat loads and stores symmetrically: We only count the part if it is both loaded and stored. In other cases, we expect it to be optimized away.

fhahn added a comment.Feb 21 2023, 5:35 AM

I've updated the patch to treat loads and stores symmetrically: We only count the part if it is both loaded and stored. In other cases, we expect it to be optimized away.

Thanks, I finally had some time to track down the cases missed by the current version of the heuristic for the larger reproducer. I added a set of test case, rebased the change here (https://github.com/fhahn/llvm-project/commit/1053ab2c24811bfc2bdf4b8b0bf5da15c2772de3) and put up D144468, D144476, D144477. With those, we get the expected optimizations without D129525. So I think we should move ahead with this patch and the follow-ups.

nikic updated this revision to Diff 499505.Feb 22 2023, 7:38 AM
nikic edited the summary of this revision. (Show Details)

Rebase

aeubanks accepted this revision.Feb 22 2023, 1:15 PM
This revision is now accepted and ready to land.Feb 22 2023, 1:15 PM
fhahn accepted this revision.Feb 24 2023, 11:45 AM

LGTM, thanks!

llvm/lib/Transforms/IPO/GlobalOpt.cpp
340

might be good to document the struct

This revision was landed with ongoing or failed builds.Feb 27 2023, 5:58 AM
This revision was automatically updated to reflect the committed changes.