This is an archive of the discontinued LLVM Phabricator instance.

[GlobalOpt] Make global SRA offset based
ClosedPublic

Authored by nikic on Jan 13 2022, 6:12 AM.

Details

Reviewers
reames
Group Reviewers
Restricted Project
Commits
rG4796b4ae7bcc: [GlobalOpt] Make global SRA offset based
Summary

Currently global SRA uses the GEP structure to determine how to split the global. This patch instead analyses the loads and stores that are performed on the global, and collects which types are used at which offset, and then splits the global according to those.

This is both more general, and works fine with opaque pointers. This is also closer to how ordinary SROA is performed.

Diff Detail

Event Timeline

nikic created this revision.Jan 13 2022, 6:12 AM
nikic requested review of this revision.Jan 13 2022, 6:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 13 2022, 6:12 AM

looks like DebugInfo/Generic/global-sra-array.ll is failing

nikic updated this revision to Diff 399921.Jan 14 2022, 12:45 AM

Update debuginfo test.

nikic updated this revision to Diff 399927.Jan 14 2022, 1:26 AM

Rebase over additional test with [N x i8] type.

nikic updated this revision to Diff 399975.Jan 14 2022, 6:52 AM

Protect against recursive SRA.

reames added a subscriber: reames.Jan 14 2022, 8:33 AM

Hm, your new code looks correct to me.

There's one case in the original code I don't see reflected in the new code though. It doesn't appear to have any tests, and I'm not 100% sure it's reachable at all, but consider the following.

An outer struct with an inner array. The index expression has a non-constant operand for the offset into the array. (So, gep g, 0, v) I believe the existing code will try to strip the struct off, and replace the global with the array. Your code, because of the restriction to entirely constant offsets, will not.

In an opaque pointer world, this example can't arise. If you wanted to add your code without removing the existing, and put the old code under a non-opaque pointer check, I'd be fine with that. Though I'm really hoping you tell me I'm missing something and this case is impossible. :)

nikic added a comment.Jan 14 2022, 8:48 AM

Hm, your new code looks correct to me.

There's one case in the original code I don't see reflected in the new code though. It doesn't appear to have any tests, and I'm not 100% sure it's reachable at all, but consider the following.

An outer struct with an inner array. The index expression has a non-constant operand for the offset into the array. (So, gep g, 0, v) I believe the existing code will try to strip the struct off, and replace the global with the array. Your code, because of the restriction to entirely constant offsets, will not.

The old code doesn't handle this either, because isSafeSROAGEP() checks that all indices are constant integers. Besides, doing this optimization without constant indices would be illegal, because there may be notional overindexing (this would only be legal with inrange annotations, which is not something the old code tries to handle -- GlobalSplit is the optimization that does).

reames accepted this revision.Jan 14 2022, 8:59 AM

The old code doesn't handle this either, because isSafeSROAGEP() checks that all indices are constant integers.

What had confused me is that we have an early continue in the loop for structs which seemed to allow non-constant operands. I just realized that this is relying on the verifier to disallow non-constant struct indices. Dang that is subtle for no particular reason.

Given that LGTM

This revision is now accepted and ready to land.Jan 14 2022, 8:59 AM
This revision was automatically updated to reflect the committed changes.
uabelho added a subscriber: uabelho.EditedFeb 1 2022, 4:14 AM

Hello,

We've seen a case where the output from opt is different with/without debug info and ended up bisection at this patch.
After reduction and fiddling it seems like it doesn't actually have to do with debug info at all but something else. Our best reproducer so far is as below.

If we run sroa, early-cse and globalopt like this:

opt -passes='function(sroa,early-cse),invalidate<all>,globalopt' -debug-only=globalopt bbi-65060.ll -o /dev/null

we get one set of printouts and if we run like this where we run the same passes but in two different invocations:

opt -passes='function(sroa,early-cse),invalidate<all>' bbi-65060.ll | opt -passes=globalopt -debug-only=globalopt -o /dev/null

we get something else.

Notably is that with the second way, with the split pipeline we see

PERFORMING GLOBAL SRA ON: @g_93 = internal unnamed_addr global [6 x i16*] undef
MARKING CONSTANT: @g_93.0 = internal unnamed_addr global i16* undef
   *** Marking constant allowed us to simplify all users and delete global!
GLOBAL NEVER LOADED: @g_93.1 = internal unnamed_addr global i16* undef

which we don't see when we run all passes in the same pipe.

This has been very elusive and hard to reproduce and reduce so we wouldn't be surprised if it's connected to some memory issue, iteration order or something like that.

markus added a subscriber: markus.Feb 1 2022, 5:13 AM
nikic added a comment.Feb 1 2022, 5:57 AM

@uabelho Thanks for the report. The issue is related to a handling of dead constant users in GlobalStatus, which is why the two opt invocations are relevant (dead constant users will get dropped).

Cool! It seems the fix solves the original problem we saw with different result with/without debug info as well.
Thank you!

Unfortunately I've now stumbled upon a case where the compiler starts to hang with that fix:
clang -O2 --target=x86_64-unknown-linux-gnu -o /dev/null bbi-66447.c

and here's the bbi-66447.c

fhahn added a subscriber: fhahn.Jul 11 2022, 5:27 PM
fhahn added inline comments.
llvm/lib/Transforms/IPO/GlobalOpt.cpp
444

I think this limit is more aggressive for struct types than the original code AFAICT, which seems to be causing some code-size regressions. I put up D129525 to make the limit behave more like the original code.

Herald added a project: Restricted Project. · View Herald TranscriptJul 11 2022, 5:27 PM