This is an archive of the discontinued LLVM Phabricator instance.

[GlobalOpt] Allow dead struct fields in SRA with non constant offset.
AbandonedPublic

Authored by chrib on May 14 2019, 10:36 AM.

Details

Summary

Allow struct fields SRA and dead stores. This works by considering fields accesses from getElementPtr to be considered as a possible pointer root that can be cleaned up.
We check that the variable can be SRA by recursively checking the sub expressions with the new isSafeSubSROAGEP function.

basically this allows the array in following C code to be optimized out

struct Expr {

int a[2];
int b;

};

static struct Expr e;

int foo (int i)
{

e.b = 2;
e.a[i] = 1;
return e.b;

}

Diff Detail

Event Timeline

chrib created this revision.May 14 2019, 10:36 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 14 2019, 10:36 AM
chrib updated this revision to Diff 199480.May 14 2019, 10:43 AM
  • Fix comment
chrib edited the summary of this revision. (Show Details)May 14 2019, 11:05 AM
chrib retitled this revision from Improve GlobalOpt to recognize dead fields and propagate values to [GlobalOpt] recognize dead struct fields and propagate values.May 15 2019, 6:11 AM
jmolloy accepted this revision.May 20 2019, 7:18 AM

I think I've convinced myself that this is correct. LGTM.

This revision is now accepted and ready to land.May 20 2019, 7:18 AM
This revision was automatically updated to reflect the committed changes.
efriedma reopened this revision.May 23 2019, 6:06 PM
efriedma added a subscriber: efriedma.

Reverted in r361581.

It regresses https://bugs.llvm.org/show_bug.cgi?id=38309 (represented by the testcase test/Transforms/GlobalOpt/globalsra-multigep.ll). Please be more careful in the future when you're changing an existing testcase, especially one with a comment that specifically says the transform you're trying to do is not valid.

This revision is now accepted and ready to land.May 23 2019, 6:06 PM
efriedma requested changes to this revision.May 23 2019, 6:06 PM
This revision now requires changes to proceed.May 23 2019, 6:06 PM
chrib added a comment.May 23 2019, 9:42 PM

looking at it. reverting in the meantime is fine

chrib updated this revision to Diff 206425.Jun 25 2019, 6:01 AM
chrib edited the summary of this revision. (Show Details)

This fix the regression with multiple GEP arrays by allowing SRA for GEP ConstantExpr even if it might use an instruction containing indirect offsets access. The GlobalValue fields can be scalarized if they have StructType and not ArrayType with possible out-of-bound.

chrib retitled this revision from [GlobalOpt] recognize dead struct fields and propagate values to [GlobalOpt] Allow dead struct fields in SRA with non constant offset..Jun 25 2019, 6:03 AM

Is this rebased? A fix landed in https://bugs.llvm.org/show_bug.cgi?id=38334 (last comment).

xbolva00 added a comment.EditedJun 25 2019, 6:51 AM

Ah, sorry. I meant Eli’s link:
https://bugs.llvm.org/show_bug.cgi?id=38309

chrib added a comment.Jun 25 2019, 7:07 AM

I see. yes the fix is still there and I checked that this bug doesn't regresses.

The fix for the GEP check from the last comment is still there, but moved it around the GlovalValue first level to be candidate for SRA . This allow StructType fields to be scalarized apart.

as is 'a' and 'b' in

struct Expr {
    int a[3][3];
    int b;
  };

  static struct Expr e;

  int
  foo(int i)
  {
    e.a[i][0] = 1;
    e.b = 2;
    return e.a[0][0];
  }

but multiple arrays non-constant accesses should be safe

Ah, sorry. I meant Eli’s link:
https://bugs.llvm.org/show_bug.cgi?id=38309

The problem here is that the IR semantics don't allow this transform in general.

Even if you've managed to dodge the exact construct from bug 38309 (I'm not sure I understand how, but it's not really relevant), the rules for GEP indexing don't work the way you want them to. "inbounds" just means the result is somewhere within the same global; it doesn't mean that array indexes don't "overflow". See http://llvm.org/docs/LangRef.html#getelementptr-instruction .

The problem here is that the IR semantics don't allow this transform in general.

Even if you've managed to dodge the exact construct from bug 38309 (I'm not sure I understand how, but it's not really relevant), the rules for GEP indexing don't work the way you want them to. "inbounds" just means the result is somewhere within the same global; it doesn't mean that array indexes don't "overflow". See http://llvm.org/docs/LangRef.html#getelementptr-instruction .

The desire here is to look at constant GEPs? Maybe looking at the "inrange" keyword instead would be better (it has stronger semantics than inbounds), "If the inrange keyword is present before any index, loading from or storing to any pointer derived from the getelementptr has undefined behavior if the load or store would access memory outside of the bounds of the element selected by the index marked as inrange. " (also discussed in http://llvm.org/docs/LangRef.html#getelementptr-instruction).

chrib added a comment.Jun 26 2019, 3:32 AM

Yes you are right, my idea was to avoid checking the non-cst index access for sub uses of the Value, in order to allow independent field GEP accesses to be candidate for SRA (for struct types(, but indeed just checking the first livel of ConstantExpr GEP might just be too complicated and I don't think that the inbound or inrange markers help here. (and multi-dimensional array GEP access would be inbound but wrong to SRA.

What I'm thinking now, maybe, is to discriminate a valid SRA candidate with a common zero internalizer check for all the fields. so even an out-of-bound array access would not break the possible uses.

chrib abandoned this revision.Jul 1 2019, 2:18 AM

the declaration from Bug #38309 currently generates

@g_data = dso_local local_unnamed_addr global [8 x i16] [i16 1, i16 1, i16 1, i16 1, i16 0, i16 0, i16 0, i16 0], align 2

which seems better than a Struct LLVM IR construct as in :
.g_data = internal global <{ [8 x i16], [8 x i16] }> ...

So if my assumption that the test globalsra-multigep.ll was badly constructed is wrong, there don't seem I can fix the semantic inconsistency with the C struct UB without inconsistencies with this regression test.

sorry for the noise, abandoning this proposal