This is an archive of the discontinued LLVM Phabricator instance.

Differential D61911

[GlobalOpt] Allow dead struct fields in SRA with non constant offset.
AbandonedPublic

Authored by chrib on May 14 2019, 10:36 AM.

Details

Reviewers

greened
bkramer
nicholas
jmolloy
efriedma

Commits

rZORG9ad1f8e0907c: [GlobalOpt] recognize dead struct fields and propagate values
rG9ad1f8e0907c: [GlobalOpt] recognize dead struct fields and propagate values
rG4a7da98bd928: [GlobalOpt] recognize dead struct fields and propagate values
rL361460: [GlobalOpt] recognize dead struct fields and propagate values

Summary

Allow SRA of struct fields and removal of dead stores to them. This works by treating field accesses through a getElementPtr as possible pointer roots that can be cleaned up.
We check that the variable can be SRA'd by recursively checking the subexpressions with the new isSafeSubSROAGEP function.

Basically, this allows the array in the following C code to be optimized out:

struct Expr {
  int a[2];
  int b;
};

static struct Expr e;

int foo (int i)
{
  e.b = 2;
  e.a[i] = 1;
  return e.b;
}
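As an illustration only (this is not part of the patch), the effect the summary aims for can be sketched by hand in C. The names `foo_original` and `foo_scalarized` are hypothetical, and the equivalence is assumed only for in-bounds `i` (0 or 1), since C makes the out-of-bounds case undefined.

```c
#include <assert.h>

struct Expr {
  int a[2];
  int b;
};

static struct Expr e;

/* The function from the summary, unchanged apart from its name. */
static int foo_original(int i) {
  e.b = 2;
  e.a[i] = 1;
  return e.b;
}

/* Hypothetical hand-written result once the fields are treated as
   independent scalars: the array is never read, so its store is dead,
   and the value stored to b is forwarded to the return. */
static int foo_scalarized(int i) {
  (void)i; /* the index no longer matters */
  return 2;
}
```

Calling both functions with `i = 0` and `i = 1` yields the same result, which is the behaviour the optimization would need to preserve.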

Diff Detail

Repository

rL LLVM

Build Status

Buildable 33875
Build 33874: arc lint + arc unit

Event Timeline

chrib created this revision. May 14 2019, 10:36 AM

Herald added a project: Restricted Project. May 14 2019, 10:36 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B31889: Diff 199477. May 14 2019, 10:36 AM

Fix comment

Harbormaster completed remote builds in B31891: Diff 199480. May 14 2019, 10:43 AM

chrib added reviewers: greened, bkramer. May 14 2019, 10:56 AM

chrib edited the summary of this revision. May 14 2019, 11:05 AM

chrib retitled this revision from "Improve GlobalOpt to recognize dead fields and propagate values" to "[GlobalOpt] recognize dead struct fields and propagate values". May 15 2019, 6:11 AM

chrib added reviewers: nicholas, jmolloy. May 20 2019, 6:31 AM

I think I've convinced myself that this is correct. LGTM.

This revision is now accepted and ready to land. May 20 2019, 7:18 AM

Closed by commit rL361460: [GlobalOpt] recognize dead struct fields and propagate values (authored by chrib). May 22 2019, 10:54 PM

This revision was automatically updated to reflect the committed changes.

Reverted in r361581.

It regresses https://bugs.llvm.org/show_bug.cgi?id=38309 (represented by the testcase test/Transforms/GlobalOpt/globalsra-multigep.ll). Please be more careful in the future when you're changing an existing testcase, especially one with a comment that specifically says the transform you're trying to do is not valid.

This revision is now accepted and ready to land. May 23 2019, 6:06 PM

efriedma requested changes to this revision. May 23 2019, 6:06 PM

This revision now requires changes to proceed. May 23 2019, 6:06 PM

Looking at it. Reverting in the meantime is fine.

This fixes the regression with multiple-GEP arrays by allowing SRA for a ConstantExpr GEP even if it might be used by an instruction that accesses the global through indirect (non-constant) offsets. The GlobalValue fields can be scalarized if they have StructType, but not if they have ArrayType, where an out-of-bounds access is possible.
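The per-index test that this update relocates (the loop over GEP indices in the diff) can be modelled outside LLVM roughly as follows. This is a sketch in C, not LLVM code: `struct GepIndex` and `indices_safe_for_sra` are hypothetical stand-ins for what the diff obtains through `gep_type_iterator`, and a `num_elements` of 0 is an assumed convention for an unbounded sequence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one GEP index after the leading pointer index. */
struct GepIndex {
  bool is_struct_field;  /* index selects a struct field */
  bool is_constant;      /* index is a compile-time constant */
  uint64_t value;        /* meaningful only when is_constant is true */
  uint64_t num_elements; /* array length; 0 is assumed to mean unbounded */
};

/* Returns true when every non-struct index is a constant that stays
   within the bounds of the array it indexes. */
static bool indices_safe_for_sra(const struct GepIndex *idx, size_t n) {
  for (size_t k = 0; k < n; ++k) {
    if (idx[k].is_struct_field)
      continue; /* struct field indices are always constant */
    if (!idx[k].is_constant)
      return false; /* e.g. A[0][i] with a runtime i */
    if (idx[k].num_elements != 0 && idx[k].value >= idx[k].num_elements)
      return false; /* constant, but out of range */
  }
  return true;
}
```

Under this model, an access such as `e.a[1]` passes while `e.a[i]` with a runtime `i`, or a constant index equal to the array length, is rejected.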

chrib retitled this revision from "[GlobalOpt] recognize dead struct fields and propagate values" to "[GlobalOpt] Allow dead struct fields in SRA with non constant offset.". Jun 25 2019, 6:03 AM

Harbormaster completed remote builds in B33875: Diff 206425. Jun 25 2019, 6:13 AM

Is this rebased? A fix landed in https://bugs.llvm.org/show_bug.cgi?id=38334 (last comment).

Ah, sorry. I meant Eli’s link:
https://bugs.llvm.org/show_bug.cgi?id=38309

I see. Yes, the fix is still there, and I checked that this bug doesn't regress.

The fix for the GEP check from the last comment is still there, but I moved it to the first level of the GlobalValue that is a candidate for SRA. This allows StructType fields to be scalarized separately, as with 'a' and 'b' in:

struct Expr {
  int a[3][3];
  int b;
};

static struct Expr e;

int
foo(int i)
{
  e.a[i][0] = 1;
  e.b = 2;
  return e.a[0][0];
}

But non-constant accesses to multiple arrays should be safe.

In D61911#1557439, @xbolva00 wrote:

Ah, sorry. I meant Eli’s link:
https://bugs.llvm.org/show_bug.cgi?id=38309

The problem here is that the IR semantics don't allow this transform in general.

Even if you've managed to dodge the exact construct from bug 38309 (I'm not sure I understand how, but it's not really relevant), the rules for GEP indexing don't work the way you want them to. "inbounds" just means the result is somewhere within the same global; it doesn't mean that array indexes don't "overflow". See http://llvm.org/docs/LangRef.html#getelementptr-instruction .
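The point about indexes overflowing into a neighbouring field can be seen in plain offset arithmetic. The sketch below is C rather than LLVM IR and only models how a GEP-style address is computed as a base plus scaled indices. `flat_offset_of_a` is a hypothetical helper, and the absence of padding between `a` and `b` is an assumption that holds on common ABIs but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <stddef.h>

struct Expr {
  int a[2];
  int b;
};

/* Byte offset produced by a flat "field offset + index * element size"
   computation for element i of field a, with no bounds check on i. */
static size_t flat_offset_of_a(size_t i) {
  return offsetof(struct Expr, a) + i * sizeof(int);
}
```

With this layout, `flat_offset_of_a(2)` lands on the offset of `b`: the computed address is still inside the same object, so an address-level "inbounds" guarantee alone says nothing about which field is reached.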

In D61911#1557985, @efriedma wrote:

The problem here is that the IR semantics don't allow this transform in general.

Even if you've managed to dodge the exact construct from bug 38309 (I'm not sure I understand how, but it's not really relevant), the rules for GEP indexing don't work the way you want them to. "inbounds" just means the result is somewhere within the same global; it doesn't mean that array indexes don't "overflow". See http://llvm.org/docs/LangRef.html#getelementptr-instruction .

The desire here is to look at constant GEPs? Maybe looking at the "inrange" keyword instead would be better (it has stronger semantics than inbounds), "If the inrange keyword is present before any index, loading from or storing to any pointer derived from the getelementptr has undefined behavior if the load or store would access memory outside of the bounds of the element selected by the index marked as inrange. " (also discussed in http://llvm.org/docs/LangRef.html#getelementptr-instruction).

Yes, you are right. My idea was to avoid checking the non-constant index accesses for sub-uses of the Value, in order to allow independent field GEP accesses to be candidates for SRA (for struct types), but indeed just checking the first level of the ConstantExpr GEP might be too complicated, and I don't think that the inbounds or inrange markers help here (a multi-dimensional array GEP access would be inbounds but wrong to SRA).

What I'm thinking now, maybe, is to discriminate a valid SRA candidate with a common zero initializer check for all the fields, so that even an out-of-bounds array access would not break the possible uses.

The declaration from bug 38309 currently generates

@g_data = dso_local local_unnamed_addr global [8 x i16] [i16 1, i16 1, i16 1, i16 1, i16 0, i16 0, i16 0, i16 0], align 2

which seems better than a struct LLVM IR construct as in:
.g_data = internal global <{ [8 x i16], [8 x i16] }> ...

So if my assumption that the test globalsra-multigep.ll was badly constructed is wrong, it doesn't seem that I can fix the semantic inconsistency with the C struct UB without introducing inconsistencies with this regression test.

Sorry for the noise, abandoning this proposal.

Revision Contents

Path: Size

lib/Transforms/IPO/GlobalOpt.cpp: 49 lines
test/Transforms/GlobalOpt/globalsra-ptr.ll: 18 lines
test/Transforms/GlobalOpt/globalsra-struct.ll: 23 lines

Diff 206425

lib/Transforms/IPO/GlobalOpt.cpp

[178 lines not shown]
do {

V = I->getOperand(0);		V = I->getOperand(0);
} while (true);		} while (true);
}		}

/// This GV is a pointer root. Loop over all users of the global and clean up		/// This GV is a pointer root. Loop over all users of the global and clean up
/// any that obviously don't assign the global a value that isn't dynamically		/// any that obviously don't assign the global a value that isn't dynamically
/// allocated.		/// allocated.
static bool CleanupPointerRootUsers(GlobalVariable *GV,		static bool CleanupPointerRootUsers(Value *GV, const TargetLibraryInfo *TLI) {
const TargetLibraryInfo *TLI) {
// A brief explanation of leak checkers. The goal is to find bugs where		// A brief explanation of leak checkers. The goal is to find bugs where
// pointers are forgotten, causing an accumulating growth in memory		// pointers are forgotten, causing an accumulating growth in memory
// usage over time. The common strategy for leak checkers is to whitelist the		// usage over time. The common strategy for leak checkers is to whitelist the
// memory pointed to by globals at exit. This is popular because it also		// memory pointed to by globals at exit. This is popular because it also
// solves another problem where the main thread of a C++ program may shut down		// solves another problem where the main thread of a C++ program may shut down
// before other threads that are still expecting to use those globals. To		// before other threads that are still expecting to use those globals. To
// handle that case, we expect the program may create a singleton and never		// handle that case, we expect the program may create a singleton and never
// destroy it.		// destroy it.
[30 lines not shown]
if (StoreInst *SI = dyn_cast<StoreInst>(U)) {
if (MemSrc && MemSrc->isConstant()) {		if (MemSrc && MemSrc->isConstant()) {
Changed = true;		Changed = true;
MTI->eraseFromParent();		MTI->eraseFromParent();
} else if (Instruction *I = dyn_cast<Instruction>(MemSrc)) {		} else if (Instruction *I = dyn_cast<Instruction>(MemSrc)) {
if (I->hasOneUse())		if (I->hasOneUse())
Dead.push_back(std::make_pair(I, MTI));		Dead.push_back(std::make_pair(I, MTI));
}		}
} else if (ConstantExpr *CE = dyn_cast<ConstantExpr>(U)) {		} else if (ConstantExpr *CE = dyn_cast<ConstantExpr>(U)) {
		if (CE->getOpcode() == Instruction::GetElementPtr)
		Changed |= CleanupPointerRootUsers(CE, TLI);
if (CE->use_empty()) {		if (CE->use_empty()) {
CE->destroyConstant();		CE->destroyConstant();
Changed = true;		Changed = true;
}		}
} else if (Constant *C = dyn_cast<Constant>(U)) {		} else if (Constant *C = dyn_cast<Constant>(U)) {
if (isSafeToDestroyConstant(C)) {		if (isSafeToDestroyConstant(C)) {
C->destroyConstant();		C->destroyConstant();
// This could have invalidated UI, start over from scratch.		// This could have invalidated UI, start over from scratch.
[121 lines not shown]
static bool isSafeSROAGEP(User *U) {
// Check to see if this ConstantExpr GEP is SRA'able. In particular, we		// Check to see if this ConstantExpr GEP is SRA'able. In particular, we
// don't like < 3 operand CE's, and we don't like non-constant integer		// don't like < 3 operand CE's, and we don't like non-constant integer
// indices. This enforces that all uses are 'gep GV, 0, C, ...' for some		// indices. This enforces that all uses are 'gep GV, 0, C, ...' for some
// value of C.		// value of C.
if (U->getNumOperands() < 3 || !isa<Constant>(U->getOperand(1)) ||		if (U->getNumOperands() < 3 || !isa<Constant>(U->getOperand(1)) ||
!cast<Constant>(U->getOperand(1))->isNullValue())		!cast<Constant>(U->getOperand(1))->isNullValue())
return false;		return false;

gep_type_iterator GEPI = gep_type_begin(U), E = gep_type_end(U);
++GEPI; // Skip over the pointer index.

// For all other level we require that the indices are constant and inrange.
// In particular, consider: A[0][i]. We cannot know that the user isn't doing
// invalid things like allowing i to index an out-of-range subscript that
// accesses A[1]. This can also happen between different members of a struct
// in llvm IR.
for (; GEPI != E; ++GEPI) {
if (GEPI.isStruct())
continue;

ConstantInt *IdxVal = dyn_cast<ConstantInt>(GEPI.getOperand());
if (!IdxVal || (GEPI.isBoundedSequential() &&
IdxVal->getZExtValue() >= GEPI.getSequentialNumElements()))
return false;
}

return llvm::all_of(U->users(),		return llvm::all_of(U->users(),
[](User *UU) { return isSafeSROAElementUse(UU); });		[](User *UU) { return isSafeSROAElementUse(UU); });
}		}

/// Return true if the specified instruction is a safe user of a derived		/// Return true if the specified instruction is a safe user of a derived
/// expression from a global that we want to SROA.		/// expression from a global that we want to SROA.
static bool isSafeSROAElementUse(Value *V) {		static bool isSafeSROAElementUse(Value *V) {
// We might have a dead and dangling constant hanging off of here.		// We might have a dead and dangling constant hanging off of here.
[13 lines not shown]
// Otherwise, it must be a GEP. Check it and its users are safe to SRA.		// Otherwise, it must be a GEP. Check it and its users are safe to SRA.
return isa<GetElementPtrInst>(I) && isSafeSROAGEP(I);		return isa<GetElementPtrInst>(I) && isSafeSROAGEP(I);
}		}

/// Look at all uses of the global and decide whether it is safe for us to		/// Look at all uses of the global and decide whether it is safe for us to
/// perform this transformation.		/// perform this transformation.
static bool GlobalUsersSafeToSRA(GlobalValue *GV) {		static bool GlobalUsersSafeToSRA(GlobalValue *GV) {
for (User *U : GV->users()) {		for (User *U : GV->users()) {
// The user of the global must be a GEP Inst or a ConstantExpr GEP.		if (!isa<ConstantExpr>(U) ||
if (!isa<GetElementPtrInst>(U) &&		cast<ConstantExpr>(U)->getOpcode() != Instruction::GetElementPtr)
(!isa<ConstantExpr>(U) ||
cast<ConstantExpr>(U)->getOpcode() != Instruction::GetElementPtr))
return false;		return false;

		// For the first level of this array value, we require that the indices
		// are constant and inrange.
		// In particular, consider: A[0][i]. We cannot know that the user isn't
		// doing invalid things like allowing i to index an out-of-range
		// subscript that accesses A[1].
		if (isa<GetElementPtrInst>(U) && dyn_cast<ArrayType>(GV->getType())) {
		gep_type_iterator GEPI = gep_type_begin(U), E = gep_type_end(U);
		++GEPI; // Skip over the pointer index.

		for (; GEPI != E; ++GEPI) {
		if (GEPI.isStruct())
		continue;

		ConstantInt *IdxVal = dyn_cast<ConstantInt>(GEPI.getOperand());
		if (!IdxVal || (GEPI.isBoundedSequential() &&
		IdxVal->getZExtValue() >= GEPI.getSequentialNumElements()))
		return false;
		}
		}

// Check the gep and it's users are safe to SRA		// Check the gep and it's users are safe to SRA
if (!isSafeSROAGEP(U))		if (!isSafeSROAGEP(U))
return false;		return false;
}		}

return true;		return true;
}		}

[2,588 lines not shown]

test/Transforms/GlobalOpt/globalsra-ptr.ll

This file was added.

				; RUN: opt < %s -globalopt -S | FileCheck %s

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				%struct.Expr = type { [2 x i32], i32* }

				@e = internal unnamed_addr global [2 x %struct.Expr] zeroinitializer, align 16

				define dso_local i32* @foo(i32 %i) {
				entry:
				store i32* inttoptr (i64 3 to i32*), i32** getelementptr inbounds ([2 x %struct.Expr], [2 x %struct.Expr]* @e, i64 0, i64 0, i32 1), align 8
				; CHECK-NOT: store i32* inttoptr (i64 3 to i32*), i32** getelementptr inbounds ([2 x %struct.Expr], [2 x %struct.Expr]* @e, i64 0, i64 0, i32 1), align 8
				%idxprom = sext i32 %i to i64
				%arrayidx = getelementptr inbounds [2 x %struct.Expr], [2 x %struct.Expr]* @e, i64 0, i64 0, i32 0, i64 %idxprom
				store i32 2, i32* %arrayidx, align 4
				ret i32* inttoptr (i64 3 to i32*)
				}

test/Transforms/GlobalOpt/globalsra-struct.ll

This file was added.

				; RUN: opt < %s -globalopt -S | FileCheck %s

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				%struct.Expr = type { [2 x i32], i32 }

				@e = internal unnamed_addr global %struct.Expr zeroinitializer, align 4

				define dso_local i32 @foo(i32 %i) {
				entry:
				store i32 1, i32* getelementptr inbounds (%struct.Expr, %struct.Expr* @e, i32 0, i32 0, i64 1), align 4
				; Cannot remove the store because of the indexed GEP load below.
				; CHECK: store i32 1
				store i32 2, i32* getelementptr inbounds (%struct.Expr, %struct.Expr* @e, i32 0, i32 1), align 4
				; This store doesn't conflict with the indexed GEP below
				; CHECK-NOT: store i32 2
				%idxprom = sext i32 %i to i64
				%arrayidx = getelementptr inbounds [2 x i32], [2 x i32]* getelementptr inbounds (%struct.Expr, %struct.Expr* @e, i32 0, i32 0), i64 0, i64 %idxprom
				%0 = load i32, i32* %arrayidx, align 4
				ret i32 %0
				}