
Details

Reviewers

davide
jdoerfert
aprantl

Summary

Model the inaccessiblememonly, readnone, readonly, and writeonly attributes on function calls and parameters for GlobalOpt and GlobalStatus.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	20 ms	MLIR.Dialect/Affine::Unknown Unit Message ("")

Event Timeline

ddcc created this revision. Mar 26 2020, 4:30 PM

Herald added a project: Restricted Project. Mar 26 2020, 4:30 PM

Herald added a subscriber: hiraditya.

Harbormaster completed remote builds in B50641: Diff 253014. Mar 26 2020, 5:25 PM

I don't really know enough about this to give a meaningful review.

llvm/lib/Transforms/IPO/GlobalOpt.cpp
1837	Loads.push_back({CB, U->getType()->getPointerElementType()});

Remove explicit std::make_pair

ddcc added a reviewer: jdoerfert. Mar 26 2020, 8:13 PM

Harbormaster completed remote builds in B50652: Diff 253034. Mar 26 2020, 9:14 PM

Thanks for working on this, I'm really hoping we start to use attributes more aggressively. I inlined comments.

llvm/lib/Transforms/IPO/GlobalOpt.cpp
1828	This assert would need a message but it has to go completely. Having non-arg-operand uses is totally fine. That said, we should have a test case. One example where this will soon happen in practice is `llvm.assume`. Create a test with something like: `call void @llvm.assume(i1 true) ["align"(@Global, i64 128)]` If the use is not an arg operand you have to bail. Except if the User `isDroppable()`.
1880	This doesn't work for calls unfortunately. We don't have an attribute that says it is accessed and how much of it is (yet!). My suggestion is a TODO and the following handling for now: Write-only uses in calls are ignored, as read-none uses are. Read-only uses are matched with the type of the global value, thus we pretend we might read the entire thing.
llvm/lib/Transforms/Utils/GlobalStatus.cpp
205	This handles operand bundle uses wrong. We need to bail for them (I guess), except if the user `isDroppable` (maybe). If the argument is not marked `nocapture` you have to assume everything could happen as you cannot track uses anymore.

This revision now requires changes to proceed. Mar 26 2020, 10:26 PM

Revise based on feedback

Sure, I'm working on an instrumentation pass which is inserting calls that inhibit optimization, so I'm trying to work around the issue using function attributes, and need to look into memory-to-register promotion next.

I've made the changes, but I'm a little confused why read-only and write-only calls need to be handled differently. Why can't I assume that in the worst-case, the entire type is accessed? Also, what are the semantics of readonly and writeonly with respect to pointer casts? Isn't it valid behavior in C to cast any pointer type to void * as long as it is cast back to the original type before being accessed, so wouldn't this affect both loads and stores (that the actual load/store size could be larger than the argument type size)?

Fix test failures

Harbormaster failed remote builds in B50726: Diff 253202! Mar 27 2020, 1:43 PM

Harbormaster failed remote builds in B50730: Diff 253209! Mar 27 2020, 2:17 PM

ddcc mentioned this in D76966: [GlobalOpt/GlobalStatus][Mem2Reg] Handle PtrToInt passed as function call operand. Mar 27 2020, 6:49 PM

In D76894#1946810, @ddcc wrote:

Sure, I'm working on an instrumentation pass which is inserting calls that inhibit optimization, so I'm trying to work around the issue using function attributes, and need to look into memory-to-register promotion next.

Interesting. Feel free to send me an email to bounce ideas off me :)
You should also enable the Attributor once you start adding attributes ;)

I've made the changes, but I'm a little confused why read-only and write-only calls need to be handled differently.

Because we use the fact that something was written for reasoning, at least in globalopt. If the global was always written before it was read we basically privatize it in the function. We cannot ensure that it is actually written for write-only calls. Read-only doesn't matter because we don't care whether the read may or must happen. Does that make sense? (Btw. I'm not an expert on this but I just read the surrounding code so I might be wrong.)
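The may-write versus must-write distinction in this reply can be illustrated with a small toy model. This is only a sketch and not LLVM code: it assumes straight-line code in a single basic block, so "the store dominates the load" reduces to "the store comes earlier", and the names `Access`, `deadOnEntry`, and `demoDeadOnEntry` are hypothetical. It follows the reviewer's suggestion of ignoring write-only calls; the posted diff bails on them instead.

```cpp
#include <cassert>
#include <vector>

// Toy model, not LLVM code. All names below are hypothetical.
enum class Access { MustStore, MayWrite, Read };

// Events are in execution order within one basic block, so "store dominates
// load" reduces to "store appears earlier in the list".
// Returns true when no Read can observe the value the global had on entry.
inline bool deadOnEntry(const std::vector<Access> &Events) {
  bool Written = false;
  for (Access A : Events) {
    if (A == Access::MustStore)
      Written = true;
    else if (A == Access::Read && !Written)
      return false;
    // Access::MayWrite falls through: the callee is allowed to write but is
    // not required to, so it cannot justify a later read.
  }
  return true;
}

// Usage: a real store covers the read, a write-only call does not.
inline void demoDeadOnEntry() {
  assert(deadOnEntry({Access::MustStore, Access::Read}));
  assert(!deadOnEntry({Access::MayWrite, Access::Read}));
}
```

A read needs no such distinction in this model: whether it may or must happen, it has to be covered by an earlier store either way.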

Why can't I assume that in the worst-case, the entire type is accessed?

The worst-case is fine but for a call it is: the entire type is read and something is written.

Also, what are the semantics of readonly and writeonly with respect to pointer casts? Isn't it valid behavior in C to cast any pointer type to void * as long as it is cast back to the original type before being accessed, so wouldn't this affect both loads and stores (that the actual load/store size could be larger than the argument type size)?

C doesn't really matter here but what you say is not wrong. We cannot conclude anything from the type of the pointer that goes into a call. That is why I said we have to assume the entire array is accessed. However,

// Assume that in the worst case, the entire type is accessed
Loads.push_back({CB, U->getType()->getPointerElementType()});

is not it. This uses the type at the call site. The entire thing works because we see the allocation so we know how big the entire thing is. Use the allocation type instead please.
(FWIW, even in C there is no need to cast it back into "the original type" in a lot of situations, including but not limited to access via char*.)
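Why the call-site pointer type is the wrong size to use can be shown with a toy calculation. This is a sketch under stated assumptions, not the patch itself: byte counts stand in for `DL.getTypeStoreSize`, and `ReadOnlyCallUse`, `assumedReadSize`, `storeCoversRead`, and `demoReadSize` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Toy model: byte counts stand in for DL.getTypeStoreSize(Ty).
struct ReadOnlyCallUse {
  std::uint64_t CallSitePointeeSize; // e.g. 1 when the argument is i8*
  std::uint64_t GlobalValueSize;     // e.g. 4 when the global is an i32
};

// Worst case for a readonly call: the callee reads the whole global,
// whatever pointer type the call site happens to use.
inline std::uint64_t assumedReadSize(const ReadOnlyCallUse &U) {
  return U.GlobalValueSize;
}

// Size half of the check in the diff: a dominating store only covers a read
// that is no wider than the store.
inline bool storeCoversRead(std::uint64_t StoreSize, std::uint64_t ReadSize) {
  return ReadSize <= StoreSize;
}

inline void demoReadSize() {
  ReadOnlyCallUse U{1, 4}; // an i32 global passed as i8*
  // A 2-byte store would look sufficient if the call-site type were trusted,
  assert(storeCoversRead(2, U.CallSitePointeeSize));
  // but it does not cover a read of the whole 4-byte global.
  assert(!storeCoversRead(2, assumedReadSize(U)));
}
```

With the global's own type, only a store at least as wide as the whole global can be treated as covering the call.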

llvm/lib/Transforms/Utils/GlobalStatus.cpp
186	Are we sure `Stored` means "maybe stored"?

Thanks for the feedback, I've sent you an email.

Oops, I missed that writeonly is may-writeonly, which would break the store dominator analysis for the localize optimization. Should be fixed now.
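The decision order that the posted diff ends up with for a call use can be summarized in a toy model. This is a hedged sketch: plain `bool` fields replace the `CallBase` queries used in the diff (`isArgOperand`, `paramHasAttr`, `hasFnAttr`), function-level and parameter-level attributes are merged into one flag each, and `UseKind`, `CallUse`, `classify`, and `demoClassify` are hypothetical names.

```cpp
#include <cassert>

// Toy model of the decision order. All names are hypothetical.
enum class UseKind { Bail, Ignore, LoadWholeGlobal };

struct CallUse {
  bool IsArgOperand = false;        // false for operand bundle uses
  bool NoCapture = false;
  bool InaccessibleMemOnly = false;
  bool ReadNone = false;            // on the function or on the parameter
  bool ReadOnly = false;            // on the function or on the parameter
  bool WriteOnly = false;           // recorded, but deliberately not consulted
};

inline UseKind classify(const CallUse &U) {
  // Operand bundle uses and possibly-captured pointers cannot be tracked.
  if (!U.IsArgOperand || !U.NoCapture)
    return UseKind::Bail;
  // The callee never dereferences the pointer.
  if (U.InaccessibleMemOnly || U.ReadNone)
    return UseKind::Ignore;
  // The callee may read: treat the call as a load of the whole global.
  if (U.ReadOnly)
    return UseKind::LoadWholeGlobal;
  // Everything else, including writeonly (a may-write), is unsupported.
  return UseKind::Bail;
}

inline void demoClassify() {
  CallUse U;
  U.IsArgOperand = true;
  U.NoCapture = true;
  U.WriteOnly = true;
  assert(classify(U) == UseKind::Bail);
  U.WriteOnly = false;
  U.ReadOnly = true;
  assert(classify(U) == UseKind::LoadWholeGlobal);
}
```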

Fix load type and store

Remove unneeded store code

Harbormaster failed remote builds in B50833: Diff 253379! Mar 28 2020, 4:08 PM

Harbormaster failed remote builds in B50835: Diff 253381! Mar 28 2020, 4:40 PM

Update tests with update_test_checks.py

Harbormaster failed remote builds in B51005: Diff 253671! Mar 30 2020, 1:37 PM

ping

Diff 253671

llvm/lib/Transforms/IPO/GlobalOpt.cpp

[...]
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include "llvm/Transforms/IPO.h"		#include "llvm/Transforms/IPO.h"
#include "llvm/Transforms/Utils/CtorUtils.h"		#include "llvm/Transforms/Utils/CtorUtils.h"
#include "llvm/Transforms/Utils/Evaluator.h"		#include "llvm/Transforms/Utils/Evaluator.h"
#include "llvm/Transforms/Utils/GlobalStatus.h"		#include "llvm/Transforms/Utils/GlobalStatus.h"
#include "llvm/Transforms/Utils/Local.h"		#include "llvm/Transforms/Utils/Local.h"
#include <cassert>		#include <cassert>
#include <cstdint>		#include <cstdint>
		#include <tuple>
#include <utility>		#include <utility>
#include <vector>		#include <vector>

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "globalopt"		#define DEBUG_TYPE "globalopt"

STATISTIC(NumMarked , "Number of globals marked constant");		STATISTIC(NumMarked , "Number of globals marked constant");
[...]	static bool isPointerValueDeadOnEntryToFunction(
// On each of these uses, identify if the memory that GV points to is		// On each of these uses, identify if the memory that GV points to is
// used/required/live at the start of the function. If it is not, for example		// used/required/live at the start of the function. If it is not, for example
// if the first thing the function does is store to the GV, the GV can		// if the first thing the function does is store to the GV, the GV can
// possibly be demoted.		// possibly be demoted.
//		//
// We don't do an exhaustive search for memory operations - simply look		// We don't do an exhaustive search for memory operations - simply look
// through bitcasts as they're quite common and benign.		// through bitcasts as they're quite common and benign.
const DataLayout &DL = GV->getParent()->getDataLayout();		const DataLayout &DL = GV->getParent()->getDataLayout();
SmallVector<LoadInst *, 4> Loads;		SmallVector<std::pair<const Instruction *, Type *>, 4> Loads;
SmallVector<StoreInst *, 4> Stores;		SmallVector<const StoreInst *, 4> Stores;
for (auto *U : GV->users()) {		SmallVector<const Value *, 4> Roots;
if (Operator::getOpcode(U) == Instruction::BitCast) {
for (auto *UU : U->users()) {		Roots.push_back(GV);
if (auto *LI = dyn_cast<LoadInst>(UU))		while (Roots.size()) {
Loads.push_back(LI);		const Value *R = Roots.pop_back_val();
		for (auto &U : R->uses()) {
		User *UU = U.getUser();
		if (Operator::getOpcode(UU) == Instruction::BitCast)
		Roots.push_back(UU);
		else if (auto *LI = dyn_cast<LoadInst>(UU))
		Loads.push_back(std::make_pair<>(LI, LI->getType()));
else if (auto *SI = dyn_cast<StoreInst>(UU))		else if (auto *SI = dyn_cast<StoreInst>(UU))
Stores.push_back(SI);		Stores.push_back(SI);
else		else if (auto *CB = dyn_cast<CallBase>(UU)) {
		// TODO: Handle isDroppable() case
		if (!CB->isArgOperand(&U))
		return false;
		jdoerfert (inline comment): This assert would need a message but it has to go completely. Having non-arg-operand uses is totally fine. That said, we should have a test case. One example where this will soon happen in practice is `llvm.assume`. Create a test with something like: `call void @llvm.assume(i1 true) ["align"(@Global, i64 128)]` If the use is not an arg operand you have to bail. Except if the User `isDroppable()`.
		unsigned ArgNo = CB->getArgOperandNo(&U);
		// Argument must not be captured for subsequent use
		if (!CB->paramHasAttr(ArgNo, Attribute::NoCapture))
		return false;
		// Depending on attributes, either treat calls as load at the
		// call site, or ignore them if they are not going to be dereferenced.
		if (CB->hasFnAttr(Attribute::InaccessibleMemOnly) ||
		CB->hasFnAttr(Attribute::ReadNone) ||
		CB->paramHasAttr(ArgNo, Attribute::ReadNone))
		aprantl (inline comment): Loads.push_back({CB, U->getType()->getPointerElementType()});
		continue;
		else if (CB->hasFnAttr(Attribute::ReadOnly) ||
		CB->paramHasAttr(ArgNo, Attribute::ReadOnly)) {
		// Assume that in the worst case, the entire type is accessed
		Loads.push_back({CB, GV->getType()->getPointerElementType()});
		} else {
return false;		return false;
}		}
continue;		} else
}

Instruction *I = dyn_cast<Instruction>(U);
if (!I)
return false;
assert(I->getParent()->getParent() == F);

if (auto *LI = dyn_cast<LoadInst>(I))
Loads.push_back(LI);
else if (auto *SI = dyn_cast<StoreInst>(I))
Stores.push_back(SI);
else
return false;		return false;
}		}
		}

// We have identified all uses of GV into loads and stores. Now check if all		// We have identified all uses of GV into loads and stores. Now check if all
// of them are known not to depend on the value of the global at the function		// of them are known not to depend on the value of the global at the function
// entry point. We do this by ensuring that every load is dominated by at		// entry point. We do this by ensuring that every load is dominated by at
// least one store.		// least one store.
auto &DT = LookupDomTree(const_cast<Function *>(F));		auto &DT = LookupDomTree(const_cast<Function *>(F));

// The below check is quadratic. Check we're not going to do too many tests.		// The below check is quadratic. Check we're not going to do too many tests.
// FIXME: Even though this will always have worst-case quadratic time, we		// FIXME: Even though this will always have worst-case quadratic time, we
// could put effort into minimizing the average time by putting stores that		// could put effort into minimizing the average time by putting stores that
// have been shown to dominate at least one load at the beginning of the		// have been shown to dominate at least one load at the beginning of the
// Stores array, making subsequent dominance checks more likely to succeed		// Stores array, making subsequent dominance checks more likely to succeed
// early.		// early.
//		//
// The threshold here is fairly large because global->local demotion is a		// The threshold here is fairly large because global->local demotion is a
// very powerful optimization should it fire.		// very powerful optimization should it fire.
const unsigned Threshold = 100;		const unsigned Threshold = 100;
if (Loads.size() * Stores.size() > Threshold)		if (Loads.size() * Stores.size() > Threshold)
return false;		return false;

for (auto *L : Loads) {		for (auto &LP : Loads) {
auto *LTy = L->getType();		Type *LTy;
		const Instruction *L;
		std::tie(L, LTy) = LP;
if (none_of(Stores, [&](const StoreInst *S) {		if (none_of(Stores, [&](const StoreInst *S) {
auto *STy = S->getValueOperand()->getType();		auto *STy = S->getValueOperand()->getType();
// The load is only dominated by the store if DomTree says so		// The load is only dominated by the store if DomTree says so
// and the number of bits loaded in L is less than or equal to		// and the number of bits loaded in L is less than or equal to
// the number of bits stored in S.		// the number of bits stored in S.
return DT.dominates(S, L) &&		return DT.dominates(S, L) &&
DL.getTypeStoreSize(LTy) <= DL.getTypeStoreSize(STy);		DL.getTypeStoreSize(LTy) <= DL.getTypeStoreSize(STy);
		jdoerfert (inline comment): This doesn't work for calls unfortunately. We don't have an attribute that says it is accessed and how much of it is (yet!). My suggestion is a TODO and the following handling for now: Write-only uses in calls are ignored, as read-none uses are. Read-only uses are matched with the type of the global value, thus we pretend we might read the entire thing.
}))		}))
return false;		return false;
}		}
// All loads have known dependences inside F, so the global can be localized.		// All loads have known dependences inside F, so the global can be localized.
return true;		return true;
}		}

/// C may have non-instruction users. Can all of those users be turned into		/// C may have non-instruction users. Can all of those users be turned into
[...]

llvm/lib/Transforms/Utils/GlobalStatus.cpp

[...]	if (const ConstantExpr *CE = dyn_cast<ConstantExpr>(UR)) {
GS.StoredType = GlobalStatus::Stored;		GS.StoredType = GlobalStatus::Stored;
if (MTI->getArgOperand(1) == V)		if (MTI->getArgOperand(1) == V)
GS.IsLoaded = true;		GS.IsLoaded = true;
} else if (const MemSetInst *MSI = dyn_cast<MemSetInst>(I)) {		} else if (const MemSetInst *MSI = dyn_cast<MemSetInst>(I)) {
assert(MSI->getArgOperand(0) == V && "Memset only takes one pointer!");		assert(MSI->getArgOperand(0) == V && "Memset only takes one pointer!");
if (MSI->isVolatile())		if (MSI->isVolatile())
return true;		return true;
GS.StoredType = GlobalStatus::Stored;		GS.StoredType = GlobalStatus::Stored;
} else if (auto C = ImmutableCallSite(I)) {		} else if (const CallBase *CB = dyn_cast<CallBase>(I)) {
if (!C.isCallee(&U))		if (CB->isCallee(&U)) {
return true;		GS.IsLoaded = true;
		} else if (CB->isArgOperand(&U)) {
		unsigned ArgNo = CB->getArgOperandNo(&U);
		// Argument must not be captured for subsequent use
		if (!CB->paramHasAttr(ArgNo, Attribute::NoCapture))
		return true;
		// Depending on attributes, treat the operand as a pure call or load
		// at the call site.
		if (CB->hasFnAttr(Attribute::InaccessibleMemOnly) ||
		CB->hasFnAttr(Attribute::ReadNone) ||
		CB->paramHasAttr(ArgNo, Attribute::ReadNone))
		continue;
		else if (CB->hasFnAttr(Attribute::ReadOnly) ||
		CB->paramHasAttr(ArgNo, Attribute::ReadOnly))
GS.IsLoaded = true;		GS.IsLoaded = true;
		else
		return true;
		} else
		jdoerfert (inline comment): Are we sure `Stored` means "maybe stored"?
		return true;
} else {		} else {
return true; // Any other non-load instruction might take address!		return true; // Any other non-load instruction might take address!
}		}
} else if (const Constant *C = dyn_cast<Constant>(UR)) {		} else if (const Constant *C = dyn_cast<Constant>(UR)) {
GS.HasNonInstructionUser = true;		GS.HasNonInstructionUser = true;
// We might have a dead and dangling constant hanging off of here.		// We might have a dead and dangling constant hanging off of here.
if (!isSafeToDestroyConstant(C))		if (!isSafeToDestroyConstant(C))
return true;		return true;
} else {		} else {
GS.HasNonInstructionUser = true;		GS.HasNonInstructionUser = true;
// Otherwise must be some other user.		// Otherwise must be some other user.
return true;		return true;
}		}
}		}

return false;		return false;
}		}

		jdoerfert (inline comment): This handles operand bundle uses wrong. We need to bail for them (I guess), except if the user `isDroppable` (maybe). If the argument is not marked `nocapture` you have to assume everything could happen as you cannot track uses anymore.
GlobalStatus::GlobalStatus() = default;		GlobalStatus::GlobalStatus() = default;

bool GlobalStatus::analyzeGlobal(const Value *V, GlobalStatus &GS) {		bool GlobalStatus::analyzeGlobal(const Value *V, GlobalStatus &GS) {
SmallPtrSet<const Value *, 16> VisitedUsers;		SmallPtrSet<const Value *, 16> VisitedUsers;
return analyzeGlobalAux(V, GS, VisitedUsers);		return analyzeGlobalAux(V, GS, VisitedUsers);
}		}

llvm/test/Transforms/GlobalOpt/localize-fnattr.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt -S < %s -globalopt | FileCheck %s

				declare void @foo1a(i8* readnone nocapture, i8) local_unnamed_addr
				declare void @foo1b(i8* nocapture, i8) local_unnamed_addr readnone
				declare void @foo1c(i8* nocapture, i8) local_unnamed_addr inaccessiblememonly

				@G1 = internal global i32 0

				; Doesn't read from pointer argument
				define i32 @a() norecurse {
				; CHECK-LABEL: define {{[^@]+}}@a() local_unnamed_addr
				; CHECK-NEXT: [[G1:%.*]] = alloca i32
				; CHECK-NEXT: store i32 0, i32* [[G1]]
				; CHECK-NEXT: store i32 42, i32* [[G1]]
				; CHECK-NEXT: [[P:%.*]] = bitcast i32* [[G1]] to i8*
				; CHECK-NEXT: call void @foo1a(i8* [[P]], i8 0)
				; CHECK-NEXT: call void @foo1b(i8* [[P]], i8 0)
				; CHECK-NEXT: call void @foo1c(i8* [[P]], i8 0)
				; CHECK-NEXT: [[A:%.*]] = load i32, i32* [[G1]]
				; CHECK-NEXT: ret i32 [[A]]
				;
				store i32 42, i32 *@G1
				%p = bitcast i32* @G1 to i8*
				call void @foo1a(i8* %p, i8 0)
				call void @foo1b(i8* %p, i8 0)
				call void @foo1c(i8* %p, i8 0)
				%a = load i32, i32* @G1
				ret i32 %a
				}

				declare void @foo2a(i8* readonly nocapture, i8) local_unnamed_addr
				declare void @foo2b(i8* nocapture, i8) local_unnamed_addr readonly

				@G2 = internal global i32 0

				; Reads from pointer argument, 8-bit call/load is less than 32-bit store
				define i32 @b() norecurse {
				; CHECK-LABEL: define {{[^@]+}}@b() local_unnamed_addr
				; CHECK-NEXT: [[G2:%.*]] = alloca i32
				; CHECK-NEXT: store i32 0, i32* [[G2]]
				; CHECK-NEXT: store i32 42, i32* [[G2]]
				; CHECK-NEXT: [[P:%.*]] = bitcast i32* [[G2]] to i8*
				; CHECK-NEXT: call void @foo2a(i8* [[P]], i8 0)
				; CHECK-NEXT: call void @foo2b(i8* [[P]], i8 0)
				; CHECK-NEXT: ret i32 0
				;
				store i32 42, i32 *@G2
				%p = bitcast i32* @G2 to i8*
				call void @foo2a(i8* %p, i8 0)
				call void @foo2b(i8* %p, i8 0)
				ret i32 0
				}

				declare void @foo3a(i32* writeonly nocapture, i8) local_unnamed_addr
				declare void @foo3b(i32* nocapture, i8) local_unnamed_addr writeonly
				declare void @foo3c(i32* writeonly, i8) local_unnamed_addr writeonly

				@G3 = internal global i32 0

				; May-write to pointer argument, not supported
				define i32 @c() norecurse {
				; CHECK-LABEL: define {{[^@]+}}@c() local_unnamed_addr
				; CHECK-NEXT: call void @foo3a(i32* @G3, i8 0)
				; CHECK-NEXT: call void @foo3b(i32* @G3, i8 0)
				; CHECK-NEXT: call void @foo3c(i32* @G3, i8 0)
				; CHECK-NEXT: [[C:%.*]] = load i32, i32* @G3
				; CHECK-NEXT: ret i32 [[C]]
				;
				call void @foo3a(i32* @G3, i8 0)
				call void @foo3b(i32* @G3, i8 0)
				call void @foo3c(i32* @G3, i8 0)
				%c = load i32, i32* @G3
				ret i32 %c
				}

				declare void @foo4a(i8* readnone nocapture, i8) local_unnamed_addr
				declare void @foo4b(i8* readnone, i8) local_unnamed_addr
				declare void @llvm.assume(i1 %cond)

				@G4 = internal global i32 0

				; Operand bundle and may-capture not supported
				define i32 @d() norecurse {
				; CHECK-LABEL: define {{[^@]+}}@d() local_unnamed_addr
				; CHECK-NEXT: store i32 42, i32* @G4
				; CHECK-NEXT: call void @llvm.assume(i1 true) [ "align"(i32* @G4, i64 128) ]
				; CHECK-NEXT: [[P:%.*]] = bitcast i32* @G4 to i8*
				; CHECK-NEXT: call void @foo4a(i8* [[P]], i8 0)
				; CHECK-NEXT: call void @foo4b(i8* [[P]], i8 0)
				; CHECK-NEXT: [[D:%.*]] = load i32, i32* @G4
				; CHECK-NEXT: ret i32 [[D]]
				;
				store i32 42, i32 *@G4
				call void @llvm.assume(i1 true) ["align"(i32 *@G4, i64 128)]
				%p = bitcast i32* @G4 to i8*
				call void @foo4a(i8* %p, i8 0)
				call void @foo4b(i8* %p, i8 0)
				%d = load i32, i32* @G4
				ret i32 %d
				}

This is an archive of the discontinued LLVM Phabricator instance.

[GlobalOpt/GlobalStatus] Handle GlobalVariables passed as function call operands with access attributes
Needs Review · Public
