This is an archive of the discontinued LLVM Phabricator instance.

Differential D53554

[Argument Promotion] Only promote args when function attributes are compatible
ClosedPublic

Authored by tstellar on Oct 22 2018, 9:41 PM.

Details

Reviewers

echristo
chandlerc
eli.friedman
craig.topper

Commits

rG3d36e5c3e6cf: Only promote args when function attributes are compatible
rL351296: Only promote args when function attributes are compatible

Summary

Check to make sure that the caller and the callee have compatible
function attributes before promoting arguments. This uses the same
TargetTransformInfo queries that are used to determine if attributes
are compatible for inlining.

The goal here is to avoid breaking ABI when a called function's ABI
depends on a target feature that is not enabled in the caller.

This is a very conservative fix for PR37358. Ideally we would have a more
sophisticated check for ABI compatibility rather than checking if the
attributes are compatible for inlining.
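
As a rough illustration of the check described in the summary, the sketch below models a function as a bag of string attributes and compares the "target-cpu" and "target-features" values of caller and callee. ToyFunction, areAttrsCompatible and checkAttrExample are hypothetical names used only for this sketch; they are not LLVM types or APIs, and the real check goes through TargetTransformInfo.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for llvm::Function: only string attributes are modeled.
struct ToyFunction {
  std::map<std::string, std::string> Attrs;

  // Returns the attribute value, or an empty string when it is absent.
  std::string getFnAttribute(const std::string &Kind) const {
    auto It = Attrs.find(Kind);
    return It == Attrs.end() ? std::string() : It->second;
  }
};

// Conservative check: caller and callee must agree on both attributes.
bool areAttrsCompatible(const ToyFunction &Caller, const ToyFunction &Callee) {
  return Caller.getFnAttribute("target-cpu") ==
             Callee.getFnAttribute("target-cpu") &&
         Caller.getFnAttribute("target-features") ==
             Callee.getFnAttribute("target-features");
}

// Usage: a caller without AVX2 calling an AVX2 callee, and a matching pair.
void checkAttrExample() {
  ToyFunction Avx2, Plain;
  Avx2.Attrs["target-features"] = "+avx2";
  assert(!areAttrsCompatible(Plain, Avx2)); // mismatch: do not promote
  assert(areAttrsCompatible(Avx2, Avx2));   // same features: promote
}
```

The string comparison is deliberately blunt: any difference in features blocks promotion, even when the promoted argument would be passed the same way by both functions.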

Diff Detail

Repository: rL LLVM

Event Timeline

tstellar created this revision. Oct 22 2018, 9:41 PM

Harbormaster completed remote builds in B24051: Diff 170568. Oct 22 2018, 9:41 PM

rkruppe added a subscriber: rkruppe. Oct 23 2018, 12:54 AM

Ping.

@echristo wanted to look at this

echristo added inline comments. Nov 8 2018, 7:10 PM

lib/Transforms/IPO/ArgumentPromotion.cpp
813 ↗	(On Diff #170568)	Seems like we should expose this on TTI and that way backends can override and keep a single copy between here and InlineCost.cpp.

tstellar added inline comments. Nov 9 2018, 6:53 PM

lib/Transforms/IPO/ArgumentPromotion.cpp
813 ↗	(On Diff #170568)	These are the only callers of TTI.areInlineCompatible(). Should we just roll it into that or create something new? Also, won't we eventually want different callbacks for InlineCost and ArgumentPromotion?

echristo added inline comments. Nov 12 2018, 2:30 PM

lib/Transforms/IPO/ArgumentPromotion.cpp
813 ↗	(On Diff #170568)	I'd be more inclined to go with a TTI::functionsHaveCompatibleAttributes that wraps the existing behavior. And that's a good question. I think the answer is probably, but keep in mind that areInlineCompatible is already pulled out for X86. Mostly I'd like to avoid the same static function duplicated - at that point let's just move the whole thing to TTI and then figure out the best way of specializing per target?

Sorry that this comment has gotten lost *twice*. I tried to write it over a week ago. =[[[[

lib/Transforms/IPO/ArgumentPromotion.cpp
813 ↗	(On Diff #170568)	I would strongly advocate for keeping these completely separate all the way to the backend implementation... While these may happen to overlap today, they seem like deeply different concepts. Also, see my comment below.
870–871 ↗	(On Diff #170568)	I would sink this down and call it with the set of arguments so that it can filter out incompatible ones. The backend really should have access to the arguments themselves before saying that things are incompatible. As an example, regardless of the attributes of the function, `i32` arguments remain perfectly compatible and we should keep promoting those.
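
The per-argument filtering suggested in this comment could look roughly like the sketch below. It is a toy model and not the patch: ToyArg, filterPromotableArgs and checkFilterExample are hypothetical names, and the rule that scalar arguments are always safe is an assumption taken from the `i32` remark above, not something LLVM IR guarantees.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::Argument.
struct ToyArg {
  std::string Name;
  bool IsVector; // true for types such as <4 x i64>
};

// Drops the arguments that are not safe to promote and reports whether any
// candidates are left. Vector arguments survive only when caller and callee
// features match; scalars always survive (an assumption of this sketch).
bool filterPromotableArgs(bool FeaturesMatch, std::vector<ToyArg> &Args) {
  Args.erase(std::remove_if(Args.begin(), Args.end(),
                            [&](const ToyArg &A) {
                              return A.IsVector && !FeaturesMatch;
                            }),
             Args.end());
  return !Args.empty();
}

// Usage: with mismatched features only the scalar candidate is kept.
void checkFilterExample() {
  std::vector<ToyArg> Args = {{"count", false}, {"vec", true}};
  assert(filterPromotableArgs(/*FeaturesMatch=*/false, Args));
  assert(Args.size() == 1 && Args[0].Name == "count");
}
```

Filtering the set in place, rather than returning a single yes/no for the whole function, is what lets promotion continue for the arguments that are unaffected by the feature mismatch.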

Add a new callback function areFunctionArgsABICompatible for the ArgumentPromotion
pass to use. @echristo I'm still a little confused about what to do with
functionsHaveCompatibleAttributes, should I just leave it as is, or use this new
callback?

Harbormaster completed remote builds in B24921: Diff 173787. Nov 12 2018, 5:06 PM

This looks better to me. I've got an inline request for some elaboration and you might want to let chandlerc comment, but otherwise OK with me.

include/llvm/Analysis/TargetTransformInfo.h
933 ↗	(On Diff #173787)	"layout" is a bit difficult here, perhaps elaborate a bit?

This revision is now accepted and ready to land. Nov 13 2018, 4:24 PM

chandlerc requested changes to this revision. Nov 13 2018, 4:46 PM

chandlerc added inline comments.

include/llvm/Analysis/TargetTransformInfoImpl.h
533 ↗	(On Diff #173787)	While this may be sufficient, it isn't necessary and seems a bit of a conflation... I'd much rather this be sunk into the codegen layer's basic TTI impl so that targets can override it cleanly. Also, I don't think byval and normal arguments have the same constraints. Also, I'd really like to ensure that basic integers that are valid to promote across subtargets still get promoted even when inlining wouldn't be valid.

This revision now requires changes to proceed. Nov 13 2018, 4:46 PM

tstellar added inline comments. Nov 13 2018, 7:08 PM

include/llvm/Analysis/TargetTransformInfoImpl.h
533 ↗	(On Diff #173787)	Ok, so are you suggesting that I move this function definition into BasicTTIImplBase and re-implement it to handle the simple case of basic integers? Anything else?

Some small updates:

+ Clarified Comment
+ Decoupled new function from areInlineCompatible()

Questions I have still:

+ I'm still a little confused by the layering of TTI. I tried to move the
implementation out of TargetTransformInfoImplBase and into BasicTTI, but
that breaks the NoTTIImpl class.

+ Do we need a separate function for byval and normal args? I thought that information
was accessible via the Argument objects that are passed to the function.

+ Is it possible to implement a generic version of this function for simple argument
types without any target-specific knowledge? Does LLVM IR guarantee anything about
the calling convention for simple types?

Harbormaster completed remote builds in B25976: Diff 177999. Dec 12 2018, 8:35 PM

I'm also seeing issues from ArgumentPromotion with my min-legal-vector-width attribute used by X86 with avx512 enabled and -mprefer-vector-width=256. I think I'll need to override the X86 implementation of areFunctionArgsABICompatible to check for this attribute as well.

nikic added a subscriber: nikic. Dec 22 2018, 4:11 AM

Gentle ping. Any chance this is going to get fixed before 8.0 branches?

Sorry, for some reason this didn't update as ready for another round of review.

In D53554#1329303, @tstellar wrote:

Some small updates:

+ Clarified Comment
+ Decoupled new function from areInlineCompatible()

Nice.

Questions I have still:

+ I'm still a little confused by the layering of TTI. I tried to move the
implementation out of TargetTransformInfoImplBase and into BasicTTI, but
that breaks the NoTTIImpl class.

For NoTTIImpl you can just return false, or return false except for integer types that DataLayout thinks are legal.

The difference between NoTTIImpl and the "basic" layer inside CodeGen is that the basic layer can reach into common target-independent parts of the code generator (like the type legalization tables) to compute plausible answers, where NoTTIImpl is supposed to only return trivially evident information based on the IR-level input. Further, there is *no* quality bar for NoTTIImpl -- it's more for target-independent testing, and not supposed to have any quality per se.
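
The layering described here can be pictured with a small toy model. Everything in it is hypothetical (ToyNoTTI, ToyBasicTTI, ToyTarget, isLegalScalarWidth, checkLayeringExample); it only mimics the idea that the lowest layer answers from IR-level facts while a "basic" layer may ask the target a question, and it is not how the committed patch is implemented.

```cpp
#include <cassert>

// Lowest layer: decides from IR-level facts only (here, whether the
// function attributes matched). Plays the role assigned to NoTTIImpl.
struct ToyNoTTI {
  bool isArgCompatible(unsigned /*BitWidth*/, bool AttrsMatch) const {
    return AttrsMatch;
  }
};

// "Basic" layer: may also consult code-generator knowledge, modeled here by
// a per-target hook reporting whether a scalar width is legal.
template <typename TargetT> struct ToyBasicTTI {
  bool isArgCompatible(unsigned BitWidth, bool AttrsMatch) const {
    if (AttrsMatch)
      return true;
    return static_cast<const TargetT *>(this)->isLegalScalarWidth(BitWidth);
  }
};

// Hypothetical target whose legal scalar widths are 8, 16, 32 and 64 bits.
struct ToyTarget : ToyBasicTTI<ToyTarget> {
  bool isLegalScalarWidth(unsigned BitWidth) const {
    return BitWidth == 8 || BitWidth == 16 || BitWidth == 32 || BitWidth == 64;
  }
};

// Usage: the two layers disagree only when attributes differ.
void checkLayeringExample() {
  ToyNoTTI No;
  ToyTarget Target;
  assert(!No.isArgCompatible(32, false));      // no target knowledge: refuse
  assert(Target.isArgCompatible(32, false));   // legal scalar: still promote
  assert(!Target.isArgCompatible(256, false)); // wide value: refuse
}
```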

+ Do we need a separate function for byval and normal args? I thought that information
was accessible via the Argument objects that are passed to the function.

I'm fine with whatever works here -- as long as we can model what we need to.

+ Is it possible to implement a generic version of this function for simple argument
types without any target-specific knowledge? Does LLVM IR guarantee anything about
the calling convention for simple types?

It is possible to implement (I suggest one above) but I don't think there is any guaranteed ABI at the IR level. Maaaaybe inside the code generator there are some common rules that can handle easy cases, but I think we'll have to *establish* these rules (and check that the existing targets uphold them), not rely on them already being there.

In D53554#1350691, @chandlerc wrote:

In D53554#1329303, @tstellar wrote:

+ I'm still a little confused by the layering of TTI. I tried to move the
implementation out of TargetTransformInfoImplBase and into BasicTTI, but
that breaks the NoTTIImpl class.

For NoTTIImpl you can just return false, or return false except for integer types that DataLayout thinks are legal.

The difference between NoTTIImpl and the "basic" layer inside CodeGen is that the basic layer can reach into common target-independent parts of the code generator (like the type legalization tables) to compute plausible answers, where NoTTIImpl is supposed to only return trivially evident information based on the IR-level input. Further, there is *no* quality bar for NoTTIImpl -- it's more for target-independent testing, and not supposed to have any quality per se.

Ok, so as I understand it, my patch implements the same default for both NoTTI and
BasicTTI since TargetTransformInfoImplBase is the base class for both. Are you
OK with keeping this default implementation I have for NoTTI? The advantage of
my implementation is that it keeps ArgumentPromotion working for 'normal' C/C++
code and only disables it for code with functions targeting different CPU features in the same object.

+ Is it possible to implement a generic version of this function for simple argument
types without any target-specific knowledge? Does LLVM IR guarantee anything about
the calling convention for simple types?

It is possible to implement (I suggest one above) but I don't think there is any guaranteed ABI at the IR level. Maaaaybe inside the code generator there are some common rules that can handle easy cases, but I think we'll have to *establish* these rules (and check that the existing targets uphold them), not rely on them already being there.

Given that we are close to the branch point and this fixes at least one bug, are we OK to commit this
patch as is and then try to work out the IR level guarantees in a future patch?

Sorry I misunderstood that you were saying BasicTTI already had all the logic you were adding.

I *am* thinking about getting promotion to continue to work for obviously safe scalars even when the subtargets differ slightly. As we get more and more of these inferred from the use of intrinsics, I think this will be fairly important, as I explain below. Doesn't need to be this patch though, so LGTM if you want to do that as a follow-up.

include/llvm/Analysis/TargetTransformInfoImpl.h
533 ↗	(On Diff #173787)	I guess this is fine for NoTTI. It would seem nice to teach BasicTTI to handle some extremely common cases (scalar types that are legal in both caller and callee subtargets) so that we don't lose much in the way of real-world optimization. I'm OK if that's a second patch, but I'd really like to have both of them so we don't have bizarre performance cliffs where you use a fancy new intrinsic for a specialized CPU instruction and regress the basic optimization of the code.

This revision is now accepted and ready to land. Jan 15 2019, 8:31 PM

Closed by commit rL351296: Only promote args when function attributes are compatible (authored by tstellar). Jan 15 2019, 9:19 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                                              Size
llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h            16 lines
llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h        8 lines
llvm/trunk/lib/Analysis/TargetTransformInfo.cpp                   6 lines
llvm/trunk/lib/Transforms/IPO/ArgumentPromotion.cpp               35 lines
llvm/trunk/test/Transforms/ArgumentPromotion/X86/attributes.ll    53 lines

Diff 181969

llvm/trunk/include/llvm/Analysis/TargetTransformInfo.h

[928 lines not shown; context: void getMemcpyLoopResidualLoweringType(SmallVectorImpl<Type *> &OpsOut,]
                                          unsigned SrcAlign,
                                          unsigned DestAlign) const;

   /// \returns True if the two functions have compatible attributes for inlining
   /// purposes.
   bool areInlineCompatible(const Function *Caller,
                            const Function *Callee) const;

+  /// \returns True if the caller and callee agree on how \p Args will be passed
+  /// to the callee.
+  /// \param[out] Args The list of compatible arguments. The implementation may
+  /// filter out any incompatible args from this list.
+  bool areFunctionArgsABICompatible(const Function *Caller,
+                                    const Function *Callee,
+                                    SmallPtrSetImpl<Argument *> &Args) const;
+
   /// The type of load/store indexing.
   enum MemIndexedMode {
     MIM_Unindexed, ///< No indexing.
     MIM_PreInc,    ///< Pre-incrementing.
     MIM_PreDec,    ///< Pre-decrementing.
     MIM_PostInc,   ///< Post-incrementing.
     MIM_PostDec    ///< Post-decrementing.
   };
[229 lines not shown; context: public:]
   virtual Type *getMemcpyLoopLoweringType(LLVMContext &Context, Value *Length,
                                           unsigned SrcAlign,
                                           unsigned DestAlign) const = 0;
   virtual void getMemcpyLoopResidualLoweringType(
       SmallVectorImpl<Type *> &OpsOut, LLVMContext &Context,
       unsigned RemainingBytes, unsigned SrcAlign, unsigned DestAlign) const = 0;
   virtual bool areInlineCompatible(const Function *Caller,
                                    const Function *Callee) const = 0;
+  virtual bool
+  areFunctionArgsABICompatible(const Function *Caller, const Function *Callee,
+                               SmallPtrSetImpl<Argument *> &Args) const = 0;
   virtual bool isIndexedLoadLegal(MemIndexedMode Mode, Type *Ty) const = 0;
   virtual bool isIndexedStoreLegal(MemIndexedMode Mode,Type *Ty) const = 0;
   virtual unsigned getLoadStoreVecRegBitWidth(unsigned AddrSpace) const = 0;
   virtual bool isLegalToVectorizeLoad(LoadInst *LI) const = 0;
   virtual bool isLegalToVectorizeStore(StoreInst *SI) const = 0;
   virtual bool isLegalToVectorizeLoadChain(unsigned ChainSizeInBytes,
                                            unsigned Alignment,
                                            unsigned AddrSpace) const = 0;
[362 lines not shown; context: void getMemcpyLoopResidualLoweringType(SmallVectorImpl<Type *> &OpsOut,]
                                          unsigned DestAlign) const override {
     Impl.getMemcpyLoopResidualLoweringType(OpsOut, Context, RemainingBytes,
                                            SrcAlign, DestAlign);
   }
   bool areInlineCompatible(const Function *Caller,
                            const Function *Callee) const override {
     return Impl.areInlineCompatible(Caller, Callee);
   }
+  bool areFunctionArgsABICompatible(
+      const Function *Caller, const Function *Callee,
+      SmallPtrSetImpl<Argument *> &Args) const override {
+    return Impl.areFunctionArgsABICompatible(Caller, Callee, Args);
+  }
   bool isIndexedLoadLegal(MemIndexedMode Mode, Type *Ty) const override {
     return Impl.isIndexedLoadLegal(Mode, Ty, getDataLayout());
   }
   bool isIndexedStoreLegal(MemIndexedMode Mode, Type *Ty) const override {
     return Impl.isIndexedStoreLegal(Mode, Ty, getDataLayout());
   }
   unsigned getLoadStoreVecRegBitWidth(unsigned AddrSpace) const override {
     return Impl.getLoadStoreVecRegBitWidth(AddrSpace);
[141 lines not shown]

llvm/trunk/include/llvm/Analysis/TargetTransformInfoImpl.h

[520 lines not shown; context: public:]
   bool areInlineCompatible(const Function *Caller,
                            const Function *Callee) const {
     return (Caller->getFnAttribute("target-cpu") ==
             Callee->getFnAttribute("target-cpu")) &&
            (Caller->getFnAttribute("target-features") ==
             Callee->getFnAttribute("target-features"));
   }

+  bool areFunctionArgsABICompatible(const Function *Caller, const Function *Callee,
+                                    SmallPtrSetImpl<Argument *> &Args) const {
+    return (Caller->getFnAttribute("target-cpu") ==
+            Callee->getFnAttribute("target-cpu")) &&
+           (Caller->getFnAttribute("target-features") ==
+            Callee->getFnAttribute("target-features"));
+  }
+
   bool isIndexedLoadLegal(TTI::MemIndexedMode Mode, Type *Ty,
                           const DataLayout &DL) const {
     return false;
   }

   bool isIndexedStoreLegal(TTI::MemIndexedMode Mode, Type *Ty,
                            const DataLayout &DL) const {
     return false;
[323 lines not shown]

llvm/trunk/lib/Analysis/TargetTransformInfo.cpp

[619 lines not shown; context: TTIImpl->getMemcpyLoopResidualLoweringType(OpsOut, Context, RemainingBytes,]
                                              SrcAlign, DestAlign);
 }

 bool TargetTransformInfo::areInlineCompatible(const Function *Caller,
                                               const Function *Callee) const {
   return TTIImpl->areInlineCompatible(Caller, Callee);
 }

+bool TargetTransformInfo::areFunctionArgsABICompatible(
+    const Function *Caller, const Function *Callee,
+    SmallPtrSetImpl<Argument *> &Args) const {
+  return TTIImpl->areFunctionArgsABICompatible(Caller, Callee, Args);
+}
+
 bool TargetTransformInfo::isIndexedLoadLegal(MemIndexedMode Mode,
                                              Type *Ty) const {
   return TTIImpl->isIndexedLoadLegal(Mode, Ty);
 }

 bool TargetTransformInfo::isIndexedStoreLegal(MemIndexedMode Mode,
                                               Type *Ty) const {
   return TTIImpl->isIndexedStoreLegal(Mode, Ty);
[579 lines not shown]

llvm/trunk/lib/Transforms/IPO/ArgumentPromotion.cpp

[43 lines not shown]
 #include "llvm/Analysis/BasicAliasAnalysis.h"
 #include "llvm/Analysis/CGSCCPassManager.h"
 #include "llvm/Analysis/CallGraph.h"
 #include "llvm/Analysis/CallGraphSCCPass.h"
 #include "llvm/Analysis/LazyCallGraph.h"
 #include "llvm/Analysis/Loads.h"
 #include "llvm/Analysis/MemoryLocation.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
+#include "llvm/Analysis/TargetTransformInfo.h"
 #include "llvm/IR/Argument.h"
 #include "llvm/IR/Attributes.h"
 #include "llvm/IR/BasicBlock.h"
 #include "llvm/IR/CFG.h"
 #include "llvm/IR/CallSite.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/DataLayout.h"
 #include "llvm/IR/DerivedTypes.h"
[744 lines not shown; context: static bool canPaddingBeAccessed(Argument *arg) {]
   // Check to make sure the pointers aren't captured
   for (StoreInst *Store : Stores)
     if (PtrValues.count(Store->getValueOperand()))
       return true;

   return false;
 }

+static bool areFunctionArgsABICompatible(
+    const Function &F, const TargetTransformInfo &TTI,
+    SmallPtrSetImpl<Argument *> &ArgsToPromote,
+    SmallPtrSetImpl<Argument *> &ByValArgsToTransform) {
+  for (const Use &U : F.uses()) {
+    CallSite CS(U.getUser());
+    const Function *Caller = CS.getCaller();
+    const Function *Callee = CS.getCalledFunction();
+    if (!TTI.areFunctionArgsABICompatible(Caller, Callee, ArgsToPromote) ||
+        !TTI.areFunctionArgsABICompatible(Caller, Callee, ByValArgsToTransform))
+      return false;
+  }
+  return true;
+}
+
 /// PromoteArguments - This method checks the specified function to see if there
 /// are any promotable arguments and if it is safe to promote the function (for
 /// example, all callers are direct). If safe to promote some arguments, it
 /// calls the DoPromotion method.
 static Function *
 promoteArguments(Function *F, function_ref<AAResults &(Function &F)> AARGetter,
                  unsigned MaxElements,
                  Optional<function_ref<void(CallSite OldCS, CallSite NewCS)>>
-                     ReplaceCallSite) {
+                     ReplaceCallSite,
+                 const TargetTransformInfo &TTI) {
   // Don't perform argument promotion for naked functions; otherwise we can end
   // up removing parameters that are seemingly 'not used' as they are referred
   // to in the assembly.
   if(F->hasFnAttribute(Attribute::Naked))
     return nullptr;

   // Make sure that it is local to this module.
   if (!F->hasLocalLinkage())
[12 lines not shown; context: promoteArguments(Function *F, function_ref<AAResults &(Function &F)> AARGetter,]
   for (Argument &I : F->args())
     if (I.getType()->isPointerTy())
       PointerArgs.push_back(&I);
   if (PointerArgs.empty())
     return nullptr;

   // Second check: make sure that all callers are direct callers. We can't
   // transform functions that have indirect callers. Also see if the function
-  // is self-recursive.
+  // is self-recursive and check that target features are compatible.
   bool isSelfRecursive = false;
   for (Use &U : F->uses()) {
     CallSite CS(U.getUser());
     // Must be a direct call.
     if (CS.getInstruction() == nullptr || !CS.isCallee(&U))
       return nullptr;

     // Can't change signature of musttail callee
[92 lines not shown; context: if (isSafeToPromoteArgument(PtrArg, PtrArg->hasByValOrInAllocaAttr(), AAR,]
                                 MaxElements))
       ArgsToPromote.insert(PtrArg);
   }

   // No promotable pointer arguments.
   if (ArgsToPromote.empty() && ByValArgsToTransform.empty())
     return nullptr;

+  if (!areFunctionArgsABICompatible(*F, TTI, ArgsToPromote,
+                                    ByValArgsToTransform))
+    return nullptr;
+
   return doPromotion(F, ArgsToPromote, ByValArgsToTransform, ReplaceCallSite);
 }

 PreservedAnalyses ArgumentPromotionPass::run(LazyCallGraph::SCC &C,
                                              CGSCCAnalysisManager &AM,
                                              LazyCallGraph &CG,
                                              CGSCCUpdateResult &UR) {
   bool Changed = false, LocalChange;
[9 lines not shown; context: for (LazyCallGraph::Node &N : C) {]
           AM.getResult<FunctionAnalysisManagerCGSCCProxy>(C, CG).getManager();
       // FIXME: This lambda must only be used with this function. We should
       // skip the lambda and just get the AA results directly.
       auto AARGetter = [&](Function &F) -> AAResults & {
         assert(&F == &OldF && "Called with an unexpected function!");
         return FAM.getResult<AAManager>(F);
       };

-      Function *NewF = promoteArguments(&OldF, AARGetter, MaxElements, None);
+      const TargetTransformInfo &TTI = FAM.getResult<TargetIRAnalysis>(OldF);
+      Function *NewF =
+          promoteArguments(&OldF, AARGetter, MaxElements, None, TTI);
       if (!NewF)
         continue;
       LocalChange = true;

       // Directly substitute the functions in the call graph. Note that this
       // requires the old function to be completely dead and completely
       // replaced by the new function. It does no call graph updates, it merely
       // swaps out the particular function mapped to a particular node in the
[21 lines not shown; context: struct ArgPromotion : public CallGraphSCCPass {]
   explicit ArgPromotion(unsigned MaxElements = 3)
       : CallGraphSCCPass(ID), MaxElements(MaxElements) {
     initializeArgPromotionPass(*PassRegistry::getPassRegistry());
   }

   void getAnalysisUsage(AnalysisUsage &AU) const override {
     AU.addRequired<AssumptionCacheTracker>();
     AU.addRequired<TargetLibraryInfoWrapperPass>();
+    AU.addRequired<TargetTransformInfoWrapperPass>();
     getAAResultsAnalysisUsage(AU);
     CallGraphSCCPass::getAnalysisUsage(AU);
   }

   bool runOnSCC(CallGraphSCC &SCC) override;

 private:
   using llvm::Pass::doInitialization;
[9 lines not shown]
 char ArgPromotion::ID = 0;

 INITIALIZE_PASS_BEGIN(ArgPromotion, "argpromotion",
                       "Promote 'by reference' arguments to scalars", false,
                       false)
 INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)
 INITIALIZE_PASS_DEPENDENCY(CallGraphWrapperPass)
 INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
+INITIALIZE_PASS_DEPENDENCY(TargetTransformInfoWrapperPass)
 INITIALIZE_PASS_END(ArgPromotion, "argpromotion",
                     "Promote 'by reference' arguments to scalars", false, false)

 Pass *llvm::createArgumentPromotionPass(unsigned MaxElements) {
   return new ArgPromotion(MaxElements);
 }

 bool ArgPromotion::runOnSCC(CallGraphSCC &SCC) {
[20 lines not shown; context: for (CallGraphNode *OldNode : SCC) {]
       auto ReplaceCallSite = [&](CallSite OldCS, CallSite NewCS) {
         Function *Caller = OldCS.getInstruction()->getParent()->getParent();
         CallGraphNode *NewCalleeNode =
             CG.getOrInsertFunction(NewCS.getCalledFunction());
         CallGraphNode *CallerNode = CG[Caller];
         CallerNode->replaceCallEdge(OldCS, NewCS, NewCalleeNode);
       };

+      const TargetTransformInfo &TTI =
+          getAnalysis<TargetTransformInfoWrapperPass>().getTTI(*OldF);
       if (Function *NewF = promoteArguments(OldF, AARGetter, MaxElements,
-                                            {ReplaceCallSite})) {
+                                            {ReplaceCallSite}, TTI)) {
         LocalChange = true;

         // Update the call graph for the newly promoted function.
         CallGraphNode *NewNode = CG.getOrInsertFunction(NewF);
         NewNode->stealCalledFunctionsFrom(OldNode);
         if (OldNode->getNumReferences() == 0)
           delete CG.removeFunctionFromModule(OldNode);
         else
[16 lines not shown]

llvm/trunk/test/Transforms/ArgumentPromotion/X86/attributes.ll

+; RUN: opt -S -argpromotion < %s | FileCheck %s
+; RUN: opt -S -passes=argpromotion < %s | FileCheck %s
+; Test that we only promote arguments when the caller/callee have compatible
+; function attributes.
+
+target triple = "x86_64-unknown-linux-gnu"
+
+; CHECK-LABEL: @no_promote_avx2(<4 x i64>* %arg, <4 x i64>* readonly %arg1)
+define internal fastcc void @no_promote_avx2(<4 x i64>* %arg, <4 x i64>* readonly %arg1) #0 {
+bb:
+  %tmp = load <4 x i64>, <4 x i64>* %arg1
+  store <4 x i64> %tmp, <4 x i64>* %arg
+  ret void
+}
+
+define void @no_promote(<4 x i64>* %arg) #1 {
+bb:
+  %tmp = alloca <4 x i64>, align 32
+  %tmp2 = alloca <4 x i64>, align 32
+  %tmp3 = bitcast <4 x i64>* %tmp to i8*
+  call void @llvm.memset.p0i8.i64(i8* align 32 %tmp3, i8 0, i64 32, i1 false)
+  call fastcc void @no_promote_avx2(<4 x i64>* %tmp2, <4 x i64>* %tmp)
+  %tmp4 = load <4 x i64>, <4 x i64>* %tmp2, align 32
+  store <4 x i64> %tmp4, <4 x i64>* %arg, align 2
+  ret void
+}
+
+; CHECK-LABEL: @promote_avx2(<4 x i64>* %arg, <4 x i64> %
+define internal fastcc void @promote_avx2(<4 x i64>* %arg, <4 x i64>* readonly %arg1) #0 {
+bb:
+  %tmp = load <4 x i64>, <4 x i64>* %arg1
+  store <4 x i64> %tmp, <4 x i64>* %arg
+  ret void
+}
+
+define void @promote(<4 x i64>* %arg) #0 {
+bb:
+  %tmp = alloca <4 x i64>, align 32
+  %tmp2 = alloca <4 x i64>, align 32
+  %tmp3 = bitcast <4 x i64>* %tmp to i8*
+  call void @llvm.memset.p0i8.i64(i8* align 32 %tmp3, i8 0, i64 32, i1 false)
+  call fastcc void @promote_avx2(<4 x i64>* %tmp2, <4 x i64>* %tmp)
+  %tmp4 = load <4 x i64>, <4 x i64>* %tmp2, align 32
+  store <4 x i64> %tmp4, <4 x i64>* %arg, align 2
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind
+declare void @llvm.memset.p0i8.i64(i8* nocapture writeonly, i8, i64, i1) #2
+
+attributes #0 = { inlinehint norecurse nounwind uwtable "target-features"="+avx2" }
+attributes #1 = { nounwind uwtable }
+attributes #2 = { argmemonly nounwind }