This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contents

llvm/include/llvm/Transforms/IPO/
ArgumentPromotion.h
Attributor.h	3/3
llvm/lib/Transforms/IPO/
ArgumentPromotion.cpp	2/2
Attributor.cpp	5/11
llvm/test/Transforms/Attributor/
ArgumentPromotion/
2008-02-01-ReturnAttrs.ll
X86/
attributes.ll
min-legal-vector-width.ll
alignment.ll
attrs.ll
basictest.ll
byval-2.ll
byval.ll
control-flow2.ll
fp80.ll
inalloca.ll
profile.ll
tail.ll
IPConstantProp/
2009-09-24-byval-ptr.ll
PR16052.ll
callbacks.ll
internal-noalias.ll

Differential D68852

[Attributor] Pointer privatization attribute (argument promotion)
Closed · Public

Authored by jdoerfert on Oct 10 2019, 7:36 PM.

Details

Reviewers

uenoku
sstefan1
lebedev.ri
hfinkel
vsk
dblaikie
davidxl
tejohnson
tstellar
echristo
chandlerc
efriedma

Commits

rG89c2e733e80e: [Attributor] Pointer privatization attribute (argument promotion)

Summary

A pointer is privatizable if it can be replaced by a new, private one.
Privatizing a pointer reduces the use count and the interaction between
unrelated code parts. This is a first step towards replacing argument
promotion. While we can already handle recursion (unlike argument
promotion!), we are restricted to stack allocations for now because we do
not analyze the uses in the callee.

All argument promotion tests now run the Attributor as well.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jdoerfert created this revision. Oct 10 2019, 7:36 PM

Herald added a project: Restricted Project. Oct 10 2019, 7:36 PM

Herald added subscribers: arphaman, bollu, hiraditya.

I went through all the tests, added mem2reg/sroa where appropriate, and sometimes modified the source, mostly to avoid UB. I think the results of the Attributor look good; all problems should have been addressed already.

Harbormaster completed remote builds in B39391: Diff 224526. Oct 10 2019, 7:40 PM

jdoerfert mentioned this in D68766: [NFC][ArgPromo][Tests] Run update_test_checks on all ArgumentPromotion tests. Oct 10 2019, 7:44 PM

lebedev.ri added inline comments. Oct 16 2019, 11:59 AM

llvm/test/Transforms/ArgumentPromotion/control-flow2.ll
27–31 ↗	(On Diff #224526)	This is why I'm really pushing on using `--check-prefixes=ALL,ARGPROMOTION` from the get-go :[
35 ↗	(On Diff #224526)	I strongly believe you want to precommit test changes+regeneration first.

jdoerfert added parent revisions: D68766: [NFC][ArgPromo][Tests] Run update_test_checks on all ArgumentPromotion tests, D68851: [Utils] Allow update_test_checks to scrub attribute annotations, D68819: [Utils] Allow update_test_checks to check function arguments, D68850: [Utils] Deal with occasionally deleted functions. Oct 29 2019, 11:09 PM

jdoerfert marked 2 inline comments as done.

jdoerfert added inline comments.

llvm/test/Transforms/ArgumentPromotion/control-flow2.ll
27–31 ↗	(On Diff #224526)	Agreed.
35 ↗	(On Diff #224526)	I will update the test lines after the D68766 update. Do you want me to split test changes, e.g., that remove UB, as well?

jdoerfert added a parent revision: D68765: [Attributor] Function signature rewrite infrastructure. Oct 30 2019, 10:41 PM

Update the tests

Harbormaster completed remote builds in B40327: Diff 227230. Oct 31 2019, 12:02 AM

Update tests

Harbormaster completed remote builds in B40365: Diff 227335. Oct 31 2019, 1:45 PM

I'll copy the tests into the new Attributor test folder. Any other comments? @uenoku @sstefan1

Looks generally fine.
I couldn't picture what pointer privatization is at first. Could you add an example of the result of pointer privatization? Like,

int f(int* ptr){
 ...
}
=>
int f(int p){
 int* ptr = &p;
 ...
}

And I guess the current implementation always privatizes if possible. I think the cost may increase in some cases, right? What do you think about that?

llvm/include/llvm/Transforms/IPO/Attributor.h
2419	Could you add comments on the conditions under which a pointer can be replaced by a private one?
2431	nit: choose
llvm/test/Transforms/ArgumentPromotion/chained.ll
18 ↗	(On Diff #227335)	Please add FIXME here for `AAValueSimplify`.

In D68852#1781869, @uenoku wrote:
Looks generally fine.
I couldn't picture what pointer privatization is at first. Could you add an example of the result of pointer privatization? Like,
int f(int* ptr){
 ...
}
=>
int f(int p){
 int* ptr = &p;
 ...
}

I will add the example to the class comment in the header file.

For the record:
Privatization, at least the part implemented so far, is roughly argument promotion: instead of passing a pointer, we pass the values the callee accesses through it.
The existing argument promotion does not privatize; it tries to replace the uses of the pointer with the passed values right away. Privatization is simpler in that
regard and, partially because of this, will later also be more powerful.
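
For illustration only, here is a minimal C++ sketch of the rewrite described above. The names `Pair`, `sumByPointer`, `sumPrivatized`, `callerBefore`, and `callerAfter` are hypothetical and do not come from the patch; the actual transformation works on LLVM IR, not on C++ source.

```cpp
#include <cassert>

// Hypothetical aggregate; stands in for the pointee type of the argument.
struct Pair {
  int First;
  int Second;
};

// Before: the callee receives a pointer and only reads through it.
static int sumByPointer(const Pair *P) { return P->First + P->Second; }

// After privatization: the caller passes the values that were read through
// the pointer, and the callee rebuilds a private copy. The body keeps using
// a pointer, so its uses do not need to be rewritten.
static int sumPrivatized(int First, int Second) {
  Pair Private{First, Second}; // private stack copy, invisible to the caller
  const Pair *P = &Private;
  return P->First + P->Second;
}

// Call site before the rewrite.
int callerBefore() {
  Pair Local{3, 4};
  return sumByPointer(&Local);
}

// Call site after the rewrite: read the fields and pass them by value.
int callerAfter() {
  Pair Local{3, 4};
  return sumPrivatized(Local.First, Local.Second);
}
```

Both call sites compute the same result. In this sketch the private copy stays in the callee; as noted elsewhere in this review, a later pass such as SROA is expected to remove it in the good case.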

And I guess the current implementation always privatizes if possible. I think the cost may increase in some cases. What do you think about that?

That is correct. So far, we build the Attributor to be powerful (=applicable), not to be "smart" about costs. We'll have to write heuristics soon, but before that I want to ask people to test the powerful version in order to get more coverage and sniff out bugs.

In fact, we might always do privatization once another piece of code I have only locally is available. With it, privatization might cause arbitrarily many arguments at the call site but we can always recover the original call site from it. More on that later though. For comparison: ArgumentPromotion arbitrarily restricts the size of the structs that are expanded to 3 elements, which is beyond me.
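
As a rough illustration of the limit mentioned above: the check below mirrors the shape of the `MaxElements` condition visible in the ArgumentPromotion.cpp hunk of this diff (`MaxElements > 0 && STy->getNumElements() > MaxElements`, with a default of 3). The function name `exceedsElementLimit` is hypothetical and not part of LLVM.

```cpp
#include <cassert>

// Returns true if an aggregate with NumElements members would be rejected
// by a limit of MaxElements. Following the condition in the diff, a limit
// of 0 disables the check entirely.
bool exceedsElementLimit(unsigned NumElements, unsigned MaxElements) {
  return MaxElements > 0 && NumElements > MaxElements;
}
```

With the default limit of 3, a three-element struct is still expanded and a four-element struct is not. The Attributor version in this patch applies no such limit.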

LGTM from my side but please make sure that it passes test-suite.

llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
846	I think you can put this function into the `llvm` namespace and split the commit.

This revision is now accepted and ready to land. Dec 13 2019, 3:46 AM

In D68852#1783313, @uenoku wrote:

LGTM from my side but please make sure that it passes test-suite.

Will do.

I have to wait for D68765 first but not necessarily for the update script patches (they are blocked by the update test infrastructure patch).

rebase

clang-format: pass.

Build artifacts: console-log.txt, diff.json

Addressed comments

Harbormaster failed remote builds in B42507: Diff 233921! Dec 14 2019, 12:21 AM

clang-format: pass.

Build artifacts: console-log.txt, diff.json

jdoerfert added inline comments. Dec 14 2019, 12:24 AM

llvm/include/llvm/Transforms/IPO/Attributor.h
2419	Done.
llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
846	I would like to get rid of this, the Attributor should directly use a more specialized form of areFunctionArgsABICompatible. I'll add a TODO.

Harbormaster failed remote builds in B42508: Diff 233922! Dec 14 2019, 12:30 AM

efriedma added inline comments. Dec 16 2019, 6:38 PM

llvm/lib/Transforms/IPO/Attributor.cpp
5067	No alignment set on loads?
5102	No alignment set on alloca?
5166	isArrayAllocation()

jdoerfert marked 2 inline comments as done. Dec 16 2019, 7:01 PM

jdoerfert added inline comments.

llvm/lib/Transforms/IPO/Attributor.cpp
5067	I'll add the alignment for the alloca below based on the alignment of the pointer it replaces. The loads and stores will be annotated by the next run of the Attributor automatically. We can also consider not emitting it if we can prove it is not needed, though that will not always be the case and it will require an analysis we do not yet have (something like AAAccessTracker). Finally, SROA should, in the good case, eliminate the alloca completely.
5166	Will do.

efriedma added inline comments. Dec 17 2019, 8:58 AM

llvm/lib/Transforms/IPO/Attributor.cpp
5067	"The loads and stores will be annotated by the next run of the Attributor automatically." The default alignment of a load is the alignment of the load's type, as computed by the datalayout. This might be too high, depending on the pointer.

jdoerfert marked an inline comment as done. Dec 17 2019, 9:21 AM

jdoerfert added inline comments.

llvm/lib/Transforms/IPO/Attributor.cpp
5067	"The default alignment of a load is the alignment of the load's type, as computed by the datalayout." Sure. "This might be too high, depending on the pointer." How could that be? We create the pointer with a proper type (the alloca) below. Shouldn't the alloca take the default alignment into account when the memory is allocated?

efriedma added inline comments. Dec 17 2019, 10:12 AM

llvm/lib/Transforms/IPO/Attributor.cpp
5067	If you create an alloca, then load/store to that pointer, the default alignments will work, yes. But that isn't what's happening here, is it? The alloca is in the callee, and this load is in the caller.

jdoerfert marked an inline comment as done. Dec 17 2019, 11:57 AM

jdoerfert added inline comments.

llvm/lib/Transforms/IPO/Attributor.cpp
5067	With this patch we always have an alloca or an argument with some pointer to (struct) type which we only access through proper gep addressing. I don't think this can create an alignment issue. I get that the alloca needs to be aligned with a higher value if the pointer was marked as such, but I already said that will be fixed.

efriedma added inline comments. Dec 17 2019, 12:42 PM

llvm/lib/Transforms/IPO/Attributor.cpp
5067	Pointers can be misaligned, generally. For example:

define void @f() {
entry:
  %a = alloca i32, align 1
  call void @g(i32* %a)
  ret void
}
define internal void @g(i32* %a) {
  %aa = load i32, i32* %a, align 1
  call void @z(i32 %aa)
  ret void
}
declare void @z(i32)

As far as I can tell, your patch will introduce a misaligned load into `@f()`. (C generally provides additional guarantees based on the pointee type of a pointer, but there isn't any corresponding rule for IR pointers.)

jdoerfert marked an inline comment as done. Dec 17 2019, 3:25 PM

jdoerfert added inline comments.

llvm/lib/Transforms/IPO/Attributor.cpp
5067	I finally understand your concern, sorry that it took so long. I played around a bit to see what we currently do and I found this interesting (https://godbolt.org/z/2q_oqH): we basically align the alloca naturally at some point. I would for now just set the alignment to 1 and add a TODO. For these loads, the Attributor can find a better alignment in the next run anyway and this allows me to not amend this patch too much. The TODO will explain the situation and we can work on a better solution from there. Maybe, if it is very simple, I'll directly use the AAAlign logic to get a lower bound instead. Long story short, I'll make sure these loads are properly aligned and we test for this.
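
To make the alignment concern concrete outside of IR, here is a small C++ sketch. It is only an analogy under stated assumptions: `loadAlign1` and `roundTripMisaligned` are hypothetical names, and `std::memcpy` stands in for a load annotated with `align 1`, that is, a read that assumes nothing about the address.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads a 32-bit value from an address with no alignment guarantee. This is
// the C++ counterpart of an IR load annotated with `align 1`.
std::uint32_t loadAlign1(const unsigned char *Addr) {
  std::uint32_t Value;
  std::memcpy(&Value, Addr, sizeof(Value));
  return Value;
}

// Stores a value at an odd offset into a byte buffer and reads it back.
std::uint32_t roundTripMisaligned(std::uint32_t In) {
  alignas(4) unsigned char Buffer[8] = {};
  unsigned char *Misaligned = Buffer + 1; // not 4-byte aligned by construction
  std::memcpy(Misaligned, &In, sizeof(In));
  return loadAlign1(Misaligned);
}
```

Reading `Misaligned` through a `std::uint32_t *` instead would assume natural alignment, which C++ does not allow for this address. That corresponds to the default-aligned load that the comments above warn about; choosing alignment 1 is the conservative option until a better bound is known.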

jdoerfert mentioned this in D72382: [ArgPromotion] Extend search for SafeToUnconditionallyLoad indices to the blocks that must be executed upon entry into the function. Jan 9 2020, 8:24 AM

jdoerfert mentioned this in D71989: [OpenMP][IRBuilder] `omp task` support. Jan 14 2020, 5:54 PM

Closed by commit rG89c2e733e80e: [Attributor] Pointer privatization attribute (argument promotion) (authored by jdoerfert). Jan 29 2020, 7:33 PM

This revision was automatically updated to reflect the committed changes.

I added a test case for the alignment and I tested it on the LLVM Test Suite :)

Revision Contents

Path	Size
llvm/include/llvm/Transforms/IPO/
ArgumentPromotion.h	12 lines
Attributor.h	50 lines
llvm/lib/Transforms/IPO/
ArgumentPromotion.cpp	17 lines
Attributor.cpp	675 lines
llvm/test/Transforms/Attributor/
ArgumentPromotion/
2008-02-01-ReturnAttrs.ll	9 lines
X86/
attributes.ll	9 lines
min-legal-vector-width.ll	54 lines
alignment.ll	32 lines
attrs.ll	20 lines
basictest.ll	21 lines
byval-2.ll	20 lines
byval.ll	30 lines
control-flow2.ll	9 lines
fp80.ll	17 lines
inalloca.ll	17 lines
profile.ll	9 lines
tail.ll	15 lines
IPConstantProp/
2009-09-24-byval-ptr.ll	83 lines
PR16052.ll	2 lines
callbacks.ll	107 lines
internal-noalias.ll	2 lines

Diff 241337

llvm/include/llvm/Transforms/IPO/ArgumentPromotion.h

	//===- ArgumentPromotion.h - Promote by-reference arguments ------ C++ --===//			//===- ArgumentPromotion.h - Promote by-reference arguments ------ C++ --===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#ifndef LLVM_TRANSFORMS_IPO_ARGUMENTPROMOTION_H			#ifndef LLVM_TRANSFORMS_IPO_ARGUMENTPROMOTION_H
	#define LLVM_TRANSFORMS_IPO_ARGUMENTPROMOTION_H			#define LLVM_TRANSFORMS_IPO_ARGUMENTPROMOTION_H

	#include "llvm/Analysis/CGSCCPassManager.h"			#include "llvm/Analysis/CGSCCPassManager.h"
	#include "llvm/Analysis/LazyCallGraph.h"			#include "llvm/Analysis/LazyCallGraph.h"
	#include "llvm/IR/PassManager.h"			#include "llvm/IR/PassManager.h"

	namespace llvm {			namespace llvm {
				class TargetTransformInfo;

	/// Argument promotion pass.			/// Argument promotion pass.
	///			///
	/// This pass walks the functions in each SCC and for each one tries to			/// This pass walks the functions in each SCC and for each one tries to
	/// transform it and all of its callers to replace indirect arguments with			/// transform it and all of its callers to replace indirect arguments with
	/// direct (by-value) arguments.			/// direct (by-value) arguments.
	class ArgumentPromotionPass : public PassInfoMixin<ArgumentPromotionPass> {			class ArgumentPromotionPass : public PassInfoMixin<ArgumentPromotionPass> {
	unsigned MaxElements;			unsigned MaxElements;

	public:			public:
	ArgumentPromotionPass(unsigned MaxElements = 3u) : MaxElements(MaxElements) {}			ArgumentPromotionPass(unsigned MaxElements = 3u) : MaxElements(MaxElements) {}

				/// Check if callers and the callee \p F agree how promoted arguments would be
				/// passed. The ones that they do not agree on are eliminated from the sets but
				/// the return value has to be observed as well.
				static bool areFunctionArgsABICompatible(
				const Function &F, const TargetTransformInfo &TTI,
				SmallPtrSetImpl<Argument *> &ArgsToPromote,
				SmallPtrSetImpl<Argument *> &ByValArgsToTransform);

				/// Checks if a type could have padding bytes.
				static bool isDenselyPacked(Type *type, const DataLayout &DL);

	PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM,			PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM,
	LazyCallGraph &CG, CGSCCUpdateResult &UR);			LazyCallGraph &CG, CGSCCUpdateResult &UR);
	};			};

	} // end namespace llvm			} // end namespace llvm

	#endif // LLVM_TRANSFORMS_IPO_ARGUMENTPROMOTION_H			#endif // LLVM_TRANSFORMS_IPO_ARGUMENTPROMOTION_H

llvm/include/llvm/Transforms/IPO/Attributor.h

[98 lines not shown]

#include "llvm/ADT/MapVector.h"		#include "llvm/ADT/MapVector.h"
#include "llvm/ADT/SCCIterator.h"		#include "llvm/ADT/SCCIterator.h"
#include "llvm/ADT/SetVector.h"		#include "llvm/ADT/SetVector.h"
#include "llvm/Analysis/AliasAnalysis.h"		#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/CallGraph.h"		#include "llvm/Analysis/CallGraph.h"
#include "llvm/Analysis/MustExecute.h"		#include "llvm/Analysis/MustExecute.h"
#include "llvm/Analysis/TargetLibraryInfo.h"		#include "llvm/Analysis/TargetLibraryInfo.h"
		#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/IR/CallSite.h"		#include "llvm/IR/CallSite.h"
#include "llvm/IR/ConstantRange.h"		#include "llvm/IR/ConstantRange.h"
#include "llvm/IR/PassManager.h"		#include "llvm/IR/PassManager.h"

namespace llvm {		namespace llvm {

struct AbstractAttribute;		struct AbstractAttribute;
struct InformationCache;		struct InformationCache;
[831 lines not shown]	private:

/// Abstract call site (ACS) repair callback, see ACSRepairCBTy.		/// Abstract call site (ACS) repair callback, see ACSRepairCBTy.
const ACSRepairCBTy ACSRepairCB;		const ACSRepairCBTy ACSRepairCB;

/// Allow access to the private members from the Attributor.		/// Allow access to the private members from the Attributor.
friend struct Attributor;		friend struct Attributor;
};		};

		/// Check if we can rewrite a function signature.
		///
		/// The argument \p Arg is replaced with new ones defined by the number,
		/// order, and types in \p ReplacementTypes.
		///
		/// \returns True, if the replacement can be registered, via
		/// registerFunctionSignatureRewrite, false otherwise.
		bool isValidFunctionSignatureRewrite(Argument &Arg,
		ArrayRef<Type *> ReplacementTypes);

/// Register a rewrite for a function signature.		/// Register a rewrite for a function signature.
///		///
/// The argument \p Arg is replaced with new ones defined by the number,		/// The argument \p Arg is replaced with new ones defined by the number,
/// order, and types in \p ReplacementTypes. The rewiring at the call sites is		/// order, and types in \p ReplacementTypes. The rewiring at the call sites is
/// done through \p ACSRepairCB and at the callee site through		/// done through \p ACSRepairCB and at the callee site through
/// \p CalleeRepairCB.		/// \p CalleeRepairCB.
///		///
/// \returns True, if the replacement was registered, false otherwise.		/// \returns True, if the replacement was registered, false otherwise.
[1,435 lines not shown]	struct AAHeapToStack : public StateWrapper<BooleanState, AbstractAttribute>,

/// Create an abstract attribute view for the position \p IRP.		/// Create an abstract attribute view for the position \p IRP.
static AAHeapToStack &createForPosition(const IRPosition &IRP, Attributor &A);		static AAHeapToStack &createForPosition(const IRPosition &IRP, Attributor &A);

/// Unique ID (due to the unique address)		/// Unique ID (due to the unique address)
static const char ID;		static const char ID;
};		};

		/// An abstract interface for privatizability.
		///
		/// A pointer is privatizable if it can be replaced by a new, private one.
		/// Privatizing pointer reduces the use count, interaction between unrelated
		uenoku: Could you add comments on the conditions under which a pointer can be replaced by a private one?
		jdoerfert: Done.
		/// code parts.
		///
		/// In order for a pointer to be privatizable its value cannot be observed
		/// (=nocapture), it is (for now) not written (=readonly & noalias), we know
		/// what values are necessary to make the private copy look like the original
		/// one, and the values we need can be loaded (=dereferenceable).
		struct AAPrivatizablePtr : public StateWrapper<BooleanState, AbstractAttribute>,
		public IRPosition {
		AAPrivatizablePtr(const IRPosition &IRP) : IRPosition(IRP) {}

		/// Returns true if pointer privatization is assumed to be possible.
		bool isAssumedPrivatizablePtr() const { return getAssumed(); }
		uenoku: nit: choose

		/// Returns true if pointer privatization is known to be possible.
		bool isKnownPrivatizablePtr() const { return getKnown(); }

		/// Return the type we can choose for a private copy of the underlying
		/// value. None means it is not clear yet, nullptr means there is none.
		virtual Optional<Type *> getPrivatizableType() const = 0;

		/// Return an IR position, see struct IRPosition.
		///
		///{
		IRPosition &getIRPosition() { return *this; }
		const IRPosition &getIRPosition() const { return *this; }
		///}

		/// Create an abstract attribute view for the position \p IRP.
		static AAPrivatizablePtr &createForPosition(const IRPosition &IRP,
		Attributor &A);

		/// Unique ID (due to the unique address)
		static const char ID;
		};

/// An abstract interface for all memory related attributes.		/// An abstract interface for all memory related attributes.
struct AAMemoryBehavior		struct AAMemoryBehavior
: public IRAttribute<		: public IRAttribute<
Attribute::ReadNone,		Attribute::ReadNone,
StateWrapper<BitIntegerState<uint8_t, 3>, AbstractAttribute>> {		StateWrapper<BitIntegerState<uint8_t, 3>, AbstractAttribute>> {
AAMemoryBehavior(const IRPosition &IRP) : IRAttribute(IRP) {}		AAMemoryBehavior(const IRPosition &IRP) : IRAttribute(IRP) {}

/// State encoding bits. A set bit in the state means the property holds.		/// State encoding bits. A set bit in the state means the property holds.
[92 lines not shown]

llvm/lib/Transforms/IPO/ArgumentPromotion.cpp

[768 lines not shown]	static bool isSafeToPromoteArgument(Argument *Arg, Type *ByValTy, AAResults &AAR,
}		}

// If the path from the entry of the function to each load is free of		// If the path from the entry of the function to each load is free of
// instructions that potentially invalidate the load, we can make the		// instructions that potentially invalidate the load, we can make the
// transformation!		// transformation!
return true;		return true;
}		}

/// Checks if a type could have padding bytes.		bool ArgumentPromotionPass::isDenselyPacked(Type *type, const DataLayout &DL) {
static bool isDenselyPacked(Type *type, const DataLayout &DL) {
// There is no size information, so be conservative.		// There is no size information, so be conservative.
if (!type->isSized())		if (!type->isSized())
return false;		return false;

// If the alloc size is not equal to the storage size, then there are padding		// If the alloc size is not equal to the storage size, then there are padding
// bytes. For x86_fp80 on x86-64, size: 80 alloc size: 128.		// bytes. For x86_fp80 on x86-64, size: 80 alloc size: 128.
if (DL.getTypeSizeInBits(type) != DL.getTypeAllocSizeInBits(type))		if (DL.getTypeSizeInBits(type) != DL.getTypeAllocSizeInBits(type))
return false;		return false;
[52 lines not shown]	static bool canPaddingBeAccessed(Argument *arg) {
// Check to make sure the pointers aren't captured		// Check to make sure the pointers aren't captured
for (StoreInst *Store : Stores)		for (StoreInst *Store : Stores)
if (PtrValues.count(Store->getValueOperand()))		if (PtrValues.count(Store->getValueOperand()))
return true;		return true;

return false;		return false;
}		}

static bool areFunctionArgsABICompatible(		bool ArgumentPromotionPass::areFunctionArgsABICompatible(
		uenoku: I think you can put this function into the `llvm` namespace and split the commit.
		jdoerfert: I would like to get rid of this, the Attributor should directly use a more specialized form of areFunctionArgsABICompatible. I'll add a TODO.
const Function &F, const TargetTransformInfo &TTI,		const Function &F, const TargetTransformInfo &TTI,
SmallPtrSetImpl<Argument *> &ArgsToPromote,		SmallPtrSetImpl<Argument *> &ArgsToPromote,
SmallPtrSetImpl<Argument *> &ByValArgsToTransform) {		SmallPtrSetImpl<Argument *> &ByValArgsToTransform) {
for (const Use &U : F.uses()) {		for (const Use &U : F.uses()) {
CallSite CS(U.getUser());		CallSite CS(U.getUser());
		if (!CS)
		return false;
const Function *Caller = CS.getCaller();		const Function *Caller = CS.getCaller();
const Function *Callee = CS.getCalledFunction();		const Function *Callee = CS.getCalledFunction();
if (!TTI.areFunctionArgsABICompatible(Caller, Callee, ArgsToPromote) ||		if (!TTI.areFunctionArgsABICompatible(Caller, Callee, ArgsToPromote) ||
!TTI.areFunctionArgsABICompatible(Caller, Callee, ByValArgsToTransform))		!TTI.areFunctionArgsABICompatible(Caller, Callee, ByValArgsToTransform))
return false;		return false;
}		}
return true;		return true;
}		}
[85 lines not shown]	if (PtrArg->hasStructRetAttr()) {
CS.removeParamAttr(ArgNo, Attribute::StructRet);		CS.removeParamAttr(ArgNo, Attribute::StructRet);
CS.addParamAttr(ArgNo, Attribute::NoAlias);		CS.addParamAttr(ArgNo, Attribute::NoAlias);
}		}
}		}

// If this is a byval argument, and if the aggregate type is small, just		// If this is a byval argument, and if the aggregate type is small, just
// pass the elements, which is always safe, if the passed value is densely		// pass the elements, which is always safe, if the passed value is densely
// packed or if we can prove the padding bytes are never accessed.		// packed or if we can prove the padding bytes are never accessed.
bool isSafeToPromote =		bool isSafeToPromote = PtrArg->hasByValAttr() &&
PtrArg->hasByValAttr() &&		(ArgumentPromotionPass::isDenselyPacked(AgTy, DL) ||
(isDenselyPacked(AgTy, DL) || !canPaddingBeAccessed(PtrArg));		!canPaddingBeAccessed(PtrArg));
if (isSafeToPromote) {		if (isSafeToPromote) {
if (StructType *STy = dyn_cast<StructType>(AgTy)) {		if (StructType *STy = dyn_cast<StructType>(AgTy)) {
if (MaxElements > 0 && STy->getNumElements() > MaxElements) {		if (MaxElements > 0 && STy->getNumElements() > MaxElements) {
LLVM_DEBUG(dbgs() << "argpromotion disable promoting argument '"		LLVM_DEBUG(dbgs() << "argpromotion disable promoting argument '"
<< PtrArg->getName()		<< PtrArg->getName()
<< "' because it would require adding more"		<< "' because it would require adding more"
<< " than " << MaxElements		<< " than " << MaxElements
<< " arguments to the function.\n");		<< " arguments to the function.\n");
[41 lines not shown]	for (Argument *PtrArg : PointerArgs) {
if (isSafeToPromoteArgument(PtrArg, ByValTy, AAR, MaxElements))		if (isSafeToPromoteArgument(PtrArg, ByValTy, AAR, MaxElements))
ArgsToPromote.insert(PtrArg);		ArgsToPromote.insert(PtrArg);
}		}

// No promotable pointer arguments.		// No promotable pointer arguments.
if (ArgsToPromote.empty() && ByValArgsToTransform.empty())		if (ArgsToPromote.empty() && ByValArgsToTransform.empty())
return nullptr;		return nullptr;

if (!areFunctionArgsABICompatible(*F, TTI, ArgsToPromote,		if (!ArgumentPromotionPass::areFunctionArgsABICompatible(
ByValArgsToTransform))		*F, TTI, ArgsToPromote, ByValArgsToTransform))
return nullptr;		return nullptr;

return doPromotion(F, ArgsToPromote, ByValArgsToTransform, ReplaceCallSite);		return doPromotion(F, ArgsToPromote, ByValArgsToTransform, ReplaceCallSite);
}		}

PreservedAnalyses ArgumentPromotionPass::run(LazyCallGraph::SCC &C,		PreservedAnalyses ArgumentPromotionPass::run(LazyCallGraph::SCC &C,
CGSCCAnalysisManager &AM,		CGSCCAnalysisManager &AM,
LazyCallGraph &CG,		LazyCallGraph &CG,
[152 lines not shown]

llvm/lib/Transforms/IPO/Attributor.cpp

[25 lines not shown]
#include "llvm/Analysis/LazyValueInfo.h"		#include "llvm/Analysis/LazyValueInfo.h"
#include "llvm/Analysis/Loads.h"		#include "llvm/Analysis/Loads.h"
#include "llvm/Analysis/MemoryBuiltins.h"		#include "llvm/Analysis/MemoryBuiltins.h"
#include "llvm/Analysis/ScalarEvolution.h"		#include "llvm/Analysis/ScalarEvolution.h"
#include "llvm/Analysis/ValueTracking.h"		#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/Argument.h"		#include "llvm/IR/Argument.h"
#include "llvm/IR/Attributes.h"		#include "llvm/IR/Attributes.h"
#include "llvm/IR/CFG.h"		#include "llvm/IR/CFG.h"
		#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/InstIterator.h"		#include "llvm/IR/InstIterator.h"
#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/Verifier.h"		#include "llvm/IR/Verifier.h"
#include "llvm/InitializePasses.h"		#include "llvm/InitializePasses.h"
		#include "llvm/IR/NoFolder.h"
#include "llvm/Support/CommandLine.h"		#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
		#include "llvm/Transforms/IPO/ArgumentPromotion.h"
#include "llvm/Transforms/Utils/BasicBlockUtils.h"		#include "llvm/Transforms/Utils/BasicBlockUtils.h"
#include "llvm/Transforms/Utils/Local.h"		#include "llvm/Transforms/Utils/Local.h"

#include <cassert>		#include <cassert>

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "attributor"		#define DEBUG_TYPE "attributor"
[76 lines not shown]
PIPE_OPERATOR(AAAlign)		PIPE_OPERATOR(AAAlign)
PIPE_OPERATOR(AANoCapture)		PIPE_OPERATOR(AANoCapture)
PIPE_OPERATOR(AAValueSimplify)		PIPE_OPERATOR(AAValueSimplify)
PIPE_OPERATOR(AANoFree)		PIPE_OPERATOR(AANoFree)
PIPE_OPERATOR(AAHeapToStack)		PIPE_OPERATOR(AAHeapToStack)
PIPE_OPERATOR(AAReachability)		PIPE_OPERATOR(AAReachability)
PIPE_OPERATOR(AAMemoryBehavior)		PIPE_OPERATOR(AAMemoryBehavior)
PIPE_OPERATOR(AAValueConstantRange)		PIPE_OPERATOR(AAValueConstantRange)
		PIPE_OPERATOR(AAPrivatizablePtr)

#undef PIPE_OPERATOR		#undef PIPE_OPERATOR
} // namespace llvm		} // namespace llvm

// TODO: Determine a good default value.		// TODO: Determine a good default value.
//		//
// In the LLVM-TS and SPEC2006, 32 seems to not induce compile time overheads		// In the LLVM-TS and SPEC2006, 32 seems to not induce compile time overheads
// (when run with the first 5 abstract attributes). The results also indicate		// (when run with the first 5 abstract attributes). The results also indicate
[163 lines not shown]	if (auto *RMWI = dyn_cast<AtomicRMWInst>(I)) {
if (!AllowVolatile && RMWI->isVolatile())		if (!AllowVolatile && RMWI->isVolatile())
return nullptr;		return nullptr;
return RMWI->getPointerOperand();		return RMWI->getPointerOperand();
}		}

return nullptr;		return nullptr;
}		}

		/// Helper function to create a pointer of type \p ResTy, based on \p Ptr, and
		/// advanced by \p Offset bytes. To aid later analysis the method tries to build
		/// getelement pointer instructions that traverse the natural type of \p Ptr if
		/// possible. If that fails, the remaining offset is adjusted byte-wise, hence
		/// through a cast to i8*.
		///
		/// TODO: This could probably live somewhere more prominently if it doesn't
		/// already exist.
		static Value *constructPointer(Type *ResTy, Value *Ptr, int64_t Offset,
		IRBuilder<NoFolder> &IRB, const DataLayout &DL) {
		assert(Offset >= 0 && "Negative offset not supported yet!");
		LLVM_DEBUG(dbgs() << "Construct pointer: " << *Ptr << " + " << Offset
		<< "-bytes as " << *ResTy << "\n");

		// The initial type we are trying to traverse to get nice GEPs.
		Type *Ty = Ptr->getType();

		SmallVector<Value *, 4> Indices;
		std::string GEPName = Ptr->getName().str();
		while (Offset) {
		uint64_t Idx, Rem;

		if (auto *STy = dyn_cast<StructType>(Ty)) {
		const StructLayout *SL = DL.getStructLayout(STy);
		if (int64_t(SL->getSizeInBytes()) < Offset)
		break;
		Idx = SL->getElementContainingOffset(Offset);
		assert(Idx < STy->getNumElements() && "Offset calculation error!");
		Rem = Offset - SL->getElementOffset(Idx);
		Ty = STy->getElementType(Idx);
		} else if (auto *PTy = dyn_cast<PointerType>(Ty)) {
		Ty = PTy->getElementType();
		if (!Ty->isSized())
		break;
		uint64_t ElementSize = DL.getTypeAllocSize(Ty);
		assert(ElementSize && "Expected type with size!");
		Idx = Offset / ElementSize;
		Rem = Offset % ElementSize;
		} else {
		// Non-aggregate type, we cast and make byte-wise progress now.
		break;
		}

		LLVM_DEBUG(errs() << "Ty: " << *Ty << " Offset: " << Offset
		<< " Idx: " << Idx << " Rem: " << Rem << "\n");

		GEPName += "." + std::to_string(Idx);
		Indices.push_back(ConstantInt::get(IRB.getInt32Ty(), Idx));
		Offset = Rem;
		}

		// Create a GEP if we collected indices above.
		if (Indices.size())
		Ptr = IRB.CreateGEP(Ptr, Indices, GEPName);

		// If an offset is left we use byte-wise adjustment.
		if (Offset) {
		Ptr = IRB.CreateBitCast(Ptr, IRB.getInt8PtrTy());
		Ptr = IRB.CreateGEP(Ptr, IRB.getInt32(Offset),
		GEPName + ".b" + Twine(Offset));
		}

		// Ensure the result has the requested type.
		Ptr = IRB.CreateBitOrPointerCast(Ptr, ResTy, Ptr->getName() + ".cast");

		LLVM_DEBUG(dbgs() << "Constructed pointer: " << *Ptr << "\n");
		return Ptr;
		}
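Note (illustrative, not part of the patch): the offset decomposition that constructPointer performs can be sketched without any LLVM types. The snippet below is a minimal model under stated assumptions: `ToyStructLayout`, `fieldContainingOffset`, `Decomposition`, and `decompose` are hypothetical names invented here, and only one pointer step followed by one struct step is modeled. It shows how a byte offset splits into an element index, a field index, and a byte remainder that would be handled by the i8* adjustment.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a struct layout (not an LLVM type): the byte
// offset of every field plus the allocation size of the whole struct.
struct ToyStructLayout {
  std::vector<uint64_t> FieldOffsets; // ascending, first entry is 0
  uint64_t SizeInBytes;
};

// Index of the field that contains byte `Offset` within one struct.
inline unsigned fieldContainingOffset(const ToyStructLayout &SL,
                                      uint64_t Offset) {
  assert(!SL.FieldOffsets.empty() && Offset < SL.SizeInBytes);
  unsigned Idx = 0;
  while (Idx + 1 < SL.FieldOffsets.size() &&
         SL.FieldOffsets[Idx + 1] <= Offset)
    ++Idx;
  return Idx;
}

struct Decomposition {
  uint64_t ElementIdx; // whole structs skipped through the pointer
  unsigned FieldIdx;   // field selected inside that struct
  uint64_t ByteRem;    // bytes left over for a byte-wise adjustment
};

// Split a non-negative byte offset the way the type-guided loop would:
// first step over whole elements, then descend into the containing field.
inline Decomposition decompose(const ToyStructLayout &SL, uint64_t Offset) {
  assert(SL.SizeInBytes > 0 && "Expected type with size!");
  Decomposition D;
  D.ElementIdx = Offset / SL.SizeInBytes;
  uint64_t Rem = Offset % SL.SizeInBytes;
  D.FieldIdx = fieldContainingOffset(SL, Rem);
  D.ByteRem = Rem - SL.FieldOffsets[D.FieldIdx];
  return D;
}
```

For a layout with fields at offsets 0, 4, and 8 and a size of 16, an offset of 22 bytes decomposes into element 1, field 1, and 2 remaining bytes.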

/// Recursively visit all values that might become \p IRP at some point. This		/// Recursively visit all values that might become \p IRP at some point. This
/// will be done by looking through cast instructions, selects, phis, and calls		/// will be done by looking through cast instructions, selects, phis, and calls
/// with the "returned" attribute. Once we cannot look through the value any		/// with the "returned" attribute. Once we cannot look through the value any
/// further, the callback \p VisitValueCB is invoked and passed the current		/// further, the callback \p VisitValueCB is invoked and passed the current
/// value, the \p State, and a flag to indicate if we stripped anything. To		/// value, the \p State, and a flag to indicate if we stripped anything. To
/// limit how much effort is invested, we will never visit more values than		/// limit how much effort is invested, we will never visit more values than
/// specified by \p MaxValues.		/// specified by \p MaxValues.
template <typename AAType, typename StateTy>		template <typename AAType, typename StateTy>
[2,471 lines not shown]	void initialize(Attributor &A) override {
if (!getAssociatedFunction()->hasExactDefinition())		if (!getAssociatedFunction()->hasExactDefinition())
indicatePessimisticFixpoint();		indicatePessimisticFixpoint();
}		}

/// See AbstractAttribute::manifest(...).		/// See AbstractAttribute::manifest(...).
ChangeStatus manifest(Attributor &A) override {		ChangeStatus manifest(Attributor &A) override {
ChangeStatus Changed = AAIsDeadFloating::manifest(A);		ChangeStatus Changed = AAIsDeadFloating::manifest(A);
Argument &Arg = *getAssociatedArgument();		Argument &Arg = *getAssociatedArgument();
if (Arg.getParent()->hasLocalLinkage())		if (A.isValidFunctionSignatureRewrite(Arg, /* ReplacementTypes */ {}))
if (A.registerFunctionSignatureRewrite(		if (A.registerFunctionSignatureRewrite(
Arg, /* ReplacementTypes */ {},		Arg, /* ReplacementTypes */ {},
Attributor::ArgumentReplacementInfo::CalleeRepairCBTy{},		Attributor::ArgumentReplacementInfo::CalleeRepairCBTy{},
Attributor::ArgumentReplacementInfo::ACSRepairCBTy{}))		Attributor::ArgumentReplacementInfo::ACSRepairCBTy{}))
return ChangeStatus::CHANGED;		return ChangeStatus::CHANGED;
return Changed;		return Changed;
}		}

[1,948 lines not shown]	void trackStatistics() const override {
STATS_DECL(MallocCalls, Function,		STATS_DECL(MallocCalls, Function,
"Number of malloc calls converted to allocas");		"Number of malloc calls converted to allocas");
for (auto *C : MallocCalls)		for (auto *C : MallocCalls)
if (!BadMallocCalls.count(C))		if (!BadMallocCalls.count(C))
++BUILD_STAT_NAME(MallocCalls, Function);		++BUILD_STAT_NAME(MallocCalls, Function);
}		}
};		};

		/// ----------------------- Privatizable Pointers ------------------------------
		struct AAPrivatizablePtrImpl : public AAPrivatizablePtr {
		AAPrivatizablePtrImpl(const IRPosition &IRP)
		: AAPrivatizablePtr(IRP), PrivatizableType(llvm::None) {}

		ChangeStatus indicatePessimisticFixpoint() override {
		AAPrivatizablePtr::indicatePessimisticFixpoint();
		PrivatizableType = nullptr;
		return ChangeStatus::CHANGED;
		}

		/// Identify the type we can choose for a private copy of the underlying
		/// argument. None means it is not clear yet, nullptr means there is none.
		virtual Optional<Type *> identifyPrivatizableType(Attributor &A) = 0;

		/// Return a privatizable type that encloses both T0 and T1.
		/// TODO: This is merely a stub for now as we should manage a mapping as well.
		Optional<Type *> combineTypes(Optional<Type *> T0, Optional<Type *> T1) {
		if (!T0.hasValue())
		return T1;
		if (!T1.hasValue())
		return T0;
		if (T0 == T1)
		return T0;
		return nullptr;
		}

		Optional<Type *> getPrivatizableType() const override {
		return PrivatizableType;
		}

		const std::string getAsStr() const override {
		return isAssumedPrivatizablePtr() ? "[priv]" : "[no-priv]";
		}

		protected:
		Optional<Type *> PrivatizableType;
		};
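Note (illustrative, not part of the patch): the three states used by combineTypes (no information yet, a concrete type, or "no privatizable type") can be modeled with std::optional over a pointer. This is a sketch under assumptions: `ToyType`, `TypeState`, and `combine` are hypothetical names, and std::optional stands in for llvm::Optional.

```cpp
#include <cassert>
#include <optional>
#include <utility>

struct ToyType {}; // hypothetical stand-in for llvm::Type

// Empty optional: nothing known yet. Engaged with nullptr: no common type.
// Engaged with a non-null pointer: the type every input agreed on so far.
using TypeState = std::optional<const ToyType *>;

inline TypeState combine(TypeState T0, TypeState T1) {
  if (!T0.has_value())
    return T1; // the other side decides
  if (!T1.has_value())
    return T0;
  if (*T0 == *T1)
    return T0; // both sides agree
  // Disagreement: engaged optional that holds a null pointer.
  return TypeState(std::in_place, nullptr);
}
```

Combining an unknown state with a known type yields that type, while combining two different types yields the engaged null state, which the update functions above treat as a reason to give up.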

		// TODO: Do this for call site arguments (probably also other values) as well.

		struct AAPrivatizablePtrArgument final : public AAPrivatizablePtrImpl {
		AAPrivatizablePtrArgument(const IRPosition &IRP)
		: AAPrivatizablePtrImpl(IRP) {}

		/// See AAPrivatizablePtrImpl::identifyPrivatizableType(...)
		Optional<Type *> identifyPrivatizableType(Attributor &A) override {
		// If this is a byval argument and we know all the call sites (so we can
		// rewrite them), there is no need to check them explicitly.
		if (getIRPosition().hasAttr(Attribute::ByVal) &&
		A.checkForAllCallSites([](AbstractCallSite ACS) { return true; }, *this,
		true))
		return getAssociatedValue().getType()->getPointerElementType();

		Optional<Type *> Ty;
		unsigned ArgNo = getIRPosition().getArgNo();

		// Make sure the associated call site argument has the same type at all call
		// sites and it is an allocation we know is safe to privatize, for now that
		// means we only allow alloca instructions.
		// TODO: We can additionally analyze the accesses in the callee to create
		// the type from that information instead. That is a little more
		// involved and will be done in a follow up patch.
		auto CallSiteCheck = [&](AbstractCallSite ACS) {
		IRPosition ACSArgPos = IRPosition::callsite_argument(ACS, ArgNo);
		// Check if a corresponding argument was found or if it is one not
		// associated (which can happen for callback calls).
		if (ACSArgPos.getPositionKind() == IRPosition::IRP_INVALID)
		return false;

		// Check that all call sites agree on a type.
		auto &PrivCSArgAA = A.getAAFor<AAPrivatizablePtr>(*this, ACSArgPos);
		Optional<Type *> CSTy = PrivCSArgAA.getPrivatizableType();

		LLVM_DEBUG({
		dbgs() << "[AAPrivatizablePtr] ACSPos: " << ACSArgPos << ", CSTy: ";
		if (CSTy.hasValue() && CSTy.getValue())
		CSTy.getValue()->print(dbgs());
		else if (CSTy.hasValue())
		dbgs() << "<nullptr>";
		else
		dbgs() << "<none>";
		});

		Ty = combineTypes(Ty, CSTy);

		LLVM_DEBUG({
		dbgs() << " : New Type: ";
		if (Ty.hasValue() && Ty.getValue())
		Ty.getValue()->print(dbgs());
		else if (Ty.hasValue())
		dbgs() << "<nullptr>";
		else
		dbgs() << "<none>";
		dbgs() << "\n";
		});

		return !Ty.hasValue() || Ty.getValue();
		};

		if (!A.checkForAllCallSites(CallSiteCheck, *this, true))
		return nullptr;
		return Ty;
		}

		/// See AbstractAttribute::updateImpl(...).
		ChangeStatus updateImpl(Attributor &A) override {
		PrivatizableType = identifyPrivatizableType(A);
		if (!PrivatizableType.hasValue())
		return ChangeStatus::UNCHANGED;
		if (!PrivatizableType.getValue())
		return indicatePessimisticFixpoint();

		// Avoid arguments with padding for now.
		if (!getIRPosition().hasAttr(Attribute::ByVal) &&
		!ArgumentPromotionPass::isDenselyPacked(PrivatizableType.getValue(),
		A.getInfoCache().getDL())) {
		LLVM_DEBUG(dbgs() << "[AAPrivatizablePtr] Padding detected\n");
		return indicatePessimisticFixpoint();
		}

		// Verify callee and caller agree on how the promoted argument would be
		// passed.
		// TODO: The use of the ArgumentPromotion interface here is ugly, we need a
		// specialized form of TargetTransformInfo::areFunctionArgsABICompatible
		// which doesn't require the arguments ArgumentPromotion wanted to pass.
		Function &Fn = *getIRPosition().getAnchorScope();
		SmallPtrSet<Argument *, 1> ArgsToPromote, Dummy;
		ArgsToPromote.insert(getAssociatedArgument());
		const auto *TTI =
		A.getInfoCache().getAnalysisResultForFunction<TargetIRAnalysis>(Fn);
		if (!TTI ||
		!ArgumentPromotionPass::areFunctionArgsABICompatible(
		Fn, *TTI, ArgsToPromote, Dummy) ||
		ArgsToPromote.empty()) {
		LLVM_DEBUG(
		dbgs() << "[AAPrivatizablePtr] ABI incompatibility detected for "
		<< Fn.getName() << "\n");
		return indicatePessimisticFixpoint();
		}

		// Collect the types that will replace the privatizable type in the function
		// signature.
		SmallVector<Type *, 16> ReplacementTypes;
		identifyReplacementTypes(PrivatizableType.getValue(), ReplacementTypes);

		// Register a rewrite of the argument.
		Argument *Arg = getAssociatedArgument();
		if (!A.isValidFunctionSignatureRewrite(*Arg, ReplacementTypes)) {
		LLVM_DEBUG(dbgs() << "[AAPrivatizablePtr] Rewrite not valid\n");
		return indicatePessimisticFixpoint();
		}

		unsigned ArgNo = Arg->getArgNo();

		// Helper to check if for the given call site the associated argument is
		// passed to a callback where the privatization would be different.
		auto IsCompatiblePrivArgOfCallback = [&](CallSite CS) {
		Value *CSArgOp = CS.getArgOperand(ArgNo);
		SmallVector<const Use *, 4> CBUses;
		AbstractCallSite::getCallbackUses(CS, CBUses);
		for (const Use *U : CBUses) {
		AbstractCallSite CBACS(U);
		assert(CBACS && CBACS.isCallbackCall());
		for (Argument &CBArg : CBACS.getCalledFunction()->args()) {
		int CBArgNo = CBACS.getCallArgOperandNo(CBArg);

		LLVM_DEBUG({
		dbgs()
		<< "[AAPrivatizablePtr] Argument " << *Arg
		<< "check if can be privatized in the context of its parent ("
		<< Arg->getParent()->getName()
		<< ")\n[AAPrivatizablePtr] because it is an argument in a "
		"callback ("
		<< CBArgNo << "@" << CBACS.getCalledFunction()->getName()
		<< ")\n[AAPrivatizablePtr] " << CBArg << " : "
		<< CBACS.getCallArgOperand(CBArg) << " vs " << CSArgOp << "\n"
		<< "[AAPrivatizablePtr] " << CBArg << " : "
		<< CBACS.getCallArgOperandNo(CBArg) << " vs " << ArgNo << "\n";
		});

		if (CBArgNo != int(ArgNo))
		continue;
		const auto &CBArgPrivAA =
		A.getAAFor<AAPrivatizablePtr>(*this, IRPosition::argument(CBArg));
		if (CBArgPrivAA.isValidState()) {
		auto CBArgPrivTy = CBArgPrivAA.getPrivatizableType();
		if (!CBArgPrivTy.hasValue())
		continue;
		if (CBArgPrivTy.getValue() == PrivatizableType)
		continue;
		}

		LLVM_DEBUG({
		dbgs() << "[AAPrivatizablePtr] Argument " << *Arg
		<< " cannot be privatized in the context of its parent ("
		<< Arg->getParent()->getName()
		<< ")\n[AAPrivatizablePtr] because it is an argument in a "
		"callback ("
		<< CBArgNo << "@" << CBACS.getCalledFunction()->getName()
		<< ").\n[AAPrivatizablePtr] for which the argument "
		"privatization is not compatible.\n";
		});
		return false;
		}
		}
		return true;
		};

		// Helper to check if for the given call site the associated argument is
		// passed to a direct call where the privatization would be different.
		auto IsCompatiblePrivArgOfDirectCS = [&](AbstractCallSite ACS) {
		CallBase *DC = cast<CallBase>(ACS.getInstruction());
		int DCArgNo = ACS.getCallArgOperandNo(ArgNo);
		assert(DCArgNo >= 0 && unsigned(DCArgNo) < DC->getNumArgOperands() &&
		"Expected a direct call operand for callback call operand");

		LLVM_DEBUG({
		dbgs() << "[AAPrivatizablePtr] Argument " << *Arg
		<< " check if be privatized in the context of its parent ("
		<< Arg->getParent()->getName()
		<< ")\n[AAPrivatizablePtr] because it is an argument in a "
		"direct call of ("
		<< DCArgNo << "@" << DC->getCalledFunction()->getName()
		<< ").\n";
		});

		Function *DCCallee = DC->getCalledFunction();
		if (unsigned(DCArgNo) < DCCallee->arg_size()) {
		const auto &DCArgPrivAA = A.getAAFor<AAPrivatizablePtr>(
		*this, IRPosition::argument(*DCCallee->getArg(DCArgNo)));
		efriedma: No alignment set on loads?
		jdoerfert (Author): I'll add the alignment for the alloca below based on the alignment of the pointer it replaces. The loads and stores will be annotated by the next run of the Attributor automatically. We can also consider not emitting it if we can prove it is not needed, though that will not always be the case and it will require an analysis we do not yet have (something like AAAccessTracker). Finally, SROA should, in the good case, eliminate the alloca completely.
		efriedma: > The loads and stores will be annotated by the next run of the Attributor automatically.
		The default alignment of a load is the alignment of the load's type, as computed by the datalayout. This might be too high, depending on the pointer.
		jdoerfert (Author): > The default alignment of a load is the alignment of the load's type, as computed by the datalayout.
		Sure.
		> This might be too high, depending on the pointer.
		How could that be? We create the pointer with a proper type (the alloca) below. Shouldn't the alloca take the default alignment into account when the memory is allocated?
		efriedma: If you create an alloca, then load/store to that pointer, the default alignments will work, yes. But that isn't what's happening here, is it? The alloca is in the callee, and this load is in the caller.
		jdoerfert (Author): With this patch we always have an alloca or an argument with some pointer to (struct) type which we only access through proper gep addressing. I don't think this can create an alignment issue. I get that the alloca needs to be aligned with a higher value if the pointer was marked as such, but I already said that will be fixed.
		efriedma: Pointers can be misaligned, generally. For example:
		    define void @f() {
		    entry:
		      %a = alloca i32, align 1
		      call void @g(i32* %a)
		      ret void
		    }
		    define internal void @g(i32* %a) {
		      %aa = load i32, i32* %a, align 1
		      call void @z(aa)
		    }
		    declare void @z(i32)
		As far as I can tell, your patch will introduce a misaligned load into `@f()`. (C generally provides additional guarantees based on the pointee type of a pointer, but there isn't any corresponding rule for IR pointers.)
		jdoerfert (Author): I finally understand your concern, sorry that it took so long. I played around a bit to see what we currently do and I found this interesting: https://godbolt.org/z/2q_oqH We basically align the alloca naturally at some point. I would for now just set the alignment to 1 and add a TODO. For these loads, the Attributor can find a better alignment in the next run anyway and this allows me to not amend this patch too much. The TODO will explain the situation and we can work on a better solution from then. Maybe, if it is very simple, I'll directly use the AAAlign logic to get a lower bound instead. Long story short, I'll make sure these loads are properly aligned and we test for this.
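Note (illustrative, not part of the patch or the review thread): the concern about loads through a pointer that is less aligned than its pointee type has a rough C++ analogue. The sketch below is only an analogy, and `loadUnaligned32` is a hypothetical helper name. Reading through memcpy makes no alignment assumption about the source address, which is roughly what annotating a load with alignment 1 expresses in IR.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Read a 32-bit value from an address that is only known to be
// byte-aligned. memcpy copies byte by byte as far as the language rules
// are concerned, so no 4-byte alignment of `Src` is assumed.
inline uint32_t loadUnaligned32(const unsigned char *Src) {
  uint32_t V;
  std::memcpy(&V, Src, sizeof(V));
  return V;
}
```

A direct dereference of a `uint32_t *` formed from `Buf + 1` would instead assume natural alignment, which mirrors a load that relies on the default alignment of its type.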
		if (DCArgPrivAA.isValidState()) {
		auto DCArgPrivTy = DCArgPrivAA.getPrivatizableType();
		if (!DCArgPrivTy.hasValue())
		return true;
		if (DCArgPrivTy.getValue() == PrivatizableType)
		return true;
		}
		}

		LLVM_DEBUG({
		dbgs() << "[AAPrivatizablePtr] Argument " << *Arg
		<< " cannot be privatized in the context of its parent ("
		<< Arg->getParent()->getName()
		<< ")\n[AAPrivatizablePtr] because it is an argument in a "
		"direct call of ("
		<< ACS.getCallSite().getCalledFunction()->getName()
		<< ").\n[AAPrivatizablePtr] for which the argument "
		"privatization is not compatible.\n";
		});
		return false;
		};

		// Helper to check if the associated argument is used at the given abstract
		// call site in a way that is incompatible with the privatization assumed
		// here.
		auto IsCompatiblePrivArgOfOtherCallSite = [&](AbstractCallSite ACS) {
		if (ACS.isDirectCall())
		return IsCompatiblePrivArgOfCallback(ACS.getCallSite());
		if (ACS.isCallbackCall())
		return IsCompatiblePrivArgOfDirectCS(ACS);
		return false;
		};

		if (!A.checkForAllCallSites(IsCompatiblePrivArgOfOtherCallSite, *this,
		true))
		efriedma: No alignment set on alloca?
		return indicatePessimisticFixpoint();

		return ChangeStatus::UNCHANGED;
		}

		/// Given a type to privatize \p PrivType, collect the constituents (which are
		/// used) in \p ReplacementTypes.
		static void
		identifyReplacementTypes(Type *PrivType,
		SmallVectorImpl<Type *> &ReplacementTypes) {
		// TODO: For now we expand the privatization type to the fullest which can
		// lead to dead arguments that need to be removed later.
		assert(PrivType && "Expected privatizable type!");

		// Traverse the type, extract constituent types on the outermost level.
		if (auto *PrivStructType = dyn_cast<StructType>(PrivType)) {
		for (unsigned u = 0, e = PrivStructType->getNumElements(); u < e; u++)
		ReplacementTypes.push_back(PrivStructType->getElementType(u));
		} else if (auto *PrivArrayType = dyn_cast<ArrayType>(PrivType)) {
		ReplacementTypes.append(PrivArrayType->getNumElements(),
		PrivArrayType->getElementType());
		} else {
		ReplacementTypes.push_back(PrivType);
		}
		}
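Note (illustrative, not part of the patch): the one-level expansion done by identifyReplacementTypes can be sketched on a toy type model. Everything named here (`ToyType`, `replacementTypes`) is a hypothetical stand-in, and type names are plain strings rather than llvm::Type pointers. A struct contributes one entry per field, an array contributes its element type once per element, and anything else is passed through unchanged.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy type: a scalar, a struct of fields, or a fixed array.
struct ToyType {
  enum Kind { Scalar, Struct, Array };
  Kind K = Scalar;
  std::string Name;              // printable name, e.g. "i32"
  std::vector<ToyType> Elements; // struct fields, or the single array
                                 // element type
  unsigned NumElements = 0;      // array length (arrays only)
};

// Expand only the outermost level, as the function above does.
inline std::vector<std::string> replacementTypes(const ToyType &T) {
  std::vector<std::string> Out;
  switch (T.K) {
  case ToyType::Struct:
    for (const ToyType &Field : T.Elements)
      Out.push_back(Field.Name);
    break;
  case ToyType::Array:
    assert(!T.Elements.empty() && "Array needs an element type");
    Out.insert(Out.end(), T.NumElements, T.Elements.front().Name);
    break;
  case ToyType::Scalar:
    Out.push_back(T.Name);
    break;
  }
  return Out;
}
```

A struct of `i32` and `double` yields two replacement entries, and a three-element `i32` array yields three, which is why fully expanding large aggregates can leave dead arguments behind, as the TODO above notes.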

		/// Initialize \p Base according to the type \p PrivType at position \p IP.
		/// The values needed are taken from the arguments of \p F starting at
		/// position \p ArgNo.
		static void createInitialization(Type *PrivType, Value &Base, Function &F,
		unsigned ArgNo, Instruction &IP) {
		assert(PrivType && "Expected privatizable type!");

		IRBuilder<NoFolder> IRB(&IP);
		const DataLayout &DL = F.getParent()->getDataLayout();

		// Traverse the type, build GEPs and stores.
		if (auto *PrivStructType = dyn_cast<StructType>(PrivType)) {
		const StructLayout *PrivStructLayout = DL.getStructLayout(PrivStructType);
		for (unsigned u = 0, e = PrivStructType->getNumElements(); u < e; u++) {
		Type *PointeeTy = PrivStructType->getElementType(u)->getPointerTo();
		Value *Ptr = constructPointer(
		PointeeTy, &Base, PrivStructLayout->getElementOffset(u), IRB, DL);
		new StoreInst(F.getArg(ArgNo + u), Ptr, &IP);
		}
		} else if (auto *PrivArrayType = dyn_cast<ArrayType>(PrivType)) {
		Type *PointeePtrTy = PrivArrayType->getElementType()->getPointerTo();
		uint64_t PointeeTySize = DL.getTypeStoreSize(PointeePtrTy);
		for (unsigned u = 0, e = PrivArrayType->getNumElements(); u < e; u++) {
		Value *Ptr =
		constructPointer(PointeePtrTy, &Base, u * PointeeTySize, IRB, DL);
		new StoreInst(F.getArg(ArgNo + u), Ptr, &IP);
		}
		} else {
		new StoreInst(F.getArg(ArgNo), &Base, &IP);
		}
		}

		/// Extract values from \p Base according to the type \p PrivType at the
		/// call position \p ACS. The values are appended to \p ReplacementValues.
		void createReplacementValues(Type *PrivType, AbstractCallSite ACS,
		Value *Base,
		SmallVectorImpl<Value *> &ReplacementValues) {
		assert(Base && "Expected base value!");
		efriedma: isArrayAllocation()
		jdoerfert (Author): Will do.
		assert(PrivType && "Expected privatizable type!");
		Instruction *IP = ACS.getInstruction();

		IRBuilder<NoFolder> IRB(IP);
		const DataLayout &DL = IP->getModule()->getDataLayout();

		if (Base->getType()->getPointerElementType() != PrivType)
		Base = BitCastInst::CreateBitOrPointerCast(Base, PrivType->getPointerTo(),
		"", ACS.getInstruction());

		// TODO: Improve the alignment of the loads.
		// Traverse the type, build GEPs and loads.
		if (auto *PrivStructType = dyn_cast<StructType>(PrivType)) {
		const StructLayout *PrivStructLayout = DL.getStructLayout(PrivStructType);
		for (unsigned u = 0, e = PrivStructType->getNumElements(); u < e; u++) {
		Type *PointeeTy = PrivStructType->getElementType(u);
		Value *Ptr =
		constructPointer(PointeeTy->getPointerTo(), Base,
		PrivStructLayout->getElementOffset(u), IRB, DL);
		LoadInst *L = new LoadInst(PointeeTy, Ptr, "", IP);
		L->setAlignment(MaybeAlign(1));
		ReplacementValues.push_back(L);
		}
		} else if (auto *PrivArrayType = dyn_cast<ArrayType>(PrivType)) {
		Type *PointeeTy = PrivArrayType->getElementType();
		uint64_t PointeeTySize = DL.getTypeStoreSize(PointeeTy);
		Type *PointeePtrTy = PointeeTy->getPointerTo();
		for (unsigned u = 0, e = PrivArrayType->getNumElements(); u < e; u++) {
		Value *Ptr =
		constructPointer(PointeePtrTy, Base, u * PointeeTySize, IRB, DL);
		LoadInst *L = new LoadInst(PointeePtrTy, Ptr, "", IP);
		L->setAlignment(MaybeAlign(1));
		ReplacementValues.push_back(L);
		}
		} else {
		LoadInst *L = new LoadInst(PrivType, Base, "", IP);
		L->setAlignment(MaybeAlign(1));
		ReplacementValues.push_back(L);
		}
		}

		/// See AbstractAttribute::manifest(...)
		ChangeStatus manifest(Attributor &A) override {
		if (!PrivatizableType.hasValue())
		return ChangeStatus::UNCHANGED;
		assert(PrivatizableType.getValue() && "Expected privatizable type!");

		// Collect all tail calls in the function as we cannot allow new allocas to
		// escape into tail recursion.
		// TODO: Be smarter about new allocas escaping into tail calls.
		SmallVector<CallInst *, 16> TailCalls;
		if (!A.checkForAllInstructions(
		[&](Instruction &I) {
		CallInst &CI = cast<CallInst>(I);
		if (CI.isTailCall())
		TailCalls.push_back(&CI);
		return true;
		},
		*this, {Instruction::Call}))
		return ChangeStatus::UNCHANGED;

		Argument *Arg = getAssociatedArgument();

		// Callback to repair the associated function. A new alloca is placed at the
		// beginning and initialized with the values passed through arguments. The
		// new alloca replaces the use of the old pointer argument.
		Attributor::ArgumentReplacementInfo::CalleeRepairCBTy FnRepairCB =
		[=](const Attributor::ArgumentReplacementInfo &ARI,
		Function &ReplacementFn, Function::arg_iterator ArgIt) {
		BasicBlock &EntryBB = ReplacementFn.getEntryBlock();
		Instruction *IP = &*EntryBB.getFirstInsertionPt();
		auto *AI = new AllocaInst(PrivatizableType.getValue(), 0,
		Arg->getName() + ".priv", IP);
		createInitialization(PrivatizableType.getValue(), *AI, ReplacementFn,
		ArgIt->getArgNo(), *IP);
		Arg->replaceAllUsesWith(AI);

		for (CallInst *CI : TailCalls)
		CI->setTailCall(false);
		};

		// Callback to repair a call site of the associated function. The elements
		// of the privatizable type are loaded prior to the call and passed to the
		// new function version.
		Attributor::ArgumentReplacementInfo::ACSRepairCBTy ACSRepairCB =
		[=](const Attributor::ArgumentReplacementInfo &ARI,
		AbstractCallSite ACS, SmallVectorImpl<Value *> &NewArgOperands) {
		createReplacementValues(
		PrivatizableType.getValue(), ACS,
		ACS.getCallArgOperand(ARI.getReplacedArg().getArgNo()),
		NewArgOperands);
		};

		// Collect the types that will replace the privatizable type in the function
		// signature.
		SmallVector<Type *, 16> ReplacementTypes;
		identifyReplacementTypes(PrivatizableType.getValue(), ReplacementTypes);

		// Register a rewrite of the argument.
		if (A.registerFunctionSignatureRewrite(*Arg, ReplacementTypes,
		std::move(FnRepairCB),
		std::move(ACSRepairCB)))
		return ChangeStatus::CHANGED;
		return ChangeStatus::UNCHANGED;
		}

		/// See AbstractAttribute::trackStatistics()
		void trackStatistics() const override {
		STATS_DECLTRACK_ARG_ATTR(privatizable_ptr);
		}
		};
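Note (illustrative, not part of the patch): at source level, the combined effect of the two repair callbacks in manifest resembles the rewrite below. This is a hand-written C++ analogy under assumptions, not output of the Attributor, and all names (`Pair`, `sumBefore`, `sumAfter`, `callerBefore`, `callerAfter`) are hypothetical. The caller loads the elements before the call, and the callee rebuilds a private copy from the scalar arguments and redirects the former pointer uses to it.

```cpp
#include <cassert>

struct Pair {
  int A;
  int B;
};

// Before: the callee reads caller-owned memory through a pointer.
inline int sumBefore(const Pair *P) { return P->A + P->B; }

// After, callee side: the pointer argument is replaced by one scalar
// argument per element, and a private copy is initialized from them.
inline int sumAfter(int A, int B) {
  Pair Priv;               // plays the role of the new "<arg>.priv" alloca
  Priv.A = A;              // stores emitted by the initialization step
  Priv.B = B;
  const Pair *P = &Priv;   // old uses of the argument now use the copy
  return P->A + P->B;
}

inline int callerBefore() {
  Pair Local{3, 4};
  return sumBefore(&Local);
}

// After, call-site side: the elements are loaded prior to the call.
inline int callerAfter() {
  Pair Local{3, 4};
  return sumAfter(Local.A, Local.B);
}
```

Both callers compute the same result. In the rewritten form the callee no longer touches caller memory, and a later pass such as SROA can often remove the private copy entirely.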

		struct AAPrivatizablePtrFloating : public AAPrivatizablePtrImpl {
		AAPrivatizablePtrFloating(const IRPosition &IRP)
		: AAPrivatizablePtrImpl(IRP) {}

		/// See AbstractAttribute::initialize(...).
		virtual void initialize(Attributor &A) override {
		// TODO: We can privatize more than arguments.
		indicatePessimisticFixpoint();
		}

		ChangeStatus updateImpl(Attributor &A) override {
		llvm_unreachable("AAPrivatizablePtr(Floating|Returned|CallSiteReturned)::"
		"updateImpl will not be called");
		}

		/// See AAPrivatizablePtrImpl::identifyPrivatizableType(...)
		Optional<Type *> identifyPrivatizableType(Attributor &A) override {
		Value *Obj =
		GetUnderlyingObject(&getAssociatedValue(), A.getInfoCache().getDL());
		if (!Obj) {
		LLVM_DEBUG(dbgs() << "[AAPrivatizablePtr] No underlying object found!\n");
		return nullptr;
		}

		if (auto *AI = dyn_cast<AllocaInst>(Obj))
		if (auto *CI = dyn_cast<ConstantInt>(AI->getArraySize()))
		if (CI->isOne())
		return Obj->getType()->getPointerElementType();
		if (auto *Arg = dyn_cast<Argument>(Obj)) {
		auto &PrivArgAA =
		A.getAAFor<AAPrivatizablePtr>(*this, IRPosition::argument(*Arg));
		if (PrivArgAA.isAssumedPrivatizablePtr())
		return Obj->getType()->getPointerElementType();
		}

		LLVM_DEBUG(dbgs() << "[AAPrivatizablePtr] Underlying object neither valid "
		"alloca nor privatizable argument: "
		<< *Obj << "!\n");
		return nullptr;
		}

		/// See AbstractAttribute::trackStatistics()
		void trackStatistics() const override {
		STATS_DECLTRACK_FLOATING_ATTR(privatizable_ptr);
		}
		};

		struct AAPrivatizablePtrCallSiteArgument final
		: public AAPrivatizablePtrFloating {
		AAPrivatizablePtrCallSiteArgument(const IRPosition &IRP)
		: AAPrivatizablePtrFloating(IRP) {}

		/// See AbstractAttribute::initialize(...).
		void initialize(Attributor &A) override {
		if (getIRPosition().hasAttr(Attribute::ByVal))
		indicateOptimisticFixpoint();
		}

		/// See AbstractAttribute::updateImpl(...).
		ChangeStatus updateImpl(Attributor &A) override {
		PrivatizableType = identifyPrivatizableType(A);
		if (!PrivatizableType.hasValue())
		return ChangeStatus::UNCHANGED;
		if (!PrivatizableType.getValue())
		return indicatePessimisticFixpoint();

		const IRPosition &IRP = getIRPosition();
		auto &NoCaptureAA = A.getAAFor<AANoCapture>(*this, IRP);
		if (!NoCaptureAA.isAssumedNoCapture()) {
		LLVM_DEBUG(dbgs() << "[AAPrivatizablePtr] pointer might be captured!\n");
		return indicatePessimisticFixpoint();
		}

		auto &NoAliasAA = A.getAAFor<AANoAlias>(*this, IRP);
		if (!NoAliasAA.isAssumedNoAlias()) {
		LLVM_DEBUG(dbgs() << "[AAPrivatizablePtr] pointer might alias!\n");
		return indicatePessimisticFixpoint();
		}

		const auto &MemBehaviorAA = A.getAAFor<AAMemoryBehavior>(*this, IRP);
		if (!MemBehaviorAA.isAssumedReadOnly()) {
		LLVM_DEBUG(dbgs() << "[AAPrivatizablePtr] pointer is written!\n");
		return indicatePessimisticFixpoint();
		}

		return ChangeStatus::UNCHANGED;
		}

		/// See AbstractAttribute::trackStatistics()
		void trackStatistics() const override {
		STATS_DECLTRACK_CSARG_ATTR(privatizable_ptr);
		}
		};

		struct AAPrivatizablePtrCallSiteReturned final
		: public AAPrivatizablePtrFloating {
		AAPrivatizablePtrCallSiteReturned(const IRPosition &IRP)
		: AAPrivatizablePtrFloating(IRP) {}

		/// See AbstractAttribute::initialize(...).
		void initialize(Attributor &A) override {
		// TODO: We can privatize more than arguments.
		indicatePessimisticFixpoint();
		}

		/// See AbstractAttribute::trackStatistics()
		void trackStatistics() const override {
		STATS_DECLTRACK_CSRET_ATTR(privatizable_ptr);
		}
		};

		struct AAPrivatizablePtrReturned final : public AAPrivatizablePtrFloating {
		AAPrivatizablePtrReturned(const IRPosition &IRP)
		: AAPrivatizablePtrFloating(IRP) {}

		/// See AbstractAttribute::initialize(...).
		void initialize(Attributor &A) override {
		// TODO: We can privatize more than arguments.
		indicatePessimisticFixpoint();
		}

		/// See AbstractAttribute::trackStatistics()
		void trackStatistics() const override {
		STATS_DECLTRACK_FNRET_ATTR(privatizable_ptr);
		}
		};

/// -------------------- Memory Behavior Attributes ----------------------------		/// -------------------- Memory Behavior Attributes ----------------------------
/// Includes read-none, read-only, and write-only.		/// Includes read-none, read-only, and write-only.
/// ----------------------------------------------------------------------------		/// ----------------------------------------------------------------------------
struct AAMemoryBehaviorImpl : public AAMemoryBehavior {		struct AAMemoryBehaviorImpl : public AAMemoryBehavior {
AAMemoryBehaviorImpl(const IRPosition &IRP) : AAMemoryBehavior(IRP) {}		AAMemoryBehaviorImpl(const IRPosition &IRP) : AAMemoryBehavior(IRP) {}

/// See AbstractAttribute::initialize(...).		/// See AbstractAttribute::initialize(...).
void initialize(Attributor &A) override {		void initialize(Attributor &A) override {
[1,534 lines not shown]	errs() << "\n[Attributor] Fixpoint iteration done after: "
<< " iterations\n";		<< " iterations\n";
llvm_unreachable("The fixpoint was not reached with exactly the number of "		llvm_unreachable("The fixpoint was not reached with exactly the number of "
"specified iterations!");		"specified iterations!");
}		}

return ManifestChange;		return ManifestChange;
}		}

bool Attributor::registerFunctionSignatureRewrite(		bool Attributor::isValidFunctionSignatureRewrite(
Argument &Arg, ArrayRef<Type *> ReplacementTypes,		Argument &Arg, ArrayRef<Type *> ReplacementTypes) {
ArgumentReplacementInfo::CalleeRepairCBTy &&CalleeRepairCB,
ArgumentReplacementInfo::ACSRepairCBTy &&ACSRepairCB) {

auto CallSiteCanBeChanged = [](AbstractCallSite ACS) {		auto CallSiteCanBeChanged = [](AbstractCallSite ACS) {
// Forbid must-tail calls for now.		// Forbid must-tail calls for now.
return !ACS.isCallbackCall() && !ACS.getCallSite().isMustTailCall();		return !ACS.isCallbackCall() && !ACS.getCallSite().isMustTailCall();
};		};

Function *Fn = Arg.getParent();		Function *Fn = Arg.getParent();
// Avoid var-arg functions for now.		// Avoid var-arg functions for now.
[29 lines not shown]	bool Attributor::isValidFunctionSignatureRewrite(
bool AnyDead;		bool AnyDead;
auto &OpcodeInstMap = InfoCache.getOpcodeInstMapForFunction(*Fn);		auto &OpcodeInstMap = InfoCache.getOpcodeInstMapForFunction(*Fn);
if (!checkForAllInstructionsImpl(OpcodeInstMap, InstPred, nullptr, AnyDead,		if (!checkForAllInstructionsImpl(OpcodeInstMap, InstPred, nullptr, AnyDead,
{Instruction::Call})) {		{Instruction::Call})) {
LLVM_DEBUG(dbgs() << "[Attributor] Cannot rewrite due to instructions\n");		LLVM_DEBUG(dbgs() << "[Attributor] Cannot rewrite due to instructions\n");
return false;		return false;
}		}

		return true;
		}

		bool Attributor::registerFunctionSignatureRewrite(
		Argument &Arg, ArrayRef<Type *> ReplacementTypes,
		ArgumentReplacementInfo::CalleeRepairCBTy &&CalleeRepairCB,
		ArgumentReplacementInfo::ACSRepairCBTy &&ACSRepairCB) {
		LLVM_DEBUG(dbgs() << "[Attributor] Register new rewrite of " << Arg << " in "
		<< Arg.getParent()->getName() << " with "
		<< ReplacementTypes.size() << " replacements\n");
		assert(isValidFunctionSignatureRewrite(Arg, ReplacementTypes) &&
		"Cannot register an invalid rewrite");

		Function *Fn = Arg.getParent();
SmallVectorImpl<ArgumentReplacementInfo *> &ARIs = ArgumentReplacementMap[Fn];		SmallVectorImpl<ArgumentReplacementInfo *> &ARIs = ArgumentReplacementMap[Fn];
if (ARIs.size() == 0)		if (ARIs.empty())
ARIs.resize(Fn->arg_size());		ARIs.resize(Fn->arg_size());

// If we have a replacement already with less than or equal new arguments,		// If we have a replacement already with less than or equal new arguments,
// ignore this request.		// ignore this request.
ArgumentReplacementInfo *&ARI = ARIs[Arg.getArgNo()];		ArgumentReplacementInfo *&ARI = ARIs[Arg.getArgNo()];
if (ARI && ARI->getNumReplacementArgs() <= ReplacementTypes.size()) {		if (ARI && ARI->getNumReplacementArgs() <= ReplacementTypes.size()) {
LLVM_DEBUG(dbgs() << "[Attributor] Existing rewrite is preferred\n");		LLVM_DEBUG(dbgs() << "[Attributor] Existing rewrite is preferred\n");
return false;		return false;
}		}

// If we have a replacement already but we like the new one better, delete		// If we have a replacement already but we like the new one better, delete
// the old.		// the old.
if (ARI)		if (ARI)
delete ARI;		delete ARI;

		LLVM_DEBUG(dbgs() << "[Attributor] Register new rewrite of " << Arg << " in "
		<< Arg.getParent()->getName() << " with "
		<< ReplacementTypes.size() << " replacements\n");

// Remember the replacement.		// Remember the replacement.
ARI = new ArgumentReplacementInfo(*this, Arg, ReplacementTypes,		ARI = new ArgumentReplacementInfo(*this, Arg, ReplacementTypes,
std::move(CalleeRepairCB),		std::move(CalleeRepairCB),
std::move(ACSRepairCB));		std::move(ACSRepairCB));

return true;		return true;
}		}

[325 lines not shown]	if (Arg.getType()->isPointerTy()) {
getOrCreateAAFor<AANoCapture>(ArgPos);		getOrCreateAAFor<AANoCapture>(ArgPos);

// Every argument with pointer type might be marked		// Every argument with pointer type might be marked
// "readnone/readonly/writeonly/..."		// "readnone/readonly/writeonly/..."
getOrCreateAAFor<AAMemoryBehavior>(ArgPos);		getOrCreateAAFor<AAMemoryBehavior>(ArgPos);

// Every argument with pointer type might be marked nofree.		// Every argument with pointer type might be marked nofree.
getOrCreateAAFor<AANoFree>(ArgPos);		getOrCreateAAFor<AANoFree>(ArgPos);

		// Every argument with pointer type might be privatizable (or promotable)
		getOrCreateAAFor<AAPrivatizablePtr>(ArgPos);
}		}
}		}

auto CallSitePred = [&](Instruction &I) -> bool {		auto CallSitePred = [&](Instruction &I) -> bool {
CallSite CS(&I);		CallSite CS(&I);
if (Function *Callee = CS.getCalledFunction()) {		if (Function *Callee = CS.getCalledFunction()) {
// Skip declarations except if annotations on their call sites were		// Skip declarations except if annotations on their call sites were
// explicitly requested.		// explicitly requested.
[238 lines not shown]
const char AAReachability::ID = 0;		const char AAReachability::ID = 0;
const char AANoReturn::ID = 0;		const char AANoReturn::ID = 0;
const char AAIsDead::ID = 0;		const char AAIsDead::ID = 0;
const char AADereferenceable::ID = 0;		const char AADereferenceable::ID = 0;
const char AAAlign::ID = 0;		const char AAAlign::ID = 0;
const char AANoCapture::ID = 0;		const char AANoCapture::ID = 0;
const char AAValueSimplify::ID = 0;		const char AAValueSimplify::ID = 0;
const char AAHeapToStack::ID = 0;		const char AAHeapToStack::ID = 0;
		const char AAPrivatizablePtr::ID = 0;
const char AAMemoryBehavior::ID = 0;		const char AAMemoryBehavior::ID = 0;
const char AAValueConstantRange::ID = 0;		const char AAValueConstantRange::ID = 0;

// Macro magic to create the static generator function for attributes that		// Macro magic to create the static generator function for attributes that
// follow the naming scheme.		// follow the naming scheme.

#define SWITCH_PK_INV(CLASS, PK, POS_NAME) \		#define SWITCH_PK_INV(CLASS, PK, POS_NAME) \
case IRPosition::PK: \		case IRPosition::PK: \
[88 lines not shown]
CREATE_FUNCTION_ABSTRACT_ATTRIBUTE_FOR_POSITION(AANoSync)		CREATE_FUNCTION_ABSTRACT_ATTRIBUTE_FOR_POSITION(AANoSync)
CREATE_FUNCTION_ABSTRACT_ATTRIBUTE_FOR_POSITION(AANoRecurse)		CREATE_FUNCTION_ABSTRACT_ATTRIBUTE_FOR_POSITION(AANoRecurse)
CREATE_FUNCTION_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAWillReturn)		CREATE_FUNCTION_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAWillReturn)
CREATE_FUNCTION_ABSTRACT_ATTRIBUTE_FOR_POSITION(AANoReturn)		CREATE_FUNCTION_ABSTRACT_ATTRIBUTE_FOR_POSITION(AANoReturn)
CREATE_FUNCTION_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAReturnedValues)		CREATE_FUNCTION_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAReturnedValues)

CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AANonNull)		CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AANonNull)
CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AANoAlias)		CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AANoAlias)
		CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAPrivatizablePtr)
CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AADereferenceable)		CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AADereferenceable)
CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAAlign)		CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAAlign)
CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AANoCapture)		CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AANoCapture)
CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAValueConstantRange)		CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAValueConstantRange)

CREATE_ALL_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAValueSimplify)		CREATE_ALL_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAValueSimplify)
CREATE_ALL_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAIsDead)		CREATE_ALL_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAIsDead)
CREATE_ALL_ABSTRACT_ATTRIBUTE_FOR_POSITION(AANoFree)		CREATE_ALL_ABSTRACT_ATTRIBUTE_FOR_POSITION(AANoFree)
[20 lines not shown]

llvm/test/Transforms/Attributor/ArgumentPromotion/2008-02-01-ReturnAttrs.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes
	; RUN: opt -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 -S < %s \| FileCheck %s			; RUN: opt -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 -S < %s \| FileCheck %s

	define internal i32 @deref(i32* %x) nounwind {			define internal i32 @deref(i32* %x) nounwind {
	; CHECK-LABEL: define {{[^@]+}}@deref			; CHECK-LABEL: define {{[^@]+}}@deref
	; CHECK-SAME: (i32* noalias nocapture nofree nonnull readonly align 4 dereferenceable(4) [[X:%.*]])			; CHECK-SAME: (i32 [[TMP0:%.*]])
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[TMP2:%.*]] = load i32, i32* [[X]], align 4			; CHECK-NEXT: [[X_PRIV:%.*]] = alloca i32
				; CHECK-NEXT: store i32 [[TMP0]], i32* [[X_PRIV]]
				; CHECK-NEXT: [[TMP2:%.*]] = load i32, i32* [[X_PRIV]], align 4
	; CHECK-NEXT: ret i32 [[TMP2]]			; CHECK-NEXT: ret i32 [[TMP2]]
	;			;
	entry:			entry:
	%tmp2 = load i32, i32* %x, align 4			%tmp2 = load i32, i32* %x, align 4
	ret i32 %tmp2			ret i32 %tmp2
	}			}

	define i32 @f(i32 %x) {			define i32 @f(i32 %x) {
	; CHECK-LABEL: define {{[^@]+}}@f			; CHECK-LABEL: define {{[^@]+}}@f
	; CHECK-SAME: (i32 [[X:%.*]])			; CHECK-SAME: (i32 [[X:%.*]])
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[X_ADDR:%.*]] = alloca i32			; CHECK-NEXT: [[X_ADDR:%.*]] = alloca i32
	; CHECK-NEXT: store i32 [[X]], i32* [[X_ADDR]], align 4			; CHECK-NEXT: store i32 [[X]], i32* [[X_ADDR]], align 4
	; CHECK-NEXT: [[TMP1:%.*]] = call i32 @deref(i32* noalias nocapture nofree nonnull readonly align 4 dereferenceable(4) [[X_ADDR]])			; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[X_ADDR]], align 1
				; CHECK-NEXT: [[TMP1:%.*]] = call i32 @deref(i32 [[TMP0]])
	; CHECK-NEXT: ret i32 [[TMP1]]			; CHECK-NEXT: ret i32 [[TMP1]]
	;			;
	entry:			entry:
	%x_addr = alloca i32			%x_addr = alloca i32
	store i32 %x, i32* %x_addr, align 4			store i32 %x, i32* %x_addr, align 4
	%tmp1 = call i32 @deref( i32* %x_addr ) nounwind			%tmp1 = call i32 @deref( i32* %x_addr ) nounwind
	ret i32 %tmp1			ret i32 %tmp1
	}			}
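
Read as C++ rather than IR, the change to the CHECK lines in this test corresponds roughly to the sketch below. It is only an analogy under stated assumptions: the `_before`/`_after` function names are hypothetical, and the local variable names are chosen to match the FileCheck variables `X_ADDR`, `TMP0`, and `X_PRIV`.

```cpp
// Before privatization: the callee reads through the caller's pointer.
static int deref_before(const int *x) { return *x; }

int f_before(int x) {
  int x_addr = x;                // alloca + store in the caller
  return deref_before(&x_addr);  // pointer to the caller's stack slot
}

// After privatization: the caller loads the value and passes it by value,
// and the callee stores it into a private stack slot before using it.
static int deref_after(int tmp0) {
  int x_priv = tmp0;  // alloca + store in the callee
  return x_priv;      // the original load now reads the private copy
}

int f_after(int x) {
  int x_addr = x;
  int tmp0 = x_addr;  // load inserted at the call site
  return deref_after(tmp0);
}
```

Both versions return the same value. What changes is that the callee no longer receives a pointer into the caller's frame.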

llvm/test/Transforms/Attributor/ArgumentPromotion/X86/attributes.ll

[39 lines not shown]	bb:
call fastcc void @no_promote_avx2(<4 x i64>* %tmp2, <4 x i64>* %tmp)		call fastcc void @no_promote_avx2(<4 x i64>* %tmp2, <4 x i64>* %tmp)
%tmp4 = load <4 x i64>, <4 x i64>* %tmp2, align 32		%tmp4 = load <4 x i64>, <4 x i64>* %tmp2, align 32
store <4 x i64> %tmp4, <4 x i64>* %arg, align 2		store <4 x i64> %tmp4, <4 x i64>* %arg, align 2
ret void		ret void
}		}

define internal fastcc void @promote_avx2(<4 x i64>* %arg, <4 x i64>* readonly %arg1) #0 {		define internal fastcc void @promote_avx2(<4 x i64>* %arg, <4 x i64>* readonly %arg1) #0 {
; CHECK-LABEL: define {{[^@]+}}@promote_avx2		; CHECK-LABEL: define {{[^@]+}}@promote_avx2
; CHECK-SAME: (<4 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(32) [[ARG:%.*]], <4 x i64>* noalias nocapture nofree nonnull readonly align 32 dereferenceable(32) [[ARG1:%.*]])		; CHECK-SAME: (<4 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(32) [[ARG:%.*]], <4 x i64> [[TMP0:%.*]])
; CHECK-NEXT: bb:		; CHECK-NEXT: bb:
; CHECK-NEXT: [[TMP:%.*]] = load <4 x i64>, <4 x i64>* [[ARG1]], align 32		; CHECK-NEXT: [[ARG1_PRIV:%.*]] = alloca <4 x i64>
		; CHECK-NEXT: store <4 x i64> [[TMP0]], <4 x i64>* [[ARG1_PRIV]]
		; CHECK-NEXT: [[TMP:%.*]] = load <4 x i64>, <4 x i64>* [[ARG1_PRIV]], align 32
; CHECK-NEXT: store <4 x i64> [[TMP]], <4 x i64>* [[ARG]], align 32		; CHECK-NEXT: store <4 x i64> [[TMP]], <4 x i64>* [[ARG]], align 32
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
bb:		bb:
%tmp = load <4 x i64>, <4 x i64>* %arg1		%tmp = load <4 x i64>, <4 x i64>* %arg1
store <4 x i64> %tmp, <4 x i64>* %arg		store <4 x i64> %tmp, <4 x i64>* %arg
ret void		ret void
}		}

define void @promote(<4 x i64>* %arg) #0 {		define void @promote(<4 x i64>* %arg) #0 {
; CHECK-LABEL: define {{[^@]+}}@promote		; CHECK-LABEL: define {{[^@]+}}@promote
; CHECK-SAME: (<4 x i64>* nocapture writeonly [[ARG:%.*]])		; CHECK-SAME: (<4 x i64>* nocapture writeonly [[ARG:%.*]])
; CHECK-NEXT: bb:		; CHECK-NEXT: bb:
; CHECK-NEXT: [[TMP:%.*]] = alloca <4 x i64>, align 32		; CHECK-NEXT: [[TMP:%.*]] = alloca <4 x i64>, align 32
; CHECK-NEXT: [[TMP2:%.*]] = alloca <4 x i64>, align 32		; CHECK-NEXT: [[TMP2:%.*]] = alloca <4 x i64>, align 32
; CHECK-NEXT: [[TMP3:%.*]] = bitcast <4 x i64>* [[TMP]] to i8*		; CHECK-NEXT: [[TMP3:%.*]] = bitcast <4 x i64>* [[TMP]] to i8*
; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* nonnull align 32 dereferenceable(32) [[TMP3]], i8 0, i64 32, i1 false)		; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* nonnull align 32 dereferenceable(32) [[TMP3]], i8 0, i64 32, i1 false)
; CHECK-NEXT: call fastcc void @promote_avx2(<4 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(32) [[TMP2]], <4 x i64>* noalias nocapture nofree nonnull readonly align 32 dereferenceable(32) [[TMP]])		; CHECK-NEXT: [[TMP0:%.*]] = load <4 x i64>, <4 x i64>* [[TMP]], align 1
		; CHECK-NEXT: call fastcc void @promote_avx2(<4 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(32) [[TMP2]], <4 x i64> [[TMP0]])
; CHECK-NEXT: [[TMP4:%.*]] = load <4 x i64>, <4 x i64>* [[TMP2]], align 32		; CHECK-NEXT: [[TMP4:%.*]] = load <4 x i64>, <4 x i64>* [[TMP2]], align 32
; CHECK-NEXT: store <4 x i64> [[TMP4]], <4 x i64>* [[ARG]], align 2		; CHECK-NEXT: store <4 x i64> [[TMP4]], <4 x i64>* [[ARG]], align 2
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
bb:		bb:
%tmp = alloca <4 x i64>, align 32		%tmp = alloca <4 x i64>, align 32
%tmp2 = alloca <4 x i64>, align 32		%tmp2 = alloca <4 x i64>, align 32
%tmp3 = bitcast <4 x i64>* %tmp to i8*		%tmp3 = bitcast <4 x i64>* %tmp to i8*
[13 lines not shown]

llvm/test/Transforms/Attributor/ArgumentPromotion/X86/min-legal-vector-width.ll

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes		; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes
; RUN: opt -S -passes='attributor' -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 < %s \| FileCheck %s		; RUN: opt -S -passes='attributor' -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 < %s \| FileCheck %s
; Test that we only promote arguments when the caller/callee have compatible		; Test that we only promote arguments when the caller/callee have compatible
; function attributes.		; function attributes.

target triple = "x86_64-unknown-linux-gnu"		target triple = "x86_64-unknown-linux-gnu"

; This should promote		; This should promote
define internal fastcc void @callee_avx512_legal512_prefer512_call_avx512_legal512_prefer512(<8 x i64>* %arg, <8 x i64>* readonly %arg1) #0 {		define internal fastcc void @callee_avx512_legal512_prefer512_call_avx512_legal512_prefer512(<8 x i64>* %arg, <8 x i64>* readonly %arg1) #0 {
; CHECK-LABEL: define {{[^@]+}}@callee_avx512_legal512_prefer512_call_avx512_legal512_prefer512		; CHECK-LABEL: define {{[^@]+}}@callee_avx512_legal512_prefer512_call_avx512_legal512_prefer512
; CHECK-SAME: (<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[ARG:%.*]], <8 x i64>* noalias nocapture nofree nonnull readonly align 32 dereferenceable(64) [[ARG1:%.*]])		; CHECK-SAME: (<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[ARG:%.*]], <8 x i64> [[TMP0:%.*]])
; CHECK-NEXT: bb:		; CHECK-NEXT: bb:
; CHECK-NEXT: [[TMP:%.*]] = load <8 x i64>, <8 x i64>* [[ARG1]], align 32		; CHECK-NEXT: [[ARG1_PRIV:%.*]] = alloca <8 x i64>
		; CHECK-NEXT: store <8 x i64> [[TMP0]], <8 x i64>* [[ARG1_PRIV]]
		; CHECK-NEXT: [[TMP:%.*]] = load <8 x i64>, <8 x i64>* [[ARG1_PRIV]], align 32
; CHECK-NEXT: store <8 x i64> [[TMP]], <8 x i64>* [[ARG]], align 32		; CHECK-NEXT: store <8 x i64> [[TMP]], <8 x i64>* [[ARG]], align 32
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
bb:		bb:
%tmp = load <8 x i64>, <8 x i64>* %arg1		%tmp = load <8 x i64>, <8 x i64>* %arg1
store <8 x i64> %tmp, <8 x i64>* %arg		store <8 x i64> %tmp, <8 x i64>* %arg
ret void		ret void
}		}

define void @avx512_legal512_prefer512_call_avx512_legal512_prefer512(<8 x i64>* %arg) #0 {		define void @avx512_legal512_prefer512_call_avx512_legal512_prefer512(<8 x i64>* %arg) #0 {
; CHECK-LABEL: define {{[^@]+}}@avx512_legal512_prefer512_call_avx512_legal512_prefer512		; CHECK-LABEL: define {{[^@]+}}@avx512_legal512_prefer512_call_avx512_legal512_prefer512
; CHECK-SAME: (<8 x i64>* nocapture writeonly [[ARG:%.*]])		; CHECK-SAME: (<8 x i64>* nocapture writeonly [[ARG:%.*]])
; CHECK-NEXT: bb:		; CHECK-NEXT: bb:
; CHECK-NEXT: [[TMP:%.*]] = alloca <8 x i64>, align 32		; CHECK-NEXT: [[TMP:%.*]] = alloca <8 x i64>, align 32
; CHECK-NEXT: [[TMP2:%.*]] = alloca <8 x i64>, align 32		; CHECK-NEXT: [[TMP2:%.*]] = alloca <8 x i64>, align 32
; CHECK-NEXT: [[TMP3:%.*]] = bitcast <8 x i64>* [[TMP]] to i8*		; CHECK-NEXT: [[TMP3:%.*]] = bitcast <8 x i64>* [[TMP]] to i8*
; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* nonnull align 32 dereferenceable(64) [[TMP3]], i8 0, i64 32, i1 false)		; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* nonnull align 32 dereferenceable(64) [[TMP3]], i8 0, i64 32, i1 false)
; CHECK-NEXT: call fastcc void @callee_avx512_legal512_prefer512_call_avx512_legal512_prefer512(<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[TMP2]], <8 x i64>* noalias nocapture nofree nonnull readonly align 32 dereferenceable(64) [[TMP]])		; CHECK-NEXT: [[TMP0:%.*]] = load <8 x i64>, <8 x i64>* [[TMP]], align 1
		; CHECK-NEXT: call fastcc void @callee_avx512_legal512_prefer512_call_avx512_legal512_prefer512(<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[TMP2]], <8 x i64> [[TMP0]])
; CHECK-NEXT: [[TMP4:%.*]] = load <8 x i64>, <8 x i64>* [[TMP2]], align 32		; CHECK-NEXT: [[TMP4:%.*]] = load <8 x i64>, <8 x i64>* [[TMP2]], align 32
; CHECK-NEXT: store <8 x i64> [[TMP4]], <8 x i64>* [[ARG]], align 2		; CHECK-NEXT: store <8 x i64> [[TMP4]], <8 x i64>* [[ARG]], align 2
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
bb:		bb:
%tmp = alloca <8 x i64>, align 32		%tmp = alloca <8 x i64>, align 32
%tmp2 = alloca <8 x i64>, align 32		%tmp2 = alloca <8 x i64>, align 32
%tmp3 = bitcast <8 x i64>* %tmp to i8*		%tmp3 = bitcast <8 x i64>* %tmp to i8*
call void @llvm.memset.p0i8.i64(i8* align 32 %tmp3, i8 0, i64 32, i1 false)		call void @llvm.memset.p0i8.i64(i8* align 32 %tmp3, i8 0, i64 32, i1 false)
call fastcc void @callee_avx512_legal512_prefer512_call_avx512_legal512_prefer512(<8 x i64>* %tmp2, <8 x i64>* %tmp)		call fastcc void @callee_avx512_legal512_prefer512_call_avx512_legal512_prefer512(<8 x i64>* %tmp2, <8 x i64>* %tmp)
%tmp4 = load <8 x i64>, <8 x i64>* %tmp2, align 32		%tmp4 = load <8 x i64>, <8 x i64>* %tmp2, align 32
store <8 x i64> %tmp4, <8 x i64>* %arg, align 2		store <8 x i64> %tmp4, <8 x i64>* %arg, align 2
ret void		ret void
}		}

; This should promote		; This should promote
define internal fastcc void @callee_avx512_legal512_prefer256_call_avx512_legal512_prefer256(<8 x i64>* %arg, <8 x i64>* readonly %arg1) #1 {		define internal fastcc void @callee_avx512_legal512_prefer256_call_avx512_legal512_prefer256(<8 x i64>* %arg, <8 x i64>* readonly %arg1) #1 {
; CHECK-LABEL: define {{[^@]+}}@callee_avx512_legal512_prefer256_call_avx512_legal512_prefer256		; CHECK-LABEL: define {{[^@]+}}@callee_avx512_legal512_prefer256_call_avx512_legal512_prefer256
; CHECK-SAME: (<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[ARG:%.*]], <8 x i64>* noalias nocapture nofree nonnull readonly align 32 dereferenceable(64) [[ARG1:%.*]])		; CHECK-SAME: (<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[ARG:%.*]], <8 x i64> [[TMP0:%.*]])
; CHECK-NEXT: bb:		; CHECK-NEXT: bb:
; CHECK-NEXT: [[TMP:%.*]] = load <8 x i64>, <8 x i64>* [[ARG1]], align 32		; CHECK-NEXT: [[ARG1_PRIV:%.*]] = alloca <8 x i64>
		; CHECK-NEXT: store <8 x i64> [[TMP0]], <8 x i64>* [[ARG1_PRIV]]
		; CHECK-NEXT: [[TMP:%.*]] = load <8 x i64>, <8 x i64>* [[ARG1_PRIV]], align 32
; CHECK-NEXT: store <8 x i64> [[TMP]], <8 x i64>* [[ARG]], align 32		; CHECK-NEXT: store <8 x i64> [[TMP]], <8 x i64>* [[ARG]], align 32
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
bb:		bb:
%tmp = load <8 x i64>, <8 x i64>* %arg1		%tmp = load <8 x i64>, <8 x i64>* %arg1
store <8 x i64> %tmp, <8 x i64>* %arg		store <8 x i64> %tmp, <8 x i64>* %arg
ret void		ret void
}		}

define void @avx512_legal512_prefer256_call_avx512_legal512_prefer256(<8 x i64>* %arg) #1 {		define void @avx512_legal512_prefer256_call_avx512_legal512_prefer256(<8 x i64>* %arg) #1 {
; CHECK-LABEL: define {{[^@]+}}@avx512_legal512_prefer256_call_avx512_legal512_prefer256		; CHECK-LABEL: define {{[^@]+}}@avx512_legal512_prefer256_call_avx512_legal512_prefer256
; CHECK-SAME: (<8 x i64>* nocapture writeonly [[ARG:%.*]])		; CHECK-SAME: (<8 x i64>* nocapture writeonly [[ARG:%.*]])
; CHECK-NEXT: bb:		; CHECK-NEXT: bb:
; CHECK-NEXT: [[TMP:%.*]] = alloca <8 x i64>, align 32		; CHECK-NEXT: [[TMP:%.*]] = alloca <8 x i64>, align 32
; CHECK-NEXT: [[TMP2:%.*]] = alloca <8 x i64>, align 32		; CHECK-NEXT: [[TMP2:%.*]] = alloca <8 x i64>, align 32
; CHECK-NEXT: [[TMP3:%.*]] = bitcast <8 x i64>* [[TMP]] to i8*		; CHECK-NEXT: [[TMP3:%.*]] = bitcast <8 x i64>* [[TMP]] to i8*
; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* nonnull align 32 dereferenceable(64) [[TMP3]], i8 0, i64 32, i1 false)		; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* nonnull align 32 dereferenceable(64) [[TMP3]], i8 0, i64 32, i1 false)
; CHECK-NEXT: call fastcc void @callee_avx512_legal512_prefer256_call_avx512_legal512_prefer256(<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[TMP2]], <8 x i64>* noalias nocapture nofree nonnull readonly align 32 dereferenceable(64) [[TMP]])		; CHECK-NEXT: [[TMP0:%.*]] = load <8 x i64>, <8 x i64>* [[TMP]], align 1
		; CHECK-NEXT: call fastcc void @callee_avx512_legal512_prefer256_call_avx512_legal512_prefer256(<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[TMP2]], <8 x i64> [[TMP0]])
; CHECK-NEXT: [[TMP4:%.*]] = load <8 x i64>, <8 x i64>* [[TMP2]], align 32		; CHECK-NEXT: [[TMP4:%.*]] = load <8 x i64>, <8 x i64>* [[TMP2]], align 32
; CHECK-NEXT: store <8 x i64> [[TMP4]], <8 x i64>* [[ARG]], align 2		; CHECK-NEXT: store <8 x i64> [[TMP4]], <8 x i64>* [[ARG]], align 2
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
bb:		bb:
%tmp = alloca <8 x i64>, align 32		%tmp = alloca <8 x i64>, align 32
%tmp2 = alloca <8 x i64>, align 32		%tmp2 = alloca <8 x i64>, align 32
%tmp3 = bitcast <8 x i64>* %tmp to i8*		%tmp3 = bitcast <8 x i64>* %tmp to i8*
call void @llvm.memset.p0i8.i64(i8* align 32 %tmp3, i8 0, i64 32, i1 false)		call void @llvm.memset.p0i8.i64(i8* align 32 %tmp3, i8 0, i64 32, i1 false)
call fastcc void @callee_avx512_legal512_prefer256_call_avx512_legal512_prefer256(<8 x i64>* %tmp2, <8 x i64>* %tmp)		call fastcc void @callee_avx512_legal512_prefer256_call_avx512_legal512_prefer256(<8 x i64>* %tmp2, <8 x i64>* %tmp)
%tmp4 = load <8 x i64>, <8 x i64>* %tmp2, align 32		%tmp4 = load <8 x i64>, <8 x i64>* %tmp2, align 32
store <8 x i64> %tmp4, <8 x i64>* %arg, align 2		store <8 x i64> %tmp4, <8 x i64>* %arg, align 2
ret void		ret void
}		}

; This should promote		; This should promote
define internal fastcc void @callee_avx512_legal512_prefer512_call_avx512_legal512_prefer256(<8 x i64>* %arg, <8 x i64>* readonly %arg1) #1 {		define internal fastcc void @callee_avx512_legal512_prefer512_call_avx512_legal512_prefer256(<8 x i64>* %arg, <8 x i64>* readonly %arg1) #1 {
; CHECK-LABEL: define {{[^@]+}}@callee_avx512_legal512_prefer512_call_avx512_legal512_prefer256		; CHECK-LABEL: define {{[^@]+}}@callee_avx512_legal512_prefer512_call_avx512_legal512_prefer256
; CHECK-SAME: (<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[ARG:%.*]], <8 x i64>* noalias nocapture nofree nonnull readonly align 32 dereferenceable(64) [[ARG1:%.*]])		; CHECK-SAME: (<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[ARG:%.*]], <8 x i64> [[TMP0:%.*]])
; CHECK-NEXT: bb:		; CHECK-NEXT: bb:
; CHECK-NEXT: [[TMP:%.*]] = load <8 x i64>, <8 x i64>* [[ARG1]], align 32		; CHECK-NEXT: [[ARG1_PRIV:%.*]] = alloca <8 x i64>
		; CHECK-NEXT: store <8 x i64> [[TMP0]], <8 x i64>* [[ARG1_PRIV]]
		; CHECK-NEXT: [[TMP:%.*]] = load <8 x i64>, <8 x i64>* [[ARG1_PRIV]], align 32
; CHECK-NEXT: store <8 x i64> [[TMP]], <8 x i64>* [[ARG]], align 32		; CHECK-NEXT: store <8 x i64> [[TMP]], <8 x i64>* [[ARG]], align 32
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
bb:		bb:
%tmp = load <8 x i64>, <8 x i64>* %arg1		%tmp = load <8 x i64>, <8 x i64>* %arg1
store <8 x i64> %tmp, <8 x i64>* %arg		store <8 x i64> %tmp, <8 x i64>* %arg
ret void		ret void
}		}

define void @avx512_legal512_prefer512_call_avx512_legal512_prefer256(<8 x i64>* %arg) #0 {		define void @avx512_legal512_prefer512_call_avx512_legal512_prefer256(<8 x i64>* %arg) #0 {
; CHECK-LABEL: define {{[^@]+}}@avx512_legal512_prefer512_call_avx512_legal512_prefer256		; CHECK-LABEL: define {{[^@]+}}@avx512_legal512_prefer512_call_avx512_legal512_prefer256
; CHECK-SAME: (<8 x i64>* nocapture writeonly [[ARG:%.*]])		; CHECK-SAME: (<8 x i64>* nocapture writeonly [[ARG:%.*]])
; CHECK-NEXT: bb:		; CHECK-NEXT: bb:
; CHECK-NEXT: [[TMP:%.*]] = alloca <8 x i64>, align 32		; CHECK-NEXT: [[TMP:%.*]] = alloca <8 x i64>, align 32
; CHECK-NEXT: [[TMP2:%.*]] = alloca <8 x i64>, align 32		; CHECK-NEXT: [[TMP2:%.*]] = alloca <8 x i64>, align 32
; CHECK-NEXT: [[TMP3:%.*]] = bitcast <8 x i64>* [[TMP]] to i8*		; CHECK-NEXT: [[TMP3:%.*]] = bitcast <8 x i64>* [[TMP]] to i8*
; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* nonnull align 32 dereferenceable(64) [[TMP3]], i8 0, i64 32, i1 false)		; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* nonnull align 32 dereferenceable(64) [[TMP3]], i8 0, i64 32, i1 false)
; CHECK-NEXT: call fastcc void @callee_avx512_legal512_prefer512_call_avx512_legal512_prefer256(<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[TMP2]], <8 x i64>* noalias nocapture nofree nonnull readonly align 32 dereferenceable(64) [[TMP]])		; CHECK-NEXT: [[TMP0:%.*]] = load <8 x i64>, <8 x i64>* [[TMP]], align 1
		; CHECK-NEXT: call fastcc void @callee_avx512_legal512_prefer512_call_avx512_legal512_prefer256(<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[TMP2]], <8 x i64> [[TMP0]])
; CHECK-NEXT: [[TMP4:%.*]] = load <8 x i64>, <8 x i64>* [[TMP2]], align 32		; CHECK-NEXT: [[TMP4:%.*]] = load <8 x i64>, <8 x i64>* [[TMP2]], align 32
; CHECK-NEXT: store <8 x i64> [[TMP4]], <8 x i64>* [[ARG]], align 2		; CHECK-NEXT: store <8 x i64> [[TMP4]], <8 x i64>* [[ARG]], align 2
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
bb:		bb:
%tmp = alloca <8 x i64>, align 32		%tmp = alloca <8 x i64>, align 32
%tmp2 = alloca <8 x i64>, align 32		%tmp2 = alloca <8 x i64>, align 32
%tmp3 = bitcast <8 x i64>* %tmp to i8*		%tmp3 = bitcast <8 x i64>* %tmp to i8*
call void @llvm.memset.p0i8.i64(i8* align 32 %tmp3, i8 0, i64 32, i1 false)		call void @llvm.memset.p0i8.i64(i8* align 32 %tmp3, i8 0, i64 32, i1 false)
call fastcc void @callee_avx512_legal512_prefer512_call_avx512_legal512_prefer256(<8 x i64>* %tmp2, <8 x i64>* %tmp)		call fastcc void @callee_avx512_legal512_prefer512_call_avx512_legal512_prefer256(<8 x i64>* %tmp2, <8 x i64>* %tmp)
%tmp4 = load <8 x i64>, <8 x i64>* %tmp2, align 32		%tmp4 = load <8 x i64>, <8 x i64>* %tmp2, align 32
store <8 x i64> %tmp4, <8 x i64>* %arg, align 2		store <8 x i64> %tmp4, <8 x i64>* %arg, align 2
ret void		ret void
}		}

; This should promote		; This should promote
define internal fastcc void @callee_avx512_legal512_prefer256_call_avx512_legal512_prefer512(<8 x i64>* %arg, <8 x i64>* readonly %arg1) #0 {		define internal fastcc void @callee_avx512_legal512_prefer256_call_avx512_legal512_prefer512(<8 x i64>* %arg, <8 x i64>* readonly %arg1) #0 {
; CHECK-LABEL: define {{[^@]+}}@callee_avx512_legal512_prefer256_call_avx512_legal512_prefer512		; CHECK-LABEL: define {{[^@]+}}@callee_avx512_legal512_prefer256_call_avx512_legal512_prefer512
; CHECK-SAME: (<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[ARG:%.*]], <8 x i64>* noalias nocapture nofree nonnull readonly align 32 dereferenceable(64) [[ARG1:%.*]])		; CHECK-SAME: (<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[ARG:%.*]], <8 x i64> [[TMP0:%.*]])
; CHECK-NEXT: bb:		; CHECK-NEXT: bb:
; CHECK-NEXT: [[TMP:%.*]] = load <8 x i64>, <8 x i64>* [[ARG1]], align 32		; CHECK-NEXT: [[ARG1_PRIV:%.*]] = alloca <8 x i64>
		; CHECK-NEXT: store <8 x i64> [[TMP0]], <8 x i64>* [[ARG1_PRIV]]
		; CHECK-NEXT: [[TMP:%.*]] = load <8 x i64>, <8 x i64>* [[ARG1_PRIV]], align 32
; CHECK-NEXT: store <8 x i64> [[TMP]], <8 x i64>* [[ARG]], align 32		; CHECK-NEXT: store <8 x i64> [[TMP]], <8 x i64>* [[ARG]], align 32
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
bb:		bb:
%tmp = load <8 x i64>, <8 x i64>* %arg1		%tmp = load <8 x i64>, <8 x i64>* %arg1
store <8 x i64> %tmp, <8 x i64>* %arg		store <8 x i64> %tmp, <8 x i64>* %arg
ret void		ret void
}		}

define void @avx512_legal512_prefer256_call_avx512_legal512_prefer512(<8 x i64>* %arg) #1 {		define void @avx512_legal512_prefer256_call_avx512_legal512_prefer512(<8 x i64>* %arg) #1 {
; CHECK-LABEL: define {{[^@]+}}@avx512_legal512_prefer256_call_avx512_legal512_prefer512		; CHECK-LABEL: define {{[^@]+}}@avx512_legal512_prefer256_call_avx512_legal512_prefer512
; CHECK-SAME: (<8 x i64>* nocapture writeonly [[ARG:%.*]])		; CHECK-SAME: (<8 x i64>* nocapture writeonly [[ARG:%.*]])
; CHECK-NEXT: bb:		; CHECK-NEXT: bb:
; CHECK-NEXT: [[TMP:%.*]] = alloca <8 x i64>, align 32		; CHECK-NEXT: [[TMP:%.*]] = alloca <8 x i64>, align 32
; CHECK-NEXT: [[TMP2:%.*]] = alloca <8 x i64>, align 32		; CHECK-NEXT: [[TMP2:%.*]] = alloca <8 x i64>, align 32
; CHECK-NEXT: [[TMP3:%.*]] = bitcast <8 x i64>* [[TMP]] to i8*		; CHECK-NEXT: [[TMP3:%.*]] = bitcast <8 x i64>* [[TMP]] to i8*
; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* nonnull align 32 dereferenceable(64) [[TMP3]], i8 0, i64 32, i1 false)		; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* nonnull align 32 dereferenceable(64) [[TMP3]], i8 0, i64 32, i1 false)
; CHECK-NEXT: call fastcc void @callee_avx512_legal512_prefer256_call_avx512_legal512_prefer512(<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[TMP2]], <8 x i64>* noalias nocapture nofree nonnull readonly align 32 dereferenceable(64) [[TMP]])		; CHECK-NEXT: [[TMP0:%.*]] = load <8 x i64>, <8 x i64>* [[TMP]], align 1
		; CHECK-NEXT: call fastcc void @callee_avx512_legal512_prefer256_call_avx512_legal512_prefer512(<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[TMP2]], <8 x i64> [[TMP0]])
; CHECK-NEXT: [[TMP4:%.*]] = load <8 x i64>, <8 x i64>* [[TMP2]], align 32		; CHECK-NEXT: [[TMP4:%.*]] = load <8 x i64>, <8 x i64>* [[TMP2]], align 32
; CHECK-NEXT: store <8 x i64> [[TMP4]], <8 x i64>* [[ARG]], align 2		; CHECK-NEXT: store <8 x i64> [[TMP4]], <8 x i64>* [[ARG]], align 2
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
bb:		bb:
%tmp = alloca <8 x i64>, align 32		%tmp = alloca <8 x i64>, align 32
%tmp2 = alloca <8 x i64>, align 32		%tmp2 = alloca <8 x i64>, align 32
%tmp3 = bitcast <8 x i64>* %tmp to i8*		%tmp3 = bitcast <8 x i64>* %tmp to i8*
[80 lines not shown]	bb:
%tmp4 = load <8 x i64>, <8 x i64>* %tmp2, align 32		%tmp4 = load <8 x i64>, <8 x i64>* %tmp2, align 32
store <8 x i64> %tmp4, <8 x i64>* %arg, align 2		store <8 x i64> %tmp4, <8 x i64>* %arg, align 2
ret void		ret void
}		}

; This should promote		; This should promote
define internal fastcc void @callee_avx2_legal256_prefer256_call_avx2_legal512_prefer256(<8 x i64>* %arg, <8 x i64>* readonly %arg1) #3 {		define internal fastcc void @callee_avx2_legal256_prefer256_call_avx2_legal512_prefer256(<8 x i64>* %arg, <8 x i64>* readonly %arg1) #3 {
; CHECK-LABEL: define {{[^@]+}}@callee_avx2_legal256_prefer256_call_avx2_legal512_prefer256		; CHECK-LABEL: define {{[^@]+}}@callee_avx2_legal256_prefer256_call_avx2_legal512_prefer256
; CHECK-SAME: (<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[ARG:%.*]], <8 x i64>* noalias nocapture nofree nonnull readonly align 32 dereferenceable(64) [[ARG1:%.*]])		; CHECK-SAME: (<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[ARG:%.*]], <8 x i64> [[TMP0:%.*]])
; CHECK-NEXT: bb:		; CHECK-NEXT: bb:
; CHECK-NEXT: [[TMP:%.*]] = load <8 x i64>, <8 x i64>* [[ARG1]], align 32		; CHECK-NEXT: [[ARG1_PRIV:%.*]] = alloca <8 x i64>
		; CHECK-NEXT: store <8 x i64> [[TMP0]], <8 x i64>* [[ARG1_PRIV]]
		; CHECK-NEXT: [[TMP:%.*]] = load <8 x i64>, <8 x i64>* [[ARG1_PRIV]], align 32
; CHECK-NEXT: store <8 x i64> [[TMP]], <8 x i64>* [[ARG]], align 32		; CHECK-NEXT: store <8 x i64> [[TMP]], <8 x i64>* [[ARG]], align 32
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
bb:		bb:
%tmp = load <8 x i64>, <8 x i64>* %arg1		%tmp = load <8 x i64>, <8 x i64>* %arg1
store <8 x i64> %tmp, <8 x i64>* %arg		store <8 x i64> %tmp, <8 x i64>* %arg
ret void		ret void
}		}

define void @avx2_legal256_prefer256_call_avx2_legal512_prefer256(<8 x i64>* %arg) #4 {		define void @avx2_legal256_prefer256_call_avx2_legal512_prefer256(<8 x i64>* %arg) #4 {
; CHECK-LABEL: define {{[^@]+}}@avx2_legal256_prefer256_call_avx2_legal512_prefer256		; CHECK-LABEL: define {{[^@]+}}@avx2_legal256_prefer256_call_avx2_legal512_prefer256
; CHECK-SAME: (<8 x i64>* nocapture writeonly [[ARG:%.*]])		; CHECK-SAME: (<8 x i64>* nocapture writeonly [[ARG:%.*]])
; CHECK-NEXT: bb:		; CHECK-NEXT: bb:
; CHECK-NEXT: [[TMP:%.*]] = alloca <8 x i64>, align 32		; CHECK-NEXT: [[TMP:%.*]] = alloca <8 x i64>, align 32
; CHECK-NEXT: [[TMP2:%.*]] = alloca <8 x i64>, align 32		; CHECK-NEXT: [[TMP2:%.*]] = alloca <8 x i64>, align 32
; CHECK-NEXT: [[TMP3:%.*]] = bitcast <8 x i64>* [[TMP]] to i8*		; CHECK-NEXT: [[TMP3:%.*]] = bitcast <8 x i64>* [[TMP]] to i8*
; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* nonnull align 32 dereferenceable(64) [[TMP3]], i8 0, i64 32, i1 false)		; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* nonnull align 32 dereferenceable(64) [[TMP3]], i8 0, i64 32, i1 false)
; CHECK-NEXT: call fastcc void @callee_avx2_legal256_prefer256_call_avx2_legal512_prefer256(<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[TMP2]], <8 x i64>* noalias nocapture nofree nonnull readonly align 32 dereferenceable(64) [[TMP]])		; CHECK-NEXT: [[TMP0:%.*]] = load <8 x i64>, <8 x i64>* [[TMP]], align 1
		; CHECK-NEXT: call fastcc void @callee_avx2_legal256_prefer256_call_avx2_legal512_prefer256(<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[TMP2]], <8 x i64> [[TMP0]])
; CHECK-NEXT: [[TMP4:%.*]] = load <8 x i64>, <8 x i64>* [[TMP2]], align 32		; CHECK-NEXT: [[TMP4:%.*]] = load <8 x i64>, <8 x i64>* [[TMP2]], align 32
; CHECK-NEXT: store <8 x i64> [[TMP4]], <8 x i64>* [[ARG]], align 2		; CHECK-NEXT: store <8 x i64> [[TMP4]], <8 x i64>* [[ARG]], align 2
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
bb:		bb:
%tmp = alloca <8 x i64>, align 32		%tmp = alloca <8 x i64>, align 32
%tmp2 = alloca <8 x i64>, align 32		%tmp2 = alloca <8 x i64>, align 32
%tmp3 = bitcast <8 x i64>* %tmp to i8*		%tmp3 = bitcast <8 x i64>* %tmp to i8*
call void @llvm.memset.p0i8.i64(i8* align 32 %tmp3, i8 0, i64 32, i1 false)		call void @llvm.memset.p0i8.i64(i8* align 32 %tmp3, i8 0, i64 32, i1 false)
call fastcc void @callee_avx2_legal256_prefer256_call_avx2_legal512_prefer256(<8 x i64>* %tmp2, <8 x i64>* %tmp)		call fastcc void @callee_avx2_legal256_prefer256_call_avx2_legal512_prefer256(<8 x i64>* %tmp2, <8 x i64>* %tmp)
%tmp4 = load <8 x i64>, <8 x i64>* %tmp2, align 32		%tmp4 = load <8 x i64>, <8 x i64>* %tmp2, align 32
store <8 x i64> %tmp4, <8 x i64>* %arg, align 2		store <8 x i64> %tmp4, <8 x i64>* %arg, align 2
ret void		ret void
}		}

; This should promote		; This should promote
define internal fastcc void @callee_avx2_legal512_prefer256_call_avx2_legal256_prefer256(<8 x i64>* %arg, <8 x i64>* readonly %arg1) #4 {		define internal fastcc void @callee_avx2_legal512_prefer256_call_avx2_legal256_prefer256(<8 x i64>* %arg, <8 x i64>* readonly %arg1) #4 {
; CHECK-LABEL: define {{[^@]+}}@callee_avx2_legal512_prefer256_call_avx2_legal256_prefer256		; CHECK-LABEL: define {{[^@]+}}@callee_avx2_legal512_prefer256_call_avx2_legal256_prefer256
; CHECK-SAME: (<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[ARG:%.*]], <8 x i64>* noalias nocapture nofree nonnull readonly align 32 dereferenceable(64) [[ARG1:%.*]])		; CHECK-SAME: (<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[ARG:%.*]], <8 x i64> [[TMP0:%.*]])
; CHECK-NEXT: bb:		; CHECK-NEXT: bb:
; CHECK-NEXT: [[TMP:%.*]] = load <8 x i64>, <8 x i64>* [[ARG1]], align 32		; CHECK-NEXT: [[ARG1_PRIV:%.*]] = alloca <8 x i64>
		; CHECK-NEXT: store <8 x i64> [[TMP0]], <8 x i64>* [[ARG1_PRIV]]
		; CHECK-NEXT: [[TMP:%.*]] = load <8 x i64>, <8 x i64>* [[ARG1_PRIV]], align 32
; CHECK-NEXT: store <8 x i64> [[TMP]], <8 x i64>* [[ARG]], align 32		; CHECK-NEXT: store <8 x i64> [[TMP]], <8 x i64>* [[ARG]], align 32
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
bb:		bb:
%tmp = load <8 x i64>, <8 x i64>* %arg1		%tmp = load <8 x i64>, <8 x i64>* %arg1
store <8 x i64> %tmp, <8 x i64>* %arg		store <8 x i64> %tmp, <8 x i64>* %arg
ret void		ret void
}		}

define void @avx2_legal512_prefer256_call_avx2_legal256_prefer256(<8 x i64>* %arg) #3 {		define void @avx2_legal512_prefer256_call_avx2_legal256_prefer256(<8 x i64>* %arg) #3 {
; CHECK-LABEL: define {{[^@]+}}@avx2_legal512_prefer256_call_avx2_legal256_prefer256		; CHECK-LABEL: define {{[^@]+}}@avx2_legal512_prefer256_call_avx2_legal256_prefer256
; CHECK-SAME: (<8 x i64>* nocapture writeonly [[ARG:%.*]])		; CHECK-SAME: (<8 x i64>* nocapture writeonly [[ARG:%.*]])
; CHECK-NEXT: bb:		; CHECK-NEXT: bb:
; CHECK-NEXT: [[TMP:%.*]] = alloca <8 x i64>, align 32		; CHECK-NEXT: [[TMP:%.*]] = alloca <8 x i64>, align 32
; CHECK-NEXT: [[TMP2:%.*]] = alloca <8 x i64>, align 32		; CHECK-NEXT: [[TMP2:%.*]] = alloca <8 x i64>, align 32
; CHECK-NEXT: [[TMP3:%.*]] = bitcast <8 x i64>* [[TMP]] to i8*		; CHECK-NEXT: [[TMP3:%.*]] = bitcast <8 x i64>* [[TMP]] to i8*
; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* nonnull align 32 dereferenceable(64) [[TMP3]], i8 0, i64 32, i1 false)		; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* nonnull align 32 dereferenceable(64) [[TMP3]], i8 0, i64 32, i1 false)
; CHECK-NEXT: call fastcc void @callee_avx2_legal512_prefer256_call_avx2_legal256_prefer256(<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[TMP2]], <8 x i64>* noalias nocapture nofree nonnull readonly align 32 dereferenceable(64) [[TMP]])		; CHECK-NEXT: [[TMP0:%.*]] = load <8 x i64>, <8 x i64>* [[TMP]], align 1
		; CHECK-NEXT: call fastcc void @callee_avx2_legal512_prefer256_call_avx2_legal256_prefer256(<8 x i64>* noalias nocapture nofree nonnull writeonly align 32 dereferenceable(64) [[TMP2]], <8 x i64> [[TMP0]])
; CHECK-NEXT: [[TMP4:%.*]] = load <8 x i64>, <8 x i64>* [[TMP2]], align 32		; CHECK-NEXT: [[TMP4:%.*]] = load <8 x i64>, <8 x i64>* [[TMP2]], align 32
; CHECK-NEXT: store <8 x i64> [[TMP4]], <8 x i64>* [[ARG]], align 2		; CHECK-NEXT: store <8 x i64> [[TMP4]], <8 x i64>* [[ARG]], align 2
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
bb:		bb:
%tmp = alloca <8 x i64>, align 32		%tmp = alloca <8 x i64>, align 32
%tmp2 = alloca <8 x i64>, align 32		%tmp2 = alloca <8 x i64>, align 32
%tmp3 = bitcast <8 x i64>* %tmp to i8*		%tmp3 = bitcast <8 x i64>* %tmp to i8*

llvm/test/Transforms/Attributor/ArgumentPromotion/alignment.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes
				; RUN: opt -S -passes='attributor' -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 < %s | FileCheck %s

				define void @f() {
				; CHECK-LABEL: define {{[^@]+}}@f()
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[A:%.*]] = alloca i32, align 1
				; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[A]], align 1
				; CHECK-NEXT: call void @g(i32 [[TMP0]])
				; CHECK-NEXT: ret void
				;
				entry:
				%a = alloca i32, align 1
				call void @g(i32* %a)
				ret void
				}

				define internal void @g(i32* %a) {
				; CHECK-LABEL: define {{[^@]+}}@g
				; CHECK-SAME: (i32 [[TMP0:%.*]])
				; CHECK-NEXT: [[A_PRIV:%.*]] = alloca i32
				; CHECK-NEXT: store i32 [[TMP0]], i32* [[A_PRIV]]
				; CHECK-NEXT: [[AA:%.*]] = load i32, i32* [[A_PRIV]], align 1
				; CHECK-NEXT: call void @z(i32 [[AA]])
				; CHECK-NEXT: ret void
				;
				%aa = load i32, i32* %a, align 1
				call void @z(i32 %aa)
				ret void
				}

				declare void @z(i32)

llvm/test/Transforms/Attributor/ArgumentPromotion/attrs.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes
	; RUN: opt -S -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=3 < %s | FileCheck %s			; RUN: opt -S -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=3 < %s | FileCheck %s

	%struct.ss = type { i32, i64 }			%struct.ss = type { i32, i64 }

	; Don't drop 'byval' on %X here.			; Don't drop 'byval' on %X here.
	define internal void @f(%struct.ss* byval %b, i32* byval %X, i32 %i) nounwind {			define internal void @f(%struct.ss* byval %b, i32* byval %X, i32 %i) nounwind {
	; CHECK-LABEL: define {{[^@]+}}@f			; CHECK-LABEL: define {{[^@]+}}@f
	; CHECK-SAME: (%struct.ss* noalias nocapture nofree nonnull byval align 8 dereferenceable(12) [[B:%.*]], i32* noalias nocapture nofree nonnull writeonly byval dereferenceable(4) [[X:%.*]], i32 [[I:%.*]])			; CHECK-SAME: (i32 [[TMP0:%.*]], i64 [[TMP1:%.*]], i32 [[TMP2:%.*]], i32 [[I:%.*]])
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[TMP:%.*]] = getelementptr [[STRUCT_SS:%.*]], %struct.ss* [[B]], i32 0, i32 0			; CHECK-NEXT: [[X_PRIV:%.*]] = alloca i32
				; CHECK-NEXT: store i32 [[TMP2]], i32* [[X_PRIV]]
				; CHECK-NEXT: [[B_PRIV:%.*]] = alloca [[STRUCT_SS:%.*]]
				; CHECK-NEXT: [[B_PRIV_CAST:%.*]] = bitcast %struct.ss* [[B_PRIV]] to i32*
				; CHECK-NEXT: store i32 [[TMP0]], i32* [[B_PRIV_CAST]]
				; CHECK-NEXT: [[B_PRIV_0_1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[B_PRIV]], i32 0, i32 1
				; CHECK-NEXT: store i64 [[TMP1]], i64* [[B_PRIV_0_1]]
				; CHECK-NEXT: [[TMP:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[B_PRIV]], i32 0, i32 0
	; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[TMP]], align 8			; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[TMP]], align 8
	; CHECK-NEXT: [[TMP2:%.*]] = add i32 [[TMP1]], 1			; CHECK-NEXT: [[TMP2:%.*]] = add i32 [[TMP1]], 1
	; CHECK-NEXT: store i32 [[TMP2]], i32* [[TMP]], align 8			; CHECK-NEXT: store i32 [[TMP2]], i32* [[TMP]], align 8
	; CHECK-NEXT: store i32 0, i32* [[X]]			; CHECK-NEXT: store i32 0, i32* [[X_PRIV]]
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:

	%tmp = getelementptr %struct.ss, %struct.ss* %b, i32 0, i32 0			%tmp = getelementptr %struct.ss, %struct.ss* %b, i32 0, i32 0
	%tmp1 = load i32, i32* %tmp, align 4			%tmp1 = load i32, i32* %tmp, align 4
	%tmp2 = add i32 %tmp1, 1			%tmp2 = add i32 %tmp1, 1
	store i32 %tmp2, i32* %tmp, align 4			store i32 %tmp2, i32* %tmp, align 4

	store i32 %i, i32* %X			store i32 %i, i32* %X
	ret void			ret void
	}			}

	; Also make sure we don't drop the call zeroext attribute.			; Also make sure we don't drop the call zeroext attribute.
	define i32 @test(i32* %X) {			define i32 @test(i32* %X) {
	; CHECK-LABEL: define {{[^@]+}}@test			; CHECK-LABEL: define {{[^@]+}}@test
	; CHECK-SAME: (i32* nocapture nofree readonly [[X:%.*]])			; CHECK-SAME: (i32* nocapture nofree readonly [[X:%.*]])
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[S:%.*]] = alloca [[STRUCT_SS:%.*]]			; CHECK-NEXT: [[S:%.*]] = alloca [[STRUCT_SS:%.*]]
	; CHECK-NEXT: [[TMP1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 0			; CHECK-NEXT: [[TMP1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 0
	; CHECK-NEXT: store i32 1, i32* [[TMP1]], align 8			; CHECK-NEXT: store i32 1, i32* [[TMP1]], align 8
	; CHECK-NEXT: [[TMP4:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 1			; CHECK-NEXT: [[TMP4:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 1
	; CHECK-NEXT: store i64 2, i64* [[TMP4]], align 4			; CHECK-NEXT: store i64 2, i64* [[TMP4]], align 4
	; CHECK-NEXT: call void @f(%struct.ss* noalias nocapture nofree nonnull readonly byval align 8 dereferenceable(12) [[S]], i32* nocapture nofree readonly byval [[X]], i32 zeroext 0)			; CHECK-NEXT: [[S_CAST:%.*]] = bitcast %struct.ss* [[S]] to i32*
				; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[S_CAST]], align 1
				; CHECK-NEXT: [[S_0_1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 1
				; CHECK-NEXT: [[TMP1:%.*]] = load i64, i64* [[S_0_1]], align 1
				; CHECK-NEXT: [[TMP2:%.*]] = load i32, i32* [[X]], align 1
				; CHECK-NEXT: call void @f(i32 [[TMP0]], i64 [[TMP1]], i32 [[TMP2]], i32 zeroext 0)
	; CHECK-NEXT: ret i32 0			; CHECK-NEXT: ret i32 0
	;			;
	entry:			entry:
	%S = alloca %struct.ss			%S = alloca %struct.ss
	%tmp1 = getelementptr %struct.ss, %struct.ss* %S, i32 0, i32 0			%tmp1 = getelementptr %struct.ss, %struct.ss* %S, i32 0, i32 0
	store i32 1, i32* %tmp1, align 8			store i32 1, i32* %tmp1, align 8
	%tmp4 = getelementptr %struct.ss, %struct.ss* %S, i32 0, i32 1			%tmp4 = getelementptr %struct.ss, %struct.ss* %S, i32 0, i32 1
	store i64 2, i64* %tmp4, align 4			store i64 2, i64* %tmp4, align 4

	call void @f( %struct.ss* byval %S, i32* byval %X, i32 zeroext 0)			call void @f( %struct.ss* byval %S, i32* byval %X, i32 zeroext 0)

	ret i32 0			ret i32 0
	}			}

llvm/test/Transforms/Attributor/ArgumentPromotion/basictest.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes
	; RUN: opt -S -passes='attributor' -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=6 < %s | FileCheck %s			; RUN: opt -S -passes='attributor' -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=6 < %s | FileCheck %s
	target datalayout = "E-p:64:64:64-a0:0:8-f32:32:32-f64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-v64:64:64-v128:128:128"			target datalayout = "E-p:64:64:64-a0:0:8-f32:32:32-f64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-v64:64:64-v128:128:128"

	define internal i32 @test(i32* %X, i32* %Y) {			define internal i32 @test(i32* %X, i32* %Y) {
	; CHECK-LABEL: define {{[^@]+}}@test			; CHECK-LABEL: define {{[^@]+}}@test
	; CHECK-SAME: (i32* noalias nocapture nofree nonnull readonly align 4 dereferenceable(4) [[X:%.*]], i32* noalias nocapture nofree nonnull readonly align 4 dereferenceable(4) [[Y:%.*]])			; CHECK-SAME: (i32 [[TMP0:%.*]], i32 [[TMP1:%.*]])
	; CHECK-NEXT: [[A:%.*]] = load i32, i32* [[X]], align 4			; CHECK-NEXT: [[Y_PRIV:%.*]] = alloca i32
	; CHECK-NEXT: [[B:%.*]] = load i32, i32* [[Y]], align 4			; CHECK-NEXT: store i32 [[TMP1]], i32* [[Y_PRIV]]
				; CHECK-NEXT: [[X_PRIV:%.*]] = alloca i32
				; CHECK-NEXT: store i32 [[TMP0]], i32* [[X_PRIV]]
				; CHECK-NEXT: [[A:%.*]] = load i32, i32* [[X_PRIV]], align 4
				; CHECK-NEXT: [[B:%.*]] = load i32, i32* [[Y_PRIV]], align 4
	; CHECK-NEXT: [[C:%.*]] = add i32 [[A]], [[B]]			; CHECK-NEXT: [[C:%.*]] = add i32 [[A]], [[B]]
	; CHECK-NEXT: ret i32 [[C]]			; CHECK-NEXT: ret i32 [[C]]
	;			;
	%A = load i32, i32* %X			%A = load i32, i32* %X
	%B = load i32, i32* %Y			%B = load i32, i32* %Y
	%C = add i32 %A, %B			%C = add i32 %A, %B
	ret i32 %C			ret i32 %C
	}			}

	define internal i32 @caller(i32* %B) {			define internal i32 @caller(i32* %B) {
	; CHECK-LABEL: define {{[^@]+}}@caller			; CHECK-LABEL: define {{[^@]+}}@caller
	; CHECK-SAME: (i32* noalias nocapture nofree nonnull readonly align 4 dereferenceable(4) [[B:%.*]])			; CHECK-SAME: (i32 [[TMP0:%.*]])
				; CHECK-NEXT: [[B_PRIV:%.*]] = alloca i32
				; CHECK-NEXT: store i32 [[TMP0]], i32* [[B_PRIV]]
	; CHECK-NEXT: [[A:%.*]] = alloca i32			; CHECK-NEXT: [[A:%.*]] = alloca i32
	; CHECK-NEXT: store i32 1, i32* [[A]], align 4			; CHECK-NEXT: store i32 1, i32* [[A]], align 4
	; CHECK-NEXT: [[C:%.*]] = call i32 @test(i32* noalias nocapture nofree nonnull readonly align 4 dereferenceable(4) [[A]], i32* noalias nocapture nofree nonnull readonly align 4 dereferenceable(4) [[B]])			; CHECK-NEXT: [[TMP2:%.*]] = load i32, i32* [[A]], align 1
				; CHECK-NEXT: [[TMP3:%.*]] = load i32, i32* [[B_PRIV]], align 1
				; CHECK-NEXT: [[C:%.*]] = call i32 @test(i32 [[TMP2]], i32 [[TMP3]])
	; CHECK-NEXT: ret i32 [[C]]			; CHECK-NEXT: ret i32 [[C]]
	;			;
	%A = alloca i32			%A = alloca i32
	store i32 1, i32* %A			store i32 1, i32* %A
	%C = call i32 @test(i32* %A, i32* %B)			%C = call i32 @test(i32* %A, i32* %B)
	ret i32 %C			ret i32 %C
	}			}

	define i32 @callercaller() {			define i32 @callercaller() {
	; CHECK-LABEL: define {{[^@]+}}@callercaller()			; CHECK-LABEL: define {{[^@]+}}@callercaller()
	; CHECK-NEXT: [[B:%.*]] = alloca i32			; CHECK-NEXT: [[B:%.*]] = alloca i32
	; CHECK-NEXT: store i32 2, i32* [[B]], align 4			; CHECK-NEXT: store i32 2, i32* [[B]], align 4
	; CHECK-NEXT: [[X:%.*]] = call i32 @caller(i32* noalias nocapture nofree nonnull readonly align 4 dereferenceable(4) [[B]])			; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[B]], align 1
				; CHECK-NEXT: [[X:%.*]] = call i32 @caller(i32 [[TMP1]])
	; CHECK-NEXT: ret i32 [[X]]			; CHECK-NEXT: ret i32 [[X]]
	;			;
	%B = alloca i32			%B = alloca i32
	store i32 2, i32* %B			store i32 2, i32* %B
	%X = call i32 @caller(i32* %B)			%X = call i32 @caller(i32* %B)
	ret i32 %X			ret i32 %X
	}			}

llvm/test/Transforms/Attributor/ArgumentPromotion/byval-2.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes
	; RUN: opt -S -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=3 < %s | FileCheck %s			; RUN: opt -S -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=3 < %s | FileCheck %s

	%struct.ss = type { i32, i64 }			%struct.ss = type { i32, i64 }

	define internal void @f(%struct.ss* byval %b, i32* byval %X) nounwind {			define internal void @f(%struct.ss* byval %b, i32* byval %X) nounwind {
	; CHECK-LABEL: define {{[^@]+}}@f			; CHECK-LABEL: define {{[^@]+}}@f
	; CHECK-SAME: (%struct.ss* noalias nocapture nofree nonnull byval align 8 dereferenceable(12) [[B:%.*]], i32* noalias nocapture nofree nonnull writeonly byval dereferenceable(4) [[X:%.*]])			; CHECK-SAME: (i32 [[TMP0:%.*]], i64 [[TMP1:%.*]], i32 [[TMP2:%.*]])
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[TMP:%.*]] = getelementptr [[STRUCT_SS:%.*]], %struct.ss* [[B]], i32 0, i32 0			; CHECK-NEXT: [[X_PRIV:%.*]] = alloca i32
				; CHECK-NEXT: store i32 [[TMP2]], i32* [[X_PRIV]]
				; CHECK-NEXT: [[B_PRIV:%.*]] = alloca [[STRUCT_SS:%.*]]
				; CHECK-NEXT: [[B_PRIV_CAST:%.*]] = bitcast %struct.ss* [[B_PRIV]] to i32*
				; CHECK-NEXT: store i32 [[TMP0]], i32* [[B_PRIV_CAST]]
				; CHECK-NEXT: [[B_PRIV_0_1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[B_PRIV]], i32 0, i32 1
				; CHECK-NEXT: store i64 [[TMP1]], i64* [[B_PRIV_0_1]]
				; CHECK-NEXT: [[TMP:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[B_PRIV]], i32 0, i32 0
	; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[TMP]], align 8			; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[TMP]], align 8
	; CHECK-NEXT: [[TMP2:%.*]] = add i32 [[TMP1]], 1			; CHECK-NEXT: [[TMP2:%.*]] = add i32 [[TMP1]], 1
	; CHECK-NEXT: store i32 [[TMP2]], i32* [[TMP]], align 8			; CHECK-NEXT: store i32 [[TMP2]], i32* [[TMP]], align 8
	; CHECK-NEXT: store i32 0, i32* [[X]]			; CHECK-NEXT: store i32 0, i32* [[X_PRIV]]
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	%tmp = getelementptr %struct.ss, %struct.ss* %b, i32 0, i32 0			%tmp = getelementptr %struct.ss, %struct.ss* %b, i32 0, i32 0
	%tmp1 = load i32, i32* %tmp, align 4			%tmp1 = load i32, i32* %tmp, align 4
	%tmp2 = add i32 %tmp1, 1			%tmp2 = add i32 %tmp1, 1
	store i32 %tmp2, i32* %tmp, align 4			store i32 %tmp2, i32* %tmp, align 4

	store i32 0, i32* %X			store i32 0, i32* %X
	ret void			ret void
	}			}

	define i32 @test(i32* %X) {			define i32 @test(i32* %X) {
	; CHECK-LABEL: define {{[^@]+}}@test			; CHECK-LABEL: define {{[^@]+}}@test
	; CHECK-SAME: (i32* nocapture nofree readonly [[X:%.*]])			; CHECK-SAME: (i32* nocapture nofree readonly [[X:%.*]])
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[S:%.*]] = alloca [[STRUCT_SS:%.*]]			; CHECK-NEXT: [[S:%.*]] = alloca [[STRUCT_SS:%.*]]
	; CHECK-NEXT: [[TMP1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 0			; CHECK-NEXT: [[TMP1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 0
	; CHECK-NEXT: store i32 1, i32* [[TMP1]], align 8			; CHECK-NEXT: store i32 1, i32* [[TMP1]], align 8
	; CHECK-NEXT: [[TMP4:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 1			; CHECK-NEXT: [[TMP4:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 1
	; CHECK-NEXT: store i64 2, i64* [[TMP4]], align 4			; CHECK-NEXT: store i64 2, i64* [[TMP4]], align 4
	; CHECK-NEXT: call void @f(%struct.ss* noalias nocapture nofree nonnull readonly byval align 8 dereferenceable(12) [[S]], i32* nocapture nofree readonly byval [[X]])			; CHECK-NEXT: [[S_CAST:%.*]] = bitcast %struct.ss* [[S]] to i32*
				; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[S_CAST]], align 1
				; CHECK-NEXT: [[S_0_1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 1
				; CHECK-NEXT: [[TMP1:%.*]] = load i64, i64* [[S_0_1]], align 1
				; CHECK-NEXT: [[TMP2:%.*]] = load i32, i32* [[X]], align 1
				; CHECK-NEXT: call void @f(i32 [[TMP0]], i64 [[TMP1]], i32 [[TMP2]])
	; CHECK-NEXT: ret i32 0			; CHECK-NEXT: ret i32 0
	;			;
	entry:			entry:
	%S = alloca %struct.ss			%S = alloca %struct.ss
	%tmp1 = getelementptr %struct.ss, %struct.ss* %S, i32 0, i32 0			%tmp1 = getelementptr %struct.ss, %struct.ss* %S, i32 0, i32 0
	store i32 1, i32* %tmp1, align 8			store i32 1, i32* %tmp1, align 8
	%tmp4 = getelementptr %struct.ss, %struct.ss* %S, i32 0, i32 1			%tmp4 = getelementptr %struct.ss, %struct.ss* %S, i32 0, i32 1
	store i64 2, i64* %tmp4, align 4			store i64 2, i64* %tmp4, align 4
	call void @f( %struct.ss* byval %S, i32* byval %X)			call void @f( %struct.ss* byval %S, i32* byval %X)
	ret i32 0			ret i32 0
	}			}

llvm/test/Transforms/Attributor/ArgumentPromotion/byval.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes
	; RUN: opt -S -passes='attributor' -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 < %s | FileCheck %s			; RUN: opt -S -passes='attributor' -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 < %s | FileCheck %s

	target datalayout = "E-p:64:64:64-a0:0:8-f32:32:32-f64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-v64:64:64-v128:128:128"			target datalayout = "E-p:64:64:64-a0:0:8-f32:32:32-f64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-v64:64:64-v128:128:128"

	%struct.ss = type { i32, i64 }			%struct.ss = type { i32, i64 }

	define internal void @f(%struct.ss* byval %b) nounwind {			define internal void @f(%struct.ss* byval %b) nounwind {
	; CHECK-LABEL: define {{[^@]+}}@f			; CHECK-LABEL: define {{[^@]+}}@f
	; CHECK-SAME: (%struct.ss* noalias nocapture nofree nonnull byval align 8 dereferenceable(12) [[B:%.*]])			; CHECK-SAME: (i32 [[TMP0:%.*]], i64 [[TMP1:%.*]])
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[TMP:%.*]] = getelementptr [[STRUCT_SS:%.*]], %struct.ss* [[B]], i32 0, i32 0			; CHECK-NEXT: [[B_PRIV:%.*]] = alloca [[STRUCT_SS:%.*]]
				; CHECK-NEXT: [[B_PRIV_CAST:%.*]] = bitcast %struct.ss* [[B_PRIV]] to i32*
				; CHECK-NEXT: store i32 [[TMP0]], i32* [[B_PRIV_CAST]]
				; CHECK-NEXT: [[B_PRIV_0_1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[B_PRIV]], i32 0, i32 1
				; CHECK-NEXT: store i64 [[TMP1]], i64* [[B_PRIV_0_1]]
				; CHECK-NEXT: [[TMP:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[B_PRIV]], i32 0, i32 0
	; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[TMP]], align 8			; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[TMP]], align 8
	; CHECK-NEXT: [[TMP2:%.*]] = add i32 [[TMP1]], 1			; CHECK-NEXT: [[TMP2:%.*]] = add i32 [[TMP1]], 1
	; CHECK-NEXT: store i32 [[TMP2]], i32* [[TMP]], align 8			; CHECK-NEXT: store i32 [[TMP2]], i32* [[TMP]], align 8
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	%tmp = getelementptr %struct.ss, %struct.ss* %b, i32 0, i32 0			%tmp = getelementptr %struct.ss, %struct.ss* %b, i32 0, i32 0
	%tmp1 = load i32, i32* %tmp, align 4			%tmp1 = load i32, i32* %tmp, align 4
	%tmp2 = add i32 %tmp1, 1			%tmp2 = add i32 %tmp1, 1
	store i32 %tmp2, i32* %tmp, align 4			store i32 %tmp2, i32* %tmp, align 4
	ret void			ret void
	}			}


	define internal void @g(%struct.ss* byval align 32 %b) nounwind {			define internal void @g(%struct.ss* byval align 32 %b) nounwind {
	; CHECK-LABEL: define {{[^@]+}}@g			; CHECK-LABEL: define {{[^@]+}}@g
	; CHECK-SAME: (%struct.ss* noalias nocapture nofree nonnull byval align 32 dereferenceable(12) [[B:%.*]])			; CHECK-SAME: (i32 [[TMP0:%.*]], i64 [[TMP1:%.*]])
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[TMP:%.*]] = getelementptr [[STRUCT_SS:%.*]], %struct.ss* [[B]], i32 0, i32 0			; CHECK-NEXT: [[B_PRIV:%.*]] = alloca [[STRUCT_SS:%.*]]
				; CHECK-NEXT: [[B_PRIV_CAST:%.*]] = bitcast %struct.ss* [[B_PRIV]] to i32*
				; CHECK-NEXT: store i32 [[TMP0]], i32* [[B_PRIV_CAST]]
				; CHECK-NEXT: [[B_PRIV_0_1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[B_PRIV]], i32 0, i32 1
				; CHECK-NEXT: store i64 [[TMP1]], i64* [[B_PRIV_0_1]]
				; CHECK-NEXT: [[TMP:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[B_PRIV]], i32 0, i32 0
	; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[TMP]], align 32			; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[TMP]], align 32
	; CHECK-NEXT: [[TMP2:%.*]] = add i32 [[TMP1]], 1			; CHECK-NEXT: [[TMP2:%.*]] = add i32 [[TMP1]], 1
	; CHECK-NEXT: store i32 [[TMP2]], i32* [[TMP]], align 32			; CHECK-NEXT: store i32 [[TMP2]], i32* [[TMP]], align 32
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	%tmp = getelementptr %struct.ss, %struct.ss* %b, i32 0, i32 0			%tmp = getelementptr %struct.ss, %struct.ss* %b, i32 0, i32 0
	%tmp1 = load i32, i32* %tmp, align 4			%tmp1 = load i32, i32* %tmp, align 4
	%tmp2 = add i32 %tmp1, 1			%tmp2 = add i32 %tmp1, 1
	store i32 %tmp2, i32* %tmp, align 4			store i32 %tmp2, i32* %tmp, align 4
	ret void			ret void
	}			}


	define i32 @main() nounwind {			define i32 @main() nounwind {
	; CHECK-LABEL: define {{[^@]+}}@main()			; CHECK-LABEL: define {{[^@]+}}@main()
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[S:%.*]] = alloca [[STRUCT_SS:%.*]]			; CHECK-NEXT: [[S:%.*]] = alloca [[STRUCT_SS:%.*]]
	; CHECK-NEXT: [[TMP1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 0			; CHECK-NEXT: [[TMP1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 0
	; CHECK-NEXT: store i32 1, i32* [[TMP1]], align 8			; CHECK-NEXT: store i32 1, i32* [[TMP1]], align 8
	; CHECK-NEXT: [[TMP4:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 1			; CHECK-NEXT: [[TMP4:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 1
	; CHECK-NEXT: store i64 2, i64* [[TMP4]], align 4			; CHECK-NEXT: store i64 2, i64* [[TMP4]], align 4
	; CHECK-NEXT: call void @f(%struct.ss* noalias nocapture nofree nonnull readonly byval align 8 dereferenceable(12) [[S]])			; CHECK-NEXT: [[S_CAST:%.*]] = bitcast %struct.ss* [[S]] to i32*
	; CHECK-NEXT: call void @g(%struct.ss* noalias nocapture nofree nonnull readonly byval align 32 dereferenceable(12) [[S]])			; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[S_CAST]], align 1
				; CHECK-NEXT: [[S_0_1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 1
				; CHECK-NEXT: [[TMP1:%.*]] = load i64, i64* [[S_0_1]], align 1
				; CHECK-NEXT: call void @f(i32 [[TMP0]], i64 [[TMP1]])
				; CHECK-NEXT: [[S_CAST1:%.*]] = bitcast %struct.ss* [[S]] to i32*
				; CHECK-NEXT: [[TMP2:%.*]] = load i32, i32* [[S_CAST1]], align 1
				; CHECK-NEXT: [[S_0_12:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 1
				; CHECK-NEXT: [[TMP3:%.*]] = load i64, i64* [[S_0_12]], align 1
				; CHECK-NEXT: call void @g(i32 [[TMP2]], i64 [[TMP3]])
	; CHECK-NEXT: ret i32 0			; CHECK-NEXT: ret i32 0
	;			;
	entry:			entry:
	%S = alloca %struct.ss			%S = alloca %struct.ss
	%tmp1 = getelementptr %struct.ss, %struct.ss* %S, i32 0, i32 0			%tmp1 = getelementptr %struct.ss, %struct.ss* %S, i32 0, i32 0
	store i32 1, i32* %tmp1, align 8			store i32 1, i32* %tmp1, align 8
	%tmp4 = getelementptr %struct.ss, %struct.ss* %S, i32 0, i32 1			%tmp4 = getelementptr %struct.ss, %struct.ss* %S, i32 0, i32 1
	store i64 2, i64* %tmp4, align 4			store i64 2, i64* %tmp4, align 4
	call void @f(%struct.ss* byval %S) nounwind			call void @f(%struct.ss* byval %S) nounwind
	call void @g(%struct.ss* byval %S) nounwind			call void @g(%struct.ss* byval %S) nounwind
	ret i32 0			ret i32 0
	}			}

llvm/test/Transforms/Attributor/ArgumentPromotion/control-flow2.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes
	; RUN: opt -S -passes='attributor' -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=9 < %s | FileCheck %s			; RUN: opt -S -passes='attributor' -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=9 < %s | FileCheck %s

	target datalayout = "E-p:64:64:64-a0:0:8-f32:32:32-f64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-v64:64:64-v128:128:128"			target datalayout = "E-p:64:64:64-a0:0:8-f32:32:32-f64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-v64:64:64-v128:128:128"

	define internal i32 @callee(i1 %C, i32* %P) {			define internal i32 @callee(i1 %C, i32* %P) {
	; CHECK-LABEL: define {{[^@]+}}@callee			; CHECK-LABEL: define {{[^@]+}}@callee
	; CHECK-SAME: (i1 [[C:%.*]], i32* noalias nocapture nofree nonnull readonly align 4 dereferenceable(4) [[P:%.*]])			; CHECK-SAME: (i1 [[C:%.*]], i32 [[TMP0:%.*]])
				; CHECK-NEXT: [[P_PRIV:%.*]] = alloca i32
				; CHECK-NEXT: store i32 [[TMP0]], i32* [[P_PRIV]]
	; CHECK-NEXT: br label [[F:%.*]]			; CHECK-NEXT: br label [[F:%.*]]
	; CHECK: T:			; CHECK: T:
	; CHECK-NEXT: unreachable			; CHECK-NEXT: unreachable
	; CHECK: F:			; CHECK: F:
	; CHECK-NEXT: [[X:%.*]] = load i32, i32* [[P]], align 4			; CHECK-NEXT: [[X:%.*]] = load i32, i32* [[P_PRIV]], align 4
	; CHECK-NEXT: ret i32 [[X]]			; CHECK-NEXT: ret i32 [[X]]
	;			;
	br i1 %C, label %T, label %F			br i1 %C, label %T, label %F

	T: ; preds = %0			T: ; preds = %0
	ret i32 17			ret i32 17

	F: ; preds = %0			F: ; preds = %0
	%X = load i32, i32* %P ; <i32> [#uses=1]			%X = load i32, i32* %P ; <i32> [#uses=1]
	ret i32 %X			ret i32 %X
	}			}

	define i32 @foo() {			define i32 @foo() {
	; CHECK-LABEL: define {{[^@]+}}@foo()			; CHECK-LABEL: define {{[^@]+}}@foo()
	; CHECK-NEXT: [[A:%.*]] = alloca i32			; CHECK-NEXT: [[A:%.*]] = alloca i32
	; CHECK-NEXT: store i32 17, i32* [[A]], align 4			; CHECK-NEXT: store i32 17, i32* [[A]], align 4
	; CHECK-NEXT: [[X:%.*]] = call i32 @callee(i1 false, i32* noalias nocapture nofree nonnull readonly align 4 dereferenceable(4) [[A]])			; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[A]], align 1
				; CHECK-NEXT: [[X:%.*]] = call i32 @callee(i1 false, i32 [[TMP1]])
	; CHECK-NEXT: ret i32 [[X]]			; CHECK-NEXT: ret i32 [[X]]
	;			;
	%A = alloca i32 ; <i32*> [#uses=2]			%A = alloca i32 ; <i32*> [#uses=2]
	store i32 17, i32* %A			store i32 17, i32* %A
	%X = call i32 @callee( i1 false, i32* %A ) ; <i32> [#uses=1]			%X = call i32 @callee( i1 false, i32* %A ) ; <i32> [#uses=1]
	ret i32 %X			ret i32 %X
	}			}

llvm/test/Transforms/Attributor/ArgumentPromotion/fp80.ll

	@b = internal global %struct.s { double 3.14, i16 9439, i8 25, [5 x i8] undef }, align 16			@b = internal global %struct.s { double 3.14, i16 9439, i8 25, [5 x i8] undef }, align 16

	%struct.Foo = type { i32, i64 }			%struct.Foo = type { i32, i64 }
	@a = internal global %struct.Foo { i32 1, i64 2 }, align 8			@a = internal global %struct.Foo { i32 1, i64 2 }, align 8

	define void @run() {			define void @run() {
	; CHECK-LABEL: define {{[^@]+}}@run()			; CHECK-LABEL: define {{[^@]+}}@run()
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[TMP0:%.*]] = call i64 @CaptureAStruct(%struct.Foo* nofree nonnull readonly align 8 dereferenceable(16) @a)			; CHECK-NEXT: [[A_CAST:%.*]] = bitcast %struct.Foo* @a to i32*
				; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[A_CAST]], align 1
				; CHECK-NEXT: [[A_0_1:%.*]] = getelementptr [[STRUCT_FOO:%.*]], %struct.Foo* @a, i32 0, i32 1
				; CHECK-NEXT: [[TMP1:%.*]] = load i64, i64* [[A_0_1]], align 1
				; CHECK-NEXT: [[TMP2:%.*]] = call i64 @CaptureAStruct(i32 [[TMP0]], i64 [[TMP1]])
	; CHECK-NEXT: unreachable			; CHECK-NEXT: unreachable
	;			;
	entry:			entry:
	tail call i8 @UseLongDoubleUnsafely(%union.u* byval align 16 bitcast (%struct.s* @b to %union.u*))			tail call i8 @UseLongDoubleUnsafely(%union.u* byval align 16 bitcast (%struct.s* @b to %union.u*))
	tail call x86_fp80 @UseLongDoubleSafely(%union.u* byval align 16 bitcast (%struct.s* @b to %union.u*))			tail call x86_fp80 @UseLongDoubleSafely(%union.u* byval align 16 bitcast (%struct.s* @b to %union.u*))
	call i64 @AccessPaddingOfStruct(%struct.Foo* @a)			call i64 @AccessPaddingOfStruct(%struct.Foo* @a)
	call i64 @CaptureAStruct(%struct.Foo* @a)			call i64 @CaptureAStruct(%struct.Foo* @a)
	ret void			ret void
	[... 16 lines not shown ...]
	define internal i64 @AccessPaddingOfStruct(%struct.Foo* byval %a) {			define internal i64 @AccessPaddingOfStruct(%struct.Foo* byval %a) {
	%p = bitcast %struct.Foo* %a to i64*			%p = bitcast %struct.Foo* %a to i64*
	%v = load i64, i64* %p			%v = load i64, i64* %p
	ret i64 %v			ret i64 %v
	}			}

	define internal i64 @CaptureAStruct(%struct.Foo* byval %a) {			define internal i64 @CaptureAStruct(%struct.Foo* byval %a) {
	; CHECK-LABEL: define {{[^@]+}}@CaptureAStruct			; CHECK-LABEL: define {{[^@]+}}@CaptureAStruct
	; CHECK-SAME: (%struct.Foo* noalias nofree nonnull byval align 8 dereferenceable(16) [[A:%.*]])			; CHECK-SAME: (i32 [[TMP0:%.*]], i64 [[TMP1:%.*]])
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
				; CHECK-NEXT: [[A_PRIV:%.*]] = alloca [[STRUCT_FOO:%.*]]
				; CHECK-NEXT: [[A_PRIV_CAST:%.*]] = bitcast %struct.Foo* [[A_PRIV]] to i32*
				; CHECK-NEXT: store i32 [[TMP0]], i32* [[A_PRIV_CAST]]
				; CHECK-NEXT: [[A_PRIV_0_1:%.*]] = getelementptr [[STRUCT_FOO]], %struct.Foo* [[A_PRIV]], i32 0, i32 1
				; CHECK-NEXT: store i64 [[TMP1]], i64* [[A_PRIV_0_1]]
	; CHECK-NEXT: [[A_PTR:%.*]] = alloca %struct.Foo*			; CHECK-NEXT: [[A_PTR:%.*]] = alloca %struct.Foo*
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[PHI:%.*]] = phi %struct.Foo* [ null, [[ENTRY:%.*]] ], [ [[GEP:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[PHI:%.*]] = phi %struct.Foo* [ null, [[ENTRY:%.*]] ], [ [[GEP:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[TMP0:%.*]] = phi %struct.Foo* [ [[A]], [[ENTRY]] ], [ [[TMP0]], [[LOOP]] ]			; CHECK-NEXT: [[TMP2:%.*]] = phi %struct.Foo* [ [[A_PRIV]], [[ENTRY]] ], [ [[TMP2]], [[LOOP]] ]
	; CHECK-NEXT: store %struct.Foo* [[PHI]], %struct.Foo** [[A_PTR]], align 8			; CHECK-NEXT: store %struct.Foo* [[PHI]], %struct.Foo** [[A_PTR]], align 8
	; CHECK-NEXT: [[GEP]] = getelementptr [[STRUCT_FOO:%.*]], %struct.Foo* [[A]], i64 0			; CHECK-NEXT: [[GEP]] = getelementptr [[STRUCT_FOO]], %struct.Foo* [[A_PRIV]], i64 0
	; CHECK-NEXT: br label [[LOOP]]			; CHECK-NEXT: br label [[LOOP]]
	;			;
	entry:			entry:
	%a_ptr = alloca %struct.Foo*			%a_ptr = alloca %struct.Foo*
	br label %loop			br label %loop

	loop:			loop:
	%phi = phi %struct.Foo* [ null, %entry ], [ %gep, %loop ]			%phi = phi %struct.Foo* [ null, %entry ], [ %gep, %loop ]
	%0 = phi %struct.Foo* [ %a, %entry ], [ %0, %loop ]			%0 = phi %struct.Foo* [ %a, %entry ], [ %0, %loop ]
	store %struct.Foo* %phi, %struct.Foo** %a_ptr			store %struct.Foo* %phi, %struct.Foo** %a_ptr
	%gep = getelementptr %struct.Foo, %struct.Foo* %a, i64 0			%gep = getelementptr %struct.Foo, %struct.Foo* %a, i64 0
	br label %loop			br label %loop
	}			}

llvm/test/Transforms/Attributor/ArgumentPromotion/inalloca.ll

	[... 13 lines not shown ...]
	; ATTRIBUTOR-NEXT: [[F0:%.*]] = getelementptr [[STRUCT_SS:%.*]], %struct.ss* [[S]], i32 0, i32 0			; ATTRIBUTOR-NEXT: [[F0:%.*]] = getelementptr [[STRUCT_SS:%.*]], %struct.ss* [[S]], i32 0, i32 0
	; ATTRIBUTOR-NEXT: [[F1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 1			; ATTRIBUTOR-NEXT: [[F1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 1
	; ATTRIBUTOR-NEXT: [[A:%.*]] = load i32, i32* [[F0]], align 4			; ATTRIBUTOR-NEXT: [[A:%.*]] = load i32, i32* [[F0]], align 4
	; ATTRIBUTOR-NEXT: [[B:%.*]] = load i32, i32* [[F1]], align 4			; ATTRIBUTOR-NEXT: [[B:%.*]] = load i32, i32* [[F1]], align 4
	; ATTRIBUTOR-NEXT: [[R:%.*]] = add i32 [[A]], [[B]]			; ATTRIBUTOR-NEXT: [[R:%.*]] = add i32 [[A]], [[B]]
	; ATTRIBUTOR-NEXT: ret i32 [[R]]			; ATTRIBUTOR-NEXT: ret i32 [[R]]
	;			;
	; GLOBALOPT_ATTRIBUTOR-LABEL: define {{[^@]+}}@f			; GLOBALOPT_ATTRIBUTOR-LABEL: define {{[^@]+}}@f
	; GLOBALOPT_ATTRIBUTOR-SAME: (%struct.ss* noalias nocapture nofree nonnull readonly align 4 dereferenceable(8) [[S:%.*]]) unnamed_addr			; GLOBALOPT_ATTRIBUTOR-SAME: (i32 [[TMP0:%.*]], i32 [[TMP1:%.*]]) unnamed_addr
	; GLOBALOPT_ATTRIBUTOR-NEXT: entry:			; GLOBALOPT_ATTRIBUTOR-NEXT: entry:
	; GLOBALOPT_ATTRIBUTOR-NEXT: [[F0:%.*]] = getelementptr [[STRUCT_SS:%.*]], %struct.ss* [[S]], i32 0, i32 0			; GLOBALOPT_ATTRIBUTOR-NEXT: [[S_PRIV:%.*]] = alloca [[STRUCT_SS:%.*]]
	; GLOBALOPT_ATTRIBUTOR-NEXT: [[F1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 1			; GLOBALOPT_ATTRIBUTOR-NEXT: [[S_PRIV_CAST:%.*]] = bitcast %struct.ss* [[S_PRIV]] to i32*
				; GLOBALOPT_ATTRIBUTOR-NEXT: store i32 [[TMP0]], i32* [[S_PRIV_CAST]]
				; GLOBALOPT_ATTRIBUTOR-NEXT: [[S_PRIV_0_1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S_PRIV]], i32 0, i32 1
				; GLOBALOPT_ATTRIBUTOR-NEXT: store i32 [[TMP1]], i32* [[S_PRIV_0_1]]
				; GLOBALOPT_ATTRIBUTOR-NEXT: [[F0:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S_PRIV]], i32 0, i32 0
				; GLOBALOPT_ATTRIBUTOR-NEXT: [[F1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S_PRIV]], i32 0, i32 1
	; GLOBALOPT_ATTRIBUTOR-NEXT: [[A:%.*]] = load i32, i32* [[F0]], align 4			; GLOBALOPT_ATTRIBUTOR-NEXT: [[A:%.*]] = load i32, i32* [[F0]], align 4
	; GLOBALOPT_ATTRIBUTOR-NEXT: [[B:%.*]] = load i32, i32* [[F1]], align 4			; GLOBALOPT_ATTRIBUTOR-NEXT: [[B:%.*]] = load i32, i32* [[F1]], align 4
	; GLOBALOPT_ATTRIBUTOR-NEXT: [[R:%.*]] = add i32 [[A]], [[B]]			; GLOBALOPT_ATTRIBUTOR-NEXT: [[R:%.*]] = add i32 [[A]], [[B]]
	; GLOBALOPT_ATTRIBUTOR-NEXT: ret i32 [[R]]			; GLOBALOPT_ATTRIBUTOR-NEXT: ret i32 [[R]]
	;			;
	entry:			entry:
	%f0 = getelementptr %struct.ss, %struct.ss* %s, i32 0, i32 0			%f0 = getelementptr %struct.ss, %struct.ss* %s, i32 0, i32 0
	%f1 = getelementptr %struct.ss, %struct.ss* %s, i32 0, i32 1			%f1 = getelementptr %struct.ss, %struct.ss* %s, i32 0, i32 1
	[... 16 lines not shown ...]
	;			;
	; GLOBALOPT_ATTRIBUTOR-LABEL: define {{[^@]+}}@main() local_unnamed_addr			; GLOBALOPT_ATTRIBUTOR-LABEL: define {{[^@]+}}@main() local_unnamed_addr
	; GLOBALOPT_ATTRIBUTOR-NEXT: entry:			; GLOBALOPT_ATTRIBUTOR-NEXT: entry:
	; GLOBALOPT_ATTRIBUTOR-NEXT: [[S:%.*]] = alloca inalloca [[STRUCT_SS:%.*]]			; GLOBALOPT_ATTRIBUTOR-NEXT: [[S:%.*]] = alloca inalloca [[STRUCT_SS:%.*]]
	; GLOBALOPT_ATTRIBUTOR-NEXT: [[F0:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 0			; GLOBALOPT_ATTRIBUTOR-NEXT: [[F0:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 0
	; GLOBALOPT_ATTRIBUTOR-NEXT: [[F1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 1			; GLOBALOPT_ATTRIBUTOR-NEXT: [[F1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 1
	; GLOBALOPT_ATTRIBUTOR-NEXT: store i32 1, i32* [[F0]], align 4			; GLOBALOPT_ATTRIBUTOR-NEXT: store i32 1, i32* [[F0]], align 4
	; GLOBALOPT_ATTRIBUTOR-NEXT: store i32 2, i32* [[F1]], align 4			; GLOBALOPT_ATTRIBUTOR-NEXT: store i32 2, i32* [[F1]], align 4
	; GLOBALOPT_ATTRIBUTOR-NEXT: [[R:%.*]] = call fastcc i32 @f(%struct.ss* noalias nocapture nofree nonnull readonly align 4 dereferenceable(8) [[S]])			; GLOBALOPT_ATTRIBUTOR-NEXT: [[S_CAST:%.*]] = bitcast %struct.ss* [[S]] to i32*
				; GLOBALOPT_ATTRIBUTOR-NEXT: [[TMP0:%.*]] = load i32, i32* [[S_CAST]], align 1
				; GLOBALOPT_ATTRIBUTOR-NEXT: [[S_0_1:%.*]] = getelementptr [[STRUCT_SS]], %struct.ss* [[S]], i32 0, i32 1
				; GLOBALOPT_ATTRIBUTOR-NEXT: [[TMP1:%.*]] = load i32, i32* [[S_0_1]], align 1
				; GLOBALOPT_ATTRIBUTOR-NEXT: [[R:%.*]] = call fastcc i32 @f(i32 [[TMP0]], i32 [[TMP1]])
	; GLOBALOPT_ATTRIBUTOR-NEXT: ret i32 [[R]]			; GLOBALOPT_ATTRIBUTOR-NEXT: ret i32 [[R]]
	;			;
	entry:			entry:
	%S = alloca inalloca %struct.ss			%S = alloca inalloca %struct.ss
	%f0 = getelementptr %struct.ss, %struct.ss* %S, i32 0, i32 0			%f0 = getelementptr %struct.ss, %struct.ss* %S, i32 0, i32 0
	%f1 = getelementptr %struct.ss, %struct.ss* %S, i32 0, i32 1			%f1 = getelementptr %struct.ss, %struct.ss* %S, i32 0, i32 1
	store i32 1, i32* %f0, align 4			store i32 1, i32* %f0, align 4
	store i32 2, i32* %f1, align 4			store i32 2, i32* %f1, align 4
	[... 25 lines not shown ...]

llvm/test/Transforms/Attributor/ArgumentPromotion/profile.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes
	; RUN: opt -S -passes='attributor' -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 < %s | FileCheck %s			; RUN: opt -S -passes='attributor' -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 < %s | FileCheck %s
	target datalayout = "E-p:64:64:64-a0:0:8-f32:32:32-f64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-v64:64:64-v128:128:128"			target datalayout = "E-p:64:64:64-a0:0:8-f32:32:32-f64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-v64:64:64-v128:128:128"

	; Checks if !prof metadata is correct in deadargelim.			; Checks if !prof metadata is correct in deadargelim.

	define void @caller() #0 {			define void @caller() #0 {
	; CHECK-LABEL: define {{[^@]+}}@caller()			; CHECK-LABEL: define {{[^@]+}}@caller()
	; CHECK-NEXT: [[X:%.*]] = alloca i32			; CHECK-NEXT: [[X:%.*]] = alloca i32
	; CHECK-NEXT: store i32 42, i32* [[X]], align 4			; CHECK-NEXT: store i32 42, i32* [[X]], align 4
	; CHECK-NEXT: call void @promote_i32_ptr(i32* noalias nocapture nonnull readonly align 4 dereferenceable(4) [[X]]), !prof !0			; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[X]], align 1
				; CHECK-NEXT: call void @promote_i32_ptr(i32 [[TMP1]]), !prof !0
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	%x = alloca i32			%x = alloca i32
	store i32 42, i32* %x			store i32 42, i32* %x
	call void @promote_i32_ptr(i32* %x), !prof !0			call void @promote_i32_ptr(i32* %x), !prof !0
	ret void			ret void
	}			}

	define internal void @promote_i32_ptr(i32* %xp) {			define internal void @promote_i32_ptr(i32* %xp) {
	; CHECK-LABEL: define {{[^@]+}}@promote_i32_ptr			; CHECK-LABEL: define {{[^@]+}}@promote_i32_ptr
	; CHECK-SAME: (i32* noalias nocapture nonnull readonly align 4 dereferenceable(4) [[XP:%.*]])			; CHECK-SAME: (i32 [[TMP0:%.*]])
	; CHECK-NEXT: [[X:%.*]] = load i32, i32* [[XP]], align 4			; CHECK-NEXT: [[XP_PRIV:%.*]] = alloca i32
				; CHECK-NEXT: store i32 [[TMP0]], i32* [[XP_PRIV]]
				; CHECK-NEXT: [[X:%.*]] = load i32, i32* [[XP_PRIV]], align 4
	; CHECK-NEXT: call void @use_i32(i32 [[X]])			; CHECK-NEXT: call void @use_i32(i32 [[X]])
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	%x = load i32, i32* %xp			%x = load i32, i32* %xp
	call void @use_i32(i32 %x)			call void @use_i32(i32 %x)
	ret void			ret void
	}			}

	declare void @use_i32(i32)			declare void @use_i32(i32)

	!0 = !{!"branch_weights", i32 30}			!0 = !{!"branch_weights", i32 30}

llvm/test/Transforms/Attributor/ArgumentPromotion/tail.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes
	; RUN: opt -S -passes='attributor' -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=4 < %s | FileCheck %s			; RUN: opt -S -passes='attributor' -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=4 < %s | FileCheck %s
	; PR14710			; PR14710

	target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

	%pair = type { i32, i32 }			%pair = type { i32, i32 }

	declare i8* @foo(%pair*)			declare i8* @foo(%pair*)

	define internal void @bar(%pair* byval %Data) {			define internal void @bar(%pair* byval %Data) {
	; CHECK-LABEL: define {{[^@]+}}@bar			; CHECK-LABEL: define {{[^@]+}}@bar
	; CHECK-SAME: (%pair* noalias byval [[DATA:%.*]])			; CHECK-SAME: (i32 [[TMP0:%.*]], i32 [[TMP1:%.*]])
	; CHECK-NEXT: [[TMP1:%.*]] = tail call i8* @foo(%pair* [[DATA]])			; CHECK-NEXT: [[DATA_PRIV:%.*]] = alloca [[PAIR:%.*]]
				; CHECK-NEXT: [[DATA_PRIV_CAST:%.*]] = bitcast %pair* [[DATA_PRIV]] to i32*
				; CHECK-NEXT: store i32 [[TMP0]], i32* [[DATA_PRIV_CAST]]
				; CHECK-NEXT: [[DATA_PRIV_0_1:%.*]] = getelementptr [[PAIR]], %pair* [[DATA_PRIV]], i32 0, i32 1
				; CHECK-NEXT: store i32 [[TMP1]], i32* [[DATA_PRIV_0_1]]
				; CHECK-NEXT: [[TMP3:%.*]] = call i8* @foo(%pair* [[DATA_PRIV]])
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	tail call i8* @foo(%pair* %Data)			tail call i8* @foo(%pair* %Data)
	ret void			ret void
	}			}

	define void @zed(%pair* byval %Data) {			define void @zed(%pair* byval %Data) {
	; CHECK-LABEL: define {{[^@]+}}@zed			; CHECK-LABEL: define {{[^@]+}}@zed
	; CHECK-SAME: (%pair* noalias nocapture readonly byval [[DATA:%.*]])			; CHECK-SAME: (%pair* noalias nocapture readonly byval [[DATA:%.*]])
	; CHECK-NEXT: call void @bar(%pair* noalias nocapture readonly byval [[DATA]])			; CHECK-NEXT: [[DATA_CAST:%.*]] = bitcast %pair* [[DATA]] to i32*
				; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[DATA_CAST]], align 1
				; CHECK-NEXT: [[DATA_0_1:%.*]] = getelementptr [[PAIR:%.*]], %pair* [[DATA]], i32 0, i32 1
				; CHECK-NEXT: [[TMP2:%.*]] = load i32, i32* [[DATA_0_1]], align 1
				; CHECK-NEXT: call void @bar(i32 [[TMP1]], i32 [[TMP2]])
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	call void @bar(%pair* byval %Data)			call void @bar(%pair* byval %Data)
	ret void			ret void
	}			}

llvm/test/Transforms/Attributor/IPConstantProp/2009-09-24-byval-ptr.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes
	; RUN: opt -S -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 < %s | FileCheck %s			; RUN: opt -S -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=2 < %s | FileCheck %s
	; Don't constant-propagate byval pointers, since they are not pointers!			; Don't constant-propagate byval pointers, since they are not pointers!
	; PR5038			; PR5038
	%struct.MYstr = type { i8, i32 }			%struct.MYstr = type { i8, i32 }
	@mystr = internal global %struct.MYstr zeroinitializer ; <%struct.MYstr*> [#uses=3]			@mystr = internal global %struct.MYstr zeroinitializer ; <%struct.MYstr*> [#uses=3]
	define internal void @vfu1(%struct.MYstr* byval align 4 %u) nounwind {			define internal void @vfu1(%struct.MYstr* byval align 4 %u) nounwind {
	; CHECK-LABEL: define {{[^@]+}}@vfu1			; CHECK-LABEL: define {{[^@]+}}@vfu1
	; CHECK-SAME: (%struct.MYstr* noalias nocapture nofree nonnull writeonly byval align 8 dereferenceable(8) [[U:%.*]])			; CHECK-SAME: (i8 [[TMP0:%.*]], i32 [[TMP1:%.*]])
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[TMP0:%.*]] = getelementptr [[STRUCT_MYSTR:%.*]], %struct.MYstr* [[U]], i32 0, i32 1			; CHECK-NEXT: [[U_PRIV:%.*]] = alloca [[STRUCT_MYSTR:%.*]]
	; CHECK-NEXT: store i32 99, i32* [[TMP0]], align 4			; CHECK-NEXT: [[U_PRIV_CAST:%.*]] = bitcast %struct.MYstr* [[U_PRIV]] to i8*
	; CHECK-NEXT: [[TMP1:%.*]] = getelementptr [[STRUCT_MYSTR]], %struct.MYstr* [[U]], i32 0, i32 0			; CHECK-NEXT: store i8 [[TMP0]], i8* [[U_PRIV_CAST]]
	; CHECK-NEXT: store i8 97, i8* [[TMP1]], align 8			; CHECK-NEXT: [[U_PRIV_0_1:%.*]] = getelementptr [[STRUCT_MYSTR]], %struct.MYstr* [[U_PRIV]], i32 0, i32 1
				; CHECK-NEXT: store i32 [[TMP1]], i32* [[U_PRIV_0_1]]
				; CHECK-NEXT: [[TMP2:%.*]] = getelementptr [[STRUCT_MYSTR]], %struct.MYstr* [[U_PRIV]], i32 0, i32 1
				; CHECK-NEXT: store i32 99, i32* [[TMP2]], align 4
				; CHECK-NEXT: [[TMP3:%.*]] = getelementptr [[STRUCT_MYSTR]], %struct.MYstr* [[U_PRIV]], i32 0, i32 0
				; CHECK-NEXT: store i8 97, i8* [[TMP3]], align 8
	; CHECK-NEXT: br label [[RETURN:%.*]]			; CHECK-NEXT: br label [[RETURN:%.*]]
	; CHECK: return:			; CHECK: return:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	%0 = getelementptr %struct.MYstr, %struct.MYstr* %u, i32 0, i32 1 ; <i32*> [#uses=1]			%0 = getelementptr %struct.MYstr, %struct.MYstr* %u, i32 0, i32 1 ; <i32*> [#uses=1]
	store i32 99, i32* %0, align 4			store i32 99, i32* %0, align 4
	%1 = getelementptr %struct.MYstr, %struct.MYstr* %u, i32 0, i32 0 ; <i8*> [#uses=1]			%1 = getelementptr %struct.MYstr, %struct.MYstr* %u, i32 0, i32 0 ; <i8*> [#uses=1]
	store i8 97, i8* %1, align 4			store i8 97, i8* %1, align 4
	br label %return			br label %return

	return: ; preds = %entry			return: ; preds = %entry
	ret void			ret void
	}			}

	define internal i32 @vfu2(%struct.MYstr* byval align 4 %u) nounwind readonly {			define internal i32 @vfu2(%struct.MYstr* byval align 4 %u) nounwind readonly {
	; CHECK-LABEL: define {{[^@]+}}@vfu2			; CHECK-LABEL: define {{[^@]+}}@vfu2
	; CHECK-SAME: (%struct.MYstr* noalias nocapture nofree nonnull readonly byval align 8 dereferenceable(8) [[U:%.*]])			; CHECK-SAME: (i8 [[TMP0:%.*]], i32 [[TMP1:%.*]])
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[TMP0:%.*]] = getelementptr [[STRUCT_MYSTR:%.*]], %struct.MYstr* @mystr, i32 0, i32 1			; CHECK-NEXT: [[U_PRIV:%.*]] = alloca [[STRUCT_MYSTR:%.*]]
	; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[TMP0]]			; CHECK-NEXT: [[U_PRIV_CAST:%.*]] = bitcast %struct.MYstr* [[U_PRIV]] to i8*
	; CHECK-NEXT: [[TMP2:%.*]] = getelementptr [[STRUCT_MYSTR]], %struct.MYstr* @mystr, i32 0, i32 0			; CHECK-NEXT: store i8 [[TMP0]], i8* [[U_PRIV_CAST]]
	; CHECK-NEXT: [[TMP3:%.*]] = load i8, i8* [[TMP2]], align 8			; CHECK-NEXT: [[U_PRIV_0_1:%.*]] = getelementptr [[STRUCT_MYSTR]], %struct.MYstr* [[U_PRIV]], i32 0, i32 1
	; CHECK-NEXT: [[TMP4:%.*]] = zext i8 [[TMP3]] to i32			; CHECK-NEXT: store i32 [[TMP1]], i32* [[U_PRIV_0_1]]
	; CHECK-NEXT: [[TMP5:%.*]] = add i32 [[TMP4]], [[TMP1]]			; CHECK-NEXT: [[TMP2:%.*]] = getelementptr [[STRUCT_MYSTR]], %struct.MYstr* @mystr, i32 0, i32 1
	; CHECK-NEXT: ret i32 [[TMP5]]			; CHECK-NEXT: [[TMP3:%.*]] = load i32, i32* [[TMP2]]
				; CHECK-NEXT: [[TMP4:%.*]] = getelementptr [[STRUCT_MYSTR]], %struct.MYstr* @mystr, i32 0, i32 0
				; CHECK-NEXT: [[TMP5:%.*]] = load i8, i8* [[TMP4]], align 8
				; CHECK-NEXT: [[TMP6:%.*]] = zext i8 [[TMP5]] to i32
				; CHECK-NEXT: [[TMP7:%.*]] = add i32 [[TMP6]], [[TMP3]]
				; CHECK-NEXT: ret i32 [[TMP7]]
	;			;
	entry:			entry:
	%0 = getelementptr %struct.MYstr, %struct.MYstr* %u, i32 0, i32 1 ; <i32*> [#uses=1]			%0 = getelementptr %struct.MYstr, %struct.MYstr* %u, i32 0, i32 1 ; <i32*> [#uses=1]
	%1 = load i32, i32* %0			%1 = load i32, i32* %0
	%2 = getelementptr %struct.MYstr, %struct.MYstr* %u, i32 0, i32 0 ; <i8*> [#uses=1]			%2 = getelementptr %struct.MYstr, %struct.MYstr* %u, i32 0, i32 0 ; <i8*> [#uses=1]
	%3 = load i8, i8* %2			%3 = load i8, i8* %2
	%4 = zext i8 %3 to i32			%4 = zext i8 %3 to i32
	%5 = add i32 %4, %1			%5 = add i32 %4, %1
	ret i32 %5			ret i32 %5
	}			}

	define i32 @unions() nounwind {			define i32 @unions() nounwind {
	; CHECK-LABEL: define {{[^@]+}}@unions()			; CHECK-LABEL: define {{[^@]+}}@unions()
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: call void @vfu1(%struct.MYstr* nofree nonnull readonly byval align 8 dereferenceable(8) @mystr)			; CHECK-NEXT: [[MYSTR_CAST1:%.*]] = bitcast %struct.MYstr* @mystr to i8*
	; CHECK-NEXT: [[RESULT:%.*]] = call i32 @vfu2(%struct.MYstr* nofree nonnull readonly byval align 8 dereferenceable(8) @mystr)			; CHECK-NEXT: [[TMP0:%.*]] = load i8, i8* [[MYSTR_CAST1]], align 1
				; CHECK-NEXT: [[MYSTR_0_12:%.*]] = getelementptr [[STRUCT_MYSTR:%.*]], %struct.MYstr* @mystr, i32 0, i32 1
				; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[MYSTR_0_12]], align 1
				; CHECK-NEXT: call void @vfu1(i8 [[TMP0]], i32 [[TMP1]])
				; CHECK-NEXT: [[MYSTR_CAST:%.*]] = bitcast %struct.MYstr* @mystr to i8*
				; CHECK-NEXT: [[TMP2:%.*]] = load i8, i8* [[MYSTR_CAST]], align 1
				; CHECK-NEXT: [[MYSTR_0_1:%.*]] = getelementptr [[STRUCT_MYSTR]], %struct.MYstr* @mystr, i32 0, i32 1
				; CHECK-NEXT: [[TMP3:%.*]] = load i32, i32* [[MYSTR_0_1]], align 1
				; CHECK-NEXT: [[RESULT:%.*]] = call i32 @vfu2(i8 [[TMP2]], i32 [[TMP3]])
	; CHECK-NEXT: ret i32 [[RESULT]]			; CHECK-NEXT: ret i32 [[RESULT]]
	;			;
	entry:			entry:
	call void @vfu1(%struct.MYstr* byval align 4 @mystr) nounwind			call void @vfu1(%struct.MYstr* byval align 4 @mystr) nounwind
	%result = call i32 @vfu2(%struct.MYstr* byval align 4 @mystr) nounwind			%result = call i32 @vfu2(%struct.MYstr* byval align 4 @mystr) nounwind
	ret i32 %result			ret i32 %result
	}			}

	define internal i32 @vfu2_v2(%struct.MYstr* byval align 4 %u) nounwind readonly {			define internal i32 @vfu2_v2(%struct.MYstr* byval align 4 %u) nounwind readonly {
	; CHECK-LABEL: define {{[^@]+}}@vfu2_v2			; CHECK-LABEL: define {{[^@]+}}@vfu2_v2
	; CHECK-SAME: (%struct.MYstr* noalias nocapture nofree nonnull byval align 8 dereferenceable(8) [[U:%.*]])			; CHECK-SAME: (i8 [[TMP0:%.*]], i32 [[TMP1:%.*]])
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[Z:%.*]] = getelementptr [[STRUCT_MYSTR:%.*]], %struct.MYstr* [[U]], i32 0, i32 1			; CHECK-NEXT: [[U_PRIV:%.*]] = alloca [[STRUCT_MYSTR:%.*]]
				; CHECK-NEXT: [[U_PRIV_CAST:%.*]] = bitcast %struct.MYstr* [[U_PRIV]] to i8*
				; CHECK-NEXT: store i8 [[TMP0]], i8* [[U_PRIV_CAST]]
				; CHECK-NEXT: [[U_PRIV_0_1:%.*]] = getelementptr [[STRUCT_MYSTR]], %struct.MYstr* [[U_PRIV]], i32 0, i32 1
				; CHECK-NEXT: store i32 [[TMP1]], i32* [[U_PRIV_0_1]]
				; CHECK-NEXT: [[Z:%.*]] = getelementptr [[STRUCT_MYSTR]], %struct.MYstr* [[U_PRIV]], i32 0, i32 1
	; CHECK-NEXT: store i32 99, i32* [[Z]], align 4			; CHECK-NEXT: store i32 99, i32* [[Z]], align 4
	; CHECK-NEXT: [[TMP0:%.*]] = getelementptr [[STRUCT_MYSTR]], %struct.MYstr* [[U]], i32 0, i32 1			; CHECK-NEXT: [[TMP2:%.*]] = getelementptr [[STRUCT_MYSTR]], %struct.MYstr* [[U_PRIV]], i32 0, i32 1
	; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[TMP0]]			; CHECK-NEXT: [[TMP3:%.*]] = load i32, i32* [[TMP2]]
	; CHECK-NEXT: [[TMP2:%.*]] = getelementptr [[STRUCT_MYSTR]], %struct.MYstr* [[U]], i32 0, i32 0			; CHECK-NEXT: [[TMP4:%.*]] = getelementptr [[STRUCT_MYSTR]], %struct.MYstr* [[U_PRIV]], i32 0, i32 0
	; CHECK-NEXT: [[TMP3:%.*]] = load i8, i8* [[TMP2]], align 8			; CHECK-NEXT: [[TMP5:%.*]] = load i8, i8* [[TMP4]], align 8
	; CHECK-NEXT: [[TMP4:%.*]] = zext i8 [[TMP3]] to i32			; CHECK-NEXT: [[TMP6:%.*]] = zext i8 [[TMP5]] to i32
	; CHECK-NEXT: [[TMP5:%.*]] = add i32 [[TMP4]], [[TMP1]]			; CHECK-NEXT: [[TMP7:%.*]] = add i32 [[TMP6]], [[TMP3]]
	; CHECK-NEXT: ret i32 [[TMP5]]			; CHECK-NEXT: ret i32 [[TMP7]]
	;			;
	entry:			entry:
	%z = getelementptr %struct.MYstr, %struct.MYstr* %u, i32 0, i32 1			%z = getelementptr %struct.MYstr, %struct.MYstr* %u, i32 0, i32 1
	store i32 99, i32* %z, align 4			store i32 99, i32* %z, align 4
	%0 = getelementptr %struct.MYstr, %struct.MYstr* %u, i32 0, i32 1 ; <i32*> [#uses=1]			%0 = getelementptr %struct.MYstr, %struct.MYstr* %u, i32 0, i32 1 ; <i32*> [#uses=1]
	%1 = load i32, i32* %0			%1 = load i32, i32* %0
	%2 = getelementptr %struct.MYstr, %struct.MYstr* %u, i32 0, i32 0 ; <i8*> [#uses=1]			%2 = getelementptr %struct.MYstr, %struct.MYstr* %u, i32 0, i32 0 ; <i8*> [#uses=1]
	%3 = load i8, i8* %2			%3 = load i8, i8* %2
	%4 = zext i8 %3 to i32			%4 = zext i8 %3 to i32
	%5 = add i32 %4, %1			%5 = add i32 %4, %1
	ret i32 %5			ret i32 %5
	}			}

	define i32 @unions_v2() nounwind {			define i32 @unions_v2() nounwind {
	; CHECK-LABEL: define {{[^@]+}}@unions_v2()			; CHECK-LABEL: define {{[^@]+}}@unions_v2()
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: call void @vfu1(%struct.MYstr* nofree nonnull readonly byval align 8 dereferenceable(8) @mystr)			; CHECK-NEXT: [[MYSTR_CAST1:%.*]] = bitcast %struct.MYstr* @mystr to i8*
	; CHECK-NEXT: [[RESULT:%.*]] = call i32 @vfu2_v2(%struct.MYstr* nofree nonnull readonly byval align 8 dereferenceable(8) @mystr)			; CHECK-NEXT: [[TMP0:%.*]] = load i8, i8* [[MYSTR_CAST1]], align 1
				; CHECK-NEXT: [[MYSTR_0_12:%.*]] = getelementptr [[STRUCT_MYSTR:%.*]], %struct.MYstr* @mystr, i32 0, i32 1
				; CHECK-NEXT: [[TMP1:%.*]] = load i32, i32* [[MYSTR_0_12]], align 1
				; CHECK-NEXT: call void @vfu1(i8 [[TMP0]], i32 [[TMP1]])
				; CHECK-NEXT: [[MYSTR_CAST:%.*]] = bitcast %struct.MYstr* @mystr to i8*
				; CHECK-NEXT: [[TMP2:%.*]] = load i8, i8* [[MYSTR_CAST]], align 1
				; CHECK-NEXT: [[MYSTR_0_1:%.*]] = getelementptr [[STRUCT_MYSTR]], %struct.MYstr* @mystr, i32 0, i32 1
				; CHECK-NEXT: [[TMP3:%.*]] = load i32, i32* [[MYSTR_0_1]], align 1
				; CHECK-NEXT: [[RESULT:%.*]] = call i32 @vfu2_v2(i8 [[TMP2]], i32 [[TMP3]])
	; CHECK-NEXT: ret i32 [[RESULT]]			; CHECK-NEXT: ret i32 [[RESULT]]
	;			;
	entry:			entry:
	call void @vfu1(%struct.MYstr* byval align 4 @mystr) nounwind			call void @vfu1(%struct.MYstr* byval align 4 @mystr) nounwind
	%result = call i32 @vfu2_v2(%struct.MYstr* byval align 4 @mystr) nounwind			%result = call i32 @vfu2_v2(%struct.MYstr* byval align 4 @mystr) nounwind
	ret i32 %result			ret i32 %result
	}			}

llvm/test/Transforms/Attributor/IPConstantProp/PR16052.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --scrub-attributes
	; RUN: opt -S -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=3 < %s | FileCheck %s			; RUN: opt -S -passes=attributor -aa-pipeline='basic-aa' -attributor-disable=false -attributor-max-iterations-verify -attributor-max-iterations=3 < %s | FileCheck %s

	target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
	target triple = "x86_64-unknown-linux-gnu"			target triple = "x86_64-unknown-linux-gnu"

	define i64 @fn2() {			define i64 @fn2() {
	; CHECK-LABEL: define {{[^@]+}}@fn2()			; CHECK-LABEL: define {{[^@]+}}@fn2()
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[CONV:%.*]] = sext i32 undef to i64			; CHECK-NEXT: [[CONV:%.*]] = sext i32 undef to i64
	; CHECK-NEXT: [[DIV:%.*]] = sdiv i64 8, [[CONV]]			; CHECK-NEXT: [[DIV:%.*]] = sdiv i64 8, [[CONV]]
	; CHECK-NEXT: [[CALL2:%.*]] = call i64 @fn1(i64 [[DIV]])			; CHECK-NEXT: [[CALL2:%.*]] = call i64 @fn1(i64 [[DIV]]) #0, !range !0
	; CHECK-NEXT: ret i64 [[CALL2]]			; CHECK-NEXT: ret i64 [[CALL2]]
	;			;
	entry:			entry:
	%conv = sext i32 undef to i64			%conv = sext i32 undef to i64
	%div = sdiv i64 8, %conv			%div = sdiv i64 8, %conv
	%call2 = call i64 @fn1(i64 %div)			%call2 = call i64 @fn1(i64 %div)
	ret i64 %call2			ret i64 %call2
	}			}
	[... 14 lines not shown ...]

llvm/test/Transforms/Attributor/callbacks.ll

[... 57 lines not shown ...]
entry:		entry:
tail call void @t0_check(i32* %a, i64 %b, i32* %0)		tail call void @t0_check(i32* %a, i64 %b, i32* %0)
ret void		ret void
}		}

declare void @t0_check(i32* align 256, i64, i32*)		declare void @t0_check(i32* align 256, i64, i32*)

declare !callback !0 void @t0_callback_broker(i32*, i32*, void (i32*, i32*, ...)*, ...)		declare !callback !0 void @t0_callback_broker(i32*, i32*, void (i32*, i32*, ...)*, ...)

		; Test 1
		;
		; Similar to test 0 but with some additional annotations (noalias/nocapture) to make sure
		; we deduce and propagate noalias and others properly.

		define void @t1_caller(i32* noalias %a) {
		; CHECK-LABEL: define {{[^@]+}}@t1_caller
		; CHECK-SAME: (i32* noalias nocapture align 256 [[A:%.*]])
		; CHECK-NEXT: entry:
		; CHECK-NEXT: [[B:%.*]] = alloca i32, align 32
		; CHECK-NEXT: [[C:%.*]] = alloca i32*, align 64
		; CHECK-NEXT: [[PTR:%.*]] = alloca i32, align 128
		; CHECK-NEXT: [[TMP0:%.*]] = bitcast i32* [[B]] to i8*
		; CHECK-NEXT: store i32 42, i32* [[B]], align 32
		; CHECK-NEXT: store i32* [[B]], i32** [[C]], align 64
		; CHECK-NEXT: call void (i32*, i32*, void (i32*, i32*, ...)*, ...) @t1_callback_broker(i32* noalias align 536870912 null, i32* noalias nonnull align 128 dereferenceable(4) [[PTR]], void (i32*, i32*, ...)* nonnull bitcast (void (i32*, i32*, i32*, i64, i32**)* @t1_callback_callee to void (i32*, i32*, ...)*), i32* noalias nocapture align 256 [[A]], i64 99, i32** noalias nocapture nonnull readonly align 64 dereferenceable(8) [[C]])
		; CHECK-NEXT: ret void
		;
		entry:
		%b = alloca i32, align 32
		%c = alloca i32*, align 64
		%ptr = alloca i32, align 128
		%0 = bitcast i32* %b to i8*
		store i32 42, i32* %b, align 4
		store i32* %b, i32** %c, align 8
		call void (i32*, i32*, void (i32*, i32*, ...)*, ...) @t1_callback_broker(i32* null, i32* %ptr, void (i32*, i32*, ...)* bitcast (void (i32*, i32*, i32*, i64, i32**)* @t1_callback_callee to void (i32*, i32*, ...)*), i32* %a, i64 99, i32** %c)
		ret void
		}

		; Note that the first two arguments are provided by the callback_broker according to the callback in !1 below!
		; The others are annotated with alignment information, amongst others, or even replaced by the constants passed to the call.
		define internal void @t1_callback_callee(i32* %is_not_null, i32* %ptr, i32* %a, i64 %b, i32** %c) {
		; CHECK-LABEL: define {{[^@]+}}@t1_callback_callee
		; CHECK-SAME: (i32* nocapture nonnull writeonly dereferenceable(4) [[IS_NOT_NULL:%.*]], i32* nocapture nonnull readonly align 8 dereferenceable(4) [[PTR:%.*]], i32* noalias nocapture align 256 [[A:%.*]], i64 [[B:%.*]], i32** noalias nocapture nonnull readonly align 64 dereferenceable(8) [[C:%.*]])
		; CHECK-NEXT: entry:
		; CHECK-NEXT: [[PTR_VAL:%.*]] = load i32, i32* [[PTR]], align 8
		; CHECK-NEXT: store i32 [[PTR_VAL]], i32* [[IS_NOT_NULL]]
		; CHECK-NEXT: [[TMP0:%.*]] = load i32*, i32** [[C]], align 64
		; CHECK-NEXT: tail call void @t1_check(i32* nocapture align 256 [[A]], i64 99, i32* [[TMP0]])
		; CHECK-NEXT: ret void
		;
		entry:
		%ptr_val = load i32, i32* %ptr, align 8
		store i32 %ptr_val, i32* %is_not_null
		%0 = load i32*, i32** %c, align 8
		tail call void @t1_check(i32* %a, i64 %b, i32* %0)
		ret void
		}

		declare void @t1_check(i32* nocapture align 256, i64, i32* nocapture) nosync

		declare !callback !0 void @t1_callback_broker(i32* nocapture , i32* nocapture , void (i32*, i32*, ...)* nocapture, ...)

		; Test 2
		;
		; Similar to test 1 but checking that the noalias is only placed if potential synchronization through @t2_check is preserved.

		define void @t2_caller(i32* noalias %a) {
		; CHECK-LABEL: define {{[^@]+}}@t2_caller
		; CHECK-SAME: (i32* noalias nocapture align 256 [[A:%.*]])
		; CHECK-NEXT: entry:
		; CHECK-NEXT: [[B:%.*]] = alloca i32, align 32
		; CHECK-NEXT: [[C:%.*]] = alloca i32*, align 64
		; CHECK-NEXT: [[PTR:%.*]] = alloca i32, align 128
		; CHECK-NEXT: [[TMP0:%.*]] = bitcast i32* [[B]] to i8*
		; CHECK-NEXT: store i32 42, i32* [[B]], align 32
		; CHECK-NEXT: store i32* [[B]], i32** [[C]], align 64
		; CHECK-NEXT: call void (i32*, i32*, void (i32*, i32*, ...)*, ...) @t2_callback_broker(i32* noalias align 536870912 null, i32* noalias nonnull align 128 dereferenceable(4) [[PTR]], void (i32*, i32*, ...)* nonnull bitcast (void (i32*, i32*, i32*, i64, i32**)* @t2_callback_callee to void (i32*, i32*, ...)*), i32* noalias nocapture align 256 [[A]], i64 99, i32** noalias nocapture nonnull readonly align 64 dereferenceable(8) [[C]])
		; CHECK-NEXT: ret void
		;
		entry:
		%b = alloca i32, align 32
		%c = alloca i32*, align 64
		%ptr = alloca i32, align 128
		%0 = bitcast i32* %b to i8*
		store i32 42, i32* %b, align 4
		store i32* %b, i32** %c, align 8
		call void (i32*, i32*, void (i32*, i32*, ...)*, ...) @t2_callback_broker(i32* null, i32* %ptr, void (i32*, i32*, ...)* bitcast (void (i32*, i32*, i32*, i64, i32**)* @t2_callback_callee to void (i32*, i32*, ...)*), i32* %a, i64 99, i32** %c)
		ret void
		}

		; Note that the first two arguments are provided by the callback_broker according to the callback in !1 below!
		; The others are annotated with alignment information, amongst others, or even replaced by the constants passed to the call.
		;
		; FIXME: We should derive noalias for %a and add a "fake use" of %a in all potentially synchronizing calls.
		define internal void @t2_callback_callee(i32* %is_not_null, i32* %ptr, i32* %a, i64 %b, i32** %c) {
		; CHECK-LABEL: define {{[^@]+}}@t2_callback_callee
		; CHECK-SAME: (i32* nocapture nonnull writeonly dereferenceable(4) [[IS_NOT_NULL:%.*]], i32* nocapture nonnull readonly align 8 dereferenceable(4) [[PTR:%.*]], i32* nocapture align 256 [[A:%.*]], i64 [[B:%.*]], i32** noalias nocapture nonnull readonly align 64 dereferenceable(8) [[C:%.*]])
		; CHECK-NEXT: entry:
		; CHECK-NEXT: [[PTR_VAL:%.*]] = load i32, i32* [[PTR]], align 8
		; CHECK-NEXT: store i32 [[PTR_VAL]], i32* [[IS_NOT_NULL]]
		; CHECK-NEXT: [[TMP0:%.*]] = load i32*, i32** [[C]], align 64
		; CHECK-NEXT: tail call void @t2_check(i32* nocapture align 256 [[A]], i64 99, i32* [[TMP0]])
		; CHECK-NEXT: ret void
		;
		entry:
		%ptr_val = load i32, i32* %ptr, align 8
		store i32 %ptr_val, i32* %is_not_null
		%0 = load i32*, i32** %c, align 8
		tail call void @t2_check(i32* %a, i64 %b, i32* %0)
		ret void
		}

		declare void @t2_check(i32* nocapture align 256, i64, i32* nocapture)

		declare !callback !0 void @t2_callback_broker(i32* nocapture , i32* nocapture , void (i32*, i32*, ...)* nocapture, ...)

!0 = !{!1}		!0 = !{!1}
!1 = !{i64 2, i64 -1, i64 -1, i1 true}		!1 = !{i64 2, i64 -1, i64 -1, i1 true}
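
The test comments above say that the first two callee arguments are "provided by the callback_broker according to the callback in !1". As a rough illustration of how that encoding can be read, the sketch below models `!{i64 2, i64 -1, i64 -1, i1 true}` in standalone C++. It assumes the usual reading of `!callback` metadata: the first operand is the position of the callback callee among the broker's arguments, each following integer gives the broker argument forwarded to the corresponding callee parameter (with -1 meaning unknown), and the final flag says whether the broker's variadic arguments are passed through. `CallbackEncoding` and `describeCalleeParams` are hypothetical names used only for this sketch; they are not part of LLVM or of this patch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of one callback encoding, e.g.
// !{i64 2, i64 -1, i64 -1, i1 true}. Not an LLVM data structure.
struct CallbackEncoding {
  int64_t CalleeIdx;             // broker argument that holds the callback callee
  std::vector<int64_t> ParamMap; // per callee parameter: broker argument index, -1 if unknown
  bool ForwardsVarArgs;          // true if the broker's variadic arguments are appended
};

// Describe, for every callback callee parameter, where its value comes from
// at a broker call site with NumCallSiteArgs arguments, of which the first
// NumBrokerFixedParams are the broker's fixed (non-variadic) parameters.
std::vector<std::string> describeCalleeParams(const CallbackEncoding &Enc,
                                              unsigned NumBrokerFixedParams,
                                              unsigned NumCallSiteArgs) {
  std::vector<std::string> Out;
  for (int64_t Idx : Enc.ParamMap) {
    if (Idx < 0)
      Out.push_back("unknown (provided by broker)");
    else
      Out.push_back("broker arg " + std::to_string(Idx));
  }
  // Variadic arguments of the broker call follow the explicitly mapped ones.
  if (Enc.ForwardsVarArgs)
    for (unsigned I = NumBrokerFixedParams; I < NumCallSiteArgs; ++I)
      Out.push_back("broker arg " + std::to_string(I));
  return Out;
}
```

Applied to the `@t2_callback_broker` call in `@t2_caller` (three fixed broker parameters, six call-site arguments), `describeCalleeParams({2, {-1, -1}, true}, 3, 6)` yields two unknown entries followed by broker arguments 3, 4 and 5. Those are `%a`, `i64 99` and `%c`, which is why the CHECK lines can annotate the last three parameters of `@t2_callback_callee` and replace `%b` with the constant 99.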

llvm/test/Transforms/Attributor/internal-noalias.ll

entry:
%B = alloca i32, align 4		%B = alloca i32, align 4
store i32 5, i32* %B, align 4		store i32 5, i32* %B, align 4
%call1 = call i32 @noalias_args(i32* %A, i32* nonnull %B)		%call1 = call i32 @noalias_args(i32* %A, i32* nonnull %B)
%call2 = call i32 @noalias_args_argmem(i32* %A, i32* nonnull %B)		%call2 = call i32 @noalias_args_argmem(i32* %A, i32* nonnull %B)
%add = add nsw i32 %call1, %call2		%add = add nsw i32 %call1, %call2
ret i32 %add		ret i32 %add
}		}

; CHECK: define internal i32 @noalias_args_argmem_ro(i32* noalias nocapture nofree nonnull readonly align 4 dereferenceable(4) %A, i32* noalias nocapture nofree nonnull readonly align 4 dereferenceable(4) %B)		; CHECK: define internal i32 @noalias_args_argmem_ro(i32 %0, i32 %1)
define internal i32 @noalias_args_argmem_ro(i32* %A, i32* %B) #1 {		define internal i32 @noalias_args_argmem_ro(i32* %A, i32* %B) #1 {
%t0 = load i32, i32* %A, align 4		%t0 = load i32, i32* %A, align 4
%t1 = load i32, i32* %B, align 4		%t1 = load i32, i32* %B, align 4
%add = add nsw i32 %t0, %t1		%add = add nsw i32 %t0, %t1
ret i32 %add		ret i32 %add
}		}

define i32 @visible_local_2() {		define i32 @visible_local_2() {

Diff 241337
