This is an archive of the discontinued LLVM Phabricator instance.

Top-Down FunctionAttrs propagation for noalias, dereferenceable and nonnull inference
Needs Review (Public)

Authored by hfinkel on Jul 21 2014, 3:40 PM.

Details

Reviewers

nicholas
chandlerc

Summary

This patch introduces a new module-level pass that does a top-down traversal of the call graph and, where possible, infers the parameter attributes noalias, dereferenceable and nonnull for local functions. The inference is based on an examination of all call sites.

Inference of nonnull is straightforward: all callers must pass an argument we know is nonnull. noalias requires two things at all call sites: that the parameter does not alias any other parameter, and that it has not been captured prior to the call site. dereferenceable requires that we know the pointer is dereferenceable at all call sites, obviously, but then we also need to get the size. We use the type size (which is fine because we've already verified that a load of that type is safe), but we also use getObjectSize to see if we can do better. We don't always know that the size getObjectSize returns is valid (because the malloc could have failed, for example), but it will be valid if there is also an access, and isSafeToLoadUnconditionally checks for that.
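
As a rough illustration of the "all call sites" rule described above, the following sketch combines per-call-site facts into one summary per argument. It is not code from the patch: CallSiteFact, ArgSummary and meetOverCallSites are hypothetical names, and taking the minimum dereferenceable size over all call sites is an assumption about how the sizes would be combined.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical summary of what is known about one pointer argument at one
// call site. These names are illustrative and are not taken from the patch.
struct CallSiteFact {
  bool NonNull;        // the actual argument is known to be non-null
  uint64_t DerefBytes; // bytes known to be dereferenceable (0 = unknown)
};

// Result of combining the facts from every call site of a local function.
struct ArgSummary {
  bool NonNull;
  uint64_t DerefBytes;
};

// Start from the optimistic state and weaken it with each call site: nonnull
// must hold everywhere, and the dereferenceable size is the smallest size
// proven at any call site (an assumption made for this sketch).
ArgSummary meetOverCallSites(const std::vector<CallSiteFact> &Sites) {
  if (Sites.empty())
    return {false, 0}; // no callers examined: infer nothing
  ArgSummary S{true, UINT64_MAX};
  for (const CallSiteFact &F : Sites) {
    S.NonNull = S.NonNull && F.NonNull;
    S.DerefBytes = std::min(S.DerefBytes, F.DerefBytes);
  }
  return S;
}
```

A single call site that passes a possibly-null pointer is enough to drop nonnull for the parameter, and a call site with no known size drops the dereferenceable size to zero.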

There is a quirk about scheduling this in the current pass manager. Inference of noalias requires a capture analysis, which greatly benefits from running after the existing FunctionAttrs pass, because that pass infers nocapture parameters. However, the existing FunctionAttrs pass is a CGSCC pass that runs in the main CGSCC pass manager after the inliner. This pass, being a top-down traversal of the call graph (and thus a module pass) that should run early, needs to run before the inliner's CGSCC pass manager. As a result, I think the best option is to insert an extra run of the regular FunctionAttrs pass just prior to this one, and to place both just prior to the main CGSCC pass manager. Since both preserve the call graph, this should not cause it to be recomputed.

I see no compile-time impact (either from the extra run of FunctionAttrs or from this new pass). When compiling sqlite, for example, both passes take less than 0.1% of the total time.

Diff Detail

Event Timeline

hfinkel updated this revision to Diff 11722.Jul 21 2014, 3:40 PM

hfinkel retitled this revision from to Top-Down FunctionAttrs propagation for noalias, dereferenceable and nonnull inference.

hfinkel updated this object.

hfinkel edited the test plan for this revision.

hfinkel added reviewers: chandlerc, reames.

hfinkel added a subscriber: Unknown Object (MLST).

Use isDereferenceablePointer (which checks things like argument's dereferenceable attribute) in addition to using isSafeToLoadUnconditionally (which is mostly useful for its local instruction scan). Also fixed up one of the test cases.

Updated based on Nick's comments; we need to ensure a stable answer for SCCs (which the df_iterator won't do).

The regular SCC iterator returns the SCCs bottom-up (it uses Tarjan's Algorithm, and knows it has found the root of an SCC as it passes it on the way up -- it is not clear there is a better way). Because we can't collect SCCs in a purely top-down fashion anyway, I've just changed it to store those returned by the bottom-up SCC iterator in a vector and visit this stored vector in reverse order. This pass does not modify the call graph, so this method should be stable.
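
The ordering trick described in this update can be sketched outside of LLVM as follows. This is a minimal model, assuming a call graph stored as adjacency lists; CallGraph, sccsBottomUp and sccsTopDown are hypothetical names and do not correspond to LLVM's scc_iterator API.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Call graph as adjacency lists: Callees[f] lists the functions f calls.
using CallGraph = std::vector<std::vector<int>>;

// Tarjan's algorithm. An SCC is emitted only once all SCCs reachable from it
// have been emitted, so the result is in bottom-up (callee-first) order.
std::vector<std::vector<int>> sccsBottomUp(const CallGraph &Callees) {
  const int N = static_cast<int>(Callees.size());
  std::vector<int> Index(N, -1), Low(N, 0), Stack;
  std::vector<bool> OnStack(N, false);
  std::vector<std::vector<int>> Result;
  int Next = 0;

  std::function<void(int)> Visit = [&](int V) {
    Index[V] = Low[V] = Next++;
    Stack.push_back(V);
    OnStack[V] = true;
    for (int W : Callees[V]) {
      if (Index[W] < 0) {
        Visit(W);
        Low[V] = std::min(Low[V], Low[W]);
      } else if (OnStack[W]) {
        Low[V] = std::min(Low[V], Index[W]);
      }
    }
    if (Low[V] == Index[V]) { // V is the root of an SCC
      std::vector<int> SCC;
      int W;
      do {
        W = Stack.back();
        Stack.pop_back();
        OnStack[W] = false;
        SCC.push_back(W);
      } while (W != V);
      Result.push_back(SCC);
    }
  };

  for (int V = 0; V < N; ++V)
    if (Index[V] < 0)
      Visit(V);
  return Result;
}

// Store the bottom-up sequence and walk it backwards to get a stable
// top-down (caller-first) order.
std::vector<std::vector<int>> sccsTopDown(const CallGraph &Callees) {
  std::vector<std::vector<int>> Order = sccsBottomUp(Callees);
  std::reverse(Order.begin(), Order.end());
  return Order;
}
```

Because the graph is not modified between collecting and visiting the SCCs, the reversed vector gives the same order on every run.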

In the future, we can get a better answer by speculating the attributes within each SCC, and then removing them afterward if call sites are found where the attributes can't be proved. I'll implement this as follow-up work.
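
One way to picture the speculate-then-retract idea is an optimistic fixpoint over the call edges of an SCC. The sketch below is only a toy model under strong assumptions: every function has a single pointer parameter, and Passed, Call and speculateNonNull are hypothetical names that do not appear in the patch.

```cpp
#include <vector>

// What a call site passes for the callee's single pointer parameter: a value
// proven non-null locally, the caller's own parameter, or something unknown.
enum class Passed { KnownNonNull, CallerParam, Unknown };

// One call edge inside the SCC (hypothetical, simplified model).
struct Call {
  int Caller;
  int Callee;
  Passed What;
};

// Assume nonnull on every parameter, then retract the assumption wherever a
// call site cannot justify it, repeating until nothing changes.
std::vector<bool> speculateNonNull(int NumFuncs,
                                   const std::vector<Call> &Calls) {
  std::vector<bool> NonNull(NumFuncs, true); // optimistic start
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (const Call &C : Calls) {
      bool Ok = C.What == Passed::KnownNonNull ||
                (C.What == Passed::CallerParam && NonNull[C.Caller]);
      if (!Ok && NonNull[C.Callee]) {
        NonNull[C.Callee] = false; // retract the speculated attribute
        Changed = true;
      }
    }
  }
  return NonNull;
}
```

Retracting one attribute can invalidate others that depended on it, which is why the loop runs to a fixpoint instead of making a single pass.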

Also added support for inferring the align attribute (which, post r213670, is now a useful thing to do).

Ping.

Overall, looks pretty reasonable. A few minor code comments inline.

I might suggest separating the addition of the patch and adding to the pass manager. In particular, adding the second run of the attribute analysis should probably be discussed more broadly.

lib/Transforms/IPO/FunctionAttrsTD.cpp
82	Stylistically, I'd prefer this separated into two checks. You're combining correctness and profitability criteria. Also, rather than "return MadeChange", just "return false" for the early exits. It makes the flow clearer.
91	Please rename MaxSize. It appears to actually be MaxKnownDereferenceable and the current name is highly confusing.
112	A comment here that reminds the reader the default state is fully speculative/optimistic would be helpful.
114	Do you need to iterate the uses, or can you iterate the users directly?
125	Shouldn't this be F?
131	A general point: this function is way too complex. Split it into reasonable helper functions. One arrangement would be: struct ArgState; updateArgAtCallSite(...) {}; updateForCallSite(...) {}; identifyCallSites(...) {}; addIndicatedAttributes(...) {}
157	Is this a correctness bug? Or a missed optimization? I can't tell from your FIXME.
170	Er, this doesn't look right. We've verified that loading from the pointer up to a certain number of bytes is safe. We have not said anything beyond that.
175	It looks like IsDereferenceable could become a function of MaxSize on the state object?
184	I don't understand why you need this check. Is it safe to record an align attribute less than the ABI alignment? If not, that seems slightly odd.
236	I'm not clear why this check is needed. Aren't you only iterating 'number of argument' times anyways?
249	The first is just an early exit, but what are the second two checks for? Are they functionally required?

----- Original Message -----

From: "Philip Reames" <listmail@philipreames.com>
To: hfinkel@anl.gov, chandlerc@gmail.com, nicholas@mxc.ca, listmail@philipreames.com
Cc: llvm-commits@cs.uiuc.edu
Sent: Wednesday, September 10, 2014 12:59:48 PM
Subject: Re: [PATCH] Top-Down FunctionAttrs propagation for noalias, dereferenceable and nonnull inference

Overall, looks pretty reasonable. A few minor code comments inline.

I might suggest separating the addition of the patch and adding to
the pass manager. In particular, adding the second run of the
attribute analysis should probably be discussed more broadly.

Thanks for the review!

I also need to revisit the way that the noalias inference is done because it is not quite right in this patch. Just checking alias(AI, BI) == NoAlias is not sufficient: a[i] and a[i+1] don't alias, but obviously pointers based on them could. I can really only do this when we have distinct sets of underlying objects (which makes several other things much simpler -- including the capture checking and the ramifications on the optimization pipeline).

-Hal
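
The a[i] / a[i+1] problem mentioned above can be made concrete with a small byte-range model. This is an illustration only, not LLVM's alias analysis; Access and rangesDisjoint are hypothetical names, and 4-byte elements are assumed.

```cpp
#include <cstddef>

// Byte range touched through a pointer: [Offset, Offset + Size) within one
// underlying array. Illustrative model only.
struct Access {
  std::size_t Offset;
  std::size_t Size;
};

// True when the two byte ranges share no byte.
bool rangesDisjoint(Access A, Access B) {
  return A.Offset + A.Size <= B.Offset || B.Offset + B.Size <= A.Offset;
}
```

With 4-byte elements, the arguments &a[0] and &a[1] cover {0, 4} and {4, 4}, which are disjoint, so a point query at the call site reports no alias. If the callee then reads p[1] through the first argument, that access covers {4, 4} and overlaps q[0] through the second, so marking either parameter noalias would be wrong.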

Comment at: lib/Transforms/IPO/FunctionAttrsTD.cpp:82
@@ +81,3 @@
+ bool MadeChange = false;
+ if (F.isDeclaration() || !F.hasLocalLinkage() || F.arg_empty() ||
+     F.use_empty())

Stylistically, I'd prefer this separated into two checks. You're
combining correctness and profitability criteria. Also, rather than
"return MadeChange", just "return false" for the early exits. It
makes the flow clearer.

Comment at: lib/Transforms/IPO/FunctionAttrsTD.cpp:91
@@ +90,3 @@
+ bool IsAlign;
+ uint64_t MaxSize, MaxAlign;
+

Please rename MaxSize. It appears to actually be
MaxKnownDereferenceable and the current name is highly confusing.

Comment at: lib/Transforms/IPO/FunctionAttrsTD.cpp:112
@@ +111,3 @@
+ SmallVector<AttrState, 16> ArgumentState;
+ ArgumentState.resize(F.arg_size());
+

A comment here that reminds the reader the default state is fully
speculative/optimistic would be helpful.

Comment at: lib/Transforms/IPO/FunctionAttrsTD.cpp:114
@@ +113,3 @@
+
+ for (Use &U : F.uses()) {
+   User *UR = U.getUser();

Do you need to iterate the uses, or can you iterate the users
directly?

Comment at: lib/Transforms/IPO/FunctionAttrsTD.cpp:125
@@ +124,3 @@
+ CallSite CS(cast<Instruction>(UR));
+ if (!CS.isCallee(&U))
+   return MadeChange;

Shouldn't this be F?

Comment at: lib/Transforms/IPO/FunctionAttrsTD.cpp:131
@@ +130,3 @@
+ Function::arg_iterator Arg = F.arg_begin();
+ for (unsigned i = 0, e = ArgumentState.size(); i != e; ++i, ++AI, ++Arg) {
+   // If we've already excluded this argument, ignore it.

A general point: this function is *way* too complex. Split it into
reasonable helper functions. One arrangement would be:
struct ArgState
updateArgAtCallSite(...) {}
updateForCallSite(...) {}
identifyCallSites(...) {}
addIndicatedAttributes(...) {}

Comment at: lib/Transforms/IPO/FunctionAttrsTD.cpp:157
@@ +156,3 @@
+ ArgumentState[i].IsDereferenceable = false;
+ // FIXME: isSafeToLoadUnconditionally does not understand memset.
+ // FIXME: We can use getObjectSize for most things, but for mallocs

Is this a correctness bug? Or a missed optimization? I can't tell
from your FIXME.

Comment at: lib/Transforms/IPO/FunctionAttrsTD.cpp:170
@@ +169,3 @@
+ // trap, so we can use at least that size.
+ Size = std::max(Size, TypeSize);
+ }

Er, this doesn't look right. We've verified that loading from the
pointer *up to* a certain number of bytes is safe. We have not said
anything beyond that.

Comment at: lib/Transforms/IPO/FunctionAttrsTD.cpp:175
@@ +174,3 @@
+ if (!ArgumentState[i].MaxSize)
+   ArgumentState[i].IsDereferenceable = false;
+ }

It looks like IsDereferenceable could become a function of MaxSize on
the state object?

Comment at: lib/Transforms/IPO/FunctionAttrsTD.cpp:184
@@ +183,3 @@
+ unsigned Align = getKnownAlignment(*AI, DL);
+ if (Align > DL->getABITypeAlignment(ETy))
+   ArgumentState[i].MaxAlign =

I don't understand why you need this check. Is it safe to record an
align attribute less than the ABI alignment? If not, that seems
slightly odd.

Comment at: lib/Transforms/IPO/FunctionAttrsTD.cpp:236
@@ +235,3 @@
+
+ bool HaveCand = false;
+ for (unsigned i = 0, e = ArgumentState.size(); i != e; ++i)

I'm not clear why this check is needed. Aren't you only iterating
'number of argument' times anyways?

Comment at: lib/Transforms/IPO/FunctionAttrsTD.cpp:249
@@ +248,3 @@
+ for (unsigned i = 0, e = ArgumentState.size(); i != e; ++i, ++AI) {
+   if (!ArgumentState[i].any() || AI->hasInAllocaAttr() ||
+       AI->hasByValAttr())
+     continue;

The first is just an early exit, but what are the second two checks
for? Are they functionally required?

http://reviews.llvm.org/D4609

----- Original Message -----

From: "Hal Finkel" <hfinkel@anl.gov>
To: reviews+D4609+public+b1f539b8c63822a5@reviews.llvm.org
Cc: llvm-commits@cs.uiuc.edu
Sent: Wednesday, September 10, 2014 1:13:01 PM
Subject: Re: [PATCH] Top-Down FunctionAttrs propagation for noalias, dereferenceable and nonnull inference

Actually, I take that back. It does not make the capture checking simpler.

-Hal

Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

llvm-commits mailing list
llvm-commits@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

reames removed a reviewer: reames.Dec 29 2014, 6:22 PM

chandlerc removed a reviewer: chandlerc.Mar 29 2015, 12:53 PM

Did you push it to master?

haicheng added a subscriber: haicheng.Jun 9 2017, 11:14 AM

Hi Hal,

I am wondering what your plan for this patch is. I am particularly interested in the propagation of noalias attributes.

Thank you,

Haicheng

hfinkel added a reviewer: chandlerc.Aug 6 2017, 11:50 AM

In D4609#801397, @haicheng wrote:

Hi Hal,

I am wondering what your plan for this patch is. I am particularly interested in the propagation of noalias attributes.

I've not thought about this in a while, but I'd certainly like to see this functionality. Aside from some refactoring, it needs two things:

1. The noalias inference needs to be updated to be correct. The easiest way to do this is, instead of actually doing an alias check on the arguments, to call GetUnderlyingObjects and check for disjoint sets of *identified* underlying objects (still checking for capturing, and possibly restricting to local identified objects unless we can do a global capture check). Any set that is disjoint from the others means that its associated argument can be marked as noalias.

2. I'd like to revisit how the iteration is done and how this fits with the rest of the pass manager. While it does naturally iterate top down, and the CGSCC pass manager normally iterates the other way, there should be a way to put this logic into the part of the pipeline that is iterating with the inliner. I was never satisfied with the idea that we'd only do this once at the beginning of the pipeline. Now that we have a new pass manager, we can think about how this might work in that framework as well. Also, the new pass manager should make it more natural to get a per-function dominator tree, etc. @chandlerc, thoughts on how this might best be done?
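
The disjoint-set check for identified underlying objects could look roughly like the sketch below. It assumes each argument has already been summarized as a set of integer object IDs, which is a stand-in for whatever GetUnderlyingObjects would return; ObjectSet, intersects and disjointArgs are hypothetical names, and capture checking is deliberately left out.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Each argument is summarized by the IDs of the identified objects it may be
// based on. An empty set stands for "unknown" and is never treated as
// disjoint. The integer IDs are a stand-in for real underlying objects.
using ObjectSet = std::set<int>;

bool intersects(const ObjectSet &A, const ObjectSet &B) {
  for (int Obj : A)
    if (B.count(Obj))
      return true;
  return false;
}

// Returns, per argument, whether its object set is disjoint from every other
// argument's set at this call site. Capture checking is out of scope here.
std::vector<bool> disjointArgs(const std::vector<ObjectSet> &Args) {
  std::vector<bool> Result(Args.size(), false);
  for (std::size_t I = 0; I < Args.size(); ++I) {
    if (Args[I].empty())
      continue; // unknown provenance: stay conservative
    bool Disjoint = true;
    for (std::size_t J = 0; J < Args.size() && Disjoint; ++J)
      if (J != I && (Args[J].empty() || intersects(Args[I], Args[J])))
        Disjoint = false;
    Result[I] = Disjoint;
  }
  return Result;
}
```

An argument would only be a noalias candidate if this holds at every call site, and an argument with unknown provenance also blocks the other arguments at that call site.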

That is great. I rebased your code and replaced the alias check with a check for distinct sets of underlying objects. It is working for most benchmarks in llvm-test-suite and spec20xx.

It occurs to me that the need for this might be removed by another potential improvement (suggested to me by Chandler at EuroLLVM this year): we can make CGSCC analysis wrappers for our current analysis passes in order to allow these analyses to look back through function arguments into the function's callers. To make this work, we'd:

1. Make CGSCC analysis passes corresponding to our current analysis passes. For example, AA, ValueTracking, etc. (I realize that ValueTracking is not really an analysis pass right now, and while we might want to change that at some point, I don't think it matters for this description).
2. Have these passes accept analysis queries and produce results by looking at the calling functions, calling the function/local analysis on the values at each call site, and then intersecting the results. If there are too many call sites, or unknown call sites, then you need to give up.
3. Provide these CGSCC analysis handles to the corresponding local analyses so that, as they do a recursive analysis, when they hit arguments, they can call into the CGSCC analyses to continue the analysis into the callers.
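
The "intersect over callers, or give up" step might be sketched as below for a known-alignment query. This is a hypothetical model, not an existing LLVM interface: alignFromCallers is an invented name, 0 is used to mean "unknown", and the MaxCallSites budget of 8 is an arbitrary illustrative value.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// SiteAlign[i] is the alignment a local analysis proved for the actual
// argument at call site i, with 0 meaning "unknown call site or no answer".
// Returns the alignment that holds for every caller, or 0 when the query
// gives up: no call sites, too many call sites, or any unknown one.
unsigned alignFromCallers(const std::vector<unsigned> &SiteAlign,
                          std::size_t MaxCallSites = 8) {
  if (SiteAlign.empty() || SiteAlign.size() > MaxCallSites)
    return 0;
  unsigned Result = SiteAlign.front();
  for (unsigned A : SiteAlign) {
    if (A == 0)
      return 0; // one unknown caller invalidates the whole answer
    Result = std::min(Result, A); // intersection of power-of-two alignments
  }
  return Result;
}
```

The budget keeps the cost of a single query bounded, at the price of losing information for widely called functions.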

If we did that, do you think we'd still need this transformation?

What you described is close to what I was looking for before I found this patch. I don't know how difficult it would be to implement, but I can try to hook CGSCC up with BasicAA first.

Thank you.

Haicheng

etherzhhb added a subscriber: etherzhhb.Nov 16 2017, 11:44 PM

etherzhhb added inline comments.

lib/Transforms/IPO/FunctionAttrsTD.cpp
319	This is unused?

Hi Hal,

I spent some time thinking about the alternative method you mentioned here. I think the alternative can cover many of the cases that this patch tries to solve, but this patch could still bring some additional advantages.

1. The new pass provides a place dedicated to top-down propagation. It is simpler and easier to maintain. We can add more attributes to propagate in the future.
2. Not only do the direct users of AA or value tracking benefit from this patch; any pass that runs after this new pass can easily get access to the propagated attributes.
3. The alternative has a long way to go, but your patch is almost there. I spent some time fixing the bugs and it is working for most cases now.

I am thinking maybe the two methods can co-exist. This patch can cut the number of recursive calls, and together they can cover more cases. Do you think it is worthwhile for me to continue working on this patch?

jdoerfert mentioned this in D50125: [FunctionAttrs] Annotate function arguments with call site information.Aug 1 2018, 10:46 AM

Revision Contents

Path                                            Size
include/llvm-c/Transforms/IPO.h                 3 lines
include/llvm/InitializePasses.h                 1 line
include/llvm/LinkAllPasses.h                    1 line
include/llvm/Transforms/IPO.h                   6 lines
lib/LTO/LTOCodeGenerator.cpp                    1 line
lib/Transforms/IPO/CMakeLists.txt               1 line
lib/Transforms/IPO/FunctionAttrsTD.cpp          345 lines
lib/Transforms/IPO/IPO.cpp                      5 lines
lib/Transforms/IPO/PassManagerBuilder.cpp       4 lines
test/Transforms/FunctionAttrsTD/large-agg.ll    77 lines
test/Transforms/FunctionAttrsTD/malloc.ll       42 lines

Diff 11803

include/llvm-c/Transforms/IPO.h

[34 lines not shown]
 void LLVMAddConstantMergePass(LLVMPassManagerRef PM);

 /** See llvm::createDeadArgEliminationPass function. */
 void LLVMAddDeadArgEliminationPass(LLVMPassManagerRef PM);

 /** See llvm::createFunctionAttrsPass function. */
 void LLVMAddFunctionAttrsPass(LLVMPassManagerRef PM);

+/** See llvm::createFunctionAttrsTDPass function. */
+void LLVMAddFunctionAttrsTDPass(LLVMPassManagerRef PM);
+
 /** See llvm::createFunctionInliningPass function. */
 void LLVMAddFunctionInliningPass(LLVMPassManagerRef PM);

 /** See llvm::createAlwaysInlinerPass function. */
 void LLVMAddAlwaysInlinerPass(LLVMPassManagerRef PM);

 /** See llvm::createGlobalDCEPass function. */
 void LLVMAddGlobalDCEPass(LLVMPassManagerRef PM);
[31 lines not shown]

include/llvm/InitializePasses.h

[123 lines not shown]
 void initializeMemorySanitizerPass(PassRegistry&);
 void initializeThreadSanitizerPass(PassRegistry&);
 void initializeDataFlowSanitizerPass(PassRegistry&);
 void initializeScalarizerPass(PassRegistry&);
 void initializeEarlyCSEPass(PassRegistry&);
 void initializeExpandISelPseudosPass(PassRegistry&);
 void initializeFindUsedTypesPass(PassRegistry&);
 void initializeFunctionAttrsPass(PassRegistry&);
+void initializeFunctionAttrsTDPass(PassRegistry&);
 void initializeGCMachineCodeAnalysisPass(PassRegistry&);
 void initializeGCModuleInfoPass(PassRegistry&);
 void initializeGVNPass(PassRegistry&);
 void initializeGlobalDCEPass(PassRegistry&);
 void initializeGlobalOptPass(PassRegistry&);
 void initializeGlobalsModRefPass(PassRegistry&);
 void initializeIPCPPass(PassRegistry&);
 void initializeIPSCCPPass(PassRegistry&);
[144 lines not shown]

include/llvm/LinkAllPasses.h

[136 lines not shown] ForcePassLinking() {
       (void)llvm::createMergedLoadStoreMotionPass();
       (void) llvm::createGVNPass();
       (void) llvm::createMemCpyOptPass();
       (void) llvm::createLoopDeletionPass();
       (void) llvm::createPostDomTree();
       (void) llvm::createInstructionNamerPass();
       (void) llvm::createMetaRenamerPass();
       (void) llvm::createFunctionAttrsPass();
+      (void) llvm::createFunctionAttrsTDPass();
       (void) llvm::createMergeFunctionsPass();
       (void) llvm::createPrintModulePass(*(llvm::raw_ostream*)nullptr);
       (void) llvm::createPrintFunctionPass(*(llvm::raw_ostream*)nullptr);
       (void) llvm::createPrintBasicBlockPass(*(llvm::raw_ostream*)nullptr);
       (void) llvm::createModuleDebugInfoPrinterPass();
       (void) llvm::createPartialInliningPass();
       (void) llvm::createLintPass();
       (void) llvm::createSinkingPass();
[24 lines not shown]

include/llvm/Transforms/IPO.h

[173 lines not shown]
 /// createFunctionAttrsPass - This pass discovers functions that do not access
 /// memory, or only read memory, and gives them the readnone/readonly attribute.
 /// It also discovers function arguments that are not captured by the function
 /// and marks them with the nocapture attribute.
 ///
 Pass *createFunctionAttrsPass();

 //===----------------------------------------------------------------------===//
+/// createFunctionAttrsTDPass - This pass discovers function arguments that
+/// are always nonnull, dereferenceable and/or noalias.
+///
+ModulePass *createFunctionAttrsTDPass();
+
+//===----------------------------------------------------------------------===//
 /// createMergeFunctionsPass - This pass discovers identical functions and
 /// collapses them.
 ///
 ModulePass *createMergeFunctionsPass();

 //===----------------------------------------------------------------------===//
 /// createPartialInliningPass - This pass inlines parts of functions.
 ///
[15 lines not shown]

lib/LTO/LTOCodeGenerator.cpp

Show All 99 Lines
void LTOCodeGenerator::initializeLTOPasses() {
initializePruneEHPass(R);		initializePruneEHPass(R);
initializeGlobalDCEPass(R);		initializeGlobalDCEPass(R);
initializeArgPromotionPass(R);		initializeArgPromotionPass(R);
initializeJumpThreadingPass(R);		initializeJumpThreadingPass(R);
initializeSROAPass(R);		initializeSROAPass(R);
initializeSROA_DTPass(R);		initializeSROA_DTPass(R);
initializeSROA_SSAUpPass(R);		initializeSROA_SSAUpPass(R);
initializeFunctionAttrsPass(R);		initializeFunctionAttrsPass(R);
		initializeFunctionAttrsTDPass(R);
initializeGlobalsModRefPass(R);		initializeGlobalsModRefPass(R);
initializeLICMPass(R);		initializeLICMPass(R);
initializeMergedLoadStoreMotionPass(R);		initializeMergedLoadStoreMotionPass(R);
initializeGVNPass(R);		initializeGVNPass(R);
initializeMemCpyOptPass(R);		initializeMemCpyOptPass(R);
initializeDCEPass(R);		initializeDCEPass(R);
initializeCFGSimplifyPassPass(R);		initializeCFGSimplifyPassPass(R);
}		}
Show All 461 Lines

lib/Transforms/IPO/CMakeLists.txt

	add_llvm_library(LLVMipo			add_llvm_library(LLVMipo
	ArgumentPromotion.cpp			ArgumentPromotion.cpp
	BarrierNoopPass.cpp			BarrierNoopPass.cpp
	ConstantMerge.cpp			ConstantMerge.cpp
	DeadArgumentElimination.cpp			DeadArgumentElimination.cpp
	ExtractGV.cpp			ExtractGV.cpp
	FunctionAttrs.cpp			FunctionAttrs.cpp
				FunctionAttrsTD.cpp
	GlobalDCE.cpp			GlobalDCE.cpp
	GlobalOpt.cpp			GlobalOpt.cpp
	IPConstantPropagation.cpp			IPConstantPropagation.cpp
	IPO.cpp			IPO.cpp
	InlineAlways.cpp			InlineAlways.cpp
	InlineSimple.cpp			InlineSimple.cpp
	Inliner.cpp			Inliner.cpp
	Internalize.cpp			Internalize.cpp
	Show All 10 Lines

lib/Transforms/IPO/FunctionAttrsTD.cpp

This file was added.

				//===----- FunctionAttrsTD.cpp - Deduce Function Parameter Attributes -----===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This pass traverses the call graph top-down in order to deduce
				// function-parameter attributes for local functions based on properties of
				// their callers (nonnull, dereferenceable, noalias).
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/Transforms/IPO.h"
				#include "llvm/ADT/DenseMap.h"
				#include "llvm/ADT/SmallPtrSet.h"
				#include "llvm/ADT/SCCIterator.h"
				#include "llvm/ADT/Statistic.h"
				#include "llvm/Analysis/AliasAnalysis.h"
				#include "llvm/Analysis/CallGraph.h"
				#include "llvm/Analysis/CaptureTracking.h"
				#include "llvm/Analysis/Loads.h"
				#include "llvm/Analysis/MemoryBuiltins.h"
				#include "llvm/Analysis/ValueTracking.h"
				#include "llvm/IR/DataLayout.h"
				#include "llvm/IR/Dominators.h"
				#include "llvm/IR/GlobalVariable.h"
				#include "llvm/IR/Module.h"
				#include "llvm/Pass.h"
				#include "llvm/Support/CommandLine.h"
				#include "llvm/Support/DataTypes.h"
				#include "llvm/Support/Debug.h"
				#include "llvm/Target/TargetLibraryInfo.h"
				#include "llvm/Transforms/Utils/Local.h"
				using namespace llvm;

				#define DEBUG_TYPE "functionattrs-td"

				STATISTIC(NumNonNull, "Number of arguments marked nonnull");
				STATISTIC(NumNoAlias, "Number of arguments marked noalias");
				STATISTIC(NumDereferenceable, "Number of arguments marked dereferenceable");
				STATISTIC(NumAlign, "Number of arguments marked with enhanced alignment");

				namespace {
				class FunctionAttrsTD : public ModulePass {
				const DataLayout *DL;
				AliasAnalysis *AA;
				TargetLibraryInfo *TLI;

				bool addFuncAttrs(Function &F, DenseMap<Function *, DominatorTree *> &DTs);

				public:
				static char ID; // Pass identification, replacement for typeid
				explicit FunctionAttrsTD();
				bool runOnModule(Module &M) override;

				void getAnalysisUsage(AnalysisUsage &AU) const override {
				AU.setPreservesCFG();
				AU.addRequired<CallGraphWrapperPass>();
				AU.addPreserved<CallGraphWrapperPass>();
				AU.addRequired<TargetLibraryInfo>();
				AU.addPreserved<TargetLibraryInfo>();
				AU.addRequired<AliasAnalysis>();
				AU.addPreserved<AliasAnalysis>();
				}
				};
				} // end anonymous namespace

				char FunctionAttrsTD::ID = 0;
				INITIALIZE_PASS(FunctionAttrsTD, "functionattrs-td",
				"Top-down function parameter attributes", false, false)

				FunctionAttrsTD::FunctionAttrsTD() : ModulePass(ID) {
				initializeFunctionAttrsTDPass(*PassRegistry::getPassRegistry());
				}

				bool FunctionAttrsTD::addFuncAttrs(Function &F,
				DenseMap<Function *, DominatorTree *> &DTs) {
				bool MadeChange = false;
				if (F.isDeclaration() || !F.hasLocalLinkage() || F.arg_empty() ||
				reames (inline comment): Stylistically, I'd prefer this separated into two checks. You're combining correctness and profitability criteria. Also, rather than "return MadeChange", just "return false" for the early exits. It makes the flow clearer.
				F.use_empty())
				return MadeChange;

				struct AttrState {
				bool IsNonNull;
				bool IsNoAlias;
				bool IsDereferenceable;
				bool IsAlign;
				uint64_t MaxSize, MaxAlign;
				reames (inline comment): Please rename MaxSize. It appears to actually be MaxKnownDereferenceable, and the current name is highly confusing.

				AttrState() : IsNonNull(true), IsNoAlias(true),
				IsDereferenceable(true), IsAlign(true),
				MaxSize(UINT64_MAX), MaxAlign(UINT64_MAX) {}

				bool any() const {
				return IsNonNull || IsNoAlias || IsDereferenceable || IsAlign;
				}

				void setNone() {
				IsNonNull = false;
				IsNoAlias = false;
				IsDereferenceable = false;
				IsAlign = false;
				}
				};

				// For each argument, keep track of whether it is donated or not.
				// The bool is set to true when found to be non-donated.
				SmallVector<AttrState, 16> ArgumentState;
				ArgumentState.resize(F.arg_size());
				reames (inline comment): A comment here that reminds the reader the default state is fully speculative/optimistic would be helpful.

				for (Use &U : F.uses()) {
				reames (inline comment): Do you need to iterate the uses, or can you iterate the users directly?
				User *UR = U.getUser();
				// Ignore blockaddress uses.
				if (isa<BlockAddress>(UR)) continue;

				// Used by a non-instruction, or not the callee of a function, do not
				// transform.
				if (!isa<CallInst>(UR) && !isa<InvokeInst>(UR))
				return MadeChange;

				CallSite CS(cast<Instruction>(UR));
				if (!CS.isCallee(&U))
				reames (inline comment): Shouldn't this be F?
				return MadeChange;

				// Check out all of the arguments. Note that we don't inspect varargs here.
				CallSite::arg_iterator AI = CS.arg_begin();
				Function::arg_iterator Arg = F.arg_begin();
				for (unsigned i = 0, e = ArgumentState.size(); i != e; ++i, ++AI, ++Arg) {
				reames (inline comment): A general point: this function is way too complex. Split it into reasonable helper functions. One arrangement would be: struct ArgState updateArgAtCallSite(...) {} updateForCallSite(...) {} identifyCallSites(...) {} addIndicatedAttributes(...) {}
				// If we've already excluded this argument, ignore it.
				if (!ArgumentState[i].any())
				continue;

				Type *Ty = (*AI)->getType();
				if (!Ty->isPointerTy()) {
				ArgumentState[i].setNone();
				continue;
				}

				if (ArgumentState[i].IsNonNull && !Arg->hasNonNullAttr()) {
				if (!isKnownNonNull(*AI, TLI))
				ArgumentState[i].IsNonNull = false;
				}

				Type *ETy = Ty->getPointerElementType();
				if (!DL || !ETy->isSized()) {
				ArgumentState[i].IsDereferenceable = false;
				} else if (ArgumentState[i].IsDereferenceable &&
				ArgumentState[i].MaxSize > Arg->getDereferenceableBytes()) {
				uint64_t TypeSize = DL->getTypeStoreSize(ETy);
				if (!(*AI)->isDereferenceablePointer(DL) &&
				!isSafeToLoadUnconditionally(*AI, CS.getInstruction(),
				getKnownAlignment(*AI, DL), DL)) {
				ArgumentState[i].IsDereferenceable = false;
				// FIXME: isSafeToLoadUnconditionally does not understand memset.
				reames (inline comment): Is this a correctness bug? Or a missed optimization? I can't tell from your FIXME.
				// FIXME: We can use getObjectSize for most things, but for mallocs
				// we need to make sure that the answer is nonnull. We can use the
				// existing logic that checks for nonnull for that.
				} else {
				uint64_t Size;
				if (!getObjectSize(*AI, Size, DL, TLI)) {
				// FIXME: getObjectSize does not use the dereferenceable attribute
				// to get the size when possible.
				Size = TypeSize;
				} else {
				// We've already verified that a load of the type size would not
				// trap, so we can use at least that size.
				Size = std::max(Size, TypeSize);
				reames (inline comment): Er, this doesn't look right. We've verified that loading from the pointer up to a certain number of bytes is safe. We have not said anything beyond that.
				}

				ArgumentState[i].MaxSize = std::min(ArgumentState[i].MaxSize, Size);
				if (!ArgumentState[i].MaxSize)
				ArgumentState[i].IsDereferenceable = false;
				reames (inline comment): It looks like IsDereferenceable could become a function of MaxSize on the state object?
				}
				}

				if (!DL || !ETy->isSized() || Arg->hasByValOrInAllocaAttr()) {
				ArgumentState[i].IsAlign = false;
				} else if (ArgumentState[i].IsAlign &&
				ArgumentState[i].MaxAlign > Arg->getParamAlignment()) {
				unsigned Align = getKnownAlignment(*AI, DL);
				if (Align > DL->getABITypeAlignment(ETy))
				reames (inline comment): I don't understand why you need this check. Is it safe to record an align attribute less than the ABI alignment? If not, that seems slightly odd.
				ArgumentState[i].MaxAlign =
				std::min(ArgumentState[i].MaxAlign, (uint64_t) Align);
				else
				ArgumentState[i].IsAlign = false;
				}

				if (isa<GlobalValue>(*AI)) {
				ArgumentState[i].IsNoAlias = false;
				} else if (ArgumentState[i].IsNoAlias && !Arg->hasNoAliasAttr()) {
				bool NoAlias = true;
				for (CallSite::arg_iterator BI = CS.arg_begin(), BIE = CS.arg_end();
				BI != BIE; ++BI) {
				if (BI == AI)
				continue;

				if (AA->alias(*AI, *BI) != AliasAnalysis::NoAlias) {
				NoAlias = false;
				break;
				}
				}

				SmallVector<Value*, 16> UObjects;
				GetUnderlyingObjects(*AI, UObjects, DL);
				for (Value *V : UObjects)
				if (!isIdentifiedFunctionLocal(V)) {
				NoAlias = false;
				break;
				}

				if (NoAlias) {
				DominatorTree *DT;
				Function *Caller = CS.getInstruction()->getParent()->getParent();
				auto DTI = DTs.find(Caller);
				if (DTI == DTs.end()) {
				DT = new DominatorTree;
				DT->recalculate(*Caller);
				DTs[Caller] = DT;
				} else {
				DT = DTI->second;
				}

				if (PointerMayBeCapturedBefore(*AI, /* ReturnCaptures */ false,
				/* StoreCaptures */ false,
				CS.getInstruction(), DT))
				ArgumentState[i].IsNoAlias = false;
				} else {
				ArgumentState[i].IsNoAlias = false;
				}
				}
				}

				bool HaveCand = false;
				reames (inline comment): I'm not clear why this check is needed. Aren't you only iterating 'number of argument' times anyways?
				for (unsigned i = 0, e = ArgumentState.size(); i != e; ++i)
				if (ArgumentState[i].any()) {
				HaveCand = true;
				break;
				}

				if (!HaveCand)
				break;
				}

				Function::arg_iterator AI = F.arg_begin();
				for (unsigned i = 0, e = ArgumentState.size(); i != e; ++i, ++AI) {
				if (!ArgumentState[i].any() || AI->hasInAllocaAttr() || AI->hasByValAttr())
				reames (inline comment): The first is just an early exit, but what are the second two checks for? Are they functionally required?
				continue;

				if (!AI->hasNonNullAttr() && ArgumentState[i].IsNonNull &&
				(!ArgumentState[i].IsDereferenceable ||
				AI->getType()->getPointerAddressSpace() != 0)) {
				AttrBuilder B;
				B.addAttribute(Attribute::NonNull);

				AI->addAttr(AttributeSet::get(F.getContext(), AI->getArgNo() + 1, B));
				++NumNonNull;
				MadeChange = true;
				}

				if (!AI->hasNoAliasAttr() && ArgumentState[i].IsNoAlias) {
				AttrBuilder B;
				B.addAttribute(Attribute::NoAlias);

				AI->addAttr(AttributeSet::get(F.getContext(), AI->getArgNo() + 1, B));
				++NumNoAlias;
				MadeChange = true;
				}

				if (AI->getDereferenceableBytes() < ArgumentState[i].MaxSize &&
				ArgumentState[i].IsDereferenceable) {
				AttrBuilder B;
				B.addDereferenceableAttr(ArgumentState[i].MaxSize);

				AI->addAttr(AttributeSet::get(F.getContext(), AI->getArgNo() + 1, B));
				++NumDereferenceable;
				MadeChange = true;
				}

				if (AI->getParamAlignment() < ArgumentState[i].MaxAlign &&
				ArgumentState[i].IsAlign) {
				AttrBuilder B;
				B.addAlignmentAttr(ArgumentState[i].MaxAlign);

				AI->addAttr(AttributeSet::get(F.getContext(), AI->getArgNo() + 1, B));
				++NumAlign;
				MadeChange = true;
				}
				}

				return MadeChange;
				}

				bool FunctionAttrsTD::runOnModule(Module &M) {
				AA = &getAnalysis<AliasAnalysis>();
				TLI = &getAnalysis<TargetLibraryInfo>();
				DataLayoutPass *DLP = getAnalysisIfAvailable<DataLayoutPass>();
				DL = DLP ? &DLP->getDataLayout() : nullptr;

				CallGraph &CG = getAnalysis<CallGraphWrapperPass>().getCallGraph();
				bool Changed = false;

				// We want to iterate top-down, but also want to do a fixed-point iteration
				// on SCCs to ensure a stable answer. The regular SCC iterator returns the
				// SCCs bottom-up (it uses Tarjan's Algorithm, and knows it has found the
				// root of an SCC as it passes it on the way up -- it is not clear there is a
				// better way). Because we can't collect SCCs in a purely top-down fashion
				// anyway, just store those returned by the bottom-up SCC iterator and visit
				// this stored list in reverse order.
				SmallVector<SmallVector<CallGraphNode *, 16>, 8> CGSCCs;
				for (scc_iterator<CallGraph*> CGI = scc_begin(&CG), CGIE = scc_end(&CG);
				CGI != CGIE; ++CGI) {
				CGSCCs.push_back(SmallVector<CallGraphNode *, 16>(CGI->begin(),
				CGI->end()));
				}

				scc_iterator<CallGraph*> CGI = scc_begin(&CG);
				etherzhhb (inline comment): This is unused?

				DenseMap<Function *, DominatorTree *> DTs;
				for (auto S = CGSCCs.rbegin(), SE = CGSCCs.rend(); S != SE; ++S) {
				// FIXME: We could get a better answer by speculating the attributes within
				// each SCC (meaning setting all attributes that might be set, and then
				// removing those found to be unprovable at some call sites of the SCC's
				// functions).

				bool ChangedThisSCC;
				do {
				ChangedThisSCC = false;
				for (CallGraphNode *CGN : *S) {
				Function *F = CGN->getFunction();
				if (!F)
				continue;
				DEBUG(dbgs() << "FATD visiting: " << F->getName() << "\n");
				Changed |= (ChangedThisSCC |= addFuncAttrs(*F, DTs));
				}
				} while (ChangedThisSCC);
				}

				return Changed;
				}

				ModulePass *llvm::createFunctionAttrsTDPass() { return new FunctionAttrsTD(); }
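
The comment in runOnModule describes collecting SCCs bottom-up and then visiting the stored list in reverse. A minimal standalone sketch of that ordering follows. It does not use LLVM's scc_iterator; the Graph alias and the functions bottomUpSCCs and topDownSCCs are hypothetical names introduced only for illustration, and the per-SCC fixed-point loop from the patch is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for a call graph: Callees[f] lists the functions f calls.
using Graph = std::vector<std::vector<int>>;

// Tarjan's algorithm. An SCC is emitted only after the DFS has returned from
// everything it reaches, so SCCs come out callees-first (bottom-up).
std::vector<std::vector<int>> bottomUpSCCs(const Graph &Callees) {
  const int N = static_cast<int>(Callees.size());
  std::vector<int> Index(N, -1), Low(N, 0), Stack;
  std::vector<bool> OnStack(N, false);
  std::vector<std::vector<int>> SCCs;
  int Next = 0;

  std::function<void(int)> Visit = [&](int V) {
    Index[V] = Low[V] = Next++;
    Stack.push_back(V);
    OnStack[V] = true;
    for (int W : Callees[V]) {
      assert(W >= 0 && W < N && "callee index out of range");
      if (Index[W] < 0) {
        Visit(W);
        Low[V] = std::min(Low[V], Low[W]);
      } else if (OnStack[W]) {
        Low[V] = std::min(Low[V], Index[W]);
      }
    }
    if (Low[V] == Index[V]) { // V is the root of an SCC: pop its members.
      std::vector<int> SCC;
      int W;
      do {
        W = Stack.back();
        Stack.pop_back();
        OnStack[W] = false;
        SCC.push_back(W);
      } while (W != V);
      SCCs.push_back(SCC);
    }
  };

  for (int V = 0; V < N; ++V)
    if (Index[V] < 0)
      Visit(V);
  return SCCs;
}

// Callers-first order: keep the bottom-up list and walk it backwards.
std::vector<std::vector<int>> topDownSCCs(const Graph &Callees) {
  std::vector<std::vector<int>> SCCs = bottomUpSCCs(Callees);
  std::reverse(SCCs.begin(), SCCs.end());
  return SCCs;
}
```

For a graph in which function 0 calls 1, functions 1 and 2 call each other, and 2 calls 3, topDownSCCs yields the SCC containing 0 first, then the SCC containing 1 and 2, then the SCC containing 3, so every caller outside an SCC is visited before the SCC itself.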

lib/Transforms/IPO/IPO.cpp

	Show All 21 Lines
	using namespace llvm;			using namespace llvm;

	void llvm::initializeIPO(PassRegistry &Registry) {			void llvm::initializeIPO(PassRegistry &Registry) {
	initializeArgPromotionPass(Registry);			initializeArgPromotionPass(Registry);
	initializeConstantMergePass(Registry);			initializeConstantMergePass(Registry);
	initializeDAEPass(Registry);			initializeDAEPass(Registry);
	initializeDAHPass(Registry);			initializeDAHPass(Registry);
	initializeFunctionAttrsPass(Registry);			initializeFunctionAttrsPass(Registry);
				initializeFunctionAttrsTDPass(Registry);
	initializeGlobalDCEPass(Registry);			initializeGlobalDCEPass(Registry);
	initializeGlobalOptPass(Registry);			initializeGlobalOptPass(Registry);
	initializeIPCPPass(Registry);			initializeIPCPPass(Registry);
	initializeAlwaysInlinerPass(Registry);			initializeAlwaysInlinerPass(Registry);
	initializeSimpleInlinerPass(Registry);			initializeSimpleInlinerPass(Registry);
	initializeInternalizePassPass(Registry);			initializeInternalizePassPass(Registry);
	initializeLoopExtractorPass(Registry);			initializeLoopExtractorPass(Registry);
	initializeBlockExtractorPassPass(Registry);			initializeBlockExtractorPassPass(Registry);
	Show All 24 Lines
	void LLVMAddDeadArgEliminationPass(LLVMPassManagerRef PM) {			void LLVMAddDeadArgEliminationPass(LLVMPassManagerRef PM) {
	unwrap(PM)->add(createDeadArgEliminationPass());			unwrap(PM)->add(createDeadArgEliminationPass());
	}			}

	void LLVMAddFunctionAttrsPass(LLVMPassManagerRef PM) {			void LLVMAddFunctionAttrsPass(LLVMPassManagerRef PM) {
	unwrap(PM)->add(createFunctionAttrsPass());			unwrap(PM)->add(createFunctionAttrsPass());
	}			}

				void LLVMAddFunctionAttrsTDPass(LLVMPassManagerRef PM) {
				unwrap(PM)->add(createFunctionAttrsTDPass());
				}

	void LLVMAddFunctionInliningPass(LLVMPassManagerRef PM) {			void LLVMAddFunctionInliningPass(LLVMPassManagerRef PM) {
	unwrap(PM)->add(createFunctionInliningPass());			unwrap(PM)->add(createFunctionInliningPass());
	}			}

	void LLVMAddAlwaysInlinerPass(LLVMPassManagerRef PM) {			void LLVMAddAlwaysInlinerPass(LLVMPassManagerRef PM) {
	unwrap(PM)->add(llvm::createAlwaysInlinerPass());			unwrap(PM)->add(llvm::createAlwaysInlinerPass());
	}			}

	Show All 34 Lines

lib/Transforms/IPO/PassManagerBuilder.cpp

Show All 158 Lines
if (!DisableUnitAtATime) {
MPM.add(createIPSCCPPass()); // IP SCCP		MPM.add(createIPSCCPPass()); // IP SCCP
MPM.add(createGlobalOptimizerPass()); // Optimize out global vars		MPM.add(createGlobalOptimizerPass()); // Optimize out global vars

MPM.add(createDeadArgEliminationPass()); // Dead argument elimination		MPM.add(createDeadArgEliminationPass()); // Dead argument elimination

MPM.add(createInstructionCombiningPass());// Clean up after IPCP & DAE		MPM.add(createInstructionCombiningPass());// Clean up after IPCP & DAE
addExtensionsToPM(EP_Peephole, MPM);		addExtensionsToPM(EP_Peephole, MPM);
MPM.add(createCFGSimplificationPass()); // Clean up after IPCP & DAE		MPM.add(createCFGSimplificationPass()); // Clean up after IPCP & DAE

		MPM.add(createFunctionAttrsPass()); // Set nocapture attr
		MPM.add(createFunctionAttrsTDPass()); // Set noalias attr
}		}

// Start of CallGraph SCC passes.		// Start of CallGraph SCC passes.
if (!DisableUnitAtATime)		if (!DisableUnitAtATime)
MPM.add(createPruneEHPass()); // Remove dead EH info		MPM.add(createPruneEHPass()); // Remove dead EH info
if (Inliner) {		if (Inliner) {
MPM.add(Inliner);		MPM.add(Inliner);
Inliner = nullptr;		Inliner = nullptr;
Show All 165 Lines
void PassManagerBuilder::populateLTOPassManager(PassManagerBase &PM,
// Break up allocas		// Break up allocas
if (UseNewSROA)		if (UseNewSROA)
PM.add(createSROAPass());		PM.add(createSROAPass());
else		else
PM.add(createScalarReplAggregatesPass());		PM.add(createScalarReplAggregatesPass());

// Run a few AA driven optimizations here and now, to cleanup the code.		// Run a few AA driven optimizations here and now, to cleanup the code.
PM.add(createFunctionAttrsPass()); // Add nocapture.		PM.add(createFunctionAttrsPass()); // Add nocapture.
		PM.add(createFunctionAttrsTDPass()); // Add noalias.
PM.add(createGlobalsModRefPass()); // IP alias analysis.		PM.add(createGlobalsModRefPass()); // IP alias analysis.

PM.add(createLICMPass()); // Hoist loop invariants.		PM.add(createLICMPass()); // Hoist loop invariants.
PM.add(createMergedLoadStoreMotionPass()); // Merge load/stores in diamonds		PM.add(createMergedLoadStoreMotionPass()); // Merge load/stores in diamonds
PM.add(createGVNPass(DisableGVNLoadPRE)); // Remove redundancies.		PM.add(createGVNPass(DisableGVNLoadPRE)); // Remove redundancies.
PM.add(createMemCpyOptPass()); // Remove dead memcpys.		PM.add(createMemCpyOptPass()); // Remove dead memcpys.

// Nuke dead stores.		// Nuke dead stores.
Show All 109 Lines

test/Transforms/FunctionAttrsTD/large-agg.ll

This file was added.

				; RUN: opt -S -basicaa -functionattrs-td < %s | FileCheck %s
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				%struct.s = type { [65280 x i32] }
				%struct.x = type opaque
				%struct.x.0 = type opaque

				; CHECK: define void @bar()

				; Function Attrs: nounwind uwtable
				define void @bar() #0 {
				entry:
				%x = alloca %struct.s, align 32
				%y = alloca %struct.s, align 4
				%0 = bitcast %struct.s* %x to i8*
				call void @llvm.lifetime.start(i64 261120, i8* %0) #1
				%1 = bitcast %struct.s* %y to i8*
				call void @llvm.lifetime.start(i64 261120, i8* %1) #1
				br label %for.body

				for.body: ; preds = %for.body, %entry
				%indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
				%arrayidx = getelementptr inbounds %struct.s* %x, i64 0, i32 0, i64 %indvars.iv
				%2 = trunc i64 %indvars.iv to i32
				store i32 %2, i32* %arrayidx, align 4, !tbaa !0
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%exitcond = icmp eq i64 %indvars.iv.next, 17800
				br i1 %exitcond, label %for.end, label %for.body

				for.end: ; preds = %for.body
				%3 = bitcast %struct.s* %y to %struct.x*
				call fastcc void @foo(%struct.s* %x, %struct.x* %3)
				call void @llvm.lifetime.end(i64 261120, i8* %1) #1
				call void @llvm.lifetime.end(i64 261120, i8* %0) #1
				ret void
				}

				; Function Attrs: nounwind
				declare void @llvm.lifetime.start(i64, i8* nocapture) #1

				; CHECK: define internal fastcc void @foo(%struct.s* noalias nocapture readonly align 32 dereferenceable(261120) %x, %struct.x* noalias %y)

				; Function Attrs: noinline nounwind uwtable
				define internal fastcc void @foo(%struct.s* nocapture readonly %x, %struct.x* %y) #2 {
				entry:
				br label %for.body

				for.body: ; preds = %for.body, %entry
				%indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
				%sum.05 = phi i32 [ 0, %entry ], [ %add, %for.body ]
				%arrayidx = getelementptr inbounds %struct.s* %x, i64 0, i32 0, i64 %indvars.iv
				%0 = load i32* %arrayidx, align 4, !tbaa !0
				%add = add nsw i32 %0, %sum.05
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%exitcond = icmp eq i64 %indvars.iv.next, 16000
				br i1 %exitcond, label %for.end, label %for.body

				for.end: ; preds = %for.body
				%1 = bitcast %struct.x* %y to %struct.x.0*
				tail call void @goo(i32 %add, %struct.x.0* %1) #1
				ret void
				}

				; Function Attrs: nounwind
				declare void @llvm.lifetime.end(i64, i8* nocapture) #1

				declare void @goo(i32, %struct.x.0*)

				attributes #0 = { nounwind uwtable }
				attributes #1 = { nounwind }
				attributes #2 = { noinline nounwind uwtable }

				!0 = metadata !{metadata !1, metadata !1, i64 0}
				!1 = metadata !{metadata !"int", metadata !2, i64 0}
				!2 = metadata !{metadata !"omnipotent char", metadata !3, i64 0}
				!3 = metadata !{metadata !"Simple C/C++ TBAA"}
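
The CHECK line for @foo above expects noalias and align 32 because the single call site in @bar proves both. A rough standalone sketch of how per-call-site facts could be folded into one optimistic summary is shown here. CallSiteFacts, ArgSummary and summarize are hypothetical names, loosely modelled on the patch's AttrState; the sketch deliberately ignores the ABI-alignment threshold, dereferenceable sizes, and the capture check.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// What one call site can prove about the pointer it passes (hypothetical type).
struct CallSiteFacts {
  bool NonNull;
  bool NoAlias;
  uint64_t Align; // known alignment in bytes
};

// Optimistic summary for one formal argument: start by assuming everything
// holds, then weaken it as call sites are folded in.
struct ArgSummary {
  bool NonNull = true;
  bool NoAlias = true;
  uint64_t Align = UINT64_MAX; // no call site seen yet

  // Fold in one call site: flags are ANDed, alignment takes the minimum.
  void meet(const CallSiteFacts &CS) {
    assert(CS.Align != 0 && "alignment must be at least 1");
    NonNull = NonNull && CS.NonNull;
    NoAlias = NoAlias && CS.NoAlias;
    Align = std::min(Align, CS.Align);
  }
};

// Summarise over every call site. With no call sites nothing is claimed,
// in the spirit of the patch's early exit for functions without uses.
ArgSummary summarize(const std::vector<CallSiteFacts> &Sites) {
  ArgSummary S;
  if (Sites.empty()) {
    S.NonNull = false;
    S.NoAlias = false;
    S.Align = 1;
    return S;
  }
  for (const CallSiteFacts &CS : Sites)
    S.meet(CS);
  return S;
}
```

With one call site that passes a nonnull, unaliased, 32-byte-aligned pointer, the summary keeps all three facts. Adding a second, hypothetical call site that passes a 4-byte-aligned pointer which may alias another argument would drop noalias and lower the alignment to 4.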

test/Transforms/FunctionAttrsTD/malloc.ll

This file was added.

				; RUN: opt -S -basicaa -functionattrs-td < %s | FileCheck %s
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; CHECK: define i32 @bar()

				; Function Attrs: nounwind uwtable
				define i32 @bar() #0 {
				entry:
				%call = tail call noalias i8* @malloc(i64 12345) #3
				%0 = bitcast i8* %call to i32*
				store i32 10, i32* %0, align 4, !tbaa !0
				tail call fastcc void @foo(i32* %0)
				ret i32 0
				}

				; Function Attrs: nounwind
				declare noalias i8* @malloc(i64) #1

				; CHECK: define internal fastcc void @foo(i32* noalias dereferenceable(12345) %x)

				; Function Attrs: noinline nounwind uwtable
				define internal fastcc void @foo(i32* %x) #2 {
				entry:
				%arrayidx = getelementptr inbounds i32* %x, i64 5
				store i32 10, i32* %arrayidx, align 4, !tbaa !0
				tail call void @goo(i32* %x) #3
				ret void
				}

				declare void @goo(i32*) #3

				attributes #0 = { nounwind uwtable }
				attributes #1 = { nounwind }
				attributes #2 = { noinline nounwind uwtable }
				attributes #3 = { nounwind }

				!0 = metadata !{metadata !1, metadata !1, i64 0}
				!1 = metadata !{metadata !"int", metadata !2, i64 0}
				!2 = metadata !{metadata !"omnipotent char", metadata !3, i64 0}
				!3 = metadata !{metadata !"Simple C/C++ TBAA"}
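
The dereferenceable(12345) expected by this test comes from the malloc size rather than the 4-byte i32 store size. The size rule in the patch as posted can be sketched as below, using plain integers instead of LLVM queries. SiteSize, bytesAtSite and dereferenceableBytes are hypothetical names; ObjectSizeKnown stands in for a successful getObjectSize call and SafeToLoad for the combined isDereferenceablePointer / isSafeToLoadUnconditionally check. Note that the use of the larger of the two sizes is exactly what the inline review comment questions, so this illustrates the posted behaviour, not a recommendation.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Per-call-site inputs (hypothetical type, plain numbers instead of LLVM queries).
struct SiteSize {
  bool SafeToLoad;      // a load of TypeSize bytes is known not to trap
  bool ObjectSizeKnown; // models a successful getObjectSize query
  uint64_t ObjectSize;  // only meaningful when ObjectSizeKnown is true
  uint64_t TypeSize;    // store size of the pointee type
};

// Bytes one call site lets us claim, following the patch as posted: the
// larger of object size and type size when the object size is known,
// otherwise just the type size.
uint64_t bytesAtSite(const SiteSize &S) {
  if (!S.SafeToLoad)
    return 0;
  return S.ObjectSizeKnown ? std::max(S.ObjectSize, S.TypeSize) : S.TypeSize;
}

// The attribute must hold for every caller, so take the minimum over all
// call sites. A result of 0 means "do not add dereferenceable".
uint64_t dereferenceableBytes(const std::vector<SiteSize> &Sites) {
  if (Sites.empty())
    return 0;
  uint64_t Bytes = UINT64_MAX;
  for (const SiteSize &S : Sites)
    Bytes = std::min(Bytes, bytesAtSite(S));
  return Bytes;
}
```

For the single call site in this test (object size 12345, type size 4, load known safe) the result is 12345. A second, hypothetical caller for which only the type size is known would reduce the result to 4, and any caller where the load is not known to be safe would suppress the attribute entirely.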

Revision Contents

Diff 11803

include/llvm-c/Transforms/IPO.h

include/llvm/InitializePasses.h

include/llvm/LinkAllPasses.h

include/llvm/Transforms/IPO.h

lib/LTO/LTOCodeGenerator.cpp

lib/Transforms/IPO/CMakeLists.txt

lib/Transforms/IPO/FunctionAttrsTD.cpp

lib/Transforms/IPO/IPO.cpp

lib/Transforms/IPO/PassManagerBuilder.cpp

test/Transforms/FunctionAttrsTD/large-agg.ll

test/Transforms/FunctionAttrsTD/malloc.ll
