This is an archive of the discontinued LLVM Phabricator instance.

Paths

llvm/trunk/
- docs/LangRef.rst
- include/llvm/InitializePasses.h
- include/llvm/LinkAllPasses.h
- include/llvm/Transforms/Scalar.h
- include/llvm/Transforms/Utils/LoopUtils.h
- lib/Transforms/IPO/PassManagerBuilder.cpp
- lib/Transforms/Scalar/CMakeLists.txt
- lib/Transforms/Scalar/LoopVersioningLICM.cpp
- lib/Transforms/Scalar/Scalar.cpp
- test/Transforms/LoopVersioningLICM/loopversioningLICM1.ll
- test/Transforms/LoopVersioningLICM/loopversioningLICM2.ll
- test/Transforms/LoopVersioningLICM/loopversioningLICM3.ll

Differential D9151

Loop Versioning for LICM
ClosedPublic

Authored by ashutosh.nema on Apr 21 2015, 7:30 AM.

Details

Reviewers

anemet
reames
chatur01
hfinkel

Commits

rGdf6763abe85c: New Loop Versioning LICM Pass
rL259986: New Loop Versioning LICM Pass

Summary

I would like to propose a new loop multi-versioning optimization for LICM.

In cases where the compiler cannot tell whether memory accesses alias, it is
conservative and does not proceed with invariant code motion. This results in
missed optimization opportunities.

Our main motivation is to exploit such cases and allow LICM to proceed.

Most of the time, when alias analysis is unsure about a memory access, it
assumes may-alias. This uncertainty from alias analysis restricts LICM from
proceeding further. In cases where alias analysis is unsure, we would like to
use loop versioning as an alternative.

Loop versioning will create one version of the loop with aggressive aliasing
assumptions and another with conservative (default) aliasing assumptions. The
aggressive version will have all of its memory accesses marked as no-alias.
These two versions of the loop will be preceded by a runtime memory check. This
runtime check consists of bound checks for all unique memory accessed in the
loop, and it ensures the lack of memory aliasing. The result of this check
decides which loop is executed at runtime: if the memory is not aliased, the
aggressive version is executed; otherwise the conservative version is executed.
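
As a rough illustration of the runtime bound check described above, the sketch below tests whether a set of byte ranges is pairwise disjoint. The names `AccessRange`, `makeRange`, and `rangesAreDisjoint` are hypothetical and chosen only for this example; they are not taken from the patch, and a real implementation would emit equivalent comparisons as IR in the loop preheader.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one memory region touched by the loop:
// the half-open byte range [begin, end).
struct AccessRange {
  std::uintptr_t begin;
  std::uintptr_t end;
};

// Build a range from a base pointer and an element count.
template <typename T>
AccessRange makeRange(const T *base, std::size_t count) {
  auto b = reinterpret_cast<std::uintptr_t>(base);
  return {b, b + count * sizeof(T)};
}

// Returns true when no two ranges overlap, i.e. when the version of the
// loop with aggressive aliasing assumptions would be safe to execute.
bool rangesAreDisjoint(const std::vector<AccessRange> &ranges) {
  for (std::size_t i = 0; i < ranges.size(); ++i)
    for (std::size_t j = i + 1; j < ranges.size(); ++j)
      if (ranges[i].begin < ranges[j].end && ranges[j].begin < ranges[i].end)
        return false;
  return true;
}
```

The check is quadratic in the number of ranges, which is one plausible reason to cap the number of pointers that take part in it.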

Following are the top-level steps:

1. Perform the loop versioning feasibility check.
2. If the loop is a candidate for versioning, create a memory bound check by considering all the memory accesses in the loop body.
3. Clone the original loop and mark all memory accesses as no-alias in the new loop.
4. Set the original loop and the versioned loop as the branch targets of the runtime check result.
5. Call LICM::promoteLoopAccessesToScalars on the aggressive-alias version of the loop.

Consider the following test:

1 int foo(int * var1, int * var2, int * var3, unsigned itr) {
2   unsigned i = 0, j = 0;
3   for(; i < itr; i++) {
4     for(; j < itr; j++) {
5       var1[j] = itr + i;
6       var3[i] = var1[j] + var3[i];
7     }
8   }
9 }

At line #6, the store to var3 could be moved out by LICM (promoteLoopAccessesToScalars),
but because alias analysis is uncertain about the memory accesses, LICM is unable to move it out.
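
To make the intended effect concrete, here is a hand-written, source-level sketch of what versioning plus scalar promotion could look like for the inner loop of the test. It is an illustration under stated assumptions, not the output of the pass: `innerLoopOriginal` and `innerLoopVersioned` are hypothetical names, and the overlap test is written directly in C++ rather than generated as IR.

```cpp
#include <cstdint>

// Conservative inner loop: var3[i] is loaded and stored on every
// iteration because var1[j] might alias it.
void innerLoopOriginal(int *var1, int *var3, unsigned i, unsigned itr) {
  for (unsigned j = 0; j < itr; ++j) {
    var1[j] = static_cast<int>(itr + i);
    var3[i] = var1[j] + var3[i];
  }
}

// Hand-versioned inner loop: a runtime check selects between a version
// where var3[i] is promoted to a scalar and the original loop.
void innerLoopVersioned(int *var1, int *var3, unsigned i, unsigned itr) {
  auto lo1 = reinterpret_cast<std::uintptr_t>(var1);
  auto hi1 = reinterpret_cast<std::uintptr_t>(var1 + itr);
  auto lo3 = reinterpret_cast<std::uintptr_t>(var3 + i);
  auto hi3 = reinterpret_cast<std::uintptr_t>(var3 + i + 1);
  bool disjoint = hi1 <= lo3 || hi3 <= lo1;

  if (disjoint && itr > 0) {
    int acc = var3[i];            // load hoisted out of the loop
    for (unsigned j = 0; j < itr; ++j) {
      var1[j] = static_cast<int>(itr + i);
      acc += var1[j];
    }
    var3[i] = acc;                // single store after the loop
  } else {
    innerLoopOriginal(var1, var3, i, itr);  // conservative fallback
  }
}
```

Both functions should produce the same memory contents for any input; the versioned one simply performs one load and one store of var3[i] when the ranges do not overlap.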

After Loop versioning IR:

<Versioned Loop>
for.body3.loopVersion: ; preds = %for.body3.loopVersion.preheader, %for.body3.loopVersion

%indvars.iv.loopVersion = phi i64 [ %indvars.iv.next.loopVersion, %for.body3.loopVersion ], [ %2, %for.body3.loopVersion.preheader ]
%arrayidx.loopVersion = getelementptr inbounds i32* %var1, i64 %indvars.iv.loopVersion
store i32 %add, i32* %arrayidx.loopVersion, align 4, !tbaa !1, !alias.scope !11, !noalias !11
%indvars.iv.next.loopVersion = add nuw nsw i64 %indvars.iv.loopVersion, 1
%lftr.wideiv.loopVersion = trunc i64 %indvars.iv.loopVersion to i32
%exitcond.loopVersion = icmp eq i32 %lftr.wideiv.loopVersion, %0
br i1 %exitcond.loopVersion, label %for.inc11.loopexit38, label %for.body3.loopVersion
<Original Loop>
for.body3: ; preds = %for.body3.lr.ph, %for.body3

%indvars.iv = phi i64 [ %indvars.iv.next, %for.body3 ], [ %2, %for.body3.lr.ph ]
%arrayidx = getelementptr inbounds i32* %var1, i64 %indvars.iv
store i32 %add, i32* %arrayidx, align 4, !tbaa !1
%8 = load i32* %arrayidx7, align 4, !tbaa !1
%add8 = add nsw i32 %8, %add
store i32 %add8, i32* %arrayidx7, align 4, !tbaa !1
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
%lftr.wideiv = trunc i64 %indvars.iv to i32
%exitcond = icmp eq i32 %lftr.wideiv, %0
br i1 %exitcond, label %for.inc11, label %for.body3
In the versioned loop the difference is visible: one store has moved out.

Following are some high-level details about the current implementation:

LoopVersioning
LoopVersioning is the main class, which holds the multi-versioning functionality.

LoopVersioning::isLegalForVersioning
A member of 'LoopVersioning'.
Does the feasibility check for loop versioning:
a) Checks the layout of the loop.
b) Instruction-level checks.
c) Memory checks.

LoopVersioning::versionizeLoop
a) Clone the original loop.
b) Create a runtime memory check.
c) Add both loops as targets of the runtime check result.

This patch uses a maximum loop nest threshold of 2 and a maximum of 5 pointers
in the runtime memory check; both thresholds can be controlled from the
command line.
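
The threshold-based part of the feasibility check could be sketched roughly as follows. `LoopSummary`, `VersioningLimits`, and `isCandidateForVersioning` are hypothetical names for this example only. The defaults 2 and 5 come from the summary above; the "at least two pointers" and "no must-alias set" conditions are taken from code comments quoted later in this review, and how the real pass combines them is an assumption here.

```cpp
#include <cstddef>

// Hypothetical summary of the facts the feasibility check looks at.
struct LoopSummary {
  unsigned nestDepth;       // depth of the loop nest being considered
  std::size_t numPointers;  // distinct pointers needing a runtime check
  bool hasMustAliasSet;     // some alias set is already known must-alias
};

// Hypothetical stand-ins for the command-line controlled thresholds.
struct VersioningLimits {
  unsigned maxNestDepth = 2;
  std::size_t maxRuntimePointers = 5;
};

bool isCandidateForVersioning(
    const LoopSummary &loop,
    const VersioningLimits &limits = VersioningLimits()) {
  if (loop.nestDepth > limits.maxNestDepth)
    return false;  // loop nest too deep
  if (loop.numPointers < 2)
    return false;  // nothing to compare in a runtime check
  if (loop.numPointers > limits.maxRuntimePointers)
    return false;  // runtime check would be too expensive
  if (loop.hasMustAliasSet)
    return false;  // the check outcome would be known statically
  return true;
}
```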

During the feasibility check, we compare the possible invariant code motion with
the total instructions of the loop. Only if this ratio exceeds a certain threshold,
and versioning is therefore found profitable, do we version the loop.
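
A minimal sketch of such a profitability test, using integer arithmetic only, might look like this. `isVersioningProfitable` is a hypothetical name; the default of 25 is the percentage mentioned in the review discussion, and treating a ratio exactly at the threshold as profitable is an assumption of this example.

```cpp
#include <cstdint>

// Returns true when invariant memory accesses make up at least
// thresholdPercent of the counted instructions in the loop.
// Cross-multiplying avoids any floating-point computation.
bool isVersioningProfitable(std::uint64_t invariantCount,
                            std::uint64_t totalCount,
                            std::uint64_t thresholdPercent = 25) {
  if (totalCount == 0)
    return false;  // nothing to optimize
  return invariantCount * 100 >= thresholdPercent * totalCount;
}
```

For example, 1 invariant access out of 4 counted instructions meets a 25% threshold, while 1 out of 5 does not.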

Please go through the patch for the detailed approach.
Suggestions and comments are welcome.

Diff Detail

Repository: rL LLVM

Event Timeline

reames added inline comments. Jun 17 2015, 5:40 PM

lib/Transforms/Scalar/LoopVersioningLICM.cpp
3 ↗	(On Diff #27186)	Please follow the standard header format. See LICM as an example. It would help to have a trivial example here. The shortest possible while loop which shows the transform would make the discussion much more clear.
57 ↗	(On Diff #27186)	Include order should be sorted.
78 ↗	(On Diff #27186)	What is an "invariant threshold"? You probably need a better name for this.
124 ↗	(On Diff #27186)	I believe this is duplicated code with LICM. Please find a place to put a utility function instead. Transforms/Utils/Loops.h would be fine. Ideally, this would be split into its own patch.
145 ↗	(On Diff #27186)	A better name would help here. Don't we have similar code in CGP?

Thanks Philip for this review.

Drive by review.  A few comments here or there.

Meta comments:

- Code structure and naming is somewhat confusing.  Attention to factoring of functions and method names would help.
- This feels like too large a patch to easily review.  I would strongly prefer that you split the patch into the smallest possible piece (i.e. it works on trivial cases only), then extend.  Given Hal has already started reviewing this, I'll defer to his preferences here, but I'm unlikely to signoff on a patch this large and complex.

I'll try to divide this patch, at least at a logical level.
Also, as suggested, there are possibly a few functions that can be moved to a utility.
I'll come up with a separate patch for them.

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:3
@@ +2,3 @@
+//
+//                     The LLVM Compiler Infrastructure
+//
----------------
Please follow the standard header format.  See LICM as an example.

It would help to have a trivial example here.  The shortest possible while loop which shows the transform would make the discussion much more clear.

Sure will update it and add an example.

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:57
@@ +56,3 @@
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/Analysis/ScalarEvolutionExpander.h"
+#include "llvm/Transforms/Utils/BasicBlockUtils.h"
----------------
Include order should be sorted.

Sure will sort the included headers and update.

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:78
@@ +77,3 @@
+
+/// Loop Versioning invariant threshold
+static cl::opt<unsigned>
----------------
What is an "invariant threshold"?  You probably need a better name for this.

This option controls the profitability of loop versioning for LICM.
It compares the invariant loads and stores against the total instructions of the loop.
Only if the invariants exceed the defined threshold do we go for LICM loop versioning.
The threshold is defined as a percentage, with a default value of 25%.

How about "-licm-versioning-threshold" or "-licm-versioning-invariant-threshold"?

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:124
@@ +123,3 @@
+/// \brief Recursively clone the specified loop and all of its children,
+/// mapping the blocks with the specified map.
+static Loop *cloneLoop(Loop *L, Loop *PL, ValueToValueMapTy &VM, LoopInfo *LI,
----------------
I believe this is duplicated code with LICM.  Please find a place to put a utility function instead.  Transforms/Utils/Loops.h would be fine.

Ideally, this would be split into its own patch.

Sure I'll move this to a utility.

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:145
@@ +144,3 @@
+/// return the induction operand of the gep pointer.
+static Value *stripGetElementPtr(Value *Ptr, ScalarEvolution *SE, Loop *Lp) {
+  GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(Ptr);
----------------
A better name would help here.
Don't we have similar code in CGP?

A similar function exists in LoopVectorizer; I will turn it into a utility and use it.

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:161
@@ +160,3 @@
+
+/// \brief Look for a cast use of the passed value.
+static Value *getUniqueCastUse(Value *Ptr, Loop *Lp, Type *Ty) {
----------------
Er, what does this actually do?  The Ty and L params make me think this isn't about finding a unique use...

A similar function exists in LoopVectorizer; I will turn it into a utility and use it.

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:686
@@ +685,3 @@
+///                           +----------+
+Loop *LoopVersioningLICM::versionizeLoop(LPPassManager &LPM) {
+  std::vector<BasicBlock *> LoopBlocks;
----------------
I suspect there is code which can and should be commoned with unswitch here.

I'll check and come back to you.

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:826
@@ +825,3 @@
+      // Only interested in load & stores
+      if (!it->mayReadFromMemory() && !it->mayWriteToMemory())
+        continue;
----------------
Your comment and code disagree.  Are you intentionally handling RMW operations?

Will update comments here, we are handling RMW operations.

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:833
@@ +832,3 @@
+      NoAliases.push_back(NewScope);
+      // Set no-alias for current instruction.
+      I->setMetadata(
----------------
The loop structure is very odd here.  It looks like later instructions end up with more noalias markers than earlier ones?  I suspect that's not what you want.

Here we want to mark the instructions as no-alias; yes, the last one will have more no-alias markers.
But this is required to make the instructions no-alias. For example, if the loop has 3 instructions I1, I2, I3,
then we set the no-alias property as below:
I1 No-Alias-1
I2 No-Alias-1, No-Alias-2
I3 No-Alias-1, No-Alias-2, No-Alias-3

My understanding may be wrong here; if you have a better way of doing it, please suggest it.
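
The cumulative scheme above can be examined with a toy model. The sketch below is a single-domain simplification of the scoped no-alias rule documented in the LangRef (two accesses are assumed not to alias when the alias.scope list of one is contained in the noalias list of the other). It assumes, which the reply does not state, that instruction k carries only its own scope in alias.scope. `MemInst`, `provenNoAlias`, and `annotateCumulative` are hypothetical names for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <vector>

// Toy model of one memory instruction's scoped-alias metadata.
struct MemInst {
  std::set<int> aliasScope;  // scopes this access belongs to
  std::set<int> noAlias;     // scopes this access is declared not to alias
};

// Simplified rule: no-alias is proven when the (non-empty) alias.scope
// set of one access is a subset of the noalias set of the other.
bool provenNoAlias(const MemInst &a, const MemInst &b) {
  auto covered = [](const MemInst &x, const MemInst &y) {
    return !x.aliasScope.empty() &&
           std::includes(y.noAlias.begin(), y.noAlias.end(),
                         x.aliasScope.begin(), x.aliasScope.end());
  };
  return covered(a, b) || covered(b, a);
}

// Cumulative annotation: instruction k gets its own scope k and a
// noalias list holding every scope created so far (0..k).
std::vector<MemInst> annotateCumulative(std::size_t n) {
  std::vector<MemInst> insts(n);
  std::set<int> seen;
  for (std::size_t k = 0; k < n; ++k) {
    seen.insert(static_cast<int>(k));
    insts[k].aliasScope = {static_cast<int>(k)};
    insts[k].noAlias = seen;
  }
  return insts;
}
```

Under this model every pair of distinct instructions is proven no-alias, because the rule is checked in both directions: the later instruction's noalias list always contains the earlier instruction's scope. The lists are uneven, though, which is the oddity the reviewer points out.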

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:871
@@ +870,3 @@
+    BasicBlock *BB = *I;
+    if (LI->getLoopFor(BB) == L) // Ignore blocks in subloops.
+      CurAST->add(*BB);          // Incorporate the specified basic block
----------------
Why?

As we are only interested in the innermost loop, creating an AST for subloops is unnecessary overhead.

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:889
@@ +888,3 @@
+    // Delete newly created loop from loop pass manager.
+    LPM.deleteLoopFromQueue(VerLoop);
+    Changed = true;
----------------
Why?

This needs to be removed, as we already introduced metadata to ensure the same loop is not revisited.

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:893
@@ +892,3 @@
+  // Delete allocated memory.
+  delete CurAST;
+  delete PointerSet;
----------------
These can be stack allocated.

Sure will make stack allocation.

http://reviews.llvm.org/D9151

In addition to some dups you already noticed in LoopVectorize, I spotted a few more on my initial scan.

lib/Transforms/Scalar/LoopVersioningLICM.cpp
99 ↗	(On Diff #27186)	This is duplicated from LoopVectorize.
179 ↗	(On Diff #27186)	Duplicate in LoopVectorize.
338 ↗	(On Diff #27186)	Almost identical to the one in LoopVectorize.

chatur01 added inline comments. Jun 22 2015, 4:08 PM

lib/Transforms/Scalar/LoopVersioningLICM.cpp
502 ↗	(On Diff #27186)	I don't get where this check is being made. Is the comment out-of-date?

Thanks Charlie for looking into this.

Yes, these functions can be moved to a utility; I will come up with a patch soon.

Regards,
Ashutosh

Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:502
@@ +501,3 @@
+/// 2) Make sure loop body should not have any call instruction.
+/// 3) Check store instruction execution guarantee.
+/// 4) Loop body should not have any may throw instruction.

I don't get where this check is being made. Is the comment out-of-date?

Sorry, these are stale comments.

LoopVersioning utility is now available, so LoopVersioningLICM is using that utility.

Earlier, the following functions were part of LoopVersioningLICM; they have now been moved to a utility: a) getGEPInductionOperand, b) stripGetElementPtr, c) getUniqueCastUse, d) getStrideFromPointer

So, LoopVersioningLICM is now using these utility functions.

Incorporated previous review comments.

Code re-factoring.

Herald added a subscriber: sanjoy. Aug 19 2015, 5:21 AM

LoopVersioning utility now calls ‘addPHINodes’ implicitly(r245579).
Updated LoopVersioningLICM to consider this change.

Hal, Philip, Charlie,

Does this patch look OK to you now?

Regards,
Ashutosh

Ping.

Hi Ashutosh,

This needs some superficial changes before it could go further. Some of which I've pointed out below. I'd suggest going through all the comments as well and checking grammar, I didn't point out all of that.

I see you have tested when the heuristics for applying the transformation are triggered, but there are no tests for the transform itself. Would it not make sense to add some of those too?

I'm merely an interested observer here, I don't have the experience to be able to OK the implementation. Hopefully by sticking my oar in it will help get the attention of people that can :)

Regards,
Charlie.

lib/Transforms/Scalar/LoopVersioningLICM.cpp
102 ↗	(On Diff #32797)	I think it would be more useful to describe what the threshold is for than to state the description twice. Seems redundant. Same goes for the other options.
200 ↗	(On Diff #32797)	Why is this duplicated from LoopVectorizationLegality?
218 ↗	(On Diff #32797)	s/its/it's/
219 ↗	(On Diff #32797)	That's a weird name for the function, do you think legalLoopStructure would be better? Thinking how it would read in a conditional.
225 ↗	(On Diff #32797)	s/loops/loop/
271 ↗	(On Diff #32797)	Don't bother duplicating the assert reason in the line right above the assert.
279–281 ↗	(On Diff #32797)	Use = false here
285 ↗	(On Diff #32797)	Sloppy sentence. Put a space between at and least
286 ↗	(On Diff #32797)	shouldn't
287 ↗	(On Diff #32797)	Make sure the alias set doesn't have any MustAlias set.
288–289 ↗	(On Diff #32797)	Could you instead use the fancy new range-for?
291 ↗	(On Diff #32797)	I don't find this comment useful, I'd be more interested in /why/ we skip forward alias sets, I can see from the code that we do. If it's completely obvious why we skip forward alias sets, then the comment isn't needed.
294 ↗	(On Diff #32797)	Same here.
298 ↗	(On Diff #32797)	Coding violation
299 ↗	(On Diff #32797)	Try to maintain some consistency in your spelling of these things.
356–359 ↗	(On Diff #32797)	Ooof, that conditional is pretty hairy :)
374 ↗	(On Diff #32797)	LoadAndStoreCounter
404–405 ↗	(On Diff #32797)	Use = false
406 ↗	(On Diff #32797)	Coding convention
409–410 ↗	(On Diff #32797)	Coding convention and also range for
412–413 ↗	(On Diff #32797)	and again
437–438 ↗	(On Diff #32797)	You don't need all those parens. (float)a / float(b) * 100. < c should be fine. Put it in a variable so you don't repeat it below as well.
512 ↗	(On Diff #32797)	StringRef
517–521 ↗	(On Diff #32797)	Range for, coding convention!
563–564 ↗	(On Diff #32797)	Range for

This revision now requires changes to proceed. Aug 27 2015, 3:53 PM

Thanks Charlie, I will correct these.

Regards,
Ashutosh

Incorporated review comments.

ashutosh.nema marked 41 inline comments as done. Sep 1 2015, 4:42 AM

ashutosh.nema added inline comments.

lib/Transforms/Scalar/LoopVersioningLICM.cpp
201 ↗	(On Diff #33675)	It's just a wrapper which calls the 'getStrideFromPointer' utility. Earlier, 'getStrideFromPointer' was part of LoopVectorizer; it has now been moved to a utility.

Ping.

Hi Ashutosh,

You can't commit this change until one of the other reviewers accepts this, but I'll step out of the way now and accept the revision, I'm worried that by leaving it as rejected it's holding up review from the other nominated reviewers.

I'm still not completely happy with the comments, but I think this has been held up for long enough on style issues alone. I hope someone can give you feedback soon, I'm not able to OK the revision.

--Charlie.

This revision is now accepted and ready to land. Sep 16 2015, 9:19 AM

I've started to look at this again; I apologize for the delay...

lib/Transforms/Scalar/LoopVersioningLICM.cpp
125 ↗	(On Diff #33675)	I see that you've marked this as done; does this patch need to be updated to reflect the change?
223 ↗	(On Diff #33675)	Can you be more specific? Are these checks reflecting limitation of the versioning infrastructure? Heuristics for profitability? Both? Please specifically explain this in the comments for each check.
291 ↗	(On Diff #33675)	Why? Must-alias with what? Why does it matter if there happens to be something that aliases with something else, if there is a third thing that only may alias with the first two?
308 ↗	(On Diff #33675)	alias -> Alias
333 ↗	(On Diff #33675)	Atleast -> At least
356 ↗	(On Diff #33675)	Why not? I understand that the vectorizer has these checks, but I don't see why this belongs in an LICM pass?

Thanks Hal for looking into this again.

I've started to look at this again; I apologize for the delay...

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:125
@@ +124,3 @@
+
+// Sets input string as meta data to loop latch terminator instruction.
+static void setLoopMetaData(Loop *CurLoop, const char *MDString) {
----------------
I see that you've marked this as done; does this patch need to be updated to reflect the change?

This was an old comment about moving 'cloneLoop' to a utility.
It is no longer required, as that function was removed and the LoopVersioning utility is now used for cloning etc.

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:223
@@ +222,3 @@
+
+/// \brief Check loop structure and confirms it's good for LoopVersioningLICM.
+bool LoopVersioningLICM::legalLoopStructure() {
----------------
Can you be more specific? Are these checks reflecting limitation of the versioning infrastructure?

Some of the checks are repeated, but I kept them so that the legality checks are complete in one place.
I'm not sure that's the right decision.

Heuristics for profitability? Both? Please specifically explain this in the comments for each check.

Sure will update.

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:291
@@ +290,3 @@
+  // 2) PointerSet shouldn't have pointers more then RuntimeMemoryCheckThreshold
+  // 3) Make sure AliasSets doesn't have any must alias set.
+  for (const auto &I : *CurAST) {
----------------
Why? Must-alias with what? Why does it matter if there happens to be something that aliases with something else, if there is a third thing that only may alias with the first two?

In the case where 2 pointers are must-aliased, the runtime bound check always gives the same result. In such cases there is no need for runtime checks and loop versioning.

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:308
@@ +307,3 @@
+      Value *Ptr = A.getValue();
+      // alias tracker should have pointers of same data type.
+      typeCheck = (typeCheck && (SomePtr->getType() == Ptr->getType()));
----------------
alias -> Alias

Will correct.

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:333
@@ +332,3 @@
+  }
+  // Atleast 2 pointers needed for runtime check.
+  if (PointerSet.size() <= 1) {
----------------
Atleast -> At least

Will correct.

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:356
@@ +355,3 @@
+  const bool IsAnnotatedParallel = CurLoop->isAnnotatedParallel();
+  // We dont support call instructions. however, we ignore few intrinsic
+  // and libfunc callsite. We don't allow non-intrinsic, non-libfunc callsite
----------------
Why not? I understand that the vectorizer has these checks, but I don't see why this belongs in an LICM pass?

There is a possibility that call may modify aliasing behavior, which may defeat the purpose of versioning & runtime checks.

In D9151#249651, @ashutosh.nema wrote:

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:291
@@ +290,3 @@
+  // 2) PointerSet shouldn't have pointers more then RuntimeMemoryCheckThreshold
+  // 3) Make sure AliasSets doesn't have any must alias set.
+  for (const auto &I : *CurAST) {
----------------
Why? Must-alias with what? Why does it matter if there happens to be something that aliases with something else, if there is a third thing that only may alias with the first two?

Case where 2 pointers are must-aliased, there runtime bound check always give same result. In such cases there is no need for runtime checks and loop versioning.

Alright, I have a better understanding of what you're doing now. Your technique for generating the versioned loop is to check that all pointer access in the loop are independent (and, thus, don't alias), and guarded by that check, you reach the versioned variant of the loop where the aliasing metadata asserts that all access are mutually independent. The follow-up LICM pass then actually does the hoisting.

I agree this makes sense, because if you had multiple aliasing domains then you'd still not necessarily be able to hoist the potentially-loop-invariant accesses out of the loop. Please, however, explain all of this near the CurAST iteration code so that it is clear why you have this restriction.

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:356
@@ +355,3 @@
+  const bool IsAnnotatedParallel = CurLoop->isAnnotatedParallel();
+  // We dont support call instructions. however, we ignore few intrinsic
+  // and libfunc callsite. We don't allow non-intrinsic, non-libfunc callsite
----------------
Why not? I understand that the vectorizer has these checks, but I don't see why this belongs in an LICM pass?

There is a possibility that call may modify aliasing behavior, which may defeat the purpose of versioning & runtime checks.

Ah, good point. However, please then replace this check with an appropriate AA getModRef-type check for whether the call might alias with the relevant pointers (if the function, for example, does not alias with any of the loop-invariant accesses, then it can't affect what you're trying to do, although you'll need to be somewhat more careful about adding the noalias metadata to those functions, etc.).

lib/Transforms/Scalar/LoopVersioningLICM.cpp
97 ↗	(On Diff #33675)	please use metadata names more consistent with our general naming scheme for standard metadata (llvm.loop.whatever). You should need only one metadata type, to disable licm-based versioning, it should be documented in the LangRef, and you can add it to both original and versioned loops once this pass has run.
426 ↗	(On Diff #33675)	Should comment say "load or store instruction"?

Thanks Hal for review.

I'll incorporate your comments and come back soon.

>
>   ================
>   Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:291
>   @@ +290,3 @@
>   +  // 2) PointerSet shouldn't have pointers more then RuntimeMemoryCheckThreshold
>   +  // 3) Make sure AliasSets doesn't have any must alias set.
>   +  for (const auto &I : *CurAST) {
>   ----------------
>   Why? Must-alias with what? Why does it matter if there happens to be something that aliases with something else, if there is a third thing that only may alias with the first two?
>
> Case where 2 pointers are must-aliased, there runtime bound check always give same result. In such cases there is no need for runtime checks and loop versioning.


Alright, I have a better understanding of what you're doing now. Your technique for generating the versioned loop is to check that all pointer access in the loop are independent (and, thus, don't alias), and guarded by that check, you reach the versioned variant of the loop where the aliasing metadata asserts that all access are mutually independent. The follow-up LICM pass then actually does the hoisting.

I agree this makes sense, because if you had multiple aliasing domains then you'd still not necessarily be able to hoist the potentially-loop-invariant accesses out of the loop. Please, however, explain all of this near the CurAST iteration code so that it is clear why you have this restriction.

Sure I'll mention about this.

>   Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:356
>   @@ +355,3 @@
>   +  const bool IsAnnotatedParallel = CurLoop->isAnnotatedParallel();
>   +  // We dont support call instructions. however, we ignore few intrinsic
>   +  // and libfunc callsite. We don't allow non-intrinsic, non-libfunc callsite
>   ----------------
>   Why not? I understand that the vectorizer has these checks, but I don't see why this belongs in an LICM pass?
>
> There is a possibility that call may modify aliasing behavior, which may defeat the purpose of versioning & runtime checks.


Ah, good point. However, please then replace this check with an appropriate AA getModRef-type check for whether the call might alias with the relevant pointers (if the function, for example, does not alias with any of the loop-invariant accesses, then it can't affect what you're trying to do, although you'll need to be somewhat more careful about adding the noalias metadata to those functions, etc.).

I'm not sure that by itself would be sufficient. Consider indirect calls: we would probably need to consider the function pointer in the runtime check.
There is also a possibility that one of the runtime-checked pointers escapes through a call argument; in such cases it becomes more difficult to ensure correctness.
That's why, for simplicity, I followed the vectorizer's approach of allowing only a few functions.

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:97
@@ +96,3 @@
+#define DEBUG_TYPE "loop-versioning-licm"
+#define ORIGINAL_LOOP_METADATA "LoopVersioningLICMOriginalLoop"
+#define VERSION_LOOP_METADATA "LoopVersioningLICMVersionLoop"
----------------
please use metadata names more consistent with our general naming scheme for standard metadata (llvm.loop.whatever). You should need only one metadata type, to disable licm-based versioning, it should be documented in the LangRef, and you can add it to both original and versioned loops once this pass has run.

Sure I'll use only one meta data.
Will update LangRef as well.

================
Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:426
@@ +425,3 @@
+  }
+  // Loop should have at least invariant store instruction.
+  if (!InvariantCounter) {
----------------
Should comment say "load or store instruction"?

Ah, this comment needs to be updated.

Regards,
Ashutosh

Hal, do the comments below on call handling look OK to you?

>   Comment at: lib/Transforms/Scalar/LoopVersioningLICM.cpp:356
>   @@ +355,3 @@
>   +  const bool IsAnnotatedParallel = CurLoop->isAnnotatedParallel();
>   +  // We dont support call instructions. however, we ignore few intrinsic
>   +  // and libfunc callsite. We don't allow non-intrinsic, non-libfunc callsite
>   ----------------
>   Why not? I understand that the vectorizer has these checks, but I don't see why this belongs in an LICM pass?
>
> There is a possibility that call may modify aliasing behavior, which may defeat the purpose of versioning & runtime checks.


Ah, good point. However, please then replace this check with an appropriate AA getModRef-type check for whether the call might alias with the relevant pointers (if the function, for example, does not alias with any of the loop-invariant accesses, then it can't affect what you're trying to do, although you'll need to be somewhat more careful about adding the noalias metadata to those functions, etc.).

Can I keep this like the vectorizer?

Regards,
Ashutosh

Incorporated review comments from Hal.

This patch does not contain the LangRef changes; I'll submit them soon.

Incorporated Hal comments & updated LangRef.

Hi Hal,

Does this look OK now?

Regards,
Ashutosh

Ping.

I apologize for taking so long to get back to this...

Do the runtime checks inserted cover only the interactions between the invariant accesses and the non-invariant accesses, or do they also perform range overlap checks on the non-invariant accesses?

lib/Transforms/Scalar/LoopVersioningLICM.cpp
17 ↗	(On Diff #39332)	conservative (default) alias -> conservative (default) aliasing assumptions
17 ↗	(On Diff #39332)	Aggressive alias version -> The version of the loop making aggressive aliasing assumptions
18 ↗	(On Diff #39332)	access -> accesses
18 ↗	(On Diff #39332)	version -> versions
21 ↗	(On Diff #39332)	ensures aliasing of memory -> ensures the lack of memory aliasing
24 ↗	(On Diff #39332)	I'd reword this whole sentence, as: If the runtime check detects any memory aliasing, then the original loop is executed. Otherwise, the version with aggressive aliasing assumptions is used.
249 ↗	(On Diff #39332)	I think this would be better as: << " loop is not bottom tested\n");
253 ↗	(On Diff #39332)	This is fine for now. I suspect that eventually we'll want to deal with this by versioning the whole loop nest instead of just the inner loop.
343 ↗	(On Diff #39332)	Didnt found -> Didn't find
363 ↗	(On Diff #39332)	if( -> if (
373 ↗	(On Diff #39332)	Parallel loops must not have aliasing loop-invariant memory accesses, or else they'd not be trivially vectorizable. We don't need to version anything in this case, but rather, we should be able to hoist any invariant access. If we don't already get this case, then don't get it here (we don't need to version the loop), but rather, we should handle this directly in the usual LICM pass.
375 ↗	(On Diff #39332)	load( -> load (
390 ↗	(On Diff #39332)	store( -> store (
444 ↗	(On Diff #39332)	Please avoid the use of floating-point computation here (we don't necessarily compile LLVM with strict IEEE support, so we need to be careful to make sure that decision-making procedures are not sensitive to small accuracy changes). InvariantPercentage < InvariantThreshold can become: InvariantCounter*100 < InvariantThreshold*LoadAndStoreCounter (or something like that)
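A minimal sketch of the integer-only comparison suggested in the comment on line 444. The function name `belowInvariantThreshold` and its parameter names are made up for illustration; only the cross-multiplied comparison itself comes from the review.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): returns true when the share of
// loop-invariant accesses is below ThresholdPercent.
bool belowInvariantThreshold(std::uint64_t InvariantCounter,
                             std::uint64_t LoadAndStoreCounter,
                             std::uint64_t ThresholdPercent) {
  assert(ThresholdPercent <= 100 && "threshold is a percentage");
  // For a nonzero LoadAndStoreCounter this is the same decision as
  //   100.0 * InvariantCounter / LoadAndStoreCounter < ThresholdPercent,
  // but cross-multiplied so that no division or floating point is involved.
  return InvariantCounter * 100 < ThresholdPercent * LoadAndStoreCounter;
}
```

With a threshold of 25, one invariant access out of ten falls below the threshold, while three out of ten does not. The boundary case (exactly 25%) is not "below" and gives the same answer on every host.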

hfinkel added inline comments. Dec 11 2015, 3:29 AM

docs/LangRef.rst
4386 ↗	(On Diff #39332)	Add a space just before "(LICM)".
4387 ↗	(On Diff #39332)	clone loop -> all other loop versions
4387 ↗	(On Diff #39332)	original loop -> the original loop
4388 ↗	(On Diff #39332)	It helps suggesting loop-versioning-licm pass that the loop should not be re-versioned. -> The loop-versioning-licm pass will not create additional loop versions of loops with this metadata.
lib/Transforms/Scalar/LoopVersioningLICM.cpp
12 ↗	(On Diff #39332)	un surety -> uncertainty
12 ↗	(On Diff #39332)	restrict LICM to proceed -> restricts LICM from proceeding
13 ↗	(On Diff #39332)	Cases -> In cases
13 ↗	(On Diff #39332)	like to -> will
16 ↗	(On Diff #39332)	creates version -> create a version
16 ↗	(On Diff #39332)	alias -> aliasing assumptions

Thanks Hal, for again looking into this change.

Do the runtime checks inserted cover only the interactions between
the invariant access and the non-invariant accesses, or do they also
perform range overlap checks on the non-invariant accesses?

The runtime checks also perform overlap checks between non-invariant
accesses. Without checking all memory accesses we can’t make aggressive
aliasing assumptions.
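To picture what such a check buys, here is a hand-written, source-level sketch of a versioned loop. It is not output of the pass; `rangesOverlap` and `scaleAdd` are hypothetical names, and the real pass works on IR and builds its bounds through LoopAccessAnalysis rather than on raw pointers as done here.

```cpp
#include <cstddef>
#include <cstdint>

// Returns true when the byte ranges [A, A + ABytes) and [B, B + BBytes)
// overlap.
static bool rangesOverlap(const void *A, std::size_t ABytes,
                          const void *B, std::size_t BBytes) {
  const auto AStart = reinterpret_cast<std::uintptr_t>(A);
  const auto BStart = reinterpret_cast<std::uintptr_t>(B);
  return AStart < BStart + BBytes && BStart < AStart + ABytes;
}

// Hand-versioned loop computing Dst[I] += *Scale for I in [0, N).
// Returns true if the version with aggressive aliasing assumptions ran.
bool scaleAdd(int *Dst, const int *Scale, std::size_t N) {
  if (!rangesOverlap(Dst, N * sizeof(int), Scale, sizeof(int))) {
    // Runtime check passed: *Scale cannot be written by the loop, so the
    // load is invariant and can be hoisted out of it.
    const int S = *Scale;
    for (std::size_t I = 0; I < N; ++I)
      Dst[I] += S;
    return true;
  }
  // Possible aliasing: run the original loop, reloading *Scale every time.
  for (std::size_t I = 0; I < N; ++I)
    Dst[I] += *Scale;
  return false;
}
```

Calling `scaleAdd(A, &S, 4)` with a separate `S` takes the hoisted path. Calling `scaleAdd(B, &B[1], 4)` falls back to the original loop, which is the only version that gives the right answer when a store to `B[1]` changes the value being added.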

This patch incorporates Hal's comments.

Herald added a subscriber: mehdi_amini. Dec 14 2015, 1:48 AM

Hi Hal,

Does this look OK now?

Regards,
Ashutosh

In D9151#309587, @ashutosh.nema wrote:

Thanks Hal, for again looking into this change.

Do the runtime checks inserted cover only the interactions between
the invariant access and the non-invariant accesses, or do they also
perform range overlap checks on the non-invariant accesses?

The runtime checks also perform overlap checks between non-invariant
accesses. Without checking all memory accesses we can’t make aggressive
aliasing assumptions.

In that case, you should also add llvm.mem.parallel_loop_access metadata to the versioned loop. Otherwise, the vectorizer will add a duplicate set of checks should it decide to vectorize the versioned loop. We also need to add metadata to the original loop to disable vectorization (as we already know that the vectorization checks will have failed when that loop is reached).

lib/Transforms/IPO/PassManagerBuilder.cpp
109 ↗	(On Diff #42692)	new, experimental -> experimental (If something is experimental, and yet not new, something's probably wrong ;) )
lib/Transforms/Scalar/LoopVersioningLICM.cpp
5 ↗	(On Diff #42692)	Remove this paragraph.
9 ↗	(On Diff #42692)	Remove this line (it, and the paragraph above, are not needed because the problem is explained clearly in the next paragraph).
62 ↗	(On Diff #42692)	These lines need to appear directly under the "The LLVM Compiler Infrastructure" line.
135 ↗	(On Diff #42692)	This utility function is not right, we need to combine the loop metadata here with any other loop metadata which might also exist. Please see the writeHintsToMetadata function in lib/Transforms/Vectorize/LoopVectorize.cpp and/or the relevant code in lib/Transforms/Utils/LoopUnrollRuntime.cpp near the end of the CloneLoopBlocks function. As a general note, we really should refactor this into a common utility function.
154 ↗	(On Diff #42692)	Add here: AU.addPreserved<AAResultsWrapperPass>(); AU.addPreserved<GlobalsAAWrapperPass>(); (so that you don't kill off GlobalsAA here).
310 ↗	(On Diff #42692)	typeCheck -> TypeCheck
323 ↗	(On Diff #42692)	You're checking here that all points in the given alias set have the same type; why?
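The point of the comment on line 135 is that adding one hint must not discard the hints a loop already carries. Below is a toy model of that "combine, don't overwrite" behaviour using standard containers only. It does not use LLVM's `MDNode` API; the `LoopHint` type and `addStringHint` function are hypothetical, and dropping an older hint with the same name is an assumption of this sketch.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one loop-metadata operand: a name plus a value.
using LoopHint = std::pair<std::string, unsigned>;

// Returns a new hint list that keeps every existing hint and adds
// (Name, Value). An older hint with the same name is replaced.
std::vector<LoopHint> addStringHint(const std::vector<LoopHint> &Existing,
                                    const std::string &Name,
                                    unsigned Value = 0) {
  std::vector<LoopHint> Result;
  Result.reserve(Existing.size() + 1);
  for (const LoopHint &H : Existing)
    if (H.first != Name)
      Result.push_back(H); // unrelated hints are carried over untouched
  Result.emplace_back(Name, Value);
  return Result;
}
```

A loop that already carries `llvm.loop.unroll.disable` keeps it after `llvm.loop.licm_versioning.disable` is added, and adding the same hint twice does not grow the list.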

Dear All,

Sorry for jumping in late. I wonder if there is a specific reason this pass is run in the canonicalization phase, as part of the inliner loop. Would it not make more sense to run it before the vectorizers?

Best,
Tobias

Hi Tobias,

I wonder if there is a specific reason this pass is run in the
canonicalization phase, as part of the inliner loop.

It is not very clear to me what you mean by 'inliner loop'; is it the inner loop?

Would it not make more sense to run it before the vectorizers?

If you look at the pass manager, it is scheduled prior to the vectorizer.
As it is LICM based, it does not perform any vectorizer legality checks.
It helps in cases where LICM became conservative because of
memory aliasing uncertainty.

Thanks,
Ashutosh

Hi Hal,

Will work on your comments and come back.

Thanks,
Ashutosh

lib/Transforms/Scalar/LoopVersioningLICM.cpp
323 ↗	(On Diff #42692)	Actually, this is a precondition in LICM's “promoteLoopAccessesToScalars”, which expects all pointers in the alias set to have the same type. From LICM.cpp, lines 881-885:
    // Check that all of the pointers in the alias set have the same type. We
    // cannot (yet) promote a memory location that is loaded and stored in
    // different sizes.
    if (SomePtr->getType() != ASIV->getType())
      return Changed;
To match that behaviour we added this check.

Addressed Hal's comment.

mehdi_amini added inline comments. Jan 2 2016, 8:36 PM

lib/Transforms/IPO/PassManagerBuilder.cpp
349 ↗	(On Diff #43509)	Why is this done here and not where LICM is usually done? Line 419? Did you try multiple placements? Do you have benchmark results that drive this choice?

ashutosh.nema added inline comments. Jan 12 2016, 9:10 PM

lib/Transforms/IPO/PassManagerBuilder.cpp
349 ↗	(On Diff #43509)	We want to run this once inlining is over, because after that we may see more accurate aliasing. That’s why we placed it after ‘createBarrierNoopPass’. If we do this early, other optimizations may benefit from the no-alias assumptions in the versioned loop. It could be scheduled later as well, but I don’t see any benefit in that. We have observed good gains in our internal benchmarks, which are mostly based on customer workloads.

mehdi_amini added inline comments. Jan 12 2016, 10:36 PM

lib/Transforms/IPO/PassManagerBuilder.cpp
349 ↗	(On Diff #43509)	I'm not sure I follow why you need the inliner to be fully complete over the whole module, rather than just having processed the function on which you want to run LICM. What kind of aliasing will this avoid? Do you have an example? Also, you are saying that you don't see any benefit in scheduling this later; can you elaborate with respect to the other run of LICM I pointed out? Also, I asked about benchmarks with respect to the different possible placements for this pass, and you didn't answer that in particular.

grosser added inline comments. Jan 12 2016, 10:40 PM

lib/Transforms/IPO/PassManagerBuilder.cpp
349 ↗	(On Diff #43509)	One reason to run this late is that too early versioning may prevent both further inlining (due to increased code size) as well as versioning later a possibly larger loop. Furthermore, to my understanding the inlining loop is indeed a canonicalization phase. We are planning to run Polly after the inliner loop for this very same reason.

mehdi_amini added inline comments. Jan 12 2016, 10:47 PM

lib/Transforms/IPO/PassManagerBuilder.cpp
349 ↗	(On Diff #43509)	Fair for the inliner loop. But what about the already existing LICM that already runs a little bit later?

grosser added inline comments. Jan 12 2016, 10:55 PM

lib/Transforms/IPO/PassManagerBuilder.cpp
349 ↗	(On Diff #43509)	Are you talking about the LICM pass within the inliner loop? It is to my understanding part of the canonicalization sequence that helps both to eliminate loops (in case all statements can be proven loop invariant) and which also helps SCEV by moving certain SCEVUnknowns further out of the tree. As the existing LICM does not add a large code size increase, it seems to be safe to run in the inliner loop. (LICM is not a pure canonicalization, hence it causes some trouble for Polly, but we are working on addressing that on our side).

mehdi_amini added inline comments. Jan 12 2016, 11:27 PM

lib/Transforms/IPO/PassManagerBuilder.cpp
349 ↗	(On Diff #43509)	No, not within the inliner loop, see line 419 (or 411 before the patch)

ashutosh.nema added inline comments. Jan 12 2016, 11:37 PM

lib/Transforms/IPO/PassManagerBuilder.cpp
349 ↗	(On Diff #43509)	We did try multiple placements, i.e. prior to inlining completion and after it. As this is an internal benchmark, we can’t disclose details, but with these placements we did not notice any degradation in our regular testing. The reason for running just after inlining completes is that other passes might gain from the no-alias version of the loop (as of now we do not have any case where another optimization benefits from the no-alias version of the loop).

I think this is getting really close to being ready.

docs/LangRef.rst
4531 ↗	(On Diff #43509)	For consistency with our other metadata (such as llvm.loop.unroll.disable), we should name this: llvm.loop.licm_versioning.disable
4534 ↗	(On Diff #43509)	This paragraph is too implementation-specific. How about this: This metadata indicates that the loop should not be versioned for the purpose of enabling loop-invariant code motion (LICM). The metadata has a single operand which is the string `llvm.loop.licm_versioning.disable`.
include/llvm/Transforms/Utils/LoopUtils.h
380 ↗	(On Diff #43509)	Remove space in between the * and `TheLoop`.
383 ↗	(On Diff #43509)	How about calling this: addStringMetadataToLoop
lib/Transforms/Scalar/LoopVersioningLICM.cpp
10 ↗	(On Diff #43509)	This first statement is actually misleading; it should always return MayAlias when it is uncertain. How about saying this: When alias analysis is uncertain about the aliasing between any two accesses, it will return MayAlias.
12 ↗	(On Diff #43509)	will -> might
16 ↗	(On Diff #43509)	and the other -> in addition to the original
21 ↗	(On Diff #43509)	"Based on this check result at runtime any of the loop gets executed," -> The result of the runtime check determines which of the loop versions is executed:
28 ↗	(On Diff #43509)	LoopVersioningLICM -> LoopVersioningLICM's
30 ↗	(On Diff #43509)	access -> accesses
31 ↗	(On Diff #43509)	access -> accesses
32 ↗	(On Diff #43509)	of runtime check -> of the runtime check
35 ↗	(On Diff #43509)	transform -> transforms
95 ↗	(On Diff #43509)	Please update to llvm.loop.licm_versioning.disable
100 ↗	(On Diff #43509)	instruction -> instructions
103 ↗	(On Diff #43509)	LoopVersioningLICM threshold minimum -> LoopVersioningLICM's minimum
104 ↗	(On Diff #43509)	for -> of
105 ↗	(On Diff #43509)	instruction -> instructions
105 ↗	(On Diff #43509)	in a -> per
112 ↗	(On Diff #43509)	LoopVersioningLICM -> LoopVersioningLICM's
118 ↗	(On Diff #43509)	LoopVersioningLICM's maximum number of pointers per runtime check
302 ↗	(On Diff #43509)	Why are you checking this?
386 ↗	(On Diff #43509)	Shouldn't you use LAI->getNumRuntimePointerChecks() here instead of PointerSet.size()?
413 ↗	(On Diff #43509)	Performing this check per instruction seems silly. Maybe move to legalLoopStructure?
632 ↗	(On Diff #43509)	do-loop-versioning-licm -> loop-versioning-licm (I often just use DEBUG_TYPE for this; there's no reason for them to be out-of-sync)
643 ↗	(On Diff #43509)	do-loop-versioning-licm -> loop-versioning-licm (or DEBUG_TYPE)

mehdi_amini added inline comments. Jan 15 2016, 11:21 AM

lib/Transforms/IPO/PassManagerBuilder.cpp
349 ↗	(On Diff #43509)	Well as long as it is not enabled by default in the pipeline, it does not matter. It is worth noting that this question may come back later though.

hfinkel added inline comments. Jan 15 2016, 11:29 AM

lib/Transforms/IPO/PassManagerBuilder.cpp
349 ↗	(On Diff #43509)	I don't think that making functions larger prior to inlining is something we can justify (i.e. I'm sure that will cause regressions). Also, as functions get inlined, it is likely to turn out that aliasing that we could not reason about in the function in isolation can be reasoned about in the context of a particular caller. Making the decision to version early will be problematic in that sense as well (it would be nice to think that, in such a case, we'd statically evaluate the runtime condition one way or the other, but I don't think that can happen now either).

mehdi_amini added inline comments. Jan 15 2016, 11:42 AM

lib/Transforms/IPO/PassManagerBuilder.cpp
349 ↗	(On Diff #43509)	After talking on IRC, the answer to my original question about the LICM that runs later in the pipeline is that it runs after the vectorizer as part of post-vectorize cleanup. The current point of the multi-versioning LICM is different: it may help to allow better vectorization, so it makes sense to differentiate with the existing LICM.

hfinkel added inline comments. Jan 15 2016, 11:47 AM

lib/Transforms/IPO/PassManagerBuilder.cpp
349 ↗	(On Diff #43509)	Correct. One issue, however, is that unconditionally running LICM after the versioning pass is not ideal (in the compile-time sense), because LICM is not super cheap. Given that we broke LICM up into a set of utility functions (exposed in include/llvm/Transforms/Utils/LoopUtils.h), we should really call them from this pass only if we actually do anything.

anemet added inline comments. Jan 15 2016, 11:57 AM

lib/Transforms/IPO/PassManagerBuilder.cpp
346–349 ↗	(On Diff #43509)	@ashutosh.nema, We should document the outcome of the above discussion in a comment here. E.g.: The reason we run LICMLVer right after the CGSCC pass manager is so that later passes can take advantage of the new non-alias annotations. And the reason we actually schedule an LICM at this point is so that invariant accesses to globals can be hoisted out of the loop, which could allow further vectorization. @hfinkel, Please let me know if any of this is inaccurate.

ashutosh.nema added inline comments. Jan 18 2016, 10:34 PM

lib/Transforms/IPO/PassManagerBuilder.cpp
346–349 ↗	(On Diff #43509)	Sure Adam, will document key points in comments.
349 ↗	(On Diff #43509)	LICM utilities can be called from LoopVersioningLICM, but we identified issues doing so: in a few cases we noticed that LoopInfo for the cloned loop is not updated, which can make this optimization ineffective. I understand that running the complete pass is expensive in compile time, but if we run the LICM pass afterwards we do not see issues like LoopInfo not getting updated. Also, since this is an optional pass, it should be OK to run LICM.

anemet added inline comments. Jan 20 2016, 6:05 PM

lib/Transforms/Scalar/LoopVersioningLICM.cpp
555–575 ↗	(On Diff #43509)	You probably want to add a comment saying that you can add no-alias between any pairs of memory operations because you ignore loops with must aliasing accesses. Otherwise I don't think this would be valid.
622–623 ↗	(On Diff #43509)	Why is this necessary? LoopVersioning is supposed to update the DT.

Addressed Hal's & Adam's comments.

Herald added a subscriber: MatzeB. Jan 22 2016, 5:00 AM

ashutosh.nema added inline comments. Jan 22 2016, 5:01 AM

lib/Transforms/Scalar/LoopVersioningLICM.cpp
302 ↗	(On Diff #43509)	Loop trip count is required for runtime check generation, as the bound checks are based on this.
386 ↗	(On Diff #43509)	‘LAI->getNumRuntimePointerChecks()’ returns number of runtime checks but here we are checking maximum possible pointers in runtime check.
632 ↗	(On Diff #43509)	Tried making it 'loop-versioning-licm' but later found command line error mentioning option registered more than once. So, made DEBUG_TYPE as “do-loop-versioning-licm”. Hoping this should be OK.
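Why the trip count matters for the bound checks (comment on line 302) can be shown with a small sketch: the range touched by a strided access is only known once the number of iterations is known. The names `AccessRange`, `accessBounds` and `mayOverlap` are hypothetical, and addresses are modelled as plain integers. In the pass itself these bounds are derived from SCEV expressions, not computed like this.

```cpp
#include <cassert>
#include <cstdint>

// Half-open byte range [Low, High) touched by one access across all iterations.
struct AccessRange {
  std::int64_t Low;
  std::int64_t High;
};

// Range covered by an access that starts at Base, advances by StrideBytes
// per iteration (possibly negative or zero), and touches AccessBytes bytes.
AccessRange accessBounds(std::int64_t Base, std::int64_t StrideBytes,
                         std::int64_t TripCount, std::int64_t AccessBytes) {
  assert(TripCount > 0 && AccessBytes > 0 && "need a known, nonzero trip count");
  const std::int64_t First = Base;
  const std::int64_t Last = Base + StrideBytes * (TripCount - 1);
  const std::int64_t Low = First < Last ? First : Last;
  const std::int64_t High = (First < Last ? Last : First) + AccessBytes;
  return {Low, High};
}

// Two accesses may alias at runtime only if their ranges intersect.
bool mayOverlap(const AccessRange &X, const AccessRange &Y) {
  return X.Low < Y.High && Y.Low < X.High;
}
```

For ten iterations of a 4-byte access with stride 4 starting at address 1000, the range is [1000, 1040). Without the trip count there is no upper bound to compare against.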

Hi Hal,

Does this look OK now?

Regards,
Ashutosh

hfinkel added inline comments. Feb 2 2016, 9:44 PM

lib/Transforms/IPO/PassManagerBuilder.cpp
120 ↗	(On Diff #45680)	"loop-versioning-licm" -> "enable-loop-versioning-licm"
lib/Transforms/Scalar/LoopVersioningLICM.cpp
303 ↗	(On Diff #45680)	Okay, please explain this better in the comment. Something like: // We need to be able to compute the loop trip count in order to generate the bound checks.
387 ↗	(On Diff #45680)	Why do you care how many pointers there are? I'd think only the number of checks generated matters in terms of cost.
633 ↗	(On Diff #45680)	No, please rename the other option (as I noted by the other option, adding enable- as a prefix would be natural). Then make this change.

Thanks Hal.

I'll work on your comments and come back.

lib/Transforms/Scalar/LoopVersioningLICM.cpp
633 ↗	(On Diff #45680)	I did not understand this comment completely. Are you saying to use "enable-loop-versioning-licm" here, in DEBUG_TYPE & in PassManagerBuilder?

hfinkel added inline comments. Feb 3 2016, 4:23 AM

lib/Transforms/Scalar/LoopVersioningLICM.cpp
633 ↗	(On Diff #45680)	Use "enable-loop-versioning-licm" in PassManagerBuilder, and then use "loop-versioning-licm" in DEBUG_TYPE and here. Thanks!

Thanks Hal for clarification, will make these changes and come back.

Regards,
Ashutosh

Addressed Hal's comments.

Corrected include file order.

LGTM, thanks!

Closed by commit rL259986: New Loop Versioning LICM Pass (authored by Ashutosh). Feb 5 2016, 11:52 PM

This revision was automatically updated to reflect the committed changes.

Thanks Hal.

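Revision Contents

Path	Size
llvm/trunk/docs/LangRef.rst	11 lines
llvm/trunk/include/llvm/InitializePasses.h	1 line
llvm/trunk/include/llvm/LinkAllPasses.h	1 line
llvm/trunk/include/llvm/Transforms/Scalar.h	6 lines
llvm/trunk/include/llvm/Transforms/Utils/LoopUtils.h	8 lines
llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp	14 lines
llvm/trunk/lib/Transforms/Scalar/CMakeLists.txt	1 line
llvm/trunk/lib/Transforms/Scalar/LoopVersioningLICM.cpp	620 lines
llvm/trunk/lib/Transforms/Scalar/Scalar.cpp	1 line
llvm/trunk/test/Transforms/LoopVersioningLICM/loopversioningLICM1.ll	66 lines
llvm/trunk/test/Transforms/LoopVersioningLICM/loopversioningLICM2.ll	51 lines
llvm/trunk/test/Transforms/LoopVersioningLICM/loopversioningLICM3.ll	44 lines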

Diff 47076

llvm/trunk/docs/LangRef.rst

[...]
 This metadata suggests that the loop should be unrolled fully. The
 metadata has a single operand which is the string ``llvm.loop.unroll.full``.
 For example:

 .. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.full"}

+'``llvm.loop.licm_versioning.disable``' Metadata
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This metadata indicates that the loop should not be versioned for the purpose
+of enabling loop-invariant code motion (LICM). The metadata has a single operand
+which is the string ``llvm.loop.licm_versioning.disable``. For example:
+
+.. code-block:: llvm
+
+   !0 = !{!"llvm.loop.licm_versioning.disable"}
+
 '``llvm.mem``'
 ^^^^^^^^^^^^^^^

 Metadata types used to annotate memory accesses with information helpful
 for optimizations are prefixed with ``llvm.mem``.

 '``llvm.mem.parallel_loop_access``' Metadata
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[...]

llvm/trunk/include/llvm/InitializePasses.h

[...]
 void initializeLoopRotatePass(PassRegistry&);
 void initializeLoopSimplifyPass(PassRegistry&);
 void initializeLoopSimplifyCFGPass(PassRegistry&);
 void initializeLoopStrengthReducePass(PassRegistry&);
 void initializeGlobalMergePass(PassRegistry&);
 void initializeLoopRerollPass(PassRegistry&);
 void initializeLoopUnrollPass(PassRegistry&);
 void initializeLoopUnswitchPass(PassRegistry&);
+void initializeLoopVersioningLICMPass(PassRegistry&);
 void initializeLoopIdiomRecognizePass(PassRegistry&);
 void initializeLowerAtomicPass(PassRegistry&);
 void initializeLowerBitSetsPass(PassRegistry&);
 void initializeLowerExpectIntrinsicPass(PassRegistry&);
 void initializeLowerIntrinsicsPass(PassRegistry&);
 void initializeLowerInvokePass(PassRegistry&);
 void initializeLowerSwitchPass(PassRegistry&);
 void initializeLowerEmuTLSPass(PassRegistry&);
[...]

llvm/trunk/include/llvm/LinkAllPasses.h

[...] ForcePassLinking() {
       (void) llvm::createLoopExtractorPass();
       (void) llvm::createLoopInterchangePass();
       (void) llvm::createLoopSimplifyPass();
       (void) llvm::createLoopSimplifyCFGPass();
       (void) llvm::createLoopStrengthReducePass();
       (void) llvm::createLoopRerollPass();
       (void) llvm::createLoopUnrollPass();
       (void) llvm::createLoopUnswitchPass();
+      (void) llvm::createLoopVersioningLICMPass();
       (void) llvm::createLoopIdiomPass();
       (void) llvm::createLoopRotatePass();
       (void) llvm::createLowerExpectIntrinsicPass();
       (void) llvm::createLowerInvokePass();
       (void) llvm::createLowerSwitchPass();
       (void) llvm::createNaryReassociatePass();
       (void) llvm::createObjCARCAAWrapperPass();
       (void) llvm::createObjCARCAPElimPass();
[...]

llvm/trunk/include/llvm/Transforms/Scalar.h

[...]
 //===----------------------------------------------------------------------===//
 //
 // LoopIdiom - This pass recognizes and replaces idioms in loops.
 //
 Pass *createLoopIdiomPass();

 //===----------------------------------------------------------------------===//
 //
+// LoopVersioningLICM - This pass is a loop versioning pass for LICM.
+//
+Pass *createLoopVersioningLICMPass();
+
+//===----------------------------------------------------------------------===//
+//
 // PromoteMemoryToRegister - This pass is used to promote memory references to
 // be register references. A simple example of the transformation performed by
 // this pass is:
 //
 //        FROM CODE                           TO CODE
 //   %X = alloca i32, i32 1                 ret i32 42
 //   store i32 42, i32 *%X
 //   %Y = load i32* %X
[...]

llvm/trunk/include/llvm/Transforms/Utils/LoopUtils.h

[...]
 /// \brief Computes safety information for a loop
 /// checks loop body & header for the possibility of may throw
 /// exception, it takes LICMSafetyInfo and loop as argument.
 /// Updates safety information in LICMSafetyInfo argument.
 void computeLICMSafetyInfo(LICMSafetyInfo *, Loop *);

 /// \brief Returns the instructions that use values defined in the loop.
 SmallVector<Instruction *, 8> findDefsUsedOutsideOfLoop(Loop *L);
+
+/// \brief Check string metadata into loop, if it exist return true,
+/// else return false.
+bool checkStringMetadataIntoLoop(Loop *TheLoop, StringRef Name);
+
+/// \brief Set input string into loop metadata by keeping other values intact.
+void addStringMetadataToLoop(Loop *TheLoop, const char *MDString,
+                             unsigned V = 0);
 }

 #endif

llvm/trunk/lib/Transforms/IPO/PassManagerBuilder.cpp

[...] static cl::opt<std::string> RunPGOInstrGen(
     cl::desc("Enable generation phase of PGO instrumentation and specify the "
              "path of profile data file"));

 static cl::opt<std::string> RunPGOInstrUse(
     "profile-use", cl::init(""), cl::Hidden, cl::value_desc("filename"),
     cl::desc("Enable use phase of PGO instrumentation and specify the path "
              "of profile data file"));

+static cl::opt<bool> UseLoopVersioningLICM(
+    "enable-loop-versioning-licm", cl::init(false), cl::Hidden,
+    cl::desc("Enable the experimental Loop Versioning LICM pass"));
+
 PassManagerBuilder::PassManagerBuilder() {
     OptLevel = 2;
     SizeLevel = 0;
     LibraryInfo = nullptr;
     Inliner = nullptr;
     FunctionIndex = nullptr;
     DisableUnitAtATime = false;
     DisableUnrollLoops = false;
[...] void PassManagerBuilder::populateModulePassManager(
   MPM.add(createInstructionCombiningPass()); // Clean up after everything.
   addExtensionsToPM(EP_Peephole, MPM);

   // FIXME: This is a HACK! The inliner pass above implicitly creates a CGSCC
   // pass manager that we are specifically trying to avoid. To prevent this
   // we must insert a no-op module pass to reset the pass manager.
   MPM.add(createBarrierNoopPass());

+  // Scheduling LoopVersioningLICM when inining is over, because after that
+  // we may see more accurate aliasing. Reason to run this late is that too
+  // early versioning may prevent further inlining due to increase of code
+  // size. By placing it just after inlining other optimizations which runs
+  // later might get benefit of no-alias assumption in clone loop.
+  if (UseLoopVersioningLICM) {
+    MPM.add(createLoopVersioningLICMPass()); // Do LoopVersioningLICM
+    MPM.add(createLICMPass()); // Hoist loop invariants
+  }
+
   if (!DisableUnitAtATime)
     MPM.add(createReversePostOrderFunctionAttrsPass());

   if (!DisableUnitAtATime && OptLevel > 1 && !PrepareForLTO) {
     // Remove avail extern fns and globals definitions if we aren't
     // compiling an object file for later LTO. For LTO we want to preserve
     // these so they are eligible for inlining at link-time. Note if they
     // are unreferenced they will be removed by GlobalDCE later, so
[...]

llvm/trunk/lib/Transforms/Scalar/CMakeLists.txt

[...] add_llvm_library(LLVMScalarOpts
   LoopInterchange.cpp
   LoopLoadElimination.cpp
   LoopRerollPass.cpp
   LoopRotation.cpp
   LoopSimplifyCFG.cpp
   LoopStrengthReduce.cpp
   LoopUnrollPass.cpp
   LoopUnswitch.cpp
+  LoopVersioningLICM.cpp
   LowerAtomic.cpp
   LowerExpectIntrinsic.cpp
   MemCpyOptimizer.cpp
   MergedLoadStoreMotion.cpp
   NaryReassociate.cpp
   PartiallyInlineLibCalls.cpp
   PlaceSafepoints.cpp
   Reassociate.cpp
[...]

llvm/trunk/lib/Transforms/Scalar/LoopVersioningLICM.cpp

				//===----------- LoopVersioningLICM.cpp - LICM Loop Versioning ------------===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// When alias analysis is uncertain about the aliasing between any two accesses,
				// it will return MayAlias. This uncertainty from alias analysis restricts LICM
				// from proceeding further. In cases where alias analysis is uncertain we might
				// use loop versioning as an alternative.
				//
				// Loop Versioning will create a version of the loop with aggressive aliasing
				// assumptions in addition to the original with conservative (default) aliasing
				// assumptions. The version of the loop making aggressive aliasing assumptions
				// will have all the memory accesses marked as no-alias. These two versions of
				// loop will be preceded by a memory runtime check. This runtime check consists
				// of bound checks for all unique memory accessed in loop, and it ensures the
				// lack of memory aliasing. The result of the runtime check determines which of
				// the loop versions is executed: If the runtime check detects any memory
				// aliasing, then the original loop is executed. Otherwise, the version with
				// aggressive aliasing assumptions is used.
				//
				// Following are the top level steps:
				//
				// a) Perform LoopVersioningLICM's feasibility check.
				// b) If loop is a candidate for versioning then create a memory bound check,
				// by considering all the memory accesses in loop body.
				// c) Clone original loop and set all memory accesses as no-alias in new loop.
				// d) Set original loop & versioned loop as a branch target of the runtime check
				// result.
				//
				// It transforms loop as shown below:
				//
				//                         +----------------+
				//                         |Runtime Memcheck|
				//                         +----------------+
				//                                 |
				//              +----------+----------------+----------+
				//              |                                      |
				//    +---------+----------+               +-----------+----------+
				//    |Orig Loop Preheader |               |Cloned Loop Preheader |
				//    +--------------------+               +----------------------+
				//              |                                      |
				//    +--------------------+               +----------------------+
				//    |Orig Loop Body      |               |Cloned Loop Body      |
				//    +--------------------+               +----------------------+
				//              |                                      |
				//    +--------------------+               +----------------------+
				//    |Orig Loop Exit Block|               |Cloned Loop Exit Block|
				//    +--------------------+               +-----------+----------+
				//              |                                      |
				//              +----------+--------------+-----------+
				//                                 |
				//                           +-----+----+
				//                           |Join Block|
				//                           +----------+
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/ADT/MapVector.h"
				#include "llvm/ADT/SmallPtrSet.h"
				#include "llvm/ADT/Statistic.h"
				#include "llvm/ADT/StringExtras.h"
				#include "llvm/Analysis/AliasAnalysis.h"
				#include "llvm/Analysis/AliasSetTracker.h"
				#include "llvm/Analysis/ConstantFolding.h"
				#include "llvm/Analysis/GlobalsModRef.h"
				#include "llvm/Analysis/LoopAccessAnalysis.h"
				#include "llvm/Analysis/LoopInfo.h"
				#include "llvm/Analysis/LoopPass.h"
				#include "llvm/Analysis/ScalarEvolution.h"
				#include "llvm/Analysis/ScalarEvolutionExpander.h"
				#include "llvm/Analysis/TargetLibraryInfo.h"
				#include "llvm/Analysis/ValueTracking.h"
				#include "llvm/Analysis/VectorUtils.h"
				#include "llvm/IR/Dominators.h"
				#include "llvm/IR/IntrinsicInst.h"
				#include "llvm/IR/MDBuilder.h"
				#include "llvm/IR/PatternMatch.h"
				#include "llvm/IR/PredIteratorCache.h"
				#include "llvm/IR/Type.h"
				#include "llvm/Support/Debug.h"
				#include "llvm/Support/raw_ostream.h"
				#include "llvm/Transforms/Scalar.h"
				#include "llvm/Transforms/Utils/BasicBlockUtils.h"
				#include "llvm/Transforms/Utils/Cloning.h"
				#include "llvm/Transforms/Utils/LoopUtils.h"
				#include "llvm/Transforms/Utils/LoopVersioning.h"
				#include "llvm/Transforms/Utils/ValueMapper.h"

				#define DEBUG_TYPE "loop-versioning-licm"
				#define LOOP_VERSIONING_LICM_METADATA "llvm.loop.licm_versioning.disable"

				using namespace llvm;

				/// Threshold minimum allowed percentage for possible
				/// invariant instructions in a loop.
				static cl::opt<float>
				LVInvarThreshold("-licm-versioning-invariant-threshold",
				cl::desc("LoopVersioningLICM's minimum allowed percentage "
				"of possible invariant instructions per loop"),
				cl::init(25), cl::Hidden);

				/// Threshold for maximum allowed loop nest/depth
				static cl::opt<unsigned> LVLoopDepthThreshold(
				"-licm-versioning-max-depth-threshold",
				cl::desc(
				"LoopVersioningLICM's threshold for maximum allowed loop nest/depth"),
				cl::init(2), cl::Hidden);

				/// \brief Create MDNode for input string.
				static MDNode *createStringMetadata(Loop *TheLoop, StringRef Name, unsigned V) {
				LLVMContext &Context = TheLoop->getHeader()->getContext();
				Metadata *MDs[] = {
				MDString::get(Context, Name),
				ConstantAsMetadata::get(ConstantInt::get(Type::getInt32Ty(Context), V))};
				return MDNode::get(Context, MDs);
				}

				/// \brief Check for string metadata in the loop; return true if it exists,
				/// else return false.
				bool llvm::checkStringMetadataIntoLoop(Loop *TheLoop, StringRef Name) {
				MDNode *LoopID = TheLoop->getLoopID();
				// Return false if LoopID is null.
				if (!LoopID)
				return false;
				// Iterate over LoopID operands and look for MDString Metadata
				for (unsigned i = 1, e = LoopID->getNumOperands(); i < e; ++i) {
				MDNode *MD = dyn_cast<MDNode>(LoopID->getOperand(i));
				if (!MD)
				continue;
				MDString *S = dyn_cast<MDString>(MD->getOperand(0));
				if (!S)
				continue;
				// Return true if MDString holds expected MetaData.
				if (Name.equals(S->getString()))
				return true;
				}
				return false;
				}

				/// \brief Set input string into loop metadata by keeping other values intact.
				void llvm::addStringMetadataToLoop(Loop *TheLoop, const char *MDString,
				unsigned V) {
				SmallVector<Metadata *, 4> MDs(1);
				// If the loop already has metadata, retain it.
				MDNode *LoopID = TheLoop->getLoopID();
				if (LoopID) {
				for (unsigned i = 1, ie = LoopID->getNumOperands(); i < ie; ++i) {
				MDNode *Node = cast<MDNode>(LoopID->getOperand(i));
				MDs.push_back(Node);
				}
				}
				// Add new metadata.
				MDs.push_back(createStringMetadata(TheLoop, MDString, V));
				// Replace current metadata node with new one.
				LLVMContext &Context = TheLoop->getHeader()->getContext();
				MDNode *NewLoopID = MDNode::get(Context, MDs);
				// Set operand 0 to refer to the loop id itself.
				NewLoopID->replaceOperandWith(0, NewLoopID);
				TheLoop->setLoopID(NewLoopID);
				}

				namespace {
				struct LoopVersioningLICM : public LoopPass {
				static char ID;

				bool runOnLoop(Loop *L, LPPassManager &LPM) override;

				void getAnalysisUsage(AnalysisUsage &AU) const override {
				AU.setPreservesCFG();
				AU.addRequired<AAResultsWrapperPass>();
				AU.addRequired<DominatorTreeWrapperPass>();
				AU.addRequiredID(LCSSAID);
				AU.addRequired<LoopAccessAnalysis>();
				AU.addRequired<LoopInfoWrapperPass>();
				AU.addRequiredID(LoopSimplifyID);
				AU.addRequired<ScalarEvolutionWrapperPass>();
				AU.addRequired<TargetLibraryInfoWrapperPass>();
				AU.addPreserved<AAResultsWrapperPass>();
				AU.addPreserved<GlobalsAAWrapperPass>();
				}

				using llvm::Pass::doFinalization;

				bool doFinalization() override { return false; }

				LoopVersioningLICM()
				: LoopPass(ID), AA(nullptr), SE(nullptr), LI(nullptr), DT(nullptr),
				TLI(nullptr), LAA(nullptr), LAI(nullptr), Changed(false),
				Preheader(nullptr), CurLoop(nullptr), CurAST(nullptr),
				LoopDepthThreshold(LVLoopDepthThreshold),
				InvariantThreshold(LVInvarThreshold), LoadAndStoreCounter(0),
				InvariantCounter(0), IsReadOnlyLoop(true) {
				initializeLoopVersioningLICMPass(*PassRegistry::getPassRegistry());
				}

				AliasAnalysis *AA; // Current AliasAnalysis information
				ScalarEvolution *SE; // Current ScalarEvolution
				LoopInfo *LI; // Current LoopInfo
				DominatorTree *DT; // Dominator Tree for the current Loop.
				TargetLibraryInfo *TLI; // TargetLibraryInfo for constant folding.
				LoopAccessAnalysis *LAA; // Current LoopAccessAnalysis
				const LoopAccessInfo *LAI; // Current Loop's LoopAccessInfo

				bool Changed; // Set to true when we change anything.
				BasicBlock *Preheader; // The preheader block of the current loop.
				Loop *CurLoop; // The current loop we are working on.
				AliasSetTracker *CurAST; // AliasSet information for the current loop.
				ValueToValueMap Strides;

				unsigned LoopDepthThreshold; // Maximum loop nest threshold
				float InvariantThreshold; // Minimum invariant threshold
				unsigned LoadAndStoreCounter; // Counter to track num of load & store
				unsigned InvariantCounter; // Counter to track num of invariant
				bool IsReadOnlyLoop; // Read only loop marker.

				bool isLegalForVersioning();
				bool legalLoopStructure();
				bool legalLoopInstructions();
				bool legalLoopMemoryAccesses();
				void collectStridedAccess(Value *LoadOrStoreInst);
				bool isLoopAlreadyVisited();
				void setNoAliasToLoop(Loop *);
				bool instructionSafeForVersioning(Instruction *);
				const char *getPassName() const override { return "Loop Versioning"; }
				};
				}

				/// \brief Collects stride access from a given value.
				void LoopVersioningLICM::collectStridedAccess(Value *MemAccess) {
				Value *Ptr = nullptr;
				if (LoadInst *LI = dyn_cast<LoadInst>(MemAccess))
				Ptr = LI->getPointerOperand();
				else if (StoreInst *SI = dyn_cast<StoreInst>(MemAccess))
				Ptr = SI->getPointerOperand();
				else
				return;

				Value *Stride = getStrideFromPointer(Ptr, SE, CurLoop);
				if (!Stride)
				return;

				DEBUG(dbgs() << "Found a strided access that we can version");
				DEBUG(dbgs() << " Ptr: " << Ptr << " Stride: " << Stride << "\n");
				Strides[Ptr] = Stride;
				}

				/// \brief Check loop structure and confirms it's good for LoopVersioningLICM.
				bool LoopVersioningLICM::legalLoopStructure() {
				// Loop must have a preheader, if not return false.
				if (!CurLoop->getLoopPreheader()) {
				DEBUG(dbgs() << " loop preheader is missing\n");
				return false;
				}
				// Loop should be innermost loop, if not return false.
				if (CurLoop->getSubLoops().size()) {
				DEBUG(dbgs() << " loop is not innermost\n");
				return false;
				}
				// Loop should have a single backedge, if not return false.
				if (CurLoop->getNumBackEdges() != 1) {
				DEBUG(dbgs() << " loop has multiple backedges\n");
				return false;
				}
				// Loop must have a single exiting block, if not return false.
				if (!CurLoop->getExitingBlock()) {
				DEBUG(dbgs() << " loop has multiple exiting block\n");
				return false;
				}
				// We only handle bottom-tested loop, i.e. loop in which the condition is
				// checked at the end of each iteration. With that we can assume that all
				// instructions in the loop are executed the same number of times.
				if (CurLoop->getExitingBlock() != CurLoop->getLoopLatch()) {
				DEBUG(dbgs() << " loop is not bottom tested\n");
				return false;
				}
				// Parallel loops must not have aliasing loop-invariant memory accesses.
				// Hence we don't need to version anything in this case.
				if (CurLoop->isAnnotatedParallel()) {
				DEBUG(dbgs() << " Parallel loop is not worth versioning\n");
				return false;
				}
				// Loop depth more than LoopDepthThreshold is not allowed
				if (CurLoop->getLoopDepth() > LoopDepthThreshold) {
				DEBUG(dbgs() << " loop depth is more than threshold\n");
				return false;
				}
				// Loop should have a dedicated exit block, if not return false.
				if (!CurLoop->hasDedicatedExits()) {
				DEBUG(dbgs() << " loop does not have dedicated exit blocks\n");
				return false;
				}
				// We need to be able to compute the loop trip count in order
				// to generate the bound checks.
				const SCEV *ExitCount = SE->getBackedgeTakenCount(CurLoop);
				if (ExitCount == SE->getCouldNotCompute()) {
				DEBUG(dbgs() << " loop does not have a trip count\n");
				return false;
				}
				return true;
				}

				/// \brief Check memory accesses in loop and confirms it's good for
				/// LoopVersioningLICM.
				bool LoopVersioningLICM::legalLoopMemoryAccesses() {
				bool HasMayAlias = false;
				bool TypeSafety = false;
				bool HasMod = false;
				// Memory check:
				// Transform phase will generate a versioned loop and also a runtime check to
				// ensure the pointers are independent and they don’t alias.
				// In version variant of loop, alias meta data asserts that all access are
				// mutually independent.
				//
				// Pointers aliasing in alias domain are avoided because with multiple
				// aliasing domains we may not be able to hoist potential loop invariant
				// access out of the loop.
				//
				// Iterate over the alias tracker's sets, and confirm that none of them
				// is a must-alias set.
				for (const auto &I : *CurAST) {
				const AliasSet &AS = I;
				// Skip Forward Alias Sets, as this should be ignored as part of
				// the AliasSetTracker object.
				if (AS.isForwardingAliasSet())
				continue;
				// With MustAlias it's not worth adding a runtime bound check.
				if (AS.isMustAlias())
				return false;
				Value *SomePtr = AS.begin()->getValue();
				bool TypeCheck = true;
				// Check for Mod & MayAlias
				HasMayAlias |= AS.isMayAlias();
				HasMod |= AS.isMod();
				for (const auto &A : AS) {
				Value *Ptr = A.getValue();
				// Alias tracker should have pointers of same data type.
				TypeCheck = (TypeCheck && (SomePtr->getType() == Ptr->getType()));
				}
				// At least one alias tracker should have pointers of same data type.
				TypeSafety |= TypeCheck;
				}
				// Ensure types should be of same type.
				if (!TypeSafety) {
				DEBUG(dbgs() << " Alias tracker type safety failed!\n");
				return false;
				}
				// Ensure the loop body is not read-only.
				if (!HasMod) {
				DEBUG(dbgs() << " No memory modified in loop body\n");
				return false;
				}
				// Make sure at least one alias set is a may-alias set.
				// If there is no aliasing ambiguity, return false.
				if (!HasMayAlias) {
				DEBUG(dbgs() << " No ambiguity in memory access.\n");
				return false;
				}
				return true;
				}

				/// \brief Check loop instructions safe for Loop versioning.
				/// It returns true if it's safe else returns false.
				/// Consider following:
				/// 1) Check all load store in loop body are non atomic & non volatile.
				/// 2) Check function call safety, by ensuring it's not accessing memory.
				/// 3) Loop body shouldn't have any may throw instruction.
				bool LoopVersioningLICM::instructionSafeForVersioning(Instruction *I) {
				assert(I != nullptr && "Null instruction found!");
				// Check function call safety
				if (dyn_cast<CallInst>(I) && !AA->doesNotAccessMemory(CallSite(I))) {
				DEBUG(dbgs() << " Unsafe call site found.\n");
				return false;
				}
				// Avoid loops with possibility of throw
				if (I->mayThrow()) {
				DEBUG(dbgs() << " May throw instruction found in loop body\n");
				return false;
				}
				// If the current instruction is a load instruction,
				// make sure it's a simple load (non atomic & non volatile)
				if (I->mayReadFromMemory()) {
				LoadInst *Ld = dyn_cast<LoadInst>(I);
				if (!Ld || !Ld->isSimple()) {
				DEBUG(dbgs() << " Found a non-simple load.\n");
				return false;
				}
				LoadAndStoreCounter++;
				collectStridedAccess(Ld);
				Value *Ptr = Ld->getPointerOperand();
				// Check loop invariant.
				if (SE->isLoopInvariant(SE->getSCEV(Ptr), CurLoop))
				InvariantCounter++;
				}
				// If the current instruction is a store instruction,
				// make sure it's a simple store (non atomic & non volatile)
				else if (I->mayWriteToMemory()) {
				StoreInst *St = dyn_cast<StoreInst>(I);
				if (!St || !St->isSimple()) {
				DEBUG(dbgs() << " Found a non-simple store.\n");
				return false;
				}
				LoadAndStoreCounter++;
				collectStridedAccess(St);
				Value *Ptr = St->getPointerOperand();
				// Check loop invariant.
				if (SE->isLoopInvariant(SE->getSCEV(Ptr), CurLoop))
				InvariantCounter++;

				IsReadOnlyLoop = false;
				}
				return true;
				}

				/// \brief Check loop instructions and confirms it's good for
				/// LoopVersioningLICM.
				bool LoopVersioningLICM::legalLoopInstructions() {
				// Resetting counters.
				LoadAndStoreCounter = 0;
				InvariantCounter = 0;
				IsReadOnlyLoop = true;
				// Iterate over loop blocks and instructions of each block and check
				// instruction safety.
				for (auto *Block : CurLoop->getBlocks())
				for (auto &Inst : *Block) {
				// If the instruction is unsafe just return false.
				if (!instructionSafeForVersioning(&Inst))
				return false;
				}
				// Get LoopAccessInfo from current loop.
				LAI = &LAA->getInfo(CurLoop, Strides);
				// Check LoopAccessInfo for need of runtime check.
				if (LAI->getRuntimePointerChecking()->getChecks().empty()) {
				DEBUG(dbgs() << " LAA: Runtime check not found !!\n");
				return false;
				}
				// Number of runtime-checks should be less than RuntimeMemoryCheckThreshold
				if (LAI->getNumRuntimePointerChecks() >
				VectorizerParams::RuntimeMemoryCheckThreshold) {
				DEBUG(dbgs() << " LAA: Runtime checks are more than threshold !!\n");
				return false;
				}
				// Loop should have at least one invariant load or store instruction.
				if (!InvariantCounter) {
				DEBUG(dbgs() << " Invariant not found !!\n");
				return false;
				}
				// Read only loop not allowed.
				if (IsReadOnlyLoop) {
				DEBUG(dbgs() << " Found a read-only loop!\n");
				return false;
				}
				// Profitability check:
				// Check invariant threshold, should be in limit.
				if (InvariantCounter * 100 < InvariantThreshold * LoadAndStoreCounter) {
				DEBUG(dbgs()
				<< " Invariant load & store are less than defined threshold\n");
				DEBUG(dbgs() << " Invariant loads & stores: "
				<< ((InvariantCounter * 100) / LoadAndStoreCounter) << "%\n");
				DEBUG(dbgs() << " Invariant loads & store threshold: "
				<< InvariantThreshold << "%\n");
				return false;
				}
				return true;
				}

				/// \brief Check whether the loop was already visited.
				/// Inspects the loop metadata; returns true if the loop is being
				/// revisited, else false.
				bool LoopVersioningLICM::isLoopAlreadyVisited() {
				// Check LoopVersioningLICM metadata into loop
				if (checkStringMetadataIntoLoop(CurLoop, LOOP_VERSIONING_LICM_METADATA)) {
				return true;
				}
				return false;
				}

				/// \brief Checks legality for LoopVersioningLICM by considering following:
				/// a) loop structure legality b) loop instruction legality
				/// c) loop memory access legality.
				/// Return true if legal else returns false.
				bool LoopVersioningLICM::isLegalForVersioning() {
				DEBUG(dbgs() << "Loop: " << *CurLoop);
				// Make sure not re-visiting same loop again.
				if (isLoopAlreadyVisited()) {
				DEBUG(
				dbgs() << " Revisiting loop in LoopVersioningLICM not allowed.\n\n");
				return false;
				}
				// Check loop structure legality.
				if (!legalLoopStructure()) {
				DEBUG(
				dbgs() << " Loop structure not suitable for LoopVersioningLICM\n\n");
				return false;
				}
				// Check loop instruction legality.
				if (!legalLoopInstructions()) {
				DEBUG(dbgs()
				<< " Loop instructions not suitable for LoopVersioningLICM\n\n");
				return false;
				}
				// Check loop memory access legality.
				if (!legalLoopMemoryAccesses()) {
				DEBUG(dbgs()
				<< " Loop memory access not suitable for LoopVersioningLICM\n\n");
				return false;
				}
				// Loop versioning is feasible, return true.
				DEBUG(dbgs() << " Loop Versioning found to be beneficial\n\n");
				return true;
				}

				/// \brief Update loop with aggressive aliasing assumptions.
				/// It marks every pair of memory operations as no-alias, assuming the
				/// loop has no must-alias memory access pairs. The legality checks in
				/// LoopVersioningLICM reject loops with must-aliasing memory accesses.
				void LoopVersioningLICM::setNoAliasToLoop(Loop *VerLoop) {
				// Get latch terminator instruction.
				Instruction *I = VerLoop->getLoopLatch()->getTerminator();
				// Create alias scope domain.
				MDBuilder MDB(I->getContext());
				MDNode *NewDomain = MDB.createAnonymousAliasScopeDomain("LVDomain");
				StringRef Name = "LVAliasScope";
				SmallVector<Metadata *, 4> Scopes, NoAliases;
				MDNode *NewScope = MDB.createAnonymousAliasScope(NewDomain, Name);
				// Iterate over each instruction of loop.
				// set no-alias for all load & store instructions.
				for (auto *Block : CurLoop->getBlocks()) {
				for (auto &Inst : *Block) {
				// Only interested in instruction that may modify or read memory.
				if (!Inst.mayReadFromMemory() && !Inst.mayWriteToMemory())
				continue;
				Scopes.push_back(NewScope);
				NoAliases.push_back(NewScope);
				// Set no-alias for current instruction.
				Inst.setMetadata(
				LLVMContext::MD_noalias,
				MDNode::concatenate(Inst.getMetadata(LLVMContext::MD_noalias),
				MDNode::get(Inst.getContext(), NoAliases)));
				// set alias-scope for current instruction.
				Inst.setMetadata(
				LLVMContext::MD_alias_scope,
				MDNode::concatenate(Inst.getMetadata(LLVMContext::MD_alias_scope),
				MDNode::get(Inst.getContext(), Scopes)));
				}
				}
				}

				bool LoopVersioningLICM::runOnLoop(Loop *L, LPPassManager &LPM) {
				if (skipOptnoneFunction(L))
				return false;
				Changed = false;
				// Get Analysis information.
				LI = &getAnalysis<LoopInfoWrapperPass>().getLoopInfo();
				AA = &getAnalysis<AAResultsWrapperPass>().getAAResults();
				SE = &getAnalysis<ScalarEvolutionWrapperPass>().getSE();
				DT = &getAnalysis<DominatorTreeWrapperPass>().getDomTree();
				TLI = &getAnalysis<TargetLibraryInfoWrapperPass>().getTLI();
				LAA = &getAnalysis<LoopAccessAnalysis>();
				LAI = nullptr;
				// Set Current Loop
				CurLoop = L;
				// Get the preheader block.
				Preheader = L->getLoopPreheader();
				// Initial allocation
				CurAST = new AliasSetTracker(*AA);

				// Loop over the body of this loop, construct AST.
				for (auto *Block : L->getBlocks()) {
				if (LI->getLoopFor(Block) == L) // Ignore blocks in subloop.
				CurAST->add(*Block); // Incorporate the specified basic block
				}
				// Check feasibility of LoopVersioningLICM.
				// If versioning is found to be feasible and beneficial then proceed,
				// else simply return after cleaning up memory.
				if (isLegalForVersioning()) {
				// Do loop versioning.
				// Create memcheck for memory accessed inside loop.
				// Clone original loop, and set blocks properly.
				LoopVersioning LVer(*LAI, CurLoop, LI, DT, SE, true);
				LVer.versionLoop();
				// Set Loop Versioning metaData for original loop.
				addStringMetadataToLoop(LVer.getNonVersionedLoop(),
				LOOP_VERSIONING_LICM_METADATA);
				// Set Loop Versioning metaData for version loop.
				addStringMetadataToLoop(LVer.getVersionedLoop(),
				LOOP_VERSIONING_LICM_METADATA);
				// Set "llvm.mem.parallel_loop_access" metaData to versioned loop.
				addStringMetadataToLoop(LVer.getVersionedLoop(),
				"llvm.mem.parallel_loop_access");
				// Update version loop with aggressive aliasing assumption.
				setNoAliasToLoop(LVer.getVersionedLoop());
				Changed = true;
				}
				// Delete allocated memory.
				delete CurAST;
				return Changed;
				}

				char LoopVersioningLICM::ID = 0;
				INITIALIZE_PASS_BEGIN(LoopVersioningLICM, "loop-versioning-licm",
				"Loop Versioning For LICM", false, false)
				INITIALIZE_PASS_DEPENDENCY(AAResultsWrapperPass)
				INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
				INITIALIZE_PASS_DEPENDENCY(GlobalsAAWrapperPass)
				INITIALIZE_PASS_DEPENDENCY(LCSSA)
				INITIALIZE_PASS_DEPENDENCY(LoopAccessAnalysis)
				INITIALIZE_PASS_DEPENDENCY(LoopInfoWrapperPass)
				INITIALIZE_PASS_DEPENDENCY(LoopSimplify)
				INITIALIZE_PASS_DEPENDENCY(ScalarEvolutionWrapperPass)
				INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
				INITIALIZE_PASS_END(LoopVersioningLICM, "loop-versioning-licm",
				"Loop Versioning For LICM", false, false)

				Pass *llvm::createLoopVersioningLICMPass() { return new LoopVersioningLICM(); }
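
The transformation described in the LoopVersioningLICM.cpp header comment can be pictured at source level. What follows is a minimal sketch under stated assumptions, not the pass's implementation: `rangesDisjoint`, `originalLoop` and `versionedLoop` are hypothetical names, and the simple range-overlap test only stands in for the bound checks that the pass obtains from LoopAccessAnalysis.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the runtime memcheck: true when the ranges
// [a, a + aLen) and [b, b + bLen) do not overlap.
bool rangesDisjoint(const int *a, std::size_t aLen,
                    const int *b, std::size_t bLen) {
  const auto aBegin = reinterpret_cast<std::uintptr_t>(a);
  const auto aEnd = reinterpret_cast<std::uintptr_t>(a + aLen);
  const auto bBegin = reinterpret_cast<std::uintptr_t>(b);
  const auto bEnd = reinterpret_cast<std::uintptr_t>(b + bLen);
  return aEnd <= bBegin || bEnd <= aBegin;
}

// Conservative version: acc may alias out[j], so *acc is reloaded and
// stored on every iteration.
void originalLoop(int *out, int *acc, int add, std::size_t n) {
  for (std::size_t j = 0; j < n; ++j) {
    out[j] = add;
    *acc = *acc + add;
  }
}

// Versioned form: the runtime check selects one of two loops.
void versionedLoop(int *out, int *acc, int add, std::size_t n) {
  if (n == 0)
    return; // Mirrors the guard in front of a bottom-tested loop.
  if (rangesDisjoint(out, n, acc, 1)) {
    // No overlap: the access to *acc is invariant and can be hoisted.
    int promoted = *acc;
    for (std::size_t j = 0; j < n; ++j) {
      out[j] = add;
      promoted += add;
    }
    *acc = promoted;
  } else {
    // Possible overlap: fall back to the unmodified loop.
    originalLoop(out, acc, add, n);
  }
}
```

Both functions should leave memory in the same state for any input; the versioned one only differs in how often it touches `*acc` when the ranges are disjoint.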

llvm/trunk/lib/Transforms/Scalar/Scalar.cpp

void llvm::initializeScalarOpts(PassRegistry &Registry) {
[lines not shown]
  initializeLoopAccessAnalysisPass(Registry);
  initializeLoopInstSimplifyPass(Registry);
  initializeLoopInterchangePass(Registry);
  initializeLoopRotatePass(Registry);
  initializeLoopStrengthReducePass(Registry);
  initializeLoopRerollPass(Registry);
  initializeLoopUnrollPass(Registry);
  initializeLoopUnswitchPass(Registry);
+ initializeLoopVersioningLICMPass(Registry);
  initializeLoopIdiomRecognizePass(Registry);
  initializeLowerAtomicPass(Registry);
  initializeLowerExpectIntrinsicPass(Registry);
  initializeMemCpyOptPass(Registry);
  initializeMergedLoadStoreMotionPass(Registry);
  initializeNaryReassociatePass(Registry);
  initializePartiallyInlineLibCallsPass(Registry);
  initializeReassociatePass(Registry);
[lines not shown]

llvm/trunk/test/Transforms/LoopVersioningLICM/loopversioningLICM1.ll

				; RUN: opt < %s -O1 -S -loop-versioning-licm -licm -debug-only=loop-versioning-licm 2>&1 | FileCheck %s
				;
				; Test to confirm loop is a candidate for LoopVersioningLICM.
				; It also confirms invariant moved out of loop.
				;
				; CHECK: Loop: Loop at depth 2 containing: %for.body3<header><latch><exiting>
				; CHECK-NEXT: Loop Versioning found to be beneficial
				;
				; CHECK: for.body3:
				; CHECK-NEXT: %add86 = phi i32 [ %arrayidx7.promoted, %for.body3.ph ], [ %add8, %for.body3 ]
				; CHECK-NEXT: %j.113 = phi i32 [ %j.016, %for.body3.ph ], [ %inc, %for.body3 ]
				; CHECK-NEXT: %idxprom = zext i32 %j.113 to i64
				; CHECK-NEXT: %arrayidx = getelementptr inbounds i32, i32* %var1, i64 %idxprom
				; CHECK-NEXT: store i32 %add, i32* %arrayidx, align 4, !alias.scope !6, !noalias !6
				; CHECK-NEXT: %add8 = add nsw i32 %add86, %add
				; CHECK-NEXT: %inc = add nuw i32 %j.113, 1
				; CHECK-NEXT: %cmp2 = icmp ult i32 %inc, %itr
				; CHECK-NEXT: br i1 %cmp2, label %for.body3, label %for.inc11.loopexit.loopexit5, !llvm.loop !7
				define i32 @foo(i32* nocapture %var1, i32* nocapture readnone %var2, i32* nocapture %var3, i32 %itr) #0 {
				entry:
				%cmp14 = icmp eq i32 %itr, 0
				br i1 %cmp14, label %for.end13, label %for.cond1.preheader.preheader

				for.cond1.preheader.preheader: ; preds = %entry
				br label %for.cond1.preheader

				for.cond1.preheader: ; preds = %for.cond1.preheader.preheader, %for.inc11
				%j.016 = phi i32 [ %j.1.lcssa, %for.inc11 ], [ 0, %for.cond1.preheader.preheader ]
				%i.015 = phi i32 [ %inc12, %for.inc11 ], [ 0, %for.cond1.preheader.preheader ]
				%cmp212 = icmp ult i32 %j.016, %itr
				br i1 %cmp212, label %for.body3.lr.ph, label %for.inc11

				for.body3.lr.ph: ; preds = %for.cond1.preheader
				%add = add i32 %i.015, %itr
				%idxprom6 = zext i32 %i.015 to i64
				%arrayidx7 = getelementptr inbounds i32, i32* %var3, i64 %idxprom6
				br label %for.body3

				for.body3: ; preds = %for.body3.lr.ph, %for.body3
				%j.113 = phi i32 [ %j.016, %for.body3.lr.ph ], [ %inc, %for.body3 ]
				%idxprom = zext i32 %j.113 to i64
				%arrayidx = getelementptr inbounds i32, i32* %var1, i64 %idxprom
				store i32 %add, i32* %arrayidx, align 4
				%0 = load i32, i32* %arrayidx7, align 4
				%add8 = add nsw i32 %0, %add
				store i32 %add8, i32* %arrayidx7, align 4
				%inc = add nuw i32 %j.113, 1
				%cmp2 = icmp ult i32 %inc, %itr
				br i1 %cmp2, label %for.body3, label %for.inc11.loopexit

				for.inc11.loopexit: ; preds = %for.body3
				br label %for.inc11

				for.inc11: ; preds = %for.inc11.loopexit, %for.cond1.preheader
				%j.1.lcssa = phi i32 [ %j.016, %for.cond1.preheader ], [ %itr, %for.inc11.loopexit ]
				%inc12 = add nuw i32 %i.015, 1
				%cmp = icmp ult i32 %inc12, %itr
				br i1 %cmp, label %for.cond1.preheader, label %for.end13.loopexit

				for.end13.loopexit: ; preds = %for.inc11
				br label %for.end13

				for.end13: ; preds = %for.end13.loopexit, %entry
				ret i32 0
				}
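
The profitability test in `legalLoopInstructions()` can be applied by hand to the inner loop of this test. By a manual count, `for.body3` contains three simple memory operations: one store through `%arrayidx`, which varies with `%j.113`, and a load and a store through `%arrayidx7`, which is computed in `for.body3.lr.ph` and is therefore invariant in the inner loop. The sketch below mirrors only the ratio comparison from the pass; `meetsInvariantThreshold` is a hypothetical helper name, and the default of 25 comes from `LVInvarThreshold`.

```cpp
// Hypothetical helper mirroring the comparison in legalLoopInstructions():
// the loop is rejected when
//   InvariantCounter * 100 < InvariantThreshold * LoadAndStoreCounter.
bool meetsInvariantThreshold(unsigned invariantCount,
                             unsigned loadStoreCount,
                             float thresholdPercent) {
  return !(invariantCount * 100 < thresholdPercent * loadStoreCount);
}
```

With the counts above, `meetsInvariantThreshold(2, 3, 25.0f)` compares 200 against 75 and accepts the loop, which is consistent with the "Loop Versioning found to be beneficial" line the test expects. The pass separately rejects loops with no invariant access at all.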

llvm/trunk/test/Transforms/LoopVersioningLICM/loopversioningLICM2.ll

				; RUN: opt < %s -O1 -S -loop-versioning-licm -licm -debug-only=loop-versioning-licm -disable-loop-unrolling 2>&1 | FileCheck %s
				;
				; Test to confirm loop is a good candidate for LoopVersioningLICM
				; It also confirms invariant moved out of loop.
				;
				; CHECK: Loop: Loop at depth 2 containing: %for.body3.us<header><latch><exiting>
				; CHECK-NEXT: Loop Versioning found to be beneficial
				;
				; CHECK: for.cond1.for.inc17_crit_edge.us.loopexit5: ; preds = %for.body3.us
				; CHECK-NEXT: %add14.us.lcssa = phi float [ %add14.us, %for.body3.us ]
				; CHECK-NEXT: store float %add14.us.lcssa, float* %arrayidx.us, align 4, !alias.scope !7, !noalias !8
				; CHECK-NEXT: br label %for.cond1.for.inc17_crit_edge.us
				;
				define i32 @foo(float* nocapture %var2, float** nocapture readonly %var3, i32 %itr) #0 {
				entry:
				%cmp38 = icmp sgt i32 %itr, 1
				br i1 %cmp38, label %for.body3.lr.ph.us, label %for.end19

				for.body3.us: ; preds = %for.body3.us, %for.body3.lr.ph.us
				%0 = phi float [ %.pre, %for.body3.lr.ph.us ], [ %add14.us, %for.body3.us ]
				%indvars.iv = phi i64 [ 1, %for.body3.lr.ph.us ], [ %indvars.iv.next, %for.body3.us ]
				%1 = trunc i64 %indvars.iv to i32
				%conv.us = sitofp i32 %1 to float
				%add.us = fadd float %conv.us, %0
				%arrayidx7.us = getelementptr inbounds float, float* %3, i64 %indvars.iv
				store float %add.us, float* %arrayidx7.us, align 4
				%2 = load float, float* %arrayidx.us, align 4
				%add14.us = fadd float %2, %add.us
				store float %add14.us, float* %arrayidx.us, align 4
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%lftr.wideiv = trunc i64 %indvars.iv.next to i32
				%exitcond = icmp eq i32 %lftr.wideiv, %itr
				br i1 %exitcond, label %for.cond1.for.inc17_crit_edge.us, label %for.body3.us

				for.body3.lr.ph.us: ; preds = %entry, %for.cond1.for.inc17_crit_edge.us
				%indvars.iv40 = phi i64 [ %indvars.iv.next41, %for.cond1.for.inc17_crit_edge.us ], [ 1, %entry ]
				%arrayidx.us = getelementptr inbounds float, float* %var2, i64 %indvars.iv40
				%arrayidx6.us = getelementptr inbounds float*, float** %var3, i64 %indvars.iv40
				%3 = load float*, float** %arrayidx6.us, align 8
				%.pre = load float, float* %arrayidx.us, align 4
				br label %for.body3.us

				for.cond1.for.inc17_crit_edge.us: ; preds = %for.body3.us
				%indvars.iv.next41 = add nuw nsw i64 %indvars.iv40, 1
				%lftr.wideiv42 = trunc i64 %indvars.iv.next41 to i32
				%exitcond43 = icmp eq i32 %lftr.wideiv42, %itr
				br i1 %exitcond43, label %for.end19, label %for.body3.lr.ph.us

				for.end19: ; preds = %for.cond1.for.inc17_crit_edge.us, %entry
				ret i32 0
				}

llvm/trunk/test/Transforms/LoopVersioningLICM/loopversioningLICM3.ll

				; RUN: opt < %s -O1 -S -loop-versioning-licm -debug-only=loop-versioning-licm 2>&1 | FileCheck %s
				;
				; Test to confirm loop is not a candidate for LoopVersioningLICM.
				;
				; CHECK: Loop: Loop at depth 2 containing: %for.body3<header><latch><exiting>
				; CHECK-NEXT: LAA: Runtime check not found !!
				; CHECK-NEXT: Loop instructions not suitable for LoopVersioningLICM

				define i32 @foo(i32* nocapture %var1, i32 %itr) #0 {
				entry:
				%cmp18 = icmp eq i32 %itr, 0
				br i1 %cmp18, label %for.end8, label %for.cond1.preheader

				for.cond1.preheader: ; preds = %entry, %for.inc6
				%j.020 = phi i32 [ %j.1.lcssa, %for.inc6 ], [ 0, %entry ]
				%i.019 = phi i32 [ %inc7, %for.inc6 ], [ 0, %entry ]
				%cmp216 = icmp ult i32 %j.020, %itr
				br i1 %cmp216, label %for.body3.lr.ph, label %for.inc6

				for.body3.lr.ph: ; preds = %for.cond1.preheader
				%0 = zext i32 %j.020 to i64
				br label %for.body3

				for.body3: ; preds = %for.body3, %for.body3.lr.ph
				%indvars.iv = phi i64 [ %0, %for.body3.lr.ph ], [ %indvars.iv.next, %for.body3 ]
				%arrayidx = getelementptr inbounds i32, i32* %var1, i64 %indvars.iv
				%1 = load i32, i32* %arrayidx, align 4
				%add = add nsw i32 %1, %itr
				store i32 %add, i32* %arrayidx, align 4
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%lftr.wideiv = trunc i64 %indvars.iv.next to i32
				%exitcond = icmp eq i32 %lftr.wideiv, %itr
				br i1 %exitcond, label %for.inc6, label %for.body3

				for.inc6: ; preds = %for.body3, %for.cond1.preheader
				%j.1.lcssa = phi i32 [ %j.020, %for.cond1.preheader ], [ %itr, %for.body3 ]
				%inc7 = add nuw i32 %i.019, 1
				%exitcond21 = icmp eq i32 %inc7, %itr
				br i1 %exitcond21, label %for.end8, label %for.cond1.preheader

				for.end8: ; preds = %for.inc6, %entry
				ret i32 0
				}
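
For orientation, the IR of this negative test can be read back into source form. This is a hand reconstruction from the IR above, not code taken from the review, and `fooSketch` is a hypothetical name. Every load and store in the inner loop goes through the same pointer `var1` with the same index, so there is no second pointer to compare against, which fits the expected "LAA: Runtime check not found !!" message.

```cpp
// Hypothetical source-level reading of @foo in loopversioningLICM3.ll.
// j is deliberately not reset between outer iterations, matching the
// %j.020 / %j.1.lcssa phi nodes in the IR.
int fooSketch(int *var1, unsigned itr) {
  unsigned j = 0;
  for (unsigned i = 0; i < itr; ++i)
    for (; j < itr; ++j)
      var1[j] = var1[j] + static_cast<int>(itr);
  return 0;
}
```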

Revision Contents

Diff 47076
