This is an archive of the discontinued LLVM Phabricator instance.

Differential D35918

[GVNHoist] Factor out reachability to search for anticipable instructions quickly
ClosedPublic

Authored by hiraditya on Jul 26 2017, 2:36 PM.

Details

Reviewers

• dberlin
sebpop
gberry
davide

Commits

rGdfa8741c9693: [GVNHoist] Factor out reachability to search for anticipable instructions…
rL313116: [GVNHoist] Factor out reachability to search for anticipable instructions…

Summary

Factor out the reachability such that multiple queries to find reachability of values are fast. This is based on finding the ANTIC points
in the CFG which do not change during hoisting. The ANTIC points are basically the dominance-frontiers in the inverse graph. So we introduce a data structure (CHI nodes)
to keep track of values flowing out of a basic block. We only do this for values with multiple occurrences in the function as they are the potential hoistable candidates.

This patch allows us to hoist instructions to a basic block with >2 successors, as well as deal with infinite loops in a trivial way.
Relevant test cases are added to show the functionality as well as regression fixes from PR32821.

Regression from previous GVNHoist:
We do not hoist fully redundant expressions because fully redundant expressions are already handled by NewGVN

Based on the suggestions from @dberlin and @sebpop

Diff Detail

Repository: rL LLVM

Event Timeline

hiraditya created this revision. Jul 26 2017, 2:36 PM

Herald added a subscriber: Prazek. Jul 26 2017, 2:36 PM

FYI: IDFCalculator can compute the PostDominanceFrontier in linear time.

In D35918#822143, @dberlin wrote:

FYI: IDFCalculator can compute the PostDominanceFrontier in linear time.

I was unable to find how to pass the root node of inverse graph, because there might be more than one root.
Thanks,

In D35918#822168, @hiraditya wrote:

I was unable to find how to pass the root node of inverse graph, because there might be more than one root.
Thanks,

No.
The graph always has one root, virtual or not.
Once kuba's latest patch is committed, it will *always* be virtual, actually.

But i'm not sure why that would stop you one way or the other?
The IDF calculator only asks for the defining blocks, not the root.
You can hand it all CFG blocks and it should work fine.

What do you believe you need to hand it the root node for?

For example, the dominance frontier of the root node is guaranteed to be empty.
"the dominance frontier of a node d is the set of all nodes n such that d dominates an immediate predecessor of n, but d does not strictly dominate n."

The root node strictly dominates everything but itself, and it has no immediate predecessors (successors on the reverse graph).
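
As an illustration of the quoted definition, here is a minimal self-contained sketch that does not use LLVM. The integer-numbered toy graph and the names `computeDominators` and `dominanceFrontier` are assumptions made for this example only. It computes dominator sets by plain iterative intersection and then applies the definition directly, so the frontier of a root with no predecessors comes out empty.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy graph: Succs[n] lists the successors of node n; node 0 is the root.
using Graph = std::vector<std::vector<int>>;

// Dom[n] is the set of nodes that dominate n (including n itself), computed
// with the classic iterative data-flow formulation. Assumes every node is
// reachable from the root.
std::vector<std::set<int>> computeDominators(const Graph &Succs) {
  const int N = static_cast<int>(Succs.size());
  std::vector<std::vector<int>> Preds(N);
  for (int U = 0; U < N; ++U)
    for (int V : Succs[U])
      Preds[V].push_back(U);

  std::set<int> All;
  for (int I = 0; I < N; ++I)
    All.insert(I);
  std::vector<std::set<int>> Dom(N, All);
  Dom[0] = {0};

  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (int V = 1; V < N; ++V) {
      // Intersect the dominator sets of all predecessors, then add V.
      std::set<int> New = All;
      for (int P : Preds[V]) {
        std::set<int> Tmp;
        for (int D : New)
          if (Dom[P].count(D))
            Tmp.insert(D);
        New = Tmp;
      }
      New.insert(V);
      if (New != Dom[V]) {
        Dom[V] = New;
        Changed = true;
      }
    }
  }
  return Dom;
}

// DF(D) = { n : D dominates a predecessor of n, but D does not strictly
// dominate n }.
std::set<int> dominanceFrontier(const Graph &Succs, int D) {
  auto Dom = computeDominators(Succs);
  std::set<int> DF;
  for (int P = 0; P < static_cast<int>(Succs.size()); ++P) {
    if (!Dom[P].count(D))
      continue; // D does not dominate P.
    for (int S : Succs[P]) {
      bool StrictlyDominates = (S != D) && Dom[S].count(D);
      if (!StrictlyDominates)
        DF.insert(S);
    }
  }
  return DF;
}
```

On a diamond-shaped graph (0 branches to 1 and 2, which both reach 3), the frontier of the root 0 is empty, while nodes 1 and 2 each have node 3 in their frontier. Running the same code on the reversed edges would give post-dominance frontiers.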

In D35918#822192, @dberlin wrote:

No.
The graph always has one root, virtual or not.
Once kuba's latest patch is committed, it will *always* be virtual, actually.

But i'm not sure why that would stop you one way or the other?
The IDF calculator only asks for the defining blocks, not the root.
You can hand it all CFG blocks and it should work fine.

What I understand is that IDFCalculator::calculate populates a vector with all the dominance frontiers of the defining blocks (AFAICT from the usage in ADCE.cpp).
How can I create a mapping from a basic block to its (post-)dominance frontier? For this pass I would need such a mapping. I guess I'm having difficulty understanding the code. Thanks for the help.

In D35918#823285, @hiraditya wrote:

What I understand is that IDFCalculator::calculate populates a vector with all the dominance frontiers of the defining blocks (AFAICT from the usage in ADCE.cpp).
How can I create a mapping from a basic block to its (post-)dominance frontier? For this pass I would need such a mapping. I guess I'm having difficulty understanding the code. Thanks for the help.

Such a mapping is by definition n^2 space, and i'm having trouble seeing why it is necessary.

Here is what you are doing:
For each instruction with the same VN:

find the post-dominance frontier of the block of the instruction
Insert a chi there, with certain arguments.

This is unnecessarily wasteful (you may compute the same pdf again and again).

Here is what SSUPRE (and you), should do:

Collect all the blocks of the instructions with the same VN into defining blocks.
Compute the PDF using IDFCalculator.
Place empty chis in the PDF.

At this point, you have two options:

Walk post-dominator tree top-down and use a stack to store the last value you see.
When you hit a chi from a given edge, the value to use as the argument is at the top of the stack.

This is O(Basic Blocks)

The O(instructions+chis) way to do it is:

Make a vector of instructions and chi argument uses. Each should be given the DFS in/out number from the post dominator tree node for the basic block they come from, and a local DFS number (IE order in block) in the case of instructions
A "chi argument use" is created for each incoming edge to the chi, but is empty/fake. These should assume the basic block from the other side of the edge (IE not the chi block, but the edge to the chi block).
Sort by dfs in/out, then local number

Walk vector with a stack.
At each element of vector:
  while (stack is not empty && DFS in/out of current thing in vector is not inside of DFS in/out of top of stack)
    pop stack
  if element you are staring at is a chi use:
    if stack is empty, chi has null operand
    if stack is not empty, set chi argument for the edge to top of stack
  else: // must be an instruction
    if stack is empty, push onto stack
    if stack is not empty, the thing on the stack post-dominates you and you are redundant :)

The forwards version of this algorithm is used by predicateinfo to do SSA renaming.
Your algorithm is the same on the reverse graph, except the chi arguments are virtual :)
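
The walk described above can be sketched in self-contained C++ as follows. This is only one reading of the pseudocode, not code from the patch: the `Entry` struct, its integer `Id` field, and `renameWithStack` are hypothetical names, and the DFS in/out numbers are supplied by hand rather than taken from a real post-dominator tree.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

// One element of the sorted vector: either a real instruction or a (fake) chi
// argument use. DFSIn/DFSOut are the numbers of the post-dominator tree node
// of the block the entry is attributed to; Local orders entries in a block.
struct Entry {
  int DFSIn;
  int DFSOut;
  int Local;
  bool IsChiUse;
  int Id;         // Hypothetical identifier of the instruction or chi edge.
  int ChiOperand; // Result for chi uses; -1 stands for a null operand.
  bool Redundant; // Result for instructions covered by the stack top.
};

// True when A's DFS interval is nested inside B's, i.e. B's tree node is an
// ancestor of (or the same as) A's tree node.
static bool inside(const Entry &A, const Entry &B) {
  return B.DFSIn <= A.DFSIn && A.DFSOut <= B.DFSOut;
}

void renameWithStack(std::vector<Entry> &Entries) {
  // Sort by DFS in/out, then by local number.
  std::sort(Entries.begin(), Entries.end(),
            [](const Entry &L, const Entry &R) {
              return std::tie(L.DFSIn, L.DFSOut, L.Local) <
                     std::tie(R.DFSIn, R.DFSOut, R.Local);
            });

  std::vector<const Entry *> Stack;
  for (Entry &E : Entries) {
    E.ChiOperand = -1;
    E.Redundant = false;

    // Pop everything whose subtree we have left.
    while (!Stack.empty() && !inside(E, *Stack.back()))
      Stack.pop_back();

    if (E.IsChiUse) {
      // Null operand when nothing is on the stack, else the stack top.
      if (!Stack.empty())
        E.ChiOperand = Stack.back()->Id;
    } else if (Stack.empty()) {
      Stack.push_back(&E);
    } else {
      E.Redundant = true;
    }
  }
}
```

With an instruction in a tree node numbered [1, 6], a chi use attributed to a child [2, 3] receives that instruction as its operand, an instruction in another child [4, 5] is marked redundant, and a chi use in an unrelated node [7, 8] gets a null operand because the stack has been popped empty by then.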

Such a mapping is by definition n^2 space, and i'm having trouble seeing why it is necessary.

Here is what you are doing:
For each instruction with the same VN:
find the post-dominance frontier of the block of the instruction
Insert a chi there, with certain arguments.
This is unnecessarily wasteful (you may compute the same pdf again and again).

Here is what SSUPRE (and you), should do:

Collect all the blocks of the instructions with the same VN into defining blocks.
Compute the PDF using IDFCalculator.
Place empty chis in the PDF.

At this point, you have two options:
Walk post-dominator tree top-down and use a stack to store the last value you see.
When you hit a chi from a given edge, the value to use as the argument is at the top of the stack.
This is O(Basic Blocks)

I tried this based on your suggestions, but the post-dominator tree does not work well with infinite loops or CFGs with multiple exits.
I can wait for @kuhar's patch to be merged and stabilized, and then work on this idea.

Because this patch precomputes ANTIC points based on your suggestions, it is already faster than the previous fix of iterating on dominator tree for each instruction to be hoisted.
Can we merge this patch if you think it is good to go, and then I'll work on this idea once the post-dominator patch by @kuhar is ready.
Thank you for writing the algorithm here, it helped me realize how close the algorithm is to PHI-insertion.

For some reason phabricator missed the following comments:

If you are willing to commit to making it better, i'm happy to review this version :)

Yes, I'll update gvn-hoist once post-dominator patch is merged.

Thanks,

-Aditya

From: Daniel Berlin <dberlin at dberlin.org>
Sent: Tuesday, August 1, 2017 11:11 AM
To: reviews+D35918+public+40ddab4ebe264bbb at reviews.llvm.org
Cc: Aditya Kumar; Sebastian Pop; Geoff Berry; Davide Italiano; Kuba Kuderski; Piotr Padlewski; llvm-commits
Subject: Re: [PATCH] D35918: [GVNHoist] Factor out reachability to search for anticipable instructions quickly

If you are willing to commit to making it better, i'm happy to review this version :)

@dberlin, @hiraditya

In D35918#822192, @dberlin wrote:

The graph always has one root, virtual or not.
Once kuba's latest patch is committed, it will *always* be virtual, actually.

That's actually already the case upstream since r309146 (D35597).
https://github.com/llvm-mirror/llvm/blob/22072158f353cc89a7821b13a7c3e99daa5be464/include/llvm/Support/GenericDomTreeConstruction.h#L263
https://github.com/llvm-mirror/llvm/blob/22072158f353cc89a7821b13a7c3e99daa5be464/include/llvm/Support/GenericDomTreeConstruction.h#L307

In D35918#829175, @kuhar wrote:

@dberlin, @hiraditya

In D35918#822192, @dberlin wrote:

The graph always has one root, virtual or not.
Once kuba's latest patch is committed, it will *always* be virtual, actually.

That's actually already the case upstream since r309146 (D35597).
https://github.com/llvm-mirror/llvm/blob/22072158f353cc89a7821b13a7c3e99daa5be464/include/llvm/Support/GenericDomTreeConstruction.h#L263
https://github.com/llvm-mirror/llvm/blob/22072158f353cc89a7821b13a7c3e99daa5be464/include/llvm/Support/GenericDomTreeConstruction.h#L307

Can you illustrate how to perform a DFS walk on a post-dom tree with your patch. Also, how would I get the (virtual) root node of post-dom tree.

Thanks,

In D35918#830975, @hiraditya wrote:

Can you illustrate how to perform a DFS walk on a post-dom tree with your patch. Also, how would I get the (virtual) root node of post-dom tree.

Thanks,

You can get the virtual root by calling PDT.getNode(nullptr). Then you can use something like llvm/ADT/DepthFirstIterator.h or llvm/ADT/PostOrderIterator.h to run DFS on it. Implementing a custom DFS should also be easy.

In D35918#830987, @kuhar wrote:

You can get the virtual root by calling PDT.getNode(nullptr). Then you can use something like llvm/ADT/DepthFirstIterator.h or llvm/ADT/PostOrderIterator.h to run DFS on it. Implementing a custom DFS should also be easy.

I tried a simple df walk like this, it crashes the compiler for the test cases llvm/test/Transforms/GVNHoist/infinite-loop-indirect.ll, and llvm/test/Transforms/GVNHoist/infinite-loop-direct.ll:

auto PrevBB = PDT->getNode(nullptr);
for (auto it = df_begin(PrevBB); it != df_end(PrevBB);
     ++it) {
}

Because PrevBB is NULL when PrevBB = PDT->getNode(nullptr).

These test cases are added in this patch.

Thanks,

In D35918#831066, @hiraditya wrote:
I tried a simple df walk like this, it crashes the compiler for the test cases llvm/test/Transforms/GVNHoist/infinite-loop-indirect.ll, and llvm/test/Transforms/GVNHoist/infinite-loop-direct.ll:
auto PrevBB = PDT->getNode(nullptr);
for (auto it = df_begin(PrevBB); it != df_end(PrevBB);
     ++it) {
}
These test cases are added in this patch.

Thanks,

I tried running a loop like yours in a couple of my unittests (DominatorTreeTest.cpp) and it seems to work:

auto PrevBB = PDT->getNode(nullptr);
for (auto it = df_begin(PrevBB); it != df_end(PrevBB);
     ++it) {
  auto BB = it->getBlock();
  outs() << (BB ? BB->getName() : "virtual root") << "\n";
}

Could you prepare a reduced repro for the crash you saw?

Thanks,
Kuba

In D35918#831091, @kuhar wrote:
I tried running a loop like yours in a couple of my unittests (DominatorTreeTest.cpp) and it seems to work:
auto PrevBB = PDT->getNode(nullptr);
for (auto it = df_begin(PrevBB); it != df_end(PrevBB);
     ++it) {
  auto BB = it->getBlock();
  outs() << (BB ? BB->getName() : "virtual root") << "\n";
}
Could you prepare a reduced repro for the crash you saw?

Thanks,
Kuba

Try ./bin/opt -S -adce < llvm/test/Transforms/GVNHoist/infinite-loop-direct.ll

with the following patch. I added the code here for easy reproducibility.

diff --git a/lib/Transforms/Scalar/ADCE.cpp b/lib/Transforms/Scalar/ADCE.cpp
index 5b467dc..eb711b9 100644
--- a/lib/Transforms/Scalar/ADCE.cpp
+++ b/lib/Transforms/Scalar/ADCE.cpp
@@ -164,6 +164,10 @@ public:
 }

 bool AggressiveDeadCodeElimination::performDeadCodeElimination() {
+  auto PrevBB = PDT.getNode(nullptr);
+  for (auto it = df_begin(PrevBB), E = df_end(PrevBB); it != E; ++it) {
+      dbgs() << "\nTest\n";
+  }

Here is the reduced test case:

$ cat a.ll
; ModuleID = 'bugpoint-reduced-simplified.bc'
source_filename = "bugpoint-output-f4ca947.bc"
target triple = "x86_64-unknown-linux-gnu"

define void @bazv1() local_unnamed_addr {
entry:
  br label %while.cond

while.cond:                                       ; preds = %while.cond, %entry
  br label %while.cond
}

Oh, yes, that happens because the entire function is reverse-unreachable, which causes the tree to be completely empty. D35851 puts reverse-unreachable CFG nodes in the tree and solves this problem.
In the meantime, you can check whether PDT.getNode(nullptr) returns nullptr and skip the iteration in that situation.
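
The suggested guard can be shown with a small self-contained sketch. It deliberately does not use LLVM: `ToyNode`, `ToyTree`, and `walkFromVirtualRoot` are hypothetical stand-ins for the tree node, the post-dominator tree, and the caller's walk, assumed here only to mirror the convention that `getNode(nullptr)` yields the virtual root, or a null pointer when the tree is empty.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a tree node; Block == nullptr marks the virtual
// root, mirroring the convention discussed above.
struct ToyNode {
  const std::string *Block;
  std::vector<ToyNode *> Children;
};

// Hypothetical stand-in for the tree: getNode returns nullptr for blocks that
// are not in the tree, which includes the virtual root when the tree is empty.
struct ToyTree {
  std::map<const std::string *, std::unique_ptr<ToyNode>> Nodes;

  ToyNode *addNode(const std::string *Block, ToyNode *Parent) {
    auto &Slot = Nodes[Block];
    Slot.reset(new ToyNode{Block, {}});
    if (Parent)
      Parent->Children.push_back(Slot.get());
    return Slot.get();
  }

  ToyNode *getNode(const std::string *Block) const {
    auto It = Nodes.find(Block);
    return It == Nodes.end() ? nullptr : It->second.get();
  }
};

// Pre-order walk that tolerates an empty tree instead of dereferencing a
// null root.
std::vector<std::string> walkFromVirtualRoot(const ToyTree &PDT) {
  std::vector<std::string> Visited;
  ToyNode *Root = PDT.getNode(nullptr);
  if (!Root)
    return Visited; // Empty tree: nothing to iterate over.

  std::vector<ToyNode *> Worklist{Root};
  while (!Worklist.empty()) {
    ToyNode *N = Worklist.back();
    Worklist.pop_back();
    Visited.push_back(N->Block ? *N->Block : std::string("virtual root"));
    // Push children in reverse so they are visited in insertion order.
    for (auto It = N->Children.rbegin(); It != N->Children.rend(); ++It)
      Worklist.push_back(*It);
  }
  return Visited;
}
```

For an empty tree the walk returns an empty list rather than crashing. For a tree with a virtual root and two nested blocks it visits the virtual root first and then the blocks in pre-order.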

Based on suggestions from @dberlin, I have updated the patch:

Compute iterated post-dominance frontiers
Iterate on post-dominator tree to insert CHIargs more efficiently

The time complexity would now be O(duplicate-values * N), where N is the number of nodes in the CFG. For each duplicate value, the algorithm finds its control dependence by computing the iterated post-dominance frontier and then inserts CHI args to track anticipability.

Added Comments

This looks reasonable at a glance, note it is likely to be subtly broken in some cases until D35381 goes in.
I've added a few comments, I'll take a deeper look in a little bit.

lib/Transforms/Scalar/GVNHoist.cpp
751 ↗	(On Diff #110647)	What are you really trying to do here? (IE why is the continue commented out)
954 ↗	(On Diff #110647)	I think since you wrote this we added moving functions you could use here.

Remove dead comments.

hiraditya added inline comments. Aug 11 2017, 12:06 PM

lib/Transforms/Scalar/GVNHoist.cpp
751 ↗	(On Diff #110647)	That was from the early stage when I only handled unique post-dom frontiers. I've removed those lines.
954 ↗	(On Diff #110647)	I'll look into it and update this code. Thanks for the feedback.

Updated MemorySSA updates.
Outlined functions from codegen part.

I think this is mostly reasonable at this point.
If you clean up the naming so it matches the style guide (IE you have a lot of things ending in T for no reason, etc), i'll approve it.

lib/Transforms/Scalar/GVNHoist.cpp
87 ↗	(On Diff #112071)	This is unused
90 ↗	(On Diff #112071)	This is unused
305 ↗	(On Diff #112071)	This is unused

Addressed @dberlin 's comments.

With these changes, i think it is a good start.

lib/Transforms/Scalar/GVNHoist.cpp
117 ↗	(On Diff #112386)	All of the uses of this type should just be auto, AFAICT
557 ↗	(On Diff #112386)	This keeps going after it finds anything, despite the fact that the boolean won't change. I would instead use (you probably need to include STLExtras.h) auto Found = any_of(TI->successors(), [Dest](const BasicBlock *BB) { return BB == Dest; });
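
A minimal standalone sketch of the same idea, using `std::any_of` from the standard library instead of LLVM's range-based `any_of` so that it compiles on its own. The `Block` type and `hasSuccessor` helper are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a basic block; only identity matters here.
struct Block {};

// Returns true if Dest is among Succs. std::any_of stops at the first match,
// and Dest is captured explicitly so the lambda can refer to it.
bool hasSuccessor(const std::vector<const Block *> &Succs, const Block *Dest) {
  return std::any_of(Succs.begin(), Succs.end(),
                     [Dest](const Block *BB) { return BB == Dest; });
}
```

Compared with a hand-written loop that sets a flag, this form cannot keep iterating after a match and leaves no mutable state behind.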

This revision is now accepted and ready to land. Aug 24 2017, 12:28 PM

use any_of instead of iterating over the successors.

hiraditya marked an inline comment as done. Sep 6 2017, 2:30 PM

hiraditya added inline comments.

lib/Transforms/Scalar/GVNHoist.cpp
117 ↗	(On Diff #112386)	CHIIt is used as a function argument in a couple of places.

Sure, it's a function argument in a few other places.
The other places, you should be using auto :)
I marked a bunch of them.

In general, the only reason not to use auto is if the type is important and non-obvious.

In most of the cases, you would be better off naming variables better than exposing the types.

IE instead of CHIArg C, use auto CHIArg
(auto C is also fine if you think the type is obvious or unimportant)

lib/Transforms/Scalar/GVNHoist.cpp
549 ↗	(On Diff #114071)	Pass in a range instead of two iterators.
552 ↗	(On Diff #114071)	auto C
566 ↗	(On Diff #114071)	Pass in a range instead?
568 ↗	(On Diff #114071)	range based for loop?
609 ↗	(On Diff #114071)	auto VCHI
610 ↗	(On Diff #114071)	auto It
645 ↗	(On Diff #114071)	depth_first instead of explicit iterator
668 ↗	(On Diff #114071)	auto &A
678 ↗	(On Diff #114071)	auto
680 ↗	(On Diff #114071)	auto

added auto to relevant places.

hiraditya marked 4 inline comments as done. Sep 8 2017, 2:59 PM

use depth_first

Closed by commit rL313116: [GVNHoist] Factor out reachability to search for anticipable instructions… (authored by hiraditya). Sep 12 2017, 10:29 PM

This revision was automatically updated to reflect the committed changes.

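Revision Contents

Path	Size
llvm/trunk/lib/Transforms/Scalar/GVNHoist.cpp	704 lines
llvm/trunk/test/Transforms/GVNHoist/hoist-more-than-two-branches.ll	31 lines
llvm/trunk/test/Transforms/GVNHoist/hoist-mssa.ll	2 lines
llvm/trunk/test/Transforms/GVNHoist/hoist-newgvn.ll	105 lines
llvm/trunk/test/Transforms/GVNHoist/hoist-pr20242.ll	5 lines
llvm/trunk/test/Transforms/GVNHoist/hoist-pr28933.ll	3 lines
llvm/trunk/test/Transforms/GVNHoist/hoist-recursive-geps.ll	11 lines
llvm/trunk/test/Transforms/GVNHoist/hoist.ll	108 lines
llvm/trunk/test/Transforms/GVNHoist/infinite-loop-direct.ll	96 lines
llvm/trunk/test/Transforms/GVNHoist/infinite-loop-indirect.ll	285 lines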

Diff 114971

llvm/trunk/lib/Transforms/Scalar/GVNHoist.cpp

//===- GVNHoist.cpp - Hoist scalar and load expressions -------------------===//		//===- GVNHoist.cpp - Hoist scalar and load expressions -------------------===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This pass hoists expressions from branches to a common dominator. It uses		// This pass hoists expressions from branches to a common dominator. It uses
// GVN (global value numbering) to discover expressions computing the same		// GVN (global value numbering) to discover expressions computing the same
// values. The primary goals of code-hoisting are:		// values. The primary goals of code-hoisting are:
// 1. To reduce the code size.		// 1. To reduce the code size.
// 2. In some cases reduce critical path (by exposing more ILP).		// 2. In some cases reduce critical path (by exposing more ILP).
//		//
		// The algorithm factors out the reachability of values such that multiple
		// queries to find reachability of values are fast. This is based on finding the
		// ANTIC points in the CFG which do not change during hoisting. The ANTIC points
		// are basically the dominance-frontiers in the inverse graph. So we introduce a
		// data structure (CHI nodes) to keep track of values flowing out of a basic
		// block. We only do this for values with multiple occurrences in the function
		// as they are the potential hoistable candidates. This approach allows us to
		// hoist instructions to a basic block with more than two successors, as well as
		// deal with infinite loops in a trivial way.
		//
		// Limitations: This pass does not hoist fully redundant expressions because
		// they are already handled by GVN-PRE. It is advisable to run gvn-hoist before
		// and after gvn-pre because gvn-pre creates opportunities for more instructions
		// to be hoisted.
		//
// Hoisting may affect the performance in some cases. To mitigate that, hoisting		// Hoisting may affect the performance in some cases. To mitigate that, hoisting
// is disabled in the following cases.		// is disabled in the following cases.
// 1. Scalars across calls.		// 1. Scalars across calls.
// 2. geps when corresponding load/store cannot be hoisted.		// 2. geps when corresponding load/store cannot be hoisted.
//
// TODO: Hoist from >2 successors. Currently GVNHoist will not hoist stores
// in this case because it works on two instructions at a time.
// entry:
// switch i32 %c1, label %exit1 [
// i32 0, label %sw0
// i32 1, label %sw1
// ]
//
// sw0:
// store i32 1, i32* @G
// br label %exit
//
// sw1:
// store i32 1, i32* @G
// br label %exit
//
// exit1:
// store i32 1, i32* @G
// ret void
// exit:
// ret void
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/ADT/DenseMap.h"		#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/SmallPtrSet.h"		#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/Analysis/GlobalsModRef.h"		#include "llvm/Analysis/GlobalsModRef.h"
		#include "llvm/Analysis/IteratedDominanceFrontier.h"
#include "llvm/Analysis/MemorySSA.h"		#include "llvm/Analysis/MemorySSA.h"
#include "llvm/Analysis/MemorySSAUpdater.h"		#include "llvm/Analysis/MemorySSAUpdater.h"
		#include "llvm/Analysis/PostDominators.h"
#include "llvm/Analysis/ValueTracking.h"		#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
#include "llvm/Transforms/Scalar.h"		#include "llvm/Transforms/Scalar.h"
#include "llvm/Transforms/Scalar/GVN.h"		#include "llvm/Transforms/Scalar/GVN.h"
#include "llvm/Transforms/Utils/Local.h"		#include "llvm/Transforms/Utils/Local.h"

		#include <stack>

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "gvn-hoist"		#define DEBUG_TYPE "gvn-hoist"

STATISTIC(NumHoisted, "Number of instructions hoisted");		STATISTIC(NumHoisted, "Number of instructions hoisted");
STATISTIC(NumRemoved, "Number of instructions removed");		STATISTIC(NumRemoved, "Number of instructions removed");
STATISTIC(NumLoadsHoisted, "Number of loads hoisted");		STATISTIC(NumLoadsHoisted, "Number of loads hoisted");
STATISTIC(NumLoadsRemoved, "Number of loads removed");		STATISTIC(NumLoadsRemoved, "Number of loads removed");
[... 18 lines not shown ...]

static cl::opt<int>		static cl::opt<int>
MaxChainLength("gvn-hoist-max-chain-length", cl::Hidden, cl::init(10),		MaxChainLength("gvn-hoist-max-chain-length", cl::Hidden, cl::init(10),
cl::desc("Maximum length of dependent chains to hoist "		cl::desc("Maximum length of dependent chains to hoist "
"(default = 10, unlimited = -1)"));		"(default = 10, unlimited = -1)"));

namespace llvm {		namespace llvm {

// Provides a sorting function based on the execution order of two instructions.		typedef DenseMap<const BasicBlock *, bool> BBSideEffectsSet;
struct SortByDFSIn {		typedef SmallVector<Instruction *, 4> SmallVecInsn;
private:		typedef SmallVectorImpl<Instruction *> SmallVecImplInsn;
DenseMap<const Value *, unsigned> &DFSNumber;		// Each element of a hoisting list contains the basic block where to hoist and
		// a list of instructions to be hoisted.
public:		typedef std::pair<BasicBlock *, SmallVecInsn> HoistingPointInfo;
SortByDFSIn(DenseMap<const Value *, unsigned> &D) : DFSNumber(D) {}		typedef SmallVector<HoistingPointInfo, 4> HoistingPointList;
		// A map from a pair of VNs to all the instructions with those VNs.
		typedef std::pair<unsigned, unsigned> VNType;
		typedef DenseMap<VNType, SmallVector<Instruction *, 4>> VNtoInsns;

// Returns true when A executes before B.		// CHI keeps information about values flowing out of a basic block. It is
bool operator()(const Instruction *A, const Instruction *B) const {		// similar to PHI but in the inverse graph, and used for outgoing values on each
const BasicBlock *BA = A->getParent();		// edge. For conciseness, it is computed only for instructions with multiple
const BasicBlock *BB = B->getParent();		// occurrences in the CFG because they are the only hoistable candidates.
unsigned ADFS, BDFS;		// A (CHI[{V, B, I1}, {V, C, I2}]
if (BA == BB) {		// / \
ADFS = DFSNumber.lookup(A);		// / \
BDFS = DFSNumber.lookup(B);		// B(I1) C (I2)
} else {		// The Value number for both I1 and I2 is V, the CHI node will save the
ADFS = DFSNumber.lookup(BA);		// instruction as well as the edge where the value is flowing to.
BDFS = DFSNumber.lookup(BB);		struct CHIArg {
}		VNType VN;
assert(ADFS && BDFS);		// Edge destination (shows the direction of flow), may not be where the I is.
return ADFS < BDFS;		BasicBlock *Dest;
}		// The instruction (VN) which uses the values flowing out of CHI.
		Instruction *I;
		bool operator==(const CHIArg &A) { return VN == A.VN; }
		bool operator!=(const CHIArg &A) { return !(*this == A); }
};		};

// A map from a pair of VNs to all the instructions with those VNs.		typedef SmallVectorImpl<CHIArg>::iterator CHIIt;
typedef DenseMap<std::pair<unsigned, unsigned>, SmallVector<Instruction *, 4>>		typedef iterator_range<CHIIt> CHIArgs;
VNtoInsns;		typedef DenseMap<BasicBlock *, SmallVector<CHIArg, 2>> OutValuesType;
		typedef DenseMap<BasicBlock *, SmallVector<std::pair<VNType, Instruction *>, 2>>
		InValuesType;

// An invalid value number Used when inserting a single value number into		// An invalid value number Used when inserting a single value number into
// VNtoInsns.		// VNtoInsns.
enum : unsigned { InvalidVN = ~2U };		enum : unsigned { InvalidVN = ~2U };

// Records all scalar instructions candidate for code hoisting.		// Records all scalar instructions candidate for code hoisting.
class InsnInfo {		class InsnInfo {
VNtoInsns VNtoScalars;		VNtoInsns VNtoScalars;

[... 68 lines not shown ...]	public:

const VNtoInsns &getScalarVNTable() const { return VNtoCallsScalars; }		const VNtoInsns &getScalarVNTable() const { return VNtoCallsScalars; }

const VNtoInsns &getLoadVNTable() const { return VNtoCallsLoads; }		const VNtoInsns &getLoadVNTable() const { return VNtoCallsLoads; }

const VNtoInsns &getStoreVNTable() const { return VNtoCallsStores; }		const VNtoInsns &getStoreVNTable() const { return VNtoCallsStores; }
};		};

typedef DenseMap<const BasicBlock *, bool> BBSideEffectsSet;
typedef SmallVector<Instruction *, 4> SmallVecInsn;
typedef SmallVectorImpl<Instruction *> SmallVecImplInsn;

static void combineKnownMetadata(Instruction *ReplInst, Instruction *I) {		static void combineKnownMetadata(Instruction *ReplInst, Instruction *I) {
static const unsigned KnownIDs[] = {		static const unsigned KnownIDs[] = {
LLVMContext::MD_tbaa, LLVMContext::MD_alias_scope,		LLVMContext::MD_tbaa, LLVMContext::MD_alias_scope,
LLVMContext::MD_noalias, LLVMContext::MD_range,		LLVMContext::MD_noalias, LLVMContext::MD_range,
LLVMContext::MD_fpmath, LLVMContext::MD_invariant_load,		LLVMContext::MD_fpmath, LLVMContext::MD_invariant_load,
LLVMContext::MD_invariant_group};		LLVMContext::MD_invariant_group};
combineMetadata(ReplInst, I, KnownIDs);		combineMetadata(ReplInst, I, KnownIDs);
}		}

// This pass hoists common computations across branches sharing common		// This pass hoists common computations across branches sharing common
// dominator. The primary goal is to reduce the code size, and in some		// dominator. The primary goal is to reduce the code size, and in some
// cases reduce critical path (by exposing more ILP).		// cases reduce critical path (by exposing more ILP).
class GVNHoist {		class GVNHoist {
public:		public:
GVNHoist(DominatorTree *DT, AliasAnalysis *AA, MemoryDependenceResults *MD,		GVNHoist(DominatorTree *DT, PostDominatorTree *PDT, AliasAnalysis *AA,
MemorySSA *MSSA)		MemoryDependenceResults *MD, MemorySSA *MSSA)
: DT(DT), AA(AA), MD(MD), MSSA(MSSA),		: DT(DT), PDT(PDT), AA(AA), MD(MD), MSSA(MSSA),
MSSAUpdater(make_unique<MemorySSAUpdater>(MSSA)),		MSSAUpdater(make_unique<MemorySSAUpdater>(MSSA)),
HoistingGeps(false),		HoistingGeps(false) {}
HoistedCtr(0)
{ }

bool run(Function &F) {		bool run(Function &F) {
		NumFuncArgs = F.arg_size();
VN.setDomTree(DT);		VN.setDomTree(DT);
VN.setAliasAnalysis(AA);		VN.setAliasAnalysis(AA);
VN.setMemDep(MD);		VN.setMemDep(MD);
bool Res = false;		bool Res = false;
// Perform DFS Numbering of instructions.		// Perform DFS Numbering of instructions.
unsigned BBI = 0;		unsigned BBI = 0;
for (const BasicBlock *BB : depth_first(&F.getEntryBlock())) {		for (const BasicBlock *BB : depth_first(&F.getEntryBlock())) {
DFSNumber[BB] = ++BBI;		DFSNumber[BB] = ++BBI;
[... 20 lines not shown; context: while (1) { ...]
VN.clear();		VN.clear();

Res = true;		Res = true;
}		}

return Res;		return Res;
}		}

		// Copied from NewGVN.cpp
		// This function provides global ranking of operations so that we can place
		// them in a canonical order. Note that rank alone is not necessarily enough
		// for a complete ordering, as constants all have the same rank. However,
		// generally, we will simplify an operation with all constants so that it
		// doesn't matter what order they appear in.
		unsigned int rank(const Value *V) const {
		// Prefer constants to undef to anything else
		// Undef is a constant, have to check it first.
		// Prefer smaller constants to constantexprs
		if (isa<ConstantExpr>(V))
		return 2;
		if (isa<UndefValue>(V))
		return 1;
		if (isa<Constant>(V))
		return 0;
		else if (auto *A = dyn_cast<Argument>(V))
		return 3 + A->getArgNo();

		// Need to shift the instruction DFS by number of arguments + 3 to account
		// for the constant and argument ranking above.
		auto Result = DFSNumber.lookup(V);
		if (Result > 0)
		return 4 + NumFuncArgs + Result;
		// Unreachable or something else, just return a really large number.
		return ~0;
		}
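
Note on the ranking above: a minimal standalone sketch of the same ordering, assuming a hypothetical `ToyValue` type in place of `llvm::Value` (the names `Kind`, `ToyValue`, and `toyRank` are illustrative and not part of the patch). It only mirrors the numeric scheme: plain constants, then undef, then constant expressions, then arguments by index, then instructions by DFS number.

```cpp
// Hypothetical stand-in for the value kinds that rank() distinguishes.
enum class Kind { Constant, Undef, ConstantExpr, Argument, Instruction, Unknown };

struct ToyValue {
  Kind K;
  unsigned ArgNo;     // meaningful only for Kind::Argument
  unsigned DFSNumber; // meaningful only for Kind::Instruction; 0 means unnumbered
};

// Mirrors the numeric scheme of rank(): lower values sort first.
unsigned toyRank(const ToyValue &V, unsigned NumFuncArgs) {
  switch (V.K) {
  case Kind::Constant:
    return 0;
  case Kind::Undef:
    return 1;
  case Kind::ConstantExpr:
    return 2;
  case Kind::Argument:
    return 3 + V.ArgNo;
  case Kind::Instruction:
    // Shift past the constant and argument ranks.
    if (V.DFSNumber > 0)
      return 4 + NumFuncArgs + V.DFSNumber;
    break;
  case Kind::Unknown:
    break;
  }
  // Unnumbered or unknown values get a very large rank.
  return ~0U;
}
```

With two function arguments, the second argument ranks 4 and an instruction with DFS number 5 ranks 11, so every argument sorts before every numbered instruction.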

private:		private:
GVN::ValueTable VN;		GVN::ValueTable VN;
DominatorTree *DT;		DominatorTree *DT;
		PostDominatorTree *PDT;
AliasAnalysis *AA;		AliasAnalysis *AA;
MemoryDependenceResults *MD;		MemoryDependenceResults *MD;
MemorySSA *MSSA;		MemorySSA *MSSA;
std::unique_ptr<MemorySSAUpdater> MSSAUpdater;		std::unique_ptr<MemorySSAUpdater> MSSAUpdater;
const bool HoistingGeps;
DenseMap<const Value *, unsigned> DFSNumber;		DenseMap<const Value *, unsigned> DFSNumber;
BBSideEffectsSet BBSideEffects;		BBSideEffectsSet BBSideEffects;
DenseSet<const BasicBlock*> HoistBarrier;		DenseSet<const BasicBlock *> HoistBarrier;
int HoistedCtr;
		SmallVector<BasicBlock *, 32> IDFBlocks;
		unsigned NumFuncArgs;
		const bool HoistingGeps;

enum InsKind { Unknown, Scalar, Load, Store };		enum InsKind { Unknown, Scalar, Load, Store };

// Return true when there is exception handling in BB.		// Return true when there is exception handling in BB.
bool hasEH(const BasicBlock *BB) {		bool hasEH(const BasicBlock *BB) {
auto It = BBSideEffects.find(BB);		auto It = BBSideEffects.find(BB);
if (It != BBSideEffects.end())		if (It != BBSideEffects.end())
return It->second;		return It->second;
[... 16 lines not shown; context: private: ...]
bool successorDominate(const BasicBlock *BB, const BasicBlock *A) {		bool successorDominate(const BasicBlock *BB, const BasicBlock *A) {
for (const BasicBlock *Succ : BB->getTerminator()->successors())		for (const BasicBlock *Succ : BB->getTerminator()->successors())
if (DT->dominates(Succ, A))		if (DT->dominates(Succ, A))
return true;		return true;

return false;		return false;
}		}

// Return true when all paths from HoistBB to the end of the function pass
// through one of the blocks in WL.
bool hoistingFromAllPaths(const BasicBlock *HoistBB,
SmallPtrSetImpl<const BasicBlock *> &WL) {

// Copy WL as the loop will remove elements from it.
SmallPtrSet<const BasicBlock *, 2> WorkList(WL.begin(), WL.end());

for (auto It = df_begin(HoistBB), E = df_end(HoistBB); It != E;) {
// There exists a path from HoistBB to the exit of the function if we are
// still iterating in DF traversal and we removed all instructions from
// the work list.
if (WorkList.empty())
return false;

const BasicBlock *BB = *It;
if (WorkList.erase(BB)) {
// Stop DFS traversal when BB is in the work list.
It.skipChildren();
continue;
}

// We reached the leaf Basic Block => not all paths have this instruction.
if (!BB->getTerminator()->getNumSuccessors())
return false;

// When reaching the back-edge of a loop, there may be a path through the
// loop that does not pass through B or C before exiting the loop.
if (successorDominate(BB, HoistBB))
return false;

// Increment DFS traversal when not skipping children.
++It;
}

return true;
}

/* Return true when I1 appears before I2 in the instructions of BB. */		/* Return true when I1 appears before I2 in the instructions of BB. */
bool firstInBB(const Instruction *I1, const Instruction *I2) {		bool firstInBB(const Instruction *I1, const Instruction *I2) {
assert(I1->getParent() == I2->getParent());		assert(I1->getParent() == I2->getParent());
unsigned I1DFS = DFSNumber.lookup(I1);		unsigned I1DFS = DFSNumber.lookup(I1);
unsigned I2DFS = DFSNumber.lookup(I2);		unsigned I2DFS = DFSNumber.lookup(I2);
assert(I1DFS && I2DFS);		assert(I1DFS && I2DFS);
return I1DFS < I2DFS;		return I1DFS < I2DFS;
}		}
[... 171 lines not shown; context: bool safeToHoistLdSt(const Instruction *NewPt, const Instruction *OldPt, ...]
}		}

// No side effects: it is safe to hoist.		// No side effects: it is safe to hoist.
return true;		return true;
}		}

// Return true when it is safe to hoist scalar instructions from all blocks in		// Return true when it is safe to hoist scalar instructions from all blocks in
// WL to HoistBB.		// WL to HoistBB.
bool safeToHoistScalar(const BasicBlock *HoistBB,		bool safeToHoistScalar(const BasicBlock *HoistBB, const BasicBlock *BB,
SmallPtrSetImpl<const BasicBlock *> &WL,
int &NBBsOnAllPaths) {		int &NBBsOnAllPaths) {
// Check that the hoisted expression is needed on all paths.		return !hasEHOnPath(HoistBB, BB, NBBsOnAllPaths);
if (!hoistingFromAllPaths(HoistBB, WL))		}
return false;

for (const BasicBlock *BB : WL)		// In the inverse CFG, the dominance frontier of basic block (BB) is the
if (hasEHOnPath(HoistBB, BB, NBBsOnAllPaths))		// point where ANTIC needs to be computed for instructions which are going
		// to be hoisted. Since this point does not change during gvn-hoist,
		// we compute it only once (on demand).
		// The idea is inspired by:
		// "Partial Redundancy Elimination in SSA Form"
		// ROBERT KENNEDY, SUN CHAN, SHIN-MING LIU, RAYMOND LO, PENG TU and FRED CHOW
		// They use a similar idea in the forward graph to find fully redundant and
		// partially redundant expressions, here it is used in the inverse graph to
		// find fully anticipable instructions at merge point (post-dominator in
		// the inverse CFG).
		// Returns the edge via which an instruction in BB will get the values from.

		// Returns true when the values are flowing out to each edge.
		bool valueAnticipable(CHIArgs C, TerminatorInst *TI) const {
		if (TI->getNumSuccessors() > std::distance(C.begin(), C.end()))
		return false; // Not enough args in this CHI.

		for (auto CHI : C) {
		BasicBlock *Dest = CHI.Dest;
		// Find if all the edges have values flowing out of BB.
		bool Found = any_of(TI->successors(), [Dest](const BasicBlock *BB) {
		return BB == Dest; });
		if (!Found)
return false;		return false;

return true;
}		}
		return true;
// Each element of a hoisting list contains the basic block where to hoist and
// a list of instructions to be hoisted.
typedef std::pair<BasicBlock *, SmallVecInsn> HoistingPointInfo;
typedef SmallVector<HoistingPointInfo, 4> HoistingPointList;

// Partition InstructionsToHoist into a set of candidates which can share a
// common hoisting point. The partitions are collected in HPL. IsScalar is
// true when the instructions in InstructionsToHoist are scalars. IsLoad is
// true when the InstructionsToHoist are loads, false when they are stores.
void partitionCandidates(SmallVecImplInsn &InstructionsToHoist,
HoistingPointList &HPL, InsKind K) {
// No need to sort for two instructions.
if (InstructionsToHoist.size() > 2) {
SortByDFSIn Pred(DFSNumber);
std::sort(InstructionsToHoist.begin(), InstructionsToHoist.end(), Pred);
}		}
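
Note on valueAnticipable: the check can be sketched in isolation as follows, assuming blocks are plain integer ids and each CHI argument is reduced to the successor it flows into (`toyValueAnticipable` and its parameters are hypothetical names, not part of the patch). It requires at least as many arguments as successors, and every argument's destination must be one of the successors.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical simplification: ChiDests holds the Dest block id of each CHI
// argument; Successors holds the successor block ids of the terminator.
bool toyValueAnticipable(const std::vector<int> &ChiDests,
                         const std::vector<int> &Successors) {
  // Fewer arguments than successors: some edge cannot have a value.
  if (Successors.size() > ChiDests.size())
    return false;

  // Every argument must flow out along an actual successor edge.
  for (int Dest : ChiDests) {
    bool Found = std::any_of(Successors.begin(), Successors.end(),
                             [Dest](int S) { return S == Dest; });
    if (!Found)
      return false;
  }
  return true;
}
```

In this sketch, two arguments that both target the same successor of a two-way branch also pass, because only the count and the membership are checked. Whether that situation can arise in the pass depends on how the CHI arguments are filled, which the sketch does not model.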

		// Check if it is safe to hoist values tracked by CHI in the range
		// [Begin, End) and accumulate them in Safe.
		void checkSafety(CHIArgs C, BasicBlock *BB, InsKind K,
		SmallVectorImpl<CHIArg> &Safe) {
int NumBBsOnAllPaths = MaxNumberOfBBSInPath;		int NumBBsOnAllPaths = MaxNumberOfBBSInPath;
		for (auto CHI : C) {
SmallVecImplInsn::iterator II = InstructionsToHoist.begin();		Instruction *Insn = CHI.I;
SmallVecImplInsn::iterator Start = II;		if (!Insn) // No instruction was inserted in this CHI.
Instruction *HoistPt = *II;		continue;
BasicBlock *HoistBB = HoistPt->getParent();		if (K == InsKind::Scalar) {
MemoryUseOrDef *UD;		if (safeToHoistScalar(BB, Insn->getParent(), NumBBsOnAllPaths))
if (K != InsKind::Scalar)		Safe.push_back(CHI);
UD = MSSA->getMemoryAccess(HoistPt);

for (++II; II != InstructionsToHoist.end(); ++II) {
Instruction *Insn = *II;
BasicBlock *BB = Insn->getParent();
BasicBlock *NewHoistBB;
Instruction *NewHoistPt;

if (BB == HoistBB) { // Both are in the same Basic Block.
NewHoistBB = HoistBB;
NewHoistPt = firstInBB(Insn, HoistPt) ? Insn : HoistPt;
} else {		} else {
// If the hoisting point contains one of the instructions,		MemoryUseOrDef *UD = MSSA->getMemoryAccess(Insn);
// then hoist there, otherwise hoist before the terminator.		if (safeToHoistLdSt(BB->getTerminator(), Insn, UD, K, NumBBsOnAllPaths))
NewHoistBB = DT->findNearestCommonDominator(HoistBB, BB);		Safe.push_back(CHI);
if (NewHoistBB == BB)		}
NewHoistPt = Insn;		}
else if (NewHoistBB == HoistBB)
NewHoistPt = HoistPt;
else
NewHoistPt = NewHoistBB->getTerminator();
}		}

SmallPtrSet<const BasicBlock *, 2> WL;		typedef DenseMap<VNType, SmallVector<Instruction *, 2>> RenameStackType;
WL.insert(HoistBB);		// Push all the VNs corresponding to BB into RenameStack.
WL.insert(BB);		void fillRenameStack(BasicBlock *BB, InValuesType &ValueBBs,
		RenameStackType &RenameStack) {
		auto it1 = ValueBBs.find(BB);
		if (it1 != ValueBBs.end()) {
		// Iterate in reverse order to keep lower ranked values on the top.
		for (std::pair<VNType, Instruction *> &VI : reverse(it1->second)) {
		// Get the value of instruction I
		DEBUG(dbgs() << "\nPushing on stack: " << *VI.second);
		RenameStack[VI.first].push_back(VI.second);
		}
		}
		}

if (K == InsKind::Scalar) {		void fillChiArgs(BasicBlock *BB, OutValuesType &CHIBBs,
if (safeToHoistScalar(NewHoistBB, WL, NumBBsOnAllPaths)) {		RenameStackType &RenameStack) {
// Extend HoistPt to NewHoistPt.		// For each predecessor (because Post-DOM) of BB check if it has a CHI
HoistPt = NewHoistPt;		for (auto Pred : predecessors(BB)) {
HoistBB = NewHoistBB;		auto P = CHIBBs.find(Pred);
		if (P == CHIBBs.end()) {
continue;		continue;
}		}
} else {		DEBUG(dbgs() << "\nLooking at CHIs in: " << Pred->getName(););
// When NewBB already contains an instruction to be hoisted, the		// A CHI is found (BB -> Pred is an edge in the CFG)
// expression is needed on all paths.		// Pop the stack until Top(V) = Ve.
// Check that the hoisted expression is needed on all paths: it is		auto &VCHI = P->second;
// unsafe to hoist loads to a place where there may be a path not		for (auto It = VCHI.begin(), E = VCHI.end(); It != E;) {
// loading from the same address: for instance there may be a branch on		CHIArg &C = *It;
// which the address of the load may not be initialized.		if (!C.Dest) {
if ((HoistBB == NewHoistBB || BB == NewHoistBB ||		auto si = RenameStack.find(C.VN);
hoistingFromAllPaths(NewHoistBB, WL)) &&		// The Basic Block where CHI is must dominate the value we want to
// Also check that it is safe to move the load or store from HoistPt		// track in a CHI. In the PDom walk, there can be values in the
// to NewHoistPt, and from Insn to NewHoistPt.		// stack which are not control dependent e.g., nested loop.
safeToHoistLdSt(NewHoistPt, HoistPt, UD, K, NumBBsOnAllPaths) &&		if (si != RenameStack.end() && si->second.size() &&
safeToHoistLdSt(NewHoistPt, Insn, MSSA->getMemoryAccess(Insn),		DT->dominates(Pred, si->second.back()->getParent())) {
K, NumBBsOnAllPaths)) {		C.Dest = BB; // Assign the edge
// Extend HoistPt to NewHoistPt.		C.I = si->second.pop_back_val(); // Assign the argument
HoistPt = NewHoistPt;		DEBUG(dbgs() << "\nCHI Inserted in BB: " << C.Dest->getName()
HoistBB = NewHoistBB;		<< *C.I << ", VN: " << C.VN.first << ", "
		<< C.VN.second);
		}
		// Move to next CHI of a different value
		It = std::find_if(It, VCHI.end(),
		[It](CHIArg &A) { return A != *It; });
		} else
		++It;
		}
		}
		}

		// Walk the post-dominator tree top-down and use a stack for each value to
		// store the last value you see. When you hit a CHI from a given edge, the
		// value to use as the argument is at the top of the stack, add the value to
		// CHI and pop.
		void insertCHI(InValuesType &ValueBBs, OutValuesType &CHIBBs) {
		auto Root = PDT->getNode(nullptr);
		if (!Root)
		return;
		// Depth first walk on PDom tree to fill the CHIargs at each PDF.
		RenameStackType RenameStack;
		for (auto Node : depth_first(Root)) {
		BasicBlock *BB = Node->getBlock();
		if (!BB)
continue;		continue;

		// Collect all values in BB and push to stack.
		fillRenameStack(BB, ValueBBs, RenameStack);

		// Fill outgoing values in each CHI corresponding to BB.
		fillChiArgs(BB, CHIBBs, RenameStack);
}		}
}		}

// At this point it is not safe to extend the current hoisting to		// Walk all the CHI-nodes to find ones which have an empty entry and remove
// NewHoistPt: save the hoisting list so far.		// them. Then collect all the instructions which are safe to hoist and see if
if (std::distance(Start, II) > 1)		// they form a list of anticipable values. OutValues contains CHIs
HPL.push_back({HoistBB, SmallVecInsn(Start, II)});		// corresponding to each basic block.
		void findHoistableCandidates(OutValuesType &CHIBBs, InsKind K,
// Start over from BB.		HoistingPointList &HPL) {
Start = II;		auto cmpVN = [](const CHIArg &A, const CHIArg &B) { return A.VN < B.VN; };
if (K != InsKind::Scalar)
UD = MSSA->getMemoryAccess(*Start);		// CHIArgs now have the outgoing values, so check for anticipability and
HoistPt = Insn;		// accumulate hoistable candidates in HPL.
HoistBB = BB;		for (std::pair<BasicBlock *, SmallVector<CHIArg, 2>> &A : CHIBBs) {
NumBBsOnAllPaths = MaxNumberOfBBSInPath;		BasicBlock *BB = A.first;
}		SmallVectorImpl<CHIArg> &CHIs = A.second;
		// Vector of PHIs contains PHIs for different instructions.
// Save the last partition.		// Sort the args according to their VNs, such that identical
if (std::distance(Start, II) > 1)		// instructions are together.
HPL.push_back({HoistBB, SmallVecInsn(Start, II)});		std::sort(CHIs.begin(), CHIs.end(), cmpVN);
		auto TI = BB->getTerminator();
		auto B = CHIs.begin();
		// [PrevIt, PHIIt) forms a range of CHIs which have identical VNs.
		auto PHIIt = std::find_if(CHIs.begin(), CHIs.end(),
		[B](CHIArg &A) { return A != *B; });
		auto PrevIt = CHIs.begin();
		while (PrevIt != PHIIt) {
		// Collect values which satisfy safety checks.
		SmallVector<CHIArg, 2> Safe;
		// We check for safety first because there might be multiple values in
		// the same path, some of which are not safe to be hoisted, but overall
		// each edge has at least one value which can be hoisted, making the
		// value anticipable along that path.
		checkSafety(make_range(PrevIt, PHIIt), BB, K, Safe);

		// List of safe values should be anticipable at TI.
		if (valueAnticipable(make_range(Safe.begin(), Safe.end()), TI)) {
		HPL.push_back({BB, SmallVecInsn()});
		SmallVecInsn &V = HPL.back().second;
		for (auto B : Safe)
		V.push_back(B.I);
		}

		// Check other VNs
		PrevIt = PHIIt;
		PHIIt = std::find_if(PrevIt, CHIs.end(),
		[PrevIt](CHIArg &A) { return A != *PrevIt; });
		}
		}
}		}
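
Note on findHoistableCandidates: the sort-then-scan grouping can be sketched on its own, assuming a CHI argument is reduced to a (value number, instruction id) pair (`ToyChi` and `groupByVN` are hypothetical names). The sketch uses `std::stable_sort` rather than the `std::sort` in the patch, purely so that the order inside each group is predictable.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical stand-in for a CHI argument: (value number, instruction id).
using ToyChi = std::pair<int, int>;

// Sort by value number, then split into runs that share the same number.
std::vector<std::vector<int>> groupByVN(std::vector<ToyChi> Chis) {
  std::stable_sort(Chis.begin(), Chis.end(),
                   [](const ToyChi &A, const ToyChi &B) {
                     return A.first < B.first;
                   });

  std::vector<std::vector<int>> Groups;
  auto Prev = Chis.begin();
  while (Prev != Chis.end()) {
    int VN = Prev->first;
    // [Prev, Next) is the run of arguments with value number VN.
    auto Next = std::find_if(Prev, Chis.end(),
                             [VN](const ToyChi &C) { return C.first != VN; });
    std::vector<int> Group;
    for (auto It = Prev; It != Next; ++It)
      Group.push_back(It->second);
    Groups.push_back(Group);
    Prev = Next;
  }
  return Groups;
}
```

In the pass, each such run is then filtered by the safety check and tested for anticipability at the block's terminator. The sketch covers only the partitioning step.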

// Initialize HPL from Map.		// Compute insertion points for each value that can be fully anticipated at
		// a dominator. HPL contains all such values.
void computeInsertionPoints(const VNtoInsns &Map, HoistingPointList &HPL,		void computeInsertionPoints(const VNtoInsns &Map, HoistingPointList &HPL,
InsKind K) {		InsKind K) {
		// Sort VNs based on their rankings
		std::vector<VNType> Ranks;
for (const auto &Entry : Map) {		for (const auto &Entry : Map) {
if (MaxHoistedThreshold != -1 && ++HoistedCtr > MaxHoistedThreshold)		Ranks.push_back(Entry.first);
return;		}

const SmallVecInsn &V = Entry.second;		// TODO: Remove fully-redundant expressions.
		// Get instruction from the Map, assume that all the Instructions
		// with same VNs have same rank (this is an approximation).
		std::sort(Ranks.begin(), Ranks.end(),
		[this, &Map](const VNType &r1, const VNType &r2) {
		return (rank(*Map.lookup(r1).begin()) <
		rank(*Map.lookup(r2).begin()));
		});

		// - Sort VNs according to their rank, and start with lowest ranked VN
		// - Take a VN and for each instruction with same VN
		// - Find the dominance frontier in the inverse graph (PDF)
		// - Insert the chi-node at PDF
		// - Remove the chi-nodes with missing entries
		// - Remove values from CHI-nodes which do not truly flow out, e.g.,
		// modified along the path.
		// - Collect the remaining values that are still anticipable
		SmallVector<BasicBlock *, 2> IDFBlocks;
		ReverseIDFCalculator IDFs(*PDT);
		OutValuesType OutValue;
		InValuesType InValue;
		for (const auto &R : Ranks) {
		const SmallVecInsn &V = Map.lookup(R);
if (V.size() < 2)		if (V.size() < 2)
continue;		continue;
		const VNType &VN = R;
// Compute the insertion point and the list of expressions to be hoisted.		SmallPtrSet<BasicBlock *, 2> VNBlocks;
SmallVecInsn InstructionsToHoist;		for (auto &I : V) {
for (auto I : V)		BasicBlock *BBI = I->getParent();
// We don't need to check for hoist-barriers here because if		if (!hasEH(BBI))
// I->getParent() is a barrier then I precedes the barrier.		VNBlocks.insert(BBI);
if (!hasEH(I->getParent()))		}
InstructionsToHoist.push_back(I);		// Compute the Post Dominance Frontiers of each basic block
		// The dominance frontier of a live block X in the reverse
if (!InstructionsToHoist.empty())		// control graph is the set of blocks upon which X is control
partitionCandidates(InstructionsToHoist, HPL, K);		// dependent. The following sequence computes the set of blocks
		// which currently have dead terminators that are control
		// dependence sources of a block which is in NewLiveBlocks.
		IDFs.setDefiningBlocks(VNBlocks);
		IDFs.calculate(IDFBlocks);

		// Make a map of BB vs instructions to be hoisted.
		for (unsigned i = 0; i < V.size(); ++i) {
		InValue[V[i]->getParent()].push_back(std::make_pair(VN, V[i]));
		}
		// Insert empty CHI node for this VN. This is used to factor out
		// basic blocks where the ANTIC can potentially change.
		for (auto IDFB : IDFBlocks) { // TODO: Prune out useless CHI insertions.
		for (unsigned i = 0; i < V.size(); ++i) {
		CHIArg C = {VN, nullptr, nullptr};
		if (DT->dominates(IDFB, V[i]->getParent())) { // Ignore spurious PDFs.
		// InValue[V[i]->getParent()].push_back(std::make_pair(VN, V[i]));
		OutValue[IDFB].push_back(C);
		DEBUG(dbgs() << "\nInsertion a CHI for BB: " << IDFB->getName()
		<< ", for Insn: " << *V[i]);
}		}
}		}
		}
		}

		// Insert CHI args at each PDF to iterate on factored graph of
		// control dependence.
		insertCHI(InValue, OutValue);
		// Using the CHI args inserted at each PDF, find fully anticipable values.
		findHoistableCandidates(OutValue, K, HPL);
		}

// Return true when all operands of Instr are available at insertion point		// Return true when all operands of Instr are available at insertion point
// HoistPt. When limiting the number of hoisted expressions, one could hoist		// HoistPt. When limiting the number of hoisted expressions, one could hoist
// a load without hoisting its access function. So before hoisting any		// a load without hoisting its access function. So before hoisting any
// expression, make sure that all its operands are available at insert point.		// expression, make sure that all its operands are available at insert point.
bool allOperandsAvailable(const Instruction *I,		bool allOperandsAvailable(const Instruction *I,
const BasicBlock *HoistPt) const {		const BasicBlock *HoistPt) const {
for (const Use &Op : I->operands())		for (const Use &Op : I->operands())
[... 63 lines not shown; context: for (const Instruction *OtherInst : InstructionsToHoist) { ...]
cast<StoreInst>(OtherInst)->getPointerOperand());		cast<StoreInst>(OtherInst)->getPointerOperand());
ClonedGep->andIRFlags(OtherGep);		ClonedGep->andIRFlags(OtherGep);
}		}

// Replace uses of Gep with ClonedGep in Repl.		// Replace uses of Gep with ClonedGep in Repl.
Repl->replaceUsesOfWith(Gep, ClonedGep);		Repl->replaceUsesOfWith(Gep, ClonedGep);
}		}

		void updateAlignment(Instruction *I, Instruction *Repl) {
		if (auto *ReplacementLoad = dyn_cast<LoadInst>(Repl)) {
		ReplacementLoad->setAlignment(
		std::min(ReplacementLoad->getAlignment(),
		cast<LoadInst>(I)->getAlignment()));
		++NumLoadsRemoved;
		} else if (auto *ReplacementStore = dyn_cast<StoreInst>(Repl)) {
		ReplacementStore->setAlignment(
		std::min(ReplacementStore->getAlignment(),
		cast<StoreInst>(I)->getAlignment()));
		++NumStoresRemoved;
		} else if (auto *ReplacementAlloca = dyn_cast<AllocaInst>(Repl)) {
		ReplacementAlloca->setAlignment(
		std::max(ReplacementAlloca->getAlignment(),
		cast<AllocaInst>(I)->getAlignment()));
		} else if (isa<CallInst>(Repl)) {
		++NumCallsRemoved;
		}
		}

		// Remove all the instructions in Candidates and replace their usage with Repl.
		// Returns the number of instructions removed.
		unsigned rauw(const SmallVecInsn &Candidates, Instruction *Repl,
		MemoryUseOrDef *NewMemAcc) {
		unsigned NR = 0;
		for (Instruction *I : Candidates) {
		if (I != Repl) {
		++NR;
		updateAlignment(I, Repl);
		if (NewMemAcc) {
		// Update the uses of the old MSSA access with NewMemAcc.
		MemoryAccess *OldMA = MSSA->getMemoryAccess(I);
		OldMA->replaceAllUsesWith(NewMemAcc);
		MSSAUpdater->removeMemoryAccess(OldMA);
		}

		Repl->andIRFlags(I);
		combineKnownMetadata(Repl, I);
		I->replaceAllUsesWith(Repl);
		// Also invalidate the Alias Analysis cache.
		MD->removeInstruction(I);
		I->eraseFromParent();
		}
		}
		return NR;
		}

		// Replace all Memory PHI usage with NewMemAcc.
		void raMPHIuw(MemoryUseOrDef *NewMemAcc) {
		SmallPtrSet<MemoryPhi *, 4> UsePhis;
		for (User *U : NewMemAcc->users())
		if (MemoryPhi *Phi = dyn_cast<MemoryPhi>(U))
		UsePhis.insert(Phi);

		for (MemoryPhi *Phi : UsePhis) {
		auto In = Phi->incoming_values();
		if (all_of(In, [&](Use &U) { return U == NewMemAcc; })) {
		Phi->replaceAllUsesWith(NewMemAcc);
		MSSAUpdater->removeMemoryAccess(Phi);
		}
		}
		}

		// Remove all other instructions and replace them with Repl.
		unsigned removeAndReplace(const SmallVecInsn &Candidates, Instruction *Repl,
		BasicBlock *DestBB, bool MoveAccess) {
		MemoryUseOrDef *NewMemAcc = MSSA->getMemoryAccess(Repl);
		if (MoveAccess && NewMemAcc) {
		// The definition of this ld/st will not change: ld/st hoisting is
		// legal when the ld/st is not moved past its current definition.
		MSSAUpdater->moveToPlace(NewMemAcc, DestBB, MemorySSA::End);
		}

		// Replace all other instructions with Repl with memory access NewMemAcc.
		unsigned NR = rauw(Candidates, Repl, NewMemAcc);

		// Remove MemorySSA phi nodes with the same arguments.
		if (NewMemAcc)
		raMPHIuw(NewMemAcc);
		return NR;
		}

// In the case Repl is a load or a store, we make all their GEPs		// In the case Repl is a load or a store, we make all their GEPs
// available: GEPs are not hoisted by default to avoid the address		// available: GEPs are not hoisted by default to avoid the address
// computations to be hoisted without the associated load or store.		// computations to be hoisted without the associated load or store.
bool makeGepOperandsAvailable(Instruction *Repl, BasicBlock *HoistPt,		bool makeGepOperandsAvailable(Instruction *Repl, BasicBlock *HoistPt,
const SmallVecInsn &InstructionsToHoist) const {		const SmallVecInsn &InstructionsToHoist) const {
// Check whether the GEP of a ld/st can be synthesized at HoistPt.		// Check whether the GEP of a ld/st can be synthesized at HoistPt.
GetElementPtrInst *Gep = nullptr;		GetElementPtrInst *Gep = nullptr;
Instruction *Val = nullptr;		Instruction *Val = nullptr;
[... 25 lines not shown; context: bool makeGepOperandsAvailable(Instruction *Repl, BasicBlock *HoistPt, ...]
return true;		return true;
}		}

std::pair<unsigned, unsigned> hoist(HoistingPointList &HPL) {		std::pair<unsigned, unsigned> hoist(HoistingPointList &HPL) {
unsigned NI = 0, NL = 0, NS = 0, NC = 0, NR = 0;		unsigned NI = 0, NL = 0, NS = 0, NC = 0, NR = 0;
for (const HoistingPointInfo &HP : HPL) {		for (const HoistingPointInfo &HP : HPL) {
// Find out whether we already have one of the instructions in HoistPt,		// Find out whether we already have one of the instructions in HoistPt,
// in which case we do not have to move it.		// in which case we do not have to move it.
BasicBlock *HoistPt = HP.first;		BasicBlock *DestBB = HP.first;
const SmallVecInsn &InstructionsToHoist = HP.second;		const SmallVecInsn &InstructionsToHoist = HP.second;
Instruction *Repl = nullptr;		Instruction *Repl = nullptr;
for (Instruction *I : InstructionsToHoist)		for (Instruction *I : InstructionsToHoist)
if (I->getParent() == HoistPt)		if (I->getParent() == DestBB)
// If there are two instructions in HoistPt to be hoisted in place:		// If there are two instructions in HoistPt to be hoisted in place:
// update Repl to be the first one, such that we can rename the uses		// update Repl to be the first one, such that we can rename the uses
// of the second based on the first.		// of the second based on the first.
if (!Repl || firstInBB(I, Repl))		if (!Repl || firstInBB(I, Repl))
Repl = I;		Repl = I;

// Keep track of whether we moved the instruction so we know whether we		// Keep track of whether we moved the instruction so we know whether we
// should move the MemoryAccess.		// should move the MemoryAccess.
bool MoveAccess = true;		bool MoveAccess = true;
if (Repl) {		if (Repl) {
// Repl is already in HoistPt: it remains in place.		// Repl is already in HoistPt: it remains in place.
assert(allOperandsAvailable(Repl, HoistPt) &&		assert(allOperandsAvailable(Repl, DestBB) &&
"instruction depends on operands that are not available");		"instruction depends on operands that are not available");
MoveAccess = false;		MoveAccess = false;
} else {		} else {
// When we do not find Repl in HoistPt, select the first in the list		// When we do not find Repl in HoistPt, select the first in the list
// and move it to HoistPt.		// and move it to HoistPt.
Repl = InstructionsToHoist.front();		Repl = InstructionsToHoist.front();

// We can move Repl in HoistPt only when all operands are available.		// We can move Repl in HoistPt only when all operands are available.
// The order in which hoistings are done may influence the availability		// The order in which hoistings are done may influence the availability
// of operands.		// of operands.
if (!allOperandsAvailable(Repl, HoistPt)) {		if (!allOperandsAvailable(Repl, DestBB)) {

// When HoistingGeps there is nothing more we can do to make the		// When HoistingGeps there is nothing more we can do to make the
// operands available: just continue.		// operands available: just continue.
if (HoistingGeps)		if (HoistingGeps)
continue;		continue;

// When not HoistingGeps we need to copy the GEPs.		// When not HoistingGeps we need to copy the GEPs.
if (!makeGepOperandsAvailable(Repl, HoistPt, InstructionsToHoist))		if (!makeGepOperandsAvailable(Repl, DestBB, InstructionsToHoist))
continue;		continue;
}		}

// Move the instruction at the end of HoistPt.		// Move the instruction at the end of HoistPt.
Instruction *Last = HoistPt->getTerminator();		Instruction *Last = DestBB->getTerminator();
MD->removeInstruction(Repl);		MD->removeInstruction(Repl);
Repl->moveBefore(Last);		Repl->moveBefore(Last);

DFSNumber[Repl] = DFSNumber[Last]++;		DFSNumber[Repl] = DFSNumber[Last]++;
}		}

MemoryAccess *NewMemAcc = MSSA->getMemoryAccess(Repl);		NR += removeAndReplace(InstructionsToHoist, Repl, DestBB, MoveAccess);

if (MoveAccess) {
if (MemoryUseOrDef *OldMemAcc =
dyn_cast_or_null<MemoryUseOrDef>(NewMemAcc)) {
// The definition of this ld/st will not change: ld/st hoisting is
// legal when the ld/st is not moved past its current definition.
MemoryAccess *Def = OldMemAcc->getDefiningAccess();
NewMemAcc =
MSSAUpdater->createMemoryAccessInBB(Repl, Def, HoistPt, MemorySSA::End);
OldMemAcc->replaceAllUsesWith(NewMemAcc);
MSSAUpdater->removeMemoryAccess(OldMemAcc);
}
}

if (isa<LoadInst>(Repl))		if (isa<LoadInst>(Repl))
++NL;		++NL;
else if (isa<StoreInst>(Repl))		else if (isa<StoreInst>(Repl))
++NS;		++NS;
else if (isa<CallInst>(Repl))		else if (isa<CallInst>(Repl))
++NC;		++NC;
else // Scalar		else // Scalar
++NI;		++NI;

// Remove and rename all other instructions.
for (Instruction *I : InstructionsToHoist)
if (I != Repl) {
++NR;
if (auto *ReplacementLoad = dyn_cast<LoadInst>(Repl)) {
ReplacementLoad->setAlignment(
std::min(ReplacementLoad->getAlignment(),
cast<LoadInst>(I)->getAlignment()));
++NumLoadsRemoved;
} else if (auto *ReplacementStore = dyn_cast<StoreInst>(Repl)) {
ReplacementStore->setAlignment(
std::min(ReplacementStore->getAlignment(),
cast<StoreInst>(I)->getAlignment()));
++NumStoresRemoved;
} else if (auto *ReplacementAlloca = dyn_cast<AllocaInst>(Repl)) {
ReplacementAlloca->setAlignment(
std::max(ReplacementAlloca->getAlignment(),
cast<AllocaInst>(I)->getAlignment()));
} else if (isa<CallInst>(Repl)) {
++NumCallsRemoved;
}

if (NewMemAcc) {
// Update the uses of the old MSSA access with NewMemAcc.
MemoryAccess *OldMA = MSSA->getMemoryAccess(I);
OldMA->replaceAllUsesWith(NewMemAcc);
MSSAUpdater->removeMemoryAccess(OldMA);
}

Repl->andIRFlags(I);
combineKnownMetadata(Repl, I);
I->replaceAllUsesWith(Repl);
// Also invalidate the Alias Analysis cache.
MD->removeInstruction(I);
I->eraseFromParent();
}

// Remove MemorySSA phi nodes with the same arguments.
if (NewMemAcc) {
SmallPtrSet<MemoryPhi *, 4> UsePhis;
for (User *U : NewMemAcc->users())
if (MemoryPhi *Phi = dyn_cast<MemoryPhi>(U))
UsePhis.insert(Phi);

for (auto *Phi : UsePhis) {
auto In = Phi->incoming_values();
if (all_of(In, [&](Use &U) { return U == NewMemAcc; })) {
Phi->replaceAllUsesWith(NewMemAcc);
MSSAUpdater->removeMemoryAccess(Phi);
}
}
}
}		}

NumHoisted += NL + NS + NC + NI;		NumHoisted += NL + NS + NC + NI;
NumRemoved += NR;		NumRemoved += NR;
NumLoadsHoisted += NL;		NumLoadsHoisted += NL;
NumStoresHoisted += NS;		NumStoresHoisted += NS;
NumCallsHoisted += NC;		NumCallsHoisted += NC;
return {NI, NL + NC + NS};		return {NI, NL + NC + NS};
}		}

// Hoist all expressions. Returns Number of scalars hoisted		// Hoist all expressions. Returns Number of scalars hoisted
// and number of non-scalars hoisted.		// and number of non-scalars hoisted.
std::pair<unsigned, unsigned> hoistExpressions(Function &F) {		std::pair<unsigned, unsigned> hoistExpressions(Function &F) {
InsnInfo II;		InsnInfo II;
LoadInfo LI;		LoadInfo LI;
StoreInfo SI;		StoreInfo SI;
CallInfo CI;		CallInfo CI;
for (BasicBlock *BB : depth_first(&F.getEntryBlock())) {		for (BasicBlock *BB : depth_first(&F.getEntryBlock())) {
int InstructionNb = 0;		int InstructionNb = 0;
for (Instruction &I1 : *BB) {		for (Instruction &I1 : *BB) {
// If I1 cannot guarantee progress, subsequent instructions		// If I1 cannot guarantee progress, subsequent instructions
// in BB cannot be hoisted anyways.		// in BB cannot be hoisted anyways.
if (!isGuaranteedToTransferExecutionToSuccessor(&I1)) {		if (!isGuaranteedToTransferExecutionToSuccessor(&I1)) {
HoistBarrier.insert(BB);		HoistBarrier.insert(BB);
break;		break;
}		}
// Only hoist the first instructions in BB up to MaxDepthInBB. Hoisting		// Only hoist the first instructions in BB up to MaxDepthInBB. Hoisting
// deeper may increase the register pressure and compilation time.		// deeper may increase the register pressure and compilation time.
if (MaxDepthInBB != -1 && InstructionNb++ >= MaxDepthInBB)		if (MaxDepthInBB != -1 && InstructionNb++ >= MaxDepthInBB)
break;		break;

// Do not value number terminator instructions.		// Do not value number terminator instructions.
if (isa<TerminatorInst>(&I1))		if (isa<TerminatorInst>(&I1))
[... 43 lines not shown ...]	public:
GVNHoistLegacyPass() : FunctionPass(ID) {		GVNHoistLegacyPass() : FunctionPass(ID) {
initializeGVNHoistLegacyPassPass(*PassRegistry::getPassRegistry());		initializeGVNHoistLegacyPassPass(*PassRegistry::getPassRegistry());
}		}

bool runOnFunction(Function &F) override {		bool runOnFunction(Function &F) override {
if (skipFunction(F))		if (skipFunction(F))
return false;		return false;
auto &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();		auto &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();
		auto &PDT = getAnalysis<PostDominatorTreeWrapperPass>().getPostDomTree();
auto &AA = getAnalysis<AAResultsWrapperPass>().getAAResults();		auto &AA = getAnalysis<AAResultsWrapperPass>().getAAResults();
auto &MD = getAnalysis<MemoryDependenceWrapperPass>().getMemDep();		auto &MD = getAnalysis<MemoryDependenceWrapperPass>().getMemDep();
auto &MSSA = getAnalysis<MemorySSAWrapperPass>().getMSSA();		auto &MSSA = getAnalysis<MemorySSAWrapperPass>().getMSSA();

GVNHoist G(&DT, &AA, &MD, &MSSA);		GVNHoist G(&DT, &PDT, &AA, &MD, &MSSA);
return G.run(F);		return G.run(F);
}		}

void getAnalysisUsage(AnalysisUsage &AU) const override {		void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.addRequired<DominatorTreeWrapperPass>();		AU.addRequired<DominatorTreeWrapperPass>();
		AU.addRequired<PostDominatorTreeWrapperPass>();
AU.addRequired<AAResultsWrapperPass>();		AU.addRequired<AAResultsWrapperPass>();
AU.addRequired<MemoryDependenceWrapperPass>();		AU.addRequired<MemoryDependenceWrapperPass>();
AU.addRequired<MemorySSAWrapperPass>();		AU.addRequired<MemorySSAWrapperPass>();
AU.addPreserved<DominatorTreeWrapperPass>();		AU.addPreserved<DominatorTreeWrapperPass>();
AU.addPreserved<MemorySSAWrapperPass>();		AU.addPreserved<MemorySSAWrapperPass>();
AU.addPreserved<GlobalsAAWrapperPass>();		AU.addPreserved<GlobalsAAWrapperPass>();
}		}
};		};
} // namespace		} // namespace llvm

PreservedAnalyses GVNHoistPass::run(Function &F, FunctionAnalysisManager &AM) {		PreservedAnalyses GVNHoistPass::run(Function &F, FunctionAnalysisManager &AM) {
DominatorTree &DT = AM.getResult<DominatorTreeAnalysis>(F);		DominatorTree &DT = AM.getResult<DominatorTreeAnalysis>(F);
		PostDominatorTree &PDT = AM.getResult<PostDominatorTreeAnalysis>(F);
AliasAnalysis &AA = AM.getResult<AAManager>(F);		AliasAnalysis &AA = AM.getResult<AAManager>(F);
MemoryDependenceResults &MD = AM.getResult<MemoryDependenceAnalysis>(F);		MemoryDependenceResults &MD = AM.getResult<MemoryDependenceAnalysis>(F);
MemorySSA &MSSA = AM.getResult<MemorySSAAnalysis>(F).getMSSA();		MemorySSA &MSSA = AM.getResult<MemorySSAAnalysis>(F).getMSSA();
GVNHoist G(&DT, &AA, &MD, &MSSA);		GVNHoist G(&DT, &PDT, &AA, &MD, &MSSA);
if (!G.run(F))		if (!G.run(F))
return PreservedAnalyses::all();		return PreservedAnalyses::all();

PreservedAnalyses PA;		PreservedAnalyses PA;
PA.preserve<DominatorTreeAnalysis>();		PA.preserve<DominatorTreeAnalysis>();
PA.preserve<MemorySSAAnalysis>();		PA.preserve<MemorySSAAnalysis>();
PA.preserve<GlobalsAA>();		PA.preserve<GlobalsAA>();
return PA;		return PA;
[... 13 lines not shown ...]

llvm/trunk/test/Transforms/GVNHoist/hoist-more-than-two-branches.ll

				; RUN: opt -gvn-hoist -S < %s | FileCheck %s

				; CHECK: store
				; CHECK-NOT: store

				; Check that an instruction can be hoisted to a basic block
				; with more than two successors.

				@G = external global i32, align 4

				define void @foo(i32 %c1) {
				entry:
				switch i32 %c1, label %exit1 [
				i32 0, label %sw0
				i32 1, label %sw1
				]

				sw0:
				store i32 1, i32* @G
				br label %exit

				sw1:
				store i32 1, i32* @G
				br label %exit

				exit1:
				store i32 1, i32* @G
				ret void
				exit:
				ret void
				}
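
As a rough source-level picture of what hoist-more-than-two-branches.ll checks, the following C++ sketch shows the same store on every successor of a switch being replaced by a single store ahead of the switch. The names G, fooBefore and fooAfter are hypothetical and chosen for illustration only; they are not part of the patch, and the sketch ignores everything the pass does at the IR level.

```cpp
// Hypothetical source-level analogue of the IR test; G, fooBefore and
// fooAfter are illustrative names and do not appear in the patch.
static int G = 0;

// Before hoisting: each of the three successors of the switch performs
// the same store.
void fooBefore(int c1) {
  switch (c1) {
  case 0:  G = 1; break;  // corresponds to %sw0
  case 1:  G = 1; break;  // corresponds to %sw1
  default: G = 1; break;  // corresponds to %exit1
  }
}

// After hoisting: one store in the block that ends in the switch, which
// is what the "CHECK: store" / "CHECK-NOT: store" pair expects.
void fooAfter(int c1) {
  G = 1;
  switch (c1) {
  case 0:  break;
  case 1:  break;
  default: break;
  }
}
```

Both functions leave G in the same state for any argument, which is why a single store in the switch block is enough.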

llvm/trunk/test/Transforms/GVNHoist/hoist-mssa.ll

	; RUN: opt -S -gvn-hoist < %s | FileCheck %s			; RUN: opt -S -gvn-hoist -newgvn < %s | FileCheck %s

	; Check that store hoisting works: there should be only one store left.			; Check that store hoisting works: there should be only one store left.
	; CHECK-LABEL: @getopt			; CHECK-LABEL: @getopt
	; CHECK: store i32			; CHECK: store i32
	; CHECK-NOT: store i32			; CHECK-NOT: store i32

	@optind = external global i32, align 4			@optind = external global i32, align 4

	[... 60 lines not shown ...]

llvm/trunk/test/Transforms/GVNHoist/hoist-newgvn.ll

				; RUN: opt -gvn-hoist -newgvn -S < %s | FileCheck %s
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				@GlobalVar = internal global float 1.000000e+00

				; Check that we hoist load and scalar expressions in dominator.
				; CHECK-LABEL: @dominatorHoisting
				; CHECK: load
				; CHECK: load
				; CHECK: fsub
				; CHECK: fmul
				; CHECK: load
				; CHECK: fsub
				; CHECK: fmul
				; CHECK-NOT: load
				; CHECK-NOT: fmul
				; CHECK-NOT: fsub
				define float @dominatorHoisting(float %d, float* %min, float* %max, float* %a) {
				entry:
				%div = fdiv float 1.000000e+00, %d
				%0 = load float, float* %min, align 4
				%1 = load float, float* %a, align 4
				%sub = fsub float %0, %1
				%mul = fmul float %sub, %div
				%2 = load float, float* %max, align 4
				%sub1 = fsub float %2, %1
				%mul2 = fmul float %sub1, %div
				%cmp = fcmp oge float %div, 0.000000e+00
				br i1 %cmp, label %if.then, label %if.end

				if.then: ; preds = %entry
				%3 = load float, float* %max, align 4
				%4 = load float, float* %a, align 4
				%sub3 = fsub float %3, %4
				%mul4 = fmul float %sub3, %div
				%5 = load float, float* %min, align 4
				%sub5 = fsub float %5, %4
				%mul6 = fmul float %sub5, %div
				br label %if.end

				if.end: ; preds = %entry
				%p1 = phi float [ %mul4, %if.then ], [ 0.000000e+00, %entry ]
				%p2 = phi float [ %mul6, %if.then ], [ 0.000000e+00, %entry ]

				%x = fadd float %p1, %mul2
				%y = fadd float %p2, %mul
				%z = fadd float %x, %y
				ret float %z
				}

				; Check that we hoist load and scalar expressions in dominator.
				; CHECK-LABEL: @domHoisting
				; CHECK: load
				; CHECK: load
				; CHECK: fsub
				; CHECK: fmul
				; CHECK: load
				; CHECK: fsub
				; CHECK: fmul
				; CHECK-NOT: load
				; CHECK-NOT: fmul
				; CHECK-NOT: fsub
				define float @domHoisting(float %d, float* %min, float* %max, float* %a) {
				entry:
				%div = fdiv float 1.000000e+00, %d
				%0 = load float, float* %min, align 4
				%1 = load float, float* %a, align 4
				%sub = fsub float %0, %1
				%mul = fmul float %sub, %div
				%2 = load float, float* %max, align 4
				%sub1 = fsub float %2, %1
				%mul2 = fmul float %sub1, %div
				%cmp = fcmp oge float %div, 0.000000e+00
				br i1 %cmp, label %if.then, label %if.else

				if.then:
				%3 = load float, float* %max, align 4
				%4 = load float, float* %a, align 4
				%sub3 = fsub float %3, %4
				%mul4 = fmul float %sub3, %div
				%5 = load float, float* %min, align 4
				%sub5 = fsub float %5, %4
				%mul6 = fmul float %sub5, %div
				br label %if.end

				if.else:
				%6 = load float, float* %max, align 4
				%7 = load float, float* %a, align 4
				%sub9 = fsub float %6, %7
				%mul10 = fmul float %sub9, %div
				%8 = load float, float* %min, align 4
				%sub12 = fsub float %8, %7
				%mul13 = fmul float %sub12, %div
				br label %if.end

				if.end:
				%p1 = phi float [ %mul4, %if.then ], [ %mul10, %if.else ]
				%p2 = phi float [ %mul6, %if.then ], [ %mul13, %if.else ]

				%x = fadd float %p1, %mul2
				%y = fadd float %p2, %mul
				%z = fadd float %x, %y
				ret float %z
				}

llvm/trunk/test/Transforms/GVNHoist/hoist-pr20242.ll

	; RUN: opt -gvn-hoist -S < %s | FileCheck %s			; RUN: opt -gvn-hoist -newgvn -gvn-hoist -S < %s | FileCheck %s
				; Test to demonstrate that newgvn creates opportunities for
				; more gvn-hoist when sibling branches contain identical expressions.

	target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
	target triple = "x86_64-unknown-linux-gnu"			target triple = "x86_64-unknown-linux-gnu"

	; Check that all "or" expressions are hoisted.			; Check that all "or" expressions are hoisted.
	; CHECK-LABEL: @encode			; CHECK-LABEL: @encode
	; CHECK: or i32			; CHECK: or i32
	; CHECK-NOT: or i32			; CHECK-NOT: or i32

	[... 65 lines not shown ...]

llvm/trunk/test/Transforms/GVNHoist/hoist-pr28933.ll

	; RUN: opt -S -gvn-hoist -verify-memoryssa < %s | FileCheck %s			; RUN: opt -S -gvn-hoist -verify-memoryssa -newgvn < %s | FileCheck %s

	; Check that we end up with one load and one store, in the right order			; Check that we end up with one load and one store, in the right order
	; CHECK-LABEL: define void @test_it(			; CHECK-LABEL: define void @test_it(
	; CHECK: store			; CHECK: store
	; CHECK-NEXT: load
	; CHECK-NOT: store			; CHECK-NOT: store
	; CHECK-NOT: load			; CHECK-NOT: load

	%rec894.0.1.2.3.12 = type { i16 }			%rec894.0.1.2.3.12 = type { i16 }

	@a = external global %rec894.0.1.2.3.12			@a = external global %rec894.0.1.2.3.12

	define void @test_it() {			define void @test_it() {
	bb2:			bb2:
	store i16 undef, i16* getelementptr inbounds (%rec894.0.1.2.3.12, %rec894.0.1.2.3.12* @a, i16 0, i32 0), align 1			store i16 undef, i16* getelementptr inbounds (%rec894.0.1.2.3.12, %rec894.0.1.2.3.12* @a, i16 0, i32 0), align 1
	%_tmp61 = load i16, i16* getelementptr inbounds (%rec894.0.1.2.3.12, %rec894.0.1.2.3.12* @a, i16 0, i32 0), align 1			%_tmp61 = load i16, i16* getelementptr inbounds (%rec894.0.1.2.3.12, %rec894.0.1.2.3.12* @a, i16 0, i32 0), align 1
	store i16 undef, i16* getelementptr inbounds (%rec894.0.1.2.3.12, %rec894.0.1.2.3.12* @a, i16 0, i32 0), align 1			store i16 undef, i16* getelementptr inbounds (%rec894.0.1.2.3.12, %rec894.0.1.2.3.12* @a, i16 0, i32 0), align 1
	%_tmp92 = load i16, i16* getelementptr inbounds (%rec894.0.1.2.3.12, %rec894.0.1.2.3.12* @a, i16 0, i32 0), align 1			%_tmp92 = load i16, i16* getelementptr inbounds (%rec894.0.1.2.3.12, %rec894.0.1.2.3.12* @a, i16 0, i32 0), align 1
	ret void			ret void
	}			}

llvm/trunk/test/Transforms/GVNHoist/hoist-recursive-geps.ll

	; RUN: opt -gvn-hoist -S < %s | FileCheck %s			; RUN: opt -gvn-hoist -newgvn -gvn-hoist -S < %s | FileCheck %s

				; Check that recursive GEPs are hoisted. Since hoisting creates
				; fully redundant instructions, newgvn is run to remove them which then
				; creates more opportunities for hoisting.

	; Check that recursive GEPs are hoisted.
	; CHECK-LABEL: @fun			; CHECK-LABEL: @fun
	; CHECK: fdiv
	; CHECK: load			; CHECK: load
				; CHECK: fdiv
	; CHECK: load			; CHECK: load
	; CHECK: load			; CHECK: load
	; CHECK: load			; CHECK: load
	; CHECK: fsub			; CHECK: fsub
	; CHECK: fsub
	; CHECK: fmul			; CHECK: fmul
				; CHECK: fsub
	; CHECK: fmul			; CHECK: fmul
	; CHECK-NOT: fsub			; CHECK-NOT: fsub
	; CHECK-NOT: fmul			; CHECK-NOT: fmul

	%0 = type { double, double, double }			%0 = type { double, double, double }
	%1 = type { double, double, double }			%1 = type { double, double, double }
	%2 = type { %3, %1, %1 }			%2 = type { %3, %1, %1 }
	%3 = type { i32 (...)*, %4, %10, %11, %11, %11, %11, %11, %11, %11, %11, %11 }			%3 = type { i32 (...)*, %4, %10, %11, %11, %11, %11, %11, %11, %11, %11, %11 }
	[... 83 lines not shown ...]

llvm/trunk/test/Transforms/GVNHoist/hoist.ll

; RUN: opt -gvn-hoist -S < %s | FileCheck %s		; RUN: opt -gvn-hoist -S < %s | FileCheck %s
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"		target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"		target triple = "x86_64-unknown-linux-gnu"

@GlobalVar = internal global float 1.000000e+00		@GlobalVar = internal global float 1.000000e+00

; Check that all scalar expressions are hoisted.		; Check that all scalar expressions are hoisted.
;		;
; CHECK-LABEL: @scalarsHoisting		; CHECK-LABEL: @scalarsHoisting
; CHECK: fsub		; CHECK: fsub
; CHECK: fsub
; CHECK: fmul		; CHECK: fmul
		; CHECK: fsub
; CHECK: fmul		; CHECK: fmul
; CHECK-NOT: fmul		; CHECK-NOT: fmul
; CHECK-NOT: fsub		; CHECK-NOT: fsub
define float @scalarsHoisting(float %d, float %min, float %max, float %a) {		define float @scalarsHoisting(float %d, float %min, float %max, float %a) {
entry:		entry:
%div = fdiv float 1.000000e+00, %d		%div = fdiv float 1.000000e+00, %d
%cmp = fcmp oge float %div, 0.000000e+00		%cmp = fcmp oge float %div, 0.000000e+00
br i1 %cmp, label %if.then, label %if.else		br i1 %cmp, label %if.then, label %if.else
[... 22 lines not shown ...]
; Check that all loads and scalars depending on the loads are hoisted.		; Check that all loads and scalars depending on the loads are hoisted.
; Check that getelementptr computation gets hoisted before the load.		; Check that getelementptr computation gets hoisted before the load.
;		;
; CHECK-LABEL: @readsAndScalarsHoisting		; CHECK-LABEL: @readsAndScalarsHoisting
; CHECK: load		; CHECK: load
; CHECK: load		; CHECK: load
; CHECK: load		; CHECK: load
; CHECK: fsub		; CHECK: fsub
; CHECK: fsub
; CHECK: fmul		; CHECK: fmul
		; CHECK: fsub
; CHECK: fmul		; CHECK: fmul
; CHECK-NOT: load		; CHECK-NOT: load
; CHECK-NOT: fmul		; CHECK-NOT: fmul
; CHECK-NOT: fsub		; CHECK-NOT: fsub
define float @readsAndScalarsHoisting(float %d, float* %min, float* %max, float* %a) {		define float @readsAndScalarsHoisting(float %d, float* %min, float* %max, float* %a) {
entry:		entry:
%div = fdiv float 1.000000e+00, %d		%div = fdiv float 1.000000e+00, %d
%cmp = fcmp oge float %div, 0.000000e+00		%cmp = fcmp oge float %div, 0.000000e+00
[... 82 lines not shown ...]

; Check that we do hoist loads when the store is above the insertion point.		; Check that we do hoist loads when the store is above the insertion point.
;		;
; CHECK-LABEL: @readsAndWriteAboveInsertPt		; CHECK-LABEL: @readsAndWriteAboveInsertPt
; CHECK: load		; CHECK: load
; CHECK: load		; CHECK: load
; CHECK: load		; CHECK: load
; CHECK: fsub		; CHECK: fsub
; CHECK: fsub
; CHECK: fmul		; CHECK: fmul
		; CHECK: fsub
; CHECK: fmul		; CHECK: fmul
; CHECK-NOT: load		; CHECK-NOT: load
; CHECK-NOT: fmul		; CHECK-NOT: fmul
; CHECK-NOT: fsub		; CHECK-NOT: fsub
define float @readsAndWriteAboveInsertPt(float %d, float* %min, float* %max, float* %a) {		define float @readsAndWriteAboveInsertPt(float %d, float* %min, float* %max, float* %a) {
entry:		entry:
%div = fdiv float 1.000000e+00, %d		%div = fdiv float 1.000000e+00, %d
store float 0.000000e+00, float* @GlobalVar		store float 0.000000e+00, float* @GlobalVar
[... 99 lines not shown ...]
}		}

; Check that we hoist load and scalar expressions in triangles.		; Check that we hoist load and scalar expressions in triangles.
; CHECK-LABEL: @triangleHoisting		; CHECK-LABEL: @triangleHoisting
; CHECK: load		; CHECK: load
; CHECK: load		; CHECK: load
; CHECK: load		; CHECK: load
; CHECK: fsub		; CHECK: fsub
; CHECK: fsub
; CHECK: fmul		; CHECK: fmul
		; CHECK: fsub
; CHECK: fmul		; CHECK: fmul
; CHECK-NOT: load		; CHECK-NOT: load
; CHECK-NOT: fmul		; CHECK-NOT: fmul
; CHECK-NOT: fsub		; CHECK-NOT: fsub
define float @triangleHoisting(float %d, float* %min, float* %max, float* %a) {		define float @triangleHoisting(float %d, float* %min, float* %max, float* %a) {
entry:		entry:
%div = fdiv float 1.000000e+00, %d		%div = fdiv float 1.000000e+00, %d
%cmp = fcmp oge float %div, 0.000000e+00		%cmp = fcmp oge float %div, 0.000000e+00
[... 21 lines not shown ...]	if.end: ; preds = %entry
%mul6 = fmul float %sub5, %div		%mul6 = fmul float %sub5, %div

%x = fadd float %p1, %mul6		%x = fadd float %p1, %mul6
%y = fadd float %p2, %mul4		%y = fadd float %p2, %mul4
%z = fadd float %x, %y		%z = fadd float %x, %y
ret float %z		ret float %z
}		}

; Check that we hoist load and scalar expressions in dominator.
; CHECK-LABEL: @dominatorHoisting
; CHECK: load
; CHECK: load
; CHECK: fsub
; CHECK: fmul
; CHECK: load
; CHECK: fsub
; CHECK: fmul
; CHECK-NOT: load
; CHECK-NOT: fmul
; CHECK-NOT: fsub
define float @dominatorHoisting(float %d, float* %min, float* %max, float* %a) {
entry:
%div = fdiv float 1.000000e+00, %d
%0 = load float, float* %min, align 4
%1 = load float, float* %a, align 4
%sub = fsub float %0, %1
%mul = fmul float %sub, %div
%2 = load float, float* %max, align 4
%sub1 = fsub float %2, %1
%mul2 = fmul float %sub1, %div
%cmp = fcmp oge float %div, 0.000000e+00
br i1 %cmp, label %if.then, label %if.end

if.then: ; preds = %entry
%3 = load float, float* %max, align 4
%4 = load float, float* %a, align 4
%sub3 = fsub float %3, %4
%mul4 = fmul float %sub3, %div
%5 = load float, float* %min, align 4
%sub5 = fsub float %5, %4
%mul6 = fmul float %sub5, %div
br label %if.end

if.end: ; preds = %entry
%p1 = phi float [ %mul4, %if.then ], [ 0.000000e+00, %entry ]
%p2 = phi float [ %mul6, %if.then ], [ 0.000000e+00, %entry ]

%x = fadd float %p1, %mul2
%y = fadd float %p2, %mul
%z = fadd float %x, %y
ret float %z
}

; Check that we hoist load and scalar expressions in dominator.
; CHECK-LABEL: @domHoisting
; CHECK: load
; CHECK: load
; CHECK: fsub
; CHECK: fmul
; CHECK: load
; CHECK: fsub
; CHECK: fmul
; CHECK-NOT: load
; CHECK-NOT: fmul
; CHECK-NOT: fsub
define float @domHoisting(float %d, float* %min, float* %max, float* %a) {
entry:
%div = fdiv float 1.000000e+00, %d
%0 = load float, float* %min, align 4
%1 = load float, float* %a, align 4
%sub = fsub float %0, %1
%mul = fmul float %sub, %div
%2 = load float, float* %max, align 4
%sub1 = fsub float %2, %1
%mul2 = fmul float %sub1, %div
%cmp = fcmp oge float %div, 0.000000e+00
br i1 %cmp, label %if.then, label %if.else

if.then:
%3 = load float, float* %max, align 4
%4 = load float, float* %a, align 4
%sub3 = fsub float %3, %4
%mul4 = fmul float %sub3, %div
%5 = load float, float* %min, align 4
%sub5 = fsub float %5, %4
%mul6 = fmul float %sub5, %div
br label %if.end

if.else:
%6 = load float, float* %max, align 4
%7 = load float, float* %a, align 4
%sub9 = fsub float %6, %7
%mul10 = fmul float %sub9, %div
%8 = load float, float* %min, align 4
%sub12 = fsub float %8, %7
%mul13 = fmul float %sub12, %div
br label %if.end

if.end:
%p1 = phi float [ %mul4, %if.then ], [ %mul10, %if.else ]
%p2 = phi float [ %mul6, %if.then ], [ %mul13, %if.else ]

%x = fadd float %p1, %mul2
%y = fadd float %p2, %mul
%z = fadd float %x, %y
ret float %z
}

; Check that we do not hoist loads past stores within a same basic block.		; Check that we do not hoist loads past stores within a same basic block.
; CHECK-LABEL: @noHoistInSingleBBWithStore		; CHECK-LABEL: @noHoistInSingleBBWithStore
; CHECK: load		; CHECK: load
; CHECK: store		; CHECK: store
; CHECK: load		; CHECK: load
; CHECK: store		; CHECK: store
define i32 @noHoistInSingleBBWithStore() {		define i32 @noHoistInSingleBBWithStore() {
entry:		entry:
[... 332 lines not shown ...]

llvm/trunk/test/Transforms/GVNHoist/infinite-loop-direct.ll

				; RUN: opt -S -gvn-hoist < %s | FileCheck %s

				; Checking gvn-hoist in case of infinite loops and irreducible control flow.

				; Check that bitcast is not hoisted because down safety is not guaranteed.
				; CHECK-LABEL: @bazv1
				; CHECK: if.then.i:
				; CHECK: bitcast
				; CHECK-NEXT: load
				; CHECK: if.then4.i:
				; CHECK: bitcast
				; CHECK-NEXT: load

				%class.bar = type { i8, %class.base }
				%class.base = type { i32 (...)** }

				; Function Attrs: noreturn nounwind uwtable
				define void @bazv1() local_unnamed_addr {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x.sroa.2.0..sroa_idx2 = getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				store %class.base* null, %class.base** %x.sroa.2.0..sroa_idx2, align 8
				call void @_Z3foo3bar(%class.bar* nonnull %agg.tmp)
				%0 = load %class.base*, %class.base** %x.sroa.2.0..sroa_idx2, align 8
				%1 = bitcast %class.bar* %agg.tmp to %class.base*
				%cmp.i = icmp eq %class.base* %0, %1
				br i1 %cmp.i, label %if.then.i, label %if.else.i

				if.then.i: ; preds = %entry
				%2 = bitcast %class.base* %0 to void (%class.base*)***
				%vtable.i = load void (%class.base*)**, void (%class.base*)*** %2, align 8
				%vfn.i = getelementptr inbounds void (%class.base*)*, void (%class.base*)** %vtable.i, i64 2
				%3 = load void (%class.base*)*, void (%class.base*)** %vfn.i, align 8
				call void %3(%class.base* %0)
				br label %while.cond.preheader

				if.else.i: ; preds = %entry
				%tobool.i = icmp eq %class.base* %0, null
				br i1 %tobool.i, label %while.cond.preheader, label %if.then4.i

				if.then4.i: ; preds = %if.else.i
				%4 = bitcast %class.base* %0 to void (%class.base*)***
				%vtable6.i = load void (%class.base*)**, void (%class.base*)*** %4, align 8
				%vfn7.i = getelementptr inbounds void (%class.base*)*, void (%class.base*)** %vtable6.i, i64 3
				%5 = load void (%class.base*)*, void (%class.base*)** %vfn7.i, align 8
				call void %5(%class.base* nonnull %0)
				br label %while.cond.preheader

				while.cond.preheader: ; preds = %if.then.i, %if.else.i, %if.then4.i
				br label %while.cond

				while.cond: ; preds = %while.cond.preheader, %while.cond
				%call = call i32 @sleep(i32 10)
				br label %while.cond
				}

				declare void @_Z3foo3bar(%class.bar*) local_unnamed_addr

				declare i32 @sleep(i32) local_unnamed_addr

				; Check that the load is hoisted even if it is inside an irreducible control flow
				; because the load is anticipable on all paths.

				; CHECK-LABEL: @bazv
				; CHECK: bb2:
				; CHECK-NOT: load
				; CHECK-NOT: bitcast

				define void @bazv() {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x= getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				%0 = load %class.base*, %class.base** %x, align 8
				%1 = bitcast %class.bar* %agg.tmp to %class.base*
				%cmp.i = icmp eq %class.base* %0, %1
				br i1 %cmp.i, label %bb1, label %bb4

				bb1:
				%b1 = bitcast %class.base* %0 to void (%class.base*)***
				%i = load void (%class.base*)**, void (%class.base*)*** %b1, align 8
				%vfn.i = getelementptr inbounds void (%class.base*)*, void (%class.base*)** %i, i64 2
				%cmp.j = icmp eq %class.base* %0, %1
				br i1 %cmp.j, label %bb2, label %bb3

				bb2:
				%l1 = load void (%class.base*)*, void (%class.base*)** %vfn.i, align 8
				br label %bb3

				bb3:
				%l2 = load void (%class.base*)*, void (%class.base*)** %vfn.i, align 8
				br label %bb2

				bb4:
				%b2 = bitcast %class.base* %0 to void (%class.base*)***
				ret void
				}

llvm/trunk/test/Transforms/GVNHoist/infinite-loop-indirect.ll

				; RUN: opt -S -gvn-hoist < %s | FileCheck %s

				; Checking gvn-hoist in case of indirect branches.

				; Check that the bitcast is not hoisted because it is after an indirect call
				; CHECK-LABEL: @foo
				; CHECK-LABEL: l1.preheader:
				; CHECK-NEXT: bitcast
				; CHECK-LABEL: l1
				; CHECK: bitcast

				%class.bar = type { i8, %class.base }
				%class.base = type { i32 (...)** }

				@bar = local_unnamed_addr global i32 ()* null, align 8
				@bar1 = local_unnamed_addr global i32 ()* null, align 8

				define i32 @foo(i32* nocapture readonly %i) {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x= getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				%y = load %class.base*, %class.base** %x, align 8
				%0 = load i32, i32* %i, align 4
				%.off = add i32 %0, -1
				%switch = icmp ult i32 %.off, 2
				br i1 %switch, label %l1.preheader, label %sw.default

				l1.preheader: ; preds = %sw.default, %entry
				%b1 = bitcast %class.base* %y to void (%class.base*)***
				br label %l1

				l1: ; preds = %l1.preheader, %l1
				%1 = load i32 ()*, i32 ()** @bar, align 8
				%call = tail call i32 %1()
				%b2 = bitcast %class.base* %y to void (%class.base*)***
				br label %l1

				sw.default: ; preds = %entry
				%2 = load i32 ()*, i32 ()** @bar1, align 8
				%call2 = tail call i32 %2()
				br label %l1.preheader
				}


				; Any instruction inside an infinite loop will not be hoisted because
				; there is no path to exit of the function.

				; CHECK-LABEL: @foo1
				; CHECK-LABEL: l1.preheader:
				; CHECK-NEXT: bitcast
				; CHECK-LABEL: l1:
				; CHECK: bitcast

				define i32 @foo1(i32* nocapture readonly %i) {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x= getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				%y = load %class.base*, %class.base** %x, align 8
				%0 = load i32, i32* %i, align 4
				%.off = add i32 %0, -1
				%switch = icmp ult i32 %.off, 2
				br i1 %switch, label %l1.preheader, label %sw.default

				l1.preheader: ; preds = %sw.default, %entry
				%b1 = bitcast %class.base* %y to void (%class.base*)***
				%y1 = load %class.base*, %class.base** %x, align 8
				br label %l1

				l1: ; preds = %l1.preheader, %l1
				%b2 = bitcast %class.base* %y to void (%class.base*)***
				%1 = load i32 ()*, i32 ()** @bar, align 8
				%y2 = load %class.base*, %class.base** %x, align 8
				%call = tail call i32 %1()
				br label %l1

				sw.default: ; preds = %entry
				%2 = load i32 ()*, i32 ()** @bar1, align 8
				%call2 = tail call i32 %2()
				br label %l1.preheader
				}
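
The comment on @foo1 ties hoisting to the existence of a path to the function exit. One way to picture that condition is a backward walk over the CFG from the exit blocks; the sketch below is an illustration only, with a hypothetical helper canReachExit and a CFG encoded as successor lists of block indices. It is not code from GVNHoist.cpp and does not claim to be how the pass or the post-dominator tree computes this.

```cpp
#include <vector>

// Hypothetical helper, not taken from GVNHoist.cpp: marks every block from
// which some exit block (a block with no successors) is reachable.
// Succs[B] lists the successor block indices of block B.
std::vector<bool> canReachExit(const std::vector<std::vector<int>> &Succs) {
  const int N = static_cast<int>(Succs.size());
  std::vector<std::vector<int>> Preds(N);
  std::vector<int> Worklist;
  std::vector<bool> Reaches(N, false);

  for (int B = 0; B < N; ++B) {
    for (int S : Succs[B])
      Preds[S].push_back(B);
    if (Succs[B].empty()) { // B is an exit block.
      Reaches[B] = true;
      Worklist.push_back(B);
    }
  }

  // Walk the CFG backwards, starting from the exit blocks.
  while (!Worklist.empty()) {
    int B = Worklist.back();
    Worklist.pop_back();
    for (int P : Preds[B]) {
      if (!Reaches[P]) {
        Reaches[P] = true;
        Worklist.push_back(P);
      }
    }
  }
  return Reaches;
}
```

Encoding @foo1 as entry = 0, l1.preheader = 1, l1 = 2, sw.default = 3 gives the successor lists {{1, 3}, {2}, {2}, {1}}. No block has an empty successor list, so the helper reports that no block can reach an exit, which matches the comment that nothing inside the infinite loop is hoisted.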

				; Check that bitcast is hoisted even when one of them is partially redundant.
				; CHECK-LABEL: @test13
				; CHECK: bitcast
				; CHECK-NOT: bitcast

				define i32 @test13(i32* %P, i8* %Ptr, i32* nocapture readonly %i) {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x= getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				%y = load %class.base*, %class.base** %x, align 8
				indirectbr i8* %Ptr, [label %BrBlock, label %B2]

				B2:
				%b1 = bitcast %class.base* %y to void (%class.base*)***
				store i32 4, i32 *%P
				br label %BrBlock

				BrBlock:
				%b2 = bitcast %class.base* %y to void (%class.base*)***
				%L = load i32, i32* %P
				%C = icmp eq i32 %L, 42
				br i1 %C, label %T, label %F

				T:
				ret i32 123
				F:
				ret i32 1422
				}

				; Check that the bitcast is not hoisted because anticipability
				; cannot be guaranteed here as one of the indirect branch targets
				; does not have the bitcast instruction.

				; CHECK-LABEL: @test14
				; CHECK-LABEL: B2:
				; CHECK-NEXT: bitcast
				; CHECK-LABEL: BrBlock:
				; CHECK-NEXT: bitcast

				define i32 @test14(i32* %P, i8* %Ptr, i32* nocapture readonly %i) {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x= getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				%y = load %class.base*, %class.base** %x, align 8
				indirectbr i8* %Ptr, [label %BrBlock, label %B2, label %T]

				B2:
				%b1 = bitcast %class.base* %y to void (%class.base*)***
				store i32 4, i32 *%P
				br label %BrBlock

				BrBlock:
				%b2 = bitcast %class.base* %y to void (%class.base*)***
				%L = load i32, i32* %P
				%C = icmp eq i32 %L, 42
				br i1 %C, label %T, label %F

				T:
				%pi = load i32, i32* %i, align 4
				ret i32 %pi
				F:
				%pl = load i32, i32* %P
				ret i32 %pl
				}


				; Check that the bitcast is not hoisted because of a cycle
				; due to indirect branches
				; CHECK-LABEL: @test16
				; CHECK-LABEL: B2:
				; CHECK-NEXT: bitcast
				; CHECK-LABEL: BrBlock:
				; CHECK-NEXT: bitcast

				define i32 @test16(i32* %P, i8* %Ptr, i32* nocapture readonly %i) {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x= getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				%y = load %class.base*, %class.base** %x, align 8
				indirectbr i8* %Ptr, [label %BrBlock, label %B2]

				B2:
				%b1 = bitcast %class.base* %y to void (%class.base*)***
				%0 = load i32, i32* %i, align 4
				store i32 %0, i32 *%P
				br label %BrBlock

				BrBlock:
				%b2 = bitcast %class.base* %y to void (%class.base*)***
				%L = load i32, i32* %P
				%C = icmp eq i32 %L, 42
				br i1 %C, label %T, label %F

				T:
				indirectbr i32* %P, [label %BrBlock, label %B2]

				F:
				indirectbr i8* %Ptr, [label %BrBlock, label %B2]
				}


				@_ZTIi = external constant i8*

				; Check that an instruction is not hoisted out of landing pad (%lpad4)
				; Also within a landing pad no redundancies are removed by gvn-hoist,
				; however an instruction may be hoisted into a landing pad if
				; landing pad has direct branches (e.g., %lpad to %catch1, %catch)
				; This CFG has a cycle (%lpad -> %catch1 -> %lpad4 -> %lpad)

				; CHECK-LABEL: @foo2
				; Check that nothing gets hoisted out of %lpad
				; CHECK-LABEL: lpad:
				; CHECK: %bc1 = add i32 %0, 10
				; CHECK: %bc7 = add i32 %0, 10

				; Check that the add is hoisted
				; CHECK-LABEL: catch1:
				; CHECK-NEXT: invoke

				; Check that the add is hoisted
				; CHECK-LABEL: catch:
				; CHECK-NEXT: load

				; Check that other adds are not hoisted
				; CHECK-LABEL: lpad4:
				; CHECK: %bc5 = add i32 %0, 10
				; CHECK-LABEL: unreachable:
				; CHECK: %bc2 = add i32 %0, 10

				; Function Attrs: noinline uwtable
				define i32 @foo2(i32* nocapture readonly %i) local_unnamed_addr personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
				entry:
				%0 = load i32, i32* %i, align 4
				%cmp = icmp eq i32 %0, 0
				br i1 %cmp, label %try.cont, label %if.then

				if.then:
				%exception = tail call i8* @__cxa_allocate_exception(i64 4) #2
				%1 = bitcast i8* %exception to i32*
				store i32 %0, i32* %1, align 16
				invoke void @__cxa_throw(i8* %exception, i8* bitcast (i8** @_ZTIi to i8*), i8* null) #3
				to label %unreachable unwind label %lpad

				lpad:
				%2 = landingpad { i8*, i32 }
				catch i8* bitcast (i8** @_ZTIi to i8*)
				catch i8* null
				%bc1 = add i32 %0, 10
				%3 = extractvalue { i8*, i32 } %2, 0
				%4 = extractvalue { i8*, i32 } %2, 1
				%5 = tail call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTIi to i8*)) #2
				%matches = icmp eq i32 %4, %5
				%bc7 = add i32 %0, 10
				%6 = tail call i8* @__cxa_begin_catch(i8* %3) #2
				br i1 %matches, label %catch1, label %catch

				catch1:
				%bc3 = add i32 %0, 10
				invoke void @__cxa_rethrow() #3
				to label %unreachable unwind label %lpad4

				catch:
				%bc4 = add i32 %0, 10
				%7 = load i32, i32* %i, align 4
				%add = add nsw i32 %7, 1
				tail call void @__cxa_end_catch()
				br label %try.cont

				lpad4:
				%8 = landingpad { i8*, i32 }
				cleanup
				%bc5 = add i32 %0, 10
				tail call void @__cxa_end_catch() #2
				invoke void @__cxa_throw(i8* %exception, i8* bitcast (i8** @_ZTIi to i8*), i8* null) #3
				to label %unreachable unwind label %lpad

				try.cont:
				%k.0 = phi i32 [ %add, %catch ], [ 0, %entry ]
				%bc6 = add i32 %0, 10
				ret i32 %k.0

				unreachable:
				%bc2 = add i32 %0, 10
				ret i32 %bc2
				}

				declare i8* @__cxa_allocate_exception(i64) local_unnamed_addr

				declare void @__cxa_throw(i8*, i8*, i8*) local_unnamed_addr

				declare i32 @__gxx_personality_v0(...)

				; Function Attrs: nounwind readnone
				declare i32 @llvm.eh.typeid.for(i8*) #1

				declare i8* @__cxa_begin_catch(i8*) local_unnamed_addr

				declare void @__cxa_end_catch() local_unnamed_addr

				declare void @__cxa_rethrow() local_unnamed_addr

				attributes #1 = { nounwind readnone }
				attributes #2 = { nounwind }
				attributes #3 = { noreturn }

Diff 114971