This is an archive of the discontinued LLVM Phabricator instance.


Differential D35918

[GVNHoist] Factor out reachability to search for anticipable instructions quickly
Closed, Public

Authored by hiraditya on Jul 26 2017, 2:36 PM.

Details

Reviewers

• dberlin
sebpop
gberry
davide

Commits

rGdfa8741c9693: [GVNHoist] Factor out reachability to search for anticipable instructions…
rL313116: [GVNHoist] Factor out reachability to search for anticipable instructions…

Summary

Factor out the reachability computation so that repeated reachability queries on values are fast. This is based on finding the ANTIC points in the CFG, which do not change during hoisting. The ANTIC points are essentially the dominance frontiers in the inverse graph. We therefore introduce a data structure (CHI nodes) to keep track of values flowing out of a basic block. We only do this for values with multiple occurrences in the function, as they are the potential hoisting candidates.

This patch allows us to hoist instructions to a basic block with >2 successors, as well as deal with infinite loops in a trivial way.
Relevant test cases are added to show the functionality as well as regression fixes from PR32821.

Regression from previous GVNHoist:
We do not hoist fully redundant expressions, because they are already handled by NewGVN.

Based on the suggestions from @dberlin and @sebpop.

Diff Detail

Event Timeline

hiraditya created this revision. Jul 26 2017, 2:36 PM

Herald added a subscriber: Prazek. Jul 26 2017, 2:36 PM

FYI: IDFCalculator can compute the PostDominanceFrontier in linear time.

In D35918#822143, @dberlin wrote:

FYI: IDFCalculator can compute the PostDominanceFrontier in linear time.

I was unable to find how to pass the root node of inverse graph, because there might be more than one root.
Thanks,

No.
The graph always has one root, virtual or not.
Once kuba's latest patch is committed, it will *always* be virtual, actually.

But i'm not sure why that would stop you one way or the other?
The IDF calculator only asks for the defining blocks, not the root.
You can hand it all CFG blocks and it should work fine.

What do you believe you need to hand it the root node for?

For example, the dominance frontier of the root node is guaranteed to be empty.
"the dominance frontier of a node d is the set of all nodes n such that d dominates an immediate predecessor of n, but d does not strictly dominate n."

The root node strictly dominates everything but itself, and it has no immediate predecessors (successors on the reverse graph).
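
The quoted definition can be checked on a tiny graph. The following is a minimal sketch, not LLVM code: `Graph`, `computeDominators` and `dominanceFrontiers` are hypothetical names, node 0 is assumed to be the root, and every node is assumed reachable from it. Running the same code on the reversed CFG would give post-dominance frontiers.

```cpp
#include <set>
#include <vector>

using Graph = std::vector<std::vector<int>>; // Succs[n] = successors of n

// Dom[n] = set of nodes that dominate n (simple iterative data-flow).
// Assumes node 0 is the root and all nodes are reachable from it.
std::vector<std::set<int>> computeDominators(const Graph &Succs) {
  const int N = static_cast<int>(Succs.size());
  std::vector<std::vector<int>> Preds(N);
  for (int U = 0; U < N; ++U)
    for (int V : Succs[U])
      Preds[V].push_back(U);

  std::set<int> All;
  for (int I = 0; I < N; ++I)
    All.insert(I);

  std::vector<std::set<int>> Dom(N, All);
  Dom[0] = {0};
  for (bool Changed = true; Changed;) {
    Changed = false;
    for (int V = 1; V < N; ++V) {
      // Intersect the dominator sets of all predecessors, then add V itself.
      std::set<int> New = All;
      for (int P : Preds[V]) {
        std::set<int> Tmp;
        for (int X : New)
          if (Dom[P].count(X))
            Tmp.insert(X);
        New = Tmp;
      }
      New.insert(V);
      if (New != Dom[V]) {
        Dom[V] = New;
        Changed = true;
      }
    }
  }
  return Dom;
}

// DF(D) = { S : D dominates a predecessor of S, but D does not strictly
// dominate S }, applied literally.
std::vector<std::set<int>> dominanceFrontiers(const Graph &Succs) {
  const std::vector<std::set<int>> Dom = computeDominators(Succs);
  const int N = static_cast<int>(Succs.size());
  std::vector<std::set<int>> DF(N);
  for (int P = 0; P < N; ++P)
    for (int S : Succs[P])   // P is an immediate predecessor of S
      for (int D : Dom[P])   // every D that dominates P
        if (D == S || !Dom[S].count(D))
          DF[D].insert(S);
  return DF;
}
```

On the diamond 0 -> {1, 2}, 1 -> 3, 2 -> 3, nodes 1 and 2 each have node 3 in their frontier, while the root's frontier stays empty, as the comment above argues.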

My understanding is that IDFCalculator::calculate populates a vector with all the dominance frontiers of the defining blocks (AFAICT from the usage in ADCE.cpp).
How can I create a mapping from a basic block to its (post-)dominance frontier? For this pass I would need such a mapping. I guess I'm having difficulty understanding the code. Thanks for the help.

Such a mapping is by definition n^2 space, and i'm having trouble seeing why it is necessary.

Here is what you are doing:
For each instruction with the same VN:

find the post-dominance frontier of the block of the instruction
Insert a chi there, with certain arguments.

This is unnecessarily wasteful (you may compute the same pdf again and again).

Here is what SSUPRE (and you), should do:

Collect all the blocks of the instructions with the same VN into defining blocks.
Compute the PDF using IDFCalculator.
Place empty chis in the PDF.
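
To make the "compute the PDF for all defining blocks at once" step concrete, here is a naive worklist sketch of an iterated frontier. It is only meant to show what set is being computed. It is not how IDFCalculator works internally, and `FrontierMap` and `iteratedFrontier` are hypothetical names. Block IDs are plain integers and the per-block frontiers are assumed to be given.

```cpp
#include <map>
#include <set>
#include <vector>

// Hypothetical input: block id -> its (post-)dominance frontier.
using FrontierMap = std::map<int, std::set<int>>;

// Returns the iterated frontier of DefBlocks: the frontier of the defining
// blocks, plus the frontier of every block added along the way.
std::set<int> iteratedFrontier(const FrontierMap &DF,
                               const std::set<int> &DefBlocks) {
  std::set<int> Result;
  std::vector<int> Work(DefBlocks.begin(), DefBlocks.end());
  while (!Work.empty()) {
    int B = Work.back();
    Work.pop_back();
    auto It = DF.find(B);
    if (It == DF.end())
      continue; // block with an empty frontier
    for (int F : It->second)
      if (Result.insert(F).second) // newly discovered frontier block
        Work.push_back(F);
  }
  return Result;
}
```

Each block in the returned set is a place where an empty chi would be inserted for the value number under consideration.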

At this point, you have two options:

Walk post-dominator tree top-down and use a stack to store the last value you see.
When you hit a chi from a given edge, the value to use as the argument is at the top of the stack.

This is O(Basic Blocks)

The O(instructions+chis) way to do it is:

Make a vector of instructions and chi argument uses. Each should be given the DFS in/out numbers from the post-dominator tree node for the basic block it comes from and, in the case of instructions, a local DFS number (i.e. its order in the block).
A "chi argument use" is created for each incoming edge to the chi, but is empty/fake. These should assume the basic block from the other side of the edge (i.e. not the chi block, but the edge to the chi block).
Sort by DFS in/out, then by local number.

Walk the vector with a stack.
At each element of the vector:
  while (stack is not empty && DFS in/out of the current element is not inside the DFS in/out of the top of the stack)
    pop stack
  if the element you are looking at is a chi use:
    if stack is empty, the chi has a null operand
    if stack is not empty, set the chi argument for the edge to the top of the stack
  else: // must be an instruction
    if stack is empty, push onto stack
    if stack is not empty, the thing on the stack post-dominates you and you are redundant :)

The forwards version of this algorithm is used by predicateinfo to do SSA renaming.
Your algorithm is the same on the reverse graph, except the chi arguments are virtual :)

I tried the post-dominator tree walk based on your suggestions, but the post-dominator tree does not work well with infinite loops or CFGs with multiple exits.
I can wait for @kuhar's patch to be merged and stabilized, and then work on this idea.

Because this patch precomputes ANTIC points based on your suggestions, it is already faster than the previous fix, which iterated over the dominator tree for each instruction to be hoisted.
Can we merge this patch if you think it is good to go? I'll then work on this idea once the post-dominator patch by @kuhar is ready.
Thank you for writing the algorithm here; it helped me realize how close the algorithm is to PHI insertion.

For some reason phabricator missed the following comments:

If you are willing to commit to making it better, i'm happy to review this version :)

Yes, I'll update gvn-hoist once post-dominator patch is merged.

Thanks,

-Aditya

From: Daniel Berlin <dberlin at dberlin.org>
Sent: Tuesday, August 1, 2017 11:11 AM
To: reviews+D35918+public+40ddab4ebe264bbb at reviews.llvm.org
Cc: Aditya Kumar; Sebastian Pop; Geoff Berry; Davide Italiano; Kuba Kuderski; Piotr Padlewski; llvm-commits
Subject: Re: [PATCH] D35918: [GVNHoist] Factor out reachability to search for anticipable instructions quickly

If you are willing to commit to making it better, i'm happy to review this version :)

@dberlin, @hiraditya

In D35918#822192, @dberlin wrote:

The graph always has one root, virtual or not.
Once kuba's latest patch is committed, it will *always* be virtual, actually.

That's actually already the case upstream since r309146 (D35597).
https://github.com/llvm-mirror/llvm/blob/22072158f353cc89a7821b13a7c3e99daa5be464/include/llvm/Support/GenericDomTreeConstruction.h#L263
https://github.com/llvm-mirror/llvm/blob/22072158f353cc89a7821b13a7c3e99daa5be464/include/llvm/Support/GenericDomTreeConstruction.h#L307

Can you illustrate how to perform a DFS walk on a post-dom tree with your patch? Also, how would I get the (virtual) root node of the post-dom tree?

Thanks,

You can get the virtual root by calling PDT.getNode(nullptr). Then you can use something like llvm/ADT/DepthFirstIterator.h or llvm/ADT/PostOrderIterator.h to run DFS on it. Implementing a custom DFS should also be easy.

I tried a simple DF walk like this; it crashes the compiler on the test cases llvm/test/Transforms/GVNHoist/infinite-loop-indirect.ll and llvm/test/Transforms/GVNHoist/infinite-loop-direct.ll:

auto PrevBB = PDT->getNode(nullptr);
for (auto it = df_begin(PrevBB); it != df_end(PrevBB);
     ++it) {
}

Because PrevBB is null when PrevBB = PDT->getNode(nullptr).

These test cases are added in this patch.

Thanks,

I tried running a loop like yours in a couple of my unittests (DominatorTreeTest.cpp) and it seems to work:

auto PrevBB = PDT->getNode(nullptr);
for (auto it = df_begin(PrevBB); it != df_end(PrevBB);
     ++it) {
  auto BB = it->getBlock();
  outs() << (BB ? BB->getName() : "virtual root") << "\n";
}

Could you prepare a reduced repro for the crash you saw?

Thanks,
Kuba

Try ./bin/opt -S -adce < llvm/test/Transforms/GVNHoist/infinite-loop-direct.ll

with the following patch. I added the code here for easy reproducibility.

diff --git a/lib/Transforms/Scalar/ADCE.cpp b/lib/Transforms/Scalar/ADCE.cpp
index 5b467dc..eb711b9 100644
--- a/lib/Transforms/Scalar/ADCE.cpp
+++ b/lib/Transforms/Scalar/ADCE.cpp
@@ -164,6 +164,10 @@ public:
 }

 bool AggressiveDeadCodeElimination::performDeadCodeElimination() {
+  auto PrevBB = PDT.getNode(nullptr);
+  for (auto it = df_begin(PrevBB), E = df_end(PrevBB); it != E; ++it) {
+      dbgs() << "\nTest\n";
+  }

Here is the reduced test case:

$ cat a.ll
; ModuleID = 'bugpoint-reduced-simplified.bc'
source_filename = "bugpoint-output-f4ca947.bc"
target triple = "x86_64-unknown-linux-gnu"

define void @bazv1() local_unnamed_addr {
entry:
  br label %while.cond

while.cond:                                       ; preds = %while.cond, %entry
  br label %while.cond
}

Oh, yes, that happens because the entire function is reverse-unreachable, which causes the tree to be completely empty. D35851 puts reverse-unreachable CFG nodes in the tree and solves this problem.
In the meantime, you can check whether PDT.getNode(nullptr) returns nullptr and skip the iteration in that situation.
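
The guard suggested here can be illustrated without LLVM. The sketch below uses a hypothetical `TreeNode` type as a stand-in for a dominator tree node, with an empty name modelling the virtual root. It shows a preorder walk that returns early when the root lookup yields a null pointer.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a dominator tree node (not an LLVM type).
struct TreeNode {
  std::string Name; // empty name models the virtual root
  std::vector<TreeNode *> Children;
};

// Preorder depth-first walk that tolerates an empty tree (null root).
std::vector<std::string> preorder(const TreeNode *Root) {
  std::vector<std::string> Order;
  if (!Root) // empty tree: nothing to visit
    return Order;
  std::vector<const TreeNode *> Work{Root};
  while (!Work.empty()) {
    const TreeNode *N = Work.back();
    Work.pop_back();
    Order.push_back(N->Name.empty() ? std::string("virtual root") : N->Name);
    // Push children in reverse so they are visited left to right.
    for (auto It = N->Children.rbegin(); It != N->Children.rend(); ++It)
      Work.push_back(*It);
  }
  return Order;
}
```

With a real PostDominatorTree, the equivalent check would be applied to the result of PDT.getNode(nullptr) before handing it to df_begin/df_end.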

Based on suggestions from @dberlin, I have updated the patch:

Compute iterated post-dominance frontiers
Iterate on post-dominator tree to insert CHIargs more efficiently

The time complexity would now be O(duplicate-values * N), where N is the number of nodes in the CFG. For each duplicate value, the algorithm finds its control dependence by computing the iterated post-dominance frontier and then inserts CHI args to track anticipability.

Added Comments

This looks reasonable at a glance; note that it is likely to be subtly broken in some cases until D35381 goes in.
I've added a few comments, I'll take a deeper look in a little bit.

lib/Transforms/Scalar/GVNHoist.cpp
751	What are you really trying to do here? (IE why is the continue commented out)
954	I think since you wrote this we added moving functions you could use here.

Remove dead comments.

hiraditya added inline comments. Aug 11 2017, 12:06 PM

lib/Transforms/Scalar/GVNHoist.cpp
751	That was from the early stage when I only handled unique post-dom frontiers. I've removed those lines.
954	I'll look into it and update this code. Thanks for the feedback.

Updated MemorySSA updates.
Outlined functions from codegen part.

I think this is mostly reasonable at this point.
If you clean up the naming so it matches the style guide (IE you have a lot of things ending in T for no reason, etc), i'll approve it.

lib/Transforms/Scalar/GVNHoist.cpp
87	This is unused
90	This is unused
305	This is unused

Addressed @dberlin 's comments.

With these changes, i think it is a good start.

lib/Transforms/Scalar/GVNHoist.cpp
119–123	All of the uses of this type should just be auto, AFAICT
560	This keeps going after it finds a match, despite the fact that the boolean won't change. I would instead use any_of (you probably need to include STLExtras.h): auto Found = any_of(TI->successors(), [&](const BasicBlock *BB) { return BB == Dest; });
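
A standalone illustration of the same idea, using std::any_of as a stand-in for the range-based llvm::any_of from STLExtras.h. `Block` and `hasSuccessor` are hypothetical names introduced only for this sketch. The point is that the search stops at the first match, and that the lambda must capture `Dest` to compare against it.

```cpp
#include <algorithm>
#include <vector>

struct Block { int Id; }; // hypothetical stand-in for llvm::BasicBlock

// Returns true if Dest is one of the successors; stops at the first match.
bool hasSuccessor(const std::vector<const Block *> &Succs, const Block *Dest) {
  return std::any_of(Succs.begin(), Succs.end(),
                     [Dest](const Block *BB) { return BB == Dest; });
}
```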

This revision is now accepted and ready to land. Aug 24 2017, 12:28 PM

use any_of instead of iterating over the successors.

hiraditya marked an inline comment as done. Sep 6 2017, 2:30 PM

hiraditya added inline comments.

lib/Transforms/Scalar/GVNHoist.cpp
119–123	CHIIt is used as a function argument in a couple of places.

Sure, it's a function argument in a few other places.
The other places, you should be using auto :)
I marked a bunch of them.

In general, the only reason not to use auto is if the type is important and non-obvious.

In most of the cases, you would be better off naming variables better than exposing the types.

IE instead of CHIArg C, use auto CHIArg
(auto C is also fine if you think the type is obvious or unimportant)

lib/Transforms/Scalar/GVNHoist.cpp
553	Pass in a range instead of two iterators.
556	auto C
573	Pass in a range instead?
575–581	range based for loop?
642	auto VCHI
643	auto It
667	depth_first instead of explicit iterator
675	auto &A
685	auto
687	auto

added auto to relevant places.

hiraditya marked 4 inline comments as done. Sep 8 2017, 2:59 PM

use depth_first

Closed by commit rL313116: [GVNHoist] Factor out reachability to search for anticipable instructions… (authored by hiraditya). Sep 12 2017, 10:29 PM

This revision was automatically updated to reflect the committed changes.

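Revision Contents

Path                                                       Size
lib/Transforms/Scalar/GVNHoist.cpp                         554 lines
test/Transforms/GVNHoist/hoist-more-than-two-branches.ll   31 lines
test/Transforms/GVNHoist/hoist-mssa.ll                     2 lines
test/Transforms/GVNHoist/hoist-newgvn.ll                   105 lines
test/Transforms/GVNHoist/hoist-pr20242.ll                  5 lines
test/Transforms/GVNHoist/hoist-pr28933.ll                  3 lines
test/Transforms/GVNHoist/hoist-recursive-geps.ll           11 lines
test/Transforms/GVNHoist/hoist.ll                          108 lines
test/Transforms/GVNHoist/infinite-loop-direct.ll           96 lines
test/Transforms/GVNHoist/infinite-loop-indirect.ll         285 lines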

Diff 110784

lib/Transforms/Scalar/GVNHoist.cpp

//===- GVNHoist.cpp - Hoist scalar and load expressions -------------------===//		//===- GVNHoist.cpp - Hoist scalar and load expressions -------------------===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This pass hoists expressions from branches to a common dominator. It uses		// This pass hoists expressions from branches to a common dominator. It uses
// GVN (global value numbering) to discover expressions computing the same		// GVN (global value numbering) to discover expressions computing the same
// values. The primary goals of code-hoisting are:		// values. The primary goals of code-hoisting are:
// 1. To reduce the code size.		// 1. To reduce the code size.
// 2. In some cases reduce critical path (by exposing more ILP).		// 2. In some cases reduce critical path (by exposing more ILP).
//		//
		// The algorithm factors out the reachability of values such that multiple
		// queries to find reachability of values are fast. This is based on finding the
		// ANTIC points in the CFG which do not change during hoisting. The ANTIC points
		// are basically the dominance-frontiers in the inverse graph. So we introduce a
		// data structure (CHI nodes) to keep track of values flowing out of a basic
		// block. We only do this for values with multiple occurrences in the function
		// as they are the potential hoistable candidates. This approach allows us to
		// hoist instructions to a basic block with more than two successors, as well as
		// deal with infinite loops in a trivial way.
		//
		// Limitations: This pass does not hoist fully redundant expressions because
		// they are already handled by GVN-PRE. It is advisable to run gvn-hoist before
		// and after gvn-pre because gvn-pre creates opportunities for more instructions
		// to be hoisted.
		//
// Hoisting may affect the performance in some cases. To mitigate that, hoisting		// Hoisting may affect the performance in some cases. To mitigate that, hoisting
// is disabled in the following cases.		// is disabled in the following cases.
// 1. Scalars across calls.		// 1. Scalars across calls.
// 2. geps when corresponding load/store cannot be hoisted.		// 2. geps when corresponding load/store cannot be hoisted.
//
// TODO: Hoist from >2 successors. Currently GVNHoist will not hoist stores
// in this case because it works on two instructions at a time.
// entry:
// switch i32 %c1, label %exit1 [
// i32 0, label %sw0
// i32 1, label %sw1
// ]
//
// sw0:
// store i32 1, i32* @G
// br label %exit
//
// sw1:
// store i32 1, i32* @G
// br label %exit
//
// exit1:
// store i32 1, i32* @G
// ret void
// exit:
// ret void
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/ADT/DenseMap.h"		#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/SmallPtrSet.h"		#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/Analysis/GlobalsModRef.h"		#include "llvm/Analysis/GlobalsModRef.h"
		#include "llvm/Analysis/IteratedDominanceFrontier.h"
#include "llvm/Analysis/MemorySSA.h"		#include "llvm/Analysis/MemorySSA.h"
#include "llvm/Analysis/MemorySSAUpdater.h"		#include "llvm/Analysis/MemorySSAUpdater.h"
		#include "llvm/Analysis/PostDominators.h"
#include "llvm/Analysis/ValueTracking.h"		#include "llvm/Analysis/ValueTracking.h"
#include "llvm/Transforms/Scalar.h"		#include "llvm/Transforms/Scalar.h"
#include "llvm/Transforms/Scalar/GVN.h"		#include "llvm/Transforms/Scalar/GVN.h"
#include "llvm/Transforms/Utils/Local.h"		#include "llvm/Transforms/Utils/Local.h"

		#include <stack>

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "gvn-hoist"		#define DEBUG_TYPE "gvn-hoist"

STATISTIC(NumHoisted, "Number of instructions hoisted");		STATISTIC(NumHoisted, "Number of instructions hoisted");
STATISTIC(NumRemoved, "Number of instructions removed");		STATISTIC(NumRemoved, "Number of instructions removed");
STATISTIC(NumLoadsHoisted, "Number of loads hoisted");		STATISTIC(NumLoadsHoisted, "Number of loads hoisted");
STATISTIC(NumLoadsRemoved, "Number of loads removed");		STATISTIC(NumLoadsRemoved, "Number of loads removed");
Show All 18 Lines

static cl::opt<int>		static cl::opt<int>
MaxChainLength("gvn-hoist-max-chain-length", cl::Hidden, cl::init(10),		MaxChainLength("gvn-hoist-max-chain-length", cl::Hidden, cl::init(10),
cl::desc("Maximum length of dependent chains to hoist "		cl::desc("Maximum length of dependent chains to hoist "
"(default = 10, unlimited = -1)"));		"(default = 10, unlimited = -1)"));

namespace llvm {		namespace llvm {

// Provides a sorting function based on the execution order of two instructions.		typedef DenseMap<const BasicBlock *, bool> BBSideEffectsSet;
struct SortByDFSIn {		typedef std::pair<BasicBlock , BasicBlock > PairBB;
		dberlinUnsubmitted Not Done Reply Inline Actions This is unused dberlin: This is unused
private:		typedef SmallVector<Instruction *, 4> SmallVecInsn;
DenseMap<const Value *, unsigned> &DFSNumber;		typedef SmallVectorImpl<Instruction *> SmallVecImplInsn;
		typedef DenseMap<BasicBlock *, PairBB> PDomMapT;
		dberlinUnsubmitted Not Done Reply Inline Actions This is unused dberlin: This is unused
public:		// Each element of a hoisting list contains the basic block where to hoist and
SortByDFSIn(DenseMap<const Value *, unsigned> &D) : DFSNumber(D) {}		// a list of instructions to be hoisted.
		typedef std::pair<BasicBlock *, SmallVecInsn> HoistingPointInfo;
		typedef SmallVector<HoistingPointInfo, 4> HoistingPointList;
		// A map from a pair of VNs to all the instructions with those VNs.
		typedef std::pair<unsigned, unsigned> VNType;
		typedef DenseMap<VNType, SmallVector<Instruction *, 4>> VNtoInsns;

// Returns true when A executes before B.		// CHI keeps information about values flowing out of a basic block. It is
bool operator()(const Instruction A, const Instruction B) const {		// similar to PHI but in the inverse graph, and used for outgoing values on each
const BasicBlock *BA = A->getParent();		// edge. For conciseness, it is computed only for instructions with multiple
const BasicBlock *BB = B->getParent();		// occurrences in the CFG because they are the only hoistable candidates.
unsigned ADFS, BDFS;		// A (CHI[{V, B, I1}, {V, C, I2}]
if (BA == BB) {		// / \
ADFS = DFSNumber.lookup(A);		// / \
BDFS = DFSNumber.lookup(B);		// B(I1) C (I2)
} else {		// The Value number for both I1 and I2 is V, the CHI node will save the
ADFS = DFSNumber.lookup(BA);		// instruction as well as the edge where the value is flowing to.
BDFS = DFSNumber.lookup(BB);		struct CHIArg {
}		VNType VN;
assert(ADFS && BDFS);		// Edge destination (shows the direction of flow), may not be where the I is.
return ADFS < BDFS;		BasicBlock *Dest;
}		// The instruction (VN) which uses the values flowing out of CHI.
		Instruction *I;
		bool operator==(const CHIArg &A) { return VN == A.VN; }
		bool operator!=(const CHIArg &A) { return !(*this == A); }
};		};

// A map from a pair of VNs to all the instructions with those VNs.		typedef SmallVectorImpl<CHIArg>::iterator CHIIt;
typedef DenseMap<std::pair<unsigned, unsigned>, SmallVector<Instruction *, 4>>		typedef DenseMap<BasicBlock *, SmallVector<CHIArg, 2>> OutValuesT;
VNtoInsns;		typedef DenseMap<BasicBlock , SmallVector<std::pair<VNType, Instruction >, 2>>
		InValuesT;

		dberlinUnsubmitted Not Done Reply Inline Actions All of the uses of this type should just be auto, AFAICT dberlin: All of the uses of this type should just be auto, AFAICT
		hiradityaAuthorUnsubmitted Not Done Reply Inline Actions CHHIt is used in function argument at a couple of places. hiraditya: CHHIt is used in function argument at a couple of places.
// An invalid value number Used when inserting a single value number into		// An invalid value number Used when inserting a single value number into
// VNtoInsns.		// VNtoInsns.
enum : unsigned { InvalidVN = ~2U };		enum : unsigned { InvalidVN = ~2U };

// Records all scalar instructions candidate for code hoisting.		// Records all scalar instructions candidate for code hoisting.
class InsnInfo {		class InsnInfo {
VNtoInsns VNtoScalars;		VNtoInsns VNtoScalars;

▲ Show 20 Lines • Show All 68 Lines • ▼ Show 20 Lines	public:

const VNtoInsns &getScalarVNTable() const { return VNtoCallsScalars; }		const VNtoInsns &getScalarVNTable() const { return VNtoCallsScalars; }

const VNtoInsns &getLoadVNTable() const { return VNtoCallsLoads; }		const VNtoInsns &getLoadVNTable() const { return VNtoCallsLoads; }

const VNtoInsns &getStoreVNTable() const { return VNtoCallsStores; }		const VNtoInsns &getStoreVNTable() const { return VNtoCallsStores; }
};		};

typedef DenseMap<const BasicBlock *, bool> BBSideEffectsSet;
typedef SmallVector<Instruction *, 4> SmallVecInsn;
typedef SmallVectorImpl<Instruction *> SmallVecImplInsn;

static void combineKnownMetadata(Instruction ReplInst, Instruction I) {		static void combineKnownMetadata(Instruction ReplInst, Instruction I) {
static const unsigned KnownIDs[] = {		static const unsigned KnownIDs[] = {
LLVMContext::MD_tbaa, LLVMContext::MD_alias_scope,		LLVMContext::MD_tbaa, LLVMContext::MD_alias_scope,
LLVMContext::MD_noalias, LLVMContext::MD_range,		LLVMContext::MD_noalias, LLVMContext::MD_range,
LLVMContext::MD_fpmath, LLVMContext::MD_invariant_load,		LLVMContext::MD_fpmath, LLVMContext::MD_invariant_load,
LLVMContext::MD_invariant_group};		LLVMContext::MD_invariant_group};
combineMetadata(ReplInst, I, KnownIDs);		combineMetadata(ReplInst, I, KnownIDs);
}		}

// This pass hoists common computations across branches sharing common		// This pass hoists common computations across branches sharing common
// dominator. The primary goal is to reduce the code size, and in some		// dominator. The primary goal is to reduce the code size, and in some
// cases reduce critical path (by exposing more ILP).		// cases reduce critical path (by exposing more ILP).
class GVNHoist {		class GVNHoist {
public:		public:
GVNHoist(DominatorTree *DT, AliasAnalysis *AA, MemoryDependenceResults *MD,		GVNHoist(DominatorTree *DT, PostDominatorTree *PDT, AliasAnalysis *AA,
MemorySSA *MSSA)		MemoryDependenceResults *MD, MemorySSA *MSSA)
: DT(DT), AA(AA), MD(MD), MSSA(MSSA),		: DT(DT), PDT(PDT), AA(AA), MD(MD), MSSA(MSSA),
MSSAUpdater(make_unique<MemorySSAUpdater>(MSSA)),		MSSAUpdater(make_unique<MemorySSAUpdater>(MSSA)), HoistedCtr(0),
HoistingGeps(false),		HoistingGeps(false) {}
HoistedCtr(0)
{ }

bool run(Function &F) {		bool run(Function &F) {
		NumFuncArgs = F.arg_size();
VN.setDomTree(DT);		VN.setDomTree(DT);
VN.setAliasAnalysis(AA);		VN.setAliasAnalysis(AA);
VN.setMemDep(MD);		VN.setMemDep(MD);
bool Res = false;		bool Res = false;
// Perform DFS Numbering of instructions.		// Perform DFS Numbering of instructions.
unsigned BBI = 0;		unsigned BBI = 0;
for (const BasicBlock *BB : depth_first(&F.getEntryBlock())) {		for (const BasicBlock *BB : depth_first(&F.getEntryBlock())) {
DFSNumber[BB] = ++BBI;		DFSNumber[BB] = ++BBI;
[20 lines not shown]	while (1) {
VN.clear();		VN.clear();

Res = true;		Res = true;
}		}

return Res;		return Res;
}		}

		// Copied from NewGVN.cpp
		// This function provides global ranking of operations so that we can place
		// them in a canonical order. Note that rank alone is not necessarily enough
		// for a complete ordering, as constants all have the same rank. However,
		// generally, we will simplify an operation with all constants so that it
		// doesn't matter what order they appear in.
		unsigned int rank(const Value *V) const {
		// Prefer constants to undef to anything else
		// Undef is a constant, have to check it first.
		// Prefer smaller constants to constantexprs
		if (isa<ConstantExpr>(V))
		return 2;
		if (isa<UndefValue>(V))
		return 1;
		if (isa<Constant>(V))
		return 0;
		else if (auto *A = dyn_cast<Argument>(V))
		return 3 + A->getArgNo();

		// Need to shift the instruction DFS by number of arguments + 3 to account
		// for the constant and argument ranking above.
		auto Result = DFSNumber.lookup(V);
		if (Result > 0)
		return 4 + NumFuncArgs + Result;
		// Unreachable or something else, just return a really large number.
		return ~0;
		}

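The ordering that the rank() comments describe can be sketched outside LLVM. The following is a minimal, self-contained model and not the patch's code: ToyValue, ValueKind and toyRank are hypothetical names, and a DFS number of 0 stands in for "unreachable or not numbered".

```cpp
#include <cassert>

// Hypothetical stand-ins for the value categories the comments distinguish.
enum class ValueKind { Constant, Undef, ConstantExpr, Argument, Instruction };

struct ToyValue {
  ValueKind Kind;
  unsigned ArgNo;  // only meaningful for Argument
  unsigned DFSNum; // only meaningful for Instruction; 0 means "not numbered"
};

// Mirrors the ranking in the diff: plain constants first, then undef, then
// constant expressions, then arguments by position, then instructions by
// DFS number shifted past the constants and arguments.
unsigned toyRank(const ToyValue &V, unsigned NumFuncArgs) {
  switch (V.Kind) {
  case ValueKind::Constant:
    return 0;
  case ValueKind::Undef:
    return 1;
  case ValueKind::ConstantExpr:
    return 2;
  case ValueKind::Argument:
    return 3 + V.ArgNo;
  case ValueKind::Instruction:
    if (V.DFSNum > 0)
      return 4 + NumFuncArgs + V.DFSNum;
    break;
  }
  // Unreachable or unnumbered: rank last.
  return ~0u;
}
```

In this model every instruction ranks above every argument, because the shift 4 + NumFuncArgs is larger than the highest argument rank 3 + (NumFuncArgs - 1).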
private:		private:
GVN::ValueTable VN;		GVN::ValueTable VN;
DominatorTree *DT;		DominatorTree *DT;
		PostDominatorTree *PDT;
AliasAnalysis *AA;		AliasAnalysis *AA;
MemoryDependenceResults *MD;		MemoryDependenceResults *MD;
MemorySSA *MSSA;		MemorySSA *MSSA;
std::unique_ptr<MemorySSAUpdater> MSSAUpdater;		std::unique_ptr<MemorySSAUpdater> MSSAUpdater;
const bool HoistingGeps;
DenseMap<const Value *, unsigned> DFSNumber;		DenseMap<const Value *, unsigned> DFSNumber;
BBSideEffectsSet BBSideEffects;		BBSideEffectsSet BBSideEffects;
DenseSet<const BasicBlock*> HoistBarrier;		DenseSet<const BasicBlock *> HoistBarrier;
		// PDomMapT PDomMap;
		dberlin (Not Done): This is unused.

		SmallVector<BasicBlock *, 32> IDFBlocks;
int HoistedCtr;		int HoistedCtr;
		unsigned NumFuncArgs;
		const bool HoistingGeps;

enum InsKind { Unknown, Scalar, Load, Store };		enum InsKind { Unknown, Scalar, Load, Store };

// Return true when there are exception handling in BB.		// Return true when there are exception handling in BB.
bool hasEH(const BasicBlock *BB) {		bool hasEH(const BasicBlock *BB) {
auto It = BBSideEffects.find(BB);		auto It = BBSideEffects.find(BB);
if (It != BBSideEffects.end())		if (It != BBSideEffects.end())
return It->second;		return It->second;
[16 lines not shown]	private:
bool successorDominate(const BasicBlock *BB, const BasicBlock *A) {		bool successorDominate(const BasicBlock *BB, const BasicBlock *A) {
for (const BasicBlock *Succ : BB->getTerminator()->successors())		for (const BasicBlock *Succ : BB->getTerminator()->successors())
if (DT->dominates(Succ, A))		if (DT->dominates(Succ, A))
return true;		return true;

return false;		return false;
}		}

// Return true when all paths from HoistBB to the end of the function pass
// through one of the blocks in WL.
bool hoistingFromAllPaths(const BasicBlock *HoistBB,
SmallPtrSetImpl<const BasicBlock *> &WL) {

// Copy WL as the loop will remove elements from it.
SmallPtrSet<const BasicBlock *, 2> WorkList(WL.begin(), WL.end());

for (auto It = df_begin(HoistBB), E = df_end(HoistBB); It != E;) {
// There exists a path from HoistBB to the exit of the function if we are
// still iterating in DF traversal and we removed all instructions from
// the work list.
if (WorkList.empty())
return false;

const BasicBlock *BB = *It;
if (WorkList.erase(BB)) {
// Stop DFS traversal when BB is in the work list.
It.skipChildren();
continue;
}

// We reached the leaf Basic Block => not all paths have this instruction.
if (!BB->getTerminator()->getNumSuccessors())
return false;

// When reaching the back-edge of a loop, there may be a path through the
// loop that does not pass through B or C before exiting the loop.
if (successorDominate(BB, HoistBB))
return false;

// Increment DFS traversal when not skipping children.
++It;
}

return true;
}

/* Return true when I1 appears before I2 in the instructions of BB. */		/* Return true when I1 appears before I2 in the instructions of BB. */
bool firstInBB(const Instruction *I1, const Instruction *I2) {		bool firstInBB(const Instruction *I1, const Instruction *I2) {
assert(I1->getParent() == I2->getParent());		assert(I1->getParent() == I2->getParent());
unsigned I1DFS = DFSNumber.lookup(I1);		unsigned I1DFS = DFSNumber.lookup(I1);
unsigned I2DFS = DFSNumber.lookup(I2);		unsigned I2DFS = DFSNumber.lookup(I2);
assert(I1DFS && I2DFS);		assert(I1DFS && I2DFS);
return I1DFS < I2DFS;		return I1DFS < I2DFS;
}		}
[174 lines not shown]	bool safeToHoistLdSt(const Instruction *NewPt, const Instruction *OldPt,
}		}

// No side effects: it is safe to hoist.		// No side effects: it is safe to hoist.
return true;		return true;
}		}

// Return true when it is safe to hoist scalar instructions from all blocks in		// Return true when it is safe to hoist scalar instructions from all blocks in
// WL to HoistBB.		// WL to HoistBB.
bool safeToHoistScalar(const BasicBlock *HoistBB,		bool safeToHoistScalar(const BasicBlock *HoistBB, const BasicBlock *BB,
SmallPtrSetImpl<const BasicBlock *> &WL,
int &NBBsOnAllPaths) {		int &NBBsOnAllPaths) {
// Check that the hoisted expression is needed on all paths.		return !hasEHOnPath(HoistBB, BB, NBBsOnAllPaths);
if (!hoistingFromAllPaths(HoistBB, WL))		}
return false;

for (const BasicBlock *BB : WL)		// In the inverse CFG, the dominance frontier of basic block (BB) is the
if (hasEHOnPath(HoistBB, BB, NBBsOnAllPaths))		// point where ANTIC needs to be computed for instructions which are going
		// to be hoisted. Since this point does not change during gvn-hoist,
		// we compute it only once (on demand).
		// The idea is inspired by:
		// "Partial Redundancy Elimination in SSA Form"
		// ROBERT KENNEDY, SUN CHAN, SHIN-MING LIU, RAYMOND LO, PENG TU and FRED CHOW
		// They use a similar idea in the forward graph to find fully redundant and
		// partially redundant expressions, here it is used in the inverse graph to
		// find fully anticipable instructions at merge point (post-dominator in
		// the inverse CFG).
		// Returns the edge via which an instruction in BB will get the values from.

		// Returns true when the values are flowing out to each edge.
		bool valueAnticipable(CHIIt Begin, CHIIt End, TerminatorInst *TI) const {
		if (TI->getNumSuccessors() > std::distance(Begin, End))
		dberlin (Not Done): Pass in a range instead of two iterators.
		return false; // Not enough args in this CHI.

		for (CHIArg PHIArg : make_range(Begin, End)) {
		dberlin (Not Done): auto C
		BasicBlock *Dest = PHIArg.Dest;
		bool Found = false;
		// Find if all the edges have values flowing out of BB.
		for (const auto &S : TI->successors()) {
		dberlin (Done): This keeps going after it finds anything, despite the fact that the boolean won't change. I would instead use (you probably need to include STLExtras.h): auto Found = any_of(TI->successors(), [](const BasicBlock *BB){ return BB == Dest; });
		if (S == Dest)
		Found = true;
		}
		if (!Found)
return false;		return false;

return true;
}		}
		return true;
// Each element of a hoisting list contains the basic block where to hoist and
// a list of instructions to be hoisted.
typedef std::pair<BasicBlock *, SmallVecInsn> HoistingPointInfo;
typedef SmallVector<HoistingPointInfo, 4> HoistingPointList;

// Partition InstructionsToHoist into a set of candidates which can share a
// common hoisting point. The partitions are collected in HPL. IsScalar is
// true when the instructions in InstructionsToHoist are scalars. IsLoad is
// true when the InstructionsToHoist are loads, false when they are stores.
void partitionCandidates(SmallVecImplInsn &InstructionsToHoist,
HoistingPointList &HPL, InsKind K) {
// No need to sort for two instructions.
if (InstructionsToHoist.size() > 2) {
SortByDFSIn Pred(DFSNumber);
std::sort(InstructionsToHoist.begin(), InstructionsToHoist.end(), Pred);
}		}

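The reviewer's any_of suggestion for valueAnticipable can be illustrated with a small standalone sketch. This is not the patch's code: ToyCHIArg and toyValueAnticipable are hypothetical names, blocks are modelled as integer ids, and std::any_of stands in for the range-based llvm::any_of from STLExtras.h. Note that the lambda has to capture the destination it compares against.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a CHI argument: a value number plus the successor
// block (as an integer id) through which the value flows out.
struct ToyCHIArg {
  int VN;
  int Dest;
};

// Mirrors the check in the diff: there must be at least as many arguments as
// successors, and every argument's destination must be one of the successors.
bool toyValueAnticipable(const std::vector<ToyCHIArg> &Args,
                         const std::vector<int> &Succs) {
  if (Succs.size() > Args.size())
    return false; // Not enough args in this CHI.

  for (const ToyCHIArg &A : Args) {
    // any_of stops at the first match instead of scanning every successor.
    bool Found = std::any_of(Succs.begin(), Succs.end(),
                             [&A](int S) { return S == A.Dest; });
    if (!Found)
      return false;
  }
  return true;
}
```

With three successors, a CHI holding one argument per successor passes the check, while a CHI with only two arguments, or with an argument whose destination is not a successor, is rejected.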
		// Check if it is safe to hoist values tracked by CHI in the range
		// [Begin, End) and accumulate them in Safe.
		void checkSafety(CHIIt Begin, CHIIt End, BasicBlock *BB, InsKind K,
		SmallVectorImpl<CHIArg> &Safe) {
		dberlin (Done): Pass in a range instead?
int NumBBsOnAllPaths = MaxNumberOfBBSInPath;		int NumBBsOnAllPaths = MaxNumberOfBBSInPath;
		for (CHIIt B = Begin; B != End; ++B) {
SmallVecImplInsn::iterator II = InstructionsToHoist.begin();		Instruction *Insn = B->I;
SmallVecImplInsn::iterator Start = II;		if (!Insn) // No instruction was inserted in this CHI.
Instruction *HoistPt = *II;		continue;
BasicBlock *HoistBB = HoistPt->getParent();		if (K == InsKind::Scalar) {
MemoryUseOrDef *UD;		if (safeToHoistScalar(BB, Insn->getParent(), NumBBsOnAllPaths))
if (K != InsKind::Scalar)		Safe.push_back(*B);
		dberlin (Done): range based for loop?
UD = MSSA->getMemoryAccess(HoistPt);

for (++II; II != InstructionsToHoist.end(); ++II) {
Instruction *Insn = *II;
BasicBlock *BB = Insn->getParent();
BasicBlock *NewHoistBB;
Instruction *NewHoistPt;

if (BB == HoistBB) { // Both are in the same Basic Block.
NewHoistBB = HoistBB;
NewHoistPt = firstInBB(Insn, HoistPt) ? Insn : HoistPt;
} else {		} else {
// If the hoisting point contains one of the instructions,		MemoryUseOrDef *UD = MSSA->getMemoryAccess(Insn);
// then hoist there, otherwise hoist before the terminator.		if (safeToHoistLdSt(BB->getTerminator(), Insn, UD, K, NumBBsOnAllPaths))
NewHoistBB = DT->findNearestCommonDominator(HoistBB, BB);		Safe.push_back(*B);
if (NewHoistBB == BB)		}
NewHoistPt = Insn;		}
else if (NewHoistBB == HoistBB)
NewHoistPt = HoistPt;
else
NewHoistPt = NewHoistBB->getTerminator();
}		}

SmallPtrSet<const BasicBlock *, 2> WL;		typedef DenseMap<VNType, SmallVector<Instruction *, 2>> RenameStackT;
WL.insert(HoistBB);		// Push all the VNs corresponding to BB into RenameStack.
WL.insert(BB);		void fillRenameStack(BasicBlock *BB, InValuesT &ValueBBs,
		RenameStackT &RenameStack) {
		auto it1 = ValueBBs.find(BB);
		if (it1 != ValueBBs.end()) {
		// Iterate in reverse order to keep lower ranked values on the top.
		for (std::pair<VNType, Instruction *> &VI : reverse(it1->second)) {
		// Get the value of instruction I
		DEBUG(dbgs() << "\nPushing on stack: " << *VI.second);
		RenameStack[VI.first].push_back(VI.second);
		}
		}
		}

if (K == InsKind::Scalar) {		void fillChiArgs(BasicBlock *BB, OutValuesT &CHIBBs,
if (safeToHoistScalar(NewHoistBB, WL, NumBBsOnAllPaths)) {		RenameStackT &RenameStack) {
// Extend HoistPt to NewHoistPt.		// For each predecessor (because Post-DOM) of BB check if it has a CHI
HoistPt = NewHoistPt;		for (auto Pred : predecessors(BB)) {
HoistBB = NewHoistBB;		auto P = CHIBBs.find(Pred);
		if (P == CHIBBs.end()) {
continue;		continue;
}		}
} else {		DEBUG(dbgs() << "\nLooking at CHIs in: " << Pred->getName(););
// When NewBB already contains an instruction to be hoisted, the		// A CHI is found (BB -> Pred is an edge in the CFG)
// expression is needed on all paths.		// Pop the stack until Top(V) = Ve.
// Check that the hoisted expression is needed on all paths: it is		SmallVectorImpl<CHIArg> &VCHI = P->second;
// unsafe to hoist loads to a place where there may be a path not		for (SmallVectorImpl<CHIArg>::iterator It = VCHI.begin(), E = VCHI.end();
// loading from the same address: for instance there may be a branch on		It != E;) {
// which the address of the load may not be initialized.		CHIArg &C = *It;
if ((HoistBB == NewHoistBB || BB == NewHoistBB ||		if (!C.Dest) {
hoistingFromAllPaths(NewHoistBB, WL)) &&		auto si = RenameStack.find(C.VN);
// Also check that it is safe to move the load or store from HoistPt		// The Basic Block where CHI is must dominate the value we want to
// to NewHoistPt, and from Insn to NewHoistPt.		// track in a CHI. In the PDom walk, there can be values in the
safeToHoistLdSt(NewHoistPt, HoistPt, UD, K, NumBBsOnAllPaths) &&		// stack which are not control dependent e.g., nested loop.
safeToHoistLdSt(NewHoistPt, Insn, MSSA->getMemoryAccess(Insn),		if (si != RenameStack.end() && si->second.size() &&
K, NumBBsOnAllPaths)) {		DT->dominates(Pred, si->second.back()->getParent())) {
// Extend HoistPt to NewHoistPt.		C.Dest = BB; // Assign the edge
HoistPt = NewHoistPt;		C.I = si->second.pop_back_val(); // Assign the argument
HoistBB = NewHoistBB;		DEBUG(dbgs() << "\nCHI Inserted in BB: " << C.Dest->getName()
		<< *C.I << ", VN: " << C.VN.first << ", "
		<< C.VN.second);
		}
		// Move to next CHI of a different value
		It = std::find_if(It, VCHI.end(),
		[It](CHIArg &A) { return A != *It; });
		} else
		++It;
		}
		}
		}

		// Walk the post-dominator tree top-down and use a stack for each value to
		dberlin (Done): auto VCHI
		// store the last value you see. When you hit a CHI from a given edge, the
		dberlin (Done): auto It
		// value to use as the argument is at the top of the stack, add the value to
		// CHI and pop.
		void insertCHI(InValuesT &ValueBBs, OutValuesT &CHIBBs) {
		auto Root = PDT->getNode(nullptr);
		if (!Root)
		return;
		// Depth first walk on PDom tree to fill the CHIargs at each PDF.
		RenameStackT RenameStack;
		for (auto it = df_begin(Root), E = df_end(Root); it != E; ++it) {
		BasicBlock *BB = (*it)->getBlock();
		if (!BB)
continue;		continue;

		// Collect all values in BB and push to stack.
		fillRenameStack(BB, ValueBBs, RenameStack);

		// Fill outgoing values in each CHI corresponding to BB.
		fillChiArgs(BB, CHIBBs, RenameStack);
}		}
}		}

// At this point it is not safe to extend the current hoisting to		// Walk all the CHI-nodes to find ones which have an empty entry and remove
// NewHoistPt: save the hoisting list so far.		// them. Then collect all the instructions which are safe to hoist and see if
if (std::distance(Start, II) > 1)		// they form a list of anticipable values. OutValues contains CHIs
		dberlin (Not Done): depth_first instead of explicit iterator
HPL.push_back({HoistBB, SmallVecInsn(Start, II)});		// corresponding to each basic block.
		void findHoistableCandidates(OutValuesT &CHIBBs, InsKind K,
// Start over from BB.		HoistingPointList &HPL) {
Start = II;		auto cmpVN = [](const CHIArg &A, const CHIArg &B) { return A.VN < B.VN; };
if (K != InsKind::Scalar)
UD = MSSA->getMemoryAccess(*Start);		// CHIArgs now have the outgoing values, so check for anticipability and
HoistPt = Insn;		// accumulate hoistable candidates in HPL.
HoistBB = BB;		for (std::pair<BasicBlock *, SmallVector<CHIArg, 2>> &A : CHIBBs) {
		dberlin (Not Done): auto &A
NumBBsOnAllPaths = MaxNumberOfBBSInPath;		BasicBlock *BB = A.first;
}		SmallVectorImpl<CHIArg> &CHIs = A.second;
		// Vector of PHIs contains PHIs for different instructions.
// Save the last partition.		// Sort the args according to their VNs, such that identical
if (std::distance(Start, II) > 1)		// instructions are together.
HPL.push_back({HoistBB, SmallVecInsn(Start, II)});		std::sort(CHIs.begin(), CHIs.end(), cmpVN);
		auto TI = BB->getTerminator();
		auto B = CHIs.begin();
		// [PrevIt, PHIIt) form a range of CHIs which have identical VNs.
		CHIIt PHIIt = std::find_if(CHIs.begin(), CHIs.end(),
		dberlin (Not Done): auto
		[B](CHIArg &A) { return A != *B; });
		CHIIt PrevIt = CHIs.begin();
		dberlin (Not Done): auto
		while (PrevIt != PHIIt) {
		// Collect values which satisfy safety checks.
		SmallVector<CHIArg, 2> Safe;
		// We check for safety first because there might be multiple values in
		// the same path, some of which are not safe to be hoisted, but overall
		// each edge has at least one value which can be hoisted, making the
		// value anticipable along that path.
		checkSafety(PrevIt, PHIIt, BB, K, Safe);

		// List of safe values should be anticipable at TI.
		if (valueAnticipable(Safe.begin(), Safe.end(), TI)) {
		HPL.push_back({BB, SmallVecInsn()});
		SmallVecInsn &V = HPL.back().second;
		for (CHIIt B = Safe.begin(); B != Safe.end(); ++B)
		V.push_back(B->I);
		}

		// Check other VNs
		PrevIt = PHIIt;
		PHIIt = std::find_if(PrevIt, CHIs.end(),
		[PrevIt](CHIArg &A) { return A != *PrevIt; });
		}
		}
}		}

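The sort-then-scan pattern that findHoistableCandidates uses to visit CHI arguments one value number at a time can be shown in isolation. The sketch below is a simplified model rather than the patch's code: ToyArg and groupByVN are hypothetical names, and it assumes that CHIArg inequality compares value numbers only.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical model of a CHI argument: a value number and an instruction id.
struct ToyArg {
  int VN;
  int Id;
};

// Sort by value number, then walk the half-open ranges [PrevIt, NextIt) of
// equal VNs, the same shape as the PrevIt/PHIIt loop in the diff.
std::vector<std::pair<int, std::vector<int>>> groupByVN(std::vector<ToyArg> Args) {
  std::stable_sort(Args.begin(), Args.end(),
                   [](const ToyArg &A, const ToyArg &B) { return A.VN < B.VN; });

  std::vector<std::pair<int, std::vector<int>>> Groups;
  auto PrevIt = Args.begin();
  while (PrevIt != Args.end()) {
    // First element whose VN differs from the current group's VN.
    auto NextIt = std::find_if(PrevIt, Args.end(), [PrevIt](const ToyArg &A) {
      return A.VN != PrevIt->VN;
    });
    std::vector<int> Ids;
    for (auto It = PrevIt; It != NextIt; ++It)
      Ids.push_back(It->Id);
    Groups.push_back({PrevIt->VN, Ids});
    PrevIt = NextIt; // Move on to the next value number.
  }
  return Groups;
}
```

Each group corresponds to one candidate value; in the patch, the safety and anticipability checks are then run per group.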
// Initialize HPL from Map.		// Compute insertion points for each value which can be fully anticipated at
		// a dominator. HPL contains all such values.
void computeInsertionPoints(const VNtoInsns &Map, HoistingPointList &HPL,		void computeInsertionPoints(const VNtoInsns &Map, HoistingPointList &HPL,
InsKind K) {		InsKind K) {
		// Sort VNs based on their rankings
		std::vector<VNType> Ranks;
for (const auto &Entry : Map) {		for (const auto &Entry : Map) {
if (MaxHoistedThreshold != -1 && ++HoistedCtr > MaxHoistedThreshold)		Ranks.push_back(Entry.first);
return;		}

const SmallVecInsn &V = Entry.second;		// TODO: Remove fully-redundant expressions.
		// Get instruction from the Map, assume that all the Instructions
		// with same VNs have same rank (this is an approximation).
		std::sort(Ranks.begin(), Ranks.end(),
		[this, &Map](const VNType &r1, const VNType &r2) {
		return (rank(*Map.lookup(r1).begin()) <
		rank(*Map.lookup(r2).begin()));
		});

		// - Sort VNs according to their rank, and start with lowest ranked VN
		// - Take a VN and for each instruction with same VN
		// - Find the dominance frontier in the inverse graph (PDF)
		// - Insert the chi-node at PDF
		// - Remove the chi-nodes with missing entries
		// - Remove values from CHI-nodes which do not truly flow out, e.g.,
		// modified along the path.
		// - Collect the remaining values that are still anticipable
		SmallVector<BasicBlock *, 2> IDFBlocks;
		ReverseIDFCalculator IDFs(*PDT);
		OutValuesT OutValue;
		InValuesT InValue;
		for (const auto &R : Ranks) {
		const SmallVecInsn &V = Map.lookup(R);
if (V.size() < 2)		if (V.size() < 2)
continue;		continue;
		const VNType &VN = R;
// Compute the insertion point and the list of expressions to be hoisted.		SmallPtrSet<BasicBlock *, 2> VNBlocks;
SmallVecInsn InstructionsToHoist;		for (auto &I : V) {
for (auto I : V)		BasicBlock *BBI = I->getParent();
		dberlin (Not Done): What are you really trying to do here? (IE why is the continue commented out)
		hiraditya (author, Not Done): That was from the early stage when I only handled unique post-dom frontiers. I've removed those lines.
// We don't need to check for hoist-barriers here because if		if (!hasEH(BBI))
// I->getParent() is a barrier then I precedes the barrier.		VNBlocks.insert(BBI);
if (!hasEH(I->getParent()))		}
InstructionsToHoist.push_back(I);		// Compute the Post Dominance Frontiers of each basic block
		// The dominance frontier of a live block X in the reverse
if (!InstructionsToHoist.empty())		// control graph is the set of blocks upon which X is control
partitionCandidates(InstructionsToHoist, HPL, K);		// dependent. The following sequence computes the set of blocks
		// which currently have dead terminators that are control
		// dependence sources of a block which is in NewLiveBlocks.
		IDFs.setDefiningBlocks(VNBlocks);
		IDFs.calculate(IDFBlocks);

		// Make a map of BB vs instructions to be hoisted.
		for (unsigned i = 0; i < V.size(); ++i) {
		InValue[V[i]->getParent()].push_back(std::make_pair(VN, V[i]));
		}
		// Insert empty CHI node for this VN. This is used to factor out
		// basic blocks where the ANTIC can potentially change.
		for (auto IDFB : IDFBlocks) { // TODO: Prune out useless CHI insertions.
		for (unsigned i = 0; i < V.size(); ++i) {
		CHIArg C = {VN, nullptr, nullptr};
		if (DT->dominates(IDFB, V[i]->getParent())) { // Ignore spurious PDFs.
		// InValue[V[i]->getParent()].push_back(std::make_pair(VN, V[i]));
		OutValue[IDFB].push_back(C);
		DEBUG(dbgs() << "\nInsertion a CHI for BB: " << IDFB->getName()
		<< ", for Insn: " << *V[i]);
		}
		}
}		}
}		}

		// Insert CHI args at each PDF to iterate on factored graph of
		// control dependence.
		insertCHI(InValue, OutValue);
		// Using the CHI args inserted at each PDF, find fully anticipable values.
		findHoistableCandidates(OutValue, K, HPL);
		}

// Return true when all operands of Instr are available at insertion point		// Return true when all operands of Instr are available at insertion point
// HoistPt. When limiting the number of hoisted expressions, one could hoist		// HoistPt. When limiting the number of hoisted expressions, one could hoist
// a load without hoisting its access function. So before hoisting any		// a load without hoisting its access function. So before hoisting any
// expression, make sure that all its operands are available at insert point.		// expression, make sure that all its operands are available at insert point.
bool allOperandsAvailable(const Instruction *I,		bool allOperandsAvailable(const Instruction *I,
const BasicBlock *HoistPt) const {		const BasicBlock *HoistPt) const {
for (const Use &Op : I->operands())		for (const Use &Op : I->operands())
if (const auto *Inst = dyn_cast<Instruction>(&Op))		if (const auto *Inst = dyn_cast<Instruction>(&Op))
[148 lines not shown]	for (const HoistingPointInfo &HP : HPL) {
}		}

// Move the instruction at the end of HoistPt.		// Move the instruction at the end of HoistPt.
Instruction *Last = HoistPt->getTerminator();		Instruction *Last = HoistPt->getTerminator();
MD->removeInstruction(Repl);		MD->removeInstruction(Repl);
Repl->moveBefore(Last);		Repl->moveBefore(Last);

DFSNumber[Repl] = DFSNumber[Last]++;		DFSNumber[Repl] = DFSNumber[Last]++;
}		}
		dberlin (Not Done): I think since you wrote this we added moving functions you could use here.
		hiraditya (author, Not Done): I'll look into it and update this code. Thanks for the feedback.

MemoryAccess *NewMemAcc = MSSA->getMemoryAccess(Repl);		MemoryAccess *NewMemAcc = MSSA->getMemoryAccess(Repl);

if (MoveAccess) {		if (MoveAccess) {
if (MemoryUseOrDef *OldMemAcc =		if (MemoryUseOrDef *OldMemAcc =
dyn_cast_or_null<MemoryUseOrDef>(NewMemAcc)) {		dyn_cast_or_null<MemoryUseOrDef>(NewMemAcc)) {
// The definition of this ld/st will not change: ld/st hoisting is		// The definition of this ld/st will not change: ld/st hoisting is
// legal when the ld/st is not moved past its current definition.		// legal when the ld/st is not moved past its current definition.
MemoryAccess *Def = OldMemAcc->getDefiningAccess();		MemoryAccess *Def = OldMemAcc->getDefiningAccess();
NewMemAcc =		NewMemAcc = MSSAUpdater->createMemoryAccessInBB(Repl, Def, HoistPt,
MSSAUpdater->createMemoryAccessInBB(Repl, Def, HoistPt, MemorySSA::End);		MemorySSA::End);
OldMemAcc->replaceAllUsesWith(NewMemAcc);		OldMemAcc->replaceAllUsesWith(NewMemAcc);
MSSAUpdater->removeMemoryAccess(OldMemAcc);		MSSAUpdater->removeMemoryAccess(OldMemAcc);
}		}
}		}

if (isa<LoadInst>(Repl))		if (isa<LoadInst>(Repl))
++NL;		++NL;
else if (isa<StoreInst>(Repl))		else if (isa<StoreInst>(Repl))
[73 lines not shown]	std::pair<unsigned, unsigned> hoistExpressions(Function &F) {
StoreInfo SI;		StoreInfo SI;
CallInfo CI;		CallInfo CI;
for (BasicBlock *BB : depth_first(&F.getEntryBlock())) {		for (BasicBlock *BB : depth_first(&F.getEntryBlock())) {
int InstructionNb = 0;		int InstructionNb = 0;
for (Instruction &I1 : *BB) {		for (Instruction &I1 : *BB) {
// If I1 cannot guarantee progress, subsequent instructions		// If I1 cannot guarantee progress, subsequent instructions
// in BB cannot be hoisted anyways.		// in BB cannot be hoisted anyways.
if (!isGuaranteedToTransferExecutionToSuccessor(&I1)) {		if (!isGuaranteedToTransferExecutionToSuccessor(&I1)) {
HoistBarrier.insert(BB);		HoistBarrier.insert(BB);
break;		break;
}		}
// Only hoist the first instructions in BB up to MaxDepthInBB. Hoisting		// Only hoist the first instructions in BB up to MaxDepthInBB. Hoisting
// deeper may increase the register pressure and compilation time.		// deeper may increase the register pressure and compilation time.
if (MaxDepthInBB != -1 && InstructionNb++ >= MaxDepthInBB)		if (MaxDepthInBB != -1 && InstructionNb++ >= MaxDepthInBB)
break;		break;

// Do not value number terminator instructions.		// Do not value number terminator instructions.
if (isa<TerminatorInst>(&I1))		if (isa<TerminatorInst>(&I1))
[43 lines not shown]	public:
GVNHoistLegacyPass() : FunctionPass(ID) {		GVNHoistLegacyPass() : FunctionPass(ID) {
initializeGVNHoistLegacyPassPass(*PassRegistry::getPassRegistry());		initializeGVNHoistLegacyPassPass(*PassRegistry::getPassRegistry());
}		}

bool runOnFunction(Function &F) override {		bool runOnFunction(Function &F) override {
if (skipFunction(F))		if (skipFunction(F))
return false;		return false;
auto &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();		auto &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();
		auto &PDT = getAnalysis<PostDominatorTreeWrapperPass>().getPostDomTree();
auto &AA = getAnalysis<AAResultsWrapperPass>().getAAResults();		auto &AA = getAnalysis<AAResultsWrapperPass>().getAAResults();
auto &MD = getAnalysis<MemoryDependenceWrapperPass>().getMemDep();		auto &MD = getAnalysis<MemoryDependenceWrapperPass>().getMemDep();
auto &MSSA = getAnalysis<MemorySSAWrapperPass>().getMSSA();		auto &MSSA = getAnalysis<MemorySSAWrapperPass>().getMSSA();

GVNHoist G(&DT, &AA, &MD, &MSSA);		GVNHoist G(&DT, &PDT, &AA, &MD, &MSSA);
return G.run(F);		return G.run(F);
}		}

void getAnalysisUsage(AnalysisUsage &AU) const override {		void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.addRequired<DominatorTreeWrapperPass>();		AU.addRequired<DominatorTreeWrapperPass>();
		AU.addRequired<PostDominatorTreeWrapperPass>();
AU.addRequired<AAResultsWrapperPass>();		AU.addRequired<AAResultsWrapperPass>();
AU.addRequired<MemoryDependenceWrapperPass>();		AU.addRequired<MemoryDependenceWrapperPass>();
AU.addRequired<MemorySSAWrapperPass>();		AU.addRequired<MemorySSAWrapperPass>();
AU.addPreserved<DominatorTreeWrapperPass>();		AU.addPreserved<DominatorTreeWrapperPass>();
AU.addPreserved<MemorySSAWrapperPass>();		AU.addPreserved<MemorySSAWrapperPass>();
AU.addPreserved<GlobalsAAWrapperPass>();		AU.addPreserved<GlobalsAAWrapperPass>();
}		}
};		};
} // namespace		} // namespace llvm

PreservedAnalyses GVNHoistPass::run(Function &F, FunctionAnalysisManager &AM) {		PreservedAnalyses GVNHoistPass::run(Function &F, FunctionAnalysisManager &AM) {
DominatorTree &DT = AM.getResult<DominatorTreeAnalysis>(F);		DominatorTree &DT = AM.getResult<DominatorTreeAnalysis>(F);
		PostDominatorTree &PDT = AM.getResult<PostDominatorTreeAnalysis>(F);
AliasAnalysis &AA = AM.getResult<AAManager>(F);		AliasAnalysis &AA = AM.getResult<AAManager>(F);
MemoryDependenceResults &MD = AM.getResult<MemoryDependenceAnalysis>(F);		MemoryDependenceResults &MD = AM.getResult<MemoryDependenceAnalysis>(F);
MemorySSA &MSSA = AM.getResult<MemorySSAAnalysis>(F).getMSSA();		MemorySSA &MSSA = AM.getResult<MemorySSAAnalysis>(F).getMSSA();
GVNHoist G(&DT, &AA, &MD, &MSSA);		GVNHoist G(&DT, &PDT, &AA, &MD, &MSSA);
if (!G.run(F))		if (!G.run(F))
return PreservedAnalyses::all();		return PreservedAnalyses::all();

PreservedAnalyses PA;		PreservedAnalyses PA;
PA.preserve<DominatorTreeAnalysis>();		PA.preserve<DominatorTreeAnalysis>();
PA.preserve<MemorySSAAnalysis>();		PA.preserve<MemorySSAAnalysis>();
PA.preserve<GlobalsAA>();		PA.preserve<GlobalsAA>();
return PA;		return PA;
[13 lines not shown]

test/Transforms/GVNHoist/hoist-more-than-two-branches.ll

This file was added.

				; RUN: opt -gvn-hoist -S < %s | FileCheck %s

				; CHECK: store
				; CHECK-NOT: store

				; Check that an instruction can be hoisted to a basic block
				; with more than two successors.

				@G = external global i32, align 4

				define void @foo(i32 %c1) {
				entry:
				switch i32 %c1, label %exit1 [
				i32 0, label %sw0
				i32 1, label %sw1
				]

				sw0:
				store i32 1, i32* @G
				br label %exit

				sw1:
				store i32 1, i32* @G
				br label %exit

				exit1:
				store i32 1, i32* @G
				ret void
				exit:
				ret void
				}

test/Transforms/GVNHoist/hoist-mssa.ll

	; RUN: opt -S -gvn-hoist < %s | FileCheck %s			; RUN: opt -S -gvn-hoist -newgvn < %s | FileCheck %s

	; Check that store hoisting works: there should be only one store left.			; Check that store hoisting works: there should be only one store left.
	; CHECK-LABEL: @getopt			; CHECK-LABEL: @getopt
	; CHECK: store i32			; CHECK: store i32
	; CHECK-NOT: store i32			; CHECK-NOT: store i32

	@optind = external global i32, align 4			@optind = external global i32, align 4

	[60 lines not shown]

test/Transforms/GVNHoist/hoist-newgvn.ll

This file was added.

				; RUN: opt -gvn-hoist -newgvn -S < %s | FileCheck %s
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				@GlobalVar = internal global float 1.000000e+00

				; Check that we hoist load and scalar expressions in dominator.
				; CHECK-LABEL: @dominatorHoisting
				; CHECK: load
				; CHECK: load
				; CHECK: fsub
				; CHECK: fmul
				; CHECK: load
				; CHECK: fsub
				; CHECK: fmul
				; CHECK-NOT: load
				; CHECK-NOT: fmul
				; CHECK-NOT: fsub
				define float @dominatorHoisting(float %d, float* %min, float* %max, float* %a) {
				entry:
				%div = fdiv float 1.000000e+00, %d
				%0 = load float, float* %min, align 4
				%1 = load float, float* %a, align 4
				%sub = fsub float %0, %1
				%mul = fmul float %sub, %div
				%2 = load float, float* %max, align 4
				%sub1 = fsub float %2, %1
				%mul2 = fmul float %sub1, %div
				%cmp = fcmp oge float %div, 0.000000e+00
				br i1 %cmp, label %if.then, label %if.end

				if.then: ; preds = %entry
				%3 = load float, float* %max, align 4
				%4 = load float, float* %a, align 4
				%sub3 = fsub float %3, %4
				%mul4 = fmul float %sub3, %div
				%5 = load float, float* %min, align 4
				%sub5 = fsub float %5, %4
				%mul6 = fmul float %sub5, %div
				br label %if.end

				if.end: ; preds = %entry
				%p1 = phi float [ %mul4, %if.then ], [ 0.000000e+00, %entry ]
				%p2 = phi float [ %mul6, %if.then ], [ 0.000000e+00, %entry ]

				%x = fadd float %p1, %mul2
				%y = fadd float %p2, %mul
				%z = fadd float %x, %y
				ret float %z
				}

				; Check that we hoist load and scalar expressions in dominator.
				; CHECK-LABEL: @domHoisting
				; CHECK: load
				; CHECK: load
				; CHECK: fsub
				; CHECK: fmul
				; CHECK: load
				; CHECK: fsub
				; CHECK: fmul
				; CHECK-NOT: load
				; CHECK-NOT: fmul
				; CHECK-NOT: fsub
				define float @domHoisting(float %d, float* %min, float* %max, float* %a) {
				entry:
				%div = fdiv float 1.000000e+00, %d
				%0 = load float, float* %min, align 4
				%1 = load float, float* %a, align 4
				%sub = fsub float %0, %1
				%mul = fmul float %sub, %div
				%2 = load float, float* %max, align 4
				%sub1 = fsub float %2, %1
				%mul2 = fmul float %sub1, %div
				%cmp = fcmp oge float %div, 0.000000e+00
				br i1 %cmp, label %if.then, label %if.else

				if.then:
				%3 = load float, float* %max, align 4
				%4 = load float, float* %a, align 4
				%sub3 = fsub float %3, %4
				%mul4 = fmul float %sub3, %div
				%5 = load float, float* %min, align 4
				%sub5 = fsub float %5, %4
				%mul6 = fmul float %sub5, %div
				br label %if.end

				if.else:
				%6 = load float, float* %max, align 4
				%7 = load float, float* %a, align 4
				%sub9 = fsub float %6, %7
				%mul10 = fmul float %sub9, %div
				%8 = load float, float* %min, align 4
				%sub12 = fsub float %8, %7
				%mul13 = fmul float %sub12, %div
				br label %if.end

				if.end:
				%p1 = phi float [ %mul4, %if.then ], [ %mul10, %if.else ]
				%p2 = phi float [ %mul6, %if.then ], [ %mul13, %if.else ]

				%x = fadd float %p1, %mul2
				%y = fadd float %p2, %mul
				%z = fadd float %x, %y
				ret float %z
				}

test/Transforms/GVNHoist/hoist-pr20242.ll

	; RUN: opt -gvn-hoist -S < %s | FileCheck %s			; RUN: opt -gvn-hoist -newgvn -gvn-hoist -S < %s | FileCheck %s
				; Test to demonstrate that newgvn creates opportunities for
				; more gvn-hoist when sibling branches contain identical expressions.

	target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
	target triple = "x86_64-unknown-linux-gnu"			target triple = "x86_64-unknown-linux-gnu"

	; Check that all "or" expressions are hoisted.			; Check that all "or" expressions are hoisted.
	; CHECK-LABEL: @encode			; CHECK-LABEL: @encode
	; CHECK: or i32			; CHECK: or i32
	; CHECK-NOT: or i32			; CHECK-NOT: or i32

	[65 lines not shown]

test/Transforms/GVNHoist/hoist-pr28933.ll

	; RUN: opt -S -gvn-hoist -verify-memoryssa < %s | FileCheck %s			; RUN: opt -S -gvn-hoist -verify-memoryssa -newgvn < %s | FileCheck %s

	; Check that we end up with one load and one store, in the right order			; Check that we end up with one load and one store, in the right order
	; CHECK-LABEL: define void @test_it(			; CHECK-LABEL: define void @test_it(
	; CHECK: store			; CHECK: store
	; CHECK-NEXT: load
	; CHECK-NOT: store			; CHECK-NOT: store
	; CHECK-NOT: load			; CHECK-NOT: load

	%rec894.0.1.2.3.12 = type { i16 }			%rec894.0.1.2.3.12 = type { i16 }

	@a = external global %rec894.0.1.2.3.12			@a = external global %rec894.0.1.2.3.12

	define void @test_it() {			define void @test_it() {
	bb2:			bb2:
	store i16 undef, i16* getelementptr inbounds (%rec894.0.1.2.3.12, %rec894.0.1.2.3.12* @a, i16 0, i32 0), align 1			store i16 undef, i16* getelementptr inbounds (%rec894.0.1.2.3.12, %rec894.0.1.2.3.12* @a, i16 0, i32 0), align 1
	%_tmp61 = load i16, i16* getelementptr inbounds (%rec894.0.1.2.3.12, %rec894.0.1.2.3.12* @a, i16 0, i32 0), align 1			%_tmp61 = load i16, i16* getelementptr inbounds (%rec894.0.1.2.3.12, %rec894.0.1.2.3.12* @a, i16 0, i32 0), align 1
	store i16 undef, i16* getelementptr inbounds (%rec894.0.1.2.3.12, %rec894.0.1.2.3.12* @a, i16 0, i32 0), align 1			store i16 undef, i16* getelementptr inbounds (%rec894.0.1.2.3.12, %rec894.0.1.2.3.12* @a, i16 0, i32 0), align 1
	%_tmp92 = load i16, i16* getelementptr inbounds (%rec894.0.1.2.3.12, %rec894.0.1.2.3.12* @a, i16 0, i32 0), align 1			%_tmp92 = load i16, i16* getelementptr inbounds (%rec894.0.1.2.3.12, %rec894.0.1.2.3.12* @a, i16 0, i32 0), align 1
	ret void			ret void
	}			}

test/Transforms/GVNHoist/hoist-recursive-geps.ll

	; RUN: opt -gvn-hoist -S < %s | FileCheck %s			; RUN: opt -gvn-hoist -newgvn -gvn-hoist -S < %s | FileCheck %s

				; Check that recursive GEPs are hoisted. Since hoisting creates
				; fully redundant instructions, newgvn is run to remove them which then
				; creates more opportunities for hoisting.

	; Check that recursive GEPs are hoisted.
	; CHECK-LABEL: @fun			; CHECK-LABEL: @fun
	; CHECK: fdiv
	; CHECK: load			; CHECK: load
				; CHECK: fdiv
	; CHECK: load			; CHECK: load
	; CHECK: load			; CHECK: load
	; CHECK: load			; CHECK: load
	; CHECK: fsub			; CHECK: fsub
	; CHECK: fsub
	; CHECK: fmul			; CHECK: fmul
				; CHECK: fsub
	; CHECK: fmul			; CHECK: fmul
	; CHECK-NOT: fsub			; CHECK-NOT: fsub
	; CHECK-NOT: fmul			; CHECK-NOT: fmul

	%0 = type { double, double, double }			%0 = type { double, double, double }
	%1 = type { double, double, double }			%1 = type { double, double, double }
	%2 = type { %3, %1, %1 }			%2 = type { %3, %1, %1 }
	%3 = type { i32 (...)*, %4, %10, %11, %11, %11, %11, %11, %11, %11, %11, %11 }			%3 = type { i32 (...)*, %4, %10, %11, %11, %11, %11, %11, %11, %11, %11, %11 }
	[83 lines not shown]

test/Transforms/GVNHoist/hoist.ll

; RUN: opt -gvn-hoist -S < %s | FileCheck %s		; RUN: opt -gvn-hoist -S < %s | FileCheck %s
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"		target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"		target triple = "x86_64-unknown-linux-gnu"

@GlobalVar = internal global float 1.000000e+00		@GlobalVar = internal global float 1.000000e+00

; Check that all scalar expressions are hoisted.		; Check that all scalar expressions are hoisted.
;		;
; CHECK-LABEL: @scalarsHoisting		; CHECK-LABEL: @scalarsHoisting
; CHECK: fsub		; CHECK: fsub
; CHECK: fsub
; CHECK: fmul		; CHECK: fmul
		; CHECK: fsub
; CHECK: fmul		; CHECK: fmul
; CHECK-NOT: fmul		; CHECK-NOT: fmul
; CHECK-NOT: fsub		; CHECK-NOT: fsub
define float @scalarsHoisting(float %d, float %min, float %max, float %a) {		define float @scalarsHoisting(float %d, float %min, float %max, float %a) {
entry:		entry:
%div = fdiv float 1.000000e+00, %d		%div = fdiv float 1.000000e+00, %d
%cmp = fcmp oge float %div, 0.000000e+00		%cmp = fcmp oge float %div, 0.000000e+00
br i1 %cmp, label %if.then, label %if.else		br i1 %cmp, label %if.then, label %if.else
[22 lines not shown]
; Check that all loads and scalars depending on the loads are hoisted.		; Check that all loads and scalars depending on the loads are hoisted.
; Check that getelementptr computation gets hoisted before the load.		; Check that getelementptr computation gets hoisted before the load.
;		;
; CHECK-LABEL: @readsAndScalarsHoisting		; CHECK-LABEL: @readsAndScalarsHoisting
; CHECK: load		; CHECK: load
; CHECK: load		; CHECK: load
; CHECK: load		; CHECK: load
; CHECK: fsub		; CHECK: fsub
; CHECK: fsub
; CHECK: fmul		; CHECK: fmul
		; CHECK: fsub
; CHECK: fmul		; CHECK: fmul
; CHECK-NOT: load		; CHECK-NOT: load
; CHECK-NOT: fmul		; CHECK-NOT: fmul
; CHECK-NOT: fsub		; CHECK-NOT: fsub
define float @readsAndScalarsHoisting(float %d, float* %min, float* %max, float* %a) {		define float @readsAndScalarsHoisting(float %d, float* %min, float* %max, float* %a) {
entry:		entry:
%div = fdiv float 1.000000e+00, %d		%div = fdiv float 1.000000e+00, %d
%cmp = fcmp oge float %div, 0.000000e+00		%cmp = fcmp oge float %div, 0.000000e+00
[82 lines not shown]

; Check that we do hoist loads when the store is above the insertion point.		; Check that we do hoist loads when the store is above the insertion point.
;		;
; CHECK-LABEL: @readsAndWriteAboveInsertPt		; CHECK-LABEL: @readsAndWriteAboveInsertPt
; CHECK: load		; CHECK: load
; CHECK: load		; CHECK: load
; CHECK: load		; CHECK: load
; CHECK: fsub		; CHECK: fsub
; CHECK: fsub
; CHECK: fmul		; CHECK: fmul
		; CHECK: fsub
; CHECK: fmul		; CHECK: fmul
; CHECK-NOT: load		; CHECK-NOT: load
; CHECK-NOT: fmul		; CHECK-NOT: fmul
; CHECK-NOT: fsub		; CHECK-NOT: fsub
define float @readsAndWriteAboveInsertPt(float %d, float* %min, float* %max, float* %a) {		define float @readsAndWriteAboveInsertPt(float %d, float* %min, float* %max, float* %a) {
entry:		entry:
%div = fdiv float 1.000000e+00, %d		%div = fdiv float 1.000000e+00, %d
store float 0.000000e+00, float* @GlobalVar		store float 0.000000e+00, float* @GlobalVar
[99 lines not shown]
}		}

; Check that we hoist load and scalar expressions in triangles.		; Check that we hoist load and scalar expressions in triangles.
; CHECK-LABEL: @triangleHoisting		; CHECK-LABEL: @triangleHoisting
; CHECK: load		; CHECK: load
; CHECK: load		; CHECK: load
; CHECK: load		; CHECK: load
; CHECK: fsub		; CHECK: fsub
; CHECK: fsub
; CHECK: fmul		; CHECK: fmul
		; CHECK: fsub
; CHECK: fmul		; CHECK: fmul
; CHECK-NOT: load		; CHECK-NOT: load
; CHECK-NOT: fmul		; CHECK-NOT: fmul
; CHECK-NOT: fsub		; CHECK-NOT: fsub
define float @triangleHoisting(float %d, float* %min, float* %max, float* %a) {		define float @triangleHoisting(float %d, float* %min, float* %max, float* %a) {
entry:		entry:
%div = fdiv float 1.000000e+00, %d		%div = fdiv float 1.000000e+00, %d
%cmp = fcmp oge float %div, 0.000000e+00		%cmp = fcmp oge float %div, 0.000000e+00
[21 lines not shown]	if.end: ; preds = %entry
%mul6 = fmul float %sub5, %div		%mul6 = fmul float %sub5, %div

%x = fadd float %p1, %mul6		%x = fadd float %p1, %mul6
%y = fadd float %p2, %mul4		%y = fadd float %p2, %mul4
%z = fadd float %x, %y		%z = fadd float %x, %y
ret float %z		ret float %z
}		}

; Check that we hoist load and scalar expressions in dominator.
; CHECK-LABEL: @dominatorHoisting
; CHECK: load
; CHECK: load
; CHECK: fsub
; CHECK: fmul
; CHECK: load
; CHECK: fsub
; CHECK: fmul
; CHECK-NOT: load
; CHECK-NOT: fmul
; CHECK-NOT: fsub
define float @dominatorHoisting(float %d, float* %min, float* %max, float* %a) {
entry:
%div = fdiv float 1.000000e+00, %d
%0 = load float, float* %min, align 4
%1 = load float, float* %a, align 4
%sub = fsub float %0, %1
%mul = fmul float %sub, %div
%2 = load float, float* %max, align 4
%sub1 = fsub float %2, %1
%mul2 = fmul float %sub1, %div
%cmp = fcmp oge float %div, 0.000000e+00
br i1 %cmp, label %if.then, label %if.end

if.then: ; preds = %entry
%3 = load float, float* %max, align 4
%4 = load float, float* %a, align 4
%sub3 = fsub float %3, %4
%mul4 = fmul float %sub3, %div
%5 = load float, float* %min, align 4
%sub5 = fsub float %5, %4
%mul6 = fmul float %sub5, %div
br label %if.end

if.end: ; preds = %entry
%p1 = phi float [ %mul4, %if.then ], [ 0.000000e+00, %entry ]
%p2 = phi float [ %mul6, %if.then ], [ 0.000000e+00, %entry ]

%x = fadd float %p1, %mul2
%y = fadd float %p2, %mul
%z = fadd float %x, %y
ret float %z
}

; Check that we hoist load and scalar expressions in dominator.
; CHECK-LABEL: @domHoisting
; CHECK: load
; CHECK: load
; CHECK: fsub
; CHECK: fmul
; CHECK: load
; CHECK: fsub
; CHECK: fmul
; CHECK-NOT: load
; CHECK-NOT: fmul
; CHECK-NOT: fsub
define float @domHoisting(float %d, float* %min, float* %max, float* %a) {
entry:
%div = fdiv float 1.000000e+00, %d
%0 = load float, float* %min, align 4
%1 = load float, float* %a, align 4
%sub = fsub float %0, %1
%mul = fmul float %sub, %div
%2 = load float, float* %max, align 4
%sub1 = fsub float %2, %1
%mul2 = fmul float %sub1, %div
%cmp = fcmp oge float %div, 0.000000e+00
br i1 %cmp, label %if.then, label %if.else

if.then:
%3 = load float, float* %max, align 4
%4 = load float, float* %a, align 4
%sub3 = fsub float %3, %4
%mul4 = fmul float %sub3, %div
%5 = load float, float* %min, align 4
%sub5 = fsub float %5, %4
%mul6 = fmul float %sub5, %div
br label %if.end

if.else:
%6 = load float, float* %max, align 4
%7 = load float, float* %a, align 4
%sub9 = fsub float %6, %7
%mul10 = fmul float %sub9, %div
%8 = load float, float* %min, align 4
%sub12 = fsub float %8, %7
%mul13 = fmul float %sub12, %div
br label %if.end

if.end:
%p1 = phi float [ %mul4, %if.then ], [ %mul10, %if.else ]
%p2 = phi float [ %mul6, %if.then ], [ %mul13, %if.else ]

%x = fadd float %p1, %mul2
%y = fadd float %p2, %mul
%z = fadd float %x, %y
ret float %z
}

; Check that we do not hoist loads past stores within the same basic block.		; Check that we do not hoist loads past stores within the same basic block.
; CHECK-LABEL: @noHoistInSingleBBWithStore		; CHECK-LABEL: @noHoistInSingleBBWithStore
; CHECK: load		; CHECK: load
; CHECK: store		; CHECK: store
; CHECK: load		; CHECK: load
; CHECK: store		; CHECK: store
define i32 @noHoistInSingleBBWithStore() {		define i32 @noHoistInSingleBBWithStore() {
entry:		entry:
[92 lines not shown]

test/Transforms/GVNHoist/infinite-loop-direct.ll

This file was added.

				; RUN: opt -S -gvn-hoist < %s | FileCheck %s

				; Checking gvn-hoist in case of infinite loops and irreducible control flow.

				; Check that bitcast is not hoisted because down safety is not guaranteed.
				; CHECK-LABEL: @bazv1
				; CHECK: if.then.i:
				; CHECK: bitcast
				; CHECK-NEXT: load
				; CHECK: if.then4.i:
				; CHECK: bitcast
				; CHECK-NEXT: load

				%class.bar = type { i8, %class.base }
				%class.base = type { i32 (...)** }

				; Function Attrs: noreturn nounwind uwtable
				define void @bazv1() local_unnamed_addr {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x.sroa.2.0..sroa_idx2 = getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				store %class.base* null, %class.base** %x.sroa.2.0..sroa_idx2, align 8
				call void @_Z3foo3bar(%class.bar* nonnull %agg.tmp)
				%0 = load %class.base*, %class.base** %x.sroa.2.0..sroa_idx2, align 8
				%1 = bitcast %class.bar* %agg.tmp to %class.base*
				%cmp.i = icmp eq %class.base* %0, %1
				br i1 %cmp.i, label %if.then.i, label %if.else.i

				if.then.i: ; preds = %entry
				%2 = bitcast %class.base* %0 to void (%class.base*)***
				%vtable.i = load void (%class.base*)**, void (%class.base*)*** %2, align 8
				%vfn.i = getelementptr inbounds void (%class.base*)*, void (%class.base*)** %vtable.i, i64 2
				%3 = load void (%class.base*)*, void (%class.base*)** %vfn.i, align 8
				call void %3(%class.base* %0)
				br label %while.cond.preheader

				if.else.i: ; preds = %entry
				%tobool.i = icmp eq %class.base* %0, null
				br i1 %tobool.i, label %while.cond.preheader, label %if.then4.i

				if.then4.i: ; preds = %if.else.i
				%4 = bitcast %class.base* %0 to void (%class.base*)***
				%vtable6.i = load void (%class.base*)**, void (%class.base*)*** %4, align 8
				%vfn7.i = getelementptr inbounds void (%class.base*)*, void (%class.base*)** %vtable6.i, i64 3
				%5 = load void (%class.base*)*, void (%class.base*)** %vfn7.i, align 8
				call void %5(%class.base* nonnull %0)
				br label %while.cond.preheader

				while.cond.preheader: ; preds = %if.then.i, %if.else.i, %if.then4.i
				br label %while.cond

				while.cond: ; preds = %while.cond.preheader, %while.cond
				%call = call i32 @sleep(i32 10)
				br label %while.cond
				}

				declare void @_Z3foo3bar(%class.bar*) local_unnamed_addr

				declare i32 @sleep(i32) local_unnamed_addr

				; Check that the load is not hoisted when inside an irreducible control flow

				; CHECK-LABEL: @bazv
				; CHECK: bb2:
				; CHECK: load
				; CHECK: load
				; CHECK: bitcast

				define void @bazv() {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x= getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				%0 = load %class.base*, %class.base** %x, align 8
				%1 = bitcast %class.bar* %agg.tmp to %class.base*
				%cmp.i = icmp eq %class.base* %0, %1
				br i1 %cmp.i, label %bb1, label %bb4

				bb1:
				%b1 = bitcast %class.base* %0 to void (%class.base*)***
				%i = load void (%class.base*)**, void (%class.base*)*** %b1, align 8
				%vfn.i = getelementptr inbounds void (%class.base*)*, void (%class.base*)** %i, i64 2
				%cmp.j = icmp eq %class.base* %0, %1
				br i1 %cmp.j, label %bb2, label %bb3

				bb2:
				%l1 = load void (%class.base*)*, void (%class.base*)** %vfn.i, align 8
				br label %bb3

				bb3:
				%l2 = load void (%class.base*)*, void (%class.base*)** %vfn.i, align 8
				br label %bb2

				bb4:
				%b2 = bitcast %class.base* %0 to void (%class.base*)***
				ret void
				}

test/Transforms/GVNHoist/infinite-loop-indirect.ll

This file was added.

				; RUN: opt -S -gvn-hoist < %s | FileCheck %s

				; Checking gvn-hoist in case of indirect branches.

				; Check that the bitcast is not hoisted because it is after an indirect call
				; CHECK-LABEL: @foo
				; CHECK-LABEL: l1.preheader:
				; CHECK-NEXT: bitcast
				; CHECK-LABEL: l1
				; CHECK: bitcast

				%class.bar = type { i8, %class.base }
				%class.base = type { i32 (...)** }

				@bar = local_unnamed_addr global i32 ()* null, align 8
				@bar1 = local_unnamed_addr global i32 ()* null, align 8

				define i32 @foo(i32* nocapture readonly %i) {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x= getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				%y = load %class.base*, %class.base** %x, align 8
				%0 = load i32, i32* %i, align 4
				%.off = add i32 %0, -1
				%switch = icmp ult i32 %.off, 2
				br i1 %switch, label %l1.preheader, label %sw.default

				l1.preheader: ; preds = %sw.default, %entry
				%b1 = bitcast %class.base* %y to void (%class.base*)***
				br label %l1

				l1: ; preds = %l1.preheader, %l1
				%1 = load i32 ()*, i32 ()** @bar, align 8
				%call = tail call i32 %1()
				%b2 = bitcast %class.base* %y to void (%class.base*)***
				br label %l1

				sw.default: ; preds = %entry
				%2 = load i32 ()*, i32 ()** @bar1, align 8
				%call2 = tail call i32 %2()
				br label %l1.preheader
				}


				; Any instruction inside an infinite loop will not be hoisted because
				; there is no path to the exit of the function.

				; CHECK-LABEL: @foo1
				; CHECK-LABEL: l1.preheader:
				; CHECK-NEXT: bitcast
				; CHECK-LABEL: l1:
				; CHECK: bitcast

				define i32 @foo1(i32* nocapture readonly %i) {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x= getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				%y = load %class.base*, %class.base** %x, align 8
				%0 = load i32, i32* %i, align 4
				%.off = add i32 %0, -1
				%switch = icmp ult i32 %.off, 2
				br i1 %switch, label %l1.preheader, label %sw.default

				l1.preheader: ; preds = %sw.default, %entry
				%b1 = bitcast %class.base* %y to void (%class.base*)***
				%y1 = load %class.base*, %class.base** %x, align 8
				br label %l1

				l1: ; preds = %l1.preheader, %l1
				%b2 = bitcast %class.base* %y to void (%class.base*)***
				%1 = load i32 ()*, i32 ()** @bar, align 8
				%y2 = load %class.base*, %class.base** %x, align 8
				%call = tail call i32 %1()
				br label %l1

				sw.default: ; preds = %entry
				%2 = load i32 ()*, i32 ()** @bar1, align 8
				%call2 = tail call i32 %2()
				br label %l1.preheader
				}

				; Check that bitcast is hoisted even when one of them is partially redundant.
				; CHECK-LABEL: @test13
				; CHECK: bitcast
				; CHECK-NOT: bitcast

				define i32 @test13(i32* %P, i8* %Ptr, i32* nocapture readonly %i) {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x= getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				%y = load %class.base*, %class.base** %x, align 8
				indirectbr i8* %Ptr, [label %BrBlock, label %B2]

				B2:
				%b1 = bitcast %class.base* %y to void (%class.base*)***
				store i32 4, i32 *%P
				br label %BrBlock

				BrBlock:
				%b2 = bitcast %class.base* %y to void (%class.base*)***
				%L = load i32, i32* %P
				%C = icmp eq i32 %L, 42
				br i1 %C, label %T, label %F

				T:
				ret i32 123
				F:
				ret i32 1422
				}

				; Check that the bitcast is not hoisted because anticipability
				; cannot be guaranteed here as one of the indirect branch targets
				; does not have the bitcast instruction.

				; CHECK-LABEL: @test14
				; CHECK-LABEL: B2:
				; CHECK-NEXT: bitcast
				; CHECK-LABEL: BrBlock:
				; CHECK-NEXT: bitcast

				define i32 @test14(i32* %P, i8* %Ptr, i32* nocapture readonly %i) {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x= getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				%y = load %class.base*, %class.base** %x, align 8
				indirectbr i8* %Ptr, [label %BrBlock, label %B2, label %T]

				B2:
				%b1 = bitcast %class.base* %y to void (%class.base*)***
				store i32 4, i32 *%P
				br label %BrBlock

				BrBlock:
				%b2 = bitcast %class.base* %y to void (%class.base*)***
				%L = load i32, i32* %P
				%C = icmp eq i32 %L, 42
				br i1 %C, label %T, label %F

				T:
				%pi = load i32, i32* %i, align 4
				ret i32 %pi
				F:
				%pl = load i32, i32* %P
				ret i32 %pl
				}


				; Check that the bitcast is not hoisted because of a cycle
				; due to indirect branches
				; CHECK-LABEL: @test16
				; CHECK-LABEL: B2:
				; CHECK-NEXT: bitcast
				; CHECK-LABEL: BrBlock:
				; CHECK-NEXT: bitcast

				define i32 @test16(i32* %P, i8* %Ptr, i32* nocapture readonly %i) {
				entry:
				%agg.tmp = alloca %class.bar, align 8
				%x= getelementptr inbounds %class.bar, %class.bar* %agg.tmp, i64 0, i32 1
				%y = load %class.base*, %class.base** %x, align 8
				indirectbr i8* %Ptr, [label %BrBlock, label %B2]

				B2:
				%b1 = bitcast %class.base* %y to void (%class.base*)***
				%0 = load i32, i32* %i, align 4
				store i32 %0, i32 *%P
				br label %BrBlock

				BrBlock:
				%b2 = bitcast %class.base* %y to void (%class.base*)***
				%L = load i32, i32* %P
				%C = icmp eq i32 %L, 42
				br i1 %C, label %T, label %F

				T:
				indirectbr i32* %P, [label %BrBlock, label %B2]

				F:
				indirectbr i8* %Ptr, [label %BrBlock, label %B2]
				}


				@_ZTIi = external constant i8*

				; Check that an instruction is not hoisted out of a landing pad (%lpad4).
				; Also, within a landing pad no redundancies are removed by gvn-hoist;
				; however, an instruction may be hoisted into a landing pad if the
				; landing pad has direct branches (e.g., %lpad to %catch1, %catch).
				; This CFG has a cycle (%lpad -> %catch1 -> %lpad4 -> %lpad)

				; CHECK-LABEL: @foo2
				; Check that nothing gets hoisted out of %lpad
				; CHECK-LABEL: lpad:
				; CHECK: %bc1 = add i32 %0, 10
				; CHECK: %bc7 = add i32 %0, 10

				; Check that the add is hoisted
				; CHECK-LABEL: catch1:
				; CHECK-NEXT: invoke

				; Check that the add is hoisted
				; CHECK-LABEL: catch:
				; CHECK-NEXT: load

				; Check that other adds are not hoisted
				; CHECK-LABEL: lpad4:
				; CHECK: %bc5 = add i32 %0, 10
				; CHECK-LABEL: unreachable:
				; CHECK: %bc2 = add i32 %0, 10

				; Function Attrs: noinline uwtable
				define i32 @foo2(i32* nocapture readonly %i) local_unnamed_addr personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
				entry:
				%0 = load i32, i32* %i, align 4
				%cmp = icmp eq i32 %0, 0
				br i1 %cmp, label %try.cont, label %if.then

				if.then:
				%exception = tail call i8* @__cxa_allocate_exception(i64 4) #2
				%1 = bitcast i8* %exception to i32*
				store i32 %0, i32* %1, align 16
				invoke void @__cxa_throw(i8* %exception, i8* bitcast (i8** @_ZTIi to i8*), i8* null) #3
				to label %unreachable unwind label %lpad

				lpad:
				%2 = landingpad { i8*, i32 }
				catch i8* bitcast (i8** @_ZTIi to i8*)
				catch i8* null
				%bc1 = add i32 %0, 10
				%3 = extractvalue { i8*, i32 } %2, 0
				%4 = extractvalue { i8*, i32 } %2, 1
				%5 = tail call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTIi to i8*)) #2
				%matches = icmp eq i32 %4, %5
				%bc7 = add i32 %0, 10
				%6 = tail call i8* @__cxa_begin_catch(i8* %3) #2
				br i1 %matches, label %catch1, label %catch

				catch1:
				%bc3 = add i32 %0, 10
				invoke void @__cxa_rethrow() #3
				to label %unreachable unwind label %lpad4

				catch:
				%bc4 = add i32 %0, 10
				%7 = load i32, i32* %i, align 4
				%add = add nsw i32 %7, 1
				tail call void @__cxa_end_catch()
				br label %try.cont

				lpad4:
				%8 = landingpad { i8*, i32 }
				cleanup
				%bc5 = add i32 %0, 10
				tail call void @__cxa_end_catch() #2
				invoke void @__cxa_throw(i8* %exception, i8* bitcast (i8** @_ZTIi to i8*), i8* null) #3
				to label %unreachable unwind label %lpad

				try.cont:
				%k.0 = phi i32 [ %add, %catch ], [ 0, %entry ]
				%bc6 = add i32 %0, 10
				ret i32 %k.0

				unreachable:
				%bc2 = add i32 %0, 10
				ret i32 %bc2
				}

				declare i8* @__cxa_allocate_exception(i64) local_unnamed_addr

				declare void @__cxa_throw(i8*, i8*, i8*) local_unnamed_addr

				declare i32 @__gxx_personality_v0(...)

				; Function Attrs: nounwind readnone
				declare i32 @llvm.eh.typeid.for(i8*) #1

				declare i8* @__cxa_begin_catch(i8*) local_unnamed_addr

				declare void @__cxa_end_catch() local_unnamed_addr

				declare void @__cxa_rethrow() local_unnamed_addr

				attributes #1 = { nounwind readnone }
				attributes #2 = { nounwind }
				attributes #3 = { noreturn }


Diff 110784
