This is an archive of the discontinued LLVM Phabricator instance.


Differential D8950

Simplify n-ary adds by reassociation
Closed, Public

Authored by jingyue on Apr 9 2015, 9:26 PM.

Details

Reviewers

broune
dberlin
atrick
sanjoy
meheff
hfinkel

Commits

rG8cb6b2a292ea: Simplify n-ary adds by reassociation
rL234855: Simplify n-ary adds by reassociation

Summary

This transformation reassociates an n-ary add so that the add can partially reuse
existing instructions. For example, this pass can simplify

void foo(int a, int b) {
  bar(a + b);
  bar((a + 2) + b);
}

to

void foo(int a, int b) {
  int t = a + b;
  bar(t);
  bar(t + 2);
}

saving one add instruction.

Fixes PR22357 (https://llvm.org/bugs/show_bug.cgi?id=22357).
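
The rewrite described in the summary can be sketched outside LLVM. The following is a minimal toy model, not the pass itself: all values live in one straight-line block (so every earlier value trivially dominates every later one), a sorted list of leaf names stands in for a SCEV, and the single-use check is omitted. The names `Inst`, `Sum`, `sumOf`, `merged`, and `reassociate` are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a SCEV (assumption): the sorted list of leaf operands.
using Sum = std::vector<std::string>;

// A value is either a leaf (lhs == -1) or an add of two earlier values.
struct Inst {
  int lhs = -1, rhs = -1;
  std::string leaf;
};

// Concatenates two sums and restores the canonical (sorted) order.
Sum merged(Sum s, const Sum &t) {
  s.insert(s.end(), t.begin(), t.end());
  std::sort(s.begin(), s.end());
  return s;
}

// Canonical sum of value `id`; operand order and nesting do not matter.
Sum sumOf(const std::vector<Inst> &prog, int id) {
  const Inst &inst = prog[id];
  if (inst.lhs < 0)
    return {inst.leaf};
  return merged(sumOf(prog, inst.lhs), sumOf(prog, inst.rhs));
}

// Rewrites I = (A + B) + R into T + B (or T + A) when an earlier add T
// already computes A + R (or B + R). Returns the number of rewrites.
int reassociate(std::vector<Inst> &prog) {
  std::map<Sum, int> seen; // canonical sum -> earliest add computing it
  int rewrites = 0;
  for (int i = 0; i < static_cast<int>(prog.size()); ++i) {
    if (prog[i].lhs < 0)
      continue; // leaves are never rewritten or recorded
    bool done = false;
    // Try both operand orders of I.
    for (int swap = 0; swap < 2 && !done; ++swap) {
      int l = swap ? prog[i].rhs : prog[i].lhs;
      int r = swap ? prog[i].lhs : prog[i].rhs;
      if (prog[l].lhs < 0)
        continue; // l is not an add, nothing to reassociate
      const int pairs[2][2] = {{prog[l].lhs, prog[l].rhs},
                               {prog[l].rhs, prog[l].lhs}};
      for (const auto &p : pairs) {
        auto it = seen.find(merged(sumOf(prog, p[0]), sumOf(prog, r)));
        if (it == seen.end())
          continue;
        prog[i].lhs = it->second; // reuse the existing add
        prog[i].rhs = p[1];
        ++rewrites;
        done = true;
        break;
      }
    }
    seen.emplace(sumOf(prog, i), i);
  }
  return rewrites;
}
```

For the summary's example, take values 0, 1, 2 as the leaves `a`, `b`, `2`, value 3 as `a + b`, value 4 as `a + 2`, and value 5 as `(a + 2) + b`. In this toy model, `reassociate` rewrites value 5 into value 3 plus the leaf `2`, which leaves `a + 2` without users.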

Event Timeline

jingyue updated this revision to Diff 23563. Apr 9 2015, 9:26 PM

jingyue retitled this revision to Simplify n-ary adds by reassociation.

jingyue updated this object.

jingyue edited the test plan for this revision.

jingyue added reviewers: atrick, broune, dberlin, hfinkel, meheff, sanjoy.

jingyue added a subscriber: Unknown Object (MLST).

I previously tried performing this optimization in StraightLineStrengthReduce (D8898), but realized it is probably worth a standalone pass for the following reasons:

it is general enough;
users can flexibly choose where to run it;
it can be made more efficient without being restricted to the framework of SLSR.

So, I gave it another try, and would like to see how you like this implementation. As mentioned in the header comments, this implementation is preliminary and has lots of limitations. But I hope to make it more fully-fledged iteratively. For now, I only plan to use it after SLSR in the NVPTX backend.

Your implementation looks good to me.

I would like at least one other reviewer to sign off on creating a new pass to do this--I'm not sure what the best answer is. There are several places in LLVM where reassociation makes sense. This should probably not happen in the current Reassociate pass because that is really a canonicalization and it may be better to leave the simple single-use expressions in place for further optimization. We would also like to reassociate in machine code based on registers and critical path, but this is redundancy elimination so doesn't belong there either. So I'm personally ok with a new IR pass that people can experiment with in their pipelines. I would still like to get feedback from others first.

This revision is now accepted and ready to land. Apr 13 2015, 2:15 PM

Minor comments inline.

lib/Transforms/Scalar/NaryReassociate.cpp
74	Since you're using `PatternMatch` in only one function, I think the namespace should be opened in just that scope.
141	I think LLVM convention is `Node`. Also, I'd suggest using `auto *Node` to make it obvious that we have a pointer here (very minor).
200	Is it possible to switch to a traversal and expression gathering scheme that will always visit defs before uses? Reverse post-order maybe? I think that way we can avoid the `dominates` query (which can be expensive) and just turn it into an `assert`; and just use the earliest (latest?) element in `LHSCandidates`.

I'm okay with this living as a separate pass. I think this pass exposes clear, orthogonal functionality and can also be made more efficient by the virtue of working on an entire function at once.

Minor comments inline, all of which are okay to fix either pre or post commit, as you see fit.

Thanks Andrew and Sanjoy for the review! All comments are addressed inline.

lib/Transforms/Scalar/NaryReassociate.cpp
74	I have a pending patch based on this, and that patch will use `PatternMatch` too. So I'll leave the code as is for now.
141	Changing `auto` to `auto *` doesn't compile. I guess it's because `nodes_begin` returns an iterator and the compiler is unable to deduce `auto *` from the iterator type. Additionally, even if the code compiled, `++Node` would be problematic.
200	We traversed the instructions in the pre-order of the dominator tree, so defs are already always visited before uses. However, this doesn't avoid checking `dominates(LHS, I)` because `LHS` being visited before `I` does not mean `LHS` dominating `I`.

Anyhow, I think there is a way of avoiding the dominance check, and I'm happy to put it in the next patch. Whenever we backtrack from a basic block during DFS'ing the dominator tree, we remove all SCEVs defined in this basic block from `SeenExprs`. This can ensure that, at any time, all the instructions in `SeenExprs` dominate the basic block we are exploring. For example, suppose the dominator tree of this function looks like:

    B1 -> B2
    |
    V
    B3 -> B4
    |
    V
    B5

Then, we traverse the tree in this order: +B1 +B2 -B2 +B3 +B4 -B4 +B5 -B5 -B3 -B1. + means entering a node, and - means exiting a node. Let's pick an arbitrary basic block, say B4. Before entering B4, we added B1, B2, and B3 to `SeenExprs` and removed B2 from it, so `SeenExprs` contains instructions in B1 and B3 which must dominate B4.
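
The enter/exit scheme proposed in this reply can be illustrated with a small standalone sketch. It is only a toy model under stated assumptions, not the follow-up patch: blocks are plain integers, expressions are strings instead of SCEVs, and `DomTree`, `SeenExprs`, and `visit` are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy dominator tree: children[b] are the blocks immediately
// dominated by b, and exprs[b] lists the expressions computed in block b.
struct DomTree {
  std::vector<std::vector<int>> children;
  std::vector<std::vector<std::string>> exprs;
};

// Expression -> stack of blocks that compute it and are still "entered".
using SeenExprs = std::map<std::string, std::vector<int>>;

// Depth-first walk of the dominator tree. On entering a block its
// expressions are pushed; on leaving it they are popped again, so `seen`
// only ever holds expressions from blocks on the current tree path.
// visibleOnEntry[b] records which blocks had entries in `seen` when b was
// entered (kept only so the invariant can be inspected afterwards).
void visit(const DomTree &dt, int block, SeenExprs &seen,
           std::vector<std::set<int>> &visibleOnEntry) {
  for (const auto &entry : seen)
    visibleOnEntry[block].insert(entry.second.begin(), entry.second.end());

  for (const std::string &e : dt.exprs[block]) // +block
    seen[e].push_back(block);

  for (int child : dt.children[block])
    visit(dt, child, seen, visibleOnEntry);

  for (const std::string &e : dt.exprs[block]) { // -block
    seen[e].pop_back();
    if (seen[e].empty())
      seen.erase(e);
  }
}
```

With the tree from the comment (B1 has children B2 and B3, and B3 has children B4 and B5), the blocks recorded on entry to B4 in this sketch are B1 and B3 only, which are exactly the blocks on the dominator-tree path to B4.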

Addressed Sanjoy's comments

jingyue closed this revision. Apr 13 2015, 10:02 PM

jingyue mentioned this in D9055: [NaryReassociate] speeds up candidate searching. Apr 16 2015, 11:22 AM

jingyue mentioned this in rL235129: [NaryReassociate] speeds up candidate searching. Apr 16 2015, 11:45 AM

Revision Contents

Path                                           Size
include/llvm/InitializePasses.h                1 line
include/llvm/LinkAllPasses.h                   1 line
include/llvm/Transforms/Scalar.h               6 lines
lib/Transforms/Scalar/CMakeLists.txt           1 line
lib/Transforms/Scalar/NaryReassociate.cpp      206 lines
lib/Transforms/Scalar/Scalar.cpp               1 line
test/Transforms/NaryReassociate/nary-add.ll    127 lines

Diff 23715

include/llvm/InitializePasses.h

 ...
 void initializeMemCpyOptPass(PassRegistry&);
 void initializeMemDepPrinterPass(PassRegistry&);
 void initializeMemDerefPrinterPass(PassRegistry&);
 void initializeMemoryDependenceAnalysisPass(PassRegistry&);
 void initializeMergedLoadStoreMotionPass(PassRegistry &);
 void initializeMetaRenamerPass(PassRegistry&);
 void initializeMergeFunctionsPass(PassRegistry&);
 void initializeModuleDebugInfoPrinterPass(PassRegistry&);
+void initializeNaryReassociatePass(PassRegistry&);
 void initializeNoAAPass(PassRegistry&);
 void initializeObjCARCAliasAnalysisPass(PassRegistry&);
 void initializeObjCARCAPElimPass(PassRegistry&);
 void initializeObjCARCExpandPass(PassRegistry&);
 void initializeObjCARCContractPass(PassRegistry&);
 void initializeObjCARCOptPass(PassRegistry&);
 void initializePAEvalPass(PassRegistry &);
 void initializeOptimizePHIsPass(PassRegistry&);
 ...

include/llvm/LinkAllPasses.h

 ForcePassLinking() {
   ...
   (void) llvm::createLoopRerollPass();
   (void) llvm::createLoopUnrollPass();
   (void) llvm::createLoopUnswitchPass();
   (void) llvm::createLoopIdiomPass();
   (void) llvm::createLoopRotatePass();
   (void) llvm::createLowerExpectIntrinsicPass();
   (void) llvm::createLowerInvokePass();
   (void) llvm::createLowerSwitchPass();
+  (void) llvm::createNaryReassociatePass();
   (void) llvm::createNoAAPass();
   (void) llvm::createObjCARCAliasAnalysisPass();
   (void) llvm::createObjCARCAPElimPass();
   (void) llvm::createObjCARCExpandPass();
   (void) llvm::createObjCARCContractPass();
   (void) llvm::createObjCARCOptPass();
   (void) llvm::createPAEvalPass();
   (void) llvm::createPromoteMemoryToRegisterPass();
   ...

include/llvm/Transforms/Scalar.h

 ...
 FunctionPass *createRewriteStatepointsForGCPass();

 //===----------------------------------------------------------------------===//
 //
 // Float2Int - Demote floats to ints where possible.
 //
 FunctionPass *createFloat2IntPass();

+//===----------------------------------------------------------------------===//
+//
+// NaryReassociate - Simplify n-ary operations by reassociation.
+//
+FunctionPass *createNaryReassociatePass();
+
 } // End llvm namespace

 #endif

lib/Transforms/Scalar/CMakeLists.txt

 add_llvm_library(LLVMScalarOpts
   ...
   LoopRotation.cpp
   LoopStrengthReduce.cpp
   LoopUnrollPass.cpp
   LoopUnswitch.cpp
   LowerAtomic.cpp
   LowerExpectIntrinsic.cpp
   MemCpyOptimizer.cpp
   MergedLoadStoreMotion.cpp
+  NaryReassociate.cpp
   PartiallyInlineLibCalls.cpp
   PlaceSafepoints.cpp
   Reassociate.cpp
   Reg2Mem.cpp
   RewriteStatepointsForGC.cpp
   SCCP.cpp
   SROA.cpp
   SampleProfile.cpp
   ...

lib/Transforms/Scalar/NaryReassociate.cpp

This file was added.

//===- NaryReassociate.cpp - Reassociate n-ary expressions ----------------===//
//
//                     The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This pass reassociates n-ary add expressions and eliminates the redundancy
// exposed by the reassociation.
//
// A motivating example:
//
//   void foo(int a, int b) {
//     bar(a + b);
//     bar((a + 2) + b);
//   }
//
// An ideal compiler should reassociate (a + 2) + b to (a + b) + 2 and simplify
// the above code to
//
//   int t = a + b;
//   bar(t);
//   bar(t + 2);
//
// However, the Reassociate pass is unable to do that because it processes each
// instruction individually and believes (a + 2) + b is the best form according
// to its rank system.
//
// To address this limitation, NaryReassociate reassociates an expression in a
// form that reuses existing instructions. As a result, NaryReassociate can
// reassociate (a + 2) + b in the example to (a + b) + 2 because it detects that
// (a + b) is computed before.
//
// NaryReassociate works as follows. For every instruction in the form of (a +
// b) + c, it checks whether a + c or b + c is already computed by a dominating
// instruction. If so, it then reassociates (a + b) + c into (a + c) + b or (b +
// c) + a respectively. To efficiently look up whether an expression is
// computed before, we store each instruction seen and its SCEV into an
// SCEV-to-instruction map.
//
// Although the algorithm pattern-matches only ternary additions, it
// automatically handles many >3-ary expressions by walking through the function
// in the depth-first order. For example, given
//
//   (a + c) + d
//   ((a + b) + c) + d
//
// NaryReassociate first rewrites (a + b) + c to (a + c) + b, and then rewrites
// ((a + c) + b) + d into ((a + c) + d) + b.
//
// Limitations and TODO items:
//
// 1) We only consider n-ary adds for now. This should be extended and
// generalized.
//
// 2) Besides arithmetic operations, similar reassociation can be applied to
// GEPs. For example, if
//   X = &arr[a]
// dominates
//   Y = &arr[a + b]
// we may rewrite Y into X + b.
//
//===----------------------------------------------------------------------===//

#include "llvm/Analysis/ScalarEvolution.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/PatternMatch.h"
#include "llvm/Transforms/Scalar.h"
using namespace llvm;
using namespace PatternMatch;

#define DEBUG_TYPE "nary-reassociate"

namespace {
class NaryReassociate : public FunctionPass {
public:
  static char ID;

  NaryReassociate(): FunctionPass(ID) {
    initializeNaryReassociatePass(*PassRegistry::getPassRegistry());
  }

  bool runOnFunction(Function &F) override;

  void getAnalysisUsage(AnalysisUsage &AU) const override {
    AU.addPreserved<DominatorTreeWrapperPass>();
    AU.addRequired<DominatorTreeWrapperPass>();
    // TODO: can we preserve ScalarEvolution?
    AU.addRequired<ScalarEvolution>();
    AU.setPreservesCFG();
  }

private:
  // Reassociates I to a better form.
  Instruction *tryReassociateAdd(Instruction *I);
  // A helper function for tryReassociateAdd. LHS and RHS are explicitly passed.
  Instruction *tryReassociateAdd(Value *LHS, Value *RHS, Instruction *I);
  // Rewrites I to LHS + RHS if LHS is computed already.
  Instruction *tryReassociatedAdd(const SCEV *LHS, Value *RHS, Instruction *I);

  DominatorTree *DT;
  ScalarEvolution *SE;
  // A lookup table quickly telling which instructions compute the given SCEV.
  // Note that there can be multiple instructions at different locations
  // computing to the same SCEV. For example,
  //   if (p1)
  //     foo(a + b);
  //   if (p2)
  //     bar(a + b);
  DenseMap<const SCEV *, SmallVector<Instruction *, 2>> SeenExprs;
};
} // anonymous namespace

char NaryReassociate::ID = 0;
INITIALIZE_PASS_BEGIN(NaryReassociate, "nary-reassociate", "Nary reassociation",
                      false, false)
INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
INITIALIZE_PASS_DEPENDENCY(ScalarEvolution)
INITIALIZE_PASS_END(NaryReassociate, "nary-reassociate", "Nary reassociation",
                    false, false)

FunctionPass *llvm::createNaryReassociatePass() {
  return new NaryReassociate();
}

bool NaryReassociate::runOnFunction(Function &F) {
  if (skipOptnoneFunction(F))
    return false;

  DT = &getAnalysis<DominatorTreeWrapperPass>().getDomTree();
  SE = &getAnalysis<ScalarEvolution>();

  // Traverse the dominator tree in the depth-first order. This order makes sure
  // all bases of a candidate are in Candidates when we process it.
  bool Changed = false;
  SeenExprs.clear();
  for (auto Node = GraphTraits<DominatorTree *>::nodes_begin(DT);
       Node != GraphTraits<DominatorTree *>::nodes_end(DT); ++Node) {
    BasicBlock *BB = Node->getBlock();
    for (auto I = BB->begin(); I != BB->end(); ++I) {
      if (I->getOpcode() == Instruction::Add) {
        if (Instruction *NewI = tryReassociateAdd(I)) {
          I->replaceAllUsesWith(NewI);
          I->eraseFromParent();
          I = NewI;
        }
        // We should add the rewritten instruction because tryReassociateAdd may
        // have invalidated the original one.
        SeenExprs[SE->getSCEV(I)].push_back(I);
      }
    }
  }
  return Changed;
}

Instruction *NaryReassociate::tryReassociateAdd(Instruction *I) {
  Value *LHS = I->getOperand(0), *RHS = I->getOperand(1);
  if (auto *NewI = tryReassociateAdd(LHS, RHS, I))
    return NewI;
  if (auto *NewI = tryReassociateAdd(RHS, LHS, I))
    return NewI;
  return nullptr;
}

Instruction *NaryReassociate::tryReassociateAdd(Value *LHS, Value *RHS,
                                                Instruction *I) {
  Value *A = nullptr, *B = nullptr;
  // To be conservative, we reassociate I only when it is the only user of A+B.
  if (LHS->hasOneUse() && match(LHS, m_Add(m_Value(A), m_Value(B)))) {
    // I = (A + B) + RHS
    //   = (A + RHS) + B or (B + RHS) + A
    const SCEV *AExpr = SE->getSCEV(A), *BExpr = SE->getSCEV(B);
    const SCEV *RHSExpr = SE->getSCEV(RHS);
    if (auto *NewI = tryReassociatedAdd(SE->getAddExpr(AExpr, RHSExpr), B, I))
      return NewI;
    if (auto *NewI = tryReassociatedAdd(SE->getAddExpr(BExpr, RHSExpr), A, I))
      return NewI;
  }
  return nullptr;
}

Instruction *NaryReassociate::tryReassociatedAdd(const SCEV *LHSExpr,
                                                 Value *RHS, Instruction *I) {
  auto Pos = SeenExprs.find(LHSExpr);
  // Bail out if LHSExpr is not previously seen.
  if (Pos == SeenExprs.end())
    return nullptr;

  auto &LHSCandidates = Pos->second;
  unsigned NumIterations = 0;
  // Search at most 10 items to avoid running quadratically.
  static const unsigned MaxNumIterations = 10;
  for (auto LHS = LHSCandidates.rbegin();
       LHS != LHSCandidates.rend() && NumIterations < MaxNumIterations;
       ++LHS, ++NumIterations) {
    if (DT->dominates(*LHS, I)) {
      Instruction *NewI = BinaryOperator::CreateAdd(*LHS, RHS, "", I);
      NewI->takeName(I);
      return NewI;
    }
  }
  return nullptr;
}

lib/Transforms/Scalar/Scalar.cpp

 void llvm::initializeScalarOpts(PassRegistry &Registry) {
   ...
   initializeLoopRerollPass(Registry);
   initializeLoopUnrollPass(Registry);
   initializeLoopUnswitchPass(Registry);
   initializeLoopIdiomRecognizePass(Registry);
   initializeLowerAtomicPass(Registry);
   initializeLowerExpectIntrinsicPass(Registry);
   initializeMemCpyOptPass(Registry);
   initializeMergedLoadStoreMotionPass(Registry);
+  initializeNaryReassociatePass(Registry);
   initializePartiallyInlineLibCallsPass(Registry);
   initializeReassociatePass(Registry);
   initializeRegToMemPass(Registry);
   initializeRewriteStatepointsForGCPass(Registry);
   initializeSCCPPass(Registry);
   initializeIPSCCPPass(Registry);
   initializeSROAPass(Registry);
   initializeSROA_DTPass(Registry);
   ...

test/Transforms/NaryReassociate/nary-add.ll

This file was added.

; RUN: opt < %s -nary-reassociate -S | FileCheck %s

target datalayout = "e-i64:64-v16:16-v32:32-n16:32:64"

declare void @foo(i32 %a)

; foo(a + c);
; foo(a + (b + c));
;   =>
; t = a + c;
; foo(t);
; foo(t + b);
define void @left_reassociate(i32 %a, i32 %b, i32 %c) {
; CHECK-LABEL: @left_reassociate(
  %1 = add i32 %a, %c
; CHECK: [[BASE:%[a-zA-Z0-9]+]] = add i32 %a, %c
  call void @foo(i32 %1)
  %2 = add i32 %b, %c
  %3 = add i32 %a, %2
; CHECK: add i32 [[BASE]], %b
  call void @foo(i32 %3)
  ret void
}

; foo(a + c);
; foo((a + b) + c);
;   =>
; t = a + c;
; foo(t);
; foo(t + b);
define void @right_reassociate(i32 %a, i32 %b, i32 %c) {
; CHECK-LABEL: @right_reassociate(
  %1 = add i32 %a, %c
; CHECK: [[BASE:%[a-zA-Z0-9]+]] = add i32 %a, %c
  call void @foo(i32 %1)
  %2 = add i32 %a, %b
  %3 = add i32 %2, %c
; CHECK: add i32 [[BASE]], %b
  call void @foo(i32 %3)
  ret void
}

; t1 = a + c;
; foo(t1);
; t2 = a + b;
; foo(t2);
; t3 = t2 + c;
; foo(t3);
;
; Do not rewrite t3 into t1 + b because t2 is used elsewhere and is likely free.
define void @no_reassociate(i32 %a, i32 %b, i32 %c) {
; CHECK-LABEL: @no_reassociate(
  %1 = add i32 %a, %c
; CHECK: add i32 %a, %c
  call void @foo(i32 %1)
  %2 = add i32 %a, %b
; CHECK: add i32 %a, %b
  call void @foo(i32 %2)
  %3 = add i32 %2, %c
; CHECK: add i32 %2, %c
  call void @foo(i32 %3)
  ret void
}

; if (p1)
;   foo(a + c);
; if (p2)
;   foo(a + c);
; if (p3)
;   foo((a + b) + c);
;
; No action because (a + c) does not dominate ((a + b) + c).
define void @conditional(i1 %p1, i1 %p2, i1 %p3, i32 %a, i32 %b, i32 %c) {
; CHECK-LABEL: @conditional(
entry:
  br i1 %p1, label %then1, label %branch1

then1:
  %0 = add i32 %a, %c
; CHECK: add i32 %a, %c
  call void @foo(i32 %0)
  br label %branch1

branch1:
  br i1 %p2, label %then2, label %branch2

then2:
  %1 = add i32 %a, %c
; CHECK: add i32 %a, %c
  call void @foo(i32 %1)
  br label %branch2

branch2:
  br i1 %p3, label %then3, label %return

then3:
  %2 = add i32 %a, %b
; CHECK: %2 = add i32 %a, %b
  %3 = add i32 %2, %c
; CHECK: add i32 %2, %c
  call void @foo(i32 %3)
  br label %return

return:
  ret void
}

; foo((a + b) + c)
; foo(((a + d) + b) + c)
;   =>
; t = (a + b) + c;
; foo(t);
; foo(t + d);
define void @quaternary(i32 %a, i32 %b, i32 %c, i32 %d) {
; CHECK-LABEL: @quaternary(
  %1 = add i32 %a, %b
  %2 = add i32 %1, %c
  call void @foo(i32 %2)
; CHECK: call void @foo(i32 [[TMP1:%[a-zA-Z0-9]]])
  %3 = add i32 %a, %d
  %4 = add i32 %3, %b
  %5 = add i32 %4, %c
; CHECK: [[TMP2:%[a-zA-Z0-9]]] = add i32 [[TMP1]], %d
  call void @foo(i32 %5)
; CHECK: call void @foo(i32 [[TMP2]]
  ret void
}
