
Constant folding sometimes creates large constants even though a smaller constant is used multiple times
Needs Revision · Public

Authored by ramred01 on Mar 28 2019, 4:53 AM.

Details

Reviewers
spatel
lebedev.ri
Summary

Sometimes the constant folding pass creates a large constant from a smaller one, even though the smaller constant is used multiple times. In that case it is better to materialize the small constant once and reuse it for all of its uses, rather than creating a new large constant and materializing both constants into two registers.

The simple test case is as follows:

int test(int A, int B) {
  int t = A * B >> 8;
  return ((t <= 4096) ? t : 4096);
}

Here, the right shift by 8 is folded into the 4096 for the compare. Thus the generated code materializes both 4096 and 4096 << 8.
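For reference, the fold is a pure change of constant: with an arithmetic right shift, (x >> 8) <= 4096 is equivalent to x < 4097 << 8 = 1048832, the constant that shows up in the IR quoted later in this thread. A minimal C sanity check of that equivalence (assuming >> on a signed int is an arithmetic shift, which is implementation-defined in C but true on common compilers):

```c
/* Original form: shift, then clamp against 4096. */
static int clamp_shifted(int x) {
    int t = x >> 8;                      /* assumes arithmetic shift */
    return (t <= 4096) ? t : 4096;
}

/* Folded form: the compare uses 1048832 = 4097 << 8 and no longer
   depends on the shift result. */
static int clamp_folded(int x) {
    return (x < 1048832) ? (x >> 8) : 4096;
}
```

Both functions return the same value for every input, including negatives, which is why the middle-end considers the fold safe.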

We add a new pass that analyzes the number of times a given constant is used. If a constant is used more than once, we prevent folding on that constant, as it is better to materialize it once and reuse it.
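As a toy illustration of the heuristic described above (not the actual patch, whose LLVM data structures are omitted here), the decision reduces to a use count over the constants referenced by a function:

```c
#include <stddef.h>

/* Hypothetical, simplified model: the constants referenced by a
   function's instructions, flattened into an array. */
static int count_uses(const int *consts, size_t n, int c) {
    int uses = 0;
    for (size_t i = 0; i < n; ++i)
        if (consts[i] == c)
            ++uses;
    return uses;
}

/* Fold only single-use constants; a multiply-used constant is cheaper
   to materialize once and reuse than to duplicate in folded form. */
static int may_fold(const int *consts, size_t n, int c) {
    return count_uses(consts, n, c) <= 1;
}
```

In the test case above, 4096 appears in both the compare and the select, so `may_fold` would reject folding it into a new constant.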

Diff Detail

Event Timeline

ramred01 created this revision. Mar 28 2019, 4:53 AM

This sounds like a back-end problem.
Have you considered/tried solving it there, instead of limiting the middle-end?


This folding happens in the middle-end itself; the LLVM IR already has the folded value. Hence we thought it best to handle it there, as this should affect multiple architectures where materializing constants can take two or more instructions.


This folding is happening in the middle-end itself. The LLVM IR already has the folded value.

That is precisely my point. Is that folding not optimal for the IR?

Hence we thought it best to handle it there as this should be affecting multiple architectures where materializing constants can take two or more instructions.

Have you considered the case where you *already* had that IR?
Even though the middle-end did not 'degrade' it, it will still not be handled by back-ends 'properly', right?


That is precisely my point. Is that folding not optimal for the IR?

That folding would be optimal for the IR if the constant were not reused. If the constant is reused, most architectures do better materializing one constant in a register and reusing it, rather than materializing two constants. Most architectures require two instructions to materialize a 32-bit constant. Even counting the shift operation that now has to be executed, with just one reuse the unfolded form generates one instruction fewer; with more reuses, each of them folded, the savings grow.


Have you considered the case where you *already* had that IR?
Even though the middle-end did not 'degrade' it, it will still not be handled by back-ends 'properly', right?

Yes, you are right. What if I had that IR itself to deal with in the backend? There are two possibilities. One is when one constant is obviously related to another by a single simple arithmetic operation; those cases might be easier to decipher. But what if a compare canonicalization pass in the interim has changed the constant? Then the relationship is no longer obvious, and we either miss it or have to employ some more complex mechanism to undo the folding.

Instead, isn't it better to prevent the folding in the first place, when we know that for most architectures it will cause more instructions to be generated? There could be benefits for a few architectures in the current situation, but I am not aware of them.


I think you're mixing up canonicalization with optimization (which I've also done a bunch). The goal in instcombine is to produce canonical code, that is, to reduce the infinite patterns that might have existed in the source to a smaller set of easier-to-optimize forms. Canonical code is often identical to optimal code, but sometimes they diverge. That's when things get harder (and we might end up reversing an earlier transform). But we have to deal with it in the backend because, as Roman noted, any solution at this level would be incomplete: the input might already be in the form that you don't want.


I get your point and fully agree with it.

But if a certain canonicalization always results in a form that requires reversing the transform at a later stage, wouldn't we be better off not performing that transform in the first place? A canonicalization that exposes better optimization opportunities across architectures makes more sense, doesn't it?


Is it *always*, though?
If I understand correctly, you want to block this transform, right?
https://godbolt.org/z/IvWYKl

we have essentially replaced

%4 = ashr i32 %3, 8
%5 = icmp slt i32 %4, 4096

with

%5 = icmp slt i32 %3, 1048832

That icmp no longer depends on the ashr, so we have shortened the data dependency chain, and the icmp can execute without waiting for the ashr.

Which, as can be seen from the lowest (llvm-mca) view there, unsurprisingly improves performance (assuming I fixed up the asm for mca correctly).

So i'm not convinced that this hammer approach is the right solution to the problem.

ramred01 added a comment (edited). Mar 29 2019, 5:46 AM


Yes, that is exactly what I am trying to block.

What I have missed out is that this should apply only at -Os and -Oz. When performance is the criterion, the folding is perfectly fine. But when size is the criterion, the problem is the same across architectures, since you cannot materialize a 32-bit constant in fewer than two instructions, the only exceptions being powers of two on certain architectures. So, if we fold a constant that has multiple uses and create a new constant for each such use, each new constant adds two instructions to the size.
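The two-instruction claim can be sketched with a rough RISC-V-style cost model (lui/addi materialization; the function and thresholds here are illustrative, not taken from the patch). Under this model the original constant 4096 fits a single lui, while the folded 1048832 (0x100100, nonzero low 12 bits) needs lui plus addi:

```c
#include <stdint.h>

/* Rough RISC-V-flavoured materialization cost for a 32-bit constant:
   1 instruction if it fits a sign-extended 12-bit addi immediate or a
   bare lui (low 12 bits zero), otherwise lui + addi = 2 instructions. */
static int materialization_cost(int32_t v) {
    if (v >= -2048 && v <= 2047)
        return 1;               /* addi rd, x0, imm */
    if ((v & 0xFFF) == 0)
        return 1;               /* lui rd, imm20 */
    return 2;                   /* lui + addi */
}
```

So at -Os, reusing the already-materialized 4096 is never worse than introducing a second, larger constant, and is usually strictly smaller.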

In fact, we can make the entire analysis and decision making conditional on -Os and -Oz, so that it does not affect the performance-oriented optimization path.

ramred01 updated this revision to Diff 193855. Apr 5 2019, 7:28 AM

Updated the patch with checks that the optimization level is -Os or -Oz.

Now, if the compilation is not optimizing for size, the code added by this patch is not executed at all.

lebedev.ri requested changes to this revision. Apr 6 2019, 8:04 AM

This still feels incredibly intrusive.
At this stage, you don't know whether you will have issues with the immediates you would get.
It is also extremely pessimistic, as it bans the creation of *all* new constants.
It does not attempt to model whether the new immediate would or would not be a problem.

TL;DR: I believe this needs a cost/benefit analysis (a test-suite code-size and performance change report) plus an RFC on llvm-dev.

This revision now requires changes to proceed. Apr 6 2019, 8:04 AM