This is an archive of the discontinued LLVM Phabricator instance.


Differential D127727

[SeparateConstOffsetFromGEPPass] Added optional modification strategy
AbandonedPublic

Authored by eklepilkina on Jun 14 2022, 2:04 AM.

Details

Reviewers

anton-afanasyev
luismarques
jingyue
craig.topper
eli.friedman
mkazantsev

Summary

This modification strategy tries to determine which GEP instructions are profitable to modify in order to decrease register pressure.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

eklepilkina created this revision. Jun 14 2022, 2:04 AM

Herald added a project: Restricted Project. Jun 14 2022, 2:04 AM

Herald added subscribers: sunshaoce, VincentWu, luke957 and 29 others.

eklepilkina requested review of this revision. Jun 14 2022, 2:04 AM

Herald added a project: Restricted Project. Jun 14 2022, 2:04 AM

Herald added subscribers: llvm-commits, pcwang-thead, eopXD, MaskRay.

eklepilkina added reviewers: anton-afanasyev, luismarques, jingyue, wu. Jun 14 2022, 2:08 AM

eklepilkina removed a reviewer: wu.

Harbormaster completed remote builds in B169665: Diff 436709. Jun 14 2022, 3:08 AM

Please rebase against precommitted tests.

anton-afanasyev edited the summary of this revision. Jun 14 2022, 5:07 AM

luismarques added reviewers: craig.topper, eli.friedman. Jun 14 2022, 5:55 AM

Herald added a subscriber: StephenFan. Jun 14 2022, 5:55 AM

Rebased against precommitted tests

Harbormaster completed remote builds in B169701: Diff 436758. Jun 14 2022, 6:02 AM

At least for me, I need some context to be able to review this. Which case does this improve in terms of codegen? And how common are such patterns? Keep in mind that I'm not terribly familiar with this pass, so this may call for a pretty basic explanation. Do you have a bug with examples, or an analysis that led to this change?

Sorry, I should have provided the context at the beginning.

Currently, clang for RISC-V doesn't use offset addressing in the generated assembly. An example from Dhrystone:

    addiw   a0, s1, 5
    slli    a1, a0, 0x2
    add     a2, s4, a1
    sw      s2, 0(a2)
    addiw   a3, s1, 6
    slli    a3, a3, 0x2
    add     a3, a3, s4
    sw      s2, 0(a3)
    addiw   a3, s1, 35
    slli    a3, a3, 0x2
    add     a3, a3, s4
    sw      a0, 0(a3)

This is inefficient because the constant parts of the addresses could be folded into the immediate offsets of the stores.
Adding this pass allows the following code to be generated:

    addiw   a4, a2, 5
    slli    a5, a2, 2
    add     a0, a0, a5
    sw      a3, 20(a0)
    sw      a3, 24(a0)
    sw      a4, 140(a0)
...
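
The immediates 20, 24 and 140 in the second listing are the constant index parts 5, 6 and 35 multiplied by the 4-byte element size implied by the shift by 2. A minimal sketch of that arithmetic follows; `SplitAddress` and `splitAddress` are hypothetical names used only for illustration and are not part of the patch.

```cpp
#include <cstdint>

// Hypothetical illustration: the byte offset of element (Index + ConstPart)
// is split into a variable part, which can be computed once and shared, and
// a constant part, which can be folded into the load/store immediate.
struct SplitAddress {
  std::int64_t VariableBytes; // Index * ElemSize
  std::int64_t ConstantBytes; // ConstPart * ElemSize
};

constexpr SplitAddress splitAddress(std::int64_t Index, std::int64_t ConstPart,
                                    std::int64_t ElemSize) {
  return {Index * ElemSize, ConstPart * ElemSize};
}
```

For any index, the two parts add up to the original offset `(Index + ConstPart) * ElemSize`, so the three stores can share one variable base and differ only in their immediates.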

SeparateConstOffsetFromGEPPass is used to solve this problem on targets with limited addressing modes.
The changes inside the pass were made because modifying all GEPs isn't profitable. It seems that we need at least 2 GEPs, plus one value that was used as an index and that can be removed after the modification. Otherwise we don't decrease register pressure.
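
The "at least 2 GEPs" condition can be pictured as grouping candidates by the base they would share after splitting. The sketch below is a simplified model under stated assumptions, not the patch's implementation: `Candidate` and `selectProfitable` are hypothetical names, operands are modelled as strings, and the second condition (the index value becoming removable) is not modelled at all.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a GEP whose index has the form (Var + Const).
struct Candidate {
  std::string Base;   // pointer operand, e.g. "%array"
  std::string Var;    // non-constant part of the index, e.g. "%i"
  std::int64_t Const; // constant part of the index
};

// Keeps only the candidates whose (Base, Var) pair is shared by at least
// MinGroupSize candidates, i.e. those that could reuse one new base address.
std::vector<Candidate> selectProfitable(const std::vector<Candidate> &All,
                                        std::size_t MinGroupSize = 2) {
  std::map<std::pair<std::string, std::string>, std::size_t> GroupSize;
  for (const Candidate &C : All)
    ++GroupSize[std::make_pair(C.Base, C.Var)];

  std::vector<Candidate> Selected;
  for (const Candidate &C : All)
    if (GroupSize[std::make_pair(C.Base, C.Var)] >= MinGroupSize)
      Selected.push_back(C);
  return Selected;
}
```

With the Dhrystone stores above, all three accesses would fall into one group, whereas a lone access with a different index variable would be left untouched.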

This should be two patches, one changing the pass and one enabling for RISC-V.

llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
171 ↗	(On Diff #436758)	Can we add a command line option to control this like AArch64 and PowerPC have?
llvm/test/Transforms/SeparateConstOffsetFromGEP/RISCV/split-gep.ll
2	This test doesn't exist in the repo. Where is the patch that adds it?

eklepilkina added inline comments. Jun 14 2022, 8:54 AM

llvm/test/Transforms/SeparateConstOffsetFromGEP/RISCV/split-gep.ll
2	I was told in the first comment to rebase on precommited tests. These tests are added as precommited in separate commit. Should I commit them?

craig.topper added inline comments. Jun 14 2022, 8:58 AM

llvm/test/Transforms/SeparateConstOffsetFromGEP/RISCV/split-gep.ll
2	Why is the script `NOTE: Assertions have been autogenerated by utils/update_test_checks.py` not in the pre-committed version?

This should be two patches, one changing the pass and one enabling for RISC-V.

Since I want to turn the pass on with the strategy enabled, should I wait for the strategy patch to be approved and merged, and only then create the second review? Or should I create a series of patches as mentioned in the documentation https://llvm.org/docs/Phabricator.html#creating-a-patch-series?

craig.topper added inline comments. Jun 14 2022, 9:09 AM

llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
368	I think there should be a std::move on `PreviousIndices` here
371	rhs -> RHS
970–974	`PossibleBase.size() == 0` -> `PossibleBases.empty()`
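
The first and third suggestions are general C++ idioms: a constructor that takes a container by value can move it into the member instead of copying it a second time, and `empty()` states the intent more directly than comparing `size()` with zero. A minimal sketch, using `std::vector` as a stand-in for `SmallVector` and a hypothetical `BaseInfo` type:

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for a struct that stores a list of indices.
struct BaseInfo {
  std::vector<int> PreviousIndices;

  // The parameter is taken by value (a "sink" parameter), so moving it into
  // the member avoids copying the elements a second time.
  explicit BaseInfo(std::vector<int> Indices)
      : PreviousIndices(std::move(Indices)) {}

  // empty() instead of size() == 0.
  bool hasPreviousIndices() const { return !PreviousIndices.empty(); }
};
```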

In D127727#3582087, @eklepilkina wrote:

This should be two patches, one changing the pass and one enabling for RISC-V.

Since I want to turn the pass on with the strategy enabled, should I wait for the strategy patch to be approved and merged, and only then create the second review? Or should I create a series of patches as mentioned in the documentation https://llvm.org/docs/Phabricator.html#creating-a-patch-series?

You should create a series of patches.

anton-afanasyev mentioned this in rG4e1090cfe9d4: [test][RISCV] Precommit test for SeparateConstOffsetFromGEP (NFC). Jun 15 2022, 6:05 AM

Separate part with pass modification

eklepilkina retitled this revision from [RISCV] Turn on SeparateConstOffsetFromGEPPass for RISC-V target and added optional modification strategy in it to [SeparateConstOffsetFromGEPPass] Added optional modification strategy. Jun 15 2022, 7:23 AM

eklepilkina edited the summary of this revision.

eklepilkina added a child revision: D127858: [RISCV] Added flag to enable SeparateConstOffsetFromGEPPass for RISC-V target. Jun 15 2022, 7:26 AM

Harbormaster completed remote builds in B169984: Diff 437155. Jun 15 2022, 8:16 AM

yakush added a subscriber: yakush. Jun 16 2022, 3:44 AM

asb mentioned this in D127858: [RISCV] Added flag to enable SeparateConstOffsetFromGEPPass for RISC-V target. Jun 20 2022, 3:42 AM

[SeparateConstOffsetFromGEP] Fix comparator for map with GEP bases

Harbormaster completed remote builds in B172677: Diff 440908. Jun 29 2022, 4:02 AM

Refactoring

Harbormaster completed remote builds in B174822: Diff 443874. Jul 12 2022, 2:11 AM

Fix review

Harbormaster completed remote builds in B174826: Diff 443879. Jul 12 2022, 3:59 AM

Ping

[SeparateConstOffsetFromGEP] Fix ignoring condition

Harbormaster completed remote builds in B176030: Diff 445498. Jul 18 2022, 9:33 AM

Gentle ping

Some nits from me. If I may, some advice if you want to make progress here.

First, it seems that some pieces of this patch can be split out as separate NFC refactorings. If so, please do. It should greatly reduce the code to look at, and smaller patches are generally easier to comprehend and review.

Second, the motivation of this patch is obscure. The structures and algorithms that you are using are not obvious. This needs either a detailed explanation in comments, or maybe an incremental approach in which each step is understandable.

Third, the benefit isn't obvious either. In your tests, the new IR is bigger than the old IR. This is not necessarily bad, but it requires explanation. If you have some particular scenario where the resulting assembly is better with this change, then it makes sense to provide an LLC test which shows it.

Fourth, this patch claims to improve register pressure. At the same time, it is done in a middle-end pass, which means its impact on register pressure might differ between platforms. Was it actually tested on platforms other than RISCV? Or do you think that the algorithm should be profitable regardless of the platform? Can I see any benefit from this patch in X86 for example?

llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
203	Can this go as a separate NFC?
271–277	Please commit this reformatting separately.
280	\p CheckProfitability ?..
360	This requires more explanation. I could not figure what are indices, which of them is being optimized, and what is precedence in this context. Maybe write a detailed comment on what's going on here and what does this structure represent?
449	Canonize -> Canonicalize
450	I guess it should be "Returns true if a change was made, false otherwise".
547	Maybe rename `InstructionsToTransform` -> `GEPsToTransform`?
1083	Pls commit separately if it is needed.
1127	Rename as separate NFC?
1373	To me, this code structure looks counter-intuitive. Why do we print "Try to split GEP "... only when we check profitability, and do it silently when we don't? If possible, please restructure it like

	    if (CheckProfitability) {
	      // Do all required profitability checks
	    }
	    // Do common transform logic uniformly

	I'm not sure if it's possible here because of this post-processing. If not, then the transform part should be unified somehow else.
1380	More natural way would be

	    if (!CurrentChanged)
	      continue;
	    for ...
1381	The complexity of this is `SortedInstructionsList.size() * SortedInstructionsList.size() * sum(SortedInstructionsList[J])` if I'm reading this correctly. Looks very expensive. Is there a cheaper way of doing this? Imagine you have 10k instructions on your list. It will just be stuck forever.
llvm/test/Transforms/SeparateConstOffsetFromGEP/RISCV/split-gep.ll
86	Why is it better code than the old one?
286	This code is bigger than it used to be. Can you explain why is it better?
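
The restructuring requested for line 1373 amounts to keeping every profitability check inside one guarded block and letting both modes fall through to the same transform code. A minimal sketch of that control flow, with a hypothetical `Candidate` type standing in for a GEP and an id list standing in for the actual transformation:

```cpp
#include <vector>

// Hypothetical stand-in for a GEP that may or may not pass the checks.
struct Candidate {
  int Id;
  bool LooksProfitable;
};

// Returns the ids of the candidates that were "transformed".
std::vector<int> transformAll(const std::vector<Candidate> &Candidates,
                              bool CheckProfitability) {
  std::vector<int> Transformed;
  for (const Candidate &C : Candidates) {
    if (CheckProfitability) {
      // All profitability checks live here and nowhere else.
      if (!C.LooksProfitable)
        continue;
    }
    // Common transform logic, identical in both modes.
    Transformed.push_back(C.Id);
  }
  return Transformed;
}
```

Whether the real pass can be shaped this way depends on the post-processing step mentioned in the comment, which this sketch does not model.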

Can I see any benefit from this patch in X86 for example?

This pass was written for targets with limited addressing modes, so it isn't added to the X86 pipeline. It's used under a flag on AArch64, and on RISCV we also suggest leaving it turned off by default, but this patch helps make this optimization useful more often and removes some regressions that were found when the pass was turned on across the whole test-suite. I'll provide test-suite results on the AArch64 platform with this pass turned on, with and without this patch. But yes, the main measurements were made for RISCV.

And if you mean that these changes should be done later in the pipeline, there is a problem with the current instruction selection, which can work with only one BB at a time, so the CodeGenPrepare pass needs to sink such GEPs with constants in order to generate offset addressing. I believe this is why the pass was created as a middle-end part.

llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
1127	I don't really like the idea of renaming in a separate NFC patch, because the renaming is tied to the changes that were made, and the old name was no longer suitable.
1373	I understand your concerns, but I don't see a good solution here, because I don't want the original version, which doesn't check profitability, to perform unneeded actions.
1381	"Imagine you have 10k instructions on your list": I'm not sure we should optimize for this case, because it's mostly impossible; this list is always quite small. I'll think about it some more, but I'm not sure that optimization here is more important than readability.
llvm/test/Transforms/SeparateConstOffsetFromGEP/RISCV/split-gep.ll
86	In assembly we use one more register to hold the result of the newly generated GEP instruction, but we gain nothing, because the registers used by the adds are still needed, since those values are used in other instructions.
286	This code is bigger in IR because of the repeated sext operations, but `sext` isn't so critical in assembly; at the same time, the pass generates 2 new GEP instructions that are used as bases, and we need registers for them.

[NFC][SeparateConstOffsetFromGEP] Small refactoring and reformatting
[SeparateConstOffsetFromGEPPass] Added optional modification strategy
Review fixes (part 1)

[SeparateConstOffsetFromGEPPass] Added optional modification strategy
Review fixes (part 1)

Harbormaster completed remote builds in B180401: Diff 451472. Aug 10 2022, 8:04 AM

eklepilkina added a parent revision: D131572: [SeparateConstOffsetFromGEP] Added statistic and small refactoring. Aug 10 2022, 8:04 AM

Found some bugs, and some style proposals as well. The general point still holds. If the purpose of the patch is to reduce register pressure on some platform, please provide a test which shows that this actually happens. This can only be shown with an llc test.

llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
247	This comment is obsolete now, `Extract` does not have these new parameters.
272	And if `V` is not binop, should it change?
362	This is usually called `BasePointer` in other parts of optimizer.
363	Shouldn't `%b` also be a part of it? Or where does it go? Maybe more elaborate example on how there can be more than one previous index?
388	APInt? Just to make sure this doesn't overflow.
389	Naturally I'd expect this to be `SmallVector<const ConstantInt *>`, but the code below suggests there might not be constants. Misleading name?
557	Use `DenseMapInfo<Value *>::getTombstoneKey()` and same above
561	Why PreviousIndices size but not contents?
709	No `{ }`
710	`undef` and `poison` are constants but not `ConstantInt`. Are you OK with them?
1381	"Mostly impossible" means "possible". We generally bail out on non-linear algorithms with some thresholds. This could also be the case here.
llvm/test/Transforms/SeparateConstOffsetFromGEP/RISCV/split-gep.ll
286	Then please provide an llc test that demonstrates a positive change. The claim that "sext isn't so critical" is not at all obvious to me. Filling the upper part of the register may sometimes be an extra operation.
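
The bail-out mentioned for line 1381 is a common way to keep a quadratic scan from dominating compile time: give up, conservatively, once the input exceeds a fixed limit. A minimal sketch follows; the limit value, `MaxCandidatesForPairwiseScan` and `countEqualPairs` are all hypothetical and only illustrate the pattern, not the patch's loop.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical limit, chosen only for illustration.
constexpr std::size_t MaxCandidatesForPairwiseScan = 64;

// Counts pairs of equal values with a pairwise (quadratic) scan. Returns
// std::nullopt instead of scanning when the input is above the limit, so the
// caller can skip the optimization rather than spend unbounded time.
std::optional<std::size_t> countEqualPairs(const std::vector<int> &Values) {
  if (Values.size() > MaxCandidatesForPairwiseScan)
    return std::nullopt;

  std::size_t Pairs = 0;
  for (std::size_t I = 0; I < Values.size(); ++I)
    for (std::size_t J = I + 1; J < Values.size(); ++J)
      if (Values[I] == Values[J])
        ++Pairs;
  return Pairs;
}
```

In a real pass the limit would more likely be a hidden command-line option than a constant, but that choice is outside what this review discusses.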

This revision now requires changes to proceed. Aug 11 2022, 2:58 AM

eklepilkina removed a parent revision: D131572: [SeparateConstOffsetFromGEP] Added statistic and small refactoring. Aug 29 2022, 5:08 AM

eklepilkina removed a child revision: D127858: [RISCV] Added flag to enable SeparateConstOffsetFromGEPPass for RISC-V target.

Sorry for the delay. I looked further at different benchmarks from the test-suite while searching for a good test case. There are such cases. But a deeper exploration shows that the SeparateConstOffsetFromGEP pass isn't the main cause: it produces better IR, but in some cases later passes make it worse and produce worse assembly code. So the hacks I made in the current pass as workarounds for these particular cases don't seem to be the proper solution. Since this pass isn't the main cause of the regressions we saw, I decided to abandon this review.

Thank you very much for the review, and sorry to bother you.

Revision Contents

Path	Size
llvm/include/llvm/Transforms/Scalar.h	4 lines
llvm/include/llvm/Transforms/Scalar/SeparateConstOffsetFromGEP.h	5 lines
llvm/lib/Passes/PassBuilder.cpp	5 lines
llvm/lib/Passes/PassRegistry.def	8 lines
llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp	358 lines
llvm/test/Transforms/SeparateConstOffsetFromGEP/RISCV/split-gep.ll	101 lines

Diff 451472

llvm/include/llvm/Transforms/Scalar.h

	// calls such as sqrt.			// calls such as sqrt.
	//			//
	FunctionPass *createPartiallyInlineLibCallsPass();			FunctionPass *createPartiallyInlineLibCallsPass();

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	//			//
	// SeparateConstOffsetFromGEP - Split GEPs for better CSE			// SeparateConstOffsetFromGEP - Split GEPs for better CSE
	//			//
	FunctionPass *createSeparateConstOffsetFromGEPPass(bool LowerGEP = false);			FunctionPass *
				createSeparateConstOffsetFromGEPPass(bool LowerGEP = false,
				bool CheckProfitability = false);

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	//			//
	// SpeculativeExecution - Aggressively hoist instructions to enable			// SpeculativeExecution - Aggressively hoist instructions to enable
	// speculative execution on targets where branches are expensive.			// speculative execution on targets where branches are expensive.
	//			//
	FunctionPass *createSpeculativeExecutionPass();			FunctionPass *createSpeculativeExecutionPass();


llvm/include/llvm/Transforms/Scalar/SeparateConstOffsetFromGEP.h


	#include "llvm/IR/PassManager.h"			#include "llvm/IR/PassManager.h"

	namespace llvm {			namespace llvm {

	class SeparateConstOffsetFromGEPPass			class SeparateConstOffsetFromGEPPass
	: public PassInfoMixin<SeparateConstOffsetFromGEPPass> {			: public PassInfoMixin<SeparateConstOffsetFromGEPPass> {
	bool LowerGEP;			bool LowerGEP;
				bool CheckProfitability;

	public:			public:
	SeparateConstOffsetFromGEPPass(bool LowerGEP = false) : LowerGEP(LowerGEP) {}			SeparateConstOffsetFromGEPPass(bool LowerGEP = false,
				bool CheckProfitability = false)
				: LowerGEP(LowerGEP), CheckProfitability(CheckProfitability) {}
	PreservedAnalyses run(Function &F, FunctionAnalysisManager &);			PreservedAnalyses run(Function &F, FunctionAnalysisManager &);
	};			};

	} // end namespace llvm			} // end namespace llvm

	#endif // LLVM_TRANSFORMS_SCALAR_SEPARATECONSTOFFSETFROMGEP_H			#endif // LLVM_TRANSFORMS_SCALAR_SEPARATECONSTOFFSETFROMGEP_H

llvm/lib/Passes/PassBuilder.cpp

	Expected<bool> parseLoopExtractorPassOptions(StringRef Params) {			Expected<bool> parseLoopExtractorPassOptions(StringRef Params) {
	return parseSinglePassOption(Params, "single", "LoopExtractor");			return parseSinglePassOption(Params, "single", "LoopExtractor");
	}			}

	Expected<bool> parseLowerMatrixIntrinsicsPassOptions(StringRef Params) {			Expected<bool> parseLowerMatrixIntrinsicsPassOptions(StringRef Params) {
	return parseSinglePassOption(Params, "minimal", "LowerMatrixIntrinsics");			return parseSinglePassOption(Params, "minimal", "LowerMatrixIntrinsics");
	}			}

				Expected<bool> parseSeparateConstOffsetFromGEPPassOptions(StringRef Params) {
				return parseSinglePassOption(Params, "check-profit",
				"SeparateConstOffsetFromGEP");
				}

	Expected<AddressSanitizerOptions> parseASanPassOptions(StringRef Params) {			Expected<AddressSanitizerOptions> parseASanPassOptions(StringRef Params) {
	AddressSanitizerOptions Result;			AddressSanitizerOptions Result;
	while (!Params.empty()) {			while (!Params.empty()) {
	StringRef ParamName;			StringRef ParamName;
	std::tie(ParamName, Params) = Params.split(';');			std::tie(ParamName, Params) = Params.split(';');

	if (ParamName == "kernel") {			if (ParamName == "kernel") {
	Result.CompileKernel = true;			Result.CompileKernel = true;
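
The new `parseSeparateConstOffsetFromGEPPassOptions` delegates to `parseSinglePassOption`, which accepts a single named boolean parameter and rejects anything else. The sketch below is a simplified stand-in for that behaviour, not LLVM's implementation: `parseSingleOption` is a hypothetical name, and `std::optional` replaces `Expected<bool>`, with `std::nullopt` standing for the error case.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Simplified stand-in: walks a ';'-separated parameter string, returns true
// if OptionName is present, false if the string is empty, and std::nullopt
// if any other parameter name appears.
std::optional<bool> parseSingleOption(std::string_view Params,
                                      std::string_view OptionName) {
  bool Result = false;
  while (!Params.empty()) {
    const std::size_t Sep = Params.find(';');
    const std::string_view Name = Params.substr(0, Sep);
    Params = (Sep == std::string_view::npos) ? std::string_view{}
                                             : Params.substr(Sep + 1);
    if (Name == OptionName)
      Result = true;
    else
      return std::nullopt; // unknown parameter
  }
  return Result;
}
```

Assuming the usual new-pass-manager parameter syntax, the option in this diff would presumably have been spelled `separate-const-offset-from-gep<check-profit>` on the `-passes=` command line; since the revision was abandoned, that spelling is an inference from the diff and not an existing flag.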

llvm/lib/Passes/PassRegistry.def

FUNCTION_PASS("print-predicateinfo", PredicateInfoPrinterPass(dbgs()))		FUNCTION_PASS("print-predicateinfo", PredicateInfoPrinterPass(dbgs()))
FUNCTION_PASS("print-mustexecute", MustExecutePrinterPass(dbgs()))		FUNCTION_PASS("print-mustexecute", MustExecutePrinterPass(dbgs()))
FUNCTION_PASS("print-memderefs", MemDerefPrinterPass(dbgs()))		FUNCTION_PASS("print-memderefs", MemDerefPrinterPass(dbgs()))
FUNCTION_PASS("reassociate", ReassociatePass())		FUNCTION_PASS("reassociate", ReassociatePass())
FUNCTION_PASS("redundant-dbg-inst-elim", RedundantDbgInstEliminationPass())		FUNCTION_PASS("redundant-dbg-inst-elim", RedundantDbgInstEliminationPass())
FUNCTION_PASS("reg2mem", RegToMemPass())		FUNCTION_PASS("reg2mem", RegToMemPass())
FUNCTION_PASS("scalarize-masked-mem-intrin", ScalarizeMaskedMemIntrinPass())		FUNCTION_PASS("scalarize-masked-mem-intrin", ScalarizeMaskedMemIntrinPass())
FUNCTION_PASS("scalarizer", ScalarizerPass())		FUNCTION_PASS("scalarizer", ScalarizerPass())
FUNCTION_PASS("separate-const-offset-from-gep", SeparateConstOffsetFromGEPPass())
FUNCTION_PASS("sccp", SCCPPass())		FUNCTION_PASS("sccp", SCCPPass())
FUNCTION_PASS("sink", SinkingPass())		FUNCTION_PASS("sink", SinkingPass())
FUNCTION_PASS("slp-vectorizer", SLPVectorizerPass())		FUNCTION_PASS("slp-vectorizer", SLPVectorizerPass())
FUNCTION_PASS("slsr", StraightLineStrengthReducePass())		FUNCTION_PASS("slsr", StraightLineStrengthReducePass())
FUNCTION_PASS("speculative-execution", SpeculativeExecutionPass())		FUNCTION_PASS("speculative-execution", SpeculativeExecutionPass())
FUNCTION_PASS("sroa", SROAPass())		FUNCTION_PASS("sroa", SROAPass())
FUNCTION_PASS("strip-gc-relocates", StripGCRelocates())		FUNCTION_PASS("strip-gc-relocates", StripGCRelocates())
FUNCTION_PASS("structurizecfg", StructurizeCFGPass())		FUNCTION_PASS("structurizecfg", StructurizeCFGPass())
FUNCTION_PASS_WITH_PARAMS("simplifycfg",		FUNCTION_PASS_WITH_PARAMS("simplifycfg",
"no-forward-switch-cond;forward-switch-cond;"		"no-forward-switch-cond;forward-switch-cond;"
"no-switch-range-to-icmp;switch-range-to-icmp;"		"no-switch-range-to-icmp;switch-range-to-icmp;"
"no-switch-to-lookup;switch-to-lookup;"		"no-switch-to-lookup;switch-to-lookup;"
"no-keep-loops;keep-loops;"		"no-keep-loops;keep-loops;"
"no-hoist-common-insts;hoist-common-insts;"		"no-hoist-common-insts;hoist-common-insts;"
"no-sink-common-insts;sink-common-insts;"		"no-sink-common-insts;sink-common-insts;"
"bonus-inst-threshold=N"		"bonus-inst-threshold=N"
)		)
		FUNCTION_PASS_WITH_PARAMS("separate-const-offset-from-gep",
		"SeparateConstOffsetFromGEPPass",
		[](bool CheckProfitability) {
		return SeparateConstOffsetFromGEPPass(false, CheckProfitability);
		},
		parseSeparateConstOffsetFromGEPPassOptions,
		"check-profit")
FUNCTION_PASS_WITH_PARAMS("loop-vectorize",		FUNCTION_PASS_WITH_PARAMS("loop-vectorize",
"LoopVectorizePass",		"LoopVectorizePass",
[](LoopVectorizeOptions Opts) {		[](LoopVectorizeOptions Opts) {
return LoopVectorizePass(Opts);		return LoopVectorizePass(Opts);
},		},
parseLoopVectorizeOptions,		parseLoopVectorizeOptions,
"no-interleave-forced-only;interleave-forced-only;"		"no-interleave-forced-only;interleave-forced-only;"
"no-vectorize-forced-only;vectorize-forced-only")		"no-vectorize-forced-only;vectorize-forced-only")

llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp

#include "llvm/Support/CommandLine.h"		#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include "llvm/Transforms/Scalar.h"		#include "llvm/Transforms/Scalar.h"
#include "llvm/Transforms/Utils/Local.h"		#include "llvm/Transforms/Utils/Local.h"
#include <cassert>		#include <cassert>
#include <cstdint>		#include <cstdint>
		#include <map>
#include <string>		#include <string>

using namespace llvm;		using namespace llvm;
using namespace llvm::PatternMatch;		using namespace llvm::PatternMatch;

#define DEBUG_TYPE "separate-const-offset-from-gep"		#define DEBUG_TYPE "separate-const-offset-from-gep"
		mkazantsev: Can this go as a separate NFC?

static cl::opt<bool> DisableSeparateConstOffsetFromGEP(		static cl::opt<bool> DisableSeparateConstOffsetFromGEP(
"disable-separate-const-offset-from-gep", cl::init(false),		"disable-separate-const-offset-from-gep", cl::init(false),
cl::desc("Do not separate the constant offset from a GEP instruction"),		cl::desc("Do not separate the constant offset from a GEP instruction"),
cl::Hidden);		cl::Hidden);

// Setting this flag may emit false positives when the input module already		// Setting this flag may emit false positives when the input module already
// contains dead instructions. Therefore, we set it only in unit tests that are		// contains dead instructions. Therefore, we set it only in unit tests that are
Show All 27 Lines	public:
/// \p GEP The given GEP		/// \p GEP The given GEP
/// \p UserChainTail Outputs the tail of UserChain so that we can		/// \p UserChainTail Outputs the tail of UserChain so that we can
/// garbage-collect unused instructions in UserChain.		/// garbage-collect unused instructions in UserChain.
static Value *Extract(Value *Idx, GetElementPtrInst *GEP,		static Value *Extract(Value *Idx, GetElementPtrInst *GEP,
User *&UserChainTail, const DominatorTree *DT);		User *&UserChainTail, const DominatorTree *DT);

/// Looks for a constant offset from the given GEP index without extracting		/// Looks for a constant offset from the given GEP index without extracting
/// it. It returns the numeric value of the extracted constant offset (0 if		/// it. It returns the numeric value of the extracted constant offset (0 if
/// failed). The meaning of the arguments are the same as Extract.		/// failed). The meaning of the arguments are the same as Extract.
		mkazantsev: This comment is obsolete now, `Extract` does not have these new parameters.
static int64_t Find(Value *Idx, GetElementPtrInst *GEP,		static int64_t Find(Value *Idx, GetElementPtrInst *GEP,
const DominatorTree *DT);		const DominatorTree *DT, Value *&NonConstantBaseValue,
		bool CheckProfitability = false);

private:		private:
ConstantOffsetExtractor(Instruction *InsertionPt, const DominatorTree *DT)		ConstantOffsetExtractor(Instruction *InsertionPt, const DominatorTree *DT)
: IP(InsertionPt), DL(InsertionPt->getModule()->getDataLayout()), DT(DT) {		: IP(InsertionPt), DL(InsertionPt->getModule()->getDataLayout()), DT(DT) {
}		}

/// Searches the expression that computes V for a non-zero constant C s.t.		/// Searches the expression that computes V for a non-zero constant C s.t.
/// V can be reassociated into the form V' + C. If the searching is		/// V can be reassociated into the form V' + C. If the searching is
/// successful, returns C and update UserChain as a def-use chain from C to V;		/// successful, returns C and update UserChain as a def-use chain from C to V;
/// otherwise, UserChain is empty.		/// otherwise, UserChain is empty.
///		///
/// \p V The given expression		/// \p V The given expression
/// \p SignExtended Whether V will be sign-extended in the computation		/// \p SignExtended Whether V will be sign-extended in the computation
/// of the GEP index		/// of the GEP index
/// \p ZeroExtended Whether V will be zero-extended in the computation		/// \p ZeroExtended Whether V will be zero-extended in the computation
/// of the GEP index		/// of the GEP index
/// \p NonNegative Whether V is guaranteed to be non-negative. For		/// \p NonNegative Whether V is guaranteed to be non-negative. For
/// example, an index of an inbounds GEP is guaranteed		/// example, an index of an inbounds GEP is guaranteed
/// to be non-negative. Levaraging this, we can better		/// to be non-negative. Levaraging this, we can better
/// split inbounds GEPs.		/// split inbounds GEPs.
APInt find(Value *V, bool SignExtended, bool ZeroExtended, bool NonNegative);		/// \p NonConstantBaseValue The second non-constant operand if V is binary
		/// operator.
		mkazantsev: And if `V` is not binop, should it change?
		/// \p CheckProfitability Check the possible profit of optimization and
		/// search only instructions that can be optimized
		/// without extra operations.
		APInt find(Value *V, bool SignExtended, bool ZeroExtended, bool NonNegative,
		Value *&NonConstantBaseValue, bool CheckProfitability = false);
		mkazantsev: Please commit this reformatting separately.

/// A helper function to look into both operands of a binary operator.		/// A helper function to look into both operands of a binary operator.
APInt findInEitherOperand(BinaryOperator *BO, bool SignExtended,		APInt findInEitherOperand(BinaryOperator *BO, bool SignExtended,
		mkazantsev: \p CheckProfitability ?..
bool ZeroExtended);		bool ZeroExtended, Value *&NonConstantBaseValue,
		bool CheckProfitability = false);

/// After finding the constant offset C from the GEP index I, we build a new		/// After finding the constant offset C from the GEP index I, we build a new
/// index I' s.t. I' + C = I. This function builds and returns the new		/// index I' s.t. I' + C = I. This function builds and returns the new
/// index I' according to UserChain produced by function "find".		/// index I' according to UserChain produced by function "find".
///		///
/// The building conceptually takes two steps:		/// The building conceptually takes two steps:
/// 1) iteratively distribute s/zext towards the leaves of the expression tree		/// 1) iteratively distribute s/zext towards the leaves of the expression tree
/// that computes I		/// that computes I
▲ Show 20 Lines • Show All 56 Lines • ▼ Show 20 Lines	private:

/// Insertion position of cloned instructions.		/// Insertion position of cloned instructions.
Instruction *IP;		Instruction *IP;

const DataLayout &DL;		const DataLayout &DL;
const DominatorTree *DT;		const DominatorTree *DT;
};		};

		/// GEPBaseInfo - structure contains information about possible common base for
		/// GEP instructions.
		/// In case if we have the next GEP instruction
		///
		/// %add = add nsw i64 %i, 5
		/// getelementptr inbounds [50 x i64], [50 x i64]* %array, i64 %a, i64 %b
		mkazantsev: This requires more explanation. I could not figure what are indices, which of them is being optimized, and what is precedence in this context. Maybe write a detailed comment on what's going on here and what does this structure represent?
		///
		/// GEPPointer - %array
		mkazantsev: This is usually called `BasePointer` in other parts of optimizer.
		/// PreviousIndices - [%a]
		mkazantsev: Shouldn't `%b` also be a part of it? Or where does it go? Maybe more elaborate example on how there can be more than one previous index?
		/// NonConstantBaseValue - %i
		///
		/// Structure stores information to calculate GEP instructions with same base.
		/// Needed to count how many GEP instructions can be optimized using one new
		/// GEP instruction.
		craig.topper: I think there should be a std::move on `PreviousIndices` here
		struct GEPBaseInfo {
		/// Pointer used in GEP instruction.
		const Value *GEPPointer;
		craig.topper: rhs -> RHS
		/// Indexes that precede index that can be optimized.
		SmallVector<const Value *> PreviousIndices;
		/// Non constant value that will be used in new base GEP.
		const Value *NonConstantBaseValue;

		GEPBaseInfo(const Value *GEPPointer,
		SmallVector<const Value *> PreviousIndices,
		const Value *NonConstantBaseValue)
		: GEPPointer(GEPPointer), PreviousIndices(PreviousIndices),
		NonConstantBaseValue(NonConstantBaseValue) {}
		};

		/// GEPInfo - structure contains basic information about GEP instruction
		/// needed for their modification.
		struct GEPInfo {
		GetElementPtrInst *GEPInstruction;
		int64_t AccumulativeByteOffset;
		mkazantsev: APInt? Just to make sure this doesn't overflow.
		SmallVector<const Value *> ConstantIndices;
		mkazantsev (Not Done): Naturally I'd expect this to be `SmallVector<const ConstantInt *>`, but the code below suggests there might not be constants. Misleading name?

		GEPInfo(GetElementPtrInst *GEPInstruction, int64_t AccumulativeByteOffset,
		SmallVector<const Value *> &&Indices)
		: GEPInstruction(GEPInstruction),
		AccumulativeByteOffset(AccumulativeByteOffset),
		ConstantIndices(Indices) {}
		};

/// A pass that tries to split every GEP in the function into a variadic		/// A pass that tries to split every GEP in the function into a variadic
/// base and a constant offset. It is a FunctionPass because searching for the		/// base and a constant offset. It is a FunctionPass because searching for the
/// constant offset may inspect other basic blocks.		/// constant offset may inspect other basic blocks.
class SeparateConstOffsetFromGEPLegacyPass : public FunctionPass {		class SeparateConstOffsetFromGEPLegacyPass : public FunctionPass {
public:		public:
static char ID;		static char ID;

SeparateConstOffsetFromGEPLegacyPass(bool LowerGEP = false)		SeparateConstOffsetFromGEPLegacyPass(bool LowerGEP = false,
: FunctionPass(ID), LowerGEP(LowerGEP) {		bool CheckProfitability = false)
		: FunctionPass(ID), LowerGEP(LowerGEP),
		CheckProfitability(CheckProfitability) {
initializeSeparateConstOffsetFromGEPLegacyPassPass(		initializeSeparateConstOffsetFromGEPLegacyPassPass(
*PassRegistry::getPassRegistry());		*PassRegistry::getPassRegistry());
}		}

void getAnalysisUsage(AnalysisUsage &AU) const override {		void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.addRequired<DominatorTreeWrapperPass>();		AU.addRequired<DominatorTreeWrapperPass>();
AU.addRequired<ScalarEvolutionWrapperPass>();		AU.addRequired<ScalarEvolutionWrapperPass>();
AU.addRequired<TargetTransformInfoWrapperPass>();		AU.addRequired<TargetTransformInfoWrapperPass>();
AU.addRequired<LoopInfoWrapperPass>();		AU.addRequired<LoopInfoWrapperPass>();
AU.setPreservesCFG();		AU.setPreservesCFG();
AU.addRequired<TargetLibraryInfoWrapperPass>();		AU.addRequired<TargetLibraryInfoWrapperPass>();
}		}

bool runOnFunction(Function &F) override;		bool runOnFunction(Function &F) override;

private:		private:
bool LowerGEP;		bool LowerGEP;
		bool CheckProfitability;
};		};

/// A pass that tries to split every GEP in the function into a variadic		/// A pass that tries to split every GEP in the function into a variadic
/// base and a constant offset. It is a FunctionPass because searching for the		/// base and a constant offset. It is a FunctionPass because searching for the
/// constant offset may inspect other basic blocks.		/// constant offset may inspect other basic blocks.
class SeparateConstOffsetFromGEP {		class SeparateConstOffsetFromGEP {
public:		public:
SeparateConstOffsetFromGEP(		SeparateConstOffsetFromGEP(
DominatorTree *DT, ScalarEvolution *SE, LoopInfo *LI,		DominatorTree *DT, ScalarEvolution *SE, LoopInfo *LI,
TargetLibraryInfo *TLI,		TargetLibraryInfo *TLI,
function_ref<TargetTransformInfo &(Function &)> GetTTI, bool LowerGEP)		function_ref<TargetTransformInfo &(Function &)> GetTTI, bool LowerGEP,
: DT(DT), SE(SE), LI(LI), TLI(TLI), GetTTI(GetTTI), LowerGEP(LowerGEP) {}		bool CheckProfitability)
		: DT(DT), SE(SE), LI(LI), TLI(TLI), GetTTI(GetTTI), LowerGEP(LowerGEP),
		CheckProfitability(CheckProfitability) {}

bool run(Function &F);		bool run(Function &F);

private:		private:
/// Tries to split the given GEP into a variadic base and a constant offset,		/// Tries to split the given GEP into a variadic base and a constant offset,
/// and returns true if the splitting succeeds.		/// and returns true if the splitting succeeds.
bool splitGEP(GetElementPtrInst *GEP);		bool splitGEP(GetElementPtrInst *GEP, int64_t AccumulativeByteOffset);

		/// Canonicalize GEP if needed and collect information to decide if GEP
		mkazantsev (Done): Canonize -> Canonicalize
		/// modification is useful.
		mkazantsev (Done): I guess it should be "Returns true if a change was made, false otherwise".
		/// Returns true if a change was made, false otherwise.
		bool preprocessGEP(GetElementPtrInst *GEP);

/// Lower a GEP with multiple indices into multiple GEPs with a single index.		/// Lower a GEP with multiple indices into multiple GEPs with a single index.
/// Function splitGEP already split the original GEP into a variadic part and		/// Function splitGEP already split the original GEP into a variadic part and
/// a constant offset (i.e., AccumulativeByteOffset). This function lowers the		/// a constant offset (i.e., AccumulativeByteOffset). This function lowers the
/// variadic part into a set of GEPs with a single index and applies		/// variadic part into a set of GEPs with a single index and applies
/// AccumulativeByteOffset to it.		/// AccumulativeByteOffset to it.
/// \p Variadic The variadic part of the original GEP.		/// \p Variadic The variadic part of the original GEP.
/// \p AccumulativeByteOffset The constant offset.		/// \p AccumulativeByteOffset The constant offset.
void lowerToSingleIndexGEPs(GetElementPtrInst *Variadic,		void lowerToSingleIndexGEPs(GetElementPtrInst *Variadic,
int64_t AccumulativeByteOffset);		int64_t AccumulativeByteOffset);

/// Lower a GEP with multiple indices into ptrtoint+arithmetics+inttoptr form.		/// Lower a GEP with multiple indices into ptrtoint+arithmetics+inttoptr form.
/// Function splitGEP already split the original GEP into a variadic part and		/// Function splitGEP already split the original GEP into a variadic part and
/// a constant offset (i.e., AccumulativeByteOffset). This function lowers the		/// a constant offset (i.e., AccumulativeByteOffset). This function lowers the
/// variadic part into a set of arithmetic operations and applies		/// variadic part into a set of arithmetic operations and applies
/// AccumulativeByteOffset to it.		/// AccumulativeByteOffset to it.
/// \p Variadic The variadic part of the original GEP.		/// \p Variadic The variadic part of the original GEP.
/// \p AccumulativeByteOffset The constant offset.		/// \p AccumulativeByteOffset The constant offset.
void lowerToArithmetics(GetElementPtrInst *Variadic,		void lowerToArithmetics(GetElementPtrInst *Variadic,
int64_t AccumulativeByteOffset);		int64_t AccumulativeByteOffset);

/// Finds the constant offset within each index and accumulates them. If		/// Finds the constant offset within each index and accumulates them. If
/// LowerGEP is true, it finds in indices of both sequential and structure		/// LowerGEP is true, it finds in indices of both sequential and structure
/// types, otherwise it only finds in sequential indices. The output		/// types, otherwise it only finds in sequential indices.
/// NeedsExtraction indicates whether we successfully find a non-zero constant		void accumulateByteOffset(GetElementPtrInst *GEP);
/// offset.
int64_t accumulateByteOffset(GetElementPtrInst *GEP, bool &NeedsExtraction);

/// Canonicalize array indices to pointer-size integers. This helps to		/// Canonicalize array indices to pointer-size integers. This helps to
/// simplify the logic of splitting a GEP. For example, if a + b is a		/// simplify the logic of splitting a GEP. For example, if a + b is a
/// pointer-size integer, we have		/// pointer-size integer, we have
/// gep base, a + b = gep (gep base, a), b		/// gep base, a + b = gep (gep base, a), b
/// However, this equality may not hold if the size of a + b is smaller than		/// However, this equality may not hold if the size of a + b is smaller than
/// the pointer size, because LLVM conceptually sign-extends GEP indices to		/// the pointer size, because LLVM conceptually sign-extends GEP indices to
/// pointer size before computing the address		/// pointer size before computing the address
[45 unchanged lines hidden]	private:
TargetLibraryInfo *TLI;		TargetLibraryInfo *TLI;
// Retrieved lazily since not always used.		// Retrieved lazily since not always used.
function_ref<TargetTransformInfo &(Function &)> GetTTI;		function_ref<TargetTransformInfo &(Function &)> GetTTI;

/// Whether to lower a GEP with multiple indices into arithmetic operations or		/// Whether to lower a GEP with multiple indices into arithmetic operations or
/// multiple GEPs with a single index.		/// multiple GEPs with a single index.
bool LowerGEP;		bool LowerGEP;

		/// Check the possible profit of optimization to reduce register pressure
		/// or modify all possible GEPs.
		bool CheckProfitability;

DenseMap<const SCEV *, SmallVector<Instruction *, 2>> DominatingAdds;		DenseMap<const SCEV *, SmallVector<Instruction *, 2>> DominatingAdds;
DenseMap<const SCEV *, SmallVector<Instruction *, 2>> DominatingSubs;		DenseMap<const SCEV *, SmallVector<Instruction *, 2>> DominatingSubs;

		/// GEP instructions chosen for transformation
		DenseMap<GEPBaseInfo, SmallVector<GEPInfo>> GEPsToTransform;
		mkazantsev (Done): Maybe rename `InstructionsToTransform` -> `GEPsToTransform`?
};		};

} // end anonymous namespace		} // end anonymous namespace

		template <> struct llvm::DenseMapInfo<GEPBaseInfo> {
		static inline GEPBaseInfo getEmptyKey() {
		return GEPBaseInfo(nullptr, SmallVector<const Value *>(), nullptr);
		}
		static inline GEPBaseInfo getTombstoneKey() {
		return GEPBaseInfo((Value *)(-1), SmallVector<const Value *>(),
		mkazantsev (Not Done): Use `DenseMapInfo<Value *>::getTombstoneKey()` and same above
		(Value *)(-1));
		}
		static unsigned getHashValue(const GEPBaseInfo &Val) {
		return llvm::hash_combine(Val.GEPPointer, Val.PreviousIndices.size(),
		mkazantsev (Not Done): Why PreviousIndices size but not contents?
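To make the question concrete: a hash that looks only at the size is still consistent with `isEqual`, because equal keys have equal sizes, but every pair of keys that differs only in the index values lands in the same bucket. The sketch below contrasts the two choices using standard C++ only. `KeySketch`, `mix`, `hashSizeOnly` and `hashContents` are hypothetical names, integers stand in for the `Value` pointers, and `mix` is a simple illustrative combiner, not LLVM's `hash_combine`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for GEPBaseInfo: integers replace the Value pointers.
struct KeySketch {
  std::uint64_t GEPPointer;
  std::vector<std::uint64_t> PreviousIndices;
  std::uint64_t NonConstantBaseValue;
};

// Simple illustrative mixing step (not LLVM's hash_combine).
inline std::uint64_t mix(std::uint64_t Seed, std::uint64_t V) {
  return Seed ^ (V + 0x9e3779b97f4a7c15ULL + (Seed << 6) + (Seed >> 2));
}

// Like the patch, only the number of previous indices enters the hash.
std::uint64_t hashSizeOnly(const KeySketch &K) {
  std::uint64_t H = mix(0, K.GEPPointer);
  H = mix(H, K.NonConstantBaseValue);
  return mix(H, K.PreviousIndices.size());
}

// Alternative: fold every previous index into the hash as well.
std::uint64_t hashContents(const KeySketch &K) {
  std::uint64_t H = hashSizeOnly(K);
  for (std::uint64_t Index : K.PreviousIndices)
    H = mix(H, Index);
  return H;
}

// Two keys that differ only in the last previous index.
bool sizeOnlyCollidesButContentsDiffer() {
  KeySketch A{1, {10, 20}, 2};
  KeySketch B{1, {10, 21}, 2};
  assert(hashSizeOnly(A) == hashSizeOnly(B));
  return hashContents(A) != hashContents(B);
}
```

In the pass itself the equivalent would presumably be `hash_combine_range` from `llvm/ADT/Hashing.h` applied to `PreviousIndices`; that is a suggestion, not something the patch does.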
		Val.NonConstantBaseValue);
		}
		static bool isEqual(const GEPBaseInfo &LHS, const GEPBaseInfo &RHS) {
		return LHS.GEPPointer == RHS.GEPPointer &&
		LHS.NonConstantBaseValue == RHS.NonConstantBaseValue &&
		LHS.PreviousIndices == RHS.PreviousIndices;
		}
		};

char SeparateConstOffsetFromGEPLegacyPass::ID = 0;		char SeparateConstOffsetFromGEPLegacyPass::ID = 0;

INITIALIZE_PASS_BEGIN(		INITIALIZE_PASS_BEGIN(
SeparateConstOffsetFromGEPLegacyPass, "separate-const-offset-from-gep",		SeparateConstOffsetFromGEPLegacyPass, "separate-const-offset-from-gep",
"Split GEPs to a variadic base and a constant offset for better CSE", false,		"Split GEPs to a variadic base and a constant offset for better CSE", false,
false)		false)
INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)		INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
INITIALIZE_PASS_DEPENDENCY(ScalarEvolutionWrapperPass)		INITIALIZE_PASS_DEPENDENCY(ScalarEvolutionWrapperPass)
INITIALIZE_PASS_DEPENDENCY(TargetTransformInfoWrapperPass)		INITIALIZE_PASS_DEPENDENCY(TargetTransformInfoWrapperPass)
INITIALIZE_PASS_DEPENDENCY(LoopInfoWrapperPass)		INITIALIZE_PASS_DEPENDENCY(LoopInfoWrapperPass)
INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)		INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
INITIALIZE_PASS_END(		INITIALIZE_PASS_END(
SeparateConstOffsetFromGEPLegacyPass, "separate-const-offset-from-gep",		SeparateConstOffsetFromGEPLegacyPass, "separate-const-offset-from-gep",
"Split GEPs to a variadic base and a constant offset for better CSE", false,		"Split GEPs to a variadic base and a constant offset for better CSE", false,
false)		false)

FunctionPass *llvm::createSeparateConstOffsetFromGEPPass(bool LowerGEP) {		FunctionPass *
return new SeparateConstOffsetFromGEPLegacyPass(LowerGEP);		llvm::createSeparateConstOffsetFromGEPPass(bool LowerGEP,
		bool CheckProfitability) {
		return new SeparateConstOffsetFromGEPLegacyPass(LowerGEP, CheckProfitability);
}		}

bool ConstantOffsetExtractor::CanTraceInto(bool SignExtended,		bool ConstantOffsetExtractor::CanTraceInto(bool SignExtended,
bool ZeroExtended,		bool ZeroExtended,
BinaryOperator *BO,		BinaryOperator *BO,
bool NonNegative) {		bool NonNegative) {
// We only consider ADD, SUB and OR, because a non-zero constant found in		// We only consider ADD, SUB and OR, because a non-zero constant found in
// expressions composed of these operations can be easily hoisted as a		// expressions composed of these operations can be easily hoisted as a
[53 unchanged lines hidden]	if (ZeroExtended && !BO->hasNoUnsignedWrap())
return false;		return false;
}		}

return true;		return true;
}		}

APInt ConstantOffsetExtractor::findInEitherOperand(BinaryOperator *BO,		APInt ConstantOffsetExtractor::findInEitherOperand(BinaryOperator *BO,
bool SignExtended,		bool SignExtended,
bool ZeroExtended) {		bool ZeroExtended,
		Value *&NonConstantBaseValue,
		bool CheckProfitability) {
// Save off the current height of the chain, in case we need to restore it.		// Save off the current height of the chain, in case we need to restore it.
size_t ChainLength = UserChain.size();		size_t ChainLength = UserChain.size();

// BO being non-negative does not shed light on whether its operands are		// BO being non-negative does not shed light on whether its operands are
// non-negative. Clear the NonNegative flag here.		// non-negative. Clear the NonNegative flag here.
APInt ConstantOffset = find(BO->getOperand(0), SignExtended, ZeroExtended,		APInt ConstantOffset =
/* NonNegative */ false);		find(BO->getOperand(0), SignExtended, ZeroExtended,
		/* NonNegative */ false, NonConstantBaseValue, CheckProfitability);

		// Only sub and add instructions don't need adding extra instructions.
		if (CheckProfitability && (BO->getOpcode() != Instruction::Sub &&
		BO->getOpcode() != Instruction::Add)) {
		NonConstantBaseValue = nullptr;
		ConstantOffset = 0;
		UserChain.resize(ChainLength);
		return ConstantOffset;
		}

// If we found a constant offset in the left operand, stop and return that.		// If we found a constant offset in the left operand, stop and return that.
// This shortcut might cause us to miss opportunities of combining the		// This shortcut might cause us to miss opportunities of combining the
// constant offsets in both operands, e.g., (a + 4) + (b + 5) => (a + b) + 9.		// constant offsets in both operands, e.g., (a + 4) + (b + 5) => (a + b) + 9.
// However, such cases are probably already handled by -instcombine,		// However, such cases are probably already handled by -instcombine,
// given this pass runs after the standard optimizations.		// given this pass runs after the standard optimizations.
if (ConstantOffset != 0) return ConstantOffset;		if (ConstantOffset != 0) {
		if (!isa<ConstantInt>(BO->getOperand(1))) {
		NonConstantBaseValue = BO->getOperand(1);
		}
		return ConstantOffset;
		}

// Reset the chain back to where it was when we started exploring this node,		// Reset the chain back to where it was when we started exploring this node,
// since visiting the LHS didn't pan out.		// since visiting the LHS didn't pan out.
UserChain.resize(ChainLength);		UserChain.resize(ChainLength);

ConstantOffset = find(BO->getOperand(1), SignExtended, ZeroExtended,		ConstantOffset =
/* NonNegative */ false);		find(BO->getOperand(1), SignExtended, ZeroExtended,
		/* NonNegative */ false, NonConstantBaseValue, CheckProfitability);
// If U is a sub operator, negate the constant offset found in the right		// If U is a sub operator, negate the constant offset found in the right
// operand.		// operand.
if (BO->getOpcode() == Instruction::Sub)		if (BO->getOpcode() == Instruction::Sub)
ConstantOffset = -ConstantOffset;		ConstantOffset = -ConstantOffset;

// If RHS wasn't a suitable candidate either, reset the chain again.		// If RHS wasn't a suitable candidate either, reset the chain again.
if (ConstantOffset == 0)		if (ConstantOffset == 0)
UserChain.resize(ChainLength);		UserChain.resize(ChainLength);

		if (!isa<ConstantInt>(BO->getOperand(0))) {
		mkazantsev (Not Done): No `{ }`
		NonConstantBaseValue = BO->getOperand(0);
		mkazantsev (Not Done): `undef` and `poison` are constants but not `ConstantInt`. Are you OK with them?
		}

return ConstantOffset;		return ConstantOffset;
}		}

APInt ConstantOffsetExtractor::find(Value *V, bool SignExtended,		APInt ConstantOffsetExtractor::find(Value *V, bool SignExtended,
bool ZeroExtended, bool NonNegative) {		bool ZeroExtended, bool NonNegative,
		Value *&NonConstantBaseValue,
		bool CheckProfitability) {
// TODO(jingyue): We could trace into integer/pointer casts, such as		// TODO(jingyue): We could trace into integer/pointer casts, such as
// inttoptr, ptrtoint, bitcast, and addrspacecast. We choose to handle only		// inttoptr, ptrtoint, bitcast, and addrspacecast. We choose to handle only
// integers because it gives good enough results for our benchmarks.		// integers because it gives good enough results for our benchmarks.
unsigned BitWidth = cast<IntegerType>(V->getType())->getBitWidth();		unsigned BitWidth = cast<IntegerType>(V->getType())->getBitWidth();

// We cannot do much with Values that are not a User, such as an Argument.		// We cannot do much with Values that are not a User, such as an Argument.
User *U = dyn_cast<User>(V);		User *U = dyn_cast<User>(V);
if (U == nullptr) return APInt(BitWidth, 0);		if (U == nullptr) return APInt(BitWidth, 0);

APInt ConstantOffset(BitWidth, 0);		APInt ConstantOffset(BitWidth, 0);
if (ConstantInt *CI = dyn_cast<ConstantInt>(V)) {		if (ConstantInt *CI = dyn_cast<ConstantInt>(V)) {
// Hooray, we found it!		// Hooray, we found it!
ConstantOffset = CI->getValue();		ConstantOffset = CI->getValue();
} else if (BinaryOperator *BO = dyn_cast<BinaryOperator>(V)) {		} else if (BinaryOperator *BO = dyn_cast<BinaryOperator>(V)) {
// Trace into subexpressions for more hoisting opportunities.		// Trace into subexpressions for more hoisting opportunities.
if (CanTraceInto(SignExtended, ZeroExtended, BO, NonNegative))		if (CanTraceInto(SignExtended, ZeroExtended, BO, NonNegative))
ConstantOffset = findInEitherOperand(BO, SignExtended, ZeroExtended);
} else if (isa<TruncInst>(V)) {
ConstantOffset =		ConstantOffset =
find(U->getOperand(0), SignExtended, ZeroExtended, NonNegative)		findInEitherOperand(BO, SignExtended, ZeroExtended,
		NonConstantBaseValue, CheckProfitability);
		} else if (isa<TruncInst>(V)) {
		ConstantOffset = find(U->getOperand(0), SignExtended, ZeroExtended,
		NonNegative, NonConstantBaseValue, CheckProfitability)
.trunc(BitWidth);		.trunc(BitWidth);
} else if (isa<SExtInst>(V)) {		} else if (isa<SExtInst>(V)) {
ConstantOffset = find(U->getOperand(0), /* SignExtended */ true,		ConstantOffset =
ZeroExtended, NonNegative).sext(BitWidth);		find(U->getOperand(0), /* SignExtended */ true, ZeroExtended,
		NonNegative, NonConstantBaseValue, CheckProfitability)
		.sext(BitWidth);
} else if (isa<ZExtInst>(V)) {		} else if (isa<ZExtInst>(V)) {
// As an optimization, we can clear the SignExtended flag because		// As an optimization, we can clear the SignExtended flag because
// sext(zext(a)) = zext(a). Verified in @sext_zext in split-gep.ll.		// sext(zext(a)) = zext(a). Verified in @sext_zext in split-gep.ll.
//		//
// Clear the NonNegative flag, because zext(a) >= 0 does not imply a >= 0.		// Clear the NonNegative flag, because zext(a) >= 0 does not imply a >= 0.
ConstantOffset =		ConstantOffset = find(U->getOperand(0), /* SignExtended */ false,
find(U->getOperand(0), /* SignExtended */ false,		/* ZeroExtended */ true, /* NonNegative */ false,
/* ZeroExtended */ true, /* NonNegative */ false).zext(BitWidth);		NonConstantBaseValue, CheckProfitability)
		.zext(BitWidth);
}		}

// If we found a non-zero constant offset, add it to the path for		// If we found a non-zero constant offset, add it to the path for
// rebuildWithoutConstOffset. Zero is a valid constant offset, but doesn't		// rebuildWithoutConstOffset. Zero is a valid constant offset, but doesn't
// help this optimization.		// help this optimization.
if (ConstantOffset != 0)		if (ConstantOffset != 0)
UserChain.push_back(U);		UserChain.push_back(U);
return ConstantOffset;		return ConstantOffset;
[119 unchanged lines hidden]	Value *ConstantOffsetExtractor::removeConstOffset(unsigned ChainIndex) {
NewBO->takeName(BO);		NewBO->takeName(BO);
return NewBO;		return NewBO;
}		}

Value *ConstantOffsetExtractor::Extract(Value *Idx, GetElementPtrInst *GEP,		Value *ConstantOffsetExtractor::Extract(Value *Idx, GetElementPtrInst *GEP,
User *&UserChainTail,		User *&UserChainTail,
const DominatorTree *DT) {		const DominatorTree *DT) {
ConstantOffsetExtractor Extractor(GEP, DT);		ConstantOffsetExtractor Extractor(GEP, DT);
		Value *NonConstantBaseValue = nullptr;
// Find a non-zero constant offset first.		// Find a non-zero constant offset first.
APInt ConstantOffset =		APInt ConstantOffset =
Extractor.find(Idx, /* SignExtended */ false, /* ZeroExtended */ false,		Extractor.find(Idx, /* SignExtended */ false, /* ZeroExtended */ false,
GEP->isInBounds());		GEP->isInBounds(), NonConstantBaseValue);
if (ConstantOffset == 0) {		if (ConstantOffset == 0) {
UserChainTail = nullptr;		UserChainTail = nullptr;
return nullptr;		return nullptr;
}		}
// Separates the constant offset from the GEP index.		// Separates the constant offset from the GEP index.
Value *IdxWithoutConstOffset = Extractor.rebuildWithoutConstOffset();		Value *IdxWithoutConstOffset = Extractor.rebuildWithoutConstOffset();
UserChainTail = Extractor.UserChain.back();		UserChainTail = Extractor.UserChain.back();
return IdxWithoutConstOffset;		return IdxWithoutConstOffset;
}		}

int64_t ConstantOffsetExtractor::Find(Value *Idx, GetElementPtrInst *GEP,		int64_t ConstantOffsetExtractor::Find(Value *Idx, GetElementPtrInst *GEP,
const DominatorTree *DT) {		const DominatorTree *DT,
		Value *&NonConstantBaseValue,
		bool CheckProfitability) {
// If Idx is an index of an inbound GEP, Idx is guaranteed to be non-negative.		// If Idx is an index of an inbound GEP, Idx is guaranteed to be non-negative.
return ConstantOffsetExtractor(GEP, DT)		return ConstantOffsetExtractor(GEP, DT)
.find(Idx, /* SignExtended */ false, /* ZeroExtended */ false,		.find(Idx, /* SignExtended */ false, /* ZeroExtended */ false,
GEP->isInBounds())		GEP->isInBounds(), NonConstantBaseValue, CheckProfitability)
.getSExtValue();		.getSExtValue();
}		}

bool SeparateConstOffsetFromGEP::canonicalizeArrayIndicesToPointerSize(		bool SeparateConstOffsetFromGEP::canonicalizeArrayIndicesToPointerSize(
GetElementPtrInst *GEP) {		GetElementPtrInst *GEP) {
bool Changed = false;		bool Changed = false;
Type *IntPtrTy = DL->getIntPtrType(GEP->getType());		Type *IntPtrTy = DL->getIntPtrType(GEP->getType());
gep_type_iterator GTI = gep_type_begin(*GEP);		gep_type_iterator GTI = gep_type_begin(*GEP);
for (User::op_iterator I = GEP->op_begin() + 1, E = GEP->op_end();		for (User::op_iterator I = GEP->op_begin() + 1, E = GEP->op_end();
I != E; ++I, ++GTI) {		I != E; ++I, ++GTI) {
// Skip struct member indices which must be i32.		// Skip struct member indices which must be i32.
if (GTI.isSequential()) {		if (GTI.isSequential()) {
if ((*I)->getType() != IntPtrTy) {		if ((*I)->getType() != IntPtrTy) {
*I = CastInst::CreateIntegerCast(*I, IntPtrTy, true, "idxprom", GEP);		*I = CastInst::CreateIntegerCast(*I, IntPtrTy, true, "idxprom", GEP);
Changed = true;		Changed = true;
}		}
}		}
}		}
return Changed;		return Changed;
}		}

int64_t		void SeparateConstOffsetFromGEP::accumulateByteOffset(GetElementPtrInst *GEP) {
SeparateConstOffsetFromGEP::accumulateByteOffset(GetElementPtrInst *GEP,
bool &NeedsExtraction) {
NeedsExtraction = false;
int64_t AccumulativeByteOffset = 0;		int64_t AccumulativeByteOffset = 0;
gep_type_iterator GTI = gep_type_begin(*GEP);		gep_type_iterator GTI = gep_type_begin(*GEP);
		SmallVector<const Value *> ConstantIndices;
		SmallVector<GEPBaseInfo, 2> PossibleBases;

for (unsigned I = 1, E = GEP->getNumOperands(); I != E; ++I, ++GTI) {		for (unsigned I = 1, E = GEP->getNumOperands(); I != E; ++I, ++GTI) {
		Value *NonConstantBaseValue = nullptr;
if (GTI.isSequential()) {		if (GTI.isSequential()) {
// Tries to extract a constant offset from this GEP index.		// Tries to extract a constant offset from this GEP index.
int64_t ConstantOffset =		int64_t ConstantOffset = ConstantOffsetExtractor::Find(
ConstantOffsetExtractor::Find(GEP->getOperand(I), GEP, DT);		GEP->getOperand(I), GEP, DT, NonConstantBaseValue,
		CheckProfitability);
if (ConstantOffset != 0) {		if (ConstantOffset != 0) {
NeedsExtraction = true;		if (CheckProfitability || PossibleBases.empty()) {
		PossibleBases.emplace_back(
		GEP->getPointerOperand(),
		SmallVector<const Value *, 4>(GEP->idx_begin(),
		GEP->idx_begin() + I - 1),
		NonConstantBaseValue);
		}

// A GEP may have multiple indices. We accumulate the extracted		// A GEP may have multiple indices. We accumulate the extracted
// constant offset to a byte offset, and later offset the remainder of		// constant offset to a byte offset, and later offset the remainder of
// the original GEP with this byte offset.		// the original GEP with this byte offset.
AccumulativeByteOffset +=		AccumulativeByteOffset +=
ConstantOffset * DL->getTypeAllocSize(GTI.getIndexedType());		ConstantOffset * DL->getTypeAllocSize(GTI.getIndexedType());
		ConstantIndices.push_back(GEP->getOperand(I));
}		}
} else if (LowerGEP) {		} else if (LowerGEP) {
StructType *StTy = GTI.getStructType();		StructType *StTy = GTI.getStructType();
uint64_t Field = cast<ConstantInt>(GEP->getOperand(I))->getZExtValue();		uint64_t Field = cast<ConstantInt>(GEP->getOperand(I))->getZExtValue();
// Skip field 0 as the offset is always 0.		// Skip field 0 as the offset is always 0.
if (Field != 0) {		if (Field != 0) {
NeedsExtraction = true;		if (CheckProfitability || PossibleBases.empty()) {
		PossibleBases.emplace_back(GEP->getPointerOperand(),
		SmallVector<const Value *>(),
		NonConstantBaseValue);
		}
		craig.topper (Not Done): `PossibleBase.size() == 0` -> `PossibleBases.empty()`
AccumulativeByteOffset +=		AccumulativeByteOffset +=
DL->getStructLayout(StTy)->getElementOffset(Field);		DL->getStructLayout(StTy)->getElementOffset(Field);
}		}
}		}
}		}
return AccumulativeByteOffset;		TargetTransformInfo &TTI = GetTTI(*GEP->getFunction());

		// If LowerGEP is disabled, before really splitting the GEP, check whether the
		// backend supports the addressing mode we are about to produce. If no, this
		// splitting probably won't be beneficial.
		// If LowerGEP is enabled, even the extracted constant offset can not match
		// the addressing mode, we can still do optimizations to other lowered parts
		// of variable indices. Therefore, we don't check for addressing modes in that
		// case.
		if (!LowerGEP) {
		unsigned AddrSpace = GEP->getPointerAddressSpace();
		if (!TTI.isLegalAddressingMode(GEP->getResultElementType(),
		/*BaseGV=*/nullptr, AccumulativeByteOffset,
		/*HasBaseReg=*/true, /*Scale=*/0,
		AddrSpace)) {
		LLVM_DEBUG(
		dbgs()
		<< "Don't optimize. The backend doesn't support the addressing mode\n");
		return;
		}
		}
		for (const GEPBaseInfo &Base : PossibleBases) {
		if (GEPsToTransform.find(Base) == GEPsToTransform.end()) {
		GEPsToTransform[Base] = SmallVector<GEPInfo>();
		}
		GEPsToTransform[Base].emplace_back(GEP, AccumulativeByteOffset,
		std::move(ConstantIndices));
		}
}		}

void SeparateConstOffsetFromGEP::lowerToSingleIndexGEPs(		void SeparateConstOffsetFromGEP::lowerToSingleIndexGEPs(
GetElementPtrInst *Variadic, int64_t AccumulativeByteOffset) {		GetElementPtrInst *Variadic, int64_t AccumulativeByteOffset) {
IRBuilder<> Builder(Variadic);		IRBuilder<> Builder(Variadic);
Type *IntPtrTy = DL->getIntPtrType(Variadic->getType());		Type *IntPtrTy = DL->getIntPtrType(Variadic->getType());

Type *I8PtrTy =		Type *I8PtrTy =
[59 unchanged lines hidden]	if (ResultPtr->getType() != Variadic->getType())
ResultPtr = Builder.CreateBitCast(ResultPtr, Variadic->getType());		ResultPtr = Builder.CreateBitCast(ResultPtr, Variadic->getType());

Variadic->replaceAllUsesWith(ResultPtr);		Variadic->replaceAllUsesWith(ResultPtr);
Variadic->eraseFromParent();		Variadic->eraseFromParent();
}		}

void		void
SeparateConstOffsetFromGEP::lowerToArithmetics(GetElementPtrInst *Variadic,		SeparateConstOffsetFromGEP::lowerToArithmetics(GetElementPtrInst *Variadic,
int64_t AccumulativeByteOffset) {		int64_t AccumulativeByteOffset) {
		mkazantsev (Done): Please commit separately if it is needed.
IRBuilder<> Builder(Variadic);		IRBuilder<> Builder(Variadic);
Type *IntPtrTy = DL->getIntPtrType(Variadic->getType());		Type *IntPtrTy = DL->getIntPtrType(Variadic->getType());

Value *ResultPtr = Builder.CreatePtrToInt(Variadic->getOperand(0), IntPtrTy);		Value *ResultPtr = Builder.CreatePtrToInt(Variadic->getOperand(0), IntPtrTy);
gep_type_iterator GTI = gep_type_begin(*Variadic);		gep_type_iterator GTI = gep_type_begin(*Variadic);
// Create ADD/SHL/MUL arithmetic operations for each sequential indices. We		// Create ADD/SHL/MUL arithmetic operations for each sequential indices. We
// don't create arithmetics for structure indices, as they are accumulated		// don't create arithmetics for structure indices, as they are accumulated
// in the constant offset index.		// in the constant offset index.
[27 unchanged lines hidden]	ResultPtr = Builder.CreateAdd(
ResultPtr, ConstantInt::get(IntPtrTy, AccumulativeByteOffset));		ResultPtr, ConstantInt::get(IntPtrTy, AccumulativeByteOffset));
}		}

ResultPtr = Builder.CreateIntToPtr(ResultPtr, Variadic->getType());		ResultPtr = Builder.CreateIntToPtr(ResultPtr, Variadic->getType());
Variadic->replaceAllUsesWith(ResultPtr);		Variadic->replaceAllUsesWith(ResultPtr);
Variadic->eraseFromParent();		Variadic->eraseFromParent();
}		}

bool SeparateConstOffsetFromGEP::splitGEP(GetElementPtrInst *GEP) {		bool SeparateConstOffsetFromGEP::preprocessGEP(GetElementPtrInst *GEP) {
		mkazantsev (Not Done): Rename as separate NFC?
		eklepilkina (Author, Done): I don't really like the idea of renaming in a separate NFC patch, because the renaming is tied to the changes that were made and the old name is no longer suitable.
// Skip vector GEPs.		// Skip vector GEPs.
if (GEP->getType()->isVectorTy())		if (GEP->getType()->isVectorTy())
return false;		return false;

// The backend can already nicely handle the case where all indices are		// The backend can already nicely handle the case where all indices are
// constant.		// constant.
if (GEP->hasAllConstantIndices())		if (GEP->hasAllConstantIndices())
return false;		return false;

bool Changed = canonicalizeArrayIndicesToPointerSize(GEP);		bool Changed = canonicalizeArrayIndicesToPointerSize(GEP);

bool NeedsExtraction;		accumulateByteOffset(GEP);
int64_t AccumulativeByteOffset = accumulateByteOffset(GEP, NeedsExtraction);

if (!NeedsExtraction)
return Changed;		return Changed;
		}

		bool SeparateConstOffsetFromGEP::splitGEP(GetElementPtrInst *GEP,
		int64_t AccumulativeByteOffset) {
TargetTransformInfo &TTI = GetTTI(*GEP->getFunction());		TargetTransformInfo &TTI = GetTTI(*GEP->getFunction());

// If LowerGEP is disabled, before really splitting the GEP, check whether the
// backend supports the addressing mode we are about to produce. If no, this
// splitting probably won't be beneficial.
// If LowerGEP is enabled, even the extracted constant offset can not match
// the addressing mode, we can still do optimizations to other lowered parts
// of variable indices. Therefore, we don't check for addressing modes in that
// case.
if (!LowerGEP) {
unsigned AddrSpace = GEP->getPointerAddressSpace();
if (!TTI.isLegalAddressingMode(GEP->getResultElementType(),
/*BaseGV=*/nullptr, AccumulativeByteOffset,
/*HasBaseReg=*/true, /*Scale=*/0,
AddrSpace)) {
return Changed;
}
}

// Remove the constant offset in each sequential index. The resultant GEP		// Remove the constant offset in each sequential index. The resultant GEP
// computes the variadic base.		// computes the variadic base.
// Notice that we don't remove struct field indices here. If LowerGEP is		// Notice that we don't remove struct field indices here. If LowerGEP is
// disabled, a structure index is not accumulated and we still use the old		// disabled, a structure index is not accumulated and we still use the old
// one. If LowerGEP is enabled, a structure index is accumulated in the		// one. If LowerGEP is enabled, a structure index is accumulated in the
// constant offset. LowerToSingleIndexGEPs or lowerToArithmetics will later		// constant offset. LowerToSingleIndexGEPs or lowerToArithmetics will later
// handle the constant offset and won't need a new structure index.		// handle the constant offset and won't need a new structure index.
gep_type_iterator GTI = gep_type_begin(*GEP);		gep_type_iterator GTI = gep_type_begin(*GEP);
[... 140 lines not shown; context: if (skipFunction(F)) ...]
return false;		return false;
auto *DT = &getAnalysis<DominatorTreeWrapperPass>().getDomTree();		auto *DT = &getAnalysis<DominatorTreeWrapperPass>().getDomTree();
auto *SE = &getAnalysis<ScalarEvolutionWrapperPass>().getSE();		auto *SE = &getAnalysis<ScalarEvolutionWrapperPass>().getSE();
auto *LI = &getAnalysis<LoopInfoWrapperPass>().getLoopInfo();		auto *LI = &getAnalysis<LoopInfoWrapperPass>().getLoopInfo();
auto *TLI = &getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);		auto *TLI = &getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);
auto GetTTI = [this](Function &F) -> TargetTransformInfo & {		auto GetTTI = [this](Function &F) -> TargetTransformInfo & {
return this->getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);		return this->getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);
};		};
SeparateConstOffsetFromGEP Impl(DT, SE, LI, TLI, GetTTI, LowerGEP);		SeparateConstOffsetFromGEP Impl(DT, SE, LI, TLI, GetTTI, LowerGEP,
		CheckProfitability);
return Impl.run(F);		return Impl.run(F);
}		}

bool SeparateConstOffsetFromGEP::run(Function &F) {		bool SeparateConstOffsetFromGEP::run(Function &F) {
if (DisableSeparateConstOffsetFromGEP)		if (DisableSeparateConstOffsetFromGEP)
return false;		return false;

DL = &F.getParent()->getDataLayout();		DL = &F.getParent()->getDataLayout();
bool Changed = false;		bool Changed = false;

LLVM_DEBUG(dbgs() << "========= Function " << F.getName() << " =========\n");		auto OnlyUsedInGEP = [](const GEPInfo &GEPInfo) {
		bool OnlyUsedInGEP = GEPInfo.ConstantIndices.empty();
		for (const Value *Index : GEPInfo.ConstantIndices) {
		uint64_t NumUses = Index->getNumUses();
		// In case of cast instruction check usages of both result and original
		// value.
		if (isa<SExtInst>(Index) || isa<ZExtInst>(Index) ||
		isa<TruncInst>(Index)) {
		NumUses += cast<Instruction>(Index)->getOperand(0)->getNumUses() - 1;
		}
		OnlyUsedInGEP |= NumUses == 1;
		}
		return OnlyUsedInGEP;
		};

		LLVM_DEBUG(dbgs() << "========= Function " << F.getName() << " =========\n");
for (BasicBlock &B : F) {		for (BasicBlock &B : F) {
		GEPsToTransform.clear();
if (!DT->isReachableFromEntry(&B))		if (!DT->isReachableFromEntry(&B))
continue;		continue;

for (Instruction &I : llvm::make_early_inc_range(B))		for (Instruction &I : llvm::make_early_inc_range(B))
if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(&I))		if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(&I))
Changed |= splitGEP(GEP);		Changed |= preprocessGEP(GEP);
// No need to split GEP ConstantExprs because all its indices are constant
// already.		if (!CheckProfitability) {
		for (const auto &GEPInfoPair : GEPsToTransform) {
		for (const auto &GEPInfo : GEPInfoPair.second) {
		LLVM_DEBUG(dbgs() << "Try to split GEP " << *GEPInfo.GEPInstruction
		<< "\n");
		Changed |=
		splitGEP(GEPInfo.GEPInstruction, GEPInfo.AccumulativeByteOffset);
		// No need to split GEP ConstantExprs because all its indices are
		// constant already.
		}
		}
		} else {
		// As far as one instruction can be optimized using different base, choose
		// the best base option based on possible effect for decreasing register
		// pressure. Sort all found bases in decreasing order of possible effect.
		SmallVector<std::pair<SmallVector<GEPInfo>, unsigned>>
		SortedInstructionsList;
		for (const auto &GEPInfoPair : GEPsToTransform) {
		unsigned DeadValuesNumber = count_if(GEPInfoPair.second, OnlyUsedInGEP);
		if (DeadValuesNumber > 0) {
		SortedInstructionsList.emplace_back(GEPInfoPair.second,
		DeadValuesNumber);
		}
		}
		sort(SortedInstructionsList, [&OnlyUsedInGEP](auto &LHS, auto &RHS) {
		return LHS.second > RHS.second || LHS.first.size() > RHS.first.size();
		});

		// Optimize all chosen GEPs
		for (unsigned I = 0; I < SortedInstructionsList.size(); I++) {
		auto DetailedInfoList = SortedInstructionsList[I].first;
		if (DetailedInfoList.size() > 1 &&
		any_of(DetailedInfoList, OnlyUsedInGEP)) {
		mkazantsev: To me, this code structure looks counter-intuitive. Why do we print "Try to split GEP "... only when we check profitability, and do it silently when we don't? If possible, please restructure it like
		    if (CheckProfitability) {
		      // Do all required profitability checks
		    }
		    // Do common transform logic uniformly
		I'm not sure if it's possible here because of this post-processing. If not, then the transform part should be unified somehow else.
		eklepilkina (author): I understand your concerns, but I don't see a good solution here, because I don't want to perform unneeded actions in the original version, which does not check profitability.
		for (const auto &GEPInfo : DetailedInfoList) {
		LLVM_DEBUG(dbgs() << "Try to split GEP " << *GEPInfo.GEPInstruction
		<< "\n");
		bool CurrentChanged = splitGEP(GEPInfo.GEPInstruction,
		GEPInfo.AccumulativeByteOffset);
		Changed |= CurrentChanged;
		// If GEP is already optimized remove it from lists connected with
		mkazantsev: A more natural way would be `if (!CurrentChanged) continue; for ...`
		// other bases.
		mkazantsev: The complexity of this is `SortedInstructionsList.size() * SortedInstructionsList.size() * sum(SortedInstructionsList[J])` if I'm reading this correctly. Looks very expensive. Is there a cheaper way of doing this? Imagine you have 10k instructions on your list. It will just be stuck forever.
		eklepilkina (author): > Imagine you have 10k instructions on your list
		I'm not sure we should optimize this case: it is mostly impossible, because this list is always quite small. I'll think about it some more, but I'm not sure that optimization here is more important than readability.
		mkazantsev: "Mostly impossible" means "possible". We generally bail out on non-linear algorithms with some thresholds. This could also be the case here.
		if (!CurrentChanged)
		continue;
		for (unsigned J = I + 1; J < SortedInstructionsList.size(); J++) {
		auto RemoveIt = remove_if(SortedInstructionsList[J].first,
		[&GEPInfo](const struct GEPInfo &Info) {
		return Info.GEPInstruction ==
		GEPInfo.GEPInstruction;
		});
		SortedInstructionsList[J].first.erase(
		RemoveIt, SortedInstructionsList[J].first.end());
		}
		}
		}
		}
		}
}		}

Changed |= reuniteExts(F);		Changed |= reuniteExts(F);

if (VerifyNoDeadCode)		if (VerifyNoDeadCode)
verifyNoDeadCode(F);		verifyNoDeadCode(F);

return Changed;		return Changed;
[... 191 lines not shown ...]
SeparateConstOffsetFromGEPPass::run(Function &F, FunctionAnalysisManager &AM) {		SeparateConstOffsetFromGEPPass::run(Function &F, FunctionAnalysisManager &AM) {
auto *DT = &AM.getResult<DominatorTreeAnalysis>(F);		auto *DT = &AM.getResult<DominatorTreeAnalysis>(F);
auto *SE = &AM.getResult<ScalarEvolutionAnalysis>(F);		auto *SE = &AM.getResult<ScalarEvolutionAnalysis>(F);
auto *LI = &AM.getResult<LoopAnalysis>(F);		auto *LI = &AM.getResult<LoopAnalysis>(F);
auto *TLI = &AM.getResult<TargetLibraryAnalysis>(F);		auto *TLI = &AM.getResult<TargetLibraryAnalysis>(F);
auto GetTTI = [&AM](Function &F) -> TargetTransformInfo & {		auto GetTTI = [&AM](Function &F) -> TargetTransformInfo & {
return AM.getResult<TargetIRAnalysis>(F);		return AM.getResult<TargetIRAnalysis>(F);
};		};
SeparateConstOffsetFromGEP Impl(DT, SE, LI, TLI, GetTTI, LowerGEP);		SeparateConstOffsetFromGEP Impl(DT, SE, LI, TLI, GetTTI, LowerGEP,
		CheckProfitability);
if (!Impl.run(F))		if (!Impl.run(F))
return PreservedAnalyses::all();		return PreservedAnalyses::all();
PreservedAnalyses PA;		PreservedAnalyses PA;
PA.preserveSet<CFGAnalyses>();		PA.preserveSet<CFGAnalyses>();
return PA;		return PA;
}		}
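
Reading aid (not part of the patch): the inline thread in `run` above asks whether removing already-split GEPs from every later list can be made cheaper. One possible direction, sketched here as a standalone model and not as LLVM code, is to remember split GEPs in a hash set and skip them on later visits, so each entry is looked at a constant number of times. The sketch also orders candidates lexicographically with `std::tie`, because a comparator of the form `A.second > B.second || A.first.size() > B.first.size()` can report both "A before B" and "B before A", which sort routines do not allow. All names here (`GEPInfoModel`, `Candidate`, `betterCandidate`, `processCandidates`) are hypothetical, the integer id stands in for `GEPInfo.GEPInstruction`, and the `OnlyUsedInGEP` filter is left out for brevity.

```cpp
#include <algorithm>
#include <cstddef>
#include <tuple>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for the patch's GEPInfo; only the identity of the
// instruction matters for this sketch.
struct GEPInfoModel {
  int GEPId; // models GEPInfo.GEPInstruction
};

// One candidate base: its GEPs plus the number of values expected to die.
using Candidate = std::pair<std::vector<GEPInfoModel>, unsigned>;

// Strict weak ordering: more dead values first, then larger groups.
inline bool betterCandidate(const Candidate &LHS, const Candidate &RHS) {
  const std::size_t LSize = LHS.first.size();
  const std::size_t RSize = RHS.first.size();
  return std::tie(LHS.second, LSize) > std::tie(RHS.second, RSize);
}

// Visits candidates in sorted order and splits each GEP at most once.
// Split models splitGEP: it returns true when the GEP was changed.
// Returns the ids of the GEPs that were split, in processing order.
template <typename SplitFn>
std::vector<int> processCandidates(std::vector<Candidate> Candidates,
                                   SplitFn Split) {
  std::stable_sort(Candidates.begin(), Candidates.end(), betterCandidate);
  std::unordered_set<int> Done; // GEPs already split through another base
  std::vector<int> SplitOrder;
  for (const Candidate &C : Candidates) {
    // Count the GEPs of this base that are still unsplit.
    const auto Remaining = std::count_if(
        C.first.begin(), C.first.end(), [&Done](const GEPInfoModel &Info) {
          return Done.count(Info.GEPId) == 0;
        });
    if (Remaining <= 1)
      continue; // a shared base only pays off for more than one GEP
    for (const GEPInfoModel &Info : C.first) {
      if (Done.count(Info.GEPId) != 0)
        continue;
      if (Split(Info.GEPId)) {
        Done.insert(Info.GEPId);
        SplitOrder.push_back(Info.GEPId);
      }
    }
  }
  return SplitOrder;
}
```

With a hash set the expected work is proportional to the total number of list entries plus the sort. Whether this reads better than the nested `remove_if` loop, or whether a size threshold is the simpler answer, is the trade-off the reviewers discuss.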

llvm/test/Transforms/SeparateConstOffsetFromGEP/RISCV/split-gep.ll

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt < %s -mtriple=riscv64-unknown-elf -passes='separate-const-offset-from-gep,early-cse' \		; RUN: opt < %s -mtriple=riscv64-unknown-elf -passes='separate-const-offset-from-gep<check-profit>,early-cse' \
		craig.topper: This test doesn't exist in the repo. Where is the patch that adds it?
		eklepilkina (author): I was told in the first comment to rebase on precommitted tests. These tests are added as precommitted tests in a separate commit. Should I commit them?
		craig.topper: Why is the script `NOTE: Assertions have been autogenerated by utils/update_test_checks.py` not in the pre-committed version?
; RUN: -S | FileCheck %s		; RUN: -S | FileCheck %s

; Several tests for -separate-const-offset-from-gep. The transformation		; Several tests for -separate-const-offset-from-gep. The transformation
; heavily relies on TargetTransformInfo, so we put these tests under		; heavily relies on TargetTransformInfo, so we put these tests under
; target-specific folders.		; target-specific folders.

		; Positive test
; Simple case when GEPs should be optimized.		; Simple case when GEPs should be optimized.
define i64 @test1(i64* %array, i64 %i, i64 %j) {		define i64 @test1(i64* %array, i64 %i, i64 %j) {
; CHECK-LABEL: @test1(		; CHECK-LABEL: @test1(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[ADD:%.*]] = add nsw i64 [[I:%.*]], 5		; CHECK-NEXT: [[ADD:%.*]] = add nsw i64 [[I:%.*]], 5
; CHECK-NEXT: [[TMP0:%.*]] = getelementptr i64, i64* [[ARRAY:%.*]], i64 [[I]]		; CHECK-NEXT: [[TMP0:%.*]] = getelementptr i64, i64* [[ARRAY:%.*]], i64 [[I]]
; CHECK-NEXT: [[GEP4:%.*]] = getelementptr inbounds i64, i64* [[TMP0]], i64 5		; CHECK-NEXT: [[GEP4:%.*]] = getelementptr inbounds i64, i64* [[TMP0]], i64 5
; CHECK-NEXT: store i64 [[J:%.*]], i64* [[GEP4]], align 4		; CHECK-NEXT: store i64 [[J:%.*]], i64* [[GEP4]], align 4
[... 11 lines not shown; context: entry: ...]
%gep2 = getelementptr inbounds i64, i64* %array, i64 %add2		%gep2 = getelementptr inbounds i64, i64* %array, i64 %add2
store i64 %j, i64* %gep2		store i64 %j, i64* %gep2
%add3 = add nsw i64 %i, 35		%add3 = add nsw i64 %i, 35
%gep3 = getelementptr inbounds i64, i64* %array, i64 %add3		%gep3 = getelementptr inbounds i64, i64* %array, i64 %add3
store i64 %add, i64* %gep3		store i64 %add, i64* %gep3
ret i64 undef		ret i64 undef
}		}
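
Reading aid for test1 (not part of the patch): the CHECK lines above expect the variadic part `%array + %i` to be computed once, with each constant offset then added to that shared base. A minimal sketch of the address arithmetic, assuming 8-byte `i64` elements and using hypothetical helper names (`addrOriginal`, `variadicBase`, `addrSplit`), illustrates why both forms address the same element:

```cpp
#include <cstdint>

// Original form: every access recomputes base + (i + C) * sizeof(element),
// so each sum i + C has to be kept in its own register.
inline std::uint64_t addrOriginal(std::uint64_t Base, std::int64_t I,
                                  std::int64_t C) {
  return Base + static_cast<std::uint64_t>(I + C) * sizeof(std::int64_t);
}

// Split form, step 1: the variadic part base + i * sizeof(element) is
// computed once and shared by all accesses.
inline std::uint64_t variadicBase(std::uint64_t Base, std::int64_t I) {
  return Base + static_cast<std::uint64_t>(I) * sizeof(std::int64_t);
}

// Split form, step 2: each access only adds its constant byte offset,
// which a target may fold into the addressing mode of the load or store.
inline std::uint64_t addrSplit(std::uint64_t Shared, std::int64_t C) {
  return Shared + static_cast<std::uint64_t>(C) * sizeof(std::int64_t);
}
```

For any base and index, `addrSplit(variadicBase(Base, I), C)` equals `addrOriginal(Base, I, C)` as long as `I + C` does not overflow, which the `nsw` flags in the test assume. The offsets 5 and 35 used in test1 would both go through the same shared base.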

		; Positive test
; Optimize GEPs when there sext instructions are needed to cast index value to expected type.		; Optimize GEPs when there sext instructions are needed to cast index value to expected type.
define i32 @test2(i32* %array, i32 %i, i32 %j) {		define i32 @test2(i32* %array, i32 %i, i32 %j) {
; CHECK-LABEL: @test2(		; CHECK-LABEL: @test2(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[I:%.*]], 5		; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[I:%.*]], 5
; CHECK-NEXT: [[TMP0:%.*]] = sext i32 [[I]] to i64		; CHECK-NEXT: [[TMP0:%.*]] = sext i32 [[I]] to i64
; CHECK-NEXT: [[TMP1:%.*]] = getelementptr i32, i32* [[ARRAY:%.*]], i64 [[TMP0]]		; CHECK-NEXT: [[TMP1:%.*]] = getelementptr i32, i32* [[ARRAY:%.*]], i64 [[TMP0]]
; CHECK-NEXT: [[GEP2:%.*]] = getelementptr inbounds i32, i32* [[TMP1]], i64 5		; CHECK-NEXT: [[GEP2:%.*]] = getelementptr inbounds i32, i32* [[TMP1]], i64 5
[... 15 lines not shown; context: entry: ...]
store i32 %j, i32* %gep5		store i32 %j, i32* %gep5
%add6 = add nsw i32 %i, 35		%add6 = add nsw i32 %i, 35
%sext7 = sext i32 %add6 to i64		%sext7 = sext i32 %add6 to i64
%gep8 = getelementptr inbounds i32, i32* %array, i64 %sext7		%gep8 = getelementptr inbounds i32, i32* %array, i64 %sext7
store i32 %add, i32* %gep8		store i32 %add, i32* %gep8
ret i32 undef		ret i32 undef
}		}

		; Negative test
; No need to modify because all values are also used in other expressions.		; No need to modify because all values are also used in other expressions.
; Modification doesn't decrease register pressure.		; Modification doesn't decrease register pressure.
define i32 @test3(i32* %array, i32 %i) {		define i32 @test3(i32* %array, i32 %i) {
; CHECK-LABEL: @test3(		; CHECK-LABEL: @test3(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[I:%.*]], 5		; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[I:%.*]], 5
; CHECK-NEXT: [[TMP0:%.*]] = sext i32 [[I]] to i64		; CHECK-NEXT: [[SEXT:%.*]] = sext i32 [[ADD]] to i64
; CHECK-NEXT: [[TMP1:%.*]] = getelementptr i32, i32* [[ARRAY:%.*]], i64 [[TMP0]]		; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds i32, i32* [[ARRAY:%.*]], i64 [[SEXT]]
; CHECK-NEXT: [[GEP2:%.*]] = getelementptr inbounds i32, i32* [[TMP1]], i64 5		; CHECK-NEXT: store i32 [[ADD]], i32* [[GEP]], align 4
; CHECK-NEXT: store i32 [[ADD]], i32* [[GEP2]], align 4
; CHECK-NEXT: [[ADD3:%.*]] = add nsw i32 [[I]], 6		; CHECK-NEXT: [[ADD3:%.*]] = add nsw i32 [[I]], 6
; CHECK-NEXT: [[GEP54:%.*]] = getelementptr inbounds i32, i32* [[TMP1]], i64 6		; CHECK-NEXT: [[SEXT4:%.*]] = sext i32 [[ADD3]] to i64
; CHECK-NEXT: store i32 [[ADD3]], i32* [[GEP54]], align 4		; CHECK-NEXT: [[GEP5:%.*]] = getelementptr inbounds i32, i32* [[ARRAY]], i64 [[SEXT4]]
		; CHECK-NEXT: store i32 [[ADD3]], i32* [[GEP5]], align 4
; CHECK-NEXT: [[ADD6:%.*]] = add nsw i32 [[I]], 35		; CHECK-NEXT: [[ADD6:%.*]] = add nsw i32 [[I]], 35
; CHECK-NEXT: [[GEP86:%.*]] = getelementptr inbounds i32, i32* [[TMP1]], i64 35		; CHECK-NEXT: [[SEXT7:%.*]] = sext i32 [[ADD6]] to i64
; CHECK-NEXT: store i32 [[ADD6]], i32* [[GEP86]], align 4		; CHECK-NEXT: [[GEP8:%.*]] = getelementptr inbounds i32, i32* [[ARRAY]], i64 [[SEXT7]]
		; CHECK-NEXT: store i32 [[ADD6]], i32* [[GEP8]], align 4
		mkazantsev: Why is it a better code than the old one?
		eklepilkina (author): In assembly we use one more register to save the result of the newly generated GEP instruction, but we get no profit, because the registers used by the adds are still needed as long as these values are used in other instructions.
; CHECK-NEXT: ret i32 undef		; CHECK-NEXT: ret i32 undef
;		;
entry:		entry:
%add = add nsw i32 %i, 5		%add = add nsw i32 %i, 5
%sext = sext i32 %add to i64		%sext = sext i32 %add to i64
%gep = getelementptr inbounds i32, i32* %array, i64 %sext		%gep = getelementptr inbounds i32, i32* %array, i64 %sext
store i32 %add, i32* %gep		store i32 %add, i32* %gep
%add3 = add nsw i32 %i, 6		%add3 = add nsw i32 %i, 6
%sext4 = sext i32 %add3 to i64		%sext4 = sext i32 %add3 to i64
%gep5 = getelementptr inbounds i32, i32* %array, i64 %sext4		%gep5 = getelementptr inbounds i32, i32* %array, i64 %sext4
store i32 %add3, i32* %gep5		store i32 %add3, i32* %gep5
%add6 = add nsw i32 %i, 35		%add6 = add nsw i32 %i, 35
%sext7 = sext i32 %add6 to i64		%sext7 = sext i32 %add6 to i64
%gep8 = getelementptr inbounds i32, i32* %array, i64 %sext7		%gep8 = getelementptr inbounds i32, i32* %array, i64 %sext7
store i32 %add6, i32* %gep8		store i32 %add6, i32* %gep8
ret i32 undef		ret i32 undef
}		}

		; Positive test
; Optimized GEPs for multidimensional array with same base		; Optimized GEPs for multidimensional array with same base
define i32 @test4([50 x i32]* %array2, i32 %i) {		define i32 @test4([50 x i32]* %array2, i32 %i) {
; CHECK-LABEL: @test4(		; CHECK-LABEL: @test4(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[I:%.*]], 5		; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[I:%.*]], 5
; CHECK-NEXT: [[TMP0:%.*]] = sext i32 [[I]] to i64		; CHECK-NEXT: [[TMP0:%.*]] = sext i32 [[I]] to i64
; CHECK-NEXT: [[TMP1:%.*]] = getelementptr [50 x i32], [50 x i32]* [[ARRAY2:%.*]], i64 [[TMP0]], i64 [[TMP0]]		; CHECK-NEXT: [[TMP1:%.*]] = getelementptr [50 x i32], [50 x i32]* [[ARRAY2:%.*]], i64 [[TMP0]], i64 [[TMP0]]
; CHECK-NEXT: [[GEP3:%.*]] = getelementptr inbounds i32, i32* [[TMP1]], i64 255		; CHECK-NEXT: [[GEP3:%.*]] = getelementptr inbounds i32, i32* [[TMP1]], i64 255
[... 49 lines not shown; context: entry: ...]
store i32 %i, i32* %gep5		store i32 %i, i32* %gep5
%add6 = add nsw i32 %i, 35		%add6 = add nsw i32 %i, 35
%sext7 = sext i32 %add6 to i64		%sext7 = sext i32 %add6 to i64
%gep8 = getelementptr inbounds [50 x i32], [50 x i32]* %array2, i64 %sext7, i64 %j		%gep8 = getelementptr inbounds [50 x i32], [50 x i32]* %array2, i64 %sext7, i64 %j
store i32 %i, i32* %gep8		store i32 %i, i32* %gep8
ret i32 undef		ret i32 undef
}		}

		; Negative test
; No need to optimize GEPs, because there is critical amount with non-constant offsets.		; No need to optimize GEPs, because there is critical amount with non-constant offsets.
define i64 @test6(i64* %array, i64 %i, i64 %j) {		define i64 @test6(i64* %array, i64 %i, i64 %j) {
; CHECK-LABEL: @test6(		; CHECK-LABEL: @test6(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[ADD:%.*]] = add nsw i64 [[I:%.*]], 5		; CHECK-NEXT: [[ADD:%.*]] = add nsw i64 [[I:%.*]], 5
; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds i64, i64* [[ARRAY:%.*]], i64 [[J:%.*]]		; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds i64, i64* [[ARRAY:%.*]], i64 [[J:%.*]]
; CHECK-NEXT: store i64 [[ADD]], i64* [[GEP]], align 4		; CHECK-NEXT: store i64 [[ADD]], i64* [[GEP]], align 4
; CHECK-NEXT: [[TMP0:%.*]] = getelementptr i64, i64* [[ARRAY]], i64 [[I]]		; CHECK-NEXT: [[ADD3:%.*]] = add nsw i64 [[I]], 6
; CHECK-NEXT: [[GEP52:%.*]] = getelementptr inbounds i64, i64* [[TMP0]], i64 6		; CHECK-NEXT: [[GEP5:%.*]] = getelementptr inbounds i64, i64* [[ARRAY]], i64 [[ADD3]]
; CHECK-NEXT: store i64 [[I]], i64* [[GEP52]], align 4		; CHECK-NEXT: store i64 [[I]], i64* [[GEP5]], align 4
; CHECK-NEXT: store i64 [[I]], i64* [[TMP0]], align 4		; CHECK-NEXT: [[GEP8:%.*]] = getelementptr inbounds i64, i64* [[ARRAY]], i64 [[I]]
		; CHECK-NEXT: store i64 [[I]], i64* [[GEP8]], align 4
; CHECK-NEXT: ret i64 undef		; CHECK-NEXT: ret i64 undef
;		;
entry:		entry:
%add = add nsw i64 %i, 5		%add = add nsw i64 %i, 5
%gep = getelementptr inbounds i64, i64* %array, i64 %j		%gep = getelementptr inbounds i64, i64* %array, i64 %j
store i64 %add, i64* %gep		store i64 %add, i64* %gep
%add3 = add nsw i64 %i, 6		%add3 = add nsw i64 %i, 6
%gep5 = getelementptr inbounds i64, i64* %array, i64 %add3		%gep5 = getelementptr inbounds i64, i64* %array, i64 %add3
store i64 %i, i64* %gep5		store i64 %i, i64* %gep5
%add6 = add nsw i64 %i, 35		%add6 = add nsw i64 %i, 35
%gep8 = getelementptr inbounds i64, i64* %array, i64 %i		%gep8 = getelementptr inbounds i64, i64* %array, i64 %i
store i64 %i, i64* %gep8		store i64 %i, i64* %gep8
ret i64 undef		ret i64 undef
}		}

		; Negative test
; No need to optimize GEPs, because the base variable is different.		; No need to optimize GEPs, because the base variable is different.
define i32 @test7(i32* %array, i32 %i, i32 %j, i32 %k) {		define i32 @test7(i32* %array, i32 %i, i32 %j, i32 %k) {
; CHECK-LABEL: @test7(		; CHECK-LABEL: @test7(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[I:%.*]], 5		; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[I:%.*]], 5
; CHECK-NEXT: [[TMP0:%.*]] = sext i32 [[I]] to i64		; CHECK-NEXT: [[SEXT:%.*]] = sext i32 [[ADD]] to i64
; CHECK-NEXT: [[TMP1:%.*]] = getelementptr i32, i32* [[ARRAY:%.*]], i64 [[TMP0]]		; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds i32, i32* [[ARRAY:%.*]], i64 [[SEXT]]
; CHECK-NEXT: [[GEP2:%.*]] = getelementptr inbounds i32, i32* [[TMP1]], i64 5		; CHECK-NEXT: store i32 [[ADD]], i32* [[GEP]], align 4
; CHECK-NEXT: store i32 [[ADD]], i32* [[GEP2]], align 4		; CHECK-NEXT: [[ADD3:%.*]] = add nsw i32 [[K:%.*]], 6
; CHECK-NEXT: [[TMP2:%.*]] = sext i32 [[K:%.*]] to i64		; CHECK-NEXT: [[SEXT4:%.*]] = sext i32 [[ADD3]] to i64
; CHECK-NEXT: [[TMP3:%.*]] = getelementptr i32, i32* [[ARRAY]], i64 [[TMP2]]		; CHECK-NEXT: [[GEP5:%.*]] = getelementptr inbounds i32, i32* [[ARRAY]], i64 [[SEXT4]]
; CHECK-NEXT: [[GEP54:%.*]] = getelementptr inbounds i32, i32* [[TMP3]], i64 6		; CHECK-NEXT: store i32 [[I]], i32* [[GEP5]], align 4
; CHECK-NEXT: store i32 [[I]], i32* [[GEP54]], align 4		; CHECK-NEXT: [[ADD6:%.*]] = add nsw i32 [[J:%.*]], 35
; CHECK-NEXT: [[TMP4:%.*]] = sext i32 [[J:%.*]] to i64		; CHECK-NEXT: [[SEXT7:%.*]] = sext i32 [[ADD6]] to i64
; CHECK-NEXT: [[TMP5:%.*]] = getelementptr i32, i32* [[ARRAY]], i64 [[TMP4]]		; CHECK-NEXT: [[GEP8:%.*]] = getelementptr inbounds i32, i32* [[ARRAY]], i64 [[SEXT7]]
; CHECK-NEXT: [[GEP86:%.*]] = getelementptr inbounds i32, i32* [[TMP5]], i64 35		; CHECK-NEXT: store i32 [[I]], i32* [[GEP8]], align 4
; CHECK-NEXT: store i32 [[I]], i32* [[GEP86]], align 4
; CHECK-NEXT: ret i32 undef		; CHECK-NEXT: ret i32 undef
;		;
entry:		entry:
%add = add nsw i32 %i, 5		%add = add nsw i32 %i, 5
%sext = sext i32 %add to i64		%sext = sext i32 %add to i64
%gep = getelementptr inbounds i32, i32* %array, i64 %sext		%gep = getelementptr inbounds i32, i32* %array, i64 %sext
store i32 %add, i32* %gep		store i32 %add, i32* %gep
%add3 = add nsw i32 %k, 6		%add3 = add nsw i32 %k, 6
%sext4 = sext i32 %add3 to i64		%sext4 = sext i32 %add3 to i64
%gep5 = getelementptr inbounds i32, i32* %array, i64 %sext4		%gep5 = getelementptr inbounds i32, i32* %array, i64 %sext4
store i32 %i, i32* %gep5		store i32 %i, i32* %gep5
%add6 = add nsw i32 %j, 35		%add6 = add nsw i32 %j, 35
%sext7 = sext i32 %add6 to i64		%sext7 = sext i32 %add6 to i64
%gep8 = getelementptr inbounds i32, i32* %array, i64 %sext7		%gep8 = getelementptr inbounds i32, i32* %array, i64 %sext7
store i32 %i, i32* %gep8		store i32 %i, i32* %gep8
ret i32 undef		ret i32 undef
}		}

		; Negative test
; No need to optimize GEPs, because the base of GEP instructions is different.		; No need to optimize GEPs, because the base of GEP instructions is different.
define i32 @test8(i32* %array, i32* %array2, i32* %array3, i32 %i) {		define i32 @test8(i32* %array, i32* %array2, i32* %array3, i32 %i) {
; CHECK-LABEL: @test8(		; CHECK-LABEL: @test8(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[I:%.*]], 5		; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[I:%.*]], 5
; CHECK-NEXT: [[TMP0:%.*]] = sext i32 [[I]] to i64		; CHECK-NEXT: [[SEXT:%.*]] = sext i32 [[ADD]] to i64
; CHECK-NEXT: [[TMP1:%.*]] = getelementptr i32, i32* [[ARRAY:%.*]], i64 [[TMP0]]		; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds i32, i32* [[ARRAY:%.*]], i64 [[SEXT]]
; CHECK-NEXT: [[GEP2:%.*]] = getelementptr inbounds i32, i32* [[TMP1]], i64 5		; CHECK-NEXT: store i32 [[ADD]], i32* [[GEP]], align 4
; CHECK-NEXT: store i32 [[ADD]], i32* [[GEP2]], align 4		; CHECK-NEXT: [[ADD3:%.*]] = add nsw i32 [[I]], 6
; CHECK-NEXT: [[TMP2:%.*]] = getelementptr i32, i32* [[ARRAY2:%.*]], i64 [[TMP0]]		; CHECK-NEXT: [[SEXT4:%.*]] = sext i32 [[ADD3]] to i64
; CHECK-NEXT: [[GEP54:%.*]] = getelementptr inbounds i32, i32* [[TMP2]], i64 6		; CHECK-NEXT: [[GEP5:%.*]] = getelementptr inbounds i32, i32* [[ARRAY2:%.*]], i64 [[SEXT4]]
; CHECK-NEXT: store i32 [[I]], i32* [[GEP54]], align 4		; CHECK-NEXT: store i32 [[I]], i32* [[GEP5]], align 4
; CHECK-NEXT: [[TMP3:%.*]] = getelementptr i32, i32* [[ARRAY3:%.*]], i64 [[TMP0]]		; CHECK-NEXT: [[ADD6:%.*]] = add nsw i32 [[I]], 35
; CHECK-NEXT: [[GEP86:%.*]] = getelementptr inbounds i32, i32* [[TMP3]], i64 35		; CHECK-NEXT: [[SEXT7:%.*]] = sext i32 [[ADD6]] to i64
; CHECK-NEXT: store i32 [[I]], i32* [[GEP86]], align 4		; CHECK-NEXT: [[GEP8:%.*]] = getelementptr inbounds i32, i32* [[ARRAY3:%.*]], i64 [[SEXT7]]
		; CHECK-NEXT: store i32 [[I]], i32* [[GEP8]], align 4
; CHECK-NEXT: ret i32 undef		; CHECK-NEXT: ret i32 undef
;		;
entry:		entry:
%add = add nsw i32 %i, 5		%add = add nsw i32 %i, 5
%sext = sext i32 %add to i64		%sext = sext i32 %add to i64
%gep = getelementptr inbounds i32, i32* %array, i64 %sext		%gep = getelementptr inbounds i32, i32* %array, i64 %sext
store i32 %add, i32* %gep		store i32 %add, i32* %gep
%add3 = add nsw i32 %i, 6		%add3 = add nsw i32 %i, 6
%sext4 = sext i32 %add3 to i64		%sext4 = sext i32 %add3 to i64
%gep5 = getelementptr inbounds i32, i32* %array2, i64 %sext4		%gep5 = getelementptr inbounds i32, i32* %array2, i64 %sext4
store i32 %i, i32* %gep5		store i32 %i, i32* %gep5
%add6 = add nsw i32 %i, 35		%add6 = add nsw i32 %i, 35
%sext7 = sext i32 %add6 to i64		%sext7 = sext i32 %add6 to i64
%gep8 = getelementptr inbounds i32, i32* %array3, i64 %sext7		%gep8 = getelementptr inbounds i32, i32* %array3, i64 %sext7
store i32 %i, i32* %gep8		store i32 %i, i32* %gep8
ret i32 undef		ret i32 undef
}		}

		; Negative test
; No need to optimize GEPs of multidimensional array, because the base of GEP instructions is different.		; No need to optimize GEPs of multidimensional array, because the base of GEP instructions is different.
define i32 @test9([50 x i32]* %array, i32 %i) {		define i32 @test9([50 x i32]* %array, i32 %i) {
; CHECK-LABEL: @test9(		; CHECK-LABEL: @test9(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[I:%.*]], 5		; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[I:%.*]], 5
; CHECK-NEXT: [[TMP0:%.*]] = sext i32 [[I]] to i64		; CHECK-NEXT: [[SEXT:%.*]] = sext i32 [[ADD]] to i64
; CHECK-NEXT: [[TMP1:%.*]] = getelementptr [50 x i32], [50 x i32]* [[ARRAY:%.*]], i64 0, i64 [[TMP0]]		; CHECK-NEXT: [[GEP:%.*]] = getelementptr inbounds [50 x i32], [50 x i32]* [[ARRAY:%.*]], i64 0, i64 [[SEXT]]
; CHECK-NEXT: [[GEP2:%.*]] = getelementptr inbounds i32, i32* [[TMP1]], i64 5		; CHECK-NEXT: store i32 [[ADD]], i32* [[GEP]], align 4
; CHECK-NEXT: store i32 [[ADD]], i32* [[GEP2]], align 4		; CHECK-NEXT: [[ADD3:%.*]] = add nsw i32 [[I]], 6
; CHECK-NEXT: [[TMP2:%.*]] = getelementptr [50 x i32], [50 x i32]* [[ARRAY]], i64 [[TMP0]], i64 [[TMP0]]		; CHECK-NEXT: [[SEXT4:%.*]] = sext i32 [[ADD3]] to i64
; CHECK-NEXT: [[GEP54:%.*]] = getelementptr inbounds i32, i32* [[TMP2]], i64 6		; CHECK-NEXT: [[INT:%.*]] = sext i32 [[I]] to i64
; CHECK-NEXT: store i32 [[I]], i32* [[GEP54]], align 4		; CHECK-NEXT: [[GEP5:%.*]] = getelementptr inbounds [50 x i32], [50 x i32]* [[ARRAY]], i64 [[INT]], i64 [[SEXT4]]
; CHECK-NEXT: [[GEP87:%.*]] = getelementptr inbounds i32, i32* [[TMP2]], i64 335		; CHECK-NEXT: store i32 [[I]], i32* [[GEP5]], align 4
; CHECK-NEXT: store i32 [[I]], i32* [[GEP87]], align 4		; CHECK-NEXT: [[ADD6:%.*]] = add nsw i32 [[I]], 35
		; CHECK-NEXT: [[SEXT7:%.*]] = sext i32 [[ADD6]] to i64
		; CHECK-NEXT: [[GEP8:%.*]] = getelementptr inbounds [50 x i32], [50 x i32]* [[ARRAY]], i64 [[SEXT4]], i64 [[SEXT7]]
		; CHECK-NEXT: store i32 [[I]], i32* [[GEP8]], align 4
		mkazantsev: This code is bigger than it used to be. Can you explain why it is better?
		eklepilkina (author): This code is bigger in IR because of the repeated sext operations, but `sext` isn't so critical in assembly. At the same time, the pass generates 2 new GEP instructions that are used as bases, and we need registers for them.
		mkazantsev: Then please provide an llc test that demonstrates a positive change. The claim that "sext isn't so critical" is far from obvious to me. Filling the upper part of the register may sometimes be an extra operation.
; CHECK-NEXT: ret i32 undef		; CHECK-NEXT: ret i32 undef
;		;
entry:		entry:
%add = add nsw i32 %i, 5		%add = add nsw i32 %i, 5
%sext = sext i32 %add to i64		%sext = sext i32 %add to i64
%gep = getelementptr inbounds [50 x i32], [50 x i32]* %array, i64 0, i64 %sext		%gep = getelementptr inbounds [50 x i32], [50 x i32]* %array, i64 0, i64 %sext
store i32 %add, i32* %gep		store i32 %add, i32* %gep
%add3 = add nsw i32 %i, 6		%add3 = add nsw i32 %i, 6
[... 10 lines not shown ...]

Revision Contents

Diff 451472
