This is an archive of the discontinued LLVM Phabricator instance.

Differential D36073

[CGP] Extends the scope of optimizeMemoryInst optimization
Closed, Public

Authored by skatkov on Jul 31 2017, 12:17 AM.

Details

Reviewers

efriedma
dberlin
mkazantsev
reames
john.brawn

Commits

rGee892325bf3e: [CGP] Enable extending scope of optimizeMemoryInst
rG365200295a94: [CGP] Disable Select instruction handling in optimizeMemoryInst. NFC
rGd5d8d54b0811: [CGP] Extends the scope of optimizeMemoryInst optimization
rGcde03f3d27a8: [CGP] Extends the scope of optimizeMemoryInst optimization. NFC
rL317665: [CGP] Enable extending scope of optimizeMemoryInst
rL317555: [CGP] Disable Select instruction handling in optimizeMemoryInst. NFC
rL317430: [CGP] Extends the scope of optimizeMemoryInst optimization. NFC
rL317429: [CGP] Extends the scope of optimizeMemoryInst optimization

Summary

This is an implementation of PR26223.

Currently, the optimizeMemoryInst optimization tries to fold the address computation
if all possible ways to compute the address are of the form

baseGV + base + scale * Index + offset

where scale and offset are constants and baseGV, base and Index are exactly
the same instructions, if defined.

The patch extends this optimization to allow different bases. In this case
it tries to find/build a Phi node merging all possible bases and uses this Phi node
as the base for the sunk address computation.

The main motivation for this scope extension is GCRelocateInst.
If there is a relocation of a derived pointer, it will be represented as a relocation of base + offset.
There will also be a Phi node merging the address computation for the relocated derived pointer
and the derived pointer itself. If we have a Phi node merging the original base and the relocated base
and can fold the address computation of the derived pointer, then we can potentially reduce
the code size and remove the Phi node for the derived pointer. The latter can have a positive impact on
the register allocator.
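
As a rough illustration of the idea in the summary, the sketch below checks whether a set of addressing modes differs only in the base and, if so, collects the (block, base) pairs that a merged base Phi would need. It is a toy model, not the patch's code: `ToyAddrMode`, `Incoming`, `BasePhi` and `mergeBases` are hypothetical names, and strings stand in for IR values and basic blocks.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an addressing mode: BaseGV + Base + Scale * Index + Offset.
// Strings play the role of IR values; an empty string means "not present".
struct ToyAddrMode {
  std::string BaseGV;
  std::string Base;
  std::string Index;
  int64_t Scale;
  int64_t Offset;

  // True when every field except Base is identical.
  bool equalsIgnoringBase(const ToyAddrMode &O) const {
    return BaseGV == O.BaseGV && Index == O.Index && Scale == O.Scale &&
           Offset == O.Offset;
  }
};

// One incoming edge of the address Phi: the predecessor block name plus the
// addressing mode matched for the value flowing in from that block.
using Incoming = std::pair<std::string, ToyAddrMode>;

// The (block, base) pairs that a merged Phi of bases would need.
using BasePhi = std::vector<std::pair<std::string, std::string>>;

// Fills Result and returns true when all modes agree on every field other
// than Base; returns false (with Result empty) when some other field differs.
bool mergeBases(const std::vector<Incoming> &Modes, BasePhi &Result) {
  assert(!Modes.empty() && "need at least one addressing mode");
  Result.clear();
  for (const Incoming &In : Modes) {
    if (!In.second.equalsIgnoringBase(Modes.front().second)) {
      Result.clear();
      return false;
    }
    Result.emplace_back(In.first, In.second.Base);
  }
  return true;
}
```

With `b1 + 40` arriving from one block and `b2 + 40` from another, `mergeBases` succeeds and yields the two bases, so the sunk address could be formed as the merged base plus 40. If the offsets differed, it would report failure, which corresponds to the case the patch does not yet handle.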

Diff Detail

Repository: rL LLVM

Event Timeline

skatkov created this revision. Jul 31 2017, 12:17 AM

skatkov mentioned this in D35474: SSAUpdater: Add mode when Phi creation is not allowed. Jul 31 2017, 12:21 AM

dneilson added a subscriber: dneilson. Jul 31 2017, 6:08 AM

dberlin added inline comments. Jul 31 2017, 10:38 AM

lib/CodeGen/CodeGenPrepare.cpp
4381 ↗	(On Diff #108867)	FWIW, I'd just hash them, rather than check the possible matches again and again, I think. Happy to give you some code to crib for that.

skatkov added inline comments. Jul 31 2017, 8:08 PM

lib/CodeGen/CodeGenPrepare.cpp
4381 ↗	(On Diff #108867)	Interesting idea. You mean hash the Phi node and compare hashes first? I'm not sure I compare the same Phi node many times. I take a Phi node and compare it against each Phi in the same basic block. If I find a match, I recursively check the dependencies. If this Phi node does not match any Phi node in this basic block, I either bail out (if creation of new Phi nodes is not allowed) or move this Phi node (and all other Phi nodes considered together with it) to the list of known Phi nodes, and I will not consider them again. So it seems that hashing is redundant here. Am I missing anything?
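
For readers unfamiliar with the hashing approach being discussed, here is a minimal sketch of the general technique, assuming a toy Phi representation. It is not code from either participant: `ToyPhi`, `hashPhi` and `findIdenticalPhi` are hypothetical, and the sketch only finds Phis whose incoming values are literally identical. It does not attempt the recursive matching described in the reply, where incoming values may themselves be Phis that still need to be matched.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Toy Phi node: maps predecessor block name -> incoming value name.
// std::map keeps entries ordered by block, so the hash does not depend on
// the order in which incoming edges were added.
using ToyPhi = std::map<std::string, std::string>;

// Combine the hashes of all (block, value) pairs into a single value.
std::size_t hashPhi(const ToyPhi &Phi) {
  std::size_t H = 0;
  std::hash<std::string> Hasher;
  for (const auto &Entry : Phi) {
    H = H * 31 + Hasher(Entry.first);
    H = H * 31 + Hasher(Entry.second);
  }
  return H;
}

// Returns the index of an existing Phi identical to New, or -1 if none.
// Only Phis that land in the same hash bucket are compared entry by entry.
int findIdenticalPhi(const std::vector<ToyPhi> &Existing, const ToyPhi &New) {
  std::unordered_multimap<std::size_t, int> Buckets;
  for (int I = 0, E = static_cast<int>(Existing.size()); I != E; ++I)
    Buckets.emplace(hashPhi(Existing[I]), I);

  auto Range = Buckets.equal_range(hashPhi(New));
  for (auto It = Range.first; It != Range.second; ++It)
    if (Existing[It->second] == New)
      return It->second;
  return -1;
}
```

The trade-off raised in the thread still applies: hashing pays off when the same candidates would otherwise be compared repeatedly, and is of little use if each Phi is examined only once.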

skatkov added inline comments. Jul 31 2017, 8:22 PM

lib/CodeGen/CodeGenPrepare.cpp
4429 ↗	(On Diff #108867)	should be static. Will update after first iteration of review.

I would really appreciate it if you could review the patch :)
Maybe I am really missing something.

I know the code is pretty big, but I tried to make it as simple to understand as I could.
If I can do anything to simplify the review, please let me know.

Added a couple more reviewers... If anyone knows who else could be added, please do.

mkazantsev added inline comments. Aug 7 2017, 12:15 AM

lib/CodeGen/CodeGenPrepare.cpp
2617 ↗	(On Diff #108867)	I think this can be merged separately as NFC. Also you could re-define the `operator==` above like `return (BaseReg == O.BaseReg) && EqualsIgnoreBase(O)` to avoid code duplication.
4630 ↗	(On Diff #108867)	Better use && instead of increased nesting.
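
The `operator==` suggestion above can be sketched as follows. This is a hypothetical, trimmed-down stand-in (`ToyExtAddrMode`) rather than the real `ExtAddrMode`; the field set is an assumption chosen only to show how full equality can reuse the base-ignoring comparison.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, trimmed-down stand-in for ExtAddrMode; pointers are opaque.
struct ToyExtAddrMode {
  const void *BaseGV;
  const void *BaseReg;
  const void *ScaledReg;
  int64_t Scale;
  int64_t BaseOffs;

  // Compares every field except BaseReg.
  bool EqualsIgnoreBase(const ToyExtAddrMode &O) const {
    return BaseGV == O.BaseGV && ScaledReg == O.ScaledReg &&
           Scale == O.Scale && BaseOffs == O.BaseOffs;
  }

  // Full equality reuses the helper, so the field list lives in one place.
  bool operator==(const ToyExtAddrMode &O) const {
    return BaseReg == O.BaseReg && EqualsIgnoreBase(O);
  }
};
```

Two modes with different `BaseReg` but otherwise equal fields compare equal under `EqualsIgnoreBase` and unequal under `operator==`, which is the distinction the combining logic relies on.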

mkazantsev added inline comments. Aug 7 2017, 12:16 AM

lib/CodeGen/CodeGenPrepare.cpp
4630 ↗	(On Diff #108867)	UPD: Not viable here, sorry, I misread it.

skatkov edited the summary of this revision. Aug 16 2017, 11:36 PM

reames added inline comments. Sep 22 2017, 2:49 PM

lib/CodeGen/CodeGenPrepare.cpp
2617 ↗	(On Diff #108867)	Please take Max's suggestion. Getting this integrated is going to be a slow process and we should carve out any piece we can.
4617 ↗	(On Diff #108867)	getParent()->getParent() --> getFunction()
4630 ↗	(On Diff #108867)	Why is it not viable?
4645 ↗	(On Diff #108867)	This snippet of code confuses me and the comment doesn't help.
4649 ↗	(On Diff #108867)	As I pointed out in the other review (https://reviews.llvm.org/D38133), I think both can be done using the same API as seen by this function. I generally like the framing of the other patch (track the addrmodes found, then merge later) a bit better than this and it's clearly more general. I think splitting this patch into two would simplify the review greatly and highlight the common work between the reviews. My suggested split would be: (1) The changes to this function required to track which components differ, with a trivial implementation of the analysis/transform for when one component differs and the necessary phi/select can be trivially found in the same basic block. (2) A patch which builds on top of that with the rest of the algorithm. (Note: This is in addition to the select handling factored out from that other review.)

Marking as "changes needed" to reflect the split requested. Note that I really haven't even looked at the guts of the transform and don't plan to until the common pieces have been extracted.

This revision now requires changes to proceed. Sep 22 2017, 2:50 PM

john.brawn mentioned this in D38133: [CGP] Make optimizeMemoryInst able to combine more kinds of ExtAddrMode fields. Sep 25 2017, 9:30 AM

skatkov added inline comments. Sep 25 2017, 10:12 PM

lib/CodeGen/CodeGenPrepare.cpp
2617 ↗	(On Diff #108867)	We agreed that john.brawn will implement it in a separate patch. So I'm waiting for him to make this update.
4630 ↗	(On Diff #108867)	Because there is additional code which we should execute if the first condition is true while the second one is false.
4645 ↗	(On Diff #108867)	I meant that if AddrModes contains only a base (no index, no offset), there is no need to do a complex analysis because we end up with the same Phi node we started from.

john.brawn added a subscriber: john.brawn. Sep 26 2017, 6:48 AM

john.brawn added inline comments.

lib/CodeGen/CodeGenPrepare.cpp
2617 ↗	(On Diff #108867)	This is now in D38278 (plus its dependent D38242).

john.brawn mentioned this in D38278: [CGP] Make optimizeMemoryInst capable of handling multiple AddrModes. Oct 2 2017, 8:10 AM

john.brawn added a child revision: D38133: [CGP] Make optimizeMemoryInst able to combine more kinds of ExtAddrMode fields. Oct 2 2017, 9:21 AM

skatkov added a parent revision: D38533: [CGP]Restrict complex select/phi case for optimizeMemoryInst. Oct 4 2017, 2:21 AM

skatkov edited parent revisions, added: D38535: [CGP] Separate Select and Phi case in optimizeMemoryInst; removed: D38533: [CGP]Restrict complex select/phi case for optimizeMemoryInst. Oct 4 2017, 3:15 AM

mkazantsev added inline comments. Oct 10 2017, 1:16 AM

lib/CodeGen/CodeGenPrepare.cpp
4366 ↗	(On Diff #108867)	Nit: `Item`

Please take a look.
I also plan to add more complex test cases, such as different mixes of select and phi.

Herald added a subscriber: javed.absar. Oct 10 2017, 4:24 AM

skatkov added a reviewer: john.brawn. Oct 10 2017, 4:24 AM

skatkov added inline comments. Oct 10 2017, 4:31 AM

lib/CodeGen/CodeGenPrepare.cpp
203 ↗	(On Diff #118344)	I plan to set this to true before submitting the patch and then set it back to false in a separate commit.
209 ↗	(On Diff #118344)	If I set this to true I get one unit test failure. The optimization works but the generated code for ARM seems worse... Need to think about a heuristic for when to enable Phi node creation. For example, if the original Phi will be eliminated (no other users except memory operations...)
2752 ↗	(On Diff #118344)	Unfortunately this is not enough to check for the trivial case. It is possible that BaseReg is just a bitcast and there are no other fields, but OriginalValue != BaseReg, so we consider this not trivial, which seems wrong. I plan to come up with a separate patch for this.
5044 ↗	(On Diff #118344)	An idea for a compile-time improvement: if we know there is already a sunk address for this Addr, why do we need all the checks above?

skatkov mentioned this in D38533: [CGP]Restrict complex select/phi case for optimizeMemoryInst. Oct 10 2017, 4:34 AM

mkazantsev added inline comments. Oct 10 2017, 4:39 AM

lib/CodeGen/CodeGenPrepare.cpp
3414 ↗	(On Diff #118344)	In such cases it needs to be `auto *PHI`.

Handled Maxim's comment.

john.brawn added inline comments. Oct 20 2017, 7:24 AM

lib/CodeGen/CodeGenPrepare.cpp
3765 ↗	(On Diff #118546)	If TrueValue is not an Instruction then this will fail because the value will have been inserted into the Map with nullptr for the block. I think we need handling similar to how PHINode handles it below (and the same is true for FalseValue also).
3779 ↗	(On Diff #118546)	The iterator order of predecessors(CurrentBlock) may be different to the iterator order of CurrentPhi->blocks(), which can cause PHI to have incoming values in a different order to CurrentPhi which is a bit of a nuisance when writing tests. Could you see if you can preserve the block order? I tried modifying this to iterate over CurrentPhi->blocks(), but then I hit the "No predecessor Value" assert for reasons I haven't figured out yet.
3816 ↗	(On Diff #118546)	Elsewhere Map.find(Current) != Map.end() is used to see if a value is in the map. It would be nice for that to be checked the same way everywhere.
3835 ↗	(On Diff #118546)	Map[Current] = PHI (and similar elsewhere in this function).

Handled comments. Still need to add more tests... Plan to do it in 1-2 days.

skatkov marked 2 inline comments as done. Oct 23 2017, 4:23 AM

skatkov added inline comments.

lib/CodeGen/CodeGenPrepare.cpp
3765 ↗	(On Diff #118546)	Good catch. Fixed.
3779 ↗	(On Diff #118546)	We can do this only if CurrentPhi is declared in this basic block; otherwise you'll get the assert you mentioned. If we really want it, it will require duplicating the loop to support both cases (defined in this block and not). Do you see any other possible way to do that?
3816 ↗	(On Diff #118546)	Done.
3835 ↗	(On Diff #118546)	Done.

Added a couple more tests, specifically for select of select, select of phi and phi of select.

LGTM.

lib/CodeGen/CodeGenPrepare.cpp
3779 ↗	(On Diff #118546)	Looking at this some more, it does look like it would be more trouble than it's worth to try and keep the block order, so I think this is OK as it is.

Re-based before landing.
The optimization is disabled by default. I will land a separate patch to enable it on Tuesday morning (my time) so that I can react to possible problems.

Closed by commit rL317429: [CGP] Extends the scope of optimizeMemoryInst optimization (authored by skatkov). Nov 4 2017, 10:51 PM

This revision was automatically updated to reflect the committed changes.

Seems it broke selfhosting. http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules-2/builds/13335

Yeah :(

Reverted as r317667. Will investigate and resubmit later.

Revision Contents

Path	Size
llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp	443 lines

Diff 121616

llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp

[... 107 lines not shown ...]
STATISTIC(NumPHIsElim, "Number of trivial PHIs eliminated");		STATISTIC(NumPHIsElim, "Number of trivial PHIs eliminated");
STATISTIC(NumGEPsElim, "Number of GEPs converted to casts");		STATISTIC(NumGEPsElim, "Number of GEPs converted to casts");
STATISTIC(NumCmpUses, "Number of uses of Cmp expressions replaced with uses of "		STATISTIC(NumCmpUses, "Number of uses of Cmp expressions replaced with uses of "
"sunken Cmps");		"sunken Cmps");
STATISTIC(NumCastUses, "Number of uses of Cast expressions replaced with uses "		STATISTIC(NumCastUses, "Number of uses of Cast expressions replaced with uses "
"of sunken Casts");		"of sunken Casts");
STATISTIC(NumMemoryInsts, "Number of memory instructions whose address "		STATISTIC(NumMemoryInsts, "Number of memory instructions whose address "
"computations were sunk");		"computations were sunk");
		STATISTIC(NumMemoryInstsPhiCreated,
		"Number of phis created when address "
		"computations were sunk to memory instructions");
		STATISTIC(NumMemoryInstsSelectCreated,
		"Number of select created when address "
		"computations were sunk to memory instructions");
STATISTIC(NumExtsMoved, "Number of [s|z]ext instructions combined with loads");		STATISTIC(NumExtsMoved, "Number of [s|z]ext instructions combined with loads");
STATISTIC(NumExtUses, "Number of uses of [s|z]ext instructions optimized");		STATISTIC(NumExtUses, "Number of uses of [s|z]ext instructions optimized");
STATISTIC(NumAndsAdded,		STATISTIC(NumAndsAdded,
"Number of and mask instructions added to form ext loads");		"Number of and mask instructions added to form ext loads");
STATISTIC(NumAndUses, "Number of uses of and mask instructions optimized");		STATISTIC(NumAndUses, "Number of uses of and mask instructions optimized");
STATISTIC(NumRetsDup, "Number of return instructions duplicated");		STATISTIC(NumRetsDup, "Number of return instructions duplicated");
STATISTIC(NumDbgValueMoved, "Number of debug value instructions moved");		STATISTIC(NumDbgValueMoved, "Number of debug value instructions moved");
STATISTIC(NumSelectsExpanded, "Number of selects turned into branches");		STATISTIC(NumSelectsExpanded, "Number of selects turned into branches");
[... 54 lines not shown ...]
		static cl::opt<bool> ForceSplitStore(
"force-split-store", cl::Hidden, cl::init(false),		"force-split-store", cl::Hidden, cl::init(false),
cl::desc("Force store splitting no matter what the target query says."));		cl::desc("Force store splitting no matter what the target query says."));

static cl::opt<bool>		static cl::opt<bool>
EnableTypePromotionMerge("cgp-type-promotion-merge", cl::Hidden,		EnableTypePromotionMerge("cgp-type-promotion-merge", cl::Hidden,
cl::desc("Enable merging of redundant sexts when one is dominating"		cl::desc("Enable merging of redundant sexts when one is dominating"
" the other."), cl::init(true));		" the other."), cl::init(true));

		static cl::opt<bool> DisableComplexAddrModes(
		"disable-complex-addr-modes", cl::Hidden, cl::init(true),
		cl::desc("Disables combining addressing modes with different parts "
		"in optimizeMemoryInst."));

		static cl::opt<bool>
		AddrSinkNewPhis("addr-sink-new-phis", cl::Hidden, cl::init(false),
		cl::desc("Allow creation of Phis in Address sinking."));

		static cl::opt<bool>
		AddrSinkNewSelects("addr-sink-new-select", cl::Hidden, cl::init(true),
		cl::desc("Allow creation of selects in Address sinking."));

namespace {		namespace {

using SetOfInstrs = SmallPtrSet<Instruction *, 16>;		using SetOfInstrs = SmallPtrSet<Instruction *, 16>;
using TypeIsSExt = PointerIntPair<Type *, 1, bool>;		using TypeIsSExt = PointerIntPair<Type *, 1, bool>;
using InstrToOrigTy = DenseMap<Instruction *, TypeIsSExt>;		using InstrToOrigTy = DenseMap<Instruction *, TypeIsSExt>;
using SExts = SmallVector<Instruction *, 16>;		using SExts = SmallVector<Instruction *, 16>;
using ValueToSExts = DenseMap<Value *, SExts>;		using ValueToSExts = DenseMap<Value *, SExts>;

[... 2,469 lines not shown ...]
		private:
bool isProfitableToFoldIntoAddressingMode(Instruction *I,		bool isProfitableToFoldIntoAddressingMode(Instruction *I,
ExtAddrMode &AMBefore,		ExtAddrMode &AMBefore,
ExtAddrMode &AMAfter);		ExtAddrMode &AMAfter);
bool valueAlreadyLiveAtInst(Value *Val, Value *KnownLive1, Value *KnownLive2);		bool valueAlreadyLiveAtInst(Value *Val, Value *KnownLive1, Value *KnownLive2);
bool isPromotionProfitable(unsigned NewCost, unsigned OldCost,		bool isPromotionProfitable(unsigned NewCost, unsigned OldCost,
Value *PromotedOperand) const;		Value *PromotedOperand) const;
};		};

		/// \brief Keep track of simplification of Phi nodes.
		/// Accept the set of all phi nodes and erase phi node from this set
		/// if it is simplified.
		class SimplificationTracker {
		DenseMap<Value *, Value *> Storage;
		const SimplifyQuery &SQ;
		SmallPtrSetImpl<PHINode *> &AllPhiNodes;
		SmallPtrSetImpl<SelectInst *> &AllSelectNodes;

		public:
		SimplificationTracker(const SimplifyQuery &sq,
		SmallPtrSetImpl<PHINode *> &APN,
		SmallPtrSetImpl<SelectInst *> &ASN)
		: SQ(sq), AllPhiNodes(APN), AllSelectNodes(ASN) {}

		Value *Get(Value *V) {
		do {
		auto SV = Storage.find(V);
		if (SV == Storage.end())
		return V;
		V = SV->second;
		} while (true);
		}

		Value *Simplify(Value *Val) {
		SmallVector<Value *, 32> WorkList;
		SmallPtrSet<Value *, 32> Visited;
		WorkList.push_back(Val);
		while (!WorkList.empty()) {
		auto P = WorkList.pop_back_val();
		if (!Visited.insert(P).second)
		continue;
		if (auto *PI = dyn_cast<Instruction>(P))
		if (Value *V = SimplifyInstruction(cast<Instruction>(PI), SQ)) {
		for (auto *U : PI->users())
		WorkList.push_back(cast<Value>(U));
		Put(PI, V);
		PI->replaceAllUsesWith(V);
		if (auto *PHI = dyn_cast<PHINode>(PI))
		AllPhiNodes.erase(PHI);
		if (auto *Select = dyn_cast<SelectInst>(PI))
		AllSelectNodes.erase(Select);
		PI->eraseFromParent();
		}
		}
		return Get(Val);
		}

		void Put(Value *From, Value *To) {
		Storage.insert({ From, To });
		}
		};

/// \brief A helper class for combining addressing modes.		/// \brief A helper class for combining addressing modes.
class AddressingModeCombiner {		class AddressingModeCombiner {
		typedef std::pair<Value *, BasicBlock *> ValueInBB;
		typedef DenseMap<ValueInBB, Value *> FoldAddrToValueMapping;
		typedef std::pair<PHINode *, PHINode *> PHIPair;

private:		private:
/// The addressing modes we've collected.		/// The addressing modes we've collected.
SmallVector<ExtAddrMode, 16> AddrModes;		SmallVector<ExtAddrMode, 16> AddrModes;

/// The field in which the AddrModes differ, when we have more than one.		/// The field in which the AddrModes differ, when we have more than one.
ExtAddrMode::FieldName DifferentField = ExtAddrMode::NoField;		ExtAddrMode::FieldName DifferentField = ExtAddrMode::NoField;

/// Are the AddrModes that we have all just equal to their original values?		/// Are the AddrModes that we have all just equal to their original values?
bool AllAddrModesTrivial = true;		bool AllAddrModesTrivial = true;

		/// Common Type for all different fields in addressing modes.
		Type *CommonType;

		/// SimplifyQuery for simplifyInstruction utility.
		const SimplifyQuery &SQ;

		/// Original Address.
		ValueInBB Original;

public:		public:
		AddressingModeCombiner(const SimplifyQuery &_SQ, ValueInBB OriginalValue)
		: CommonType(nullptr), SQ(_SQ), Original(OriginalValue) {}

/// \brief Get the combined AddrMode		/// \brief Get the combined AddrMode
const ExtAddrMode &getAddrMode() const {		const ExtAddrMode &getAddrMode() const {
return AddrModes[0];		return AddrModes[0];
}		}

/// \brief Add a new AddrMode if it's compatible with the AddrModes we already		/// \brief Add a new AddrMode if it's compatible with the AddrModes we already
/// have.		/// have.
/// \return True iff we succeeded in doing so.		/// \return True iff we succeeded in doing so.
[... 51 lines not shown ...]
		bool combineAddrModes() {
if (AddrModes.size() == 1)		if (AddrModes.size() == 1)
return true;		return true;

// If the AddrModes we collected are all just equal to the value they are		// If the AddrModes we collected are all just equal to the value they are
// derived from then combining them wouldn't do anything useful.		// derived from then combining them wouldn't do anything useful.
if (AllAddrModesTrivial)		if (AllAddrModesTrivial)
return false;		return false;

// TODO: Combine multiple AddrModes by inserting a select or phi for the		if (DisableComplexAddrModes)
// field in which the AddrModes differ.
return false;		return false;

		// For now we support only different base registers.
		// TODO: enable others.
		if (DifferentField != ExtAddrMode::BaseRegField)
		return false;

		// Build a map between <original value, basic block where we saw it> to
		// value of base register.
		FoldAddrToValueMapping Map;
		initializeMap(Map);

		Value *CommonValue = findCommon(Map);
		if (CommonValue)
		AddrModes[0].BaseReg = CommonValue;
		return CommonValue != nullptr;
}		}
};

		private:
		/// \brief Initialize Map with anchor values. For address seen in some BB
		/// we set the value of different field saw in this address.
		/// If address is not an instruction then basic block is set to null.
		/// At the same time we find a common type for different field we will
		/// use to create new Phi/Select nodes. Keep it in CommonType field.
		void initializeMap(FoldAddrToValueMapping &Map) {
		// Keep track of keys where the value is null. We will need to replace it
		// with constant null when we know the common type.
		SmallVector<ValueInBB, 2> NullValue;
		for (auto &AM : AddrModes) {
		BasicBlock *BB = nullptr;
		if (Instruction *I = dyn_cast<Instruction>(AM.OriginalValue))
		BB = I->getParent();

		// For now we support only base register as different field.
		// TODO: Enable others.
		Value *DV = AM.BaseReg;
		if (DV) {
		if (CommonType)
		assert(CommonType == DV->getType() && "Different types detected!");
		else
		CommonType = DV->getType();
		Map[{ AM.OriginalValue, BB }] = DV;
		} else {
		NullValue.push_back({ AM.OriginalValue, BB });
		}
		}
		assert(CommonType && "At least one non-null value must be!");
		for (auto VIBB : NullValue)
		Map[VIBB] = Constant::getNullValue(CommonType);
		}

		/// \brief We have mapping between value A and basic block where value A
		/// seen to other value B where B was a field in addressing mode represented
		/// by A. Also we have an original value C representing an address in some
		/// basic block. Traversing from C through phi and selects we ended up with
		/// A's in a map. This utility function tries to find a value V which is a
		/// field in addressing mode C and traversing through phi nodes and selects
		/// we will end up in corresponded values B in a map.
		/// The utility will create a new Phi/Selects if needed.
		// The simple example looks as follows:
		// BB1:
		// p1 = b1 + 40
		// br cond BB2, BB3
		// BB2:
		// p2 = b2 + 40
		// br BB3
		// BB3:
		// p = phi [p1, BB1], [p2, BB2]
		// v = load p
		// Map is
		// <p1, BB1> -> b1
		// <p2, BB2> -> b2
		// Request is
		// <p, BB3> -> ?
		// The function tries to find or build phi [b1, BB1], [b2, BB2] in BB3
		Value *findCommon(FoldAddrToValueMapping &Map) {
		// Tracks of new created Phi nodes.
		SmallPtrSet<PHINode *, 32> NewPhiNodes;
		// Tracks of new created Select nodes.
		SmallPtrSet<SelectInst *, 32> NewSelectNodes;
		// Tracks the simplification of new created phi nodes. The reason we use
		// this mapping is because we will add new created Phi nodes in AddrToBase.
		// Simplification of Phi nodes is recursive, so some Phi node may
		// be simplified after we added it to AddrToBase.
		// Using this mapping we can find the current value in AddrToBase.
		SimplificationTracker ST(SQ, NewPhiNodes, NewSelectNodes);

		// First step, DFS to create PHI nodes for all intermediate blocks.
		// Also fill traverse order for the second step.
		SmallVector<ValueInBB, 32> TraverseOrder;
		InsertPlaceholders(Map, TraverseOrder, NewPhiNodes, NewSelectNodes);

		// Second Step, fill new nodes by merged values and simplify if possible.
		FillPlaceholders(Map, TraverseOrder, ST);

		if (!AddrSinkNewSelects && NewSelectNodes.size() > 0) {
		DestroyNodes(NewPhiNodes);
		DestroyNodes(NewSelectNodes);
		return nullptr;
		}

		// Now we'd like to match New Phi nodes to existed ones.
		unsigned PhiNotMatchedCount = 0;
		if (!MatchPhiSet(NewPhiNodes, ST, AddrSinkNewPhis, PhiNotMatchedCount)) {
		DestroyNodes(NewPhiNodes);
		DestroyNodes(NewSelectNodes);
		return nullptr;
		}

		auto *Result = ST.Get(Map.find(Original)->second);
		if (Result) {
		NumMemoryInstsPhiCreated += NewPhiNodes.size() + PhiNotMatchedCount;
		NumMemoryInstsSelectCreated += NewSelectNodes.size();
		}
		return Result;
		}

		/// \brief Destroy nodes from a set.
		template <typename T> void DestroyNodes(SmallPtrSetImpl<T *> &Instructions) {
		// For safe erasing, replace the Phi with dummy value first.
		auto Dummy = UndefValue::get(CommonType);
		for (auto I : Instructions) {
		I->replaceAllUsesWith(Dummy);
		I->eraseFromParent();
		}
		}

		/// \brief Try to match PHI node to Candidate.
		/// Matcher tracks the matched Phi nodes.
		bool MatchPhiNode(PHINode *PHI, PHINode *Candidate,
		DenseSet<PHIPair> &Matcher,
		SmallPtrSetImpl<PHINode *> &PhiNodesToMatch) {
		SmallVector<PHIPair, 8> WorkList;
		Matcher.insert({ PHI, Candidate });
		WorkList.push_back({ PHI, Candidate });
		SmallSet<PHIPair, 8> Visited;
		while (!WorkList.empty()) {
		auto Item = WorkList.pop_back_val();
		if (!Visited.insert(Item).second)
		continue;
		// We iterate over all incoming values to Phi to compare them.
		// If values are different and both of them Phi and the first one is a
		// Phi we added (subject to match) and both of them is in the same basic
		// block then we can match our pair if values match. So we state that
		// these values match and add it to work list to verify that.
		for (auto B : Item.first->blocks()) {
		Value *FirstValue = Item.first->getIncomingValueForBlock(B);
		Value *SecondValue = Item.second->getIncomingValueForBlock(B);
		if (FirstValue == SecondValue)
		continue;

		PHINode *FirstPhi = dyn_cast<PHINode>(FirstValue);
		PHINode *SecondPhi = dyn_cast<PHINode>(SecondValue);

		// One of them is not Phi or
		// The first one is not Phi node from the set we'd like to match or
		// Phi nodes from different basic blocks then
		// we will not be able to match.
		if (!FirstPhi || !SecondPhi || !PhiNodesToMatch.count(FirstPhi) ||
		FirstPhi->getParent() != SecondPhi->getParent())
		return false;

		// If we already matched them then continue.
		if (Matcher.count({ FirstPhi, SecondPhi }))
		continue;
		// So the values are different and does not match. So we need them to
		// match.
		Matcher.insert({ FirstPhi, SecondPhi });
		// But we must check it.
		WorkList.push_back({ FirstPhi, SecondPhi });
		}
		}
		return true;
		}

		/// \brief For the given set of PHI nodes try to find their equivalents.
		/// Returns false if this matching fails and creation of new Phi is disabled.
		bool MatchPhiSet(SmallPtrSetImpl<PHINode *> &PhiNodesToMatch,
		SimplificationTracker &ST, bool AllowNewPhiNodes,
		unsigned &PhiNotMatchedCount) {
		DenseSet<PHIPair> Matched;
		SmallPtrSet<PHINode *, 8> WillNotMatch;
		while (PhiNodesToMatch.size()) {
		PHINode *PHI = *PhiNodesToMatch.begin();

		// Add us, if no Phi nodes in the basic block we do not match.
		WillNotMatch.clear();
		WillNotMatch.insert(PHI);

		// Traverse all Phis until we found equivalent or fail to do that.
		bool IsMatched = false;
		for (auto &P : PHI->getParent()->phis()) {
		if (&P == PHI)
		continue;
		if ((IsMatched = MatchPhiNode(PHI, &P, Matched, PhiNodesToMatch)))
		break;
		// If it does not match, collect all Phi nodes from matcher.
		// if we end up with no match, then all these Phi nodes will not match
		// later.
		for (auto M : Matched)
		WillNotMatch.insert(M.first);
		Matched.clear();
		}
		if (IsMatched) {
		// Replace all matched values and erase them.
		for (auto MV : Matched) {
		MV.first->replaceAllUsesWith(MV.second);
		PhiNodesToMatch.erase(MV.first);
		ST.Put(MV.first, MV.second);
		MV.first->eraseFromParent();
		}
		Matched.clear();
		continue;
		}
		// If we are not allowed to create new nodes then bail out.
		if (!AllowNewPhiNodes)
		return false;
		// Just remove all seen values in matcher. They will not match anything.
		PhiNotMatchedCount += WillNotMatch.size();
		for (auto *P : WillNotMatch)
		PhiNodesToMatch.erase(P);
		}
		return true;
		}
		/// \brief Fill the placeholder with values from predecessors and simplify it.
		void FillPlaceholders(FoldAddrToValueMapping &Map,
		SmallVectorImpl<ValueInBB> &TraverseOrder,
		SimplificationTracker &ST) {
		while (!TraverseOrder.empty()) {
		auto Current = TraverseOrder.pop_back_val();
		assert(Map.find(Current) != Map.end() && "No node to fill!!!");
		Value *CurrentValue = Current.first;
		BasicBlock *CurrentBlock = Current.second;
		Value *V = Map[Current];

		if (SelectInst *Select = dyn_cast<SelectInst>(V)) {
		// CurrentValue also must be Select.
		auto *CurrentSelect = cast<SelectInst>(CurrentValue);
		auto *TrueValue = CurrentSelect->getTrueValue();
		ValueInBB TrueItem = { TrueValue, isa<Instruction>(TrueValue)
		? CurrentBlock
		: nullptr };
		assert(Map.find(TrueItem) != Map.end() && "No True Value!");
		Select->setTrueValue(Map[TrueItem]);
		auto *FalseValue = CurrentSelect->getFalseValue();
		ValueInBB FalseItem = { FalseValue, isa<Instruction>(FalseValue)
		? CurrentBlock
		: nullptr };
		assert(Map.find(FalseItem) != Map.end() && "No False Value!");
		Select->setFalseValue(Map[FalseItem]);
		} else {
		// Must be a Phi node then.
		PHINode *PHI = cast<PHINode>(V);
		// Fill the Phi node with values from predecessors.
		bool IsDefinedInThisBB =
		cast<Instruction>(CurrentValue)->getParent() == CurrentBlock;
		auto *CurrentPhi = dyn_cast<PHINode>(CurrentValue);
		for (auto B : predecessors(CurrentBlock)) {
		Value *PV = IsDefinedInThisBB
		? CurrentPhi->getIncomingValueForBlock(B)
		: CurrentValue;
		ValueInBB item = { PV, isa<Instruction>(PV) ? B : nullptr };
		assert(Map.find(item) != Map.end() && "No predecessor Value!");
		PHI->addIncoming(ST.Get(Map[item]), B);
		}
		}
		// Simplify if possible.
		Map[Current] = ST.Simplify(V);
		}
		}

		/// Starting from value recursively iterates over predecessors up to known
		/// ending values represented in a map. For each traversed block inserts
		/// a placeholder Phi or Select.
		/// Reports all new created Phi/Select nodes by adding them to set.
		/// Also reports the order in which basic blocks have been traversed.
		void InsertPlaceholders(FoldAddrToValueMapping &Map,
		SmallVectorImpl<ValueInBB> &TraverseOrder,
		SmallPtrSetImpl<PHINode *> &NewPhiNodes,
		SmallPtrSetImpl<SelectInst *> &NewSelectNodes) {
		SmallVector<ValueInBB, 32> Worklist;
		assert((isa<PHINode>(Original.first) || isa<SelectInst>(Original.first)) &&
		"Address must be a Phi or Select node");
		auto *Dummy = UndefValue::get(CommonType);
		Worklist.push_back(Original);
		while (!Worklist.empty()) {
		auto Current = Worklist.pop_back_val();
		// If value is not an instruction it is something global, constant,
		// parameter and we can say that this value is observable in any block.
		// Set block to null to denote it.
		// Also please take into account that it is how we build anchors.
		if (!isa<Instruction>(Current.first))
		Current.second = nullptr;
		// if it is already visited or it is an ending value then skip it.
		if (Map.find(Current) != Map.end())
		continue;
		TraverseOrder.push_back(Current);

		Value *CurrentValue = Current.first;
		BasicBlock *CurrentBlock = Current.second;
		// CurrentValue must be a Phi node or select. All others must be covered
		// by anchors.
		Instruction *CurrentI = cast<Instruction>(CurrentValue);
		bool IsDefinedInThisBB = CurrentI->getParent() == CurrentBlock;

		unsigned PredCount =
		std::distance(pred_begin(CurrentBlock), pred_end(CurrentBlock));
		// if Current Value is not defined in this basic block we are interested
		// in values in predecessors.
		if (!IsDefinedInThisBB) {
		assert(PredCount && "Unreachable block?!");
		PHINode *PHI = PHINode::Create(CommonType, PredCount, "sunk_phi",
		&CurrentBlock->front());
		Map[Current] = PHI;
		NewPhiNodes.insert(PHI);
		// Add all predecessors in work list.
		for (auto B : predecessors(CurrentBlock))
		Worklist.push_back({ CurrentValue, B });
		continue;
		}
		// Value is defined in this basic block.
		if (SelectInst *OrigSelect = dyn_cast<SelectInst>(CurrentI)) {
		// Is it OK to get metadata from OrigSelect?!
		// Create a Select placeholder with dummy value.
		SelectInst *Select =
		SelectInst::Create(OrigSelect->getCondition(), Dummy, Dummy,
		OrigSelect->getName(), OrigSelect, OrigSelect);
		Map[Current] = Select;
		NewSelectNodes.insert(Select);
		// We are interested in True and False value in this basic block.
		Worklist.push_back({ OrigSelect->getTrueValue(), CurrentBlock });
		Worklist.push_back({ OrigSelect->getFalseValue(), CurrentBlock });
		} else {
		// It must be a Phi node then.
		auto *CurrentPhi = cast<PHINode>(CurrentI);
		// Create new Phi node for merge of bases.
		assert(PredCount && "Unreachable block?!");
		PHINode *PHI = PHINode::Create(CommonType, PredCount, "sunk_phi",
		&CurrentBlock->front());
		Map[Current] = PHI;
		NewPhiNodes.insert(PHI);

		// Add all predecessors in work list.
		for (auto B : predecessors(CurrentBlock))
		Worklist.push_back({ CurrentPhi->getIncomingValueForBlock(B), B });
		}
		}
		}
		};
} // end anonymous namespace		} // end anonymous namespace

/// Try adding ScaleReg*Scale to the current addressing mode.		/// Try adding ScaleReg*Scale to the current addressing mode.
/// Return true and update AddrMode if this addr mode is legal for the target,		/// Return true and update AddrMode if this addr mode is legal for the target,
/// false if not.		/// false if not.
bool AddressingModeMatcher::matchScaledValue(Value *ScaleReg, int64_t Scale,		bool AddressingModeMatcher::matchScaledValue(Value *ScaleReg, int64_t Scale,
unsigned Depth) {		unsigned Depth) {
// If Scale is 1, then this is the same as adding ScaleReg to the addressing		// If Scale is 1, then this is the same as adding ScaleReg to the addressing
[... 1,076 lines not shown ...]
		bool CodeGenPrepare::optimizeMemoryInst(Instruction *MemoryInst, Value *Addr,
SmallPtrSet<Value*, 16> Visited;		SmallPtrSet<Value*, 16> Visited;
worklist.push_back(Addr);		worklist.push_back(Addr);

// Use a worklist to iteratively look through PHI and select nodes, and		// Use a worklist to iteratively look through PHI and select nodes, and
// ensure that the addressing mode obtained from the non-PHI/select roots of		// ensure that the addressing mode obtained from the non-PHI/select roots of
// the graph are compatible.		// the graph are compatible.
bool PhiOrSelectSeen = false;		bool PhiOrSelectSeen = false;
SmallVector<Instruction*, 16> AddrModeInsts;		SmallVector<Instruction*, 16> AddrModeInsts;
AddressingModeCombiner AddrModes;		AddressingModeCombiner AddrModes({ *DL, TLInfo },
		{ Addr, MemoryInst->getParent() });
TypePromotionTransaction TPT(RemovedInsts);		TypePromotionTransaction TPT(RemovedInsts);
TypePromotionTransaction::ConstRestorationPt LastKnownGood =		TypePromotionTransaction::ConstRestorationPt LastKnownGood =
TPT.getRestorationPoint();		TPT.getRestorationPoint();
while (!worklist.empty()) {		while (!worklist.empty()) {
Value *V = worklist.back();		Value *V = worklist.back();
worklist.pop_back();		worklist.pop_back();

// We allow traversing cyclic Phi nodes.		// We allow traversing cyclic Phi nodes.
[... 2,384 lines not shown ...]
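
The `SimplificationTracker` in the diff above records which value replaced which, and its `Get` method follows that chain until it reaches a value that has not been replaced. The sketch below models only that lookup, using strings in place of `Value *` and `std::unordered_map` in place of `DenseMap`. `ToyReplacementTracker` is a hypothetical name, and the acyclic-log assumption is stated in the code rather than taken from the patch.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Toy replacement log: records "From was replaced by To".
class ToyReplacementTracker {
  std::unordered_map<std::string, std::string> Storage;

public:
  // Record a replacement; an existing entry for From is left untouched.
  void put(const std::string &From, const std::string &To) {
    Storage.insert({From, To});
  }

  // Follow recorded replacements until reaching a value with no entry.
  // Assumes the log is acyclic, otherwise the loop would not terminate.
  std::string get(std::string V) const {
    for (auto It = Storage.find(V); It != Storage.end(); It = Storage.find(V))
      V = It->second;
    return V;
  }
};
```

Following the chain matters because simplification is recursive: a placeholder stored in the map may itself be replaced later, so a lookup has to resolve to the value that is still live.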