This is an archive of the discontinued LLVM Phabricator instance.

Differential D22551

CodeGen: If Convert blocks that would form a diamond when tail-merged.
Closed, Public

Authored by iteratee on Jul 19 2016, 4:28 PM.

Details

Reviewers

davidxl

Summary

Some if-conversions currently require tail-merging to have run first.

For example, the following function currently relies on tail-merging for
if-conversion to succeed. The common tail of cond_true and cond_false is
extracted, and this then forms a diamond pattern that can be
successfully if-converted.

If this block does not get extracted, either because tail-merging is
disabled or the threshold is higher, we should still recognize this
pattern and if-convert it.

define i32 @t2(i32 %a, i32 %b) nounwind {
entry:
      %tmp1434 = icmp eq i32 %a, %b           ; <i1> [#uses=1]
      br i1 %tmp1434, label %bb17, label %bb.outer

bb.outer:               ; preds = %cond_false, %entry
      %b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ]
      %a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ]
      br label %bb

bb:             ; preds = %cond_true, %bb.outer
      %indvar = phi i32 [ 0, %bb.outer ], [ %indvar.next, %cond_true ]
      %tmp. = sub i32 0, %b_addr.021.0.ph
      %tmp.40 = mul i32 %indvar, %tmp.
      %a_addr.026.0 = add i32 %tmp.40, %a_addr.026.0.ph
      %tmp3 = icmp sgt i32 %a_addr.026.0, %b_addr.021.0.ph
      br i1 %tmp3, label %cond_true, label %cond_false

cond_true:              ; preds = %bb
      %tmp7 = sub i32 %a_addr.026.0, %b_addr.021.0.ph
      %tmp1437 = icmp eq i32 %tmp7, %b_addr.021.0.ph
      %indvar.next = add i32 %indvar, 1
      br i1 %tmp1437, label %bb17, label %bb

cond_false:             ; preds = %bb
      %tmp10 = sub i32 %b_addr.021.0.ph, %a_addr.026.0
      %tmp14 = icmp eq i32 %a_addr.026.0, %tmp10
      br i1 %tmp14, label %bb17, label %bb.outer

bb17:           ; preds = %cond_false, %cond_true, %entry
      %a_addr.026.1 = phi i32 [ %a, %entry ], [ %tmp7, %cond_true ], [ %a_addr.026.0, %cond_false ]
      ret i32 %a_addr.026.1
}
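
For orientation, the IR for @t2 reads like a subtraction-based GCD loop. The C++ below is a reconstruction from the IR, not the original source (which is not part of this review); the function body and the positivity guard are assumptions made for this sketch. Both arms of the `if` end by comparing `a` with `b` and branching, which is the common tail that tail-merging would extract.

```cpp
#include <cassert>

// Hypothetical C++ source approximating @t2, reconstructed from the IR.
int t2(int a, int b) {
  // Guard added for this sketch only: the loop is only known to
  // terminate for positive inputs.
  assert(a > 0 && b > 0);
  while (a != b) {   // entry / tail of cond_true and cond_false
    if (a > b)
      a -= b;        // cond_true:  %tmp7  = a - b
    else
      b -= a;        // cond_false: %tmp10 = b - a
  }
  return a;          // bb17
}
```

With predication, the two subtractions become one `ite` block and the shared compare-and-branch appears once, which matches the second assembly listing in the summary.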

Without tail-merging or diamond-tail if conversion:

LBB1_1:                                 @ %bb
                                      @ =>This Inner Loop Header: Depth=1
      cmp     r0, r1
      ble     LBB1_3
@ BB#2:                                 @ %cond_true
                                      @   in Loop: Header=BB1_1 Depth=1
      subs    r0, r0, r1
      cmp     r1, r0
      it      ne
      cmpne   r0, r1
      bgt     LBB1_4
LBB1_3:                                 @ %cond_false
                                      @   in Loop: Header=BB1_1 Depth=1
      subs    r1, r1, r0
      cmp     r1, r0
      bne     LBB1_1
LBB1_4:                                 @ %bb17
      bx      lr

With diamond-tail if conversion, but without tail-merging:

@ BB#0:                                 @ %entry
      cmp     r0, r1
      it      eq
      bxeq    lr
LBB1_1:                                 @ %bb
                                      @ =>This Inner Loop Header: Depth=1
      cmp     r0, r1
      ite     le
      suble   r1, r1, r0
      subgt   r0, r0, r1
      cmp     r1, r0
      bne     LBB1_1
@ BB#2:                                 @ %bb17
      bx      lr

Diff Detail

Event Timeline

iteratee updated this revision to Diff 64599. Jul 19 2016, 4:28 PM

iteratee retitled this revision to CodeGen: If Convert blocks that would form a diamond when tail-merged.

iteratee updated this object.

iteratee added a reviewer: davidxl.

iteratee set the repository for this revision to rL LLVM.

iteratee updated this object.

iteratee added subscribers: echristo, chandlerc, timshen, llvm-commits.

Removed debugging, tidied spacing, added comments.

I am torn about this change. While this looks like a useful thing to do, I suspect this is not the right way to approach the problem.

The example actually confirms that pre-layout tail merging is a good normalization/enabler pass for later optimizations. This is why it should be run with a lower threshold, enabling as much optimization as possible, and later let TailDup undo the merges that bring no benefit and improve layout.

Another question is whether this patch can handle cases that (tailMerge + ifcvt) cannot. If not, it seems to me that the patch duplicates logic from tailMerge (e.g., counting dups), which is not the right approach.

Is this patch required to enable your tailDup enhancement patch? I don't think this one is essential for it. We can probably focus on getting your tailDup patch in first (it is very close to getting in; probably it just needs your latest tailMerge tuning to be enabled only in post-layout mode?)

In D22551#490028, @davidxl wrote:

I am torn about this change. While this looks like a useful thing to do, I suspect this is not the right way to approach the problem.

The example actually confirms that pre-layout tail merging is a good normalization/enabler pass for later optimizations. This is why it should be run with a lower threshold, enabling as much optimization as possible, and later let TailDup undo the merges that bring no benefit and improve layout.

I couldn't find any of the original commits about tail merging that refer to normalization. It's always about reducing code size. Also, it's unusual for a canonicalization pass to have a threshold.

Another question is whether this patch can handle cases that (tailMerge + ifcvt) cannot. If not, it seems to me that the patch duplicates logic from tailMerge (e.g., counting dups), which is not the right approach.

Two things:

1. The code has to count duplicates anyway. Look at how IfConvertDiamond is written.
2. There is a precedent for teaching optimization passes to look deeper, even if there is a canonicalization pass that would prevent the need.
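
As a rough illustration of what "counting duplicates" means here, the sketch below counts identical instructions at the start and at the end of two blocks. It is a simplified model, not the code from IfConversion.cpp: `Block` and `countDuplicates` are hypothetical names, instructions are modelled as strings, and the real pass additionally skips debug values and treats branches specially.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a machine basic block: one string per instruction.
using Block = std::vector<std::string>;

// Returns {Dups1, Dups2}: the number of identical instructions at the start
// of both blocks, and the number of identical instructions at the end of
// whatever remains after the shared prefix.
std::pair<std::size_t, std::size_t> countDuplicates(const Block &T,
                                                    const Block &F) {
  std::size_t TB = 0, FB = 0;               // begin cursors
  std::size_t TE = T.size(), FE = F.size(); // end cursors (one past last)
  std::size_t Dups1 = 0, Dups2 = 0;

  // Shared prefix.
  while (TB < TE && FB < FE && T[TB] == F[FB]) {
    ++Dups1;
    ++TB;
    ++FB;
  }
  // Shared suffix, never crossing into the prefix already counted.
  while (TE > TB && FE > FB && T[TE - 1] == F[FE - 1]) {
    ++Dups2;
    --TE;
    --FE;
  }
  return {Dups1, Dups2};
}
```

After the call, the cursors in a real implementation would bracket exactly the non-shared instructions, which is the part that has to be predicated.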

Is this patch required to enable your tailDup enhancement patch? I don't think this one is essential for it. We can probably focus on getting your tailDup patch in first (it is very close to getting in; probably it just needs your latest tailMerge tuning to be enabled only in post-layout mode?)

Not specifically. If tailMerge is made less aggressive only during layout, then this is not necessary.

Refactor shared code between Diamond and Diamond with shared tail.

The validateDiamond refactoring change can be split out from the functional change into a different patch.

davidxl added inline comments. Jul 21 2016, 3:58 PM

lib/CodeGen/IfConversion.cpp
864	Document the difference between this pattern and ValidDiamond? Also, since there is no common tail shared between truebb and falsebb, the shape is not really a 'Diamond'. Perhaps name it 'ValidDiamondWithTailCommonned' to indicate that the shape will be a diamond if the tail is commonned?
882	Is a 'fallthrough' check needed here?
892	Can this check be moved earlier? If so, common check can be extracted and shared across this and ValidDiamond.
906	Merge these two : if (TF == FT && TT == FF) { if (! Reversable) return false; reverse ... }
924	Better move this out of line to improve readability.
929	I have not looked in details here. Is there existing code that can be refactored/reused by any chance?

Move closure to member function.

Renamed to ForkedDiamond, and add more documentation about how it differs from standard diamond.

lib/CodeGen/IfConversion.cpp
860–861	I renamed it ForkedDiamond, and added more comments about it.
882	No. It's just checking if the branch is analyzable.
892	It's short enough that I don't think it's worth factoring out. If there were more checks in common, then maybe, but there aren't.
929	The function that is the most similar is ScanInstructions. I think they're different enough that having them share code would be more confusing than helpful.

The code looks in pretty good shape now. I find the test coverage a little lacking -- how about adding some more tests (including a negative one)?

lib/CodeGen/IfConversion.cpp
90	nit: with a common tail that can be shared
489	Split out the independent fix with a test case if possible
1024	Document the parameters.
1024	Why can't this function be folded into the existing FeasibilityAnalysis with a new flag, hasCommonForkedTail (which defaults to false)?
2085	Why is this check not done for the forked case?

Cleanups in response to comments.
Removed DebugLocation propagation.

lib/CodeGen/IfConversion.cpp
489	I've removed it. Finding a test case is enough work that I just don't have time.
2085	Good catch.

Add negative test.

iteratee added a child revision: D22317: Codegen: Tail Merge: Be less aggressive with special cases. Jul 27 2016, 5:08 PM

Is there anything else that I need to do for this patch?

davidxl added inline comments. Jul 30 2016, 8:32 PM

lib/CodeGen/IfConversion.cpp
895	The variable names here do not seem to match the control flow graph drawn in the comment. Please make them consistent.
932	Clean up the comment -- last instruction of what?
933	Is this a good assumption to make? Can an assert be added to countDuplicatedInstructions?
965	This skip code pattern has appeared many times -- good candidate to extract into an inline function.
973	Why is this check not done outside of this function (in ValidForkedDiamond, before countDuplicatedInstructions, as in ValidDiamond)?
978	Why is this not already computed?
981	FeasibilityAnalysis already checks the IsUnpredicable bit -- why is it still done here?
1155	This code looks almost exactly the same as the regular diamond case. Perhaps define a lambda function: auto DiamondFinder = [&](decltype(&IfConverter::ValidDiamond) Checker) { if (CanRevCond && (this->*Checker)(..) && ...) { .... } }; DiamondFinder(&IfConverter::ValidDiamond); DiamondFinder(&IfConverter::ValidForkedDiamond);
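
The lambda suggested in the last comment dispatches through a pointer to member function. A compilable sketch of that idiom follows, under loud assumptions: `Converter`, its two checkers, and `classify` are hypothetical stand-ins, and both checkers are given the same signature here, which (as the author's reply notes) is not true of ValidDiamond and ValidForkedDiamond in the patch.

```cpp
// Hypothetical, heavily simplified stand-in for IfConverter.
class Converter {
public:
  // Toy predicates; the real checks inspect CFG shape, not integers.
  bool validDiamond(int T, int F) const { return T == F; }
  bool validForkedDiamond(int T, int F) const { return T + 1 == F; }

  // Returns 1 for a diamond, 2 for a forked diamond, 0 for neither.
  int classify(int T, int F, bool CanRevCond) const {
    auto DiamondFinder = [&](decltype(&Converter::validDiamond) Checker) {
      // The parentheses matter: (this->*Checker)(...) calls through the
      // member pointer, whereas this->*Checker(...) would not compile.
      return CanRevCond && (this->*Checker)(T, F);
    };
    if (DiamondFinder(&Converter::validDiamond))
      return 1;
    // The two shapes are exclusive, so only try the second if the first fails.
    if (DiamondFinder(&Converter::validForkedDiamond))
      return 2;
    return 0;
  }
};
```

The design trade-off raised in the thread still applies: the idiom only removes duplication cleanly when both checkers can be called with the same argument list.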

Refactor and comments.

lib/CodeGen/IfConversion.cpp
895	Which comment specifically? The names in the graph are member names of BBInfo, and are assumed to coincide. Here we can't make that assumption. The names are also logical: TT = TrueBBI.TrueBB, TF = TrueBBI.FalseBB, etc.
933	I've elaborated in the comment. The size is computed by ScanInstructions, and the duplicated portion is subtracted off, so there's no point in recomputing the size, we would get the same answer.
973	Because countDuplicatedInstructions adjusts the iterators so that we know exactly which instructions are duplicated. We only worry about the non-duplicated instructions that clobber the predicate info.
978	See above.
981	Same reason as above.
1155	It would take too many parameters, and the code would be less legible. I would have to pass in TrueBBICalc, FalseBBICalc, hasCommonTail, and then conditionalize the calls, because ValidDiamond takes a different number of arguments from ValidForkedDiamond. I don't think it would be cleaner.

Missed changes from last patch. (Refactor and comments.)

davidxl added inline comments. Aug 2 2016, 10:24 AM

lib/CodeGen/IfConversion.cpp
895	Ok -- the naming convention makes sense.
933	My question in this comment is that countDuplicatedInstructions does not document the exit state of TIE and FIE, so add assertions to make sure TIE and FIE point to what you expect to see (pointing to identical instructions before and non-identical ones after?)
973	Should this be done for ValidDiamond too -- only check the non-shared portion? Also, I don't think it is ideal to have code duplication like this. It looks like you should reuse ScanInstructions or part of it (by making it accept BIB and BIE)?
1155	I still think refactoring is better -- the main reason for doing the refactoring is to avoid code duplication, which is better in the longer term. For instance, there is no need to worry about fixing bugs in multiple different places. Another point is that the Diamond and ForkedDiamond patterns are exclusive, so there is no need to do the forked diamond check after the diamond check returns true. In other words, the code can be further simplified to: if (CanRevCond) { if (!DiamondFinder(...::ValidDiamond)) { DiamondFinder(...::ValidForkedDiamond); } } Also, it seems to me you don't need to introduce TrueBBICalc and FalseBBICalc. How about this: in ValidForkedDiamond, save the initial values of TrueBBI and FalseBBI, re-scan the region of the BB, and update TrueBBI and FalseBBI; if ValidForkedDiamond fails, restore the TrueBBI and FalseBBI values before returning.
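
The save-and-restore suggestion at the end of that comment can be sketched with a small scope guard. Everything here is illustrative: `ScopeExit`, `makeScopeExit`, `BlockInfo`, and `tryForkedDiamond` are names invented for this sketch, and the guard is a minimal stand-in written in plain C++ (LLVM's ADT provides `llvm::make_scope_exit` for the same purpose).

```cpp
#include <utility>

// Minimal scope guard for this sketch: runs Fn on destruction unless released.
template <typename F> class ScopeExit {
  F Fn;
  bool Armed = true;

public:
  explicit ScopeExit(F Fn) : Fn(std::move(Fn)) {}
  ScopeExit(ScopeExit &&Other) : Fn(std::move(Other.Fn)), Armed(Other.Armed) {
    Other.Armed = false; // the moved-from guard must not fire
  }
  ScopeExit(const ScopeExit &) = delete;
  ScopeExit &operator=(const ScopeExit &) = delete;
  ~ScopeExit() {
    if (Armed)
      Fn();
  }
  void release() { Armed = false; }
};

template <typename F> ScopeExit<F> makeScopeExit(F Fn) {
  return ScopeExit<F>(std::move(Fn));
}

// Hypothetical subset of the per-block analysis data.
struct BlockInfo {
  unsigned ExtraCost = 0;
  bool ClobbersPred = false;
};

// Tentatively overwrites TrueInfo with rescanned data; the original value is
// restored on every failing return path and kept only on success.
bool tryForkedDiamond(BlockInfo &TrueInfo, unsigned RescannedCost,
                      bool ShapeMatches) {
  BlockInfo Saved = TrueInfo;
  auto Restore = makeScopeExit([&] { TrueInfo = Saved; });
  TrueInfo.ExtraCost = RescannedCost;
  if (!ShapeMatches)
    return false;    // guard fires and restores TrueInfo
  Restore.release(); // success: keep the rescanned values
  return true;
}
```

The guard keeps the restore logic in one place even if more early returns are added later, which is the maintenance concern the comment raises.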

More refactors and comments.

I split ScanInstructions in two, made the actual scan take iterator bounds, and removed RecaculateCostsAndClobbers.
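
A scan that takes iterator bounds, as described above, can be pictured roughly as follows. This is a simplified model under stated assumptions, not the code from the patch: `Inst`, `ScanResult`, and `scanRange` are hypothetical, and the per-instruction `Cost` field stands in for the latency-based cost the real pass derives from the scheduling model.

```cpp
#include <vector>

// Hypothetical instruction record for this sketch.
struct Inst {
  bool IsDebug;          // debug values are skipped and never counted
  bool DefinesPredicate; // would clobber the predicate if executed
  unsigned Cost;         // stand-in for a latency-derived extra cost
};

// Hypothetical summary of one scanned range.
struct ScanResult {
  unsigned NonPredSize = 0;
  unsigned ExtraCost = 0;
  bool ClobbersPred = false;
};

using InstIter = std::vector<Inst>::const_iterator;

// Scans only [Begin, End), so a caller can pass the non-shared portion of a
// block after duplicate counting has narrowed the iterators.
ScanResult scanRange(InstIter Begin, InstIter End) {
  ScanResult R;
  for (InstIter I = Begin; I != End; ++I) {
    if (I->IsDebug)
      continue;
    ++R.NonPredSize;
    R.ExtraCost += I->Cost;
    if (I->DefinesPredicate)
      R.ClobbersPred = true;
  }
  return R;
}
```

Passing the whole block reproduces the old whole-block scan, while passing a narrowed range yields costs and clobber information for the non-duplicated instructions only.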

lib/CodeGen/IfConversion.cpp
1183	I factored out the feasibility analysis. Take a look now.

More refactoring.

lgtm

When committing, I suggest you carve out the refactoring changes into one or more NFC patches (debug skip for one, scanning for one, and the rest of the refactoring) before committing the functional change.

This revision is now accepted and ready to land. Aug 5 2016, 12:30 PM

iteratee added a parent revision: D22796: [ADT] Add make_scope_exit(). Aug 9 2016, 1:24 PM

Added bug fix from the revert.

Herald added a subscriber: nemanjai. Aug 10 2016, 2:15 PM

iteratee closed this revision. Aug 11 2016, 2:21 PM

Committed in: r278287

I have a possible fix for the broken self hosting bot. I've split the fixed patch into 3: Fix (which can apply to top), Rescan diamonds, and Forked diamond. I'm going to re-commit all 3 and let the bots churn through them again.

Recommitted in r279670 and r279671

Revision Contents

Path	Size
lib/CodeGen/IfConversion.cpp	507 lines
test/CodeGen/Thumb2/thumb2-ifcvt1.ll	9 lines

Diff 64599

lib/CodeGen/IfConversion.cpp

static cl::opt<bool> DisableTriangleR("disable-ifcvt-triangle-rev",		static cl::opt<bool> DisableTriangleR("disable-ifcvt-triangle-rev",
cl::init(false), cl::Hidden);		cl::init(false), cl::Hidden);
static cl::opt<bool> DisableTriangleF("disable-ifcvt-triangle-false",		static cl::opt<bool> DisableTriangleF("disable-ifcvt-triangle-false",
cl::init(false), cl::Hidden);		cl::init(false), cl::Hidden);
static cl::opt<bool> DisableTriangleFR("disable-ifcvt-triangle-false-rev",		static cl::opt<bool> DisableTriangleFR("disable-ifcvt-triangle-false-rev",
cl::init(false), cl::Hidden);		cl::init(false), cl::Hidden);
static cl::opt<bool> DisableDiamond("disable-ifcvt-diamond",		static cl::opt<bool> DisableDiamond("disable-ifcvt-diamond",
cl::init(false), cl::Hidden);		cl::init(false), cl::Hidden);
		static cl::opt<bool> DisableDiamondTail("disable-ifcvt-diamond-tail",
		cl::init(false), cl::Hidden);
static cl::opt<bool> IfCvtBranchFold("ifcvt-branch-fold",		static cl::opt<bool> IfCvtBranchFold("ifcvt-branch-fold",
cl::init(true), cl::Hidden);		cl::init(true), cl::Hidden);

STATISTIC(NumSimple, "Number of simple if-conversions performed");		STATISTIC(NumSimple, "Number of simple if-conversions performed");
STATISTIC(NumSimpleFalse, "Number of simple (F) if-conversions performed");		STATISTIC(NumSimpleFalse, "Number of simple (F) if-conversions performed");
STATISTIC(NumTriangle, "Number of triangle if-conversions performed");		STATISTIC(NumTriangle, "Number of triangle if-conversions performed");
STATISTIC(NumTriangleRev, "Number of triangle (R) if-conversions performed");		STATISTIC(NumTriangleRev, "Number of triangle (R) if-conversions performed");
STATISTIC(NumTriangleFalse,"Number of triangle (F) if-conversions performed");		STATISTIC(NumTriangleFalse,"Number of triangle (F) if-conversions performed");
STATISTIC(NumTriangleFRev, "Number of triangle (F/R) if-conversions performed");		STATISTIC(NumTriangleFRev, "Number of triangle (F/R) if-conversions performed");
STATISTIC(NumDiamonds, "Number of diamond if-conversions performed");		STATISTIC(NumDiamonds, "Number of diamond if-conversions performed");
		STATISTIC(NumDiamondTails, "Number of diamond-tail if-conversions performed");
STATISTIC(NumIfConvBBs, "Number of if-converted blocks");		STATISTIC(NumIfConvBBs, "Number of if-converted blocks");
STATISTIC(NumDupBBs, "Number of duplicated blocks");		STATISTIC(NumDupBBs, "Number of duplicated blocks");
STATISTIC(NumUnpred, "Number of true blocks of diamonds unpredicated");		STATISTIC(NumUnpred, "Number of true blocks of diamonds unpredicated");

namespace {		namespace {
class IfConverter : public MachineFunctionPass {		class IfConverter : public MachineFunctionPass {
enum IfcvtKind {		enum IfcvtKind {
ICNotClassfied, // BB data valid, but not classified.		ICNotClassfied, // BB data valid, but not classified.
ICSimpleFalse, // Same as ICSimple, but on the false path.		ICSimpleFalse, // Same as ICSimple, but on the false path.
ICSimple, // BB is entry of an one split, no rejoin sub-CFG.		ICSimple, // BB is entry of an one split, no rejoin sub-CFG.
ICTriangleFRev, // Same as ICTriangleFalse, but false path rev condition.		ICTriangleFRev, // Same as ICTriangleFalse, but false path rev condition.
ICTriangleRev, // Same as ICTriangle, but true path rev condition.		ICTriangleRev, // Same as ICTriangle, but true path rev condition.
ICTriangleFalse, // Same as ICTriangle, but on the false path.		ICTriangleFalse, // Same as ICTriangle, but on the false path.
ICTriangle, // BB is entry of a triangle sub-CFG.		ICTriangle, // BB is entry of a triangle sub-CFG.
ICDiamond // BB is entry of a diamond sub-CFG.		ICDiamond, // BB is entry of a diamond sub-CFG.
		ICDiamondTail // BB is entry of an almost diamond sub-CFG, with a
		// shared tail.
		[Inline comment] davidxl: nit: with a common tail that can be shared
};		};

/// BBInfo - One per MachineBasicBlock, this is used to cache the result		/// BBInfo - One per MachineBasicBlock, this is used to cache the result
/// if-conversion feasibility analysis. This includes results from		/// if-conversion feasibility analysis. This includes results from
/// TargetInstrInfo::analyzeBranch() (i.e. TBB, FBB, and Cond), and its		/// TargetInstrInfo::analyzeBranch() (i.e. TBB, FBB, and Cond), and its
/// classification, and common tail block of its successors (if it's a		/// classification, and common tail block of its successors (if it's a
/// diamond shape), its size, whether it's predicable, and whether any		/// diamond shape), its size, whether it's predicable, and whether any
/// instruction can clobber the 'would-be' predicate.		/// instruction can clobber the 'would-be' predicate.
... (15 lines not shown) ...	class IfConverter : public MachineFunctionPass {
/// BrCond - Conditions for end of block conditional branches.		/// BrCond - Conditions for end of block conditional branches.
/// Predicate - Predicate used in the BB.		/// Predicate - Predicate used in the BB.
struct BBInfo {		struct BBInfo {
bool IsDone : 1;		bool IsDone : 1;
bool IsBeingAnalyzed : 1;		bool IsBeingAnalyzed : 1;
bool IsAnalyzed : 1;		bool IsAnalyzed : 1;
bool IsEnqueued : 1;		bool IsEnqueued : 1;
bool IsBrAnalyzable : 1;		bool IsBrAnalyzable : 1;
		bool IsBrReversible : 1;
bool HasFallThrough : 1;		bool HasFallThrough : 1;
bool IsUnpredicable : 1;		bool IsUnpredicable : 1;
bool CannotBeCopied : 1;		bool CannotBeCopied : 1;
bool ClobbersPred : 1;		bool ClobbersPred : 1;
unsigned NonPredSize;		unsigned NonPredSize;
unsigned ExtraCost;		unsigned ExtraCost;
unsigned ExtraCost2;		unsigned ExtraCost2;
MachineBasicBlock *BB;		MachineBasicBlock *BB;
MachineBasicBlock *TrueBB;		MachineBasicBlock *TrueBB;
MachineBasicBlock *FalseBB;		MachineBasicBlock *FalseBB;
SmallVector<MachineOperand, 4> BrCond;		SmallVector<MachineOperand, 4> BrCond;
SmallVector<MachineOperand, 4> Predicate;		SmallVector<MachineOperand, 4> Predicate;
BBInfo() : IsDone(false), IsBeingAnalyzed(false),		BBInfo() : IsDone(false), IsBeingAnalyzed(false),
IsAnalyzed(false), IsEnqueued(false), IsBrAnalyzable(false),		IsAnalyzed(false), IsEnqueued(false), IsBrAnalyzable(false),
HasFallThrough(false), IsUnpredicable(false),		IsBrReversible(false), HasFallThrough(false),
CannotBeCopied(false), ClobbersPred(false), NonPredSize(0),		IsUnpredicable(false), CannotBeCopied(false),
ExtraCost(0), ExtraCost2(0), BB(nullptr), TrueBB(nullptr),		ClobbersPred(false), NonPredSize(0), ExtraCost(0),
		ExtraCost2(0), BB(nullptr), TrueBB(nullptr),
FalseBB(nullptr) {}		FalseBB(nullptr) {}
};		};

/// IfcvtToken - Record information about pending if-conversions to attempt:		/// IfcvtToken - Record information about pending if-conversions to attempt:
/// BBI - Corresponding BBInfo.		/// BBI - Corresponding BBInfo.
/// Kind - Type of block. See IfcvtKind.		/// Kind - Type of block. See IfcvtKind.
/// NeedSubsumption - True if the to-be-predicated BB has already been		/// NeedSubsumption - True if the to-be-predicated BB has already been
/// predicated.		/// predicated.
/// NumDups - Number of instructions that would be duplicated due		/// NumDups - Number of instructions that would be duplicated due
/// to this if-conversion. (For diamonds, the number of		/// to this if-conversion. (For diamonds, the number of
/// identical instructions at the beginnings of both		/// identical instructions at the beginnings of both
/// paths).		/// paths).
/// NumDups2 - For diamonds, the number of identical instructions		/// NumDups2 - For diamonds, the number of identical instructions
/// at the ends of both paths.		/// at the ends of both paths.
struct IfcvtToken {		struct IfcvtToken {
BBInfo &BBI;		BBInfo &BBI;
IfcvtKind Kind;		IfcvtKind Kind;
bool NeedSubsumption;
unsigned NumDups;		unsigned NumDups;
unsigned NumDups2;		unsigned NumDups2;
IfcvtToken(BBInfo &b, IfcvtKind k, bool s, unsigned d, unsigned d2 = 0)		bool NeedSubsumption : 1;
: BBI(b), Kind(k), NeedSubsumption(s), NumDups(d), NumDups2(d2) {}		bool TClobbersPred : 1;
		bool FClobbersPred : 1;
		IfcvtToken(BBInfo &b, IfcvtKind k, bool s, unsigned d, unsigned d2 = 0,
		bool tc = false, bool fc = false)
		: BBI(b), Kind(k), NumDups(d), NumDups2(d2), NeedSubsumption(s),
		TClobbersPred(tc), FClobbersPred(fc) {}
};		};

/// BBAnalysis - Results of if-conversion feasibility analysis indexed by		/// BBAnalysis - Results of if-conversion feasibility analysis indexed by
/// basic block number.		/// basic block number.
std::vector<BBInfo> BBAnalysis;		std::vector<BBInfo> BBAnalysis;
TargetSchedModel SchedModel;		TargetSchedModel SchedModel;

const TargetLoweringBase *TLI;		const TargetLoweringBase *TLI;
... (26 lines not shown) ...	public:
bool runOnMachineFunction(MachineFunction &MF) override;		bool runOnMachineFunction(MachineFunction &MF) override;

MachineFunctionProperties getRequiredProperties() const override {		MachineFunctionProperties getRequiredProperties() const override {
return MachineFunctionProperties().set(		return MachineFunctionProperties().set(
MachineFunctionProperties::Property::AllVRegsAllocated);		MachineFunctionProperties::Property::AllVRegsAllocated);
}		}

private:		private:
bool ReverseBranchCondition(BBInfo &BBI);		bool ReverseBranchCondition(BBInfo &BBI) const;
bool ValidSimple(BBInfo &TrueBBI, unsigned &Dups,		bool ValidSimple(BBInfo &TrueBBI, unsigned &Dups,
BranchProbability Prediction) const;		BranchProbability Prediction) const;
bool ValidTriangle(BBInfo &TrueBBI, BBInfo &FalseBBI,		bool ValidTriangle(BBInfo &TrueBBI, BBInfo &FalseBBI,
bool FalseBranch, unsigned &Dups,		bool FalseBranch, unsigned &Dups,
BranchProbability Prediction) const;		BranchProbability Prediction) const;
bool ValidDiamond(BBInfo &TrueBBI, BBInfo &FalseBBI,		bool ValidDiamond(BBInfo &TrueBBI, BBInfo &FalseBBI,
unsigned &Dups1, unsigned &Dups2) const;		unsigned &Dups1, unsigned &Dups2) const;
		bool ValidDiamondTail(BBInfo &TrueBBI, BBInfo &FalseBBI,
		unsigned &Dups1, unsigned &Dups2,
		BBInfo &TrueBBICalc, BBInfo &FalseBBICalc) const;
void ScanInstructions(BBInfo &BBI);		void ScanInstructions(BBInfo &BBI);
void AnalyzeBlock(MachineBasicBlock *MBB,		void AnalyzeBlock(MachineBasicBlock *MBB,
std::vector<std::unique_ptr<IfcvtToken>> &Tokens);		std::vector<std::unique_ptr<IfcvtToken>> &Tokens);
bool FeasibilityAnalysis(BBInfo &BBI, SmallVectorImpl<MachineOperand> &Cond,		bool FeasibilityAnalysis(BBInfo &BBI, SmallVectorImpl<MachineOperand> &Cond,
bool isTriangle = false, bool RevBranch = false);		bool isTriangle = false, bool RevBranch = false);
		// Perform Feasibility Analysis, assuming that BBI contains a shared tail.
		// This disregards IsUnpredicable, as the tail may contain unpredicable
		// instructions, but may be shared. It is assumed that the caller has
		// verified this.
		bool FeasibilityAnalysisSharedTail(
		BBInfo &BBI, SmallVectorImpl<MachineOperand> &Pred);
void AnalyzeBlocks(MachineFunction &MF,		void AnalyzeBlocks(MachineFunction &MF,
std::vector<std::unique_ptr<IfcvtToken>> &Tokens);		std::vector<std::unique_ptr<IfcvtToken>> &Tokens);
void InvalidatePreds(MachineBasicBlock *BB);		void InvalidatePreds(MachineBasicBlock *BB);
void RemoveExtraEdges(BBInfo &BBI);		void RemoveExtraEdges(BBInfo &BBI);
bool IfConvertSimple(BBInfo &BBI, IfcvtKind Kind);		bool IfConvertSimple(BBInfo &BBI, IfcvtKind Kind);
bool IfConvertTriangle(BBInfo &BBI, IfcvtKind Kind);		bool IfConvertTriangle(BBInfo &BBI, IfcvtKind Kind);
bool IfConvertDiamond(BBInfo &BBI, IfcvtKind Kind,		bool IfConvertDiamond(BBInfo &BBI, IfcvtKind Kind,
unsigned NumDups1, unsigned NumDups2);		unsigned NumDups1, unsigned NumDups2);
		bool IfConvertDiamondTail(BBInfo &BBI, IfcvtKind Kind,
		unsigned NumDups1, unsigned NumDups2,
		bool TClobbers, bool FClobbers);
void PredicateBlock(BBInfo &BBI,		void PredicateBlock(BBInfo &BBI,
MachineBasicBlock::iterator E,		MachineBasicBlock::iterator E,
SmallVectorImpl<MachineOperand> &Cond,		SmallVectorImpl<MachineOperand> &Cond,
SmallSet<unsigned, 4> *LaterRedefs = nullptr);		SmallSet<unsigned, 4> *LaterRedefs = nullptr);
void CopyAndPredicateBlock(BBInfo &ToBBI, BBInfo &FromBBI,		void CopyAndPredicateBlock(BBInfo &ToBBI, BBInfo &FromBBI,
SmallVectorImpl<MachineOperand> &Cond,		SmallVectorImpl<MachineOperand> &Cond,
bool IgnoreBr = false);		bool IgnoreBr = false);
void MergeBlocks(BBInfo &ToBBI, BBInfo &FromBBI, bool AddEdges = true);		void MergeBlocks(BBInfo &ToBBI, BBInfo &FromBBI, bool AddEdges = true);
... (175 lines not shown) ...	while (!Tokens.empty()) {
DEBUG(dbgs() << "Ifcvt (Diamond): BB#" << BBI.BB->getNumber() << " (T:"		DEBUG(dbgs() << "Ifcvt (Diamond): BB#" << BBI.BB->getNumber() << " (T:"
<< BBI.TrueBB->getNumber() << ",F:"		<< BBI.TrueBB->getNumber() << ",F:"
<< BBI.FalseBB->getNumber() << ") ");		<< BBI.FalseBB->getNumber() << ") ");
RetVal = IfConvertDiamond(BBI, Kind, NumDups, NumDups2);		RetVal = IfConvertDiamond(BBI, Kind, NumDups, NumDups2);
DEBUG(dbgs() << (RetVal ? "succeeded!" : "failed!") << "\n");		DEBUG(dbgs() << (RetVal ? "succeeded!" : "failed!") << "\n");
if (RetVal) ++NumDiamonds;		if (RetVal) ++NumDiamonds;
break;		break;
}		}
		case ICDiamondTail: {
		if (DisableDiamondTail) break;
		DEBUG(dbgs() << "Ifcvt (Diamond w/ tail): BB#" << BBI.BB->getNumber() << " (T:"
		<< BBI.TrueBB->getNumber() << ",F:"
		<< BBI.FalseBB->getNumber() << ") ");
		RetVal = IfConvertDiamondTail(BBI, Kind, NumDups, NumDups2,
		Token->TClobbersPred,
		Token->FClobbersPred);
		DEBUG(dbgs() << (RetVal ? "succeeded!" : "failed!") << "\n");
		if (RetVal) ++NumDiamondTails;
		break;
		}
}		}

Change |= RetVal;		Change |= RetVal;

NumIfCvts = NumSimple + NumSimpleFalse + NumTriangle + NumTriangleRev +		NumIfCvts = NumSimple + NumSimpleFalse + NumTriangle + NumTriangleRev +
NumTriangleFalse + NumTriangleFRev + NumDiamonds;		NumTriangleFalse + NumTriangleFRev + NumDiamonds;
if (IfCvtLimit != -1 && (int)NumIfCvts >= IfCvtLimit)		if (IfCvtLimit != -1 && (int)NumIfCvts >= IfCvtLimit)
break;		break;
... (27 lines not shown) ...	for (MachineBasicBlock::succ_iterator SI = BB->succ_begin(),
if (SuccBB != TrueBB)		if (SuccBB != TrueBB)
return SuccBB;		return SuccBB;
}		}
return nullptr;		return nullptr;
}		}

/// ReverseBranchCondition - Reverse the condition of the end of the block		/// ReverseBranchCondition - Reverse the condition of the end of the block
/// branch. Swap block's 'true' and 'false' successors.		/// branch. Swap block's 'true' and 'false' successors.
bool IfConverter::ReverseBranchCondition(BBInfo &BBI) {		bool IfConverter::ReverseBranchCondition(BBInfo &BBI) const {
DebugLoc dl; // FIXME: this is nowhere		DebugLoc dl;
		[Inline comment] davidxl: Split out the independent fix with a test case if possible
		[Inline comment] iteratee: I've removed it. Finding a test case is enough work that I just don't have time.
		MachineBasicBlock::iterator BBIT = BBI.BB->getFirstTerminator();
		if (BBIT != BBI.BB->end())
		dl = BBIT->getDebugLoc();
if (!TII->ReverseBranchCondition(BBI.BrCond)) {		if (!TII->ReverseBranchCondition(BBI.BrCond)) {
TII->RemoveBranch(*BBI.BB);		TII->RemoveBranch(*BBI.BB);
TII->InsertBranch(*BBI.BB, BBI.FalseBB, BBI.TrueBB, BBI.BrCond, dl);		TII->InsertBranch(*BBI.BB, BBI.FalseBB, BBI.TrueBB, BBI.BrCond, dl);
std::swap(BBI.TrueBB, BBI.FalseBB);		std::swap(BBI.TrueBB, BBI.FalseBB);
return true;		return true;
}		}
return false;		return false;
}		}
... (72 lines not shown) ...	if (!TExit && blockAlwaysFallThrough(TrueBBI)) {
MachineFunction::iterator I = TrueBBI.BB->getIterator();		MachineFunction::iterator I = TrueBBI.BB->getIterator();
if (++I == TrueBBI.BB->getParent()->end())		if (++I == TrueBBI.BB->getParent()->end())
return false;		return false;
TExit = &*I;		TExit = &*I;
}		}
return TExit && TExit == FalseBBI.BB;		return TExit && TExit == FalseBBI.BB;
}		}

		/// ValidDiamondTail - Returns true if the 'true' and 'false' blocks (along
		/// with their common predecessor) form a diamond if a common tail block is
		/// extracted
		bool IfConverter::ValidDiamondTail(
		BBInfo &TrueBBI, BBInfo &FalseBBI,
		unsigned &Dups1, unsigned &Dups2,
		BBInfo &TrueBBICalc, BBInfo &FalseBBICalc) const {
		Dups1 = Dups2 = 0;
		if (TrueBBI.IsBeingAnalyzed || TrueBBI.IsDone ||
		FalseBBI.IsBeingAnalyzed || FalseBBI.IsDone)
		return false;

		MachineBasicBlock *TT = TrueBBI.TrueBB;
		MachineBasicBlock *TF = TrueBBI.FalseBB;
		MachineBasicBlock *FT = FalseBBI.TrueBB;
		MachineBasicBlock *FF = FalseBBI.FalseBB;

		if (!TrueBBI.IsBrAnalyzable || !FalseBBI.IsBrAnalyzable)
		return false;

		if (!TT)
		TT = getNextBlock(TrueBBI.BB);
		if (!TF)
		TF = getNextBlock(TrueBBI.BB);
		if (!FT)
		FT = getNextBlock(FalseBBI.BB);
		if (!FF)
		FF = getNextBlock(FalseBBI.BB);

		if (!TT || !TF)
		return false;
		if (TrueBBI.BB->pred_size() > 1 || FalseBBI.BB->pred_size() > 1)
		return false;

		// Only looking for the case where it's not an actual diamond.
		if (TT == TF || FT == FF)
		return false;

		// Check successors. If they don't match, bail.
		if (!((TT == FT && TF == FF) || (TF == FT && TT == FF)))
		return false;

		// If the branches are opposing, but we can't reverse, don't do it.
		if (TF == FT && TT == FF && !FalseBBI.IsBrReversible)
		return false;
		if (TF == FT && TT == FF)
		ReverseBranchCondition(FalseBBI);

		// add debug statement to show that a pair of blocks has passed the basic
		// checks.

		// Count duplicate instructions at the beginning of the true and false blocks.
		// While we do this, we calculate the costs of predicating the non-shared
		// sections of the blocks.
		// The size of the blocks are the same.
		TrueBBICalc.NonPredSize = TrueBBI.NonPredSize;
		FalseBBICalc.NonPredSize = FalseBBI.NonPredSize;
		// We only count extra cost for instructions that aren't shared.
		TrueBBICalc.ExtraCost = TrueBBICalc.ExtraCost2 = 0;
		FalseBBICalc.ExtraCost = FalseBBICalc.ExtraCost2 = 0;
		TrueBBICalc.ClobbersPred = false;
		FalseBBICalc.ClobbersPred = false;
		MachineBasicBlock::iterator TIB = TrueBBI.BB->begin();
		MachineBasicBlock::iterator FIB = FalseBBI.BB->begin();
		MachineBasicBlock::iterator TIE = TrueBBI.BB->end();
		MachineBasicBlock::iterator FIE = FalseBBI.BB->end();
		std::vector<MachineOperand> PredDefs;
		while (TIB != TIE && FIB != FIE) {
		// Skip dbg_value instructions. These do not count.
		if (TIB->isDebugValue()) {
		while (TIB != TIE && TIB->isDebugValue())
		++TIB;
		if (TIB == TIE)
		break;
		}
		if (FIB->isDebugValue()) {
		while (FIB != FIE && FIB->isDebugValue())
		++FIB;
		if (FIB == FIE)
		break;
		}
		if (!TIB->isIdenticalTo(*FIB))
		break;
		if (TII->DefinesPredicate(*TIB, PredDefs)) {
		TrueBBICalc.ClobbersPred = true;
		FalseBBICalc.ClobbersPred = true;
		}
		++Dups1;
		++TIB;
		++FIB;
		}

		// Now, in preparation for counting duplicate instructions at the ends of the
		// blocks, move the end iterators up past any unconditional branch
		// instructions.
		// Check for already containing all of the block.
		if (TIB == TIE || FIB == FIE)
		return true;
		--TIE;
		--FIE;
		while (TIE != TIB && TIE->isUnconditionalBranch())
		--TIE;
		while (FIE != FIB && FIE->isUnconditionalBranch())
		--FIE;

		// If Dups1 includes all of a block, then don't count duplicate
		// instructions at the end of the blocks.
		if (TIB == TIE || FIB == FIE)
		return true;

		// Count duplicate instructions at the ends of the blocks.
		while (TIE != TIB && FIE != FIB) {
		// Skip dbg_value instructions. These do not count.
		if (TIE->isDebugValue()) {
		while (TIE != TIB && TIE->isDebugValue())
		--TIE;
		if (TIE == TIB)
		break;
		}
		if (FIE->isDebugValue()) {
		while (FIE != FIB && FIE->isDebugValue())
		--FIE;
		if (FIE == FIB)
		break;
		}
		if (!TIE->isIdenticalTo(*FIE))
		break;
		// We need to make sure the conditional branch instructions are the same,
		// but we shouldn't count the branch instructions, as they will be stripped
		// out during if conversion.
		if (!TIE->isBranch())
		++Dups2;
		--TIE;
		--FIE;
		}

		// Make sure that neither block has any remaining branches, and that at most
		// one of them has remaining predicate clobbering instructions.
		auto recalculateCostsAndClobbers = [&](
		MachineBasicBlock::iterator &BIB,
		MachineBasicBlock::iterator &BIE,
		BBInfo &BBIRecalc) {
		while (BIB != BIE) {
		// Skip dbg_value instructions. These do not count.
		if (BIB->isDebugValue()) {
		while (BIB != BIE && BIB->isDebugValue())
		++BIB;
		if (BIB == BIE)
		break;
		}
		// A Cond-clobbering instruction can only occur at the end of the
		// non-duplicated section.
		if (BBIRecalc.ClobbersPred)
		return false;
		if (TII->isPredicated(*BIB))
		return false;
		if (TII->DefinesPredicate(*BIB, PredDefs))
		BBIRecalc.ClobbersPred = true;
		if (BIB->isBranch())
		return false;
		if (!TII->isPredicable(*BIB))
		return false;
		unsigned ExtraPredCost = TII->getPredicationCost(*BIB);
		unsigned NumCycles = SchedModel.computeInstrLatency(&(*BIB), false);
		if (NumCycles > 1)
		BBIRecalc.ExtraCost += NumCycles-1;
		BBIRecalc.ExtraCost2 += ExtraPredCost;
		++BIB;
		}
		return true;
		};
		// TIE and FIE both point at the last instruction, move them back.
		++TIE; ++FIE;
		if (!recalculateCostsAndClobbers(TIB, TIE, TrueBBICalc))
		return false;
		if (!recalculateCostsAndClobbers(FIB, FIE, FalseBBICalc))
		return false;
		if (TrueBBICalc.ClobbersPred && FalseBBICalc.ClobbersPred)
		return false;
		return true;
		}
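
The duplicate-counting part of ValidDiamondTail above can be illustrated outside of LLVM. The following is a minimal sketch, not the patch's implementation: `Instr`, `skipDebugForward`, `DupCounts`, and `countDuplicates` are hypothetical names, instructions are modeled as plain strings, and the special handling of unconditional branches and of predicate clobbers is left out. It also folds the repeated "skip dbg_value" loops into one helper, along the lines davidxl suggests in the inline comments.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a MachineInstr: text plus two flags.
struct Instr {
  std::string Text;
  bool IsDebug;   // models isDebugValue()
  bool IsBranch;  // models isBranch()
};

// Advance index I past any debug pseudo-instructions.
static std::size_t skipDebugForward(const std::vector<Instr> &B,
                                    std::size_t I) {
  while (I < B.size() && B[I].IsDebug)
    ++I;
  return I;
}

struct DupCounts {
  unsigned Dups1 = 0; // identical instructions at the start of both blocks
  unsigned Dups2 = 0; // identical non-branch instructions at the end
};

DupCounts countDuplicates(const std::vector<Instr> &T,
                          const std::vector<Instr> &F) {
  DupCounts C;
  // Shared prefix, ignoring debug instructions on either side.
  std::size_t TI = skipDebugForward(T, 0), FI = skipDebugForward(F, 0);
  while (TI < T.size() && FI < F.size() && T[TI].Text == F[FI].Text) {
    ++C.Dups1;
    TI = skipDebugForward(T, TI + 1);
    FI = skipDebugForward(F, FI + 1);
  }
  // Shared suffix, walking backward but never into the shared prefix.
  std::size_t TE = T.size(), FE = F.size();
  while (true) {
    while (TE > TI && T[TE - 1].IsDebug)
      --TE;
    while (FE > FI && F[FE - 1].IsDebug)
      --FE;
    if (TE == TI || FE == FI)
      break;
    if (T[TE - 1].Text != F[FE - 1].Text)
      break;
    // Matching branches must be identical but are not counted, since they
    // are stripped during if-conversion.
    if (!T[TE - 1].IsBranch)
      ++C.Dups2;
    --TE;
    --FE;
  }
  return C;
}
```

For two blocks that differ only in one middle instruction and both end in the same compare-and-branch, this sketch reports one shared leading instruction and one shared trailing instruction, with the branch itself left uncounted.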

/// ValidDiamond - Returns true if the 'true' and 'false' blocks (along
/// with their common predecessor) forms a valid diamond shape for ifcvt.
bool IfConverter::ValidDiamond(BBInfo &TrueBBI, BBInfo &FalseBBI,
unsigned &Dups1, unsigned &Dups2) const {
Dups1 = Dups2 = 0;
if (TrueBBI.IsBeingAnalyzed || TrueBBI.IsDone ||
FalseBBI.IsBeingAnalyzed || FalseBBI.IsDone)
return false;
[81 unchanged lines hidden; context: if (FIE->isDebugValue()) {]
break;
}
if (!TIE->isIdenticalTo(*FIE))
break;
++Dups2;
--TIE;
--FIE;
}

return true;
		iteratee: I renamed it ForkedDiamond, and added more comments about it.
}

/// ScanInstructions - Scan all the instructions in the block to determine if
		davidxl: Document the difference between this pattern and ValidDiamond. Also, since there is no common tail shared between TrueBB and FalseBB, the shape is not really a 'Diamond'. Perhaps name it 'ValidDiamondWithTailCommonned', to indicate that the shape will be a diamond if the tail is commoned?
/// the block is predicable. In most cases, that means all the instructions
/// in the block are isPredicable(). Also checks if the block contains any
/// instruction which can clobber a predicate (e.g. condition code register).
/// If so, the block is not predicable unless it's the last instruction.
void IfConverter::ScanInstructions(BBInfo &BBI) {
if (BBI.IsDone)
return;

bool AlreadyPredicated = !BBI.Predicate.empty();
// First analyze the end of BB branches.
BBI.TrueBB = BBI.FalseBB = nullptr;
BBI.BrCond.clear();
BBI.IsBrAnalyzable =
!TII->analyzeBranch(*BBI.BB, BBI.TrueBB, BBI.FalseBB, BBI.BrCond);		!TII->AnalyzeBranch(*BBI.BB, BBI.TrueBB, BBI.FalseBB, BBI.BrCond);
		SmallVector<MachineOperand, 4> RevCond(BBI.BrCond.begin(), BBI.BrCond.end());
		BBI.IsBrReversible = (RevCond.size() == 0) ||
		!TII->ReverseBranchCondition(RevCond);
BBI.HasFallThrough = BBI.IsBrAnalyzable && BBI.FalseBB == nullptr;
		davidxl (Done): Is the 'fallthrough' check needed here?
		iteratee: No. It's just checking if the branch is analyzable.

if (BBI.BrCond.size()) {
// No false branch. This BB must end with a conditional branch and a
// fallthrough.
if (!BBI.FalseBB)
BBI.FalseBB = findFalseBlock(BBI.BB, BBI.TrueBB);
if (!BBI.FalseBB) {
// Malformed bcc? True and false blocks are the same?
BBI.IsUnpredicable = true;
return;
		davidxl (Done): Can this check be moved earlier? If so, the common check can be extracted and shared between this and ValidDiamond.
		iteratee: It's short enough that I don't think it's worth factoring out. If there were more checks in common, then maybe, but there aren't.
}
}

		davidxl: The variable names here do not seem to match the control flow graph drawn in the comment. Please make them consistent.
		iteratee: Which comment specifically? The names in the graph are member names of BBInfo, and are assumed to coincide. Here we can't make that assumption. The names are also logical: TT = TrueBBI.TrueBB, TF = TrueBBI.FalseBB, etc.
		davidxl: Ok -- the naming convention makes sense.
// Then scan all the instructions.
BBI.NonPredSize = 0;
BBI.ExtraCost = 0;
BBI.ExtraCost2 = 0;
BBI.ClobbersPred = false;
for (auto &MI : *BBI.BB) {
if (MI.isDebugValue())
continue;

// It's unsafe to duplicate convergent instructions in this context, so set
// BBI.CannotBeCopied to true if MI is convergent. To see why, consider the
		davidxl (Done): Merge these two: if (TF == FT && TT == FF) { if (!Reversible) return false; reverse ... }
// following CFG, which is subject to our "simple" transformation.
//
//            BB0     // if (c1) goto BB1; else goto BB2;
//           /   \
//         BB1    |
//          |    BB2  // if (c2) goto TBB; else goto FBB;
//          |   / |
//          |  /  |
//         TBB    |
//          |     |
//          |    FBB
//          |
//         exit
//
// Suppose we want to move TBB's contents up into BB1 and BB2 (in BB1 they'd
// be unconditional, and in BB2, they'd be predicated upon c2), and suppose
// TBB contains a convergent instruction. This is safe iff doing so does
// not add a control-flow dependency to the convergent instruction -- i.e.,
		davidxl (Done): Better to move this out of line to improve readability.
// it's safe iff the set of control flows that leads us to the convergent
// instruction does not get smaller after the transformation.
//
// Originally we executed TBB if c1 || c2. After the transformation, there
// are two copies of TBB's instructions. We get to the first if c1, and we
		davidxl (Done): I have not looked at this in detail. Is there existing code that can be refactored/reused by any chance?
		iteratee: The function that is the most similar is ScanInstructions. I think they're different enough that having them share code would be more confusing than helpful.
// get to the second if !c1 && c2.
//
// There are clearly fewer ways to satisfy the condition "c1" than
		davidxl: Clean up the comment -- last instruction of what?
// "c1 || c2". Since we've shrunk the set of control flows which lead to
		davidxl: Is this a good assumption to make? Can any assert be added to countDuplicatedInstructions?
		iteratee: I've elaborated in the comment. The size is computed by ScanInstructions, and the duplicated portion is subtracted off, so there's no point in recomputing the size; we would get the same answer.
		davidxl: My question in this comment is that countDuplicatedInstructions does not document the exit state of TIE and FIE, so add assertions to make sure TIE and FIE point to what you expect to see (pointing to identical instructions before and non-identical after?).
// our convergent instruction, the transformation is unsafe.
if (MI.isNotDuplicable() || MI.isConvergent())
BBI.CannotBeCopied = true;

bool isPredicated = TII->isPredicated(MI);
bool isCondBr = BBI.IsBrAnalyzable && MI.isConditionalBranch();

// A conditional branch is not predicable, but it may be eliminated.
[15 unchanged lines hidden; context: if (!isPredicated) {]
return;
}

if (BBI.ClobbersPred && !isPredicated) {
// Predicate modification instruction should end the block (except for
// already predicated instructions and end of block branches).
// Predicate may have been modified, the subsequent (currently)
// unpredicated instructions cannot be correctly predicated.
BBI.IsUnpredicable = true;
		davidxl (Done): This skip code pattern has appeared many times -- a good candidate to extract into an inline function.
return;
}

// FIXME: Make use of PredDefs? e.g. ADDC, SUBC sets predicates but are
// still potentially predicable.
std::vector<MachineOperand> PredDefs;
if (TII->DefinesPredicate(MI, PredDefs))
BBI.ClobbersPred = true;
		davidxl: Why is this check not done outside of this function (in ValidForkedDiamond before countDuplicatedInstructions, as in ValidDiamond)?
		iteratee: Because countDuplicatedInstructions adjusts the iterators so that we know exactly which instructions are duplicated. We only worry about the non-duplicated instructions that clobber the predicate info.
		davidxl (Done): Should this be done for ValidDiamond too -- only check the non-shared portion? Also, I don't think it is ideal to have code duplication like this. It looks like you should reuse ScanInstructions or part of it (by making it accept BIB and BIE)?

if (!TII->isPredicable(MI)) {
BBI.IsUnpredicable = true;
return;
}
		davidxl: Why is this not already computed?
		iteratee: See above.
}
}

		davidxl: FeasibilityAnalysis already checks the IsUnpredicable bit -- why is it still done here?
		iteratee: Same reason as above.
/// FeasibilityAnalysis - Determine if the block is a suitable candidate to be
/// predicated by the specified predicate.
bool IfConverter::FeasibilityAnalysis(BBInfo &BBI,
SmallVectorImpl<MachineOperand> &Pred,
bool isTriangle, bool RevBranch) {
// If the block is dead or unpredicable, then it cannot be predicated.
if (BBI.IsDone || BBI.IsUnpredicable)
return false;
[23 unchanged lines hidden; context: if (BBI.BrCond.size()) {]
if (TII->ReverseBranchCondition(RevPred) ||
!TII->SubsumesPredicate(Cond, RevPred))
return false;
}

return true;
}

		/// FeasibilityAnalysisSharedTail - Determine if the block is a suitable
		/// candidate to be predicated by the specified predicate, assuming that all
		/// non predicable instructions are part of a shared tail.
		bool IfConverter::FeasibilityAnalysisSharedTail(
		davidxl (Done): Document the parameters.
		davidxl (Done): Why can't this function be folded into the existing FeasibilityAnalysis with a new flag: hasCommonForkedTail (which defaults to false)?
		BBInfo &BBI, SmallVectorImpl<MachineOperand> &Pred) {
		// If the block is dead, then it cannot be predicated. Don't check
		// IsUnpredicable, because while the whole block may not be, the portion that
		// is unshared may well be predicable.
		if (BBI.IsDone)
		return false;

		// If it is already predicated but we couldn't analyze its terminator, the
		// latter might fallthrough, but we can't determine where to.
		// Conservatively avoid if-converting again.
		if (BBI.Predicate.size() && !BBI.IsBrAnalyzable)
		return false;

		// If it is already predicated, check if the new predicate subsumes
		// its predicate.
		if (BBI.Predicate.size() && !TII->SubsumesPredicate(Pred, BBI.Predicate))
		return false;

		return true;
		}

/// AnalyzeBlock - Analyze the structure of the sub-CFG starting from
/// the specified block. Record its successors and whether it looks like an
/// if-conversion candidate.
void IfConverter::AnalyzeBlock(
MachineBasicBlock *MBB, std::vector<std::unique_ptr<IfcvtToken>> &Tokens) {
struct BBState {
BBState(MachineBasicBlock *BB) : MBB(BB), SuccsAnalyzed(false) {}
MachineBasicBlock *MBB;
[27 unchanged lines hidden; context: if (!State.SuccsAnalyzed) {]
BBI.IsBeingAnalyzed = false;
BBI.IsAnalyzed = true;
BBStack.pop_back();
continue;
}

// Do not ifcvt if either path is a back edge to the entry block.
if (BBI.TrueBB == BB || BBI.FalseBB == BB) {
		DEBUG(dbgs() << "Not ifcvting because the back edge is the entry block.\n");
BBI.IsBeingAnalyzed = false;
BBI.IsAnalyzed = true;
BBStack.pop_back();
continue;
}

// Do not ifcvt if true and false fallthrough blocks are the same.
if (!BBI.FalseBB) {
[48 unchanged lines hidden; context: if (CanRevCond && ValidDiamond(TrueBBI, FalseBBI, Dups, Dups2) &&]
// \ /
// TailBB
// Note TailBB can be empty.
Tokens.push_back(llvm::make_unique<IfcvtToken>(
BBI, ICDiamond, TNeedSub | FNeedSub, Dups, Dups2));
Enqueued = true;
}

		BBInfo TrueBBICalc, FalseBBICalc;
		if (CanRevCond && ValidDiamondTail(TrueBBI, FalseBBI, Dups, Dups2,
		davidxl: This code looks almost exactly the same as the regular diamond case. Perhaps define a lambda function: auto DiamondFinder = [&](decltype(&IfConverter::ValidDiamond) Checker) { if (CanRevCond && (this->Checker(..) && ...) { .... } }; DiamondFinder(&IfConverter::ValidDiamond); DiamondFinder(&IfConverter::ValidForkedDiamond);
		iteratee: It would take too many parameters, and the code would be less legible. I would have to pass in TrueBBICalc, FalseBBICalc, hasCommonTail, and then conditionalize the calls, because ValidDiamond takes a different number of arguments from ValidForkedDiamond. I don't think it would be cleaner.
		davidxl: I still think refactoring is better -- the main reason for doing the refactoring is to avoid code duplication, which is better in the longer term. For instance, there is no need to worry about fixing bugs in multiple different places. Another point is that the Diamond and ForkedDiamond patterns are exclusive, so there is no need to do the forked diamond check after the diamond check returns true. In other words, the code can be further simplified to: if (CanRevCond) { if (!DiamondFinder(...::ValidDiamond)) { DiamondFinder(...::ValidForkedDiamond); } Also, it seems to me you don't need to introduce TrueBBICalc and FalseBBICalc. How about, in ValidForkedDiamond: save the initial values of TrueBBI and FalseBBI; re-scan the region of the BB and update TrueBBI and FalseBBI; if ValidForkedDiamond fails, restore the TrueBBI etc. values before returning.
		TrueBBICalc, FalseBBICalc) &&
		MeetIfcvtSizeLimit(*TrueBBI.BB, (TrueBBICalc.NonPredSize - (Dups + Dups2) +
		TrueBBICalc.ExtraCost),
		TrueBBICalc.ExtraCost2,
		*FalseBBI.BB, (FalseBBICalc.NonPredSize - (Dups + Dups2) +
		FalseBBICalc.ExtraCost),
		FalseBBICalc.ExtraCost2,
		Prediction) &&
		FeasibilityAnalysisSharedTail(TrueBBI, BBI.BrCond) &&
		FeasibilityAnalysisSharedTail(FalseBBI, RevCond)) {
		// DiamondTail:
		// if TBB and FBB have a common tail that includes their conditional
		// branch instructions, then we can If Convert this pattern.
		//          EBB
		//         _/ \_
		//         |     |
		//        TBB   FBB
		//        / \   / \
		//  FalseBB TrueBB FalseBB
		//
		Tokens.push_back(llvm::make_unique<IfcvtToken>(
		BBI, ICDiamondTail, TNeedSub | FNeedSub, Dups, Dups2,
		(bool) TrueBBICalc.ClobbersPred, (bool) FalseBBICalc.ClobbersPred));
		Enqueued = true;
		}
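
The CFG shape that admits a block pair to the DiamondTail case above comes down to a comparison of four successor pointers (TT, TF, FT, FF in ValidDiamondTail). The following is a rough standalone sketch of that comparison, under stated assumptions: `Block`, `ForkShape`, and `classifyFork` are hypothetical names, `Block` is an empty stand-in for MachineBasicBlock, and the sketch rejects any null successor, which is slightly stricter than the patch. The two "opposing branches" tests are merged into a single case, in the spirit of davidxl's inline suggestion.

```cpp
#include <cassert>

// Hypothetical stand-in for MachineBasicBlock; only identity matters here.
struct Block {};

enum class ForkShape {
  NotForked,     // successors do not line up, or this is a plain diamond
  SameSense,     // TT == FT and TF == FF: branches already agree
  OpposingSense  // TF == FT and TT == FF: one branch must be reversed
};

// TT/TF are the true/false successors of the 'true' block,
// FT/FF are the true/false successors of the 'false' block.
ForkShape classifyFork(const Block *TT, const Block *TF,
                       const Block *FT, const Block *FF) {
  // Assumption of this sketch: all four successors must be known.
  if (!TT || !TF || !FT || !FF)
    return ForkShape::NotForked;
  // A block whose two successors coincide is the ordinary diamond case.
  if (TT == TF || FT == FF)
    return ForkShape::NotForked;
  if (TT == FT && TF == FF)
    return ForkShape::SameSense;
  if (TF == FT && TT == FF)
    return ForkShape::OpposingSense;
  return ForkShape::NotForked;
}
```

A caller would treat `OpposingSense` as convertible only when the false block's branch condition can be reversed, which is what the IsBrReversible bit records in the patch.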

if (ValidTriangle(TrueBBI, FalseBBI, false, Dups, Prediction) &&
MeetIfcvtSizeLimit(*TrueBBI.BB, TrueBBI.NonPredSize + TrueBBI.ExtraCost,
		iteratee: I factored out the feasibility analysis. Take a look now.
TrueBBI.ExtraCost2, Prediction) &&
FeasibilityAnalysis(TrueBBI, BBI.BrCond, true)) {
// Triangle:
//   EBB
//   | \_
//   |  |
//   | TBB
//   |  /
[419 unchanged lines hidden; context: bool IfConverter::IfConvertTriangle(BBInfo &BBI, IfcvtKind Kind) {]
CvtBBI->IsDone = true;
if (FalseBBDead)
NextBBI->IsDone = true;

// FIXME: Must maintain LiveIns.
return true;
}

		/// IfConvertDiamondTail - If convert an almost-diamond sub-CFG where the true
		/// and false blocks share a common tail.
		bool IfConverter::IfConvertDiamondTail(
		BBInfo &BBI, IfcvtKind Kind,
		unsigned NumDups1, unsigned NumDups2,
		bool TClobbersPred, bool FClobbersPred) {
		BBInfo &TrueBBI = BBAnalysis[BBI.TrueBB->getNumber()];
		BBInfo &FalseBBI = BBAnalysis[BBI.FalseBB->getNumber()];

		if (TrueBBI.IsDone || FalseBBI.IsDone) {
		// Something has changed. It's no longer safe to predicate these blocks.
		BBI.IsAnalyzed = false;
		TrueBBI.IsAnalyzed = false;
		FalseBBI.IsAnalyzed = false;
		return false;
		}

		if (TrueBBI.BB->hasAddressTaken() || FalseBBI.BB->hasAddressTaken())
		// Conservatively abort if-conversion if either BB has its address taken.
		return false;

		// Put the predicated instructions from the 'true' block before the
		// instructions from the 'false' block, unless the true block would clobber
		// the predicate, in which case, do the opposite.
		BBInfo *BBI1 = &TrueBBI;
		BBInfo *BBI2 = &FalseBBI;
		SmallVector<MachineOperand, 4> RevCond(BBI.BrCond.begin(), BBI.BrCond.end());
		if (TII->ReverseBranchCondition(RevCond))
		llvm_unreachable("Unable to reverse branch condition!");
		SmallVector<MachineOperand, 4> *Cond1 = &BBI.BrCond;
		SmallVector<MachineOperand, 4> *Cond2 = &RevCond;

		// Figure out the more profitable ordering.
		bool DoSwap = false;
		if (TClobbersPred && !FClobbersPred)
		DoSwap = true;
		else if (TClobbersPred == FClobbersPred) {
		if (TrueBBI.NonPredSize > FalseBBI.NonPredSize)
		DoSwap = true;
		}
		if (DoSwap) {
		std::swap(BBI1, BBI2);
		std::swap(Cond1, Cond2);
		}

		// Remove the conditional branch from entry to the blocks.
		BBI.NonPredSize -= TII->RemoveBranch(*BBI.BB);

		// Initialize liveins to the first BB. These are potentially redefined by
		// predicated instructions.
		Redefs.init(TRI);
		Redefs.addLiveIns(*BBI1->BB);

		// Remove the duplicated instructions at the beginnings of both paths.
		// Skip dbg_value instructions
		MachineBasicBlock::iterator DI1 = BBI1->BB->getFirstNonDebugInstr();
		MachineBasicBlock::iterator DI2 = BBI2->BB->getFirstNonDebugInstr();
		BBI1->NonPredSize -= NumDups1;
		BBI2->NonPredSize -= NumDups1;

		// Skip past the dups on each side separately since there may be
		// differing dbg_value entries.
		for (unsigned i = 0; i < NumDups1; ++DI1) {
		if (!DI1->isDebugValue())
		++i;
		}
		while (NumDups1 != 0) {
		++DI2;
		if (!DI2->isDebugValue())
		--NumDups1;
		}

		// Compute a set of registers which must not be killed by instructions in BB1:
		// This is everything used+live in BB2 after the duplicated instructions. We
		// can compute this set by simulating liveness backwards from the end of BB2.
		DontKill.init(TRI);
		for (MachineBasicBlock::reverse_iterator I = BBI2->BB->rbegin(),
		E = MachineBasicBlock::reverse_iterator(DI2); I != E; ++I) {
		DontKill.stepBackward(*I);
		}

		for (MachineBasicBlock::const_iterator I = BBI1->BB->begin(), E = DI1; I != E;
		++I) {
		SmallVector<std::pair<unsigned, const MachineOperand*>, 4> IgnoredClobbers;
		Redefs.stepForward(*I, IgnoredClobbers);
		}
		BBI.BB->splice(BBI.BB->end(), BBI1->BB, BBI1->BB->begin(), DI1);
		BBI2->BB->erase(BBI2->BB->begin(), DI2);

		// Remove branch from the 'true' block. This is safe, because we have
		// determined that both blocks have the same branch instructions. The branch
		// will be added back at the end, unpredicated.
		BBI1->NonPredSize -= TII->RemoveBranch(*BBI1->BB);
		// Remove duplicated instructions.
		DI1 = BBI1->BB->end();
		for (unsigned i = 0; i != NumDups2; ) {
		// NumDups2 only counted non-dbg_value instructions, so this won't
		// run off the head of the list.
		assert (DI1 != BBI1->BB->begin());
		--DI1;
		// skip dbg_value instructions
		if (!DI1->isDebugValue())
		++i;
		}
		BBI1->BB->erase(DI1, BBI1->BB->end());

		// Kill flags in the true block for registers living into the false block
		// must be removed.
		RemoveKills(BBI1->BB->begin(), BBI1->BB->end(), DontKill, *TRI);

		// Remove 'false' block branch, and find the last instruction to predicate.
		// Save the debug location.
		DebugLoc dl;
		MachineBasicBlock::iterator BBI2T = BBI2->BB->getFirstTerminator();
		if (BBI2T != BBI2->BB->end())
		dl = BBI2T->getDebugLoc();
		BBI2->NonPredSize -= TII->RemoveBranch(*BBI2->BB);
		DI2 = BBI2->BB->end();
		while (NumDups2 != 0) {
		// NumDups2 only counted non-dbg_value instructions, so this won't
		// run off the head of the list.
		assert (DI2 != BBI2->BB->begin());
		--DI2;
		// skip dbg_value instructions
		if (!DI2->isDebugValue())
		--NumDups2;
		}

		// Remember which registers would later be defined by the false block.
		// This allows us not to predicate instructions in the true block that would
		// later be re-defined. That is, rather than
		// subeq r0, r1, #1
		// addne r0, r1, #1
		// generate:
		// sub r0, r1, #1
		// addne r0, r1, #1
		SmallSet<unsigned, 4> RedefsByFalse;
		SmallSet<unsigned, 4> ExtUses;
		if (TII->isProfitableToUnpredicate(BBI1->BB, BBI2->BB)) {
		for (MachineBasicBlock::iterator FI = BBI2->BB->begin(); FI != DI2; ++FI) {
		if (FI->isDebugValue())
		continue;
		SmallVector<unsigned, 4> Defs;
		for (unsigned i = 0, e = FI->getNumOperands(); i != e; ++i) {
		const MachineOperand &MO = FI->getOperand(i);
		if (!MO.isReg())
		continue;
		unsigned Reg = MO.getReg();
		if (!Reg)
		continue;
		if (MO.isDef()) {
		Defs.push_back(Reg);
		} else if (!RedefsByFalse.count(Reg)) {
		// These are defined before ctrl flow reach the 'false' instructions.
		// They cannot be modified by the 'true' instructions.
		for (MCSubRegIterator SubRegs(Reg, TRI, /*IncludeSelf=*/true);
		SubRegs.isValid(); ++SubRegs)
		ExtUses.insert(*SubRegs);
		}
		}

		for (unsigned i = 0, e = Defs.size(); i != e; ++i) {
		unsigned Reg = Defs[i];
		if (!ExtUses.count(Reg)) {
		for (MCSubRegIterator SubRegs(Reg, TRI, /*IncludeSelf=*/true);
		SubRegs.isValid(); ++SubRegs)
		RedefsByFalse.insert(*SubRegs);
		}
		}
		}
		}

		// Predicate the 'true' block.
		PredicateBlock(BBI1, BBI1->BB->end(), Cond1, &RedefsByFalse);

		// After predicating BBI1, if there is a predicated terminator in BBI1 and
		// a non-predicated in BBI2, then we don't want to predicate the one from
		// BBI2. The reason is that if we merged these blocks, we would end up with
		// two predicated terminators in the same block.
		if (!BBI2->BB->empty() && (DI2 == BBI2->BB->end())) {
		  MachineBasicBlock::iterator BBI1T = BBI1->BB->getFirstTerminator();
		  MachineBasicBlock::iterator BBI2T = BBI2->BB->getFirstTerminator();
		  if (BBI1T != BBI1->BB->end() && TII->isPredicated(*BBI1T) &&
		      BBI2T != BBI2->BB->end() && !TII->isPredicated(*BBI2T))
		    llvm_unreachable("Terminator should have been removed for Diamond-Tail case.");
		}

		// Predicate the 'false' block.
		PredicateBlock(BBI2, DI2, Cond2);

		// Merge the true block into the entry of the diamond.
		MergeBlocks(BBI, BBI1, /* AddEdges */ true);
		MergeBlocks(BBI, BBI2, /* AddEdges */ true);

		// Add back the branch.
		// Debug location saved above when removing the branch from BBI2
		TII->InsertBranch(*BBI.BB, BBI2->TrueBB, BBI2->FalseBB, BBI2->BrCond, dl);

		RemoveExtraEdges(BBI);

		// Update block info.
		BBI.IsDone = TrueBBI.IsDone = FalseBBI.IsDone = true;
		InvalidatePreds(BBI.BB);

		// FIXME: Must maintain LiveIns.
		return true;
		}
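
The RedefsByFalse / ExtUses bookkeeping in the function above can be illustrated in isolation. The following is only a minimal sketch, not code from this patch: `ToyInstr` and `computeRedefsByFalse` are hypothetical names, registers are plain integers, and sub-registers are ignored (the real code expands each register with MCSubRegIterator). It assumes the same rule as the loop above: a register counts as "redefined by the false block" only if the false block writes it without first reading the incoming value.

```cpp
#include <set>
#include <vector>

// Hypothetical stand-in for a machine instruction: only the register
// operands it reads (Uses) and writes (Defs). Register 0 means "no register".
struct ToyInstr {
  std::vector<unsigned> Uses;
  std::vector<unsigned> Defs;
};

// Walk the false block in order and collect the registers it defines
// without having read a value that flowed in from outside the block.
std::set<unsigned> computeRedefsByFalse(const std::vector<ToyInstr> &FalseBlock) {
  std::set<unsigned> RedefsByFalse;
  std::set<unsigned> ExtUses;
  for (const ToyInstr &I : FalseBlock) {
    // A use of a register not yet redefined here reads an external value.
    for (unsigned Reg : I.Uses) {
      if (Reg == 0)
        continue;
      if (!RedefsByFalse.count(Reg))
        ExtUses.insert(Reg);
    }
    // Defs are handled after all uses of the same instruction, mirroring
    // the separate Defs vector in the loop above.
    for (unsigned Reg : I.Defs) {
      if (Reg == 0)
        continue;
      if (!ExtUses.count(Reg))
        RedefsByFalse.insert(Reg);
    }
  }
  return RedefsByFalse;
}
```

With r0 numbered 1 and r1 numbered 2, a false block consisting of `add r0, r1, #1` yields {1}, so a true-block instruction that only defines r0 is a candidate for being left unpredicated. A false block of `add r0, r0, #1` yields the empty set, because r0 is read before it is written.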

/// IfConvertDiamond - If convert a diamond sub-CFG.
///
bool IfConverter::IfConvertDiamond(BBInfo &BBI, IfcvtKind Kind,
                                   unsigned NumDups1, unsigned NumDups2) {
  BBInfo &TrueBBI = BBAnalysis[BBI.TrueBB->getNumber()];
  BBInfo &FalseBBI = BBAnalysis[BBI.FalseBB->getNumber()];
  MachineBasicBlock *TailBB = TrueBBI.TrueBB;
  // True block must fall through or end with an unanalyzable terminator.
[... 242 lines not shown ...]
static bool MaySpeculate(const MachineInstr &MI,
                         SmallSet<unsigned, 4> &LaterRedefs) {
  bool SawStore = true;
  if (!MI.isSafeToMove(nullptr, SawStore))
    return false;

  for (unsigned i = 0, e = MI.getNumOperands(); i != e; ++i) {
    const MachineOperand &MO = MI.getOperand(i);
    if (!MO.isReg())
      continue;
		davidxl: Why is this check not done for the forked case?
		iteratee: Good catch.
    unsigned Reg = MO.getReg();
    if (!Reg)
      continue;
    if (MO.isDef() && !LaterRedefs.count(Reg))
      return false;
  }

  return true;
[... 236 lines not shown ...]

test/CodeGen/Thumb2/thumb2-ifcvt1.ll

	; RUN: llc < %s -mtriple=thumbv7-apple-darwin | FileCheck %s			; RUN: llc < %s -mtriple=thumbv7-apple-darwin | FileCheck %s
	; RUN: llc < %s -mtriple=thumbv7-apple-darwin -arm-default-it | FileCheck %s			; RUN: llc < %s -mtriple=thumbv7-apple-darwin -arm-default-it | FileCheck %s
	; RUN: llc < %s -mtriple=thumbv8 -arm-no-restrict-it |FileCheck %s			; RUN: llc < %s -mtriple=thumbv8 -arm-no-restrict-it | FileCheck %s
				; RUN: llc < %s -mtriple=thumbv8 -arm-no-restrict-it -enable-tail-merge=0 | FileCheck %s
	define i32 @t1(i32 %a, i32 %b, i32 %c, i32 %d) nounwind {			define i32 @t1(i32 %a, i32 %b, i32 %c, i32 %d) nounwind {
	; CHECK-LABEL: t1:			; CHECK-LABEL: t1:
	; CHECK: ittt ne			; CHECK: ittt ne
	; CHECK: cmpne			; CHECK: cmpne
	; CHECK: addne			; CHECK: addne
	; CHECK: bxne lr			; CHECK: bxne lr
	switch i32 %c, label %cond_next [			switch i32 %c, label %cond_next [
	i32 1, label %cond_true			i32 1, label %cond_true
	i32 7, label %cond_true			i32 7, label %cond_true
	]			]

	cond_true:			cond_true:
	%tmp12 = add i32 %a, 1			%tmp12 = add i32 %a, 1
	%tmp1518 = add i32 %tmp12, %b			%tmp1518 = add i32 %tmp12, %b
	ret i32 %tmp1518			ret i32 %tmp1518

	cond_next:			cond_next:
	%tmp15 = add i32 %b, %a			%tmp15 = add i32 %b, %a
	ret i32 %tmp15			ret i32 %tmp15
	}			}

	define i32 @t2(i32 %a, i32 %b) nounwind {			define i32 @t2(i32 %a, i32 %b) nounwind {
	entry:			entry:
	; CHECK-LABEL: t2:			; CHECK-LABEL: t2:
	; CHECK: ite gt			; CHECK: ite {{gt|le}}
	; CHECK: subgt			; CHECK-DAG: suble
	; CHECK: suble			; CHECK-DAG: subgt
	%tmp1434 = icmp eq i32 %a, %b ; <i1> [#uses=1]			%tmp1434 = icmp eq i32 %a, %b ; <i1> [#uses=1]
	br i1 %tmp1434, label %bb17, label %bb.outer			br i1 %tmp1434, label %bb17, label %bb.outer

	bb.outer: ; preds = %cond_false, %entry			bb.outer: ; preds = %cond_false, %entry
	%b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ] ; <i32> [#uses=5]			%b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ] ; <i32> [#uses=5]
	%a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ] ; <i32> [#uses=1]			%a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ] ; <i32> [#uses=1]
	br label %bb			br label %bb

	[... 50 lines not shown ...]