This is an archive of the discontinued LLVM Phabricator instance.

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
9673	could this be generalized further? i.e., `(br_cc cc, (add X, c0), c1, bb) => (br_cc cc, X, (c1 - c0), bb)`, assuming that `c1 >= c0`. and then generalize that to other binary ops?
9687	`EVT::getScalarSizeInBits`. also, isn't it already the case that `VT(AddRHS) == VT(AddLHS) == VT(CondLHSNode)` (because of the implied constraint on add operands) and `VT(CondLHSNode) == VT(CondRHS)` (because of br_cc's constraint)?
9695	not so sure that this is the appropriate place for debug output.
test/CodeGen/AArch64/DAGCombine-optimize-brcc.ll
46	could this test case be simplified?
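
The generalization asked about on line 9673 can be sanity-checked by brute force on a narrow type. The sketch below is not LLVM code: `Pred`, `wrap8`, `evalPred` and `countFoldMismatches` are hypothetical names invented for illustration, and i8 stands in for whatever type the DAG node carries. It counts the inputs for which `(x + c0) pred c1` and `x pred (c1 - c0)` disagree under wrapping arithmetic, optionally skipping inputs where the add would overflow in the signed sense (the cases an `nsw` flag would rule out).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for two condition codes; not ISD::CondCode.
enum class Pred { EQ, SGT };

// Wrap an int to 8-bit two's complement.
static int8_t wrap8(int v) {
  return static_cast<int8_t>(static_cast<uint8_t>(v & 0xFF));
}

static bool evalPred(Pred p, int8_t a, int8_t b) {
  return p == Pred::EQ ? a == b : a > b;
}

// Counts i8 values of x for which `(x + c0) pred c1` differs from
// `x pred (c1 - c0)`, using wrapping arithmetic throughout.
// With requireNSW set, inputs where x + c0 overflows are skipped,
// modelling an add that carries the nsw flag.
int countFoldMismatches(Pred p, int8_t c0, int8_t c1, bool requireNSW) {
  const int8_t folded = wrap8(int(c1) - int(c0));
  int mismatches = 0;
  for (int x = -128; x <= 127; ++x) {
    const int wide = x + int(c0);
    if (requireNSW && (wide < -128 || wide > 127))
      continue;
    if (evalPred(p, wrap8(wide), c1) != evalPred(p, wrap8(x), folded))
      ++mismatches;
  }
  return mismatches;
}

// Spot checks for a few constant pairs.
void checkGeneralizedFold() {
  // Equality survives wrapping: the fold agrees for every x.
  assert(countFoldMismatches(Pred::EQ, 1, 1, false) == 0);
  assert(countFoldMismatches(Pred::EQ, 5, -3, false) == 0);
  // Ordered compare, c1 >= c0, no nsw: x = 127 is a counterexample.
  assert(countFoldMismatches(Pred::SGT, 1, 1, false) == 1);
  // With overflowing inputs excluded, the same constants agree.
  assert(countFoldMismatches(Pred::SGT, 1, 1, true) == 0);
  // If c1 - c0 itself overflows, the fold fails even with nsw.
  assert(countFoldMismatches(Pred::SGT, -1, 127, true) > 0);
}
```

On this model, `c1 >= c0` alone does not seem to be enough for the ordered predicates: the add must not wrap and `c1 - c0` must be representable, while `eq`/`ne` hold unconditionally. Whether that carries over to the DAG nodes depends on their semantics matching IR.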

%2 = add %1, #const
br_cc cond %2, #const
-->
br_cc cond %1, 0

I don't think this transform is safe unless you have 'nsw' or 'nuw' on the add.

In D24327#537057, @spatel wrote:

> %2 = add %1, #const
> br_cc cond %2, #const
> -->
> br_cc cond %1, 0
>
> I don't think this transform is safe unless you have 'nsw' or 'nuw' on the add.

am i missing something obvious? if nsw/nuw are present, then %2 could potentially be undef, and the transform would elide this case and alter semantics. so wouldn't it be unsafe only if the flags are present?

In D24327#537999, @bryant wrote:

> In D24327#537057, @spatel wrote:
>
>> %2 = add %1, #const
>> br_cc cond %2, #const
>> -->
>> br_cc cond %1, 0
>>
>> I don't think this transform is safe unless you have 'nsw' or 'nuw' on the add.
>
> am i missing something obvious? if nsw/nuw are present, then %2 could potentially be undef, and the transform would elide this case and alter semantics. so wouldn't it be unsafe only if the flags are present?

The wrapping flags are what allow us to guarantee that the transform does not alter semantics.

Consider the example with these values:

%1 is i8 255 (unsigned max)
%2 = add i8 %1, 1
br_cc ugt %2, 1

Wrapping is allowed; there is no undef.
The branch condition is:

ugt 0, 1 --> false.

If we make the transform suggested by this patch, the branch condition is:

ugt 255, 0  --> true

If the add has 'nuw', this case cannot occur. Otherwise, we have poison:
http://llvm.org/docs/LangRef.html#poisonvalues
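
The i8 example above can be checked exhaustively. The following is a standalone sketch, not LLVM code: `Cond`, `evalCond`, `countMismatches` and `checkReviewExample` are hypothetical names introduced here, and the `requireNUW` switch only models the flag by skipping inputs whose add would wrap (the inputs that would be poison in IR).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for two condition codes; not ISD::CondCode.
enum class Cond { UGT, EQ };

static bool evalCond(Cond cc, uint8_t lhs, uint8_t rhs) {
  return cc == Cond::UGT ? lhs > rhs : lhs == rhs;
}

// Counts i8 values of x for which `br_cc cc, (add x, c), c` and
// `br_cc cc, x, 0` would take different directions.
// With requireNUW set, inputs where x + c wraps are skipped,
// modelling an add that carries the nuw flag.
int countMismatches(Cond cc, uint8_t c, bool requireNUW) {
  int mismatches = 0;
  for (int xi = 0; xi <= 255; ++xi) {
    const uint8_t x = static_cast<uint8_t>(xi);
    const unsigned wide = unsigned(x) + unsigned(c);
    if (requireNUW && wide > 255)
      continue;
    const uint8_t sum = static_cast<uint8_t>(wide & 0xFF);
    if (evalCond(cc, sum, c) != evalCond(cc, x, 0))
      ++mismatches;
  }
  return mismatches;
}

// Mirrors the example in the comment: %1 = 255, add 1, compare ugt 1.
void checkReviewExample() {
  // Without nuw, exactly one input (x = 255) changes the branch.
  assert(countMismatches(Cond::UGT, 1, false) == 1);
  // Excluding the wrapping input removes the disagreement.
  assert(countMismatches(Cond::UGT, 1, true) == 0);
  // Equality is unaffected by wrapping.
  assert(countMismatches(Cond::EQ, 1, false) == 0);
}
```

This only covers the unsigned case from the example; the signed predicates would need the analogous check with `nsw`.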

Note that the semantics of DAG instructions are not actually specified/defined anywhere AFAIK (if they are, I'd love to know where!). I'm assuming they are the same as LLVM IR in this case because we can and do pass nsw/nuw flags from IR instructions to DAG nodes.

Please take a look at how this transform is handled for signed and unsigned compares in InstCombineCompares.cpp. I assume the reason we need this fold repeated as a DAG combine is that the pattern does not emerge until it's too late to be handled by InstCombine?

cc'ing Sanjoy and David in case I have any or all of this wrong. :)

Note: D24700 is proposing to enhance InstCombine's folding for a similar pattern.

RKSimon resigned from this revision. Jan 6 2017, 6:11 AM

RKSimon removed a reviewer: RKSimon.

RKSimon added a subscriber: RKSimon.

spatel resigned from this revision. Sep 26 2017, 3:52 PM

Revision Contents

Path	Size
lib/CodeGen/SelectionDAG/DAGCombiner.cpp	48 lines
test/CodeGen/AArch64/DAGCombine-optimize-brcc.ll	46 lines

Diff 70636

lib/CodeGen/SelectionDAG/DAGCombiner.cpp

[76 lines not shown]
 StressLoadSlicing("combiner-stress-load-slicing", cl::Hidden,
                   cl::desc("Bypass the profitability model of load "
                            "slicing"),
                   cl::init(false));
 
 static cl::opt<bool>
 MaySplitLoadIndex("combiner-split-load-index", cl::Hidden, cl::init(true),
                   cl::desc("DAG combiner may split indexing from loads"));
 
-//------------------------------ DAGCombiner ---------------------------------//
+static cl::opt<bool>
+CombinerOptBRCC("combiner-optimize-brcc", cl::Hidden,
+                cl::desc("Optimize removable src operand of br_cc"));
+
+//------------------------------ DAGCombiner--------------------------------//
 
 class DAGCombiner {
   SelectionDAG &DAG;
   const TargetLowering &TLI;
   CombineLevel Level;
   CodeGenOpt::Level OptLevel;
   bool LegalOperations;
   bool LegalTypes;
[9,563 lines not shown; context: SDValue DAGCombiner::visitBR_CC(SDNode *N) {]
 
   // fold to a simpler setcc
   if (Simp.getNode() && Simp.getOpcode() == ISD::SETCC)
     return DAG.getNode(ISD::BR_CC, SDLoc(N), MVT::Other,
                        N->getOperand(0), Simp.getOperand(2),
                        Simp.getOperand(0), Simp.getOperand(1),
                        N->getOperand(4));
 
+  if (CombinerOptBRCC.getNumOccurrences() > 0) {
+    // %2 = add %1, #const
+    // br_cc cond %2, #const
+    // -->
+    // br_cc cond %1, 0
>>> bryant (unsubmitted, not done): could this be generalized further? i.e., `(br_cc cc, (add X, c0), c1, bb) => (br_cc cc, X, (c1 - c0), bb)`, assuming that `c1 >= c0`. and then generalize that to other binary ops?
+    SDNode *CondLHSNode = CondLHS.getNode();
+    if (!Simp.getNode() && CondLHSNode->hasOneUse() &&
+        CondLHSNode->getOpcode() == ISD::ADD &&
+        (CC->get() == ISD::SETEQ || CC->get() == ISD::SETGT ||
+         CC->get() == ISD::SETGE || CC->get() == ISD::SETLT ||
+         CC->get() == ISD::SETLE || CC->get() == ISD::SETNE)) {
+      assert(N == *CondLHSNode->use_begin());
+      SDValue AddLHS = CondLHSNode->getOperand(0);
+      SDValue AddRHS = CondLHSNode->getOperand(1);
+      ConstantSDNode *ConstAddRHS = dyn_cast<ConstantSDNode>(AddRHS);
+      ConstantSDNode *ConstNRHS = dyn_cast<ConstantSDNode>(CondRHS);
+      if (ConstAddRHS && ConstNRHS &&
+          AddRHS.getValueType().getScalarType().getSizeInBits() ==
+              CondRHS.getValueType().getScalarType().getSizeInBits() &&
>>> bryant (unsubmitted, not done): `EVT::getScalarSizeInBits`. also, isn't it already the case that `VT(AddRHS) == VT(AddLHS) == VT(CondLHSNode)` (because of the implied constraint on add operands) and `VT(CondLHSNode) == VT(CondRHS)` (because of br_cc's constraint)?
+          APInt::isSameValue(ConstAddRHS->getAPIntValue(),
+                             ConstNRHS->getAPIntValue())) {
+        DEBUG(dbgs() << "ConstAddRHS and ConstNRHS are the same constants\n");
+        DEBUG(dbgs() << "ConstAddRHS's value: " << ConstAddRHS->getAPIntValue()
+                     << "\n");
+        DEBUG(dbgs() << "ConstNRHS's value: " << ConstNRHS->getAPIntValue()
+                     << "\n");
+        DEBUG(dbgs() << "ConstNRHS - ConstAddRHS: "
>>> bryant (unsubmitted, not done): not so sure that this is the appropriate place for debug output.
+                     << ConstNRHS->getAPIntValue() -
+                            ConstAddRHS->getAPIntValue()
+                     << "\n");
+
+        return DAG.getNode(
+            ISD::BR_CC, SDLoc(N), MVT::Other, N->getOperand(0),
+            N->getOperand(1), AddLHS,
+            DAG.getConstant(
+                APInt::getNullValue(
+                    CondRHS.getValueType().getScalarType().getSizeInBits()),
+                SDLoc(N), CondRHS.getValueType()),
+            N->getOperand(4));
+      }
+    }
+  }
   return SDValue();
 }
 
 /// Return true if 'Use' is a load or a store that uses N as its base pointer
 /// and that N may be folded in the load / store addressing mode.
 static bool canFoldInAddressingMode(SDNode *N, SDNode *Use,
                                     SelectionDAG &DAG,
                                     const TargetLowering &TLI) {
[5,512 lines not shown]

test/CodeGen/AArch64/DAGCombine-optimize-brcc.ll

This file was added.

; RUN: llc -march=aarch64 -asm-verbose=false -combiner-optimize-brcc -o - %s | FileCheck %s

; This test is for the option '-combiner-optimize-brcc', which removes an unnecessary add instruction.
; The transformation is done as follows.
; %2 = add %1, #const
; br_cc cond %2, #const
; -->
; br_cc cond %1, 0

; CHECK: cmp x8, #0
; CHECK-NEXT: sub x8, x8, #1
; CHECK-NEXT: add x0, x0, x10
define i64 @foo(i64* %a, i64 %w) {
entry:
  %cmp = icmp eq i64 %w, 0
  br i1 %cmp, label %cleanup, label %if.end

if.end:
  %0 = load i64, i64* %a, align 8
  %cmp1 = icmp sgt i64 %0, 0
  br i1 %cmp1, label %for.body.lr.ph, label %cleanup

for.body.lr.ph:
  %d1 = bitcast i64* %a to i64**
  %1 = load i64*, i64** %d1, align 8
  %2 = add i64 %0, -1
  br label %for.body

for.body:
  %lsr.iv = phi i64 [ %2, %for.body.lr.ph ], [ %lsr.iv.next, %for.body ]
  %ret = phi i64 [ 0, %for.body.lr.ph ], [ %ret.next, %for.body ]
  %gep = getelementptr i64, i64* %1, i64 %lsr.iv
  %3 = load i64, i64* %gep, align 8
  %ret.next = add i64 %ret, %3
  %lsr.iv.next = add i64 %lsr.iv, -1
  %4 = add i64 %lsr.iv.next, 2
  %cmp2 = icmp sgt i64 %4, 1
  br i1 %cmp2, label %for.body, label %for.end.loopexit

for.end.loopexit:
  br label %cleanup

cleanup:
  %conv3.sink = phi i64 [ -1, %entry ], [ 0, %if.end ], [ %ret.next, %for.end.loopexit ]
  ret i64 %conv3.sink
}
>>> bryant (unsubmitted, not done): could this test case be simplified?

[DAGCombine] Modification of visitBR_CC
Needs Review, Public