This is an archive of the discontinued LLVM Phabricator instance.

Maybe add RISCV tests? That's another target that stores compare results as 0/1 instead of 0/-1 - could you also pre-commit the tests for RISCV/SystemZ with current codegen and rebase the patch to show the diff?

laytonio mentioned this in D91671: [DAGCombiner] Precommit Sext Tests. · Nov 17 2020, 5:43 PM

RKSimon mentioned this in rG7a8b2f692ec4: [DAGCombiner] Precommit Sext Tests for D91589. · Nov 18 2020, 7:56 AM

@laytonio Please can you rebase? BTW I extended the RISCV tests to include i64 tests as well as i32

Thanks!

Rebase

Herald added subscribers: frasercrmck, luismarques, apazos and 20 others. · Nov 18 2020, 4:09 PM

RKSimon added reviewers: efriedma, nemanjai. · Nov 19 2020, 2:02 AM

RKSimon added inline comments.

llvm/test/CodeGen/RISCV/sext-zext-trunc.ll
477	Not sure what comment should be here if RISCV doesn't actually do the fold

lkail added a subscriber: lkail. · Nov 19 2020, 2:55 AM

laytonio added inline comments. · Nov 19 2020, 4:42 AM

llvm/test/CodeGen/RISCV/sext-zext-trunc.ll
477	Unless I'm missing something, trunk was already doing this fold. I added this as a test case because, without checking whether we could fold away the not by converting back to a sign extend, we would have regressed here. After this patch, the codegen of sext_of_not_cmp and dec_of_zexted_cmp should be identical. I can remove these if you think they're redundant.

Does anyone have any more comments?

I might have missed something - how are we guarding against these 2 transforms getting into an infinite loop?

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2317	Shouldn't that "zext" be "sext"? We are creating sext in this block.
2329–2331	We do not usually speculatively create nodes like that 'Not' and then not explicitly delete them if they weren't necessary. It's not clear to me from the description or code comments why we are doing this transform at all. Isn't this the opposite of the patch title?

In D91589#2408627, @spatel wrote:

I might have missed something - how are we guarding against these 2 transforms getting into an infinite loop?

I don't believe this is a danger, because going from (sext (not i1 x)) to (add (zext i1 x), -1) requires (not i1 x), and we only go back when we immediately fold away the created (not i1 x). We could instead decide not to do the initial fold if we have a (not i1 x), but if we fold to (add (zext i1 x), -1) first, we have the opportunity to constant fold the -1 away, and we additionally cover cases where we initially have the (add (zext i1 x), -1) version (e.g., the dec_of_zexted_cmp tests added to RISCV and SystemZ).
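
Both forms in this argument compute the same value for either i1 input, which is why the combiner is free to pick whichever one is cheaper. The following is only a minimal sketch, assuming plain C++ integers as stand-ins for the DAG nodes; the helper names `sextI1`, `zextI1`, `sextOfNot`, `decOfZext`, and `checkFold` are hypothetical and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Stand-ins for extending a 1-bit value to 32 bits (not LLVM code).
// sext i1: 0 -> 0, 1 -> -1 (all ones).
constexpr int32_t sextI1(bool B) { return B ? -1 : 0; }
// zext i1: 0 -> 0, 1 -> 1.
constexpr int32_t zextI1(bool B) { return B ? 1 : 0; }

// Left-hand side of the fold: sext (not i1 x).
constexpr int32_t sextOfNot(bool X) { return sextI1(!X); }
// Right-hand side of the fold: add (zext i1 x), -1.
constexpr int32_t decOfZext(bool X) { return zextI1(X) + (-1); }

// Exhaustive check over the two possible i1 values.
inline void checkFold() {
  for (bool X : {false, true})
    assert(sextOfNot(X) == decOfZext(X));
}
```

Calling `checkFold()` from a test driver exercises both inputs. The sketch says nothing about which form a given target lowers more cheaply; that is the question the review is debating.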

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2317	No, because the sext is only better if we can fold away the not, as the comment below indicates. The actual fold that creates the zext of this form is not here, but in visitSIGN_EXTEND.
2329–2331	I will update the patch to delete the Not node if it is not necessary. As for why we are doing this: we would regress in some cases (e.g., the sext_of_not_cmp tests added to RISCV and SystemZ) where we would otherwise have folded away the (not i1 x). In general, as the TODO above indicates, we should probably have a TLI method to determine whether we want to prefer the zext or sext versions of this and several other folds. It might make sense to use TLI.getBooleanContents for this, as we already do for the (add (sext i1 Y), X) -> (sub X, (zext i1 Y)) fold in visitADDLikeCommutative.
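
The (add (sext i1 Y), X) -> (sub X, (zext i1 Y)) fold mentioned here rests on the same kind of identity. As a hedged sketch, assuming 32-bit wrapping unsigned arithmetic as a model of the DAG `add`/`sub` nodes (the names `sextI1`, `zextI1`, `addSext`, `subZext`, and `checkAddSextFold` are hypothetical, not LLVM APIs):

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for extending a 1-bit value to 32 bits (not LLVM code).
constexpr uint32_t sextI1(bool B) { return B ? 0xFFFFFFFFu : 0u; }
constexpr uint32_t zextI1(bool B) { return B ? 1u : 0u; }

// add (sext i1 Y), X, with wrap-around like the DAG node.
constexpr uint32_t addSext(uint32_t X, bool Y) { return sextI1(Y) + X; }
// sub X, (zext i1 Y), with wrap-around like the DAG node.
constexpr uint32_t subZext(uint32_t X, bool Y) { return X - zextI1(Y); }

// Compare both forms on a few boundary values of X and both values of Y.
inline void checkAddSextFold() {
  const uint32_t Samples[] = {0u, 1u, 7u, 0x7FFFFFFFu, 0x80000000u,
                              0xFFFFFFFFu};
  for (uint32_t X : Samples)
    for (int Y = 0; Y <= 1; ++Y)
      assert(addSext(X, Y != 0) == subZext(X, Y != 0));
}
```

This only demonstrates that the two forms agree; whether a target should prefer one of them is the policy decision the comment proposes to route through TLI.getBooleanContents.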

spatel added inline comments. · Nov 21 2020, 10:23 AM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2329–2331	Yes - using getBooleanContents seems like a better option. Another possibility is that we extend the pattern matching to include a `setcc` node (assuming that can absorb the `not` op).

Remove the speculatively created (not i1 x) when it is not used.

laytonio added inline comments. · Nov 23 2020, 3:50 AM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2329–2331	I tried an implementation using getBooleanContents here and in a few other places, and it seems promising, but there are a few regressions I was having a hard time nailing down. I would like to save this for a follow-up patch if possible. While inverting a `setcc` is probably the most common way we would be able to absorb the `not`, I don't know if it makes sense to only check that case. At the least, I think `(not (or x, y)) -> (and (not x), (not y)) iff x or y are setcc` could also be useful.
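
The `(not (or x, y))` rewrite suggested here is De Morgan's law, and it pays off when the operands are compares because each `not` can then be absorbed by inverting the condition code. A small sketch under that assumption, using ordinary C++ comparisons in place of `setcc` nodes (the function names and the choice of `eq`/`lt` compares are illustrative only):

```cpp
#include <cassert>
#include <cstdint>

// Original form: not (or (setcc a, b, eq), (setcc c, d, lt)).
constexpr bool notOfOr(int32_t A, int32_t B, int32_t C, int32_t D) {
  return !((A == B) || (C < D));
}

// Rewritten form: and (setcc a, b, ne), (setcc c, d, ge).
// Each 'not' has been absorbed by inverting the compare's condition.
constexpr bool andOfInvertedCmps(int32_t A, int32_t B, int32_t C, int32_t D) {
  return (A != B) && (C >= D);
}

// Compare both forms over a small grid of sample operands.
inline void checkDeMorgan() {
  const int32_t Samples[] = {-2, -1, 0, 1, 7};
  for (int32_t A : Samples)
    for (int32_t B : Samples)
      for (int32_t C : Samples)
        for (int32_t D : Samples)
          assert(notOfOr(A, B, C, D) == andOfInvertedCmps(A, B, C, D));
}
```

In the rewritten form no explicit `not` remains, which is the property the sext variant of the fold needs in order to be profitable.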

spatel added inline comments. · Nov 23 2020, 5:26 AM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

2317

Ok - I understand now, but I think we can improve the order of the statements to make it clearer.
How about:

// add (zext i1 X), -1 -> sext (not i1 X)
// Usually, we transform this pattern in the opposite 
// direction because most targets generate better code for
// the zext form. However, if we can eliminate the 'not', the
// sext form should be better.
// TODO: ...

2333

Please add a code comment with something like this:

// The speculatively created 'not' node could not be folded away, 
// so delete it immediately to avoid conflicting with other folds.

10668

Can we use this here:

/// Returns true if \p V is a bitwise not operation. Assumes that an all ones
/// constant is canonicalized to be operand 1.
bool isBitwiseNot(SDValue V, bool AllowUndefs = false);

laytonio added inline comments. · Nov 24 2020, 9:38 PM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
2329–2331	Using isBitwiseNot below as suggested caused us to regress in a new case. Upon investigating, I've realized my argument for using visitXOR here is incorrect. At the time we try to fold away the created `not`, the value we're trying to fold the `not` into has two uses: the new `not` and the original `zext`. This causes most of the folds in visitXOR not to fire. I'm not sure there is a good way to overcome this. I believe conditioning the initial fold on having a non-foldable `not` is probably the better solution.
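
The use-count problem described here can be pictured with a toy model. This is a sketch only: `ToyNode`, `canAbsorbNot`, and the two scenario functions are hypothetical and merely imitate folds that are guarded by a single-use check; they are not the real visitXOR logic.

```cpp
#include <cassert>

// Toy stand-in for a DAG node that only tracks how many users it has.
struct ToyNode {
  int NumUses = 0;
  bool hasOneUse() const { return NumUses == 1; }
};

// Hypothetical guard imitating folds that only fire on single-use operands.
inline bool canAbsorbNot(const ToyNode &Operand) { return Operand.hasOneUse(); }

// Speculative case: the 'not' is created while the original zext still
// uses the same operand, so the operand has two users.
inline bool speculativeNotFolds() {
  ToyNode Cmp;
  Cmp.NumUses = 1; // used by the existing zext
  ++Cmp.NumUses;   // now also used by the newly created 'not'
  return canAbsorbNot(Cmp);
}

// Original case: the input is (sext (not x)), so the 'not' is the only user.
inline bool originalNotFolds() {
  ToyNode Cmp;
  Cmp.NumUses = 1; // used only by the 'not'
  return canAbsorbNot(Cmp);
}
```

Under this model the speculative path is blocked while the original path is not, which matches the reasoning for checking foldability before performing the initial fold rather than after.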

lenary removed a subscriber: lenary. · Nov 25 2020, 2:25 AM

Use isBitwiseNot
Avoid folding cases where (not i1 x) can already be folded

Switch to only matching sign extends from scalar i1s. This is what the X86 backend is currently doing. I don't believe this transform is always better for vector types as it requires materializing an all ones vector.

spatel added inline comments. · Nov 27 2020, 2:09 PM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
10440	This is unrelated refactoring? Please commit separately as an NFC patch.

Address comments

laytonio marked 4 inline comments as done. · Nov 30 2020, 3:57 PM

LGTM

This revision is now accepted and ready to land. · Dec 2 2020, 1:58 PM

Thanks for the review! Could you commit this for me please? Layton Kifer <laytonkifer@gmail.com>

In D91589#2431054, @laytonio wrote:

Thanks for the review! Could you commit this for me please? Layton Kifer <laytonkifer@gmail.com>

Sure - but I missed something: are we excluding vector types intentionally (do we have any test coverage?) or is that planned as a follow-up? Either way, we should add a code comment to explain.

are we excluding vector types intentionally (do we have any test coverage?) or is that planned as a follow-up? Either way, we should add a code comment to explain.

The thought behind excluding vector types was that materializing the all ones constant may be more expensive in some cases. Although thinking about it again, I'm not sure if that is really true as we would remove the constant we are xoring against, most likely making it a wash. I did try allowing vector types and it caused a regression in a single test. That regression did seem like it was avoidable though as it exposed a missing fold of (not (xor (setcc), (setcc))) -> (xor (not (setcc)), (setcc)). Implementing that turned into a bit of a rabbit hole though. What do you think the best approach is here?
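
The missing fold mentioned here, (not (xor a, b)) -> (xor (not a), b), is a bitwise identity that holds lane by lane. As a rough sketch, assuming each bit of a `uint8_t` models one i1 lane of a bool vector (the function names are hypothetical, not LLVM APIs):

```cpp
#include <cassert>
#include <cstdint>

// not (xor a, b), applied to eight 1-bit lanes at once.
constexpr uint8_t notOfXor(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>(~(A ^ B));
}

// xor (not a), b: the 'not' has been pushed onto one operand, where a
// compare could absorb it by inverting its condition.
constexpr uint8_t xorOfNot(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>(~A ^ B);
}

// Exhaustive check over every pair of 8-lane inputs.
inline void checkNotXorFold() {
  for (unsigned A = 0; A < 256; ++A)
    for (unsigned B = 0; B < 256; ++B)
      assert(notOfXor(static_cast<uint8_t>(A), static_cast<uint8_t>(B)) ==
             xorOfNot(static_cast<uint8_t>(A), static_cast<uint8_t>(B)));
}
```

The identity itself is cheap to state; the difficulty reported in the thread lies in wiring it into the combiner without disturbing other folds, which this sketch does not address.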

In D91589#2431862, @laytonio wrote:

are we excluding vector types intentionally (do we have any test coverage?) or is that planned as a follow-up? Either way, we should add a code comment to explain.

The thought behind excluding vector types was that materializing the all ones constant may be more expensive in some cases. Although thinking about it again, I'm not sure if that is really true as we would remove the constant we are xoring against, most likely making it a wash. I did try allowing vector types and it caused a regression in a single test. That regression did seem like it was avoidable though as it exposed a missing fold of (not (xor (setcc), (setcc))) -> (xor (not (setcc)), (setcc)). Implementing that turned into a bit of a rabbit hole though. What do you think the best approach is here?

I think it's ok to proceed with the scalar-only patch to make progress, but put a TODO comment on it about extending to vectors.

Closed by commit rGac522f87002f: [DAGCombiner] Fold (sext (not i1 x)) -> (add (zext i1 x), -1) (authored by laytonio, committed by spatel). · Dec 6 2020, 8:55 AM

This revision was automatically updated to reflect the committed changes.

spatel added a commit: rGac522f87002f: [DAGCombiner] Fold (sext (not i1 x)) -> (add (zext i1 x), -1).

Herald added a subscriber: jrtc27. · Dec 6 2020, 8:55 AM

Thank you for this! I apologise that I hadn't gotten to it yet.

Hi! We have bisected a clang crash to this patch. The following reduced test case crashes clang when built with clang -O1 -frounding-math:

template <class> class a {
  int b() { return c == 0.0 ? 0 : -1; }
  int c;
};
template class a<long>;

A debug build of clang produces this "assertion failed" error:

clang: /home/jgorbe/code/llvm/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:264: void {anonymous}::DAGCombiner::AddToWorklist(llvm::SDNode*): Assertion `N->getOpcode() != ISD::DELETED_NODE && "Deleted Node added to Worklist"' failed.

In D91589#2453540, @jgorbe wrote:
Hi! We have bisected a clang crash to this patch. The following reduced test case crashes clang when built with clang -O1 -frounding-math:
template <class> class a {
  int b() { return c == 0.0 ? 0 : -1; }
  int c;
};
template class a<long>;
A debug build of clang produces this "assertion failed" error:
clang: /home/jgorbe/code/llvm/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:264: void {anonymous}::DAGCombiner::AddToWorklist(llvm::SDNode*): Assertion `N->getOpcode() != ISD::DELETED_NODE && "Deleted Node added to Worklist"' failed.

Thanks, I will look at this right now.

laytonio mentioned this in D93274: [DAGCombiner] Don't create sexts of deleted xors when they were in-visit replaced. · Dec 14 2020, 9:44 PM

rupprecht mentioned this in rGd29f93bda511: [DAGCombiner] Don't create sexts of deleted xors when they were in-visit…. · Dec 23 2020, 4:21 PM

Revision Contents

Path                                             Size
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp    13 lines
llvm/lib/Target/X86/X86ISelLowering.cpp          11 lines
llvm/test/CodeGen/AArch64/select_const.ll        7 lines
llvm/test/CodeGen/ARM/select_const.ll            24 lines
llvm/test/CodeGen/PowerPC/select_const.ll        6 lines
llvm/test/CodeGen/RISCV/sext-zext-trunc.ll       21 lines
llvm/test/CodeGen/SystemZ/sext-zext.ll           7 lines
llvm/test/CodeGen/X86/pr44140.ll                 4 lines

Diff 309774

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

@@ ... 2,308 lines not shown ... @@ if (N0.getOpcode() == ISD::SIGN_EXTEND && N0.hasOneUse() &&
         SDValue Not = DAG.getNOT(DL, X, X.getValueType());
         return DAG.getNode(ISD::ZERO_EXTEND, DL, VT, Not);
       }
     }

     // Fold (add (or x, c0), c1) -> (add x, (c0 + c1)) if (or x, c0) is
     // equivalent to (add x, c0).
     if (N0.getOpcode() == ISD::OR &&
         isConstantOrConstantVector(N0.getOperand(1), /* NoOpaque */ true) &&
         DAG.haveNoCommonBitsSet(N0.getOperand(0), N0.getOperand(1))) {
       if (SDValue Add0 = DAG.FoldConstantArithmetic(ISD::ADD, DL, VT,
                                                     {N1, N0.getOperand(1)}))
         return DAG.getNode(ISD::ADD, DL, VT, N0.getOperand(0), Add0);
     }
   }

   if (SDValue NewSel = foldBinOpIntoSelect(N))
     return NewSel;

   // reassociate add
   if (!reassociationCanBreakAddressingModePattern(ISD::ADD, DL, N0, N1)) {
     if (SDValue RADD = reassociateOps(ISD::ADD, DL, N0, N1, N->getFlags()))
       return RADD;
   }
   // fold ((0-A) + B) -> B-A
   if (N0.getOpcode() == ISD::SUB && isNullOrNullSplat(N0.getOperand(0)))
     return DAG.getNode(ISD::SUB, DL, VT, N1, N0.getOperand(1));

   // fold (A + (0-B)) -> A-B
   if (N1.getOpcode() == ISD::SUB && isNullOrNullSplat(N1.getOperand(0)))
     return DAG.getNode(ISD::SUB, DL, VT, N0, N1.getOperand(1));

   // fold (A+(B-A)) -> B
@@ ... 8,090 lines not shown ... @@ if (CC == ISD::SETGT && isAllOnesConstant(Ones) && VT == XVT) {
     }
   }
   return SDValue();
 }

 SDValue DAGCombiner::visitSIGN_EXTEND(SDNode *N) {
   SDValue N0 = N->getOperand(0);
   EVT VT = N->getValueType(0);
   SDLoc DL(N);

   if (SDValue Res = tryToFoldExtendOfConstant(N, TLI, DAG, LegalTypes))
     return Res;

   // fold (sext (sext x)) -> (sext x)
   // fold (sext (aext x)) -> (sext x)
   if (N0.getOpcode() == ISD::SIGN_EXTEND || N0.getOpcode() == ISD::ANY_EXTEND)
     return DAG.getNode(ISD::SIGN_EXTEND, DL, VT, N0.getOperand(0));
@@ ... 209 lines not shown ... @@ SDValue DAGCombiner::visitSIGN_EXTEND(SDNode *N) {
   if (N0.getOpcode() == ISD::ADD && N0.hasOneUse() &&
       isAllOnesOrAllOnesSplat(N0.getOperand(1)) &&
       N0.getOperand(0).getOpcode() == ISD::ZERO_EXTEND &&
       TLI.isOperationLegalOrCustom(ISD::ADD, VT)) {
     SDValue Zext = DAG.getZExtOrTrunc(N0.getOperand(0).getOperand(0), DL, VT);
     return DAG.getNode(ISD::ADD, DL, VT, Zext, DAG.getAllOnesConstant(DL, VT));
   }

+  // fold sext (not i1 X) -> add (zext i1 X), -1
+  // TODO: This could be extended to handle bool vectors.
+  if (N0.getValueType() == MVT::i1 && isBitwiseNot(N0) && N0.hasOneUse() &&
+      (!LegalOperations || (TLI.isOperationLegal(ISD::ZERO_EXTEND, VT) &&
+                            TLI.isOperationLegal(ISD::ADD, VT)))) {
+    // If we can eliminate the 'not', the sext form should be better
+    if (SDValue NewXor = visitXOR(N0.getNode()))
+      return DAG.getNode(ISD::SIGN_EXTEND, DL, VT, NewXor);
+
+    SDValue Zext = DAG.getNode(ISD::ZERO_EXTEND, DL, VT, N0.getOperand(0));
+    return DAG.getNode(ISD::ADD, DL, VT, Zext, DAG.getAllOnesConstant(DL, VT));
+  }
+
   return SDValue();
 }

 // isTruncateOf - If N is a truncate of some other value, return true, record
 // the value being truncated in Op and which of Op's bits are zero/one in Known.
 // This function computes KnownBits to avoid a duplicated call to
 // computeKnownBits in the caller.
 static bool isTruncateOf(SelectionDAG &DAG, SDValue N, SDValue &Op,
@@ ... 11,953 lines not shown ... @@

llvm/lib/Target/X86/X86ISelLowering.cpp

@@ ... 46,876 lines not shown ... @@ static SDValue combineExtSetcc(SDNode *N, SelectionDAG &DAG,
   return Res;
 }

 static SDValue combineSext(SDNode *N, SelectionDAG &DAG,
                            TargetLowering::DAGCombinerInfo &DCI,
                            const X86Subtarget &Subtarget) {
   SDValue N0 = N->getOperand(0);
   EVT VT = N->getValueType(0);
-  EVT InVT = N0.getValueType();
   SDLoc DL(N);

   // (i32 (sext (i8 (x86isd::setcc_carry)))) -> (i32 (x86isd::setcc_carry))
   if (!DCI.isBeforeLegalizeOps() &&
       N0.getOpcode() == X86ISD::SETCC_CARRY) {
     SDValue Setcc = DAG.getNode(X86ISD::SETCC_CARRY, DL, VT, N0->getOperand(0),
                                 N0->getOperand(1));
     bool ReplaceOtherUses = !N0.hasOneUse();
@@ ... 12 lines not shown ... @@ if (SDValue NewCMov = combineToExtendCMOV(N, DAG))
     return NewCMov;

   if (!DCI.isBeforeLegalizeOps())
     return SDValue();

   if (SDValue V = combineExtSetcc(N, DAG, Subtarget))
     return V;

-  if (InVT == MVT::i1 && N0.getOpcode() == ISD::XOR &&
-      isAllOnesConstant(N0.getOperand(1)) && N0.hasOneUse()) {
-    // Invert and sign-extend a boolean is the same as zero-extend and subtract
-    // 1 because 0 becomes -1 and 1 becomes 0. The subtract is efficiently
-    // lowered with an LEA or a DEC. This is the same as: select Bool, 0, -1.
-    // sext (xor Bool, -1) --> sub (zext Bool), 1
-    SDValue Zext = DAG.getNode(ISD::ZERO_EXTEND, DL, VT, N0.getOperand(0));
-    return DAG.getNode(ISD::SUB, DL, VT, Zext, DAG.getConstant(1, DL, VT));
-  }
-
   if (SDValue V = combineToExtendBoolVectorInReg(N, DAG, DCI, Subtarget))
     return V;

   if (VT.isVector()) {
     if (SDValue R = PromoteMaskArithmetic(N, DAG, Subtarget))
       return R;

     if (N0.getOpcode() == ISD::SIGN_EXTEND_VECTOR_INREG)
@@ ... 4,190 lines not shown ... @@

llvm/test/CodeGen/AArch64/select_const.ll

@@ ... 62 lines not shown ... @@ ; CHECK-NEXT: ret
   ret i32 %sel
 }

 ; select Cond, 0, -1 --> sext (!Cond)

 define i32 @select_0_or_neg1(i1 %cond) {
 ; CHECK-LABEL: select_0_or_neg1:
 ; CHECK: // %bb.0:
-; CHECK-NEXT: mvn w8, w0
-; CHECK-NEXT: sbfx w0, w8, #0, #1
+; CHECK-NEXT: and w8, w0, #0x1
+; CHECK-NEXT: sub w0, w8, #1 // =1
 ; CHECK-NEXT: ret
   %sel = select i1 %cond, i32 0, i32 -1
   ret i32 %sel
 }

 define i32 @select_0_or_neg1_zeroext(i1 zeroext %cond) {
 ; CHECK-LABEL: select_0_or_neg1_zeroext:
 ; CHECK: // %bb.0:
-; CHECK-NEXT: mvn w8, w0
-; CHECK-NEXT: sbfx w0, w8, #0, #1
+; CHECK-NEXT: sub w0, w0, #1 // =1
 ; CHECK-NEXT: ret
   %sel = select i1 %cond, i32 0, i32 -1
   ret i32 %sel
 }

 define i32 @select_0_or_neg1_signext(i1 signext %cond) {
 ; CHECK-LABEL: select_0_or_neg1_signext:
 ; CHECK: // %bb.0:
@@ ... 535 lines not shown ... @@

llvm/test/CodeGen/ARM/select_const.ll

@@ ... 131 lines not shown ... @@ ; THUMB-NEXT: bx lr
   ret i32 %sel
 }

 ; select Cond, 0, -1 --> sext (!Cond)

 define i32 @select_0_or_neg1(i1 %cond) {
 ; ARM-LABEL: select_0_or_neg1:
 ; ARM: @ %bb.0:
-; ARM-NEXT: mov r1, #1
-; ARM-NEXT: bic r0, r1, r0
-; ARM-NEXT: rsb r0, r0, #0
+; ARM-NEXT: and r0, r0, #1
+; ARM-NEXT: sub r0, r0, #1
 ; ARM-NEXT: mov pc, lr
 ;
 ; THUMB2-LABEL: select_0_or_neg1:
 ; THUMB2: @ %bb.0:
-; THUMB2-NEXT: movs r1, #1
-; THUMB2-NEXT: bic.w r0, r1, r0
-; THUMB2-NEXT: rsbs r0, r0, #0
+; THUMB2-NEXT: and r0, r0, #1
+; THUMB2-NEXT: subs r0, #1
 ; THUMB2-NEXT: bx lr
 ;
 ; THUMB-LABEL: select_0_or_neg1:
 ; THUMB: @ %bb.0:
 ; THUMB-NEXT: movs r1, #1
-; THUMB-NEXT: bics r1, r0
-; THUMB-NEXT: rsbs r0, r1, #0
+; THUMB-NEXT: ands r1, r0
+; THUMB-NEXT: subs r0, r1, #1
 ; THUMB-NEXT: bx lr
   %sel = select i1 %cond, i32 0, i32 -1
   ret i32 %sel
 }

 define i32 @select_0_or_neg1_zeroext(i1 zeroext %cond) {
 ; ARM-LABEL: select_0_or_neg1_zeroext:
 ; ARM: @ %bb.0:
-; ARM-NEXT: eor r0, r0, #1
-; ARM-NEXT: rsb r0, r0, #0
+; ARM-NEXT: sub r0, r0, #1
 ; ARM-NEXT: mov pc, lr
 ;
 ; THUMB2-LABEL: select_0_or_neg1_zeroext:
 ; THUMB2: @ %bb.0:
-; THUMB2-NEXT: eor r0, r0, #1
-; THUMB2-NEXT: rsbs r0, r0, #0
+; THUMB2-NEXT: subs r0, #1
 ; THUMB2-NEXT: bx lr
 ;
 ; THUMB-LABEL: select_0_or_neg1_zeroext:
 ; THUMB: @ %bb.0:
-; THUMB-NEXT: movs r1, #1
-; THUMB-NEXT: eors r1, r0
-; THUMB-NEXT: rsbs r0, r1, #0
+; THUMB-NEXT: subs r0, r0, #1
 ; THUMB-NEXT: bx lr
   %sel = select i1 %cond, i32 0, i32 -1
   ret i32 %sel
 }

 define i32 @select_0_or_neg1_signext(i1 signext %cond) {
 ; ARM-LABEL: select_0_or_neg1_signext:
 ; ARM: @ %bb.0:
@@ ... 579 lines not shown ... @@

llvm/test/CodeGen/PowerPC/select_const.ll

@@ ... 63 lines not shown ... @@ ; ALL-NEXT: blr
   ret i32 %sel
 }

 ; select Cond, 0, -1 --> sext (!Cond)

 define i32 @select_0_or_neg1(i1 %cond) {
 ; ALL-LABEL: select_0_or_neg1:
 ; ALL: # %bb.0:
-; ALL-NEXT: not 3, 3
 ; ALL-NEXT: clrldi 3, 3, 63
-; ALL-NEXT: neg 3, 3
+; ALL-NEXT: addi 3, 3, -1
 ; ALL-NEXT: blr
   %sel = select i1 %cond, i32 0, i32 -1
   ret i32 %sel
 }

 define i32 @select_0_or_neg1_zeroext(i1 zeroext %cond) {
 ; ALL-LABEL: select_0_or_neg1_zeroext:
 ; ALL: # %bb.0:
-; ALL-NEXT: xori 3, 3, 1
-; ALL-NEXT: neg 3, 3
+; ALL-NEXT: addi 3, 3, -1
 ; ALL-NEXT: blr
   %sel = select i1 %cond, i32 0, i32 -1
   ret i32 %sel
 }

 define i32 @select_0_or_neg1_signext(i1 signext %cond) {
 ; ALL-LABEL: select_0_or_neg1_signext:
 ; ALL: # %bb.0:
@@ ... 830 lines not shown ... @@

llvm/test/CodeGen/RISCV/sext-zext-trunc.ll

	Show First 20 Lines • Show All 431 Lines • ▼ Show 20 Lines
	;			;
	; RV64I-LABEL: trunc_i64_to_i32:			; RV64I-LABEL: trunc_i64_to_i32:
	; RV64I: # %bb.0:			; RV64I: # %bb.0:
	; RV64I-NEXT: ret			; RV64I-NEXT: ret
	%1 = trunc i64 %a to i32			%1 = trunc i64 %a to i32
	ret i32 %1			ret i32 %1
	}			}

	;; TODO: fold (sext (not x)) -> (add (zext x) -1)			;; fold (sext (not x)) -> (add (zext x) -1)
	define i32 @sext_of_not_i32(i1 %x) {			define i32 @sext_of_not_i32(i1 %x) {
	; RV32I-LABEL: sext_of_not_i32:			; RV32I-LABEL: sext_of_not_i32:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: not a0, a0
	; RV32I-NEXT: andi a0, a0, 1			; RV32I-NEXT: andi a0, a0, 1
	; RV32I-NEXT: neg a0, a0			; RV32I-NEXT: addi a0, a0, -1
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	;			;
	; RV64I-LABEL: sext_of_not_i32:			; RV64I-LABEL: sext_of_not_i32:
	; RV64I: # %bb.0:			; RV64I: # %bb.0:
	; RV64I-NEXT: not a0, a0
	; RV64I-NEXT: andi a0, a0, 1			; RV64I-NEXT: andi a0, a0, 1
	; RV64I-NEXT: neg a0, a0			; RV64I-NEXT: addi a0, a0, -1
	; RV64I-NEXT: ret			; RV64I-NEXT: ret
	%xor = xor i1 %x, 1			%xor = xor i1 %x, 1
	%sext = sext i1 %xor to i32			%sext = sext i1 %xor to i32
	ret i32 %sext			ret i32 %sext
	}			}

	define i64 @sext_of_not_i64(i1 %x) {			define i64 @sext_of_not_i64(i1 %x) {
	; RV32I-LABEL: sext_of_not_i64:			; RV32I-LABEL: sext_of_not_i64:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: not a0, a0			; RV32I-NEXT: andi a1, a0, 1
	; RV32I-NEXT: andi a0, a0, 1			; RV32I-NEXT: addi a0, a1, -1
	; RV32I-NEXT: neg a0, a0			; RV32I-NEXT: sltu a1, a0, a1
	; RV32I-NEXT: mv a1, a0			; RV32I-NEXT: addi a1, a1, -1
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	;			;
	; RV64I-LABEL: sext_of_not_i64:			; RV64I-LABEL: sext_of_not_i64:
	; RV64I: # %bb.0:			; RV64I: # %bb.0:
	; RV64I-NEXT: not a0, a0
	; RV64I-NEXT: andi a0, a0, 1			; RV64I-NEXT: andi a0, a0, 1
	; RV64I-NEXT: neg a0, a0			; RV64I-NEXT: addi a0, a0, -1
	; RV64I-NEXT: ret			; RV64I-NEXT: ret
	%xor = xor i1 %x, 1			%xor = xor i1 %x, 1
	%sext = sext i1 %xor to i64			%sext = sext i1 %xor to i64
	ret i64 %sext			ret i64 %sext
	}			}

	;; TODO: fold (sext (not (setcc a, b, cc))) -> (sext (setcc a, b, !cc))			;; fold (sext (not (setcc a, b, cc))) -> (sext (setcc a, b, !cc))
				RKSimon: Not sure what comment should be here if RISCV doesn't actually do the fold
				laytonio (author): Unless I'm missing something trunk was already doing this fold. I added this as a test case because, without checking if we could fold away the not by converting back to a sign extend, we would have regressed here. After this patch the codegen of sext_of_not_cmp and dec_of_zexted_cmp should be identical. I can remove these if you think they're redundant.
	define i32 @sext_of_not_cmp_i32(i32 %x) {			define i32 @sext_of_not_cmp_i32(i32 %x) {
	; RV32I-LABEL: sext_of_not_cmp_i32:			; RV32I-LABEL: sext_of_not_cmp_i32:
	; RV32I: # %bb.0:			; RV32I: # %bb.0:
	; RV32I-NEXT: addi a0, a0, -7			; RV32I-NEXT: addi a0, a0, -7
	; RV32I-NEXT: snez a0, a0			; RV32I-NEXT: snez a0, a0
	; RV32I-NEXT: neg a0, a0			; RV32I-NEXT: neg a0, a0
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	;			;

llvm/test/CodeGen/SystemZ/sext-zext.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -mtriple=s390x-linux-gnu | FileCheck %s			; RUN: llc < %s -mtriple=s390x-linux-gnu | FileCheck %s

	;; TODO: fold (sext (not x)) -> (add (zext x) -1)			;; fold (sext (not x)) -> (add (zext x) -1)
	define i32 @sext_of_not(i1 %x) {			define i32 @sext_of_not(i1 %x) {
	; CHECK-LABEL: sext_of_not:			; CHECK-LABEL: sext_of_not:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: xilf %r2, 4294967295
	; CHECK-NEXT: nilf %r2, 1			; CHECK-NEXT: nilf %r2, 1
	; CHECK-NEXT: lcr %r2, %r2			; CHECK-NEXT: ahi %r2, -1
	; CHECK-NEXT: br %r14			; CHECK-NEXT: br %r14
	%xor = xor i1 %x, 1			%xor = xor i1 %x, 1
	%sext = sext i1 %xor to i32			%sext = sext i1 %xor to i32
	ret i32 %sext			ret i32 %sext
	}			}

	;; TODO: fold (sext (not (setcc a, b, cc))) -> (sext (setcc a, b, !cc))			;; fold (sext (not (setcc a, b, cc))) -> (sext (setcc a, b, !cc))
	define i32 @sext_of_not_cmp(i32 %x) {			define i32 @sext_of_not_cmp(i32 %x) {
	; CHECK-LABEL: sext_of_not_cmp:			; CHECK-LABEL: sext_of_not_cmp:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: chi %r2, 7			; CHECK-NEXT: chi %r2, 7
	; CHECK-NEXT: ipm %r2			; CHECK-NEXT: ipm %r2
	; CHECK-NEXT: afi %r2, 1879048192			; CHECK-NEXT: afi %r2, 1879048192
	; CHECK-NEXT: sra %r2, 31			; CHECK-NEXT: sra %r2, 31
	; CHECK-NEXT: br %r14			; CHECK-NEXT: br %r14

llvm/test/CodeGen/X86/pr44140.ll

	; CHECK-NEXT: callq opaque			; CHECK-NEXT: callq opaque
	; CHECK-NEXT: vmovaps %xmm6, {{[0-9]+}}(%rsp)			; CHECK-NEXT: vmovaps %xmm6, {{[0-9]+}}(%rsp)
	; CHECK-NEXT: testb %sil, %sil			; CHECK-NEXT: testb %sil, %sil
	; CHECK-NEXT: jne .LBB1_1			; CHECK-NEXT: jne .LBB1_1
	; CHECK-NEXT: # %bb.2: # %exit			; CHECK-NEXT: # %bb.2: # %exit
	; CHECK-NEXT: movabsq $1010101010101010101, %rcx # imm = 0xE04998456557EB5			; CHECK-NEXT: movabsq $1010101010101010101, %rcx # imm = 0xE04998456557EB5
	; CHECK-NEXT: xorl %eax, %eax			; CHECK-NEXT: xorl %eax, %eax
	; CHECK-NEXT: cmpq %rcx, {{[0-9]+}}(%rsp)			; CHECK-NEXT: cmpq %rcx, {{[0-9]+}}(%rsp)
	; CHECK-NEXT: sete %al			; CHECK-NEXT: setne %al
	; CHECK-NEXT: decl %eax			; CHECK-NEXT: negl %eax
	; CHECK-NEXT: addq $584, %rsp # imm = 0x248			; CHECK-NEXT: addq $584, %rsp # imm = 0x248
	; CHECK-NEXT: .cfi_def_cfa_offset 8			; CHECK-NEXT: .cfi_def_cfa_offset 8
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	start:			start:
	%dummy0 = alloca [22 x i64], align 8			%dummy0 = alloca [22 x i64], align 8
	%dummy1 = alloca [22 x i64], align 8			%dummy1 = alloca [22 x i64], align 8
	%dummy2 = alloca [22 x i64], align 8			%dummy2 = alloca [22 x i64], align 8
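
The X86 change above swaps `sete`/`decl` for `setne`/`negl`. Both pairs should leave the same value in `%eax`: decrementing a 0/1 equality flag gives the same bit pattern as negating the inverted flag. A small Python sketch of that equivalence follows, assuming 32-bit wraparound and using hypothetical function names.

```python
MASK32 = 0xFFFFFFFF


def sete_decl(lhs, rhs):
    """Old sequence: sete %al ; decl %eax."""
    eax = 1 if lhs == rhs else 0   # sete after xorl %eax, %eax and cmpq
    return (eax - 1) & MASK32      # decl


def setne_negl(lhs, rhs):
    """New sequence: setne %al ; negl %eax."""
    eax = 1 if lhs != rhs else 0   # setne after xorl %eax, %eax and cmpq
    return (-eax) & MASK32         # negl


if __name__ == "__main__":
    constant = 1010101010101010101  # the immediate compared against in the test
    for value in (constant, constant + 1, 0):
        print(hex(sete_decl(value, constant)), hex(setne_negl(value, constant)))
```

This corresponds to the author's inline remark that the `not` is folded away by converting back to a sign extend of the inverted compare.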



[DAGCombiner] Fold (sext (not i1 x)) -> (add (zext i1 x), -1)
Closed, Public

Revision Contents

Diff 309774

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

llvm/lib/Target/X86/X86ISelLowering.cpp

llvm/test/CodeGen/AArch64/select_const.ll

llvm/test/CodeGen/ARM/select_const.ll

llvm/test/CodeGen/PowerPC/select_const.ll

llvm/test/CodeGen/RISCV/sext-zext-trunc.ll

llvm/test/CodeGen/SystemZ/sext-zext.ll

llvm/test/CodeGen/X86/pr44140.ll
