This is an archive of the discontinued LLVM Phabricator instance.

Differential D139609

[AArch64][DAGCombiner] fold instruction BIC from ISD::AND
Abandoned · Public

Authored by bcl5980 on Dec 8 2022, 12:39 AM.

Details

Reviewers

dmgreen
efriedma
mingmingl

Summary

This patch adds a new target ISD node, AArch64ISD::SBIC, to represent the scalar version of the bic instruction.
The node is selected during the DAG combine stage from this pattern:
(~X | C) & Y --> bic Y, (X & ~C)
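
The fold relies on De Morgan's law: `~(X & ~C)` equals `~X | C`, so `bic Y, (X & ~C)`, which computes `Y & ~(X & ~C)`, matches the original expression. The following is a minimal standalone sketch, not part of the patch, that checks the identity exhaustively on 8-bit values. The helper names (`bic`, `original`, `folded`, `foldHoldsForAll8BitValues`) are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Model of the AArch64 scalar instruction `bic Rd, Rn, Rm`: Rd = Rn & ~Rm.
inline std::uint8_t bic(std::uint8_t n, std::uint8_t m) {
  return static_cast<std::uint8_t>(n & ~m);
}

// Left-hand side of the pattern: (~X | C) & Y.
inline std::uint8_t original(std::uint8_t x, std::uint8_t c, std::uint8_t y) {
  return static_cast<std::uint8_t>((~x | c) & y);
}

// Right-hand side of the pattern: bic Y, (X & ~C).
inline std::uint8_t folded(std::uint8_t x, std::uint8_t c, std::uint8_t y) {
  return bic(y, static_cast<std::uint8_t>(x & ~c));
}

// Compare both sides for every 8-bit combination of X, C and Y.
inline bool foldHoldsForAll8BitValues() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned c = 0; c < 256; ++c)
      for (unsigned y = 0; y < 256; ++y)
        if (original(x, c, y) != folded(x, c, y))
          return false;
  return true;
}
```

Because every operation involved is bitwise, agreement on 8-bit values carries over to the i32 and i64 types that the combine handles.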

Event Timeline

bcl5980 created this revision. Dec 8 2022, 12:39 AM

Herald added a project: Restricted Project. Dec 8 2022, 12:39 AM

Herald added subscribers: hiraditya, kristof.beyls.

bcl5980 requested review of this revision. Dec 8 2022, 12:39 AM

Herald added a subscriber: llvm-commits.

bcl5980 retitled this revision from "[AArch64][DAGCombiner] fold BIC from AND in dagcombiner" to "[AArch64][DAGCombiner] fold instruction BIC from ISD::AND". Dec 8 2022, 12:40 AM

bcl5980 added a child revision: D139610: [AArch64][DAGCombiner] fold instruction EON from ISD::XOR. Dec 8 2022, 12:55 AM

Harbormaster completed remote builds in B201893: Diff 481172. Dec 8 2022, 9:03 AM

add shiftreg selection for bic

improve code in td file

Harbormaster completed remote builds in B202157: Diff 481532. Dec 9 2022, 12:42 AM

Ping.

mingmingl added inline comments. Dec 13 2022, 4:18 PM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
16046	An alternative is `and Y, (orn C, X)`. https://godbolt.org/z/PGnTdTMKW shows that `orn` is generated when the RHS of the `or` is not a constant or is a large constant, but not when it is a small constant. https://gist.github.com/minglotus-6/dd2c75a2253b128081125a578e7ff6c6 is the llc ISel debug output for the small constant, and https://gist.github.com/minglotus-6/2692d59e5c939941895d3581a4bdcbea for the large constant. I wonder if fixing ISel (after the DAG combiner) is a more general solution, i.e., one that selects `~X | C` to `orn` and fixes this motivating case as well.
16091	nit: add a comment like `LHS won't be a constant sd node if RHS is not a constant, due to canonicalization`, or better, assert the condition.
llvm/test/CodeGen/AArch64/logical-op-with-not.ll
39–40	Related to the scalar BIC DAG node (probably not part of this patch): `%4 = xor i32 %3, -1` followed by `%5 = and i32 %4, %1` could itself be a `BIC` as well. `llc` with `-debug` shows that `xor %3, -1` is combined into `or (xor %0, -1), -65281`, so the tblgen pattern won't see the `%4 %5` sequence.

bcl5980 added inline comments. Dec 13 2022, 7:16 PM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
16046	Yeah, there are two ways to fix the issue: `and` + `orn`, or `and` + `bic`. The problem with `orn` is that it always needs an extra `mov`, so I chose `and` + `bic`. I also tried adding the code to ISelDAG, but the code became complex and it would lose some optimizations for the first `and`, as in the case @bic_shiftedreg_from_and.

bcl5980 added inline comments. Dec 13 2022, 10:32 PM

llvm/test/CodeGen/AArch64/logical-op-with-not.ll
39–40	Yeah, what this change actually does is revert the `xor + or` back to `xor + and`.

Hello. In general we have found it is better to avoid AArch64ISD nodes if possible. That way the benefits from generic dag combines keep applying, as opposed to becoming black boxes that the rest of the optimizer cannot see. Sometimes they are necessary, and that might be the case here, but is it possible to just adjust the code using standard nodes? It doesn't seem like this applies very often though - perhaps that makes it OK in this case.

In D139609#3998214, @dmgreen wrote:

Hello. In general we have found it is better to avoid AArch64ISD nodes if possible. That way the benefits from generic dag combines keep applying, as opposed to becoming black boxes that the rest of the optimizer cannot see. Sometimes they are necessary, and that might be the case here, but is it possible to just adjust the code using standard nodes? It doesn't seem like this applies very often though - perhaps that makes it OK in this case.

The headache here is that DAGCombiner will revert the pattern to its original form if I use standard nodes, so I tried adding a new AArch64ISD node.

Maybe I can move the transform to after legalization, so that the standard nodes survive longer.

Putting the transform late can certainly help; that's a good option if it is needed. It can sometimes be better to just prevent the DAG combine that is going in the wrong direction. There is a set of methods like shouldFoldConstantShiftPairToMask and isDesirableToCommuteWithShift that can be used by the target to control how the DAGCombiner acts. There can be problems if the transform is towards a canonical form that other DAG combines rely upon, but it sounds like in this case it might be easy enough to add?

In D139609#4004201, @dmgreen wrote:

Putting the transform late can certainly help; that's a good option if it is needed. It can sometimes be better to just prevent the DAG combine that is going in the wrong direction. There is a set of methods like shouldFoldConstantShiftPairToMask and isDesirableToCommuteWithShift that can be used by the target to control how the DAGCombiner acts. There can be problems if the transform is towards a canonical form that other DAG combines rely upon, but it sounds like in this case it might be easy enough to add?

Methods like shouldFoldConstantShiftPairToMask or isDesirableToCommuteWithShift only prevent the DAGCombiner from transforming a good pattern into a bad one. If the code starts out in the bad pattern, we still can't optimize it. So if DAGCombiner already performs some canonicalization, I prefer to optimize the pattern after DAG combine. I am also not sure, if we need to add a TLI interface, what interface we could add to keep the code general.

It appears to be this code that is transforming not(and(x,C)) -> or(not(x), not(C)): https://github.com/llvm/llvm-project/blob/85d049a089d4a6a4a67145429ea5d8e155651138/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L8746
There would be advantages and disadvantages to both methods, but it seems like disabling that with a target hook, either universally or in specific cases should work and might be a little more general than this version. mvn_shiftedreg_from_and does get a little worse though, which can maybe be fixed in the DAG2DAG matchers? It could help other architectures if the not constant isn't as cheap as the original too.
If the reverse pattern is needed, it should be OK to add either as a DAG combine or in AArch64ISelLowering, so long as it does not fight with the existing combines.

I would recommend avoiding new AArch64ISD nodes, but if the alternative doesn't work then make sure we only start creating the node late - after the DAG has been legalized.
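
As a rough illustration of the target-hook approach suggested above, here is a standalone toy model, not LLVM code, in which a combiner consults a hook before rewriting not(and(x, C)) into or(not(x), not(C)). Every name in it (`ToyTargetHooks`, `shouldCanonicalizeNotOfAndMask`, `ToyExpr`, `combineNotOfAnd`) is hypothetical; no such hook is claimed to exist in LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a TargetLowering-style interface (not an LLVM API).
struct ToyTargetHooks {
  virtual ~ToyTargetHooks() = default;
  // Return true to allow not(and(x, C)) -> or(not(x), ~C). Default: allow.
  virtual bool shouldCanonicalizeNotOfAndMask(std::uint64_t /*Mask*/) const {
    return true;
  }
};

// A target that prefers to keep not(and(x, C)) so a bic-like pattern can match.
struct ToyAArch64Hooks : ToyTargetHooks {
  bool shouldCanonicalizeNotOfAndMask(std::uint64_t /*Mask*/) const override {
    return false;
  }
};

enum class Shape { NotOfAnd, OrOfNot };

// Tiny expression: either ~(x & Constant) or (~x | Constant).
struct ToyExpr {
  Shape Form;
  std::uint64_t Constant;

  std::uint64_t eval(std::uint64_t X) const {
    return Form == Shape::NotOfAnd ? ~(X & Constant) : (~X | Constant);
  }
};

// The generic combine, gated by the target hook.
inline ToyExpr combineNotOfAnd(const ToyExpr &E, const ToyTargetHooks &TLI) {
  if (E.Form != Shape::NotOfAnd ||
      !TLI.shouldCanonicalizeNotOfAndMask(E.Constant))
    return E; // Leave the node alone.
  return ToyExpr{Shape::OrOfNot, ~E.Constant};
}
```

With the default hooks, the mask 0xff00 is rewritten to the `or` form with constant 0xffffffffffff00ff, the immediate seen in the old `orr` check line of @and_bic. With the overriding hooks the expression keeps its original shape. As noted in the thread, a hook like this only stops the rewrite and does not recover code that already arrives in the `or` form.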

bcl5980 abandoned this revision. Jan 5 2023, 10:58 PM

Revision Contents

Path	Size
llvm/lib/Target/AArch64/AArch64ISelLowering.h	3 lines
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp	59 lines
llvm/lib/Target/AArch64/AArch64InstrInfo.td	22 lines
llvm/test/CodeGen/AArch64/logical-op-with-not.ll	10 lines
llvm/test/CodeGen/AArch64/shiftregister-from-and.ll	7 lines

Diff 481532

llvm/lib/Target/AArch64/AArch64ISelLowering.h

@@ enum NodeType : unsigned {

   // Arithmetic instructions which write flags.
   ADDS,
   SUBS,
   ADCS,
   SBCS,
   ANDS,

+  // Scalar logical instructions with not operand
+  SBIC,
+
   // Conditional compares. Operands: left,right,falsecc,cc,flags
   CCMP,
   CCMN,
   FCCMP,

   // Floating point comparison
   FCMP,

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

@@ case AArch64ISD::FIRST_NUMBER:
     MAKE_CASE(AArch64ISD::SETCC_MERGE_ZERO)
     MAKE_CASE(AArch64ISD::ADC)
     MAKE_CASE(AArch64ISD::SBC)
     MAKE_CASE(AArch64ISD::ADDS)
     MAKE_CASE(AArch64ISD::SUBS)
     MAKE_CASE(AArch64ISD::ADCS)
     MAKE_CASE(AArch64ISD::SBCS)
     MAKE_CASE(AArch64ISD::ANDS)
+    MAKE_CASE(AArch64ISD::SBIC)
     MAKE_CASE(AArch64ISD::CCMP)
     MAKE_CASE(AArch64ISD::CCMN)
     MAKE_CASE(AArch64ISD::FCCMP)
     MAKE_CASE(AArch64ISD::FCMP)
     MAKE_CASE(AArch64ISD::STRICT_FCMP)
     MAKE_CASE(AArch64ISD::STRICT_FCMPE)
     MAKE_CASE(AArch64ISD::DUP)
     MAKE_CASE(AArch64ISD::DUPLANE8)
@@ static SDValue performSVEAndCombine(SDNode *N,
   }

   if (isConstantSplatVectorMaskForType(Mask.getNode(), MemVT))
     return Src;

   return SDValue();
 }

+// (~X | C) & Y --> bic Y, (X & ~C)
+static SDValue performAndCombineWithNotOp(SDNode *N, SelectionDAG &DAG) {
+  EVT VT = N->getValueType(0);
+  if (VT != MVT::i32 && VT != MVT::i64)
+    return SDValue();
+
+  SDValue LHS = N->getOperand(0);
+  SDValue RHS = N->getOperand(1);
+
+  SDLoc DL(N);
+  SDValue X, C;
+  auto canGetNotFromOrXor = [&](SDNode *N, SDValue Op0) {
+    if (Op0.getOpcode() != ISD::OR || !Op0.hasOneUse())
+      return false;
+
+    if (!isa<ConstantSDNode>(Op0.getOperand(1)))
+      return false;
+
+    SDValue Op0LHS = Op0.getOperand(0);
+    if (!Op0LHS.hasOneUse())
+      return false;
+
+    if (Op0LHS->getOpcode() == ISD::ANY_EXTEND && VT == MVT::i64 &&
+        Op0LHS->getOperand(0).getValueType() == MVT::i32)
+      Op0LHS = Op0LHS->getOperand(0);
+
+    if (Op0LHS.getOpcode() != ISD::XOR || !Op0LHS.hasOneUse())
+      return false;
+
+    ConstantSDNode *XorC = dyn_cast<ConstantSDNode>(Op0LHS.getOperand(1));
+    if (!XorC || !XorC->isAllOnes())
+      return false;
+
+    C = DAG.getNOT(DL, Op0.getOperand(1), VT);
+    X = Op0LHS.getOperand(0);
+    return true;
+  };
+
+  if (canGetNotFromOrXor(N, LHS) && !isa<ConstantSDNode>(RHS)) {
+    if (X.getValueType() != VT)
+      X = DAG.getNode(ISD::ANY_EXTEND, DL, VT, X);
+    SDValue AndVal = DAG.getNode(ISD::AND, DL, VT, X, C);
+    return DAG.getNode(AArch64ISD::SBIC, DL, VT, RHS, AndVal);
+  }
+
+  if (canGetNotFromOrXor(N, RHS)) {
+    if (X.getValueType() != VT)
+      X = DAG.getNode(ISD::ANY_EXTEND, DL, VT, X);
+    SDValue AndVal = DAG.getNode(ISD::AND, DL, VT, X, C);
+    return DAG.getNode(AArch64ISD::SBIC, DL, VT, LHS, AndVal);
+  }
+
+  return SDValue();
+}

 static SDValue performANDCombine(SDNode *N,
                                  TargetLowering::DAGCombinerInfo &DCI) {
   SelectionDAG &DAG = DCI.DAG;
   SDValue LHS = N->getOperand(0);
   SDValue RHS = N->getOperand(1);
   EVT VT = N->getValueType(0);

   if (SDValue R = performANDORCSELCombine(N, DAG))
     return R;

   if (!DAG.getTargetLoweringInfo().isTypeLegal(VT))
     return SDValue();

   if (VT.isScalableVector())
     return performSVEAndCombine(N, DCI);

+  if (SDValue R = performAndCombineWithNotOp(N, DAG))
+    return R;
+
   // The combining code below works only for NEON vectors. In particular, it
   // does not work for SVE when dealing with vectors wider than 128 bits.
   if (!VT.is64BitVector() && !VT.is128BitVector())
     return SDValue();

   BuildVectorSDNode *BVN = dyn_cast<BuildVectorSDNode>(RHS.getNode());
   if (!BVN)
     return SDValue();

llvm/lib/Target/AArch64/AArch64InstrInfo.td

@@ def AArch64and_flag : SDNode<"AArch64ISD::ANDS", SDTBinaryArithWithFlagsOut,
     [SDNPCommutative]>;
 def AArch64adc_flag : SDNode<"AArch64ISD::ADCS", SDTBinaryArithWithFlagsInOut>;
 def AArch64sbc_flag : SDNode<"AArch64ISD::SBCS", SDTBinaryArithWithFlagsInOut>;

 def AArch64ccmp : SDNode<"AArch64ISD::CCMP", SDT_AArch64CCMP>;
 def AArch64ccmn : SDNode<"AArch64ISD::CCMN", SDT_AArch64CCMP>;
 def AArch64fccmp : SDNode<"AArch64ISD::FCCMP", SDT_AArch64FCCMP>;

+def AArch64sbic : SDNode<"AArch64ISD::SBIC", SDTIntBinOp>;
+
 def AArch64threadpointer : SDNode<"AArch64ISD::THREAD_POINTER", SDTPtrLeaf>;

 def AArch64fcmp : SDNode<"AArch64ISD::FCMP", SDT_AArch64FCmp>;
 def AArch64strict_fcmp : SDNode<"AArch64ISD::STRICT_FCMP", SDT_AArch64FCmp,
     [SDNPHasChain]>;
 def AArch64strict_fcmpe : SDNode<"AArch64ISD::STRICT_FCMPE", SDT_AArch64FCmp,
     [SDNPHasChain]>;
 def AArch64any_fcmp : PatFrags<(ops node:$lhs, node:$rhs),
@@ def : InstAlias<"tst $src1, $src2$sh",
     (ANDSWrs WZR, GPR32:$src1, GPR32:$src2, logical_shift32:$sh), 2>;
 def : InstAlias<"tst $src1, $src2$sh",
     (ANDSXrs XZR, GPR64:$src1, GPR64:$src2, logical_shift64:$sh), 2>;


 def : Pat<(not GPR32:$Wm), (ORNWrr WZR, GPR32:$Wm)>;
 def : Pat<(not GPR64:$Xm), (ORNXrr XZR, GPR64:$Xm)>;

+multiclass LogicalRegPat<SDPatternOperator OpNode, string INST> {
+  def : Pat<(OpNode GPR32:$Wn, GPR32:$Wm),
+            (!cast<Instruction>(INST # Wrr) GPR32:$Wn, GPR32:$Wm)>;
+
+  def : Pat<(OpNode GPR64:$Xn, GPR64:$Xm),
+            (!cast<Instruction>(INST # Xrr) GPR64:$Xn, GPR64:$Xm)>;
+
+  def : Pat<(OpNode GPR32:$Wn,
+                    (logical_shifted_reg32 GPR32:$Wm, logical_shift32:$shift)),
+            (!cast<Instruction>(INST # Wrs) GPR32:$Wn, GPR32:$Wm,
+                                            logical_shift32:$shift)>;
+
+  def : Pat<(OpNode GPR64:$Xn,
+                    (logical_shifted_reg64 GPR64:$Xm, logical_shift64:$shift)),
+            (!cast<Instruction>(INST # Xrs) GPR64:$Xn, GPR64:$Xm,
+                                            logical_shift64:$shift)>;
+}
+
+defm : LogicalRegPat<AArch64sbic, "BIC">;
+

 //===----------------------------------------------------------------------===//
 // One operand data processing instructions.
 //===----------------------------------------------------------------------===//

 defm CLS : OneOperandData<0b000101, "cls">;
 defm CLZ : OneOperandData<0b000100, "clz", ctlz>;
 defm RBIT : OneOperandData<0b000000, "rbit", bitreverse>;

llvm/test/CodeGen/AArch64/logical-op-with-not.ll

 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc -verify-machineinstrs < %s -mtriple=aarch64-none-linux-gnu | FileCheck %s

 define i64 @and_bic(i64 %0, i64 %1) {
 ; CHECK-LABEL: and_bic:
 ; CHECK: // %bb.0:
-; CHECK-NEXT: mvn w8, w0
-; CHECK-NEXT: orr x8, x8, #0xffffffffffff00ff
-; CHECK-NEXT: and x0, x8, x1
+; CHECK-NEXT: and x8, x0, #0xff00
+; CHECK-NEXT: bic x0, x1, x8
 ; CHECK-NEXT: ret
   %3 = and i64 %0, 65280
   %4 = xor i64 %3, -1
   %5 = and i64 %4, %1
   ret i64 %5
 }

 define i64 @and_bic2(i32 %0, i64 %1) {
 ; CHECK-LABEL: and_bic2:
 ; CHECK: // %bb.0:
 ; CHECK-NEXT: mvn w8, w0
 ; CHECK-NEXT: orr w8, w8, #0xffff00ff
 ; CHECK-NEXT: and x0, x8, x1
 ; CHECK-NEXT: ret
   %3 = and i32 %0, 65280
   %4 = xor i32 %3, -1
   %5 = zext i32 %4 to i64
   %6 = and i64 %5, %1
   ret i64 %6
 }

 define i32 @and_bic3(i32 %0, i32 %1) {
 ; CHECK-LABEL: and_bic3:
 ; CHECK: // %bb.0:
-; CHECK-NEXT: mvn w8, w0
-; CHECK-NEXT: orr w8, w8, #0xffff00ff
-; CHECK-NEXT: and w0, w8, w1
+; CHECK-NEXT: and w8, w0, #0xff00
+; CHECK-NEXT: bic w0, w1, w8
 ; CHECK-NEXT: ret
   %3 = and i32 %0, 65280
   %4 = xor i32 %3, -1
   %5 = and i32 %4, %1
   ret i32 %5
 }

 define i64 @and_eon(i64 %0, i64 %1) {
 ; CHECK-LABEL: and_eon:
 ; CHECK: // %bb.0:
 ; CHECK-NEXT: and x8, x0, #0xff00
 ; CHECK-NEXT: eon x0, x8, x1

llvm/test/CodeGen/AArch64/shiftregister-from-and.ll

 ; CHECK-NEXT: and x0, x8, #0xffffffffff000000
 ; CHECK-NEXT: ret
   %ashr = ashr i64 %a, 23
   %and = and i64 %ashr, -16777216
   %r = and i64 %b, %and
   ret i64 %r
 }

-; TODO: logic shift reg pattern: bic

 define i64 @bic_shiftedreg_from_and(i64 %a, i64 %b) {
 ; CHECK-LABEL: bic_shiftedreg_from_and:
 ; CHECK: // %bb.0:
-; CHECK-NEXT: mov w8, #16777215
-; CHECK-NEXT: orn x8, x8, x0, asr #23
-; CHECK-NEXT: and x0, x1, x8
+; CHECK-NEXT: asr x8, x0, #47
+; CHECK-NEXT: bic x0, x1, x8, lsl #24
 ; CHECK-NEXT: ret
   %ashr = ashr i64 %a, 23
   %and = and i64 %ashr, -16777216
   %not = xor i64 %and, -1
   %r = and i64 %b, %not
   ret i64 %r
 }
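
The two instruction sequences for @bic_shiftedreg_from_and can be cross-checked outside the compiler. The sketch below is only an illustration with made-up function names. It models the IR, the old `mov`/`orn`/`and` sequence and the new `asr`/`bic` sequence as plain 64-bit arithmetic, and assumes `>>` on a negative `int64_t` is an arithmetic shift (guaranteed since C++20, and the usual behaviour of mainstream compilers before that).

```cpp
#include <cassert>
#include <cstdint>

// IR semantics: r = b & ~((a ashr 23) & -16777216).
inline std::uint64_t referenceIR(std::int64_t a, std::uint64_t b) {
  std::uint64_t ashr = static_cast<std::uint64_t>(a >> 23);
  std::uint64_t masked = ashr & static_cast<std::uint64_t>(-16777216LL);
  return b & ~masked;
}

// Old output: mov w8, #16777215 ; orn x8, x8, x0, asr #23 ; and x0, x1, x8.
inline std::uint64_t oldSequence(std::int64_t a, std::uint64_t b) {
  std::uint64_t x8 = 16777215u;                     // mov
  x8 = x8 | ~static_cast<std::uint64_t>(a >> 23);   // orn with asr #23
  return b & x8;                                    // and
}

// New output: asr x8, x0, #47 ; bic x0, x1, x8, lsl #24.
inline std::uint64_t newSequence(std::int64_t a, std::uint64_t b) {
  std::uint64_t x8 = static_cast<std::uint64_t>(a >> 47);  // asr
  return b & ~(x8 << 24);                                  // bic with lsl #24
}

// True when both sequences match the IR semantics for the given inputs.
inline bool sequencesAgree(std::int64_t a, std::uint64_t b) {
  std::uint64_t expected = referenceIR(a, b);
  return oldSequence(a, b) == expected && newSequence(a, b) == expected;
}
```

The new sequence works because clearing the low 24 bits of `a >> 23` gives the same value as `(a >> 47) << 24`, so the mask can be folded into the shifted-register operand of `bic`.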
