This is an archive of the discontinued LLVM Phabricator instance.

Differential D49924

[DAGCombiner] transform sub-of-shifted-signbit to add
Closed · Public

Authored by spatel on Jul 27 2018, 11:49 AM.

Details

Reviewers

craig.topper
efriedma
lebedev.ri
javed.absar
dmgreen
aemerson
evandro
fhahn

Commits

rG9f807f44b101: [DAGCombiner] transform sub-of-shifted-signbit to add
rL338317: [DAGCombiner] transform sub-of-shifted-signbit to add

Summary

This is exchanging a sub-of-1 with add-of-minus-1:
https://rise4fun.com/Alive/plKAH
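
The identity can also be spot-checked outside of Alive. The following is a minimal sketch, assuming 32-bit values; the helper names (`lshr31`, `ashr31`, `subOfLshr`, `addOfAshr`, `checkFoldSamples`) are illustrative and do not appear in the patch. The arithmetic shift is spelled out by hand so the sketch does not depend on how the compiler treats right shifts of negative signed values.

```cpp
#include <cassert>
#include <cstdint>

// Logical shift right by 31: 1 if the sign bit of x is set, else 0.
constexpr uint32_t lshr31(uint32_t x) { return x >> 31; }

// Arithmetic shift right by 31, spelled out by hand: all-ones (-1) if the
// sign bit of x is set, else 0.
constexpr uint32_t ashr31(uint32_t x) {
  return (x & 0x80000000u) ? 0xFFFFFFFFu : 0u;
}

// Before the fold: sub y, (lshr x, 31)
constexpr uint32_t subOfLshr(uint32_t y, uint32_t x) { return y - lshr31(x); }

// After the fold: add y, (ashr x, 31)
constexpr uint32_t addOfAshr(uint32_t y, uint32_t x) { return y + ashr31(x); }

// Spot-check both forms on sign-bit-set and sign-bit-clear inputs.
inline void checkFoldSamples() {
  const uint32_t xs[] = {0u, 1u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu};
  const uint32_t ys[] = {0u, 42u, 0xFFFFFFFFu};
  for (uint32_t x : xs)
    for (uint32_t y : ys)
      assert(subOfLshr(y, x) == addOfAshr(y, x));
}
```

Subtracting 1 and adding -1 wrap to the same value in two's complement, so the two forms agree for every input, not only the samples checked here.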

This is another step towards improving select-of-constants codegen (see D48970).
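
As a concrete instance, the `sel_ifneg_fval_bigger` test touched by this patch selects 41 when `x` is negative and 42 otherwise. A minimal sketch of how that select corresponds to the add-of-ashr form follows; the function names are illustrative only, and the value of `ashr x, 31` is written as a comparison rather than a signed shift.

```cpp
#include <cassert>
#include <cstdint>

// Source-level form: select (x < 0), 41, 42
constexpr int32_t selectForm(int32_t x) { return x < 0 ? 41 : 42; }

// Value of 'ashr x, 31', spelled out by hand: -1 if x is negative, else 0.
constexpr int32_t signSplat(int32_t x) { return x < 0 ? -1 : 0; }

// Form after the fold: add (ashr x, 31), 42
constexpr int32_t addForm(int32_t x) { return signSplat(x) + 42; }

static_assert(selectForm(-5) == addForm(-5), "negative input selects 41");
static_assert(selectForm(7) == addForm(7), "non-negative input selects 42");

// Runtime spot-check at the extremes of the 32-bit range.
inline void checkSelectSamples() {
  assert(selectForm(INT32_MIN) == addForm(INT32_MIN));
  assert(selectForm(INT32_MAX) == addForm(INT32_MAX));
  assert(selectForm(0) == addForm(0));
}
```

In the x86 test expectations for this function, the `shrl`/`movl`/`subl` sequence becomes `sarl` followed by a single `leal 42(%rdi), %eax`.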

x86 is the motivating target, and those diffs all appear to be wins. PPC looks neutral. I'm not sure about AArch64.
I've limited this to early combining (!LegalOperations) in case a target wants to reverse it, but I think canonicalizing to 'add' is more likely to produce further transforms because we have more folds for 'add'.

Note that we're also missing this canonicalization in IR, but I'm less sure which direction we should go in there. 'lshr' gives us better knownbits, but again the chance of subsequent folds seems more likely with 'add'. We should choose one form or the other.

Diff Detail

Repository: rL LLVM

Event Timeline

spatel created this revision. Jul 27 2018, 11:49 AM

Herald added a reviewer: javed.absar. Jul 27 2018, 11:49 AM

Herald added subscribers: kristof.beyls, nemanjai, mcrosier.

lebedev.ri added inline comments. Jul 27 2018, 12:29 PM

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
Line 2742 (on Diff #157724): // sub N0, (lshr N10, width-1) -> add N0, (ashr N10, width-1)

Patch updated:
Add a code comment to describe the transform and motivation.

The code and x86 test changes look OK to me.

In the AArch64 case, as far as I can tell, the main change is that we avoid having to materialize the immediate in a register,
although I'm not sure why we no longer fuse the shift into Operand2 of add/sub. A commutativity oversight?
https://godbolt.org/g/yAFJS4 - but I don't think I'm comparing them correctly; that syntax is rather alien to me.
So yeah, not sure about AArch64.

This revision is now accepted and ready to land. Jul 30 2018, 4:50 AM

In D49924#1180044, @lebedev.ri wrote:

The code and x86 test changes look OK to me.

Thanks!

In the AArch64 case, as far as I can tell, the main change is that we avoid having to materialize the immediate in a register,
although I'm not sure why we no longer fuse the shift into Operand2 of add/sub. A commutativity oversight?
https://godbolt.org/g/yAFJS4 - but I don't think I'm comparing them correctly; that syntax is rather alien to me.
So yeah, not sure about AArch64.

Let me add some more ARM experts to see if we can get an answer on those test diffs; I don't know if we can trust llvm-mca for AArch64 yet.

Yes, the AArch64 issue is that we decide to use add-with-immediate rather than sub-with-shifted-operand. Which form is better probably depends on the CPU and the context: by itself, on an A57 you're trading an expensive instruction for a cheaper instruction, but if the immediate can be hoisted out of a hot loop, the expensive instruction might be cheaper than two cheap instructions. But I think the sub-with-shifted-operand might be cheaper on other CPUs?

Anyway, that's basically independent of this patch, so don't worry about it.

In D49924#1180892, @efriedma wrote:

Yes, the AArch64 issue is that we decide to use add-with-immediate rather than sub-with-shifted-operand. Which form is better probably depends on the CPU and the context: by itself, on an A57 you're trading an expensive instruction for a cheaper instruction, but if the immediate can be hoisted out of a hot loop, the expensive instruction might be cheaper than two cheap instructions. But I think the sub-with-shifted-operand might be cheaper on other CPUs?

Anyway, that's basically independent of this patch, so don't worry about it.

Thanks!

Closed by commit rL338317: [DAGCombiner] transform sub-of-shifted-signbit to add (authored by spatel). Jul 30 2018, 3:22 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                                   Size
llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp    11 lines
llvm/trunk/test/CodeGen/AArch64/signbit-shift.ll       18 lines
llvm/trunk/test/CodeGen/PowerPC/signbit-shift.ll       16 lines
llvm/trunk/test/CodeGen/X86/signbit-shift.ll           36 lines

Diff 158104

llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

[...]
   if (N1.getOpcode() == ISD::SIGN_EXTEND_INREG) {
     VTSDNode *TN = cast<VTSDNode>(N1.getOperand(1));
     if (TN->getVT() == MVT::i1) {
       SDValue ZExt = DAG.getNode(ISD::AND, DL, VT, N1.getOperand(0),
                                  DAG.getConstant(1, DL, VT));
       return DAG.getNode(ISD::ADD, DL, VT, N0, ZExt);
     }
   }

+  // Prefer an add for more folding potential and possibly better codegen:
+  // sub N0, (lshr N10, width-1) --> add N0, (ashr N10, width-1)
+  if (!LegalOperations && N1.getOpcode() == ISD::SRL && N1.hasOneUse()) {
+    SDValue ShAmt = N1.getOperand(1);
+    ConstantSDNode *ShAmtC = isConstOrConstSplat(ShAmt);
+    if (ShAmtC && ShAmtC->getZExtValue() == N1.getScalarValueSizeInBits() - 1) {
+      SDValue SRA = DAG.getNode(ISD::SRA, DL, VT, N1.getOperand(0), ShAmt);
+      return DAG.getNode(ISD::ADD, DL, VT, N0, SRA);
+    }
+  }
+
   return SDValue();
 }

 SDValue DAGCombiner::visitSUBC(SDNode *N) {
   SDValue N0 = N->getOperand(0);
   SDValue N1 = N->getOperand(1);
   EVT VT = N0.getValueType();
   SDLoc DL(N);
[...]
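
To make the matching conditions in the new block easier to follow in isolation, here is a minimal sketch on a toy expression graph. It is not LLVM code: `Op`, `Node`, `NodeRef`, `make`, and `foldSubOfShiftedSignbit` are hypothetical stand-ins, the `uses` and `bits` fields only approximate `hasOneUse()` and `getScalarValueSizeInBits()`, and the splat-constant case handled by `isConstOrConstSplat` is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Toy stand-ins for SelectionDAG concepts; none of these types are LLVM's.
enum class Op { Value, Const, Sub, Add, Srl, Sra };

struct Node {
  Op op = Op::Value;
  uint64_t imm = 0;   // constant payload, used when op == Op::Const
  unsigned bits = 32; // scalar width of the value this node produces
  unsigned uses = 1;  // number of users of this node
  std::shared_ptr<Node> lhs, rhs;
};
using NodeRef = std::shared_ptr<Node>;

NodeRef make(Op op, NodeRef l = nullptr, NodeRef r = nullptr,
             uint64_t imm = 0) {
  auto n = std::make_shared<Node>();
  n->op = op;
  n->lhs = std::move(l);
  n->rhs = std::move(r);
  n->imm = imm;
  return n;
}

// Mirrors the shape of the check added to visitSUB:
//   sub N0, (srl X, width-1)  -->  add N0, (sra X, width-1)
// Returns nullptr when the pattern does not match.
NodeRef foldSubOfShiftedSignbit(const NodeRef &sub, bool legalOperations) {
  assert(sub && "expected a node");
  // Only fire before operation legalization, and only on a subtract.
  if (legalOperations || sub->op != Op::Sub)
    return nullptr;
  // The subtracted operand must be a single-use logical shift right.
  const NodeRef &n1 = sub->rhs;
  if (!n1 || n1->op != Op::Srl || n1->uses != 1)
    return nullptr;
  // The shift amount must be the constant width-1 (isolating the sign bit).
  const NodeRef &shAmt = n1->rhs;
  if (!shAmt || shAmt->op != Op::Const || shAmt->imm != n1->bits - 1)
    return nullptr;
  NodeRef sra = make(Op::Sra, n1->lhs, shAmt);
  return make(Op::Add, sub->lhs, sra);
}
```

In this toy model, the one-use check means the fold never leaves the original logical shift alive next to the new arithmetic shift, and the `legalOperations` guard corresponds to the summary's point that a target can still reverse the fold later.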

llvm/trunk/test/CodeGen/AArch64/signbit-shift.ll

[...]
 ; CHECK-NEXT: ret
   %c = icmp slt i32 %x, 0
   %r = sext i1 %c to i32
   ret i32 %r
 }

 define i32 @add_sext_ifneg(i32 %x) {
 ; CHECK-LABEL: add_sext_ifneg:
 ; CHECK: // %bb.0:
-; CHECK-NEXT: mov w8, #42
-; CHECK-NEXT: sub w0, w8, w0, lsr #31
+; CHECK-NEXT: asr w8, w0, #31
+; CHECK-NEXT: add w0, w8, #42 // =42
 ; CHECK-NEXT: ret
   %c = icmp slt i32 %x, 0
   %e = sext i1 %c to i32
   %r = add i32 %e, 42
   ret i32 %r
 }

 define i32 @sel_ifneg_fval_bigger(i32 %x) {
[...]
 ; CHECK-NEXT: ret
   %e = lshr <4 x i32> %c, <i32 31, i32 31, i32 31, i32 31>
   %r = sub <4 x i32> <i32 42, i32 42, i32 42, i32 42>, %e
   ret <4 x i32> %r
 }

 define i32 @sub_lshr(i32 %x, i32 %y) {
 ; CHECK-LABEL: sub_lshr:
 ; CHECK: // %bb.0:
-; CHECK-NEXT: sub w0, w1, w0, lsr #31
+; CHECK-NEXT: add w0, w1, w0, asr #31
 ; CHECK-NEXT: ret
   %sh = lshr i32 %x, 31
   %r = sub i32 %y, %sh
   ret i32 %r
 }

 define <4 x i32> @sub_lshr_vec(<4 x i32> %x, <4 x i32> %y) {
 ; CHECK-LABEL: sub_lshr_vec:
 ; CHECK: // %bb.0:
-; CHECK-NEXT: ushr v0.4s, v0.4s, #31
-; CHECK-NEXT: sub v0.4s, v1.4s, v0.4s
+; CHECK-NEXT: ssra v1.4s, v0.4s, #31
+; CHECK-NEXT: mov v0.16b, v1.16b
 ; CHECK-NEXT: ret
   %sh = lshr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
   %r = sub <4 x i32> %y, %sh
   ret <4 x i32> %r
 }

 define i32 @sub_const_op_lshr(i32 %x) {
 ; CHECK-LABEL: sub_const_op_lshr:
 ; CHECK: // %bb.0:
-; CHECK-NEXT: mov w8, #43
-; CHECK-NEXT: sub w0, w8, w0, lsr #31
+; CHECK-NEXT: asr w8, w0, #31
+; CHECK-NEXT: add w0, w8, #43 // =43
 ; CHECK-NEXT: ret
   %sh = lshr i32 %x, 31
   %r = sub i32 43, %sh
   ret i32 %r
 }

 define <4 x i32> @sub_const_op_lshr_vec(<4 x i32> %x) {
 ; CHECK-LABEL: sub_const_op_lshr_vec:
 ; CHECK: // %bb.0:
-; CHECK-NEXT: ushr v0.4s, v0.4s, #31
 ; CHECK-NEXT: movi v1.4s, #42
-; CHECK-NEXT: sub v0.4s, v1.4s, v0.4s
+; CHECK-NEXT: ssra v1.4s, v0.4s, #31
+; CHECK-NEXT: mov v0.16b, v1.16b
 ; CHECK-NEXT: ret
   %sh = lshr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
   %r = sub <4 x i32> <i32 42, i32 42, i32 42, i32 42>, %sh
   ret <4 x i32> %r
 }

llvm/trunk/test/CodeGen/PowerPC/signbit-shift.ll

[...]
 ; CHECK-NEXT: blr
   %e = lshr <4 x i32> %c, <i32 31, i32 31, i32 31, i32 31>
   %r = sub <4 x i32> <i32 42, i32 42, i32 42, i32 42>, %e
   ret <4 x i32> %r
 }

 define i32 @sub_lshr(i32 %x, i32 %y) {
 ; CHECK-LABEL: sub_lshr:
 ; CHECK: # %bb.0:
-; CHECK-NEXT: srwi 3, 3, 31
-; CHECK-NEXT: subf 3, 3, 4
+; CHECK-NEXT: srawi 3, 3, 31
+; CHECK-NEXT: add 3, 4, 3
 ; CHECK-NEXT: blr
   %sh = lshr i32 %x, 31
   %r = sub i32 %y, %sh
   ret i32 %r
 }

 define <4 x i32> @sub_lshr_vec(<4 x i32> %x, <4 x i32> %y) {
 ; CHECK-LABEL: sub_lshr_vec:
 ; CHECK: # %bb.0:
 ; CHECK-NEXT: vspltisw 4, -16
 ; CHECK-NEXT: vspltisw 5, 15
 ; CHECK-NEXT: vsubuwm 4, 5, 4
-; CHECK-NEXT: vsrw 2, 2, 4
-; CHECK-NEXT: vsubuwm 2, 3, 2
+; CHECK-NEXT: vsraw 2, 2, 4
+; CHECK-NEXT: vadduwm 2, 3, 2
 ; CHECK-NEXT: blr
   %sh = lshr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
   %r = sub <4 x i32> %y, %sh
   ret <4 x i32> %r
 }

 define i32 @sub_const_op_lshr(i32 %x) {
 ; CHECK-LABEL: sub_const_op_lshr:
 ; CHECK: # %bb.0:
-; CHECK-NEXT: srwi 3, 3, 31
-; CHECK-NEXT: subfic 3, 3, 43
+; CHECK-NEXT: srawi 3, 3, 31
+; CHECK-NEXT: addi 3, 3, 43
 ; CHECK-NEXT: blr
   %sh = lshr i32 %x, 31
   %r = sub i32 43, %sh
   ret i32 %r
 }

 define <4 x i32> @sub_const_op_lshr_vec(<4 x i32> %x) {
 ; CHECK-LABEL: sub_const_op_lshr_vec:
 ; CHECK: # %bb.0:
 ; CHECK-NEXT: vspltisw 3, -16
 ; CHECK-NEXT: vspltisw 4, 15
 ; CHECK-NEXT: addis 3, 2, .LCPI21_0@toc@ha
 ; CHECK-NEXT: addi 3, 3, .LCPI21_0@toc@l
 ; CHECK-NEXT: vsubuwm 3, 4, 3
-; CHECK-NEXT: vsrw 2, 2, 3
+; CHECK-NEXT: vsraw 2, 2, 3
 ; CHECK-NEXT: lvx 3, 0, 3
-; CHECK-NEXT: vsubuwm 2, 3, 2
+; CHECK-NEXT: vadduwm 2, 2, 3
 ; CHECK-NEXT: blr
   %sh = lshr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
   %r = sub <4 x i32> <i32 42, i32 42, i32 42, i32 42>, %sh
   ret <4 x i32> %r
 }

llvm/trunk/test/CodeGen/X86/signbit-shift.ll

[...]
 ; CHECK-NEXT: retq
   %c = icmp slt i32 %x, 0
   %r = sext i1 %c to i32
   ret i32 %r
 }

 define i32 @add_sext_ifneg(i32 %x) {
 ; CHECK-LABEL: add_sext_ifneg:
 ; CHECK: # %bb.0:
-; CHECK-NEXT: shrl $31, %edi
-; CHECK-NEXT: movl $42, %eax
-; CHECK-NEXT: subl %edi, %eax
+; CHECK-NEXT: # kill: def $edi killed $edi def $rdi
+; CHECK-NEXT: sarl $31, %edi
+; CHECK-NEXT: leal 42(%rdi), %eax
 ; CHECK-NEXT: retq
   %c = icmp slt i32 %x, 0
   %e = sext i1 %c to i32
   %r = add i32 %e, 42
   ret i32 %r
 }

 define i32 @sel_ifneg_fval_bigger(i32 %x) {
 ; CHECK-LABEL: sel_ifneg_fval_bigger:
 ; CHECK: # %bb.0:
-; CHECK-NEXT: shrl $31, %edi
-; CHECK-NEXT: movl $42, %eax
-; CHECK-NEXT: subl %edi, %eax
+; CHECK-NEXT: # kill: def $edi killed $edi def $rdi
+; CHECK-NEXT: sarl $31, %edi
+; CHECK-NEXT: leal 42(%rdi), %eax
 ; CHECK-NEXT: retq
   %c = icmp slt i32 %x, 0
   %r = select i1 %c, i32 41, i32 42
   ret i32 %r
 }

 define i32 @add_lshr_not(i32 %x) {
 ; CHECK-LABEL: add_lshr_not:
[...]
 ; CHECK-NEXT: retq
   %e = lshr <4 x i32> %c, <i32 31, i32 31, i32 31, i32 31>
   %r = sub <4 x i32> <i32 42, i32 42, i32 42, i32 42>, %e
   ret <4 x i32> %r
 }

 define i32 @sub_lshr(i32 %x, i32 %y) {
 ; CHECK-LABEL: sub_lshr:
 ; CHECK: # %bb.0:
-; CHECK-NEXT: shrl $31, %edi
-; CHECK-NEXT: subl %edi, %esi
-; CHECK-NEXT: movl %esi, %eax
+; CHECK-NEXT: # kill: def $esi killed $esi def $rsi
+; CHECK-NEXT: # kill: def $edi killed $edi def $rdi
+; CHECK-NEXT: sarl $31, %edi
+; CHECK-NEXT: leal (%rdi,%rsi), %eax
 ; CHECK-NEXT: retq
   %sh = lshr i32 %x, 31
   %r = sub i32 %y, %sh
   ret i32 %r
 }

 define <4 x i32> @sub_lshr_vec(<4 x i32> %x, <4 x i32> %y) {
 ; CHECK-LABEL: sub_lshr_vec:
 ; CHECK: # %bb.0:
-; CHECK-NEXT: psrld $31, %xmm0
-; CHECK-NEXT: psubd %xmm0, %xmm1
-; CHECK-NEXT: movdqa %xmm1, %xmm0
+; CHECK-NEXT: psrad $31, %xmm0
+; CHECK-NEXT: paddd %xmm1, %xmm0
 ; CHECK-NEXT: retq
   %sh = lshr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
   %r = sub <4 x i32> %y, %sh
   ret <4 x i32> %r
 }

 define i32 @sub_const_op_lshr(i32 %x) {
 ; CHECK-LABEL: sub_const_op_lshr:
 ; CHECK: # %bb.0:
-; CHECK-NEXT: shrl $31, %edi
-; CHECK-NEXT: xorl $43, %edi
-; CHECK-NEXT: movl %edi, %eax
+; CHECK-NEXT: # kill: def $edi killed $edi def $rdi
+; CHECK-NEXT: sarl $31, %edi
+; CHECK-NEXT: leal 43(%rdi), %eax
 ; CHECK-NEXT: retq
   %sh = lshr i32 %x, 31
   %r = sub i32 43, %sh
   ret i32 %r
 }

 define <4 x i32> @sub_const_op_lshr_vec(<4 x i32> %x) {
 ; CHECK-LABEL: sub_const_op_lshr_vec:
 ; CHECK: # %bb.0:
-; CHECK-NEXT: psrld $31, %xmm0
-; CHECK-NEXT: movdqa {{.*#+}} xmm1 = [42,42,42,42]
-; CHECK-NEXT: psubd %xmm0, %xmm1
-; CHECK-NEXT: movdqa %xmm1, %xmm0
+; CHECK-NEXT: psrad $31, %xmm0
+; CHECK-NEXT: paddd {{.*}}(%rip), %xmm0
 ; CHECK-NEXT: retq
   %sh = lshr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
   %r = sub <4 x i32> <i32 42, i32 42, i32 42, i32 42>, %sh
   ret <4 x i32> %r
 }
