Details

Reviewers

nikic
zvi
lebedev.ri
xbolva00
joanlluch

Commits

rG455ce7816ce4: [InstCombine] fold a shifted bool zext to a select (2nd try)
rL374886: [InstCombine] fold a shifted bool zext to a select (2nd try)
rG1f40f15d54aa: [InstCombine] fold a shifted bool zext to a select
rL374828: [InstCombine] fold a shifted bool zext to a select

Summary

For a constant shift amount, add the following fold.
shl (zext (i1 X)), ShAmt --> select (X, 1 << ShAmt, 0)

https://rise4fun.com/Alive/IZ9

Fixes PR42257.
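The identity behind the fold can be checked exhaustively for one fixed width. The following is a minimal sketch on plain 32-bit integers, not the InstCombine code; the function names are hypothetical, and the 32-bit width and the restriction to in-range shift amounts are assumptions made for illustration.

```cpp
#include <cstdint>

// Left-hand side of the fold: shl (zext i1 X), ShAmt, modeled on 32-bit integers.
std::uint32_t shlOfZext(bool x, unsigned shAmt) {
  return static_cast<std::uint32_t>(x) << shAmt;
}

// Right-hand side of the fold: select X, (1 << ShAmt), 0.
std::uint32_t selectOfConstants(bool x, unsigned shAmt) {
  return x ? (std::uint32_t{1} << shAmt) : std::uint32_t{0};
}

// Compares both forms for every in-range shift amount (0..31) and both
// values of the bool. Returns true if they agree everywhere.
bool foldHoldsForAllInputs() {
  for (unsigned shAmt = 0; shAmt < 32; ++shAmt)
    for (int x = 0; x <= 1; ++x)
      if (shlOfZext(x != 0, shAmt) != selectOfConstants(x != 0, shAmt))
        return false;
  return true;
}
```

The Alive link above is the proof for arbitrary bit widths; this check only covers the single width modeled here.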

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

zvi created this revision. Jun 15 2019, 4:47 PM

Herald added a project: Restricted Project. Jun 15 2019, 4:47 PM

Herald added a subscriber: hiraditya.

Don't know if this is the canonicalization we want, but some notes

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
858	vectors of i1 too (`isIntOrIntVectorTy(1)`)
859	this will overflow, you want ConstantInt::get(Ty, APInt(Ty->getScalarType()->getBitWidth(), 1) << ShAmt) or something

zvi added a subscriber: zvi. Jun 16 2019, 1:30 AM

Addressed comments by @lebedev.ri
Thanks!

zvi marked 2 inline comments as done. Jun 16 2019, 2:27 AM

lebedev.ri added inline comments. Jun 16 2019, 2:37 AM

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
919–920	I think you want to add the fold here, no reason to restrict it to identical vectors

zvi marked an inline comment as done. Jun 16 2019, 3:51 AM

zvi added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
919–920	What is the policy for handling non-splat cases, should they always be handled when possible or are they limited only to certain cases to reduce compile-time?

Following the suggestion by @lebedev.ri, added support for non-splat vector shift amounts.

xbolva00 added a subscriber: xbolva00. Jun 16 2019, 5:21 AM

xbolva00 added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
933	LHS/RHS don't match the comment

Fix typo in comment pointed out by @xbolva00

zvi edited the summary of this revision. Jun 16 2019, 9:53 AM

zvi marked 2 inline comments as done.

IMO, this is the right canonicalization for IR because it's the smallest form. Also if the code was already in this form, it might have profile data on the select condition that would benefit codegen (for example, we might want to compare and branch on this instead of turning it into bit-logic). I've made several related changes to prefer 'select' in IR over bithacks.

The only problem that I see is that the backend (DAGCombiner) isn't prepared to generically reverse this pattern yet (x86 might have some code that can be lifted):

define i32 @shl_zext_bool(i1 %t) {
  %ext = zext i1 %t to i32
  %shl = shl i32 %ext, 7
  ret i32 %shl
}

define i32 @sel_zext_bool(i1 %t) {
  %shl = select i1 %t, i32 128, i32 0
  ret i32 %shl
}

$ ./llc -o - selsh.ll -mtriple=aarch64--
shl_zext_bool:                          // @shl_zext_bool
	and	w8, w0, #0x1
	lsl	w0, w8, #7
	ret

sel_zext_bool:                          // @sel_zext_bool
	tst	w0, #0x1
	mov	w8, #128
	csel	w0, w8, wzr, ne
	ret
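The reversal that the comment above asks of the backend can be sketched on plain integers: a select between a power-of-two constant and zero is equivalent to shifting the zero-extended condition by log2 of that constant. This is only an illustration of the arithmetic, not DAGCombiner code; `exactLog2`, `selectForm`, and `shiftForm` are hypothetical names, and the 32-bit width is an assumption.

```cpp
#include <cstdint>

// Returns log2(c) when c is a power of two, or -1 otherwise.
int exactLog2(std::uint32_t c) {
  if (c == 0 || (c & (c - 1)) != 0)
    return -1;
  int log = 0;
  while ((c >>= 1) != 0)
    ++log;
  return log;
}

// The select form: Cond ? Pow2C : 0.
std::uint32_t selectForm(bool cond, std::uint32_t pow2C) {
  return cond ? pow2C : 0u;
}

// The shift form: (zext Cond) << log2(Pow2C).
// Only meaningful when exactLog2(pow2C) >= 0; callers must check that first.
std::uint32_t shiftForm(bool cond, std::uint32_t pow2C) {
  return static_cast<std::uint32_t>(cond) << exactLog2(pow2C);
}
```

For the example above, `exactLog2(128)` is 7, so the select in `@sel_zext_bool` corresponds to the `and` plus `lsl #7` sequence generated for `@shl_zext_bool`.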

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
919–920	There's no official policy AFAIK. If it's easy enough to apply the transform to an arbitrary vector constant, then I'd prefer we do it. There should not be much compile-time overhead because (1) either way, we're checking each element of the vector constant to determine if it's a splat and (2) there probably aren't that many non-splat vector constants to begin with.
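Handling an arbitrary vector constant amounts to computing the shifted one lane by lane; a splat is just the case where every lane holds the same amount. Below is a minimal sketch using `std::array` as a stand-in for a constant vector. The helper names are hypothetical, and 32-bit lanes with in-range shift amounts are assumed.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Builds the per-lane "true" constants 1 << C[i] for an arbitrary constant
// shift-amount vector. Assumes every shAmts[i] is less than 32.
template <std::size_t N>
std::array<std::uint32_t, N> shiftedOnes(const std::array<unsigned, N>& shAmts) {
  std::array<std::uint32_t, N> result{};
  for (std::size_t i = 0; i < N; ++i)
    result[i] = std::uint32_t{1} << shAmts[i];
  return result;
}

// Per-lane select: lane i gets trueVals[i] if cond[i] is set, else 0.
template <std::size_t N>
std::array<std::uint32_t, N> selectLanes(
    const std::array<bool, N>& cond,
    const std::array<std::uint32_t, N>& trueVals) {
  std::array<std::uint32_t, N> result{};
  for (std::size_t i = 0; i < N; ++i)
    result[i] = cond[i] ? trueVals[i] : 0u;
  return result;
}
```

With shift amounts `{2, 3}`, `shiftedOnes` yields `{4, 8}`, which matches the `<i32 4, i32 8>` constant expected by `test_shl_zext_bool_vec` in the diff.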

In D63382#1547090, @spatel wrote:

The only problem that I see is that the backend (DAGCombiner) isn't prepared to generically reverse this pattern yet (x86 might have some code that can be lifted):

combineSelectOfTwoConstants seems to be the X86 handler you mentioned. I like your idea to migrate it to a target-independent combine.

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
919–920	This makes sense, thanks.

Reverse-ping?

spatel mentioned this in rL374397: [DAGCombiner] fold select-of-constants to shift. Oct 10 2019, 10:53 AM

spatel mentioned this in rG7b904ce7246b: [DAGCombiner] fold select-of-constants to shift.

spatel mentioned this in rG3b581ac80f72: [DAGCombiner] fold vselect-of-constants to shift. Oct 11 2019, 7:17 AM

spatel mentioned this in rL374555: [DAGCombiner] fold vselect-of-constants to shift.

I haven't tested this on trunk, but I added the DAGCombiner reversals:
rL374397
rL374555
...so this should be good to go. Should I commandeer?

There may still be regressions because most in-trunk targets don't enable the guarding hook. Example:
D68911
...but any target has the ability to change that as needed.

@joanlluch - this is a case where we could transform incoming shift to select and benefit small targets as you've noted.

In D63382#1707153, @spatel wrote:

@joanlluch - this is a case where we could transform incoming shift to select and benefit small targets as you've noted.

Indeed, this seems to replace a shift that would be executed in all cases by a shift that will execute only if the incoming value was 1.
I also agree with your comment above about "prefer 'select' in IR over bithacks".

@spatel I want to express my support for preferring selects over bit-manipulation instructions in IR, as stated above, so that such optimisations move to DAGCombine. Ideally, this should involve removing some of the existing InstCombineSelect transformations, particularly most of the ones in foldSelectInstWithICmp. However, as I explained earlier on llvm-dev, the DAGCombine code should incorporate hooks that let targets decide whether such bithacks are actually profitable or whether it is best to keep them as selects.

spatel mentioned this in rL374818: [InstCombine] add tests for select/shift transforms; NFC. Oct 14 2019, 1:25 PM

spatel mentioned this in rGbfaa1082e126: [InstCombine] add tests for select/shift transforms; NFC. Oct 14 2019, 1:35 PM

Commandeering to rebase.

Herald added a subscriber: mcrosier. Oct 14 2019, 2:15 PM

Patch updated:
Rebased after committing baseline tests and cosmetic changes to the code.

spatel added reviewers: lebedev.ri, xbolva00, joanlluch. Oct 14 2019, 2:23 PM

This revision is now accepted and ready to land. Oct 14 2019, 2:26 PM

lebedev.ri edited the summary of this revision. Oct 14 2019, 2:28 PM

Looks good.

Closed by commit rG1f40f15d54aa: [InstCombine] fold a shifted bool zext to a select (authored by spatel). Oct 14 2019, 3:00 PM

This revision was automatically updated to reflect the committed changes.

Reverted at rL374851 - appears to cause a test-suite failure and stage 2 failure.

This revision is now accepted and ready to land. Oct 14 2019, 4:54 PM

xbolva00 added inline comments. Oct 14 2019, 5:10 PM

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
942	Not guarded by if (match(Op1, m_Constant(C1))) { ?

spatel marked an inline comment as done. Oct 14 2019, 5:29 PM

spatel added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
942	Yikes - yes, that would do it...copy/pasted in the wrong spot.
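The exchange above suggests that the first commit ran the new fold outside the `m_Constant` guard, so it could fire without a constant shift amount. A minimal sketch of why that guard matters follows. `ShiftAmount` and `tryFoldShlOfZextBool` are hypothetical stand-ins rather than LLVM types or APIs, and leaving out-of-range shift amounts untouched is an assumption of this model.

```cpp
#include <cstdint>

// Hypothetical stand-in for a shift-amount operand: either a compile-time
// constant or an unknown runtime value.
struct ShiftAmount {
  bool isConstant;
  unsigned value;  // only meaningful when isConstant is true
};

// Tries to compute the "true" arm of the select, 1 << ShAmt, on 32 bits.
// Returns false (no fold, trueArm untouched) unless the shift amount is a
// known, in-range constant.
bool tryFoldShlOfZextBool(const ShiftAmount& shAmt, std::uint32_t& trueArm) {
  if (!shAmt.isConstant)  // the guard: without a constant there is nothing to fold
    return false;
  if (shAmt.value >= 32)  // assumption: leave out-of-range shifts alone
    return false;
  trueArm = std::uint32_t{1} << shAmt.value;
  return true;
}
```

The `test_shl_zext_bool_not_constant` test added in the diff covers the variable shift amount case, where the `zext` and `shl` must stay as they are.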

Closed by commit rG455ce7816ce4: [InstCombine] fold a shifted bool zext to a select (2nd try) (authored by spatel). Oct 15 2019, 6:15 AM

This revision was automatically updated to reflect the committed changes.

Diff 225019

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp

@@ if (match(Op1, m_APInt(ShAmtAPInt))) {
     Value *X;
     if (match(Op0, m_OneUse(m_ZExt(m_Value(X))))) {
       unsigned SrcWidth = X->getType()->getScalarSizeInBits();
       if (ShAmt < SrcWidth &&
           MaskedValueIsZero(X, APInt::getHighBitsSet(SrcWidth, ShAmt), 0, &I))
         return new ZExtInst(Builder.CreateShl(X, ShAmt), Ty);
     }

     // (X >> C) << C --> X & (-1 << C)
     if (match(Op0, m_Shr(m_Value(X), m_Specific(Op1)))) {
       APInt Mask(APInt::getHighBitsSet(BitWidth, BitWidth - ShAmt));
       return BinaryOperator::CreateAnd(X, ConstantInt::get(Ty, Mask));
     }

     // FIXME: we do not yet transform non-exact shr's. The backend (DAGCombine)
     // needs a few fixes for the rotate pattern recognition first.
     const APInt *ShOp1;
     if (match(Op0, m_Exact(m_Shr(m_Value(X), m_APInt(ShOp1))))) {
@@ Instruction *InstCombiner::visitShl(BinaryOperator &I) {
   Value *X;
   if (match(Op0, m_OneUse(m_Shr(m_Value(X), m_Specific(Op1))))) {
     Constant *AllOnes = ConstantInt::getAllOnesValue(Ty);
     Value *Mask = Builder.CreateShl(AllOnes, Op1);
     return BinaryOperator::CreateAnd(Mask, X);
   }

   Constant *C1;
   if (match(Op1, m_Constant(C1))) {
     Constant *C2;
     Value *X;
     // (C2 << X) << C1 --> (C2 << C1) << X
     if (match(Op0, m_OneUse(m_Shl(m_Constant(C2), m_Value(X)))))
       return BinaryOperator::CreateShl(ConstantExpr::getShl(C2, C1), X);

     // (X * C2) << C1 --> X * (C2 << C1)
     if (match(Op0, m_Mul(m_Value(X), m_Constant(C2))))
       return BinaryOperator::CreateMul(X, ConstantExpr::getShl(C2, C1));
+
+    // shl (zext i1 X), C1 --> select (X, 1 << C1, 0)
+    if (match(Op0, m_ZExt(m_Value(X))) && X->getType()->isIntOrIntVectorTy(1)) {
+      auto *NewC = ConstantExpr::getShl(ConstantInt::get(Ty, 1), C1);
+      return SelectInst::Create(X, NewC, ConstantInt::getNullValue(Ty));
+    }
   }

   // (1 << (C - x)) -> ((1 << C) >> x) if C is bitwidth - 1
   if (match(Op0, m_One()) &&
       match(Op1, m_Sub(m_SpecificInt(BitWidth - 1), m_Value(X))))
     return BinaryOperator::CreateLShr(
         ConstantInt::get(Ty, APInt::getSignMask(BitWidth)), X);

   return nullptr;
 }

 Instruction *InstCombiner::visitLShr(BinaryOperator &I) {
   if (Value *V = SimplifyLShrInst(I.getOperand(0), I.getOperand(1), I.isExact(),
                                   SQ.getWithInstruction(&I)))
     return replaceInstUsesWith(I, V);

llvm/test/Transforms/InstCombine/and.ll

@@
 ;
   %Y = zext i1 %X to i32
   %Z = and i32 %Y, 1
   ret i32 %Z
 }

 define i32 @test31(i1 %X) {
 ; CHECK-LABEL: @test31(
-; CHECK-NEXT: [[Y:%.*]] = zext i1 %X to i32
-; CHECK-NEXT: [[Z:%.*]] = shl nuw nsw i32 [[Y]], 4
+; CHECK-NEXT: [[Z:%.*]] = select i1 [[X:%.*]], i32 16, i32 0
 ; CHECK-NEXT: ret i32 [[Z]]
 ;
   %Y = zext i1 %X to i32
   %Z = shl i32 %Y, 4
   %A = and i32 %Z, 16
   ret i32 %A
 }

llvm/test/Transforms/InstCombine/shift.ll

@@
 ;
   %a = zext <2 x i64> %t to <2 x i65>
   %sext = shl <2 x i65> %a, <i65 33, i65 33>
   %b = ashr <2 x i65> %sext, <i65 33, i65 33>
   ret <2 x i65> %b
 }

 define i32 @test_shl_zext_bool(i1 %t) {
 ; CHECK-LABEL: @test_shl_zext_bool(
-; CHECK-NEXT: [[EXT:%.*]] = zext i1 [[T:%.*]] to i32
-; CHECK-NEXT: [[SHL:%.*]] = shl nuw nsw i32 [[EXT]], 2
+; CHECK-NEXT: [[SHL:%.*]] = select i1 [[T:%.*]], i32 4, i32 0
 ; CHECK-NEXT: ret i32 [[SHL]]
 ;
   %ext = zext i1 %t to i32
   %shl = shl i32 %ext, 2
   ret i32 %shl
 }

 define <2 x i32> @test_shl_zext_bool_splat(<2 x i1> %t) {
 ; CHECK-LABEL: @test_shl_zext_bool_splat(
-; CHECK-NEXT: [[EXT:%.*]] = zext <2 x i1> [[T:%.*]] to <2 x i32>
-; CHECK-NEXT: [[SHL:%.*]] = shl nuw nsw <2 x i32> [[EXT]], <i32 3, i32 3>
+; CHECK-NEXT: [[SHL:%.*]] = select <2 x i1> [[T:%.*]], <2 x i32> <i32 8, i32 8>, <2 x i32> zeroinitializer
 ; CHECK-NEXT: ret <2 x i32> [[SHL]]
 ;
   %ext = zext <2 x i1> %t to <2 x i32>
   %shl = shl <2 x i32> %ext, <i32 3, i32 3>
   ret <2 x i32> %shl
 }

 define <2 x i32> @test_shl_zext_bool_vec(<2 x i1> %t) {
 ; CHECK-LABEL: @test_shl_zext_bool_vec(
-; CHECK-NEXT: [[EXT:%.*]] = zext <2 x i1> [[T:%.*]] to <2 x i32>
-; CHECK-NEXT: [[SHL:%.*]] = shl <2 x i32> [[EXT]], <i32 2, i32 3>
+; CHECK-NEXT: [[SHL:%.*]] = select <2 x i1> [[T:%.*]], <2 x i32> <i32 4, i32 8>, <2 x i32> zeroinitializer
 ; CHECK-NEXT: ret <2 x i32> [[SHL]]
 ;
   %ext = zext <2 x i1> %t to <2 x i32>
   %shl = shl <2 x i32> %ext, <i32 2, i32 3>
   ret <2 x i32> %shl
 }

+define i32 @test_shl_zext_bool_not_constant(i1 %cmp, i32 %shamt) {
+; CHECK-LABEL: @test_shl_zext_bool_not_constant(
+; CHECK-NEXT: [[CONV3:%.*]] = zext i1 [[CMP:%.*]] to i32
+; CHECK-NEXT: [[SHL:%.*]] = shl i32 [[CONV3]], [[SHAMT:%.*]]
+; CHECK-NEXT: ret i32 [[SHL]]
+;
+  %conv3 = zext i1 %cmp to i32
+  %shl = shl i32 %conv3, %shamt
+  ret i32 %shl
+}
+
 define i64 @shl_zext(i32 %t) {
 ; CHECK-LABEL: @shl_zext(
 ; CHECK-NEXT: [[TMP1:%.*]] = shl i32 [[T:%.*]], 8
 ; CHECK-NEXT: [[SHL:%.*]] = zext i32 [[TMP1]] to i64
 ; CHECK-NEXT: ret i64 [[SHL]]
 ;
   %and = and i32 %t, 16777215
   %ext = zext i32 %and to i64

This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] fold a shifted zext to a select
ClosedPublic
