
Details

Reviewers

nikic
zvi
lebedev.ri
xbolva00
joanlluch

Commits

rG455ce7816ce4: [InstCombine] fold a shifted bool zext to a select (2nd try)
rL374886: [InstCombine] fold a shifted bool zext to a select (2nd try)
rG1f40f15d54aa: [InstCombine] fold a shifted bool zext to a select
rL374828: [InstCombine] fold a shifted bool zext to a select

Summary

For a constant shift amount, add the following fold.
shl (zext (i1 X)), ShAmt --> select (X, 1 << ShAmt, 0)

https://rise4fun.com/Alive/IZ9

Fixes PR42257.
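The equivalence behind the fold can be checked exhaustively on ordinary fixed-width integers. What follows is a minimal sketch on 32-bit values in plain C++, not the InstCombine code itself; the names `shlOfZext`, `selectOfPow2` and `foldIsSound` are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Models `shl (zext i1 X to i32), ShAmt` for ShAmt in [0, 31].
inline std::uint32_t shlOfZext(bool X, unsigned ShAmt) {
  return static_cast<std::uint32_t>(X) << ShAmt;
}

// Models `select X, (1 << ShAmt), 0` on the same 32-bit type.
inline std::uint32_t selectOfPow2(bool X, unsigned ShAmt) {
  return X ? (std::uint32_t{1} << ShAmt) : std::uint32_t{0};
}

// Returns true if both forms agree for every in-range shift amount
// and both values of the boolean.
inline bool foldIsSound() {
  for (unsigned ShAmt = 0; ShAmt < 32; ++ShAmt)
    for (int B = 0; B < 2; ++B)
      if (shlOfZext(B != 0, ShAmt) != selectOfPow2(B != 0, ShAmt))
        return false;
  return true;
}
```

Only in-range shift amounts are covered here; the Alive link above is the actual proof for the IR-level transform.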

Diff Detail

Event Timeline

zvi created this revision. Jun 15 2019, 4:47 PM

Herald added a project: Restricted Project. Jun 15 2019, 4:47 PM

Herald added a subscriber: hiraditya.

Don't know if this is the canonicalization we want, but some notes

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
600	vectors of i1 too (`isIntOrIntVectorTy(1)`)
601	this will overflow, you want ConstantInt::get(Ty, APInt(Ty->getScalarType()->getBitWidth(), 1) << ShAmt) or something
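To illustrate the overflow concern: in C++ the expression `1 << ShAmt` has type `int`, so on typical targets it cannot represent the constant once the shift amount reaches 31 or more, even though an i64 shift allows amounts up to 63. The sketch below models the suggested fix with a 64-bit unsigned value standing in for `APInt`. `shiftedOne` is a hypothetical helper, and widths above 64 bits are out of scope for this illustration.

```cpp
#include <cstdint>

// Returns 1 << ShAmt computed at BitWidth bits (1 <= BitWidth <= 64),
// assuming ShAmt < BitWidth. The literal is widened *before* shifting,
// which is the point of building the constant at the destination width
// instead of shifting a plain `int`.
inline std::uint64_t shiftedOne(unsigned BitWidth, unsigned ShAmt) {
  const std::uint64_t One = 1; // 64-bit, not `int`
  const std::uint64_t Mask = BitWidth == 64
                                 ? ~std::uint64_t{0}
                                 : (std::uint64_t{1} << BitWidth) - 1;
  return (One << ShAmt) & Mask;
}
```

For an i64 result with a shift amount of 40, this yields 2^40, a value that an `int` shift cannot produce.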

zvi added a subscriber: zvi. Jun 16 2019, 1:30 AM

Addressed comments by @lebedev.ri
Thanks!

zvi marked 2 inline comments as done. Jun 16 2019, 2:27 AM

lebedev.ri added inline comments. Jun 16 2019, 2:37 AM

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
667–668	I think you want to add the fold here, no reason to restrict it to identical vectors

zvi marked an inline comment as done. Jun 16 2019, 3:51 AM

zvi added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
667–668	What is the policy for handling non-splat cases? Should they always be handled when possible, or are they limited to certain cases to reduce compile time?

Following suggestion by @lebedev.ri to support non-splat vector shift amount

xbolva00 added a subscriber: xbolva00. Jun 16 2019, 5:21 AM

xbolva00 added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
681	LHS/RHS don't match the comment

Fix typo in comment pointed out by @xbolva00

zvi edited the summary of this revision. Jun 16 2019, 9:53 AM

zvi marked 2 inline comments as done.

IMO, this is the right canonicalization for IR because it's the smallest form. Also if the code was already in this form, it might have profile data on the select condition that would benefit codegen (for example, we might want to compare and branch on this instead of turning it into bit-logic). I've made several related changes to prefer 'select' in IR over bithacks.

The only problem that I see is that the backend (DAGCombiner) isn't prepared to generically reverse this pattern yet (x86 might have some code that can be lifted):

define i32 @shl_zext_bool(i1 %t) {
  %ext = zext i1 %t to i32
  %shl = shl i32 %ext, 7
  ret i32 %shl
}

define i32 @sel_zext_bool(i1 %t) {
  %shl = select i1 %t, i32 128, i32 0
  ret i32 %shl
}

$ ./llc -o - selsh.ll -mtriple=aarch64--
shl_zext_bool:                          // @shl_zext_bool
	and	w8, w0, #0x1
	lsl	w0, w8, #7
	ret

sel_zext_bool:                          // @sel_zext_bool
	tst	w0, #0x1
	mov	w8, #128
	csel	w0, w8, wzr, ne
	ret

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
667–668	There's no official policy AFAIK. If it's easy enough to apply the transform to an arbitrary vector constant, then I'd prefer we do it. There should not be much compile-time overhead because (1) either way, we're checking each element of the vector constant to determine if it's a splat and (2) there probably aren't that many non-splat vector constants to begin with.
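As a rough illustration of the non-splat case: the true operand of the select is built lane by lane as `1 << C[i]`, so no splat check is needed. The following is a minimal plain-C++ sketch with `std::array` standing in for a vector constant. `shiftedOnes` and the 32-bit lane type are hypothetical, and every shift amount is assumed to be smaller than 32.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Builds the per-lane constant <1 << C[0], 1 << C[1], ...> that would serve
// as the true operand of the select; the false operand is all zeros.
template <std::size_t N>
std::array<std::uint32_t, N>
shiftedOnes(const std::array<std::uint32_t, N> &ShAmts) {
  std::array<std::uint32_t, N> Result{};
  for (std::size_t I = 0; I < N; ++I)
    Result[I] = std::uint32_t{1} << ShAmts[I];
  return Result;
}
```

A splat shift amount is just the special case where every lane of the input is the same.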

In D63382#1547090, @spatel wrote:

The only problem that I see is that the backend (DAGCombiner) isn't prepared to generically reverse this pattern yet (x86 might have some code that can be lifted):

combineSelectOfTwoConstants seems to be the X86 handler you mentioned. I like your idea to migrate it to a target-independent combine.

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
667–668	This makes sense, thanks.

Reverse-ping?

spatel mentioned this in rL374397: [DAGCombiner] fold select-of-constants to shift. Oct 10 2019, 10:53 AM

spatel mentioned this in rG7b904ce7246b: [DAGCombiner] fold select-of-constants to shift.

spatel mentioned this in rG3b581ac80f72: [DAGCombiner] fold vselect-of-constants to shift. Oct 11 2019, 7:17 AM

spatel mentioned this in rL374555: [DAGCombiner] fold vselect-of-constants to shift.

I haven't tested this on trunk, but I added the DAGCombiner reversals:
rL374397
rL374555
...so this should be good to go. Should I commandeer?

There may still be regressions because most in-trunk targets don't enable the guarding hook. Example:
D68911
...but any target has the ability to change that as needed.
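As the commit titles suggest, the reversal amounts to recognizing a select between a power of two and zero and recovering the shift amount. The following is a minimal sketch in plain C++, not the DAGCombiner code. `shiftAmountForSelect` is a hypothetical name, only the `select C, 2^k, 0` shape is handled, and the target hook that guards the real combine is not modeled.

```cpp
#include <cstdint>

// If `select Cond, TrueC, FalseC` has the shape `select Cond, 2^k, 0`,
// returns k, so the select could be rewritten as `zext(Cond) << k`.
// Returns -1 when the constants do not have that shape.
inline int shiftAmountForSelect(std::uint32_t TrueC, std::uint32_t FalseC) {
  const bool IsPow2 = TrueC != 0 && (TrueC & (TrueC - 1)) == 0;
  if (FalseC != 0 || !IsPow2)
    return -1;
  int K = 0;
  while ((TrueC >> K) != 1) // find the position of the single set bit
    ++K;
  return K;
}
```

For the `select i1 %t, i32 128, i32 0` example earlier in the thread, this returns 7, matching the `shl i32 %ext, 7` form.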

@joanlluch - this is a case where we could transform incoming shift to select and benefit small targets as you've noted.

In D63382#1707153, @spatel wrote:

@joanlluch - this is a case where we could transform incoming shift to select and benefit small targets as you've noted.

Indeed, this seems to replace a shift that would be executed in all cases by a shift that will execute only if the incoming value was 1.
I also agree with your comment above about "prefer 'select' in IR over bithacks".

@spatel I want to express my support for selects over bit-manipulation instructions in IR, as stated above, in order to move such optimisations to DAGCombine. Ideally, this should involve the removal of some of the existing InstCombineSelect transformations, particularly most of the ones in foldSelectInstWithICmp. However, as I explained earlier on llvm-dev, the DAGCombine code should incorporate hooks to allow targets to decide whether such bithacks are actually profitable, or whether it's best to keep them as selects.

spatel mentioned this in rL374818: [InstCombine] add tests for select/shift transforms; NFC. Oct 14 2019, 1:25 PM

spatel mentioned this in rGbfaa1082e126: [InstCombine] add tests for select/shift transforms; NFC. Oct 14 2019, 1:35 PM

Commandeering to rebase.

Herald added a subscriber: mcrosier. Oct 14 2019, 2:15 PM

Patch updated:
Rebased after committing baseline tests and cosmetic changes to the code.

spatel added reviewers: lebedev.ri, xbolva00, joanlluch. Oct 14 2019, 2:23 PM

This revision is now accepted and ready to land. Oct 14 2019, 2:26 PM

lebedev.ri edited the summary of this revision. Oct 14 2019, 2:28 PM

Looks good.

Closed by commit rG1f40f15d54aa: [InstCombine] fold a shifted bool zext to a select (authored by spatel). Oct 14 2019, 3:00 PM

This revision was automatically updated to reflect the committed changes.

Reverted at rL374851 - appears to cause a test-suite failure and stage 2 failure.

This revision is now accepted and ready to land. Oct 14 2019, 4:54 PM

xbolva00 added inline comments. Oct 14 2019, 5:10 PM

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
678–679	Not guarded by if (match(Op1, m_Constant(C1))) { ?

spatel marked an inline comment as done. Oct 14 2019, 5:29 PM

spatel added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
678–679	Yikes - yes, that would do it...copy/pasted in the wrong spot.

Closed by commit rG455ce7816ce4: [InstCombine] fold a shifted bool zext to a select (2nd try) (authored by spatel). Oct 15 2019, 6:15 AM

This revision was automatically updated to reflect the committed changes.

Diff 204936

llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp

(581 lines not shown) Instruction *InstCombiner::visitShl(BinaryOperator &I) {

   Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);
   Type *Ty = I.getType();
   const APInt *ShAmtAPInt;
   if (match(Op1, m_APInt(ShAmtAPInt))) {
     unsigned ShAmt = ShAmtAPInt->getZExtValue();
     unsigned BitWidth = Ty->getScalarSizeInBits();

-    // shl (zext X), ShAmt --> zext (shl X, ShAmt)
-    // This is only valid if X would have zeros shifted out.
     Value *X;
     if (match(Op0, m_ZExt(m_Value(X)))) {
       unsigned SrcWidth = X->getType()->getScalarSizeInBits();
+      // shl (zext X), ShAmt --> zext (shl X, ShAmt)
+      // This is only valid if X would have zeros shifted out.
       if (ShAmt < SrcWidth &&
           MaskedValueIsZero(X, APInt::getHighBitsSet(SrcWidth, ShAmt), 0, &I))
         return new ZExtInst(Builder.CreateShl(X, ShAmt), Ty);
+
+      // shl (zext (i1 X)), ShAmt --> select (X, 0, 1 << ShAmt)
+      if (X->getType()->isIntegerTy(1)) {
>>> lebedev.ri (Done): vectors of i1 too (`isIntOrIntVectorTy(1)`)
+        return SelectInst::Create(X, ConstantInt::get(Ty, 1 << ShAmt),
>>> lebedev.ri (Done): this will overflow, you want ConstantInt::get(Ty, APInt(Ty->getScalarType()->getBitWidth(), 1) << ShAmt) or something
+                                  ConstantInt::get(Ty, 0));
+      }
     }

     // (X >> C) << C --> X & (-1 << C)
     if (match(Op0, m_Shr(m_Value(X), m_Specific(Op1)))) {
       APInt Mask(APInt::getHighBitsSet(BitWidth, BitWidth - ShAmt));
       return BinaryOperator::CreateAnd(X, ConstantInt::get(Ty, Mask));
     }

(47 lines not shown) Instruction *InstCombiner::visitShl(BinaryOperator &I) {
   Value *X;
   if (match(Op0, m_OneUse(m_Shr(m_Value(X), m_Specific(Op1))))) {
     Constant *AllOnes = ConstantInt::getAllOnesValue(Ty);
     Value *Mask = Builder.CreateShl(AllOnes, Op1);
     return BinaryOperator::CreateAnd(Mask, X);
   }

   Constant *C1;
   if (match(Op1, m_Constant(C1))) {
     Constant *C2;
>>> lebedev.ri (Done): I think you want to add the fold here, no reason to restrict it to identical vectors
>>> zvi (Done): What is the policy for handling non-splat cases, should they always be handled when possible or are they limited only to certain cases to reduce compile-time?
>>> spatel (Not Done): There's no official policy AFAIK. If it's easy enough to apply the transform to an arbitrary vector constant, then I'd prefer we do it. There should not be much compile-time overhead because (1) either way, we're checking each element of the vector constant to determine if it's a splat and (2) there probably aren't that many non-splat vector constants to begin with.
>>> zvi (Done): This makes sense, thanks.
     Value *X;
     // (C2 << X) << C1 --> (C2 << C1) << X
     if (match(Op0, m_OneUse(m_Shl(m_Constant(C2), m_Value(X)))))
       return BinaryOperator::CreateShl(ConstantExpr::getShl(C2, C1), X);

     // (X * C2) << C1 --> X * (C2 << C1)
     if (match(Op0, m_Mul(m_Value(X), m_Constant(C2))))
       return BinaryOperator::CreateMul(X, ConstantExpr::getShl(C2, C1));
   }

   return nullptr;
>>> xbolva00 (Not Done): Not guarded by if (match(Op1, m_Constant(C1))) { ?
>>> spatel (Done): Yikes - yes, that would do it...copy/pasted in the wrong spot.
 }

>>> xbolva00 (Done): LHS/RHS dont match the comment
 Instruction *InstCombiner::visitLShr(BinaryOperator &I) {
   if (Value *V = SimplifyLShrInst(I.getOperand(0), I.getOperand(1), I.isExact(),
                                   SQ.getWithInstruction(&I)))
     return replaceInstUsesWith(I, V);

   if (Instruction *X = foldVectorBinop(I))
     return X;

(201 lines not shown)

llvm/test/Transforms/InstCombine/and.ll

(340 lines not shown)
 ;
   %Y = zext i1 %X to i32
   %Z = and i32 %Y, 1
   ret i32 %Z
 }

 define i32 @test31(i1 %X) {
 ; CHECK-LABEL: @test31(
-; CHECK-NEXT: [[Y:%.*]] = zext i1 %X to i32
-; CHECK-NEXT: [[Z:%.*]] = shl nuw nsw i32 [[Y]], 4
+; CHECK-NEXT: [[Z:%.*]] = select i1 %X, i32 16, i32 0
 ; CHECK-NEXT: ret i32 [[Z]]
 ;
   %Y = zext i1 %X to i32
   %Z = shl i32 %Y, 4
   %A = and i32 %Z, 16
   ret i32 %A
 }

(481 lines not shown)

llvm/test/Transforms/InstCombine/shift.ll

(1,185 lines not shown)
 ; CHECK-NEXT: ret i64 [[SHL]]
 ;
   %and = and i32 %t, 16777215
   %ext = zext i32 %and to i64
   %shl = shl i64 %ext, 8
   ret i64 %shl
 }

+define i32 @test_shl_zext_bool(i1 %t) {
+; CHECK-LABEL: @test_shl_zext_bool(
+; CHECK-NEXT: [[SEL:%.*]] = select i1 %t, i32 4, i32 0
+; CHECK-NEXT: ret i32 [[SEL]]
+;
+  %ext = zext i1 %t to i32
+  %shl = shl i32 %ext, 2
+  ret i32 %shl
+}
+
 define <2 x i64> @test_64_splat_vec(<2 x i32> %t) {
 ; CHECK-LABEL: @test_64_splat_vec(
 ; CHECK-NEXT: [[TMP1:%.*]] = shl <2 x i32> %t, <i32 8, i32 8>
 ; CHECK-NEXT: [[SHL:%.*]] = zext <2 x i32> [[TMP1]] to <2 x i64>
 ; CHECK-NEXT: ret <2 x i64> [[SHL]]
 ;
   %and = and <2 x i32> %t, <i32 16777215, i32 16777215>
   %ext = zext <2 x i32> %and to <2 x i64>
(338 lines not shown)

This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] fold a shifted zext to a select
ClosedPublic
