This is an archive of the discontinued LLVM Phabricator instance.

Differential D132412

[InstCombine] ease use constraint in tryFactorization()
Closed, Public

Authored by spatel on Aug 22 2022, 2:32 PM.

Details

Reviewers

craig.topper
hiraditya
Allen
bcl5980

Commits

rG0cfc6510323f: [InstCombine] ease use constraint in tryFactorization()

Summary

(x * y) + x --> x * (y + 1)
(x * y) - x --> x * (y - 1)

https://alive2.llvm.org/ce/z/eMhvQa

This is one of the IR transforms suggested in issue #57255.

This should be better in IR because it removes a use of a variable operand (we already fold the case with a constant multiply operand).
The backend should be able to re-distribute the multiply if that's better for the target.
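
The two folds in the summary can be sanity-checked outside of Alive2 with a small exhaustive test. The following is only a sketch under the assumption that 8-bit wrapping arithmetic is a fair stand-in for LLVM's `i8` `add`/`sub`/`mul` without flags; the function name is hypothetical and is not part of the patch.

```cpp
#include <cstdint>

// Returns true if, for every pair of 8-bit values, wrapping arithmetic gives
//   x*y + x == x*(y + 1)   and   x*y - x == x*(y - 1).
// This models flag-free i8 operations only; nsw/nuw are not considered here.
bool factorizationHoldsForAllI8() {
  for (unsigned xi = 0; xi < 256; ++xi) {
    for (unsigned yi = 0; yi < 256; ++yi) {
      const uint8_t x = static_cast<uint8_t>(xi);
      const uint8_t y = static_cast<uint8_t>(yi);
      // Operands promote to int, so each result is truncated back to 8 bits.
      const uint8_t addBefore = static_cast<uint8_t>(x * y + x);
      const uint8_t addAfter =
          static_cast<uint8_t>(x * static_cast<uint8_t>(y + 1));
      const uint8_t subBefore = static_cast<uint8_t>(x * y - x);
      const uint8_t subAfter =
          static_cast<uint8_t>(x * static_cast<uint8_t>(y - 1));
      if (addBefore != addAfter || subBefore != subAfter)
        return false;
    }
  }
  return true;
}
```

Calling `factorizationHoldsForAllI8()` should return true, which agrees with the equivalence claimed in the summary for operations without overflow flags.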

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

spatel created this revision. Aug 22 2022, 2:32 PM

Herald added a project: Restricted Project. Aug 22 2022, 2:32 PM

Herald added subscribers: StephenFan, wenlei, mcrosier.

spatel requested review of this revision. Aug 22 2022, 2:32 PM

Herald added a project: Restricted Project. Aug 22 2022, 2:32 PM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B182683: Diff 454611. Aug 22 2022, 3:45 PM

hiraditya added inline comments. Aug 22 2022, 5:04 PM

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1422 ↗	(On Diff #454611)	Can this cause a regression when Y is a power of two? Some other optimization would rewrite `x*y` as `x << n` (when `n == lg(y)` is known at compile time).

bcl5980 added a subscriber: bcl5980. Aug 23 2022, 3:40 AM

bcl5980 added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1422 ↗	(On Diff #454611)	https://alive2.llvm.org/ce/z/LRobTp It looks like the answer is yes.

spatel added inline comments. Aug 23 2022, 5:56 AM

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1422 ↗	(On Diff #454611)	That is correct - this transform could obscure a power-of-2 transform. But it could also expose a power-of-2 multiply that we miss today: https://alive2.llvm.org/ce/z/ZUjR5S

In the typical instcombine sequence, we would process the multiply operand before the add user of that operand, so I haven't found a way to show a regression from this patch yet. We need a more general solution if we are going to get the optimal IR for all patterns.

But the example shows a simpler pattern that I missed initially: https://alive2.llvm.org/ce/z/eMhvQa

I think we should do that transform for the same reason stated here - it reduces the number of uses of a variable operand. And it looks like we might not need this patch because `-reassociate` gives us that simpler pattern.
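
The trade-off discussed above (the fold can hide one power-of-2 multiply while exposing another) can be illustrated with constants. This is a hedged sketch with hypothetical helper names; it models the arithmetic in 8-bit wrapping integers and says nothing about the order in which InstCombine visits instructions.

```cpp
#include <cstdint>

// True if v is a power of two, i.e. a multiply by v could become a shift.
bool isPowerOfTwo(uint8_t v) { return v != 0 && (v & (v - 1)) == 0; }

// Original form: (x * y) + x.
uint8_t mulThenAdd(uint8_t x, uint8_t y) {
  return static_cast<uint8_t>(x * y + x);
}

// Factored form: x * (y + 1).
uint8_t factored(uint8_t x, uint8_t y) {
  return static_cast<uint8_t>(x * static_cast<uint8_t>(y + 1));
}
```

With `y == 7` the original multiplier is not a power of two but the factored multiplier `8` is, so `factored(x, 7)` equals `x << 3` in 8 bits. With `y == 8` the situation is reversed: the original multiply is a shift, while the factored multiplier `9` is not a power of two.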

Patch updated:
Handle a simpler pattern - just looking for a mul with a common operand with an add.

Harbormaster completed remote builds in B182838: Diff 454824. Aug 23 2022, 6:52 AM

bcl5980 added inline comments. Aug 23 2022, 7:36 AM

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1422 ↗	(On Diff #454611)	This looks like a special case of the distributive law: x * y + x * 1. How about moving this to InstCombinerImpl::SimplifyUsingDistributiveLaws or InstCombinerImpl::tryFactorization?

Allen added inline comments. Aug 23 2022, 8:15 AM

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1422 ↗	(On Diff #454611)	So the decomposition of a MUL by a constant should be improved first?
1423 ↗	(On Diff #454824)	Should we consider the NUW and NSW flags?

spatel added inline comments. Aug 23 2022, 8:45 AM

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1422 ↗	(On Diff #454611)	We do try to match this pattern in that code, but we miss it because the operand 'x' has multiple uses. I'll try to make it work if that isn't too ugly...but this explicit pattern-match might be the clearer way to achieve the transform.
1422 ↗	(On Diff #454611)	Ideally yes, we would have decomposition completed in the backend. But I also think this patch can uncover folds that we miss today (like the example where we created a power-of-2 multiply). So this patch should not be blocked on improving the backend. There could be both wins and losses.
1423 ↗	(On Diff #454824)	I only found one case so far where we can partially propagate nuw: https://alive2.llvm.org/ce/z/aGSyrK Everything else seems to be illegal. Propagating the flag would be another reason to code this match explicitly rather than trying to make it work through tryFactorization().

bcl5980 added inline comments. Aug 23 2022, 9:03 AM

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1423 ↗	(On Diff #454824)	If we don't move it to tryFactorization, do we need to add the similar pattern x*y - x --> (x-1) * y?
1423 ↗	(On Diff #454824)	Sorry, that should be x*y - x --> x * (y-1).

spatel mentioned this in rG8ccca3f3a498: [InstCombine] adjust tests for mul+add common factor; NFC. Aug 23 2022, 2:54 PM

Patch updated:
Adjust the use checks in tryFactorization() to handle the motivating patterns. This also changes some existing tests with multi-uses that we ignored before. Those look like good changes to me for the same reason cited before - it reduces uses of variables.

Also, I didn't remember that we have code that tries to propagate nsw/nuw through those folds. It is there, but incomplete as noted with the TODO items in the subtract tests.

Harbormaster completed remote builds in B182960: Diff 454992. Aug 23 2022, 3:48 PM

LGTM. But please wait for someone else.

This revision is now accepted and ready to land. Aug 23 2022, 8:42 PM

Allen added inline comments. Aug 24 2022, 12:24 AM

llvm/test/Transforms/InstCombine/sub.ll
2007	Sorry for the naive question: does the nsw on the mul imply nsw on the add? As you just said, the mul could retain nsw.

bcl5980 added inline comments. Aug 24 2022, 12:41 AM

llvm/test/Transforms/InstCombine/sub.ll
2007	https://alive2.llvm.org/ce/z/WjpNoD The add pattern can't retain nsw. The overflow comes from -1 * INT_MIN.

Allen added inline comments. Aug 24 2022, 1:41 AM

llvm/test/Transforms/InstCombine/sub.ll
2007	Thanks @bcl5980. As your case showed, the mul pattern also can't retain nsw?

bcl5980 added inline comments. Aug 24 2022, 2:01 AM

llvm/test/Transforms/InstCombine/sub.ll
2007	Yeah, these two patterns are the only two cases where we can keep the mul flag. All add/sub flags will be lost. https://alive2.llvm.org/ce/z/HDyrwb
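
The flag discussion above can be explored with a brute-force model. The sketch below is an assumption-laden illustration, not the logic in tryFactorization(): it covers only 8-bit values, only the case where the original `mul` and the original `add`/`sub` carry the same flag, and it treats the new inner `add`/`sub` as flag-free. All names are hypothetical.

```cpp
// Wraps v to 8 bits, interpreted as signed or unsigned.
int wrap8(int v, bool isSigned) {
  const int u = ((v % 256) + 256) % 256; // 0..255
  return (isSigned && u >= 128) ? u - 256 : u;
}

// True if v is representable in 8 bits without wrapping.
bool fits8(int v, bool isSigned) {
  return isSigned ? (v >= -128 && v <= 127) : (v >= 0 && v <= 255);
}

// Counts inputs where the original mul and add/sub both satisfy the flag
// (nsw if isSigned, else nuw) but the factored mul would overflow.
int countFlagViolations(bool isAdd, bool isSigned) {
  const int lo = isSigned ? -128 : 0;
  const int hi = isSigned ? 127 : 255;
  int violations = 0;
  for (int x = lo; x <= hi; ++x) {
    for (int y = lo; y <= hi; ++y) {
      const int product = x * y;
      const int top = isAdd ? product + x : product - x;
      // If the original would be poison, the result is unconstrained.
      if (!fits8(product, isSigned) || !fits8(top, isSigned))
        continue;
      // The new inner add/sub has no flags, so it is allowed to wrap.
      const int inner = wrap8(isAdd ? y + 1 : y - 1, isSigned);
      if (!fits8(inner * x, isSigned))
        ++violations;
    }
  }
  return violations;
}
```

Under these assumptions, `countFlagViolations(true, true)` finds exactly one failing input (`x == -1`, `y == 127`, where `y + 1` wraps to the minimum value and the multiply by -1 overflows), which agrees with the "-1 * INT_MIN" remark. The other three combinations report no violations, consistent with the `nuw` retained in the add test and the two TODO notes in sub.ll.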

spatel added inline comments. Aug 24 2022, 5:59 AM

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
688–689	For reference, this is the block where we try propagating overflow flags. It can be adjusted independently of this patch.

Allen accepted this revision. Aug 24 2022, 7:16 AM

This revision was landed with ongoing or failed builds. Aug 24 2022, 9:11 AM

Closed by commit rG0cfc6510323f: [InstCombine] ease use constraint in tryFactorization() (authored by spatel).

This revision was automatically updated to reflect the committed changes.

spatel added a commit: rG0cfc6510323f: [InstCombine] ease use constraint in tryFactorization().

Allen mentioned this in D136623: [InstCombine] enable more factorization in SimplifyUsingDistributiveLaws. Oct 24 2022, 10:24 AM

arsenm added a subscriber: arsenm. Nov 2 2022, 5:17 PM

arsenm added inline comments.

llvm/test/Transforms/InstCombine/add.ll
1759	Do we have a backend combine to undo this already?

arsenm added inline comments. Nov 2 2022, 5:31 PM

llvm/test/Transforms/InstCombine/add.ll
1759	Specifically, this is bad for us because it breaks integer mad matching.

bcl5980 added inline comments. Nov 3 2022, 12:13 AM

llvm/test/Transforms/InstCombine/add.ll
1759	There is a similar pattern in DAGCombiner, but there the second operand of the mul is also a constant:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L4112

// fold (mul (add x, c1), c2) -> (add (mul x, c2), c1*c2)
if (DAG.isConstantIntBuildVectorOrConstantInt(N1) &&
    N0.getOpcode() == ISD::ADD &&
    DAG.isConstantIntBuildVectorOrConstantInt(N0.getOperand(1)) &&
    isMulAddWithConstProfitable(N, N0, N1))
  return DAG.getNode(
      ISD::ADD, DL, VT,
      DAG.getNode(ISD::MUL, SDLoc(N0), VT, N0.getOperand(0), N1),
      DAG.getNode(ISD::MUL, SDLoc(N1), VT, N0.getOperand(1), N1));

And it looks like the AArch64 backend already has the pattern:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp#L15066

// Canonicalize X*(Y+1) -> X*Y+X and (X+1)*Y -> X*Y+Y,
// and in MachineCombiner pass, add+mul will be combined into madd.
// Similarly, X*(1-Y) -> X - X*Y and (1-Y)*X -> X - Y*X.

AMDGPU could add similar code, though you may need to consider more there (divergence? data type?).
https://alive2.llvm.org/ce/z/gJRkR5
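
As a rough illustration of the two backend rewrites quoted above, the following sketch checks the constant-operand identity from the DAGCombiner comment and shows the multiply-add shape that re-distribution would recover. It assumes 8-bit wrapping arithmetic and uses hypothetical function names; it does not model `isMulAddWithConstProfitable` or any target cost logic.

```cpp
#include <cstdint>

// Checks (x + c1) * c2 == x * c2 + c1 * c2 for all 8-bit x and c1,
// for one given constant c2, under wrapping arithmetic.
bool distributesForConstant(uint8_t c2) {
  for (unsigned xi = 0; xi < 256; ++xi) {
    for (unsigned ci = 0; ci < 256; ++ci) {
      const uint8_t x = static_cast<uint8_t>(xi);
      const uint8_t c1 = static_cast<uint8_t>(ci);
      const uint8_t folded =
          static_cast<uint8_t>(static_cast<uint8_t>(x + c1) * c2);
      const uint8_t expanded = static_cast<uint8_t>(x * c2 + c1 * c2);
      if (folded != expanded)
        return false;
    }
  }
  return true;
}

// Multiply-add shape a backend might match: a * b + c.
uint8_t mulAdd(uint8_t a, uint8_t b, uint8_t c) {
  return static_cast<uint8_t>(a * b + c);
}
```

Here `mulAdd(x, y, x)` computes the same value as `x * (y + 1)`, which is the form a target would want back if it has a fused integer multiply-add.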

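Revision Contents

Path	Size
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp	8 lines
llvm/test/Transforms/InstCombine/add.ll	20 lines
llvm/test/Transforms/InstCombine/and-or.ll	12 lines
llvm/test/Transforms/InstCombine/ctpop.ll	12 lines
llvm/test/Transforms/InstCombine/sub.ll	25 lines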

Diff 455247

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp

[651 lines hidden; hunk context: if (leftDistributesOverRight(InnerOpcode, TopLevelOpcode)) {]

     // commutative case, "(A op' B) op (C op' A)"?
     if (A == C || (InnerCommutative && A == D)) {
       if (A != C)
         std::swap(C, D);
       // Consider forming "A op' (B op D)".
       // If "B op D" simplifies then it can be formed with no cost.
       V = simplifyBinOp(TopLevelOpcode, B, D, SQ.getWithInstruction(&I));

-      // If "B op D" doesn't simplify then only go on if both of the existing
+      // If "B op D" doesn't simplify then only go on if one of the existing
       // operations "A op' B" and "C op' D" will be zapped as no longer used.
-      if (!V && LHS->hasOneUse() && RHS->hasOneUse())
+      if (!V && (LHS->hasOneUse() || RHS->hasOneUse()))
         V = Builder.CreateBinOp(TopLevelOpcode, B, D, RHS->getName());
       if (V)
         RetVal = Builder.CreateBinOp(InnerOpcode, A, V);
     }
   }

   // Does "(X op Y) op' Z" always equal "(X op' Z) op (Y op' Z)"?
   if (!RetVal && rightDistributesOverLeft(TopLevelOpcode, InnerOpcode)) {
     // Does the instruction have the form "(A op' B) op (C op' B)" or, in the
     // commutative case, "(A op' B) op (B op' D)"?
     if (B == D || (InnerCommutative && B == C)) {
       if (B != D)
         std::swap(C, D);
       // Consider forming "(A op C) op' B".
       // If "A op C" simplifies then it can be formed with no cost.
       V = simplifyBinOp(TopLevelOpcode, A, C, SQ.getWithInstruction(&I));

-      // If "A op C" doesn't simplify then only go on if both of the existing
+      // If "A op C" doesn't simplify then only go on if one of the existing
       // operations "A op' B" and "C op' D" will be zapped as no longer used.
-      if (!V && LHS->hasOneUse() && RHS->hasOneUse())
+      if (!V && (LHS->hasOneUse() || RHS->hasOneUse()))
         V = Builder.CreateBinOp(TopLevelOpcode, A, C, LHS->getName());
       if (V)
         RetVal = Builder.CreateBinOp(InnerOpcode, V, B);
     }
   }

   if (!RetVal)
     return nullptr;

   ++NumFactor;
   RetVal->takeName(&I);

   // Try to add no-overflow flags to the final value.
   if (auto *OBO = dyn_cast<OverflowingBinaryOperator>(RetVal)) {
     bool HasNSW = false;

[4,020 lines hidden]
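
The effect of relaxing the guard can be shown with a toy model. This is only a sketch: `ToyValue` is a hypothetical stand-in that tracks a use count, not LLVM's `Value` class, and the guard matters only on the path where `simplifyBinOp` did not already produce a value.

```cpp
// Toy stand-in for an IR value; only the number of uses matters here.
struct ToyValue {
  unsigned numUses;
  bool hasOneUse() const { return numUses == 1; }
};

// Guard before the patch: both operands of the top-level op must be single-use.
bool oldGuard(const ToyValue &lhs, const ToyValue &rhs) {
  return lhs.hasOneUse() && rhs.hasOneUse();
}

// Guard after the patch: one single-use operand is enough.
bool newGuard(const ToyValue &lhs, const ToyValue &rhs) {
  return lhs.hasOneUse() || rhs.hasOneUse();
}
```

For the motivating pattern `(x * y) + x`, the multiply has one use but `x` has two (the multiply and the add), so `oldGuard` rejects it and `newGuard` accepts it. In the negative tests the multiply has an extra use as well, so both guards reject the fold.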

llvm/test/Transforms/InstCombine/add.ll

[1,750 lines hidden]

 ; CHECK-NEXT: ret i32 [[G]]
 ;
   %E = add i32 %B, %A
   %F = add i32 %C, %E
   %G = add i32 %D, %F
   ret i32 %G
 }

+; x * y + x --> (y + 1) * x
+
 define i8 @mul_add_common_factor_commute1(i8 %x, i8 %y) {
 ; CHECK-LABEL: @mul_add_common_factor_commute1(
-; CHECK-NEXT: [[M:%.*]] = mul nsw i8 [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT: [[A:%.*]] = add nsw i8 [[M]], [[X]]
+; CHECK-NEXT: [[X1:%.*]] = add i8 [[Y:%.*]], 1
+; CHECK-NEXT: [[A:%.*]] = mul i8 [[X1]], [[X:%.*]]
 ; CHECK-NEXT: ret i8 [[A]]
 ;
   %m = mul nsw i8 %x, %y
   %a = add nsw i8 %m, %x
   ret i8 %a
 }

 define <2 x i8> @mul_add_common_factor_commute2(<2 x i8> %x, <2 x i8> %y) {
 ; CHECK-LABEL: @mul_add_common_factor_commute2(
-; CHECK-NEXT: [[M:%.*]] = mul nuw <2 x i8> [[Y:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[A:%.*]] = add nuw <2 x i8> [[M]], [[X]]
+; CHECK-NEXT: [[M1:%.*]] = add <2 x i8> [[Y:%.*]], <i8 1, i8 1>
+; CHECK-NEXT: [[A:%.*]] = mul nuw <2 x i8> [[M1]], [[X:%.*]]
 ; CHECK-NEXT: ret <2 x i8> [[A]]
 ;
   %m = mul nuw <2 x i8> %y, %x
   %a = add nuw <2 x i8> %m, %x
   ret <2 x i8> %a
 }

 define i8 @mul_add_common_factor_commute3(i8 %p, i8 %y) {
 ; CHECK-LABEL: @mul_add_common_factor_commute3(
 ; CHECK-NEXT: [[X:%.*]] = mul i8 [[P:%.*]], [[P]]
-; CHECK-NEXT: [[M:%.*]] = mul nuw i8 [[X]], [[Y:%.*]]
-; CHECK-NEXT: [[A:%.*]] = add nsw i8 [[X]], [[M]]
+; CHECK-NEXT: [[M1:%.*]] = add i8 [[Y:%.*]], 1
+; CHECK-NEXT: [[A:%.*]] = mul i8 [[X]], [[M1]]
 ; CHECK-NEXT: ret i8 [[A]]
 ;
   %x = mul i8 %p, %p ; thwart complexity-based canonicalization
   %m = mul nuw i8 %x, %y
   %a = add nsw i8 %x, %m
   ret i8 %a
 }

 define i8 @mul_add_common_factor_commute4(i8 %p, i8 %q) {
 ; CHECK-LABEL: @mul_add_common_factor_commute4(
 ; CHECK-NEXT: [[X:%.*]] = mul i8 [[P:%.*]], [[P]]
 ; CHECK-NEXT: [[Y:%.*]] = mul i8 [[Q:%.*]], [[Q]]
-; CHECK-NEXT: [[M:%.*]] = mul nsw i8 [[Y]], [[X]]
-; CHECK-NEXT: [[A:%.*]] = add nuw i8 [[X]], [[M]]
+; CHECK-NEXT: [[M1:%.*]] = add i8 [[Y]], 1
+; CHECK-NEXT: [[A:%.*]] = mul i8 [[X]], [[M1]]
 ; CHECK-NEXT: ret i8 [[A]]
 ;
   %x = mul i8 %p, %p ; thwart complexity-based canonicalization
   %y = mul i8 %q, %q ; thwart complexity-based canonicalization
   %m = mul nsw i8 %y, %x
   %a = add nuw i8 %x, %m
   ret i8 %a
 }

+; negative test - uses
+
 define i8 @mul_add_common_factor_use(i8 %x, i8 %y) {
 ; CHECK-LABEL: @mul_add_common_factor_use(
 ; CHECK-NEXT: [[M:%.*]] = mul i8 [[X:%.*]], [[Y:%.*]]
 ; CHECK-NEXT: call void @use(i8 [[M]])
 ; CHECK-NEXT: [[A:%.*]] = add i8 [[M]], [[X]]
 ; CHECK-NEXT: ret i8 [[A]]
 ;
   %m = mul i8 %x, %y
   call void @use(i8 %m)
   %a = add i8 %m, %x
   ret i8 %a
 }

llvm/test/Transforms/InstCombine/and-or.ll

[665 lines hidden]

 declare void @use2(i32)

 define i32 @or_or_and_noOneUse_fail1(i32 %a, i32 %b) {
 ; CHECK-LABEL: @or_or_and_noOneUse_fail1(
 ; CHECK-NEXT: [[SHR:%.*]] = ashr i32 [[A:%.*]], 23
 ; CHECK-NEXT: [[AND:%.*]] = and i32 [[SHR]], 157
 ; CHECK-NEXT: call void @use2(i32 [[AND]])
-; CHECK-NEXT: [[AND3:%.*]] = and i32 [[SHR]], [[B:%.*]]
+; CHECK-NEXT: [[AND1:%.*]] = or i32 [[B:%.*]], 157
+; CHECK-NEXT: [[OR:%.*]] = and i32 [[SHR]], [[AND1]]
 ; CHECK-NEXT: [[TMP1:%.*]] = lshr i32 [[B]], 23
 ; CHECK-NEXT: [[AND9:%.*]] = and i32 [[TMP1]], 157
-; CHECK-NEXT: [[TMP2:%.*]] = or i32 [[AND3]], [[AND9]]
-; CHECK-NEXT: [[R:%.*]] = or i32 [[TMP2]], [[AND]]
+; CHECK-NEXT: [[R:%.*]] = or i32 [[OR]], [[AND9]]
 ; CHECK-NEXT: ret i32 [[R]]
 ;
   %shr = ashr i32 %a, 23
   %conv = trunc i32 %shr to i8
   %conv1 = zext i8 %conv to i32
   %and = and i32 %conv1, 925
   call void @use2(i32 %and)
   %and3 = and i32 %shr, %b
   %or = or i32 %and3, %and
   %shr8 = ashr i32 %b, 23
   %and9 = and i32 %shr8, 157
   %r = or i32 %or, %and9
   ret i32 %r
 }

 define { i1, i1, i1, i1, i1 } @or_or_and_noOneUse_fail2(i1 %a_0, i1 %a_1, i1 %a_2, i1 %a_3, i1 %b_0, i1 %b_1, i1 %b_2, i1 %b_3) {
 ; CHECK-LABEL: @or_or_and_noOneUse_fail2(
 ; CHECK-NEXT: entry:
 ; CHECK-NEXT: [[TMP0:%.*]] = and i1 [[A_0:%.*]], [[B_0:%.*]]
 ; CHECK-NEXT: [[TMP1:%.*]] = and i1 [[A_3:%.*]], [[B_3:%.*]]
 ; CHECK-NEXT: [[TMP2:%.*]] = xor i1 [[A_2:%.*]], [[B_2:%.*]]
 ; CHECK-NEXT: [[TMP3:%.*]] = and i1 [[A_1:%.*]], [[B_1:%.*]]
 ; CHECK-NEXT: [[TMP4:%.*]] = xor i1 [[TMP3]], true
 ; CHECK-NEXT: [[TMP5:%.*]] = and i1 [[TMP0]], [[A_1]]
-; CHECK-NEXT: [[TMP6:%.*]] = and i1 [[TMP2]], [[B_1]]
-; CHECK-NEXT: [[TMP7:%.*]] = or i1 [[TMP6]], [[TMP5]]
-; CHECK-NEXT: [[D:%.*]] = or i1 [[TMP7]], [[TMP3]]
+; CHECK-NEXT: [[TMP6:%.*]] = or i1 [[TMP2]], [[A_1]]
+; CHECK-NEXT: [[TMP7:%.*]] = and i1 [[TMP6]], [[B_1]]
+; CHECK-NEXT: [[D:%.*]] = or i1 [[TMP7]], [[TMP5]]
 ; CHECK-NEXT: [[TMP8:%.*]] = or i1 [[TMP1]], [[TMP3]]
 ; CHECK-NEXT: [[TMP9:%.*]] = insertvalue { i1, i1, i1, i1, i1 } zeroinitializer, i1 [[D]], 0
 ; CHECK-NEXT: [[TMP10:%.*]] = insertvalue { i1, i1, i1, i1, i1 } [[TMP9]], i1 [[TMP4]], 1
 ; CHECK-NEXT: [[TMP11:%.*]] = insertvalue { i1, i1, i1, i1, i1 } [[TMP10]], i1 true, 2
 ; CHECK-NEXT: [[TMP12:%.*]] = insertvalue { i1, i1, i1, i1, i1 } [[TMP11]], i1 [[A_3]], 3
 ; CHECK-NEXT: [[TMP13:%.*]] = insertvalue { i1, i1, i1, i1, i1 } [[TMP12]], i1 [[TMP8]], 4
 ; CHECK-NEXT: ret { i1, i1, i1, i1, i1 } [[TMP13]]
 ;

[22 lines hidden]

llvm/test/Transforms/InstCombine/ctpop.ll

[442 lines hidden]

   ret i32 %i4
 }

 define i32 @parity_xor_extra_use(i32 %arg, i32 %arg1) {
 ; CHECK-LABEL: @parity_xor_extra_use(
 ; CHECK-NEXT: [[I:%.*]] = tail call i32 @llvm.ctpop.i32(i32 [[ARG:%.*]]), !range [[RNG1]]
 ; CHECK-NEXT: [[I2:%.*]] = and i32 [[I]], 1
 ; CHECK-NEXT: tail call void @use(i32 [[I2]])
-; CHECK-NEXT: [[I3:%.*]] = tail call i32 @llvm.ctpop.i32(i32 [[ARG1:%.*]]), !range [[RNG1]]
-; CHECK-NEXT: [[I4:%.*]] = and i32 [[I3]], 1
-; CHECK-NEXT: [[I5:%.*]] = xor i32 [[I4]], [[I2]]
+; CHECK-NEXT: [[TMP1:%.*]] = xor i32 [[ARG1:%.*]], [[ARG]]
+; CHECK-NEXT: [[TMP2:%.*]] = call i32 @llvm.ctpop.i32(i32 [[TMP1]]), !range [[RNG1]]
+; CHECK-NEXT: [[I5:%.*]] = and i32 [[TMP2]], 1
 ; CHECK-NEXT: ret i32 [[I5]]
 ;
   %i = tail call i32 @llvm.ctpop.i32(i32 %arg)
   %i2 = and i32 %i, 1
   tail call void @use(i32 %i2)
   %i3 = tail call i32 @llvm.ctpop.i32(i32 %arg1)
   %i4 = and i32 %i3, 1
   %i5 = xor i32 %i4, %i2
   ret i32 %i5
 }

 define i32 @parity_xor_extra_use2(i32 %arg, i32 %arg1) {
 ; CHECK-LABEL: @parity_xor_extra_use2(
 ; CHECK-NEXT: [[I:%.*]] = tail call i32 @llvm.ctpop.i32(i32 [[ARG1:%.*]]), !range [[RNG1]]
 ; CHECK-NEXT: [[I2:%.*]] = and i32 [[I]], 1
 ; CHECK-NEXT: tail call void @use(i32 [[I2]])
-; CHECK-NEXT: [[I3:%.*]] = tail call i32 @llvm.ctpop.i32(i32 [[ARG:%.*]]), !range [[RNG1]]
-; CHECK-NEXT: [[I4:%.*]] = and i32 [[I3]], 1
-; CHECK-NEXT: [[I5:%.*]] = xor i32 [[I2]], [[I4]]
+; CHECK-NEXT: [[TMP1:%.*]] = xor i32 [[ARG1]], [[ARG:%.*]]
+; CHECK-NEXT: [[TMP2:%.*]] = call i32 @llvm.ctpop.i32(i32 [[TMP1]]), !range [[RNG1]]
+; CHECK-NEXT: [[I5:%.*]] = and i32 [[TMP2]], 1
 ; CHECK-NEXT: ret i32 [[I5]]
 ;
   %i = tail call i32 @llvm.ctpop.i32(i32 %arg1)
   %i2 = and i32 %i, 1
   tail call void @use(i32 %i2)
   %i3 = tail call i32 @llvm.ctpop.i32(i32 %arg)
   %i4 = and i32 %i3, 1
   %i5 = xor i32 %i2, %i4
   ret i32 %i5
 }
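
The ctpop test changes above rely on the parity identity `(ctpop(a) & 1) ^ (ctpop(b) & 1) == ctpop(a ^ b) & 1`, which now fires even though one of the masked values has an extra use. The sketch below checks that identity on a range of inputs; the helper names are hypothetical and the hand-written population count is just a portable substitute for the `llvm.ctpop` intrinsic.

```cpp
#include <cstdint>

// Portable population count (clears the lowest set bit each iteration).
unsigned popcount32(uint32_t v) {
  unsigned n = 0;
  while (v) {
    v &= v - 1;
    ++n;
  }
  return n;
}

// Shape before the fold: xor of two separately computed parities.
uint32_t parityBefore(uint32_t a, uint32_t b) {
  return (popcount32(b) & 1u) ^ (popcount32(a) & 1u);
}

// Shape after the fold: parity of the xor of the inputs.
uint32_t parityAfter(uint32_t a, uint32_t b) {
  return popcount32(b ^ a) & 1u;
}

// Compares both shapes for all pairs of values below `limit`.
bool parityFoldHolds(uint32_t limit) {
  for (uint32_t a = 0; a < limit; ++a)
    for (uint32_t b = 0; b < limit; ++b)
      if (parityBefore(a, b) != parityAfter(a, b))
        return false;
  return true;
}
```

A call such as `parityFoldHolds(512)` is a cheap spot check only; the Alive2 links in the review are the authoritative verification.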

llvm/test/Transforms/InstCombine/sub.ll

[1,997 lines hidden]

   %r = urem i8 %x, %y
   %d = sub i8 %x, %r
   %ed = zext i8 %d to i16
   %ex = zext i8 %x to i16
   %z = sub i16 %ex, %ed
   ret i16 %z
 }

+; x * y - x --> (y - 1) * x
+; TODO: The mul could retain nsw.
+
 define i8 @mul_sub_common_factor_commute1(i8 %x, i8 %y) {
 ; CHECK-LABEL: @mul_sub_common_factor_commute1(
-; CHECK-NEXT: [[M:%.*]] = mul nsw i8 [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT: [[A:%.*]] = sub nsw i8 [[M]], [[X]]
+; CHECK-NEXT: [[X1:%.*]] = add i8 [[Y:%.*]], -1
+; CHECK-NEXT: [[A:%.*]] = mul i8 [[X1]], [[X:%.*]]
 ; CHECK-NEXT: ret i8 [[A]]
 ;
   %m = mul nsw i8 %x, %y
   %a = sub nsw i8 %m, %x
   ret i8 %a
 }

+; TODO: The mul could retain nuw.
+
 define <2 x i8> @mul_sub_common_factor_commute2(<2 x i8> %x, <2 x i8> %y) {
 ; CHECK-LABEL: @mul_sub_common_factor_commute2(
-; CHECK-NEXT: [[M:%.*]] = mul nuw <2 x i8> [[Y:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[A:%.*]] = sub nuw <2 x i8> [[M]], [[X]]
+; CHECK-NEXT: [[M1:%.*]] = add <2 x i8> [[Y:%.*]], <i8 -1, i8 -1>
+; CHECK-NEXT: [[A:%.*]] = mul <2 x i8> [[M1]], [[X:%.*]]
 ; CHECK-NEXT: ret <2 x i8> [[A]]
 ;
   %m = mul nuw <2 x i8> %y, %x
   %a = sub nuw <2 x i8> %m, %x
   ret <2 x i8> %a
 }

+; x - (x * y) --> (1 - y) * x
+
 define i8 @mul_sub_common_factor_commute3(i8 %x, i8 %y) {
 ; CHECK-LABEL: @mul_sub_common_factor_commute3(
-; CHECK-NEXT: [[M:%.*]] = mul nuw i8 [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT: [[A:%.*]] = sub nsw i8 [[X]], [[M]]
+; CHECK-NEXT: [[M1:%.*]] = sub i8 1, [[Y:%.*]]
+; CHECK-NEXT: [[A:%.*]] = mul i8 [[M1]], [[X:%.*]]
 ; CHECK-NEXT: ret i8 [[A]]
 ;
   %m = mul nuw i8 %x, %y
   %a = sub nsw i8 %x, %m
   ret i8 %a
 }

 define i8 @mul_sub_common_factor_commute4(i8 %x, i8 %y) {
 ; CHECK-LABEL: @mul_sub_common_factor_commute4(
-; CHECK-NEXT: [[M:%.*]] = mul nsw i8 [[Y:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[A:%.*]] = sub nuw i8 [[X]], [[M]]
+; CHECK-NEXT: [[M1:%.*]] = sub i8 1, [[Y:%.*]]
+; CHECK-NEXT: [[A:%.*]] = mul i8 [[M1]], [[X:%.*]]
 ; CHECK-NEXT: ret i8 [[A]]
 ;
   %m = mul nsw i8 %y, %x
   %a = sub nuw i8 %x, %m
   ret i8 %a
 }

+; negative test - uses
+
 define i8 @mul_sub_common_factor_use(i8 %x, i8 %y) {
 ; CHECK-LABEL: @mul_sub_common_factor_use(
 ; CHECK-NEXT: [[M:%.*]] = mul i8 [[X:%.*]], [[Y:%.*]]
 ; CHECK-NEXT: call void @use8(i8 [[M]])
 ; CHECK-NEXT: [[A:%.*]] = sub i8 [[M]], [[X]]
 ; CHECK-NEXT: ret i8 [[A]]
 ;
   %m = mul i8 %x, %y
   call void @use8(i8 %m)
   %a = sub i8 %m, %x
   ret i8 %a
 }
