This is an archive of the discontinued LLVM Phabricator instance.

Differential D65151

[InstSimplify] Drop leftover "division-by-zero guard" around `@llvm.umul.with.overflow` inverted overflow bit
ClosedPublic

Authored by lebedev.ri on Jul 23 2019, 7:32 AM.

Details

Reviewers

nikic
spatel
xbolva00
RKSimon

Commits

rGc58478685410: [InstSimplify] Drop leftover "division-by-zero guard" around `@llvm.umul.with.
rL370351: [InstSimplify] Drop leftover "division-by-zero guard" around `@llvm.umul.with.

Summary

Now that D65143/D65144 produce @llvm.umul.with.overflow,
and D65147 has flattened the CFG, we can see that
the guard that may have been there to prevent division by zero is redundant.
We can simply drop it:

----------------------------------------
Name: no overflow or zero
  %iszero = icmp eq i4 %y, 0
  %umul = smul_overflow i4 %x, %y
  %umul.ov = extractvalue {i4, i1} %umul, 1
  %umul.ov.not = xor %umul.ov, -1
  %retval.0 = or i1 %iszero, %umul.ov.not
  ret i1 %retval.0
=>
  %iszero = icmp eq i4 %y, 0
  %umul = smul_overflow i4 %x, %y
  %umul.ov = extractvalue {i4, i1} %umul, 1
  %umul.ov.not = xor %umul.ov, -1
  %retval.0 = or i1 %iszero, %umul.ov.not
  ret i1 %umul.ov.not

Done: 1
Optimization is correct!
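
The Alive result above can also be reproduced by brute force over all 4-bit inputs. The following is a minimal sketch in standalone C++, assuming plain integer arithmetic is an adequate stand-in for the intrinsics' overflow bit; the helper names (`umulOverflowsI4`, `smulOverflowsI4`, `guardIsRedundantUnsigned`, `guardIsRedundantSigned`) are hypothetical and not part of LLVM.

```cpp
#include <cassert>

// Hypothetical model of the i1 result of @llvm.umul.with.overflow.i4.
// x and y are expected to be in [0, 15].
static bool umulOverflowsI4(unsigned x, unsigned y) {
  return x * y > 15u;
}

// Hypothetical model of the i1 result of @llvm.smul.with.overflow.i4.
// x and y are expected to be in [-8, 7].
static bool smulOverflowsI4(int x, int y) {
  int p = x * y;
  return p < -8 || p > 7;
}

// For every pair of unsigned i4 values, check that
//   (y == 0) | !overflow(x, y)  ==  !overflow(x, y)
// i.e. the zero guard adds nothing to the inverted overflow bit.
static bool guardIsRedundantUnsigned() {
  for (unsigned x = 0; x < 16; ++x)
    for (unsigned y = 0; y < 16; ++y) {
      bool notOv = !umulOverflowsI4(x, y);
      if (((y == 0) || notOv) != notOv)
        return false;
    }
  return true;
}

// Same check for signed i4 values.
static bool guardIsRedundantSigned() {
  for (int x = -8; x < 8; ++x)
    for (int y = -8; y < 8; ++y) {
      bool notOv = !smulOverflowsI4(x, y);
      if (((y == 0) || notOv) != notOv)
        return false;
    }
  return true;
}
```

The reason both loops succeed is that a zero multiplier always yields a product of zero, which never overflows, so the guard is true only in cases where the inverted overflow bit is already true.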

Note that this is inverted from what we have in the previous patch:
here we are looking for the inverted overflow bit.
That inversion is somewhat problematic: given this particular
pattern, we neither hoist the `not` closer to the `ret` (in which case the pattern
would have been identical to the one without inversion,
and would have been handled by the previous patch), nor
do the opposite transform. Regardless, we should handle this too.
I've filed PR42720.
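
To make the relationship between the two patterns concrete: the `or` form handled here is the De Morgan negation of the `and` form (`icmp ne` guard and'ed with the overflow bit) matched by `omitCheckForZeroBeforeMulWithOverflow`. The sketch below checks this on all four boolean inputs; the function names are hypothetical and only model the IR patterns.

```cpp
#include <cassert>

// Models: and i1 (icmp ne %y, 0), %ov
static bool andForm(bool yIsZero, bool ov) { return !yIsZero && ov; }

// Models: or i1 (icmp eq %y, 0), (xor i1 %ov, true)
static bool orForm(bool yIsZero, bool ov) { return yIsZero || !ov; }

// Returns true if orForm is the logical negation of andForm on every input.
static bool formsAreNegations() {
  for (int z = 0; z < 2; ++z)
    for (int o = 0; o < 2; ++o)
      if (orForm(z != 0, o != 0) != !andForm(z != 0, o != 0))
        return false;
  return true;
}
```

So if the `not` were hoisted past the `or`, the remaining operand would be exactly the `and` pattern; since no such hoisting happens, the inverted form needs its own match.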

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

lebedev.ri created this revision. Jul 23 2019, 7:32 AM

Herald added a subscriber: hiraditya. Jul 23 2019, 7:32 AM

lebedev.ri added parent revisions: D65150: [InstSimplify] Drop leftover "division-by-zero guard" around `@llvm.umul.with.overflow` overflow bit, D65147: [SimplifyCFG] FoldTwoEntryPHINode(): don't bailout on i1 PHI's if we can hoist a 'not' from incoming values, D65148: [SimplifyCFG] Bump phi-node-folding-threshold from 2 to 3. Jul 23 2019, 7:34 AM

LGTM

llvm/lib/Analysis/InstructionSimplify.cpp
1825–1838	A couple of possibilities to reduce the code duplication, but you can decide if it is worth it:
- Make a tiny helper for this chunk of code that matches the extract/intrinsic.
- Add a parameter that tells this function whether we should match the EQ/NE and 'not' part of the pattern (distinguishes between if we were called by 'and' or 'or').

This revision is now accepted and ready to land. Jul 26 2019, 6:51 AM

Reduced duplication.

llvm/lib/Analysis/InstructionSimplify.cpp
1825–1838	Yeah, this ended up having more duplication than I thought there would be.

One more thing: given this IR:

define dso_local i64 @_Z17will_not_overflowmmPb(i64 %arg, i64 %arg1, i8* nocapture %arg2) {
  %tmp = tail call { i64, i1 } @llvm.umul.with.overflow.i64(i64 %arg, i64 %arg1)
  %tmp3 = extractvalue { i64, i1 } %tmp, 1
  %tmp4 = zext i1 %tmp3 to i8
  store i8 %tmp4, i8* %arg2, align 1
  %tmp5 = mul i64 %arg1, %arg
  ret i64 %tmp5
}

declare { i64, i1 } @llvm.umul.with.overflow.i64(i64, i64)

Which pass should be responsible for merging those two multiplications into:

define dso_local i64 @_Z17will_not_overflowmmPb(i64 %arg, i64 %arg1, i8* nocapture %arg2) {
  %tmp = tail call { i64, i1 } @llvm.umul.with.overflow.i64(i64 %arg, i64 %arg1)
  %tmp3 = extractvalue { i64, i1 } %tmp, 1
  %tmp4 = zext i1 %tmp3 to i8
  store i8 %tmp4, i8* %arg2, align 1
  %tmp5 = extractvalue { i64, i1 } %tmp, 0
  ret i64 %tmp5
}

declare { i64, i1 } @llvm.umul.with.overflow.i64(i64, i64)

https://godbolt.org/z/CQW8it

----------------------------------------
Name: mul -> smul_overflow
  %r = mul i4 %x, %y
=>
  %tmp = smul_overflow i4 %x, %y
  %r = extractvalue {i4, i1} %tmp, 0

Done: 1
Optimization is correct!

----------------------------------------
Name: mul -> umul_overflow
  %r = mul i4 %x, %y
=>
  %tmp = umul_overflow i4 %x, %y
  %r = extractvalue {i4, i1} %tmp, 0

Done: 1
Optimization is correct!
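
The two proofs above say that element 0 of the with-overflow result is the same value as the plain wrapping `mul`. A rough source-level illustration follows, assuming a GCC- or Clang-compatible compiler that provides `__builtin_mul_overflow`; the wrapper name `valueMatchesPlainMul` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if the value stored by the overflow-checking multiply matches
// the plain wrapping multiply for the given operands.
static bool valueMatchesPlainMul(uint64_t a, uint64_t b) {
  uint64_t checked = 0;
  // GCC/Clang builtin: stores the wrapped product, returns the overflow flag.
  bool overflowed = __builtin_mul_overflow(a, b, &checked);
  (void)overflowed; // the flag does not influence the stored value
  return checked == a * b; // unsigned multiply wraps modulo 2^64
}
```

This only demonstrates that reusing the intrinsic's value is safe; it does not answer which pass should perform the merge.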

In D65151#1602699, @lebedev.ri wrote:

Which pass should be responsible for merging those two multiplications into:

This would be a shared problem for all of the overflow intrinsics. EarlyCSE is my first guess. Best to ask this on llvm-dev though, so we can get more feedback.

There's some code related to this in GVN: https://github.com/llvm-mirror/llvm/blob/6c33c8991a6740ed8c87d5d3980a6469b415628d/lib/Transforms/Scalar/GVN.cpp#L333-L342 However, it only works if an appropriate extractvalue already exists.

lebedev.ri removed a parent revision: D65148: [SimplifyCFG] Bump phi-node-folding-threshold from 2 to 3. Aug 29 2019, 4:54 AM

Rebased, NFC.

lebedev.ri edited the summary of this revision. Aug 29 2019, 5:31 AM

Closed by commit rL370351: [InstSimplify] Drop leftover "division-by-zero guard" around `@llvm.umul.with. (authored by lebedevri). Aug 29 2019, 5:52 AM

This revision was automatically updated to reflect the committed changes.

spatel mentioned this in D101423: [InstCombine] Fold overflow bit of [u|s]mul.with.overflow in a poison-safe way. Apr 28 2021, 5:53 AM

Revision Contents

Path	Size
llvm/lib/Analysis/InstructionSimplify.cpp	71 lines
llvm/test/Transforms/InstSimplify/div-by-0-guard-before-smul_ov-not.ll	12 lines
llvm/test/Transforms/InstSimplify/div-by-0-guard-before-umul_ov-not.ll	12 lines
llvm/test/Transforms/PhaseOrdering/unsigned-multiply-overflow-check.ll	12 lines

Diff 217848

llvm/lib/Analysis/InstructionSimplify.cpp

@@ ... @@ static Value *simplifyAndOrOfCmps(const SimplifyQuery &Q,
   // If we looked through casts, we can only handle a constant simplification
   // because we are not allowed to create a cast instruction here.
   if (auto *C = dyn_cast<Constant>(V))
     return ConstantExpr::getCast(Cast0->getOpcode(), C, Cast0->getType());
 
   return nullptr;
 }
 
+/// Check that the Op1 is in expected form, i.e.:
+/// %Agg = tail call { i4, i1 } @llvm.[us]mul.with.overflow.i4(i4 %X, i4 %???)
+/// %Op1 = extractvalue { i4, i1 } %Agg, 1
+static bool omitCheckForZeroBeforeMulWithOverflowInternal(Value *Op1,
+                                                          Value *X) {
+  auto *Extract = dyn_cast<ExtractValueInst>(Op1);
+  // We should only be extracting the overflow bit.
+  if (!Extract || !Extract->getIndices().equals(1))
+    return false;
+  Value *Agg = Extract->getAggregateOperand();
+  // This should be a multiplication-with-overflow intrinsic.
+  if (!match(Agg, m_CombineOr(m_Intrinsic<Intrinsic::umul_with_overflow>(),
+                              m_Intrinsic<Intrinsic::smul_with_overflow>())))
+    return false;
+  // One of its multipliers should be the value we checked for zero before.
+  if (!match(Agg, m_CombineOr(m_Argument<0>(m_Specific(X)),
+                              m_Argument<1>(m_Specific(X)))))
+    return false;
+  return true;
+}
+
 /// The @llvm.[us]mul.with.overflow intrinsic could have been folded from some
 /// other form of check, e.g. one that was using division; it may have been
 /// guarded against division-by-zero. We can drop that check now.
 /// Look for:
 /// %Op0 = icmp ne i4 %X, 0
 /// %Agg = tail call { i4, i1 } @llvm.[us]mul.with.overflow.i4(i4 %X, i4 %???)
 /// %Op1 = extractvalue { i4, i1 } %Agg, 1
 /// %??? = and i1 %Op0, %Op1
 /// We can just return %Op1
 static Value *omitCheckForZeroBeforeMulWithOverflow(Value *Op0, Value *Op1) {
   ICmpInst::Predicate Pred;
   Value *X;
   if (!match(Op0, m_ICmp(Pred, m_Value(X), m_Zero())) ||
       Pred != ICmpInst::Predicate::ICMP_NE)
     return nullptr;
-  auto *Extract = dyn_cast<ExtractValueInst>(Op1);
-  // We should only be extracting the overflow bit.
-  if (!Extract || !Extract->getIndices().equals(1))
-    return nullptr;
-  Value *Agg = Extract->getAggregateOperand();
-  // This should be a multiplication-with-overflow intrinsic.
-  if (!match(Agg, m_CombineOr(m_Intrinsic<Intrinsic::umul_with_overflow>(),
-                              m_Intrinsic<Intrinsic::smul_with_overflow>())))
-    return nullptr;
-  // One of its multipliers should be the value we checked for zero before.
-  if (!match(Agg, m_CombineOr(m_Argument<0>(m_Specific(X)),
-                              m_Argument<1>(m_Specific(X)))))
+  // Is Op1 in expected form?
+  if (!omitCheckForZeroBeforeMulWithOverflowInternal(Op1, X))
     return nullptr;
   // Can omit 'and', and just return the overflow bit.
   return Op1;
 }
 
+/// The @llvm.[us]mul.with.overflow intrinsic could have been folded from some
+/// other form of check, e.g. one that was using division; it may have been
+/// guarded against division-by-zero. We can drop that check now.
+/// Look for:
+/// %Op0 = icmp eq i4 %X, 0
+/// %Agg = tail call { i4, i1 } @llvm.[us]mul.with.overflow.i4(i4 %X, i4 %???)
+/// %Op1 = extractvalue { i4, i1 } %Agg, 1
+/// %NotOp1 = xor i1 %Op1, true
+/// %or = or i1 %Op0, %NotOp1
+/// We can just return %NotOp1
+static Value *omitCheckForZeroBeforeInvertedMulWithOverflow(Value *Op0,
+                                                            Value *NotOp1) {
+  ICmpInst::Predicate Pred;
+  Value *X;
+  if (!match(Op0, m_ICmp(Pred, m_Value(X), m_Zero())) ||
+      Pred != ICmpInst::Predicate::ICMP_EQ)
+    return nullptr;
+  // We expect the other hand of an 'or' to be a 'not'.
+  Value *Op1;
+  if (!match(NotOp1, m_Not(m_Value(Op1))))
+    return nullptr;
+  // Is Op1 in expected form?
+  if (!omitCheckForZeroBeforeMulWithOverflowInternal(Op1, X))
+    return nullptr;
+  // Can omit 'and', and just return the inverted overflow bit.
+  return NotOp1;
+}
+
 /// Given operands for an And, see if we can fold the result.
 /// If not, this returns null.
 static Value *SimplifyAndInst(Value *Op0, Value *Op1, const SimplifyQuery &Q,
                               unsigned MaxRecurse) {
   if (Constant *C = foldOrCommuteConstant(Instruction::And, Op0, Op1, Q))
     return C;
 
   // X & undef -> 0
   if (match(Op1, m_Undef()))
     return Constant::getNullValue(Op0->getType());
 
   // X & X = X
   if (Op0 == Op1)
     return Op0;
@@ ... @@ static Value *SimplifyOrInst(Value *Op0, Value *Op1, const SimplifyQuery &Q,
   if (match(Op1, m_And(m_Value(A), m_Value(B))) &&
       (match(Op0, m_c_Xor(m_Specific(A), m_Not(m_Specific(B)))) ||
        match(Op0, m_c_Xor(m_Not(m_Specific(A)), m_Specific(B)))))
     return Op0;
 
   if (Value *V = simplifyAndOrOfCmps(Q, Op0, Op1, false))
     return V;
 
+  // If we have a multiplication overflow check that is being 'and'ed with a
+  // check that one of the multipliers is not zero, we can omit the 'and', and
+  // only keep the overflow check.
+  if (Value *V = omitCheckForZeroBeforeInvertedMulWithOverflow(Op0, Op1))
+    return V;
+  if (Value *V = omitCheckForZeroBeforeInvertedMulWithOverflow(Op1, Op0))
+    return V;
+
   // Try some generic simplifications for associative operations.
   if (Value *V = SimplifyAssociativeBinOp(Instruction::Or, Op0, Op1, Q,
                                           MaxRecurse))
     return V;
 
   // Or distributes over And. Try some generic simplifications based on this.
   if (Value *V = ExpandBinOp(Instruction::Or, Op0, Op1, Instruction::And, Q,
                              MaxRecurse))

llvm/test/Transforms/InstSimplify/div-by-0-guard-before-smul_ov-not.ll

 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt %s -instsimplify -S | FileCheck %s
 
 declare { i4, i1 } @llvm.smul.with.overflow.i4(i4, i4) #1
 
 define i1 @t0_umul(i4 %size, i4 %nmemb) {
 ; CHECK-LABEL: @t0_umul(
-; CHECK-NEXT: [[CMP:%.*]] = icmp eq i4 [[SIZE:%.*]], 0
-; CHECK-NEXT: [[SMUL:%.*]] = tail call { i4, i1 } @llvm.smul.with.overflow.i4(i4 [[SIZE]], i4 [[NMEMB:%.*]])
+; CHECK-NEXT: [[SMUL:%.*]] = tail call { i4, i1 } @llvm.smul.with.overflow.i4(i4 [[SIZE:%.*]], i4 [[NMEMB:%.*]])
 ; CHECK-NEXT: [[SMUL_OV:%.*]] = extractvalue { i4, i1 } [[SMUL]], 1
 ; CHECK-NEXT: [[PHITMP:%.*]] = xor i1 [[SMUL_OV]], true
-; CHECK-NEXT: [[OR:%.*]] = or i1 [[CMP]], [[PHITMP]]
-; CHECK-NEXT: ret i1 [[OR]]
+; CHECK-NEXT: ret i1 [[PHITMP]]
 ;
   %cmp = icmp eq i4 %size, 0
   %smul = tail call { i4, i1 } @llvm.smul.with.overflow.i4(i4 %size, i4 %nmemb)
   %smul.ov = extractvalue { i4, i1 } %smul, 1
   %phitmp = xor i1 %smul.ov, true
   %or = or i1 %cmp, %phitmp
   ret i1 %or
 }
 
 define i1 @t1_commutative(i4 %size, i4 %nmemb) {
 ; CHECK-LABEL: @t1_commutative(
-; CHECK-NEXT: [[CMP:%.*]] = icmp eq i4 [[SIZE:%.*]], 0
-; CHECK-NEXT: [[SMUL:%.*]] = tail call { i4, i1 } @llvm.smul.with.overflow.i4(i4 [[SIZE]], i4 [[NMEMB:%.*]])
+; CHECK-NEXT: [[SMUL:%.*]] = tail call { i4, i1 } @llvm.smul.with.overflow.i4(i4 [[SIZE:%.*]], i4 [[NMEMB:%.*]])
 ; CHECK-NEXT: [[SMUL_OV:%.*]] = extractvalue { i4, i1 } [[SMUL]], 1
 ; CHECK-NEXT: [[PHITMP:%.*]] = xor i1 [[SMUL_OV]], true
-; CHECK-NEXT: [[OR:%.*]] = or i1 [[PHITMP]], [[CMP]]
-; CHECK-NEXT: ret i1 [[OR]]
+; CHECK-NEXT: ret i1 [[PHITMP]]
 ;
   %cmp = icmp eq i4 %size, 0
   %smul = tail call { i4, i1 } @llvm.smul.with.overflow.i4(i4 %size, i4 %nmemb)
   %smul.ov = extractvalue { i4, i1 } %smul, 1
   %phitmp = xor i1 %smul.ov, true
   %or = or i1 %phitmp, %cmp ; swapped
   ret i1 %or
 }

llvm/test/Transforms/InstSimplify/div-by-0-guard-before-umul_ov-not.ll

 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt %s -instsimplify -S | FileCheck %s
 
 declare { i4, i1 } @llvm.umul.with.overflow.i4(i4, i4) #1
 
 define i1 @t0_umul(i4 %size, i4 %nmemb) {
 ; CHECK-LABEL: @t0_umul(
-; CHECK-NEXT: [[CMP:%.*]] = icmp eq i4 [[SIZE:%.*]], 0
-; CHECK-NEXT: [[UMUL:%.*]] = tail call { i4, i1 } @llvm.umul.with.overflow.i4(i4 [[SIZE]], i4 [[NMEMB:%.*]])
+; CHECK-NEXT: [[UMUL:%.*]] = tail call { i4, i1 } @llvm.umul.with.overflow.i4(i4 [[SIZE:%.*]], i4 [[NMEMB:%.*]])
 ; CHECK-NEXT: [[UMUL_OV:%.*]] = extractvalue { i4, i1 } [[UMUL]], 1
 ; CHECK-NEXT: [[PHITMP:%.*]] = xor i1 [[UMUL_OV]], true
-; CHECK-NEXT: [[OR:%.*]] = or i1 [[CMP]], [[PHITMP]]
-; CHECK-NEXT: ret i1 [[OR]]
+; CHECK-NEXT: ret i1 [[PHITMP]]
 ;
   %cmp = icmp eq i4 %size, 0
   %umul = tail call { i4, i1 } @llvm.umul.with.overflow.i4(i4 %size, i4 %nmemb)
   %umul.ov = extractvalue { i4, i1 } %umul, 1
   %phitmp = xor i1 %umul.ov, true
   %or = or i1 %cmp, %phitmp
   ret i1 %or
 }
 
 define i1 @t1_commutative(i4 %size, i4 %nmemb) {
 ; CHECK-LABEL: @t1_commutative(
-; CHECK-NEXT: [[CMP:%.*]] = icmp eq i4 [[SIZE:%.*]], 0
-; CHECK-NEXT: [[UMUL:%.*]] = tail call { i4, i1 } @llvm.umul.with.overflow.i4(i4 [[SIZE]], i4 [[NMEMB:%.*]])
+; CHECK-NEXT: [[UMUL:%.*]] = tail call { i4, i1 } @llvm.umul.with.overflow.i4(i4 [[SIZE:%.*]], i4 [[NMEMB:%.*]])
 ; CHECK-NEXT: [[UMUL_OV:%.*]] = extractvalue { i4, i1 } [[UMUL]], 1
 ; CHECK-NEXT: [[PHITMP:%.*]] = xor i1 [[UMUL_OV]], true
-; CHECK-NEXT: [[OR:%.*]] = or i1 [[PHITMP]], [[CMP]]
-; CHECK-NEXT: ret i1 [[OR]]
+; CHECK-NEXT: ret i1 [[PHITMP]]
 ;
   %cmp = icmp eq i4 %size, 0
   %umul = tail call { i4, i1 } @llvm.umul.with.overflow.i4(i4 %size, i4 %nmemb)
   %umul.ov = extractvalue { i4, i1 } %umul, 1
   %phitmp = xor i1 %umul.ov, true
   %or = or i1 %phitmp, %cmp ; swapped
   ret i1 %or
 }

llvm/test/Transforms/PhaseOrdering/unsigned-multiply-overflow-check.ll

@@ ... @@
 ; INSTCOMBINESIMPLIFYCFGONLY-NEXT: [[UMUL:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 [[ARG]], i64 [[ARG1:%.*]])
 ; INSTCOMBINESIMPLIFYCFGONLY-NEXT: [[UMUL_OV:%.*]] = extractvalue { i64, i1 } [[UMUL]], 1
 ; INSTCOMBINESIMPLIFYCFGONLY-NEXT: [[PHITMP:%.*]] = xor i1 [[UMUL_OV]], true
 ; INSTCOMBINESIMPLIFYCFGONLY-NEXT: [[T6:%.*]] = select i1 [[T0]], i1 true, i1 [[PHITMP]]
 ; INSTCOMBINESIMPLIFYCFGONLY-NEXT: ret i1 [[T6]]
 ;
 ; INSTCOMBINESIMPLIFYCFGINSTCOMBINE-LABEL: @will_overflow(
 ; INSTCOMBINESIMPLIFYCFGINSTCOMBINE-NEXT: bb:
-; INSTCOMBINESIMPLIFYCFGINSTCOMBINE-NEXT: [[T0:%.*]] = icmp eq i64 [[ARG:%.*]], 0
-; INSTCOMBINESIMPLIFYCFGINSTCOMBINE-NEXT: [[UMUL:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 [[ARG]], i64 [[ARG1:%.*]])
+; INSTCOMBINESIMPLIFYCFGINSTCOMBINE-NEXT: [[UMUL:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 [[ARG:%.*]], i64 [[ARG1:%.*]])
 ; INSTCOMBINESIMPLIFYCFGINSTCOMBINE-NEXT: [[UMUL_OV:%.*]] = extractvalue { i64, i1 } [[UMUL]], 1
 ; INSTCOMBINESIMPLIFYCFGINSTCOMBINE-NEXT: [[PHITMP:%.*]] = xor i1 [[UMUL_OV]], true
-; INSTCOMBINESIMPLIFYCFGINSTCOMBINE-NEXT: [[T6:%.*]] = or i1 [[T0]], [[PHITMP]]
-; INSTCOMBINESIMPLIFYCFGINSTCOMBINE-NEXT: ret i1 [[T6]]
+; INSTCOMBINESIMPLIFYCFGINSTCOMBINE-NEXT: ret i1 [[PHITMP]]
 ;
 ; INSTCOMBINESIMPLIFYCFGCOSTLYONLY-LABEL: @will_overflow(
 ; INSTCOMBINESIMPLIFYCFGCOSTLYONLY-NEXT: bb:
 ; INSTCOMBINESIMPLIFYCFGCOSTLYONLY-NEXT: [[T0:%.*]] = icmp eq i64 [[ARG:%.*]], 0
 ; INSTCOMBINESIMPLIFYCFGCOSTLYONLY-NEXT: [[UMUL:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 [[ARG]], i64 [[ARG1:%.*]])
 ; INSTCOMBINESIMPLIFYCFGCOSTLYONLY-NEXT: [[UMUL_OV:%.*]] = extractvalue { i64, i1 } [[UMUL]], 1
 ; INSTCOMBINESIMPLIFYCFGCOSTLYONLY-NEXT: [[PHITMP:%.*]] = xor i1 [[UMUL_OV]], true
 ; INSTCOMBINESIMPLIFYCFGCOSTLYONLY-NEXT: [[T6:%.*]] = select i1 [[T0]], i1 true, i1 [[PHITMP]]
 ; INSTCOMBINESIMPLIFYCFGCOSTLYONLY-NEXT: ret i1 [[T6]]
 ;
 ; INSTCOMBINESIMPLIFYCFGCOSTLYINSTCOMBINE-LABEL: @will_overflow(
 ; INSTCOMBINESIMPLIFYCFGCOSTLYINSTCOMBINE-NEXT: bb:
-; INSTCOMBINESIMPLIFYCFGCOSTLYINSTCOMBINE-NEXT: [[T0:%.*]] = icmp eq i64 [[ARG:%.*]], 0
-; INSTCOMBINESIMPLIFYCFGCOSTLYINSTCOMBINE-NEXT: [[UMUL:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 [[ARG]], i64 [[ARG1:%.*]])
+; INSTCOMBINESIMPLIFYCFGCOSTLYINSTCOMBINE-NEXT: [[UMUL:%.*]] = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 [[ARG:%.*]], i64 [[ARG1:%.*]])
 ; INSTCOMBINESIMPLIFYCFGCOSTLYINSTCOMBINE-NEXT: [[UMUL_OV:%.*]] = extractvalue { i64, i1 } [[UMUL]], 1
 ; INSTCOMBINESIMPLIFYCFGCOSTLYINSTCOMBINE-NEXT: [[PHITMP:%.*]] = xor i1 [[UMUL_OV]], true
-; INSTCOMBINESIMPLIFYCFGCOSTLYINSTCOMBINE-NEXT: [[T6:%.*]] = or i1 [[T0]], [[PHITMP]]
-; INSTCOMBINESIMPLIFYCFGCOSTLYINSTCOMBINE-NEXT: ret i1 [[T6]]
+; INSTCOMBINESIMPLIFYCFGCOSTLYINSTCOMBINE-NEXT: ret i1 [[PHITMP]]
 ;
 bb:
   %t0 = icmp eq i64 %arg, 0
   br i1 %t0, label %bb5, label %bb2
 
 bb2: ; preds = %bb
   %t3 = udiv i64 -1, %arg
   %t4 = icmp ult i64 %t3, %arg1
   br label %bb5
 
 bb5: ; preds = %bb2, %bb
   %t6 = phi i1 [ false, %bb ], [ %t4, %bb2 ]
   %t7 = xor i1 %t6, true
   ret i1 %t7
 }
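
For orientation, the input IR of `@will_overflow` above has the shape of a division-based overflow check with a zero guard, followed by a negation. The sketch below shows a source-level idiom with that shape and compares it with the overflow-checking multiply. It assumes a GCC- or Clang-compatible compiler for `__builtin_mul_overflow`; the function names are hypothetical and the actual source behind the test is not shown in this review.

```cpp
#include <cassert>
#include <cstdint>

// Division-based check with a zero guard: the guard exists only so that
// UINT64_MAX / a is never evaluated with a == 0.
static bool willNotOverflowDiv(uint64_t a, uint64_t b) {
  bool overflows = (a != 0) && (UINT64_MAX / a < b);
  return !overflows;
}

// Same predicate via the overflow-checking multiply (GCC/Clang builtin).
// No guard is needed, because a zero operand never overflows.
static bool willNotOverflowBuiltin(uint64_t a, uint64_t b) {
  uint64_t product = 0;
  return !__builtin_mul_overflow(a, b, &product);
}
```

Once the division has been replaced by the intrinsic, the guard no longer protects anything, which is the redundancy this revision removes.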
