This is an archive of the discontinued LLVM Phabricator instance.

Differential D67356

[InstCombine] Simplify @llvm.usub.with.overflow+non-zero check (PR43251)
Closed · Public

Authored by lebedev.ri on Sep 9 2019, 7:59 AM.

Details

Reviewers

spatel
nikic
xbolva00
majnemer

Commits

rG7a67ed579520: [InstCombine] Simplify @llvm.usub.with.overflow+non-zero check (PR43251)
rL372341: [InstCombine] Simplify @llvm.usub.with.overflow+non-zero check (PR43251)

Summary

This is again motivated by the D67122 sanitizer check enhancement.
That patch seemingly worsens the -fsanitize=pointer-overflow
overhead from 25% to 50%, which strongly implies missing folds.

In this particular case, given

char* test(char& base, unsigned long offset) {
  return &base - offset;
}

it will end up producing something like
https://godbolt.org/z/luGEju
which after optimizations reduces down to roughly

declare void @use64(i64)
define i1 @test(i8* dereferenceable(1) %base, i64 %offset) {
  %base_int = ptrtoint i8* %base to i64
  %adjusted = sub i64 %base_int, %offset
  call void @use64(i64 %adjusted)
  %not_null = icmp ne i64 %adjusted, 0
  %no_underflow = icmp ule i64 %adjusted, %base_int
  %no_underflow_and_not_null = and i1 %not_null, %no_underflow
  ret i1 %no_underflow_and_not_null
}

Without D67122 there was no %not_null,
and in this particular case we can "get rid of it" by merging the two checks.
Here we are checking Base u>= Offset && (Base u- Offset) != 0; since (Base u- Offset) != 0 is just Base != Offset, that is simply Base u> Offset.

Alive proofs:
https://rise4fun.com/Alive/QOs
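
As a brute-force complement to the Alive proof, the same equivalence can be checked over every 8-bit input. This is only a minimal sketch and not part of the patch: the function names are made up for illustration, and `std::uint8_t` stands in for the `i64` values of the IR above on the assumption that the argument does not depend on bit width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the two checks as they appear in the IR above,
//   %not_null     = icmp ne  %adjusted, 0
//   %no_underflow = icmp ule %adjusted, %base_int
bool two_checks(std::uint8_t base, std::uint8_t offset) {
  // Wrapping subtraction, like `sub` in LLVM IR.
  const std::uint8_t adjusted = static_cast<std::uint8_t>(base - offset);
  const bool not_null = adjusted != 0;
  const bool no_underflow = adjusted <= base;
  return not_null && no_underflow;
}

// Hypothetical helper: the single comparison the fold produces.
bool merged_check(std::uint8_t base, std::uint8_t offset) {
  return base > offset;
}

// Counts the (base, offset) pairs on which the two forms disagree.
int count_mismatches() {
  int mismatches = 0;
  for (int b = 0; b < 256; ++b) {
    for (int o = 0; o < 256; ++o) {
      const auto base = static_cast<std::uint8_t>(b);
      const auto offset = static_cast<std::uint8_t>(o);
      if (two_checks(base, offset) != merged_check(base, offset))
        ++mismatches;
    }
  }
  return mismatches;
}

// Aborts (in builds with assertions enabled) if any input distinguishes
// the two forms.
void verify_fold() { assert(count_mismatches() == 0); }
```

Calling `verify_fold()` from a small test driver is enough to exercise all 65536 input pairs.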

The @llvm.usub.with.overflow pattern itself is not handled here,
because the sub + icmp form shown above is the main pattern, the one we currently consider canonical.

https://bugs.llvm.org/show_bug.cgi?id=43251

Diff Detail

Repository: rL LLVM

Event Timeline

lebedev.ri created this revision. Sep 9 2019, 7:59 AM

Herald added a subscriber: hiraditya. Sep 9 2019, 7:59 AM

vsk added inline comments. Sep 9 2019, 1:02 PM

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2249 ↗	(On Diff #219356)	Does the 'or of icmps' case arise anywhere?

lebedev.ri marked 2 inline comments as done. Sep 9 2019, 1:15 PM

lebedev.ri added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2249 ↗	(On Diff #219356)	Define arise. We can trivially get one from another via De Morgan's laws: `(a && b) ? c : d` <--> `!(a && b) ? d : c` <--> `(!a || !b) ? d : c`. It's a really bad idea to intentionally not handle such nearby patterns.
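
The chain of equivalences in the comment above can be checked exhaustively, since there are only four combinations of `a` and `b`. A minimal sketch, with illustrative function names that do not come from the patch:

```cpp
#include <cassert>

// The three forms from the comment, written as plain C++ selects.
int select_and_form(bool a, bool b, int c, int d) {
  return (a && b) ? c : d;
}

int select_negated_form(bool a, bool b, int c, int d) {
  return !(a && b) ? d : c;
}

int select_or_form(bool a, bool b, int c, int d) {
  return (!a || !b) ? d : c;
}

// Returns true if all three forms pick the same arm for every a, b.
bool all_forms_agree() {
  const int c = 1;
  const int d = 2;
  for (int ai = 0; ai < 2; ++ai) {
    for (int bi = 0; bi < 2; ++bi) {
      const bool a = ai != 0;
      const bool b = bi != 0;
      const int expected = select_and_form(a, b, c, d);
      if (select_negated_form(a, b, c, d) != expected ||
          select_or_form(a, b, c, d) != expected)
        return false;
    }
  }
  return true;
}

// Aborts (in builds with assertions enabled) if the forms ever differ.
void verify_de_morgan() { assert(all_forms_agree()); }
```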

vsk added inline comments. Sep 9 2019, 1:24 PM

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2249 ↗	(On Diff #219356)	As in, do unsigned underflow checks tend to contain this specific or-of-icmp pattern, or, more generally, do programs in the wild tend to? If this isn't the case, then it seems like there's a compile-time cost for optimizing the pattern without any compensating benefit.

lebedev.ri marked 2 inline comments as done. Sep 9 2019, 1:29 PM

lebedev.ri added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2249 ↗	(On Diff #219356)	I'm not sure what the question is.

xbolva00 added a subscriber: xbolva00. Sep 9 2019, 1:51 PM

xbolva00 added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2249 ↗	(On Diff #219356)	If compile-time cost is a concern, put it in AggressiveInstCombine, but I don't think we should do that in this case since there is nothing expensive like ValueTracking here. I like Roman's current code as is.

vsk added inline comments. Sep 9 2019, 3:59 PM

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2249 ↗	(On Diff #219356)	I'm asking: why bother optimizing this pattern, or what is the positive case for doing so (apart from "it's possible to do so")? Essentially my previous comment, with a question mark at the end :).

lebedev.ri marked an inline comment as done. Sep 10 2019, 1:03 AM

lebedev.ri added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2249 ↗	(On Diff #219356)	I'm still unable to answer the question because i'm not sure about which pattern specifically you are talking about. We can't just hope that the pattern we will always get is `(a && b) ? c : d` , it can trivially be converted to the `or`-form.

lebedev.ri edited the summary of this revision. Sep 10 2019, 6:04 AM

lebedev.ri edited the summary of this revision. Sep 10 2019, 6:36 AM

Patch updated: handle two more predicates
https://rise4fun.com/Alive/gjT

vsk added inline comments. Sep 10 2019, 12:11 PM

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2249 ↗	(On Diff #219356)	Stepping back a bit, I think the approach here should be driven by data. You're right to be concerned about a change in the pipeline causing a performance regression in ubsan. I don't think the answer to that is "introduce folds for all kinds of patterns the compiler might see", because then we'd endlessly be spinning our wheels and writing folds that aren't necessarily helpful (just extra maintenance & compile-time burden for no benefit).

Instead, the approach should be to focus on the actual end goal. We can check in a simple -fsanitize=pointer-overflow benchmark, perhaps your rawspeed benchmark, to the llvm test suite. There are tons of bots that monitor the performance changes in the test suite (close to) every commit. Then, it makes sense to land targeted changes to improve performance. If there's ever a regression, we will actually know for sure, and be able to narrow it down, introduce a targeted fix, etc.

Taking a narrower focus again, I thought it was clear that we were discussing the 'or' form, but let me rephrase. Are the changes in "foldOrOfICmps" necessary to improve performance on your benchmark?

lebedev.ri added a child revision: D67412: [InstCombine] foldUnsignedUnderflowCheck(): handle last few cases (PR43251). Sep 10 2019, 12:35 PM

xbolva00 added inline comments. Sep 10 2019, 12:49 PM

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2249 ↗	(On Diff #219356)	> just extra maintenance

Well, we could leave it as is, but after some weeks/months/years we would realize we need this fold, and somebody may implement it from scratch instead of the current few lines.

> compile-time burden

Well, I am quite surprised that this concern is repeated so often for instcombine's folds while major new things introduced to other passes just go in.

lebedev.ri added inline comments. Sep 10 2019, 12:55 PM

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2249 ↗	(On Diff #219356)	> Are the changes in "foldOrOfICmps" necessary to improve performance on your benchmark?

Ok, thank you for your question. I'm going to answer honestly: i don't know, i didn't check, and more importantly i don't care. By short-sightedly handling only the most obvious pattern one guarantees that the work will have to be redone to:
- spot the underoptimized pattern
- recognize that it can be folded
- come up with a proof
- write tests
- find where to patch
- patch
- submit the patch & get it through review.

That all is time-wasteful given it could have been trivially solved when the initial pattern was being added. Granted, not all variations will be caught, but the most obvious ones will be. Even in my limited time contributing to the instcombine pass i have seen this play out, and it isn't fun to find that it is due to not considering even basic commutativity (and the question at hand, `(a && b) ? c : d` <--> `!(a && b) ? d : c` <--> `(!a || !b) ? d : c`, is basic commutativity).

And finally, 'just as a glimpse of things to come', that question will not exist (in its current form at least) when one day the peep-hole pass is SMT driven. (well, more specifically all the folds would be pre-auto-deduced)

Let me know if that answers the question?

xbolva00 added inline comments. Sep 10 2019, 1:00 PM

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2249 ↗	(On Diff #219356)	> when one day the peep-hole pass is SMT driven.

compile time would be crazy :D

lebedev.ri marked 2 inline comments as done. Sep 10 2019, 1:09 PM

lebedev.ri added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2249 ↗	(On Diff #219356)	>> when one day the peep-hole pass is SMT driven.
> compile time would be crazy :D

Yes, correct observation :D thus

> (well, more specifically all the folds would be pre-auto-deduced)

so it wouldn't actually do any SMT during opt.

vsk added a subscriber: majnemer. Sep 10 2019, 1:42 PM

vsk added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2249 ↗	(On Diff #219356)	To your first concern, llvm contributors are generally pretty good at spotting opportunities to reuse code, so I'm not too worried. Besides, @lebedev.ri's already done the work and left a paper trail :). Second, note that InstCombine really is one of the biggest (top 3, IIRC) compile-time bears in the mid-level pipeline (http://lists.llvm.org/pipermail/llvm-dev/2017-March/111257.html). The marginal cost of introducing a new fold to InstCombine may be low, but adding folds without regard for the time-benefit tradeoff is what gets us into trouble.
2249 ↗	(On Diff #219356)	So, "i didn't check, and more importantly i don't care" is not a great approach here :). And it's not at all clear to me that work will have to be redone: perhaps only a subset of these changes will ever be needed! Imho we should let performance data and real-world examples guide us, not intuitions and hunches. Perhaps I'm not making the case for this well. Paging @majnemer -- as the code owner, wdyt?

lebedev.ri removed a reviewer: vsk. Sep 10 2019, 1:56 PM

lebedev.ri marked an inline comment as done.

lebedev.ri added a subscriber: vsk.

lebedev.ri marked 9 inline comments as done. Sep 10 2019, 2:49 PM

lebedev.ri added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2249 ↗	(On Diff #219356)	I have nothing further to add here. To me it's exactly the same question as "shall we only match `(X * Y) ult Z` or shall we also match `Z ugt (X * Y)`?"
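
The point about `(X * Y) ult Z` versus `Z ugt (X * Y)` is that exchanging the operands of a comparison while also swapping its predicate leaves the result unchanged, which is what `ICmpInst::getSwappedPredicate` is used for in the patch. A minimal sketch of that idea follows; the `UPred` enum and helper names are hypothetical stand-ins, not LLVM's actual types.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the unsigned relational ICmp predicates.
enum class UPred { ULT, ULE, UGT, UGE };

// The predicate that yields the same result once the operands are exchanged.
UPred swapped(UPred p) {
  switch (p) {
  case UPred::ULT: return UPred::UGT;
  case UPred::ULE: return UPred::UGE;
  case UPred::UGT: return UPred::ULT;
  case UPred::UGE: return UPred::ULE;
  }
  return p; // not reached for valid enumerators
}

// Evaluates `lhs <pred> rhs` on unsigned 8-bit values.
bool eval(UPred p, std::uint8_t lhs, std::uint8_t rhs) {
  switch (p) {
  case UPred::ULT: return lhs < rhs;
  case UPred::ULE: return lhs <= rhs;
  case UPred::UGT: return lhs > rhs;
  case UPred::UGE: return lhs >= rhs;
  }
  return false; // not reached for valid enumerators
}

// Checks `x <p> y` == `y <swapped(p)> x` for every predicate and input.
bool swap_preserves_meaning() {
  const UPred all[] = {UPred::ULT, UPred::ULE, UPred::UGT, UPred::UGE};
  for (UPred p : all) {
    for (int xi = 0; xi < 256; ++xi) {
      for (int yi = 0; yi < 256; ++yi) {
        const auto x = static_cast<std::uint8_t>(xi);
        const auto y = static_cast<std::uint8_t>(yi);
        if (eval(p, x, y) != eval(swapped(p), y, x))
          return false;
      }
    }
  }
  return true;
}

// Aborts (in builds with assertions enabled) if swapping changes a result.
void verify_swap() { assert(swap_preserves_meaning()); }
```

A matcher that accepts either operand order and then normalizes the predicate this way only needs one set of fold conditions afterwards.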

Split off instsimplify changes

lebedev.ri mentioned this in D67498: [InstSimplify] simplifyUnsignedRangeCheck(): handle few tautological cases (PR43251). Sep 12 2019, 7:15 AM

lebedev.ri added a child revision: D67498: [InstSimplify] simplifyUnsignedRangeCheck(): handle few tautological cases (PR43251).

lebedev.ri mentioned this in rG9c5a4a4527bc: [InstSimplify] simplifyUnsignedRangeCheck(): handle few tautological cases…. Sep 14 2019, 6:50 AM

Diffusion mentioned this in rL371921: [InstSimplify] simplifyUnsignedRangeCheck(): handle few tautological cases…. Sep 14 2019, 6:50 AM

bump

lebedev.ri removed a child revision: D67412: [InstCombine] foldUnsignedUnderflowCheck(): handle last few cases (PR43251). Sep 18 2019, 1:10 PM

lebedev.ri added a parent revision: D67412: [InstCombine] foldUnsignedUnderflowCheck(): handle last few cases (PR43251).

Rebased by swapping patches around.

@vsk @majnemer please can you specify, is this patch blocked by you?

This stand-off is really super unproductive for forward progress.

It both blocks the UBSan patch itself, and makes adding further folds
that are needed for that patch both more complicated
(it's best not to have many patches in-flight),
and makes them "illegal" - i don't overlook similar commutative cases in them.

Btw, I am in favour of this patch.

> It both blocks the UBSan patch itself,

Well, I don't think so. The ubsan patch works now, right? These folds just make the overhead a bit smaller, right? I would not connect the ubsan patch with these folds. We could have the ubsan patch in tree and these folds would land a bit later; no problem, I think. Or am I missing something?

xbolva00 accepted this revision. Sep 18 2019, 2:32 PM

This revision is now accepted and ready to land. Sep 18 2019, 2:32 PM

In D67356#1674601, @xbolva00 wrote:

> Btw, I am in favour of this patch.
>
>> It both blocks the UBSan patch itself,
>
> Well, I don't think so. The ubsan patch works now, right? These folds just make the overhead a bit smaller, right? I would not connect the ubsan patch with these folds. We could have the ubsan patch in tree and these folds would land a bit later; no problem, I think. Or am I missing something?

No, you are correct, these aren't *required* for the ubsan patch, but they are
very much welcomed - i suspect they will reduce the extra overhead significantly.
(Random guess - down from extra 26% to extra 10%, but i won't know until these folds are in place.)
(for optimized builds, not much can help -O0 :S)

LGTM

Thanks everyone!

Closed by commit rL372341: [InstCombine] Simplify @llvm.usub.with.overflow+non-zero check (PR43251) (authored by lebedevri). Sep 19 2019, 10:26 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp	21 lines
llvm/trunk/test/Transforms/InstCombine/result-of-usub-is-non-zero-and-no-overflow.ll	36 lines

Diff 220885

llvm/trunk/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp

[... 1,078 lines not shown ...]
     if (UnsignedPred == ICmpInst::ICMP_ULT && IsAnd &&
         isKnownNonZero(Offset, Q.DL, /*Depth=*/0, Q.AC, Q.CxtI, Q.DT))
       return Builder.CreateICmpUGT(Base, Offset);
     if (UnsignedPred == ICmpInst::ICMP_UGE && !IsAnd &&
         EqPred == ICmpInst::ICMP_EQ &&
         isKnownNonZero(Offset, Q.DL, /*Depth=*/0, Q.AC, Q.CxtI, Q.DT))
       return Builder.CreateICmpULE(Base, Offset);
   }
 
+  if (!match(UnsignedICmp,
+             m_c_ICmp(UnsignedPred, m_Specific(Base), m_Specific(Offset))) ||
+      !ICmpInst::isUnsigned(UnsignedPred))
+    return nullptr;
+  if (UnsignedICmp->getOperand(0) != Base)
+    UnsignedPred = ICmpInst::getSwappedPredicate(UnsignedPred);
+
+  // Base >=/> Offset && (Base - Offset) != 0  <-->  Base > Offset
+  // (no overflow and not null)
+  if ((UnsignedPred == ICmpInst::ICMP_UGE ||
+       UnsignedPred == ICmpInst::ICMP_UGT) &&
+      EqPred == ICmpInst::ICMP_NE && IsAnd)
+    return Builder.CreateICmpUGT(Base, Offset);
+
+  // Base <=/< Offset || (Base - Offset) == 0  <-->  Base <= Offset
+  // (overflow or null)
+  if ((UnsignedPred == ICmpInst::ICMP_ULE ||
+       UnsignedPred == ICmpInst::ICMP_ULT) &&
+      EqPred == ICmpInst::ICMP_EQ && !IsAnd)
+    return Builder.CreateICmpULE(Base, Offset);
+
   return nullptr;
 }
 
 /// Fold (icmp)&(icmp) if possible.
 Value *InstCombiner::foldAndOfICmps(ICmpInst *LHS, ICmpInst *RHS,
                                     Instruction &CxtI) {
   // Fold (!iszero(A & K1) & !iszero(A & K2)) -> (A & (K1 | K2)) == (K1 | K2)
   // if K1 and K2 are a one-bit mask.
[... 2,103 lines not shown ...]
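
The two comments in the added code each cover a strict and a non-strict predicate, giving four equivalences in total. The sketch below brute-forces all four over 8-bit values; it is an illustration under the assumption that 8-bit wrapping arithmetic is representative, and the helper names are invented here rather than taken from LLVM.

```cpp
#include <cassert>
#include <cstdint>

// (Base - Offset) != 0, using wrapping 8-bit subtraction like IR `sub`.
bool diff_is_nonzero(std::uint8_t base, std::uint8_t offset) {
  return static_cast<std::uint8_t>(base - offset) != 0;
}

// Returns true if all four predicate variants fold as the comments claim.
bool added_folds_hold() {
  for (int b = 0; b < 256; ++b) {
    for (int o = 0; o < 256; ++o) {
      const auto base = static_cast<std::uint8_t>(b);
      const auto offset = static_cast<std::uint8_t>(o);
      const bool nz = diff_is_nonzero(base, offset);

      // Base >=/> Offset && (Base - Offset) != 0  <-->  Base > Offset
      if (((base >= offset) && nz) != (base > offset)) return false;
      if (((base > offset) && nz) != (base > offset)) return false;

      // Base <=/< Offset || (Base - Offset) == 0  <-->  Base <= Offset
      if (((base <= offset) || !nz) != (base <= offset)) return false;
      if (((base < offset) || !nz) != (base <= offset)) return false;
    }
  }
  return true;
}

// Aborts (in builds with assertions enabled) if any variant fails.
void verify_added_folds() { assert(added_folds_hold()); }
```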

llvm/trunk/test/Transforms/InstCombine/result-of-usub-is-non-zero-and-no-overflow.ll

[... 18 lines not shown ...]
 define i1 @t0_noncanonical_ignoreme(i8 %base, i8 %offset) {
 ; CHECK-LABEL: @t0_noncanonical_ignoreme(
 ; CHECK-NEXT:    [[ADJUSTED:%.*]] = sub i8 [[BASE:%.*]], [[OFFSET:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[ADJUSTED]])
 ; CHECK-NEXT:    [[NO_UNDERFLOW:%.*]] = icmp uge i8 [[BASE]], [[OFFSET]]
 ; CHECK-NEXT:    call void @use1(i1 [[NO_UNDERFLOW]])
 ; CHECK-NEXT:    [[NOT_NULL:%.*]] = icmp ne i8 [[ADJUSTED]], 0
 ; CHECK-NEXT:    call void @use1(i1 [[NOT_NULL]])
-; CHECK-NEXT:    [[R:%.*]] = and i1 [[NOT_NULL]], [[NO_UNDERFLOW]]
-; CHECK-NEXT:    ret i1 [[R]]
+; CHECK-NEXT:    [[TMP1:%.*]] = icmp ugt i8 [[BASE]], [[OFFSET]]
+; CHECK-NEXT:    ret i1 [[TMP1]]
 ;
   %adjusted = sub i8 %base, %offset
   call void @use8(i8 %adjusted)
   %no_underflow = icmp ule i8 %adjusted, %base
   call void @use1(i1 %no_underflow)
   %not_null = icmp ne i8 %adjusted, 0
   call void @use1(i1 %not_null)
   %r = and i1 %not_null, %no_underflow
   ret i1 %r
 }

 define i1 @t1(i8 %base, i8 %offset) {
 ; CHECK-LABEL: @t1(
 ; CHECK-NEXT:    [[ADJUSTED:%.*]] = sub i8 [[BASE:%.*]], [[OFFSET:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[ADJUSTED]])
 ; CHECK-NEXT:    [[NO_UNDERFLOW:%.*]] = icmp uge i8 [[BASE]], [[OFFSET]]
 ; CHECK-NEXT:    call void @use1(i1 [[NO_UNDERFLOW]])
 ; CHECK-NEXT:    [[NOT_NULL:%.*]] = icmp ne i8 [[ADJUSTED]], 0
 ; CHECK-NEXT:    call void @use1(i1 [[NOT_NULL]])
-; CHECK-NEXT:    [[R:%.*]] = and i1 [[NOT_NULL]], [[NO_UNDERFLOW]]
-; CHECK-NEXT:    ret i1 [[R]]
+; CHECK-NEXT:    [[TMP1:%.*]] = icmp ugt i8 [[BASE]], [[OFFSET]]
+; CHECK-NEXT:    ret i1 [[TMP1]]
 ;
   %adjusted = sub i8 %base, %offset
   call void @use8(i8 %adjusted)
   %no_underflow = icmp uge i8 %base, %offset
   call void @use1(i1 %no_underflow)
   %not_null = icmp ne i8 %adjusted, 0
   call void @use1(i1 %not_null)
   %r = and i1 %not_null, %no_underflow
   ret i1 %r
 }
 define i1 @t1_strict(i8 %base, i8 %offset) {
 ; CHECK-LABEL: @t1_strict(
 ; CHECK-NEXT:    [[ADJUSTED:%.*]] = sub i8 [[BASE:%.*]], [[OFFSET:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[ADJUSTED]])
 ; CHECK-NEXT:    [[NO_UNDERFLOW:%.*]] = icmp ugt i8 [[BASE]], [[OFFSET]]
 ; CHECK-NEXT:    call void @use1(i1 [[NO_UNDERFLOW]])
 ; CHECK-NEXT:    [[NOT_NULL:%.*]] = icmp ne i8 [[ADJUSTED]], 0
 ; CHECK-NEXT:    call void @use1(i1 [[NOT_NULL]])
-; CHECK-NEXT:    [[R:%.*]] = and i1 [[NOT_NULL]], [[NO_UNDERFLOW]]
-; CHECK-NEXT:    ret i1 [[R]]
+; CHECK-NEXT:    [[TMP1:%.*]] = icmp ugt i8 [[BASE]], [[OFFSET]]
+; CHECK-NEXT:    ret i1 [[TMP1]]
 ;
   %adjusted = sub i8 %base, %offset
   call void @use8(i8 %adjusted)
   %no_underflow = icmp ugt i8 %base, %offset ; same is valid for strict predicate
   call void @use1(i1 %no_underflow)
   %not_null = icmp ne i8 %adjusted, 0
   call void @use1(i1 %not_null)
   %r = and i1 %not_null, %no_underflow
[... 32 lines not shown ...]
 define i1 @t3_commutability0(i8 %base, i8 %offset) {
 ; CHECK-LABEL: @t3_commutability0(
 ; CHECK-NEXT:    [[ADJUSTED:%.*]] = sub i8 [[BASE:%.*]], [[OFFSET:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[ADJUSTED]])
 ; CHECK-NEXT:    [[NO_UNDERFLOW:%.*]] = icmp uge i8 [[BASE]], [[OFFSET]]
 ; CHECK-NEXT:    call void @use1(i1 [[NO_UNDERFLOW]])
 ; CHECK-NEXT:    [[NOT_NULL:%.*]] = icmp ne i8 [[ADJUSTED]], 0
 ; CHECK-NEXT:    call void @use1(i1 [[NOT_NULL]])
-; CHECK-NEXT:    [[R:%.*]] = and i1 [[NOT_NULL]], [[NO_UNDERFLOW]]
-; CHECK-NEXT:    ret i1 [[R]]
+; CHECK-NEXT:    [[TMP1:%.*]] = icmp ugt i8 [[BASE]], [[OFFSET]]
+; CHECK-NEXT:    ret i1 [[TMP1]]
 ;
   %adjusted = sub i8 %base, %offset
   call void @use8(i8 %adjusted)
   %no_underflow = icmp ule i8 %offset, %base ; swapped
   call void @use1(i1 %no_underflow)
   %not_null = icmp ne i8 %adjusted, 0
   call void @use1(i1 %not_null)
   %r = and i1 %not_null, %no_underflow
   ret i1 %r
 }
 define i1 @t4_commutability1(i8 %base, i8 %offset) {
 ; CHECK-LABEL: @t4_commutability1(
 ; CHECK-NEXT:    [[ADJUSTED:%.*]] = sub i8 [[BASE:%.*]], [[OFFSET:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[ADJUSTED]])
 ; CHECK-NEXT:    [[NO_UNDERFLOW:%.*]] = icmp uge i8 [[BASE]], [[OFFSET]]
 ; CHECK-NEXT:    call void @use1(i1 [[NO_UNDERFLOW]])
 ; CHECK-NEXT:    [[NOT_NULL:%.*]] = icmp ne i8 [[ADJUSTED]], 0
 ; CHECK-NEXT:    call void @use1(i1 [[NOT_NULL]])
-; CHECK-NEXT:    [[R:%.*]] = and i1 [[NO_UNDERFLOW]], [[NOT_NULL]]
-; CHECK-NEXT:    ret i1 [[R]]
+; CHECK-NEXT:    [[TMP1:%.*]] = icmp ugt i8 [[BASE]], [[OFFSET]]
+; CHECK-NEXT:    ret i1 [[TMP1]]
 ;
   %adjusted = sub i8 %base, %offset
   call void @use8(i8 %adjusted)
   %no_underflow = icmp uge i8 %base, %offset
   call void @use1(i1 %no_underflow)
   %not_null = icmp ne i8 %adjusted, 0
   call void @use1(i1 %not_null)
   %r = and i1 %no_underflow, %not_null ; swapped
   ret i1 %r
 }
 define i1 @t5_commutability2(i8 %base, i8 %offset) {
 ; CHECK-LABEL: @t5_commutability2(
 ; CHECK-NEXT:    [[ADJUSTED:%.*]] = sub i8 [[BASE:%.*]], [[OFFSET:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[ADJUSTED]])
 ; CHECK-NEXT:    [[NO_UNDERFLOW:%.*]] = icmp uge i8 [[BASE]], [[OFFSET]]
 ; CHECK-NEXT:    call void @use1(i1 [[NO_UNDERFLOW]])
 ; CHECK-NEXT:    [[NOT_NULL:%.*]] = icmp ne i8 [[ADJUSTED]], 0
 ; CHECK-NEXT:    call void @use1(i1 [[NOT_NULL]])
-; CHECK-NEXT:    [[R:%.*]] = and i1 [[NO_UNDERFLOW]], [[NOT_NULL]]
-; CHECK-NEXT:    ret i1 [[R]]
+; CHECK-NEXT:    [[TMP1:%.*]] = icmp ugt i8 [[BASE]], [[OFFSET]]
+; CHECK-NEXT:    ret i1 [[TMP1]]
 ;
   %adjusted = sub i8 %base, %offset
   call void @use8(i8 %adjusted)
   %no_underflow = icmp ule i8 %offset, %base ; swapped
   call void @use1(i1 %no_underflow)
   %not_null = icmp ne i8 %adjusted, 0
   call void @use1(i1 %not_null)
   %r = and i1 %no_underflow, %not_null ; swapped
[... 33 lines not shown ...]
 define i1 @t7(i8 %base, i8 %offset) {
 ; CHECK-LABEL: @t7(
 ; CHECK-NEXT:    [[ADJUSTED:%.*]] = sub i8 [[BASE:%.*]], [[OFFSET:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[ADJUSTED]])
 ; CHECK-NEXT:    [[UNDERFLOW:%.*]] = icmp ult i8 [[BASE]], [[OFFSET]]
 ; CHECK-NEXT:    call void @use1(i1 [[UNDERFLOW]])
 ; CHECK-NEXT:    [[NULL:%.*]] = icmp eq i8 [[ADJUSTED]], 0
 ; CHECK-NEXT:    call void @use1(i1 [[NULL]])
-; CHECK-NEXT:    [[R:%.*]] = or i1 [[NULL]], [[UNDERFLOW]]
-; CHECK-NEXT:    ret i1 [[R]]
+; CHECK-NEXT:    [[TMP1:%.*]] = icmp ule i8 [[BASE]], [[OFFSET]]
+; CHECK-NEXT:    ret i1 [[TMP1]]
 ;
   %adjusted = sub i8 %base, %offset
   call void @use8(i8 %adjusted)
   %underflow = icmp ult i8 %base, %offset
   call void @use1(i1 %underflow)
   %null = icmp eq i8 %adjusted, 0
   call void @use1(i1 %null)
   %r = or i1 %null, %underflow
   ret i1 %r
 }
 define i1 @t7_nonstrict(i8 %base, i8 %offset) {
 ; CHECK-LABEL: @t7_nonstrict(
 ; CHECK-NEXT:    [[ADJUSTED:%.*]] = sub i8 [[BASE:%.*]], [[OFFSET:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[ADJUSTED]])
 ; CHECK-NEXT:    [[UNDERFLOW:%.*]] = icmp ule i8 [[BASE]], [[OFFSET]]
 ; CHECK-NEXT:    call void @use1(i1 [[UNDERFLOW]])
 ; CHECK-NEXT:    [[NULL:%.*]] = icmp eq i8 [[ADJUSTED]], 0
 ; CHECK-NEXT:    call void @use1(i1 [[NULL]])
-; CHECK-NEXT:    [[R:%.*]] = or i1 [[NULL]], [[UNDERFLOW]]
-; CHECK-NEXT:    ret i1 [[R]]
+; CHECK-NEXT:    [[TMP1:%.*]] = icmp ule i8 [[BASE]], [[OFFSET]]
+; CHECK-NEXT:    ret i1 [[TMP1]]
 ;
   %adjusted = sub i8 %base, %offset
   call void @use8(i8 %adjusted)
   %underflow = icmp ule i8 %base, %offset ; same is valid for non-strict predicate
   call void @use1(i1 %underflow)
   %null = icmp eq i8 %adjusted, 0
   call void @use1(i1 %null)
   %r = or i1 %null, %underflow
[... 28 lines not shown ...]
 define i1 @t9_commutative(i8 %base, i8 %offset) {
 ; CHECK-LABEL: @t9_commutative(
 ; CHECK-NEXT:    [[ADJUSTED:%.*]] = sub i8 [[BASE:%.*]], [[OFFSET:%.*]]
 ; CHECK-NEXT:    call void @use8(i8 [[ADJUSTED]])
 ; CHECK-NEXT:    [[UNDERFLOW:%.*]] = icmp ult i8 [[BASE]], [[OFFSET]]
 ; CHECK-NEXT:    call void @use1(i1 [[UNDERFLOW]])
 ; CHECK-NEXT:    [[NULL:%.*]] = icmp eq i8 [[ADJUSTED]], 0
 ; CHECK-NEXT:    call void @use1(i1 [[NULL]])
-; CHECK-NEXT:    [[R:%.*]] = or i1 [[NULL]], [[UNDERFLOW]]
-; CHECK-NEXT:    ret i1 [[R]]
+; CHECK-NEXT:    [[TMP1:%.*]] = icmp ule i8 [[BASE]], [[OFFSET]]
+; CHECK-NEXT:    ret i1 [[TMP1]]
 ;
   %adjusted = sub i8 %base, %offset
   call void @use8(i8 %adjusted)
   %underflow = icmp ult i8 %base, %adjusted ; swapped
   call void @use1(i1 %underflow)
   %null = icmp eq i8 %adjusted, 0
   call void @use1(i1 %null)
   %r = or i1 %null, %underflow
[... 119 lines not shown ...]
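
In `@t0_noncanonical_ignoreme` and `@t9_commutative` the input IR compares against `%adjusted`, yet the CHECK lines compare `[[BASE]]` with `[[OFFSET]]`. That is consistent with two equivalences for wrapping subtraction, which the sketch below brute-forces for 8-bit values. The helper names are hypothetical, and nothing here says which transform performs the rewrite.

```cpp
#include <cassert>
#include <cstdint>

// Wrapping 8-bit subtraction, like `sub i8 %base, %offset`.
std::uint8_t wrapping_sub(std::uint8_t base, std::uint8_t offset) {
  return static_cast<std::uint8_t>(base - offset);
}

// Returns true if, for every input pair:
//   (base - offset) u<= base  <-->  base u>= offset   (as in @t0_...)
//   base u< (base - offset)   <-->  base u<  offset   (as in @t9_...)
bool adjusted_comparisons_match() {
  for (int b = 0; b < 256; ++b) {
    for (int o = 0; o < 256; ++o) {
      const auto base = static_cast<std::uint8_t>(b);
      const auto offset = static_cast<std::uint8_t>(o);
      const std::uint8_t adjusted = wrapping_sub(base, offset);
      if ((adjusted <= base) != (base >= offset)) return false;
      if ((base < adjusted) != (base < offset)) return false;
    }
  }
  return true;
}

// Aborts (in builds with assertions enabled) if either equivalence fails.
void verify_adjusted_comparisons() { assert(adjusted_comparisons_match()); }
```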
