This is an archive of the discontinued LLVM Phabricator instance.

[InstSimplify] don't let poison inhibit an easy fold
Closed, Public

Authored by spatel on Oct 6 2017, 10:34 AM.

Details

Reviewers

majnemer
efriedma
craig.topper
nlopes

Summary

D38591 offers one way to avoid the assert in PR34838, but we could just explicitly handle this pattern in InstSimplify to make life easier for InstCombine. This also avoids using computeKnownBits() if we don't have to and gets known bad code reduced faster, so we're not wasting time on it.

Event Timeline

spatel created this revision. Oct 6 2017, 10:34 AM

Herald added a subscriber: mcrosier. Oct 6 2017, 10:34 AM

spatel mentioned this in D38591: [InstCombine] don't assert that InstSimplify has removed a known true/false cmp (PR34838). Oct 6 2017, 10:37 AM

If I understand correctly, the reason computeKnownBits can't handle this is that it doesn't know what to do with a known poison value? We could just solve the issue in computeKnownBits: currently, it says there are no known bits when it detects a shift overflow, but it could just say, for example, that all the bits are known zero (since the result of computeKnownBits is only meaningful if the value isn't poison).

In D38637#890899, @efriedma wrote:

If I understand correctly, the reason computeKnownBits can't handle this is that it doesn't know what to do with a known poison value? We could just solve the issue in computeKnownBits: currently, it says there are no known bits when it detects a shift overflow, but it could just say, for example, that all the bits are known zero (since the result of computeKnownBits is only meaningful if the value isn't poison).

Ah, I thought that wasn't an option. I remember some bug report related to undef handling in computeKnownBits that made me think we have to be conservative, but I'm not locating it. We have these comments in computeKnownBitsFromShiftOperator():

// If there is conflict between Known.Zero and Known.One, this must be an
// overflowing left shift, so the shift result is undefined. Clear Known
// bits so that other code could propagate this undef.

...

// If the shift amount could be greater than or equal to the bit-width of the LHS, the
// value could be undef, so we don't know anything about it.

...

// If there are no compatible shift amounts, then we've proven that the shift
// amount must be >= the BitWidth, and the result is undefined. We could
// return anything we'd like, but we need to make sure the sets of known bits
// stay disjoint (it should be better for some other code to actually
// propagate the undef than to pick a value here using known bits).

Also, this is in the header comment for computeKnownBits():
/// NOTE: we cannot consider 'undef' to be "IsZero" here. The problem is that
/// we cannot optimize based on the assumption that it is zero without changing
/// it to be an explicit zero. If we don't change it to zero, other code could
/// optimized based on the contradictory assumption that it is non-zero.

So since we don't know what the caller will do with the result, we're conservative. Is it different if something is known to produce poison rather than undef?

The exact definition of poison is still getting refined, but it's different from undef. undef is a bit-wise property, which is why ComputeKnownBits has to be careful around it. poison works differently; essentially, any arithmetic or logical operation which has poison as an input produces poison, no matter what the other input is. So it doesn't matter what ComputeKnownBits returns for a known poison value.
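To make the propagation rule concrete, here is a minimal sketch of a toy value model. `ModelValue`, `modelAnd`, `modelAdd`, and `observe` are hypothetical names for illustration only; this is not how LLVM represents values, and undef is deliberately not modeled. The point is that once an input is poison, the bit pattern attached to it never reaches a result, so an analysis is free to claim anything about those bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy model, not LLVM's representation: a value is either
// poison or a concrete 8-bit pattern.
struct ModelValue {
  bool IsPoison;
  uint8_t Bits; // meaningful only when IsPoison is false
};

// Arithmetic and logical operations yield poison if either input is poison,
// regardless of the other input (assumption: this mirrors the rule described
// in the comment above).
ModelValue modelAnd(ModelValue A, ModelValue B) {
  if (A.IsPoison || B.IsPoison)
    return {true, 0};
  return {false, static_cast<uint8_t>(A.Bits & B.Bits)};
}

ModelValue modelAdd(ModelValue A, ModelValue B) {
  if (A.IsPoison || B.IsPoison)
    return {true, 0};
  return {false, static_cast<uint8_t>(A.Bits + B.Bits)};
}

// Reading the bits is only legal for a non-poison value, which is why the
// bits recorded for a poison value can be anything.
uint8_t observe(ModelValue V) {
  assert(!V.IsPoison && "bits of a poison value are never observed");
  return V.Bits;
}
```

In this model, `modelAnd(poison, 0)` stays poison even though every bit of the other operand is zero.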

In D38637#891147, @efriedma wrote:

The exact definition of poison is still getting refined, but it's different from undef. undef is a bit-wise property, which is why ComputeKnownBits has to be careful around it. poison works differently; essentially, any arithmetic or logical operation which has poison as an input produces poison, no matter what the other input is. So it doesn't matter what ComputeKnownBits returns for a known poison value.

Thanks! Then, it seems clear I can abandon the InstCombine fix, and I'll redo this one to work in value tracking directly.

Patch updated:
Have computeKnownBitsFromShiftOperator() return a zero constant when we discover a conflict in known bits. This allows InstSimplify to fold compares.
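As a rough illustration of why returning zero helps, the sketch below models known bits for an 8-bit `shl nsw` by a constant, using the operand from the eq_shl_by_constant_produces_poison test (top bit known zero, bits 4 to 6 known one, shifted by 3). `KnownBits8`, `knownBitsOfShlNSW`, and `knownToDiffer` are hypothetical stand-ins, not LLVM APIs. The nsw sign-bit rule and the compare check are assumptions about what the real helpers do; neither appears in this diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 8-bit stand-in for llvm::KnownBits.
struct KnownBits8 {
  uint8_t Zero = 0; // bits known to be 0
  uint8_t One = 0;  // bits known to be 1

  bool hasConflict() const { return (Zero & One) != 0; }
  void resetAll() { Zero = 0; One = 0; }      // old behavior: know nothing
  void setAllZero() { Zero = 0xff; One = 0; } // new behavior: claim constant 0
};

// Known bits of `shl nsw Op, ShiftAmt` for an in-range constant amount.
// Assumption: with nsw, a non-poison result keeps the operand's sign bit.
KnownBits8 knownBitsOfShlNSW(KnownBits8 Op, unsigned ShiftAmt) {
  assert(ShiftAmt < 8 && "model only covers in-range shift amounts");
  KnownBits8 R;
  // Shifted-in low bits are known zero.
  R.Zero = static_cast<uint8_t>((Op.Zero << ShiftAmt) | ((1u << ShiftAmt) - 1u));
  R.One = static_cast<uint8_t>(Op.One << ShiftAmt);
  if (Op.Zero & 0x80)
    R.Zero |= 0x80;
  if (Op.One & 0x80)
    R.One |= 0x80;
  return R;
}

// True when some known bit of V disagrees with the constant C, so an
// equality compare against C can be folded (eq -> false, ne -> true).
bool knownToDiffer(const KnownBits8 &V, uint8_t C) {
  return (V.Zero & C) != 0 || (V.One & static_cast<uint8_t>(~C)) != 0;
}
```

With that operand the model yields Zero = 0x87 and One = 0x80, which conflict in the sign bit. After resetAll() nothing is known and the compare against 15 cannot be folded; after setAllZero() the value is claimed to be 0, which differs from 15, so `icmp eq` folds to false.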

spatel added inline comments. Oct 7 2017, 8:55 AM

lib/Analysis/ValueTracking.cpp
822–825	Oops - this comment doesn't make sense. An overshift produces poison too. Removing this check would mean we're going to fall through to the expensive check below more often though. Do we want to do that or should I just fix the comment?

Patch updated:
Fix bogus comment about undef and add a TODO for a potential follow-up patch.

My only potential concern here is that we could end up blocking optimizations because we're folding to undef rather than zero... but that's probably rare enough that it doesn't matter. LGTM.

lib/Analysis/ValueTracking.cpp
822–825	IIRC the old version of this comment is just outdated; we recently adjusted LangRef to be a bit more aggressive with shifts because we have some transforms which depend on it being poison rather than undef. Probably worth investigating getting rid of this at some point; I expect there are some interesting shifts we could analyze.

This revision is now accepted and ready to land. Oct 11 2017, 7:11 PM

Closed with rL315595

spatel mentioned this in D40649: [InstCombine] Don't crash on out of bounds shifts. Nov 30 2017, 8:59 AM

Revision Contents

Path                                            Size
lib/Analysis/ValueTracking.cpp                  26 lines
test/Transforms/InstSimplify/icmp-constant.ll   21 lines

Diff 118139

lib/Analysis/ValueTracking.cpp

[... 802 lines hidden ...] static void computeKnownBitsFromShiftOperator(
   unsigned BitWidth = Known.getBitWidth();

   if (auto *SA = dyn_cast<ConstantInt>(I->getOperand(1))) {
     unsigned ShiftAmt = SA->getLimitedValue(BitWidth-1);

     computeKnownBits(I->getOperand(0), Known, Depth + 1, Q);
     Known.Zero = KZF(Known.Zero, ShiftAmt);
     Known.One = KOF(Known.One, ShiftAmt);
-    // If there is conflict between Known.Zero and Known.One, this must be an
-    // overflowing left shift, so the shift result is undefined. Clear Known
-    // bits so that other code could propagate this undef.
-    if ((Known.Zero & Known.One) != 0)
-      Known.resetAll();
+    // If the known bits conflict, this must be an overflowing left shift, so
+    // the shift result is poison. We can return anything we want. Choose 0 for
+    // the best folding opportunity.
+    if (Known.hasConflict())
+      Known.setAllZero();

     return;
   }

   computeKnownBits(I->getOperand(1), Known, Depth + 1, Q);

-  // If the shift amount could be greater than or equal to the bit-width of the LHS, the
-  // value could be undef, so we don't know anything about it.
+  // If the shift amount could be greater than or equal to the bit-width of the
+  // LHS, the value could be poison, but bail out because the check below is
+  // expensive. TODO: Should we just carry on?
   if ((~Known.Zero).uge(BitWidth)) {
     Known.resetAll();
     return;
   }

   // Note: We cannot use Known.Zero.getLimitedValue() here, because if
   // BitWidth > 64 and any upper bits are known, we'll end up returning the
   // limit value (which implies all bits are known).
   uint64_t ShiftAmtKZ = Known.Zero.zextOrTrunc(64).getZExtValue();
[... 38 lines hidden ...] if (ShiftAmt == 0) {
       if (*ShifterOperandIsNonZero)
         continue;
     }

     Known.Zero &= KZF(Known2.Zero, ShiftAmt);
     Known.One &= KOF(Known2.One, ShiftAmt);
   }

-  // If there are no compatible shift amounts, then we've proven that the shift
-  // amount must be >= the BitWidth, and the result is undefined. We could
-  // return anything we'd like, but we need to make sure the sets of known bits
-  // stay disjoint (it should be better for some other code to actually
-  // propagate the undef than to pick a value here using known bits).
-  if (Known.Zero.intersects(Known.One))
-    Known.resetAll();
+  // If the known bits conflict, the result is poison. Return a 0 and hope the
+  // caller can further optimize that.
+  if (Known.hasConflict())
+    Known.setAllZero();
 }

 static void computeKnownBitsFromOperator(const Operator *I, KnownBits &Known,
                                          unsigned Depth, const Query &Q) {
   unsigned BitWidth = Known.getBitWidth();

   KnownBits Known2(Known);
   switch (I->getOpcode()) {
[... 3,768 lines hidden ...]

test/Transforms/InstSimplify/icmp-constant.ll

[... 570 lines hidden ...] ;
   ret <2 x i1> %cmp
 }

 ; PR34838 - https://bugs.llvm.org/show_bug.cgi?id=34838
 ; The shift is known to create poison, so we can simplify the cmp.

 define i1 @ne_shl_by_constant_produces_poison(i8 %x) {
 ; CHECK-LABEL: @ne_shl_by_constant_produces_poison(
-; CHECK-NEXT: [[ZX:%.*]] = zext i8 %x to i16
-; CHECK-NEXT: [[XOR:%.*]] = xor i16 [[ZX]], 32767
-; CHECK-NEXT: [[SUB:%.*]] = sub nsw i16 [[ZX]], [[XOR]]
-; CHECK-NEXT: [[POISON:%.*]] = shl nsw i16 [[SUB]], 2
-; CHECK-NEXT: [[CMP:%.*]] = icmp ne i16 [[POISON]], 1
-; CHECK-NEXT: ret i1 [[CMP]]
+; CHECK-NEXT: ret i1 true
 ;
   %zx = zext i8 %x to i16 ; zx = 0x00xx
   %xor = xor i16 %zx, 32767 ; xor = 0x7fyy
   %sub = sub nsw i16 %zx, %xor ; sub = 0x80zz (the top bit is known one)
   %poison = shl nsw i16 %sub, 2 ; oops! this shl can't be nsw; that's POISON
   %cmp = icmp ne i16 %poison, 1
   ret i1 %cmp
 }

 define i1 @eq_shl_by_constant_produces_poison(i8 %x) {
 ; CHECK-LABEL: @eq_shl_by_constant_produces_poison(
-; CHECK-NEXT: [[CLEAR_HIGH_BIT:%.*]] = and i8 %x, 127
-; CHECK-NEXT: [[SET_NEXT_HIGH_BITS:%.*]] = or i8 [[CLEAR_HIGH_BIT]], 112
-; CHECK-NEXT: [[POISON:%.*]] = shl nsw i8 [[SET_NEXT_HIGH_BITS]], 3
-; CHECK-NEXT: [[CMP:%.*]] = icmp eq i8 [[POISON]], 15
-; CHECK-NEXT: ret i1 [[CMP]]
+; CHECK-NEXT: ret i1 false
 ;
   %clear_high_bit = and i8 %x, 127 ; 0x7f
   %set_next_high_bits = or i8 %clear_high_bit, 112 ; 0x70
   %poison = shl nsw i8 %set_next_high_bits, 3
   %cmp = icmp eq i8 %poison, 15
   ret i1 %cmp
 }

 ; Shift-by-variable that produces poison is more complicated but still possible.
 ; We guarantee that the shift will change the sign of the shifted value (and
 ; therefore produce poison) by limiting its range from 1 to 3.

 define i1 @eq_shl_by_variable_produces_poison(i8 %x) {
 ; CHECK-LABEL: @eq_shl_by_variable_produces_poison(
-; CHECK-NEXT: [[CLEAR_HIGH_BIT:%.*]] = and i8 %x, 127
-; CHECK-NEXT: [[SET_NEXT_HIGH_BITS:%.*]] = or i8 [[CLEAR_HIGH_BIT]], 112
-; CHECK-NEXT: [[NOTUNDEF_SHIFTAMT:%.*]] = and i8 %x, 3
-; CHECK-NEXT: [[NONZERO_SHIFTAMT:%.*]] = or i8 [[NOTUNDEF_SHIFTAMT]], 1
-; CHECK-NEXT: [[POISON:%.*]] = shl nsw i8 [[SET_NEXT_HIGH_BITS]], [[NONZERO_SHIFTAMT]]
-; CHECK-NEXT: [[CMP:%.*]] = icmp eq i8 [[POISON]], 15
-; CHECK-NEXT: ret i1 [[CMP]]
+; CHECK-NEXT: ret i1 false
 ;
   %clear_high_bit = and i8 %x, 127 ; 0x7f
   %set_next_high_bits = or i8 %clear_high_bit, 112 ; 0x70
   %notundef_shiftamt = and i8 %x, 3
   %nonzero_shiftamt = or i8 %notundef_shiftamt, 1
   %poison = shl nsw i8 %set_next_high_bits, %nonzero_shiftamt
   %cmp = icmp eq i8 %poison, 15
   ret i1 %cmp
 }
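
For the shift-by-variable test, the following sketch walks through how intersecting the known bits over every compatible shift amount can still end in a conflict. It is a simplified 8-bit model, not the LLVM code: `Bits8`, `shlNSWByConstant`, `intersectOverShiftAmounts`, and `resolveConflict` are hypothetical names, and both the nsw sign-bit rule and the compatibility checks are assumptions about the hidden part of the loop.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 8-bit stand-in for a pair of known-bit masks.
struct Bits8 {
  uint8_t Zero; // bits known to be 0
  uint8_t One;  // bits known to be 1
};

// Known bits of `shl nsw Op, Amt` for one in-range amount.
// Assumption: with nsw, a non-poison result keeps the operand's sign bit.
Bits8 shlNSWByConstant(Bits8 Op, unsigned Amt) {
  assert(Amt < 8 && "model only covers in-range shift amounts");
  Bits8 R;
  R.Zero = static_cast<uint8_t>((Op.Zero << Amt) | ((1u << Amt) - 1u));
  R.One = static_cast<uint8_t>(Op.One << Amt);
  if (Op.Zero & 0x80)
    R.Zero |= 0x80;
  if (Op.One & 0x80)
    R.One |= 0x80;
  return R;
}

// Intersect the per-amount results over every shift amount that is
// compatible with the known bits of the amount operand.
Bits8 intersectOverShiftAmounts(Bits8 Op, Bits8 Amt) {
  // Largest amount the known bits allow; bail out if it may be out of range.
  if (static_cast<uint8_t>(~Amt.Zero) >= 8)
    return {0, 0};
  Bits8 R = {0xff, 0xff}; // start fully "known", then narrow by intersection
  for (unsigned S = 0; S < 8; ++S) {
    if ((S & Amt.Zero) != 0) // S sets a bit that is known to be 0
      continue;
    if ((S | Amt.One) != S)  // S clears a bit that is known to be 1
      continue;
    Bits8 ForS = shlNSWByConstant(Op, S);
    R.Zero &= ForS.Zero;
    R.One &= ForS.One;
  }
  return R;
}

// Patched behavior: a conflict means poison, so report the constant 0.
Bits8 resolveConflict(Bits8 K) {
  if ((K.Zero & K.One) != 0)
    return {0xff, 0};
  return K;
}
```

Here the shifted operand has Zero = 0x80 and One = 0x70, and the amount `(x & 3) | 1` has Zero = 0xfc and One = 0x01, so only the amounts 1 and 3 are compatible. Both leave the sign bit known zero and known one at once, so the intersection still conflicts and the patched code reports zero.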