This is an archive of the discontinued LLVM Phabricator instance.

[InstCombineCasts] Fix checks in sext->lshr->trunc pattern.
ClosedPublic

Authored by jacobly on Apr 20 2017, 4:30 AM.

Details

Reviewers

spatel
kuhar

Commits

rG6844e21f5934: [InstCombineCasts] Fix checks in sext->lshr->trunc pattern.
rL302548: [InstCombineCasts] Fix checks in sext->lshr->trunc pattern.

Summary

The comment says to avoid the case where zero bits are shifted into the truncated value, but the code checks that the shift amount is smaller than the size of the truncated value rather than no larger than the number of bits added by the sign extension. Fixing only that check would allow an ashr by the value size or more to be introduced, which is undefined behavior, so the shift amount is also capped at the value size minus one. That shift has the expected behavior of filling the value with the sign bit.
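The reasoning in the summary can be checked on small bit widths with a plain integer model. The following is a minimal sketch, not the InstCombine code: all function names are hypothetical, it assumes the result type equals the type of A (no final cast), and it scales the i64/i96 and i32/i96 test cases down by a factor of four so that every input can be enumerated.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Keep only the low `bits` bits of v (bits in 1..64).
uint64_t lowBits(uint64_t v, unsigned bits) {
  return bits >= 64 ? v : (v & ((uint64_t{1} << bits) - 1));
}

// Sign-extend the low `from` bits of v to `to` bits (1 <= from <= to <= 64).
uint64_t signExtend(uint64_t v, unsigned from, unsigned to) {
  assert(from >= 1 && from <= to && to <= 64);
  const uint64_t sign = uint64_t{1} << (from - 1);
  v = lowBits(v, from);
  // (v ^ sign) - sign copies the sign bit into all higher bits.
  return lowBits((v ^ sign) - sign, to);
}

// Reference semantics: trunc(lshr(sext A to sextSize, shiftAmt)) to aSize bits.
uint64_t truncLshrSext(uint64_t a, unsigned aSize, unsigned sextSize,
                       unsigned shiftAmt) {
  assert(aSize < sextSize && shiftAmt < sextSize);
  return lowBits(signExtend(a, aSize, sextSize) >> shiftAmt, aSize);
}

// Arithmetic shift right of an aSize-bit value. aSize <= 32 guarantees the
// 64-bit intermediate has enough sign bits above the value.
uint64_t ashr(uint64_t a, unsigned aSize, unsigned amt) {
  assert(aSize <= 32 && amt < aSize);
  return lowBits(signExtend(a, aSize, 64) >> amt, aSize);
}

// Check used before the patch: shift smaller than the value size.
bool oldCheck(unsigned aSize, unsigned shiftAmt) { return shiftAmt < aSize; }

// Check used after the patch: shift no larger than the number of extension bits.
bool newCheck(unsigned aSize, unsigned sextSize, unsigned shiftAmt) {
  return shiftAmt <= sextSize - aSize;
}

// Replacement value: ashr with the shift amount capped at aSize - 1.
uint64_t foldToAshr(uint64_t a, unsigned aSize, unsigned shiftAmt) {
  return ashr(a, aSize, std::min(shiftAmt, aSize - 1));
}

// Exhaustively compare the reference and the replacement (aSize <= 16).
bool foldMatchesForAllInputs(unsigned aSize, unsigned sextSize,
                             unsigned shiftAmt) {
  assert(aSize <= 16);
  for (uint64_t a = 0; a < (uint64_t{1} << aSize); ++a)
    if (truncLshrSext(a, aSize, sextSize, shiftAmt) !=
        foldToAshr(a, aSize, shiftAmt))
      return false;
  return true;
}
```

In this model, a 16-bit value extended to 24 bits and shifted by 12 passes the old check but not the new one, and the fold is indeed wrong for it. An 8-bit value extended to 24 bits and shifted by 16 passes only the new check and relies on the cap to stay a valid shift.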

Diff Detail

Repository: rL LLVM

Event Timeline

jacobly created this revision. Apr 20 2017, 4:30 AM

Fix comment grammar.

Harbormaster completed remote builds in B5775: Diff 96258. Apr 21 2017, 3:49 PM

efriedma added a subscriber: efriedma. Apr 21 2017, 3:59 PM

Ping

zzheng added a subscriber: zzheng. May 8 2017, 1:44 PM

spatel mentioned this in rL302475: [InstCombine] add tests from D32285 to show current problems; NFC. May 8 2017, 3:46 PM

I couldn't tell what was happening in all these cases from the code or tests, so I added the tests with their current output here:
https://reviews.llvm.org/rL302475

I think there are 3 or more problems/opportunities here, so I want to separate these if possible. Can you make this patch just about fixing the miscompile in test91()? Is there a bug report for that case?

Note - Alive is very helpful to check that we're not either miscompiling or missing optimizations:
http://rise4fun.com/Alive/2I

All that needs to be changed to fix the test91 miscompile is changing ShiftAmt < ASize to ShiftAmt <= SExtSize - ASize. However, making only this change introduces the test93 miscompile, because the original code already ensured that ShiftAmt < ASize (which is correct for avoiding a UB shift, but incorrect for avoiding pulling zero bits into the value). Note that by far the most common case, SExtSize >= 2*ASize (which is always true when all types are powers of 2), doesn't actually miscompile (because then ASize <= SExtSize - ASize anyway). I was planning on adding more optimization opportunities in a separate patch, but I'm not sure how to separate this change without introducing a miscompile. I could split out the part that names common subexpressions with variables, if that helps.
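To make the interaction between the two conditions concrete, here is a small sketch that enumerates shift amounts for a given pair of sizes. The names (ShiftClasses, classifyShifts) are hypothetical and exist only for this illustration. It lists the shifts that the old check accepts but the new one rejects (the test91 kind), and the shifts that the new check accepts even though they are at least ASize (the test93 kind, which needs the cap).

```cpp
#include <cassert>
#include <vector>

// Shift amounts of interest for one (aSize, sextSize) pair.
struct ShiftClasses {
  std::vector<unsigned> oldOnly;   // accepted by the old check only: miscompile
  std::vector<unsigned> needsCap;  // accepted by the new check, but >= aSize
};

// Old condition: the shift is smaller than the size of A.
bool oldCheck(unsigned aSize, unsigned shiftAmt) { return shiftAmt < aSize; }

// New condition: the shift is no larger than the number of extension bits.
bool newCheck(unsigned aSize, unsigned sextSize, unsigned shiftAmt) {
  return shiftAmt <= sextSize - aSize;
}

// Classify every in-range shift amount (0 .. sextSize - 1).
ShiftClasses classifyShifts(unsigned aSize, unsigned sextSize) {
  assert(aSize < sextSize);
  ShiftClasses result;
  for (unsigned s = 0; s < sextSize; ++s) {
    const bool oldOk = oldCheck(aSize, s);
    const bool newOk = newCheck(aSize, sextSize, s);
    if (oldOk && !newOk)
      result.oldOnly.push_back(s);
    if (newOk && s >= aSize)
      result.needsCap.push_back(s);
  }
  return result;
}
```

For i64 extended to i96, the old-only range is 33 through 63, which contains the shift by 48 from test91. For i32 extended to i96, nothing is old-only, but shifts 32 through 64 need the cap, which covers the shift by 64 from test93. Whenever SExtSize >= 2*ASize the old-only range is empty, matching the observation that the common power-of-two case was not miscompiled.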

Ok, your explanation makes this clearer. Since we're fixing a miscompile, I think it's ok to keep this patch mostly as-is. But please add some FIXME notes and rebase this patch after rL302475:

This code is trying to do too much in one place IMO. It (and the related zext transform above it) should be split off into a helper function so that the (common?) case where "ASize" is the same as the final size (CI.getType()) is handled separately. When those types match, we do not need to check hasOneUse as strictly. We're also wastefully calling CastInst::CreateIntegerCast() in that case.
We've artificially excluded vector types by using m_ConstantInt().
As I showed in the Alive example, we can handle the case where (ShiftAmt > SExtSize - ASize) by adding an 'and' mask. This transform makes sense with appropriate one-use checking.
We can use m_OneUse() with the match() calls to make the code cleaner.
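The 'and' mask idea for larger shifts can be illustrated with an integer model. This is a sketch under assumptions, not the form used in the Alive example or in any LLVM patch: the function names are hypothetical, the result type is assumed to equal the type of A, and the mask shape (keep the low SExtSize - ShiftAmt bits) is derived here from the bit positions that the original lshr fills with zeros.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Mask with the low `bits` bits set (bits in 0..64).
uint64_t lowMask(unsigned bits) {
  return bits >= 64 ? ~uint64_t{0} : ((uint64_t{1} << bits) - 1);
}

// Sign-extend the low `from` bits of v to `to` bits (1 <= from <= to <= 64).
uint64_t signExtend(uint64_t v, unsigned from, unsigned to) {
  assert(from >= 1 && from <= to && to <= 64);
  const uint64_t sign = uint64_t{1} << (from - 1);
  v &= lowMask(from);
  return ((v ^ sign) - sign) & lowMask(to);
}

// Reference semantics: trunc(lshr(sext A to sextSize, shiftAmt)) to aSize bits.
uint64_t truncLshrSext(uint64_t a, unsigned aSize, unsigned sextSize,
                       unsigned shiftAmt) {
  assert(aSize < sextSize && shiftAmt < sextSize);
  return (signExtend(a, aSize, sextSize) >> shiftAmt) & lowMask(aSize);
}

// Candidate replacement: ashr with a capped shift amount, then an 'and' that
// clears the bits the original lshr would have filled with zeros.
uint64_t ashrThenMask(uint64_t a, unsigned aSize, unsigned sextSize,
                      unsigned shiftAmt) {
  assert(aSize <= 32 && aSize < sextSize && shiftAmt < sextSize);
  const unsigned amt = std::min(shiftAmt, aSize - 1);
  const uint64_t shifted = (signExtend(a, aSize, 64) >> amt) & lowMask(aSize);
  // Result bits at or above (sextSize - shiftAmt) come from beyond the
  // extended value, so they must be zero.
  const unsigned keep = std::min(aSize, sextSize - shiftAmt);
  return shifted & lowMask(keep);
}

// Compare both forms for every input and every in-range shift (aSize <= 16).
bool maskedFoldMatches(unsigned aSize, unsigned sextSize) {
  assert(aSize <= 16);
  for (unsigned s = 0; s < sextSize; ++s)
    for (uint64_t a = 0; a < (uint64_t{1} << aSize); ++a)
      if (truncLshrSext(a, aSize, sextSize, s) !=
          ashrThenMask(a, aSize, sextSize, s))
        return false;
  return true;
}
```

When ShiftAmt <= SExtSize - ASize the mask keeps all ASize bits and the 'and' is a no-op, so this model reduces to the plain ashr fold. The extra instruction is why the note above ties the transform to appropriate one-use checking.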

Rebased and added fixmes.

I don't have commit access so if this is good to go could you commit it for me? Meanwhile, I'll work on actually cleaning up these two transforms.

In D32285#749724, @jacobly wrote:

I don't have commit access so if this is good to go could you commit it for me? Meanwhile, I'll work on actually cleaning up these two transforms.

Sure. LGTM - thanks for fixing this!

This revision is now accepted and ready to land. May 9 2017, 8:26 AM

Closed by commit rL302548: [InstCombineCasts] Fix checks in sext->lshr->trunc pattern. (authored by spatel). May 9 2017, 9:38 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/lib/Transforms/InstCombine/InstCombineCasts.cpp	20 lines
llvm/trunk/test/Transforms/InstCombine/cast.ll	12 lines
Diff 98306

llvm/trunk/lib/Transforms/InstCombine/InstCombineCasts.cpp

 Instruction *InstCombiner::visitTrunc(TruncInst &CI) {
 [...]
   // Canonicalize trunc x to i1 -> (icmp ne (and x, 1), 0), likewise for vector.
   if (DestTy->getScalarSizeInBits() == 1) {
     Constant *One = ConstantInt::get(SrcTy, 1);
     Src = Builder->CreateAnd(Src, One);
     Value *Zero = Constant::getNullValue(Src->getType());
     return new ICmpInst(ICmpInst::ICMP_NE, Src, Zero);
   }

+  // FIXME: Maybe combine the next two transforms to handle the no cast case
+  // more efficiently. Support vector types. Cleanup code by using m_OneUse.
+
   // Transform trunc(lshr (zext A), Cst) to eliminate one type conversion.
   Value *A = nullptr; ConstantInt *Cst = nullptr;
   if (Src->hasOneUse() &&
       match(Src, m_LShr(m_ZExt(m_Value(A)), m_ConstantInt(Cst)))) {
     // We have three types to worry about here, the type of A, the source of
     // the truncate (MidSize), and the destination of the truncate. We know that
     // ASize < MidSize and MidSize > ResultSize, but don't know the relation
     // between ASize and ResultSize.
 [13 lines not shown]
   }

   // Transform trunc(lshr (sext A), Cst) to ashr A, Cst to eliminate type
   // conversion.
   // It works because bits coming from sign extension have the same value as
   // the sign bit of the original value; performing ashr instead of lshr
   // generates bits of the same value as the sign bit.
   if (Src->hasOneUse() &&
-      match(Src, m_LShr(m_SExt(m_Value(A)), m_ConstantInt(Cst))) &&
-      cast<Instruction>(Src)->getOperand(0)->hasOneUse()) {
+      match(Src, m_LShr(m_SExt(m_Value(A)), m_ConstantInt(Cst)))) {
+    Value *SExt = cast<Instruction>(Src)->getOperand(0);
+    const unsigned SExtSize = SExt->getType()->getPrimitiveSizeInBits();
     const unsigned ASize = A->getType()->getPrimitiveSizeInBits();
+    unsigned ShiftAmt = Cst->getZExtValue();
     // This optimization can be only performed when zero bits generated by
     // the original lshr aren't pulled into the value after truncation, so we
-    // can only shift by values smaller than the size of destination type (in
-    // bits).
-    if (Cst->getValue().ult(ASize)) {
-      Value *Shift = Builder->CreateAShr(A, Cst->getZExtValue());
+    // can only shift by values no larger than the number of extension bits.
+    // FIXME: Instead of bailing when the shift is too large, use and to clear
+    // the extra bits.
+    if (SExt->hasOneUse() && ShiftAmt <= SExtSize - ASize) {
+      // If shifting by the size of the original value in bits or more, it is
+      // being filled with the sign bit, so shift by ASize-1 to avoid ub.
+      Value *Shift = Builder->CreateAShr(A, std::min(ShiftAmt, ASize-1));
       Shift->takeName(Src);
       return CastInst::CreateIntegerCast(Shift, CI.getType(), true);
     }
   }

   if (Instruction *I = shrinkBitwiseLogic(CI))
     return I;

 [...]

llvm/trunk/test/Transforms/InstCombine/cast.ll

 [...]
 ; CHECK: ret <2 x i32> <i32 0, i32 15360>
   %tmp6 = bitcast <4 x half> <half undef, half undef, half undef, half 0xH3C00> to <2 x i32>
   ret <2 x i32> %tmp6
 }

 ; Do not optimize to ashr i64 (shift by 48 > 96 - 64)
 define i64 @test91(i64 %A) {
 ; CHECK-LABEL: @test91(
-; CHECK-NEXT: [[C:%.*]] = ashr i64 %A, 48
-; CHECK-NEXT: ret i64 [[C]]
+; CHECK-NEXT: [[B:%.*]] = sext i64 %A to i96
+; CHECK-NEXT: [[C:%.*]] = lshr i96 [[B]], 48
+; CHECK-NEXT: [[D:%.*]] = trunc i96 [[C]] to i64
+; CHECK-NEXT: ret i64 [[D]]
 ;
   %B = sext i64 %A to i96
   %C = lshr i96 %B, 48
   %D = trunc i96 %C to i64
   ret i64 %D
 }

 ; Do optimize to ashr i64 (shift by 32 <= 96 - 64)
 define i64 @test92(i64 %A) {
 ; CHECK-LABEL: @test92(
 ; CHECK-NEXT: [[C:%.*]] = ashr i64 %A, 32
 ; CHECK-NEXT: ret i64 [[C]]
 ;
   %B = sext i64 %A to i96
   %C = lshr i96 %B, 32
   %D = trunc i96 %C to i64
   ret i64 %D
 }

 ; When optimizing to ashr i32, don't shift by more than 31.
 define i32 @test93(i32 %A) {
 ; CHECK-LABEL: @test93(
-; CHECK-NEXT: [[B:%.*]] = sext i32 %A to i96
-; CHECK-NEXT: [[C:%.*]] = lshr i96 [[B]], 64
-; CHECK-NEXT: [[D:%.*]] = trunc i96 [[C]] to i32
-; CHECK-NEXT: ret i32 [[D]]
+; CHECK-NEXT: [[C:%.*]] = ashr i32 %A, 31
+; CHECK-NEXT: ret i32 [[C]]
 ;
   %B = sext i32 %A to i96
   %C = lshr i96 %B, 64
   %D = trunc i96 %C to i32
   ret i32 %D
 }