This is an archive of the discontinued LLVM Phabricator instance.

[InstSimplify] Remove known bits constant folding
ClosedPublic

Authored by nikic on May 2 2020, 12:19 PM.

Details

Summary

If SimplifyInstruction() does not succeed in simplifying the instruction, it will compute the known bits of the instruction in the hope that all bits are known and the instruction can be folded to a constant. I have removed a similar optimization from InstCombine in D75801, and would like to drop this one as well.
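For context, a hedged paraphrase of the fallback this patch removes (not the verbatim deleted code; the SimplifyQuery members follow the surrounding SimplifyInstruction() code):

#include "llvm/Analysis/InstructionSimplify.h"
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/Instruction.h"
#include "llvm/Support/KnownBits.h"
using namespace llvm;

// Called after the per-opcode dispatch in SimplifyInstruction() failed to
// simplify I: compute the known bits of the whole instruction and fold it
// to a constant if every bit is known.
static Value *foldViaKnownBits(Instruction *I, const SimplifyQuery &Q) {
  if (!I->getType()->isIntOrIntVectorTy())
    return nullptr;
  KnownBits Known = computeKnownBits(I, Q.DL, /*Depth=*/0, Q.AC, I, Q.DT);
  if (!Known.isConstant())
    return nullptr;
  return ConstantInt::get(I->getType(), Known.getConstant());
}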

On average, we spend 1% of total compile-time performing this known bits calculation. However, if we introduce some additional statistics counting how many known bits computations are performed and how many of them succeed in simplifying the instruction, we get (on test-suite):

instsimplify.NumKnownBits: 216
instsimplify.NumKnownBitsComputed: 13828375
valuetracking.NumKnownBitsComputed: 45860806

Out of ~14M known bits calculations (accounting for approximately one third of all known bits calculations), only 0.0015% succeed in producing a constant. Those cases where we do succeed in computing all known bits will get folded by other passes like InstCombine later. On test-suite, only lencod.test and GCC-C-execute-pr44858.test show a hash difference after this change.
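For reference, counters like these are declared with LLVM's STATISTIC macro and printed by -stats under a "<DEBUG_TYPE>." prefix; the exact counters used for this measurement were local, out-of-tree additions, so the following is only a sketch:

#include "llvm/ADT/Statistic.h"
#define DEBUG_TYPE "instsimplify"

STATISTIC(NumKnownBitsComputed, "Number of known bits computations");
STATISTIC(NumKnownBits, "Number of known bits computations folding to a constant");
// ...with ++NumKnownBitsComputed before each computeKnownBits() call and
// ++NumKnownBits whenever the result turns out to be a fully known constant.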

There are of course "regressions" in InstSimplify tests, because some things that were previously handled by InstSimplify are now only handled by InstCombine. I will comment inline.

One final thing to note here is that all this affects only the SimplifyInstruction() API, not the individual per-instruction-kind Simplify APIs, which never try to use KnownBits in this way.
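To make the distinction concrete, a rough sketch (illustrative only; signatures are approximate for the LLVM version under review):

#include "llvm/Analysis/InstructionSimplify.h"
#include "llvm/IR/Instruction.h"
using namespace llvm;

// Generic entry point: dispatches on the opcode and, before this patch, fell
// back to a computeKnownBits() query when nothing else folded.
Value *viaGenericAPI(Instruction *I, const SimplifyQuery &Q) {
  return SimplifyInstruction(I, Q);
}

// Per-opcode entry points such as SimplifyAddInst() never had that fallback.
Value *viaPerOpcodeAPI(Value *LHS, Value *RHS, const SimplifyQuery &Q) {
  return SimplifyAddInst(LHS, RHS, /*IsNSW=*/false, /*IsNUW=*/false, Q);
}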

Diff Detail

Event Timeline

nikic created this revision.May 2 2020, 12:19 PM
Herald added a project: Restricted Project. May 2 2020, 12:19 PM
nikic marked 3 inline comments as done.May 2 2020, 12:46 PM

Okay, after going through the tests, there isn't really anything to comment on beyond "everything gets folded by InstCombine instead, in the same way".

If the general change looks fine, I'd move some of the tests that no longer fold with InstSimplify into the InstCombine tests instead.

On test-suite, only lencod.test and GCC-C-execute-pr44858.test show a hash difference after this change.

Did you analyze these changes?

nikic added a comment (edited).May 2 2020, 3:40 PM

On test-suite, only lencod.test and GCC-C-execute-pr44858.test show a hash difference after this change.

Did you analyze these changes?

For lencod:

in function writeMBLayer:
  in block %for.cond138.preheader.i:
    <   %mb_y.0328.i = phi i32 [ 0, %for.cond134.preheader.i ], [ 2, %for.inc219.i ]
  in block %for.inc219.i:
    >   %cmp135.i = icmp eq i64 %indvars.iv383.i, 0
    >   br i1 %cmp135.i, label %for.cond138.preheader.i, label %writeCBPandLumaCoeff.exit
    <   %cmp135.i = icmp eq i32 %mb_y.0328.i, 0
    <   %indvars.iv.next339.i = add nuw nsw i64 %indvars.iv338.i, 2
    <   br i1 %cmp135.i, label %for.cond138.preheader.i, label %writeCBPandLumaCoeff.exit

This is an improvement: we optimize away a loop phi. Don't ask me how.

For the torture test:

in function main:
  in block %entry:
    >   %cmp = icmp ne i32 %call, 0
        %0 = load i32, i32* @b, align 4
    >   %cmp1 = icmp ne i32 %0, 1
    >   %or.cond = or i1 %cmp, %cmp1
    >   br i1 %or.cond, label %if.then, label %if.end
    <   %cmp1 = icmp eq i32 %0, 1
    <   br i1 %cmp1, label %if.end, label %if.then

This is a regression: we fail to optimize away a comparison. The test is small enough to trace: we now determine that a call always returns zero a bit later in the pipeline, not in time for IPSCCP. As this is a noinline-based test case, IPSCCP is the only chance to propagate the constant return value interprocedurally.

nikic updated this revision to Diff 261690.May 3 2020, 3:57 AM

I have now duplicated all the relevant tests into InstCombine and added comments that these no longer fold as part of InstSimplify. The only test I've dropped outright is the assume.ll one, as I don't think it makes sense if we don't perform top-level KnownBits folding. I've left the remaining ones alone, but could drop them if desired.

spatel accepted this revision.May 3 2020, 9:29 AM

LGTM - if we can remove redundant/expensive logic with few visible side-effects, that's a nice win.

test/Transforms/GVN/PRE/volatile.ll
204 ↗(On Diff #261690)

There's no dedicated fold for this in InstCombiner::visitLoadInst(). But we call computeKnownBits on the ret arg in instcombine anyway, so we get all basic patterns. That might be worth looking at as another candidate for removal for efficiency.

I.e., we may want to add a dedicated fold (add a simplifyLoad?), since it's cheap to do directly and a big win if it works (no idea if this happens in the real world, but somebody added this GVN test case...).
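For illustration, a hedged sketch of what such a dedicated fold in InstCombiner::visitLoadInst() could look like (this is not existing code, just the shape of the idea):

// Fold a load whose !range metadata admits exactly one value to that
// constant; the side-effect-free load itself is then cleaned up as dead code.
if (!LI.isVolatile())
  if (MDNode *Range = LI.getMetadata(LLVMContext::MD_range))
    if (Range->getNumOperands() == 2) {
      auto *Lo = mdconst::extract<ConstantInt>(Range->getOperand(0));
      auto *Hi = mdconst::extract<ConstantInt>(Range->getOperand(1));
      // !range {Lo, Hi} means Lo <= value < Hi, so a width-1 range is a constant.
      if (Lo->getValue() + 1 == Hi->getValue())
        return replaceInstUsesWith(LI, Lo);
    }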

A test that would subvert the current known-bits call from InstCombiner::visitReturnInst(), given the current MaxDepth recursion limit:

define i32 @known0(i32* %V, i32 %y) {
  %load = load i32, i32* %V, !range !0
  %m1 = mul i32 %load, %y
  %m2 = mul i32 %m1, %y
  %m3 = mul i32 %m2, %y
  %m4 = mul i32 %m3, %y
  %m5 = mul i32 %m4, %y
  %m6 = mul i32 %m5, %y
  ret i32 %m6
}

!0 = !{ i32 0, i32 1 }
This revision is now accepted and ready to land.May 3 2020, 9:29 AM
nikic marked an inline comment as done.May 3 2020, 9:51 AM
nikic added inline comments.
test/Transforms/GVN/PRE/volatile.ll
204 ↗(On Diff #261690)

Going through the blame, this test was added in https://github.com/llvm/llvm-project/commit/b8da3a2bb2b840db6ab7c473190ee6d65dcf3a1e. I think the intention was to make sure that instructions with side-effects don't get optimized away just because they simplify. Given that, it might make sense to replace the load volatile with something else here (not sure what though -- do we have any standard pattern that simplifies without being trivially dead?)

I don't think we need to explicitly handle this case in InstCombine, as single-element range annotations seem unlikely to be common in the wild. Additionally, I think this is best left to passes that specialize in range propagation. For example, CVP will also handle your more complex example successfully. Given ongoing range work, I expect that SCCP will handle it in the future as well.
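To illustrate why range-propagating passes cope with this naturally, a hedged sketch using ConstantRange directly (not CVP's actual code): the !range !{i32 0, i32 1} on the load pins the value to the single-element range {0}, and multiplying that range by anything keeps it at {0}, no matter how deep the mul chain gets.

#include "llvm/ADT/APInt.h"
#include "llvm/IR/ConstantRange.h"
using namespace llvm;

bool rangeStaysConstant() {
  ConstantRange LoadCR(APInt(32, 0), APInt(32, 1));   // !range !{i32 0, i32 1}
  ConstantRange Mul = LoadCR.multiply(ConstantRange::getFull(32));
  const APInt *Single = Mul.getSingleElement();
  return Single && *Single == 0;                      // still exactly {0}
}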

nikic marked an inline comment as done.May 3 2020, 11:30 AM
nikic added inline comments.
test/Transforms/GVN/PRE/volatile.ll
204 ↗(On Diff #261690)

Given that, it might make sense to replace the load volatile with something else here (not sure what though -- do we have any standard pattern that simplifies without being trivially dead?)

Looks like call i32 undef() works for that purpose. I've replaced the load volatile with that.

This revision was automatically updated to reflect the committed changes.
nikic reopened this revision.May 3 2020, 12:43 PM
nikic added a subscriber: arsenm.

Had to revert this due to AMDGPU test failures. What happens there is that AMDGPU expands divisions in a custom CodeGenPrepare (CGP) pass, and the resulting instructions then get simplified as part of an EarlyCSE run that it schedules. The simplification in question is an ashr x, 31 sign-bit extraction, which gets folded to zero.

Not quite sure what to do about this. I guess it's one of: a) just accept the change, b) add a known-sign check in the AMDGPU div expansion code, or c) make SimplifyAShr more aggressive (i.e. use known bits) when we're extracting the sign bit. I'm leaning towards b). cc @arsenm

This revision is now accepted and ready to land.May 3 2020, 12:43 PM
nikic updated this revision to Diff 261720.May 3 2020, 1:00 PM

Try to determine sign of div operands in AMDGPU expansion code. This avoids any test changes on the AMDGPU side.

If this approach looks reasonable, I'll commit it separately.
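A hedged sketch of what such a known-sign check could look like (names and placement are an assumption; not necessarily what the updated diff actually does):

#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/DataLayout.h"
using namespace llvm;

// If both operands of an sdiv/srem are already known non-negative, the
// expansion can take the unsigned path and never emits the "ashr X, 31"
// sign mask, so there is nothing left for the removed known-bits fold to do.
static bool divOperandsKnownNonNegative(Value *Num, Value *Den,
                                        const DataLayout &DL,
                                        AssumptionCache *AC,
                                        const Instruction *CxtI) {
  return isKnownNonNegative(Num, DL, /*Depth=*/0, AC, CxtI) &&
         isKnownNonNegative(Den, DL, /*Depth=*/0, AC, CxtI);
}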

This revision was automatically updated to reflect the committed changes.